By Varun D in AI & Deep Learning — Aug 28, 2019

Designing Chatbots 101: Top 5 challenges

Designing Chatbots 101. Top 5 challenges faced while designing chatbots.

Chatbots are slowly becoming ubiquitous on social media and retail websites. There are a ton of articles and trend reports.

And more articles on how to build chatbots. But they don't talk about why you may need one or how to plan one.

Retail companies have introduced new chatbots, like TMY.GRL by Tommy Hilfiger, and then unceremoniously decommissioned them.

At the same time, companies are now providing personalized and human support. Thanks to better voice and video tools, hiring remote customer service agents halfway across the world is on the rise.

Like most emerging technologies, chatbots will get better while seamlessly augmenting one-on-one human interactions, if needed.

The power to understand language is getting stronger thanks to in-depth research on Natural Language Processing (NLP), which analyzes a statement and works out what the user's intent is. Here intent has two meanings: the user's actual intention, and the captured intent as understood by the NLP.

We look at design challenges with chatbots and a few ways to solve them.

I. Challenges of the Why

First, a simple question to ask ourselves is, "Why would I need a chatbot?"

Will it make it easier for your users to find answers and save their time?
Will you save on resources by reducing human agents?
Does the "chatbot" fit in with the overall brand experience? For example, would I expect to talk to a human concierge or a chatbot from a luxury hotel brand?
Do we just need one because it's cool and our users can have some fun?

The above questions should be enough for you and your team to start a discussion.

Starting without a "why" can be disastrous down the road. Unplanned chatbot creations can lead to weird or unwanted conversations causing great damage to your brand.

II. Challenges of Conversational UI

Today, you can sign up and start creating simple chatbots through Dialogflow, IBM Watson's chatbot creator, or others. The challenge is in creating a good conversational user interface.

Here's a quick overview of conversation UI models which you can use. Some are better than others.

Dialog trees

The good ol' IVR, press 1 for... press 2 for... A tried and tested (albeit frustrating) use of dialog trees.

Adventure and role-playing games also use dialog trees for character development without having to worry about natural language processing.

This method can become rigid, confusing, and complicated to implement. A conversation flow with 3 dialog options at each step and 5 levels deep needs 243 written or spoken dialogs at the deepest level alone (3^5), and 363 across all five levels!

For the user, a rigid structure feels robotic. Users expect a chatbot to have some personality, and injecting personality through dialog trees gets difficult.
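
The arithmetic behind that number can be checked with a minimal sketch. It assumes every dialog offers the same number of options and that each option leads to its own new dialog; the function name is made up for illustration.

```python
def dialog_counts(options, depth):
    """Return (dialogs at the deepest level, dialogs across all levels).

    Assumes a full tree: every dialog offers `options` choices and
    every choice leads to a distinct dialog, `depth` levels down.
    """
    per_level = [options ** level for level in range(1, depth + 1)]
    return per_level[-1], sum(per_level)


deepest, total = dialog_counts(options=3, depth=5)
print(deepest, total)  # 243 363
```

Adding just one more option per step (4 options, 5 levels) already pushes the deepest level past a thousand dialogs, which is why dialog trees are usually kept shallow.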

Dialog Decision Graphs

Slightly similar to dialog trees, decision graphs are interconnected dialog trees housing multiple if...then conversations.

Like dialog trees, these can get complicated quickly and frustrate users, mainly by trapping them in an edge-case conversation, more so in AI experiences.

Source: Towards optimising modality allocation for multimodal output generation in incremental dialogue.

Natural Language Processing (NLP)

One of the tools that works well these days is a subset of NLP: Natural Language Understanding / Interpretation (NLU / NLI) using intent recognition.

In intent recognition, the if...then logic is driven by understanding the user's intent in a statement and its relevant entities.

The NLU learns to do this by being trained on enough sample utterances.

"What was the weather in New York yesterday?"

User's intent: weather_request
User's entity: { city: New York, datetime: 01-12-2019:0000}

It's not as simple as recognizing keywords like "New York" and "yesterday".
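
A small sketch shows why. Everything here is hypothetical and only for illustration: the `ParsedUtterance` container, the keyword table, and the second example sentence are not from any NLU library. Keyword spotting returns the same result for two statements that mean very different things.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedUtterance:
    """Hypothetical container for what an NLU returns for one statement."""
    intent: str
    entities: dict = field(default_factory=dict)


# Illustrative keyword table; a real NLU does not work this way.
KEYWORDS = {"new york": "city", "yesterday": "datetime"}


def keyword_spot(text):
    """Return every known keyword found in the text, keyed by entity label."""
    lowered = text.lower()
    return {label: kw for kw, label in KEYWORDS.items() if kw in lowered}


weather = keyword_spot("What was the weather in New York yesterday?")
travel = keyword_spot("I flew back from New York yesterday.")

# Both statements yield the same keywords, yet only one is a weather request.
print(weather == travel)  # True

# What we actually want from the NLU for the first statement:
target = ParsedUtterance(
    intent="weather_request",
    entities={"city": "New York", "datetime": "01-12-2019:0000"},
)
print(target)
```

The keywords give us candidate entities, but the intent has to come from the statement as a whole, which is what the trained NLU provides.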

Today, BERT (Bidirectional Encoder Representations from Transformers), a pre-trained language model from Google, is one of the few robust NLU models out there.

Why robust? BERT considers the context on both the left and the right side of a word. Most earlier models considered only one side, reading either left-to-right or right-to-left.

Even IBM uses pre-trained BERT models for some of their AI modules.

Alternatives to BERT exist, like ULMFiT (Universal Language Model Fine-Tuning) created by fast.ai’s Jeremy Howard and DeepMind’s Sebastian Ruder.

Another NLP model kicking up a storm is GPT-2 by OpenAI. They claimed to have designed an NLP model so advanced that they didn't release the full version fearing malicious use.

For the adventurous few, feel free to train your own BERT model.

III. Challenges in knowledge; humans vs NLP

Chatbots can be driven by a decision tree, a dialog graph or an advanced NLP-based intent:response collection.

On the other hand, humans are great at abstracting knowledge. We can create complex mental models of a situation.

We can prioritize sections of these models and form an expert opinion based on the process of elimination.

For example, when you enter an Apple Genius Bar with a MacBook Pro 15" Touch Bar in hand, the staff will quickly suspect a keyboard or battery issue.

IV. Challenge of language and our world

Physical encoding of the world is easy for us but difficult for computers, considering the nuance of language and exchange of meaning.

For example, how would you describe gloves to a machine? You have a chance if the robot has hands or has been trained to understand the concept of hands. It takes human babies years to understand fingers, hands and grasping. Now try explaining gloves for work, cold or fashion.

Sarcasm and negatives are also difficult for NLP to understand. How would you react if someone told you, "Don't tell me the time." or "Please, don't tell me a bad joke."? And what should a chatbot do when it gets "Yeah, right..." from a user where it expected a positive reply?

The following conversation is possible with enough training.

Chatbot: Are you interested in learning English or French?
Human: I don't think I'd like to learn French.
Chatbot: Sure, here are a few English courses: Course 1, Course 2, etc...
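
To see why the negative matters, here is a rough sketch that contrasts naive keyword matching with a hand-written negation rule. This is not how a trained NLU works; the option list, the negation cues and both function names are assumptions made up for this example.

```python
import re

OPTIONS = ["English", "French"]
# Small illustrative list of negation cues, far from complete.
NEGATION_CUES = {"don't", "not", "never", "no"}


def naive_choice(reply):
    """Pick every option whose name appears in the reply."""
    return [o for o in OPTIONS if o.lower() in reply.lower()]


def negation_aware_choice(reply):
    """Treat an option as rejected if a negation cue appears before it."""
    tokens = re.findall(r"[a-z']+", reply.lower())
    chosen, rejected = [], []
    for option in OPTIONS:
        name = option.lower()
        if name not in tokens:
            continue
        negated = any(t in NEGATION_CUES for t in tokens[:tokens.index(name)])
        (rejected if negated else chosen).append(option)
    # If nothing was positively chosen, offer whatever was not rejected.
    return chosen or [o for o in OPTIONS if o not in rejected]


reply = "I don't think I'd like to learn French."
print(naive_choice(reply))           # ['French']
print(negation_aware_choice(reply))  # ['English']
```

A rule like this breaks quickly on sarcasm, double negatives or longer sentences, which is where training on real utterances has to take over.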

V. Challenge of complexity

Assuming we are comfortably serving users with a few well-trained intents, we now come across a complexity challenge.

The more intents we train an NLU to recognize, the higher the probability that it will mismatch intents or confuse them with each other.

Let's take a utility company example. After training one Billing intent and one Complaint intent, the chatbot is working fine:

Intent "bill_request_month"
Training Samples:
"I'd like a copy of my bill for the month."
"What do I owe this month?" etc

Intent "lodge_complaint"
Training Samples:
"I want to lodge a complaint, please."
"What's the status of my complaint comp-93432?"

Choosing which intents to add to your chatbot, and why, is an art.

We decide to add 4 new intents to our chatbot:

"bill_dispute": So users can lodge a bill-specific dispute.
"bill_year": To request copies of all bills for the year.
"new_connection": To request a new connection.
"tell_joke": Why not.

Notice that it is now possible for the NLP to confuse the first two new intents, bill_dispute and bill_year, with our first intent, bill_request_month.

We will need to create an intent collision matrix to understand where intent collisions happen.

Intent Collision Matrix
=======================

#  Intent              CUI Complexity  Similar Intents
1  bill_request_month  H               3
2  lodge_complaint     L               1
3  bill_dispute        M               3
4  bill_year           L               3
5  new_connection      L               1
6  tell_joke           L               1

For example, what intent will the chatbot trigger if the user asks:

User: "What happened to my bill complaint last month?"
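
One rough way to spot such a collision before users do is to score the utterance against each intent's training samples. The sketch below uses plain word overlap (Jaccard similarity) as a stand-in. It is an assumption for illustration only, not how Dialogflow, IBM Watson or any trained NLU scores intents, and it only covers the two intents whose samples appear above (with spelling lightly corrected).

```python
import re

# Training samples for the two original intents.
TRAINING_SAMPLES = {
    "bill_request_month": [
        "I'd like a copy of my bill for the month.",
        "What do I owe this month?",
    ],
    "lodge_complaint": [
        "I want to lodge a complaint please",
        "What's the status of my complaint comp-93432?",
    ],
}


def tokens(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_intents(utterance):
    """Score each intent by its best-matching training sample."""
    words = tokens(utterance)
    return {
        intent: max(jaccard(words, tokens(sample)) for sample in samples)
        for intent, samples in TRAINING_SAMPLES.items()
    }


scores = score_intents("What happened to my bill complaint last month?")
for intent, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{intent}: {score:.2f}")
```

With these samples the two intents score close together (about 0.20 for bill_request_month against 0.15 for lodge_complaint), so neither is a confident match. Pairs of intents that keep scoring this close are the ones to list as similar in the collision matrix.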

Solutions based on simplicity

On the technical side, there are great guides on how to get started with Dialogflow and IBM Watson to create simple chatbots.

We went through a non-exhaustive list of common challenges faced while creating chatbots, along with some working principles.

In future articles, we will cover topics like Chatbot Technical Architecture and high fun-factor Conversation UI Design by creating our own location-based chatbot ;)