Chatbots, Conversational Interfaces, and the Rise of Messaging Platforms
This is based on a presentation Manifold partner Sean Johnson has been giving for innovation groups at several companies in the last few weeks. If you're more of a visual learner, we've made a video of his presentation below. You can also view the slides on SlideShare.
The Rise of the Third Interface
There have been various phases in how we have interacted with computers. The first phase was the Terminal Interface — using the command line or DOS prompt. This interface was embraced by early adopters but did not become mainstream because it required people to have a working knowledge of the guts of the machine and knowledge of the precise syntax to use to execute commands.
The second phase was the Graphical Interface. It used visual representations of programs, files and actions, leveraging many of the mental models people already had from the real world. This made it much easier for users to interact with a machine, and adoption took off. But it still had limitations. It represented an abstraction and could lead to confusion. The field of user experience sprang up to help make these interfaces simpler and more intuitive.
A third wave has emerged, and it is what we’re calling the Conversational Interface. In some ways it returns to the simplicity of the command line, in that it is primarily a text-based medium. But it differs in that rather than requiring the user to know exact commands, it offers the ability for the user to interact with a machine (or a person behind a machine) using natural language.
This change is subtle but profound. At a minimum, Conversational Interfaces are much more user-friendly than the command line. But the promise and excitement around them is their eventual ability to parse complicated requests, execute them in real time and return a result that almost feels magical. Like talking to a person.
In short, we are moving from us having to learn how to interact with computers to computers learning how to interact with us.
Why are Conversational Interfaces Interesting?
It seems as though users clearly prefer conversational interfaces. Many of the web’s most recent success stories primarily leverage a conversational interface. In fact, in just 4 short years the top 4 messaging apps eclipsed the top 4 social networking sites (in terms of monthly active users).
And platforms like Siri, Google Now, Cortana, and Amazon Echo go further. They don’t rely on typed-out text, but still represent conversational interfaces, perhaps in the best possible way.
Why do people prefer these kinds of interfaces? We believe there are several reasons.
They feel more personal than apps.
Most native apps (and websites, for that matter) have the same experience for every user. It would be difficult, if not impossible, to design a customized series of screens, with different language on each page to speak to each type of user, and with different imagery and visual aesthetic.
In a counterintuitive way, the removal of all the artifice actually opens up the opportunity for a more personalized experience. As Jonathan Libov observed, “Language is the most powerful, useful, effective communication technology ever, period.”
This personalization can manifest itself in several ways. At its simplest, a retrieval-based approach to conversation is able to create a “choose your own adventure” style of interaction with customers. While the first couple of questions are the same for each user, the conversation quickly becomes a unique experience.
This personalization can be magnified by remembering user preferences. Alexander Weidauer demonstrates a great example, showing how the question “how is my business doing” could be answered in two completely different ways depending on a person’s role in the company.
Conversational interfaces are more appropriate for many interactions.
In a series of customer development interviews about chatbots and conversational interfaces, we uncovered several kinds of interactions where customers prefer them.
"I like the idea of not having to download an entire app."
The first and possibly most important relates to the nature of interaction customers want to have with a brand. While brands are constantly pushing for deeper levels of engagement with their customers, customers don’t always want it. Mobile apps are a great example - of course companies want their customers to download their app. But customers treat their phone’s home screen as precious real estate, and simply are unwilling to give their mortgage company (or even their phone provider) a spot on that page.
Those same customers are much more willing to engage in a conversation through a platform like FB Messenger. They know it’s there when they need it, but out of the way when they don’t. While mobile apps still should represent a crucial piece of a company’s technology strategy, there are many customers they might not be engaging with because of the depth of relationship implicit in such a platform. Conversational interfaces allow them to engage those customers.
"I’d actually prefer to give my information to a bot. No judgment."
The second insight is related to sharing personal information. When sharing things like medical information (a user’s weight, for example), financial information (how much they’ve saved for retirement) or their adherence to a plan of some kind, they actually prefer talking to a computer.
The belief is that unlike a person, who might consciously or unconsciously judge their choices, computers will simply provide helpful objective information.
"Sometimes I don’t want to browse - I just want you to tell me what to get."
The third insight is related to curation. It is unlikely that a traditional e-commerce experience would be viable inside a conversational interface. And there is plenty of evidence that the visual-based browsing modality is an enjoyable one for many people.
But there are times when a user would prefer not to have to hunt through your entire catalog, and would instead like to be guided in a much more focused way.
"My kids text all day. They don’t use email at all."
The last insight was the strong preference for conversational interfaces with people in their teens and early 20s. Their content consumption patterns are dramatically different from older generations... They send hundreds of text messages... They live inside of Snapchat... They never check email... They use social sites less frequently than older demographics (except for Instagram). For companies or brands looking to reach teenagers, understanding the nuances, benefits and limitations of conversational interfaces seems wise.
They’re Available Wherever a User Wants
Many brands have been hesitant to play inside of the confines of the walled gardens that are FB, WhatsApp, etc. They (rightfully) understand that they are effectively renting their real estate from these companies, and in the face of quarterly earnings these companies continue to decrease organic reach and force you to pay for access. It’s possible that they might charge rent on your bots in the future as well.
But this mindset assumes that your app (vs. Facebook’s or anyone else’s) is your product. It’s not.
You are no longer a web app or a mobile app. You provide a product, service or experience. And that experience can and should be able to be delivered wherever is most convenient for your customers.
As Chris Messina puts it, “Conversational commerce is about delivering convenience, personalization, and decision support while people are on the go, with only partial attention to spare.” This means that you need to think beyond the confines of your own app. It’s becoming increasingly important to think about your products and services in the right context.
Some companies have come to the same conclusion, and are rapidly experimenting. You can now order a Domino’s pizza from Twitter with a pizza emoji. You can order Taco Bell from inside Slack. You can order an Uber using your Amazon Echo.
These companies are happy to jump onto the next big platform because of the tremendous opportunity it opens up in terms of customers. Just as newspapers and televisions and shopping malls used to be the aggregators and platforms of yesterday, there are now a dozen or so platforms with hundreds of millions of customers, waiting to be reached.
One Interface, Multiple Departments
The last big benefit of conversational interfaces is how simple it becomes for users to interact with several departments through a single interface. There don’t have to be multiple websites or phone numbers or email addresses to visit.
With the right plumbing behind the scenes, a user should be able to interact with several areas inside your organization through a single text field, just as a terminal allows you to navigate to disparate parts of your computer. This sounds simple, but the benefit to users is profound.
Messaging apps become the new home screen.
For these reasons, it’s not unreasonable to envision a future where users spend more and more time inside of the conversation view of various messaging apps, to the point that it effectively becomes their home screen.
In such a world, it becomes somewhat similar to email in that it’s natural for me to jump in and out of conversations with friends and brands. Having real estate in that threaded view becomes critically important in such a world. But unlike email, these platforms have made it impossible for users to receive spam - they have to opt into any communications with brands, and at any moment can banish them forever with a couple of taps. For this reason these platforms won’t suffer the same fate as your inbox - real estate on these screens will continue to be extremely valuable.
How to Make Great Conversational Experiences
So how do you take advantage of the opportunity that conversational interfaces present? The following are some usability and strategic guidelines we’ve identified in our research of existing implementations and through customer interviews.
Keep in mind we’re in the very early stages of this opportunity, and many of the things that are not currently possible will soon be trivial. The platforms will only become more powerful and useful for customers.
Conversational Interfaces vs. Bots
Many of the exciting applications of conversational interfaces surround the use of bots and machine learning. But conversational interfaces are powerful in and of themselves, even if they are augmented partially or entirely by humans. They’re great interfaces for onboarding new users inside of native apps, and many of the use cases we’ll discuss in this paper are just as relevant without the use of bots.
The interest in bots and the advent of machine learning that can actually simulate human conversation make conversational interfaces much more powerful. The benefits to the user are largely the same (meaning I can communicate using natural language and have the interface understand and respond to me intelligently). But leveraging technology allows these kinds of interactions, which previously required human intervention, to happen at scale.
It’s Easy to Build a Bot. It’s Hard to Build a Useful One.
Building a bot is actually a trivial exercise - you can build one in a weekend without much trouble. But to make a bot that actually does something requires more work.
There are several layers to creating a useful bot experience. You have the bot itself, which will likely leverage one of a variety of retrieval-based platforms to communicate with the user. It will allow your system to understand a user’s input and the intent behind their request, but it won’t be able to do anything with it by itself.
Your bot “brain” needs a body of knowledge to support it and provide relevant answers. This will come from a couple of places. The first is your app, or database, or whatever houses the information that is proprietary to your business or organization. The second comes from a series of integrations with the other tools your organization relies on - CRMs like Salesforce, your e-commerce platform, etc.
You likely will need some form of human interface with your system, at a minimum to monitor the interactions users have with your system to identify opportunities for improvement, and possibly to intervene and handle portions of the customer interaction directly.
And finally you have the integrations with the platforms themselves. Each will require its own code, has its own conventions and data structures, and likely has different user expectations. A user interacts with a bot differently inside of Slack than with their Echo - the system should adapt the experience to reflect those differences.
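As a rough sketch of that last layer, the bot “brain” can produce one platform-neutral reply and let a small adapter per platform decide how to present it. The names here (BotReply, render_for_chat, render_for_voice) and the dictionary shapes are our own simplifications for illustration, not the actual payload formats of Slack, Messenger or the Echo.

```python
from dataclasses import dataclass, field


@dataclass
class BotReply:
    """Platform-neutral reply produced by the bot "brain" (hypothetical type)."""
    text: str
    options: list[str] = field(default_factory=list)


def render_for_chat(reply: BotReply) -> dict:
    """Text platforms can show the options as tappable buttons."""
    return {"text": reply.text, "buttons": list(reply.options)}


def render_for_voice(reply: BotReply) -> dict:
    """Voice platforms have no buttons, so the options are read aloud."""
    speech = reply.text
    if reply.options:
        speech += " You can say: " + ", or ".join(reply.options) + "."
    return {"speech": speech}
```

The same BotReply("What size?", ["small", "medium"]) becomes two buttons in a chat window and a spoken prompt on a voice device, so the conversation logic only has to be written once.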
All (or most) of these layers are critical to building a bot that works well. But they don’t all have to be online at the outset.
Narrow the domain.
The most effective way to layer in functionality over time is to make sure you narrow the domain or scope of the interface. The less a bot does, the less integration work you have to do, the fewer opportunities there are for the conversation going off the rails, and the more likely you’re able to create an experience your customers appreciate.
When you first put a conversational interface into the wild, one of the first things people will do is try to break it by asking strange questions. But as long as you’ve done a good job of communicating the domain up front, they won’t get upset when it responds with canned variations of “I don’t understand.” Your user isn’t going to expect your customer support bot to have a conversation about politics.
Making sure you communicate the domain of the interface at the outset of a customer engagement ensures that expectations are properly set.
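Here is a minimal sketch of what a narrow domain might look like in practice, assuming a simple keyword match stands in for whatever intent detection your bot platform provides. The intents, keywords and wording are hypothetical; the point is that the greeting states the domain up front and the fallback restates it.

```python
from typing import Optional

# Hypothetical intents for a narrowly scoped order-support bot.
INTENT_KEYWORDS = {
    "order_status": {"order", "tracking", "shipped", "delivery"},
    "returns": {"return", "exchange", "refund"},
}

# The greeting sets expectations; the fallback repeats the domain.
GREETING = "Hi! I can help with order status and returns."
FALLBACK = "Sorry, I don't understand. I can help with order status and returns."


def detect_intent(message: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the message, if any."""
    words = {word.strip(".,!?").lower() for word in message.split()}
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


def respond(message: str) -> str:
    """Answer in-domain requests; fall back politely on everything else."""
    intent = detect_intent(message)
    if intent is None:
        return FALLBACK
    return f"Okay, let's look at {intent.replace('_', ' ')}."
```

A question about politics simply gets the fallback, which is acceptable because the greeting already told the user what the bot is for.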
Start with your script.
Before you start working with one of the bot platforms or writing a line of code, spend some time thinking about how you expect the conversation to flow. You can leverage flow chart tools to think through when to branch, when to have open input vs. forcing the user to select options, when to show results and what those results should include, what the calls to action should be, and how you want an interaction to ideally resolve itself.
You can test your flows via SMS using a tool like Twilio (or even by having a person behind the curtain send the pre-written responses) and do some user testing. If it doesn’t work as an SMS conversation, it’s unlikely to work as a bot.
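Once the flow chart exists, it can be written down as plain data before any bot platform is involved. The sketch below is one possible encoding, using a made-up pizza-ordering script; the step names, prompts and the "*" convention for open input are our own assumptions.

```python
# Hypothetical script. Each step has a prompt and a map from the user's
# answer to the next step. "*" marks a step that accepts open input, and
# an empty map marks the end of the conversation.
SCRIPT = {
    "start": {"prompt": "Pickup or delivery?",
              "next": {"pickup": "size", "delivery": "address"}},
    "address": {"prompt": "What's the delivery address?",
                "next": {"*": "size"}},
    "size": {"prompt": "Small, medium or large?",
             "next": {"small": "done", "medium": "done", "large": "done"}},
    "done": {"prompt": "Thanks, your order is in!", "next": {}},
}


def advance(step: str, user_input: str) -> str:
    """Return the next step, or stay on the current one if the input is not understood."""
    choices = SCRIPT[step]["next"]
    answer = user_input.strip().lower()
    if answer in choices:
        return choices[answer]
    # Open-input steps accept anything; otherwise repeat the current step.
    return choices.get("*", step)
```

A script in this form is easy to read aloud, test over SMS with a person behind the curtain, and revise before any code depends on it.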
Spend a lot of time error handling.
Error handling is important for any type of application, but it is even more critical inside of conversational interfaces. Every interaction will either enhance or erode trust with users, and eroded trust increases abandonment.
The nice thing is that every conversation is like a log file - if you’re paying attention, every interaction with a customer is an opportunity to improve your interface. And unlike the app stores which can have 7-10 day lag times, you can update your conversational interface daily and have it instantly available.
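One simple way to treat conversations as a log file is to count the messages the interface failed to understand and review the most frequent ones. This is a sketch under the assumption that each logged message is stored alongside a flag saying whether the bot understood it; the log format is hypothetical.

```python
from collections import Counter


def top_misses(log, limit=3):
    """Return the most common messages the bot did not understand.

    `log` is an iterable of (message, understood) pairs.
    """
    misses = Counter(
        message.strip().lower()
        for message, understood in log
        if not understood
    )
    return misses.most_common(limit)
```

The phrases at the top of that list are the best candidates for the next day's update.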
Have a personality
Because the chrome of an app gets stripped away, there are limited opportunities to have your conversational interface reinforce the brand. The best tool at your disposal is going to be your copy.
Even if your interface is bot-based, you can and should take the time to craft copy that sounds human, and embodies the traits of your brand.
Your brand team has likely done this exercise before and already knows the personality of the brand very well, but if not it pays to spend time on that here. What would your brand sound like if it were a person? Would it be funny? Would it use formal language or be more colloquial?
Clearly visualizing the personality of your interface and embodying that in your responses to users will go a long way toward creating a positive experience.
Simplify data entry whenever possible.
The magic of conversational interfaces is not about the input users make into a system. It’s about the response being relevant, fast, personalized and humanizing. This means that while you do want users to interact with a simple text input for many interactions, there are plenty of opportunities to simplify data entry by suggesting sensible defaults.
Dropping some multiple choice options, confirmation buttons and the like can often be preferable to having to type in “yes”. As a general rule, when you’re trying to guide the conversation or when a user should be selecting from a couple of options, give them tools to make that simple.
This also means taking advantage of tools to eliminate the need for data entry altogether. FB Messenger, for example, already knows and gives you access to the user’s name, so you don’t have to ask. There are other features that are sure to be coming, like access to a user’s location (you can prompt the user to supply location using the location button on the Messenger keyboard, and a few companies like Uber already have direct access to location data). Those features should make many data entry tasks unnecessary, allowing you to deliver value to the user faster.
I said it’s like a phone tree, not “literally create a phone tree.”
You can take the principle of limiting data entry overboard though, and actually create a worse experience for users. Asking users a bunch of multiple choice questions to help them drill down is not nearly as efficient as letting them tell you what they’re looking for, and one of the benefits of the conversational platforms is explicitly supposed to be to allow those kinds of interactions.
If you need 6 pieces of data from a user to begin showing results, you can still make it easier for them by asking them to search first and supply intelligent follow-up questions based on their input. For example, rather than forcing them to tap:
Women’s clothes > dresses > work > red > size 4 > under $200
You could instead have them search for “red work dress”, guess they’re looking for women’s clothing, and only ask 2 follow-up questions.
This approach wouldn’t be limited to the initial search - it could also be leveraged for filtering results. Rather than having to tap multiple times to filter or modify, allow a user to simply say “show me the first one in blue” and have it respond appropriately.
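The “red work dress” example can be sketched as slot filling: pull whatever attributes you can out of the search, guess what is safe to guess, and ask only about what is still missing. The vocabulary, the default department and the slot names below are hypothetical stand-ins for a real catalog.

```python
# Hypothetical vocabulary; a real catalog would supply these values.
VOCABULARY = {
    "category": {"dress", "shirt", "jacket"},
    "occasion": {"work", "casual", "party"},
    "color": {"red", "blue", "black"},
}

# Assumed guess based on the category, to be confirmed with the user.
DEFAULT_DEPARTMENT = {"dress": "women's clothes"}

# The six pieces of data needed before showing results.
REQUIRED = ["department", "category", "occasion", "color", "size", "max_price"]


def parse_search(query: str) -> dict:
    """Fill as many slots as possible from a free-text search."""
    found = {}
    for word in query.lower().split():
        for slot, values in VOCABULARY.items():
            if word in values:
                found[slot] = word
    category = found.get("category")
    if category in DEFAULT_DEPARTMENT:
        found["department"] = DEFAULT_DEPARTMENT[category]
    return found


def follow_up_questions(found: dict) -> list:
    """Return the slots the bot still has to ask about."""
    return [slot for slot in REQUIRED if slot not in found]
```

With this sketch, parse_search("red work dress") fills four of the six slots, leaving only size and maximum price as follow-up questions.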
Conversational interfaces lend themselves perfectly to curated search. Rather than trying to drill users down through 10,000 SKUs, you can use a subset of your best-selling or most interesting products, grouped by theme or user type.
Think about ways to make searching easier.
Just because a user is interacting with a text input doesn’t mean you have to be limited to text.
Allowing a user to drop in a photo, for example, can be a powerful way for them to tell your system exactly what they’re looking for. Google and others have open source libraries for handling these kinds of processing tasks already.
You could also bake in small delights like allowing users to communicate with emoji. Even though the meaning of emoji often can only be understood in the context of the people having the conversation, there are likely global instances where their meaning could be clear to a bot.
Develop a history with me over time.
While remembering user preferences and defaults is important for any application, it takes on additional importance inside of a conversational interface. Because we are trying to approximate what it’s like to communicate with a live person as much as possible, having the interface “remember” what was previously discussed is essential.
A user should never have to tell a conversational interface something more than once. Even if they are interacting with different departments, a user’s history of products purchased, previous support issues, etc. should be shared and baked into future interactions.
Doing so allows for some fun opportunities. It’s possible to imagine a return or exchange being as simple as a single request from a user, with the interface understanding the intent and handling all the details for them without the need for followup or transferring to different departments.
Over time, such a “relationship” becomes more valuable for the user. If a retailer knows I ordered a small shirt, it can start with that assumption (clarifying of course). If it knows I returned a small for a medium, it could either update my preferences to be a medium, or it could remember that I’m usually a small, but when products run small proactively suggest the medium instead.
If a system knows I bought a present for my mom last year roughly around this time, it could proactively remind me to get mom something again this year and suggest things similar to what I ordered last time.
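The shirt-size example could be sketched as follows, assuming the retailer stores a usual size per customer and knows which products run small. The size list and function name are made up for illustration.

```python
# Hypothetical size ladder for a single product line.
SIZES = ["small", "medium", "large"]


def suggest_size(usual_size: str, runs_small: bool) -> str:
    """Start from the remembered size and go one size up if the product runs small."""
    index = SIZES.index(usual_size)
    if runs_small and index < len(SIZES) - 1:
        index += 1
    return SIZES[index]
```

The suggestion is only a starting assumption; the interface should still confirm it with the user before placing an order.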
Use notifications intelligently and prudently.
As we mentioned earlier, the big messaging platforms have learned the lessons of our past. They’ve seen how email gets abused. Facebook saw firsthand what brands will do if you let them with the first incarnation of their platform.
As such, they’ve deliberately made their platforms customer-initiated. And they’ve made it extremely easy to block a conversation whenever a user gets annoyed.
It’s therefore imperative that notifications be used sparingly. Even if you’re a reputable brand, don’t assume people will stick around if the conversation stops being relevant - many users stopped using CNN and other early bots because the daily cadence of communication was not in alignment with user expectations.
Again, while stopping push notifications takes a couple of steps on most devices, blocking a bot takes a single tap. Don’t just transfer your push notification strategy over to your conversational interface strategy.
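One way to enforce that restraint is a simple gate in front of every bot-initiated message. The one-week minimum gap below is an assumed policy chosen purely for illustration; the right cadence depends on what your users expect.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed policy: at most one bot-initiated message per user per week.
MIN_GAP = timedelta(days=7)


def may_notify(last_sent: Optional[datetime], now: datetime, opted_in: bool) -> bool:
    """Allow a notification only for opted-in users who have not heard from us recently."""
    if not opted_in:
        return False
    if last_sent is None:
        return True
    return now - last_sent >= MIN_GAP
```

Replies to messages the user sends would bypass this check, since the user started that conversation.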
When is the Right Time to Start Building Your Conversational Interface?
A lot of the early excitement with conversational interfaces was immediately followed by users being nonplussed with the current implementations. This is not unusual. When the app store first opened up on the iPhone, many of the early apps were very crude implementations of what was to come.
But this isn’t a bad thing. The people jumping in and trying early iterations of tools are much more forgiving. Your version might suck, but so does everyone else’s. There are several reasons why we think being early in the game is beneficial.
It’s a Great PR Opportunity.
Being early maximizes the likelihood the media will care. In two years nobody is going to cover a story about how your brand is launching a conversational interface or bot (unless its implementation is incredibly novel). But right now they’re eating it up.
You Can Capitalize on an Unsaturated Channel.
The first banner ad was on HotWired in 1994, and had a click-through rate of 78%. The average FB CTR as of 2015 was 0.171%. That’s roughly a 450x difference.
With any new marketing channel, new ad unit, or new platform, performance initially is great. There’s a novelty to it, and users are open to trying new things. But as a channel gets saturated, its effectiveness wanes and user demand begins to calcify.
Just as users now download zero new apps on average, a time is coming when their willingness to adopt new conversational tools will decrease dramatically.
You Figure Out What Works Faster.
One of the best concepts that emerged from the Lean Startup movement is the concept of the build/measure/learn loop. Companies who adopt a cadence of experimentation and are able to navigate their way through that loop quickly develop big competitive advantages over time.
The benefit of being early is less about first mover advantage and much more about your ability to figure out what works and what doesn’t faster than your peers.
More data = smarter interfaces
While the systems you’ll be leveraging are retrieval-based today, there will come a time (most likely sooner than we think) where truly generative models become viable.
The effectiveness of a learner is directly correlated to the amount of data you can provide it. The more data you have, the more intelligent your interface can become over time.
Even in the context of a retrieval model, data is your friend. Since you can tweak and improve your interface on a daily basis if you want, every conversation is an opportunity to improve. Every conversation teaches you what users want to be able to do with your interface, what its limitations are and what opportunities might exist to add value. It makes sense to start figuring that stuff out now.
Build Version One Now. Learn What Works. Repeat.
For those reasons, we see no reason to wait. Open platforms with multiple billions of users, with tons of demand for great applications and limited supply simply don’t appear often. Whenever one does, it is wise to jump on it immediately and wrap your heads around it.
Don’t wait. Put a stake in the ground. Get something up, even if it’s not as good as it ultimately could be. Learn from your users - adopt a rapid cadence of experimentation and improvement.
We’d Love to Help.
If you think you’d like to dip your toes into the space and start trying some things, we’d love to help you make that happen.
Manifold has a full service team that can help you execute from start to finish. We can help craft the narrative. We can design and build conversational interfaces inside of native app experiences. We can build your bot “brain” and connect it to the various messaging platforms. We can handle the integration work with your backend platforms. And we love to work inside of an iterative process to maximize the likelihood of success.
To learn more about how Manifold can help with your digital innovation needs, contact us today.