What Data Do You Need to Jumpstart Conversational AI Training?

Written by Emily Peck on Apr 9, 2019

Behind every good AI is a lot of data. The more data that’s available, the more training the AI receives and the better it tends to perform. There’s a clear correlation between the amount of data used to bootstrap an AI’s knowledge base and its accuracy in conversations with your customers.

The availability of quality data is one of the many reasons customer service is such a great use case for AI. Most companies have loads of data available, and it comes in many forms. Let’s take a look at the data sources we love to work with:

  • FAQs and Wikis: Your existing FAQ documents and online self-help wikis make a great base for training. Figure out how you can easily export this data in a uniform format.
  • Historic Logs:  This is the golden goose for conversational AI training, as you can see how real customers are asking questions and how your human agents and social care specialists have historically responded. See if you can access: 
    • Social posts and messages
    • Chat records
    • Historic email exchanges
    • Phone conversation records

Historic logs are a great way to test how well your AI customer service responds to your core customer queries, because you can see how it interprets real-life utterances. Your customers are not always going to ask their questions the way they are phrased in an FAQ. For instance, your base data might have an order status query trained for “How can I track my order?”, yet your logs will show that customers actually ask this question like this:

  • I can’t find my email confirmation with my tracking number.
  • Will my order arrive today as planned?
  • Is my order on time?
  • I’m so excited about my dress! When will it be here?

With the right tools, you will be able to train your AI to handle all of these various utterances and increase its confidence in responding to similarly phrased queries.
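As a rough illustration, the utterances above could be grouped under a single intent and flattened into labeled training examples. This is only a sketch: the intent name `order_status` and the helper `to_labeled_examples` are hypothetical, and the format your AI platform expects may well differ.

```python
# Hypothetical intent label; the utterances are the ones quoted in this article.
TRAINING_DATA = {
    "order_status": [
        "How can I track my order?",
        "I can't find my email confirmation with my tracking number.",
        "Will my order arrive today as planned?",
        "Is my order on time?",
        "I'm so excited about my dress! When will it be here?",
    ],
}


def to_labeled_examples(data):
    """Flatten {intent: [utterances]} into (utterance, intent) pairs."""
    return [(text, intent) for intent, texts in data.items() for text in texts]


examples = to_labeled_examples(TRAINING_DATA)
for text, intent in examples:
    print(f"{intent}: {text}")
```

Each new phrasing you find in your logs can be added to the list for its intent, so the training set grows with the way customers really talk.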

By analyzing your historical logs, you can also identify the high-volume repeatable issues that you should focus on.
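One simple way to spot those high-volume issues, assuming your log entries have already been tagged with an issue label, is to count how often each label appears. The sample records, field names and the `top_issues` helper below are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: each historic log entry is already tagged with an issue label.
tagged_logs = [
    {"channel": "chat", "issue": "order_status"},
    {"channel": "email", "issue": "returns"},
    {"channel": "chat", "issue": "order_status"},
    {"channel": "social", "issue": "password_reset"},
    {"channel": "phone", "issue": "order_status"},
    {"channel": "chat", "issue": "returns"},
]


def top_issues(logs, n=3):
    """Return the n most frequent issue labels with their counts."""
    counts = Counter(entry["issue"] for entry in logs)
    return counts.most_common(n)


ranked = top_issues(tagged_logs, n=2)
print(ranked)
```

The issues at the top of the ranking are usually the best candidates to automate first.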

  • Conversational Data: Training an AI requires specifying how you want it to participate in a conversation outside of the core support queries. Some companies have a starting point in the scripts and training manuals written for human agents, but many start from scratch. Conversational data includes welcomes and greetings, responses to questions like “how are you” and “who are you”, menus, explanations of how the AI can be of assistance, etc. You might have an internal copywriter or work with your technology partner to craft copy that is personal, engaging and true-to-brand.
  • Product/service databases: Depending on your business, your AI might need to be knowledgeable about your various products and services to provide descriptions, pricing, color and sizing information, etc. This data might be provided via a real-time API if products and services constantly change, or a one-time data upload if your offerings are set.
  • Product manuals and troubleshooting: If your customers typically have questions related to product care or fixing various issues, you’ll need to have this data in a uniform, exportable format. You’ll also need to adapt it to a conversational interface. For instance, steps to reset a password or address an error code on a cable box should be presented as quick, easily digestible steps that can be sent as individual messages instead of long-form paragraphs.
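To make the troubleshooting point concrete, here is a minimal sketch of turning a long-form paragraph into individual messages. It assumes each sentence is one step, which will not hold for every manual; the sample password-reset text and the `to_messages` helper are hypothetical.

```python
import re


def to_messages(instructions):
    """Split long-form instructions into one short, numbered message per sentence."""
    # Break after sentence-ending punctuation that is followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", instructions.strip())
    sentences = [part for part in parts if part]
    return [f"Step {i}: {s}" for i, s in enumerate(sentences, start=1)]


# Hypothetical long-form help text for a password reset.
paragraph = (
    "Open the login page. Select Forgot password. "
    "Enter your email address. Open the reset link we email you."
)

messages = to_messages(paragraph)
for message in messages:
    print(message)
```

Real product manuals usually need a human pass as well, since a single step can span several sentences.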

To set your automated customer service function up for success, you need to understand the types of data you have at your disposal. Talk to your internal teams to find out how you can get access and how to gain approval to share the data with your technical AI partner (sometimes legal teams need to grant approval).

What if my company doesn’t have this data available?

Don’t fret. If your company doesn’t have this type of data available, you can still have a great AI-powered automated customer support function; however, the initial training and set-up will be more manual. Don’t let a lack of data steer you away from AI.

Ready to get started? Let’s chat