Custom NLP by integrating Amazon Alexa with IBM Watson

Amazon is winning the smart home speaker battle.

New data from Strategy Analytics suggest that Amazon’s Alexa smart assistant is beating Google in the home. Strategy Analytics found that Amazon Alexa will be on 68 percent of all smart speakers by the end of the fourth quarter of 2017. This includes Echos built by Amazon and other products built by third parties that also run Alexa. Sonos will soon launch a speaker running Alexa, for example.

Amazon’s Echo home speaker and the device’s built-in Alexa voice-activated assistant spring into action any time you call out, “Alexa.” You can cue up music, call an Uber, or play games. If you have Internet-connected home devices you can turn on the lights with your arms full of groceries, or adjust the thermostat without lifting a finger. It’s incredibly handy.

Alexa is really cool because it’s extensible, via what Amazon calls “Skills.” The Echo is pretty cool too, mainly because it has a great audio input system, consisting of seven “far field” microphones. It enables the Echo and Alexa to pick up voice when other systems wouldn’t.

Once you have an Amazon developer account, creating an unreleased, private-to-you Alexa skill is really easy. There’s an option to create one within the Amazon Developer Console, and it’s a few simple steps:

  1. Set the Skill name and its “Invocation Name”. The invocation name is what the user says to start your skill from Alexa’s “root menu.”
  2. Create the Interaction Model. This is Amazon’s term for the data that will be used to train its Natural Language Processing (NLP) system with the terms and questions unique to your Skill.
  3. Connect to either an AWS Lambda function or an HTTPS endpoint. This is the programmatic logic that controls how your skill responds to the user.
  4. Test it! You can submit text via the console, or you can use an Echo linked to the Amazon developer account.

We chose to use Lambda for the logic system since we are familiar with it. (Lambda is also one of the places developers can run their conversational app’s business and routing logic.) Getting a basic Skill up and running was trivial. Amazon’s sample projects provide a decent starting point.
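To give a feel for step 3, here is a minimal sketch of a Python Lambda handler for a Skill. It is not the code we shipped. The intent name in the usage note and the spoken strings are made up for illustration; the response envelope (`version`, `outputSpeech`, `shouldEndSession`) follows Alexa's custom skill JSON format.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON envelope Alexa expects from a skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point invoked by Alexa; `event` is the parsed request JSON."""
    request_type = event["request"]["type"]
    if request_type == "LaunchRequest":
        # The user opened the skill by its invocation name; keep listening.
        return build_response(
            "Welcome. What kind of car are you looking for?", end_session=False
        )
    if request_type == "IntentRequest":
        # Alexa's NLP has already classified the utterance into an intent.
        intent_name = event["request"]["intent"]["name"]
        return build_response("You asked for " + intent_name + ".")
    # SessionEndedRequest and anything unexpected.
    return build_response("Goodbye.")
```

Calling `lambda_handler({"request": {"type": "LaunchRequest"}}, None)` locally returns the welcome response with the session left open, which is a quick way to check the logic before wiring it to an Echo.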

But we wanted to go beyond a simple Alexa skill. We wanted to talk back and forth with it to enable a recommendation or advisory system.

Alexa isn’t really set up for longer, interactive conversations. Or more accurately, it is, but only to the extent that most bot building platforms and frameworks are today.

It looks something like this:

  1. Message comes in from user
  2. An NLP system classifies that message into an “intent,” and extracts relevant information into “slots.” An example of slot extraction would be extracting “SUV for Family” from the sentence “I am looking for an SUV for my family,” while the intent might be “Find a Car.”
  3. Recommending a car requires further qualification of the user’s needs (budget, size, safety requirements, brand preference, and so on), which can take many conversational turns.
  4. The results from the NLP system get sent to the logic system — in this case, a Lambda function that Alexa is invoking — which then routes it.
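As a rough illustration of steps 2 and 4, the sketch below pulls the intent and filled slots out of an Alexa `IntentRequest` and routes on them. The intent name `FindCarIntent` and the slot names `carType` and `budget` are hypothetical; they would come from whatever Interaction Model you define.

```python
def extract_intent_and_slots(event):
    """Return (intent_name, {slot_name: value}) from an Alexa IntentRequest."""
    intent = event["request"]["intent"]
    slots = {
        name: slot["value"]
        for name, slot in intent.get("slots", {}).items()
        if slot.get("value")  # slots the user did not fill carry no value
    }
    return intent["name"], slots


def route(intent_name, slots):
    """Pick a reply based on the intent and the slots filled so far."""
    if intent_name == "FindCarIntent":
        if "budget" not in slots:
            return "What is your budget?"
        return "Looking for a {} around {} dollars.".format(
            slots.get("carType", "car"), slots["budget"]
        )
    return "Sorry, I did not understand that."


# Hand-written sample request: the user named a car type but no budget.
sample_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "FindCarIntent",
            "slots": {
                "carType": {"name": "carType", "value": "SUV"},
                "budget": {"name": "budget"},
            },
        },
    }
}

intent_name, slots = extract_intent_and_slots(sample_event)
reply = route(intent_name, slots)
```

With the sample request, only `carType` is filled, so the router falls into the follow-up branch and asks for the budget. That follow-up is exactly where the trouble starts.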

The problem here is step 3. That’s because we wanted to go beyond basic question answering with the Echo. We wanted to ask follow-up questions and recommend or advise.

Typically, handling follow-up messages means tracking the conversation state in your bot or Skill’s logic system, then handling the intent from Alexa’s NLP system in various ways depending on where the user is in the conversation flow. It’s a clumsy but necessary technique in most systems.
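To make that concrete, here is a small sketch of the state-tracking approach, using Alexa's `sessionAttributes` to remember which question is pending between turns. The question list, the `answer` slot name, and the attribute keys are all assumptions for illustration, not part of any real skill.

```python
# Qualification questions, asked in order (illustrative only).
QUESTIONS = [
    ("budget", "What is your budget?"),
    ("size", "How many seats do you need?"),
    ("brand", "Do you have a preferred brand?"),
]


def respond(text, attributes, end_session):
    """Build an Alexa response that carries state forward to the next turn."""
    return {
        "version": "1.0",
        "sessionAttributes": attributes,
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_turn(event):
    """Store the answer to the pending question, then ask the next one."""
    attributes = event.get("session", {}).get("attributes") or {}
    answers = dict(attributes.get("answers", {}))
    pending = attributes.get("pending")

    # Assumes a single hypothetical slot named "answer" holds the user's reply.
    slots = event["request"]["intent"].get("slots", {})
    answer = slots.get("answer", {}).get("value")
    if pending and answer:
        answers[pending] = answer

    for key, question in QUESTIONS:
        if key not in answers:
            return respond(question, {"answers": answers, "pending": key}, False)
    return respond("Thanks, I have what I need.", {"answers": answers}, True)
```

Every new branch in the conversation means more keys in the session and more conditionals in the handler, which is why this gets clumsy quickly.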

But we know that conversations can be handled a little more elegantly than that.

So we addressed these limitations: Alexa meets IBM Watson

We made an Alexa app with custom NLP and business logic, using Alexa’s pre-built NLP system only as a channel. Using IBM Watson’s advanced conversational and NLP system gave us the best of both worlds.
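One way to picture the "Alexa as a channel" idea is the sketch below. It is an illustration under stated assumptions, not our production code: it assumes the Interaction Model has a catch-all slot (here called `utterance`) that passes the user's raw words through, and it takes a hypothetical `send_to_watson` callable rather than calling a real Watson SDK. The reply shape (`output.text` as a list plus a `context` object) is modeled on Watson Conversation's message response and should be treated as an assumption too.

```python
def relay_to_watson(event, send_to_watson):
    """Forward the raw utterance to Watson and speak back whatever it says."""
    session_attrs = event.get("session", {}).get("attributes") or {}
    # Watson's conversation context rides along in the Alexa session.
    watson_context = session_attrs.get("watson_context", {})

    slots = event["request"]["intent"].get("slots", {})
    utterance = slots.get("utterance", {}).get("value", "")

    reply = send_to_watson(utterance, watson_context)
    speech = " ".join(reply["output"]["text"])

    return {
        "version": "1.0",
        "sessionAttributes": {"watson_context": reply.get("context", {})},
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": False,  # keep the conversation going
        },
    }


def fake_watson(text, context):
    """Stand-in for the Watson call so the sketch runs without credentials."""
    turn = context.get("turn", 0) + 1
    return {
        "output": {"text": ["Turn {}: you said {}".format(turn, text)]},
        "context": {"turn": turn},
    }
```

In this arrangement the Lambda function no longer tracks which question is pending. It only shuttles text and context back and forth, and the dialog flow lives entirely on the Watson side.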

Solution

 

Amazon’s strategy with Alexa is to allow the assistant to extend beyond just the voice-controlled speakers it manufactures. The company has also included the assistant in its Amazon mobile shopping app and has made it available to third parties for use in their own hardware and software applications.

Is your brand ready to tap this market?