As of January 2018, Amazon’s Alexa is arguably the leading voice recognition service for end users.* The Amazon Appstore for Voice has seen a gigantic surge since its launch in 2015, going from 130 to over 1,400 skills in its first year alone. Today there are over 15,000 new skills, with many more in the approval process. As we’ve experienced at weKnow, one of the main prerequisites for designing or developing for Alexa is getting familiar with Amazon’s terminology, which was created to standardize the structure of an invoked skill. Here is a quick guide to getting your head around the key terms Amazon uses when developing for Alexa.
What Is an Alexa Skill?
Alexa provides capabilities, or skills, that enable customers to create personalized experiences. Think of a "skill" as an App… but for Alexa. There are now tens of thousands of skills from companies like Starbucks, Uber, and Capital One as well as other innovative designers and developers.
Keep in mind that when developing a custom skill for Alexa, you can target either "headless" devices (ones that you interact with using voice only) or screen-based devices, which in addition to voice commands let you interact with Alexa through complementary display cards on the screen.
Types of Alexa Skills
Custom Skills
This is the most common type of skill, and it gives you the most control over the user experience. It lets you develop just about anything you can imagine. Example: “Order a pizza” or “Request a taxi”.
Smart Home Skills
This is a type of skill specifically for controlling smart home appliances. It gives you less control over the user experience, but is simpler to develop.
Example: “Turn on the living room lights” or “Lock the back door”.
Flash Briefing Skills
This type of skill is specifically for compatibility with Alexa’s native ‘Flash Briefing’ ability. It also gives you reduced control over the experience, but again is simpler to develop. Example: “Tell me the news” or “Give me my flash briefing”.
The first step in building a new skill is to figure out what your skill will do. This determines how your skill integrates with the Alexa service and what you need to build. Besides the types above, the Alexa Skills Kit supports building several other types of skills:
- Video Skills: “Change to channel 4” or “Play Manchester by the Sea”.
- List Skills: adding an item to a list or removing one.
Intents, Slots and Utterances
Now that you have a pretty good understanding of what a skill is and which types of skills can be built, let’s dig into the structure of the interaction a user has with Alexa over voice commands.
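An intent represents an action that fulfills a user’s spoken request. In other words, the intent is what gets triggered inside the skill when the user asks for something. Intents can optionally have arguments called slots, which capture the variable parts of a request (for instance, the day in a weather request).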
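To make the relationship between intents and slots concrete, here is a minimal sketch of how a skill’s backend might read them from the JSON request Alexa sends. The intent name `GetWeatherIntent` and the slot name `date` are hypothetical and chosen only for illustration; the envelope fields (`request`, `intent`, `slots`, `outputSpeech`) follow the custom skill JSON interface.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event):
    """Pick a reply based on the intent name and its slot values."""
    request = event["request"]
    if request["type"] != "IntentRequest":
        # For example a LaunchRequest, sent when the skill is opened
        # without a specific request.
        return build_response("Welcome. Ask me for the weather.", end_session=False)

    intent = request["intent"]
    if intent["name"] == "GetWeatherIntent":  # hypothetical intent name
        # A slot the user did not fill arrives without a "value" key.
        date = intent.get("slots", {}).get("date", {}).get("value", "today")
        return build_response(f"Here is the forecast for {date}.")

    return build_response("Sorry, I did not understand that.")


# A trimmed-down request, keeping only the fields this sketch reads.
sample_event = {
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "GetWeatherIntent",
            "slots": {"date": {"name": "date", "value": "2018-01-16"}},
        },
    },
}

print(handle_request(sample_event)["response"]["outputSpeech"]["text"])
```

The intent name decides which branch runs, while the slot value fills in the detail of the reply.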
In addition to intents and slots, there are utterances, which are the specific phrases people use when making a request to Alexa. These vary vastly. For example, think of how many ways people can ask for the weather forecast:
- “What’s the weather going to be like tomorrow?”
- “Can you tell me the weather for Tuesday?”
- “What’s the weather forecast for tomorrow?”
- “Is it gonna rain tomorrow?”
This is where a flair for communication comes in — when developing a skill, utterances have to be coded to tell Alexa what to expect. This can mean typing out dozens of very slight variations of questions and statements — basically anything you think a user would actually say to get the result they want.
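As a rough illustration of what “coding” utterances looks like, the sketch below builds a small interaction model as a Python dictionary and prints it as JSON. The invocation name `weather buddy`, the intent name `GetWeatherIntent`, and the slot name `date` are assumptions made up for this example; `AMAZON.DATE` is one of Amazon’s built-in slot types.

```python
import json

# Hypothetical interaction model: one intent, one slot, four sample utterances.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "weather buddy",  # hypothetical skill name
            "intents": [
                {
                    "name": "GetWeatherIntent",  # hypothetical intent name
                    "slots": [{"name": "date", "type": "AMAZON.DATE"}],
                    # {date} marks where the slot value appears in the phrase.
                    "samples": [
                        "what's the weather going to be like {date}",
                        "can you tell me the weather for {date}",
                        "what's the weather forecast for {date}",
                        "is it gonna rain {date}",
                    ],
                }
            ],
        }
    }
}

print(json.dumps(interaction_model, indent=2))
```

Each sample is one way of phrasing the same request, so every variation you expect from users needs its own entry.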
Luckily, there are scripts that spare us from manually writing out every combination and variation of words that can make up a spoken phrase. As an example, such scripts automatically generate variations from an array such as:
- "what is the status"
- "what's the status"
- "check the status"
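A minimal sketch of such a script is shown here. The template syntax, with alternatives written as `(a|b|c)`, is an assumption for this example and not necessarily the syntax of any particular tool.

```python
import itertools
import re


def expand(template):
    """Expand every "(a|b|c)" group in the template into all combinations."""
    # Splitting on a capturing group keeps the group contents at odd indexes.
    parts = re.split(r"\(([^()]*)\)", template)
    choices = [
        part.split("|") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


for phrase in expand("(what is|what's|check) the status"):
    print(phrase)
# what is the status
# what's the status
# check the status
```

A template with several groups multiplies quickly: two groups of three alternatives each already yield nine utterances.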
Here’s an example that’ll help you understand how a sentence is processed by Alexa.
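The sketch below breaks a spoken sentence into the pieces Amazon’s terminology names: the wake word, the launch word, the invocation name, and the utterance. It is only a toy illustration of the vocabulary, since the real service relies on speech recognition and language understanding, not string matching. The skill name “Weather Buddy” and the function `split_request` are hypothetical.

```python
def split_request(sentence, invocation_name, wake_word="alexa",
                  launch_words=("ask", "tell", "open")):
    """Label the parts of a sentence, or return None if it does not fit."""
    words = sentence.lower().replace(",", "").split()
    if len(words) < 2 or words[0] != wake_word or words[1] not in launch_words:
        return None

    name_words = invocation_name.lower().split()
    end = 2 + len(name_words)
    if words[2:end] != name_words:
        return None

    return {
        "wake_word": words[0],
        "launch_word": words[1],
        "invocation_name": " ".join(name_words),
        # Whatever follows the skill name is matched against the utterances.
        "utterance": " ".join(words[end:]),
    }


parts = split_request(
    "Alexa, ask Weather Buddy what's the weather for tomorrow",
    invocation_name="Weather Buddy",
)
for label, text in parts.items():
    print(f"{label}: {text}")
```

The utterance is then mapped to an intent, and words such as “tomorrow” end up as slot values.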
On top of intents, slots and utterances, there are Dialogs, which consist of a longer interaction between Alexa and the user. A dialog is more like a conversation with multiple turns, in which Alexa asks questions and the user responds with the answers. The conversation is tied to a specific intent representing the user’s overall request.
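One way a skill can take part in such a multi-turn conversation is to hand control back to Alexa until all the required slots are filled. The sketch below assumes a hypothetical `BookTaxiIntent` with `pickup` and `destination` slots and a dialog model defined for it; the `dialogState` field and the `Dialog.Delegate` directive come from the Alexa dialog interface.

```python
def handle_book_taxi(request):
    """Delegate slot collection to Alexa, then confirm once it is complete."""
    if request.get("dialogState") != "COMPLETED":
        # Alexa asks the next question defined in the dialog model.
        return {
            "version": "1.0",
            "response": {
                "directives": [{"type": "Dialog.Delegate"}],
                "shouldEndSession": False,
            },
        }

    slots = request["intent"]["slots"]  # hypothetical slot names below
    pickup = slots["pickup"]["value"]
    destination = slots["destination"]["value"]
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": f"Booking a taxi from {pickup} to {destination}.",
            },
            "shouldEndSession": True,
        },
    }


in_progress = {"type": "IntentRequest", "dialogState": "IN_PROGRESS",
               "intent": {"name": "BookTaxiIntent", "slots": {}}}
completed = {"type": "IntentRequest", "dialogState": "COMPLETED",
             "intent": {"name": "BookTaxiIntent", "slots": {
                 "pickup": {"name": "pickup", "value": "the airport"},
                 "destination": {"name": "destination", "value": "downtown"},
             }}}

print(handle_book_taxi(in_progress)["response"]["directives"][0]["type"])
print(handle_book_taxi(completed)["response"]["outputSpeech"]["text"])
```

Every turn of the conversation stays tied to the same intent; only the dialog state and the filled slots change between requests.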
There’s also a Skill Builder that helps you define Intents, Slots and Dialogs.
Now that we understand how Alexa works, we can proceed to the design process in order to build a great experience for the user. Start by creating a script that focuses on the happy path (assuming everything goes well and the user asks the right questions), then add unpredictable paths or variations of that linear script.
If you’ve used Alexa before you’ve probably seen cards in the app. Cards are used to display information relating to the user’s request, whether that’s simply displaying what the user asked and Alexa’s response, or information which is difficult to convey through voice (e.g. a picture, or long numbers or lists, which can be difficult to process and remember when delivered through voice only).
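As a hedged sketch, a response that both speaks and shows a card could be built like this. The helper name `build_response_with_card` and the order details are made up for the example; the `card` object with type `Simple`, a `title` and `content` follows the custom skill response format.

```python
def build_response_with_card(speech, title, content):
    """Return a response that speaks a short reply and shows the details."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            # The card appears in the Alexa app or on a device screen.
            "card": {"type": "Simple", "title": title, "content": content},
            "shouldEndSession": True,
        },
    }


response = build_response_with_card(
    speech="Your pizza is on its way. I sent the order number to your Alexa app.",
    title="Pizza order confirmed",
    content="Order number: 4815162342\nEstimated delivery: 30 minutes",
)
print(response["response"]["card"]["content"])
```

The spoken reply stays short, while the long order number, which is hard to remember when heard, goes on the card.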
When it comes to testing your skill during development, there are multiple options. Besides using an Alexa-enabled device, you can use the Test Simulator, which gives you access to most Alexa Skills Kit features, the ASK CLI to test the skill from the command line, or the skill testing feature of the Skill Management API.
Getting Your Skill Submitted for Certification
When you submit your skill to the Alexa Store, it must pass a certification process before it can be published live to Amazon customers. To ensure that your skill will meet the certification requirements, complete all of the testing described in the Certification Requirements for Custom Skills.
When your skill is ready for publication, you can submit it to Amazon for review. The Submit for Certification button becomes available once all required fields are completed.
*Sure, there’s Google with its Home suite of products as well as the upcoming Apple HomePod, to be launched on Feb. 9, but for the sake of this article we will be focusing on Amazon Alexa Voice Service.