I am currently building an Alexa skill that contains all of the knowledge of the Star Wars universe. This includes characters, droids, weapons, vehicles, planets, creatures, and even different species and organizations. It also lets users request the opening crawl videos from each of the movies in the Star Wars saga, as well as the trailers for the movies, television shows, and video games.
It’s the videos that have brought me here to share what I have learned.
Alexa is available on a wide variety of devices. Some small, some big, some with screens, others without. For devices with screens, I want to provide my users with a simple workflow:
- Ask for a specific video.
- View the requested video.
- Continue the conversation when the video ends.
The first two steps were surprisingly easy to implement using the Alexa Presentation Language (APL). The third step required some research and trial and error, but I have it working successfully now.
Identifying the Video a User Requested
While there is nothing complicated about identifying a user's request, I'll show you how I am handling it so that, if you want to build your own version, you have everything you need.
In my Interaction Model, I have an intent called “CrawlIntent.” This is there to handle all of the ways a user might ask to see the opening crawl of a specific film. It looks like this:
{
  "name": "CrawlIntent",
  "slots": [
    {
      "name": "media",
      "type": "Media"
    }
  ],
  "samples": [
    "show me the {media} crawl",
    "{media} crawl",
    "can I see the {media} crawl",
    "show the crawl for {media}",
    "for the {media} crawl",
    "to show the crawl for {media}",
    "show me the {media} opening crawl",
    "{media} opening crawl",
    "can I see the {media} opening crawl",
    "show the opening crawl for {media}",
    "for the {media} opening crawl",
    "to show the opening crawl for {media}",
    "play the {media} opening crawl",
    "play the {media} crawl"
  ]
}
When a user says something to my skill like one of the utterances above, I can be confident they are looking for the opening crawl video for a specific film. I also have a slot type, called Media, that contains a list of all of the films and shows that I want my skill to be aware of.
{
  "values": [
    {"name": {"value": "Battlefront 2", "synonyms": ["battlefront 2", "battlefront"]}},
    {"name": {"value": "Clone Wars", "synonyms": ["the clone wars"]}},
    {"name": {"value": "Episode 1", "synonyms": ["the phantom menace"]}},
    {"name": {"value": "Episode 2", "synonyms": ["attack of the clones"]}},
    {"name": {"value": "Episode 3", "synonyms": ["revenge of the sith"]}},
    {"name": {"value": "Episode 4", "synonyms": ["a new hope", "new hope"]}},
    {"name": {"value": "Episode 5", "synonyms": ["empire", "the empire strikes back", "empire strikes back"]}},
    {"name": {"value": "Episode 6", "synonyms": ["return of the jedi", "jedi"]}},
    {"name": {"value": "Episode 7", "synonyms": ["the force awakens", "force awakens"]}},
    {"name": {"value": "Episode 8", "synonyms": ["the last jedi", "last jedi"]}},
    {"name": {"value": "Episode 9", "synonyms": ["rise of skywalker", "the rise of skywalker"]}},
    {"name": {"value": "Rebels", "synonyms": ["star wars rebels"]}},
    {"name": {"value": "Resistance", "synonyms": ["star wars resistance"]}},
    {"name": {"value": "Rogue One", "synonyms": ["rogue one a star wars story"]}},
    {"name": {"value": "Solo", "synonyms": ["han solo movie", "solo a star wars story"]}},
    {"name": {"value": "The Mandalorian", "synonyms": ["the mandalorian"]}}
  ],
  "name": "Media"
}
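Before moving on, here is a minimal sketch of how the CrawlIntent handler can pull the canonical value back out of that slot. The handler name and the lookup step are my own placeholders, but the entity-resolution path through the request envelope is standard ASK SDK:

const Alexa = require('ask-sdk-core');

const CrawlIntentHandler = {
    canHandle(handlerInput) {
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
            && Alexa.getIntentName(handlerInput.requestEnvelope) === 'CrawlIntent';
    },
    handle(handlerInput) {
        // Entity resolution maps whatever the user actually said ("empire")
        // back to the canonical value defined in the Media slot type ("Episode 5").
        const slot = Alexa.getSlot(handlerInput.requestEnvelope, 'media');
        const authority = slot.resolutions
            && slot.resolutions.resolutionsPerAuthority
            && slot.resolutions.resolutionsPerAuthority[0];
        const mediaName = (authority && authority.status.code === 'ER_SUCCESS_MATCH')
            ? authority.values[0].value.name
            : Alexa.getSlotValue(handlerInput.requestEnvelope, 'media');

        // From here, look up the video URL for mediaName and build the APL
        // response shown in the next section.
        return handlerInput.responseBuilder.getResponse();
    }
};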
Playing a Video Using APL
For the code of my skill, I am using the Alexa Skills Kit (ASK) SDK. This makes parsing the JSON that Alexa provides far easier, and gives me greater control over building responses for my users.
To add APL to my skill’s response, I do something like this:
// Load the APL document (and its datasources) from a JSON file in the project.
var apl = require("./apl/videoplayer.json");

// Swap in the URL of the crawl video that matches the user's request.
apl.document.mainTemplate.items[0].items[0].source = media.fields.Crawl;

handlerInput.responseBuilder.addDirective({
    type: 'Alexa.Presentation.APL.RenderDocument',
    token: '[SkillProvidedToken]',
    version: '1.0',
    document: apl.document,
    datasources: apl.datasources
});

return handlerInput.responseBuilder.getResponse();
The require statement points to the location of my APL document; this document is the markup that tells the screen what to show. The next line dynamically updates the source of the video file to be played, so that we show the appropriate video for the appropriate request.
As you’ll see in the APL document below, we define a Video element, and include a source property that indicates a specific URL for our video.
The important lesson I learned when building this is that I don’t want to include any speech or reprompts to my user in this response. I can send this APL document to the user’s device, which immediately starts playing the video. This is completely counter-intuitive to everything I’ve ever considered when building an Alexa skill, but it makes sense. I’m sending them a video to watch…not trying to continue our conversation.
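One related guard worth adding: not every device running your skill has a screen, so you only want to send this directive when APL is supported. Here's a minimal sketch, wrapped in a hypothetical helper; getSupportedInterfaces comes from ask-sdk-core, but the fallback speech is my own wording, not from the original skill:

const Alexa = require('ask-sdk-core');

// Hypothetical helper: build the crawl response, falling back to speech on
// devices without a screen. `apl` is the document object loaded above.
function buildCrawlResponse(handlerInput, apl) {
    // Voice-only devices don't list the APL interface in the request envelope.
    const supportedInterfaces = Alexa.getSupportedInterfaces(handlerInput.requestEnvelope);

    if (supportedInterfaces['Alexa.Presentation.APL']) {
        // Screen device: send the video and nothing else. No speech, no reprompt.
        handlerInput.responseBuilder.addDirective({
            type: 'Alexa.Presentation.APL.RenderDocument',
            token: '[SkillProvidedToken]',
            version: '1.0',
            document: apl.document,
            datasources: apl.datasources
        });
    } else {
        // Voice-only device: explain instead of going silent.
        handlerInput.responseBuilder
            .speak('Sorry, I can only show the opening crawl on a device with a screen.');
    }

    return handlerInput.responseBuilder.getResponse();
}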
Adding an Event to the Video When It Is Finished
Finally, I had to do some exploration to figure out how to not only identify when the video has concluded, but also prompt my skill to speak to the user in order to continue the conversation. This is done using the onEnd event on the Video element that we created earlier. Here is the entire APL document.
{
  "document": {
    "type": "APL",
    "version": "1.1",
    "settings": {},
    "theme": "dark",
    "import": [],
    "resources": [],
    "styles": {},
    "onMount": [],
    "graphics": {},
    "commands": {},
    "layouts": {},
    "mainTemplate": {
      "parameters": [
        "payload"
      ],
      "items": [
        {
          "type": "Container",
          "items": [
            {
              "type": "Video",
              "width": "100%",
              "height": "100%",
              "autoplay": true,
              "source": "https://starwarsdatabank.s3.amazonaws.com/openingcrawl/Star+Wars+Episode+I+The+Phantom+Menace+Opening+Crawl++StarWars.com.mp4",
              "scale": "best-fit",
              "onEnd": [
                {
                  "type": "SendEvent",
                  "arguments": [
                    "VIDEOENDED"
                  ]
                }
              ]
            }
          ],
          "height": "100%",
          "width": "100%"
        }
      ]
    }
  },
  "datasources": {}
}
When the video finishes, the SendEvent command fires, and Alexa sends my skill an Alexa.Presentation.APL.UserEvent request. Back in the skill code, I handle that event and pick the conversation back up:

const VideoEndedIntent = {
    canHandle(handlerInput) {
        // The SendEvent command arrives as an APL UserEvent, with the
        // arguments defined in the document ("VIDEOENDED") intact.
        return Alexa.getRequestType(handlerInput.requestEnvelope) === 'Alexa.Presentation.APL.UserEvent'
            && handlerInput.requestEnvelope.request.arguments[0] === 'VIDEOENDED';
    },
    handle(handlerInput) {
        // The video is over, so speak again and invite the next question.
        const actionQuery = "What would you like to know about next?";
        return handlerInput.responseBuilder
            .speak(actionQuery)
            .reprompt(actionQuery)
            .getResponse();
    }
};
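The last step is wiring the handler into the skill. A minimal sketch, assuming ASK SDK v2 and a CrawlIntentHandler like the sketch earlier (the handler names are placeholders for however yours are organized):

const Alexa = require('ask-sdk-core');

exports.handler = Alexa.SkillBuilders.custom()
    .addRequestHandlers(
        CrawlIntentHandler,  // plays the requested opening crawl
        VideoEndedIntent     // resumes the conversation when the video ends
    )
    .lambda();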