Tips & Tricks for Playing Video with APL and Alexa

I am currently in the process of building an Alexa skill that contains all of the knowledge of the Star Wars Universe.  This includes characters, droids, weapons, vehicles, planets, creatures, and even different species and organizations.  It also includes the ability to request the opening crawl videos from each of the movies in the Star Wars saga, and the trailers for the movies, television shows, and video games.

It’s the videos that have brought me here to share what I have learned.

Alexa is available on a wide variety of devices.  Some small, some big, some with screens, others without.  For those devices with screens, I want to be able to provide my users with a simple workflow.

  1. Ask for a specific video.
  2. View the requested video.
  3. Continue the conversation when the video ends.

For the first two steps, this was surprisingly easy to implement using the Alexa Presentation Language (APL).  The third step required some research and trial and error, but I have it working successfully now.

Identifying the Video a User Requested

While there is nothing complicated about identifying a user’s request, I’ll show you how I’m handling it so that, if you want to build your own version, you have everything you need.

In my Interaction Model, I have an intent called “CrawlIntent.”  This is there to handle all of the ways a user might ask to see the opening crawl of a specific film.  It looks like this:

{
  "name": "CrawlIntent",
  "slots": [
  {
    "name": "media",
    "type": "Media"
  }
  ],
  "samples": [
    "show me the {media} crawl",
    "{media} crawl",
    "can I see the {media} crawl",
    "show the crawl for {media}",
    "for the {media} crawl",
    "to show the crawl for {media}",
    "show me the {media} opening crawl",
    "{media} opening crawl",
    "can I see the {media} opening crawl",
    "show the opening crawl for {media}",
    "for the {media} opening crawl",
    "to show the opening crawl for {media}",
    "play the {media} opening crawl",
    "play the {media} crawl"
  ]
}

When a user says something like one of the utterances above, I can be confident they are looking for the opening crawl video for a specific film.  I also have a slot called media, whose slot type, Media, contains a list of all of the films and shows that I want my skill to be aware of.

{
  "values": [
    {"name": { "value": "Battlefront 2","synonyms": ["battlefront 2", "battlefront"]}},
    {"name": { "value": "Clone Wars","synonyms": ["the clone wars"]}},
    {"name": { "value": "Episode 1","synonyms": ["the phantom menace"]}},
    {"name": { "value": "Episode 2","synonyms": ["attack of the clones"]}},
    {"name": { "value": "Episode 3","synonyms": ["revenge of the sith"]}},
    {"name": { "value": "Episode 4","synonyms": ["a new hope", "new hope"]}},
    {"name": { "value": "Episode 5","synonyms": ["empire", "the empire strikes back", "empire strikes back"]}},
    {"name": { "value": "Episode 6","synonyms": ["return of the jedi", "jedi"]}},
    {"name": { "value": "Episode 7","synonyms": ["the force awakens", "force awakens"]}},
    {"name": { "value": "Episode 8","synonyms": ["the last jedi", "last jedi"]}},
    {"name": { "value": "Episode 9","synonyms": ["rise of skywalker", "the rise of skywalker"]}},
    {, "name": { "value": "Rebels","synonyms": ["star wars rebels"]}},
    {"name": { "value": "Resistance","synonyms": ["star wars resistance"]}},
    {"name": { "value": "Rogue One","synonyms": ["rogue one a star wars story"]}},
    {"name": { "value": "Solo","synonyms": ["han solo movie", "solo a star wars story"]}},
    {"name": { "value": "The Mandalorian","synonyms": ["the mandalorian"]}}
  ],
  "name": "Media"
}

This slot type allows me to match the user’s request against the list of items my skill can handle, using Entity Resolution, so I can be certain that I’m choosing the right video for their request.
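
As an illustration, here is roughly how I pull the resolved value out of the request in my CrawlIntent handler.  This is a sketch rather than my exact code: the handler name and the fallback prompt are mine, but the resolutionsPerAuthority path is the standard shape of an entity-resolved slot.

const Alexa = require('ask-sdk-core');

const CrawlIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'CrawlIntent';
  },
  handle(handlerInput) {
    // Entity Resolution nests the canonical value under resolutionsPerAuthority.
    const slot = Alexa.getSlot(handlerInput.requestEnvelope, 'media');
    const resolution = slot.resolutions.resolutionsPerAuthority[0];
    if (resolution.status.code !== 'ER_SUCCESS_MATCH') {
      return handlerInput.responseBuilder
        .speak("Which film's opening crawl would you like to see?")
        .reprompt("Which film's opening crawl would you like to see?")
        .getResponse();
    }
    const mediaName = resolution.values[0].value.name; // e.g. "Episode 4"
    // ...look up the video URL for mediaName, then build the APL response below...
  }
};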

Playing A Video Using APL

For the code of my skill, I am using the Alexa Skills Kit SDK.  This makes parsing the JSON that Alexa provides far easier, and gives me greater control over building responses for my users.

To add APL to my skill’s response, I do something like this:

var apl = require("./apl/videoplayer.json");
apl.document.mainTemplate.items[0].items[0].source = media.fields.Crawl;
handlerInput.responseBuilder.addDirective({
  type: 'Alexa.Presentation.APL.RenderDocument',
  token: '[SkillProvidedToken]',
  version: '1.0',
  document: apl.document,
  datasources: apl.datasources
});
return handlerInput.responseBuilder.getResponse();

Line #1 loads my APL document from its location in my project.  This document is the markup that tells the screen what to show.  Line #2 dynamically updates the source of the video file to be played, so that we play the appropriate video for the appropriate request.

As you’ll see in the APL document below, we define a Video element, and include a source property that indicates a specific URL for our video.

The important lesson I learned when building this is that I don’t want to include any speech or reprompts to my user in this response.  I can send this APL document to the user’s device, which immediately starts playing the video.  This is completely counter-intuitive to everything I’ve ever considered when building an Alexa skill, but it makes sense.  I’m sending them a video to watch…not trying to continue our conversation.
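
To make that concrete, here is a sketch of the difference.  The directive object is the same one built above (shown here as a hypothetical renderDocumentDirective variable); the point is simply what is missing from the second response:

// A conversational response: speech plus a reprompt keeps the conversation going.
return handlerInput.responseBuilder
  .speak('Here is what I found.')
  .reprompt('What else would you like to know?')
  .getResponse();

// A video response: the RenderDocument directive alone.  No speak(), no
// reprompt(), so the video starts immediately without Alexa talking over it,
// and nothing happens until the onEnd event described below fires.
return handlerInput.responseBuilder
  .addDirective(renderDocumentDirective)
  .getResponse();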

Adding an Event to the Video When It Is Finished

Finally, I had to do some exploration to figure out how to not only identify when the video has concluded, but also prompt my skill to speak to the user in order to continue the conversation.  This is done using the onEnd event on the Video element that we created earlier.  Here is the entire APL document.

{
  "document": {
    "type": "APL",
    "version": "1.1",
    "settings": {},
    "theme": "dark",
    "import": [],
    "resources": [],
    "styles": {},
    "onMount": [],
    "graphics": {},
    "commands": {},
    "layouts": {},
    "mainTemplate": {
      "parameters": [
        "payload"
      ],
      "items": [
        {
          "type": "Container",
          "height": "100%",
          "width": "100%",
          "items": [
            {
              "type": "Video",
              "width": "100%",
              "height": "100%",
              "autoplay": true,
              "source": "https://starwarsdatabank.s3.amazonaws.com/openingcrawl/Star+Wars+Episode+I+The+Phantom+Menace+Opening+Crawl++StarWars.com.mp4",
              "scale": "best-fit",
              "onEnd": [
                {
                  "type": "SendEvent",
                  "arguments": [
                    "VIDEOENDED"
                  ],
                  "components": [
                    "idForTheTextComponent"
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  },
  "datasources": {}
}

This is the second lesson that I learned when building this.  By adding this onEnd event, when the video finishes playing, it will send a new request type to your skill: Alexa.Presentation.APL.UserEvent.  You will need to handle this request type, and prompt the user to say something in order to continue the conversation.  I included the argument “VIDEOENDED” so that I’d be confident I was handling the appropriate UserEvent.  Here is my example code for handling it:

const VideoEndedIntent = {
  canHandle(handlerInput) {
    // Only handle APL UserEvents that carry our "VIDEOENDED" argument.
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'Alexa.Presentation.APL.UserEvent'
      && handlerInput.requestEnvelope.request.arguments[0] === 'VIDEOENDED';
  },
  handle(handlerInput) {
    // The video is over, so speak to the user to restart the conversation.
    const actionQuery = "What would you like to know about next?";
    return handlerInput.responseBuilder
      .speak(actionQuery)
      .reprompt(actionQuery)
      .getResponse();
  }
};
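
One detail that is easy to miss: the new handler has to be registered with the skill builder like any other, or the UserEvent will fall through to your error handler.  A minimal sketch, assuming the CrawlIntentHandler name from the earlier sketch:

const Alexa = require('ask-sdk-core');

exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(
    CrawlIntentHandler, // builds the RenderDocument response
    VideoEndedIntent    // catches the onEnd SendEvent
    // ...plus launch, help, and the other standard handlers...
  )
  .lambda();
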
With these few additions to my Alexa skill, I was able to play videos for my users and bring them back to the conversation once the video concludes.

Have you built anything using APL?  Have you published an Alexa skill?  I’d love to hear about it.  Share your creations in the comments!

Introducing the Smart Deck

We recently renovated our 15-year-old wooden deck, and I wanted to share with you how we created a smart deck.  Here’s what it looked like before we started this project (we had already cut the benches up before I took the photo):

[Photo: the old deck before renovation]

This video illustrates pretty well how it has changed.

The technology behind everything is actually pretty simple.  The floodlight is a standard fixture hooked up to a WeMo Light Switch.  I’ve had this switch installed for about three years now, and it’s still the perfect solution.  We have 6 more of these throughout our house.

For the colored lights in the deck itself, I took a chance on a set of LED lights that I found on Amazon.com, listed in the “Works with Alexa” category.  They’re made by a company called FVTLED.  I could not be happier with how they turned out.  Each 10-light kit costs about $100, but includes a wi-fi module, a remote control, and an outdoor power supply.

You couldn’t see them in the dark (and I didn’t want to turn them on and wake the neighbors), but there are two speakers connected to a Bluetooth receiver mounted above the deck as well. This allows me to pair an Alexa device, or my phone, to the receiver, and play music through the speakers.

The Grace Digital receiver is small.  Maybe 6 inches wide, and 10 inches deep.

Grace Digital GDI-BTAR513 100 Watt Digital Integrated Stereo Amplifier with Built-In AptX Bluetooth Wireless Stereo Receiver

The Yamaha speakers are pretty standard outdoor speakers.  I had to run speaker wire to them, but they don’t require any additional power.

Yamaha NS-AW150WH 2-Way Indoor/Outdoor Speakers (Pair, White)

Overall, I’m pretty happy with how this turned out.  I don’t actually expect that I’ll be running techno dance parties with flashing colored lights, but I love that I have the option.  Most of the time, I expect to be running standard white (or off white) colors.

Have you done anything cool to improve your outdoor living space?  I’d love to see it!

Getting Alexa To Pronounce Ordinals

Today, I’m working on a project that requires Alexa to say things like “first,” “second,” or “twenty-first.”  I’ve gone through a few iterations of creating these ordinal strings.

First: Brute Force Attempt

I started the easy way: I created a hard-coded switch statement for the values from 1 – 10, and used a helper function to feed me the appropriate return value as a string.  Not the most elegant, but it got the job done.

Second: Slightly More Elegant and Scalable

As my application grew, I realized that I would now need the values from 1 – 50 available in my application.  I added to my switch statement…until I got to 15.  At that point, I realized I needed a new solution that could scale to any number I passed in.  So I started writing some logic to append “st” to numbers that ended in 1, “nd” to numbers that ended in 2, “rd” to numbers that ended in 3, and “th” to pretty much everything else.  I had to write some exception cases for 11, 12, and 13.
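
Here’s a minimal sketch of that second approach.  This isn’t my exact code, but it’s the same logic, including the exception cases for 11, 12, and 13:

function ordinalSuffix(n) {
  // 11, 12, and 13 are the exceptions: they always take "th".
  const lastTwo = n % 100;
  if (lastTwo >= 11 && lastTwo <= 13) return `${n}th`;
  switch (n % 10) {
    case 1: return `${n}st`;
    case 2: return `${n}nd`;
    case 3: return `${n}rd`;
    default: return `${n}th`;
  }
}

console.log(ordinalSuffix(21)); // "21st"
console.log(ordinalSuffix(12)); // "12th"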

It was at this point that I made an amazing discovery.

Third: Alexa is already too smart for me.

While playing with my second solution, I used the Voice Simulator that is available when you are building an Alexa skill.  I wanted to see if Alexa would pronounce the words the same if I just appended the suffixes like “th” or “nd” to the actual number value, rather than trying to convert the whole thing to a word.

This is where the discovery was made.

I tried getting her to say “4th,” and she pronounced it as I expected: “fourth.”

On a whim, I added “th” to the number 2, which would normally be incorrect.  She pronounced it “second.”  I had the same experience with “1th,” which she still got correct as “first.”

If you append “th” to the end of any number, Alexa will pronounce the appropriate ordinal.
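
That means the entire helper above collapses to a simple template string.  A tiny sketch (the speech line is mine, not from a real skill):

// Alexa's text-to-speech normalizes "1th", "2th", "3th", and so on into the
// correct ordinal words, so a bare "th" suffix is all you need.
const placement = 2;
const speech = `You finished in ${placement}th place.`; // spoken as "second"

return handlerInput.responseBuilder
  .speak(speech)
  .getResponse();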

My mind was slightly blown today.  Thanks, Alexa.

Holy Cow Garageio!

I’ve decided to start a series of posts about the ever-growing list of smart home devices I’ve decided to bring into my home.  These won’t be on a regular schedule, but as I continue to add functionality to my house, I’ll do my best to provide my opinions and experience with those products.

Today, I want to talk about Garageio.

You can probably guess from the name, but Garageio is a device that you connect to your garage doors to open/close them, as well as monitor their state.  In addition, you can connect Garageio to Alexa, and make all of that functionality happen with your voice.

Yes, there are certainly cheaper options.  Yes, you could probably build one yourself.  But to get all of this functionality in a package that works reliably, has IFTTT integrations, a great mobile experience, AND works with Alexa?  That’s a tougher deal to beat.

In fact, I tried.  I bought a WeMo Maker device ($70) and hooked that up to my garage door.  It worked, but it didn’t manage state.  So I added a webcam to my garage so that I could see whether the door was open.  It also only allowed me to send an “event” to my door, which meant that it would close if it was open, and open if it was closed.  Not a great experience.

Installation

Installation was surprisingly easy.  The entire contents of the box boiled down to five parts. (I have the two-door model, but the different models really just determine how many wires you get.  It appears it’s always the same box.)

  • The Garageio Black Box
  • Wire for connecting box to garage door opener #1.
  • Wire for connecting box to garage door opener #2.
  • Sensor for garage door #1.
  • Sensor for garage door #2.

Basically, you connect all four wires to the box, connect the box to your wifi, and you’re off and running.  Incredibly easy.

Using the Garageio App

For most smart home devices, the app that drives everything is a make-or-break experience.  Thankfully, the Garageio team knocked this one out of the park.  I have a horizontally scrolling list of my doors; swiping up on a door opens it, and swiping down closes it.

[Screenshot: the Garageio app’s door list]

I also get notifications if a door stays open for 15 minutes.  This is a nice feature, but as a parent of two active kids, I find the door is constantly open in the afternoons after school.  My daughter gets home at 3pm, so nearly every day at 3:15pm, I get a notification that the door is still open.  You only get one notification, however, so it’s not annoying.

You can see from my screenshot that there’s also the ability to “Share Doors.”  This allows me to grant temporary (or permanent) access to my garage door to others.

[Screenshot: sharing door access in the Garageio app]

IFTTT Integration

As expected, they also did an excellent job with their IFTTT integrations, so that all of the functionality I want can be triggered by all of the other services I use.  For example, I can set a geofence on my phone, so that if I enter a specific area, my garage door automatically opens.

I can also set specific times, so that at 10:30pm it automatically shuts both of my doors and I don’t leave them open all night.

If you’ve used IFTTT, you know this is only scratching the surface of what is possible, but there are only so many creative ways to open and close a door.  So far, I’ve been delighted.

Alexa Integration

“Alexa, ask Garageio to close Bike Door.”

It works exactly as you would expect.

Garageio was an early entrant into the world of Alexa, which is awesome.  I think they will eventually hook it up to the new Smart Home Skill API, which would simplify how I communicate with it, but even now, it’s perfect.  It recognizes the names I gave my doors, and works every time.  I’m really happy to have this device in my house, and I would highly recommend it for yours.

You can pick one up on Amazon for about $200.

Making An Alexa Raspberry Pi

Last week, I ordered all of the bits and pieces I needed to get a Raspberry Pi configured to become an Alexa device.  It was incredibly easy, the tutorial was very straightforward, and I ended up with something that can do this:

What You Need

If you want to try this, here’s what you’ll need (links and prices from Amazon):

Optionally, you might want to protect your Raspberry Pi if you plan to take it anywhere.  They make a very nice, inexpensive case for it:

Finally, there are a few things you’ll need to get it running, but these are things I assume you probably have.  If you don’t, I’ve recommended some with the links.

  • USB keyboard & mouse (Logitech MK270 Wireless USB keyboard and mouse – $19.95)

    I like this one because it’s small, compact, and easy to travel with.  Most travel keyboards are garbage, so I tend to lean towards smaller, full-function keyboards instead.  (My primary keyboard is a Das Keyboard, much bigger and clickier.)

  • HDMI monitor (there are way too many options here, any monitor will do. I’m hunting for a tiny one I can travel with.  Like 5″ or smaller.  But in a secure case, since it will likely see the bottom of my backpack occasionally.)
  • Micro-USB Charging Cable (you literally have 100 of these in a drawer. Any of them.)
  • 3.5mm audio cable
  • Literally ANY speaker that can take a 3.5mm audio cable as input (I used the Nokia MD-12 for mine, but you can certainly find cheaper speakers if you need one.)

The How-To

I would normally give you a run-down of the steps I took, and the issues I faced, but there simply isn’t much point in that.  I followed the provided tutorial on GitHub, and it was one of the smoothest experiences I’ve ever had setting something like this up.

https://github.com/alexa/alexa-avs-sample-app/wiki/Raspberry-Pi

My Takeaways

I’m working on a few things to enhance the experience, but here are my takeaways:

  1. If you ONLY want an Alexa device, this is probably not the project for you.  The Echo Dot is $49.99, and doesn’t require any setup to work.  This project, at a minimum cost, is about $53.15.  That being said, having an Alexa device that can also run some other services is really compelling.  Adding a touchscreen to it would allow you to see the “cards” that Alexa skills produce at http://alexa.amazon.com, for example.
  2. Each time you power up the Raspberry Pi, you have to manually start all of the services again.  I’m hoping that with some creative effort, this might not always be true, but there’s some authentication that happens that requires your monitor, mouse, and keyboard every time you power it up.  (This is why I’m looking for travel keyboards and monitors.)
  3. This was one of my first experiences in the Raspberry Pi ecosystem, and I’m very excited by what I found.  There are tons of accessories to enhance and protect your device, and I’m looking forward to seeing where I can take this project forward.