Making a Sound

Ultimately, the goal of an APL-A template is to produce speech or sounds that will be returned as part of a skill response. For this purpose, APL-A offers two fundamental components that make sound—Speech and Audio—and a third component that intentionally doesn’t make any sound. These components are:

  • Speech—Causes Alexa to speak some given text or SSML

  • Audio—Plays a sound effect from Alexa’s soundbank or a custom MP3 from a given URL

  • Silence—Pauses silently for a given duration of time

The Speech component works much like the speak() method on the response builder in a request handler, accepting either plain text for Alexa to speak or SSML to customize how Alexa speaks. The Audio component, on the other hand, plays music or sound effect files, similar to SSML’s <audio> tag.

The Silence component works much like SSML’s <break> tag to pause for a specified amount of time. This can be used to space out sounds and speech in an APL-A template.

Let’s see how these components work, starting with the Speech component.

Rendering Speech

In its simplest form, Speech causes Alexa to say whatever text you provide. For example, here’s a minimal APL-A template that causes Alexa to speak a welcome message:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "item": {
       "type": "Speech",
       "content": "Welcome to Star Port 75 Travel!"
     }
   }
 }

The structure of the Speech component is typical of all APL-A components. It is a JSON object with a type property that specifies which APL-A component you want (in this case, Speech), along with one or more other properties that specify how the component should work. Here, the content property contains the text you want Alexa to speak.

Feel free to try out this template (and all others we create in this chapter) in the APL-A editor to see how it sounds.

While plain-text speech is easy enough, you might want to produce something a bit richer. In the previous chapter, we learned how to use SSML to embellish and alter the way that Alexa speaks. You can use SSML in the content, as shown in the following example:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "item": {
       "type": "Speech",
       "contentType": "SSML",
       "content":
         "<speak><say-as interpret-as='interjection'>Awesome!</say-as></speak>"
     }
   }
 }

This uses the SSML <say-as> tag to say “Awesome” as an interjection, which gives it a bit more character than plain text would. You can use any of SSML’s tags in the content property, so long as you set the contentType property to “SSML”. The contentType property is required for SSML, but is optional when sending plain text. If you would rather be explicit about the content type when using plain text, you can set it to “PlainText”.
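For example, here’s the earlier welcome template with the content type declared explicitly. It sounds exactly the same as before, since plain text is the default:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "item": {
       "type": "Speech",
       "contentType": "PlainText",
       "content": "Welcome to Star Port 75 Travel!"
     }
   }
 }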

It’s common to pass model data from the request handler to an APL-A template so that the spoken text dynamically includes content specific to the current task. For example, let’s say that the user has just scheduled a trip to Jupiter. You wouldn’t want to hard-code the content property to mention Jupiter, because that would not apply if another user plans a trip to Mars or some other destination.

Instead, you can put a placeholder expression in the APL-A template that will be filled with the destination. For example:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "parameters": [
       "payload"
     ],
     "item": {
       "type": "Speech",
       "content": "Enjoy your trip to ${payload.trip.destination}!"
     }
   }
 }

Notice the addition of parameters under “mainTemplate”. This is used to bind model data to the template. In this case, the entire model data object will be bound to the name “payload”. This is then used in the content property whose value “${payload.trip.destination}” references the model data passed to APL-A from the request handler. We’ll see how to do this in the request handler’s response in section Returning APL-A Responses. For now, though, if you want to try out this template in the APL-A editor, click on the “Data” button and add the following JSON in the Data editor:

 {
   "trip": {
     "destination": "Jupiter"
   }
 }

When the model data is bound to a parameter named payload, it will contain the entire model data object passed to the APL-A template. But you can also bind subsets of the model data object by specifying parameters whose names match the top-level properties of the model data. Given the trip model data described previously, you could write the APL-A template like this:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "parameters": [
       "trip"
     ],
     "item": {
       "type": "Speech",
       "content": "Enjoy your trip to ${trip.destination}!"
     }
   }
 }

Notice that in this example only the trip property of the model data is bound to the APL-A template. As a consequence, the reference in the Speech component is now shortened to “${trip.destination}”.

As you’re trying this out, feel free to change the value of the destination property. When you do, be sure to refresh the audio, and then click the play button to hear how APL-A inserts the value of the destination property into the produced audio.

Now let’s look at APL-A’s other sound-making component: Audio.

Playing Music and Sound Effects

As we saw in the previous chapter, using sound effects and music in a response adds some flair beyond just speech. APL-A’s Audio component serves that purpose, as shown here:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "item": {
       "type": "Audio",
       "source":
         "soundbank://soundlibrary/scifi/amzn_sfx_scifi_small_zoom_flyby_01"
     }
   }
 }

Here, the source property is set to reference one of the many sounds in the ASK sound library.

Although the ASK sound library has a vast selection of sound effects, you can also provide your own by supplying a URL to an MP3 file in the source property, as shown in the following APL-A template:

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "item": {
       "type": "Audio",
       "source": "https://starport75.dev/audio/SP75.mp3"
     }
   }
 }

It’s important that the given URL be accessible to Alexa; otherwise the audio can’t be fetched and played. It must also be an HTTPS URL, as plain HTTP URLs will not work.

If you’re hosting your audio files in Amazon S3, then you’ll either need to open up the security on the S3 bucket to allow access to the audio files or, even better, leave them locked down, but use a pre-signed S3 URL. To help you obtain a pre-signed URL, the util.js module that you got when initializing the skill project provides a getS3PreSignedUrl() function.

While you can’t use the getS3PreSignedUrl() function in the APL-A template itself, you can call it from your fulfillment code and then pass the resulting URL as a parameter to the APL-A template.

We’ll look at how to pass parameters from the fulfillment code in section Returning APL-A Responses. But to give some idea of how you might pass a pre-signed S3 URL to the APL-A template, consider the following snippet of fulfillment code:

 const Util = require('./util');

 ...

 const LaunchRequestHandler = {
   canHandle(handlerInput) { ... },
   handle(handlerInput) {
     ...
     const mySoundUrl = Util.getS3PreSignedUrl('Media/sounds/my-sound.mp3');
     return responseBuilder
       .addDirective({
         "type": "Alexa.Presentation.APLA.RenderDocument",
         "document": {
           "type": "Link",
           "src": "doc://alexa/apla/documents/welcome"
         },
         "datasources": {
           "sounds": {
             "mySoundUrl": mySoundUrl
           }
         }
       })
       .getResponse();
   }
 };

The URL returned from getS3PreSignedUrl() includes token information to get through the S3 bucket’s security. But it is only valid for about one minute, which is plenty of time for the response to be sent to the device. Do not try to assign it to a module-level constant, though, because it will expire before it can be used.
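On the template side, the data source passed from the handler can be bound with a parameter, just like the trip data earlier. Here’s a minimal sketch of what the welcome document referenced by that directive might look like, assuming the datasources shape shown in the handler snippet (the sounds and mySoundUrl names simply mirror that snippet):

 {
   "type": "APLA",
   "version": "0.9",
   "mainTemplate": {
     "parameters": [
       "sounds"
     ],
     "item": {
       "type": "Audio",
       "source": "${sounds.mySoundUrl}"
     }
   }
 }

Because the pre-signed URL is generated fresh in the handler for every request, the Audio component always receives a URL that is still valid.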

You may be wondering why you’d use the Audio component instead of just using the Speech component to return SSML with the <audio> tag. While you could do that, it limits how the audio is presented in relation to speech and other sounds. As we’ll see in section Mixing Sounds, the Audio component can be mixed with Speech or other Audio components such that they play at the same time, effectively offering background audio.

But before we get to that, let’s take a quick look at the silent partner of the Speech and Audio components: the Silence component.

Taking a Moment of Silence

If you know how to read music, you’re probably familiar with rest notes. In sheet music, rest notes indicate when the instrument should make no sound for a defined period of time. Without rest notes, music would just run together and be unpleasant.

The same is true in audio responses from Alexa. Whether it be for dramatic effect or because you need to sync up one bit of audio with another, a moment of silence can be very useful.

We saw in the previous chapter how to use SSML’s <break> tag to put a pause in an Alexa response. In APL-A, the Silence component serves that same purpose. Here’s a snippet of APL-A (not a complete template) that shows how the Silence component can be used:

 {
   "type": "Silence",
   "duration": 10000
 }

The duration property specifies how long, in milliseconds, the silence should last. In this case, the silence lasts 10 seconds.

The Silence component can’t be used alone in an APL-A template. And, in fact, it doesn’t make much sense to send a response in which Alexa doesn’t say anything or play any sounds whatsoever. But where Silence shines is when you combine multiple sounds and speech in an APL-A template. So now that we’ve covered the sound-making (and silence-making) APL-A components, let’s see how they can be combined in an APL-A template.
