SSML[26] is an XML-based markup language that describes not only what Alexa should say, but how she should say it. Much as HTML describes how text should appear on a webpage, SSML describes how text should sound when spoken. We will apply SSML in our skill’s responses to make Alexa sound more natural and expressive, and to add a little pizzazz.
Before we start adding SSML to our skill responses, let’s have a look at a simple SSML document on its own, outside of the context of a skill:
| <speak> |
| Hello world! |
| </speak> |
As you can see, the root element of any SSML document is the <speak> element. You can, however, leave it out of the text passed to speak(), in which case it will be inferred.
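To make the idea of an inferred root element concrete, here is a minimal JavaScript sketch. It is not the SDK's actual code: ensureSpeakRoot is a hypothetical helper, and it only illustrates that text with and without a <speak> root can end up as the same SSML document.

```javascript
// Hypothetical helper (an assumption for illustration, not part of any SDK):
// wrap text in a <speak> root element unless it already has one.
function ensureSpeakRoot(text) {
  const trimmed = text.trim();
  const hasRoot =
    trimmed.startsWith('<speak>') && trimmed.endsWith('</speak>');
  return hasRoot ? trimmed : `<speak>${trimmed}</speak>`;
}

// Both calls produce the same SSML document.
console.log(ensureSpeakRoot('Hello world!'));
console.log(ensureSpeakRoot('<speak>Hello world!</speak>'));
```

Either way, what Alexa ultimately receives is a complete SSML document, so writing the root element yourself is a matter of preference.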
Aside from the <speak> element, this example doesn’t leverage SSML to alter how Alexa speaks. So let’s make a small change to alter how Alexa will say “Hello world”:
| <speak> |
| <amazon:effect name="whispered">Hello world</amazon:effect>! |
| </speak> |
Now, instead of simply saying “Hello world” in her normal voice, Alexa will whisper the greeting, thanks to the <amazon:effect> element. But don’t just assume that to be true; let’s actually hear her say it using the handy text-to-speech simulator provided in the Alexa Developer Console.
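When the time comes to produce this markup from skill code rather than typing it by hand, a small helper can keep the tags out of the way. The following is only a sketch: whisper and escapeForSsml are hypothetical names, not SDK functions. The escaping step is there because SSML is XML, so characters such as & and < in plain text must be escaped.

```javascript
// Escape characters that are reserved in XML so plain text is safe inside SSML.
function escapeForSsml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: wrap plain text in the whispered effect.
function whisper(text) {
  return `<amazon:effect name="whispered">${escapeForSsml(text)}</amazon:effect>`;
}

// Build the same fragment as the SSML example above.
const speech = `${whisper('Hello world')}!`;
console.log(speech);

// In a request handler, a string like this would typically be handed to
// speak(), for example:
//   handlerInput.responseBuilder.speak(speech).getResponse();
```

The resulting fragment can also be pasted between <speak> tags in the simulator to check that it sounds the same as the handwritten version.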