Enable Wake Word Verification
Cloud-based wake word verification improves wake word accuracy for Alexa Built-in products by reducing false wakes caused by words that sound similar to the wake word. Examples of words that might cause a false wake for "Alexa" include "Alex", "election", and "Alexis". Cloud-based wake word verification also detects media mentions of the wake word, such as the mention of "Alexa" in an Amazon commercial.
The wake word engine on a device performs the initial wake word detection, and then the cloud verifies the wake word. If the cloud detects a false wake, the Alexa Voice Service (AVS) sends a StopCapture directive to the device in the downchannel that instructs it to close the audio stream and, if applicable, to turn off the blue LEDs to indicate that Alexa has stopped listening.
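For reference, the StopCapture directive uses the standard directive envelope. The sketch below uses placeholder messageId and dialogRequestId values; see the SpeechRecognizer interface reference for the authoritative schema.

    {
        "directive": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "StopCapture",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {}
        }
    }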
Requirements for Cloud-Based Wake Word Verification
Voice-initiated devices begin to stream user speech to AVS when the wake word engine detects a spoken wake word, such as "Alexa." The stream closes when the user stops speaking or when AVS identifies a user intent, and AVS returns a StopCapture directive to the device. Cloud-based wake word verification has the following requirements:
- Wake word – Include the wake word in the stream so that AVS can perform cloud-based wake word verification, which reduces false wakes. If AVS doesn't detect the wake word during cloud-based wake word verification, AVS discards the utterance.
- 500 milliseconds of pre-roll – Pre-roll is the audio captured before wake word detection and is used to calibrate the ambient noise level of the recording to enhance speech recognition.
- User speech – Any user speech that the device captures until receiving a StopCapture directive. This allows AVS to verify the wake word included in the stream, reducing the number of erroneous responses due to false wakes.
To learn how to implement a shared memory ring buffer for writing and reading audio samples, and how to include the start and stop indices for wake word detection in each Recognize event sent to AVS, see Requirements for Cloud-Based Wake Word Verification.
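As a rough illustration of that approach, the sketch below shows a minimal single-producer/single-consumer ring buffer that tracks absolute sample indices, so the device can start streaming 500 milliseconds (8,000 samples at 16 kHz) before the wake word start index reported by the wake word engine. The class name, method signatures, and the 16 kHz sample-rate assumption are illustrative only and aren't taken from the AVS Device SDK or this documentation.

    #include <atomic>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Minimal single-producer/single-consumer ring buffer for 16-bit PCM audio.
    // The capture thread appends samples; the streaming thread reads from any
    // absolute sample index, which makes it easy to start the stream with
    // 500 ms of pre-roll before the reported wake word start index.
    // Overrun handling is omitted for brevity.
    class AudioRingBuffer {
    public:
        explicit AudioRingBuffer(std::size_t capacitySamples)
            : buffer_(capacitySamples), writeIndex_(0) {}

        // Capture thread: append a block of microphone samples. Returns the
        // absolute index of the first sample written, which the wake word
        // engine can use when reporting start/stop indices for the wake word.
        std::uint64_t write(const std::int16_t* samples, std::size_t count) {
            const std::uint64_t start = writeIndex_.load(std::memory_order_relaxed);
            for (std::size_t i = 0; i < count; ++i) {
                buffer_[(start + i) % buffer_.size()] = samples[i];
            }
            writeIndex_.store(start + count, std::memory_order_release);
            return start;
        }

        // Streaming thread: copy up to maxCount samples starting at absolute
        // index `from` (for example, wakeWordStartIndex - 8000 for pre-roll).
        // Returns the number of samples copied.
        std::size_t read(std::uint64_t from, std::int16_t* out, std::size_t maxCount) const {
            const std::uint64_t end = writeIndex_.load(std::memory_order_acquire);
            if (from >= end) {
                return 0;
            }
            const std::size_t available = static_cast<std::size_t>(end - from);
            const std::size_t count = available < maxCount ? available : maxCount;
            for (std::size_t i = 0; i < count; ++i) {
                out[i] = buffer_[(from + i) % buffer_.size()];
            }
            return count;
        }

    private:
        std::vector<std::int16_t> buffer_;
        std::atomic<std::uint64_t> writeIndex_;
    };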
Update device code to send RecognizerState
A Context container communicates the state of your device components to AVS. To support cloud-based wake word verification, all voice-initiated products must send a RecognizerState context object with each applicable event.
If your product isn't voice-initiated, the RecognizerState object isn't required.
Example message
{ "header": { "namespace": "SpeechRecognizer", "name": "RecognizerState" }, "payload": { "wakeword": "ALEXA" } }
Payload parameters
Parameter | Description | Type
---|---|---
wakeword | Identifies the current wake word. Accepted value: "ALEXA" | string
Example
The following example illustrates a SpeechRecognizer.Recognize event with RecognizerState included.
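This is a representative sketch of a voice-initiated Recognize event that carries the RecognizerState context object. The messageId, dialogRequestId, profile, and wake word index values are placeholders, and the binary audio attachment is omitted; see the Recognize event reference for the authoritative schema.

    {
        "context": [
            {
                "header": {
                    "namespace": "SpeechRecognizer",
                    "name": "RecognizerState"
                },
                "payload": {
                    "wakeword": "ALEXA"
                }
            }
        ],
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "Recognize",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
                "profile": "NEAR_FIELD",
                "format": "AUDIO_L16_RATE_16000_CHANNELS_1",
                "initiator": {
                    "type": "WAKEWORD",
                    "payload": {
                        "wakeWordIndices": {
                            "startIndexInSamples": 8000,
                            "endIndexInSamples": 20000
                        }
                    }
                }
            }
        }
    }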
Implement SpeechRecognizer support for Recognize and ExpectSpeech
The SpeechRecognizer interface supports cloud-based wake word verification through the Recognize event and the ExpectSpeech directive.
Recognize event
The Recognize event includes the initiator object, which contains information about how an interaction with Alexa was initiated. If the interaction was voice-initiated, initiator includes the start and stop indices for the wake word. For more details, see the Recognize event reference.
ExpectSpeech directive
The ExpectSpeech directive also includes an initiator object. In a multi-turn scenario, where Alexa requires additional information from the user to complete a request, return the initiator from the directive back to Alexa in the subsequent Recognize event, regardless of how the interaction was initiated. For more details, see the ExpectSpeech directive reference.
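As a rough illustration, the pair of messages below sketches a multi-turn exchange: an ExpectSpeech directive that carries an initiator, followed by the Recognize event that returns that initiator to Alexa unchanged. The timeout, messageId, dialogRequestId, and initiator values are placeholders, the context array and binary audio are omitted, and the exact shape of the initiator field is defined in the ExpectSpeech directive reference.

ExpectSpeech directive (sketch):

    {
        "directive": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "ExpectSpeech",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
                "timeoutInMilliseconds": 8000,
                "initiator": "{{INITIATOR_FROM_EXPECT_SPEECH}}"
            }
        }
    }

Subsequent Recognize event returning the same initiator (sketch):

    {
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "Recognize",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
                "profile": "NEAR_FIELD",
                "format": "AUDIO_L16_RATE_16000_CHANNELS_1",
                "initiator": "{{INITIATOR_FROM_EXPECT_SPEECH}}"
            }
        }
    }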
Last updated: Dec 08, 2020