Unlocking the Power of Spoken Language
Actions on Google
Table of contents
Communicate what was understood
Offer examples that illustrate what people can say, and how
Avoid stating the obvious
Give users credit and save extra guidance for those who need it

The advantage of speech-enabled services is that people already know how to talk. A well-designed voice user interface (VUI) is intuitive: commands don’t have to be taught, unlike the meaning of a button in a visual interface or the keys on a touchtone phone system. Still, sometimes we do have to let people know what they can say, either because they’ve asked for help or because they’re unsure how to proceed (especially if they’re new users).

Here are some pointers on building a VUI that can carry on better conversations.

Communicate what was understood

If a person asks a question, or asks how to perform a task or action, the VUI should communicate what the system (the “recognizer”) understood (or parsed) about the question, so that people know they’ve been heard. This bolsters their trust in the speech technology. These are what we call implicit confirmations of the user’s intent:

User: How do I cancel an alarm?
VUI: If you have only one alarm and you want to cancel it, just say “Cancel my alarm.” Otherwise, try saying things like “Cancel my 7 a.m. alarm” or “Cancel all my alarms.”

User: Who made the statue David?
VUI: David was created by Michelangelo.

Remember, due to the linear nature of an audio-only interface, people have no way of skipping ahead. They have to hear one word. Before. They hear. The next. Contrast this with how a screen might return a written response all at once, and can therefore get away with saying “Michelangelo, artist” (most likely accompanied by an image). The VUI, on the other hand, needs to signal to the user that what they’re hearing is in fact what they asked for. Put another way, new information is presented last, which is known as the end-focus principle.

Offer examples that illustrate what people can say, and how

A VUI should also offer people a range of ways they can phrase their queries, or examples that illustrate levels of complexity, as in this exchange:

User: How do I set an alarm?
VUI: To set an alarm, try saying things like “Set an alarm for 6 a.m.,” or “Wake me up in two hours,” or “Set a ‘medication’ alarm for every day at 7 p.m.”

Avoid stating the obvious

Uninformative messages make your voice persona (which is also your brand) sound hapless and lacking in depth. People don’t appreciate any device underestimating their intelligence. Let’s take the example of a person who asks, “How do I get the news?”

Consider the entirely unhelpful:

VUI: To get the news, say “Get the news.”

And swapping words for variation doesn’t add value, either:

VUI: To listen to the news, say “Hear the news.”

The same person probably knows, or has already tried, framing the query just that way. People are likely trying to explore what else they can do. Instead, build in some intuitive instruction to help them infer the broader realm of possibilities:

VUI: If you’re interested in recent headlines, you can say, for instance, “Tell me the latest news.” Or try asking for a specific category, like technology or sports news.

Still, a VUI often needs to tell people what to say, without anyone explicitly asking for help. Here, too, you should avoid stating the obvious.

For example, instead of using a literal instruction like this one:

VUI: Editing your shopping list is easy. To add an item, just say “Add,” followed by the item you’d like to add. To remove an item, just say “Remove,” followed by the item you’d like to remove.

Consider being intuitively obvious instead:

VUI: By the way, to edit your list, you can say something like “Add toothpaste” or “Take off the ice cream.”

Give users credit and save extra guidance for those who need it

A VUI shouldn’t try to “teach” people what to say to protect them from veering off the so-called “happy path.” Instruction is irrelevant for those who aren’t having problems, which should be most people if you’ve designed an intuitive VUI. Instead, give instructions in fallback paths and in repair (error) prompts, as in the following example. This way, you optimize relevance for people who don’t need help, but offer help when someone seems to be stuck.

Don’t assume that everyone needs help knowing how to ask for what they want:

User: Start a metronome.
VUI: What tempo did you want to start with? You can say, for instance, “110 beats per minute.” Or you can give me a tempo like “Allegro” or “Moderately fast.” (We can always speed it up or slow it down later.)
User: 92 beats per minute.
VUI: Okay, 92 beats per minute. Here you go.

This is bad behavior on the part of the VUI: it asks a question and then immediately continues to talk without handing over the turn to the user. This approach also requires the person either to wait for the lengthy initial message to finish or to interrupt (if barge-in is enabled), which forces them to be a poor conversation participant as well.

Instead, consider the sequential, time-consuming nature of speech, and yield the speaking turn back to the user:

User: Start a metronome.
VUI: Sure, what tempo?
User: [No reply]
VUI: You can say, for instance, “110 beats per minute.” Or you can give me a tempo like “Allegro” or “Moderately fast.” (We can always speed it up or slow it down later.)
User: 92 beats per minute.
VUI: Okay, 92 beats per minute. Here you go.

It may seem like fewer steps to offer suggestions right away. But in addition to actually taking longer, doing so presumes the user is a novice, which probably isn’t the case.

Remember that people know what they want. Give them a chance before jumping in to help.
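
The turn-yielding pattern from the metronome exchange can be sketched in code. This is a minimal illustration, not an Actions on Google API: `ask_for_tempo`, `speak`, `listen`, and `max_attempts` are hypothetical names, and the prompts are copied from the example dialogue. The short question is asked first, and the longer guidance is played only after a turn with no reply.

```python
from typing import Callable, Optional

# Prompts copied from the metronome exchange in this guide.
INITIAL_PROMPT = "Sure, what tempo?"
FALLBACK_PROMPT = (
    'You can say, for instance, "110 beats per minute." '
    'Or you can give me a tempo like "Allegro" or "Moderately fast." '
    "(We can always speed it up or slow it down later.)"
)


def ask_for_tempo(
    speak: Callable[[str], None],
    listen: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask a short question first; add examples only after a silent turn.

    `speak` plays a prompt and `listen` returns the user's reply, or None
    when there is no input. Both are hypothetical stand-ins for whatever
    speech platform is in use.
    """
    prompt = INITIAL_PROMPT
    for _ in range(max_attempts):
        speak(prompt)
        reply = listen()  # the speaking turn is handed back to the user here
        if reply:
            return reply
        prompt = FALLBACK_PROMPT  # extra guidance only for those who need it
    return None


# Scripted usage: the user stays silent once, then answers.
spoken = []
replies = iter([None, "92 beats per minute"])
tempo = ask_for_tempo(spoken.append, lambda: next(replies))
```

A user who answers the first question never hears the fallback prompt, which keeps the common path short.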
Best practices
Remember these guidelines when
creating a voice experience:
→ Avoid stating the obvious
→ Communicate what the system understood
→ Offer meaningful examples when letting people know what they can say
→ Give instructions only if needed
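
As a small illustration of communicating what the system understood, the sketch below builds a spoken answer that restates the recognized question first and ends on the new information, in line with the end-focus principle. The function name `implicit_confirmation` and its parameters are hypothetical, and the wording comes from the statue example in this guide.

```python
def implicit_confirmation(understood: str, answer: str) -> str:
    """Restate what was understood, then end on the new information."""
    return f"{understood} {answer}."


# Question heard by the recognizer: "Who made the statue David?"
response = implicit_confirmation("David was created by", "Michelangelo")
```

The restated part tells the listener they were heard, and the answer arrives last, where spoken emphasis naturally falls.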
© 2016 Google Inc. All rights reserved. Google and the Google logo are trademarks of Google Inc.
All other company and product names may be trademarks of the respective companies with which they are associated.