Language Based Communication
between Humans and Robots
Alexander I. Rudnicky
MSR Faculty Summit
13 July 2010
Some areas of application
• Search and rescue
– Mobile teams of humans and robots
• Maintenance and logistics
– DARPA’s MULE / BigDog
• Domestic robots
– Companions and simple tasks
8/6/2010
Spoken Dialog Systems
• Calculator and Calendar (1988)
– Video link
• Office Manager (1990)
• Communicator (~2000)
• TeamTalk (2005-)
A generic speech system
[Diagram] Pipeline: human speech → Signal Processing → Decoder → Parser → post-Parser → Dialog Manager; the Dialog Manager consults Domain Agents, and drives a Language Generator → Speech Synthesizer → machine speech, plus display and effector outputs.
Speech Group
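As an illustration only, the pipeline can be sketched as a chain of functions; every component body here is a trivial stand-in, not the actual CMU modules:

```python
# Hypothetical stand-ins for the pipeline stages in the diagram above.

def signal_processing(audio):
    # Stand-in for feature extraction (e.g. spectral features).
    return ["features"]

def decode(features):
    # Stand-in for the speech decoder: features -> word hypothesis.
    return "what time is it"

def parse(words):
    # Stand-in for the parser: words -> concept frame.
    return {"act": "query", "topic": "time"}

def dialog_manager(frame, state):
    # Chooses the next system action from the concept frame.
    if frame["topic"] == "time":
        return {"act": "inform", "value": state["clock"]}
    return {"act": "nonunderstanding"}

def generate(concept):
    # Language generation: concept -> words for the synthesizer.
    if concept["act"] == "inform":
        return f"It is {concept['value']}."
    return "Sorry, I didn't understand."

def respond(audio, state):
    words = decode(signal_processing(audio))
    return generate(dialog_manager(parse(words), state))

print(respond(b"<audio>", {"clock": "3:00 pm"}))  # -> It is 3:00 pm.
```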
Core problems in spoken language interaction
• Recognition
– Mapping speech to words
– Tracking the topic and dynamically changing the language
• Understanding
– Words to concepts
– Capturing domain-specific natural language
• Dialog Management
– Creating a coherent conversation
– Tracking the topic and supporting mixed-initiative dialog
• Generation
– Mapping concepts to words
– Creating comprehensible templates; learning from corpora
• Synthesis
– Understandable voices
– Limited-domain synthesis
• Domain Reasoning
– Capturing real-world information
– Plans, ontology-based reasoning
Dialog system architecture
[Diagram] Components:
• Multiple decoders – VAD; Sphinx (speech); gesture, etc.
• Understanding – Phoenix (semantic grammars); multi-source confidence annotation
• Dialog Management – RavenClaw (product/agenda)
• Attention Management – Apollo
• Generation – Kalliope (augmented templates; stochastic engine)
• Synthesis – Festival, flite (speech)
• Effectors – displays, robots
• Domain Reasoning
Speech applications and dialog systems
at Carnegie Mellon
[Timeline figure, 1990–2005]
Systems: Office Manager (OM; Script1,2) → Scheduler → SpeechWear → Communicator (AGENDA) → RavenClaw → TeamTalk; also LARRI, TreasureHunt, MovieLine, BusLine, SportsLine, etc.
Architectural milestones: stack architecture; multiple-application, event-queue control; multi-modality; wearable systems; HTML-based dialog; dialog “programming”; product trees; ordered queue; system-initiated dialog; asynchronous dialog; multi-participant dialog.
Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990)
– Video link
• Communicator (~2000)
• TeamTalk (2005-)
Key ideas
• Product / Agenda control
– Tasks represented as (partial or complete) plans
– Mixed-initiative dialog
• Plans and plan composition
– Tasks represented as hierarchical plans
– Sub-plan library for dynamically modifying the plan
• Separation of domain and discourse processing
– Domain Reasoner to maintain context and interact with the world (“backend”)
• Concept-based dialog flow
– Concept-in / concept-out; no explicit understanding at the dialog level
• Separation of task and domain knowledge
– Error-recovery sub-dialogs
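A minimal sketch of the product/agenda idea, assuming a tree of dialog agents and a concept frame; the class, slot, and task names are illustrative, not RavenClaw’s actual API:

```python
# Sketch of product/agenda dialog control: a hierarchical plan tree
# generates an agenda of unfilled concept slots.

class Agent:
    """A node in the hierarchical task plan (the 'product tree')."""
    def __init__(self, name, slot=None, children=()):
        self.name, self.slot, self.children = name, slot, list(children)

    def open_slots(self, frame):
        # Depth-first walk: the agenda is the ordered list of agents
        # whose concept slot is still unfilled.
        if self.slot and self.slot not in frame:
            yield self
        for child in self.children:
            yield from child.open_slots(frame)

def next_prompt(root, frame):
    """Concept-in / concept-out: given the concepts filled so far,
    return the next slot to ask about, or None when the plan is done."""
    agenda = list(root.open_slots(frame))
    return agenda[0].slot if agenda else None

# A toy 'retrieve treasure' task with two sub-plans.  Mixed initiative
# falls out naturally: a user who volunteers a slot early simply
# removes it from the agenda.
task = Agent("retrieve", children=[Agent("where", slot="location"),
                                   Agent("who", slot="robot")])
frame = {"location": "area 3"}        # user volunteered the location
print(next_prompt(task, frame))       # -> robot
```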
Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990)
• Communicator (~2000)
– Video link
• TeamTalk (2005-)
Domain: Treasure Hunt
“Let’s form a sub-team…”  “Someone search area 3.”  “Robot X retrieve treasure from position Y.”
Human-robot teams coordinate to explore and locate items in an unknown environment.
School of Computer Science
Architecture
[Diagram] Components:
• Recognition & Generation; MultiModal Input
• Language Understanding → Dialog Management
• Domain Reasoner, with Ontology (instances, plays), Relations, History
• OpTrader / RoboTrader
• Play Manager – manages concurrent plays; Play Dispatcher; Tactics/Skills definitions
• Task Executive → Robot Layer
Issues in Human-Robot Dialog
• Managing multi-participant dialog
– Addressing, turn-taking, controlling the floor
– Communicating robot state, urgency
• Integrating multiple human and robot actors into the same communication space
– Channel maintenance, interruption, eavesdropping
• Grounding and sharing ontologies
– Talking about physical environments, actions
– Augmenting language and concepts
• Sharing knowledge
– Mapping human instructions to robot representations and actions
– Learning new action sequences
TeamTalk Research Issues
• Grounding
– How humans and robots can agree about things in the
environment
• Instruction
– Allowing robots to learn through instruction by
humans
• Spatial language
– Communicating about objects and events in the world
• Multi-participant dialog management
– Using spoken language in groups
Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990)
• Communicator (~2000)
• TeamTalk (2005-)
– Video link
Managing Dialog
• Managing one’s own behavior in conversation
– Humans know how to do it
– Robots don’t
⇒ Build a computational model of multi-participant dialog
[State diagram] Floor-management model: participants join and leave the conversation; states include eavesdrop, listen, and speak; transitions take and release the floor (+FL / –FL), with events such as status, alert, ask, add, and drop.
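The floor model can be approximated as a finite-state machine; the event names below are assumptions read off the figure, not the actual model:

```python
# A toy finite-state model of one participant's floor behavior, using
# the state labels from the diagram (listen, speak, eavesdrop).

TRANSITIONS = {
    ("out", "join"): "listen",
    ("listen", "take_floor"): "speak",      # +FL in the diagram
    ("speak", "release_floor"): "listen",   # -FL
    ("listen", "tune_out"): "eavesdrop",
    ("eavesdrop", "tune_in"): "listen",
    ("listen", "leave"): "out",
    ("speak", "leave"): "out",
}

class Participant:
    def __init__(self, name):
        self.name, self.state = name, "out"

    def on(self, event):
        # Undefined (state, event) pairs leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state

p = Participant("RobotX")
for event in ("join", "take_floor", "release_floor", "leave"):
    print(event, "->", p.on(event))
```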
Some possible dialog system architectures
[Diagram] Candidate configurations distributing ASR/SLU, DM/DR, and NLG/TTS between shared infrastructure and the individual robots: from a single centralized pipeline serving all robots, through per-robot dialog managers behind shared recognition, to fully self-contained processing on each robot.
Managing Knowledge
• Robots build and maintain a shared world model
– OWL-based ontology knowledge base
• Humans introduce and define new entities
– Locations, plans, etc.
– Robots interactively build and clarify entities
– Augment ontology as needed
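A minimal sketch of the introduce-and-clarify loop, in plain Python rather than OWL; the class names, properties, and the clarification rule are all invented for illustration:

```python
# Hypothetical stand-in for the shared world model: humans introduce
# entities, robots ask about missing properties, and the ontology is
# augmented with new classes as needed.

class WorldModel:
    def __init__(self):
        self.classes = {"Location", "Plan"}
        self.instances = {}              # name -> (class, properties)

    def define(self, name, cls, **props):
        if cls not in self.classes:
            self.classes.add(cls)        # augment the ontology as needed
        self.instances[name] = (cls, props)

    def needs_clarification(self, name):
        # A robot would ask about instances with missing properties,
        # e.g. a named location with no coordinates yet.
        cls, props = self.instances[name]
        return cls == "Location" and "coordinates" not in props

wm = WorldModel()
wm.define("area3", "Location")           # human: "call this area 3"
print(wm.needs_clarification("area3"))   # robot asks where area 3 is
wm.define("area3", "Location", coordinates=(12, 40))
print(wm.needs_clarification("area3"))
```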
Interaction between Dialog and Robot Components
OLY ↔ MAP ↔ EXE structure
The Executive and Olympus each manage corresponding plans; the correspondence is managed by the Mapper. The Executive’s plans are complete with respect to the task at hand, while Olympus only knows about those parts of the plan that humans may be expected to contribute to.

The frames in the diagram specify the slots needed to carry out the plan. Not all frames, or slots within a frame, need to be exposed to Olympus.

Olympus maintains a live list of all possible tasks. Whenever the User selects a task (through a request or command), Olympus interacts with the User to fill in all required slots. Once this is completed, the plan is shared with the Executive, which then attempts the plan. If problems come up that require user intervention, the plan is suitably annotated and Olympus is notified (through the Mapper). After interacting with the User, Olympus returns what it believes is a consistent set of updated slots. The Executive then attempts to execute the new plan; the User may be consulted again as problems develop.
[Diagram] Plan a (Steps 1 … n) on the Olympus side, with shared frames specifying constraints, etc. that can be discussed with the User; Task a (Steps 1, 3, n-1) on the Executive side, with frames specifying constraints relevant only to the Executive.
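The Mapper’s expose-and-merge behavior described above might be sketched as follows; the step names, slots, and helper functions are hypothetical:

```python
# Sketch of the OLY <-> MAP <-> EXE correspondence: the Executive
# holds the complete plan, the Mapper exposes only the human-relevant
# frames to the dialog side, and user-supplied slots are merged back.

class Step:
    def __init__(self, name, slots, exposed=False):
        self.name, self.slots, self.exposed = name, slots, exposed

full_plan = [                       # the Executive's complete plan
    Step("goto", {"location": None}, exposed=True),
    Step("localize", {"method": "laser"}),           # internal only
    Step("retrieve", {"object": None}, exposed=True),
]

def dialog_view(plan):
    """The Mapper exposes only steps the human can contribute to."""
    return {s.name: s.slots for s in plan if s.exposed}

def merge(plan, updates):
    """Fold the user's slot values back into the Executive's plan."""
    for step in plan:
        if step.exposed and step.name in updates:
            step.slots.update(updates[step.name])

print(dialog_view(full_plan))        # 'localize' is hidden from Olympus
merge(full_plan, {"goto": {"location": "area 3"}})
print(full_plan[0].slots)            # the Executive can now attempt 'goto'
```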
Managing Learning
• Articulating the learning cycle
– Sketch → Detail → Exceptions
• Interactive learning
– Clarification
– Modification
• Maintaining knowledge over time
– Consistency
– Generalization
– Forgetting
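One way to picture the sketch → detail cycle and forgetting is a small skill library; all names here are made up for illustration:

```python
# Toy sketch of instruction-based learning: the human gives a rough
# action sequence, then refines individual steps; stale skills can be
# dropped to maintain knowledge over time.

class SkillLibrary:
    def __init__(self):
        self.skills = {}             # name -> list of primitive actions

    def sketch(self, name, steps):
        # Initial rough plan given through instruction.
        self.skills[name] = list(steps)

    def detail(self, name, index, refined_steps):
        # Replace one rough step with a refined sub-sequence.
        self.skills[name][index:index + 1] = list(refined_steps)

    def forget(self, name):
        self.skills.pop(name, None)

lib = SkillLibrary()
lib.sketch("patrol", ["goto A", "scan", "goto B"])
lib.detail("patrol", 1, ["camera on", "sweep", "camera off"])
print(lib.skills["patrol"])
```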
Conclusion
• Moving spoken language systems into the
world
– Uncertainty about the state of the world and
other participants
– Learning through language and interaction
– Expanding the architecture