Language Based Communication between Humans and Robots
Alexander I. Rudnicky
MSR Faculty Summit, 13 July 2010

Some areas of application
• Search and rescue
  – Mobile teams of humans and robots
• Maintenance and logistics
  – DARPA's mule / big dog
• Domestic robots
  – Companions and simple tasks

Spoken Dialog Systems
• Calculator and Calendar (1988) – Video link
• Office Manager (1990)
• Communicator (~2000)
• TeamTalk (2005-)

A generic speech system
[Diagram: human speech → Signal Processing → Decoder → Parser → post-Parser → Dialog Manager (consulting Domain Agents) → Language Generator → Speech Synthesizer → machine speech; additional outputs to display and effector]

Core problems in spoken language interaction
• Recognition
  – Mapping speech to words
  – Dynamically changing the language
• Understanding
  – Words to concepts
  – Capturing domain-specific natural language
• Dialog Management
  – Creating a coherent conversation
  – Tracking the topic and supporting mixed-initiative dialog
• Generation
  – Mapping concepts to words
  – Creating comprehensible templates; learning from corpora
• Synthesis
  – Understandable voices
  – Limited-domain synthesis
• Domain Reasoning
  – Capturing real-world information
  – Plans, ontology-based reasoning

Dialog system architecture
• Speech: Sphinx (multiple decoders, VAD); gesture and other input
• Understanding: Phoenix (semantic grammars; multi-source confidence annotation)
• Dialog Management: RavenClaw (Product/Agenda)
• Generation: Kalliope (augmented templates; stochastic engine)
• Synthesis: Festival, Flite
• Attention Management: Apollo
• Effectors: displays, robots
• Domain Reasoning

Speech applications and dialog systems at Carnegie Mellon
[Timeline, 1990–2005: Office Manager (OM), Script 1/2, Scheduler, SpeechWear, Communicator (AGENDA), RavenClaw, TeamTalk; themes: stack architecture, multiple-application control, event queue, multi-modality, wearable systems, html-based dialog, dialog "programming", product trees, ordered queue, system-initiated dialog, asynchronous dialog, multi-participant dialog; applications: MovieLine, BusLine, SportsLine, LARRI, TreasureHunt, etc.]

Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990) – Video link
• Communicator (~2000)
• TeamTalk (2005-)

Key ideas
• Product / Agenda control
  – Tasks represented as (partial or complete) plans
  – Mixed-initiative dialogue
• Plans and plan composition
  – Tasks represented as hierarchical plans
  – Sub-plan library for dynamically modifying plans
• Separation of domain and discourse processing
  – Domain Reasoner to maintain context and interact with the world ("backend")
• Concept-based dialog flow
  – Concept-in / concept-out; no explicit understanding at the dialog level
• Separation of task and domain knowledge
  – Error-recovery sub-dialogs

Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990)
• Communicator (~2000) – Video link
• TeamTalk (2005-)

Domain: Treasure Hunt
"Let's form a sub-team…" "Someone search area 3." "Robot X, retrieve treasure from position Y."
Human-robot teams coordinate to explore and locate items in an unknown environment.

Architecture
[Diagram: MultiModal Input → Language Understanding / Recognition & Generation → Dialog Management (OpTrader, RoboTrader) → Play Manager / Task Executive (manages concurrent plays) → Play Dispatcher → Robot Layer; Domain Reasoner holds the Ontology (instances, plays), Tactics/Skills Definitions, Relations, and History]

Issues in Human-Robot Dialog
• Managing multi-participant dialog
  – Addressing, turn-taking, controlling the floor
  – Communicating robot state, urgency
• Integrating multiple human and robot actors into the same communication space
  – Channel maintenance, interruption, eavesdropping
• Grounding and sharing ontologies
  – Talking about physical environments, actions
  – Augmenting language and concepts
• Sharing knowledge
  – Mapping human instructions to robot representation and action
  – Learning new action sequences
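The "plans and plan composition" idea above — tasks as hierarchical plans, with a sub-plan library for modifying a plan on the fly (for instance, splicing in an error-recovery sub-dialog) — can be sketched as follows. This is a minimal illustration; the `Plan` class and the grasp-recovery names are hypothetical, not TeamTalk's actual representation.

```python
# A plan is a tree of steps; a sub-plan library supplies alternatives
# that can be spliced in at run time (all names here are illustrative).

class Plan:
    def __init__(self, name, steps=None):
        self.name = name
        self.steps = steps or []        # child Plans, in order

    def replace(self, step_name, sub_plan):
        """Dynamically swap one step for a sub-plan from the library."""
        self.steps = [sub_plan if s.name == step_name else s
                      for s in self.steps]

    def flatten(self):
        """Leaf steps in execution order."""
        if not self.steps:
            return [self.name]
        return [leaf for s in self.steps for leaf in s.flatten()]

# Sub-plan library: a recovery routine for a failed grasp.
library = {"recover_grasp": Plan("recover_grasp",
                                 [Plan("reposition"), Plan("regrasp")])}

retrieve = Plan("retrieve_treasure",
                [Plan("goto_area"), Plan("grasp"), Plan("return_home")])
retrieve.replace("grasp", library["recover_grasp"])
print(retrieve.flatten())
# -> ['goto_area', 'reposition', 'regrasp', 'return_home']
```

The point of the tree structure is that dialog and execution can refer to the same step at any granularity: a user can discuss "retrieve the treasure" while the executive works through the expanded leaves.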
TeamTalk Research Issues
• Grounding
  – How humans and robots can agree about things in the environment
• Instruction
  – Allowing robots to learn through instruction by humans
• Spatial language
  – Communicating about objects and events in the world
• Multi-participant dialog management
  – Using spoken language in groups

Spoken Dialog Systems
• Calculator and Calendar (1988)
• Office Manager (1990)
• Communicator (~2000)
• TeamTalk (2005-) – Video link

Managing Dialog
• Managing one's own behavior in conversation
  – Humans know how to do it
  – Robots don't
• Goal: build a computational model of multi-participant dialog
[State diagram: participants join and leave the conversation, moving among eavesdrop, listen, and speak states; +FL/–FL events grant and release the floor; other events include status alert, join (ask), add, listen, and drop]

Some possible dialog system architectures
[Diagram: alternative distributions of the ASR/SLU, DM/DR, and NLG/TTS components across one or more robots, ranging from a single shared stack to a fully replicated stack per robot]

Managing Knowledge
• Robots build and maintain a shared world model
  – OWL-based ontology knowledge base
• Humans introduce and define new entities
  – Locations, plans, etc.
  – Robots interactively build and clarify entities
  – Augment the ontology as needed

Interaction between Dialog and Robot Components
OLY ↔ MAP ↔ EXE structure
The Executive and Olympus each manage corresponding plans; the correspondence is managed by the Mapper. The Executive's plans are complete with respect to the task at hand, while Olympus only knows about those parts of the plan that humans may be expected to contribute to. The frames in the diagram specify the slots needed to carry out the plan. Not all frames, or slots within a frame, need to be exposed to Olympus. Olympus maintains a live list of all possible tasks.
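A minimal sketch of this division of labor, with hypothetical names throughout: the Executive holds the full plan, a Mapper view exposes only the shared frames, and the dialog side fills empty slots from user answers before the updates are folded back. This illustrates the structure, not the actual Olympus/TeamTalk interfaces.

```python
# Executive's full plan: frames with slots; only 'shared' frames are
# visible to the dialog side. All frame and slot names are illustrative.
plan = {
    "goto":  {"shared": True,  "slots": {"location": None}},
    "grasp": {"shared": False, "slots": {"force": 0.5}},
}

def shared_view(plan):
    """Mapper: expose only the frames a user may contribute to."""
    return {name: f["slots"] for name, f in plan.items() if f["shared"]}

def dialog_fill(frames, answers):
    """Dialog side: fill empty slots from user answers, leave the rest."""
    return {name: {s: answers.get(s, v) if v is None else v
                   for s, v in slots.items()}
            for name, slots in frames.items()}

def merge_back(plan, filled):
    """Mapper: fold the updated slots back into the full plan."""
    for name, slots in filled.items():
        plan[name]["slots"].update(slots)
    return plan

filled = dialog_fill(shared_view(plan), {"location": "area3"})
plan = merge_back(plan, filled)
print(plan["goto"]["slots"])   # -> {'location': 'area3'}
```

Note that the executive-only `grasp` frame is never exposed to the dialog side, matching the point above that not every frame or slot needs to be discussable with the user.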
Whenever the user selects a task (through a request or command), Olympus interacts with the user to fill in all required slots. Once this is completed, the plan is shared with the Executive, which then attempts the plan. If problems come up that require user intervention, the plan is suitably annotated and Olympus is notified (through the Mapper). After interacting with the user, Olympus returns what it believes is a consistent set of updated slots. The Executive then attempts to execute the new plan. The user may be consulted again as problems develop.
[Diagram: Plan a (Steps 1 … n) with shared frames specifying constraints, etc. that can be discussed with the user (here Steps 1, 3, and n-1); Task a with frames specifying constraints, etc. that are relevant only to the Executive]

Managing Learning
• Articulating the learning cycle
  – Sketch → Detail → Exceptions
• Interactive learning
  – Clarification
  – Modification
• Maintaining knowledge over time
  – Consistency
  – Generalization
  – Forgetting

Conclusion
• Moving spoken language systems into the world
  – Uncertainty about the state of the world and other participants
  – Learning through language and interaction
  – Expanding the architecture
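Returning to the multi-participant model sketched under "Managing Dialog" above: one way to make the join/leave and floor (+FL/–FL) events concrete is a small per-participant state machine. The state names and transition table below are an illustrative reading of that diagram, not the actual TeamTalk model.

```python
# Toy finite-state model of one participant in multi-party dialog.
# +FL = the participant is granted the floor; -FL = the floor is released.
TRANSITIONS = {
    ("out",       "join"):   "eavesdrop",  # enters the conversation
    ("eavesdrop", "listen"): "listen",     # becomes an addressee
    ("eavesdrop", "+FL"):    "speak",
    ("listen",    "+FL"):    "speak",
    ("speak",     "-FL"):    "listen",     # yields the floor
    ("listen",    "drop"):   "eavesdrop",
    ("eavesdrop", "leave"):  "out",
    ("listen",    "leave"):  "out",
}

def step(state: str, event: str) -> str:
    """Apply one event; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "out"
for event in ["join", "listen", "+FL", "-FL", "leave"]:
    state = step(state, event)
print(state)  # -> out
```

Explicitly enumerating the states makes the open problems above concrete: addressing and turn-taking become questions of which events each robot may emit, and eavesdropping is simply a state a participant can occupy without holding the floor.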