Agent-AI A-7

Diversified input/output (DIO) speech recognition and speech synthesis technologies for automatic interactive agents

Understand your conveyed meaning, and talk expressively

We are developing diversified input/output (DIO) speech recognition technologies that understand not only the text of input speech but also the speaker's emotion and intention, and DIO speech synthesis technologies that put thoughtful feeling into output speech. These technologies can contribute to an automatic interactive agent that empathizes with its users.

Features
■ Speech nuance extraction: understands the speaker's emotion and true intention from speech input.
■ Speaker identification for families: identifies which family member is speaking from a short utterance.
■ DNN-based speech recognition: achieves accurate and fast dictation based on deep learning technology.
■ User-designed speech synthesis: synthesizes expressive speech suited to each user.

Application Scenarios
■ Interactive agent that can enrich daily family life
■ Robot to provide care for senior citizens
■ Automatic answering agent in a contact center

Role in Agent-AI
Our technologies aim to contribute to interactive dialogue agents by utilizing various kinds of information in speech.

[Figure: Diversified input/output (DIO) speech technologies. Input speech "Hmm…" goes to DIO speech recognition (dictation: "Hmm"; nuance extraction: distress; speaker identification: grandpa), then to the dialogue manager ("What's the matter, grandpa?", anxiously), then to DIO speech synthesis.
Panel "Senior caring robot": User: "Hmm…" (nuance: distress; user: grandpa). Agent (anxiously): "What's the matter, grandpa?" User: "Oh, I'm looking for my glasses." Agent (cheerfully): "You always put them on your desk."
Panel "Contact center automatic answering agent": Agent: "What is this regarding?" User: "Hmm…, let me see…" (nuance: impatience). Agent (slowly and kindly): "Do you mean an advance booking?"]

〈Contact〉[email protected]
Copyright © 2016 NTT. All Rights Reserved.
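The data flow shown in the poster's figure (recognition output passed to a dialogue manager, which hands text plus a speaking style to synthesis) can be illustrated with a minimal Python sketch. This is not NTT's implementation: the class names, the `dialogue_manager` function, the nuance-to-style table, and the fallback reply are all hypothetical, and the table holds only the two pairings that appear in the poster's example dialogues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionResult:
    """Hypothetical container for the three outputs of DIO speech recognition."""
    text: str     # dictation, e.g. "Hmm"
    nuance: str   # extracted nuance, e.g. "distress"
    speaker: str  # identified speaker, e.g. "grandpa"


@dataclass(frozen=True)
class SynthesisRequest:
    """Hypothetical input to DIO speech synthesis: what to say and how."""
    text: str
    style: str


# Assumed mapping, taken only from the poster's two example dialogues.
STYLE_FOR_NUANCE = {
    "distress": "anxiously",
    "impatience": "slowly and kindly",
}


def dialogue_manager(result: RecognitionResult) -> SynthesisRequest:
    """Pick a reply and a speaking style from the recognized nuance and speaker."""
    style = STYLE_FOR_NUANCE.get(result.nuance, "neutral")
    if result.nuance == "distress":
        reply = f"What's the matter, {result.speaker}?"
    else:
        # Placeholder reply; a real dialogue manager would use the dictated text.
        reply = "Could you tell me more?"
    return SynthesisRequest(text=reply, style=style)


request = dialogue_manager(RecognitionResult("Hmm", "distress", "grandpa"))
print(f"({request.style}) {request.text}")
```

The point of the sketch is only that nuance and speaker identity travel alongside the dictated text, so the response can vary in both wording and delivery.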