Automotive Forum Detroit 2012

AUTOMOTIVE FORUM DETROIT
Andreas Haag & Michael Becker
November 2012
CONFIDENTIAL | © 2002-2011 Nuance Communications, Inc. All rights reserved.
VOCON SPEECH SOLUTIONS
Music is Everywhere
Seamless access to your complete music collection
Cloud, Devices, Car
Why Access Music via Voice?
• Drivers select or search music by:
– a specific title, album, or artist: "play Stairway to Heaven"
– a genre: "listen to Pop Music"
– a decade: "music from the 90's"
– a playlist: "November Favorites"
– discovery based on a known title: "More Like This"
• Advantages of voice search for music
– Hands-free operation
– Direct access without menu structures
– Ambiguity resolution
Just say "play the Beatles"
Just say combinations of title and album, or artist and title: "Like a Hurricane by Neil Young"
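The request types listed above can be illustrated with a small text classifier. This is only a sketch: the pattern names and regular expressions are assumptions made for illustration, and a production system would use a speech grammar inside the recognizer rather than regular expressions on recognized text.

```python
import re

# Hypothetical command patterns, for illustration only. Order matters:
# the more specific patterns are tried before the generic "play" pattern.
PATTERNS = [
    ("title_by_artist",
     re.compile(r"^(?:play )?(?P<title>.+) by (?P<artist>.+)$", re.I)),
    ("decade", re.compile(r"^music from the (?P<decade>\d{2})'?s$", re.I)),
    ("genre", re.compile(r"^listen to (?P<genre>.+?)(?: music)?$", re.I)),
    ("play", re.compile(r"^play (?P<item>.+)$", re.I)),
]


def parse_command(text):
    """Return (kind, slots) for a recognized utterance, or ("unknown", {})."""
    text = text.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.groupdict()
    return "unknown", {}
```

For example, `parse_command("Like a Hurricane by Neil Young")` yields the kind `title_by_artist` with separate title and artist slots, which the application can then check against the collection.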
Music is a Creative Domain
What makes it so difficult?
• Partials
– Many people do not remember song titles correctly: "New York, New York" is actually "Theme from New York, New York"
• Unknown and wrong input
– Users often do not know the artist of a song: "Play Smoke on the Water by Madonna"
• Nicknames
– Musicians are creative with their names, and so are their fans: "Elvis" / "The King"
• Limitations in technology
– Language detection sometimes cannot identify the language correctly, for example when a name such as "Sade" comes from an unsupported language
VoCon Music Basic
Specialized Music Solutions for ASR and TTS
• Now: VoCon Music Basic
– Identifies the language of the input text
– Supports basic partial title processing using text-based rules
– Converts text strings from the media player to phonetics that can be recognised by VoCon speech recognition
– Multi-lingual: native + up to 5 additional languages
• Minimum of native + English
– Language mapping makes the phonetics available for the languages available in the speech recognizer
Language Identification → Partial title processing → Text to Phonetics → Language mapping → Speech recognition (VoCon Hybrid)
Multi-lingual CLC
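The partial title processing step can be pictured as a set of text rules that expand one stored title into the shorter forms a driver is likely to say. The slides do not give the actual VoCon Music Basic rules, so the three rules below are assumptions chosen for illustration.

```python
import re


def partial_title_variants(title):
    """Return the full title plus simplified variants a driver might say."""
    variants = [title]
    # Rule 1 (assumed): drop parenthetical or bracketed parts such as "(Live)".
    stripped = re.sub(r"\s*[\(\[].*?[\)\]]", "", title).strip()
    # Rule 2 (assumed): drop a leading "Theme from" / "The Theme from".
    no_prefix = re.sub(r"^(?:the\s+)?theme\s+from\s+", "", stripped, flags=re.I)
    # Rule 3 (assumed): drop a leading article.
    no_article = re.sub(r"^(?:the|a|an)\s+", "", no_prefix, flags=re.I)
    for candidate in (stripped, no_prefix, no_article):
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants
```

Each variant would then go through text-to-phonetics conversion, so that "New York, New York" is recognized even though the media player stores the longer title.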
Partner with the Best
Gracenote & Nuance Strategic Alliance Announced at 2012 CES
Together we are great in speech and great in music
Michael Becker
Why Partner with Nuance?
Gracenote Knows Music
Nuance Knows Voice Recognition
Content Expertise
PARTNERS
• 3,000+ labels, studios, independents
• TMS, Getty and others
TECHNOLOGY
• Advanced algorithms for content generation
• Leading technology partners
EDITORIAL
• Content experts around the globe
• Machine generated content
COMMUNITY
• 100K+ daily submissions
• More than 1 billion submissions
The Link Between Gracenote and Nuance
• To accurately navigate a collection by voice, you need to associate your music collection with the correct phonetic transcriptions
• Challenges:
– Need knowledge of the popularity of digital media
– Many artists/albums are known by alternate names
– Even within a single user’s music collection, artists may be represented differently
Music Challenges Voice Systems
Mispronunciations
Artist Names
“Sade”
“Flo Rida”
“Kelis”
Featuring Numbers/Letters
Artist Names
“311”
“RZA”
“R.E.M.”
Short Names
Artist Names
“Beatles”
“Stones”
“Aretha”
Special Characters
Artist Names
“AC/DC”
“?uestlove”
“Deadmau5”
“Ke$ha”
Nicknames
Artist Names
“Fiddy”
“CCR”
“The King”
Regional Pronunciations
Artist Names
“Ooh dos” (U2 in Spanish)
“Ah-seh deh-seh” (AC/DC in French)
Gracenote MusicID
Cleans Up the Data to Enable Voice Navigation
• Drivers can access all of their music despite inconsistent spellings in a collection. Gracenote identifies, cleans up and organizes information for navigation and on-screen display.
Collaborative Artist Support
• Artist and Collaborator were often identified as separate artists,
making it difficult to navigate collections
Collaborative Artist Support
• Gracenote recognizes artist and
collaborator names
Data Exchange & Embedded DB Synching
Exchanged data: Gracenote-created transcriptions and popularity information; new and modified transcriptions
• Legacy MediaVOCS Support
• Synchronize Gracenote’s and Nuance’s Embedded Data Stores
• Includes Global IDs used for Transcription Retrieval
SDK Integration Overview
[Diagram components: GNSDK for Auto with GDB; Global IDs; VoCon Music Premium with Transcription Buffer; TTS; ASR]
Win-Win
Benefits of Partnership & Integration
What do our customers and end-users get out of this?
• Simpler implementation
• Better data quality
• Better user experience
Nuance VoCon Music Premium
VoCon Music Basic
VoCon Music Premium
MediaVOCS
Gracenote MediaVOCS gets VoCon Music
Nuance Automotive Forum Europe 2012 – The Music Experience
• Gracenote MediaVOCS has been VoCon Music since September 2012.
• This follows the Gracenote & Nuance strategic alliance announced at CES in January 2012.
• VoCon Music will be an add-on component from VoCon Hybrid v4.4 onwards and is compatible with Vocalizer Expressive.
• Initial functionality will be equivalent to MediaVOCS.
Premium Quality
• VoCon Music Premium provides editorially derived phonetic transcriptions for official and alternative artist/album names in 17 languages spanning 48 countries
• Uses Gracenote’s global popularity data to focus editorial transcription work on high-impact content that represents the majority of usage
• Uses editorial transcriptions when available and falls back to auto-generated transcriptions as needed
• Yields increased accuracy and performance, and reduced latency and start-up time
Accurately navigate & manage music via speech in autos
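The "editorial first, auto-generated as fallback" behaviour can be sketched as a simple lookup. Everything named here is hypothetical: the function, the dictionary layout and the toy grapheme-to-phoneme stand-in are assumptions, and the respellings are informal placeholders rather than real VoCon or Gracenote transcriptions.

```python
def lookup_transcriptions(name, editorial, auto_g2p):
    """Return (transcriptions, source) for a name.

    Editorial entries win when present; otherwise fall back to the
    auto-generated transcription.
    """
    entries = editorial.get(name.casefold())
    if entries:
        return list(entries), "editorial"
    return [auto_g2p(name)], "auto"


# Placeholder data: informal respellings, not real phonetic transcriptions.
EDITORIAL = {"sade": ["shah-DAY"], "r.e.m.": ["ar ee em"]}


def toy_g2p(name):
    # Stand-in for an automatic grapheme-to-phoneme step (assumption).
    return name.casefold()
```

Returning the source alongside the transcriptions lets an application log how much of a collection is covered editorially, which matters most where auto-generated quality is weaker.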
Premium Quality
• More critical in emerging markets, where auto-generated transcription quality is not as good (especially for Western artists)
• Delivers a consistent user experience across all devices and music sources
• Supports all features of VoCon Music Basic
– Track name support is identical in the Basic and Premium options
• Combination of the former Gracenote MediaVOCS and Nuance ddG2P/CLC
• Full-featured, larger footprint
Premium Quality
• Integrated with the Gracenote “MusicID” & “Playlist Plus” music identification technologies for faster creation of music grammars
• Editorially enhanced content for albums and artists, with
– Alternative names, e.g. CCR for Creedence Clearwater Revival
– Recognition of the most common mispronunciations of non-standard names
• Regular updates
Benefits
• Improvements for ASR & TTS
– VoCon Music Premium phonetics can be used with both ASR and TTS solutions for a consistent user experience
• TTS samples (listen for yourself)
The music domain is pretty difficult for me! But with VoCon Music Premium, I can say "Sade" instead of Sade. "R. E. M." is also better than R.E.M. Did I get that right? I can even improve my foreign language skills and pronounce "Die Ärzte", a German band, much better. VoCon Music Premium helps me a lot. Really cool!
Data Processing Flow
[Diagram. Layers: GN MusicID, VoCon Music Premium, Application Layer. Elements: Application, App database, Gracenote Lookup (GNSDK), GDO, Metadata, Orthography, Lex-ID, ID lookup API, Lex-ID phonetics, CLC Phonetics Generation, VoCon Music Premium buffer]
Data Retrieval Flow
[Diagram with numbered steps 1 to 7. Elements: Voice Command; Nuance ASR/TTS; Collection Dictionary *; Command, Action (text), Multiple App IDs; Application; Playlist / More Like This Request; Gracenote APIs; Fetch Track App IDs; Track App IDs; Playlist Collection; Play tracks; Display the result]
The API
So simple to use in our API framework
• Four function calls and you have high-quality transcriptions
– lh_IdTransLookupLoadData() loads the ID-based lookup data onto VoCon. CLC instances allow access to this interface. Only one ID buffer can be loaded at a time.
– lh_IdTransLookupUnloadData() unloads the data, e.g. from CLC.
– lh_IdTransLookupFetchTranscriptions() fetches transcriptions from the buffer for a given ID.
– lh_IdTransLookupReturnTranscriptions() returns the memory allocated while fetching transcriptions to VoCon so it can be freed correctly.
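The load, fetch, return, unload lifecycle described above can be modelled with a toy stand-in. This is not the VoCon API: the slides give only the four function names, so the class, method names, arguments and error behaviour below are all assumptions. In particular, refusing to unload while fetched results are still outstanding is an illustrative choice, not documented behaviour.

```python
class IdTransLookup:
    """Toy stand-in mirroring the four calls named on the slide."""

    def __init__(self):
        self._buffer = None      # the single loaded ID buffer, if any
        self._outstanding = 0    # fetched results not yet returned

    def load_data(self, id_buffer):
        # Mirrors "only one ID buffer can be loaded at a time".
        if self._buffer is not None:
            raise RuntimeError("only one ID buffer can be loaded at a time")
        self._buffer = dict(id_buffer)

    def fetch_transcriptions(self, global_id):
        if self._buffer is None:
            raise RuntimeError("no ID buffer loaded")
        self._outstanding += 1
        return list(self._buffer.get(global_id, []))

    def return_transcriptions(self, transcriptions):
        # Hands the fetched result back so it can be released.
        self._outstanding -= 1

    def unload_data(self):
        # Assumed rule: all fetched results must be returned first.
        if self._outstanding:
            raise RuntimeError("return fetched transcriptions before unloading")
        self._buffer = None


# Usage with placeholder data (the ID and phonetics are made up).
lookup = IdTransLookup()
lookup.load_data({"id-001": ["placeholder-phonetics"]})
result = lookup.fetch_transcriptions("id-001")
lookup.return_transcriptions(result)
lookup.unload_data()
```

The point of the sketch is the pairing: every fetch is matched by a return, and the single buffer is unloaded before another one is loaded.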
Where are we going in 2013?
We will not stop
What's next on our list
• Global expansion of coverage based on Gracenote's Popularity Index
• Future feature enhancements
– AM/FM radio station names
– SiriusXM phonetic service support
– Online lookup service
• Joint future enhancements to VoCon Music Premium are currently being planned by both companies
Hybrid Music Search
Seamless access to all your Devices and Services
Your Music at your command, no matter where it is.
VoCon Hybrid
Nuance Cloud Services
Vocon Music Basic / Premium
Intuitive and Handsfree Operation
Exciting way to access Music
Multilinguality
Same API as in VoCon Hybrid
Huge Coverage for Artist and Album Names
Just a great User Experience
Basic & Premium compared
VoCon Music functionality and detail (columns: Basic, Premium)
Language support
– Regional differences in pronunciations of artist, album & genre in multiple spoken languages and language of origin
Multilinguality
– Phonetic transcription tagged with spoken language ID
Artist name
– Phonetic transcription of orthography on media player
– Partial title processing of orthography as saved on media player
– Editorially created phonetic transcription for official name
– Editorially created phonetic transcription for alternative name
– Support for collaborative artist names (e.g. Beyonce feat. Jay-Z)
Album name
– Phonetic transcription of orthography as saved on media player
– Partial title processing of orthography as saved on media player
– Editorially created phonetic transcription for official name & mispronunciations
– Editorially created phonetic transcription for alternate name & mispronunciations
Track title
– Phonetic transcription of orthography as saved on media player
– Partial title processing of orthography as saved on media player
Genre
– Phonetic transcription of orthography as saved on media player
– Editorially created phonetic transcription for official name from Gracenote
– Editorially created phonetic transcription for alternative name from Gracenote
Non-standard names
– Editorially created phonetic transcription for official names from Gracenote (e.g. INXS, REM, Junior M.A.F.I.A., !!!, Too $hort, ?uestlove)
– Editorially created phonetic transcriptions for alternate names from Gracenote (e.g. nicknames, short names, alternate spellings for same pronunciation)
– Editorially created phonetic transcriptions for regional pronunciations and common mispronunciations (e.g. names may be pronounced differently in different languages and some names are commonly mispronounced)
Normalization
– Associates artist, album or genre names that may be misspelled or alternates, enabling normalized playback (e.g. Creedance Clearwater Revival, Creedance, Credance, CCR)
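Normalization of misspelled or alternate names can be sketched with an alias table plus a fuzzy-match fallback. This is a hedged illustration using Python's standard `difflib`, not how Gracenote implements it: the alias table, the function name and the 0.75 similarity cutoff are all assumptions.

```python
import difflib

CANONICAL = "Creedence Clearwater Revival"

# Hypothetical alias table: lower-cased known names mapped to one canonical artist.
ALIASES = {
    "ccr": CANONICAL,
    "creedence": CANONICAL,
    "creedence clearwater revival": CANONICAL,
}


def normalize_artist(name, aliases=ALIASES, cutoff=0.75):
    """Map an alias or near-miss spelling to its canonical artist name."""
    key = name.casefold().strip()
    if key in aliases:
        return aliases[key]
    # Fuzzy fallback for misspellings; the cutoff value is an assumption.
    close = difflib.get_close_matches(key, list(aliases), n=1, cutoff=cutoff)
    return aliases[close[0]] if close else name
```

With this table, both "CCR" and the misspelling "Creedance" resolve to the same canonical artist, so tracks tagged either way can be played from one voice command. Names with no close match are returned unchanged.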
