The Future of Voice Interfaces

CEVA TECHNOLOGY SYMPOSIUM 2016 – ASIA
Eran Belaish, Product Marketing Manager, Audio, Voice and Sensing
Voice Interfaces - Where are we at Today?
Apple’s iPhone 7 and Samsung’s Galaxy S7 are always-listening
Most smartwatches offer voice activation
Amazon Echo ignited the far-field conversational assistant trend
GoPro Hero 5 can be operated using voice commands
Voice activated car infotainment systems are a commodity
Hearables (AirPods, Samsung’s Gear IconX etc.) offer conversational assistants
Voice Interfaces Have Become Ubiquitous
CEVA Proprietary Information
What About Always-Listening?
Leading examples of always-listening devices
Apple’s iPhone 7 and Samsung’s Galaxy S7
Amazon Echo and Google Home
But most devices are not always-listening
Portable speakers (e.g. Amazon Echo Tap)
Most smartphones and smartwatches
Hearables and wireless headsets
[Illustration: speech bubbles reading "Alexa, open the door", "Alexa, open the door!", "Alexa! Alexaaaaaa!!!"]
Why??? → 3 reasons
Power consumption
Power consumption
Power consumption
An Ultra Low Power DSP Like the CEVA-TeakLite-4
Enables Always-Listening Voice Activation
Always-on Technology by CEVA
Market Leadership
Always-on Development Kit
Smartphone flagships are always-listening and started a strong trend
Power is by far the main consideration → CEVA offers the world’s lowest power solution, already in mass production
CEVA and DSPG power Voice Trigger in Samsung’s Galaxy S7 and Gear S2 Smartwatch
Join Our LinkedIn Group – Always-on Technology
From Ultra Low Power to High Performance
Ultra low power
Battery operated devices (smartwatches, smartphones etc.)
E.g. near field always-on voice activation
CEVA offers the world’s lowest power solution for smart-mics and always-listening chips
High performance
Smart home devices, often AC powered (digital assistants etc.)
E.g. far field voice activation
CEVA offers solutions for multi-mic processing, CODEC chips and audio/voice processing chips
Ultra Low Power or Extra Horse Power?
CEVA-TL410 Ultra Low Power DSP
CEVA is powering the world’s lowest power voice activation chips
CEVA-TL410 can be always-on with very low power consumption
10-stage pipeline allows working with low power memories
CEVA-X2 High Performance DSP
CEVA-X2 is a high performance hub for sensing and connectivity workloads
Can handle multiple tasks from BT audio through multi-mic processing
10-stage pipeline enables high frequencies and intensive workloads
Deep Neural Networks (DNN)
DNNs Have Enabled a Breakthrough
► All DNNs employ a similar concept
  ► Deep learning - offline training with massive data sets
  ► Use the generated DNN to classify/filter real time signals
► DNNs on CEVA platforms
  ► Speech recognition – Sensory is in mass production
  ► Noise reduction – Cypher’s voice isolation
  ► Sound sensing – under development by several partners
  ► Several OEMs have ported their own DNNs to TL4
  ► Beyond 3rd parties - CEVA is conducting its own DNN research
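The two-phase concept on this slide (train offline, then run the resulting network on real-time signals) can be illustrated with a minimal inference sketch. Everything here is an assumption for illustration: the layer sizes, the hard-coded weights standing in for offline-trained parameters, and the function names. It is not the network used by Sensory, Cypher or CEVA.

```python
import math

# Hypothetical parameters standing in for the output of offline training.
W1 = [[0.5, -0.25], [-0.5, 0.75], [0.25, 0.25]]  # 3 hidden units x 2 input features
B1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -1.0, 0.5], [-1.0, 1.0, 0.5]]        # 2 output classes x 3 hidden units
B2 = [0.0, 0.0]


def dense(weights, bias, x):
    """Fully connected layer: one multiply-accumulate loop per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def classify(frame):
    """Run one feature frame through the fixed network; return (class index, probabilities)."""
    hidden = relu(dense(W1, B1, frame))
    probs = softmax(dense(W2, B2, hidden))
    return probs.index(max(probs)), probs
```

At run time only `classify` executes on the device, which is why the workload reduces to multiply-accumulate operations over fixed weights.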
Voice Isolation by Cypher - Demos
► Cypher’s DNN can tell a human voice from other sound sources and isolate it
[Audio demos, raw vs. filtered: Cafeteria, Pub, Competing Speakers]
Far Field Voice Pickup
Adaptive Beam Forming
Adaptive beamforming is key for robust voice UI
Noise reduction
Speaker separation
Speaker tracking
Audio “zoom”
Enabled by a multi-microphone setup
Normally between 2 and 8 mics!
Essential for far field ASR
Intensive Multi-mic Processing Mandates a High Performance DSP
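To make the multi-microphone idea concrete, here is a minimal sketch of the simplest beamformer, a fixed delay-and-sum. It is a simplification, not the adaptive beamformer the slide refers to: an adaptive design would also update per-channel weights at run time. The function name and the use of integer sample delays are assumptions for illustration.

```python
def delay_and_sum(channels, delays):
    """Fixed delay-and-sum beamformer (illustrative, non-adaptive).

    channels: list of per-microphone sample lists.
    delays:   integer arrival delay (in samples) of the target at each microphone.
    Each channel is advanced by its delay so the target lines up, then the
    channels are averaged. Sound arriving with other delays adds incoherently.
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    out = []
    for t in range(usable):
        aligned = [ch[t + d] for ch, d in zip(channels, delays)]
        out.append(sum(aligned) / len(channels))
    return out
```

Steering toward a different talker only changes `delays`, which is the basis of the speaker tracking and audio "zoom" items listed above. The per-sample cost grows with the number of microphones.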
Stereo Echo Canceller to Allow ASR Engine
Operation During Music Playback
Demonstrated on HW by CEVA, Alango and Sensory
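The slide does not describe the algorithm behind the demonstrated echo canceller. As a generic illustration of how an echo canceller lets the ASR engine hear the user over playback, the sketch below uses a textbook single-channel NLMS adaptive filter. The function name, tap count and step size are assumptions; a stereo canceller would typically adapt one such filter per playback channel.

```python
def nlms_echo_cancel(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Textbook NLMS echo canceller (single channel, illustrative only).

    reference: samples sent to the loudspeaker (the music being played).
    mic:       samples captured by the microphone (echo plus any speech).
    Returns (residual, weights): the echo-reduced signal and the learned
    estimate of the loudspeaker-to-microphone echo path.
    """
    w = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - estimate  # what remains after removing the predicted echo
        norm = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        residual.append(e)
    return residual, w
```

The residual, rather than the raw microphone signal, is what would be passed on to the speech recognizer.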
The Future of Voice Interfaces
Near Future of Voice Interfaces
Voice controlled everything
Conversational personal assistants everywhere
Mobile, wearable, smart home, smart car, medical
Mostly driven by search and e-commerce giants
Consumer IoT switches to voice-first UI
Home robots, thermostats, AC, white goods
Smart mics can voice-enable even the simplest IoT devices
Natural Voice User Interface is Set to Replace Smartphone Apps as the Default Control Method of Many Devices
Smart Microphones - Why Smart Mic?
Add a DSP to the Otherwise Passive Mic
Voice-Enabled Everything
Voice trigger → voice activated devices
Sound sensing → contextual awareness
Noise suppression → improved ASR
Ultrasonic gestures → gesture control
Reduced Cost (BOM + integration effort)
[Diagram: Mic + Always-listening Chip → Smart mic]
Join Our LinkedIn Group – Smart Microphones
The Fascinating Future of Voice Interfaces
When People and Machines Talk…
The Fascinating Future of Voice Interfaces
Use natural language to control any device
"Alexa, what’s the weather in ummm, oh I forgot… San Diego!"
Voice trigger will no longer be mandatory
“Alexa”, “OK Google”, “Hi Siri”, “Hey Cortana”: optional, unlike today’s systems
Integration with computer vision
Improved artificial intelligence
Further Advancements will Make Voice an Intuitive UI
The Fascinating Future of Voice Interfaces
Human-like contextual awareness
Always-on voice authentication
Human-like memory
Emotion detection
Human-like Voice Interaction with Every Machine
The Fascinating Future of Voice Interfaces
Overcoming Cloud Drawbacks
Smarter devices will need less cloud support
Introduction of local fog
Improved user experience
Fewer privacy concerns
Lower latency
Longer battery life
Smarter Devices and Fog Improve Privacy and UX
I Want to Voice-Enable my Product
What’s Next?
CEVA is Powering the Voice-First Revolution
Samsung Galaxy S7
DSPG’s DBMD4 with CEVA Inside
Voice-Enabling Your Product? Tell us About it
Our Customers Have Shipped Over 5 Billion Voice-Enabled Devices
CEVA Sensing/Audio/Voice Ecosystem
[Table: Category | Partner | Offering. Partner names appeared as logos and are not recoverable from the text.]
Categories: Noise Reduction / Echo Cancellation; Voice Trigger / Voice Activation; Audio Post-Processing; Codecs; Sensor Fusion; GNSS; Motion Detection
Offerings: Life Vibes Voice Experience; Voice Comm. Package; Flexbeam; Voice Trigger; DRA; Real Audio; Speaker Correction; Noise reduction; TruMedia HD/StudioSound 3D; Truly Handsfree; ZIRENE Sound/3D; 3D Positional Audio; MAXX-Audio/Voice/Speech; MobiSound; Voice Boost (Noise reduction & Voice Trigger); microQ; Dolby HD Audio; DTS HD Audio; WMA 8/9/10-pro; Motion Sensing; GPS, Glonass, Beidou; Always-on Motion Detection; SILK; AMBE vocoders
[Legend: Demonstrated Today]
More than 150 Audio/Voice SW Modules
[Callout: Sample Modules Required for a Multi-Mic Far-field Device]
Always-on & NUI: Sensory TrulyHandsFree; Sensory User Defined Trigger; Sensory Speaker Verification; Rubidium Voice Trigger; Malaspina Labs Voice Trigger; Cywee Motion Sensing; Visidon Face Detection; IVT BT Stack
RTOS: FreeRTOS; Express Logic ThreadX; ENEA OSEck; Quadros RTXC; uT-Kernel; CMX-RTX; Mentor Graphics Nucleus
Voice: G.723; G.729; G.728; G.729.1; G.711; G.722; G.726; G.727; G.168; G.161; iLBC; AMR-NB; HR; FR; EFR; AMR-WB; EVRC; EVRC-B; QCELP; SILK (32-bit); Opus; AMBE Vocoders; EVS
Noise Reduction & Echo Cancellation: NXP SW Life Vibes Voice Experience; Alango Voice Comm. Package; Dimagic Flexbeam; Cypher Voice Isolation; Malaspina Labs Noise Reduction; Waves MAXX Voice; Waves MAXX Speech
Audio: MP3; MP3Pro; Ogg Vorbis; FLAC; MPEG4 AAC LC; HE AAC V1; HE AAC V2; HE-AAC V2 5.1 Ch; MPEG4 AAC BSAC; WMA; RealAudio; SBC; CELT; DRA
Postprocessing: DTS/SRS TruMedia HD; DTS/SRS StudioSound 3D; DTS Neo:6; Dolby ProLogic IIx; Dolby Mobile 3+; Dolby DS1; Dirac Speaker Correction; AM3D Zirene; AM3D Sound/3D; Waves MAXX Audio; Arkamys MobiSound; Qsound MicroQ
Dolby HD Audio: Dolby TrueHD; Dolby Digital Plus; Dolby Digital decoder (AC3); Dolby Digital encoder (DDCE); Dolby MS10; Dolby MS11; Dolby Volume
DTS HD-Audio: Master Audio; High Resolution; Low Bit Rate; Extended Surround (ES); DTS 96/24; DTS Digital Surround; DTS Transcoder; DTS M6; DTS M8
CEVA is Powering EVS in Samsung’s Flagships
CEVA DSPs are Silicon Proven, in Mass Production, With a Strong Ecosystem → Major Cost Reduction and Time-to-Market Advantage
Smart & Connected Development Platform
Availability: Now, directly from CEVA
500 MHz Silicon
CEVA ‘Smart & Connected’ Dev Platform
[Board photo with labeled components]
Processors and software: 500MHz TL4 Silicon (DSP + TLS100 - DMA, TDM, I2S, I2C, ICU, Timers), Silicon Proven; RTOS; ARM Cortex-A9 x 2; Linux OS; User area FPGA
Audio: Digital mics; Analog Mic 1; Analog Mic 2; Line in; Line out
Connectivity and debug: JTAG port; Arduino shield connectors & GPIOs; PCIe; Ethernet; USB/UART; RF I/F
Other: Color LCD; Power; User switches and dip-switches
TeakLite-4 DSP Library
Filter Functions: Autocorrelation; Cross-correlation; Convolution (Block FIR); Block LMS Filter; Delayed LMS Filter; Decimation; Interpolation; Symmetric FIR; Single Sample FIR; Complex IIR (Biquad)
Fourier Transformations: Bit-Reverse Permutation; Complex FFT; Real FFT; Complex Inverse FFT; Real Inverse FFT
Vector Operations: Addition; Subtraction; Dot Product; Maximum / Minimum; Multiplication; Shift
Math Functions: Division; Square Root; Inverse Square Root; Log; Power; Cosine; Sine; Tangent; Arctangent
Bit-accurate C (Visual Studio) and assembly functions for easy integration
Supports both 16-bit and 32-bit precision
Total of 74 functions including full source code
Fully integrated with SDT - NO ADDITIONAL COSTS INVOLVED!
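For readers unfamiliar with the library categories above, here is a plain reference model of one listed routine, the block FIR (convolution). It is a generic textbook formulation in floating point. The function name, the argument order and the explicit state handling are assumptions for illustration and do not reflect the CEVA library's actual API or its 16-bit/32-bit fixed-point arithmetic.

```python
def block_fir(samples, coeffs, state=None):
    """Reference model of a block FIR filter: y[n] = sum_k coeffs[k] * x[n - k].

    samples: one block of input samples.
    coeffs:  filter taps, coeffs[0] applied to the newest sample.
    state:   the last len(coeffs) - 1 inputs of the previous block (zeros if None),
             so consecutive blocks produce the same output as one long block.
    Returns (output block, state to pass to the next call).
    """
    taps = len(coeffs)
    state = list(state) if state is not None else [0.0] * (taps - 1)
    padded = state + list(samples)
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k in range(taps):
            acc += coeffs[k] * padded[n + taps - 1 - k]  # one MAC per tap
        out.append(acc)
    return out, padded[len(samples):]
```

The inner loop is one multiply-accumulate per tap, which is the operation the MAC units of a DSP execute in parallel.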
Architecture Block Diagram
[Block diagram; labeled interface: AXI Master/Slave]
Feature: Configuration
Pipeline: 10
VLIW: 5 way
SIMD [bit]: 64
Scalar Units: 2
MAC [16x16-bit]: 4
MAC [32x32-bit]: 2
SP Floating-Point: Optional
Data Memory width [bit]: 128
Branch Target Buffer: Optional
Data Cache: Optional
Instruction Cache: Optional
A Unified DSP for Intensive Multi-Sensor and Connectivity Workloads
Enhanced Power Scaling Unit
► Multiple clock sources
  ► DSP Core - internal unit manages the clock automatically
    ► Early in instruction decode pipe stage
    ► Unneeded modules are shut down
  ► Memory subsystem
  ► Data & program memories
  ► Emulation & debug modules
► Multiple voltage domains
  ► DSP and memory subsystem
  ► Data and Program L1 memories - enables data retention when core is powered off
  ► Emulation & debug modules
[Diagram labels: TeakLite-4 DSP, Core PSU]
Fine Granularity Enables Ultra-Low Power Controlled Both Automatically and by SW
CEVA-Xtend Interface
Add Customized Instructions and Accelerators
CEVA-Xtend interface supports:
Two 32-bit source operands
  Can be both memory and internal registers
Two 32-bit results
  Can be written to any of the TeakLite-4 accumulators
  Can be written directly to the memory
Instruction opcode
  User defined opcode for the Xtend logic
With CEVA-Xtend You Can Differentiate Your Product and Add Powerful Instructions
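The interface shape described above (a user-defined opcode, two 32-bit source operands, two 32-bit results) can be pictured with a small behavioral model. This is only a sketch: the opcode value, the `butterfly` operation and all function names are hypothetical, and nothing here represents CEVA's actual tooling or instruction encoding.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def butterfly(a, b):
    """Hypothetical custom operation: sum and difference in a single instruction."""
    return (a + b) & MASK32, (a - b) & MASK32


# Hypothetical opcode table: user-defined opcode -> custom logic.
CUSTOM_OPS = {0x01: butterfly}


def xtend_execute(opcode, src_a, src_b):
    """Model one custom instruction: two 32-bit sources in, two 32-bit results out."""
    res0, res1 = CUSTOM_OPS[opcode](src_a & MASK32, src_b & MASK32)
    return res0 & MASK32, res1 & MASK32
```

For example, `xtend_execute(0x01, 5, 3)` returns `(8, 2)`. Producing both results from one opcode is the kind of saving a custom instruction is meant to provide.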
CEVA-ToolBox™
Software Development Environment
Highly focused on two main objectives:
Performance of generated code, especially cycle count and code size
User experience, with emphasis on ease of programming, user interface and automation tools
[Screenshots: Eclipse IDE / Debugger View; Build Optimizer; Profiler – Function Graph; Profiler – Function Info; Profiler – Cache Performance; Profiler – Code Coverage]
Audio/Voice/Sensing Value Proposition
► Most powerful audio/voice DSP
  ► Optional blocks, configurable memory/system I/F
  ► Single/Dual 32-bit MACs, Dual/Quad 16-bit MACs, 64-bit/128-bit data bandwidth → algorithmic efficiency for lower MCPS
  ► 10-stage pipeline → easily reach working frequency with LP memory/cells
► Ultra-low power by design
  ► Power scaling unit keeps power consumption under control
  ► Area optimized, down to 90K gates (TL410)
  ► Small memory footprint – using 16 and 32-bit instruction width
► Extensible architecture
  ► Customer differentiation through application specific ISA
► A multifunctional DSP platform, handling Audio, Voice, Sensor Fusion, Voice Activation and Connectivity
► Large suite of codecs and ecosystem of pre/post processing audio and sensor hub partners
► Same DSP can be used for audio as well as connectivity (BT/Wi-Fi) control
► HW based (FPGA/silicon) SW eval / dev board
CEVA Offers A Clear Differentiation and Enables the Voice-First Revolution
Thank You!