Understanding the Internet Low Bit Rate Coder

Jan Linden
Vice President of Engineering
Global IP Sound
Presented by
Jan Skoglund
Sr. Research Scientist
Global IP Sound
iLBC – Background info
• Development started in Summer 2000
• Contributed to IETF as an Internet-Draft in Feb 2002
• Accepted as work item in IETF AVT group, Mar 2002
• Contributed to CableLabs RFP in June 2002
• Improved version to IETF, Fall 2002
• ECR submitted in May 2003
• Support for 20 ms frames, Spring 2003
• Successful interoperability events
• Passed Working Group Last Call in IETF, Jan 2004
• April 2004: added as a mandatory codec in PacketCable 1.1
• December 2004: IETF process finalized (became Experimental RFC 3951 and 3952)
Design Principles
• Free of 3rd party IPR
  o extensive experience in speech coding patents by design team
  o patent and research situation monitored since 2000
  o has been public in IETF since March 2002 and reviewed by independent speech coding researchers
• Packet independence
  o no coding interdependency between frames
  o increased packet loss robustness
  o suitable for IP networks
• Linear Predictive Coding
  o well-known, highly successful coding model
  o novel coding techniques for the residual signal
iLBC Features
• Sampling Rate: 8 kHz
• Supports 30 ms and 20 ms speech frame modes
• Bitrate
  o 13.3 kbps (399 bits, packetized in 50 bytes) for 30 ms frames
  o 15.2 kbps (303 bits, packetized in 38 bytes) for 20 ms frames
• Computational complexity (TI C54x)
  o 30 ms frames: approx. 18 MIPS/channel
  o 20 ms frames: approx. 15 MIPS/channel
• Memory
  o 400 Words/channel state memory (RAM)
  o less than 4 kWords table memory (ROM)
  o Stack and program memory requirements similar to other low bit rate codecs (e.g. G.729A)
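The bitrates listed above follow directly from the packet size and the frame duration. A minimal sketch of that arithmetic in Python; the names `MODES`, `bitrate_kbps`, and `spare_bits` are illustrative and not part of any iLBC API.

```python
# Frame modes from the feature list: coded bits and packet size per frame.
MODES = {
    30: {"coded_bits": 399, "packet_bytes": 50},
    20: {"coded_bits": 303, "packet_bytes": 38},
}


def bitrate_kbps(packet_bytes, frame_ms):
    """Bits sent per millisecond, which is the same number as kbit/s."""
    return packet_bytes * 8 / frame_ms


def spare_bits(coded_bits, packet_bytes):
    """Bits left over after rounding the coded bits up to whole bytes."""
    return packet_bytes * 8 - coded_bits


for frame_ms, mode in MODES.items():
    rate = bitrate_kbps(mode["packet_bytes"], frame_ms)
    spare = spare_bits(mode["coded_bits"], mode["packet_bytes"])
    print(f"{frame_ms} ms: {rate:.1f} kbit/s, {spare} spare bit(s) per packet")
```

The rate is computed from the packet size rather than from the coded bit count, which is why the 30 ms mode comes out at 13.3 kbit/s (400 bits per 30 ms) and not 399/30.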
The Core iLBC method
• Start state encoding
• Gain-shape waveform matching forward in time
• Gain-shape waveform matching backward in time
• Pitch enhancement
• Packet loss concealment
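To illustrate what "gain-shape waveform matching" means in general, here is a minimal Python sketch of a textbook gain-shape search: pick the shape vector that best matches a target segment, then the gain that scales it. This is an assumption-laden illustration only; the function names and the toy codebook are hypothetical, and the sketch does not reproduce the codebook construction, search order, or gain quantization of the actual codec.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def gain_shape_match(target, codebook):
    """Return (index, gain) minimizing ||target - gain * codebook[index]||^2.

    For a fixed shape c, the optimal gain is (t.c)/(c.c), and the error
    reduction is (t.c)^2/(c.c), so the search maximizes that score.
    """
    best = None
    for index, shape in enumerate(codebook):
        energy = dot(shape, shape)
        if energy == 0.0:
            continue  # an all-zero shape cannot match anything
        corr = dot(target, shape)
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, index, corr / energy)
    if best is None:
        raise ValueError("codebook has no usable shape vectors")
    _, index, gain = best
    return index, gain


# Toy example (hypothetical data): the target is exactly 2x the second shape.
toy_codebook = [[1.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
index, gain = gain_shape_match([2.0, 4.0, 6.0], toy_codebook)
print(index, gain)
```

Separating the shape (which vector) from the gain (how loud) lets each be coded with its own bit budget, which is how the bit allocation later in the deck lists "Shapes" and "Gains" as separate parameters.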
iLBC Encoding
[Figure: encoder block diagram, incoming speech in, packets to network out]
iLBC Decoding
[Figure: decoder block diagram, packets from network in, decoded speech out]
20 ms vs 30 ms sub-blocks
0        39        79       119       159
+---------------------------------------+
|    1    |    2    |    3    |    4    |
+---------------------------------------+
              20 ms frame

0        39        79       119       159       199       239
+-----------------------------------------------------------+
|    1    |    2    |    3    |    4    |    5    |    6    |
+-----------------------------------------------------------+
                        30 ms frame
• 20 ms frame size mode - 4 sub-blocks with the total length
of 160 samples
• 30 ms frame size mode - 6 sub-blocks with the total length
of 240 samples
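The sub-block layout above can be derived from the frame duration alone. A minimal Python sketch, assuming the 8 kHz sampling rate from the feature list and the 40-sample sub-block length implied by the index ticks (0, 39, 79, ...); the constant and function names are illustrative.

```python
SAMPLE_RATE_HZ = 8000     # sampling rate from the feature list
SUB_BLOCK_SAMPLES = 40    # sub-block length implied by the ticks 0, 39, 79, ...


def sub_blocks(frame_ms):
    """Return the inclusive (first, last) sample indices of each sub-block."""
    frame_samples = SAMPLE_RATE_HZ * frame_ms // 1000
    if frame_samples % SUB_BLOCK_SAMPLES != 0:
        raise ValueError("frame is not a whole number of sub-blocks")
    return [
        (start, start + SUB_BLOCK_SAMPLES - 1)
        for start in range(0, frame_samples, SUB_BLOCK_SAMPLES)
    ]


for frame_ms in (20, 30):
    blocks = sub_blocks(frame_ms)
    print(f"{frame_ms} ms: {len(blocks)} sub-blocks, last = {blocks[-1]}")
```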
20 ms vs 30 ms mode – bit allocation
30 ms mode: 240 samples encoded to 399 bits = 13.3 kbit/s (50 oct)
20 ms mode: 160 samples encoded to 303 bits = 15.2 kbit/s (38 oct)

Parameter               Bits (30 ms)   Bits (20 ms)
LPC                     40             20
Start state position    4              3
Start state scale       6              6
Start state samples     174            171
Shapes                  115            67
Gains                   60             36
Total                   399            303
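The per-parameter bit counts above can be checked against the stated totals with a few lines of Python. The dictionary layout and function names are illustrative; the numbers are copied from the bit allocation table.

```python
# Bit allocation per frame, copied from the table above.
BIT_ALLOCATION = {
    "30 ms": {
        "LPC": 40,
        "Start state position": 4,
        "Start state scale": 6,
        "Start state samples": 174,
        "Shapes": 115,
        "Gains": 60,
    },
    "20 ms": {
        "LPC": 20,
        "Start state position": 3,
        "Start state scale": 6,
        "Start state samples": 171,
        "Shapes": 67,
        "Gains": 36,
    },
}


def total_bits(mode):
    """Sum of all parameter bits for one frame mode."""
    return sum(BIT_ALLOCATION[mode].values())


def share_percent(mode, parameter):
    """Percentage of the frame's bits spent on one parameter."""
    return 100.0 * BIT_ALLOCATION[mode][parameter] / total_bits(mode)


for mode in BIT_ALLOCATION:
    share = share_percent(mode, "Start state samples")
    print(f"{mode}: {total_bits(mode)} bits, start state samples {share:.1f}%")
```

The start state samples take the largest share of the bits in both modes, and a larger share in the 20 ms mode because nearly the same number of start state bits is spread over a shorter frame.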
Advantage over CELP
[Figure: waveforms labeled "original", "iLBC", "g729", and "g723", with regions marked "PLC" and "State recovery"]
iLBC Performance vs G.729A & G.723.1
[Figure: performance comparison, old version from Winter 2002. Source: Dynastat]
iLBC Performance
Equivalent or slightly lower performance than G.729E in clean conditions.
Improved robustness to packet loss compared to G.729E.
iLBC performed better than G.728 in other testing.
Implementation
Floating Point Source
Fixed Point Source
• Significant signal processing skills necessary
• Quality / efficiency trade-off
• ~ 6 Months
DSP Source
• Optimization skills
• ~ 4 Months
iLBC Specifications
• Available in floating point, fixed point ANSI C, TIc54x, TIc55x, TIc64x,…
• Supports 20 and 30 ms speech frames
• Algorithmic delay: Same as frame size
• Sampling Rate: 8 kHz
• Bit rate: 13.333 kbps for 30 ms and 15.2 kbps for 20 ms
Product             Frame size   Complexity (max)
                                 Encoder      Decoder
GIPS iLBC TIc54x    20 ms        11.5 MIPS    4.1 MIPS
GIPS iLBC TIc54x    30 ms        13.5 MIPS    4.4 MIPS
GIPS iLBC TIc55x    20 ms        7.5 MIPS     3.0 MIPS
GIPS iLBC TIc55x    30 ms        8.8 MIPS     3.1 MIPS
Data Memory Static (Fix, Per channel)
Data Memory Dynamic
17.6   2.4   1.4   2.3
15.6   2.4   1.4   2.3
Program Memory
Memory in kWord16
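The encoder and decoder complexity figures in the table above can be combined into a rough per-channel load. A minimal Python sketch, assuming that one channel runs both an encoder and a decoder and that their worst cases simply add; the dictionary and function names are illustrative.

```python
# (encoder MIPS, decoder MIPS) per platform and frame size, from the table.
COMPLEXITY_MIPS = {
    ("TIc54x", 20): (11.5, 4.1),
    ("TIc54x", 30): (13.5, 4.4),
    ("TIc55x", 20): (7.5, 3.0),
    ("TIc55x", 30): (8.8, 3.1),
}


def full_duplex_mips(platform, frame_ms):
    """Encoder plus decoder load for one channel (assumes worst cases add)."""
    encoder, decoder = COMPLEXITY_MIPS[(platform, frame_ms)]
    return round(encoder + decoder, 1)


for (platform, frame_ms) in COMPLEXITY_MIPS:
    total = full_duplex_mips(platform, frame_ms)
    print(f"{platform} {frame_ms} ms: {total} MIPS per channel")
```

For the TIc54x the sums (15.6 and 17.9 MIPS) line up with the approximate 15 and 18 MIPS/channel quoted on the features slide.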