Course Presentation Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology Sound Basics waves of pressure which propagates through compressible media such as air or water. Digital Representation of an Analog Signal Sampling and Quantization Parameters: Sound is a sequence of Sampling Rate (Samples per Second) Quantization Levels (Bits per Sample) This is a form of coding too: Pulse-code modulation (PCM) Page 1 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Why Call it PCM? 4-bit PCM Page 2 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Bit per Second (bit/s or bps) How to choose proper… Sampling Rate 8 Khz ? Quantization Level 8 bit/sample ? Bit per Second for 8000 Hz 8 bit PCM 64 kbit/s Page 3 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Sampling Rate Human Hearing Frequency Range 20 Hz to 20 kHz Play with “Audacity” tone generator to test your hearing Most people will find that their hearing is most sensitive around 1-4 kHz and that it is less sensitive at high and low frequencies. Page 4 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Hearing Range Porpoise Ferret Page 5 Check out Freq. Alloc. Table Multimedia Systems, Speech I Pulse-code Modulation (PCM) Sampling Rate Human Vocal Range Normal: 80 Hz to 1100 Hz Charles Kellogg (14 KHz) (not verified) Guinness Book of Records Female: Georgia Brown (Eight octaves, 25087Hz) Male: Tim Storms (Six octaves) Page 6 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Common Sampling Rates 8,000 Hz - Telephone, adequate for human speech. 11,025 Hz – lower quality PCM (one quarter the sampling rate of audio CDs). 22,050 Hz – Radio. 32,000 Hz - miniDV digital video camcorder, DAT (LP mode). 44,100 Hz - Audio CD, also most commonly used with MPEG-1 audio (VCD, SVCD, MP3) (Originally chosen by Sony, 1979). 48,000 Hz - Digital sound used for miniDV, digital TV, DVD, DAT, films and professional audio. 96,000 or 192,000 Hz - DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD-DVD (High-Definition DVD) audio tracks. 2.8224 MHz - Super Audio CD (SACD), 1-bit sigma-delta modulation process known as Direct Stream Digital (DSD), co-developed by Sony and Philips. 5.6448 MHz - Double-Rate DSD, 1-bit Direct Stream Digital at 2x the rate of the SACD. Used in some professional DSD recorders (128 * 44100 Hz). DXD - 24-bit sampled at 352.8 kHz, suited for editing, eq. with 8.4672 MHz 1-bit DSD Page 7 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Direct Stream Digital The trademark name used by Sony and Philips. Uses pulse-density modulation encoding Page 8 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Quantization, Images Page 9 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Uniform Quantizer, Midtread Simple and popular midtread odd number of reconstruction levels Quantization error for bounded input Page 10 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Uniform Quantizer, Midtrise Simple and popular midtrise even number of reconstruction levels Page 11 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Quantization Levels Want to prevent human ear fatigue by minimizing quantization noise Signal-to-Noise Ratio = 6.02*B dB SNR is approximately 6 dB per bit. 16-bit => 96 dB Above 36 dB is required Page 12 Multimedia Systems, Speech I Pulse-code Modulation (PCM) 6 dB per bit rule of thump Average power of a process or signal: e n xˆ n x n e n 2 2 2 x p x dx x x 2Xm 2B x : 2 Mean p x : : Variance Probability density function SNRdB B 10log10 x2 e2 X m x n X m Assumption: e n is uniform over ( , ] 2 2 e e 2 e 2 2 2 X m2 1 2 p e de e de 2B 12 2 3 2 2 2 SNRdB B 10 log10 22 B 3 x2 X m2 20 log10 2 B 10 log10 3 x2 X m2 The probability density function of e[n] Page 13 SNRdB B 6.02B 10log10 3 x2 X m2 Multimedia Systems, Speech I Pulse-code Modulation (PCM) 6 dB per bit rule of thump SNRdB B 6.02B 10log10 3 x2 X m2 Example 1: SNR for Uniform Quantization of Uniformly-Distributed Input x2 X m2 3 SNRdB B 6.02B Example 2: SNR for Uniform Quantization of Sinusoidal Input x2 X m2 2 SNRdB B 6.02B 1.76 Example 3: SNR for Uniform Quantization of Gaussian Input x2 X m2 16 Page 14 SNRdB B 6.02B 7.27 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Good to Know The average person cannot tell the difference between a bitrate above 192 kbit/s and the original CD/WAV. Even if your headphones seal really well around your ears, they will probably only give you about 20 to 25 dB insulation from the external sound 20 ~ 25 dB insulation Noise level for 192 kbps audio is under -125 db and certainly inaudible Page 15 Multimedia Systems, Speech I Pulse-code Modulation (PCM) Nonuniform Quantization Typical speech signal and its histogram (a) Uniform and (b) non-uniform quantization Q(x) and quantization error q(x) Page 16 Multimedia Systems, Speech I Pulse-code Modulation (PCM) u-law, a-law Nonuniform quantizers: Difficult to make, Expensive. Solution: Companding Uniform Q. Expanding Page 17 Multimedia Systems, Speech I Pulse-code Modulation (PCM) u-law, a-law Page 18 Multimedia Systems, Speech I Pulse-code Modulation (PCM) u-law, a-law u-law North America and Japan Page 19 Multimedia Systems, Speech I a-law Europe Differential PCM (DPCM) Idea Take advantage of data redundancy [… 110 112 111 112 112 114 115 115 114 114… ] [… +2 -1 +1 0 +2 +1 0 -1 0 …] Page 20 Multimedia Systems, Speech I Differential PCM (DPCM) Basic Scheme General Predictive Coding Delta Modulation (DM): ai x n i z 1 Problem? Page 21 Multimedia Systems, Speech I Differential PCM (DPCM) Better Structure Page 22 Multimedia Systems, Speech I Multimedia Systems Speech I Thank You Next Session: Speech II FIND OUT MORE AT... 1. http://ce.sharif.edu/~m_amiri/ 2. http://www.dml.ir/ Page 23 Multimedia Systems, Speech I
© Copyright 2026 Paperzz