Introducing Audio Signal Processing & Audio Coding
Dr Michael Mason
Senior Manager, Sound Development
Dolby Australia Pty Limited

Overview
Audio Signal Processing Applications @ Dolby
Audio Signal Processing Basics
• Sampling
• What is an audio signal?
• Signal Processing Domains
Case Study 1 – Headphone Virtualisation
• Frequency Response
• FIR filtering
• Computational Complexity
Case Study 2 – Perceptual Audio Coding
• Psychoacoustics

© 2016-17 DOLBY LABORATORIES, INC. CONFIDENTIAL

Audio Signal Processing Applications @ Dolby

Audio Signal Processing Applications @ Dolby: Cinema
• Delivering channel-based audio (5.1, 7.1)
  – Distribute movies to multiple screens in a multiplex
  – Cinemas use speaker arrays rather than single speakers, so processing is required to fill the arrays from single-channel feeds
• Rendering immersive audio (Dolby Atmos)
  – The cinema soundtrack is expressed as individual objects and locations; in every cinema the movie is rendered for that specific cinema's speaker locations
• Speaker equalisation & protection
  – Process the audio sent to each speaker to compensate for the electro-acoustic properties of the speaker (e.g., frequency response, distortion characteristics)
  – Ensure that the audio sent to the speakers doesn't overdrive them, which would damage them

Audio Signal Processing Applications @ Dolby: Broadcast / Home Theatre
• Compression of audio for streaming / DVD / Blu-ray Disc
  – Perceptual audio coding (case study later)
  – Matrix encoding (Pro Logic)
  – Multi-channel audio coding
  – Multiple languages
  – Multiple playback formats (stereo / 5.1 / etc.)
• Broadcast end-to-end
  – Capture, mixing, coding, transmission, playback
• AV Receivers (AVRs), Set Top Boxes (STBs), Digital Media Adapters (DMAs)
• Games consoles

Audio Signal Processing Applications @ Dolby: Personal Audio
• Devices
  – Mobile phones (feature phones & smartphones)
  – Tablets
  – Music players
  – PCs
• Same issues as Home Theatre, but usually with more limited acoustic hardware (i.e., cheap speakers)
• Headphone playback is a big use case (case study later)

Audio Signal Processing Applications @ Dolby: Voice Processing
• Many of the same basic challenges, but because speech has characteristics that differ from general audio, different solutions exist
• Speech coders use different approaches from audio codecs
  – What makes a good codec is measured differently
  – The transmission bandwidth available for the data is much more limited
• Conferencing & Telephony

Audio Signal Processing Basics

Audio Signal Processing Basics: Sampling
• Digital signals have samples which are discrete in time and magnitude
• The process of converting a continuous signal to the digital domain is sampling
  – Two key questions when sampling: how often to sample, and how precisely?
[Figure: Analogue to Digital Converter (ADC) -> Digital Signal Processing -> Digital to Analogue Converter (DAC)]

Audio Signal Processing Basics: Sampling Frequency – 𝑓𝑠 (how often?)
• Number of samples per second
• Nyquist rate:
  – Twice the highest frequency in the signal; the sampling frequency must be greater than this

Audio Signal Processing Basics: Resolution (how precisely?)
• Each sample is represented by a number; how many bits should we use?
• Converting a continuous value to a discrete value requires quantisation
• Quantisation error (1-bit example)
  – '1' → 0.5
  – '0' → -0.5
[Figure: 1-bit quantiser; digital codes 0 and 1 against an analogue range of -1.0 to +1.0]

Audio Signal Processing Basics: Resolution (how precisely?)
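The effect of resolution on quantisation error can be explored numerically. The following is a minimal sketch, not taken from the slides: it assumes a uniform quantiser over the analogue range -1.0 to +1.0 (with 1 bit it reproduces the '1' → 0.5, '0' → -0.5 example) and a near full-scale 997 Hz test tone at 48 kHz. The function names `quantise` and `sine_snr_db` are hypothetical.

```python
import math


def quantise(x, bits):
    """Uniform quantiser for x in [-1.0, 1.0); returns the reconstructed value.

    Each of the 2**bits cells is represented by its mid-point, so with
    bits=1 the two output values are -0.5 and +0.5.
    """
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, math.floor((x + 1.0) / step)))
    return -1.0 + (index + 0.5) * step


def sine_snr_db(bits, num_samples=4800):
    """Measured SNR (dB) of a quantised test tone (assumed: 997 Hz at 48 kHz)."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = 0.999 * math.sin(2.0 * math.pi * 997.0 * n / 48000.0)
        err = quantise(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(f"{bits:2d} bits: SNR = {sine_snr_db(bits):.1f} dB")
```

Running it for several bit depths should show the measured SNR growing by roughly 6 dB for every bit added.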
• By using more bits, we reduce the error
• Skipping all the math: each additional bit of resolution improves SNR (signal-to-noise ratio) by 6.02 dB
[Figure: multi-bit quantiser; digital codes (000 ... 101) against an analogue range of -1.0 to +1.0]

Audio Signal Processing Basics: Audio Signal
• Sampling frequency
  – Human perception: 20 Hz – 20,000 Hz
  – Nyquist says Fs > 40 kHz
    • CD Audio: 44.1 kHz
    • Blu-ray (and before that DAT): 48 kHz
• Bit depth
  – Range of loudness relative to human hearing
    • Threshold of hearing: 0 dB
    • Jet engines: 110-140 dB
    • Busy road (standing at the curb): 100 dB
    • Sustained exposure will cause damage: 85 dB
  – 16 bits per sample gives ~96 dB of dynamic range
  – 24 bits per sample gives ~144 dB
• When/where might we use more (a higher sampling rate or more bits)?

Audio Signal Processing Basics: Audio Signal
• Raw data rate
  – 48 kHz, 16 bits per sample = 768 kbps / ch
  – ~4.15 GB (3.86 GiB) for a 2 hr movie (5.1 channels) (NB: DVD capacity = 4.7 GB)

Audio Signal Processing Basics: Processing domains
• Sampled audio, i.e., Pulse Code Modulated (PCM) data, is in the time domain
• Not everything we want to do with audio is formulated as a time-domain operation
  – e.g., flattening the frequency response of a speaker
• The Fourier Transform expresses a signal in terms of its frequency components (sinusoids). Using it, we can formulate processing in the frequency domain
• Whether processing is implemented in the time or the frequency domain can depend on where it is most efficient
• Signal processing also has other useful transform domains which may offer advantages for specific types of processing
  – e.g., image coding often uses the Discrete Cosine Transform (DCT)

Headphone Virtualisation

Case Study 1: Headphone Virtualisation
How do you get surround sound out of a pair of headphones?

Headphone Virtualisation
Two things we need to achieve:
• Make it sound like the audio is coming from different directions
• Make it sound like the listener is in a room
Both can be achieved by filtering the signal using the room impulse response (RIR) and the head-related transfer functions (HRTFs).

Headphone Virtualisation: Room impulse response
• By measuring how a short impulsive sound is altered by a room, the room's reflections and echoes can be characterised to create an impulse response. https://www.youtube.com/watch?v=PkZjIHTJ4jc
• The impulse response can in turn be used to filter any signal, to make it sound like it was in the room.
• The process of filtering a signal using an impulse response is convolution:
  $$y[n] = \sum_{k=-\infty}^{\infty} h[k]\, x[n-k]$$

Headphone Virtualisation: Room impulse response
• How many points would be required to capture a room (i.e., how long is the impulse response)?
• Limiting the impulse response to 30 ms gives us 1440 points (@ 48 kHz)
• Considering the computational cost: 1440 * 48k -> ~69 MFLOPS (counting one multiply-accumulate per tap per output sample)

Headphone Virtualisation: Computational load
• On a DSP chip with a single-cycle MAC -> 69 MCPS
• On an ARM, MACs take ~3.5 cycles each -> ~240 MCPS
• 5.1 channels -> 10 filters = ~2,400 MCPS

Headphone Virtualisation: The solution?
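To make the trade-off concrete, here is a minimal sketch that places the direct convolution sum next to a frequency-domain version that gives the same output. It is an illustration only: the function names are hypothetical, the pure-Python radix-2 FFT stands in for an optimised library routine, and a real-time system would process the signal in blocks (e.g., overlap-add) rather than transform the whole signal at once.

```python
import cmath


def convolve_direct(h, x):
    """y[n] = sum_k h[k] * x[n - k], with h and x treated as zero outside their ranges."""
    y = [0.0] * (len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]  # one multiply-accumulate (MAC)
    return y


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(h, x):
    """Same result as convolve_direct, computed by multiplying spectra."""
    out_len = len(h) + len(x) - 1
    size = 1
    while size < out_len:  # zero-pad so circular convolution equals linear convolution
        size *= 2
    spectrum_h = fft([complex(v) for v in h] + [0j] * (size - len(h)))
    spectrum_x = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    product = [a * b for a, b in zip(spectrum_h, spectrum_x)]
    y = fft(product, inverse=True)
    return [v.real / size for v in y[:out_len]]


# Direct-form cost quoted on the slides: taps * sample rate MACs per second.
DIRECT_MACS_PER_SECOND = 1440 * 48_000

if __name__ == "__main__":
    h = [1.0, 0.5, 0.25]      # made-up 3-tap impulse response
    x = [1.0, 2.0, 3.0, 4.0]  # made-up input signal
    print(convolve_direct(h, x))
    print(convolve_fft(h, x))
    print(DIRECT_MACS_PER_SECOND)
```

The direct form costs one MAC per tap per output sample, which is where the 1440 * 48k figure comes from; the frequency-domain form replaces the inner loop with one multiply per frequency bin plus the transforms.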
• Convolution in the time domain <-> multiplication in the frequency domain
  – Fourier transform the impulse response & the signal
    • Block based, e.g., blocks of N = 2048
    • O[N.log2(N)]: 2048 * 11 = 22,528, times a constant k (k = 3.5 here) ~ 78,848 operations
  – Operate in the frequency domain
    • Complex multiplies -> 4 * 2048 -> 8,192
  – Transform the result back to the time domain
    • Same cost as the forward transform
  – Blocks per second?
    • 48,000 / 2048 ~ 23 blocks/sec, so (2 * 78,848 + 8,192) * 23 ~ 3.8 M operations per second, i.e., ~4 MFLOPS / filter
What about the HRTFs?

Headphone Virtualisation: Head-related Transfer Function
• Measured on a dummy head
• Applied as filters
• The same computational arguments lead us to apply these in the frequency domain
NB: we don't need to go back to the time domain between the two sets of filters

Dolby Atmos for headphones debuted in Blizzard's Overwatch

Perceptual Audio Coding

Case Study 2: Perceptual Audio Coding
How do you reduce the storage and transmission bandwidth requirements of audio signals?
Bitrates:
• Uncompressed: 768 kbps / ch
• DVD (AC3): 448 kbps (5.1 channels) (~10:1 compression ratio)

Perceptual Audio Coding: Audio Coding is Lossy
• Lossless compression must perfectly reconstruct its source (e.g., zip files)
• Lossy compression can 'throw away' data if it isn't 'needed'. The reconstruction need only be 'good enough'.
  – Deciding which bits to 'throw away' and what is 'good enough' is the hard part.

Perceptual Audio Coding
[Figure: encoder block diagram with blocks Time/Frequency analysis, Psychoacoustic analysis, Bit allocation, Quantisation, Entropy coding]
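The bitrate figures quoted at the start of this case study can be reproduced with a few lines of arithmetic. This sketch assumes that "5.1 channels" means six channels stored at the full 48 kHz / 16-bit rate and that the movie runs exactly two hours; the constant names are hypothetical.

```python
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
CHANNELS_5_1 = 6              # assumption: 5 main channels + LFE, all at full rate
MOVIE_SECONDS = 2 * 60 * 60   # assumption: exactly 2 hours
AC3_DVD_BPS = 448_000         # DVD (AC3) bitrate quoted on the slide

# Uncompressed PCM rate for one channel, in bits per second.
per_channel_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE

# Uncompressed rate for the whole 5.1 programme.
raw_5_1_bps = per_channel_bps * CHANNELS_5_1

# Storage needed for the uncompressed 5.1 soundtrack of the movie, in bytes.
raw_movie_bytes = raw_5_1_bps * MOVIE_SECONDS // 8

# How much smaller the coded stream is than the raw PCM.
compression_ratio = raw_5_1_bps / AC3_DVD_BPS

if __name__ == "__main__":
    print(f"per channel : {per_channel_bps / 1000:.0f} kbps")
    print(f"raw 5.1     : {raw_5_1_bps / 1000:.0f} kbps")
    print(f"2 hr movie  : {raw_movie_bytes / 1e9:.2f} GB ({raw_movie_bytes / 2**30:.2f} GiB)")
    print(f"compression : {compression_ratio:.1f}:1")
```

Under these assumptions the raw soundtrack alone would nearly fill a 4.7 GB DVD, which is the motivation for the roughly 10:1 reduction.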

Perceptual Audio Coding: Psychoacoustics
• The study of sound perception
  – Perception implies the human experience, which includes physiological and psychological factors. https://auditoryneuroscience.com/McGurkEffect
• It is at the heart of the question of which parts of an audio signal are important or unimportant.

Perceptual Audio Coding: Psychoacoustics
• Most perceptual quantities are non-linear and subjective
• Loudness
  – Non-linearly related to sound pressure
  – Scales include: sone, phon
• Pitch
  – Non-linearly related to frequency
  – Scales include: Bark, Mel, ERB

Perceptual Audio Coding: Frequency Masking

Perceptual Audio Coding: Temporal Masking

Perceptual Audio Coding
Time/Frequency analysis
• Break the incoming signal into time blocks and transform each block into the frequency domain
• Coding is always block based
• The frequency representation is analysed in bands of equal perceptual bandwidth (Bark)
Psychoacoustic analysis
• Use the frequency representation of the current block to calculate the masking curve
• Use the frequency masking curves from previous frames to account for temporal masking
[Figure: encoder block diagram with blocks Time/Frequency analysis, Psychoacoustic analysis, Bit allocation, Quantisation]

Perceptual Audio Coding: Masking Curve
• Areas of the spectrum where the masking curve is above the signal energy represent 'things we can't hear'
• If we can't hear them, we shouldn't spend bits encoding them
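The masking-curve comparison can be sketched in a few lines. This is an illustration under stated assumptions, not a psychoacoustic model: the per-band signal energies and masking levels below are made-up numbers in dB, and `audible_bands` is a hypothetical helper name.

```python
def audible_bands(signal_db, mask_db):
    """Return (band_index, signal_to_mask_ratio_db) for bands that rise above the mask.

    Bands whose energy is at or below the masking curve are treated as
    inaudible and left out, so no bits would be spent on them.
    """
    return [
        (band, signal - mask)
        for band, (signal, mask) in enumerate(zip(signal_db, mask_db))
        if signal > mask
    ]


if __name__ == "__main__":
    signal_db = [60.0, 42.0, 35.0, 20.0]  # made-up signal energy per band
    mask_db = [30.0, 45.0, 25.0, 28.0]    # made-up masking curve per band
    for band, smr in audible_bands(signal_db, mask_db):
        print(f"band {band}: signal exceeds the mask by {smr:.1f} dB")
```

In this made-up example only bands 0 and 2 rise above the mask; bands 1 and 3 fall under it and would receive no bits.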

Perceptual Audio Coding
Bit allocation
• Using the masking curve, we can calculate the allowed signal-to-noise ratio in each of the frequency bands (the SNR at which quantisation noise stays below the masking curve)
• Knowing that allocating a bit to a quantiser improves SNR by 6 dB, iteratively allocate the bits available in the bit pool to bands until we either run out of bits or meet the SNR requirements in all bands
• (Any leftover bits can be used to code the next frame)
• The bit distribution must be sent to the decoder
Quantiser
• Quantise the frequency-domain representation to send to the decoder
[Figure: encoder block diagram with blocks Time/Frequency analysis, Psychoacoustic analysis, Bit allocation, Quantisation]

Perceptual Audio Coding: Decoding is 'simple'
• Recreate the frequency representation of each frame
• Transform back to the time domain
• Additional processing can be used to enhance the reconstructed signal

Summary
Audio Signal Processing Applications
Audio Signal Processing Basics
• Sampling
• What is an audio signal?
• Signal Processing Domains
Case Study 1 – Headphone Virtualisation
• Frequency Response
• FIR filtering
• Computational Complexity
Case Study 2 – Perceptual Audio Coding
• Psychoacoustics

Questions?
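As a worked illustration of the Bit allocation slide, the iterative loop can be sketched as a greedy allocation. This is one possible reading of the slide, not the allocation used by any particular codec: it assumes each bit buys 6.02 dB of SNR, always gives the next bit to the band furthest from its requirement, and uses made-up SNR requirements. The name `allocate_bits` is hypothetical.

```python
DB_PER_BIT = 6.02  # SNR improvement per quantiser bit


def allocate_bits(required_snr_db, bit_pool):
    """Greedy allocation: one bit at a time to the band with the largest SNR shortfall.

    Returns (bits_per_band, leftover_bits). Stops when the pool is empty or
    every band meets its required SNR.
    """
    bits = [0] * len(required_snr_db)
    remaining = bit_pool
    while remaining > 0:
        shortfall = [req - DB_PER_BIT * b for req, b in zip(required_snr_db, bits)]
        worst = max(range(len(bits)), key=lambda i: shortfall[i])
        if shortfall[worst] <= 0:
            break  # all bands already meet their requirement
        bits[worst] += 1
        remaining -= 1
    return bits, remaining


if __name__ == "__main__":
    # Made-up per-band requirements in dB; zero or negative means the band is masked.
    required = [30.0, 0.0, 10.0, -8.0]
    bits, leftover = allocate_bits(required, bit_pool=16)
    print(bits, leftover)
```

The leftover count corresponds to the bits the slide says can be carried over to the next frame, and the returned per-band list is the bit distribution that would have to be sent to the decoder.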