Audio Basics
Digital Audio
YEEE0004, Fall 09
Hayden Kwok-Hay So

Sound Waves
Sound is vibration: mechanical energy that propagates through matter as a wave.
The energy in a sound wave causes our eardrums to vibrate, which leads to our perception of sound.
Sound can be measured by recording the changes in air pressure.
A microphone converts the motion of air into electrical signals - audio signals.
[Figure: waveforms of a violin and a trumpet; amplitude vs. time]

Pitch & Loudness
  Subjective Description     Objective Measurements
  Pitch                      Frequency
  Loudness                   Amplitude
  Timbre                     “Shape” of the waveform
  low frequency = low pitch      high frequency = high pitch
  low amplitude = soft           high amplitude = loud
Pitch
  Frequency is measured in Hertz (Hz)
•  Number of cycles per second
  Higher frequencies produce higher pitches
  Doubling the frequency increases the pitch by one octave
  Example:
•  C3 = 130.81 Hz
•  A4 = 440 Hz
•  A5 = 880 Hz
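A quick sketch in Python (added here, not part of the slides) of how these example frequencies follow from octave doubling on the equal-tempered scale, taking A4 = 440 Hz as the reference:

# Sketch: equal-tempered note frequencies relative to A4 = 440 Hz.
# Each semitone multiplies the frequency by 2**(1/12); 12 semitones double it (one octave).
A4 = 440.0

def note_freq(semitones_from_a4):
    """Frequency of a note the given number of semitones above (or below) A4."""
    return A4 * 2 ** (semitones_from_a4 / 12)

print(round(note_freq(12), 2))    # 880.0  -> A5, one octave up: frequency doubles
print(round(note_freq(-21), 2))   # 130.81 -> C3, 21 semitones below A4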
 
 
Loudness (Sound Intensity)
  Sound intensity is measured in decibels (dB)
  A relative measurement
•  Intensity level = 10 log10(P/P0) dB
•  P and P0 are values of acoustic power, where the reference P0 = 10⁻¹² W/m²
  Common decibel levels:
•  35 – library
•  60 – normal conversation
•  85 – busy street
•  110 – loud music
•  120 – airplane takeoff and landing
•  140 – firework
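A small sketch in Python (an illustration added here, not from the slides) of the intensity-level formula with the reference P0 = 10⁻¹² W/m²:

import math

# Sketch of the slide's formula: intensity level (dB) = 10 * log10(P / P0),
# with the reference P0 = 1e-12 W/m^2 (threshold of hearing).
P0 = 1e-12

def intensity_level_db(p):
    """Sound intensity level in dB for an acoustic intensity p in W/m^2."""
    return 10 * math.log10(p / P0)

print(round(intensity_level_db(1e-12)))   # 0 dB   (threshold of hearing)
print(round(intensity_level_db(1e-6)))    # 60 dB  (roughly normal conversation)
print(round(intensity_level_db(1.0)))     # 120 dB (roughly airplane takeoff)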
 
Timbre
  The shape of the sound wave affects the “feel” of the sound, even when two sounds have the same frequency
•  It gives a musical sound its quality
•  But why?
Signals in Frequency Domain (Spectrum Analysis)
  All sounds can be constructed using many, many sine waves of different frequencies, amplitudes and phases.

[Figure: sin(x) and sin(x)+sin(2x), shown as amplitude vs. time and as spectra of amplitude vs. frequency]
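As an added illustration (Python with NumPy; the 8 kHz sampling rate and the 440 Hz and 880 Hz components are chosen only for this sketch), a signal built from two sine waves whose component frequencies are recovered with the FFT:

import numpy as np

# Sketch: build a signal from two sine waves and recover their frequencies with the FFT.
fs = 8000                                  # sampling rate (Hz), chosen for the example
t = np.arange(0, 1.0, 1 / fs)              # 1 second of sample instants
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.abs(np.fft.rfft(signal))             # magnitude spectrum
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)     # frequency of each FFT bin

# The two largest peaks sit at the component frequencies.
peaks = sorted(float(f) for f in freqs[np.argsort(spectrum)[-2:]])
print(peaks)    # [440.0, 880.0]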
Demos
  Tones
  Chords
  DTMF
  Equalizer

Digitizing Sound Waves
 
Q: How do we represent this signal in digital form?
A: We need (1) the amplitude of the waveform at (2) particular instants in time.
Sampling a Signal
  The process of converting a continuous analog signal into a discrete sequence of values by measuring the value of the analog signal at regular time intervals.
  Each measurement is called a sample.
  Sample data are discrete in time, but continuous in amplitude.
  Need quantization on the sampled values.
  A signal represented this way is called PCM, pulse-code modulation.
•  Similar to a bitmap in the image domain
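A minimal sketch (Python/NumPy, added for illustration; the 440 Hz tone and 8 kHz rate are arbitrary choices) of sampling at regular intervals and quantizing each sample to 8-bit PCM codes:

import numpy as np

# Sketch of sampling + quantization (PCM): measure the waveform at regular time
# intervals, then round each sample to the nearest of 2**bits nominal levels.
fs = 8000                                  # sampling rate (Hz)
bits = 8                                   # bits per sample, telephone-style
t = np.arange(0, 0.01, 1 / fs)             # the regular sample instants
analog = np.sin(2 * np.pi * 440 * t)       # stand-in for the continuous waveform

levels = 2 ** bits
pcm = np.round((analog + 1) / 2 * (levels - 1)).astype(np.uint8)   # codes 0..255
print(pcm[:8])                             # first few samples as 8-bit PCM codes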
How Fast? How Precise?
 
How fast should we sample the audio?
 
How precise should we save each sample?
 
Simple answer:
•  The faster the better
•  The more precise the better
 
  Longer answer: meet the minimum requirements for
•  Signal reconstruction
•  Sound quality
•  Storage
•  Bitrate
How Precise? Quantization Error
 
 
 
  The difference between the actual signal amplitude and the corresponding nominal (quantized) amplitude is called the quantization error.
  This error varies randomly; it is also called quantization noise.
  There is less quantization noise when samples are encoded with more bits. For example:
•  CD audio is quantized at 16 bits per sample
•  Telephone audio is quantized at 8 bits per sample
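A small numeric sketch (Python/NumPy, not from the slides) showing how the quantization error shrinks when more bits are used per sample:

import numpy as np

# Sketch: the quantization error shrinks as more bits are used per sample.
t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 5 * t)                     # amplitude in [-1, 1]

def quantize(x, bits):
    """Round x to the nearest of 2**bits nominal levels spanning [-1, 1]."""
    levels = 2 ** bits
    codes = np.round((x + 1) / 2 * (levels - 1))
    return codes / (levels - 1) * 2 - 1

for bits in (8, 16):                                    # telephone vs. CD precision
    error = signal - quantize(signal, bits)             # the quantization noise
    print(bits, "bits: max error", float(np.max(np.abs(error))))
# 16-bit quantization gives roughly 256x less error than 8-bit.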
Back of envelope calculation
  Example: 60 minutes of audio, sampled at 40 kHz, 32-bit samples
  Storage:
•  60 * 60 * 40,000 * 4 bytes = 576 MB
  Bitrate:
•  32 * 40k = 1.28 Mbps
  As a comparison, ADSL downlink is only about 1.5 Mbps
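The same arithmetic as a short Python sketch (assuming a single channel, as the slide's numbers imply):

# Sketch of the back-of-envelope numbers: 60 minutes of one-channel audio,
# sampled at 40 kHz with 32-bit (4-byte) samples.
minutes, fs, bytes_per_sample = 60, 40_000, 4

storage_bytes = minutes * 60 * fs * bytes_per_sample
bitrate_bps = fs * bytes_per_sample * 8

print(storage_bytes / 1e6, "MB")     # 576.0 MB
print(bitrate_bps / 1e6, "Mbps")     # 1.28 Mbps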
How often do we sample?
  The more the better
  A minimum sampling frequency is required for signal reconstruction
•  We need enough samples so that we can reconstruct the original analog signal
  What is the minimum sampling frequency?
Error in Reconstructed Sample
  Reconstruct the original continuous-time signal out of the discrete samples.
  Like connecting the dots…
  But how?!
•  Staircase? Linear? Wild guess?
  Is it possible to reconstruct the original signal?
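A small sketch (Python/NumPy, added for illustration) comparing two naive "connect the dots" reconstructions, a staircase hold and linear interpolation, against the original tone:

import numpy as np

# Sketch: two naive reconstructions of a sampled 5 Hz tone,
# staircase (hold the last sample) vs. linear interpolation (connect the dots).
fs = 50                                        # deliberately low sampling rate
ts = np.arange(0, 1, 1 / fs)                   # sample instants
samples = np.sin(2 * np.pi * 5 * ts)

t_fine = np.linspace(0, ts[-1], 1000)          # dense grid for the reconstruction
staircase = samples[np.searchsorted(ts, t_fine, side='right') - 1]
linear = np.interp(t_fine, ts, samples)

truth = np.sin(2 * np.pi * 5 * t_fine)
print("staircase max error:", round(float(np.max(np.abs(staircase - truth))), 3))
print("linear max error:   ", round(float(np.max(np.abs(linear - truth))), 3))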
From Sampling to Signal Reconstruction
[Diagram: sampling and reconstruction viewed in the time domain and the frequency domain - sampling, Fourier transform, aliasing, filtering, inverse Fourier transform, reconstruction]

Sampling Theory – Back to Sampling: How Fast?
  Nyquist showed that the minimum sampling rate is 2x the maximum frequency of the signal.
•  Lower sampling frequencies cause signal aliasing.
  The faster (more frequently) you sample, the better the result is.
•  Oversampling allows easier reconstruction of the original signal.
  Humans can hear audio signals from 20 Hz to 20 kHz.
  CD uses a 44.1 kHz sampling rate.
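A short sketch (Python/NumPy, added here; the 5 kHz tone and 8 kHz rate are example values) of what aliasing looks like when the Nyquist condition is violated:

import numpy as np

# Sketch of aliasing: a 5 kHz tone sampled at 8 kHz (below its Nyquist rate of 10 kHz)
# yields the same samples as a 3 kHz tone (8 kHz - 5 kHz), apart from a sign flip.
fs = 8000
n = np.arange(16)                              # a few sample indices
tone_5k = np.sin(2 * np.pi * 5000 * n / fs)    # under-sampled tone
tone_3k = np.sin(2 * np.pi * 3000 * n / fs)    # its alias below fs/2

print(np.allclose(tone_5k, -tone_3k))          # True: the two are indistinguishable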
Audio Compression – MP3
  MPEG-1 Layer 3
•  Commonly known as MP3 files
•  An audio codec
  Perceptual coding
•  Encode audio optimized to the way humans perceive sound, resulting in a large compression ratio
•  Roughly 1/10 the file size when compared to uncompressed PCM CD-audio
  Idea similar to JPEG
•  Lossy perceptual coding + lossless entropy coding
  The MP3 standard does not specify exactly how to encode an audio stream.
•  It only specifies how to decode.
•  General encoding steps are similar across encoders.

Steps of MP3 encoding
  Break the audio into short frames
•  roughly 0.5 seconds long
  Transform each frame into the frequency domain
  Compare the spectral energy of the frame to mathematical models of human psychoacoustics.
•  Determine the bandwidth allocated to each frequency sub-band.
•  More bits for highly audible frequencies; fewer bits for, or drop, the less audible frequencies.
•  Must not exceed the specified bitrate: 96 kbps, 128 kbps, 192 kbps, etc.
  Lossless Huffman encoding of the resulting bits
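A very rough conceptual sketch of this pipeline (Python/NumPy; the frame length, the "keep the strongest 10% of spectral lines" rule and the helper name toy_perceptual_encode are inventions for illustration, not the real MP3 filterbank, psychoacoustic model or Huffman stage):

import numpy as np

def toy_perceptual_encode(samples, frame_len=1024, keep_fraction=0.1):
    """Very rough sketch of the idea only: frame the audio, move each frame to the
    frequency domain, and keep just the strongest components in a fixed 'bit budget'.
    Real MP3 uses a filterbank/MDCT, a psychoacoustic model and Huffman coding."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        spectrum = np.fft.rfft(samples[start:start + frame_len])   # frequency domain
        keep = int(len(spectrum) * keep_fraction)                  # crude bit budget
        strongest = np.argsort(np.abs(spectrum))[-keep:]           # most audible bins
        sparse = np.zeros_like(spectrum)
        sparse[strongest] = spectrum[strongest]       # drop the less audible lines
        frames.append(sparse)                         # entropy coding would follow
    return frames

fs = 44100
t = np.arange(0, 1.0, 1 / fs)
audio = np.sin(2 * np.pi * 440 * t) + 0.2 * np.random.randn(len(t))
print(len(toy_perceptual_encode(audio)), "encoded frames")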
MIDI
 
Musical Instrument Digital Interface (MIDI) was developed in the early 80s
 
Specifies how musical information can be exchanged between instruments from different manufacturers
 
MIDI describes music, not waveforms:
•  E.g. Piano plays note F# at bar 3, beat 2, loud
 
Two components
•  Hardware: specifies the physical connection of musical instruments
•  Data format: how information is encoded
Playing a MIDI file
 
A MIDI file records musical events
•  E.g. Piano plays note F# at bar 3, beat 2, loud
 
A MIDI musical instrument interprets the file and generates the requested events
•  E.g. look up the waveform for the requested instrument at a certain pitch from wavetables or a set of sound fonts
 
The resulting sound waves are mixed and sent to the output speaker
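As an added illustration of "describing music, not waveforms", a hypothetical Python data structure for one musical event (the field names are illustrative only, not the actual MIDI message format):

from dataclasses import dataclass

# Hypothetical sketch of a MIDI-style musical event as data: MIDI stores
# descriptions of notes (instrument, pitch, timing, loudness), not waveforms.
# The field names are illustrative only, not the actual MIDI message format.
@dataclass
class NoteEvent:
    instrument: str    # e.g. "Piano"
    note: str          # e.g. "F#"
    bar: int           # position in the score
    beat: int
    velocity: int      # loudness, 0-127 as in MIDI

event = NoteEvent(instrument="Piano", note="F#", bar=3, beat=2, velocity=110)
print(event)           # a few bytes of description instead of thousands of samples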
 
Demo: MIDI vs CD recording