Sound

Lab #8 Follow-Up: Sounds and Signals*
* Figures from Kaplan, D. (2003) Introduction to Scientific Computation and Programming CLI Engineering.
Intro to Sounds & Signals
• Recall the transducer concept: convert an input signal into numbers.
• Signal: a quantity that changes over time
– Body temperature
– Air pressure (sound)
– Electrical potential on skin (electrocardiogram)
– Seismological disturbances
– Stock prices
Intro to Sounds & Signals
• We will study audio signals (sounds), but the same issues apply across a broad range of signal types.
• Two different approaches to doing the same thing:
– GUI program (e.g., Audacity, Pro Tools)
– Programmatic (Python)
13.1 Basics of Computer Sound
>>> x, fs, bits = wavread("fh.wav")
>>> len(x)
41777
>>> min(x), max(x)
('\x00', '\xff')
>>> fs
11025
>>> bits
8
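For comparison, here is a minimal sketch of how the same three quantities could be obtained with Python's standard-library wave module. The helper name read_wav and the synthetic in-memory recording are assumptions made for illustration; this is not the wavread function used above, and only 8-bit mono PCM is handled.

```python
import io
import math
import wave

def read_wav(source):
    """Return (x, fs, bits) for an 8-bit mono PCM WAV file or file-like object."""
    with wave.open(source, "rb") as w:
        fs = w.getframerate()            # samples per second (Hz)
        bits = 8 * w.getsampwidth()      # bytes per sample -> bits per sample
        channels = w.getnchannels()
        raw = w.readframes(w.getnframes())
    if bits != 8 or channels != 1:
        raise ValueError("this sketch only handles 8-bit mono files")
    x = list(raw)                        # 8-bit WAV samples are unsigned: 0..255
    return x, fs, bits

# Build a short synthetic 8-bit recording in memory (a stand-in for a real file).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(1)                    # 1 byte = 8 bits per sample
    w.setframerate(11025)
    w.writeframes(bytes(
        int(127.5 + 127.5 * math.sin(2 * math.pi * 440 * n / 11025))
        for n in range(1000)
    ))
buf.seek(0)

x, fs, bits = read_wav(buf)
print(len(x), min(x), max(x), fs, bits)
```

Here the samples come back as integers between 0 and 255 rather than as one-character strings, which is the same information as the '\x00' to '\xff' range shown above.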
Basics of Computer Sound
• x contains the sound waveform (signal): essentially, voltage levels representing transduced air pressure on the microphone.
• fs is the sampling frequency (rate): how many times per second (hertz, Hz) did we measure the voltage?
• bits is the number of bits used to represent each sample.
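As a quick worked example, the three quantities are enough to compute how long the recording lasts and how much storage the samples need. The function names below are hypothetical; the numbers are the ones from the fh.wav session.

```python
def duration_seconds(n_samples, fs):
    """Length of the recording: number of samples divided by samples per second."""
    return n_samples / fs

def storage_bytes(n_samples, bits):
    """Bytes needed for the raw samples (one channel, no file header)."""
    return n_samples * bits // 8

n_samples, fs, bits = 41777, 11025, 8
print(round(duration_seconds(n_samples, fs), 2))   # about 3.79 s
print(storage_bytes(n_samples, bits))              # one byte per 8-bit sample
```

The same arithmetic shows that handing these samples to a player running at 2 * fs would use them up in half the time.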
Questions
• Why does the sound waveform range from hexadecimal 00 to FF, whereas we plot it as −1 to +1?
– These values are essentially arbitrary. One nice
feature of a ±x representation is that zero
means silence. But the audio player likes values
between 0 and 255.
• What role does the sampling frequency play in the
quality of the sound?
– The more samples per second, the closer the
sound is to a “perfect” recording.
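The two representations are related by a simple rescaling. The sketch below assumes the usual 8-bit convention in which byte value 128 is the midpoint (silence); the function names are made up for illustration.

```python
def bytes_to_unit_range(samples):
    """Map unsigned 8-bit samples (0..255) to floats in [-1, +1)."""
    return [(s - 128) / 128.0 for s in samples]

def unit_range_to_bytes(values):
    """Inverse mapping, clamped so the result always fits in one byte."""
    out = []
    for v in values:
        code = int(round(v * 128)) + 128
        out.append(max(0, min(255, code)))
    return out

print(bytes_to_unit_range([0, 128, 255]))
```

With this convention, zero in the ±1 representation corresponds to the byte value 128 in the file.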
Questions
• What happens if we double (or halve) the sampling
frequency at playback, and why?
• What is it about the waveform that
determines the sound we're hearing (which
vowel), and the speaker's voice?
– Most of this information is encoded in the frequencies that make up the waveform (roughly, the spacing between successive peaks), and not in the actual waveform values themselves.
– We can do some useful processing on the “raw” waveform, however, e.g., count syllables:
Syllable Counting by Smoothing
and Peak-Picking
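One plausible reading of "smoothing and peak-picking" is sketched below: rectify the waveform, smooth it with a moving average to get a loudness envelope, then pick one peak per stretch of the envelope that rises above a threshold. The window length, the threshold, the function names, and the synthetic three-burst test signal are all assumptions for illustration, not necessarily the method used in the lab.

```python
import math

def smooth(values, width):
    """Moving average over about `width` samples, computed with prefix sums."""
    half = width // 2
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out

def pick_peaks(envelope, threshold, min_gap):
    """Index of the largest value in each stretch above `threshold`.
    Stretches separated by fewer than `min_gap` samples are merged."""
    runs, start = [], None
    for i, v in enumerate(envelope):
        if v > threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append([start, i])
            start = None
    if start is not None:
        runs.append([start, len(envelope)])
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1][1] = run[1]
        else:
            merged.append(run)
    return [max(range(lo, hi), key=envelope.__getitem__) for lo, hi in merged]

def count_syllables(x, fs, window_s=0.05, rel_threshold=0.3):
    """Count envelope peaks; x is assumed to be centered on zero."""
    width = int(round(window_s * fs))
    envelope = smooth([abs(v) for v in x], width)
    threshold = rel_threshold * max(envelope)
    return len(pick_peaks(envelope, threshold, min_gap=width))

# Synthetic test signal: three bursts of a 100 Hz tone separated by silence.
fs = 2000

def bump(t, center, half_width=0.2):
    d = abs(t - center)
    return 0.5 * (1 + math.cos(math.pi * d / half_width)) if d < half_width else 0.0

x = [sum(bump(n / fs, c) for c in (0.25, 0.75, 1.25))
     * math.sin(2 * math.pi * 100 * n / fs) for n in range(3000)]
print(count_syllables(x, fs))
```

Real speech is messier than this test signal, so the window and threshold would need tuning on actual recordings.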
Perception and Generation
of Sound
• Sound is the perception of small, rapid
vibrations in air pressure on the ear.
• Simplest model of sound is a function P(t)
expressing pressure P at time t:
P(t) = A sin(2πft + φ)
where A = amplitude (roughly, loudness)
f = frequency (cycles per second)
φ = phase (roughly, starting point)
• This is the equation for a pure musical tone
(just one pitch)
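The pure-tone equation can be turned directly into samples. This is a minimal sketch; the function name pure_tone and the choice of 440 Hz sampled at 11025 Hz are illustrative assumptions.

```python
import math

def pure_tone(A, f, phi, fs, duration):
    """Sample P(t) = A*sin(2*pi*f*t + phi) at fs samples per second."""
    n = int(round(fs * duration))
    return [A * math.sin(2 * math.pi * f * k / fs + phi) for k in range(n)]

tone = pure_tone(A=1.0, f=440.0, phi=0.0, fs=11025, duration=1.0)
print(len(tone), tone[0])
```

Changing A scales the values, changing f changes how many cycles fit in one second, and changing phi shifts where in the cycle the first sample falls.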
Perception and Generation of
Sound
– The inverse of frequency is the period, T = 1/f
(the time between successive peaks):
Perception and Generation of
Sound
–E.g., whistling a musical scale:
Transducing and Recording
Sound
• Convert sound pressure to voltage, then digitize
voltage into N discrete values in interval
[xmin, xmax], by sampling at frequency Fs.
• This is done by an analog-to-digital (A/D) converter.
• Another device must pre-amplify the sound to match the
input expectations of the A/D converter.
• N is typically a power of 2, so we can use bits to
express sampling precision (minimum 8 for decent
quality). This is called quantization.
• Various things can go wrong if we don't choose
these values wisely....
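The digitizing step can be sketched as follows. The ±5 V input range matches the A/D converter described in Figure 13.6; the function names and the choice of mapping each code back to the center of its interval are assumptions.

```python
import math

def quantize(v, bits, vmin=-5.0, vmax=5.0):
    """Map a voltage to one of N = 2**bits integer codes (0 .. N-1)."""
    n_levels = 2 ** bits
    step = (vmax - vmin) / n_levels
    code = math.floor((v - vmin) / step)
    return max(0, min(n_levels - 1, code))   # out-of-range voltages are clamped

def code_to_voltage(code, bits, vmin=-5.0, vmax=5.0):
    """Voltage at the center of the quantization interval for `code`."""
    step = (vmax - vmin) / 2 ** bits
    return vmin + (code + 0.5) * step

code = quantize(1.234, bits=8)
print(code, code_to_voltage(code, bits=8))
```

With 8 bits over a 10 V range each step is 10/256 V, so the reconstructed voltage differs from the original by at most half a step, about 0.02 V.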
Transducing and Recording Sound
[Figure: analog and digitized versions of the signal, plotted as Voltage and A/D units against Time (ms). Top panel: appropriate preamplification. Bottom panel: preamplification too low.]
A segment of the sound “OH” transduced to voltage.
Top: The preamplifier has been set appropriately so that the analog voltage signal takes up a large fraction of the A/D voltage range. The digitized signal closely resembles the analog signal even though the A/D conversion is set to 8 bits.
Bottom: The preamplifier has been set too low. Consequently, there are effectively only about 3 bits of resolution in the digitized signal; most of the range is unused.
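The "about 3 bits" estimate can be checked by counting how many distinct A/D codes a signal actually uses. The sketch below assumes an 8-bit converter spanning −5 to 5 V and uses sine waves with 4 V and 0.1 V peaks as stand-ins for the two panels; these amplitudes are read off the figure axes and are only approximate.

```python
import math

def distinct_codes(amplitude, bits=8, vmin=-5.0, vmax=5.0, n=1000):
    """Count the A/D codes used by one cycle of a sine wave of the given peak voltage."""
    n_levels = 2 ** bits
    step = (vmax - vmin) / n_levels
    codes = set()
    for k in range(n):
        v = amplitude * math.sin(2 * math.pi * k / n)
        codes.add(max(0, min(n_levels - 1, math.floor((v - vmin) / step))))
    return len(codes)

for amplitude in (4.0, 0.1):
    used = distinct_codes(amplitude)
    print(amplitude, used, round(math.log2(used), 1))   # peak volts, codes used, effective bits
```

The well-amplified signal uses most of the 256 available codes, while the weak one uses only a handful, which corresponds to roughly 2 to 3 bits.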
Transducing and Recording Sound
Figure 13.6.
Clipping of a signal (right) when the preamplifier
has been set too high, so that the signal is
outside of the −5 to 5 V range of the A/D
converter.
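Clipping is easy to reproduce numerically. In this sketch a sine wave with an 8 V peak (a made-up value standing in for an over-amplified signal) is limited to the −5 to 5 V range of the converter.

```python
import math

def clip(v, vmin=-5.0, vmax=5.0):
    """Limit a voltage to the input range of the A/D converter."""
    return max(vmin, min(vmax, v))

fs, f = 8000, 100
signal = [8.0 * math.sin(2 * math.pi * f * k / fs) for k in range(80)]   # one cycle
clipped = [clip(v) for v in signal]
fraction_clipped = sum(1 for a, b in zip(signal, clipped) if a != b) / len(signal)
print(max(clipped), min(clipped), fraction_clipped)
```

Every sample beyond the range is replaced by the limit value, so the tops and bottoms of the wave are flattened and that part of the signal is lost.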
Aliasing and the Sampling Frequency
• Someone has an alias when they use more than one name (representation).
• In the world of signals, this means that several different analog signals share one sampled representation, because of an inadequate sampling frequency.
• Familiar visual aliasing from the movies (when 24 frames per second is too slow):
– Wagon wheel / propeller appearing to go backwards
– Scan lines appearing on a computer screen
• An inadequate Fs can result in aliasing for sounds too....
Aliasing and the Sampling Frequency
[Figure: Amplitude (−1 to 1) against Time (in units of ∆T), showing the samples and the sine waves for m = 0, m = 1, and m = 2.]
Aliasing. A set of samples marked as circles. The three sine waves plotted are of different frequencies, but all pass through the same samples. The aliased frequencies are F + m/∆T, where m is any integer and ∆T is the sampling interval. The sine waves shown are m = 0, m = 1, and m = 2.
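The claim in the caption can be verified numerically: sampling sine waves of frequency F + m/∆T at times k∆T gives the same numbers for every integer m. The values of F and ∆T below are arbitrary choices for illustration.

```python
import math

def sample_sine(freq, dt, n_samples):
    """Samples of sin(2*pi*freq*t) taken at t = 0, dt, 2*dt, ..."""
    return [math.sin(2 * math.pi * freq * k * dt) for k in range(n_samples)]

dt = 1.0     # sampling interval, in arbitrary time units
F = 0.2      # base frequency, in cycles per time unit (hypothetical value)

base = sample_sine(F, dt, 8)
for m in (1, 2):
    alias = sample_sine(F + m / dt, dt, 8)
    same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(base, alias))
    print(m, same)
```

The reason is that the extra term contributes 2πmk to the argument of the sine at the k-th sample, a whole number of cycles, so it changes nothing at the sample times.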
Aliasing and the Sampling Frequency
• Nyquist's Theorem tells us that Fs should be at
least twice the maximum frequency Fmax we wish
to reproduce.
• Intuitively, we need two values to represent a
single cycle: one for peak, one for valley:
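Combining the Nyquist condition with the alias relation F + m/∆T gives a simple way to predict what frequency a tone will appear to have after sampling. This is a sketch with hypothetical function names; frequencies are in Hz.

```python
def nyquist_rate(f_max):
    """Minimum sampling frequency for a signal whose highest frequency is f_max."""
    return 2.0 * f_max

def apparent_frequency(f, fs):
    """Frequency between 0 and fs/2 that a tone at f appears to have when sampled at fs."""
    return abs(f - fs * round(f / fs))

fs = 11025
for f in (3000, 7000):
    print(f, apparent_frequency(f, fs))
```

A 3000 Hz tone is below fs/2 and is reproduced at its own frequency, whereas a 7000 Hz tone is above fs/2 and shows up at 11025 − 7000 = 4025 Hz.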
Aliasing in the Time Domain