Eng. 100: Music Signal Processing DSP Lecture 6

Eng. 100: Music Signal Processing
DSP Lecture 6
Lab 3: Spectra/Spectrogram (Continued)
• Curiosity:
• http://www.soundhound.com
• http://dx.doi.org/10.1109/MSPEC.2015.7049439
• Read Lab 3!
• HW 3 on Canvas, due Thu. Oct. 15
Mon
Oct 05
(no lab)
Oct 12
Lab 3a
• DSP schedule:
Oct 19
No lab
Break
Oct 26
Lab 3b
Nov 02
Lab 3 due / P2
Tue
Oct 06
Wed
Oct 07
Thu
Oct 08
HW2 due
Oct 13 Oct 14 Oct 15
Ethics &
HW3 due
Review
Review
Oct 20 Oct 21 Oct 22
No class
Break
Exam 1
Oct 27 Oct 28 Oct 29
...
...
1
Fri
Oct 09
Lab 3a
Oct 16
Lab 3b
Oct 23
No lab
Oct 30
Lab 3 due / P2
Outline
• The spectrum of a signal
(first class)
◦ Part 1. Why we need spectra
◦ Part 2. Periodic signals
◦ Part 3. Band-limited signals
• Methods for computing spectra
(second class)
◦ Part 4. Sampling rate S > 2B
◦ Part 5. By hand by solving systems of equations
◦ Part 6. Using general Fourier series solution
◦ Part 7. Using fast Fourier transform (FFT), e.g., in Matlab
• Using a signal’s spectrum
(third class)
◦ to determine note frequencies
◦ to remove unwanted noise
◦ to visualize frequency content (spectrogram)
◦ Lab 3
2
Summary of previous lecture(s)
T -periodic, band-limited signals have finite Fourier series:
• Sinusoidal form:
K
2
1
x(t) = |{z}
c0 + c1 cos 2π t − θ1 + c2 cos 2π t − θ2 + · · · + cK cos 2π t − θK
T
T
T
|
|
{z
}
{z
}
DC
fundamental
harmonics
• Trigonometric form:
1
2
K
x(t) = a0 + a1 cos 2π t + a2 cos 2π t + · · · + aK cos 2π t
T
T
T
1
2
K
+ b1 sin 2π t + b2 sin 2π t + · · · + bK sin 2π t
T
T
T
• Coefficients in these two forms are related by
q
ak = ck cos θk
ck = a2k + b2k
bk = ck sin θk ,
tan θk = bk /ak
√
because
a cos(φ ) +b sin(φ ) = a2 + b2 cos(φ − arctan(b/a))
and
c cos(φ − θ ) = (c cos θ ) cos φ + (c sin θ ) sin φ
• We can use the fast Fourier transform (FFT) method (Matlab’s fft function)
to compute these coefficients very quickly from signal samples if S > 2B.
a0 = c0
b0 = 0,
3
FFT summary
For an array x of length N with signal samples x = x S0 x S1 · · · x
taken from a signal x(t) that is periodic and band-limited with S > 2B:
N−1
S
disp((2/N)*real(fft(x))) gives
[2a0 a1 a2 . . . aN/2−2 aN/2−1 aN/2 aN/2−1 aN/2−2 . . . a2 a1]
{z
}
{z
}
|
|
disp((-2/N)*imag(fft(x))) gives
[0 b1 b2 . . . bN/2−2 bN/2−1 0 −bN/2−1 − bN/2−2 . . . − b2 − b1]
|
{z
} |
{z
}
disp((2/N)*abs(fft(x))) gives
(we usually use this one!)
[2c0 c1 c2 . . . cN/2−2 cN/2−1 cN/2 cN/2−1 cN/2−2 . . . c2 c1]
{z
}
|
{z
}
|
disp(angle(fft(x))) gives
[0(or π) −θ1 . . . −θN/2−1 0(or π) θN/2−1 . . . θ2 θ1]
|
{z
}
|
{z
}
Frequency associated with kth term (Matlab index k + 1) is f =
4
k
S.
N
FFT: Why the mirrored coefficients?
√
(ı = −1)
Complex-exponential form of Fourier series of any T -periodic signal:
∞
x(t) =
k
Xk e−ı2π T t .
∑
k=−∞
For a T -periodic signal band-limited to B = K/T :
K
x(t) =
−ı2π Tk t
Xk e
∑
.
k=−K
0
S
1
S
For an array x of length N with signal samples x = x
x
··· x
taken from a signal x(t) that is periodic and band-limited with S > 2B:
N−1
S
disp((1/N)*fft(x)) gives
[X0 X1 X2 . . . XN/2−2 XN/2−1 X−N/2 X−N/2+1 X−N/2+2 . . . X−2 X−1]
|
{z
}
{z
}
|
positive k
negative k
because
1 ıt 1 −ıt
cos(t) = e + e .
2
2
We will not use the complex-exponential form in ENGN 100.
5
(See EECS 216.)
FFT: Why f = Nk S ?
Recall sinusoidal form of Fourier series of T -periodic signal with band-limit B:
K
k
x(t) = c0 + ∑ ck cos 2π t − θk
T
k=1
Frequency of kth term in sum is f =
k
.
T
But for unknown pitches we do not know the fundamental period T !
So the formula f = Tk is not immediately useful with the FFT.
6
FFT: Why f = Nk S ?
Imagine repeating a segment of audio over and over:
t
This thought experiment yields a periodic signal with period T = N∆ =
N/S where N is the number of samples.
Substituting T = N/S into the formula f = Tk and simplifying yields
k
f = S.
N
• This formula works perfectly when the unknown frequency f is an
integer multiple of S/N.
• If f is not exactly an integer multiple of S/N, then we will see a
“bump” in the FFT spectrum instead of a single line.
7
Line spectra
• Spectrum of a T -periodic, band-limited signal
1
2
K
x(t) = c0 + c1 cos 2π t − θ1 +c2 cos 2π t − θ2 + · · · + cK cos 2π t − θK
T
T
T
• More generally, line spectra of band-limited signals look something like this:
Formula: x(t) = c0 + c1 cos(2π f1t) + · · · + c4 cos(2π f4t)
(We assumed θk = 0 in this formula because phase is unspecified in spectrum.)
• In physics, often only the frequencies of atomic spectra are shown, not the amplitudes:
8
[wiki]
Exercise
Draw the spectrum of this signal:
x(t) = 20 sin2(2π50t) +30 cos(2π200t) .
Hint: sin2(φ ) = 12 − 12 cos(2φ )
“power reduction formula”
9
DSP, FFT and “Stairway to heaven”
0.2
x(n/S)
0.1
0
play
−0.1
−0.2
0
0.5
1
1.5
2
n
2.5
3
3.5
4
4
x 10
x(n/S)
0.05
0
play
−0.05
4000
4200
4400
4600
4800
5000
n
5200
5400
5600
5800
6000
spectrum
0.02
0.015
0.01
0.005
0
111
221
331
441
551
661
771
881
991
k+1
110
Fundamental frequency of first note: f = Nk S = 2000
8000 = 440Hz.
What is the lowest (nonzero) frequency one could find here? ??
10
Listening and seeing
1
0.5
0
0
400
4000
play
1
0.5
0
400
play
713
1
0.5
0
play
400
800
1200
1600
2000
2400
1
play
0.5
0
400
800
1200
1600
2000
f [Hz]
2400
Why is the lowest note on a tuba (or large pipe organ) C0 = 16.35 Hz? ??
[wiki]
11
Application:
Removing noise from signals
(Lab 3)
12
Removing noise from signals
• Given: signal = desired signal + noise signal
y(t) = x(t) + ε(t)
•
•
•
•
Goal: reduce (“filter out”) as much noise as possible
Assume/know: signal is band limited to B Hz
Idea: eliminate frequencies above B Hz (must be noise)
Analog approach: use electrical circuit (e.g., treble control)
(See EECS 215/216)
• Digital approach:
◦ sample signal
◦ compute spectrum using FFT
◦ set to zero portions of the spectrum that are just noise
◦ use inverse FFT (ifft) to synthesize improved signal
◦ (Many more ways described in DSP course EECS 351.)
13
Removing noise digitally: Details
• sample using S > 2B
• Recall f = Nk S and disp((2/N)*abs(fft(x))) gives:
[2c0 |c1 . .{z
. cK−1} cK . . . cN/2−1 cN/2 cN/2−1 . . . cK |cK−1{z. . . c}1]
| {z } |{z} | {z }
low
low
high
highest
high
|
{z
}
mirror
• Find index K such that (K − 1)S/N = B. Then we want:
[2c0 |c1 . .{z
. cK−1} 0
. . 0} |{z}
0 0| .{z
. . 0} c|K−1{z. . . c}1]
| .{z
high highest high
low
|
{z low }
mirror
• remove frequency components above B using fft,
synthesize “improved” signal using ifft:
(review a(k) = 0;)
Yf = fft(y);
Zf = Yf; Zf(K+1:N+1-K) = 0;
z = real(ifft(Zf));
• This is called low pass filtering (via FFT).
14
Example: Low-pass filtering a noisy EKG signal
• EKG=Electrocardiogram (heart) electric signal.
• Periodic (we hope!) with period about 1 second
(60 beats per minute = 1 beat per second).
• Components at 1, 2, 3, . . . Hz up to about 7 Hz.
• So: We eliminate all frequencies above 7 Hz.
• Result: high-frequency noise eliminated,
but low frequency noise remains.
• See next slide for signals and their spectra.
15
Example: Low-pass filtering a noisy EKG signal
t [sec]
f [Hz]
16
Example: Removing noise from mystery signal
20
y(t)
10
0
−10
−20
•
•
•
•
0
0.1
0.2
0.3
0.4
0.5
t [sec]
0.6
0.7
Known: N = 8000 samples at S = 8000 Hz
What is the signal x(t) buried in this noise?
How do we recover it from y(t)?
(cf. restoring old analog recordings, movies, etc.)
17
0.8
0.9
1
Spectrum of mystery signal
1.2
spectrum of y
1
0.8
0.6
0.4
0.2
0
1
8000
frequency index k + 1
•
•
•
•
stem(2/N*abs(fft(y)))
This spectrum also looks like random noise!?
Look at vertical scale: large peaks near 1 and 8000.
Zoom in to see exactly where the peaks are (next slide)
18
Spectrum of mystery signal: Zoomed
1.2
1.2
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
0
1
6
19
7996
k+1
•
•
•
•
8000
k+1
Large peaks at Matlab array index 6 and 7996 = 8000+1−(6−1)
(If we counted from 0, then peaks at k = 5 and N − k = 8000 − 5 = 7995.)
Sinusoid frequency (6 − 1)S/N = 5(8000)/8000 = 5 Hz
Now filter out all other frequencies (all but 6 and 7996).
19
Filtering noisy mystery signal
Generating data: N = 8000; S = 8000; n = 0:N-1; t = n/S;
x = cos(2*pi*5*t); y = x + 4 * randn(size(x));
Processing data: Yf = fft(y); Zf = Yf;
Zf(1:5) = 0; Zf(7:7995) = 0; Zf(7997:8000) = 0;
z = real(ifft(Zf));
1
z(t)
0.5
0
−0.5
−1
0
0.1
0.2
0.3
0.4
0.5
t [sec]
0.6
Did our noise removal method work perfectly? ??
20
0.7
0.8
0.9
1
Application:
Pitch tracking (transcription) for songs
Spectrogram
(Lab 3)
21
Music transcription: Songs?
Spectrum of “The Victors”
load victors_tone.mat; S = 8192; N = length(x);
stem(2/N*abs(fft(x)))
victors spectrum
0.4
0.3
0.2
0.1
0
matlab frequency index: k+1
1
Zoom in:
victors spectrum
0.4
0.3
0.2
0.1
0
matlab frequency index: k+1
What notes? What order? What song?
Need a time varying spectrum to transcribe songs.
22
N=78000
Time-varying spectral analysis
•
•
•
•
•
Idea: Subdivide long signal into shorter time segments..
Hopefully notes do not change during most time segments.
Apply fft to each signal segment.
Examine spectra for each segment to identify note(s) played.
If we make many short segments, then we get many spectra.
How to best display many spectra?
23
Displaying many 1D arrays: stem plots
→ f [Hz]
But a picture is worth 1000 words (or plots)...
24
Displaying many 1D arrays: A grayscale image
Lincoln illusion by Harmon & Jules 1973
250
1
200
150
100
50
19
0
1
15
We will use grayscale images for sequences of spectra.
25
Spectrogram
A grayscale display of spectra of many sequential short signal segments
is called a spectrogram.
-gram: from Greek root meaning something that is drawn.
• Horizontal axis is time (segment number)
• Vertical axis is frequency
• Grayscale intensity (or color) for the amplitude coefficients {ck }
Useful for signals whose spectra vary with time, e.g., songs!
26
Spectrogram in Matlab
load train; sound(y);
length(y)
N = 560;
L = 23;
z = reshape(y, N, L);
imagesc(2/N*abs(fft(z))); colorbar
= 12880 = 560 · 23
choice of # of samples per segment
choice of # of segments
N × L array
0.45
50
0.4
100
0.35
frequency index k+1
150
0.3
200
250
0.25
300
0.2
350
0.15
400
0.1
450
500
0.05
550
2
4
6
8
10
12
14
time segment
16
18
Matlab’s fft computes FFT of each column of a 2D array
What is size of spectrogram array here? ??
27
20
22
Spectrogram of train whistle
Make it look better:
colormap(gray); axis xy; axis([0 L+1 0 100])
100
0.45
90
0.4
80
0.35
70
frequency index k+1
0.3
60
0.25
50
0.2
40
0.15
30
0.1
20
0.05
10
0
1
23
time segment
What is pitch of lowest note of chord? ??
28
play
Spectrogram of a song
victors_tone.mat contains the signal of a 26 note song (Lab 2).
Each note has 3000 samples for sampling rate S = 8192 Hz.
load victors_tone.mat; N = 3000; L = 26;
z = reshape(x, N, L);
imagesc(2/N*abs(fft(z))); colorbar
colormap(gray); axis xy; axis([0 L+1 0 300])
29
Spectrogram of “The Victors”
300
0.9
250
0.8
play
0.7
frequency index k+1
200
0.6
0.5
150
0.4
100
0.3
0.2
50
0.1
0
0
5
10
15
20
25
note
http://haostaff.com/store/images/pianoroll.gif
Each column of this spectrogram is one spectrum shown in gray scale.
What kind of signal is each note? ??
This representation (time horizontal, frequency vertical) is a lot like music!
242
8192Hz = 660.8Hz (E)
Frequency of highest tone? f = Nk S = 3000
30
Real music spectrograms
Bizarre “hidden” spectrogram from song by “Aphex Twin”
[wiki]
Violin playing:
[wiki]
play
31
Example: Spectrogram of chirp
2
Chirp signal: x(t) = cos 2παt
◦ birds, dolphins, some forms of RADAR, slide trombone?
◦ Instantaneous frequency increases linearly with time
◦ Instantaneous frequency = 2αt (not αt) Hz.
t = [0:8191]/250; x = cos(2*pi*t.^2); plot(t, x)
1
x(t)
0.5
0
−0.5
−1
0
0.5
1
1.5
2
t
2.5
3
Instantaneous frequency increases from 0 to 2 ∗ 8191/250 = 65.5 Hz
32
3.5
4
Example: Spectrogram of chirp
imagesc(abs(fft(reshape(x,256,32))))
colormap(gray), axis xy, axis([0 33 0 256])
250
frequency index
200
150
100
50
0
1
32
time segment
Final segment has frequency index 68.
k
68 − 1
Corresponding frequency f = S =
250 = 65.4 Hz.
N
256
This is the average of frequencies in the final time segment.
Sound of a higher-frequency chirp: play
33
Summary of spectra/spectrogram
• With spectrum, we can determine frequency components of one or
more notes played simultaneously
• Need S > 2B
• Can remove or reduce some forms of noise / interference
◦ use fft to analyze spectrum
◦ can modify spectrum
◦ Example: set some values to zero
◦ Example: shift some values to other frequencies
(e.g., auto-tune)
◦ use ifft to synthesize modified signal
(e.g., signal with reduced noise or interference)
• Spectrogram lets us analyze sequences of notes (songs)
34
Aliasing: audio example
S = 8192; t = [0:1/S:0.3];
x = 0.9*[cos(2*pi*2800*t), cos(2*pi*3800*t)];
play
y = 0.9*[cos(2*pi*3800*t), cos(2*pi*4800*t)];
play
1
samples
4800 Hz
3392 Hz
y(t)
0.5
0
−0.5
−1
0
1
2
3
t
4
arccos method says 3392 Hz, not 4800 Hz for this example
Is S > 2B here? ??
35
5
6
−4
x 10
Audio spectrum equalizers: Analog and digital
[wiki]
spotify equalizer
36

Download Report

Eng. 100: Music Signal Processing DSP Lecture 6

Paperzz.com

Your Paperzz