Beat Tracking

Qiaozhan Gao
Beat Tracking by Dynamic Programming
Daniel P.W. Ellis
MIREX-06
■ Beat: a sequence of beat instants
corresponding to “foot tapping” or
“hand clapping”
■ Beats Per Minute (BPM):
measures the tempo of the music
■ Beat tracking: the determination of
a repeating time interval between
perceived pulses in music
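As a small illustration of how beat instants relate to BPM, the sketch below converts a list of beat times (in seconds) into a tempo. The function name `bpm_from_beats` and the choice of the median inter-beat interval are assumptions made for this example, not part of the original slides.

```python
from statistics import median


def bpm_from_beats(beat_times):
    """Estimate tempo in BPM from beat instants given in seconds."""
    # Inter-beat intervals: time between consecutive beats.
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    # 60 seconds divided by the typical interval gives beats per minute.
    return 60.0 / median(intervals)
```

For example, beats spaced 0.5 s apart correspond to 120 BPM.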
History
■ ‘Foot tapping’ system [1999]
■ Audio-driven systems in the MIREX-06 Audio Beat Tracking
evaluation [2006]
■ Current systems: find and resolve local peaks in volume
MIREX-06
■ Music Information Retrieval Evaluation eXchange
■ Run by The International Music Information Retrieval Systems
Evaluation Laboratory
■ Automatic beat tracking: track every beat location in a
collection of sound files
2006: Audio Beat Tracking
■ Separate into excerpts.
■ An impulse train will be created from each of the 40 annotated ground truth beat vectors.
■ The impulse trains will be 25 seconds long, constructed with a 100-Hz sampling rate, and
have unit impulses at beat times.
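■ Calculate the cross-correlation function of a_s[n] and y[n] within a small delay window.
■ Average across the number of annotators (S).
$$P = \frac{1}{S} \sum_{s=1}^{S} \frac{1}{N_P} \sum_{m=-W}^{+W} \sum_{n=1}^{N} y[n]\, a_s[n - m]$$
$$N_P = \max\Big(\sum_{n} y[n],\ \sum_{n} a_s[n]\Big)$$
where P is the performance of the beat times, a_s[n] is the impulse train of annotator s, y[n] is the impulse train from the algorithm, and W is the half-width of the delay window.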
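The evaluation measure above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `impulse_train` and `beat_score` are hypothetical, and the delay window half-width is passed in as a plain parameter (`window`, in samples) because the slides do not say how it is chosen.

```python
def impulse_train(beat_times, duration=25.0, rate=100):
    """Unit impulses at the beat times, sampled at `rate` Hz for `duration` seconds."""
    n = int(round(duration * rate))
    train = [0] * n
    for t in beat_times:
        idx = int(round(t * rate))
        if 0 <= idx < n:
            train[idx] = 1
    return train


def beat_score(y, annotations, window):
    """P = (1/S) * sum_s (1/NP) * sum_{m=-W..W} sum_n y[n] * a_s[n - m]."""
    total = 0.0
    for a in annotations:
        # NP normalizes by the larger of the two beat counts.
        norm = max(sum(y), sum(a))
        if norm == 0:
            continue
        hits = 0
        for m in range(-window, window + 1):
            for n in range(len(y)):
                k = n - m
                if 0 <= k < len(a):
                    hits += y[n] * a[k]
        total += hits / norm
    # Average across the S annotators.
    return total / len(annotations)
```

A tracker whose beats all fall within the delay window of every annotator's beats scores 1.0; beats that never coincide within the window score 0.0.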
A Beat Tracking System
Estimate a global tempo → Construct a transition cost function → Dynamic programming → Find the best-scoring set of beat times
■ A simple optimization framework
■ Maximize the “onset strength” at every hypothesized beat time
■ Maximize the consistency of the inter-onset-interval with the pre-estimated
constant tempo
Dynamic Programming
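A single objective function:
$$C(\{t_i\}) = \sum_{i=1}^{N} O(t_i) + \alpha \sum_{i=2}^{N} F(t_i - t_{i-1}, \tau_p)$$
where {t_i} is the sequence of N beat instants, O(t) is the onset strength envelope, t_i − t_{i−1} is an inter-beat interval, τ_p is the ideal beat spacing, and F measures the consistency between ∆t and τ_p.
Squared-error function:
$$F(\Delta t, \tau) = -\left(\log \frac{\Delta t}{\tau}\right)^2$$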
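To make the objective concrete, the following sketch scores a given beat sequence. It assumes beat times are integer frame indices into a list `onset` holding the onset strength envelope. The names `transition_score` and `sequence_score` are hypothetical.

```python
import math


def transition_score(delta_t, tau_p):
    """F(dt, tau) = -(log(dt / tau))^2; zero when dt equals the ideal spacing."""
    return -(math.log(delta_t / tau_p) ** 2)


def sequence_score(beats, onset, tau_p, alpha):
    """C({t_i}) = sum_i O(t_i) + alpha * sum_{i>=2} F(t_i - t_{i-1}, tau_p)."""
    local = sum(onset[t] for t in beats)
    spacing = sum(transition_score(b - a, tau_p)
                  for a, b in zip(beats, beats[1:]))
    return local + alpha * spacing
```

Because F uses the logarithm of the ratio, an interval of twice the ideal spacing is penalized exactly as much as an interval of half the ideal spacing.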
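The recursive relation
(based on the observation that the best score for time t is the local onset strength plus the best score to the preceding beat time τ, including the transition cost from τ to t):
$$C^*(t) = O(t) + \max_{\tau} \{\alpha F(t - \tau, \tau_p) + C^*(\tau)\}$$
where τ is the preceding beat time.
The actual preceding beat time that gave the best score:
$$P^*(t) = \arg\max_{\tau} \{\alpha F(t - \tau, \tau_p) + C^*(\tau)\}$$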
How to find the set of beat times
■ 1. Calculate C∗ and P∗ for every time starting
from 0
■ 2. Look for the largest value of C∗
■ 3. Obtain the final beat instant t_N
■ 4. ‘Backtrace’ via P∗, finding the preceding beat
time t_{N−1} = P∗(t_N)
■ 5. Repeat backwards until reaching the
beginning of the signal
■ 6. Get the entire beat sequence {t_i}∗
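The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's implementation. Times are integer frame indices. The search for the preceding beat is limited to between τ_p/2 and 2τ_p frames back, an assumption made for efficiency. A beat with no worthwhile predecessor starts a new sequence with score O(t), an assumed initial condition. The name `track_beats` is hypothetical.

```python
import math


def track_beats(onset, tau_p, alpha):
    """Return the beat frame indices that maximize the objective C."""
    n = len(onset)
    score = [0.0] * n   # C*(t): best score of a beat sequence ending at t
    prev = [-1] * n     # P*(t): best preceding beat, -1 if there is none
    min_gap = max(1, round(tau_p / 2))
    max_gap = round(2 * tau_p)

    # Step 1: fill in C* and P* for every time, starting from 0.
    for t in range(n):
        best, best_tau = 0.0, -1  # assumed option: start a new sequence at t
        for tau in range(max(0, t - max_gap), t - min_gap + 1):
            cand = alpha * -(math.log((t - tau) / tau_p) ** 2) + score[tau]
            if cand > best:
                best, best_tau = cand, tau
        score[t] = onset[t] + best
        prev[t] = best_tau

    # Steps 2-3: the largest C* marks the final beat instant t_N.
    t = max(range(n), key=lambda i: score[i])

    # Steps 4-6: backtrace via P* until no preceding beat remains.
    beats = []
    while t >= 0:
        beats.append(t)
        t = prev[t]
    return beats[::-1]
```

On an idealized envelope with unit impulses every 10 frames and `tau_p=10`, the backtrace recovers exactly those impulse positions.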
1st part of system:
Onset Strength Envelope
■ Calculate from a perceptual model
Resample the input sound to 8 kHz → STFT → Convert to an approximate auditory representation → Mel spectrogram
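A much-simplified sketch of an onset strength envelope is shown below. It is not the perceptual model from the slides: it skips the resampling to 8 kHz and the Mel band grouping, and it uses a naive DFT so that it needs only the standard library. It only illustrates the general idea of summing positive changes in log-magnitude across frequency. The function name `onset_strength` and the parameters `frame_len` and `hop` are assumptions.

```python
import cmath
import math


def onset_strength(signal, frame_len=64, hop=32):
    """Sum over frequency bins of the positive change in log-magnitude per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]  # Hann window
    n_bins = frame_len // 2 + 1
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Naive DFT of one bin (slow; fine for a short illustration).
            acc = sum(frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                      for i in range(frame_len))
            mags.append(math.log10(abs(acc) + 1e-6))
        spectra.append(mags)
    # Half-wave rectified first difference along time, summed across bins.
    envelope = [0.0]
    for before, after in zip(spectra, spectra[1:]):
        envelope.append(sum(max(0.0, c - p) for p, c in zip(before, after)))
    return envelope
```

For a signal that is silent and then switches to a steady tone, the envelope peaks at the first frame that contains the tone.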
An example of the STFT spectrogram, Mel spectrogram, and
onset strength envelope for a brief example of singing plus guitar
2nd part of system:
Global Tempo Estimate
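Tempo period strength:
$$TPS(\tau) = W(\tau) \sum_{t} O(t)\, O(t - \tau)$$
Gaussian weighting function on a log-time axis:
$$W(\tau) = \exp\left\{-\frac{1}{2}\left(\frac{\log_2 (\tau / \tau_0)}{\sigma_\tau}\right)^2\right\}$$
where τ_0 is the center of the tempo period bias and σ_τ controls the width of the weighting curve.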
TPS2(τ) = TPS(τ) + 0.5TPS(2τ) + 0.25TPS(2τ − 1) + 0.25TPS(2τ + 1)
TPS3(τ) = TPS(τ) + 0.33TPS(3τ) + 0.33TPS(3τ − 1) + 0.33TPS(3τ + 1)
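The tempo period strength and its TPS2 variant can be sketched as below. Lags are measured in frames of the onset envelope. The function names and the `max_lag` search limit are assumptions for illustration, and the parameter values in the usage note are arbitrary rather than the ones used in the original system.

```python
import math


def tempo_weight(tau, tau0, sigma):
    """W(tau) = exp(-0.5 * (log2(tau / tau0) / sigma)^2)."""
    return math.exp(-0.5 * (math.log2(tau / tau0) / sigma) ** 2)


def tps(onset, tau, tau0, sigma):
    """TPS(tau) = W(tau) * sum_t O(t) * O(t - tau)."""
    if tau <= 0 or tau >= len(onset):
        return 0.0  # lag outside the envelope contributes nothing
    acf = sum(onset[t] * onset[t - tau] for t in range(tau, len(onset)))
    return tempo_weight(tau, tau0, sigma) * acf


def tps2(onset, tau, tau0, sigma):
    """TPS2(tau) = TPS(tau) + 0.5 TPS(2tau) + 0.25 TPS(2tau-1) + 0.25 TPS(2tau+1)."""
    def f(lag):
        return tps(onset, lag, tau0, sigma)
    return (f(tau) + 0.5 * f(2 * tau)
            + 0.25 * f(2 * tau - 1) + 0.25 * f(2 * tau + 1))


def estimate_period(onset, tau0, sigma, max_lag):
    """Pick the lag (in frames) with the largest TPS2 value."""
    return max(range(1, max_lag + 1),
               key=lambda tau: tps2(onset, tau, tau0, sigma))
```

With an envelope of unit impulses every 10 frames, `estimate_period(onset, 12, 1.0, 40)` selects a period of 10 frames, since the weighting favors lags near `tau0` and TPS2 adds support from the lag at twice the period.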
Tempo Calculation
Tempo Estimation Training Dataset
■ the MIREX-06 Tempo Extraction contest
■ data used in the 2004 Audio Description Contest for Tempo
■ 465 “song excerpt” examples
Tempo Estimation Evaluation Results
■ The original tempo extraction algorithm, which takes the global maximum of
TPS, scored 35.7%.
■ The modified tempo algorithm (taking the maximum of TPS2 or TPS3)
improves performance to 45.8%.
Beat Tracking
Three free parameters:
■ the two values determining the tempo window (τ0 and στ)
■ the α of the objective function, which determines the balance between the local score
(sum of onset strength values at beat times) and the inter-beat-interval scores
Discussion: Non-constant tempos
Effect of deviations from target BPM:
Two problems:
Limitation of algorithm: dependence on a single, predefined ideal tempo
■ experimented with tracking specific single-tempo excerpts while
systematically varying the target tempo around the “true” target tempo
■ accommodate slowly-varying tempos by updating τp dynamically during
the progressive calculation
Abrupt changes in tempo would involve a more radical modification
■ searching across several different τp values
■ with an appropriate cost penalty to discourage frequent shifts between
different tempos
Conclusion
■ Used this algorithm successfully as the basis of the beat-synchronous
chroma features which underlie our cover song detection system [Ellis
and Poliner, 2007] which had the best performance by a wide margin in
the MIREX-06 Cover Song Detection evaluation
■ Ran the beat tracker over a large number of pop music tracks, including the 8764
tracks of the uspop2002 database [Ellis et al., 2003], and found it
generally satisfactory
■ The tracked beats are very often at a reasonable tempo and in reasonable
places.