Motif Detection From Audio In
Hindustani Classical Music:
Methods And Evaluation Strategy
Joe Cheri Ross and Preeti Rao
IIT Bombay
Motifs in Hindustani Music
Melodic motifs or signature phrases are essential building
blocks in Indian Classical music.
Apart from the swaras that define the raga, it is the
characteristic phrases that give it a unique identity [1].
Objective of the present work
Identify all occurrences of melodically similar phrases
in the song given a specific instance of the phrase
Audio example: ‘Jag Mein’ Bandish (Composition)
Rendered by Pt. Ajoy Chakrabarty
Melodic contour extracted by PolyphonicPDA [3]
An Approach to Motif Detection
Segmentation: find the boundaries (in time) of
candidate phrases. What are the acoustic
cues?
Similarity matching: compute a “melodic
distance” between the given phrase and
candidate phrases. What is a good melodic
distance measure?
A Prominent Motif: Mukhda phrase
Mukhda is the recurring title phrase of a ‘Bandish’
(Composition)
Why did we restrict ourselves to Mukhda phrases?
•The ease of marking ground truth based on lyrical
similarity
•The availability of cues to phrase location from the
rhythmic structure
Mukhda Phrases as seen on the pitch contour
Song: Piya Jaag
Swaras: D P G P
Segmentation:
Characteristic of a Mukhda motif
A Mukhda phrase has a specific location in the rhythmic cycle: around
the sam
Example: phrase 'Guru Bina'
Starts 5 beats before sam (t1)
Ends at sam (t2)
This is the cue for identifying the candidate phrases
Candidate phrase length depends on the tempo at that instant
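The sam-based cue can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the sam onset times are already known, and that the local beat duration can be estimated from the gap between consecutive sams. The function name and the `beats_per_cycle` parameter (set to 16 here) are hypothetical.

```python
def candidate_windows(sam_times, beats_before=5, beats_per_cycle=16):
    """Return one (t1, t2) window in seconds per sam, ending at that sam.

    sam_times: sorted sam onset times in seconds (assumed already detected).
    beats_per_cycle: assumed length of the rhythmic cycle in beats.
    The beat duration is estimated from the gap to the previous sam, so the
    window length follows the tempo at that instant.
    """
    windows = []
    for prev, cur in zip(sam_times, sam_times[1:]):
        beat = (cur - prev) / beats_per_cycle
        windows.append((cur - beats_before * beat, cur))
    return windows


# Made-up sam times: the second cycle is faster, so its window is shorter.
windows = candidate_windows([0.0, 16.0, 30.4])
```

The first sam has no preceding cycle to estimate the tempo from, so this sketch produces no window for it.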
Mukhda Phrases on the Pitch Contour
Song: Guru Bina
Swaras: S S N R
Performance of Guru Bina by Pt. Ajoy Chakrabarty
Example
Identification of ‘Guru Bina’ phrase
Detects phrases melodically similar to the ‘Guru Bina’ pitch contour
Swaras: S S N R
[Figure: pitch contour marking the positive phrases, a negative phrase, and the emphatic beat (sam)]
Example: ‘Piya Jaag’ Phrases
[Figure: pitch contour marking the positive phrases and a negative phrase]
Similarity Measures for time series
Symbolic Aggregate approXimation (SAX) [7]
Pitch sequence of each phrase is reduced to a uniform length (w)
Euclidean distance between phrases is computed
Dynamic Time Warping (DTW) [6]
Finds similarity between sequences which vary in time or
speed
Sakoe-Chiba constraint is enabled to avoid any pathological
warping
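Both measures can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the code used in the experiments: the first function only performs the length reduction and Euclidean distance described above (the symbol discretization step of full SAX is omitted), and the DTW uses an absolute-difference local cost with a hypothetical `band` parameter for the Sakoe-Chiba constraint.

```python
import math


def paa(seq, w):
    """Reduce seq to w values by averaging over w nearly equal segments."""
    n = len(seq)
    out = []
    for i in range(w):
        lo = i * n // w
        hi = max((i + 1) * n // w, lo + 1)
        chunk = seq[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out


def uniform_length_distance(a, b, w=8):
    """Euclidean distance after reducing both phrases to length w."""
    pa, pb = paa(a, w), paa(b, w)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


def dtw_distance(a, b, band=None):
    """DTW cost with a Sakoe-Chiba band of half-width `band` frames."""
    n, m = len(a), len(b)
    if band is None:
        band = max(n, m)              # no constraint
    band = max(band, abs(n - m))      # the band must reach the end cell
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Made-up contours: the same shape sung at two speeds, and a different shape.
slow = [0, 0, 2, 2, 4, 4, 2, 2]
fast = [0, 2, 4, 2]
other = [0, 2, 4, 7]
same_shape_dtw = dtw_distance(slow, fast, band=4)
same_shape_paa = uniform_length_distance(slow, fast, w=4)
```

Both distances are zero for the two renditions of the same shape, while `other` ends on a different note and is at a positive distance from `fast` under either measure.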
Experiment
To evaluate the performance of similarity measures
•The location of positive phrases is manually annotated in the song.
•The pitch sequence of the song (pitch value for each 10 ms)
1. Extract candidate phrases (same rhythmic structure) from
the song (pitch contour) by automatic detection of the sam
(or similar bols)
2. With the help of annotated ground truth, find the positive
phrases among the generated candidates
3. Compare each positive candidate phrase with all the
phrases using similarity measures
Experiments were done with quantized and un-quantized pitch
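The slides do not state how the pitch was quantized. One plausible reading, sketched here purely as an assumption, is to convert each pitch value to cents relative to the tonic and round to the nearest semitone (100 cents). The function names, the `tonic_hz` input, and the convention that non-positive pitch marks an unvoiced frame are all hypothetical.

```python
import math


def hz_to_cents(f_hz, tonic_hz):
    """Pitch in cents relative to the tonic."""
    return 1200.0 * math.log2(f_hz / tonic_hz)


def quantize_contour(pitch_hz, tonic_hz, step_cents=100):
    """Round each voiced frame to the nearest multiple of step_cents.

    Unvoiced frames (pitch <= 0, an assumed convention) become None.
    """
    out = []
    for f in pitch_hz:
        if f <= 0:
            out.append(None)
        else:
            cents = hz_to_cents(f, tonic_hz)
            out.append(step_cents * round(cents / step_cents))
    return out


# Made-up frames with the tonic at 220 Hz.
quantized = quantize_contour([220.0, 233.0, 0.0, 330.0, 440.0], tonic_hz=220.0)
# quantized == [0, 100, None, 700, 1200]
```

The un-quantized variant would simply keep the output of `hz_to_cents` for each voiced frame.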
Dataset
Expt  Bandish        Singer             #Phrases (POS)  #Phrases (NEG)
A     Guru Bina      Pt. Bhimsen Joshi  156             715
B     Guru Bina      Ajoy Chakrabarty   1056            9735
C     Jana na na na  Pt. Bhimsen Joshi  272             1649
D     Piya Jaag      Kishori Amonkar    1892            7744
E     Guru Bina      BJ vs AC           429             3835
'Piya Jaag' Distance Distribution
ROC of DTW and SAX
Song: ‘Piya Jaag’
Hit rate: 87%
False alarm rate: 3.2%
(This work has been reported in Proc. ISMIR 2012 [9].)
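A hit rate and false alarm rate of this kind come from thresholding the distances of positive and negative phrases, and sweeping the threshold gives the ROC. The sketch below shows the arithmetic on made-up distances; it assumes a phrase is detected when its distance is at or below the threshold, and the function names are hypothetical.

```python
def hit_and_false_alarm(pos_distances, neg_distances, threshold):
    """Fractions of positive and negative phrases accepted at a threshold."""
    hits = sum(d <= threshold for d in pos_distances)
    false_alarms = sum(d <= threshold for d in neg_distances)
    return hits / len(pos_distances), false_alarms / len(neg_distances)


def roc_points(pos_distances, neg_distances):
    """(false alarm rate, hit rate) at every observed distance value."""
    points = []
    for t in sorted(set(pos_distances) | set(neg_distances)):
        hit, fa = hit_and_false_alarm(pos_distances, neg_distances, t)
        points.append((fa, hit))
    return points


# Made-up distances, for illustration only.
pos = [0.2, 0.4, 0.5, 0.9]
neg = [0.6, 0.8, 1.1, 1.3, 1.5]
hit_rate, false_alarm_rate = hit_and_false_alarm(pos, neg, threshold=0.7)
curve = roc_points(pos, neg)
```

With these toy values, a threshold of 0.7 accepts three of the four positives and one of the five negatives.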
Extension to other phrases
Why is it challenging?
Melodically similar motifs may not occur at the same
location in the rhythmic cycle.
This makes it difficult to identify the right candidate phrases
to compare against
It also increases the number of candidate phrases, and thus the
complexity
Mukhda phrase: ‘Jag Mein Kachu’
Swaras: G-R-SNRS-N-D-N-S
N-NDS
Emphatic beat sam
Location of Mukhda phrases is consistent with respect to the location of the
emphatic beat sam in the rhythmic cycle
Non-Mukhda phrase N-D-S
•N-D-S is one of the prominent phrases in this bandish
•The location of these phrases is not consistent in the rhythmic cycle
•The range of variation due to improvisation is high compared to Mukhda phrases.
Vistar (variations) of the phrase N-D-S
•All these phrases are to be identified as similar motifs
•Phrases end in the Nyas swar (long note) S.
Approaches
1. Identify motifs based on repeating patterns
2. Identify motifs based on potential segment
boundary cues and cluster
Approach 1:
Find repeating patterns in the symbolic sequence and
group similar patterns together.
The symbolic sequence is derived from the pitch contour
The Crochemore algorithm [4,5] extracts repeating patterns
from the input symbolic sequence.
Complexity of the algorithm: O(n log n), where n is the
length of the sequence
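The slides do not describe how the symbolic sequence is derived. One simple possibility, shown here only as an assumption, is to take per-frame semitone offsets from the tonic, keep runs that are long enough to count as a held note, and name each run by its swara. The 12-name `SWARAS` table, the `min_frames` parameter, and the function name are hypothetical.

```python
# Assumed 12-semitone naming; lowercase marks the flat (komal) variants
# and "M" the sharp (tivra) Ma. The octave is discarded.
SWARAS = ["S", "r", "R", "g", "G", "m", "M", "P", "d", "D", "n", "N"]


def to_symbols(semitones, min_frames=3):
    """Turn a per-frame semitone sequence into a sequence of note symbols.

    semitones: integer offsets from the tonic, None for unvoiced frames.
    Runs shorter than min_frames are treated as transitions and dropped.
    """
    symbols = []
    run_value, run_len = None, 0
    for s in list(semitones) + [None]:   # trailing None flushes the last run
        if s == run_value:
            run_len += 1
            continue
        if run_value is not None and run_len >= min_frames:
            symbols.append(SWARAS[run_value % 12])
        run_value, run_len = s, 1
    return symbols


# Made-up frames: S held, a one-frame glitch, R held, a pause, G held.
frames = [0] * 5 + [1] + [2] * 4 + [None] * 3 + [4] * 6
symbols = to_symbols(frames)
# symbols == ["S", "R", "G"]
```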
Approach 1:
Crochemore Algorithm
The Crochemore algorithm extracts repeating patterns from a
symbolic sequence.
Example:
Position:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
Symbol:    S  R  G  S  R  G  P  G  S  R  S  R  G  P  G  P  G  S
Patterns by length, each with its set of start positions:
Length 1: {1,4,9,11,18}S  {2,5,10,12}R  {3,6,8,13,15,17}G  {7,14,16}P
Length 2: {1,4,9,11}SR  {2,5,12}RG  {10}RS  {3,8,17}GS  {6,13,15}GP
Length 3: {1,4,11}SRG  {9}SRS  {3,8}GSR  {6,13,15}GPG
Length 4: {1}SRGS  {4,11}SRGP  {3}GSRG  {8}GSRS  {6,15}GPGS  {13}GPGP
Length 5: {4,11}SRGPG  {6}GPGSR
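The position sets in this example can be reproduced with a naive refinement loop: start from one class per symbol and repeatedly split each class by the symbol that follows each occurrence. The sketch below is not Crochemore's O(n log n) algorithm, which refines the classes far more economically; it is a quadratic-time illustration of the same idea, and the function name is hypothetical.

```python
from collections import defaultdict


def repeating_patterns(symbols):
    """Map each pattern to its 1-based start positions.

    Classes of length-k patterns are split by the next symbol to obtain the
    classes of length k+1. A class with a single occurrence is recorded but
    not extended, since none of its extensions can repeat.
    """
    classes = defaultdict(list)
    for i, s in enumerate(symbols):
        classes[(s,)].append(i)
    found = {}
    while classes:
        next_classes = defaultdict(list)
        for pattern, starts in classes.items():
            found["".join(pattern)] = [i + 1 for i in starts]
            if len(starts) < 2:
                continue
            for i in starts:
                j = i + len(pattern)
                if j < len(symbols):
                    next_classes[pattern + (symbols[j],)].append(i)
        classes = next_classes
    return found


sequence = "S R G S R G P G S R S R G P G P G S".split()
patterns = repeating_patterns(sequence)
repeats = {p: pos for p, pos in patterns.items() if len(pos) >= 2}
# patterns["SRG"] == [1, 4, 11]; patterns["GPGS"] == [6, 15]
```

Filtering on the number of occurrences, as in `repeats`, keeps only the patterns that actually recur.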
Approach 1:
Experiment Method
•Annotation of the location of motifs and the cluster each belongs to.
•Symbolic sequence from the pitch contour
1. The Crochemore algorithm can get the motifs at different levels
from the symbolic sequence
2. Remove short-length motifs
3. With the help of annotated ground truth, find the purity
and Rand index of the clustering
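Purity and the Rand index both compare predicted cluster labels against the annotated ones. A minimal sketch using their standard definitions follows; the labels are made up and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def purity(pred, truth):
    """Fraction of items that carry the majority true label of their cluster."""
    clusters = {}
    for p, t in zip(pred, truth):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(pred)


def rand_index(pred, truth):
    """Fraction of item pairs on which the two labelings agree.

    A pair agrees when both labelings put it in the same group, or both
    put it in different groups.
    """
    pairs = list(combinations(range(len(pred)), 2))
    agree = sum((pred[i] == pred[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)


# Made-up labels: six motifs, two predicted clusters, two annotated classes.
pred = [0, 0, 0, 1, 1, 1]
truth = ["a", "a", "b", "b", "b", "b"]
p = purity(pred, truth)        # 5 of 6 items match their cluster's majority
r = rand_index(pred, truth)    # 10 of 15 pairs agree
```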
Approach 2:
Find motif boundaries with segmentation cues and cluster
similar motifs
Cues to Segmentation:
1. Pauses (silence) occur at major boundaries (lyrical
phrase boundaries)
2. Nyasa (long notes) occur at most of the boundaries
3. Recurring patterns
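The first two cues can be located directly on the pitch contour. The sketch below is one possible way to do so, under assumptions not stated in the slides: the contour is in cents with None for unvoiced frames, a pause is a sufficiently long unvoiced run, and a long note is a run that stays within a tolerance of its first frame. All thresholds and function names are hypothetical.

```python
def pauses(cents, min_frames=30):
    """(start, end) frame ranges, end exclusive, of long unvoiced runs."""
    runs, start = [], None
    for i, c in enumerate(list(cents) + [0]):   # voiced sentinel closes a final run
        if c is None and start is None:
            start = i
        elif c is not None and start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    return runs


def long_notes(cents, tol=30.0, min_frames=50):
    """(start, end) frame ranges where pitch stays within tol cents of the
    first frame of the run for at least min_frames frames."""
    notes, i, n = [], 0, len(cents)
    while i < n:
        if cents[i] is None:
            i += 1
            continue
        j = i + 1
        while j < n and cents[j] is not None and abs(cents[j] - cents[i]) <= tol:
            j += 1
        if j - i >= min_frames:
            notes.append((i, j))
        i = j
    return notes


# Made-up contour at a 10 ms hop: short note, 400 ms pause, 600 ms held S
# with slight jitter, then a short note.
contour = [700] * 20 + [None] * 40 + [0, 5, -5, 3] * 15 + [200] * 10
pause_ranges = pauses(contour)      # [(20, 60)]
note_ranges = long_notes(contour)   # [(60, 120)]
```

The ends of these ranges are candidate phrase boundaries; the third cue, recurring patterns, is not covered by this sketch.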
Approach 2:
Experiment Method
•Annotation of the location of motifs and the cluster each belongs to.
•The pitch sequence of the song (pitch value for each 10 ms)
1. Extract candidate phrases by segmentation from the
song (pitch contour)
2. Find similar motifs using similarity measures and
cluster (agglomerative) them
3. With the help of annotated ground truth, find the purity
and Rand index of the clustering
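The clustering step can be sketched with a small agglomerative procedure over pairwise phrase distances. The slides do not specify the linkage criterion or the stopping rule, so average linkage and a distance threshold are assumptions here, and a plain Euclidean distance stands in for the melodic similarity measure. Function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Stand-in distance between two equal-length contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(items, dist, threshold):
    """Average-linkage clustering; returns clusters as lists of item indices.

    Merging stops once the closest pair of clusters is farther apart
    than threshold.
    """
    clusters = [[i] for i in range(len(items))]

    def linkage(a, b):
        total = sum(dist(items[i], items[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Made-up contours: three variants of one shape and two of another.
phrases = [[0, 2, 4], [0, 2, 5], [7, 7, 7], [7, 8, 7], [0, 1, 4]]
clusters = agglomerative(phrases, euclidean, threshold=3.0)
# clusters == [[0, 1, 4], [2, 3]]
```

The resulting index groups can be turned into per-phrase labels and scored against the annotation with purity and the Rand index.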
Conclusion & Future Work
Detecting phrase motifs is challenging due to the inherent
variability. However:
Prominent swaras remain the same (e.g. N D S)
Explicit phrase segmentation cues need to be further explored
Time-series pattern matching methods may be extended
to motif discovery (i.e. no prior knowledge about motifs is
available)
References
[1] J. Chakravorty, B. Mukherjee and A. K. Datta: “Some Studies in Machine Recognition
of Ragas in Indian Classical Music,” Journal of the Acoust. Soc. India, Vol. 17, No. 3&4,
1989.
[2] S. Rao, W. van der Meer and J. Harvey: “The Raga Guide: A Survey of 74 Hindustani
Ragas,” Nimbus Records with the Rotterdam Conservatory of Music, 1999.
[3] V. Rao and P. Rao: “Vocal Melody Extraction in the Presence of Pitched
Accompaniment in Polyphonic Music,” IEEE Trans. Audio Speech and Language
Processing, Vol. 18, No. 8, 2010.
[4] M. Crochemore: “An Optimal Algorithm for Computing the Repetitions in a
Word,” Information Processing Letters, Vol. 12, No. 5, 1981.
[5] E. Cambouropoulos: “Musical parallelism and melodic segmentation: A
computational approach,” Music Perception: An Interdisciplinary Journal, Vol. 23, No. 3,
2006.
[6] D. Berndt and J. Clifford: “Using Dynamic Time Warping to Find Patterns in Time
Series,” AAAI-94 Workshop on Knowledge Discovery in Databases, 1994.
[7] J. Lin, E. Keogh, S. Lonardi and B. Chiu: “A Symbolic Representation of Time Series,
with Implications for Streaming Algorithms,” In Proc. of the Eighth ACM SIGMOD
Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003.
[8] A. Mueen, E. Keogh, Q. Zhu and S. Cash: “Exact Discovery of Time Series Motifs,”
Proc. of the SIAM International Conference on Data Mining, 2009.
[9] J. Ross, T. P. Vinutha and P. Rao: “Detecting Melodic Motifs From Audio For
Hindustani Classical Music,” Proc. of Int. Soc. for Music Information Retrieval Conf.
(ISMIR), 2012.