Discovering variable length phrases from symbolic notation of

Goal
Formulation
Results
Conclusion
Discovering variable length phrases from symbolic
notation of Carnatic music
Ranjani H. G.
Advisor: Prof. T. V. Sreenivas,
Dept of ECE,
Indian Institute of Science
April 26, 2016
Ranjani H. G.
Phrase discovery
Dept of ECE
1 / 10
Goal
Formulation
Results
Conclusion
Problem Formulation
Given symbolic transcript, discover variable length phrases for a rāga
D P M ,
In - - -
G R S ,
tha - - -
R N D D
cha - - -
P P S ,
- - la -
S , N S
- - mu -
G R G ,
- - je -
M ,
si -
G R G M ||
the - - -
P ,
e -
M- D
- ma
- P- S N
- - ni -
P D
- -
P- N
- lu
-
,
-
-
D ,
-
M ,
- -
D P- S
- - -
D R S ,
- - tha -
N
-
P
M ,
du - -
G
-
M P D
ran - -
N
-
,
-
R- G M P ||
ra - -
S ,- D D
An - - -
P- N N D
- tha - -
G R S- M
da - - ni
G M- R G M P- - - - -
P D ,- D
A - - na
P M- M P
- - va -
,-
P
-
M
-
G
-
R G M P
cchi - - -
M D ,
S
S
N
D
R
S
P
N
D P
D P- M ,
- - gu M , , , ||
vi - - -
P D
mo -
,
, ||
,
- - - symbol
the - transcript
- - na - -of Begada
tho - - Sample
rāga
P D P S
Pan - - tha
R ,
ra -
,- R
- -
R S , N
Go - - pa
P
ri
S ,- P
- - -
N S- G R
- - me S N
- -
G M G M
- - - -
R ,- P M
la - - -
D D
sree -
P- S
- ve
,
-
N
-
D N
nu -
D
-
P- D M
- la -
, G
- da
M
-
R
sa
G M P D
pa - - -
D
-
P-
D P
- -
M
-
G
-
N ,
pa -
Ranjani H. G.
D , P M
M G , R
,- D P M
S N- R S
Phrase discovery
N R , S
G
R- P M
G R ,- G
,- R N D
,
P
G
R
R G
- la
,
P
G
M
D
M
S R||
- -
M P
- P
P
D
S
Dept of ECE
P
D
2 / 10
Goal
Formulation
Results
Conclusion
Problem Formulation
Given symbolic transcript, discover variable length phrases for a rāga
D P M ,
In - - -
G R S ,
tha - - -
R N D D
cha - - -
P P S ,
- - la -
S , N S
- - mu -
G R G ,
- - je -
M ,
si -
G R G M ||
the - - -
P ,
e -
M- D
- ma
- P- S N
- - ni -
P D
- -
P- N
- lu
-
,
-
-
D ,
-
M ,
- -
D P- S
- - -
D R S ,
- - tha -
N
-
P
M ,
du - -
G
-
M P D
ran - -
N
-
,
-
R- G M P ||
ra - -
S ,- D D
An - - -
P- N N D
- tha - -
G R S- M
da - - ni
G M- R G M P- - - - -
P D ,- D
A - - na
P M- M P
- - va -
,-
P
-
M
-
G
-
R G M P
cchi - - -
M D ,
S
S
N
D
R
S
P
N
D P
D P- M ,
- - gu M , , , ||
vi - - -
P D
mo -
,
, ||
,
- - - symbol
the - transcript
- - na - -of Begada
tho - - Sample
rāga
P D P S
Pan - - tha
R ,
ra -
,- R
- -
N S- G R
- - me S N
- -
G M G M
- - - P- S
- ve
,
-
N
-
D N
nu -
D
-
P- D M
- la -
, G
- da
M
-
R
sa
G M P D
pa - - -
D
-
P-
D P
- -
M
-
G
-
Multiple phrases exist - unknown
Variable number of notes
R S , N
Go - - pa
P
ri
S ,- P
- - -
R ,- P M
la - - -
D D
sree -
N ,
pa -
Ranjani H. G.
D , P M
M G , R
,- D P M
S N- R S
Phrase discovery
N R , S
G
R- P M
G R ,- G
,- R N D
,
P
G
R
R G
- la
,
P
G
M
D
M
S R||
- -
M P
- P
P
D
S
Dept of ECE
P
D
2 / 10
Goal
Formulation
Results
Conclusion
Multigram
Let transcript be denoted by A = [A1 , A2 , . . . AI ]; sequence of
rhythm cycles
Ranjani H. G.
Phrase discovery
Dept of ECE
3 / 10
Goal
Formulation
Results
Conclusion
Multigram
Let transcript be denoted by A = [A1 , A2 , . . . AI ]; sequence of
rhythm cycles
Consider any rhythm cycle A = [u1 , u2 , u3 , . . . , uTA ]; where ut ∈ V ,
with V = {S, R, G , M, P, D, N}.
p(A ) =
QA
Y
k=1
Ranjani H. G.
Phrase discovery
p(sk ) ,
QA
Y
θk
(1)
k=1
Dept of ECE
3 / 10
Goal
Formulation
Results
Conclusion
Multigram
Let transcript be denoted by A = [A1 , A2 , . . . AI ]; sequence of
rhythm cycles
Consider any rhythm cycle A = [u1 , u2 , u3 , . . . , uTA ]; where ut ∈ V ,
with V = {S, R, G , M, P, D, N}.
p(A ) =
QA
Y
p(sk ) ,
k=1
QA
Y
θk
(1)
k=1
Segmentation on A results in
D P M M G R S S R N D D P P S S S S N S G R G G M M M M G R G M ||
b0
b1
b2
b3
b4
b5
b6
A ≡ [s1 , s2 , s3 , . . . sQA ].
Ranjani H. G.
Phrase discovery
Dept of ECE
3 / 10
Goal
Formulation
Results
Conclusion
Multigram
Let transcript be denoted by A = [A1 , A2 , . . . AI ]; sequence of
rhythm cycles
Consider any rhythm cycle A = [u1 , u2 , u3 , . . . , uTA ]; where ut ∈ V ,
with V = {S, R, G , M, P, D, N}.
p(A ) =
QA
Y
p(sk ) ,
k=1
QA
Y
θk
(1)
k=1
Segmentation on A results in
D P M M G R S S R N D D P P S S S S N S G R G G M M M M G R G M ||
b0
b1
b2
b3
b4
b5
b6
A ≡ [s1 , s2 , s3 , . . . sQA ].
Set of boundaries {bk } represented by r.v. Z
Ranjani H. G.
Phrase discovery
Dept of ECE
3 / 10
Goal
Formulation
Results
Conclusion
Parameter estimation: Segmental K-means
Estimate parameters, θk to maximize posterior p(Z |A; θ)
n
o
θ∗ = arg max max log p(Z |A; θold )
θ
Ranjani H. G.
Phrase discovery
Z
Dept of ECE
4 / 10
Goal
Formulation
Results
Conclusion
Parameter estimation: Segmental K-means
Estimate parameters, θk to maximize posterior p(Z |A; θ)
n
o
θ∗ = arg max max log p(Z |A; θold )
θ
Z
PY
Constraint :
k=1 θk = 1 where, Y is total number of unique
sub-sequence entries
Ranjani H. G.
Phrase discovery
Dept of ECE
4 / 10
Goal
Formulation
Results
Conclusion
Parameter estimation: Segmental K-means
Estimate parameters, θk to maximize posterior p(Z |A; θ)
n
o
θ∗ = arg max max log p(Z |A; θold )
θ
Z
PY
Constraint :
k=1 θk = 1 where, Y is total number of unique
sub-sequence entries
Algorithm
1. Find Z ∗ such that
Z∗
=
=
Ranjani H. G.
Phrase discovery
arg max log p(A, Z ; θold )
(2)
Z ∈Z
arg max log p(A|Z , θ
Z ∈Z
old
)p(Z ; θ
old
)
Dept of ECE
4 / 10
Goal
Formulation
Results
Conclusion
Parameter estimation: Segmental K-means
Estimate parameters, θk to maximize posterior p(Z |A; θ)
n
o
θ∗ = arg max max log p(Z |A; θold )
θ
Z
PY
Constraint :
k=1 θk = 1 where, Y is total number of unique
sub-sequence entries
Algorithm
1. Find Z ∗ such that
Z∗
=
=
arg max log p(A, Z ; θold )
(2)
Z ∈Z
arg max log p(A|Z , θ
Z ∈Z
old
)p(Z ; θ
old
)
2. Update parameters
∗
θjnew =
cjZ
cZ∗
(3)
where Z ∗ maximizes posterior
Ranjani H. G.
Phrase discovery
Dept of ECE
4 / 10
Goal
Formulation
Results
Conclusion
Multigram attributes
Convergence criteria : Boundaries do not change
{θ} - Variable length multinomial distribution
Normalized count over number of segments
Phrase entries themselves can change across iterations
Total number of phrases can change across iterations
Ranjani H. G.
Phrase discovery
Dept of ECE
5 / 10
Goal
Formulation
Results
Conclusion
Analysis
Rough pitch contours of more than 100 rhythm cycles from symbolic transcripts of
rāga Begada (in blue)
Ranjani H. G.
Phrase discovery
Dept of ECE
6 / 10
Goal
Formulation
Results
Conclusion
Analysis
Rough pitch contours of more than 100 rhythm cycles from symbolic transcripts of
rāga Begada (in blue) and top ten frequently occurring phrases (sorted aided by other
colors) as discovered by 8-multigram. Two musicological phrase(s) are highlighted
using (black and red) arrowheads.
Ranjani H. G.
Phrase discovery
Dept of ECE
6 / 10
Goal
Formulation
Results
Conclusion
Modified multigram
Sub-sequences limited by N
Ranjani H. G.
Phrase discovery
Dept of ECE
7 / 10
Goal
Formulation
Results
Conclusion
Modified multigram
Sub-sequences limited by N
Propose a modified 2-stage approach:
Obtain phrase set (≤ N length phrases), using multi-gram model
Creatennew vocab:
o
r
V 0 = V ∪ {si : |si | = N, θi > Pthr }, ∀i ∈ DN−multi
.
Replace any occurrence of si with its corresponding entry from V 0
Obtain new set of phrases of maximum N + M length phrases
Ranjani H. G.
Phrase discovery
Dept of ECE
7 / 10
Goal
Formulation
Results
Conclusion
Analysis
Rough pitch contours of more than 100 rhythm cycles from training data of rāga
Begada (in blue) and top ten frequently occurring phrases (sorted aided by other
colors) as discovered by modified M-multigram with (N, M) = (8, 8). Two
characteristic phrase(s) are highlighted using (black and red) arrowheads.
Ranjani H. G.
Phrase discovery
Dept of ECE
8 / 10
Goal
Formulation
Results
Conclusion
Conclusions
Use only 7 notes (irrespective of pitch position)
Discover variable length phrases
Possible representative feature for symbolic music
Some discovered phrases also correlate with musicological phrases
Capture grammatical structure of music
Ranjani H. G.
Phrase discovery
Dept of ECE
9 / 10
Goal
Formulation
Results
Conclusion
Conclusions
Use only 7 notes (irrespective of pitch position)
Discover variable length phrases
Possible representative feature for symbolic music
Some discovered phrases also correlate with musicological phrases
Capture grammatical structure of music
Ranjani H. G.
Phrase discovery
Dept of ECE
9 / 10
Goal
Formulation
Results
Conclusion
QUESTIONS?
Ranjani H. G.
Phrase discovery
Dept of ECE
10 / 10
Additional Material
Performance: Perplexity
Perplexity values of N-gram, N-multigram on training and testing symbolic music data
for the rāgas considered.
Ranjani H. G.
Phrase discovery
Dept of ECE
1/4
Additional Material
Performance: Perplexity
Perplexity values of N-multigram, (N, M) modified multigram on training and testing
symbolic music data for the rāgas considered.
Ranjani H. G.
Phrase discovery
Dept of ECE
2/4
Additional Material
Performance: Correlation with musicological phrases
Phrases as marked by musicians. Those found by N-multigram marked by black left
arrowhead, phrases found by 2-stage (N,M) multigram marked by red left arrowhead.
Ranjani H. G.
Phrase discovery
Dept of ECE
3/4
Additional Material
Performance: Total size of phrases discovered
A comparison of phrase set sizes for the considered rāgas obtained using multigram
and the 2-stage multigram models for N and M ranging from 5 to 8.
Ranjani H. G.
Phrase discovery
Dept of ECE
4/4