### MLVQ (EM Algorithm)

```MLVQ (EM algorithm)
Speaker:楊志民
Date:96.10.4
[Flowchart: recognition system overview]
Blocks: training data, test data, remove DC bias, feature extraction, speech features, train model, recognize, recognition rate
Files: 411.C, Breath.c, Silence.c, Duration.c
Initial models
[Flowchart blocks, in order: Initialize VQ; initial state; loop; VQ (get mixture means); Initialize MLVQ; MLVQ (get means, variances, weights, determinants)]
Mixture Gaussian density function
A mixture of Gaussians can fit almost any kind of distribution, given enough components.
$$ p(y \mid \Phi) = \sum_{k=1}^{K} c_k \, N_k(y \mid \mu_k, \Sigma_k) = \sum_{k=1}^{K} c_k \, p_k(y \mid \Phi_k) $$
[Figure: plot of f(x) against x]
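Estimation theory
Notation: $Y$ is the observable data, $X$ is the unobserved data, $\Phi$ is the current parameter vector, and $\bar{\Phi}$ is the new one.
• Bayes' theorem:
$$ P(X = x, Y = y \mid \bar{\Phi}) = P(X = x \mid Y = y, \bar{\Phi}) \, P(Y = y \mid \bar{\Phi}) $$
• Our goal is to maximize the log-likelihood of the observable:
$$ P(Y = y \mid \bar{\Phi}) = \frac{P(X = x, Y = y \mid \bar{\Phi})}{P(X = x \mid Y = y, \bar{\Phi})} $$
$$ \log P(Y = y \mid \bar{\Phi}) = \log P(X = x, Y = y \mid \bar{\Phi}) - \log P(X = x \mid Y = y, \bar{\Phi}) $$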

We take the conditional expectation of $\log P(Y = y \mid \bar{\Phi})$ over $X$, computed with parameter vector $\Phi$:
$$ E_{\Phi}\left[\log P(Y = y \mid \bar{\Phi})\right]_{X \mid Y = y} = \sum_{x} P(X = x \mid Y = y, \Phi) \, \log P(Y = y \mid \bar{\Phi}) = \log P(Y = y \mid \bar{\Phi}) $$
The last step holds because the posterior $P(X = x \mid Y = y, \Phi)$ sums to 1 over $x$.
The following expression is obtained:
$$ \log P(Y = y \mid \bar{\Phi}) = E_{\Phi}\left[\log P(X, Y = y \mid \bar{\Phi})\right]_{X \mid Y = y} - E_{\Phi}\left[\log P(X \mid Y = y, \bar{\Phi})\right]_{X \mid Y = y} $$
$$ = Q(\Phi, \bar{\Phi}) - H(\Phi, \bar{\Phi}) $$
• By Jensen's inequality:
$$ H(\Phi, \bar{\Phi}) \le H(\Phi, \Phi) $$
• The convergence of the EM algorithm rests on the fact that if we choose $\bar{\Phi}$ so that
$$ Q(\Phi, \bar{\Phi}) \ge Q(\Phi, \Phi) $$
then
$$ \log P(Y = y \mid \bar{\Phi}) \ge \log P(Y = y \mid \Phi) $$

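Jensen's inequality
For $0 \le \lambda \le 1$ and $x_1, x_2 > 0$:
$$ \log[\lambda x_1 + (1 - \lambda) x_2] \ge \lambda \log(x_1) + (1 - \lambda) \log(x_2) $$
In general, for $\lambda_i \ge 0$ with $\sum_{i=1}^{n} \lambda_i = 1$:
$$ \log\left(\sum_{i=1}^{n} \lambda_i x_i\right) \ge \sum_{i=1}^{n} \lambda_i \log(x_i) $$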
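$$ H(\Phi, \bar{\Phi}) - H(\Phi, \Phi) $$
$$ = \sum_{x} P(X = x \mid Y = y, \Phi) \log P(X = x \mid Y = y, \bar{\Phi}) - \sum_{x} P(X = x \mid Y = y, \Phi) \log P(X = x \mid Y = y, \Phi) $$
$$ = \sum_{x} P(X = x \mid Y = y, \Phi) \log\left[\frac{P(X = x \mid Y = y, \bar{\Phi})}{P(X = x \mid Y = y, \Phi)}\right] $$
By Jensen's inequality:
$$ \le \log \sum_{x} \left[P(X = x \mid Y = y, \Phi) \, \frac{P(X = x \mid Y = y, \bar{\Phi})}{P(X = x \mid Y = y, \Phi)}\right] $$
$$ = \log \sum_{x} P(X = x \mid Y = y, \bar{\Phi}) = \log 1 = 0 $$
• Thus, to increase the log-likelihood it suffices to maximize $Q(\Phi, \bar{\Phi})$ over $\bar{\Phi}$.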

Maximization
• The EM method is iterative, so it needs an initial model.
• Q0 → Q1 → Q2 → …
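Steps to implement EM
1. Initialization: choose an initial estimate $\Phi$.
2. E-step: estimate the unobserved data using $\Phi$, that is, form the auxiliary Q-function $Q(\Phi, \bar{\Phi})$.
3. M-step: compute $\hat{\Phi} = \arg\max_{\bar{\Phi}} Q(\Phi, \bar{\Phi})$ to maximize the auxiliary Q-function.
4. Iteration: set $\Phi = \hat{\Phi}$ and repeat from step 2 until convergence.
[Flowchart: convergence check with Yes/No branches]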
```
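The slides describe initializing the mixture means with VQ and then refining means, variances, and weights with EM. The following is a minimal sketch of that idea, not the speaker's implementation. It assumes one-dimensional data, a simple k-means style pass standing in for the VQ step, and a variance floor for numerical safety. All function names (`vq_init`, `em_gmm`, and so on) and the synthetic data are hypothetical, chosen only for illustration.

```python
import math
import random


def gaussian_pdf(y, mean, var):
    """Density of a 1-D Gaussian N(y | mean, var)."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """log P(Y = y | Phi) for a mixture with parameters Phi = (weights, means, variances)."""
    total = 0.0
    for y in data:
        total += math.log(sum(c * gaussian_pdf(y, m, v)
                              for c, m, v in zip(weights, means, variances)))
    return total


def vq_init(data, k, iters=10):
    """Assumed stand-in for the VQ step: hard-assignment passes that yield initial means."""
    ordered = sorted(data)
    # Spread the initial codewords across the sorted data (an illustrative choice).
    means = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for y in data:
            nearest = min(range(k), key=lambda j: (y - means[j]) ** 2)
            clusters[nearest].append(y)
        means = [sum(c) / len(c) if c else means[j] for j, c in enumerate(clusters)]
    return means


def em_gmm(data, k, max_iter=100, tol=1e-6, var_floor=1e-6):
    """EM for a 1-D Gaussian mixture; returns parameters and the log-likelihood history."""
    n = len(data)
    means = vq_init(data, k)
    grand_mean = sum(data) / n
    grand_var = sum((y - grand_mean) ** 2 for y in data) / n
    variances = [max(grand_var, var_floor)] * k
    weights = [1.0 / k] * k
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(max_iter):
        # E-step: posterior P(X = k | Y = y, Phi) of each component for each sample.
        resp = []
        for y in data:
            joint = [c * gaussian_pdf(y, m, v)
                     for c, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: closed-form maximizer of Q(Phi, Phi_bar) for a Gaussian mixture.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * y for r, y in zip(resp, data)) / nj
            var_j = sum(r[j] * (y - means[j]) ** 2 for r, y in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
        history.append(log_likelihood(data, weights, means, variances))
        # Stop once the log-likelihood gain falls below the tolerance.
        if history[-1] - history[-2] < tol:
            break
    return weights, means, variances, history


# Synthetic two-component data, used only to exercise the sketch.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
weights, means, variances, history = em_gmm(data, k=2)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

The `history` list makes the convergence argument from the slides observable: each EM iteration should leave the log-likelihood no lower than before, up to floating-point rounding and provided the variance floor is not triggered.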