Inference in HMM
Tutorial #6
Hidden Markov Models - HMM
[Figure: the HMM chain. Hidden variables S1, S2, …, Si, …, SL-1, SL on top; observed data X1, X2, …, Xi, …, XL-1, XL below, each Xi emitted by Si.]
Coin-Tossing Example
Over L tosses, each toss uses either a fair or a loaded coin. The hidden state Si (Fair/Loaded) is which coin is used at toss i; the observed outcome Xi (Head/Tail) is the result of that toss.
Start: P(s1=fair) = P(s1=loaded) = 1/2
Transitions: P(fair→fair) = P(loaded→loaded) = 0.9, P(fair→loaded) = P(loaded→fair) = 0.1
Emissions: fair coin: P(head) = 1/2, P(tail) = 1/2; loaded coin: P(head) = 3/4, P(tail) = 1/4
[Figure: the state-transition diagram for Fair/Loaded and the HMM chain S1,…,SL (Fair/Loaded) over X1,…,XL (Head/Tail) for L tosses.]
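It may help to keep a concrete encoding of this model at hand for the examples that follow. Below is a minimal Python sketch (the tutorial itself gives no code; the names PRIOR, TRANS, EMIT, and STATES are illustrative):

```python
# Coin-tossing HMM parameters, read off the diagram above.
PRIOR = {'fair': 0.5, 'loaded': 0.5}                # P(s1)
TRANS = {'fair':   {'fair': 0.9, 'loaded': 0.1},    # P(s_{i+1} | s_i)
         'loaded': {'fair': 0.1, 'loaded': 0.9}}
EMIT  = {'fair':   {'head': 0.5,  'tail': 0.5},     # P(x_i | s_i)
         'loaded': {'head': 0.75, 'tail': 0.25}}
STATES = ('fair', 'loaded')
```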
Coin-Tossing Example
Query: what are the probabilities of the fair/loaded states given the observed outcomes {x1,…,xL}?
1. Compute the posterior belief in Si (for a specific i) given the evidence {x1,…,xL}, for each of Si's values si; namely, compute p(si | x1,…,xL).
2. Do the same computation for every Si, but without repeating the first task L times.
In other words: having seen the outcomes {x1,…,xL}, compute p(loaded | x1,…,xL) for each coin toss.
Decomposing the computation
[Figure: the HMM chain S1,…,SL over X1,…,XL.]
Answer: P(si | x1,…,xL) = (1/K) P(x1,…,xL, si), where K = Σ_{si} P(x1,…,xL, si).
P(x1,…,xL, si) = P(x1,…,xi, si) P(xi+1,…,xL | x1,…,xi, si)
               = P(x1,…,xi, si) P(xi+1,…,xL | si)
               = f(si) b(si)
(the second equality holds by conditional independence; the two factors are denoted f(si) and b(si))
The forward algorithm
[Figure: the HMM chain up to slot i: S1, S2, …, Si over X1, X2, …, Xi.]
The task: Compute f(si) = P(x1,…,xi,si) for i=1,…,L (namely,
considering evidence up to time slot i).
{Basis step}
P(x1, s1) = P(s1) P(x1|s1)
{Second step}
P(x1, x2, s2) = Σ_{s1} P(x1, s1, s2, x2)
             = Σ_{s1} P(x1, s1) P(s2 | x1, s1) P(x2 | x1, s1, s2)
             = Σ_{s1} P(x1, s1) P(s2 | s1) P(x2 | s2)
(the last equality is due to conditional independence)
{Step i}
P(x1,…,xi, si) = Σ_{si-1} P(x1,…,xi-1, si-1) P(si | si-1) P(xi | si)
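As a sketch of how step i translates to code, here is one possible Python rendering of the forward recursion, reusing the PRIOR/TRANS/EMIT tables defined earlier (illustrative names, not part of the tutorial):

```python
def forward(xs, prior=PRIOR, trans=TRANS, emit=EMIT, states=STATES):
    """Return a list f with f[k][s] = P(x1,...,x_{k+1}, s_{k+1}=s),
    where k is the 0-based slot index."""
    f = [{s: prior[s] * emit[s][xs[0]] for s in states}]   # basis step
    for x in xs[1:]:                                       # step i
        prev = f[-1]
        f.append({s: emit[s][x] * sum(prev[t] * trans[t][s] for t in states)
                  for s in states})
    return f
```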
The backward algorithm
[Figure: the HMM chain from slot i onward: Si, Si+1, …, SL-1, SL over Xi+1, …, XL-1, XL.]
The task: Compute b(si) = P(xi+1,…,xL|si) for i=L-1,…,1
(namely, considering evidence after time slot i).
{First step}
P(xL | sL-1) = Σ_{sL} P(xL, sL | sL-1) = Σ_{sL} P(sL | sL-1) P(xL | sL-1, sL)
            = Σ_{sL} P(sL | sL-1) P(xL | sL)
(the last equality is due to conditional independence)
{Step i}
P(xi+1,…,xL | si) = Σ_{si+1} P(si+1 | si) P(xi+1 | si+1) P(xi+2,…,xL | si+1)
that is, b(si) = Σ_{si+1} P(si+1 | si) P(xi+1 | si+1) b(si+1)
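A matching Python sketch of the backward recursion (same assumed tables as before):

```python
def backward(xs, trans=TRANS, emit=EMIT, states=STATES):
    """Return a list b with b[k][s] = P(x_{k+2},...,xL | s_{k+1}=s);
    the last entry b[L-1][s] = 1 (no evidence after slot L)."""
    L = len(xs)
    b = [None] * L
    b[L - 1] = {s: 1.0 for s in states}        # base case
    for i in range(L - 2, -1, -1):             # step i, from L-1 down to 1
        b[i] = {s: sum(trans[s][t] * emit[t][xs[i + 1]] * b[i + 1][t]
                       for t in states)
                for s in states}
    return b
```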
The combined answer
[Figure: the HMM chain S1,…,SL over X1,…,XL.]
1. To compute the posterior belief in Si (for a specific i) given the evidence {x1,…,xL}: run the forward algorithm to compute f(si) = P(x1,…,xi, si) and the backward algorithm to compute b(si) = P(xi+1,…,xL | si); the product f(si) b(si), normalized by K, is the answer (for every possible value si).
2. To compute the posterior belief for every Si, simply run the forward and backward algorithms once, storing f(si) and b(si) for every i (and every value si), then compute f(si) b(si) for every i. A combined sketch follows below.
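Combining the two sketches from the previous slides, the posterior for every slot comes from one forward pass and one backward pass:

```python
def posteriors(xs):
    """P(s_i | x1,...,xL) for every slot i: normalize f(s_i) * b(s_i)."""
    f, b = forward(xs), backward(xs)
    out = []
    for fi, bi in zip(f, b):
        joint = {s: fi[s] * bi[s] for s in fi}   # P(x1,...,xL, s_i)
        K = sum(joint.values())                  # normalizing constant K
        out.append({s: p / K for s, p in joint.items()})
    return out
```

For the outcomes head, head, tail this gives, for example, P(s1=loaded | x1,x2,x3) ≈ 0.57 for the first toss.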
Likelihood of evidence
[Figure: the HMM chain S1,…,SL over X1,…,XL.]
1. To compute the likelihood of evidence P(x1,…,xL), do one more step in the forward algorithm, namely
   Σ_{sL} f(sL) = Σ_{sL} P(x1,…,xL, sL)
2. Alternatively, do one more step in the backward algorithm, namely
   Σ_{s1} b(s1) P(s1) P(x1|s1) = Σ_{s1} P(x2,…,xL | s1) P(s1) P(x1|s1)
Both routes give the same number; see the check below.
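Continuing the earlier sketches, the two routes are easy to check against each other in code:

```python
xs = ['head', 'head', 'tail']
f, b = forward(xs), backward(xs)
lik_forward  = sum(f[-1].values())                             # sum of f(sL)
lik_backward = sum(b[0][s] * PRIOR[s] * EMIT[s][xs[0]] for s in STATES)
assert abs(lik_forward - lik_backward) < 1e-12                 # both ≈ 0.1371
```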
Coin-Tossing Example
Numeric example: 3 tosses; outcomes: head, head, tail.
Recall: f(si) = P(x1,…,xi, si) = Σ_{si-1} P(x1,…,xi-1, si-1) P(si | si-1) P(xi | si)
{Step 1 - forward}
P(x1=head, s1=loaded) = P(loaded1) P(head | loaded1) = 0.5*0.75 = 0.375   (the case where the first coin is loaded)
P(x1=head, s1=fair) = P(fair1) P(head | fair1) = 0.5*0.5 = 0.25
Coin-Tossing Example - forward
Numeric example: 3 tosses; outcomes: head, head, tail.
P(x1,…,xi, si) = Σ_{si-1} P(x1,…,xi-1, si-1) P(si | si-1) P(xi | si)
{Step 1}
P(x1=head, s1=loaded) = P(loaded1) P(head | loaded1) = 0.5*0.75 = 0.375
P(x1=head, s1=fair) = P(fair1) P(head | fair1) = 0.5*0.5 = 0.25
{Step 2}
P(x1=head, x2=head, s2=loaded) = Σ_{s1} P(x1, s1) P(s2 | s1) P(x2 | s2)
  = P(x1=head, loaded1) P(loaded2 | loaded1) P(x2=head | loaded2)
  + P(x1=head, fair1) P(loaded2 | fair1) P(x2=head | loaded2)
  = 0.375*0.9*0.75 + 0.25*0.1*0.75 = 0.253125 + 0.01875 = 0.271875
P(x1=head, x2=head, s2=fair)
  = P(x1=head, loaded1) P(fair2 | loaded1) P(x2=head | fair2)
  + P(x1=head, fair1) P(fair2 | fair1) P(x2=head | fair2)
  = 0.375*0.1*0.5 + 0.25*0.9*0.5 = 0.01875 + 0.1125 = 0.13125
Coin-Tossing Example - forward
Numeric example: 3 tosses; outcomes: head, head, tail.
P(x1,…,xi, si) = Σ_{si-1} P(x1,…,xi-1, si-1) P(si | si-1) P(xi | si)
{Step 2}
P(x1=head, x2=head, s2=loaded) = 0.271875
P(x1=head, x2=head, s2=fair) = 0.13125
{Step 3}
P(x1=head, x2=head, x3=tail, s3=loaded) = Σ_{s2} P(x1, x2, s2) P(s3 | s2) P(x3 | s3)
  = P(x1=head, x2=head, loaded2) P(loaded3 | loaded2) P(x3=tail | loaded3)
  + P(x1=head, x2=head, fair2) P(loaded3 | fair2) P(x3=tail | loaded3)
  = 0.271875*0.9*0.25 + 0.13125*0.1*0.25 ≈ 0.06445
P(x1=head, x2=head, x3=tail, s3=fair)
  = P(x1=head, x2=head, loaded2) P(fair3 | loaded2) P(x3=tail | fair3)
  + P(x1=head, x2=head, fair2) P(fair3 | fair2) P(x3=tail | fair3)
  = 0.271875*0.1*0.5 + 0.13125*0.9*0.5 ≈ 0.07266
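Running the forward sketch from earlier on these outcomes reproduces the numbers above:

```python
f = forward(['head', 'head', 'tail'])
print(f[0])   # {'fair': 0.25, 'loaded': 0.375}
print(f[1])   # {'fair': 0.13125, 'loaded': 0.271875}
print(f[2])   # {'fair': 0.07265625, 'loaded': 0.064453125}
```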
Coin-Tossing Example - backward
Numeric example: 3 tosses; outcomes: head, head, tail.
b(si) = P(xi+1,…,xL | si) = Σ_{si+1} P(si+1 | si) P(xi+1 | si+1) b(si+1)
{Step 1}
P(x3=tail | s2=loaded) = P(s3=loaded | s2=loaded) P(x3=tail | s3=loaded) + P(s3=fair | s2=loaded) P(x3=tail | s3=fair) = 0.9*0.25 + 0.1*0.5 = 0.275
P(x3=tail | s2=fair) = P(s3=loaded | s2=fair) P(x3=tail | s3=loaded) + P(s3=fair | s2=fair) P(x3=tail | s3=fair) = 0.1*0.25 + 0.9*0.5 = 0.475
Coin-Tossing Example - backward
Numeric example: 3 tosses; outcomes: head, head, tail.
b(si) = P(xi+1,…,xL | si) = Σ_{si+1} P(si+1 | si) P(xi+1 | si+1) b(si+1)
{Step 1}
P(x3=tail | s2=loaded) = 0.275
P(x3=tail | s2=fair) = 0.475
{Step 2}
P(x2=head, x3=tail | s1=loaded) = P(loaded2 | loaded1) P(head | loaded2)*0.275 + P(fair2 | loaded1) P(head | fair2)*0.475 = 0.9*0.75*0.275 + 0.1*0.5*0.475 ≈ 0.209
P(x2=head, x3=tail | s1=fair) = P(loaded2 | fair1) P(head | loaded2)*0.275 + P(fair2 | fair1) P(head | fair2)*0.475 = 0.1*0.75*0.275 + 0.9*0.5*0.475 ≈ 0.234
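Again, the backward sketch from earlier reproduces these values:

```python
b = backward(['head', 'head', 'tail'])
print(b[1])   # {'fair': 0.475, 'loaded': 0.275}
print(b[0])   # {'fair': 0.234375, 'loaded': 0.209375}
```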
The MAP query in HMM
[Figure: the HMM chain S1,…,SL over X1,…,XL.]
1. Recall that the likelihood-of-evidence query is to compute
   P(x1,…,xL) = Σ_{(s1,…,sL)} P(x1,…,xL, s1,…,sL)
2. Now we wish to compute a similar quantity:
   P*(x1,…,xL) = max_{(s1,…,sL)} P(x1,…,xL, s1,…,sL)
And, of course, we wish to find a MAP assignment (s1*,…,sL*) that brings about this maximum.
Example: Revisiting likelihood of evidence
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
P(x1,x2,x3) = Σ_{s1} P(s1) P(x1|s1) Σ_{s2} P(s2|s1) P(x2|s2) Σ_{s3} P(s3|s2) P(x3|s3)
           = Σ_{s1} P(s1) P(x1|s1) Σ_{s2} b(s2) P(s2|s1) P(x2|s2)
           = Σ_{s1} b(s1) P(s1) P(x1|s1)
Example: Computing the MAP assignment
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
Replace sums with taking a maximum:
{Finding the maximum}
maximum = max_{s1} P(s1) P(x1|s1) max_{s2} P(s2|s1) P(x2|s2) max_{s3} P(s3|s2) P(x3|s3)
        = max_{s1} P(s1) P(x1|s1) max_{s2} b_{s3}(s2) P(s2|s1) P(x2|s2)
        = max_{s1} b_{s2}(s1) P(s1) P(x1|s1)
{Finding the MAP assignment}
s1* = argmax_{s1} b_{s2}(s1) P(s1) P(x1|s1)
s2* = x*_{s2}(s1*);  s3* = x*_{s3}(s2*)
where x*_{s2}(s1) and x*_{s3}(s2) record the maximizing values stored along the way.
Viterbi’s algorithm
[Figure: the HMM chain S1,…,SL over X1,…,XL.]
Backward phase:
b_{sL+1}(sL) = 1
For i = L-1 down to 1 do
  b_{si+1}(si) = max_{si+1} P(si+1 | si) P(xi+1 | si+1) b_{si+2}(si+1)
  x*_{si+1}(si) = argmax_{si+1} P(si+1 | si) P(xi+1 | si+1) b_{si+2}(si+1)
(storing the best value as a function of the parent's values)
Forward phase (tracing the MAP assignment):
s1* = argmax_{s1} P(s1) P(x1|s1) b_{s2}(s1)
For i = 1 to L-1 do
  si+1* = x*_{si+1}(si*)
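One possible Python rendering of the two phases, reusing the model tables assumed earlier (a sketch; the indexing below is 0-based, not the only way to arrange it):

```python
def viterbi(xs, prior=PRIOR, trans=TRANS, emit=EMIT, states=STATES):
    """Backward phase: compute b(s_i) and the argmax pointers x*;
    forward phase: trace the MAP assignment."""
    L = len(xs)
    b = [None] * L
    b[L - 1] = {s: 1.0 for s in states}    # b_{s_{L+1}}(s_L) = 1
    best_next = [None] * L                 # best_next[i][s] = x*_{s_{i+1}}(s_i)
    for i in range(L - 2, -1, -1):         # i = L-1 down to 1 (0-based)
        b[i], best_next[i] = {}, {}
        for s in states:
            scores = {t: trans[s][t] * emit[t][xs[i + 1]] * b[i + 1][t]
                      for t in states}
            best_next[i][s] = max(scores, key=scores.get)
            b[i][s] = scores[best_next[i][s]]
    # Forward phase: choose s1*, then follow the stored pointers.
    start = {s: prior[s] * emit[s][xs[0]] * b[0][s] for s in states}
    path = [max(start, key=start.get)]
    for i in range(L - 1):
        path.append(best_next[i][path[-1]])
    return start[path[0]], path
```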
Coin-Tossing Example - Viterbi’s algorithm
A reminder of the model:
Start: P(s1=fair) = P(s1=loaded) = 1/2
Transitions: P(fair→fair) = P(loaded→loaded) = 0.9, P(fair→loaded) = P(loaded→fair) = 0.1
Emissions: fair coin: P(head) = 1/2, P(tail) = 1/2; loaded coin: P(head) = 3/4, P(tail) = 1/4
[Figure: the state-transition diagram and the HMM chain S1,…,SL (Fair/Loaded) over X1,…,XL (Head/Tail) for L tosses.]
Query: what are the most likely values of the S-nodes to have generated the given data?
Coin-Tossing Example - Viterbi’s algorithm
Numeric example: 3 tosses; outcomes: head, head, tail.
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
Brute force: compute P(x1,x2,x3, s1,s2,s3) = P(x1,x2,x3 | s1,s2,s3) P(s1,s2,s3) for every assignment of (s1,s2,s3):

s1,s2,s3   P(x1,x2,x3, s1,s2,s3)
F,F,F      (0.5)^3 * 0.5 * (0.9)^2             = 0.050625
F,F,L      (0.5)^2 * 0.25 * 0.5 * 0.9 * 0.1    = 0.0028125
F,L,F      0.5 * 0.75 * 0.5 * 0.5 * 0.1 * 0.1  = 0.0009375
F,L,L      0.5 * 0.75 * 0.25 * 0.5 * 0.1 * 0.9 ≈ 0.00422
L,F,F      0.75 * 0.5 * 0.5 * 0.5 * 0.1 * 0.9  = 0.0084375
L,F,L      0.75 * 0.5 * 0.25 * 0.5 * 0.1 * 0.1 ≈ 0.000469
L,L,F      0.75 * 0.75 * 0.5 * 0.5 * 0.9 * 0.1 ≈ 0.01266
L,L,L      0.75 * 0.75 * 0.25 * 0.5 * 0.9 * 0.9 ≈ 0.0569   ← max
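For three tosses the table above can be checked directly by enumerating all 2^3 assignments (a sketch using the model tables from earlier):

```python
from itertools import product

def joint(xs, ss):
    """P(x1,...,xL, s1,...,sL) for one complete assignment ss."""
    p = PRIOR[ss[0]] * EMIT[ss[0]][xs[0]]
    for prev, s, x in zip(ss, ss[1:], xs[1:]):
        p *= TRANS[prev][s] * EMIT[s][x]
    return p

xs = ['head', 'head', 'tail']
table = {ss: joint(xs, ss) for ss in product(STATES, repeat=3)}
best = max(table, key=table.get)
print(best, table[best])   # ('loaded', 'loaded', 'loaded') 0.056953125
```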
Coin-Tossing Example - Viterbi’s algorithm
Numeric example: 3 tosses; outcomes: head, head, tail.
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
Backward phase:
b_{s4}(s3) = 1
b_{s3}(s2) = max_{s3} P(s3 | s2) P(x3 | s3) b_{s4}(s3)
b_{s3}(s2=fair) = max{P(loaded3 | fair2) P(tail | loaded3), P(fair3 | fair2) P(tail | fair3)}
               = max{0.1*0.25, 0.9*0.5} = 0.45   (argmax: s3 = fair)
b_{s3}(s2=loaded) = max{P(loaded3 | loaded2) P(tail | loaded3), P(fair3 | loaded2) P(tail | fair3)}
                 = max{0.9*0.25, 0.1*0.5} = 0.225   (argmax: s3 = loaded)
Coin-Tossing Example - Viterbi’s algorithm
Numeric example: 3 tosses; outcomes: head, head, tail.
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
Backward phase:
b_{s2}(s1) = max_{s2} P(s2 | s1) P(x2 | s2) b_{s3}(s2)
b_{s2}(s1=fair) = max{P(loaded2 | fair1) P(head | loaded2)*0.225, P(fair2 | fair1) P(head | fair2)*0.45}
               = max{0.1*0.75*0.225, 0.9*0.5*0.45} = 0.2025   (argmax: s2 = fair)
b_{s2}(s1=loaded) = max{P(loaded2 | loaded1) P(head | loaded2)*0.225, P(fair2 | loaded1) P(head | fair2)*0.45}
                 = max{0.9*0.75*0.225, 0.1*0.5*0.45} = 0.151875   (argmax: s2 = loaded)
Coin-Tossing Example - Viterbi’s algorithm
Numeric example: 3 tosses; outcomes: head, head, tail.
[Figure: the 3-slot HMM chain S1, S2, S3 over X1, X2, X3.]
Forward phase:
s1* = argmax_{s1} P(s1) P(x1|s1) b_{s2}(s1)
    = argmax{P(loaded) P(head | loaded)*0.151875, P(fair) P(head | fair)*0.2025} = loaded
s2* = argmax_{s2} P(s2 | loaded1) P(head | s2) b_{s3}(s2)
    = argmax{P(loaded2 | loaded1) P(head | loaded2)*0.225, P(fair2 | loaded1) P(head | fair2)*0.45} = loaded
s3* = argmax_{s3} P(s3 | loaded2) P(tail | s3) b_{s4}(s3)
    = argmax{P(loaded3 | loaded2) P(tail | loaded3), P(fair3 | loaded2) P(tail | fair3)} = loaded
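The Viterbi sketch from earlier agrees with both this hand computation and the brute-force table:

```python
p, path = viterbi(['head', 'head', 'tail'])
print(path, p)   # ['loaded', 'loaded', 'loaded'] 0.056953125
```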