Markov Chains and Hidden Markov Models
Marjolijn Elsinga & Elze de Groot
Andrei A. Markov
Born: 14 June 1856 in Ryazan, Russia
Died: 20 July 1922 in Petrograd, Russia
Graduate of Saint Petersburg University (1878)
Work: number theory and analysis, continued fractions, limits of integrals, approximation theory and the convergence of series
Today's topics
Markov chains
Hidden Markov models
- Viterbi Algorithm
- Forward Algorithm
- Backward Algorithm
- Posterior Probabilities
Markov Chains (1)
Emitting states
Markov Chains (2)
Transition probabilities
Probability of the sequence
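A sketch of the two quantities in standard notation (reconstructed; the slide's own formulas are not in the extracted text):

```latex
% Transition probability from symbol s to symbol t
a_{st} = P(x_i = t \mid x_{i-1} = s)

% Probability of a whole sequence x of length L
P(x) = P(x_1) \prod_{i=2}^{L} a_{x_{i-1} x_i}
```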
Key property of Markov Chains
The probability of a symbol x_i depends only on the value of the preceding symbol x_{i-1}
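Written out as a formula, this first-order Markov property reads:

```latex
P(x_i \mid x_{i-1}, \ldots, x_1) = P(x_i \mid x_{i-1})
```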
Begin and End states
Silent states
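In the usual treatment (assumed here), begin and end are modelled as one silent state 0 that emits no symbol:

```latex
% Probability of starting with symbol s
P(x_1 = s) = a_{0s}

% Probability of ending after symbol t
P(\mathrm{end} \mid x_L = t) = a_{t0}
```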
Example: CpG Islands
CpG = Cytosine – phosphodiester bond – Guanine
100 – 1000 bases long
Cytosine is modified by methylation
Methylation is suppressed in short stretches of the genome (the start regions of genes)
Methylated cytosine has a high chance of mutating into a thymine (T)
Two questions
How would we decide if a short stretch of genomic sequence comes from a CpG island or not?
How would we find, given a long piece of sequence, the CpG islands in it, if there are any?
Discrimination
48 putative CpG islands are extracted
Derive 2 models
- regions labelled as CpG island (‘+’ model)
- regions from the remainder (‘-’ model)
Transition probabilities are set from the counts
- where c_st^+ is the number of times letter t follows letter s in the ‘+’ regions
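The resulting maximum likelihood estimates (and analogously for the ‘-’ model):

```latex
a^{+}_{st} = \frac{c^{+}_{st}}{\sum_{t'} c^{+}_{st'}}
```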
Maximum Likelihood Estimators
Each row sums to 1
Tables are asymmetric
Log-odds ratio
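The log-odds score of a sequence x under the two models (reconstructed in standard notation; x_0 denotes the begin state):

```latex
S(x) = \log \frac{P(x \mid \text{model}^{+})}{P(x \mid \text{model}^{-})}
     = \sum_{i=1}^{L} \log \frac{a^{+}_{x_{i-1} x_i}}{a^{-}_{x_{i-1} x_i}}
```

A positive score means the sequence is more likely under the ‘+’ (CpG island) model.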
Discrimination shown
Simulation: ‘+’ model
Simulation: ‘-’ model
Today's topics
Markov chains
Hidden Markov models
- Viterbi Algorithm
- Forward Algorithm
- Backward Algorithm
- Posterior Probabilities
Hidden Markov Models (HMM) (1)
No one-to-one correspondence between states and symbols
No longer possible to say what state the model is in when it emits x_i
Transition probability from state k to l
π_i is the ith state in the path (state sequence)
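In symbols, the transition probability is:

```latex
a_{kl} = P(\pi_i = l \mid \pi_{i-1} = k)
```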
Hidden Markov Models (HMM) (2)
Begin state: a_0k
End state: a_k0
In the CpG islands example: states A+, C+, G+, T+ and A-, C-, G-, T-
Hidden Markov Models (HMM) (3)
We need a new set of parameters because we have decoupled symbols from states
Probability that symbol b is seen when in state k:
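In the standard notation:

```latex
e_k(b) = P(x_i = b \mid \pi_i = k)
```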
Example: dishonest casino (1)
Fair die and loaded die
Loaded die: probability 0.5 of a 6 and probability 0.1 for 1-5
Switch from fair to loaded: probability 0.05
Switch back: probability 0.1
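A minimal sketch of this model in Python (all variable names are my own):

```python
# States: F = fair die, L = loaded die
states = ["F", "L"]

# Transition probabilities (each row sums to 1)
trans = {"F": {"F": 0.95, "L": 0.05},   # switch fair -> loaded: 0.05
         "L": {"F": 0.10, "L": 0.90}}   # switch loaded -> fair: 0.1

# Emission probabilities for the six faces
emit = {"F": {face: 1 / 6 for face in "123456"},
        "L": {**{face: 0.1 for face in "12345"}, "6": 0.5}}
```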
Dishonest casino (2)
Emission probabilities: an HMM is a model that generates or emits sequences
Dishonest casino (3)
Hidden: you don’t know if the die is fair or loaded
Joint probability of observed sequence x and state sequence π:
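With a_{0π_1} the begin transition, the joint probability is:

```latex
P(x, \pi) = a_{0 \pi_1} \prod_{i=1}^{L} e_{\pi_i}(x_i)\, a_{\pi_i \pi_{i+1}}
```

where π_{L+1} is taken to be the end state if one is modelled.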
Three algorithms
What is the most probable path for generating a given sequence?
Viterbi Algorithm
How likely is a given sequence?
Forward Algorithm
How can we learn the HMM parameters given a set of sequences?
Forward-Backward (Baum-Welch) Algorithm
Viterbi Algorithm
CGCG can be generated in different ways, and with different probabilities
Choose the path with the highest probability
The most probable path can be found recursively
Viterbi Algorithm (2)
v_k(i) = probability of the most probable path ending in state k with observation x_i
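The recursion in full:

```latex
% Initialisation: the path starts in the begin state 0
v_0(0) = 1, \qquad v_k(0) = 0 \text{ for } k > 0

% Recursion
v_l(i+1) = e_l(x_{i+1}) \max_k \bigl( v_k(i)\, a_{kl} \bigr)

% Termination
P(x, \pi^{*}) = \max_k \bigl( v_k(L)\, a_{k0} \bigr)
```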
Viterbi Algorithm (3)
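A runnable sketch of the recursion, kept in plain probability space for readability (see the underflow remark later); it assumes the `states`/`trans`/`emit` dictionaries from the casino sketch above plus a `start` distribution:

```python
def viterbi(obs, states, start, trans, emit):
    """Most probable state path for the observation sequence obs."""
    v = {k: start[k] * emit[k][obs[0]] for k in states}  # v_k(1)
    ptrs = []                                            # back-pointers per position
    for sym in obs[1:]:
        nxt, ptr = {}, {}
        for l in states:
            # best predecessor state for l
            best = max(states, key=lambda k: v[k] * trans[k][l])
            ptr[l] = best
            nxt[l] = v[best] * trans[best][l] * emit[l][sym]
        ptrs.append(ptr)
        v = nxt
    # traceback from the most probable final state
    path = [max(states, key=lambda k: v[k])]
    for ptr in reversed(ptrs):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# e.g. viterbi("316666462", states, {"F": 0.5, "L": 0.5}, trans, emit)
```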
Viterbi Algorithm
Most probable path for CGCG
Viterbi Algorithm
Result with casino example
Three algorithms
What is the most probable path for generating a given sequence?
Viterbi Algorithm
How likely is a given sequence?
Forward Algorithm
How can we learn the HMM parameters given a set of sequences?
Forward-Backward (Baum-Welch) Algorithm
Forward Algorithm (1)
Probability over all possible paths
The number of possible paths increases exponentially with the length of the sequence
The forward algorithm enables us to compute this efficiently
Forward Algorithm (2)
Replace the maximisation steps of the Viterbi algorithm with sums
Probability of the observed sequence up to and including x_i, requiring π_i = k
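The definition and recursion (the max of Viterbi becomes a sum):

```latex
f_k(i) = P(x_1 \ldots x_i,\ \pi_i = k)

% Recursion
f_l(i+1) = e_l(x_{i+1}) \sum_k f_k(i)\, a_{kl}

% Termination
P(x) = \sum_k f_k(L)\, a_{k0}
```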
Forward Algorithm (3)
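A sketch mirroring the Viterbi code above, with max replaced by a sum (same assumed parameter dictionaries):

```python
def forward(obs, states, start, trans, emit):
    """Total probability of obs, summed over all state paths."""
    f = {k: start[k] * emit[k][obs[0]] for k in states}  # f_k(1)
    for sym in obs[1:]:
        f = {l: emit[l][sym] * sum(f[k] * trans[k][l] for k in states)
             for l in states}
    return sum(f.values())
```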
Three algorithms
What is the most probable path for generating a given sequence?
Viterbi Algorithm
How likely is a given sequence?
Forward Algorithm
How can we learn the HMM parameters given a set of sequences?
Forward-Backward (Baum-Welch) Algorithm
Backward Algorithm (1)
Probability of the observed sequence from x_{i+1} to the end of the sequence, requiring π_i = k
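The definition, recursion and initialisation (computed from the end of the sequence backwards):

```latex
b_k(i) = P(x_{i+1} \ldots x_L \mid \pi_i = k)

% Recursion
b_k(i) = \sum_l a_{kl}\, e_l(x_{i+1})\, b_l(i+1)

% Initialisation (use b_k(L) = 1 if no end state is modelled)
b_k(L) = a_{k0}
```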
Disadvantage of the algorithms
Multiplying many probabilities gives very small numbers, which can lead to underflow errors on the computer
This can be solved by running the algorithms in log space, calculating log(v_l(i))
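For the Viterbi algorithm the products simply become sums of logs; the forward and backward sums additionally need a log-sum-exp, sketched here:

```python
import math

def log_sum_exp(log_values):
    """log(sum(exp(v) for v in log_values)) without underflow."""
    m = max(log_values)
    if m == -math.inf:   # all underlying probabilities are zero
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in log_values))
```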
Backward Algorithm
Posterior State Probability (1)
Probability that observation x_i came from state k, given the observed sequence
Posterior probability of state k at time i when the emitted sequence is known: P(π_i = k | x)
Posterior State Probability (2)
First calculate the probability of producing the entire observed sequence with the ith symbol being produced by state k:
P(x, π_i = k) = f_k(i) · b_k(i)
Posterior State Probability (3)
Posterior probabilities will then be:
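Dividing the joint probability above by P(x) gives:

```latex
P(\pi_i = k \mid x) = \frac{f_k(i)\, b_k(i)}{P(x)}
```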
P(x) is the result of the forward or backward calculation
Posterior Probabilities (4)
For the casino example
Two questions
How would we decide if a short stretch of genomic sequence comes from a CpG island or not?
How would we find, given a long piece of sequence, the CpG islands in it, if there are any?
Prediction of CpG islands
First way: Viterbi Algorithm
- Find the most probable path through the model
- When this path goes through the ‘+’ states, a CpG island is predicted
Prediction of CpG islands
Second way: Posterior Decoding
- function: G(i|x) = Σ_k P(π_i = k | x) g(k)
- g(k) = 1 for k ∈ {A+, C+, G+, T+}
- g(k) = 0 for k ∈ {A-, C-, G-, T-}
G(i|x) is the posterior probability, according to the model, that base i is in a CpG island
Summary (1)
A Markov chain is a collection of states where each state depends only on the state before it
A hidden Markov model is a model in which the state sequence is ‘hidden’
Summary (2)
Most probable path: Viterbi algorithm
How likely is a given sequence?: forward algorithm
Posterior state probability: forward and backward algorithms (used to find the most probable state of an observation)