Physical Fluctuomatics / Applied Stochastic

Physical Fluctuomatics
12th Bayesian network and belief propagation in
statistical inference
Kazuyuki Tanaka
Graduate School of Information Sciences, Tohoku University
http://www.smapip.is.tohoku.ac.jp/~kazu/
Physics Fluctuomatics (Tohoku
University)
Textbooks
Kazuyuki Tanaka: Mathematics of Statistical
Inference by Bayesian Network, Corona
Publishing Co., Ltd., October 2009 (in
Japanese).
Fundamental Probabilistic Theory for
Image Processing
Joint Probability and Conditional Probability
Probability of event A=a:
$$\Pr\{A=a\}$$
Joint probability of events A=a and B=b:
$$\Pr\{A=a, B=b\} \equiv \Pr\{(A=a) \cap (B=b)\}$$
Conditional probability of event B=b given that event A=a has happened:
$$\Pr\{B=b \mid A=a\} \equiv \frac{\Pr\{A=a, B=b\}}{\Pr\{A=a\}}$$
$$\Rightarrow \quad \Pr\{A=a, B=b\} = \Pr\{B=b \mid A=a\}\,\Pr\{A=a\}$$
(Figure: graph with nodes A and B.)
Fundamental Probabilistic Theory for
Image Processing
Marginal Probability of Event B
Marginalization:
$$\Pr\{B=b\} = \sum_{a}\sum_{c}\sum_{d} \Pr\{A=a, B=b, C=c, D=d\}$$
(Figure: graph with nodes A, B, C and D.)
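Marginalization amounts to a sum over all variables except the one of interest. The sketch below assumes a hypothetical four-variable joint table built from random weights; the table `joint4` and the function `marginal_b` are illustrative names only.

```python
import itertools
import random

random.seed(0)
states = ("T", "F")

# Hypothetical joint table Pr{A,B,C,D}: random positive weights, normalized.
weights = {k: random.random() for k in itertools.product(states, repeat=4)}
total = sum(weights.values())
joint4 = {k: w / total for k, w in weights.items()}

def marginal_b(b):
    """Pr{B=b}: sum the joint over a, c and d with b held fixed."""
    return sum(joint4[(a, b, c, d)]
               for a in states for c in states for d in states)

p_b = {b: marginal_b(b) for b in states}
```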
Fundamental Probabilistic Theory for
Image Processing
Causal Independence
$$\Pr\{C=c \mid A=a, B=b\} = \frac{\Pr\{A=a, B=b, C=c\}}{\Pr\{A=a, B=b\}}$$
If
$$\Pr\{C=c \mid A=\mathrm{T}, B=b\} = \Pr\{C=c \mid A=\mathrm{F}, B=b\}$$
for all $(b,c) \in \{(\mathrm{T},\mathrm{T}), (\mathrm{T},\mathrm{F}), (\mathrm{F},\mathrm{T}), (\mathrm{F},\mathrm{F})\}$, then
$$\Pr\{C=c \mid A=a, B=b\} = \Pr\{C=c \mid B=b\}$$
(Figure: graphs with nodes A, B and C.)
Fundamental Probabilistic Theory for
Image Processing
Causal Independence
$$\Pr\{D=d \mid A=a, B=b, C=c\} = \frac{\Pr\{A=a, B=b, C=c, D=d\}}{\Pr\{A=a, B=b, C=c\}}$$
If
$$\Pr\{D=d \mid A=\mathrm{T}, B=\mathrm{T}, C=c\} = \Pr\{D=d \mid A=\mathrm{T}, B=\mathrm{F}, C=c\}$$
$$= \Pr\{D=d \mid A=\mathrm{F}, B=\mathrm{T}, C=c\} = \Pr\{D=d \mid A=\mathrm{F}, B=\mathrm{F}, C=c\}$$
for all $(c,d) \in \{(\mathrm{T},\mathrm{T}), (\mathrm{T},\mathrm{F}), (\mathrm{F},\mathrm{T}), (\mathrm{F},\mathrm{F})\}$, then
$$\Pr\{D=d \mid A=a, B=b, C=c\} = \Pr\{D=d \mid C=c\}$$
(Figure: graphs with nodes A, B, C and D.)
Fundamental Probabilistic Theory
for Image Processing
$$\Pr\{A=a, B=b, C=c\} = \Pr\{C=c \mid A=a, B=b\}\,\Pr\{A=a, B=b\}$$
$$= \Pr\{C=c \mid A=a, B=b\}\,\Pr\{B=b \mid A=a\}\,\Pr\{A=a\}$$
$$= \Pr\{C=c \mid B=b\}\,\Pr\{B=b \mid A=a\}\,\Pr\{A=a\}$$
The last step uses causal independence:
$$\Pr\{C=c \mid A=a, B=b\} = \Pr\{C=c \mid B=b\}$$
The conditional probability of A given C then follows by marginalization:
$$\Pr\{A=a \mid C=c\} = \frac{\Pr\{A=a, C=c\}}{\Pr\{C=c\}} = \frac{\sum_{b} \Pr\{A=a, B=b, C=c\}}{\sum_{a}\sum_{b} \Pr\{A=a, B=b, C=c\}}$$
(Figure: graph with nodes A, B and C, labelled with $\Pr\{B=b \mid A=a\}$.)
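The factorization of the three-node chain and the resulting expression for $\Pr\{A=a \mid C=c\}$ can be traced numerically. This is a minimal sketch assuming hypothetical probability tables for A, B and C; none of the numbers come from the lecture.

```python
states = (True, False)

# Hypothetical tables for the chain A -> B -> C.
p_a = {True: 0.3, False: 0.7}                     # Pr{A=a}
p_b_given_a = {True: {True: 0.9, False: 0.1},     # p_b_given_a[a][b]
               False: {True: 0.2, False: 0.8}}
p_c_given_b = {True: {True: 0.6, False: 0.4},     # p_c_given_b[b][c]
               False: {True: 0.1, False: 0.9}}

def joint(a, b, c):
    """Pr{A=a,B=b,C=c} = Pr{C=c|B=b} Pr{B=b|A=a} Pr{A=a}."""
    return p_c_given_b[b][c] * p_b_given_a[a][b] * p_a[a]

def cond_c_given_ab(c, a, b):
    """Pr{C=c | A=a, B=b}, computed from the joint table."""
    return joint(a, b, c) / sum(joint(a, b, c2) for c2 in states)

def posterior_a_given_c(a, c):
    """Pr{A=a | C=c}: sum over b in the numerator, over a and b below."""
    num = sum(joint(a, b, c) for b in states)
    den = sum(joint(a2, b, c) for a2 in states for b in states)
    return num / den
```

With these tables, `cond_c_given_ab(c, a, b)` does not depend on `a`, which is the causal independence used in the derivation.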
Fundamental Probabilistic Theory
for Image Processing
$$\Pr\{A=a, B=b, C=c\} = \Pr\{C=c \mid B=b\}\,\Pr\{B=b \mid A=a\}\,\Pr\{A=a\}$$
$$= f_{\{A,B\}}(a,b)\, f_{\{B,C\}}(b,c)$$
$$f_{\{B,C\}}(b,c) \equiv \Pr\{C=c \mid B=b\}$$
$$f_{\{A,B\}}(a,b) \equiv \Pr\{B=b \mid A=a\}\,\Pr\{A=a\}$$
(Figure: directed graph and undirected graph on nodes A, B and C.)
Simple Example of Bayesian Networks

$$\Pr\{A_R=\mathrm{T} \mid A_W=\mathrm{T}\}, \quad \Pr\{A_S=\mathrm{T} \mid A_W=\mathrm{T}\} \ ?$$
Simple Example of Bayesian Networks
$$\Pr\{A_C=a_C, A_S=a_S, A_R=a_R, A_W=a_W\}$$
$$= \Pr\{A_W=a_W \mid A_C=a_C, A_S=a_S, A_R=a_R\}\,\Pr\{A_C=a_C, A_S=a_S, A_R=a_R\}$$
$$= \Pr\{A_W=a_W \mid A_C=a_C, A_S=a_S, A_R=a_R\}\,\Pr\{A_R=a_R \mid A_C=a_C, A_S=a_S\}\,\Pr\{A_C=a_C, A_S=a_S\}$$
$$= \Pr\{A_W=a_W \mid A_C=a_C, A_S=a_S, A_R=a_R\}\,\Pr\{A_R=a_R \mid A_C=a_C, A_S=a_S\}\,\Pr\{A_S=a_S \mid A_C=a_C\}\,\Pr\{A_C=a_C\}$$
$$= \Pr\{A_W=a_W \mid A_S=a_S, A_R=a_R\}\,\Pr\{A_R=a_R \mid A_C=a_C\}\,\Pr\{A_S=a_S \mid A_C=a_C\}\,\Pr\{A_C=a_C\}$$
Simple Example of Bayesian Networks
$$\Pr\{A_R=\mathrm{T} \mid A_W=\mathrm{T}\} = \frac{\Pr\{A_R=\mathrm{T}, A_W=\mathrm{T}\}}{\Pr\{A_W=\mathrm{T}\}}$$
$$\Pr\{A=a, B=b, C=c\} = \Pr\{C=c \mid B=b\}\,\Pr\{B=b \mid A=a\}\,\Pr\{A=a\} = f_{\{A,B\}}(a,b)\, f_{\{B,C\}}(b,c)$$
Simple Example of Bayesian Networks
$$\Pr\{A_R=\mathrm{T}, A_W=\mathrm{T}\} = \sum_{a_C=\mathrm{T,F}}\ \sum_{a_S=\mathrm{T,F}} \Pr\{A_C=a_C, A_S=a_S, A_R=\mathrm{T}, A_W=\mathrm{T}\} = 0.4581$$
$$\Pr\{A_S=\mathrm{T}, A_W=\mathrm{T}\} = \sum_{a_C=\mathrm{T,F}}\ \sum_{a_R=\mathrm{T,F}} \Pr\{A_C=a_C, A_S=\mathrm{T}, A_R=a_R, A_W=\mathrm{T}\} = 0.2781$$
$$\Pr\{A_W=\mathrm{T}\} = \sum_{a_C=\mathrm{T,F}}\ \sum_{a_S=\mathrm{T,F}}\ \sum_{a_R=\mathrm{T,F}} \Pr\{A_C=a_C, A_S=a_S, A_R=a_R, A_W=\mathrm{T}\} = 0.6471$$
Simple Example of Bayesian Networks
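$$\Pr\{A_R=\mathrm{T} \mid A_W=\mathrm{T}\} = \frac{\Pr\{A_R=\mathrm{T}, A_W=\mathrm{T}\}}{\Pr\{A_W=\mathrm{T}\}} = \frac{0.4581}{0.6471} = 0.7079$$
$$\Pr\{A_S=\mathrm{T} \mid A_W=\mathrm{T}\} = \frac{\Pr\{A_S=\mathrm{T}, A_W=\mathrm{T}\}}{\Pr\{A_W=\mathrm{T}\}} = \frac{0.2781}{0.6471} = 0.4298$$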
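The marginals and posteriors of this four-node example can be reproduced by brute-force summation over the factorized joint probability. The slides do not list the probability tables, so the values below are an assumption: they are the tables commonly used for the textbook "sprinkler" network (C, S, R, W read as cloudy, sprinkler, rain, wet grass), and they happen to give the same numbers as the slides. Treat the sketch as an illustration rather than the lecture's own data.

```python
from itertools import product

TF = (True, False)

# ASSUMPTION: tables not given in the slides; common sprinkler-network values.
p_c = {True: 0.5, False: 0.5}                     # Pr{A_C = c}
p_s_given_c = {True: 0.1, False: 0.5}             # Pr{A_S = T | A_C = c}
p_r_given_c = {True: 0.8, False: 0.2}             # Pr{A_R = T | A_C = c}
p_w_given_sr = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}  # Pr{A_W = T | s, r}

def bern(p_true, value):
    """Probability of `value` for a binary variable with Pr{T} = p_true."""
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """Pr{A_C,A_S,A_R,A_W} = Pr{W|S,R} Pr{R|C} Pr{S|C} Pr{C}."""
    return (bern(p_w_given_sr[(s, r)], w) * bern(p_r_given_c[c], r)
            * bern(p_s_given_c[c], s) * p_c[c])

p_w = sum(joint(c, s, r, True) for c, s, r in product(TF, repeat=3))
p_rw = sum(joint(c, s, True, True) for c, s in product(TF, repeat=2))
p_sw = sum(joint(c, True, r, True) for c, r in product(TF, repeat=2))

print(round(p_rw / p_w, 4), round(p_sw / p_w, 4))  # prints 0.7079 0.4298
```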
Simple Example of Bayesian Networks
$$\Pr\{A_C=a_C, A_S=a_S, A_R=a_R, A_W=a_W\}$$
$$= \Pr\{A_W=a_W \mid A_S=a_S, A_R=a_R\}\,\Pr\{A_R=a_R \mid A_C=a_C\}\,\Pr\{A_S=a_S \mid A_C=a_C\}\,\Pr\{A_C=a_C\}$$
$$= f_{\{W,S,R\}}(a_W, a_S, a_R)\, f_{\{R,C\}}(a_R, a_C)\, f_{\{S,C\}}(a_S, a_C)$$
$$f_{\{W,S,R\}}(a_W, a_S, a_R) \equiv \Pr\{A_W=a_W \mid A_S=a_S, A_R=a_R\}$$
$$f_{\{R,C\}}(a_R, a_C) \equiv \Pr\{A_R=a_R \mid A_C=a_C\}$$
$$f_{\{S,C\}}(a_S, a_C) \equiv \Pr\{A_S=a_S \mid A_C=a_C\}\,\Pr\{A_C=a_C\}$$
(Figure: directed graph and undirected graph on nodes C, S, R and W.)
Belief Propagation for Bayesian Networks
Belief propagation does not, in general, give exact
results for Bayesian networks on graphs with cycles.
Applied to such networks, however, it provides
many powerful approximate computational models
and practical algorithms for probabilistic
information processing.
Simple Example of Bayesian Networks
$$\Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$= \Pr\{X_8=x_8 \mid X_5=x_5, X_6=x_6\}\,\Pr\{X_7=x_7 \mid X_6=x_6\}\,\Pr\{X_6=x_6 \mid X_3=x_3, X_4=x_4\}$$
$$\times \Pr\{X_5=x_5 \mid X_2=x_2\}\,\Pr\{X_4=x_4 \mid X_2=x_2\}\,\Pr\{X_3=x_3 \mid X_1=x_1\}\,\Pr\{X_1=x_1\}\,\Pr\{X_2=x_2\}$$
$$\Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$\equiv W_{568}(x_5, x_6, x_8)\, W_{346}(x_3, x_4, x_6)\, W_{67}(x_6, x_7)\, W_{25}(x_2, x_5)\, W_{24}(x_2, x_4)\, W_{13}(x_1, x_3)$$
Joint Probability of Probabilistic Model
with Graphical Representation including
Cycles
$$\Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$\equiv W_{568}(x_5, x_6, x_8)\, W_{346}(x_3, x_4, x_6)\, W_{67}(x_6, x_7)\, W_{25}(x_2, x_5)\, W_{24}(x_2, x_4)\, W_{13}(x_1, x_3)$$
(Figure: directed graph on nodes 1 to 8 and the corresponding undirected hypergraph with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$.)
Marginal Probability Distributions
$$P_i(x_i) \equiv \Pr\{X_i=x_i\} = \sum_{x \setminus x_i} \Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$P_{ij}(x_i, x_j) \equiv \Pr\{X_i=x_i, X_j=x_j\} = \sum_{x \setminus \{x_i, x_j\}} \Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$P_{ijk}(x_i, x_j, x_k) \equiv \Pr\{X_i=x_i, X_j=x_j, X_k=x_k\} = \sum_{x \setminus \{x_i, x_j, x_k\}} \Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
(Figure: hypergraph on nodes 1 to 8 with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$.)
Approximate Representations
of Marginal Probability Distributions in terms of Messages
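$$P_{346}(x_3, x_4, x_6) \simeq \frac{W_{346}(x_3, x_4, x_6)\, M_{3\leftarrow 13}(x_3)\, M_{4\leftarrow 24}(x_4)\, M_{6\leftarrow 67}(x_6)\, M_{6\leftarrow 568}(x_6)}{\sum_{x_3}\sum_{x_4}\sum_{x_6} W_{346}(x_3, x_4, x_6)\, M_{3\leftarrow 13}(x_3)\, M_{4\leftarrow 24}(x_4)\, M_{6\leftarrow 67}(x_6)\, M_{6\leftarrow 568}(x_6)}$$
Here $M_{3\leftarrow 13}(x_3)$ denotes the message passed to node 3 from the factor $W_{13}$, and similarly for the other messages.
(Figure: hypergraph on nodes 1 to 8 with the messages $M_{3\leftarrow 13}(x_3)$, $M_{4\leftarrow 24}(x_4)$, $M_{6\leftarrow 67}(x_6)$ and $M_{6\leftarrow 568}(x_6)$ entering the factor $W_{346}$.)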
Approximate Representations
of Marginal Probability Distributions in terms of Messages
$$P_6(x_6) \simeq \frac{M_{6\leftarrow 346}(x_6)\, M_{6\leftarrow 568}(x_6)\, M_{6\leftarrow 67}(x_6)}{\sum_{x_6} M_{6\leftarrow 346}(x_6)\, M_{6\leftarrow 568}(x_6)\, M_{6\leftarrow 67}(x_6)}$$
(Figure: hypergraph on nodes 1 to 8 with the messages $M_{6\leftarrow 346}(x_6)$, $M_{6\leftarrow 67}(x_6)$ and $M_{6\leftarrow 568}(x_6)$ entering node 6.)
Basic Strategies of Belief Propagation in Probabilistic Models
with Graphical Representations Including Cycles
The approximate expressions of the marginal probabilities are required to be consistent under marginalization:
$$P_6(x_6) = \sum_{x_3}\sum_{x_4} P_{346}(x_3, x_4, x_6)$$
(Figure: hypergraph on nodes 1 to 8 with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$, together with the message diagrams for $P_6$ and $P_{346}$.)
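Simultaneous Fixed Point Equations
for Belief Propagation in
Hypergraph Representations
$$M_{6\leftarrow 346}(x_6) = \frac{\sum_{x_3}\sum_{x_4} W_{346}(x_3, x_4, x_6)\, M_{3\leftarrow 13}(x_3)\, M_{4\leftarrow 24}(x_4)}{\sum_{x_3}\sum_{x_4}\sum_{x_6} W_{346}(x_3, x_4, x_6)\, M_{3\leftarrow 13}(x_3)\, M_{4\leftarrow 24}(x_4)}$$
Here $M_{6\leftarrow 346}(x_6)$ is the message passed to node 6 from the factor $W_{346}$.
Belief Propagation Algorithm
(Figure: hypergraph on nodes 1 to 8 showing the messages $M_{3\leftarrow 13}(x_3)$ and $M_{4\leftarrow 24}(x_4)$ entering the factor $W_{346}$ and the message $M_{6\leftarrow 346}(x_6)$ leaving it.)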
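A single update of the message from the factor $W_{346}$ to node 6 is a sum over $x_3$ and $x_4$ followed by normalization over $x_6$. The sketch below assumes binary variables, a hypothetical random table for $W_{346}$ and hypothetical incoming messages; the names `M_3_from_13`, `M_4_from_24` and `update_message_6_from_346` are illustrative only.

```python
from itertools import product
import random

random.seed(1)
S = (0, 1)

# Hypothetical positive factor table W346[(x3, x4, x6)].
W346 = {k: random.random() + 0.1 for k in product(S, repeat=3)}

# Hypothetical incoming messages to nodes 3 and 4.
M_3_from_13 = {0: 0.6, 1: 0.4}
M_4_from_24 = {0: 0.3, 1: 0.7}

def update_message_6_from_346(W, m3, m4):
    """Sum W * m3 * m4 over x3 and x4, then normalize over x6."""
    unnorm = {x6: sum(W[(x3, x4, x6)] * m3[x3] * m4[x4]
                      for x3 in S for x4 in S)
              for x6 in S}
    z = sum(unnorm.values())
    return {x6: v / z for x6, v in unnorm.items()}

M_6_from_346 = update_message_6_from_346(W346, M_3_from_13, M_4_from_24)
```

In a full belief propagation program, one such update would be written for every factor and node pair, and all of them would be iterated together until the messages stop changing.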
Fixed Point Equation and
Iterative Method
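Fixed Point Equation
$$\vec{M}^{*} = \vec{\Phi}(\vec{M}^{*})$$
Iterative Method
$$\vec{M}_1 \leftarrow \vec{\Phi}(\vec{M}_0)$$
$$\vec{M}_2 \leftarrow \vec{\Phi}(\vec{M}_1)$$
$$\vec{M}_3 \leftarrow \vec{\Phi}(\vec{M}_2)$$
$$\vdots$$
(Figure: plot of $y = x$ and $y = \Phi(x)$, showing the iterates $M_0$, $M_1$, $M_2$ approaching the fixed point $M^{*}$.)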
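The iterative method repeatedly applies the update map until successive iterates agree. The following is a minimal scalar sketch; the map `math.cos`, the tolerance and the function name `iterate_fixed_point` are assumptions chosen for illustration, and in belief propagation the iterate would be the whole set of messages rather than one number.

```python
import math

def iterate_fixed_point(phi, m0, tol=1e-12, max_iter=1000):
    """Apply m <- phi(m) until |phi(m) - m| < tol; return (m*, steps)."""
    m = m0
    for step in range(1, max_iter + 1):
        m_next = phi(m)
        if abs(m_next - m) < tol:
            return m_next, step
        m = m_next
    raise RuntimeError("fixed-point iteration did not converge")

# Example: solve m = cos(m) starting from M_0 = 0.
m_star, steps = iterate_fixed_point(math.cos, 0.0)
```

Convergence is not guaranteed for an arbitrary map, which is why the sketch stops with an error after `max_iter` steps instead of looping forever.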
Belief Propagation for Bayesian Networks
Belief propagation can also be applied, as a
powerful approximate algorithm, to Bayesian
networks represented on hypergraphs.
Numerical Experiments
Quantity      Belief Propagation   Exact
P8(1)         0.5607               0.5640
P8(0)         0.4393               0.4360
P58(1,1)      0.4736               0.4776
P58(1,0)      0.0764               0.0724
P58(0,1)      0.0871               0.0864
P58(0,0)      0.3629               0.3636
$$P_8(x_8) \equiv \sum_{x \setminus x_8} \Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
$$P_{58}(x_5, x_8) \equiv \sum_{x \setminus \{x_5, x_8\}} \Pr\{X_1=x_1, X_2=x_2, \ldots, X_8=x_8\}$$
(Figure: hypergraph on nodes 1 to 8 with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$.)
Numerical Experiments
Belief Propagation

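$$\Pr\{X_{\mathrm{Bronchitis}}=\mathrm{Present} \mid X_{\mathrm{Dyspnea}}=\mathrm{Present}\} = \frac{\Pr\{X_{\mathrm{Bronchitis}}=\mathrm{Present}, X_{\mathrm{Dyspnea}}=\mathrm{Present}\}}{\Pr\{X_{\mathrm{Dyspnea}}=\mathrm{Present}\}} = \frac{0.3629}{0.4393} = 0.8261$$
(Figure: hypergraph on nodes 1 to 8 with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$.)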
Linear Response Theory
$$P_{ij}(m, n) - P_i(m)\, P_j(n) = \lim_{h_i \to 0} \left( \frac{\Delta^{(i)} P_j(n)}{h_i} \right)$$
$$x = \{x_i \mid i \in \Omega\}$$
Deviation of the average at node j with respect to the external field at node i:
$$\Delta^{(i)} P_j(n) \equiv \tilde{P}_j(n) - P_j(n) = \sum_{x} \delta_{n, x_j}\, \tilde{P}(x) - \sum_{x} \delta_{n, x_j}\, P(x)$$
$$\tilde{P}(x) \equiv \frac{1}{\tilde{Z}}\, P(x) \exp\left( h_i\, \delta_{x_i, m} \right)$$
(Figure: graph on nodes 1 to 8.)
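The linear response relation can be checked numerically on a tiny model by comparing a finite-difference derivative with the correlation $P_{ij}(m,n) - P_i(m)P_j(n)$. This is a sketch under stated assumptions: a hypothetical joint distribution over just two binary nodes i and j, a small but finite field `h`, and illustrative helper names.

```python
import math

# Hypothetical joint distribution P(x_i, x_j) over two binary nodes.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def perturbed(h, m):
    """P~(x) = P(x) exp(h * delta(x_i, m)) / Z~ ."""
    w = {x: p * math.exp(h if x[0] == m else 0.0) for x, p in P.items()}
    z = sum(w.values())
    return {x: v / z for x, v in w.items()}

def marginal_i(dist, m):
    """P_i(m): probability that node i is in state m."""
    return sum(p for x, p in dist.items() if x[0] == m)

def marginal_j(dist, n):
    """P_j(n): probability that node j is in state n."""
    return sum(p for x, p in dist.items() if x[1] == n)

m, n, h = 1, 1, 1e-6
# Finite-difference estimate of lim_{h -> 0} (P~_j(n) - P_j(n)) / h.
response = (marginal_j(perturbed(h, m), n) - marginal_j(P, n)) / h
# Right-hand side: P_ij(m, n) - P_i(m) P_j(n).
covariance = P[(m, n)] - marginal_i(P, m) * marginal_j(P, n)
```

The two numbers agree up to the finite-difference error, which shrinks with `h`.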
Numerical Experiments

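$$\Pr\{X_{\mathrm{Tuberculosis}}=\mathrm{Present} \mid X_{\mathrm{Dyspnea}}=\mathrm{Present}\} = \frac{\Pr\{X_{\mathrm{Tuberculosis}}=\mathrm{Present}, X_{\mathrm{Dyspnea}}=\mathrm{Present}\}}{\Pr\{X_{\mathrm{Dyspnea}}=\mathrm{Present}\}} = \frac{0.0082}{0.4393} = 0.0187$$
(Figure: hypergraph on nodes 1 to 8 with factors $W_{13}$, $W_{24}$, $W_{25}$, $W_{346}$, $W_{67}$ and $W_{568}$.)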
Summary
Bayesian Network for Probabilistic Inference
Belief Propagation for Bayesian Networks
Practice 11-1
Compute numerically the exact values of the marginal probability Pr{Xi}
for every node i (=1,2,…,8) in the Bayesian network defined by the
following joint probability distribution Pr{X1,X2,…,X8}:
$$\Pr\{X_1, X_2, \ldots, X_8\} = \Pr\{X_8 \mid X_5, X_6\}\,\Pr\{X_7 \mid X_6\}\,\Pr\{X_6 \mid X_3, X_4\}\,\Pr\{X_5 \mid X_2\}\,\Pr\{X_4 \mid X_2\}\,\Pr\{X_3 \mid X_1\}\,\Pr\{X_1\}\,\Pr\{X_2\}$$
Each conditional probability table and probability
table is given in Figure 3.12 and Table 3.11 in
Kazuyuki Tanaka: Mathematics of Statistical
Inference by Bayesian Network, Corona
Publishing Co., Ltd., October 2009.
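Exact marginals of an eight-node binary network can be obtained by summing the factorized joint probability over all 2^8 = 256 configurations. The sketch below only illustrates that enumeration: the textbook tables are not reproduced here, so the conditional probability tables are hypothetical random numbers, and the names `parents`, `cpt` and `joint` are illustrative.

```python
from itertools import product
import random

random.seed(0)

# parents[i] lists the parents of node i in the factorization above.
parents = {1: (), 2: (), 3: (1,), 4: (2,), 5: (2,),
           6: (3, 4), 7: (6,), 8: (5, 6)}

# HYPOTHETICAL tables: cpt[i][parent_states] = Pr{X_i = 1 | parents}.
# Replace these with the textbook values to solve the exercise.
cpt = {i: {ps: random.random() for ps in product((0, 1), repeat=len(pa))}
       for i, pa in parents.items()}

def joint(x):
    """Pr{X_1=x[1], ..., X_8=x[8]} as the product of the eight tables."""
    p = 1.0
    for i, pa in parents.items():
        p1 = cpt[i][tuple(x[k] for k in pa)]
        p *= p1 if x[i] == 1 else 1.0 - p1
    return p

# Accumulate Pr{X_i = x_i} by enumerating every configuration once.
marginals = {i: {0: 0.0, 1: 0.0} for i in parents}
for values in product((0, 1), repeat=8):
    x = dict(zip(range(1, 9), values))
    p = joint(x)
    for i in parents:
        marginals[i][x[i]] += p
```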

Practice 11-2
Write a program to compute approximate values of the
marginal probability Pr{Xi} for every node i (=1,2,…,8) by
using the belief propagation method in the Bayesian network
defined by the following joint probability distribution Pr{X1,X2,…,X8}:
$$\Pr\{X_1, X_2, \ldots, X_8\} = \Pr\{X_8 \mid X_5, X_6\}\,\Pr\{X_7 \mid X_6\}\,\Pr\{X_6 \mid X_3, X_4\}\,\Pr\{X_5 \mid X_2\}\,\Pr\{X_4 \mid X_2\}\,\Pr\{X_3 \mid X_1\}\,\Pr\{X_1\}\,\Pr\{X_2\}$$
Each conditional probability table and probability table is
given in Figure 3.12 and Table 3.11 in
Kazuyuki Tanaka: Mathematics of Statistical Inference
by Bayesian Network, Corona Publishing Co., Ltd.,
October 2009.
The algorithm is given explicitly in the above textbook.