
Probabilistic Graphical Models
Zhongqing Wang
These slides are adapted from (Coughlan, 2009), (Wang, 2012),
(Gormley and Eisner, 2015), (Tang, 2015), and Daphne Koller's PGM lectures
Some (Old) News
• The 2011 Turing Award went to Prof. Judea Pearl for his
pioneering work on Probabilistic Graphical Models
Outline
• Representation
• Inference
• Applications
• Usage
Representation
Bayesian Networks, MRFs, and Factor Graphs
Probabilistic Graphical Models: What and Why
• PGMs:
• A model of the joint probability distribution over random variables.
• Represents dependencies and independencies between the random variables.
• Three types of PGMs
• Directed graph: Bayesian Networks (BN)
• Undirected graph: Markov Random Field (MRF)
• Factor Graph
Chain Rule for Bayesian Networks
[Figure: a DAG over nodes A–G]
P(A, B, C, D, E, F, G) = P(A) P(B) P(C | A, B) P(D | A) P(E | C) P(F | D) P(G | E)
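A minimal sketch of this factorization in code (the CPT numbers below are made up purely for illustration): the joint probability of one assignment is just the product of the local conditional probabilities.

```python
# Minimal sketch: evaluate P(A,B,C,D,E,F,G) for one binary assignment using
# the chain-rule factorization above. All CPT numbers are made up.
p_A = 0.6                                   # P(A=1)
p_B = 0.3                                   # P(B=1)
p_C = {(1, 1): 0.9, (1, 0): 0.5,
       (0, 1): 0.4, (0, 0): 0.1}            # P(C=1 | A, B)
p_D = {1: 0.7, 0: 0.2}                      # P(D=1 | A)
p_E = {1: 0.8, 0: 0.3}                      # P(E=1 | C)
p_F = {1: 0.6, 0: 0.1}                      # P(F=1 | D)
p_G = {1: 0.5, 0: 0.2}                      # P(G=1 | E)

def bern(p1, v):
    """P(X = v) for a binary X with P(X = 1) = p1."""
    return p1 if v == 1 else 1.0 - p1

def joint(a, b, c, d, e, f, g):
    return (bern(p_A, a) * bern(p_B, b) * bern(p_C[(a, b)], c) *
            bern(p_D[a], d) * bern(p_E[c], e) *
            bern(p_F[d], f) * bern(p_G[e], g))

print(joint(1, 0, 1, 1, 0, 1, 0))           # probability of one full assignment
```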
Example: The Student Network
Bayesian Networks: Conditional Independence
• a is independent of b given c:  p(a | b, c) = p(a | c)
• Equivalently:  p(a, b | c) = p(a | c) p(b | c)
• Notation:  a ⊥ b | c
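A quick numerical check of the definition, on a made-up joint p(a, b, c) constructed so that the independence holds:

```python
import numpy as np

# Made-up joint p(a, b, c) built as p(c) p(a|c) p(b|c), so a ⊥ b | c holds
# by construction; we then verify p(a, b | c) = p(a | c) p(b | c).
p_c = np.array([0.4, 0.6])
p_a_given_c = np.array([[0.7, 0.3],   # rows index c, columns index a
                        [0.2, 0.8]])
p_b_given_c = np.array([[0.5, 0.5],
                        [0.9, 0.1]])

joint = np.einsum('c,ca,cb->abc', p_c, p_a_given_c, p_b_given_c)  # joint[a, b, c]

for c in range(2):
    p_ab = joint[:, :, c] / joint[:, :, c].sum()   # p(a, b | c)
    p_a = p_ab.sum(axis=1)                         # p(a | c)
    p_b = p_ab.sum(axis=0)                         # p(b | c)
    assert np.allclose(p_ab, np.outer(p_a, p_b))
print("p(a, b | c) = p(a | c) p(b | c) for every c")
```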
Conditional Independence
Conditional Independence
Conditional Independence
Note: this is the opposite of Example 1, with c observed.
[Figure: a directed graph over nodes A–G]
The graph must be acyclic!
A BN must be a DAG (directed acyclic graph).
Representation of MRFs
[Figure: an undirected graph over nodes A, B, C, annotated with its factors and the partition function]
Cliques and Maximal Cliques
[Figure: an undirected graph with a clique and a maximal clique highlighted]
Examples of MRFs
Joint Distribution of MRFs
• p(x) = (1/Z) ∏_C ψ_C(x_C), where ψ_C(x_C) is the potential over clique C and
• Z = Σ_x ∏_C ψ_C(x_C) is the normalization coefficient (the partition function); note: with M K-state variables, Z has K^M terms.
• Energies and the Boltzmann distribution: ψ_C(x_C) = exp{−E(x_C)}
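A minimal sketch of this definition with made-up pairwise potentials, computing Z by brute-force enumeration over all joint assignments:

```python
import itertools
import numpy as np

# Toy MRF over three binary variables A-B-C with two pairwise clique
# potentials; the tables are made up for illustration.
psi_AB = np.array([[5.0, 1.0], [1.0, 5.0]])   # ψ(A, B): favors A == B
psi_BC = np.array([[3.0, 1.0], [1.0, 3.0]])   # ψ(B, C): favors B == C

def unnormalized(a, b, c):
    return psi_AB[a, b] * psi_BC[b, c]

# Partition function Z: sum over all K^M joint assignments (2^3 = 8 terms here).
Z = sum(unnormalized(a, b, c) for a, b, c in itertools.product([0, 1], repeat=3))

def p(a, b, c):
    return unnormalized(a, b, c) / Z

print(Z, p(0, 0, 0))
```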
Directed vs. Undirected Graphs
Factor Graphs
• A factor graph is a more general graph
• It allows us to be more explicit about the details of the factorization
• An example:
[Figure: a factor graph with variable nodes x1, x2, x3 and factor nodes fa, fb, fc, fd]
p(x1, x2, x3) = fa(x1, x2) fb(x1, x2) fc(x2, x3) fd(x3)
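A minimal sketch of this example in code (binary variables; the factor tables are made up), evaluating the product of factors for one assignment:

```python
import numpy as np

# The factor graph above with binary variables; factor tables are made up.
fa = np.array([[2.0, 1.0], [1.0, 2.0]])   # fa(x1, x2)
fb = np.array([[1.0, 3.0], [3.0, 1.0]])   # fb(x1, x2)
fc = np.array([[4.0, 1.0], [1.0, 4.0]])   # fc(x2, x3)
fd = np.array([1.0, 2.0])                 # fd(x3)

def score(x1, x2, x3):
    """Product of all factors for one joint assignment (unnormalized p)."""
    return fa[x1, x2] * fb[x1, x2] * fc[x2, x3] * fd[x3]

print(score(0, 1, 1))
```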
Factor Graphs from Directed Graphs
Factor Graphs from Undirected Graphs
Inference
Belief Propagation
Inference
Given a factor graph, two common tasks …
• Compute the most likely joint assignment: x* = argmax_x p(X = x)
• Compute the marginal distribution of variable X_i: p(X_i = x_i) for each value x_i
  (p(X_i = x_i) = the sum of p(X = x) over joint assignments with X_i = x_i)
Both consider all joint assignments.
Both are NP-hard in general.
So, we turn to approximations.
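For tiny models, both tasks can be solved exactly by enumeration, which makes the two definitions concrete; a sketch with made-up factor tables:

```python
import itertools
import numpy as np

# Brute-force exact inference on a tiny factor graph: each factor is a
# (variable scope, table) pair. The tables are made up; variables are binary.
factors = [
    ((0, 1), np.array([[2.0, 1.0], [1.0, 2.0]])),   # f(x0, x1)
    ((1, 2), np.array([[4.0, 1.0], [1.0, 4.0]])),   # f(x1, x2)
    ((2,),   np.array([1.0, 2.0])),                 # f(x2)
]
n_vars = 3

def score(x):
    s = 1.0
    for scope, table in factors:
        s *= table[tuple(x[v] for v in scope)]
    return s

assignments = list(itertools.product([0, 1], repeat=n_vars))
probs = np.array([score(x) for x in assignments])
probs /= probs.sum()                                 # p(X = x)

x_star = assignments[int(np.argmax(probs))]          # x* = argmax_x p(X = x)

# p(X_i = x_i): sum of p(X = x) over joint assignments with X_i = x_i.
marginals = np.zeros((n_vars, 2))
for x, p in zip(assignments, probs):
    for i, xi in enumerate(x):
        marginals[i, xi] += p

print("MAP:", x_star)
print("marginals:\n", marginals)
```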
Marginals by Sampling on a Factor Graph
Suppose we took many samples from the distribution over taggings:
Sample 1: n v p d n
Sample 2: n n v d n
Sample 3: n v p d n
Sample 4: v n p d n
Sample 5: v n v d n
Sample 6: n v p d n
[Figure: a linear-chain factor graph over X0 = <START> and X1–X5 with factors ψ0–ψ9, tagging the sentence "time flies like an arrow"]
Marginals by Sampling on a Factor Graph
The marginal p(X_i = x_i) gives the probability that variable X_i takes value x_i in a random sample.
Sample 1: n v p d n
Sample 2: n n v d n
Sample 3: n v p d n
Sample 4: v n p d n
Sample 5: v n v d n
Sample 6: n v p d n
[Figure: the same linear-chain factor graph for "time flies like an arrow"]
Marginals by Sampling on a Factor Graph
Estimate the marginals as:
X1: n 4/6, v 2/6
X2: n 3/6, v 3/6
X3: p 4/6, v 2/6
X4: d 6/6
X5: n 6/6
Sample 1: n v p d n
Sample 2: n n v d n
Sample 3: n v p d n
Sample 4: v n p d n
Sample 5: v n v d n
Sample 6: n v p d n
[Figure: the same linear-chain factor graph for "time flies like an arrow"]
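A minimal sketch of this estimator: count how often each tag appears at each position across the six sampled taggings from the slide.

```python
from collections import Counter

# The six sampled taggings from the slide, one tag per position X1..X5.
samples = [
    "n v p d n".split(),
    "n n v d n".split(),
    "n v p d n".split(),
    "v n p d n".split(),
    "v n v d n".split(),
    "n v p d n".split(),
]

# Estimate p(X_i = x_i) as the fraction of samples in which X_i equals x_i.
n = len(samples)
for i in range(5):
    counts = Counter(sample[i] for sample in samples)
    print(f"X{i + 1}:", {tag: f"{c}/{n}" for tag, c in counts.items()})
```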
How do we get marginals without sampling?
That’s what Belief Propagation is all about!
Why not just sample?
• Sampling one joint assignment is also NP-hard in general.
• In practice: use MCMC (e.g., Gibbs sampling) as an anytime algorithm (a minimal sketch follows this list).
• So draw an approximate sample fast, or run longer for a "good" sample.
• Sampling finds the high-probability values x_i efficiently, but it takes too many samples to see the low-probability ones.
• How do you find p("The quick brown fox …") under a language model?
  • Draw random sentences to see how often you get it? Takes a long time.
  • Or multiply factors (trigram probabilities)? That's what BP would do.
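A minimal Gibbs-sampling sketch (a made-up two-variable joint, not the tutorial's code): each step resamples one variable from its conditional given the other, and marginals are estimated from the chain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up unnormalized joint over two binary variables (x, y).
joint = np.array([[4.0, 1.0],
                  [1.0, 4.0]])

def gibbs(n_steps=10000):
    x, y = 0, 0
    samples = []
    for _ in range(n_steps):
        # Resample x from p(x | y), then y from p(y | x).
        px = joint[:, y] / joint[:, y].sum()
        x = rng.choice(2, p=px)
        py = joint[x, :] / joint[x, :].sum()
        y = rng.choice(2, p=py)
        samples.append((x, y))
    return samples

samples = gibbs()
print("estimated p(x = 1):", np.mean([s[0] for s in samples]))
```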
Overview of Belief Propagation
• Overview: an iterative process in which neighboring variables "talk" to each other, passing messages.
Message Update
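For reference, the standard sum-product message updates on a factor graph are given below (Bishop's notation; an assumption about exactly what this slide displayed):

```latex
% Variable-to-factor message: the product of messages from the variable's
% other neighboring factors.
\mu_{x \to f}(x) \;=\; \prod_{g \in \mathrm{ne}(x) \setminus \{f\}} \mu_{g \to x}(x)

% Factor-to-variable message: sum out the factor's other variables,
% weighting by their incoming messages.
\mu_{f \to x}(x) \;=\; \sum_{\mathbf{x}_f \setminus x} f(\mathbf{x}_f)
    \prod_{y \in \mathrm{ne}(f) \setminus \{x\}} \mu_{y \to f}(y)

% Belief (unnormalized marginal) at a variable: the product of all incoming messages.
b(x) \;\propto\; \prod_{f \in \mathrm{ne}(x)} \mu_{f \to x}(x)
```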
Overview of BP
• Demo by Gormley and Eisner (2015)
Demo of BP
From Gormley and Eisner’s tutorial
Great Ideas in ML: Message Passing
Count the soldiers
[Figure: soldiers in a line counting themselves by message passing. Each soldier says "there's 1 of me"; counts "1, 2, 3, 4, 5 before you" are passed forward and "1, 2, 3, 4, 5 behind you" are passed backward. Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Count the soldiers
[Figure: one soldier combines his incoming messages, "2 before you" + "there's 1 of me" + "3 behind you". Belief: must be 2 + 1 + 3 = 6 of us. Each soldier only sees his incoming messages. Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Count the soldiers
[Figure: another soldier does the same with "1 before you" + "there's 1 of me" + "4 behind you". Belief: must be 1 + 1 + 4 = 6 of us, agreeing with the previous soldier's 2 + 1 + 3 = 6. Each soldier only sees his incoming messages. Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Each soldier receives reports from all branches of tree
[Figure: soldiers arranged in a tree. A soldier with incoming reports "3 here", "7 here", and "1 of me" sends "11 here (= 7 + 3 + 1)" up the remaining branch. Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Each soldier receives reports from all branches of tree
[Figure: the same tree; another soldier combines "3 here", "3 here", and "1 of me" and reports "7 here (= 3 + 3 + 1)". Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Each soldier receives reports from all branches of tree
[Figure: the same tree; messages "7 here" and "3 here" flow in, and "11 here (= 7 + 3 + 1)" flows out. Adapted from MacKay (2003).]
Great Ideas in ML: Message Passing
Each soldier receives reports from all branches of tree
[Figure: the same tree; a soldier with incoming reports "3 here", "7 here", and "3 here" plus himself computes the belief: must be 3 + 7 + 3 + 1 = 14 of us. Adapted from MacKay (2003).]
Message Passing in Belief Propagation
[Figure: a variable X connected to a factor Ψ, with further factors and variables indicated by ellipses. The message from the factor side ("My other factors think I'm a noun") is (v 1, n 6, a 3); the message from the variable side ("But my other variables and I think you're a verb") is (v 6, n 1, a 3); the belief at X is (v 6, n 6, a 9).]
Both of these messages judge the possible values of variable X.
Their product = belief at X = product of all 3 messages to X.
The Sum-Product Algorithm (1)
• Objective:
  i. to obtain an efficient, exact inference algorithm for finding marginals;
  ii. in situations where several marginals are required, to allow computations to be shared efficiently.
The Sum-Product Algorithm (2)
The Sum-Product Algorithm (3)
Sum-Product vs. Max-Product
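A minimal sketch of sum-product message passing on a three-node chain (made-up potentials, not the slides' derivation): forward and backward messages are multiplied to give each node's marginal, and replacing each sum with a max gives max-product.

```python
import numpy as np

# Chain MRF x1 - x2 - x3 with pairwise potentials (made-up tables, binary states).
psi12 = np.array([[5.0, 1.0], [1.0, 5.0]])
psi23 = np.array([[3.0, 1.0], [1.0, 3.0]])

# Sum-product messages along the chain.
m12 = psi12.sum(axis=0)                      # m_{1->2}(x2) = Σ_{x1} ψ12(x1, x2)
m23 = (m12[:, None] * psi23).sum(axis=0)     # m_{2->3}(x3) = Σ_{x2} m_{1->2}(x2) ψ23(x2, x3)
m32 = psi23.sum(axis=1)                      # m_{3->2}(x2) = Σ_{x3} ψ23(x2, x3)
m21 = (psi12 * m32[None, :]).sum(axis=1)     # m_{2->1}(x1) = Σ_{x2} ψ12(x1, x2) m_{3->2}(x2)

# Each node's marginal is the normalized product of its incoming messages;
# the intermediate messages are shared between the different marginals.
p_x1 = m21 / m21.sum()
p_x2 = (m12 * m32) / (m12 * m32).sum()
p_x3 = m23 / m23.sum()
print(p_x1, p_x2, p_x3)

# Max-product uses the same message structure with each sum replaced by a max,
# yielding max-marginals and (with back-pointers) the most likely assignment.
```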
Applications
Text Summarization Based on Social Information (Wang et al., 2013)
• Online profile (resume) information helps people connect with others who have similar backgrounds, and provides valuable business information.
• Extract personal information from online profile text:
  • Skill information
  • Text summary information
Extracting Personal Information
An example of a personal profile on LinkedIn
Extracting Personal Information (cont.)
Text Summary Information
Work-experience text
Distribution of different social relations on LinkedIn
The person-to-person relation network on LinkedIn
Building a Profile Correlation Model with a Probabilistic Graphical Model
Text Attribute Function
(1/Z1) exp( Σ_i Σ_k λ_k f_k(x_ik, y_i) )
Personal-Connection Factor Function
g(y_i, y_j) = exp( λ_ij (y_i − y_j)^2 )
Model Definition
• For a network G:
P(Y | X, G) = P(X, G | Y) P(Y) / P(X, G) ∝ P(X | Y) P(Y | G)
P(Y | X, G) ∝ P(Y | G) Π_i P(x_i | y_i)
Attribute Function and Factor Function
P(x_i | y_i) = (1/Z1) exp( Σ_{j=1}^{d} λ_j f_j(x_ij, y_i) )
P(Y | G) = (1/Z2) exp( Σ_i Σ_{j ∈ NB(i)} g(i, j) )
g(y_i, y_j) = exp( λ_ij (y_i − y_j)^2 )
Log-Likelihood Objective Function
• Learn the weights with the EM algorithm
• Predict with the BP algorithm
λ* = argmax_λ L(λ)
Skill Prediction: Experiments
• Skill distribution
Text Summarization: Experiments
• Experimental setup
  • We select 40 words from each profile text to form the summary.
  • The dataset contains 497 profile samples.
  • We use 200 profile texts as test samples and the remaining samples for training.
  • We use the ROUGE-1.5.5 toolkit for evaluation.
Text Summarization: Experiments
• Experimental results
[Chart: comparing unsupervised and supervised learning]
Who Will Follow You Back? (Hopcroft et al., 2011)
On Twitter…
[Figure: follow-back probabilities between example Twitter users (Ladygaga, Obama, Shiteng, Huwei, JimmyQiao), ranging from 1% to 100%]
Interaction
Retweet vs. reply
*Retweeting seems to be more helpful
Structural Balance
(A) and (B) are balanced, but (C) and (D) are not.
• Structural balance:
  • Reciprocal relationships are balanced (88%);
  • Parasocial relationships are not (only 29%).
Triad Factor Graph (TriFG)
[Figure: the TriFG model. Input: a mobile network over users v1–v6 with observed attribute vectors (v_i^u, v_i^s) for candidate pairs such as (v2, v1), (v2, v3), (v4, v3), (v4, v5), (v4, v6), (v6, v5). Each pair has a relationship label: y1 = friend, y2 = friend, y5 = non-friend, y6 = non-friend, y3 = ?, y4 = ?. Attribute factors f(v_i^u, v_i^s, y_i) connect the observations to the labels, and triad factors h(y1, y2, y3) and h(y3, y4, y5) couple labels within triads.]
Usage
Usage
• The factor graph model toolkit
• A very easy way to solve PGM problems, just as with an SVM toolkit.
Attributes:
+1 1:2 3:2 4:5
-1 3:2 5:2 6:3
+1 3:2 4:6 6:1
Factors / connections:
# Edge_1 1 2
# Edge_2 2 3
Thanks to Dr. Jie Tang's helpful tools.
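A hedged sketch of reading this kind of input, assuming an SVM-light-style attribute line per node and "# Edge" lines for connections (the toolkit's exact format may differ):

```python
# Sketch of a parser for the input shown above. The format details
# (labels, "feature:value" pairs, "# Edge_k i j" lines) are inferred
# from the slide and may not match the toolkit exactly.
def parse(lines):
    nodes, edges = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("# Edge"):
            _, i, j = line.rsplit(maxsplit=2)
            edges.append((int(i), int(j)))
        else:
            label, *feats = line.split()
            features = {int(k): float(v) for k, v in (f.split(":") for f in feats)}
            nodes.append((int(label), features))
    return nodes, edges

example = """+1 1:2 3:2 4:5
-1 3:2 5:2 6:3
+1 3:2 4:6 6:1
# Edge_1 1 2
# Edge_2 2 3""".splitlines()

print(parse(example))
```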
Q&A
Thanks