2PP

CS 188: Artificial Intelligence
Fall 2006
Lecture 16: Bayes Nets II
10/24/2006
Dan Klein – UC Berkeley
Example Bayes’ Net
1
Bayes’ Net Semantics
ƒ A Bayes’ net:
ƒ A set of nodes, one per variable X
ƒ A directed, acyclic graph
ƒ A conditional distribution of each variable
conditioned on its parents (the parameters θ)
A1
An
X
ƒ Semantics:
ƒ A BN defines a joint probability distribution
over its variables:
Building the (Entire) Joint
ƒ We can take a Bayes’ net and build any entry
from the full joint distribution it encodes
ƒ Typically, there’s no reason to build ALL of it
ƒ We build what we need on the fly
ƒ To emphasize: every BN over a domain implicitly
represents some joint distribution over that
domain, but is specified by local probabilities
2
Example: Alarm Network
Bayes’ Nets
ƒ So far: how a Bayes’ net encodes a joint distribution
ƒ Next: how to answer queries about that distribution
ƒ Key idea: conditional independence
ƒ Last class: assembled BNs using an intuitive notion of
conditional independence as causality
ƒ Today: formalize these ideas
ƒ Main goal: answer queries about conditional independence and
influence
ƒ After that: how to answer numerical queries (inference)
3
Conditional Independence
ƒ Reminder: independence
ƒ X and Y are independent if
ƒ X and Y are conditionally independent given Z
ƒ (Conditional) independence is a property of a
distribution
Example: Independence
ƒ For this graph, you can fiddle with θ (the CPTs) all you
want, but you won’t be able to represent any distribution
in which the flips are dependent!
X1
X2
h
0.5
h
0.5
t
0.5
t
0.5
All distributions
4
Topology Limits Distributions
Y
ƒ Given some graph
topology G, only certain
joint distributions can
be encoded
ƒ The graph structure
guarantees certain
(conditional)
independences
ƒ (There might be more
independence)
ƒ Adding arcs increases
the set of distributions,
but has several costs
X
Z
Y
X
Z
Y
X
Z
Independence in a BN
ƒ Important question about a BN:
ƒ
ƒ
ƒ
ƒ
Are two nodes independent given certain evidence?
If yes, can calculate using algebra (really tedious)
If no, can prove with a counter example
Example:
X
Y
Z
ƒ Question: are X and Z independent?
ƒ Answer: not necessarily, we’ve seen examples otherwise:
low pressure causes rain which causes traffic.
ƒ X can influence Z, Z can influence X (via Y)
ƒ Addendum: they could be independent: how?
5
Causal Chains
ƒ This configuration is a “causal chain”
X: Low pressure
X
Y
Z
Y: Rain
Z: Traffic
ƒ Is X independent of Z given Y?
Yes!
ƒ Evidence along the chain “blocks” the influence
Common Cause
ƒ Another basic configuration: two
effects of the same cause
Y
ƒ Are X and Z independent?
ƒ No, remember the “project due” example
ƒ Are X and Z independent given Y?
X
Z
Y: Project due
X: Newsgroup
busy
Yes!
Z: Lab full
ƒ Observing the cause blocks
influence between effects.
6
Common Effect
ƒ Last configuration: two causes of
one effect (v-structures)
ƒ Are X and Z independent?
ƒ Yes: remember the ballgame and the rain
causing traffic, no correlation?
ƒ Still need to prove they must be (homework)
ƒ Are X and Z independent given Y?
ƒ No: remember that seeing traffic put the rain
and the ballgame in competition?
ƒ This is backwards from the other cases
X
Z
Y
X: Raining
Z: Ballgame
Y: Traffic
ƒ Observing the effect enables influence
between effects.
The General Case
ƒ Any complex example can be analyzed
using these three canonical cases
ƒ General question: in a given BN, are two
variables independent (given evidence)?
ƒ Solution: graph search!
7
Reachability
ƒ Recipe: shade evidence nodes
L
ƒ Attempt 1: if two nodes are
connected by an undirected path
not blocked by a shaded node,
they are conditionally independent
R
ƒ Almost works, but not quite
D
ƒ Where does it break?
ƒ Answer: the v-structure at T doesn’t
count as a link in a path unless shaded
B
T
T’
Reachability (the Bayes’ Ball)
ƒ Correct algorithm:
S
ƒ Shade in evidence
ƒ Start at source node
ƒ Try to reach target by search
ƒ States: pair of (node X, previous
state S)
X
X
S
ƒ Successor function:
ƒ X unobserved:
ƒ To any child
ƒ To any parent if coming from a
child
S
ƒ X observed:
ƒ From parent to parent
ƒ If you can’t reach a node, it’s
conditionally independent of the
start node given evidence
X
X
S
8
Example
Yes
Example
L
Yes
R
Yes
D
B
T
Yes
T’
9
Example
ƒ Variables:
ƒ
ƒ
ƒ
ƒ
R: Raining
T: Traffic
D: Roof drips
S: I’m sad
R
T
ƒ Questions:
D
S
Yes
Causality?
ƒ When Bayes’ nets reflect the true causal patterns:
ƒ Often simpler (nodes have fewer parents)
ƒ Often easier to think about
ƒ Often easier to elicit from experts
ƒ BNs need not actually be causal
ƒ Sometimes no causal net exists over the domain
ƒ E.g. consider the variables Traffic and Drips
ƒ End up with arrows that reflect correlation, not causation
ƒ What do the arrows really mean?
ƒ Topology may happen to encode causal structure
ƒ Topology only guaranteed to encode conditional independencies
10
Example: Traffic
ƒ Basic traffic net
ƒ Let’s multiply out the joint
R
r
1/4
¬r
3/4
r
T
¬r
t
3/4
¬t
1/4
t
1/2
¬t
1/2
r
t
3/16
r
¬t
1/16
¬r
t
6/16
¬r
¬t
6/16
Example: Reverse Traffic
ƒ Reverse causality?
T
t
R
¬t
t
9/16
¬t
7/16
r
1/3
¬r
2/3
r
1/7
¬r
6/7
r
t
3/16
r
¬t
1/16
¬r
t
6/16
¬r
¬t
6/16
11
Example: Coins
ƒ Extra arcs don’t prevent representing
independence, just allow non-independence
X1
X2
X1
X2
h
0.5
h
0.5
h
0.5
h|h
0.5
t
0.5
t
0.5
t
0.5
t|h
0.5
h|t
0.5
t|t
0.5
Alternate BNs
12
Summary
ƒ Bayes nets compactly encode joint distributions
ƒ Guaranteed independencies of distributions can
be deduced from BN graph structure
ƒ The Bayes’ ball algorithm (aka d-separation)
ƒ A Bayes net may have other independencies
that are not detectable until you inspect its
specific distribution
13