CS 188: Artificial Intelligence
Fall 2006
Lecture 18: Decision Diagrams
10/31/2006
Dan Klein – UC Berkeley

Announcements
• Optional midterm
  • On Tuesday 11/21 in class unless conflicts
  • We will count the midterms and final as 1 / 1 / 2, and drop the lowest (or halve the final weight)
  • Will also run grades as if this midterm did not happen
  • You will get the better of the two grades
• Projects
  • 3.2 up today, grades on previous projects ASAP
  • Total of 3 checkpoints left (including 3.2)

Recap: Sampling
• Exact inference can be slow
• Basic idea: sampling
  • Draw N samples from a sampling distribution S
  • Compute an approximate posterior probability
  • Show this converges to the true probability P
• Benefits:
  • Easy to implement
  • You can get an (approximate) answer fast
• Last time:
  • Rejection sampling: reject samples disagreeing with evidence
  • Likelihood weighting: use evidence to weight samples

Likelihood Weighting
• Problem with rejection sampling:
  • If evidence is unlikely, you reject a lot of samples
  • You don't exploit your evidence as you sample
  • Consider P(B | A=true)
  [Diagram: Burglary → Alarm]
• Idea: fix evidence variables and sample the rest
  [Diagram: Burglary → Alarm, with Alarm fixed as evidence]
• Problem: sample distribution not consistent!
• Solution: weight by probability of evidence given parents
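The rejection-sampling problem above can be seen in a small sketch. This is a minimal illustration, not code from the lecture: the two-node network B → A and its CPT numbers are assumptions chosen so the evidence A=true is rare.

```python
import random

# Hypothetical CPTs for a two-node network Burglary -> Alarm
# (numbers are illustrative, not from the lecture).
P_B = 0.001                              # P(Burglary = true)
P_A_GIVEN_B = {True: 0.95, False: 0.01}  # P(Alarm = true | Burglary)

def sample_once(rng):
    """Draw one joint sample (B, A) by sampling in topological order."""
    b = rng.random() < P_B
    a = rng.random() < P_A_GIVEN_B[b]
    return b, a

def rejection_sample(n, rng):
    """Estimate P(B=true | A=true): keep only samples where the alarm rang."""
    kept, b_true = 0, 0
    for _ in range(n):
        b, a = sample_once(rng)
        if a:                # reject every sample disagreeing with evidence
            kept += 1
            b_true += b
    return b_true / kept, kept

rng = random.Random(0)
estimate, kept = rejection_sample(200_000, rng)
# Almost all samples are rejected because the alarm rarely rings,
# which is exactly the inefficiency the slide describes.
print(estimate, kept)
```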
Likelihood Weighting
• Sampling distribution if z sampled and e fixed evidence
• Now, samples have weights
• Together, weighted sampling distribution is consistent
[Diagram: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler and Rain → WetGrass]
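A sketch of likelihood weighting on the Cloudy/Sprinkler/Rain/WetGrass network in the diagram. The CPT numbers here are assumptions for illustration (the slide's tables did not survive extraction): non-evidence variables are sampled, while the evidence WetGrass=true is fixed and its probability given the sampled parents becomes the sample's weight.

```python
import random

# Illustrative CPTs (assumed, not from the slides) for
# Cloudy -> {Sprinkler, Rain} -> WetGrass.
P_C = 0.5
P_S = {True: 0.1, False: 0.5}            # P(Sprinkler | Cloudy)
P_R = {True: 0.8, False: 0.2}            # P(Rain | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}  # P(Wet | S, R)

def weighted_sample(rng):
    """Sample non-evidence variables in topological order; fix the
    evidence WetGrass=true and multiply in its likelihood as a weight."""
    c = rng.random() < P_C
    s = rng.random() < P_S[c]
    r = rng.random() < P_R[c]
    w = P_W[(s, r)]       # weight: P(evidence | parents), not a coin flip
    return r, w

def lw_estimate(n, rng):
    """Estimate P(Rain = true | WetGrass = true) from weighted samples."""
    num = den = 0.0
    for _ in range(n):
        r, w = weighted_sample(rng)
        num += w * r
        den += w
    return num / den

rng = random.Random(0)
est = lw_estimate(100_000, rng)
print(round(est, 3))   # close to the exact posterior (~0.708 for these CPTs)
```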
Likelihood Weighting
• Note that likelihood weighting doesn't solve all our problems
  • Rare evidence is taken into account for downstream variables, but not upstream ones
• A better solution is Markov-chain Monte Carlo (MCMC), more advanced
• We'll return to sampling for robot localization and tracking in dynamic BNs
Decision Networks
• MEU: choose the action which maximizes the expected utility given the evidence
• Can directly operationalize this with decision diagrams
  • Bayes nets with nodes for utility and actions
  • Lets us calculate the expected utility for each action
• New node types:
  • Chance nodes (just like BNs)
  • Actions (rectangles, must be parents, act as observed evidence)
  • Utilities (depend on action and chance nodes)
[Diagram: decision network with chance node Weather, an action node, and a utility node]
Decision Networks
• Action selection:
  • Instantiate all evidence
  • Calculate posterior over parents of utility node
  • Set action node each possible way
  • Calculate expected utility for each action
  • Choose maximizing action
[Diagram: action Umbrella and chance node Weather feed into utility U; Weather also generates Report]

Example: Decision Networks
[Diagram: Umbrella → U ← Weather]

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70
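The action-selection procedure above, run on the umbrella example (P(sun)=0.7, P(rain)=0.3, utilities from the table), can be sketched in a few lines:

```python
# Expected utility of each action: EU(a) = sum_w P(w) * U(a, w),
# using the prior over Weather and the utility table from the slide.
P_W = {"sun": 0.7, "rain": 0.3}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def expected_utility(action):
    return sum(P_W[w] * U[(action, w)] for w in P_W)

eu = {a: expected_utility(a) for a in ("leave", "take")}
best = max(eu, key=eu.get)     # MEU: pick the maximizing action
print({a: round(v, 2) for a, v in eu.items()}, best)
```

With no report observed, EU(leave) = 0.7·100 + 0.3·0 = 70 beats EU(take) = 0.7·20 + 0.3·70 = 35, so the MEU action is to leave the umbrella.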
Example: Decision Networks
[Diagram: Umbrella → U ← Weather → Report]

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

R       P(R|sun)
clear   0.5
cloudy  0.5

R       P(R|rain)
clear   0.2
cloudy  0.8

Value of Information
• Idea: compute value of acquiring each possible piece of evidence
• Can be done directly from decision network
• Example: buying oil drilling rights
  • Two blocks A and B, exactly one has oil, worth k
  • Prior probabilities 0.5 each, mutually exclusive
  • Current price of each block is k/2
  • Probe gives accurate survey of A. Fair price?
[Diagram: action DrillLoc and chance node OilLoc feed into utility U]
• Solution: compute value of information
  = expected value of best action given the information minus expected value of best action without information
  • Survey may say "oil in A" or "no oil in A," prob 0.5 each
  = [0.5 × value of "buy A" given "oil in A"] + [0.5 × value of "buy B" given "no oil in A"] − 0
  = [0.5 × k/2] + [0.5 × k/2] − 0 = k/2
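The oil-rights arithmetic can be checked numerically. The concrete value k = 1000 is an assumption for illustration; the slide leaves k symbolic.

```python
# Value-of-information arithmetic for the oil-drilling example,
# with an assumed concrete k (the slide keeps k symbolic).
k = 1000.0            # total value of the oil (assumption: k = 1000)
price = k / 2         # current asking price of each block

# Without the survey, buying either block has expected profit
# 0.5*k - k/2 = 0, the same as not buying at all.
meu_without = max(0.5 * k - price, 0.0)

# With a perfect survey of A, we buy whichever block the survey
# indicates, earning k - k/2 either way.
meu_with = 0.5 * (k - price) + 0.5 * (k - price)

vpi = meu_with - meu_without
print(vpi)            # k/2: the fair price of the survey
```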
General Formula
• (VPI = value of perfect information)
• Current evidence E=e, possible utility inputs s
• Potential new evidence E': suppose we knew E' = e'
• BUT E' is a random variable whose value is currently unknown, so:
  • Must compute expected gain over all possible values

VPI Properties
• Nonnegative in expectation
• Nonadditive – consider, e.g., obtaining Ej twice
• Order-independent
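The formulas on this slide did not survive extraction; the standard definition, consistent with the bullets above (current evidence e, utility inputs s, candidate evidence E'), is:

```latex
% Expected utility of acting optimally on the current evidence e:
MEU(e) = \max_{a} \sum_{s} P(s \mid e)\, U(s, a)

% If we learned E' = e', we would instead act on the new posterior:
MEU(e, e') = \max_{a} \sum_{s} P(s \mid e, e')\, U(s, a)

% E' is currently unknown, so take the expectation over its values:
VPI_{e}(E') = \left( \sum_{e'} P(e' \mid e)\, MEU(e, e') \right) - MEU(e)
```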
VPI Scenarios
• Imagine actions 1 and 2, for which U1 > U2
• How much will information about Ej be worth?
  • Little – we're sure action 1 is better.
  • Little – info likely to change our action but not our utility
  • A lot – either could be much better

VPI Example
[Diagram: Umbrella → U ← Weather → Report]

Reasoning over Time
• Often, we want to reason about a sequence of observations
  • Speech recognition
  • Robot localization
  • User attention
• Need to introduce time into our models
  • Basic approach: hidden Markov models (HMMs)
  • More general: dynamic Bayes' nets

Markov Models
• A Markov model is a chain-structured BN
  • Each node is identically distributed (stationarity)
  • Value of X at a given time is called the state
• As a BN: X1 → X2 → X3 → X4
• Parameters: called transition probabilities, specify how the state evolves over time (also, initial probs)
Conditional Independence
X1 → X2 → X3 → X4
• Basic conditional independence:
  • Past and future independent given the present
  • Each time step only depends on the previous
  • This is called the (first order) Markov property

Example
• Weather:
  • States: X = {rain, sun}
  • Transitions (this is a CPT, not a BN!):

    X_t   X_t+1  P
    sun   sun    0.9
    sun   rain   0.1
    rain  sun    0.1
    rain  rain   0.9

  • Initial distribution: 1.0 sun
  • What's the probability distribution after one step?

Mini-Forward Algorithm
• Question: probability of being in state x at time t?
• Slow answer:
  • Enumerate all sequences of length t which end in x
  • Add up their probabilities
• Better answer: cached incremental belief update (forward simulation)
[Traces: P(X1), P(X2), P(X3), …, P(X∞), from an initial observation of sun and from an initial observation of rain]

Stationary Distributions
• If we simulate the chain long enough:
  • What happens?
  • Uncertainty accumulates
  • Eventually, we have no idea what the state is!
• Stationary distributions:
  • For most chains, the distribution we end up in is independent of the initial distribution
  • Called the stationary distribution of the chain
• Usually, can only predict a short time out
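The mini-forward algorithm above is a few lines of code. This sketch assumes the symmetric stay-0.9/switch-0.1 weather transitions suggested by the 0.9/0.1 numbers on the slide; starting from 1.0 sun it shows both the one-step update and convergence to the stationary distribution.

```python
# Mini-forward algorithm on the two-state weather chain
# (assumed transitions: stay with prob 0.9, switch with prob 0.1).
T = {"sun": {"sun": 0.9, "rain": 0.1},
     "rain": {"sun": 0.1, "rain": 0.9}}

def forward_step(belief):
    """One cached belief update:
    P(X_{t+1} = x2) = sum_x1 P(x2 | x1) * P(X_t = x1)."""
    return {x2: sum(T[x1][x2] * belief[x1] for x1 in belief) for x2 in T}

belief = {"sun": 1.0, "rain": 0.0}   # initial distribution: 1.0 sun
print(forward_step(belief))          # after one step: 0.9 sun, 0.1 rain

for _ in range(100):                 # simulate the chain long enough...
    belief = forward_step(belief)
print(belief)                        # ...uncertainty accumulates toward 0.5/0.5
```

Each step costs O(|X|^2) instead of enumerating all |X|^t sequences, which is the point of caching the incremental update.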
Web Link Analysis
• PageRank over a web graph
  • Each web page is a state
  • Initial distribution: uniform over pages
  • Transitions:
    • With prob. c, uniform jump to a random page (solid lines)
    • With prob. 1-c, follow a random outlink (dotted lines)
• Stationary distribution
  • Will spend more time on highly reachable pages
  • E.g. many ways to get to the Acrobat Reader download page!
  • Somewhat robust to link spam
  • Google 1.0 returned pages containing your keywords in decreasing rank; now all search engines use link analysis along with many other factors
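The random-surfer chain above can be iterated to its stationary distribution. The three-page graph and the teleport probability c = 0.15 are assumptions for illustration; only the transition rule (teleport with prob. c, random outlink with prob. 1-c) comes from the slide.

```python
# PageRank as the stationary distribution of the random-surfer chain.
c = 0.15                                 # teleport probability (assumed)
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}   # hypothetical web graph
pages = sorted(links)
N = len(pages)

def step(rank):
    """One transition: with prob c jump uniformly, else follow an outlink."""
    new = {p: c / N for p in pages}      # uniform teleport mass
    for p in pages:
        share = (1 - c) * rank[p] / len(links[p])
        for q in links[p]:               # spread the rest over p's outlinks
            new[q] += share
    return new

rank = {p: 1.0 / N for p in pages}       # initial distribution: uniform
for _ in range(100):                     # power iteration to convergence
    rank = step(rank)
print(rank)   # highly reachable pages (here A, linked from B and C) rank highest
```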