
Bayesian Nets and Applications
Naïve Bayes
What happens if we have more than one piece of evidence?
If we can assume conditional independence




Overslept and trafficjam are independent, given late
A and B are conditionally independent given C just in case B doesn't tell us
anything about A if we already know C: P(A | B ∧ C) = P(A | C)
P(late | overslept ∧ trafficjam)
= αP(overslept ∧ trafficjam | late)P(late)
= αP(overslept | late)P(trafficjam | late)P(late)
Naïve Bayes: a single cause directly influences a number of
effects, all conditionally independent given the cause
Independence is often assumed even when it does not hold
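The role of the normalisation constant α can be seen in a small numeric sketch. All probabilities below are made-up illustrative values, not taken from the slides, and the function name posterior_late is hypothetical.

```python
# Hypothetical numbers, chosen only for illustration.
P_LATE = 0.2
# P(evidence = True | late) for late = True and late = False.
P_OVERSLEPT = {True: 0.6, False: 0.1}
P_TRAFFICJAM = {True: 0.5, False: 0.2}


def posterior_late(overslept, trafficjam):
    """Return P(late | overslept, trafficjam) under the naive Bayes assumption."""
    def likelihood(table, observed, late):
        p_true = table[late]
        return p_true if observed else 1.0 - p_true

    # Unnormalised score P(overslept|late) P(trafficjam|late) P(late)
    # for each value of late.
    scores = {}
    for late in (True, False):
        prior = P_LATE if late else 1.0 - P_LATE
        scores[late] = (likelihood(P_OVERSLEPT, overslept, late)
                        * likelihood(P_TRAFFICJAM, trafficjam, late)
                        * prior)
    alpha = 1.0 / (scores[True] + scores[False])  # normalisation constant
    return alpha * scores[True]


print(round(posterior_late(True, True), 4))
```

α never has to be known in advance: it is whatever makes the scores for late and not late sum to 1.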



Bayesian Networks
A directed acyclic graph in which each node is annotated
with quantitative probability information




A set of random variables makes up the network nodes
A set of directed links connects pairs of nodes. If there is an arrow
from node X to node Y, X is a parent of Y
Each node Xi has a conditional probability
distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents
on the node
Example

Topology of network encodes conditional
independence assumptions
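[Figure: student network. Smart → Good test taker; Smart, Hard working → Understands material; Good test taker, Understands material → Exam Grade; Understands material → Homework Grade]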
Conditional Probability Tables
Smart
True    False
.5      .5

Hard Working
True    False
.7      .3

Good Test Taker
S       True    False
True    .75     .25
False   .25     .75

Understands material
S       HW      True    False
True    True    .95     .05
True    False   .6      .4
False   True    .6      .4
False   False   .2      .8

Exam Grade
GTT     UM      A       B       C       D       F
True    True    .7      .25     .03     .01     .01
True    False   .3      .4      .2      .05     .05
False   True    .4      .3      .2      .08     .02
False   False   .05     .2      .3      .3      .15

Homework Grade
UM      A       B       C       D       F
True    .7      .25     .03     .01     .01
False   .2      .37     .4      .05     .05
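One way to hold these tables in code is a mapping from each node to its parents and its CPT. This is a minimal sketch, not a prescribed representation: the short node names and the helper rows_not_summing_to_one are illustrative choices. The helper lists any CPT row whose entries do not add up to 1. With the values as transcribed here, the Homework Grade row for UM = False adds up to 1.07, so the check reports that row.

```python
T, F = True, False
GRADES = ("A", "B", "C", "D", "F")


def grade_row(*probs):
    """Pair the five grade labels with their probabilities."""
    return dict(zip(GRADES, probs))


# node -> (parents, CPT); a CPT maps a tuple of parent values
# to a distribution over the node's own values.
network = {
    "S": ((), {(): {T: .5, F: .5}}),
    "HW": ((), {(): {T: .7, F: .3}}),
    "GTT": (("S",), {(T,): {T: .75, F: .25},
                     (F,): {T: .25, F: .75}}),
    "UM": (("S", "HW"), {(T, T): {T: .95, F: .05},
                         (T, F): {T: .6, F: .4},
                         (F, T): {T: .6, F: .4},
                         (F, F): {T: .2, F: .8}}),
    "EG": (("GTT", "UM"), {(T, T): grade_row(.7, .25, .03, .01, .01),
                           (T, F): grade_row(.3, .4, .2, .05, .05),
                           (F, T): grade_row(.4, .3, .2, .08, .02),
                           (F, F): grade_row(.05, .2, .3, .3, .15)}),
    "HG": (("UM",), {(T,): grade_row(.7, .25, .03, .01, .01),
                     (F,): grade_row(.2, .37, .4, .05, .05)}),
}


def rows_not_summing_to_one(net, tol=1e-9):
    """Return (node, parent values) for every CPT row that is not a distribution."""
    bad = []
    for node, (parents, cpt) in net.items():
        for parent_values, dist in cpt.items():
            if abs(sum(dist.values()) - 1.0) > tol:
                bad.append((node, parent_values))
    return bad


print(rows_not_summing_to_one(network))
```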
Compactness
A CPT for Boolean Xi with k Boolean parents has 2^k rows
for the combinations of parent values
Each row requires one number p for Xi=true (the number
for Xi=false is just 1-p)
If each variable has no more than k parents, the complete
network requires O(n × 2^k) numbers
Grows linearly with n vs O(2^n) for the full joint distribution
Student net: 1+1+2+4+16+8 = 32 numbers (vs. 2^4 × 5^2 - 1 = 399 for the full joint); each row of a five-valued grade node needs four numbers
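The counts can be reproduced with a few lines of code. This is a sketch with illustrative function names. It counts (arity - 1) free numbers per CPT row, which for a Boolean node is the single number p described above. The student-network arities are read off the CPTs (four Boolean nodes, two five-valued grade nodes), and n = 30, k = 5 is an assumed example.

```python
def boolean_network_numbers(n, k):
    """Upper bound n * 2^k for n Boolean nodes with at most k Boolean parents each."""
    return n * 2 ** k


def full_joint_numbers(arities):
    """Independent entries in the full joint: product of the arities, minus 1."""
    total = 1
    for a in arities:
        total *= a
    return total - 1


def cpt_numbers(arity, parent_arities):
    """Free numbers in one CPT: (arity - 1) per combination of parent values."""
    rows = 1
    for a in parent_arities:
        rows *= a
    return rows * (arity - 1)


# Student network: node -> (node arity, arities of its parents).
student = {
    "S": (2, []), "HW": (2, []), "GTT": (2, [2]),
    "UM": (2, [2, 2]), "EG": (5, [2, 2]), "HG": (5, [2]),
}
student_total = sum(cpt_numbers(a, ps) for a, ps in student.values())
student_joint = full_joint_numbers([a for a, _ in student.values()])
print(student_total, student_joint)  # 32 399

# Illustrative Boolean case (n = 30 and k = 5 are assumed values).
print(boolean_network_numbers(30, 5), full_joint_numbers([2] * 30))  # 960 1073741823
```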
Conditional Probability
Global Semantics

Global semantics defines the full joint distribution as the product of
the local conditional distributions:
P(X1, …, Xn) = ∏_{i=1}^{n} P(Xi | Parents(Xi))

e.g., having observed S, HW, and not UM, will I get an A?
P(EG=A ∧ GTT ∧ ¬UM ∧ S ∧ HW)
= P(EG=A | GTT ∧ ¬UM) * P(GTT | S) * P(¬UM | HW ∧ S) * P(S) * P(HW)
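Plugging the CPT entries into this product gives a concrete number. The sketch below uses the student network's tables. The function names are illustrative, and only the entries for EG = A are stored. The second function goes one step beyond the product: it answers the question as posed by summing the joint over the unobserved GTT and dividing by the probability of the evidence.

```python
T, F = True, False

P_S = {T: .5, F: .5}                                        # P(S = s)
P_HW = {T: .7, F: .3}                                       # P(HW = hw)
P_GTT = {T: .75, F: .25}                                    # P(GTT = True | S)
P_UM = {(T, T): .95, (T, F): .6, (F, T): .6, (F, F): .2}    # P(UM = True | S, HW)
P_EG_A = {(T, T): .7, (T, F): .3, (F, T): .4, (F, F): .05}  # P(EG = A | GTT, UM)


def bern(p_true, value):
    """Probability of a Boolean value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint_a(gtt, um, s, hw):
    """P(EG=A, GTT=gtt, UM=um, S=s, HW=hw) as a product of local CPT entries."""
    return (P_EG_A[(gtt, um)]
            * bern(P_GTT[s], gtt)
            * bern(P_UM[(s, hw)], um)
            * P_S[s]
            * P_HW[hw])


def p_a_given(s, hw, um):
    """P(EG=A | S=s, HW=hw, UM=um), summing the joint over the unobserved GTT."""
    numerator = sum(joint_a(g, um, s, hw) for g in (T, F))
    # P(S, HW, UM): EG and GTT sum out of the joint, leaving three factors.
    evidence = bern(P_UM[(s, hw)], um) * P_S[s] * P_HW[hw]
    return numerator / evidence


print(joint_a(T, F, T, T))   # .3 * .75 * .05 * .5 * .7, about 0.0039
print(p_a_given(T, T, F))    # .3 * .75 + .05 * .25, about 0.2375
```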
Conditional Independence and Network
Structure

The graphical structure of a Bayesian network forces
certain conditional independences to hold regardless of
the CPTs.

Which independences hold can be determined by the d-separation criterion
[Figure: the three connection types at an interior node b on a path from a to c. Linear: a → b → c. Diverging: a ← b → c. Converging: a → b ← c]
D-separation (opposite of d-connecting)

A path from q to r is d-connecting with respect to the
evidence nodes E if every interior node n in the path has
the property that either
It is linear or diverging and is not a member of E
It is converging and either n or one of its descendants is in E
If no path from q to r is d-connecting (q and r are d-separated), the nodes are
conditionally independent given E
[Figure: the student network (Smart, Hard working, Good test taker, Understands material, Exam Grade, Homework Grade)]


S and EG are not independent given GTT
S and HG are independent given UM
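The criterion can be checked mechanically on a small network. This is a minimal sketch, assuming the student network's edges are the ones implied by its CPTs. It enumerates every undirected path, which is fine for six nodes but not an efficient algorithm, and all function names are illustrative.

```python
# Edges of the student network (parent, child).
EDGES = [("S", "GTT"), ("S", "UM"), ("HW", "UM"),
         ("GTT", "EG"), ("UM", "EG"), ("UM", "HG")]


def children(node):
    return {c for p, c in EDGES if p == node}


def parents(node):
    return {p for p, c in EDGES if c == node}


def descendants(node):
    found, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def simple_paths(start, goal, path=None):
    """Yield every undirected path from start to goal that visits no node twice."""
    path = path or [start]
    if path[-1] == goal:
        yield path
        return
    for nxt in children(path[-1]) | parents(path[-1]):
        if nxt not in path:
            yield from simple_paths(start, goal, path + [nxt])


def is_d_connecting(path, evidence):
    """Apply the two conditions to every interior node of the path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        converging = prev in parents(node) and nxt in parents(node)
        if converging:
            # Needs the node or one of its descendants in the evidence.
            if not ({node} | descendants(node)) & evidence:
                return False
        elif node in evidence:
            # Linear or diverging nodes must not be in the evidence.
            return False
    return True


def d_separated(q, r, evidence):
    return not any(is_d_connecting(p, evidence) for p in simple_paths(q, r))


print(d_separated("S", "EG", {"GTT"}))  # False: S -> UM -> EG stays open
print(d_separated("S", "HG", {"UM"}))   # True: every path is blocked
```

Both results agree with the two statements above.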
Medical Application of Bayesian Networks: Pathfinder
Pathfinder

Domain: hematopathology diagnosis
Microscopic interpretation of lymph-node biopsies
Given: 100s of histologic features appearing in lymph node sections
Goal: identify disease type (malignant or benign)
Difficult for physicians
Pathfinder System




Bayesian Net implementation
Reasons about 60 malignant and benign diseases of the
lymph node
Considers evidence about status of up to 100
morphological features presenting in lymph node tissue
Contains 105,000 subjectively-derived probabilities
Commercialization




Intellipath
Integrates with videodisc libraries of histopathology slides
Pathologists working with the system make significantly
more correct diagnoses than those working without
Several hundred commercial systems in place worldwide
Sequential Diagnosis
Features


Each feature is structured into a set of 2-10 mutually exclusive values
e.g., pseudofollicularity: absent, slight, moderate, prominent
Represent evidence provided by a feature as F1, F2, …, Fn
Value of information

User enters findings from microscopic analysis of tissue

Probabilistic reasoner assigns level of belief to different diagnoses

Value of information determines which tests to perform next

Full disease utility model making use of life and death decision making
Cost of tests
Cost of misdiagnoses
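The general idea of choosing the next test by value of information can be sketched with the standard decision-theoretic definition: expected utility of the best diagnosis after seeing the test result, minus expected utility of the best diagnosis now, minus the cost of the test. This is not Pathfinder's actual utility model. The two-disease setup, the feature names, and every number below are hypothetical.

```python
# Current belief over diseases (hypothetical).
PRIOR = {"benign": 0.7, "malignant": 0.3}

# UTILITY[diagnosis][true disease]; misdiagnoses carry negative utility.
UTILITY = {
    "benign": {"benign": 0.0, "malignant": -100.0},
    "malignant": {"benign": -20.0, "malignant": 0.0},
}

# Candidate tests: cost and P(outcome | disease), all hypothetical.
TESTS = {
    "feature_1": {"cost": 1.0,
                  "likelihood": {"present": {"benign": 0.1, "malignant": 0.8},
                                 "absent": {"benign": 0.9, "malignant": 0.2}}},
    "feature_2": {"cost": 1.0,
                  "likelihood": {"present": {"benign": 0.5, "malignant": 0.6},
                                 "absent": {"benign": 0.5, "malignant": 0.4}}},
}


def best_expected_utility(belief):
    """Expected utility of the best single diagnosis under the given belief."""
    return max(sum(belief[d] * UTILITY[dx][d] for d in belief) for dx in UTILITY)


def value_of_information(test, belief):
    """Expected gain in utility from observing the test, net of its cost."""
    spec = TESTS[test]
    expected = 0.0
    for outcome, lik in spec["likelihood"].items():
        p_outcome = sum(lik[d] * belief[d] for d in belief)
        if p_outcome == 0.0:
            continue
        # Bayes update of the belief for this outcome.
        posterior = {d: lik[d] * belief[d] / p_outcome for d in belief}
        expected += p_outcome * best_expected_utility(posterior)
    return expected - best_expected_utility(belief) - spec["cost"]


def next_test(belief):
    """Pick the test with the highest value of information."""
    return max(TESTS, key=lambda t: value_of_information(t, belief))


print(next_test(PRIOR))
```

A negative value means the test is not worth its cost under the current belief. With these numbers that holds for feature_2, whose outcome never changes the best diagnosis.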
Group Discrimination Strategy



Select questions based on their ability to discriminate
between disease classes
For a given differential diagnosis, select the most specific level
of the hierarchy and select questions to discriminate among
groups
Less efficient
Larger number of questions asked
Other Bayesian Net Applications

Lumiere – Who knows what it is?
Lumiere
Single most widely distributed application of BN
Microsoft Office Assistant
Infer a user’s goals and needs using evidence about user background, actions
and queries
VISTA
Help NASA engineers in round-the-clock monitoring of each of the Space
Shuttle orbiter’s subsystems
Time critical, high impact
Interpret telemetry and provide advice about likely failures
Direct engineers to the best information
In use for several years
Microsoft Pregnancy and Child Care
What questions to ask next to diagnose illness of a child

Speech Recognition

Text Summarization

Language processing tasks in general