
Detecting Hidden Variables: A Structure-Based Approach
Gal Elidan, Noam Lotner, Nir Friedman
Hebrew University
{galel,noaml,nir}@huji.ac.il

Daphne Koller
Stanford University
[email protected]

ABSTRACT: We examine how to detect hidden variables when learning probabilistic models. This problem is crucial both for improving our understanding of the domain and as a preliminary step that guides the learning procedure. A natural approach is to search for "structural signatures" of hidden variables. We make this basic idea concrete, and show how to integrate it with structure-search algorithms. We evaluate this method on several synthetic and real-life datasets, and show that it performs surprisingly well.
What is a Bayesian Network
A Bayesian network represents a joint probability over a set of random variables using a DAG:
    P(X1, …, Xn) = P(V) P(S) P(T | V) … P(X | A) P(D | A, B)
[Figure: the "Asia" network — Visit to Asia (V), Smoking (S), Tuberculosis (T), Lung Cancer (L), Bronchitis (B), Abnormality in Chest (A), X-Ray (X), Dyspnea (D) — annotated with the CPT for Dyspnea: P(D | A, B) = 0.8, P(D | A, ¬B) = 0.1, P(D | ¬A, B) = 0.1, P(D | ¬A, ¬B) = 0.01.]
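To make the factorization concrete, here is a minimal Python sketch (not part of the original poster) of a Bayesian network as a DAG plus one CPT per node. The conditional table for D mirrors the Dyspnea CPT shown above; the root priors for A and B, and the tiny three-node network itself, are simplified, hypothetical stand-ins for the full Asia model.

```python
# Minimal sketch: a Bayesian network as a DAG with one CPT per node.
# The joint probability is the product of P(node | parents), as in
# P(X1, ..., Xn) = prod_i P(Xi | Pa(Xi)).

# Each node maps to (list of parents, CPT); the CPT maps a tuple of
# parent values to P(node = 1 | parents).  All variables are binary.
# The priors for A and B are hypothetical; the CPT for D follows the
# Dyspnea table quoted in the poster.
network = {
    "A": ([], {(): 0.01}),                        # P(A = 1), hypothetical
    "B": ([], {(): 0.30}),                        # P(B = 1), hypothetical
    "D": (["A", "B"], {(1, 1): 0.8, (1, 0): 0.1,  # P(D = 1 | A, B)
                       (0, 1): 0.1, (0, 0): 0.01}),
}

def joint_probability(network, assignment):
    """Probability of a full assignment: product of local conditionals."""
    p = 1.0
    for node, (parents, cpt) in network.items():
        parent_values = tuple(assignment[parent] for parent in parents)
        p_node_is_1 = cpt[parent_values]
        p *= p_node_is_1 if assignment[node] == 1 else 1.0 - p_node_is_1
    return p

print(joint_probability(network, {"A": 0, "B": 1, "D": 1}))  # 0.99 * 0.3 * 0.1
```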
Why hidden variables?
Representation: the minimal I-map, the structure that implies only independencies that hold in the marginal distribution, is typically complex.
Improve Learning: detecting the approximate position of a hidden variable is crucial pre-processing for the EM algorithm.
Understanding: a true hidden variable improves the quality and "order" of the explanation.

Characterizing Hidden Variables
The following theorem helps us detect structural signatures of the presence of hidden variables:
When a hidden variable H is marginalized out, the structure that preserves the I-map over the observed variables (introducing no new independencies) contains a clique over the children of H, with all parents of H connected to all of the children.
[Figure: a network with hidden H, its parents and children, next to the corresponding I-map-preserving structure over the observed variables: the children of H form a clique, and every parent of H is connected to every child.]
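A minimal sketch, assuming the construction described in the figure: given the edges of a DAG and the name of the hidden node, it builds the edge set of an I-map-preserving structure over the observed variables by keeping the untouched edges, adding a clique over the children of H, and connecting every parent of H to every child. The example edges and node names (X1–X3 as parents, Y1–Y3 as children) are hypothetical.

```python
from itertools import combinations

def marginal_imap_edges(edges, hidden):
    """Edge set of an I-map-preserving structure over the observed
    variables once `hidden` is marginalized out: keep every edge that
    does not touch `hidden`, connect its children into a clique, and
    connect each of its parents to each of its children."""
    parents = {u for (u, v) in edges if v == hidden}
    children = {v for (u, v) in edges if u == hidden}
    kept = {(u, v) for (u, v) in edges if hidden not in (u, v)}
    child_clique = set(combinations(sorted(children), 2))    # one possible orientation
    parent_to_child = {(p, c) for p in parents for c in children}
    return kept | child_clique | parent_to_child

# Hypothetical example mirroring the figure: H has parents X1, X2, X3
# and children Y1, Y2, Y3.
edges = [("X1", "H"), ("X2", "H"), ("X3", "H"),
         ("H", "Y1"), ("H", "Y2"), ("H", "Y3")]
for edge in sorted(marginal_imap_edges(edges, "H")):
    print(edge)
```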
The FindHidden Algorithm
Search for semi-cliques by expanding 3-clique seeds. A semi-clique S with N nodes is a set in which every node has at least N/2 neighbors within S.
Propose a candidate network:
(1) Introduce H as a parent of all nodes in S
(2) Replace all incoming edges to S by edges to H
(3) Remove all inter-S edges
(4) Make all children of S children of H, if the result is acyclic
FindHidden breaks the semi-clique: H becomes the parent of its nodes, and the surrounding structure is rerouted through H. Re-iterate with the best candidate.
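A minimal sketch of the two ingredients described above, under our own naming: a semi-clique test (every node of S has at least N/2 neighbors inside S) and the candidate construction (1)–(4) applied to a set of directed edges. Verifying that step (4) keeps the graph acyclic is noted in a comment but not implemented here.

```python
def is_semi_clique(nodes, edges):
    """True if every node in `nodes` is adjacent (in either direction)
    to at least N/2 nodes of the set, where N = len(nodes); this follows
    the poster's "# neighbors >= N/2" condition."""
    nodes = set(nodes)
    n = len(nodes)
    for v in nodes:
        neighbors = {u for (a, b) in edges for u in (a, b)
                     if v in (a, b) and u != v and u in nodes}
        if len(neighbors) < n / 2:
            return False
    return True

def propose_candidate(edges, S, hidden="H"):
    """Build the FindHidden candidate network for semi-clique S:
    (1) make `hidden` a parent of every node in S,
    (2) redirect edges entering S so that they point into `hidden`,
    (3) drop edges between members of S,
    (4) reattach children of S as children of `hidden`
        (acyclicity of the result should still be verified)."""
    S = set(S)
    new_edges = set()
    for (u, v) in edges:
        if u in S and v in S:
            continue                       # (3) remove inter-S edges
        elif u not in S and v in S:
            new_edges.add((u, hidden))     # (2) incoming edge -> edge to H
        elif u in S and v not in S:
            new_edges.add((hidden, v))     # (4) child of S -> child of H
        else:
            new_edges.add((u, v))          # edge untouched by S
    new_edges |= {(hidden, s) for s in S}  # (1) H is a parent of all of S
    return new_edges

# Hypothetical 3-clique seed expanded into S = {"A", "B", "C"}.
edges = {("A", "B"), ("A", "C"), ("B", "C"), ("P", "A"), ("B", "D")}
print(is_semi_clique({"A", "B", "C"}, edges))
print(sorted(propose_candidate(edges, {"A", "B", "C"})))
```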
Learning: Structural EM
Bayesian scoring metric:
    Score(G : D) = log P(G | D) = log P(D | G) + log P(G) + C
    where P(D | G) = ∫ P(D | G, θ) P(θ | G) dθ
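The poster's score is the full Bayesian (marginal-likelihood) score above. As an illustration of a decomposable structure score in the same spirit, the sketch below uses the standard BIC approximation, log P(D | G, θ̂) − (dim(G)/2) log N, with complete data and a uniform structure prior (so log P(G) is a constant and can be dropped). The data and the two structures in the example are hypothetical.

```python
import math
from collections import Counter

def bic_score(data, structure, states):
    """BIC approximation to log P(D | G): the log-likelihood at the MLE
    minus (number of free parameters / 2) * log N.
    `data` is a list of dicts, `structure` maps node -> list of parents,
    `states` maps node -> list of possible values."""
    N = len(data)
    loglik, dim = 0.0, 0
    for node, parents in structure.items():
        family_counts = Counter((tuple(row[p] for p in parents), row[node])
                                for row in data)
        parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
        for (pa_val, _), n_family in family_counts.items():
            loglik += n_family * math.log(n_family / parent_counts[pa_val])
        n_parent_configs = 1
        for p in parents:
            n_parent_configs *= len(states[p])
        dim += n_parent_configs * (len(states[node]) - 1)
    return loglik - 0.5 * dim * math.log(N)

# Hypothetical data over two binary variables: A and B perfectly correlated,
# so the structure A -> B should score higher than the empty structure.
data = [{"A": a, "B": a} for a in (0, 1)] * 10
print(bic_score(data, {"A": [], "B": ["A"]}, {"A": [0, 1], "B": [0, 1]}))
print(bic_score(data, {"A": [], "B": []},    {"A": [0, 1], "B": [0, 1]}))
```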
The algorithm alternates two steps over the training data:
E-Step: computation of expected counts N(X1), N(X2), N(X3), N(H, X1, X2, X3), ... under the current model.
M-Step: score and parameterize candidate structures using these counts; EM adapts the structure.
[Figure: the training data plus the current network, with hidden H over X1, X2, X3 and Y1, Y2, Y3, feed the E-step; the expected counts feed the M-step.]
We choose the best-scoring candidate produced by the SEM.
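A minimal sketch of the E-step for a single binary hidden variable H with binary observed children: for every training case we compute P(H | case) under the current parameters and accumulate it as a fractional (expected) count, which an M-step would then use to score and re-parameterize candidate structures. The model, parameters, and data below are hypothetical.

```python
# E-step sketch: expected counts N(H) and N(H, Xi) for a binary hidden
# parent H of binary observed children X1..X3 (a naive-Bayes-like family).
from collections import defaultdict

prior_h = 0.5                          # current estimate of P(H = 1)
p_child_given_h = {                    # current P(Xi = 1 | H), hypothetical
    "X1": {0: 0.2, 1: 0.9},
    "X2": {0: 0.3, 1: 0.8},
    "X3": {0: 0.1, 1: 0.7},
}
data = [{"X1": 1, "X2": 1, "X3": 0},
        {"X1": 0, "X2": 0, "X3": 0},
        {"X1": 1, "X2": 1, "X3": 1}]

expected_counts = defaultdict(float)   # keys ("H", h) and ("H", h, child, value)
for row in data:
    # Joint weight of each value of H with the observed children.
    weight = {h: (prior_h if h == 1 else 1 - prior_h) for h in (0, 1)}
    for h in (0, 1):
        for child, value in row.items():
            p1 = p_child_given_h[child][h]
            weight[h] *= p1 if value == 1 else 1 - p1
    posterior = {h: weight[h] / sum(weight.values()) for h in (0, 1)}  # P(H | case)
    for h in (0, 1):
        expected_counts[("H", h)] += posterior[h]
        for child, value in row.items():
            expected_counts[("H", h, child, value)] += posterior[h]

for key, count in sorted(expected_counts.items(), key=str):
    print(key, round(count, 3))
```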
Applying the algorithm
The Alarm network: HR is hidden, and the structure is learned from data. EM was applied with a Fixed structure, a Frozen structure (modifying only the semi-clique neighborhood), and a Flexible structure.
[Figure: the original Alarm network; the network learned from data with HR hidden, in which a semi-clique appears over HR's neighbors; and the network after FindHidden breaks the clique by introducing a hidden variable in its place.]
[Figure: score on training data and log-loss on test data for the Original, Naive, Hidden, and Reference networks on Alarm 1k, Alarm 10k, and Insurance 1k.]
Reference: a network with no hidden variable. Original: the golden model for the artificial datasets; best on test data. Naive: a hidden parent of all nodes; acts as a straw man. Hidden: the best FindHidden network; it outperforms Naive and Reference, and exceeds Original on training data. The efficient Frozen EM performs as well as the inefficient Flexible EM.

Real-life example: Stock data
[Figure: a network learned over stock data (MICROSOFT, DELL, 3Com, COMPAQ, ...); FindHidden introduces a hidden parent of all other nodes that behaves like a MARKET TREND variable with values "Strong" vs. "Stationary".]
Summary and Future Work
We highlighted the importance of hidden variables and implemented a natural idea for detecting them. FindHidden performed surprisingly well and proved extremely useful as a preliminary step to a learning algorithm.
Further extensions:
Experiment with multi-valued hidden variables
Use additional information, such as edge confidence
Detect hidden variables when the data is sparse
Explore additional structural signatures
Explore hidden variables in Probabilistic Relational Models