Independence Assumptions and Bayesian Updating

95
ARTIFICIAL INTELLIGENCE
Independence Assumptions and
Bayesian Updating
Clark Glymour
Department of History and Philosophy of Science, University of
Pittsburgh, Pittsburgh, PA 15260, U.S.A.
Recommended by E. Pednault
Relying on a result they attribute to Hussian [1], Pednault, Zucker, and
Muresan [2] claim to have proved that the independence assumptions used in
the P R O S P E C T O R [3] program do not permit the probabilities of hypotheses
to be changed by any evidence. The purpose of this note is to show that the
theorem stated by Pednault et al. is false, as is the result claimed by Hussian.
Let P be a probability function and let Hi . . . . . H , be a set of jointly
exhaustive and mutually exclusively hypotheses in the sense that
"~ P(/-/~) = 1
(1)
i
and
P(/-/~&/-/j)=O
if i / j .
(2)
Let E1 . . . . . Em be a sequence of evidence sentences. We say that no
updating occurs iff for every subsequence Ei. . . . . Ek of El . . . . . Em and every
/4,
P(I-IdE, . . . . . Ek ) = P(I--I~)
P(IZIi/E,. . . . . Ek) P(I2I~)
The independence assumptions used in the P R O S P E C T O R program are
P(EI . . . . . E,,,/I-I~) =
I-I
P(EilH~),
(3)
PI(EjlIZI~).
(4)
j=l
m
P(E,
.....
E,,,//--]~) = l-I
j=l
Artificial Intelligence 2,5 (1985) 95-99
0004-3702185/$3.30 @ 1985, Elsevier Science Publishers B.V. (North-Holland)
96
c. GLYMOUR
Pednault et al. are surely correct in claiming that requirements (3) and (4) are
excessively strong restrictions on the probability function P. Indeed, Pearl [4]
and Kim [5] have argued that keeping requirement (3) alone is empirically
more reasonable and yields an efficient updating scheme. If, however, Hussian's result were correct, (3) and the prior independence of the evidence would
be inconsistent with any updating, and Pednault et al. further claim that:
"Proposition. If the hypotheses HI, H2. . . . . HN with N > 2 are complete and
mutually exclusive, i.e., if E~=1 P(H/) = l, and if the assumptions (3) and (4) are
satisfied, then.., no updating takes place."
That this proposition is false may be shown by producing a system of five or
more events /-/1, //2, H3, Eb E2, and a probability function P, such that
equations (1), (2), (3), (4) hold but updating does take place.
Consider the following case:
We specify that
(i)
(ii)
(iii)
(iv)
(v)
(vi)
(vii)
(viii)
P(Hi)=~ for alli,
P(/-/~& Iq/) = 0 for i / j ,
P(EI)= ~,
P ( E I & H i ) = ~ for alli,
P(E2)= ~,
P(E2 & H1) = ~,
P(E2&H2)= P(Ez&H3)= O,
P ( E , & E:) = ~.
These conditions suffice to determine the value of P for every Boolean
combination of El, E2, HI, H2,/-/3. In particular, they entail that
P(EI/I-'I~) = P(Ea & I-I~)/P(I--I~)= 12,
P(E2/H1) = | ,
P(E2/H2) = P(Ez/H3)= O,
P(E, & E2/H1) = ~,
P(Ej & E z / H 2 )
=
P(E~ & E2/H3) - O,
as can readily be verified from the representation of the probability distribution
as a Venn diagram (see Fig. 1).
Observe that:
~, P(Hi)= 1 by (i).
(1)
97
INDEPENDENCE ASSUMPTIONS AND BAYESIAN UPDATING
H,E~~H2
°
E, (P: 1/2)
H~
FIG. 1.
P(/-/~&/-/j)=0
P(E, &
if i J j
E21I-I,) =
(2)
by (ii).
(3)
P(EJI-I~). P(E21Hi) ,
because if i = 1,
P(E1 & E2/I-I~) = ~ = ~. 1 = P(EI/I-I~). P(E2/I-I~) ;
and if i-- 2 or 3.
P ( E , & E2/H~) = 0 = ~. 0 = P(E,/I--I~) . P(E2/I-I~) .
And
(4)
P ( E , & E2/ffl~) = P(E1/ffI~) . P(E2/ffI~) ,
because if i = 1,
P ( E , & Ez/f-I~) = P(E1 & E2/Hz v / / 3 )
= P(E,&E2&H2)+ P(E,&E2&H3)=
P (/-/2) + P (//3)
and
p(E,/iYi~) = P(E1 & HE) + P(E1 & Ha)
3~ t
P ( H z ) + P(H3)
=-5--= 2
and P(E2/ffli)= O. So
P(E,/G).
P(EjG)
= o ;
0
98
c. GLYMOUR
and i f i = 2 o r 3 ,
k/i,
k/l,
P(E, & E2/fft~) = P(E1 & E2/H, v Hk)
= P(E, & E2 & H,) + P(E, & E2 & Hk)
P(H~) + P(Hk)
1
_6+0_
--
2
3_1
12 4
and
P(EI/I7I~) • P(E2/IsI~) = P(E,/H1 v Hk)" P(E2/H, v Hk)
= [P(E, & H 0 + P(E, &
L
P(H,)&P(Hk) Ilk)]
[P(E2 & H1) + P(E2 & Ilk)]
X L . P(H1)+P(Hk)
J
[~+~]
[~+0]
1
So P(Et & E21ISI~)=P(Ex/ISI~).P(E2/1sI~).Thus all four conditions of the hypothesis of the theorem are met. But
P(I-Ii)/P(ISIi) = ~1 for all i
and
P(Hz/E~ & E2) = P ( H 3 / E , & E2) =
P(flE/E, & E2) P(~I3/E1 & E2)
0
and
P(H1/EI & E2)
P(ISIl/E1 & E2)
is undefined. Hence the theorem is false.
The error in the proof given by Pednault et al. lies in their use of a result
claimed by Hussian, namely,
"Theorem. Let the set of hypotheses Hi, i = 1, 2 . . . . . N be exhaustive and
mutually exclusive; if
INDEPENDENCE ASSUMPTIONS AND BAYESIAN UPDATING
99
P(E~ . . . . . Era) = r-I P(Ek),
k=l
P ( E , . . . . . E,,,It-I~) = FI P ( E k l H O ,
for all i,
k-I
then, for all i and k
P(EklI-I~) = P ( E k ) . "
H u s s i a n ' s d e r i v a t i o n c o n t a i n s an a l g e b r a i c e r r o r . T h e c o u n t e r e x a m p l e just
given to t h e t h e o r e m c l a i m e d by P e d n a u l t et al. also shows that H u s s i a n ' s result
is false, a n d o t h e r c o u n t e r e x a m p l e s to t h e l a t t e r ' s t h e o r e m a r e easily p r o d u c e d .
I d o n o t k n o w w h e t h e r t h e result c l a i m e d b y P e d n a u l t et al. w o u l d b e t r u e if
o n e r e q u i r e d in a d d i t i o n to (1)-(4) t h a t for all i, P ( H J E 1 . . . . . Ek) ~ O.
F r o m the B a y e s i a n p o i n t of view, e l i m i n a t i v e i n d u c t i o n is just a special case
in which t h e e v i d e n c e d e t e r m i n e s a p o s t e r i o r p r o b a b i l i t y of unity for o n e
h y p o t h e s i s a n d of z e r o for all o t h e r s . T h u s t h e e x a m p l e given h e r e shows t h a t
t h e v e r y s t r o n g i n d e p e n d e n c e a s s u m p t i o n s in P R O S P E C T O R are c o n s i s t e n t
with l e a r n i n g b y e l i m i n a t i v e i n d u c t i o n . This conclusion d o e s n o t a d d r e s s the
e m p i r i c a l a d e q u a c y of P R O S P E C T O R ' s u p d a t i n g s c h e m e , which also c o n t a i n s
p r o v i s i o n s for ad hoc u p d a t i n g rules [6]. It d o e s , I h o p e , lay to rest t h e claim
t h a t t h e p r o g r a m ' s B a y e s i a n c o m p o n e n t is i m p o t e n t if strictly a p p l i e d .
ACKNOWLEDGMENT
Research for this note was supported by the National Science Foundation. I thank Henry Kyburg
and Wesley Salmon for checking my derivations, and two anonymous referees for helpful
suggestions about presentation.
REFERENCES
I. Hussian, A., On the correctness of some sequential classification schemes in pattern recognition,
IEEE Trans. Comput. 21 (1972) 318-320.
2. Pednault, E., Zucker, S. and Muresan, I., On the independence assumption underlying subjective Bayesian updating, Artificial Intelligence 16 (1981) 213-222.
3. Duda, R., Hart, P. and Nilsson, N., Subjective Bayesian methods for rule-based inference
systems, Tech. Note 124, Artificial Intelligence Center, SRI, International, Menlo Park, CA.
4. Pearl, J., Reverend Bayes on inference engines: a distributed hierarchical approach, Proc.
National Conf. Artificial Intelligence, Carnegie-Mellon University, Pittsburgh, PA (1982) 133136.
5. Kim, J. and Pearl, J., A computational model for combined causal and diagnostic reasoning in
inference systems, Proc. 8th Internat. Joint Conf. Artificial Intelligence, Karlsruhe, West Germany, 1983.
6. Duda, R., Gaschnig, J. and Hart, P., Model design in the PROSPECTOR system for mineral
exploration, in: D. Michie (Ed.), Expert Systems in the Microelectronic Age (Edinburgh Press,
Edinburgh, 1979) 153-167.
Received September 1983; revised version received February 1984