COMP 307 — Lecture 16
Uncertainty and Probability 4: Probabilities in BN and How to Build a BN
Dr Bing Xue
[email protected]
Thomas Bayes (/ˈbeɪz/; c. 1701 – 7 April 1761)

COMP307 BN: 2 Outline
• A Bayesian Network Example
• Semantics of Bayesian Networks
• Probabilities in Bayesian Networks
• Conditional Independence
• Build a Bayesian Network
• Node Ordering and Compactness
• Summary

COMP307 BN: 3 Bayesian Network Example: A Lazy Detective

COMP307 BN: 4 Joint Probability
• P(T, W, M, C) = P(T) P(W|T) P(M|T, W) P(C|T, W, M)
• Number of free parameters: 1 + 2 + 4 + 8 = 15
• A conditional probability distribution table conditioned on n binary variables has 2^n entries
• Full joint distribution tables:
  - n binary variables: 2^n - 1 free parameters
  - if each variable has A possible values/states: A^n - 1 free parameters
  - too big to represent explicitly unless there are only a few variables
  - hard to learn (estimate) empirically over a large number of variables at a time; it needs a lot of data
• How can a Bayesian Network help?
  - A Bayes Net = Topology (graph) + Local Conditional Probabilities

COMP307 BN: 5 Probabilities in Bayesian Networks
• Chain rule (valid for all distributions):
  P(x1, x2, ..., xn) = ∏_i P(xi | x1, ..., xi-1)
• If each node is conditionally independent of its preceding nodes given its parents:
  P(xi | x1, ..., xi-1) = P(xi | Parents(xi))
• Joint probability:
  P(x1, x2, ..., xn) = ∏_i P(xi | Parents(Xi)), provided Parents(Xi) ⊆ {X1, ..., Xi-1}
  Easier: fewer free parameters
• This condition also lets us construct a network from a given ordering of nodes, using Pearl's network construction algorithm (1988); see slide 12
• Example (the lung-cancer network of Figure 2.1: Pollution, Smoker, Cancer, XRay, Dyspnoea):
  P(X=pos ∧ D=T ∧ C=T ∧ P=low ∧ S=F)
    = P(X=pos | D=T, C=T, P=low, S=F) P(D=T | C=T, P=low, S=F) P(C=T | P=low, S=F) P(P=low | S=F) P(S=F)
    = P(X=pos | C=T) P(D=T | C=T) P(C=T | P=low, S=F) P(P=low) P(S=F)

COMP307 BN: 6 Probabilities in Bayesian Networks
• Bayesian Networks implicitly encode joint distributions/probabilities:
  - describing complex joint distributions (models) using simple, local distributions (conditional probabilities)
  - describing how variables interact
  - local interactions chain together to give global, indirect interactions
• The joint is a product of local conditional distributions:
  - with the network structure: P(T, W, M, C) = P(T) P(W) P(M|T, W) P(C|M)
  - full chain rule: P(T, W, M, C) = P(T) P(W|T) P(M|T, W) P(C|T, W, M)
• Be careful with the order (numbering): start with the nodes that have no parent

COMP307 BN: 7 Probabilities in Bayesian Networks
• Product rule (always true): P(A, B, C) = P(A) P(B|A) P(C|A, B): 7 free parameters
• Common cause: P(A, B, C) = P(A) P(B|A) P(C|A): ?
• Common effect: P(A, B, C) = P(A) P(B) P(C|A, B): ?
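The factorisation on slides 5 and 6 is easy to make concrete. Below is a minimal Python sketch (not from the lecture; the variable names T, W, M, C follow slide 6, but the CPT numbers are invented for illustration) that stores a Bayes net as topology plus local CPTs, evaluates the joint as the product of local conditionals, and counts free parameters as on slide 4.

```python
from itertools import product

# Topology of the slide-6 factorisation: P(T) P(W) P(M|T,W) P(C|M).
parents = {"T": [], "W": [], "M": ["T", "W"], "C": ["M"]}

# One CPT per node: cpt[var][parent_values] = P(var = True | parents).
# All probability numbers below are invented, for illustration only.
cpt = {
    "T": {(): 0.3},
    "W": {(): 0.6},
    "M": {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.4, (False, False): 0.1},
    "C": {(True,): 0.8, (False,): 0.2},
}

def joint(assignment):
    """P(assignment) as the product of local conditionals (slide 5)."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpt[var][tuple(assignment[u] for u in pa)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

# Free parameters: 2^|parents| numbers per binary node (slide 4's counting).
print(sum(2 ** len(pa) for pa in parents.values()))   # 1 + 1 + 4 + 2 = 8

# Sanity check: the implicitly encoded joint distribution sums to 1.
print(round(sum(joint(dict(zip(parents, v)))
                for v in product([False, True], repeat=4)), 10))  # 1.0
```

With this structure the four binary variables need only 1 + 1 + 4 + 2 = 8 numbers, versus the 2^4 - 1 = 15 an explicit joint table would require; the naive Bayes model of slide 9 is the same idea with every feature having the class as its single parent.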
COMP307 BN: 8 Conditional Independence in BN
• Direct cause: P(A, B) = P(A) P(B|A): 3 free parameters
• Indirect cause (causal chain): P(A, B, C) = P(A) P(B|A) P(C|B): 5 free parameters
  - P(A, B) P(C|A, B) = P(A, B) P(C|B)  <=>  P(C|A, B) = P(C|B)
• Common cause (multiple effects): P(A, B, C) = P(A) P(B|A) P(C|A)
  - P(A, B) P(C|A, B) = P(A, B) P(C|A)  <=>  P(C|A, B) = P(C|A)
  - The effects become independent once the common cause is known
• Common effect (multiple causes): P(A, B, C) = P(A) P(B) P(C|A, B)
  - Explaining away: the causes become dependent once their effect is known (observing one cause means the alternative cause has been "explained away")
  - e.g. C = happy, A = finished assignment, B = sunny
• In each case: fewer free parameters than the full product rule

COMP307 BN: 9 Common Cause: Naive Bayes
• Assume the features are conditionally independent given the class label:
  - P(C, X1, X2, ..., Xn) = P(C) P(X1|C) P(X2|C) ... P(Xn|C)

COMP307 BN: 10-11 Example: Car
• Diagnostic reasoning: P(Cause | Effect)

COMP307 BN: 12 Build a Bayesian Network
• Pearl's Network Construction Algorithm (one way to build a BN):
  1. Choose a set of relevant variables that describe the domain
  2. Choose an order for the variables
  3. While there are variables left:
     - add the next variable Xi to the network
     - add arcs to the Xi node from a minimal set of nodes (parents) already in the network, such that the conditional independence property is satisfied:
       P(Xi | X'1, ..., X'm) = P(Xi | Parents(Xi))
       where X'1, ..., X'm are all the variables preceding Xi
     - define the conditional probability table for Xi
• The algorithm simply processes each node in order, adding it to the existing network with arcs from a minimal set of parents that renders the current node conditionally independent of every other node preceding it (sketched in code below, after slide 14)

COMP307 BN: 13 Example: Alarm Network
• Variables:
  - Burglary
  - Earthquake
  - Alarm goes off
  - Mary calls
  - John calls
  - Traffic

COMP307 BN: 14 Compactness and Node Ordering
• It is desirable to build the most compact BN possible:
  - the more compact the model, the more tractable it is: fewer probability values require specification, it occupies less computer memory, and probability updates are more computationally efficient
  - overly dense networks fail to represent independencies explicitly
  - overly dense networks fail to represent the causal dependencies in the domain
• Compactness depends on getting the node ordering "right": the optimal order is to add the root causes first, then the variable(s) they influence directly, and so on until the leaves are reached
  - fewer parents: smaller table, fewer free parameters (fewer probability values requiring specification)
  - one may not always know the causal order of the variables; in that case automated discovery methods should be used
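Below is a runnable sketch of the slide-12 construction loop, not from the lecture. It assumes all variables are binary and that we can query an exact full joint distribution, so conditional independence is checked by arithmetic rather than by domain knowledge or statistical tests on data (which is what you would use in practice). The joint follows the Figure 2.1 lung-cancer structure (P and S cause C; C causes X and D) with invented CPT numbers.

```python
from itertools import combinations, product

VARS = ["P", "S", "C", "X", "D"]   # Pollution, Smoker, Cancer, XRay, Dyspnoea

def model(a):
    """Joint probability of one full assignment under the causal structure."""
    p = (0.9 if a["P"] else 0.1) * (0.3 if a["S"] else 0.7)        # roots P, S
    pc = {(True, True): 0.05, (True, False): 0.02,                 # C | P, S
          (False, True): 0.03, (False, False): 0.001}[(a["P"], a["S"])]
    p *= pc if a["C"] else 1 - pc
    px = 0.9 if a["C"] else 0.2                                    # X | C
    p *= px if a["X"] else 1 - px
    pd = 0.65 if a["C"] else 0.3                                   # D | C
    return p * (pd if a["D"] else 1 - pd)

JOINT = {v: model(dict(zip(VARS, v)))
         for v in product([False, True], repeat=len(VARS))}

def prob(partial):
    """Marginal probability of a partial assignment {var: value, ...}."""
    return sum(p for v, p in JOINT.items()
               if all(v[VARS.index(x)] == val for x, val in partial.items()))

def independent_given(var, parents, preceding):
    """Does P(var | preceding) == P(var | parents) for every assignment?"""
    for vals in product([False, True], repeat=len(preceding)):
        ctx = dict(zip(preceding, vals))
        sub = {x: ctx[x] for x in parents}
        if prob(ctx) < 1e-12:
            continue                      # impossible context, nothing to test
        if abs(prob({**ctx, var: True}) / prob(ctx)
               - prob({**sub, var: True}) / prob(sub)) > 1e-9:
            return False
    return True

def pearl_construct(order):
    """Slide 12: add each node with a minimal parent set among its predecessors."""
    net = {}
    for i, var in enumerate(order):
        preceding = order[:i]
        net[var] = next(list(c) for size in range(len(preceding) + 1)
                        for c in combinations(preceding, size)
                        if independent_given(var, c, preceding))
    return net

# Causal ordering vs the reversed-style ordering of Figure 2.3(a) (next slide).
for order in (["P", "S", "C", "X", "D"], ["D", "X", "C", "P", "S"]):
    net = pearl_construct(order)
    print(order, "->", net,
          "| free parameters:", sum(2 ** len(pa) for pa in net.values()))
```

With the causal ordering the algorithm recovers the original structure (10 free parameters); the ordering <D, X, C, P, S> yields the denser network of Figure 2.3(a) (13 free parameters), illustrating slide 14's point that ordering determines compactness.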
COMP307 BN: 15 Ordering and Compactness
• Figure 2.3: alternative structures for the lung-cancer network (original ordering <P, S, C, X, D>), obtained using Pearl's network construction algorithm with orderings (a) <D, X, C, P, S> and (b) <D, X, P, S, C>
• The network structure depends on the order of introduction, top to bottom
• Causes do not have to be introduced before (upstream of) their effects, but doing so leads to simpler networks with fewer free parameters

COMP307 BN: 16 Summary
• Semantics of Bayesian Networks
  - A Bayes Net = Topology (graph) + Local Conditional Probabilities
  - local interactions chain together to give global, indirect interactions
  - describe joint distributions using simple, local distributions (conditional probabilities)
• Conditional Independence
  - a node is independent of its preceding nodes given its parents
  - supports different types of reasoning
• Build a Bayesian Network
  - order the nodes, add them to the graph, satisfy the conditional independence property
  - compactness is important and depends on the order of the nodes
• Next Lectures: inference in Bayesian Networks