Bayesian Networks
主講人 (Lecturer): 虞台文
大同大學資工所 智慧型多媒體研究室 (Graduate Institute of Computer Science and Engineering, Tatung University, Intelligent Multimedia Laboratory)

Contents
- Introduction
- Probability Theory (skipped)
- Inference
- Clique Tree Propagation
  - Building the Clique Tree
  - Inference by Propagation

Introduction

What Are Bayesian Networks?
- Bayesian networks are directed acyclic graphs (DAGs) with an associated set of probability tables.
- The nodes are random variables.
- Certain independence relations are induced by the topology of the graph.

Why Use a Bayesian Network?
- To deal with uncertainty in inference via probability (Bayes).
- To handle incomplete data sets, e.g., in classification and regression.
- To model domain knowledge, e.g., causal relationships: the DAG models the causality.

Example
A network over the variables Train Strike, Martin Oversleep, Norman Oversleep, Boss Failure-in-Love, Martin Late, Norman Late, Norman Untidy, Office Dirty, Project Delay, and Boss Angry.

Attach prior probabilities to all root nodes:

| Root node | P(T) | P(F) |
|---|---|---|
| Martin Oversleep | 0.01 | 0.99 |
| Train Strike | 0.1 | 0.9 |
| Norman Oversleep | 0.2 | 0.8 |
| Boss Failure-in-Love | 0.01 | 0.99 |

Attach conditional probability tables (CPTs) to all non-root nodes. Each column sums to 1. For example:

P(Martin Late | Train Strike, Martin Oversleep):

| Train Strike | T | T | F | F |
|---|---|---|---|---|
| Martin Oversleep | T | F | T | F |
| Martin Late = T | 0.95 | 0.8 | 0.7 | 0.05 |
| Martin Late = F | 0.05 | 0.2 | 0.3 | 0.95 |

P(Norman Untidy | Norman Oversleep):

| Norman Oversleep | T | F |
|---|---|---|
| Norman Untidy = T | 0.6 | 0.2 |
| Norman Untidy = F | 0.4 | 0.8 |

Boss Angry is conditioned on Boss Failure-in-Love and Project Delay and takes four values (very, mid, little, no); each column of its CPT also sums to 1.
(Discussion: what is the difference between probability and fuzzy measurements?)

Medical Knowledge Example (figure)

Definition of Bayesian Networks
A Bayesian network is a directed acyclic graph with the following properties:
- Each node represents a random variable.
- Each node representing a variable A, with parent nodes representing variables B1, B2, ..., Bn, is assigned a conditional probability table (CPT): P(A | B1, B2, ..., Bn).

Problems
- How to do inference?
- How to learn the probabilities from data?
- How to learn the structure from data?
Bad news: all of them are NP-hard. What applications may we have?

Inference

Inference Example
A three-node network: Train Strike (C) is the parent of both Martin Late (A) and Norman Late (B).

| Train Strike | Probability |
|---|---|
| T | 0.1 |
| F | 0.9 |

P(Martin Late | Train Strike):

| Train Strike | T | F |
|---|---|---|
| Martin Late = T | 0.6 | 0.5 |
| Martin Late = F | 0.4 | 0.5 |

P(Norman Late | Train Strike):

| Train Strike | T | F |
|---|---|---|
| Norman Late = T | 0.8 | 0.1 |
| Norman Late = F | 0.2 | 0.9 |

Questions:
- P(Martin Late, Norman Late, Train Strike) = ?   (joint distribution)
- P(Martin Late) = ?   (marginal distribution)
- P(Martin Late | Norman Late) = ?   (conditional distribution)
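These three quantities can be computed mechanically from the tables above. The following is a minimal Python sketch (the dictionary encoding and variable names are mine, not from the slides); it reproduces the numbers derived on the next few slides.

```python
from itertools import product

# CPTs from the slides: C = Train Strike, A = Martin Late, B = Norman Late.
P_C = {True: 0.1, False: 0.9}
P_A_given_C = {True: {True: 0.6, False: 0.4},    # keyed by c, then a
               False: {True: 0.5, False: 0.5}}
P_B_given_C = {True: {True: 0.8, False: 0.2},
               False: {True: 0.1, False: 0.9}}

# Joint distribution: P(A, B, C) = P(A | C) * P(B | C) * P(C).
joint = {(a, b, c): P_A_given_C[c][a] * P_B_given_C[c][b] * P_C[c]
         for a, b, c in product([True, False], repeat=3)}
print(joint[(True, True, True)])              # 0.048

# Marginals, by summing variables out of the joint.
P_A_T = sum(p for (a, b, c), p in joint.items() if a)
P_B_T = sum(p for (a, b, c), p in joint.items() if b)
print(P_A_T)                                  # 0.51

# Conditional: P(A=T | B=T) = P(A=T, B=T) / P(B=T).
P_AB_TT = sum(p for (a, b, c), p in joint.items() if a and b)
print(P_AB_TT / P_B_T)                        # 0.093 / 0.17 ~= 0.5471
```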
Example: Joint Distribution
P(A, B, C) = P(A | B, C) P(B | C) P(C) = P(A | C) P(B | C) P(C), since A is independent of B given C.
e.g., P(A = T, B = T, C = T) = 0.6 × 0.8 × 0.1 = 0.048.

| A | B | C | Probability |
|---|---|---|---|
| T | T | T | 0.048 |
| F | T | T | 0.032 |
| T | F | T | 0.012 |
| F | F | T | 0.008 |
| T | T | F | 0.045 |
| F | T | F | 0.045 |
| T | F | F | 0.405 |
| F | F | F | 0.405 |

Example: Marginal Distribution
P(A, B) = Σ_C P(A, B, C), e.g., P(A = T, B = T) = 0.048 + 0.045 = 0.093.

| A | B | Probability |
|---|---|---|
| T | T | 0.093 |
| F | T | 0.077 |
| T | F | 0.417 |
| F | F | 0.413 |

P(A) = Σ_{B,C} P(A, B, C) = Σ_B P(A, B), e.g., P(A = T) = 0.093 + 0.417 = 0.51, so P(A = T) = 0.51 and P(A = F) = 0.49.

Example: Conditional Distribution
P(A | B) = P(A, B) / P(B), with P(B = T) = 0.17 and P(B = F) = 0.83.
e.g., P(A = T | B = T) = 0.093 / 0.17 ≈ 0.5471.
The given terms (here "Norman Late") are called evidence.

Inference Methods
Exact algorithms:
- Probability propagation
- Variable elimination
- Cutset conditioning
- Dynamic programming
Approximation algorithms:
- Variational methods
- Sampling (Monte Carlo) methods
- Loopy belief propagation
- Bounded cutset conditioning
- Parametric approximation methods

Independence Assertions
Bayesian networks have built-in independence assertions. An independence assertion is a statement of the form "X and Y are independent given Z", that is,
P(X | Y, Z) = P(X | Z), or equivalently P(X, Y | Z) = P(X | Z) P(Y | Z).
We say that X and Y are d-separated by Z.

d-Separation
(Example graph: Y1, ..., Y4 are parents of Z; Z is the parent of X1, X2, X3; the Xi have children W1, W2.) Questions:
- Is Xi independent of Yj given Z?
- Is Xi independent of Xj given Z, for i ≠ j?
- Is Yi independent of Yj given Z, for i ≠ j?

Types of Connections
- Serial connections: Yi → Z → Xj.
- Diverging connections: Xi ← Z → Xj.
- Converging connections: Yi → Z ← Yj (e.g., Y3 - Z - Y4).

d-Separation by Connection Type
- Serial (X → Z → Y): X and Y are d-separated by Z.
- Diverging (X ← Z → Y): X and Y are d-separated by Z.
- Converging (X → Z ← Y): X and Y are not d-separated by Z; they are independent only when neither Z nor any of its descendants is given.

JPT: joint probability table. CPT: conditional probability table.

Joint Distribution
By the chain rule,
P(X1, X2, ..., Xn) = P(Xn | X1, ..., Xn−1) P(Xn−1 | X1, ..., Xn−2) ··· P(X2 | X1) P(X1) = ∏_{i=1}^{n} P(Xi | X1, ..., Xi−1),
and by the independence assertions,
P(X1, X2, ..., Xn) = ∏_{i=1}^{n} P(Xi | Πi),
where Πi denotes the parents of Xi. With this, we can compute all probabilities.

Consider binary random variables (example network X1, ..., X11):
1. To store the JPT of all r.v.'s: 2^n − 1 table entries.
2. To store the CPTs of all r.v.'s: 2^{|Πi|} entries are required per node, i.e., Σ_{i=1}^{n} 2^{|Πi|} in total.
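The storage comparison is easy to check in a few lines. The parent-set sizes below are read off the per-node entry counts on the next slide (e.g., X6 has 8 CPT entries, hence three parents); treat them as illustrative.

```python
# |Pi_i| for X1 .. X11, inferred from the per-node entry counts 2**|Pi_i|.
parent_counts = [0, 0, 0, 1, 1, 3, 1, 1, 1, 2, 2]

n = len(parent_counts)
jpt_entries = 2 ** n - 1                          # one full joint table
cpt_entries = sum(2 ** k for k in parent_counts)  # one CPT per node
print(jpt_entries, cpt_entries)                   # 2047 29
```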
For the example network with n = 11:
- To store the JPT of all random variables: 2^11 − 1 = 2047 entries.
- To store the CPTs of all random variables: 1 + 1 + 1 + 2 + 2 + 8 + 2 + 2 + 2 + 4 + 4 = 29 entries (X1 through X11).

More on d-Separation
A path from X to Y is d-connecting w.r.t. evidence nodes E if every interior node N on the path has the property that either
1. it is serial (linear) or diverging and not a member of E; or
2. it is converging, and either N or one of its descendants is in E.
Exercise (on the example figure): identify the d-connecting and non-d-connecting paths from X to Y.

Two nodes are d-separated if there is no d-connecting path between them.
Exercise: remove the minimum number of edges such that X and Y are d-separated.

Two sets of nodes, say X = {X1, ..., Xm} and Y = {Y1, ..., Yn}, are d-separated w.r.t. evidence nodes E if every pair Xi, Yj is d-separated w.r.t. E. In this case, we have
P(X, Y, E) = P(X | Y, E) P(Y, E) = P(X | E) P(Y, E) = P(X, E) P(Y, E) / P(E).

Clique Tree Propagation

References
Developed by Lauritzen and Spiegelhalter and refined by Jensen et al.:
- Lauritzen, S. L., and Spiegelhalter, D. J., Local computations with probabilities on graphical structures and their application to expert systems, J. Roy. Stat. Soc. B, 50, 157-224, 1988.
- Jensen, F. V., Lauritzen, S. L., and Olesen, K. G., Bayesian updating in causal probabilistic networks by local computations, Comp. Stat. Quart., 4, 269-282, 1990.
- Shenoy, P., and Shafer, G., Axioms for probability and belief-function propagation, in Uncertainty in Artificial Intelligence, Vol. 4 (R. D. Shachter, T. Levitt, J. F. Lemmer and L. N. Kanal, Eds.), Elsevier, North-Holland, Amsterdam, 169-198, 1990.

Clique Tree Propagation (CTP)
- Given a Bayesian network, build a secondary structure, called a clique tree (an undirected tree).
- Inference proceeds by propagating belief potentials among the tree nodes.
- It is an exact algorithm.

Notations

| Item | | Notation | Examples |
|---|---|---|---|
| Random variables | uninstantiated | uppercase | A, B, C |
| | instantiated | lowercase | a, b, c |
| Random vectors | uninstantiated | boldface uppercase | X, Y, Z |
| | instantiated | boldface lowercase | x, y, z |

Definition: Family of a Node
The family of a node V, denoted F_V, is defined by F_V = {V} ∪ Π_V.
Examples (for the eight-node network A, ..., H used below): F_A = {A}, F_B = {A, B}, F_H = {E, G, H}.

Potentials and Distributions
We will model the probability tables as potential functions. All of these tables map a set of random variables to a real value:
- A prior probability such as P(a), a function of a: P(a = on) = 0.5, P(a = off) = 0.5.
- A conditional probability such as P(b | a), a function of a and b:

| | b = on | b = off |
|---|---|---|
| a = on | 0.7 | 0.3 |
| a = off | 0.2 | 0.8 |

- A conditional probability such as P(f | d, e), a function of d, e, and f:

| d | on | on | off | off |
|---|---|---|---|---|
| e | on | off | on | off |
| f = on | 0.95 | 0.8 | 0.7 | 0.05 |
| f = off | 0.05 | 0.2 | 0.3 | 0.95 |

Potentials
A potential over a set of variables X is a map φ_X : Ω_X → R, used to implement matrices or tables. Two operations are defined:
1. Marginalization: φ_Y = Σ_{X\Y} φ_X, for Y ⊆ X.
2. Multiplication: φ_Z = φ_X φ_Y, where Z = X ∪ Y.
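Both operations are a few lines each in a dictionary-based representation (the representation and helper names are mine; a practical implementation would use multidimensional arrays). The example at the bottom reproduces the worked tables on the next slides.

```python
from itertools import product

# A potential phi_X is a pair (variables, table), where table maps a tuple of
# values (one per variable, in order) to a real number.

def marginalize(variables, table, keep):
    """phi_Y = sum over X\\Y of phi_X, for keep = Y, a subset of X."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for values, p in table.items():
        key = tuple(values[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return keep, out

def multiply(vars_x, tab_x, vars_y, tab_y, domain=(True, False)):
    """phi_Z(z) = phi_X(x) * phi_Y(y) for the x, y consistent with z."""
    vars_z = list(vars_x) + [v for v in vars_y if v not in vars_x]
    ix = [vars_z.index(v) for v in vars_x]
    iy = [vars_z.index(v) for v in vars_y]
    out = {}
    for values in product(domain, repeat=len(vars_z)):
        x = tuple(values[i] for i in ix)
        y = tuple(values[i] for i in iy)
        out[values] = tab_x[x] * tab_y[y]
    return vars_z, out

# The marginalization example from the next slide: phi_ABC summed onto {A, B}.
abc = {(True, True, True): 0.048, (False, True, True): 0.032,
       (True, False, True): 0.012, (False, False, True): 0.008,
       (True, True, False): 0.045, (False, True, False): 0.045,
       (True, False, False): 0.405, (False, False, False): 0.405}
_, ab = marginalize(['A', 'B', 'C'], abc, ['A', 'B'])
print(ab[(True, True)])                # 0.093

# The multiplication example: phi_ABC = phi_AB * phi_BC.
bc = {(True, True): 0.08, (False, True): 0.02,
      (True, False): 0.09, (False, False): 0.91}
_, prod_tab = multiply(['A', 'B'], ab, ['B', 'C'], bc)
print(prod_tab[(True, True, True)])    # 0.093 * 0.08 = 0.00744
```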
Marginalization φ_Y = Σ_{X\Y} φ_X (example)
Let X = {A, B, C}, Y1 = {A, B}, Y2 = {A}; then φ_Y1 = Σ_C φ_X and φ_Y2 = Σ_{B,C} φ_X:

| A | B | C | φ_ABC |
|---|---|---|---|
| T | T | T | 0.048 |
| F | T | T | 0.032 |
| T | F | T | 0.012 |
| F | F | T | 0.008 |
| T | T | F | 0.045 |
| F | T | F | 0.045 |
| T | F | F | 0.405 |
| F | F | F | 0.405 |

| A | B | φ_AB |
|---|---|---|
| T | T | 0.093 |
| F | T | 0.077 |
| T | F | 0.417 |
| F | F | 0.413 |

| A | φ_A |
|---|---|
| T | 0.51 |
| F | 0.49 |

Multiplication φ_Z(z) = φ_X(x) φ_Y(y) (example)
Here x and y are the instantiations consistent with z, and Z = X ∪ Y. The product does not necessarily sum to one. Let Z = {A, B, C}, X = {A, B}, Y = {B, C}, so φ_ABC = φ_AB φ_BC, with

| B | C | φ_BC |
|---|---|---|
| T | T | 0.08 |
| F | T | 0.02 |
| T | F | 0.09 |
| F | F | 0.91 |

| A | B | C | φ_ABC |
|---|---|---|---|
| T | T | T | 0.093 × 0.08 = 0.00744 |
| F | T | T | 0.077 × 0.08 = 0.00616 |
| T | F | T | 0.417 × 0.02 = 0.00834 |
| F | F | T | 0.413 × 0.02 = 0.00826 |
| T | T | F | 0.093 × 0.09 = 0.00837 |
| F | T | F | 0.077 × 0.09 = 0.00693 |
| T | F | F | 0.417 × 0.91 = 0.37947 |
| F | F | F | 0.413 × 0.91 = 0.37583 |

The Secondary Structure
Given a Bayesian network over a set of variables U = {V1, ..., Vn}, its secondary structure contains a graphical and a numerical component.
- Graphical component: an undirected clique tree satisfying the join tree property.
- Numerical component: belief potentials on nodes and edges.

The Clique Tree T
The clique tree T for a belief network over U = {V1, ..., Vn} satisfies the following properties:
- Each node in T is a cluster, or clique (a nonempty set of variables).
- The clusters satisfy the join tree property: given two clusters X and Y in T, all clusters on the path between X and Y contain X ∩ Y.
- For each variable V ∈ U, F_V is included in at least one cluster.
- Sepsets: each edge in T is labeled with the intersection of the adjacent clusters.
Example (for the network A, ..., H):
ABD -AD- ADE -AE- ACE -CE- CEG, with ADE -DE- DEF and CEG -EG- EGH.

The Numeric Component
Clusters and sepsets are attached with belief potentials. For each cluster X and neighboring sepset S, it holds that
φ_S = Σ_{X\S} φ_X   (local consistency),
and it also holds that
P(U) = ∏_i φ_{Xi} / ∏_j φ_{Sj}   (global consistency).
The key step to satisfying these constraints is letting φ_X = P(X) and φ_S = P(S). If so,
P(V) = Σ_{X\{V}} φ_X for V ∈ X, and P(V) = Σ_{S\{V}} φ_S for V ∈ S.

Building the Clique Tree

The Steps
Belief Network → Moral Graph → Triangulated Graph → Clique Set → Join Tree

Moral Graph
1. Convert the directed graph to an undirected one.
2. Connect each pair of parent nodes of each node.

Triangulation
1. Triangulate the graph: add chords so that every cycle of length greater than 3 has one.
There are many ways to do this; in practice, this step is done by incorporating it into the next step.

Select Clique Set
1. Copy G_M to G_M'.
2. While G_M' is not empty:
   a) Select a node V from G_M', according to the criterion below.
   b) Node V and its neighbors form a cluster.
   c) Connect all the nodes in the cluster. For each edge added to G_M', add the same edge to G_M.
   d) Remove V from G_M'.
Criterion (the weight of a node V is the number of values of V; the weight of a cluster is the product of the weights of its constituent nodes):
- Choose the node that causes the least number of edges to be added,
- breaking ties by choosing the node that induces the cluster with the smallest weight.
A sketch of the moralization and elimination steps follows.
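In the sketch below, the DAG edge list is my reconstruction of the eight-node example (it is consistent with the families F_A = {A}, F_B = {A, B}, F_H = {E, G, H} and with the cliques derived next), and the elimination order is hard-coded to the one the slides arrive at rather than recomputed from the criterion.

```python
from itertools import combinations

# Assumed DAG for the example network A..H.
dag = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E', 'G'], 'D': ['F'],
       'E': ['F', 'H'], 'F': [], 'G': ['H'], 'H': []}

# Step 1: moral graph -- undirect every edge, then marry each node's parents.
nbrs = {v: set() for v in dag}
for u, children in dag.items():
    for c in children:
        nbrs[u].add(c)
        nbrs[c].add(u)
for v in dag:
    parents = [u for u in dag if v in dag[u]]
    for p, q in combinations(parents, 2):
        nbrs[p].add(q)
        nbrs[q].add(p)

# Step 2: eliminate nodes one by one, recording the induced cluster and the
# fill-in edges; the same fill-ins triangulate the moral graph.
g = {v: set(n) for v, n in nbrs.items()}
for v in ['H', 'G', 'F', 'C', 'B', 'D', 'E', 'A']:
    cluster = ''.join(sorted({v} | g[v]))
    fill_in = [(p, q) for p, q in combinations(sorted(g[v]), 2)
               if q not in g[p]]
    print(v, cluster, fill_in or 'none')
    for p, q in fill_in:            # connect the cluster
        g[p].add(q)
        g[q].add(p)
    for n in g[v]:                  # remove v from the graph
        g[n].discard(v)
    del g[v]
```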
Running this elimination on the example network gives:

| Eliminated Vertex | Induced Cluster | Edges Added |
|---|---|---|
| H | EGH | none |
| G | CEG | none |
| F | DEF | none |
| C | ACE | {A, E} |
| B | ABD | {A, D} |
| D | ADE | none |
| E | AE | none |
| A | A | none |

(The clusters AE and A are contained in earlier ones, so the clique set is {ABD, ADE, ACE, CEG, DEF, EGH}.)

Building an Optimal Join Tree
We need a minimal number of edges to connect these cliques into a tree: given n cliques, n − 1 edges are required. There are many ways to connect them, so how do we achieve optimality?

1. Begin with a set of n trees, each consisting of a single clique, and an empty set S.
2. For each distinct pair of cliques X and Y:
   a) Create a candidate sepset S_XY = X ∩ Y, with backpointers to X and Y.
   b) Insert S_XY into S.
3. Repeat until n − 1 sepsets have been inserted into the forest:
   a) Select a sepset S_XY from S, according to the criterion below; delete S_XY from S.
   b) Insert S_XY between cliques X and Y only if X and Y are on different trees in the forest.
Criterion:
- The mass of S_XY is the number of nodes in X ∩ Y.
- The cost of S_XY is the weight of X plus the weight of Y, where the weight of a node V is the number of values of V, and the weight of a set of nodes is the product of the weights of its constituent nodes.
- Choose the sepset with the largest mass, breaking ties by choosing the sepset with the smallest cost.

Graphical Transformation (result for the example)
ABD -AD- ADE -AE- ACE -CE- CEG, with ADE -DE- DEF and CEG -EG- EGH.
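This selection rule is essentially Kruskal's algorithm with a (mass, cost) ranking. Below is a sketch using a union-find forest; the clique list and binary-variable weights are taken from the example, while the helper names are mine. It recovers the sepsets AD, AE, DE, CE, and EG of the tree above.

```python
from itertools import combinations

cliques = [frozenset(c) for c in
           ({'A', 'B', 'D'}, {'A', 'D', 'E'}, {'A', 'C', 'E'},
            {'C', 'E', 'G'}, {'D', 'E', 'F'}, {'E', 'G', 'H'})]

def weight(c):
    return 2 ** len(c)          # all variables binary here

# Candidate sepsets, ranked by largest mass, then by smallest cost.
candidates = sorted(((x & y, x, y) for x, y in combinations(cliques, 2)),
                    key=lambda s: (-len(s[0]), weight(s[1]) + weight(s[2])))

parent = {c: c for c in cliques}    # union-find over the forest of trees
def find(c):
    while parent[c] != c:
        parent[c] = parent[parent[c]]
        c = parent[c]
    return c

edges = []
for sep, x, y in candidates:
    rx, ry = find(x), find(y)
    if rx != ry:                    # insert only between different trees
        parent[rx] = ry
        edges.append((x, sep, y))
    if len(edges) == len(cliques) - 1:
        break

for x, sep, y in edges:
    print(''.join(sorted(x)), '--', ''.join(sorted(sep)), '--',
          ''.join(sorted(y)))
```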
Inference by Propagation

Inferences
- P(V) = ?   (inference without evidence)
- P(V | e) = ?   (inference with evidence)
PPTC: Probability Propagation in Tree of Cliques.

Inference without Evidence (demo on the Martin/Norman network)

Procedure for PPTC without Evidence
Belief Network → [graphical transformation: build the graphic component] → Join Tree Structure → [initialization: build the numeric component] → Inconsistent Join Tree → [propagation] → Consistent Join Tree → [marginalization] → P(V)

Initialization
1. For each cluster and sepset X, set each φ_X(x) to 1: φ_X ← 1.
2. For each variable V:
   a) Assign to V a cluster X that contains F_V; call X the parent cluster of F_V.
   b) Multiply φ_X(x) by P(V | Π_V): φ_X ← φ_X P(V | Π_V).

Initialization (example: cluster ACE, with C and E assigned to it)

| | c = on | c = off |
|---|---|---|
| P(c \| a = on) | 0.7 | 0.3 |
| P(c \| a = off) | 0.2 | 0.8 |

| | e = on | e = off |
|---|---|---|
| P(e \| c = on) | 0.3 | 0.7 |
| P(e \| c = off) | 0.6 | 0.4 |

| a | c | e | Initial φ_ACE |
|---|---|---|---|
| on | on | on | 1 × 0.7 × 0.3 = 0.21 |
| on | on | off | 1 × 0.7 × 0.7 = 0.49 |
| on | off | on | 1 × 0.3 × 0.6 = 0.18 |
| on | off | off | 1 × 0.3 × 0.4 = 0.12 |
| off | on | on | 1 × 0.2 × 0.3 = 0.06 |
| off | on | off | 1 × 0.2 × 0.7 = 0.14 |
| off | off | on | 1 × 0.8 × 0.6 = 0.48 |
| off | off | off | 1 × 0.8 × 0.4 = 0.32 |

The sepset potential φ_CE is initialized to 1 in every entry.

Now, with N the number of clusters and Q the number of variables,
∏_{i=1}^{N} φ_{Xi} / ∏_{j=1}^{N−1} φ_{Sj} = ∏_{k=1}^{Q} P(Vk | Π_{Vk}) = P(U)   (by the independence assertions).
After initialization, global consistency is satisfied, but local consistency is not.

Global Propagation
Global propagation is used to achieve local consistency. Let's consider single message passing first.

Message Passing (from cluster X to cluster Y through sepset R)
- Projection onto the sepset: φ_R^old ← φ_R; φ_R ← Σ_{X\R} φ_X.
- Absorption by the receiving cluster: φ_Y ← φ_Y (φ_R / φ_R^old).

The Effect of a Single Message Pass
Since φ_Y^new / φ_R^new = φ_Y^old / φ_R^old, the quantity
∏_i φ_{Xi} / ∏_j φ_{Sj}
is unchanged by a message pass and still equals P(U).

Global Propagation (algorithm)
1. Choose an arbitrary cluster X.
2. Unmark all clusters. Call Ingoing-Propagation(X).
3. Unmark all clusters. Call Outgoing-Propagation(X).

Ingoing-Propagation(X):
- Mark X.
- Call Ingoing-Propagation recursively on X's unmarked neighboring clusters, if any.
- Pass a message from X to the cluster which invoked Ingoing-Propagation(X).

Outgoing-Propagation(X):
- Mark X.
- Pass a message from X to each of its unmarked neighboring clusters, if any.
- Call Outgoing-Propagation recursively on X's unmarked neighboring clusters, if any.

(On the example tree, the ten messages are numbered 1-10: five ingoing passes toward the chosen cluster, then five outgoing passes away from it.)

After global propagation, the clique tree is both globally and locally consistent.
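A single message pass is small enough to write out. The sketch below uses the dictionary representation from the earlier potential sketch (helper names are mine); for brevity it assumes strictly positive sepset entries, whereas full PPTC defines 0/0 = 0 during absorption.

```python
def marginalize_onto(variables, table, keep):
    """Sum a potential down onto the variables in `keep`."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for values, p in table.items():
        key = tuple(values[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def pass_message(vars_x, tab_x, sep_vars, tab_r, vars_y, tab_y):
    """One message pass X -> R -> Y: projection, then absorption.

    Returns the updated sepset and receiving-cluster tables; the sending
    cluster is untouched, so prod(clusters)/prod(sepsets) stays equal to P(U).
    """
    old_r = dict(tab_r)                                 # phi_R^old
    new_r = marginalize_onto(vars_x, tab_x, sep_vars)   # projection
    iy = [vars_y.index(v) for v in sep_vars]
    new_y = {vals: p * new_r[tuple(vals[i] for i in iy)]
                     / old_r[tuple(vals[i] for i in iy)]
             for vals, p in tab_y.items()}              # absorption
    return new_r, new_y
```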
Marginalization (example)
From the now-consistent cluster ABD, P(A) = Σ_{B,D} φ_ABD and P(D) = Σ_{A,B} φ_ABD:

| a | b | d | φ_ABD(a, b, d) |
|---|---|---|---|
| on | on | on | .225 |
| on | on | off | .025 |
| on | off | on | .125 |
| on | off | off | .125 |
| off | on | on | .180 |
| off | on | off | .020 |
| off | off | on | .150 |
| off | off | off | .150 |

P(a = on) = .225 + .025 + .125 + .125 = .500; P(a = off) = .180 + .020 + .150 + .150 = .500.
P(d = on) = .225 + .125 + .180 + .150 = .680; P(d = off) = .025 + .125 + .020 + .150 = .320.

Review: Procedure for PPTC without Evidence
Belief Network → Graphical Transformation → Join Tree Structure → Initialization → Inconsistent Join Tree → Propagation → Consistent Join Tree → Marginalization → P(V)

Inference with Evidence
P(V | e) = ?   (demo on the Martin/Norman network)

Observations
- Observations are the simplest form of evidence. An observation is a statement of the form V = v.
- A collection of observations may be denoted E = e, an instantiation of the set of variables E.
- Observations are referred to as hard evidence.

Likelihoods
Given E = e, the likelihood of V, denoted Λ_V, is defined as:
- If V ∈ E: Λ_V(v) = 1 if v = e(V), and 0 otherwise.
- If V ∉ E: Λ_V(v) = 1 for all v.

Example (observations C = on and D = off):

| V | Λ_V(on) | Λ_V(off) |
|---|---|---|
| A | 1 | 1 |
| B | 1 | 1 |
| C | 1 | 0 |
| D | 0 | 1 |
| E | 1 | 1 |
| F | 1 | 1 |
| G | 1 | 1 |
| H | 1 | 1 |

Procedure for PPTC with Evidence
Belief Network → Graphical Transformation → Join Tree Structure → [1. initialization, 2. observation entry] → Inconsistent Join Tree → [propagation] → Consistent Join Tree → [1. marginalization, 2. normalization] → P(V | e)

Initialization with Observations
1. For each cluster and sepset X, set each φ_X(x) to 1: φ_X ← 1.
2. For each variable V: assign to V a parent cluster X that contains F_V, and multiply: φ_X ← φ_X P(V | Π_V).
3. Set each likelihood element Λ_V(v) to 1: Λ_V ← 1.

Observation Entry
1. Encode the observation V = v as a likelihood Λ_V^new.
2. Identify a cluster X that contains V.
3. Update φ_X and Λ_V: φ_X ← φ_X Λ_V^new; Λ_V ← Λ_V^new.

Marginalization with Evidence
After global propagation, φ_X = P(X, e) and φ_S = P(S, e), so
P(V, e) = Σ_{X\{V}} φ_X = Σ_{S\{V}} φ_S.

Normalization
P(V | e) = P(V, e) / P(e) = P(V, e) / Σ_V P(V, e).

Handling Dynamic Observations
Suppose that the join tree is now consistent for evidence e1. How do we restore consistency if the observation is changed to e2?

Observation States
Three observation states for a variable V:
1. No change.
2. Update: V goes from unobserved to observed.
3. Retraction: V goes from observed to unobserved, or V = v1 changes to V = v2 with v1 ≠ v2.

Handling Dynamic Observations (flow)
The PPTC-with-evidence flow is rerun either from Observation Entry with a Global Update, or from Initialization with a Global Retraction. When is each needed? Intuitively, an update only multiplies additional zeros into the potentials and can be propagated directly, while a retraction must recover entries that were already zeroed out and therefore calls for re-initialization.
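To close the with-evidence procedure, observation entry and the final normalization are equally small in the same dictionary representation (helper names are mine; the example numbers reuse P(A, B = T) from the earlier Martin/Norman slides).

```python
def enter_observation(cluster_vars, table, var, value):
    """Multiply a cluster potential by the likelihood encoding V = value:
    entries consistent with the observation keep their value, others get 0."""
    i = cluster_vars.index(var)
    return {vals: (p if vals[i] == value else 0.0)
            for vals, p in table.items()}

def normalize(p_v_and_e):
    """P(V | e) = P(V, e) / P(e), with P(e) = sum over v of P(v, e)."""
    p_e = sum(p_v_and_e.values())
    return {v: p / p_e for v, p in p_v_and_e.items()}

# P(A, B=T) from the three-node example gives P(A | B=T):
print(normalize({True: 0.093, False: 0.077}))
# {True: 0.547..., False: 0.452...}
```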