Purpose: This document describes the operation and mechanism of the new Local Probability Distribution (LPD) in MEBN for users and developers.
Authors: Cheol Young Park, Shou Matsumoto.
Date: 03/23/2013

Index
1. Introduction
2. Process of LPD operation
2.1. Discrete LPD operation
2.2. Continuous LPD operation
3. Mechanism of LPD
4. Conclusion

1. Introduction
The Local Probability Distribution (LPD) specifies numerical probability information for the resident random variables of an MFrag. It is a function that represents a probability distribution. The LPD in MEBN supports both discrete and continuous probability distributions, and the two are operated differently, so a user should know both. We have also developed a new LPD that can be used for continuous resident nodes. Chapter 2 introduces the operation, and Chapter 3 describes the mechanism of the LPD.

2. Process of LPD operation
2.1. Discrete LPD operation
To work with an LPD, at least one resident node must first be created. To make the operation easy to follow, we use the “VehicleIdentification.ubf” file, which is located in “plugins\unbbayes.prs.mebn1.11.9\examples”. Once the file is opened, click “ImageTypeReport_MFrag”.
The following steps describe how to use the LPD. (This example is based on the Windows operating system.)
1) Select a resident node in order to open the LPD panel, and click the “Edit table” button.
2) The following figure shows the LPD panel. The “Nodes” panel shows the current node and its parent nodes. The “States” panel lists the possible states of the selected node. The “Arguments” panel shows the arguments of the selected node. There are also the “LPD Grammar” buttons and a text field, the LPD edit panel, which is initially empty.
3) Click the “if any” or “if all” button. Choosing “any” automatically inserts the sentence “if any paramSubSet have ( booleanFunction )” into the edit panel. It states that if at least one instance of a parent satisfies the statement in “booleanFunction”, then the probability distribution declared in the next block of pseudocode (delimited by square brackets) is adopted by the algorithm. Both “paramSubSet” and “booleanFunction” are explained later. Similarly, choosing “all” inserts the sentence “if all paramSubSet have ( booleanFunction )”. In this case, the probability distribution declared in the next block is used only if all the parents satisfy “booleanFunction”.
4) Click the “else” button. If-else blocks can be cascaded, so a complex LPD can be built from multiple levels of if-else clauses.
For instance, the following LPD is valid:
if any obj have ( ObjectType = Tracked ) [
    if any rgn have ( Weather = Clear ) [
        Tracked = .8, Wheeled = .15, NonVehicle = .05
    ] else [
        Tracked = .6, Wheeled = .3, NonVehicle = .1
    ]
] else if any obj have ( ObjectType = Wheeled ) [
    if any rgn have ( Weather = Clear ) [
        Tracked = .1, Wheeled = .8, NonVehicle = .1
    ] else [
        Tracked = .2, Wheeled = .6, NonVehicle = .2
    ]
] else [
    if any rgn have ( Weather = Clear ) [
        Tracked = .05, Wheeled = .05, NonVehicle = .9
    ] else [
        Tracked = .15, Wheeled = .15, NonVehicle = .7
    ]
]
Note that an “else” is mandatory after every “if”, so the number of occurrences of “if” must match the number of occurrences of “else”.
5) Click “paramSubSet” in the LPD edit panel to highlight it. Then click “ObjectType” in the “Nodes” panel, and double-click “obj” in the “Arguments” panel. The “paramSubSet” has a special purpose: it is used for restricting “booleanFunction” and/or enabling the CARDINALITY function in the probability distribution block (the one delimited by square brackets). It is a dot-separated list of ordinary variables.
6) Click “booleanFunction” in the LPD edit panel to highlight it, and click the “equal operator” button. Then click “Node” in the LPD edit panel to highlight it, and click “ObjectType” in the “Nodes” panel.
7) Click “NodeState” in the LPD edit panel to highlight it, and click “Tracked” in the “States” panel. The “booleanFunction” (the block being edited now) only considers the parents that contain all the ordinary variables in “paramSubSet” as arguments, ignoring arguments whose types are entities marked as “ordered”.
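The filtering rule just described can be sketched in Python. This is only an illustration and not UnBBayes code: the function name `eligible_parents` and the dictionary layout are hypothetical, the parent names are borrowed from the example that follows, and the sketch assumes that a parent qualifies whenever it carries every variable of “paramSubSet”, even if it has further non-ordered arguments.

```python
def eligible_parents(param_subset, parents, ordered_vars):
    """Return the names of the parents a booleanFunction may refer to.

    param_subset: dot-separated ordinary variables, e.g. "x.y"
    parents:      mapping from parent name to its list of argument variables
    ordered_vars: ordinary variables whose entity type is marked "ordered"
    """
    required = set(param_subset.split("."))
    selected = []
    for name, args in parents.items():
        # Ordered variables are ignored when matching arguments.
        usable_args = set(args) - set(ordered_vars)
        # Assumption: extra non-ordered arguments do not disqualify a parent.
        if required <= usable_args:
            selected.append(name)
    return selected


parents = {
    "Res1": ["x"],
    "Res2": ["x", "y"],
    "Res3": ["x", "y", "t"],
    "Res": ["y"],
}
print(eligible_parents("x.y", parents, ordered_vars={"t"}))  # ['Res2', 'Res3']
```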
For example, if “paramSubSet” is “x.y” (a list containing the ordinary variables “x” and “y”), and the parents are Res1(x), Res2(x,y), Res3(x,y,t), and Res(y) (assuming that the ordinary variable “t” is marked as “ordered” in UnBBayes), then any boolean sentence in “booleanFunction” returns false if it does NOT use Res2(x,y) or Res3(x,y,t), because these are the only parents that use both “x” and “y” simultaneously once the ordered variable “t” is ignored.
The “booleanFunction”, as the name suggests, represents a boolean expression. If the expression returns true, the probability distribution declared in the following square-bracket block is used. It can be composed of:
“&” - the boolean AND operation, which returns true if both operands are true;
“|” - the boolean OR operation, which returns true if at least one of the operands is true;
“~” - the boolean NOT operation, which returns true if the operand is false;
“Node = State” - it means that the current state of “Node” (which is a node) is “State” (which is a state).
Complex boolean expressions can be created by combining the operators, using parentheses to explicitly specify the order of the operations.
After the “( booleanFunction )”, a square-bracket block specifies the probability of each state of the current node. It is a comma-separated list of probability assignments in the format “<State> = <Expression>”. An expression is a combination of numeric values and operations (writing fully parenthesized expressions is recommended). The following list summarizes the possible operations for numeric expressions:
“MAX” - a binary operator which returns the higher of the two operands; it is useful for creating a lower bound for the probability;
“MIN” - a binary operator which returns the lower of the two operands; it is useful for creating an upper bound for the probability (e.g.
1.0);
+, -, *, / - the well-known add, subtract, multiply, and divide operations;
“<State>” - the probability of a state can be declared as a function of another state, just by writing the name of a previously declared state in the expression (it is substituted by its actual probability);
“CARDINALITY” - a special function which returns the quantity (count, or cardinality) of possible COMBINATIONS of parent nodes (please notice that this is not the number of parents) which satisfy the conditions in “booleanFunction”. This is useful for creating a distribution which varies depending on the set (actually, a combination) of parents in a specific situation. In a normal LPD, this function is used with MAX and MIN to limit the probability values. Its only parameter must be a “paramSubSet” (i.e. a dot-separated list of ordinary variables), which identifies which “if” we are considering in a cascaded if-else situation.
The bracket-delimited block is expected to specify the probability for every state. States not listed in the block are considered to have 0% (i.e. equivalent to “State = 0”).
8) In the same manner, we enter “MIN( .9, CARDINALITY( obj ) ),” and “1-(Trackedd + NonVehicle)” as shown in the following figure.
9) Click the “Compile” button in order to check the LPD grammar. In our example, it generates an error, as shown in the above figure. “>73<” marks the position at which the error occurred.
10) Click the “Save” button in order to save the script.

2.2. Continuous LPD operation
The continuous LPD is a new feature of UnBBayes-MEBN, so some of its notation and implementation differ from the discrete LPD. The continuous LPD is an extended version of the discrete LPD that provides a grammar for continuous CPS and combining rules.
1) Continuous Grammar
The following figure is a pseudo MFrag that shows continuous resident nodes. There is one discrete resident node, A(X), and two continuous resident nodes, B(Y) and C(Z).
The node B(Y) has the script
NormalDist(5, 1);
which means that the probability distribution of B(Y) is the normal distribution with mean = 5 and standard deviation = 1. The node C(Z) has the script
if( Some(A( X )) && (A( X ) == a1 ) ){
    - 0.5 * B( Y ) + NormalDist( 2, 1 );
} else {
    B( Y ) + NormalDist( -1, 0.5 );
}
which means that, for some A(X), if A(X) is a1, then “{ - 0.5 * B( Y ) + NormalDist( 2, 1 ); }” is used; otherwise “{ B( Y ) + NormalDist( -1, 0.5 ); }” is used.
Now, we assume that we know X = {x1, x2}, Y = {y1}, and Z = {z1}, and we want to generate the SSBN. The following figure shows the SSBN generated under this assumption. As we can see in the above figure, the continuous LPD was changed into the continuous CPS. Because there were two objects (x1 and x2) in our assumption, four if-statements were generated. The scripts of the CPS will be used by the DMP algorithm.
There is one problem with the continuous LPD. A continuous resident node may be related to multiple objects. For example, assume that we know X = {x1}, Y = {y1, y2}, and Z = {z1}. There are two continuous objects, y1 and y2, but their child node has just one object; in our case it is z1. How do we produce a value for this single object when there are multiple parents? To solve this problem, we provide combining rules for the continuous LPD.
2) Combining Rule
We provide the “Mean”, “Sum”, and “Multiply” combining rules for the continuous LPD. The following example is the same as the previous example, with one difference: the “Mean” function is added, shown as the red script. The meaning of “Mean(B(Y))” is that if there are several objects of B(Y), they are rewritten as an expression that computes their average. The following shows the result for this example, in which we assume that we know X = {x1}, Y = {y1, y2}, and Z = {z1}. As we can see from the red script in the above figure, “Mean(B(Y))” was decomposed into 0.5*B_y1 + 0.5*B_y2.
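The decomposition of a combining rule over the parent instances can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the UnBBayes implementation: the function name `expand_combining_rule` is hypothetical, and the `Node_instance` naming simply follows the “B_y1” pattern shown above.

```python
def expand_combining_rule(rule, node, instances):
    """Expand a combining rule over parent instances into a plain expression.

    rule:      "Mean", "Sum", or "Multiply"
    node:      name of the continuous parent, e.g. "B"
    instances: entity instances bound to the parent, e.g. ["y1", "y2"]
    """
    if not instances:
        raise ValueError("at least one parent instance is required")
    terms = [f"{node}_{inst}" for inst in instances]
    if rule == "Mean":
        weight = 1 / len(terms)
        # The :g format drops trailing zeros, so 0.5 is written as "0.5".
        return " + ".join(f"{weight:g}*{term}" for term in terms)
    if rule == "Sum":
        return " + ".join(terms)
    if rule == "Multiply":
        return "*".join(terms)
    raise ValueError(f"unknown combining rule: {rule}")


print(expand_combining_rule("Mean", "B", ["y1", "y2"]))  # 0.5*B_y1 + 0.5*B_y2
```

With a single instance, every rule reduces to that instance alone (for “Mean”, with a weight of 1), which matches the case where no combination is needed.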
For the “Sum” and “Multiply” combining rules in the same kind of situation, “Sum(X)” becomes “x1 + x2” and “Multiply(X)” becomes “x1*x2” for a parent X with instances x1 and x2.
3) Example of converting LPD to CPS
The following example describes the semantics of the new LPD. A) is a source LPD, and B) is the CPS converted from A).
A) LPD
if all y have ( parentY = T ) [
    Mean ( parentX ) + NormalDist( 1, 1 )
] else [
    NormalDist( 3, 2 )
]
Assumption: The discrete parentY has the instances y1 and y2. The continuous parentX has the instances x1 and x2.
B) CPS converted from LPD
if( y1 == T && y2 == T ) {
    0.5*x1 + 0.5*x2 + NormalDist( 1, 1 );
} else if( y1 == T && y2 == F ) {
    NormalDist( 3, 2 );
} else if( y1 == F && y2 == T ) {
    NormalDist( 3, 2 );
} else if( y1 == F && y2 == F ) {
    NormalDist( 3, 2 );
}
4) Semantics of if_statement
In this case, we should convert “if all y have ( parentY = T )” to “if( y1 == T && y2 == T ) …”. Internally, the structure of the CPT is generated first. Based on the structure of the CPT, the if statements of the CPS are generated. When the structure of the CPT is generated, the continuous equations are stored in memory together with their indices, called CIDs. In our example in the figure below, c1 and c2 are the indices of the continuous equations. If a continuous equation has a combining rule, the combining rule is applied to the equation. This process is going to be described in 2.4. The output of this process for the if_statement is a CPS whose action parts are empty, as in the following figure.
5) Semantics of assignment
The action part of an if_statement contains the assignment syntax. We add a new syntax for the empty-variable case, in which there is no variable and no equal operator. For example, “A = 0.5” is a normal assignment, but we also allow “0.5” without “A = ”, because a continuous node does not have multiple states; it has a single variable.

3. Mechanism of LPD
From a developer's point of view (i.e.
the technical point of view of the implementation of LPD in UnBBayes), the pseudocode describing the LPD is simply a Java String linked to a ResidentNode. The pseudocode is translated by a Compiler, also linked to a ResidentNode, during SSBN generation, so that an LPD can be converted to instances of IProbabilityFunction, the format UnBBayes uses for describing the probability distribution in a BN. Implementations of IProbabilityFunction are mostly tables.
The following figure illustrates the relationship between the Java classes in UnBBayes. Each resident node (i.e. each instance of the ResidentNode class) is linked to an instance of Compiler, the class responsible for generating an IProbabilityFunction from an SSBNNode, a temporary representation of a node during SSBN generation.
The following diagram illustrates the classes responsible for rendering the LPD pseudocode and handling other GUI features. The core of the LPD GUI is the CPTEditionPane, which has a link to the ResidentNode whose LPD is being edited. It also contains the text area (CPTTextPane) where the pseudocode is displayed. MEBNController acts as a mediator for all other MEBN classes. StyleTableImpl and ColloringUtils are the classes responsible for the code style.
The dynamic behavior of an LPD manifests mainly during a query in MEBN. When a query is triggered, an instance of ISSBNGenerator is activated. Eventually, some of its implementations generate an SSBNNode, a temporary node containing a link to both a ResidentNode (a node in MEBN, the input) and a ProbabilisticNode (a node in the BN, the output). The SSBN generator will also eventually ask the compiler to translate an SSBNNode (i.e. to read the pseudocode linked to the related resident node) in order to generate an instance of IProbabilityFunction. After the creation of the SSBN is complete, all SSBNNodes are discarded and only the instances of ProbabilisticNode survive.
After that, inference is performed in the generated BN by using an instance of IInferenceAlgorithm.
Because of this behavior, the most natural way to implement continuous nodes in UnBBayes-MEBN is to implement instances of ICompiler. Since each ResidentNode is linked to an instance of ICompiler, each node can be compiled in a different manner, and one such compiler can eventually produce a continuous BN node. A continuous BN node should extend ProbabilisticNode, so that an SSBNNode can have a link to it without changing its code. The SSBN generator (ISSBNGenerator) and the BN inference algorithm (IInferenceAlgorithm) may need some adjustments to adapt to the new format.
The interpretation of the LPD was implemented in Compiler using a technique also used by most well-known compilers (e.g. gcc, pascal): compilation in multiple phases. The phases in the LPD compiler are:
1. Morphological (lexical) analysis: the String (LPD pseudocode) is converted to a sequence of tokens;
2. Syntactical analysis: based on a formal grammar (e.g. BNF), the compiler checks that no token is out of place, missing, or in excess;
3. Semantic analysis: consistency checks, such as the value of a probability (it must be between 0 and 1), the values of the parents, and so on; this results in a temporary table-like representation which facilitates manipulation;
4. Build: the temporary table is converted to an IProbabilityFunction.
For the sake of speed, these steps are executed in iterations, instead of executing each step just once for the whole LPD. This technique is also used by most compilers. Our LPD compiler was designed to strictly represent a grammar, so a formal analysis of language compliance can be performed (this is also a technique used by most compilers). When a compiler design is based on a grammar (BNF, in our case), each non-terminal element (i.e.
content in the grammar that can be further “expanded”) is represented and managed as a function or method (or any other relatively independent block of code, such as a class), and the expansion of such a non-terminal element is performed just by calling that function or method.
The following listing shows the grammar of LPD. The methods in Compiler strictly represent this grammar, so for each non-terminal element there is a method in Compiler. This can help developers understand the methods in the class.
table ::= statement | if_statement
if_statement ::= "if" allop varsetname "have" "(" b_expression ")" statement "else" else_statement
allop ::= "any" | "all"
varsetname ::= ident ["." ident]*
b_expression ::= b_term [ "|" b_term ]*
b_term ::= not_factor [ "&" not_factor ]*
not_factor ::= [ "~" ] b_factor
b_factor ::= ident "=" ident | "(" b_expression ")"
else_statement ::= statement | if_statement
statement ::= "[" assignment_or_if "]"
assignment_or_if ::= assignment | if_statement
assignment ::= ident "=" expression [ "," assignment ]*
expression ::= term [ addop term ]*
term ::= signed_factor [ mulop signed_factor ]*
signed_factor ::= [ addop ] factor
factor ::= number | function | "(" expression ")"
function ::= possibleVal | "CARDINALITY" "(" varsetname ")" | "MIN" "(" expression ";" expression ")" | "MAX" "(" expression ";" expression ")"
possibleVal ::= ident
addop ::= "+" | "-"
mulop ::= "*" | "/"
ident ::= letter [ letter | digit ]*
number ::= [digit]+
The temporary table generated during the syntactic analysis is the following. TempTableHeaderCell literally represents the header information (some metadata), and TempTableProbabilityCell represents probabilities. Both classes may represent complex formula expressions by using the composite design pattern. The final representation (an instance of IProbabilityFunction) is generated by using the visitor design pattern to execute code in each cell.
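The idea of one method per non-terminal can be illustrated with a small recursive-descent evaluator for the numeric part of the grammar (expression, term, signed_factor, factor, function). This is a Python sketch under stated assumptions, not the UnBBayes Compiler class: the names `tokenize`, `ExpressionParser`, and `evaluate` are hypothetical, decimal numbers such as “.9” are accepted because the walkthrough in section 2.1 uses them, both “,” and “;” are accepted as the MIN/MAX separator, MAX is treated like MIN, and CARDINALITY values are supplied by the caller instead of being computed from parents.

```python
import re

NUMBER = r"\d*\.\d+|\d+"
TOKEN = re.compile(r"\s*(" + NUMBER + r"|[A-Za-z][A-Za-z0-9]*|[-+*/(),;.])")


def tokenize(text):
    """Morphological analysis: split the expression string into tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot read a token near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class ExpressionParser:
    """One method per non-terminal of the numeric part of the grammar."""

    def __init__(self, tokens, states, cardinality):
        self.tokens, self.pos = tokens, 0
        self.states = states            # probabilities of declared states
        self.cardinality = cardinality  # varsetname -> count (caller-supplied)

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, *expected):
        token = self.peek()
        if token is None or (expected and token not in expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {token!r}")
        self.pos += 1
        return token

    def expression(self):  # expression ::= term [ addop term ]*
        value = self.term()
        while self.peek() in ("+", "-"):
            op, rhs = self.take(), self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term ::= signed_factor [ mulop signed_factor ]*
        value = self.signed_factor()
        while self.peek() in ("*", "/"):
            op, rhs = self.take(), self.signed_factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def signed_factor(self):  # signed_factor ::= [ addop ] factor
        sign = 1
        if self.peek() in ("+", "-"):
            sign = -1 if self.take() == "-" else 1
        return sign * self.factor()

    def factor(self):  # factor ::= number | function | "(" expression ")"
        token = self.take()
        if token == "(":
            value = self.expression()
            self.take(")")
            return value
        if token in ("MIN", "MAX"):
            self.take("(")
            left = self.expression()
            self.take(",", ";")  # assumption: either separator is accepted
            right = self.expression()
            self.take(")")
            return min(left, right) if token == "MIN" else max(left, right)
        if token == "CARDINALITY":
            self.take("(")
            name = self.varsetname()
            self.take(")")
            return self.cardinality[name]
        if re.fullmatch(NUMBER, token):
            return float(token)
        if token in self.states:  # possibleVal ::= ident
            return self.states[token]
        raise SyntaxError(f"unknown state or symbol {token!r}")

    def varsetname(self):  # varsetname ::= ident ["." ident]*
        names = [self.take()]
        while self.peek() == ".":
            self.take()
            names.append(self.take())
        return ".".join(names)


def evaluate(text, states=None, cardinality=None):
    parser = ExpressionParser(tokenize(text), states or {}, cardinality or {})
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("MIN( .9, CARDINALITY( obj ) )", cardinality={"obj": 2}))  # 0.9
```

Each grammar rule appears as a comment next to the method that expands it, so extending the sketch to the boolean rules (b_expression, b_term, not_factor, b_factor) would follow the same pattern.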