
Purpose: This document describes the operation and mechanism of the new Local Probability
Distribution (LPD) in MEBN for users and developers.
Authors: Cheol Young Park, Shou Matsumoto.
Date: 03/23/2013
Index
1. Introduction
2. Process of LPD operation
2.1. Discrete LPD operation
2.2. Continuous LPD operation
3. Mechanism of LPD
4. Conclusion
1. Introduction
The Local Probability Distribution (LPD) specifies numerical probability information for the resident random
variables of an MFrag. It is a function that represents a probability distribution. The LPD
in MEBN supports both discrete and continuous probability distributions, which are operated differently, so a user
should know both modes of operation. In addition, we have developed a new LPD that can be used for
continuous resident nodes. Chapter 2 introduces the operation of the LPD, and Chapter 3 describes its
mechanism.
2. Process of LPD operation
2.1. Discrete LPD operation
To work with an LPD, at least one resident node must first be created. To make the operation easy to follow,
we will use the “VehicleIdentification.ubf” file, which is located in “plugins\unbbayes.prs.mebn1.11.9\examples”. Once the file is opened, click “ImageTypeReport_MFrag”. The following steps describe how to
use the LPD. (This example is based on the Windows operating system.)
1) Select a resident node, then click the “Edit table” button to open the LPD panel.
2) The following figure shows the LPD panel.
The “Nodes” panel shows the current node and its parent nodes.
The “States” panel shows the list of possible states of the selected node.
The “Arguments” panel shows the arguments of the selected node.
The “LPD Grammar” buttons insert grammar elements into the script.
The text field is the LPD edit panel, which is initially empty.
3) Click the “if any” or “if all” button.
Choosing “any” automatically inserts the sentence “if any paramSubSet have ( booleanFunction )”
into the editor panel. It states that if at least one instance of a parent satisfies the
statement in “booleanFunction”, then the algorithm adopts the probability distribution declared within the next block of
pseudocode (delimited by square brackets). Each of these elements is discussed
later.
Similarly, choosing “all” automatically inserts the sentence “if all paramSubSet have ( booleanFunction )”.
In this case, the probability distribution declared in the next block of code is
used only if all the parents satisfy “booleanFunction”.
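The difference between the two quantifiers can be sketched in Python. This is only an illustration of the semantics described above, not UnBBayes code; the function name `condition_holds`, the dictionary of parent instances, and the instance names are assumptions made for the example.

```python
def condition_holds(quantifier, parent_states, expected_state):
    """Return True when an 'if any' / 'if all' condition is satisfied.

    parent_states maps each parent instance (hypothetical names) to its state.
    """
    matches = [state == expected_state for state in parent_states.values()]
    if quantifier == "any":
        return any(matches)   # at least one parent instance satisfies the test
    if quantifier == "all":
        return all(matches)   # every parent instance satisfies the test
    raise ValueError("quantifier must be 'any' or 'all'")


# Two instances of the parent ObjectType(obj); the names are illustrative only.
parents = {"ObjectType_obj1": "Tracked", "ObjectType_obj2": "Wheeled"}
any_result = condition_holds("any", parents, "Tracked")
all_result = condition_holds("all", parents, "Tracked")
```

With these two parent instances, the “any” condition holds while the “all” condition does not, so only an “if any” clause would select its probability block.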
4) Click the “else” button.
If-else blocks can be cascaded, so that a complex LPD may be created using multiple
levels of if-else clauses. For instance, the following LPD is valid:
if any obj have ( ObjectType = Tracked ) [
    if any rgn have ( Weather = Clear ) [
        Tracked = .8, Wheeled = .15, NonVehicle = .05
    ] else [ Tracked = .6, Wheeled = .3, NonVehicle = .1 ]
] else if any obj have ( ObjectType = Wheeled ) [
    if any rgn have ( Weather = Clear ) [
        Tracked = .1, Wheeled = .8, NonVehicle = .1
    ] else [ Tracked = .2, Wheeled = .6, NonVehicle = .2 ]
] else [
    if any rgn have ( Weather = Clear ) [
        Tracked = .05, Wheeled = .05, NonVehicle = .9
    ] else [ Tracked = .15, Wheeled = .15, NonVehicle = .7 ]
]
Note that an “else” is mandatory after an “if”, so the number of occurrences of “if” must match the
number of occurrences of “else”.
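To make the cascade easier to read, the same decision logic can be written as an ordinary Python function. This is a sketch of how the example LPD selects a distribution, not the way UnBBayes evaluates it; the function name `example_lpd`, the list-based inputs, and the “Cloudy” state are assumptions made for illustration.

```python
def example_lpd(object_types, weather_states):
    """Mirror the cascaded LPD above for the given parent instance states.

    object_types:   states of every ObjectType(obj) instance
    weather_states: states of every Weather(rgn) instance
    """
    clear = "Clear" in weather_states            # if any rgn have ( Weather = Clear )
    if "Tracked" in object_types:                # if any obj have ( ObjectType = Tracked )
        probs = (.8, .15, .05) if clear else (.6, .3, .1)
    elif "Wheeled" in object_types:              # else if any obj have ( ObjectType = Wheeled )
        probs = (.1, .8, .1) if clear else (.2, .6, .2)
    else:                                        # final else block
        probs = (.05, .05, .9) if clear else (.15, .15, .7)
    return dict(zip(("Tracked", "Wheeled", "NonVehicle"), probs))


# "Cloudy" is a hypothetical non-Clear weather state.
result = example_lpd(["Wheeled", "Tracked"], ["Cloudy"])
```

Because the “Tracked” test comes first in the cascade, a mixed set of parents containing both a tracked and a wheeled object falls into the first branch.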
5) Click “paramSubSet” on the LPD edit panel to highlight it. Then click “ObjectType” on
the “Nodes” panel, and double-click “obj” on the “Arguments” panel.
The “paramSubSet” has a special purpose: it is used for restricting “booleanFunction”
and/or enabling the CARDINALITY function in the probability distribution block (the one delimited by
square brackets). It is a dot-separated list of ordinary variables.
6) Click “booleanFunction” on the LPD edit panel to highlight it.
Then click the “equal operator” button. Next, click “Node” on the LPD edit panel to
highlight it, and click “ObjectType” on the “Nodes” panel.
7) Click “NodeState” on the LPD edit panel to highlight it. Then click “Tracked” on the
“States” panel.
The “booleanFunction” (the block being edited now) will only consider the parents containing all the
ordinary variables in “paramSubSet” as arguments, except those whose types are entities marked as
“ordered”. For example, if “paramSubSet” is “x.y” (a list containing ordinary variables “x” and “y”), and
the parents are Res1(x), Res2(x,y), Res3(x,y,t), and Res(y) (assuming that the ordinary variable “t” is
marked as “ordered” in UnBBayes), then any boolean sentence in “booleanFunction” will return false if it
is NOT using Res2(x,y) or Res3(x,y,t) – because these are the only parents using both “x” and “y”
simultaneously, if we do not consider the ordered variable “t”.
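The selection of parents in this example can be sketched as a small filter. The sketch assumes that a parent is considered when its arguments, after ignoring ordered variables, contain every variable in “paramSubSet”; the function name `parents_considered` and the tuple representation of parents are hypothetical and do not come from UnBBayes.

```python
def parents_considered(param_subset, parents, ordered_vars):
    """Return the names of parents whose arguments cover param_subset.

    param_subset: dot-separated ordinary variables, e.g. "x.y"
    parents:      list of (name, argument list) pairs
    ordered_vars: ordinary variables whose type is marked as "ordered"
    """
    wanted = set(param_subset.split("."))
    selected = []
    for name, args in parents:
        unordered_args = set(args) - set(ordered_vars)  # ignore ordered variables
        if wanted <= unordered_args:                    # all wanted variables present
            selected.append(name)
    return selected


parents = [
    ("Res1", ["x"]),
    ("Res2", ["x", "y"]),
    ("Res3", ["x", "y", "t"]),
    ("Res", ["y"]),
]
selected = parents_considered("x.y", parents, ordered_vars=["t"])
```

For the parents listed in the text, the filter keeps Res2 and Res3, which matches the explanation above.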
The “booleanFunction”, as the name suggests, represents a boolean expression. If the expression returns
true, then the probability distribution declared in the following block is used. The expression can be composed of:

“&” - this is the boolean AND operation, and it returns true if both operands are true;

“|” - this is the boolean OR operation, and it returns true if at least one of the operands is true;

“~” - this is the boolean NOT operation, and it returns true if the operand is false;

“Node = State” - it means that the actual state of “Node” (which is a node) is “State” (which is a
state).
Complex boolean expressions can be created by combining the operators, using parentheses to explicitly
specify the order of the operations.
After the “( booleanFunction )”, a square-bracket block specifies the probability distribution of each state
of the current node. It is a comma-separated list of probability assignments following the format
“<State> = <Expression>”. An expression can be composed of a combination of numeric values and
operations (it is recommended to write fully-parenthesized expressions). The following list summarizes
the possible operations for numeric expressions:

“MAX” - a binary operator which returns the higher of the two operands; it is useful for
creating a lower bound for the probability;

“MIN” - a binary operator which returns the lower of the two operands; it is useful for
creating an upper bound for the probability (e.g. 1.0);

+,-,*,/ - the well-known add, subtract, multiply, and divide operations;

“<State>” - the probability of a state can be declared as a function of another state, just by
writing the name of a previously declared state in the expression (it will be substituted by its
actual probability);

“CARDINALITY” - a special function which returns the quantity (count, or cardinality) of possible
COMBINATIONS of parent nodes (please note that this is not the number of parents) which
satisfy the conditions in “booleanFunction”. This is useful for creating a distribution which may
vary depending on the set (actually, a combination) of parents in a specific situation. In a normal
LPD, this function is used with MAX and MIN to limit the probability values. Its only parameter
must be a “paramSubSet” (i.e. a dot-separated list of ordinary variables), which identifies which
“if” we are considering in a cascaded if-else situation.
The bracket-delimited block is expected to specify the probability distribution for all states. States not
described in the block are considered to have probability 0 (i.e. equivalent to “State = 0”).
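The way these numeric operations combine can be sketched in Python. The script in the comment is a hypothetical probability block written for this illustration (the coefficient .2 and the value .05 are invented), and the cardinality is passed in as a plain number because computing it requires the parents of a concrete situation.

```python
STATES = ("Tracked", "Wheeled", "NonVehicle")


def complete(declared, states=STATES):
    """States missing from the block are treated as 'State = 0'."""
    return {state: declared.get(state, 0.0) for state in states}


def distribution(cardinality):
    """Evaluate a hypothetical block such as:

    Tracked = MIN( .9 ; .2 * CARDINALITY( obj ) ),
    NonVehicle = .05,
    Wheeled = 1 - ( Tracked + NonVehicle )
    """
    tracked = min(0.9, 0.2 * cardinality)   # MIN gives an upper bound of .9
    non_vehicle = 0.05
    wheeled = 1 - (tracked + non_vehicle)   # reuses previously declared states
    return complete({"Tracked": tracked, "NonVehicle": non_vehicle, "Wheeled": wheeled})


two_matches = distribution(2)
ten_matches = distribution(10)
only_tracked = complete({"Tracked": 1.0})
```

As the number of matching parent combinations grows, the probability of “Tracked” rises until MIN caps it at .9, and the remaining mass is assigned to “Wheeled”.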
8) In the same manner, enter “MIN( .9, CARDINALITY( obj ) ),” and “1-(Trackedd + NonVehicle)”
as shown in the following figure.
9) Click the “Compile” button in order to check the LPD grammar.
In our example, compilation generates an error, as shown in the above figure. “>73<” indicates the position at which the error
occurred.
10) Click the “Save” button in order to save the script.
2.2. Continuous LPD operation
The continuous LPD is a new feature of UnBBayes-MEBN, so its notation and
implementation differ in some respects from the discrete LPD. The continuous LPD is an extended version of
the discrete LPD that provides a grammar for continuous CPS and combining rules.
1) Continuous Grammar
The following figure is a pseudo MFrag that shows continuous resident nodes.
There is one discrete resident node, A(X), and two continuous resident nodes, B(Y) and C(Z). The node
B(Y) has the script
NormalDist(5, 1);
which means that the probability distribution of B(Y) is the normal distribution with mean = 5 and standard
deviation = 1. The node C(Z) has the script
if( Some(A( X )) && (A( X ) == a1 ) ){ - 0.5 * B( Y ) + NormalDist( 2, 1 ); }
else { B( Y ) + NormalDist( -1, 0.5 ); }
which means that, for some A(X), if A(X) is a1, then “{ - 0.5 * B( Y ) + NormalDist( 2, 1 ); }” is used;
otherwise “{ B( Y ) + NormalDist( -1, 0.5 ); }” is used.
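As a rough check of what these scripts imply, the mean and variance of C(Z) in each branch can be worked out for a single instance of B(Y). The sketch below assumes that the second argument of NormalDist is a standard deviation in every script (the text states this only for B(Y)) and that the added noise term is independent of B(Y); the function names and the state “a2” are hypothetical.

```python
def linear_gaussian(coef, parent_mean, parent_std, noise_mean, noise_std):
    """Mean and variance of coef * parent + noise, with independent Gaussian terms."""
    mean = coef * parent_mean + noise_mean
    variance = (coef * parent_std) ** 2 + noise_std ** 2
    return mean, variance


def c_given_a(a_state):
    """Distribution of C(Z) for one B(Y) ~ NormalDist(5, 1), depending on A(X)."""
    if a_state == "a1":
        # - 0.5 * B( Y ) + NormalDist( 2, 1 )
        return linear_gaussian(-0.5, 5, 1, 2, 1)
    # B( Y ) + NormalDist( -1, 0.5 )
    return linear_gaussian(1.0, 5, 1, -1, 0.5)


mean_a1, var_a1 = c_given_a("a1")
mean_other, var_other = c_given_a("a2")  # "a2" stands for any state other than a1
```

Under these assumptions, the first branch gives a mean of -0.5 and the second a mean of 4, with a variance of 1.25 in both cases.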
Now, assume that we know X = {x1, x2}, Y = {y1}, and Z = {z1}, and that we want to generate an SSBN.
The following figure shows the SSBN generated under this assumption.
As the above figure shows, the continuous LPD was converted to a continuous CPS. Because
there were two objects (x1 and x2) in our assumption, four if-statements were generated. The scripts of
the CPS will be used by the DMP algorithm.
The continuous LPD raises one problem: a continuous resident node might be
related to multiple objects. For example, assume that we know X = {x1}, Y = {y1, y2}, and Z =
{z1}. There are two continuous objects, y1 and y2. In this case, only one object should be used for their
child node (z1 in our case). How can a single object be used when there are
multiple parents? To solve this problem, we provide combining rules for the continuous LPD.
2) Combining Rule
We provide the “Mean”, “Sum”, and “Multiply” combining rules for the continuous LPD. The following
example is the same as the previous example, except that the “Mean” function is added (the red
script in the figure).
“Mean(B(Y))” means that if there are several instances of B(Y), they are replaced by a
script that expresses their average. The following shows the result for this example, in which
we assume that we know X = {x1}, Y = {y1, y2}, and Z = {z1}.
As the red script in the above figure shows, “Mean(B(Y))” was decomposed into 0.5*B_y1 +
0.5*B_y2.
For the “Sum” and “Multiply” combining rules in the situation of our example, “Sum(X)” should be “x1 + x2”, and
“Multiply(X)” should be “x1*x2”.
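The expansion performed by the three combining rules can be sketched as simple string generation. This is an illustration only; the function name `expand` and the `Node_instance` naming of terms follow the “B_y1” notation used above but are otherwise assumptions.

```python
def expand(rule, node, instances):
    """Expand a combining rule over all instances of a continuous parent."""
    terms = [f"{node}_{instance}" for instance in instances]
    if rule == "Mean":
        weight = 1 / len(terms)               # equal weight for every instance
        return " + ".join(f"{weight:g}*{term}" for term in terms)
    if rule == "Sum":
        return " + ".join(terms)
    if rule == "Multiply":
        return "*".join(terms)
    raise ValueError(f"unknown combining rule: {rule}")


mean_script = expand("Mean", "B", ["y1", "y2"])
sum_script = expand("Sum", "B", ["y1", "y2"])
product_script = expand("Multiply", "B", ["y1", "y2"])
```

For two instances, the “Mean” rule yields the same script as the decomposition shown in the example, 0.5*B_y1 + 0.5*B_y2.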
3) Example of converting LPD to CPS
The following example illustrates the semantics of the new LPD. A) is a source LPD, and B) is the CPS
converted from A).
A) LPD
if all y have ( parentY = T ) [
Mean ( parentX ) + NormalDist( 1, 1)
] else[
NormalDist( 3, 2 )
]
Assumption: Discrete parentY has the instances y1 and y2. Continuous parentX has the instances x1 and x2.
B) CPS converted from LPD
if( y1 == T && y2 == T ) { 0.5*x1 + 0.5*x2 + NormalDist( 1, 1 ); }
else if( y1 == T && y2 == F ) { NormalDist( 3, 2 ); }
else if( y1 == F && y2 == T ) { NormalDist( 3, 2 ); }
else if( y1 == F && y2 == F ) { NormalDist( 3, 2 ); }
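The conversion from A) to B) can be sketched by enumerating every combination of states of the discrete parent instances. This is a minimal illustration for an “if all ... = T” condition with boolean parents, not the UnBBayes implementation; the function name `to_cps` and its parameters are hypothetical.

```python
from itertools import product


def to_cps(instances, true_action, else_action):
    """Generate one CPS branch per combination of T/F states of the instances."""
    lines = []
    for states in product("TF", repeat=len(instances)):
        condition = " && ".join(
            f"{instance} == {state}" for instance, state in zip(instances, states)
        )
        # "if all y have ( parentY = T )" holds only when every instance is T.
        action = true_action if all(s == "T" for s in states) else else_action
        keyword = "if" if not lines else "else if"
        lines.append(f"{keyword}( {condition} ) {{ {action}; }}")
    return lines


cps = to_cps(
    ["y1", "y2"],
    "0.5*x1 + 0.5*x2 + NormalDist( 1, 1 )",  # Mean ( parentX ) already expanded
    "NormalDist( 3, 2 )",
)
```

With two instances the sketch produces four branches, in the same order as listing B).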
4) Semantics of if_statement
In this case, “if all y have ( parentY = T )” must be converted to “if( y1 == T && y2 == T ) …”.
Internally, the structure of a CPT is generated first. Based on the structure of the CPT, the if-statements of
the CPS are then generated.
When the structure of the CPT is generated, the continuous equations are stored in memory together with their
indices, called CIDs. In our example in the figure below, c1 and c2 are the indices of the
continuous equations. If a continuous equation has a combining rule, the combining rule is applied to
the equation. This process is described in 2.4. For the if_statement, the
output of this process is a CPS with empty action parts, as shown in the following figure.
5) Semantics of assignment
The action part of an if_statement contains assignment syntax. We add a new syntax for the empty-variable case,
in which there is no variable and no equal operator on the left-hand side. For example, “A = 0.5” is a
normal assignment, but we also allow “0.5” without “A = ”, because a continuous node does not have
multiple states; it has a single variable.
3. Mechanism of LPD
From a developer's point of view (i.e. the technical point of view of the implementation of LPD in UnBBayes),
the pseudocode describing the LPD is simply a Java String linked to a ResidentNode. The pseudocode is
translated by a Compiler, also linked to the ResidentNode, during SSBN generation, so that an
LPD can be converted to instances of IProbabilityFunction, the format UnBBayes uses for describing
probability distributions in a BN. Implementations of IProbabilityFunction are mostly tables. The following
figure illustrates the relationship between the Java classes in UnBBayes.
Each resident node (i.e. each instance of the ResidentNode class) is linked to an instance of
Compiler, the class responsible for generating an IProbabilityFunction from an SSBNNode, a temporary
representation of a node during SSBN generation.
The following diagram illustrates the classes responsible for rendering LPD pseudocode and handling
other GUI features.
The core of the LPD GUI is the CPTEditionPane, which has a link to the ResidentNode whose LPD is being
edited. It also contains the text area (CPTTextPane) where the pseudocode is displayed. MEBNController
acts as a mediator for all other MEBN classes. StyleTableImpl and ColloringUtils are the classes responsible
for the code style.
The dynamic behavior of an LPD manifests mainly during a query in MEBN. When a query is triggered,
an instance of ISSBNGenerator is activated. Eventually, some of its implementations will generate an
SSBNNode, a temporary node containing a link to both a ResidentNode (a node in MEBN, the input) and a
ProbabilisticNode (a node in the BN, the output). The SSBN generator will also eventually ask the compiler to
translate an SSBNNode (i.e. read the pseudocode linked to the related resident node) in order to generate
an instance of IProbabilityFunction. After the SSBN has been created, all SSBNNodes are
discarded and only the instances of ProbabilisticNode survive. After that, inference is
performed in the generated BN by using an instance of IInferenceAlgorithm.
Because of this behavior, the most natural way to implement continuous nodes in UnBBayes-MEBN is to
implement new instances of ICompiler. Since each ResidentNode is linked to an instance of ICompiler, each
node can be compiled in a different manner, and one such compiler can produce a continuous BN node. A
continuous BN node should extend ProbabilisticNode, so that an SSBNNode can have a link to it without
changing its code. The SSBN generator (ISSBNGenerator) and the BN inference algorithm
(IInferenceAlgorithm) may need some adjustments to adapt to the new format.
The interpretation of the LPD was implemented in Compiler using a technique also used by most
well-known compilers (e.g. gcc and Pascal compilers): compilation in multiple phases. The phases of the LPD
compiler are:
1. Morphological (lexical) analysis - the String (LPD pseudocode) is converted to a sequence of
tokens;
2. Syntactical analysis - based on a formal grammar (e.g. BNF), the compiler checks that no
token is out of place, missing, or excessive;
3. Semantic analysis - consistency checks, such as the value of the probability (it must be
between 0 and 1), the values of the parents, and so on, resulting in a temporary table-like
representation which facilitates manipulation;
4. Build - convert the temporary table to an IProbabilityFunction.
For the sake of speed, these steps are executed in iterations, instead of executing each step just once for the
whole LPD. This technique is also used by most compilers.
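The first phase can be illustrated with a small tokenizer. This is a sketch written for this document, not the tokenizer used by the UnBBayes Compiler; the token kinds, the keyword list, and the decision to accept decimal numbers such as “.8” are assumptions based on the example scripts shown earlier.

```python
import re

# One token per match: a number, an identifier, or any other single symbol.
TOKEN_RE = re.compile(r"\s*(?:(\d*\.\d+|\d+)|([A-Za-z][A-Za-z0-9_]*)|(\S))")
KEYWORDS = {"if", "any", "all", "have", "else"}


def tokenize(text):
    """Convert LPD pseudocode into a list of (kind, text) tokens."""
    tokens = []
    for number, ident, symbol in TOKEN_RE.findall(text):
        if number:
            tokens.append(("number", number))
        elif ident:
            kind = "keyword" if ident in KEYWORDS else "ident"
            tokens.append((kind, ident))
        else:
            tokens.append(("symbol", symbol))
    return tokens


tokens = tokenize("if any obj have ( ObjectType = Tracked ) [ Tracked = .8 ] ")
```

The later phases would then work on this token sequence rather than on the raw String.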
Our LPD compiler was designed to strictly represent a grammar, so that a formal analysis of language
compliance can be performed (this is also a technique used by most compilers). When a compiler
design is based on a grammar (BNF, in our case), each non-terminal element (i.e. content in the grammar
which can be further “expanded”) is represented and managed as a function/method (or any
other relatively independent block of code, such as a class), and the expansion of such non-terminal
elements is performed just by calling the functions/methods.
The following listing shows the grammar of LPD. The methods in Compiler strictly represent this
grammar, so for each non-terminal element, there is a method in Compiler. This can help developers to
understand the methods in the class.
table ::= statement | if_statement
if_statement ::=
    "if" allop varsetname "have" "(" b_expression ")"
    statement
    "else" else_statement
allop ::= "any" | "all"
varsetname ::= ident ["." ident]*
b_expression ::= b_term [ "|" b_term ]*
b_term ::= not_factor [ "&" not_factor ]*
not_factor ::= [ "~" ] b_factor
b_factor ::= ident "=" ident | "(" b_expression ")"
else_statement ::= statement | if_statement
statement ::= "[" assignment_or_if "]"
assignment_or_if ::= assignment | if_statement
assignment ::= ident "=" expression [ "," assignment ]*
expression ::= term [ addop term ]*
term ::= signed_factor [ mulop signed_factor ]*
signed_factor ::= [ addop ] factor
factor ::= number | function | "(" expression ")"
function ::= possibleVal
    | "CARDINALITY" "(" varsetname ")"
    | "MIN" "(" expression ";" expression ")"
    | "MAX" "(" expression ";" expression ")"
possibleVal ::= ident
addop ::= "+" | "-"
mulop ::= "*" | "/"
ident ::= letter [ letter | digit ]*
number ::= [digit]+
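The idea of one method per non-terminal can be sketched with a small recursive-descent parser for the numeric part of the grammar (expression, term, signed_factor, factor, and the MIN/MAX functions). This is an illustrative sketch, not the UnBBayes Compiler: the class name `ExpressionParser` is hypothetical, decimal numbers are accepted although the listing defines number as digits only, CARDINALITY is omitted because it needs the parents of a concrete situation, and values are computed directly instead of being stored in a temporary table.

```python
import re


class ExpressionParser:
    """Each method below corresponds to one non-terminal of the grammar."""

    def __init__(self, text, state_probs=None):
        self.tokens = re.findall(r"\d*\.\d+|\d+|[A-Za-z]\w*|\S", text)
        self.pos = 0
        self.state_probs = state_probs or {}  # previously declared states

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def parse(self):
        value = self.expression()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return value

    def expression(self):  # expression ::= term [ addop term ]*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):  # term ::= signed_factor [ mulop signed_factor ]*
        value = self.signed_factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.signed_factor()
            else:
                value /= self.signed_factor()
        return value

    def signed_factor(self):  # signed_factor ::= [ addop ] factor
        if self.peek() in ("+", "-"):
            sign = -1 if self.take() == "-" else 1
            return sign * self.factor()
        return self.factor()

    def factor(self):  # factor ::= number | function | "(" expression ")"
        token = self.peek()
        if token == "(":
            self.take("(")
            value = self.expression()
            self.take(")")
            return value
        if token is not None and re.fullmatch(r"\d*\.\d+|\d+", token):
            return float(self.take())
        return self.function()

    def function(self):  # function ::= possibleVal | MIN(...) | MAX(...)
        name = self.take()
        if name in ("MIN", "MAX"):
            self.take("(")
            left = self.expression()
            self.take(";")  # the grammar separates the operands with ";"
            right = self.expression()
            self.take(")")
            return min(left, right) if name == "MIN" else max(left, right)
        if name not in self.state_probs:  # possibleVal: a declared state name
            raise ValueError(f"unknown state {name!r}")
        return self.state_probs[name]


bounded = ExpressionParser("MIN( .9 ; 2 * .3 )").parse()
remainder = ExpressionParser(
    "1 - ( Tracked + NonVehicle )", {"Tracked": 0.6, "NonVehicle": 0.05}
).parse()
```

Expanding a non-terminal is just a method call, for example `expression` calling `term`. In this sketch, a MIN or MAX call whose operands are separated by a comma instead of “;” is rejected with a syntax error.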
The following figure shows the temporary table generated during the syntactic analysis. TempTableHeaderCell
literally represents the header information (some metadata), and TempTableProbabilityCell represents
probabilities. Both classes may represent complex formula expressions by using the Composite design
pattern. The final representation (an instance of IProbabilityFunction) is generated by using the Visitor design
pattern to execute code in each cell.