2010 2nd International Conference on Industrial and Information Systems

A Hybrid Discrete PSO-SA Algorithm to Find Optimal Elimination Orderings for Bayesian Networks

Xuchu Dong(1,2), Dantong Ouyang(1,2), Dianbo Cai(3), Yonggang Zhang(1,2), Yuxin Ye(1,2)
1. Department of Computer Science and Technology, Jilin University
2. Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education
3. Network Monitor Maintenance Centre of Jilin Branch, China Telecom Corporation Limited
Changchun, Jilin, China
[email protected]

of Vj. Therefore, all the parent nodes of Vj are exactly all the direct causes giving rise to Vj, which we denote by Pa(Vj). Given a Bayesian network G, a triangulation can be obtained in two steps. First, for each node Vi in the DAG, connect Pa(Vi) into a complete subgraph, and drop the directions of all the edges; the resulting undirected graph is known as the moral graph. Here we let G^M denote the moral graph of G, and N(Vi) denote all the neighbors of node Vi in G^M. Second, given an ordering of the nodes in the moral graph, a triangulation can be constructed by the following procedure.

ELIMINATION
Input: the moral graph G^M = (V, E), an ordering ORD = (V1, ..., Vn) of V.
Output: a triangulation H.
Begin Procedure
1. For i = 1 to n Begin
   1.1 Connect N(Vi) into a complete subgraph; let Fill(Vi) denote the added edges;
   1.2 Delete Vi and all the edges incident to Vi;
   End
2. Return H = (V, E ∪ Fill(V1) ∪ ... ∪ Fill(Vn));
End Procedure

In the above procedure, the operation carried out by steps 1.1 and 1.2 is also known as "eliminating node Vi from the graph". After all the nodes V1, ..., Vn have been eliminated, G^M becomes an empty graph, and ORD is called an elimination ordering of G^M. The state space size of a triangulation H is defined as f(H) = Σ_{C ∈ Clq(H)} Π_{Vi ∈ C} w(Vi), where Clq(H) denotes the set of all cliques (i.e., maximal complete subgraphs) of H.
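As an illustration, the ELIMINATION procedure above can be sketched in Python. This is a minimal sketch, not the authors' implementation; the dict-of-neighbor-sets graph representation is our assumption.

```python
def eliminate(moral_graph, ordering):
    """Return the edge set of the triangulation H = (V, E ∪ Fill(V1) ∪ ... ∪ Fill(Vn)).

    moral_graph: dict mapping each node to the set of its neighbors (assumed format).
    ordering:    elimination ordering ORD = (V1, ..., Vn).
    """
    # Work on a copy so the input moral graph is not modified.
    adj = {v: set(ns) for v, ns in moral_graph.items()}
    fill = set()  # fill-in edges added by step 1.1
    for v in ordering:
        neighbors = adj[v]
        # Step 1.1: connect N(v) into a complete subgraph, recording new edges.
        for a in neighbors:
            for b in neighbors:
                if a < b and b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill.add((a, b))
        # Step 1.2: delete v and all edges incident to it.
        for a in neighbors:
            adj[a].discard(v)
        del adj[v]
    # Step 2: H keeps the original edges plus all fill-in edges.
    edges = {tuple(sorted((u, w))) for u, ns in moral_graph.items() for w in ns}
    return edges | fill

# Toy usage: eliminating A from the 4-cycle A-B-C-D adds the fill edge (B, D).
moral = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'A', 'C'}}
H = eliminate(moral, ['A', 'B', 'C', 'D'])
```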
Finally, a junction tree can be obtained by running a maximum-weight spanning tree algorithm on the junction graph, which is constructed from all the cliques of the triangulation. That construction process is beyond our topic; detailed discussions can be found in [10]. Fig. 1 gives an example of a Bayesian network, its moral graph, a triangulation generated by the elimination ordering (A, B, C, D, E, F), and a junction tree obtained from the triangulation. If all the variables are binary, the state space size of the triangulation is 4×2³ = 32. For probabilistic inference we always prefer a junction tree of minimal size; since each node of a junction tree corresponds to a clique of the triangulation, a triangulation with minimum state space size is preferred.

Abstract—In this paper, a hybrid algorithm named DPSO-SA is proposed to find near-optimal elimination orderings in Bayesian networks. DPSO-SA is a discrete particle swarm optimization method enhanced by simulated annealing. Computational tests show that this hybrid method is very effective and robust for the elimination ordering problem.

Keywords—Bayesian networks; triangulation; elimination ordering; particle swarm optimization; simulated annealing

I. INTRODUCTION

As a kind of graphical model representing causality and supporting inference under uncertainty, Bayesian networks have been widely used in real-world applications [1]. The efficiency of most inference methods depends on the ordering of all the variables in the Bayesian network [2]. However, finding an optimal ordering that yields the best computational efficiency is an NP-hard problem [3]. Some heuristics have been proposed to solve this problem [4][5]. Wen and Kjærulff provided simulated annealing methods to find a near-optimal ordering [3][4]. Larrañaga et al. gave a genetic algorithm framework in which 8 crossover operators and 3 mutation operators can be used [6]. In [7], Wang et al. developed an adaptive genetic algorithm.
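The state space size f(H) introduced in Section II, e.g. the 4×2³ = 32 figure quoted for the triangulation of Fig. 1, is simply a sum over cliques of the product of the member variables' state space sizes. A minimal sketch follows; the clique labels below are only an assumed reading of Fig. 1, since the figure itself is not reproduced here.

```python
from math import prod

def state_space_size(cliques, weights):
    """f(H) = sum over cliques C of Clq(H) of the product of w(V) for V in C."""
    return sum(prod(weights[v] for v in clique) for clique in cliques)

# Assumed example: four cliques of three binary variables each -> 4 * 2**3 = 32.
cliques = [("A", "B", "E"), ("B", "C", "E"), ("C", "D", "E"), ("C", "E", "F")]
weights = {v: 2 for v in "ABCDEF"}  # w(V) = 2 for binary variables
print(state_space_size(cliques, weights))  # -> 32
```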
Other swarm intelligence approaches, such as ant colony systems and estimation of distribution algorithms, have also been applied to this problem [8][9]. In Section II, a brief introduction to the elimination ordering problem is given. Section III presents a discrete particle swarm optimization method, DPSO, to solve the problem. A novel algorithm hybridizing DPSO and simulated annealing is proposed in Section IV. Section V reports experimental results comparing our hybrid algorithm with other swarm intelligence methods.

II. ELIMINATION ORDERING OF BAYESIAN NETWORKS

In a Bayesian network, the causal relationships among probabilistic events of interest are expressed by a directed acyclic graph (DAG). A probabilistic event is represented by a random variable, which is formulated as a node in the DAG; therefore, we do not distinguish between the terms "node" and "variable". The set of all possible values of a random variable Vi is called the state space of Vi, and the size of Vi's state space is denoted by w(Vi). An edge pointing from node Vi to node Vj represents the causal relationship between cause event Vi and effect event Vj. We also say that Vi is a parent

978-1-4244-8217-7/10/$26.00 ©2010 IEEE

Figure 1. Example of a Bayesian network, its moral graph, triangulation and junction tree: (a) a Bayesian network; (b) moral graph; (c) triangulation; (d) junction tree.

The optimal elimination ordering is an ordering according to which a triangulation with minimum state space size is generated by the above ELIMINATION procedure. For convenience, we also use f(ORD) to denote the state space size of the triangulation generated by the ELIMINATION procedure according to an ordering ORD. The problem of finding an optimal elimination ordering is NP-hard [3]. To find good elimination orderings quickly, heuristics such as minimum fill, minimum size and H2 have been proposed [4][5]. These heuristics can be used to generate elimination orderings: given an undirected graph H, a heuristic selects the node for which its evaluation function gives the minimal value. For example, the minimum size heuristic evaluates a node by the number of its neighbors plus 1, so the evaluation values of the six nodes in Fig. 1(b) are 3, 3, 3, 4, 4 and 3 respectively; therefore A, B, C or F can be selected as the node to be eliminated. After eliminating the selected node, another node is selected and eliminated in the same way. Repeating this greedy procedure yields an elimination ordering.

III. A DISCRETE PSO ALGORITHM

In this section, we present a discrete PSO algorithm for finding good elimination orderings. In our DPSO algorithm, the position vector of each particle is defined as an elimination ordering, and the function computing state space size is used to evaluate a particle. For a particle Pk, the following equations are used to calculate its velocity and position:

Vel_k^(t+1) = w × Vel_k^t + c1 × r1 × (Pos_k_best^t − Pos_k^t) + c2 × r2 × (Pos_g_best^t − Pos_k^t)   (1)

Pos_k^(t+1) = Pos_k^t + Vel_k^(t+1)   (2)

At time step t, the particles located at Pos_g_best^t are called "best particles". r1 and r2 are two random real numbers; w, c1 and c2 are three parameters given in advance. Given two position vectors Pos_i and Pos_j, the subtraction Pos_i − Pos_j is calculated by the following procedure.

Input: Pos_i = (V_i1, ..., V_in) and Pos_j = (V_j1, ..., V_jn).
Output: a sequence SS of swap operations.
Begin Procedure
1. SS ← NULL;
2. For k = 1 to n Begin
   2.1 If V_ik ≠ V_jk then Begin
       2.1.1 In Pos_j, find V_ik and swap it with V_jk;
       2.1.2 Add (V_ik, V_jk) to the swap sequence SS;
       End
   End
End Procedure

The swap operation sequence obtained by Pos_i − Pos_j can be considered as a velocity.
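The position subtraction just described can be sketched as follows. This is our illustration only; representing a velocity as a list of swapped index pairs is an assumption about the data structure, not something the paper specifies.

```python
def subtract(pos_i, pos_j):
    """Swap sequence SS such that applying SS to pos_j yields pos_i (Pos_i - Pos_j)."""
    pos_j = list(pos_j)  # work on a copy
    ss = []
    for k in range(len(pos_i)):
        if pos_i[k] != pos_j[k]:
            m = pos_j.index(pos_i[k])          # step 2.1.1: find V_ik in Pos_j
            pos_j[k], pos_j[m] = pos_j[m], pos_j[k]  # ... and swap it with V_jk
            ss.append((k, m))                  # step 2.1.2: record the swap
    return ss

def add(pos, ss):
    """Summation of a position and a velocity: apply the swaps one by one."""
    pos = list(pos)
    for k, m in ss:
        pos[k], pos[m] = pos[m], pos[k]
    return pos

# Usage: the velocity moving Pos_j onto Pos_i.
ss = subtract(['A', 'B', 'C', 'D'], ['C', 'A', 'D', 'B'])
result = add(['C', 'A', 'D', 'B'], ss)
```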
The multiplication of a velocity Vel by a real number r (0 ≤ r ≤ 1) yields the swap operation sequence obtained by selecting each swap operation in Vel with probability r; if r ≥ 1 then we let r × Vel = Vel. The summation of two velocities is the concatenation of the two swap operation sequences. The summation of a position Pos and a velocity Vel is obtained by applying all the swap operations in Vel to Pos one by one. This calculation of particle positions and velocities is inspired by [11].

In (1) and (2), Pos_k^t and Vel_k^t denote the position vector and the velocity vector of particle Pk at time step t, respectively, while Pos_k_best^t and Pos_g_best^t are the best-so-far positions found by Pk and by the whole swarm up to time step t. Based on such a calculation, the DPSO algorithm can be described as follows.

Algorithm DPSO
Input: a moral graph G^M, a set RS of heuristics.
Output: a best-so-far solution.
Begin Procedure
1. S ← InitSwarm(RS);
2. For t = 0 to MAX_ITER − 1 Begin
   For each particle Pk in S, DPSO_Move(Pk);
   End
3. Return Pos_g_best^MAX_ITER;
End Procedure

In DPSO, the initial swarm of particles at time step 0 is generated by the following procedure.

Procedure InitSwarm
Input: a set RS of heuristics.
Output: a swarm S of particles.
Begin Procedure
1. Generate particles by the heuristics in RS;
2. Calculate Pos_k_best^0 for each particle Pk and Pos_g_best^0 for the swarm;
3. Let m be the number of best particles located at Pos_g_best^0. Reserve one best particle's position and reset the other m − 1 best particles to random positions;
4. Update Pos_k_best^0 for each particle Pk and Pos_g_best^0 for the swarm;
End Procedure

First, all the particles are generated greedily by the heuristics in RS; the number of particles generated by each heuristic is about the same. Although particles generated in such a heuristic way are much better than randomly generated ones, the shortcoming is obvious: the diversity of the swarm may be much worse, because particles generated by the same heuristic are most likely near to each other or even at the same position. To avoid many particles aggregating at the best position, step 3 randomizes the positions of all the best particles except one.

The DPSO_Move procedure moves a particle in the search space according to (1) and (2).

Procedure DPSO_Move
Input: a particle Pk at time step t.
Begin Procedure
1. Generate two random real numbers r1 and r2;
2. Calculate Vel_k^(t+1) and Pos_k^(t+1) using (1) and (2);
3. Update Pos_k_best^(t+1) and Pos_g_best^(t+1);
End Procedure

IV. A HYBRID DISCRETE PSO-SA ALGORITHM

Due to the lack of an effective local search, standard PSO may converge to a local optimum. The following algorithm tries to improve the local search ability of DPSO using simulated annealing.

Algorithm DPSO-SA
Input: a moral graph G^M, a set RS of heuristics.
Output: a best-so-far solution.
Begin Procedure
1. S ← InitSwarm(RS);
2. For each particle Pk in S, Statusk ← STATUS_DPSO;
3. Generate a particle Pshadow and Pos_shadow ← Pos_g_best^0;
4.–5. Set α ← 0.1 and derive the initial temperature T0 and the cooling rate β from α, as in [4];
6. For t = 0 to MAX_ITER − 1 Begin
   6.1 SA_Move(Pshadow);
   6.2 If f(Pos_shadow) < f(Pos_g_best^t) then Pos_g_best^t ← Pos_shadow;
   6.3 For each particle Pk in S Begin
       6.3.1 If (f(Pos_k_best^t) − f(Pos_g_best^t)) / f(Pos_g_best^t) < 0.1 then Begin
           6.3.1.1 If Statusk = STATUS_DPSO then Begin
               6.3.1.1.1 Pos_k^t ← Pos_k_best^t;
               6.3.1.1.2 Statusk ← STATUS_SA;
               End
           6.3.1.2 SA_Move(Pk);
           6.3.1.3 Vel_k^(t+1) ← Vel_k^t;
           End
       6.3.2 Else Begin
           6.3.2.1 If Statusk = STATUS_SA then Begin
               6.3.2.1.1 Pos_k^t ← Pos_k_best^t;
               6.3.2.1.2 Statusk ← STATUS_DPSO;
               End
           6.3.2.2 DPSO_Move(Pk);
           End
       End
   6.4 T ← T0 / (1 + (t + 1) × β);
   End
7. Return Pos_g_best^MAX_ITER;
End Procedure

At the beginning of DPSO-SA, the swarm is initialized (steps 1 and 2) and an additional particle Pshadow is produced, which is used to exploit the area around the global best position Pos_g_best using simulated annealing throughout the whole optimization process. The cooling scheme of simulated annealing can be seen from steps 4, 5 and 6.4, where T0 is the initial temperature and β is the cooling rate. This cooling scheme borrows ideas from Kjærulff's simulated annealing methods; detailed discussions can be found in [4].

Except for Pshadow, each particle in the swarm has two moving statuses: STATUS_DPSO and STATUS_SA. The status of particle Pk is identified by Statusk. If a particle's evaluation value is far from f(Pos_g_best^t), the particle is set to STATUS_DPSO and its movement is controlled by the DPSO_Move procedure presented in Section III. Otherwise it is set to STATUS_SA and its movement is controlled by the following SA_Move procedure.

Procedure SA_Move
Input: a particle Pk at time step t, a temperature T.
Begin Procedure
1. ORD' ← Pos_k^t;
2. For i = 1 to MAX_SA_ITER Begin
   2.1 Generate two random integers u and w (1 ≤ u < w ≤ n);
   2.2 (V_i1, ..., V_in) ← ORD';
   2.3 ORD' ← (V_i1, ..., V_i(u−1), V_i(u+1), ..., V_iw, V_iu, V_i(w+1), ..., V_in);
   2.4 Δ ← f(ORD') − f(Pos_k^t);
   2.5 prob ← min{1, e^(−Δ/T)};
   2.6 Generate a random real number r3;
   2.7 If r3 < prob then Pos_k^(t+1) ← ORD';
   2.8 Else Pos_k^(t+1) ← Pos_k^t;
   End
End Procedure

In fact, SA_Move is a Metropolis process that searches the neighborhood of Pos_k^t at a given temperature T. A neighboring solution is constructed by selecting two variables V_iu and V_iw and moving V_iu backwards to the position after V_iw (steps 2.1–2.3).

V. EXPERIMENT RESULTS

We test the performance of DPSO-SA on four Bayesian networks: Water, Mildew, Barley and Munin1, which are downloaded from http://bndg.cs.aau.dk/. The DPSO-SA algorithm is implemented in C++; its parameter setting is listed in Table I, where n stands for the number of nodes in the Bayesian network. On each network, DPSO-SA is performed 50 times.
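SA_Move's Metropolis loop can be sketched as below. This is a hedged illustration: f stands for any scoring function on orderings (in the paper it is the state space size of the induced triangulation, which we do not reimplement here), and the function and parameter names are ours.

```python
import math
import random

def sa_move(ordering, f, temperature, max_sa_iter=100):
    """Metropolis search around `ordering` at a fixed temperature (sketch of SA_Move)."""
    current = list(ordering)
    for _ in range(max_sa_iter):
        # Steps 2.1-2.3: pick u < w and move the variable at u to just after position w.
        u, w = sorted(random.sample(range(len(current)), 2))
        neighbor = current[:u] + current[u + 1:w + 1] + [current[u]] + current[w + 1:]
        # Step 2.4: score difference of the neighboring solution.
        delta = f(neighbor) - f(current)
        # Steps 2.5-2.8: always accept improvements; accept worse moves
        # with probability e^(-delta/T).
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            current = neighbor
    return current

# Toy usage with a stand-in objective (number of inversions, NOT the paper's f).
toy_f = lambda p: sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
out = sa_move([3, 1, 2, 0], toy_f, temperature=0.5)
```

Note that the move always produces a permutation of the input, so the returned ordering remains a valid elimination ordering.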
The best, average and deviation of the 50 runs for each network are listed in Tables II–V and compared with other swarm intelligence methods: GA-ALL, TAGA, ACSF, ACSV, ACSCM and ACSALL, presented in [7] and [8].

TABLE I. THE PARAMETER SETTING OF THE DPSO-SA ALGORITHM

Parameter      Value
MAX_ITER       5n
MAX_SA_ITER    100
w              0.4298
c1             0.69618
c2             0.69618
RS             {minimum fill, minimum weight, H2}

TABLE II. COMPARISON ON WATER

Algorithm   Best         Average        Deviation
GA-ALL      3,028,305    3,302,154.7    18,515.3
TAGA        3,028,305    3,192,906.7    173,655.3
ACSF        3,028,305    3,175,424.7    22,917.8
ACSV        3,028,305    3,360,359.6    7,927.7
ACSCM       3,362,268    3,438,780.4    8,157.7
ACSALL      3,028,305    3,226,672.2    23,607.3
DPSO-SA     3,028,305    3,028,796.5    3,475.6

TABLE III. COMPARISON ON MILDEW

Algorithm   Best         Average        Deviation
GA-ALL      3,400,464    3,421,822.1    2,844.7
TAGA        3,400,464    3,532,394.8    198,092.7
ACSF        3,400,464    3,473,817.4    13,714.7
ACSV        3,400,464    3,418,569.0    8,127.5
ACSCM       3,400,464    3,400,464.0    0.0
ACSALL      3,400,464    3,403,434.2    1,925.1
DPSO-SA     3,400,464    3,400,464.0    0.0

TABLE IV. COMPARISON ON BARLEY

Algorithm   Best          Average         Deviation
GA-ALL      17,140,796    17,199,695.6    16,167.0
TAGA        17,140,796    17,188,307.7    93,097.1
ACSF        17,140,796    17,147,217.7    2,137.2
ACSV        17,140,941    17,465,208.3    132,182.4
ACSCM       17,140,796    17,272,460.4    11,407.2
ACSALL      17,140,796    17,161,703.3    3,898.0
DPSO-SA     17,140,796    17,140,796.0    0.0

TABLE V. COMPARISON ON MUNIN1

Algorithm   Best          Average          Deviation
GA-ALL      83,735,918    103,269,358.0    3,180,680.3
TAGA        88,968,090    126,839,236.9    23,575,586.9
ACSF        85,352,183    101,328,156.5    1,709,201.5
ACSV        87,254,224    99,480,808.5     872,175.0
ACSCM       84,586,392    103,875,447.1    1,520,758.1
ACSALL      86,934,403    105,243,863.6    1,548,232.2
DPSO-SA     83,735,758    83,736,638.7     515.7

From Tables II–V, it can be seen that DPSO-SA achieves the best average results on all networks, and all the deviations obtained by DPSO-SA are very small. Therefore, DPSO-SA is a very effective and robust method for the elimination ordering problem.

VI. CONCLUSION

In this paper, a hybrid swarm intelligence algorithm named DPSO-SA is proposed to find close-to-optimal elimination orderings for Bayesian networks. DPSO-SA is a discrete particle swarm optimization method enhanced by simulated annealing. Computational experiments show that DPSO-SA is more effective and robust than other existing swarm intelligence methods.

ACKNOWLEDGMENT

This work was supported in part by the NSFC Major Research Program under Grant Nos. 60496320 and 60496321 (Basic Theory and Core Techniques of Non-Canonical Knowledge); NSFC under Grant Nos. 60873148 and 60973089; the European Commission under Grant No. TH/Asia Link/010 (111084); and the Science Foundation for Young Scholars of Jilin Province, China, under Grant Nos. 20080107, 20080607 and 20090108. We thank all of them.

REFERENCES

[1] Li Feng, Wei Wang, Lina Zhu, Yi Zhang. Predicting intrusion goal using dynamic Bayesian network with transfer probability estimation. Journal of Network and Computer Applications, 2009, 32(3): 721-732.
[2] Dechter R, Mateescu R. AND/OR search spaces for graphical models. Artificial Intelligence, 2007, 171(2-3): 73-106.
[3] Wen W X. Optimal decomposition of belief networks. UAI 1990: 245-256.
[4] Kjærulff U. Triangulation of graphs - algorithms giving small total state space. Technical Report R90-09, Department of Mathematics and Computer Science, Aalborg University, 1990.
[5] Cano A, Moral S. Heuristic algorithms for the triangulation of graphs. IPMU 1994: 98-107.
[6] Larrañaga P, Kuijpers C M H, Poza M, Murga R H. Decomposing Bayesian networks: triangulation of the moral graph with genetic algorithms. Statistics and Computing, 1997, 7(1): 19-34.
[7] Wang H, Yu K, Wu X H, Yao H L. Triangulation of Bayesian networks using an adaptive genetic algorithm. ISMIS 2006: 127-136.
[8] Gámez J A, Puerta J M. Searching for the best elimination sequence in Bayesian networks by using ant colony optimization. Pattern Recognition Letters, 2002, 23(1-3): 261-277.
[9] Romero T, Larrañaga P. Triangulation of Bayesian networks with recursive estimation of distribution algorithms. Int. J. Approx. Reasoning, 2009, 50(3): 472-484.
[10] Finn Verner Jensen, Frank Jensen. Optimal junction trees. UAI 1994: 360-366.
[11] KangPing Wang, Lan Huang, ChunGuang Zhou, Wei Pang. Particle swarm optimization for traveling salesman problem. ICMLC 2003: 1583-1585.