NIKITIN A., NIKITINA L. THE GA BASED APPROACH TO OPTIMIZATION OF PARALLEL CALCULATIONS IN LARGE PHYSICS PROBLEMS.

The GA Based Approach to Optimization of Parallel Calculations in Large Physics Problems.

Dr. Andrey Nikitin, Dr. Ludmila Nikitina
Moscow State University
[email protected]

ABSTRACT.

Parallelizing computational algorithms for large physics problems on multiprocessor computer systems with a thousand or more processor elements requires special tools. This work proposes an approach based on a Genetic Algorithm (GA). Experimental results and the influence of the GA parameters on the convergence of the method are presented. The approach may be used for real-time parallelization.

1. INTRODUCTION.

Solving the parallelization problem for a multiprocessor computer system with a thousand or more processor elements requires a huge number of operations. The problem is NP-complete, so exhaustive search cannot be applied in practice. The Genetic Algorithm (GA) [1] is a stochastic search technique based on ideas adopted from nature. GAs have been successfully applied to many NP-complete problems, such as the Travelling Salesman Problem and the mapping problem [5].

Existing methods do not exploit high-level parallelism and do not take into account the performance of the communication channels between processor elements. This work proposes an approach based on the Genetic Algorithm. The method can be applied to computer systems with a huge number of processor elements and can use high-level parallelism as well as operation-level parallelism. The parallelization problem is formulated here as the optimisation of a partition of a graph into subgraphs.

2. BACKGROUND.

The criterion for distributing nodes among subgraphs is the minimization of the running time of the algorithm on the system, including the time required for data transfer between processor elements. Each subgraph thus corresponds to one processor element. The computational algorithm can be represented as a weighted graph [3]. The functional that estimates the running time of the algorithm together with the data transfer is determined by the given partition $R$ and is described below.

A program is represented by an acyclic directed weighted graph $G_a = \langle S, C \rangle$. The nodes of the graph are the statements of the algorithm and the edges represent the information exchange between them. $C$ is the vector of node weights (calculation costs) and $S$ is the adjacency matrix of the graph. An acyclic graph can be decomposed into layers; nodes belonging to the same layer can be executed simultaneously. Let $h$ be the number of layers. The decomposition is described by a matrix $H_{h \times n}$, where $h_{kj} = 1$ if the $j$-th node belongs to the $k$-th layer and $0$ otherwise.

A multiprocessor computer system is represented by a weighted graph $G_c = \langle \Pi, \Lambda \rangle$. The nodes of the graph are the processor elements, and $\pi_i$ is the performance of the $i$-th processor element. There is an edge between the $i$-th and $j$-th nodes if there is a link between the corresponding processors, and $\Lambda_{p \times p}$ is the communication performance matrix.

We use the following notation:
- $n$ is the number of statements in the algorithm;
- $p$ is the number of processor elements in the computer;
- $R_{p \times n}$ is the partition matrix: $r_{il} = 1$ if node $l$ belongs to the $i$-th subgraph and $0$ otherwise.

We use the matrix $G_{h \times p}$, whose element $g_{ki}$ is the total weight of the nodes that belong to both the $i$-th subgraph and the $k$-th layer of the algorithm graph:

$$g_{ki}(R) = \sum_{l=1}^{n} h_{kl}\, r_{il}\, c_l .$$

The cost of the data transfer between processor elements is given by the matrix $\Gamma_{p \times p}$:

$$\gamma_{ij} = \sum_{l=1}^{n} \sum_{m=1}^{n} r_{il}\, r_{jm}\, s_{lm} .$$

The functional $J(R)$ can now be defined:

$$J(R) = \frac{\pi_0}{U_A} \sum_{k=1}^{h} \max_{1 \le i \le p} \frac{g_{ki}}{\pi_i} + \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} \gamma_{ij}\, L_{ij},$$

where $\pi_0$ is the performance of the processor element.
$U_A$ is the total weight of the algorithm graph. The matrix $L_{p \times p}$ gives the minimal distance between the $i$-th and $j$-th processors. If both communicating algorithm nodes are allocated to the same processor, the distance is zero.

To achieve maximum performance we need to find the partition matrix $R$ that minimizes $J(R)$: $J(R) \to \min$. The GA carries out this optimisation and builds the matrix $R$, with $J(R)$ used as the fitness function.

3. IMPLEMENTATION DETAILS.

The genetic algorithm can be characterized by the following parameters:
- coding strategy,
- population size,
- initialisation strategy,
- selection technique,
- crossover mechanism,
- mutation mechanism,
- strategy for offspring inclusion.

Below, we briefly describe the techniques used in this implementation.

Coding strategy - the partition is encoded by a chromosome, represented as a byte string whose length equals the number of nodes in the algorithm graph. The number stored at the $i$-th position (gene) of a chromosome is the number of the processor element on which the $i$-th node is placed.

Population size - the algorithm presented in this paper places no restrictions on the number of chromosomes in the population.

Initialisation strategy - the initial individuals are chosen either at random or by a modified weighted random scheme.

Selection technique - two methods of choosing an individual for reproduction are used and compared:
- roulette wheel: the probability that individual $i$ with fitness $f_i$ is used as a parent is proportional to its relative fitness in the population;
- linear rank selection: the probability that individual $i$ with fitness $f_i$ is used as a parent is based on its rank in the population as a whole.
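The encoding, the fitness functional and the two selection rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The function names (`fitness`, `roulette_select`, `rank_select`), the argument layout, and the sparse dictionary used for the edge weights are assumptions made for the example. Two further assumptions are labelled in the comments: every edge between nodes on different processors is counted once in the transfer term, and, because $J(R)$ is minimized, the roulette wheel uses $1/J$ as the relative fitness.

```python
import random
from typing import Dict, List, Sequence, Tuple


def fitness(chromosome: Sequence[int], layer: Sequence[int],
            cost: Sequence[float], edges: Dict[Tuple[int, int], float],
            perf: Sequence[float], dist: Sequence[Sequence[float]],
            perf0: float = 1.0) -> float:
    """Evaluate J(R) for the partition encoded as chromosome[node] = processor."""
    p, h = len(perf), max(layer) + 1
    # g[k][i]: total weight of the layer-k nodes placed on processor i.
    g = [[0.0] * p for _ in range(h)]
    for node, proc in enumerate(chromosome):
        g[layer[node]][proc] += cost[node]
    # Computation term: each layer lasts as long as its slowest processor.
    compute = sum(max(g[k][i] / perf[i] for i in range(p)) for k in range(h))
    compute *= perf0 / sum(cost)  # sum(cost) is the total graph weight U_A
    # Transfer term (assumption: every edge whose endpoints lie on different
    # processors is counted once, weighted by the processor distance).
    transfer = sum(w * dist[chromosome[a]][chromosome[b]]
                   for (a, b), w in edges.items()
                   if chromosome[a] != chromosome[b])
    return compute + transfer


def roulette_select(population: List, scores: Sequence[float],
                    rng: random.Random):
    """Roulette wheel. Assumption: J is minimized, so 1/J acts as the fitness."""
    weights = [1.0 / s for s in scores]
    return rng.choices(population, weights=weights, k=1)[0]


def rank_select(population: List, scores: Sequence[float],
                rng: random.Random):
    """Linear rank selection: worst J gets rank 1, best J gets rank N."""
    order = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    weights = [0] * len(population)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return rng.choices(population, weights=weights, k=1)[0]
```

Storing the layer index of each node in a list plays the role of the matrix $H$, and the chromosome itself plays the role of $R$, so neither matrix has to be built explicitly. Rank selection depends only on the ordering of the $J$ values, which makes it insensitive to their scale; the roulette wheel is not.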
Crossover mechanism - we consider three crossover operators:
- one-point crossover (1-PC): the pair of chromosomes representing the selected individuals is cut at a randomly chosen point and the right-hand genes are swapped between the chromosomes;
- two-point crossover (2-PC): the pair of chromosomes representing the selected individuals is cut at two randomly chosen points and the genes lying between those two points are exchanged between the chromosomes;
- uniform crossover: each gene is picked at random from either of the two parent chromosomes.

Mutation mechanism - either a node is moved to a randomly chosen subgraph, or two nodes are exchanged between subgraphs.

Strategy for offspring inclusion - after each reproduction a decision must be made whether or not an offspring should replace its parent. We keep the better of the two offspring and the better of the two parents.

4. RESULTS.

The method was investigated on a real problem, the self-consistent simulation of negative central shear discharges, for which the exact solution is known. The graph has 1563 nodes. All results are averaged over 100 runs. The functional $E$ defines the relative error and is used to study the GA:

$$E = \frac{J - J_{ex}}{J_{ex}} \cdot 100\%,$$

where $J$ is the solution found by the GA and $J_{ex}$ is the exact solution of the parallelization problem (calculated analytically).

Firstly, the influence of the population size on the convergence of the GA was studied. The results are presented in Figure 1. Figure 2 helps to choose the population size for a predefined precision. Larger populations give better results when high precision is required.

Figure 1. An influence of the population size on the convergence of GA. (Relative error E, %, versus iteration number; population sizes 10, 30, 50, 100, 500, 1000.)

Figure 2. An influence of the population size on the number of iterations to achieve predefined precision (55%, 45%, 75%). (Number of iterations versus population size.)

Mutation brings new information into the chromosome and prevents the loss of diversity. Figures 3 and 4 compare different mutation schemes and mutation rates. Randomly moving nodes gives better results than exchanging nodes between subgraphs.

Figure 3. The method convergence for different mutation schemes. (Relative error E, %, versus iteration number; mutation operators Swap, Flip.)

Figure 4. The method convergence for different mutation rates. (Relative error E, %, versus iteration number; mutation probabilities 0%, 0.01%, 1%, 2%, 5%, 10%.)

After that, the selection strategies were compared (Figure 5). The Rank and Tournament selection schemes demonstrate the best results. The GA with rank selection converges to the exact solution slightly faster.

Figure 5. A comparison of selection strategies. (Relative error E, %, versus iteration number; selection operators Rank, Roulette, Tournament, DS, SRS, Uniform.)

In our last experiment we analysed the influence of the crossover operator on the performance of the method. The experiment was done with a crossover probability varying from 0 to 1.0 and a mutation rate varying from 0.001 to 0.5. Figure 6 shows the results of this comparison.
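The crossover operators compared in this experiment (one-point, two-point, uniform) and the two mutation schemes from Section 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the experiments: the function names, the use of Python lists of processor numbers as chromosomes, and the 0.5 gene-swap probability in the uniform operator are all choices made for the example.

```python
import random
from typing import List, Tuple

Chromosome = List[int]  # chromosome[node] = processor element number


def one_point(a: Chromosome, b: Chromosome,
              rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Cut both parents at one random point and swap the right-hand genes."""
    cut = rng.randrange(1, len(a))  # needs at least two genes
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def two_point(a: Chromosome, b: Chromosome,
              rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Exchange the genes lying between two random cut points."""
    lo, hi = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:lo] + b[lo:hi] + a[hi:], b[:lo] + a[lo:hi] + b[hi:]


def uniform(a: Chromosome, b: Chromosome,
            rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Take each gene from either parent (assumed probability 0.5)."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2


def mutate_move(chrom: Chromosome, p: int, rng: random.Random) -> Chromosome:
    """Move one randomly chosen node to a randomly chosen processor."""
    out = list(chrom)
    out[rng.randrange(len(out))] = rng.randrange(p)
    return out


def mutate_swap(chrom: Chromosome, rng: random.Random) -> Chromosome:
    """Exchange the processor assignments of two randomly chosen nodes."""
    out = list(chrom)
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out
```

All operators return new lists and leave the parents untouched, which keeps the "better of offspring, better of parents" replacement rule easy to apply afterwards. Note that the swap mutation never changes how many nodes each processor holds, whereas the move mutation can.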
Each operator was applied with the probability that gave the best results. Uniform crossover gives better results than the other operators.

Figure 6. A comparison of crossover operators. (Relative error E, %, versus iteration number; crossover operators 1pt, 2pts, Uni, EvenOdd, PartialMatch.)

Figure 7. Performance of algorithm for different crossover rates. (Relative error E, %, versus iteration number; crossover probabilities 0%, 20%, 60%, 80%, 90%, 100%.)

CONCLUSION.

An approach to solving the parallelization problem on the basis of the Genetic Algorithm is suggested. A numerical study of the method was carried out. The GA showed fast convergence and may be used for real-time parallelization.

REFERENCES.

[1] D.E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley Publishing Company, 1989.
[2] A.V. Nikitin, Optimization of Modular Associative Memory, Computational Mathematics and Modeling, 1999, Vol. 10, No. 4, pp. 405-412.
[3] N.M. Ershov, A.M. Popov, Optimization of Parallel Computations by Monte Carlo Method, In: Proceedings of PaCT-93, Obninsk, Russia, 1993, Vol. 3.
[4] G. von Laszewski, Intelligent Structural Operators for the K-way Graph Partitioning Problem, In: Proceedings of the 4th ICGA, 1991.
[5] T. Kalinowski, Solving the Mapping Problem with Genetic Algorithm on the MasPar-1, In: Proceedings of MPCS, Ischia, Italy, 1994.