5thSASTech 2011, Khavaran Higher-education Institute, Mashhad, Iran. May 12-14.

Optimization of Farsi Letter Arrangement on Keyboard by Simulated Annealing and Genetic Algorithms

Navid Samimi Behbahan, [email protected]
Department of Computer, Omidiyeh Branch, Islamic Azad University, Omidiyeh, Iran

Paper Reference Number: 8
Name of the Presenter: Navid Samimi Behbahan

Abstract

Today the keyboard is one of the most common devices for computer data entry. Saving time is one of the most important goals of the present age, and optimizing the keyboard arrangement matters because it lets us enter information in less time. A combined evolutionary algorithm can search the space of Persian letter arrangements on a keyboard and reach an optimized arrangement with respect to an evaluation factor (the level of typing comfort for a particular letter arrangement). In this paper, genetic and simulated annealing algorithms search for the best permutation of the 33 Persian letters on the keyboard. The evaluation criteria include three factors: alternating use of the hands while typing, not using the same finger for two adjacent letters, and the difficulty of typing a letter in the given arrangement. Experiments on large and varied data sets (Persian texts) showed that the optimized arrangement produced by this hybrid algorithm performs better than the present arrangement.

Key words: permutation, genetic algorithm, simulated annealing algorithm, keyboard, optimum arrangement

1. Introduction

Since the emergence of the computer, the keyboard has been the main interface between human and computer. An optimized arrangement of Persian letters benefits people who type Persian texts. Before computers, the keyboard was used in typewriters.
It has been 135 years since the rectangular keyboard was designed by Christopher Latham Sholes and used in typewriters, yet this invention has been constantly questioned by critics. In addition to the physical design of the keyboard, the letter arrangement has been criticized as well. Researchers have presented different algorithms to create an optimized layout, but these were mostly applied to the English language. Unfortunately, no new design has been proposed for the Persian language, and the arrangement recommended when Persian was first used on computers is still in use.

Many researchers have applied evolutionary algorithms and similar methods to the problem of letter arrangement on the keyboard. Glover (1987) was the first researcher to study this area. He designed a genetic algorithm whose chromosomes were different permutations of Latin characters on the keyboard, together with a series of genetic operators suited to reaching the optimized permutation. Light and Anderson (1993) worked on the problem next. They used the simulated annealing algorithm to search among different arrangements for the English language. Their evaluation function was based on typing time and the frequency of word repetition. Klausler (2005) used an operator that repeatedly swapped the positions of two letters. He applied this algorithm to the 26 English letters and 4 punctuation characters in three rows of ten keys. His evaluation function computed how often the fingers move away from their basic position (Fig. 1); the algorithm then moved toward the arrangement that minimizes the frequency of finger displacement for a fixed text.

Fig 1: the basic position of the typist's fingers on keyboard

Wagner et al. (2003) used the Ant Colony Algorithm to search the space of keyboard arrangements. Their evaluation function used factors such as the frequency of pressing different keys, using the same hand for typing two adjacent letters, and using one finger for typing two adjacent letters. Moradi and Shiri (2006) applied the genetic algorithm to the problem of arranging Persian letters on a keyboard with three rows. They used the mutation operator to modify their initial populations. Their evaluation function measured finger displacement, the balance of work between the two hands, and the alternation of the hands, so that the hands do not enter consecutive characters of a six-letter word (5).

In the following sections, the problem is defined precisely, the combined algorithm is described, a new function for comparing different keyboard arrangements is presented, and finally the optimized arrangement is studied along with the results.

2. Problem definition

As in other algorithms that search the space of arrangements, the geometry of the keyboard is fixed in this problem. We want to allocate 33 characters, the 32 letters of the Persian language together with ء (hamzeh), to the three rows of the keyboard, which have 10, 11 and 12 keys respectively. The goal is to find the arrangement on these keys with which the user feels most comfortable when typing Persian texts.

3. Statistical review of the Persian Letters

In this part we find the frequency of each Persian character and of each pair of characters that follow each other (Malas et al. 2008). The analysis was coded in the Matlab environment and run on a text of 19092 words; the results are shown below. Table 1 gives the frequency of each of the Persian language characters.
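The letter and pair counts described in this section could be computed along the following lines. This is only a minimal sketch in Python rather than the Matlab code used in the study; the function name `letter_and_pair_frequencies` and the rule that pairs are counted only inside a word are assumptions of the sketch.

```python
from collections import Counter


def letter_and_pair_frequencies(text, alphabet):
    """Return (letter_pct, pair_pct), both as relative frequencies in percent.

    Assumption of this sketch: a pair is counted only when both letters
    belong to the same whitespace-separated word. Characters outside
    `alphabet` are dropped before counting.
    """
    letters = Counter()
    pairs = Counter()
    for word in text.split():
        chars = [c for c in word if c in alphabet]
        letters.update(chars)
        # (previous letter, current letter) for each adjacent pair in the word
        pairs.update(zip(chars, chars[1:]))
    n_letters = sum(letters.values())
    n_pairs = sum(pairs.values())
    letter_pct = {c: 100.0 * k / n_letters for c, k in letters.items()}
    pair_pct = {p: 100.0 * k / n_pairs for p, k in pairs.items()}
    return letter_pct, pair_pct
```

Called with a Persian corpus and the set of 33 characters, the two dictionaries would play the role of Table 1 and Table 2.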
The highest and lowest frequencies belong to "ﺍ" and "ء", at 15.80 and 0.02 percent respectively. Table 2 shows the frequency of the pairs of characters following each other. The rows give the first character and the columns give the second character. The most frequent pair consists of the characters "ﺍ" and "ﻥ".

Table 1. Frequency of the Persian Letters

Table 2. Frequency of Persian letter pairs

4. Combination of Genetic and Simulated Annealing Algorithms

A hybrid algorithm was used to solve this problem. The main weakness of the genetic algorithm is that it can become trapped at a local optimum (a relative maximum), which sometimes prevents it from finding the optimized answer (the absolute maximum). The genetic algorithm always tries to improve on the current answers, whereas finding the optimized answer sometimes requires moving to worse answers first, then improving them, and only then reaching the absolute maximum. Therefore this method does not use the improving property of the genetic algorithm, which lets it find a better answer than the present one; the simulated annealing algorithm is applied instead. The genetic operators only create new arrangements. No selection among arrangements is made by the genetic algorithm, and acceptance or rejection of the new arrangements (as defined in Fig. 2) is performed by the simulated annealing algorithm. The evaluation function in this combination of algorithms measures the comfort or difficulty of using an arrangement.
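The hybrid scheme described above, in which mutation proposes new arrangements and simulated annealing decides whether to accept them, can be sketched in Python as follows. This is an illustrative simplification, not the authors' implementation: it tracks a single arrangement instead of a population, keeps the number of calls per temperature constant, and takes the cost and mutation functions as parameters. The names `anneal`, `metropolis`, `swap_two` and `misplaced` are hypothetical, and the default temperatures, cooling rate and call count are the values reported in Table 3.

```python
import math
import random


def metropolis(current, best, temperature, cost, mutate, rng):
    """One step: propose a candidate by mutation, accept it by the annealing rule."""
    candidate = mutate(current, rng)
    delta = cost(candidate) - cost(current)
    # Always accept an improvement; accept a worse candidate with
    # probability exp(-delta / T).
    if delta < 0 or rng.random() < math.exp(-delta / temperature):
        current = candidate
    if cost(current) < cost(best):
        best = current
    return current, best


def anneal(initial, cost, mutate, t_max=10.0, t_min=1e-4, alpha=0.95, k=15, seed=0):
    """Cool from t_max down to t_min, calling metropolis k times per temperature."""
    rng = random.Random(seed)
    current = best = initial
    t = t_max
    while t >= t_min:
        for _ in range(k):
            current, best = metropolis(current, best, t, cost, mutate, rng)
        t *= alpha
    return best


# Toy stand-ins so the sketch runs on its own (not the paper's cost or operator).
def swap_two(perm, rng):
    """Exchange the contents of two random positions."""
    i, j = rng.sample(range(len(perm)), 2)
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return tuple(out)


def misplaced(perm):
    """Number of positions whose content differs from its index."""
    return sum(1 for i, v in enumerate(perm) if i != v)


if __name__ == "__main__":
    start = (3, 1, 0, 2, 4)
    print(anneal(start, misplaced, swap_two))
```

Because the best arrangement is replaced only by a strictly cheaper one, the returned arrangement never costs more than the starting one, even though worse candidates are accepted along the way.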
In every generation, the genetic operators are applied to the present population, whose members are different arrangements of Persian letters on a keyboard. With the help of the simulated annealing algorithm, the members are then moved in the direction that minimizes the value of the evaluation function. The fitness of each member of the population (which is an arrangement of Farsi letters on the keyboard) is obtained by applying the evaluation function to text drawn from a variety of subjects (political, scientific, historical, social and others).

    // Sinit is the initial set of arrangements
    // Sbest is the best set of arrangements
    // EFbest is the Evaluation Fitness of the best set of arrangements
    // EFcurrent is the Evaluation Fitness of the current set of arrangements
    // Tmax is the initial temperature
    // Tmin is the final temperature
    // α is the cooling rate
    // β is a constant
    // Time is the time spent for the annealing process so far
    // k is the number of calls of Metropolis at each temperature
    Begin
      T = Tmax;
      Scurrent = Sinit;
      Sbest = Scurrent;   // Sbest is the best set of arrangements seen so far
      Repeat
        For i = 1 to k
          Call Metropolis(Scurrent, Sbest, T);
        k = β × k;
        T = α × T;
      Until (T < Tmin);
      Return(Sbest);
    End. // Genetic-Simulated Annealing

    Procedure Metropolis(Scurrent, Sbest, T)
    // Snew is the new set of arrangements
    Begin
      Selection(Scurrent);
      Snew = Mutation(Scurrent);
      EFnew = NNCP(Snew);
      ∆EF = (EFnew − EFcurrent);
      If (∆EF < 0) Then
        Scurrent = Snew; EFcurrent = EFnew;
        If EFnew < EFbest Then
          Sbest = Snew; EFbest = EFnew;
        End If
      Else If (random[0,1] < e^(−∆EF / T)) Then
        Scurrent = Snew; EFcurrent = EFnew;
      End If
    End. // Metropolis

Fig 2: Quasi-code for the presented hybrid algorithm (Genetic-Simulated Annealing)

4.1. Population

The members of the population in this problem are the different permutations of Persian letters on the keyboard, that is, the arrangements.
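As an illustration of this encoding, one member of the population could be held as a Python list of the 33 characters. The sketch below is only one possible representation: the names `LETTERS`, `ROW_SIZES`, `random_chromosome` and `key_positions` are hypothetical, and the assumption that key indices fill the rows of 10, 11 and 12 keys in order may differ from the numbering shown in Fig. 3.

```python
import random

# The 32 Persian letters followed by hamzeh (33 characters in total).
LETTERS = "ا ب پ ت ث ج چ ح خ د ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ه ی ء".split()

# Keys per row, as given in the problem definition.
ROW_SIZES = (10, 11, 12)


def random_chromosome(rng):
    """Return a random permutation of the 33 characters (one arrangement)."""
    chromosome = LETTERS[:]
    rng.shuffle(chromosome)
    return chromosome


def key_positions(chromosome):
    """Map each letter to (row, column), assuming indices fill the rows in order."""
    positions = {}
    index = 0
    for row, size in enumerate(ROW_SIZES):
        for col in range(size):
            positions[chromosome[index]] = (row, col)
            index += 1
    return positions


if __name__ == "__main__":
    layout = random_chromosome(random.Random(0))
    print(key_positions(layout)[layout[0]])
```

Any operator that only reorders such a list keeps it a valid arrangement, which is why permutation-preserving mutation is convenient here.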
Each member of the population can be regarded as a vector of Persian letters in which each index corresponds to one key. That is, each vector of length 33 is one chromosome (one member of the population), and the i-th letter of the vector corresponds to the key labeled i on the keyboard. Fig. 3 shows the corresponding indices of each chromosome on the keyboard, and Fig. 4 shows the chromosome corresponding to the present arrangement of the keyboard.

Fig 3: The indexes of genes of each chromosome on keyboard

1 2 3 4 5 6 ﻍ ﻑ ﻕ ﺙ ﺹ ﺽ ... 33 پ

Fig 4: Structure of a chromosome of a population corresponding to current arrangement of Farsi letters on the keyboard

Generally, we are looking for the one member of the whole space of possible arrangements that has the lowest cost under the evaluation function defined in the following section. The important point is that the number of different arrangements is 33!, or about 8.68×10^36, and this is the space in which the hybrid algorithm must look for the optimized arrangement.

4.2. The Evaluation Function for Keyboard arrangement

In this area we rely on the work done by specialists. Norman and Rumelhart (1983) defined four targets for the design of a keyboard:
1- the most equal distribution of work between the two hands
2- the largest number of keystrokes typed alternately by the two hands
3- the smallest number of adjacent letter pairs typed by the same finger
4- the placement of the most commonly used letters on the middle row of the keyboard

By simulating the typing of a text, one chromosome represents the arrangement of the corresponding keyboard. The evaluation factors are defined with respect to these targets.
For the first two targets we can present the following evaluation factor:

C_hand: the cost of using one hand for typing two adjacent letters. This factor covers the first two targets. The second target is met directly. The first target is also covered, because alternating the hands for two adjacent letters ultimately spreads the difficulty of typing between the two hands.

For the third target the following factor is defined:

C_finger: the cost of using one finger for typing two adjacent letters.

A further factor covers not only the fourth target but other considerations as well. This factor does not only account for how often the basic keys are pressed; it also considers the use of different fingers, the comfort of the hands while working, and how often the fingers move across the keyboard. Fig. 5 shows the cost of pressing each key on the keyboard; these numbers were obtained from professional typists. Note that these numbers were obtained for a right-handed person. Based on this information, the third factor is defined as follows:

C_ergonomic: the cost of typing a letter given the position of the letter on the keyboard.

Fig 5: cost related to pressing any key on the keyboard

The evaluation function for each chromosome is obtained by summing these three factors over all the letters used in a text:

$$\sum_{i=1}^{33} F_{letter}(l_i) \times \left[ \sum_{j=1}^{33} \Big( F_{letter\_pairs}(l_j, l_i) \times \big( C_{hand}(l_j, l_i) + C_{finger}(l_j, l_i) \big) \Big) + C_{ergonomic}(l_i) \right] \qquad (1)$$

F_letter is the relative frequency (in percent) of each letter, and F_letter_pairs is the relative frequency (in percent) of two characters occurring next to each other.

C_ergonomic: a function that returns the cost of typing the letter l_j according to the values defined in Fig. 5 and the given arrangement.
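As a rough illustration, Equation (1) could be evaluated for one arrangement as in the Python sketch below. The function `layout_cost` and its dictionary-based inputs are assumptions made for the sketch; the fixed penalties follow the text of this section (the same-finger value is the average of the key costs and the same-hand value is a quarter of it), while the hand and finger assignment of each key must be supplied by the caller.

```python
def layout_cost(layout, letter_freq, pair_freq, key_cost, hand_of, finger_of):
    """Evaluate Equation (1) for one arrangement.

    layout      : dict letter -> key index (the arrangement)
    letter_freq : dict letter -> relative frequency (F_letter)
    pair_freq   : dict (previous, current) -> relative frequency (F_letter_pairs)
    key_cost    : dict key index -> cost of pressing that key (C_ergonomic)
    hand_of     : dict key index -> hand label, e.g. "L" or "R"
    finger_of   : dict key index -> finger label, unique across both hands
    """
    # Fixed penalties as described in the text: average key cost, and a quarter of it.
    finger_penalty = sum(key_cost.values()) / len(key_cost)
    hand_penalty = finger_penalty / 4.0

    total = 0.0
    for cur, f_cur in letter_freq.items():
        inner = key_cost[layout[cur]]  # C_ergonomic(l_i)
        for prev in letter_freq:
            f_pair = pair_freq.get((prev, cur), 0.0)
            if f_pair == 0.0:
                continue
            k_prev, k_cur = layout[prev], layout[cur]
            c_hand = hand_penalty if hand_of[k_prev] == hand_of[k_cur] else 0.0
            c_finger = finger_penalty if finger_of[k_prev] == finger_of[k_cur] else 0.0
            inner += f_pair * (c_hand + c_finger)
        total += f_cur * inner
    return total


if __name__ == "__main__":
    key_cost = {0: 1.0, 1: 3.0, 2: 2.0}
    hand_of = {0: "L", 1: "R", 2: "L"}
    finger_of = {0: "L-index", 1: "R-index", 2: "L-index"}
    freq = {"a": 0.5, "b": 0.5}
    pairs = {("a", "b"): 1.0}
    print(layout_cost({"a": 0, "b": 1}, freq, pairs, key_cost, hand_of, finger_of))
    print(layout_cost({"a": 0, "b": 2}, freq, pairs, key_cost, hand_of, finger_of))
```

In the toy data, moving "b" onto a key typed by the same finger as "a" raises the cost, which is the behavior the two penalty terms are meant to produce. If the pair frequencies are counted only inside words, the rule that the first letter of a word incurs no hand or finger cost is respected automatically.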
C_finger: a function that returns a fixed value (the average of the numbers shown in Fig. 5) if the two letters l_j and l_{j-1} are typed by one finger in the given arrangement; otherwise its outcome is zero.

C_hand: a function that returns a fixed value (a quarter of the fixed value used for C_finger) if the two letters l_j and l_{j-1} are typed by one hand in the given arrangement; otherwise its outcome is zero.

If l_j is the first letter of a word W_i, the outcome of both C_hand and C_finger is zero.

4.3. Genetic Operators

To move the population of the genetic algorithm (the chromosomes) in the direction in which the evaluation function decreases for each chromosome, genetic operators are needed. Only mutation operators are used here. The exchange (crossover) operator is not used because, given the structure of the population members, merging two parent chromosomes is costly in time (the child chromosomes would keep the genes their parents have in common, and the rest would be changed randomly). The mutation operator is designed to take much less time and to fill the place of the exchange operator.

In each generation, our population consists of ten chromosomes. The mutation operator assigns each member a number from 3 to 12 according to the quality of its arrangement. This number is the number of genes (characters) affected by mutation in that chromosome: 3 genes are mutated in the best chromosome and 12 in the worst. The mutation operator randomly displaces the contents of the selected genes of each member.
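The rank-dependent mutation described above could look like the following Python sketch. It assumes a population of exactly ten chromosomes, so that ranks 0 to 9 map to 3 to 12 mutated genes, and it interprets "randomly displaces the contents" as shuffling the contents of randomly chosen positions; the names `mutate` and `mutate_population` are hypothetical.

```python
import random


def mutate(chromosome, n_genes, rng):
    """Return a copy with the contents of n_genes random positions shuffled.

    The result is still a permutation. A shuffle can leave some of the
    chosen positions unchanged, so at most n_genes positions differ.
    """
    positions = rng.sample(range(len(chromosome)), n_genes)
    targets = positions[:]
    rng.shuffle(targets)
    child = list(chromosome)
    for src, dst in zip(positions, targets):
        child[dst] = chromosome[src]
    return child


def mutate_population(population, cost, rng):
    """Mutate 3 genes of the best member, 4 of the next, up to 12 of the worst.

    Assumes ten members, as stated in the text.
    """
    ranked = sorted(population, key=cost)
    return [mutate(member, 3 + rank, rng) for rank, member in enumerate(ranked)]


if __name__ == "__main__":
    rng = random.Random(0)
    base = list(range(33))
    population = [rng.sample(base, len(base)) for _ in range(10)]
    # Toy cost used only to rank the members in this demonstration.
    children = mutate_population(population, cost=lambda c: c[0], rng=rng)
    print(len(children))
```

Shuffling within a fixed set of positions keeps every child a valid arrangement, so no repair step is needed after mutation.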
This difference in the number of displaced genes means that the fittest members of each generation undergo fewer changes, while the ordinary members of the population undergo more. So, in the transition between generations, there is a high probability of creating fitter members. Because the fitter members of past generations undergo fewer changes, each generation has a chance to improve.

5. Conclusion

For present rectangular keyboards, an appropriate arrangement should avoid a high number of finger displacements on the keyboard and should consider the other ergonomic factors that contribute to the user's comfort. Among these factors are not typing two adjacent letters with one finger, or even one hand, and spreading the difficulty of typing equally between the two hands. The combined simulated annealing and genetic algorithm leads the arrangement of the 33 Persian letters toward an optimized arrangement. We implemented the presented hybrid algorithm with the parameters listed in Table 3. According to the evaluation function, the cost of the best arrangement found by the combined algorithm for Farsi letters is 0.6391 of the cost of the current arrangement of Farsi letters, which is a significant improvement. This arrangement is shown in Fig. 6.

Parameter                                                          Value
Number of chromosomes                                              10
Primary temperature                                                10
Final temperature                                                  0.0001
Temperature decrease coefficient                                   0.95
Number of repetitions of the Metropolis function per temperature   15
Number of mutated genes for each chromosome                        3 to 12

Table 3. Parameter values of the proposed method

Fig 6: proposed arrangement

References

Glover, D. E., & Kaufmann, M. (1987). Genetic Algorithm and Simulated Annealing, pp. 12-31, Los Altos, CA.

Gotti, J. S., Brugh, A. W., & Julstrom, B. A. (2005). Arranging the Keyboard with a Permutation-Coded Genetic Algorithm. In Proc. of the 2005 ACM Symposium on Applied Computing, Volume 2, pp. 947-951.

Klausler, P. (2005). Available at www.visi.com/~pmk/evovled.html, Sep.

Light, W. L., & Anderson, G. P. (1993). Typewriter keyboard via simulated annealing, AI Expert, September.

Malas, T. M., Taifour, S. S., & Abandah, G. A. (2008). Toward Optimal Arabic Keyboard Layout Using Genetic Algorithm. In Proc. 9th Int'l Middle Multiconference on Simulation and Modeling, Aug 26-28, Amman, Jordan.

Moradi, S., & Shiri, S. (2006). Optimization of Farsi Letter Arrangement on Keyboard by Genetic Algorithms. Tehran, 11th International CSI Computer Conference.

Norman, D. A., & Rumelhart, D. E. (1983). Cognitive Aspects of Skilled Typing. New York, NY: Springer-Verlag.

Wagner, M. O., Yannou, B., Kehl, S., Feillet, D., & Eggers, J. (2003). Ergonomic Modeling and Optimization of Keyboard Arrangement with an ant colony algorithm. European Journal of Operational Research.