Sept. 2012, Vol. 2 Iss. 9, PP. 30-34
Communications in Information Science and Management Engineering

Influence of Different Learning-Rate Adjustments on the ART2 Clustering Algorithm*

Shujie Du
Computer Center of Ocean University of China
238 Songling Road, Qingdao, Shandong Province, China
[email protected]

Abstract- ART2 recognizes previously learned patterns quickly and adapts rapidly to new objects. It performs hierarchical clustering by competitive learning and a self-stabilizing mechanism, without supervision, in dynamic and noisy environments. This paper first discusses the commonly used learning rules and then proposes a way of adjusting the learning rate. The assimilation effect is verified in a shape-learning trial, and the categorization results obtained under different learning rates are compared. To some extent, the improved algorithm alleviates the pattern-drifting problem.

Keywords- ART2; Assimilation Effect; Data Clustering; Learning Rate; Adaptation Process

I. INTRODUCTION

Adaptive Resonance Theory (ART), proposed by S. Grossberg and G. Carpenter in 1976, describes a self-organizing neural network [1]. When the network interacts with its environment, codes for the environmental information arise spontaneously inside the network, activating the self-organizing system. ART grew out of a competitive network interaction model composed of two cooperating-competing components. The typical adaptive resonance networks are ART1 and ART2: ART1 handles binary input vectors, whereas ART2 handles arbitrary analog input vectors and therefore has much broader applicability.

ART2 is an unsupervised learning neural network proposed in 1987. It is based on a competitive learning mechanism and strongly influenced by models of biological memory, and its memory capacity grows as more patterns are learned. ART2 can learn offline, but it can also be applied while it learns online, so learning and application need not be separated. Unlike many other neural networks, ART2 is capable of fast learning: once a sample vector has been presented, all of its important features are stored in long-term memory. Its most important property is that its memory strikes a balance between stability and plasticity: the weight coefficients remain largely unchanged, keeping the memory system stable, while still adapting to input vectors that change gradually. ART2 is therefore suited to a wide range of stable and dynamic environments, and it is one of the ideal clustering algorithms.

II. THE WORKING PRINCIPLE OF ART2

The structure of ART2 is shown in Fig. 1, and the topology of unit i is shown in Fig. 2. ART2 is composed of two layers: the feature representation field F1 and the category representation field F2. F1 is similar to the comparison layer in ART1; it includes several computing tiers and a gain controller. F2 is similar to the recognition layer in ART1 and is responsible for competitive matching against the current input pattern [2]. Suppose F1 and F2 together contain N neurons, with M in F1 and N−M in F2; their activations form N-dimensional state vectors that represent the network's short-term memory. The bottom-up and top-down connection weights between F1 and F2 constitute an adaptive long-term memory; they are denoted z_ij (F1→F2) and z_ji (F2→F1), respectively.

Fig. 1 ART2 neural network

The M neurons in F1 receive the input pattern X from outside. After feature enhancement and noise suppression in F1, the signals are transferred to F2 through the bottom-up weights z_ij. The N−M neurons in F2 receive the signals from F1 and determine a winner by competition: the winning neuron is activated, while the others are inhibited.
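Before turning to the reset mechanism, the competition just described can be previewed with a minimal sketch (not part of the original paper; `p`, `z_bu`, and the function name are illustrative, and the formal definitions follow in Section III):

```python
import numpy as np

def choose_winner(p, z_bu, inhibited=()):
    """Winner-take-all competition in F2.

    p:         output vector of F1 (length M)
    z_bu:      bottom-up weight matrix z_ij, shape (M, N-M)
    inhibited: indices of F2 nodes already reset during the current search
    """
    T = z_bu.T @ p                  # input to F2 node j: T_j = sum_i p_i * z_ij
    T[list(inhibited)] = -np.inf    # reset nodes stay out of the competition
    return int(np.argmax(T))        # the maximally activated (winning) neuron
```

The `inhibited` set models the search process: nodes rejected by the orientation subsystem (described next) remain excluded until a match is accepted.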
The connection weight vectors attached to the active neuron are then adjusted. The gain controller compares the similarity between the input pattern X and the top-down weight vector of the active neuron in F2. If the similarity is below a threshold, the reset subsystem (Fig. 2) sends a signal to deactivate the winning neuron in F2 and another winner is chosen, until the similarity meets the requirement. A new pattern can always be assigned to some neuron, provided the number of F2 neurons, N−M, exceeds the number of possible input patterns.

III. LEARNING ALGORITHMS OF ART2

A. Mathematical Model of the Feature Representation Field F1

There are M processing units in F1, each composed of three tiers: upper, middle, and lower. Each tier contains two kinds of neurons, drawn in Fig. 2 as empty or solid circles. A neuron drawn as an empty circle receives two kinds of input excitation, compares the two input vectors, and either activates or suppresses the excitation. A neuron drawn as a solid circle computes the modulus of its input vector [3], [4].

Fig. 2 Topological graph of ART2

The bottom and middle tiers of F1 form a closed positive-feedback loop containing two normalizations and one nonlinear transformation. The input equation and normalization of the bottom tier are

z_i = x_i + a·u_i    (1)

q_i = z_i / (e + ‖Z‖)    (2)

The input equation and normalization of the middle tier are

v_i = f(q_i) + b·f(s_i)    (3)

u_i = v_i / (e + ‖V‖)    (4)

The tiny positive real number e in these equations can be neglected compared with ‖V‖ and ‖Z‖. The nonlinear transformation f(x) between the bottom and middle tiers, and between the middle and upper tiers, is usually taken as

f(x) = 0 for 0 ≤ x < θ;  f(x) = x for x ≥ θ    (5)

The middle and upper tiers of F1 also form a closed positive-feedback loop, with the calculations

p_i = u_i + Σ_{j=M+1}^{N} g(y_j)·z_ji    (6)

s_i = p_i / (e + ‖P‖)    (7)

B. Mathematical Model of the Category Representation Field F2

The function of F2 is to determine by competition the maximally activated node, i.e., the node whose weight vector is most similar to the input vector. Suppose the input to node j of F2 is

T_j = Σ_{i=1}^{M} p_i·z_ij,  j = M+1, …, N    (8)

The winner j* in F2 is chosen by

T_j* = max_j T_j,  j = M+1, …, N    (9)

Node j* is the maximally activated node, and the other nodes are inhibited:

g(y_j) = d for j = j*;  g(y_j) = 0 for j ≠ j*    (10)

where d is the top-down (F2→F1) feedback parameter, 0 < d < 1. According to (10), Equation (6) simplifies to

p_i = u_i + d·z_j*i for j = j*;  p_i = u_i for j ≠ j*    (11)
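As an illustration of Subsection A, the settling of F1 described by Equations (1)-(7) can be sketched as follows (a minimal sketch, not the paper's implementation; parameter defaults and the fixed iteration count are assumptions of this illustration):

```python
import numpy as np

def f(x, theta=0.1):
    """Nonlinearity of Eq. (5): zero below the threshold, identity above it."""
    return np.where(x >= theta, x, 0.0)

def f1_settle(x, top_down, a=10.0, b=10.0, e=1e-4, theta=0.1, iters=50):
    """Iterate the F1 feedback loops of Eqs. (1)-(7) until the activities settle.

    x:        external input pattern (length M)
    top_down: the feedback term sum_j g(y_j) * z_ji of Eq. (6);
              all zeros while F2 is silent, d * z_j*i once a winner j* is active
    """
    u = np.zeros_like(x, dtype=float)
    s = np.zeros_like(x, dtype=float)
    for _ in range(iters):
        z = x + a * u                        # Eq. (1): bottom-tier input
        q = z / (e + np.linalg.norm(z))      # Eq. (2): bottom-tier normalization
        v = f(q, theta) + b * f(s, theta)    # Eq. (3): middle-tier input
        u = v / (e + np.linalg.norm(v))      # Eq. (4): middle-tier normalization
        p = u + top_down                     # Eq. (6) with only the winner active
        s = p / (e + np.linalg.norm(p))      # Eq. (7): upper-tier normalization
    return u, p
```

The resulting p can then be fed to the competition of Eqs. (8)-(10), e.g. via the `choose_winner` sketch above.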
C. Weight Regulation Rules

The weights are adjusted according to

dz_ji/dt = g(y_j)·(p_i − z_ji)  (F2→F1)    (12)

dz_ij/dt = g(y_j)·(p_i − z_ij)  (F1→F2)    (13)

Once F2 has determined the competition winner j*, then for j ≠ j*

dz_ji/dt = 0,  dz_ij/dt = 0

while for j = j*, Equations (10)-(13) give

dz_j*i/dt = d·(p_i − z_j*i) = d·(1−d)·(u_i/(1−d) − z_j*i)    (14a)

dz_ij*/dt = d·(p_i − z_ij*) = d·(1−d)·(u_i/(1−d) − z_ij*)    (14b)

After one iteration step of duration ∆t, the adjusted weights given by (14a), (14b) become

z_j*i(t+∆t) = ∆t·d·u_i(t) + (1 − ∆t·d·(1−d))·z_j*i(t)    (15a)

z_ij*(t+∆t) = ∆t·d·u_i(t) + (1 − ∆t·d·(1−d))·z_ij*(t)    (15b)

The weight vectors are initialized as z_ji = 0 and z_ij = 1/((1−d)·√M), for i = 1, 2, …, M and j = M+1, M+2, …, N, with d = 0.9.

The function of the orientation subsystem is to decide, by similarity matching, whether F2 must be reset. The match is defined as

r_i = (u_i + c·p_i) / (e + ‖U‖ + c·‖P‖),  i = 1, 2, …, M    (16)

Set a threshold ρ, 0 < ρ < 1. The output of solid circle A in Fig. 2 represents the modulus of the similarity, ‖R‖; if ‖R‖ < ρ, the orientation subsystem resets F2.

IV. DIFFERENT EFFECTS PRODUCED BY SLOW AND FAST LEARNING RATES

Equations (15a), (15b) give the learning rule for the vectors z_ji and z_ij, which constitute the long-term memory (LTM) of the winning neuron j* in F2. The effective excitation duration of the input pattern X is represented by ∆t, which in effect is the learning rate: it is equivalent to the time for which an input sample is learned in ART2. Some of the literature [5]-[7] sets ∆t = 1 in the learning rules, and there is no formal restriction requiring ∆t < 1; in practice, however, too large a ∆t easily causes oscillation or divergence.

Compared with other neural networks, a significant feature of ART2 is that it is a continuous network [8]. Setting Equations (14a), (14b) equal to zero, which corresponds to t→∞, simplifies them to

z_j*i = u_i/(1−d),  z_ij* = u_i/(1−d)    (17)

This is the fast learning rule of the most common form of ART2 [9]. In the linear range of f, when fast learning is adopted, the LTM vector z_j has the same direction as the middle-tier vector U of F1 produced by the input vector X. The significant feature of fast learning is that the amplitude of the LTM vector reaches 1/(1−d) in a single step, which is called fast commitment [10].

In fast learning there are no adjustable parameters except d (d is generally close to 1, which makes the vector P take z_j*i as its principal component), so no direct means exists to adjust the learning rate. By Fig. 2 and Equations (1), (3), after the winner in F2 returns z_j*i to F1, it contributes to U along two routes, p→s→v→u and u→z→q→v→u. The parameters a and b are therefore critical for adjusting the learning rate in ART2.
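To make the role of ∆t concrete, the discrete update of Equations (15a), (15b) can be sketched as follows (illustrative names and array shapes; not from the paper). Note that choosing ∆t so that ∆t·d·(1−d) = 1 collapses the update to the fast-learning fixed point u_i/(1−d) of Eq. (17) in a single step:

```python
def update_ltm(z_td, z_bu, u, j_star, dt=0.1, d=0.9):
    """Slow-learning LTM update for the winner j*, per Eqs. (15a), (15b).

    z_td:   top-down weights z_ji, shape (N-M, M)
    z_bu:   bottom-up weights z_ij, shape (M, N-M)
    u:      stabilized middle-tier vector of F1
    dt:     effective excitation duration of the input, i.e. the learning rate
    """
    keep = 1.0 - dt * d * (1.0 - d)              # fraction of old memory retained
    z_td[j_star, :] = dt * d * u + keep * z_td[j_star, :]   # Eq. (15a)
    z_bu[:, j_star] = dt * d * u + keep * z_bu[:, j_star]   # Eq. (15b)
```

Small dt blends the new pattern slowly into the old template (slow recoding); dt = 1/(d·(1−d)) makes `keep` vanish and the weights jump straight to u/(1−d), i.e. fast commitment.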
Suppose a robot is supervised to recognize three different shapes: circle (zero angles), triangle, and quadrilateral, and a sensing system has been built with computer-vision techniques. Teacher supervision is adopted during learning, and the weight-adjusting algorithm follows the slow-updating rules of Equations (15a), (15b). Learning proceeds for 20 rounds in each of the cases ∆t=10, ∆t=5, and ∆t=1. The results are shown in Table I, where "-?" means the system raises a question and needs an answer from the teacher, "□→△" means mistaking a quadrilateral for a triangle, and so on.

TABLE I SUPERVISED LEARNING AND TEST RESULTS (a=10, b=10; cf. Eqs. (1), (3))

∆t=10
Round  Cluster Result  Neuron Activated  Succeed  Vigilance
 1     ○-?             1                 √        0.9850
 2     △→○             1                 ×        0.9925
 3     ○               1                 √        0.9826
 4     △→○             1                 ×        0.9913
 5     △→○             1                 ×        0.9956
 6     △-?             2                 √        0.9857
 7     ○               1                 √        0.9758
 8     □→○             1                 ×        0.9879
 9     □→○             1                 ×        0.9940
10     □→△             2                 ×        0.9970
11     □-?             3                 √        0.9870
12     ○               1                 √        0.9771
13     △→○             1                 ×        0.9886
14     △→○             1                 ×        0.9943
15     △               2                 √        0.9843
16     □→△             2                 ×        0.9922
17     □→△             2                 ×        0.9961
18     □→△             2                 ×        0.9980
19     □               3                 √        0.9881
20     ○               1                 √        0.9782

∆t=5
Round  Cluster Result  Neuron Activated  Succeed  Vigilance
 1     ○-?             1                 √        0.9850
 2     △→○             1                 ×        0.9925
 3     ○               1                 √        0.9826
 4     △→○             1                 ×        0.9913
 5     △→○             1                 ×        0.9956
 6     △-?             2                 √        0.9857
 7     ○               1                 √        0.9758
 8     □→○             1                 ×        0.9879
 9     □→△             2                 ×        0.9940
10     □→△             2                 ×        0.9970
11     □→△             2                 ×        0.9985
12     □-?             3                 √        0.9885
13     △→○             1                 ×        0.9943
14     △               2                 √        0.9843
15     □→△             2                 ×        0.9922
16     □→△             2                 ×        0.9961
17     □→△             2                 ×        0.9980
18     □               3                 √        0.9881
19     ○               1                 √        0.9782
20     △→○             1                 ×        0.9891

∆t=1
Round  Cluster Result  Neuron Activated  Succeed  Vigilance
 1     ○-?             1                 √        0.9850
 2     △→○             1                 ×        0.9925
 3     ○               1                 √        0.9826
 4     △→○             1                 ×        0.9913
 5     △→○             1                 ×        0.9956
 6     △→○             1                 ×        0.9978
 7     △→○             1                 ×        0.9989
 8     △-?             2                 √        0.9889
 9     ○               1                 √        0.9790
10     □→○             1                 ×        0.9895
11     □→○             1                 ×        0.9948
12     □→△             2                 ×        0.9974
13     □→△             2                 ×        0.9987
14     □→△             2                 ×        0.9993
15     □→△             2                 ×        0.9997
16     □-?             3                 √        0.9897
17     △→○             1                 ×        0.9948
18     △→○             1                 ×        0.9974
19     △               2                 √        0.9874
20     □→○             1                 ×        0.9937

From Table I we can conclude that the assimilation effect and the adaptation process are distinguishing features of ART2. Take ∆t=10 as an example. In Round 6, once the vigilance ρ has risen to 0.9956, the system no longer mistakes the triangle for the previously learned circle; a new neuron is activated, and ρ drops to 0.9857. At this point the input circle is recognized correctly. In Round 10, with ρ risen to 0.9940, the system becomes aware of the difference between the feature vectors of the quadrilateral and the memorized circle, so it no longer mistakes the quadrilateral for a circle; it still mistakes the quadrilateral for a triangle, however, until ρ reaches the higher value 0.9970, whereupon the third neuron is activated after teacher supervision. In Round 16 the system again mistakes the quadrilateral for a triangle. From Rounds 16 to 18, with ρ rising to 0.9980, the system recognizes the quadrilateral again and, under the slow-learning mechanism, re-analyzes the characteristics of the quadrilateral, adjusts the LTM weights, and settles into a new equilibrium. In the last two rounds, because the LTM weights have by then been adjusted properly, the system still recognizes the shapes correctly even though ρ drops to 0.9782. For ∆t=5 and ∆t=1, misidentifications increase as ∆t decreases: the assimilation effect becomes more pronounced, and the adaptation process persists even when ρ is nearly 1, because the short stimulus duration prevents the LTM from approaching equilibrium.

In practical applications, an appropriate learning rate is important: it eliminates noise effectively and keeps the system stable, and it also removes the spurious influence that different sample input orders exert on the classification results. Experiments show that ultra-fast learning is unsuitable for data with high noise levels [11]. Slow learning, on the other hand, suppresses noise well but produces meaningless search processes and hence slow computation.

An efficient way to balance the learning rate is a model that combines fast commitment with slow recoding [12]. Fast commitment is applied when an uncommitted node is activated; it avoids the meaningless searches over committed nodes that slow learning would produce. Slow recoding is applied when a committed node is activated; it uses rules similar to slow learning in order to cope with noise and with the influence of different sample input orders.

V. COMPARATIVE EXPERIMENTS AND ANALYSIS

To verify how the classification results depend on the weight-modification rate, sample data from [13] are adopted as the research object: isosceles-triangle patterns whose base angle changes gradually. Each of the 89 two-dimensional samples consists of the base length α and the height β. Taking the base angle θ = 1°, 2°, …, 89° and α = 2 gives β = α·tanθ/2 = tanθ, so the coordinate pair (α, β) = (2, tanθ) represents one isosceles-triangle pattern.

These triangle data are tested under different learning rates and input orders (sequential and inverse), with the parameters of the ART2 model set as follows:

a=8, b=8, c=0.15, d=0.9, e=0.0001, θ=0.1, ρ=0.999, z_ji=0.08−0.001·j (j=0, 1, …, M), ∆t=0.1.
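For reference, the 89 samples and the two presentation orders can be generated as in the following sketch (the function and variable names are assumptions of this illustration, not the paper's code):

```python
import numpy as np

def triangle_samples(ascending=True):
    """89 isosceles-triangle patterns (alpha, beta) = (2, tan(theta)),
    theta = 1, ..., 89 degrees; descending order gives the inverse sequence."""
    theta_deg = np.arange(1, 90) if ascending else np.arange(89, 0, -1)
    beta = np.tan(np.deg2rad(theta_deg))    # beta = alpha * tan(theta) / 2 with alpha = 2
    alpha = np.full(theta_deg.shape, 2.0)
    return np.stack([alpha, beta], axis=1)  # shape (89, 2)
```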
When the base angles are input sequentially in ascending order, the comparison of the two algorithms is shown in Table II.

TABLE II COMPARISON OF THE TWO ALGORITHMS IN SEQUENTIAL INPUT ORDER

            Fast Rate (t→∞)        Slow Rate (∆t=0.1)
Category    Range      Scale       Range      Scale
1           1~89       89          1~60       60
2           None       0           61~80      20
3           None       0           81~89      9

When the base angles are input inversely, in descending order, the comparison is shown in Table III.

TABLE III COMPARISON OF THE TWO ALGORITHMS IN INVERSE INPUT ORDER

            Fast Rate (t→∞)        Slow Rate (∆t=0.1)
Category    Range      Scale       Range      Scale
1           89~1       89          89~81      9
2           None       0           80~61      20
3           None       0           60~1       60

As the tables show, ART2 with a slow learning rate can follow a process in which patterns change gradually; this improved sensitivity to gradual change yields better classification results and better robustness [14], [15].
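Combining the earlier sketches, the comparison behind Tables II and III could be driven by a loop of roughly the following shape; `art2_classify` stands for a complete settle-compete-check-update cycle assembled from the fragments above and is an assumption of this illustration, not code given by the paper:

```python
from collections import Counter

def run_input_order_experiment(art2_classify, make_samples):
    """Feed the triangle patterns in ascending and descending order and count
    how many samples each committed category receives (cf. Tables II, III)."""
    for ascending in (True, False):
        labels = [art2_classify(x) for x in make_samples(ascending)]
        order = "sequential" if ascending else "inverse"
        print(order, dict(Counter(labels)))
```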
VI. CONCLUSION

Compared with the traditional fast learning rate, modifying the weights with a slower learning rate demonstrably reduces the speed of pattern drifting. It should be noted that the pattern drifting exhibited by ART2 in classification applications is an essential feature, not a shortcoming; it is merely unsuitable for certain kinds of samples. In other fields, e.g., face recognition [16], the traditional ART2 model may be the more suitable choice, since a person growing older fits exactly the situation in which one pattern changes gradually over time.

ACKNOWLEDGMENT

This work is supported by the National Natural Science Foundation of China (Nos. 40176014, 40067013). We are grateful to Prof. Liu Z. S. of Ocean University of China for comments and suggestions, and we wish to thank Dr. Chen Z., who is well versed in the ART2 algorithm. We also express our appreciation to the Remote Sensing Institute of Ocean University of China for powerful analytical support.

REFERENCES

[1] Carpenter G. A., and Grossberg S., "ART2: Self-organization of stable category recognition codes for analog input patterns," Applied Optics, vol. 26, pp. 4919-4930, 1987.
[2] Carpenter G. A., and Grossberg S., "ART2-A: An adaptive resonance algorithm for rapid category learning and recognition," Neural Networks, vol. 4, pp. 493-504, Apr. 1991.
[3] Frank T., Kraiss K. F., and Kuhlen T., "Comparative analysis of fuzzy ART and ART2-A network clustering performance," IEEE Trans. on Neural Networks, vol. 9, pp. 544-549, Mar. 1998.
[4] Shi D., Ong Y. S., and Tan E. C., "Handwritten Chinese character recognition using kernel active handwriting model," Systems, Man and Cybernetics, pp. 251-255, Jan. 2009.
[5] Davenport M. P., and Titus A., "Multilevel category structure in the ART2 network," Neural Networks, vol. 15, pp. 145-158, Jan. 2004.
[6] Klotz G. A., and Stacey D. A., "ART2 based classification of sparse high dimensional parameter sets for a simulation parameter selection assistant," Neural Networks, vol. 31, pp. 1081-1085, Feb. 2005.
[7] Alahakoon D., Halgamuge S. K., and Srinivasan B., "Dynamic self-organizing maps with controlled growth for knowledge discovery," Neural Networks, vol. 3, pp. 601-614, Nov. 2000.
[8] Martin T., et al., Neural Network Design, PWS Pub. Co., 2006.
[9] Ardavan A., and Seyed S. M., "Application of modified ART2 artificial neural network in classification of structural members," 15th ASCE Engineering Mechanics Conference, Columbia University, New York, Jun. 2-5, 2002.
[10] Seungdoo P., John M. V., and Raymond J. G., "Direct oxidation of hydrocarbons in a solid-oxide fuel cell," Nature, vol. 404, pp. 265-267, 2004.
[11] Chen Z. G., and Chen D. Z., "Integrated strategy of pattern classification and its application," Journal of Zhejiang University: Engineering Science, vol. 36, pp. 601-602, Jun. 2010.
[12] Cao Y. Q., and Wu J. H., "Projective ART for clustering data sets in high dimensional spaces," Neural Networks, vol. 15, pp. 105-120, 2002.
[13] Liu L., Hu B., and Shi L. F., "Systematic review of ART2 neural network," Journal of Central South University, vol. 8, pp. 21-26, Aug. 2007.
[14] Pham D. T., and Sukkar M. F., "A predictor based on adaptive resonance theory," Artificial Intelligence in Engineering, vol. 12, pp. 219-228, Dec. 2009.
[15] Houshmand G. P. B., "An efficient neural classification chain of SAR and optical urban images," International Journal of Remote Sensing, vol. 22, pp. 1535-1553, Aug. 2001.
[16] Pham D. T., and Chan A. B., "Unsupervised adaptive resonance theory neural networks for control chart pattern recognition," Proceedings of the Institution of Mechanical Engineers, vol. 215, pp. 59-67, Jan. 2001.