THE TOROIDAL NEURAL NETWORKS

Moreno Coli(1), Paolo Palazzari(2), Rodolfo Rughi(1)
(1) University "La Sapienza" - Electronic Engineering Department - Via Eudossiana, 18 - 00184 Rome
(2) ENEA - HPCN Project - C.R. Casaccia - Via Anguillarese, 301 - 00060 S. Maria di Galeria (Rome)
Tel +39-06-3048 3167  Fax +39-06-3048 4230  E-mail [email protected]

ABSTRACT

In this paper we present the Toroidal Neural Networks (TNN), a new class of neural networks derived from Discrete Time Cellular Neural Networks (DT-CNN). TNN are characterized by a 2D toroidal topology with local connections, by binary outputs and by a simple equation describing the dynamics of the neuron states; the binary outputs are obtained by comparing initial and final states. Due to the expression of the state dynamics, TNN learning has a very appealing geometric interpretation: a transformation, specified by means of a training input sequence, is represented through a polyhedron in the TNN weight space. Along with the definition and theory of TNN, we present a learning algorithm which, for a given transformation expressed by means of a training sequence, gives the set of TNN weights (if it exists) which exactly implements the transformation: such a set of weights is a point belonging to the polyhedron representing the training sequence. Furthermore, the algorithm gives the exact minimal spatial locality characterizing the problem; in order to reduce the number of TNN weights, a heuristic is used to try to move neuron connectivity from the spatial to the temporal dimension.

1. INTRODUCTION

This work introduces the Toroidal Neural Networks (TNN), a new type of neural network dedicated to image processing and inspired by Cellular Neural Networks (CNN [1-3]). TNN have a 2D topology and are closely related to Discrete Time CNN (DT-CNN [4-5]); TNN use only a few (typically 1 to 3) time steps of the discrete state evolution and their binary output is determined through the sign of the difference between the initial and the final states. TNN are characterized by a very fast achievement of the final output and by the existence of a deterministic learning algorithm which computes (if it exists) the set of connection weights that exactly implements the transformation specified through a sequence of input-output images (training sequence).

The deterministic learning algorithm is called the Polyhedral Intersection Learning Algorithm (PILA). PILA is based on a geometric interpretation of the learning process in the space of the TNN weights: searching for a set of weights implementing the transformation specified by a training sequence corresponds to intersecting a set of polyhedra (one for each I/O pair of the training sequence); each polyhedron represents the set of all the TNN implementing the transformation specified by an I/O pair. PILA is a deterministic algorithm, avoiding the problems typical of heuristic CNN learning [6-7]. PILA has a two-phase behavior: in the first phase the TNN which implements the transformation in one time step and with minimal connectivity is found; in the second phase PILA tries to minimize the spatial connectivity of the network, moving (if possible) neuron connectivity from the spatial to the temporal dimension.

The paper is structured as follows: in the next section we give the geometrical and mathematical foundations underlying TNN theory; in section 3 we give the basic definitions for TNN together with their state evolution and output equations; finally, in section 4, we present the Polyhedral Intersection Learning Algorithm.
Examples of applications of TNN to image processing can be found in [8].

2. MATHEMATICAL FOUNDATIONS

2.1 POLYHEDRA

We consider the N-dimensional rational space Q^N. Given an N-dimensional row vector w, an affine half-space is the set HS = {x | x ∈ Q^N, wx ≤ α}, α ∈ Q. A polyhedron P is the intersection of finitely many half-spaces [9], i.e.

(1)   P = {x | x ∈ Q^N, Ax ≤ b}

where A^T = (w_1^T … w_k^T) is a matrix composed of k row vectors w_i and b^T = (α_1 … α_k) is a k-dimensional constant vector. P can also be represented as a linear combination (vector λ) of lines (columns of matrix L), a convex combination (vector ν) of vertices (columns of matrix V) and a positive combination (vector μ) of extreme rays (columns of matrix R), through the Minkowski characterization [10]

(2)   P = {x | x = Lλ + Rμ + Vν,  μ ≥ 0,  ν ≥ 0,  Σ_i ν_i = 1}

Intersection of polyhedra, transformation between the implicit form (1) and the Minkowski form (2) and other operations on polyhedra are implemented by using the polyhedral library [10].

2.2 CIRCULATING MATRICES

Given a row vector a = (a_1, a_2, …, a_n), a_i ∈ Q, the (n × n) scalar right circulating matrix is defined as

R_S(a) = [ a_1   a_2   …   a_n
           a_n   a_1   …   a_{n-1}
           …
           a_2   a_3   …   a_1 ]

where R_S(·) is an operator which receives an n-dimensional row vector a and returns a matrix which has a as its first row; the i-th row is computed by rotating the (i-1)-th row one step toward the right (i = 2, 3, …, n). The element in position (i,j) of R_S(a) is given by

(3)   (R_a)_{i,j} = a_{(j-i)+1}      if i ≤ j
      (R_a)_{i,j} = a_{n-(i-j)+1}    if i > j

In [5] we demonstrated the following

Theorem 1: Given two scalar right circulating matrices Ra = R_S(a) and Rb = R_S(b) derived from two n-dimensional vectors a and b, the matrix Rc = Ra·Rb is still a scalar right circulating matrix, i.e. Rc = R_S(c).

Now we demonstrate that

Theorem 2: Rc = Ra + Rb = R_S(a) + R_S(b) is a scalar right circulating matrix, i.e. Rc = R_S(c) = R_S(a+b).

Proof: (Rc)_{i,j} = (Ra)_{i,j} + (Rb)_{i,j}; on the basis of (3) this equals a_{(j-i)+1} + b_{(j-i)+1} when i ≤ j and a_{n-(i-j)+1} + b_{n-(i-j)+1} when i > j, so Rc = R_S(c) = R_S(a+b). QED

A block right circulating matrix is similar to a scalar right circulating matrix, but its entries are circulating matrices instead of scalar values; for example, let us consider the four scalar circulating matrices A_i = R_S(a_i) = R_S(a_{i,1}, a_{i,2}, a_{i,3}, a_{i,4}), i = 1, …, 4. The block right circulating matrix is defined as

R̃(A) = R̃(A_1, A_2, A_3, A_4) = [ A_1  A_2  A_3  A_4
                                   A_4  A_1  A_2  A_3
                                   A_3  A_4  A_1  A_2
                                   A_2  A_3  A_4  A_1 ]

where R̃(·) is an operator which receives an n-dimensional block row vector A = (R_S(a_1), …, R_S(a_n)) and returns a matrix which has A as its first block row; the i-th block row is computed by rotating the (i-1)-th block row one step toward the right (i = 2, 3, …, n). If the vectors a_i (i = 1, 2, …, n) have length M, R̃(A) is an (Mn × Mn) matrix. Given two block vectors A = (A_1 = R_S(a_1), …, A_M = R_S(a_M)) and B = (B_1 = R_S(b_1), …, B_M = R_S(b_M)), the following theorem subsists:

Theorem 3: the product of two block right circulating matrices is still a block right circulating matrix, i.e.

(4)   Rc = R̃(A)·R̃(B) = R̃(C)

Proof: Writing out the product R̃(A)·R̃(B) block by block and, from theorem 1, considering the block entries as scalar values, the resulting matrix has the form of a right circulating matrix, whose block entries we call C_1, C_2, …, C_M; in order for it to be a block right circulating matrix, its elements must be scalar right circulating matrices. Each element C_i (i = 1, 2, …, M) is a summation of products of scalar right circulating matrices, i.e.

C_i = Σ_{k,q} A_k·B_q

From theorem 1, each term of the summation is a scalar right circulating matrix and, from theorem 2, the summation of scalar right circulating matrices is a scalar right circulating matrix; so each C_i (i = 1, 2, …, M) is a scalar right circulating matrix and, consequently, Rc = R̃(C). QED
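As a small numerical illustration of the R_S(·) operator and of Theorems 1 and 2, the following Python/numpy sketch (our own code, not taken from the paper; the function names RS and is_right_circulating are ours) builds two scalar right circulating matrices and checks that their product and their sum are again right circulating.

```python
import numpy as np

def RS(a):
    """Scalar right circulating matrix whose first row is a; row i is
    row i-1 rotated one step toward the right, as in eq. (3)."""
    a = np.asarray(a)
    return np.array([np.roll(a, i) for i in range(a.size)])

def is_right_circulating(m):
    """True if every row of m is the previous row rotated right by one step."""
    return all(np.array_equal(m[i], np.roll(m[i - 1], 1)) for i in range(1, len(m)))

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
Ra, Rb = RS(a), RS(b)
assert is_right_circulating(Ra @ Rb)          # Theorem 1
assert is_right_circulating(Ra + Rb)          # Theorem 2
assert np.array_equal(Ra + Rb, RS(a + b))     # Rc = RS(a + b)
```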
Corollary to theorem 3: the m-th power of a block right circulating matrix is a block right circulating matrix, i.e. Rc = R̃(A)^m = R̃(C). The proof derives directly from theorem 3 written for B = A and iterated.

3. TNN

In this section we formally introduce the TNN, defining their evolution and output equations.

3.1 TNN: DEFINITION

TNN are characterized by a bidimensional toroidal topology, i.e. neurons are defined over an (n_1 × n_2) grid G with connections between corresponding neurons on opposite borders. Given two points p_i = (x_{i1}, x_{i2}) (i = 1, 2), we define the distance between p_1 and p_2 as

D(p_1, p_2) = max_{i=1,2} min( |x_{1i} - x_{2i}|, n_i - |x_{1i} - x_{2i}| )

On a TNN, the neighborhood with radius r of neuron p_i is defined as N_r(p_i) = {p | p ∈ G, D(p_i, p) ≤ r}; where there is no ambiguity, we will indicate neuron p_i through its coordinates (x_{i1}, x_{i2}). The neuron with coordinates (i,j) is connected to neuron (k,l) if (k,l) belongs to the neighborhood of (i,j). The weight connecting the two neurons is t_{(i,j)(k,l)}; the set of weights connecting a neuron with its neighborhood is the cloning template CT, CT = {t_{(i,j)(k,l)} | (k,l) ∈ N_r(i,j)}. CT is the same for all the neurons (i.e. it is spatially invariant) and determines the elaboration executed by a TNN. In fact, indicating with s_{i,j}(n) the state of neuron (i,j) at the discrete time instant n, the successive state is linearly given by the following expression:

(5)   s_{i,j}(n+1) = Σ_{(k,l) ∈ N_r(i,j)} t_{(i,j)(k,l)} · s_{k,l}(n)

The output y of a TNN is assigned on the basis of the following rule:

(6)   y_{i,j}(n+1) = +1   if s_{i,j}(n+1) ≥ s_{i,j}(0)
      y_{i,j}(n+1) = -1   if s_{i,j}(n+1) < s_{i,j}(0)

3.2 TNN: EVOLUTION

A cloning template with radius r is expressed through the (2r+1) × (2r+1) weight matrix t:

(7)   t = [ t_{-r,-r}  …  t_{-r,0}  …  t_{-r,r}
            …
            t_{0,-r}   …  t_{0,0}   …  t_{0,r}
            …
            t_{r,-r}   …  t_{r,0}   …  t_{r,r} ]

As an introductory example, let us consider a 4×4 TNN with a cloning template with radius r = 1:

t = [ t_{-1,-1}  t_{-1,0}  t_{-1,1}       [ t_0  t_1  t_2
      t_{0,-1}   t_{0,0}   t_{0,1}    =     t_3  t_4  t_5
      t_{1,-1}   t_{1,0}   t_{1,1}  ]       t_6  t_7  t_8 ]

The TNN is 4×4, so we consider a 4×4 block right circulating matrix RP, having 4×4 blocks as entries, i.e.

(8)   RP = [ Rp_1  Rp_2  Rp_3  Rp_4
             Rp_4  Rp_1  Rp_2  Rp_3
             Rp_3  Rp_4  Rp_1  Rp_2
             Rp_2  Rp_3  Rp_4  Rp_1 ]

The block entries Rp_i (i = 1, …, 4) are 4×4 scalar right circulating matrices, defined through the rows of the weight matrix t:

Rp_1 = R_S(t_4, t_5, 0, t_3),  Rp_2 = R_S(t_7, t_8, 0, t_6),  Rp_3 = R_S(0, 0, 0, 0),  Rp_4 = R_S(t_1, t_2, 0, t_0)

Rp_1, Rp_2, Rp_4 are the scalar right circulating matrices associated to the three rows of the cloning template (t_3 t_4 t_5), (t_6 t_7 t_8), (t_0 t_1 t_2), obtained by inserting a zero in the 3rd position; Rp_3 is the null 4×4 matrix. The TNN state at (discrete) time n can be represented through the column vector

s(n) = ( s_1(n); s_2(n); s_3(n); s_4(n) )   where   s_i(n) = ( s_{i,1}(n), s_{i,2}(n), s_{i,3}(n), s_{i,4}(n) )^T,  i = 1, 2, 3, 4;

so we can think of the bi-dimensional state of an M×M TNN as an M²-entry scalar column vector, obtained by row-wise arranging the state matrix into a column vector. From the previous definitions, the evolution of the TNN state from time n to (n+1) can be compactly written as:

(9)   [ s_1(n+1)       [ Rp_1  Rp_2  Rp_3  Rp_4       [ s_1(n)
        s_2(n+1)    =    Rp_4  Rp_1  Rp_2  Rp_3    ·    s_2(n)
        s_3(n+1)         Rp_3  Rp_4  Rp_1  Rp_2         s_3(n)
        s_4(n+1) ]       Rp_2  Rp_3  Rp_4  Rp_1 ]       s_4(n) ]
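The sketch below (our own Python/numpy illustration; the helper names RS, build_RP and one_step are hypothetical, not from the paper) assembles the block right circulating matrix (8) for a 4×4 TNN with a radius-1 template, verifies that one application of the matrix form (9) coincides with the toroidal neighborhood sum of eq. (5), and then iterates it m times before applying the output rule (6).

```python
import numpy as np

def RS(a):
    """Scalar right circulating matrix with first row a (Section 2.2)."""
    a = np.asarray(a, dtype=float)
    return np.array([np.roll(a, i) for i in range(a.size)])

def build_RP(t, M):
    """Block right circulating matrix of eq. (8) for an MxM TNN, r = 1.
    t is the 3x3 cloning template of eq. (7), rows ordered t[-1,.], t[0,.], t[1,.]."""
    def row(k):
        # first row of the circulant block built from template row k:
        # (centre, right, zero padding, left), as in Rp_1 = RS(t4, t5, 0, t3)
        return np.concatenate(([t[k, 1], t[k, 2]], np.zeros(M - 3), [t[k, 0]]))
    Rp = [RS(row(1)), RS(row(2))] + [np.zeros((M, M))] * (M - 3) + [RS(row(0))]
    return np.block([[Rp[(k - i) % M] for k in range(M)] for i in range(M)])

def one_step(s, t):
    """Direct evaluation of the neighborhood sum (5) on the torus."""
    out = np.zeros_like(s)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            out += t[dr + 1, dc + 1] * np.roll(np.roll(s, -dr, axis=0), -dc, axis=1)
    return out

M = 4
rng = np.random.default_rng(0)
t = rng.standard_normal((3, 3))                 # arbitrary radius-1 template
s0 = rng.choice([-1.0, 1.0], size=(M, M))       # initial state = input image
RP = build_RP(t, M)
assert np.allclose(RP @ s0.ravel(), one_step(s0, t).ravel())

m = 2                                           # m iterated evolution steps
sm = (np.linalg.matrix_power(RP, m) @ s0.ravel()).reshape(M, M)
y = np.where(sm >= s0, 1, -1)                   # binary output, rule (6)
```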
In the general case of an M×M TNN, we indicate with Rp_i the M×M scalar right circulating matrix associated to the (i-r+1)-th row of the radius-r cloning template t (7), extended through the insertion of zeroes into positions r+2, …, M-r. For the pairs (i,k) of values related through the expressions

i = 1, …, r+1       and  k = i - 1
i = M-r+1, …, M     and  k = i - M - 1

Rp_i is given by

(10)  Rp_i = R_S( t_{k,0}, t_{k,1}, …, t_{k,r}, 0, …, 0, t_{k,-r}, …, t_{k,-1} )

while Rp_i is the null M×M matrix when r+2 ≤ i ≤ M-r. The state evolution of an M×M TNN with a radius-r cloning template can be written as:

(11)  [ s_1(n+1)       [ Rp_1  Rp_2  …  Rp_M          [ s_1(n)
        s_2(n+1)    =    Rp_M  Rp_1  …  Rp_{M-1}   ·    s_2(n)
        …                …                              …
        s_M(n+1) ]       Rp_2  Rp_3  …  Rp_1 ]          s_M(n) ]

With the obvious extension to matrix notation, we can write one step of evolution of the TNN state as:

(12)  s(n+1) = RP · s(n)

where s(·) is an M²×1 vector and RP is an M²×M² block right circulating matrix. Because of the associativity of the matrix product, the evolution of the TNN state for m instants is compactly given by:

(13)  s(m) = RP^m · s(0)

RP is a block right circulating matrix so, from the corollary to theorem 3, RP^m is a block right circulating matrix.

4. POLYHEDRAL INTERSECTION LEARNING ALGORITHM

In order to train a TNN, we consider a set S of k pairs of input-output (M×M) images describing the desired elaboration, S = {<I_i, O_i>, i = 1, 2, …, k}. s_{x,y}(0) is set to the value of pixel I_i(x,y) and the desired output after m steps is y_{x,y}(m) = O_i(x,y) (1 ≤ x,y ≤ M). From equations (6) and (13), y_{x,y}(m) = O_i(x,y) iff s_{x,y}(m) ⋈_{x,y,O_i} s_{x,y}(0), where

(14)  ⋈_{x,y,O_i} = ≥   if O_i(x,y) = 1
      ⋈_{x,y,O_i} = <   if O_i(x,y) = -1

s_{x,y}(m) ⋈_{x,y,O_i} s_{x,y}(0) means that the final state must be greater than or equal to (smaller than) s_{x,y}(0) when the (x,y) pixel of the output image O_i is equal to 1 (-1). Given a pair of images <I,O>, the TNN which transforms I into O in m time steps must satisfy the following set of inequalities:

(15)  s_{(x+My)}(m) = Σ_{k=1..M²} (RP^m)_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0),   x = 1, …, M;  y = 1, …, M

In (15) we indicate with s_{(x+My)}(l) the state of the neuron in position (x,y) at time instant l; such a notation is due to the representation of (M × M) images as (M² × 1) column vectors. We are now interested in the TNN which transforms I into O in 1 time step. The TNN must satisfy the set of inequalities

(16)  s_{(x+My)}(1) = Σ_{k=1..M²} RP_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0),   x = 1, …, M;  y = 1, …, M

As the elements of RP_{(x+My),k} are either zeroes or elements of the cloning template (eq. (10)),

(17)  Σ_{k=1..M²} RP_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0)

represents a half-space in the space having the TNN weights as coordinate axes. When we consider the M² inequalities (16), we obtain the polyhedron P originated by the intersection of the M² half-spaces. If P is not empty, each point T ∈ P is a cloning template of a TNN transforming I into O in 1 step. If we compute the polyhedron P_i for each pair <I_i, O_i> of the training set S, the intersection of all these polyhedra is a new polyhedron FP; if FP is not empty, each point T ∈ FP is the cloning template of a TNN implementing in 1 step all the transformations included in the training set S. We can a priori adopt a cloning template with radius r = 1; if FP is empty, r = 1 is not sufficient to solve the problem. r is then increased until a value r_min yielding a non-empty FP is reached; r_min is the locality of the transformation described through S. The two systems of inequalities (15) and (16) are very similar: they have the same known terms and the unknowns grouped in a block right circulating matrix. A solution of (16) is also a solution of (15) in the case m = 1.
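As a hedged illustration of how the half-spaces (16)/(17) can be assembled and the non-emptiness of their intersection tested, the sketch below (our own Python code; scipy's linear-programming solver is only a stand-in for the exact polyhedral library [10] used by PILA) writes the M² inequalities of each training pair as rows of a system A·t ≤ b in the (2r+1)² template weights; the small margin eps that approximates the strict inequality of (14) is our assumption, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog

def pair_polyhedron(I, O, r=1, eps=1e-6):
    """Inequality system A @ t <= b over the (2r+1)^2 unknown template weights,
    one row per pixel, encoding eq. (16)/(17) for the pair <I, O>.
    Images are assumed to take values in {-1, +1}."""
    M = I.shape[0]
    A, b = [], []
    for x in range(M):
        for y in range(M):
            # neighbourhood of (x, y) on the torus, flattened in the
            # row-wise order of the cloning template (7)
            nbhd = np.array([I[(x + dr) % M, (y + dc) % M]
                             for dr in range(-r, r + 1)
                             for dc in range(-r, r + 1)], dtype=float)
            if O[x, y] == 1:              # t . nbhd >= I[x, y]
                A.append(-nbhd); b.append(-float(I[x, y]))
            else:                         # t . nbhd <  I[x, y]  (margin eps)
                A.append(nbhd);  b.append(float(I[x, y]) - eps)
    return np.array(A), np.array(b)

def feasible(A, b):
    """Non-emptiness of {t | A t <= b}, tested through a trivial LP."""
    res = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=b,
                  bounds=[(None, None)] * A.shape[1], method="highs")
    return res.success

def training_set_feasible(S, r=1):
    """Stack the per-pair systems (i.e. intersect the polyhedra P_i) and test
    whether the resulting FP is non-empty for the given template radius."""
    systems = [pair_polyhedron(I, O, r) for I, O in S]
    A = np.vstack([Ai for Ai, _ in systems])
    b = np.concatenate([bi for _, bi in systems])
    return feasible(A, b)
```

A feasibility test of this kind only certifies that FP is non-empty (the LP solver can also return one witness point, itself a valid cloning template), whereas PILA manipulates the polyhedra in the forms (1)/(2) explicitly through the library of [10].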
If the locality of the problem is greater than one, we can try to move the Degree of Connectivity (DoC) of the TNN from space to time, DoC being the distance between a neuron and the farthest neuron influencing its state, i.e. DoC = r·m. Unfortunately, solving (15) for m > 1 leads to a nonlinear problem (the unknowns appear with degree m). In order to find a solution, if one exists, we use the heuristic approach described in [8], which defines the function CT = SA_heuristic(r, m, S); CT is the cloning template, found through a Simulated Annealing algorithm [11], giving the TNN which performs (nearly) the best transformation of the input images I into the output images O belonging to S, using m time steps and a CT with radius r. Moreover, we assume a procedure Factorization(DoC) which returns an ordered list F containing the pairs of integers (if any) which factorize DoC; F(i) = <r_i, m_i> is the i-th pair of integers factorizing DoC, with r_i < r_{i+1}.

TNN Polyhedral Intersection Learning Algorithm
Input:  training sequence S = {<I_i, O_i>, i = 1, 2, …, k}
Output: CT implementing the transformation S, or {not possible}
begin
  r = 1; fine = false; solution = true;
  while not fine
    FP = whole weight space;
    for i = 1 to k
      compute P_i by eq. (16);
      FP = FP ∩ P_i;
    if FP ≠ ∅ then
      DoC = r; fine = true
    else
      r = r + 1;
      if (2r+1) > Image_Size then fine = true; solution = false;
  endwhile
  if not solution then
    return {not possible}
  else
    {each point T ∈ FP is the CT, with radius r = DoC, of a TNN which implements the transformation S in one time step}
    if DoC = r > 1 then
      F = Factorization(DoC); i = 1; exit = false;
      repeat   {try to move connectivity from space to time}
        CT = SA_heuristic(r_i, m_i, S);
        if CT satisfies the requirements then exit = true; return CT
        else i = i + 1
      until exit
end.

As r·m = r_min is a necessary but not sufficient condition to move neuron connectivity from space to time, the heuristic process may not be able to find a CT with radius r < r_min.
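For completeness, a minimal sketch of the Factorization(DoC) helper assumed above (our own rendering; excluding the trivial pair r = DoC, m = 1, which the first phase already covers, is our assumption):

```python
def factorization(doc):
    """Ordered list of <r, m> pairs with r * m == doc and r < doc,
    sorted by increasing radius (r_i < r_{i+1})."""
    return sorted((r, doc // r) for r in range(1, doc) if doc % r == 0)

# e.g. factorization(4) -> [(1, 4), (2, 2)]: a one-step, radius-4 template may
# alternatively be sought as a radius-1 template iterated 4 times, or as a
# radius-2 template iterated twice.
```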
5. CONCLUSIONS

In this work we presented the theoretical description of the Toroidal Neural Networks (TNN), giving an exact learning algorithm, the Polyhedral Intersection Learning Algorithm (PILA), which is based on a geometrical interpretation of the learning process. PILA determines the minimal locality of the problem, corresponding to the radius of the Cloning Template (CT) of the TNN which solves in one time step the problem coded through the training set; PILA also implements a heuristic that, when possible, reduces the connectivity of the CT by using more than one simulation step, i.e. connectivity is moved from the spatial to the temporal dimension.

6. REFERENCES

[1] L.O. Chua, L. Yang, "Cellular Neural Networks: Theory", IEEE Trans. on CAS, vol. 35, 1988.
[2] L.O. Chua, L. Yang, "Cellular Neural Networks: Applications", IEEE Trans. on CAS, vol. 35, 1988.
[3] T. Roska, L.O. Chua, "The CNN Universal Machine: an analogic array computer", IEEE Trans. on CAS-II, March 1993.
[4] H. Harrer, J.A. Nossek, "Discrete-time Cellular Neural Networks", Int. J. Circuit Theory & Appl., vol. 20, Sept. 1992.
[5] M. Coli, P. Palazzari, R. Rughi, "Design of dynamic evolution of discrete-time continuous-output CNN", Int. Conf. on Artificial Neural Networks, ICANN '95, Paris.
[6] T. Kozek, T. Roska, L.O. Chua, "Genetic Algorithm for CNN Template Learning", IEEE Trans. on CAS, vol. 40, no. 6, 1993.
[7] M. Balsi, "Recurrent back-propagation for CNN", ECCTD '93, Elsevier Science Publ., Amsterdam, 1993.
[8] M. Coli, P. Palazzari, R. Rughi, "Non Linear Image Processing through Sequences of Fast Cellular Neural Networks (FCNN)", Proc. of NSIP '99, Antalya, Turkey.
[9] A. Schrijver, "Theory of Linear and Integer Programming", John Wiley & Sons Ltd, 1986.
[10] D.K. Wilde, "A Library for Doing Polyhedral Operations", IRISA Research Report no. 785.
[11] S. Kirkpatrick et al., "Optimization by Simulated Annealing", Science, vol. 220, no. 4598, May 1983.