THE TOROIDAL NEURAL NETWORKS

Moreno Coli(1), Paolo Palazzari(2), Rodolfo Rughi(1)
(1) University "La Sapienza" - Electronic Engineering Department - Via Eudossiana, 18 - 00184 Rome
(2) ENEA - HPCN Project - C.R. Casaccia - Via Anguillarese, 301 - 00060 S. Maria di Galeria (Rome)
Tel +39-06-3048 3167  Fax +39-06-3048 4230  E-mail [email protected]

ABSTRACT

In this paper we present the Toroidal Neural Networks (TNN), a new class of neural networks derived from Discrete Time Cellular Neural Networks (DT-CNN). TNN are characterized by a 2D toroidal topology with local connections, by binary outputs and by a simple equation describing the dynamics of the neuron states; the binary outputs are obtained by comparing initial and final states. Due to the expression of the state dynamics, TNN learning has a very appealing geometric interpretation: a transformation, specified by means of a training input sequence, is represented through a polyhedron in the TNN weight space. Along with the definition and theory of TNN, we present a learning algorithm which, for a given transformation expressed by means of a training sequence, gives the set of TNN weights (if it exists) which exactly implements the transformation: such a set of weights is a point belonging to the polyhedron representing the training sequence. Furthermore, the algorithm gives the exact minimal spatial locality characterizing the problem; in order to reduce the number of TNN weights, a heuristic is used to try to move neuron connectivity from the spatial to the temporal dimension.

1. INTRODUCTION

This work introduces the Toroidal Neural Networks (TNN), a new type of neural network dedicated to image processing and inspired by Cellular Neural Networks (CNN [1-3]). TNN have a 2D topology and are closely related to Discrete Time CNN (DT-CNN [4-5]); TNN use only a few (typically 1 to 3) time steps of the discrete state evolution and their binary output is determined through the sign of the difference between the initial and the final states. TNN are characterized by a very fast achievement of the final output and by the existence of a deterministic learning algorithm which computes (if it exists) the set of connection weights that exactly implements the transformation specified through a sequence of input-output images (training sequence).

The deterministic learning algorithm is called the Polyhedral Intersection Learning Algorithm (PILA). PILA is based on a geometric interpretation of the learning process in the space of the TNN weights: searching for a set of weights implementing the transformation specified by a training sequence corresponds to intersecting a set of polyhedra (one for each I/O pair of the training sequence); each polyhedron represents the set of all the TNN implementing the transformation specified by an I/O pair. PILA is a deterministic algorithm, avoiding the problems typical of heuristic CNN learning [6-7]. PILA has a two-phase behavior: in the first phase the TNN which implements the transformation in one time step and with minimal connectivity is found; in the second phase PILA tries to minimize the spatial connectivity of the network, moving (if possible) neuron connectivity from the spatial to the temporal dimension.

The paper is structured as follows: in the next section we give the geometrical and mathematical foundations underlying TNN theory; in section 3 we give the basic definitions for TNN together with their state evolution and output equations; finally, in section 4, we present the Polyhedral Intersection Learning Algorithm.
Examples of applications of TNN to image processing can be found in [8].

2. MATHEMATICAL FOUNDATIONS

2.1 POLYHEDRA

We consider the N-dimensional rational space Q^N. Given an N-dimensional row vector w, an affine half-space is the set HS = {x | x ∈ Q^N, wx ≤ α}, α ∈ Q. A polyhedron P is the intersection of finitely many half-spaces [9], i.e.

(1)   P = {x | x ∈ Q^N, Ax ≤ b}

where A^T = (w_1^T … w_k^T) is a matrix composed of k row vectors w_i and b^T = (α_1 … α_k) is a k-dimensional constant vector. P can also be represented as a linear combination (vector λ) of lines (columns of matrix L), a convex combination (vector ν) of vertices (columns of matrix V) and a positive combination (vector μ) of extreme rays (columns of matrix R), through the Minkowski characterization [10]

(2)   P = {x | x = Lλ + Rμ + Vν,  μ ≥ 0,  ν ≥ 0,  Σ_i ν_i = 1}

Intersection of polyhedra, transformation between the implicit form (1) and the Minkowski form (2) and other operations on polyhedra are implemented by using the polyhedral library [10].

2.2 CIRCULATING MATRICES

Given a row vector a = (a_1, a_2, …, a_n), a_i ∈ Q, the (n × n) scalar right circulating matrix is defined as

R_S(a) = [ a_1   a_2   …   a_n
           a_n   a_1   …   a_{n-1}
           …
           a_2   a_3   …   a_1 ]

where R_S(·) is an operator which receives an n-dimensional row vector a and returns a matrix which has a as its first row; the i-th row is computed by rotating the (i-1)-th row one step toward the right (i = 2, 3, …, n). The element in position (i,j) of R_S(a) is given by

(3)   (R_a)_{i,j} = a_{(j-i)+1}      if i ≤ j
      (R_a)_{i,j} = a_{n-(i-j)+1}    if i > j

In [5] we demonstrated the following

Theorem 1: Given two scalar right circulating matrices Ra = R_S(a) and Rb = R_S(b) derived from two n-dimensional vectors a and b, the matrix Rc = Ra·Rb is still a scalar right circulating matrix, i.e. Rc = R_S(c).

Now we demonstrate that

Theorem 2: Rc = Ra + Rb = R_S(a) + R_S(b) is a scalar right circulating matrix, i.e. Rc = R_S(c) = R_S(a+b).

Proof: (Rc)_{i,j} = (Ra)_{i,j} + (Rb)_{i,j}; on the basis of (3) this equals a_{(j-i)+1} + b_{(j-i)+1} when i ≤ j and a_{n-(i-j)+1} + b_{n-(i-j)+1} when i > j, so Rc = R_S(c) = R_S(a+b). QED

A block right circulating matrix is similar to a scalar right circulating matrix, but its entries are circulating matrices instead of scalar values; for example, let us consider the four scalar circulating matrices A_i = R_S(a_i) = R_S(a_{i,1}, a_{i,2}, a_{i,3}, a_{i,4}), i = 1, …, 4. The block right circulating matrix is defined as

R̃(A) = R̃(A_1, A_2, A_3, A_4) = [ A_1  A_2  A_3  A_4
                                   A_4  A_1  A_2  A_3
                                   A_3  A_4  A_1  A_2
                                   A_2  A_3  A_4  A_1 ]

where R̃(·) is an operator which receives an n-dimensional block row vector A = (R_S(a_1), …, R_S(a_n)) and returns a matrix which has A as its first block row; the i-th block row is computed by rotating the (i-1)-th block row one step toward the right (i = 2, 3, …, n). If the vectors a_i (i = 1, 2, …, n) have length M, R̃(A) is an (Mn × Mn) matrix. Given two block vectors A = (A_1 = R_S(a_1), …, A_M = R_S(a_M)) and B = (B_1 = R_S(b_1), …, B_M = R_S(b_M)), the following theorem subsists:

Theorem 3: the product of two block right circulating matrices is still a block right circulating matrix, i.e.

(4)   Rc = R̃(A)·R̃(B) = R̃(C)

Proof: Writing out the product R̃(A)·R̃(B) block by block and, from theorem 1, considering the block entries as scalar values, the resulting matrix has the form of a right circulating matrix, whose block entries we call C_1, C_2, …, C_M; in order for it to be a block right circulating matrix, its elements must be scalar right circulating matrices. Each element C_i (i = 1, 2, …, M) is a summation of products of scalar right circulating matrices, i.e.

C_i = Σ_{k,q} A_k·B_q

From theorem 1, each term of the summation is a scalar right circulating matrix and, from theorem 2, the summation of scalar right circulating matrices is a scalar right circulating matrix; so each C_i (i = 1, 2, …, M) is a scalar right circulating matrix and, consequently, Rc = R̃(C). QED
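As a small numerical illustration of the R_S(·) operator and of Theorems 1 and 2, the following Python/numpy sketch (our own code, not taken from the paper; the function names RS and is_right_circulating are ours) builds two scalar right circulating matrices and checks that their product and their sum are again right circulating.

```python
import numpy as np

def RS(a):
    """Scalar right circulating matrix whose first row is a; row i is
    row i-1 rotated one step toward the right, as in eq. (3)."""
    a = np.asarray(a)
    return np.array([np.roll(a, i) for i in range(a.size)])

def is_right_circulating(m):
    """True if every row of m is the previous row rotated right by one step."""
    return all(np.array_equal(m[i], np.roll(m[i - 1], 1)) for i in range(1, len(m)))

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
Ra, Rb = RS(a), RS(b)
assert is_right_circulating(Ra @ Rb)          # Theorem 1
assert is_right_circulating(Ra + Rb)          # Theorem 2
assert np.array_equal(Ra + Rb, RS(a + b))     # Rc = RS(a + b)
```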
Corollary to theorem 3: the m-th power of a block right circulating matrix is a block right circulating matrix, i.e. Rc = R̃(A)^m = R̃(C). The proof derives directly from theorem 3 written for B = A and iterated.

3. TNN

In this section we formally introduce the TNN, defining their evolution and output equations.

3.1 TNN: DEFINITION

TNN are characterized by a bidimensional toroidal topology, i.e. neurons are defined over an (n_1 × n_2) grid G with connections between corresponding neurons on opposite borders. Given two points p_i = (x_{i1}, x_{i2}) (i = 1, 2), we define the distance between p_1 and p_2 as

D(p_1, p_2) = max_{i=1,2} min( |x_{1i} - x_{2i}|, n_i - |x_{1i} - x_{2i}| )

On a TNN, the neighborhood with radius r of neuron p_i is defined as N_r(p_i) = {p | p ∈ G, D(p_i, p) ≤ r}; where there is no ambiguity, we will indicate neuron p_i through its coordinates (x_{i1}, x_{i2}). The neuron with coordinates (i,j) is connected to neuron (k,l) if (k,l) belongs to the neighborhood of (i,j). The weight connecting the two neurons is t_{(i,j)(k,l)}; the set of weights connecting a neuron with its neighborhood is the cloning template CT, CT = {t_{(i,j)(k,l)} | (k,l) ∈ N_r(i,j)}. CT is the same for all the neurons (i.e. it is spatially invariant) and determines the elaboration executed by a TNN. In fact, indicating with s_{i,j}(n) the state of neuron (i,j) at the discrete time instant n, the successive state is linearly given by the following expression:

(5)   s_{i,j}(n+1) = Σ_{(k,l) ∈ N_r(i,j)} t_{(i,j)(k,l)} · s_{k,l}(n)

The output y of a TNN is assigned on the basis of the following rule:

(6)   y_{i,j}(n+1) = +1   if s_{i,j}(n+1) ≥ s_{i,j}(0)
      y_{i,j}(n+1) = -1   if s_{i,j}(n+1) < s_{i,j}(0)

3.2 TNN: EVOLUTION

A cloning template with radius r is expressed through the (2r+1) × (2r+1) weight matrix t:

(7)   t = [ t_{-r,-r}  …  t_{-r,0}  …  t_{-r,r}
            …
            t_{0,-r}   …  t_{0,0}   …  t_{0,r}
            …
            t_{r,-r}   …  t_{r,0}   …  t_{r,r} ]

As an introductory example, let us consider a 4×4 TNN with a cloning template with radius r = 1:

t = [ t_{-1,-1}  t_{-1,0}  t_{-1,1}       [ t_0  t_1  t_2
      t_{0,-1}   t_{0,0}   t_{0,1}    =     t_3  t_4  t_5
      t_{1,-1}   t_{1,0}   t_{1,1}  ]       t_6  t_7  t_8 ]

The TNN is 4×4, so we consider a 4×4 block right circulating matrix RP, having 4×4 blocks as entries, i.e.

(8)   RP = [ Rp_1  Rp_2  Rp_3  Rp_4
             Rp_4  Rp_1  Rp_2  Rp_3
             Rp_3  Rp_4  Rp_1  Rp_2
             Rp_2  Rp_3  Rp_4  Rp_1 ]

The block entries Rp_i (i = 1, …, 4) are 4×4 scalar right circulating matrices, defined through the rows of the weight matrix t:

Rp_1 = R_S(t_4, t_5, 0, t_3),  Rp_2 = R_S(t_7, t_8, 0, t_6),  Rp_3 = R_S(0, 0, 0, 0),  Rp_4 = R_S(t_1, t_2, 0, t_0)

Rp_1, Rp_2, Rp_4 are the scalar right circulating matrices associated to the three rows of the cloning template (t_3 t_4 t_5), (t_6 t_7 t_8), (t_0 t_1 t_2), obtained by inserting a zero in the 3rd position; Rp_3 is the null 4×4 matrix. The TNN state at (discrete) time n can be represented through the column vector

s(n) = ( s_1(n); s_2(n); s_3(n); s_4(n) )   where   s_i(n) = ( s_{i,1}(n), s_{i,2}(n), s_{i,3}(n), s_{i,4}(n) )^T,  i = 1, 2, 3, 4;

so we can think of the bi-dimensional state of an M×M TNN as an M²-entry scalar column vector, obtained by row-wise arranging the state matrix into a column vector. From the previous definitions, the evolution of the TNN state from time n to (n+1) can be compactly written as:

(9)   [ s_1(n+1)       [ Rp_1  Rp_2  Rp_3  Rp_4       [ s_1(n)
        s_2(n+1)    =    Rp_4  Rp_1  Rp_2  Rp_3    ·    s_2(n)
        s_3(n+1)         Rp_3  Rp_4  Rp_1  Rp_2         s_3(n)
        s_4(n+1) ]       Rp_2  Rp_3  Rp_4  Rp_1 ]       s_4(n) ]
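The sketch below (our own Python/numpy illustration; the helper names RS, build_RP and one_step are hypothetical, not from the paper) assembles the block right circulating matrix (8) for a 4×4 TNN with a radius-1 template, verifies that one application of the matrix form (9) coincides with the toroidal neighborhood sum of eq. (5), and then iterates it m times before applying the output rule (6).

```python
import numpy as np

def RS(a):
    """Scalar right circulating matrix with first row a (Section 2.2)."""
    a = np.asarray(a, dtype=float)
    return np.array([np.roll(a, i) for i in range(a.size)])

def build_RP(t, M):
    """Block right circulating matrix of eq. (8) for an MxM TNN, r = 1.
    t is the 3x3 cloning template of eq. (7), rows ordered t[-1,.], t[0,.], t[1,.]."""
    def row(k):
        # first row of the circulant block built from template row k:
        # (centre, right, zero padding, left), as in Rp_1 = RS(t4, t5, 0, t3)
        return np.concatenate(([t[k, 1], t[k, 2]], np.zeros(M - 3), [t[k, 0]]))
    Rp = [RS(row(1)), RS(row(2))] + [np.zeros((M, M))] * (M - 3) + [RS(row(0))]
    return np.block([[Rp[(k - i) % M] for k in range(M)] for i in range(M)])

def one_step(s, t):
    """Direct evaluation of the neighborhood sum (5) on the torus."""
    out = np.zeros_like(s)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            out += t[dr + 1, dc + 1] * np.roll(np.roll(s, -dr, axis=0), -dc, axis=1)
    return out

M = 4
rng = np.random.default_rng(0)
t = rng.standard_normal((3, 3))                 # arbitrary radius-1 template
s0 = rng.choice([-1.0, 1.0], size=(M, M))       # initial state = input image
RP = build_RP(t, M)
assert np.allclose(RP @ s0.ravel(), one_step(s0, t).ravel())

m = 2                                           # m iterated evolution steps
sm = (np.linalg.matrix_power(RP, m) @ s0.ravel()).reshape(M, M)
y = np.where(sm >= s0, 1, -1)                   # binary output, rule (6)
```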
In the general case of an M×M TNN, we indicate with Rp_i the M×M scalar right circulating matrix associated to the (i-r+1)-th row of the radius-r cloning template t (7), extended through the insertion of zeroes into positions r+2, …, M-r. For the pairs (i,k) of values related through the expressions

i = 1, …, r+1       and  k = i - 1
i = M-r+1, …, M     and  k = i - M - 1

Rp_i is given by

(10)  Rp_i = R_S( t_{k,0}, t_{k,1}, …, t_{k,r}, 0, …, 0, t_{k,-r}, …, t_{k,-1} )

while Rp_i is the null M×M matrix when r+2 ≤ i ≤ M-r. The state evolution of an M×M TNN with a radius-r cloning template can be written as:

(11)  [ s_1(n+1)       [ Rp_1  Rp_2  …  Rp_M          [ s_1(n)
        s_2(n+1)    =    Rp_M  Rp_1  …  Rp_{M-1}   ·    s_2(n)
        …                …                              …
        s_M(n+1) ]       Rp_2  Rp_3  …  Rp_1 ]          s_M(n) ]

With the obvious extension to matrix notation, we can write one step of evolution of the TNN state as:

(12)  s(n+1) = RP · s(n)

where s(·) is an M²×1 vector and RP is an M²×M² block right circulating matrix. Because of the associativity of the matrix product, the evolution of the TNN state for m instants is compactly given by:

(13)  s(m) = RP^m · s(0)

RP is a block right circulating matrix so, from the corollary to theorem 3, RP^m is a block right circulating matrix.

4. POLYHEDRAL INTERSECTION LEARNING ALGORITHM

In order to train a TNN, we consider a set S of k pairs of input-output (M×M) images describing the desired elaboration, S = {<I_i, O_i>, i = 1, 2, …, k}. s_{x,y}(0) is set to the value of pixel I_i(x,y) and the desired output after m steps is y_{x,y}(m) = O_i(x,y) (1 ≤ x,y ≤ M). From equations (6) and (13), y_{x,y}(m) = O_i(x,y) iff s_{x,y}(m) ⋈_{x,y,O_i} s_{x,y}(0), where

(14)  ⋈_{x,y,O_i} = ≥   if O_i(x,y) = 1
      ⋈_{x,y,O_i} = <   if O_i(x,y) = -1

s_{x,y}(m) ⋈_{x,y,O_i} s_{x,y}(0) means that the final state must be greater than or equal to (smaller than) s_{x,y}(0) when the (x,y) pixel of the output image O_i is equal to 1 (-1). Given a pair of images <I,O>, the TNN which transforms I into O in m time steps must satisfy the following set of inequalities:

(15)  s_{(x+My)}(m) = Σ_{k=1..M²} (RP^m)_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0),   x = 1, …, M;  y = 1, …, M

In (15) we indicate with s_{(x+My)}(l) the state of the neuron in position (x,y) at time instant l; such a notation is due to the representation of (M × M) images as (M² × 1) column vectors. We are now interested in the TNN which transforms I into O in 1 time step. The TNN must satisfy the set of inequalities

(16)  s_{(x+My)}(1) = Σ_{k=1..M²} RP_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0),   x = 1, …, M;  y = 1, …, M

As the elements of RP_{(x+My),k} are either zeroes or elements of the cloning template (eq. (10)),

(17)  Σ_{k=1..M²} RP_{(x+My),k} · s_k(0)  ⋈_{x,y,O}  s_{(x+My)}(0)

represents a half-space in the space having the TNN weights as coordinate axes. When we consider the M² inequalities (16), we obtain the polyhedron P originated by the intersection of the M² half-spaces. If P is not empty, each point T ∈ P is a cloning template of a TNN transforming I into O in 1 step. If we compute the polyhedron P_i for each pair <I_i, O_i> of the training set S, the intersection of all these polyhedra is a new polyhedron FP; if FP is not empty, each point T ∈ FP is the cloning template of a TNN implementing in 1 step all the transformations included in the training set S. We can a priori adopt a cloning template with radius r = 1; if FP is empty, r = 1 is not sufficient to solve the problem. r is then increased until a value r_min yielding a non-empty FP is reached; r_min is the locality of the transformation described through S. The two systems of inequalities (15) and (16) are very similar: they have the same known terms and the unknowns grouped in a block right circulating matrix. A solution of (16) is also a solution of (15) in the case m = 1.
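As a hedged illustration of how the half-spaces (16)/(17) can be assembled and the non-emptiness of their intersection tested, the sketch below (our own Python code; scipy's linear-programming solver is only a stand-in for the exact polyhedral library [10] used by PILA) writes the M² inequalities of each training pair as rows of a system A·t ≤ b in the (2r+1)² template weights; the small margin eps that approximates the strict inequality of (14) is our assumption, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog

def pair_polyhedron(I, O, r=1, eps=1e-6):
    """Inequality system A @ t <= b over the (2r+1)^2 unknown template weights,
    one row per pixel, encoding eq. (16)/(17) for the pair <I, O>.
    Images are assumed to take values in {-1, +1}."""
    M = I.shape[0]
    A, b = [], []
    for x in range(M):
        for y in range(M):
            # neighbourhood of (x, y) on the torus, flattened in the
            # row-wise order of the cloning template (7)
            nbhd = np.array([I[(x + dr) % M, (y + dc) % M]
                             for dr in range(-r, r + 1)
                             for dc in range(-r, r + 1)], dtype=float)
            if O[x, y] == 1:              # t . nbhd >= I[x, y]
                A.append(-nbhd); b.append(-float(I[x, y]))
            else:                         # t . nbhd <  I[x, y]  (margin eps)
                A.append(nbhd);  b.append(float(I[x, y]) - eps)
    return np.array(A), np.array(b)

def feasible(A, b):
    """Non-emptiness of {t | A t <= b}, tested through a trivial LP."""
    res = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=b,
                  bounds=[(None, None)] * A.shape[1], method="highs")
    return res.success

def training_set_feasible(S, r=1):
    """Stack the per-pair systems (i.e. intersect the polyhedra P_i) and test
    whether the resulting FP is non-empty for the given template radius."""
    systems = [pair_polyhedron(I, O, r) for I, O in S]
    A = np.vstack([Ai for Ai, _ in systems])
    b = np.concatenate([bi for _, bi in systems])
    return feasible(A, b)
```

A feasibility test of this kind only certifies that FP is non-empty (the LP solver can also return one witness point, itself a valid cloning template), whereas PILA manipulates the polyhedra in the forms (1)/(2) explicitly through the library of [10].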
If the locality of the problem is greater than one, we can try to move the Degree of Connectivity (DoC) of the TNN from space to time, DoC being the distance between a neuron and the farthest neuron influencing its state, i.e. DoC = r·m. Unfortunately, solving (15) for m > 1 leads to a nonlinear problem (the unknowns appear with degree m). In order to find a solution, if one exists, we use the heuristic approach described in [8], which defines the function CT = SA_heuristic(r, m, S); CT is the cloning template, found through a Simulated Annealing algorithm [11], giving the TNN which performs (nearly) the best transformation of the input images I into the output images O belonging to S, using m time steps and a CT with radius r. Moreover, we assume a procedure Factorization(DoC) which returns an ordered list F containing the pairs of integers (if any) which factorize DoC; F(i) = <r_i, m_i> is the i-th pair of integers factorizing DoC, with r_i < r_{i+1}.

TNN Polyhedral Intersection Learning Algorithm
Input:  training sequence S = {<I_i, O_i>, i = 1, 2, …, k}
Output: CT implementing the transformation S, or {not possible}
begin
  r = 1; fine = false; solution = true;
  while not fine
    FP = whole weight space;
    for i = 1 to k
      compute P_i by eq. (16);
      FP = FP ∩ P_i;
    if FP ≠ ∅ then
      DoC = r; fine = true
    else
      r = r + 1;
      if (2r+1) > Image_Size then fine = true; solution = false;
  endwhile
  if not solution then
    return {not possible}
  else
    {each point T ∈ FP is the CT, with radius r = DoC, of a TNN which implements the transformation S in one time step}
    if DoC = r > 1 then
      F = Factorization(DoC); i = 1; exit = false;
      repeat   {try to move connectivity from space to time}
        CT = SA_heuristic(r_i, m_i, S);
        if CT satisfies the requirements then exit = true; return CT
        else i = i + 1
      until exit
end.

As r·m = r_min is a necessary but not sufficient condition to move neuron connectivity from space to time, the heuristic process may not be able to find a CT with radius r < r_min.
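For completeness, a minimal sketch of the Factorization(DoC) helper assumed above (our own rendering; excluding the trivial pair r = DoC, m = 1, which the first phase already covers, is our assumption):

```python
def factorization(doc):
    """Ordered list of <r, m> pairs with r * m == doc and r < doc,
    sorted by increasing radius (r_i < r_{i+1})."""
    return sorted((r, doc // r) for r in range(1, doc) if doc % r == 0)

# e.g. factorization(4) -> [(1, 4), (2, 2)]: a one-step, radius-4 template may
# alternatively be sought as a radius-1 template iterated 4 times, or as a
# radius-2 template iterated twice.
```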
5. CONCLUSIONS

In this work we presented the theoretical description of the Toroidal Neural Networks (TNN), giving an exact learning algorithm, the Polyhedral Intersection Learning Algorithm (PILA), which is based on a geometrical interpretation of the learning process. PILA determines the minimal locality of the problem, corresponding to the radius of the Cloning Template (CT) of the TNN which solves in one time step the problem coded through the training set; PILA also implements a heuristic that, when possible, reduces the connectivity of the CT by using more than one simulation step, i.e. connectivity is moved from the spatial to the temporal dimension.

6. REFERENCES

[1] L.O. Chua, L. Yang, "Cellular Neural Networks: Theory", IEEE Trans. on CAS, vol. 35, 1988.
[2] L.O. Chua, L. Yang, "Cellular Neural Networks: Applications", IEEE Trans. on CAS, vol. 35, 1988.
[3] T. Roska, L.O. Chua, "The CNN Universal Machine: an analogic array computer", IEEE Trans. on CAS-II, March 1993.
[4] H. Harrer, J.A. Nossek, "Discrete-time Cellular Neural Networks", Int. J. Circuit Theory & Appl., vol. 20, Sept. 1992.
[5] M. Coli, P. Palazzari, R. Rughi, "Design of dynamic evolution of discrete-time continuous-output CNN", Int. Conf. on Artificial Neural Networks, ICANN '95, Paris.
[6] T. Kozek, T. Roska, L.O. Chua, "Genetic Algorithm for CNN Template Learning", IEEE Trans. on CAS, vol. 40, no. 6, 1993.
[7] M. Balsi, "Recurrent back-propagation for CNN", ECCTD '93, Elsevier Science Publ., Amsterdam, 1993.
[8] M. Coli, P. Palazzari, R. Rughi, "Non Linear Image Processing through Sequences of Fast Cellular Neural Networks (FCNN)", Proc. of NSIP '99, Antalya, Turkey.
[9] A. Schrijver, "Theory of Linear and Integer Programming", John Wiley & Sons Ltd, 1986.
[10] D.K. Wilde, "A Library for Doing Polyhedral Operations", IRISA Research Report no. 785.
[11] S. Kirkpatrick et al., "Optimization by Simulated Annealing", Science, vol. 220, no. 4598, May 1983.