A Generalized Convergence Theorem for Multi-Dimensional Neural Networks
G. Rama Murthy,
Associate Professor,
IIIT-Hyderabad,
Gachibowli,
Hyderabad-500032,
AP, INDIA

Sangram Singh,
B.Tech Student,
Punjab Engineering College,
Chandigarh,
Punjab, INDIA

Narendra Ahuja,
Distinguished Professor,
University of Illinois at Urbana-Champaign,
Urbana,
Illinois, USA
ABSTRACT
For a discrete-time multi-dimensional neural network, the existing convergence theorems are reviewed and a generalized convergence theorem is stated and proved. The significant results obtained are an estimate of the maximum number of time intervals such a network needs to reach a stable state after its energy has converged, when operating in the serial mode, and the maximum length of the cycle to which the network converges when operating in the fully parallel mode.
1. Introduction:
A Multi-Dimensional Neural Network (MDNN) is a neural network whose neuronal elements are located in multiple independent dimensions. The smallest processing unit in the network is called a neuron or a node. A node can assume one of two possible states, {1, -1}. Synaptic connections in the network are established between pairs of nodes, and the strength of a connection is called its 'synaptic weight'. The state of a node is recomputed at discrete time intervals.
The concept of a Multi-Dimensional Neural Network was introduced by Rama Murthy [Rama1], where a mathematical model of a multi-dimensional neural network was formalized for the first time. This model naturally utilizes tensors. Furthermore, a convergence theorem for MDNNs was stated and proved in [Rama1]. Based on the earlier effort of Bruck and Goodman [BrW], we realized that a generalized convergence theorem can be stated and proved. This research paper is a realization of that effort.
This research paper is organized as follows. In Section 2, the mathematical model of a multi-dimensional neural network (discussed in [Rama1]) is reviewed. In Section 3, the generalized convergence theorem is stated and proved. Some conclusions are provided in Section 4.
2. Mathematical Model of Multi-Dimensional Neural Network using Tensors:
Before we review the mathematical model of a MDNN using tensors [Rama1], we resolve the terminological conflict that exists between our understanding of a MDNN and the tensor notation.
A tensor is defined by two parameters, namely its 'dimension' and its 'order'. A tensor of dimension m and order n represents $m^n$ unique elements. Such a tensor is used to represent an n-dimensional MDNN with m nodes/neurons placed along each dimension. To avoid confusion, we henceforth take a MDNN to mean a neural network with neurons placed in n 'dimensions' and with m neurons placed along each dimension, where m and n are arbitrary.
As discussed in [Rama1], the following three tensors are used to represent a
multi-dimensional neural network.
$X_{i_1, i_2, \ldots, i_n}(t)$ : called the "State Tensor", is of order n and dimension m, and is used to depict the state of each node in the MDNN at any discrete time t. Each individual entry of the tensor lies in {-1, +1}.

$T_{i_1, i_2, \ldots, i_n}$ : called the "Threshold Tensor", is of order n and dimension m, and is used to denote the threshold value of each node.

$S_{i_1, i_2, \ldots, i_n;\, j_1, j_2, \ldots, j_n}$ : called the "Connection Tensor", is a tensor of order 2n and dimension m (i.e. $i_k, j_k \in [1, m]$). It is used to represent the connection structure of a MDNN. The two sets of n indices on either side of the semi-colon (';') in the subscript of S identify the two nodes joined by a particular connection, and the value of the entry of S is the weight of the connection between those two nodes. The tensor S is symmetric, i.e.

$$S_{i_1, i_2, \ldots, i_n;\, j_1, j_2, \ldots, j_n} = S_{j_1, j_2, \ldots, j_n;\, i_1, i_2, \ldots, i_n} \quad \text{for all } i_k, j_k \in [1, m].$$
The NEXT STATE of a node is evaluated as:

$$X_{i_1, i_2, \ldots, i_n}(t+1) = \operatorname{Sign}\big(H_{i_1, i_2, \ldots, i_n}(t)\big) = \begin{cases} +1, & H_{i_1, i_2, \ldots, i_n}(t) \ge 0 \\ -1, & H_{i_1, i_2, \ldots, i_n}(t) < 0 \end{cases} \qquad \text{--- (2.1)}$$

where

$$H_{i_1, i_2, \ldots, i_n}(t) = \sum_{j_1=1}^{m} \sum_{j_2=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n}(t) \;-\; T_{i_1, \ldots, i_n}.$$
Let G be the set of nodes evaluated in one time interval. The MODE of OPERATION of a MDNN is defined as the SERIAL MODE of operation when |G| = 1, and as the FULLY PARALLEL mode of operation when |G| = $m^n$. For all 1 < |G| < $m^n$, the MDNN is said to operate in a PARALLEL MODE of operation.
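The two extreme modes can be sketched as follows. This is again our own illustration (NumPy, n = 2), not code from the paper: the serial mode re-evaluates one node per time interval, while the fully parallel mode re-evaluates all $m^n$ nodes at once.

```python
import numpy as np
from itertools import product

def local_field(S, X, T, n):
    return np.tensordot(S, X, axes=n) - T

def serial_step(S, X, T, n, node):
    """Serial mode (|G| = 1): re-evaluate a single node in this time interval."""
    X = X.copy()
    X[node] = 1 if local_field(S, X, T, n)[node] >= 0 else -1
    return X

def parallel_step(S, X, T, n):
    """Fully parallel mode (|G| = m^n): re-evaluate every node simultaneously."""
    return np.where(local_field(S, X, T, n) >= 0, 1, -1)

# Example: one full serial sweep over all m^n nodes, followed by one parallel step.
m, n = 3, 2
rng = np.random.default_rng(1)
S = rng.normal(size=(m,) * (2 * n)); S = (S + S.transpose(2, 3, 0, 1)) / 2
T = rng.normal(size=(m,) * n)
X = rng.choice([-1, 1], size=(m,) * n)
for node in product(range(m), repeat=n):
    X = serial_step(S, X, T, n, node)
X_next = parallel_step(S, X, T, n)
```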
A state of the network is called a STABLE STATE if:

$$X_{i_1, i_2, \ldots, i_n}(t) = \operatorname{Sign}\!\left( \sum_{j_1=1}^{m} \sum_{j_2=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n}(t) - T_{i_1, \ldots, i_n} \right) \qquad \text{---- (2.2)}$$

for all $i_k \in [1, m]$. Once a MDNN reaches a stable state, it remains in that state thereafter.
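A stable state can be checked directly against (2.2). The sketch below is ours (NumPy, same conventions as above), not part of the paper.

```python
import numpy as np

def is_stable(S, X, T, n):
    """Equation (2.2): every node already equals the sign of its local field."""
    H = np.tensordot(S, X, axes=n) - T
    return bool(np.all(X == np.where(H >= 0, 1, -1)))
```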
The ENERGY FUNCTION E(t) is defined as:

$$E(t) = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n}\, X_{i_1, \ldots, i_n}(t)\, X_{j_1, \ldots, j_n}(t) \;-\; 2 \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} X_{i_1, \ldots, i_n}(t)\, T_{i_1, \ldots, i_n} \qquad \text{---- (2.3)}$$
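The energy (2.3) can be evaluated with one tensor contraction. This is our own sketch under the same NumPy conventions as above; with a symmetric S and non-negative diagonal, a serial update never decreases this quantity, which is the monotonicity used in the proof of Section 3.

```python
import numpy as np

def energy(S, X, T, n):
    """E(t) = sum_{i,j} S[i;j] X[i] X[j] - 2 * sum_i X[i] T[i]  (equation 2.3)."""
    SX = np.tensordot(S, X, axes=n)        # for each i: sum_j S[i;j] X[j]
    return float(np.sum(X * SX) - 2.0 * np.sum(X * T))
```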
3. Generalized Convergence Theorem for Multi-Dimensional Neural Networks
The Generalized Convergence Theorem proves that a general Multi-Dimensional Neural Network (MN) has an equivalent MDNN ($\widehat{MN}$) which is capable of depicting both the serial and the fully parallel mode of operation of MN by an equivalent serial mode of operation. Further, it is shown that a general MDNN operating in a serial mode converges to a stable state after a bounded number of time intervals. Lastly, the theorem uses the equivalent mode of operation of $\widehat{MN}$ to show that MN converges to a stable state when working in a serial mode of operation, and to a cycle of length at most 2 when working in a fully parallel mode of operation.
Theorem 1:
(1) Let MN = (S, T) be any MDNN of dimension m and order n. Let $\widehat{MN} = (\hat{S}, \hat{T})$ be another MDNN with dimension 2m and order n, which is obtained from MN as follows:
$$\hat{S} = \begin{bmatrix} 0 & S \\ S & 0 \end{bmatrix} \qquad \text{and} \qquad \hat{T} = \begin{bmatrix} T \\ T \end{bmatrix} \qquad \text{--- (3.1)}$$
$\hat{S}_{s_1, s_2, \ldots, s_n;\, t_1, t_2, \ldots, t_n}$ has dimension 2m (i.e. $s_k, t_k \in [1, 2m]$) and order 2n. Also, $\hat{S}$ is symmetric, i.e.
$$\hat{S}_{i_1, i_2, \ldots, i_n;\, j_1+m, j_2+m, \ldots, j_n+m} = \hat{S}_{j_1+m, j_2+m, \ldots, j_n+m;\, i_1, i_2, \ldots, i_n}.$$
The elements of $\hat{S}$ are obtained from S in the following manner:
$$\hat{S}_{i_1, i_2, \ldots, i_n;\, j_1+m, j_2+m, \ldots, j_n+m} = S_{i_1, i_2, \ldots, i_n;\, j_1, j_2, \ldots, j_n} \qquad \text{--- (3.2)}$$
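A sketch of the construction (3.1)/(3.2), under one concrete reading that is our own: $\hat{S}$ is stored as a tensor of dimension 2m and order 2n whose only non-zero entries connect the first half of the index range to the second half, and index tuples that mix the two halves are left isolated (an assumption, not spelled out in the paper).

```python
import numpy as np

def doubled_network(S, T, m, n):
    """Build (S_hat, T_hat) of equations (3.1)/(3.2) from an MDNN (S, T).

    S_hat[i ; j+m] = S_hat[j+m ; i] = S[i ; j], all other entries are zero,
    so no two nodes within the same half are connected; T_hat repeats T on
    both halves.  Mixed-index nodes are left isolated (an assumption).
    """
    S_hat = np.zeros((2 * m,) * (2 * n))
    T_hat = np.zeros((2 * m,) * n)
    lo = tuple(slice(0, m) for _ in range(n))        # first half of each index range
    hi = tuple(slice(m, 2 * m) for _ in range(n))    # second half of each index range
    S_hat[lo + hi] = S                               # S_hat[i ; j+m] = S[i ; j]
    S_hat[hi + lo] = np.moveaxis(S, list(range(n, 2 * n)), list(range(n)))  # symmetry
    T_hat[lo] = T
    T_hat[hi] = T
    return S_hat, T_hat
```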
The following claims are made:
(a) for any serial mode of operation of MN there exists an equivalent serial mode of operation of $\widehat{MN}$, provided the diagonal elements of S are non-negative, i.e. $S_{i_1, i_2, \ldots, i_n;\, i_1, i_2, \ldots, i_n} \ge 0$;
(b) for the fully parallel mode of operation of MN, there exists an equivalent serial mode of operation of $\widehat{MN}$.
(2) Let MN = (S,T) be any MDNN of dimension m and order n, where S is a fully
symmetric tensor of order 2n and dimension m, with zero diagonal elements. Then,
the network MN when working in a serial mode of operation always converges to a
stable state.
(3) Let MN = (S, T) be a MDNN. Given (a), both (b) and (c) hold:
(a) if MN is operating in a serial mode and S is a symmetric tensor with zero diagonal elements, i.e. $S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} = 0$, then the MDNN will always converge to a stable state;
(b) if MN is operating in a serial mode and S is a symmetric tensor with non-negative diagonal elements, then the network will converge to a stable state;
(c) if MN is operating in a fully parallel mode and S is a symmetric tensor, then the network will converge to a cycle of length at most 2.
Proof of part 1) of the theorem:
From (3.2), the connection of $\widehat{MN}$ obtained from a connection $\{i_1, \ldots, i_n;\, j_1, \ldots, j_n\}$ of MN is $\{s_1, \ldots, s_n;\, t_1, \ldots, t_n\}$ with $s_k = i_k$, so $s_k \in [1, m]$, and $t_k = j_k + m$, so $t_k \in [m+1, 2m]$. For $s_k, t_k \in [1, m]$ or $s_k, t_k \in [m+1, 2m]$, we have $\hat{S}_{s_1, \ldots, s_n;\, t_1, \ldots, t_n} = 0$; hence the two sets of nodes ($1 \le s_k \le m$) and ($m+1 \le t_k \le 2m$) are independent, i.e. no connection exists between any two nodes of the same set. Hence, the graph of $\widehat{MN}$ is bipartite. We denote the two independent sets of nodes by P1 and P2, respectively.
Proof of part 1 a)
(Serial Mode)
Let $a_p = \{ i_1(p), i_2(p), i_3(p), \ldots, i_n(p) \}$ be a set of n indices, where $i_k(p) \in [1, m]$ is uniquely determined by p. The set is used to represent a unique node of MN. Let $a_1, a_2, a_3, \ldots, a_{m^n}$ represent the order of evaluation of the nodes of MN in the serial mode of operation, and let $X_0$ be the initial state.
Then we let the sets P1 and P2 have the same initial state as MN, i.e. $(X_0, X_0)$ is the initial state of $\widehat{MN}$. The order of evaluation of nodes is taken to be:
$$a_1,\; (a_1 + m),\; a_2,\; (a_2 + m),\; \ldots,\; a_{m^n},\; (a_{m^n} + m),$$
where $a_k + m = \{ i_1(k)+m,\; i_2(k)+m,\; i_3(k)+m,\; \ldots,\; i_n(k)+m \}$.
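For concreteness, the interleaved order can be generated as follows. This is our own sketch: indices are 0-based and a lexicographic choice is made for $a_1, \ldots, a_{m^n}$, although the theorem allows any serial order.

```python
from itertools import product

m, n = 3, 2

# a_1, ..., a_{m^n}: one fixed serial order of the nodes of MN (lexicographic here).
order_MN = list(product(range(m), repeat=n))

# Interleaved serial order for MN_hat: a_1, a_1+m, a_2, a_2+m, ...
order_MN_hat = []
for a in order_MN:
    order_MN_hat.append(a)                         # node a_k, belongs to P1
    order_MN_hat.append(tuple(i + m for i in a))   # node a_k + m, belongs to P2
```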
Note that $a_k$ is a node of P1 and $a_k + m$ is a node of P2, so elements of the sets P1 and P2 are evaluated alternately. We show the equivalence of the above serial mode of operation of $\widehat{MN}$ in the following two steps:
(1) the state of P2 remains the same as the state of P1 after any even number of evaluations;
(2) the state of MN after k arbitrary evaluations is the same as that of P1 after 2k evaluations.
We show that (1) holds in the two cases which arise:
- if the state of a node $(i_1, \ldots, i_n)$ of P1 does not change after its evaluation, then by symmetry there is no change in the corresponding node $(i_1+m, \ldots, i_n+m)$ of P2;
- if the state of the node $(i_1, \ldots, i_n)$ of P1 changes after an evaluation, then, as the connection from the node $(i_1, \ldots, i_n)$ to the node $(i_1+m, \ldots, i_n+m)$ is non-negative (it is the corresponding diagonal element of S), the same change occurs in the corresponding node of P2 after the next evaluation.
To show (2), we use (1), which shows that P2 attains the same state as P1 after every even number of evaluations. As P1 is connected only to P2, with a connection structure identical to that of MN, and P2 has the same initial state as MN, P1 after its evaluation must reach the same state as MN. Statement (2) follows, because P1 is evaluated once every two evaluations of $\widehat{MN}$.
Proof of part 1 b):
(Fully Parallel Mode)
Let $(X_0, X_0)$ be the initial state of $\widehat{MN}$. Clearly, performing the evaluation at all nodes belonging to P1 (in parallel), then at all nodes belonging to P2, and continuing with this alternating order is equivalent to the fully parallel mode of operation of MN. The equivalence is in the sense that the state of MN equals the state of whichever of the two subsets of nodes of $\widehat{MN}$, P1 or P2, was evaluated last. A key observation is that P1 and P2 are independent sets of nodes, and a parallel evaluation of an independent set of nodes is equivalent to a serial evaluation of all the nodes of that set [BrW]. Thus the fully parallel mode of operation of MN has an equivalent serial mode of operation of $\widehat{MN}$.
Q. E. D.
Proof of Part 2):
Let $\Delta E = E(t+1) - E(t)$ be the difference in energy between two consecutive states of the network. The network is working in the serial mode of operation, i.e. |G| = 1. In the present context, in $m^n$ evaluations every node of the network is evaluated once; such a cycle of evaluations is called an iteration of evaluation of the network.
Let us assume $\{i_1, i_2, \ldots, i_n\}$ to be the node at which the evaluation takes place at time t, and let $\Delta X_{i_1, \ldots, i_n} = X_{i_1, \ldots, i_n}(t+1) - X_{i_1, \ldots, i_n}(t)$ be the change in the state of that node between times t and t+1. By the update rule (2.1) we have:

$$\Delta X_{i_1, i_2, \ldots, i_n} = \begin{cases} 0, & \text{if } X_{i_1, \ldots, i_n}(t) = \operatorname{Sign}\big(H_{i_1, \ldots, i_n}(t)\big) \\ +2, & \text{if } X_{i_1, \ldots, i_n}(t) = -1 \text{ and } \operatorname{Sign}\big(H_{i_1, \ldots, i_n}(t)\big) = 1 \\ -2, & \text{if } X_{i_1, \ldots, i_n}(t) = 1 \text{ and } \operatorname{Sign}\big(H_{i_1, \ldots, i_n}(t)\big) = -1 \end{cases} \qquad \text{--- (3.3)}$$
Only the state of the node $\{i_1, i_2, \ldots, i_n\}$ can change in the considered time interval, so $\Delta E$ becomes:

$$\Delta E = \Delta X_{i_1, \ldots, i_n} \left[ \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{i_1, \ldots, i_n;\, j_1, \ldots, j_n}\, X_{j_1, \ldots, j_n}(t) + \sum_{j_1=1}^{m} \cdots \sum_{j_n=1}^{m} S_{j_1, \ldots, j_n;\, i_1, \ldots, i_n}\, X_{j_1, \ldots, j_n}(t) \right] + S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} \big(\Delta X_{i_1, \ldots, i_n}\big)^2 - 2\, \Delta X_{i_1, \ldots, i_n}\, T_{i_1, \ldots, i_n} \qquad \text{--- (3.4)}$$

When simplified further using the symmetry of S and the definition of $H_{i_1, \ldots, i_n}(t)$, this becomes:

$$\Delta E = 2\, \Delta X_{i_1, \ldots, i_n}\, H_{i_1, \ldots, i_n}(t) + S_{i_1, \ldots, i_n;\, i_1, \ldots, i_n} \big(\Delta X_{i_1, \ldots, i_n}\big)^2 \qquad \text{---- (3.5)}$$

Since $2\, \Delta X_{i_1, \ldots, i_n}\, H_{i_1, \ldots, i_n}(t) \ge 0$ (by (3.3)) and the diagonal elements of S are zero by hypothesis, the energy of the network is non-decreasing, i.e. $\Delta E \ge 0$. The energy of the network is bounded above by a value determined by the interconnections (S) and the threshold values of the nodes (T), so it cannot increase indefinitely and must converge to a constant value ($\Delta E = 0$). From (3.5):
$\Delta E = 0$ if
(a) $\Delta X_{i_1, \ldots, i_n} = 0$, or
(b) $\Delta X_{i_1, \ldots, i_n} = 2$ with $H_{i_1, \ldots, i_n}(t) = 0$.
Condition (a) implies that there is no change of state, and condition (b) implies that the state can change in one direction only, namely from -1 to +1. So, once the energy has converged, each node can change its value at most once (at most $m^n$ such changes in total). The network reaches a stable state when a complete iteration of evaluation of its nodes produces no change of state. Therefore, after the energy converges, at most $m^n$ iterations of the network (with at least one change occurring per iteration) are possible before it reaches a stable state. As one evaluation of a node takes place in one time interval, an iteration takes $m^n$ time intervals, and we can say:
A MDNN working in a serial mode must reach a stable state after at most $m^n \cdot m^n = m^{2n}$ time intervals.
Q. E. D.
The result obtained is consistent with the corresponding result for a one-dimensional neural network [BrW].
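As a numerical illustration (the values are chosen here purely for exposition, not taken from the paper): for m = 3 and n = 2 the network has $m^n = 9$ nodes and one iteration takes $m^n = 9$ time intervals, so at most

$$m^{2n} = m^n \cdot m^n = 9 \times 9 = 81$$

further time intervals are needed to reach a stable state once the energy has converged.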
Proof of part 3):
Part 3 b) is implied by Part 3 a):
Using part 1 a), a MDNN MN whose connection tensor S has non-negative diagonal elements and which is working in a serial mode can be transformed into an equivalent network $\widehat{MN}$, with zero-diagonal connection tensor $\hat{S}$, working in a serial mode. By 3 a), $\widehat{MN}$ converges to a stable state, which implies that the network MN also converges to a stable state. Note that part 3 a) is trivially implied by part 3 b).
Part 3 c) is implied by Part 3 a):
Using part 1 b), a MDNN (MN) operating in a fully parallel mode can be transformed into an equivalent MDNN ($\widehat{MN}$) operating in a serial mode, where the state of MN is depicted by P1 and P2 alternately. Part 3 a) implies that $\widehat{MN}$ converges to a stable state; therefore P1 and P2 each assume a fixed value. It follows directly that if the stable state of P1 equals the stable state of P2, then MN converges to a stable state, and if the stable state of P1 differs from that of P2, then MN converges to a cycle of length 2. Therefore, a MDNN operating in a fully parallel mode converges to a cycle of length at most 2.
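A small self-check of this last statement can be run directly; the sketch below is our own (random symmetric connection tensor, n = 2, m = 3) and simply iterates the fully parallel update (2.1) until a previously visited state recurs.

```python
import numpy as np

# Run the fully parallel mode on a random symmetric MDNN and report the length
# of the cycle the trajectory ends in (expected: 1 or 2, by part 3(c)).
m, n = 3, 2
rng = np.random.default_rng(7)
S = rng.normal(size=(m,) * (2 * n))
S = (S + S.transpose(2, 3, 0, 1)) / 2              # symmetric connection tensor (n = 2)
T = rng.normal(size=(m,) * n)
X = rng.choice([-1, 1], size=(m,) * n)

seen = {}                                          # state -> first time it was visited
t = 0
while X.tobytes() not in seen:
    seen[X.tobytes()] = t
    X = np.where(np.tensordot(S, X, axes=n) - T >= 0, 1, -1)   # parallel update (2.1)
    t += 1
cycle_length = t - seen[X.tobytes()]
print("cycle length:", cycle_length)               # expected: 1 or 2
```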
4. Conclusions:
This research paper shows that a Multi-Dimensional Neural Network (MDNN) and a single-dimensional neural network display similar behavior when operating in the serial mode of operation and in the parallel mode of operation.
A neural network can be used as a device to implement local search algorithms for a local maximum of the Energy Function [HpTk]. The value of the Energy Function corresponding to the initial state is improved by performing a sequence of random serial iterations until the network reaches a local maximum. The parallel mode of operation can also, conceptually, be randomized by using the technique described in Part 1(b) of Theorem 1.
MDNNs find application in combinatorial optimization. Such optimization problems can, to a large extent, be represented as quadratic forms [HmRu]. Problems whose objective corresponds to the Energy Function (Equation 2.3) can therefore be solved using MDNNs, as sketched below.
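The following is our own illustration of this use, not a construction given in the paper: a quadratic objective over {-1, +1} variables is encoded as the energy (2.3) of a one-dimensional (n = 1) network, a special case of a MDNN, and random serial iterations are run to a local maximum.

```python
import numpy as np

# Maximize x^T Q x + b^T x over x in {-1,+1}^N by setting S = Q (diagonal zeroed,
# since x_i^2 = 1 only shifts E by a constant) and T = -b/2, so that E(x) equals
# the objective, then running random serial iterations (part 2 guarantees they stop).
N = 8
rng = np.random.default_rng(3)
Q = rng.normal(size=(N, N)); Q = (Q + Q.T) / 2
np.fill_diagonal(Q, 0.0)
b = rng.normal(size=N)
S, T = Q, -b / 2.0

def energy(x):
    return x @ S @ x - 2.0 * x @ T         # equals x^T Q x + b^T x here

x = rng.choice([-1, 1], size=N)
stable = False
while not stable:
    stable = True
    for i in rng.permutation(N):           # one random serial iteration
        new = 1 if (S[i] @ x - T[i]) >= 0 else -1
        if new != x[i]:
            x[i], stable = new, False
print("local maximum of E:", energy(x))
```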
References:
[Rama1] G. Rama Murthy, "Multi/Infinite Dimensional Neural Networks, Multi/Infinite Dimensional Logic Theory," International Journal of Neural Systems, Vol. 15, No. 3, pp. 1-13, June 2005.
[Rama2] Garimella Rama Murthy, "Biological Neural Networks: 3-D/Multi-Dimension Neural Network Models, Multi-Dimensional Logic Theory," Proceedings of the First International Conference on Theoretical Neurobiology, February 24-26, 2003.
[BrW] Jehoshua Bruck and Joseph W. Goodman, "A Generalized Convergence Theorem for Neural Networks," IEEE Transactions on Information Theory, Vol. 34, No. 5, pp. 1089-1092, September 1988.
[HpTk] J. J. Hopfield and D. W. Tank, "Neural computation of decisions in optimization problems," Biological Cybernetics, Vol. 52, pp. 141-152, 1985.
[HmRu] P.L. Hammer and S. Rudeanu, “Boolean Methods in Operations Research”, New
York: Springer-Verlag, 1968.