
The 21st Annual Conference of the Japanese Neural Network Society (December, 2011)
[P2-44]
On Global Stability of Complex-valued Recurrent Neural Networks with Time-delays
Jin Hu and Jun Wang
Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong
E-mail: {jhu,jwang}@mae.cuhk.edu.hk
Abstract— As an extension of real-valued recurrent neural networks, complex-valued recurrent neural networks with complex-valued states, connection weights, or activation functions have much richer and more complicated dynamical properties than their real-valued counterparts. This paper presents several sufficient conditions for ascertaining the existence of a unique equilibrium, global asymptotic stability, and global exponential stability of delayed complex-valued recurrent neural networks with two classes of complex-valued activation functions.
Keywords— Complex-valued neural networks, Activation functions, Global stability
1 Introduction
In many applications, recurrent neural networks are needed to process complex-valued signals. In recent years, several complex-valued neural network models with complex-valued states, connection weights, or activation functions have been proposed and analyzed; e.g., [1][2][3][4]. These models provide new tools for applications in optoelectronics, filtering, imaging, speech synthesis, computer vision, etc., and can deal with problems that cannot be solved by their real-valued counterparts. Since the activation function is the key element that determines the dynamical behaviors of a recurrent neural network, complex-valued neural networks have different characteristics from real-valued ones. It is therefore important to study the dynamical behaviors of complex-valued recurrent neural networks.
In real-valued recurrent neural networks, the activation function $f(\cdot)$ is usually chosen to be a smooth (continuously differentiable) and bounded function such as a sigmoid function. In the complex domain, however, Liouville's theorem states that if $f(z)$ is bounded and analytic at every $z \in \mathbb{C}$, then $f(z)$ is a constant function. It follows that if we choose such a function for $f(z)$ in the complex domain, it is constant over the entire complex plane, which is obviously not suitable. That is to say, a complex activation function cannot be both bounded and analytic; the complex extension of the hyperbolic tangent, for instance, is analytic away from its poles at $z = \mathrm{i}(k + 1/2)\pi$ ($k$ an integer) but unbounded near them. Therefore, in this paper we consider two classes of activation functions, systematically study the stability problem of continuous-time complex-valued recurrent neural networks using different approaches, and provide some useful results.
2 Model Description
In this paper, we consider the following complex-valued recurrent neural network with time delays:

$$\dot{z}(t) = -Dz(t) + Af(z(t)) + Bg(z(t-\tau)) + u \qquad (1)$$

where $z = (z_1, z_2, \cdots, z_n)^T \in \mathbb{C}^n$ is the state vector; $D = \mathrm{diag}(d_1, d_2, \cdots, d_n) \in \mathbb{R}^{n \times n}$ with $d_j > 0$ $(j = 1, 2, \cdots, n)$ is the self-feedback connection weight matrix; $A = (a_{jk})_{n \times n} \in \mathbb{C}^{n \times n}$ and $B = (b_{jk})_{n \times n} \in \mathbb{C}^{n \times n}$ are the connection weight matrices without and with time delays, respectively; $f(z(t)) = (f_1(z_1(t)), f_2(z_2(t)), \cdots, f_n(z_n(t)))^T : \mathbb{C}^n \to \mathbb{C}^n$ and $g(z(t-\tau)) = (g_1(z_1(t-\tau_1)), g_2(z_2(t-\tau_2)), \cdots, g_n(z_n(t-\tau_n)))^T : \mathbb{C}^n \to \mathbb{C}^n$ are the vector-valued activation functions without and with time delays, whose elements are complex-valued nonlinear functions; $\tau_j$ $(j = 1, 2, \cdots, n)$ are constant time delays; and $u = (u_1, u_2, \cdots, u_n)^T \in \mathbb{C}^n$ is the external input vector.
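To make the model concrete, the following is a minimal simulation sketch of system (1) using forward-Euler integration. The matrices, delays, constant pre-history, and the split-type choice $f_j(z) = g_j(z) = \tanh(\mathrm{Re}\,z) + \mathrm{i}\tanh(\mathrm{Im}\,z)$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate(D, A, B, u, tau, f, g, z0, T=20.0, h=1e-3):
    """Forward-Euler integration of dz/dt = -D z + A f(z) + B g(z(t - tau)) + u.

    D: (n,) positive self-feedback gains; A, B: complex (n, n) weight matrices;
    tau: (n,) constant delays; z0: complex (n,) initial state, also used as the
    constant history on [-max(tau), 0].
    """
    n = len(z0)
    steps = int(T / h)
    lags = np.maximum(1, (tau / h).astype(int))      # per-neuron delay in steps
    hist = np.tile(z0.astype(complex), (steps + 1, 1))
    for k in range(steps):
        z = hist[k]
        # component-wise delayed state z_j(t - tau_j)
        zd = np.array([hist[max(k - lags[j], 0), j] for j in range(n)])
        hist[k + 1] = z + h * (-D * z + A @ f(z) + B @ g(zd) + u)
    return hist

# Illustrative data (not taken from the paper)
rng = np.random.default_rng(0)
n = 3
D = 2.0 * np.ones(n)
A = 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
B = 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
tau = np.array([0.5, 1.0, 1.5])
act = lambda z: np.tanh(z.real) + 1j * np.tanh(z.imag)  # split-type activation

traj = simulate(D, A, B, u, tau, act, act, z0=np.ones(n) + 1j * np.ones(n))
print("state at t = T:", traj[-1])
```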
We consider two classes of complex-valued activation functions, satisfying the following two sets of assumptions:
Assumption 1 $f_j(\cdot)$ $(j = 1, 2, \cdots, n)$ are a set of complex-valued functions. Writing $z = x + \mathrm{i}y$, where $\mathrm{i} = \sqrt{-1}$ denotes the imaginary unit, $f_j(z)$ can be expressed by separating it into its real and imaginary parts as
$$f_j(z) = f_j^R(x, y) + \mathrm{i} f_j^I(x, y),$$
where $f_j^R(\cdot,\cdot) : \mathbb{R}^2 \to \mathbb{R}$ and $f_j^I(\cdot,\cdot) : \mathbb{R}^2 \to \mathbb{R}$.
1) The partial derivatives of $f_j^R$ and $f_j^I$ with respect to $x$ and $y$, namely $\partial f_j^R/\partial x$, $\partial f_j^R/\partial y$, $\partial f_j^I/\partial x$, and $\partial f_j^I/\partial y$, exist and are continuous;
2) These partial derivatives are bounded; that is, there exist positive constants $\lambda_j^{RR}$, $\lambda_j^{RI}$, $\lambda_j^{IR}$, $\lambda_j^{II}$ such that
$$|\partial f_j^R/\partial x| \le \lambda_j^{RR}, \quad |\partial f_j^R/\partial y| \le \lambda_j^{RI}, \quad |\partial f_j^I/\partial x| \le \lambda_j^{IR}, \quad |\partial f_j^I/\partial y| \le \lambda_j^{II}.$$
Then, by the mean value theorem for multivariable functions, for any $x, x', y, y' \in \mathbb{R}$,
$$|f_j^R(x, y) - f_j^R(x', y')| \le \lambda_j^{RR}|x - x'| + \lambda_j^{RI}|y - y'|,$$
$$|f_j^I(x, y) - f_j^I(x', y')| \le \lambda_j^{IR}|x - x'| + \lambda_j^{II}|y - y'|. \qquad (2)$$
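As an illustration (the paper does not fix a particular activation), the split-type function $f_j(z) = \tanh x + \mathrm{i}\tanh y$ satisfies Assumption 1 with $\lambda_j^{RR} = \lambda_j^{II} = 1$ and $\lambda_j^{RI} = \lambda_j^{IR} = 0$; the sketch below checks the Lipschitz-type bounds (2) numerically on random samples.

```python
import numpy as np

# Split-type activation f(z) = tanh(x) + i tanh(y) (an assumed example)
fR = lambda x, y: np.tanh(x)
fI = lambda x, y: np.tanh(y)
lam_RR, lam_RI, lam_IR, lam_II = 1.0, 0.0, 0.0, 1.0

rng = np.random.default_rng(1)
x, y, xp, yp = rng.uniform(-5, 5, size=(4, 100000))
okR = np.abs(fR(x, y) - fR(xp, yp)) <= lam_RR*np.abs(x-xp) + lam_RI*np.abs(y-yp) + 1e-12
okI = np.abs(fI(x, y) - fI(xp, yp)) <= lam_IR*np.abs(x-xp) + lam_II*np.abs(y-yp) + 1e-12
print("bounds (2) hold on all samples:", okR.all() and okI.all())
```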
Assumption 2 A set of complex-valued functions $f_j(\cdot)$ $(j = 1, 2, \cdots, n)$ satisfy the Lipschitz continuity condition in the complex domain; that is, for $j = 1, 2, \cdots, n$, there exists a positive constant $\xi_j$ such that for any $u, v \in \mathbb{C}$,
$$|f_j(u) - f_j(v)| \le \xi_j |u - v|. \qquad (3)$$
The $\xi_j$ $(j = 1, 2, \cdots, n)$ are called Lipschitz constants.
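A typical fully complex activation satisfying Assumption 2 (an assumed example; the paper does not name one) is $f(z) = z/(1 + |z|)$, which is bounded and Lipschitz continuous but not analytic. A quick numerical check of (3):

```python
import numpy as np

# Fully complex activation (an assumed example): f(z) = z / (1 + |z|)
f = lambda z: z / (1.0 + np.abs(z))

rng = np.random.default_rng(2)
u = rng.uniform(-5, 5, 100000) + 1j * rng.uniform(-5, 5, 100000)
v = rng.uniform(-5, 5, 100000) + 1j * rng.uniform(-5, 5, 100000)
ratio = np.abs(f(u) - f(v)) / np.abs(u - v)    # empirical |f(u)-f(v)| / |u-v|
print("largest observed ratio:", ratio.max())  # stays <= 1, so xi_j = 1 suffices
```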
3 Main Results
First, we consider activation functions $f_j(\cdot)$ and $g_j(\cdot)$ satisfying Assumption 1, with corresponding constants $\mu_j^{RR}$, $\mu_j^{RI}$, $\mu_j^{IR}$, and $\mu_j^{II}$ for $g_j(\cdot)$. Since the real and imaginary parts of the activation functions have individual properties, we separate the system into its real and imaginary parts. Separating the state vector, connection weight matrices, vector-valued activation functions, and external input vector into their real and imaginary parts, we write $A^R = (a_{jk}^R)_{n\times n}$ and $A^I = (a_{jk}^I)_{n\times n}$ for the real and imaginary parts of $A$; $B^R = (b_{jk}^R)_{n\times n}$ and $B^I = (b_{jk}^I)_{n\times n}$ for those of $B$; $f^R(x, y) = (f_1^R(x_1, y_1), f_2^R(x_2, y_2), \cdots, f_n^R(x_n, y_n))^T$ and $f^I(x, y) = (f_1^I(x_1, y_1), f_2^I(x_2, y_2), \cdots, f_n^I(x_n, y_n))^T$ for those of $f(z)$; $g^R(x(t-\tau), y(t-\tau)) = (g_1^R(x_1(t-\tau_1), y_1(t-\tau_1)), \cdots, g_n^R(x_n(t-\tau_n), y_n(t-\tau_n)))^T$ and $g^I(x(t-\tau), y(t-\tau)) = (g_1^I(x_1(t-\tau_1), y_1(t-\tau_1)), \cdots, g_n^I(x_n(t-\tau_n), y_n(t-\tau_n)))^T$ for those of $g(z(t-\tau))$; and $u^R = (u_1^R, u_2^R, \cdots, u_n^R)^T$ and $u^I = (u_1^I, u_2^I, \cdots, u_n^I)^T$ for those of $u$. For simplicity, we denote $x^\tau = x(t-\tau)$, $y^\tau = y(t-\tau)$, $x_k^\tau = x_k(t-\tau_k)$, $y_k^\tau = y_k(t-\tau_k)$. With this notation, separating (1) into real and imaginary parts gives the equivalent real-valued system
$$\dot{x} = -Dx + A^R f^R(x, y) - A^I f^I(x, y) + B^R g^R(x^\tau, y^\tau) - B^I g^I(x^\tau, y^\tau) + u^R,$$
$$\dot{y} = -Dy + A^I f^R(x, y) + A^R f^I(x, y) + B^I g^R(x^\tau, y^\tau) + B^R g^I(x^\tau, y^\tau) + u^I.$$
Let $K^{RR} = \mathrm{diag}(\lambda_1^{RR}, \lambda_2^{RR}, \cdots, \lambda_n^{RR})$, $K^{RI} = \mathrm{diag}(\lambda_1^{RI}, \lambda_2^{RI}, \cdots, \lambda_n^{RI})$, $K^{IR} = \mathrm{diag}(\lambda_1^{IR}, \lambda_2^{IR}, \cdots, \lambda_n^{IR})$, $K^{II} = \mathrm{diag}(\lambda_1^{II}, \lambda_2^{II}, \cdots, \lambda_n^{II})$, $L^{RR} = \mathrm{diag}(\mu_1^{RR}, \mu_2^{RR}, \cdots, \mu_n^{RR})$, $L^{RI} = \mathrm{diag}(\mu_1^{RI}, \mu_2^{RI}, \cdots, \mu_n^{RI})$, $L^{IR} = \mathrm{diag}(\mu_1^{IR}, \mu_2^{IR}, \cdots, \mu_n^{IR})$, $L^{II} = \mathrm{diag}(\mu_1^{II}, \mu_2^{II}, \cdots, \mu_n^{II})$.
Theorem 1 Suppose the activation functions of system (1) satisfy Assumption 1. System (1) has a unique equilibrium point, which is globally exponentially stable, if $\bar{D} - \bar{A}\bar{K} - \bar{B}\bar{L}$ is a nonsingular $M$-matrix, where
$$\bar{D} = \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}, \quad \bar{A} = \begin{pmatrix} |A^R| & |A^I| \\ |A^I| & |A^R| \end{pmatrix}, \quad \bar{K} = \begin{pmatrix} K^{RR} & K^{RI} \\ K^{IR} & K^{II} \end{pmatrix},$$
$$\bar{B} = \begin{pmatrix} |B^R| & |B^I| \\ |B^I| & |B^R| \end{pmatrix}, \quad \bar{L} = \begin{pmatrix} L^{RR} & L^{RI} \\ L^{IR} & L^{II} \end{pmatrix},$$
with $|A^R| = (|a_{jk}^R|)_{n\times n}$, $|A^I| = (|a_{jk}^I|)_{n\times n}$, $|B^R| = (|b_{jk}^R|)_{n\times n}$, $|B^I| = (|b_{jk}^I|)_{n\times n}$.
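A nonsingular $M$-matrix is a Z-matrix (nonpositive off-diagonal entries) whose eigenvalues all have positive real parts, so the condition of Theorem 1 can be verified numerically. The sketch below does so for made-up matrices; all numbers, and the choice $\bar{K} = \bar{L} = I$ (a split activation with unit derivative bounds and zero cross terms), are illustrative assumptions.

```python
import numpy as np

def is_nonsingular_M_matrix(S, tol=1e-10):
    """Z-matrix whose eigenvalues all have positive real parts."""
    off_diag = S - np.diag(np.diag(S))
    if (off_diag > tol).any():          # must be a Z-matrix
        return False
    return np.linalg.eigvals(S).real.min() > tol

# Illustrative data (n = 2), not from the paper
D  = np.diag([3.0, 3.0])
AR = np.array([[0.2, -0.1], [0.1, 0.3]]); AI = np.array([[0.1, 0.2], [-0.2, 0.1]])
BR = np.array([[0.1, 0.1], [0.0, 0.2]]); BI = np.array([[0.2, 0.0], [0.1, 0.1]])
K  = np.eye(4)   # assumed: lambda^{RR} = lambda^{II} = 1, cross terms 0
L  = np.eye(4)   # assumed: mu^{RR} = mu^{II} = 1, cross terms 0

Dbar = np.block([[D, np.zeros((2, 2))], [np.zeros((2, 2)), D]])
Abar = np.block([[np.abs(AR), np.abs(AI)], [np.abs(AI), np.abs(AR)]])
Bbar = np.block([[np.abs(BR), np.abs(BI)], [np.abs(BI), np.abs(BR)]])

S = Dbar - Abar @ K - Bbar @ L
print("Theorem 1 condition holds:", is_nonsingular_M_matrix(S))
```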
If the condition in Theorem 1 does not hold, bounded activation functions are needed to guarantee the existence of equilibrium points. Under this boundedness condition, we have the following two theorems.
Theorem 2 Suppose the activation functions of system (1) are bounded and satisfy Assumption 1. System (1) is globally asymptotically stable if there exists a positive definite matrix $P$ such that the following linear matrix inequality holds:
$$\begin{pmatrix} DP + PD & Q \\ Q^T & I_{5n} \end{pmatrix} > 0 \qquad (4)$$
where $Q = [\,2P \;\; \sqrt{2\varepsilon}A^R \;\; \sqrt{2\varepsilon}A^I \;\; \sqrt{2\theta}B^R \;\; \sqrt{2\theta}B^I\,]$, $I_{5n}$ is the $5n \times 5n$ identity matrix, $\varepsilon = \max\{\varepsilon_1, \varepsilon_2\}$, $\varepsilon_1 = \max_{1\le j\le n}\{(\lambda_j^{RR})^2 + (\lambda_j^{IR})^2\}$, $\varepsilon_2 = \max_{1\le j\le n}\{(\lambda_j^{RI})^2 + (\lambda_j^{II})^2\}$, $\theta = \max\{\theta_1, \theta_2\}$, $\theta_1 = \max_{1\le j\le n}\{(\mu_j^{RR})^2 + (\mu_j^{IR})^2\}$, $\theta_2 = \max_{1\le j\le n}\{(\mu_j^{RI})^2 + (\mu_j^{II})^2\}$, and "$> 0$" means that the matrix is positive definite.
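LMI (4) can be checked with a semidefinite programming solver. Below is a minimal feasibility sketch using CVXPY (an assumed tooling choice, not prescribed by the paper), with made-up weight matrices and bound constants; a small margin delta stands in for strict positive definiteness.

```python
import numpy as np
import cvxpy as cp

n = 2
rng = np.random.default_rng(3)
D = np.diag([3.0, 3.0])
AR, AI = 0.2 * rng.standard_normal((2, n, n))
BR, BI = 0.2 * rng.standard_normal((2, n, n))
eps, theta = 1.0, 1.0   # assumed bound constants from Assumption 1

P = cp.Variable((n, n), symmetric=True)
Q = cp.hstack([2 * P,
               np.sqrt(2 * eps) * AR, np.sqrt(2 * eps) * AI,
               np.sqrt(2 * theta) * BR, np.sqrt(2 * theta) * BI])
M = cp.bmat([[D @ P + P @ D, Q],
             [Q.T, np.eye(5 * n)]])
M = (M + M.T) / 2       # symmetrize for the PSD constraint
delta = 1e-6
prob = cp.Problem(cp.Minimize(0),
                  [P >> delta * np.eye(n), M >> delta * np.eye(6 * n)])
prob.solve()
print("LMI (4) feasible:", prob.status == cp.OPTIMAL)
```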
Activation functions satisfying Assumption 2 have different properties from those satisfying Assumption 1. Since they satisfy the Lipschitz continuity condition as a whole, without separation into real and imaginary parts, we also treat the system as a whole and obtain the following theorem:
Theorem 3 Suppose the activation functions of system (1) are bounded and satisfy Assumption 2, and the Lipschitz constants for $g_j(\cdot)$ are $\kappa_j$. System (1) is globally asymptotically stable if there exists a positive definite matrix $P$ such that the following linear matrix inequality holds:
$$\begin{pmatrix} DP + PD & W \\ W^T & I_{4n} \end{pmatrix} > 0 \qquad (5)$$
where $W = [\,P|A| \;\; P|B| \;\; J \;\; M\,]$, $I_{4n}$ is the $4n \times 4n$ identity matrix, $|A| = (|a_{jk}|)_{n\times n}$, $|B| = (|b_{jk}|)_{n\times n}$, $J = \mathrm{diag}(\xi_1, \xi_2, \cdots, \xi_n)$, $M = \mathrm{diag}(\kappa_1, \kappa_2, \cdots, \kappa_n)$, and "$> 0$" means that the matrix is positive definite.
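For a fixed candidate $P$, LMI (5) can be checked directly via the Schur complement: (5) holds if and only if $DP + PD - WW^T \succ 0$. A minimal sketch with made-up data and the candidate $P = I$:

```python
import numpy as np

# Illustrative data, not from the paper
n = 2
D = np.diag([4.0, 4.0])
A = np.array([[0.2 + 0.1j, -0.1], [0.1j, 0.3]])
B = np.array([[0.1, 0.1 - 0.2j], [0.0, 0.2j]])
J = np.eye(n)          # assumed Lipschitz constants xi_j = 1
M = np.eye(n)          # assumed Lipschitz constants kappa_j = 1

P = np.eye(n)          # candidate P (in practice found by an SDP solver)
W = np.hstack([P @ np.abs(A), P @ np.abs(B), J, M])
# Schur complement of I_{4n} in LMI (5): positive definite iff (5) holds
S = D @ P + P @ D - W @ W.T
print("LMI (5) holds for this P:", np.linalg.eigvalsh(S).min() > 0)
```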
4 Concluding Remarks
In this paper, the global stability of continuous-time complex-valued recurrent neural networks with two classes of activation functions is analyzed. New sufficient conditions, in the form of an $M$-matrix condition and linear matrix inequalities, are derived for ascertaining the existence and uniqueness of equilibrium points and their global exponential or asymptotic stability. The theoretical results herein are viable for the design and application of complex-valued neural networks.
References
[1] M. Bohner, V. Sree Hari Rao, and S. Sanyal, "Global stability of complex-valued neural networks on time scales," Differential Equations and Dynamical Systems, vol. 19, pp. 1–9, 2011.
[2] S. L. Goh and D. P. Mandic, "A complex-valued RTRL algorithm for recurrent neural networks," Neural Computation, vol. 16, pp. 2699–2713, December 2004.
[3] A. Hirose, Complex-valued Neural Networks: Theories and Applications, World Scientific, 2003.
[4] Y. Kuroe, N. Hashimoto, and T. Mori, "On energy function for complex-valued neural networks and its applications," in Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), vol. 3, 2002, pp. 1079–1083.