IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-20, NO. 4, JULY 1974
On the ε-Entropy and the Rate-Distortion Function of Certain Non-Gaussian Processes

JACOB BINIA, MOSHE ZAKAI, FELLOW, IEEE, AND JACOB ZIV, FELLOW, IEEE

Manuscript received June 5, 1973; revised January 2, 1974. J. Binia is with the Armament Development Authority, Ministry of Defense, Israel. M. Zakai and J. Ziv are with the Faculty of Engineering, Technion-Israel Institute of Technology, Israel.
Abstract--Let ξ = {ξ(t), 0 ≤ t ≤ T} be a process with covariance function K(s,t) and E ∫₀ᵀ ξ²(t) dt < ∞. It is proved that for every ε > 0 the ε-entropy H_ε(ξ) satisfies

H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) ≤ H_ε(ξ) ≤ H_ε(ξ_g)

where ξ_g is a Gaussian process with the covariance K(s,t) and 𝓗_{ξ_g}(ξ) is the entropy of the measure induced by ξ (in function space) with respect to that induced by ξ_g. It is also shown that if 𝓗_{ξ_g}(ξ) < ∞, then, as ε → 0,

H_ε(ξ) = H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) + o(1).

Furthermore, if there exists a Gaussian process g = {g(t); 0 ≤ t ≤ T} such that 𝓗_g(ξ) < ∞, then the ratio between H_ε(ξ) and H_ε(g) goes to one as ε goes to zero. Similar results are given for the rate-distortion function, and some particular examples are worked out in detail. Some cases for which 𝓗_{ξ_g}(ξ) = ∞ are discussed, and asymptotic bounds on H_ε(ξ), expressed in terms of H_ε(ξ_g), are derived.
I. INTRODUCTION
HE e-ENTROPY (and its related normalized form,
the rate-distortion function) provides an important
mathematical tool for the analysis of communication
sources and systems. G iven a communication system, the
s-entropy of the source and the channel capacity yield a
lower bound on the m inimum attainable distortion [l].
Let 5 be a real-valued random variable and denote by
H,(c) the s-entropy (rate-distortion function) of 5, relative
to a mean-square-error criterion. A well-known result of
Shannon [2], [3] states that, if the probability distribution
of 5 possessesa density, then
T
1
h(5) + $ In -2nee2 5 K(t)
tT2
5 3 ln ~z
(1)
where r? denotes the variance of 5 and h(5) = -f P&C) *
In P&X) dx. Furthermore, it was shown by Gerrish and
Schultheiss [4] that as E -+ 0
f&(t) = ME)+ 3 ln &
+ 41).
Results of the type (1) and (2) have been extended for
N-dimensional random variables, provided that the random
variables possessa probability density [l], [4], and [5].
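As a numerical illustration of the scalar bounds (1), both sides can be evaluated in closed form for simple non-Gaussian variables. The short Python sketch below is only illustrative (the unit-variance Laplace example and the variable names are choices made here, not taken from the text); it tabulates the two sides of (1) in nats. Note that their gap is independent of ε and equals the relative entropy between ξ and a Gaussian variable of the same variance, the scalar analogue of the quantity 𝓗_{ξ_g}(ξ) appearing in (3) below.

```python
import numpy as np

# Scalar Shannon bounds (1) for a unit-variance Laplace variable, in nats:
#   h(xi) + 0.5*ln(1/(2*pi*e*eps^2)) <= H_eps(xi) <= 0.5*ln(sigma^2/eps^2)
sigma2 = 1.0
b = np.sqrt(sigma2 / 2.0)          # Laplace scale; variance = 2*b^2 = sigma2
h = 1.0 + np.log(2.0 * b)          # differential entropy of the Laplace density

for eps in [0.5, 0.1, 0.01]:
    lower = h + 0.5 * np.log(1.0 / (2.0 * np.pi * np.e * eps ** 2))
    upper = 0.5 * np.log(sigma2 / eps ** 2)
    print(f"eps={eps:5.2f}  lower={lower:7.3f}  upper={upper:7.3f}  gap={upper - lower:6.3f}")
```

The constant gap printed here (about 0.072 nats) is exactly (1/2) ln (2πeσ²) − h(ξ), i.e., the relative entropy of the Laplace variable with respect to a Gaussian of the same variance.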
The purpose of this paper is to derive results of the type (1) and (2) for random processes. A direct extension of (1) and (2) to random processes or to infinite-dimensional random variables is impossible, since the results are based on the existence of a probability density; namely, the probability measure of ξ is required to be absolutely continuous with respect to the Lebesgue measure. In this paper we replace this requirement by a requirement of absolute continuity with respect to a Gaussian measure. This enables us to prove the following bounds on the ε-entropy of a random process ξ = {ξ(t), 0 ≤ t ≤ T}:

H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) ≤ H_ε(ξ) ≤ H_ε(ξ_g)   (3)

where ξ_g is the Gaussian process with the same covariance as that of ξ and 𝓗_{ξ_g}(ξ) is the relative entropy of the measure induced by ξ with respect to that induced by ξ_g (cf. Section II). Furthermore, if 𝓗_{ξ_g}(ξ) < ∞, then we show that for ε → 0

H_ε(ξ) = H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) + o(1).   (4)

In fact, for a finite-dimensional random variable ξ, the preceding lower bound on H_ε(ξ), as well as the asymptotic behavior (4), are stronger than previously known results (the upper bound on H_ε(ξ), for N-dimensional random variables, is already known [1, sec. 4.6.2]).

The results (3) and (4) on H_ε(ξ) are expressed in terms of the ε-entropy of a related Gaussian process, H_ε(ξ_g), and the entropy of the measure of ξ with respect to the measure of the related Gaussian process ξ_g. Results on the ε-entropy of Gaussian processes are available in the literature [1], [6]-[10]. The entropy 𝓗_g(ξ) of ξ with respect to some Gaussian process g can be derived from known results on the Radon-Nikodym derivative of certain processes with respect to certain Gaussian processes [11]-[13]. It is shown that if there exists a Gaussian process g for which 𝓗_g(ξ) < ∞, then 𝓗_{ξ_g}(ξ) < ∞. Therefore the bounds (3) also yield asymptotic results, for ε → 0. Several examples are given, and in one of these examples we show that if dξ(t) = ζ(t) dt + β dw(t) (where w(t) is a standard Brownian motion), then as ε → 0

H_ε(ξ) ≈ 2T²β²/(π²ε²)

which is a generalization of previously known results [9], [10].

Section II is devoted to notation and certain preliminary results. The main results and examples are given in Section III, and their proofs are given in Section IV.
II. PRELIMINARIES AND NOTATION

Let P₁ and P₂ be two probability measures defined on a measurable space (Ω,B) and let {E_i} be a finite B-measurable partition of Ω. The entropy 𝓗_{P₂}(P₁) of P₁ with respect to P₂ is defined by [14, ch. 2]

𝓗_{P₂}(P₁) = sup Σ_i P₁(E_i) ln [P₁(E_i)/P₂(E_i)]

where the supremum is taken over all partitions of Ω. Let P₁ and P₂ be the distributions of the random variables ξ and η (taking values in the same space (X,B_X)), respectively. In this case the notation for the entropy will be

𝓗_η(ξ) = 𝓗_{P_η}(P_ξ) = 𝓗_{P₂}(P₁).

Let ξ and η be random variables with values in the measurable spaces (X,B_X) and (Y,B_Y), respectively. The mutual information I(ξ,η) between these variables is defined as [14, ch. 2]

I(ξ,η) = sup Σ_{i,j} P_{ξη}(E_i × F_j) ln [P_{ξη}(E_i × F_j)/(P_ξ(E_i) P_η(F_j))]

where the supremum is taken over all partitions {E_i} of X and {F_j} of Y.

The ε-entropy H_ε(ξ) of a random variable ξ taking values in a metric space (X,B_X) with a metric ρ(x,y), x,y ∈ X, is defined as

H_ε(ξ) = inf I(ξ,η)

where the infimum is taken over all joint distributions P_{ξη} (defined on (X × X, B_X × B_X)) with a fixed marginal distribution P_ξ, such that

E ρ²(ξ,η) ≤ ε².

The following lemma shows that the ε-entropy is independent of the representation of a process ξ in different isometric spaces.

Lemma 2.1: Let A be a linear, one-to-one, and bimeasurable transformation from the metric space (X,B_X) onto the metric space (X̃,B_X̃), and let P_ξ be the distribution of ξ in (X,B_X). We define the distribution P_ξ̃ of ξ̃ by P_ξ̃(Ẽ) = P_ξ(E), where E = A⁻¹(Ẽ), Ẽ ∈ B_X̃. If the metrics ρ and ρ̃, defined in X and X̃, respectively, satisfy ρ(x,y) = ρ̃(Ax,Ay) for all x,y ∈ X, then for every ε > 0

H_ε(ξ̃) = H_ε(ξ).

The proof of Lemma 2.1 follows directly from the transformation properties, since such a transformation preserves the amount of information [15].

In particular, let ξ = {ξ(t), 0 ≤ t ≤ T} be a stochastic process with E ∫₀ᵀ ξ²(t) dt < ∞. Let {φ_i} be a complete orthonormal system in L²[0,T], and consider the following transformation A from L²[0,T] to l²:

A: x(t) → (x₁,x₂,···),   x(t) ∈ L²[0,T]
x_i = ∫₀ᵀ x(t)φ_i(t) dt,   i = 1,2,···.

Then we have, for every ε > 0,

H_ε(ξ) = H_ε((ξ₁,ξ₂,···)),   ξ_i = ∫₀ᵀ ξ(t)φ_i(t) dt.

Remarks: a) From now on, the discussion is limited to the metrics of L², l². Although the results will be stated for processes in L²[0,T], they hold for all processes defined in spaces isometric to L²[0,T]. b) The random function ξ(t), 0 ≤ t ≤ T, will be assumed to be measurable, separable, and satisfying E ∫₀ᵀ ξ²(t) dt < ∞. Also, since the ε-entropy is independent of the process mean, we always assume Eξ(t) = 0.

Lemma 2.2: Let ξ, ξ_n, n = 1,2,··· be a sequence of random processes such that

lim_{n→∞} E ∫₀ᵀ [ξ(t) − ξ_n(t)]² dt = 0.   (5)

Then for every ε > 0

lim_{n→∞} H_ε(ξ_n) = H_ε(ξ).   (6)

Proof: We first show that if ξ₁ and ξ₂ are random processes such that

E ∫₀ᵀ [ξ₁(t) − ξ₂(t)]² dt ≤ θ²   (7)

then

H_{ε+θ}(ξ₂) ≤ H_ε(ξ₁) ≤ H_{ε−θ}(ξ₂),   ε > θ > 0.   (8)

By the definition of the ε-entropy, there exists a process η such that

E ∫₀ᵀ [ξ₁(t) − η(t)]² dt ≤ ε²   (9)

and

H_ε(ξ₁) ≥ I(ξ₁,η) − δ   (10)

where δ > 0 is arbitrary. Note that we can choose a version of the preceding process η such that η and ξ₂ are conditionally independent, conditioned on ξ₁; therefore [14, sect. 3.4]

I(ξ₂,η) ≤ I((ξ₂,ξ₁),η) = I(ξ₁,η) + I(ξ₂,η | ξ₁) = I(ξ₁,η).   (11)

Furthermore, by (7) and (9)

E ∫₀ᵀ [ξ₂(t) − η(t)]² dt ≤ (ε + θ)².   (12)

Therefore, by (10)-(12),

H_{ε+θ}(ξ₂) ≤ I(ξ₂,η) ≤ I(ξ₁,η) ≤ H_ε(ξ₁) + δ.

This completes the proof of the left side of (8); the right side of (8) follows by a similar argument. Equation (6) follows from (5), (8), and the continuity of H_ε(ξ) as a function of ε.

Remark: Let ξ = {ξ(t), 0 ≤ t ≤ T} be a process with E ∫₀ᵀ ξ²(t) dt < ∞. Suppose that ξ is passed through a linear filter with impulse response h(τ) given by

h(τ) = 1/δ, 0 < τ ≤ δ;   h(τ) = 0, otherwise.

Denote by ξ_δ the filter output in [0,T]. Then obviously ξ_δ is a quadratic-mean-continuous process and

lim_{δ→0} E ∫₀ᵀ [ξ(t) − ξ_δ(t)]² dt = 0.

This result will allow us to assume throughout the proofs that ξ is mean-continuous; by Lemma 2.2 the results remain valid without this assumption.

Finally, the rate-distortion function R_ξ(D) of a process ξ = {ξ(t), 0 ≤ t < ∞} is defined as ([1], [16])

R_ξ(D) = lim_{T→∞} (1/T) H_ε(ξ₀ᵀ),   ε² = TD

where ξ₀ᵀ = {ξ(t), 0 ≤ t ≤ T}. Pinsker [17] proved, under some general conditions, that the rate-distortion function R_ξ(D) of a stationary process ξ is well defined and finite for D > 0.

The importance of the rate-distortion function in information theory stems from the following facts: 1) it provides general lower bounds on the performance of communication systems, and 2) in some cases a coding theorem holds to the effect that the performance bounds derived from the rate-distortion function can be approached asymptotically by using appropriate delays [1].
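Throughout the remaining sections the ε-entropy of the Gaussian process ξ_g is evaluated by the familiar reverse water-filling formula over its Karhunen-Loeve eigenvalues; it reappears below as (36)-(38), (44), and (57)-(58). The following minimal Python sketch is only an illustration of that computation (the function name, the bisection scheme, and the sample eigenvalues are choices made here):

```python
import numpy as np

def gaussian_eps_entropy(eigenvalues, eps):
    """H_eps (in nats) of a zero-mean Gaussian vector with independent
    components of variances `eigenvalues`, total squared-error distortion eps**2.
    Reverse water-filling: pick theta with sum_i min(theta, lam_i) = eps**2,
    then H = sum over lam_i > theta of 0.5*ln(lam_i/theta)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    d2 = eps ** 2
    assert 0.0 < d2 < lam.sum(), "distortion must lie strictly between 0 and the total variance"
    lo, hi = 0.0, lam[0]
    for _ in range(200):                       # bisect for the water level theta
        theta = 0.5 * (lo + hi)
        if np.minimum(theta, lam).sum() > d2:
            hi = theta
        else:
            lo = theta
    theta = 0.5 * (lo + hi)
    active = lam > theta
    return 0.5 * np.log(lam[active] / theta).sum()

print(gaussian_eps_entropy([1.0, 0.5, 0.25, 0.125], 0.3))
```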
III. MAIN RESULTS AND EXAMPLES

Let ξ = {ξ(t), 0 ≤ t ≤ T} be a random process with a covariance function K_ξ(s,t), 0 ≤ s,t ≤ T. Let ξ_g be a Gaussian process with the same covariance function. Let P_ξ and P_{ξ_g} be the measures induced by ξ and ξ_g, respectively, in function space (namely, the real-valued, Borel-measurable space over [0,T]), and denote by 𝓗_{ξ_g}(ξ) the entropy of P_ξ with respect to P_{ξ_g}. The following theorem extends the upper and lower bounds (Shannon, [2], [3]) on the ε-entropy of a one-dimensional random variable to the case of a random process.

Theorem 1: For every ε > 0

H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) ≤ H_ε(ξ) ≤ H_ε(ξ_g).   (13)

If 𝓗_{ξ_g}(ξ) < ∞, then as ε → 0

H_ε(ξ) = H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) + o(1).   (14)

Theorem 1 relates the ε-entropy of a process ξ to the ε-entropy of ξ_g and to 𝓗_{ξ_g}(ξ). In the literature the absolute continuity of some non-Gaussian processes with respect to Gaussian measures is discussed; furthermore, the ε-entropy of these Gaussian processes is known for ε → 0 ([1], [6]-[10]). Therefore, in the following we try to express H_ε(ξ) in terms of H_ε(g) and 𝓗_g(ξ), where g is any given Gaussian process for which 𝓗_g(ξ) < ∞. We start with a lemma which enables us to replace 𝓗_{ξ_g}(ξ) with 𝓗_g(ξ) in the lower bound on H_ε(ξ).

Lemma 3.1: Let g = {g(t), 0 ≤ t ≤ T} be a Gaussian process. If

𝓗_g(ξ) < ∞   (15)

then ξ_g and g are equivalent Gaussian processes (i.e., the measures P_{ξ_g} and P_g are mutually absolutely continuous) and

𝓗_g(ξ) = 𝓗_{ξ_g}(ξ) + 𝓗_g(ξ_g).   (16)

Note that since the entropy is always nonnegative [14], Lemma 3.1 implies

𝓗_{ξ_g}(ξ) ≤ 𝓗_g(ξ).   (17)
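Coordinatewise, the decomposition (16) is the familiar Pythagorean identity for relative entropy with respect to Gaussian reference measures: for a scalar variable ξ of variance λ, D(ξ ‖ N(0,μ)) = D(ξ ‖ N(0,λ)) + D(N(0,λ) ‖ N(0,μ)). The small Python check below is only an illustration of that identity (the centered exponential playing the role of ξ and all parameter values are choices made here); the same mechanism is what the quadratic-form argument in the proof of Lemma 3.1 (Section IV) exploits.

```python
import numpy as np

# Scalar analogue of (16): D(xi || N(0,mu)) = D(xi || N(0,lam)) + D(N(0,lam) || N(0,mu)),
# where lam is the variance of xi.  Take xi = centered exponential with variance lam.
lam, mu = 2.0, 5.0
b = np.sqrt(lam)                                    # exponential scale; variance = b^2 = lam
h_xi = 1.0 + np.log(b)                              # differential entropy of Exp(b), in nats

# D(xi || N(0,s2)) = -h(xi) + E[x^2]/(2 s2) + 0.5*ln(2*pi*s2); here E[x^2] = lam.
D_xi_mu  = -h_xi + lam / (2.0 * mu)  + 0.5 * np.log(2.0 * np.pi * mu)
D_xi_lam = -h_xi + lam / (2.0 * lam) + 0.5 * np.log(2.0 * np.pi * lam)
D_lam_mu = 0.5 * (lam / mu - 1.0 - np.log(lam / mu))

print(D_xi_mu, D_xi_lam + D_lam_mu)                 # the two numbers agree
```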
In the case where there exists a Gaussian process g such that 𝓗_g(ξ) < ∞, we use Lemma 3.1 and the following known results about the behavior of the two equivalent Gaussian processes ξ_g and g.

a) For equivalent Gaussian processes ξ_g and g, it has been proved under some general conditions [8] that

H_ε(ξ_g) ≈ H_ε(g),   ε → 0.

That is, H_ε(ξ_g) and H_ε(g) are "asymptotically equal," in the sense that their ratio goes to one as ε goes to zero.

b) Suppose that ξ_g and g are also strongly equivalent. (Denote by K(s,t), 0 ≤ s,t ≤ T, the covariance function of a process ξ and consider the integral operator K defined on the space L²[0,T] by

Kf(t) = ∫₀ᵀ K(t,s) f(s) ds,   f ∈ L²[0,T], t ∈ [0,T].

Two Gaussian measures P₁ and P₂, with covariance functions K₁(s,t) and K₂(s,t), respectively, are defined to be strongly equivalent if the operator

K = I − K₂^{−1/2} K₁ K₂^{−1/2}

is of trace class, namely if Σ_i |λ_i| < ∞, where {λ_i} are the eigenvalues of K [18].) Then it has been shown [10] that

H_ε(ξ_g) = H_ε(g) + o(1),   ε → 0

which is tighter than the preceding asymptotic equality.

Hence, the following theorem follows from Theorem 1, Lemma 3.1, and the asymptotic behavior of the ε-entropies of equivalent and strongly equivalent Gaussian processes.

Theorem 2: If there exists a Gaussian process g such that 𝓗_g(ξ) < ∞, then

H_ε(ξ) ≈ H_ε(g),   ε → 0.   (18)

If, in addition, the Gaussian processes ξ_g and g are strongly equivalent, then

H_ε(ξ) = H_ε(g) + o(1),   ε → 0.   (19)

Example 1: Let ξ be the solution of the stochastic equation

ξ(t) = ξ(0) + ∫₀ᵗ a[ξ(s)] ds + βw(t),   0 ≤ t ≤ T   (20)

or, more generally, let ζ = {ζ(t); 0 ≤ t ≤ T} be a random process with E ∫₀ᵀ ζ²(t) dt < ∞, and let ξ be defined by

ξ(t) = ξ(0) + ∫₀ᵗ ζ(s) ds + βw(t),   0 ≤ t ≤ T   (21)

where w is a standard Brownian motion. We assume that either ξ(0) = 0 or ξ(0) is a random variable (independent of ζ and w) with a density function such that Eξ²(0) < ∞, and that future increments {w(t) − w(s)}, 0 ≤ s < t ≤ T, are independent of the σ-field generated by the past of ζ and w. Then, as shown in Section IV, the asymptotic behavior of H_ε(ξ) is given by

H_ε(ξ) ≈ 2T²β²/(π²ε²),   ε → 0.   (22)

(Here f(ε) ≈ g(ε) means lim_{ε→0} f(ε)/g(ε) = 1.)

Example 2: Let the process ξ_m be defined by

ξ_m(t) = ∫₀ᵗ ξ_{m−1}(s) ds,   0 < t ≤ T,   m = 2,3,···   (23)

where ξ₁ stands for the process ξ in Example 1. Then (Section IV), as ε → 0,

H_ε(ξ_m) ≈ m (2m)^{1/(2m−1)} (2m − 1)^{−1/(2m−1)} (T/π)^{2m/(2m−1)} β^{2/(2m−1)} ε^{−2/(2m−1)}.   (24)

The following theorem relates the rate-distortion function R_ξ(D) of a process ξ = {ξ(t), 0 ≤ t < ∞} to the rate-distortion function of the Gaussian process ξ_g and to the entropy rates 𝓗̄_g(ξ) and 𝓗̄_{ξ_g}(ξ), where

𝓗̄_g(ξ) = lim_{T→∞} (1/T) 𝓗_{g₀ᵀ}(ξ₀ᵀ),   ξ₀ᵀ = {ξ(t), 0 ≤ t ≤ T}.

Its proof follows directly from (17).

Theorem 3: Let ξ = {ξ(t), 0 ≤ t < ∞} be a random process. If there exists a Gaussian process g = {g(t), 0 ≤ t < ∞} such that 𝓗̄_g(ξ) < ∞, then for every D > 0

R_{ξ_g}(D) − 𝓗̄_g(ξ) ≤ R_{ξ_g}(D) − 𝓗̄_{ξ_g}(ξ) ≤ R_ξ(D) ≤ R_{ξ_g}(D).   (25)

Example 3: Let us assume that there exists a stationary solution ξ of the stochastic equation (20). Suppose that ξ is passed through a linear filter with frequency characteristic G(jω), and denote the filter output by η. If

|G(jω)|² ≈ cω^{−2r},   |ω| → ∞   (26)

where c and r are positive constants, then (Section IV), as D → 0,

R_η(D) ≈ (1/2) ((2r + 2)/π)^{(2r+2)/(2r+1)} (2r + 1)^{−1/(2r+1)} (β²c)^{1/(2r+1)} (1/D)^{1/(2r+1)}.   (27)

The ε-Entropy of Certain Processes with 𝓗_{ξ_g}(ξ) = ∞

In the following we consider mean-continuous processes ξ such that 𝓗_{ξ_g}(ξ) = ∞. In this case there does not exist any Gaussian process g for which 𝓗_g(ξ) < ∞, and therefore the previous lower bounds on H_ε(ξ) are meaningless.

Let K_ξ(s,t), the covariance function of ξ and of ξ_g, be continuous in [0,T] × [0,T], and denote by {φ_i} and {λ_i} the eigenfunctions and the eigenvalues of K_ξ. We assume that the eigenvalues are arranged in nonincreasing order. The variables (ξ₁,ξ₂,···) and (ξ_{g1},ξ_{g2},···) are defined through {φ_i(t)} by

ξ_i = ∫₀ᵀ ξ(t)φ_i(t) dt,   ξ_{gi} = ∫₀ᵀ ξ_g(t)φ_i(t) dt,   i = 1,2,···.

Theorem 4: Let ξ be a mean-continuous process with E ∫₀ᵀ ξ²(t) dt < ∞.

a) If

𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = o(L),   L → ∞   (28)

and if the eigenvalues of the process satisfy

lim_{n→∞} (1/n) Σ_{i=1}^{n−1} i ln (λ_i/λ_{i+1}) > 0   (29)

then H_ε(ξ) ≈ H_ε(ξ_g) as ε → 0.

b) Suppose that 𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = O(L):

I) if lim_{n→∞} (1/n) Σ_{i=1}^{n−1} i ln (λ_i/λ_{i+1}) = ∞, then H_ε(ξ) ≈ H_ε(ξ_g);

II) if λ_i ≈ β/iᵃ, where a > 1 and β are fixed, then there exists a constant k > 0 such that H_ε(ξ) ≥ kH_ε(ξ_g).

Remarks: 1) A sufficient condition for (29) is that the rate of decrease of the eigenvalues is greater than that of the sequence {1/iᵃ} for some a > 0. The condition of b) I) is fulfilled if the rate of decrease of the eigenvalues is greater than that of {1/iᵃ} for every a > 0.

2) Note that the set of mean-continuous processes satisfying 𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = o(L) and 𝓗_{ξ_g}(ξ) = ∞ is not empty. As an example let us define (ξ₁,ξ₂,···) as follows. The variables ξ_i, i = 1,2,··· are statistically independent, with Eξ_i² = λ_i, where {λ_i} is an infinite set of summable positive numbers. Also, ξ_i, i = 1,10,10²,10³,··· are distributed according to the two-sided exponential density

p_i(x) = (2λ_i)^{−1/2} exp (−√2 |x|/√λ_i),   i = 1,10,10²,···   (30)

and all other ξ_i are Gaussian. Then a simple calculation yields

𝓗_{ξ_{gi}}(ξ_i) = (1/2) ln 2πeλ_i − h(ξ_i)

which, for the indices i = 1,10,10²,···, is a positive constant independent of λ_i (equal to (1/2) ln (π/e)), and therefore in this case

𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = Σ_{i=1}^{L} 𝓗_{ξ_{gi}}(ξ_i) = o(L) → ∞,   L → ∞.

3) Part II of b) is a generalization of a result of Kazi [19].
IV. PROOF OF RESULTS

Proof of Theorem 1: By the remark to Lemma 2.2 we may assume, without loss of generality, that ξ is a mean-continuous process. Let ξ_i, i = 1,2,··· be the Karhunen-Loeve expansion coefficients of ξ, and let the eigenvalues λ_i = Eξ_i², i = 1,2,··· be arranged in nonincreasing order. From Lemmas 2.1 and 2.2

H_ε(ξ) = lim_{L→∞} H_ε(ξ₁,···,ξ_L).   (31)

Therefore, we have to prove the lower bound in (13) only for the finite-dimensional random variable ξ = (ξ₁,···,ξ_L), where Eξ_iξ_j = 0 for i ≠ j. If 𝓗_{ξ_g}(ξ) = ∞, the lower bound is trivial; therefore in the following proof of the left side of (13) we assume that

𝓗_{ξ_g}(ξ) < ∞.   (32)

The ε-entropy of (ξ₁,···,ξ_L) is defined by inf I((ξ₁,···,ξ_L),(η₁,···,η_L)), where the infimum is taken over all pairs (ξ,η) with distortion vectors (ε₁²,···,ε_L²) such that Σ_{i=1}^L ε_i² ≤ ε². Note that we may restrict the family of distortion vectors by the additional condition ε_i² ≤ λ_i for all i. Otherwise, if (η₁,···,η_L) is such that ε_j² > λ_j, even for one j, we have [14, eq. 2.2.7]

I((ξ₁,···,ξ_j,···,ξ_L),(η₁,···,η_j,···,η_L)) ≥ I((ξ₁,···,ξ_{j−1},ξ_{j+1},···,ξ_L),(η₁,···,η_{j−1},η_{j+1},···,η_L))

and therefore by neglecting the jth component we can decrease the amount of information and still keep the overall distortion less than ε². Therefore

H_ε(ξ₁,···,ξ_L) = inf I((ξ₁,···,ξ_L),(η₁,···,η_L))

where the infimum is taken over all pairs (ξ,η) with distortion vectors (ε₁²,···,ε_L²) satisfying

Σ_{i=1}^L ε_i² ≤ ε²   and   ε_i² ≤ λ_i.   (33)

By (32) and [14, theorem 2.4.2], the variable (ξ₁,···,ξ_L) has a well-defined density function. Since for any η the pair (ξ,η) can be approximated, with an arbitrary degree of "accuracy," by another pair of random variables whose distribution is absolutely continuous (see [7]), it is sufficient to restrict ourselves to those pairs which have a joint density function. Now, for a given distortion vector we use a known lower bound on the information between ξ and η [4],

I((ξ₁,···,ξ_L),(η₁,···,η_L)) ≥ h(ξ₁,···,ξ_L) − Σ_{i=1}^L (1/2) ln 2πeε_i²   (34)

where

h(ξ₁,···,ξ_L) = −∫ p_ξ(x₁,···,x_L) ln p_ξ(x₁,···,x_L) dx₁ ··· dx_L.

Then we also have

H_ε(ξ₁,···,ξ_L) ≥ inf { Σ_{i=1}^L (1/2) ln (λ_i/ε_i²) } + h(ξ₁,···,ξ_L) − Σ_{i=1}^L (1/2) ln 2πeλ_i   (35)

where the infimum is taken over all distortion vectors satisfying (33). By the concavity of ln x, for all positive numbers x₁, x₂, x₁′, and x₂′ such that x₁ + x₂ = x₁′ + x₂′ and x₁ ≥ x₁′ ≥ x₂′ ≥ x₂,

ln (1/x₁) + ln (1/x₂) ≥ ln (1/x₁′) + ln (1/x₂′).

Therefore, the infimum in (35) over all vectors (ε₁²,···,ε_L²) satisfying (33) is attained by the following distortion vector:

ε_i² = θ,   i = 1,···,n;   ε_i² = λ_i,   i = n + 1,···,L   (36)

where θ is a positive parameter determined from the equation

ε² = Σ_{i=1}^L min (θ,λ_i)   (37)

and n is such that λ_{n+1} < θ ≤ λ_n.

Substituting (36) into (35) and using the known result for H_ε(ξ_{g1},···,ξ_{gL}) in the Gaussian case [7], we obtain

H_ε(ξ₁,···,ξ_L) ≥ H_ε(ξ_{g1},···,ξ_{gL}) + h(ξ₁,···,ξ_L) − Σ_{i=1}^L (1/2) ln 2πeλ_i.   (38)

Finally, if we denote by p_ξ(x₁,···,x_L) the density function of (ξ₁,···,ξ_L) and by p_{ξ_g}(x₁,···,x_L) the Gaussian density function of (ξ_{g1},···,ξ_{gL}), then

𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = ∫ p_ξ(x₁,···,x_L) ln [p_ξ(x₁,···,x_L)/p_{ξ_g}(x₁,···,x_L)] dx₁ ··· dx_L
= −h(ξ₁,···,ξ_L) + Σ_{i=1}^L (1/2) ln 2πeλ_i

which together with (38) yields the lower bound.

We turn now to the proof of (14) and the right side of (13). Let θ be the positive parameter determined by the equation

ε² = Σ_{i=1}^∞ min (θ,λ_i)   (39)

and let m be such that λ_{m+1} < θ ≤ λ_m. We define the variables η = (η₁,···,η_m) and η_g = (η_{g1},···,η_{gm}) by

η_i = a_iξ_i + ζ_i,   η_{gi} = a_iξ_{gi} + ζ_i,   i = 1,···,m   (40)

where a_i = 1 − θ/λ_i and ζ_i, i = 1,···,m, are independent Gaussian random variables with Eζ_i² = a_iθ which are independent of ξ and ξ_g.

Since by (40)

E [ Σ_{i=1}^m (ξ_i − η_i)² + Σ_{i=m+1}^∞ ξ_i² ] = ε²

we have, by the definition of H_ε(ξ₁,ξ₂,···),

H_ε(ξ₁,ξ₂,···) ≤ I((ξ₁,ξ₂,···),(η₁,···,η_m)).   (41)

Observe that by (40) the variables (ξ_{m+1},ξ_{m+2},···) and (η₁,···,η_m) are conditionally independent, conditioned on (ξ₁,···,ξ_m). Therefore, as in (11),

I((ξ₁,ξ₂,···),(η₁,···,η_m)) = I((ξ₁,···,ξ_m),(η₁,···,η_m))   (42)

and by (40)

I((ξ₁,···,ξ_m),(η₁,···,η_m)) = h(η₁,···,η_m) − Σ_{i=1}^m (1/2) ln 2πea_iθ.   (43)

Since the ε-entropy of ξ_g is given by [7]

H_ε(ξ_g) = Σ_{i=1}^m (1/2) ln (λ_i/θ)   (44)

and since by (40)

h(η₁,···,η_m) = −𝓗_{(η_{g1},···,η_{gm})}(η₁,···,η_m) + Σ_{i=1}^m (1/2) ln 2πea_iλ_i   (45)

we obtain from (31) and (41)-(45)

H_ε(ξ) ≤ H_ε(ξ_g) − 𝓗_{(η_{g1},···,η_{gm})}(η₁,···,η_m).   (46)

The upper bound on the right side of (13) follows at once from (46), since the entropy of one measure with respect to another is always nonnegative [14].

Note that both 𝓗_{(η_{g1},···,η_{gm})}(η₁,···,η_m) and m depend on ε, and m → ∞ as ε → 0. Since

𝓗_{ξ_g}(ξ) = lim_{N→∞} 𝓗_{(ξ_{g1},···,ξ_{gN})}(ξ₁,···,ξ_N)

it follows that for any δ > 0 there exists an integer N such that

𝓗_{(ξ_{g1},···,ξ_{gN})}(ξ₁,···,ξ_N) ≥ 𝓗_{ξ_g}(ξ) − δ.   (47)

Let ε₀ be sufficiently small such that m(ε) ≥ N for 0 < ε ≤ ε₀. Since

𝓗_{(η_{g1},···,η_{gm})}(η₁,···,η_m) ≥ 𝓗_{(η_{g1},···,η_{gN})}(η₁,···,η_N),   m ≥ N

[14, ch. 2], (46) yields for ε ≤ ε₀

H_ε(ξ) ≤ H_ε(ξ_g) − 𝓗_{(η_{g1},···,η_{gN})}(η₁,···,η_N).   (48)

Finally, we show that

lim inf_{ε→0} 𝓗_{(η_{g1},···,η_{gN})}(η₁,···,η_N) ≥ 𝓗_{(ξ_{g1},···,ξ_{gN})}(ξ₁,···,ξ_N).   (49)

By (40), as ε → 0 the random variables (η₁,···,η_N) and (η_{g1},···,η_{gN}) converge in the mean-square sense to the variables (ξ₁,···,ξ_N) and (ξ_{g1},···,ξ_{gN}), respectively; hence the measures induced by (η₁,···,η_N) and (η_{g1},···,η_{gN}) (on the real-valued N-dimensional Borel space) converge weakly, as ε goes to zero, to the measures of (ξ₁,···,ξ_N) and (ξ_{g1},···,ξ_{gN}), respectively.   (50)

For pairs of measures converging weakly in this manner we have [20]

lim inf 𝓗_{Q}(P) over the converging pairs ≥ 𝓗 of the limiting pair.   (51)

Remark: The result (51) is derived in [20] for mutual information only; however, the proof given there goes over directly to the case of entropy. From (50) and (51) we obtain (49). Therefore, from (47)-(49),

H_ε(ξ) ≤ H_ε(ξ_g) − 𝓗_{ξ_g}(ξ) + o(1),   ε → 0   (52)

which together with the left side of (13) yields (14).

Proof of Lemma 3.1: Let {t₁,···,t_m} be an arbitrary finite set in [0,T]. First, let us assume, for simplicity, that the density functions p_{ξ_g}(x₁,···,x_m) of (ξ_g(t₁),···,ξ_g(t_m)), p_g(x₁,···,x_m) of (g(t₁),···,g(t_m)), and p_ξ(x₁,···,x_m) of (ξ(t₁),···,ξ(t_m)) exist. Then

𝓗_{(g(t₁),···,g(t_m))}(ξ(t₁),···,ξ(t_m)) = ∫ p_ξ ln (p_ξ/p_g) dx₁ ··· dx_m
= ∫ p_ξ ln (p_ξ/p_{ξ_g}) dx₁ ··· dx_m + ∫ p_ξ ln (p_{ξ_g}/p_g) dx₁ ··· dx_m.   (53)

Observe that the function ln [p_{ξ_g}(x₁,···,x_m)/p_g(x₁,···,x_m)] is a quadratic form in the variables x₁,···,x_m. Therefore, since ξ and ξ_g have the same second-order moments, we may replace the second term on the right side of (53) by ∫ p_{ξ_g} ln (p_{ξ_g}/p_g) dx₁ ··· dx_m. Therefore

𝓗_{(g(t₁),···,g(t_m))}(ξ(t₁),···,ξ(t_m)) = 𝓗_{(ξ_g(t₁),···,ξ_g(t_m))}(ξ(t₁),···,ξ(t_m)) + 𝓗_{(g(t₁),···,g(t_m))}(ξ_g(t₁),···,ξ_g(t_m)).   (54)

Since t₁,···,t_m and m were arbitrary, (16) is proved. Also, by (15) and (16), both 𝓗_{ξ_g}(ξ) and 𝓗_g(ξ_g) are finite, and thus ξ_g and g are equivalent processes [14, p. 134].

Finally, in the case where the distributions of (ξ(t₁),···,ξ(t_m)), (ξ_g(t₁),···,ξ_g(t_m)), and (g(t₁),···,g(t_m)) are not absolutely continuous, we define another set of variables as follows. Let ζ_i, i = 1,···,m, be independent Gaussian random variables with the same variance σ², independent of ξ, ξ_g, and g, and let (ξ̃(t₁),···,ξ̃(t_m)), (ξ̃_g(t₁),···,ξ̃_g(t_m)), and (g̃(t₁),···,g̃(t_m)) be defined by

ξ̃(t_i) = ξ(t_i) + ζ_i,   ξ̃_g(t_i) = ξ_g(t_i) + ζ_i,   g̃(t_i) = g(t_i) + ζ_i,   i = 1,···,m.

Then, obviously, the distributions of these variables are absolutely continuous, and the corresponding measures converge weakly to those of the original variables as σ goes to zero. The result of Lemma 3.1 follows now by the same arguments as in the proof of (49).
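The pair (η, η_g) in (40) is the standard Gaussian forward test channel with gain a_i = 1 − θ/λ_i and noise variance a_iθ. The computations following (40) use two second-moment identities, E(ξ_i − η_i)² = θ and Var η_i = a_iλ_i, which hold for any (not necessarily Gaussian) ξ_i with Eξ_i² = λ_i. The Monte Carlo sketch below is only an illustration (the exponential choice for ξ_i, the parameter values, and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, theta = 2.0, 0.5                     # one component: E xi^2 = lam, water level theta < lam
a = 1.0 - theta / lam

# xi need not be Gaussian; use a centered exponential with variance lam.
xi = rng.exponential(np.sqrt(lam), 1_000_000) - np.sqrt(lam)
zeta = rng.normal(0.0, np.sqrt(a * theta), xi.size)   # E zeta^2 = a*theta, independent of xi
eta = a * xi + zeta                                    # the channel (40)

print("E(xi-eta)^2 =", np.mean((xi - eta) ** 2), " (should be close to theta =", theta, ")")
print("Var(eta)    =", np.var(eta), " (should be close to a*lam =", a * lam, ")")
```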
We turn now to the proof of Theorem 4 (the proofs of Theorems 2 and 3 follow directly from the discussion in Section III).

Proof of Theorem 4: a) By a property of mutual information [14, eq. 2.2.7] and the definition of the ε-entropy,

H_ε(ξ) = H_ε(ξ₁,ξ₂,···) ≥ H_ε(ξ₁,···,ξ_L)   (55)

for every L. Using Theorem 1 and the assumption (28) on the asymptotic behavior of 𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L), we get

H_ε(ξ) ≥ H_ε(ξ_{g1},···,ξ_{gL}) − o(L).   (56)

For fixed L, the ε-entropy in the Gaussian case is given by

H_ε(ξ_{g1},···,ξ_{gL}) = Σ_{i=1}^{n*} (1/2) ln (λ_i/θ*)   (57)

where θ* and n* satisfy

ε² = Σ_{i=1}^{L} min (θ*,λ_i),   λ_{n*+1} < θ* ≤ λ_{n*}.   (58)

Let k > 1 be an integer. Then, for any ε satisfying 0 < ε² < Σ_{i=1}^∞ λ_i, we can always find n* and θ* satisfying

ε² = n*θ* + Σ_{i=kn*+1}^∞ λ_i,   λ_{n*+1} < θ* ≤ λ_{n*}.   (59)

Now set L = kn*. By (56), (57), and (59), we have

H_ε(ξ) ≥ Σ_{i=1}^{n*} (1/2) ln (λ_i/θ*) − k o(n*).   (60)

By condition (29), the sum Σ_{i=1}^{n*} (1/2) ln (λ_i/θ*) ≥ (1/2) Σ_{i=1}^{n*} ln (λ_i/λ_{n*}) grows at least linearly in n*, so that the term k o(n*) in (60) is negligible in comparison for every fixed k. Moreover, by (59), as k → ∞ the pair (n*,θ*) approaches the pair (n,θ) of the full water-filling equation

ε² = Σ_{i=1}^∞ min (θ,λ_i),   λ_{n+1} < θ ≤ λ_n

in the sense that the ratio of Σ_{i=1}^{n*} (1/2) ln (λ_i/θ*) to Σ_{i=1}^{n} (1/2) ln (λ_i/θ) tends to one. Therefore, since n* goes to infinity as ε goes to zero, letting ε → 0 and then k → ∞ in (60) gives

H_ε(ξ) ≥ Σ_{i=1}^{n} (1/2) ln (λ_i/θ) (1 + o(1)),   ε → 0.

Since H_ε(ξ_g) = Σ_{i=1}^{n} (1/2) ln (λ_i/θ) and since, by Theorem 1, H_ε(ξ) ≤ H_ε(ξ_g), the proof of part a) is completed.

b) The proof of part I of b) is similar to that of part a).

II) If 𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) = O(L), then there exists a constant M > 1 such that for every L

𝓗_{(ξ_{g1},···,ξ_{gL})}(ξ₁,···,ξ_L) ≤ (L/2) ln M².   (64)

Therefore, by (55) and Theorem 1,

H_ε(ξ) ≥ H_ε(ξ_{g1},···,ξ_{gL}) − (L/2) ln M²   (65)

for every L. Let ε be small enough that 0 < ε² < (1/M²) Σ_{i=1}^∞ λ_i. We choose L and θ satisfying

(Mε)² = Lθ,   λ_{L+1} < θ ≤ λ_L.   (66)

Since ε²/L ≤ λ_L, the ε-entropy of (ξ_{g1},···,ξ_{gL}) is given by

H_ε(ξ_{g1},···,ξ_{gL}) = Σ_{i=1}^{L} (1/2) ln (λ_iL/ε²).   (67)

From (65)-(67) we obtain

H_ε(ξ) ≥ Σ_{i=1}^{L} (1/2) ln (λ_iL/(M²ε²)) = Σ_{i=1}^{L} (1/2) ln (λ_i/θ).   (68)

Next, define ε₁² by

ε₁² = (Mε)² + Σ_{i=L+1}^∞ λ_i.   (69)

Since by (66) and (69)

ε₁² = Lθ + Σ_{i=L+1}^∞ λ_i   (70)

it follows that

H_{ε₁}(ξ_g) = Σ_{i=1}^{L} (1/2) ln (λ_i/θ).   (71)

Observe that for {λ_i} as in b) II we have

Σ_{i=L+1}^∞ λ_i = O(Lλ_L) = O(Lθ)   (72)

and therefore, by (69) and (70), ε₁² ≈ (BMε)² for ε → 0, where B is a constant (e.g., for a = 2, which is the case of the diffusion process discussed by Kazi [19], we have B² = 2). Substituting the relation between ε₁² and ε² into (71) and using (68), we obtain

H_ε(ξ) ≥ H_{BMε}(ξ_g),   ε → 0.   (73)

Finally, it is easy to show that the ε-entropy of a Gaussian process ξ_g whose eigenvalues go to zero as an inverse power of i, λ_i ≈ β/iᵃ, satisfies

H_ε(ξ_g) ≈ (constant) ε^{−2/(a−1)},   ε → 0

(compare (24)). Therefore we obtain the desired result

H_ε(ξ) ≥ kH_ε(ξ_g),   ε → 0

where k > 0 is a constant.
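The eigenvalue conditions (29) and b) I) compare the decay of {λ_i} with polynomial decay; indeed (1/n) Σ_{i=1}^{n−1} i ln (λ_i/λ_{i+1}) = (1/n) Σ_{i=1}^{n} ln (λ_i/λ_n), so the quantity measures how far the spectrum sits above its nth eigenvalue on a logarithmic scale. The sketch below is only an illustration of this quantity (the decay profiles, truncation length, and function name are choices made here), matching the behavior described in Remark 1:

```python
import numpy as np

def eig_condition(loglam, n):
    """(1/n) * sum_{i=1}^{n-1} i * (ln lam_i - ln lam_{i+1}); cf. (29) and b) I)."""
    i = np.arange(1, n)
    return np.sum(i * (loglam[:n - 1] - loglam[1:n])) / n

n = 10_000
idx = np.arange(1, n + 1)
log_poly = -2.0 * np.log(idx)   # lam_i = 1/i^2 : the quantity tends to 2, so (29) holds
log_expo = -1.0 * idx           # lam_i = e^-i  : the quantity grows like n/2, as in b) I)

print("polynomial decay :", eig_condition(log_poly, n))   # close to 2
print("exponential decay:", eig_condition(log_expo, n))   # about n/2, i.e., diverges with n
```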
Proof of Examples: We start with the proof of Example 1. Let the Gaussian process g be defined as

g(t) = g(0) + βw(t),   0 ≤ t ≤ T

where g(0) = ξ_g(0), and let P_ξ and P_g be the measures (in function space) induced by ξ and g, respectively. Since the asymptotic behavior of H_ε(g) is given by (22) [10], we have only to show that 𝓗_g(ξ) < ∞. In this case the Radon-Nikodym derivative dP_ξ/dP_g is given by [12], [13]

ln (dP_ξ/dP_g) = ln (dP_{ξ(0)}/dP_{g(0)})(ξ(0)) + (1/β²) ∫₀ᵀ ξ̂(t) dξ(t) − (1/(2β²)) ∫₀ᵀ ξ̂²(t) dt   (74)

(almost surely P_ξ), where ξ̂(t) is defined by

ξ̂(t) = E_{P_ξ}[ζ(t) | ξ(s), s ≤ t].

Taking the expectation of both sides of (74) with respect to P_ξ yields [14, theorem 2.4.2]

𝓗_g(ξ) = 𝓗_{g(0)}(ξ(0)) + (1/(2β²)) E [ ∫₀ᵀ 2ξ̂(t)ζ(t) dt − ∫₀ᵀ ξ̂²(t) dt ].   (75)

Since E[ζ(t) − ξ̂(t)]² ≥ 0, (75) yields

𝓗_g(ξ) ≤ 𝓗_{g(0)}(ξ(0)) + (1/(2β²)) E ∫₀ᵀ ζ²(t) dt < ∞   (76)

which completes the proof of Example 1.

Example 2 is a particular case of the following generalization. Let ξ be a random process and let g be a Gaussian process such that 𝓗_g(ξ) < ∞. Let ξ̄ and ḡ be defined by ξ̄ = Aξ, ḡ = Ag, where A is a measurable transformation such that ḡ is a Gaussian process with E ∫₀ᵀ ḡ²(t) dt < ∞. Then

H_ε(ξ̄) ≈ H_ε(ḡ),   ε → 0.   (77)

Since we have 𝓗_ḡ(ξ̄) ≤ 𝓗_g(ξ) [14, ch. 2], (77) follows directly from Theorem 2. Now, the proof of Example 2 follows from (77), (76), and the result (24) for the Gaussian process defined by m − 1 integrations of the Wiener process [10].

Finally, we turn to the proof of Example 3. Let the Gaussian process g be as in the proof of Example 1. Then 𝓗_g(ξ) < ∞, and therefore 𝓗_ḡ(η) < ∞, where ḡ is the output of the same linear filter G(jω) operating on the process g. Therefore, by Theorem 3,

R_η(D) ≈ R_ḡ(D),   D → 0.

It has been shown in [21], under some general conditions, that the asymptotic behavior of the power spectral density of ξ is given by

S_ξ(ω) ≈ β²ω^{−2},   |ω| → ∞.

Therefore, the power spectral density of η satisfies

S_η(ω) = β²cω^{−(2r+2)} + o(ω^{−(2r+2)}),   |ω| → ∞.   (78)

Using the known result (e.g., [16, ch. 9]) on the rate-distortion function of a Gaussian process with spectral density S(f),

R(D) = ∫ (1/2) ln max [S(f)/θ, 1] df,   D = ∫ min [S(f), θ] df

it is easy to show that the asymptotic behavior of the rate-distortion function of a Gaussian process with spectral density satisfying (78) is given by (27). Thus the proof is completed.
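The parametric pair quoted above from [16, ch. 9] is straightforward to evaluate numerically for any spectral density. The sketch below is only an illustration (the example density S(f) = 1/(f² + 1), the grid, and the truncation of the frequency axis are arbitrary choices); it traces a few (D, R) points of the kind used in the proof of Example 3.

```python
import numpy as np

# Parametric water-filling pair for a stationary Gaussian source (cf. [16, ch. 9]):
#   R(theta) = integral of 0.5*ln(max(S(f)/theta, 1)) df   [nats per unit time]
#   D(theta) = integral of min(S(f), theta) df
f, df = np.linspace(0.0, 1.0e4, 2_000_001, retstep=True)   # positive frequencies; doubled below
S = 1.0 / (f ** 2 + 1.0)                                    # illustrative spectral density

for theta in [1e-2, 1e-3, 1e-4]:
    R = 2.0 * np.sum(0.5 * np.log(np.maximum(S / theta, 1.0))) * df
    D = 2.0 * np.sum(np.minimum(S, theta)) * df
    print(f"theta={theta:7.0e}   D={D:9.5f}   R={R:9.4f}")
```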
REFERENCES

[1] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, N.J.: Prentice-Hall, 1971.
[2] C. E. Shannon and W. Weaver, The Mathematical Theory of Communication. Urbana, Ill.: Univ. of Illinois Press, 1949.
[3] C. E. Shannon, "Coding theorems for a discrete source with a fidelity criterion," IRE Nat. Conv. Rec., pt. 4, pp. 142-163, 1959.
[4] A. M. Gerrish and P. M. Schultheiss, "Information rates of non-Gaussian processes," IEEE Trans. Inform. Theory, vol. IT-10, pp. 265-271, Oct. 1964.
[5] Y. N. Lin'kov, "Evaluation of ε-entropy of random variables for small ε," Probl. Peredach. Inform., vol. 1, p. 12, Apr.-June 1965.
[6] A. N. Kolmogorov, "On the Shannon theory of information transmission in the case of continuous signals," IRE Trans. Inform. Theory, vol. IT-2, pp. 102-108, Dec. 1956.
[7] M. S. Pinsker, "Gaussian sources," Probl. Peredach. Inform., vol. 14, pp. 59-100, 1963.
[8] S. Ihara, "On ε-entropy of equivalent Gaussian processes," Nagoya Math. J., vol. 37, pp. 121-130, 1970.
[9] J. Binia, M. Zakai, and J. Ziv, "Bounds on the ε-entropy of Wiener and RC processes," IEEE Trans. Inform. Theory (Corresp.), vol. IT-19, pp. 359-362, May 1973.
[10] J. Binia, "On the ε-entropy of certain Gaussian processes," IEEE Trans. Inform. Theory, vol. IT-20, pp. 190-196, Mar. 1974.
[11] A. V. Skorokhod, Studies in the Theory of Random Processes. Reading, Mass.: Addison-Wesley, 1965, ch. 4.
[12] T. Kailath, "The structure of Radon-Nikodym derivatives with respect to Wiener and related measures," Ann. Math. Statist., vol. 42, no. 3, pp. 1054-1067, 1971.
[13] T. E. Duncan, "Evaluation of likelihood functions," Inform. Contr., vol. 13, pp. 62-74, 1968.
[14] M. S. Pinsker, Information and Information Stability of Random Variables and Processes. New York: Holden-Day, 1964.
[15] A. N. Kolmogorov, "Theory of transmission of information," Amer. Math. Soc. Transl., ser. 2, vol. 33, pp. 291-321.
[16] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[17] M. S. Pinsker, "Communication sources," Probl. Peredach. Inform., vol. 14, pp. 5-20, 1963.
[18] A. M. Yaglom, "On the equivalence and perpendicularity of two Gaussian probability measures in function space," in Proc. Symp. Time Series Analysis. New York: Wiley, 1963, pp. 327-346.
[19] K. Kazi, "On the ε-entropy of diffusion processes," Ann. Inst. Statist. Math., vol. 21, pp. 347-356, 1969.
[20] I. M. Gelfand and A. M. Yaglom, "Calculation of the amount of information about a random function contained in another such function," Amer. Math. Soc. Transl., ser. 2, vol. 12, pp. 199-246.
[21] M. Zakai and J. Ziv, "Lower and upper bounds on the optimal filtering error of certain diffusion processes," IEEE Trans. Inform. Theory, vol. IT-18, pp. 325-331, May 1972.