
A CONVERSE INEQUALITY FOR THE CSISZÁR Φ-DIVERGENCE
S.S. DRAGOMIR
Abstract. A converse inequality for the Csiszár Φ-divergence is given, together with applications to some particular distance functions such as the Kullback-Leibler distance, the Rényi distance, the χ²-distance, the Bhattacharyya distance, the J-distance, etc.
1. Introduction
Given a convex function Φ : R+ → R+, the Φ-divergence functional
(1.1)   $I_\Phi(p, q) := \sum_{i=1}^{n} q_i \, \Phi\left( \frac{p_i}{q_i} \right)$
was introduced in Csiszár [1], [2] as a generalized measure of information, a "distance function" on the set of probability distributions Pn. The restriction here to discrete distributions is only for convenience; similar results hold for general distributions.
As in Csiszár [2], we interpret undefined expressions by
$\Phi(0) = \lim_{t \to 0+} \Phi(t), \qquad 0\,\Phi\left( \frac{0}{0} \right) = 0,$
$0\,\Phi\left( \frac{a}{0} \right) = \lim_{\varepsilon \to 0+} \varepsilon\, \Phi\left( \frac{a}{\varepsilon} \right) = a \lim_{t \to \infty} \frac{\Phi(t)}{t}, \qquad a > 0.$
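To make the definition concrete, here is a minimal numerical sketch (not part of the original paper) of the functional (1.1), using the conventions above for zero entries; the function name csiszar_divergence and the argument phi_limit_slope are illustrative choices, not notation from the paper.

```python
import math

def csiszar_divergence(phi, p, q, phi_limit_slope=None):
    """Numerically evaluate I_Phi(p, q) = sum_i q_i * phi(p_i / q_i), cf. (1.1).

    Conventions of Csiszar [2] for zero entries:
      * p_i = q_i = 0 contributes 0, since 0 * phi(0/0) = 0;
      * q_i = 0, p_i > 0 contributes p_i * lim_{t -> infinity} phi(t)/t,
        a value the caller may supply as phi_limit_slope (None -> +infinity).
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if qi > 0:
            total += qi * phi(pi / qi)
        elif pi > 0:
            if phi_limit_slope is None:
                return math.inf
            total += pi * phi_limit_slope
        # the term with p_i = q_i = 0 contributes nothing
    return total
```

With Φ(t) = t log t, for instance, this sketch reproduces the Kullback-Leibler distance of Example 1 below.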
The following results were essentially given by Csiszár and Körner [3].
Theorem 1. If Φ : R+ → R is convex, then IΦ (p, q) is jointly convex in p and q.
The following lower bound for the Φ−divergence functional also holds.
Theorem 2. Let Φ : R+ → R+ be convex. Then for every p, q ∈ R^n_+, we have the inequality:
(1.2)   $I_\Phi(p, q) \ge \left( \sum_{i=1}^{n} q_i \right) \Phi\left( \frac{\sum_{i=1}^{n} p_i}{\sum_{i=1}^{n} q_i} \right).$
If Φ is strictly convex, equality holds in (1.2) iff
(1.3)   $\frac{p_1}{q_1} = \frac{p_2}{q_2} = \dots = \frac{p_n}{q_n}.$
Date: September 7, 1999.
1991 Mathematics Subject Classification. Primary 94Xxx; Secondary 26D15.
Key words and phrases. The Csiszár Φ−Divergence, Kullback-Leibler distance, Rényi
α−entropy.
Corollary 1. Let Φ : R+ → R be convex and normalized, i.e.,
(1.4)   $\Phi(1) = 0.$
Then for any p, q ∈ R^n_+ with
(1.5)   $\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i$
we have the inequality
(1.6)   $I_\Phi(p, q) \ge 0.$
If Φ is strictly convex, the equality holds in (1.6) iff pi = qi for all i ∈ {1, ..., n}.
In particular, if p, q are probability vectors, then (1.5) is assured. Corollary 1 then shows, for strictly convex and normalized Φ : R+ → R,
(1.7)   $I_\Phi(p, q) \ge 0$ for all p, q ∈ Pn.
The equality holds in (1.7) iff p = q.
These are "distance properties". However, I_Φ is not a metric: it violates the triangle inequality and is asymmetric, i.e., for general p, q ∈ R^n_+, I_Φ(p, q) ≠ I_Φ(q, p).
In the examples below we obtain, for suitable choices of the kernel Φ, some of the
best known distance functions IΦ used in mathematical statistics [4]-[5], information
theory [6]-[8] and signal processing [9]-[10].
Example 1. (Kullback-Leibler) For
(1.8)   $\Phi(t) := t \log t, \quad t > 0,$
the Φ-divergence is
(1.9)   $I_\Phi(p, q) = \sum_{i=1}^{n} p_i \log\left( \frac{p_i}{q_i} \right),$
the Kullback-Leibler distance [11]-[12].
Example 2. (Hellinger) Let
(1.10)   $\Phi(t) = \frac{1}{2}\left( 1 - \sqrt{t} \right)^2, \quad t > 0.$
Then I_Φ gives the Hellinger distance [13]
(1.11)   $I_\Phi(p, q) = \frac{1}{2} \sum_{i=1}^{n} \left( \sqrt{p_i} - \sqrt{q_i} \right)^2,$
which is symmetric.
Example 3. (Rényi) For α > 1, let
(1.12)   $\Phi(t) = t^{\alpha}, \quad t > 0.$
Then
(1.13)   $I_\Phi(p, q) = \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha},$
which is the α-order entropy [14].
Example 4. (χ²-distance) Let
(1.14)   $\Phi(t) = (t - 1)^2, \quad t > 0.$
Then
(1.15)   $I_\Phi(p, q) = \sum_{i=1}^{n} \frac{(p_i - q_i)^2}{q_i}$
is the χ²-distance between p and q.
Finally, we have
Example 5. (Variational distance) Let Φ(t) = |t − 1|, t > 0. The corresponding divergence, called the variational distance, is symmetric:
$I_\Phi(p, q) = \sum_{i=1}^{n} |p_i - q_i|.$
For other examples of divergence measures, see the paper [22] by J.N. Kapur,
where further references are given.
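As a quick illustration (not from the paper), the kernels of Examples 1-5 can be evaluated numerically for strictly positive vectors; the helper name i_phi and the sample vectors below are illustrative only.

```python
import math

def i_phi(phi, p, q):
    """I_Phi(p, q) = sum_i q_i * phi(p_i / q_i), assuming strictly positive entries."""
    return sum(qi * phi(pi / qi) for pi, qi in zip(p, q))

# Kernels Phi of Examples 1-5 (alpha > 1 is assumed for the Renyi kernel).
kernels = {
    "Kullback-Leibler": lambda t: t * math.log(t),
    "Hellinger":        lambda t: 0.5 * (1.0 - math.sqrt(t)) ** 2,
    "Renyi (alpha=2)":  lambda t: t ** 2.0,
    "chi-square":       lambda t: (t - 1.0) ** 2,
    "variational":      lambda t: abs(t - 1.0),
}

p = [0.2, 0.5, 0.3]
q = [0.3, 0.4, 0.3]
for name, phi in kernels.items():
    print(name, i_phi(phi, p, q))
```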
2. Some Converse Inequalities for the Csiszár Φ−Divergence
We start with the following converse of Jensen’s discrete inequality established
in 1994 in [15] by Dragomir and Ionescu.
Lemma 1. Let f : I ⊆ R → R be a differentiable convex mapping on I. If x_i ∈ I̊ and t_i ≥ 0 (i = 1, ..., n) with $T_n := \sum_{i=1}^{n} t_i > 0$, then we have the inequality
(2.1)   $0 \le \frac{1}{T_n} \sum_{i=1}^{n} t_i f(x_i) - f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) \le \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i f'(x_i) - \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \cdot \frac{1}{T_n} \sum_{i=1}^{n} t_i f'(x_i),$
where f' : I̊ → R is the derivative of f on I̊.
If t_i > 0 (i = 1, ..., n) and the mapping f is strictly convex on I̊, then the case of equality holds in (2.1) iff x_1 = x_2 = ... = x_n.
Proof. Since f : I → R is convex on I̊, we have the inequality
(2.2)   $f(x) - f(y) \ge f'(y)(x - y)$ for all x, y ∈ I̊.
Choosing in (2.2) $x = \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \in \mathring{I}$ and $y = x_j$, j = 1, ..., n, we obtain
(2.3)   $f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) - f(x_j) \ge f'(x_j) \left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i - x_j \right)$
for all j ∈ {1, ..., n}.
If we multiply (2.3) by t_j ≥ 0 and sum over j from 1 to n, we deduce
(2.4)   $\sum_{j=1}^{n} t_j \left[ f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) - f(x_j) \right] \ge \sum_{j=1}^{n} t_j f'(x_j) \left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i - x_j \right).$
However, a simple calculation shows that
$\sum_{j=1}^{n} t_j \left[ f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) - f(x_j) \right] = T_n f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) - \sum_{i=1}^{n} t_i f(x_i)$
and
$\sum_{j=1}^{n} t_j f'(x_j) \left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i - x_j \right) = \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \sum_{i=1}^{n} t_i f'(x_i) - \sum_{i=1}^{n} t_i x_i f'(x_i),$
and then, by (2.4), we can state
$T_n f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) - \sum_{i=1}^{n} t_i f(x_i) \ge \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \sum_{i=1}^{n} t_i f'(x_i) - \sum_{i=1}^{n} t_i x_i f'(x_i).$
Dividing by T_n > 0 and rearranging, we obtain the second inequality in (2.1); the first inequality is Jensen's discrete inequality.
If f is strictly convex, then the equality holds in (2.2) iff x = y. Using this
and an obvious argument, we obtain the fact that the equality holds in (2.1) iff
x1 = ... = xn .
We omit the details.
Remark 1. If, in the above lemma, we drop the differentiability condition, then we can choose instead of f'(x_i) any number $l = l(x_i) \in \left[ f'_{-}(x_i), f'_{+}(x_i) \right]$ and the inequality still remains valid. This follows from the fact that for a convex mapping f : R+ → R+ we have
$f(x) - f(y) \ge l(y)(x - y)$ for x, y ∈ R+,
where $l(y) \in \left[ f'_{-}(y), f'_{+}(y) \right]$, $f'_{-}(y)$ is the left derivative of f at y and $f'_{+}(y)$ is the right derivative of f at y. We omit the details.
Remark 2. For an extension of this result to convex mappings of several variables, see the paper [16] by Dragomir and Goh, where further applications in Information Theory to Shannon's entropy, mutual information, conditional entropy, etc. are given. An integral version of this result can be found in the paper [19] by Dragomir and Goh, where further applications to the continuous Shannon entropy are given. Extensions of the above results to convex mappings defined on convex sets in linear spaces, and in particular in normed spaces, can be found in the recent Ph.D. dissertation [20] of M. Matić, where other applications in Information Theory to Shannon's entropy are considered.
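As a sanity check (not part of the original paper), the two sides of (2.1) can be compared numerically for a sample convex f; the sketch below uses the arbitrarily chosen f(x) = x log x, and the helper name jensen_gap_bounds is illustrative.

```python
import math

def jensen_gap_bounds(f, f_prime, x, t):
    """Return (middle, upper) terms of inequality (2.1):
       0 <= (1/Tn) sum t_i f(x_i) - f((1/Tn) sum t_i x_i) <= upper."""
    Tn = sum(t)
    mean_x  = sum(ti * xi for ti, xi in zip(t, x)) / Tn
    mean_f  = sum(ti * f(xi) for ti, xi in zip(t, x)) / Tn
    mean_xd = sum(ti * xi * f_prime(xi) for ti, xi in zip(t, x)) / Tn
    mean_d  = sum(ti * f_prime(xi) for ti, xi in zip(t, x)) / Tn
    middle = mean_f - f(mean_x)
    upper  = mean_xd - mean_x * mean_d
    return middle, upper

f       = lambda x: x * math.log(x)      # convex on (0, infinity)
f_prime = lambda x: math.log(x) + 1.0
middle, upper = jensen_gap_bounds(f, f_prime, x=[0.5, 1.2, 3.0], t=[1.0, 2.0, 0.5])
assert 0.0 <= middle <= upper + 1e-12    # inequality (2.1)
print(middle, upper)
```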
In the recent paper [23], we proved the following theorem.
Theorem 3. Let Φ : R+ → R+ be differentiable convex. Then for all p, q ∈ R^n_+ we have the inequality
(2.5)   $\Phi'(1)(P_n - Q_n) \le I_\Phi(p, q) - Q_n \Phi(1) \le I_{\Phi'}\left( \frac{p^2}{q}, p \right) - I_{\Phi'}(p, q),$
where $P_n := \sum_{i=1}^{n} p_i > 0$, $Q_n := \sum_{i=1}^{n} q_i > 0$ and Φ' is the derivative of Φ. Here $\frac{p^2}{q}$ denotes the n-tuple $\left( \frac{p_i^2}{q_i} \right)_{i=1,\dots,n}$. If Φ is strictly convex and p_i, q_i > 0 (i = 1, ..., n), then the equality holds in (2.5) iff p = q.
The following converse inequality for the Csiszár Φ−divergence also holds.
Theorem 4. Let Φ, p and q be as in Theorem 2. Then we have the inequality:
(2.6)   $0 \le I_\Phi(p, q) - Q_n \Phi\left( \frac{P_n}{Q_n} \right) \le I_{\Phi'}\left( \frac{p^2}{q}, p \right) - \frac{P_n}{Q_n} I_{\Phi'}(p, q),$
where $Q_n := \sum_{i=1}^{n} q_i > 0$, $P_n := \sum_{i=1}^{n} p_i > 0$ and Φ' is the derivative of Φ. If Φ is strictly convex and p_i, q_i > 0 (i = 1, ..., n), then the equality holds in (2.6) iff
$\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}.$
Proof. Choose in Lemma 1 f = Φ, t_i := q_i and x_i = p_i/q_i to obtain
$0 \le \frac{1}{Q_n} \sum_{i=1}^{n} q_i \Phi\left( \frac{p_i}{q_i} \right) - \Phi\left( \frac{P_n}{Q_n} \right) \le \frac{1}{Q_n} \sum_{i=1}^{n} p_i \Phi'\left( \frac{p_i}{q_i} \right) - \frac{1}{Q_n} \sum_{i=1}^{n} p_i \cdot \frac{1}{Q_n} \sum_{i=1}^{n} q_i \Phi'\left( \frac{p_i}{q_i} \right),$
which is equivalent to (2.6). The case of equality is obvious.
The following corollary is natural.
Corollary 2. Let Φ : R+ → R be differentiable convex and normalized. Then, for any p, q ∈ R^n_+ with P_n = Q_n, we have the converse of the positivity inequality (1.6):
(2.7)   $0 \le I_\Phi(p, q) \le I_{\Phi'}\left( \frac{p^2}{q}, p \right) - I_{\Phi'}(p, q).$
The equality holds in (2.7) for a strictly convex mapping Φ iff p = q.
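The bound (2.6) is easy to probe numerically. The sketch below (not part of the paper) checks it for the normalized kernel Φ(t) = t log t on random positive vectors; the function names are illustrative choices.

```python
import math
import random

def bounds_2_6(phi, dphi, p, q):
    """Return the middle and right-hand terms of inequality (2.6)."""
    Pn, Qn = sum(p), sum(q)
    i_phi = sum(qi * phi(pi / qi) for pi, qi in zip(p, q))
    # I_{Phi'}(p^2/q, p) = sum_i p_i Phi'(p_i/q_i); I_{Phi'}(p, q) = sum_i q_i Phi'(p_i/q_i)
    i_dp  = sum(pi * dphi(pi / qi) for pi, qi in zip(p, q))
    i_dq  = sum(qi * dphi(pi / qi) for pi, qi in zip(p, q))
    middle = i_phi - Qn * phi(Pn / Qn)
    upper  = i_dp - (Pn / Qn) * i_dq
    return middle, upper

phi  = lambda t: t * math.log(t)   # normalized: phi(1) = 0
dphi = lambda t: math.log(t) + 1.0

random.seed(0)
for _ in range(1000):
    p = [random.uniform(0.1, 5.0) for _ in range(6)]
    q = [random.uniform(0.1, 5.0) for _ in range(6)]
    middle, upper = bounds_2_6(phi, dphi, p, q)
    assert -1e-9 <= middle <= upper + 1e-9   # inequality (2.6)
```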
3. Applications for some Particular Φ−Divergences
Let us consider the Kullback-Leibler distance given by (1.9),
(3.1)   $KL(p, q) := \sum_{i=1}^{n} p_i \log\left( \frac{p_i}{q_i} \right).$
Choose the convex mapping Φ(t) = − log(t), t > 0. For this mapping we have the Csiszár Φ-divergence
$I_\Phi(p, q) = \sum_{i=1}^{n} q_i \left[ - \log\left( \frac{p_i}{q_i} \right) \right] = \sum_{i=1}^{n} q_i \log\left( \frac{q_i}{p_i} \right) = KL(q, p).$
The following inequality holds.
Proposition 1. Let p, q ∈ R^n_+. Then we have the inequality
(3.2)   $0 \le KL(q, p) - Q_n \log\left( \frac{Q_n}{P_n} \right) \le \frac{1}{Q_n} \left[ P_n \sum_{i=1}^{n} \frac{q_i^2}{p_i} - Q_n^2 \right] = \frac{1}{2 Q_n} \sum_{i,j=1}^{n} p_i p_j \left( \frac{q_i}{p_i} - \frac{q_j}{p_j} \right)^2.$
The case of equality holds iff $\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}$.
Proof. If Φ(t) = − log t, then Φ'(t) = −1/t, t > 0. We have
$I_{\Phi'}\left( \frac{p^2}{q}, p \right) = \sum_{i=1}^{n} p_i \cdot \left( - \frac{1}{p_i^2 / (q_i p_i)} \right) = - \sum_{i=1}^{n} q_i = - Q_n,$
$I_{\Phi'}(p, q) = \sum_{i=1}^{n} q_i \cdot \left( - \frac{1}{p_i / q_i} \right) = - \sum_{i=1}^{n} \frac{q_i^2}{p_i}$
and then, from (2.6), we can state that
$0 \le KL(q, p) + Q_n \log\left( \frac{P_n}{Q_n} \right) \le - Q_n + \frac{P_n}{Q_n} \sum_{i=1}^{n} \frac{q_i^2}{p_i},$
i.e.,
$0 \le KL(q, p) - Q_n \log\left( \frac{Q_n}{P_n} \right) \le \frac{1}{Q_n} \left[ P_n \sum_{i=1}^{n} \frac{q_i^2}{p_i} - Q_n^2 \right].$
On the other hand, we observe that
$\frac{1}{2} \sum_{i,j=1}^{n} p_i p_j \left( \frac{q_i}{p_i} - \frac{q_j}{p_j} \right)^2 = \frac{1}{2} \sum_{i,j=1}^{n} p_i p_j \left( \frac{q_i^2}{p_i^2} - 2 \frac{q_i q_j}{p_i p_j} + \frac{q_j^2}{p_j^2} \right) = \frac{1}{2} \left[ \sum_{i,j=1}^{n} p_j \frac{q_i^2}{p_i} + \sum_{i,j=1}^{n} p_i \frac{q_j^2}{p_j} - 2 \sum_{i,j=1}^{n} q_i q_j \right] = P_n \sum_{i=1}^{n} \frac{q_i^2}{p_i} - Q_n^2$
and the inequality (3.2) is proved.
The case of equality follows by the strict convexity of the mapping − log.
Corollary 3. If we assume that P_n = Q_n in (3.2), we have
(3.3)   $0 \le KL(q, p) \le \sum_{i=1}^{n} \frac{q_i^2 - p_i^2}{p_i} = \frac{1}{2 Q_n} \sum_{i,j=1}^{n} p_i p_j \left( \frac{q_i}{p_i} - \frac{q_j}{p_j} \right)^2,$
with equality iff p = q.
Remark 3. We know that the χ²-distance between p and q is (see (1.15))
$D_{\chi^2}(p, q) := \sum_{i=1}^{n} \frac{(p_i - q_i)^2}{q_i} = \sum_{i=1}^{n} \frac{p_i^2 - q_i^2}{q_i} \quad \text{if } P_n = Q_n.$
As
$D_{\chi^2}(q, p) = \sum_{i=1}^{n} \frac{(q_i - p_i)^2}{p_i} = \sum_{i=1}^{n} \frac{q_i^2 - p_i^2}{p_i} \quad \text{if } P_n = Q_n,$
the inequality (3.3) can be rewritten in the following manner:
(3.4)   $0 \le KL(q, p) \le D_{\chi^2}(q, p).$
Corollary 4. Let p and q be two probability distributions. Then
(3.5)   $0 \le KL(q, p) \le D_{\chi^2}(q, p),$
with equality iff p = q.
Remark 4. For a direct proof of the inequality (3.5) see [21] where further bounds
are also given.
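Inequality (3.5) is simple to verify numerically; the short sketch below (not from the paper) draws random probability vectors and checks 0 ≤ KL(q, p) ≤ D_{χ²}(q, p). The helper names are illustrative.

```python
import math
import random

def kl(a, b):
    """Kullback-Leibler distance KL(a, b) = sum_i a_i log(a_i / b_i)."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

def chi_square(a, b):
    """chi^2-distance D_{chi^2}(a, b) = sum_i (a_i - b_i)^2 / b_i."""
    return sum((ai - bi) ** 2 / bi for ai, bi in zip(a, b))

def random_simplex(n):
    w = [random.uniform(0.1, 1.0) for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

random.seed(1)
for _ in range(1000):
    p, q = random_simplex(5), random_simplex(5)
    # Corollary 4: 0 <= KL(q, p) <= D_{chi^2}(q, p)
    assert -1e-12 <= kl(q, p) <= chi_square(q, p) + 1e-12
```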
Now, let us consider the α-order entropy of Rényi (see (1.13)),
(3.6)   $D_\alpha(p, q) := \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}, \quad \alpha > 1,$
for p, q ∈ R^n_+.
We know that Rényi's entropy is actually the Csiszár Φ-divergence for the convex mapping Φ(t) = t^α, α > 1, t > 0 (see Example 3).
The following proposition holds.
Proposition 2. Let p, q ∈ R^n_+. Then we have the inequality
(3.7)   $0 \le D_\alpha(p, q) - P_n^{\alpha} Q_n^{1-\alpha} \le \frac{\alpha}{Q_n} \left[ Q_n D_\alpha(p, q) - P_n D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right) \right],$
where $q^{\frac{2-\alpha}{\alpha}}$ and $\frac{1}{p}$ denote the n-tuples $\left( q_i^{\frac{2-\alpha}{\alpha}} \right)_{i}$ and $\left( \frac{1}{p_i} \right)_{i}$, respectively.
The case of equality holds iff $\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}$.
Proof. If Φ(t) = t^α, then Φ'(t) = α t^{α−1}. We have
(3.8)   $I_{\Phi'}\left( \frac{p^2}{q}, p \right) = \sum_{i=1}^{n} p_i \cdot \alpha \left( \frac{p_i^2}{q_i p_i} \right)^{\alpha - 1} = \alpha \sum_{i=1}^{n} q_i^{1-\alpha} p_i^{\alpha} = \alpha D_\alpha(p, q)$
and
(3.9)   $I_{\Phi'}(p, q) = \sum_{i=1}^{n} q_i \cdot \alpha \left( \frac{p_i}{q_i} \right)^{\alpha - 1} = \alpha \sum_{i=1}^{n} p_i^{\alpha - 1} q_i^{2-\alpha} = \alpha D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right).$
Using the inequality (2.6), we can state that
$0 \le D_\alpha(p, q) - Q_n \left( \frac{P_n}{Q_n} \right)^{\alpha} \le \alpha D_\alpha(p, q) - \alpha \frac{P_n}{Q_n} D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right) = \frac{\alpha}{Q_n} \left[ Q_n D_\alpha(p, q) - P_n D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right) \right]$
and the inequality (3.7) is obtained.
The case of equality follows by the fact that the mapping Φ(t) = t^α (α > 1, t > 0) is strictly convex on (0, ∞).
Corollary 5. If we assume that P_n = Q_n in (3.7), we obtain
(3.10)   $0 \le D_\alpha(p, q) - P_n \le \alpha \left[ D_\alpha(p, q) - D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right) \right],$
with equality iff p = q.
In particular, if p, q are probability distributions, then
(3.11)   $0 \le D_\alpha(p, q) - 1 \le \alpha \left[ D_\alpha(p, q) - D_\alpha\left( q^{\frac{2-\alpha}{\alpha}}, \frac{1}{p} \right) \right],$
with equality iff p = q.
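Again, a quick numerical check (not in the paper) of (3.11) for random probability vectors; the value α = 1.5 and the function names are illustrative choices.

```python
import random

def d_alpha(u, v, alpha):
    """D_alpha(u, v) = sum_i u_i^alpha * v_i^(1 - alpha), as in (3.6)."""
    return sum(ui ** alpha * vi ** (1.0 - alpha) for ui, vi in zip(u, v))

def random_simplex(n):
    w = [random.uniform(0.1, 1.0) for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

random.seed(2)
alpha = 1.5
for _ in range(1000):
    p, q = random_simplex(4), random_simplex(4)
    lhs = d_alpha(p, q, alpha) - 1.0
    # D_alpha(q^((2-alpha)/alpha), 1/p) = sum_i q_i^(2-alpha) * p_i^(alpha-1)
    rhs = alpha * (d_alpha(p, q, alpha)
                   - sum(qi ** (2.0 - alpha) * pi ** (alpha - 1.0)
                         for pi, qi in zip(p, q)))
    assert -1e-12 <= lhs <= rhs + 1e-12   # inequality (3.11)
```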
Now, let us consider the Bhattacharyya distance given by (see [20])
$B(p, q) = \sum_{i=1}^{n} \sqrt{p_i q_i}.$
If we consider the convex mapping Φ : (0, ∞) → R, Φ(t) = −√t, then
$I_\Phi(p, q) = - \sum_{i=1}^{n} q_i \sqrt{\frac{p_i}{q_i}} = - \sum_{i=1}^{n} \sqrt{p_i q_i} = - B(p, q).$
The following inequality holds.
Proposition 3. Let p, q ∈ R^n_+. Then we have the inequality
(3.12)   $0 \le \sqrt{P_n Q_n} - B(p, q) \le \frac{1}{2} \left[ \frac{P_n}{Q_n} \sum_{i=1}^{n} q_i \sqrt{\frac{q_i}{p_i}} - B(p, q) \right].$
The case of equality holds iff $\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}$.
Proof. If Φ(t) = −√t, then Φ'(t) = − (1/2) · (1/√t), t > 0. We have
$I_{\Phi'}\left( \frac{p^2}{q}, p \right) = \sum_{i=1}^{n} p_i \left( - \frac{1}{2} \cdot \frac{1}{\sqrt{p_i^2 / (q_i p_i)}} \right) = - \frac{1}{2} \sum_{i=1}^{n} \sqrt{q_i p_i} = - \frac{1}{2} B(p, q),$
$I_{\Phi'}(p, q) = \sum_{i=1}^{n} q_i \left( - \frac{1}{2} \cdot \frac{1}{\sqrt{p_i / q_i}} \right) = - \frac{1}{2} \sum_{i=1}^{n} q_i \sqrt{\frac{q_i}{p_i}}$
and then, from (2.6), we can state that
$0 \le - B(p, q) + Q_n \sqrt{\frac{P_n}{Q_n}} \le - \frac{1}{2} B(p, q) + \frac{1}{2} \cdot \frac{P_n}{Q_n} \sum_{i=1}^{n} q_i \sqrt{\frac{q_i}{p_i}},$
which is equivalent to (3.12).
The case of equality follows by the strict convexity of the mapping Φ.
Remark 5. The second inequality in (3.12) is equivalent to
(3.13)   $\sqrt{P_n Q_n} \le \frac{1}{2} \left[ \frac{P_n}{Q_n} \sum_{i=1}^{n} q_i \sqrt{\frac{q_i}{p_i}} + B(p, q) \right],$
with equality iff $\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}$.
Corollary 6. If we assume that P_n = Q_n in (3.12), then we obtain
$0 \le P_n - B(p, q) \le \frac{1}{2} \left[ \sum_{i=1}^{n} q_i \sqrt{\frac{q_i}{p_i}} - B(p, q) \right],$
with equality iff p = q.
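A numerical sketch (not in the paper) of Corollary 6 for probability vectors, i.e. P_n = Q_n = 1; the helper names are illustrative.

```python
import math
import random

def random_simplex(n):
    w = [random.uniform(0.1, 1.0) for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

random.seed(3)
for _ in range(1000):
    p, q = random_simplex(5), random_simplex(5)
    bhattacharyya = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    upper = 0.5 * (sum(qi * math.sqrt(qi / pi) for pi, qi in zip(p, q))
                   - bhattacharyya)
    # Corollary 6 with P_n = Q_n = 1
    assert -1e-12 <= 1.0 - bhattacharyya <= upper + 1e-12
```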
Another important divergence measure in Information Theory is the J-divergence defined by (see for example [2])
$J(p, q) = \sum_{i=1}^{n} (p_i - q_i) \log\left( \frac{p_i}{q_i} \right).$
Let us observe that the mapping Φ(t) := (t − 1) ln t, t ∈ (0, ∞), has the derivatives
$\Phi'(t) = \ln t - \frac{1}{t} + 1, \quad t > 0,$
$\Phi''(t) = \frac{t + 1}{t^2} > 0 \quad \text{for } t > 0,$
which shows that Φ is convex on (0, ∞), and
$I_\Phi(p, q) = \sum_{i=1}^{n} q_i \left( \frac{p_i}{q_i} - 1 \right) \ln\left( \frac{p_i}{q_i} \right) = J(p, q).$
We can state the following proposition.
Proposition 4. Let p, q ∈ R^n_+. Then we have the inequality
(3.14)   $0 \le J(p, q) - (P_n - Q_n) \ln\left( \frac{P_n}{Q_n} \right) \le KL(p, q) + \frac{P_n}{Q_n} KL(q, p) + \frac{P_n}{Q_n} \sum_{i=1}^{n} \frac{q_i^2}{p_i} - Q_n.$
The case of equality holds iff $\frac{p_1}{q_1} = \dots = \frac{p_n}{q_n}$.
Proof. We have
$I_{\Phi'}\left( \frac{p^2}{q}, p \right) = \sum_{i=1}^{n} p_i \left[ \ln\left( \frac{p_i^2}{q_i p_i} \right) - \frac{q_i p_i}{p_i^2} + 1 \right] = \sum_{i=1}^{n} p_i \ln\left( \frac{p_i}{q_i} \right) - Q_n + P_n = KL(p, q) - Q_n + P_n,$
$I_{\Phi'}(p, q) = \sum_{i=1}^{n} q_i \left[ \ln\left( \frac{p_i}{q_i} \right) - \frac{q_i}{p_i} + 1 \right] = - KL(q, p) - \sum_{i=1}^{n} \frac{q_i^2}{p_i} + Q_n$
and then, from (2.6), we can state that
$0 \le J(p, q) - Q_n \left( \frac{P_n}{Q_n} - 1 \right) \ln\left( \frac{P_n}{Q_n} \right) \le KL(p, q) - Q_n + P_n - \frac{P_n}{Q_n} \left[ - KL(q, p) - \sum_{i=1}^{n} \frac{q_i^2}{p_i} + Q_n \right]$
$= KL(p, q) + P_n + \frac{P_n}{Q_n} KL(q, p) + \frac{P_n}{Q_n} \sum_{i=1}^{n} \frac{q_i^2}{p_i} - P_n - Q_n = KL(p, q) + \frac{P_n}{Q_n} KL(q, p) + \frac{P_n}{Q_n} \sum_{i=1}^{n} \frac{q_i^2}{p_i} - Q_n$
and the inequality (3.14) is obtained.
4. Further Bounds via the Discrete Grüss Inequality
The following inequality is well known in the literature as the weighted discrete
Grüss inequality.
Lemma 2. Let a_i, b_i, p_i ∈ R (i = 1, ..., n) be such that p_i ≥ 0 with $P_n := \sum_{i=1}^{n} p_i > 0$ and
(4.1)   $a \le a_i \le A, \quad b \le b_i \le B \quad \text{for all } i \in \{1, ..., n\}.$
Then we have the inequality
(4.2)   $\left| \frac{1}{P_n} \sum_{i=1}^{n} p_i a_i b_i - \frac{1}{P_n} \sum_{i=1}^{n} p_i a_i \cdot \frac{1}{P_n} \sum_{i=1}^{n} p_i b_i \right| \le \frac{1}{4} (A - a)(B - b),$
and the constant 1/4 is the best possible one in the sense that it cannot be replaced by a smaller constant independent of n.
Using this result, we can point out the following counterpart of Jensen’s inequality.
Lemma 3. Let f, x_i, t_i be as in Lemma 1. If there exist real constants m, M ∈ I̊ such that m ≤ x_i ≤ M for all i ∈ {1, ..., n}, then we have the inequality:
(4.3)   $0 \le \frac{1}{T_n} \sum_{i=1}^{n} t_i f(x_i) - f\left( \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \right) \le \frac{1}{4} (M - m) \left( f'(M) - f'(m) \right),$
where f' : I̊ → R is the derivative of f.
Proof. Define in Grüss' inequality a_i := x_i and b_i := f'(x_i). As f is convex, f' is monotonic nondecreasing and therefore f'(m) ≤ b_i ≤ f'(M). Applying the Grüss inequality with the weights t_i, we can state that
$\frac{1}{T_n} \sum_{i=1}^{n} t_i x_i f'(x_i) - \frac{1}{T_n} \sum_{i=1}^{n} t_i x_i \cdot \frac{1}{T_n} \sum_{i=1}^{n} t_i f'(x_i) \le \frac{1}{4} (M - m) \left( f'(M) - f'(m) \right)$
and by the inequality (2.1) we deduce (4.3).
Remark 6. Similar results can be obtained by using other Grüss type inequalities.
However, we omit the details.
The following converse inequality for the Csiszár Φ−divergence holds.
Theorem 5. Let Φ, p and q be as in Theorem 2. If there exist real numbers r, R such that 0 < r ≤ p_i/q_i ≤ R < ∞ for all i ∈ {1, ..., n}, then we have the inequality
(4.4)   $0 \le I_\Phi(p, q) - Q_n \Phi\left( \frac{P_n}{Q_n} \right) \le \frac{1}{4} (R - r) \left( \Phi'(R) - \Phi'(r) \right) Q_n.$
Proof. Apply Lemma 3 for f = Φ, t_i = q_i and x_i = p_i/q_i (i = 1, ..., n).
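The Grüss-type bound (4.4) can also be probed numerically. The sketch below (not part of the paper) uses the kernel Φ(t) = t log t and random positive vectors, with r and R taken as the observed extremes of p_i/q_i; the function name is an illustrative choice.

```python
import math
import random

def bounds_4_4(phi, dphi, p, q):
    """Return the middle and right-hand terms of inequality (4.4)."""
    Pn, Qn = sum(p), sum(q)
    ratios = [pi / qi for pi, qi in zip(p, q)]
    r, R = min(ratios), max(ratios)
    lhs = sum(qi * phi(x) for qi, x in zip(q, ratios)) - Qn * phi(Pn / Qn)
    rhs = 0.25 * (R - r) * (dphi(R) - dphi(r)) * Qn
    return lhs, rhs

phi  = lambda t: t * math.log(t)
dphi = lambda t: math.log(t) + 1.0

random.seed(4)
for _ in range(1000):
    p = [random.uniform(0.1, 5.0) for _ in range(6)]
    q = [random.uniform(0.1, 5.0) for _ in range(6)]
    lhs, rhs = bounds_4_4(phi, dphi, p, q)
    assert -1e-9 <= lhs <= rhs + 1e-9   # inequality (4.4)
```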
The following particular inequalities can be noted as well:
(4.5)   $0 \le KL(p, q) - P_n \log\left( \frac{P_n}{Q_n} \right) \le \frac{Q_n}{4} (R - r) \left[ \log(R) - \log(r) \right] \le \frac{Q_n}{4} \cdot \frac{(R - r)^2}{\sqrt{Rr}}.$
Indeed, if we choose Φ(t) = t log t in (4.4), we obtain the Kullback-Leibler divergence. The last inequality in (4.5) follows by the well-known inequality between the geometric mean $G(a, b) = \sqrt{ab}$ and the logarithmic mean $L(a, b) := \frac{b - a}{\log b - \log a}$ (a, b > 0, a ≠ b), which says that
(4.6)   $L(a, b) \ge G(a, b)$ for all a, b > 0, a ≠ b.
In addition, if in (4.4) we put Φ(t) = − log t, we deduce
(4.7)   $0 \le KL(q, p) - Q_n \log\left( \frac{Q_n}{P_n} \right) \le \frac{Q_n}{4} \cdot \frac{(R - r)^2}{Rr},$
provided that 0 < r ≤ p_i/q_i ≤ R < ∞ for all i ∈ {1, ..., n}.
Now, if in (4.4) we choose Φ(t) = t^α (α > 1), t > 0, then we get the following inequality for the α-order Rényi entropy:
(4.8)   $0 \le D_\alpha(p, q) - P_n^{\alpha} Q_n^{1-\alpha} \le \frac{\alpha}{4} Q_n (R - r) \left( R^{\alpha - 1} - r^{\alpha - 1} \right),$
provided that 0 < r ≤ p_i/q_i ≤ R < ∞ for all i ∈ {1, ..., n}.
If we apply Theorem 5 for the Bhattacharyya distance, we get the inequality
(4.9)   $0 \le \sqrt{P_n Q_n} - B(p, q) \le \frac{1}{8} \cdot \frac{(R - r)\left( \sqrt{R} - \sqrt{r} \right)}{\sqrt{rR}} \cdot Q_n,$
provided that 0 < r ≤ p_i/q_i ≤ R < ∞ for all i ∈ {1, ..., n}.
Finally, if we apply Theorem 5 for the J-divergence, we may obtain the inequality
(4.10)   $0 \le J(p, q) - (P_n - Q_n) \ln\left( \frac{P_n}{Q_n} \right) \le \frac{1}{4} (R - r) \left[ \ln \frac{R}{r} + \frac{R - r}{rR} \right] Q_n,$
provided that 0 < r ≤ p_i/q_i ≤ R < ∞ for all i ∈ {1, ..., n}.
References
[1] I. CSISZÁR, Information measures: A critical survey, Trans. 7th Prague Conf. on Info. Th.,
Statist. Decis. Funct., Random Processes and 8th European Meeting of Statist., Volume B,
Academia Prague, 1978, 73-86.
[2] I. CSISZÁR, Information-type measures of difference of probability functions and indirect
observations, Studia Sci. Math. Hungar, 2 (1967), 299-318.
[3] I. CSISZÁR and J. KÖRNER, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, New York, 1981.
[4] J.H. JUSTICE (editor), Maximum Entropy and Bayesian Methods in Applied Statistics, Cambridge University Press, Cambridge, 1986.
[5] J.N. KAPUR, On the roles of maximum entropy and minimum discrimination information
principles in Statistics, Technical Address of the 38th Annual Conference of the Indian Society
of Agricultural Statistics, 1984, 1-44.
[6] I. BURBEA and C.R. RAO, On the convexity of some divergence measures based on entropy
functions, IEEE Transactions on Information Theory, 28 (1982), 489-495.
[7] R.G. GALLAGER, Information Theory and Reliable Communications, J. Wiley, New York,
1968.
[8] C.E. SHANNON, A mathematical theory of communication, Bell Syst. Tech. J., 27 (1948), 379-423 and 623-656.
[9] B.R. FRIEDEN, Image enhancement and restoration, Picture Processing and Digital Filtering (T.S. Huang, Editor), Springer-Verlag, Berlin, 1975.
[10] R.M. LEAHY and C.E. GOUTIS, An optimal technique for constraint-based image restoration and mensuration, IEEE Trans. on Acoustics, Speech and Signal Processing, 34 (1986),
1692-1642.
[11] S. KULLBACK, Information Theory and Statistics, J. Wiley, New York, 1959.
[12] S. KULLBACK and R.A. LEIBLER, On information and sufficiency, Annals Math. Statist.,
22 (1951), 79-86.
[13] R. BERAN, Minimum Hellinger distance estimates for parametric models, Ann. Statist., 5
(1977), 445-463.
[14] A. RÉNYI, On measures of entropy and information, Proc. Fourth Berkeley Symp. Math.
Statist. Prob., Vol. 1, University of California Press, Berkeley, 1961.
[15] S.S. DRAGOMIR and N.M. IONESCU, Some converse of Jensen’s inequality and applications, Anal. Num. Theor. Approx., 23 (1994), 71-78.
[16] S.S. DRAGOMIR and C.J. GOH, A counterpart of Jensen’s discrete inequality for differentiable convex mappings and applications in information theory, Math. Comput. Modelling,
24 (2) (1996), 1-11.
[17] S.S. DRAGOMIR and C.J. GOH, Some counterpart inequalities for a functional associated with Jensen's inequality, J. of Ineq. & Appl., 1 (1997), 311-325.
[18] S.S. DRAGOMIR and C.J. GOH, Some bounds on entropy measures in information theory,
Appl. Math. Lett., 10 (1997), 23-28.
[19] S.S. DRAGOMIR and C.J. GOH, A counterpart of Jensen’s continuous inequality and applications in information theory, RGMIA Preprint,
http://matilda.vu.edu.au/~rgmia/InfTheory/Continuse.dvi
[20] M. MATIĆ, Jensen’s inequality and applications in Information Theory (in Croatian), Ph.D.
Thesis, Univ. of Zagreb, 1999.
[21] S.S. DRAGOMIR, J. ŠUNDE and M. SCHOLZ, Some upper bounds for relative entropy and
applications, Comp. and Math. with Appl. ( in press).
[22] J.N. KAPUR, A comparative assessment of various measures of directed divergence, Advances
in Management Studies, 3 (1984), No. 1, 1-16.
[23] S.S. DRAGOMIR, Some inequalities for the Csiszár Φ-divergence, submitted.
School of Communications and Informatics, Victoria University of Technology, PO Box 14428, Melbourne City MC 8001, Victoria, Australia
E-mail address: [email protected]
URL: http://rgmia.vu.edu.au/SSDragomirWeb.html