THE CRAMER-RAO FUNCTIONAL AND LIMITING LAWS

by

Eddy Mayer-Wolf

Department of Mathematics and Statistics
University of Massachusetts
Amherst, Massachusetts 01003

and

Department of Statistics
University of North Carolina
Chapel Hill, NC 27599-3260
USA
Key Words and Phrases: Fisher matrix, Cramer-Rao functional, convergence in variation.

AMS 1980 subject classifications: Primary 60F05; Secondary 49B50.
ABSTRACT

Global versions of the Fisher information matrix and the Cramer-Rao inequality are considered. We study properties of the Fisher matrix such as continuity and convexity and use the Cramer-Rao functional as a variational tool to prove convergence to Gaussian laws. These concepts are generalized to non-Gaussian limiting laws.
§1. Introduction and Notations
In the context of statistical estimation, the Cramer-Rao inequality is well-known. Namely, with any member of a family $\{P_\alpha\}_{\alpha \in H}$ of probability densities on $\mathbb{R}^n$, sufficiently smooth in $\alpha \in H$ (an open set in, say, $\mathbb{R}^m$), is associated the Fisher information matrix $J_\alpha$ of order $m$,

$$J_\alpha = E_\alpha\left[\frac{(\nabla_\alpha p_\alpha(X))(\nabla_\alpha p_\alpha(X))^T}{p_\alpha(X)^2}\right]$$

($\nabla_\alpha$ is the gradient operator with respect to $\alpha$, $E_\alpha$ the expectation with density $p_\alpha$).
The Cramer-Rao inequality provides, in terms of $J_\alpha$, a lower bound for the error matrix of any unbiased estimator $\hat\alpha = h(X)$:

$$E_\alpha\left[(h(X) - \alpha)(h(X) - \alpha)^T\right] \ge J_\alpha^{-1} \qquad \forall \alpha \in H. \tag{1.1}$$

(Throughout, the matrix inequality $A \ge B$ means that $A - B$ is a symmetric non-negative matrix.)
Consider the following particular case (cf. [Pitman, 1979, pp. 36-39]) in which $n = m$: if $p(x)$ is a "reasonable" density on $\mathbb{R}^n$ with $\mu = \int_{\mathbb{R}^n} x\,p(x)\,dx$ and if $p_\alpha(x) = p(x - \alpha)$ for each $\alpha$ in some neighborhood of the origin, then clearly $h(x) = x - \mu$ is an unbiased estimator of the location parameter $\alpha$. For $\alpha = 0$, (1.1) becomes

$$\operatorname{covar}(X) \ge J^{-1} \tag{1.2}$$

where $X$ is a random vector distributed with density $p$ and $J\,(= J_0)$ is given by $E[(\nabla p(X))(\nabla p(X))^T / p(X)^2]$. Note that (1.2) is an inequality on the density $p$, not necessarily connected with any estimation setup, and has a simple direct proof.
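A numerical sketch (ours, not part of the paper; it assumes NumPy) illustrates (1.2) on a grid: for a standard Gaussian the inequality holds with equality, while for a standard logistic density it is strict, since there $J = 1/3$ and $\operatorname{var} = \pi^2/3$, so $\operatorname{var} \cdot J = \pi^2/9 > 1$.

```python
import numpy as np

def trapz(y, x):
    # trapezoidal rule, written out to avoid NumPy version differences
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def fisher_and_var(p, dp, x):
    """J = int (p')^2/p dx and the variance of the density p on the grid x."""
    pos = p > 1e-300
    J = trapz(np.where(pos, dp**2 / np.where(pos, p, 1.0), 0.0), x)
    m = trapz(x * p, x)
    return J, trapz((x - m)**2 * p, x)

x = np.linspace(-30.0, 30.0, 200001)

# standard Gaussian: equality in (1.2), covar(X) = 1 = J^{-1}
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
J_g, v_g = fisher_and_var(phi, -x * phi, x)

# standard logistic: J = 1/3, var = pi^2/3, so var * J = pi^2/9 > 1 (strict)
F = 1.0 / (1.0 + np.exp(-x))
p_log = F * (1.0 - F)                       # logistic density
J_l, v_l = fisher_and_var(p_log, p_log * (1.0 - 2.0 * F), x)

print(v_g * J_g, v_l * J_l)                 # approximately 1.0 and 1.0966
```

The derivatives are supplied analytically so that the only numerical error is in the quadrature.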
Here, for (1.2), we shall assume fewer regularity conditions on $p$ ($p \in \tilde H(\mathbb{R}^n)$ as in (2.2)) than the usual proofs require (cf. [Rao, 1973]).
This work concentrates on two issues connected with inequality (1.2). In Section 2 we shall consider properties of the Fisher matrix $J = J(\mu)$ as a function of probability measures (such as continuity, convexity, etc.). The main result in Section 2 is Theorem 2.1. Theorem 2.2 is known with additional assumptions ([Barron, 1986], [Stam, 1959]); it is included here as an immediate consequence of Theorem 2.1.
In Section 3 we shall exploit the simple fact that equality is achieved in (1.2) iff $\mu$ is Gaussian to establish that probability measures which "nearly" achieve equality in (1.2) are "close" to being normal, thus providing a tool for proving convergence to Gaussian laws. In particular, we discuss in Section 3 the use of these methods by [Brown, 1982] and [Barron, 1986] in the classical CLT. Here we just mention that asymptotic equality in (1.2) for the normalized sums of iid's has not yet been established in full generality.
On the other hand we have succeeded in using the results of Section 3, extended to random measures, to prove a CLT in nonlinear filtering. Namely, if $\pi_t^T$ denotes the conditional probability measure of the state of a diffusion process $x_t$ conditioned on the paths $Y_0^t$ of a second observation process, with $T$ denoting the "signal to noise ratio" in the observation process, then in probability as $T \to \infty$, $\pi_t^T$, suitably normalized, converges to a Gaussian law ([Mayer-Wolf, 1987]).
Section 4 extends (1.2) to an inequality involving a weighted Fisher matrix. This extension allows one to obtain, along the lines of Section 3, convergence to limiting laws other than normal.
The following notation will be used. For any topological space $X$, $\mathcal{B}(X)$ will denote the Borel $\sigma$-algebra, $P(X)$ is the set of probability measures on $(X, \mathcal{B}(X))$ and $\lambda_n$ is the Lebesgue measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$; weak convergence in $P(X)$ will be denoted as usual by $\Rightarrow$. We use the standard $C(X)$, $C_0(X)$ and $L^p(X)$ function spaces, $1 \le p \le \infty$ (with respective norms $\|\cdot\|_\infty$, $\|\cdot\|_\infty$, $\|\cdot\|_p$); also, for any finite signed measure $\mu$ on $(X, \mathcal{B}(X))$, its total variation norm is

$$\|\mu\|_1 = \sup_{f \in C_0(X),\ \|f\|_\infty = 1} \left| \int_X f(x)\,\mu(dx) \right|.$$

In $P(X)$ we say that $\mu_n$ converges to $\mu$ in total variation and write $\mu_n \xrightarrow{v} \mu$ if $\|\mu - \mu_n\|_1 \to 0$.

The trace of a matrix $A = (a_{ij})$ is $\operatorname{tr} A = \sum_i a_{ii}$ and $\nabla$ is the formal (weak) gradient operator acting on functions of $n$ real variables, $\nabla f = \left(\partial f / \partial x_i\right)_{i=1}^n$. Finally, the $n$-dimensional Gaussian densities are denoted by

$$\varphi_{a,A}(x) = \left((2\pi)^n \det A\right)^{-1/2} \exp\left(-\tfrac12 (x-a)^T A^{-1} (x-a)\right),$$

the corresponding distribution functions by $\Phi_{a,A}$ and the measures they induce on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ by $\mu_{a,A}$.
§2. The Fisher Matrix and Its Properties

For a region $D \subset \mathbb{R}^n$ let $W^{1,2}(D)$ be the Sobolev space of functions belonging to $L^2(D)$ together with their first (weak) derivatives (cf. [Adams, 1975]), equipped with the norm $\|f\|_{1,2} = \left(\|f\|_2^2 + \|\,|\nabla f|\,\|_2^2\right)^{1/2}$, and define

$$H(D) = \left\{\mu \in P(D) \ \middle|\ p = \frac{d\mu}{d\lambda_n} \text{ exists and } p^{1/2} \in W^{1,2}(D)\right\} \tag{2.1}$$

and

$$\tilde H(D) = \left\{\mu \in H(D) \ \middle|\ \int_D |x|^2\,\mu(dx) < \infty\right\}. \tag{2.2}$$
For $\mu \in H(D)$ we define the Fisher matrix

$$J(\mu) = 4 \int_D (\nabla p^{1/2}(x))(\nabla p^{1/2}(x))^T\,dx, \tag{2.3}$$

while if $\mu \in \tilde H(D)$ its mean and covariance matrix are

$$m(\mu) = \int_D x\,\mu(dx) \tag{2.4}$$

and

$$\Sigma(\mu) = \int_D (x - m(\mu))(x - m(\mu))^T\,\mu(dx). \tag{2.5}$$
Remarks:

a) Setting $A = \{x \in D \mid p(x) = 0\}$, we have $\mu(A) = \int_D 1_A(x)\,p(x)\,dx = 0$. Then the matrix $J(\mu)$ can be rewritten as any of the following equivalent expressions:

$$J(\mu) = \int_D \frac{(\nabla p(x))(\nabla p(x))^T}{p(x)}\,dx = \int_D \left[\frac{\nabla p(x)}{p(x)}\right]\left[\frac{\nabla p(x)}{p(x)}\right]^T \mu(dx) = \int_D (\nabla \ln p(x))(\nabla \ln p(x))^T\,\mu(dx) = -\int_D \nabla\nabla^T \ln p(x)\,\mu(dx). \tag{2.6}$$

(The last term requires further regularity for $p$.)
b) For $n = 1$, it follows from Sobolev's imbedding theorem ([Adams, 1975]) that if $\mu \in H(D)$ then $p$ is necessarily bounded and continuous. For $n \ge 2$ this is no longer true.
c) We shall not distinguish between a random vector $X$ and the measure $\mu$ it induces. Thus, terminology such as $X \in \tilde H(D)$, $J(X)$ or "the mean of $\mu$" will be adopted freely.
THEOREM 2.1. Let $D$ be a region in $\mathbb{R}^n$ and $(\mu_k)_{k \ge 1}$ a sequence in $H(D)$ with densities $p_k$, and $\mu_k \Rightarrow \mu \in P(D)$ as $k \to \infty$. Furthermore assume that

$$\operatorname{tr} J(\mu_k) \le M < \infty \qquad \forall k \ge 1.$$

Then

(i) $\mu \in H(D)$ (with density $p$),

(ii) $\operatorname{tr} J(\mu) \le \liminf_{k\to\infty} \operatorname{tr} J(\mu_k)$,

(iii) $p_k \to p$ in $L^1$ as $k \to \infty$ (equivalently $\mu_k \xrightarrow{v} \mu$).

Proof: First note that if $\nu \in H(D)$ with density $v$ then

$$\|v^{1/2}\|_2^2 = 1 \tag{2.7}$$

and

$$\|v^{1/2}\|_{1,2}^2 = 1 + \tfrac14 \operatorname{tr} J(\nu). \tag{2.8}$$
Assume now that $D$ is a bounded region. By (2.7) and (2.8), $(p_k^{1/2})$ is a bounded sequence in $W^{1,2}(D)$ and thus, by the Rellich-Kondrachov compact imbedding theorem ([Adams, 1975, Th. 6.2]), a compact sequence in $L^2(D)$. It follows that there exists a subsequence of $(p_k^{1/2})$ (which, without loss of generality, we shall assume to be $(p_k^{1/2})$ itself) such that $p_k^{1/2} \to g$ in the weak topology of $W^{1,2}(D)$ and $p_k^{1/2} \to \tilde g$ in $L^2(D)$. It is straightforward to verify that $g = \tilde g$ a.e. We have $p_k = (p_k^{1/2})^2 \to g^2$ in $L^1(D)$, which together with $\mu_k \Rightarrow \mu$ proves that $g^2$ is indeed $\mu$'s density $p$. Since $p^{1/2} = g \in W^{1,2}(D)$, (i) also holds, and, as just noted, conclusion (iii) holds. Furthermore, since in any Banach space the norm is a lower semicontinuous function in the weak topology, and in view of (2.8),

$$1 + \tfrac14 \operatorname{tr} J(\mu) = \|p^{1/2}\|_{1,2}^2 \le \liminf_{k\to\infty} \|p_k^{1/2}\|_{1,2}^2 = 1 + \tfrac14 \liminf_{k\to\infty} \operatorname{tr} J(\mu_k),$$

which proves (ii).
Now let $D$ be an arbitrary region, not necessarily bounded. Choose $\psi \in C^2(\mathbb{R})$ such that $\psi|_{(-\infty,0]} \equiv 1$, $\psi|_{[1,\infty)} \equiv 0$ and $0 < \psi(\rho) < 1$ for $\rho \in (0,1)$. (An example of such a function is $\psi(\rho) = \exp\{-\rho^3/(1-\rho^2)\}$ on $(0,1)$.) For a sufficiently large positive $N$ we truncate any probability measure $\nu$ with density $v$ to a probability measure $\nu^N$ by defining its density $v^N(x) = a_N(\nu)\,\psi(|x| - N)\,v(x)$, where $a_N(\nu)$ is a positive normalizing factor. Note that as $N \to \infty$, $a_N(\nu) \to 1$ and $v^N \to v$ in $L^1(D)$, uniformly on any tight family of probability measures. Furthermore, if $\nu \in H(D)$ one can verify directly that

$$\operatorname{tr} J(\nu^N) \le a_N(\nu)\left[\operatorname{tr} J(\nu) + C\,\nu(|x| > N)\right] \tag{2.9}$$

where $C$ is a constant independent of $N$ and $\nu$.
Returning to the given sequence $(\mu_k)$, for any sufficiently large $N$ it is obvious that $\mu_k^N \Rightarrow \mu^N$, and from (2.9) it follows that the uniform trace bound is satisfied for the truncated measures; hence the theorem's conclusions (i), (ii), and (iii) hold for them, i.e. $\mu^N \in H(D \cap \{|x| \le N+1\})$ (with a density we shall call $p^N$ by an abuse of notation) and $p_k^N \to p^N$ in $L^1$. Now, being weakly convergent, $(\mu_k)$ is a tight sequence so that, by a previous remark, $(p_k^N)_{N \ge 1}$ are $L^1$-Cauchy sequences uniformly in $k = 1, 2, \ldots$, from which $(p^N)_{N=1}^\infty$ is also Cauchy, thus converging in $L^1$ to a density $p$ which obviously is $\mu$'s density. From (2.9) and the tightness of $(\mu_k)$ it follows that for each $\epsilon > 0$

$$\operatorname{tr} J(\mu^N) \le a_N(\mu) \liminf_{k\to\infty}\left[\operatorname{tr} J(\mu_k) + C\,\mu_k(|x| > N)\right] \le (1+\epsilon)\liminf_{k\to\infty} \operatorname{tr} J(\mu_k) + \epsilon$$

for all $N$ sufficiently large, from which we obtain the theorem's first two conclusions. Finally, conclusion (iii) follows by using the equivalent result for the truncated measures and a standard $\epsilon/3$ argument. ∎

Remark: Let $w \in \mathbb{R}^n$ have positive components and consider the weighted trace of a matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$, namely

$$\operatorname{tr}_w A = \sum_{i=1}^n w_i\,a_{ii}. \tag{2.10}$$
Then Theorem 2.1 obviously remains true if "$\operatorname{tr}$" is replaced everywhere by "$\operatorname{tr}_w$". (The Sobolev space $W^{1,2}(D)$ should be equipped in this case with the equivalent norm $\|f\|_{1,2,w} = \left(\|f\|_2^2 + \sum_{i=1}^n w_i \left\|\partial f/\partial x_i\right\|_2^2\right)^{1/2}$.)
LEMMA 2.1. Let $X$ be a random vector in $H(\mathbb{R}^n)$, $b \in \mathbb{R}^n$ and $A$ a non-singular $n \times n$ matrix. Then $AX + b \in H(\mathbb{R}^n)$ and $J(AX + b) = (A^{-1})^T J(X)\,A^{-1}$.

Proof: Follows immediately from the definitions. ∎
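Lemma 2.1 can be sanity-checked against the Gaussian closed form $J(N(m, S)) = S^{-1}$ (which is consistent with (2.6)). The following sketch (ours, assuming NumPy) verifies the transformation rule for a particular non-singular $A$; the shift $b$ drops out of the Fisher matrix entirely.

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])            # covariance of X ~ N(0, Sigma)
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])                # non-singular transformation

# Gaussian closed form: J(N(m, S)) = S^{-1}; AX + b ~ N(Am + b, A Sigma A^T)
J_X = np.linalg.inv(Sigma)
J_AXb = np.linalg.inv(A @ Sigma @ A.T)    # Fisher matrix of AX + b

Ainv = np.linalg.inv(A)
rhs = Ainv.T @ J_X @ Ainv                 # (A^{-1})^T J(X) A^{-1}

print(np.allclose(J_AXb, rhs))            # True
```

The check reduces to the matrix identity $(A \Sigma A^T)^{-1} = (A^{-1})^T \Sigma^{-1} A^{-1}$, which is exactly the content of the lemma in the Gaussian case.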
COROLLARY 2.1. Conclusion (ii) of Theorem 2.1 can be replaced by the stronger statement:

(ii)' For each $a \in \mathbb{R}^n$, $a^T J(\mu)\,a \le \liminf_{k\to\infty} a^T J(\mu_k)\,a$.

In particular the function $a^T J(\cdot)\,a$, extended to $+\infty$ on $P(D) \setminus H(D)$, is lower semicontinuous on $P(D)$.
Proof: First let $\epsilon > 0$ and conclude (ii) of Theorem 2.1 with "$\operatorname{tr}_w$" replacing "$\operatorname{tr}$" (as in the remark following the theorem), where $w = w_\epsilon = (1, \epsilon, \ldots, \epsilon)^T \in \mathbb{R}^n$; letting $\epsilon \to 0$ we obtain

$$(J(\mu))_{11} \le \liminf_{k\to\infty} (J(\mu_k))_{11}.$$

Next, given $0 \ne a \in \mathbb{R}^n$, consider a non-singular matrix $A$ with $a$ as its first column. Letting $\tilde\mu_k$ and $\tilde\mu$ be the laws of $A^{-1}X_k$ and $A^{-1}X$, where $X_k \sim \mu_k$ and $X \sim \mu$, and using Lemma 2.1,

$$a^T J(\mu)\,a = (A^T J(\mu) A)_{11} = (J(\tilde\mu))_{11} \le \liminf_{k\to\infty} (J(\tilde\mu_k))_{11} = \liminf_{k\to\infty} a^T J(\mu_k)\,a. \qquad ∎$$

LEMMA 2.2. For any region $D \subset \mathbb{R}^n$, $H(D)$ is a convex subset of $P(D)$ and $J(\cdot)$ is a convex (matrix valued) function on $H(D)$.
Proof: If $\mu_i \in H(D)$ with respective densities $p_i$, $i = 1, 2$, and if $\alpha, \beta \ge 0$ and $\alpha + \beta = 1$, set $\mu = \alpha\mu_1 + \beta\mu_2$ (whose density is $p = \alpha p_1 + \beta p_2$). The lemma then follows from the joint convexity of $(u, v) \mapsto u u^T / v$ on $\mathbb{R}^n \times (0, \infty)$, applied pointwise at any $x$ for which $p(x) > 0$. ∎
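Lemma 2.2's convexity bound, combined with the lower bound (1.2), brackets the Fisher information of a mixture. A numerical sketch (ours, assuming NumPy) for $\mu = \frac12 N(0,1) + \frac12 N(0,4)$, for which $\alpha J(\mu_1) + \beta J(\mu_2) = 0.625$ and $1/\operatorname{var}(\mu) = 1/2.5 = 0.4$:

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def gauss(x, s2):
    return np.exp(-x**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)

x = np.linspace(-40.0, 40.0, 400001)
p1, p2 = gauss(x, 1.0), gauss(x, 4.0)          # J = 1 and J = 1/4 respectively
p = 0.5 * p1 + 0.5 * p2                        # mixture density
dp = 0.5 * (-x) * p1 + 0.5 * (-x / 4.0) * p2   # its derivative, analytically

pos = p > 1e-300
J_mix = trapz(np.where(pos, dp**2 / np.where(pos, p, 1.0), 0.0), x)

convex_bound = 0.5 * 1.0 + 0.5 * 0.25          # alpha J(mu_1) + beta J(mu_2)
cramer_rao = 1.0 / trapz(x**2 * p, x)          # 1/var(mu), lower bound by (1.2)
print(cramer_rao, J_mix, convex_bound)         # J_mix lies between the two bounds
```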
THEOREM 2.2. If $\mu \in \tilde H(D_1)$ and $\eta \in P(D_2)$ then $\mu * \eta \in \tilde H(D_1 + D_2)$ and

$$J(\mu * \eta) \le J(\mu). \tag{2.11}$$

Proof: First, if $\mu_{x_0}$ is the shifted measure $\mu_{x_0}(B) = \mu(B - x_0)$, obviously $J(\mu_{x_0}) = J(\mu)$ (see also Lemma 2.1). So by extending Lemma 2.2 to all finite convex combinations, inequality (2.11) is true for $\eta$ of the form

$$\eta = \sum_{i=1}^m \alpha_i\,\delta_{x_i}. \tag{2.12}$$

Now, any arbitrary $\eta$ can be approximated weakly by a sequence $(\eta_k)$ of measures of the form (2.12), which thus satisfy $J(\mu * \eta_k) \le J(\mu)$; since obviously $\mu * \eta_k \Rightarrow \mu * \eta$, (2.11) follows from Corollary 2.1. ∎

COROLLARY 2.2. For every $\mu \in \tilde H(\mathbb{R}^n)$ there exists a sequence $(\mu_k)$ in $\tilde H(\mathbb{R}^n)$, with $\mu_k$ having positive $C^\infty$ densities for all $k$, such that $\mu_k \Rightarrow \mu$ and $J(\mu_k) \to J(\mu)$ as $k \to \infty$.

Proof: Simply define $\mu_k = \mu * \mu_{0, k^{-1}I}$ (recall $\mu_{a,A} \sim N(a,A)$). Obviously $\mu_k \Rightarrow \mu$ and, from Theorem 2.2, $J(\mu_k) \le J(\mu)$ for all $k$. These two facts and Theorem 2.1 imply that $\operatorname{tr} J(\mu) \le \liminf_{k\to\infty} \operatorname{tr} J(\mu_k)$, from which the corollary follows. ∎
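The monotonicity $J(\mu * \eta) \le J(\mu)$ of Theorem 2.2 can be observed numerically. The sketch below (ours, assuming NumPy) convolves a logistic $\mu$ (with $J(\mu) = 1/3$) with $\eta = N(0,1)$ on a grid; by [Stam, 1959] one in fact expects the sharper bound $J(\mu * \eta) \le (J(\mu)^{-1} + J(\eta)^{-1})^{-1} = 1/4$.

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def fisher(p, x):
    dp = np.gradient(p, x)               # numerical derivative of the density
    pos = p > 1e-12                      # ignore far tails, where p is negligible
    return trapz(np.where(pos, dp**2 / np.where(pos, p, 1.0), 0.0), x)

x = np.linspace(-40.0, 40.0, 8001)
dx = x[1] - x[0]

F = 1.0 / (1.0 + np.exp(-x))
p_mu = F * (1.0 - F)                             # logistic mu, J(mu) = 1/3
p_eta = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # eta = N(0, 1), J(eta) = 1

p_conv = np.convolve(p_mu, p_eta, mode="same") * dx   # density of mu * eta

J_mu, J_conv = fisher(p_mu, x), fisher(p_conv, x)
print(J_mu, J_conv)                      # approximately 0.333 and a value below 0.25
```

Since both densities are centered on a symmetric grid, `mode="same"` keeps the convolution aligned with `x`.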
For completeness' sake we shall include the following result on the Fisher matrix of marginal laws.

LEMMA 2.3. Let $X$ be a random vector in $H(\mathbb{R}^n)$ and consider the partition $X = (X^{(1)}, X^{(2)})$, $X^{(1)} \in \mathbb{R}^k$, $X^{(2)} \in \mathbb{R}^{n-k}$ with $1 \le k < n$. Then $X^{(1)} \in H(\mathbb{R}^k)$ and

$$J(X^{(1)}) \le \left((J(X))_{i,j}\right)_{i,j=1}^k.$$

Proof: See [Bobrovsky, Mayer-Wolf and Zakai (1987), Proposition 1]. ∎
Finally we state the global Cramer-Rao inequality (1.2).

THEOREM 2.3. Let $X$ be a random vector in $\tilde H(\mathbb{R}^n)$. Then

$$\Sigma(X) \ge J(X)^{-1} \tag{2.13}$$

with equality if and only if $X$ is Gaussian.

Remarks:

(a) The Theorem implicitly states that $J(X)$ is non-singular.

(b) Equivalent formulations of (2.13) are $J(X) \ge \Sigma(X)^{-1}$, $\Sigma(X)^{1/2} J(X)\,\Sigma(X)^{1/2} \ge I$, and so on.

(c) The assumption $X \in \tilde H(\mathbb{R}^n)$ allows us to verify that $\int (x - EX)(\nabla p)^T(x)\,dx = -I$ (by integration by parts), and the proof may be completed as in [Rao, 1973, p. 327]. A detailed proof of a more general result appears in [Bobrovsky, Mayer-Wolf, Zakai, 1987].

Motivated by (2.13) we define the Cramer-Rao "functional"

$$\Gamma(\mu) = \Sigma(\mu)^{1/2} J(\mu)\,\Sigma(\mu)^{1/2} - I. \tag{2.14}$$

By (2.13) and remark (b), $\Gamma(\mu) \ge 0$, with equality if and only if $\mu$ is Gaussian; in the scalar case $\Gamma(\mu) = \sigma^2 J(\mu) - 1$.
§3. The Cramer-Rao Functional and Divergence from Normality

A natural question which arises from Theorem 2.3 is whether the matrix valued Cramer-Rao functional $\Gamma(\mu)$ defined in (2.14) can serve as an effective measure of divergence of $\mu$ from normality. In this section we shall provide a positive answer in the sense that (loosely) $\Gamma(\mu_n) \to 0$ implies that $\mu_n$ tends to a Gaussian probability measure.
LEMMA 3.1. Let $\mu \in \tilde H(\mathbb{R})$ have density $p$, mean $a$ and variance $\sigma^2$. Then for any $a_0 \in \mathbb{R}$ and $\sigma_0 > 0$

$$\|p - \varphi_{a_0,\sigma_0^2}\|_1 \le C\left(\Gamma(\mu)^{1/2} + |a - a_0| + |\sigma - \sigma_0|\right) \tag{3.1}$$

where $C$ is an absolute constant.

Proof: First assume $a = a_0$ and $\sigma = \sigma_0$; without loss of generality we may assume they are respectively 0 and 1, since both the LHS of (3.1) and $\Gamma(\mu)$ are invariant under an affine transformation of the underlying random variables. Denote

$$g(x) = p'(x) + x\,p(x). \tag{3.2}$$

Note that

$$\|g\|_1 = \int_{-\infty}^{\infty} \left|\frac{p'(x)}{p(x)} + x\right| \mu(dx) \le \left(\int_{-\infty}^{\infty} \left|\frac{p'(x)}{p(x)} + x\right|^2 \mu(dx)\right)^{1/2} = \left(J(\mu) + 2\int_{-\infty}^{\infty} x\,p'(x)\,dx + 1\right)^{1/2} = \Gamma(\mu)^{1/2}. \tag{3.3}$$

Now, solving the ODE (3.2), we may write

$$p(x) = e^{-x^2/2}\left[\frac{1}{\sqrt{2\pi}} + \int_{-\infty}^{\infty} e^{s^2/2}\,g(s)\left(\Phi_{0,1}(s) - 1_{[0,\infty)}(s - x)\right)ds\right]$$

so that

$$|p(x) - \varphi_{0,1}(x)| \le e^{-x^2/2} \int_{-\infty}^{\infty} e^{s^2/2}\,|g(s)|\left|\Phi_{0,1}(s) - 1_{[0,\infty)}(s - x)\right| ds.$$

Now the continuous function $h(s) = e^{s^2/2}\,\Phi_{0,1}(s)(1 - \Phi_{0,1}(s))$ tends to 0 at $\pm\infty$, so that it is bounded; actually it is not hard to prove that $\|h\|_\infty = h(0) = 1/4$, from which we conclude $\|p - \varphi_{0,1}\|_1 \le \sqrt{\pi/2}\,\|g\|_1$, which together with (3.3) proves the lemma in this particular case.

In the general case just note that $\|\varphi_{a,\sigma^2} - \varphi_{a_0,\sigma_0^2}\|_1 \le C(|a - a_0| + |\sigma - \sigma_0|)$, and it is a straightforward exercise to verify that this, combined with the particular case, yields (3.1). ∎
Remarks.

a) An inequality

$$\|p - \varphi_{a_0,\sigma_0^2}\|_q \le C\left(\Gamma(\mu)^{1/2} + |a - a_0| + |\sigma - \sigma_0|\right) \tag{3.1'}$$

can be obtained along the same lines as Lemma 3.1 for $q = \infty$, and by interpolation for $1 < q < \infty$ as well.

b) A sharper inequality than (3.1) follows (take $a_0 = 0$, $\sigma_0 = 1$) from the chain

$$\|p - \varphi_{0,1}\|_1^2 \le 2\int_{-\infty}^{\infty} p \log(p/\varphi_{0,1}) \le \log(1 + \Gamma(\mu))$$

([Csiszar, 1967, Th. 4.1] and [Stam, 1959, Eq. (2.3)]). However, this proof doesn't extend to bounds on the uniform norm as in the previous remark.
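The chain in remark b) can be checked numerically. The sketch below (ours, assuming NumPy) uses a standardized Gaussian mixture of our own choosing, with mean 0 and variance 1, and evaluates all three quantities on a grid:

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(-30.0, 30.0, 600001)
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

eps = 0.5                                 # mixture offset (our choice)
s2 = 1.0 - eps**2                         # component variance, so var(mu) = 1
g_m = np.exp(-(x + eps)**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
g_p = np.exp(-(x - eps)**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
p = 0.5 * (g_m + g_p)                     # mean 0, variance 1, non-Gaussian
dp = 0.5 * (-(x + eps) / s2 * g_m - (x - eps) / s2 * g_p)

pos = p > 1e-300
J = trapz(np.where(pos, dp**2 / np.where(pos, p, 1.0), 0.0), x)
Gamma = J - 1.0                           # scalar Cramer-Rao functional: var * J - 1

l1 = trapz(np.abs(p - phi), x)            # || p - phi_{0,1} ||_1
kl = trapz(np.where(pos, p * np.log(np.where(pos, p / phi, 1.0)), 0.0), x)
print(l1**2, 2 * kl, np.log(1.0 + Gamma))  # in increasing order, as the chain asserts
```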
THEOREM 3.1. Let $Z_j$, $j \ge 1$, be a sequence of random vectors in $\tilde H(\mathbb{R}^n)$. If

(i) $EZ_j \to a$, $\operatorname{cov}(Z_j) \to \Lambda$ (positive definite), and

(ii) $\Gamma(Z_j) \to 0$,

then $Z_j \xrightarrow{v} Z \sim N(a, \Lambda)$.

Proof: The case $n = 1$ follows directly from Lemma 3.1. For the general case choose $a \in \mathbb{R}^n$ with $|a| = 1$ and an orthonormal matrix $A$ with $a^T$ as its first row. For any random vector $X$ it follows from Lemma 2.1 and Lemma 2.3 that

$$J(a^T X) = J((AX)_1) \le (J(AX))_{11} = a^T J(X)\,a \tag{3.4}$$

so that, denoting $\tilde Z_j = (\operatorname{cov}(Z_j))^{-1/2} Z_j$, we have $E\,a^T \tilde Z_j \to a^T \Lambda^{-1/2} a$, $\operatorname{cov}(a^T \tilde Z_j) = 1$ and, by (3.4), $\Gamma(a^T \tilde Z_j) \to 0$, so that from the scalar case we conclude that for all $a$, $a^T \tilde Z_j \xrightarrow{v} a^T \tilde Z$, where $\tilde Z \sim N(\Lambda^{-1/2} a, I)$. By the Cramer-Wold technique [Billingsley, 1968, Theorem 7.7] it follows that $\tilde Z_j \Rightarrow \tilde Z$, or $Z_j \Rightarrow Z \sim N(a, \Lambda)$.

Note that only weak convergence was obtained, because the Cramer-Wold method does not extend to $L^1$ convergence. However, denoting $\Lambda_j = \operatorname{cov}(Z_j)$, $\operatorname{tr} J(Z_j) \le \|\Lambda_j^{-1}\|\,(n + \sup_j \operatorname{tr} \Gamma(Z_j)) < \infty$, so by Theorem 2.1, $Z_j \xrightarrow{v} Z$. ∎
As an illustration, we shall conclude this section by relating its contents to the classical Central Limit Theorem, where the $X_n$ are i.i.d. mean 0, variance 1 random variables in $H(\mathbb{R})$ and $S_n = n^{-1/2}\sum_{i=1}^n X_i$. Direct applications of the results in Section 2 yield simple proofs of known convolution inequalities (cf. [Stam 1959, Eq. (2.9)]); in particular

$$(n+m)\,J(S_{n+m}) \le n\,J(S_n) + m\,J(S_m),$$

which shows by subadditivity that $L = \lim_n J(S_n)$ exists. Brown (1982) showed that $L = 1$ if we assume $X_1 = U + \delta Z$ ($\delta > 0$ and $Z \sim N(0,1)$ independent of $U$), thus proving a total variation CLT in this case. By letting $\delta \to 0$ Brown provided an alternative proof of the (weak convergence) CLT. Barron (1986) went a step further and proved "convergence in entropy" in the CLT, which implies $L^1$ convergence but is weaker than $L = 1$. The question whether $L = 1$ in general remains open. (We note that $L^1$ convergence has already been established in [Prohorov, 1952]; for $X \in H$ it also follows directly from our Theorem 2.1.)
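One convolution step of the subadditivity above can be watched numerically. The sketch below (ours, assuming NumPy) takes $X_i$ standardized logistic, computes $J(S_1) = \pi^2/9$, and obtains $J(S_2) = 2\,J(X_1 + X_2)$ from a grid convolution and the scaling rule $J(aX) = a^{-2} J(X)$; the result must lie between 1 and $J(S_1)$.

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def fisher(p, x):
    dp = np.gradient(p, x)
    pos = p > 1e-12
    return trapz(np.where(pos, dp**2 / np.where(pos, p, 1.0), 0.0), x)

x = np.linspace(-30.0, 30.0, 8001)
dx = x[1] - x[0]

c = np.pi / np.sqrt(3.0)                 # standardizing constant of the logistic
F = 1.0 / (1.0 + np.exp(-c * x))
p1 = c * F * (1.0 - F)                   # density of X_1 (mean 0, variance 1)

p_sum = np.convolve(p1, p1, mode="same") * dx   # density of X_1 + X_2

J1 = fisher(p1, x)                       # = pi^2/9, about 1.0966
J2 = 2.0 * fisher(p_sum, x)              # J(S_2) = J((X_1+X_2)/sqrt(2)) = 2 J(X_1+X_2)
print(J1, J2)                            # 1 <= J2 <= J1
```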
§4. Non-Gaussian Limit Laws

A natural question which arises is whether the techniques of Section 3 can be modified so as to yield limiting laws other than Gaussian. In this section we shall show a way of doing this; we restrict ourselves to the scalar case $n = 1$.

Let $\sigma(x)$ be a measurable function of a real variable satisfying

$$0 < \sigma(x) < \infty \qquad \forall x \in \mathbb{R} \tag{4.1}$$

and consider the positive measure $\lambda_\sigma$ on $\mathbb{R}$ defined by $d\lambda_\sigma/d\lambda = \sigma$ ($\lambda$ being the Lebesgue measure). If $D = (a, b)$, define just as in (2.1)

$$H_\sigma(D) = \left\{\mu \in P(D) \ \middle|\ p = \frac{d\mu}{d\lambda} \text{ exists and } (\sigma p)^{1/2} \in W^{1,2}(D, \sigma)\right\} \tag{4.2}$$

where $W^{1,2}(D, \sigma)$ is the weighted Sobolev space of functions belonging to $L^2(D, \lambda_\sigma)$ together with their first weak derivatives. In other words, $\mu \in H_\sigma(D)$ if its density $p$ satisfies

$$J_\sigma(\mu) := \int_D \frac{((\sigma p)'(x))^2}{p(x)}\,dx < \infty. \tag{4.3}$$

(As before, for a random variable $X$, $J_\sigma(X)$ stands for $J_\sigma(\mu_X)$.) Also $\tilde H_\sigma(D) = \{\mu \in H_\sigma(D) \mid \int_D x^2\,\mu(dx) < \infty\}$.
THEOREM 4.1. If $\mu \in \tilde H_\sigma(\mathbb{R})$ with density $p$, mean $\bar a$ and variance $\Sigma(\mu)$, then

$$J_\sigma(\mu)\,\Sigma(\mu) \ge \left(\int_{-\infty}^{\infty} \sigma(x)\,\mu(dx)\right)^2. \tag{4.4}$$

Equality is achieved in (4.4) if and only if there exist constants $a \in \mathbb{R}$, $\beta > 0$ and $C(a, \beta) > 0$ such that

$$p(x) = \frac{C(a, \beta)}{\sigma(x)}\,\exp\left(-\frac{1}{\beta^2}\int_a^x \frac{t - a}{\sigma(t)}\,dt\right) = \varphi^{[\sigma]}_{a,\beta^2}(x). \tag{4.5}$$
Proof: Inequality (4.4) is simply the Cauchy-Schwarz inequality $|(f, g)|^2 \le \|f\|^2 \|g\|^2$ in $L^2(\mathbb{R}, \mu)$ when $f(x) = x - \bar a$ and $g(x) = \frac{(\sigma p)'}{p}(x)$. The fact that

$$|(f, g)| = \left|\int_{-\infty}^{\infty} (x - \bar a)(\sigma p)'(x)\,dx\right| = \int_{-\infty}^{\infty} \sigma(x)\,\mu(dx)$$

follows by integration by parts, in the same way as for the particular case $\sigma \equiv 1$ (Theorem 2.3). Similarly, the functions $\varphi^{[\sigma]}_{a,\beta^2}$ defined in (4.5) are the only solutions to the equation $(\sigma p)'(x) = -\frac{1}{\beta^2}\,p(x)(x - a)$, arising from the equality condition in the Cauchy-Schwarz inequality, which are probability densities. ∎
We shall denote the probability measure whose density is $\varphi^{[\sigma]}_{a,\beta^2}$ by $\mu^{[\sigma]}_{a,\beta^2}$ and, motivated by Theorem 4.1, define the non-negative functional

$$T_\sigma(\mu) = J_\sigma(\mu)\,\Sigma(\mu) - \left(\int_{-\infty}^{\infty} \sigma(x)\,\mu(dx)\right)^2.$$

Let $F(x) = \int_{x_0}^x \frac{dt}{\sigma(t)}$ with $x_0 = 0$, and for any probability measure $\mu$ with mean $\bar\mu$ define

$$I_\sigma(\mu) = \int_{-\infty}^{\infty} (x - \bar\mu)\,F(x)\,\mu(dx)$$

(finite or $+\infty$; note that the value of $x_0$ in the definition of $F(x)$ is immaterial). Note as well that $J_\sigma(X) = J(F(X))$. We also observe that the mean of $\mu^{[\sigma]}_{a,\beta^2}$ is $a$ and $I_\sigma(\mu^{[\sigma]}_{a,\beta^2}) = \beta^2$, so that the natural parameters of the family of measures achieving equality in (4.4) are the mean $\bar a$ and the "distorted variance" $I_\sigma(\mu)$.
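As a sanity check (the example is ours, not from the paper), the sketch below (assuming NumPy) evaluates $T_\sigma(\mu) = J_\sigma(\mu)\Sigma(\mu) - (\int \sigma\,d\mu)^2$ for $\sigma(x) = 1 + x^2$, so that $F(x) = \arctan x$. The family (4.5) then contains the Student-like density $p(x) = \frac{8}{3\pi}(1 + x^2)^{-3}$ (take $a = 0$, $\beta^2 = 1/4$), for which a short computation gives $J_\sigma = 16/3$, $\Sigma = 1/3$ and $\int \sigma\,d\mu = 4/3$, hence $T_\sigma = 0$ and $I_\sigma = 1/4$; the standard Gaussian, by contrast, gives $J_\sigma \Sigma = 10 > 4 = (\int \sigma\,d\mu)^2$.

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(-100.0, 100.0, 1000001)
sigma = 1.0 + x**2                        # our choice of weight
F = np.arctan(x)                          # F(x) = int_0^x dt / sigma(t)

def weighted_quantities(p, dsp):
    """J_sigma, Sigma, int sigma dmu and T_sigma for a density p,
    with d(sigma p)/dx supplied analytically as dsp."""
    pos = p > 1e-300
    J = trapz(np.where(pos, dsp**2 / np.where(pos, p, 1.0), 0.0), x)
    m = trapz(x * p, x)
    Sig = trapz((x - m)**2 * p, x)
    ints = trapz(sigma * p, x)
    return J, Sig, ints, J * Sig - ints**2

# equality member (4.5) for a = 0, beta^2 = 1/4: p proportional to (1+x^2)^{-3}
p_eq = (8.0 / (3.0 * np.pi)) * (1.0 + x**2)**(-3)
dsp_eq = (8.0 / (3.0 * np.pi)) * (-4.0 * x) * (1.0 + x**2)**(-3)
J_eq, S_eq, is_eq, T_eq = weighted_quantities(p_eq, dsp_eq)
I_eq = trapz(x * F * p_eq, x)             # distorted variance, should be 1/4

# a non-member: the standard Gaussian, where T_sigma = 10 - 4 = 6 > 0
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
J_g, S_g, is_g, T_g = weighted_quantities(phi, (x - x**3) * phi)
print(T_eq, I_eq, T_g)                    # approximately 0, 0.25 and 6
```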
THEOREM 4.2. Let $\sigma$ be a function which satisfies (4.1), and $(\mu_j)$ a sequence of measures in $\tilde H_\sigma(D)$ such that, when $j \to \infty$,

(i) $\bar a_j \to a \in \mathbb{R}$ and $I_\sigma(\mu_j) \to \beta^2 > 0$, and

(ii) $T_\sigma(\mu_j) \to 0$.

Then $\mu_j \xrightarrow{v} \mu^{[\sigma]}_{a,\beta^2}$.
Proof: Since

$$\|\mu_j - \mu^{[\sigma]}_{a,\beta^2}\|_1 \le \|\mu_j - \mu^{[\sigma]}_{\bar a_j,\beta_j^2}\|_1 + \|\mu^{[\sigma]}_{\bar a_j,\beta_j^2} - \mu^{[\sigma]}_{a,\beta^2}\|_1$$

and the second term in the RHS obviously tends to zero, it suffices to show $\|\mu_j - \mu^{[\sigma]}_{\bar a_j,\beta_j^2}\|_1 \to 0$. For that we proceed just as in Lemma 3.1: denote

$$K_j = \frac{1}{\beta_j^2} = \left(\int_{-\infty}^{\infty} \sigma(x)\,\mu_j(dx)\right) \Big/ \left(\int_{-\infty}^{\infty} (x - \bar a_j)^2\,\mu_j(dx)\right)$$

and

$$g_j(x) = (\sigma p_j)'(x) + K_j (x - \bar a_j)\,p_j(x), \qquad p_j(x) = \frac{d\mu_j}{d\lambda}(x). \tag{4.6}$$

Solving (4.6) we obtain

$$p_j(x) - \varphi^{[\sigma]}_{\bar a_j,\beta_j^2}(x) = \frac{e^{-K_j G_j(x)}}{\sigma(x)} \int_{-\infty}^{\infty} g_j(t)\,e^{K_j G_j(t)}\left(\Phi^{[\sigma]}_{\bar a_j,\beta_j^2}(t) - 1_{[0,\infty)}(t - x)\right)dt \tag{4.7}$$

where $G_j(x) = \int_{\bar a_j}^x \frac{t - \bar a_j}{\sigma(t)}\,dt$ and

$$\Phi^{[\sigma]}_{a,\beta^2}(u) = \int_{-\infty}^u \varphi^{[\sigma]}_{a,\beta^2}(v)\,dv.$$

From (4.7), straightforward computations (which are direct generalizations of those in Lemma 3.1) show that

$$\|p_j - \varphi^{[\sigma]}_{\bar a_j,\beta_j^2}\|_1 \le C\,\|g_j\|_1 \le C'\,T_\sigma(\mu_j)^{1/2}$$

where the positive constants $C$ and $C'$ are independent of $j$ (because $\inf_j \beta_j > 0$), thus completing the proof. ∎
ACKNOWLEDGEMENT: This paper is based on parts of my Ph.D. thesis. I would like to thank my advisors, Professor M. Zakai and Professor P. Feigin, for their important ideas and suggestions. I would also like to express my appreciation to the referee for his keen interest and numerous suggestions.
REFERENCES

Adams, R.A. (1975) Sobolev Spaces. Academic Press, N.Y.

Barron, A. (1986) Entropy and the central limit theorem. Ann. Probab., 14, 336-342.

Billingsley, P. (1968) Convergence of Probability Measures. Wiley, N.Y.

Bobrovsky, B.Z., Mayer-Wolf, E. and Zakai, M. (1987) Some classes of global Cramer-Rao inequalities. Ann. Stat., 15, 4, 1421-1438.

Brown, L.D. (1982) A proof of the central limit theorem motivated by the Cramer-Rao inequality. Statistics and Probability: Essays in Honor of C.R. Rao, Eds. G. Kallianpur, P.R. Krishnaiah, J.K. Ghosh. North Holland.

Csiszar, I. (1967) Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar., 2, 299-318.

Mayer-Wolf, E. (1987) Ph.D. thesis (in Hebrew). Technion, Israel.

Pitman, E.J.G. (1979) Some Basic Theory for Statistical Inference. Chapman and Hall.

Prohorov, Yu. V. (1952) On a local limit theorem for densities. Doklady Akad. Nauk SSSR, 83, 797-800.

Rao, C.R. (1973) Linear Statistical Inference and its Applications. J. Wiley & Sons, New York.

Stam, A.J. (1959) Some inequalities satisfied by the quantities of information of Fisher and Shannon. Information and Control, 2, 101-112.

Van Trees, H. (1968) Estimation and Modulation Theory, Vol. 1. Wiley.
Department of Statistics
University of North Carolina
Chapel Hill, NC 27599-3260