Wold, Diane E. "On Smooth Estimation of the Renewal Density."

ON SMOOTH ESTIMATION OF THE RENEWAL DENSITY

by

Diane Easterling Wold

This research was supported in part by the Office of Naval Research under contract N00014-75-C-0809.
DIANE EASTERLING WOLD. On Smooth Estimation of the Renewal Density.
(Under the direction of M.R. LEADBETTER.)
This work investigates properties of a smooth function estimator of the renewal density (i.e., the density of the renewal function in the absolutely continuous case). Asymptotic unbiasedness and the asymptotic form of the variance are established. Probability-one properties, based on work of Nadaraya and of Van Ryzin, are explored. Pointwise consistency properties are investigated, and sufficient conditions are given for uniform consistency of the estimator over a bounded interval. Sufficient conditions are also given for asymptotic joint normality of the estimator taken at a finite number of distinct points.
Table of Contents

                                                                       Page
Chapter 1. Introduction                                                   1
  1.1 Introduction                                                        1
  1.2 Related Work in Probability Density Estimation                      4
  1.3 Summary                                                             6

Chapter 2. Preliminary Results                                            8
  2.1 Convergence of the Renewal Density Series                           8
  2.2 δ-Function Sequences                                               12
  2.3 Pointwise Consistency                                              15

Chapter 3. Almost Sure Consistency                                       31
  3.1 Almost Sure Uniform Consistency                                    32
  3.2 Almost Sure Pointwise Consistency                                  36

Chapter 4. Consistency for L2 Densities                                  48
  4.1 Preliminaries                                                      48
  4.2 Asymptotic Variance and Covariance--Algebraic-Type Estimators      53
  4.3 Asymptotic Variance and Covariance--Exponential-Type Estimators    62
  4.4 Asymptotic Bias and Mean Square Error                              73

Chapter 5. Integrated Consistency                                        88
  5.1 Preliminaries                                                      88
  5.2 Minimization of J_m                                                93
  5.3 Integrated Consistency--Algebraic Case                            100
  5.4 Integrated Consistency--Exponential Case                          112

Chapter 6. Asymptotic Normality                                         124
  6.1 Asymptotic Normality of the Expanded Estimator                    124
  6.2 Asymptotic Joint Normality                                        130

Chapter 7. Examples, Further Research                                   136
  7.1 Examples                                                          136
  7.2 Further Research                                                  143

References                                                              148
CHAPTER 1

INTRODUCTION

1.1 Introduction

Let X_1, X_2, ... be a sequence of i.i.d. non-negative random variables with common distribution function F(x) and common probability density f(x). The renewal process based on {X_i} has "renewals" or "events" at X_1, X_1 + X_2, ... . The X_i's can be thought of as the lifetimes of items which are instantly replaced when they fail. This failure-and-replacement is then the event of the renewal process.

More formally, let N_t be the greatest n such that X_1 + X_2 + ... + X_n ≤ t, that is, the number of renewals in (0,t]. N_t is the renewal process, and E(N_t) = H(t) is called the renewal function. If H(t) is absolutely continuous on finite intervals, then its Radon-Nikodym derivative, denoted h(t), exists and is called the renewal density. It is the renewal density which will be estimated.

Let F_j(x) be the distribution function of X_1 + X_2 + ... + X_j. The corresponding probability density, f_j(x), is the j-th convolution of f(x) with itself.
Let

    Z_j = 1 if X_1 + X_2 + ... + X_j ≤ t, and Z_j = 0 otherwise.

Then

    H(t) = E(N_t) = E( Σ_{j=1}^∞ Z_j ) = Σ_{j=1}^∞ F_j(t).

If F is absolutely continuous, then H is absolutely continuous, and

    H(t) = Σ_{j=1}^∞ ∫_0^t f_j(u) du.

Therefore

    h(t) = Σ_{j=1}^∞ f_j(t).
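As an illustrative numerical check of this series representation (not part of the original development): for unit-exponential lifetimes, f_j is the Gamma(j,1) density t^{j-1}e^{-t}/(j-1)!, and the series sums to h(t) = 1 for every t > 0. The function name below is a hypothetical choice for the sketch.

```python
import math

def renewal_density_exponential(t, terms=200):
    """Partial sum of h(t) = sum_j f_j(t) for unit-exponential lifetimes,
    where f_j is the Gamma(j, 1) density t^(j-1) e^(-t) / (j-1)!."""
    term = math.exp(-t)          # j = 1 term: e^(-t)
    total = term
    for j in range(2, terms + 1):
        term *= t / (j - 1)      # builds t^(j-1)/(j-1)! incrementally
        total += term
    return total
```

Since Σ_j t^{j-1}e^{-t}/(j-1)! = e^{-t} e^{t} = 1, the partial sum should be essentially 1 for moderate t.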
The fact that the renewal density is the sum of probability densities leads to consideration of a type of smooth estimator of probability densities first proposed by Rosenblatt (1956).

For a sample of size n, the empirical distribution function of X, F_{1,n}(x), may be written

    F_{1,n}(x) = (1/n) Σ_{i=1}^n ε(x - X_i),

where ε(x) = 1 if x ≥ 0 and ε(x) = 0 otherwise. F_{1,n}(x) is a natural estimator of F(x), so that, if it existed, F'_{1,n}(x) would be a natural choice as an estimator of f(x). This leads to consideration of a slightly different estimator of F(x) which is differentiable, namely

    F̃_{1,n}(x) = (1/n) Σ_{i=1}^n ε_n(x - X_i),

where ε_n is differentiable and ε_n → ε as n → ∞. The derivative of F̃_{1,n}(x) is a smooth probability density estimator. Let

    f̂_{1,n}(x) = (1/n) Σ_{i=1}^n δ_n(x - X_i),

where δ_n(x) = ε'_n(x). The sequence {δ_n} "approaches a Dirac δ" as n → ∞, in some sense.
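A minimal sketch of this construction in Python; the Gaussian kernel, the fixed bandwidth, and the function name are illustrative assumptions, not choices made in the text:

```python
import math

def smooth_density_estimate(x, sample, bandwidth):
    """f_hat(x) = (1/n) * sum_i delta_n(x - X_i), with
    delta_n(u) = (1/bandwidth) * k(u/bandwidth) for a Gaussian kernel k."""
    def delta_n(u):
        return math.exp(-0.5 * (u / bandwidth) ** 2) / (
            bandwidth * math.sqrt(2.0 * math.pi))
    return sum(delta_n(x - xi) for xi in sample) / len(sample)
```

Because each δ_n integrates to one, the estimate integrates to one as well, which a crude numerical integration confirms.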
Returning now to the renewal density, since h(t) = Σ_{j=1}^∞ f_j(t), it is natural to let

    ĥ(t) = Σ_{j=1}^∞ f̂_j(t).

Here f̂_{1,n}(t) is an estimator for f_1(t) = f(t), but there are at least two possible approaches to estimating f_j(t). One is to estimate f_j(t), the j-th convolution of f(t), with the j-th convolution of f̂_n(t). This approach has not been pursued because of computational difficulties, which are briefly discussed in Chapter 7. Another approach is to estimate f_j(t) directly from sums of j X's. Since the sample size is finite, it will be possible to construct estimators of only a finite number of convolutions of f in this way. Let m be the number of convolutions for which estimators will be constructed. Let m = m(n) and let m → ∞ as n → ∞. The estimator of h originally proposed is then modified to the following form:

    ĥ_m(t) = Σ_{j=1}^m f̂_j(t).

It turns out to be convenient throughout to index by m, rather than by the sample size n, as is more usual.
Let S_ij be a sum of j X's, the i-th of a number of such sums. The estimator of f_j is constructed from a number of S_ij's in the same way that the estimator of f_1 is constructed from a number of X's. Let k_m be the number of S_ij's used in the estimator of f_j, the same number for each f_j, with k_m → ∞ as m → ∞. Let {S_ij}, i = 1, ..., k_m, be mutually independent. Thus

    f̂_j(t) = (1/k_m) Σ_{i=1}^{k_m} δ_m(t - S_ij)

and

    ĥ_m(t) = (1/k_m) Σ_{j=1}^m Σ_{i=1}^{k_m} δ_m(t - S_ij).
Two different schemes for constructing S_ij's from the X's will be considered. In the "expanded" estimator, h̃_m(t), each X is used in only one S_ij. This means that k_m m(m+1)/2 observations of X are used to construct h̃_m(t), but that all S_ij's are mutually independent.

This independence will be necessary for some results, but for others it will be possible to use what we shall call the "compact" estimator, ȟ_m. In this version, we use m k_m observations of X, arranged in a k_m × m array and assigned double indexes, X_iℓ. Let

    S_ij = Σ_{ℓ=1}^j X_iℓ,

so that S_{i1,j1} and S_{i2,j2} are independent if i1 ≠ i2, but S_{i,j1} and S_{i,j2} are related by

    S_{i,j2} = S_{i,j1} + X_{i,j1+1} + ... + X_{i,j2}    for j1 < j2.

In results which apply to both compact and expanded estimators, ĥ_m(t) will be used to denote the estimator. If the result applies only to the compact estimator, ȟ_m(t) will be used; if only to the expanded estimator, h̃_m(t) will be used.
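The two sampling schemes can be sketched as follows. The function names, the Gaussian kernel, and the row-major consumption of the raw observations are illustrative assumptions for this sketch, not the thesis's own notation:

```python
import math

def gaussian_delta(u, lam):
    """Kernel-type delta_m(u) = (1/lam) * k(u/lam), Gaussian k."""
    return math.exp(-0.5 * (u / lam) ** 2) / (lam * math.sqrt(2.0 * math.pi))

def compact_sums(x_array):
    """x_array is a k_m-by-m array of lifetimes; S_ij is the partial sum
    X_i1 + ... + X_ij, so S_ij's within a row are dependent."""
    sums = []
    for row in x_array:
        partial, row_sums = 0.0, []
        for x in row:
            partial += x
            row_sums.append(partial)
        sums.append(row_sums)
    return sums

def expanded_sums(xs, k_m, m):
    """Each observation is used in only one S_ij, so this consumes
    k_m * m * (m + 1) / 2 observations and all S_ij are independent."""
    it = iter(xs)
    return [[sum(next(it) for _ in range(j)) for j in range(1, m + 1)]
            for _ in range(k_m)]

def h_m(t, sums, lam):
    """h_m(t) = (1/k_m) * sum_j sum_i delta_m(t - S_ij)."""
    k_m = len(sums)
    return sum(gaussian_delta(t - s, lam)
               for row in sums for s in row) / k_m
```

Either list of sums can then be fed to `h_m`, which is why results stated for "the estimator" can cover both versions at once.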
1.2 Related Work in Probability Density Estimation

Much of the work presented here is based on results for the analogous probability density estimator. There has been a great deal of work done on this kind of estimator for probability densities, and only that which relates to work undertaken on the renewal density estimator will be mentioned here.

As mentioned earlier, Rosenblatt (1956) introduced this type of estimator. He considered, in particular, the class of so-called kernel estimators, for which δ_n(x) = (1/λ_n) k(x/λ_n) where λ_n → 0 as n → ∞, and evaluated the mean of f̂_{1,n}(x) and the covariance of f̂_{1,n}(x) and f̂_{1,n}(y). He gave an asymptotic expression for the variance, and considered the asymptotic behavior of the bias when f(x) has continuous derivatives of the first three orders.

Parzen (1962) investigated kernel estimators further, for kernels which are bounded, integrable, and such that x k(x) → 0 as x → ∞. He gave the asymptotic variance under these conditions and showed asymptotic normality. He also found the asymptotic form of the bias under a condition, for some r, on the characteristic function φ(u) corresponding to f.

Leadbetter (1963) considered the more general form of Rosenblatt's estimator, where δ_m(x) is not necessarily of the kernel form (1/λ_m) k(x/λ_m), but satisfies an axiom scheme (see Chapter 2) for these δ-function sequences. He obtained results on consistency, asymptotic normality, and the asymptotic forms of the variance and covariance of f̂_{1,n}(x).

Much of the rest of the work in Leadbetter (1963) was inspired by the work of Parzen (1958) in spectral density estimation. In this later work, two classes of densities were considered: roughly, those whose characteristic functions behave like |t|^{-p} for large |t| (the "algebraic class"), and those whose characteristic functions behave like e^{-p|t|} for large |t| (the "exponential class"). Appropriate types of δ-function sequences were defined for estimating densities in these classes, and results on the asymptotic forms of the variance, covariance and bias were found.
An estimator is said to be "functionally uniformly consistent of order H(n)" if

    H(n) E( sup_x |f̂_{1,n}(x) - f(x)|² )

is bounded in n. An estimator is said to be "integratedly consistent of order H(n)" if

    H(n) E ∫ (f̂_{1,n}(x) - f(x))² dx

tends to a non-zero limit as n → ∞. Leadbetter (1963) also investigated the functionally uniform consistency and integrated consistency of estimators of densities in the algebraic and exponential classes.

Nadaraya (1965) gave conditions under which f̂_{1,n}(x) is uniformly consistent with probability one, that is, sup_x |f̂_{1,n}(x) - f(x)| → 0 a.s. Schuster (1969) considered this problem further and found necessary and sufficient conditions for uniform consistency with probability one. Van Ryzin (1969) used martingale techniques to show that f̂_{1,n}(x) is pointwise consistent with probability one, that is, f̂_{1,n}(x) → f(x) a.s. at continuity points of f.
1.3 Summary

In Chapter 2, after preliminary investigation of the renewal density series and of δ-function sequences, ĥ_m is shown to be asymptotically unbiased, and asymptotic forms of the variance of ĥ_m(t) and of the covariance of ĥ_m(t) and ĥ_m(s) are found for the compact and expanded versions of the estimator.

Chapter 3 deals with almost sure consistency. First, in work that is analogous to that of Nadaraya with probability densities, ĥ_m is shown to be uniformly consistent with probability one in finite intervals. We also show that ĥ_m is pointwise consistent with probability one, using Van Ryzin's martingale methods.

Chapters 4 and 5 follow Parzen (1958) and Leadbetter (1963) and deal with algebraic and exponential classes of densities, and their algebraic-type and exponential-type estimators. Asymptotic forms of the variance, covariance and bias of the expanded estimator, and of the bias of the compact estimator, are obtained in Chapter 4. Chapter 5 contains a discussion of a modified form of integrated consistency. Since ∫ h(t) dt = ∞, the mean integrated square error E ∫ (ĥ_m(t) - h(t))² dt is not finite; instead we use the (clearly finite) modification

    J_m = E ∫ (ĥ_m(t) - h_m(t))² dt,    where h_m(t) = Σ_{j=1}^m f_j(t).

The asymptotic normality of the expanded estimator is demonstrated in Chapter 6. It is also shown that the values of ĥ_m at a finite number of distinct points are asymptotically jointly normal.

Chapter 7 contains examples of calculations of the compact and expanded estimators from simulated data, and a discussion of possible further research.
CHAPTER 2

PRELIMINARY RESULTS

2.1 Convergence of the Renewal Density Series

The first set of results deals with the behavior of the series h(t) = Σ_{j=1}^∞ f_j(t).

2.1.1 Lemma. If f is bounded by some M, then f_j(t) is bounded by M for all j.

Proof: If f_{j-1} is bounded by M, then

    f_j(t) = ∫ f_{j-1}(u) f(t-u) du ≤ M ∫ f(t-u) du = M.

Thus, by induction, f_j(t) is bounded by M for all j.  □
2.1.2 Lemma. If f is bounded by some M, then f_j(t) is continuous for j ≥ 2.

Proof: By Lemma 2.1.1,

    |f_j(t+ε) - f_j(t)| = |∫ f_{j-1}(u) [f(t+ε-u) - f(t-u)] du| ≤ M ∫ |f(v+ε) - f(v)| dv,

which tends to zero as ε → 0 since f ∈ L_1.  □
2.1.3 Theorem. If f is bounded by some M, then Σ_{j=2}^∞ f_j(t) converges uniformly in any finite interval. Also, Σ_{j=2}^∞ f_j(t) is continuous.

Proof:

Case I: F(t) < 1 for all t < ∞.

Given any finite interval, let T be its right endpoint, so that F(t) ≤ F(T) for all t in the interval. First we show, by induction, that f_j(t) ≤ M F(T)^{j-1} for all j ≥ 2. If f_{j-1}(t) ≤ M F(T)^{j-2}, then

    f_j(t) = ∫_0^t f_{j-1}(t-u) f(u) du ≤ M F(T)^{j-2} ∫_0^t f(u) du ≤ M F(T)^{j-1}.

Since Σ_{j=2}^∞ F(T)^{j-1} converges, it follows from the Weierstrass M-test that Σ_{j=2}^∞ f_j(t) converges uniformly on the finite interval.

Case II: F(t_0) = 1 for some t_0 < ∞.

There is a constant K such that f(t) < K e^{-t} (e.g. K = M e^{t_0}). We show, again by induction, that

    f_j(t) ≤ K^j t^{j-1} e^{-t} / (j-1)!    for all j ≥ 2.

In fact

    f_2(t) = ∫_0^t f(t-u) f(u) du ≤ ∫_0^t K e^{-(t-u)} K e^{-u} du = K² t e^{-t}.

If f_{j-1}(t) ≤ K^{j-1} t^{j-2} e^{-t} / (j-2)!, then

    f_j(t) = ∫_0^t f_{j-1}(t-u) f(u) du ≤ ∫_0^t [K^{j-1} (t-u)^{j-2} / (j-2)!] e^{-(t-u)} K e^{-u} du = K^j e^{-t} t^{j-1} / (j-1)!,

so that the induction is completed. Let T be the right endpoint of the finite interval, so that f_j(t) ≤ K^j T^{j-1} / (j-1)! for all j ≥ 2 and all t in the finite interval. Since Σ_{j=2}^∞ K^j T^{j-1} / (j-1)! converges, it follows from the Weierstrass M-test that Σ_{j=2}^∞ f_j(t) converges uniformly in the interval.

In either case, since each f_j(t) is continuous (Lemma 2.1.2) and the series converges uniformly in any finite interval, Σ_{j=2}^∞ f_j(t) is continuous.  □
2.1.4 Corollary. If f(t) is bounded and continuous, then h(t) is continuous, and Σ_{j=1}^∞ f_j(t) converges uniformly to h(t) on any finite interval.
2.1.5 Corollary. If f(t) is bounded by M, if F(t) < 1 for all t < ∞, and if I is any finite interval with right endpoint T, then

    Σ_{j=m+1}^∞ f_j(t) ≤ M F(T)^m / (1 - F(T))    for all t ∈ I.

Proof: As in Case I of the proof of Theorem 2.1.3, f_j(t) ≤ M F(T)^{j-1}. Thus

    Σ_{j=m+1}^∞ f_j(t) ≤ Σ_{j=m+1}^∞ M F(T)^{j-1} = M F(T)^m / (1 - F(T)).  □
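Corollary 2.1.5 can be spot-checked numerically for unit-exponential f, for which M = 1, F(T) = 1 - e^{-T}, and f_j is the Gamma(j,1) density. The following sketch (function names hypothetical) compares the exact tail with the geometric bound:

```python
import math

def gamma_density(t, j):
    """f_j for unit-exponential lifetimes: t^(j-1) e^(-t) / (j-1)!."""
    term = math.exp(-t)
    for i in range(1, j):
        term *= t / i
    return term

def tail(t, m, terms=300):
    """sum_{j=m+1}^infinity f_j(t), truncated at `terms`."""
    return sum(gamma_density(t, j) for j in range(m + 1, terms))

def bound(T, m):
    """M * F(T)^m / (1 - F(T)) with M = 1 and F(T) = 1 - e^(-T)."""
    FT = 1.0 - math.exp(-T)
    return FT ** m / (1.0 - FT)
```

On the interval I = [0, 2] the tail at any t ∈ I should sit below bound(2, m), and should shrink as m grows.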
2.1.6 Corollary. If f is bounded and if F(t_0) = 1 for some t_0 < ∞, then there is a K such that f(t) < K e^{-t}, and for any finite interval I with right endpoint T,

    Σ_{j=m+1}^∞ f_j(t) ≤ K e^{KT} (KT)^m / m!    for all t ∈ I.

Proof: As in Case II of the proof of Theorem 2.1.3,

    f_j(t) ≤ K^j t^{j-1} e^{-t} / (j-1)! ≤ K (KT)^{j-1} / (j-1)!.

Since Σ_{j=m+1}^∞ (KT)^{j-1} / (j-1)! = Σ_{j=m}^∞ (KT)^j / j! is the m-th remainder term of the Taylor series for e^t, evaluated at KT, there is a ζ in (0, KT) such that

    Σ_{j=m}^∞ (KT)^j / j! = (KT)^m e^ζ / m!.

Thus

    Σ_{j=m+1}^∞ f_j(t) ≤ K (KT)^m e^ζ / m!,

and since ζ < KT, it follows that

    Σ_{j=m+1}^∞ f_j(t) ≤ K e^{KT} (KT)^m / m!.  □
Many of the results in the third section of this chapter depend on the assumption that f and h are bounded. The next theorem, due to Smith (1954), leads to a theorem concerning conditions on f under which h is bounded.

2.1.7 Theorem. Suppose that f(t) → 0 as t → ∞, and that for some p > 1, |f(t)|^p is integrable. If μ = ∫ t f(t) dt, then h(t) → 1/μ as t → ∞.
2.1.8 Theorem. If f(t) → 0 as t → ∞, and if f is bounded, then h(t) is bounded.

Proof: Choose p > 1. Then

    ∫ |f(t)|^p dt = ∫ |f(t)| |f(t)|^{p-1} dt ≤ M^{p-1} ∫ |f(t)| dt < ∞,

where M is a bound for f. Thus the conditions of Theorem 2.1.7 are satisfied and h(t) tends to 1/μ as t → ∞. Note that 1/μ < ∞. Since f(t) → 0 as t → ∞, Σ_{j=2}^∞ f_j(t) → 1/μ as t → ∞ also. Since Σ_{j=2}^∞ f_j(t) is continuous (Theorem 2.1.3), zero for t ≤ 0, and tends to a finite limit as t → ∞, Σ_{j=2}^∞ f_j(t) is bounded. Thus h(t), the sum of f(t) and Σ_{j=2}^∞ f_j(t), is bounded.  □
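Smith's limit can be seen in closed form for Gamma(2,1) lifetimes (μ = 2): f_j is then the Gamma(2j,1) density, and h(t) = Σ_j t^{2j-1}e^{-t}/(2j-1)! = e^{-t} sinh t = (1 - e^{-2t})/2 → 1/2 = 1/μ. A quick numerical confirmation (an illustration, not part of the original proof):

```python
import math

def renewal_density_gamma2(t, terms=200):
    """h(t) = sum_j t^(2j-1) e^(-t) / (2j-1)! for Gamma(2,1) lifetimes."""
    term = t * math.exp(-t)      # j = 1 term: t e^(-t) / 1!
    total = term
    for j in range(2, terms + 1):
        term *= t * t / ((2 * j - 2) * (2 * j - 1))   # ratio of factorials
        total += term
    return total
```

The partial sum matches the closed form (1 - e^{-2t})/2 and approaches 1/μ = 1/2 for large t.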
2.2 δ-Function Sequences

Rosenblatt (1956) suggested a type of δ-function sequence which is now generally known as a kernel-type sequence:

    δ_m(x) = (1/λ_m) k(x/λ_m),    where λ_m → 0 as m → ∞.

Leadbetter (1963) introduced the following more general type of δ-function sequence, which will be used wherever possible.

2.2.1 Definition (Leadbetter's Axiom Scheme). A sequence of functions {δ_m(x)} satisfying (a)-(d) below will be called a δ-function sequence.

(a) ∫ |δ_m(x)| dx < A for all m, some fixed A.

(b) ∫ δ_m(x) dx = 1 for all m.

(c) δ_m(x) → 0 uniformly in |x| ≥ λ for any λ > 0.

(d) ∫_{|x| ≥ λ} |δ_m(x)| dx → 0 as m → ∞ for any fixed λ > 0.
The next lemma shows that the commonly used kernel-type sequences are also δ-function sequences as defined above.

2.2.2 Lemma. If k is a bounded density on R such that |x| k(x) → 0 as |x| → ∞, and if λ_m > 0 with λ_m → 0 as m → ∞, then the sequence {(1/λ_m) k(x/λ_m)} is a δ-function sequence.

Proof: We show that each of the four axioms holds.

(a) ∫ |δ_m(x)| dx = ∫ (1/λ_m) k(x/λ_m) dx = ∫ k(x) dx = 1.

(b) ∫ δ_m(x) dx = ∫ (1/λ_m) k(x/λ_m) dx = ∫ k(x) dx = 1.

(c) Given ε and λ, choose x_0 so that, for |x| > x_0, |x| k(x) < λε. Choose N so that λ_m < λ/x_0 for all m > N. Then for |x| ≥ λ and m > N we have |x/λ_m| > x_0, so

    |x/λ_m| k(x/λ_m) < λε,   and hence   δ_m(x) = (1/λ_m) k(x/λ_m) ≤ (1/λ) |x/λ_m| k(x/λ_m) < ε.

Thus δ_m(x) → 0 uniformly in |x| ≥ λ for any λ.

(d) ∫_{|x| ≥ λ} δ_m(x) dx = ∫_{|x| ≥ λ/λ_m} k(x) dx = ∫ χ_{E_m}(x) k(x) dx,

where E_m = {x: |x| ≥ λ/λ_m}. Since χ_{E_m}(x) k(x) → 0 pointwise and |χ_{E_m}(x) k(x)| ≤ k(x), by dominated convergence the integral tends to zero. Thus ∫_{|x| ≥ λ} δ_m(x) dx → 0 for any λ > 0.  □
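The axioms can be spot-checked numerically for a concrete kernel sequence; the Gaussian kernel and the choice λ_m = 1/m below are illustrative assumptions:

```python
import math

def delta_m(x, m):
    """Kernel-type sequence (1/lam) k(x/lam) with Gaussian k, lam = 1/m."""
    lam = 1.0 / m
    u = x / lam
    return math.exp(-0.5 * u * u) / (lam * math.sqrt(2.0 * math.pi))

def integral(f, a, b, n=4000):
    """Simple midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Axioms (a)/(b): since the kernel is a density, |delta_m| = delta_m and
# the total integral is 1 for every m.
# Axiom (c): delta_m(x) at fixed |x| >= lambda shrinks as m grows.
# Axiom (d): the mass outside |x| >= lambda shrinks as m grows.
```

The assertions below check one instance of each behavior.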
Leadbetter (1963) proved the following two lemmas.

2.2.3 Lemma. If {δ_m} is a δ-function sequence and if α_m = ∫ δ_m²(x) dx < ∞, then α_m → ∞ as m → ∞, and {δ_m*} = {δ_m²(x)/α_m} is also a δ-function sequence.

2.2.4 Lemma. If g(x) is continuous at x = 0 and ∫ |g(x)| dx < ∞, and if {δ_m} is a δ-function sequence, then ∫ g(x) δ_m(x) dx → g(0) as m → ∞.

Johnston (1979) proved the following lemma, which shows how a δ-function sequence, designed to produce a smooth estimator, behaves at discontinuities.

2.2.5 Lemma. Let {δ_m} be a δ-function sequence such that δ_m is an even function for every m, and let g be an integrable function with both right and left hand limits at 0. Then δ_m(x) g(x) is integrable for each m, and

    ∫ g(x) δ_m(x) dx → (g(0+) + g(0-)) / 2    as m → ∞.
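Lemma 2.2.5 can be illustrated numerically. For the jump function g(x) = e^{-x} for x ≥ 0 and 0 for x < 0, an even δ-sequence should drive ∫ g δ_m toward (g(0+) + g(0-))/2 = 1/2. The Gaussian kernel with bandwidth 1/m is again an illustrative choice:

```python
import math

def delta_m(x, m):
    lam = 1.0 / m                       # illustrative bandwidth choice
    u = x / lam
    return math.exp(-0.5 * u * u) / (lam * math.sqrt(2.0 * math.pi))

def g(x):
    return math.exp(-x) if x >= 0 else 0.0   # g(0+) = 1, g(0-) = 0

def smoothed_at_zero(m, n=8000):
    """Midpoint approximation of integral g(x) * delta_m(x) dx."""
    a, b = -4.0, 4.0
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) * delta_m(a + (i + 0.5) * h, m)
               for i in range(n)) * h
```

For growing m the smoothed value settles at the half-sum of the one-sided limits rather than at either limit.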
2.3 Pointwise Consistency

The following two lemmas lead to theorems on the asymptotic unbiasedness of ĥ_m(t).

2.3.1 Lemma. Let {δ_m} be a δ-function sequence, and let {g_m(x)} be a sequence of real functions such that |g_m(x)| ≤ M for all m and x. Suppose that g_m(x) converges uniformly in a neighborhood of zero to a function g(x) which is continuous at zero. Then

    ∫ g_m(x) δ_m(x) dx → g(0)    as m → ∞.

Proof: Given ε > 0, choose λ so that g_m(x) → g(x) uniformly in [-λ, λ] and |g(x) - g(0)| < ε for |x| ≤ λ. Then

    ∫ g_m(x) δ_m(x) dx - g(0) = ∫_{|x|≤λ} (g_m(x) - g(x)) δ_m(x) dx + ∫_{|x|≤λ} (g(x) - g(0)) δ_m(x) dx
                                + ∫_{|x|≥λ} g_m(x) δ_m(x) dx - g(0) ∫_{|x|≥λ} δ_m(x) dx.

The first term tends to zero since g_m(x) - g(x) tends to zero uniformly in |x| ≤ λ, and ∫ |δ_m(x)| dx < A for all m. The second term does not exceed Aε in modulus. The third term is dominated in absolute value by M ∫_{|x|≥λ} |δ_m(x)| dx, which tends to zero by axiom (d). The last term also tends to zero by axiom (d).  □
2.3.2 Lemma. Suppose that f and h are bounded and that f is continuous at t. Then if {δ_m} is a δ-function sequence,

    ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx → h(t)    as m → ∞.

Proof:

    ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx = ∫ δ_m(t-x) f(x) dx + ∫ δ_m(t-x) Σ_{j=2}^m f_j(x) dx.

The first term tends to f(t) by Lemma 2.2.4. Since Σ_{j=2}^∞ f_j(x) converges uniformly in any finite interval (Theorem 2.1.3), and since h is bounded, the second term tends to Σ_{j=2}^∞ f_j(t) by Lemma 2.3.1. Thus

    ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx → f(t) + Σ_{j=2}^∞ f_j(t) = h(t)    as m → ∞.  □
2.3.3 Theorem. Suppose that f and h are bounded and f is continuous in a neighborhood of t. Then if ĥ_m(t) is a compact or expanded estimator of h, E(ĥ_m(t)) → h(t) as m → ∞.

Proof:

    E(ĥ_m(t)) = (1/k_m) Σ_{i=1}^{k_m} E( Σ_{j=1}^m δ_m(t - S_ij) ) = Σ_{j=1}^m E(δ_m(t - S_1j))
              = Σ_{j=1}^m ∫ δ_m(t-x) f_j(x) dx = ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx → h(t)

as m → ∞, by Lemma 2.3.2.  □
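For unit-exponential lifetimes (h(t) = 1 for t > 0, f_j the Gamma(j,1) density), the limit in Theorem 2.3.3 — E(ĥ_m(t)) = ∫ δ_m(t-x) Σ_{j≤m} f_j(x) dx → h(t) — can be checked by direct numerical integration. The Gaussian kernel and the bandwidth λ_m = m^{-1/2} below are illustrative assumptions:

```python
import math

def f_partial_sum(x, m):
    """sum_{j=1}^m of Gamma(j,1) densities x^(j-1) e^(-x)/(j-1)!."""
    if x < 0:
        return 0.0
    term = math.exp(-x)
    total = term
    for j in range(2, m + 1):
        term *= x / (j - 1)
        total += term
    return total

def mean_h_m(t, m, n=6000):
    """E h_m(t) = integral delta_m(t - x) * sum_{j<=m} f_j(x) dx,
    computed by the midpoint rule over t +/- 6 bandwidths."""
    lam = m ** -0.5
    a, b = t - 6.0 * lam, t + 6.0 * lam
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        d = math.exp(-0.5 * ((t - x) / lam) ** 2) / (lam * math.sqrt(2.0 * math.pi))
        total += d * f_partial_sum(x, m) * h
    return total
```

At t = 2 the mean of the estimator is already very close to h(2) = 1 for modest m, and closer still as m grows.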
2.3.4 Lemma. Suppose that f and h are bounded, and that f has right and left hand limits at t. Then if {δ_m} is a δ-function sequence where each δ_m is even,

    ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx → (f(t+) + f(t-))/2 + Σ_{j=2}^∞ f_j(t) = (h(t+) + h(t-))/2    as m → ∞.

Proof:

    ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx = ∫ δ_m(t-x) f(x) dx + ∫ δ_m(t-x) Σ_{j=2}^m f_j(x) dx.

The second term is handled exactly as the similar expression in Lemma 2.3.2 was, and gives the second term in the result. The first term gives the first term of the result by Lemma 2.2.5.  □
2.3.5 Theorem. Suppose that f and h are bounded and that f has right and left hand limits at t. Then if {δ_m} is a δ-function sequence where each δ_m is even, and ĥ_m(t) is a compact or expanded estimator of h(t),

    E(ĥ_m(t)) → (h(t+) + h(t-))/2    as m → ∞.

Proof: The proof is similar to that of Theorem 2.3.3 but uses Lemma 2.3.4 rather than Lemma 2.3.2.  □
The next set of theorems gives us not only conditions under which Var(ĥ_m(t)) → 0, but an exact rate for that convergence.

2.3.6 Theorem. Suppose α_m = ∫ δ_m²(x) dx < ∞, f and h are bounded, and f is continuous at t. Then if ȟ_m(t) is a compact estimator of h as defined in Section 1.1,

    (k_m/α_m) Var(ȟ_m(t)) → h(t)    as m → ∞.

Proof:

    (k_m/α_m) Var(ȟ_m(t)) = (k_m/α_m) Var( (1/k_m) Σ_{i=1}^{k_m} Σ_{j=1}^m δ_m(t - S_ij) )
                          = (1/α_m) Var( Σ_{j=1}^m δ_m(t - S_1j) )
                          = (1/α_m) E[ ( Σ_{j=1}^m δ_m(t - S_1j) )² ] - (1/α_m) [ ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx ]².

By Lemma 2.3.2, ∫ δ_m(t-x) Σ_{j=1}^m f_j(x) dx → h(t), and by Lemma 2.2.3, α_m → ∞, so the second term tends to zero. The first term is

    (1/α_m) Σ_{j=1}^m E(δ_m²(t - S_1j)) + (2/α_m) Σ_{1≤j<ℓ≤m} E(δ_m(t - S_1j) δ_m(t - S_1ℓ))

(2.3.1)     = ∫ (δ_m²(t-x)/α_m) Σ_{j=1}^m f_j(x) dx
            + (2/α_m) ∫∫ δ_m(t-x) δ_m(t-x-y) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy.

By Lemma 2.2.3, {δ_m*} = {δ_m²(x)/α_m} is a δ-function sequence, so the first term of (2.3.1) tends to h(t) by Lemma 2.3.2. If B is a bound for h(t), then

    Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) ≤ h(x) h(y) ≤ B².

The second term of (2.3.1) is therefore dominated by

    (2B²/α_m) ∫ |δ_m(t-x)| [ ∫ |δ_m(t-x-y)| dy ] dx ≤ 2B²A²/α_m → 0    as m → ∞.

Thus the second term of (2.3.1) tends to zero.  □
2.3.7 Theorem. Suppose α_m = ∫ δ_m²(x) dx < ∞, f and h are bounded, and f has right and left-hand limits at t. Then if ȟ_m(t) is a compact estimator of h, and each δ_m is even,

    (k_m/α_m) Var(ȟ_m(t)) → (f(t+) + f(t-))/2 + Σ_{j=2}^∞ f_j(t) = (h(t+) + h(t-))/2    as m → ∞.

Proof: The proof of this theorem is the same as that of Theorem 2.3.6, except that Lemma 2.3.4 is used in place of Lemma 2.3.2.  □
2.3.8 Theorem. Suppose α_m = ∫ δ_m²(x) dx < ∞, f and h are bounded, and f is continuous at t. Then if h̃_m(t) is an expanded estimator of h, as defined in Section 1.1,

    (k_m/α_m) Var(h̃_m(t)) → h(t)    as m → ∞.

Proof: Since the S_ij's are mutually independent in the expanded estimator,

    (k_m/α_m) Var(h̃_m(t)) = (1/α_m) Var( Σ_{j=1}^m δ_m(t - S_1j) ) = (1/α_m) Σ_{j=1}^m Var(δ_m(t - S_1j))
                          = ∫ (δ_m²(t-x)/α_m) Σ_{j=1}^m f_j(x) dx - (1/α_m) Σ_{j=1}^m [ ∫ δ_m(t-x) f_j(x) dx ]².

The first term tends to h(t). The second is dominated by

    (1/α_m) Σ_{j=1}^m [ ∫ |δ_m(t-x)| f_j(x) dx ] [ M ∫ |δ_m(t-x)| dx ]
        ≤ (MA/α_m) ∫ |δ_m(t-x)| Σ_{j=1}^m f_j(x) dx ≤ MBA²/α_m → 0    as m → ∞,

where M and B are bounds for f and h.  □
2.3.9 Theorem. Suppose α_m = ∫ δ_m²(x) dx < ∞, f and h are bounded, and f has right and left-hand limits at t. Then if h̃_m(t) is an expanded estimator of h, and each δ_m is even,

    (k_m/α_m) Var(h̃_m(t)) → (f(t+) + f(t-))/2 + Σ_{j=2}^∞ f_j(t) = (h(t+) + h(t-))/2    as m → ∞.

Proof: The proof is the same as that of Theorem 2.3.8, except that Lemma 2.3.4 is used in place of Lemma 2.3.2.  □
The final group of lemmas and theorems deals with conditions under which the covariance of ĥ_m at two different points converges to zero. As for the variance, an exact rate of convergence is obtained.
2.3.10 Lemma. If g_m(x,y) is a sequence of real functions such that |g_m(x,y)| < M for some M and all x, y, and m, and g_m(x,y) converges uniformly in a neighborhood of (s,t) to a function g(x,y) which is continuous at (s,t), then if {δ_m} is a δ-function sequence,

    ∫∫ g_m(x,y) δ_m(s-x) δ_m(t-y) dx dy → g(s,t)    as m → ∞.

Proof:

(2.3.2)   ∫∫ g_m(x,y) δ_m(s-x) δ_m(t-y) dx dy - g(s,t)
            = ∫∫_Q (g_m(x,y) - g(x,y)) δ_m(s-x) δ_m(t-y) dx dy
            + ∫∫_Q (g(x,y) - g(s,t)) δ_m(s-x) δ_m(t-y) dx dy
            + ∫∫_R (g_m(x,y) - g(s,t)) δ_m(s-x) δ_m(t-y) dx dy,

where Q = {(x,y): |x-s| < λ, |y-t| < λ}, R is the complement of Q, and where, given ε, λ has been chosen so that g_m(x,y) converges uniformly to g(x,y) in Q and |g(x,y) - g(s,t)| < ε in Q.

The first term of (2.3.2) tends to zero by uniform convergence, since ∫∫ |δ_m(s-x) δ_m(t-y)| dx dy < A². The second term is dominated by εA². Since R ⊂ {|s-x| ≥ λ} ∪ {|t-y| ≥ λ}, the third term is dominated in absolute value by

    2M [ ∫∫_{|s-x|≥λ} |δ_m(s-x) δ_m(t-y)| dx dy + ∫∫_{|t-y|≥λ} |δ_m(s-x) δ_m(t-y)| dx dy ]
        ≤ 2MA [ ∫_{|s-x|≥λ} |δ_m(s-x)| dx + ∫_{|t-y|≥λ} |δ_m(t-y)| dy ],

both terms of which tend to zero by axiom (d).  □
2.3.11 Lemma. Under the conditions of Lemma 2.3.10, but with s = 0,

    ∫∫ g_m(x,y) δ_m(x) δ_m(t+x-y) dx dy → g(0,t)    as m → ∞.

Proof: The substitution of -x for x and -x+y for y in the integral gives

    ∫∫ g_m(-x, -x+y) δ_m(-x) δ_m(t-y) dx dy.

Let g_m*(x,y) = g_m(-x, -x+y) and g*(x,y) = g(-x, -x+y). Since g_m(x,y) → g(x,y) uniformly in a neighborhood of (0,t), g_m*(x,y) → g*(x,y) uniformly in a neighborhood of (0,t), and by Lemma 2.3.10,

    ∫∫ g_m*(x,y) δ_m(-x) δ_m(t-y) dx dy → g*(0,t) = g(0,t),

or ∫∫ g_m(x,y) δ_m(x) δ_m(t+x-y) dx dy → g(0,t) as m → ∞.  □
2.3.12 Lemma. Let 0 ≤ t < s. Suppose that f and h are bounded and that f is continuous at t and s-t. Then if {δ_m} is a δ-function sequence,

    ∫∫ δ_m(t-x) δ_m(s-x-y) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy → h(t) h(s-t)    as m → ∞.

Proof: Substituting t-x for x,

    ∫∫ δ_m(t-x) δ_m(s-x-y) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy
        = ∫∫ δ_m(x) δ_m(s-t+x-y) Σ_{1≤j<ℓ≤m} f_j(t-x) f_{ℓ-j}(y) dx dy.

Let g_m(x,y) = Σ_{1≤j<ℓ≤m} f_j(t-x) f_{ℓ-j}(y) and g(x,y) = h(t-x) h(y). Since f is continuous at t and s-t, g_m and g are continuous at (0, s-t). The result will follow from Lemma 2.3.11 once it is shown that g_m(x,y) → g(x,y) uniformly in a neighborhood of (0, s-t). Now

    g(x,y) = h(t-x) h(y) = Σ_{j=1}^∞ f_j(t-x) Σ_{ℓ=1}^∞ f_ℓ(y)
           ≥ g_m(x,y) = Σ_{1≤j<ℓ≤m} f_j(t-x) f_{ℓ-j}(y)
           ≥ Σ_{j=1}^p f_j(t-x) Σ_{ℓ=1}^{m-p} f_ℓ(y),

where p = [m/2]. Σ_{j=1}^m f_j(t) converges uniformly to h(t) in any finite interval. Thus, for any ε, m can be chosen sufficiently large so that Σ_{j=1}^p f_j(t-x) > h(t-x) - ε for all |x| < λ, and so that Σ_{ℓ=1}^{m-p} f_ℓ(y) > h(y) - ε for any y such that |y-(s-t)| < λ. Thus for any ε, m can be chosen sufficiently large so that

    g(x,y) = h(t-x) h(y) ≥ g_m(x,y) ≥ (h(t-x) - ε)(h(y) - ε)

for any (x,y) in a neighborhood of (0, s-t). Thus g_m(x,y) converges uniformly to g(x,y) in a neighborhood of (0, s-t).  □
2.3.13 Lemma. Let 0 ≤ t < s. Suppose that f and h are bounded. Then if {δ_m} is a δ-function sequence,

    ∫∫ δ_m(t-x-y) δ_m(s-x) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy → 0    as m → ∞.

Proof:

(2.3.3)   ∫∫ δ_m(t-x-y) δ_m(s-x) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy
            = ∫∫_Q δ_m(t-x-y) δ_m(s-x) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy
            + ∫∫_R δ_m(t-x-y) δ_m(s-x) Σ_{1≤j<ℓ≤m} f_j(x) f_{ℓ-j}(y) dx dy,

where Q = {(x,y): |x-s| ≤ λ, |x+y-t| ≤ λ} with λ < (s-t)/2, and where R is the complement of Q.

In Q, x+y ≤ λ+t and -x ≤ λ-s, so that y ≤ 2λ+t-s < 0. Since f_{ℓ-j}(y) = 0 for y < 0, the first term of (2.3.3) is zero. Further, R ⊂ {|x-s| > λ} ∪ {|x+y-t| > λ}, so that the second term of (2.3.3) is dominated by

(2.3.4)   B² ∫_{|x-s|>λ} |δ_m(s-x)| [ ∫ |δ_m(t-x-y)| dy ] dx
            + B² ∫∫_{|x+y-t|>λ} |δ_m(t-x-y)| |δ_m(s-x)| dx dy,

where B is a bound for h. The first term of (2.3.4) is dominated by AB² ∫_{|x-s|>λ} |δ_m(s-x)| dx, which tends to zero by (d) of the Axiom Scheme 2.2.1. Let w = x+y in the second term of (2.3.4), which then becomes

    B² ∫_{|w-t|>λ} |δ_m(t-w)| [ ∫ |δ_m(s-x)| dx ] dw ≤ AB² ∫_{|w-t|>λ} |δ_m(t-w)| dw,

which tends to zero by axiom (d).  □
2.3.14 Lemma. If g(x) is bounded and {δ_m} is a δ-function sequence, then for t and s distinct,

    ∫ δ_m(t-x) δ_m(s-x) g(x) dx → 0    as m → ∞.

Proof: Let B be a bound for |g|. Then

(2.3.5)   |∫ δ_m(t-x) δ_m(s-x) g(x) dx| ≤ B ∫ |δ_m(t-x)| |δ_m(s-x)| dx
            = B [ ∫_{|t-x|≤λ} + ∫_{|s-x|≤λ} + ∫_{|t-x|>λ, |s-x|>λ} ] |δ_m(t-x)| |δ_m(s-x)| dx,

where λ < |t-s|/2. Let K_m = sup_{|x|>λ} |δ_m(x)|; K_m → 0 by axiom (c). On {|t-x| ≤ λ} we have |s-x| > λ, so the first term of (2.3.5) is dominated by K_m ∫ |δ_m(t-x)| dx < K_m A, which tends to zero. The second and third terms tend to zero similarly.  □
2.3.15 Theorem. Let 0 ≤ t < s. Suppose that h and f are bounded and that f is continuous at s, t and s-t. Then if ȟ_m is a compact estimator of h,

    k_m Cov(ȟ_m(t), ȟ_m(s)) → h(t)(h(s-t) - h(s))    as m → ∞.

Proof:

    k_m Cov(ȟ_m(t), ȟ_m(s)) = (1/k_m) Σ_{i1=1}^{k_m} Σ_{i2=1}^{k_m} Cov( Σ_{j1=1}^m δ_m(t - S_{i1,j1}), Σ_{j2=1}^m δ_m(s - S_{i2,j2}) )
                            = Cov( Σ_{j1=1}^m δ_m(t - S_{1,j1}), Σ_{j2=1}^m δ_m(s - S_{1,j2}) ),

since S_{i1,j1} and S_{i2,j2} are independent unless i1 = i2. Now

    Cov( Σ_{j1} δ_m(t - S_{1,j1}), Σ_{j2} δ_m(s - S_{1,j2}) )
        = E( Σ_{j1=1}^m Σ_{j2=1}^m δ_m(t - S_{1,j1}) δ_m(s - S_{1,j2}) )
        - E( Σ_{j1=1}^m δ_m(t - S_{1,j1}) ) E( Σ_{j2=1}^m δ_m(s - S_{1,j2}) ).

The second term tends to -h(t) h(s) by Lemma 2.3.2. The first term is

(2.3.6)   ∫∫ δ_m(t-x) δ_m(s-x-y) Σ_{1≤j1<j2≤m} f_{j1}(x) f_{j2-j1}(y) dx dy
            + ∫ δ_m(t-x) δ_m(s-x) Σ_{j=1}^m f_j(x) dx
            + ∫∫ δ_m(t-x-y) δ_m(s-x) Σ_{1≤j2<j1≤m} f_{j2}(x) f_{j1-j2}(y) dx dy.

The first term tends to h(t) h(s-t) by Lemma 2.3.12. The third term tends to zero by Lemma 2.3.13. The second term is dominated by ∫ |δ_m(t-x)| |δ_m(s-x)| h(x) dx, which tends to zero as in Lemma 2.3.14. Thus

    k_m Cov(ȟ_m(t), ȟ_m(s)) → h(t) h(s-t) - h(t) h(s) = h(t)(h(s-t) - h(s))    as m → ∞.  □
2.3.16 Lemma. Suppose that f and h are bounded and that f is continuous at t. Then if {δ_m} is a δ-function sequence,

    ∫∫ δ_m(t-x) δ_m(t-y) Σ_{j=1}^m (f_j(x) - f_j(t))(f_j(y) - f_j(t)) dx dy → 0    as m → ∞.

Proof: Let g_m(x,y) = Σ_{j=1}^m (f_j(x) - f_j(t))(f_j(y) - f_j(t)), and let

    g(x,y) = Σ_{j=1}^∞ (f_j(x) - f_j(t))(f_j(y) - f_j(t)).

The series converges, since

    Σ_{j=1}^∞ |(f_j(x) - f_j(t))(f_j(y) - f_j(t))|
        ≤ Σ_{j=1}^∞ f_j(x) f_j(y) + Σ_{j=1}^∞ f_j(x) f_j(t) + Σ_{j=1}^∞ f_j(y) f_j(t) + Σ_{j=1}^∞ f_j(t) f_j(t)
        ≤ B Σ_{j=1}^∞ f_j(x) + 3B Σ_{j=1}^∞ f_j(t) = B h(x) + 3B h(t) ≤ 4B²,

where B is a bound for h. Also

    |g(x,y) - g_m(x,y)| ≤ Σ_{j=m+1}^∞ f_j(x) f_j(y) + Σ_{j=m+1}^∞ f_j(x) f_j(t)
        + Σ_{j=m+1}^∞ f_j(y) f_j(t) + Σ_{j=m+1}^∞ f_j(t) f_j(t)
        ≤ B Σ_{j=m+1}^∞ f_j(x) + 3B Σ_{j=m+1}^∞ f_j(t).

Since Σ_{j=m+1}^∞ f_j(x) tends to zero uniformly in any finite interval, g_m(x,y) → g(x,y) uniformly in a neighborhood of (t,t). Since g_m(x,y) is continuous at (t,t) for each m, g(x,y) is also continuous at (t,t).
Application of Lemma 2.3.10 gives

    ∫∫ g_m(x,y) δ_m(t-x) δ_m(t-y) dx dy → g(t,t) = 0,

which is the desired result.  □
2.3.17 Theorem. Suppose that f and h are bounded and that f is continuous at t and s. Then if h̃_m(t) is an expanded estimator of h, and t ≠ s,

    k_m Cov(h̃_m(t), h̃_m(s)) → - Σ_{j=1}^∞ f_j(t) f_j(s)    as m → ∞.

Proof:

    k_m Cov(h̃_m(t), h̃_m(s)) = (1/k_m) Σ_{i=1}^{k_m} Σ_{j=1}^m Cov(δ_m(t - S_ij), δ_m(s - S_ij)),

since S_{i1,j1} and S_{i2,j2} are independent unless (i1, j1) = (i2, j2). Now

(2.3.7)   (1/k_m) Σ_{i=1}^{k_m} Σ_{j=1}^m Cov(δ_m(t - S_ij), δ_m(s - S_ij))
            = Σ_{j=1}^m Cov(δ_m(t - S_1j), δ_m(s - S_1j))
            = Σ_{j=1}^m E(δ_m(t - S_1j) δ_m(s - S_1j)) - Σ_{j=1}^m E(δ_m(t - S_1j)) E(δ_m(s - S_1j)).

The first term of (2.3.7) is the same as the second term of (2.3.6), and tends to zero as before. The second term of (2.3.7) can be written as

    - Σ_{j=1}^m ( ∫ δ_m(t-u) f_j(u) du )( ∫ δ_m(s-v) f_j(v) dv ) = - Σ_{j=1}^m x_{j,m} y_{j,m} = - <x_m, y_m>,

where x_{j,m} = ∫ δ_m(t-u) f_j(u) du for 1 ≤ j ≤ m, x_{j,m} = 0 for j > m; y_{j,m} = ∫ δ_m(s-v) f_j(v) dv for 1 ≤ j ≤ m, y_{j,m} = 0 for j > m; and < , > denotes the inner product of the Hilbert space ℓ_2. Let x = (f_j(t)) and y = (f_j(s)); x and y are in ℓ_2, since

    Σ_{j=1}^∞ f_j²(t) ≤ B Σ_{j=1}^∞ f_j(t) = B h(t),

where B is a bound for h. We will show that x_m → x in ℓ_2 and y_m → y in ℓ_2, and thus that <x_m, y_m> → <x, y>, which will give the desired result. Now x_m → x in ℓ_2 if the following tends to zero:

    ||x_m - x||² = Σ_{j=1}^m [ ∫ δ_m(t-u)(f_j(u) - f_j(t)) du ]² + Σ_{j=m+1}^∞ f_j²(t).

The second term on the right side is the tail of a convergent series and thus tends to zero as m → ∞. The first tends to zero by Lemma 2.3.16.  □
Since the mean square error is the sum of the variance and the square of the bias, a combination of Theorems 2.3.3 and 2.3.6 shows that the compact estimator is pointwise consistent in quadratic mean if k_m/α_m → ∞. Similarly, Theorems 2.3.3 and 2.3.8 show that the expanded estimator is pointwise consistent. In Chapter 4, rates of convergence of the bias will be investigated and rates of convergence of the mean square error to zero found (i.e., the "order of consistency" of ȟ_m(t) and h̃_m(t) found) for particular classes of underlying distributions.
From time to time throughout this work, a simple standard example will be used to illustrate conditions of theorems in a specific case. This example involves a kernel-type estimator, $\delta_m(t) = \frac{1}{\lambda_m}k\big(\frac{t}{\lambda_m}\big)$, with $\lambda_m = m^{-s}$ and $k_m = m^r$, where $s>0$ and $r>0$. Note that, for a kernel-type estimator,
$$\alpha_m = \frac{1}{\lambda_m}\int k^2(u)\,du.$$
The condition that $\alpha_m/k_m \to 0$, needed for $\operatorname{Var}(\hat{h}_m(t))$ to tend to zero, becomes $k_m\lambda_m \to \infty$, or $m^{r-s}\to\infty$. Thus, in the example, $\operatorname{Var}(\hat{h}_m(t))\to 0$ provided $r>s$.
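The standard example is easy to realize numerically. The sketch below is an illustration only; the Gaussian kernel, the exponential lifetime distribution, and the particular parameter values are assumptions, not part of the original text. It builds the compact estimator $\hat{h}_m(t) = \frac{1}{k_m}\sum_{i=1}^{k_m}\sum_{j=1}^{m}\delta_m(t-S_{ij})$ with $\delta_m(t) = \frac{1}{\lambda_m}k(t/\lambda_m)$, $\lambda_m = m^{-s}$, $k_m = m^r$:

```python
import numpy as np

def compact_estimator(t, rng, m, r=2.0, s=0.5):
    """Compact kernel-type estimator h_hat_m(t) of the renewal density,
    with k_m = m^r replications, bandwidth lambda_m = m^(-s),
    and a Gaussian kernel k."""
    k_m, lam = int(m ** r), m ** (-s)
    X = rng.exponential(1.0, size=(k_m, m))   # i.i.d. lifetimes X_ij
    S = np.cumsum(X, axis=1)                  # S_ij = X_i1 + ... + X_ij
    u = (t - S) / lam
    return np.exp(-0.5 * u**2).sum() / (np.sqrt(2.0 * np.pi) * lam * k_m)

rng = np.random.default_rng(0)
est = compact_estimator(2.0, rng, m=20)
print(est)  # exponential(1) lifetimes give h(t) = 1, so est should be near 1
```

For exponential lifetimes the renewal process is Poisson, so $h(t)$ is identically the renewal rate; since $r>s$ here, the variance condition of the example is met.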
CHAPTER 3
ALMOST SURE CONSISTENCY
The first section in this chapter presents conditions under which
$$\sup_{t\in I}\big|\hat{h}_m(t) - h(t)\big| \to 0 \quad\text{a.s.}$$
for $I$ any finite interval. For such a case we say that $\hat{h}_m(t)$ is "uniformly consistent with probability one" in any finite interval. The result will apply to both compact and expanded estimators.

In the second section, conditions under which $\hat{h}_m(t) \to h(t)$ a.s. will be presented. That is, conditions under which $\hat{h}_m(t)$ is "pointwise consistent with probability one." Again, the result will apply to both compact and expanded estimators.

Although the results in the two sections are similar, the methods used are quite different. In the first section, $f$ is assumed to be uniformly continuous and $h$ to be bounded. In the second, $f$ and $h$ are assumed bounded, but the result applies to continuity points of $f$ without requiring uniform continuity.

3.1 Almost Sure Uniform Consistency
Nadaraya (1965) proved that if $f$ is uniformly continuous, if $\hat{f}_{1,n}(x)$ is a kernel-type probability density estimator, i.e., one for which $\delta_n(x) = \frac{1}{\lambda_n}k\big(\frac{x}{\lambda_n}\big)$, with a kernel $k$ which is a function of bounded variation, and if $\sum_{n=1}^{\infty} e^{-\gamma n\lambda_n^2}$ converges for every $\gamma>0$, then
$$\sup_x \big|\hat{f}_{1,n}(x) - f(x)\big| \to 0 \quad\text{a.s.}$$

In this section, a similar theorem for $\hat{h}_m(t)$ will be proved. The series which must converge is, naturally, different. Also, because $\hat{h}_m(t)$ uses only $m$ convolutions from the renewal density series, we must restrict ourselves to a supremum over a finite interval, rather than all $t$.
Recall that $F_j(x)$ is the distribution function of the sum of $j$ $X_i$'s. Let $F_{j,n}(x)$ be the empirical distribution function based on the observation of $n$ such sums. Let $K$ be the distribution function corresponding to the kernel $k$.
Note that either estimator (compact or expanded) may be written as
$$\hat{h}_m(t) = \sum_{j=1}^{m}\int \delta_m(t-x)\,dF_{j,k_m}(x) = \frac{1}{\lambda_m}\sum_{j=1}^{m}\int k\Big(\frac{t-x}{\lambda_m}\Big)\,dF_{j,k_m}(x),$$
and that
$$E\big(\hat{h}_m(t)\big) = \sum_{j=1}^{m}\int \delta_m(t-x)\,dF_j(x) = \frac{1}{\lambda_m}\sum_{j=1}^{m}\int k\Big(\frac{t-x}{\lambda_m}\Big)\,dF_j(x).$$
3.1.1 Theorem. Suppose $\hat{h}_m(t)$ is a kernel-type estimator of $h(t)$. If $k(t)$, the kernel, is a function of bounded variation, $f$ is uniformly continuous, $h(t)$ is bounded by some $B$, and the series $\sum_{m=1}^{\infty} m\exp\big({-\frac{\gamma k_m\lambda_m^2}{m^2}}\big)$ converges for every positive $\gamma$, then
$$\sup_{t\in I}\big|\hat{h}_m(t) - h(t)\big| \to 0 \quad\text{a.s. as } m\to\infty,$$
where $I$ is any finite interval.
Proof:
$$\sup_{t\in I}\big|\hat{h}_m(t) - h(t)\big| \le \sup_{t\in I}\big|\hat{h}_m(t) - E(\hat{h}_m(t))\big| + \sup_{t\in I}\big|E(\hat{h}_m(t)) - h_m(t)\big| + \sup_{t\in I}\big|h_m(t) - h(t)\big|, \tag{3.1.1}$$
where $h_m(t)$ is $\sum_{j=1}^{m}f_j(t)$, as defined in Section 1.3. By Theorem 2.1.3, the last term of (3.1.1) tends to zero as $m\to\infty$. It will actually be shown that the first and second terms of (3.1.1) tend to zero when the suprema are taken over all $t$. The result will then be proved.
The first term of (3.1.1) is dominated by
$$\sup_t\big|\hat{h}_m(t) - E(\hat{h}_m(t))\big| = \sup_t\frac{1}{\lambda_m}\Big|\sum_{j=1}^{m}\Big[\int k\Big(\frac{t-x}{\lambda_m}\Big)\,dF_{j,k_m}(x) - \int k\Big(\frac{t-x}{\lambda_m}\Big)\,dF_j(x)\Big]\Big|$$
$$= \sup_t\frac{1}{\lambda_m}\Big|\sum_{j=1}^{m}\int\big(F_{j,k_m}(x) - F_j(x)\big)\,dk\Big(\frac{t-x}{\lambda_m}\Big)\Big| \le \frac{V}{\lambda_m}\sum_{j=1}^{m}\sup_t\big|F_{j,k_m}(t) - F_j(t)\big|,$$
where $V$ is the total variation of $k$.
Hence
$$P\Big(\sup_t\big|\hat{h}_m(t) - E(\hat{h}_m(t))\big| > \varepsilon\Big) \le P\Big(\frac{V}{\lambda_m}\sum_{j=1}^{m}\sup_t\big|F_{j,k_m}(t) - F_j(t)\big| > \varepsilon\Big) \le \sum_{j=1}^{m}P\Big(\frac{V}{\lambda_m}\sup_t\big|F_{j,k_m}(t) - F_j(t)\big| > \frac{\varepsilon}{m}\Big). \tag{3.1.2}$$
We use an inequality of Smirnov (see Nadaraya (1965)),
$$P\Big(\sup_u\big|F_{1,n}(u) - F(u)\big| > \varepsilon\Big) \le Ce^{-\alpha n\varepsilon^2}$$
for some $C$ and $0<\alpha\le 2$, to show that (3.1.2) is dominated by
$$mC\exp\Big({-\frac{\alpha\varepsilon^2k_m\lambda_m^2}{m^2V^2}}\Big) = mCe^{-\gamma k_m\lambda_m^2/m^2}, \quad\text{where } \gamma = \frac{\alpha\varepsilon^2}{V^2}.$$
Since $\sum_{m=1}^{\infty}mCe^{-\gamma k_m\lambda_m^2/m^2}$ converges for all $\gamma>0$, the Borel-Cantelli theorem shows that $\sup_t|\hat{h}_m(t) - E(\hat{h}_m(t))| \to 0$ a.s.
The second term of (3.1.1) is dominated by
$$\sup_t\Big|\sum_{j=1}^{m}\int\delta_m(x)f_j(t-x)\,dx - \sum_{j=1}^{m}f_j(t)\Big| = \sup_t\Big|\sum_{j=1}^{m}\int\delta_m(x)\big(f_j(t-x) - f_j(t)\big)\,dx\Big|$$
$$\le \sup_t\Big(\Big|\int_{|x|\le\lambda}\delta_m(x)\big(f(t-x) - f(t)\big)\,dx\Big| + \sum_{j=2}^{m}\int_{|x|\le\lambda}\delta_m(x)\int f_{j-1}(y)\big|f(t-x-y) - f(t-y)\big|\,dy\,dx + \Big|\int_{|x|>\lambda}\delta_m(x)\sum_{j=1}^{m}\big(f_j(t-x) - f_j(t)\big)\,dx\Big|\Big) \tag{3.1.3}$$
$$\le \sup_t\Big(\int_{|x|\le\lambda}\delta_m(x)\big|f(t-x) - f(t)\big|\,dx + \int_{|x|\le\lambda}\delta_m(x)\,B\int\big|f(t-y-x) - f(t-y)\big|\,dy\,dx + \int_{|x|>\lambda}\delta_m(x)\big(h_m(t-x) + h_m(t)\big)\,dx\Big).$$
Since $f$ is uniformly continuous, $\lambda$ can be chosen so that $|f(t-x) - f(t)| < \varepsilon$ for all $t$ whenever $|x|\le\lambda$. Thus the first term of (3.1.3) is dominated by $\varepsilon$. At the same time, $\lambda$ can be chosen small enough so that $B\int|f(t-y-x) - f(t-y)|\,dy < \varepsilon$ for $|x|\le\lambda$, so that the second term of (3.1.3) is dominated by $\varepsilon$ also. The third term of (3.1.3) is dominated by $2B\int_{|x|>\lambda}\delta_m(x)\,dx$. If $m$ is chosen large enough, this is dominated by $\varepsilon$ also, and
$$\sup_t\big|E(\hat{h}_m(t)) - h_m(t)\big| < 3\varepsilon.$$
Thus the second term of (3.1.1) tends to zero. $\square$
Note that the theorem applies to both compact and expanded estimators.
Recall that in the standard example $k_m = m^r$ and $\lambda_m = m^{-s}$ with $r>0$, $s>0$. Then
$$\sum_{m=1}^{\infty}m\exp\Big({-\frac{\gamma k_m\lambda_m^2}{m^2}}\Big) = \sum_{m=1}^{\infty}me^{-\gamma m^{r-2-2s}}.$$
The series will converge for all $\gamma>0$ if $r-2-2s>0$, or $r>2+2s$.
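A small simulation suggests how the supremum over a finite interval behaves in the standard example when $r>2+2s$. This is a numerical illustration only, under assumed choices (Gaussian kernel, exponential(1) lifetimes, for which $h(t)\equiv 1$):

```python
import numpy as np

rng = np.random.default_rng(1)

def sup_error(m, r=3.2, s=0.5):
    """sup over a grid of |h_hat_m(t) - h(t)| for exponential(1) lifetimes
    (h(t) = 1), with k_m = m^r, lambda_m = m^(-s), so that r > 2 + 2s."""
    k_m, lam = int(m ** r), m ** (-s)
    S = np.cumsum(rng.exponential(1.0, size=(k_m, m)), axis=1)
    grid = np.linspace(1.0, 4.0, 31)
    est = np.array([np.exp(-0.5 * ((t - S) / lam) ** 2).sum() for t in grid])
    est /= np.sqrt(2.0 * np.pi) * lam * k_m
    return np.abs(est - 1.0).max()

errs = [sup_error(m) for m in (4, 8, 16)]
print(errs)  # the sup-error over [1, 4] shrinks as m grows
```

The dominant error for small $m$ is the truncation of the renewal density series at $m$ convolutions, which is exactly why the theorem restricts the supremum to a finite interval.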
3.2 Almost Sure Pointwise Consistency
Van Ryzin (1969) used martingale methods to prove a theorem giving conditions under which $\hat{f}_{1,n}(t) \to f(t)$ almost surely, where $\hat{f}_{1,n}(t)$ is a kernel-type estimator of $f(t)$. The proof depends on the relationship
$$\hat{f}_{1,n+1}(x) = \frac{n}{n+1}\,\hat{f}_{1,n}(x) + \frac{1}{n+1}\,\delta_{n+1}(x - X_{n+1}).$$
In order to prove a similar result for $\hat{h}_m(t)$ it is necessary to have a similar relationship for $\hat{h}_{m+1}(t)$ and $\hat{h}_m(t)$. At this point, the compact estimator only will be considered.
Let the i.i.d. sequence of $X$'s be double-indexed $X_{ij}$, $1\le i<\infty$, $1\le j<\infty$. Then for each $(i,j)$ let $S_{ij} = \sum_{\ell=1}^{j}X_{i\ell}$, and let
$$\hat{h}_m(t) = \frac{1}{k_m}\sum_{i=1}^{k_m}\sum_{j=1}^{m}\delta_m(t - S_{ij}).$$
Thus $S_{ij}$ is uniquely defined for all $(i,j)$ and $\hat{h}_m(t)$ is a compact estimator of $h$ as defined in Chapter 1.
Also,
$$\hat{h}_{m+1}(t) = \frac{k_m}{k_{m+1}}\hat{h}_m(t) + \frac{1}{k_{m+1}}\Big(\sum_{i=1}^{k_m}\delta_{m+1}(t - S_{i,m+1}) + \sum_{i=k_m+1}^{k_{m+1}}\sum_{j=1}^{m+1}\delta_{m+1}(t - S_{ij})\Big).$$
The double indexes can be assigned in such a way that, for any $m$, the first $mk_m$ are used for $\hat{h}_m(t)$ while the indexes are also assigned "permanently," i.e., in such a way that they do not depend on $m$.
The following lemma was proved by Van Ryzin, and is the basis of the theorem. The lemma is proved by applying the submartingale convergence theorem to a suitable martingale; the details of the proof will not be given here.
3.2.1 Lemma. Let $\{Y_m\}$ and $\{Y'_m\}$ be two sequences of random variables on a probability space $(\Omega, \mathcal{F}, P)$. Let $\{\mathcal{F}_m\}$ be a sequence of $\sigma$-fields, $\mathcal{F}_m \subset \mathcal{F}_{m+1} \subset \mathcal{F}$, where $Y_m$ and $Y'_m$ are measurable with respect to $\mathcal{F}_m$. If

(i) $0 \le Y_m$ a.s.,

(ii) $E(Y_{m+1}\mid\mathcal{F}_m) \le Y_m + Y'_m$ a.s.,

(iii) $\sum_{m=1}^{\infty}E|Y'_m| < \infty$,

then there exists a random variable $Y$ which is finite a.s. and such that $Y_m$ converges to $Y$ a.s.
3.2.2 Definition. A real-valued function on the real line, $g(c)$, is said to be locally Lipschitz of order $a$, $a>0$, at 1 if there exists an $\varepsilon>0$ and an $M$, $0<M<\infty$, such that $|g(c) - g(1)| \le M|c-1|^a$ for all $c$ in $(1-\varepsilon, 1+\varepsilon)$.

3.2.3 Theorem. Let $\hat{h}_m(t) = \frac{1}{k_m}\sum_{i=1}^{k_m}\sum_{j=1}^{m}\delta_m(t - S_{ij})$, where the $S_{ij}$'s are uniquely defined for all $i,j$, as described at the beginning of this section, and $k_m$ tends to infinity monotonically as $m\to\infty$.
Let $\{\delta_m\}$ be a kernel-type $\delta$-function sequence as defined in Lemma 2.2.2. Then $\delta_m(x) = \frac{1}{\lambda_m}k\big(\frac{x}{\lambda_m}\big)$, and suppose that

(1) $\sum_{m=1}^{\infty}\dfrac{1}{\lambda_mk_m} < \infty$;

(2) $\lim_{m\to\infty}\dfrac{\lambda_m}{\lambda_{m+1}} = 1$;

(3) $\sum_{m=1}^{\infty}\dfrac{m(k_{m+1} - k_m)^2}{\lambda_{m+1}k_m^2k_{m+1}} < \infty$;

(4) $g(c) = \int\big(k(cu) - k(u)\big)^2\,du$ is locally Lipschitz of order $a$ at $c=1$ for some $a>0$;

(5) $\sum_{m=1}^{\infty}\dfrac{m}{k_m}\lambda_m^{a-1}\Big|\dfrac{1}{\lambda_m} - \dfrac{1}{\lambda_{m+1}}\Big|^a < \infty$.

Then if $f$ and $h$ are bounded and $f$ is continuous at $t$, $\hat{h}_m(t) \to h(t)$ a.s.
Proof: By Theorem 2.3.3, $E(\hat{h}_m(t)) \to h(t)$ as $m\to\infty$. Hence it suffices to show that $\hat{h}_m(t) - E(\hat{h}_m(t)) \to 0$ almost surely as $m\to\infty$. Let $\mathcal{F}_m$ be the $\sigma$-field generated by $\{X_{ij}\colon i\le k_m,\ j\le m\}$, and let
$$Y_m(t) = \big(\hat{h}_m(t) - E(\hat{h}_m(t))\big)^2,$$
so that $E(Y_m(t)) = \operatorname{Var}(\hat{h}_m(t)) \to 0$ by Theorem 2.3.6, since $1/(\lambda_mk_m)\to 0$. Also,
$$\hat{h}_{m+1}(t) - E\big(\hat{h}_{m+1}(t)\big) = A + B + C + D,$$
say, where
$$A = \frac{1}{\lambda_mk_m}\sum_{j=1}^{m}\sum_{i=1}^{k_m}\Big[k\Big(\frac{t-S_{ij}}{\lambda_m}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_m}\Big)\Big] = \hat{h}_m(t) - E(\hat{h}_m(t)),$$
$$B = \sum_{j=1}^{m}\sum_{i=1}^{k_m}\bigg\{\frac{1}{\lambda_{m+1}k_{m+1}}\Big[k\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big)\Big] - \frac{1}{\lambda_mk_m}\Big[k\Big(\frac{t-S_{ij}}{\lambda_m}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_m}\Big)\Big]\bigg\},$$
$$C = \frac{1}{\lambda_{m+1}k_{m+1}}\sum_{i=1}^{k_m}\Big[k\Big(\frac{t-S_{i,m+1}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{i,m+1}}{\lambda_{m+1}}\Big)\Big],$$
$$D = \frac{1}{\lambda_{m+1}k_{m+1}}\sum_{i=k_m+1}^{k_{m+1}}\sum_{j=1}^{m+1}\Big[k\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big)\Big].$$
Now
$$Y_{m+1} = (A+B+C+D)^2 = A^2 + B^2 + C^2 + D^2 + 2(AB + AC + AD + BC + BD + CD),$$
where the terms $A$ and $B$ are $\mathcal{F}_m$-measurable, and $D$ is independent of $\mathcal{F}_m$ and has mean zero. Thus
$$E(Y_{m+1}\mid\mathcal{F}_m) = A^2 + B^2 + E(C^2\mid\mathcal{F}_m) + E(D^2) + 2\big(AB + AE(C\mid\mathcal{F}_m) + 0 + BE(C\mid\mathcal{F}_m) + 0 + 0\big)$$
$$= Y_m + \big[B^2 + E(C^2\mid\mathcal{F}_m) + E(D^2) + 2AB + 2AE(C\mid\mathcal{F}_m) + 2BE(C\mid\mathcal{F}_m)\big].$$
Denote the quantity in square brackets by $Y'_m$. All the conditions of Lemma 3.2.1 will be fulfilled for this $\{Y_m\}$, $\{Y'_m\}$, $\{\mathcal{F}_m\}$ if it can be shown that $\sum_{m=1}^{\infty}E|Y'_m(t)| < \infty$. It will then follow from the Lemma that $Y_m$ almost surely tends to a random variable $Y$ which is finite a.s. Since $Y_m$ is non-negative and $E(Y_m)\to 0$, $Y$ must be a.s. zero. Then $\hat{h}_m(t) - E(\hat{h}_m(t))$ will also tend a.s. to zero and the proof will be complete.
To show that $\sum_{m=1}^{\infty}E|Y'_m(t)| < \infty$, it is sufficient to show the following:

(i) $\sum_{m=1}^{\infty}E(B^2) < \infty$;

(ii) $\sum_{m=1}^{\infty}E|AB| < \infty$;

(iii) $\sum_{m=1}^{\infty}E(C^2) < \infty$;

(iv) $\sum_{m=1}^{\infty}E(D^2) < \infty$;

(v) $\sum_{m=1}^{\infty}E|AE(C\mid\mathcal{F}_m)| < \infty$;

(vi) $\sum_{m=1}^{\infty}E|BE(C\mid\mathcal{F}_m)| < \infty$.
By Hölder's inequality, (ii) can be replaced by

(ii′) $\sum_{m=1}^{\infty}[E(A^2)]^{1/2}[E(B^2)]^{1/2} < \infty$.

Similarly, $E|AE(C\mid\mathcal{F}_m)|$ is dominated by $[E(A^2)]^{1/2}\big[E\big(E(C\mid\mathcal{F}_m)^2\big)\big]^{1/2} \le [E(A^2)]^{1/2}[E(C^2)]^{1/2}$, since $g(x)=x^2$ is convex. Thus (v) can be replaced by

(v′) $\sum_{m=1}^{\infty}[E(A^2)]^{1/2}[E(C^2)]^{1/2} < \infty$.

In exactly the same way, (vi) can be replaced by

(vi′) $\sum_{m=1}^{\infty}[E(B^2)]^{1/2}[E(C^2)]^{1/2} < \infty$.
For reference, introduce the further condition

(vii) $\sum_{m=1}^{\infty}E(A^2) < \infty$.

Since $ab \le \frac{a^2+b^2}{2}$, (vi′) will follow from (i) and (iii), (ii′) from (i) and (vii), and (v′) from (iii) and (vii). In other words, it is sufficient to show (i), (iii), (iv), and (vii).
(i) $E(B^2) = E\Big[\sum_{j=1}^{m}\sum_{i=1}^{k_m}W_{mij}(t)\Big]^2$, where
$$W_{mij}(t) = \frac{1}{\lambda_{m+1}k_{m+1}}\Big[k\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big)\Big] - \frac{1}{\lambda_mk_m}\Big[k\Big(\frac{t-S_{ij}}{\lambda_m}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_m}\Big)\Big].$$
Since $E(W_{mij}(t)) = 0$, and since $W_{mi_1j_1}(t)$ and $W_{mi_2j_2}(t)$ are independent for $i_1\ne i_2$, so that $E(W_{mi_1j_1}(t)W_{mi_2j_2}(t)) = 0$,
$$E\Big(\sum_{j=1}^{m}\sum_{i=1}^{k_m}W_{mij}(t)\Big)^2 = k_mE\Big[\sum_{j=1}^{m}W_{m1j}(t)\Big]^2.$$
Since for any random variable $X$, $E(X - EX)^2 \le EX^2$, and since $\big(\sum_{i=1}^{m}a_i\big)^2 \le m\sum_{i=1}^{m}a_i^2$, $E(B^2)$ does not exceed
$$2mk_m\Big(\frac{1}{\lambda_{m+1}k_{m+1}} - \frac{1}{\lambda_mk_m}\Big)^2\sum_{j=1}^{m}\int k^2\Big(\frac{u}{\lambda_{m+1}}\Big)f_j(t-u)\,du + \frac{2m}{\lambda_m^2k_m}\sum_{j=1}^{m}\int\Big(k\Big(\frac{u}{\lambda_{m+1}}\Big) - k\Big(\frac{u}{\lambda_m}\Big)\Big)^2f_j(t-u)\,du. \tag{3.2.1}$$
The first term of (3.2.1) can be rewritten using $\delta^*_{m+1}(t) = k^2(t/\lambda_{m+1})\big/\big(\lambda_{m+1}\int k^2(u)\,du\big)$; since $\{\delta^*_m(t)\}$ is a $\delta$-function sequence, the first term is equal to
$$2mk_m\lambda_{m+1}\Big(\frac{1}{\lambda_{m+1}k_{m+1}} - \frac{1}{\lambda_mk_m}\Big)^2\int k^2(u)\,du\int\delta^*_{m+1}(t-u)\sum_{j=1}^{m}f_j(u)\,du.$$
By Lemma 2.3.2, the last integral is $h(t)(1+o(1))$, so that the first term of (3.2.1) is
$$\frac{2m\lambda_{m+1}}{\lambda_m^2k_m}\Big(\frac{\lambda_mk_m}{\lambda_{m+1}k_{m+1}} - 1\Big)^2\int k^2(u)\,du\;h(t)(1+o(1))$$
$$\le \bigg[\frac{4m(k_{m+1}-k_m)^2}{\lambda_{m+1}k_mk_{m+1}^2} + \frac{4m}{k_m}\lambda_{m+1}\Big(\frac{1}{\lambda_m} - \frac{1}{\lambda_{m+1}}\Big)^2\bigg]\int k^2(u)\,du\;h(t)(1+o(1)).$$
The second term of (3.2.1) is
$$\frac{2m}{\lambda_m^2k_m}\sum_{j=1}^{m}\int\Big(k\Big(\frac{\lambda_m}{\lambda_{m+1}}u\Big) - k(u)\Big)^2f_j(t-\lambda_mu)\,\lambda_m\,du \le \frac{2mM}{\lambda_mk_m}\,g\Big(\frac{\lambda_m}{\lambda_{m+1}}\Big),$$
where $M$ is a bound on $h$ and $g$ is as in (4). Since $\lambda_m/\lambda_{m+1}\to 1$, given $\varepsilon>0$ there is an $m_0$ so that for $m>m_0$, $|\lambda_m/\lambda_{m+1} - 1| < \varepsilon$. Since $g$ is locally Lipschitz of order $a$ at 1, for $m>m_0$,
$$g\Big(\frac{\lambda_m}{\lambda_{m+1}}\Big) \le M_1\Big|\frac{\lambda_m}{\lambda_{m+1}} - 1\Big|^a = M_1\lambda_m^a\Big|\frac{1}{\lambda_m} - \frac{1}{\lambda_{m+1}}\Big|^a,$$
where $M_1$ is the Lipschitz constant of Definition 3.2.2. Thus, for $m>m_0$, the second term of (3.2.1) is dominated by
$$\frac{2mMM_1}{k_m}\lambda_m^{a-1}\Big|\frac{1}{\lambda_m} - \frac{1}{\lambda_{m+1}}\Big|^a.$$
Thus, by (3) and (5), $\sum_{m=1}^{\infty}E(B^2) < \infty$.
(iii)
$$E(C^2) = \frac{1}{\lambda_{m+1}^2k_{m+1}^2}E\bigg[\sum_{i=1}^{k_m}\Big(k\Big(\frac{t-S_{i,m+1}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{i,m+1}}{\lambda_{m+1}}\Big)\Big)\bigg]^2 = \frac{k_m}{\lambda_{m+1}^2k_{m+1}^2}\operatorname{Var}\Big(k\Big(\frac{t-S_{1,m+1}}{\lambda_{m+1}}\Big)\Big)$$
$$= \frac{k_m}{k_{m+1}^2}\operatorname{Var}\big(\delta_{m+1}(t-S_{1,m+1})\big) \le \frac{k_m\alpha_{m+1}}{k_{m+1}^2}\int\delta^*_{m+1}(t-u)M\,du = \frac{k_m\alpha_{m+1}M}{k_{m+1}^2}$$
(where $M$ is a bound on $h$, and thus on $f_j$ for all $j$). Since $k_m/k_{m+1}^2 \le 1/k_{m+1}$ and since $\alpha_{m+1}$ is of the order of $1/\lambda_{m+1}$,
$$\sum_{m=1}^{\infty}E(C^2) < \infty \quad\text{by (1).}$$
(iv)
$$E(D^2) = \frac{1}{\lambda_{m+1}^2k_{m+1}^2}E\bigg[\sum_{i=k_m+1}^{k_{m+1}}\sum_{j=1}^{m+1}\Big(k\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big) - Ek\Big(\frac{t-S_{ij}}{\lambda_{m+1}}\Big)\Big)\bigg]^2 = \frac{k_{m+1}-k_m}{k_{m+1}^2}\operatorname{Var}\bigg[\sum_{j=1}^{m+1}\delta_{m+1}(t-S_{1j})\bigg]$$
$$\le \frac{2(k_{m+1}-k_m)}{\lambda_{m+1}k_{m+1}^2}\int k^2(u)\,du\;h(t)$$
by Lemma 2.3.6. Since $\dfrac{k_{m+1}-k_m}{\lambda_{m+1}k_{m+1}^2} \le \dfrac{1}{\lambda_{m+1}k_{m+1}}$,
$$\sum_{m=1}^{\infty}E(D^2) < \infty \quad\text{by (1).}$$
(vii) $E(A^2) = \operatorname{Var}\{\hat{h}_m(t)\} \sim \frac{1}{\lambda_mk_m}\int k^2(u)\,du\;h(t)$ by Theorem 2.3.6, so, by (1), $\sum_{m=1}^{\infty}E(A^2) < \infty$. $\square$
3.2.4 Theorem. Let $\hat{h}_m(t)$ be an expanded estimator of $h$, constructed with $S_{ij}$'s which are uniquely determined for all $(i,j)$. Under the conditions of Theorem 3.2.3, $\hat{h}_m(t) \to h(t)$ almost surely.

Proof: Let $\mathcal{F}_m$ be the $\sigma$-field generated by the observations used to make up the $S_{ij}$'s for $i\le k_m$, $j\le m$. Define other terms as in the proof of Theorem 3.2.3. $C$ is then independent of $\mathcal{F}_m$, so that to show $\sum_{m=1}^{\infty}E|Y'_m(t)| < \infty$, it is sufficient to show (i), (iii), (iv) and (vii). The proof proceeds as for Theorem 3.2.3. $\square$
It is illuminating to examine the series convergence conditions of Theorem 3.2.3 as they apply to the standard example where $k_m = m^r$ and $\lambda_m = m^{-s}$, $r>0$, $s>0$.

(1) $\sum_{m=1}^{\infty}\frac{1}{\lambda_mk_m} < \infty$ requires $\sum_{m=1}^{\infty}m^{s-r} < \infty$, or $s-r<-1$, or $r>s+1$.

(3) $\sum_{m=1}^{\infty}\frac{m(k_{m+1}-k_m)^2}{\lambda_{m+1}k_m^2k_{m+1}} < \infty$ requires
$$\sum_{m=1}^{\infty}\frac{m^{1+s-r}\big[(m+1)^r - m^r\big]^2}{(m+1)^{2r}} < \infty.$$
Now,
$$\frac{m^{1+s-r}\big[(m+1)^r - m^r\big]^2}{(m+1)^{2r}} = \Big[\frac{m}{m+1}\Big]^{2r}\,\frac{\{m^r + rm^{r-1} + o(m^{r-1}) - m^r\}^2}{m^{3r-s-1}} \sim \frac{r^2m^{2r-2}}{m^{3r-s-1}}.$$
The series will converge if $(3r-s-1) - (2r-2) > 1$, or $r>s$.
(5) $\sum_{m=1}^{\infty}\frac{m}{k_m}\lambda_m^{a-1}\Big|\frac{1}{\lambda_m} - \frac{1}{\lambda_{m+1}}\Big|^a < \infty$ requires
$$\sum_{m=1}^{\infty}m^{1-r-s(a-1)}\big|(m+1)^s - m^s\big|^a < \infty.$$
Since $(m+1)^s - m^s = sm^{s-1} + o(m^{s-1})$, the summand satisfies
$$m^{1-r-s(a-1)}\big|(m+1)^s - m^s\big|^a \sim s^am^{(s-1)a}\,m^{1-r-s(a-1)} = s^am^{1+s-r-a}.$$
The series will converge if $r-s-1+a>1$, or $r>s+2-a$. If $a\ge 2$, the condition is implied by $r>s$; if $a<2$, the condition is $r>s+2-a$.
Conditions (2) and (4) are that $\lim_{m\to\infty}\lambda_m/\lambda_{m+1} = 1$ and that $g(c)$ is locally Lipschitz of order $a$ at $c=1$. The first is satisfied in the example since $\lambda_m = m^{-s}$. The second depends on the kernel used for the estimator. Note that the kernel also affects condition (5) through the value of $a$.
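The order $a$ for a particular kernel can be checked numerically. For a smooth kernel, $g(c) = \int(k(cu) - k(u))^2\,du$ behaves like a constant times $(c-1)^2$ near $c=1$, so $a=2$ and the condition in (5) reduces to $r>s$. A sketch (the Gaussian kernel is an assumed choice for illustration):

```python
import numpy as np

u = np.linspace(-10.0, 10.0, 20001)
du = u[1] - u[0]
k = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def g(c):
    """g(c) = integral of (k(cu) - k(u))^2 du, for the Gaussian kernel k."""
    return ((k(c * u) - k(u)) ** 2).sum() * du

eps = np.array([0.04, 0.02, 0.01])
vals = np.array([g(1.0 + e) for e in eps])
# Estimate the local Lipschitz order a as the slope of log g(1+eps) vs log eps.
order = np.polyfit(np.log(eps), np.log(vals), 1)[0]
print(order)  # near 2: g is locally Lipschitz of order 2 at c = 1
```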
CHAPTER 4
CONSISTENCY FOR $L_2$ DENSITIES
4.1 Preliminaries

In this chapter, $f$ and the $\delta$-function sequence $\{\delta_m\}$ will now be assumed to be in $L_2$. The $\delta$-functions will not necessarily be in $L_1$, nor will they necessarily satisfy the axiom scheme 2.2.1. Further assumptions about $f$ will be made in the form of conditions on its characteristic function. Complementary assumptions about $\delta_m$ will also be made as conditions on its Fourier transform. The notation $g^{\dagger}$ will be used for the Fourier transform of a function $g$.

In Chapter 2, no exact rates for the convergence of the bias or the mean squared error were obtained. Such results will be produced with the $L_2$ theory of this chapter. Many of the results will be uniform, or uniform over any finite interval.

If $\delta_m$ and $f$ are members of $L_2$, then the calculations in the next five lemmas, all based on Parseval's Theorem, can be made.
4.1.1 Lemma. If $\hat{h}_m(x)$ is a compact or expanded estimator of $h$, then
$$E\big(\hat{h}_m(x)\big) = \frac{1}{2\pi}\int\sum_{j=1}^{m}f_j^{\dagger}(t)\,e^{-ixt}\,\delta_m^{\dagger}(t)\,dt.$$

Proof: $E(\hat{h}_m(x)) = \sum_{j=1}^{m}E\big(\delta_m(x - S_{1j})\big) = \sum_{j=1}^{m}\int\delta_m(x-u)f_j(u)\,du$. Since the Fourier transform of $\delta_m(x-u)$, as a function of $u$, is $e^{ixt}\delta_m^{\dagger}(-t)$, Parseval's Theorem yields
$$E\big(\hat{h}_m(x)\big) = \sum_{j=1}^{m}\frac{1}{2\pi}\int f_j^{\dagger}(t)e^{ixt}\delta_m^{\dagger}(-t)\,dt = \frac{1}{2\pi}\int\sum_{j=1}^{m}f_j^{\dagger}(t)e^{-ixt}\delta_m^{\dagger}(t)\,dt,$$
the last equality by replacing $t$ with $-t$. $\square$
The next lemma was proved by Leadbetter (1963).

4.1.2 Lemma.
$$E\big\{\delta_m(x-X)\,\delta_m(y-X)\big\} = \frac{1}{4\pi^2}\iint f^{\dagger}(t-s)\,e^{-i(xt-ys)}\,\delta_m^{\dagger}(t)\,\delta_m^{\dagger}(-s)\,dt\,ds.$$

The next lemma is a corollary of the last one.

4.1.3 Lemma. If $f_j^{\dagger}(t)\in L_1$, then
$$E\big\{\delta_m(x-S_{ij})\,\delta_m(y-S_{ij})\big\} = \frac{1}{4\pi^2}\iint f_j^{\dagger}(t-s)\,e^{-i(xt-ys)}\,\delta_m^{\dagger}(t)\,\delta_m^{\dagger}(-s)\,dt\,ds.$$
The result follows from Lemma 4.1.2. $\square$
4.1.4 Lemma. If $f^{\dagger}(t)\in L_1$, then for $\hat{h}_m$ an expanded estimator of $h$,
$$\operatorname{Cov}\{\hat{h}_m(x), \hat{h}_m(y)\} = \frac{1}{4\pi^2k_m}\iint\sum_{j=1}^{m}\big\{f_j^{\dagger}(t-s) - f_j^{\dagger}(t)f_j^{\dagger}(-s)\big\}\,e^{-i(xt-ys)}\,\delta_m^{\dagger}(t)\,\delta_m^{\dagger}(-s)\,dt\,ds.$$

Proof:
$$\operatorname{Cov}\{\hat{h}_m(x), \hat{h}_m(y)\} = \frac{1}{k_m^2}\sum_{i_1=1}^{k_m}\sum_{i_2=1}^{k_m}\sum_{j_1=1}^{m}\sum_{j_2=1}^{m}\operatorname{Cov}\big\{\delta_m(x - S_{i_1j_1}),\ \delta_m(y - S_{i_2j_2})\big\}.$$
Since $S_{i_1j_1}$ and $S_{i_2j_2}$ are independent unless $(i_1, j_1) = (i_2, j_2)$, this expression may be rewritten as
$$\frac{1}{k_m^2}\sum_{i=1}^{k_m}\sum_{j=1}^{m}\Big[E\big\{\delta_m(x-S_{ij})\delta_m(y-S_{ij})\big\} - E\big\{\delta_m(x-S_{ij})\big\}E\big\{\delta_m(y-S_{ij})\big\}\Big]$$
$$= \frac{1}{4\pi^2k_m}\iint\sum_{j=1}^{m}\big\{f_j^{\dagger}(t-s) - f_j^{\dagger}(t)f_j^{\dagger}(-s)\big\}\,e^{-i(xt-ys)}\,\delta_m^{\dagger}(t)\,\delta_m^{\dagger}(-s)\,dt\,ds$$
by Lemmas 4.1.1 and 4.1.3. $\square$
The next lemma is a corollary of the last one.

4.1.5 Lemma. Under the conditions of Lemma 4.1.4,
$$\operatorname{Var}\{\hat{h}_m(x)\} = \frac{1}{4\pi^2k_m}\iint\sum_{j=1}^{m}\big\{f_j^{\dagger}(t-s) - f_j^{\dagger}(t)f_j^{\dagger}(-s)\big\}\,e^{-ix(t-s)}\,\delta_m^{\dagger}(t)\,\delta_m^{\dagger}(-s)\,dt\,ds.$$
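Lemmas 4.1.1 through 4.1.5 translate directly into a numerical recipe: moments of the estimator can be computed from characteristic functions alone. The sketch below evaluates the formula of Lemma 4.1.1 by quadrature for exponential(1) lifetimes, where $f^{\dagger}(t) = 1/(1-it)$ and $f_j^{\dagger} = (f^{\dagger})^j$, with an assumed Gaussian-transform choice $\delta_m^{\dagger}(t) = e^{-\lambda_m^2t^2/2}$ (all of these specific choices are illustrative assumptions):

```python
import numpy as np

rho, lam, m, x = 1.0, 0.2, 30, 2.0
t = np.arange(-200.0, 200.0, 0.001) + 0.0005   # grid offset avoids t = 0 exactly
dt = 0.001
fdag = rho / (rho - 1j * t)                    # f^dagger for exponential(rho) lifetimes
sum_fj = fdag * (1.0 - fdag**m) / (1.0 - fdag)  # geometric sum of (f^dagger)^j, j = 1..m
ddag = np.exp(-0.5 * (lam * t) ** 2)           # delta_m^dagger (assumed Gaussian form)
# Lemma 4.1.1: E(h_hat_m(x)) = (1/2pi) int sum_j f_j^dagger(t) e^{-ixt} delta_m^dagger(t) dt
Eh = (sum_fj * np.exp(-1j * x * t) * ddag).sum().real * dt / (2.0 * np.pi)
print(Eh)  # exponential(1) lifetimes give h(t) = 1, so this should be near 1
```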
Two types of estimators, based on two types of $\delta$-function sequence, will be defined. Each type of $\delta$-function sequence is designed to complement a certain class of probability densities.

4.1.6 Definition. An estimator will be said to be of "algebraic type" or "type-A" if $\delta_m^{\dagger}(t) = g(\lambda_mt)$, where $g$ is an even, bounded $L_2$ function and $\lambda_m\to 0$ as $m\to\infty$.

Note that if $\delta_m^{\dagger}(t) = g(\lambda_mt)$, then $\delta_m(x) = \frac{1}{\lambda_m}k\big(\frac{x}{\lambda_m}\big)$ where $k^{\dagger}(t) = g(t)$. In other words, $\hat{h}_m$ is, loosely speaking, a kernel-type estimator, although $k$ may not be as in Lemma 2.2.3.

The type-A estimator has been designed for use with probability densities whose characteristic functions "decrease algebraically," that is, for those which satisfy $|t|^p|f^{\dagger}(t)| \le K$ for some $K$ and some $p>1$.
4.1. 7
Definition
An estimator will be said to be "of exponential
type" or "type-E" i f 0 t(t) = g(A ea.1 t
m
L function and Am
2
-+
Note that if om
0 as m -+
m
00
,
I)
and a.
where g is an even, bounded
>
o.
should be in L , as well as L , then rom(x)dx =
2
l
52
g(A ), which is not necessarily unity.
If it is desired to
m
have an estimator such that fo (x)dx = 1, the exponential type estim
t
mator can be normalized by letting 0m(t)
=
1
g(A ) g(Ame
I I).
a t
The
m
results of this chapter can easily be modified for this normalized
type-E estimator.
The type-E estimator has been designed for use with densities
whose characteristic functions "decrease exponentially," that is,
for those which satisfy the condition
Ift(t) le p1tl S A for some p > 0, A > 0.
It has been pointed out to me by W.L. Smith that a random variable
whose characteristic function "decreases exponentially" in this way
cannot be non-negative.
1
The theorems dealing with exponential-type
IThe argument is as follows.
<j>(z)
= <j>(x +in) =frr
Define
t
L:e -ix8+n8 f (8)d8.
On the real line, the integral is equal to the original density,
f(x), a.e.
If
Ift(8)I~e"plel. then
absolutely convergent if -p < n < p.
the integral is
I.e., fez) is analytic in the
strip
Sp = {x
If f (x)
a
+ inl
-p <
n
< pl.
for all xcv, then <j>(z) has a limit point of zeros in
Sp,
and is identically zero there, so that f(x) is identically zero on the
real line.
Thus f cannot refer to a non-negative random variable.
53
estimators (Section 4.3, Section 4.4, Theorems 4.45-4.4.10, and Section
5.4) cannot, therefore be applied to renewal densities.
They are
presented here, however, since they are likely to apply to similar
processes based on random variables which are not non-negative.
The following notations are introduced to deal with quantities
which arise frequently in the sequel.
Let L = !g2(s)ds.
Let h
m
m
L f.(t).
j=l J
=
t
In the rest of the chapter, it is assumed that f (t)€L •
1
means that there is a continuous version of f.
This
It will be assumed that
f is in fact the continuous version, so that
Since f is continuous it is bounded, and therefore fj(x) is continuous
for j
~
2, and
=
4.2
1
t
--2ff.(t)e
n
J
-ixt
dt.
Asymptotic Variance and Covariance--Algebraic-Type Estimators
The following lemma was proved by Leadbetter (1963).
4.2.1
Lemma If g(s) is in L , then g2(t) = fg(t-s)g(s)ds, the convo1u2
54
tion of g with itself, is bounded for all t, and converges to
2
g2(O) = fg (s)ds = L as t
~
0.
Note that, since g is even, g2(t) = fg(t-s)g(s)ds = fg(t+s)g(s)ds.
~
4.2.2 Theorem
Ig (t)-LI
2
<
Let h (t) be an expanded, type-A estimator such that
m
bltl S for some b > 0,
B ~ 0, and It I
11-f\t)I > altlCi. for some a > 0,
°
< CI.
<
~ 2, and It
i f S-CI. > -1, and
as m
~
00
uniformly in x.
Proof:
By Lemma 4.1.5
k A
Var{l~m(x)}
Imm
- h
(x)~1
2n
m
=
~
A
m
m E
4II2 j=l
ff......
t
...
Ifj'(t)fj'(-s) 110 (t) 110 I(_S) /dtds
m
m
1.
I
< 1, and
55
1"
J
f.(t-s)e
m
L
-ix(t-s)
-r
-r
m
m
m
6 (t)o (-s)dtds - LI L
j=1
"j-
f .(t)eJ
f t (T) e-iXT {A Ig (A (T+S» g (A S~ds-L}dT
m
j
J
j=1
m
m
The first term of (4.2.1) is dominated by
where B is an upper bound for /gl.
This term tends to zero as m +
by assumption.
The second term of (4.2.1) is dominated by
m
1
(4.2.2.) 2
41T
L f t (T)e -ixT {/g(s+A T)g(s)ds-L}dT
J
j=1
m
L
j=1
J
ITI<1
m
j
t
-ixT
fj(T)e
{/g(s+~T)g(s)ds-L}dT
I
For /T/>1,
l-ft(T) < c for some
t4.2.2.) is dominated by
Ct
so that the first term of
m
ixt
dt
56
Since ft(T)EL , and since Ig (A T)-LI is bounded, and tends to zero
2 m
l
as m
+
00,
this term tends to zero as m +
00.
Also, for m large enough so that
A <1, IA TI < 1, and Ig (A T)-LI<bIA TIS.
m
m
2 m
m
Thus the second term of
(4.2.2) is dominated by
•
Since S-a > -1, the integral is finite, and since S>O, AS
m
m+
00.
+
Thus the second term of 4.2.2 tends to zero as m +
0 as
00
•
L
Since the bounds on the terms of Ik A Var{h (x)} -h (x)--2 I are all
mm
m
m
'IT
A
independent of x, the convergence is uniform in x.
The fact that h (x)
m
m+
00
+
D
hex) uniformly in any finite interval as
gives the following corollary.
4.2.3 Corollary
Under the conditions of Theorem 4.2.3,
A
k A Var{h (x)}
mm
m
Since get)
+
L
h(x)--2'IT as m
+
00
uniformly in any finite interval.
/
itx
2
itx
dx. Also, since g
dx,g2(t) = Ik (x)e
= Ik(x)e
is even, g2(t) is even.
2 2
Thus, i f Ix k (x)dx
2
=c <
00,
g2(t)
= L- ..£.L
2
57
2 2 2
oCt ) and IL-g (t) I = IC~ + OCt )
2
/L-g (t)
2
1
< bltl
2
for It
I
I,
so that there is a b such that
< 1 and some b > O.
If the distribution of lifetimes has a mean, i.e. if
exists, then
ft(t) =
~
~
= fxf(x)dx
is positive since the lifetimes are non-negative, and
1-i~t+o(t).
t
In this case, 11-f (t)
t
an a > 0 such that 11-f (t)!
>
m
be satisfied if mA
m
~
there exists
altl for It I < 1.
t
m
The requirement that A
I = li~t+O(t) I and
2
~ 0 as m ~ ~ will certainly
L {flfj(t) Idt}
j=l
0 as m ~~.
t
Since {flfj(t)ldt} is a decreasing
sequence, in some cases the requirement is not as strong as rnA
m
Requirements of the form c
~
m j=l
~
{flfjt(t) Idt}2
-
~
O.
0 arise again
m
in the chapter, along with conditions of the form c
m
L flfjt(t)ldt ~ O.
j=l
The next lemma shows that the second is the stronger requirement.
58
4.2.4 Lemma
c
If c
m
~
as
0
m
t
[ flfj(t) Idt ~
m
j=1
m~
0
and
~,
as m ~ ~,
then
m
c
2
[ {flf!(t) Idt}
m j=1
J
Proof:
j
Since
flf!(t)1dt ~
~ 0 as m + ~.
= Ift(t)
0
I
and since Ift(t)1
<
1 for t I 0,
as j ~ ~ by dominated convergence.
Then for'tit > j
';0
c
o
m
[{flftj(t) Idt}2 + c
[
flftj(t) Idt.
m j=1
m j=j +1
o
The first term tends to zero since c
m
~
0
as m ~~.
tends to zero by assumption.
The second term
0
"
4.2.5 Theorem Suppose h (t) is an expanded type-A estimator such that
m
Ig (t)-L
2
a
> O.
I
< b Itl
S for some b
0 < a S 2, and
It I
<
> 0,
S
>
1, with S-a
0, and It
>
I
< 1, and such that
-1, and if
59
m
A
~
m j=1
then
J.
flf~(t) /dt
Ikmm
A covGh (x),
m
Proof:
+
0
as m + m,
J
hm(y»)1
By Lenuna 4.1.4,
+ 0
as m +
Ikmm
A Cov(h (~),
m
m.
h (y»)
m
I
A
m
= 47T 2
A
<~
-
47T
f}( t) f t( -s) e-i(xt-ys) c5 t (t) c5 t (-s)dtds
2
2
A B
J
m
j
m
(4.2.4) ~ ~ ~ {flf;(t)ldt}
47T
j =1
m
2
where B is an upper bound for Igi.
The first
termo>f(4.2.4)t.endst~ozero
as m +
m.
inner integral can be written as
by Lenuna 4.2.5, since
In the second term of (4.2.4) the
60
IfIe -iCx-y)
2
Am s{g(S+AmT)-g(s)} ds -
=2
-e
i(x-Y)T
(4.2.5) '-
1[
'2 Ie
2
Since Ig (s)ds
as
Iz I ~
00,
Ie
<
-i(x-y)s
A
m
-i (x-y) s
Am
g
2()d
s s
Ie
-i(x-y)
A
m
j
.(
) '; -i (x-y) s 2
]
{g(S+A T)-g(s)}2 ds -(1+e 1 x-y T)/e Am
g (s)ds
m
by the Riemann-Lebesgue Theorem
00,
2
s g (~)d~
Ie -izx g 2 (s)ds
and the second term of (4.2.5) t.ends to zero as m ~
00.
~
The
first term of (4.2.5) i-s dominated in modulus by
which is bounded and tends to zero as m
the inner integral of the second term of
to zero as m ~
(4.2.6)
00.
12
4rr
~
00
by Lemma 4.2.1.
(4.2.4)~s
Thus,
bounded and tends
The second term of (4.2.4) is dominated by
m
J
L
t
fj(T)e
j=I
-ixT (
J
e
-i (x-y) s
Amon
g(s)g(s+AmT)dsdT
IT/<I
Ie
where c is such that
_i(X-y)S
A
g(s)g(s+A T)ds dT
m
1
II-ft(T)
m
I
< c for
ITI
>
1.
The second term of
0
61
(4.2.6) tends to zero as m-+
00
by dominated convergence.
The first term of (4.2.6) is dominated by
m
1
87T
t
-ixt
fj(t)e
E
2
(g(s+Amt)-g(s)
2
ds dt
j=l
1
+ --=--
87T 2
1
-<
47T
+
+
Je
2
I
.27T Z
1
41T Z
II-ft(t)I
It I<1
1
47T
-<
2
f -~
-1
I
2
/g(s+Amt) - g(s)1 ds dt
(x-y)
A
m
1
Jl -aj'l'llJ
s g2(s)ds
m
E
j=l
f
t
Ifj(t)ldt
blAmtlSdt
0
[-=1 f
Am
e-
i (x-y)
Am
s g2(s)dsl
]
_
t~
x-y
E
m
j=l
The first term of (4.. 2.7) tends to zero as m -+
B - a > - 1.
The second term tiends to zero as m -+
in square brackets do so by assumption.
I IfI (T) IdT] .
00
since S > 0 and
00
since both factors
o
62
4.3
Asymptotic Variance and Covariance--Exponential-Type Estimators
For type-A estimators, am = Jo~(x)dx = 2~Am.
gives the relationship between a T!l and
~
The next lemma
for type-E estimators.
Lemma 4.3.2 is an extension of Lemma 4.3.1, and is used in the proof
of Theorem 4.3.3, which gives the rate of convergence of the variance
for expanded type-E estimators.
4.3.1
Lemma:
1
log (Am )
Proof:
If h is a type-E estimator with g such that
1
+-aTI
Since a
m
as m +
00.
=z:;;1
am
(4.3.1)
(1)
log -
Am
The second term in the right hand side of (4.3. 1) is .dominated by
Ig(y)/ dy
y
where B is a bound for
Ig/.
Since the integral is finite by assump-
tion, this term tends to zero as m +
00.
63
The first term of the r.h.s. of !(4.3.1) is
1
(4.3.2)
The second term of (4.3.2)is dominated by B1 (1 + B)
Cln
tends to zero as m -+
00.
,which
(A~)
log
o
The first term of (4.3.2) is 1 •
Cln
4.3.2
Lemma Under the conditions of Lemma 4.3.1, let
1
Then
Proof:
~m(T)
~m(T)
is bounded for all m and T and converges to 1 as
can be written as
1
(4. 3 • 3) -10~g---;(~~)-
00
The second term is dominated by --..B_.."..-__
log
tends to zero as m -+
(~m)
Ig (y) I dy,
f
1
y
00.
2
The first term of (4.3.3) is dominated byB •
bounded for all m and T.
Thus ~m(T) is
which
64
= e-alTI ~
where To
B Joo
~
1
B
2
The third integral is dominated by
dy and is thus finite.
The second integral is dominated by
Y
(-log To)
(4.3.4)
1.
IjJ
= B2
m(T) - 1
aIT/, which is also finite.
<
Thus
1
(~)
log
+ 1
log
[1-1
Am)
The first term of (4.3.4) is
(4.3.5)
1
log
(~J
The first term of (4.3.5)i '5 log To - log Am -1
log (~)
=
log
(~J
65
The second term of (4.3.5) is dominated by
1
(~)
log
Thus
o
4.3.3
2. Ae if
~
Theorem
pI t I
Let h
m
be an expanded type-E estimator with
for some A, some
p>
0 (which implies that f t (t) ELI)' and
> altl S for some a > 0,0 < S < 2, and It
11 - ft(t)1
m
1
log
(~J
E
j=l
rjfl(t)ldt
then
log
(~J
uniformly in x.
+
0
as
m +~,
I
<
1, and
66
Proof:
By
Lemma 4.1.5,
I
km
log
~
Var (hm(x)
(~J
- hm(x)
1i-a
<
(4.3.6)
+
1
4i
-
where Q is {t < 0.,
S
-2
a
I
e- iXT
m
L f t (t)d t
j=l
The first term of (4.3.6) is dominated by
log
[.~)
I
> a} u {t > 0, s < a} and R is the complement of
Q.
1
j
67
m
Since
~1_.....-~
log
(~J
0 as m + 00, by Lemma 4.2.5, this
L flft(t)ldt +
j=1
j
term tends to zero as m +
00.
The second term of (4.3.6) is dominated by
m t
L f (t-s) dtds
1
j=1
log
=
1
log
~J :.2
2
foo
J- s
j
m
L f t (-r)
j
j=1
0-
d .ds
1
=
log
(4.3.7) --
1
~--...,._
B2
log
2TI
(~J
2
The first term of (4.3.7) is dominated by
1
1
log
log
where c is such that
I
1
t
I-f (t)
of (4.3.7) tends to zero as m +
I
00.
"----- ..
[~J
< C
for It I > 1.
Thus the first term
68
The second term of (4.3.7) is dominated by
1
log
Since a < 2, this term also tends to zero as m ~
term of (4.3.6) tends to zero as m ~
00.
Thus the second
00.
The third term of (4.3.6) can be written
1
41T 2
(4.3.8)
The quantity in square brackets is
1. [
a
1
Joo g(y)g(yeal'r1)~ - 1] ~ [~m(T) _ 1)
log [~} Am
y
=
where ~(T) is as in Lemma 4.3.2.
(4.3.9)
~
21T
Thus (4.3.8) is dominated by
m
I
IT I<1
r f:'t)e~ixT (~ (~)-1) dT
j=l
J..
---.;;m;';"-"'ci.-_-I
69
The second term tends to zero as m ~
00
by dominated conver-
gence, since (Wm(T)-l) is bounded and tends to zero as m ~
since f t (t)
€
00,
and
Ll •
By the proof of Lemma 4.3.2, the first term of (4.3.9) is dominated
by
1
z-n2
1
log
Since the quantity in square brackets is finite, this term tends
to zero as m ~
00
Since hm(x)
o
by assumption.
~
hex) uniformly in any finite interval, the next
corollary follows immediately.
4.3.4
Corollary
Under the conditions of Theorem 4.3.3
--.;.k~m-.--~var(~(x) ~
log
{~
hex)
as
m~
TI,a
uniformly in any finite interval.
00
70
The condition
satisfied if
m
log
1
log
(~)
(~J
~ Q
m . t
: rlfj(t)ldt ~ 0 as m ~
j-l
as m ~
00.
00
is clearly
However, the condition is not
necessarily as strong as this.
The next lemma is closely related to Lemma 4.3.2 and is used in
the proof of Theorem 4.3.6, which gives the rate of convergence for the
covariance of expanded type-E estimators.
4.3.5
Under the conditions of Lemma 4.3.1, if
Lemma
~m ('r) = _1_.,.-_
log
then
~m(T)
[~J
is bounded for all m and T and tends to zero as m ~
Let z = Ameas •
Proof:
Then s •
~l~og~(~i;~m~) and
ds = dz.
aZ
a
~m (T)
=
1
-a-lo::;.g--,(....~-m~)
=
foo -i (x-Y)
Ame "'-
1
= e-a IT I .<
Br ~ dz.
oo
I
z
Am g(z)g(ze
e
alog
where To
(..L)
log
1.
-Hx-v)log
~
00.
aI I d
T) :
(L)
. I
Am g(z)g(zea
I
t )
d~
]
The third integral is dominated in modulus by
T I.
The second integral is dominated in modulus by aB 2 1
71
The first integral is
(I-g(z) )
[1+g (z,ea I, I) J
_ [l-g(ze
ah 1)]
+ (I-g(,)
~
-dZz ,
which can be written as the sum of four integrals, say 11 + 1 2 + 13
+ 14 •
I
fT o
=
Am
1
_
e
-i (~) log [:
'in
log
dZ = a
z
a
J
a
[~)
e
-i(x-y)s
ds
(~t]
[e-i(X-Y)lOg
- a
J
-i(x-y) .
which is dominated in modulus by
2a
(x-y)
a
dominated in modulus by B e , and 1 is dominated in modulus by B ' Thus
l
4
l
~m (-r)
1
<
log
and
~
rBfco
(1J l"a
18i!.Ll.dz +
lZ
(T) tends to zero as m
m
~m(T)
I~m I ~
(T)
B2h I
~
co,
is bounded for all m and L since
B2
alog
(1J
~B-(r--~-:-J
Jl dz +
Am z alog
'
J: Ig~t) Id~
in
2
B
B
=-+----
a alog
[~mJ
fI
CO
1
I
g (z ) dz
z
o
72
4.3.8
km
log
Theorem Under the conditions of Theorem 4.3.3,
A
~
COV(hm(lC), hm(y)
(~J
-+
0 as m
k
-+
00.
~
~
By Lemma 4.1.4, _.;;;;;;m_ _ Cov(hm(x), hm(Y)
Proof:
log
(~J
t
t
f (t)f (-s)dtds
j
The first term of (4.3.10) tends to zero as m -+
of Theorem 4.3.3.
00
j
as in the proof
The contributions to the second term of (4.3.10) which
arise from integration over (t
> 0,
s
< 0)
to zero as in the proof of Theorem 4.3.3.
and (t
< 0,
s
> 0)
also tend
The remaining contributions
arise from integration over four regions, of which a typical one is
t > s > O. For this region, the contribution to the second term of
(4.3.10) is
------_._--------
oo
= _1_
41T2
f
m t
-ixT
I: f (T)e
Q j=l j
73
1
(4.3.11)
r ~ ft(T)e-ixT~
) 0 j=l j
where
~
1
(T)dT + 4n2
m
m(T) is as in Lemma 4.3.7.
Joo
m t
-ixT
L f j (1')e
~ (T)dT
1 j=l
m
The second term of (4.3.11) is
dominated, in modulus, by
where c is such that
< C for Itl>l.
1
Il-f t (t)
I
by dominated convergence since f t
to zero as m ~
This term tends to zero
€
Ll and
~m(T)
is bounded and tends
00.
The first term of (4.3.11)is dominated in modulus by
by the proof of Lemma 4.3.7.
Since the quantity in square brackets is
o
finite, this term tends to zero by assumption.
4.4 Asymptotic bias and mean square error
In this section, exact rates of convergence for
IE
~(x) - hm(x) I
are obtained for estimators of algebraic and exponential type.
and
L
j=rn+l
t
fj(x),
the tail of the renewal density
74
series, together make up the bias of hm(x).
The results in the last
"
two sections on the variance of ~(x),
the results in this section on
IE ~(x) - hm(x)
I,
and our knowledge of the behavior of the tail of the
renewal density series from Chapter 2 are combined to produce theorems
on the rate of convergence of the mean square error for expanded type-A
and type-E estimators.
4.4.1
Theorem
t
Suppose that 11-f (t)
I
> altl
a.
for some a > 0,
0< a.
~ 2,
If h'"
is a type-A estimator of h (either expanded or compact), so that
m
and Itl<l, and that ItIPlft(t)1
~ K for
some K, some p > 1.
1
=
g (Amt) , and if g is such that
Jo 11-g Y<.tl.1 dt
t
then
1-P
Am
sup
x
I
""
hm(x) E
~(x)1 < C
for some constant C, and hence
1-r
E (hm(x) - hJ!\ (x)) + 0 as rn
m
A
uniformly in x for 1 < r <
Proof
By Lemma 4.1.1,
p.
+
00
<
00
for y
= max
(a. ,P)
75
$$E(\hat h_m(x) - h_m(x)) = \frac{1}{2\pi}\int \sum_{j=1}^m f_j^\dagger(t)\,e^{-ixt}\,(\delta_m^\dagger(t)-1)\,dt,$$

so that

$$|E(\hat h_m(x) - h_m(x))| \le \frac{1}{\pi}\int_0^1 |1-g(A_m t)|\,\Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|\,dt + \frac{1}{\pi}\int_1^\infty |1-g(A_m t)|\,\Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|\,dt. \tag{4.4.1}$$

For $|t| > 1$, $\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr| = |f^\dagger(t)|\,|1-f^\dagger(t)^m|/|1-f^\dagger(t)| \le 2c|f^\dagger(t)| \le 2cK|t|^{-p}$, where $c$ is such that $\dfrac{1}{|1-f^\dagger(t)|} < c$ for $|t| > 1$, so the second term of (4.4.1) is dominated by

$$\frac{2cK}{\pi}\int_1^\infty |1-g(A_m t)|\,t^{-p}\,dt.$$

This can be rewritten as

$$A_m^{p-1}\,\frac{2cK}{\pi}\int_{A_m}^\infty \frac{|1-g(t)|}{t^p}\,dt \le A_m^{p-1}\,\frac{2cK}{\pi}\left[\int_0^\infty \frac{|1-g(t)|}{t^\gamma}\,dt + (1+B)\int_1^\infty t^{-p}\,dt\right],$$

where $B = \sup_t|g(t)|$, and both integrals are finite, the first since $p \le \gamma$, the second since $p > 1$.

For $|t| < 1$, $\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr| \le 2/|1-f^\dagger(t)| \le 2/(a|t|^\alpha) \le 2t^{-\gamma}/a$, so the first term of (4.4.1) is dominated by

$$\frac{2}{\pi a}\int_0^1 |1-g(A_m t)|\,t^{-\gamma}\,dt = \frac{2}{\pi a}\,A_m^{\gamma-1}\int_0^{A_m}\frac{|1-g(t)|}{t^\gamma}\,dt$$

for $m$ large enough so that $A_m < 1$; the integral is finite by assumption. Thus

$$A_m^{1-p}\,\sup_x |E\hat h_m(x) - h_m(x)| \le \left[\frac{2cK}{\pi} + \frac{2}{\pi a}A_m^{\gamma-p}\right]\left[\int_0^\infty\frac{|1-g(t)|}{t^\gamma}\,dt + (1+B)\int_1^\infty t^{-p}\,dt\right].$$

Since $\gamma \ge p$, $A_m^{\gamma-p} \le 1$, so that the last expression is dominated by

$$\left(\frac{2cK}{\pi} + \frac{2}{\pi a}\right)\left[\int_0^\infty\frac{|1-g(t)|}{t^\gamma}\,dt + (1+B)\int_1^\infty t^{-p}\,dt\right],$$

which is a constant. This is the first conclusion of the theorem, which leads easily to the second. □
4.4.2 Corollary. Under the conditions of Theorem 4.4.1, if $e^{-m\nu}A_m^{1-p} \to 0$ for every $\nu > 0$, then

$$A_m^{1-r}\,E(\hat h_m(x) - h(x)) \to 0 \text{ as } m \to \infty,$$

uniformly in any finite interval, for $1 < r \le p$.

Proof: If $F(x) < 1$ for all $x < \infty$, and $M$ is a bound on $f$, then, given any interval $I$ with right endpoint $T_1$, by Corollary 2.1.5,

$$\sum_{j=m+1}^\infty f_j(x) \le \frac{M\,F(T_1)^m}{1-F(T_1)}, \qquad x \in I,$$

and

$$A_m^{1-r}\sum_{j=m+1}^\infty f_j(x) \le A_m^{p-r}\left[A_m^{1-p}\,\frac{M\,F(T_1)^m}{1-F(T_1)}\right]. \tag{4.4.2}$$

Since $p \ge r$, $A_m^{p-r} \le 1$, and the quantity in square brackets tends to zero by assumption, since $F(T_1)^m = e^{-m(-\log F(T_1))}$. Thus the left hand side of (4.4.2) tends to zero as $m \to \infty$, uniformly in $I$.

If $F(T_2) = 1$ for some $T_2 < \infty$, $M$ is a bound on $f$, and $K = Me^{T_2}$, then, by Corollary 2.1.6,

$$\sum_{j=m+1}^\infty f_j(x) \le K e^{KT_2}\,\frac{(KT_2)^m}{m!}$$

and

$$A_m^{1-r}\sum_{j=m+1}^\infty f_j(x) \le A_m^{p-r}\left[A_m^{1-p}\,\frac{e^{m\log KT_2}}{m!}\right] K e^{KT_2}. \tag{4.4.3}$$

Since $A_m^{p-r} \le 1$ and the other factors in the right hand side of (4.4.3) tend to zero, the right hand side of (4.4.3) tends to zero uniformly in $x$.

Thus, in either case, $A_m^{1-r}\sum_{j=m+1}^\infty f_j(x) \to 0$ uniformly in any finite interval. Since

$$A_m^{1-r}|E\hat h_m(x) - h(x)| \le A_m^{1-r}|E\hat h_m(x) - h_m(x)| + A_m^{1-r}\sum_{j=m+1}^\infty f_j(x),$$

the result follows from Theorem 4.4.1. □
Note that, in the standard example, where $A_m = m^{-s}$, Corollary 4.4.2 applies, since $A_m^{1-p}e^{-\nu m} = m^{s(p-1)}e^{-\nu m} \to 0$ as $m \to \infty$, for any $\nu > 0$.
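This condition is easy to check numerically; the sketch below (with the illustrative, arbitrarily chosen values $s = 2$, $p = 2$, $\nu = 0.1$) confirms that the exponential factor eventually dominates the polynomial growth.

```python
import math

def standard_example_factor(m, s=2.0, p=2.0, nu=0.1):
    """A_m^(1-p) * e^(-nu*m) with A_m = m^(-s), i.e. m^(s(p-1)) * e^(-nu*m)."""
    return m ** (s * (p - 1)) * math.exp(-nu * m)

values = [standard_example_factor(m) for m in (10, 100, 500)]
# The polynomial term dominates at first, but e^(-nu*m) wins eventually.
print(values)
```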
The conclusion of Theorem 4.2.2 is that

$$k_m A_m \operatorname{Var}(\hat h_m(t)) - \frac{L}{2\pi}\,h_m(t) \to 0.$$

One conclusion of Theorem 4.4.1 is that $A_m^{1-r}E(\hat h_m(x) - h_m(x)) \to 0$. In the next theorem, the two normalizing sequences, $k_m A_m$ and $A_m^{1-r}$, are chosen to be proportional, so that the two results can be combined. Corollary 4.4.4 extends the theorem to give a rate of convergence for the mean square error (MSE) of $\hat h_m$. Again we use the notation $g_2(t) = \int g(t-s)g(s)\,ds$.
4.4.3 Theorem. Suppose that $h(x)$ is bounded, $|1-f^\dagger(t)| > a|t|^\alpha$ for some $a > 0$, $0 < \alpha \le 2$, and $|t| < 1$, and that $|t|^p|f^\dagger(t)| \le K$ for some $K > 0$, some $p > 1$ (which means that $f^\dagger(t) \in L_1$). Suppose that $\hat h_m$ is an expanded type-A estimator of $h$ such that $|g_2(t) - L| \le b|t|^\beta$ for some $b > 0$, $\beta > \max(0, \alpha-1)$, and for all $|t| < 1$, that

$$\int_0^\infty\frac{|1-g(t)|}{t^\gamma}\,dt < \infty \quad\text{for } \gamma = \max(\alpha,p),$$

and that

$$A_m\sum_{j=1}^m\left(\int|f_j^\dagger(t)|\,dt\right)^2 \to 0 \text{ as } m \to \infty.$$

Then, if $k_m = DA_m^{1-2r}$ with $1 < r < p$,

(i) $A_m^{2-2r}\,E(\hat h_m(x) - h_m(x))^2$ is bounded, uniformly in $x$ and $m$, and

(ii) $A_m^{2-2r}\,E(\hat h_m(x) - h_m(x))^2 - h_m(x)\dfrac{L}{2\pi D} \to 0$ as $m \to \infty$, uniformly in $x$.

Proof: Since $k_m = DA_m^{1-2r}$, we have $A_m^{2-2r} = k_m A_m/D$, and

$$A_m^{2-2r}E(\hat h_m(x) - h_m(x))^2 - h_m(x)\frac{L}{2\pi D} = \frac{1}{D}\left[k_m A_m\operatorname{Var}(\hat h_m(x)) - \frac{L}{2\pi}h_m(x)\right] + \bigl[A_m^{1-r}(E\hat h_m(x) - h_m(x))\bigr]^2.$$

The first term tends to zero uniformly by Theorem 4.2.2. The second term is dominated by $[A_m^{1-r}\sup_x|E\hat h_m(x) - h_m(x)|]^2$, which tends to zero for $1 < r < p$ by Theorem 4.4.1. Thus (ii) is proved. Since $h_m(x) \le h(x)$, which is bounded, $h_m(x)$ is uniformly bounded. This, together with (ii), yields (i). □

4.4.4 Corollary. Under the conditions of Theorem 4.4.3, if $e^{-m\nu}A_m^{1-p} \to 0$ for any $\nu > 0$, then

$$A_m^{2-2r}\,\mathrm{MSE}(\hat h_m(x)) \to h(x)\frac{L}{2\pi D} \text{ as } m \to \infty,$$

uniformly in any finite interval.
Proof:

$$A_m^{2-2r}\,\mathrm{MSE}(\hat h_m(x)) - h(x)\frac{L}{2\pi D} = \left[A_m^{2-2r}E(\hat h_m(x) - h_m(x))^2 - h_m(x)\frac{L}{2\pi D}\right] + \left[A_m^{1-r}\sum_{j=m+1}^\infty f_j(x)\right]^2$$
$$-\; 2\left[A_m^{1-r}\sum_{j=m+1}^\infty f_j(x)\right]\left[A_m^{1-r}(E\hat h_m(x) - h_m(x))\right] + (h_m(x) - h(x))\frac{L}{2\pi D}.$$

The first term tends to zero by Theorem 4.4.3. The second term tends to zero by Corollary 4.4.2. The third term tends to zero by Theorem 4.4.1 and Corollary 4.4.2. The fourth term tends to zero by Theorem 2.1.3. □
The fact that $f^\dagger(t) \in L_1$ means that $f(x)$ is bounded, which may in turn imply that $h(t)$ is bounded. See the discussion in Section 2.1.

In the standard example, where $A_m = m^{-s}$, the condition that $k_m = DA_m^{1-2r}$ means that $k_m = Dm^{s(2r-1)}$. (Note that the "$r$" here is a number between 1 and $p$, not the "$r$" in previous discussions of the standard example.)
4.4.5 Theorem. Suppose that $\hat h_m$ is a type-E estimator of $h$ (either compact or expanded), so that $\delta_m^\dagger(t) = g(A_me^{\alpha|t|})$, and suppose that $g$ is such that $|1-g(t)| \le B_1|t|$ for $|t| \le 1$. If $f$ is such that $|f^\dagger(t)| \le Ae^{-\rho|t|}$ for some $A > 0$, $\rho > 0$, and

$$A_m^{1-\eta}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt \to 0 \text{ as } m \to \infty$$

for $0 < \eta < \min(1, \rho/\alpha)$, then

$$A_m^{-\eta}\,|E\hat h_m(x) - h_m(x)| \to 0 \text{ as } m \to \infty,$$

uniformly in $x$.

Proof: By Lemma 4.1.1,
$$A_m^{-\eta}|E\hat h_m(x) - h_m(x)| \le \frac{A_m^{-\eta}}{\pi}\int_0^1 |1-g(A_me^{\alpha t})|\sum_{j=1}^m|f_j^\dagger(t)|\,dt + \frac{A_m^{-\eta}}{\pi}\int_1^\infty |1-g(A_me^{\alpha t})|\sum_{j=1}^m|f_j^\dagger(t)|\,dt. \tag{4.4.4}$$

For $m$ large enough so that $A_me^\alpha < 1$, the first term of (4.4.4) is dominated by

$$\frac{B_1e^\alpha}{\pi}\,A_m^{1-\eta}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt,$$

which tends to zero by assumption.

The second term of (4.4.4) is dominated by

$$\frac{2cA}{\pi}\,A_m^{-\eta}\int_1^\infty |1-g(A_me^{\alpha t})|\,e^{-\rho t}\,dt = \frac{2cA}{\pi\alpha}\,A_m^{\rho/\alpha-\eta}\int_{A_me^\alpha}^\infty |1-g(y)|\,y^{-\rho/\alpha}\,\frac{dy}{y},$$

where $c$ is such that $\dfrac{1}{|1-f^\dagger(t)|} < c$ for $|t| > 1$, upon substituting $y = A_me^{\alpha t}$. For $m$ large enough so that $A_me^\alpha < 1$, this is dominated by

$$\frac{2cA}{\pi\alpha}\,A_m^{\rho/\alpha-\eta}\left[\int_{A_m}^1 B_1\,y^{-\rho/\alpha}\,dy + (1+B)\int_1^\infty y^{-\rho/\alpha-1}\,dy\right],$$

where $B$ is a bound for $|g(t)|$. This is equal to

$$\left[\frac{2cAB_1}{\pi\alpha(1-\rho/\alpha)} + \frac{2cA(B+1)}{\pi\rho}\right]A_m^{\rho/\alpha-\eta} - \frac{2cAB_1}{\pi\alpha(1-\rho/\alpha)}\,A_m^{1-\eta}.$$

Since $\eta < \min(1, \rho/\alpha)$, both $A_m^{\rho/\alpha-\eta}$ and $A_m^{1-\eta}$ tend to zero, and hence the second term of (4.4.4) also tends to zero as $m \to \infty$. □
4.4.6 Corollary. Suppose that the conditions of Theorem 4.4.5 hold, that $F(x) < 1$ for all $x < \infty$, and that $M$ is a bound for $f$. Given any interval $I$ with right endpoint $T_1$, if $A_m^{-\eta}F(T_1)^m \to 0$ as $m \to \infty$, then $A_m^{-\eta}\,b(\hat h_m(x)) \to 0$ as $m \to \infty$, uniformly in $I$.

Proof:

$$A_m^{-\eta}\,b(\hat h_m(x)) = A_m^{-\eta}\,E(\hat h_m(x) - h_m(x)) - A_m^{-\eta}\sum_{j=m+1}^\infty f_j(x).$$

The first term tends to zero by Theorem 4.4.5. By Corollary 2.1.5, for $x \in I$ the second term is dominated in absolute value by $A_m^{-\eta}MF(T_1)^m/(1-F(T_1))$, which tends to zero by assumption. □
4.4.7 Corollary. Suppose that the conditions of Theorem 4.4.5 hold, that $F(T_2) = 1$ for some $T_2 < \infty$, and that $M$ is a bound for $f$. Then, if $K = Me^{T_2}$ and

$$A_m^{-\eta}\,\frac{(KT_2)^m}{m!} \to 0 \text{ as } m \to \infty,$$

then $A_m^{-\eta}\,b(\hat h_m(x)) \to 0$ as $m \to \infty$, uniformly in $x$.

Proof: Again,

$$A_m^{-\eta}\,b(\hat h_m(x)) = A_m^{-\eta}\,E(\hat h_m(x) - h_m(x)) - A_m^{-\eta}\sum_{j=m+1}^\infty f_j(x),$$

and the first term tends to zero by Theorem 4.4.5. By Corollary 2.1.6, the second term is dominated in absolute value by $Ke^{KT_2}A_m^{-\eta}(KT_2)^m/m!$, which tends to zero by assumption. □
4.4.8 Theorem. Under the conditions of Theorem 4.3.3, with $0 < \eta < \min(1, \rho/\alpha)$ and $k_m = DA_m^{-\eta}\log(1/A_m)$,

$$A_m^{-\eta}\,E(\hat h_m(x) - h_m(x))^2 \to \frac{h_m(x)}{\pi\alpha D} \text{ as } m \to \infty,$$

uniformly in $x$.

Proof: If the conditions of Theorem 4.3.3 are met, then the conditions of Theorem 4.4.5 are met, since the only condition of Theorem 4.4.5 which is not a condition of Theorem 4.3.3 is that

$$A_m^{1-\eta}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt \to 0 \text{ as } m \to \infty,$$

and this condition is met, since

$$\frac{1}{\log(1/A_m)}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt \to 0 \text{ as } m \to \infty$$

in Theorem 4.3.3, and $A_m^{1-\eta}\log(1/A_m) \to 0$ as $m \to \infty$ for any $0 < \eta < 1$.
The proof is based on Theorems 4.3.3 and 4.4.5, but is otherwise similar to that of conclusion (ii) of Theorem 4.4.3.
□

The next two corollaries cover the two cases where $F(x) < 1$ for all $x$, and where $F(T) = 1$ for some $T < \infty$. Their proofs are similar to that of Corollary 4.4.4.
4.4.9 Corollary. Under the conditions of Theorem 4.4.8, and if the conditions of Corollary 4.4.6 are met, then

$$A_m^{-\eta}\,\mathrm{MSE}(\hat h_m(x)) \to \frac{h(x)}{\pi\alpha D} \text{ as } m \to \infty,$$

uniformly in $I$.

4.4.10 Corollary. Under the conditions of Theorem 4.4.8, and if the conditions of Corollary 4.4.7 are met, then

$$A_m^{-\eta}\,\mathrm{MSE}(\hat h_m(x)) \to \frac{h(x)}{\pi\alpha D} \text{ as } m \to \infty,$$

uniformly in $x$.
Following is a discussion of $A_m$-sequences which will satisfy the conditions of Corollary 4.4.9 or of Corollary 4.4.10. In both cases, it is useful to note that, since $\int|f_m^\dagger(t)|\,dt \to 0$ as $m \to \infty$,

$$\frac{1}{m}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt \to 0 \text{ as } m \to \infty.$$

In the case where $F(x) < 1$ for all $x < \infty$, Corollary 4.4.9 would be used. An appropriate $A_m$-sequence must be such that

$$\frac{1}{\log(1/A_m)}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt \to 0 \text{ as } m \to \infty \tag{4.4.5}$$

and

$$A_m^{-\eta}\,F(T_1)^m \to 0 \text{ as } m \to \infty, \tag{4.4.6}$$

where $T_1$ is the right endpoint of the given interval $I$. (4.4.5) is satisfied if $A_m = e^{-\gamma m}$ for some $\gamma > 0$, since in that case

$$\frac{1}{\log(1/A_m)}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt = \frac{1}{\gamma m}\sum_{j=1}^m\int|f_j^\dagger(t)|\,dt,$$

which tends to zero as $m \to \infty$. If $\gamma$ is chosen so that

$$\gamma < \frac{-\log(F(T_1))}{\eta},$$

then

$$A_m^{-\eta}F(T_1)^m = e^{(\gamma\eta + \log F(T_1))m},$$

which also tends to zero as $m \to \infty$.

In the case where $F(T_2) = 1$ for some $T_2 < \infty$, the appropriate corollary is 4.4.10, and the conditions which the $A_m$-sequence must satisfy are (4.4.5) above and

$$A_m^{-\eta}\,\frac{(KT_2)^m}{m!} \to 0 \text{ as } m \to \infty. \tag{4.4.7}$$

Again, an $A_m$-sequence of the form $A_m = e^{-\gamma m}$ satisfies (4.4.5) for any $\gamma > 0$. It also satisfies (4.4.7), since in that case

$$A_m^{-\eta}\,\frac{(KT_2)^m}{m!} = \frac{e^{(\gamma\eta + \log(KT_2))m}}{m!},$$

which tends to zero as $m \to \infty$ for any $\gamma > 0$.
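The last claim rests on the factorial in the denominator dominating any fixed exponential rate; the sketch below (with illustrative values $\gamma = 1$, $\eta = 0.5$, $KT_2 = 2$, all chosen arbitrarily) checks this numerically, working with logarithms via `math.lgamma` to avoid overflow.

```python
import math

def log_ratio(m, gamma=1.0, eta=0.5, kt2=2.0):
    """log of A_m^(-eta) * (KT2)^m / m!  with A_m = e^(-gamma*m)."""
    return (gamma * eta + math.log(kt2)) * m - math.lgamma(m + 1)

logs = [log_ratio(m) for m in (10, 50, 200)]
# The log-ratio decreases without bound, so the ratio itself tends to zero.
print(logs)
```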
CHAPTER 5

INTEGRATED CONSISTENCY

5.1 Preliminaries

In this chapter, as in Chapter 4, $f$ and the $\delta$-function sequence are assumed to be in $L_2$. Further conditions on $f$ and $\{\delta_m\}$ will also be similar to those in Chapter 4.

In probability density estimation, one measure of global consistency is the mean integrated square error (M.I.S.E.). A strictly analogous measure for renewal density estimation is not possible, since $h$ is not integrable. However, the following measure can be used:

$$J_m = E\int\bigl(\hat h_m(x) - h_m(x)\bigr)^2\,dx,$$

where, as before, $h_m(x) = \sum_{j=1}^m f_j(x)$.

The next two lemmas use Parseval's Theorem to obtain useful expressions of $J_m$ for expanded and compact estimators.
89
~
5.1.1
Lemma
If h ex) is an expanded estimator of hex), then
m
By Parseva1's Theorem
90
= f.
t
(t) f.
J1
is . t
J2
E [ e fJ o~(t)
+ If:(t) I 2
J
t
( - t)
11- 0
- fret)
['r
1-0 (t)
m
~
I
m
(tj
) [ - is
e
2
I.
fj
t
o~(-t)
- f; (-t)
)
- 0t (-t)
m
Thus,
m-[lf~(t)1
m
[
j =1
k
m
J
2]
dt
.
o
91
In the following results on the compact estimator, it is convenient
to use the notation R(z) for the real part of z.
~
Lemma
5.1.2
If h (x) is a compact estimator of hex), then
m
2nJ
m
m-l
t
+
'-1
J-
Proof:
2nJ
m
2.R(f
J
t
1
.(t))j
m-J
dt
As in the previous lemma,
=
~[
E e
(5.1.1)
t
= fj
iSo . t
.(..l J 1
-r
0m(t) -
'r
t
(t) f j (-t) II-om (t)
I2
For £.1 = £'2' j1 < j2 the expectation in (5.1.1) is
t . C·t) I0.~ (t) I2
J2- J 1
m
f.
I
-r (t)
+ f.
J1
t)
i'
[...
f. (- t) 1- 0 I (t) - 0 (- t)
J2
m
m
92
For i 1 = i 2 , j 1 > j2' the expectation in (5.1.1) is
For f
1
i
=
2
=
,e
j1
1
= j2 = j,
as in the previous lemma, the expectation
in (5.1.1) is
Thus,
~
k
+
~
m
~
o 1 1 <.
.(..=
-J 2<J. l~m
f.t (t)f.-r (-t) ]
J1
J2
93
m.~
+
..,
Im
+ 0 (t) ,"-
m-
2
f. (t)
L
m-1
+
j=l J
L
j=l
2jR(ft . (t))
m-J
k
dt.
m
5.2 Minimization of $J_m$

Given $f$ and given $m$, the expressions for $J_m$ found in Lemmas 5.1.1 and 5.1.2 can be used to obtain $\delta_m$'s which minimize $J_m$.

5.2.1 Lemma. If $\hat h_m(x)$ is an expanded estimator of $h(x)$, then

$$2\pi J_m = \int a\,\Bigl|\delta_m^\dagger(t) - \frac{b}{a}\Bigr|^2\,dt + \int \frac{b(a-b)}{a}\,dt,$$

where

$$a = a(t) = \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2 + \frac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m} \quad\text{and}\quad b = b(t) = \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2.$$

Proof: By Lemma 5.1.1,

$$2\pi J_m = \int\bigl[\,a|\delta_m^\dagger(t)|^2 - b\,\delta_m^\dagger(t) - b\,\delta_m^\dagger(-t) + b\,\bigr]\,dt. \tag{5.1.2}$$

Since $a$ and $b$ are real, and $\delta_m^\dagger(-t) = \overline{\delta_m^\dagger(t)}$, (5.1.2) is equal to

$$\int a\Bigl|\delta_m^\dagger(t) - \frac{b}{a}\Bigr|^2 dt + \int\frac{b(a-b)}{a}\,dt. \qquad\Box$$
5.2.2 Theorem. If $\hat h_m(x)$ is an expanded estimator of $h(x)$, then $J_m$ is minimized by taking $\delta_m^\dagger(t) = \delta_m^{*\dagger}(t)$, where

$$\delta_m^{*\dagger}(t) = \frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}},$$

and $\delta_m^{*\dagger}(t)$ is in $L_2$, and the minimum $J_m$, $J_m^*$, is given by

$$2\pi J_m^* = \int \frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2\,\dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}}\,dt.$$

Proof: First we show that $\delta_m^{*\dagger}(t)$ is in $L_1$. Then, since $|\delta_m^{*\dagger}(t)| \le 1$, $\delta_m^{*\dagger}(t)$ is in $L_2$. Now

$$\int|\delta_m^{*\dagger}(t)|\,dt \le \int_{|t|<1} 1\,dt + \int_{|t|>1}\frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}}\,dt. \tag{5.1.3}$$

Let $c_2$ be such that $|f^\dagger(t)| < c_2 < 1$ for $|t| > 1$. Then

$$\frac{1}{|1-f^\dagger(t)|} < \frac{1}{1-c_2}, \qquad \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2 \le \frac{4|f^\dagger(t)|^2}{(1-c_2)^2},$$

and

$$m - \sum_{j=1}^m|f_j^\dagger(t)|^2 \ge m(1-c_2^2) \quad\text{for all } |t| > 1.$$

Thus (5.1.3) is dominated by

$$2 + \frac{4}{(1-c_2)^2}\,\frac{k_m}{m(1-c_2^2)}\int_{|t|>1}|f^\dagger(t)|^2\,dt,$$

which, since $f^\dagger(t)$ is in $L_2$, is finite, and hence $\delta_m^{*\dagger}(t)$ is in $L_1$.

Let $a$ and $b$ be as in Lemma 5.2.1. Then

$$2\pi J_m = \int a\Bigl|\delta_m^\dagger(t) - \frac{b}{a}\Bigr|^2 dt + \int \frac{b(a-b)}{a}\,dt.$$

Clearly, this is minimized if $\delta_m^\dagger(t) = \delta_m^{*\dagger}(t) = \dfrac{b}{a}$, and in this case

$$2\pi J_m^* = \int\frac{b(a-b)}{a}\,dt.$$

Substituting the appropriate expressions for $a$ and $b$ gives the desired result. □
5.2.3 Lemma. If $\hat h_m(x)$ is a compact estimator of $h(x)$, then

$$2\pi J_m = \int a\Bigl|\delta_m^\dagger(t) - \frac{b}{a}\Bigr|^2 dt + \int\frac{b(a-b)}{a}\,dt,$$

where

$$a = a(t) = \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2 + \frac{m - \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t))}{k_m}$$

and $b = b(t) = \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2$.

Proof: From Lemma 5.1.2, the method of proof follows exactly that in Lemma 5.2.1, since here, too, $a$ and $b$ are real. □
5.2.4 Theorem. If $\hat h_m(x)$ is a compact estimator of $h(x)$, then $J_m$ is minimized by taking $\delta_m^\dagger(t) = \delta_m^{*\dagger}(t)$, where

$$\delta_m^{*\dagger}(t) = \frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t))}{k_m}},$$

and $\delta_m^{*\dagger}(t)$ is in $L_2$, and the minimum $J_m$, $J_m^*$, is given by

$$2\pi J_m^* = \int\frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2\,\dfrac{m - \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t))}{k_m}}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t))}{k_m}}\,dt.$$

Proof: The proof that $\delta_m^{*\dagger}(t)$ is in $L_2$ follows from the analogous proof in Theorem 5.2.2, since $|\delta_m^{*\dagger}(t)| \le 1$, and since the fact that

$$m - \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t)) \ge m - \sum_{j=1}^m|f_j^\dagger(t)|^2$$

implies that

$$\int_{|t|>1}|\delta_m^{*\dagger}(t)|\,dt \le \int_{|t|>1}\frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}}\,dt,$$

and this is the expression that was shown to be finite in the proof of Theorem 5.2.2. Let $a$ and $b$ be as in Lemma 5.2.3, and the rest of the proof follows as in Theorem 5.2.2 also. □
For the expanded estimator, $\delta_m^{*\dagger}(t)$ is given by

$$\delta_m^{*\dagger}(t) = \frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \sum_{j=1}^m|f_j^\dagger(t)|^2}{k_m}}.$$

This expression is clearly real and even. Thus the (inverse) transform $\delta_m^*(x)$ is also real and even; $\delta_m^*(x)$ is the "optimum" $\delta$-function. Similarly, since for the compact estimator

$$\delta_m^{*\dagger}(t) = \frac{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2}{\bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \dfrac{m - \bigl|\sum_{j=1}^m f_j^\dagger(t)\bigr|^2 + \sum_{j=1}^{m-1}2jR(f_{m-j}^\dagger(t))}{k_m}},$$

which is also clearly real and even, its (inverse) transform $\delta_m^*(x)$ is real and even.
5.3 Integrated Consistency--Algebraic Case

In the rest of the chapter, only the expanded estimator will be considered, since the compact estimator proved intractable. Notation from the last section will be retained: $J_m^*$ is the minimum $J_m$, associated with $\delta_m^*(x)$, and the inverse transform of $\delta_m^{*\dagger}(t)$ is $\delta_m^*(x)$. The estimator which minimizes $J_m$, that is, the estimator formed with $\delta_m^*$, will be called the optimum estimator.

The following notation will be convenient:

$$S_1 = S_1(t) = \Bigl|\sum_{j=1}^m f_j^\dagger(t)\Bigr|^2, \qquad S_2 = S_2(t) = \sum_{j=1}^m|f_j^\dagger(t)|^2.$$

This section will deal with probability densities with characteristic functions which decrease algebraically, in the sense that

$$|f^\dagger(t)|\,|t|^p \to K^{1/2} \text{ as } |t| \to \infty$$

for some $p > 1/2$, some $K > 0$. This is a stronger condition than that used in Chapter 4, which was $|f^\dagger(t)|\,|t|^p \le K$ for some $p > 0$, $K > 0$.
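As an illustration of algebraic decrease, consider the standard Laplace density $f(x) = \tfrac12 e^{-|x|}$, whose characteristic function is $f^\dagger(t) = 1/(1+t^2)$, so that $|f^\dagger(t)|\,|t|^2 \to 1$; here $p = 2$ and $K = 1$. (The Laplace example is this writer's illustration, not one drawn from the text.)

```python
def laplace_cf(t):
    """Characteristic function of the standard Laplace density f(x) = exp(-|x|)/2."""
    return 1.0 / (1.0 + t * t)

# |f^dagger(t)| * |t|^p with p = 2 should approach K^(1/2) = 1 as |t| grows.
tail = [abs(laplace_cf(t)) * t ** 2 for t in (10.0, 100.0, 1000.0)]
print(tail)
```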
5.3.1 Lemma. If $\hat h_m(t)$ is an expanded estimator of $h$, then

$$J_m^* = \frac{m}{k_m}\,\delta_m^*(0) + O\!\left(\frac{m}{k_m}\right).$$

Proof: Since $b(a-b)/a = \dfrac{m-S_2}{k_m}\,\delta_m^{*\dagger}(t)$, and since $\delta_m^{*\dagger} \in L_1$ and $2\pi\delta_m^*(0) = \int\delta_m^{*\dagger}(t)\,dt$,

$$J_m^* = \frac{m}{k_m}\,\delta_m^*(0) - \frac{1}{2\pi k_m}\int S_2(t)\,\delta_m^{*\dagger}(t)\,dt. \tag{5.3.1}$$

The second term of the right hand side of (5.3.1) is dominated, since $0 \le \delta_m^{*\dagger}(t) \le 1$ and

$$\int\sum_{j=1}^m|f_j^\dagger(t)|^2\,dt < m\int|f^\dagger(t)|^2\,dt,$$

by $\dfrac{m}{2\pi k_m}\int|f^\dagger(t)|^2\,dt$, which is $O(m/k_m)$. Note that the term of $J_m^*$ which is $O(m/k_m)$ is negative, so that $J_m^* \le \dfrac{m}{k_m}\,\delta_m^*(0)$. □
5.3.2 Theorem. If $\hat h_m(x)$ is an expanded estimator of $h(x)$, with $k_m/m \to \infty$ and

$$|f^\dagger(t)|\,|t|^p \to K^{1/2} \text{ as } |t| \to \infty$$

for some $p > 1/2$, then

$$\left[\frac{k_m}{m}\right]^{1-1/2p} J_m^* \to \frac{K^{1/2p}}{2\pi p}\int_0^\infty\frac{u^{(1/2p)-1}}{1+u}\,du \text{ as } m \to \infty.$$

Proof: By Lemma 5.3.1 and the expression for $\delta_m^{*\dagger}$,

$$\left[\frac{k_m}{m}\right]^{1-1/2p} J_m^* = \left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int\frac{dt}{1 + \dfrac{m}{k_m}\,\dfrac{1-S_2/m}{S_1}} + \left[\frac{k_m}{m}\right]^{1-1/2p} O\!\left(\frac{m}{k_m}\right). \tag{5.3.2}$$

The second term of (5.3.2) is dominated in modulus by a constant multiple of $\left[\dfrac{k_m}{m}\right]^{-1/2p}$, and hence tends to zero as $m \to \infty$, since $k_m/m \to \infty$ as $m \to \infty$.

Given $\varepsilon$ with $0 < \varepsilon < 1/2$, choose $T$ so that

$$\left|\frac{|f^\dagger(t)|^2 t^{2p}}{K} - 1\right| < \varepsilon \quad\text{and}\quad |f^\dagger(t)| < \varepsilon \quad\text{for } |t| > T.$$

Then the first term of (5.3.2) can be written as

$$\left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int_{-T}^{T}\frac{dt}{1 + \dfrac{m}{k_m}\,\dfrac{1-S_2/m}{S_1}} + \left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int_{|t|>T}\left[\frac{1}{1 + \dfrac{m}{k_m}\,\dfrac{1-S_2/m}{S_1}} - \frac{1}{1 + \dfrac{mt^{2p}}{k_mK}}\right]dt$$
$$+\; \left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int_{|t|>T}\frac{dt}{1 + \dfrac{mt^{2p}}{k_mK}}. \tag{5.3.3}$$

The first term of (5.3.3) is dominated by $\dfrac{T}{\pi}\left[\dfrac{k_m}{m}\right]^{-1/2p}$ and tends to zero as $m \to \infty$. The third term of (5.3.3) is, upon substituting $u = \dfrac{mt^{2p}}{k_mK}$,

$$\frac{K^{1/2p}}{2\pi p}\int_{mT^{2p}/k_mK}^\infty \frac{u^{(1/2p)-1}}{1+u}\,du \to \frac{K^{1/2p}}{2\pi p}\int_0^\infty\frac{u^{(1/2p)-1}}{1+u}\,du$$

as $m \to \infty$. The integral is finite, since $p > 1/2$. Note that the third term of (5.3.3) tends to this limit for any $T \in (0,\infty)$, not just the $T$ chosen above.
The second term of (5.3.3) may be written, upon combining the two fractions, as

$$\left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int_{|t|>T}\frac{\dfrac{mt^{2p}}{k_mK}\left[1 - \dfrac{Kt^{-2p}}{|f^\dagger(t)|^2}\Bigl(1-\dfrac{S_2}{m}\Bigr)\left|\dfrac{1-f^\dagger(t)}{1-f^\dagger(t)^m}\right|^2\right]}{\Bigl(1+\dfrac{m}{k_m}\,\dfrac{1-S_2/m}{S_1}\Bigr)\Bigl(1+\dfrac{mt^{2p}}{k_mK}\Bigr)}\,dt, \tag{5.3.4}$$

where, for $|t| > T$,

$$S_1 = |f^\dagger(t)|^2\left|\frac{1-f^\dagger(t)^m}{1-f^\dagger(t)}\right|^2.$$

Elementary estimates using $|f^\dagger(t)| < \varepsilon < 1/2$ for $|t| > T$ show that, for $m \ge 1$,

$$-3\varepsilon \le \left|\frac{1-f^\dagger(t)^m}{1-f^\dagger(t)}\right|^2 - 1 \tag{5.3.5}$$

and

$$\left|\frac{1-f^\dagger(t)^m}{1-f^\dagger(t)}\right|^2 - 1 \le 12\varepsilon. \tag{5.3.6}$$

Combining (5.3.5) and (5.3.6) with $\Bigl|\dfrac{|f^\dagger(t)|^2t^{2p}}{K} - 1\Bigr| < \varepsilon$ and $0 \le S_2/m \le \varepsilon^2$, the quantity in square brackets in (5.3.4) is bounded in modulus by $24\varepsilon$, and (5.3.4) is dominated in modulus by

$$\frac{24\varepsilon}{2\pi}\left[\frac{k_m}{m}\right]^{-1/2p}\int_{|t|>T}\frac{dt}{1 + \dfrac{mt^{2p}}{2k_mK}}.$$

Except for the factor of $48\varepsilon$ (and the inessential replacement of $K$ by $2K$), this is the same as the third term of (5.3.3), which has already been shown, for any $T$, to tend to the finite limit

$$\frac{K^{1/2p}}{2\pi p}\int_0^\infty\frac{u^{(1/2p)-1}}{1+u}\,du.$$

Since $\varepsilon$ is arbitrary, the second term of (5.3.3) is arbitrarily small, and the theorem follows. □
In general, since $f$ is not known, the optimum estimator is also unknown. The next result is similar to the last, but substitutes an appropriate type-A estimator for the optimum estimator.

5.3.3 Theorem. Let $\hat h_m$ be an expanded type-A estimator of $h$ ($\delta_m^\dagger(t) = g(A_mt)$) with $k_m/m \to \infty$ as $m \to \infty$ and $A_m = D\left[\dfrac{k_m}{m}\right]^{-1/2p}$. Suppose that $f$ is such that $|1-f^\dagger(t)| > a|t|^\alpha$ for some $a > 0$, $0 < \alpha \le 2$, $|t| < 1$, and such that $|f^\dagger(t)|\,|t|^p \to K^{1/2}$ as $|t| \to \infty$ for some $p > 1/2$. If $g$ is such that $\int|t|^{-2p}|1-g(t)|^2\,dt < \infty$ and $|1-g(t)| \le B_1|t|^\beta$ for $|t| \le 1$, for some $\beta > \max(\alpha, p) - 1/2$, then

$$\left[\frac{k_m}{m}\right]^{1-1/2p} J_m \to \frac{1}{2\pi D}\int g^2(t)\,dt + \frac{KD^{2p-1}}{2\pi}\int|t|^{-2p}|1-g(t)|^2\,dt$$

as $m \to \infty$.

Proof: By Lemma 5.1.1,
$$2\pi J_m = \int\left[\,|1-\delta_m^\dagger(t)|^2 S_1 + |\delta_m^\dagger(t)|^2\,\frac{m-S_2}{k_m}\right]dt.$$

Thus

$$\left[\frac{k_m}{m}\right]^{1-1/2p} J_m = \left[\frac{k_m}{m}\right]^{1-1/2p}\frac{1}{2\pi}\int|1-g(A_mt)|^2 S_1\,dt + \left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi}\int|g(A_mt)|^2\,dt$$
$$-\; \left[\frac{k_m}{m}\right]^{1-1/2p}\frac{1}{2\pi k_m}\int|g(A_mt)|^2 S_2\,dt. \tag{5.3.7}$$

The second term of (5.3.7) is

$$\left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{2\pi A_m}\int g^2(t)\,dt = \frac{1}{2\pi D}\int g^2(t)\,dt,$$

the first term of the desired result.

If $B$ is a bound for $|g|$, then the third term of (5.3.7) is dominated in absolute value by

$$\frac{B^2}{2\pi}\left[\frac{k_m}{m}\right]^{-1/2p}\frac{1}{m}\int S_2\,dt \le \frac{B^2}{2\pi}\left[\frac{k_m}{m}\right]^{-1/2p}\int|f^\dagger(t)|^2\,dt,$$

and thus tends to zero as $m \to \infty$.

The first term of (5.3.7) is

$$\left[\frac{k_m}{m}\right]^{1-1/2p}\frac{1}{2\pi}\left[\int_{|t|<1}|1-g(A_mt)|^2S_1\,dt + \int_{|t|>1}|1-g(A_mt)|^2S_1\,dt\right]. \tag{5.3.8}$$

For $|t| < 1$, $\dfrac{1}{|1-f^\dagger(t)|} < \dfrac{1}{a|t|^\alpha}$, so that $S_1 \le \dfrac{4}{a^2|t|^{2\alpha}}$, and the first term of (5.3.8) is no greater than

$$\frac{2B_1^2}{\pi a^2}\left[\frac{k_m}{m}\right]^{1-1/2p}A_m^{2\beta}\int_0^1 t^{2\beta-2\alpha}\,dt = \frac{2B_1^2D^{2\beta}}{\pi a^2}\left[\frac{k_m}{m}\right]^{1-(1+2\beta)/2p}\int_0^1 t^{2\beta-2\alpha}\,dt.$$

Since $\beta > \alpha - 1/2$, the integral is finite. Also, since $\beta > p - 1/2$, $\left[\dfrac{k_m}{m}\right]^{1-(1+2\beta)/2p} \to 0$ as $m \to \infty$, and the first term of (5.3.8) tends to zero as $m \to \infty$.

The second term of (5.3.8) is, upon substituting $t \to t/A_m$,

$$\frac{D^{2p-1}}{2\pi}\int_{|t|>A_m}\frac{|1-g(t)|^2}{|t|^{2p}}\left[S_1\!\left(\frac{t}{A_m}\right)\left|\frac{t}{A_m}\right|^{2p} - K\right]dt + \frac{KD^{2p-1}}{2\pi}\int_{|t|>A_m}\frac{|1-g(t)|^2}{|t|^{2p}}\,dt, \tag{5.3.9}$$

since $\left[\dfrac{k_m}{m}\right]^{1-1/2p}A_m^{2p-1} = D^{2p-1}$. The second term of (5.3.9) tends to

$$\frac{KD^{2p-1}}{2\pi}\int\frac{|1-g(t)|^2}{|t|^{2p}}\,dt,$$

the second term of the desired result.

In the first term of (5.3.9), the quantity in square brackets can be written as

$$\left|f^\dagger\!\left(\frac{t}{A_m}\right)\right|^2\left|\frac{t}{A_m}\right|^{2p}\left|\frac{1-f^\dagger(t/A_m)^m}{1-f^\dagger(t/A_m)}\right|^2 - K.$$

Let $c$ be such that $\dfrac{1}{|1-f^\dagger(t)|} < c$ for $|t| > 1$. Then

$$\left|\frac{1-f^\dagger(t/A_m)^m}{1-f^\dagger(t/A_m)}\right| < 2c \quad\text{for } |t| > A_m,$$

and since $f^\dagger(t/A_m) \to 0$ and $|f^\dagger(t/A_m)|^2|t/A_m|^{2p} \to K$ as $m \to \infty$, the quantity in square brackets tends to zero as $m \to \infty$. Also, for $|t| > A_m$,

$$\left|f^\dagger\!\left(\frac{t}{A_m}\right)\right|^2\left|\frac{t}{A_m}\right|^{2p}\left|\frac{1-f^\dagger(t/A_m)^m}{1-f^\dagger(t/A_m)}\right|^2 \le 4c^2\sup_{|s|\ge1}|f^\dagger(s)|^2|s|^{2p} < \infty,$$

so the quantity in square brackets is bounded by some $M < \infty$. Thus the integrand in the first term of (5.3.9) is dominated by

$$M\,\frac{|1-g(t)|^2}{|t|^{2p}},$$

which is integrable. Thus, by dominated convergence, the first term of (5.3.9) tends to zero as $m \to \infty$. □
5.3.4 Corollary. Under the conditions of Theorem 5.3.3, except that $A_m = D\left(\dfrac{k_m}{m}\right)^{-1/2q}$ where $0 < q < p$,

$$\left(\frac{k_m}{m}\right)^{1-1/2q} J_m \to \frac{1}{2\pi D}\int g^2(t)\,dt \text{ as } m \to \infty.$$

Proof: The proof of the Corollary follows the proof of Theorem 5.3.3, with $q$ replacing $p$ everywhere, except that the second term of (5.3.8) acquires an extra factor of $\left[\dfrac{k_m}{m}\right]^{(q-p)/q}$. Since $q < p$, this factor tends to zero as $m \to \infty$, so that the second term of (5.3.8) tends to zero rather than to $\dfrac{KD^{2p-1}}{2\pi}\displaystyle\int|t|^{-2p}|1-g(t)|^2\,dt$. □
5.4 Integrated Consistency--Exponential Case

This section will deal with densities with characteristic functions which decrease exponentially, in the sense that

$$|f^\dagger(t)| \le Ae^{-\rho|t|}$$

for some $A > 0$, $\rho > 0$, and all $t$. This is the same class of densities that was used in Chapter 4.

5.4.1 Theorem. If $\hat h_m$ is an expanded estimator of $h$, with $k_m/m \to \infty$ as $m \to \infty$, and if $f$ is such that $|1-f^\dagger(t)| > a|t|^\gamma$ for some $a > 0$, $0 < \gamma \le 2$, $|t| < 1$, and $|f^\dagger(t)| \le Ae^{-\rho|t|}$ for some $A > 0$, $\rho > 0$, all $t$ (which means that $f^\dagger(t) \in L_1$), and if

$$\lim_{v\to\infty}\frac{1}{v}\int_0^v\frac{dt}{1 + e^{2\rho v}|f^\dagger(t)|^2} = 0,$$

then

$$\lim_{m\to\infty}\frac{k_m}{m\log(k_m/m)}\,J_m^* = \frac{1}{2\pi\rho}.$$
Proof: By Lemma 5.3.1 and the expression for $\delta_m^{*\dagger}$,

$$\frac{k_m}{m\log(k_m/m)}\,J_m^* = \frac{1}{2\pi\log(k_m/m)}\int\delta_m^{*\dagger}(t)\,dt + \frac{O(1)}{\log(k_m/m)}, \tag{5.4.1}$$

where $\delta_m^{*\dagger}(t) = \Bigl[1 + \dfrac{m}{k_m}\,\dfrac{1-S_2/m}{S_1}\Bigr]^{-1}$. Since $1/\log(k_m/m) \to 0$, the second term of the right hand side of (5.4.1) tends to zero as $m \to \infty$.

The first term of the right hand side of (5.4.1) can be written as

$$\frac{1}{2\pi\log(k_m/m)}\int\left[\delta_m^{*\dagger}(t) - \frac{1}{1+\dfrac{m}{k_m}e^{2\rho|t|}}\right]dt + \frac{1}{2\pi\log(k_m/m)}\int\frac{dt}{1+\dfrac{m}{k_m}e^{2\rho|t|}}. \tag{5.4.2}$$

Let $x = e^{-2\rho t}$. Then the second term of (5.4.2) is

$$\frac{1}{2\pi\rho\log(k_m/m)}\,\frac{k_m}{m}\int_0^1\frac{dx}{1+\dfrac{k_m}{m}x} = \frac{\log(1+k_m/m)}{2\pi\rho\log(k_m/m)} \to \frac{1}{2\pi\rho} \text{ as } m \to \infty.$$

This is the desired limit. It remains to show that the first term of (5.4.2) tends to zero as $m \to \infty$.

Let $c'$ be such that $(1-|f^\dagger(t)|)^2 > c'$ for $|t| > 1$; then, for $|t| > 1$,

$$\frac{c'}{4}\,|f^\dagger(t)|^2 \le S_1 \le \frac{4A^2}{c'}\,e^{-2\rho|t|}.$$

Let

$$t_0 = \frac{1}{2\rho}\log\!\left(\frac{c'k_m}{4A^2m}\right),$$

so that $t_0 \to \infty$ and $t_0/\log(k_m/m) \to 1/2\rho$ as $m \to \infty$. The contribution to the first term of (5.4.2) from $|t| < 1$ is dominated in modulus by $\dfrac{1}{\pi\log(k_m/m)}$, and thus tends to zero. The contribution from $|t| > t_0$ is dominated in modulus by a constant multiple of

$$\frac{1}{\log(k_m/m)}\,\frac{k_m}{m}\int_{t_0}^\infty e^{-2\rho t}\,dt = \frac{1}{2\rho\log(k_m/m)}\,\frac{k_m}{m}\,e^{-2\rho t_0} = \frac{4A^2/c'}{2\rho\log(k_m/m)},$$

which tends to zero as $m \to \infty$. Finally, the contribution from $1 < |t| < t_0$ is dominated in modulus by

$$\frac{1}{\pi\log(k_m/m)}\int_0^{t_0}\frac{dt}{1 + A^2e^{2\rho t_0}|f^\dagger(t)|^2} + \frac{1}{\pi\log(k_m/m)}\int_1^{t_0}\frac{m}{k_m}\,e^{2\rho t}\,dt, \tag{5.4.5}$$

since, for such $t$, $1 - \delta_m^{*\dagger}(t) \le \bigl[1 + (c'k_m/4m)|f^\dagger(t)|^2\bigr]^{-1}$ with $c'k_m/4m = A^2e^{2\rho t_0}$. The second term of (5.4.5) is dominated by $\dfrac{c'}{8\pi\rho A^2\log(k_m/m)}$, which tends to zero. The first term of (5.4.5) is, since $t_0/\log(k_m/m)$ is bounded, dominated by a constant multiple of

$$\frac{1}{t_0}\int_0^{t_0}\frac{dt}{1 + A^2e^{2\rho t_0}|f^\dagger(t)|^2}.$$

Since $t_0 \to \infty$ as $m \to \infty$, this tends to zero as $m \to \infty$ by assumption. Thus the first term of (5.4.2) tends to zero, and the theorem follows. □
As in the algebraic case, it will be possible to replace the unknown "optimum" estimator by an appropriate estimator (type-E in this case) and still obtain a result similar to Theorem 5.4.1 on the rate of convergence of $J_m$. The next lemma, proved by Leadbetter (1963), is necessary to the proof of the theorem which follows it.

5.4.2 Lemma. If $\delta_n^\dagger(t) = g(A_ne^{\alpha|t|})$ with $|1-g(t)| \le B_1|t|$ for $|t| < 1$, and $A_n = Dn^{-b}$ where $b > 1/2$, and if $|f^\dagger(t)| \le Ae^{-\rho|t|}$ for some $A > 0$, $\rho > 0$, and $\alpha \le 2\rho b$, then

$$\frac{n}{\log n}\int|1-g(A_ne^{\alpha|t|})|^2\,e^{-2\rho|t|}\,dt \to 0 \text{ as } n \to \infty.$$

The proof of the lemma is not changed if $k_m/m$ is substituted for $n$, to give $A_m = D\left(\dfrac{k_m}{m}\right)^{-b}$ and

$$\frac{k_m}{m\log(k_m/m)}\int|1-g(A_me^{\alpha|t|})|^2\,e^{-2\rho|t|}\,dt \to 0 \quad\text{as } \frac{k_m}{m} \to \infty.$$
5.4.3 Theorem. If $\hat h_m$ is an expanded type-E estimator ($\delta_m^\dagger(t) = g(A_me^{\alpha|t|})$) with $k_m/m \to \infty$ as $m \to \infty$, $A_m = D\left(\dfrac{k_m}{m}\right)^{-b}$, $b > 1/2$, $|1-g(t)| \le B_1|t|$ for $|t| \le 1$, and $\int_1^\infty g^2(y)\,\dfrac{dy}{y} < \infty$, and if $f$ is such that $f^\dagger(t) \in L_1$, $|1-f^\dagger(t)| \le a|t|^\gamma$ for some $a > 0$, $|t| < 1$, and $|f^\dagger(t)| \le Ae^{-\rho|t|}$ for some $A > 0$, $\rho > 0$, and if $\alpha \le 2\rho b$ and

$$\left(\frac{k_m}{m}\right)^{1-2b}\frac{m^2}{\log(k_m/m)} \to 0 \text{ as } m \to \infty,$$

then

$$\lim_{m\to\infty}\frac{k_m}{m\log(k_m/m)}\,J_m = \frac{b}{\pi\alpha}.$$

Proof: By Lemma 5.1.1,
$$2\pi J_m = \int\left[\,|1-\delta_m^\dagger(t)|^2S_1 + |\delta_m^\dagger(t)|^2\,\frac{m-S_2}{k_m}\right]dt,$$

so that

$$\frac{k_m}{m\log(k_m/m)}\,J_m = \frac{k_m}{2\pi m\log(k_m/m)}\int|1-g(A_me^{\alpha|t|})|^2S_1\,dt + \frac{1}{2\pi\log(k_m/m)}\int|g(A_me^{\alpha|t|})|^2\Bigl(1-\frac{S_2}{m}\Bigr)dt. \tag{5.4.6}$$

The second term of the right hand side of (5.4.6) is, apart from a correction dominated in modulus by $\dfrac{B^2}{2\pi\log(k_m/m)}\displaystyle\int|f^\dagger(t)|^2\,dt$ (where $B$ is a bound for $|g|$), which tends to zero,

$$\frac{1}{2\pi\log(k_m/m)}\int|g(A_me^{\alpha|t|})|^2\,dt = \frac{1}{\pi\alpha\log(k_m/m)}\int_{A_m}^\infty g^2(y)\,\frac{dy}{y},$$

upon substituting $y = A_me^{\alpha t}$. By Lemma 4.3.1,

$$\frac{1}{\log(1/A_m)}\int_{A_m}^\infty g^2(y)\,\frac{dy}{y} \to 1,$$

and since

$$\frac{\log(1/A_m)}{\log(k_m/m)} = \frac{b\log(k_m/m) - \log D}{\log(k_m/m)} \to b,$$

the second term of the right hand side of (5.4.6) tends to $\dfrac{b}{\pi\alpha}$ as $m \to \infty$. It remains to show that the first term of the right hand side of (5.4.6) tends to zero. That term is

$$\frac{k_m}{2\pi m\log(k_m/m)}\left[\int_{|t|<1}|1-g(A_me^{\alpha|t|})|^2S_1\,dt + \int_{|t|>1}|1-g(A_me^{\alpha|t|})|^2S_1\,dt\right]. \tag{5.4.7}$$

For $|t| > 1$, $S_1 \le 4c^2A^2e^{-2\rho|t|}$, where $c$ is such that $\dfrac{1}{|1-f^\dagger(t)|} < c$ for $|t| > 1$, so the second term of (5.4.7) is dominated by

$$\frac{2c^2A^2}{\pi}\,\frac{k_m}{m\log(k_m/m)}\int_{|t|>1}|1-g(A_me^{\alpha|t|})|^2\,e^{-2\rho|t|}\,dt,$$

which tends to zero as $m \to \infty$ by Lemma 5.4.2. For $|t| < 1$, $S_1 \le m^2$, and, for $m$ large enough that $A_me^\alpha < 1$, $|1-g(A_me^{\alpha|t|})|^2 \le B_1^2e^{2\alpha}A_m^2$; thus the first term of (5.4.7) is dominated by

$$\frac{B_1^2e^{2\alpha}D^2}{\pi}\left(\frac{k_m}{m}\right)^{1-2b}\frac{m^2}{\log(k_m/m)},$$

which tends to zero as $m \to \infty$ by assumption. □
Finally, note that it is certainly possible to choose constants satisfying the restrictions for this result. For example, if $k_m = m^r$ with $r > 1$, then $k_m/m \to \infty$ as $m \to \infty$, and

$$\left(\frac{k_m}{m}\right)^{1-2b}\frac{m^2}{\log(k_m/m)} = \frac{m^{r-2rb+1+2b}}{(r-1)\log m}.$$

This will tend to zero as $m \to \infty$ if $r > \dfrac{2b+1}{2b-1}$.

Note also that if $A_m = D\left(\dfrac{k_m}{m}\right)^{-b}$ and $a_m = \int\delta_m^2(x)\,dx$, then

$$\frac{m\,a_m}{k_m} \sim \frac{1}{\pi\alpha}\,\frac{\log(1/A_m)}{k_m/m} = \frac{1}{\pi\alpha}\,\frac{b\log(k_m/m) - \log D}{k_m/m} \to 0 \text{ as } m \to \infty,$$

which is required if $\operatorname{Var}(\hat h_m(t))$ is to tend to zero as $m \to \infty$.
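The threshold $r > (2b+1)/(2b-1)$ is easy to see numerically; the sketch below (with the illustrative choice $b = 1$, so that the threshold is $r = 3$) evaluates the normalizing quantity for one value of $r$ on each side of the threshold.

```python
import math

def rate(m, r, b=1.0):
    """(k_m/m)^(1-2b) * m^2 / log(k_m/m) with k_m = m^r."""
    return m ** ((r - 1.0) * (1.0 - 2.0 * b) + 2.0) / ((r - 1.0) * math.log(m))

# r = 4 lies above the threshold (2b+1)/(2b-1) = 3: the quantity decays.
decaying = [rate(m, r=4.0) for m in (10, 100, 1000)]
# r = 2 lies below the threshold: the quantity grows.
growing = [rate(m, r=2.0) for m in (10, 100, 1000)]
print(decaying, growing)
```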
CHAPTER 6

ASYMPTOTIC NORMALITY

6.1 Asymptotic Normality of the Expanded Estimator

The following theorem (cf. Loève (1960)) is the basis for the results in this chapter.

6.1.1 Theorem. If the $X_{nk}$, $k = 1, \ldots, k_n$, are independent summands centered at expectations, with variances $\sigma_{nk}^2$ satisfying $\sum_k\sigma_{nk}^2 = 1$ for all $n$, then

$$\sum_k X_{nk} \xrightarrow{L} N(0,1) \quad\text{and}\quad \max_k\sigma_{nk}^2 \to 0$$

if and only if

$$\sum_k\int_{|z|\ge\varepsilon} z^2\,dG_{nk}(z) \to 0 \text{ for every } \varepsilon > 0,$$

where $G_{nk}$ is the distribution function of $X_{nk}$.
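The sufficiency half of this central limit theorem can be illustrated by simulation; the sketch below builds a triangular array of centered, scaled uniform variables (an arbitrary choice that satisfies the Lindeberg-type condition) and checks that the row sums are approximately standard normal.

```python
import random
import math

random.seed(12345)

def row_sum(n):
    """Sum of n independent summands centered at expectations, total variance 1."""
    # Each summand is (U - 1/2)/sqrt(n/12), so its variance is 1/n.
    scale = math.sqrt(n / 12.0)
    return sum((random.random() - 0.5) / scale for _ in range(n))

samples = [row_sum(400) for _ in range(5000)]
mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)
coverage = sum(1 for z in samples if abs(z) <= 1.96) / len(samples)
print(mean, var, coverage)  # should be near 0, 1, and 0.95
```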
The next lemma, a more general version of Lemma 2.2.2, was proved by Watson and Leadbetter (1963). It uses a change in notation which will also be used in the remainder of this chapter. Let

$$a_m(p) = \int|\delta_m(x)|^p\,dx.$$

Thus, what has been called simply $a_m$ will now be $a_m(2)$.
6.1.2 Lemma. Let $\{\delta_m(x)\}$ be a $\delta$-function sequence with $a_m(p) < \infty$, where $p \ge 2$. Then

$$\{\delta_m^*(x)\} = \left\{\frac{|\delta_m(x)|^p}{a_m(p)}\right\}$$

is also a $\delta$-function sequence.
6.1.3 Theorem. If $f$ and $h$ are bounded, and $f$ is continuous at $t$, and if, for some $\eta > 0$,

$$\frac{a_m(2+\eta)}{k_m^{\eta/2}\,(a_m(2))^{1+\eta/2}} \to 0 \text{ as } m \to \infty,$$

then

$$\frac{\hat h_m(t) - E(\hat h_m(t))}{\Bigl[k_m^{-1}\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))\Bigr]^{1/2}} \xrightarrow{L} N(0,1).$$

Proof: For each $m$, let

$$Z_{ij}(m) = Z_{ij} = \frac{\delta_m(t-S_{ij}) - E(\delta_m(t-S_{ij}))}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))\Bigr]^{1/2}},$$

so that
$$\frac{\hat h_m(t) - E(\hat h_m(t))}{\Bigl[k_m^{-1}\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))\Bigr]^{1/2}} = \sum_{i=1}^{k_m}\sum_{j=1}^m Z_{ij}.$$

The $Z_{ij}$'s are centered at expectations, and the sum of their variances is one for all $m$, since

$$\sum_{i=1}^{k_m}\sum_{j=1}^m\operatorname{Var}(Z_{ij}) = \frac{\sum_{i=1}^{k_m}\sum_{j=1}^m\operatorname{Var}(\delta_m(t-S_{ij}))}{k_m\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))} = 1.$$

Thus, by Theorem 6.1.1, it remains to show that

$$g_m(\varepsilon) = \sum_{i=1}^{k_m}\sum_{j=1}^m\int_{|z|\ge\varepsilon}z^2\,dG_{ij} \to 0 \text{ as } m \to \infty \text{ for all } \varepsilon > 0,$$

where $G_{ij} = G_{ij}(m)$ is the distribution of $Z_{ij}(m)$. Now,

$$\int_{|z|\ge\varepsilon}z^2\,dG_{ij} \le \frac{1}{\varepsilon^\eta}\int|z|^{2+\eta}\,dG_{ij},$$

so that

$$g_m(\varepsilon) \le \frac{k_m}{\varepsilon^\eta}\sum_{j=1}^m E|Z_{1j}|^{2+\eta}.$$

Thus, it is sufficient to show that the last expression tends to zero. By the $c_r$-inequality, for $r \ge 1$, this expression is dominated by

$$\frac{2^{1+\eta}k_m}{\varepsilon^\eta}\,\frac{\sum_{j=1}^m\bigl(E|\delta_m(t-S_{1j})|^{2+\eta} + |E(\delta_m(t-S_{1j}))|^{2+\eta}\bigr)}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))\Bigr]^{1+\eta/2}} \le \frac{2^{2+\eta}}{\varepsilon^\eta}\,\frac{k_m\sum_{j=1}^m E|\delta_m(t-S_{1j})|^{2+\eta}}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))\Bigr]^{1+\eta/2}}. \tag{6.1.1}$$

In the denominator of (6.1.1),

$$\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell})) = k_m\operatorname{Var}(\hat h_m(t)) \sim a_m(2)\,h(t)$$

by Theorem 2.3.11. In the numerator of (6.1.1),

$$\sum_{j=1}^m E|\delta_m(t-S_{1j})|^{2+\eta} = \int|\delta_m(t-u)|^{2+\eta}\sum_{j=1}^m f_j(u)\,du \sim a_m(2+\eta)\,h(t),$$

by Lemma 6.1.2 and Theorem 2.3.11. Thus, denoting the whole expression (6.1.1) by $\Lambda_m$,

$$\Lambda_m \sim \frac{2^{2+\eta}}{\varepsilon^\eta\,h(t)^{\eta/2}}\;\frac{a_m(2+\eta)}{k_m^{\eta/2}\,(a_m(2))^{1+\eta/2}},$$

which tends to zero as $m \to \infty$ by assumption, for all $\varepsilon > 0$. Thus $g_m(\varepsilon) \to 0$, and the result follows. □
6.1.4 Corollary. Under the conditions of Theorem 6.1.3,

$$\frac{\hat h_m(t) - E(\hat h_m(t))}{\Bigl[\dfrac{a_m(2)}{k_m}\,h(t)\Bigr]^{1/2}} \xrightarrow{L} N(0,1).$$

Proof: It was shown in the proof of Theorem 6.1.3 that

$$\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell})) = k_m\operatorname{Var}(\hat h_m(t)) \sim a_m(2)\,h(t).$$

Thus $k_m^{-1}\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t-S_{1\ell}))$ can be replaced by $\dfrac{a_m(2)}{k_m}h(t)$ in the conclusion of the theorem. □
In Chapter 4, rates of convergence for the bias were discussed. The two examples which follow show how the results in Chapter 4 can be combined with Corollary 6.1.4.

6.1.5 Example. If $\hat h_m(t)$ is an expanded type-A estimator, and if $f$ and $\hat h_m$ satisfy the conditions of Corollary 4.4.2, then $A_m^{1-r}\,b(\hat h_m(t)) \to 0$ as $m \to \infty$. Also,

$$a_m(2) = \int\delta_m^2(x)\,dx = \frac{1}{2\pi}\int|\delta_m^\dagger(t)|^2\,dt = \frac{1}{2\pi A_m}\int g^2(t)\,dt.$$

This implies that if $f$ and $\hat h_m$ also meet the conditions of Theorem 6.1.3, and if $A_m^{2r-1}k_m$ is bounded for all $m$, then

$$\frac{b(\hat h_m(t))}{\Bigl[\dfrac{a_m(2)}{k_m}h(t)\Bigr]^{1/2}} = \bigl[A_m^{1-r}\,b(\hat h_m(t))\bigr]\left[\frac{2\pi A_m^{2r-1}k_m}{h(t)\int g^2(u)\,du}\right]^{1/2},$$

which tends to zero, and by Corollary 6.1.4,

$$\frac{\hat h_m(t) - h(t)}{\Bigl[\dfrac{a_m(2)}{k_m}h(t)\Bigr]^{1/2}} \xrightarrow{L} N(0,1).$$

6.1.6 Example. If $\hat h_m$ is an expanded type-E estimator, and if $f$ and $\hat h_m$ satisfy the conditions of Corollary 4.4.6, then $A_m^{-\eta}\,b(\hat h_m(t)) \to 0$ as $m \to \infty$, and $a_m(2) \sim \dfrac{1}{\pi\alpha}\log(1/A_m)$. Thus, if $\dfrac{A_m^{2\eta}k_m}{\log(1/A_m)}$ is bounded for all $m$, then

$$\frac{b(\hat h_m(t))}{\Bigl[\dfrac{a_m(2)}{k_m}h(t)\Bigr]^{1/2}} = \bigl[A_m^{-\eta}\,b(\hat h_m(t))\bigr]\left[\frac{\pi\alpha\,A_m^{2\eta}k_m}{h(t)\log(1/A_m)}\right]^{1/2},$$

which tends to zero, and, again by Corollary 6.1.4,

$$\frac{\hat h_m(t) - h(t)}{\Bigl[\dfrac{a_m(2)}{k_m}h(t)\Bigr]^{1/2}} \xrightarrow{L} N(0,1).$$
6.2 Asymptotic Joint Normality

In this section appropriate conditions are found so that, for distinct points $t_1, \ldots, t_p$, the random variables $\hat h_m(t_1), \hat h_m(t_2), \ldots, \hat h_m(t_p)$ are asymptotically jointly normal.

6.2.1 Theorem. If $f$ and $h$ are bounded, and $t_1, t_2, \ldots, t_p$ are distinct continuity points of $f$, and

$$\frac{a_m(2+\eta)}{k_m^{\eta/2}(a_m(2))^{1+\eta/2}} \to 0 \text{ for some } \eta > 0,$$

then

$$\frac{\sum_{q=1}^p a_q\bigl[\hat h_m(t_q) - E(\hat h_m(t_q))\bigr]}{\Bigl[k_m^{-1}\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_{q=1}^p a_q\delta_m(t_q-S_{1\ell})\bigr)\Bigr]^{1/2}} \xrightarrow{L} N(0,1)$$

for any set of constants $a_1, \ldots, a_p$.

Proof: For each $m$, let

$$Z_{ij}(m) = Z_{ij} = \frac{\sum_{q=1}^p a_q\,\delta_m(t_q-S_{ij}) - \sum_{q=1}^p a_q\,E(\delta_m(t_q-S_{ij}))}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_{q=1}^p a_q\delta_m(t_q-S_{1\ell})\bigr)\Bigr]^{1/2}}.$$
Then

$$\sum_{i=1}^{k_m}\sum_{j=1}^m Z_{ij} = \frac{\sum_{q=1}^p a_q\bigl[\hat h_m(t_q) - E(\hat h_m(t_q))\bigr]}{\Bigl[k_m^{-1}\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_q a_q\delta_m(t_q-S_{1\ell})\bigr)\Bigr]^{1/2}}$$

and

$$\sum_{i=1}^{k_m}\sum_{j=1}^m\operatorname{Var}(Z_{ij}) = \frac{\sum_{i=1}^{k_m}\sum_{j=1}^m\operatorname{Var}\bigl(\sum_q a_q\delta_m(t_q-S_{ij})\bigr)}{k_m\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_q a_q\delta_m(t_q-S_{1\ell})\bigr)} = 1.$$

Thus, by Theorem 6.1.1, it remains to show that

$$g_m(\varepsilon) = \sum_{i=1}^{k_m}\sum_{j=1}^m\int_{|z|\ge\varepsilon}z^2\,dG_{ij} \to 0 \text{ as } m \to \infty$$

for $\varepsilon > 0$, where $G_{ij} = G_{ij}(m)$ is the distribution of $Z_{ij}(m)$. As in the proof of 6.1.3, it is sufficient to show that $\dfrac{k_m}{\varepsilon^\eta}\sum_{j=1}^m E|Z_{1j}|^{2+\eta} \to 0$ as $m \to \infty$. Now,

$$\frac{k_m}{\varepsilon^\eta}\sum_{j=1}^m E|Z_{1j}|^{2+\eta} \le \frac{k_m(2p)^{1+\eta}}{\varepsilon^\eta}\,\frac{\sum_{j=1}^m\sum_{q=1}^p\bigl(E|a_q\delta_m(t_q-S_{1j})|^{2+\eta} + |E(a_q\delta_m(t_q-S_{1j}))|^{2+\eta}\bigr)}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_q a_q\delta_m(t_q-S_{1\ell})\bigr)\Bigr]^{1+\eta/2}}$$

by the $c_r$-inequality corresponding to the elementary inequality

$$\Bigl|\sum_{i=1}^n a_i\Bigr|^r \le n^{r-1}\sum_{i=1}^n|a_i|^r \quad\text{for } r \ge 1.$$

This last expression is dominated in turn by

$$\frac{k_m(2p)^{1+\eta}}{\varepsilon^\eta}\,\frac{\sum_{j=1}^m\sum_{q=1}^p 2\,E|a_q\delta_m(t_q-S_{1j})|^{2+\eta}}{\Bigl[k_m\sum_{\ell=1}^m\operatorname{Var}\bigl(\sum_q a_q\delta_m(t_q-S_{1\ell})\bigr)\Bigr]^{1+\eta/2}}. \tag{6.2.1}$$

In the denominator of (6.2.1),

$$\sum_{\ell=1}^m\operatorname{Var}\Bigl(\sum_{q=1}^p a_q\delta_m(t_q-S_{1\ell})\Bigr) = \sum_{q=1}^p a_q^2\sum_{\ell=1}^m\operatorname{Var}(\delta_m(t_q-S_{1\ell})) + 2\sum_{q_1<q_2}a_{q_1}a_{q_2}\sum_{\ell=1}^m\operatorname{Cov}\bigl(\delta_m(t_{q_1}-S_{1\ell}),\,\delta_m(t_{q_2}-S_{1\ell})\bigr).$$

Now, this last expression, if multiplied by $\dfrac{1}{a_m(2)}$, tends to $\sum_{q=1}^p a_q^2\,h(t_q)$ as $m \to \infty$, by Theorems 2.3.11 and 2.3.17, so that, if the denominator of (6.2.1) is denoted by $\gamma_m$,

$$\gamma_m \sim \Bigl[k_m\,a_m(2)\sum_{q=1}^p a_q^2\,h(t_q)\Bigr]^{1+\eta/2}.$$

In the numerator of (6.2.1),

$$\sum_{q=1}^p\sum_{j=1}^m a_q^{2+\eta}\,E|\delta_m(t_q-S_{1j})|^{2+\eta} = \sum_{q=1}^p a_q^{2+\eta}\int|\delta_m(t_q-u)|^{2+\eta}\sum_{j=1}^m f_j(u)\,du \sim \sum_{q=1}^p a_q^{2+\eta}\,a_m(2+\eta)\,h(t_q)$$

by Lemma 6.1.2 and Theorem 2.3.11. Thus, if the whole of expression (6.2.1) is denoted by $\Lambda_m$,

$$\Lambda_m \sim \frac{2(2p)^{1+\eta}\sum_{q=1}^p a_q^{2+\eta}\,h(t_q)}{\varepsilon^\eta\Bigl[\sum_{q=1}^p a_q^2\,h(t_q)\Bigr]^{1+\eta/2}}\;\frac{a_m(2+\eta)}{k_m^{\eta/2}(a_m(2))^{1+\eta/2}},$$

which tends to zero as $m \to \infty$, so that $g_m(\varepsilon) \to 0$ and the result follows. □
6.2.2 Corollary. Under the conditions of Theorem 6.2.1,

$$\frac{\sum_{q=1}^p a_q\hat h_m(t_q) - E\bigl(\sum_{q=1}^p a_q\hat h_m(t_q)\bigr)}{\Bigl[\dfrac{a_m(2)}{k_m}\sum_{q=1}^p a_q^2\,h(t_q)\Bigr]^{1/2}} \xrightarrow{L} N(0,1).$$

Proof: It was shown in the proof of Theorem 6.2.1 that

$$\frac{1}{a_m(2)}\sum_{\ell=1}^m\operatorname{Var}\Bigl(\sum_{q=1}^p a_q\delta_m(t_q-S_{1\ell})\Bigr) \to \sum_{q=1}^p a_q^2\,h(t_q),$$

from which the result follows. □
6.2.3 Corollary. Under the conditions of Theorem 6.2.1, and if

$$\frac{b(\hat h_m(t_q))}{[\alpha_m(2)/k_m]^{1/2}} \to 0 \quad\text{for } q = 1, 2, \ldots, p,$$

then

$$\frac{\sum_{q=1}^{p} a_q\bigl(\hat h_m(t_q) - h(t_q)\bigr)}{\bigl[\frac{\alpha_m(2)}{k_m} \sum_{q=1}^{p} a_q^2\, h(t_q)\bigr]^{1/2}} \to N(0,1).$$

Let $\hat Y_m = (\hat h_m(t_1), \ldots, \hat h_m(t_p))'$, $y = (h(t_1), \ldots, h(t_p))'$, $A = (a_1, \ldots, a_p)$, and

$$Z_m = \Bigl[\frac{k_m}{\alpha_m(2)}\Bigr]^{1/2} (\hat Y_m - y).$$
Also, let $H$ be the $p \times p$ matrix which is all zeros, except for diagonal elements $h(t_1), \ldots, h(t_p)$. Then the conclusion of Corollary 6.2.3 can be rewritten as $A Z_m \to N(0, A H A')$ for every $A$. Application of the Cramér-Wold device produces the following theorem.

Theorem 6.2.4. If $f$ and $h$ are bounded, $t_1, t_2, \ldots, t_p$ are distinct continuity points of $f$,

$$\frac{\alpha_m(2+\eta)}{k_m^{\eta/2}\,(\alpha_m(2))^{1+\eta/2}} \to 0 \quad\text{for some } \eta > 0,$$

and if $b(\hat h_m(t_q))/[\alpha_m(2)/k_m]^{1/2} \to 0$ for $q = 1, 2, \ldots, p$, then, using the notation introduced above,

$$Z_m \to N(0, H). \qquad \square$$

Sufficient conditions for the assumption $b(\hat h_m(t_q))/[\alpha_m(2)/k_m]^{1/2} \to 0$ were discussed earlier in Examples 6.1.5 and 6.1.6.
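As a concrete illustration of the rate condition (a sketch, assuming $\alpha_m(\nu) = \int |\delta_m(u)|^{\nu}\,du$, which is consistent with the treatment of the numerator of (6.2.1) above), take the standard normal kernel with bandwidth $\lambda_m$ used in Chapter 7:

```latex
\delta_m(u) = \frac{1}{\lambda_m\sqrt{2\pi}}\,e^{-u^2/2\lambda_m^2},
\qquad
\alpha_m(\nu) = \int_{-\infty}^{\infty}\delta_m(u)^{\nu}\,du
             = \frac{(2\pi)^{(1-\nu)/2}}{\sqrt{\nu}}\;\lambda_m^{1-\nu},
\qquad\text{hence}\qquad
\frac{\alpha_m(2+\eta)}{k_m^{\eta/2}\,(\alpha_m(2))^{1+\eta/2}}
  = \frac{C(\eta)}{(k_m\lambda_m)^{\eta/2}}
```

for a constant $C(\eta)$ depending only on $\eta$; for this kernel the condition of Theorem 6.2.4 thus amounts to $k_m\lambda_m \to \infty$.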
CHAPTER 7
EXAMPLES, FURTHER RESEARCH
7.1 Examples
This section presents examples of the compact and expanded estimators of h computed from simulated data with three different distributions. The estimators are the same in each case in that they use a standard normal kernel, with $m = 8$, $k_m = 64$ and $\lambda_m = .32$. That is,

$$\hat h_m(t) = \hat h_8(t) = \frac{1}{64} \sum_{i=1}^{64} \sum_{j=1}^{8} \frac{1}{.32\sqrt{2\pi}} \exp\Bigl[-\frac{1}{2}\Bigl(\frac{t - S_{ij}}{.32}\Bigr)^2\Bigr].$$
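This computation is easy to reproduce; a minimal modern sketch (names are illustrative, and the simulated exp(1) lifetimes are an assumption matching the first example below):

```python
import math
import random

def compact_estimate(t, m=8, k_m=64, lam=0.32, seed=1):
    """Compact kernel estimate (1/k_m) * sum_{i,j} delta_m(t - S_ij)
    from simulated exp(1) lifetimes, with a normal kernel of bandwidth lam."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(k_m):                 # k_m independent blocks
        s = 0.0
        for _ in range(m):               # partial sums S_i1, ..., S_im within a block
            s += rng.expovariate(1.0)    # one simulated lifetime X
            u = (t - s) / lam
            total += math.exp(-0.5 * u * u) / (lam * math.sqrt(2.0 * math.pi))
    return total / k_m

print(compact_estimate(2.0))  # near 1, the true renewal density for exp(1) data
```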
The asymptotic theorems in previous chapters are of limited use in deciding on the values of $m$, $k_m$, and $\lambda_m$ to use in an isolated case. The values used in the examples were chosen somewhat arbitrarily. Various possibilities were tried, and the choice was made according to rather subjective considerations about what qualities were desirable in the estimator. These considerations are described below. Clearly, this is an area where further research is needed. Also, the decisions on values for $m$, $k_m$, and $\lambda_m$ were made partly by comparing the estimates with the known true renewal density, which would not be possible in real applications.
If one has only a certain number of observations (or, in the case of the examples, a certain amount of computer space), one must balance $m$ against $k_m$. If $m$ is small, the estimator loses accuracy for larger values of $t$, where most of the contributions to h come from the higher convolutions of f. If $k_m$ is too small, the estimator suffers from having too few observations of each of the convolutions.

A value for $\lambda_m$ must also be chosen. If $\lambda_m$ is large, then $\delta_m$ has a large variance, and the estimator is smooth. It is possible for the estimator to be "too smooth," so that real features of the renewal density are glossed over. If $\lambda_m$ is small, $\delta_m$ has a small variance, and if $\lambda_m$ is very small, each observation produces its own peak in the estimate. It is possible for the estimator to be "too rough," if $\lambda_m$ is small enough that there is a large variability inherent in the estimator, and this variability obscures the real variations in h.
The first example uses simulated exponential data with mean one. Here $h(t) = 1$. Figure 1 shows the compact and expanded estimates of h along with $h_8(t)$, the sum of the first eight convolutions of $f(t) = e^{-t}$. Figure 1 shows clearly how $\hat h_m$ follows $h_m$ rather than h, which is a limitation of the estimator, though one which can be overcome to some extent. Since $h(t) \to 1/\mu$ as $t \to \infty$, under appropriate conditions, one can estimate $1/\mu$ by $1/\bar X$ and use this as an estimate of $h(t)$ for large $t$. This approach will be discussed in the next section. Figure 1 also shows the variation which is due to random clustering of observations, and which is characteristic of the estimator.
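The gap between $h_m$ and h visible in Figure 1 can be reproduced directly: for exp(1) lifetimes, $f_j$ is the gamma(j) density, so $h_8(t) = \sum_{j=1}^{8} t^{j-1}e^{-t}/(j-1)!$; a quick sketch of the truncation effect (not of the kernel estimator itself):

```python
import math

def h_m(t, m=8):
    """Sum of the first m convolutions of f(t) = e^{-t}, i.e. gamma(j) densities."""
    return sum(t ** (j - 1) * math.exp(-t) / math.factorial(j - 1)
               for j in range(1, m + 1))

print(h_m(3.0))   # close to the true renewal density h(t) = 1
print(h_m(12.0))  # well below 1: the truncation at m = 8 dominates
```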
The Erlang data used in the second example was produced by adding pairs of independent simulated exponential observations. The probability density function is $f(t) = te^{-t}$, and $h(t) = \frac{1 - e^{-2t}}{2}$. Figure 2 compares compact and expanded estimates with the true renewal density. The estimates are positive at $t = 0$, where h is actually zero. This is due to the estimates' "spilling over" into the region $t < 0$. With the normal kernel, this is unavoidable. With any kernel, unless $f(t)$ is zero in a neighborhood of zero and $\lambda_m$ is chosen small enough, the estimate may be positive for negative values of t.
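The closed form $h(t) = (1 - e^{-2t})/2$ can be checked numerically: the j-th convolution of $f(t) = te^{-t}$ is the gamma(2j) density, and the convolution series sums to $e^{-t}\sinh t$. A sketch of that check:

```python
import math

def h_erlang(t, terms=60):
    """Renewal density for Erlang(2,1) lifetimes as a convolution series:
    sum over j of the gamma(2j) density t^(2j-1) e^(-t) / (2j-1)!."""
    return sum(t ** (2 * j - 1) * math.exp(-t) / math.factorial(2 * j - 1)
               for j in range(1, terms + 1))

for t in (0.5, 1.0, 4.0):
    print(t, h_erlang(t), 0.5 * (1 - math.exp(-2 * t)))  # the two columns agree
```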
The third example was computed from simulated data uniform over (2,3). The renewal density is somewhat more complicated in this case, since

$$f_j(t) = \frac{1}{(j-1)!} \sum_{\ell=0}^{k-1} (-1)^{\ell} \binom{j}{\ell} \{t - (2j + \ell)\}^{j-1} \quad\text{for } 2j + k - 1 < t < 2j + k,$$

for $k = 1, 2, \ldots, j$, and zero otherwise.
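The formula can be checked against a case where the convolution is elementary: for $j = 2$, the sum of two uniform(2,3) lifetimes has the triangle density on (4,6). A sketch of that check (the Irwin-Hall form of the displayed formula, shifted by 2j):

```python
import math

def f_j(t, j):
    """j-th convolution of the uniform(2,3) density."""
    x = t - 2 * j                # shift: x is a sum of j uniform(0,1) variables
    if not 0 < x < j:
        return 0.0
    k = math.floor(x) + 1        # piece index: k - 1 < x < k
    return sum((-1) ** l * math.comb(j, l) * (x - l) ** (j - 1)
               for l in range(k)) / math.factorial(j - 1)

# Triangle density on (4,6): t - 4 rising, 6 - t falling.
print(f_j(4.5, 2))  # 0.5
print(f_j(5.5, 2))  # 0.5
```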
Figure 3 shows the compact and expanded estimates and the renewal density. In the previous examples, h is so smooth that only the deficiencies of $\hat h_m$ are really apparent. In this example, where h varies a great deal, it is satisfactory to note that $\hat h_m$ follows h well, and that the inherent variability of the estimator, which is so noticeable in the previous examples, becomes a secondary characteristic.

Note that in none of the three examples do the compact and expanded estimates exhibit any striking differences in character. Since the expanded estimator uses many more observations, it would probably be seldom used. This provides motivation for the pursuit of analogous theorems for the many results in previous chapters which apply only to the expanded estimator.
[Figure 1: plot not reproduced.]
Figure 2. h and Estimators - Erlang Lifetimes. [Plot not reproduced.]
[Figure 3: plot not reproduced.]
7.2 Further Research
It is possible to construct a U-statistic version of $\hat h_m$ by calculating $\hat h_m$ for all permutations of the observations $\{X_i\}$ and taking their average. This is appealing, since the observations seem to be used more fully. The new version will have a smaller variance, but will have even more internal dependence than the compact estimator, with similar, but worse, difficulties in obtaining results.

In the compact estimator, $mk_m$ observations of X are used, so that there would be $(mk_m)!$ permutations of the observations for which to calculate $\hat h_m$. However, the size of the task, which at first seems overwhelming, is partly illusory. The U-statistic version reduces to an average over distinct combinations rather than over all permutations. This version gains most in the higher convolutions, where there are more possibilities for choosing $X_i$'s to form $S_{ij}$'s. The largest number of combinations which must be dealt with is $\binom{mk_m}{m}$, which is, however, still formidable.
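The magnitudes involved are easy to make concrete for the values used in the examples ($m = 8$, $k_m = 64$):

```python
import math

m, k_m = 8, 64
n_obs = m * k_m                   # observations used by the compact estimator
perms = math.factorial(n_obs)     # (mk_m)! permutations
combs = math.comb(n_obs, m)       # largest combination count, (mk_m choose m)

print(len(str(perms)))  # (mk_m)! has over a thousand digits
print(combs)            # still formidable, but vastly smaller
```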
The fact that the joint distributions of the estimators $\hat h_m$ converge (under appropriate normalizations) to the product of independent normal distributions suggests that $\hat h_m(x)$ be regarded as a stochastic process, and full weak convergence to an appropriate Brownian motion considered. Indeed, Bickel and Rosenblatt (1973) proved stronger forms of such convergence in the case of probability density estimation, and similar results should be possible for the renewal density case considered here.
7.3 Alternate Estimators

One possibility for further research is the alternate approach to estimating $h(t)$ mentioned in the introduction. The j-th convolution of f, $f_j(t)$, could be estimated using the j-th convolution of $\hat f_{1,n}(t)$. Let $\hat f_{j,n}(t)$ denote this convolution. Then the estimator of h would be

$$\hat h_n(t) = \sum_{j=1}^{\infty} \hat f_{j,n}(t).$$
A
These convolutions of f ' :(t) (!an De- \rritten in terms of convolutions
l,n'
of 0 •
n
For instance,
<5
1
= 2"
n
n
n
(t -x -X. ) dx
n
E
E 0 2 Ct-X. -X. )
i =1 i =1 n,
11 1 2
1
2
1
2
145
where on' 2 is the second convolution of o'n'
By similar calculations,

$$\hat f_{j,n}(t) = \frac{1}{n^j} \sum_{i_1=1}^{n} \cdots \sum_{i_j=1}^{n} \delta_{n,j}(t - X_{i_1} - \cdots - X_{i_j})$$

and

$$\hat h_n(t) = \sum_{j=1}^{\infty} \frac{1}{n^j} \sum_{i_1=1}^{n} \cdots \sum_{i_j=1}^{n} \delta_{n,j}(t - X_{i_1} - \cdots - X_{i_j}).$$
This is a much more complicated estimator, even if it is modified to include only a finite number of convolutions. For example, to calculate the mean of this $\hat h_n(t)$, the expectations of the $\delta_{n,j}$'s must be calculated. The simplest of these are $\delta_{n,2}(t - X_{i_1} - X_{i_2})$ and $\delta_{n,2}(t - 2X_i)$. Now

$$E\bigl(\delta_{n,2}(t - X_{i_1} - X_{i_2})\bigr) = \iiint \delta_n(x)\, \delta_n(t - y - z - x)\, f(y)\, f(z)\, dx\, dy\, dz.$$

One would expect that, under suitable conditions, this integral converges to $\int f(x) f(t - x)\, dx = f_2(t)$.
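This expectation can be probed numerically: with a normal kernel, $\delta_{n,2}$ is itself a normal density, so the expectation is $f_2$ smoothed at scale $\lambda\sqrt{2}$. A sketch, assuming exp(1) lifetimes (so $f_2(s) = se^{-s}$) and a normal $\delta_n$ with bandwidth $\lambda$:

```python
import math

def smoothed_f2(t, lam=0.05, ds=0.002, upper=20.0):
    """Integral of delta_{n,2}(t - s) * f_2(s) ds, with f_2(s) = s e^{-s}
    and delta_{n,2} the N(0, 2 lam^2) density (normal kernel convolved with itself)."""
    sd = lam * math.sqrt(2.0)
    total, s = 0.0, ds / 2.0
    while s < upper:                      # midpoint Riemann sum over s
        kern = math.exp(-0.5 * ((t - s) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += kern * s * math.exp(-s) * ds
        s += ds
    return total

t = 1.5
print(smoothed_f2(t), t * math.exp(-t))  # nearly equal for small lam
```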
Similarly,

$$E\bigl(\delta_{n,2}(t - 2X)\bigr) = \iint \delta_n(x)\, \delta_n(t - 2y - x)\, f(y)\, dx\, dy
= \iint \delta_n(x - y)\, \delta_n(t - x - y)\, f(y)\, dy\, dx.$$

Since the inner integral tends to zero, one would expect this integral to tend to zero under suitable conditions. If so, this would seem to indicate that $E\,\delta_{n,j}(t - X_{i_1} - \cdots - X_{i_j})$ tends to $f_j(t)$ or to 0, depending on whether $i_1, i_2, \ldots, i_j$ are distinct or not.
(This would also seem to indicate that $\hat h_n(t)$ might as well be limited to the first n convolutions.) This combinatorial problem, along with the difficulties of the multiple integrals encountered, are the reasons why this approach was not pursued.
Since $\hat h_m$ estimates $h_m$ rather than h, it is not accurate for large t. However, since $h(t) \to 1/\mu$ as $t \to \infty$ if $f(x) \to 0$ as $x \to \infty$ and $\int |f(x)|^p\, dx < \infty$ for some $p > 1$ (Smith [1958]), one can use $1/\bar X$ (or some other estimator of $1/\mu$) to estimate $h(t)$ for large t. This raises the question of when t is "small" and when "large," or where to switch from one estimator to the other. This is another subject for further research.
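The large-t recipe is simple to sketch: estimate $\mu$ by the sample mean and use $1/\bar X$ for $h(t)$ at large t (exp(1) data assumed here, so that $h \equiv 1$; names are illustrative):

```python
import random

def large_t_estimate(n=2000, seed=7):
    """Estimate lim h(t) = 1/mu by the reciprocal of the sample mean."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]
    return len(xs) / sum(xs)  # 1 / sample mean

print(large_t_estimate())  # near 1 = 1/mu for exp(1) lifetimes
```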
Another approach which might sometimes be appropriate would be to estimate h over only a fixed interval, say (0,T). Let $n_1 = n_1(T)$ be the smallest integer such that $X_1 + \cdots + X_{n_1} > T$. Let $n_2$ be the smallest integer such that $X_{n_1+1} + \cdots + X_{n_2} > T$, etc. Let

$$S_{i,j} = X_{n_{i-1}+1} + \cdots + X_{n_{i-1}+j} \qquad (n_0 = 0)$$

and

$$\hat h_k(t) = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=1}^{n_i - n_{i-1}} \delta_k(t - S_{i,j}) \qquad\text{for } t \in (0,T).$$
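The blocking scheme can be sketched in code: cut the observation stream into consecutive blocks whose partial sums first exceed T, and record within each block the partial sums falling in (0,T] (illustrative names; exp(1) lifetimes assumed, and the final exceeding sum is dropped for simplicity):

```python
import random

def blocks_for_interval(T, k, seed=3):
    """Return, for each of k blocks, the partial sums S_{i,j} <= T;
    each block consumes fresh observations until its sum exceeds T."""
    rng = random.Random(seed)
    out = []
    for _ in range(k):
        sums, s = [], 0.0
        while s <= T:                  # stop once the block's sum exceeds T
            s += rng.expovariate(1.0)  # next lifetime X
            if s <= T:
                sums.append(s)         # a renewal epoch S_{i,j} inside (0, T]
        out.append(sums)
    return out

blocks = blocks_for_interval(T=5.0, k=4)
print([len(b) for b in blocks])  # the number of observations per block is random
```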
The number of observations needed for $\hat h_k$ is random, and it is difficult to envisage an expanded version of the estimator, but the values of t for which $\hat h_k$ is "accurate" are clearly specified.
References

1. Bickel, P.J. and Rosenblatt, M. (1973). "On some global measures of the deviation of density function estimators," Annals of Statistics, 1, 1071-1095.

2. Billingsley, P. (1968). Convergence of Probability Measures. Wiley.

3. Buck, R.C. (1965). Advanced Calculus. McGraw-Hill.

4. Cacoullos, Theophilos (1966). "Estimation of a Multivariate Density," Annals of Mathematical Statistics, 179-190.

5. Johnston, Gordon (1979). "Some Results on the Watson Estimator of the Regression Function," Ph.D. thesis, UNC-CH.

6. Leadbetter, M.R. (1963). "On the Non-Parametric Estimation of Probability Densities," Ph.D. thesis, UNC-CH.

7. Loève, M. (1960). Probability Theory. Van Nostrand Company, Princeton, New Jersey.

8. Nadaraya, E.A. (1965). "On Non-Parametric Estimates of Density Functions and Regression Curves," Theory of Probability and Its Applications, 10, 186-190.

9. Parzen, E. (1962). "On Estimation of a Probability Density Function and Mode," Annals of Mathematical Statistics, 33, 1065-1076.

10. Rosenblatt, M. (1956). "Remarks on Some Nonparametric Estimates of a Density Function," Annals of Mathematical Statistics, 27, 832-837.

11. Schuster, E.F. (1969). "Estimation of a Probability Density Function and its Derivatives," Annals of Mathematical Statistics, 40, 1187-1195.

12. Smith, W.L. (1954). "Asymptotic Renewal Theorems," Proceedings of the Royal Society of Edinburgh, A, 64, 9-48.

13. Smith, W.L. (1958). "Renewal Theory and its Ramifications," Journal of the Royal Statistical Society (B), 20, 243-284.

14. Van Ryzin, J. (1969). "On Strong Consistency of Density Estimates," Annals of Mathematical Statistics, 40, 1765-1772.

15. Watson, G.S., and M.R. Leadbetter (1963). "Hazard Analysis II," Sankhyā (A), 26, 101-116.

16. Wegman, E.J. (1972). "Nonparametric probability density estimation I: A summary of available methods," Technometrics, 14, 533-546.