Nychka, Douglas and A. Ronald Gallant.; (1984)Density in a Censored Regression Model."

CONSISTENT ESTIMATION OF THE PARAMETERS AND ERROR
DENSITY IN A CENSORED REGRESSION MODEL
by
Douglas Nychka and A. Ronald Gallant
Institute of Statistics Mimeograph Series No. 1642
June 1984
Consistent Estimation of the Parameters and Error
Density in a Censored Regression Model
by
Douglas Nychka and A. Ronald Gallant
Department of Statistics
North Carolina State University
1.
Introduction
The underlying model in this discussion has the form:
(1.1)
fl(X,e) + Ul
Yl
YZ = fZ(X,e) + Uz
where (U ' U )
Z
l
function h.
=
U is a mean zero random vector with probability density
f l and f Z are spec1. f·1e d ,cont1nuous
.
f unct10ns
.
0 f t h e exogeneous
variable X and the parameters e.
e and h are unknown.
The interest in this paper concerns estimating e and h when Y is not
observed directly.
We consider the data, z, where
(1. Z)
Thus, Yl is only observed conditional on the event
{YZ>O}.
Yz is never observed explicitly; only its sign is known.
Also, note that
This particular
observational model has a variety of applications in economics, psychology
and education.
One important example arises in labor economics for evaluating
training programs where the second equation represents a selection rule such
as voluntary selection or selection by program administrators.
Using this observational model, we propose consistent estimates for 8 and h
based on the maximum likelihood criterion.
In order to define these estimates
we first give the probability density for z and state the assumptions made on
8 and h.
-Z-
Let
~
°v
denote Lebesque measure and
v unit mass.
be a measure that gives the point
Then the joint distribution of (zl' zZ) is absolutely continuous
with respect to the product measure (~ + 00) x (00 + 01).
Using a conditioning
argument, the log density of z with respect to this dominating measure is
(1. 3)
Given n independent observations the log-likelihood has the form:
We will assume that
e
E
8, h
H where e
E
(usually a closed and bounded set in
(m+l) order Sobelev space on
Z
m.
mk )
-+
H is a bounded subset of an
Finally let
pK a polynomial of degree K in ]RZ and FK
We will assume that K = K
n
and
is a compact metric space
""as n
-+ "".
=
~
if: f
be a positive weight function,
= (l~)z}.
Then the estimates
Set
(6 n ,h n he
x H
k
satisfy:
t n (8 n ,h n )
max
(e,h)E
In order to define
H we
e
x
HK
t
n
(e ,h) .
n
first present some notation concerning function
spaces.
')
Define the differential operator for functions on m~ as
We will use the Sobolev Spaces (m th order):
-3-
( 1.4b)
2
Note that WO,2(A) = L (A) and in general wm,2(A) is a Hilbert
for m > 0.
Space with respect to the inner product:
<f,g>
wm,
(see Adams, 1976).
2
(A)
=
Ipf~m
p
p
<D f,D g> 2
L (A)
To simplify notation, in this paper when A
.
. .1n (14b)
o f ten om1t
t h e d oma1n
.:
Wm,2
lR 2 we will
=
__WJII,2(lR 2 ).
For R > 0 let
Then we have,
Thus H is a set of density functions with zero mean and whose tail behavior
lies between the extremes of
h
£
~L
and
~H'
Moreover, the restriction of
B is an implicit assumption on the smoothness of h.
R
limits the amount of oscillation in h and its derivative.
that
F
Ih
E
wm+ l ,2
This constraint
The requirement
is necessary to insure that the weighted polynomials in
can be used to approximate functions in
H.
(see Lemma 4.2)
Let e* and h* denote the true values of e and h.
Under conditions A-C
(see Section 2) which insure that the parameters are identifiable and assuming
conditions I-III which concern the Ceasero summability of {Uk,Xk}K =1
1
the tail behavior of the densities in
H (see Section 3).
00
and
'
We prove:
Theorem 1.1
Wm,2(lR 2 ).
e and
OOu u
n
where the closure is taken with respect to
k=l k
I f m ;;; 2 then e -+- e* a.s. inS
Suppose e*
E
h*
£
n
-4-
(1.4c)
We note that the convergence with respect to mth order Sobelev norm
is quite strong.
For example (1.4b) implies that
I (D P~h n )(U)
sun..
UEJRL
for all P such that 0 ~
- (D P h)(U) I
Ipi
~
~ m-2 .
functional on ~,2(m2) then
I
0
a.s.
In general, if A is any continuous
A(h ) - A(h*)1 ~ 0
n
a.s.
The main difficulty in interpreting this theorem is relating
H.
Although in Section 4 we show that
H, this is not enough to guarantee that
~U
k=l
~U
F
k
H will also be dense.
k=l K
H.)
( The problem
~
H = FKn H and in general, U H may not
k
k=l k
In order to obtain consistency for all functions in H we
arises in taking the intersection:
be equal to
&5
U H to
K=l k
is a dense set of functions in
consider a slightly different estimate.
Let
R be
the closure of H in
~,2
and let Q: ~,2 ~ Hbe the projection operator such that
Q(g) = h (=> Ilg-hll m 2 = mi!!. Ilf-hll
2.
W,
fEH
Wm,
(l.5)
Thus, Q in this case is the best approximation to f by
the constraints in H.
Since
H is
a density satisfying
a closed, convex set in ~,2 such an operator
is well-defined (see Rudin, 1974, p. 83).
Now define the estimates e ,h
n
g
E
F
K
n
such that e
n
E8
h
Q(g) for some
n
and
n
l
n
(8 n ,h n )
f. (e,Q(f»
n
.
Using this estimate we prove.
Theorem 1.2
Suppose
then en ~ e*
F~
wm+l,2(m 2 ) and m ~ 2.
a.s. in 8
and ~ ~h* a.s. in ~,2
If h*
E
Hand e*
E
e
-5-
The next section gives conditions under which e,h are identifiable and
the expected log-likelihood has a unique maximum.
Section 3 proves Theorem 1.1
and the following section proves Theorem 1.2.
2.
Identifiability of the Parameters
Let X, denote the space containing the exogenous variable x and let v be
a measure on X that is the weak limit of the empirical distributions of
Identifiability conditions:
Let
e,e'
1::8.
If
A)
fl(x,e')
f
then
B)
2
(x,e')
a.e.
v
e=e'.
If fZ(x,e) = ~(f2(x,e')) a.e. v for some monotone transformation ~ then
fZ(x,e) = fZ(x,e').
C)
Let B be any rectangle in ]RZ and let e
The
E
~9 .
later two conditions may appear unusual and require some comment.
In B), the invariance of f
Z
under monotone functions is necessary because
yz is never observed directly, only its sign is known.
To illustrate the problem encountered in this situation suppose
e, Xc;;
lR and f(x,e) = xe.
Then,
However, if we make the change of variables w
value
= cu
where c is some arbitrary
-6-
1
c
-h(w)dw
1
where 6'= 6c , h' = -c h
Thus, the parameters of the model can be varied without changing the probability
distribution of zZ.
Condition C insures that the distribution of the exogenous variables is
rich enough so that (f (x,6), f (x,6»
l
Z
will trace out the support of
h(~).
Without this requirement, the error density might only be identifiable on a
region smaller than its support.
If X
~ ~l
and v is dominated by Lebesque
measure then condition C) implies that for fixed 6
{f (x,6), f (x,6)}: X
l
Z
-+
the map
E (9
~Z is onto Le. for any u E ~Z there is an x EX
such that (f (x,6), f (x,6»
l
Z
= u .
The following two theorems address the identifiability and uniqueness
of the maximum likelihood estimates of the parameters and error density.
For
technical reasons that will be clearer in Section 3 (see Lemmas 3.1 and 3.Z) we
state these theorems for h
to Wm,Z.
E
H,
where the closure of
H is
taken with respect
Although elements in the closure will not be contained in B they
R
will still be densities with zero mean.
Theorem Z.l
Suppose conditions A-C hold.
Let h,h'
E
H,
6,6'
E
9.
If
(Z.1)
a. e. for z
then h
E
~
X
{O, I},
X
E
X
h' and 8=8'.
Proof
Using the fact that log is monotone (Z.l) implies
-7-
(z.Z)
and
(Z.3)
Note that (Z.3) can be rewritten as
(Z.4)
where HZ and H; are the marginal distribution functions for the second variable
in hand h'.
Thus we have
since H;-lo HZ is a monotone transformation by ~ fZ(x,e) = fZ(x,e') • Now
referring to (Z.Z) by C we can conclude that these integrals are equal for
any interval.
Thus
where 6 = fl(x,e) - fZ(x,e').
Thus, h,h' can only differ by a shift in location.
However, since both densities have zero mean 6
=0
or fl(x,e)
=
fZ(x,e').
Therefore h = h' and by A, e = e' .
QED
Next we show that the expected value of the log-likelihood of z has a
unique maximum at the true parameters.
Theorem Z.Z
Let h* and e* denote the true values of the parameters and P *
e
the conditional probability measure for z given x.
Assume v(X)
<
00
*(. Ix)
,h
•
-8-
I f e*
E
e,
h*
s(e,h) =
E
iT
then
fxf m x
{O,l} ge,h (z,x) dPe*,h* (zlx)dv(x)
(2.5)
achieves a unique maximum at e = e* and h = h*.
Proof:
Since -log is strictly convex by the usual application of Jensen's
inequality:
which gives,
o
~
fmx
{O l}g * *(z,x)dP * *(zlx),
e,h
e ,h
fm x
e
E
{O , I} g e, h ( z , x) dP e* , h * ( z Ix)
'B, h e: H and a. e. on
X
with equality holding only if ge,h = ge*,h*
Thus S(e,h) ~ S(e*,h*) Vee:
h e:
H
Moreover, when this maximum
is attained from the remarks above and Theorem 2.1,
QED
e = e*, h = h* .
3.
Consistency of estimates
We first state the necessary assumptions for Theorem 1.1.
Condition I (Ceasaro Summability)
If b(u,x) is a continuous function on
£
1
b(u,x)
n t=l
t t
m2
x
Wthen
~ f f 2 b(u,x)h(u)dudv(x)
X m
Condition II (Continuity of likelihood)
-9Let
sup
aee
Condition I I I (Continuity of expected log-likelihood)
J
X
<
M(x) dv(x)
00
The main idea behind the proof of Theorem 1.1 is to first prove consistency
--
for the estimate a ,h
n
In(8
n
,h )
n
where
n
= max
_ l
6e:g ,he:H
n
(a,h)
(3.1a)
Note that the only difference between (8
mization in (3.1~ is defined over all of
n
,hn )
and
Hrather
(6 n ,hn )
than
HK.
is that the maxiThe next step is
to show that the difference between these two estimates converges almost
surely to zero.
Hence, the consistency of the original estimates follows.
The next lemma gives a compactness property for
H.
This result and
Lemmas 3.2-3.4 will be used to establish the consistency of (8
n
,hn ).
Lemma 3.1
H is
Proof:
precompact in
Let {hn}n=l , 00 ~
wm,2(~2)
H.
We will show that there exists a subsequence of
{h } that is a Cauchy sequence.
n
Since
~,2 is a complete space this sub-
sequence must have a limit which lies in the closure of
precompact.
H.
Hence,
H is
-10-
(3.1b)
and let
f
J
be the operator that restricts a function with domain on R
one with domain on A .
J
By Theorem 6.2 I, IV (Adams, 1975) f
2
to
: ~1,2(R2)
J
+
wm,2(A )
J
is a compact embedding, where ~1,2(A) is the completion of Co (A) with respect
co
to the norm for
wm+ l ,2(A).
Moreover, when
(Adams, 1976, p. 56) the embedding from
wm+ l ,2. Therefor~ H
H ~ B and thus is a bounded set in
R
wm,2(A ).
J
wm+
A = R 2 , ~1,2
= Wm+l,2 and thus
0
l ,2 into Wm,2 is also compact.
is pre-compact in
Using this result we can extract a subsequence;
converges to a limit in fl(H).
can now be extracted from {h
{h } that
n
Similarly, a convergent subsequence on wm,2(A )
2
(1)
K
{h~l)} ~
}.
In general, we can choose
{
h
(J)
K
} such that
{h~J)} ~ {h~J-l)} and {h~} has a limit in wm,2(AJ ).
Set g.= h(j)
]
j
Note that this sequence is obtained from the diagonal entries
when these subsequences are arranged in a table.
Clearly, {gj} ~ {h } and the
n
proof will be completed by showing that {gj} is Cauchy.
Let a
J
£ wm+l,2 with
o
1
Take £
> O.
For 0
+
<J
~
II a J (g'l
]
jl
~
j2
- g.] 2)
From the definition of
~M
<
co
II wrn, 2
it is straightforward to show
(3. lc)
Since
J +
ce.
~M
), by the .
dom~nated convergence theorem
£ Wm+l ' 2 (R2
We will choose J such that (3.lc) is bounded by £/2.
II a J~M II
Wm,
Now {h J }
2
Cauchy sequence in wm,2(A ) and by construction fJ(gj) £ {h~} provided j
J
+
0 as
is a
> J.
-11-
Thus there is an M < ~ such that M ~ J and jl' j2
term on the RHS of (3.1c) is bounded by E/2.
~
M implies that the first
Thus (3.1c) is bounded by E
for jl' j2 ~ M and therefore {gjI is Cauchy sequence.
QED
Let S (S,h) = 1 l (S,h) and take S(e,h) as defined in (2.5).
n
n n
Lemma 3.2
(Gallant, 1984)
Suppose e x
H
is compact in the usual product topology, S(e,h) has a
unique maximum at (e*,h*) and S is continuous one x
sup
Is (S,h) - s(e,h)1
e,hEexH
n
then (S- ,h- )
n n
+
+
0
(3.2)
(S * ,h * ) in the product topology on e x H
H is compact the sequence (8 n ,hn ) will have at least one limit
Since e x
point.
Suppose (eO,h ) is such a limit point and
O
converging to it.
(6 h)
m m' m
If
a.s.
Proof.
s
H.
~
(6 m,h m)
is a subsequence
From the definition of the maximum likelihood estimate
s (e* h*)
m'
and by the assumption of uniform convergence in (3.2),
S(eo,h ) ~ S(e*,h*).
O
that (SO,h ) = (S*,h*).
O
Since (e*,h*) yields a unique maximum, we can conclude
Therefore the sequence of estimates has only one limit
point at (e*,h*).
Lemma 3.3
Under conditions II and III
a) the functionals
g~~~(u,x)
and
g~~~(x)
lR 2 xXxexHand
b)
S(S,h) is continuous on ex H •
are continuous on
-12-
Proof:
Suppose {(X ,U ,e ,h)} 1
is a sequence converging to (x,u,e,h)
n n n n n= ,~
let $ (w) = I
h (u ,w) .
n
( - f (x ,e ), ~)
n n
2 n n
Using the facts that convergence in Sobolev norm with m
~
2 implies the
uniform convergence of a function pointwise (see Adams, 1976, p. 97-98) and
that f
2
Since $n
is a continuous function of x and e we have
~ ~IU
by condition I I we can apply the dominated convergence theorem to
conclude that
Finally, noting that the log is a continuous function we have
(1)
ge
(1)
~ge,h(u,x)
h (u ,x )
n' n n n
and thus this functional is continuous.
The continuity of
g(2) is proved in a similar manner.
b)
From the results in a) it is clear that for fixed u and x ge h(u,x)
,
is a continuous function on ® x
H.
Also, from condition I I
6~ Ige,h (u,x)1 ~b(ul'x)
hEH
If (e ,h ) is a sequence converging to (e,h) then by the dominated
n
n
convergence theorem
S8
h
n' n
~
Se,h and thus the continuity in b) follows.
-13-
Lermna 3.4
Under conditions I-III
sup
(S,h)e:
e
_ IS (S,h) - S(S,h)1 ~ 0
x H
n
a.s.
If gS,h(u,x) were continuous in all of its arguments then this result
would follow by a Unifrom Strong Law of Large Numbers, such as Theorem 1 in
(Gallant, 1982).
However, because of the indicator functions in the log-
likelihood some additoinal work is required.
The idea behind this proof is
to approximate gS,h by a continuous likelihood and then show that the difference
, is negligible.
Let 0 ~ X~
Proof:
(z) ~ 1 be a continuous function such that
E:
X (z)
o
z
<0
XE: (z)
1
Z
>
and
E:
Let
and
Thus SE:(S,h) is identical to S (S,h), the original log-likelihood, except for a
n
n
continuous modification of the indicator functions close to O.
Adding and
subtracting this modified likelihood gives
(3.3)
Now we argue that each term on the RHS of (3.3) converges to zero uniformly in
2)
x H.
-14-
E
By construction, Ige,h(u,x)1 ~ b(u,x) and is continuous on JR
Z
x Xx
8)
x F.
Thus, by Theorem 1, (Gallant, 198Z) the first term in (3.3) converges uniformly
to zero.
Considering the second term in (3.3) and applying condition III gives
~ J JJR I
X
(-E
,E>
(uZ+fZ(x,e*»M(x)du z dv(x)
~ ZE J M(x)dv(x) .
X
Since by assumption J M(x)dv(x)
X
uniformly as E
~
<
00
we see then this term converges to zero
O.
Let 0 ~ ~E ~ 1 be a continuous, function which is one on the interval
[-E,E] but zero outside the interval [-ZE,ZE].
Considering the last term in
0.3)
Is (e,h) - SE(e,h)1
n
n
~ 1
¥
n W=l
~E(uz+fz(xk,e*»b(uk,xk)
Once again, applying Theorem 1, (Gallant, 198Z) the RHS of (3.4) converges
uniformly to
0.5)
and by using the same arguments given above this quantity is uniformly bounded
by
4 E JM(x)dv(x)
Therefore for E
>0
there is an N such that for n
~
N
Combining the results for the separate terms in (3.3) proves the lemma.
QED
-15Before giving the proof of Theorem 1.1 we need to introduce the following
projection operator:
. .m 2
Let P : w'
K
~
H
such that
g <=> II h -gll
For h
£::
wn, 211 Pk(h)
m 2
W'
=
min
f£::H
k
- hI/ m 2 will be non-increasing in K and i f h
W'
£::
~
H then
k
k=l
0.6a)
F1na 11 y, we c 1"
a1m th a t 1·f h
~
.m,2 and h ~~
~ h 1"n .W
o
n
--U~
k=l
H
then
k
0.6b)
To see this we have
II P k ( h n )
- h
II Wm' 2
0.7)
:;i
:;i
~
min
f£::H
Ilf-hil
k
wID,
2 + 211h -hll . .m 2
n
w'
II PK (h)-hll Wm, 2
+ 211h n -hll Wm, 2
where both terms on the RHS of (3.6) will converge to zero.
Proof of Theorem 1.1
We first argue that
By Lennna 3.1
(8 n ,hn )
are consistent estimates.
iT will be compact therefore, so is
identifiability conditions A-C by Lennna 2.2
s(e,h) over 2l x
8 x
Under the
S(e*,h*) is the unique maximum of
iT and by Lennna 3.3b S is continuous on
Lennna 3.3 we can conclude that (3.2) holds.
Ff.
3)
x
iT.
Finally, by
Having satisfied the assumptions
of Lennna 3.1 we have:
e
n
~ e*
a.s. in 8
(3.8)
-16-
A
Now we show that e
and h
n
By the definition of
From the properties of P
liminf S (6 ,h )
n n n
A
A
~
K
n
are consistent.
(8 n ,hn )
given above, the consistency of
(6 n ,hn )
and Lemma 3.4.
s(e * ,h * )
Now, using arguments similar to those in the proof of Lemma 3.2, it is
(6 n ,hn )
straightforward to show that
must have a single limit point at (e*,h*).
QED
4.
Consistency of
(8 11---0,h )
Let P denote the class of polynomials on
]R2 and let
v
f = P<P, p
= {f :
I::
P}
Lemma 4.1
If $2 has a moment generating function and V ~ WS ,2(]R2) then
b)
H is contained in the closure of F with respect to the norm for
~,2(]R2)
Proof
a)
The first statement of the lemma is equivalent to the following
condition:
If h
for all f
I::
V
I::
wS ,2 and <f,h> S 2
then h
W'
O.
o
(4.1)
-17-
First assume that (4.0 holds for h
E:
C~. Using the fact that the adjoint of DP ir
2
L for functions in Coco is (-1) IplDP
=
where g
=
~
(-l)lpl<f,DP.DPh> 2 = <f,g> 2
L
L
Ipl~
E
(-l)lpl(DP.DP)(h).
p~
and rewriting (4.2) gives
Ipl~S
Setting f =
(p, g/ ~) 2(
L
2
lR, ~
2) = 0
'if p
E:
P
From Gallant (1980), we know that the polynomials are dense in the weighted
2
2 2 2
L space L (lR ,~).
continuous g
Thus
g/~
= 0 and since g has compact support and is
= O.
Now <h,g> 2
o and from (4.2) we have <h,h> S 2
2
L OR )
W '
2 = 0 which implies
(lR )
h - O.
Thus we have demonstrated that (4.1) holds for all h
E:
co
2
Co (lR).
Since
C~(lR2) is dense in W~,2(lR2) by continuous extension (4.1) holds for all
h
E:
WS ,2 and the result follows
b)
Since h ~
H
by assumption
sequence {f} ~ V such that (4.1b) f
n
Ih
n
E:
~1,2.
From part a) there is a
~ Ih in ~1,2.
It remains to show that f2 ~ h in ~,2.
Repeated application of the
n
chain rule yields the formula
PI
P
a2
;(1 a/ 2
1
2
a
(as)
=
PI
P
E
E
2 (P.l)( P.2)
1
J
i=O l=O
(a)
P -i
':I
1
au
P -i
1
1
:;,
P -j
2
P -j
au 2
a
2
(4.1c)
-18-
Or in operator notation:
P
D (as) =
O~
Iq f~ Ip I
Since the binomial coefficients are bounded for Ipl~m there is a c <
00
such that
(4.ld)
Thus
By expanding w~ and applying the Cauchy Schwartz inequality to the cross
products it is straightforward to verify that there is and C'<
00
independent
of wI such that
(4.le)
or
Clearly 4.le will also hold for w
M<
00
2
and therefore it follows that there is and
such that
2
lI aS ll wm ,2
(4.10
-19-
Let a = f
n
+
Ih
and S = f
imbedding Wm+l,2(JR2) ~
be
bo~nded
II a S11
n
Ih .
-
By Theorem 5.4 (5) (Adams, 1976) the
~,4(JR2) is continuous.
for sufficiently large nand IIfn -
2
2 = II h - f2 11 2
Wm,
n ~,2
Hence II f
Ih
":m,4
n
+
~ O.
Ih
11
4
~,
4
wi 11
Thus by 4.lf
~ 0 •
QED
Before proving Theorem 1.2 we need to give some properties of two projection
operators.
Let T : ~,2 ~ F denote the projection operator such that
k
k
Hk by Fk in (3.6) - (3.7) the same relations
In particular, if h £ H ~ F and h ~ h then
n
Now if we replace P
will hold.
k
by T and
k
Also we will need the fact that Q (see (1.5»
Suppose {h } ~ ~,2 converges to h
n
there is a subsequence {Q(h
IIQ(h
nk
)}
m
W ,2.
with a limit p
H
By Lemma 3.1
E
H.
is compact and
Since
k
) - hll m 2::i IIQ(h) - hll
2 + 211h
- hllwm,2
W ,
~,
nk
it follows that for any
IIp-hli
n
E
is a continuous operator.
rJU'
E
>0
2;;0 IIQ(h)-hli
~,
2 +
£
By definition II Q(h)-h /I m 2;;; II p-h II
2
~,
W,
Thus IIQ(h)-hil
by the uniqueness of the projection, Q(h) = p.
limit point at Q(h) and Q(h )
n
~
IIp-hll m 2 and
W,
Therefore, {Q(h )} has only one
rJU'
2
=
n
Q(h).
Proof of Theorem 1.2
From the definition of
S
n
(8 n ,(Q.Tk)(h n »
en ,h n
~ S (8 ,h )
n n n
(4.2)
-20-
Also, from section 3 we know (6- ,h- )
n n
+
*
(6 * ,h)
a.s.
Since Q and T are continuous, (QoT ) will also be continuous and
k
k
(QoTk)(h ) ~ h*.
n
S(6 * ,h * )
~
Therefore, by Lemma 3.4 (4.2) implies
liminf S (6- ,h- ) .
n n n
Using arguments similar to those given in the proof of Lemma 3.2 one can
show that
{(e n ,hn )}
must have a single limit point at 6*,h*.
QED
References
Adams, R. A. (1975).
Gallant, A. R. (1980).
Sobolev Spaces.
Academic Press, New York.
"Explicit Estimators of Parametric Functions in
Nonlinear Regression," JASA 21(369): 182-193.
Gallant, A. R. (1982).
"Nonlinear Statistical Methods, Chapter 3.
A Unified
Asymptotic Theory of Nonlinear Statistical Models," Institute of Statistics
Mimeograph Series No. 1617, North Carolina State University.
Rudin, W. (1974).
Real and Complex Analysis.
t-IcGraw Hill, Inc., New York.