CHAPTER 2
Exercise 2.1 Suppose that $Y = (Y_1, \ldots, Y_n)$ is a random sample from an Exp($\lambda$) distribution. Then we
may write
$$f_Y(y) = \prod_{i=1}^{n} \lambda e^{-\lambda y_i} = \underbrace{\lambda^n e^{-\lambda \sum_{i=1}^{n} y_i}}_{g_\lambda(T(y))} \times \underbrace{1}_{h(y)}.$$
It follows that $T(Y) = \sum_{i=1}^{n} Y_i$ is a sufficient statistic for $\lambda$.
Exercise 2.2 Suppose that $Y = (Y_1, \ldots, Y_n)$ is a random sample from an Exp($\lambda$) distribution. Then the
ratio of the joint pdfs at two different realizations of $Y$, $x$ and $y$, is
$$\frac{f(x; \lambda)}{f(y; \lambda)} = \frac{\lambda^n e^{-\lambda \sum_{i=1}^{n} x_i}}{\lambda^n e^{-\lambda \sum_{i=1}^{n} y_i}} = e^{\lambda\left(\sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i\right)}.$$
The ratio is constant in $\lambda$ iff $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} x_i$. Hence, by Lemma 2.1, $T(Y) = \sum_{i=1}^{n} Y_i$ is a minimal
sufficient statistic for $\lambda$.
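Exercise 2.3 The $Y_i$ are identically distributed, hence they have the same expectation, say $E(Y)$, for all $i = 1, \ldots, n$.
Here, for $y \in [0, \theta]$, we have:
$$\begin{aligned}
E(Y) &= \frac{2}{\theta^2}\int_0^\theta y(\theta - y)\,dy \\
&= \frac{2}{\theta^2}\int_0^\theta (\theta y - y^2)\,dy \\
&= \frac{2}{\theta^2}\left[\frac{1}{2}\theta y^2 - \frac{1}{3}y^3\right]_0^\theta \\
&= \frac{1}{3}\theta.
\end{aligned}$$
Bias:
$$E(T(Y)) = E(3\bar{Y}) = 3\,\frac{1}{n}\sum_{i=1}^{n} E(Y_i) = 3\,\frac{1}{n}\,n\,\frac{1}{3}\theta = \theta.$$
That is, $\mathrm{bias}(T(Y)) = E(T(Y)) - \theta = 0$.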
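Variance: The $Y_i$ are identically distributed, hence they have the same variance, say $\mathrm{var}(Y)$, for all $i = 1, \ldots, n$,
$$\mathrm{var}(Y) = E(Y^2) - [E(Y)]^2.$$
We need to calculate $E(Y^2)$.
$$\begin{aligned}
E(Y^2) &= \frac{2}{\theta^2}\int_0^\theta y^2(\theta - y)\,dy \\
&= \frac{2}{\theta^2}\int_0^\theta (\theta y^2 - y^3)\,dy \\
&= \frac{2}{\theta^2}\left[\frac{1}{3}\theta y^3 - \frac{1}{4}y^4\right]_0^\theta \\
&= \frac{1}{6}\theta^2.
\end{aligned}$$
Hence $\mathrm{var}(Y) = E(Y^2) - [E(Y)]^2 = \frac{1}{6}\theta^2 - \frac{1}{9}\theta^2 = \frac{1}{18}\theta^2$. Since the $Y_i$ are independent, this gives
$$\mathrm{var}(T(Y)) = 9\,\mathrm{var}(\bar{Y}) = 9\,\frac{1}{n^2}\sum_{i=1}^{n}\mathrm{var}(Y_i) = 9\,\frac{1}{n^2}\sum_{i=1}^{n}\frac{1}{18}\theta^2 = 9\,\frac{1}{n^2}\,n\,\frac{1}{18}\theta^2 = \frac{1}{2n}\theta^2.$$
Consistency:
$T(Y)$ is unbiased, so it is enough to check whether its variance tends to zero as $n$ tends to infinity.
Indeed, as $n \to \infty$ we have $\frac{1}{2n}\theta^2 \to 0$, that is, $T(Y) = 3\bar{Y}$ is a consistent estimator of $\theta$.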
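The moment arithmetic for this density is easy to get wrong by a constant factor, so an exact check can be useful. The following is only a sketch and not part of the original solution: it assumes the density $f(y) = 2(\theta - y)/\theta^2$ on $[0, \theta]$ that the integrals above are based on, and the values `theta = 3` and `n = 10` are arbitrary illustrative choices.

```python
from fractions import Fraction


def moment(k, theta):
    """Exact E(Y^k) for the assumed density f(y) = 2(theta - y)/theta^2 on [0, theta].

    Term-by-term integration gives
    E(Y^k) = (2/theta^2) * (theta^(k+2)/(k+1) - theta^(k+2)/(k+2)).
    """
    theta = Fraction(theta)
    return 2 / theta**2 * (theta**(k + 2) / (k + 1) - theta**(k + 2) / (k + 2))


theta = Fraction(3)   # hypothetical parameter value, for illustration only
n = 10                # hypothetical sample size, for illustration only

mean_y = moment(1, theta)              # should equal theta / 3
var_y = moment(2, theta) - mean_y**2   # should equal theta^2 / 18
var_t = 9 * var_y / n                  # variance of T = 3 * Ybar

print(mean_y, var_y, var_t)
```

Because the arithmetic is done with `Fraction`, the comparison with the closed forms $\theta/3$, $\theta^2/18$ and $\theta^2/(2n)$ is exact rather than approximate.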
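Exercise 2.4 We have $X_i \overset{iid}{\sim} \mathrm{Bern}(p)$ for $i = 1, \ldots, n$. Also, $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Throughout, $q = 1 - p$.
(a) For an estimator $\hat\vartheta$ to be consistent for $\vartheta$ we require that $\mathrm{MSE}(\hat\vartheta) \to 0$ as $n \to \infty$. We have
$$\mathrm{MSE}(\hat\vartheta) = \mathrm{var}(\hat\vartheta) + [\mathrm{bias}(\hat\vartheta)]^2.$$
We will now calculate the variance and bias of $\hat{p} = \bar{X}$.
$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\,np = p.$$
Hence $\bar{X}$ is an unbiased estimator of $p$.
$$\mathrm{var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{var}(X_i) = \frac{1}{n^2}\,npq = \frac{1}{n}pq.$$
Hence, $\mathrm{MSE}(\bar{X}) = \frac{1}{n}pq \to 0$ as $n \to \infty$, that is, $\bar{X}$ is a consistent estimator of $p$.
(b) The estimator $\hat\vartheta$ is asymptotically unbiased for $\vartheta$ if $E(\hat\vartheta) \to \vartheta$ as $n \to \infty$, that is, the bias tends
to zero as $n \to \infty$. Here we have
$$E(\hat{p}\hat{q}) = E[\bar{X}(1 - \bar{X})] = E[\bar{X} - \bar{X}^2] = E[\bar{X}] - E[\bar{X}^2].$$
Note that we can write
$$E[\bar{X}^2] = \mathrm{var}(\bar{X}) + [E(\bar{X})]^2 = \frac{1}{n}pq + p^2.$$
That is
$$E(\hat{p}\hat{q}) = p - \left(\frac{1}{n}pq + p^2\right) = \frac{pq(n-1)}{n} \to pq \quad \text{as } n \to \infty.$$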
Hence, the estimator is asymptotically unbiased for pq.
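The expectation of $\bar{X}(1 - \bar{X})$ can also be checked by brute force. The following is a sketch under the assumption that $n\bar{X} \sim \mathrm{Binomial}(n, p)$, which follows from the iid Bernoulli model; the value `p = 3/10` and the sample sizes are illustrative choices, not taken from the exercise.

```python
from fractions import Fraction
from math import comb


def expected_pq_hat(n, p):
    """Exact E[Xbar * (1 - Xbar)] when n * Xbar ~ Binomial(n, p)."""
    p = Fraction(p)
    total = Fraction(0)
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1 - p)**(n - k)   # P(n * Xbar = k)
        xbar = Fraction(k, n)
        total += prob * xbar * (1 - xbar)
    return total


p = Fraction(3, 10)   # hypothetical success probability
for n in (5, 50, 200):
    exact = expected_pq_hat(n, p)
    closed_form = p * (1 - p) * Fraction(n - 1, n)   # pq(n - 1)/n
    print(n, float(exact), float(closed_form), float(p * (1 - p)))
```

The first two printed columns agree for every $n$, and both approach $pq$ as $n$ grows, which is the asymptotic unbiasedness claimed above.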
Exercise 2.5 Here we have a single parameter $p$ and $g(p) = p$.
By definition the CRLB($p$) is
$$\mathrm{CRLB}(p) = \frac{\left\{\frac{dg(p)}{dp}\right\}^2}{E\left\{-\frac{d^2 \log P(Y = y; p)}{dp^2}\right\}}, \tag{1}$$
where the joint pmf of $Y = (Y_1, \ldots, Y_n)^T$, where $Y_i \sim \mathrm{Bern}(p)$ independently, is
$$P(Y = y; p) = \prod_{i=1}^{n} p^{y_i}(1 - p)^{1 - y_i} = p^{\sum_{i=1}^{n} y_i}(1 - p)^{n - \sum_{i=1}^{n} y_i}.$$
For the numerator of (1) we get $\frac{dg(p)}{dp} = 1$. Further, for brevity denote $P = P(Y = y; p)$. For the
denominator of (1) we calculate
$$\log P = \sum_{i=1}^{n} y_i \log p + \left(n - \sum_{i=1}^{n} y_i\right)\log(1 - p)$$
and
$$\frac{d \log P}{dp} = \frac{\sum_{i=1}^{n} y_i}{p} + \frac{\sum_{i=1}^{n} y_i - n}{1 - p},$$
$$\frac{d^2 \log P}{dp^2} = -\frac{\sum_{i=1}^{n} y_i}{p^2} + \frac{\sum_{i=1}^{n} y_i - n}{(1 - p)^2}.$$
Hence, since $E(Y_i) = p$ for all $i$, we get
$$E\left(-\frac{d^2 \log P}{dp^2}\right) = E\left(\frac{\sum_{i=1}^{n} Y_i}{p^2}\right) - E\left(\frac{\sum_{i=1}^{n} Y_i - n}{(1 - p)^2}\right) = \frac{np}{p^2} - \frac{np - n}{(1 - p)^2} = \frac{n}{p(1 - p)}.$$
Hence, $\mathrm{CRLB}(p) = \frac{p(1-p)}{n}$.
Since $\mathrm{var}(\bar{Y}) = \frac{p(1-p)}{n}$, it means that $\mathrm{var}(\bar{Y})$ achieves the bound and so $\bar{Y}$ has the minimum variance
among all unbiased estimators of $p$.
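As a check on the expected information, the expectation can be evaluated directly by summing over the two outcomes of a single Bernoulli observation. This is a minimal sketch; the function names are illustrative, and it relies on the standard fact that the information of $n$ independent observations is $n$ times that of one.

```python
from fractions import Fraction


def fisher_information_bernoulli(p):
    """E[-d^2 log P / dp^2] for ONE Bern(p) observation, summed over y in {0, 1}."""
    p = Fraction(p)
    info = Fraction(0)
    for y, prob in ((0, 1 - p), (1, p)):
        # For a single observation: -d^2 log P/dp^2 = y/p^2 + (1 - y)/(1 - p)^2.
        neg_second_derivative = Fraction(y) / p**2 + Fraction(1 - y) / (1 - p)**2
        info += prob * neg_second_derivative
    return info


def crlb_bernoulli(n, p):
    """Cramer-Rao lower bound for unbiased estimators of p from n iid observations."""
    return 1 / (n * fisher_information_bernoulli(p))


p = Fraction(1, 4)   # hypothetical value of p
n = 20               # hypothetical sample size
print(crlb_bernoulli(n, p), p * (1 - p) / n)   # the two values coincide
```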
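Exercise 2.6 From lectures, we know that the joint pdf of independent normal r.vs is
$$f(y; \mu, \sigma^2) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2\right\}.$$
Denote $f = f(y \mid \mu, \sigma^2)$. Taking log of the pdf we obtain
$$\log f = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2.$$
Thus, we have
$$\frac{\partial \log f}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \mu)$$
and
$$\frac{\partial \log f}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \mu)^2.$$
It follows that
$$\frac{\partial^2 \log f}{\partial \mu^2} = -\frac{n}{\sigma^2}, \qquad \frac{\partial^2 \log f}{\partial \mu\,\partial \sigma^2} = -\frac{1}{\sigma^4}\sum_{i=1}^{n}(y_i - \mu)$$
and
$$\frac{\partial^2 \log f}{\partial (\sigma^2)^2} = \frac{n}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{i=1}^{n}(y_i - \mu)^2.$$
Hence, taking the expectation of each of the second derivatives and changing the sign, we obtain the Fisher information matrix
$$M = \begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix}.$$
Now, let $g(\mu, \sigma^2) = \mu + \sigma^2$. Then we have $\partial g/\partial \mu = 1$ and $\partial g/\partial \sigma^2 = 1$. So
$$\mathrm{CRLB}(g(\mu, \sigma^2)) = (1, 1)\begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix}^{-1}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = (1, 1)\begin{pmatrix} \frac{\sigma^2}{n} & 0 \\ 0 & \frac{2\sigma^4}{n} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{\sigma^2}{n}(1 + 2\sigma^2).$$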
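The quadratic form in the bound is simple to evaluate numerically. The sketch below assumes the diagonal Fisher information matrix $\mathrm{diag}(n/\sigma^2,\ n/(2\sigma^4))$ for $(\mu, \sigma^2)$ and the gradient $(1, 1)$ of $g(\mu, \sigma^2) = \mu + \sigma^2$; the function name and the numerical inputs are illustrative only.

```python
def crlb_mu_plus_sigma2(n, sigma2):
    """CRLB for g(mu, sigma2) = mu + sigma2 in the normal model.

    The information matrix is diagonal, M = diag(n/sigma2, n/(2*sigma2**2)),
    so grad^T M^{-1} grad reduces to a sum of grad_j**2 / M_jj.
    """
    info_diag = (n / sigma2, n / (2 * sigma2**2))
    grad = (1.0, 1.0)   # (dg/dmu, dg/dsigma2)
    return sum(g * g / m for g, m in zip(grad, info_diag))


n, sigma2 = 10, 2.0   # hypothetical values for illustration
print(crlb_mu_plus_sigma2(n, sigma2), sigma2 / n * (1 + 2 * sigma2))
```

Both printed numbers should agree, matching the closed form $\frac{\sigma^2}{n}(1 + 2\sigma^2)$.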
Exercise 2.7 Suppose that $Y_1, \ldots, Y_n$ are independent Poisson($\lambda$) random variables. Then we know that
$T = \sum_{i=1}^{n} Y_i$ is a sufficient statistic for $\lambda$.
Now, we need to find the distribution of $T$. We showed in Exercise 1.10 that the mgf of a
Poisson($\lambda$) rv is
$$M_Y(z) = e^{\lambda(e^z - 1)}.$$
Hence, we may write (we use $z$ so that it is not confused with the values of $T$, denoted by $t$)
$$M_T(z) = \prod_{i=1}^{n} M_{Y_i}(z) = \prod_{i=1}^{n} e^{\lambda(e^z - 1)} = e^{n\lambda(e^z - 1)}.$$
Hence, $T \sim \mathrm{Poisson}(n\lambda)$, and so its probability mass function is
$$P(T = t) = \frac{(n\lambda)^t e^{-n\lambda}}{t!}, \quad t = 0, 1, \ldots.$$
Next, suppose that
$$E\{h(T)\} = \sum_{t=0}^{\infty} h(t)\,\frac{(n\lambda)^t e^{-n\lambda}}{t!} = 0$$
for $\lambda > 0$. Then we have
$$\sum_{t=0}^{\infty} \frac{h(t)}{t!}(n\lambda)^t = 0$$
for $\lambda > 0$. Thus, every coefficient $h(t)/t!$ is zero, so that $h(t) = 0$ for all $t = 0, 1, 2, \ldots$.
Since $T$ takes on values $t = 0, 1, 2, \ldots$ with probability 1 it means that
$$P\{h(T) = 0\} = 1$$
for all $\lambda$. Hence, $T = \sum_{i=1}^{n} Y_i$ is a complete sufficient statistic.
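Exercise 2.8 $S = \sum_{i=1}^{n} Y_i$ is a complete sufficient statistic for $\lambda$. We have seen that $T = \bar{Y} = S/n$ is an
MVUE for $\lambda$. Now, we will find the unique MVUE of $\phi = \lambda^2$. We have
$$\begin{aligned}
E(T^2) &= E\left(\frac{1}{n^2}S^2\right) = \frac{1}{n^2}E(S^2) \\
&= \frac{1}{n^2}\left\{\mathrm{var}(S) + [E(S)]^2\right\} \\
&= \frac{1}{n^2}\left(n\lambda + n^2\lambda^2\right) \\
&= \frac{1}{n}\lambda + \lambda^2 = \frac{1}{n}E(T) + \lambda^2.
\end{aligned}$$
It means that
$$E\left(T^2 - \frac{1}{n}T\right) = \lambda^2,$$
i.e., $T^2 - \frac{1}{n}T = \bar{Y}^2 - \frac{1}{n}\bar{Y}$ is an unbiased estimator of $\lambda^2$. It is a function of a complete sufficient
statistic, hence it is the unique MVUE of $\lambda^2$.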
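The unbiasedness of $T^2 - \frac{1}{n}T$ for $\lambda^2$ can be checked numerically by summing over the Poisson($n\lambda$) distribution of $S$. This is a rough sketch: the truncation point `terms` and the parameter values are assumptions made for illustration, and the truncated sum is only accurate when `terms` is well above $n\lambda$.

```python
from math import exp


def expected_estimator(n, lam, terms=400):
    """Approximate E[T^2 - T/n] with T = S/n and S ~ Poisson(n * lam).

    The infinite sum over s is truncated after `terms` terms.
    """
    mu = n * lam
    pmf = exp(-mu)   # P(S = 0)
    total = 0.0
    for s in range(terms):
        t = s / n
        total += pmf * (t * t - t / n)
        pmf *= mu / (s + 1)   # recurrence: P(S = s + 1) = P(S = s) * mu / (s + 1)
    return total


n, lam = 5, 2.0   # hypothetical values for illustration
print(expected_estimator(n, lam), lam**2)
```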
Exercise 2.9 We may write
$$P(Y = y; \lambda) = \frac{\lambda^y e^{-\lambda}}{y!} = \frac{1}{y!}\exp\{y \log\lambda - \lambda\} = \frac{1}{y!}\exp\{(\log\lambda)\,y - \lambda\}.$$
Thus, we have $a(\lambda) = \log\lambda$, $b(y) = y$, $c(\lambda) = -\lambda$ and $h(y) = \frac{1}{y!}$. That is, $P(Y = y; \lambda)$ has a
representation of the form required by Definition 2.10.
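Exercise 2.10 (a) Here, for $y > 0$, we have
$$\begin{aligned}
f(y \mid \lambda, \alpha) &= \frac{\lambda^\alpha}{\Gamma(\alpha)}\,y^{\alpha-1}e^{-\lambda y} \\
&= \exp\left\{\log\frac{\lambda^\alpha}{\Gamma(\alpha)}\right\} y^{\alpha-1}e^{-\lambda y} \\
&= \exp\left\{-\lambda y + (\alpha - 1)\log y + \log\frac{\lambda^\alpha}{\Gamma(\alpha)}\right\}.
\end{aligned}$$
This has the required form of Definition 2.10, where $p = 2$ and
$a_1(\lambda, \alpha) = -\lambda$
$a_2(\lambda, \alpha) = \alpha - 1$
$b_1(y) = y$
$b_2(y) = \log y$
$c(\lambda, \alpha) = \log\frac{\lambda^\alpha}{\Gamma(\alpha)}$
$h(y) = 1$
(b) By Theorem 2.8 (lecture notes) we have that
$$S_1(Y) = \sum_{i=1}^{n} Y_i \quad \text{and} \quad S_2(Y) = \sum_{i=1}^{n} \log Y_i$$
are the joint complete sufficient statistics for $\lambda$ and $\alpha$.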
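Exercise 2.11 To obtain the Method of Moments estimators we compare the population and the sample
moments. For a one parameter distribution we obtain $\hat\theta$ as the solution of:
$$E(Y) = \bar{Y}. \tag{2}$$
Here, for $y \in [0, \theta]$, we have:
$$\begin{aligned}
E(Y) &= \frac{2}{\theta^2}\int_0^\theta y(\theta - y)\,dy \\
&= \frac{2}{\theta^2}\int_0^\theta (\theta y - y^2)\,dy \\
&= \frac{2}{\theta^2}\left[\frac{1}{2}\theta y^2 - \frac{1}{3}y^3\right]_0^\theta \\
&= \frac{1}{3}\theta.
\end{aligned}$$
Then by (2) we get the method of moments estimator of $\theta$:
$$\hat\theta = 3\bar{Y}.$$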
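Exercise 2.12 (a) First, we will show that the distribution belongs to an exponential family. Here, for $y > 0$
and known $\alpha$, we have
$$\begin{aligned}
f(y \mid \lambda, \alpha) &= \frac{\lambda^\alpha}{\Gamma(\alpha)}\,y^{\alpha-1}e^{-\lambda y} \\
&= y^{\alpha-1}\,\frac{\lambda^\alpha}{\Gamma(\alpha)}\,e^{-\lambda y} \\
&= y^{\alpha-1}\exp\left\{\log\frac{\lambda^\alpha}{\Gamma(\alpha)}\right\}e^{-\lambda y} \\
&= y^{\alpha-1}\exp\left\{-\lambda y + \log\frac{\lambda^\alpha}{\Gamma(\alpha)}\right\}.
\end{aligned}$$
This has the required form of Definition 2.11, where $p = 1$ and
$a(\lambda) = -\lambda$
$b(y) = y$
$c(\lambda) = \log\frac{\lambda^\alpha}{\Gamma(\alpha)}$
$h(y) = y^{\alpha-1}$
By Theorem 2.8 (lecture notes) we have that
$$S(Y) = \sum_{i=1}^{n} Y_i$$
is the complete sufficient statistic for $\lambda$.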
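(b) The likelihood function is
$$\begin{aligned}
L(\lambda; y) &= \prod_{i=1}^{n} \frac{\lambda^\alpha}{\Gamma(\alpha)}\,y_i^{\alpha-1}e^{-\lambda y_i} \\
&= \prod_{i=1}^{n} \frac{\lambda^\alpha}{\Gamma(\alpha)}\,e^{\log y_i^{\alpha-1}}e^{-\lambda y_i} \\
&= \left(\frac{\lambda^\alpha}{\Gamma(\alpha)}\right)^n e^{(\alpha-1)\sum_{i=1}^{n}\log y_i}\,e^{-\lambda\sum_{i=1}^{n} y_i}.
\end{aligned}$$
Then the log-likelihood is
$$l(\lambda; y) = \log L(\lambda; y) = \alpha n \log\lambda - n\log\Gamma(\alpha) + (\alpha - 1)\sum_{i=1}^{n}\log y_i - \lambda\sum_{i=1}^{n} y_i.$$
Then, we obtain the following derivative of $l(\lambda; y)$ with respect to $\lambda$:
$$\frac{dl}{d\lambda} = \alpha n\,\frac{1}{\lambda} - \sum_{i=1}^{n} y_i.$$
This, set to zero, gives
$$\hat\lambda = \frac{\alpha n}{\sum_{i=1}^{n} y_i} = \frac{\alpha}{\bar{y}}.$$
Hence, the MLE($\lambda$) $= \alpha/\bar{Y}$. So, we get
$$\mathrm{MLE}[g(\lambda)] = \mathrm{MLE}\left[\frac{1}{\lambda}\right] = \frac{1}{\mathrm{MLE}(\lambda)} = \frac{1}{\alpha}\bar{Y} = \frac{1}{\alpha n}\sum_{i=1}^{n} Y_i = \frac{1}{\alpha n}S(Y).$$
That is, MLE$[g(\lambda)]$ is a function of the complete sufficient statistic.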
(c) To show that it is an unbiased estimator of $g(\lambda)$ we calculate:
$$E[g(\hat\lambda)] = E\left(\frac{1}{\alpha n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{\alpha n}\sum_{i=1}^{n} E(Y_i) = \frac{1}{\alpha n}\,n\alpha\,\frac{1}{\lambda} = \frac{1}{\lambda}.$$
It is an unbiased estimator and a function of a complete sufficient statistic, hence, by Corollary
2.2 (given in Lectures), it is the unique MVUE($g(\lambda)$).
Exercise 2.13 (a) The likelihood is
$$L(\beta_0, \beta_1; y) = (2\pi\sigma^2)^{-\frac{n}{2}}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2\right\}.$$
Now, maximizing this is equivalent to minimizing
$$S(\beta_0, \beta_1) = \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2,$$
which is the criterion we use to find the least squares estimators. Hence, the maximum likelihood
estimators are the same as the least squares estimators.
(c) The estimates of $\beta_0$ and $\beta_1$ are
$$\hat\beta_0 = \bar{Y} - \hat\beta_1\bar{x} = 94.123, \qquad \hat\beta_1 = \frac{\sum_{i=1}^{n} x_i Y_i - n\bar{x}\bar{Y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = -1.266.$$
Hence the estimate of the mean response at a given $x$ is
$$\widehat{E(Y \mid x)} = 94.123 - 1.266x.$$
For the temperature of $x = 40$ degrees we obtain the estimate of expected hardness equal to
$$\widehat{E(Y \mid x = 40)} = 43.483.$$
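The estimator formulas and the prediction at $x = 40$ can be sketched in a few lines. The exercise data are not reproduced in this solution, so the toy data set below is purely hypothetical and serves only to exercise the formulas; the coefficients 94.123 and -1.266 are the estimates quoted above.

```python
def least_squares(x, y):
    """Least squares (equivalently, ML) estimates (b0, b1) for y = b0 + b1*x + error."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum(xi * xi for xi in x) - n * xbar**2
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def fitted_mean(x, b0, b1):
    """Estimated mean response at x."""
    return b0 + b1 * x


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
toy_b0, toy_b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(toy_b0, toy_b1)

# Prediction at x = 40 using the estimates reported in the solution.
prediction = fitted_mean(40, 94.123, -1.266)
print(round(prediction, 3))
```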
Exercise 2.14 The LS estimator of $\beta_1$ is
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i Y_i - n\bar{x}\bar{Y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}.$$
We will see that it has a normal distribution and is unbiased, and we will find its variance. Now,
normality is clear from the fact that we may write
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i Y_i - \bar{x}\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = \sum_{i=1}^{n}\frac{(x_i - \bar{x})}{S_{xx}}\,Y_i,$$
where $S_{xx} = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, so that $\hat\beta_1$ is a linear function of $Y_1, \ldots, Y_n$, each of which is normally
distributed. Next, we have
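$$\begin{aligned}
E(\hat\beta_1) &= \frac{1}{S_{xx}}\sum_{i=1}^{n}(x_i - \bar{x})\,E(Y_i) \\
&= \frac{1}{S_{xx}}\left\{\sum_{i=1}^{n} x_i E(Y_i) - \bar{x}\sum_{i=1}^{n} E(Y_i)\right\} \\
&= \frac{1}{S_{xx}}\left\{\sum_{i=1}^{n} x_i(\beta_0 + \beta_1 x_i) - \bar{x}\sum_{i=1}^{n}(\beta_0 + \beta_1 x_i)\right\} \\
&= \frac{1}{S_{xx}}\left\{n\beta_0\bar{x} + \beta_1\sum_{i=1}^{n} x_i^2 - n\beta_0\bar{x} - n\beta_1\bar{x}^2\right\} \\
&= \frac{1}{S_{xx}}\left\{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right\}\beta_1 \\
&= \frac{1}{S_{xx}}\,S_{xx}\,\beta_1 = \beta_1.
\end{aligned}$$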
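Finally, since the $Y_i$s are independent, we have
$$\begin{aligned}
\mathrm{var}(\hat\beta_1) &= \frac{1}{(S_{xx})^2}\sum_{i=1}^{n}(x_i - \bar{x})^2\,\mathrm{var}(Y_i) \\
&= \frac{1}{(S_{xx})^2}\sum_{i=1}^{n}(x_i - \bar{x})^2\sigma^2 \\
&= \frac{1}{(S_{xx})^2}\,S_{xx}\,\sigma^2 = \frac{\sigma^2}{S_{xx}}.
\end{aligned}$$
Hence, $\hat\beta_1 \sim N(\beta_1, \sigma^2/S_{xx})$ and a $100(1 - \alpha)\%$ confidence interval for $\beta_1$ is
$$\hat\beta_1 \pm t_{n-2,\frac{\alpha}{2}}\sqrt{\frac{S^2}{S_{xx}}},$$
where $S^2 = \frac{1}{n-2}\sum_{i=1}^{n}(Y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$ is the MVUE for $\sigma^2$.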
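Exercise 2.15 (a) The likelihood is
$$L(\theta; y) = \prod_{i=1}^{n}\theta y_i^{\theta-1} = \theta^n\left(\prod_{i=1}^{n} y_i\right)^{\theta-1},$$
and so the log-likelihood is
$$\ell(\theta; y) = n\log\theta + (\theta - 1)\log\left(\prod_{i=1}^{n} y_i\right).$$
Thus, solving the equation
$$\frac{d\ell}{d\theta} = \frac{n}{\theta} + \log\left(\prod_{i=1}^{n} y_i\right) = 0,$$
we obtain the maximum likelihood estimator of $\theta$ as $\hat\theta = -n/\log\left(\prod_{i=1}^{n} Y_i\right)$.
(b) Since
$$\frac{d^2\ell}{d\theta^2} = -\frac{n}{\theta^2},$$
we have
$$\mathrm{CRLB}(\theta) = \frac{1}{E\left(-\frac{d^2\ell}{d\theta^2}\right)} = \frac{\theta^2}{n}.$$
Thus, for large $n$, approximately $\hat\theta \sim N(\theta, \theta^2/n)$.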
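(c) Here we have to replace CRLB($\theta$) with its estimator to obtain the approximate pivot
$$Q(Y, \theta) = \frac{\hat\theta - \theta}{\sqrt{\hat\theta^2/n}} \underset{\text{approx.}}{\sim} AN(0, 1).$$
This gives
$$P\left(-z_{\frac{\alpha}{2}} < \frac{\hat\theta - \theta}{\sqrt{\hat\theta^2/n}} < z_{\frac{\alpha}{2}}\right) \simeq 1 - \alpha,$$
where $z_{\frac{\alpha}{2}}$ is such that $P(|Z| < z_{\frac{\alpha}{2}}) = 1 - \alpha$, $Z \sim N(0, 1)$. It may be rearranged to yield
$$P\left(\hat\theta - z_{\frac{\alpha}{2}}\sqrt{\frac{\hat\theta^2}{n}} < \theta < \hat\theta + z_{\frac{\alpha}{2}}\sqrt{\frac{\hat\theta^2}{n}}\right) \simeq 1 - \alpha.$$
Hence, an approximate $100(1 - \alpha)\%$ confidence interval for $\theta$ is
$$\hat\theta \pm z_{\frac{\alpha}{2}}\sqrt{\frac{\hat\theta^2}{n}}.$$
Finally, the approximate 90% confidence interval for $\theta$ is
$$\hat\theta \pm 1.6449\sqrt{\frac{\hat\theta^2}{n}},$$
where $\hat\theta = -n/\log\left(\prod_{i=1}^{n} Y_i\right)$.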
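The interval can be computed directly from a sample. The sketch below assumes observations in $(0, 1)$, which is the natural support for the density $\theta y^{\theta-1}$ but is not stated explicitly here. The function name and the sample are hypothetical, and the normal quantile comes from the standard library rather than from tables.

```python
from math import log, sqrt
from statistics import NormalDist


def theta_mle_ci(y, level=0.90):
    """MLE of theta and the approximate confidence interval theta_hat +/- z * sqrt(theta_hat^2 / n)."""
    n = len(y)
    # log(prod y_i) = sum(log y_i), which avoids underflow of the product.
    theta_hat = -n / sum(log(v) for v in y)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.6449 for level = 0.90
    half_width = z * sqrt(theta_hat**2 / n)
    return theta_hat, theta_hat - half_width, theta_hat + half_width


sample = [0.5, 0.5, 0.5, 0.5]   # hypothetical observations in (0, 1)
theta_hat, lower, upper = theta_mle_ci(sample)
print(theta_hat, lower, upper)
```

With so few observations the large-sample approximation is crude; the sample here is only meant to show the mechanics.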
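Exercise 2.16 (a) For a Poisson distribution we have the MLE($\lambda$) equal to $\hat\lambda = \bar{Y}$
and
$$\hat\lambda \sim AN\left(\lambda, \frac{\lambda}{n}\right).$$
Hence,
$$\hat\lambda_1 - \hat\lambda_2 \sim AN\left(\lambda_1 - \lambda_2,\ \frac{\lambda_1}{n_1} + \frac{\lambda_2}{n_2}\right).$$
So, after standardization, we get
$$\frac{\hat\lambda_1 - \hat\lambda_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\lambda_1}{n_1} + \frac{\lambda_2}{n_2}}} \sim AN(0, 1).$$
Hence, the approximate pivot for $\lambda_1 - \lambda_2$ is
$$Q(Y, \lambda_1 - \lambda_2) = \frac{\hat\lambda_1 - \hat\lambda_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat\lambda_1}{n_1} + \frac{\hat\lambda_2}{n_2}}} \underset{\text{approx.}}{\sim} N(0, 1).$$
Then, for $z_{\frac{\alpha}{2}}$ such that $P(|Z| < z_{\frac{\alpha}{2}}) = 1 - \alpha$, $Z \sim N(0, 1)$,
we may write
$$P\left(-z_{\frac{\alpha}{2}} < \frac{\hat\lambda_1 - \hat\lambda_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat\lambda_1}{n_1} + \frac{\hat\lambda_2}{n_2}}} < z_{\frac{\alpha}{2}}\right) \simeq 1 - \alpha,$$
which gives
$$P\left(\hat\lambda_1 - \hat\lambda_2 - z_{\frac{\alpha}{2}}\sqrt{\frac{\hat\lambda_1}{n_1} + \frac{\hat\lambda_2}{n_2}} < \lambda_1 - \lambda_2 < \hat\lambda_1 - \hat\lambda_2 + z_{\frac{\alpha}{2}}\sqrt{\frac{\hat\lambda_1}{n_1} + \frac{\hat\lambda_2}{n_2}}\right) \simeq 1 - \alpha.$$
That is, a $100(1 - \alpha)\%$ CI for $\lambda_1 - \lambda_2$ is
$$\bar{Y}_1 - \bar{Y}_2 \pm z_{\frac{\alpha}{2}}\sqrt{\frac{\bar{Y}_1}{n_1} + \frac{\bar{Y}_2}{n_2}}.$$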
(b) Denote:
Yi - density of seedlings of tree A at a square meter area i around the tree;
Xi - density of seedlings of tree B at a square meter area i around the tree.
Then, we may assume that $Y_i \overset{iid}{\sim} \mathrm{Poisson}(\lambda_1)$ and $X_i \overset{iid}{\sim} \mathrm{Poisson}(\lambda_2)$.
We are interested in the difference in the mean density, i.e., in $\lambda_1 - \lambda_2$.
From the data we get:
$$\hat\lambda_1 - \hat\lambda_2 = 1, \qquad \frac{\hat\lambda_1}{n_1} + \frac{\hat\lambda_2}{n_2} = \frac{17}{70}.$$
Hence, the approximate 99% CI for $\lambda_1 - \lambda_2$ is
$$\left[1 - 2.5758\sqrt{\frac{17}{70}},\ 1 + 2.5758\sqrt{\frac{17}{70}}\right] = [-0.269, 2.269].$$
The CI includes zero, hence, at the 1% significance level, there is no evidence to reject $H_0 :
\lambda_1 - \lambda_2 = 0$ against $H_1 : \lambda_1 - \lambda_2 \neq 0$, that is, there is no evidence to say, at the 1% significance
level, that tree A produced a higher density of seedlings than tree B did.
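The interval arithmetic is quick to reproduce. This sketch takes the two summary quantities quoted above (estimated difference 1 and estimated variance 17/70) as given; the function name is illustrative, and the quantile is computed with the standard library instead of being read from tables.

```python
from math import sqrt
from statistics import NormalDist


def poisson_diff_ci(diff_hat, var_hat, level=0.99):
    """Approximate CI for lambda1 - lambda2: diff_hat +/- z * sqrt(var_hat)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 2.5758 for level = 0.99
    half_width = z * sqrt(var_hat)
    return diff_hat - half_width, diff_hat + half_width


lower, upper = poisson_diff_ci(1.0, 17 / 70)
print(round(lower, 3), round(upper, 3))
print(lower <= 0.0 <= upper)   # True means H0: lambda1 - lambda2 = 0 is not rejected at the 1% level
```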
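Exercise 2.17 (a) $Y_i \overset{iid}{\sim} \mathrm{Bern}(p)$, $i = 1, \ldots, n$, and we are interested in testing the hypothesis
$H_0 : p = p_0$ against $H_1 : p = p_1$. The likelihood is
$$L(p; y) = p^{\sum_{i=1}^{n} y_i}(1 - p)^{n - \sum_{i=1}^{n} y_i}$$
and so we get the likelihood ratio:
$$\begin{aligned}
\lambda(p) = \frac{L(p_0; y)}{L(p_1; y)} &= \frac{p_0^{\sum_{i=1}^{n} y_i}(1 - p_0)^{n - \sum_{i=1}^{n} y_i}}{p_1^{\sum_{i=1}^{n} y_i}(1 - p_1)^{n - \sum_{i=1}^{n} y_i}} \\
&= \left(\frac{p_0}{p_1}\right)^{\sum_{i=1}^{n} y_i}\left(\frac{1 - p_0}{1 - p_1}\right)^{n - \sum_{i=1}^{n} y_i} \\
&= \left(\frac{p_0(1 - p_1)}{p_1(1 - p_0)}\right)^{\sum_{i=1}^{n} y_i}\left(\frac{1 - p_0}{1 - p_1}\right)^{n}.
\end{aligned}$$
Then, the critical region is
$$R = \{y : \lambda(p) \le a\},$$
where $a$ is a constant chosen to give significance level $\alpha$. It means that we reject the null
hypothesis if
$$\left(\frac{p_0(1 - p_1)}{p_1(1 - p_0)}\right)^{\sum_{i=1}^{n} y_i}\left(\frac{1 - p_0}{1 - p_1}\right)^{n} \le a,$$
which is equivalent to
$$\left(\frac{p_0(1 - p_1)}{p_1(1 - p_0)}\right)^{\sum_{i=1}^{n} y_i} \le b,$$
or, after taking logs of both sides, to
$$\sum_{i=1}^{n} y_i \log\left(\frac{p_0(1 - p_1)}{p_1(1 - p_0)}\right) \le c,$$
where $b$ and $c$ are constants chosen to give significance level $\alpha$.
When $p_1 > p_0$ we have
$$\log\left(\frac{p_0(1 - p_1)}{p_1(1 - p_0)}\right) < 0.$$
Hence, the critical region can be written as
$$R = \{y : \bar{y} \ge d\},$$
for some constant $d$ chosen to give significance level $\alpha$.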
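By the central limit theorem, we have that (when the null hypothesis is true, i.e., $p = p_0$):
$$\bar{Y} \sim AN\left(p_0, \frac{p_0(1 - p_0)}{n}\right).$$
Hence,
$$Z = \frac{\bar{Y} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} \sim AN(0, 1)$$
and we may write
$$\alpha \cong P(\bar{Y} \ge d \mid p = p_0) = P(Z \ge z_\alpha),$$
where $z_\alpha = \frac{d - p_0}{\sqrt{p_0(1 - p_0)/n}}$. Hence $d = p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}}$ and the critical region is
$$R = \left\{y : \bar{y} \ge p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}}\right\}.$$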
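(b) The critical region does not depend on $p_1$, hence it is the same for all $p > p_0$ and so there is a
uniformly most powerful test for $H_0 : p = p_0$ against $H_1 : p > p_0$.
The power function is
$$\begin{aligned}
\beta(p) &= P(Y \in R \mid p) \\
&= P\left(\bar{Y} \ge p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}}\ \Big|\ p\right) \\
&= P\left(\frac{\bar{Y} - p}{\sqrt{\frac{p(1 - p)}{n}}} \ge \frac{p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}} - p}{\sqrt{\frac{p(1 - p)}{n}}}\right) \\
&\cong 1 - \Phi\{g(p)\},
\end{aligned}$$
where
$$g(p) = \frac{p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}} - p}{\sqrt{\frac{p(1 - p)}{n}}}$$
and $\Phi$ denotes the cumulative distribution function of the
standard normal distribution.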
Question 2.18 (a) Let us denote:
$Y_i \sim \mathrm{Bern}(p)$ - a response of mouse $i$ to the drug candidate.
Then, from Question 1, we have the following critical region
$$R = \left\{y : \bar{y} \ge p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}}\right\}.$$
Here $p_0 = 0.1$, $n = 30$, $\alpha = 0.05$, $z_\alpha = 1.6449$. It gives
$$R = \{y : \bar{y} \ge 0.19\}.$$
From the sample we have $\hat{p} = \bar{y} = \frac{6}{30} = 0.2$, that is, there is evidence to reject the null hypothesis
at the significance level $\alpha = 0.05$.
(b) The power function is
$$\beta(p) \cong 1 - \Phi\{g(p)\},$$
where
$$g(p) = \frac{p_0 + z_\alpha\sqrt{\frac{p_0(1 - p_0)}{n}} - p}{\sqrt{\frac{p(1 - p)}{n}}}.$$
When $n = 30$, $p_0 = 0.1$ and $p = 0.2$ we obtain, for $z_{0.05} = 1.6449$,
$g(0.2) = -0.1356$ and $\Phi(-0.1356) = 1 - \Phi(0.1356) = 1 - 0.5539 = 0.4461$.
It gives the power equal to $\beta(0.2) = 0.5539$. It means that the probability of type II error is
0.4461, which is rather high.
This is because the value of the alternative hypothesis is close to the null hypothesis and also the
number of observations is not large.
To find what $n$ is needed to get the power $\beta(0.2) = 0.8$ we calculate:
$$g(p) = \frac{0.1 + z_{0.05}\sqrt{\frac{0.09}{n}} - 0.2}{\sqrt{\frac{0.16}{n}}} = 0.75\,z_{0.05} - 0.25\sqrt{n}.$$
For $\beta(p) \cong 1 - \Phi\{g(p)\}$ to be equal to 0.8 we need $\Phi\{g(p)\} = 0.2$. From statistical tables
we obtain that $g(p) = -0.8416$. Hence, it gives, for $z_{0.05} = 1.6449$,
$$n = (4 \times 0.8416 + 3 \times 1.6449)^2 = 68.9.$$
At least $n = 69$ mice are needed to obtain a test with power as high as 0.8 for detecting that the
proportion is 0.2 rather than 0.1.
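The critical value, the power and the required sample size can all be recomputed from the normal approximation. The following is a sketch only: the function names are illustrative, the quantiles come from the standard library rather than from tables, and `sample_size` simply solves $g(p) = -z_{\text{power}}$ for $n$ under the same approximation.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def critical_value(p0, n, alpha):
    """Threshold d = p0 + z_alpha * sqrt(p0 (1 - p0) / n) for the sample proportion."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return p0 + z_alpha * sqrt(p0 * (1 - p0) / n)


def power(p, p0, n, alpha):
    """Approximate power 1 - Phi(g(p)) of the one-sided test at the alternative p."""
    g = (critical_value(p0, n, alpha) - p) / sqrt(p * (1 - p) / n)
    return 1 - STD_NORMAL.cdf(g)


def sample_size(p, p0, alpha, target_power):
    """Real-valued n solving g(p) = -z_power; round up to get a usable sample size."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_power = STD_NORMAL.inv_cdf(target_power)
    root_n = (z_alpha * sqrt(p0 * (1 - p0)) + z_power * sqrt(p * (1 - p))) / (p - p0)
    return root_n**2


d = critical_value(0.1, 30, 0.05)
beta = power(0.2, 0.1, 30, 0.05)
n_needed = sample_size(0.2, 0.1, 0.05, 0.8)
print(round(d, 2), round(beta, 3), ceil(n_needed))
```

Small differences from the tabulated figures in the last decimal place are expected, since the code uses unrounded normal quantiles.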