ST 380: Probability and Statistics for the Physical Sciences
Solution to Homework Assignment - 6
Prepared by Liwei Wang
Fall, 2009
Ch.2-Ex.7: (a) Since $X_i \overset{\text{i.i.d.}}{\sim} \text{Poi}(\lambda)$ for $i = 1, \ldots, n$, and $\Pr(X_i = x_i) = \frac{\lambda^{x_i}}{x_i!} e^{-\lambda}$, the likelihood function is given by
$$l(\lambda) = \prod_{i=1}^{n} \frac{\lambda^{x_i}}{x_i!} e^{-\lambda} = \frac{\lambda^{\sum_{i=1}^{n} x_i} e^{-n\lambda}}{\prod_{i=1}^{n} x_i!}$$
(b) Notice that
$$\frac{l(\lambda_1)}{l(\lambda_2)} = \left(\frac{\lambda_1}{\lambda_2}\right)^{\sum_i x_i} \exp\{-n(\lambda_1 - \lambda_2)\}$$
Hence, the likelihood function depends on the data only through $\sum_{i=1}^{n} x_i$ (and $n$).
(c) Since the value of λ that maximizes l(λ) also maximizes log(l(λ)), it is convenient to work with the log-likelihood function:
$$\ln(l(\lambda)) = \left(\sum_{i=1}^{n} x_i\right) \ln\lambda - n\lambda - \ln\left(\prod_{i=1}^{n} x_i!\right)$$
$$\frac{d\ln(l(\lambda))}{d\lambda} = \frac{\sum_{i=1}^{n} x_i}{\lambda} - n$$
Equating to 0 yields
$$0 = \frac{\sum_{i=1}^{n} x_i}{\lambda} - n \quad \Longrightarrow \quad \lambda = \bar{x} = \sum_{i=1}^{n} x_i / n$$
Consider $\frac{d^2\ln(l(\lambda))}{d\lambda^2} = -\frac{\sum_{i=1}^{n} x_i}{\lambda^2}$. When $\lambda = \bar{x}$, $\frac{d^2\ln(l(\lambda))}{d\lambda^2} = -\frac{n}{\bar{x}} < 0$ (provided $\bar{x} > 0$). So $\hat{\lambda} = \bar{x}$ is the m.l.e.
(d) In example 1.4, there is only one observation, X = 3. So the MLE of λ is 3.
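As a quick numerical check of (c), one can maximize the Poisson log-likelihood in R (a sketch; the simulated sample and the choice λ = 3 below are hypothetical, not part of the exercise):
set.seed(1)
x=rpois(20,lambda=3)                                          # hypothetical simulated data
loglik=function(lambda){sum(x)*log(lambda)-length(x)*lambda}  # log-likelihood up to a constant
optimize(loglik,c(0.01,10),maximum=T)$maximum                 # numerical m.l.e.
mean(x)                                                       # agrees with the closed form x-bar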
Ch.2-Ex.9: In example 2.8,
$$l(\theta) \propto \theta^8 (1 - \theta)^{137}$$
Since ln(x) is an increasing function of x, the θ that maximizes l(θ) also maximizes ln(l(θ)). Hence, to find the m.l.e., we can just consider the value of θ that maximizes ln(l(θ)):
$$\ln(l(\theta)) \propto 8\ln\theta + 137\ln(1 - \theta)$$
$$\frac{d\ln(l(\theta))}{d\theta} = \frac{8}{\theta} - \frac{137}{1-\theta}$$
Equating to 0 yields
$$0 = \frac{8}{\theta} - \frac{137}{1-\theta} \quad \Longrightarrow \quad \theta = \frac{8}{145}$$
Consider $\frac{d^2\ln(l(\theta))}{d\theta^2} = -\frac{8}{\theta^2} - \frac{137}{(1-\theta)^2}$, which is negative for every $\theta \in (0,1)$, in particular at $\theta = \frac{8}{145}$. So the m.l.e. for θ is $\hat{\theta} = \frac{8}{145}$.
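This closed form can be verified numerically in R (a sketch using the log-likelihood derived above, up to a constant):
loglik=function(theta){8*log(theta)+137*log(1-theta)}
optimize(loglik,c(0.001,0.999),maximum=T)$maximum  # approx. 0.05517 = 8/145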
Ch.2-Ex.10*: Method 1. Suppose that the exact distance between John's office and home is µ. Then it is reasonable to assume that the distance X that John runs follows a distribution with mean µ. For simplicity we assume that X ∼ N(µ, 1), i.e., a normal distribution with (unknown) mean µ and variance 1. A more flexible model would also treat the variance as unknown.
Under the above assumption, the number Y displayed on the odometer has probability $\Pr(Y = d) = F_X(d + 1) - F_X(d)$, where $F_X(x)$ denotes the cdf of X. Notice that $Y = \lfloor X \rfloor$, the largest integer less than or equal to X (the "floor" function). In the exercise, 3 is observed 7 times, and 4 is observed 3 times. Since the 10 observations are mutually independent, the likelihood function is
$$l(\mu) = \Pr(Y = 3)^7 \Pr(Y = 4)^3 = \left[\int_3^4 \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2}}\, dx\right]^7 \left[\int_4^5 \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2}}\, dx\right]^3$$
$$\ln(l(\mu)) = 7 \ln \int_3^4 \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2}}\, dx + 3 \ln \int_4^5 \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2}}\, dx$$
To find the m.l.e., we can use R (the coefficients 7 and 3 must match the counts of the readings 3 and 4, respectively):
> loglik=function(mu,sigma=1){7*log(pnorm(4,mu,sigma)-pnorm(3,mu,sigma))+
+ 3*log(pnorm(5,mu,sigma)-pnorm(4,mu,sigma))}
> curve(loglik,from=2,to=6)
> optimize(loglik,c(3,5),maximum=T,sigma=1)
$maximum
[1] 3.8002

$objective
[1] -10.56479

> abline(v=3.8002)

[Figure: the log-likelihood ln(l(µ)) plotted over µ from 2 to 6, with a vertical line at the maximizer.]

So the m.l.e. of the exact distance is 3.8002.
Method 2. Suppose the number displayed on the odometer is Y ∼ Poisson(λ), where λ is the exact distance between John's office and home. Using the result of Ch.2-Ex.7, we get the estimate
$$\hat{\lambda} = \bar{y} = \frac{3 \times 7 + 4 \times 3}{10} = 3.3$$
Method 3. The distance between home and office should be a continuous variable. A commonly used continuous distribution is the normal distribution. We can assume that Y ∼ N(µ, σ). Here, it is easy to see that µ represents the exact distance between home and office.
$$l(\mu, \sigma) = \prod_{i=1}^{10} \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(y_i-\mu)^2}{2\sigma^2}}$$
Using the result of Ch.2-Ex.13, we get the estimate µ̂ = ȳ = 3.3.
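Both methods reduce to the sample mean of the displayed readings, which is easy to check in R (a one-line sketch with the data entered directly):
y=c(rep(3,7),rep(4,3))  # odometer readings: 3 seven times, 4 three times
mean(y)                 # 3.3, the estimate under methods 2 and 3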
Using methods 2 and 3, we will always get an estimate between 3 and 4. But the exact distance exceeds 4 whenever the odometer shows 4. This is a drawback of methods 2 and 3. For method 1, I assume the variance to be 1, which may not be reasonable.
However, this assumption does not affect the estimate of µ much. Even setting the variance to 10, the estimate stays almost the same. For example, try the following R commands:
for(sigma in seq(0.5,2.5,l=10)){
cat(c("sigma=",round(sigma,3)),fill=T)
print(optimize(loglik,c(2.5,5.5),maximum=T,sigma=sigma))}
Ch.2-Ex.11: The m.l.e. µ̂ is obtained from the equation in part (b).
Ch.2-Ex.12: (a) Since $Y_i \overset{\text{i.i.d.}}{\sim} N(\mu, 1)$ and $f_{Y_i}(y_i) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(y_i-\mu)^2}{2}}$, the likelihood function is given by
$$l(\mu) = \prod_{i=1}^{n} f_{Y_i}(y_i) = \left(\frac{1}{\sqrt{2\pi}}\right)^{n} e^{-\frac{\sum_{i=1}^{n} (y_i-\mu)^2}{2}}$$
(b) Notice that
$$\frac{l(\mu_1)}{l(\mu_2)} = \exp\left\{-\frac{1}{2}\left[\sum_i (y_i - \mu_1)^2 - \sum_i (y_i - \mu_2)^2\right]\right\} = \exp\left\{-\frac{1}{2}\left[2(\mu_2 - \mu_1)\sum_i y_i + n(\mu_1^2 - \mu_2^2)\right]\right\}$$
We can see that the above expression depends on the data only through $\sum_{i=1}^{n} y_i$ (and $n$).
(c) Set µ = 0 for illustration. R code:
mu0=0;n=10
y=rnorm(n,mean=mu0)
lik=function(mu){prod(dnorm(y,mu))/prod(dnorm(y,mean(y)))}
likelihood=function(mu){sapply(mu,lik)}
mu=seq(-3,3,l=100)
plot(mu,likelihood(mu),type="l")
for(l in 1:30){
y=rnorm(n,mean=mu0)
lines(mu,likelihood(mu),col="gray")}
[Figure: 30 simulated relative likelihood curves, likelihood(mu) plotted against mu from −3 to 3.]
We need to find the likelihood set $LS_{0.1} = \{\mu : l(\mu) \geq 0.1\, l(\hat{\mu})\}$ to characterize the accuracy of estimation. Since $l(\mu)/l(\hat{\mu}) = \exp\{-n(\mu - \bar{y})^2/2\}$, it can be shown that for this example,
$$LS_\alpha = \left(\bar{y} - \sqrt{-\frac{2}{n}\log(\alpha)},\ \bar{y} + \sqrt{-\frac{2}{n}\log(\alpha)}\right) \quad \text{for any } \alpha \in (0, 1).$$
In particular, for α = 0.1 and n = 10, the length of the interval is $2\sqrt{-\frac{2}{10}\log(0.1)} \approx 1.357$.
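The half-width and length can be computed directly in R (a sketch; only α and n are needed, since the set is centered at ȳ):
n=10; alpha=0.1
half=sqrt(-2*log(alpha)/n)  # half-width of LS_alpha
2*half                      # interval length, approx. 1.357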
(d) Since $Y_i \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma)$ and $f_{Y_i}(y_i) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(y_i-\mu)^2}{2\sigma^2}}$, the likelihood is given by
$$l(\mu) = \prod_{i=1}^{10} f_{Y_i}(y_i) = \left(\frac{1}{\sqrt{2\pi}\sigma}\right)^{10} e^{-\frac{\sum_{i=1}^{10}(y_i-\mu)^2}{2\sigma^2}}$$
Here, σ is assumed known.
(e) Similar to (d),
$$l(\sigma) = \left(\frac{1}{\sqrt{2\pi}\sigma}\right)^{10} e^{-\frac{\sum_{i=1}^{10}(y_i-\mu)^2}{2\sigma^2}}$$
Here, µ is assumed known.
Ch.2-Ex.13: Using the result of Ch.2-Ex.12(a),
$$l(\mu) \propto e^{-\frac{-2\mu\sum_{i=1}^{n} y_i + n\mu^2}{2}}$$
Since ln(x) is an increasing function of x, the µ that maximizes l(µ) also maximizes ln(l(µ)). Hence, to find the m.l.e., we can just consider ln(l(µ)). It easily follows that $\frac{d\log(l(\mu))}{d\mu} = \sum_i y_i - n\mu = 0$ if $\mu = \bar{y} = \sum_i y_i / n$. Also it follows easily that $\frac{d^2\log(l(\mu))}{d\mu^2} = -n < 0$, and hence the MLE of µ is ȳ.
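A numerical check in R (a sketch; the simulated sample and the true mean 1 are hypothetical):
set.seed(2)
y=rnorm(10,mean=1)                          # hypothetical simulated data
loglik=function(mu){-sum((y-mu)^2)/2}       # log-likelihood up to a constant
optimize(loglik,c(-5,5),maximum=T)$maximum  # numerical m.l.e.
mean(y)                                     # closed form y-bar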
Ch.2-Ex.14: Using the result of Ch.2-Ex.12(e),
$$l(\sigma) = \left(\frac{1}{\sqrt{2\pi}\sigma}\right)^{n} e^{-\frac{\sum_{i=1}^{n}(y_i-\mu)^2}{2\sigma^2}} \propto \sigma^{-n} e^{-\frac{\sum_{i=1}^{n}(y_i-\mu)^2}{2\sigma^2}}$$
Since ln(x) is an increasing function of x, the σ that maximizes l(σ) also maximizes ln(l(σ)). Hence, to find the m.l.e., we can just consider ln(l(σ)). It follows easily that
$$\frac{d\log(l(\sigma))}{d\sigma} = -\frac{n}{\sigma} + \frac{\sum_{i=1}^{n}(y_i-\mu)^2}{\sigma^3}.$$
Equating to 0 yields
$$0 = -\frac{n}{\sigma} + \frac{\sum_{i=1}^{n}(y_i-\mu)^2}{\sigma^3} \quad \Longrightarrow \quad \sigma^2 = \frac{\sum_{i=1}^{n}(y_i-\mu)^2}{n}$$
Consider $\frac{d^2\ln(l(\sigma))}{d\sigma^2} = \frac{n}{\sigma^2} - \frac{3\sum_{i=1}^{n}(y_i-\mu)^2}{\sigma^4} = \frac{1}{\sigma^2}\left[n - \frac{3\sum_{i=1}^{n}(y_i-\mu)^2}{\sigma^2}\right]$. When $\sigma^2 = \frac{\sum_{i=1}^{n}(y_i-\mu)^2}{n}$, $\frac{d^2\ln(l(\sigma))}{d\sigma^2} = -\frac{2n}{\sigma^2} < 0$. So the m.l.e. for σ is $\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n}(y_i-\mu)^2}{n}}$.
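A numerical check in R (a sketch; µ is treated as known, matching the exercise, and the simulated sample with sd = 2 is hypothetical):
set.seed(3)
mu=0
y=rnorm(10,mean=mu,sd=2)                                          # hypothetical simulated data with known mu
loglik=function(sigma){-10*log(sigma)-sum((y-mu)^2)/(2*sigma^2)}  # log-likelihood up to a constant
optimize(loglik,c(0.01,10),maximum=T)$maximum                     # numerical m.l.e. of sigma
sqrt(sum((y-mu)^2)/10)                                            # closed form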