Solution to Bonus Questions Q2 and Q3

Q2:
(a) The histograms of the 1000 sample means and 1000 sample variances are plotted below. Both histograms are symmetrically centered around the true lambda value 20, but the sample variances have a much larger spread and range than the sample means. Hence, as an estimator of lambda, the sample mean is better than the sample variance because it is more precise.
[Figure: side-by-side histograms titled "Histogram of Sample Means" (x-axis xbar) and "Histograms of Sample Variances" (x-axis s2), y-axis Frequency; a red vertical line marks the true lambda = 20.]
(b) The average of the 1000 sample means is 19.96 and their variance is 0.68. This gives a simulated MSE for the sample mean of (19.96 - 20)^2 + 0.68 ≈ 0.68.
(c) The average of the 1000 sample variances is 19.78 and their variance is 29.00. This gives a simulated MSE for the sample variance of (19.78 - 20)^2 + 29.00 ≈ 29.04.
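In both parts, the simulated MSE is the squared bias plus the variance of the 1000 estimates. A minimal sketch of this decomposition as a reusable helper (the function name simMSE is ours; xbar and s2 are the vectors built in the appendix code):

## simulated MSE = squared bias + variance of the simulated estimates
simMSE=function(est,truth) (mean(est)-truth)^2+var(est)
simMSE(xbar,20)  # about 0.68 for the sample means
simMSE(s2,20)    # about 29.04 for the sample variances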
(d) Based on parts (b) and (c), the sample mean appears to be a better estimator of lambda than the sample variance, since it has both a smaller variance and a smaller MSE. (The bias of the sample means and the bias of the sample variances are about the same.) The result is the same regardless of the choice of lambda; see the following table, whose columns correspond to lambda = 20, 30, 50, 100.
              20     30      50    100
Bias:xbar -0.011 -0.018 4.4e-03  -0.03
Bias:s2   -0.053  0.250 1.7e-01   0.72
Var:xbar   0.663  1.024 1.6e+00   3.27
Var:s2    26.081 58.705 1.8e+02 735.11
MSE:xbar   0.663  1.024 1.6e+00   3.27
MSE:s2    26.084 58.767 1.8e+02 735.64
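As a sanity check, the Var:xbar row matches the theoretical variance of the mean of n = 30 i.i.d. Poisson observations, Var(xbar) = lambda/n (a quick sketch; the name lambdas is ours):

lambdas=c(20,30,50,100)
lambdas/30  # 0.667 1.000 1.667 3.333, close to the Var:xbar row above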
(e) The first few confidence intervals, of the form xbar ± 1.96*sqrt(s2/30), are shown below as an illustration. In total, 53 out of the 1000 simulated confidence intervals do not contain the true lambda = 20. This is close to what we expect, since we constructed 95% confidence intervals (about 50 misses out of 1000).
[,1] [,2]
[1,] 19.01265 22.78735
[2,] 19.11671 22.34995
[3,] 18.05720 21.00947
[4,] 19.85984 22.94016
[5,] 18.02332 22.04335
[6,] 18.17906 20.75427
(f) The first few confidence intervals, of the form xbar ± 1.96*sqrt(xbar/30) (using the Poisson property that the variance equals the mean), are shown below as an illustration. In total, 51 out of the 1000 simulated confidence intervals do not contain the true lambda = 20. Again, this is close to what we expect from 95% confidence intervals.
[,1] [,2]
[1,] 19.26406 22.53594
[2,] 19.10392 22.36274
[3,] 17.95178 21.11489
[4,] 19.74460 23.05540
[5,] 18.43167 21.63500
[6,] 17.88782 21.04552
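The empirical coverage of both sets of intervals can be confirmed directly (a sketch, using the CIe and CIf matrices built in the appendix code):

mean(CIe[,1]<=lambda & lambda<=CIe[,2])  # 0.947, i.e., 947/1000 for part (e)
mean(CIf[,1]<=lambda & lambda<=CIf[,2])  # 0.949, i.e., 949/1000 for part (f)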
(g) Comparing the results from parts (e) and (f), both kinds of confidence intervals have similar coverage (a similar chance of containing the true lambda). The average lengths of the confidence intervals in parts (e) and (f) are 3.15 and 3.20, which are similar too.
However, if we further check the standard deviation of the lengths of the confidence intervals, we find that the interval length in part (f) is much more stable than that in part (e). The standard deviation of the CI lengths in part (e) is 0.43, while it is only 0.07 in part (f), about 15% of the former. This makes sense: the length of a part (f) interval, 2*1.96*sqrt(xbar/30), depends only on the relatively stable xbar, while the length of a part (e) interval, 2*1.96*sqrt(s2/30), depends on the much more variable s2. In the figure below, we plot the upper limit against the lower limit of each confidence interval; the black points are the CIs from part (e) and the blue points are those from part (f). The blue points are clearly more tightly clustered than the black points, which again shows that the CIs in part (f) are more stable than those in part (e). The points outside the two red lines are the CIs that do not contain the true lambda.
To summarize, we conclude that the confidence intervals in part (f) are better, because they use extra distributional information. The CI for a population mean in part (e) is valid no matter what the distribution of the data is, while the CI in part (f) uses the additional fact that the mean and variance of Poisson data are equal, and is therefore more efficient and more precise.
[Figure: "Confidence Intervals for lambda", plotting the upper limit against the lower limit of each CI; black points are the CIs in (e), blue points the CIs in (f), and red dashed lines mark lambda = 20.]
Q3:
(a) The regression output from R is attached below. About 29.49% of the variability in PIQ is accounted for by a person's brain size, height, and weight.
> summary(regfit)

Call:
lm(formula = PIQ ~ MRI + Height + Weight, data = piq)

Residuals:
   Min     1Q Median     3Q    Max
-32.74 -12.09  -3.84  14.17  51.69

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.114e+02  6.297e+01   1.768 0.085979 .
MRI          2.060e+00  5.634e-01   3.657 0.000856 ***
Height      -2.732e+00  1.229e+00  -2.222 0.033034 *
Weight       5.599e-04  1.971e-01   0.003 0.997750
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.79 on 34 degrees of freedom
Multiple R-squared: 0.2949,	Adjusted R-squared: 0.2327
F-statistic: 4.741 on 3 and 34 DF,  p-value: 0.007215
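The R-squared quoted above can also be recovered by hand from its definition, 1 - SSE/SST (a sketch, assuming regfit and the piq data frame from the appendix code):

1-sum(resid(regfit)^2)/sum((piq$PIQ-mean(piq$PIQ))^2)  # about 0.2949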
(b) A person's brain size significantly affects his/her PIQ: a one-unit increase in brain size (MRI count/10000) is associated with a 2.06-point increase in PIQ, holding height and weight fixed. Height is negatively associated with PIQ: being one inch taller is associated with a 2.73-point decrease in PIQ. Weight is not significantly associated with PIQ given a person's brain size and height.
(c) The correlation matrix of brain size, height, and weight is given below, together with their correlations with PIQ. Among the predictors, height and weight are the most highly correlated, with correlation 0.70.
          PIQ   MRI Height Weight
PIQ     1.000 0.378 -0.093  0.003
MRI     0.378 1.000  0.588  0.513
Height -0.093 0.588  1.000  0.700
Weight  0.003 0.513  0.700  1.000
The variance inflation factors (VIFs) of brain size, height, and weight are 1.58, 2.28, and 2.02, respectively. These are all well below the common cutoff of 10, so there does not appear to be serious multicollinearity.
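These VIF values can be reproduced by hand as 1/(1 - R_j^2), where R_j^2 comes from regressing predictor j on the other two (a sketch for MRI, assuming the piq data frame from the appendix code):

r2.mri=summary(lm(MRI~Height+Weight,data=piq))$r.squared
1/(1-r2.mri)  # about 1.58, matching vif(regfit)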
(d) The correlations between PIQ and brain size, height, and weight are 0.378, -0.093, and 0.003, respectively. The partial correlation between PIQ and height given brain size is -0.42 (with p-value = 0.02), and the partial correlation between PIQ and weight given brain size is -0.24 (with p-value = 0.16). Hence, height is the better predictor to add when brain size is already included in the model.
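Equivalently, the partial correlation between PIQ and Height given MRI can be computed by correlating residuals, a useful cross-check on the SSE-based calculation in the appendix (a sketch, assuming the piq data frame):

e.piq=resid(lm(PIQ~MRI,data=piq))
e.height=resid(lm(Height~MRI,data=piq))
cor(e.piq,e.height)  # about -0.42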
Appendix:
###### Bonus Question 2
## Part(a)
lambda=20
xn=matrix(rpois(30*1000,lambda),ncol=1000,nrow=30)
xbar=apply(xn,2,mean)
s2=apply(xn,2,var)
par(mfrow=c(1,2))
hist(xbar,main="Histogram of Sample Means")
abline(v=lambda,col='red')
hist(s2,main="Histograms of Sample Variances")
abline(v=lambda,col='red')
## Part (b)
mean(xbar)
var(xbar)
MSExbar=(mean(xbar)-lambda)^2+var(xbar)
MSExbar
## Part (c)
mean(s2)
var(s2)
MSEs2=(mean(s2)-lambda)^2+var(s2)
MSEs2
## Part (d)
## This comparison under different lambda will be done later.
## Part (e)
CIe=cbind(xbar-1.96*sqrt(s2)/sqrt(30),xbar+1.96*sqrt(s2)/sqrt(30))
head(CIe)
sum(CIe[,1]>lambda)+ sum(CIe[,2]<lambda)
## Part (f)
CIf=cbind(xbar-1.96*sqrt(xbar)/sqrt(30),xbar+1.96*sqrt(xbar)/sqrt(30))
head(CIf)
sum(CIf[,1]>lambda)+ sum(CIf[,2]<lambda)
## Part(g)
mean(CIe[,2]-CIe[,1])
sd(CIe[,2]-CIe[,1])
mean(CIf[,2]-CIf[,1])
sd(CIf[,2]-CIf[,1])
par(pch=19)
plot(CIe[,1],CIe[,2],xlab="Lower limit",ylab="Upper limit",
main="Confidence Intervals for lambda")
points(CIf[,1],CIf[,2],col="blue")
abline(v=lambda,col='red',lty=2)
abline(h=lambda,col='red',lty=2)
legend("topleft",c("CIs in (e)","CIs in (f)"),col=c("black","blue"),pch=19)
## Part (d)
bias=c()
variance=c()
mse=c()
for (lambda in c(20,30,50,100)){
## simulate 1000 samples
xn=matrix(rpois(30*1000,lambda),ncol=1000,nrow=30)
## record the sample mean and sample variance of each of the 1000 samples
xbar=apply(xn,2,mean)
s2=apply(xn,2,var)
bias=cbind(bias,c((mean(xbar)-lambda), (mean(s2)-lambda)) )
variance=cbind(variance, c(var(xbar),var(s2)))
MSExbar=(mean(xbar)-lambda)^2+var(xbar)
MSEs2=(mean(s2)-lambda)^2+var(s2)
mse=cbind(mse,c(MSExbar,MSEs2))
}
mysummary=rbind(bias,variance,mse)
colnames(mysummary)=c(20,30,50,100)
rownames(mysummary)=c("Bias:xbar","Bias:s2","Var:xbar","Var:s2","MSE:xbar","MSE:s2")
print(mysummary,digits=2)
###### Bonus Question 3
## part(a) Ex. 11.3
piq=read.table("C:/DATA/work/teaching/math3200/Tamhane_Data/Tamhane_Data/ASCII/Chapt11/Ex11_3.txt",header=T)
head(piq)
regfit=lm(PIQ~MRI+Height+Weight,data=piq)
summary(regfit)
## part(c) Ex 11.34
print(cor(piq),digits=3)
library(car)
vif(regfit)
## part(d) Ex 11.41
## Partial correlations from error sums of squares:
## r^2 = (SSE(reduced)-SSE(full))/SSE(reduced); the negative sign matches the slopes.
SSEx1=16198    # SSE of PIQ ~ MRI
SSEx1x2=13322  # SSE of PIQ ~ MRI + Height
SSEx1x3=15258  # SSE of PIQ ~ MRI + Weight
n=dim(piq)[1]
(r12=-sqrt((SSEx1-SSEx1x2)/SSEx1))   # partial cor(PIQ, Height | MRI)
(f2=(SSEx1-SSEx1x2)/(SSEx1/(n-3)))
1-pf(f2,df1=1,df2=n-3)
(r13=-sqrt((SSEx1-SSEx1x3)/SSEx1))   # partial cor(PIQ, Weight | MRI)
(f3=(SSEx1-SSEx1x3)/(SSEx1/(n-3)))
1-pf(f3,df1=1,df2=n-3)               # f3 equals r13^2*(n-3)