Homework 7 Solutions

set.seed(1)
library(mvtnorm)
library(MCMCpack)
## Loading required package: coda
## Loading required package: MASS
## ##
## ## Markov Chain Monte Carlo Package (MCMCpack)
## ## Copyright (C) 2003-2016 Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park
## ##
## ## Support provided by the U.S. National Science Foundation
## ## (Grants SES-0350646 and SES-0350613)
## ##
library(coda)
library(knitr)
Problem 1
Part a.
Intuitively, we can expect $\text{Var}(y_{i,j} \mid \theta_j, \sigma^2) < \text{Var}(y_{i,j} \mid \mu, \tau^2, \sigma^2)$ because we can think of the first as expressing only the within-group variability that arises from the second part of the hierarchical model we have assumed, while thinking of the second as expressing both the within-group variability that arises from the second part of the model and the across-group variability that arises from the first part of the model.
Part b.
Intuitively, we can expect $\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \theta_j, \sigma^2) = 0$ because $y_{i_1,j}$ doesn't tell us anything about $y_{i_2,j}$ beyond the shared component $\theta_j$, which is already known. We can expect $\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \mu, \tau^2, \sigma^2) > 0$ because $y_{i_1,j}$ tells us about $y_{i_2,j}$ through the shared component $\theta_j$.
Part c.
For this part, let's write $y_{i,j} = \theta_j + \epsilon_{i,j}$, where $\epsilon_{i,j} \overset{\text{i.i.d.}}{\sim} \text{normal}(0, \sigma^2)$. This is the same model as before.
$$\begin{aligned}
\text{Var}(y_{i,j} \mid \theta_j, \sigma^2) &= \text{Var}(\theta_j + \epsilon_{i,j} \mid \theta_j, \sigma^2) \\
&= \text{Var}(\epsilon_{i,j} \mid \theta_j, \sigma^2) \\
&= \sigma^2
\end{aligned}$$
$$\begin{aligned}
\text{Var}(y_{i,j} \mid \mu, \tau^2, \sigma^2) &= \text{Var}\left(\text{E}(\theta_j + \epsilon_{i,j} \mid \theta_j, \sigma^2) \mid \mu, \tau^2, \sigma^2\right) + \text{E}\left(\text{Var}(\theta_j + \epsilon_{i,j} \mid \theta_j, \sigma^2) \mid \mu, \tau^2, \sigma^2\right) \\
&= \text{Var}(\theta_j \mid \mu, \tau^2, \sigma^2) + \text{E}(\sigma^2 \mid \mu, \tau^2, \sigma^2) \\
&= \tau^2 + \sigma^2
\end{aligned}$$
This is consistent with the answer to Part a.: $\text{Var}(y_{i,j} \mid \theta_j, \sigma^2) = \sigma^2 < \sigma^2 + \tau^2 = \text{Var}(y_{i,j} \mid \mu, \tau^2, \sigma^2)$.
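As a quick sanity check, both variances can be verified by Monte Carlo simulation. This sketch is not part of the original solution, and the values of mu, tau.sq, and sigma.sq below are arbitrary choices:

```r
## Monte Carlo check of the two variances (arbitrary parameter values).
set.seed(10)
mu <- 7; tau.sq <- 4; sigma.sq <- 9
S <- 1e5
## Conditional on theta_j (held fixed at 5), only epsilon varies: Var = sigma^2.
y.cond <- 5 + rnorm(S, 0, sqrt(sigma.sq))
var(y.cond)                      # approximately sigma.sq = 9
## Marginal over theta_j ~ normal(mu, tau^2): Var = tau^2 + sigma^2.
theta <- rnorm(S, mu, sqrt(tau.sq))
y.marg <- theta + rnorm(S, 0, sqrt(sigma.sq))
var(y.marg)                      # approximately tau.sq + sigma.sq = 13
```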
$$\begin{aligned}
\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \theta_j, \sigma^2) &= \text{E}\left[\left(y_{i_1,j} - \text{E}(y_{i_1,j} \mid \theta_j, \sigma^2)\right)\left(y_{i_2,j} - \text{E}(y_{i_2,j} \mid \theta_j, \sigma^2)\right) \mid \theta_j, \sigma^2\right] \\
&= \text{E}\left[(y_{i_1,j} - \theta_j)(y_{i_2,j} - \theta_j) \mid \theta_j, \sigma^2\right] \\
&= \text{E}\left[\epsilon_{i_1,j}\,\epsilon_{i_2,j} \mid \theta_j, \sigma^2\right] \\
&= 0
\end{aligned}$$
$$\begin{aligned}
\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \mu, \tau^2, \sigma^2) &= \text{E}\left[\left(y_{i_1,j} - \text{E}(y_{i_1,j} \mid \mu, \tau^2, \sigma^2)\right)\left(y_{i_2,j} - \text{E}(y_{i_2,j} \mid \mu, \tau^2, \sigma^2)\right) \mid \mu, \tau^2, \sigma^2\right] \\
&= \text{E}\left[(y_{i_1,j} - \mu)(y_{i_2,j} - \mu) \mid \mu, \tau^2, \sigma^2\right] \\
&= \text{E}\left[(\theta_j - \mu + \epsilon_{i_1,j})(\theta_j - \mu + \epsilon_{i_2,j}) \mid \mu, \tau^2, \sigma^2\right] \\
&= \text{E}\left[(\theta_j - \mu)^2 + (\theta_j - \mu)(\epsilon_{i_1,j} + \epsilon_{i_2,j}) + \epsilon_{i_1,j}\,\epsilon_{i_2,j} \mid \mu, \tau^2, \sigma^2\right] \\
&= \tau^2
\end{aligned}$$
This is consistent with the answer to Part b.: $\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \theta_j, \sigma^2) = 0$ and $\text{Cov}(y_{i_1,j}, y_{i_2,j} \mid \mu, \tau^2, \sigma^2) = \tau^2 > 0$.
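The two covariances can be checked the same way by simulation. Again, this is a sketch with arbitrary parameter values, not part of the original solution:

```r
## Monte Carlo check of the two covariances (arbitrary parameter values).
set.seed(11)
mu <- 7; tau.sq <- 4; sigma.sq <- 9
S <- 1e5
e1 <- rnorm(S, 0, sqrt(sigma.sq))
e2 <- rnorm(S, 0, sqrt(sigma.sq))
## Conditional on theta_j (held fixed), the observations share nothing:
cov(5 + e1, 5 + e2)              # approximately 0
## Marginally, they share theta_j: Cov = tau^2.
theta <- rnorm(S, mu, sqrt(tau.sq))
cov(theta + e1, theta + e2)      # approximately tau.sq = 4
```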
Part d.
$$\begin{aligned}
p(\mu \mid \theta_1, \ldots, \theta_m, \sigma^2, \tau^2, y_1, \ldots, y_m) &= \frac{p(\mu, \theta_1, \ldots, \theta_m, y_1, \ldots, y_m \mid \sigma^2, \tau^2)}{\int_{-\infty}^{\infty} p(\mu, \theta_1, \ldots, \theta_m, y_1, \ldots, y_m \mid \sigma^2, \tau^2)\, d\mu} \\
&= \frac{p(y_1, \ldots, y_m \mid \theta_1, \ldots, \theta_m, \sigma^2)\, p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)}{\int_{-\infty}^{\infty} p(y_1, \ldots, y_m \mid \theta_1, \ldots, \theta_m, \sigma^2)\, p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)\, d\mu} \\
&= \frac{p(y_1, \ldots, y_m \mid \theta_1, \ldots, \theta_m, \sigma^2)\, p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)}{p(y_1, \ldots, y_m \mid \theta_1, \ldots, \theta_m, \sigma^2) \int_{-\infty}^{\infty} p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)\, d\mu} \\
&= \frac{p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)}{\int_{-\infty}^{\infty} p(\theta_1, \ldots, \theta_m \mid \mu, \tau^2)\, p(\mu)\, d\mu} \\
&= p(\mu \mid \theta_1, \ldots, \theta_m, \tau^2)
\end{aligned}$$
This means that, conditional on $\theta_1, \ldots, \theta_m$ and $\tau^2$, there is no residual information about $\mu$ remaining in $y_1, \ldots, y_m$ or $\sigma^2$, i.e. $\mu \perp y_1, \ldots, y_m \mid \theta_1, \ldots, \theta_m$.
Problem 2
y1 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school1.dat",
                 header = FALSE)[, 1]
y2 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school2.dat",
                 header = FALSE)[, 1]
y3 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school3.dat",
                 header = FALSE)[, 1]
y4 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school4.dat",
                 header = FALSE)[, 1]
y5 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school5.dat",
                 header = FALSE)[, 1]
y6 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school6.dat",
                 header = FALSE)[, 1]
y7 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school7.dat",
                 header = FALSE)[, 1]
y8 <- read.table("http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school8.dat",
                 header = FALSE)[, 1]
Y <- list(y1, y2, y3, y4, y5, y6, y7, y8)
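The eight reads above can equivalently be written as a loop. This is a sketch assuming the same URL pattern as above; it produces the same list Y:

```r
## Build the school data list in one pass; base is the shared URL prefix.
base <- "http://www.stat.washington.edu/~pdhoff/Book/Data/hwdata/school"
Y <- lapply(1:8, function(j) {
  read.table(paste0(base, j, ".dat"), header = FALSE)[, 1]
})
```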
samp.mu <- function(theta, tau.sq, mu.0, gamma.sq.0) {
  m <- length(theta)
  mu.var <- 1/(m/tau.sq + 1/gamma.sq.0)
  mu.mean <- mu.var*(m*mean(theta)/tau.sq + mu.0/gamma.sq.0)
  return(rnorm(1, mu.mean, sqrt(mu.var)))
}
samp.tau.sq <- function(theta, mu, eta.0, tau.sq.0) {
  m <- length(theta)
  tau.sq.inv.a <- (eta.0 + m)/2
  tau.sq.inv.b <- (eta.0*tau.sq.0 + sum((theta - mu)^2))/2
  return(1/rgamma(1, tau.sq.inv.a, tau.sq.inv.b))
}
samp.sigma.sq <- function(theta, Y, nu.0, sigma.sq.0) {
  n <- sum(unlist(lapply(Y, function(x) {length(x)})))
  ss <- 0
  for (i in 1:length(theta)) {
    ss <- ss + sum((Y[[i]] - theta[i])^2)
  }
  sigma.sq.inv.a <- (nu.0 + n)/2
  sigma.sq.inv.b <- (nu.0*sigma.sq.0 + ss)/2
  return(1/rgamma(1, sigma.sq.inv.a, sigma.sq.inv.b))
}
samp.theta <- function(Y, mu, sigma.sq, tau.sq) {
  n <- unlist(lapply(Y, function(x) {length(x)}))
  m <- length(Y)
  y.bar <- unlist(lapply(Y, function(x) {mean(x)}))
  theta.var <- 1/(n/sigma.sq + 1/tau.sq)
  theta.mean <- theta.var*(n*y.bar/sigma.sq + mu/tau.sq)
  return(rnorm(m, theta.mean, sqrt(theta.var)))
}
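For reference, these helper functions draw from the standard full conditional distributions for the hierarchical normal model (writing $n_j$ for the size and $\bar y_j$ for the mean of group $j$, $n = \sum_j n_j$, and $\bar\theta$ for the mean of the $\theta_j$):

$$\mu \mid \theta_1, \ldots, \theta_m, \tau^2 \sim \text{normal}\left(\frac{m\bar\theta/\tau^2 + \mu_0/\gamma_0^2}{m/\tau^2 + 1/\gamma_0^2},\ \left[\frac{m}{\tau^2} + \frac{1}{\gamma_0^2}\right]^{-1}\right)$$
$$1/\tau^2 \mid \theta_1, \ldots, \theta_m, \mu \sim \text{gamma}\left(\frac{\eta_0 + m}{2},\ \frac{\eta_0\tau_0^2 + \sum_{j=1}^m (\theta_j - \mu)^2}{2}\right)$$
$$1/\sigma^2 \mid \theta_1, \ldots, \theta_m, y_1, \ldots, y_m \sim \text{gamma}\left(\frac{\nu_0 + n}{2},\ \frac{\nu_0\sigma_0^2 + \sum_{j=1}^m \sum_{i=1}^{n_j} (y_{i,j} - \theta_j)^2}{2}\right)$$
$$\theta_j \mid \mu, \sigma^2, \tau^2, y_j \sim \text{normal}\left(\frac{n_j\bar y_j/\sigma^2 + \mu/\tau^2}{n_j/\sigma^2 + 1/\tau^2},\ \left[\frac{n_j}{\sigma^2} + \frac{1}{\tau^2}\right]^{-1}\right)$$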
mu.0 <- 7
gamma.sq.0 <- 5
tau.sq.0 <- 10
eta.0 <- 2
sigma.sq.0 <- 15
nu.0 <- 2
theta <- unlist(lapply(Y, function(x) {mean(x)}))
sigma.sq <- 0
for (i in 1:length(theta)) {
  sigma.sq <- sigma.sq + sum((Y[[i]] - theta[i])^2)/length(Y[[i]])
}
mu <- mean(theta)
tau.sq <- mean((theta - mu)^2)
S <- 5000
thetas <- matrix(nrow = S, ncol = length(theta))
sigma.sqs <- numeric(S)
mus <- numeric(S)
tau.sqs <- numeric(S)
r.prior <- numeric(S)
for (i in 1:S) {
  thetas[i, ] <- theta <- samp.theta(Y, mu, sigma.sq, tau.sq)
  sigma.sqs[i] <- sigma.sq <- samp.sigma.sq(theta, Y, nu.0, sigma.sq.0)
  mus[i] <- mu <- samp.mu(theta, tau.sq, mu.0, gamma.sq.0)
  tau.sqs[i] <- tau.sq <- samp.tau.sq(theta, mu, eta.0, tau.sq.0)
}
Part a.
esss <- as.matrix(c(effectiveSize(sigma.sqs), effectiveSize(tau.sqs), effectiveSize(mus)))
row.names(esss) <- c("$\\sigma^2$", "$\\tau^2$", "$\\mu$")
colnames(esss) <- c("Effective Sample Size (out of 5,000)")
kable(esss)
     Effective Sample Size (out of 5,000)
σ²                               4740.527
τ²                               3618.222
µ                                4177.442
We can see all of the effective sample sizes exceed 1,000. For full credit, it was necessary to also present some additional assessment of convergence (e.g. autocorrelation measures, plots of the sampled values) and comment on convergence.
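For example, one such assessment is to examine autocorrelation and trace plots. This sketch assumes the vectors sigma.sqs, tau.sqs, and mus produced by the sampler above:

```r
## Autocorrelation and trace plots for each chain.
par(mfrow = c(2, 3))
acf(sigma.sqs, main = expression(sigma^2))
acf(tau.sqs, main = expression(tau^2))
acf(mus, main = expression(mu))
plot(sigma.sqs, type = "l", ylab = expression(sigma^2))
plot(tau.sqs, type = "l", ylab = expression(tau^2))
plot(mus, type = "l", ylab = expression(mu))
```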
Part b.
post.mean.ci <- rbind(c(mean(sigma.sqs), quantile(sigma.sqs, c(0.025, 0.975))),
                      c(mean(tau.sqs), quantile(tau.sqs, c(0.025, 0.975))),
                      c(mean(mus), quantile(mus, c(0.025, 0.975))))
row.names(post.mean.ci) <- c("$\\sigma^2$", "$\\tau^2$", "$\\mu$")
colnames(post.mean.ci) <- c("Mean", "2.5\\%", "97.5\\%")
kable(post.mean.ci)
         Mean        2.5%       97.5%
σ²   14.484544   11.732720   17.821712
τ²    5.601824    1.903702   14.897491
µ     7.548087    5.914609    9.127841
par(mfrow = c(1, 3))
par(mar = c(6, 4, 6, 1))
plot(density(sigma.sqs), xlab = expression(sigma^2), main = "")
lines(seq(0.1, 22, by = 0.1),
      dinvgamma(seq(0.1, 22, by = 0.1), nu.0/2, nu.0*sigma.sq.0/2),
      col = "blue")
plot(density(tau.sqs), xlab = expression(tau^2), main = "")
lines(seq(0.1, 50, by = 0.1),
      dinvgamma(seq(0.1, 50, by = 0.1), eta.0/2, eta.0*tau.sq.0/2),
      col = "blue")
legend("topright", lty = c(1, 1), col = c("black", "blue"), legend = c("Post.", "Prior"),
       cex = 0.75)
plot(density(mus), xlab = expression(mu), main = "")
lines(seq(4, 12, by = 0.1),
      dnorm(seq(4, 12, by = 0.1), mu.0, sd = sqrt(gamma.sq.0)),
      col = "blue")
[Figure: posterior (black) and prior (blue) densities of σ², τ², and µ]
For all three parameters, we see that the posterior distributions are more concentrated than the corresponding prior distributions. We observe that our conclusions regarding τ² and µ are pretty consistent with our prior expectations. In contrast, our conclusions regarding σ² are not very consistent with our prior expectations; we conclude σ² is much larger than we initially expected.
Note - you could have also approximated the prior densities by simulating from them. This may produce results that look quite different from the results shown here because the approximations obtained in this way are pretty unstable, even when many draws are taken from the priors. Given that the densities are actually known and easily computed, the approach shown above is definitely preferable.
Part c.
tau.sq.prior <- 1/rgamma(1000000, eta.0/2, eta.0*tau.sq.0/2)
sigma.sq.prior <- 1/rgamma(1000000, nu.0/2, nu.0*sigma.sq.0/2)
par(mfrow = c(1, 1))
plot(density(tau.sqs/(sigma.sqs + tau.sqs)),
     xlab = expression(tau^2/(sigma^2 + tau^2)), main = "")
lines(density(tau.sq.prior/(tau.sq.prior + sigma.sq.prior)), col = "blue")
legend("topright", lty = c(1, 1), col = c("black", "blue"), legend = c("Post.", "Prior"),
       cex = 0.75)
[Figure: posterior (black) and prior (blue) densities of R = τ²/(σ² + τ²)]
R = τ²/(σ² + τ²) measures the contribution of across-group variance to the total variance. Our prior on R was very diffuse. Given the data, we conclude R is approximately 0.26, i.e. we conclude that across-group variance accounts for about 26% of the total variance.
Note - you could have also directly computed the prior density of R from the priors of τ² and σ². This will look a little different, because of the approximation issues described in Part b., but is more accurate.
Part d.
prs <- as.matrix(c(mean(thetas[, 7] < thetas[, 6]),
                   mean(thetas[, 7] < thetas[, 1] &
                        thetas[, 7] < thetas[, 2] &
                        thetas[, 7] < thetas[, 3] &
                        thetas[, 7] < thetas[, 4] &
                        thetas[, 7] < thetas[, 5] &
                        thetas[, 7] < thetas[, 6] &
                        thetas[, 7] < thetas[, 8])))
row.names(prs) <- c("$\\text{Pr}\\left(\\theta_7 < \\theta_6 | \\boldsymbol y\\right)$",
                    "$\\text{Pr}\\left(\\theta_7 < \\text{min}\\left(\\boldsymbol \\theta_{-7}\\right) | \\boldsymbol y\\right)$")
kable(prs)
Pr(θ₇ < θ₆ | y)           0.5104
Pr(θ₇ < min(θ₋₇) | y)     0.3068
Part e.
plot(unlist(lapply(Y, function(x) {mean(x)})), colMeans(thetas),
     xlab = expression(bar(y)), ylab = expression(widehat(theta)))
abline(a = 0, b = 1)
[Figure: posterior means θ̂ plotted against school sample means ȳ, with the line θ̂ = ȳ]
We can see that more extreme school means are shrunk more towards the overall mean, which is very close to µ̂, as shown below.
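This shrinkage pattern follows directly from the full conditional mean of $\theta_j$ used in the sampler, which is a precision-weighted average of the school mean and the overall mean:

$$\hat\theta_j \approx \frac{n_j \bar y_j/\sigma^2 + \mu/\tau^2}{n_j/\sigma^2 + 1/\tau^2},$$

so the further $\bar y_j$ lies from $\mu$, the more it is pulled back toward $\mu$.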
data.mean <- sum(unlist(lapply(Y, function(x) {sum(x)})))/sum(unlist(lapply(Y, function(x) {length(x)})))
kable(cbind("$\\bar{y}$" = data.mean,
            "$\\hat{\\mu}$" = mean(mus)))
       ȳ          µ̂
7.691278   7.548087