Final Exam

Final Exam – ST758 – December 2011
The classic problem of Seemingly Unrelated Regression arises from the analysis of investment
by two companies in similar businesses. The regression model for company i = 1, 2 (actually
Atlantic Richfield (1) and Union Oil (2), two oil companies) follows
$$Y_t^{(i)} = \beta_0^{(i)} + \beta_1^{(i)} X_{1t}^{(i)} + \beta_2^{(i)} X_{2t}^{(i)} + e_t^{(i)}, \qquad t = 1, \ldots, N = 20,$$
where $Y_t^{(i)}$ is the investment of company i in year t, $X_{1t}^{(i)}$ is the market value, and
$X_{2t}^{(i)}$ is the capital stock, again for company i in year t. For Atlantic Richfield, the data are
in ifcaf.txt; for Union Oil, the data are in ifcun.txt. The order of the variables is year (t), Y, X1, X2;
there are also .dat files without a header.
1) For each company, construct least squares estimates for the β’s and find their standard errors.
Compute them directly; you can use the R function lm to check. Also compute the residuals $\hat e_t^{(i)}$.
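As an illustration of what "compute them directly" in (1) might look like, here is a minimal sketch in plain Python. It solves the normal equations by Gaussian elimination purely for brevity; a QR or Cholesky approach may be preferable numerically. The helper names `solve` and `ols` and the toy numbers are hypothetical and are not the exam data.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented copy, A is untouched
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def ols(X, y):
    """Return (beta, standard errors, residuals) for y = X beta + e."""
    n, p = len(X), len(X[0])
    XtX = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(p)]
           for i in range(p)]
    Xty = [sum(X[t][i] * y[t] for t in range(n)) for i in range(p)]
    beta = solve(XtX, Xty)
    resid = [y[t] - sum(X[t][j] * beta[j] for j in range(p)) for t in range(n)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    se = []
    for j in range(p):
        # j-th diagonal entry of (X^T X)^{-1}, via a solve against a unit vector
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        se.append((s2 * solve(XtX, unit)[j]) ** 0.5)
    return beta, se, resid


# Toy data (hypothetical): columns are intercept, X1, X2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [9.1, 7.9, 19.0, 18.1, 28.9, 28.0]
X = [[1.0, a, b] for a, b in zip(x1, x2)]
beta, se, resid = ols(X, y)
```

A quick sanity check on any such implementation is that the residuals are orthogonal to every column of the design matrix.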
2) The relationship between these two regressions is that the errors are not assumed to be
independent between the two companies, although assumed independent across years. That is, $e_t^{(1)}$
and $e_t^{(2)}$ are dependent (same time t), but $e_t^{(i)}$ are independent across t. Estimate the (2 × 2)
covariance matrix of the errors Ω using the residuals from (1):
$$\hat\Omega = \frac{1}{N-1}
\begin{bmatrix}
\sum_t (\hat e_t^{(1)})^2 & \sum_t \hat e_t^{(1)} \hat e_t^{(2)} \\
\sum_t \hat e_t^{(2)} \hat e_t^{(1)} & \sum_t (\hat e_t^{(2)})^2
\end{bmatrix}$$
and give the correlation coefficient $\hat\Omega_{21} / \sqrt{\hat\Omega_{11} \hat\Omega_{22}}$.
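The estimate in (2) needs only three inner products of the residual vectors. A minimal sketch follows, assuming the residuals are already available as two equal-length lists; the function name `error_covariance` and the toy residuals are hypothetical.

```python
import math


def error_covariance(e1, e2):
    """Return (Omega_hat, correlation) from two residual vectors of length N."""
    n = len(e1)
    s11 = sum(a * a for a in e1)
    s12 = sum(a * b for a, b in zip(e1, e2))
    s22 = sum(b * b for b in e2)
    d = n - 1  # divisor N - 1, as in the formula for Omega_hat in (2)
    omega = [[s11 / d, s12 / d], [s12 / d, s22 / d]]
    rho = s12 / math.sqrt(s11 * s22)  # the 1/(N - 1) factors cancel
    return omega, rho


# Toy residuals (hypothetical), chosen so the cross product sums to zero.
omega, rho = error_covariance([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])
```

Note that the correlation does not depend on the divisor, so it can be computed straight from the sums of squares and cross products.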
**** DO (3) OR (4) **** BUT NOT BOTH ****
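3) Use your estimated covariance matrix from (2) and compute (estimated) Generalized Least
Squares estimates for the β’s. Also give standard errors for the coefficients as if your estimated
covariance matrix were known. This SUR model can be written in the form of an Aitken model as
$$\begin{bmatrix} Y^{(1)} \\ Y^{(2)} \end{bmatrix}
\sim N_{2N}\left(
\begin{bmatrix}
1_N & X_1^{(1)} & X_2^{(1)} & 0 & 0 & 0 \\
0 & 0 & 0 & 1_N & X_1^{(2)} & X_2^{(2)}
\end{bmatrix}
\begin{bmatrix} \beta^{(1)} \\ \beta^{(2)} \end{bmatrix},
\; V \right)$$
where V is the (2N) × (2N) matrix
$$V = \begin{bmatrix} \Omega_{11} I_N & \Omega_{12} I_N \\ \Omega_{21} I_N & \Omega_{22} I_N \end{bmatrix}.$$
If you don’t remember your linear models, for $Y \sim N(Xb, \sigma^2 V)$, the GLS estimates are
$\tilde b = (X^T V^{-1} X)^{-1} X^T V^{-1} y$.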
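One way to avoid forming the (2N) × (2N) matrix V explicitly is to use its Kronecker structure. The following is a sketch under notation introduced here, not in the exam: $X^{(i)} = [\,1_N \;\; X_1^{(i)} \;\; X_2^{(i)}\,]$ is the N × 3 design matrix of company i, and $\omega^{ij}$ denotes the (i, j) entry of $\Omega^{-1}$.

```latex
V = \Omega \otimes I_N
\quad\Longrightarrow\quad
V^{-1} = \Omega^{-1} \otimes I_N ,
\qquad
X = \begin{bmatrix} X^{(1)} & 0 \\ 0 & X^{(2)} \end{bmatrix},

X^T V^{-1} X =
\begin{bmatrix}
\omega^{11} X^{(1)T} X^{(1)} & \omega^{12} X^{(1)T} X^{(2)} \\
\omega^{21} X^{(2)T} X^{(1)} & \omega^{22} X^{(2)T} X^{(2)}
\end{bmatrix},
\qquad
X^T V^{-1} y =
\begin{bmatrix}
\omega^{11} X^{(1)T} Y^{(1)} + \omega^{12} X^{(1)T} Y^{(2)} \\
\omega^{21} X^{(2)T} Y^{(1)} + \omega^{22} X^{(2)T} Y^{(2)}
\end{bmatrix}.
```

Under these assumptions only 3 × 3 cross-product blocks and a 6 × 6 solve are needed. If V is treated as known, with the scale absorbed into V, the covariance of the GLS estimate is $(X^T V^{-1} X)^{-1}$, and the standard errors are the square roots of its diagonal.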
4) Compute the MLE by minimizing over β’s (N/2) times the log of the determinant of the error
sum of squares and cross products matrix. We need some notation, first for the vector of residuals
for each company:
$$\hat e^{(i)}(\beta^{(i)}) = Y^{(i)} - 1\beta_0^{(i)} - X_1^{(i)} \beta_1^{(i)} - X_2^{(i)} \beta_2^{(i)}$$
and then for the concentrated/profile likelihood function $\ell^*(\beta^{(1)}, \beta^{(2)})$:
$$\ell^*(\beta^{(1)}, \beta^{(2)}) = -\frac{N}{2} \log \det
\begin{bmatrix}
\sum_t (\hat e_t^{(1)}(\beta^{(1)}))^2 & \sum_t \hat e_t^{(1)}(\beta^{(1)}) \, \hat e_t^{(2)}(\beta^{(2)}) \\
\sum_t \hat e_t^{(2)}(\beta^{(2)}) \, \hat e_t^{(1)}(\beta^{(1)}) & \sum_t (\hat e_t^{(2)}(\beta^{(2)}))^2
\end{bmatrix}$$
So minimize $-\ell^*(\beta^{(1)}, \beta^{(2)})$ with respect to the coefficients $(\beta^{(1)}, \beta^{(2)})$ and find
standard errors using the inverse of the Hessian of $-\ell^*(\beta^{(1)}, \beta^{(2)})$. (Yes, it’s not the usual
likelihood, but do it this way anyway.) Use your results from (1) for starting values and scale properly.
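The objective described in the first sentence of (4), (N/2) times the log determinant of the sums of squares and cross products matrix, is cheap to evaluate because the matrix is only 2 × 2. Below is a minimal sketch of an evaluator that could be handed to a general-purpose minimizer. The names `residuals` and `half_n_log_det`, the argument layout, and the toy numbers are hypothetical.

```python
import math


def residuals(y, x1, x2, beta):
    """Residual vector Y - 1*b0 - X1*b1 - X2*b2 for one company."""
    b0, b1, b2 = beta
    return [yt - b0 - b1 * a - b2 * b for yt, a, b in zip(y, x1, x2)]


def half_n_log_det(beta1, beta2, data1, data2):
    """(N/2) * log det of the 2x2 SSCP matrix of the two residual vectors.

    data1 and data2 are (y, x1, x2) tuples of equal-length lists.
    """
    e1 = residuals(*data1, beta1)
    e2 = residuals(*data2, beta2)
    n = len(e1)
    s11 = sum(a * a for a in e1)
    s12 = sum(a * b for a, b in zip(e1, e2))
    s22 = sum(b * b for b in e2)
    det = s11 * s22 - s12 * s12  # positive unless e1 and e2 are collinear
    return 0.5 * n * math.log(det)


# Toy check (hypothetical data): with all coefficients zero the residuals equal y.
d1 = ([1.0, -1.0, 1.0, -1.0], [1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
d2 = ([1.0, 1.0, -1.0, -1.0], [1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
value = half_n_log_det((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), d1, d2)
```

Because market value and capital stock can sit on very different scales from the intercept, rescaling the columns before optimizing is one way to act on the "scale properly" hint.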
5) Constructing a proper statistical test of the significance of the correlation coefficient from (2)
is quite difficult. Construct the sampling distribution of this coefficient by generating independent
normal errors with variances matching $\hat\Omega_{11}$ and $\hat\Omega_{22}$, using β’s near your estimates to create new
Y’s. Use no more than 1000 replications and estimate the 2.5th and 97.5th percentiles to determine
the significance of your coefficient from (2).
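The simulation in (5) can be sketched as follows: generate independent normal errors for each company, rebuild the Y’s, refit each regression, and record the correlation of the two residual vectors. This is a minimal sketch under stated assumptions, not a definitive implementation. The function names, the no-interpolation percentile rule, and the toy design matrices are hypothetical; `sd1` and `sd2` stand for the square roots of the two estimated variances.

```python
import math
import random


def ols_residuals(X, y):
    """Least squares residuals via the normal equations (fine for small p)."""
    n, p = len(X), len(X[0])
    # Augmented system [X^T X | X^T y], reduced by Gaussian elimination.
    M = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(p)]
         + [sum(X[t][i] * y[t] for t in range(n))] for i in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, p):
            f = M[r][k] / M[k][k]
            for c in range(k, p + 1):
                M[r][c] -= f * M[k][c]
    beta = [0.0] * p
    for k in reversed(range(p)):
        s = sum(M[k][c] * beta[c] for c in range(k + 1, p))
        beta[k] = (M[k][p] - s) / M[k][k]
    return [y[t] - sum(X[t][j] * beta[j] for j in range(p)) for t in range(n)]


def corr(e1, e2):
    """Correlation of two residual vectors (sums of squares form)."""
    s12 = sum(a * b for a, b in zip(e1, e2))
    return s12 / math.sqrt(sum(a * a for a in e1) * sum(b * b for b in e2))


def null_correlations(X1, b1, sd1, X2, b2, sd2, reps=1000, seed=0):
    """Sorted residual correlations simulated with independent errors."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        y1 = [sum(x * b for x, b in zip(row, b1)) + rng.gauss(0.0, sd1) for row in X1]
        y2 = [sum(x * b for x, b in zip(row, b2)) + rng.gauss(0.0, sd2) for row in X2]
        out.append(corr(ols_residuals(X1, y1), ols_residuals(X2, y2)))
    out.sort()
    return out


def percentile(sorted_vals, q):
    """Order-statistic quantile without interpolation (one simple convention)."""
    idx = int(round(q * (len(sorted_vals) - 1)))
    return sorted_vals[idx]


# Toy designs with N = 20 (hypothetical, not the exam data).
X1 = [[1.0, float(t), float((t * t) % 7)] for t in range(1, 21)]
X2 = [[1.0, float(t % 5), float(t)] for t in range(1, 21)]
draws = null_correlations(X1, [1.0, 2.0, 3.0], 1.5, X2, [0.5, -1.0, 2.0], 0.7, reps=200, seed=1)
lo, hi = percentile(draws, 0.025), percentile(draws, 0.975)
```

The observed coefficient from (2) would then be judged against the interval (lo, hi). In exact arithmetic the least squares residuals do not depend on the β’s used to build the Y’s, which is why values merely "near" the estimates suffice.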
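6) Using your results from (3) or (4) for the $\beta^{(i)}$’s, compute the quadratic form in an efficient manner
$$\begin{bmatrix}
Y^{(1)} - 1\beta_0^{(1)} - X_1^{(1)} \beta_1^{(1)} - X_2^{(1)} \beta_2^{(1)} \\
Y^{(2)} - 1\beta_0^{(2)} - X_1^{(2)} \beta_1^{(2)} - X_2^{(2)} \beta_2^{(2)}
\end{bmatrix}^T
\begin{bmatrix} \Omega_{11} I_N & \Omega_{12} I_N \\ \Omega_{21} I_N & \Omega_{22} I_N \end{bmatrix}^{-1}
\begin{bmatrix}
Y^{(1)} - 1\beta_0^{(1)} - X_1^{(1)} \beta_1^{(1)} - X_2^{(1)} \beta_2^{(1)} \\
Y^{(2)} - 1\beta_0^{(2)} - X_1^{(2)} \beta_1^{(2)} - X_2^{(2)} \beta_2^{(2)}
\end{bmatrix}$$
Give the proper count of the number of operations (in flops) for your computation in terms of the
sample size N and columns of the design matrix p.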
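One efficient route for (6) is to notice that the middle matrix is Ω ⊗ I_N, so its inverse is Ω^{-1} ⊗ I_N and the quadratic form collapses to three inner products of the two residual vectors combined with the entries of the 2 × 2 inverse. A minimal sketch follows, assuming the residual vectors have already been formed; the function name `sur_quadratic_form` and the toy inputs are hypothetical.

```python
def sur_quadratic_form(r1, r2, omega):
    """r^T (Omega kron I_N)^{-1} r for r = (r1, r2), without forming the 2N x 2N matrix."""
    (o11, o12), (o21, o22) = omega
    s11 = sum(a * a for a in r1)
    s12 = sum(a * b for a, b in zip(r1, r2))
    s22 = sum(b * b for b in r2)
    det = o11 * o22 - o12 * o21
    # Omega^{-1} = (1/det) * [[o22, -o12], [-o21, o11]]
    return (o22 * s11 - (o12 + o21) * s12 + o11 * s22) / det


# Toy check (hypothetical inputs).
q_identity = sur_quadratic_form([1.0, 2.0], [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]])
q_general = sur_quadratic_form([1.0, 0.0], [0.0, 1.0], [[2.0, 1.0], [1.0, 2.0]])
```

Under the usual convention of counting each multiply and each add as one flop, forming each residual vector costs roughly 2Np flops and the three inner products roughly 6N more, so the whole computation is linear in N; the exact constants depend on the counting convention adopted.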