Økonomisk Kandidateksamen 2004 (II) Econometrics 2 Solution

Solution to Question 1
(a) Taking first differences we have:
yi2 − yi1 = β (xi2 − xi1 ) + (ui2 − ui1 )
and the first differenced estimator is given by regressing (yi2 − yi1 ) on (xi2 − xi1 ).
The within estimator subtracts, for each unit, the mean over time of each variable. For
the first period we have:
yi1 − ȳi = (ai + βxi1 + ui1 ) − (ai + β x̄i + ūi )
= β (xi1 − x̄i ) + (ui1 − ūi )
where ȳi = (1/2)(yi1 + yi2 ), and similarly for x̄i and ūi , so that:
yi1 − (1/2)(yi1 + yi2 ) = β ( xi1 − (1/2)(xi1 + xi2 ) ) + ( ui1 − (1/2)(ui1 + ui2 ) )
which is −(1/2) times the first differenced equation. If we consider the second
period we derive the same equation with the opposite sign. Since first differencing
and the within transformation give the same equation up to a constant factor, they
give the same estimator.
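The equivalence can be checked numerically. The following sketch (not part of the exam; the simulation design and all parameter values are invented) simulates a two-period panel and computes both estimators:

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 500, 1.5
a = rng.normal(size=N)                       # fixed effects a_i
x = rng.normal(size=(N, 2)) + a[:, None]     # regressor correlated with a_i
y = a[:, None] + beta * x + rng.normal(size=(N, 2))

# First-difference estimator: regress (y_i2 - y_i1) on (x_i2 - x_i1)
dy, dx = y[:, 1] - y[:, 0], x[:, 1] - x[:, 0]
beta_fd = (dx @ dy) / (dx @ dx)

# Within estimator: subtract unit means over time, pool, and regress
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
beta_w = (xd @ yd) / (xd @ xd)

print(beta_fd, beta_w)
```

The two estimates agree to machine precision, because with T = 2 the within-transformed data are exactly ±(1/2) times the first differences.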
(b) Suppose we have samples from the same population in two different time periods,
1 and 2. Some members of the population are ‘treated’ at the end of period 1
and before period 2 and we observe which people have been treated. Examples of
a treatment include a policy change that affects some people but not others or a
change in the local environment that affects people there but not people who live
elsewhere. A difference-in-difference estimator for the effect of the treatment on a
variable of interest compares the change over time in the mean of the treated group
with the change over time in the mean of the untreated group.
Formally, let Yi be the variable of interest for person i and let dBi be a dummy
variable that is 1 if person i is ‘treated’ (and zero otherwise). Let d2i be a dummy
variable that is one if person i was sampled in period 2. Then run the regression:
Yi = β0 + δ0 d2i + β1 dBi + δ1 (d2i · dBi ) + ui
The OLS estimate of δ 1 gives the estimate of the difference-in-difference. A test
of whether this coefficient is significantly different from zero is a test of whether
the treatment had an effect on the variable Y . The coefficient δ 1 is consistently
estimated if E (u|d2, dB, d2 ∗ dB) = 0.
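As a numerical illustration (invented data, not from the exam), the sketch below simulates a two-period repeated cross-section with a true treatment effect δ1 = 2 and compares the OLS interaction coefficient with the difference of the group-mean differences:

```python
import numpy as np

rng = np.random.default_rng(1)
n, delta1 = 2000, 2.0
d2 = rng.integers(0, 2, n).astype(float)     # sampled in period 2?
dB = rng.integers(0, 2, n).astype(float)     # member of the treated group?
y = 1.0 + 0.5 * d2 + 1.0 * dB + delta1 * d2 * dB + rng.normal(size=n)

# OLS of y on a constant, d2, dB and the interaction
X = np.column_stack([np.ones(n), d2, dB, d2 * dB])
coef = np.linalg.lstsq(X, y, rcond=None)[0]

# Difference-in-difference of the four group means
g = lambda a, b: y[(d2 == a) & (dB == b)].mean()
did = (g(1, 1) - g(0, 1)) - (g(1, 0) - g(0, 0))
print(coef[3], did)
```

Because the regression is saturated in the four group dummies, the OLS coefficient on the interaction reproduces the difference-in-difference of the group means exactly.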
(c) First try first differencing:
∆yit = β∆xit + δi + ∆uit
where the individual trend term δi t differences to δi because ∆t = t − (t − 1) = 1.
Thus we still have a fixed effect in the first differenced
equation. If this is correlated with ∆xit then a regression of ∆yit on ∆xit will give
an inconsistent estimator of β. One approach is to assume that δ i is uncorrelated
with ∆xit . Then we can either use OLS or a random effects model. [Both give
consistent estimates but the latter is more efficient]. An alternative, if we are not
prepared to assume this lack of correlation, is to first difference again to remove δ i :
∆yit − ∆yit−1 = β (∆xit − ∆xit−1 ) + (δ i − δ i ) + (∆uit − ∆uit−1 )
∆∆yit = β∆∆xit + ∆∆uit
This is possible if T ≥ 3. Then we can estimate β by regressing ∆∆yit on ∆∆xit .
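A small simulation (invented design, not from the exam) illustrates why the second difference is needed when the trend δi is correlated with the growth of xit : first differencing alone gives a biased estimate, while double differencing recovers β:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta = 2000, 4, 1.0
a = rng.normal(size=N)                            # fixed effects a_i
d = rng.normal(size=N)                            # individual trends delta_i
t = np.arange(1, T + 1)
x = d[:, None] * t + rng.normal(size=(N, T))      # x growth correlated with delta_i
y = a[:, None] + d[:, None] * t + beta * x + rng.normal(size=(N, T))

dy, dx = np.diff(y, axis=1), np.diff(x, axis=1)        # first differences
ddy, ddx = np.diff(dy, axis=1), np.diff(dx, axis=1)    # differences of differences

b_fd = (dx.ravel() @ dy.ravel()) / (dx.ravel() @ dx.ravel())      # biased: delta_i remains
b_dd = (ddx.ravel() @ ddy.ravel()) / (ddx.ravel() @ ddx.ravel())  # consistent for beta
print(b_fd, b_dd)
```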
Solution to Question 2
(a) The solution should derive an AR(3), e.g. using the manipulations
∆Yt = δ + c1 ∆Yt−1 + c2 ∆Yt−2 + πYt−1 + εt
Yt − Yt−1 = δ + c1 (Yt−1 − Yt−2 ) + c2 (Yt−2 − Yt−3 ) + πYt−1 + εt
Yt = δ + (1 + c1 + π) Yt−1 + (c2 − c1 ) Yt−2 − c2 Yt−3 + εt .
The second half of the question asks the students to explain the Dickey-Fuller unit
root test and to perform it for a US interest rate based on the given estimation
output. The solution should note that a unit root in Yt corresponds to π = 0. A
more thorough solution may write the characteristic polynomial, A(z), and note that
an autoregressive unit root is defined by A(1) = 0. The solution should indicate that
the (normal) alternative is a stationary process.
For the US interest rate the Dickey-Fuller t−test for a unit root is tπ=0 = −1.67
and should be compared with a 5% critical value of −2.86. We cannot reject the
null of a unit root for the considered period.
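The mechanics of the test can be sketched as follows (simulated series, not the US interest-rate data from the exam; the critical value −2.86 is the one quoted above):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500

def df_tstat(Y):
    """t-statistic for pi = 0 in the regression Delta Y_t = delta + pi * Y_{t-1} + e_t."""
    dY, Ylag = np.diff(Y), Y[:-1]
    X = np.column_stack([np.ones(len(Ylag)), Ylag])
    b = np.linalg.lstsq(X, dY, rcond=None)[0]
    e = dY - X @ b
    s2 = (e @ e) / (len(dY) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

rw = np.cumsum(rng.normal(size=T))           # random walk: true pi = 0
ar = np.zeros(T)
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + rng.normal()   # stationary AR(1): true pi = -0.5

t_rw, t_ar = df_tstat(rw), df_tstat(ar)
print(t_rw, t_ar)                            # compare with the 5% critical value -2.86
```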
(b) The first part of the question asks the students to explain an implication of cointegration.
If the interest rate, Yt , is integrated of order one, I(1), and if the real interest rate
Xt = Yt − Zt ,
is a stationary process, then the inflation rate, Zt , has to be an I(1) process. Furthermore, Yt and Zt are cointegrated with cointegration vector β = (1, −1)′ .
The second half of the question asks the students to test the theoretical prediction
based on estimation output for the regression model
∆Xt = δ̃ + c̃1 ∆Xt−1 + π̃ Xt−1 + ε̃t .
The solution should note that a test for no cointegration is just a test for unit root
in Xt . The good solution notes that the critical values of Table 2.2 apply again
because the potential cointegration vector is given a priori and not estimated.
The t−test is given by tπ̃=0 = −2.54. We cannot reject the null of a unit root
and hence cannot reject no-cointegration. We conclude that the interest rate and
inflation do not appear to have the same stochastic trend with equal coefficients.
(c) Instead of the real interest rate, Xt = Yt − Zt , a different linear combination could
be considered, e.g.
Xt∗ = Yt − β 1 Zt .
The solution should note that β 1 can be (super-)consistently estimated in a static
regression of Yt on Zt and a constant, i.e.
Yt = β 0 + β 1 Zt + η t .
(∗)
To test if the regression in (∗) corresponds to a cointegration vector we can test for
a unit root in the estimated residual, η̂t . This can be done using an (augmented)
Dickey-Fuller test (with no constant). Alternatively we can test for a unit root in
X̂t∗ = Yt − β̂1 Zt . The good solution notes that the fact that β̂1 is estimated changes
the asymptotic distribution of the Dickey-Fuller test.
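The two-step procedure can be sketched as follows (simulated data with an invented cointegration parameter β1 = 1.3). As noted above, the usual Dickey-Fuller critical values do not apply in the second step because β1 is estimated; the sketch only illustrates the mechanics:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
Z = np.cumsum(rng.normal(size=T))            # I(1) series playing the role of inflation
Y = 0.5 + 1.3 * Z + rng.normal(size=T)       # cointegrated with beta_1 = 1.3 (invented)

# Step 1: static regression of Y on a constant and Z; beta_1 is super-consistent
X = np.column_stack([np.ones(T), Z])
b0, b1 = np.linalg.lstsq(X, Y, rcond=None)[0]

# Step 2: Dickey-Fuller regression on the residual, with no constant:
# Delta eta_t = pi * eta_{t-1} + e_t
eta = Y - b0 - b1 * Z
de, el = np.diff(eta), eta[:-1]
pi = (el @ de) / (el @ el)
e = de - pi * el
t_pi = pi / np.sqrt((e @ e) / (len(de) - 1) / (el @ el))
print(b1, t_pi)
```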
Solution to Question 3
(a) The exponential distribution is not discussed in the lectures, but the question is a
straightforward likelihood analysis.
The solution should note that the likelihood function for a single observation is
just the density conditional on the data, i.e.
Li (θ | xi ) = θ⁻¹ exp{ −xi /θ } .
Taking logs yields the log-likelihood contribution
log Li (θ) = − log(θ) − xi /θ .
Since the observations are independent the likelihood function for a set of observations, x1 , x2 , ..., xn , is given by
L(θ | x1 , x2 , ..., xn ) = ∏ᵢ₌₁ⁿ θ⁻¹ exp{ −xi /θ } .
Taking logs yields the log-likelihood function
log L(θ) = Σᵢ₌₁ⁿ ( − log(θ) − xi /θ ) = −n log(θ) − ( Σᵢ₌₁ⁿ xi ) / θ .
The individual scores are found as the derivatives
si (θ) = ∂ log Li (θ) / ∂θ = −θ⁻¹ + xi /θ² ,
and the first order condition for the ML estimator, θ̂ML , is given by
Σᵢ₌₁ⁿ si (θ) = Σᵢ₌₁ⁿ ( −θ⁻¹ + xi /θ² ) = −n θ⁻¹ + ( Σᵢ₌₁ⁿ xi ) / θ² = 0.
This is solved by
θ̂ML = ( Σᵢ₌₁ⁿ xi ) / n .
Given the information that n = 100 and Σᵢ₌₁¹⁰⁰ xi = 1000, we find the numerical
estimate to be
θ̂ML = ( Σᵢ₌₁ⁿ xi ) / n = 1000/100 = 10.
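A quick numerical check (not required by the exam) confirms that the log-likelihood is maximized at the sample mean; the data are simulated and rescaled so that n = 100 and the observations sum to 1000, as in the exam:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=10.0, size=100)
x = x * (1000.0 / x.sum())                 # rescale so that sum(x) = 1000 exactly

def loglik(theta):
    # log L(theta) = -n*log(theta) - sum(x)/theta
    return -len(x) * np.log(theta) - x.sum() / theta

grid = np.linspace(1.0, 30.0, 29001)       # step size 0.001
theta_hat = grid[np.argmax(loglik(grid))]
print(theta_hat)                           # close to sum(x)/n = 10
```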
(b) We find the Hessian as the second derivative
∂² log Li (θ) / ∂θ∂θ = θ⁻² − 2xi /θ³ ,
and the information (per observation) is given by
I(θ) = −E[ θ⁻² − 2xi θ⁻³ ] = −( θ⁻² − 2θ⁻² ) = θ⁻² .
Here we use the fact that E[xi ] = θ; many students found this part difficult. The
asymptotic variance is given by I(θ)−1 = θ2 .
The solution should note that a Wald test for the hypothesis H0 : θ = θ0 is based
on estimates in the unrestricted model. The principle is that the distance (θ̂ML − θ0 )
is measured in terms of the variance of θ̂ML . The solution may derive the test from
the fact that
√T ( θ̂ML − θ ) → N( 0, I(θ)⁻¹ ),
which we can estimate by
θ̂ML → N( θ, T⁻¹ θ̂²ML ).
This expression can be used to construct a familiar t−test for H0 :
tθ=θ0 = ( θ̂ML − θ0 ) / √( T⁻¹ θ̂²ML ) = √T · ( θ̂ML − θ0 ) / θ̂ML → N(0, 1) ,
or a Wald test of the form
( θ̂ML − θ0 ) ( T θ̂⁻²ML ) ( θ̂ML − θ0 ) → χ²(1) .
To construct a t−test for the hypothesis H0 : θ = 11.5 we calculate
tθ=11.5 = √100 · (10 − 11.5) / 10 = −1.5,
which is not significant according to a N (0, 1). We cannot reject that the mean
lifetime is 11.5.
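The two statistics for the exam numbers can be reproduced directly:

```python
import numpy as np

# t- and Wald statistics for n = 100, theta_hat = 10, H0: theta = 11.5
n, theta_hat, theta0 = 100, 10.0, 11.5
t = np.sqrt(n) * (theta_hat - theta0) / theta_hat
wald = n * (theta_hat - theta0) ** 2 / theta_hat ** 2
print(t, wald)
```

The Wald statistic is the square of the t−statistic, and 2.25 is below the χ²(1) 5% critical value of 3.84, matching the conclusion above.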
(c) The solution should discuss that the ML estimator, β̂ML , in the regression model in
(3.2) and (3.3) is consistent under weaker assumptions than maintained for the ML
estimation.
The expression in (3.3) contains three parts: (i) mean independence, E[εt | xt ] = 0;
(ii) conditional homoscedasticity; and (iii) normality. Only the moment condition
E[εt xt ] = 0 is required for consistency.
The solution may discuss pseudo-likelihood estimation. If the normality in (3.3)
is not fulfilled, the estimate based on the normal likelihood function is a pseudo-likelihood estimator, which is still consistent. The first order conditions for the ML
estimates can be written as
(1/T) Σᵀₜ₌₁ st (θ) = (1/T) Σᵀₜ₌₁ xt (yt − x′t β) = 0,
where st (θ) denotes the individual scores. These can be seen as sample counterparts
to the moment conditions
E[ xt (yt − x′t β) ] = 0.
If these moment conditions are valid, GMM estimation and pseudo maximum likelihood estimation are consistent.
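The point can be illustrated by solving the sample moment conditions with deliberately non-normal errors (a sketch with an invented design; skewed, mean-zero exponential errors replace the normal ones):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 5000
x = np.column_stack([np.ones(T), rng.normal(size=T)])
beta = np.array([1.0, 2.0])
eps = rng.exponential(size=T) - 1.0        # skewed, mean-zero errors: E[x_t * eps_t] = 0
y = x @ beta + eps

# Solve the sample moment conditions X'(y - X b) = 0, i.e. b = (X'X)^{-1} X'y
b = np.linalg.solve(x.T @ x, x.T @ y)
print(b)
```

Even though the normality part of (3.3) fails, solving the moment conditions still recovers β, illustrating the consistency of the pseudo-likelihood estimator.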
Solution to Question 4
(a) The solution should note that the assumption on the roots of the characteristic
polynomial implies that the process generated by (4.1) is stationary.
To find the mean, µ = E[Yt ], just take expectations in (4.1) to obtain
E[Yt ] = δ + θ1 E[Yt−1 ] + θ2 E[Yt−2 ] + E[εt ]
µ (1 − θ1 − θ2 ) = δ
µ = δ / (1 − θ1 − θ2 ),
which is well defined under the maintained assumption that A(1) = 1 − θ1 − θ2 ≠ 0.
Define the deviation from the mean as yt = Yt − µ, and consider the equation
Yt = µ (1 − θ1 − θ2 ) + θ1 Yt−1 + θ2 Yt−2 + εt
Yt − µ = θ1 (Yt−1 − µ) + θ2 (Yt−2 − µ) + εt
yt = θ1 yt−1 + θ2 yt−2 + εt .   (∗∗)
To derive the autocovariances,
γ k = E[(Yt − µ)(Yt−k − µ)] = E[yt yt−k ],
multiply the equation in (∗∗) with yt−k and take expectations to obtain
E[yt yt−1 ] = θ1 E[y²t−1 ] + θ2 E[yt−2 yt−1 ] + E[εt yt−1 ]
γ1 = θ1 γ0 + θ2 γ1
E[yt yt−2 ] = θ1 E[yt−1 yt−2 ] + θ2 E[y²t−2 ] + E[εt yt−2 ]
γ2 = θ1 γ1 + θ2 γ0
E[yt yt−3 ] = θ1 E[yt−1 yt−3 ] + θ2 E[yt−2 yt−3 ] + E[εt yt−3 ]
γ3 = θ1 γ2 + θ2 γ1
etc. Collecting terms we find
γ1 = θ1 γ0 / (1 − θ2 )
γ2 = θ1 γ1 + θ2 γ0
γk = θ1 γk−1 + θ2 γk−2 ,   for k = 3, 4, ...
The autocorrelations, ρk = γk /γ0 , are found by dividing with the variance γ0 :
ρ1 = θ1 / (1 − θ2 )
ρ2 = θ1 ρ1 + θ2
ρk = θ1 ρk−1 + θ2 ρk−2 ,   for k = 3, 4, ...
Some solutions may attempt to calculate the variance γ 0 , but this is not required
for a correct solution.
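The recursions can be verified numerically against the sample autocorrelations of a long simulated series (parameter values invented; θ1 = 0.5, θ2 = 0.3 gives a stationary AR(2)):

```python
import numpy as np

theta1, theta2 = 0.5, 0.3
rho = [1.0, theta1 / (1 - theta2)]          # rho_0 = 1, rho_1 = theta1/(1 - theta2)
for k in range(2, 6):
    rho.append(theta1 * rho[k - 1] + theta2 * rho[k - 2])

# Compare with the sample ACF of a long simulated AR(2)
rng = np.random.default_rng(7)
T = 200_000
y = np.zeros(T)
for t in range(2, T):
    y[t] = theta1 * y[t - 1] + theta2 * y[t - 2] + rng.normal()
y = y - y.mean()
acf = [1.0] + [(y[k:] @ y[:-k]) / (y @ y) for k in range(1, 6)]
print([round(r, 3) for r in rho])
print([round(a, 3) for a in acf])
```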
(b) The Breusch-Godfrey LM test for first order autocorrelation is based on the auxiliary
regression
ε̂t = α0 + α1 Yt−1 + α2 Yt−2 + α3 ε̂t−1 + ηt ,
where ε̂t is the estimated residual from the model in (4.1), and zeros are inserted
for missing residuals.
The null of no error autocorrelation is given by α3 = 0, and can be tested by the
LM test statistic
LM = T · R2 ,
where R2 is the coefficient of determination from the auxiliary regression. The
statistic is asymptotically distributed as a χ2 (1) under the null. The good solution
gives some intuition for the test procedure.
The OLS estimators, δ̂, θ̂1 and θ̂2 , obtained from (4.1) are inconsistent in the
presence of error autocorrelation. Consistency requires that E[εt Yt−1 ] = 0 and
E[εt Yt−2 ] = 0. First order error autocorrelation implies a model with the structure
Yt = δ + θ1 Yt−1 + θ2 Yt−2 + εt
εt = ρ εt−1 + ηt .
It is clear that both εt and Yt−1 depend on εt−1 , so that E[εt Yt−1 ] ≠ 0.
The solution should stress that it is the combination of error autocorrelation and lagged dependent variables that causes the inconsistency.
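A sketch of the test (simulated data, invented parameter values) computes LM = T · R² from the auxiliary regression, once under the null of white-noise errors and once with AR(1) errors:

```python
import numpy as np

rng = np.random.default_rng(8)
T = 20000

def simulate(rho):
    y, e = np.zeros(T), 0.0
    for t in range(2, T):
        e = rho * e + rng.normal()                 # AR(1) errors when rho != 0
        y[t] = 1.0 + 0.2 * y[t - 1] + 0.6 * y[t - 2] + e
    return y

def bg_lm(y):
    Y = y[2:]
    X = np.column_stack([np.ones(T - 2), y[1:-1], y[:-2]])
    ehat = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    elag = np.concatenate([[0.0], ehat[:-1]])      # zero inserted for the missing residual
    Xa = np.column_stack([X, elag])                # constant, Y lags, lagged residual
    u = ehat - Xa @ np.linalg.lstsq(Xa, ehat, rcond=None)[0]
    r2 = 1.0 - (u @ u) / (ehat @ ehat)
    return len(ehat) * r2                          # LM = T * R^2, chi2(1) under the null

lm_null, lm_alt = bg_lm(simulate(0.0)), bg_lm(simulate(0.8))
print(lm_null, lm_alt)                             # compare with the chi2(1) 5% value 3.84
```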
(c) ARCH describes the property that the conditional variance is time dependent, which
gives a tendency for volatility clustering. ARCH implies that the series of squared
residuals is autocorrelated.
The ARCH(1) model is defined by the additional equation
σ²t = E[ ε²t | It−1 ] = ω + α ε²t−1 ,
where It−1 is the information set. To derive the implication for the squared residuals,
define the surprise in the squared innovations,
vt = ε²t − E[ ε²t | It−1 ] = ε²t − σ²t .
We can write the ARCH equation as
ε²t = ω + α ε²t−1 + vt ,
which shows that the squared residuals follow an AR(1) process.
A verbal explanation is sufficient, but the good solution defines the ARCH process
formally and derives the implication for the squared residuals.
To test for the presence of ARCH(2) we use the implication for the squared
residuals and consider the auxiliary regression
ε²t = α0 + α1 ε²t−1 + α2 ε²t−2 + vt .
The null of no ARCH corresponds to the hypothesis α1 = α2 = 0, and can be tested
using the LM test statistic
LM = T · R2 ,
which under the null is asymptotically distributed as χ2 (2).
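A sketch of the ARCH LM test (simulated data; the intercept and ARCH parameter are invented) regresses the squared series on two of its own lags:

```python
import numpy as np

rng = np.random.default_rng(9)
T = 4000

def simulate(alpha):
    eps = np.zeros(T)
    for t in range(1, T):
        sigma2 = 1.0 + alpha * eps[t - 1] ** 2    # sigma2_t = omega + alpha * eps2_{t-1}
        eps[t] = np.sqrt(sigma2) * rng.normal()
    return eps

def arch_lm(eps):
    e2 = eps ** 2
    y = e2[2:]
    X = np.column_stack([np.ones(T - 2), e2[1:-1], e2[:-2]])
    fit = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    r2 = 1.0 - ((y - fit) @ (y - fit)) / ((y - y.mean()) @ (y - y.mean()))
    return len(y) * r2                            # LM = T * R^2, chi2(2) under the null

lm_none, lm_arch = arch_lm(simulate(0.0)), arch_lm(simulate(0.5))
print(lm_none, lm_arch)                           # compare with the chi2(2) 5% value 5.99
```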