Økonomisk Kandidateksamen 2004 (II) Econometrics 2 — Solutions

Solution to Question 1

(a) Taking first differences we have

    y_{i2} − y_{i1} = β(x_{i2} − x_{i1}) + (u_{i2} − u_{i1}),

and the first-differenced estimator is given by regressing (y_{i2} − y_{i1}) on (x_{i2} − x_{i1}). The within estimator subtracts from each observation the mean over time for that unit. For the first period we have

    y_{i1} − ȳ_i = a_i + βx_{i1} + u_{i1} − (a_i + βx̄_i + ū_i) = β(x_{i1} − x̄_i) + (u_{i1} − ū_i),

where ȳ_i = (y_{i1} + y_{i2})/2, and similarly for x̄_i and ū_i, so that

    y_{i1} − (y_{i1} + y_{i2})/2 = β( x_{i1} − (x_{i1} + x_{i2})/2 ) + ( u_{i1} − (u_{i1} + u_{i2})/2 ),

which is −1/2 times the first-differenced equation; the second period gives +1/2 times the same equation. Since the first-difference and within transformations produce the same equation up to a constant factor, they give the same estimator.

(b) Suppose we have samples from the same population in two different time periods, 1 and 2. Some members of the population are 'treated' between the end of period 1 and the start of period 2, and we observe who has been treated. Examples of a treatment include a policy change that affects some people but not others, or a change in the local environment that affects people there but not people who live elsewhere. A difference-in-differences estimator for the effect of the treatment compares the change over time in the mean of the treated group with the change over time in the mean of the untreated group. Formally, let Y_i be the variable of interest for person i, let dB_i be a dummy variable equal to one if person i is 'treated' (and zero otherwise), and let d2_i be a dummy variable equal to one if person i was sampled in period 2. Then run the regression

    Y_i = β_0 + δ_0 d2_i + β_1 dB_i + δ_1 (d2_i · dB_i) + u_i.

The OLS estimate of δ_1 is the difference-in-differences estimate. A test of whether this coefficient is significantly different from zero is a test of whether the treatment had an effect on the variable Y.
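The equivalence of the first-difference and within estimators when T = 2 can be checked numerically. The following sketch is illustrative only (the simulation design and all variable names are assumptions, not part of the exam); it simulates a two-period panel with fixed effects and computes both estimators:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 500, 2.0

# Two-period panel with unit fixed effects a_i, correlated with x.
a = rng.normal(size=n)
x1 = a + rng.normal(size=n)
x2 = a + rng.normal(size=n)
y1 = a + beta * x1 + rng.normal(size=n)
y2 = a + beta * x2 + rng.normal(size=n)

# First-differenced estimator: regress (y2 - y1) on (x2 - x1), no intercept.
dy, dx = y2 - y1, x2 - x1
beta_fd = (dx @ dy) / (dx @ dx)

# Within estimator: subtract each unit's time mean, pool both periods.
xbar, ybar = (x1 + x2) / 2, (y1 + y2) / 2
xw = np.concatenate([x1 - xbar, x2 - xbar])
yw = np.concatenate([y1 - ybar, y2 - ybar])
beta_within = (xw @ yw) / (xw @ xw)

print(beta_fd, beta_within)  # equal up to floating-point rounding
```

The two printed values agree to machine precision because, with two periods, the within-transformed data are exactly ±1/2 times the first differences.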
The coefficient δ_1 is consistently estimated if E(u | d2, dB, d2·dB) = 0.

(c) First try first differencing:

    Δy_{it} = βΔx_{it} + δ_i + Δu_{it},

where Δt = t − (t − 1) = 1, so the individual trend δ_i survives as a fixed effect in the first-differenced equation. If it is correlated with Δx_{it}, then a regression of Δy_{it} on Δx_{it} gives an inconsistent estimator of β. One approach is to assume that δ_i is uncorrelated with Δx_{it}; then we can use either OLS or a random-effects model (both give consistent estimates, but the latter is more efficient). An alternative, if we are not prepared to assume this lack of correlation, is to difference again to remove δ_i:

    Δy_{it} − Δy_{it−1} = β(Δx_{it} − Δx_{it−1}) + (δ_i − δ_i) + (Δu_{it} − Δu_{it−1}),

i.e.

    ΔΔy_{it} = βΔΔx_{it} + ΔΔu_{it}.

This is possible if T ≥ 3. We can then estimate β by regressing ΔΔy_{it} on ΔΔx_{it}.

Solution to Question 2

(a) The solution should derive an AR(3), e.g. using the manipulations

    ΔY_t = δ + c_1 ΔY_{t−1} + c_2 ΔY_{t−2} + πY_{t−1} + ε_t
    Y_t − Y_{t−1} = δ + c_1(Y_{t−1} − Y_{t−2}) + c_2(Y_{t−2} − Y_{t−3}) + πY_{t−1} + ε_t
    Y_t = δ + (1 + c_1 + π)Y_{t−1} + (c_2 − c_1)Y_{t−2} − c_2 Y_{t−3} + ε_t.

The second half of the question asks the students to explain the Dickey-Fuller unit root test and to perform the test for a US interest rate based on a given estimation output. The solution should note that a unit root in Y_t corresponds to π = 0. A more thorough solution may write the characteristic polynomial, A(z), and note that an autoregressive unit root is defined by A(1) = 0. The solution should indicate that the (usual) alternative is a stationary process. For the US interest rate the Dickey-Fuller t-test for a unit root is t_{π=0} = −1.67, which should be compared with a 5% critical value of −2.86. We cannot reject the null of a unit root for the considered period.

(b) The first part of the question asks the students to explain an implication of cointegration.
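The Dickey-Fuller regression and t-statistic described above can be sketched numerically; the simulated random walk, seed, and sample size below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400

# A random walk, i.e. the null pi = 0 (a unit root) is true.
Y = np.cumsum(rng.normal(size=T))

# Dickey-Fuller regression: dY_t = delta + pi * Y_{t-1} + e_t.
dY = np.diff(Y)
X = np.column_stack([np.ones(T - 1), Y[:-1]])
coef, *_ = np.linalg.lstsq(X, dY, rcond=None)
resid = dY - X @ coef
s2 = resid @ resid / (len(dY) - X.shape[1])
cov = s2 * np.linalg.inv(X.T @ X)
t_pi = coef[1] / np.sqrt(cov[1, 1])

# t_pi must be compared with the Dickey-Fuller 5% critical value -2.86,
# not with the usual N(0,1) quantiles.
print(t_pi)
```

Under the null the statistic follows the Dickey-Fuller distribution, which is why the critical value −2.86 rather than −1.96 applies.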
If the interest rate, Y_t, is integrated of order one, I(1), and if the real interest rate, X_t = Y_t − Z_t, is a stationary process, then the inflation rate, Z_t, has to be an I(1) process. Furthermore, Y_t and Z_t are cointegrated with cointegration vector β = (1, −1)′. The second half of the question asks the students to test the theoretical prediction based on estimation output for the regression model

    ΔX_t = δ̃ + c̃_1 ΔX_{t−1} + π̃X_{t−1} + ε̃_t.

The solution should note that a test for no cointegration is just a test for a unit root in X_t. The good solution notes that the critical values of Table 2.2 apply again because the potential cointegration vector is given a priori and not estimated. The t-test is t_{π̃=0} = −2.54. We cannot reject the null of a unit root and hence cannot reject no-cointegration. We conclude that the interest rate and inflation do not appear to share the same stochastic trend with equal coefficients.

(c) Instead of the real interest rate, X_t = Y_t − Z_t, a different linear combination could be considered, e.g. X*_t = Y_t − β_1 Z_t. The solution should note that β_1 can be (super-)consistently estimated in a static regression of Y_t on Z_t and a constant, i.e.

    Y_t = β_0 + β_1 Z_t + η_t.  (∗)

To test whether the regression in (∗) corresponds to a cointegration vector we can test for a unit root in the estimated residual, η̂_t. This can be done using an (augmented) Dickey-Fuller test (with no constant). Alternatively we can test for a unit root in X̂*_t = Y_t − β̂_1 Z_t. The good solution notes that the fact that β̂_1 is estimated changes the asymptotic distribution of the Dickey-Fuller test.

Solution to Question 3

(a) The exponential distribution is not discussed in the lectures, but the question is a straightforward likelihood analysis. The solution should note that the likelihood function for a single observation is just the density conditional on the data, i.e.
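The two-step procedure in (c) — a static regression followed by a unit-root test on the residuals — can be sketched on simulated data. The data-generating process, seed, and all names below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# Cointegrated pair: Z is a random walk, Y = 0.5 + 1.0*Z + stationary error.
Z = np.cumsum(rng.normal(size=T))
Y = 0.5 + 1.0 * Z + rng.normal(size=T)

# Step 1: static regression Y_t = b0 + b1*Z_t + eta_t (b1 super-consistent).
X = np.column_stack([np.ones(T), Z])
(b0, b1), *_ = np.linalg.lstsq(X, Y, rcond=None)
eta = Y - b0 - b1 * Z

# Step 2: Dickey-Fuller regression on the residuals, with no constant:
# d(eta)_t = pi * eta_{t-1} + e_t.  A large negative t-statistic rejects
# the null of a unit root in eta, i.e. rejects "no cointegration".
de = np.diff(eta)
lag = eta[:-1]
pi = (lag @ de) / (lag @ lag)
resid = de - pi * lag
s2 = resid @ resid / (len(de) - 1)
t_pi = pi / np.sqrt(s2 / (lag @ lag))

# Because b1 is estimated, t_pi should be compared with Engle-Granger
# critical values rather than the ordinary Dickey-Fuller table.
print(b1, t_pi)
```

Here the pair is cointegrated by construction, so the residual test statistic comes out strongly negative.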
    L_i(θ | x_i) = θ^{−1} exp(−x_i/θ).

Taking logs yields the log-likelihood contribution

    log L_i(θ) = −log(θ) − x_i/θ.

Since the observations are independent, the likelihood function for a set of observations, x_1, x_2, ..., x_n, is given by

    L(θ | x_1, x_2, ..., x_n) = ∏_{i=1}^{n} θ^{−1} exp(−x_i/θ).

Taking logs yields the log-likelihood function

    log L(θ) = ∑_{i=1}^{n} ( −log(θ) − x_i/θ ) = −n log(θ) − (∑_{i=1}^{n} x_i)/θ.

The individual scores are found as the derivatives

    s_i(θ) = ∂ log L_i(θ)/∂θ = −θ^{−1} + x_i/θ²,

and the first-order condition for the ML estimator, θ̂_ML, is given by

    ∑_{i=1}^{n} s_i(θ) = −nθ^{−1} + (∑_{i=1}^{n} x_i)/θ² = 0.

This is solved by

    θ̂_ML = (∑_{i=1}^{n} x_i)/n.

Given the information that n = 100 and ∑_{i=1}^{100} x_i = 1000, we find the numerical estimate to be θ̂_ML = 1000/100 = 10.

(b) We find the Hessian as the second derivative

    ∂² log L_i(θ)/∂θ² = θ^{−2} − 2x_i/θ³,

and the information (per observation) is given by

    I(θ) = −E[θ^{−2} − 2x_i θ^{−3}] = −(θ^{−2} − 2θ^{−2}) = θ^{−2}.

Here we use the fact that E[x_i] = θ; most solutions found this part difficult. The asymptotic variance is given by I(θ)^{−1} = θ².

The solution should note that a Wald test for the hypothesis H_0: θ = θ_0 is based on estimates in the unrestricted model. The principle is that the distance (θ̂_ML − θ_0) is measured in terms of the variance of θ̂_ML. The solution may derive the test from the fact that

    √T (θ̂_ML − θ) → N(0, I(θ)^{−1}),

which we can estimate by

    θ̂_ML ≈ N(θ, T^{−1} θ̂²_ML).

This expression can be used to construct a familiar t-test for H_0,

    t_{θ=θ_0} = (θ̂_ML − θ_0)/√(T^{−1} θ̂²_ML) = √T · (θ̂_ML − θ_0)/θ̂_ML → N(0, 1),

or a Wald test of the form

    T (θ̂_ML − θ_0) θ̂_ML^{−2} (θ̂_ML − θ_0) → χ²(1).

To construct a t-test for the hypothesis H_0: θ = 11.5 we calculate

    t_{θ=11.5} = √100 · (10 − 11.5)/10 = −1.5,

which is not significant according to a N(0, 1). We cannot reject that the mean lifetime is 11.5.
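The numerical results above can be reproduced directly, together with a grid-search check that the exponential log-likelihood peaks at the sample mean. The simulated sample and grid are illustrative assumptions; the exam numbers (n = 100, sum 1000) are from the question:

```python
import numpy as np

# Exam numbers: n = 100 observations summing to 1000.
n, sum_x = 100, 1000.0
theta_hat = sum_x / n                                # ML estimator = sample mean = 10

# t-test of H0: theta = 11.5 using the asymptotic variance theta^2 / n.
theta0 = 11.5
t = np.sqrt(n) * (theta_hat - theta0) / theta_hat    # = -1.5

# Sanity check on simulated data: the log-likelihood
# logL(theta) = -n*log(theta) - sum(x)/theta peaks at the sample mean.
rng = np.random.default_rng(0)
x = rng.exponential(scale=10.0, size=10_000)
grid = np.linspace(5.0, 20.0, 1501)
logL = -len(x) * np.log(grid) - x.sum() / grid
theta_grid = grid[np.argmax(logL)]

print(theta_hat, t, theta_grid)  # theta_grid is close to x.mean()
```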
(c) The solution should discuss that the ML estimator, β̂_ML, in the regression model in (3.2) and (3.3) is consistent under weaker assumptions than those maintained for the ML estimation. The expression in (3.3) contains three parts: (i) mean independence, E[ε_t | x_t] = 0; (ii) conditional homoskedasticity; and (iii) normality. Only E[ε_t x_t] = 0 is required for consistency.

The solution may discuss pseudo-likelihood estimation. If the normality in (3.3) is not fulfilled, the estimate based on the normal likelihood function is a pseudo-likelihood estimator, which is still consistent. The first-order conditions for the ML estimates can be written as

    (1/T) ∑_{t=1}^{T} s_t(θ) = (1/T) ∑_{t=1}^{T} x_t(y_t − x_t′β) = 0,

where s_t(θ) denotes the individual scores. These can be seen as sample counterparts to the moment conditions E[x_t(y_t − x_t′β)] = 0. If these moment conditions are valid, GMM estimation and pseudo-maximum-likelihood estimation are consistent.

Solution to Question 4

(a) The solution should note that the assumption on the roots of the characteristic polynomial implies that the process generated by (4.1) is stationary. To find the mean, µ = E[Y_t], just take expectations in (4.1) to obtain

    E[Y_t] = δ + θ_1 E[Y_{t−1}] + θ_2 E[Y_{t−2}] + E[ε_t]
    µ(1 − θ_1 − θ_2) = δ
    µ = δ/(1 − θ_1 − θ_2),

which is well defined under the maintained assumption that A(1) = 1 − θ_1 − θ_2 ≠ 0. Define the deviation from the mean as y_t = Y_t − µ, and consider the equations

    Y_t = µ(1 − θ_1 − θ_2) + θ_1 Y_{t−1} + θ_2 Y_{t−2} + ε_t
    Y_t − µ = θ_1(Y_{t−1} − µ) + θ_2(Y_{t−2} − µ) + ε_t
    y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ε_t.  (∗∗)

To derive the autocovariances, γ_k = E[(Y_t − µ)(Y_{t−k} − µ)] = E[y_t y_{t−k}], multiply the equation in (∗∗) by y_{t−k} and take expectations to obtain

    E[y_t y_{t−1}] = θ_1 E[y²_{t−1}] + θ_2 E[y_{t−2} y_{t−1}] + E[ε_t y_{t−1}]  ⟹  γ_1 = θ_1 γ_0 + θ_2 γ_1
    E[y_t y_{t−2}] = θ_1 E[y_{t−1} y_{t−2}] + θ_2 E[y²_{t−2}] + E[ε_t y_{t−2}]  ⟹  γ_2 = θ_1 γ_1 + θ_2 γ_0
    E[y_t y_{t−3}] = θ_1 E[y_{t−1} y_{t−3}] + θ_2 E[y_{t−2} y_{t−3}] + E[ε_t y_{t−3}]  ⟹  γ_3 = θ_1 γ_2 + θ_2 γ_1

etc.
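The autocorrelation recursions can be verified by comparing them with the sample autocorrelations of a long simulated AR(2) series. The parameter values, seed, and series length are illustrative assumptions:

```python
import numpy as np

theta1, theta2, delta = 0.5, 0.3, 1.0   # a stationary AR(2)
mu = delta / (1 - theta1 - theta2)      # mean of the process (= 5 here)

# Autocorrelations from the recursions derived above.
rho = np.empty(10)
rho[0] = 1.0
rho[1] = theta1 / (1 - theta2)
for k in range(2, 10):
    rho[k] = theta1 * rho[k - 1] + theta2 * rho[k - 2]

# Sample autocorrelations of a long simulated series for comparison.
rng = np.random.default_rng(4)
T = 100_000
Y = np.full(T, mu)
eps = rng.normal(size=T)
for t in range(2, T):
    Y[t] = delta + theta1 * Y[t - 1] + theta2 * Y[t - 2] + eps[t]
y = Y - Y.mean()
sample_rho = np.array([(y[k:] @ y[:T - k]) / (y @ y) for k in range(10)])

print(np.round(rho, 3))
print(np.round(sample_rho, 3))
```

The recursion and the sample estimates agree to a few decimal places for a series this long.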
Collecting terms we find

    γ_1 = θ_1 γ_0 / (1 − θ_2)
    γ_2 = θ_1 γ_1 + θ_2 γ_0
    γ_k = θ_1 γ_{k−1} + θ_2 γ_{k−2}, for k = 3, 4, ...

The autocorrelations, ρ_k = γ_k/γ_0, are found by dividing by the variance γ_0:

    ρ_1 = θ_1/(1 − θ_2)
    ρ_2 = θ_1 ρ_1 + θ_2
    ρ_k = θ_1 ρ_{k−1} + θ_2 ρ_{k−2}, for k = 3, 4, ...

Some solutions may attempt to calculate the variance γ_0, but this is not required for a correct solution.

(b) The Breusch-Godfrey LM test for first-order autocorrelation is based on the auxiliary regression

    ε̂_t = α_0 + α_1 Y_{t−1} + α_2 Y_{t−2} + α_3 ε̂_{t−1} + η_t,

where ε̂_t is the estimated residual from the model in (4.1), and zeros are inserted for missing residuals. The null of no error autocorrelation is given by α_3 = 0, and can be tested by the LM test statistic LM = T · R², where R² is the coefficient of determination from the auxiliary regression. The statistic is asymptotically distributed as χ²(1) under the null. The good solution gives some intuition for the test procedure.

The OLS estimators, δ̂, θ̂_1 and θ̂_2, obtained from (4.1) are inconsistent in the presence of error autocorrelation. Consistency requires that E[ε_t Y_{t−1}] = 0 and E[ε_t Y_{t−2}] = 0. First-order error autocorrelation implies a model with the structure

    Y_t = δ + θ_1 Y_{t−1} + θ_2 Y_{t−2} + ε_t,    ε_t = ρε_{t−1} + η_t.

It is clear that both ε_t and Y_{t−1} depend on ε_{t−1}, so E[ε_t Y_{t−1}] ≠ 0. The solution should stress that it is the combination of error autocorrelation and lagged dependent variables that causes the inconsistency.

(c) ARCH describes the property that the conditional variance is time dependent, which gives a tendency towards volatility clustering. ARCH implies that the series of squared residuals is autocorrelated. The ARCH(1) model is defined by the additional equation

    σ²_t = E[ε²_t | I_{t−1}] = ω + αε²_{t−1},

where I_{t−1} is the information set and ω > 0 is a constant. To derive the implication for the squared residuals, define the surprise in the squared innovations,

    v_t = ε²_t − E[ε²_t | I_{t−1}] = ε²_t − σ²_t.
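The Breusch-Godfrey procedure in (b) can be sketched as follows. The AR(2)-with-AR(1)-errors design, parameter values, and names are illustrative assumptions chosen so that the misspecification is detectable:

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and residuals via least squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

rng = np.random.default_rng(5)
T = 4000

# AR(2) in Y with AR(1) errors eps_t = rho*eps_{t-1} + eta_t (rho != 0,
# so OLS on the AR(2) model is inconsistent and BG should reject).
rho = 0.8
Y = np.zeros(T)
eps = np.zeros(T)
for t in range(2, T):
    eps[t] = rho * eps[t - 1] + rng.normal()
    Y[t] = 1.0 + 0.2 * Y[t - 1] + 0.6 * Y[t - 2] + eps[t]

# Estimate the AR(2) model by OLS and keep the residuals.
X = np.column_stack([np.ones(T - 2), Y[1:-1], Y[:-2]])
_, ehat = ols(X, Y[2:])

# Auxiliary regression: residual on the original regressors plus its own
# lag (zero inserted for the missing first lagged residual).
ehat_lag = np.concatenate([[0.0], ehat[:-1]])
_, v = ols(np.column_stack([X, ehat_lag]), ehat)
R2 = 1 - (v @ v) / (ehat @ ehat)   # ehat has mean zero by construction
LM = (T - 2) * R2                  # compare with the chi^2(1) 5% value 3.84
print(LM)
```

With strongly autocorrelated errors the statistic lands far above the χ²(1) critical value.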
We can write the ARCH equation as

    ε²_t = ω + αε²_{t−1} + v_t,

which shows that the squared residuals follow an AR(1) process. A verbal explanation is sufficient, but the good solution defines the ARCH process formally and derives the implication for the squared residuals.

To test for the presence of ARCH(2) we use the implication for the squared residuals and consider the auxiliary regression

    ε̂²_t = α_0 + α_1 ε̂²_{t−1} + α_2 ε̂²_{t−2} + v_t.

The null of no ARCH corresponds to the hypothesis α_1 = α_2 = 0, and can be tested using the LM test statistic LM = T · R², which under the null is asymptotically distributed as χ²(2).
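The ARCH(2) LM test can be sketched in the same way. The ARCH(1) data-generating process, parameter values, and seed below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
T = 5000

# ARCH(1) innovations: sigma2_t = omega + alpha * eps_{t-1}^2.
omega, alpha = 0.5, 0.5
eps = np.zeros(T)
for t in range(1, T):
    sigma2 = omega + alpha * eps[t - 1] ** 2
    eps[t] = np.sqrt(sigma2) * rng.normal()

# ARCH(2) LM test: regress eps_t^2 on a constant and two lags of eps^2.
e2 = eps ** 2
y = e2[2:]
X = np.column_stack([np.ones(T - 2), e2[1:-1], e2[:-2]])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
R2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
LM = (T - 2) * R2   # compare with the chi^2(2) 5% critical value 5.99
print(LM)
```

Since the simulated errors genuinely exhibit ARCH, the statistic comfortably exceeds the χ²(2) critical value.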