Part VII: Heteroskedasticity
Econometrics I, Seppo Pynnönen (as of Oct 15, 2015)

Contents
  Consequences
  Heteroskedasticity-robust inference
  Testing for heteroskedasticity
  Weighted Least Squares (WLS)
  Feasible Generalized Least Squares (FGLS)

Consider the regression

  y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + u_i.                       (1)

Assumption 2 of the classical assumptions states that the error term u_i is
homoskedastic, i.e., the variance of u_i conditional on the explanatory
variables is constant:

  var[u_i | x_i] = \sigma^2 (< \infty)  for all i,

where x_i = (x_{i1}, ..., x_{ik}). Violation of this assumption is called
heteroskedasticity, in which case the variance var[u_i | x_i] = \sigma_i^2
varies (e.g. as a function of x_i).

Consequences

In the presence of heteroskedasticity:

 (i)   OLS estimators are not BLUE.
 (ii)  The usual estimators of var[\hat\beta_j] are biased, so t-, F-, and
       LM-statistics and confidence intervals are no longer reliable.
 (iii) OLS estimators are no longer asymptotically efficient.

However,

 (iv)  OLS estimators are still unbiased.
 (v)   OLS estimators are still consistent.

Heteroskedasticity-robust inference

Consider for the sake of simplicity

  y_i = \beta_0 + \beta_1 x_i + u_i,   i = 1, ..., n,                                   (2)

where

  var[u_i | x_i] = \sigma_i^2.                                                          (3)

The OLS estimator of \beta_1 can be written in the form

  \hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^n (x_i - \bar x) u_i}{\sum_{i=1}^n (x_i - \bar x)^2}.   (4)

Because the error terms are uncorrelated,

  var[\hat\beta_1] = \frac{\sum_{i=1}^n (x_i - \bar x)^2 \sigma_i^2}{(SST_x)^2},         (5)

where

  SST_x = \sum_{i=1}^n (x_i - \bar x)^2.                                                (6)

In the homoskedastic case, where \sigma_i^2 = \sigma^2 for all i, formula (5)
reduces to the usual variance \sigma_u^2 / \sum_i (x_i - \bar x)^2.

White (1980)* derives a robust estimator of (5),

  \widehat{var}[\hat\beta_1] = \frac{\sum_{i=1}^n (x_i - \bar x)^2 \hat u_i^2}{(SST_x)^2},   (7)

where the \hat u_i are the OLS residuals.

If we rewrite (1) in matrix form,

  y = X\beta + u,                                                                       (8)

and write \hat\beta = (X'X)^{-1} X'y as

  \hat\beta = \beta + (X'X)^{-1} X'u,                                                   (9)

then, given X, the variance-covariance matrix of \hat\beta is

  cov[\hat\beta] = (X'X)^{-1} \left( \sum_{i=1}^n \sigma_i^2 x_i x_i' \right) (X'X)^{-1},   (10)

where x_i = (1, x_{i1}, ..., x_{ik})' is the ith row of the data matrix X of
x-variables. Analogously to (7), an estimator of (10) is

  \widehat{cov}[\hat\beta] = (X'X)^{-1} \left( \sum_{i=1}^n \hat u_i^2 x_i x_i' \right) (X'X)^{-1},   (11)

which is often multiplied by the adjustment factor n/(n - k - 1) (e.g. in
EViews). The heteroskedasticity-robust standard error of the estimate
\hat\beta_j is the square root of the jth diagonal element of (11).

* White, H. (1980). A heteroskedasticity-consistent covariance matrix
  estimator and a direct test for heteroskedasticity. Econometrica 48, 817-838.

Remark 7.1: If the residual variances var[u_i] = \sigma_i^2 = \sigma_u^2 are
all the same, then because

  X'X = \sum_{i=1}^n x_i x_i',

the covariance matrix (10) reduces to

  cov[\hat\beta] = \sigma_u^2 (X'X)^{-1} \left( \sum_{i=1}^n x_i x_i' \right) (X'X)^{-1} = \sigma_u^2 (X'X)^{-1},

i.e., the usual case.
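The sandwich form (11) is straightforward to compute directly. The following
is a minimal Python/NumPy sketch, not part of the original slides: the names
X (the n x (k+1) data matrix with the constant in the first column), y (the
dependent variable) and white_robust_cov are illustrative assumptions.

```python
import numpy as np

def white_robust_cov(X, y, df_adjust=True):
    """OLS estimates with the heteroskedasticity-robust covariance (11):
    (X'X)^{-1} (sum_i uhat_i^2 x_i x_i') (X'X)^{-1}."""
    n, p = X.shape                           # p = k + 1 columns, incl. the constant
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ (X.T @ y)           # OLS estimates
    uhat = y - X @ beta_hat                  # OLS residuals
    meat = (X * uhat[:, None] ** 2).T @ X    # sum_i uhat_i^2 x_i x_i'
    cov = XtX_inv @ meat @ XtX_inv
    if df_adjust:
        cov *= n / (n - p)                   # the n/(n - k - 1) adjustment (e.g. EViews)
    return beta_hat, cov

# Robust standard errors are the square roots of the diagonal:
# beta_hat, cov = white_robust_cov(X, y)
# robust_se = np.sqrt(np.diag(cov))
```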
Example 7.1: Wage example with heteroskedasticity-robust standard errors.

Dependent Variable: LOG(WAGE)
Method: Least Squares
Sample: 1 526
Included observations: 526
White Heteroskedasticity-Consistent Standard Errors & Covariance
================================================================
Variable      Coefficient    Std. Error   t-Statistic    Prob.
----------------------------------------------------------------
C                0.321378      0.109469        2.936     0.0035
MARRMALE         0.212676      0.057142        3.722     0.0002
MARRFEM         -0.198268      0.058770       -3.374     0.0008
SINGFEM         -0.110350      0.057116       -1.932     0.0539
EDUC             0.078910      0.007415       10.642     0.0000
EXPER            0.026801      0.005139        5.215     0.0000
TENURE           0.029088      0.006941        4.191     0.0000
EXPER^2         -0.000535      0.000106       -5.033     0.0000
TENURE^2        -0.000533      0.000244       -2.188     0.0291
================================================================
R-squared             0.461    Mean dependent var      1.623
Adjusted R-squared    0.453    S.D. dependent var      0.532
S.E. of regression    0.393    Akaike info criterion   0.988
Sum squared resid    79.968    Schwarz criterion       1.061
Log likelihood     -250.955    F-statistic            55.246
Durbin-Watson stat    1.785    Prob(F-statistic)       0.000
================================================================

Compared to Example 6.3, the standard errors change only slightly (usually a
small increase). The conclusions do not change.

Testing for Heteroskedasticity

Consider

  y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u.                                 (12)

Suppose the variance of u also depends on the x-variables, say

  \sigma_i^2 = var[u_i | x_1, ..., x_k] = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k.   (13)

Then the homoskedasticity hypothesis is

  H_0: \delta_1 = \cdots = \delta_k = 0,                                                (14)

i.e., \sigma_i^2 = \delta_0.

Writing v_i = u_i^2 - E[u_i^2 | x_1, ..., x_k] (note that
var[u_i | x_1, ..., x_k] = E[u_i^2 | x_1, ..., x_k]), we can write (13) as

  u_i^2 = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + v_i.                        (15)

The error terms u_i are unobservable, so they must be replaced by the OLS
residuals \hat u_i.

Estimating the parameters by OLS, the null hypothesis (14) can be tested with
the overall F-statistic defined in (4.25), which can be written in terms of
the R-square as

  F = \frac{R^2_{\hat u^2} / k}{(1 - R^2_{\hat u^2}) / (n - k - 1)},                     (16)

where R^2_{\hat u^2} is the R-square of the regression

  \hat u_i^2 = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + v_i.                   (17)

Under the null hypothesis the F-statistic is asymptotically F-distributed
with k and n - k - 1 degrees of freedom.

Breusch-Pagan test: Asymptotically, (16) is equivalent to the Lagrange
Multiplier (LM) test

  LM = n R^2_{\hat u^2},                                                                (18)

which is asymptotically \chi^2-distributed with k degrees of freedom when the
null hypothesis is true.

Remark 7.2: In regression (17) the explanatory variables can also be external
variables (not just the x-variables of (12)).
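The LM form (18) is easy to compute from the auxiliary regression (17). The
following minimal Python sketch is not part of the original slides; X (the
design matrix of (12) including the constant column), uhat (the OLS residuals)
and bp_lm_test are illustrative names, and statsmodels also provides an
equivalent canned het_breuschpagan function.

```python
import numpy as np
from scipy import stats

def bp_lm_test(X, uhat):
    """Breusch-Pagan LM test (18): regress uhat^2 on the x-variables
    (auxiliary regression (17)) and compute LM = n * R^2, asymptotically
    chi-squared with k degrees of freedom under H0."""
    n, p = X.shape
    k = p - 1                                        # number of x-variables
    u2 = uhat ** 2
    gamma = np.linalg.lstsq(X, u2, rcond=None)[0]    # OLS of uhat^2 on X
    resid = u2 - X @ gamma
    ss_res = resid @ resid
    ss_tot = (u2 - u2.mean()) @ (u2 - u2.mean())
    r2 = 1.0 - ss_res / ss_tot                       # R^2 of the auxiliary regression
    lm = n * r2
    return lm, stats.chi2.sf(lm, df=k)               # statistic and p-value
```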
White test: Suppose, for the sake of simplicity, that in (1) k = 3. The White
procedure is to estimate

  \hat u_i^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_3
             + \delta_4 x_1^2 + \delta_5 x_2^2 + \delta_6 x_3^2
             + \delta_7 x_1 x_2 + \delta_8 x_1 x_3 + \delta_9 x_2 x_3 + v_i             (19)

and use an LM statistic of the form (18) to test whether the coefficients
\delta_j, j = 1, ..., 9, are all zero.

Remark 7.3: As is obvious, the Breusch-Pagan (BP) test with the x-variables is
the White test without the cross-terms.

Example 7.2: In the wage example the BP test (White without cross-terms)
yields R^2_{\hat u^2} = 0.030244. With n = 526, LM = n R^2_{\hat u^2} ≈ 15.91,
df = 11, producing a p-value of 0.1446. Thus there is no empirical evidence of
heteroskedasticity. The White test with cross-terms gives
R^2_{\hat u^2} = 0.086858 and LM ≈ 45.69 with df = 36 and a p-value of 0.129.
Again we do not reject the null hypothesis of homoskedasticity.

Remark 7.4: When the x-variables include dummy variables, be aware of the
dummy-variable trap due to D^2 = D; i.e., include only the dummies themselves,
not their squares. Modern econometric packages, like EViews, avoid the trap
automatically when the test procedure is built into the program.

Weighted Least Squares (WLS)

Suppose the heteroskedasticity is of the form

  var[u_i | x_i] = \sigma^2 h(x_i),                                                     (20)

where h_i = h(x_i) > 0 is some known function of the explanatory (and possibly
some other) variables.

Dividing both sides of (1) by \sqrt{h_i} and denoting the new variables by
\tilde y_i = y_i / \sqrt{h_i}, \tilde x_{ij} = x_{ij} / \sqrt{h_i}, and
\tilde u_i = u_i / \sqrt{h_i}, we get the regression

  \tilde y_i = \beta_0 \frac{1}{\sqrt{h_i}} + \beta_1 \tilde x_{i1} + \cdots + \beta_k \tilde x_{ik} + \tilde u_i,   (21)

where

  var[\tilde u_i | x_i] = \frac{1}{h_i} var[u_i | x_i] = \frac{1}{h_i} h_i \sigma^2 = \sigma^2,   (22)

i.e., the transformed errors are homoskedastic (satisfying classical
assumption 2). Applying OLS to (21) therefore again produces BLUE estimators
of the parameters.

From the estimation point of view the transformation leads, in fact, to
minimization of

  \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_k x_{ik})^2 / h_i.      (23)

This is called Weighted Least Squares (WLS), where the observations are
weighted by the inverse of \sqrt{h_i}.
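Assuming h(x_i) is known, the transformation (21)-(23) can be sketched in
Python with statsmodels as follows; the names X, y, h and wls_fit are
illustrative, not from the slides.

```python
import numpy as np
import statsmodels.api as sm

def wls_fit(X, y, h):
    """WLS as in (21)-(23): divide every variable (including the constant
    column of X) by sqrt(h_i) and run OLS on the transformed data, which
    minimizes sum_i (y_i - x_i' beta)^2 / h_i."""
    w = 1.0 / np.sqrt(h)
    X_tilde = X * w[:, None]    # row i of X divided by sqrt(h_i)
    y_tilde = y * w
    return sm.OLS(y_tilde, X_tilde).fit()

# Equivalently, statsmodels applies the weights internally:
# sm.WLS(y, X, weights=1.0 / h).fit()
```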
Example 7.3: Speed and stopping distance for cars, n = 50 observations.

[Figure: Distance vs Speed, scatter plot of stopping distance against speed
for the 50 cars.]

Visual inspection suggests somewhat increasing variability as a function of
speed. For the linear model

  dist = \beta_0 + \beta_1 speed + u

the White test gives LM = 3.22 with df = 2 and a p-value of 0.20, which is not
statistically significant.

Physics suggests that the stopping distance is proportional to the square of
speed, i.e., \beta_1 (speed)^2. Thus, instead of a linear model, a better
alternative should be

  dist_i = \beta_1 (speed_i)^2 + error_i.                                               (24)

Human factor: the reaction time is v_i = \beta_0 + u_i, where \beta_0 is the
average reaction time and the error term u_i ~ N(0, \sigma_u^2).

During the reaction time the car moves the distance

  v_i \times speed_i = \beta_0 speed_i + u_i speed_i.                                   (25)

Thus, modeling the error term in (24) as (25) gives

  dist_i = \beta_0 speed_i + \beta_1 (speed_i)^2 + e_i,                                 (26)

where

  e_i = u_i \times speed_i.                                                             (27)

Because

  var[e_i | speed_i] = (speed_i)^2 var[u_i] = (speed_i)^2 \sigma_u^2,                   (28)

the heteroskedasticity is of the form (20) with

  h_i = (speed_i)^2.                                                                    (29)

Estimating (26) by OLS, ignoring the inherent heteroskedasticity, yields

Dependent Variable: DISTANCE
Method: Least Squares
Included observations: 50
==============================================================
Variable     Coefficient   Std. Error   t-Statistic    Prob.
--------------------------------------------------------------
SPEED            1.239        0.560        2.213       0.032
SPEED^2          0.090        0.029        3.067       0.004
==============================================================
R-squared             0.667   Mean dependent var      42.980
Adjusted R-squared    0.660   S.D. dependent var      25.769
S.E. of regression   15.022   Akaike info criterion    8.296
Sum squared resid 10831.117   Schwarz criterion        8.373
Log likelihood     -205.401   Durbin-Watson stat       1.763
==============================================================

Accounting for the heteroskedasticity, i.e., dividing (26) by
\sqrt{h_i} = speed_i and estimating the coefficients from

  dist_i / speed_i = \beta_0 + \beta_1 speed_i + u_i                                    (30)

gives

==============================================================
Variable     Coefficient   Std. Error   t-Statistic    Prob.
--------------------------------------------------------------
SPEED            1.261        0.426        2.963       0.00472
SPEED^2          0.089        0.026        3.402       0.00136
==============================================================

The results are not materially different, so the heteroskedasticity is not a
big problem here.

Remark 7.5: The R-squares from (26) and (30) are not comparable. Comparable
R-squares can be obtained by computing the fitted values \widehat{dist}_i with
the coefficient estimates of (30) and squaring the correlation

  R = corr[dist_i, \widehat{dist}_i].                                                   (31)

The R-square of (30) is 0.194, while that of (26) is 0.667. A comparable
R-square, however, is obtained by squaring (31), which gives 0.667, i.e., the
same in this case (usually it is slightly smaller; why?).

Feasible Generalized Least Squares (FGLS)

In practice the h(x) function is rarely known. In order to guarantee strict
positivity, a common practice is to model it as

  h(x_i) = exp(\delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k).                        (32)

In that case we can write

  log(u^2) = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + e,                       (33)

where e is an error term.

In order to estimate the unknown parameters the procedure is:

 (i)   Obtain the OLS residuals \hat u from regression equation (1).
 (ii)  Run regression (33) for log(\hat u^2) and generate the fitted values
       \hat g_i.
 (iii) Re-estimate (1) by WLS using the weights 1/\hat h_i, where
       \hat h_i = exp(\hat g_i).

This is called feasible GLS. Another possibility is to obtain the \hat g_i by
regressing log(\hat u^2) on \hat y and \hat y^2.
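The three steps can be sketched in Python with statsmodels as follows. This is
illustrative only and not from the original slides: X and y denote the data of
regression (1), the function name feasible_gls is an assumption, and the sketch
uses the exponential form (32) rather than the \hat y, \hat y^2 alternative.

```python
import numpy as np
import statsmodels.api as sm

def feasible_gls(X, y):
    """Feasible GLS with h(x) = exp(delta_0 + delta_1 x_1 + ... + delta_k x_k):
    (i)   OLS residuals from (1),
    (ii)  regress log(uhat^2) on the x-variables (regression (33)) and take
          the fitted values g_hat,
    (iii) re-estimate (1) by WLS with weights 1/h_hat, h_hat = exp(g_hat)."""
    uhat = sm.OLS(y, X).fit().resid                  # step (i)
    aux = sm.OLS(np.log(uhat ** 2), X).fit()         # step (ii)
    h_hat = np.exp(aux.fittedvalues)                 # estimated h(x_i)
    return sm.WLS(y, X, weights=1.0 / h_hat).fit()   # step (iii)
```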