Econ 140, Lecture 19: Heteroskedasticity

Today's plan
• How to test for heteroskedasticity: graphs, the Park test, and the Glejser test
• What we can do if we find heteroskedasticity
• How to estimate in the presence of heteroskedasticity

Palm Beach County revisited
• How far is Palm Beach an outlier?
  – Can the outlier be explained by heteroskedasticity?
  – If so, what are the consequences?
• Heteroskedasticity affects the variance of the regression line
  – It consequently affects the variance of the estimated coefficients
• L19.XLS provides an example of how to work through a problem like this using Excel

Palm Beach County revisited (2)
• Palm Beach is a good example to use because there are scale effects in the data
  – The voting pattern shows that voting behavior and the number of registered voters are related to the population of each county
• As a county gets larger, voting patterns may diverge from what the number of registered voters alone would suggest
  – Note from the graph: as we move away from the origin, the gap between registered Reform voters and Reform votes cast increases
  – We'll hypothesize that this scale effect shows up as heteroskedasticity

Notation
• Heteroskedasticity is observed as cross-section variability in the data (data across units at a point in time)
• In our notation, heteroskedasticity means $E(e_i^2) \neq \sigma^2$ (the error variance is not one constant)
• We can also write $E(e_i^2) = \sigma_i^2$
  – This means we expect a variable variance: the variance changes with each unit of observation

Consequences
When heteroskedasticity is present:
1) The OLS estimator is still linear
2) The OLS estimator is still unbiased
3) The OLS estimator is not efficient: the minimum-variance property no longer holds
4) Estimates of the variances are biased
5) $\hat{\sigma}_{YX}^2 = \frac{\sum \hat{e}_i^2}{n-k}$ is not an unbiased estimator of $\sigma_{YX}^2$
6) We can't trust the confidence intervals or hypothesis tests (t-tests and F-tests): we may draw the wrong conclusions

Consequences (2)
• When BLUE holds and there is homoskedasticity, the first-order conditions give
  $V(\hat{b}) = \sum c_i^2 \sigma^2$, where $c_i = \frac{x_i}{\sum x_i^2}$
• With heteroskedasticity, we have
  $V(\hat{b}) = \sum c_i^2 \sigma_i^2$
• If we substitute the expression for $c_i$ into both equations, we find
  $V(\hat{b}) = \frac{\sigma^2}{\sum x_i^2}$ under homoskedasticity and $V(\hat{b}) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$ under heteroskedasticity

Cases
• With homoskedasticity, the variance around the regression line is constant at every value of the independent variable
• With heteroskedasticity, the variance around the regression line varies with the value of the independent variable

Detecting heteroskedasticity
• There are three ways of detecting heteroskedasticity:
1) Graphically
2) The Park test
3) The Glejser test

Graphical detection
• We can see whether the errors vary with the unit of observation
• With homoskedasticity, where $E(e_i X_i) = 0$, the errors are independent of the independent variables and show no systematic pattern
• With heteroskedasticity we can get a variety of patterns: the errors show a systematic relationship with the independent variables
• Note: you can plot either $e$ or $e^2$ on the y-axis

Graphical detection (3)
• Using the Palm Beach example (L19.xls), the estimated regression equation was $\hat{Y} = 50.28 + 2.45X$
• The errors of this equation, $\hat{e} = Y - \hat{Y}$, can be graphed against the number of registered Reform party voters (the independent variable)
  – The graph shows the errors increasing with the number of registered Reform voters
• While the graphs may be convincing, we also want a formal test to confirm this; we have two, described after the residual-plot sketch below
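A minimal Python sketch of the graphical check (the lecture itself works in Excel with L19.xls): fit the regression by OLS, then plot the residuals, and their squares, against the independent variable. The data below are simulated with an error spread that grows with X, as a hypothetical stand-in for the county data; a fan shape in either panel points to heteroskedasticity.

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Simulated stand-in for the county data (in practice, read X and Y from L19.xls):
# X = registered Reform voters, Y = Reform votes cast, error spread growing with X.
rng = np.random.default_rng(0)
X = rng.uniform(50, 5000, size=60)
e = rng.normal(0.0, 0.05 * X)            # heteroskedastic by construction
Y = 50 + 2.5 * X + e

# Fit Y = a + b*X by OLS and collect the residuals e_hat = Y - Y_hat.
ols = sm.OLS(Y, sm.add_constant(X)).fit()
resid = ols.resid

# Graphical detection: plot e and e^2 against X and look for a systematic pattern.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(X, resid, s=12)
axes[0].set(xlabel="Registered Reform voters (X)", ylabel="Residual e")
axes[1].scatter(X, resid**2, s=12)
axes[1].set(xlabel="Registered Reform voters (X)", ylabel="Squared residual e^2")
plt.tight_layout()
plt.show()
```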
Park Test
• The procedure:
1) Run the regression $Y_i = a + bX_i + e_i$ despite the heteroskedasticity problem (it can also be multivariate)
2) Obtain the residuals ($e_i$), square them ($e_i^2$), and take their logs ($\ln e_i^2$)
3) Run the auxiliary regression $\ln e_i^2 = g_0 + g_1 \ln X_i + v_i$
4) Do a hypothesis test on $\hat{g}_1$ with $H_0\colon g_1 = 0$
5) Look at the result of the hypothesis test:
   • Reject the null: you have heteroskedasticity
   • Fail to reject the null: homoskedasticity, or $\ln e_i^2 = g_0$, which is a constant

Glejser Test
• With the Glejser test we're looking for a scaling effect
• The procedure:
1) Run the regression (it can also be multivariate)
2) Collect the $e_i$ terms
3) Take the absolute value of the errors
4) Regress $|e_i|$ against the independent variable(s); you can run different forms of the regression:
   $|e_i| = g_0 + g_1 X_i + u_i$ or $|e_i| = g_0 + g_1 \sqrt{X_i} + u_i$ or $|e_i| = g_0 + g_1 \frac{1}{X_i} + u_i$

Glejser Test (2)
• If heteroskedasticity takes one of these forms, this suggests an appropriate transformation of the model
• The null hypothesis is still $H_0\colon g_1 = 0$, since we're testing for a relationship between the errors and the independent variable(s)
• We reach the same conclusions as in the Park test (a code sketch of both tests follows)
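A minimal sketch of both tests, reusing the X and resid arrays from the plotting sketch above, so it illustrates the procedure rather than the lecture's Excel calculation. Each test is just an auxiliary OLS regression followed by a t-test on the slope g1.

```python
import numpy as np
import statsmodels.api as sm

def park_test(resid, X):
    """Park test: regress ln(e^2) on ln(X); H0: g1 = 0 (homoskedasticity)."""
    aux = sm.OLS(np.log(resid**2), sm.add_constant(np.log(X))).fit()
    return aux.params[1], aux.pvalues[1]

def glejser_test(resid, X):
    """Glejser test (linear form): regress |e| on X; H0: g1 = 0."""
    aux = sm.OLS(np.abs(resid), sm.add_constant(X)).fit()
    return aux.params[1], aux.pvalues[1]

g1_park, p_park = park_test(resid, X)
g1_glej, p_glej = glejser_test(resid, X)
print(f"Park test:    g1 = {g1_park:.3f}, p-value = {p_park:.4f}")
print(f"Glejser test: g1 = {g1_glej:.3f}, p-value = {p_glej:.4f}")
# A small p-value rejects H0: g1 = 0, i.e. evidence of heteroskedasticity.
```

The other Glejser forms work the same way: replace X in the auxiliary regression with np.sqrt(X) or 1/X.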
A cautionary note
• The errors in the Park test ($v_i$) and the Glejser test ($u_i$) might themselves be heteroskedastic
  – If so, we cannot trust the hypothesis test $H_0\colon g_1 = 0$ or the t-test
• If we find heteroskedastic disturbances in the data, what can we do?
  – Estimate the model $Y_i = a + bX_i + e_i$ using weighted least squares
  – We'll look at two examples of weighted least squares: one where we know the true variance, and one where we don't

Correction with known $\sigma_i^2$
• Suppose the true variance is known and our model is $Y_i = a + bX_i + e_i$
• Consider the following transformation of the model:
  $\frac{Y_i}{\sigma_i} = a\frac{1}{\sigma_i} + b\frac{X_i}{\sigma_i} + \frac{e_i}{\sigma_i}$
• In the transformed model, let $u_i = \frac{e_i}{\sigma_i}$
  – The expected value of the squared error is $E(u_i^2) = \frac{E(e_i^2)}{\sigma_i^2}$

Correction with known $\sigma_i^2$ (2)
• Given that there is heteroskedasticity, $E(e_i^2) = \sigma_i^2$, and thus
  $E(u_i^2) = \frac{\sigma_i^2}{\sigma_i^2} = 1$
• In this simple example we re-weighted the model by the known $\sigma_i$
• What this example shows: when the variance is known, we transform the model to obtain a homoskedastic error term

Correction with unknown $\sigma_i^2$
• With an unknown variance, we need to state an ad hoc but plausible assumption about the variance $\sigma_i^2$ (how the errors vary with the independent variable)
• For example, we can assert that $E(e_i^2) = \sigma^2 X_i$
• Remember: the Glejser test lets us choose a relationship between the errors and the independent variable

Correction with unknown $\sigma_i^2$ (2)
• In this example you would transform the estimating equation by dividing through by $\sqrt{X_i}$ to get
  $\frac{Y_i}{\sqrt{X_i}} = a\frac{1}{\sqrt{X_i}} + b\frac{X_i}{\sqrt{X_i}} + \frac{e_i}{\sqrt{X_i}}$
• Letting $\epsilon_i = \frac{e_i}{\sqrt{X_i}}$, the expected value of this error squared is $E(\epsilon_i^2) = \frac{E(e_i^2)}{X_i}$

Correction with unknown $\sigma_i^2$ (3)
• Recalling the earlier assumption, we find
  $E(\epsilon_i^2) = \frac{E(e_i^2)}{X_i} = \frac{\sigma^2 X_i}{X_i} = \sigma^2$
• When we don't know the true variance, we re-scale the estimating equation by a function of the independent variable

Returning to Palm Beach
• L19.xls contains presidential election data by county in Florida
  – To get a correct estimating equation, we can run the regression without Palm Beach if we think it's an outlier
  – Then we can see whether we can obtain a prediction for the number of Reform votes cast in Palm Beach
  – We can perform a Glejser test for the regression excluding Palm Beach
  – We run a regression of the absolute value of the errors ($|e_i|$) against registered Reform voters ($X_i$)

Returning to Palm Beach (2)
• The t-test rejects the null, which indicates the presence of heteroskedasticity
• We can re-scale the model in different ways or introduce a new independent variable (such as the total number of registered voters by county)
• Keep transforming the model and running the Glejser test
  – When we fail to reject the null, there is no longer detectable heteroskedasticity in the model

Summary
• Even with re-weighted equations we might still have heteroskedastic errors, so we rerun the Glejser test until we can no longer reject the null
• If we keep rejecting the null, we may have to rethink our model transformation; if we suspect a scale effect, we may want to introduce new scaling variables
• Coefficients from the re-scaled equation are comparable with the coefficients from the original model (a sketch of the re-scale-and-retest loop follows)
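To tie the correction and re-testing steps together, here is a sketch under the assumed error form $E(e_i^2) = \sigma^2 X_i$: re-scale the estimating equation by $\sqrt{X_i}$ (equivalently, run weighted least squares with weights $1/X_i$), then re-run a Glejser-style regression on the transformed residuals. It reuses the simulated X and Y from the first sketch, so it illustrates the loop rather than reproducing the Palm Beach numbers.

```python
import numpy as np
import statsmodels.api as sm

# Re-scale the estimating equation by sqrt(X):
#   Y/sqrt(X) = a*(1/sqrt(X)) + b*sqrt(X) + e/sqrt(X)
w = np.sqrt(X)
Z = np.column_stack([1.0 / w, w])      # regressors 1/sqrt(X) and sqrt(X); no separate constant
transformed = sm.OLS(Y / w, Z).fit()
a_hat, b_hat = transformed.params      # comparable with the coefficients of the original model

# Equivalent shortcut: WLS with weights proportional to 1/X, since Var(e_i) is assumed = sigma^2 * X_i.
wls = sm.WLS(Y, sm.add_constant(X), weights=1.0 / X).fit()
print("Transformed OLS coefficients:", a_hat, b_hat)
print("WLS (weights 1/X) coefficients:", wls.params)   # same estimates as the transformed OLS

# Re-run the Glejser check on the transformed residuals; keep transforming
# (or add scaling variables) until the test fails to reject H0: g1 = 0.
recheck = sm.OLS(np.abs(transformed.resid), sm.add_constant(X)).fit()
print(f"Glejser after re-scaling: g1 = {recheck.params[1]:.4f}, p-value = {recheck.pvalues[1]:.4f}")
```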