
REV 00
Chapter 4
VIOLATING ASSUMPTION IN REGRESSION
QMT 3033 ECONOMETRIC
Nature of Heteroscedasticity
• One of the assumptions in CLRM is that the errors $\varepsilon_i$ must be homoscedastic (equal/constant variance):
$\operatorname{var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2 = \text{constant}, \quad i = 1, 2, \dots, n$
• If the errors do not have constant variance, our regression model is said to have a heteroscedasticity problem:
$\operatorname{var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma_i^2 \neq \text{constant}, \quad i = 1, 2, \dots, n$
• The notation $\sigma_i^2$ implies that the variance differs from one observation (i) to another.
• Heteroscedasticity is most commonly associated with cross-sectional data. In cross-sectional data, the units may be of different sizes.
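The difference between the two cases can be illustrated with simulated errors. The sketch below is only an illustration under assumed values: the unit sizes (10 to 100), the standard deviations (5, and 0.1 times size) and the helper name `variance_ratio` are hypothetical and do not come from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical cross-section: 2000 units whose "size" x ranges from 10 to 100.
x = [random.uniform(10, 100) for _ in range(2000)]

# Homoscedastic errors: the same standard deviation (5) for every unit.
u_homo = [random.gauss(0, 5) for _ in x]
# Heteroscedastic errors: the standard deviation grows with unit size.
u_hetero = [random.gauss(0, 0.1 * xi) for xi in x]

def variance_ratio(x, u):
    """Sample variance of u for large units divided by that for small units."""
    cutoff = statistics.median(x)
    small = [ui for xi, ui in zip(x, u) if xi <= cutoff]
    large = [ui for xi, ui in zip(x, u) if xi > cutoff]
    return statistics.variance(large) / statistics.variance(small)

ratio_homo = variance_ratio(x, u_homo)      # should be close to 1
ratio_hetero = variance_ratio(x, u_hetero)  # should be well above 1
print(round(ratio_homo, 2), round(ratio_hetero, 2))
```

A ratio near 1 is what constant variance looks like; a ratio far from 1 is the pattern that the tests later in this chapter are designed to detect.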
Homoscedasticity Case
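[Figure: conditional distributions f(Yi) of Y at income levels X11 = 80, X12 = 90 and X13 = 100, each with the same spread: $\operatorname{Var}(u_i) = E(u_i^2) = \sigma^2$. Horizontal axis: income (X1i).]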
Heteroscedasticity Case
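[Figure: conditional distributions f(Yi) of Y at income levels X11, X12 and X13, with a spread that differs across income levels: $\operatorname{Var}(u_i) = E(u_i^2) = \sigma_i^2$. Horizontal axis: income (X1i).]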
Consequences of Heteroscedasticity
• If heteroscedasticity is present in our regression model,
a) The OLS estimators are still unbiased.
b) The usual OLS formulas for the estimators' variances are biased: they can give values larger or smaller than the true variances.
c) The OLS estimators are inefficient (they no longer have the minimum variance).
d) Tests of significance based on these variances are invalid, and the OLS estimators are not BLUE.
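Point (b) can be made concrete for the two-variable model $Y_i = \beta_1 + \beta_2 X_i + u_i$. The expressions below are the standard textbook results, written with $x_i = X_i - \bar{X}$ (deviations from the mean); this deviation notation is an assumption of the sketch and is not used elsewhere in these slides.

```latex
\operatorname{var}(\hat{\beta}_2)
  = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}
  \quad \text{(heteroscedasticity)},
\qquad
\operatorname{var}(\hat{\beta}_2)
  = \frac{\sigma^2}{\sum x_i^2}
  \quad \text{(homoscedasticity, } \sigma_i^2 = \sigma^2 \text{)}
```

The two expressions coincide only when every $\sigma_i^2$ equals the same $\sigma^2$. Using the second formula when the first one applies is what makes the reported variances, and the t and F tests built on them, unreliable.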
Detecting Heteroscedasticity
a) Breusch-Pagan-Godfrey test
Step 1:
Estimate $Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$ and obtain the residuals $\hat{u}_1, \hat{u}_2, \hat{u}_3, \dots, \hat{u}_n$.
Step 2:
Obtain $\tilde{\sigma}^2 = \sum \hat{u}_i^2 / n$, which is the maximum likelihood (ML) estimator of $\sigma^2$.
Step 3:
Construct variables $p_i$ defined as:
$p_i = \hat{u}_i^2 / \tilde{\sigma}^2$
which is simply each squared residual divided by $\tilde{\sigma}^2$.
Step 4:
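Regress $p_i$ thus constructed on the Z's (the variables thought to drive the error variance; some or all of the X's may serve as Z's):
$p_i = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi} + v_i$
where $v_i$ is the error term of this regression.
Step 5:
Obtain the regression (explained) sum of squares from Step 4, denoted RSS here (not the residual sum of squares), and define
$\Theta = \frac{1}{2}\,\text{RSS}$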
Assuming ui are normally distributed, one can
show that if there is homoscedasticity and if
the sample size n increases indefinitely, then
~ 
2
m 1
that is,  follows the chi-square distribution
with (m-1) degree of freedom.
Therefore, if in an application the computed $\Theta$ (= $\chi^2$) exceeds the critical $\chi^2$ value at the chosen level of significance, one can reject the null hypothesis of homoscedasticity; otherwise one does not reject it.
H0: No heteroscedasticity [$E(\varepsilon_i^2) = \sigma^2$ = constant]
H1: Heteroscedasticity [$E(\varepsilon_i^2) = \sigma_i^2 \neq$ constant]
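The five steps of the Breusch-Pagan-Godfrey test can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes one regressor and a single Z equal to that regressor (so m = 2 and df = 1), the data are simulated, and the helper names `ols_simple` and `bpg_theta` are hypothetical.

```python
import math
import random

def ols_simple(x, y):
    """OLS of y on a constant and one regressor x; returns (intercept, slope, fitted)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    return intercept, slope, fitted

def bpg_theta(x, y, z):
    """Steps 1 to 5 of the BPG test with a single Z variable (m = 2, df = 1)."""
    n = len(y)
    # Step 1: estimate the original model and keep the residuals.
    _, _, y_hat = ols_simple(x, y)
    resid = [yi - fi for yi, fi in zip(y, y_hat)]
    # Step 2: ML estimator of the error variance (divide by n, not n - k).
    sigma2_tilde = sum(e ** 2 for e in resid) / n
    # Step 3: squared residuals scaled by the ML variance estimate.
    p = [e ** 2 / sigma2_tilde for e in resid]
    # Step 4: regress p on Z.
    _, _, p_hat = ols_simple(z, p)
    # Step 5: Theta is half the regression (explained) sum of squares.
    p_bar = sum(p) / n
    ess = sum((ph - p_bar) ** 2 for ph in p_hat)
    return ess / 2, p

random.seed(1)
x = [random.uniform(1, 10) for _ in range(200)]
# Simulated data whose error standard deviation is proportional to x.
y = [2.0 + 0.5 * xi + random.gauss(0, xi) for xi in x]

theta, p = bpg_theta(x, y, z=x)
CRITICAL_5PCT_DF1 = 3.8415  # chi-square critical value, 1 df, 5% level
# Upper-tail probability of chi-square(1); this shortcut is valid for df = 1 only.
p_value = math.erfc(math.sqrt(theta / 2))
print(theta > CRITICAL_5PCT_DF1)  # True means H0 (homoscedasticity) is rejected
```

By construction the $p_i$ average exactly 1, which is a quick sanity check on Steps 2 and 3. With more than one Z, Step 4 becomes a multiple regression and the degrees of freedom rise to m - 1.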
b) White's General Heteroscedasticity Test
Step 1:
Given the data, estimate
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
and obtain the residuals $\hat{u}_i$.
Step 2:
Run the following auxiliary regression:
$\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$
Step 3:
Under the null hypothesis that there is no heteroscedasticity, it can be shown that the sample size (n) times the $R^2$ obtained from the auxiliary regression asymptotically follows the chi-square distribution with df equal to the number of regressors (excluding the constant term) in the auxiliary regression.
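That is,
$n \cdot R^2 \sim \chi^2_{df}$ (asymptotically)
where df is as defined previously. In the three-variable regression model, df = 5, because the auxiliary regression has five regressors: $X_{2i}$, $X_{3i}$, $X_{2i}^2$, $X_{3i}^2$ and $X_{2i} X_{3i}$.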
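The degrees of freedom for other model sizes follow from counting the terms in the auxiliary regression. The helper below, with the hypothetical name `white_df`, is a small sketch that assumes every level, square and cross-product is a distinct regressor (for example, no dummy variables whose squares duplicate their levels).

```python
def white_df(num_regressors):
    """Regressors in White's auxiliary regression, excluding the constant."""
    levels = num_regressors
    squares = num_regressors
    cross_products = num_regressors * (num_regressors - 1) // 2
    return levels + squares + cross_products

print(white_df(2))  # 5: X2, X3, X2^2, X3^2, X2*X3
```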
Step 4:
If the calculated $\chi^2$ is greater than the critical chi-square value, there is heteroscedasticity (reject H0). If the calculated $\chi^2$ is less than the critical chi-square value, there is no evidence of heteroscedasticity (do not reject H0), which is to say that in the auxiliary regression of Step 2, $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$.
(If all the partial slope coefficients in this regression are simultaneously equal to zero, then the error variance is the homoscedastic constant equal to $\alpha_1$.)
H0: No heteroscedasticity [$E(\varepsilon_i^2) = \sigma^2$ = constant]
H1: Heteroscedasticity [$E(\varepsilon_i^2) = \sigma_i^2 \neq$ constant]
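White's procedure for the three-variable model can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the data are simulated, the helper names `solve`, `ols` and `white_test` are hypothetical, and OLS is computed from the normal equations, which is adequate for small, well-scaled examples like this one.

```python
import random

def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, k))
        beta[r] = (m[r][k] - tail) / m[r][r]
    return beta

def ols(rows, y):
    """OLS via the normal equations; each row must already include the constant 1."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return beta, fitted

def white_test(x2, x3, y):
    """Return (n * R^2, R^2) from White's auxiliary regression for two regressors."""
    n = len(y)
    # Step 1: original regression and squared residuals.
    _, y_hat = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)
    u2 = [(yi - fi) ** 2 for yi, fi in zip(y, y_hat)]
    # Step 2: squared residuals on levels, squares and the cross product.
    aux = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x2, x3)]
    _, u2_hat = ols(aux, u2)
    u2_bar = sum(u2) / n
    ss_total = sum((v - u2_bar) ** 2 for v in u2)
    ss_resid = sum((v - f) ** 2 for v, f in zip(u2, u2_hat))
    r2 = 1.0 - ss_resid / ss_total
    # Step 3: the test statistic n * R^2.
    return n * r2, r2

random.seed(2)
x2 = [random.uniform(1, 10) for _ in range(300)]
x3 = [random.uniform(1, 10) for _ in range(300)]
# Simulated data whose error standard deviation is proportional to x2.
y = [1.0 + 0.5 * a - 0.3 * b + random.gauss(0, a) for a, b in zip(x2, x3)]

stat, r2 = white_test(x2, x3, y)
CRITICAL_5PCT_DF5 = 11.0705  # chi-square critical value, 5 df, 5% level
print(stat > CRITICAL_5PCT_DF5)  # Step 4: True means H0 is rejected
```

Unlike the Breusch-Pagan-Godfrey test, this procedure does not require choosing the Z variables or assuming normality, at the cost of more auxiliary regressors as the model grows.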