Chapter 7

Ch 7 and 17
1. Regression Analysis with Qualitative Information (independent variable)
2. Limited Dependent Variable Model and Sample Selection Corrections (dependent variable)
Part I. A Regression Analysis with Qualitative Information
$y = \beta_1 + \beta_2 x_t + u$
$x$ is not quantitative
* Dummy Variables
1. A dummy variable is a variable that takes on the value 1 or 0.
2. Examples: (1) male (= 1 if the person is male, 0 otherwise), (2) south (= 1 if the person is in the south, 0 otherwise), etc.
3. Dummy variables are also called binary variables.
ex:
$W = \beta_1 + \beta_2 X + \delta_1 G + \delta_2 E + u$
$G = 1$ if male, $0$ otherwise
$E = 1$ if college, $0$ otherwise
Other specifications including interactions:
$W = \beta_0 + \beta_2 X + \delta_1 XG + \delta_2 E + u$ (interactions)
$W = \beta_0 + \beta_2 X + \delta_1 XG + \delta_2 E + \delta_3 XE + u$ (interactions)
$W = \beta_0 + \beta_2 X + \delta_1 XG + \delta_2 E + \delta_3 XE + \delta_4 GE + u$ (interactions)
$W = \beta_1 + \beta_2 X + \delta_1 G + \delta_2 E + \delta_3 GXE + u$ (interactions)
* Dummy Variable Trap
$W = \beta_0 + \beta_2 X + \delta G + u$ (one category $\Rightarrow$ only one dummy variable)
$W = \beta_0 + \beta_1 X + \delta_1 G_1 + \delta_2 G_2 + u$ (False)
$G_1 = 1$ (M), $0$ (F)
$G_2 = 0$ (M), $1$ (F)
$G_1 + G_2 = 1$
$\Rightarrow$ perfect collinearity (since the variable attached to $\beta_0$ is always 1).
ex:
W    x0   x1   G1   G2
*    1    *    0    1
*    1    *    1    0
$\Rightarrow G_1 + G_2 = 1 = x_0$
Remedy: estimate without the intercept ($\beta_0$), but then $R^2$ is not correct and may be negative.
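The trap can be checked numerically. The sketch below is only an illustration on a hypothetical five-person sample; the data and the helper names `gram` and `det` are assumptions, not part of the text. With the intercept column and both dummies, X'X is singular, while dropping one dummy or dropping the intercept gives a nonzero determinant.

```python
# Hypothetical toy sample: G1 = 1 for male, G2 = 1 for female.
g1 = [1, 0, 1, 0, 1]
g2 = [1 - g for g in g1]
x0 = [1] * len(g1)  # column attached to the intercept


def gram(cols):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total


det_trap = det(gram([x0, g1, g2]))      # intercept + both dummies
det_one_dummy = det(gram([x0, g1]))     # intercept + one dummy
det_no_intercept = det(gram([g1, g2]))  # both dummies, no intercept

print(det_trap, det_one_dummy, det_no_intercept)
```

A zero determinant means X'X cannot be inverted, so the OLS coefficients are not uniquely determined.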
$W = \beta_0 + \beta_1 x + \delta G + u$
$G = 1$ (M), $0$ (F)
$E(W \mid G = 1) = (\beta_0 + \delta) + \beta_1 x$
$E(W \mid G = 0) = \beta_0 + \beta_1 x \Rightarrow$ base group
$H_0: \delta = 0$
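$y = \beta_0 + \beta_2 x + \delta D_{1997} + \varepsilon$
$H_0: \delta = 0$
$D_{1997} = 1$ before 1997, $0$ after 1997
$E(y \mid D_{1997} = 1) = (\beta_0 + \delta) + \beta_2 x$ (if $\delta \neq 0$)
$E(y \mid D_{1997} = 0) = \beta_0 + \beta_2 x$
[Figure: $y$ against $x$, with line A (after 1997) and line B (before 1997) differing in intercept.]
$\Rightarrow$ A structural change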
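* The Chow Test (detecting a change in structure)
1. It turns out that you can compute the proper F statistic without running the unrestricted model with interactions with all $k$ continuous variables.
2. Run the restricted model (no interactions) for group one to get $SSE_1$, then for group two to get $SSE_2$ (SSE is the sum of squared residuals).
3. Run the restricted model for all observations to get $SSE$, then
$F = \dfrac{SSE - (SSE_1 + SSE_2)}{SSE_1 + SSE_2} \cdot \dfrac{n - 2(k + 1)}{k + 1}$
ex:
B: $y = \alpha_0 + \alpha_1 X + e_1$ (unrestricted) $\Rightarrow SSE_{uB}$
A: $y = \gamma_0 + \gamma_1 X + e_2$ (unrestricted) $\Rightarrow SSE_{uA}$
$y = \beta_0 + \beta_1 X + e_3$ (restricted) $\Rightarrow SSE_R$
$H_0: \alpha_0 = \gamma_0,\ \alpha_1 = \gamma_1$
$F = \dfrac{(SSE_R - SSE_{uA} - SSE_{uB}) / q}{(SSE_{uA} + SSE_{uB}) / (n - k - 1)}$
Here $q$ is the number of restrictions and $k$ is the number of regressors in the pooled unrestricted model (the group dummy, $X$, and their interaction), so $q = 2$ and $n - k - 1 = n - 4$. This agrees with $n - 2(k + 1)$ in step 3 when there is one continuous variable.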
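The three regressions of the Chow test can be sketched in a few lines. This is a minimal illustration, assuming one continuous regressor (k = 1) and closed-form simple OLS; the function names `sse_simple` and `chow_f` and the two small samples are hypothetical.

```python
def sse_simple(x, y):
    """Sum of squared residuals from OLS of y on an intercept and x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x_a, y_a, x_b, y_b, k=1):
    """Chow F statistic for two groups; sse_simple handles k = 1 only."""
    sse_a = sse_simple(x_a, y_a)              # group A alone
    sse_b = sse_simple(x_b, y_b)              # group B alone
    sse_r = sse_simple(x_a + x_b, y_a + y_b)  # pooled (restricted) fit
    n = len(x_a) + len(x_b)
    q = k + 1                                 # equal intercepts and equal slopes
    return ((sse_r - sse_a - sse_b) / q) / ((sse_a + sse_b) / (n - 2 * (k + 1)))


# Hypothetical data: group A has a higher intercept than group B.
x_b, y_b = [1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]
x_a, y_a = [1, 2, 3, 4, 5], [3.0, 3.9, 5.1, 6.0, 7.1]

f_shift = chow_f(x_a, y_a, x_b, y_b)
f_same = chow_f(x_b, y_b, x_b, y_b)  # identical groups: no structural change
print(f_shift, f_same)
```

A large F for the shifted groups and an F of essentially zero for identical groups is the pattern the test relies on; the statistic is compared with an F critical value with (k + 1, n - 2(k + 1)) degrees of freedom.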
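(text: p. 286)
Note:
$\ln W = \beta_0 + \beta_1 X_1 + \delta G + \varepsilon$
$G = 1$ (M), $0$ (F)
$G = 1 \Rightarrow E(\ln W_M) = \beta_0 + \beta_1 X_1 + \delta \Rightarrow E(W_M) = e^{\beta_0 + \beta_1 X_1 + \delta}$
$G = 0 \Rightarrow E(\ln W_F) = \beta_0 + \beta_1 X_1 \Rightarrow E(W_F) = e^{\beta_0 + \beta_1 X_1}$
$\dfrac{E(W_M) - E(W_F)}{E(W_F)} = \dfrac{e^{\beta_0 + \beta_1 X_1 + \delta} - e^{\beta_0 + \beta_1 X_1}}{e^{\beta_0 + \beta_1 X_1}}$
$\Rightarrow \dfrac{E(W_M) - E(W_F)}{E(W_F)} = e^{\delta} - 1$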
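A quick numerical check of the note: the proportional gap does not depend on X1 and equals e^delta - 1, which is close to delta only when delta is small. The coefficient values below are hypothetical, chosen purely for illustration.

```python
import math


def expected_wage(beta0, beta1, delta, x1, g):
    """E(W) implied by the log-wage equation in the note."""
    return math.exp(beta0 + beta1 * x1 + delta * g)


def exact_gap(delta):
    """Proportional gap (E(W_M) - E(W_F)) / E(W_F) = e^delta - 1."""
    return math.exp(delta) - 1.0


# Hypothetical coefficient values, for illustration only.
beta0, beta1, delta = 1.0, 0.08, 0.20
gaps = []
for x1 in (0, 8, 16):
    w_m = expected_wage(beta0, beta1, delta, x1, g=1)
    w_f = expected_wage(beta0, beta1, delta, x1, g=0)
    gaps.append((w_m - w_f) / w_f)  # the same at every X1

print(gaps, exact_gap(delta), delta)
```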
Part II. Limited Dependent Variable Model and Sample Selection Corrections
Linear form: $E(y)$, or the probability $p$, is a linear function of the regressors, say $x$.
1. LPM (Linear Probability Model)
Nonlinear form: $E(y)$, or the probability $p$, is a nonlinear function of the regressors, say $x$.
2. Probit
3. Logit
4. Truncated Variable (Tobit model)
5. The Poisson Regression Model (Count data variable)
The dependent variable $y$ is a dichotomous (dummy) variable taking the value 1 or 0.
$y = 1$ if the person is employed, $0$ otherwise
$y = 1$ if the firm is bankrupt, $0$ otherwise
$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$, $i = 1, 2, \ldots, n$ (1)
$E(\varepsilon_i) = 0$, $Cov(\varepsilon_i, \varepsilon_j) = 0$, $i \neq j$
$E(y_i) = 1 \cdot f_i(1) + 0 \cdot f_i(0) = f_i(1) \Rightarrow$ the probability of $y = 1$ (success)
$p = E(y_i) = \beta_1 + \beta_2 x_i$
Probability that the event will occur given $x_i$
* $E(x) = \sum x f(x) = x_1 f(x_1) + \cdots + x_n f(x_n)$
* $f(y_i) = p^{y_i} q^{1 - y_i} = p^{y_i} (1 - p)^{1 - y_i} \Rightarrow$ pdf for $y_i$
* $\dfrac{\partial E(y_i)}{\partial x_i} = \beta_2$ (effect of a one-unit change in $x_i$ on the probability that $y_i = 1$)
where $0 \le E(y_i) \le 1 \Rightarrow 0 \le \beta_1 + \beta_2 x_i \le 1$
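$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$, $i = 1, 2, \ldots, n$ (1)
$\hat{y} = b_1 + b_2 x$
From (1), $\varepsilon_i = y_i - \beta_1 - \beta_2 x_i$, with distribution $f(\varepsilon_i)$:
$y_i = 1 \Rightarrow \varepsilon_i = 1 - \beta_1 - \beta_2 x_i$, with probability $p_i$
$y_i = 0 \Rightarrow \varepsilon_i = -\beta_1 - \beta_2 x_i$, with probability $1 - p_i$
$E(\varepsilon_i) = 0$
$\Rightarrow (1 - \beta_1 - \beta_2 x_i) p_i + (-\beta_1 - \beta_2 x_i)(1 - p_i) = 0$
$\Rightarrow p_i = \beta_1 + \beta_2 x_i$
$V(\varepsilon_i) = E[\varepsilon_i - E(\varepsilon_i)]^2 = E(\varepsilon_i^2)$
$= (1 - \beta_1 - \beta_2 x_i)^2 (\beta_1 + \beta_2 x_i) + (-\beta_1 - \beta_2 x_i)^2 (1 - \beta_1 - \beta_2 x_i)$
$= (\beta_1 + \beta_2 x_i)(1 - \beta_1 - \beta_2 x_i)$
$= E(y_i)[1 - E(y_i)]$
$\Rightarrow V(\varepsilon_i) = E(y_i)[1 - E(y_i)]$, $i = 1, 2, \ldots, n$
$V(\varepsilon_i) = \sigma^2(x_i)$
$V(\varepsilon_1) = E(y_1)[1 - E(y_1)]$
$V(\varepsilon_2) = E(y_2)[1 - E(y_2)]$
$V(\varepsilon_i)$ is heteroscedastic.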
Possible Solution:
Divide $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$ by $\sqrt{E(y_i)[1 - E(y_i)]}$:
$\dfrac{y_i}{\sqrt{E(y_i)[1 - E(y_i)]}} = \beta_1 \dfrac{1}{\sqrt{E(y_i)[1 - E(y_i)]}} + \beta_2 \dfrac{x_i}{\sqrt{E(y_i)[1 - E(y_i)]}} + \dfrac{\varepsilon_i}{\sqrt{E(y_i)[1 - E(y_i)]}}$
$\Rightarrow y_i^* = \beta_1 x_0^* + \beta_2 x_1^* + \varepsilon_i^*$
Note:
Drawback
The predicted probability can fall outside [0, 1].
A better solution is to re-specify, or transform, the regression model itself to constrain the probability outcome. This is one justification for the development of Probit and Logit models of binary outcomes.
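Both problems of the LPM show up even in a tiny sample. The sketch below fits the LPM by closed-form simple OLS on hypothetical data (an assumption, not data from the text) and then looks at the fitted probabilities and the implied variances E(y_i)[1 - E(y_i)].

```python
# Hypothetical binary outcome that tends to switch from 0 to 1 as x grows.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b2 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b1 = ybar - b2 * xbar

p_hat = [b1 + b2 * xi for xi in x]      # fitted "probabilities"
var_hat = [p * (1 - p) for p in p_hat]  # implied V(eps_i) = E(y_i)[1 - E(y_i)]

# The weights 1 / sqrt(p(1 - p)) exist only where 0 < p_hat < 1.
usable = [0 < p < 1 for p in p_hat]
print(b1, b2, min(p_hat), max(p_hat), usable)
```

In this sample some fitted values are below 0 or above 1, so the implied variance is negative there and the weighted regression cannot use those observations without an ad hoc adjustment.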
The Probit Model
The latent (index) variable approach:
An alternative (and more common) approach to the specification of discrete choice models is the latent variable approach, where it is assumed that there is some underlying (and unobserved) latent propensity variable $y^*$, where $y^* \in (-\infty, \infty)$.
$U_{1i}^* = \alpha_0 + \alpha_1 x_i + \varepsilon_{1i}$
$U_{2i}^* = \gamma_0 + \gamma_1 x_i + \varepsilon_{2i}$
$y_i^* = U_{1i}^* - U_{2i}^*$
$y_i = 1$ if $U_{1i}^* > U_{2i}^*$, $0$ if $U_{1i}^* \le U_{2i}^*$
where the $U^*$ are state-specific utilities.
Therefore, the model is the following:
$y_i^* = \beta_0 + \beta_1 x_i + \varepsilon_i$
$y_i = 1$ if $y_i^* > 0$, $0$ if $y_i^* \le 0$
Ex:
1. $y$ → the observed dummy is whether or not the person is employed
$y^*$ → propensity or ability to find employment
2. $y$ → the observed dummy variable is whether or not the person has bought a car
$y^*$ → desire or ability to buy a car
3. Boczar (1978, J. of Finance): a personal loan debtor borrows from a bank or from a financial company
$y_i = 1$ if $y_i^* > 0$ (obtains credit from a bank), $0$ if $y_i^* \le 0$ (obtains credit from a financial company)
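$f(y_i) = p^{y_i} (1 - p)^{1 - y_i}$
$\varepsilon_i \sim f(\varepsilon_i) \Rightarrow$ logistic distribution $\Rightarrow$ logit model
pdf: $f(\varepsilon_i) = \dfrac{e^{-\varepsilon_i}}{(1 + e^{-\varepsilon_i})^2}$, cdf: $F(\varepsilon_i) = \dfrac{1}{1 + e^{-\varepsilon_i}}$
$\varepsilon_i \sim f(\varepsilon_i) \Rightarrow$ standard normal distribution $\Rightarrow$ probit model
pdf: $f(\varepsilon_i) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} \varepsilon_i^2}$, $\varepsilon_i \sim N(0, 1)$, $-\infty < \varepsilon_i < \infty$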
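The two error distributions can be written out directly. The sketch below uses only the standard library; the normal CDF is computed through the identity Phi(e) = (1 + erf(e / sqrt(2))) / 2, and the helper `slope` is a hypothetical name for a finite-difference check that each pdf is the derivative of its cdf.

```python
import math


def logistic_pdf(e):
    return math.exp(-e) / (1.0 + math.exp(-e)) ** 2


def logistic_cdf(e):
    return 1.0 / (1.0 + math.exp(-e))


def normal_pdf(e):
    return math.exp(-0.5 * e * e) / math.sqrt(2.0 * math.pi)


def normal_cdf(e):
    # Standard identity: Phi(e) = (1 + erf(e / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(e / math.sqrt(2.0)))


def slope(cdf, e, h=1e-5):
    """Central-difference derivative of a CDF, to compare with its pdf."""
    return (cdf(e + h) - cdf(e - h)) / (2.0 * h)


for e in (-1.5, 0.0, 0.7):
    print(e, logistic_pdf(e), slope(logistic_cdf, e), normal_pdf(e), slope(normal_cdf, e))
```

Both densities are symmetric around zero, so 1 - F(-z) = F(z) holds for either choice of F.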
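* Likelihood Function (LF)
(text: p. 584)
$y_i^* = \beta_1 + \beta_2 x_i + \varepsilon_i$
$y_i = 1$ if $y_i^* > 0$, $0$ otherwise
$p_i = P(y_i = 1) = P(y_i^* > 0)$
$= P(\beta_1 + \beta_2 x_i + \varepsilon_i > 0)$
$= P(\varepsilon_i > -(\beta_1 + \beta_2 x_i))$
$= 1 - F(-(\beta_1 + \beta_2 x_i))$
$= F(\beta_1 + \beta_2 x_i)$ (by the symmetry of $f$ around zero)
where $F$ is the cumulative distribution function (CDF) of $\varepsilon_i$:
$F(\beta_1 + \beta_2 x_i) = \int_{-\infty}^{\beta_1 + \beta_2 x_i} f(\varepsilon)\, d\varepsilon$
$f(y_i) = p^{y_i} (1 - p)^{1 - y_i}$, $y_i = 0, 1$, $i = 1, 2, \ldots, n$
$f(y_i) = [F(\beta_1 + \beta_2 x_i)]^{y_i} [1 - F(\beta_1 + \beta_2 x_i)]^{1 - y_i}$
$LF = f(y_1) f(y_2) \cdots f(y_n)$
With the observations ordered so that $y_i = 1$ for $i = 1, \ldots, p$ and $y_i = 0$ for $i = p + 1, \ldots, n$:
$LF = [F(\beta_1 + \beta_2 x_1)]^1 [1 - F(\beta_1 + \beta_2 x_1)]^0 \cdots [F(\beta_1 + \beta_2 x_p)]^1 [1 - F(\beta_1 + \beta_2 x_p)]^0$
$\times [F(\beta_1 + \beta_2 x_{p+1})]^0 [1 - F(\beta_1 + \beta_2 x_{p+1})]^1 \cdots [F(\beta_1 + \beta_2 x_n)]^0 [1 - F(\beta_1 + \beta_2 x_n)]^1$
$= \prod_{y_i = 1} F(\beta_1 + \beta_2 x_i) \prod_{y_i = 0} [1 - F(\beta_1 + \beta_2 x_i)]$
$LLF = \sum_{y_i = 1} \ln F(\beta_1 + \beta_2 x_i) + \sum_{y_i = 0} \ln [1 - F(\beta_1 + \beta_2 x_i)]$
Use a numerical method to find $b_1^{MLE}$ and $b_2^{MLE}$.
[Figure: iterates $x_0$, $x_1$, $x_2$ of the numerical search.]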
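One possible numerical method is Newton-Raphson. The sketch below is an illustration for the logit case only, chosen because its score and Hessian have simple closed forms; it is not presented as the method used in the text, and the data and function names are hypothetical. The probit case follows the same pattern with the normal CDF in place of the logistic one.

```python
import math


def F(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def llf(b1, b2, x, y):
    """LLF = sum over y_i = 1 of ln F + sum over y_i = 0 of ln(1 - F)."""
    return sum(math.log(F(b1 + b2 * xi)) if yi == 1 else math.log(1.0 - F(b1 + b2 * xi))
               for xi, yi in zip(x, y))


def fit_logit(x, y, steps=25):
    """Newton-Raphson iterations starting from b1 = b2 = 0."""
    b1 = b2 = 0.0
    for _ in range(steps):
        p = [F(b1 + b2 * xi) for xi in x]
        g1 = sum(yi - pi for yi, pi in zip(y, p))                # dLLF/db1
        g2 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))  # dLLF/db2
        w = [pi * (1.0 - pi) for pi in p]
        h11 = sum(w)
        h12 = sum(wi * xi for wi, xi in zip(w, x))
        h22 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h11 * h22 - h12 * h12
        # Newton step: inverse of the negative Hessian times the score.
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2


# Hypothetical sample in which zeros and ones overlap in x.
x = [-2, -1, -1, 0, 0, 1, 1, 2]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b1, b2 = fit_logit(x, y)
print(b1, b2, llf(b1, b2, x, y), llf(0.0, 0.0, x, y))
```

The iteration needs zeros and ones that overlap in x; with perfectly separated data the likelihood has no finite maximum and the steps would diverge.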
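$E(y_i) = P(y_i = 1) = p_i = F(\beta_1 + \beta_2 x_i)$
$\Rightarrow \hat{p}_i = F(b_1^{MLE} + b_2^{MLE} x_i)$
$\dfrac{\partial p_i}{\partial x_i} = f(\beta_1 + \beta_2 x_i) \cdot \dfrac{\partial (\beta_1 + \beta_2 x_i)}{\partial x_i} = f(\beta_1 + \beta_2 x_i) \cdot \beta_2$
In this model we can examine the effect of a one-unit change in $x_i$ on the probability that $y_i = 1$.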
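Comparing the results of the LPM, Probit and Logit models
$\Rightarrow$ the coefficients cannot be compared directly
Linear Probability Model (LPM):
$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$
$\Rightarrow E(y_i) = \beta_1 + \beta_2 x_i$
$b_2^{MLE}$ (estimated from the probit and logit models) and $b_2^{OLS}$ (estimated from the LPM)
Probit and Logit models:
$\dfrac{\partial p_i}{\partial x_i} = f(\beta_1 + \beta_2 x_i) \cdot \dfrac{\partial (\beta_1 + \beta_2 x_i)}{\partial x_i} = f(\beta_1 + \beta_2 x_i) \cdot \beta_2$
$\Rightarrow b_2 \cdot f(b_1 + b_2 x_i)$
$\Rightarrow b_2 \cdot \dfrac{\sum_{i=1}^{n} f(b_1 + b_2 x_i)}{n}$
LPM: $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$
$\Rightarrow E(y_i) = \beta_1 + \beta_2 x_i = p_i$
$\dfrac{\partial E(y_i)}{\partial x_i} = \dfrac{\partial p_i}{\partial x_i} = \beta_2 \Rightarrow b_2$
ex:
Financial Distress Model:
$y_i^*$ → propensity to bankruptcy of the company
$y_i = 1$ (bankruptcy), $0$ (no bankruptcy)
$p_i$ → the probability of company bankruptcy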
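To put a probit coefficient on the same footing as the LPM slope, the density term has to be evaluated somewhere. The sketch below computes the average of b2 f(b1 + b2 x_i) over the sample and, as a common alternative, the effect at the sample mean of x. The estimates and regressor values are hypothetical, not results from the text.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def average_marginal_effect(b1, b2, x):
    """b2 * (1/n) * sum of f(b1 + b2 x_i)."""
    return b2 * sum(normal_pdf(b1 + b2 * xi) for xi in x) / len(x)


def marginal_effect_at_mean(b1, b2, x):
    """b2 * f(b1 + b2 * xbar)."""
    xbar = sum(x) / len(x)
    return b2 * normal_pdf(b1 + b2 * xbar)


# Hypothetical probit estimates and regressor values.
b1, b2 = -1.2, 0.8
x = [0.0, 1.0, 2.0, 3.0]
ame = average_marginal_effect(b1, b2, x)
mem = marginal_effect_at_mean(b1, b2, x)
print(ame, mem, b2)
```

Because the standard normal density never exceeds about 0.4, either marginal effect is well below the raw coefficient b2, which is why b2 from probit and b2 from the LPM are not directly comparable.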
ex:
Predicted value
$y_i^* = \beta_1 + \beta_2 x_1 + \beta_3 x_2$
$p_i = F(\beta_1 + \beta_2 x_1 + \beta_3 x_2)$
$\hat{p}_i = F(b_1^{MLE} + b_2^{MLE} x_1 + b_3^{MLE} x_2)$
* Truncated Variable
(text: p. 595)
Tobit Model (J. Tobin 1958, Econometrica vol. 26, pp. 24-36)
$H_i = \beta_1 + \beta_2 x$ for those working, $0$ for those not working
$y_i^* = \beta_1 + \beta_2 x_i + \varepsilon_i$
$y_i = y_i^* = \beta_1 + \beta_2 x_i + \varepsilon_i$ if $y_i^* > 0$, $0$ if $y_i^* \le 0$
$y_i$: limited variable (cut off, censored)
$\varepsilon_i \sim N(0, \sigma^2)$
Note that we have two sets of observations:
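(1) The probability of $y_i^* \le 0$
$P(y_i = 0) = P(y_i^* \le 0)$
$= P(\beta_1 + \beta_2 x_i + \varepsilon_i \le 0)$
$= P(\varepsilon_i \le -(\beta_1 + \beta_2 x_i))$
$= P\left(\dfrac{\varepsilon_i}{\sigma} \le -\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right)$
$= F\left(-\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right)$
$= 1 - F\left(\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right)$
where $F$ is the standard normal CDF.
[Figure: standard normal density with $-\dfrac{\beta_1 + \beta_2 x_i}{\sigma}$, $0$, and $\dfrac{\beta_1 + \beta_2 x_i}{\sigma}$ marked on the horizontal axis.]
(2) The probability of $y_i^* > 0$
$P(y_i^* > 0) = P(y_i = y_i^*)$
$= P(y_i = \beta_1 + \beta_2 x_i + \varepsilon_i)$
$= P(\varepsilon_i = y_i - \beta_1 - \beta_2 x_i)$
$= P\left(\dfrac{\varepsilon_i}{\sigma} = \dfrac{y_i - \beta_1 - \beta_2 x_i}{\sigma}\right)$
$= \dfrac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2} \left(\frac{y_i - \beta_1 - \beta_2 x_i}{\sigma}\right)^2}$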
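$\Rightarrow LF = \prod_{n_0} \left[1 - F\left(\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right)\right] \prod_{n_1} \dfrac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2} \left(\frac{y_i - \beta_1 - \beta_2 x_i}{\sigma}\right)^2}$
Let $n_0$ denote the number of observations with $y_i^* \le 0$ and $n_1$ the number with $y_i^* > 0$.
$\Rightarrow LLF = \sum_{n_0} \ln \left[1 - F\left(\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right)\right] + \sum_{n_1} \ln \left[\dfrac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2} \left(\frac{y_i - \beta_1 - \beta_2 x_i}{\sigma}\right)^2}\right]$
Find $b_1^{MLE}$ and $b_2^{MLE}$.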
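The Tobit log-likelihood has one term per censored observation and one per uncensored observation. The function below is a minimal sketch of that sum, assuming censoring at zero and a standard normal F computed with `math.erf`; the name `tobit_llf` and the sample are hypothetical. It evaluates the LLF at given parameter values and leaves the maximization to a numerical optimizer.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_llf(b1, b2, sigma, x, y):
    """Log-likelihood for data censored at zero."""
    total = 0.0
    for xi, yi in zip(x, y):
        index = b1 + b2 * xi
        if yi == 0:
            # Censored observation: ln[1 - F((b1 + b2 x_i) / sigma)]
            total += math.log(1.0 - normal_cdf(index / sigma))
        else:
            # Uncensored observation: ln of the normal density of y_i
            z = (yi - index) / sigma
            total += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * z * z
    return total


# Hypothetical sample: hours worked, zero for those not working.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 0.5, 1.4, 2.6, 3.5]
llf_good = tobit_llf(-2.5, 1.0, 0.5, x, y)  # parameters close to the pattern in the data
llf_poor = tobit_llf(0.0, 0.0, 0.5, x, y)   # parameters that ignore x
print(llf_good, llf_poor)
```

The parameter values that track the data give a much higher log-likelihood, which is the comparison a numerical maximizer repeats at each step.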
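$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$
$\dfrac{\partial E(y_i)}{\partial x_i} = \beta_2 \Rightarrow b_2^{OLS}$
$\dfrac{\partial E(y_i^* \mid x_i)}{\partial x_i} = \beta_2$
$\dfrac{\partial E(y_i \mid x_i)}{\partial x_i} = \beta_2 \cdot F\left(\dfrac{\beta_1 + \beta_2 x_i}{\sigma}\right) \Rightarrow b_2^{MLE} \cdot F\left(\dfrac{b_1^{MLE} + b_2^{MLE} x_i}{\hat{\sigma}}\right)$
For $F\left(\dfrac{b_1 + b_2 x_i}{\hat{\sigma}}\right)$, use $F\left(\dfrac{b_1 + b_2 \bar{x}}{\hat{\sigma}}\right)$ or $\dfrac{1}{n} \sum_{i=1}^{n} F\left(\dfrac{b_1 + b_2 x_i}{\hat{\sigma}}\right)$.
(text: p. 599)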
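* The Poisson Regression Model (Count data variable)
(text: p. 604)
$P(y = y_i) = \dfrac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}$, $y_i = 0, 1, 2, \ldots$
$E(y_i) = \lambda_i = e^{\beta_1 + \beta_2 x_i}$
$y_i = E(y_i) + \varepsilon_i$
$\Rightarrow y_i = e^{\beta_1 + \beta_2 x_i} + \varepsilon_i$
joint pdf:
$LF = \dfrac{e^{-\lambda_1} \lambda_1^{y_1}}{y_1!} \cdot \dfrac{e^{-\lambda_2} \lambda_2^{y_2}}{y_2!} \cdots \dfrac{e^{-\lambda_n} \lambda_n^{y_n}}{y_n!}$
$LLF = (-\lambda_1 + y_1 \ln \lambda_1 - \ln y_1!) + (-\lambda_2 + y_2 \ln \lambda_2 - \ln y_2!) + \cdots$
$= \sum_{i=1}^{n} \left[ y_i (\beta_1 + \beta_2 x_i) - e^{\beta_1 + \beta_2 x_i} - \ln y_i! \right]$
Find $b_1^{MLE}$ and $b_2^{MLE}$.
$\dfrac{\partial E(y_i)}{\partial x_i} = e^{\beta_1 + \beta_2 x_i} \cdot \beta_2$
$\Rightarrow$ Compare $b_2^{MLE} \cdot \dfrac{1}{n} \sum_{i=1}^{n} e^{b_1 + b_2 x_i}$ or $b_2^{MLE} \cdot e^{b_1 + b_2 \bar{x}}$ with $b_2^{OLS}$.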
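The Poisson log-likelihood can be maximized with the same kind of Newton-Raphson iteration. The sketch below is an illustration under stated assumptions: the count data are hypothetical, the starting point b1 = ln(mean of y), b2 = 0 is a convenience choice, and `math.lgamma(y + 1)` is used for ln(y!). It also computes the two marginal-effect summaries that are compared with the OLS slope.

```python
import math


def poisson_llf(b1, b2, x, y):
    """LLF = sum[ y_i (b1 + b2 x_i) - exp(b1 + b2 x_i) - ln(y_i!) ]."""
    return sum(yi * (b1 + b2 * xi) - math.exp(b1 + b2 * xi) - math.lgamma(yi + 1)
               for xi, yi in zip(x, y))


def fit_poisson(x, y, steps=25):
    """Newton-Raphson from b1 = ln(mean of y), b2 = 0."""
    b1, b2 = math.log(sum(y) / len(y)), 0.0
    for _ in range(steps):
        lam = [math.exp(b1 + b2 * xi) for xi in x]
        g1 = sum(yi - li for yi, li in zip(y, lam))                # dLLF/db1
        g2 = sum((yi - li) * xi for xi, yi, li in zip(x, y, lam))  # dLLF/db2
        h11 = sum(lam)
        h12 = sum(li * xi for li, xi in zip(lam, x))
        h22 = sum(li * xi * xi for li, xi in zip(lam, x))
        det = h11 * h22 - h12 * h12
        # Newton step: inverse of the negative Hessian times the score.
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2


# Hypothetical count data.
x = [0, 1, 2, 3, 4]
y = [1, 1, 2, 4, 6]
b1, b2 = fit_poisson(x, y)
lam_hat = [math.exp(b1 + b2 * xi) for xi in x]
ame = b2 * sum(lam_hat) / len(x)                    # b2 * (1/n) * sum exp(b1 + b2 x_i)
at_mean = b2 * math.exp(b1 + b2 * sum(x) / len(x))  # b2 * exp(b1 + b2 * xbar)
print(b1, b2, ame, at_mean)
```

Because the model includes an intercept, the fitted means sum to the observed counts at the maximum, so the averaged effect equals b2 times the sample mean of y.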