1 Choosing a Functional Form

1. Using a linear model when the true relationship is nonlinear is a specification error.
2. The resulting estimator is biased.
3. Theory must be your guide in econometrics, but economic theory often gives
little guidance about functional form, so the choice is not easy.
4. Plot the data to find out whether a nonlinear approach is necessary. This
is the way to find out whether the regression needs a nonlinear term! Plot the independent variables one at a time against the dependent variable.
5. In a nonlinear model, the marginal effect of X on Y is not constant.
6. A t-test on the nonlinear term shows whether that term is necessary.
7. Even if the theory calls for no constant term, still leave it in. If the theory
is right, the estimated constant will go to zero on its own.
8. Tractable nonlinearity: the model can be converted to a form that is linear in the parameters. Example
of a polynomial form: a quadratic specification in average income,
\[ TestScore_i = \beta_0 + \beta_1\, avginc_i + \beta_2\, (avginc_i)^2 + u_i \]
A cubic just adds a third term.
9. Predicted change in TestScore for a change in income from $5,000 per
capita to $6,000 per capita (income measured in thousands of dollars):
\[ 3.85 \times (6 - 5) - 0.042 \times (6^2 - 5^2) \approx 3.4 \]
10. Must remember to put in higher order terms
11. Declining marginal benefit of an increase in school budgets?
12. Caution! Don't extrapolate outside the range of the data!
13. It is meaningless to run a t-test on the constant term. The deviation in the sample from
the zero mean of the errors is absorbed into the constant term. This is done for the
sake of the equation as a whole.
Change in income      Change in test score
from 5 to 6           3.4
from 25 to 26         1.7
from 45 to 46         0
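The arithmetic behind the table above can be sketched in a few lines of Python. The two coefficients are the ones quoted in the notes. The negative sign on the squared term is an assumption here: it is what the reported results (3.4, 1.7, 0) require. The function name `predicted_change` is hypothetical.

```python
# Coefficients as quoted in the notes; income is in thousands of dollars per capita.
B1 = 3.85    # coefficient on avginc
B2 = -0.042  # coefficient on avginc squared (sign assumed negative, see lead-in)

def predicted_change(x_from, x_to, b1=B1, b2=B2):
    """Change in predicted TestScore when avginc moves from x_from to x_to."""
    # The intercept cancels, so only the linear and squared terms remain.
    return b1 * (x_to - x_from) + b2 * (x_to**2 - x_from**2)

for start, end in [(5, 6), (25, 26), (45, 46)]:
    print(f"from {start} to {end}: {predicted_change(start, end):.1f}")
```

The same one-unit change in income produces a smaller predicted change at higher income levels, which is the non-constant marginal effect the quadratic term is meant to capture.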
14. Testing the null hypothesis of linearity
Figure 1: Plotting data to reveal nonlinearity
Figure 2: Tractable regression: cubic
15. H0: the population coefficients on avginc^2 and avginc^3 are both zero.
16. H1: at least one of these coefficients is nonzero.
17. Stata command: test avginc2 avginc3
    avginc2 = 0.0
    avginc3 = 0.0
    F(2, 416) = 37.69
    Prob > F = 0.0000
18. Linearity is rejected at the 1 percent level against the alternative that it
is a polynomial of degree up to 3.
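As a rough illustration of what a joint test of this kind computes, here is the textbook homoskedasticity-only F formula in Python. This is only a sketch: the Stata output above may be based on robust standard errors, which this formula does not reproduce, and the sums of squared residuals below are hypothetical numbers, not values from the test score data. Only the degrees of freedom (2 and 416) come from the notes.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, df_resid):
    """Homoskedasticity-only F statistic for q linear restrictions.

    ssr_restricted:   sum of squared residuals with the restrictions imposed
    ssr_unrestricted: sum of squared residuals of the full model
    df_resid:         residual degrees of freedom of the full model
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_resid
    return numerator / denominator

# Hypothetical SSR values, chosen only to show the arithmetic.
ssr_linear = 120.0  # restricted model: avginc only
ssr_cubic = 100.0   # unrestricted model: adds avginc2 and avginc3
f_stat = f_statistic(ssr_linear, ssr_cubic, q=2, df_resid=416)
print(round(f_stat, 2))
```

A large F means the squared and cubed terms reduce the residual sum of squares by more than chance would suggest, which is the sense in which linearity is rejected.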
1.1 Alternative Functional Forms
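1. The linear form is the default. It should be used unless there is strong evidence to the
contrary.
2. Changes in the logarithm of Y and/or X approximate proportional (percent) changes:
\[ \ln(x + \Delta x) - \ln(x) = \ln\left(1 + \frac{\Delta x}{x}\right) \approx \frac{\Delta x}{x} \]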
3. In calculus,
\[ \frac{d\ln(x)}{dx} = \frac{1}{x} \]
4. Numerically it is not perfect!
\[ \ln(1.01) = 0.00995 \approx 0.01; \qquad \ln(1.10) = 0.0953 \approx 0.10 \ \text{(sort of!)} \]
5. Often difficult to choose. Note vertical axis. Does this seem to fit as well
as the cubic or linear-log?
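The quality of the log approximation in items 2 through 4 can be checked numerically. The short Python sketch below compares ln(1 + Δx/x) with Δx/x. The first two cases are the ones in the notes; the 50 percent case is an added illustration of how the approximation deteriorates for large changes.

```python
import math

# Compare the exact log difference with the proportional change it approximates.
for pct in (0.01, 0.10, 0.50):
    exact = math.log(1 + pct)
    print(f"dx/x = {pct:.2f}   ln(1 + dx/x) = {exact:.5f}   gap = {pct - exact:.5f}")
```

The gap grows with the size of the change, so reading log differences as percent changes is safest for small changes.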
Table 1: Three log transformations
Case   Name         Specification
I.     linear-log   Yi = β0 + β1 ln(Xi) + ui
II.    log-linear   ln(Yi) = β0 + β1 Xi + ui
III.   log-log      ln(Yi) = β0 + β1 ln(Xi) + ui
Figure 3: Use R2 to Choose?
1.2 Basic transformations for tractable nonlinearity
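\[ Y = e^{\beta_0} X^{\beta_1} e^{u} \]
\[ \ln Y = \ln e^{\beta_0} + \ln X^{\beta_1} + \ln e^{u} \]
\[ \ln Y = \beta_0 \ln e + \beta_1 \ln X + u \ln e \]
\[ \ln Y = \beta_0 + \beta_1 \ln X + u \]
\[ \ln Y = Y', \qquad X' = \ln X \]
\[ Y' = \beta_0 + \beta_1 X' + u \]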
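The derivation above says that a model that is multiplicative in levels becomes linear after taking logs, so OLS on the transformed variables should recover β0 and β1. A minimal Python sketch of that idea follows, using made-up parameter values and no error term (u = 0) so that the fit is exact. The helper `ols_simple` is a hypothetical name for the usual one-regressor OLS formulas.

```python
import math

def ols_simple(x, y):
    """Intercept and slope from a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical parameters; with u = 0 the log-log regression is exact.
beta0, beta1 = 1.5, 0.7
X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [math.exp(beta0) * x ** beta1 for x in X]

# Regress ln(Y) on ln(X): the transformed model is linear in the parameters.
b0_hat, b1_hat = ols_simple([math.log(x) for x in X], [math.log(y) for y in Y])
print(b0_hat, b1_hat)
```

With real data the error term is not zero, so the estimates would differ from the true parameters by sampling error.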
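1. Double log form
\[ \ln Y_i = \beta_0 + \beta_1 \ln X_{1i} + \beta_2 \ln X_{2i} + u_i \]
2. Log-log β̂’s are elasticities
\[ \beta_k = \frac{\Delta(\ln Y)}{\Delta(\ln X_k)} = \frac{\Delta Y / Y}{\Delta X_k / X_k} = \mathrm{elas}_{Y, X_k} \]
3. Good for estimating Cobb-Douglas equations
\[ Y = e^{\beta_0} X_1^{\beta_1} X_2^{\beta_2} e^{u} \]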
4. Three cases, differing in whether Y and/or X is transformed by taking
logarithms.
5. The regression is linear in the new variable(s) ln(Y ) and/or ln(X), and
the coefficients can be estimated by OLS.
6. Hypothesis tests and confidence intervals are now implemented and interpreted as usual.
7. The interpretation of β1 differs from case to case.
8. Choice of specification should be guided by judgment (which interpretation
makes the most sense in your application), tests, and plotting predicted
values.
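To make the differing interpretations of β1 concrete, the sketch below computes the exact effect implied by each of the three log specifications and can be compared with the usual rules of thumb (0.01 × β1 units of Y per 1 percent change in X for linear-log, 100 × β1 percent per unit of X for log-linear, β1 percent per 1 percent for log-log). The function names and the coefficient values are hypothetical.

```python
import math

def linear_log_effect(beta1, pct_change_x):
    """Y = b0 + b1 ln(X): change in Y (in units of Y) for a percent change in X."""
    return beta1 * math.log(1 + pct_change_x / 100)

def log_linear_effect(beta1, delta_x):
    """ln(Y) = b0 + b1 X: percent change in Y for a change of delta_x units in X."""
    return 100 * (math.exp(beta1 * delta_x) - 1)

def log_log_effect(beta1, pct_change_x):
    """ln(Y) = b0 + b1 ln(X): percent change in Y for a percent change in X."""
    return 100 * ((1 + pct_change_x / 100) ** beta1 - 1)

# Hypothetical coefficients, one per specification.
print(linear_log_effect(36.0, 1))  # rule of thumb: 0.01 * 36 = 0.36 units of Y
print(log_linear_effect(0.05, 1))  # rule of thumb: 100 * 0.05 = 5 percent
print(log_log_effect(0.5, 1))      # rule of thumb: 0.5 percent (the elasticity)
```

The exact values differ slightly from the rules of thumb because the log approximation is only accurate for small changes.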
1.3 Intractable Nonlinearity
1. The foregoing are not really nonlinear, in that they are still linear in the
parameters.
2. They also have flaws. Polynomial: test score can decrease with income.
3. Linear-log: test score increases with income, but without bound.
4. How about a nonlinear function that has test score always increasing
and builds in a maximum score?
\[ Y = \beta_0 - \alpha e^{-\beta_1 X} \]
where β0, β1, and α are unknown parameters. This is called a negative
exponential growth curve.
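5. Stock and Watson suggest the transformation
\[ \alpha = \beta_0 e^{\beta_1 \beta_2} \tag{1} \]
to write the nonlinear least squares estimation as
\[ \min_{\beta_0, \beta_1, \beta_2} \sum_i \left( Y_i - \beta_0 \left[ 1 - e^{-\beta_1 (X_i - \beta_2)} \right] \right)^2 \]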
6. Solved by a variant of Newton’s Method.
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \tag{2} \]
7. The result of the estimation is shown in Figure 5.
8. Not worth the effort?
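The Newton update in item 6 can be illustrated on a one-dimensional problem. The Python sketch below applies the scalar iteration to the toy equation x² − 2 = 0. It is not the routine a statistics package uses for nonlinear least squares, which works with several parameters at once and typically uses a related variant of the method. The function name `newton`, the tolerance, and the iteration cap are assumptions made for this example.

```python
def newton(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Repeat x_new = x - f(x) / f'(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

# Toy problem: the positive root of x^2 - 2 = 0, starting from x0 = 1.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

In a minimization problem the same update is applied to the derivative of the objective, so the iteration searches for a point where the first-order condition holds.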
1.4 Interactions
1. Perhaps a class size reduction is more effective in some circumstances than
in others
2. Perhaps smaller classes help more if there are many English learners, who
need individual attention
3. That is, the effect of class size on test scores might depend on PctEL
4. More generally, the effect of X1 on Y might depend on X2
5. How to model such interactions between X1 and X2 ?
6. We first consider binary Xs, then continuous Xs
\[ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + u_i \tag{3} \]
Figure 4: The negative exponential in Stata
Figure 5: Negative exponential RMSE = 12.675; Linear-log RMSE = 12.618
7. D1i and D2i are binary
8. Note that β1 is the effect of changing D1 = 0 to D1 = 1. In this specification this effect doesn't depend on the value of D2.
9. To allow the effect of changing D1 to depend on D2, include the interaction
term D1i × D2i as a regressor:
\[ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i \]
10. The effect of D1 now depends on D2.
11. β3 is the increment to the effect of D1 when D2 = 1.
12. Interactions between continuous and binary variables
\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + u_i \tag{4} \]
13. Di is binary, X is continuous
14. As specified above, the effect on Y of X (holding constant D) = β2 , which
does not depend on D.
15. To allow the effect of X to depend on D, include the “interaction term”
Di × Xi as a regressor:
\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 (D_i \times X_i) + u_i \tag{5} \]
Interactions between two continuous variables
\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \tag{6} \]
16. X1 , X2 are continuous
17. As specified, the effect of X1 doesn't depend on X2 and vice versa.
18. To allow the effect of X1 to depend on X2, include the interaction term
X1 × X2 as a regressor.
19. The interaction term changes the slope. The changes are summarized in Figure 6.
Figure 6: Combinations of slope and intercept dummies
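The interaction specifications in this section all share the form Y = β0 + β1 X1 + β2 X2 + β3 (X1 × X2), with X1 and X2 binary or continuous as the case requires. A small Python sketch with made-up coefficients shows how the interaction term makes one variable's effect depend on the other. The function name `predict` and all numbers are hypothetical, and the error term is set to zero.

```python
def predict(x1, x2, b0, b1, b2, b3):
    """Predicted Y from b0 + b1*x1 + b2*x2 + b3*(x1*x2), with u set to zero."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# Hypothetical coefficients, for illustration only.
B = dict(b0=10.0, b1=2.0, b2=1.0, b3=0.5)

# Two binary regressors: effect of switching D1 from 0 to 1, at each value of D2.
effect_d1_when_d2_is_0 = predict(1, 0, **B) - predict(0, 0, **B)  # equals b1
effect_d1_when_d2_is_1 = predict(1, 1, **B) - predict(0, 1, **B)  # equals b1 + b3

# Binary D (first argument) and continuous X (second argument):
# effect of a one-unit increase in X, for each group.
slope_x_when_d_is_0 = predict(0, 4.0, **B) - predict(0, 3.0, **B)  # equals b2
slope_x_when_d_is_1 = predict(1, 4.0, **B) - predict(1, 3.0, **B)  # equals b2 + b3

print(effect_d1_when_d2_is_0, effect_d1_when_d2_is_1)
print(slope_x_when_d_is_0, slope_x_when_d_is_1)
```

In both cases β3 is the amount by which the effect differs between the two groups. With two continuous regressors the same function applies, and the effect of a one-unit increase in X1 is β1 + β3 X2.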