Multiple Regression Analysis

y = β0 + β1x1 + β2x2 + . . . + βkxk + u
4. Further Issues
Redefining Variables
  Suppose we have a model with a variable like
income measured in dollars on the left-hand-side.
Now we re-define income to be measured in tens
of thousands of dollars. What effect will this have
on estimation and inference?
  It will not affect the R²
  Will such scaling have any effect on t-stats, F-stats and confidence intervals?
    - No, these will also have the same interpretation
  Changing the scale of the y variable just leads to a
corresponding change in the scale of the
coefficients
Redefining Variables (cont)
  Suppose we originally obtain
hpr̂ice = β̂0 + β̂1 sqrft + β̂2 bdrms, where
β̂0 = −19300, β̂1 = 128, β̂2 = 15200 and
se(β̂0) = 31000, se(β̂1) = 14, se(β̂2) = 9480, and R² = 0.63
  In this specification, house price is measured in
dollars.
  What happens if we re-estimate this with house
price measured in thousands of dollars?
Redefining Variables (cont)
 If we measure price in thousands of dollars,
the new coefficient will be the old
coefficient divided by 1000 (same estimated
effect!)
 The standard errors will be 1000 times
smaller
 t-stats etc. will be identical
hpr̂ice = β̂0 + β̂1 sqrft + β̂2 bdrms, where
β̂0 = −19.3, β̂1 = 0.128, β̂2 = 15.2 and
se(β̂0) = 31, se(β̂1) = 0.014, se(β̂2) = 9.48, and R² = 0.63
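The rescaling result can be checked numerically. The sketch below uses simulated data (not the data behind the slide's estimates); the helper `ols` and all variable names are illustrative assumptions. It fits the same regression with price in dollars and in thousands of dollars and compares coefficients, standard errors, t-stats and R².

```python
import numpy as np

def ols(X, y):
    """OLS with an intercept; returns (coefficients, standard errors, R-squared)."""
    Z = np.column_stack([np.ones(len(y)), X])
    n, p = Z.shape
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    ssr = resid @ resid
    sigma2 = ssr / (n - p)                      # unbiased error-variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))
    sst = ((y - y.mean()) ** 2).sum()
    return beta, se, 1 - ssr / sst

# Simulated (hypothetical) housing data
rng = np.random.default_rng(0)
n = 200
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(1, 6, n).astype(float)
price = -19300 + 128 * sqrft + 15200 * bdrms + rng.normal(0, 50000, n)

X = np.column_stack([sqrft, bdrms])
b_d, se_d, r2_d = ols(X, price)          # price in dollars
b_k, se_k, r2_k = ols(X, price / 1000)   # price in thousands of dollars

print("coefficient ratios:", b_d / b_k)      # should all be 1000 up to rounding
print("std. error ratios: ", se_d / se_k)    # should all be 1000 up to rounding
print("t-stats (dollars):  ", b_d / se_d)
print("t-stats (thousands):", b_k / se_k)
print("R-squared:", r2_d, r2_k)
```

Every coefficient, including the intercept, rescales by the same factor, which is why the ratios (and hence the t-stats) line up.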
Redefining Variables (cont)
  Changing the scale of one of the x variables:
What if we redefine square feet as thousands of square
feet? Now all the β’s have the same interpretation as before
with the exception of β1-hat
hpr̂ice = β̂0 + β̂1 (sqrft/1000) + β̂2 ...
  β1-hat will be 1000 times larger
    - Why? Because now a 1 unit change in the rescaled variable is the same as
      what previously was a 1000 unit change in square feet.
  The standard error will also be 1000 times larger, and t-stats etc. will have the same interpretation
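As a short worked check of the x-rescaling claim, using the estimates from the earlier slide (β̂1 = 128, se = 14). The tilde notation for the rescaled coefficient is an assumption introduced here for illustration only:

```latex
\hat\beta_1 \,\mathit{sqrft}
  = (1000\,\hat\beta_1)\cdot\frac{\mathit{sqrft}}{1000}
\quad\Longrightarrow\quad
\tilde\beta_1 = 1000 \times 128 = 128000,
\qquad
\operatorname{se}(\tilde\beta_1) = 1000 \times 14 = 14000,
\qquad
t = \frac{128000}{14000} = \frac{128}{14} \approx 9.14
```

The factor of 1000 cancels in the t-ratio, so inference on β1 is unchanged.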
Functional Form
  OLS can be used for relationships that are not strictly
linear in x and y by using nonlinear functions of x and y –
will still be linear in the parameters
Example:
log(wage) = β0 + β1(educ) + β2(exper) + β3(exper)²
In this particular specification we have an example of a log
specification with a quadratic term--both are examples of
nonlinearities that can be introduced into the standard
linear regression model
Interpretation of Log Models
1. If the model is ln(y) = β0 + β1ln(x) + u, then β1 is an
elasticity. e.g. if we obtained an estimate of 1.2, this
would suggest that a 1 percent increase in x causes y to
increase by 1.2 percent.
2. If the model is ln(y) = β0 + β1x + u, then β1*100 is the
approximate percent change in y resulting from a unit change in x. e.g.
if we obtained an estimate of 0.05, this would suggest that
a 1 unit increase in x causes roughly a 5% increase in y.
3. If the model is y = β0 + β1ln(x) + u, then β1/100 is the unit
change in y resulting from a 1 percent change in x. e.g. if
we obtained an estimate of 20, this would suggest that a 1
percent increase in x causes a 0.2 unit increase in y.
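The "β1*100" reading of the log-level model in case 2 is an approximation that works well for small coefficients. A minimal sketch comparing it with the exact implied change, 100·(exp(β1) − 1); the function name is a hypothetical helper, not something from the slides:

```python
import math

def exact_pct_change(beta1, dx=1.0):
    """Exact % change in y implied by ln(y) = b0 + b1*x when x rises by dx."""
    return 100 * (math.exp(beta1 * dx) - 1)

# Compare the rule-of-thumb (100*b1) with the exact figure
for b1 in (0.05, 0.20, 0.50):
    print(f"b1={b1:.2f}  approx={100 * b1:.1f}%  exact={exact_pct_change(b1):.2f}%")
```

For the slide's estimate of 0.05 the exact figure is about 5.13%, so the approximation is close; for larger coefficients the gap grows.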
Why use log models?
  Log-log models are invariant to the scale of the
variables since we’re measuring percent changes
  They can give a direct estimate of elasticity
  For models with y > 0, the conditional distribution
is often heteroskedastic or skewed, while ln(y) is
much less so
  The distribution of ln(y) is narrower, limiting
the effect of outliers
Some Rules of Thumb
What types of variables are often used in log form?
*Variables in positive dollar amounts
*Variables measuring numbers of people
-school enrollments, population, # employees
*Variables subject to extreme outliers
What types of variables are often used in level form?
*Anything that takes on a negative or zero value
*Variables measured in years
Quadratic Models
  Captures increasing or decreasing marginal effects
  For a model of the form y = β0 + β1x + β2x² + u,
we can’t interpret β1 alone as measuring the
change in y with respect to x.
dy/dx = β1 + 2β2x
 Now the effect of an extra unit of x on y depends in part
on the value of x. Suppose β1 is positive. Then if β2 is
positive, an extra unit of x has a larger impact on y when x
is big than when x is small. If β2 is negative, an extra unit
of x has a smaller impact on y (or a more negative impact
on y) when x is big than when x is small.
More on Quadratic Models
  Suppose that the coefficient on x is positive and
the coefficient on x² is negative
  Then y is increasing in x at first, but will
eventually turn around and be decreasing in x
  We may want to know the turning point
wâge = 3.73 + 0.298 exper − 0.0061 exper²
The turning point will be at x* = 0.298/(2 × 0.0061) ≈ 24.4, where dw/dx = 0.
More on Quadratic Models
  Suppose that the coefficient on x is negative and
the coefficient on x² is positive
  Then y is decreasing in x at first, but will
eventually turn around and be increasing in x
log(price)-hat = 13.39 − 0.902 log(nox) − 0.087 log(dist)
    − 0.545 rooms + 0.062 rooms² − 0.048 stratio
The turning point will be at r* ≈ 4.4, where ∂y/∂r = 0.
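Both turning points come from setting the derivative β1 + 2β2x to zero, which gives x* = −β1/(2β2). A small sketch that reproduces the two figures from the estimated equations in these slides; the function name is an illustrative assumption:

```python
def turning_point(b1, b2):
    """Value of x where b1 + 2*b2*x = 0 for y = b0 + b1*x + b2*x**2."""
    if b2 == 0:
        raise ValueError("no turning point when the quadratic coefficient is zero")
    return -b1 / (2 * b2)

# Wage equation: coefficient on exper is 0.298, on exper^2 is -0.0061
exper_star = turning_point(0.298, -0.0061)

# Housing equation: coefficient on rooms is -0.545, on rooms^2 is 0.062
rooms_star = turning_point(-0.545, 0.062)

print(round(exper_star, 1), round(rooms_star, 1))  # 24.4 4.4
```

Whether the turning point is a maximum or a minimum depends on the sign of the quadratic coefficient: negative gives a maximum (wage), positive gives a minimum (rooms).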
Interaction Terms
  We might think that the marginal effect of one
RHS variable depends on another RHS variable
Example: suppose the model can be written:
y = β0 + β1x1 + β2x2 + β3x1x2 + u
  Where y is house price, x1 is the number of square
feet and x2 is the number of bedrooms.
  So the effect of an extra bedroom on price is
∂y/∂x2 = β2 + β3x1
Interaction Terms
  If β3 > 0, this tells us that an extra bedroom boosts the price
of a house more if the square footage of the house is
higher.
    - This shouldn’t be surprising. After all, an extra bedroom in a
      small house is likely to be small compared with an extra bedroom
      in a large house. So we would expect an extra bedroom in a big
      house to be worth more.
  Note that this makes interpretation of β2 a bit less
straightforward.
    - Technically, β2 tells us how much an extra bedroom is worth in a
      house with zero square feet.
    - It may be useful to report the value of β2 + β3x1 at the mean
      value of x1.
    - Or redefine x1 to be deviations of square footage from the mean (so
      that negative values imply smaller than average houses; positive
      values imply larger than average houses)
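The last two suggestions are equivalent: after centering x1 at its mean, the coefficient on x2 equals β2 + β3·mean(x1) from the uncentered fit. A minimal sketch on simulated data (the data-generating numbers and the helper `fit` are assumptions for illustration, not estimates from the lecture):

```python
import numpy as np

# Simulated (hypothetical) data: x1 = square feet, x2 = bedrooms
rng = np.random.default_rng(1)
n = 300
sqrft = rng.uniform(800, 4000, n)
bdrms = rng.integers(1, 6, n).astype(float)
price = (20000 + 100 * sqrft + 5000 * bdrms + 4 * sqrft * bdrms
         + rng.normal(0, 30000, n))

def fit(x1, x2, y):
    """OLS of y on [1, x1, x2, x1*x2]; returns the four coefficients."""
    X = np.column_stack([np.ones(len(y)), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b = fit(sqrft, bdrms, price)                  # uncentered x1
c = fit(sqrft - sqrft.mean(), bdrms, price)   # x1 as deviation from its mean

# Effect of an extra bedroom for a house of average size
effect_at_mean = b[2] + b[3] * sqrft.mean()
print(effect_at_mean, c[2])   # the two numbers should agree
print(b[3], c[3])             # the interaction coefficient is unchanged
```

Centering changes only what the coefficient on x2 measures (the effect at the average house rather than at zero square feet); the fitted values are identical.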
More on Goodness-of-Fit:
Adjusted R-Squared
  Recall the R² will always increase (or at least stay
the same) as we add more variables to the model
  The “adjusted R²” takes into account the number
of variables in a model, and may decrease when
variables are added.
  The usual R² can be written:
R² ≡ 1 − [SSR/n] / [SST/n], where SSR/n is a biased estimate of σu²
and SST/n is a biased estimate of σy²
Adjusted R-Squared (cont)
  We can define the “population R-squared” as
ρ² = 1 − σu²/σy²
  We can use SSR/(n − k − 1) as an unbiased estimate of σu²
  Similarly, we can use SST/(n − 1) as an unbiased estimate of σy²
  Therefore, adjusted R², or “R-bar squared”, is:
R̄² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)]
   = 1 − σ̂² / [SST/(n − 1)]
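To make the two formulas concrete, the sketch below computes the usual and the adjusted R-squared for a regression with and without an irrelevant regressor. The data are simulated, and the function and variable names are assumptions for illustration.

```python
import numpy as np

def r2_and_adj_r2(X, y):
    """Return (R-squared, adjusted R-squared) for OLS of y on X plus an intercept."""
    n, k = X.shape                      # k = number of slope coefficients
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    ssr = ((y - Z @ beta) ** 2).sum()
    sst = ((y - y.mean()) ** 2).sum()
    r2 = 1 - ssr / sst
    adj = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))
    return r2, adj

# Simulated (hypothetical) data: y depends on x only
rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=n)
junk = rng.normal(size=n)               # regressor unrelated to y
y = 1 + 2 * x + rng.normal(size=n)

r2_a, adj_a = r2_and_adj_r2(x.reshape(-1, 1), y)
r2_b, adj_b = r2_and_adj_r2(np.column_stack([x, junk]), y)

print("without junk:", r2_a, adj_a)
print("with junk:   ", r2_b, adj_b)     # R-squared cannot fall; adjusted may
```

The usual R-squared can only stay the same or rise when `junk` is added, while the adjusted version pays a degrees-of-freedom penalty and may move in either direction.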
Adjusted R-Squared (cont)
  Notice that R̄² can go up or down when a variable
is added, unlike the regular R-squared which
never goes down
  R̄² is not necessarily “better”: the ratio of 2
unbiased estimators isn’t necessarily unbiased
  Better to treat it as an alternative way of
summarizing goodness of fit
    - If you add a variable to the RHS and the R̄² doesn’t rise,
      this is likely (though not surely) an indication it
      shouldn’t be included in the model
Comparing Nested Models
 Suppose you wanted to compare the
following two models:
1. y = β0 + β1x + u
2. y = β0 + β1x + β2x² + u
We say that (1) is nested in (2); alternatively, (1)
is a special case of (2). With a t-test on β2 we
can choose between these two models (if we reject the
null of β2 = 0, we pick model 2). For multiple
exclusion restrictions we can use an F-test.
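As a sketch of how the two tests relate, the code below fits models (1) and (2) on simulated data and computes the standard exclusion-restriction F statistic, F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n − k − 1)]. With a single restriction (q = 1) it equals the square of the t-stat on β2. The data and helper names are assumptions for illustration; p-values are omitted to keep the example dependency-free beyond NumPy.

```python
import numpy as np

def fit(Z, y):
    """OLS of y on Z (Z already contains the constant); returns (beta, SSR, se)."""
    n, p = Z.shape
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    ssr = resid @ resid
    se = np.sqrt(ssr / (n - p) * np.diag(np.linalg.inv(Z.T @ Z)))
    return beta, ssr, se

# Simulated (hypothetical) data with a mild quadratic effect
rng = np.random.default_rng(3)
n = 150
x = rng.uniform(0, 10, n)
y = 1 + 0.5 * x - 0.03 * x**2 + rng.normal(size=n)

ones = np.ones(n)
_, ssr_r, _ = fit(np.column_stack([ones, x]), y)                    # model (1), restricted
beta_ur, ssr_ur, se_ur = fit(np.column_stack([ones, x, x**2]), y)   # model (2), unrestricted

q = 1                     # number of exclusion restrictions (beta2 = 0)
df_ur = n - 3             # n - k - 1 with k = 2 slopes in model (2)
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
t = beta_ur[2] / se_ur[2]

print(F, t**2)            # the two values should coincide
```

With several excluded variables the same F formula applies with q set to the number of restrictions, and there is no single t-stat that replaces it.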
Comparing Non-Nested Models
  Suppose you wanted to compare the following two models:
1. y = β0 + β1log(x) + u
2. y = β0 + β1x + β2x² + v
  Neither is nested in the other, so a t-test or F-test cannot be
used to compare them.
  Here R-bar-squared can be useful. We can simply choose
the model with the higher R-bar-squared.
    - Note that a simple comparison of regular R-squared would tend to
      lead us to choose the model with more explanatory variables.
    - Note that if the LHS variable takes a different form between (1)
      and (2) we cannot compare using R-bar-squared (or R-squared).
iClickers
  Imagine you want to compare the following two models:
1. y = β0 + β1x + β2x² + β3x³ + u
2. y=β0+β1x+ v
  Suppose you want to test whether the model should be
linear or cubic (i.e., whether model 2 is more appropriate
than model 1). Question: What would be the most helpful
to look at, in order to make this judgment?
1)  2 t-tests (one on beta2=0 and one on beta3=0)
2)  An F-test (on beta2=0 AND beta3=0, jointly)
3)  The unadjusted R-squared for both models.
4)  The adjusted R-squared for both models.
Goodness of Fit
  Important not to fixate too much on adjusted R² and
lose sight of theory and common sense
  If economic theory clearly predicts a variable
belongs, generally leave it in
  Don’t want to exclude a variable if doing so prohibits a
sensible interpretation of the variable of interest
  Remember the ceteris paribus interpretation of
multiple regression
Residual Analysis
  Sometimes looking at the residuals (i.e. observed
y − predicted y) provides useful information
  Example: Regress price of cars on characteristics
    - Engine size, efficiency, luxury amenities, roominess,
      fuel efficiency, etc.
  Then the residual = actual price - predicted price
    - By picking the car with the lowest (most negative)
      residual, you would be choosing the most underpriced
      car (assuming you’re controlling for all relevant
      characteristics)
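A minimal sketch of the car example, using a tiny made-up data set. The car labels, characteristics and prices are all hypothetical and only two characteristics are included to keep it short. It regresses price on the characteristics and picks the car with the most negative residual.

```python
import numpy as np

# Hypothetical cars and characteristics (illustrative values only)
names = ["A", "B", "C", "D", "E", "F"]
engine = np.array([1.6, 2.0, 2.5, 3.0, 1.8, 2.2])     # engine size, litres
mpg = np.array([35.0, 30.0, 26.0, 22.0, 33.0, 28.0])  # fuel efficiency
price = np.array([21000.0, 26000.0, 30000.0, 38000.0, 20500.0, 27000.0])

# OLS of price on a constant and the characteristics
X = np.column_stack([np.ones(len(price)), engine, mpg])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# residual = actual price - predicted price
resid = price - X @ beta

# Most negative residual = priced furthest below what the model predicts
best = names[int(np.argmin(resid))]
for name, r in zip(names, resid):
    print(name, round(r, 1))
print("most underpriced (by this model):", best)
```

The ranking is only as good as the regression: any relevant characteristic left out of X ends up in the residual and can make a car look under- or overpriced.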