2014 Midterm - Harvard University

Department of Economics
Harvard University
Economics 1123
Fall 2014
Midterm Exam
11:40 a.m. – 1:00 p.m., Thursday October 16, 2014
PACKET 1
Instructions
1. You may read this entire packet (Packet 1) as soon as you get it. DO NOT READ
PACKET 2 UNTIL YOU ARE TOLD.
2. This exam ends promptly at 1:00 PM.
3. The exam has two parts for a total of 100 points. Please put each part in a separate blue
book. Put your name and Harvard ID number on the cover of each blue book.
4. Write your answers using a pen (not pencil).
5. You are permitted one double-sided 8½” x 11” sheet of notes, plus a calculator. No
computers, wireless, or other electronic devices without prior permission. You may not share
resources with anyone else.
6. Please return this exam (both packets) with your completed blue books.
Introduction
Do bans on smoking in bars reduce the number of smokers? You will examine this question
using state panel data for 50 U.S. states in 9 years (2001-2009), for a total of 50 × 9 = 450
observations. The data set includes the smoking rate (the fraction of the adult population that
currently smokes), binary variables indicating whether states have smoking bans in bars, or in
restaurants, or in workplaces, and other related variables including the drinking rate.
The question being studied is similar to that in Problem Set 4, but the data set is different. The
Problem Set 4 data were 253,916 observations on individuals. In contrast, here the unit of
observation is a state in a given year.
The variables are summarized in Table 1.
Figure 1 is a plot, by year, of the number of states with a bar smoking ban and the smoking rate
in the state.
Table 2 (in Packet 2) contains regression results. Regressions (1) and (2) use only data from
2009 (50 observations). Regressions (3)-(6) use the full panel data set (450 observations).
Table 1. Variable Definitions and Summary Statistics
Data source: Center for Disease Control
Unit of observation: a state in a year, 50 states, 9 years (2001-2009); n = 450
Variable       Mean   Std. Dev.  Definition
smokingrate    .242   .044       Fraction of state adult population that currently smokes
statebarban    .202   .402       =1 if state has a bar smoking ban in effect, = 0 otherwise
staterestban   .248   .422       =1 if state has a restaurant smoking ban in effect, = 0 otherwise
stateworkban   .182   .375       =1 if state has a workplace smoking ban in effect, = 0 otherwise
all3bans       .129   .335       =1 if smoking ban in bars, restaurants, and workplaces, = 0 otherwise
drinkingrate   .596   .098       Fraction of state adult population that drinks
somehs         .068   .028       Fraction of state adult population with less than a high school diploma
hsgrad         .269   .046       Fraction of state adult population with a high school diploma and no further education
somecollege    .287   .035       Fraction of state adult population with high school diploma and some college education, but no college degree
collegegrad    .376   .073       Fraction of state adult population with a college degree
white          .755   .143       Fraction of state adult population that is white
black          .098   .098       Fraction of state adult population that is black
Hispanic       .081   .092       Fraction of state adult population that is Hispanic
other          .066   .078       Fraction of state adult population that is neither white, black, nor Hispanic
Selected Tables from Stock and Watson, Introduction to Econometrics
PACKET 2
DO NOT TURN OVER UNTIL INSTRUCTED
Table 2. Smoking Rates and Public Smoking Bans: Regression Results
Dependent variable: smokingrate

Regressor                         (1)        (2)        (3)        (4)        (5)        (6)
statebarban                       -.0494**   -.0306**   -.0187**   -.0120**   -.0133**   -.0028
                                  (.0097)    (.0077)    (.0045)    (.0033)    (.0036)    (.0139)
statebarban × drinkingrate                                                               -.0147
                                                                                         (.0233)
staterestban                                            -.0003      .0034      .0040      .0039
                                                        (.0044)    (.0040)    (.0042)    (.0038)
stateworkban                                            -.0075*    -.0032     -.0041     -.0035
                                                        (.0029)    (.0030)    (.0039)    (.0030)
all3bans                                                                       .0018
                                                                              (.0038)
drinkingrate                                             .229**     .015       .014       .018
                                                        (.052)     (.036)     (.036)     (.038)
somehs                                       -.693**     .209       .256**     .256**     .256**
                                             (.236)     (.127)     (.092)     (.092)     (.092)
somecollege                                  -.926**     .005      -.046      -.046      -.047
                                             (.209)     (.119)     (.079)     (.079)     (.080)
collegegrad                                  -.642**    -.374**    -.204**    -.203**    -.204**
                                             (.111)     (.067)     (.049)     (.050)     (.050)
black                                                   -.027      -.029      -.028      -.028
                                                        (.045)     (.037)     (.037)     (.037)
Hispanic                                                -.193**    -.207**    -.208**    -.208**
                                                        (.044)     (.030)     (.030)     (.030)
other                                                    .272**     .169*      .169*      .166*
                                                        (.087)     (.070)     (.070)     (.071)
Years used for the regression     2009 only  2009 only  2001–2009  2001–2009  2001–2009  2001–2009
State fixed effects?              no         no         yes        yes        yes        yes
Year fixed effects?               no         no         no         yes        yes        yes
Standard errors                   HR         HR         cluster    cluster    cluster    cluster

F-statistics testing that the coefficients on the following variables are all zero
(p-values in parentheses):
statebarban, statebarban × drinkingrate
                                                                                         7.05
                                                                                         (.002)
somehs, somecollege, collegegrad             12.32      32.06      24.88      24.49      24.97
                                             (.000)     (.000)     (.000)     (.000)     (.000)
black, Hispanic, other                                  10.63      23.73      23.96      23.11
                                                        (.000)     (.000)     (.000)     (.000)
Number of observations            50         50         450        450        450        450

Notes: Regressions (1) and (2) use data from 2009 only; regressions (3)-(6) use panel data for all
9 years. Standard errors are given in parentheses under estimated coefficients, and p-values are
given in parentheses under F-statistics. All regressions include an intercept (not reported).
Standard errors and F-statistics are heteroskedasticity-robust (HR) for regressions (1) and (2) and
clustered for regressions (3)-(6). Coefficients are individually statistically significant at the
+10%, *5%, **1% significance level.
Part 1 (45 points) – USE BLUE BOOK #1
1) Consider regression (2) in Table 2:
a) (5 points) Interpret the coefficient on statebarban.
b) (5 points) Construct a 95% confidence interval for the true (population) coefficient on
statebarban.
2) (5 points) Suggest a reason why the coefficient on statebarban changes between regressions
(1) and (2). Your reason should explain the direction of the change in the coefficient from
regression (1) to (2).
3) (5 points) Suggest a reason why the coefficient on statebarban changes between regressions
(3) and (4). Your reason should explain the direction of the change in the coefficient from
regression (3) to (4).
4) Consider regression (4):
a) (5 points) Test the population hypothesis that the coefficients on the educational
achievement variables are all zero, against the alternative that at least one of the
coefficients is nonzero, at the 5% significance level.
b) (5 points) Are the estimated differences in smoking rates associated with different rates
of educational achievement (holding constant the other variables in the regression) large
or small in a real-world sense? Explain.
5) (5 points) Regression (5) includes the variable, all3bans, which equals one if all three
smoking bans (workplace, restaurant, and bars) are in place and equals zero otherwise. Note
that you can perfectly predict the value of all3bans if you know the values of statebarban,
staterestban, and stateworkban (stated mathematically, all3bans =
statebarban × staterestban × stateworkban). Does regression (5) suffer from perfect
multicollinearity? Why or why not?
6) Consider regression (6):
a) (5 points) Compute the predicted effect of a ban on smoking in bars for a state that has a
drinking rate of 0.70, holding constant the other variables in the regression.
b) (5 points) Explain how you would compute a 95% confidence interval for the predicted
effect in (a); be precise. (You do not need to compute this 95% confidence interval, just
explain how you would do so.)
Part 2 (35 points) – USE BLUE BOOK #2
7) In a separate study using data on individuals between ages 18 and 30 (not on states, as is used
in Table 2), a researcher estimates the probit regression of whether or not an individual
currently smokes using as regressors the variables statebarban, female (which is one if the
individual is female), and age (the individual’s age). The estimated probit coefficients on
these variables, and the intercept in the probit regression, are given in Table 3:
Table 3. Probit Regression Results: Individual Data
Dependent variable: current_smoker = 1 if the individual currently smokes, = 0 otherwise
Variable       Probit coefficient   Standard error
statebarban    -.12                 .0069
female         -.14                 .0055
age            -.0065               .0008
constant       -.379                .019
a) (5 points) Consider a 25-year-old man living in a state with no bar smoking ban. Compute
the predicted probability that this man smokes.
b) (5 points) Is the predicted effect of a bar smoking ban statistically significantly different
from zero in this regression, holding constant the age and sex of the individual?
c) (5 points) Suppose instead that the researcher had estimated an OLS regression with
current_smoker as the dependent variable and the same regressors. State one advantage
and one disadvantage of this OLS regression, relative to the probit regression in the table.
Now return to the regressions in Table 2:
8) Consider the panel data regression (4).
a) (5 points) In your judgment, is the error term u_it in the population version of the
regression serially correlated or serially uncorrelated? Explain using an example.
b) (5 points) Whatever your answer to part (a), suppose that u_it is serially correlated. What
are the implications of this serially correlated error for fixed effects regression with
heteroskedasticity-robust standard errors? Explain.
9) (10 points) Idaho has a restaurant smoking ban, but not a bar smoking ban and not a
workplace smoking ban. Suppose the Idaho Governor’s office wants your assessment of the
evidence concerning the effect on smoking of adopting a bar smoking ban. Based on the
results in Table 2, in your expert judgment does adopting a bar smoking ban reduce the
smoking rate? Explain, with reference to specific regressions in Table 2 and arguments for why
the results do or do not provide a credible basis for this policy advice.