Econ 1629: Applied Research Methods
Assignment 7
Part I: What are the effects of a military draft? (Lab)
Pierre Mouganie has a new paper:
“In 1997, the French government put into effect a law that permanently exempted
young French male citizens born after Jan 1, 1979 from mandatory military service
while still requiring those born before that cutoff date to serve. This paper uses a
regression discontinuity design to identify the effect of peacetime conscription on
education and labor market outcomes. Results indicate that conscription eligibility
induces a significant increase in years of education, which is consistent with
conscription avoidance behavior. However, this increased education does not result in
either an increase in graduation rates, or in employment and wages. Additional
evidence shows conscription has no direct effect on earnings, suggesting that the
returns to education induced by this policy was zero.”¹
Answer the following questions without reading the paper.
1. Write the type of population regression function we are interested in studying
(without regression discontinuity). What data could we use to estimate it? (The paper
studies several outcomes, but focus on wages.)
a. What factors are contained in the unexplained term?
b. Is the unexplained term likely to be uncorrelated with the explanatory
variable? Describe in language specific to this context.
2. Write the estimating regression equation that Mr. Mouganie is likely to have used.
a. What is the main parameter of interest? Interpret it in language a policymaker can
understand.
b. What assumptions are needed for this estimate to be valid?
c. Discuss the external validity of this approach: to what population should we
expect the results to generalize?
1 h/t Marginal Revolution. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2494651
Part II: Education policy and student performance²
In a paper published in the American Economic Review (September 2005), economists
Kenneth Chay, Patrick McEwan, and Miguel Urquiola examine the impact of Chile’s 900
Schools Program (P900) on student performance. This paper is also available on the course
website, but you do not have to read it. Beginning in 1990, the P900 program identified 900
schools whose average 1988 4th grade test scores fell below a given threshold. These schools
then received infrastructure improvements, instructional materials, training for teachers, and
tutoring for low-achieving students. In this part of the assignment we are going to use a
Regression Discontinuity (RD) design to measure the effect that P900 had on student
performance in 1992.
For this part of the assignment, you are going to use the dataset “p900.dta”, available on the
course website. The key variables are:
dx92     - 1988-1992 school-level change in average test scores
p90      - indicator equal to one if the school received the P900
           program resources, and equal to zero otherwise
x88      - average test score in the school in 1988
eligible - indicator equal to one if the school had an average
           1988 score (x88) that made it eligible to receive the
           P900 program, and equal to zero otherwise
x88_rel  - 1988 average score minus the P900 eligibility cutoff in
           the region in which the school is located
ses90    - average “socioeconomic status” of students in the
           school in 1990 (higher number implies higher SES)
1. The relationship between P900 program eligibility and the probability of receiving
P900 program resources can be seen below.
P900 status above is equal to one if the school received the P900 program resources,
and equal to zero otherwise.
What does the above plot tell us about program compliance? Does P900 program
eligibility perfectly predict receiving program resources?
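One way to put numbers on the compliance question is to compare the share of schools with p90 equal to one among eligible and ineligible schools. The sketch below is a minimal illustration in Python, not the course's own code: the records are made-up toy values (not taken from p900.dta), and the helper name take_up_rate is hypothetical.

```python
# Toy (eligible, p90) pairs for illustration only; NOT values from p900.dta.
schools = [
    (1, 1), (1, 1), (1, 1), (1, 0),   # eligible schools
    (0, 0), (0, 0), (0, 0), (0, 1),   # ineligible schools
]


def take_up_rate(records, eligible_value):
    """Share of schools with p90 == 1 among those with the given eligibility."""
    group = [p90 for eligible, p90 in records if eligible == eligible_value]
    return sum(group) / len(group)


rate_eligible = take_up_rate(schools, 1)
rate_ineligible = take_up_rate(schools, 0)

# Perfect compliance would mean take-up of exactly 1.0 among eligible
# schools and exactly 0.0 among ineligible schools.
perfect_compliance = (rate_eligible == 1.0 and rate_ineligible == 0.0)

print(rate_eligible, rate_ineligible, perfect_compliance)
```

If either rate differs from its perfect-compliance value, eligibility does not fully determine who actually received program resources.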
2 Thanks to Sam Brown.
2. The relationship between the 1988-1992 school-level change in average test scores
and program eligibility can be seen below.
a. What happens to average test scores near the cutoff threshold? What does
this suggest about the effect that the P900 program had on student
performance between 1988 and 1992?
b. The relationship between the 1988-1992 school-level change in average
test scores and program eligibility appears to be downward sloping. Why
might this be the case?
3. Let’s try to measure the effect that the P900 program had on student performance
between 1988 and 1992. Our treatment variable is eligibility for the P900 program
(eligible), and our outcome of interest is the change in school test scores from
1988-1992 (dx92).
a. Write down the associated regression equation.
b. Estimate this regression equation. Interpret the coefficient on eligible.
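As a reminder of the mechanics, when the only regressor is a 0/1 indicator, the OLS slope equals the difference in mean outcomes between the two groups. The following is a minimal sketch in Python with made-up numbers (not values from p900.dta); the helper ols_simple is a hypothetical name introduced here for illustration.

```python
def ols_simple(x, y):
    """Return (intercept, slope) from an OLS regression of y on one regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up illustration data; NOT values from p900.dta.
eligible = [1, 1, 1, 0, 0, 0]
dx92 = [6.0, 8.0, 7.0, 3.0, 5.0, 4.0]

intercept, slope = ols_simple(eligible, dx92)

# Group means computed directly, for comparison with the regression output.
mean_eligible = sum(y for e, y in zip(eligible, dx92) if e == 1) / eligible.count(1)
mean_ineligible = sum(y for e, y in zip(eligible, dx92) if e == 0) / eligible.count(0)

print(intercept, slope, mean_eligible - mean_ineligible)
```

In this toy example the intercept is the mean score change among ineligible schools and the slope is the gap between the two group means, which is how the coefficient on eligible can be read.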
4. In Question 1 we examined program compliance. How might those results affect the
interpretation of our estimates in Question 3?
5. In Question 2 we saw that there is a negative relationship between the 1988-1992
school-level change in average test scores and program eligibility. Intuitively, this
biases our estimates in Question 3 because RD attributes some of this downward trend
to not being in the P900 program.
Since this downward trend is smooth, we can control for it in our RD specification
and thereby measure only the discontinuous change in test scores across the program
eligibility threshold. This yields a better estimate of the true program effect.
a. Include x88 as an additional control in your regression specification from
Question 3(a), and estimate this equation.
b. How does including this additional control affect your estimate of the
coefficient on eligible? Why?
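The logic of controlling for a smooth trend can be illustrated on synthetic data where the true jump at the cutoff is known. The sketch below is an assumption-laden illustration in Python, not the course's method: the cutoff of 50, the trend slope of -0.5, the jump of 2, and the rule that schools strictly below the cutoff are eligible are all invented for the example, and the data contain no noise. The coefficient on eligible with x88 as a control is obtained by partialling x88 out of both variables (the Frisch-Waugh-Lovell approach), using only simple one-regressor OLS.

```python
def ols_simple(x, y):
    """Return (intercept, slope) from an OLS regression of y on one regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Residuals from an OLS regression of y on x (with an intercept)."""
    a, b = ols_simple(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Synthetic, noise-free data; every number here is an assumption.
cutoff = 50.0      # hypothetical single eligibility cutoff
true_jump = 2.0    # hypothetical true effect of eligibility
x88 = [float(s) for s in range(40, 61)]               # scores 40..60
eligible = [1 if s < cutoff else 0 for s in x88]      # assumed rule: below cutoff
dx92 = [30.0 - 0.5 * s + true_jump * e for s, e in zip(x88, eligible)]

# Regression without the control: picks up the downward trend as well.
_, naive = ols_simple(eligible, dx92)

# Regression with x88 as a control, via partialling out x88 from both sides.
_, controlled = ols_simple(residuals(x88, eligible), residuals(x88, dx92))

print(naive, controlled)
```

Here the uncontrolled estimate overstates the jump because eligible schools sit on the high end of the downward-sloping trend, while the controlled estimate recovers the jump built into the synthetic data. Real data are noisy and the trend need not be linear, so the actual estimates will not separate this cleanly.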
6. The Regression Discontinuity (RD) design is based on the conjecture that
observations “close” to the cutoff threshold are similar. In other words, their
unobservable characteristics do not differ systematically across the threshold.
If this is the case, then we can control for confounding factors by limiting our sample
to observations that are within a certain distance from the threshold.
a. Create a variable named within3 that is equal to one if the distance from the
1988 average test score to the P900 eligibility cutoff (the absolute value of
x88_rel) is less than or equal to 3, and equal to zero otherwise.
b. Re-estimate the regression equation from Question 3(a), limiting the
sample to observations for which within3 is equal to one.
c. How does your estimate of the coefficient on eligible in this question
compare to your estimates in Question 3(a) and in Question 5(a)? Why do
you think that this may be the case? Which estimates do you think are more
reliable? Why?
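The sample restriction in this question can be sketched as follows. This is a minimal illustration in Python with made-up rows (not values from p900.dta); the helper mean_gap is a hypothetical name, and it relies on the fact that a regression on a single 0/1 indicator returns the difference in group means.

```python
# Made-up rows of (x88_rel, eligible, dx92); NOT values from p900.dta.
rows = [
    (-8.0, 1, 9.0), (-2.5, 1, 6.0), (-1.0, 1, 5.0), (-3.0, 1, 7.0),
    (0.5, 0, 3.0), (3.0, 0, 2.0), (2.0, 0, 4.0), (9.0, 0, -1.0),
]

# within3 equals one when the absolute distance to the cutoff is at most 3.
within3 = [1 if abs(x_rel) <= 3 else 0 for x_rel, _, _ in rows]


def mean_gap(rows, keep):
    """Mean dx92 of eligible minus ineligible schools among rows with keep == 1."""
    treated = [y for (_, e, y), k in zip(rows, keep) if k == 1 and e == 1]
    control = [y for (_, e, y), k in zip(rows, keep) if k == 1 and e == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


gap_full = mean_gap(rows, [1] * len(rows))   # all schools
gap_within3 = mean_gap(rows, within3)        # only schools near the cutoff

print(within3, gap_full, gap_within3)
```

Narrowing the window trades sample size for comparability: fewer schools remain, but those that do are closer to one another in their 1988 scores.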
7. [BONUS] The relationship between average socioeconomic status (SES) within a
school and program eligibility can be seen below.
What happens to average SES near the cutoff threshold? Why might this be the case?
How might this affect our estimates?