Difference-in-Differences and Randomized Experiments

M2R "Development Economics"
Empirical Methods in Development Economics
Université Paris 1 Panthéon Sorbonne

Rémi Bazillier
[email protected]
Semester 1, Academic year 2016-2017
Introduction

- We've already reviewed together a number of popular econometric methods to isolate causal/treatment effects:
  - Propensity Score Matching;
  - Instrumental Variables;
  - Heckman procedure;
  - Regression Discontinuity Design.
Introduction

- The purpose of today's class is to present two other key impact evaluation methods: Difference-in-Differences and Randomized Experiments.
- Outline:
  1. Back to the selection bias
  2. Solving the selection bias with a Difference-in-Differences approach
  3. Solving the selection bias with a Randomized Experiment
1. Back to the selection bias

- Suppose we want to measure the impact of textbooks on learning.
- We denote:
  - Y_i^T: the average test score of children in a given school i if the school has textbooks (i.e. if treated);
  - Y_i^C: the average test score of children in a given school i if the school has no textbooks (i.e. if untreated).
1. Back to the selection bias

- The objective of policy makers is to estimate the "treatment effect", defined as follows: E(Y_i^T − Y_i^C | T).
- This quantity compares how schools which had textbooks actually fared with how they would have fared in the absence of textbooks.
- Of course, we cannot compute this quantity by observing a school i both with and without books at the same time: while every school has two potential outcomes, only one is observed for each school.
1. Back to the selection bias

- Imagine you get access to data on a large number of schools in one developing country.
- Some schools have textbooks and others do not.
- Would you manage to measure the treatment effect E(Y_i^T − Y_i^C | T) by computing the difference between the average test scores in schools with textbooks and in schools without textbooks?
1. Back to the selection bias

- Computing the difference between the average test scores in schools with textbooks and in schools without textbooks boils down to computing the following quantity:

  D = E(Y_i^T | T) − E(Y_i^C | C).

- Is D equal to E(Y_i^T − Y_i^C | T)?
1. Back to the selection bias

- We have:

  D = E(Y_i^T | T) − E(Y_i^C | C)
    = E(Y_i^T | T) − E(Y_i^C | C) + E(Y_i^C | T) − E(Y_i^C | T)
    = E(Y_i^T | T) − E(Y_i^C | T) + E(Y_i^C | T) − E(Y_i^C | C)
    = E(Y_i^T − Y_i^C | T) + E(Y_i^C | T) − E(Y_i^C | C).

- The first term is clearly the "treatment effect" that we are trying to isolate: E(Y_i^T − Y_i^C | T).
- However, there is a second term given by: E(Y_i^C | T) − E(Y_i^C | C).
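As a sanity check, the decomposition above can be reproduced numerically. The sketch below (all numbers invented) simulates schools where a latent "parental motivation" variable raises both untreated scores and the chance of receiving textbooks, so the naive comparison D equals the treatment effect plus a selection bias:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # schools; large, so sample means sit close to expectations

# Parental motivation raises untreated scores AND the chance of getting textbooks
motivation = rng.normal(0, 1, n)
treated = (motivation + rng.normal(0, 1, n)) > 0   # non-random assignment
y_c = 50 + 5 * motivation + rng.normal(0, 1, n)    # score without textbooks
y_t = y_c + 3                                      # true treatment effect = 3

d = y_t[treated].mean() - y_c[~treated].mean()     # naive comparison D
bias = y_c[treated].mean() - y_c[~treated].mean()  # E(Y^C|T) - E(Y^C|C)

print(round(d, 2), round(3 + bias, 2))             # D = TE + selection bias
```

Here the bias is upward (motivated parents select into treatment), so D substantially overstates the true effect of 3.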
1. Back to the selection bias

- This second term, E(Y_i^C | T) − E(Y_i^C | C), is the selection bias.
- It captures the fact that treatment schools may have had different test scores on average even if they had not been treated.
- Put differently, the selection bias exists (i.e. it differs from 0) as soon as schools in the treatment group and schools in the control group initially differ with respect to observed and unobserved characteristics that influence students' grades ("initially" meaning that these characteristics differ prior to any treatment).
1. Back to the selection bias

- For instance:
  - If schools that received textbooks were schools where parents consider education a priority, then the selection bias will be upward (i.e. E(Y_i^C | T) > E(Y_i^C | C)): one will conclude that the impact of textbooks on students' scores is more positive than it actually is.
  - If schools that received textbooks were targeted because they were located in particularly disadvantaged communities, then the selection bias will be downward (i.e. E(Y_i^C | T) < E(Y_i^C | C)): one will conclude that the impact of textbooks on students' scores is more negative than it actually is.
1. Back to the selection bias

- How can one eliminate the selection bias?
2. Solving the selection bias with a Diff-in-Diff approach

- To implement a Diff-in-Diff, one needs information on pre-period differences in outcomes between the treatment and control groups, so as to control for pre-existing differences between the groups (and thereby neutralize the selection bias).
2. Solving the selection bias with a Diff-in-Diff approach

- We suppose 2 periods of time:
  - At date t = 0, a baseline (or pre-treatment) survey is conducted in a set of different schools. Just after this baseline survey, some of the schools are selected by an NGO (in a non-random manner) to be treated (i.e. to get access to textbooks);
  - At date t = 1, a post-treatment survey is conducted in both the treated and the untreated schools.
2. Solving the selection bias with a Diff-in-Diff approach

- More specifically:
  - The baseline survey provides information on E(Y_{i0}^C | T) and on E(Y_{i0}^C | C), the average test scores in period 0 (when none of the schools is treated) in schools that are destined to be treated and in schools that will remain untreated, respectively;
  - The post-treatment survey provides information on E(Y_{i1}^T | T) and E(Y_{i1}^C | C), the average test scores in period 1 in schools that have been treated right after period t = 0 and in schools that belong to the control group, respectively.
2. Solving the selection bias with a Diff-in-Diff approach

- Illustration in case the most disadvantaged schools are selected by the NGO to be treated:

[Figure: evolution of Y in the treatment group (T0 at date t=0, T1 at date t=1) and in the control group (C0 at date t=0, C1 at date t=1); each value reflects students' characteristics, school characteristics and changes in time-varying characteristics, plus — for T1 only — the treatment effect.]
2. Solving the selection bias with a Diff-in-Diff approach

- The "difference-in-differences" estimator, denoted DD, is given by:

  DD = (T1 − C1) − (T0 − C0)
     = [E(Y_{i1}^T | T) − E(Y_{i1}^C | C)] − [E(Y_{i0}^C | T) − E(Y_{i0}^C | C)]
     = [E(Y_{i1}^T | T) − E(Y_{i0}^C | T)] − [E(Y_{i1}^C | C) − E(Y_{i0}^C | C)]
     = (T1 − T0) − (C1 − C0)
     = (CTVC_T + TE) − CTVC_C,

  where CTVC_T and CTVC_C stand for the "changes in time-varying characteristics" in the treatment and in the control group respectively, and TE stands for the "treatment effect".
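The double differencing above can be illustrated with a minimal simulation (all numbers invented): treated schools start lower, both groups share a common trend, and DD recovers the true effect while a simple post-period comparison does not:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000  # schools

# Treated schools start 10 points lower (downward selection bias)...
treat = rng.random(n) < 0.5
base = 60 - 10 * treat + rng.normal(0, 2, n)  # period-0 score
trend, te = 2.0, 3.0                          # common trend, true treatment effect

t0, c0 = base[treat].mean(), base[~treat].mean()
t1 = (base[treat] + trend + te).mean()        # treated: trend + treatment effect
c1 = (base[~treat] + trend).mean()            # control: trend only

# ...yet differencing twice removes both the initial gap and the common trend
dd = (t1 - c1) - (t0 - c0)
naive = t1 - c1                               # post-period comparison alone
print(round(dd, 2), round(naive, 2))          # dd recovers te; naive is biased
```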
2. Solving the selection bias with a Diff-in-Diff approach

- In other words, DD coincides with the treatment effect TE under the critical assumption that, in the absence of any treatment, the treatment and the control groups of schools would have followed parallel trends over time (no mean-reversion effect, no anticipation of the treatment, etc.).
- This means that, when many years of data are available, you should plot the series of average outcomes for the treatment and control groups and see whether the trends are parallel and whether there is a sudden change just after the reform for the treatment group.
2. Solving the selection bias with a Diff-in-Diff approach

- The regression counterpart to obtain DD is given by

  Y = α + β·Treat + γ·Time + δ·(Treat × Time) + u,   (1)

  where:
  - Treat is equal to 1 if the school belongs to the treatment group (0 if it belongs to the control group);
  - Time is equal to 1 if the period is post-treatment (0 if the period is pre-treatment).
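A regression of this form can be sketched with plain numpy least squares on simulated school-by-period data (all numbers invented); the coefficient on the interaction term is the DD estimate:

```python
import numpy as np

rng = np.random.default_rng(2)
n_schools = 5_000
treat = (np.arange(n_schools) % 2 == 0).astype(float)  # treatment dummy
school_fx = rng.normal(0, 2, n_schools) - 5 * treat    # treated start lower

# Stack the two periods: Time = 0 (pre) and Time = 1 (post)
Treat = np.tile(treat, 2)
Time = np.repeat([0.0, 1.0], n_schools)
y = (55 + np.tile(school_fx, 2) + 2 * Time             # common trend of +2
     + 3.0 * Treat * Time                              # true effect delta = 3
     + rng.normal(0, 1, 2 * n_schools))

# Y = alpha + beta*Treat + gamma*Time + delta*(Treat*Time) + u
X = np.column_stack([np.ones(2 * n_schools), Treat, Time, Treat * Time])
alpha, beta, gamma, delta = np.linalg.lstsq(X, y, rcond=None)[0]
print(delta)  # should be close to the true effect of 3
```

Note that beta picks up the pre-existing gap between the groups and gamma the common trend, so neither contaminates delta.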
2. Solving the selection bias with a Diff-in-Diff approach

- δ captures DD (which coincides with the treatment effect as soon as one can reasonably assume that, in the absence of any treatment, the treatment and the control groups of schools would have followed parallel trends over time):

  δ = (T1 − T0) − (C1 − C0).
2. Solving the selection bias with a Diff-in-Diff approach

- Relying on a regression analysis has a big advantage: it allows one to partly relax the "parallel trends" assumption.
- Indeed, it allows one to control for the effect of school characteristics that are likely not to evolve over time in a parallel way in the treatment and in the control group.
- Moreover, as soon as there is more than one period in the pre-treatment phase and more than one period in the post-treatment phase, one can control for group-specific time trends (they capture the linear evolution over time of the outcome Y_i in each group).
- To do so, one must introduce the following controls in Equation (1): Trend and (Treat × Trend), where Trend captures the value of the period.
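A minimal sketch of this extension (all numbers invented): the treated group follows a deliberately steeper linear trend, so plain DD would be biased, but adding Trend and Treat*Trend as controls still recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(3)
n_units, periods = 2_000, 6
Treat = np.repeat((np.arange(n_units) % 2).astype(float), periods)
Trend = np.tile(np.arange(periods, dtype=float), n_units)  # period value
Post = (Trend >= 3).astype(float)                          # treatment starts at t=3

# Treated group follows a STEEPER linear trend: parallel trends fails here
y = (50 + 2 * Treat + (1 + 0.5 * Treat) * Trend
     + 3.0 * Treat * Post + rng.normal(0, 1, n_units * periods))

# Adding Trend and Treat*Trend absorbs the group-specific linear trends
X = np.column_stack([np.ones_like(y), Treat, Trend, Treat * Trend,
                     Post, Treat * Post])
delta = np.linalg.lstsq(X, y, rcond=None)[0][5]
print(delta)  # close to the true effect of 3 despite non-parallel trends
```

Identification rests on having several periods, so that the step-shaped Post dummy is not collinear with the linear Trend.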
2. Solving the selection bias with a Diff-in-Diff approach

- Keep in mind that the treatment should not be concomitant with policies which might differentially impact the treatment and the control group.
- For instance, the introduction of textbooks by the NGO should not be accompanied by other NGO interventions aiming at improving test scores in the treatment group (e.g. the organisation of remedial classes).
- Otherwise, it is impossible to attribute the treatment effect to the introduction of textbooks only.
2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- We can try to find a "natural experiment" that allows us to identify the impact of a policy:
  - e.g. an unexpected change in policy;
  - e.g. a policy that only affects 16-year-olds but not 15-year-olds.
- In general, exploit variation of policies in time and space.
- The quality of the comparison group determines the quality of the evaluation.
2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- When there are more than two periods, one can use a regression with individual and time fixed effects:
  - Individual fixed effects control for time-invariant individual characteristics;
  - Time fixed effects capture effects that are common to all groups at a particular point in time, i.e. the common trend.
- This is valid only when the policy change has an immediate impact on the outcome variable:
  - If there is a delay in the impact of the policy change, we need to use lagged treatment variables.
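The two-way fixed-effects regression can be sketched on a simulated balanced panel (all numbers invented). For a balanced panel, demeaning by unit and by period is equivalent to including the full sets of dummies:

```python
import numpy as np

rng = np.random.default_rng(4)
n_units, periods = 1_000, 5
unit = np.repeat(np.arange(n_units), periods)
t = np.tile(np.arange(periods), n_units)
D = ((np.arange(n_units) % 2 == 0)[unit] & (t >= 2)).astype(float)  # policy on at t=2

alpha = rng.normal(0, 3, n_units)            # unit fixed effects
gamma = np.array([0.0, 1.0, 1.5, 2.5, 3.0])  # time fixed effects (common trend)
y = alpha[unit] + gamma[t] + 2.0 * D + rng.normal(0, 1, len(D))

def within(v):
    """Demean by unit and by period (exact for a balanced panel)."""
    um = np.bincount(unit, weights=v) / periods
    tm = np.bincount(t, weights=v) / n_units
    return v - um[unit] - tm[t] + v.mean()

beta = (within(D) @ within(y)) / (within(D) @ within(D))
print(beta)  # close to the true policy effect of 2
```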
2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Diff-in-Diff attributes any difference in trends between the treatment and control groups that occurs at the same time as the intervention to that intervention.
- If there are other factors that affect the difference in trends between the two groups, then the estimation will be biased!
  → The common/parallel trends assumption:
  - It cannot be tested directly, but one can show graphical evidence.
[Figure — Source: World Bank (2011) "Impact Evaluation in Practice"]
2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Sensitivity analysis for Diff-in-Diff: one needs to convince the reader that the effect is not driven by other factors.
- Placebo tests:
  - Use a "fake" treatment group — for instance, treat previous years as the treatment period, or use as "treated" a population that was NOT affected;
  - If this placebo DD estimate is different from 0, trends are not parallel, and our original DD is likely to be biased.
- Different comparison group:
  - You should obtain the same estimates.
- Different outcome variable:
  - Use an outcome variable which is NOT affected by the intervention, keeping the same comparison group and treatment year;
  - If this DD estimate is different from zero, we have a problem.
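A placebo test of the "fake treatment period" kind can be sketched as follows (all numbers invented): using two pre-intervention years only, the DD regression should find nothing:

```python
import numpy as np

rng = np.random.default_rng(5)
n_schools = 5_000
treat = (np.arange(n_schools) % 2 == 0).astype(float)

# Two PRE-intervention years only: any "effect" found here is spurious
Treat = np.tile(treat, 2)
FakeTime = np.repeat([0.0, 1.0], n_schools)      # fake post-treatment dummy
y = (40 - 3 * Treat + 1.5 * FakeTime             # parallel trends hold here
     + rng.normal(0, 1, 2 * n_schools))

X = np.column_stack([np.ones(2 * n_schools), Treat, FakeTime, Treat * FakeTime])
placebo_dd = np.linalg.lstsq(X, y, rcond=None)[0][3]
print(placebo_dd)  # should be close to 0: no differential pre-trend detected
```

If this coefficient came out clearly different from zero, it would signal a differential pre-trend and cast doubt on the original DD estimate.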
2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Other issues:
  - Bertrand et al. (2004): when outcomes within a time/group unit are correlated, OLS standard errors understate the standard deviation of the DD estimator.
    - Solution 1: cluster the standard errors at the group level i;
    - Solution 2: collapsing the data into pre- and post-periods produces consistent standard errors.
  - Autor (2003): add specific linear (or quadratic) individual time trends, to check that the results are not driven by a specific trend.
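Bertrand et al.'s "collapse" fix can be sketched as follows (all numbers invented): average each unit's outcome into one pre- and one post-period observation, then run DD on the collapsed data, so the serial correlation within units no longer inflates the effective sample size:

```python
import numpy as np

rng = np.random.default_rng(6)
n_units, periods = 200, 10
unit = np.repeat(np.arange(n_units), periods)
post = np.tile((np.arange(periods) >= 5).astype(float), n_units)
treat_u = (np.arange(n_units) % 2 == 0).astype(float)

# Serially correlated errors within each unit: the Bertrand et al. (2004) concern
err = rng.normal(0, 1, n_units)[unit] + rng.normal(0, 1, n_units * periods)
y = 10 + 2 * treat_u[unit] + post + 1.0 * treat_u[unit] * post + err

# Collapse to one pre- and one post-period mean per unit before running DD
key = unit * 2 + post.astype(int)
y_bar = np.bincount(key, weights=y) / np.bincount(key)
Treat = np.repeat(treat_u, 2)
Post = np.tile([0.0, 1.0], n_units)

X = np.column_stack([np.ones(2 * n_units), Treat, Post, Treat * Post])
dd = np.linalg.lstsq(X, y_bar, rcond=None)[0][3]
print(dd)  # close to the true effect of 1, with honest two-period variation
```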
2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment", American Economic Review

- Research questions:
  - What is the effect of school infrastructure on educational achievement?
  - What is the effect of educational achievement on salary level?
- Program description:
  - 1973-1978: the Indonesian Government built 61,000 schools (one school per 500 children aged 5 to 14);
  - The enrollment rate increased from 69% to 85% between 1973 and 1978;
  - The number of schools built in each region depended on the number of children out of school in that region in 1972, before the start of the program.
2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment", American Economic Review

- Identification: two sources of variation in the intensity of the program for a given individual:
  - By region: there is variation in the number of schools received in each region;
  - By age: (1) children who were older than 12 years in 1972 did not benefit from the program; (2) the younger a child was in 1972, the more she benefited from the program, because she spent more time in the new schools.
2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment", American Economic Review

- Data:
  - The 1995 population census, with individual data on birth date, 1995 salary level and 1995 level of education;
  - The intensity of the building program in the birth region of each individual;
  - Focus on men born between 1950 and 1972.
- First step:
  - We simplify the intensity of the program (high vs. low) and the groups of children (the young, who benefited, and the older, who did not).
[Figures — Source: World Bank (2011) "Impact Evaluation in Practice"]
3. Solving the selection bias with a Randomized Exp.

- Randomized experiments are far from being a new estimation method.
- They were institutionalized by researchers in psychology and medicine at the end of the 19th century.
- However, this estimation method was extended to research in Development Economics only recently, notably by Esther Duflo, who has been a Professor at MIT (Massachusetts Institute of Technology) since 1999 and co-founded (with Abhijit Banerjee and Sendhil Mullainathan) the Jameel Poverty Action Lab (J-PAL) in 2003.
- J-PAL was established to estimate the impact of a wide range of development policies, all over the world.
3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- A randomized experiment aiming at measuring the impact of textbooks on students' test scores would consist in:
  - first, selecting a sample of N schools;
  - second, randomly selecting half of them to be assigned to the treatment group.
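The random assignment step, and the balance it buys, can be sketched as follows (the "wealth" covariate and all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000                                    # schools in the sample
wealth = rng.normal(0, 1, n)                  # a pre-treatment characteristic

treated = np.zeros(n, dtype=bool)
treated[rng.permutation(n)[: n // 2]] = True  # random half to the treatment group

# Random assignment balances pre-treatment characteristics across groups
gap = wealth[treated].mean() - wealth[~treated].mean()
print(treated.sum(), round(gap, 3))           # exactly n//2 treated; gap near 0
```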
3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- In the context of a randomized experiment, will an approach consisting in computing the quantity D = E(Y_i^T | T) − E(Y_i^C | C) based on a post-treatment survey allow us to measure the treatment effect?
- Remember that D can be rewritten as follows:

  D = E(Y_i^T − Y_i^C | T) + (E(Y_i^C | T) − E(Y_i^C | C)).

- Answering the question therefore amounts to determining whether the selection bias, measured by E(Y_i^C | T) − E(Y_i^C | C), is equal to 0.
3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- The fact that the treatment has been randomly assigned ensures that, on average, students in schools belonging to the treatment group are identical to students in schools belonging to the control group prior to the treatment.
- We therefore have E(Y_i^C | T) − E(Y_i^C | C) = 0.
3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- In other words, differences in the outcome Y between the treatment and the control group after the random assignment of the treatment are only attributable to their differences in exposure to the treatment.
- Therefore:

  D = E(Y_i^T − Y_i^C | T) = E(Y_i^T − Y_i^C) = ATE,

  where ATE is the Average Treatment Effect (the effect of being treated in the population when individuals are randomly assigned to treatment).
3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- The regression counterpart to obtain the ATE is given by

  Y_i = α + β·T + u_i,   (2)

  where T is a dummy for assignment to the treatment group.
- Indeed, β = E(Y_i^T − Y_i^C).
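Equation (2) can be sketched on simulated experimental data (all numbers invented); with a single treatment dummy, the OLS slope is exactly the difference in group means:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 50_000
T = np.zeros(n)
T[rng.permutation(n)[: n // 2]] = 1.0        # random assignment to treatment
y = 50 + 4.0 * T + rng.normal(0, 5, n)       # true ATE = 4

# Y_i = alpha + beta*T + u_i
X = np.column_stack([np.ones(n), T])
alpha, beta = np.linalg.lstsq(X, y, rcond=None)[0]

# With one dummy regressor, beta is exactly the treated-minus-control mean gap
print(beta, y[T == 1].mean() - y[T == 0].mean())
```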
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- Although randomized experiments allow one to get rid of the selection bias, other sources of bias can arise:
  1. Externalities: untreated individuals are affected by the treatment.
  2. Attrition: some treated individuals leave the original sample.
  3. Hawthorne and John Henry effects: the evaluation itself may cause the treatment or comparison group to change its behavior.
- Biases related to 1 and 2 are also called "partial compliance" biases ("perfect compliance" meaning that the treatment did reach ALL the individuals in the treatment group, and ONLY them).
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- Externalities lead to an underestimation of the treatment effect if they are positive, and to an overestimation if they are negative.
- Some techniques can be implemented to estimate the magnitude of such externalities and thereby neutralize the bias they induce.
- For instance, when evaluating the impact of using fertilizers on crop yields, one may worry about information externalities: individuals in the treatment group (those who receive an incentive to use fertilizers) may talk to individuals in the control group about the benefits/drawbacks of using fertilizers.
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- One way to solve the problem is to ask farmers in both the treatment and the control group for the names of the 3 farmers they discuss agriculture with most often (we refer to them as "friends" in what follows).
- To get an idea of the magnitude of the information spillovers between the treatment and the control group, one can compare:
  - the use of fertilizers by the control-group friends of farmers in the treatment group, with
  - the use of fertilizers by the control-group friends of farmers in the control group.
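The comparison above boils down to a difference in means between the two groups of control-group friends. A minimal sketch on hypothetical survey data (the adoption rates and sample sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical survey: fertilizer adoption (0/1) among CONTROL-group friends
# named by treated farmers vs. control-group friends named by control farmers
friends_of_treated = rng.binomial(1, 0.35, 600)  # spillovers raise adoption here
friends_of_control = rng.binomial(1, 0.20, 600)

spillover = friends_of_treated.mean() - friends_of_control.mean()
print(round(spillover, 3))  # a clearly positive gap signals information spillovers
```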
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- Attrition won't induce a bias if it is random; however, it will lead to a bias as soon as it is correlated with the impact the treatment has on each individual.
- For instance, a bias will arise if those who benefit the least from a program tend to drop out of the sample.
- Managing attrition during the data collection process is therefore essential.
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- More precisely, this requires collecting good information in the baseline questionnaire on how to find each individual again, should he decide to leave the group after the treatment (by asking, for instance, for the names of relatives who can be interviewed if the respondent cannot be found during the post-treatment survey).
- Of course, following up with ALL attritors is too expensive, but following up with a random sample of the attritors is a good alternative.
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- What are the Hawthorne and John Henry effects exactly?
  - Hawthorne effect: individuals in the treatment group, because they are conscious of being observed, may alter their behavior during the experiment (compared to what it usually is) to please the experimenter (for instance, teachers in schools which received textbooks may also decide to work harder).
  - John Henry effect: individuals in the control group, if they are aware of being a control group, may feel offended by this experimental status and react by also altering their behavior (for instance, teachers in schools which received no textbooks may decide either to work harder or to slack off).
3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and a short overview of potential solutions

- One way to get rid of the Hawthorne and John Henry effects is to continue to monitor the impact of a development program after the official experiment is over.
- If the measured impact is similar when the program is officially evaluated and when it is no longer being evaluated, then the impact is not due to these effects. If the two measures are not similar, the estimation of the treatment effect should rely on the "post-post-treatment" survey.
3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- So far, we have mainly focused on issues of internal validity, i.e. whether an OLS estimate of β in Equation (2) captures the treatment effect without bias.
- External validity is about whether the treatment effect we measure would carry over to other samples or populations.
- Ensuring external validity is considered the greatest challenge faced by randomized experiments.
3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- Discussing the external validity of a randomized experiment requires one to:
  - first, discuss whether the results obtained among members of community X at date t would hold if the randomized experiment were conducted among members of another community at another point in time;
  - second, discuss whether the results obtained among a sub-sample of individuals in a given country would hold if the randomized experiment were conducted among the entire population of this country.
3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- To address the first requirement:
  - One surely has to replicate the randomized experiment in different communities and at different points in time (these replications constitute an important activity at J-PAL);
  - One can also identify the observed individual characteristics that magnify or mitigate the treatment effect. For instance, if one finds that a development program works for poor women in country i but that the magnitude of its positive effect decreases with these women's income, then this suggests that the program wouldn't be as successful among a set of middle-income women in country i.
3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- To address the second requirement, one must think about the possible effects of scaling up the program.
- Scaling up the program can indeed trigger general equilibrium effects that are non-existent when one runs a randomized experiment in a small area.
3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- Indeed, the fact that the randomized experiment is conducted on a small scale means that one observes a partial equilibrium effect: one looks at the impact of the treatment with everything else held constant.
- Put differently, the treatment affects a portion of a country's population that is too small to have an impact on macro-economic variables (like wages or prices) which could have a feedback effect on the outcome of interest.
- If the randomized experiment is scaled up, however, these feedback effects become possible. They can considerably challenge the results obtained on a small scale.