FINDING TRUE PROGRAM IMPACTS THROUGH RANDOMIZATION
LAURA RALSTON, ECONOMIST, CCSD
SESSION OVERVIEW
1. Background
2. What is a randomized experiment?
3. Why randomize?
4. Key Takeaways
Materials used from:
• MIT OpenCourseWare (http://ocw.mit.edu)
• J-PAL Executive Training: Evaluating Social Programs 2011
• Chris Blattman with Jeannie Annan, “Can swords be turned into ploughshares? Experimental effects of an agricultural program on employment, lawlessness, and armed recruitment”
IMPACT: WHAT IS IT?
[Chart: primary outcome over time (months before and after the intervention), showing the intervention and counterfactual trajectories; the impact is the vertical gap between them.]
IMPACT: WHAT IS IT?
[Chart: a second scenario of the primary outcome over time, with a different counterfactual trajectory; the impact is again the gap between the intervention and counterfactual lines, not the full before-after change.]
IMPACT: WHAT IS IT?
[Chart: a third scenario of the primary outcome over time, illustrating that the measured impact depends entirely on the counterfactual trajectory.]
HOW TO MEASURE IMPACT?
Impact is defined as the comparison between:
1. The outcome some time after the program has been introduced
2. The outcome at that same point in time had the program not been introduced (the counterfactual)
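As a purely illustrative sketch (the numbers below are hypothetical, not taken from these slides), the impact is simply the difference between the observed outcome and the counterfactual outcome at the same point in time:

```python
# Minimal sketch with hypothetical numbers: the impact is the gap between the
# observed post-program outcome and the counterfactual at the same point in time.
observed_outcome = 10.0        # e.g., primary outcome 6 months after the program
counterfactual_outcome = 6.0   # what the outcome would have been without the program (never observed directly)

impact = observed_outcome - counterfactual_outcome
print(f"Estimated impact: {impact}")  # 4.0
```

The practical difficulty, as the next slide notes, is that the counterfactual term can never be observed for the people who actually received the program.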
COUNTERFACTUAL
The counterfactual represents the state of the world
that program participants would have experienced
in the absence of the program (i.e., had they not
participated in the program)
• Problem: counterfactual cannot be observed
• Solution: We need to “mimic” or construct the
counterfactual
IMPACT EVALUATION METHODS
Randomized Experiments
• Also known as:
  • Random Assignment Studies
  • Randomized Field Trials
  • Social Experiments
  • Randomized Controlled Trials (RCTs)
  • Randomized Controlled Experiments
IMPACT EVALUATION METHODS
Non- or Quasi-Experimental Methods
• Includes:
  • Pre-Post
  • Simple Difference
  • Difference-in-Difference
  • Multivariate Regression
  • Statistical Matching
  • Instrumental Variables
  • Regression Discontinuity
• More on these tomorrow
SESSION OVERVIEW
2. What is a randomized experiment?
THE BASICS
Start with a simple case:
• Take a sample of program applicants
• Randomly assign them to either (a minimal assignment sketch follows this list):
  • Treatment Group – is offered the treatment
  • Control Group – not allowed to receive the treatment (during the evaluation period)
• Note:
  • Randomization does not mean denying people the benefits of the program
  • Usually, existing constraints in project roll-out allow randomization
  • Randomization is often the fairest way to allocate treatment
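A minimal sketch of how such an assignment might be implemented (applicant IDs and sample size are hypothetical, not taken from any study cited here):

```python
import random

# Hypothetical list of program applicants.
applicants = [f"applicant_{i}" for i in range(100)]

random.seed(42)            # fix the seed so the assignment is reproducible and auditable
random.shuffle(applicants)

half = len(applicants) // 2
treatment_group = applicants[:half]   # offered the treatment
control_group = applicants[half:]     # not offered treatment during the evaluation period

print(len(treatment_group), len(control_group))  # 50 50
```

Fixing the random seed is a design choice that makes the assignment reproducible, which helps later when verifying that the assignment was in fact random.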
KEY ADVANTAGE OF EXPERIMENTS
• Because members of the groups (treatment and control) are randomly selected, they do not systematically differ at the start of the experiment.
• Any difference that subsequently arises between them can be attributed to the program rather than to other factors.
EXAMPLE: “WOMEN AS POLICYMAKERS”
TREATMENT VS. CONTROL VILLAGES AT BASELINE
Standard Errors in parentheses. Statistics for West Bengal, India.
Source: Chattopadhyay and Duflo (2004)
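A minimal sketch of the kind of baseline balance check behind a table like this, using hypothetical village-level data (the variable, sample sizes, and values are illustrative): compare treatment and control means at baseline and test whether any difference could plausibly be due to chance.

```python
import numpy as np
from scipy import stats

# Hypothetical baseline data: one covariate measured in treatment and control villages.
rng = np.random.default_rng(0)
treatment_baseline = rng.normal(loc=5.0, scale=2.0, size=60)
control_baseline = rng.normal(loc=5.0, scale=2.0, size=60)

# Two-sample t-test: under successful randomization we expect no systematic
# baseline difference, so large, significant gaps would be a warning sign.
diff = treatment_baseline.mean() - control_baseline.mean()
t_stat, p_value = stats.ttest_ind(treatment_baseline, control_baseline)
print(f"Baseline difference: {diff:.2f} (p = {p_value:.2f})")
```

In practice this check is repeated for each baseline characteristic, which is what a balance table like the one above summarizes.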
RANDOMIZATION EXAMPLE
CORE INTERVENTION: CASH PLUS SKILLS TRAINING
IMPLEMENTED BY AVSI UGANDA 2009-11
Target: the 15 poorest, most marginalized rural women in each village of 80-300 households; nominated by the community and screened by the NGO
Typical participant: age 27, works 15 hours/week, earns less than $10/month in cash
Research questions: What limits the growth of self-employment and
income among the poorest and marginalized? Does more work and
income “empower” them?
SAMPLE: 120 VILLAGES IN 6 SUBCOUNTIES, SELECTED BASED ON BEING UNDERSERVED
VILLAGES REPRESENT 25% OF SUBCOUNTY POPULATION
BUCKET RANDOMIZATION BY VILLAGE TO TREATMENT OR WAITLIST:
THE FIRST 60 VILLAGES RECEIVE IMMEDIATE TREATMENT (PHASE 1);
THE OTHER 60 RECEIVE DELAYED TREATMENT 18 MONTHS LATER (PHASE 2)
SOME VARIATIONS ON THE BASICS
• Assigning to multiple treatment groups
• Assigning units other than individuals or households:
  • Health Centers
  • Schools
  • Local Governments
  • Villages
KEY STEPS IN CONDUCTING AN EXPERIMENT
1. Design the study carefully: what is the objective of your impact evaluation? What do you most want to test?
2. Randomly assign people to treatment or control
3. Collect baseline data
4. Verify that assignment looks random
5. Monitor the process so that the integrity of the experiment is not compromised
KEY STEPS IN CONDUCTING AN EXPERIMENT (CONT.)
6. Collect follow-up data for both the treatment and control groups
7. Estimate program impacts by comparing mean outcomes of the treatment group vs. mean outcomes of the control group
8. Assess whether program impacts are statistically significant and practically significant (effect size) – a minimal sketch of steps 7 and 8 follows
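A minimal sketch of steps 7 and 8, using hypothetical follow-up data (the outcome, sample sizes, and values are illustrative, not results from any program):

```python
import numpy as np

# Hypothetical follow-up outcomes (e.g., hours worked per week) for each group.
rng = np.random.default_rng(1)
treatment_outcomes = rng.normal(loc=37.0, scale=10.0, size=500)
control_outcomes = rng.normal(loc=31.0, scale=10.0, size=500)

# Step 7: estimated impact = difference in mean outcomes.
impact = treatment_outcomes.mean() - control_outcomes.mean()

# Step 8: standard error of the difference in means and a rough z-test.
se = np.sqrt(treatment_outcomes.var(ddof=1) / len(treatment_outcomes)
             + control_outcomes.var(ddof=1) / len(control_outcomes))
z = impact / se
print(f"Estimated impact: {impact:.1f} hours (|z| = {abs(z):.1f}; > 1.96 suggests significance at the 5% level)")
```

Statistical significance answers whether the difference is distinguishable from chance; practical significance is a separate judgment about whether an effect of that size matters for policy.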
SESSION OVERVIEW
3. Why randomize?
WHY RANDOMIZE? – CONCEPTUAL ARGUMENT
If properly designed and conducted, randomized experiments provide the most credible method to estimate the impact of a program.
WHY “MOST CREDIBLE”?
Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the program rather than to other factors.
EXAMPLE – CAN EMPLOYMENT PROGRAMS REDUCE LAWLESSNESS AND REBELLION? A FIELD EXPERIMENT WITH HIGH-RISK YOUTH IN LIBERIA (BLATTMAN 2014)
Knowledge Gaps:
1. Little experimental evidence on the effects of employment or income on crime or violence (Freeman 1999, Blattman and Miguel 2010)
   • Exceptions are with low-risk populations (Blattman et al 2013, 2014)
   • US experiments test adolescent schooling, neighborhoods
2. Few experimental job programs generate jobs
   • Demobilization and reintegration (Kingma and Muggah 2009)
   • Vocational and business training (Card et al 2010, Attanasio et al 2011, McKenzie & Woodruff 2012)
   • Cash to microenterprises targets the already employed (de Mel et al 2008, Fafchamps et al 2012)
3. Where there is evidence, the theoretical mechanism is unclear
   • Adult education: opportunity cost, socialization, or peer effects?
   • Income-conflict correlation: opportunity cost or grievance?
INTERVENTION
Offer high-risk young people in hotspots:
1. A 4-month residential training program
   • Highly practice-based agricultural skills
2. “Life skills” and counseling
   • Handling conflict, dealing with trauma and PTSD, career counseling
   • Mentoring by ex-combatants
3. Assistance returning to a community
   • Leader permission, land access, transport
4. A package of agricultural inputs
   • $125 in tools and materials, delivered in two stages
   • Choice between vegetable farming and animals
   • $50 cash (Sinoe site only)
AIMS
1. Increase farm incomes and activity
2. Shift occupational incentives away from illicit resource extraction
3. Socialize into peacetime, nonviolent life
4. Reduce risk of mercenary recruitment
RANDOMIZED EXPERIMENT
Suppose we evaluated this program using a randomized experiment.
Question 1: What would this entail? How would we do it?
Question 2: What would be the advantage of using this method to evaluate the impact of the program?
METHODS TO ESTIMATE IMPACTS
Let’s look at different ways of estimating the impacts using the data from young people who were enrolled in this program:
1. Pre-Post (Before vs. After)
2. Simple Difference
3. Difference-in-Difference
4. Other non-experimental methods
5. Randomized Experiment
PRE-POST (BEFORE VS. AFTER)
• Look at the average change in:
  • Involvement in crime (drug selling, illicit extraction, stealing)
  • Hours per week worked in legal activities (raising animals, farming)

                         Crime Rate   Hours worked
Average Pre-Program         47%            31
Average Post-Program        41%            37
Difference                  -6%            +6

• Question: under what conditions can these differences be interpreted as an impact of the program?
WHAT WOULD HAVE HAPPENED IN ABSENCE OF THE PROGRAM?
[Chart: hours worked per week, pre vs. post program, with the 6-hour change labeled “Impact = 6 hrs?”]
SIMPLE DIFFERENCE
Compare crime rates of young men who got the program with those who didn’t
SIMPLE DIFFERENCE
• Look at the average difference in:
  • Involvement in crime (drug selling, illicit extraction, stealing)
  • Hours per week worked in legal activities (raising animals, farming)

                              Crime Rate   Hours worked
Average Outside of Program       47%            28
Average Inside of Program        41%            37
Difference                       -6%            +9

• Question: under what conditions can these differences be interpreted as an impact of the program?
WHAT WOULD HAVE HAPPENED IN ABSENCE OF THE PROGRAM?
[Chart: hours worked for young men inside vs. outside the program, with the 9-hour gap labeled “Impact = 9 hrs?”]
DIFFERENCE-IN-DIFFERENCES (MORE ON THIS TOMORROW)
Compare the decrease in crime rates of young men who got the program with the decrease among those who didn’t
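A minimal sketch of the difference-in-differences calculation: the program group’s pre/post crime rates echo the pre-post table above, while the comparison group’s figures are purely hypothetical.

```python
# Pre/post crime rates (share involved in crime) for each group.
treated_pre, treated_post = 0.47, 0.41        # young men who got the program (from the pre-post table)
comparison_pre, comparison_post = 0.45, 0.43  # young men who did not (hypothetical)

# Change within each group, then the difference between those changes.
change_treated = treated_post - treated_pre            # -0.06
change_comparison = comparison_post - comparison_pre   # -0.02

did_estimate = change_treated - change_comparison      # -0.04
print(f"Difference-in-differences estimate: {did_estimate:+.2f}")
```

The comparison group’s change is used to strip out whatever would have happened anyway, leaving the remaining change as the estimated program effect.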
OTHER METHODS
There are more sophisticated non-experimental methods to estimate program impacts:
• Regression
• Matching
• Instrumental Variables
• Regression Discontinuity
But all these methods rely on being able to mimic the counterfactual under certain assumptions.
Problem: the assumptions are not testable.
IMPACT OF PROGRAM - SUMMARY

Method                         Impact on Crime Rates   Impact on Hrs worked
1. Pre-post                           -6%                      +6*
2. Simple Difference                 -10%*                     +9
3. Difference-in-Difference           -8%                     +11*
4. Regression                         -4%                      +5
5. Randomized Experiment              -5%*                     +5.5*

*: Statistically significant at the 5% level
Bottom Line: which method we use matters!
SESSION OVERVIEW
4. Key Takeaways
KEY TAKEAWAY #1
The single best way to evaluate the true average
impact of a program is by randomizing treatment
KEY TAKEAWAY #2
Randomization is more flexible than you think:
• It does not require withholding of benefits
• It can take advantage of necessary staggered rollout
• It can test different reforms or packages across
groups at the same time
EXAMPLE OF ROLL-OUT RANDOMIZATION
[Diagram: 1800 clients in 120 villages, randomized by village into two phases, with cross-cutting variations]
• “Phase 1” – 60 villages: training, grant, and follow-up
• “Phase 2” – 60 villages: training and grant 18 months later
• Village-level variation: 30 villages with intensified group formation and cooperation vs. 30 villages with no added services
• Client-level variation in follow-up: 300 clients with no follow-up visits; 300 clients with “accountability” (1-2 follow-up visits); 300 clients with “accountability & advice” (3-5 follow-up visits)
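A minimal sketch of how a phased, village-level (cluster) randomization with a cross-cutting client-level variation might be implemented; the counts mirror the example above, but the village IDs, client IDs, and code are illustrative, not the study’s actual assignment procedure.

```python
import random

random.seed(7)

# Hypothetical village IDs; the example above has 120 villages.
villages = [f"village_{i}" for i in range(120)]
random.shuffle(villages)

# Village-level (cluster) randomization into a phased roll-out.
phase1 = villages[:60]   # immediate treatment
phase2 = villages[60:]   # delayed treatment 18 months later (waitlist)

# Cross-cutting client-level variation in follow-up intensity
# (illustrative: 900 clients split evenly across three follow-up arms).
clients = [f"client_{i}" for i in range(900)]
random.shuffle(clients)
no_followup = clients[:300]
accountability = clients[300:600]          # 1-2 follow-up visits
accountability_advice = clients[600:900]   # 3-5 follow-up visits

print(len(phase1), len(phase2), len(no_followup))  # 60 60 300
```

Randomizing at the village level avoids spillovers within villages, while the waitlist design means every village eventually receives the program.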
KEY TAKEAWAY #3
It is more ethical to test programs rigorously before
universally implementing them than it is to use scarce
public resources to implement a universal program
with uncertain benefits.
Thank you!