Why Use Randomized Evaluation?

Isabel Beltran, World Bank
Fundamental Question

What is the effect of a program or intervention?
• How can vulnerable groups partake in the state and peace-building process?
• What political and social accountability mechanisms are most effective in a fragile state?
• What measures secure stability and reduce ethnic conflict at the local level?
Objective

• To identify the causal effect of an intervention
  – Distinguish the impact of the program from other factors

• Need to find out what would have happened without the program
  – Cannot observe the same person with and without the program at the same point in time
  – Create a valid counterfactual
Correlation is not causation
Question: Does providing credit increase firm profits?
Suppose we observe that firms with more credit also earn higher profits. Two explanations are possible:
 1) Credit use causes higher profits, OR
 2) An underlying factor such as business skills drives both credit use and higher profits.
Illustration: Credit Program (Before-After)

A credit program was offered in 2008.

[Chart: treatment group's gross operating margin in 2007 vs. 2009, showing a (+6) increase]

Why did operating margin increase?
Motivation

• Hard to distinguish causation from correlation by analyzing existing (retrospective) data
  – However complex, statistics can only see that X moves with Y
  – Hard to correct for unobserved characteristics, like motivation/ability
  – These may be very important and also affect the outcomes of interest

• Selection bias is a major issue for impact evaluation
  – Projects are started at specific times and places for particular reasons
  – Participants may be selected, or may self-select, into programs
  – People who have access to credit are likely to be very different from the average entrepreneur, so looking at their profits will give a misleading impression of the benefits of credit
Illustration: Credit Program (Valid Counterfactual)

[Chart: before/after outcomes for treatment and control groups. The control group rises by (+2), the impact of other (external) factors; the treatment group rises by a further (+4), the impact of the program.]

* Macroeconomic environment affects the control group
* Program impact is easily identified
Experimental Design

• All those in the study have the same chance of being in the treatment or comparison group
• By design, treatment and comparison groups have the same characteristics (observed and unobserved), on average
  – Only difference is treatment
  – Large sample → all characteristics average out
• Result: unbiased impact estimates
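As a rough illustration of how a large sample makes characteristics average out, the following simulation (hypothetical data, not from the slides) randomly assigns units carrying an unobserved "ability" trait and compares the group means:

```python
import random

random.seed(0)

# Hypothetical population: each person has an unobserved "ability" score.
population = [{"id": i, "ability": random.gauss(0, 1)} for i in range(10_000)]

# Random assignment: everyone has the same chance of being treated.
random.shuffle(population)
treatment = population[:5_000]
comparison = population[5_000:]

mean = lambda group: sum(p["ability"] for p in group) / len(group)

# With a large sample, the unobserved characteristic averages out, so the
# two groups are comparable before any treatment is applied.
print(round(mean(treatment), 3), round(mean(comparison), 3))
```

Because assignment is random, any remaining gap between the two means is chance, and it shrinks as the sample grows.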
Options for Randomization

• Lottery (only some receive)
  – Lottery to receive new loans, credit for a community

• Random phase-in (everyone gets it eventually)
  – Some groups or individuals get credit each year

• Variation in treatment
  – Some get a matching grant, others get credit, others get business development services, etc.

• Encouragement design
  – Some farmers get a home visit to explain the loan product, others do not
Lottery among the qualified

[Diagram: candidates are sorted into three tiers: those who must receive the program; those among whom we randomize who gets the program; and those not suitable for the program.]
Opportunities for Randomization

• Budget constraint prevents full coverage
  – Random assignment (lottery) is fair and transparent

• Limited implementation capacity
  – Phase-in gives all the same chance to go first

• No evidence on which alternative is best
  – Random assignment to alternatives with equal ex ante chance of success
Opportunities for Randomization

• Take-up of an existing program is not complete
  – Provide information or an incentive for some to sign up: randomize the encouragement

• Pilot a new program
  – Good opportunity to test design before scaling up

• Operational changes to ongoing programs
  – Good opportunity to test changes before scaling them up
Different levels you can randomize at

• Individual/owner/firm
• Business association
• Village level
• School level
• Women’s association
• Youth groups
• Regulatory jurisdiction/administrative district
Group or individual randomization?

• If a program impacts a whole group, usually randomize the whole community to treatment or comparison
• Easier to get a big enough sample if you randomize individuals

[Diagram: individual randomization vs. group randomization]
Unit of Randomization

• Randomizing at a higher level is sometimes necessary:
  – Political constraints on differential treatment within a community
  – Practical constraints: confusing to implement different versions
  – Spillover effects may require higher-level randomization

• Randomizing at the group level requires many groups because of within-community correlation
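The within-community correlation point can be quantified with the standard design-effect formula, DEFF = 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intra-cluster correlation. The numbers below are illustrative, not from the slides:

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from randomizing clusters instead of individuals."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_total: int, cluster_size: int, icc: float) -> float:
    """Number of independent observations a clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)

# Illustrative: 50 communities of 40 people each, with modest
# within-community correlation (ICC = 0.05).
deff = design_effect(40, 0.05)                 # 1 + 39 * 0.05 = 2.95
ess = effective_sample_size(2000, 40, 0.05)    # about 678 independent observations
print(round(deff, 2), round(ess))
```

Even a modest ICC makes 2,000 clustered interviews behave like far fewer independent observations, which is why group-level randomization needs many groups.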
Elements of an experimental design

[Diagram: the target population (SMEs) contains potential participants (tailors, furniture manufacturers); from these, an evaluation sample is drawn. Random assignment splits the sample into a treatment group (participants) and a control group (non-participants).]
External and Internal Validity (1)

• External validity
  – The evaluation sample is representative of the total population
  – The results in the sample represent the results in the population → we can apply the lessons to the whole population

• Internal validity
  – The intervention and comparison groups are truly comparable
  – Hence the estimated effect of the intervention/program on the evaluated population reflects the real impact on that population
External and Internal Validity (2)

• An evaluation can have internal validity without external validity
  – Example: a randomized evaluation of encouraging informal firms to register in urban areas may not tell us much about the impact of a similar program in rural areas

• An evaluation without internal validity can’t have external validity
  – If you don’t know whether a program works in one place, then you have learnt nothing about whether it works elsewhere.
Internal & external validity

[Diagram: a random sample of the national population yields a representative sample; randomization within that sample then assigns treatment, giving both internal and external validity.]
Internal validity

Example: evaluating a program that targets women

[Diagram: population → stratification → population stratum → samples of the population stratum → randomization]
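Stratified randomization can be sketched as follows; the data and stratum labels are hypothetical, chosen to match the women-targeting example:

```python
import random

random.seed(1)

def stratified_assign(units, stratum_key):
    """Randomize treatment separately within each stratum, so the
    treatment/control split is balanced on the stratifying variable."""
    strata = {}
    for u in units:
        strata.setdefault(u[stratum_key], []).append(u)
    assignment = {}
    for members in strata.values():
        random.shuffle(members)
        half = len(members) // 2
        for u in members[:half]:
            assignment[u["id"]] = "treatment"
        for u in members[half:]:
            assignment[u["id"]] = "control"
    return assignment

# Illustrative sample of 100 people (60 women, 40 men), stratified by gender.
units = [{"id": i, "gender": "woman" if i < 60 else "man"} for i in range(100)]
assignment = stratified_assign(units, "gender")
n_treated_women = sum(1 for u in units
                      if u["gender"] == "woman" and assignment[u["id"]] == "treatment")
print(n_treated_women)  # exactly half of each stratum is treated
```

Randomizing within each stratum guarantees the target group is split evenly between treatment and control, rather than leaving the split to chance.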
Representative but biased: useless

[Diagram: a random sample of the national population, followed by NON-random assignment to treatment: USELESS!]
Efficacy & Effectiveness

• Efficacy
  – Proof of concept
  – Smaller scale
  – Pilot in ideal conditions

• Effectiveness
  – At scale
  – Prevailing implementation arrangements: “real life”

• Higher or lower impact?
• Higher or lower costs?
Advantages of “experiments”

• Clear and precise causal impact
• Relative to other methods:
  – Provide correct estimates
  – Much easier to analyze: difference in averages
  – Easier to explain
  – More convincing to policymakers
  – Methodologically uncontroversial
Machines do NOT
  – Raise ethical or practical concerns about randomization
  – Fail to comply with treatment
  – Find a better treatment
  – Move away, so lost to measurement
  – Refuse to answer questionnaires

Human beings can be a little more challenging!
What if there are constraints on randomization?

• Some interventions can’t be assigned randomly
• Partial take-up or demand-driven interventions: randomly promote the program to some
  – Participants make their own choices about adoption
• Perhaps there is contamination, for instance if some in the control group take up treatment
• Those who receive the marketing treatment are more likely to enroll
• But who got the marketing was determined randomly, so it is not correlated with other observables/non-observables
  – Compare average outcomes of the two groups: promoted / not promoted
  – Effect of offering the encouragement (Intent-To-Treat)
  – Effect of the intervention on the complier population (Local Average Treatment Effect)
    ▪ LATE = ITT / difference in take-up between the encouraged and the not encouraged
Average treatment effect on the treated

                     Assigned to     Assigned to       Difference     Impact
                     treatment       control
Proportion treated   100%            0%                100%           100%
Mean outcome         103 (treated)   80 (non-treated)  23             23 / 100% = 23

The difference in mean outcomes (23) is the impact of assignment, the intent-to-treat
estimate; dividing by the proportion treated (100%) gives the average treatment
effect on the treated.
                     Randomly        Not               Difference     Impact
                     encouraged      encouraged
Proportion treated   70%             30%               40%            100%
Outcome              100             92                8              8 / 40% = 20

Each group mixes the treated (did take up the program) and the non-treated (did not
take up the program). The difference in mean outcomes (8) is the impact of the
encouragement, the intent-to-treat estimate; dividing by the difference in take-up
(40%) gives the local average treatment effect for compliers.
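The arithmetic in the table above can be reproduced directly; a minimal sketch (the function names are my own):

```python
def itt(mean_encouraged: float, mean_not_encouraged: float) -> float:
    """Intent-to-treat: difference in mean outcomes by random assignment."""
    return mean_encouraged - mean_not_encouraged

def late(itt_estimate: float, takeup_encouraged: float,
         takeup_not_encouraged: float) -> float:
    """Local average treatment effect: ITT scaled by the difference in take-up."""
    return itt_estimate / (takeup_encouraged - takeup_not_encouraged)

# Numbers from the slide: outcomes 100 vs. 92, take-up 70% vs. 30%.
itt_est = itt(100, 92)                  # 8
late_est = late(itt_est, 0.70, 0.30)    # 8 / 0.40, approximately 20
print(itt_est, late_est)
```

Note that when everyone assigned to treatment takes it up (as in the first table), the take-up difference is 100% and LATE collapses to the ITT.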
Common pitfalls to avoid

• Calculating sample size incorrectly
  – e.g., randomizing one district to treatment and one district to control, then calculating sample size on the number of people you interview

• Collecting data in treatment and control differently

• Counting those assigned to treatment who do not take up the program as control: don’t undo your randomization!
When is it really not possible?

• The treatment is already assigned and announced, and there is no possibility of expanding treatment
• The program is over (retrospective)
• Take-up is already universal
• The program is national and non-excludable
  – Freedom of the press, exchange rate policy (sometimes some components can be randomized)
• Sample size is too small to make it worth it
Thank You