Addressing the Evaluation Gap - Evaluation for African Development

Addressing the Evaluation Gap

Responding to the paper by William D. Savedoff and Ruth Levine: “When Will We Ever Learn? Closing the Evaluation Gap”, Center for Global Development, www.cgdev.org
There have been, and continue to be, multiple discussions concerning the evaluation of international development. They include some commonly agreed frames of reference (as we hope to discover here in Sussex). But they also include forces pulling in many divergent directions, or at least different interpretations of what form of “impact evaluation” is called for.
Some attempt to address the complexities of increasingly integrated, multi-intervention, multi-donor national development assistance, including those promoting human rights and advocating for policy change.
Others call for a form of impact evaluation that focuses on the need to conduct rigorous research on more specific cause-effect relationships. The findings of such evaluations can then be used to inform subsequent project design.
There are those who propose to use randomized ‘scientific’ experimental research designs to evaluate ‘the real impact’ of development projects. Among such proponents is the MIT Poverty Action Lab (http://www.povertyactionlab.com/).
Another is the Center for Global Development's "Evaluation Gap" Working Group. Their recently released report (http://www.cgdev.org/section/initiatives/_active/evalgap) is receiving high-profile attention, not only in the US but also in Europe, including at a multi-national, multi-agency conference held in June at the Rockefeller Foundation center in Bellagio, Italy.
There are many aspects of the CGD’s initiative that I believe we should applaud and support. These include (among others):
– Pointing out that “An evaluation gap exists because there are too few incentives to conduct good impact evaluations and too many obstacles.”
– Calling for more financial and technical support for more rigorous evaluation
– Advocating that there be more collaborative evaluations
The CGD’s two main suggested solutions are:
– The formation of an International Council to Catalyze Independent Impact Evaluations of Social Sector Interventions
– The conducting of more rigorous impact evaluations (implying randomized experimental trials)1
1 In fairness, their proposals are more comprehensive than what I am highlighting here. But this points to an important methodological challenge.
I suggest that those of us gathered here in Sussex consider responses to both of these:
– Do we agree that there is a need for the proposed CGD-organized International Council?
  • If so, in what ways are we and the institutions we represent willing to collaborate with it?
  • Or are its proposed purposes (see next slide) already being adequately met by existing institutions or networks?
– What is the role of randomized experimental trials among other evaluation designs?
The International Council: proposed purposes
• Establish quality standards for rigorous evaluations
• Organize and disseminate information
• Identify priority topics
• Review proposals rapidly
• Build capacity to produce, interpret and use knowledge
• Create a directory of researchers
• Provide grants for impact evaluation design
• Create and administer a pooled impact evaluation fund
• Signal quality with a “Seal of Approval”
• Communicate with policymakers
Evaluation Designs

Though I humbly acknowledge that this is a room full of experts, permit me to share with you the introduction to evaluation design that participants in my training workshops have found helpful.1 This could help clarify the role of more rigorous evaluations (even randomized trials): when they are needed, and when they may be inappropriate or not feasible.

1 These are included in the book RealWorld Evaluation by Bamberger, Rugh and Mabry, published by Sage, February 2006.
Design #1: Post-test only of project participants
   Project participants:   X   P
   Timing: end-of-project evaluation only (X = the intervention; P = observation of project participants)
Design #2: Pre+post of project; no comparison
   Project participants:   P1   X   P2
   Timing: baseline (P1), end-of-project evaluation (P2)
Design #3: Pre+post of project; post-only comparison
   Project participants:   P1   X   P2
   Comparison group:                 C
   Timing: baseline (P1), end-of-project evaluation (P2, C)
Design #4: Quasi-experimental (pre+post, with ‘comparison’)
   Project participants:   P1   X   P2
   Comparison group:       C1        C2
   Timing: baseline (P1, C1), end-of-project evaluation (P2, C2)
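To make the contrast between these designs concrete, here is a minimal sketch (not drawn from the CGD paper or from any actual evaluation; the group means are invented numbers) of how Design #2 attributes the whole pre-post change to the project, whereas Design #4 nets out the comparison group's change as a simple difference-in-differences estimate:

```python
# Illustrative only: the outcome means below are invented, not data from any
# real evaluation. Outcome: e.g. % of children classified as well nourished.
p1, p2 = 42.0, 58.0   # project participants: baseline (P1) and end of project (P2)
c1, c2 = 41.0, 49.0   # comparison group:     baseline (C1) and end of project (C2)

# Design #2 (pre+post, no comparison): the change in the project group alone,
# which implicitly attributes all of that change to the project.
design2_change = p2 - p1

# Design #4 (quasi-experimental): difference-in-differences, where the
# comparison group's change stands in for what would have happened anyway.
design4_estimate = (p2 - p1) - (c2 - c1)

print(f"Design #2 pre-post change:           {design2_change:.1f} percentage points")
print(f"Design #4 difference-in-differences: {design4_estimate:.1f} percentage points")
```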
Design #5: Randomized experimental (pre+post, with ‘control’)
   Project participants:   P1   X   P2
   Control group:          C1        C2
   Research subjects are randomly assigned either to the project group or to the control group.
   Timing: baseline (P1, C1), end-of-project evaluation (P2, C2)
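As a further illustration, the sketch below (using a purely hypothetical list of communities) shows the one step that distinguishes Design #5 from Design #4: units are randomly assigned to the project or control group before the baseline, so that on average the two groups differ only in their exposure to the intervention:

```python
import random

# Hypothetical sampling frame: the community names are placeholders, not real sites.
communities = [f"community_{i:02d}" for i in range(1, 21)]

random.seed(42)             # fixed seed so the assignment can be reproduced
random.shuffle(communities)

half = len(communities) // 2
project_group = sorted(communities[:half])   # will receive the intervention (X)
control_group = sorted(communities[half:])   # observed at the same points, no intervention

print("Project group:", project_group)
print("Control group:", control_group)
```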
Design #6: Longitudinal quasi-experimental
   Project participants:   P1   X   P2   X   P3   P4
   Comparison group:       C1        C2        C3   C4
   Timing: baseline (P1, C1), midterm (P2, C2), end-of-project evaluation (P3, C3), post-project evaluation (P4, C4)
Design #7: Randomized longitudinal experimental
   Project participants:   P1   X   P2   X   P3   P4
   Control group:          C1        C2        C3   C4
   Research subjects are randomly assigned either to the project group or to the control group.
   Timing: baseline (P1, C1), midterm (P2, C2), end-of-project evaluation (P3, C3), post-project evaluation (P4, C4)
How often are ‘more rigorous’ evaluation designs actually used?
• Of the 67 projects included in the last bi-annual meta-evaluation conducted by CARE International, 50 (75%) used a posttest-only design without a comparison group (Design #1); 12% used a pre+posttest of the project group only (Design #2). We suspect these figures are fairly typical of the evaluation designs actually used by INGOs and other development agencies.
• Baseline studies had actually been conducted for 19 of the projects where posttest-only evaluations were carried out. Reasons the baselines were not used included: the baseline data not being accessible to the evaluators; lack of comparability (in terms of indicators and methodologies); questions regarding the quality of the baseline studies; and/or oversight by those conducting the evaluations.
We need to be clear on what we are defining as ‘impact’ and on the contributing causes that combine to achieve that ‘impact’.
• We do need proven hypotheses about which interventions and outputs have been shown to lead to which outcomes.
• But such research needs to be clear about the relevant conditions and about what other contributing factors were involved.
[Problem tree diagram] Top-level problem: high infant mortality rate; children are malnourished. Contributing causes shown: insufficient food; diarrheal disease; flies and rodents; poor quality of food; unsanitary practices; facilities not used correctly; people do not wash hands before eating. Also shown: the need for strengthened capacity of health institutions and the need for improved health policies.
[Corresponding objectives tree diagram] Desired impact: lower infant mortality rate; more children are well nourished. Contributing changes shown: sufficient food; less diarrheal disease; fewer flies and rodents; good quality of food; sanitary practices; facilities used correctly; people wash hands before eating. Also shown: strengthened capacity of health institutions and improved health policies.
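Purely as an illustration of keeping such contributing causes explicit, the sketch below restates the problem tree above as a nested data structure. The nesting is one plausible reading of the diagram, no relative weighting of the factors is implied, and the helper function is hypothetical:

```python
# A minimal sketch: the problem tree above expressed as nested dicts, so that
# each assumed cause-effect link is explicit and can later be paired with
# indicators. The nesting is one plausible reading of the slide.
problem_tree = {
    "problem": "High infant mortality rate",
    "caused_by": [
        {
            "problem": "Children are malnourished",
            "caused_by": [
                {"problem": "Insufficient food"},
                {
                    "problem": "Diarrheal disease",
                    "caused_by": [
                        {"problem": "Flies and rodents"},
                        {"problem": "Poor quality of food"},
                        {
                            "problem": "Unsanitary practices",
                            "caused_by": [
                                {"problem": "Facilities not used correctly"},
                                {"problem": "People do not wash hands before eating"},
                            ],
                        },
                    ],
                },
            ],
        },
    ],
}


def root_causes(node):
    """Return the leaf-level problems: the points where interventions would act."""
    children = node.get("caused_by", [])
    if not children:
        return [node["problem"]]
    leaves = []
    for child in children:
        leaves.extend(root_causes(child))
    return leaves


print(root_causes(problem_tree))
# ['Insufficient food', 'Flies and rodents', 'Poor quality of food',
#  'Facilities not used correctly', 'People do not wash hands before eating']
```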
What is the role of randomized experimental trials?
• I believe there are examples of where they should be used to test interventions, in order to determine clear cause-effect relationships. These can then be used in subsequent project design and evaluation.
• I solicit your suggestions of examples where they have been, or should be, used.