
Decentralized evaluation for evidence-based decision making
WFP Office of Evaluation
Decentralized Evaluation Quality Assurance System (DEQAS)
Technical Note
Evaluation Criteria and Questions
Version April 2016
1. Introduction
1. Evaluation criteria help focus evaluation objectives by defining the dimensions by which
WFP interventions are assessed and by establishing the focus of the evaluation questions. The
international criteria for evaluation are: relevance, effectiveness, efficiency, impact
and sustainability. Additional humanitarian evaluation criteria may also be applicable for
evaluations taking place in emergency-focused contexts: appropriateness,
connectedness, coherence and coverage.
2. All WFP evaluations assess the quality of WFP interventions against all or some of the
internationally agreed set of evaluation criteria. Applying the international evaluation
criteria helps ensure the quality, credibility and relevance of WFP’s evaluations and
makes them accessible to stakeholders (both internal and external) through a commonly agreed
language.
3. During a decentralized evaluation, the following actions are required to ensure selection and
application of appropriate evaluation criteria:
Phase 2: Preparation:
4. In the TOR, the Evaluation Manager:
• Selects the appropriate evaluation criteria for the evaluation, based on the purpose
and objectives of the evaluation.
• Develops evaluation questions to address each of the selected evaluation criteria,
appropriate to the evaluand and its context.
• Uses the evaluation questions to inform the overall methodological approach in the
TOR.
Phase 3: Inception:
5. In the Inception Report, the Evaluation Team:
• Refines the evaluation questions (if needed) in light of the subject of the evaluation and
expands on them by formulating sub-questions in the Evaluation Matrix (see the TN
on the evaluation matrix).
• Further develops the methodology to address all elements of the evaluation matrix.
Phase 4: Data Collection and Analysis
6. During fieldwork, the evaluation team:
• Collects data to populate the indicators identified in the evaluation matrix for each
question and sub-question.
• Analyses all the data and information obtained to address the evaluation sub-questions
and questions.
Phase 5: Reporting
7. In the evaluation report, the evaluation team:
• Reports findings against the evaluation questions.
• Develops conclusions on performance against each of the selected evaluation
criteria.
8. This Technical Note focuses on Phase 2 Preparation to guide WFP Evaluation Managers
through the process of identifying suitable evaluation criteria and assigning appropriate
evaluation questions.
2. Concepts and Definitions of Standard Evaluation Criteria1
9. In line with international standards for evaluations, each evaluation assesses the key
evaluation criteria, as defined in Tables 1 and 2 below. Both tables provide details about the
scope of the analysis for each criterion; for ease of reference the unit of analysis is an
intervention, but the same scope of analysis broadly applies to all types of evaluations.
Table 1: Definition of OECD-DAC evaluation criteria and analytical scope
Relevance
Definition: Extent to which the objectives of an intervention are consistent with the needs of the most vulnerable groups, country needs, organisational priorities, and partners’ policies and practice.
Includes analysis of:
• Relevance of the intervention design in view of the needs of the most vulnerable groups
• Continued relevance of the objectives over the life of the intervention (ability to adapt to new needs)
• Alignment with government, partners’ and donors’ policies and interventions
• Internal coherence with WFP policies
• Consistency of project design and logic
• Extent to which design and implementation were gender-sensitive, based on gender analysis, and addressed diverse needs

Effectiveness
Definition: The extent to which objectives as defined are achieved, and the extent to which outputs have led (or are expected to lead) to expected outcomes as planned.
Includes analysis of:
• Achievement, or likelihood of achievement, of objectives
• Main results, including positive, negative, intended and unintended outcomes
• Outputs and outcomes for men, women, boys and girls, and other relevant socio-economic categories
• Potential constraints and facilitating factors to achievements

Efficiency
Definition: Measures the outputs – qualitative and quantitative – in relation to inputs – funds, expertise, time etc. This generally requires comparing alternative approaches to achieving the same outputs.
Includes analysis of:
• Costs per recipient for different implementation mechanisms/modes of transfer (food/cash/voucher)
• Timeliness of delivery, compliance with intended timeframes, comparison of channels of delivery (e.g. schools/health systems versus community-based)
• Comparison of different institutional arrangements (e.g. continuity of supplies and use of local partners/systems/procurement where feasible)

Impact3
Definition: Wider effects of the project – social, economic, technical, environmental – on individuals, gender- and age groups, communities and institutions. Impacts can be intended or unintended, positive or negative, macro (sector) or micro (household).
Includes analysis of:
• Intended and unintended long-term effects of the intervention on men, women, boys and girls, and other relevant socio-economic categories
• Intended and unintended long-term effects on institutional capacities
• Contribution of an intervention to long-term intended results
Refer to the TN on impact evaluation for further guidance.

Sustainability
Definition: The continuation of benefits from an intervention after assistance has been completed, or the probability of long-term benefits.
Includes analysis of:
• Capacity building/development results
• Institutional/systemic changes
• Integration of intervention elements into national systems and processes

1 Adapted from OECD-DAC, 2000, DAC Criteria for Evaluating Development Assistance (standard definitions for relevance, effectiveness, efficiency, impact and sustainability); ALNAP, 2006, Evaluating humanitarian action using the OECD-DAC criteria - An ALNAP guide for humanitarian agencies; and WFP’s Operations Evaluation Guidance: Technical Note on Evaluation Criteria.
3 Evaluating impact has particular implications for an evaluation, for example: there must be data availability to support longer-term analysis, and the evaluation team should include specialists in quantitative approaches. Refer to the Technical Note on Impact Evaluation, available as part of this Guidance Package.
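As an illustration of the cost-per-recipient analysis listed under Efficiency above, a minimal formulation (a sketch only; the symbols C_m and N_m are introduced here for illustration and are not drawn from WFP guidance) is, for each transfer modality m (food, cash or voucher):

\[ \text{cost per recipient}_m = \frac{C_m}{N_m} \]

where \(C_m\) is the total cost attributable to modality \(m\) and \(N_m\) is the number of recipients reached through it. Comparing this ratio across modalities, or against alternative channels of delivery achieving the same outputs, supports the efficiency comparisons described in Table 1.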
Table 2: Definition of humanitarian evaluation criteria and analytical scope4
Appropriateness
Definition: Appropriateness is the tailoring of activities to local needs and context, thereby increasing ownership, accountability and cost-effectiveness. It replaces the OECD-DAC criterion of Relevance.
Includes analysis of:
• Extent to which WFP inputs were tailored to needs
• Extent to which they were adapted to respond to the changing demands of unstable environments
• Extent to which design and implementation were gender-sensitive, based on gender analysis

Coverage
Definition: The degree to which major population groups facing life-threatening suffering, wherever they are, have been provided with impartial assistance and protection, proportionate to need. Requires analysis of differential coverage/targeting and of inclusion and exclusion impacts on population sub-groups (gender, ethnicity, location, family circumstance).
Includes analysis of:
• Extent to which different groups are targeted or included
• Impact of exclusion on sub-groups (gender, ethnicity, location, family circumstance)
• Differentiation of targeting and forms/amount of assistance provided

Coherence
Definition: The relationship between the subject of the evaluation and the political, security, developmental, trade and military context, as well as humanitarian policies, taking into account in particular humanitarian and human-rights considerations, principles and standards.
Includes analysis of:
• Contextual factors and how they influenced the design/implementation of the subject
• Links to the food security and nutrition policies and programmes of other actors
• Consideration of humanitarian and human rights principles and standards, including gender equality and women empowerment

Connectedness
Definition: Connectedness refers to the degree to which activities of a short-term emergency nature are carried out in a way that takes longer-term and interconnected problems into account (e.g. refugee/host community issues; relief and resilience). It can be applied as part of, or in place of, Sustainability above.
Includes analysis of:
• Consistency between short-term activities and other development interventions, development goals etc. that address contextual problems
• Presence of transition-focused analyses, including stakeholder consultations and the existence of a transition strategy

4 Adapted from ALNAP, 2006, Evaluating humanitarian action using the OECD-DAC criteria - An ALNAP guide for humanitarian agencies.
3. Application of Evaluation Criteria in WFP Evaluations
11. In general, applying the standard evaluation criteria (relevance, effectiveness,
efficiency, sustainability and impact) can help address the most important questions an
evaluation can raise. At a minimum, all evaluations should respond to
relevance/appropriateness, effectiveness and efficiency. Whilst every effort should be made to
apply the other evaluation criteria of impact and sustainability, the timing of the evaluation
will determine whether they can be addressed. In that case, a justification for not using them
will be included in the TOR. Further, in evaluations of humanitarian and emergency
programming, or of interventions in a humanitarian setting, the criteria of appropriateness,
connectedness, coherence and coverage should be applied.
12. All evaluations should apply a gender lens by considering gender equality and women
empowerment within each of the criteria applied, to ensure that the evaluation assesses the
inclusion of gender dimensions in the intervention design and implementation.5 More
information on how to do this can be found in the TN on gender in evaluation, available as
part of the DEQAS guidance package.
13. Table 3 provides some example parameters of how information needs link to different
evaluation criteria:
Table 3: Parameters for selecting evaluation criteria
• If you wish to understand the intervention’s alignment with target group/country/government/donor priorities, include a question under Relevance.
• If you need to know what progress is being made in an intervention, or the achievement of outputs and outcomes, include a question under Effectiveness.
• If you wish to understand what is contributing to the success of, or the obstacles to, different outputs, include a question under Effectiveness.
• If you need to report on the cost-effectiveness or timeliness of various aspects of an intervention, include a question under Efficiency.
• If you are interested in knowing the effect of an intervention on recipients’ lives in the medium to longer term, include a question under Impact.
• If you would like to determine the net effects of an intervention, include a question under Impact.
• If you would like to know whether an activity or its impact is likely to continue beyond the lifetime of an intervention, include a question under Sustainability.
• If you wish to understand how well responses were tailored to local needs, include a question under Appropriateness.
• If you need to know how well short-term emergency responses considered and/or affected longer-term issues and problems, include a question under Connectedness.
• If you wish to assess the extent to which populations in need were targeted by the intervention, include a question under Coverage.
• If you wish to understand how the intervention fits with wider policy concerns and considerations, such as human rights, include a question under Coherence.
• If you wish to understand how the intervention fits with other key players’ work in food security/nutrition, include a question under Coherence.

5 In accordance with “Integrating Human Rights and Gender Equality in Evaluation: Towards UNEG Guidance” (UNEG, 2011), page 30, table 2.4.
4. Addressing the Evaluation Criteria through Evaluation Questions
14. To address each of the selected evaluation criteria, the evaluation will have to answer key
questions, identified as part of evaluation design in the TOR. These questions should be
developed in response to the criteria. The main questions (to be set out in the evaluation
Terms of Reference) will be further developed by the evaluation team during the inception
phase through the formulation of sub-questions within an Evaluation Matrix.
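Purely as an illustration of how the pieces of an evaluation matrix link together (criteria, main questions, sub-questions, indicators and data sources), the structure can be sketched as below. The field names and the sample entry are illustrative assumptions, not the DEQAS matrix template; see the TN on the evaluation matrix for the format actually used.

from dataclasses import dataclass, field
from typing import List

@dataclass
class SubQuestion:
    text: str
    indicators: List[str]            # what will be measured
    data_sources: List[str]          # where the evidence comes from

@dataclass
class EvaluationQuestion:
    criterion: str                   # e.g. "Relevance", "Effectiveness"
    text: str                        # main question set out in the TOR
    sub_questions: List[SubQuestion] = field(default_factory=list)

# Illustrative entry only; real questions are tailored to the evaluand and its context.
matrix = [
    EvaluationQuestion(
        criterion="Effectiveness",
        text="To what extent were the outputs and outcomes of the intervention achieved?",
        sub_questions=[
            SubQuestion(
                text="Were planned transfers delivered to targeted households?",
                indicators=["% of planned transfers delivered",
                            "number of recipients, disaggregated by sex and age"],
                data_sources=["distribution reports", "monitoring data"],
            ),
        ],
    ),
]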
15. Collectively, the questions should be designed to give evaluation users the information they
need to make decisions, take action, or understand and learn from an intervention. They
should aim to highlight the key lessons and the performance of the subject of the evaluation
(which may be an Operation, Activity, Thematic Area, Transfer Modality or Pilot) in a way
that could inform future strategic and operational decisions.
16. Good evaluations should not only assess results, but also explain the reasons for results
(explanatory factors). The questions should therefore go beyond what results were achieved
(e.g. how many people were reached with what quantity of food) to explore the reasons why
food assistance was successful or not in improving the food security outcomes of interest. The
criteria should be used to develop these ‘why’ type questions – why was the programme
effective, why was it efficient or not, and so on. Using the criteria in this way will promote
lesson-learning.
17. Given that evaluations are often limited in terms of budget, time and resources, it is
important to have a few strategically designed evaluation questions that adequately inform
the assessment of the evaluation criteria.
18. Table 4 translates the evaluation criteria above into example evaluation questions, building
on the areas of analysis discussed in Tables 1 and 2. However, since the context within which
each evaluation is commissioned differs, these questions should be tailored to the specific
evaluation.
Table 4: Example Evaluation Questions linked to Evaluation Criteria
Relevance
• To what extent was the design of the intervention relevant to the wider context?
• To what extent is the intervention in line with the needs of the most vulnerable
groups (men and women, boys and girls)?
• To what extent is the intervention aligned with the needs and priorities of the
government?
• To what extent is the intervention aligned with WFP, partners, UN agencies and
donor policies and priorities?
• To what extent was the intervention based on a sound gender analysis? To what
extent was the design and implementation of the intervention gender-sensitive?
Effectiveness
• To what extent were the outputs and outcomes of the intervention achieved /are
likely to be achieved?
• What were the major factors influencing the achievement or non-achievement of
the outcomes of the intervention?
• To what extent is the achievement of outcomes leading to, or likely to lead to, the
achievement of the objectives of the intervention?
• What were the major factors influencing the achievement or non-achievement of
the objectives of the intervention? (were the assumptions that achieving outcomes
would achieve the objectives confirmed?)
• To what extent did the intervention deliver results for men and women, boys and
girls?
• Were there unintended positive/negative results?
• Were the relevant assistance standards met?
Efficiency
• Was the intervention cost-efficient? (see the comparisons in Table 1)
• Was the intervention implemented in a timely way?
• Was the intervention implemented in the most efficient way compared to
alternatives?
• Did the targeting of the intervention mean that resources were allocated
efficiently?
Impact
• What were the long-term effects of the intervention on recipients’ lives?
• Were there unintended (positive or negative) effects for recipients and non-recipients
of assistance?
• What were the gender-specific impacts? Did the intervention influence the gender
context?
Sustainability
• To what extent did the intervention implementation arrangements include
considerations for sustainability, such as capacity building of national and local
government institutions, communities and other partners?
• To what extent did the benefits of the intervention continue after WFP’s work
ceased? OR
• To what extent is it likely that the benefits of the intervention will continue after
WFP’s work ceases?
• Has the intervention made any difference to gender relations in the medium or
longer term?
Appropriateness (relates to the Relevance criterion)
• Was the intervention approach chosen the best way to meet the food security and
nutrition needs of recipients?
• Were the adopted transfer modalities the best way of meeting recipients’ needs?
• Were protection issues considered in the design and implementation?
• To what extent was the intervention based on a sound gender analysis?
• To what extent was the design and implementation of the intervention gender-sensitive,
i.e. did it consider gender equality and women empowerment issues?
Coverage
• Were the humanitarian needs of key target groups (men and women, boys and
girls) met by the intervention?
• Was WFP’s assistance provided proportionally according to needs within the
context? OR: Did different geographical areas or groups of the affected population
receive assistance according to their needs?
• Were relevant assistance standards met?
• Was WFP’s assistance consistent with that provided by others
(duplication/gaps)?
Coherence
• To what extent were prevailing context factors (political stability/instability,
population movements etc) considered when designing and delivering the
intervention?
• To what extent was WFP’s intervention coherent with key policies and
programmes of other partners operating within the same context?
• To what extent was the intervention design and delivery overall in line with
humanitarian principles?
Connectedness
• What have been the linkages between the intervention and any other WFP
interventions in relief/recovery/development?
• To what extent did the intervention link to any transition strategies in the context
or development goals?
5. Checking that the selected Criteria and Questions are addressed
19. At the reporting phase, it is important to revisit the selected criteria and questions to ensure
that the analysis is sufficient to address them. Similarly, when reviewing the draft
evaluation report, checking that all evaluation questions have been addressed is a key quality
criterion, as outlined in the evaluation report quality checklist. In particular, the conclusions
section should include an overall assessment of the evaluation subject against each evaluation criterion.
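A minimal sketch of such a completeness check is shown below (a hypothetical helper, assuming the main questions and the drafted findings text are available electronically; it is not part of the DEQAS checklist): it simply flags questions that have no findings text yet.

from typing import Dict, List

def unanswered_questions(questions: List[str], findings: Dict[str, str]) -> List[str]:
    # Return the evaluation questions for which no findings text has been drafted.
    # `findings` maps each question to the findings text written against it;
    # a missing or empty entry flags the question as not yet addressed.
    return [q for q in questions if not findings.get(q, "").strip()]

# Illustrative use only; real questions come from the TOR and evaluation matrix.
questions = [
    "To what extent is the intervention aligned with the needs and priorities of the government?",
    "Was the intervention implemented in a timely way?",
]
findings = {
    "Was the intervention implemented in a timely way?": "Deliveries largely met the planned schedule ...",
}
print(unanswered_questions(questions, findings))
# Prints the first question, which has no findings text yet.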
6. Further reading
20. Further reading, including DEQAS sources, is available as follows:
Box 1: Further (including DEQAS) reading – Evaluation Criteria and Questions
• Beck, T., 2008, ALNAP Guide to evaluating humanitarian action using the OECD-DAC criteria (link)
• IASC, 2010, International Humanitarian Norms & Principles - Guidance Materials (link)
• OECD, 2007, Principles for Good International Engagement in Fragile States and Situations (link)
• OECD-DAC, 2010, Evaluating Development Co-Operation - Summary of Key Norms and Standards (link)
• OECD-DAC, 2002, Glossary of Key Terms in Evaluation and Results Based Management (link)
• OECD-DAC, 2000, DAC Criteria for Evaluating Development Assistance (link)
• OECD-DAC, 1999, Guidance for Evaluating Humanitarian Assistance in Complex Emergencies (link)
• OECD-DAC, 1991, Principles for Evaluation of Development Assistance (link)
• UNEG, 2014, UN-SWAP Evaluation Performance Indicator Technical Note (link)
• UNEG, 2011, Integrating Human Rights and Gender Equality in Evaluation (link)
• UNHCR, 2011, Convention and Protocol relating to the Status of Refugees (link)
For more information on Decentralised Evaluations visit our webpage
http://go.wfp.org/web/evaluation/decentralized-evaluations
Or contact the DE team at: [email protected]