Guidance for Evaluating Humanitarian Assistance in
Complex Emergencies
Organisation for Economic Co-operation and
Development, Development Assistance Committee 1999
(http://www.reliefweb.int/rw/lib.nsf/db900SID/LGEL-5G8KHG/$FILE/oecd-evaluating-1999.pdf?OpenElement)
The standard OECD/DAC evaluation
criteria of efficiency, effectiveness, impact, sustainability
and relevance are broadly appropriate for humanitarian
assistance programmes.
Efficiency measures the outputs - qualitative and quantitative -
in relation to the inputs. This generally requires
comparing alternative approaches to achieving the same outputs, to see whether
the most efficient process has been used. Cost-effectiveness
is a broader concept than efficiency in that it looks beyond how inputs were converted into outputs, to whether different outputs could
have been produced that would have had a greater impact in
achieving the project purpose.
Effectiveness measures the extent to which the activity achieves
its purpose, or whether this can be expected to happen on the basis
of the outputs. Implicit within the criterion of effectiveness is timeliness (for if the delivery of food assistance is significantly delayed the
nutritional status of the target population will decline).
There is value in using timeliness more explicitly as one of the standard criteria
because of its importance in the assessment of emergency programmes.
Similarly, issues of resourcing and preparedness should be
addressed.
Impact looks at the wider effects of the project - social,
economic, technical, environmental - on individuals, gender and age-groups,
communities, and institutions. Impacts can be immediate and long-range, intended and unintended, positive and negative, macro (sector) and micro
(household). Impact studies address the questions:
what real difference has the activity made to the beneficiaries? How many have been affected?
Relevance is concerned with assessing whether the project is in
line with local needs and priorities (as well as with donor
policy). A recent evaluation of humanitarian assistance replaced the criterion
of relevance with the criterion of appropriateness -
the need "to tailor humanitarian activities to local needs, increasing ownership, accountability, and cost-effectiveness accordingly"
(Minear, 1994). However, the two criteria complement
rather than substitute for each other. ‘Relevance’ refers to the overall goal and
purpose of a programme, whereas ‘appropriateness’ is more focused
on the activities and inputs. The expansion of the criteria draws
attention to the fact that even where the overall programme goal is relevant -
for example, to improve nutritional status - there are still
questions to be asked about the programme purpose. Distributing large quantities of food aid may not be the best way of improving
nutritional status. Alternatives could include food
for work, cash for work, or measures to improve the functioning of local
markets. Furthermore, even if distribution of food aid is
deemed appropriate, it is still necessary to examine the appropriateness
of the food that is distributed.
Sustainability - of particular importance for
development aid - is concerned with measuring whether an activity or an
impact is likely to continue after donor funding has been withdrawn. Projects
need to be environmentally as well as financially sustainable.
However, many humanitarian interventions, in contrast to development
projects, are not designed to be sustainable. They still need assessing,
however, in regard to whether, in responding to
acute and immediate needs, they take the longer term into account. Larry Minear has referred to this as Connectedness, the need
"to assure that activities of a short-term emergency nature are
carried out in a context which takes longer-term and interconnected problems
into account" (Minear, 1994). For example,
otherwise efficient food distribution programmes can damage roads used by local traders, while the presence of large refugee camps can result in
severe environmental impacts in neighbouring areas. Local
institutions can also suffer - the high salaries paid by international NGOs can attract skilled staff away from government clinics and schools, leaving
the local population with reduced levels of service. Large-scale
relief programmes can also have a significant impact on local power structures, for better or for worse.
Coverage - the need "to reach major population groups
facing life-threatening suffering wherever they are, providing them
with assistance and protection proportionate to their need and devoid of
extraneous political agendas". Minear alerts evaluators
that complex emergencies and associated humanitarian programmes can
have significantly differing impacts on different population sub-groups,
whether these are defined in terms of ethnicity, gender, socio-economic
status, occupation, location (urban/rural or inside/outside
of a country affected by conflict) or family circumstance (e.g. single mother,
orphan).
Programmes need to be assessed
both in terms of which groups are included in a programme, and the differential impact on those included. For example, studies have shown
that, in Ethiopia in the 1980s, more than 90% of international relief
went to government-controlled areas, penalising those in areas of Tigray and Eritrea controlled by insurgent movements (Minear, 1994). Other
studies have revealed that single mothers may be disadvantaged
when it comes to access to resources, as they are unable to leave children to queue for relief goods. In the case of the Great Lakes emergency, it was
found that the coverage of the response varied enormously:
refugees and IDPs, and residents in neighbouring IDP camps, were often treated in quite different ways, despite having very similar needs
(Borton et al. 1996).
Coherence - refers to policy coherence, and the need to assess
security, developmental, trade and military policies as
well as humanitarian policies, to ensure that there is consistency and, in
particular, that all policies take into account
humanitarian and human rights considerations. A notable lack of coherence was evident in the international community’s response to the Great Lakes
emergency in 1994. Military contingents were
withdrawn from Rwanda during the genocide, when there is evidence to suggest that a rapid deployment of troops could have prevented many of the
killings and the subsequent refugee influx into Zaire. This was then
followed by a huge relief operation. In other instances, donor-imposed trade
conditions have been blamed for precipitating economic crisis and conflict,
undermining longer-term development policies. Coherence can also be analysed
solely within the humanitarian sphere - to see whether all
the actors are working towards the same basic goals. For example, there have
been instances of one major UN agency promoting the return of
refugees to their country of origin while another is diametrically
opposed to such policies.
Finally, there is the important
issue of co-ordination. This could be considered under the
criterion of effectiveness, for a poorly co-ordinated response is
unlikely to maximise effectiveness or impact. However, given the multiplicity
of actors involved in an emergency response, it is important that co-ordination
is explicitly considered - the intervention of a single agency cannot be
evaluated in isolation from what others are doing,
particularly as what may seem appropriate from the point of view of a single
actor may not be appropriate from the point of view of the system as a whole.
Given the context of conflict
and insecurity, protection issues are also critical to the effectiveness
of humanitarian action. Where levels of protection are
poor it is possible that the target population of an otherwise
effective project distributing relief assistance is being killed by armed
elements operating within the project area or even
within the displaced persons/refugee camp. Assessment of the levels of security and protection in the area of the project or programme and,
where relevant, the steps taken to improve them should be part of
all humanitarian assistance evaluations. In those humanitarian assistance
evaluations undertaken to date, such issues have often been left out of the
study or not adequately covered. Often the lack of familiarity of
Evaluation Managers with the issues of security and protection has contributed to such omissions.
International agreements on
standards and performance, such as the Red Cross/NGO Code of Conduct and the
Sphere Project, as well as relevant aspects of international humanitarian
law, provide international norms against which the
performance of agencies and the system may be assessed.
Preparing the narrative history
and ‘baseline’ has to be the starting point for any study. Expansion and
modification of the narrative history and ‘baseline’ will probably continue
throughout the study as more information is obtained. Whilst documentation will
be an important source of information, interviews with the range of
actors and members of the affected population will be a vital source too.
Arguably, interviews form a more important source
than is normally the case with evaluations of development assistance due to the problems of poor record keeping and documentation. Effective
management of the results of these interviews is an important
determinant of the effectiveness of the team. Generally, different members of
the team should take responsibility for interviewing different individuals in
different locations. However, whilst rational in terms of time and travel, such
divisions of labour require the development of standard
protocols to be used by all members of the team in their separate interviews. Such protocols should include relatively open-ended questions such as
“What key lessons did you learn from your experience?”, “What in
your view were the main strengths of the operation?” and “What would you change if you had to do it all again?” Team members would need to be
disciplined in their adherence to these protocols and in
writing up and sharing interview records among team members. The use of laptop computers, e-mail and free-form databases is potentially a very
effective means of sharing such information and enabling all team members to
contribute to and benefit from the process of constructing the narrative
and ‘baseline’.
The strength of multidisciplinary teams lies in the
differing perspectives that can be brought to bear on the
issues and it is vital that the team has sufficient opportunities to get
together at different stages of the evaluation process
to discuss overlapping issues and conclusions.
Interviews with a sample of the
affected population should be a mandatory part of any humanitarian assistance
evaluation. Humanitarian assistance is essentially a ‘top down’ process. Of necessity
it often involves making assumptions about assistance needs and
the provision of standardised packages of assistance. Even where time and the
situation permit, humanitarian agencies are often poor at consulting or involving members of the affected population and beneficiaries of
their assistance. Consequently, there can often be
considerable discrepancy between the agency’s perception of its performance and
the perceptions of the affected population and
beneficiaries. Experience shows that interviews with beneficiaries
can be one of the richest sources of information in evaluations of humanitarian
assistance. The use of Rapid Rural Appraisal and Participatory
Rural Appraisal techniques can be very helpful in selecting
members of the affected population to be interviewed and in the structuring of
the interview. A combination of interviews with individual households,
women’s groups and open group discussions involving men
as well has proven to be very productive in some contexts. However, in the
context of recent or ongoing conflicts such a process may need
to be modified. Ensuring the confidentiality of some interviews
with individuals may be necessary. The deliberate seeking out of those who did
not benefit from the assistance available can also be fruitful as
it may reveal problems with the targeting and beneficiary
selection processes used by the agencies. Ideally, anthropologists familiar
with the culture and the indigenous language will
undertake this work.
It has already been noted that
one of the strengths of a multi-disciplinary team is the differing perspectives
it can bring to bear on issues. All team members should, therefore, be involved
in discussing the findings and linking these to
conclusions. Tensions which may arise between the team leader and individual
subject specialists on the nature of the conclusions can be
managed more effectively if the whole team is brought together to
discuss findings and conclusions. Ideally the team should hold workshops to
discuss their preliminary findings on return from fieldwork (where
they have not been working together in the field) and then
subsequently to discuss comments received on the draft report and any new
information provided by the agencies.
Whatever its scope or nature, an
evaluation report will maximise its potential impact if it presents its findings, conclusions, recommendations and follow-up sections separately.
If readers disagree with the recommendations (or find
themselves unable to implement them because of political constraints), they may be able to agree with the findings or conclusions. Comprehensive
discussion of the draft report within the target
audience of the report is likely to increase their ‘ownership’ of the report
and the likelihood of its acceptance and follow-up.
In any large-scale evaluation,
conflict over the nature of the recommendations is probably inevitable. In preparing recommendations, in order to minimise such conflict, there
needs to be a clear link between the recommendations and the evidence
in the body of the report to support them.
The form in which
recommendations should be made is the subject of continuing debate among
evaluation specialists. Some would argue that evaluation reports
should contain specific, implementable recommendations
detailing the actions agencies should take in order to improve future
performance. Such recommendations might also spell out who is
responsible for implementing each recommendation and who is responsible
for monitoring whether this action takes place. This approach has the benefit
of making the responsibility for implementation and follow-up clear
and of reducing the opportunity for organisational evasion and
‘fudging’. However, others would urge caution, favouring findings and
conclusions over specific recommendations, so as not to burden
policy-makers with recommendations that could lead to unimplementable
policies. If recommendations are required, an evaluation team might provide
policy-makers with options rather than a single recommendation,
together with an analysis of expected consequences. Different issues may
require different responses. Technical issues may lend themselves to specific
recommendations in the final report. In dealing with broader issues it may be
useful to deliver the analysis to a workshop of decision-makers and evaluators
which negotiates follow-up action.
Evaluation reports also need to
be “sold”. Bureaucrats, field officers and agency staff need to
be enthused, excited and convinced that the evaluation
report is important and should be read. While selling the report is more the
responsibility of the management group than the evaluation team, marketing strategies could be included in negotiated follow-up
actions in order to help steering committee members sell the evaluation report within their own organisations.
Large, system-wide evaluations
raise issues relating to a diverse range of organisations
and compliance cannot be compelled. Monitoring of
follow-up action is therefore important because it provides for
a level of accountability which is otherwise missing. A
well-resourced and well-structured monitoring process can
strongly influence agencies to account for their response (or
lack of it) to the evaluation report.
It is highly desirable for
Evaluation Managers to establish a mechanism whereby decisions relating to the conclusions
and recommendations of an evaluation are formally recorded
and an explanation provided of what action is to be taken and who is to be responsible. Where no
action is deemed appropriate, this would need to be justified. Evaluation departments would then have a basis for
monitoring whether the agreed actions are undertaken.
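One possible way to picture such a mechanism is as a simple register of decisions. The sketch below is purely illustrative and is not drawn from the guidance itself: the `Decision` record, its field names, the `outstanding` helper and the sample entries are all assumptions made for the example. It records, for each recommendation, either an agreed action with a named responsible party or a justification for taking no action, and it lists the agreed actions that still await completion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decision:
    """One formally recorded decision on a recommendation (hypothetical structure)."""
    recommendation: str
    action: Optional[str] = None         # agreed action, if any
    responsible: Optional[str] = None    # who is to carry the action out
    justification: Optional[str] = None  # expected when no action is agreed
    completed: bool = False

    def problems(self) -> List[str]:
        """Return the gaps in this record, or an empty list if it is complete."""
        issues = []
        if self.action:
            if not self.responsible:
                issues.append("action agreed but nobody named as responsible")
        elif not self.justification:
            issues.append("no action agreed and no justification recorded")
        return issues


def outstanding(register: List[Decision]) -> List[Decision]:
    """Agreed actions that have not yet been completed."""
    return [d for d in register if d.action and not d.completed]


# Invented sample entries, for illustration only.
register = [
    Decision("Adopt a standard interview protocol",
             action="Draft protocol", responsible="Evaluation unit"),
    Decision("Assess protection levels in every evaluation",
             action="Add protection questions to terms of reference"),
    Decision("Establish a joint logistics cell"),
    Decision("Revise ration composition",
             justification="Outside the agency's mandate"),
]

for decision in register:
    for issue in decision.problems():
        print(f"{decision.recommendation}: {issue}")
```

A register of this kind would give an evaluation department a list of agreed actions to monitor, and would make visible those recommendations for which neither an action nor a justification has been recorded.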