ije.oxfordjournals.org - Oxford Academic

Published by Oxford University Press on behalf of the International Epidemiological Association
The Author 2005; all rights reserved. Advance Access publication 30 September 2005
International Journal of Epidemiology 2005;34:1425–1434
doi:10.1093/ije/dyi195
Haemoglobin colour scale for anaemia diagnosis
where there is no laboratory: a systematic
review
Julia Critchley* and Imelda Bates
Accepted
31 August 2005
Background Anaemia is a major public health problem, in poor countries most of the cases are
diagnosed clinically. This is inaccurate and the haemoglobin colour scale (HCS)
has been developed as an inexpensive, simple alternative for assessing anaemia.
Laboratory and community studies have assessed its diagnostic accuracy, but
controversy over its validity and usefulness remains. We carried out a systematic
review to identify and summarize studies, explain heterogeneity, and make
recommendations for future research.
Methods
We searched electronic databases (MEDLINE, EMBASE, CINAHL, and Science
Citation Index), checked documents and references, and contacted experts. We
included all the studies comparing HCS diagnostic accuracy with a reference
standard. Both reviewers independently screened titles and abstracts, assessed
studies for inclusion, appraised quality, and extracted data.
Results
We included 14 studies, mostly from sub-Saharan Africa. Studies had heterogeneous populations, health care settings, anaemia prevalence, and findings.
HCS sensitivity for detecting anaemia was high in most of the studies (75–97%);
specificity was generally lower (41–98%). Sensitivity and specificity were higher
for laboratory-based studies compared with more pragmatic ‘real-life’ studies,
and the ‘study setting’ appeared to explain some of the heterogeneity. Five
studies compared the HCS with clinical diagnosis; sensitivity was higher for
the HCS in four studies, but specificity was often higher with clinical
diagnosis. A few studies evaluated the HCS in situations where there was no
laboratory.
Conclusions The HCS may improve anaemia diagnosis where there is no laboratory, but there is
a need for policy-relevant diagnostic research which is pragmatic, implementationfocused and assesses clinical outcomes. This requires a different approach and
research skill-mix from efficacy studies.
Keywords
Anaemia, haemoglobin, haemoglobinometry, developing countries, diagnosis,
health plan implementation
Introduction
1
Anaemia is the world’s second leading cause of disability and
one of the most serious global public health problems. Anaemia
affects over half of the pre-school children and pregnant women
in developing countries.2 In poorer malaria endemic countries it
is one of the commonest preventable causes of death in children
,5 years of age and in pregnant women.3 The HIV pandemic
Liverpool School of Tropical Medicine, Liverpool, UK
* Corresponding author. International Health Research Group, Liverpool
School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK.
E-mail: [email protected]
is worsening the severity and increasing the prevalence of
anaemia.4 Even mild and moderate anaemia has serious clinical
consequences. In pregnancy it is associated with increases in
stillbirths and low birth weight babies, and in children it causes
stunting and impaired intellectual development.5
Diagnosis of anaemia is problematic in resource-poor settings.
Clinical diagnosis is the most common method of detecting
anaemia but this is inaccurate. For example, it is only 65%
sensitive for detecting haemoglobin levels of ,7 g/dl in
pregnant women in Kenya6 and is even less useful for detecting
non-severe anaemia. Most of the existing methods for measuring haemoglobin are expensive, complex, or impractical for use
1425
1426
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY
in primary care in poor countries. The ‘gold standard’ haemiglobincyanide method uses inexpensive but toxic consumables,
depends on accurate sample dilutions, and requires regular
supervision and careful quality control processes.7 Other
methods, such as the HemoCue system, are simple to operate
but use costly disposable cuvettes making them too expensive
for widespread use.8 A cheap reliable means of assessing
haemoglobin levels is therefore required for use in places
where there is no laboratory.
Development and use of the haemoglobin
colour scale
The haemoglobin colour scale (HCS),9 a simple, rapid and cheap
method for estimating haemoglobin concentration with a finger
prick blood sample, has been developed for use in resource-poor
settings where there is no laboratory. The method relies on
comparing the colour of a drop of blood absorbed onto a filter
paper with standard colours on a laminated card, varying from
pink to dark red. These colours correspond to haemoglobin
levels of 4, 6, 8, 10, 12, and 14 g/dl. Intermediate shades
can be identified, allowing haemoglobin levels to be judged to
1 g/dl. This is the only method currently available that is
affordable for widespread primary care use by the most
impoverished countries.
A diagnostic test for assessing anaemia in communities and
primary care facilities in resource-poor settings should be
inexpensive, rapid, simple to perform on a finger prick sample,
and reasonably accurate.8 Because the HCS is divided into
increments it is not surprising that studies have shown it to be
less accurate than the laboratory methods of measuring
haemoglobin [such as the hemiglobincyanide (HCN) and
HemoCue methods] which measure haemoglobin levels to
the nearest 0.1 g/dl. Interpretation of HCS results is highly
variable depending on the context. Opinions vary from the
recognition that the HCS is accurate enough to be useful for
anaemia diagnosis in resource-poor settings10 to suggestions
that its poor accuracy ‘renders its use questionable’.11 These
opinions partly reflect the wide range of results from studies of
the accuracy of the tests in different contexts, and a lack of
clarity in defining whether these studies were designed to assess
accuracy or usefulness of the HCS in practice. To provide an
overview of the evidence available to support widespread
introduction of the HCS we have performed the first systematic
review to assess the objectives, quality, and results of all
published studies evaluating the performance of the HCS
compared with reference methods and clinical diagnosis. We
have examined possible explanations for heterogeneity between
studies, and developed recommendations for research strategies
assessing the practical usefulness of such diagnostic tests in
resource-poor settings.
Methods
Types of studies included
We included any study comparing the HCS with a ‘gold standard’ method of measuring anaemia, such as haemiglobincyanide (cyanmethaemoglobin method)7, in a population-based or
hospital-based sample. For pragmatic reasons, we also included
HemoCue as a ‘gold standard’ method of measuring anaemia in
poorer tropical countries.
Identification and selection of studies
Electronic databases, MEDLINE, EMBASE, CINAHL, and
Science Citation Index (up to May 2005), were searched using
keywords ‘Haemoglobin Colour Scale’ and alternative spellings
(hemoglobin or color) and the full text of any articles
that appeared relevant was retrieved. Both authors assessed
studies for inclusion, using a pre-designed, piloted inclusion
form. We also searched for relevant studies by screening titles
and abstracts listed in the WHO Haemoglobin Colour Scale
bibliography,12 by checking lists of references in relevant
studies (all included studies and reviews), and by personal
communication.
Data extraction and assessment of
methodological quality
Both reviewers’ extracted data using a pre-designed form. This
included information on whether the studies were designed to
assess accuracy or usefulness in practice; study characteristics,
such as country and location; definitions of anaemia and
anaemia prevalence; reference (‘gold standard’) method of
measuring anaemia; health care setting and workers; any
training received, including the length and type of training;
sample size; statistical methods; and results.
Studies were categorized by both reviewers as essentially
‘laboratory based’ (efficacy studies), ‘real-life’ (effectiveness
studies), or ‘unclear’ on the basis of two criteria: (i) whether the
HCS was tested using samples from patients in clinic settings
(real-life), or samples of known haemoglobin from laboratories
(laboratory based) and (ii) level of training and supervision
received by those using the HCS (in ‘real-life’ studies users
received not more than 1 h of training, more extensive
training and constant supervision were classified as laboratory
based). Studies classified as ‘unclear’ contained elements of
both the study types (such as real-life but with extensive
training), or did not report sufficient details in order to make a
judgement.
The data extraction form was designed by consulting
accepted guidelines and empirical evidence for the importance
of various aspects of the design of diagnostic studies,13 and was
piloted on two studies before use. We contacted authors for more
details where specific items [such as 95% confidence intervals
(95% CIs)] were not reported. Methodological quality of studies
was assessed by both reviewers, adapting validated guidelines
specifically developed for assessing the quality of diagnostic
studies.13–15 The included items are as follows:
Clear descriptions of selection criteria, methods of testing, and
the reference standard (‘gold standard’) used.
Whether blood for the reference standard and HCS were
collected and analysed at the same time.
Whether the HCS test results were interpreted without
knowledge of the results of the reference standard, and the
results of clinical examination if applicable.
Whether the same clinical data were available when HCS
results were interpreted as would be available when the test is
used in practice.
Whether ‘intermediate’ HCS results were reported.
Explanation of withdrawals from the study population.
Whether results from all the health workers were included in
the final analysis.
HAEMOGLOBIN COLOUR SCALE
Whether the spectrum of patients was representative of those
who will receive the test in practice.
Data analysis
We planned to pool estimates of sensitivity and specificity of
the HCS across studies wherever appropriate, stratified by ‘reallife’ or ‘laboratory’ based. However, statistical methods for
combining results of diagnostic tests are at an early stage of
development.15 In particular, it would be inappropriate to
combine estimates of sensitivity and specificity if there is a
marked heterogeneity in these estimates, if the disease spectrum
is different, or if the diagnostic cut-offs for anaemia are
varying between studies.16 Due to the clinical and statistical
heterogeneity between studies we did not combine them
statistically, except for the laboratory-based studies with similar
findings.17–21
Sensitivity and specificity were stratified by the study type.
Diagnostic odds ratios (DORs—ratios of odds of a ‘positive’ HCS
result in patients with anaemia compared with a patient
without anaemia, the larger the value the stronger the diagnostic evidence) were calculated as an overall measure of test
accuracy (see the web version of Table 1, available at IJE
Online).22 Many studies did not provide any estimate of
precision for these statistics (such as a 95% CI) so wherever
possible we estimated these from the raw data using StatsDirect,
or obtained them from the authors.
Results
Description of studies
Fourteen studies were included (Figure 1 and Table 1).10,11,17–
21,23–29
Two studies from the WHO’s HCS bibliography12 were
assessed but excluded because they were pilot studies, primarily
concerned with training.30,31 Development of the HCS has been
an iterative process, and several of the included studies tested
prototype versions of the HCS.17,27,29 For example, in one study
users found it difficult to differentiate between values of 10 and
12, and the shades were later adjusted;17 another study did not
allow blood to soak into the paper for 30 s,27 and Tatsumi29
experimented with altering the amount of blood, suggesting that
the method was not optimized.
The objectives, type of patients, and the health care setting in
which the HCS was used varied between studies making it
difficult to generalize the results. Some studies were carried
out under ‘ideal’ settings or in patient populations that are not
representative of those presenting to primary care health
facilities in developing countries.11,17,25,29 Six studies were
‘laboratory based’.11,17–21 These assessed the ability of health
care workers to obtain accurate results under supervision using
the HCS, compared with a reference method. Four studies were
‘real-life’, and were carried out without supervision in antenatal clinics, schools, or rural research clinics after a brief
training.23–25,28 The remaining four10,26,27,29 were not clearly
‘laboratory-based’ or ‘real-life’ but contained elements of both,
e.g. laboratory testing but with brief training,29 use of HCS in a
field setting but as part of a randomized controlled trial, and
with extensive training.10
Most of the studies were conducted in sub-Saharan Africa
(Ethiopia, Tanzania, South Africa, Zimbabwe, Malawi, Kenya).
1427
Other studies include populations from the UK (3), Indonesia (2),
and one each from Egypt, Malaysia, India, Honduras, Thailand,
and Switzerland. Definitions of anaemia and anaemia prevalence varied between studies. Most of these defined a cut-off for
anaemia as a haemoglobin level either ,11 g/dl or ,12 g/dl.
Some studies also assessed the diagnostic accuracy of the HCS
at other cut-offs but the definitions of these varied. ‘Severe’
anaemia was generally defined as Hb ,7 g/dl or ,6 g/dl, but one
study also defined a cut-off for ‘very severe’ anaemia as ,5 g/dl.
A few studies used different definitions for men and women.
Anaemia prevalence varied dramatically, from 5 to 85%. The
reference methods used to diagnose anaemia included
HemoCue (7 studies), automated analyser (7), photometer
(3) and the copper sulphate method (2). Most of the studies
provided training for health care workers using the HCS, though
the methodology differed between studies and was usually not
clearly described. Several studies also assessed the diagnostic
accuracy of other methods, such as clinical examination for
anaemia (5 studies), the copper sulphate method (3 studies), the
Sahli method (1 study), and the Tallquist paper scale (1 study),
against the reference method.
Methodological quality of studies (web Table 2)
In all the studies, health workers using the HCS appear to have
been ‘blind’ to results of other methods of measuring haemoglobin. Many studies did not clearly describe their criteria
for selecting health workers or patients, or the characteristics
or completeness of their sample. This severely hinders interpretation of the results as the sensitivity and specificity of any
test may vary in different populations (e.g. by sex, age or
disease severity). Several studies used venous blood samples,11,17,20,28,29 whereas finger prick samples would be the
most likely source of blood if the HCS was used at a community or
a primary care level. Two studies used a venous blood sample for
the reference standard (automated haematology analyser) but
used a finger prick blood sample for the HCS measurement,
thereby introducing the possibility of bias.20,28 Some studies
excluded results from health workers who were retrospectively
shown to have performed poorly on the HCS17,23 or allowed
health care workers to estimate HCS more than once, without
explaining how any discrepancies were resolved.19,20,29
The HCS measures haemoglobin in 2 g/dl increments. Some
studies ‘allowed’ intermediate values (to the nearest 1 g/
dl)10,21,26,28 while others did not, with implications for calculations of diagnostic accuracy. In some studies inappropriate
statistics were used to assess the agreement between the HCS
and the reference standard (such as correlation coefficients).32
Sensitivity and specificity of HCS for the detection
of anaemia
Thirteen studies reported estimates of sensitivity or specificity for
anaemia using the HCS10,17–21,23–29 and were categorized as
‘real-life’, ‘laboratory based’, or unclear (Figures 2a and b).
Estimates of sensitivity and specificity from the five laboratorybased studies were all high, varying from 0.85 to 0.99 for
sensitivity and from 0.91 to 1.0 for specificity.17–21 All used the
same cut-off for diagnosing anaemia (Hb , 12 g/dl), so a pooled
estimate was calculated from the four studies for which the
1428
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY
Table 1 Characteristics of the included studies
Study reference
(location)
Anaemia prevalence
Patient population
Main results
23 (Kenya)
60%
178 children (2–36 months)
Sens: 0.76% (CI 0.68–0.84)
20 (Indonesia)
38%
240 blood donors,
150 samples from the
haematology unit
Spec: 0.44% (CI 0.33–0.56)
2.7% severe
Sens:
Anaemia: 0.99 (CI 0.96–1.0)
Severe anaemia: 1.0 (CI 0.40–1.0)
Spec:
Anaemia: 1.0 (CI 0.98–1.0)
Severe anaemia: 0.99 (CI 0.95–0.99)
24 (Southern Ethiopia)
15.1%
403 pregnant women
10.4% mild, 4.2% moderate,
0.3% severe
10 (Tanzania)
Sens: 0.33–0.44 (CI 0.12–0.78)
Spec: 0.87–1.0 (CI 0.83–1.0)
85% anaemic, (29% severe,
9% very severe)
990 children and 643 gynae/
obstetric patients
83% anaemia (20% severe)
1529 pregnant women
Sens:
Anaemia: 0.97 (CI 0.96–0.97)
Severe anaemia: 0.92 (CI 0.89–0.94)
Very severe anaemia: 0.92 (CI 0.84–0.94).
Spec:
Anaemia: 0.41 (CI 34–47)
Severe anaemia: 0.86 (CI 0.84–0.88)
Very severe anaemia: 0.90 (CI 0.88–0.91)
25 (Upper Egypt)
17.5%
150 children (aged 6–11)
Sens: 0.88 (CI 0.70–0.97)
11 (UK)
Not relevant
408 adult and child inpatients
Not possible to calculate
sensitivity/specificity
21 (Multi-centre)
~5%
2801 blood donors
Sens: 0.85 (CI 0.77–0.91)
26 (Tanzania)
78.5% anaemia
614 children (16–83 months)
Sens:
Spec: 0.49 (CI 0.40–0.60)
Spec: 0.98 (CI 0.97–0.98)
3.5% severe anaemia
Anaemia: 0.85 (CI 0.81–0.88),
Severe anaemia: 0.74 (CI 0.49–0.90)
Spec:
Anaemia: 0.77 (CI 0.68–0.84)
Severe Anaemia: 0.99 (CI 0.99–1.00)
17 (South Africa)
Not reported
548 hospital outpatients
Sens:
Anaemia: 0.85 (CI 0.81–0.99),
Severe anaemia: 0.75 (CI 0.81–0.99)
Spec:
Anaemia: 0.77 (CI 0.95–0.98)
Severe anaemia: 0.99 (CI 0.98–0.99)
27 (South Africa)
36%
104 adult outpatients
Sens: 0.7
Spec: 0.85
28 (Malawi)
58.1% (mild)
32% (moderate)
729 pregnant women
Sens:
Anaemia: 0.78 (CI 0.74–0.8)
4% (severe)
Moderate: 0.82 (CI 0.77–0.86)
0.4% (very severe)
Severe anaemia: 0.81 (CI 0.65–0.92)
Very severe: 0.5 (CI 0.12–0.58)
Spec:
Anaemia: 0.50 (CI 0.46–0.55)
Moderate 0.45 (CI 0.42–0.49)
Severe anaemia: 0.74 (CI 0.74–0.79)
Very severe: 0.99% (CI 0.98–0.99)
HAEMOGLOBIN COLOUR SCALE
1429
Table 1 continued
Study reference
(location)
Anaemia prevalence
Patient population
Main results
29 (Indonesia)
16%
120 Children
Sens: 0.23 (CI 0.13–0.36)
28 (Multi-site reference centres)
67%
1213 random blood samples
Sens:
Spec: 0.97 (CI 0.89–1.0)
(severe 14%)
Anaemia: 0.86 (CI 0.83–0.89)
Severe: 0.94 (CI 0.89–0.97),
Spec:
Anaemia: 0.91 (CI 0.88–0.93)
Severe: 0.98 (CI 0.97–0.99)
28 (South Africa)
Not relevant
20 blood samples
Inappropriate statistical methods
28 (South Africa and Thailand)
Not given
113 blood donors
No discrepancies between methods
29 (South Africa)
Not given
20 blood samples
Sens:
Anaemia: 0.89 (CI 0.86–0.93)
Severe anaemia: 0.84 (CI 0.77–0.91)
Spec:
Anaemia: 0.95 (CI 0.91–0.97)
Severe anaemia: 0.92 (CI 0.90–0.94)
CI 5 95% confidence interval; Sens 5 sensitivity; Spec 5 specificity.
Potentially relevant diagnostic
studies identified and screened for
retrieval (n=67)
51 excluded as not relevant
(reviews, opinion pieces or
not concerned with HCS)
Diagnostic studies retrieved for
more detailed evaluation (n=16)
Potentially appropriate diagnostic
studies to be included in the
systematic review (n=14)
Diagnostic studies with usable
information by outcome included in
Figures 2-4 (n=14)
2 studies excluded
1– no reference standard
1– pilot concerned with
training
1 excluded from Figure 2 as
sensitivity or specificity could
not be estimated
2 excluded from Figure 3 as
Diagnostic Odds Ratios could
not be estimated
Figure 1 Flowchart showing selection and inclusion of studies in the
HCS systematic review. This illustrates the selection and inclusion of
studies for this review (QUOROM statement)
true positive, true negative, false negative, and false positive
rates could be obtained.17,18,20,21 This yielded an overall
sensitivity of 0.87 (95% CI 0.86–0.89) and specificity of 0.97
(95% CI 0.97–0.98). There was no statistically significant
heterogeneity between estimates of sensitivity and specificity
(c2 test, P , 0.25 for both).
Estimates of sensitivity and specificity from the ‘real-life’ or
unclear studies were lower and more heterogeneous than for
the laboratory-based studies, so a pooled estimate was not
considered appropriate. Sensitivity for the ‘real-life’ studies
varied between 0.76 and 0.88, apart from one outlier (Figure 2a).
Sensitivities lower than this may be clinically unacceptable; for
example, a sensitivity of 0.75 implies that 25% of the patients
with anaemia will not be diagnosed using the HCS. Specificity
was lower for the ‘real-life’ studies (Figure 2b), implying that
the HCS may falsely diagnose anaemia in many individuals.
Although anaemia treatment is cheap and relatively safe, the
potential costs of this over-diagnosis, particularly for patients
with very limited resources, must be taken into account in
assessing the usefulness of the HCS. DORs for anaemia diagnosis
were significantly lower (between 2.5 and 7.5) for ‘real-life’
studies, compared with laboratory-based studies (between 58
and 48 000), P 5 0.0032 (web Table 1, available at IJE Online;
Figure 3).
The haemoglobin cut-off for diagnosing anaemia varied
between studies (11–12 g/dl). As the HCS was read only to
the nearest 1 or 2 g/dl in these studies we would not necessarily
expect such variations to cause heterogeneity between studies, and there was almost no correlation between study-level
estimates of sensitivity and specificity (spearman’s rho 5 0.027,
P 5 0.29), suggesting that factors other than a threshold effect
account for most of the heterogeneity. These factors may include
characteristics of the population (such as anaemia prevalence
and patient group), test operator (level of training and
experience), or the test itself (optimal version). For example,
one of the ‘real-life’ studies24 and one of the unclear studies29
estimated particularly low sensitivity [0.44 (95% CI 0.32–0.56)
and 0.23 (95% CI 0.13–0.36), respectively]. However, the former
study was carried out in an area of very low anaemia prevalence,
and in the latter a prototype version of the HCS scale was
1430
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY
(a)
Munster 1997
Lewis 1998
(b)
19
Munster 1997
18
Lewis 1998
Ingram 2000 17
21
Lewis 2001
Timan 2004 20
19
18
Ingram 2000 17
21
Lewis 2001
Timan 2004 20
Tatsumi 1999 29
Coetzee 2000 27
Tatsumi 1999 29
Coetzee 2000 27
Montresor 2000 26
Montresor 2003 10
Montresor 2000 26
Montresor 2003 10
Van Den Broek 1999 28
Barduagni 2003 25
Van Den Broek 1999 28
Barduagni 2003 25
Gies 2003
24
Gies 2003
Verhoef 2004 23
24
Verhoef 2004 23
1.0
0.2
0.4
0.6
Sensitivity
0.8
1.0
(c)
0.0
(d)
Ingram 2000 17
0.2
0.4
0.6
Specificity
0.8
1.0
0.4
0.6
Specificity
0.8
1.0
Lewis 199818
Ingram 2000 17
Timan 2004 20
Timan 2004 20
Montresor 2000 26
Montresor 2000 26
Montresor 2003
10
Montresor 2003 10
Van Den Broek 1999
28
Van Den Broek 1999 28
1.0
0.2
0.4
0.6
0.8
Sensitivity
1.0
0.0
0.2
Figure 2 (a) Sensitivity of HCS for anaemia, (b) specificity of HCS for anaemia, (c) sensitivity of HCS for severe anaemia (HB , 6 or 7), and
(d) specificity of HCS for severe anaemia (HB , 6 or 7). Figures (a–d) show sensitivity and specificity of the HCS for anaemia and severe anaemia
11
with 95% CIs for each study. Paddle’s study is omitted because we could not obtain or calculate sensitivity, specificity, and 95% CI. s 5 laboratory
(efficacy) study, n 5 unclear, 5 field-based (effectiveness) study
used. For both these studies specificity was high (Figure 2b),
suggesting that threshold effects may account for some of the
heterogeneity, despite the lack of a clear correlation. Similarly,
one of the ‘unclear’ studies appeared to have lower specificity,
but sensitivity was high10 (Figures 2a and b). DORs (Figure 3)
appear reasonably consistent within the three sub-groups,
suggesting that the study settings account for some of the
heterogeneity, except for one study that involved extensive
training, where the HCS was more accurate than in any other
study.20
Sensitivity and specificity of HCS for the detection
of severe anaemia
Six studies detected and reported HCS results for severe anaemia
(defined as Hb , 7 or ,6 g/dl).10,17,18,20,26,28 Sensitivity was
high (.0.84) in most of the studies but in the only ‘real-life’
based study it was just 50% (95% CI 12–58%)28 (Figure 2c).
If confirmed, sensitivity as low as 50% would be a cause for
concern. However, clinical diagnosis is more accurate for severe
anaemia compared with mild anaemia, and the HCS is therefore
likely to be most useful for detecting mild to moderate cases.
Specificity was very high in all the studies, varying between
86% (95% CI 84–88%) and 99% (95% CI 99–100%) (Figure 2d).
Most of the studies only had a small number of patients who were
severely anaemic, so larger sample sizes and further studies are
required to assess the diagnostic accuracy of HCS for severe
anaemia (Figure 4).
Comparison of alternative diagnostic tests
for anaemia with HCS
Clinical examination
Five studies reported diagnostic accuracy for anaemia using both
HCS and clinical examination10,17,24,26,28 (usually by assessing
HAEMOGLOBIN COLOUR SCALE
conjunctival pallor), compared with the same reference standard. In two of the studies the clinical examination was far more
intense than would be the case in practice. It involved assessing
pallor of nail beds, conjunctiva and palms, and grading the
appearance of each site as normal, pale, or very pale.10,26 Despite
this, in most studies clinical examination performed poorly,
especially for detecting mild and moderate levels of anaemia.
Munster 1997
Lewis 1998
1431
Sensitivity varied from 33% (95% CI 29–38%) to 57% (95%
CI 34–79%) and specificity from 79% (95% CI 73–79%) to 84%
(95% CI 79–88%). The HCS generally had better sensitivity,
but not necessarily better specificity, than clinical examination.
Only one study reported that clinical examination performed
better than the HCS.24
Other tests
One study in Egypt compared the diagnostic accuracy of the
‘Sahli’ method (a subjective method based on visual comparison) and HCS with the HemoCue reference standard.25
Sensitivity was similar for both the methods (92.3%, 95% CI
75–99% for Sahli compared with 88.5%, 95% CI 70–97% for
HCS), but in both the tests specificity was very low (48.8%, 95%
CI 40–60% and 39.0%, 95% CI 30.5–48% for HCS and Sahli,
respectively). The authors preferred the HCS as it is quicker to
carry out than the Sahli method. Three studies compared HCS
with the copper sulphate method for screening blood donors and
found that both methods had similar sensitivity and specificity
when using HemoCue or haemiglobincycanide as the reference
methods20,21,29 (Table 1). One study used results of the HCS, in
combination with anaemia symptoms and clinical examination,
as part of a scoring system for anaemia, and showed higher
sensitivity and specificity than the HCS alone.24
19
18
Ingram 2000 17
Lewis 200121
Timan 2004 20
Tatsumi 1999 29
Coetzee 2000 27
Montresor 2000 26
Montresor 2003 10
Van Den Broek 1999 28
Barduagni 2003 25
24
Gies 2003
Verhoef 2004 23
0
4
8
Ln DOR
Discussion
12
We have identified and extracted data from 14 studies to provide
an overview of the evidence available about the accuracy and
usefulness of the HCS in anaemia diagnosis in laboratorybased and ‘real-life’ settings. The HCS has only been available
since 2001, so there are relatively few publications at present
(Figure 1). Assessing the methodological quality of these studies
0.2
0.4
Sensitivity
0.6
0.8
1
Figure 3 Log diagnostic odds ratios (DORs) for the HCS. DORs and
11
95% CI of the HCS for anaemia for each study are shown. Paddle and
19
Munster are omitted as we could not calculate the DOR. s 5
laboratory (efficacy) study, n 5 unclear, 5 field-based (effectiveness)
study
0
0.2
1 - Specificity
HCS
0.4
0.6
Clinical
Figure 4 Comparison of sensitivity and specificity for HCS with clinical diagnosis. Estimates of sensitivity and specificity for both the HCS and clinical
diagnosis are displayed. The lines connect estimates of HCS and clinical diagnosis from the same study. The circles represent HCS and diamonds clinical
diagnosis. The ‘best’ tests (in terms of sensitivity and specificity) are towards the top left of the graph. The figure shows that sensitivity was much better
for the HCS than clinical diagnosis (except in one study where they are about the same). However, specificity was often similar or better for clinical
diagnosis
1432
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY
and interpreting their results was difficult due to poor reporting
of the study design and methods; different diagnostic criteria
for anaemia; and frequently small sample sizes, particularly for
severe anaemia. We have identified gaps in the evidence and
clear evidence of heterogeneity between study settings. As with
any systematic review there is a possibility of bias due to the
methods of selecting the publications. However, the search was
comprehensive and supplemented by contacting key workers in
the field and WHO, and data were extracted independently by
two researchers.
Despite these limitations, our review suggests that the HCS
may be a useful tool for anaemia diagnosis. Sensitivity and
specificity were very high in laboratory-based studies, but
deteriorated substantially as the studies moved from accuracy
assessments in ‘ideal’ populations in the laboratory to more ‘reallife’, pragmatic evaluations. The most likely explanation for
this is the less intensive training and supervision available in
the field. Estimates of sensitivity and specificity appeared
heterogeneous between studies, but estimates of DORs were
reasonably consistent within our laboratory-based or ‘real-life’
sub-groups, suggesting that the study ‘setting’ can account for
some of the heterogeneity between included studies. The
reduced HCS accuracy in field studies implies that further
research is required to define and evaluate its role in primary
health care.
At present the most widely used method for detecting
anaemia in settings where there is no laboratory is clinical
diagnosis, and this is well recognized (confirmed by this review)
to be useful only for very marked anaemia. The HCS performed
better than clinical diagnosis alone for detecting mild and
moderate anaemia, but this advantage is reduced as anaemia
becomes more severe and hence more clinically obvious. Even
for severe anaemia, the HCS may have the advantage of
being semi-quantitative, whilst clinical diagnosis is always
qualitative. The HCS is likely to be most useful in areas with
moderate to high anaemia prevalence, though there may be a
trade-off with lower specificity—in which case the costs and
implications of a significant number of false positive results for
patients and the health services must be considered. Further
studies are therefore needed to assess the benefits and
disadvantages of using the HCS in specific patient populations
(e.g. ante-natal women and young children) in high prevalence
rural areas. The aim of a device such as the HCS should be to
improve identification of anaemic individuals, and not to replace
clinical examination, so it is important to investigate the most
effective way of combining these approaches. One study used a
scoring system based on a combination of HCS results and
anaemia symptoms and signs, and achieved higher sensitivity
and specificity than the HCS alone.24 Future studies should
therefore include an assessment of combinations of parameters
in an attempt to optimize anaemia diagnosis in resourcepoor settings.
Although evidence is now available about the accuracy of the
HCS, we lack studies addressing issues relevant to widescale
implementation and it is therefore difficult to draw conclusions
about its usefulness in practice. Future studies need to be
pragmatic in design (i.e. to assess diagnostic accuracy in local
health care settings with local workers) and should report clearly
their method of sampling and characteristics of the patient
population. Training methods should be formally evaluated, and
should also be practical. Some studies provided up to 2 days
training,10 which is burdensome, expensive, and may not be
feasible in practice. The introduction of HCS into settings where
there is no laboratory will require careful planning as systems for
supplying and safely disposing of sterile lancets will need to be
established. Realistic studies in primary care settings are needed
which compare the costs and effectiveness of the HCS with the
current routine method of clinical diagnosis of anaemia. These
studies need to be planned early in the development process
and should evaluate longer-term clinical outcomes (such as
better management of anaemia, less anaemia at delivery among
pregnant women, and increased birth weight), rather than
simply reporting on diagnostic accuracy. Such studies will
require a multi-disciplinary approach, involving, for example,
health planners, social scientists, economists, and epidemiologists as well as clinicians.
Conclusion
The HCS has the potential to be the most appropriate tool
currently available for the detection of mild and moderate
anaemias, which are likely to be missed by clinical diagnosis
alone, but which still require treatment. It is inexpensive, rapid
and simple to use8, and the few studies available suggest that
under ideal conditions it is reasonably accurate. Although
studies are limited, results are promising and justify investment
in evaluating its use on a larger scale and in real-life situations.
The next step on the pathway to implementation should be to
assess its usefulness in combination with clinical diagnosis in
practical situations, such as paediatric and ante-natal clinics in
developing countries where there is no laboratory. This type of
research requires a different approach and research skill-mix
from efficacy studies.
Acknowledgements
This article is based on a background paper commissioned by the
World Health Organization for an international expert consultation in May 2004. Funding was also received from the
Department for International Development UK (Effective Health
Care Research Consortium and Malaria Knowledge Programme).
Nevertheless, the views expressed in this article are those of the
authors alone. These organizations accept no responsibility for
the information or views expressed in this paper. The authors
thank Nynke Van Den Broek for supplying additional information from her study. Role of funding source: the funders played no
role in study design, collection, analysis, interpretation of data,
writing of the report, or in the decision to submit the paper for
publication. They accept no responsibility for the contents.
Contributions of authors: J.C. assessed studies for inclusion,
extracted data, and assessed study methodological quality,
carried out analyses, and is responsible for preparing the
article. I.B. assessed studies for inclusion, extracted data, and
assessed study methodological quality, contributed to analyses,
obtained initial funding and helped write the article. Both J.C.,
and I.B. contributed to the interpretation and revision of
the manuscript.
Conflicts of interest
The authors have declared no conflicts of interest.
HAEMOGLOBIN COLOUR SCALE
1433
Supplementary data
Supplementary tables can be found at IJE Online (http://
ije.oxfordjournals.org).
KEY MESSAGES
The HCS is simple, rapid, inexpensive, and suitable for assessing haemoglobin levels on finger prick blood samples
in resource-poor settings where there is no laboratory.
The accuracy of the scale is high in laboratory-based studies but deteriorates as studies become more field-based
and ‘real life’, and where prevalence of anaemia is low.
The HCS is a promising tool for diagnosing anaemia where there is no laboratory, but more research is needed to
determine the usefulness of the HCS in ‘real-life’ situations and to assess its effectiveness in improving clinical
outcomes.
References
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
in Health Care. Meta-analysis in Context. London: BMJ Publishing
Group, 2001.
Murray CJ, Lopez AD. Global mortality, disability, and the contribution of risk factors: Global Burden of Disease Study. Lancet 1997;349:
1436–42.
16
Schellenberg D, Armstrong Schellenberg JRM, Mushi A et al. The
silent burden of anaemic in Tanzanian children: a community-based
study. Bull World Health Organ 2003;81:581–90.
17
McDermott JM. Prospective assessment of mortality among a
cohort of pregnant women in rural Malawi. Am J Trop Med
1996;55:66–70.
Van den Broek NR. The relationship between asymptomatic human
immunodeficieny virus infection and the prevalence and severity
of anaemia in pregnancy Malawian women. Am J Trop Med
1998;59:1004–07.
18
19
20
WHO/NHD. Iron deficiency anaemia: assessment, prevention and
control. Geneva: World Health Organization, 2001.
Shulman CE, Levene M, Morison L, Dorman E, Peshu N, Marsh K.
Screening for severe anaemia in pregnancy in Kenya, using pallor
examination and self-reported morbidity. Trans R Soc Trop Med Hyg
2001;95:250–55.
Bain B, Bates I. Basic haematological techniques. Dacie and Lewis
Practical Haematology. London: Church, Livingstone, 2001, pp. 20–22.
21
22
23
Lara AM, Mundy C, Kandulu J, Chisuwo L, Bates I. Choosing a
haemoglobin method for district hospitals in Malawi. J Clin Pathol
2005;58:56–60.
Stott GJ, Lewis SM. A simple and reliable method for estimating
haemoglobin. Bull World Health Organ 1995;73:369–73.
24
Montresor A, Ramsan M, Khalfan N et al. Performance of the
haemoglobin colour scale in diagnosing severe and very severe
anaemia. Trop Med Int Health 2003;8:619–24.
25
Paddle JJ. Evaluation of the haemoglobin colour scale and comparison
with the hemocue haemoglobin assay. Bull World Health Organ
2002;80:813–16.
Essential Health Technologies. Haemoglobin colour scale bibliography.
Blood Transfusion and Clinical Technology, Geneva, 2004.
Irwig L, Tosteson ANA, Gatsonis C et al. Guidelines for meta-analyses
evaluating diagnostic tests. Ann Intern Med 1994;120:667–76.
Whiting P, Rutjes AWS, Reitsma JB, Bossuty PMM, Kleijnen J. The
development of QUADAS: a tool for the quality assessment of studies
of diagnostic accuracy included in systematic reviews. BMC Med Res
Methodol 2003;3:1–13.
Deeks JJ. Systematic reviews of evaluations of diagnostic tests.
In: Egger M, Davey Smith G, Altman DG, (eds). Systematic Reviews
26
27
28
Deeks JJ. Systematic reviews in health care: systematic
reviews of evaluations of diagnostic and screening tests. BMJ
2001;323:157–62.
Ingram CF, Lewis SM. Clinical use of who haemoglobin colour scale:
Validation and Critique. J Clin Pathol 2000;53:933–37.
Lewis SM, Stott GJ, Wynn KJ. An inexpensive and reliable new
haemoglobin colour scale for assessing anaemia. J Clin Pathol
1998;51:21–24.
Munster M, Lewis SM, Erasmus LK, Mendelow BV. A simple and
reliable method for estimating haemoglobin. Bull World Health Organ
1995;73:369–73.
Timan IS, Tatsumi N, Aulia D, Wangsasaputra E. Comparison of
haemoglobinometry by who haemoglobin colour scale and copper
sulphate against haemiglobincyanide reference method. Clin Lab
Haematol 2004;26:253–58.
Lewis SM, Emmanuel J. Validity of the haemoglobin colour scale in
blood donor screening. Vox Sang 2001;80:28–33.
Glas A, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PMM. The diagnostic
odds ratio: a single indicator of test performance. J Clin Epidemiol
2003;56:1129–35.
Verhoef H, Hoervorst R, West CE, van den Broek NR, Kok FJ.
Assessment of a simple test to detect anaemia in developing countries.
In: Verhoef H (ed.). Iron Deficiency and Malaria as Determinants of
Anaemia in African Children. Wageningen: Wageningen University,
2001.
Gies S, Brabin BJ, Yassin MA, Cuevas LE. Comparison of screening
methods for anaemia in pregnant women in Awassa, Ethiopia. Trop
Med Int Health 2003;8:301–09.
Barduagni P, Ahmed AS, Curtale F, Raafat M, Soliman L. Performance
of sahli and colour scale methods in diagnosing anaemia among
school children in low prevalence areas. Trop Med Int Health
2003;8:615–18.
Montresor A, Albonico M, Khalfan N et al. Field trial of a haemoglobin
colour scale: an effective tool to detect anaemia in preschool children.
Trop Med Int Health 2000;5:129–33.
Coetzee MJ, Writes R, van Zyl M, Ferreira C, Pieters H. Evaluation of a
World Health Organization colour scale for detection of anaemia in a
haematology clinic. S Afr Med J 2000;90:489.
Van Den Broek NR, Ntonya C, Mhango E, White SA.
Diagnosing anaemia in pregnancy in rural clinics: assessing the
potential of the haemoglobin colour scale. Bull World Health Organ
1999;77:15–21.
1434
29
30
INTERNATIONAL JOURNAL OF EPIDEMIOLOGY
Tatsumi N, Bunyaratvej A, Timan IS et al. Field evaluation of WHO
hemoglobin color scale in West Java. Southeast Asian J Trop Med Public
Health 1999;30(Suppl. 3):177–81.
Kumar A, Sundaram EB, Vyas MK, Britt RP. Use of the new WHO color
scale for haemoglobin estimation in urban and rural clinics. Indian Soc
Hematol Transfus Med 1999;17:14–15.
31
32
Gosling R, Walraven G, Manneh F, Bailey B, Lewis SM. Training health
workers to assess anaemia with the WHO haemoglobin colour scale.
Trop Med Int Health 2000;5:214–21.
White SA, Van Den Broek NR. Methods for assessing reliability and
validity for a measurement tool: a case study and critique using the
WHO haemoglobin colour scale. Stat Med 2004;23:1603–19.