Published by Oxford University Press on behalf of the International Epidemiological Association The Author 2005; all rights reserved. Advance Access publication 30 September 2005 International Journal of Epidemiology 2005;34:1425–1434 doi:10.1093/ije/dyi195 Haemoglobin colour scale for anaemia diagnosis where there is no laboratory: a systematic review Julia Critchley* and Imelda Bates Accepted 31 August 2005 Background Anaemia is a major public health problem, in poor countries most of the cases are diagnosed clinically. This is inaccurate and the haemoglobin colour scale (HCS) has been developed as an inexpensive, simple alternative for assessing anaemia. Laboratory and community studies have assessed its diagnostic accuracy, but controversy over its validity and usefulness remains. We carried out a systematic review to identify and summarize studies, explain heterogeneity, and make recommendations for future research. Methods We searched electronic databases (MEDLINE, EMBASE, CINAHL, and Science Citation Index), checked documents and references, and contacted experts. We included all the studies comparing HCS diagnostic accuracy with a reference standard. Both reviewers independently screened titles and abstracts, assessed studies for inclusion, appraised quality, and extracted data. Results We included 14 studies, mostly from sub-Saharan Africa. Studies had heterogeneous populations, health care settings, anaemia prevalence, and findings. HCS sensitivity for detecting anaemia was high in most of the studies (75–97%); specificity was generally lower (41–98%). Sensitivity and specificity were higher for laboratory-based studies compared with more pragmatic ‘real-life’ studies, and the ‘study setting’ appeared to explain some of the heterogeneity. Five studies compared the HCS with clinical diagnosis; sensitivity was higher for the HCS in four studies, but specificity was often higher with clinical diagnosis. A few studies evaluated the HCS in situations where there was no laboratory. Conclusions The HCS may improve anaemia diagnosis where there is no laboratory, but there is a need for policy-relevant diagnostic research which is pragmatic, implementationfocused and assesses clinical outcomes. This requires a different approach and research skill-mix from efficacy studies. Keywords Anaemia, haemoglobin, haemoglobinometry, developing countries, diagnosis, health plan implementation Introduction 1 Anaemia is the world’s second leading cause of disability and one of the most serious global public health problems. Anaemia affects over half of the pre-school children and pregnant women in developing countries.2 In poorer malaria endemic countries it is one of the commonest preventable causes of death in children ,5 years of age and in pregnant women.3 The HIV pandemic Liverpool School of Tropical Medicine, Liverpool, UK * Corresponding author. International Health Research Group, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK. E-mail: [email protected] is worsening the severity and increasing the prevalence of anaemia.4 Even mild and moderate anaemia has serious clinical consequences. In pregnancy it is associated with increases in stillbirths and low birth weight babies, and in children it causes stunting and impaired intellectual development.5 Diagnosis of anaemia is problematic in resource-poor settings. Clinical diagnosis is the most common method of detecting anaemia but this is inaccurate. For example, it is only 65% sensitive for detecting haemoglobin levels of ,7 g/dl in pregnant women in Kenya6 and is even less useful for detecting non-severe anaemia. Most of the existing methods for measuring haemoglobin are expensive, complex, or impractical for use 1425 1426 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY in primary care in poor countries. The ‘gold standard’ haemiglobincyanide method uses inexpensive but toxic consumables, depends on accurate sample dilutions, and requires regular supervision and careful quality control processes.7 Other methods, such as the HemoCue system, are simple to operate but use costly disposable cuvettes making them too expensive for widespread use.8 A cheap reliable means of assessing haemoglobin levels is therefore required for use in places where there is no laboratory. Development and use of the haemoglobin colour scale The haemoglobin colour scale (HCS),9 a simple, rapid and cheap method for estimating haemoglobin concentration with a finger prick blood sample, has been developed for use in resource-poor settings where there is no laboratory. The method relies on comparing the colour of a drop of blood absorbed onto a filter paper with standard colours on a laminated card, varying from pink to dark red. These colours correspond to haemoglobin levels of 4, 6, 8, 10, 12, and 14 g/dl. Intermediate shades can be identified, allowing haemoglobin levels to be judged to 1 g/dl. This is the only method currently available that is affordable for widespread primary care use by the most impoverished countries. A diagnostic test for assessing anaemia in communities and primary care facilities in resource-poor settings should be inexpensive, rapid, simple to perform on a finger prick sample, and reasonably accurate.8 Because the HCS is divided into increments it is not surprising that studies have shown it to be less accurate than the laboratory methods of measuring haemoglobin [such as the hemiglobincyanide (HCN) and HemoCue methods] which measure haemoglobin levels to the nearest 0.1 g/dl. Interpretation of HCS results is highly variable depending on the context. Opinions vary from the recognition that the HCS is accurate enough to be useful for anaemia diagnosis in resource-poor settings10 to suggestions that its poor accuracy ‘renders its use questionable’.11 These opinions partly reflect the wide range of results from studies of the accuracy of the tests in different contexts, and a lack of clarity in defining whether these studies were designed to assess accuracy or usefulness of the HCS in practice. To provide an overview of the evidence available to support widespread introduction of the HCS we have performed the first systematic review to assess the objectives, quality, and results of all published studies evaluating the performance of the HCS compared with reference methods and clinical diagnosis. We have examined possible explanations for heterogeneity between studies, and developed recommendations for research strategies assessing the practical usefulness of such diagnostic tests in resource-poor settings. Methods Types of studies included We included any study comparing the HCS with a ‘gold standard’ method of measuring anaemia, such as haemiglobincyanide (cyanmethaemoglobin method)7, in a population-based or hospital-based sample. For pragmatic reasons, we also included HemoCue as a ‘gold standard’ method of measuring anaemia in poorer tropical countries. Identification and selection of studies Electronic databases, MEDLINE, EMBASE, CINAHL, and Science Citation Index (up to May 2005), were searched using keywords ‘Haemoglobin Colour Scale’ and alternative spellings (hemoglobin or color) and the full text of any articles that appeared relevant was retrieved. Both authors assessed studies for inclusion, using a pre-designed, piloted inclusion form. We also searched for relevant studies by screening titles and abstracts listed in the WHO Haemoglobin Colour Scale bibliography,12 by checking lists of references in relevant studies (all included studies and reviews), and by personal communication. Data extraction and assessment of methodological quality Both reviewers’ extracted data using a pre-designed form. This included information on whether the studies were designed to assess accuracy or usefulness in practice; study characteristics, such as country and location; definitions of anaemia and anaemia prevalence; reference (‘gold standard’) method of measuring anaemia; health care setting and workers; any training received, including the length and type of training; sample size; statistical methods; and results. Studies were categorized by both reviewers as essentially ‘laboratory based’ (efficacy studies), ‘real-life’ (effectiveness studies), or ‘unclear’ on the basis of two criteria: (i) whether the HCS was tested using samples from patients in clinic settings (real-life), or samples of known haemoglobin from laboratories (laboratory based) and (ii) level of training and supervision received by those using the HCS (in ‘real-life’ studies users received not more than 1 h of training, more extensive training and constant supervision were classified as laboratory based). Studies classified as ‘unclear’ contained elements of both the study types (such as real-life but with extensive training), or did not report sufficient details in order to make a judgement. The data extraction form was designed by consulting accepted guidelines and empirical evidence for the importance of various aspects of the design of diagnostic studies,13 and was piloted on two studies before use. We contacted authors for more details where specific items [such as 95% confidence intervals (95% CIs)] were not reported. Methodological quality of studies was assessed by both reviewers, adapting validated guidelines specifically developed for assessing the quality of diagnostic studies.13–15 The included items are as follows: Clear descriptions of selection criteria, methods of testing, and the reference standard (‘gold standard’) used. Whether blood for the reference standard and HCS were collected and analysed at the same time. Whether the HCS test results were interpreted without knowledge of the results of the reference standard, and the results of clinical examination if applicable. Whether the same clinical data were available when HCS results were interpreted as would be available when the test is used in practice. Whether ‘intermediate’ HCS results were reported. Explanation of withdrawals from the study population. Whether results from all the health workers were included in the final analysis. HAEMOGLOBIN COLOUR SCALE Whether the spectrum of patients was representative of those who will receive the test in practice. Data analysis We planned to pool estimates of sensitivity and specificity of the HCS across studies wherever appropriate, stratified by ‘reallife’ or ‘laboratory’ based. However, statistical methods for combining results of diagnostic tests are at an early stage of development.15 In particular, it would be inappropriate to combine estimates of sensitivity and specificity if there is a marked heterogeneity in these estimates, if the disease spectrum is different, or if the diagnostic cut-offs for anaemia are varying between studies.16 Due to the clinical and statistical heterogeneity between studies we did not combine them statistically, except for the laboratory-based studies with similar findings.17–21 Sensitivity and specificity were stratified by the study type. Diagnostic odds ratios (DORs—ratios of odds of a ‘positive’ HCS result in patients with anaemia compared with a patient without anaemia, the larger the value the stronger the diagnostic evidence) were calculated as an overall measure of test accuracy (see the web version of Table 1, available at IJE Online).22 Many studies did not provide any estimate of precision for these statistics (such as a 95% CI) so wherever possible we estimated these from the raw data using StatsDirect, or obtained them from the authors. Results Description of studies Fourteen studies were included (Figure 1 and Table 1).10,11,17– 21,23–29 Two studies from the WHO’s HCS bibliography12 were assessed but excluded because they were pilot studies, primarily concerned with training.30,31 Development of the HCS has been an iterative process, and several of the included studies tested prototype versions of the HCS.17,27,29 For example, in one study users found it difficult to differentiate between values of 10 and 12, and the shades were later adjusted;17 another study did not allow blood to soak into the paper for 30 s,27 and Tatsumi29 experimented with altering the amount of blood, suggesting that the method was not optimized. The objectives, type of patients, and the health care setting in which the HCS was used varied between studies making it difficult to generalize the results. Some studies were carried out under ‘ideal’ settings or in patient populations that are not representative of those presenting to primary care health facilities in developing countries.11,17,25,29 Six studies were ‘laboratory based’.11,17–21 These assessed the ability of health care workers to obtain accurate results under supervision using the HCS, compared with a reference method. Four studies were ‘real-life’, and were carried out without supervision in antenatal clinics, schools, or rural research clinics after a brief training.23–25,28 The remaining four10,26,27,29 were not clearly ‘laboratory-based’ or ‘real-life’ but contained elements of both, e.g. laboratory testing but with brief training,29 use of HCS in a field setting but as part of a randomized controlled trial, and with extensive training.10 Most of the studies were conducted in sub-Saharan Africa (Ethiopia, Tanzania, South Africa, Zimbabwe, Malawi, Kenya). 1427 Other studies include populations from the UK (3), Indonesia (2), and one each from Egypt, Malaysia, India, Honduras, Thailand, and Switzerland. Definitions of anaemia and anaemia prevalence varied between studies. Most of these defined a cut-off for anaemia as a haemoglobin level either ,11 g/dl or ,12 g/dl. Some studies also assessed the diagnostic accuracy of the HCS at other cut-offs but the definitions of these varied. ‘Severe’ anaemia was generally defined as Hb ,7 g/dl or ,6 g/dl, but one study also defined a cut-off for ‘very severe’ anaemia as ,5 g/dl. A few studies used different definitions for men and women. Anaemia prevalence varied dramatically, from 5 to 85%. The reference methods used to diagnose anaemia included HemoCue (7 studies), automated analyser (7), photometer (3) and the copper sulphate method (2). Most of the studies provided training for health care workers using the HCS, though the methodology differed between studies and was usually not clearly described. Several studies also assessed the diagnostic accuracy of other methods, such as clinical examination for anaemia (5 studies), the copper sulphate method (3 studies), the Sahli method (1 study), and the Tallquist paper scale (1 study), against the reference method. Methodological quality of studies (web Table 2) In all the studies, health workers using the HCS appear to have been ‘blind’ to results of other methods of measuring haemoglobin. Many studies did not clearly describe their criteria for selecting health workers or patients, or the characteristics or completeness of their sample. This severely hinders interpretation of the results as the sensitivity and specificity of any test may vary in different populations (e.g. by sex, age or disease severity). Several studies used venous blood samples,11,17,20,28,29 whereas finger prick samples would be the most likely source of blood if the HCS was used at a community or a primary care level. Two studies used a venous blood sample for the reference standard (automated haematology analyser) but used a finger prick blood sample for the HCS measurement, thereby introducing the possibility of bias.20,28 Some studies excluded results from health workers who were retrospectively shown to have performed poorly on the HCS17,23 or allowed health care workers to estimate HCS more than once, without explaining how any discrepancies were resolved.19,20,29 The HCS measures haemoglobin in 2 g/dl increments. Some studies ‘allowed’ intermediate values (to the nearest 1 g/ dl)10,21,26,28 while others did not, with implications for calculations of diagnostic accuracy. In some studies inappropriate statistics were used to assess the agreement between the HCS and the reference standard (such as correlation coefficients).32 Sensitivity and specificity of HCS for the detection of anaemia Thirteen studies reported estimates of sensitivity or specificity for anaemia using the HCS10,17–21,23–29 and were categorized as ‘real-life’, ‘laboratory based’, or unclear (Figures 2a and b). Estimates of sensitivity and specificity from the five laboratorybased studies were all high, varying from 0.85 to 0.99 for sensitivity and from 0.91 to 1.0 for specificity.17–21 All used the same cut-off for diagnosing anaemia (Hb , 12 g/dl), so a pooled estimate was calculated from the four studies for which the 1428 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY Table 1 Characteristics of the included studies Study reference (location) Anaemia prevalence Patient population Main results 23 (Kenya) 60% 178 children (2–36 months) Sens: 0.76% (CI 0.68–0.84) 20 (Indonesia) 38% 240 blood donors, 150 samples from the haematology unit Spec: 0.44% (CI 0.33–0.56) 2.7% severe Sens: Anaemia: 0.99 (CI 0.96–1.0) Severe anaemia: 1.0 (CI 0.40–1.0) Spec: Anaemia: 1.0 (CI 0.98–1.0) Severe anaemia: 0.99 (CI 0.95–0.99) 24 (Southern Ethiopia) 15.1% 403 pregnant women 10.4% mild, 4.2% moderate, 0.3% severe 10 (Tanzania) Sens: 0.33–0.44 (CI 0.12–0.78) Spec: 0.87–1.0 (CI 0.83–1.0) 85% anaemic, (29% severe, 9% very severe) 990 children and 643 gynae/ obstetric patients 83% anaemia (20% severe) 1529 pregnant women Sens: Anaemia: 0.97 (CI 0.96–0.97) Severe anaemia: 0.92 (CI 0.89–0.94) Very severe anaemia: 0.92 (CI 0.84–0.94). Spec: Anaemia: 0.41 (CI 34–47) Severe anaemia: 0.86 (CI 0.84–0.88) Very severe anaemia: 0.90 (CI 0.88–0.91) 25 (Upper Egypt) 17.5% 150 children (aged 6–11) Sens: 0.88 (CI 0.70–0.97) 11 (UK) Not relevant 408 adult and child inpatients Not possible to calculate sensitivity/specificity 21 (Multi-centre) ~5% 2801 blood donors Sens: 0.85 (CI 0.77–0.91) 26 (Tanzania) 78.5% anaemia 614 children (16–83 months) Sens: Spec: 0.49 (CI 0.40–0.60) Spec: 0.98 (CI 0.97–0.98) 3.5% severe anaemia Anaemia: 0.85 (CI 0.81–0.88), Severe anaemia: 0.74 (CI 0.49–0.90) Spec: Anaemia: 0.77 (CI 0.68–0.84) Severe Anaemia: 0.99 (CI 0.99–1.00) 17 (South Africa) Not reported 548 hospital outpatients Sens: Anaemia: 0.85 (CI 0.81–0.99), Severe anaemia: 0.75 (CI 0.81–0.99) Spec: Anaemia: 0.77 (CI 0.95–0.98) Severe anaemia: 0.99 (CI 0.98–0.99) 27 (South Africa) 36% 104 adult outpatients Sens: 0.7 Spec: 0.85 28 (Malawi) 58.1% (mild) 32% (moderate) 729 pregnant women Sens: Anaemia: 0.78 (CI 0.74–0.8) 4% (severe) Moderate: 0.82 (CI 0.77–0.86) 0.4% (very severe) Severe anaemia: 0.81 (CI 0.65–0.92) Very severe: 0.5 (CI 0.12–0.58) Spec: Anaemia: 0.50 (CI 0.46–0.55) Moderate 0.45 (CI 0.42–0.49) Severe anaemia: 0.74 (CI 0.74–0.79) Very severe: 0.99% (CI 0.98–0.99) HAEMOGLOBIN COLOUR SCALE 1429 Table 1 continued Study reference (location) Anaemia prevalence Patient population Main results 29 (Indonesia) 16% 120 Children Sens: 0.23 (CI 0.13–0.36) 28 (Multi-site reference centres) 67% 1213 random blood samples Sens: Spec: 0.97 (CI 0.89–1.0) (severe 14%) Anaemia: 0.86 (CI 0.83–0.89) Severe: 0.94 (CI 0.89–0.97), Spec: Anaemia: 0.91 (CI 0.88–0.93) Severe: 0.98 (CI 0.97–0.99) 28 (South Africa) Not relevant 20 blood samples Inappropriate statistical methods 28 (South Africa and Thailand) Not given 113 blood donors No discrepancies between methods 29 (South Africa) Not given 20 blood samples Sens: Anaemia: 0.89 (CI 0.86–0.93) Severe anaemia: 0.84 (CI 0.77–0.91) Spec: Anaemia: 0.95 (CI 0.91–0.97) Severe anaemia: 0.92 (CI 0.90–0.94) CI 5 95% confidence interval; Sens 5 sensitivity; Spec 5 specificity. Potentially relevant diagnostic studies identified and screened for retrieval (n=67) 51 excluded as not relevant (reviews, opinion pieces or not concerned with HCS) Diagnostic studies retrieved for more detailed evaluation (n=16) Potentially appropriate diagnostic studies to be included in the systematic review (n=14) Diagnostic studies with usable information by outcome included in Figures 2-4 (n=14) 2 studies excluded 1– no reference standard 1– pilot concerned with training 1 excluded from Figure 2 as sensitivity or specificity could not be estimated 2 excluded from Figure 3 as Diagnostic Odds Ratios could not be estimated Figure 1 Flowchart showing selection and inclusion of studies in the HCS systematic review. This illustrates the selection and inclusion of studies for this review (QUOROM statement) true positive, true negative, false negative, and false positive rates could be obtained.17,18,20,21 This yielded an overall sensitivity of 0.87 (95% CI 0.86–0.89) and specificity of 0.97 (95% CI 0.97–0.98). There was no statistically significant heterogeneity between estimates of sensitivity and specificity (c2 test, P , 0.25 for both). Estimates of sensitivity and specificity from the ‘real-life’ or unclear studies were lower and more heterogeneous than for the laboratory-based studies, so a pooled estimate was not considered appropriate. Sensitivity for the ‘real-life’ studies varied between 0.76 and 0.88, apart from one outlier (Figure 2a). Sensitivities lower than this may be clinically unacceptable; for example, a sensitivity of 0.75 implies that 25% of the patients with anaemia will not be diagnosed using the HCS. Specificity was lower for the ‘real-life’ studies (Figure 2b), implying that the HCS may falsely diagnose anaemia in many individuals. Although anaemia treatment is cheap and relatively safe, the potential costs of this over-diagnosis, particularly for patients with very limited resources, must be taken into account in assessing the usefulness of the HCS. DORs for anaemia diagnosis were significantly lower (between 2.5 and 7.5) for ‘real-life’ studies, compared with laboratory-based studies (between 58 and 48 000), P 5 0.0032 (web Table 1, available at IJE Online; Figure 3). The haemoglobin cut-off for diagnosing anaemia varied between studies (11–12 g/dl). As the HCS was read only to the nearest 1 or 2 g/dl in these studies we would not necessarily expect such variations to cause heterogeneity between studies, and there was almost no correlation between study-level estimates of sensitivity and specificity (spearman’s rho 5 0.027, P 5 0.29), suggesting that factors other than a threshold effect account for most of the heterogeneity. These factors may include characteristics of the population (such as anaemia prevalence and patient group), test operator (level of training and experience), or the test itself (optimal version). For example, one of the ‘real-life’ studies24 and one of the unclear studies29 estimated particularly low sensitivity [0.44 (95% CI 0.32–0.56) and 0.23 (95% CI 0.13–0.36), respectively]. However, the former study was carried out in an area of very low anaemia prevalence, and in the latter a prototype version of the HCS scale was 1430 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY (a) Munster 1997 Lewis 1998 (b) 19 Munster 1997 18 Lewis 1998 Ingram 2000 17 21 Lewis 2001 Timan 2004 20 19 18 Ingram 2000 17 21 Lewis 2001 Timan 2004 20 Tatsumi 1999 29 Coetzee 2000 27 Tatsumi 1999 29 Coetzee 2000 27 Montresor 2000 26 Montresor 2003 10 Montresor 2000 26 Montresor 2003 10 Van Den Broek 1999 28 Barduagni 2003 25 Van Den Broek 1999 28 Barduagni 2003 25 Gies 2003 24 Gies 2003 Verhoef 2004 23 24 Verhoef 2004 23 1.0 0.2 0.4 0.6 Sensitivity 0.8 1.0 (c) 0.0 (d) Ingram 2000 17 0.2 0.4 0.6 Specificity 0.8 1.0 0.4 0.6 Specificity 0.8 1.0 Lewis 199818 Ingram 2000 17 Timan 2004 20 Timan 2004 20 Montresor 2000 26 Montresor 2000 26 Montresor 2003 10 Montresor 2003 10 Van Den Broek 1999 28 Van Den Broek 1999 28 1.0 0.2 0.4 0.6 0.8 Sensitivity 1.0 0.0 0.2 Figure 2 (a) Sensitivity of HCS for anaemia, (b) specificity of HCS for anaemia, (c) sensitivity of HCS for severe anaemia (HB , 6 or 7), and (d) specificity of HCS for severe anaemia (HB , 6 or 7). Figures (a–d) show sensitivity and specificity of the HCS for anaemia and severe anaemia 11 with 95% CIs for each study. Paddle’s study is omitted because we could not obtain or calculate sensitivity, specificity, and 95% CI. s 5 laboratory (efficacy) study, n 5 unclear, 5 field-based (effectiveness) study used. For both these studies specificity was high (Figure 2b), suggesting that threshold effects may account for some of the heterogeneity, despite the lack of a clear correlation. Similarly, one of the ‘unclear’ studies appeared to have lower specificity, but sensitivity was high10 (Figures 2a and b). DORs (Figure 3) appear reasonably consistent within the three sub-groups, suggesting that the study settings account for some of the heterogeneity, except for one study that involved extensive training, where the HCS was more accurate than in any other study.20 Sensitivity and specificity of HCS for the detection of severe anaemia Six studies detected and reported HCS results for severe anaemia (defined as Hb , 7 or ,6 g/dl).10,17,18,20,26,28 Sensitivity was high (.0.84) in most of the studies but in the only ‘real-life’ based study it was just 50% (95% CI 12–58%)28 (Figure 2c). If confirmed, sensitivity as low as 50% would be a cause for concern. However, clinical diagnosis is more accurate for severe anaemia compared with mild anaemia, and the HCS is therefore likely to be most useful for detecting mild to moderate cases. Specificity was very high in all the studies, varying between 86% (95% CI 84–88%) and 99% (95% CI 99–100%) (Figure 2d). Most of the studies only had a small number of patients who were severely anaemic, so larger sample sizes and further studies are required to assess the diagnostic accuracy of HCS for severe anaemia (Figure 4). Comparison of alternative diagnostic tests for anaemia with HCS Clinical examination Five studies reported diagnostic accuracy for anaemia using both HCS and clinical examination10,17,24,26,28 (usually by assessing HAEMOGLOBIN COLOUR SCALE conjunctival pallor), compared with the same reference standard. In two of the studies the clinical examination was far more intense than would be the case in practice. It involved assessing pallor of nail beds, conjunctiva and palms, and grading the appearance of each site as normal, pale, or very pale.10,26 Despite this, in most studies clinical examination performed poorly, especially for detecting mild and moderate levels of anaemia. Munster 1997 Lewis 1998 1431 Sensitivity varied from 33% (95% CI 29–38%) to 57% (95% CI 34–79%) and specificity from 79% (95% CI 73–79%) to 84% (95% CI 79–88%). The HCS generally had better sensitivity, but not necessarily better specificity, than clinical examination. Only one study reported that clinical examination performed better than the HCS.24 Other tests One study in Egypt compared the diagnostic accuracy of the ‘Sahli’ method (a subjective method based on visual comparison) and HCS with the HemoCue reference standard.25 Sensitivity was similar for both the methods (92.3%, 95% CI 75–99% for Sahli compared with 88.5%, 95% CI 70–97% for HCS), but in both the tests specificity was very low (48.8%, 95% CI 40–60% and 39.0%, 95% CI 30.5–48% for HCS and Sahli, respectively). The authors preferred the HCS as it is quicker to carry out than the Sahli method. Three studies compared HCS with the copper sulphate method for screening blood donors and found that both methods had similar sensitivity and specificity when using HemoCue or haemiglobincycanide as the reference methods20,21,29 (Table 1). One study used results of the HCS, in combination with anaemia symptoms and clinical examination, as part of a scoring system for anaemia, and showed higher sensitivity and specificity than the HCS alone.24 19 18 Ingram 2000 17 Lewis 200121 Timan 2004 20 Tatsumi 1999 29 Coetzee 2000 27 Montresor 2000 26 Montresor 2003 10 Van Den Broek 1999 28 Barduagni 2003 25 24 Gies 2003 Verhoef 2004 23 0 4 8 Ln DOR Discussion 12 We have identified and extracted data from 14 studies to provide an overview of the evidence available about the accuracy and usefulness of the HCS in anaemia diagnosis in laboratorybased and ‘real-life’ settings. The HCS has only been available since 2001, so there are relatively few publications at present (Figure 1). Assessing the methodological quality of these studies 0.2 0.4 Sensitivity 0.6 0.8 1 Figure 3 Log diagnostic odds ratios (DORs) for the HCS. DORs and 11 95% CI of the HCS for anaemia for each study are shown. Paddle and 19 Munster are omitted as we could not calculate the DOR. s 5 laboratory (efficacy) study, n 5 unclear, 5 field-based (effectiveness) study 0 0.2 1 - Specificity HCS 0.4 0.6 Clinical Figure 4 Comparison of sensitivity and specificity for HCS with clinical diagnosis. Estimates of sensitivity and specificity for both the HCS and clinical diagnosis are displayed. The lines connect estimates of HCS and clinical diagnosis from the same study. The circles represent HCS and diamonds clinical diagnosis. The ‘best’ tests (in terms of sensitivity and specificity) are towards the top left of the graph. The figure shows that sensitivity was much better for the HCS than clinical diagnosis (except in one study where they are about the same). However, specificity was often similar or better for clinical diagnosis 1432 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY and interpreting their results was difficult due to poor reporting of the study design and methods; different diagnostic criteria for anaemia; and frequently small sample sizes, particularly for severe anaemia. We have identified gaps in the evidence and clear evidence of heterogeneity between study settings. As with any systematic review there is a possibility of bias due to the methods of selecting the publications. However, the search was comprehensive and supplemented by contacting key workers in the field and WHO, and data were extracted independently by two researchers. Despite these limitations, our review suggests that the HCS may be a useful tool for anaemia diagnosis. Sensitivity and specificity were very high in laboratory-based studies, but deteriorated substantially as the studies moved from accuracy assessments in ‘ideal’ populations in the laboratory to more ‘reallife’, pragmatic evaluations. The most likely explanation for this is the less intensive training and supervision available in the field. Estimates of sensitivity and specificity appeared heterogeneous between studies, but estimates of DORs were reasonably consistent within our laboratory-based or ‘real-life’ sub-groups, suggesting that the study ‘setting’ can account for some of the heterogeneity between included studies. The reduced HCS accuracy in field studies implies that further research is required to define and evaluate its role in primary health care. At present the most widely used method for detecting anaemia in settings where there is no laboratory is clinical diagnosis, and this is well recognized (confirmed by this review) to be useful only for very marked anaemia. The HCS performed better than clinical diagnosis alone for detecting mild and moderate anaemia, but this advantage is reduced as anaemia becomes more severe and hence more clinically obvious. Even for severe anaemia, the HCS may have the advantage of being semi-quantitative, whilst clinical diagnosis is always qualitative. The HCS is likely to be most useful in areas with moderate to high anaemia prevalence, though there may be a trade-off with lower specificity—in which case the costs and implications of a significant number of false positive results for patients and the health services must be considered. Further studies are therefore needed to assess the benefits and disadvantages of using the HCS in specific patient populations (e.g. ante-natal women and young children) in high prevalence rural areas. The aim of a device such as the HCS should be to improve identification of anaemic individuals, and not to replace clinical examination, so it is important to investigate the most effective way of combining these approaches. One study used a scoring system based on a combination of HCS results and anaemia symptoms and signs, and achieved higher sensitivity and specificity than the HCS alone.24 Future studies should therefore include an assessment of combinations of parameters in an attempt to optimize anaemia diagnosis in resourcepoor settings. Although evidence is now available about the accuracy of the HCS, we lack studies addressing issues relevant to widescale implementation and it is therefore difficult to draw conclusions about its usefulness in practice. Future studies need to be pragmatic in design (i.e. to assess diagnostic accuracy in local health care settings with local workers) and should report clearly their method of sampling and characteristics of the patient population. Training methods should be formally evaluated, and should also be practical. Some studies provided up to 2 days training,10 which is burdensome, expensive, and may not be feasible in practice. The introduction of HCS into settings where there is no laboratory will require careful planning as systems for supplying and safely disposing of sterile lancets will need to be established. Realistic studies in primary care settings are needed which compare the costs and effectiveness of the HCS with the current routine method of clinical diagnosis of anaemia. These studies need to be planned early in the development process and should evaluate longer-term clinical outcomes (such as better management of anaemia, less anaemia at delivery among pregnant women, and increased birth weight), rather than simply reporting on diagnostic accuracy. Such studies will require a multi-disciplinary approach, involving, for example, health planners, social scientists, economists, and epidemiologists as well as clinicians. Conclusion The HCS has the potential to be the most appropriate tool currently available for the detection of mild and moderate anaemias, which are likely to be missed by clinical diagnosis alone, but which still require treatment. It is inexpensive, rapid and simple to use8, and the few studies available suggest that under ideal conditions it is reasonably accurate. Although studies are limited, results are promising and justify investment in evaluating its use on a larger scale and in real-life situations. The next step on the pathway to implementation should be to assess its usefulness in combination with clinical diagnosis in practical situations, such as paediatric and ante-natal clinics in developing countries where there is no laboratory. This type of research requires a different approach and research skill-mix from efficacy studies. Acknowledgements This article is based on a background paper commissioned by the World Health Organization for an international expert consultation in May 2004. Funding was also received from the Department for International Development UK (Effective Health Care Research Consortium and Malaria Knowledge Programme). Nevertheless, the views expressed in this article are those of the authors alone. These organizations accept no responsibility for the information or views expressed in this paper. The authors thank Nynke Van Den Broek for supplying additional information from her study. Role of funding source: the funders played no role in study design, collection, analysis, interpretation of data, writing of the report, or in the decision to submit the paper for publication. They accept no responsibility for the contents. Contributions of authors: J.C. assessed studies for inclusion, extracted data, and assessed study methodological quality, carried out analyses, and is responsible for preparing the article. I.B. assessed studies for inclusion, extracted data, and assessed study methodological quality, contributed to analyses, obtained initial funding and helped write the article. Both J.C., and I.B. contributed to the interpretation and revision of the manuscript. Conflicts of interest The authors have declared no conflicts of interest. HAEMOGLOBIN COLOUR SCALE 1433 Supplementary data Supplementary tables can be found at IJE Online (http:// ije.oxfordjournals.org). KEY MESSAGES The HCS is simple, rapid, inexpensive, and suitable for assessing haemoglobin levels on finger prick blood samples in resource-poor settings where there is no laboratory. The accuracy of the scale is high in laboratory-based studies but deteriorates as studies become more field-based and ‘real life’, and where prevalence of anaemia is low. The HCS is a promising tool for diagnosing anaemia where there is no laboratory, but more research is needed to determine the usefulness of the HCS in ‘real-life’ situations and to assess its effectiveness in improving clinical outcomes. References 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 in Health Care. Meta-analysis in Context. London: BMJ Publishing Group, 2001. Murray CJ, Lopez AD. Global mortality, disability, and the contribution of risk factors: Global Burden of Disease Study. Lancet 1997;349: 1436–42. 16 Schellenberg D, Armstrong Schellenberg JRM, Mushi A et al. The silent burden of anaemic in Tanzanian children: a community-based study. Bull World Health Organ 2003;81:581–90. 17 McDermott JM. Prospective assessment of mortality among a cohort of pregnant women in rural Malawi. Am J Trop Med 1996;55:66–70. Van den Broek NR. The relationship between asymptomatic human immunodeficieny virus infection and the prevalence and severity of anaemia in pregnancy Malawian women. Am J Trop Med 1998;59:1004–07. 18 19 20 WHO/NHD. Iron deficiency anaemia: assessment, prevention and control. Geneva: World Health Organization, 2001. Shulman CE, Levene M, Morison L, Dorman E, Peshu N, Marsh K. Screening for severe anaemia in pregnancy in Kenya, using pallor examination and self-reported morbidity. Trans R Soc Trop Med Hyg 2001;95:250–55. Bain B, Bates I. Basic haematological techniques. Dacie and Lewis Practical Haematology. London: Church, Livingstone, 2001, pp. 20–22. 21 22 23 Lara AM, Mundy C, Kandulu J, Chisuwo L, Bates I. Choosing a haemoglobin method for district hospitals in Malawi. J Clin Pathol 2005;58:56–60. Stott GJ, Lewis SM. A simple and reliable method for estimating haemoglobin. Bull World Health Organ 1995;73:369–73. 24 Montresor A, Ramsan M, Khalfan N et al. Performance of the haemoglobin colour scale in diagnosing severe and very severe anaemia. Trop Med Int Health 2003;8:619–24. 25 Paddle JJ. Evaluation of the haemoglobin colour scale and comparison with the hemocue haemoglobin assay. Bull World Health Organ 2002;80:813–16. Essential Health Technologies. Haemoglobin colour scale bibliography. Blood Transfusion and Clinical Technology, Geneva, 2004. Irwig L, Tosteson ANA, Gatsonis C et al. Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med 1994;120:667–76. Whiting P, Rutjes AWS, Reitsma JB, Bossuty PMM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:1–13. Deeks JJ. Systematic reviews of evaluations of diagnostic tests. In: Egger M, Davey Smith G, Altman DG, (eds). Systematic Reviews 26 27 28 Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ 2001;323:157–62. Ingram CF, Lewis SM. Clinical use of who haemoglobin colour scale: Validation and Critique. J Clin Pathol 2000;53:933–37. Lewis SM, Stott GJ, Wynn KJ. An inexpensive and reliable new haemoglobin colour scale for assessing anaemia. J Clin Pathol 1998;51:21–24. Munster M, Lewis SM, Erasmus LK, Mendelow BV. A simple and reliable method for estimating haemoglobin. Bull World Health Organ 1995;73:369–73. Timan IS, Tatsumi N, Aulia D, Wangsasaputra E. Comparison of haemoglobinometry by who haemoglobin colour scale and copper sulphate against haemiglobincyanide reference method. Clin Lab Haematol 2004;26:253–58. Lewis SM, Emmanuel J. Validity of the haemoglobin colour scale in blood donor screening. Vox Sang 2001;80:28–33. Glas A, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PMM. The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol 2003;56:1129–35. Verhoef H, Hoervorst R, West CE, van den Broek NR, Kok FJ. Assessment of a simple test to detect anaemia in developing countries. In: Verhoef H (ed.). Iron Deficiency and Malaria as Determinants of Anaemia in African Children. Wageningen: Wageningen University, 2001. Gies S, Brabin BJ, Yassin MA, Cuevas LE. Comparison of screening methods for anaemia in pregnant women in Awassa, Ethiopia. Trop Med Int Health 2003;8:301–09. Barduagni P, Ahmed AS, Curtale F, Raafat M, Soliman L. Performance of sahli and colour scale methods in diagnosing anaemia among school children in low prevalence areas. Trop Med Int Health 2003;8:615–18. Montresor A, Albonico M, Khalfan N et al. Field trial of a haemoglobin colour scale: an effective tool to detect anaemia in preschool children. Trop Med Int Health 2000;5:129–33. Coetzee MJ, Writes R, van Zyl M, Ferreira C, Pieters H. Evaluation of a World Health Organization colour scale for detection of anaemia in a haematology clinic. S Afr Med J 2000;90:489. Van Den Broek NR, Ntonya C, Mhango E, White SA. Diagnosing anaemia in pregnancy in rural clinics: assessing the potential of the haemoglobin colour scale. Bull World Health Organ 1999;77:15–21. 1434 29 30 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY Tatsumi N, Bunyaratvej A, Timan IS et al. Field evaluation of WHO hemoglobin color scale in West Java. Southeast Asian J Trop Med Public Health 1999;30(Suppl. 3):177–81. Kumar A, Sundaram EB, Vyas MK, Britt RP. Use of the new WHO color scale for haemoglobin estimation in urban and rural clinics. Indian Soc Hematol Transfus Med 1999;17:14–15. 31 32 Gosling R, Walraven G, Manneh F, Bailey B, Lewis SM. Training health workers to assess anaemia with the WHO haemoglobin colour scale. Trop Med Int Health 2000;5:214–21. White SA, Van Den Broek NR. Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale. Stat Med 2004;23:1603–19.
© Copyright 2026 Paperzz