AJCP / Original Article

Inappropriate Repeats of Six Common Tests in a Canadian City
A Population Cohort Study Within a Laboratory Informatics Framework

CME/SAM

Eric K. Morgen, MD, MPH, FRCPC,1,2 and Christopher Naugler, MD, MSc, FRCPC3

From the 1Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada; 2Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, Canada; and 3Department of Pathology and Laboratory Medicine, University of Calgary and Calgary Laboratory Services, Calgary, Canada.

Key Words: Laboratory utilization; Repeated testing; Inappropriate testing; Cholesterol; HbA1c; Vitamin D; Vitamin B12; TSH; Ferritin

Am J Clin Pathol November 2015;144:704-712
DOI: 10.1309/AJCPYXDAUS2F8XJY

ABSTRACT

Objectives: To identify inappropriate repeats of six common laboratory tests in a population sample of patients, using highly specific criteria based only on repeat time and test value.

Methods: We used a laboratory informatics database to conduct a retrospective cohort study using a population sample of 103,000 patients in the city of Calgary with an index test in 2010 and uniform follow-up of 1 year. We examined six tests (cholesterol, hemoglobin A1c, thyroid-stimulating hormone, vitamin B12, vitamin D, and ferritin) with consensus-based or easily justified criteria for inappropriate repeats based solely on time to repeat and the index test value.

Results: The percentages of tests repeated at 3, 6, and 12 months were 11%, 23%, and 41%, respectively. In total, 16% of these six tests were inappropriately repeated, representing an annual internal cost of $0.6 to $2.2 million (Canadian dollars) and corresponding to population-scaled national estimates for Canada and the United States of $160 million and $2.4 billion, respectively.

Conclusions: Objective definitions based on repeated testing identified 16% of six studied tests as inappropriate, delineating a subset of inappropriate testing that is well suited to automated identification and intervention and that provides a likely lower bound on the true burden of inappropriate testing.

Upon completion of this activity you will be able to:
• define the “Ulysses syndrome” and describe its relation to laboratory testing.
• describe the difficulties in defining “inappropriate testing.”
• discuss the advantages of objective definitions for inappropriate testing.
• approximately quantify minimum burdens of inappropriate testing in tests studied in this article.

The ASCP is accredited by the Accreditation Council for Continuing Medical Education to provide continuing medical education for physicians. The ASCP designates this journal-based CME activity for a maximum of 1 AMA PRA Category 1 Credit™ per article. Physicians should claim only the credit commensurate with the extent of their participation in the activity. This activity qualifies as an American Board of Pathology Maintenance of Certification Part II Self-Assessment Module. The authors of this article and the planning committee members and staff have no relevant financial relationships with commercial interests to disclose. Exam is located at www.ascp.org/ajcpcme.

Laboratory testing is an integral part of the health care system, with many patient encounters resulting in the ordering of laboratory tests and a large proportion of clinical decisions relying on them.1 Consequently, changes in laboratory testing practices have fundamental impacts on both the systemic costs of health care and the care that patients receive.
There is recent evidence that laboratory utilization is increasing above and beyond what can be attributed to inflation and population aging2,3 and that the trajectory appears unsustainable.4 We also know that due to population variation in analyte values and laboratory variation in test results, false-positive results are inevitable. As a result, increasing test volumes also increase unintended patient morbidity due to the “Ulysses syndrome” of investigations and interventions in patients who were actually healthy.5,6 (Coined in 1972, “Ulysses syndrome” refers to an abnormal test result leading to a series of health care adventures [ie, follow-up investigations and/or interventions, accompanied by patient anxiety, potential morbidity, and extra monetary cost] that were ultimately unnecessary, merely leading the patient back to his or her starting point before the abnormal result.) While measuring the true extent of such downstream consequences of false-positive tests is difficult and not frequently attempted in the literature, studies of particular tests have documented increased hospital/pharmacy/laboratory services amounting to thousands of dollars (blood cultures),7 increased interventions and hospitalizations (tuberculosis cultures),8 and short-term psychosocial consequences (cystic fibrosis screening).9

There is a strong perception that a substantial proportion of laboratory tests are unnecessary.6,10-12 Indeed, studies show that differences in laboratory testing volumes among institutions are often not correlated with the intended clinical outcomes13 and thus that excess testing between institutions appears to yield no measurable benefit.14-16 However, definitions of inappropriate testing have proven problematic. One systematic review noted that many studies of inappropriate laboratory testing lacked objective criteria,17 potentially leading to low specificity as well as sensitivity.
A more recent systematic review emphasized the differences in results between studies with objective vs subjective criteria, noting that the former were more dependable and ultimately preferred but appeared to often underestimate inappropriate testing and required improvement.18 A further weakness of most criteria (all subjective criteria and many objective criteria) is the complexity of identifying inappropriate tests. This often requires human intervention and/or the evaluation of patient information not readily available to a testing laboratory, which makes these criteria unsuitable for the automated strategies that could facilitate system-level reductions. This may provide a partial explanation for an apparent lack of any systemic improvements in this area between 1997 and 2012.18

In this context, examining repeated ordering of the same test type on the same patient (ie, repeat testing) holds great promise to expand the scope of objective criteria for inappropriateness while remaining amenable to system-level strategies. Prior investigators have reported that repeat testing within 1 month comprises 30% of test volumes for eight common tests and 63% within 1 year,19 making this a fertile area for potential reductions.

Here, we apply survival analysis to repeated laboratory testing in a population sample of 103,000 patients living in Calgary and the surrounding area. In particular, we examine six common laboratory test types where a reliable (highly specific) assessment of inappropriateness can be made entirely from the test interval and index test value, and we calculate the associated direct annual costs in the study laboratory catchment area, as well as corresponding population-scaled estimates nationally for Canada and the United States.

Materials and Methods

Population and Data Extraction

This study was approved by the University of Calgary Conjoint Research Ethics Board.
Data on laboratory testing were extracted from the information system of Calgary Laboratory Services, the single laboratory provider for the city of Calgary and the surrounding area. This geographic area encompasses approximately 1.4 million people, and essentially all tests occurring within this area are captured within the database.

We selected six test types for the current study that have well-accepted guidelines or easily defined uncontroversial criteria regarding the appropriateness of repeat testing: vitamin B12, cholesterol, thyroid-stimulating hormone (TSH), vitamin D, hemoglobin A1c (HbA1c), and ferritin. The tests are shown in ❚Table 1❚,20-39 along with the criteria used. For all six test types, every test instance occurring in 2010 or 2011 was extracted from the database. For each test instance, the following data were recorded: name of the test, date and time of testing, test value (ie, the numeric result), laboratory-defined upper and lower limits for that test, patient age at the time of testing, patient sex, and type of testing location (ie, outpatient, inpatient, community clinic).

Because the full data set was too large to work with conveniently, we compiled a list of all patients receiving any laboratory test over the 2-year period and used the “SAMPLE” function of the database's structured query language (SQL) to randomly subsample 20% of these patients. The tests of the sampled patients constitute the main data set for further analysis in this study.

Descriptive Analysis

The distribution of test dates over the entire 2-year study period was plotted. For the entire data set as well as each test type, the distribution of abnormality (ie, whether the test was designated as “abnormal” according to the laboratory reference range) was calculated, as were the distributions of patient ages and sexes associated with each test type. In addition, the distributions of age and sex across all studied patients were calculated.
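
The patient-level 20% subsample described under Population and Data Extraction can be illustrated with a short sketch. The study used the database's SQL “SAMPLE” function; the Python below is a stand-in for that step, not the authors' code, and the record layout (a `patient_id` key), the function name, and the seed are all hypothetical.

```python
import random
from collections import defaultdict


def subsample_by_patient(test_records, fraction=0.20, seed=2010):
    """Keep every test belonging to a random `fraction` of patients.

    `test_records` is an iterable of dicts with a 'patient_id' key
    (an assumed layout, not the study's actual schema).
    """
    by_patient = defaultdict(list)
    for record in test_records:
        by_patient[record["patient_id"]].append(record)

    patient_ids = sorted(by_patient)           # stable order for reproducibility
    n_keep = round(len(patient_ids) * fraction)
    rng = random.Random(seed)                  # fixed seed gives a repeatable draw
    kept_ids = set(rng.sample(patient_ids, n_keep))

    # Sampling patients (not individual tests) preserves each kept patient's
    # complete testing history, which repeat-interval analysis depends on.
    return [r for pid in patient_ids if pid in kept_ids for r in by_patient[pid]]
```

Sampling individual test instances instead would break up each patient's sequence of tests and make most repeats invisible, which is presumably why the subsample was drawn by patient.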
❚Table 1❚ Selected Tests With Simple Criteria for Identifying Inappropriate Repeat Testinga

| Test | Criterion | Rationale |
|---|---|---|
| Tests with clear guidelines that can be used to identify inappropriate repeat testing | | |
| Total cholesterol | No repeats <12 weeks20 | Serum cholesterol changes slowly, and repeat testing is not useful before the interval specified.21-23 Exception: A lipid panel might reasonably be repeated at 1 month in patients undergoing treatment intensification. |
| HbA1c | No repeats <3 months (abnormal values) or <1 year (normal values)26 | HbA1c represents an average effect over several months, and it is not useful to test more frequently.24,25 Diabetic patients not meeting their glycemic goals (HbA1c generally 6%-7%) should be screened every 3 months. Those consistently meeting goals may be screened every 6 months. Patients without diabetes (HbA1c <6%) should be screened every 1 to 3 years, depending on risk profile.26,27 Exception: Pregnant women who require treatment should be tested no more frequently than once per month.26,27 |
| TSH | Should not be repeated within 8 weeks28 | Patients taking thyroxine with a recent dose change should be tested after 8 to 12 weeks. Other clinical categories should be tested no more often than every 2 months.28 |
| Tests without consensus-based guidelines but where a clear argument for particular criteria exists | | |
| Vitamin B12 | No repeats <1 year | This test has low diagnostic accuracy and is not indicated for routine screening, and there are no guidelines to support retesting a patient with a normal result or retesting a patient with an abnormal result unless noncompliance with therapy is suspected.29,30 |
| Vitamin D | No repeats of normal values <1 year | The primary reason to repeat a normal test would be to monitor supplementation to achieve higher levels of vitamin D, which is currently not recommended due to potential risks of renal calculi in the context of no demonstrated benefit for hip fracture31-33 or other potential benefits.34 |
| Ferritin | No repeats of normal values <1 year | Ferritin is indicated for the diagnosis of iron deficiency or iron overload states.35 In both of these situations, a normal result essentially rules out the disease.36,37 It may be appropriate to monitor serum ferritin in the specific clinical situation of patients with transfusion-dependent anemia, but very few of these patients will have normal results.38,39 |

HbA1c, hemoglobin A1c; TSH, thyroid-stimulating hormone.
a These criteria for inappropriate testing rely only on the time interval to repeat and whether the index test was abnormal. Tests are divided into two categories, depending on how well accepted the presented criteria are.

Survival Analysis

All test instances recorded during 2010 were designated as “index tests.” For each index test, the tested patient was followed for exactly 1 year from the date of the index test, looking for a second test instance of the same type. The presence of such a repeat test was used as the outcome of interest for survival analysis. If any repeats were found during the 1-year period, this was designated as a positive outcome (ie, a repeat occurred) for the index test, and the intervening time to the first such repeat was recorded as the time to that outcome. Kaplan-Meier curves were plotted for the entire data set (stratified for various factors), as well as for individual test types, stratified by whether the index test was abnormal by local laboratory criteria. Cox proportional hazards (CPH) modeling was used to test for a significant difference between these two curves.
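
The time-to-first-repeat outcome and the Kaplan-Meier estimate of the fraction repeated can be sketched as follows. This is a minimal pure-Python illustration rather than the software used in the study; the function names are hypothetical, intervals are measured in whole days, and same-day tests are assumed not to count as repeats.

```python
from datetime import date

FOLLOW_UP_DAYS = 365  # uniform 1-year follow-up, as in the study design


def time_to_first_repeat(index_date, later_test_dates):
    """Return (days, repeated) for one index test.

    `later_test_dates` holds dates of same-type tests on the same patient.
    An index test with no repeat inside the window is censored at 365 days.
    Assumption: a test on the same day as the index test is not a repeat.
    """
    gaps = [(d - index_date).days for d in later_test_dates]
    gaps = [g for g in gaps if 0 < g <= FOLLOW_UP_DAYS]
    if gaps:
        return min(gaps), True
    return FOLLOW_UP_DAYS, False


def fraction_repeated(observations, horizon_days):
    """Kaplan-Meier estimate of the fraction repeated by `horizon_days`.

    `observations` is a list of (days, repeated) pairs.
    """
    survival = 1.0
    event_times = sorted({t for t, repeated in observations if repeated})
    for t in event_times:
        if t > horizon_days:
            break
        at_risk = sum(1 for days, _ in observations if days >= t)
        events = sum(1 for days, rep in observations if rep and days == t)
        survival *= 1.0 - events / at_risk   # product-limit step
    return 1.0 - survival
```

With complete 1-year follow-up for every index test, this estimate coincides with the simple proportion of index tests repeated by each time point; the product-limit form matters only if some patients are censored earlier.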

Calculation of Costs

To calculate the cost of individual laboratory tests within the study laboratory, we used internal laboratory data representing two separate numbers: (1) a lower bound on this cost, representing only the variable costs (ie, reagent costs), which are saved whenever a test is not run, and (2) an upper bound on this cost, representing all expenses associated with performing the test, including equipment, reagents, and personnel time, per test instance. To obtain an annual cost range, we multiplied this range by the respective annual volumes of tests within the study laboratory.

To derive statistics on the list prices for each test in the United States and Canada, we requested such prices from laboratories across Canada and the United States, and we calculated the 5th, 50th, and 95th percentile costs for each country from these data to provide an estimate of the range of prices that might be encountered in each country.

To estimate annual national cost burdens for redundant testing in Canada and the United States, we started with the annual volumes of redundant testing identified in this study for our geographic area, encompassing 1.4 million people in Calgary and surrounding areas. We then scaled the volumes of redundant testing up from a population of 1.4 million to the population of Canada (factor of 24.91) and the United States (factor of 224.21). This provided an estimate of the volume of redundant tests of each type for both countries, which could then be multiplied by various potential costs per individual test identified above.

Results

Test and Patient Characteristics

Nearly 400,000 test instances were included in the study, performed on just over 100,000 patients.
The test counts and patient counts for the entire data set and for each studied test type are shown in ❚Table 2❚, along with other univariate descriptive statistics (median age, proportion female, proportion abnormal). The proportion of men and women tested is approximately equal for cholesterol and HbA1c, but women were more likely to be tested for ferritin, TSH, vitamin B12, and vitamin D. The number of tests by month across the 2 study years was plotted (not shown) and demonstrates a generally even distribution, with small seasonal variability and a gradual upward trend consistent with gradually rising test volumes.

❚Table 2❚ Univariate Descriptive Statistics for Selected Laboratory Tests of Calgary Area Residents in 2010a

| Test Type | No. of Tests | No. of Patientsb | Median Patient Age, y | Patient Sex, % Female | Abnormal Results, % of Index Tests | Time to 25% Repeated, mo |
|---|---|---|---|---|---|---|
| Cholesterol | 94,129 | 72,807 | 52 | 52 | 63 | 6.5 |
| HbA1c | 46,539 | 31,142 | 56 | 49 | 55 | 4.2 |
| TSH | 106,495 | 82,082 | 49 | 64 | 8 | 6.5 |
| Vitamin B12 | 28,630 | 24,812 | 50.5 | 63 | 14 | 11.1 |
| Vitamin D | 32,223 | 28,145 | 49 | 64 | 64 | 11.2 |
| Ferritin | 65,447 | 50,958 | 45 | 65 | 16 | 6.5 |
| All six tests | 373,463 | 102,700 | 49 | 59 | 34 | 6.5 |

HbA1c, hemoglobin A1c; TSH, thyroid-stimulating hormone.
a The presented data represent a 20% sample by patient of all patients tested at least once during this period in this geographic area. Because not all tests achieved a median repeat time (ie, time to 50% repeated) during the study, the time to 25% repeated is presented instead.
b Some patients had multiple tests.

Repeat Testing

Survival analysis was performed to analyze repeat testing. The 25th percentile repeat time (ie, the time interval until 25% of all tests are repeated) is displayed in Table 2 (right side) for the entire data set and each test type. This was chosen instead of the median repeat time since most tests did not reach a median repeat time within 1 year.
Overall Kaplan-Meier curves are plotted for each test type, as well as the two curves created by stratification on abnormal vs normal index test results ❚Figure 1❚. The influence of an abnormal index test on the risk of test repetition showed statistically significant effects by CPH modeling for each test type. Hazard ratios and 95% confidence intervals for an abnormal index test were as follows: cholesterol, 1.47 (1.44-1.50); HbA1c, 4.36 (4.24-4.48); TSH, 3.89 (3.79-3.99); vitamin B12, 1.58 (1.49-1.66); vitamin D, 1.30 (1.25-1.36); and ferritin, 1.99 (1.93-2.05).

The percentages and numbers representing inappropriate repeat testing, according to the Table 1 criteria, were calculated and are shown in ❚Table 3❚, along with the calculated cost burden at the study laboratory. The percentages of inappropriate repeats were generally lower for the three tests with well-established consensus guidelines than for the tests without, which may indicate some positive influence of creating consensus guidelines on inappropriate repeat testing.

Test Costs

We received price lists from six Canadian laboratories (encompassing six provinces) and 13 US laboratories (encompassing local laboratories from six states and two national-based laboratories). This provided a range of costs, for which various percentiles were calculated and are presented in ❚Table 4❚ and ❚Table 5❚, along with the estimated cost burden of redundant testing nationally for these six tests in Canada and the United States. The 50th percentiles for cost provide useful point estimates for cost in Canada and the United States of $160 million and $2.4 billion, respectively, while the other estimates provide a better sense for the range of potential costs under different assumptions.

Discussion

Repeat ordering of laboratory tests is common.
Van Walraven and Raymond19 reported, in a population-based study of four million instances of eight common test types, that 30% were repeated within 1 month and 60% within 1 year. There has also been evidence that a substantial portion of such repeats is inappropriate. Stewart et al40 examined patients transferred between institutions and found that 32% had tests repeated within 12 hours, with 20% thought by the study authors to not be clinically indicated. Ganiyu-Dada and Bowcock41 studied three common hematologic tests repeated within an 8-week period and noted that 86% followed a normal result. However, they surveyed physicians who felt that “borderline normal” results were appropriate to repeat, and after excluding these, only 6% of repeat tests followed an unequivocally normal result. Kwok and Jones42 reported that 18% of tests at a tertiary-level immunology laboratory were repeated within 12 weeks and considered these to be redundant. Finally, Bates et al43 created multiple definitions of redundancy and reported rates of inappropriate repeat testing to be between 8% and 30% based on a combination of chart review and repeat interval, depending on the criteria used. These studies have been of great help to determine the magnitude of repeated laboratory testing. They also provide some clues as to what portion may be redundant. 

[Figure 1: six panels, A-F; x-axis, Time to Repeat (mo), 0 to 12; y-axis, Fraction Repeated]

❚Figure 1❚ Kaplan-Meier curves of repeat laboratory testing for Calgary area patients with index tests in 2010. Curves display the cumulative percentage of tests repeated up to each time point and are shown for each of the six tests of interest, stratified by abnormal (black) vs normal (light gray) results, with the nonstratified curve (dark gray) also shown for comparison. For curves corresponding to our proposed criteria for inappropriate repeated testing, vertical dashed lines indicate the time threshold for inappropriate tests, and horizontal dashed lines indicate the corresponding percentage of repeats within that time threshold. The colors of the dashed lines match those of the curve they apply to. A, Cholesterol. B, Hemoglobin A1c. C, Thyroid-stimulating hormone. D, Vitamin B12. E, Vitamin D. F, Ferritin.

❚Table 3❚ Characteristics of Tests Performed in 2010 on Calgary Area Residents That Were Inappropriately Repeateda

| Test Type | Annual Test Count at the Study Laboratory | Percentage Repeated Inappropriately (95% Confidence Interval) | Recoverable Internal Cost of Each Test for the Study Laboratory (CAD)b | Annualized Cost of Inappropriate Repeats at the Study Laboratory (Millions) |
|---|---|---|---|---|
| Cholesterol | 470,645 | 10.5 (10.3-10.7) | $1.00-$5.00 | $0.05-$0.25 |
| HbA1c with normal resultc | 111,225 | 31.3 (30.7-31.9) | $2.00-$7.00 | $0.07-$0.24 |
| HbA1c with abnormal resultd | 121,470 | 24.8 (24.3-25.3) | $2.00-$7.00 | $0.06-$0.21 |
| TSH | 532,475 | 7.2 (7.0-7.3) | $1.00-$5.00 | $0.04-$0.19 |
| Vitamin B12 | 143,150 | 28.4 (27.8-28.9) | $2.00-$7.00 | $0.08-$0.29 |
| Vitamin D with normal resultc | 57,930 | 24.5 (23.8-25.3) | $5.00-$13.00 | $0.07-$0.18 |
| Ferritin with normal resultc | 275,700 | 35.8 (35.4-36.2) | $2.00-$8.00 | $0.2-$0.79 |
| All testse | 1,867,315 | 16.4 (16.3-16.6) | NA | $0.57-$2.15 |

CAD, Canadian dollars; HbA1c, hemoglobin A1c; NA, not applicable; TSH, thyroid-stimulating hormone.
a Presented here are the estimated total annual counts for selected test types at the study laboratory, the percentage that were inappropriately repeated (using the criteria in Table 1), and the associated costs for the study laboratory.
b The lower bound of each range represents only reagent costs (always recoverable when a test is not performed), while the upper bound includes all indirect costs (which may be recoverable if volumes decrease consistently over the long term).
c The percentages and annual test counts for these rows reflect only normal results (ie, the count and percentage of HbA1c index tests with a normal result that were repeated inappropriately).
d The percentages and annual test counts for this row reflect only abnormal results.
e The count in this row reflects all tests (both normal and abnormal) for all six studied test types and thus is larger than the sum of the numbers above it. This also influences the denominator for the percentage calculation.

However, in this respect they suffer from similar drawbacks to the general literature on overtesting, where a systematic review found the area characterized by widely varying definitions, small study sizes, and unvalidated, often subjective criteria.17

In contrast, we have assembled a large study population and chosen tests using uncontroversial, objective definitions for inappropriateness of repeat testing, based on easily available information about the tests. We followed over 100,000 patients (a population-based sample randomly selected from the 1.4 million residents of the study area) for at least a year, monitoring six common tests that together represent 18% of test volumes and 25% of test costs at the study laboratory. Index tests were ascertained over a 1-year period to avoid any biases due to seasonal effects, and each index test had a uniform 1-year follow-up period for the same reason. We selected three tests (cholesterol, HbA1c, and TSH) with consensus-based guidelines regarding the appropriate frequency of repetition, as well as three other tests (vitamin D, vitamin B12, and ferritin) where we could create cutoffs based on straightforward adaptation of existing testing guidelines.

The choice of definitions is a clear advantage to our study, resulting in a set of objective criteria for identifying inappropriate retesting that have three useful implications:

1. Our estimates for the burden of inappropriate testing are very conservative and represent an approximate lower bound for the tests investigated. Our criteria will identify as inappropriate very few repeat tests (or perhaps none, for certain test types) that were actually appropriate.
Furthermore, there are certain to be many repeats at greater intervals than we have targeted that are also inappropriate (ie, not every cholesterol test repeated at 12 or more weeks will be appropriate), and we have not even attempted to target inappropriate tests that are not repeats. Thus, the true fractions of inappropriate tests should be no lower (and are likely much higher) than the numbers we report for these test types.

2. The criteria are not specific to particular health care settings but are applicable in diverse clinical settings and administrative levels. They may be applied in primary care, specialty clinics, or inpatient wards, and they may be applied to a single physician or an entire country.

3. Because the criteria are based on only two simple pieces of information (the repeat interval and whether the index test was abnormal), they are readily amenable to automated identification.

Together, these suggest a dramatically simplified approach for the otherwise daunting task of “reducing inappropriate testing” that could be applied at any administrative level. The types of simple definitions proposed could be used to automatically flag tests that are likely to be inappropriate in electronic health record (EHR) systems. For flagged tests, the computer could then remind the ordering physician of the earlier result (which may have been overlooked) and prompt her or him for a justification if she or he wishes to override the warning. Computer logs of overrides and the justifications given could then be reviewed on an annual basis to monitor the program success, provide feedback to physicians on their ordering habits, and identify situations where the flags may be inappropriate because many overrides are being performed with good justification.

Study Limitations

There are a number of limitations to the current study. First, we examined only six test types, of the hundreds available at most laboratories.
While these six tests represented 25% of study laboratory test volumes by cost, the picture is far from complete regarding inappropriate repeat testing.

❚Table 4❚ Estimated Test Prices Within the Study Laboratory and the Distribution of Surveyed Test Prices Across Canada and the United States

| Test Type | Estimated Internal Cost in This Studya | Canada List Price, 5th Percentile (CAD) | Canada List Price, 50th Percentile (CAD) | Canada List Price, 95th Percentile (CAD) | US List Price, 5th Percentile (USD) | US List Price, 50th Percentile (USD) | US List Price, 95th Percentile (USD) |
|---|---|---|---|---|---|---|---|
| Cholesterol | $1.00-$5.00 | $5.00 | $9.41 | $22.50 | $6.56 | $10.23 | $46.50 |
| HbA1c | $2.00-$7.00 | $13.06 | $19.51 | $56.50 | $15.11 | $30.00 | $129.00 |
| TSH | $1.00-$5.00 | $8.51 | $17.35 | $42.75 | $23.86 | $79.00 | $195.00 |
| Vitamin B12 | $2.00-$7.00 | $14.18 | $21.80 | $40.93 | $22.78 | $32.73 | $99.25 |
| Vitamin D | $5.00-$13.00 | $68.54 | $83.81 | $93.25 | $85.70 | $92.00 | $98.30 |
| Ferritin | $2.00-$8.00 | $8.16 | $19.59 | $49.50 | $20.00 | $25.00 | $39.80 |

All list prices are selected percentiles of surveyed prices for a single test. CAD, Canadian dollars; HbA1c, hemoglobin A1c; TSH, thyroid-stimulating hormone; USD, US dollars.
a Estimated internal costs are rounded to the nearest $1.00 CAD.
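
The cost arithmetic behind the local and national estimates can be made concrete with the cholesterol figures from Tables 3 and 4. The sketch below is illustrative only: the function names are hypothetical, and it assumes, as the Methods describe, that national volumes are obtained by multiplying local inappropriate-repeat volumes by the population factors 24.91 (Canada) and 224.21 (United States).

```python
CANADA_FACTOR = 24.91    # population scaling factors reported in the Methods
US_FACTOR = 224.21


def inappropriate_repeats(annual_count, pct_inappropriate):
    """Annual number of inappropriately repeated tests at the study laboratory."""
    return annual_count * pct_inappropriate / 100.0


def cost_millions(n_tests, price_per_test, scale=1.0):
    """Cost in millions of dollars for n_tests, scaled to a larger population."""
    return n_tests * scale * price_per_test / 1e6


# Cholesterol: 470,645 tests per year, 10.5% inappropriately repeated (Table 3).
chol = inappropriate_repeats(470_645, 10.5)

local_low = cost_millions(chol, 1.00)     # reagent-only internal cost (CAD)
local_high = cost_millions(chol, 5.00)    # full internal cost (CAD)
canada_median = cost_millions(chol, 9.41, CANADA_FACTOR)   # 50th percentile CAD list price
us_median = cost_millions(chol, 10.23, US_FACTOR)          # 50th percentile USD list price

print(f"local: ${local_low:.2f}-${local_high:.2f} million CAD")
print(f"Canada: ${canada_median:.0f} million CAD; US: ${us_median:.0f} million USD")
```

After rounding, these values agree with the cholesterol entries reported in Tables 3 and 5 ($0.05-$0.25 million locally, $12 million for Canada, and $113 million for the United States), which suggests that the published figures follow this simple volume-times-price scaling.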

❚Table 5❚ Estimated Total Cost of Redundant Testing in the Canadian and US Health Care Systems

Estimated national annual cost burden for each cost estimate in Table 4 (Canada in millions CAD; United States in millions USD)

| Test Type | Canada, at Internal Cost | Canada, 5th Percentile | Canada, 50th Percentile | Canada, 95th Percentile | US, 5th Percentile | US, 50th Percentile | US, 95th Percentile |
|---|---|---|---|---|---|---|---|
| Cholesterol | $1.23-$6.16 | $6 | $12 | $28 | $73 | $113 | $515 |
| HbA1c (normal resulta) | $1.73-$6.07 | $11 | $17 | $49 | $118 | $234 | $1,007 |
| HbA1c (abnormal resultb) | $1.5-$5.25 | $10 | $15 | $42 | $102 | $203 | $871 |
| TSH | $0.96-$4.78 | $8 | $17 | $41 | $205 | $679 | $1,676 |
| Vitamin B12 | $2.03-$7.12 | $14 | $22 | $42 | $208 | $299 | $908 |
| Vitamin D (normal resulta) | $1.77-$4.6 | $24 | $30 | $33 | $273 | $293 | $313 |
| Ferritin (normal resulta) | $4.92-$19.67 | $20 | $48 | $122 | $443 | $553 | $881 |
| Totals for all six testsc | $14.14-$53.64 | $94 | $160 | $356 | $1,421 | $2,374 | $6,171 |

CAD, Canadian dollars; HbA1c, hemoglobin A1c; TSH, thyroid-stimulating hormone; USD, US dollars.
a The percentages and annual test counts for these rows reflect only normal results (ie, the count and percentage of HbA1c index tests with a normal result that were repeated inappropriately).
b The percentages and annual test counts for this row reflect only abnormal results.
c Totals may not add up due to rounding.

Furthermore, the types of definitions for inappropriate testing used in this study will not be easily applicable to all laboratory tests. This is particularly true for tests with a wide variety of applications in a wide variety of clinical situations. However, for many tests, the inherent kinetics of the biological markers involved may provide certain limits for reasonable repeat times; for example, the biological half-life of a drug can guide limits for repeat testing of drug levels. And for other tests, deliberation by guidelines committees may reveal reasonable limits on repeat testing intervals in many clinical situations, as they did for many of the tests studied in the current article.
Ultimately, some tests will be well suited to this approach, and others will be poorly suited. However, the current approach represents a useful general strategy that can identify significant volumes of inappropriate testing and has favorable characteristics when feasible.

Second, there are a small number of exceptions to our criteria for inappropriateness, defined in Table 1. For cholesterol, it may be considered appropriate to retest a patient after 1 month (rather than 3 months) if he or she is undergoing a treatment intensification. For HbA1c, pregnant women may be retested at 1 month rather than 3 months. Such exceptions would have to be considered when implementing any plan to reduce unnecessary testing but are likely to have only a small impact on the estimated burden of inappropriate testing.

Third, the criteria for several tests in our study (vitamin B12, vitamin D, and ferritin) were not based on consensus guidelines, rendering them less standardized. However, the argument for these criteria is straightforward. Each of these tests screens for problems (vitamin B12 deficiency, vitamin D deficiency, and iron deficiency) that are relatively uncommon and develop slowly, so after a negative test result it is difficult to justify repeat testing.

Fourth, our approach of targeting repeated laboratory tests will not identify all inappropriate testing. Indeed, one of the most compelling reasons to study inappropriate repeat testing is that it represents a fraction of inappropriate testing that is relatively easier to target and define, in a field rife with vague definitions that make careful study and improvement programs difficult. So while we estimated in Table 3 that 16% of all tests we studied were inappropriate, this really represents only a lower bound on this percentage.
Indeed, the trend observed in Table 3, where tests with formal consensus-defined restrictions about repeat testing tended to have lower rates of inappropriate repeats, implies that the percentage of inappropriate repeats may be higher for tests without clear consensus definitions and supports the potential benefit of creating such guidelines.

Finally, it is quite difficult to estimate costs for laboratory testing. Even within a study laboratory, where it might seem straightforward to estimate the cost of a test, there is controversy due to the distinction between the marginal cost savings of avoiding testing and the price charged by a laboratory for a given test.44 In the simplest case, the reduction of a laboratory test volume by one test will save only the reagent cost of that test and will not affect the fixed costs of operating the laboratory. However, when testing is reduced by larger volumes and over longer periods of time, the marginal cost reduction may include reduced staffing and less need to increase analyzer capacity, thus deferring equipment upgrade/maintenance and staff hiring costs as overall laboratory volumes increase. We have dealt with this issue by providing the reagent costs as a lower bound and the total internal cost as an upper bound, defining the range of potential cost savings. The actual marginal savings will depend on the relative volume of tests avoided and whether this decrease is consistent over time. Extending such cost estimates beyond a single laboratory is even more problematic, since (beyond the problems already discussed) laboratories are funded under a variety of remuneration schemes and at a variety of prices. Thus, any specific analysis of potential costs would be necessarily relevant only to a very narrow subset of the population.
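The lower-bound and upper-bound reasoning above amounts to multiplying the number of avoided tests by two different unit costs. A minimal Python sketch follows; the function name `savings_range` and the example figures are hypothetical and are not taken from the study data.

```python
from typing import Tuple


def savings_range(
    tests_avoided: int,
    reagent_cost: float,
    total_internal_cost: float,
) -> Tuple[float, float]:
    """Return (lower, upper) bounds on savings for a number of avoided tests.

    The lower bound counts only reagent cost per test; the upper bound uses
    the total internal cost per test, as described in the text.
    """
    if reagent_cost > total_internal_cost:
        raise ValueError("reagent cost should not exceed total internal cost")
    lower = tests_avoided * reagent_cost
    upper = tests_avoided * total_internal_cost
    return lower, upper


# Hypothetical example: 10,000 avoided tests, $1.50 reagent cost,
# $6.00 total internal cost per test.
low, high = savings_range(10_000, 1.50, 6.00)
```

Where the actual marginal savings fall within this range would depend, as noted, on how large and how sustained the reduction in test volume is.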
Instead, we have chosen to provide cost estimates based on internal costs at the study laboratory (representing the minimum burden of redundant testing, when there is no profit margin to the testing performed), as well as for a full range of costs derived from Canadian and US list prices provided by a range of laboratories. This allows readers to get a sense of the potential cost burden of redundant testing under a variety of cost situations. There are similar challenges relating to the use of test volumes from one geographic area to estimate volumes in other areas. However, given the complexity of presenting the range of potential cost situations already considered, we have simply extrapolated our local test volumes directly to the Canadian and US systems using scaling factors according to the population sizes served. This will not provide definitive estimates but is a reasonable starting point for discussion about issues surrounding redundant testing.

Conclusions

Survival analysis of laboratory testing data for six tests with guideline-based or otherwise straightforward definitions of inappropriate repeat testing has demonstrated that 16% of these tests are repeated inappropriately. This provides a fairly strong minimum estimate (ie, lower bound) for the burden of inappropriate repeat testing. It also provides a valuable and relatively unexploited opportunity to target inappropriate testing using EHR or other computer-based strategies and reduce unnecessary test volumes by leveraging these straightforward guidelines. Because redundant tests appear to make no direct contribution to guiding patient care,14-16 such interventions could reduce health care expenditures while likely reducing patient morbidity.5,6 The annual savings achievable by eliminating the inappropriate tests targeted in this study could amount to as much as $2 million within our institution and correspond to an estimated $160 million across Canada.
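The population scaling described in the discussion can be written as a single proportional adjustment. The Python sketch below is only an illustration of that arithmetic; the function name `scale_by_population` and the population figures are hypothetical. The US columns of Table 5 also draw on US list prices, so they should not be expected to equal a pure population rescaling of the Canadian figures.

```python
def scale_by_population(
    local_cost: float,
    local_population: int,
    target_population: int,
) -> float:
    """Scale a local cost estimate to a larger population, assuming the same
    per-capita test volumes and unit costs (a strong simplifying assumption)."""
    if local_population <= 0:
        raise ValueError("local population must be positive")
    return local_cost * target_population / local_population


# Hypothetical example: a $2.0 million local estimate for a catchment of
# 1,000,000 people, scaled to a population 10 times larger.
national_estimate = scale_by_population(2.0, 1_000_000, 10_000_000)
```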
In the United States, with 10 times the population of Canada, the health system savings could be an order of magnitude higher.

Corresponding author: Christopher Naugler, MD, MSc, FRCPC, C410, Diagnostic and Scientific Centre, 9, 3535 Research Rd NW, Calgary AB T2L 2K8, Canada; [email protected].

Acknowledgments: We thank Jeannine Viczko and Maggie Guo for help with data extraction.

References

1. Forsman RW. Why is the laboratory an afterthought for managed care organizations? Clin Chem. 1996;42:813-816.
2. McGrail KM, Evans RG, Barer ML, et al. Diagnosing senescence: contributions to physician expenditure increases in British Columbia, 1996/97 to 2005/06. Healthc Policy. 2011;7:41-54.
3. Sivananthan SN, Peterson S, Lavergne R, et al. Designation, diligence and drift: understanding laboratory expenditure increases in British Columbia, 1996/97 to 2005/06. BMC Health Serv Res. 2012;12:472.
4. Naugler C. A perspective on laboratory utilization management from Canada. Clin Chim Acta. 2014;427:142-144.
5. Rang M. The Ulysses syndrome. Can Med Assoc J. 1972;106:122-123.
6. McGregor MJ, Martin D. Testing 1, 2, 3: is overtesting undermining patient and system health? Can Fam Physician. 2012;58:1191-1193, e615-e617.
7. Bates DW, Goldman L, Lee TH. Contaminant blood cultures and resource utilization: the true consequences of false-positive results. JAMA. 1991;265:365-369.
8. De Boer AS, Blommerde B, de Haas PEW, et al. False-positive Mycobacterium tuberculosis cultures in 44 laboratories in the Netherlands (1993 to 2000): incidence, risk factors, and consequences. J Clin Microbiol. 2002;40:4004-4009.
9. Tluczek A, Orland KM, Cavanagh L. Psychosocial consequences of false-positive newborn screens for cystic fibrosis. Qual Health Res. 2011;21:174-186.
10. Beck JR. Does feedback reduce inappropriate test ordering? Arch Pathol Lab Med. 1993;117:33-34.
11. Bareford D, Hayling A. Inappropriate use of laboratory services: long term combined approach to modify request patterns. BMJ. 1990;301:1305-1307.
12. Perraro F, Rossi P, Liva C, et al. Inappropriate emergency test ordering in a general hospital: preliminary reports. Qual Assur Health Care. 1992;4:77-81.
13. Ashley JS, Pasker P, Beresford JC. How much clinical investigation? Lancet. 1972;1:890-892.
14. Daniels M, Schroeder SA. Variation among physicians in use of laboratory tests, II: relation to clinical productivity and outcomes of care. Med Care. 1977;15:482-487.
15. Bell DD, Ostryzniuk T, Verhoff B, et al. Postoperative laboratory and imaging investigations in intensive care units following coronary artery bypass grafting: a comparison of two Canadian hospitals. Can J Cardiol. 1998;14:379-384.
16. Powell EC, Hampers LC. Physician variation in test ordering in the management of gastroenteritis in children. Arch Pediatr Adolesc Med. 2003;157:978-983.
17. van Walraven C, Naylor C. Do we know what inappropriate laboratory utilization is? a systematic review of laboratory clinical audits. JAMA. 1998;280:550-558.
18. Zhi M, Ding EL, Theisen-Toupal J, et al. The landscape of inappropriate laboratory testing: a 15-year meta-analysis. PLoS One. 2013;8:e78962.
19. Van Walraven C, Raymond M. Population-based study of repeat laboratory testing. Clin Chem. 2003;49:1997-2005.
20. Anderson TJ, Grégoire J, Hegele RA, et al. 2012 update of the Canadian Cardiovascular Society guidelines for the diagnosis and treatment of dyslipidemia for the prevention of cardiovascular disease in the adult. Can J Cardiol. 2013;29:151-167.
21. Iliadi V, Kastanioti C, Maropoulos G, et al. Inappropriately repeated lipid tests in a tertiary hospital in Greece: the magnitude and cost of the phenomenon. Hippokratia. 2012;16:261-266.
22. Virani SS, Woodard LD, Wang D, et al. Correlates of repeat lipid testing in patients with coronary heart disease. JAMA Intern Med. 2013;173:1439-1444.
23. Goodwin JS, Asrabadi A, Howrey B, et al. Multiple measurement of serum lipids in the elderly. Med Care. 2011;49:225-230.
24. Laxmisan A, Vaughan-Sarrazin M, Cram P. Repeated hemoglobin A1C ordering in the VA Health System. Am J Med. 2011;124:342-349.
25. Akan P, Cimrin D, Ormen M, et al. The inappropriate use of HbA1c testing to monitor glycemia: is there evidence in laboratory data? J Eval Clin Pract. 2007;13:21-24.
26. Canadian Diabetes Association Clinical Practice Guidelines Expert Committee, Cheng AYY. Canadian Diabetes Association 2013 clinical practice guidelines for the prevention and management of diabetes in Canada: introduction. Can J Diabetes. 2013;37(suppl 1):S1-S3.
27. Alberta Health Services. Alberta Health Services Laboratory Bulletin, March 28, 2014. http://www.albertahealthservices.ca/LabServices/wf-lab-bulletin-new-nemoglobin-a1c-testutilization-criteria.pdf. Accessed September 12, 2014.
28. Towards Optimized Practice Program. Clinical practice guidelines: investigation and management of primary thyroid dysfunction. 2008. http://www.topalbertadoctors.org/download/350/thyroid_guideline.pdf. Accessed August 18, 2014.
29. Smellie WSA, Wilson D, McNulty CAM, et al. Best practice in primary care pathology: review 1. J Clin Pathol. 2005;58:1016-1024.
30. Health Quality Ontario. Serum vitamin B12 testing: a rapid review. 2012. http://www.hqontario.ca/Portals/0/Documents/eds/rapid-reviews/vitamin-b12-121212-en.pdf. Accessed August 19, 2014.
31. Moyer VA; U.S. Preventive Services Task Force. Vitamin D and calcium supplementation to prevent fractures in adults: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2013;158:691-696.
32. Nestle M, Nesheim MC. To supplement or not to supplement: the U.S. Preventive Services Task Force recommendations on calcium and vitamin D. Ann Intern Med. 2013;158:701-702.
33. Rizzoli R, Boonen S, Brandi M-L, et al. Vitamin D supplementation in elderly or postmenopausal women: a 2013 update of the 2008 recommendations from the European Society for Clinical and Economic Aspects of Osteoporosis and Osteoarthritis (ESCEO). Curr Med Res Opin. 2013;29:305-313.
34. Prentice RL, Pettinger MB, Jackson RD, et al. Health risks and benefits from calcium and vitamin D supplementation: Women’s Health Initiative clinical trial and cohort study. Osteoporos Int. 2013;24:567-580.
35. Health Quality Ontario. Ferritin testing: a rapid review. 2012. http://www.hqontario.ca/Portals/0/Documents/eds/rapid-reviews/ferritin-121212-en.pdf. Accessed August 19, 2014.
36. Oh RC, Franzos T, Montoya C. Clinical inquiry: how best to diagnose iron-deficiency anemia in patients with inflammatory disease? J Fam Pract. 2012;61:160-161.
37. Bacon BR, Adams PC, Kowdley KV, et al. Diagnosis and management of hemochromatosis: 2011 practice guideline by the American Association for the Study of Liver Diseases. Hepatology. 2011;54:328-343.
38. Brittenham GM, Cohen AR, McLaren CE, et al. Hepatic iron stores and plasma ferritin concentration in patients with sickle cell anemia and thalassemia major. Am J Hematol. 1993;42:81-85.
39. Cappellini MD, Porter J, El-Beshlawy A, et al. Tailoring iron chelation by iron intake and serum ferritin: the prospective EPIC study of deferasirox in 1744 patients with transfusion-dependent anemias. Haematologica. 2010;95:557-566.
40. Stewart BA, Fernandes S, Rodriguez-Huertas E, et al. A preliminary look at duplicate testing associated with lack of electronic health record interoperability for transferred patients. J Am Med Inform Assoc. 2010;17:341-344.
41. Ganiyu-Dada Z, Bowcock S. Repeat haematinic requests in patients with previous normal results: the scale of the problem in elderly patients at a district general hospital. Int J Lab Hematol. 2011;33:610-613.
42. Kwok J, Jones B. Unnecessary repeat requesting of tests: an audit in a government hospital immunology laboratory. J Clin Pathol. 2005;58:457-462.
43. Bates DW, Boyle DL, Rittenberg E, et al. What proportion of common diagnostic tests appear redundant? Am J Med. 1998;104:361-368.
44. MacMillan D. Calculating cost savings in utilization management. Clin Chim Acta. 2014;427:123-126.