ANATOMIC PATHOLOGY Original Article

Pathology and Probability: Likelihood Ratios and Receiver Operating Characteristic Curves in the Interpretation of Bronchial Brush Specimens

STEPHEN S. RAAB, MD, PATRICIA A. THOMAS, MD, JULIA C. LENEL, PHD, KENT BOTTLES, MD, KAREN M. FITZSIMMONS, MD, M. SUE ZALESKI, SCT(ASCP), RICHARD J. WITTCHOW, MD, LARON W. McPHAUL, MD, DANIEL D. SLAGEL, MD, AND MICHAEL B. COHEN, MD

Diagnoses in pathology often are qualitative, such as atypical or suspicious, and consequently are thought to have limited clinical value. To investigate the utility of a qualitative diagnostic system, seven pathologists retrospectively evaluated 100 bronchial brush specimens using the following categories: definitely benign, probably benign, possibly malignant, probably malignant, and definitely malignant. The likelihood ratio (LR) and receiver operating characteristic (ROC) curve, two statistical probabilistic measurements, were used to calculate diagnostic accuracy among individuals and groups. The results show: (1) the LR for individual diagnostic categories varied among observers, resulting in different clinical probabilities of malignancy; (2) observer experience did not appear to play a role in overall diagnostic accuracy, except in the diagnosis of small cell carcinoma; and (3) observers operate at higher levels of diagnostic accuracy with, rather than without, clinical history. The authors conclude that qualitative diagnoses contain important information and can be interpreted effectively with the LR and ROC curve. (Key words: Statistics; Anatomic pathology; Cytology; Diagnosis; Quality assurance/control) Am J Clin Pathol 1995;103:588-593.

Diagnoses in anatomic pathology, being dependent on human judgment, carry a certain element of subjectivity. If the pathologist thinks that the biopsy findings are normal or are pathognomonic of a certain disease process, the pathology report is unequivocal and straightforward. If the biopsy findings are abnormal, but not pathognomonic of any particular disease, the pathology report often is couched in qualitative terms such as suspicious, suggestive, or most likely. Many have decried the use of qualitative diagnoses because of their ambiguous meaning and limited clinical value.1-5 The argument is well taken. How can a clinician initiate protocol chemotherapy on a diagnosis of probable lymphoma? In addition to this language problem, there is the difficulty of incorporating clinical facts into the pathology diagnosis. Pathologists often want to know the clinical history or the radiologic findings before making a final diagnosis. An example of this is the conviction that perusal of the roentgenographic findings is essential before making the pathology diagnosis of any bone tumor. This inclination results in the "double counting" of clinical facts;1 the pathology report represents more than a diagnosis made on morphologic findings. If unaware of this procedure, the clinician will be biased and will overweight the clinical findings in the decision-making process.

To provide a solution for these problems, in 1981 Schwartz and colleagues proposed a new approach to the interpretation of biopsy specimens.1 These authors advocated the use of a conditional probabilistic technique in the reporting of diagnoses. For example, if the pathology findings were not pathognomonic and the differential diagnosis included malignant tumor, benign tumor, and infection, the pathologist, without clinical history, should issue a pathology report expressing the conditional probability of each of these conditions. This article evoked a great deal of controversy, and little advancement has been made in addressing these issues. We believe that in certain areas of pathology, such as cytopathology, a qualitative diagnostic reporting system already expresses these probabilistic concepts.
This view requires a shift in focus, from the idea that the pathology diagnosis provides a correct answer to one in which the pathology diagnosis is simply a laboratory test that necessarily exhibits uncertainties and errors.6 This conception places the focus on measuring the diagnostic accuracy of the pathology diagnosis. Instead of standard sensitivity and specificity measurements, diagnostic accuracy can be measured with the likelihood ratio (LR) and receiver operating characteristic (ROC) curve, which incorporate a probabilistic technique.7-14 The LR and ROC have been applied to clinical pathology and radiology data, but seldom have been used in the anatomic pathology domain.15-26 In this report, using the bronchial brush (BB) specimen as an example, we show the clinical utility of the LR and ROC curve in the evaluation of qualitative diagnoses.

From the Department of Pathology, University of Iowa, Iowa City, Iowa. Manuscript received June 6, 1994; revision accepted September 30, 1994. Address correspondence to Dr. Raab: University of Iowa Hospitals and Clinics, Department of Pathology, 200 Hawkins Drive 5216 RCP, Iowa City, IA 52242-1009.

MATERIALS AND METHODS

One hundred bronchial brush cases were retrospectively selected from the cytology files from the years 1991-1993 at the University of Iowa Hospitals and Clinics. Each of the BBs consisted of two slides. All cases had histologic follow-up and 6 to 18 months (mean 13 months) of clinical follow-up. Fifty cases had malignant histologic confirmation and fifty cases had benign histologic and clinical confirmation. The original cytologic diagnoses were placed in four categories: benign (35 cases), atypical (24), suspicious (14), and malignant (27). Some of the cases were diagnostically difficult, whereas others were straightforward.
The malignant cases consisted of 41 non-small cell carcinomas, seven small cell carcinomas, one carcinoid, and one lymphoma. The benign cases exhibited a spectrum of cytologic findings ranging from no diagnostic abnormality to viral inclusions to acute inflammation. Each case was randomly assigned a number from 1 to 100. The slides then were screened by an experienced cytotechnologist who was unaware of the previous cytologic diagnoses and clinical histories. The cytotechnologist placed five dots per slide, marking the areas that were diagnostic of, or most worrisome for, malignancy. In the benign cases, these areas often exhibited reactive or degenerative changes. The cases were divided into three groups and passed among the observers. Clinical histories were not provided. The observers were instructed to first concentrate on the dotted regions, and then, only if uncertain, to look elsewhere on the slide. Each observer scored each case in one, and only one, of the following qualitative categories: definitely benign, probably benign, possibly malignant, probably malignant, and definitely malignant. These categories spanned the spectrum of certainty of malignancy. Tumor classification was not requested. Standardized answer forms were given to each observer with instructions as to how to complete the forms. The observers did not consult other study participants. The observers had different levels of experience in the interpretation of BBs. Observers 3 and 5 were considered the more experienced cytopathologists; these individuals were senior faculty members with formal cytopathology training and had been signing out for more than 5 years. The other observers were considered the less experienced cytopathologists. Observers 1, 2, 4, 6, and 7 consisted of a junior faculty member, two cytopathology fellows, and two residents who viewed the slides at the end of a 3-month block of cytology training.
The LR for malignancy for each observer for each diagnostic category was calculated according to previously described methods.7-10 To briefly summarize, the LR of a diagnostic category is the quotient of the proportion of individuals with disease who receive a particular diagnosis to the proportion of individuals without disease who receive that particular diagnosis. Given the pre-test clinical probability of disease, the LR can be used to calculate the post-test probability of disease. The LR is related to the odds of disease by the equation:

Post-test odds = Pre-test odds × LR

The odds of disease and the probability of disease are related by the following equations:

Odds = Probability/(1 − Probability)

Probability = Odds/(1 + Odds)

LRs can range from 0 to ∞. An LR < 1.0 lowers the post-test probability of disease below the pre-test probability of disease; an LR = 1.0 does not alter the post-test probability of disease; and an LR > 1.0 raises the post-test probability of disease above the pre-test probability of disease.

The ROC curves were constructed as described by Dorfman.11,12 A ROC curve is a graphic plot fitted to pairs of true positive (TP) rates (sensitivity) and false positive (FP) rates (100% − specificity) for a given observer as the criterion for making a diagnosis is varied. Each criterion gives rise to one point on the curve. Sensitivity is the percentage of patients with positive test results among all diseased patients who were evaluated. Specificity is the percentage of patients with negative test results among all evaluated patients who do not have the disease in question. Sensitivity and specificity can be expressed as follows:

Sensitivity = True positives/(True positives + False negatives)

Specificity = True negatives/(True negatives + False positives)

An ideal test would have a sensitivity and specificity of 100%, although in practice these parameters tend to be inversely related.
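The LR and odds relationships above can be sketched in a few lines of code (a minimal illustration of the published formulas; the function names are our own, not from the paper):

```python
def probability_to_odds(p):
    """Odds = Probability / (1 - Probability)."""
    return p / (1.0 - p)

def odds_to_probability(odds):
    """Probability = Odds / (1 + Odds)."""
    return odds / (1.0 + odds)

def likelihood_ratio(p_dx_given_disease, p_dx_given_no_disease):
    """LR = proportion of diseased patients receiving a diagnosis,
    divided by the proportion of non-diseased patients receiving it."""
    return p_dx_given_disease / p_dx_given_no_disease

def post_test_probability(pre_test_p, lr):
    """Post-test odds = pre-test odds x LR, converted back to a probability."""
    return odds_to_probability(probability_to_odds(pre_test_p) * lr)
```

As the text states, an LR of exactly 1.0 leaves the probability unchanged, so `post_test_probability(0.65, 1.0)` returns 0.65, while an LR above 1.0 raises it and an LR below 1.0 lowers it.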
Conceptually, a ROC curve can be computed along the following lines. First, all five of the diagnostic categories are assumed to correspond to the presence of disease. This corresponds to a sensitivity of 1.0 (all of the patients with cancer are correctly diagnosed) and a specificity of 0 (all of the patients without cancer are incorrectly diagnosed). Next, all categories except definitely benign are assumed to correspond to the presence of cancer, and the proportions of TPs and FPs are calculated. This yields a point of decreased sensitivity and increased specificity. Next, the combined categories of definitely malignant, probably malignant, and possibly malignant are assumed to correspond to the presence of cancer, and the TPs and FPs are calculated. This process is continued until all the diagnostic categories are combined and assumed to correspond to the absence of disease. This yields a point corresponding to a sensitivity of 0 and a specificity of 1.0. In total, six points are calculated. By convention, a ROC curve is plotted with the true positive rate along the vertical axis and the false positive rate along the horizontal axis. A ROC curve for an optimal observer will travel along the upper border of the graph and drop precipitously along the vertical axis. A ROC curve for an observer making random guesses is represented by a straight, 45° line. The ROC curves of observers who exhibit less than 100% diagnostic accuracy and who do not randomly guess generally fall somewhere between these two curves. ROC curves were calculated and plotted with either RSCORE III or RSCORES, ROC curve analysis programs.11,12 With RSCORE III, individual observer curves were calculated; information was generalized over cases. With RSCORES, data across observers were pooled to calculate a ROC curve; information was generalized over observers. Parametric ROC curve calculating programs were used because the data consisted of five discrete points and were not a continuous function.
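The threshold-sweeping construction described above can be sketched as follows. The category counts below are hypothetical, invented only for illustration; note also that the paper's curves were fitted parametrically (binormal model, RSCORE III/RSCORES), whereas this sketch computes only the empirical operating points and a trapezoidal area:

```python
# Ratings for 50 diseased and 50 non-diseased cases, binned into the five
# categories ordered from least to most malignant-appearing
# (hypothetical counts, not the study's data).
diseased = [2, 3, 5, 10, 30]
non_diseased = [30, 12, 5, 2, 1]

def roc_points(diseased, non_diseased):
    """Sweep the positivity threshold from 'everything is malignant' to
    'nothing is malignant', yielding (FP rate, TP rate) pairs; a
    five-category scale yields six points, including (1, 1) and (0, 0)."""
    n_d, n_n = sum(diseased), sum(non_diseased)
    points = []
    for k in range(len(diseased) + 1):
        tp = sum(diseased[k:])        # categories k and above called positive
        fp = sum(non_diseased[k:])
        points.append((fp / n_n, tp / n_d))
    return points

def trapezoidal_area(points):
    """Empirical area under the ROC curve: 0.5 for chance performance,
    1.0 for a perfect observer (a nonparametric analog of the fitted Az)."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

points = roc_points(diseased, non_diseased)
print(points)                    # six (FP rate, TP rate) pairs
print(trapezoidal_area(points))  # about 0.93 for these invented counts
```

The first point returned is (1.0, 1.0) (all five categories treated as positive) and the last is (0.0, 0.0), matching the two extreme thresholds in the text.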
With RSCORE III, the diagnostic accuracy of each observer was represented by Az, which corresponds to the proportion of the total area of the ROC graph that lies under the binormal ROC curve. Az values generally range from 0.5 to 1.0; an Az value of 0.5 corresponds to the area under a straight 45° line (random guessing), whereas an Az value of 1.0 corresponds to the area under the curve of an optimal observer. The standard error and 95% confidence interval also were calculated for each observer.

TABLE 1. LIKELIHOOD RATIOS FOR MALIGNANCY FOR THE SEVEN OBSERVERS

Observer  Definitely Benign  Probably Benign  Possibly Malignant  Probably Malignant  Definitely Malignant
1         0.19               0.49             1.47                1.72                9.80
2         0.39               1.06             1.33                3.19                12.77
3         0.24               0.67             ∞                   2.00                9.00
4         0.11               0.45             1.63                ∞                   21.42
5         0.27               1.25             0.67                ∞                   5.17
6         0.32               1.75             5.00                ∞                   19.00
7         0.45               2.67             ∞                   ∞                   ∞

Three ROC curves were calculated for each observer: one for the combined malignant cell types, one for small cell carcinoma, and one for non-small cell carcinoma. Mean Az values were calculated for the group of more experienced observers and for the group of less experienced observers. The significance of the difference between the mean diagnostic accuracy for the combined malignant cell types of the more and of the less experienced observers was determined by a one-tailed t-test. Analysis of variance (ANOVA) was used to determine the effect of cell type (small cell carcinoma and non-small cell carcinoma) for the more and less experienced observers. Finally, the data for all observers were pooled, and a single ROC curve was calculated using RSCORES; this represented the ROC curve without clinical history. A ROC curve also was calculated from the original cytologic diagnoses; this represented the ROC curve with clinical history.

RESULTS

The LRs for malignancy for each of the seven observers for the five diagnostic categories are shown in Table 1.
Several of the cells contain an LR of ∞, which usually indicated that few diagnoses were placed in those categories. Thus, an observer, like observer 7, who made few possibly malignant, probably malignant, and definitely malignant diagnoses can be considered "conservative" because of the preponderance of benign diagnoses. The other observers were more definitive and placed a greater number of diagnoses in the definitely benign and definitely malignant categories. The number of cases placed in the same diagnostic category by different observers varied considerably. An example of how the LR can be applied to a clinical scenario follows. Suppose a patient had a lung mass and, clinically, the suspicion of malignancy was 65%. A BB was performed and interpreted by observer 2. If the cytologic diagnosis was definitely malignant, using the formulas in Materials and Methods, the post-BB probability of malignancy would be 96%. Similarly, if the BB diagnosis was definitely benign, probably benign, possibly malignant, or probably malignant, the post-BB probability of malignancy would be 42%, 66%, 71%, and 86%, respectively. For observer 4, the post-BB probabilities of malignancy for the five diagnostic categories in order from benign to malignant, given the same pre-BB probability of malignancy (65%), would be 26%, 48%, 73%, 76%, and 95%, respectively. A ROC curve measuring the diagnostic accuracy for the diagnosis of malignancy for each of the seven observers is plotted in Figure 1. The Az values, standard errors, and 95% confidence intervals are shown in Table 2. The mean Az values for the group of more experienced observers and for the group of less experienced observers are shown in Table 3. A t-test for the difference between the two means (ie, mean accuracy for the more experienced observers vs mean accuracy for the less experienced observers) was not significant at α = 0.05 (t = 0.203, P = .849).
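The worked example above for observer 2 can be reproduced directly from the formulas in Materials and Methods (a sketch; the helper function is our own, but the LR values are observer 2's entries from Table 1 and the 65% pre-test suspicion is taken from the text):

```python
def post_test_probability(pre_test_p, lr):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_p / (1.0 - pre_test_p)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)

# Observer 2's likelihood ratios from Table 1, ordered from
# definitely benign to definitely malignant.
observer_2_lrs = {
    "definitely benign": 0.39,
    "probably benign": 1.06,
    "possibly malignant": 1.33,
    "probably malignant": 3.19,
    "definitely malignant": 12.77,
}

# Pre-BB clinical suspicion of malignancy of 65%, as in the text.
for category, lr in observer_2_lrs.items():
    p = post_test_probability(0.65, lr)
    # matches the worked example: 42%, 66%, 71%, 86%, and 96%
    print(f"{category}: {p:.0%}")
```

Note how an LR near 1.0 (probably benign, 1.06) barely moves the 65% pre-test probability, whereas the extreme categories move it substantially in either direction.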
The Az values measuring the diagnostic accuracy for the malignant categories small cell carcinoma and non-small cell carcinoma, for the seven observers and for the groups of more experienced and less experienced observers, are shown in Table 4. For this sample, it appears there is an interaction between level of experience and cell type. The more experienced observers performed at a higher level of diagnostic accuracy in the diagnosis of small cell carcinoma, and both groups performed at an equal level of diagnostic accuracy in the diagnosis of non-small cell carcinoma. A two-way ANOVA, with cell type and experience as fixed factors and cases and observers as random factors, was run on the data. The effects of cell type and experience were not significant (P = .16 and P = .35, respectively), and the interaction of cell type and experience also was not significant at the α = 0.05 level. A ROC curve measuring the diagnostic accuracy of the original cytologic diagnoses, when clinical history was provided, is plotted in Figure 2 (Az = 0.974). This is contrasted to the ROC curve calculated from the pooled data from the seven observers, who operated without clinical history (Az = 0.841).

DISCUSSION

Both clinicians and pathologists often fail to realize that morphologic observations are a laboratory test and are an estimation of the probability of occurrence of a particular disease.6 Ambiguities in the pathology reporting of diagnoses are a manifest expression of this likelihood. These ambiguities can be dealt with either by altering the approach to the reporting of diagnoses to reflect diagnostic probabilities or by using statistical methods, such as the LR and ROC curve, which effectively convey probabilities. Schwartz and coworkers chose the first approach, using a tabular form of Bayes' rule.1 They advocated a diagnostic reporting schema that lists the individual conditional probabilities of selected disease conditions given the observed morphologic findings.
TABLE 2. DIAGNOSTIC ACCURACY (Az) VALUES FOR MALIGNANCY FOR SEVEN OBSERVERS

Observer  Area (Az)  Standard Error  95% Confidence Interval
1         0.823      0.043           0.74–0.91
2         0.747      0.055           0.64–0.85
3         0.878      0.039           0.80–0.95
4         0.912      0.030           0.85–0.97
5         0.853      0.043           0.77–0.94
6         0.888      0.050           0.79–0.99
7         0.921      0.020           0.88–0.93

TABLE 3. DIAGNOSTIC ACCURACY (Az) VALUES FOR MALIGNANCY FOR THE POOLED MORE EXPERIENCED OBSERVERS AND THE POOLED LESS EXPERIENCED OBSERVERS

Level of Experience  Area (Az)  Standard Deviation
Less                 0.858      0.073
More                 0.866      0.018

TABLE 4. DIAGNOSTIC ACCURACY (Az) VALUES FOR SMALL CELL CARCINOMA AND FOR NON-SMALL CELL CARCINOMA FOR SEVEN OBSERVERS, POOLED MORE EXPERIENCED OBSERVERS, AND POOLED LESS EXPERIENCED OBSERVERS

Observer(s)  Small Cell Carcinoma  SE     Non-Small Cell Carcinoma  SE
1            0.752                 0.124  0.835                     0.043
2            0.454                 0.182  0.787                     0.052
3            0.888                 0.057  0.877                     0.042
4            0.847                 0.089  0.916                     0.030
5            0.847                 0.057  0.848                     0.048
6            0.536                 0.326  0.922                     0.036
7            0.830                 0.130  0.928                     0.019
Less         0.684                 0.092  0.878                     0.032
More         0.868                 0.029  0.863                     0.015

SE = standard error.

FIG. 1. (Top) ROC curves measuring the diagnostic accuracy for the diagnosis of malignancy for seven observers. [Figure not reproduced; axes are the true positive rate, p(TP), against the false positive rate, p(FP).]

FIG. 2. (Bottom) ROC curves measuring the diagnostic accuracy of seven observers without clinical history and of the original cytologic diagnoses with clinical history. [Figure not reproduced.]

For example, the morphologic findings in a fictitious lung biopsy might be interpreted as 5% conditional probability of inflammation, 20% conditional probability of benign tumor, and 80% conditional probability of malignant tumor. Based on these conditional probabilities, the clinical probability of any of these entities could then be calculated. This approach by Schwartz and colleagues is quite useful, although it is unappealing to most pathologists because it is not thought to accurately reflect the normal thinking process.

We propose the use of statistical methods that measure the accuracy of pathology diagnoses in probabilistic terms. This requires a switch from the commonly utilized measurements of sensitivity and specificity, which are limited in several aspects. Sensitivity and specificity apply only to binary data (ie, the presence or absence of disease) and not to qualitative probabilistic data. A second problem with sensitivity and specificity is that these measurements fail to convey the clinical probability of disease in an individual patient given a particular test result. The LR and ROC curve analysis effectively handle both of these difficulties and represent an extension of Bayes' rule. In this study, we investigated the utility of the LR and ROC curve in the interpretation of the prototypical BB specimen. Several conclusions can be drawn:

1. Individual observers use probabilistic categories differently. Thus, what some observers mean by qualitative diagnoses such as atypical may be different from what others mean. This can be seen with the LR. In this study, the post-BB effect of the diagnosis of probably benign varied; for observers 1, 3, and 4, the post-BB probability of malignancy is lowered (LR < 1.0), whereas for observers 2, 5, 6, and 7, it is increased (LR > 1.0). There also can be large differences in degree of effect; for the diagnosis definitely malignant, the post-BB probability of malignancy is much higher for observer 6 (LR = 19.00) than for observer 5 (LR = 5.17).

2. Diagnoses express probability of disease.
The pathology diagnosis should be used in conjunction with the clinical likelihood of disease to predict the post-test probability of disease. For example, the cytologic diagnosis of definitely malignant does not indicate that a lesion has to be malignant. In fact, if the clinical likelihood of disease is less than 100%, for all observers in this study a definitely malignant diagnosis does not imply absolute certainty of malignancy; a diagnosis of definitely malignant only indicates that the post-BB probability of malignancy increases above the pre-BB clinical probability. Likewise, the cytologic diagnosis of definitely benign does not imply that there is no possibility of malignancy.

3. Some observers are more conservative in diagnosis (observer 7) than others. Reasons include lack of experience and natural inclinations. Although observer 7 exhibited a level of diagnostic accuracy similar to that of the other observers, the predictive power of certain diagnostic categories, such as definitely malignant, is less meaningful. Although "accurate," a diagnosis of observer 7 may not be clinically useful.

4. For many types of specimens, the currently used diagnostic format of expressing probabilities through qualitative terms is adequate. Likelihood ratios also can be calculated for any other qualitative diagnoses, such as suspicious or most likely; in addition, LRs can be calculated for the probability of occurrence of any disease process in any organ, such as Pneumocystis pneumonia in bronchoalveolar lavage specimens, metastatic disease in liver biopsies, or lymphocytic thyroiditis in thyroid fine-needle aspiration biopsies (FNABs).

Confidence intervals (CIs) can be calculated for each LR for each observer. The CIs for observer 1 are: definitely benign, 0.07–0.49; probably benign, 0.08–0.80; possibly malignant, 0.68–3.20; probably malignant, 0.63–4.72; and definitely malignant, 2.80–34.31. Because of CI overlap, these data collapse into three categories; the categories definitely benign and probably benign can be combined, and the categories possibly malignant and probably malignant can be combined. The category definitely malignant remains unchanged. This collapse into three categories may be a more realistic representation of how cytologists really think (benign, atypical or suspicious, and malignant). In this schema, a benign (definitely benign/probably benign) diagnosis would still lower the post-BB probability of malignancy; an atypical or suspicious (possibly malignant/probably malignant) diagnosis, being centered around an LR of 1.0, would not affect the post-BB probability of malignancy; and a definitely malignant diagnosis would increase the post-BB probability of malignancy. This collapse of categories removes the LR of ∞ in cells that contain few data points.

Other non-binary diagnoses, such as probable, consistent with, or most likely, are used with relative frequency in diagnostic pathology. The probability of disease associated with these categories usually lies between the probabilities associated with the categories of definitely benign and definitely malignant. For example, the category consistent with malignancy implies less certainty of malignancy than a diagnosis of definitely malignant, but more certainty than a diagnosis of suspicious. Further studies are needed to investigate the probabilities associated with these terms.

In actual practice, many clinicians, especially surgeons and oncologists, want binary diagnoses, their argument being that treatment is dependent on the presence or absence of disease and not on the probability of disease. Clinicians need to act, and the pathologic diagnosis provides information on which this action is based. It often is easiest to make a clinical decision if the pathologic diagnosis lies at either end of the spectrum (ie, definitely malignant or definitely benign). A definitely malignant diagnosis is acted on as if there really is cancer, regardless of the probability of malignancy. In truth, however, this action is based on a number of factors, such as clinical history and physical findings, and the pathologic diagnosis is just one piece of the puzzle. These findings indicate that cytopathologists cannot, and perhaps should not, always render "black-and-white" diagnoses and that non-binary results have an important role in pathology.

At the University of Iowa, as elsewhere, non-binary diagnoses effect different clinical responses, depending on the clinical scenario. Following a non-binary BB diagnosis, a clinician may do nothing, repeat the test, order another test, or start treatment. For example, if the BB diagnosis is atypical and the patient is young and without risk factors for carcinoma, the clinician may do no further work-up. With the same BB diagnosis in an older patient who has a lung mass and is presumably operable, the clinician may repeat the BB or move on to another test, such as fine-needle aspiration. In this case, the inability to issue a malignant diagnosis may be attributed to a sampling error. In an inoperable patient with brain metastases, the same BB diagnosis may be enough to initiate radiotherapy. For proper patient work-up or treatment, detailed communication of the pathologic findings to the clinical staff is key.

The overall diagnostic accuracy of individuals or groups can be expressed with ROC curves, which also can be used to evaluate a number of variables, including the effects of observer experience or clinical history. Using ROC curve analysis, Cohen and coworkers previously showed that in breast FNAB, experienced observers performed at a higher level of diagnostic accuracy than less experienced observers.17 In this study, a similar conclusion cannot be made in the interpretation of BB specimens for the diagnosis of malignancy. Possible explanations are that the sample size was not large enough to show an effect of observer experience, that too few "difficult" cases were included, or that experience does not play a key role.

Interestingly, although the difference was not statistically significant, the more experienced observers performed at a higher level of diagnostic accuracy than the less experienced observers in the interpretation of BBs with small cell carcinoma. In fact, some less experienced observers exhibited Az values of less than 0.5. This trend was not observed for non-small cell carcinomas. This finding has an important clinical impact, because the diagnosis of small cell carcinoma elicits a different treatment protocol than does the diagnosis of non-small cell carcinoma. Additional studies are needed to further characterize this trend.

ROC curve analysis showed that the absence of clinical information appears to lower diagnostic accuracy in the interpretation of BBs. In practice, because of the occurrence of atypical cells in both benign and malignant conditions, cytopathologists generally are reluctant to make a diagnosis of malignancy without knowing the clinical facts, such as the age of the patient. With clinical information, findings that are suspicious for malignancy in an elderly patient may be reactive in a young patient with a history of AIDS and presumed pneumonia. Consequently, without clinical history, observers generally are more conservative; diagnoses called definitely malignant with a clinical history may be called probably malignant or possibly malignant without history.
An important point shown in this study is that despite the absence of clinical information, observers still operate at high levels of diagnostic accuracy, even though fewer cases are called definitely benign or definitely malignant. For the diagnostic accuracy of a malignant diagnosis, the Az values ranged from 0.747 to 0.921. The absence of clinical information eliminates the problem of "double counting" clinical data. However, if clinical information is not provided, clinicians must be willing to eschew more definitive diagnoses in favor of more qualitative diagnoses that can be interpreted effectively with the LR.

In summary, the LR and ROC curve are two statistical techniques that express the probability of disease. Together with a rethinking of what anatomic pathologists really do, these two techniques can help resolve the communication problems between clinicians and pathologists.

REFERENCES

1. Schwartz WB, Wolfe HJ, Pauker SG. Pathology and probabilities: A new approach to interpreting and reporting biopsies. N Engl J Med 1981;305:917-923.
2. Bryant GD, Norman GR. Expressions of probability: Words and numbers. N Engl J Med 1980;302:411.
3. Toogood JH. What do we mean by "usually"? Lancet 1980;1:1094.
4. Selvidge J. Assigning probabilities to rare events (PhD dissertation). Cambridge, MA: Harvard University, 1972.
5. Tversky A, Kahneman D. Judgment under uncertainty: Heuristics and biases. Science 1974;185:1124-1131.
6. Valenstein P. Technology assessment for the diagnostic laboratory. American Society of Clinical Pathologists National Meeting. ASCP Special Topics Council Commission on Continuing Education: Should this test be done? Lecture notes 1992;1-16.
7. Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine, 2nd ed. Boston: Little, Brown, 1985.
8. Radack KL, Rouan G, Hedges J. The likelihood ratio: An improved measure for reporting and evaluating diagnostic test results.
Arch Pathol Lab Med 1986;110:689-693.
9. Giard RW, Hermans J. Interpretation of diagnostic cytology with likelihood ratios. Arch Pathol Lab Med 1990;114:852-854.
10. Raab SS. Diagnostic accuracy in cytopathology. Diagn Cytopathol 1994;10:68-75.
11. Dorfman DD, Berbaum KS. RSCORE-J: Pooled rating-method data—a computer program for analyzing pooled ROC curves. Behav Res Methods Instrum Comput 1986;18:452-462.
12. Dorfman DD. RSCORE II. In: Swets JA, Pickett RM, eds. Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. New York: Academic Press, 1982.
13. Godfrey K. Statistics in practice: Comparing the means of several groups. N Engl J Med 1985;313:1450-1456.
14. Lusted LB. Introduction to Medical Decision Making. Springfield, IL: Charles C Thomas, 1968.
15. Robertson EA, Zweig MH, Van Steirteghem AC. Evaluating the clinical efficacy of laboratory tests. Am J Clin Pathol 1983;79:78-86.
16. Langley FA, Buckley CH, Taster M. The use of ROC curves in histopathologic decision making. Anal Quant Cytol 1985;7:167-173.
17. Cohen MB, Rodgers RPC, Hales MS, et al. Influence of training and experience in fine-needle aspiration biopsy of breast. Arch Pathol Lab Med 1987;111:518-520.
18. Giard RWM, Hermans J. The value of aspiration cytologic examination of the breast. Cancer 1992;69:2104-2110.
19. Beck JR, Shultz EK. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med 1986;110:13-20.
20. Kim I, Pollitt E, Leibel RL, et al. Application of receiver-operator analysis to diagnostic tests of iron deficiency in man. Pediatr Res 1984;18:916-920.
21. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
22. McNeil BJ, Hanley JA. Statistical approaches to the analysis of receiver operating characteristic (ROC) curves. Med Decis Making 1984;4:137-150.
23. Metz CE. Basic principles of ROC analysis. Semin Nucl Med 1978;8:283-298.
24. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986;21:720-733.
25. Metz CE, Goodenough DJ, Rossmann K. Evaluation of receiver operating characteristic curve data in terms of information theory, with applications in radiography. Radiology 1973;109:297-303.
26. Hanley JA. Receiver operating characteristic (ROC) methodology: The state of the art. Crit Rev Diagn Imaging 1989;29:307-335.