Empirical Bayes Combination of Estimated Areas under ROC Curves Using Estimating Equations

XIAO H. ZHOU, PhD

A synthesis of the empirical Bayes method and the method of estimating equations is used to combine individual receiver operating characteristic (ROC) area estimates from different studies of the same diagnostic test into a single estimate. This single estimate represents the population mean from which individual areas under the ROC curves were sampled. The only data needed to carry out the method are estimated areas under the ROC curves and the corresponding standard errors. Key words: ROC curves; empirical Bayes method. (Med Decis Making 1996;16:24-28)

Received May 4, 1994, from the Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, and the Regenstrief Institute for Health Care, Indianapolis, Indiana. Revision accepted for publication January 17, 1995. Supported in part by grant number R29HS08559 from the Agency for Health Care Policy and Research and by PHS NO1-LM-4-3510. Address correspondence and reprint requests to Dr. Zhou: Division of Biostatistics, Indiana University School of Medicine, Riley Research Wing, RR 135, 702 Barnhill Drive, Indianapolis, IN 46202-5200.

VOL 16/NO 1, JAN-MAR 1996

Evaluating the accuracies of diagnostic tests in detecting the presence of disease is very important for both quality of care and cost containment. One way to evaluate the accuracy of a diagnostic test is to estimate the sensitivity and specificity of the test by dichotomizing the test results into a positive result and a negative result. Both of these quantities, however, depend on the confidence threshold used by a specific test reader for calling a test positive, and this dependence confounds the assessment of accuracy. To overcome this problem, we plot sensitivity against 1 - specificity for all confidence thresholds, resulting in the receiver operating characteristic (ROC) curve. Thus, a ROC curve allows us to study the inherent discrimination capability of a diagnostic test. The area under a ROC curve is the most widely used index for summarizing the information contained in a ROC curve. The area under a ROC curve has been shown to be equal to the probability of correctly ranking a (diseased, nondiseased) pair [1], and can be estimated either by the Wilcoxon statistic [2] or by the parametric binormal model method [3].

Several separate clinical studies, using independent case samples, are often conducted to estimate the accuracy of the same diagnostic test. ROC area estimates of the test results and the corresponding standard errors are reported in the studies. Then, the question is how to arrive at a better estimate of the true area under the ROC curve of the diagnostic test results using data from all the studies.

McClish [4] presented a method to combine estimated areas under the ROC curves across different studies. Her method assumes that the studies are homogeneous and that all studies provide estimates for the same true ROC area. In other words, her method assumes that differences between the ROC areas estimated in individual studies are due only to within-study variability (experimental error). However, the true ROC area for the $i$th study might be affected, for example, by the design and execution of the study and the characteristics of the patients enrolled. Thus, differences in the ROC areas estimated in individual studies might come both from within-study variability and from actual differences between studies [5].

The empirical Bayes (EB) method does not assume that individual studies all have the same true ROC area, and provides a simple way to express study-level heterogeneity with a two-stage model [6-8]. Applications of the EB method to meta-analysis in clinical trials have become very popular [9-11]. However, the EB method has not been applied to ROC studies.

In this paper, we apply the EB method to combine ROC area estimates across studies without assuming that different studies have the same true ROC area. Also, we apply the method of estimating equations (a version of the method of moments) to estimate the unknown population parameters. The next section describes the Bayes framework in the context of the areas under ROC curves. The method of estimating equations is then used to estimate unknown population parameters. The following section applies the method to one example. The data set in the example is taken from McClish's paper concerning the dexamethasone suppression test.

Method

We apply a two-stage model to account for the within-study and the between-study variability. In the first stage, the model assumes that the area estimates under the ROC curves for individual studies are conditionally independent, given their true areas. In the second stage, the model assumes that the unobserved true areas for individual studies are a random sample from a distribution with mean $A$ and variance $\tau^2$.

Suppose that $K$ studies investigated the accuracy of the same diagnostic test. All studies had different patient samples. Let $A_i$ denote the true area under the ROC curve for the $i$th study. An estimate $\hat{A}_i$ for $A_i$ is derived either by a nonparametric method [2,12] or by a parametric method [3]. The two-stage Bayes model is defined as follows:

• Stage one says that the area estimate $\hat{A}_i$ for the $i$th study has the unknown mean $A_i$ and the known variance $V_i$, given $A_i$:

$$E(\hat{A}_i \mid A_i) = A_i, \qquad \mathrm{var}(\hat{A}_i \mid A_i) = V_i$$

• Stage two assumes that the mean and variance of an unobserved true area $A_i$ are $A$ and $\tau^2$, respectively.

The parameter $A$ represents the population mean of the ROC area of a diagnostic test, and the parameter $\tau^2$ represents the between-study variability.

Next, we estimate the parameters of interest, $A$ and $\tau^2$. Observe that the marginal mean and variance of $\hat{A}_i$ are

$$E(\hat{A}_i) = A, \qquad \mathrm{var}(\hat{A}_i) = V_i + \tau^2$$

Based on the theory of generalized estimating functions [13,14], the optimal estimating function for the parameter $A$ is

$$U_1(A, \tau^2) = \sum_{i=1}^{K} \frac{\hat{A}_i - A}{V_i + \tau^2} = 0 \qquad (1)$$

Since $\tau^2$ is an unknown parameter, we need an additional estimating equation for $\tau^2$. We propose the following quadratic estimating function:

$$U_2(A, \tau^2) = \sum_{i=1}^{K} \frac{(\hat{A}_i - A)^2 - (V_i + \tau^2)}{(V_i + \tau^2)^2} = 0 \qquad (2)$$

Let $\hat{A}$ and $\hat{\tau}^2$ be the solutions to equations 1 and 2. Gilbert and colleagues [15] used similar estimating equations to estimate $A$ and $\tau^2$. However, they did not give variance estimates for $\hat{A}$ and $\hat{\tau}^2$. Godambe and Heyde [14] have shown that, given $\tau^2$, the estimate $\hat{A}$ is unbiased and has the smallest variance among all possible estimates of the form $\sum_{i=1}^{K} a_i \hat{A}_i / \sum_{i=1}^{K} a_i$, where the $a_i$ are arbitrary constants.

Proposition 1 in the appendix shows that the estimates $\hat{A}$ and $\hat{\tau}^2$ are consistent estimates of $A$ and $\tau^2$ and have an asymptotically joint normal distribution. From Proposition 1, we see that the variance of $\hat{A}$ can be consistently estimated by

$$\left( \sum_{i=1}^{K} \frac{1}{V_i + \hat{\tau}^2} \right)^{-1} \qquad (3)$$

and that the variance of $\hat{\tau}^2$ can be consistently estimated by [expression not legible in the source].

Two computational methods are available to solve equations 1 and 2. The first method is given by the following steps:

• Choose an initial value of $\tau^2$, say $\hat{\tau}_0^2$.

• Estimate $A$ by

$$\hat{A} = \sum_{i=1}^{K} \frac{\hat{A}_i}{V_i + \hat{\tau}_0^2} \Bigg/ \sum_{i=1}^{K} \frac{1}{V_i + \hat{\tau}_0^2}$$

• Find the updated estimate of $\tau^2$, say $\hat{\tau}_1^2$, by solving equation 2 with $A$ fixed at $\hat{A}$:

$$\sum_{i=1}^{K} \left\{ \left[ \frac{\hat{A}_i - \hat{A}}{V_i + \hat{\tau}_1^2} \right]^2 - \frac{1}{V_i + \hat{\tau}_1^2} \right\} = 0$$

• Continue this process until convergence.

The second method is for use with an existing statistical package. Note that if the marginal distribution of $\hat{A}_i$ is normal with mean $A$ and variance $V_i + \tau^2$, then the score functions for $A$ and $\tau^2$ from the normal random variables are identical to equations 1 and 2. Thus, equations 1 and 2 can be solved with software designed to maximize a normal likelihood. The LE program in BMDP [16] is an example. The LE program estimates the parameters that maximize a given likelihood function, using the iterative Newton-Raphson algorithm.
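The first computational method above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's BMDP implementation: it alternates between the weighted mean that solves equation 1 and a one-dimensional solve of equation 2 for $\tau^2$. The function names, the use of bisection, the upper search bound of 1.0 (adequate because ROC areas lie between 0 and 1), and the truncation of $\tau^2$ at zero are all assumptions made for the sketch. The inputs are the seven area estimates and standard errors listed in Table 1.

```python
import math

# Area estimates and standard errors of the seven DST studies (Table 1).
AREAS = [0.789, 0.724, 0.851, 0.876, 0.782, 0.702, 0.652]
STD_ERRORS = [0.057, 0.025, 0.028, 0.029, 0.102, 0.056, 0.038]


def weighted_mean(areas, variances, tau2):
    """Solve equation 1 for A with tau^2 held fixed."""
    weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * a for w, a in zip(weights, areas)) / sum(weights)


def u2(areas, variances, mean, tau2):
    """Quadratic estimating function (equation 2) at (mean, tau2)."""
    return sum(((a - mean) ** 2 - (v + tau2)) / (v + tau2) ** 2
               for a, v in zip(areas, variances))


def solve_tau2(areas, variances, mean, upper=1.0, tol=1e-12):
    """Hypothetical helper: find tau^2 >= 0 with u2 = 0 by bisection.

    Assumes u2 is negative at `upper`, which holds for ROC areas in [0, 1].
    """
    if u2(areas, variances, mean, 0.0) <= 0.0:
        return 0.0  # assumption: truncate at zero when no root is positive
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u2(areas, variances, mean, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eb_combine(areas, std_errors, tau2_init=0.0, max_iter=200, tol=1e-10):
    """Alternate between equations 1 and 2 until tau^2 stops changing."""
    variances = [se ** 2 for se in std_errors]
    tau2 = tau2_init
    for _ in range(max_iter):
        mean = weighted_mean(areas, variances, tau2)
        new_tau2 = solve_tau2(areas, variances, mean)
        converged = abs(new_tau2 - tau2) < tol
        tau2 = new_tau2
        if converged:
            break
    mean = weighted_mean(areas, variances, tau2)
    # Standard error of the combined estimate from equation 3.
    se_mean = math.sqrt(1.0 / sum(1.0 / (v + tau2) for v in variances))
    return mean, tau2, se_mean


mean_hat, tau2_hat, se_hat = eb_combine(AREAS, STD_ERRORS)
print(f"A_hat = {mean_hat:.3f}, tau2_hat = {tau2_hat:.4f}, "
      f"SE(A_hat) = {se_hat:.3f}")
```

With the Table 1 inputs, the printed values should be close to those reported for the example (0.77, 0.0051, and 0.03). Starting from $\tau^2 = 0$ makes the first pass equal to the inverse-variance weighted mean, so the iteration begins at the homogeneous-studies estimate and moves away from it only as far as the data require.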
If the parameter $\tau^2 = 0$, then the true areas under the ROC curves for the individual studies are the same. In other words, the individual studies are homogeneous. In this case, McClish [4] suggested using a weighted average of the estimated ROC areas from the individual studies, $\left(\sum_{i=1}^{K} \hat{A}_i / V_i\right) / \left(\sum_{i=1}^{K} 1 / V_i\right)$, to estimate the common true ROC area. This weighted average is the same as the solution to equation 1 when $\tau^2 = 0$. The classical method to test the hypothesis $H_0\colon \tau^2 = 0$ is based on the test statistic $Q = \sum_{i=1}^{K} (\hat{A}_i - \hat{A})^2 / V_i$, where $\hat{A}$ here denotes the weighted average just described and the distribution of $Q$ is approximated by a chi-square distribution with $K - 1$ degrees of freedom [10]. If we reject the hypothesis that $\tau^2 = 0$, then we have to account for the between-study variability to estimate $A$, the true ROC area of a diagnostic test across all studies. The solution for $A$ in equations 1 and 2 gives a consistent estimate of $A$ after accounting for between-study variability.

Application

As an illustration, we apply the method to an example. The example summarizes the seven studies of the dexamethasone suppression test and was used by McClish as an illustration of her method [4].

EXAMPLE

The dexamethasone suppression test (DST) is a simple laboratory assessment of pituitary-adrenal dysregulation that can be used to distinguish various psychiatric disorders, including psychotic depression, schizophrenia, mania, and major depression. Mossman and Somoza [17] summarized the accuracy results of seven studies of the DST, including area estimates and corresponding standard errors. The area estimates and standard errors from the seven studies are summarized in table 1.

Table 1. Data from McClish's Paper

Study      Area     SE      Negative Cases   Positive Cases
Study 1    0.789    0.057         33               34
Study 2    0.724    0.025        152              215
Study 3    0.851    0.028         79              119
Study 4    0.876    0.029         41               54
Study 5    0.782    0.102         49               52
Study 6    0.702    0.056         31               65
Study 7    0.652    0.038         77              111

First, we test the hypothesis that all seven studies are homogeneous ($\tau^2 = 0$). The Pearson chi-square method rejects this hypothesis with a p-value < 0.00003. Thus, we cannot use McClish's method to combine the estimated ROC areas from these seven studies. We need to estimate the between-study variability $\tau^2$ to combine the individual area estimates. The newly developed method described above does not require the assumption that $\tau^2 = 0$, and can be used to estimate $\tau^2$. Thus, we can apply this method to combine the estimated ROC areas from the seven studies. Using the BMDP LE program, we obtain the estimates for $A$ and $\tau^2$: $\hat{A} = 0.77$ and $\hat{\tau}^2 = 0.0051$. Equation 3 gives the corresponding standard error of $\hat{A}$ as 0.03. Thus, the estimate for the population mean of the ROC area is 0.77, with a standard error of 0.03.

An Extension

Sometimes, more information is available from an individual study than just the estimated area under the ROC curve and the corresponding standard error. For example, this information can be the mean age of the patients in the $i$th study. We can extend our method to incorporate this information in our hierarchical model. Let $X_i$ be the $p$ covariates associated with the $i$th study, $i = 1, \ldots, K$. Then, the two-stage model is defined as follows:

• Stage one says that the estimated area under the ROC curve for each study has the unknown mean $A_i$ and the known variance $V_i$:

$$E(\hat{A}_i \mid A_i) = A_i, \qquad \mathrm{var}(\hat{A}_i \mid A_i) = V_i$$

• Stage two assumes that the mean of an unobserved area $A_i$ from the $i$th study is a function of the covariates $X_i$ specific to the $i$th study:

$$h[E(A_i)] = X_i'\beta$$

where $h$ is a known link function. Also, the variance of $A_i$ is $\tau^2$.

The population parameters $\beta$ and the between-study variability $\tau^2$ can be estimated by the solutions to equations 5 and 6:

$$\sum_{i=1}^{K} \frac{\partial h^{-1}(X_i'\beta)}{\partial \beta} \, \frac{\hat{A}_i - h^{-1}(X_i'\beta)}{V_i + \tau^2} = 0 \qquad (5)$$

$$\frac{1}{K - p} \sum_{i=1}^{K} \frac{[\hat{A}_i - h^{-1}(X_i'\beta)]^2}{V_i + \tau^2} - 1 = 0 \qquad (6)$$

Then, the mean ROC area among the population of studies with the characteristics $X_i$ is estimated by $h^{-1}(X_i'\hat{\beta})$, where the parameters $\hat{\beta}$ are the solutions to equations 5 and 6.

Discussion

In this paper, we propose using an empirical Bayes method to combine estimated ROC areas across studies, accounting for both the within-study variability and the between-study variability. An attractive feature of the empirical Bayes method is that it allows one to borrow strength from all studies to estimate the population mean of the ROC area without assuming the studies are homogeneous. The method of estimating equations allows us to estimate the population parameters of interest without fully specifying distributions for the estimated ROC areas. By establishing the relationship between our estimating equations and the score functions from a normal random variable, we could use a widely available statistical package, such as BMDP, to solve for $\hat{A}$ and $\hat{\tau}^2$ in our estimating equations. Thus, the calculations for carrying out the proposed procedure are easy.

Our estimating equations are similar to the ones proposed by Williams [18]. The estimating equations proposed by Williams are equation 1 plus the following equation:

$$\sum_{i=1}^{K} \frac{(\hat{A}_i - A)^2}{V_i + \tau^2} - K = 0$$

However, our estimates are more efficient than the ones proposed by Williams in the sense that our estimates are maximum likelihood estimates if $\hat{A}_i \sim N(A, V_i + \tau^2)$.

In this paper, we assume that the variance $V_i$ is known or can be estimated from the data in the $i$th study. If $V_i$ is unknown, then we will have an overparameterization problem because the number of unknown parameters is greater than the number of studies.
In this case, we can try two possible approaches to solve the overparameterization problem. One is to put a prior distribution on the $V_i$; the other is to assume that the $V_i$ are all the same.

For the data described in the example section, McClish [4] gave an estimate of 0.781 for the population mean of the ROC area under the assumption $\tau^2 = 0$. Without this assumption, we obtained an estimate of 0.770 with $\hat{\tau}^2 = 0.0051$. It would be interesting to do a simulation study to see how large the between-study variability $\tau^2$ needs to be before McClish's estimate and our estimate differ significantly.

The author thanks Dr. Siu Hui for her useful suggestions.

References

1. Bamber D. The area above the ordinal dominance graph and the area below the receiver operating graph. J Math Psychol. 1975;12:387-415.
2. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.
3. Dorfman D, Alf E. Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals: rating method data. J Math Psychol. 1969;6:487-96.
4. McClish DK. Combining and comparing area estimates across studies or strata. Med Decis Making. 1992;12:274-9.
5. National Research Council. Combining Information: Statistical Issues and Opportunities for Research. Washington, DC: National Academy Press, 1992.
6. Efron B, Morris CN. Stein's paradox in statistics. Sci Am. 1977;236:119-27.
7. Maritz JS, Lwin T. Empirical Bayes Methods. Second edition. New York: Chapman and Hall, 1989.
8. Morris CN. Parametric empirical Bayes inference: theory and application. JASA. 1983;78:47-65.
9. Carlin JB. Meta-analysis for 2 x 2 tables: a Bayesian approach. Stat Med. 1992;11:141-58.
10. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clin Trials. 1986;7:177-88.
11. DuMouchel WH, Harris J. Bayes methods for combining the results of cancer studies in humans and other species. JASA.
1983;78:293-315.
12. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839-43.
13. Crowder M. On linear and quadratic estimating functions. Biometrika. 1987;74:591-7.
14. Godambe VP, Heyde CC. Quasi-likelihood and optimal estimation. Int Stat Rev. 1987;55:231-44.
15. Gilbert JP, McPeek B, Mosteller F. Progress in surgery and anesthesia: benefits and risks of innovative therapy. In: Bunker JP, Barnes BA, Mosteller F, eds. Costs, Risks, and Benefits of Surgery. 1977, pp 124-169.
16. Dixon WJ, et al. BMDP Statistical Software. Berkeley: University of California Press, 1990.
17. Mossman D, Somoza E. Maximizing diagnostic information from the dexamethasone suppression test. Arch Gen Psychiat. 1989;46:653-60.
18. Williams DA. Extra-binomial variation in logistic linear models. Appl Statist. 1982;31:144-8.
19. DeLong ER, DeLong DM, Clarke-Pearson D. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837-45.

APPENDIX

Proposition 1. Assume that the fourth moment of $\hat{A}_i$ exists, and that the following limits exist:

$$a_{11} = \lim_{K \to \infty} \frac{1}{K} \sum_{i=1}^{K} \frac{1}{V_i + \tau^2}$$

$$a_{12} = a_{21} = \lim_{K \to \infty} \frac{1}{K} \sum_{i=1}^{K} \frac{E[\hat{A}_i (\hat{A}_i - A)^2] - A (V_i + \tau^2)}{(V_i + \tau^2)^3}$$

$$a_{22} = \lim_{K \to \infty} \frac{1}{K} \sum_{i=1}^{K} \frac{E(\hat{A}_i - A)^4 - (V_i + \tau^2)^2}{(V_i + \tau^2)^4}$$

[One further limit, with denominator $(V_i + \tau^2)^2$, is not legible in the source.]

Then, $\hat{A}$ and $\hat{\tau}^2$ are consistent estimates of $A$ and $\tau^2$ and have an asymptotically joint normal distribution. [The expression for the limiting covariance matrix is not legible in the source.]

Proof. [The expressions for the partial derivatives of $U_1(A, \tau^2)$ and $U_2(A, \tau^2)$ with respect to $A$ and $\tau^2$ are not legible in the source.] From the strong law of large numbers, we know the limits of these partial derivatives, divided by $K$, as $K \to \infty$. Therefore, for large $K$, applying the Taylor expansion to our estimating equations relates $[U_1(\hat{A}, \hat{\tau}^2), U_2(\hat{A}, \hat{\tau}^2)]'$ to $[U_1(A, \tau^2), U_2(A, \tau^2)]'$. [The expansion itself is not legible in the source.] This completes the proof of Proposition 1.