Diagnostic Tests: Issues with Incomplete or Sparse Data

Andrzej S. Kosinski
Department of Biostatistics and Bioinformatics
Duke University and Duke Clinical Research Institute
[email protected]

Joint work with Huiman X. Barnhart

Supported by a Patient Care and Outcomes Research Grant sponsored by the American Heart Association (AHA) Pharmaceutical Roundtable and by a Grant-in-Aid from the AHA.

NCSU, April 27, 2006


OUTLINE

• Dichotomous test measure
  – Background and data
  – Possible verification bias problem — a missing data problem
  – Point estimate approach with assumptions
  – Two-dimensional region estimate approach with no assumptions — Test Ignorance Region
• Continuous test measure
  – Receiver Operating Characteristic (ROC) curve
  – Model-derived predictive values
• Possible future work


BACKGROUND

A diagnostic test T (1 = positive, 0 = negative) is evaluated by comparison with the gold standard D (1 = disease, 0 = no disease).

            D=1   D=0
  T=1        A1    B1
  T=0        A2    B2

Performance of a test is often measured by:

  Sensitivity  P(T = 1 | D = 1),  estimated by A1 / (A1 + A2)
  Specificity  P(T = 0 | D = 0),  estimated by B2 / (B1 + B2)

• Sensitivity and specificity are numbers between 0 and 1.
• Higher values are better.
• A random coin toss: sensitivity = 0.5 and specificity = 0.5.


PROBLEM

The decision to verify often depends on:
• recorded variables related to the disease status;
• unrecorded variables related to disease, and hence the disease itself.

⇒ Verified patients may be a biased sample from the population of interest.
⇒ Sensitivity and specificity estimates based only on complete data may be biased.
⇒ Verification bias (work-up bias)

Ransohoff and Feinstein (1978, New England Journal of Medicine)


DICHOTOMOUS TEST DATA LAYOUT

  Result of   Disease verified (R=1)   Disease not verified (R=0)
  test T          D=1     D=0
  T=1             a1      b1                     u1
  T=0             a2      b2                     u2

Equivalently, as records with frequencies (a missing data problem):

  T    D    R    frequency
  1    1    1       a1
  1    0    1       b1
  0    1    1       a2
  0    0    1       b2
  1    NA   0       u1
  0    NA   0       u2


MISSING AT RANDOM (MAR) ASSUMPTION

  P(R | D, T, X) = P(R | T, X)    (does not depend on D)

  Begg and Greenes (Biometrics, 1983)
  Diamond (American Journal of Cardiology, 1986)
  Cecil, Kosinski, Jones, et al. (Journal of Clinical Epidemiology, 1996)

NOT IGNORABLE (NI) SITUATION

  P(R | D, T, X) depends on D

  Zhou (Communications in Statistics – Theory and Methods, 1993)
  Baker (Biometrics, 1995)
  Kosinski and Barnhart (Biometrics, 2003)


REGRESSION MODEL FRAMEWORK

Kosinski and Barnhart (Biometrics, 2003). Continuous or categorical covariates.

  L_obs = ∏_{i=1}^{N} P(R_i, T_i, D_i | x_i)^{R_i} · P(R_i, T_i | x_i)^{1−R_i}
        = ∏_{i=1}^{N} [ P(D_i | x_i) P(T_i | D_i, x_i) P(R_i | T_i, D_i, x_i) ]^{R_i}
                    × [ Σ_{d=0}^{1} P(D_i = d | x_i) P(T_i | D_i = d, x_i) P(R_i | T_i, D_i = d, x_i) ]^{1−R_i}

  Disease component:                 logit P(D_i = 1 | x_i) = α′ z_{0i}
  Test component:                    logit P(T_i = 1 | D_i, x_i) = β′ z_{1i}
  Missing data mechanism component:  logit P(R_i = 1 | D_i, T_i, x_i) = γ′ z_{2i}


EXAMPLES OF MISSING DATA MECHANISM COMPONENT

MAR assumption:

  logit P(R_i = 1 | D_i, T_i, x_i) = γ0 + γ1 T_i + γ2 x_{1i}

  Disease (D) is not included as a variable.

NI (non-ignorable) missing data mechanism:

  logit P(R_i = 1 | D_i, T_i, x_i) = γ0 + γ1 T_i + γ2 D_i + γ3 x_{1i}

  "Non-differential non-ignorability" with respect to T and X.

  logit P(R_i = 1 | D_i, T_i, x_i) = γ0 + γ1 T_i + γ2 D_i + γ3 T_i D_i + γ4 x_{1i}

  "Differential non-ignorability" with respect to T; "non-differential non-ignorability" with respect to X.
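To make the likelihood factorization above concrete, the following is a minimal sketch (Python, not the authors' software) of the observed-data negative log-likelihood for the three logistic components, simplified to a single covariate x; the parameter names a*, b*, g* stand in for α, β, γ. Verified subjects contribute the full joint probability; subjects with unverified disease contribute the probability summed over d = 0, 1. Maximizing it with a general-purpose optimizer yields the kind of MAR or NI point estimates reported for the SPECT data below.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # inverse logit

def neg_loglik(params, T, D, R, X):
    """Observed-data negative log-likelihood with one covariate x.

    Disease:       logit P(D=1 | x)        = a0 + a1*x
    Test:          logit P(T=1 | D, x)     = b0 + b1*D + b2*x
    Verification:  logit P(R=1 | T, D, x)  = g0 + g1*T + g2*D + g3*x
                   (an NI mechanism; dropping the g2*D term gives a MAR mechanism)
    Entries of D for subjects with R=0 are never used.
    """
    a0, a1, b0, b1, b2, g0, g1, g2, g3 = params
    ll = 0.0
    for t, d, r, x in zip(T, D, R, X):
        def joint(dd):  # P(D=dd, T=t, R=r | x) under the three components
            pD = expit(a0 + a1 * x)
            pT = expit(b0 + b1 * dd + b2 * x)
            pR = expit(g0 + g1 * t + g2 * dd + g3 * x)
            return ((pD if dd == 1 else 1 - pD)
                    * (pT if t == 1 else 1 - pT)
                    * (pR if r == 1 else 1 - pR))
        if r == 1:                       # disease verified
            ll += np.log(joint(d))
        else:                            # disease missing: sum over d = 0, 1
            ll += np.log(joint(0) + joint(1))
    return -ll

# Fit by maximum likelihood, given arrays T, D, R, X of length N, e.g.:
# result = minimize(neg_loglik, x0=np.zeros(9), args=(T, D, R, X), method="BFGS")
```

Marginal sensitivity and specificity estimates can then be obtained from the fitted disease and test components, averaging over the observed covariates.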
SPECT DATA

Cecil, Kosinski, Jones, et al. (1996, Journal of Clinical Epidemiology)

T: single-photon-emission computed tomography (SPECT) thallium stress test (non-invasive diagnostic test)
D: coronary angiography (invasive gold standard)

  Result of   Disease D verified (R=1)   Disease D not
  test T          D=1     D=0            verified (R=0)    Total
  T=1             195     232                  996          1423
  T=0               5      39                 1221          1265
  Total                                       2217          2688

82% = 2217/2688 not verified.

"Naive" SENS = 195 / (195 + 5) = 97.5%
"Naive" SPEC = 39 / (39 + 232) = 14.4%


SPECT DATA MODELS

Missing data mechanism component: logit P(R = 1 | D, T, X1, X2, X3)

                  M-1 (MAR model)         M-2 (MAR model)         M-3 (NI model)
                  γ       SE    P         γ       SE    P         γ       SE    P
  Intercept      -3.323   0.15  <.001    -3.423   0.17  <.001    -3.315   0.37  <.001
  T               2.476   0.16  <.001     2.476   0.17  <.001     3.183   0.45  <.001
  Gender            —      —     —       -0.022   0.11  0.85      0.350   0.19  0.07
  Stress mode       —      —     —       -0.176   0.12  0.13      0.114   0.17  0.50
  Age ≥ 60          —      —     —        0.400   0.11  <.001     0.739   0.17  <.001
  D                 —      —     —          —      —     —       -2.054   1.02  0.043

  Marginal estimates   "Naive"    M-1                  M-2                  M-3
  Sens                 97.5%      81.9% (69.5, 94.3)   80.4% (68.6, 92.3)   65.3% (46.4, 84.1)
  Spec                 14.4%      59.2% (55.4, 63.0)   58.5% (54.8, 62.2)   64.5% (55.8, 73.3)


REGRESSION MODEL FRAMEWORK

• A flexible and general modeling framework
• Allows for categorical and continuous covariates

General questions:
• Can observed data provide evidence for non-ignorability?
• Can we "test" for non-ignorability?
• Maybe all we can do is fit NI models as a plausible alternative to a MAR model.


"SOLUTION" TO PARTIAL VERIFICATION OF DISEASE

Assumptions, models, Bayesian approaches, etc. provide a point estimate (identifiability) for sensitivity and specificity.

BUT
• The data that could be used to check the assumptions are missing.
• Goodness of fit of models can be checked against the observed data only.
• The Bayesian approach is not really a classic Bayesian situation: the observed data may not be able to improve on the prior information, regardless of the sample size.

In the end one may need, or choose, to use assumptions, but we recommend starting with a global sensitivity analysis first.


GLOBAL SENSITIVITY ANALYSIS

Consider ALL combinations of sensitivity and specificity plausible under the observed data.

Test Ignorance Region (TIR): "ignorance" due to incompleteness of disease status verification.

Kosinski and Barnhart (Statistics in Medicine, 2003)
Horowitz and Manski (JASA, 2000)
Molenberghs et al. (Applied Statistics, 2001)


  Result of   Disease verified (R=1)   Disease not verified (R=0)
  test T          D=1     D=0              D=1      D=0
  T=1             a1      b1                    u1
  T=0             a2      b2                    u2

Consider the unknown p1 and p2:

  p1 = P(D = 1 | T = 1, R = 0)
  p2 = P(D = 1 | T = 0, R = 0)

IDEALIZED COMPLETE VERIFICATION DATA

  Result of   Disease verified (R=1)   Disease not verified (R=0)
  test T          D=1     D=0              D=1        D=0
  T=1             a1      b1               p1·u1      u1 − p1·u1
  T=0             a2      b2               p2·u2      u2 − p2·u2


IDEALIZED COMPLETE VERIFICATION DATA

  Result of             Disease D
  test T          D=1                 D=0
  T=1             a1 + p1·u1          b1 + (1 − p1)·u1
  T=0             a2 + p2·u2          b2 + (1 − p2)·u2

  p1 = P(D = 1 | T = 1, R = 0)        p2 = P(D = 1 | T = 0, R = 0)

  SENS ≡ f1(p1, p2) = (a1 + p1·u1) / (a1 + p1·u1 + a2 + p2·u2)

  SPEC ≡ f2(p1, p2) = (b2 + (1 − p2)·u2) / (b1 + (1 − p1)·u1 + b2 + (1 − p2)·u2)

  (p1, p2) ∈ [0, 1] × [0, 1].
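A minimal numerical sketch (not the authors' code) of how the TIR can be traced for the SPECT counts: sweep the unknown pair (p1, p2) over a grid on the unit square and map each pair through f1 and f2; a scatterplot of the resulting (sensitivity, specificity) pairs reproduces the region shown on the next slides.

```python
import numpy as np

# SPECT counts: verified 2x2 table and the unverified margins
a1, b1, a2, b2 = 195, 232, 5, 39
u1, u2 = 996, 1221

# sweep the unknown p1 = P(D=1|T=1,R=0) and p2 = P(D=1|T=0,R=0) over [0,1]^2
p1, p2 = np.meshgrid(np.linspace(0, 1, 201), np.linspace(0, 1, 201))

sens = (a1 + p1 * u1) / (a1 + p1 * u1 + a2 + p2 * u2)
spec = (b2 + (1 - p2) * u2) / (b1 + (1 - p1) * u1 + b2 + (1 - p2) * u2)

# the image of the unit square is the Test Ignorance Region; e.g. its bounding box:
print(sens.min(), sens.max(), spec.min(), spec.max())
```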
SPECT DATA

[Figure: left panel, the "ignorance" square for (p1, p2), where p1 = P(D=1|T=1,R=0) and p2 = P(D=1|T=0,R=0); right panel, the corresponding Test Ignorance Region in the (sensitivity, specificity) plane.]


TIR FOR SPECT DATA

              Sensitivity   Specificity
  A: MCAR        0.975         0.144
  B: MAR         0.819         0.592

[Figure: the Test Ignorance Region with its corners labeled by (p1, p2) = (0, 0), (0, 1), (1, 0), (1, 1), and the MCAR (A) and MAR (B) estimates marked as points inside the region.]


POSSIBLE PARAMETERIZATIONS

• Pattern-mixture model approach: P(D, T, R) = P(D | T, R) P(T, R)

    p1 = P(D = 1 | T = 1, R = 0)
    p2 = P(D = 1 | T = 0, R = 0)
    (p1, p2) ∈ [0, 1] × [0, 1]

• Selection model approach: P(D, T, R) = P(R | T, D) P(T, D)

    π1 = P(R = 1 | T = 1, D = 1)
    π2 = P(R = 1 | T = 0, D = 1)
    (π1, π2) ∈ [a1/(a1 + u1), 1] × [a2/(a2 + u2), 1]

    p1·u1 = a1 (1 − π1)/π1
    p2·u2 = a2 (1 − π2)/π2

• Other parameterizations can be considered: Zhou (1993), odds ratios, etc.


POSSIBLE PARAMETERIZATIONS

The choice of parameterization does not modify the TIR; assumptions can be expressed equivalently. For example, the MAR estimates result from the assumptions

  • p1 = a1/(a1 + b1) and p2 = a2/(a2 + b2), or
  • π1 = (a1 + b1)/n1 and π2 = (a2 + b2)/n2, where n1 = a1 + b1 + u1 and n2 = a2 + b2 + u2.


"SOLUTION" TO PARTIAL VERIFICATION OF DISEASE

Test Ignorance Region (TIR) — no assumptions.

• More missingness results in more ignorance.
• Maybe it is fine to settle on a non-identifiable model? A region estimate may be informative enough. Do we always need point estimates?
• Magnitude of non-identifiability? Does the size of the region reflect the amount of ignorance due to the missing data?

We suggest separating
• the information provided by the observed data (the region estimate), and
• the "information" provided by assumptions.

Only "information by assumption" allows one to shrink a purely data-based region estimate into a point estimate.


SPECT DATA — Assumption about p1 and p2

[Figure sequence over several slides: the TIR panels shown earlier, followed by assumptions that restrict (p1, p2) to sub-regions of the unit square, with the corresponding sub-regions of the Test Ignorance Region in the (sensitivity, specificity) plane.]
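As a quick check (a sketch, not the authors' code) that a point assumption collapses the region to a point: plugging the MAR values p1 = a1/(a1 + b1) and p2 = a2/(a2 + b2) into f1 and f2 reproduces the MAR estimates marked as point B on the TIR above.

```python
a1, b1, a2, b2, u1, u2 = 195, 232, 5, 39, 996, 1221

# MAR assumption: among the unverified, the disease prevalence within each test
# result equals the prevalence observed among the verified with that test result
p1 = a1 / (a1 + b1)   # P(D=1 | T=1, R=0) under MAR
p2 = a2 / (a2 + b2)   # P(D=1 | T=0, R=0) under MAR

sens = (a1 + p1 * u1) / (a1 + p1 * u1 + a2 + p2 * u2)
spec = (b2 + (1 - p2) * u2) / (b1 + (1 - p1) * u1 + b2 + (1 - p2) * u2)
print(round(sens, 3), round(spec, 3))   # approximately 0.819 and 0.592
```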
SPECT DATA — Assumption about disease prevalence

           D=1   D=0
  T=1      195   232
  T=0        5    39
  R=0      996  1221

Possible disease prevalence range: 0.074 − 0.899

[Figure sequence: assuming that the disease prevalence lies in a narrower range (0.074−0.200, 0.074−0.300, and 0.200−0.300 on successive slides) restricts (p1, p2) to a band of the unit square and the Test Ignorance Region to the corresponding sub-region of the (sensitivity, specificity) plane.]


• General regression-based approach: point estimate, BUT assumptions are needed.
• Test Ignorance Region (TIR): no assumptions, BUT a region estimate.

The TIR provides a fair summary of the information in the data set. Overlaying assumption-derived point or sub-region estimates on the TIR gives an explicit statement about the amount of information induced by a model or an assumption.


SPECT DATA

  Result of   Disease D verified (R=1)   Disease D not
  test T          D=1     D=0            verified (R=0)    Total
  T=1             195     232                  996          1423
  T=0               5      39                 1221          1265

SPECT DATA — No disease information (or latent disease)

  Result of   Disease D verified (R=1)   Disease D not
  test T          D=1     D=0            verified (R=0)    Total
  T=1               0       0                 1423          1423
  T=0               0       0                 1265          1265


SPECT DATA — Latent disease

           D=1   D=0
  T=1        0     0
  T=0        0     0
  R=0     1423  1265

Possible disease prevalence range: 0.000 − 1.000

[Figure: the "ignorance" square for (p1, p2) and the resulting Test Ignorance Region.]


SPECT DATA — Latent disease - Assumption about prevalence

           D=1   D=0
  T=1        0     0
  T=0        0     0
  R=0     1423  1265

Possible disease prevalence range: 0.000 − 1.000

[Figure: assuming a disease prevalence range of 0.074−0.899 restricts (p1, p2) and yields the corresponding sub-region of the Test Ignorance Region.]
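A minimal sketch (again not the authors' code) of how a prevalence assumption restricts the region, using the latent-disease layout just shown: sweep (p1, p2) over a grid, keep only the pairs whose implied overall prevalence falls in the assumed range, and collect the corresponding (sensitivity, specificity) pairs. The 0.074−0.899 range matches the slide; any narrower range is handled the same way.

```python
import numpy as np

u1, u2 = 1423, 1265          # latent-disease case: no verified patients at all
n = u1 + u2

# grid over the unknown p1 = P(D=1|T=1,R=0) and p2 = P(D=1|T=0,R=0)
p1, p2 = np.meshgrid(np.linspace(0, 1, 401), np.linspace(0, 1, 401))

# keep only the (p1, p2) pairs consistent with the assumed prevalence range
prev = (p1 * u1 + p2 * u2) / n
keep = (prev >= 0.074) & (prev <= 0.899)
p1, p2 = p1[keep], p2[keep]

sens = p1 * u1 / (p1 * u1 + p2 * u2)
spec = (1 - p2) * u2 / ((1 - p1) * u1 + (1 - p2) * u2)
print(sens.min(), sens.max(), spec.min(), spec.max())   # bounding box of the sub-region
```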
SPECT DATA — Latent disease - Assumption about prevalence

[Figure: with the tighter assumed prevalence range of 0.200−0.300, the admissible (p1, p2) band and the corresponding sub-region of the Test Ignorance Region shrink further.]


Data with continuous test measure

  Result of   Disease verified (R=1)   Disease not verified (R=0)
  test T          D=1     D=0              D=1      D=0
  T = t1          a1      b1                    u1
  T = t2          a2      b2                    u2
  ...             ...     ...                   ...
  T = tN          aN      bN                    uN

Consider the unknown  pi = P(D = 1 | T = ti, R = 0),  i = 1, 2, ..., N.


IDEALIZED COMPLETE VERIFICATION DATA

  Result of   Disease verified (R=1)   Disease not verified (R=0)
  test T          D=1     D=0              D=1        D=0
  T = t1          a1      b1               p1·u1      u1 − p1·u1
  T = t2          a2      b2               p2·u2      u2 − p2·u2
  ...             ...     ...              ...        ...
  T = tN          aN      bN               pN·uN      uN − pN·uN

The Receiver Operating Characteristic (ROC) curve results from a plot of sensitivity versus (1 − specificity) over all cutpoints ti. The Area Under the Curve (AUC) is computed: the closer to 1, the better.

The MAR estimate for the ROC curve (Zhou, Biometrics, 1996) takes pi = ai/(ai + bi).


Model-based estimation

Positive Predictive Value (PPV): P(D = 1 | T = 1), where T = 1 means X ≥ xcut.

  PPV = P(D = 1 | X ≥ xcut)
      = P(D = 1, X ≥ xcut) / P(X ≥ xcut)
      = Σ_{i ≥ xcut} P(D = 1, X = i) / P(X ≥ xcut)
      = Σ_{i ≥ xcut} P(D = 1 | X = i) P(X = i) / Σ_{i ≥ xcut} P(X = i)

• Logistic regression relates the presence of disease to the continuous test measure X:
    P(D = 1 | X) = 1 / (1 + exp(−(α + βX)))
• Density estimation for the distribution of X.


[Figure: fitted model log(P/(1 − P)) = −2.3914 + 0.0125·X, where P is the probability of angiography ≥ 50%; Hosmer and Lemeshow (H-L) goodness-of-fit test p-value = 0.81. For the cutpoint shown: NPV observed = 0.8525 vs. model/density estimate = 0.8530; PPV observed = 0.3393 vs. 0.3523. Lower panel: histogram of X.]

[Figure: the same fitted model at a different cutpoint: NPV observed = 0.8117 vs. model/density estimate = 0.8091; PPV observed = 0.8000 vs. 0.7487.]
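The following is a minimal sketch of the logistic-plus-density approach just described, using simulated stand-in data (the sample size, simulation coefficients, verification fraction, and cutpoint are illustrative, not the study data). The logistic model for P(D = 1 | X) is fitted on the verified patients, and the empirical distribution of X over all patients serves as a simple stand-in for the density estimate; the model-based PPV and NPV are then weighted averages of the fitted probabilities above and below the cutpoint.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# simulated stand-in data: a continuous test X for everyone,
# true disease D, and a verification indicator
x_all = rng.gamma(shape=2.0, scale=60.0, size=2688)                  # test measure
d_all = rng.binomial(1, 1.0 / (1.0 + np.exp(2.4 - 0.0125 * x_all)))  # latent disease
verified = rng.random(2688) < 0.2                                    # ~20% verified

# (1) logistic model P(D=1 | X), fitted on the verified patients only
fit = LogisticRegression().fit(x_all[verified].reshape(-1, 1), d_all[verified])
p_d_given_x = fit.predict_proba(x_all.reshape(-1, 1))[:, 1]

# (2) distribution of X taken from ALL patients (empirical distribution here);
#     PPV(xcut) = sum_{x >= xcut} P(D=1|x) f(x) / sum_{x >= xcut} f(x)
xcut = 87.5                     # illustrative cutpoint
above = x_all >= xcut
ppv = p_d_given_x[above].mean()
npv = 1.0 - p_d_given_x[~above].mean()
print(f"model-based PPV = {ppv:.3f}, NPV = {npv:.3f}")
```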
Possible work

• Dichotomous test measure
  – TIR with covariates
  – Relationship with latent class models
  – Multiple tests
  – Sub-unit measurements: various types of missingness
  – Comparison of two tests utilizing the TIR concept
• Continuous test measure
  – A "TIR-like" approach to ROC and AUC?
  – Is there a penalty for a too-dense choice of cutpoints in a MAR analysis of ROC and AUC?
  – Complete development of model-based estimation of sensitivity and specificity and comparison with the direct computation approach; possible application to verification bias issues with ROC and AUC.