Assessing Individual Agreement

Huiman X. Barnhart, Ph.D. ([email protected])
Associate Professor, Department of Biostatistics and Bioinformatics, Duke Clinical Research Institute, Duke University
Andrzej S. Kosinski, Ph.D., Duke University
Michael Haber, Ph.D., Department of Biostatistics, Emory University
This research is funded by R01 MH70028.

Outline
• Introduction
• Data examples
• Individual equivalence and individual agreement
• Individual equivalence criterion (IEC) and coefficient of individual agreement (CIA)
• Application to data examples
• Discussion and future research

Introduction
Accurate and precise measurement is important in clinical diagnosis. A method comparison study or reliability study is usually conducted to evaluate agreement between methods or observers. We are often interested in
• whether the methods/observers can be used interchangeably at the individual level;
• whether a new method that is easy to use can replace an existing standard method that may be expensive or invasive, at the individual level.

Introduction
Traditionally, if there is no reference method, assessing agreement has been based on:
• Intraclass correlation coefficient (ICC) (inter- and intra-ICC):
$$\mathrm{ICC} = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2} = \frac{\text{between-subject variability}}{\text{total variability}}$$
for the model $Y_{ij} = \alpha_i + e_{ij}$, $j = 1, \dots, J$.
• Concordance correlation coefficient (CCC):
$$\mathrm{CCC} = 1 - \frac{E(Y_{i1} - Y_{i2})^2}{E[(Y_{i1} - Y_{i2})^2 \mid Y_{i1}, Y_{i2} \text{ independent}]} = \frac{2\sigma_{B1}\sigma_{B2}\rho_{\mu}}{\sigma_{B1}^2 + \sigma_{W1}^2 + \sigma_{B2}^2 + \sigma_{W2}^2 + (\mu_1 - \mu_2)^2}$$
for the model $Y_{ij} = \mu_{ij} + e_{ij}$, $j = 1, \dots, J$.
• With fixed within-subject variability $\sigma_W^2$, ICC and CCC increase as between-subject variability $\sigma_B^2$ increases.

[Figure: scatter plots of paired measurements with fixed within-subject variability and increasing between-subject variability; panel titles ICC=CCC=0.97, 0.91, 0.8, 0.0.]

Goniometer Example (n=29, K=3, J=2)

Table 1. Comparison of Manual Goniometer and Electrogoniometer

                Manual Goniometer   Electrogoniometer
μ                        1.437              0.046
σ²_W                     0.736              0.977
σ²_B                    53.8               51.4
Intra ICC                0.986              0.982
Inter ICC                        0.961
CCC                              0.945

Bland and Altman Blood Pressure Example (n=85, K=3, J=3)

                Observer 1   Observer 2   Machine
μ                  127.4        127.3      143.0
σ²_W                37.4         38.0       83.1
σ²_B               936.0        917.1      983.2
Intra ICC            0.962        0.960      0.922
Inter ICC                  0.875
CCC                        0.782

Introduction
• It is questionable whether the ICC or CCC is adequate for assessing agreement at the individual level.
• We define "individual agreement" in the following sense: the individual difference between measurements from different methods is close to the difference between replicated measurements within a method.

[Figure: two illustrative subjects with replicated measurements; Method 1 replications 5,5,6,6 and 7,7,8,8, Method 2 replications 3,4,3,4 and 5,6,5,6.]
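To make this notion concrete, the following minimal Python sketch (not part of the original slides) contrasts the within-method and between-method mean squared differences for one illustrative subject. The replicate values echo the figure above and the function names are illustrative only.

```python
import numpy as np
from itertools import combinations, product

def msd_within(reps):
    """Mean squared difference over all replicate pairs within one method."""
    return np.mean([(a - b) ** 2 for a, b in combinations(reps, 2)])

def msd_between(reps1, reps2):
    """Mean squared difference over all cross-method pairs of measurements."""
    return np.mean([(a - b) ** 2 for a, b in product(reps1, reps2)])

# Toy replicate values in the spirit of the figure above (illustrative only)
method1 = [5, 5, 6, 6]
method2 = [3, 4, 3, 4]

within = 0.5 * (msd_within(method1) + msd_within(method2))
between = msd_between(method1, method2)

print(f"within-method MSD:  {within:.2f}")   # ~0.67, replication error only
print(f"between-method MSD: {between:.2f}")  # ~4.50, adds the method disagreement
# Good individual agreement would require the between-method mean squared
# difference to be close to the within-method one; here it is much larger.
```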
Individual Bioequivalence
• "Individual agreement" here is similar to individual bioequivalence in bioequivalence studies, which assess agreement between a test drug and a reference drug.
• Individual bioequivalence was first introduced by Anderson and Hauck (1990) using a probability criterion.
• Schall and Luus (1993) extended this to a general case that includes the probability and moment criteria as special cases.
• The FDA modified and adopted the moment criterion in its guidelines (2001) for establishing individual bioequivalence (IBE).

Individual Agreement
• The moment criterion for IBE compares the (expected) squared individual difference between the new and standard drugs to the (expected) squared individual difference within the standard drug.
• We propose to assess individual agreement by comparing the (expected) squared individual difference between methods to the (expected) squared individual difference within a method due to replication. Two cases are considered and compared:
  (1) existence of a reference method (or a gold standard measured with error);
  (2) no reference method.

Individual Bioequivalence Criterion, FDA (2001): Existence of a Reference
• Reference-scaled IBC:
$$\mathrm{IBC} = \frac{E(Y_{iT} - Y_{iR})^2 - E(Y_{iR} - Y_{iR'})^2}{E(Y_{iR} - Y_{iR'})^2/2} \le \theta_I$$
where $\theta_I$ is the bioequivalence limit set by the regulatory agency.
• Model: $Y_{ij} = \mu_{ij} + \epsilon_{ij}$, $j = T$ (test drug), $R$ (reference drug), with
  within-subject variances $\sigma_{WT}^2 = \mathrm{Var}(\epsilon_{iT})$, $\sigma_{WR}^2 = \mathrm{Var}(\epsilon_{iR})$;
  between-subject variances $\sigma_{BT}^2 = \mathrm{Var}(\mu_{iT})$, $\sigma_{BR}^2 = \mathrm{Var}(\mu_{iR})$;
  subject-by-formulation interaction $\sigma_D^2 = \mathrm{Var}(\mu_{iT} - \mu_{iR})$.
$$\mathrm{IBC} = \frac{(\mu_T - \mu_R)^2 + \sigma_D^2 + \sigma_{WT}^2 - \sigma_{WR}^2}{\sigma_{WR}^2} \le \theta_I$$

Proposed Individual Equivalence Criterion (IEC): Existence of a Reference
For a total of $J$ methods, with the first $J-1$ methods new and the $J$th method as the reference:
$$\mathrm{IEC}^R = \frac{\Big(\sum_{j=1}^{J-1} E(Y_{ij} - Y_{iJ})^2\Big)/(J-1) - E(Y_{iJk} - Y_{iJk'})^2}{E(Y_{iJk} - Y_{iJk'})^2/2} \le \theta_I$$
$$\mathrm{IEC}^R = \frac{\frac{\sum_{j=1}^{J-1}(\mu_j - \mu_J)^2}{J-1} + \frac{\sum_{j=1}^{J-1}\sigma_{D_{jJ}}^2}{J-1} + \frac{\sum_{j=1}^{J-1}\sigma_{Wj}^2}{J-1} - \sigma_{WJ}^2}{\sigma_{WJ}^2} \le \theta_I$$
where $Y_{ij} = \mu_{ij} + \epsilon_{ij}$ with $\sigma_{D_{jJ}}^2 = \mathrm{Var}(\mu_{ij} - \mu_{iJ})$.

Coefficient of Individual Agreement (CIA): Existence of a Reference
$$\mathrm{CIA}^R = \frac{\sigma_{WJ}^2}{\tau_{*R}^2 + \sigma_{*R}^2} = \frac{\text{within-subject variability}}{\text{total variability between methods}}$$
"True" inter-method variability: $\tau_{*R}^2 = E\Big(\sum_{j=1}^{J-1}(\mu_{ij} - \mu_{iJ})^2/2\Big)\Big/(J-1)$
Weighted within-method variability: $\sigma_{*R}^2 = \frac{1}{2}\Big(\sum_{j=1}^{J-1}\frac{\sigma_{Wj}^2}{J-1} + \sigma_{WJ}^2\Big)$
$$\mathrm{CIA}^R = \frac{2}{\mathrm{IEC}^R + 2}$$

Proposed Individual Equivalence Criterion (IEC): No Reference
For $J$ new methods:
$$\mathrm{IEC}^N = \frac{\dfrac{\sum_{j=1}^{J-1}\sum_{j'=j+1}^{J} E(Y_{ij} - Y_{ij'})^2}{J(J-1)/2} - \dfrac{\sum_j E(Y_{ijk} - Y_{ijk'})^2}{J}}{\dfrac{\sum_j E(Y_{ijk} - Y_{ijk'})^2}{2J}} \le \theta_I, \qquad \mathrm{IEC}^N = \frac{2\tau_*^2}{\sigma_*^2}$$

IEC^N and CIA^N: No Reference
• $\mathrm{CIA}^N = \psi = \dfrac{\sigma_*^2}{\tau_*^2 + \sigma_*^2} = \dfrac{\text{average within-subject variability}}{\text{total variability between methods}}$ (Haber et al., 2005)
• $\mathrm{CIA}^N = \dfrac{2}{\mathrm{IEC}^N + 2}$
• In general, $\mathrm{IEC}^N \le \mathrm{IEC}^R$ and $\mathrm{CIA}^R \le \mathrm{CIA}^N$.

Guideline on IEC and CIA Values
• In general, we want a low value of IEC and a high value of CIA to claim satisfactory individual agreement.
• FDA's boundary: $\theta_I = 2.4948$. This corresponds to $\mathrm{IEC} \le 2.4948$ or $\mathrm{CIA} \ge 0.445$, i.e., the inter-method variability ($\tau_*^2$) is within 125% of the within-subject variance ($\sigma_*^2$).

Estimation and Inference
Based on the method of moments:
• Existence of a reference:
$$\widehat{\mathrm{IEC}}^R = \frac{2(\hat\tau_{*R}^2 + \hat\sigma_{*R}^2 - MSE_{WJ})}{MSE_{WJ}}, \qquad \widehat{\mathrm{CIA}}^R = \frac{MSE_{WJ}}{\hat\tau_{*R}^2 + \hat\sigma_{*R}^2}$$
$$\hat\sigma_{*R}^2 = \Big(\sum_{j=1}^{J-1}\frac{MSE_{Wj}}{J-1} + MSE_{WJ}\Big)\Big/2, \qquad \hat\tau_{*R}^2 = \frac{\sum_j (MS_{jJ} - MSE_{jJ})}{K(J-1)}$$
with mean squares from the models $Y_{ijk} = \mu + \alpha_i + \epsilon_{ijk}$ (for each method $j = 1, \dots, J$) and $Y_{ijk} = \mu + \alpha_i + \gamma_{ij} + \epsilon_{ijk}$ (for the pair of methods $j$ and $J$).

Estimation and Inference
Based on the method of moments:
• No reference:
$$\widehat{\mathrm{IEC}}^N = \frac{2(MS - MSE)}{K \cdot MSE}, \qquad \widehat{\mathrm{CIA}}^N = \frac{K \cdot MSE}{MS + (K-1) \cdot MSE}$$
$$\hat\tau_*^2 = \frac{MS - MSE}{K}, \qquad \hat\sigma_*^2 = MSE$$
for the model $Y_{ijk} = \mu + \alpha_i + \gamma_{ij} + \epsilon_{ijk}$.
• A large-sample distribution can be derived; we use the bootstrap percentile method for one-sided confidence bounds (a sketch follows below).
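As a concrete companion to the no-reference estimators and bootstrap bound just described, here is a minimal Python sketch (not from the original slides). The slides do not spell out the ANOVA computations, so MS is taken here as the mean square for $\gamma_{ij}$ in the stated model, which is one plausible reading noted in the comments; the simulated data, function names, and bound levels are illustrative only.

```python
import numpy as np

def anova_ms(y):
    """Mean squares for the balanced model Y_ijk = mu + alpha_i + gamma_ij + e_ijk,
    where y has shape (subjects n, methods J, replicates K).  MS is taken as the
    mean square for gamma_ij (method effect plus subject-by-method interaction),
    an assumed reading of the slides; MSE is the replication error mean square."""
    n, J, K = y.shape
    cell = y.mean(axis=2)                          # cell means Y-bar_ij.
    subj = cell.mean(axis=1, keepdims=True)        # subject means Y-bar_i..
    ms = K * ((cell - subj) ** 2).sum() / (n * (J - 1))
    mse = ((y - cell[:, :, None]) ** 2).sum() / (n * J * (K - 1))
    return ms, mse

def iec_cia_no_ref(y):
    """Method-of-moments estimates of IEC^N and CIA^N (no-reference case)."""
    K = y.shape[2]
    ms, mse = anova_ms(y)
    iec = 2 * (ms - mse) / (K * mse)               # IEC-hat^N = 2(MS - MSE)/(K*MSE)
    cia = K * mse / (ms + (K - 1) * mse)           # CIA-hat^N = K*MSE/(MS + (K-1)*MSE)
    return iec, cia

def bootstrap_bounds(y, B=2000, alpha=0.05, seed=0):
    """Percentile bootstrap over subjects: one-sided upper bound for IEC^N
    and one-sided lower bound for CIA^N."""
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    boot = np.array([iec_cia_no_ref(y[rng.integers(0, n, size=n)]) for _ in range(B)])
    return np.quantile(boot[:, 0], 1 - alpha), np.quantile(boot[:, 1], alpha)

# Simulated example: n=29 subjects, J=2 methods, K=3 replicates, good agreement
rng = np.random.default_rng(1)
n, J, K = 29, 2, 3
subject = rng.normal(0, 7, size=(n, 1, 1))         # large between-subject spread
y = subject + rng.normal(0, 1, size=(n, J, K))     # small replication error
iec_hat, cia_hat = iec_cia_no_ref(y)
iec_up, cia_lo = bootstrap_bounds(y)
print(f"IEC^N = {iec_hat:.3f} (95% upper bound {iec_up:.3f})")
print(f"CIA^N = {cia_hat:.3f} (95% lower bound {cia_lo:.3f})")
# Agreement is declared satisfactory when IEC <= 2.4948, i.e. CIA >= 0.445.
```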
Goniometer Example (n=29, K=3, J=2)

Table 3. Comparison of Manual Goniometer and Electrogoniometer

                Manual Goniometer   Electrogoniometer
μ                        1.437              0.046
σ²_W                     0.736              0.977
σ²_B                    53.8               51.4
Intra ICC                0.986              0.982
Inter ICC                        0.961
CCC                              0.945

Individual agreement   Reference: Manual Goniometer   No reference
                       estimate (95% bound)           estimate (95% bound)
IEC                    6.120 (9.896 upper)            4.975 (8.521 upper)
CIA                    0.246 (0.168 lower)            0.287 (0.190 lower)

Table 4. Comparison of Observers and Automatic Machine in Measuring Blood Pressure (n=85, K=3, J=3)

                Observer 1   Observer 2   Machine
μ                  127.4        127.3      143.0
σ²_W                37.4         38.0       83.1
σ²_B               936.0        917.1      983.2
Intra ICC            0.962        0.960      0.922

Individual agreement                   Reference: Observers    No reference
                                       estimate (95% bound)    estimate (95% bound)
Overall results
  CIA                                  0.111 (0.071 lower)     0.225 (0.152 lower)
  Inter ICC                            –                       0.875
  CCC                                  –                       0.782
Pairwise: Observer 1 vs. Observer 2
  CIA                                  –                       1.44 (1.426 lower)
  Inter ICC                            –                       0.989
  CCC                                  –                       0.973
Pairwise: Machine vs. Observer 1
  CIA                                  0.110 (0.070 lower)     0.178 (0.117 lower)
  Inter ICC                            –                       0.834
  CCC                                  –                       0.701

Discussion and Future Research
We proposed the CIA for assessing individual agreement of continuous measurements between multiple methods, for scenarios with an existing reference and with no reference. Open questions include:
• the asymptotic distribution of the CIA;
• the impact of covariates on the CIA;
• how the CIA relates to the ICC/CCC;
• how to quantify the impact of between-subject variability.

Discussion and Future Research
• How can individual agreement be assessed for binary or discrete measurements? How does the CIA differ from kappa in the binary case?
• Replicated measurements may be difficult to obtain, and repeated measurements over time are often taken instead. How can (individual) agreement be assessed with repeated-measures data?
• How should an agreement study be designed, with a proper sample size computation?
• How does measurement error affect different study designs (two-group design or matched design)?