Reliability for Peer Grouping
Eric Schone, Ph.D.
Mathematica Policy Research

Reliability Workgroup Charter
Minnesota statute 62U.04 requires the Commissioner to consult with providers, health plans, purchasers, and consumers regarding an appropriate minimum reliability threshold for the peer grouping results. The purpose of this group is for members to provide advice about how to consider reliability for the peer grouping analysis, and for MDH to convey information about reliability related to specific methodological issues.

Reliability and Validity
From the charter:
– Validity refers to the accuracy of a measurement; reliability refers to the consistency of a measurement.
A measure is valid if its value reflects what it is supposed to be measuring:
– Quality of care, for quality measures
– Impact on cost of care, for efficiency measures
A measure is reliable if changes in the value of the measure reflect changes in performance:
– Differences between providers are real
– Differences over time indicate improvement or worsening

What Makes a Measure Valid?
Accurate record-keeping
Correctly defined numerator and denominator
Examples:
– Free throw percentage
  • Clearly defined numerator and denominator, no measurement error
  • Poor measure of offensive impact
– Batting average
  • Alternative measures of offensive production
  • Choosing a pinch hitter
– Risk-adjusted outcomes
  • Mortality
  • Cost

What Makes a Measure Reliable?
Variation:
– Variation between providers is substantial
– Random variation is small
Example:
– Batting average vs. "clutch-hitting" average

Some Reliability Formulas
Intraclass correlation: R = τ²/(τ² + σ²/N)
– Increases with τ² (variance between providers)
– Decreases with σ² (variance within the measure)
– Increases with N (number of patients)
Minimum sample size for a reliability threshold R: N = Rσ²/(τ²(1 - R))
Risk of misclassification: complicated
(A worked sketch of these formulas appears after the Conclusions.)

Reliability Trade-Offs
Winsorizing can reduce both between-provider and within-provider variance
Increasing the reporting threshold decreases the number of providers reported
Increasing the number of providers classified increases the risk of misclassification

Example: Reliability, Sample Size, and Misclassification

High = Top 25%
Reliability   Sample size   Misclassified   False positive   False positive/positives
.4            20            22%             11%              44%
.5            30            21%             10%              40%
.7            70            15%             8%               31%
.9            270           8%              4%               16%

High = Top 50%
Reliability   Sample size   Misclassified   False positive   False positive/positives
.4            20            30%             15%              30%
.5            30            25%             13%              25%
.7            70            15%             8%               15%
.9            270           5%              3%               5%

(A simulation sketch of how rates like these can be estimated appears after the Conclusions.)

Conclusions
The reliability standard for reporting/inclusion in results differs from the standard for classification
Reliability is subject to trade-offs
– Reliability and validity goals can be at odds
The type of misclassification matters
– A false negative may be preferred to a false positive
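
Worked Example: Reliability Formulas
A minimal Python sketch (not part of the original deck) of the two formulas above. The variance values τ² = 1 and σ² = 30 are illustrative assumptions chosen because they reproduce the sample sizes in the table (20, 30, 70, 270); the deck does not state the variances it used, and the function and parameter names are hypothetical.

```python
def reliability(tau_sq: float, sigma_sq: float, n: int) -> float:
    """Intraclass correlation: R = tau^2 / (tau^2 + sigma^2 / n)."""
    return tau_sq / (tau_sq + sigma_sq / n)


def min_sample_size(target_r: float, tau_sq: float, sigma_sq: float) -> float:
    """Smallest n reaching reliability target_r: n = R * sigma^2 / (tau^2 * (1 - R))."""
    return target_r * sigma_sq / (tau_sq * (1.0 - target_r))


# Illustrative variances (assumed, not from the deck): tau^2 = 1, sigma^2 = 30.
# A provider with 20 patients then has R = 1 / (1 + 30/20) = 0.4,
# and reaching R = 0.9 requires n = 0.9 * 30 / (1 * 0.1) = 270 patients.
print(reliability(1.0, 30.0, 20))        # 0.4
print(min_sample_size(0.9, 1.0, 30.0))   # 270.0
```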
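
Simulation Sketch: Misclassification Rates
The deck does not describe how the misclassification rates in the table were computed. The Monte Carlo sketch below is one hedged way to produce comparable numbers under a simple normal-normal model: true provider scores are drawn from N(0, τ²), observed means add sampling noise N(0, σ²/n), and "high" is the top 25% (or top 50%) of each ranking. The model, function names, and provider count are assumptions for illustration, not MDH's method.

```python
import numpy as np


def misclassification_rates(tau_sq, sigma_sq, n, top_share=0.25,
                            n_providers=200_000, seed=0):
    """Estimate misclassification rates by simulation under a normal-normal model."""
    rng = np.random.default_rng(seed)
    true = rng.normal(0.0, np.sqrt(tau_sq), n_providers)                    # true performance
    observed = true + rng.normal(0.0, np.sqrt(sigma_sq / n), n_providers)   # noisy measure

    truly_high = true >= np.quantile(true, 1.0 - top_share)
    called_high = observed >= np.quantile(observed, 1.0 - top_share)

    misclassified = np.mean(truly_high != called_high)            # any wrong label
    false_positive = np.mean(called_high & ~truly_high)           # called high, not truly high
    fp_among_positives = false_positive / np.mean(called_high)    # share of "high" calls that are wrong
    return misclassified, false_positive, fp_among_positives


# With the illustrative variances tau^2 = 1, sigma^2 = 30 and n = 20 (reliability 0.4),
# the estimated rates can be compared against the first row of the "High = Top 25%" table.
print(misclassification_rates(1.0, 30.0, 20, top_share=0.25))
```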