Modelling health care costs: practical examples and applications

Andrew Briggs, Philip Clarke
University of Oxford
&
Daniel Polsky, Henry Glick
University of Pennsylvania

Modelling health care costs: Presentation overview
• Statement of problem
• Examples of cost distributions
  – Overall
  – By treatment group
• Testing cost differences
  – Raw scale
  – Transformations
  – Back transformation
• Multivariate analysis
  – Raw scale
  – Transformation
• Summary/future directions

Modelling health care costs: Statement of problem
• Common to collect cost data in clinical trials
• Cost data almost always skewed and may exhibit substantial kurtosis
• Nevertheless, arithmetic means are the concern of decision makers
  – Only the mean can be used to estimate total cost of care
  – Only total cost of care will lead to balanced budgets
• Cost models have a role beyond the simple estimation of within trial analysis
  – May be used to generalise to broader populations
  – May be used for sub-group analysis

Modelling health care costs: Examples of cost distributions
[Figure: histograms of cost (y-axis: Fraction; x-axis: Cost) for each dataset: 1. BOAES, 2. UKPDS, 3. ACT, 4. Dan, 5. SAH, 6. HG]

Modelling health care costs: Cost distributions by treatment
[Figure: histograms of cost (y-axis: Fraction; x-axis: Cost), with control group and treatment group panels, for each dataset: 1. BOAES, 2. UKPDS, 3. ACT, 4. Dan, 5. SAH, 6. HG]

Approaches for testing cost differences
• Parametric T-test or nonparametric bootstrap on untransformed cost
  – Both unbiased
  – Inefficient?
• (Log) transformation of cost
  – Straight retransformation biased
  – Use $E[C_i] = \exp(\mu + 0.5\sigma^2)$
  – Or non-parametric smearing: $E[C_i] = \exp(\mu) \cdot \frac{1}{N} \sum_{j=1}^{N} \exp(\hat{\varepsilon}_j)$
• Generalised linear models
  – lognormal: $\ln E[C_i] = \alpha + \beta t_i$
  – Expectation modelled directly so no retransformation problem
  – Wide variety of possible link function/distributions

Zhou’s test based on log normality
$H_0: \exp(\mu_1 + 0.5\sigma_1^2) = \exp(\mu_0 + 0.5\sigma_0^2)$
$H_0: \mu_1 + 0.5\sigma_1^2 = \mu_0 + 0.5\sigma_0^2$
$H_0: (\mu_1 + 0.5\sigma_1^2) - (\mu_0 + 0.5\sigma_0^2) = 0$
$H_0: \mu_1 - \mu_0 = 0$ iff $\sigma_0^2 = \sigma_1^2$
• Special case of homogeneity of log variances
  – test of geometric means is equivalent to test of arithmetic means
• By symmetry: for special case of homogeneity of log means
  – test of equality of log variances is equivalent to test of arithmetic means?
• Zhou’s proposed test combines the two

P-values and confidence intervals for back-transformed cost differences

Dataset / method               P-value   Cost diff (95% CI)
1. BOAES
  T-test: raw cost             0.013     149 (31 to 267)
  Bootstrapped means           0.012     149 (44 to 255)
  Zhou (bootstrap)             0.026     107 (21 to 191)
  Log (smeared)                <0.001    212 (146 to 278)
  GLM: log link / normal       0.019     149 (26 to 259)
  GLM: B-C link / normal       0.019     149 (26 to 260)
2. UKPDS
  T-test: raw cost             0.971     0 (-8 to 8)
  Bootstrapped means           0.938     0 (-8 to 8)
  Zhou (bootstrap)             0.988     0 (-14 to 12)
  Log (smeared)                0.165     5 (-2 to 13)
  GLM: log link / normal       0.971     0 (-8 to 8)
  GLM: B-C link / normal       0.971     0 (-8 to 8)
3. ACT
  T-test: raw cost             0.179     -15,523 (-38,248 to 7,201)
  Bootstrapped means           0.172     -15,523 (-37,854 to 6,665)
  Zhou (bootstrap)             0.005     -136,747 (-611,607 to -21,014)
  Log (smeared)                0.393     14,162 (-18,321 to 47,530)
  GLM: log link / normal       0.185     -15,523 (-37,212 to 7,790)
  GLM: B-C link / normal       0.182     -15,523 (-37,458 to 7,509)
4. DP
  T-test: raw cost             0.057     2,925 (-91 to 5,940)
  Bootstrapped means           0.058     2,925 (-97 to 5,807)
  Zhou (bootstrap)             <0.001    114,565 (64,023 to 194,871)
  Log (smeared)                <0.001    -8,589 (-13,277 to -4,413)
  GLM: log link / normal       0.073     2,925 (-297 to 5,675)
  GLM: B-C link / normal       0.071     2,925 (270 to 5,701)
5. SAH
  T-test: raw cost             0.236     -4,060 (-10,795 to 2,675)
  Bootstrapped means           0.230     -4,060 (-10,836 to 2,575)
  Zhou (bootstrap)             0.119     -4,019 (-9,429 to 2,170)
  Log (smeared)                0.004     -6,701 (-11,128 to -2,229)
  GLM: log link / normal       0.243     -4,060 (-10,506 to 2,881)
  GLM: B-C link / normal       0.244     -4,060 (-10,484 to 2,909)
6. HG
  T-test: raw cost             0.077     2,353 (-259 to 4,965)
  Bootstrapped means           0.080     2,353 (-200 to 4,959)
  Zhou (bootstrap)             0.468     1,258 (-1,873 to 4,388)
  Log (smeared)                0.024     2,891 (394 to 5,397)
  GLM: log link / normal       0.081     2,353 (-298 to 4,899)
  GLM: B-C link / normal       0.081     2,353 (-295 to 4,903)

Approaches to model selection
• Examine fit using standard regression diagnostics
  – R², normal probability plots etc.
  – Summarises fit to observed data
• Test the predictive ability of the models directly
  – Ability to predict observations not used in model fitting

Predictive ability of the models: a simulation experiment
1. Sample was split into two equal parts
   • Part i designated ‘training sub-sample’
   • Part ii designated ‘test sub-sample’
2. Each model fitted using the training sub-sample and costs predicted for the test sub-sample
3. Mean square error calculated for each model
Process repeated in 10,000 trials

Results of a simulation exercise

Model                                         MSE mean   MSE SE   RMSE
OLS (cost)                                    10466      53       102
OLS log(cost) no smearing                     16072      226      127
OLS log(cost) smeared                         47432      1489     218
OLS sqrt(cost) no smearing                    10821      55       104
OLS sqrt(cost) smearing                       10441      54       102
Poisson regression                            11427      70       107
2-part OLS (‘+’ve cost)                       10467      53       102
2-part OLS log(‘+’ve cost) no smearing        11298      54       106
2-part OLS log(‘+’ve cost) smearing           11689      51       108
2-part OLS sqrt(‘+’ve cost) no smearing       10616      55       103
2-part OLS sqrt(‘+’ve cost) smearing          10429      54       102
Tobit                                         10757      51       104

MSE – mean squared error
SE – estimated standard error of the mean
RMSE – root mean squared error

P-values and confidence intervals for back-transformed cost differences

Dataset / method               P-value   Cost diff (95% CI)
1. BOAES
  T-test: raw cost             0.013     149 (31 to 267)
  Bootstrapped means           0.012     149 (44 to 255)
  Zhou (bootstrap)             0.026     107 (21 to 191)
  Log (smeared)                <0.001    212 (146 to 278)
  GLM: log link / normal       0.019     149 (26 to 259)
  Covar Adj raw cost           0.        180 (70 to 300)
  Covar Adj: Log(smeared)      0.        222 (126 to 338)
  Covar Adj GLM: log           0.        154 (-48 to 289)
2. UKPDS
  T-test: raw cost             0.971     0 (-8 to 8)
  Bootstrapped means           0.938     0 (-8 to 8)
  Zhou (bootstrap)             0.988     0 (-14 to 12)
  Log (smeared)                0.165     5 (-2 to 13)
  GLM: log link / normal       0.971     0 (-8 to 8)
  Covar Adj raw cost           0.        -1 (-9 to 6)
  Covar Adj: Log(smeared)      0.        0 (-7 to 8)
  Covar Adj GLM: log           0.        -2 (-13 to 7)
3. ACT
  T-test: raw cost             0.179     -15,523 (-38,248 to 7,201)
  Bootstrapped means           0.172     -15,523 (-37,854 to 6,665)
  Zhou (bootstrap)             0.005     -136,747 (-611,607 to -21,014)
  Log (smeared)                0.393     14,162 (-18,321 to 47,530)
  GLM: log link / normal       0.185     -15,523 (-37,212 to 7,790)
  Covar Adj raw cost           0.        -18,378 (-43,078 to 6,555)
  Covar Adj: Log(smeared)      0.        -12,602 (-47,687 to 24,271)
  Covar Adj GLM: log           0.        -25,230 (-57,500 to 7,039)
4. DP
  T-test: raw cost             0.057     2,925 (-91 to 5,940)
  Bootstrapped means           0.058     2,925 (-97 to 5,807)
  Zhou (bootstrap)             <0.001    114,565 (64,023 to 194,871)
  Log (smeared)                <0.001    -8,589 (-13,277 to -4,413)
  GLM: log link / normal       0.073     2,925 (-297 to 5,675)
  Covar Adj raw cost           0.        3,078 (125 to 6,102)
  Covar Adj: Log(smeared)      0.        3,649 (473 to 6,924)
  Covar Adj GLM: log           0.        3,364 (-984 to 8,149)
5. SAH
  T-test: raw cost             0.236     -4,060 (-10,795 to 2,675)
  Bootstrapped means           0.230     -4,060 (-10,836 to 2,575)
  Zhou (bootstrap)             0.119     -4,019 (-9,429 to 2,170)
  Log (smeared)                0.004     -6,701 (-11,128 to -2,229)
  GLM: log link / normal       0.243     -4,060 (-10,506 to 2,881)
  Covar Adj raw cost           0.        -3,289 (-9,394 to 3,073)
  Covar Adj: Log(smeared)      0.        -4,036 (-9,729 to 1,680)
  Covar Adj GLM: log           0.        -3,248 (-16,448 to 9,510)
6. HG
  T-test: raw cost             0.077     2,353 (-259 to 4,965)
  Bootstrapped means           0.080     2,353 (-200 to 4,959)
  Zhou (bootstrap)             0.468     1,258 (-1,873 to 4,388)
  Log (smeared)                0.024     2,891 (394 to 5,397)
  GLM: log link / normal       0.081     2,353 (-298 to 4,899)
  Covar Adj raw cost           0.        1,759 (-494 to 4,068)
  Covar Adj: Log(smeared)      0.        1,772 (-321 to 4,097)
  Covar Adj GLM: log           0.        1,540 (-1,067 to 4,132)

Modelling health care costs: Summary
• Different approaches to modelling health care cost can lead to quite different estimates
• Difficult to tell which is most appropriate
• Transforming cost data can be more efficient
  – GLM intuitive in modelling expectations
  – But modelling log cost better for heavy tails?
• Covariate adjustment can help precision and should be used whenever possible
  – Will be used to extrapolate beyond the data
  – Creates sub-group effects with transformed models
  – Creates challenges for summarising incremental cost across different covariate patterns

Modelling health care costs: Log cost distributions by treatment
[Figure: histograms of log cost (y-axis: Fraction; x-axis: Natural log of cost), with control group and treatment group panels, for each dataset: 1. BOAES, 2. UKPDS, 3. ACT, 4. Dan, 5. SAH, 6. HG]
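The back-transformation issue raised under "Approaches for testing cost differences" can be illustrated with a short sketch. It is not the presenters' code: it assumes strictly positive costs, a log-scale model containing only a treatment indicator, and a single smearing factor pooled across both arms. The function name `smeared_cost_difference` and the simulated lognormal data are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def smeared_cost_difference(control, treated):
    """Compare raw, naive and smeared estimates of the mean cost difference.

    Assumes strictly positive costs and a log-scale model with only a
    treatment indicator, so the fitted log means are the group log means.
    """
    log_control = [math.log(c) for c in control]
    log_treated = [math.log(c) for c in treated]
    m0 = statistics.fmean(log_control)
    m1 = statistics.fmean(log_treated)

    # Pooled log-scale residuals give one common smearing factor.
    residuals = [y - m0 for y in log_control] + [y - m1 for y in log_treated]
    smear_factor = statistics.fmean([math.exp(r) for r in residuals])

    # Straight retransformation: a difference in geometric means.
    naive = math.exp(m1) - math.exp(m0)

    return {
        "raw": statistics.fmean(treated) - statistics.fmean(control),
        "naive": naive,
        "smeared": naive * smear_factor,
        "smear_factor": smear_factor,
    }


if __name__ == "__main__":
    # Hypothetical simulated data: lognormal costs with equal log variances.
    rng = random.Random(42)
    control = [rng.lognormvariate(7.0, 1.0) for _ in range(500)]
    treated = [rng.lognormvariate(7.2, 1.0) for _ in range(500)]
    for name, value in smeared_cost_difference(control, treated).items():
        print(f"{name:>12}: {value:,.2f}")
```

Because the pooled residuals average to zero, the smearing factor is at least 1, so the smeared difference is never smaller in magnitude than the naive one. With a pooled factor the smeared estimate need not equal the raw difference in arithmetic means, which is one possible reason the "Log (smeared)" rows in the tables depart from the raw-cost rows.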
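The split-sample procedure described under "Predictive ability of the models" can be sketched as follows. This is a simplified illustration rather than the simulation reported in the slides: it assumes a single binary covariate (a treatment indicator), compares only three of the listed models, and uses far fewer trials than the 10,000 reported. The names `fit_predictors` and `split_sample_mse` and the simulated data are hypothetical.

```python
import math
import random
import statistics


def fit_predictors(train):
    """Fit three simple models on (treated, cost) pairs.

    Returns, for each model, the predicted cost for group 0 and group 1.
    Assumes both groups appear in the training half and costs are positive.
    """
    groups = {g: [c for t, c in train if t == g] for g in (0, 1)}
    raw = {g: statistics.fmean(cs) for g, cs in groups.items()}
    log_mean = {
        g: statistics.fmean([math.log(c) for c in cs]) for g, cs in groups.items()
    }
    # Common smearing factor from pooled log-scale residuals.
    smear = statistics.fmean(
        [math.exp(math.log(c) - log_mean[t]) for t, c in train]
    )
    return {
        "OLS (cost)": raw,
        "OLS log(cost) no smearing": {g: math.exp(m) for g, m in log_mean.items()},
        "OLS log(cost) smeared": {g: math.exp(m) * smear for g, m in log_mean.items()},
    }


def split_sample_mse(data, n_trials=200, seed=0):
    """Average test-half mean squared error over repeated random half splits."""
    rng = random.Random(seed)
    per_trial = {}
    for _ in range(n_trials):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        for name, pred in fit_predictors(train).items():
            mse = statistics.fmean([(c - pred[t]) ** 2 for t, c in test])
            per_trial.setdefault(name, []).append(mse)
    return {name: statistics.fmean(values) for name, values in per_trial.items()}


if __name__ == "__main__":
    # Hypothetical simulated data: 200 lognormal costs per arm.
    rng = random.Random(1)
    data = [(t, rng.lognormvariate(7.0 + 0.2 * t, 1.0))
            for t in (0, 1) for _ in range(200)]
    for name, mse in split_sample_mse(data).items():
        print(f"{name:<28} MSE {mse:,.0f}  RMSE {math.sqrt(mse):,.0f}")
```

Scoring each model on the half it was not fitted to is what separates this from the regression diagnostics listed under "Approaches to model selection", which summarise fit to the observed data only.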