Stochastic DEA: Myths and misconceptions

Timo Kuosmanen (HSE & MTT)
Andrew Johnson (Texas A&M University)
Mika Kortelainen (University of Manchester)
XI EWEPA 2009, Pisa, Italy

What is stochastic DEA?
“DEA is truly a stochastic frontier estimation method, and it is incorrect to classify it as a deterministic method.”
Banker & Natarajan (2008) Operations Research, p. 49

What is stochastic DEA?
• The term stochastic (from Greek “Στοχος” for “aim” or “guess”) generally refers to statistical random variation

Elements of random variation in DEA
• Random sampling of observations from the production possibility set (sampling error)
• Random sampling of observations outside the production possibility set (outliers)
• Random outcome of the production process (stochastic technology)
• Random measurement errors, omitted variables, and other disturbances (stochastic noise)

Common myths and misconceptions
• Confusing stochastic noise with sampling variation, outliers, or stochastic technology
• Statistical inference on sampling error is believed to improve robustness to noise
• Robustness to outliers is seen as the same as robustness to noise (or at least closely related)

Sampling error
[Figure sequence: output y against input x, showing the true frontier, a random sample of observations (DMUs, firms), and the DEA frontier fitted to that sample.]

Statistical foundation of DEA
– Banker (1993) Management Science
– Korostelev, Simar & Tsybakov (1995) Annals Stat.
– Kneip, Park & Simar (1998) Econometric Theory
– Simar & Wilson (2000) JPA
• Deterministic technology
• No outliers or noise
• Data randomly sampled from the production possibility set (PPS)
• The DEA frontier converges to the true frontier as the sample size approaches infinity
• In a finite sample, the DEA frontier is downward biased

Statistical foundation of DEA
• Statistical inference on sampling error is possible by using
  – Asymptotic sampling distribution (Banker 1993)
  – Bootstrapping (Simar & Wilson 1998)
• Such inferences have nothing to do with
  – outliers
  – stochastic technology
  – stochastic noise

Bootstrapping
• The purpose of the smooth consistent bootstrap (Simar & Wilson 1998, 2000) is to mimic the original random sampling in order to estimate the sampling bias
• The bias-corrected DEA frontier will always lie above the original DEA frontier
• In noisy data, DEA tends to overestimate the frontier
• Assuming away noise, and “correcting” for the small sample bias by bootstrapping, we will shift the frontier upward
=> If noise is a problem, then bias correction will only make it worse

Simulated example
[Figure sequence: simulated data points with the true frontier, then the DEA frontier, then the bias-corrected frontier added; y axis from 3,000 to 6,000, x axis from 0,000 to 12,000.]

Critique of Löthgren & Tambour (LT)
“LT bootstrap involves measuring the distance from a different, random (as opposed to fixed) point to the [frontier] on each replication of the bootstrap Monte Carlo exercise. It seems entirely unclear what this procedure estimates. Certainly, it does not estimate anything of interest.”
…
“LT method assumes not only that [the frontier] is unknown, but also (implicitly) that the point from which one wishes to measure distance to the frontier is unknown.
This is absurd.”
Simar & Wilson (2000), JPA, pp. 67-68.

Outliers
[Figure sequence: y against x, showing the outliers, the true frontier, and the DEA frontier.]

Outliers
– Super-efficiency approach (Wilson 1995 JPA)
– Peeling the onion; context dependent DEA (Seiford & Zhu 1999 Management Science)
– Robust efficiency measures / efficiency depth (Kuosmanen & Post 1999 DP, Cherchye, Kuosmanen & Post 2000 DP)
– Conditional order-m and order-α quantile frontiers (Aragon, Daouia & Thomas-Agnan 2002 DP; Cazals, Florens & Simar 2002 J Econometrics; Daouia & Simar 2007 J Econometrics; Daraio & Simar 2007 book)
• Deterministic technology
• Improve robustness to outliers by not enveloping the most extreme observations
• Outliers are different from noise
  – Noise affects all observations

Stochastic technology
[Figure sequence: y against x, with frontier curves labeled Pr.[f(x)≤f] = 0.05, Pr.[f(x)≤f] = 0.50, and Pr.[f(x)≤f] = 0.95.]

Chance constrained DEA
– Land, Lovell & Thore (1993) Managerial & Decision Econ.
– Olesen & Petersen (1995) Management Science
– Cooper, Huang & Li (1996) Annals of OR
– Huang & Li (2001) JPA
• Stochastic technology, stochastic noise, or both?

Chance constrained stochastic DEA
• Huang & Li (2001) JPA
• Assume inputs and outputs are multivariate normal random variables, with known expected values and covariance matrix

Chance constrained stochastic DEA
• How do we get “knowledge” about the expected values of inputs and outputs?
  – Cannot be estimated from cross-sectional data
  – Panel data estimation would require that the true inputs and outputs do not change over time
• How do we get “knowledge” about the variances and covariances of the error terms?
• Uncertainty of the parameter estimates is not taken into account in the model

Stochastic noise
[Figure sequence: y against x, showing the true frontier.]

Stochastic DEA models to deal with noise
• DEA+
  – Gstach (1998) JPA
  – Banker & Natarajan (2008) Operations Research
• “Stochastic DEA”
  – Banker, Datar & Kemerer (1991) Management Science
• Stochastic FDH/DEA estimators
  – Simar & Zelenyuk (2008) DP
• Stochastic Nonparametric Envelopment of Data (StoNED)
  – Kuosmanen (2006) DP; Kuosmanen & Kortelainen (2007) DP

Stochastic DEA models to deal with noise
• Estimation of a fully deterministic frontier based on data perturbed by noise
  – The shape of the frontier can be estimated without parametric assumptions
• Estimation of inefficiency (efficiency scores) is very challenging in a cross-sectional setting
  – Observed output contains the noise term
  – Only the conditional expected value can be estimated
  – Even the SFA efficiency estimator is not consistent!

Stochastic DEA models to deal with noise
• In a cross-sectional setting, identifying inefficiency and noise requires some strong assumption
  – Assuming away noise completely is a strong assumption, too
• Distributional assumptions do not influence the efficiency rankings
  – Ondrich & Ruggiero 2001, EJOR

Conclusions
• Stochastic noise should not be confused with sampling error, outliers, or stochastic technology
• Correcting for small sample bias by bootstrapping does not improve robustness to noise; it can even make things worse
• Improving robustness to outliers is different from dealing with stochastic noise, which perturbs all observations
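The difference between sampling error and stochastic noise can be illustrated with a small simulation. The following is only a sketch under assumptions that are not in the slides: a single input and single output, a hypothetical true frontier f(x) = sqrt(x), half-normal inefficiency, multiplicative normal noise, and made-up parameter values. The function names true_frontier, dea_frontier, and simulate are illustrative, and the DEA estimate is computed by checking pairs of observations, which is enough only in this one-input, one-output, variable-returns-to-scale case.

```python
import math
import random


def true_frontier(x):
    """Hypothetical concave, increasing frontier (an assumption for illustration)."""
    return math.sqrt(x)


def dea_frontier(xs, ys, x0):
    """VRS DEA frontier estimate at input level x0 (one input, one output).

    Solves: max sum(l_i * y_i) s.t. sum(l_i * x_i) <= x0, sum(l_i) = 1, l_i >= 0.
    With only two constraints, an optimal solution uses at most two
    observations, so single points and pairwise interpolations suffice.
    Returns None when no observed input is <= x0.
    """
    best = None
    for xi, yi in zip(xs, ys):
        if xi > x0:
            continue
        # A single observation with x_i <= x0 is feasible (free disposability).
        if best is None or yi > best:
            best = yi
        # Convex combination of (x_i, y_i) and (x_j, y_j) with x_i <= x0 < x_j.
        for xj, yj in zip(xs, ys):
            if xj > x0:
                w = (x0 - xi) / (xj - xi)
                best = max(best, (1 - w) * yi + w * yj)
    return best


def simulate(n=100, sigma_u=0.2, sigma_v=0.0, seed=1):
    """Return DEA frontier minus true frontier at each observed input level."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = []
    for x in xs:
        u = abs(rng.gauss(0.0, sigma_u))  # inefficiency, u >= 0
        v = rng.gauss(0.0, sigma_v)       # noise, symmetric around zero
        ys.append(true_frontier(x) * math.exp(v - u))
    return [dea_frontier(xs, ys, x) - true_frontier(x) for x in xs]


clean_gaps = simulate(sigma_v=0.0)  # sampling error only
noisy_gaps = simulate(sigma_v=0.3)  # sampling error plus noise
print("no noise:   largest gap (DEA - true) =", max(clean_gaps))
print("with noise: largest gap (DEA - true) =", max(noisy_gaps))
```

Without noise, every observation lies on or below the concave true frontier, so the DEA frontier can never exceed it and the gap is never positive. With noise, any observation pushed above the true frontier pulls the DEA frontier above it as well, which is why a bias correction that shifts the DEA frontier further upward would not help in that setting.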