ST 516 Experimental Statistics for Engineers II
Other Topics

Nonnormal Responses

We have usually assumed that experimental data are at least approximately normally distributed, with at least approximately constant variance.

When either assumption is violated, we can try transforming the response to remove the violation, or using another model for the response distribution.

Box-Cox approach

The power transformations y* = y^λ are useful.

Box and Cox developed a systematic approach to finding a good λ, based on

\[
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda \, \dot{y}^{\lambda - 1}}, & \lambda \neq 0, \\[1ex]
\dot{y} \ln y, & \lambda = 0,
\end{cases}
\]

where

\[
\dot{y} = \exp\left( \frac{1}{n} \sum \ln y \right)
\]

is the geometric mean response.

Procedure

Fit the model for various λ, and graph SSE against λ. The lowest SSE gives the best λ.

All λ with SSE(λ) ≤ SS* comprise a 100(1 − α)% confidence interval, where

\[
SS^{*} = SS_E(\lambda_{\mathrm{opt}}) \left( 1 + \frac{t^{2}_{\alpha/2,\, df_E}}{df_E} \right).
\]

Example

Peak discharge data (peak-discharge.txt); R commands: peak-discharge-box-cox.R.

Generalized Linear Model

Sometimes a better approach is to use a different statistical model. E.g., for counted data, assume that Y has the Poisson distribution.

Replace the linear model

\[
E(Y) = \mu = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k = \mathbf{x}'\boldsymbol{\beta}
\]

by

\[
g(\mu) = \mathbf{x}'\boldsymbol{\beta} \iff E(Y) = \mu = g^{-1}(\mathbf{x}'\boldsymbol{\beta})
\]

for some nonlinear link function g(·).

If the distribution is in the exponential family and the link function is chosen to match it, estimation by maximum likelihood is relatively easy.
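As a minimal sketch of the link-function idea for counted data, the following fits a Poisson model with a log link and checks that the fitted means equal g^{-1} applied to the linear predictor. The data vectors `x` and `y` are hypothetical, invented only for illustration; they are not from the course materials.

```r
# Hypothetical count data, invented purely for illustration
x <- 1:9
y <- c(2, 3, 6, 7, 8, 9, 10, 12, 15)

# Poisson response with log link: log(mu) = beta0 + beta1 * x
fit <- glm(y ~ x, family = poisson(link = "log"))

eta <- predict(fit, type = "link")      # linear predictor x'beta
mu  <- predict(fit, type = "response")  # fitted means g^{-1}(x'beta)

# For the log link, g^{-1} is exp(), so these two vectors should agree
print(all.equal(unname(mu), unname(exp(eta))))

print(coef(fit))
```

Replacing `poisson(link = "log")` with another family object changes both the assumed distribution and the link; the rest of the call stays the same.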
In general, the variance of Y also depends on µ; examples from the exponential family:

    Distribution        g(µ)              V(µ)
    Normal, σ² = 1      µ                 1
    Poisson             log µ             µ
    Gamma               1/µ               µ²
    Inverse Gaussian    1/µ²              µ³
    Binomial            log(µ/(1 − µ))    µ(1 − µ)

Other combinations of distribution, g(·), and V(·) may also be used, but software support for them is more limited.

The binomial case is widely used:

\[
P(Y = 1) = \frac{e^{\mathbf{x}'\boldsymbol{\beta}}}{1 + e^{\mathbf{x}'\boldsymbol{\beta}}} = \frac{1}{1 + e^{-\mathbf{x}'\boldsymbol{\beta}}}.
\]

Example

Coupon redemption: Y is the number of customers out of 1000 who redeem the coupon; three factors were used in a 2³ factorial design.

R commands

Generalized linear models are fitted using glm():

    summary(glm(cbind(Redeemed, Customers - Redeemed) ~ A * B + A * C + B * C,
                data = coupon, family = "binomial"))

Output

    Call:
    glm(formula = cbind(Redeemed, Customers - Redeemed) ~ A * B +
        A * C + B * C, family = "binomial", data = coupon)

    Deviance Residuals:
          1        2        3        4        5        6        7        8
     0.4723  -0.4307  -0.4228   0.3949  -0.4572   0.4166   0.4238  -0.3987

    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept) -1.011545   0.025515 -39.645  < 2e-16 ***
    A            0.169208   0.025509   6.633 3.28e-11 ***
    B            0.169622   0.025515   6.648 2.97e-11 ***
    C            0.023317   0.025510   0.914    0.361
    A:B         -0.006285   0.025512  -0.246    0.805
    A:C         -0.002773   0.025432  -0.109    0.913
    B:C         -0.041020   0.025434  -1.613    0.107
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 93.0238  on 7  degrees of freedom
    Residual deviance:  1.4645  on 1  degrees of freedom
    AIC: 72.286

    Number of Fisher Scoring iterations: 3

Reduced model

The analyst decides to fit a reduced model including A, B, and BC (and, to keep it hierarchical, C):

    summary(glm(cbind(Redeemed, Customers - Redeemed) ~ A + B * C,
                data = coupon, family = "binomial"))

Output

    Call:
    glm(formula = cbind(Redeemed, Customers - Redeemed) ~ A + B *
        C, family = "binomial", data = coupon)

    Deviance Residuals:
          1        2        3        4        5        6        7        8
     0.3402  -0.3114  -0.3783   0.3531  -0.5142   0.4692   0.5509  -0.5171

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -1.01142    0.02551 -39.652  < 2e-16 ***
    A            0.16868    0.02542   6.635 3.25e-11 ***
    B            0.16912    0.02543   6.650 2.94e-11 ***
    C            0.02308    0.02543   0.908    0.364
    B:C         -0.04097    0.02543  -1.611    0.107
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 93.0238  on 7  degrees of freedom
    Residual deviance:  1.5360  on 3  degrees of freedom
    AIC: 68.358

    Number of Fisher Scoring iterations: 3
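One way to compare the reduced model with the full one, not carried out in the slides themselves, is a deviance (likelihood-ratio) comparison. The sketch below uses only the residual deviances and degrees of freedom printed in the two summaries, and assumes the usual chi-square approximation for the difference in deviance is adequate here.

```r
# Residual deviances and degrees of freedom copied from the two summaries
dev_full    <- 1.4645; df_full    <- 1   # A*B + A*C + B*C
dev_reduced <- 1.5360; df_reduced <- 3   # A + B*C

# Increase in deviance caused by dropping A:B and A:C
lr_stat <- dev_reduced - dev_full
lr_df   <- df_reduced - df_full

# Upper-tail chi-square probability (assumed large-sample approximation)
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(statistic = lr_stat, df = lr_df, p.value = p_value))
```

A large p-value here would indicate that the two dropped interactions contribute little. If the two fitted objects are kept (say as the hypothetical names `full` and `reduced`), `anova(reduced, full, test = "Chisq")` should give the same comparison directly.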