24. One-way ANOVA: Comparing several means

The Practice of Statistics in the Life Sciences, Third Edition
© 2014 W. H. Freeman and Company

Objectives (PSLS Chapter 24)

One-way ANOVA
• Comparing several means
• The ANOVA F-test
• ANOVA assumptions
• ANOVA calculations
• Using Table F
• Interpreting results from an ANOVA

Comparing several means

When comparing more than 2 populations, the question is not only whether each population mean µi is different from the others, but also whether they are significantly different when taken as a group.

If we split a class of 200 into 20 random samples of 10 students and found the 20 mean heights, the 2 most extreme samples could very well be "significantly different", but that would be a Type I error. We should ask instead whether all 20 samples might have come from a single population.

Handling multiple comparisons statistically

The first step in examining multiple populations statistically is to test for overall statistical significance, as evidence of any difference among the parameters we want to compare.
→ ANOVA F test

If that overall test shows statistical significance, then a detailed follow-up analysis can examine all pair-wise parameter comparisons to determine which parameters differ from which, and by how much.
→ more complex methods (see Chapter 26)

Reminder: variance and standard deviation

Sample variance $s^2$:
$s^2 = \frac{1}{df} \sum_{i=1}^{n} (x_i - \bar{x})^2$, with $df = n - 1$

Sample standard deviation $s$:
$s = \sqrt{s^2} = \sqrt{\frac{1}{df} \sum_{i=1}^{n} (x_i - \bar{x})^2}$

Are the observed differences in sample means likely to have occurred by chance just because of the random sampling?
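One way to explore this question is a small simulation of the class-of-200 scenario described earlier. The sketch below is illustrative only: the heights are hypothetical values drawn from a single Normal population (mean 170 cm, standard deviation 10 cm are assumptions, not data from the text), and `sample_variance` is a helper name introduced here that implements the variance formula from the reminder.

```python
import random

def sample_variance(values):
    """Sample variance s^2: squared deviations from the mean, divided by df = n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

random.seed(24)  # arbitrary seed, used only so the run is reproducible

# Hypothetical class of 200 heights (cm), all drawn from ONE Normal population.
heights = [random.gauss(170, 10) for _ in range(200)]

# Split the class at random into 20 samples of 10 students each.
random.shuffle(heights)
n = 10
samples = [heights[i:i + n] for i in range(0, len(heights), n)]
means = [sum(s) / n for s in samples]

# Variation among the sample means, scaled by the sample size n.
between = n * sample_variance(means)
# Average variation among individuals within the same sample.
within = sum(sample_variance(s) for s in samples) / len(samples)

print(f"most extreme sample means: {min(means):.1f} and {max(means):.1f}")
print(f"n x variance of sample means: {between:.1f}")
print(f"average sample variance:      {within:.1f}")
```

Because every sample comes from the same population, the two printed variance estimates should be of similar size even though the two most extreme sample means can look quite different from each other.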
[Figure: dot plots of several samples on a common "Data" axis (20 to 32), with sample means marked at 23.4, 25.2, and 27.8, illustrating the two scenarios below.]

If we have sampled several times from the same population with mean μ and standard deviation σ, then
the variance of the sample means (weighted for sample size) ≈ the average sample variance.

If we have sampled from distinct populations with the same standard deviation σ but different means μ, then
the variance of the sample means (weighted for sample size) > the average sample variance.

Male lab rats were randomly assigned to 3 groups differing in access to food for 40 days. The rats were weighed (in grams) at the end.
• Group 1: access to chow only.
• Group 2: chow plus access to cafeteria food restricted to 1 hour every day.
• Group 3: chow plus extended access to cafeteria food (all day long every day).

Cafeteria food: bacon, sausage, cheesecake, pound cake, frosting and chocolate.

Does the type of food access influence the body weight of male lab rats significantly?

H0: µchow = µrestricted = µextended
Ha: H0 is not true (at least one mean µ is different from at least one other mean µ)

             N    Mean     StDev
Chow         19   605.63   49.64
Restricted   16   657.31   50.68
Extended     15   691.13   63.41

Reminders

A factor is a variable that can take one of several levels used to differentiate one group from another.

An experiment has a one-way or completely randomized design if several levels of one factor are being studied and the individuals are randomly assigned to its levels. (There is only one way to group the data.)
• One-way: 4 levels of nematode quantity in seedling growth experiment
• Two-way: 2 seed species and 4 levels of nematodes

One-way ANOVA is used for completely randomized, one-way designs.

The ANOVA F test

The analysis of variance F test compares the variation due to specific sources (levels of the factor) with the variation among individuals who should be similar (individuals in the same sample).

H0: All means µi are equal
Ha: Not all means µi are equal (H0 is not true)

The analysis of variance F statistic for comparing several means is:

$F = \frac{\text{variation among the sample means}}{\text{variation among individuals in the same sample}} = \frac{MSG}{MSE}$

[Figure: two panels (A and B) of dot plots for several samples on a common "Data" axis (20 to 32), with sample means 23.4, 25.2, and 27.8.]

(A) Variability of means smaller than variability within samples → F tends to be small.
(B) Variability of means larger than variability within samples → F tends to be large.

ANOVA assumptions

1. The k samples must be independent SRSs. The individuals in each sample are completely unrelated.

2. Each population represented by the k samples must be Normally distributed. However, the test is robust to deviations from Normality (skew, mild outliers) for large-enough samples, thanks to the central limit theorem.

3. The ANOVA F-test requires that all k populations have the same standard deviation σ. There are inference tests for this, but they tend to be sensitive to deviations from the Normality assumption or require equal sample sizes.

A simple and conservative approach: The ANOVA F test is approximately correct when the largest sample standard deviation is no more than about twice as large as the smallest sample standard deviation.

Equal sample sizes make the ANOVA more robust to deviations from the equal σ rule.
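The "no more than about twice" rule of thumb above is easy to check from the sample standard deviations alone. The following is a minimal sketch; the function name `equal_sd_rule` and its `max_ratio` parameter are hypothetical names chosen for illustration, and the standard deviations are the ones reported for the three rat groups.

```python
def equal_sd_rule(sample_sds, max_ratio=2.0):
    """Return (ratio, ok): largest sample SD over smallest sample SD, and
    whether that ratio stays within the rule-of-thumb limit."""
    sds = list(sample_sds)
    ratio = max(sds) / min(sds)
    return ratio, ratio <= max_ratio

# Sample standard deviations (grams) from the rat food-access summary table.
rat_sds = {"Chow": 49.64, "Restricted": 50.68, "Extended": 63.41}

ratio, ok = equal_sd_rule(rat_sds.values())
print(f"largest / smallest SD = {ratio:.2f}; rule satisfied: {ok}")
```

The check is deliberately conservative: it only says when the F test can be treated as approximately correct, and it does not replace looking at the data for outliers or strong skew.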
Male lab rats were randomly assigned to 3 groups differing in access to food for 40 days. The rats were weighed (in grams) at the end.

Conditions required for ANOVA:
• Independent random samples? This is a randomized experiment, so the 3 groups are independent samples. OK
• Normal populations or large enough samples? We see no evidence of non-Normality (and total sample size is large). OK
• Equal population standard deviations? Largest si = 63.41; smallest si = 49.64 (ratio ≈ 1.28, well under 2). OK

             N    Mean     StDev
Chow         19   605.63   49.64
Restricted   16   657.31   50.68
Extended     15   691.13   63.41

Meconium is a newborn's first stool, indicative of the fetus' exposure to outside contaminants during pregnancy.

Do cigarette chemicals reach the fetus in the womb? Cotinine is the metabolized form of nicotine. Its presence in the meconium was assessed for 3 independent random samples of newborns with different mother smoking status.

Level             N    Mean     StDev
Active smokers    10   367.20   143.67
Non-smokers       10   185.00    24.23
Passive smokers   10   263.40    52.54

[Figure: Newborn meconium cotinine level (ng/uL) vs mother smoking status. Vertical axis: Meconium cotinine (ng/uL), 100 to 700. Horizontal axis: Smoking status (Active smokers, Non-smokers, Passive smokers).]

Conditions required:
• Equal population standard deviations? Checking that largest si is no more than twice smallest si:
largest si = 143.67; smallest si = 24.23 (ratio ≈ 5.9, far more than 2).

STOP. ANOVA is clearly not appropriate here.

ANOVA calculations

We have k independent SRSs, from k populations or treatments. The ith population has a Normal distribution with unknown mean µi. All k populations have the same standard deviation σ, unknown.

H0: µ1 = µ2 = … = µk

$F = \frac{MSG}{MSE} = \frac{SSG / (k - 1)}{SSE / (N - k)}$

Under H0, all k samples come from the same population N(μ, σ), and the sample averages should be no more variable than expected from the variability of points within individual samples.

When H0 is true, F has the F distribution with k − 1 (numerator) and N − k (denominator) degrees of freedom.
$F = \frac{MSG}{MSE} = \frac{\text{variation among sample means}}{\text{variation among individuals in the same sample}}$

MSG, the mean square for groups, is a variance of the means weighted for sample size. It measures the variability of the sample averages.

$MSG = \frac{SSG}{k - 1}$

MSE, the mean square for error or pooled sample variance $s_p^2$, is the average sample variance weighted for sample sizes. It measures the variability within each of the groups.

$MSE = \frac{SSE}{N - k}$

Using Table F

The F distribution is asymmetrical and has two distinct degrees of freedom. This was discovered by Fisher, hence the label "F".

Once again, we calculate the value of F for our sample data and then look up the corresponding area under the curve in Table F.

[Figure: excerpt of Table F, example for df 5, 4. Labels: dfnum = k − 1, dfden = N − k, p, F.]

Male lab rats were randomly assigned to 3 groups differing in access to food for 40 days. The rats were weighed (in grams) at the end.

Does the type of access to food influence the body weight of lab rats significantly?

Chow   Restricted   Extended
516    546          564
547    599          611
546    612          625
564    627          644
577    629          660
570    638          679
582    645          687
594    635          688
597    660          706
599    676          714
606    695          738
606    687          733
624    693          744
623    713          780
641    726          794
655    736
667
690
703

             N    Mean     StDev
Chow         19   605.63   49.64
Restricted   16   657.31   50.68
Extended     15   691.13   63.41

One-way ANOVA: Chow, Restricted, Extended

Source   DF   SS       MS      F       P
Factor    2    63400   31700   10.71   0.000
Error    47   139174    2961
Total    49   202573

S = 54.42   R-Sq = 31.30%   R-Sq(adj) = 28.37%

The test statistic is F = 10.71.

[Figure: F density curve with df1 = 2 and df2 = 47; the area to the right of F = 10.71 is 0.0001471.]

The F statistic (10.71) is large and the P-value is very small (<0.001). We reject H0.

The population mean weights are not all the same; food access type influences weight significantly in male lab rats.
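The ANOVA table for the rat data can be reproduced from the raw weights with a few lines of Python. This is a sketch using only the standard library, not the software that produced the output above. The function names `one_way_anova` and `p_value_df_num_2` are hypothetical. The P-value shortcut relies on a closed form that holds only when the numerator degrees of freedom equal 2 (as here, with k = 3 groups); for any other k, Table F or statistical software is needed.

```python
# Rat body weights (grams) after 40 days, from the data table above.
chow = [516, 547, 546, 564, 577, 570, 582, 594, 597, 599,
        606, 606, 624, 623, 641, 655, 667, 690, 703]
restricted = [546, 599, 612, 627, 629, 638, 645, 635, 660, 676,
              695, 687, 693, 713, 726, 736]
extended = [564, 611, 625, 644, 660, 679, 687, 688, 706, 714,
            738, 733, 744, 780, 794]

def one_way_anova(groups):
    """Return (SSG, SSE, df_groups, df_error, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # SSG: squared distance of each group mean from the grand mean,
    # weighted by the group's sample size.
    ssg = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSE: squared distance of each individual from its own group mean.
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)
    df_groups, df_error = k - 1, n_total - k
    f_stat = (ssg / df_groups) / (sse / df_error)
    return ssg, sse, df_groups, df_error, f_stat

def p_value_df_num_2(f_stat, df_den):
    """Upper-tail area of the F distribution, valid ONLY for numerator df = 2."""
    return (1 + 2 * f_stat / df_den) ** (-df_den / 2)

ssg, sse, df_g, df_e, f_stat = one_way_anova([chow, restricted, extended])
p_value = p_value_df_num_2(f_stat, df_e)

print(f"SSG = {ssg:.0f}, SSE = {sse:.0f}, df = ({df_g}, {df_e})")
print(f"MSG = {ssg / df_g:.0f}, MSE = {sse / df_e:.0f}")
print(f"F = {f_stat:.2f}, P = {p_value:.7f}")
```

The printed sums of squares, mean squares, and F should agree with the software output above up to rounding, and the P-value should fall below 0.001.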
Note: two-sample t test and ANOVA

A two-sample t test assuming equal variance with a two-sided Ha and an ANOVA comparing only 2 groups will give the same P-value.

One-way ANOVA        t test assuming equal variance
H0: µ1 = µ2          H0: µ1 = µ2
Ha: µ1 ≠ µ2          Ha: µ1 ≠ µ2
F statistic          t statistic

F = t² and both P-values are the same.

The t test is more flexible: You may choose a one-sided alternative, or you may want to run a t test assuming unequal variance if you are not sure that the 2 populations have the same standard deviation σ.

Interpreting results from an ANOVA

H0 states that all µi are equal, while Ha states that H0 is not true. A significant P-value leads you to reject H0, indicating that not all µi are equal.

What do you do next? Which means are different from which?

You can gain insight by looking at summary graphs (boxplot, mean ± stdev) or finding the confidence interval for each mean µi.

There are also some tests of statistical significance designed specifically for multiple tests (Chapter 26).

Male lab rats were randomly assigned to 3 groups differing in access to food for 40 days. The rats were weighed (in grams) at the end.
• Group 1: access to chow only.
• Group 2: chow plus access to cafeteria food restricted to 1 hour every day.
• Group 3: chow plus extended access to cafeteria food (all day long every day).

The ANOVA found that not all 3 food access types result in the same mean body weight. Access type significantly impacts the weight of male lab rats.

The graph suggests that, on average, rats given extended access to cafeteria food are heavier and those with access only to chow are lighter.

             N    Mean     StDev
Chow         19   605.63   49.64
Restricted   16   657.31   50.68
Extended     15   691.13   63.41

Pooled variance and confidence intervals

MSE, the mean square for error or pooled sample variance $s_p^2$, estimates the common variance $\sigma^2$ of the k populations.
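To make the "pooled" idea concrete, the sketch below recomputes $s_p^2$ for the rat example from the summary table alone, as the average of the sample variances weighted by their degrees of freedom (n_i − 1). The function name `pooled_variance` is a hypothetical helper, and small differences from the software's MSE are expected because the standard deviations in the table are rounded.

```python
from math import sqrt

# (sample size, sample standard deviation) for each rat group, from the summary table.
rat_summary = {
    "Chow": (19, 49.64),
    "Restricted": (16, 50.68),
    "Extended": (15, 63.41),
}

def pooled_variance(summary):
    """Average of the sample variances, weighted by degrees of freedom n_i - 1.
    The denominator sum(n_i - 1) equals N - k."""
    numerator = sum((n - 1) * sd ** 2 for n, sd in summary.values())
    df_error = sum(n - 1 for n, _ in summary.values())
    return numerator / df_error

sp2 = pooled_variance(rat_summary)
print(f"pooled variance s_p^2 = {sp2:.0f}")       # compare with MSE in the ANOVA table
print(f"pooled StDev s_p      = {sqrt(sp2):.2f}")  # compare with S in the ANOVA table
```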
Thus, we can easily calculate separate level C confidence intervals for each population mean µi (often provided by software packages):

$\bar{x}_i \pm t^* \frac{s_p}{\sqrt{n_i}}$ or, equivalently, $\bar{x}_i \pm t^* \sqrt{MSE / n_i}$

using the critical value t* from the t distribution with N − k degrees of freedom.

Male lab rats were randomly assigned to 3 groups differing in access to food for 40 days. The rats were weighed (in grams) at the end.

One-way ANOVA: Chow, Restricted, Extended

Source   DF   SS       MS      F       P
Factor    2    63400   31700   10.71   0.000
Error    47   139174    2961
Total    49   202573

S = 54.42   R-Sq = 31.30%   R-Sq(adj) = 28.37%

                                   Individual 95% CIs For Mean Based on
                                   Pooled StDev
Level        N    Mean     StDev   ----+---------+---------+---------+----
Chow         19   605.63   49.64   (------*------)
Restricted   16   657.31   50.68                 (-------*-------)
Extended     15   691.13   63.41                          (-------*--------)
                                   ----+---------+---------+---------+----
                                      595       630       665       700

Pooled StDev = 54.42 = √MSE = √2961

With 95% confidence, for df = N − k = 47, t* = 2.012 (or 2.021 for df ≈ 40 in Table C). Each CI is:

$\bar{x}_i \pm t^* \frac{s_p}{\sqrt{n_i}}$

For "Chow": $605.63 \pm (2.012)\frac{54.42}{\sqrt{19}} = 605.63 \pm 25.12$

The purpose of the one-way analysis of variance F test (ANOVA) is to compare
A) the variances of several populations.
B) the proportions of successes in several populations.
C) the means of several populations.

Three varieties of tropical plant Heliconia are studied. Here are the flower lengths in mm for 3 independent random samples.

[Figure: flower length (mm), axis 0 to 50, by variety: bihai, red, yellow.]

• H0 versus Ha?
• ANOVA assumptions satisfied?
• Interpretation?

One-way ANOVA: length versus variety

Source    DF   SS        MS       F        P
variety    2   1082.87   541.44   259.12   0.000
Error     51    106.57     2.09
Total     53   1189.44

S = 1.446   R-Sq = 91.04%   R-Sq(adj) = 90.69%

                               Individual 95% CIs For Mean Based on
                               Pooled StDev
Level    N    Mean     StDev   ---------+---------+---------+---------+
bihai    16   47.597   1.213                                    (-*-)
red      23   39.711   1.799              (*-)
yellow   15   36.180   0.975   (-*--)
                               ---------+---------+---------+---------+
                                     38.5      42.0      45.5      49.0

Pooled StDev = 1.446
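The pooled-standard-deviation confidence intervals worked out for the rat example above can be sketched in Python as follows. The helper name `pooled_ci` is hypothetical. The critical value t* = 2.012 (df = 47) and MSE = 2961 are taken directly from the text and output above rather than computed, so applying the same function to another data set, such as the Heliconia samples, would require looking up t* for that data set's N − k degrees of freedom.

```python
from math import sqrt

def pooled_ci(mean, n, mse, t_star):
    """Confidence interval for one group mean using the pooled variance (MSE):
    mean +/- t* x sqrt(MSE / n). Returns (lower, upper, margin)."""
    margin = t_star * sqrt(mse / n)
    return mean - margin, mean + margin, margin

MSE = 2961       # mean square for error from the rat ANOVA table
T_STAR = 2.012   # 95% critical value for df = N - k = 47, as given in the text

# (sample size, sample mean in grams) for each rat group.
rat_groups = {
    "Chow": (19, 605.63),
    "Restricted": (16, 657.31),
    "Extended": (15, 691.13),
}

intervals = {}
for level, (n, mean) in rat_groups.items():
    lower, upper, margin = pooled_ci(mean, n, MSE, T_STAR)
    intervals[level] = (lower, upper, margin)
    print(f"{level:<11} {mean:.2f} +/- {margin:.2f}  ->  ({lower:.1f}, {upper:.1f})")
```

All three intervals share the same pooled standard deviation, so their widths differ only through the sample sizes: the smallest group (Extended, n = 15) gets the widest interval.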