One Factor ANOVA

March 2, 2017

Contents

• One-factor ANOVA
• Example 1: Exam scores
• The Numerator of the ANOVA - variability between the means
• The Denominator of the ANOVA - variability within each group
• The F-statistic - the ratio of variability between and within
• Finding the critical value of F with table E
• The Summary Table
• Example 2: Preferred temperature for weather sensitivity
• Effect Size
• Questions
• Answers

One-factor ANOVA

The ANOVA ("ANalysis Of VAriance") is a hypothesis test for the difference between two or more means. This tutorial describes how to compute the simplest ANOVA: the '1-way' or '1-factor' ANOVA for independent measures. Here's how to get to the 1-factor ANOVA with the flow chart:

[Flow chart for choosing a hypothesis test. The 1-factor ANOVA (Ch 20) is reached when the measurement scale is means, the number of means is more than 2, and the number of factors is 1.]

The null hypothesis for the ANOVA is that all samples are drawn from populations with equal means and equal variances. The alternative hypothesis is that the population means are not all the same, but the variances are still the same.

The logic behind the ANOVA is to compare two estimates of the population variance. One is related to the variability between the means and the other is related to the variability of scores within each group. If the null hypothesis is true, these two estimates should be on average the same, so their ratio, called the F-ratio, should be around one. If the population means differ, then the F-statistic will become larger than one on average.
If the measured value of F exceeds a critical value determined by α, then we reject the null hypothesis that the population means are the same. We'll show how it works with an example.

Example 1: Exam scores

Suppose a professor teaches the same small graduate course over three years (2014, 2015 and 2016). You want to know if the course grades have varied significantly over these 3 years. We'll run an ANOVA using an alpha value of α = 0.05. Here are the scores:

2014    2015    2016
88.4    84      76.6
77.5    81.3    69.6
83.5    80.2    85.4
88.2    72.3    84.4
84.7    87.3    74.3
86.2    67.5    78
82.1    66.8    80.7
89.3    78.2    83.5
85.3    73      73.6

Here is a table of useful statistics from this list:

        n    mean     SS          s         sem
2014    9    85.02    108.8156    3.6881    1.2294
2015    9    76.73    417.0001    7.2198    2.4066
2016    9    78.46    236.9624    5.4425    1.8142

Also, the mean of all scores, called the 'grand mean', is 80.0704.

We'll use the numbers from this table to visualize the data using a bar graph with error bars representing standard errors of the mean, just like we did with the independent-measures t-test:

[Figure: bar graph of mean score (y-axis) for each exam year (x-axis: 2014, 2015, 2016), with error bars showing the standard error of the mean.]

We can use the same rule of thumb as we did for the independent measures t-test to estimate the significance of the differences between pairs of means. Remember, if the error bars overlap, then we will almost always fail to reject the null hypothesis for a one-tailed test with α = .05. Since this is the most liberal of all tests, we will also fail to reject for any other choice of tails or alphas. But remember, we can't make separate t-tests on each pair of means - the probability of at least one type I error will be greater than α.

Looking at the graph, the error bars for 2014 do not overlap with those for 2015 or 2016, so you might guess that there is a statistically significant difference between the means.

The Numerator of the ANOVA - variability between the means

The numerator of the ANOVA's F-statistic reflects how different the means are from each other.
Under the null hypothesis, this number, called the 'variance between', or $s^2_{bet}$, is an estimate of the population variance. We first calculate the sum of squared deviations of each group mean from the grand mean, scaled by the sample size. This is called $SS_{bet}$:

$$SS_{bet} = \sum n(\bar{X} - \bar{\bar{X}})^2 = (9)(85.02 - 80.0704)^2 + (9)(76.73 - 80.0704)^2 + (9)(78.46 - 80.0704)^2$$

$$= 220.4869 + 100.4244 + 23.3405 = 344.2518$$

The degrees of freedom for $SS_{bet}$ is the number of groups minus one: 3 - 1 = 2. We can now calculate $s^2_{bet}$, which is $SS_{bet}$ divided by its degrees of freedom:

$$s^2_{bet} = \frac{SS_{bet}}{df_{bet}} = \frac{344.2518}{2} = 172.1259$$

The Denominator of the ANOVA - variability within each group

Another estimate of the population variance is the variance of the scores within each group. This number, called the 'variance within' or $s^2_w$, is calculated by first adding up the squared deviation of each score from its own group mean:

$$SS_w = \sum_j \sum_i (X_i - \bar{X}_j)^2$$

The table above already gives us the sum of squared deviations for each group, so the total sum of squares within is:

$$SS_w = 108.8156 + 417.0001 + 236.9624 = 762.7781$$

The degrees of freedom for $SS_w$ is the number of scores minus the number of groups: 9 + 9 + 9 - 3 = 24. The variance within is $SS_w$ divided by its degrees of freedom:

$$s^2_w = \frac{SS_w}{df_w} = \frac{762.7781}{24} = 31.7824$$

The F-statistic - the ratio of variability between and within

Finally, we'll calculate the observed value of F, which is the ratio of $s^2_{bet}$ and $s^2_w$:

$$F = \frac{s^2_{bet}}{s^2_w} = \frac{172.1259}{31.7824} = 5.42$$

Finding the critical value of F with table E

The distribution of the F-statistic is a family of curves that varies both with $df_{bet}$ and $df_w$. Table E provides critical values of F for α = .05 and α = .01 (bold). The rows of the table correspond to $df_w$ and the columns are for $df_{bet}$. For our example, $df_{bet}$ is 2 and $df_w$ is 24.
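The between, within and F calculations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are made up for this sketch, and it works from the raw scores rather than the rounded means in the table, so its $SS_{bet}$ differs slightly from 344.2518 while F still rounds to 5.42.

```python
# Sketch: one-factor ANOVA F statistic from the Example 1 raw scores.
# Function and variable names are illustrative, not part of the tutorial.

scores = {
    "2014": [88.4, 77.5, 83.5, 88.2, 84.7, 86.2, 82.1, 89.3, 85.3],
    "2015": [84, 81.3, 80.2, 72.3, 87.3, 67.5, 66.8, 78.2, 73],
    "2016": [76.6, 69.6, 85.4, 84.4, 74.3, 78, 80.7, 83.5, 73.6],
}


def mean(values):
    return sum(values) / len(values)


def one_factor_anova(groups):
    """Return (SS_bet, df_bet, SS_w, df_w, F) for a dict of score lists."""
    all_scores = [x for g in groups.values() for x in g]
    grand_mean = mean(all_scores)

    # Between: squared deviation of each group mean from the grand mean,
    # weighted by that group's sample size.
    ss_bet = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    df_bet = len(groups) - 1

    # Within: squared deviation of each score from its own group mean.
    ss_w = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_w = len(all_scores) - len(groups)

    # F is the ratio of the two variance estimates.
    f_stat = (ss_bet / df_bet) / (ss_w / df_w)
    return ss_bet, df_bet, ss_w, df_w, f_stat


ss_bet, df_bet, ss_w, df_w, f_stat = one_factor_anova(scores)
print(f"SS_bet = {ss_bet:.4f} (df = {df_bet})")
print(f"SS_w   = {ss_w:.4f} (df = {df_w})")
print(f"F({df_bet},{df_w}) = {f_stat:.2f}")
```

Working from raw scores avoids the small rounding error that comes from plugging two-decimal means into the formula by hand.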
Here's the relevant part of the table. In each cell, the first value is the critical value for α = .05 and the second (bold in the original table) is for α = .01:

dfw \ dfbet    1              2              3
23             4.28  7.88     3.42  5.66     3.03  4.76
24             4.26  7.82     3.4   5.61     3.01  4.72
25             4.24  7.77     3.39  5.57     2.99  4.68

Looking at the column for $df_{bet}$ = 2 and the row for $df_w$ = 24, we see that the critical value for F with α = 0.05 (the non-bold entry) is 3.4.

Remember, both the numerator and the denominator for the ANOVA are estimates of the population variance under the null hypothesis. If the null hypothesis is false, then the means will have greater variability, so the numerator will grow large compared to the denominator. Therefore, a large value of F is evidence against the null hypothesis. Specifically, if our observed value of F is greater than the critical value, then we reject the null hypothesis and conclude that there is a significant difference between the means.

For our example, since our observed value of F (5.42) is greater than the critical value of F (3.4), we reject the null hypothesis. Using APA format, we state:

"There is a significant difference between the means of the 3 exams, F(2,24) = 5.42, p < 0.05"

We can also use the F-calculator in the Excel spreadsheet to find the actual p-value:

Convert α to F          Convert F to α
dfb    2                dfb    2
dfw    24               dfw    24
α      0.05             F      5.42
F      3.4              α      0.0114

Using this p-value we can conclude using APA format:

"There is a significant difference between the means of the 3 exams, F(2,24) = 5.42, p = 0.0114"

The Summary Table

The statistics used to generate the F statistic for an ANOVA are traditionally reported in a summary table like this:

           SS           df    MS          F       Fcrit    p-value
Between    344.2518     2     172.1259    5.42    3.4      0.0114
Within     762.7781     24    31.7824
Total      1107.1563    26

In this next example we'll use this table to keep track of our values.

Example 2: Preferred temperature for weather sensitivity

At the beginning of the quarter I surveyed you for your preferred outdoor temperature. I also asked you how much weather affected your mood with the options of Not at all, Just a little, A fair amount and Very much.
Let's see if there is a significant difference between the preferred temperatures across these 4 options. We'll use α = 0.05 again. Here's a table of statistics:

                 n     mean     SS          s         sem
Not at all       3     71.67    16.6667     2.8868    1.6667
Just a little    30    73.33    1020.667    5.9326    1.0831
A fair amount    42    74.05    3113.905    8.7149    1.3447
Very much        23    75       1486        8.2186    1.7137

Totals: n = 98, grand mean = 73.9796, SStotal = 5689.9592

Here's a graph of the means with error bars representing the standard error of the means:

[Figure: bar graph of mean Preferred Temperature (F) (y-axis) for each answer to "How much does weather affect your mood?" (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

We can easily fill in some of the entries in the summary table. I've told you that $SS_{total}$ = 5689.9592. The df for $SS_{total}$ is the total sample size minus 1: 98 - 1 = 97.

$SS_w$ is just the sum of the SS for each group: 16.6667 + 1020.667 + 3113.905 + 1486 = 5637.2387. The df for $SS_w$ is the total sample size minus the number of groups: 98 - 4 = 94.

           SS           df    MS    F    Fcrit    p-value
Between
Within     5637.2387    94
Total      5689.9592    97

$SS_{bet}$ can be hard to calculate, but since we know that $SS_{total} = SS_w + SS_{bet}$, we know that $SS_{bet}$ = 5689.9592 - 5637.2387 = 52.7205. The degrees of freedom is 4 - 1 = 3.
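As a rough check on the steps above, here is a small Python sketch that fills in the same entries from the summary statistics. The names are illustrative only. It also computes $SS_{bet}$ directly from the group means, which gives a slightly different number than the subtraction because the means in the table are rounded to two decimals.

```python
# Sketch: SS_w, SS_bet and degrees of freedom for Example 2,
# using only the summary statistics given in the table.
# Names are illustrative, not part of the tutorial.

groups = {
    # label: (n, mean, SS within the group)
    "Not at all":    (3,  71.67, 16.6667),
    "Just a little": (30, 73.33, 1020.667),
    "A fair amount": (42, 74.05, 3113.905),
    "Very much":     (23, 75.00, 1486.0),
}
grand_mean = 73.9796
ss_total = 5689.9592

n_total = sum(n for n, _, _ in groups.values())
ss_w = sum(ss for _, _, ss in groups.values())
df_total = n_total - 1
df_w = n_total - len(groups)
df_bet = len(groups) - 1

# Way 1: subtraction, using SS_total = SS_w + SS_bet.
ss_bet_by_subtraction = ss_total - ss_w

# Way 2: directly from the (rounded) group means and the grand mean.
ss_bet_direct = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())

print(f"SS_w = {ss_w:.4f}, df_w = {df_w}, df_total = {df_total}")
print(f"SS_bet by subtraction = {ss_bet_by_subtraction:.4f}")
print(f"SS_bet from the means = {ss_bet_direct:.4f}")
```

Either value of $SS_{bet}$ leads to the same F to two decimal places in this example.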
Just like for the last example, the variances between and within are just the SS divided by their df's. The calculations below use $SS_{bet}$ = 52.8183, which is the value obtained directly from the group means in the table; it differs slightly from the 52.7205 found by subtraction because those means are rounded, and F is the same to two decimal places either way.

$$s^2_{bet} = \frac{SS_{bet}}{df_{bet}} = \frac{52.8183}{3} = 17.6061$$

$$s^2_w = \frac{SS_w}{df_w} = \frac{5637.2387}{94} = 59.9706$$

Finally, the F statistic is their ratio:

$$F = \frac{s^2_{bet}}{s^2_w} = \frac{17.6061}{59.9706} = 0.29$$

The table now looks like this:

           SS           df    MS         F       Fcrit    p-value
Between    52.8183      3     17.6061    0.29
Within     5637.2387    94    59.9706
Total      5689.9592    97

The critical value of F for α = 0.05 and $df_{bet}$ = 3 and $df_w$ = 94 can be found in table E (first value for α = .05, second for α = .01):

dfw \ dfbet    3
94             2.7  4

If you use the F-calculator in the Excel spreadsheet you'll see that the p-value is 0.8325:

Convert α to F          Convert F to α
dfb    3                dfb    3
dfw    94               dfw    94
α      0.05             F      0.29
F      2.7              α      0.8325

The final summary table is:

           SS           df    MS         F       Fcrit    p-value
Between    52.8183      3     17.6061    0.29    2.7      0.8325
Within     5637.2387    94    59.9706
Total      5689.9592    97

Since the observed value of F (0.29) is not greater than the critical value of F (2.7), we fail to reject the null hypothesis. Using APA format, we state:

"There is not a significant difference in preferred outdoor temperature across the 4 levels of the survey about how weather affects mood, F(3,94) = 0.29, p = 0.8325."

Note that the ANOVA doesn't tell us anything about exactly how preferred temperature varies with how weather affects mood. Looking at the bar graph, it appears that there is an upward trend so that the more weather affects mood, the higher the preferred temperature. But making statistical conclusions like this requires a post-hoc analysis that allows you to compare specific means to other means.

Effect Size

There are several measures of effect size for an ANOVA, and most software packages will spit out more than one. Probably the most common is $\eta^2$, or 'eta squared', which is the proportion of total variance in your data that is attributed to the effect driving the difference between the means. It's simply the ratio of $SS_{bet}$ to $SS_{total}$.
From our last example:

$$\eta^2 = \frac{SS_{bet}}{SS_{total}} = \frac{52.8183}{5689.96} = 0.0093$$

If $SS_w$ = 0, then $SS_{bet} = SS_{total}$, which then makes $\eta^2$ = 1. This is as big as $\eta^2$ can get. It happens when all deviations from the grand mean are caused by the differences between the means.

If $SS_{bet}$ = 0, $\eta^2$ = 0. This is as small as $\eta^2$ can get. It happens when there are no differences between the means, so all of the variability in your data is attributed to variability within each group.

$\eta^2$ is simple and commonly used, but tends to overestimate effect size for larger numbers of treatments. Just so you know, some software reports $\eta^2$ as 'partial eta-squared'; for a 1-factor ANOVA the two are the same number.

Questions

Your turn again. Here are 7 practice questions based on the class survey, followed by their answers.

1) From our survey we can calculate Caffeine consumption as a function of how much you said you liked math. Here is a table of statistics based on our survey:

                 n     means    SSw
Not at all       11    1.18     11.1364
Just a little    43    1.38     160.9192
A fair amount    40    2.33     712.776
Very much        8     1.13     10.8752

ntotal = 102, grand mean = 1.7108, SStotal = 921.2181

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means. Using an alpha value of α = 0.05, is there a difference in Caffeine consumption across the 4 groups of students who vary in their preference for math?

2) From our survey we can calculate the preferred outdoor temperature as a function of how much weather affects your mood. Here is a table of statistics based on our survey:

                 n     means    SSw
Not at all       5     56.6     1727.2
Just a little    31    72.26    2095.9356
A fair amount    42    74.05    3113.905
Very much        23    75       1486

ntotal = 101, grand mean = 72.8515, SStotal = 9920.7723

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means.
Using an alpha value of α = 0.05, is there a difference in preferred outdoor temperature across the 4 groups of students who vary in how much weather affects their mood?

3) From our survey we can calculate how much you drink across voting preferences. Here is a table of statistics based on our survey:

                            n     means    SSw
Democrat                    62    2.89     661.7102
I never (or can't) vote     23    0.52     9.7392
Independent                 6     0.83     6.8334
Republican                  7     1.43     13.7143

ntotal = 98, grand mean = 2.102, SStotal = 800.4796

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means. Using an alpha value of α = 0.01, is there a difference in drinks per week across the 4 groups of voting preference for students?

4) From our survey we can calculate how well you think you'll do on Exam 1 as a function of how much you think you'll like this class. Here is a table of statistics based on our survey:

                 n     means    SSw
Not at all       11    76.82    1463.6364
Just a little    43    85.37    3148.0467
A fair amount    40    86.3     3222.4
Very much        8     89.13    332.8752

ntotal = 102, grand mean = 85.1078, SStotal = 9111.8137

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means. Using an alpha value of α = 0.01, is there a difference in predicted Exam 1 score across the 4 groups of how much you think you'll like Psych 315?

5) Suppose you want to know if how much students drink varies with how much they exercise. Here is a table of statistics based on our survey:

                 n     means    SSw
Not at all       8     2.13     116.8752
Just a little    37    2.28     303.2708
A fair amount    33    2.11     284.8793
Very much        24    1.79     105.9584

ntotal = 102, grand mean = 2.098, SStotal = 814.5196

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means.
Using an alpha value of α = 0.01, is there a difference in alcoholic drinks across the 4 groups of students who exercise?

6) From our survey we can compute how many hours per week students play video games as a function of how much they exercise. Here is a table of statistics based on our survey:

                 n     means    SSw
Not at all       8     0.5      4
Just a little    37    0.55     44.8925
A fair amount    33    0.36     17.1293
Very much        24    0.15     4.74

ntotal = 102, grand mean = 0.3897, SStotal = 73.3217

Calculate the standard errors of the mean for each of the 4 groups. Make a bar graph of the means for each of the 4 groups with error bars as the standard error of the means. Using an alpha value of α = 0.05, is there a difference in hours of video game playing per week across the 4 groups of students who exercise?

7) From our survey we can see if there is a difference in the heights of students across different levels of exercise. Here is a table of statistics based on our survey:

                 n     means    SSw
Just a little    37    65.19    533.6757
A fair amount    33    64.76    336.0608
Very much        24    65.04    336.9584

ntotal = 94, grand mean = 65, SStotal = 1210

Calculate the standard errors of the mean for each of the 3 groups. Make a bar graph of the means for each of the 3 groups with error bars as the standard error of the means. Using an alpha value of α = 0.05, is there a difference in height across the 3 groups of students who exercise?

Answers

1)

$$SS_{bet} = (11)(1.18 - 1.7108)^2 + (43)(1.38 - 1.7108)^2 + (40)(2.33 - 1.7108)^2 + (8)(1.13 - 1.7108)^2 = 25.8396$$

$$s^2_{bet} = \frac{25.8396}{3} = 8.6132$$

$SS_w$ is the sum of the SSw values for the 4 groups:

$$SS_w = 11.1364 + 160.9192 + 712.776 + 10.8752 = 895.7068$$

$$s^2_w = \frac{895.7068}{98} = 9.1399$$

$$F = \frac{8.6132}{9.1399} = 0.94$$

We fail to reject $H_0$. There is not a significant difference in mean Caffeine consumption across the 4 groups of students who vary in their preference for math, F(3,98) = 0.94, p = 0.4244.
[Figure: bar graph of mean Caffeine consumption (y-axis) for the 4 groups of students who vary in their preference for math (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS          df     MS        F       Fcrit    p-value
Between    25.8396     3      8.6132    0.94    2.7      0.4244
Within     895.7068    98     9.1399
Total      921.2181    101

2)

$$SS_{bet} = (5)(56.6 - 72.8515)^2 + (31)(72.26 - 72.8515)^2 + (42)(74.05 - 72.8515)^2 + (23)(75 - 72.8515)^2 = 1497.9004$$

$$s^2_{bet} = \frac{1497.9004}{3} = 499.3001$$

$$SS_w = 1727.2 + 2095.9356 + 3113.905 + 1486 = 8423.0406$$

$$s^2_w = \frac{8423.0406}{97} = 86.8355$$

$$F = \frac{499.3001}{86.8355} = 5.75$$

We reject $H_0$. There is a significant difference in mean preferred outdoor temperature across the 4 groups of students who vary in how much weather affects their mood, F(3,97) = 5.75, p = 0.0012.

[Figure: bar graph of mean preferred outdoor temperature (y-axis) for the 4 groups of students who vary in how much weather affects their mood (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS           df     MS          F       Fcrit    p-value
Between    1497.9004    3      499.3001    5.75    2.7      0.0012
Within     8423.0406    97     86.8355
Total      9920.7723    100

3)

$$SS_{bet} = (62)(2.89 - 2.102)^2 + (23)(0.52 - 2.102)^2 + (6)(0.83 - 2.102)^2 + (7)(1.43 - 2.102)^2 = 108.9302$$

$$s^2_{bet} = \frac{108.9302}{3} = 36.3101$$

$$SS_w = 661.7102 + 9.7392 + 6.8334 + 13.7143 = 691.9971$$

$$s^2_w = \frac{691.9971}{94} = 7.3617$$

$$F = \frac{36.3101}{7.3617} = 4.93$$

We reject $H_0$. There is a significant difference in mean drinks per week across the 4 groups of voting preference for students, F(3,94) = 4.93, p = 0.0032.

[Figure: bar graph of mean drinks per week (y-axis) for the 4 groups of voting preference for students (x-axis: Democrat, I never (or can't) vote, Independent, Republican), with error bars showing the standard error of the mean.]

           SS          df    MS         F       Fcrit    p-value
Between    108.9302    3     36.3101    4.93    4        0.0032
Within     691.9971    94    7.3617
Total      800.4796    97

4)

$$SS_{bet} = (11)(76.82 - 85.1078)^2 + (43)(85.37 - 85.1078)^2 + (40)(86.3 - 85.1078)^2 + (8)(89.13 - 85.1078)^2 = 944.7985$$

$$s^2_{bet} = \frac{944.7985}{3} = 314.9328$$

$$SS_w = 1463.6364 + 3148.0467 + 3222.4 + 332.8752 = 8166.9583$$

$$s^2_w = \frac{8166.9583}{98} = 83.3363$$

$$F = \frac{314.9328}{83.3363} = 3.78$$

We fail to reject $H_0$.
There is not a significant difference in mean predicted Exam 1 score across the 4 groups of how much you think you'll like Psych 315, F(3,98) = 3.78, p = 0.013.

[Figure: bar graph of mean predicted Exam 1 score (y-axis) for the 4 groups of how much you think you'll like Psych 315 (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS           df     MS          F       Fcrit    p-value
Between    944.7985     3      314.9328    3.78    3.99     0.013
Within     8166.9583    98     83.3363
Total      9111.8137    101

5)

$$SS_{bet} = (8)(2.13 - 2.098)^2 + (37)(2.28 - 2.098)^2 + (33)(2.11 - 2.098)^2 + (24)(1.79 - 2.098)^2 = 3.5153$$

$$s^2_{bet} = \frac{3.5153}{3} = 1.1718$$

$$SS_w = 116.8752 + 303.2708 + 284.8793 + 105.9584 = 810.9837$$

$$s^2_w = \frac{810.9837}{98} = 8.2753$$

$$F = \frac{1.1718}{8.2753} = 0.14$$

We fail to reject $H_0$. There is not a significant difference in mean alcoholic drinks across the 4 groups of students who exercise, F(3,98) = 0.14, p = 0.9358.

[Figure: bar graph of mean alcoholic drinks (y-axis) for the 4 groups of students who exercise (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS          df     MS        F       Fcrit    p-value
Between    3.5153      3      1.1718    0.14    3.99     0.9358
Within     810.9837    98     8.2753
Total      814.5196    101

6)

$$SS_{bet} = (8)(0.5 - 0.3897)^2 + (37)(0.55 - 0.3897)^2 + (33)(0.36 - 0.3897)^2 + (24)(0.15 - 0.3897)^2 = 2.4561$$

$$s^2_{bet} = \frac{2.4561}{3} = 0.8187$$

$$SS_w = 4 + 44.8925 + 17.1293 + 4.74 = 70.7618$$

$$s^2_w = \frac{70.7618}{98} = 0.7221$$

$$F = \frac{0.8187}{0.7221} = 1.13$$

We fail to reject $H_0$. There is not a significant difference in mean hours of video game playing per week across the 4 groups of students who exercise, F(3,98) = 1.13, p = 0.3408.

[Figure: bar graph of mean hours of video game playing per week (y-axis) for the 4 groups of students who exercise (x-axis: Not at all, Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS         df     MS        F       Fcrit    p-value
Between    2.4561     3      0.8187    1.13    2.7      0.3408
Within     70.7618    98     0.7221
Total      73.3217    101

7)

$$SS_{bet} = (37)(65.19 - 65)^2 + (33)(64.76 - 65)^2 + (24)(65.04 - 65)^2 = 3.2749$$

$$s^2_{bet} = \frac{3.2749}{2} = 1.6375$$

$$SS_w = 533.6757 + 336.0608 + 336.9584 = 1206.6949$$

$$s^2_w = \frac{1206.6949}{91} = 13.2604$$

$$F = \frac{1.6375}{13.2604} = 0.12$$

We fail to reject $H_0$.
There is not a significant difference in mean height across the 3 groups of students who exercise, F(2,91) = 0.12, p = 0.8871.

[Figure: bar graph of mean height (y-axis) for the 3 groups of students who exercise (x-axis: Just a little, A fair amount, Very much), with error bars showing the standard error of the mean.]

           SS           df    MS         F       Fcrit    p-value
Between    3.2749       2     1.6375     0.12    3.1      0.8871
Within     1206.6949    91    13.2604
Total      1210         93
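The practice answers all follow the same recipe, so a short Python helper can be used to check your own work. This is a minimal sketch, not part of the course materials: the function names `sem` and `summary_table` and the dictionary keys are made up for illustration. It takes the n, means and SSw values from a question's table and returns the entries of the summary table, taking $SS_w$ as the sum of the group SS values.

```python
# Sketch: check a practice answer from the summary statistics in its table.
# Function names and dictionary keys are illustrative assumptions.
import math


def sem(ss, n):
    """Standard error of the mean from one group's SS and sample size."""
    s = math.sqrt(ss / (n - 1))   # sample standard deviation
    return s / math.sqrt(n)


def summary_table(ns, means, ss_within, grand_mean):
    """Return SS, df, MS and F for a one-factor ANOVA."""
    k = len(ns)
    n_total = sum(ns)

    # SS between: n-weighted squared deviations of group means from grand mean.
    ss_bet = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # SS within: sum of the per-group sums of squares.
    ss_w = sum(ss_within)

    df_bet = k - 1
    df_w = n_total - k
    ms_bet = ss_bet / df_bet
    ms_w = ss_w / df_w
    return {
        "SS_bet": ss_bet, "df_bet": df_bet, "MS_bet": ms_bet,
        "SS_w": ss_w, "df_w": df_w, "MS_w": ms_w,
        "F": ms_bet / ms_w,
    }


# Question 1: Caffeine consumption by preference for math.
q1 = summary_table(
    ns=[11, 43, 40, 8],
    means=[1.18, 1.38, 2.33, 1.13],
    ss_within=[11.1364, 160.9192, 712.776, 10.8752],
    grand_mean=1.7108,
)
print(f"F({q1['df_bet']},{q1['df_w']}) = {q1['F']:.2f}")

# Standard error for the 2014 group in Example 1 (SS = 108.8156, n = 9).
print(f"sem = {sem(108.8156, 9):.4f}")
```

The helper stops at F. To get a p-value instead of looking up table E, one option (assuming SciPy is installed) is `scipy.stats.f.sf(F, df_bet, df_w)`, which returns the upper-tail probability of the F distribution.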