Chi-Square Test for Qualitative Data

χ²
- For qualitative data (measured on a nominal scale)
- Observations MUST be independent
  - No more than one measurement per subject
- Sample size must be large enough
  - Expected frequencies must be ≥ 5

Chi-square distribution
Critical Values Table on page 537 in your book!
X2 rollercoaster right here in California

Goodness of Fit χ²
- 1 variable
- H0: observed & expected frequencies do not differ
- Steps:
  - Calculate expected frequencies
  - Compute χ²
  - Compare to critical value
    - df = # categories - 1

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where f_o = observed frequency and f_e = expected frequency

Example: Goodness of Fit χ²

                             Married  Single  Separated  Divorced  Widowed  Total
Sample (N = 100) fo             50      22        8         18        2      100
Expected proportion           0.55    0.21     0.09       0.10     0.05     100%
Expected freq. fe (N = 100)     55      21        9         10        5      100

Is the marital status of our sample representative of the population?

Statistical Hypotheses:
- H0: the fo's (observed frequencies) conform to the fe's (expected frequencies)
- H1: the sample differs from the expected frequencies

Decision rule: α = .05; df = 5 - 1 = 4; critical χ² = 9.49

Calculate test statistic (*expected frequencies should not be below 5 in any cell!):

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(50 - 55)^2}{55} + \frac{(22 - 21)^2}{21} + \frac{(8 - 9)^2}{9} + \frac{(18 - 10)^2}{10} + \frac{(2 - 5)^2}{5}$$

$$\chi^2 = .45 + .05 + .11 + 6.4 + 1.8 = 8.81$$

Getting the Critical Value

Example: Goodness of Fit χ²
- Observed statistical test value: χ²(4) = 8.81, p > .05
- Make a decision & interpret
  - Retain H0 because 8.81 < 9.49
  - The sample does not significantly differ from the population with regard to marital status

Another Example

                     Rated G  Rated PG-13  Rated NC-17
Sample (N = 24) fo      5          5           14
Expected freq. fe       8          8            8

Is there an association between sexy advertising and buying more products?

Statistical Hypotheses:
- H0: there is no association between sexy advertising and purchases
- H1: there is an association between advertising and purchases

Decision rule: α = .05; df = 3 - 1 = 2; critical χ² = 5.99

Calculate statistic (remember: expected frequencies should not be below 5 in any cell!):
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(5 - 8)^2}{8} + \frac{(5 - 8)^2}{8} + \frac{(14 - 8)^2}{8}$$

$$\chi^2 = 1.125 + 1.125 + 4.5 = 6.75$$

Another Example
- Observed statistical test value: χ²(2) = 6.75, p < .05
- Make a decision & interpret
  - Reject H0 because 6.75 > 5.99
  - Sex sells!

Practice! Goodness of Fit χ²
- Let's say you roll a 6-sided die 120 times. You would EXPECT that each side would come up 1/6 of the time (i.e., 20 times).

  Side    1    2    3    4    5    6
  fo     18   19   21   23   22   17

- Now your friend gets his own 6-sided die and rolls it 120 times. You would have the same EXPECTED frequency here, right?

  Side    1    2    3    4    5    6
  fo      8    9   15   15   16   57

- Calculate a goodness of fit χ² for both you and your friend, and determine whether one of you has a weighted die, at α = .05. Don't forget to calculate df to get the critical χ² value! Is one of the dice suspect?

Your 120 Rolls

Side    Obs.   Exp.   O-E   (O-E)²   (O-E)²/E
1        18     20     -2      4       .20
2        19     20     -1      1       .05
3        21     20      1      1       .05
4        23     20      3      9       .45
5        22     20      2      4       .20
6        17     20     -3      9       .45
Total   120    120      0             1.40

Friend's 120 Rolls

Side    Obs.   Exp.   O-E   (O-E)²   (O-E)²/E
1         8     20    -12    144      7.20
2         9     20    -11    121      6.05
3        15     20     -5     25      1.25
4        15     20     -5     25      1.25
5        16     20     -4     16      0.80
6        57     20     37   1369     68.45
Total   120    120      0            85.00

df & critical value…
- df = # categories - 1 = 6 - 1 = 5
- Critical χ² = 11.07

Practice: Goodness of Fit χ²
- You: $\chi^2 = \sum \frac{(O - E)^2}{E} = 1.4$, NOT SIGNIFICANT (1.4 < 11.07)
- Friend: $\chi^2 = \sum \frac{(O - E)^2}{E} = 85$, SIGNIFICANT (85 > 11.07)
- Is your friend using a weighted die?

χ² Test for Independence
- Tests the association between 2 categorical variables
- Do the frequencies you actually observe differ from the expected frequencies by more than chance alone?
- Statistical hypotheses:
  - H0: the 2 variables are independent (i.e., no association)
  - H1: the variables are not independent
- Steps:
  - Calculate expected frequency of each cell
  - Compute χ²
  - Compare to critical value
    - df = (# rows - 1) x (# columns - 1)

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where f_o = observed frequency and f_e = expected frequency

Example: χ² Test for Independence
- Is there an association between gender and vegetarianism?

          Vegetarian   Non-Vegetarian   Total
Male          10             60           70
Female        50             80          130
Total         60            140          200

Statistical Hypotheses:
- H0: gender and food preference are independent
- H1: gender and food preference are associated (not independent)

Decision rule: α = .05
- df = (# rows - 1) x (# columns - 1) = (2-1) x (2-1) = 1
- Critical χ² = 3.841

Next step: calculate the expected frequency of each cell

$$\text{expected frequency of each cell} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

          Vegetarian                      Non-Vegetarian                    Total
Male      fo = 10                         fo = 60                             70
          fe = (70 x 60) / 200 = 21       fe = (70 x 140) / 200 = 49
Female    fo = 50                         fo = 80                            130
          fe = (130 x 60) / 200 = 39      fe = (130 x 140) / 200 = 91
Total     60                              140                                200

Now put it into the table…

                      Male Veg   Male Non-Veg   Female Veg   Female Non-Veg
Sample (N = 200) fo      10           60            50             80
Expected freq. fe        21           49            39             91

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(10 - 21)^2}{21} + \frac{(60 - 49)^2}{49} + \frac{(50 - 39)^2}{39} + \frac{(80 - 91)^2}{91}$$

$$\chi^2 = 5.76 + 2.47 + 3.10 + 1.33 = 12.66$$

Example: χ² Test for Independence
- Observed statistical test value: χ²(1) = 12.66, p < .05
- Make a decision & interpret
  - Reject H0 and accept H1 because 12.66 > 3.84
  - Gender is related to food preference!

Practice!
- Is there an association between cat ownership (yes/no) and life success (yes/no)?
You survey 100 people…

          Successful   Not Successful   Total
Cat           60             15
No Cat        15             10
Total                                    100

- Don't forget to get your row and column totals…
- And follow the steps of hypothesis testing:
  - Statistical Hypothesis
  - Decision Rule
  - Calculate Test Statistic
  - Make a Decision & Interpret

          Successful   Not Successful   Total
Cat           60             15           75
No Cat        15             10           25
Total         75             25          100

Statistical Hypotheses:
- H0: cat ownership and life success are independent
- H1: cat ownership and life success are related

Decision rule: α = .05
- df = (# rows - 1) x (# columns - 1) = (2-1) x (2-1) = 1
- Critical χ² = 3.841

Expected frequency of each cell:

          Successful                       Not Successful                    Total
Cat       fo = 60                          fo = 15                             75
          fe = (75 x 75) / 100 = 56.25     fe = (75 x 25) / 100 = 18.75
No Cat    fo = 15                          fo = 10                             25
          fe = (25 x 75) / 100 = 18.75     fe = (25 x 25) / 100 = 6.25
Total     75                               25                                 100

                      Cat, Success   No cat, Success   Cat, No success   No cat, No success
Sample (N = 100) fo        60              15                15                 10
Expected freq. fe        56.25           18.75             18.75               6.25

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(60 - 56.25)^2}{56.25} + \frac{(15 - 18.75)^2}{18.75} + \frac{(15 - 18.75)^2}{18.75} + \frac{(10 - 6.25)^2}{6.25}$$

$$\chi^2 = .25 + .75 + .75 + 2.25 = 4.0$$

- Observed statistical test value: χ²(1) = 4.00, p < .05
- Make a decision & interpret
  - Reject H0 because 4.00 > 3.84
  - Cat ownership is related to life success!
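
The hand calculations in these examples can be double-checked with a few lines of Python. This is only a minimal sketch and not part of the original slides: the function names goodness_of_fit and independence and the CRITICAL_05 lookup are hypothetical, and the critical values are copied from the slides rather than computed. It reruns the friend's dice rolls and the cat ownership table.

```python
def goodness_of_fit(observed, expected):
    """Return (chi-square, df) for a one-variable goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum of (fo - fe)^2 / fe over all categories.
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_sq, len(observed) - 1  # df = # categories - 1


def independence(table):
    """Return (chi-square, df, expected) for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected frequency of each cell = row total x column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df, expected


# Critical values at alpha = .05, keyed by df (assumption: copied from the slides).
CRITICAL_05 = {1: 3.841, 2: 5.99, 4: 9.49, 5: 11.07}

# Friend's 120 dice rolls; each side is expected 20 times.
friend_chi, friend_df = goodness_of_fit([8, 9, 15, 15, 16, 57], [20] * 6)

# Cat ownership (rows: Cat, No Cat) by life success (columns: Successful, Not Successful).
cat_chi, cat_df, cat_expected = independence([[60, 15], [15, 10]])

print("Friend:", round(friend_chi, 2), "df =", friend_df,
      "reject H0:", friend_chi > CRITICAL_05[friend_df])
print("Cats:", round(cat_chi, 2), "df =", cat_df,
      "reject H0:", cat_chi > CRITICAL_05[cat_df])
```

Both functions take raw counts only, so the same calls work for the marital status, movie rating, and vegetarianism examples once the expected frequencies or the table of observed counts are supplied.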