Chapter 7: Non-parametric statistical methods

The aim of this chapter is for you to appreciate that nominal and ordinal data can also be analysed using inferential tests known as non-parametric tests. Key learning objectives are:

- to understand what nominal and ordinal data are
- to recognise when data cannot be analysed using t-tests and ANOVA
- to understand that significance testing can be applied to a comparison of group frequencies when analysing nominal data
- to understand that significance testing can be applied to an analysis of ‘ranks’ when analysing ordinal data
- to know how to assign a rank to each raw score within a data set
- to be able to identify the correct non-parametric test, given specific experimental circumstances
- to be able to calculate and interpret test values
- to be able to perform post-hoc analyses using nominal and ordinal data.

7.2.1 TESTS FOR NOMINAL DATA

DISCUSSION QUESTION
1. Provide real-world examples of experiments and other studies that generate nominal data.

SUGGESTED RESPONSE
This question should be answered in accordance with the interests and experiences of the instructor and students.

7.2.1.1 THE BINOMIAL TEST

CALCULATION QUESTION
1. Using the binomial test, are the following pairs of data significantly different given a total sample size of 10:
(a) 6 and 4
(b) 8 and 2
You might like to determine your finding using both the formula and Table 7.2.
2. What do these results tell you about the need for a large sample size?

Instructor Resource Manual t/a Research Design and Statistics by Edwards 7-1

SUGGESTED RESPONSE
1. The binomial test:
(a)
$$z = \frac{n_A - pn}{\sqrt{npq}} = \frac{6 - (0.5 \times 10)}{\sqrt{10 \times 0.5 \times 0.5}} = \frac{1}{1.58} = 0.63$$
Given that zcritical = +/−1.96 (see Table 7.1) for a two-tail test when α = .05 we must conclude that 0.63 < zcritical, and thus does not fall into the critical region. So we cannot say that there is a statistically significant difference between 6 and 4.
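The arithmetic for (a) can be checked with a few lines of Python. This is only a minimal sketch using the standard library: the function name `binomial_z` is a hypothetical helper introduced here for illustration (it is not from the textbook), and it assumes p = q = 0.5 unless told otherwise.

```python
import math

def binomial_z(count_a, n, p=0.5):
    """Normal-approximation z score for a binomial count.

    count_a: observed frequency in one category
    n:       total sample size
    p:       assumed probability of that category (q = 1 - p)
    """
    q = 1 - p
    return (count_a - p * n) / math.sqrt(n * p * q)

z = binomial_z(6, 10)
print(round(z, 2))     # 0.63
print(abs(z) > 1.96)   # False: not in the two-tailed critical region at alpha = .05
```

The same helper can be reused for any other pair of frequencies by changing the first two arguments.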
From Table 7.2, the probability of finding a comparison of 6 and 4 assuming p = q = 0.5 is 0.377 × 2 = 0.75 since we are performing a two-tailed test. This is in accord with the non-significant finding above. However, advanced students may notice a difference between the z value above, its corresponding probability and the probability derived from Table 7.2. This discrepancy occurs because the values in Table 7.2 are exact, whereas the arithmetic method of calculating a z score assumes that the normal distribution is a good approximation to the binomial distribution. This may not be the case, especially when the total number of observations is small.

(b)
$$z = \frac{n_A - pn}{\sqrt{npq}} = \frac{8 - (0.5 \times 10)}{\sqrt{10 \times 0.5 \times 0.5}} = \frac{3}{1.58} = 1.90$$
Given that zcritical = +/−1.96 (see Table 7.1) for a two-tail test when α = .05, our value of 1.90 for z falls short of the critical region and so we cannot say that there is a statistically significant difference between 8 and 2. This may appear surprising but is a consequence of having only 10 occurrences in total.
From Table 7.2, the probability is 0.055 × 2 = 0.11, which is in accord with the non-significant finding above. Again there is a small numerical difference between the arithmetic method and the value from Table 7.2, as explained above.

2. The result for (b) is surprising given the difference between 8 and 2. It suggests that small samples demonstrate significance only when extreme differences exist between the two outcomes. As a corollary, if small differences between the two outcomes are to be seen as statistically significant you need a much larger sample size.

7.2.1.2 THE CHI-SQUARE TEST

CALCULATION QUESTION
1. In educational research, it is often useful to poll students as to what facilities best aid their learning. As such, each student in a class of 100 was asked to nominate the learning aid that was most important to them.
The results were:

Learning aid          Lectures   Tutorials
Number of students    35         65

As we can see, most students thought that tutorials were more important than lectures. But was there a statistically significant difference between the two categories that may sway how the subject is taught in the future?

SUGGESTED RESPONSE
$$\chi^2 = \frac{(35 - 50)^2}{50} + \frac{(65 - 50)^2}{50} = \frac{225}{50} + \frac{225}{50} = 9$$
$$df = (k - 1) = (2 - 1) = 1$$
Using the table of chi-square critical values (Table 7.3), we find χ²critical = 3.84 when α = .05. As our value for chi-square is bigger than the critical value (i.e. 9 > 3.84), it falls into the critical region and we can declare a significant difference between our two categories. As such, students clearly prefer tutorials to lectures.

7.2.2 TESTS FOR ORDINAL DATA

DISCUSSION QUESTION
1. Provide pertinent real-world examples of experiments and studies that generate ordinal data.

SUGGESTED RESPONSE
This is an open-ended question that relies on the instructor’s knowledge and the interests of the students.

7.2.2.1 HOW TO RANK ORDINAL SCORES

CALCULATION QUESTION
1. Ranking is the basis for tests of ordinal data. Therefore:
(a) rank the following scores: 5 2 4 1 8
(b) rank the following scores: 8 13 14 11 21

SUGGESTED RESPONSE
Ranking:
(a)
Raw score   Rank
1           1
2           2
4           3
5           4
8           5

(b)
Raw score   Rank
8           1
11          2
13          3
14          4
21          5

CALCULATION QUESTION
1. Tied scores are a common problem when ranking. Therefore:
(a) rank the following scores: 2 3 5 3 7
(b) rank the following scores: 5 5 5 2 7

SUGGESTED RESPONSE
Ties:
(a) The 3’s have the ordinal positions of 2 and 3. As such, their average rank is (2 + 3)/2 = 2.5

Raw score   Rank
2           1
3           2.5
3           2.5
5           4
7           5

(b) The 5’s have the ordinal positions of 2, 3 and 4.
Therefore, their average rank is (2 + 3 + 4)/3 = 3

Raw score   Rank
2           1
5           3
5           3
5           3
7           5

7.2.2.2 WILCOXON SIGNED-RANKS TEST

CALCULATION QUESTIONS
1. Which of the following combinations demonstrate significance for the Wilcoxon signed-ranks test?
(a) Tlesser = 7, Tcritical = 8
(b) Tlesser = 8, Tcritical = 5
(c) Tlesser = 7, n = 13, α = .05, appropriate for a two-tailed hypothesis
(d) Tlesser = 7, n = 13, α = .05, appropriate for a one-tailed hypothesis
(e) Tlesser = 55, n = 20, α = .05, appropriate for a two-tailed hypothesis
(f) Tlesser = 62, n = 20, α = .05, appropriate for a one-tailed hypothesis
2. A comparison of two conditions gave a set of 10 difference scores. For the following difference scores determine if Tlesser is significant given α = .05 and the hypothesis is two-tailed. The difference scores are:
−5 −10 −2 +1 +2 −3 +1 −4 −3 −6

SUGGESTED RESPONSE
1. Wilcoxon:
(a) Significant as Tlesser < 8
(b) Not significant as Tlesser > 5
(c) Significant as Tlesser < 17
(d) Significant as Tlesser < 21
(e) Not significant as Tlesser > 52
(f) Not significant as Tlesser > 60

2. Raw scores with corresponding ranks (ranked by size, ignoring sign):

Raw score   Rank
+1          1.5
+1          1.5
+2          3.5
−2          3.5
−3          5.5
−3          5.5
−4          7
−5          8
−6          9
−10         10

Division of ranks by sign:

Ranks associated with positive scores   Ranks associated with negative scores
1.5                                     3.5
1.5                                     5.5
3.5                                     5.5
                                        7
                                        8
                                        9
                                        10
Tlesser = 6.5                           Tgreater = 48.5

As Tlesser = 6.5 < Tcritical = 8, we can conclude a significant difference.

DISCUSSION QUESTIONS
1. What effect do a number of tied scores have on the Wilcoxon signed-ranks test?
2. How can tied scores be avoided when designing a study?

SUGGESTED RESPONSE
1. Tied scores decrease variability in the ranks. As such, it can be more difficult to reject the null hypothesis.
If you have several ties, you can apply the correction to the test described in Siegel and Castellan (1988).
2. One simple way is to use scales that allow for more variety in the score each participant can give. For example, you might choose a 10-point scale as opposed to a five-point scale. It then becomes less likely that the same score will be chosen by multiple participants, and thus the number and extent of tied scores will be decreased. In addition, the more participants you have the more likely you are to get tied scores. Therefore, your analysis can become a compromise between the need for a large sample to maintain statistical power and the problem of tied scores.

7.2.2.3 MANN-WHITNEY U TEST

CALCULATION QUESTIONS
1. For the data sets below calculate the sums of ranks:
(a) Data set A: 24 23 25 27 19
(b) Data set B: 3 4 3 6 1
2. For the following data determine if there is a significant difference using the Mann-Whitney U Test assuming α = .05 and it is a two-tailed hypothesis:
Data set A: 2 4 3 5 3
Data set B: 5 4 6 8 8
3. Will UA equal UB if nA is different to nB?

SUGGESTED RESPONSE
1.
Raw score   Rank
1 (B)       1
3 (B)       2.5
3 (B)       2.5
4 (B)       4
6 (B)       5
19 (A)      6
23 (A)      7
24 (A)      8
25 (A)      9
27 (A)      10

ΣRA = 40, ΣRB = 15

2. Before calculating UA and UB we need to find the sum of ranks:

Raw score   Rank
2 (A)       1
3 (A)       2.5
3 (A)       2.5
4 (A)       4.5
4 (B)       4.5
5 (A)       6.5
5 (B)       6.5
6 (B)       8
8 (B)       9.5
8 (B)       9.5

ΣRA = 17, ΣRB = 38

Giving:
$$U_A = (5 \times 5) + \frac{5 \times (5 + 1)}{2} - 17 = 25 + 15 - 17 = 23$$
$$U_B = (5 \times 5) + \frac{5 \times (5 + 1)}{2} - 38 = 25 + 15 - 38 = 2$$
As UB represents the lesser of the two values for U we compare it to Ucritical. Given nA = 5, nB = 5 and α = .05, Ucritical = 2 (two-tailed). As UB = Ucritical, it does not exceed the critical value and we can conclude a significant difference between these two data sets.

3. UA and UB can be equal if both sets of data are equally distributed amongst the ranks.
This is most likely to occur when the size of each data set is the same.

7.3.1.1 THE CHI-SQUARE TEST AND OPTIONS FOR A POST-HOC ANALYSIS

CALCULATION QUESTION
1. Undergraduate medical students were polled as to their prior use of alternative therapies. The results are below:

Therapy            Number of students who have used each therapy
Acupuncture        15
Herbal remedies    17
Homeopathy         7
Voodoo             1

Using chi-square determine whether there is a preference for some forms of alternative therapies over others.

SUGGESTED RESPONSE
$$\chi^2 = \frac{(15 - 10)^2}{10} + \frac{(17 - 10)^2}{10} + \frac{(7 - 10)^2}{10} + \frac{(1 - 10)^2}{10} = \frac{25}{10} + \frac{49}{10} + \frac{9}{10} + \frac{81}{10} = 16.4$$
$$df = (k - 1) = (4 - 1) = 3$$
Assuming α = .05, then the critical value for chi-square is 7.82. As 16.4 is greater than 7.82, then there is a statistically significant difference between the preferences shown for the four alternative therapies.

7.3.2.1 THE KRUSKAL-WALLIS TEST AND OPTIONS FOR A POST-HOC ANALYSIS

CALCULATION QUESTIONS
1. Calculate the sum of ranks for the following data sets:
Condition A: 2 3 1 3 2
Condition B: 4 2 5 6 8
Condition C: 7 6 9 10 11
2. Assuming four treatment groups with the following characteristics, calculate KW:
Data set A: sum of ranks = 28, sample size = 7
Data set B: sum of ranks = 50, sample size = 5
Data set C: sum of ranks = 93, sample size = 6
Data set D: sum of ranks = 180, sample size = 8

SUGGESTED RESPONSE
1. First, rank all scores:

Raw score   Rank
1 (A)       1
2 (A)       3
2 (A)       3
2 (B)       3
3 (A)       5.5
3 (A)       5.5
4 (B)       7
5 (B)       8
6 (B)       9.5
6 (C)       9.5
7 (C)       11
8 (B)       12
9 (C)       13
10 (C)      14
11 (C)      15

Now place the ranks into the various conditions in place of their raw scores and then sum the ranks:

Condition A   Condition B   Condition C
1             3             9.5
3             7             11
3             8             13
5.5           9.5           14
5.5           12            15
ΣRA = 18      ΣRB = 39.5    ΣRC = 62.5

2.
KW:
$$KW = \frac{12}{n(n + 1)} \left( \sum \frac{R^2}{n_k} \right) - 3(n + 1) = \frac{12}{26 \times (26 + 1)} \left( \frac{28^2}{7} + \frac{50^2}{5} + \frac{93^2}{6} + \frac{180^2}{8} \right) - 3 \times (26 + 1) = 23.33$$

CALCULATION QUESTIONS
1. Using Table 7.17 calculate z, given α = .05 for:
(a) A one-tailed test of four comparisons
(b) A two-tailed test of six comparisons
2. Is it possible to calculate the minimum significant difference given that z = 2.394, n = 24 and the size of the first treatment group under comparison is 6?
3. What is the absolute difference between these average ranks:
(a) 4.5 versus 2.4
(b) 2.4 versus 4.5

SUGGESTED RESPONSE
1. (a) A one-tailed test of four comparisons has z = 2.241
(b) A two-tailed test of six comparisons has z = 2.638
2. No. The need for post-hoc tests suggests three or more treatment groups. You have only accounted for 6 out of 24 participants and cannot assume that the size of the second treatment group will also be 6. Until you know the size of the second treatment group you cannot calculate the minimum significant difference.
3. In both instances it is 2.1, given that the ‘absolute’ value is the size of the difference irrespective of sign.

7.3.2.2 FRIEDMAN TWO-WAY ANALYSIS BY RANKS AND OPTIONS FOR A POST-HOC ANALYSIS

CALCULATION QUESTION
1. If Fr = 7.81, assuming four categories and α = .05, is this value for Fr significant?

SUGGESTED RESPONSE
If there are four categories, then df = 3. Using the table of critical chi-square values, we can see that our value for Fr must be greater than 7.82 to be declared statistically significant. However, as 7.81 is just less than 7.82 you cannot assign statistical significance. Nevertheless, consider the effects of rounding errors etc. and whether these could influence the value for Fr. In addition, remember that null hypothesis significance testing is a blunt instrument. You may still wish to express interest in the findings even if the maths suggests non-significance.
Finally, if you had chosen α = .1 you would have found significance and the problem would have evaporated.

CALCULATION QUESTION
1. Find the value for q given:
(a) three possible comparisons when α = .05 and you have a two-tailed hypothesis
(b) four possible comparisons when α = .05 and you have a one-tailed hypothesis.

SUGGESTED RESPONSE
Post-hocs:
(a) q = 2.35
(b) q = 2.16

7.4.1 THE CHI-SQUARE TEST FOR A 2x2 TABLE

CALCULATION QUESTION
1. For the following 2x2 table calculate chi-square:

                          Variable A
                          Category 1   Category 2
Variable B   Category 1   23           12
             Category 2   11           26

SUGGESTED RESPONSE
$$\chi^2 = \frac{n \left( \left| (freq_{R1C1} \times freq_{R2C2}) - (freq_{R1C2} \times freq_{R2C1}) \right| - (n/2) \right)^2}{(freq_{R1C1} + freq_{R1C2}) \times (freq_{R2C1} + freq_{R2C2}) \times (freq_{R1C1} + freq_{R2C1}) \times (freq_{R1C2} + freq_{R2C2})}$$
$$\chi^2 = \frac{72 \times \left( \left| (23 \times 26) - (12 \times 11) \right| - (72/2) \right)^2}{(23 + 12) \times (11 + 26) \times (23 + 11) \times (12 + 26)} = \frac{13312800}{1673140} = 7.96$$
$$df = (r - 1) \times (c - 1) = (2 - 1) \times (2 - 1) = 1$$
χ²critical = 3.84, given 1 degree of freedom when α = .05. As such, our value for chi-square is much larger than the critical value and so we can declare a statistically significant difference. This is not surprising given the differences between and across cells in the data table above.

7.4.2 FISHER’S EXACT 2x2 TEST

CALCULATION QUESTION
1. When using Fisher’s exact 2x2 test what do the following exact probabilities suggest when compared to α = .05:
(a) Prexact = .001
(b) Prexact = .02
(c) Prexact = .1
(d) Prexact = .35

SUGGESTED RESPONSE
Fisher’s exact 2x2 test:
(a) Prexact = .001 < .05, so is considered an ‘improbable’ test result and thus we attribute statistical significance to it.
(b) Prexact = .02 < .05, so is considered an ‘improbable’ test result and thus we attribute statistical significance to it.
(c) Prexact = .1 > .05, so is considered a ‘probable’ test result and thus we do not attribute statistical significance to it.
(d) Prexact = .35 > .05, so is considered a ‘probable’ test result and thus we do not attribute statistical significance to it.

7.4.3 THE CHI-SQUARE TEST AS APPLIED TO A 2x3 TABLE AND OPTIONS FOR A POST-HOC ANALYSIS OF INTERACTION

CALCULATION QUESTIONS
1. For the following table of data what are the expected cell frequencies?

                          Variable A
                          Category 1           Category 2           Category 3
Variable B   Category 1   observed freq. = 5   observed freq. = 8   observed freq. = 11
             Category 2   observed freq. = 8   observed freq. = 4   observed freq. = 2

2. For a chi-square analysis of a 2x4 table, how many degrees of freedom are there?
3. For a chi-square analysis of a 2x4 table, how many terms of the form $\frac{(\text{observed frequency} - \text{expected frequency})^2}{\text{expected frequency}}$ will have to be summed to calculate χ²?

SUGGESTED RESPONSE
1. Using:
$$\text{expected frequency}_{\text{cell}} = \frac{\text{column total} \times \text{row total}}{\text{total sample size}}$$
We derive column and row totals, noting that the total sample size equals 38:

                          Variable A
                          Category 1           Category 2           Category 3            Row totals
Variable B   Category 1   observed freq. = 5   observed freq. = 8   observed freq. = 11   24
             Category 2   observed freq. = 8   observed freq. = 4   observed freq. = 2    14
Column totals             13                   12                   13

We now get (expected frequencies rounded to the nearest whole number):

                          Variable A
                          Category 1           Category 2           Category 3
Variable B   Category 1   observed freq. = 5   observed freq. = 8   observed freq. = 11
                          expected freq. = 8   expected freq. = 8   expected freq. = 8
             Category 2   observed freq. = 8   observed freq. = 4   observed freq. = 2
                          expected freq. = 5   expected freq. = 4   expected freq. = 5

2.
$$df = (r - 1) \times (c - 1) = (2 - 1) \times (4 - 1) = 3$$
3. 2 × 4 = 8 cells and thus 8 parts to the arithmetic to be summed together.
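The expected cell frequencies for the 2x3 table in Section 7.4.3 can be reproduced with a short script. This is a minimal sketch in plain Python: the helper names `expected_frequencies` and `degrees_of_freedom` are hypothetical, chosen here for illustration, and the rows are assumed to be the two categories of Variable B with the columns being the three categories of Variable A.

```python
def expected_frequencies(table):
    """Expected count per cell: (row total * column total) / total sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def degrees_of_freedom(rows, cols):
    """df = (r - 1) * (c - 1) for a chi-square test on an r x c table."""
    return (rows - 1) * (cols - 1)

# Observed frequencies (assumed layout: rows = Variable B, columns = Variable A)
observed = [
    [5, 8, 11],
    [8, 4, 2],
]

for row in expected_frequencies(observed):
    # Unrounded values, then the whole-number values quoted in the response
    print([round(x, 2) for x in row], [round(x) for x in row])

print(degrees_of_freedom(2, 4))  # 3, as in response 2
```

Printing the unrounded values alongside the rounded ones shows how far each expected frequency sits from the whole number given in the suggested response.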