Research Methods 1: Data Analysis for the "Risky Shift" practical:

What do we want to find out?
Our experiment investigates whether group discussion affects the riskiness of
people's decision making, perhaps producing a "Risky Shift", so that decisions
that emerge from group discussion are more risky than decisions made by the
same individuals independently of each other.
Each participant in our study was presented with five scenarios. For each one
there were two alternatives: a "safe" option or a "risky" option that was preferable
to the safe option if it proved to be successful. Participants were presented with
various odds of success for the risky option, which effectively varied from slightly
risky to highly risky. They were asked to choose the odds that they would accept
before recommending the "risky" option to someone. Thus you might not
recommend to someone that they bet on a horse if the odds of it winning were 1
in a million, but you might recommend it if the odds were 1 in 2. The ratings for
each person were then averaged over all of the scenarios, to produce three
average ratings per person: (a) a rating for the "pre-discussion" condition (in
which participants made their decisions individually, before discussing the
scenarios with other people); (b) a rating for the "group discussion" condition (in
which participants discussed the scenarios in a small group and then came to a
consensus about the ratings); and (c) a rating for the "post-discussion" condition
(in which participants once again rated the scenarios in isolation from other
people). These three scores are labelled "premean1_5", "grpmean1_5" and
"postmean1_5" in the SPSS data file.
We want to know two things. Firstly, does group discussion affect the riskiness of
people's decisions about the scenarios? Secondly, do the effects of group
discussion persist? If so, when people make decisions individually again, their
decisions should be affected by the group discussion (and should be different
from their initial individual decisions).
Which test should we use?
Each participant provides three mean ratings for the riskiness of the decisions,
one each for the "pre-", "group-" and "post-" discussion conditions. Friedman's
test is the most appropriate test to see if these ratings differ, for a number of
reasons:
(a) the data are ratings, and hence ordinal data.
(b) the data do not show homogeneity of variance. The "group" ratings are
identical for all members of a given group, because they had to come to a
consensus. As a result, the variance for the "group" condition is much less than
for the other conditions.
(c) The data for the three conditions are not normally distributed.
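(If you wanted to verify points (b) and (c) for yourself outside SPSS, a rough sketch in Python with scipy is given below. It is not part of the practical's required analysis; the arrays simply stand in for the premean1_5, grpmean1_5 and postmean1_5 columns, using only the first eight participants' means from the ranking table later in this handout.)

# Sketch of checking points (b) and (c) above in Python (scipy).
# These arrays are only the first eight participants' means, standing
# in for the full premean1_5 / grpmean1_5 / postmean1_5 columns.
import numpy as np
from scipy import stats

pre  = np.array([4.6, 6.8, 6.6, 5.8, 5.4, 6.2, 7.0, 5.4])
grp  = np.array([5.0] * 8)   # consensus rating, identical within a group
post = np.array([3.4, 5.0, 4.6, 4.2, 4.6, 5.0, 6.6, 5.4])

# (b) The "group" condition has far less spread than the other two.
print("variances:", pre.var(ddof=1), grp.var(ddof=1), post.var(ddof=1))

# (c) Shapiro-Wilk tests of normality for the two individual-rating
# conditions (the group column is constant in this small excerpt, so a
# normality test on it would be meaningless).
print("pre: ", stats.shapiro(pre))
print("post:", stats.shapiro(post))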
Friedman's test will tell us whether the three conditions differ significantly in some
way. You also want to know whether any effects of group discussion on judgement-making
persist: the way to do this is to compare the "pre-" and "post-" discussion
conditions. The appropriate test for this "pairwise" comparison is the Wilcoxon test.
Friedman's Test, using SPSS:
Step 1: Open SPSS, and open the data file riskyshiftdatafile1.sav (available from
my website, www.sussex.ac.uk/Users/grahamh/teaching06).
Step 2: Click on "analyze", then "nonparametric tests", and then "k related
samples..."
Step 3: In the dialog box that appears, click on the variables "pre mean1_5", "grp
mean1_5" and "post mean1_5" in the lefthand window, and then click on the
arrow to move them to the righthand window (the one labelled "Test Variables").
Make sure that there is a tick next to "Friedman", at the bottom of the dialog box
(it should be ticked by default, anyway). If you click on "Statistics..." and select
"descriptive statistics", SPSS will give you the mean and standard deviation for
each of these conditions. Finally, click "OK" to run the analysis.
Step 4:
If you've done it correctly, you should get the following output:
Test Statistics(a)
N               48
Chi-Square      39.767
df              2
Asymp. Sig.     .000
a. Friedman Test
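(If you happen to have the data outside SPSS, scipy's friedmanchisquare function runs the same test in Python. The sketch below is only illustrative: it uses just the first eight participants' means, so it will not reproduce the N = 48 output above.)

# Sketch of the same Friedman test in Python with scipy, assuming the
# three columns of riskyshiftdatafile1.sav have been exported. Only the
# first eight participants are included, so the chi-square will differ
# from the value in the SPSS output above.
from scipy import stats

pre  = [4.6, 6.8, 6.6, 5.8, 5.4, 6.2, 7.0, 5.4]
grp  = [5.0] * 8
post = [3.4, 5.0, 4.6, 4.2, 4.6, 5.0, 6.6, 5.4]

chi2, p = stats.friedmanchisquare(pre, grp, post)
print(f"Friedman chi-square = {chi2:.3f}, p = {p:.4f}")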
Wilcoxon Test, using SPSS:
Step 1: Open the data file, riskyshiftdatafile1.sav (available from my website,
www.sussex.ac.uk/Users/grahamh/teaching06).
Step 2: Click on "Analyze"; then on "Nonparametric Tests"; and then on "2
related samples..."
Step 3: Click on the two variables that you want to compare, in this case pre
mean1_5 and post mean1_5. Click on the central arrow to move them from the
left-hand window to the window entitled "Test Pair(s) List". Make sure there is a
tick next to "Wilcoxon" (by default, there should be).
Click on "Options..." and select "Descriptive Statistics". Finally, click on "OK" to
run the analysis.
Step 4: You should get the following results. (Remember that SPSS turns
Wilcoxon's W into a z-score).
Test Statistics(b)
                          post mean1_5 - pre mean1_5
Z                         -5.150(a)
Asymp. Sig. (2-tailed)    .000
a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test
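(Again, for those working outside SPSS: scipy's wilcoxon function performs the same signed-ranks test in Python. The sketch below uses only the first eight participants' pre- and post-discussion means, so it will not reproduce the z value above.)

# Sketch of the corresponding Wilcoxon signed-ranks test in Python with
# scipy, on the first eight participants only. scipy reports W (the
# smaller sum of signed ranks) and a p-value; SPSS instead converts W
# into the z statistic shown above.
from scipy import stats

pre  = [4.6, 6.8, 6.6, 5.8, 5.4, 6.2, 7.0, 5.4]
post = [3.4, 5.0, 4.6, 4.2, 4.6, 5.0, 6.6, 5.4]

stat, p = stats.wilcoxon(pre, post)
print(f"W = {stat}, p = {p:.4f}")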
Friedman's test by hand, step by step:
(For a worked example of this test, look at RM1 lecture 17, "Nonparametric
statistics 2").
Step 1:
Rank each participant's scores individually.
If all of the scores are different, the lowest score gets a rank of 1; the next
highest gets a rank of 2, and the highest gets a rank of 3. If two scores are the
same, this is a "tie": the scores get the average of the ranks that they would have
obtained, had they been different.
I've ranked the first eight participants for you. Participant 1's lowest score is 3.4,
so this gets a rank of 1. The next highest is 4.6, so this gets a rank of 2; and the
highest is 5, so it gets a rank of 3. Participant 2 has two scores that are the same
(5 and 5) so these get the average of the ranks "1" and "2" (which is 1.5). This
uses up the ranks of 1 and 2, so the highest score gets the rank of 3.
Participant   premean   premean rank   group mean   group mean rank   postmean   postmean rank
1             4.6       2              5            3                 3.4        1
2             6.8       3              5            1.5               5.0        1.5
3             6.6       3              5            2                 4.6        1
4             5.8       3              5            2                 4.2        1
5             5.4       3              5            2                 4.6        1
6             6.2       3              5            1.5               5.0        1.5
7             7.0       3              5            1                 6.6        2
8             5.4       2.5            5            1                 5.4        2.5
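(If you want to check your hand ranking, scipy's rankdata function uses exactly the same "average the tied ranks" rule. A short sketch, using participants 2 and 8 from the table above:)

# scipy's rankdata gives tied scores the average of the ranks they
# would otherwise have received, as described in Step 1.
from scipy.stats import rankdata

print(rankdata([6.8, 5.0, 5.0]))   # participant 2 -> [3.  1.5 1.5]
print(rankdata([5.4, 5.0, 5.4]))   # participant 8 -> [2.5 1.  2.5]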
Step 2:
Find the rank total for each condition, using the ranks from all participants within
that condition. This gives you the rank total for condition 1 (Tc1), for condition 2
(Tc2) and for condition 3 (Tc3).
Step 3:
Work out χr²:

χr² = [12 / (N × C × (C + 1))] × ΣTc² - 3 × N × (C + 1)

where:
C is the number of conditions (3 in this case).
N is the number of subjects (48 in this case).
ΣTc² is the sum of the squared rank totals for each condition.
To get ΣTc²:
(a) Square each rank total (i.e. square Tc1, Tc2 and Tc3).
(b) Add together these squared totals.
Step 4:
Degrees of freedom = number of conditions minus one.
d.f = 3 - 1 = 2 in this case.
Step 5:
Assessing the statistical significance of χr² depends on the number of subjects
and the number of groups. We have more than 9 subjects, and so we can
effectively treat χr² as Chi-Square. Compare your obtained χr² value to the
critical value of Chi-Square for your d.f.
If your obtained χr² is equal to or larger than the critical Chi-Square value, the
conditions are significantly different. If significant, the test will only tell you that
some kind of difference exists between the three conditions; look at the mean
score for each condition to see where the difference actually comes from.
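(To see how Steps 1 to 5 fit together, here is a sketch of the whole hand calculation in Python, using only the eight example participants from the ranking table, so N = 8 rather than the full 48.)

# Sketch of Friedman's test "by hand" (Steps 1-5) in Python, on the
# eight example participants only (N = 8, not the full 48).
import numpy as np
from scipy.stats import rankdata, chi2

scores = np.array([
    # pre   group  post
    [4.6,   5.0,   3.4],
    [6.8,   5.0,   5.0],
    [6.6,   5.0,   4.6],
    [5.8,   5.0,   4.2],
    [5.4,   5.0,   4.6],
    [6.2,   5.0,   5.0],
    [7.0,   5.0,   6.6],
    [5.4,   5.0,   5.4],
])

N, C = scores.shape                                  # subjects, conditions
ranks = np.apply_along_axis(rankdata, 1, scores)     # Step 1: rank each participant's scores
T = ranks.sum(axis=0)                                # Step 2: rank totals Tc1, Tc2, Tc3
chi_r = 12 / (N * C * (C + 1)) * np.sum(T ** 2) - 3 * N * (C + 1)   # Step 3
df = C - 1                                           # Step 4
critical = chi2.ppf(0.95, df)                        # Step 5: critical value at p = .05
print("rank totals:", T)
print(f"chi_r = {chi_r:.3f}, df = {df}, critical chi-square = {critical:.3f}")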
Wilcoxon test by hand, step by step:
(For a worked example of this test, look at RM1 Lecture 16, "Nonparametric
statistics 1"):
Step 1: Find the difference between each pair of scores, keeping track of the
sign of the difference (i.e., whether the difference is positive or negative). To
illustrate, if we were comparing pre mean1_5 and post mean1_5, this is what the
differences would look like for the first five pairs of scores:
pre mean1_5   post mean1_5   difference
4.6           3.4            +1.2
6.8           5.0            +1.8
6.6           4.6            +2.0
5.8           4.2            +1.6
5.4           4.6            +0.8
Step 2: Rank the differences, disregarding whether they are positive or negative
differences. The smallest difference gets a rank of 1; the next smallest gets a
rank of 2; and so on. Tied scores are dealt with as explained above, in relation to
Friedman's test. Ignore zero difference-scores.
Step 3: Add together the positive-signed ranks. Add together the negative-signed
ranks.
Step 4: "W" is the smaller sum of ranks; N is the number of differences, omitting
zero differences.
Step 5: Ordinarily you would use a table (e.g. the one on my website) to find the
critical value of W, for your N. Your obtained W has to be equal to or smaller than
this critical value, for it to be statistically significant. If it is significant, this means
that pre- and post-discussion ratings differ. (Does this difference go in the
direction you would expect? Should you use a one-tailed or a two-tailed test in
these circumstances?)
To save you looking it up, I've reproduced the relevant part of the table below.
The table only goes as far as an N of 25, but that's good enough for our
purposes. For example, our obtained W has to be equal to or smaller than 89 to
be significant at the .05 level with a two-tailed test.
N = 25:

One-tailed significance level:   0.025   0.01   0.005
Two-tailed significance level:   0.05    0.02   0.01
Critical value of W:             89      77     68
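(Finally, here is a sketch of the Wilcoxon hand calculation in Python, using the eight example participants' pre- and post-discussion means from earlier in this handout; the full analysis would of course use all 48 participants.)

# Sketch of the Wilcoxon test "by hand" (Steps 1-5) in Python, on the
# eight example participants only.
import numpy as np
from scipy.stats import rankdata

pre  = np.array([4.6, 6.8, 6.6, 5.8, 5.4, 6.2, 7.0, 5.4])
post = np.array([3.4, 5.0, 4.6, 4.2, 4.6, 5.0, 6.6, 5.4])

diff = pre - post                  # Step 1: signed differences
diff = diff[diff != 0]             # ignore zero differences
ranks = rankdata(np.abs(diff))     # Step 2: rank the absolute differences

pos = ranks[diff > 0].sum()        # Step 3: sum of positive-signed ranks
neg = ranks[diff < 0].sum()        #         sum of negative-signed ranks

W = min(pos, neg)                  # Step 4: W is the smaller sum of ranks
N = len(diff)                      #         N omits zero differences
print(f"W = {W}, N = {N}")         # Step 5: compare W to the critical value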