Candidate Number

C8552

THE UNIVERSITY OF SUSSEX

BSc Second Year Examination

DISCOVERING STATISTICS

SAMPLE PAPER

DO NOT TURN OVER UNTIL INSTRUCTED TO BY THE CHIEF INVIGILATOR

INSTRUCTIONS

Do not, under any circumstances, remove the question paper, used or unused, from the examination room; it will be collected before you may leave.

Time allowed: 2 Hours

Answer ALL questions in the answer book provided

Please note: The only approved calculators for use in University examinations are the Casio fx82, fx83, fx85, fx115, fx570 and fx991 (all with any suffix). Students are not permitted to take instruction notes or booklets relating to their calculator into an examination.

C8552 Discovering Statistics

1. A record company executive was interested in the effects of subliminal messages on records, having had many of his artists (e.g. Ozzy Osbourne) sued for allegedly having evil messages on their records that incited daft people to do stupid things like kill themselves. So, he took a record by Westlife and inserted different types of subliminal message onto different versions: (1) a control record didn’t have any message (No Message); (2) a second record had a friendly message that said ‘Be happy and at one with your being’ (Friendly); (3) the third had the satanic message ‘Surrender your soul to Beelzebub’ (Satan); (4) the fourth had another satanic message that instructed them to do a violent act, ‘Surrender your soul to the dark lord and sacrifice some goats while you’re at it’ (Goats); and (5) a final record had the same satanic message about goats but it was played backwards (Backwards). He played the different types of record to different groups of teenagers. The outcome that the executive measured was the number of goats that each listener sacrificed. The SPSS Output is reproduced after the question.

(a) What does Cohen’s d represent? Compute and interpret Cohen’s d for the difference in the number of goats sacrificed in the Satan group and the No Message group. [5 marks]

(b) There are some numbers missing from the ANOVA summary table. Calculate these three values (the residual sum of squares, the residual mean squares and the F-ratio). [3 marks]

(c) Is the assumption of homogeneity of variance met? [3 marks]

(d) What conclusions could we make about the effects of subliminal messages on records? [2 marks]

(e) The executive made 3 predictions: (1) having no message, or a friendly message, would have less effect than having some kind of satanic message; (2) the backward satanic message would have more impact than the two non-backward messages; (3) the satanic message that specifically told people to kill goats would have more effect than the satanic message that did not. Suggest some planned contrasts (with the appropriate group codings) that could be done to test these hypotheses. [4 marks]

(f) What do you understand by the term ‘mean squares’ (i.e. conceptually speaking, what does the ‘mean squares’ in an ANOVA table represent)? [3 marks]

                                                       95% Confidence Interval for Mean
              N    Mean    Std. Deviation  Std. Error  Lower Bound   Upper Bound
No Message    7    15.71   2.690           1.017       13.23         18.20
Friendly      8    12.13   2.588            .915        9.96         14.29
Satan         6     9.33   1.033            .422        8.25         10.42
Goats         8     8.13   3.441           1.217        5.25         11.00
Backward      7    11.86   3.237           1.223        8.86         14.85
Total        36    11.42   3.737            .623       10.15         12.68

Number of Goats Sacrificed
Levene Statistic   df1   df2   Sig.
1.801              4     31    .154

ANOVA
Number of Goats Sacrificed
                  Sum of Squares   df   Mean Square   F      Sig.
Between Groups    247.381           4   61.845               .000
Within Groups                      31
Total             488.750          35

Robust Tests of Equality of Means
Number of Goats Sacrificed
                  Statistic(a)   df1   df2      Sig.
Welch             9.289          4     14.917   .001
Brown-Forsythe    8.364          4     25.969   .000
a. Asymptotically F distributed.

2. An experiment was done to look at the positive arousing effects of imagery on different people.
A sample of statistics lecturers was compared against a group of students. Both groups received presentations of positive images (e.g. cats and bunnies), neutral images (e.g. duvets and lightbulbs), and negative images (e.g. corpses and vivisection photographs). Positive arousal was measured physiologically (high values indicate positive arousal) both before and after each batch of images. The order in which participants saw the batches of positive, neutral and negative images was randomised to avoid order effects. It was hypothesised that positive images would increase positive arousal, negative images would reduce positive arousal and that neutral images would have no effect. Differences between the participant groups (lecturers and students) were not expected. The SPSS Output is reproduced after the question.

(a) What type of analysis has been carried out (briefly describe the design in answering this question)? [2 marks]

(b) With reference to the current experiment, what are the relative pros and cons of repeated measures experimental designs compared to independent (aka between-group) ones? [5 marks]

(c) Are any assumptions broken and if so what impact does that have? [3 marks]

(d) Interpret the output in full: do students and statistics lecturers differ in the type of stimuli that arouse them? Are statistics lecturers more aroused than students in general? Do the images vary in the degree to which they affect physiological arousal? [10 marks]

Descriptive Statistics
                                    Group                  Mean       Std. Deviation   N
Arousal Before Positive Imagery     Statistics Lecturers    3.6406     5.95578         10
                                    Students                1.9049     7.19965         10
                                    Total                   2.7728     6.49218         20
Arousal Before Neutral Imagery      Statistics Lecturers    4.7012     8.53801         10
                                    Students                1.9563     7.88769         10
                                    Total                   3.3288     8.12304         20
Arousal Before Negative Imagery     Statistics Lecturers    4.3077     2.19003         10
                                    Students                1.9226     9.39252         10
                                    Total                   3.1151     6.74959         20
Arousal After Positive Imagery      Statistics Lecturers   14.5436     6.48193         10
                                    Students               13.7746     6.94780         10
                                    Total                  14.1591     6.55159         20
Arousal After Neutral Imagery       Statistics Lecturers    3.8363     7.32800         10
                                    Students                2.5668     7.29281         10
                                    Total                   3.2015     7.14519         20
Arousal After Negative Imagery      Statistics Lecturers   12.7004     6.19516         10
                                    Students               -9.7949     6.63192         10
                                    Total                   1.4528    13.12183         20

Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
                                                                           Epsilon(a)
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Greenhouse-Geisser   Huynh-Feldt   Lower-bound
TIME                     1.000          .000                0    .      1.000                1.000         1.000
IMAGERY                   .985          .260                2    .878    .985                1.000          .500
TIME * IMAGERY            .885         2.076                2    .354    .897                1.000          .500
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept+GROUP
   Within Subjects Design: TIME+IMAGERY+TIME*IMAGERY

Tests of Within-Subjects Effects
Measure: MEASURE_1
Source                   Correction            Type III Sum of Squares   df       Mean Square   F        Sig.
TIME                     Sphericity Assumed     306.989                   1        306.989       7.579   .013
                         Greenhouse-Geisser     306.989                   1.000    306.989       7.579   .013
                         Huynh-Feldt            306.989                   1.000    306.989       7.579   .013
                         Lower-bound            306.989                   1.000    306.989       7.579   .013
TIME * GROUP             Sphericity Assumed     260.139                   1        260.139       6.423   .021
                         Greenhouse-Geisser     260.139                   1.000    260.139       6.423   .021
                         Huynh-Feldt            260.139                   1.000    260.139       6.423   .021
                         Lower-bound            260.139                   1.000    260.139       6.423   .021
Error(TIME)              Sphericity Assumed     729.057                  18         40.503
                         Greenhouse-Geisser     729.057                  18.000     40.503
                         Huynh-Feldt            729.057                  18.000     40.503
                         Lower-bound            729.057                  18.000     40.503
IMAGERY                  Sphericity Assumed     883.031                   2        441.515       7.477   .002
                         Greenhouse-Geisser     883.031                   1.970    448.220       7.477   .002
                         Huynh-Feldt            883.031                   2.000    441.515       7.477   .002
                         Lower-bound            883.031                   1.000    883.031       7.477   .014
IMAGERY * GROUP          Sphericity Assumed     781.951                   2        390.975       6.621   .004
                         Greenhouse-Geisser     781.951                   1.970    396.912       6.621   .004
                         Huynh-Feldt            781.951                   2.000    390.975       6.621   .004
                         Lower-bound            781.951                   1.000    781.951       6.621   .019
Error(IMAGERY)           Sphericity Assumed    2125.687                  36         59.047
                         Greenhouse-Geisser    2125.687                  35.462     59.943
                         Huynh-Feldt           2125.687                  36.000     59.047
                         Lower-bound           2125.687                  18.000    118.094
TIME * IMAGERY           Sphericity Assumed    1017.291                   2        508.646      10.786   .000
                         Greenhouse-Geisser    1017.291                   1.794    567.112      10.786   .000
                         Huynh-Feldt           1017.291                   2.000    508.646      10.786   .000
                         Lower-bound           1017.291                   1.000   1017.291      10.786   .004
TIME * IMAGERY * GROUP   Sphericity Assumed     758.698                   2        379.349       8.044   .001
                         Greenhouse-Geisser     758.698                   1.794    422.953       8.044   .002
                         Huynh-Feldt            758.698                   2.000    379.349       8.044   .001
                         Lower-bound            758.698                   1.000    758.698       8.044   .011
Error(TIME*IMAGERY)      Sphericity Assumed    1697.758                  36         47.160
                         Greenhouse-Geisser    1697.758                  32.289     52.581
                         Huynh-Feldt           1697.758                  36.000     47.160
                         Lower-bound           1697.758                  18.000     94.320

Levene's Test of Equality of Error Variances(a)
                                    F        df1   df2   Sig.
Arousal Before Positive Imagery       .597   1     18    .450
Arousal Before Neutral Imagery        .140   1     18    .713
Arousal Before Negative Imagery     10.670   1     18    .004
Arousal After Positive Imagery        .169   1     18    .686
Arousal After Neutral Imagery         .017   1     18    .898
Arousal After Negative Imagery        .003   1     18    .954
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept+GROUP
   Within Subjects Design: TIME+IMAGERY+TIME*IMAGERY

Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average
Source      Type III Sum of Squares   df   Mean Square   F        Sig.
Intercept   2618.947                   1   2618.947      58.763   .000
GROUP        821.611                   1    821.611      18.435   .000
Error        802.225                  18     44.568

[Figure 1: Mean arousal before and after imagery. Axes: Mean Arousal by Time (Before Imagery, After Imagery).]

[Figure 2: Mean arousal for statistics lecturers and students. Axes: Mean Arousal by Group (Statistics Lecturers, Students).]

[Figure 3: Mean arousal for different types of imagery. Axes: Mean Arousal by Type of Imagery (Negative, Neutral, Positive).]

[Figure 4: Mean arousal before and after imagery in statistics lecturers and students. Axes: Mean Arousal by Time (Before Imagery, After Imagery); separate lines for Group (Statistics Lecturers, Students).]

[Figure 5: Mean arousal after different types of imagery in statistics lecturers and students. Axes: Mean Arousal by imagery type (Negative, Neutral, Positive); separate lines for Group (Statistics Lecturers, Students).]

[Figure 6: Mean arousal before and after different types of imagery. Axes: Mean Arousal by Time (Before Imagery, After Imagery); separate lines for Imagery (Negative, Neutral, Positive).]
[Figure 7: Mean arousal before and after different types of imagery in statistics lecturers and students. Panels: Statistics Lecturers, Students. Axes: Mean Arousal by Time (Before Imagery, After Imagery); separate lines for Imagery (Negative, Neutral, Positive).]

3. A study was carried out to explore the relationship between aggression and several potential predicting factors in 300 children that had an older sibling. Variables measured were Parenting Style (high score = strict, low score = liberal), Computer Games (high score = more time spent playing computer games), Television (high score = more time spent watching television), E-Numbers (high score = more e-numbers in the child’s diet), and Sibling Aggression (high score = more aggression seen in their older sibling). The SPSS Output is reproduced after the question.

(a) What is a bootstrap confidence interval and when would you use one? [3 marks]

(b) What factors predict aggression and which do not (quote the relevant statistics)? Which is the most substantial predictor? [6 marks]

(c) The R² statistic is the squared correlation coefficient between which two variables? How would you interpret the four values of R² in this output? [4 marks]

(d) What assumption does the Durbin-Watson statistic help us to assess? Describe what you understand the assumption to mean and whether it has been met in these data. [3 marks]

(e) What assumption does the scatterplot in this output assess? Describe what you understand the assumption to mean and whether it has been met in these data. [4 marks]

Variables Entered/Removed(a)
Model   Variables Entered                        Variables Removed   Method
1       Sibling Aggression, Parenting Style(b)   .                   Enter
2       Computer Games(b)                        .                   Enter
3       E-Numbers(b)                             .                   Enter
4       Television(b)                            .                   Enter
a. Dependent Variable: Aggression
b. All requested variables entered.

Model Summary(e)
                                                                                Change Statistics
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change   Durbin-Watson
1       .667(a)   .444       .441                8.57290                      .444              118.815    2     297   .000
2       .906(b)   .820       .818                4.88508                      .376              618.680    1     296   .000
3       .971(c)   .944       .943                2.73580                      .124              648.769    1     295   .000
4       .971(d)   .944       .943                2.74042                      .000                 .005    1     294   .941            1.981
a. Predictors: (Constant), Sibling Aggression, Parenting Style
b. Predictors: (Constant), Sibling Aggression, Parenting Style, Computer Games
c. Predictors: (Constant), Sibling Aggression, Parenting Style, Computer Games, E-Numbers
d. Predictors: (Constant), Sibling Aggression, Parenting Style, Computer Games, E-Numbers, Television
e. Dependent Variable: Aggression

ANOVA(a)
Model                Sum of Squares   df    Mean Square   F          Sig.
1       Regression   17464.455          2    8732.227      118.815   .000(b)
        Residual     21827.892        297      73.495
        Total        39292.347        299
2       Regression   32228.612          3   10742.871      450.171   .000(c)
        Residual      7063.734        296      23.864
        Total        39292.347        299
3       Regression   37084.389          4    9271.097     1238.689   .000(d)
        Residual      2207.958        295       7.485
        Total        39292.347        299
4       Regression   37084.430          5    7416.886      987.612   .000(e)
        Residual      2207.917        294       7.510
        Total        39292.347        299
a. Dependent Variable: Aggression
b. Predictors: (Constant), Parenting Style, Sibling Aggression
c. Predictors: (Constant), Parenting Style, Sibling Aggression, Computer Games
d. Predictors: (Constant), Parenting Style, Sibling Aggression, Computer Games, E-Numbers
e. Predictors: (Constant), Parenting Style, Sibling Aggression, Computer Games, E-Numbers, Television

Coefficients(a)
                             Unstandardized Coefficients   Standardized Coefficients                       95.0% Confidence Interval for B   Collinearity Statistics
Model                        B         Std. Error          Beta                        t         Sig.    Lower Bound   Upper Bound         Tolerance   VIF
1       (Constant)           44.809    3.353                                           13.362    .000    38.210        51.408
        Sibling Aggression     .278     .034                 .408                       8.263    .000      .211          .344              .766        1.305
        Parenting Style      -1.137     .154                -.365                      -7.395    .000    -1.440         -.834              .766        1.305
2       (Constant)           61.898    2.031                                           30.482    .000    57.902        65.894
        Sibling Aggression    -.165     .026                -.243                      -6.323    .000     -.217         -.114              .411        2.434
        Parenting Style      -3.792     .138               -1.218                     -27.460    .000    -4.063        -3.520              .308        3.242
        Computer Games        2.173     .087                 .995                      24.873    .000     2.001         2.345              .379        2.636
3       (Constant)           64.246    1.141                                           56.310    .000    62.001        66.491
        Sibling Aggression    -.298     .016                -.439                     -19.197    .000     -.329         -.268              .364        2.745
        Parenting Style      -4.346     .080               -1.397                     -54.098    .000    -4.504        -4.188              .286        3.499
        Computer Games        2.489     .050                1.140                      49.313    .000     2.390         2.588              .356        2.805
        E-Numbers              .151     .006                 .373                      25.471    .000      .139          .162              .887        1.128
4       (Constant)           64.029    3.148                                           20.342    .000    57.835        70.224
        Sibling Aggression    -.299     .016                -.439                     -18.698    .000     -.330         -.267              .346        2.888
        Parenting Style      -4.347     .082               -1.397                     -52.769    .000    -4.509        -4.185              .273        3.667
        Computer Games        2.488     .053                1.139                      46.817    .000     2.383         2.593              .323        3.099
        E-Numbers              .151     .006                 .373                      25.353    .000      .139          .162              .882        1.134
        Television             .011     .151                 .001                        .074    .941     -.286          .308              .552        1.813
a. Dependent Variable: Aggression

Bootstrap for Coefficients
                                       Bootstrap(a)
                                                                                    BCa 95% Confidence Interval
Model                        B         Bias           Std. Error   Sig. (2-tailed)   Lower      Upper
1       (Constant)           44.809    -.022          2.993        .001              38.584     50.515
        Sibling Aggression     .278    -.001           .030        .001                .219       .337
        Parenting Style      -1.137     .002           .141        .001              -1.416      -.844
2       (Constant)           61.898     .008          1.973        .001              58.366     65.620
        Sibling Aggression    -.165    -.001           .024        .001               -.213      -.122
        Parenting Style      -3.792     .004           .125        .001              -4.044     -3.531
        Computer Games        2.173    -.002           .073        .001               2.041      2.309
3       (Constant)           64.246     .036          1.118        .001              62.063     66.639
        Sibling Aggression    -.298     .000           .015        .001               -.328      -.269
        Parenting Style      -4.346     .000           .083        .001              -4.515     -4.175
        Computer Games        2.489    -.001           .051        .001               2.392      2.585
        E-Numbers              .151     .000           .006        .001                .138       .162
4       (Constant)           64.029     .353          2.965        .001              58.652     71.907
        Sibling Aggression    -.299    -3.506E-005     .015        .001               -.331      -.268
        Parenting Style      -4.347     .001           .086        .001              -4.521     -4.169
        Computer Games        2.488     .001           .053        .001               2.384      2.595
        E-Numbers              .151    -1.668E-006     .006        .001                .138       .162
        Television             .011    -.016           .146        .933               -.287       .228
a. Unless otherwise noted, bootstrap results are based on 1000 bootstrap samples

Casewise Diagnostics(a)
Case Number   Std. Residual   Aggression   Predicted Value   Residual
 13           -2.848          55.00        62.8040           -7.80401
 14           -2.008          30.00        35.5032           -5.50324
 20            2.586          55.00        47.9134            7.08661
 71           -2.202          26.00        32.0345           -6.03450
 79            2.038          68.00        62.4151            5.58493
128           -2.708          49.00        56.4198           -7.41976
131            2.272          39.00        32.7742            6.22575
171           -2.104          21.00        26.7668           -5.76676
205            2.327          42.00        35.6236            6.37638
208           -2.244          30.00        36.1485           -6.14850
222           -2.328          43.00        49.3796           -6.37960
249            2.075          44.00        38.3148            5.68517
279            2.295          32.00        25.7114            6.28862
298            2.148          46.00        40.1137            5.88631
a. Dependent Variable: Aggression

FORMULAE

\[ \hat{d} = \frac{\bar{X}_{\text{Experimental}} - \bar{X}_{\text{Control}}}{s_{\text{Control}}} \]

\[ \hat{d} = \frac{\bar{X}_1 - \bar{X}_2}{s_p} \qquad s_p = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}} \]

\[ \text{MS} = \frac{\text{SS}}{\text{df}} \qquad F = \frac{\text{MS}_M}{\text{MS}_R} \qquad R^2 = \frac{\text{SS}_M}{\text{SS}_T} \]

\[ r = \sqrt{\frac{t^2}{t^2 + \text{df}}} \qquad r = \sqrt{\frac{F(1, x)}{F(1, x) + \text{df}_R}} \]

\[ \text{df}_T = N - 1 \qquad \text{df}_M = k - 1 \qquad \text{df}_R = \text{df}_T - \text{df}_M \]

\[ z = \frac{X - \bar{X}}{s} \]

END OF PAPER
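As a hedged illustration of how the formulae above are applied (a sketch, not a model answer), the following Python snippet plugs figures from the Question 1 output into the effect-size and ANOVA formulae. The function and variable names are hypothetical and chosen only for this example; the identity SS_R = SS_T - SS_M is the standard ANOVA partition and is assumed here because it does not appear on the formula sheet.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation s_p of two independent groups."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def cohens_d_control(mean_exp, mean_ctrl, s_ctrl):
    """Cohen's d standardised by the control group's standard deviation."""
    return (mean_exp - mean_ctrl) / s_ctrl


def cohens_d_pooled(mean1, mean2, n1, s1, n2, s2):
    """Cohen's d standardised by the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(n1, s1, n2, s2)


# Descriptives from the Question 1 output (Satan vs. No Message).
satan_n, satan_mean, satan_sd = 6, 9.33, 1.033
ctrl_n, ctrl_mean, ctrl_sd = 7, 15.71, 2.690

d_control = cohens_d_control(satan_mean, ctrl_mean, ctrl_sd)
d_pooled = cohens_d_pooled(satan_mean, ctrl_mean,
                           satan_n, satan_sd, ctrl_n, ctrl_sd)

# ANOVA table entries from the Question 1 output.
ss_model, df_model = 247.381, 4
ss_total, df_total = 488.750, 35

ss_resid = ss_total - ss_model   # assumed partition: SS_R = SS_T - SS_M
df_resid = df_total - df_model   # df_R = df_T - df_M
ms_model = ss_model / df_model   # MS = SS / df
ms_resid = ss_resid / df_resid
f_ratio = ms_model / ms_resid    # F = MS_M / MS_R
r_squared = ss_model / ss_total  # R^2 = SS_M / SS_T

print(f"d (control SD): {d_control:.2f}")
print(f"d (pooled SD):  {d_pooled:.2f}")
print(f"SS_R = {ss_resid:.3f}, MS_R = {ms_resid:.3f}, F = {f_ratio:.2f}")
print(f"R^2 = {r_squared:.3f}")
```

Which denominator to use for d is a judgement call: the control-group standard deviation is a natural choice when a clear control condition exists, while the pooled estimate uses information from both groups.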