The Effect of Number of Trials and Stimulus Set on the Psychometric Qualities of the Affective Misattribution Procedure

Yoav Bar-Anan
Ben-Gurion University of the Negev, Be’er Sheva

Brian A. Nosek
University of Virginia

A follow-up study to "A Comparative Investigation of Seven Indirect Attitude Measures" by Yoav Bar-Anan and Brian A. Nosek

To cite this manuscript: Bar-Anan, Y., & Nosek, B. A. (2013). The Effect of Number of Trials and Stimulus Set on the Psychometric Qualities of the Affective Misattribution Procedure. Open Science Framework, bHNd2. http://openscienceframework.org/project/bHNd2/

Five years after the study described in the main manuscript (Bar-Anan & Nosek, 2013), we conducted a smaller follow-up study to address particular questions from the main study about the Affective Misattribution Procedure (AMP; Payne et al., 2005). In the main study, the AMP showed some promising qualities: good internal consistency, fair convergent validity, and the best discriminant validity for single-category measurement. However, the AMP often trailed behind many of the other measures and was very sensitive to the removal of participants with extreme scores.

The follow-up study examined three procedural features that might improve the AMP’s psychometric qualities. The main study had used the most common practices at the time of its design. First, it used fewer critical trials for the AMP (48 trials) than for the other indirect measures. In most cases, increasing the number of trials can improve the reliability and validity of a measure. In the follow-up study, we used 120 critical trials (a common number of critical trials in the Implicit Association Test, or IAT; Greenwald, McGhee, & Schwartz, 1998) and matched the number of trials across the indirect measures used for comparison. Second, we had selected the stimuli for the main study based on acceptable results with those stimuli in previous studies (e.g., Bar-Anan et al., 2009; Nosek & Banaji, 2001).
However, we had never tested whether the stimuli are representative examples of their categories, nor had we tried to balance them on any objective criteria (e.g., facial expression). It is possible that the selection of stimuli has more impact on measures that are more sensitive to the items than to the categories (i.e., the AMP and the Evaluative Priming Task, or EPT; Fazio, Sanbonmatsu, Powell, & Kardes, 1986). Therefore, the follow-up study experimentally compared the stimuli from the main study with stimuli selected especially for the AMP. Finally, published studies have used several different stimulus-presentation durations in the AMP. To generalize our results, the follow-up study used a different variation from the one used in the main study.

Method

A total of 2,563 participants (1431 women, 756 men, 15 of unknown gender; Mage = 23.3; SDage = 11.9) completed the study online, following a procedure similar to the one described in the main article. Each participant completed, in a random order, two of the four indirect measures and all the direct measures. All measures assessed race attitudes.

Stimuli. We used two stimulus sets, with each participant randomly assigned to one of them (i.e., the stimulus set was a between-participant factor). One set consisted of the same 12 stimuli used in the main study (the NBA stimuli). The other set consisted of 24 images of Black and White men used in the studies that first presented the AMP (Payne et al., 2005). According to Payne et al., all these images had neutral expressions and were matched on attractiveness ratings.

The indirect measures were the IAT, the Brief IAT (BIAT; Sriram & Greenwald, 2009), the EPT, and the AMP. The task procedures were identical to those of the main study, with the following modifications.

AMP. We used three critical blocks (after the practice block), each with 40 trials: 20 with Black people primes and 20 with White people primes (there were no neutral primes).
Following a number of published studies (e.g., Bar-Anan & Nosek, 2012; Payne, Burkley, & Stokes, 2008), the trial sequence included four screens that appeared in succession: the prime (100 ms), a blank screen (100 ms), the target (100 ms), and then the mask (displayed until response).

BIAT. We removed four trials from each of the first two critical blocks, to reduce the number of critical trials from 128 to 120. Additionally, whereas in the main study four of the critical blocks used Good words as the focal attribute category and four used Bad words, in the follow-up study all eight critical blocks of the BIAT used Good words as the focal attribute category. This modification was based on conclusions from a separate investigation of the best practices for the BIAT procedure, which used data from the main study (Nosek, Bar-Anan, Sriram, & Greenwald, 2013). That investigation confirmed previous findings (Sriram & Greenwald, 2009) that the BIAT is less reliable and valid when the focal attribute category is negative than when it is positive.

EPT. We removed 20 trials from each of the three critical blocks, to reduce the number of critical trials from 180 to 120 and make the EPT comparable to the other indirect measures. Target words were all adjectives.

Direct measures. In addition to the thermometer, the preference question, and the individual item ratings from the main study, participants also completed a modified version of the MRS. Based on findings from the main study, we selected the four items that were most strongly related to the indirect measures and added four similar items that showed the strongest relation to indirect measures in another study (Motyl, Schmidt, & Nosek, 2013).

Results

Data treatment and score computation were identical to the main study. Table 1 summarizes the main results and Table 2 presents more details.
For AMP sessions that used the NBA stimuli, most of the results for the AMP score computed from only the first 48 trials closely replicated the results of the main study. The internal consistency was .66 (.69 in the main study), the average correlation with direct measures was .32 (.31 in the main study), and the average correlation with indirect measures was .27 (.21 in the main study). These psychometric indices improved when we computed the score from the full task (120 trials; NBA stimuli only). The improvement was most pronounced for the internal consistency (increased to .83) and the average correlation with the direct measures (increased to .39), and only slight for the average correlation with the indirect measures (increased to .31).

Table 1
Summary Results

            All stimulus sets   NBA stimuli       Payne’s stimuli     All stimulus sets, without extreme 10%
White-preference (effect size)
  IAT       0.75                0.75              0.75
  BIAT      0.29                0.29              0.32
  EPT       0.11                0.02              0.21
  AMP       -0.14               -0.35 (-0.07)     0.00
Internal consistency
  IAT       .86                 .86               .86                 .78 (13%)
  BIAT      .85                 .85               .84                 .76 (14%)
  EPT       .37                 .39               .33                 -.08 (13%)
  AMP       .85                 .83 (.66)         .86                 .51 (46%)
Mean correlation with direct measures
  IAT       .23                 .25               .23                 .18 (2%)
  BIAT      .32                 .31               .36                 .25 (4%)
  EPT       .13                 .11               .13                 .09 (1%)
  AMP       .39                 .39 (.30)         .39                 .22 (10%)
Mean correlation with indirect measures
  IAT       .29                 .31               .26                 .24 (3%)
  BIAT      .41                 .41               .40                 .32 (7%)
  EPT       .23                 .23               .23                 .11 (4%)
  AMP       .28                 .31 (.27)         .23                 .17 (5%)

Notes. The NBA stimuli are the stimuli from the main study. Values in parentheses in the NBA stimuli column show the performance on that criterion of the AMP score computed from the first 48 trials of the task with the NBA stimulus set. The White-preference effect size is Cohen’s d, indicating the magnitude of the effect compared to 0 (no preference between Whites and Blacks). Without extreme 10% = without the 10% most extreme scores (% shared variance lost in parentheses).
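
The gain in internal consistency from 48 to 120 trials can be compared with what lengthening the task alone would predict. The sketch below applies the Spearman-Brown prophecy formula to the 48-trial internal consistency of .66. This is an illustrative check rather than an analysis reported by the authors, and it assumes that the added trials behave like the original ones (i.e., that the longer task is made of parallel parts). The function name is hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is lengthened by `length_factor`.

    Assumes the added trials are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Values taken from the text: internal consistency of .66 at 48 trials,
# and a full task of 120 trials.
alpha_48_trials = 0.66
length_factor = 120 / 48  # the full task has 2.5 times as many trials

projected = spearman_brown(alpha_48_trials, length_factor)
print(f"Projected internal consistency at 120 trials: {projected:.2f}")
```

Under these assumptions the projection is approximately .83, which is in line with the internal consistency observed for the full 120-trial task with the NBA stimuli.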
The stimulus set influenced the average preference score of the AMP and the EPT (Table 1), but had no significant effect on the psychometric qualities of any of the four measures. The only noticeable difference was that the AMP had a marginally stronger relationship with the IAT when using the NBA stimuli, r(168) = .314, than when using Payne’s stimuli, r(171) = .140, Fisher’s z = 1.68, p = .09. None of the other differences was close to significance. Given the number of comparisons, this single difference, which is significant only under a relaxed alpha criterion, is likely to be due to chance. Overall, these results suggest that the stimulus set does not affect the most important psychometric evaluation criteria for the indirect measures.

Table 2
Detailed results

Overall
                 IAT           BIAT          AMP           EP            Thermometer   Items         Questionnaire
Mean (SD)        0.30 (0.40)   0.11 (0.38)   -0.03 (0.21)  0.06 (0.53)   0.27 (1.44)   -0.34 (1.13)  2.65 (0.98)
Cronbach’s α     .86 (1318)    .85 (1261)    .85 (1152)    .37 (1257)                                .83 (2718)
IAT                            .48 (404)     .23 (339)     .13 (415)     .26 (1260)    .19 (1268)    .24 (1266)
BIAT                                         .39 (352)     .34 (355)     .34 (1192)    .30 (1200)    .33 (1196)
AMP                                                        .21 (349)     .40 (1088)    .49 (1087)    .27 (1089)
EP                                                                       .11 (1202)    .18 (1205)    .09 (1202)
Thermometer                                                                            .51 (2749)    .31 (2740)
Items                                                                                                .34 (2753)

NBA Set
                 IAT           BIAT          AMP           EP            Thermometer   Items         Questionnaire
Mean (SD)        0.30 (0.40)   0.11 (0.38)   -0.07 (0.20)  0.01 (0.54)   0.17 (1.43)   -0.73 (1.04)  2.64 (0.98)
Cronbach’s α     .86 (657)     .85 (670)     .83 (598)     .39 (648)                   .79 (1325)    .83 (1408)
IAT                            .47 (214)     .31 (168)     .14 (200)     .27 (626)     .20 (631)     .27 (630)
BIAT                                         .42 (180)     .34 (191)     .33 (635)     .25 (640)     .34 (636)
AMP                                                        .21 (187)     .40 (558)     .49 (556)     .25 (558)
EP                                                                       .09 (624)     .12 (625)     .12 (624)
Thermometer                                                                            .48 (1422)    .32 (1419)
Items                                                                                                .35 (1425)
AMP (48 trials)  .27 (164)     .29 (178)                   .24 (184)     .32 (549)     .38 (546)     .19 (548)

Payne’s Set
                 IAT           BIAT          AMP           EP            Thermometer   Items         Questionnaire
Mean (SD)        0.30 (0.40)   0.12 (0.37)   0.00 (0.22)   0.11 (0.53)   0.38 (1.44)   0.08 (1.08)   2.66 (0.98)
Cronbach’s α     .86 (661)     .84 (591)     .86 (554)     .33 (609)                   .73 (1372)    .84 (1310)
IAT                            .50 (190)     .14 (171)     .13 (215)     .24 (634)     .24 (637)     .21 (636)
BIAT                                         .35 (172)     .36 (164)     .35 (557)     .39 (560)     .33 (560)
AMP                                                        .19 (162)     .39 (530)     .45 (531)     .30 (531)
EP                                                                       .13 (578)     .18 (580)     .06 (578)
Thermometer                                                                            .56 (1327)    .30 (1321)
Items                                                                                                .38 (1328)

Notes. For correlations and internal consistency, the relevant N is in parentheses.

Finally, as in the main study, the AMP suffered the most among the indirect measures from the removal of the 10% most extreme scores. However, perhaps because of its overall improvement with more trials, the AMP without the 10% most extreme scores still showed moderately good psychometric qualities, in comparison to the poor psychometric qualities of the AMP in the main study.

In summary, the results of the follow-up study generally replicated the results of the main study and confirmed its conclusions. As could be expected, the results also suggest that adding trials to the AMP can improve its psychometric qualities.

References

Bar-Anan, Y., Nosek, B. A., & Vianello, M. (2009). The sorting paired features task: A measure of association strengths. Experimental Psychology, 56, 329-343.
Bar-Anan, Y., & Nosek, B. A. (2012). Reporting intentional rating of the primes predicts priming effects in the affective misattribution procedure. Personality and Social Psychology Bulletin, 38, 1193-1207.
Bar-Anan, Y., & Nosek, B. A. (2013). A comparative investigation of seven indirect attitude measures.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. K. L. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464-1480.
Motyl, M., Schmidt, K., & Nosek, B. A. (2013). Unpublished data.
Nosek, B. A., & Banaji, M. R. (2001). The Go/No-Go Association Task. Social Cognition, 19, 625-666.
Nosek, B. A., Bar-Anan, Y., Sriram, N., & Greenwald, A. G. (2013). Understanding and using the brief Implicit Association Test: I. Recommended scoring procedures. Unpublished manuscript.
Payne, B. K., Burkley, M. A., & Stokes, M. B. (2008). Why do implicit and explicit attitude tests diverge? The role of structural fit. Journal of Personality and Social Psychology, 94, 16-31.
Payne, B. K., Cheng, C. M., Govorun, O., & Stewart, B. D. (2005). An inkblot for attitudes: Affect misattribution as indirect measurement. Journal of Personality and Social Psychology, 89, 277-293.
Sriram, N., & Greenwald, A. G. (2009). The brief implicit association test. Experimental Psychology, 56, 283-294.
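
Supplementary note. The comparison of the AMP-IAT correlations across the two stimulus sets reported in the Results (.314 with the NBA stimuli versus .140 with Payne’s stimuli) can be approximately reproduced with Fisher’s r-to-z test for two independent correlations. The sketch below is an illustration, not the authors’ analysis code. It assumes that the values 168 and 171 are the sample sizes of the two groups, which matches the Ns listed in Table 2, and the function name is hypothetical.

```python
import math


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and the two-tailed p value.
    """
    z1 = math.atanh(r1)  # Fisher transformation of the first correlation
    z2 = math.atanh(r2)  # Fisher transformation of the second correlation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Correlations from the Results; sample sizes assumed to be 168 and 171.
z, p = compare_independent_correlations(0.314, 168, 0.140, 171)
print(f"Fisher's z = {z:.2f}, two-tailed p = {p:.2f}")
```

With these inputs the sketch gives z of about 1.68 and p of about .09, consistent with the values reported in the text.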