Semantic Affect Misattributions?

Department of Clinical Neuroscience
Psychology Programme, semester 6
Main subject: Psychology
Degree project (C-level) in psychology (2PS013), 15 credits
Spring term 2012
Semantic Affect Misattributions?
A semantic approach to the Affect Misattribution Procedure
Niklas Lanbeck
Supervisors: Andreas Olsson, Department of Clinical Neuroscience
Sverker Sikström, Department of Psychology, Lund University
Examiner: Professor Petter Gustavsson, Department of Clinical Neuroscience
Abstract
Affect misattributions happen when people mistake the source of their reactions. This effect is
exploited by the Affect Misattribution Procedure (AMP), which is able to assess attitudes by
means of evaluative priming of ambiguous targets. The effects are generally large and robust,
persisting despite deliberate attempts at response control. Results of implicit and explicit attitude
measures are typically both convergent and divergent. This study used a modified AMP
where instead of valence ratings, the dependent variable was participants’ (N = 42) spontaneous
associations to Chinese pictographs and in-group/out-group neutral faces. Responses were
analyzed with Latent Semantic Analysis (LSA) and subjected to a recently proposed
quantitative method of semantic probability testing. Results of prior AMPs were replicated,
supporting the validity of semantic testing. In addition, valence for responses was coherently
predicted by conditions.
Keywords: affect misattribution, implicit attitudes, latent semantic analysis, quantitative
semantics, semantic tests
Introduction
Background
The Affect Misattribution Procedure (AMP, Payne, Cheng, Govorun, & Stewart, 2005)
is a procedure developed to assess attitudes indirectly. Participants are briefly presented with
a prime followed by an ambiguous Chinese character (pictograph), and are asked to rate the
pleasantness of the pictograph. Evaluations of targets are assumed to be influenced by automatic
affective reactions depending on prime valence, since effects persist even when subjects are
instructed to ignore the primes. This is used as a measure of attitudes towards primes, e.g. if
ratings of pictographs following out-group faces are consistently lower, one might infer a
negative out-group attitude. Effects are seen both at the group level and as individual differences in
scores, providing good reliability in terms of internal consistency, the lack of which is a common shortcoming of
other indirect and implicit measures (De Houwer & Houwer, 2006; Lebel & Paunonen, 2011).
Another advantage is its ease of construction and administration.
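The scoring logic described above can be sketched in a few lines. This is a minimal illustration with made-up trial data, not the scoring used by Payne et al. (2005); the function name `amp_score`, the tuple layout and the rating scale are assumptions introduced here for illustration only.

```python
from statistics import mean


def amp_score(trials):
    """Mean target rating after in-group primes minus mean after out-group primes.

    `trials` is a list of (prime_group, rating) tuples, where prime_group is
    "in" or "out" and rating is the pleasantness rating given to the pictograph.
    A positive score means targets were rated lower after out-group primes.
    """
    in_ratings = [rating for group, rating in trials if group == "in"]
    out_ratings = [rating for group, rating in trials if group == "out"]
    if not in_ratings or not out_ratings:
        raise ValueError("need at least one trial for each prime group")
    return mean(in_ratings) - mean(out_ratings)


# Hypothetical data on an illustrative 1 (unpleasant) to 4 (pleasant) scale.
example_trials = [("in", 3), ("out", 2), ("in", 4), ("out", 2), ("in", 3), ("out", 3)]
print(amp_score(example_trials))
```

Computed per participant, such a difference score is what would then be compared across conditions or correlated with explicit measures.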
Despite the robustness, questions have been raised about the route mediating the
influence of prime valence. It has been suggested that prime-congruent semantic concepts
may be activated in working memory and that evaluations are guided by the valence of these
concepts rather than affective reactions (Blaison, Imhoff, Hühnel, Hess, & Banse, 2012). For example,
a pleasant reaction to seeing a puppy may reflect not affect but the valence of the semantic
representation and the concepts connected to it. Even if this semantic route is not the
primary mediator, semantic processing is likely involved in evaluative responses, or at least
influenced by the processes leading to them (Storbeck & Clore, 2007). Here we approach
affect misattributions with Latent Semantic Analysis (LSA, Landauer, Foltz, & Laham, 1998),
a method based on the semantic relatedness of words as measured by co-occurrence in a large
number of contexts. Hoping to shed some light on semantic involvement, an open-ended
semantic dependent variable was implemented in the AMP. The direct evaluation of stimuli
was replaced with semantic associations, whose content in response to different classes of
stimuli is analyzed for similarity or difference. We begin with an overview of
assumptions related to concepts of implicitness and attitudes to establish a theoretical
framework.
Implicit Measures
Since the postulation of mental processes beyond the reach of conscious awareness and
control, measurement and assessment of these have been attempted in a number of ways (e.g.
Dutton & Aron, 1974; Schwarz & Clore, 1983). These processes have often been categorized
as implicit, as opposed to explicit processes, which are readily available for conscious report. Though
this distinction might seem clear enough, a closer look at the implications of this dichotomy
may be useful for the present investigation.
Fazio and Olson (2003) pointed out that the terms were adopted from cognitive
psychology to the domains of social psychology and social cognition (for an overview, see
Schacter, 1987). They referred to Greenwald and Banaji (1995), who suggested that “[t]he
signature of implicit cognition is that traces of past experience affect some performance, even
though the influential earlier experience is not remembered in the usual sense—that is, it is
unavailable to self-report or introspection.” Greenwald and Banaji (1995) reported choosing
this distinction rather than aware-unaware, conscious-unconscious or similar “[…] because of
that dichotomy's prominence in recent memory research, coupled with the present intention to
connect research on attitudes, self-esteem, and stereotypes to memory research.” (p. 4).
Greenwald and Banaji further suggested a template for definition of specific categories
of implicit cognition, where an implicit construct is the introspectively unidentified or
inaccurately identified trace of past experience that mediates a category of responses. Applied
to attitudes, some stimuli would activate a prior evaluation of a group and, escaping explicit
awareness of this process, influence the subsequent response to the stimuli. It should be noted
that this is no guarantee for exclusively implicit storage or activation of an attitude, nor that
the same attitude could not be consciously accessed and explicitly expressed given the right
combination of input and mental states.
Supposedly, when an attitude is implicitly activated and thus not detected by awareness,
it may happen that the activated attitude influences evaluation of another object, or even that
the attitude is misattributed to the later object. In Greenwald and Banaji’s (1995) words, “[a]n
implicit attitude can be thought of as an existing attitude projected onto a novel object.” (p. 5).
Fazio and Olson (2003) argued that, because implicit procedures lack proof of
unawareness and there is no evidence for separate implicit and explicit representations in
memory, it is the measure that should be viewed as implicit, not the attitude itself. An
implicit measure of a construct is implicit because the method of retrieval sidesteps conscious
filtering to some extent, capturing reactions which would not be available to direct inquiry.
Implicit measures can estimate effects of attitudes that participants may or may not be
aware of. Even if these estimates differ from explicit, directly reported ones,
significant overlap is still possible.
Attitudes
How, then, should attitudes be conceptualized? An early tentative definition in a
behaviorist framework read “an implicit, drive-producing response considered socially
significant in the individual’s society” (Doob, 1947, p. 136). Other research showed that
spatiotemporal pairing of valenced stimuli with neutral stimuli affected subsequent
evaluations of the formerly neutral stimuli, which lent credibility to the notion that attitudes
are a kind of evaluative conditioning (Staats & Staats, 1958). Other conceptions include
acquired behavioral dispositions, object-evaluation memory associations, momentary “on the
spot” constructions and states of pattern activation (see Eagly & Chaiken, 1993, for a
summary).
Interestingly, much of the later research on the subject has been conducted without a
consensual definition. This approach is epitomized in the following quote from a seminal
volume on the subject: “Although this book deals with the entire domain of the psychology of
attitudes, it announces no general theory of attitudes.” (Eagly & Chaiken, 1993, p. 692).
Instead, they developed an abstract definition of attitude as “a psychological tendency, that is
expressed by evaluating a particular entity with favor or disfavor” (Eagly & Chaiken, 1993, p.
1). This statement was purposely non-restrictive, to include most contemporary definitions of
attitudes. While not specifying the “inner tendencies” at work (the specifics of which much
academic debate has revolved around), the focus was on creating a research framework in
which continuously changing approaches could be accommodated (Eagly & Chaiken, 2007).
As this “umbrella” definition has been successful in finding a lowest common denominator
for several distinct research projects and allows for consensual discussion of the subject
(Gawronski, 2007), no further distinction is necessary for the present research.
Projection
The use of a projective measure might seem controversial. Despite an era of empirical
evidence in disfavor of most projective measures, their use in clinical and forensic settings has
been remarkably resilient (Lilienfeld, Wood, & Garb, 2000), perhaps, as Payne et al. (2005)
noted, because they make intuitive sense. The logic in projective measures is that people
presented with an ambiguous stimulus, given a sufficiently wide scope of response options,
will respond in a way influenced by their personality. In disambiguating unstructured stimuli,
respondents project aspects of themselves and interpreting responses can be seen as
attempting to reverse engineer the influence of inherent constructs. Disappointingly, only a
few indexes derived from classic variants of such measures have demonstrated acceptable
validity (Lilienfeld et al., 2000).
Projection might be considered a case of misattribution, that is, mistaking the source of an
effect. Such source confusion is a common feature of multi-determined events in everyday
life and correction demands motivation, awareness and control of the attributional bias
(Loersch & Payne, 2012; Wilson & Brekke, 1994). In a projective task, the self is the true
source but the effect of the self is wrongly attributed to some external (ambiguous) stimulus;
this is the so-called projective hypothesis (Lilienfeld et al., 2000). If the projective hypothesis
holds and the results of interpreting the ambiguous stimulus in any way reflect stable,
inherent traits, then some state resulting from (or related to) such traits should be responsible
for mediating responses.
Payne and colleagues (2005) realized that affective priming could be exploited in
combination with a projective task. Research by Zajonc (1980) had resulted in the affective
primacy hypothesis, which holds that affective reactions are partially independent from “cold”
cognitive processes, supported e.g. by experimental data which showed that affective ratings
of valence could be demonstrated in the absence of recognition memory of the same stimuli,
previously presented in a visually degraded form. This was developed further by Murphy and
Zajonc (1993) who recorded priming effects of affective but not cognitive stimuli under
suboptimal conditions, namely extremely brief stimulus duration. At longer durations only
cognitive priming was evident. This further supported the affective primacy hypothesis and
paved the way for Payne et al. (2005) in using this misattributed core affect (Russell, 2003) to
assess attitudes. Affect is not the only component of an attitude, but one of the principal ones,
enough so that if affect in response to a prime is misattributed to a neutral target, then an
attitude towards the prime stimulus likely exists. The affect-arousing property of the prime
stimulus may indeed be considered a partial attitude even when no beliefs or behavioral
dispositions are present, since the operational definition of attitudes here is an evaluative
opinion of an object. Hence it may be that no attitude towards the object is available for
explicit report (it may e.g. have been conditioned to another attitude object) but an implicit
attitude can still be inferred. It should also be noted that misattributions can only be used as
measures of attitudes when participants fail in monitoring and controlling the influence of
attitudinal objects (Winkielman, Zajonc, & Schwarz, 1997).
Affect Misattribution
Since Murphy and Zajonc (1993) had already shown subliminal prime presentation to be effective in eliciting affective
responses, Payne et al. (2005) opted for supraliminal, visually
presented primes to study priming effects when attempts at correcting for these effects would
be possible for subjects. The design also included a warning condition in which participants
were explicitly instructed to avoid letting the prime influence their ratings of the pictographs.
As previous research had shown that awareness of biasing influences eliminates or reverses
such bias (Martin, Seta, & Crelia, 1990), warning-resistant prime effects would be indicative
of true misattributions.
The resulting Affect Misattribution Procedure (AMP) was validated in six experiments
exploring variations in timing and priming materials. Experiments 1 and 2 established effects
of prime valence on participants’ evaluations, even following the explicit warning against
biasing influence. Experiments 3 and 4 varied the prime and target duration and the stimulus-onset asynchrony (SOA), i.e. the interval between prime and target. No elimination of priming
effects was found at longer exposures, when participants would have more time to control
and regulate their responses, though a decline in effect size was seen at longer target
durations. In Experiment 5, self-reported voting intentions were predicted by priming with
pictures of candidates from the 2004 US elections, despite all participants being warned of bias.
Experiment 6 explored racial attitudes using pictures of neutral White and Black faces as
primes. Explicit racial attitudes were estimated with a “feeling thermometer”, an 11-point rating scale
of feelings towards Blacks, Whites, Asians and Hispanics. Scales regarding motivation to avoid and
control prejudice were also administered (see footnote 1). The AMP scores correlated positively with the feeling
thermometer measure of racial attitudes, but this correlation was moderated by motivation to control and avoid
prejudiced responses, such that participants highly motivated to avoid prejudice showed weaker
implicit-explicit correlations.
As intended, the procedure demonstrated good validity (Payne, Burkley, & Stokes,
2008; Payne et al., 2005) and its reliability is among the highest of implicit measures although
the number of studies replicating the results is small (Lebel & Paunonen, 2011). The AMP
voting predictions were replicated for the 2008 US elections, which offered the then
unique opportunity of directly assessing prejudice towards a Black presidential candidate
(Payne, Krosnick, et al., 2010). Explicit prejudice was associated with voting for McCain
rather than Obama. Scoring high on implicit prejudice (as assessed by the AMP) increased the
likelihood of not voting for Obama, but those high in implicit prejudice were not more likely to
vote for McCain. Implicit prejudice also uniquely predicted not voting at all and voting for a
third-party candidate (Payne, Krosnick, et al., 2010). AMP scores have also proven successful
in distinguishing smokers from nonsmokers; within-group differences among smokers were related
to individual differences in current smoking withdrawal (Carolina, Payne, Mcclernon, &
Dobbins, 2007). Addictive behavior (alcohol consumption) was found to be related to both
self-reported and implicit alcohol attitudes; in addition, AMP scores explained more
behavioral variance than, and variance beyond, explicit measures and were immune to the
influence of self-presentation (Payne, Govorun, & Arbuckle, 2008).
The research on affect misattribution continued with testing the fit of a multinomial
process model (Payne, Hall, Cameron, & Bishara, 2010). Relative contributions of the
theoretically separate components driving AMP: affective responses, target evaluations and
source confusion (misattribution) were calculated and the model’s predictions fitted the
results of variable manipulations well. The proposed formal model specified the contingencies
among parameters for when misattribution occurs; at longer pictograph durations the
misattribution rate declined (Payne, Hall, et al., 2010). The prolonged timeframe was
theorized to enable distinctions between reactions to prime and target and facilitate attribution
of affect to the true source. In another study, misattributions disappeared when participants
were asked to explicitly evaluate prime pleasantness prior to target evaluation (Oikawa, Aarts,
& Oikawa, 2011). When instructed to ignore primes, valence of primes influenced target
evaluations. In a third, non-affective response condition, the instruction was to indicate which
of two letters had been presented superimposed on the prime. This increased the proportion of
positive evaluations following positive compared to negative primes, indicating that active
processing directed at the prime is not enough to bind affect to it and prevent spillover to
target evaluation.
1. Both further described under Method below.
Semantic Misattribution
Deutsch and Gawronski (2009) used a semantic AMP version as a control condition for
testing the Response Interference (RI) account of underlying mechanisms in affective priming
(e.g. Gawronski, Deutsch, & Seidel, 2005). The RI hypothesis proposes that short term
associations between stimuli and response options interfere with affectively generated
response tendencies. Thus no RI should be involved in AMP due to the ambiguous targets
(not eliciting competing response tendencies). Using dual semantic primes (words of
incongruent vs congruent valence) for the Bona Fide Pipeline (BFP, Fazio, Jackson, Dunton,
& Williams, 1995) produced stronger priming effects with incongruent primes. The BFP task
consists of pressing keys denoting “good” and “bad” in response to words or images preceded
by primes, with response latencies as the outcome measure. This was contrasted with results
from AMP with dual primes, in which congruently valenced primes had an additive effect on
valence ratings. This was interpreted as the RI resulting from the contrast between context
prime and attended prime increasing the salience of the prime in BFP categorization. E.g., a
puppy would be more salient in the context of genocide than in the context of a petting zoo.
On the other hand, AMP-ratings are reinforced by evaluatively consistent primes, supporting
the notion that affective priming is additive (Murphy & Zajonc, 1993). The experiments were
repeated with animate vs inanimate semantic primes and animate/inanimate classifications as
outcome measure for both procedures. Results paralleled those of the first experiments, but in
addition an unexpected effect of prime valence was found on animate-inanimate AMP ratings.
That is, it was easier to distinguish between e.g. a puppy and a teddy bear than between a snake
and a hose. The authors pointed this out as a possible problem for applying AMP to non-evaluative measures, since “[if] non-evaluative responses to the neutral Chinese characters
can be influenced by judgment-irrelevant features of the primes, the resulting priming scores
could be systematically contaminated by contingent features of the employed prime stimuli”
(Deutsch & Gawronski, 2009, p. 110).
A slightly modified version of the AMP, termed the Semantic Misattribution Procedure
(SMP, Imhoff, Schmidt, Bernhardt, Dierksmeier, & Banse, 2010), was used to assess sexual
preference by priming with pictures of bodies of both genders, varying in sexual maturation;
participants were asked to rate the following pictographs as “sexual” or “not sexual.”
Sexual SMP ratings correlated more with AMP pleasantness ratings for male than female
raters, since female subjects rated women as more pleasant than men, but rated both genders
as equally sexual. The authors noted that despite the inclusion of “Semantic” in the title, the data were
insufficient to dissociate semantic priming from affective priming, and that both cognitive and
affective processes may be simultaneously active in either case (Imhoff et al., 2010).
An attempt at disentangling these processes and dissociating their effects was made in a
study using primes differing in semantic content and elicited affect (Blaison et al., 2012).
Based on numerous studies showing that angry faces elicit fear in individuals high in trait
social anxiety, and in individuals experiencing state anxiety, angry faces were used as
semantic-affective incongruence primes. Fearful faces, which have congruent content and
affect, constituted the control priming condition. Participants classified pictographs under
the labels “visually evokes fear” and “visually evokes anger.” Angry primes consistently elicited
more anger than fear evaluations of pictographs. Blaison et al. (2012) suggest these results
support semantic priming rather than affective, and argue that the processes underlying the
AMP be viewed as “cold” rather than “hot.”
Some doubts remain, as these modified AMPs employed distinct emotions (fear, anger)
rather than unattributed core affect (Russell, 2003), which may lend itself more to
misattributions. Potentially, emotional faces include semantic content that facilitates rapid
identification of specific emotions, while reactions to neutral out-group faces (as in Payne et
al. 2005 [Experiment 4] and 2010b) may be of a more rudimentary and unspecific nature. For
example, while neutral face stimuli produced out-group homogeneity (“they all look the
same”) bias, participants instead showed out-group heterogeneity bias (“they all look
different”) in a recognition task, distinguishing better between out-group angry faces
(Ackerman et al., 2006). Also, fear conditioning is more robust with (neutral) out-group faces
compared to in-group faces (Olsson, Ebert, Banaji, & Phelps, 2005), and racial attitudes measured with the Implicit Association
Test predict the magnitude of differences in amygdala activation to in- and out-group faces (Phelps et al., 2000). In addition, lesions to the amygdala especially impair recognition of
fear (Calder et al., 1996), but it has been suggested that the amygdala specializes in
threat detection rather than emotion recognition (Ohman, 2002). Imaging studies have
confirmed that fear and anger activate separate neural regions (Sprengelmeyer, Rausch, Eysel,
& Przuntek, 1998).
Altogether, the empirical data neither confirm nor refute a dissociation of automatic
emotion recognition and implicit attitude activation. To date there is no conclusive evidence of affective
processes in the AMP, nor of exclusively semantic priming in the SMP. However,
this possible semantic involvement does not necessarily devalue the method. Blaison et
al. remarked that it rather implies that “the AMP […] is not restricted to evaluation but could be
an “inkblot” for many semantically defined psychological constructs—an almost universal,
psychometrically sound, projective task” (Blaison et al., 2012, p. 9).
The present study
Employing a quantitative measure of semantic likeness enables use of utterances
generated by participants as a dependent variable. Language is arguably one of the most direct
ways of accessing other people’s mental states, and has obvious methodological appeal.
Computational power and appropriate methods to create semantic representations have
existed for some time (Landauer & Dumais, 1997), yet their use in psychological research
has not been widespread. The method of latent semantic analysis (LSA, Landauer & Dumais,
1997; Landauer et al., 1998) creates a high-dimensional space of words-by-context
representations, based on the observation that words with similar meaning occur in the same contexts
more frequently. By analyzing a vast corpus of representative text, a matrix is
created, wherein semantic content is represented as distance to other words. That is, words
with similar meaning and synonyms are located near each other. Aside from word similarity,
the method has no concept of the actual semantic content represented, paralleling
Wittgenstein’s view that “the meaning of a word is its use in the language” (Wittgenstein,
1953, p. 20).
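The core idea, that words occurring in the same contexts end up close together, can be illustrated with a toy words-by-context matrix. This is only a sketch under stated assumptions: the contexts are made up, the function names are introduced here, and the dimension-reducing compression step that LSA proper applies is deliberately omitted, so this shows plain co-occurrence similarity rather than a full LSA space.

```python
import math


def context_vectors(contexts):
    """Build log-scaled words-by-context count vectors from tokenised contexts."""
    vocabulary = sorted({word for context in contexts for word in context})
    vectors = {}
    for word in vocabulary:
        counts = [context.count(word) for context in contexts]
        # Logarithmic rescaling dampens the influence of very frequent words.
        vectors[word] = [math.log(1 + count) for count in counts]
    return vectors


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical miniature "corpus": each inner list is one context.
toy_contexts = [
    ["puppy", "dog", "bark", "leash"],
    ["dog", "puppy", "walk", "park"],
    ["tax", "form", "income", "office"],
    ["tax", "income", "bank"],
    ["dog", "bark", "park"],
]
vectors = context_vectors(toy_contexts)
print(cosine(vectors["puppy"], vectors["dog"]))  # words sharing contexts
print(cosine(vectors["puppy"], vectors["tax"]))  # words sharing no context
```

In this toy example "puppy" and "dog" share contexts and therefore score higher than "puppy" and "tax", which never co-occur. The method has no access to what the words mean, only to where they are used.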
Repeated tests of distance between synonyms in predefined lists are used to measure the
quality of the space, and the number of dimensions is optimized accordingly to provide the
best fit (Sikström & Sawar, n.d.). LSA performance on such synonym tests has been good
enough to pass the English tests for American university admission (Landauer et al., 1998).
Semantic testing has demonstrated change in representations of self and significant others
following psychotherapy (Arvidsson, Sikström, & Werbart, 2011) and proven useful in
investigating the constancy of verbal eyewitness statements (Sikström & Sawar, n.d.). It has
also revealed interesting patterns of stereotypical gender representations in the context of
Reuters news items (Gustafsson Sendén & Sikström, n.d.).
Here the intent was to investigate whether typical results of the AMP can be replicated with
an open-ended outcome measure by running semantic tests in an LSA-created space.
Specifically, the procedure is closely modeled on that of Experiment 1 in Payne,
Burkley, et al.’s (2008) study, which is highly similar to that of Experiment 6 in the first article
(Payne et al., 2005). Apart from an attempt at replicating the robust AMP effect of primes on
evaluations, this may be seen as a test of the methodological validity of semantic testing. If
evaluations are colored by affect misattribution, then semantic responses generated under the
same conditions should be too, and we expect to see significant differences in semantic
content generated to targets following in-group vs out-group primes. In addition it constitutes
an indirect test regarding semantic processes involved in AMP. If primes activate congruent
semantic information in working memory which is then attributed to evaluated targets,
semantic responses should be sensitive to warning. If, rather, unattributed core affect elicited by
primes activates congruently valenced semantic concepts, warnings should be ineffective.
Finally, individuals high in Concern With Acting Prejudiced and/or Internal Motivation to
Respond Without Prejudice are predicted to show a higher proportion of divergent explicit
and implicit attitudes.
Method
Participants
Participants (N = 42) took part anonymously and therefore did not report age
or gender. Because there was insufficient time to await ethical approval, anonymity was opted for and
no demographic data were collected. All subjects spoke Swedish as their first language and were
currently undergoing or had finished higher academic training. This was not solely a matter of
convenience sampling, but was also taken as a guarantee of a suitably sizeable mental lexicon for
the semantic associations included in the procedure. The exclusion criterion was any knowledge at
all of Chinese writing (or closely related pictographic writing systems, such as Korean).
Material
From the Face-Place database, 72 images of emotionally neutral faces were selected (see footnote 2): 36
Caucasian and 36 Afro-American, with an even distribution of male and female faces. Of the
traditional Chinese characters (pictographs) used by Payne et al. (2005), 72 were
randomly selected. The stimuli were presented in 2 blocks, the first using faces as primes and
pictographs as targets, while the second reversed the relationship.
Primes were presented for 75 ms, followed by a blank screen for 125 ms, after which the
target was presented for 100 ms (see footnote 3 on timing). Immediately after the target
presentation, a mask screen of monochromatic static visual noise appeared until the subject
had responded and pushed a button to start the next trial. All presentations were made
on a Dell Studio XPS 1640 with an Intel Core 2 Duo P8400 processor, 8 GB DDR3 RAM and an
ATI Mobility Radeon HD3670 GPU.
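The rounding rule described in the timing footnote (whole refresh cycles at 60 Hz, minus 10 ms) can be sketched as follows. This is a minimal illustration, not the presentation software's actual logic: the function name and the choice to round up to the next whole refresh cycle are assumptions, chosen because they reproduce the reported 73 and 123 ms for the 75 and 125 ms targets. The 100 ms target is already an exact multiple of the refresh duration (six cycles) and is reported unchanged in the footnote, so it is not passed through the rule here.

```python
import math

REFRESH_HZ = 60.0
FRAME_MS = 1000.0 / REFRESH_HZ  # duration of one refresh cycle, about 16.67 ms


def requested_duration_ms(target_ms, margin_ms=10.0):
    """Duration to request so that the stimulus spans a whole number of cycles.

    Assumption: the target is rounded up to the next whole refresh cycle and a
    fixed margin is subtracted so that the offset falls before the next refresh.
    """
    cycles = math.ceil(target_ms / FRAME_MS)
    return cycles * FRAME_MS - margin_ms


for target in (75, 125):
    print(target, round(requested_duration_ms(target)))
```

The point of the margin is that a display can only change its content at a refresh, so a requested offset slightly before a refresh boundary ends the stimulus on that boundary rather than one cycle later.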
In addition, 3 questionnaires were administered. These were the same ones administered
by Payne et al. (2005) in the sixth experiment described in the first AMP article, namely a
feeling thermometer (of attitudes towards different ethnic groups), the Motivation to Control
Prejudiced Reactions (MCPR, Dunton & Fazio, 1997) scale and the Motivation to Respond
Without Prejudice scale (MRWP, Plant & Devine, 1998). All scales were translated into
Swedish by the author to the best of his ability, and the resulting items were tested for
face validity on several pilot participants. The feeling thermometer asked participants to rate their feelings towards
Asians, Whites, Blacks, Latinos, Arabs, Africans and Europeans on a scale from 0 (cold and
unfavorable) to 10 (warm and favorable). A couple of additional target groups were included
in the feeling thermometer to cover a wider range of attitudes. Dunton and Fazio’s (1997)
MCPR is divided into 2 subscales separating external interpersonal pressure from more
internalized ideals relating to self-concept. These are called Restraint to Avoid Dispute
(Restraint) and Concern With Acting Prejudiced (Concern). For the Internal and External
Motivation (IMS/EMS) subscales of the MRWP, the dimensions are what the titles
suggest. Both scales have demonstrated good reliability across samples, with Cronbach’s α
ranging from .74 to .77 for the MCPR (Dunton & Fazio, 1997) and from .76 to .85 for the MRWP (Plant &
Devine, 1998). Observed reliability in this dataset was α = .733 for MCPR and α = .135 for
MRWP.
2. Stimulus images courtesy of Michael J. Tarr, Center for the Neural Basis of Cognition, Carnegie
Mellon University, http://www.tarrlab.org/
3. Since the refresh rate of the monitor on which the stimuli were presented was 60 Hz (with no option to adjust
up or down), the actual timing of the presentations was modified by the presentation software to allow
completion of refresh cycles. When using the option for onset sync on ‘vertical blanks’, the actual timings are
logged with the resulting data and may be inspected. A refresh rate of 60 Hz gives a refresh duration of 16.67
ms with an average delay of 8.33 ms and a standard deviation of 4.17 ms. To approximate the targeted exposure
durations of 75, 100 and 125 ms, the requested duration was calculated as the corresponding
integer number of refresh cycles times the refresh duration (16.67 ms), minus 10 ms. This put the durations at 73, 100
and 123 ms.
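For readers unfamiliar with the reliability coefficient reported above, Cronbach's α can be computed from a respondents-by-items table as sketched below. The data are made up and the function name is an assumption; this is the standard textbook formula, not a reconstruction of the analysis actually run on the present dataset.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five participants to a three-item scale.
example_responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(cronbach_alpha(example_responses))
```

Values near 1 indicate that the items vary together across respondents; values near 0, as observed here for the translated MRWP, indicate that the items share little common variance.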
Procedure
Participants were seated in front of the computer after a brief verbal instruction. They
were told by the experimenter that the goal of the research was to find out more about how
people associate words to different kinds of visual stimuli. Written instructions on the screen
told subjects to ignore the prime, respond to the target, respond with the first single word they
came to think of, loudly vocalize as quickly as possible, while avoiding repeats and proper
nouns. Half the participants were assigned to the warning condition, in which they read a
statement describing possible bias after exposure to out-group faces and which urged them to
do their best not to let themselves be affected. After vocal response, subjects entered said
response in a prompt⁴ and pushed enter, which started the next stimulus sequence. Specific
instructions concerning prime and target for each block were displayed before presentation
start.
Following the trials, the 3 questionnaires were administered on screen. After completing
the session, subjects confirmed their semantic responses and helped the experimenter insert
missing umlauts, correct and clarify ambiguous typing errors (consulting recorded audio when
needed). When finished, subjects were debriefed on the specifics and actual aims of the study.
Statistical Analysis
A 1.6 Gb corpus consisting of roughly 890,000 articles from the 100 largest Swedish
daily newspapers during 2000-2010 was used. The semantic spaces were generated
with LSA using the Infomap software⁵. The resulting matrices, with words in rows, contexts in
columns and cells containing the number of occurrences of a word in a context, were
logarithmically rescaled and compressed by an algorithm called Singular Value
Decomposition (Sikström & Sawar, n.d.). This space contains the dimensions that best
represent semantic likeness, so that semantically identical words are neighbours and those
with similar meanings are located near each other. The representational quality of the space
was measured using the synonym test mentioned above. A list of Swedish synonyms was
rank-ordered by total number of words in the representation such that a low score indicates
closeness in the space. The median score (0.0195) of 1000 Swedish synonym tests was used
as an indicator of overall quality of the representation. All subsequent testing was conducted
in the Semantic software, a Matlab application specifically developed for the purpose, which
is available for collaborative research⁶. Additional analysis was run in IBM SPSS 20.
4 Because the software source code was unable to handle foreign keyboard layouts, the 3 umlaut
characters in the Swedish alphabet were transformed to other symbols. Subjects were instructed to ignore this
as the only viable solution was retroactive correction guided by the recorded vocal responses.
5 http://infomap-nlp.sourceforge.net
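The construction of a semantic space described above (count matrix, logarithmic rescaling, Singular Value Decomposition, similarity by cosine) can be illustrated with a small sketch. This is not the Infomap or Semantic software pipeline; the function names, the use of log(1 + count) as the rescaling, and the toy count matrix are assumptions made only for illustration.

```python
import numpy as np

def build_semantic_space(counts, n_dims):
    """counts: words x contexts matrix of raw occurrence counts."""
    # Logarithmic rescaling (assumed here to be log(1 + count))
    rescaled = np.log1p(counts)
    # Singular Value Decomposition; keep the n_dims strongest dimensions
    u, s, _ = np.linalg.svd(rescaled, full_matrices=False)
    vectors = u[:, :n_dims] * s[:n_dims]
    # Normalise each word vector to unit length
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.where(norms == 0, 1.0, norms)

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical counts: 4 words x 4 contexts; words 0-1 and words 2-3
# tend to occur in the same contexts.
counts = np.array([
    [4, 3, 0, 0],
    [3, 4, 1, 0],
    [0, 1, 4, 3],
    [0, 0, 3, 4],
], dtype=float)
space = build_semantic_space(counts, n_dims=2)
print(cosine(space[0], space[1]), cosine(space[0], space[2]))
```

In this toy space, words that share contexts (0 and 1) end up with a higher cosine than words that do not (0 and 2), which is the property the synonym test relies on.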
For a between subjects (conditions) test of statistical differences in semantic
representations, two sets of words were subjected to a measure of distance (Sikström &
Sawar, n.d.). A semantic representation for each condition was created, and the representation
of one condition was subtracted from that of the other, creating unique semantic difference
vectors for each subject. A semantic representation is the averaged location of the statements in each condition. A
“semantic scale” consisting of the distance between each subject’s summarized semantic
representation and difference vector resulted. The distance measure was calculated from
cosines of angles between vectors. The resulting semantic scale value has a range from -1 to
+1, denoting maximal similarity to the first and second condition being compared
respectively. On this scale, distance between conditions was measured as the difference of
average value of items in conditions. A p-value for the scale was created by standard
independent sample one-tailed test of the hypothesis that the semantic representations in the
first condition should have a positive value, i.e. more similar to other statements from the first
than the second condition. Z-values were calculated by dividing semantic distance by pooled
standard deviations of all items. Effect sizes were calculated by subtracting semantic distance
of condition-ordered data from that of random-ordered data and dividing by the standard
deviation of random-ordered data⁷ (Sikström & Sawar, n.d.). Since effect sizes in this context
are derived from the more informative z-value distance measure, they are presented
subsequently and are not denoted d like Cohen’s effect sizes.
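The idea of a semantic scale can be sketched as follows: each subject's summary vector is compared, by cosine, with the difference vector between the two condition means, and the distance between conditions is the difference in mean scale value. This is a simplified illustration under stated assumptions, not the implementation in the Semantic software: the function names are hypothetical, condition labels are coded 0 and 1, positive values are taken to mean similarity to the first condition, and no subject is held out when the difference vector is computed.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def semantic_scale(subject_vectors, labels):
    """Cosine between each subject's summary vector and the difference vector.

    labels: 0 for the first condition, 1 for the second (assumed coding).
    """
    X = np.asarray(subject_vectors, dtype=float)
    labels = np.asarray(labels)
    # Difference vector: mean location of condition 0 minus that of condition 1
    diff = unit(X[labels == 0].mean(axis=0) - X[labels == 1].mean(axis=0))
    return np.array([float(np.dot(unit(x), diff)) for x in X])

def scale_distance(scale, labels):
    """Difference between the mean scale values of the two conditions."""
    labels = np.asarray(labels)
    return float(scale[labels == 0].mean() - scale[labels == 1].mean())

# Hypothetical 2-dimensional summary vectors for four subjects
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = [0, 0, 1, 1]
scale = semantic_scale(vectors, labels)
print(scale, scale_distance(scale, labels))
```

Because the scale is a cosine, every value lies between -1 and +1, and a positive distance indicates that the two conditions occupy different regions of the space.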
For within subject paired tests, the calculations regarded significant changes between
measurements. The change between measurements was measured as a vector. All subjects’
vectors were summarized and the length of the resulting vector was the distance measure for
paired tests. P-values were calculated by bootstrapping – randomly assigning subjects to
conditions many times and calculating variability between conditions. The probability equaled
the ratio of the number of times the semantic distance of random assignment is larger than the
semantic distance of ordered data. For paired tests, effect sizes were calculated from a
z-transformation of the distance measure (dividing the distance measure by the standard
deviation). The sign of z-values and effect sizes indicate similarity to the first or second
condition being compared.
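The paired test can be sketched as a randomization test on the subjects' change vectors. The sketch below assumes that "randomly assigning subjects to conditions" corresponds to randomly swapping each subject's two measurements, which flips the sign of that subject's change vector; this reading, the function name and the toy data are assumptions, not the Semantic software's implementation.

```python
import numpy as np

def paired_semantic_test(cond_a, cond_b, n_perm=2000, seed=0):
    """Return (observed distance, p-value) for paired semantic representations."""
    a = np.asarray(cond_a, dtype=float)
    b = np.asarray(cond_b, dtype=float)
    change = a - b  # one change vector per subject
    # Distance measure: length of the summed change vectors
    observed = float(np.linalg.norm(change.sum(axis=0)))
    rng = np.random.default_rng(seed)
    larger = 0
    for _ in range(n_perm):
        # Randomly swap the two measurements within each subject
        signs = rng.choice([-1.0, 1.0], size=len(change))
        random_distance = np.linalg.norm((change * signs[:, None]).sum(axis=0))
        if random_distance > observed:
            larger += 1
    # Share of random assignments giving a larger distance than the ordered data
    return observed, larger / n_perm

# Hypothetical data: 10 subjects whose representations all shift in one direction
cond_b = np.zeros((10, 2))
cond_a = np.array([[1.0, 0.1 * (-1) ** i] for i in range(10)])
distance, p_value = paired_semantic_test(cond_a, cond_b)
print(distance, p_value)
```

When the change vectors point in a consistent direction, random swaps almost always shorten the summed vector, so the p-value is small.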
Valence for each participant’s statement in every single condition was calculated by
training the space on a list of semantic valence norms, locating them in the space with a
resulting semantic scale between the positive and negative groups of words. Valence was then
predicted by multiple linear regression. The valence word list comprised 173 items and had
been rated in a prior study (N = 42), where ratings for negative words ranged from -2.98 to
-1.35 and positive word ratings from 1.23 to 2.75 (Stenberg, Wiking, & Dahl, 1998). The
regressions created coefficients minimizing the mean squared error between estimated valence
and listed valence ratings (Gustafsson Sendén & Sikström, n.d.). These coefficients were used to predict
the valence of every word in the space. They were also used to predict a common valence for each
semantic representation by subject and condition.
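The valence prediction step amounts to an ordinary least-squares regression from the dimensions of the semantic space onto the rated valence of the norm words, followed by applying the fitted coefficients to other vectors. A minimal sketch, assuming an intercept term and using hypothetical function names and made-up 2-dimensional vectors:

```python
import numpy as np

def fit_valence_model(word_vectors, ratings):
    """Least-squares coefficients (intercept first) predicting rated valence."""
    vectors = np.asarray(word_vectors, dtype=float)
    X = np.column_stack([np.ones(len(vectors)), vectors])
    coef, *_ = np.linalg.lstsq(X, np.asarray(ratings, dtype=float), rcond=None)
    return coef

def predict_valence(coef, vectors):
    """Apply fitted coefficients to one or more semantic vectors."""
    vectors = np.atleast_2d(np.asarray(vectors, dtype=float))
    return np.column_stack([np.ones(len(vectors)), vectors]) @ coef

# Hypothetical norm words: 2-dimensional vectors with made-up valence ratings
norm_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2], [-1.0, 0.3]])
norm_ratings = 0.5 + 2.0 * norm_vectors[:, 0] - 1.0 * norm_vectors[:, 1]
coef = fit_valence_model(norm_vectors, norm_ratings)
print(predict_valence(coef, [0.2, 0.4]))
```

The same coefficients can be applied to a single word vector or to the averaged vector of a subject's statements in one condition.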
6 http://www.lucs.lu.se/sverker.sikstrom/LSALAB_intro.html
7 $z = \dfrac{\operatorname{mean}\big(d_c(\text{random})\big) - d_c(\text{ordered})}{\operatorname{std}\big(d_c(\text{random})\big)}$

Results
Questionnaires
From the feeling thermometer scores, an attitude index was created by subtracting the
mean out-group attitude from the mean in-group attitude. Due to subject selection, group
identity was assumed to be identical for all respondents. The resulting index had a theoretical
range from -10 to +10; the actual range was -7.00 to 3.20 (M = -.95, SD = 1.87), and the
sample distribution had a marked negative skewness (-1.20). This shows that most subjects
exhibited a positive out-group bias rather than the more common in-group bias. Since those in the
warning condition were warned prior to questionnaires, the possible resulting bias reversal
was investigated with an independent samples t-test which found no significant difference in
means (p = .884). Scores for the MCPR subscales Concern (M = 30.59, SD = 6.59) and
Restraint (M = 18.34, SD = 3.86) were calculated for each subject, as were scores for MRWP
subscales IMS (M = 26.63, SD = 5.32) and EMS (M = 14.38, SD = 6.30). There were strong
significant intercorrelations for all subscales except EMS (Concern-Restraint r = .663, p <
.001; Concern-IMS r = .626, p < .001; Restraint-IMS r = .361, p < .05). Reliability analysis of
the 4 scales showed that Cronbach’s α increased from .615 to .774 if EMS was deleted.
Accordingly, a composite motivation score of the other three was computed, taking the mean of
standardized values for each scale across subjects.
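The two derived questionnaire scores can be sketched as follows: the attitude index is the mean in-group rating minus the mean out-group rating, and the composite motivation score is the mean of each subject's standardized (z) scores on the retained scales. Function names and numbers are hypothetical and do not come from the study data.

```python
from statistics import mean, stdev

def attitude_index(in_group_ratings, out_group_ratings):
    """Mean in-group rating minus mean out-group rating for one subject."""
    return mean(in_group_ratings) - mean(out_group_ratings)

def standardize(values):
    """z-scores of one scale across subjects (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite_motivation(*scales):
    """Mean of standardized scale scores, one composite value per subject."""
    z_scores = [standardize(scale) for scale in scales]
    return [mean(subject) for subject in zip(*z_scores)]

# Hypothetical scores for three subjects on Concern, Restraint and IMS
concern = [30, 25, 35]
restraint = [18, 15, 21]
ims = [26, 22, 30]
print(composite_motivation(concern, restraint, ims))
# Hypothetical thermometer ratings for one subject
print(attitude_index([6, 7], [8, 9]))
```

A negative attitude index, as in the second example, corresponds to warmer ratings of out-groups than of in-groups.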
AMP
In this modification of the AMP, we investigated differences and similarities in
semantic content of statements between conditions, instead of direct attitude measurement. To
assess the effect of explicitly warning participants of bias, a semantic test (see method above)
between subjects in the warning vs. no warning condition was performed. To recap, z-values
are a transformation of the semantic distance where a positive value means a larger distance,
or difference between statements, than would be expected by chance. A negative value means
the distance is smaller, or more similar, than would be expected of randomly selected
statements. Z-transformations of these measures are the number of standard deviations of the
degree of semantic difference or similarity. The similarity, as described above, is calculated
from the degree of co-occurrence in natural language text material. A high positive z-value
means that the content of statements is highly dissimilar, a low negative value that they are highly
similar. Similarity is here derived from the likelihood of two statements occurring in the same
linguistic context.
No significant difference was found between warning and no warning (p = .298).
Neither did paired tests reveal significant difference for black primes with vs. without
warning, nor white, nor for black vs. white primes in the warning condition (see Table 1).
Table 1. Warned vs. unwarned prime conditions
Stimuli/Stimuli          Black primes        White primes (Warning)   Black primes (Warning)
White primes             .9249 (.2018) ns    -.2385 (-.0520) ns       1.3477 (.2941) ns
Black primes                                 -.9569 (-.2088) ns       -1.3385 (-.2921) ns
White primes (Warning)                                                -2.0267 (-.4423) ns
z-values of between-subjects comparison (effect size in parentheses); ns = not significant
Paired semantic tests revealed a significant difference of content in responses to white
vs. black primes without warning (z = 1.805, p < .05, effect size = .202), which was not
significant in the warning condition (cf. Tables 2 and 3). Without accounting for warning,
black vs. white primes had no significant effect on the similarity of the semantic content of
statements.
Table 2. No warning condition
Stimuli/Stimuli   White primes        Black primes        White targets
Black primes      1.8057 (.2018)*
White targets     8.5471 (2.2089)**   9.0010 (2.0284)**
Black targets     7.0212 (1.9125)**   5.9048 (1.7953)**   4.2147 (0.9795)**
z-values of pairwise, within-subject comparison (effect size in parentheses); * = p < .05, ** = p < .001
Table 3. Warning condition
Stimuli/Stimuli   Black primes           White targets       Black targets
White primes      -0.4513 (-0.4423) ns   5.9101 (1.6854)**   8.2192 (1.5126)**
Black primes                             8.1575 (1.6331)**   6.0143 (1.5954)**
White targets                                                2.1623 (0.0603)*
z-values of pairwise, within-subject comparison (effect size in parentheses); ns = not significant, * = p < .05, ** = p < .001
The semantic space was trained on a list of words with rated valence, and an individual
value predicting valence was created for each subject’s set of words in each condition. By
subtracting the valence of black from white primes, and targets, individual implicit and
explicit attitude values were computed. A positive value in these measures signifies positive
in-group/negative out-group bias in performance on the AMP association task, while negative
values point to negative in-group/positive out-group bias.
Implicit valence difference together with composite motivation, attitude index (feeling
thermometer), their interaction terms and the warning variable centered (by removing the
mean 0.5, transposing no warning to -.5 and warning to .5) were then regressed on the explicit
valence difference, yielding a low R-square (R2 = .093) and the ANOVA on the regression
showed no significance (F(5,34) = .701, p = .626). The interaction of attitude index and
composite motivation came closest to significance (β = -.729, p = .106). Gradually removing
variables resulted in a model (R2 = .112, F(2,37) = 2.334, p = .111) consisting of this variable
(β = -.781, p = .053) and an interaction between prime valence difference and attitude index
(β = -.173, p = .293). Since none of the individual variables were significant the model was
rejected. Neither did regressing explicit valence, motivation, attitude index, centered warning
and interaction terms on implicit valence result in satisfactory explained variance (R2 = .128)
or a significant ANOVA (F(7,32) = .670, p = .695).
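The centering of the warning variable and the construction of interaction terms can be made concrete with a short sketch. It assumes a balanced design (half the subjects warned, so the dummy variable has mean 0.5); the function names and values are hypothetical.

```python
def center_warning(warned):
    """Map no warning -> -0.5 and warning -> +0.5 (0/1 dummy minus its mean of 0.5)."""
    return [(1.0 if w else 0.0) - 0.5 for w in warned]

def interaction(a, b):
    """Element-wise product of two predictors, one value per subject."""
    return [x * y for x, y in zip(a, b)]

# Hypothetical values for four subjects (not data from this study)
warned = [True, False, True, False]
attitude_index = [-1.0, 0.5, -2.0, 1.5]
warning_c = center_warning(warned)
print(warning_c)
print(interaction(warning_c, attitude_index))
```

Centering the dummy variable in this way means that the main effects of the other predictors are estimated at the average of the two warning conditions rather than at the no warning level.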
Rather than continuing risky interpretation of nonsignificant results, the data was prepared
for Multilevel Linear Modelling (MLM). For this purpose, the valence of the semantic
response to each single stimulus presentation was computed from the semantic space. For some
reason, the degrees of freedom in the MLM were close to the total number of observations (N
= 3024) even though grouped on the subject variable. Due to this artefact, some caution has
been taken while interpreting the results and readers should see this as a caveat.
First a model based on simple main effects of attitude index score, the composite
motivation scale and their interaction with a random intercept was tested on predicted valence
as dependent variable. This rendered no significant fixed effects (p > .50 for all). Next the
black/white and faces as primes/targets conditions were added with all 2-way interactions.
As before, no effect was significant (every p > .17), and instead a full factorial test of all
terms as predictors was run. This resulted in a nonsignificant (p = .077) 4-way interaction of
all predictors which was retained while trimming the model of nonsignificant effects, beginning
with multiple interaction terms. A type III partial sum of squares test of a model containing
effects of an interaction between motivation, attitude index, order and face (in-/out-) group (F
= 3.364, p = .067) and an interaction of motivation, attitude index and face group (F = 3.814,
p = 0.051) was arrived at but deemed unsatisfactory. A split of the data according to warning
condition was done and the MLM was run on both, but the resulting Hessian matrix was not
positive definite. A one way ANOVA of valence by warning condition revealed that this was
due to a lack of difference in group means (F(2,2659) = .346, p = .557).
Instead order of presentation was allowed to vary in order to study the difference in the
4-way interaction between faces as primes and faces as targets, yielding an estimate of -.188
(df = 2606, p = .067) for the faces as primes condition while the faces as target variable came
out redundant. Now prime color was allowed to vary for the interaction of motivation, attitude
index and face color, which gave an estimate of -.139 (df = 445.831, p = .052) but the type III
test of the effect was not significant (F = 1.901, p = .151). Abandoning the search for high
level interactions, the model was now reverted to a simple order, color and interaction one.
This rendered an estimate of the effect of face color (Est = -.554 (.116), t = -4.919, p < .001)
and of an interaction between face color and presentation order (Est = .618 (.159), t = 3.883, p
< .001) but no significant effect of order (t = .670, p = .504). The order condition was varied
to investigate the interaction, which showed that predictions of valence by face color were
lower in the prime order than the target order (t = -3.908, p < .001).
Discussion
Interpreting the data
The tendency towards reversed explicit bias (negative in-group bias) is remarkable. It
was ruled out that this was an artifact of warning some subjects of bias. A perhaps more
relevant explanation lies in the cultural context; the low degree of overt prejudice in the
Swedish population in general (Akrami & Ekehammar, 2005), and, one might suspect, this
sample in particular (university students). The norm of political correctness is well anchored
in the public sphere and a prerequisite in academic discourse. Also, the motivation
questionnaires and especially the explicit feeling thermometer attitude scale are very
transparent. Upon discovering the aims of the study, fear of having exhibited prejudice might
have driven subjects to overcompensate on the direct measures.
As can be seen in Table 1, between subjects comparison of semantic representations
with and without warning showed no significant results. In Table 2 and 3 significant
differences are seen for pairwise comparisons within subjects. That they are found on this
level is likely a result of the fact that within subject testing allows detection of smaller effects
due to control of individual variance (Sikström & Sawar, n.d.).
The lack of significance in difference between the warned and unwarned group allows
no interpretation of whether the representations between conditions diverged or converged. It
should be noted that this is a test of all statements in both conditions and does not control for
specific effects in nested conditions. The warning had no overall effect on statements, but
then again it was given just before the prime block and block order was randomized. About
half of the participants were thus warned prior to the (explicit) target block. The significant
difference of black and white prime responses in the unwarned condition which was not
significant for warned subjects is hard to interpret. The difference observed in the unwarned
condition could be explained by the activation of differential semantic concepts. If so, the absence of
this difference from the warned condition still demands an explanation. The lack of significant difference does not
show that responses were more alike, only that they didn’t systematically vary depending on
stimuli. It may be that subjects’ strategies of reacting to the warning (to avoid bias in
responses) contain an element of randomizing their responses. It may also be that the increase
in response dimensions allowed for more monitoring and control of responses than a
dichotomous evaluation. This matter of structural fit is briefly discussed below. When
Deutsch and Gawronski (2009) worried about features of the stimuli contaminating answers it
was a question of valence affecting unrelated semantic categories. If these features really are
unrelated, focusing on non-affective features may reduce affective priming. This moderation
of attention on affective processing has been demonstrated in other priming research (Spruyt,
Houwer, & Hermans, 2007).
The hint of predictive value of interactions between explicitly reported attitudes towards
groups and subjects’ motivation scores did not reveal patterns consistent enough to be
interpreted. What was shown by the MLM was that valence could be predicted by target
stimuli face color and an interaction with presentation order. Since the order of faces as
targets was conceptualized as an explicit measure this shows that implicit and explicit
attitudes as measured by the procedure were related. At the same time they were divergent to
some extent, a pattern expected from prior evidence (Payne, Burkley, et al., 2008). In accord
with predictions from existing evidence (Payne et al., 2005), more negative valence (Est =
-.554) in statements following black faces in the priming condition was evident with MLM.
No final judgment based on these results can be passed on semantic processes in
affective priming, nor on the relative degree of semantic and affective processing in classical
AMP and in the association task used here. We conceptualized the effect of warning as an
indirect test of semantic involvement, where an unambiguous effect of warning would have
contradicted affective priming (and the affective primacy hypothesis). This was not a strong
assumption, and the mixed results of warning, with no between-group effects but differing
results in semantic content within conditions, do not constitute evidence in favor of affective
primacy.
Confounders and limitations
One might contend that contemporary racial attitudes in Scandinavia are not comparable
to those in the US, due to different histories of migration and foreign relations. Scandinavian
racial prejudice has been investigated, using instruments modeled on widely studied measures
of classical and modern racism (for an overview of concepts and relations between these, see
Gawronski, Peters, Brochu, & Strack, 2008). Measures of classical and modern racism have
been shown to be highly correlated but distinguishable in a Swedish population (Akrami &
Ekehammar, 2000). The concept of modern racism assesses subtler, less blatant forms of
prejudice and though an explicit measure, one might expect co-existing attitudes beyond those
explicitly reported. More importantly, Akrami et al. found MCPR (Dunton & Fazio, 1997)
moderation of the relationship between implicit (response-latency-based) and explicit
measures of prejudice in a Swedish population (Akrami & Ekehammar, 2005).
Another possible confounder is structural fit, the degree of correspondence between
methodology and concepts investigated (Payne, Burkley, et al., 2008). Since the degree of
structural fit increases correlations between measures, a high degree of methodological
uniformity removes variability resulting from measurement in data. Implicit-explicit
correlations in the order of .50 have been obtained with structurally equated tests (Payne,
Burkley, et al., 2008), using essentially the same procedure and design as above. This leaves
sufficient variance to explain, and in the context of investigating divergent implicit and
explicit attitudes it is desirable to minimize the amount of methodological variability to ensure
actual construct differences. It is not entirely clear whether the use of a semantic, open-ended
dependent variable constitutes an increase or decrease in structural fit. The one-dimensional
evaluations common to the AMP constrain responding enough that one might suspect
increased structural fit. In this study, the sacrifice of a rigorously homogeneous dependent
variable might be justified by opening up a whole new class of informative responses.
The relatively small size of the sample is an obvious limitation. By institution standards
no compensation is given for procedures requiring less than 15 minutes. As the procedure
typically took 10-15 minutes to complete, subjects had to volunteer their time for free. More
subjects to increase statistical power would be a priority in future ventures. Note though that
despite few participants, significant effects were found. Ethical approval would allow
collecting data on ethnic group identity and adjusting stimulus material accordingly. Further
development of task goals and instructions more intuitive to subjects might be beneficial since
many commented on the difficulty of generating spontaneous associations to stimuli. A casual
observation is that there seems to be great individual variation in the ease of performing such
associations, which could be explored and controlled for. Also, with larger samples reaction
times should be controlled.
Concluding remarks
This study showed to some extent that evaluative priming is reflected in spontaneous
semantic responses. The results also contribute to the accumulation of empirical evidence in
support of semantic testing of differences in language representations. Semantic testing
revealed significant differences in both semantic content and the valence of responses
between conditions. Interesting relationships between explicitly reported attitudes and
performance on implicit measures of the same attitudes were implied but could not be
demonstrated. An interesting possibility is repeating the same procedure with angry and
fearful faces as primes (as in Blaison et al., 2012), and asking subjects how they feel about the
target Chinese characters. This would allow more direct testing of e.g. distance to concepts of
anger and fear. In conjunction, other semantically and affectively incongruent primes could be
used. Also, effects on semantic responses of more salient and evocative but still ambiguous
stimuli, along with effects of other classes of primes, could be explored.
References
Ackerman, J. M., Shapiro, J. R., Neuberg, S. L., Kenrick, D. T., Becker, D. V., Griskevicius,
V., et al. (2006). They all look the same to me (unless they’re angry): From out-group
homogeneity to out-group heterogeneity. Psychological Science, 17, 836-840.
Akrami, N., & Ekehammar, B. O. (2000). Classical and modern racial prejudice: A study of
attitudes toward immigrants in Sweden. European Journal of Social Psychology, 30,
521-532.
Akrami, N., & Ekehammar, B. O. (2005). Personality and social sciences: The association
between implicit and explicit prejudice: The moderating role of motivation to control
prejudiced reactions. Scandinavian Journal of Psychology, 46, 361-366.
Arvidsson, D., Sikström, S., & Werbart, A. (2011). Changes in self and object representations
following psychotherapy measured by a theory-free, computational, semantic space
method. Psychotherapy Research, 21, 430-446.
Blaison, C., Imhoff, R., Hühnel, I., Hess, U., & Banse, R. (2012). The Affect Misattribution
Procedure: Hot or not? Emotion, 12, 403-412.
Calder, A. J., Young, A. W., Rowland, D., Perrett, D. I., Hodges, J. R., & Etcoff, N. L.
(1996). Facial emotion recognition after bilateral amygdala damage: Differentially
severe impairment of fear. Cognitive Neuropsychology, 13, 699-745.
Payne, B. K., McClernon, F. J., & Dobbins, I. G. (2007). Automatic affective
responses to smoking cues. Experimental and Clinical Psychopharmacology, 15, 400-409.
De Houwer, J. (2006). What are implicit measures and why are we using
them. In R. W. Wiers & A. W. Stacy (Eds.), The Handbook of Implicit Cognition and
Addiction (pp. 11-28). Thousand Oaks, CA: Sage Publishers.
Deutsch, R., & Gawronski, B. (2009). When the method makes a difference: Antagonistic
effects on “automatic evaluations” as a function of task characteristics of the measure.
Journal of Experimental Social Psychology, 45, 101-114.
Doob, L. W. (1947). The behavior of attitudes. Psychological Review, 54, 135-156.
Dunton, B. C., & Fazio, R. H. (1997). An individual difference measure of motivation to
control prejudiced reactions. Personality and Social Psychology Bulletin, 23, 316-326.
Dutton, D. G., & Aron, A. P. (1974). Some evidence for heightened sexual attraction under
conditions of high anxiety. Journal of Personality and Social Psychology, 30, 510-517.
Eagly, A. H., & Chaiken, S. (1993). The Psychology of Attitudes. Fort Worth, TX: Harcourt
Brace Jovanovich College Publishers.
Eagly, A. H., & Chaiken, S. (2007). The advantages of an inclusive definition of attitude.
Social Cognition, 25, 582-602.
Fazio, R. H., Jackson, J. R., Dunton, B. C., & Williams, C. J. (1995). Variability in automatic
activation as an unobtrusive measure of racial attitudes: A bona fide pipeline? Journal of
Personality and Social Psychology, 69, 1013-1027.
Fazio, R. H., & Olson, M. A. (2003). Implicit measures in social cognition research: Their
meaning and use. Annual Review of Psychology, 54, 297-327.
Gawronski, B. (2007). Attitudes can be measured! But what is an attitude? Social Cognition,
25, 573-581.
Gawronski, B., Deutsch, R., & Seidel, O. (2005). Contextual influences on implicit
evaluation: A test of additive versus contrastive effects of evaluative context stimuli in
affective priming. Personality & Social Psychology Bulletin, 31, 1226-1236.
Gawronski, B., Peters, K. R., Brochu, P. M., & Strack, F. (2008). Understanding the relations
between different forms of racial prejudice: a cognitive consistency perspective.
Personality & Social Psychology Bulletin, 34, 648-665.
Greenwald, A. G., & Banaji, M. R. (1995). Implicit social cognition: Attitudes, self-esteem,
and stereotypes. Psychological Review, 102, 4-27.
Gustafsson Sendén, M., & Sikström, S. (n.d.). Gender stereotypes in media - content and
evaluations. Manuscript.
Imhoff, R., Schmidt, A. F., Bernhardt, J., Dierksmeier, A., & Banse, R. (2010). An inkblot for
sexual preference: A semantic variant of the Affect Misattribution Procedure. Cognition
& Emotion, 25, 676-690.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic
analysis theory of acquisition, induction, and representation of knowledge. Psychological
Review, 104, 211-240.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to Latent Semantic
Analysis. Discourse Processes, 25, 259-284.
Lebel, E. P., & Paunonen, S. V. (2011). Sexy but often unreliable: The impact of unreliability
on the replicability of experimental findings with implicit measures. Personality &
Social Psychology Bulletin, 37, 570-583.
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective
techniques. Psychological Science in the Public Interest, 1, 27-66.
Loersch, C., & Payne, B. K. (2012). On mental contamination: The role of (mis)attribution in
behavior priming. Social Cognition, 30, 241-252.
Martin, L. L., Seta, J. J., & Crelia, R. A. (1990). Assimilation and contrast as a function of
people’s willingness and ability to expend effort in forming an impression. Journal of
Personality and Social Psychology, 59, 27-37.
Murphy, S. & Zajonc, R. B. (1993). Affect, cognition, and awareness: Affective priming with
optimal and suboptimal stimulus exposures. Journal of Personality and Social
Psychology, 64, 723-739.
Öhman, A. (2002). Automaticity and the amygdala: Nonconscious responses to emotional
faces. Current Directions in Psychological Science, 11, 62-66.
Oikawa, M., Aarts, H., & Oikawa, H. (2011). There is a fire burning in my heart: The role of
causal attribution in affect transfer. Cognition & Emotion, 25, 156-163.
Olsson, A., Ebert, J. P., Banaji, M. R., & Phelps, E. A. (2005). The role of social groups in the
persistence of learned fear. Science, 309, 785-787.
Payne, B. K., Burkley, M. A., & Stokes, M. B. (2008). Why do implicit and explicit attitude
tests diverge? The role of structural fit. Journal of Personality and Social Psychology,
94, 16-31.
Payne, B. K., Cheng, C. M., Govorun, O., & Stewart, B. D. (2005). An inkblot for attitudes:
Affect misattribution as implicit measurement. Journal of Personality and Social
Psychology, 89, 277-293.
Payne, B. K., Govorun, O., & Arbuckle, N. L. (2008). Automatic attitudes and alcohol: Does
implicit liking predict drinking? Cognition & Emotion, 22, 238-271.
Payne, B. K., Hall, D. L., Cameron, C. D., & Bishara, A. J. (2010a). A process model of
affect misattribution. Personality & Social Psychology Bulletin, 36, 1397-1408.
Payne, B. K., Krosnick, J. A., Pasek, J., Lelkes, Y., Akhtar, O., & Tompson, T. (2010b).
Implicit and explicit prejudice in the 2008 American presidential election. Journal of
Experimental Social Psychology, 46, 367-374.
Phelps, E. A., O’Connor, K. J., Cunningham, W. A, Funayama, E. S., Gatenby, J. C., Gore, J.
C., et al. (2000). Performance on indirect measures of race evaluation predicts amygdala
activation. Journal of Cognitive Neuroscience, 12, 729-738.
Plant, E. A., & Devine, P. G. (1998). Internal and external motivation to respond without
prejudice. Journal of Personality and Social Psychology, 75, 811-832.
Russell, J. A. (2003). Core affect and the psychological construction of emotion.
Psychological Review, 110, 145-172.
Schacter, D. L. (1987). Implicit memory: History and current status. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 13, 501-518.
Schwarz, N., & Clore, G. L. (1983). Mood, misattribution, and judgments of well-being:
Informative and directive functions of affective states. Journal of Personality and Social
Psychology, 45, 513-523.
Sikström, S., & Sawar, F. (n.d.). Semantic tests. Manuscript submitted for publication.
Sprengelmeyer, R., Rausch, M., Eysel, U. T., & Przuntek, H. (1998). Neural structures
associated with recognition of facial expressions of basic emotions. Proceedings of the
Royal Society, Biological sciences, 265, 1927-1931.
Spruyt, A., Houwer, J. D., & Hermans, D. (2007). Affective priming of nonaffective semantic
categorization responses. Experimental Psychology, 54, 44-53.
Staats, A. W., & Staats, C. K. (1958). Attitudes established by classical conditioning. Journal
of Abnormal Psychology, 57, 37-40.
Stenberg, G., Wiking, S., & Dahl, M. (1998). Judging words at face value: Interference in a
word processing task reveals automatic processing of affective facial expressions.
Cognition & Emotion, 12, 755-782.
Storbeck, J., & Clore, G. L. (2007). On the interdependence of cognition and emotion.
Cognition & Emotion, 21, 1212-1237.
Wilson, T. D., & Brekke, N. (1994). Mental contamination and mental correction: Unwanted
influences on judgements and evaluations. Psychological Bulletin, 116, 117-142.
Winkielman, P., Zajonc, R. B., & Schwarz, N. (1997). Subliminal affective priming resists
attributional interventions. Cognition & Emotion, 11, 433-465.
Wittgenstein, L. (1999). Philosophical investigations. (G. E. M. Anscombe, Trans.). Oxford,
UK: Blackwell Publishing (Original work published 1953).
Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American
Psychologist, 35, 151-175.