
Journal of Experimental Psychology:
Learning, Memory, and Cognition
2006, Vol. 32, No. 6, 1291–1303
Copyright 2006 by the American Psychological Association
0278-7393/06/$12.00 DOI: 10.1037/0278-7393.32.6.1291
Perceptual Representation as a Mechanism of Lexical Ambiguity
Resolution: An Investigation of Span and Processing Time
Carol J. Madden
Erasmus University Rotterdam
Rolf A. Zwaan
Florida State University
In 2 experiments, the authors investigated the ability of high- and low-span comprehenders to construe
subtle shades of meaning through perceptual representation. High- and low-span comprehenders responded to pictures that either matched or mismatched a target object’s shape as implied by the preceding
sentence context. At 750 ms after hearing the sentence describing the target object, both high- and
low-span comprehenders had activated a contextually appropriate perceptual representation of the target
object. However, only high-span comprehenders had perceptually represented the contextually appropriate meaning immediately upon hearing the sentence, whereas low-span comprehenders required more
processing time before the perceptual representation was activated. The results are interpreted in a
framework of co-occurring lexical representations and perceptual–motor representations.
Keywords: perceptual representations, lexical ambiguity, individual differences
Carol J. Madden, Psychology Department, Erasmus University Rotterdam, Rotterdam, the Netherlands; Rolf A. Zwaan, Department of Psychology, Florida State University.
This research was supported by National Institute of Mental Health
Grant MH-63972 to Rolf A. Zwaan and is based on a dissertation submitted
by Carol J. Madden to the Department of Psychology at Florida State
University in partial fulfillment of the requirements for the degree of doctor
of philosophy. We thank Stephen Powers, Michelle Peruche, Angela Huntenburg, Julie Quevedo, James Mahone, Colleen Mattingly, and Greg
Smith.
Correspondence concerning this article should be addressed to Carol
J. Madden, Psychology Department, T12-37, Erasmus University Rotterdam, Postbus 1738, 3000 DR Rotterdam, the Netherlands. E-mail:
[email protected]
One of the most remarkable capacities of the human mind is the
ability to perceive squiggles on a page or rapid patterns of sound
waves and transform this visual or auditory code into meaningful
concepts and ideas. These resulting meaning representations and
the process by which they are activated have been the focus of
much research in cognitive psychology over the past century.
There have been several recent demonstrations that reading or
listening to language evokes perceptual–motor representations
(Chambers, Tanenhaus, & Magnuson, 2004; Chao & Martin, 2000;
Glenberg & Kaschak, 2002; Kan, Barsalou, Solomon, Minor, &
Thompson-Schill, 2003; Kaschak et al., 2005; Klatzky, Pellegrino,
McCloskey, & Doherty, 1989; Morrow & Clark, 1988; Pecher,
Zeelenberg, & Barsalou, 2003; Richardson, Spivey, Barsalou, &
McRae, 2003; Solomon & Barsalou, 2001; Stanfield & Zwaan,
2001; Tabossi, 1988; Zwaan, Madden, Yaxley, & Aveyard, 2004;
Zwaan, Stanfield, & Yaxley, 2002). However, it is not clear
exactly what role these representations play in language comprehension. The aim of the present study is to show how perceptual–
motor representations can be viewed as a mechanism by which the
contextually appropriate meaning or sense of a given word is
construed during language processing.
Perceptual–motor representations are conceptualized as activations of experiential simulations of a described situation (see
Barsalou, 1999; Zwaan & Madden, 2005). According to this recent
framework, when comprehenders process language, they partially
reactivate previous traces of experience that are distributed across
multiple perceptual and motor modalities in the brain (Zwaan &
Madden, 2005). A growing body of recent research has demonstrated the activation of perceptual–motor representations during
language comprehension. For instance, in a study by Zwaan and
colleagues (Zwaan et al., 2002), participants responded as to
whether a pictured object had been mentioned in a sentence they
had just read. On experimental trials, the sentence always named
an object in a particular location (e.g., There was spaghetti in the
[box/pot]), and the subsequently pictured object was always the
mentioned object, requiring a yes response. However, the pictured
object could either match or mismatch the contextual constraints of
the preceding sentence. For instance, if a participant read the
sentence There was spaghetti in the pot, then a picture of cooked
spaghetti would be a matching picture, whereas a picture of uncooked spaghetti would be a mismatching picture. Participants
were faster to respond to pictures that perceptually matched rather
than mismatched the contextual constraints of the sentence, providing support for the idea that perceptual information is incorporated in language representations. In a second experiment participants did not have to respond as to whether the pictured object was
mentioned in the sentence but were simply asked to name the
pictured object. Again, participants were faster to name pictures
that perceptually matched rather than mismatched the contextual
constraints of the sentence. Furthermore, other studies have demonstrated that comprehenders routinely represent other perceptual
aspects of described entities, such as the direction of motion of
objects (Zwaan et al., 2004) and the orientation of objects (Stanfield & Zwaan, 2001).
Additionally, there is support for the idea that motor programs
for actions are activated during language comprehension (Glenberg & Kaschak, 2002; Klatzky et al., 1989). Glenberg and Kaschak (2002) found that participants were faster to respond to
sentences when the response action was compatible with the
direction of described motion. For instance, after reading a sentence such as Close the drawer, participants were faster to respond
if the correct response required moving their hand away from
rather than toward their body. This has been termed the action-sentence compatibility effect. Klatzky and colleagues (1989) previously demonstrated a similar effect, showing that participants are
faster at reading and comprehending a sentence about throwing
darts, for instance, when they are first instructed to form their hand
into the pinched fingers (dart-throwing) position. This facilitation
is consistent with the idea that language representations are embodied and are thus susceptible to influences from perceptual–
motor systems. Recently, Zwaan and Taylor (2006, Experiment 4)
have shown that the activation of motor processes during the
comprehension of action sentences is rather immediate, as it occurs
during processing of the verb describing the action in question.
It has certainly become evident that perceptual–motor representations are evoked during language processing. However, the precise role of these representations in the comprehension process is
still unclear. The present study is an attempt to link perceptual–
motor representations with established findings in the field of
psycholinguistics and thus to establish these representations as a
mechanism of language comprehension. Although it is clear that
both perceptual and motor representations are activated during
language processing, the present set of experiments focuses solely
on the perceptual aspect of these representations. First, we discuss
current models of ambiguity resolution and comprehension skill,
and then we describe two experiments that investigate how perceptual representations act as a mechanism for construing the
contextually appropriate meaning or sense of a word during language processing. Within these experiments, the time course of
meaning construal through perceptual representations is investigated for comprehenders of varying skill level.
Lexical Ambiguity Resolution
Lexical ambiguity occurs very frequently in written and spoken
language, as most words and phrases in the English language lead
to multiple possible interpretations (Britton, 1978). How the contextually appropriate meaning of a word is selected has been the
topic of much research in psycholinguistics. Theories of lexical
ambiguity resolution have often focused on homonyms,1 or words
with a single identical pronunciation and orthography but two or
more unrelated meanings (e.g., dog bark and tree bark). However,
theorists from the field of linguistics are quick to make the distinction between this type of ambiguity and polysemy, wherein
words with a single, identical pronunciation and orthography have
multiple related meaning senses (Cruse, 1986; Lyons, 1977). For
example, the word twist can be applied to knobs, ankles, limes,
dancing, and the truth, each yielding a slightly different but related
meaning from the next.
How comprehenders are able to resolve ambiguities with respect
to homonymy is rather straightforward, as it is generally agreed
that homonyms are associated with separate stored lexical entries
in the mental lexicon. Theories differ on issues such as the time
course of contextual influence on the meaning selection process
(Conrad, 1974; Glucksberg, Kreuz, & Rho, 1986; Lucas, 1987;
Onifer & Swinney, 1981; Schvaneveldt, Meyer, & Becker, 1976;
Simpson, 1981; Swinney, 1979), but there is agreement on the fact
that one of the available lexical entries must be selected and
integrated into the larger discourse model. The area of polysemy,
however, is more complicated, in that there is not general agreement on whether all senses of a given word are stored in a single
lexical entry or whether each sense has its own lexical entry in the
mental lexicon. Consequently, this research area has recently provided fruitful investigations in terms of informing theories of
lexical ambiguity.
Rodd, Gaskell, and Marslen-Wilson (2002) have provided evidence that the benefit for ambiguous words in lexical decision
tasks is most likely due to polysemous words rather than homonyms as originally thought. In fact, once these authors empirically distinguished between words with multiple unrelated meanings and words with related senses, they found that words with
multiple unrelated meanings were responded to more slowly than
controls, whereas words with related senses were responded to
more quickly than controls. Likewise, Frazier and Rayner (1990)
have shown that polysemous words yield shorter fixations than
words with multiple unrelated meanings. Furthermore, Klepousniotou (2002) has demonstrated that polysemous words are
accessed more quickly than homonyms.
These findings have been interpreted in various ways. Klepousniotou (2002) interpreted the polysemy advantage as evidence
for single, semantically rich lexical entries for polysemous words
(although see Klein & Murphy, 2001) that allow the multiple
related senses of polysemous words to facilitate processing of the
lexical entry as a whole. In contrast, multiple unrelated meanings
of homonyms are stored in separate lexical entries and therefore
compete for activation. Rodd et al. (2002) considered multiple
interpretations for the data, favoring the idea that polysemous
words have multiple representations but that these are highly
correlated lexical representations in distributed semantic networks.
As long as the task is noncontextual, the multiple activated representations work together to recruit patterns of overlapping
activation.
Once the task becomes more contextual, however, each of these
accounts necessitates the existence of a mechanism by which
representations can be constrained to the contextually appropriate
sense of a polysemous word. Highly correlated lexical representations in distributed semantic networks will not inhibit each other,
and thus the mechanism by which one particular sense may be
construed is unclear. Likewise, Klepousniotou (2002) argued for a
single lexical entry specified for the base sense of a polysemous
word, along with an underspecified lexical rule that allows for
derivation of extended meanings. Exactly how a particular word
sense is correctly construed in contextual language is unclear. This
problem becomes even more challenging when sense ambiguity is
considered for words that are thought to have only a single sense.
Consider the example There was spaghetti in the box/pot. These
two sentences refer to the same meaning of spaghetti. However,
there are aspects of the meaning of spaghetti that can change from
context to context, most salient in this case being the shape. Here
we can see that the possibilities of subtle shades of meaning that
can be construed are countless, not merely limited to the number
of dictionary entries or senses of a word. There must be a mechanism of sense construal for all words if comprehenders do in fact
take into account subtleties such as the shape of entities in their
representations. It seems that this gap in theory can be filled by the
perceptual–motor representations introduced above.
1
Previous studies have also used homographs, which share the same
orthography but not pronunciation, and homophones, which share the same
pronunciation but not orthography.
Perceptual–Motor Representations and Lexical
Representations
As discussed previously, there is a growing body of evidence
that comprehenders activate perceptual–motor representations during language processing. These perceptual–motor representations
can be functionally divided into two types: representations of
experience with the referent of a linguistic communication and
representations of experience with the linguistic code itself (see
Zwaan & Madden, 2005). Although according to our framework
both types of representations are actually perceptual–motor representations, we use this term in reference to representations of the
referent of a linguistic communication, whereas we use the term
lexical representations to refer to the representations of experience
with the linguistic code itself. Lexical representations are traces of
experience of hearing or seeing words, as well as pronouncing,
writing, or typing words.2 They are interconnected with other
lexical representations through co-occurrence, and they are connected to their associated perceptual–motor representations
through co-occurrence as well. Thus, in addition to activating one
or more stored lexical representations when a given word or
concept is encountered, comprehenders also partially reactivate
perceptual–motor representations, or previous traces of experience
with the referent associated with that lexical representation.
The activation of perceptual–motor representations through lexical representations serves to ground the language comprehension
process in one’s own experiences. Because lexical representations
themselves are often contextually underspecified, perceptual–
motor representations may act as a mechanism to automatically
construe the appropriate meaning or sense of the encountered
word. Thus, encountering the words eagle and fly together in a
sentence leads to activation of the lexical representations for these
two words, which in turn activate a perceptual–motor representation of the referent. In the perceptual–motor representation, many
senses of the flying eagle are automatically construed, such as
outstretched wings and tucked talons. Thus, the activation of
perceptual–motor representations by lexical representations may
be a mechanism for construing word senses. However, it is still
unclear exactly how and when perceptual–motor representations
are used to resolve these subtle ambiguities. In particular, the time
course of activating a perceptual–motor representation relative to
the lexical representation is unclear, as is the extent to which
comprehension skill mediates activation of the two types of
representations.
Individual Differences in Lexical Ambiguity Resolution
It is not the case that all comprehenders are equally skilled and
fast at selecting the appropriate meaning of a homonym, and so
one would not expect that all comprehenders would be equally
skilled and fast at activating perceptual–motor representations and
construing the appropriate sense of a word. In their research on
lexical ambiguity resolution of homonyms, Gernsbacher and colleagues have demonstrated that more skilled comprehenders are
better able to use sentence context to quickly constrain their
representations than less skilled comprehenders (Gernsbacher &
Faust, 1991; Gernsbacher, Varner, & Faust, 1990). In these studies, a sentence containing a homograph or homophone was followed by a test word that was related to the contextually appropriate or the contextually inappropriate meaning of the preceding
homograph/homophone. Skilled comprehenders (as determined by
Gernsbacher & Varner’s [1988] comprehension test battery) were
faster to deactivate the inappropriate meanings of the preceding
homographs.
Likewise, Van Petten, Weckerly, McIsaac, and Kutas (1997)
used modulations in the N400 component of event-related brain
potentials to show that only high-span comprehenders (as determined by Daneman & Carpenter’s [1980] Reading Span task) were
able to use sentence-level context immediately upon reading a
word in a sentence. Low-span comprehenders did not show immediate sensitivity to sentence-level context at normal reading
speeds but rather relied on single-word associations. Thus, the
phase at which sentence context is able to influence meaning
selection, at least in homographs, seems to vary with the language
ability of the individual.
While Gernsbacher and colleagues (Gernsbacher & Faust, 1991;
Gernsbacher et al., 1990) link the source of these individual
differences in the ability to use context to the efficiency of a
suppression mechanism, others have posited that the difference
stems from the richness of representations and domain expertise
(MacDonald & Christiansen, 2002; McNamara & McDaniel,
2004; Pearlmutter & MacDonald, 1995; Zwaan & Truitt, 2000).
For instance, McNamara and McDaniel (2004) demonstrated that
the activation of relevant background knowledge rather than a
mechanism to suppress irrelevant meanings could explain the
differential use of contextual information to select appropriate
meanings. Furthermore, Zwaan and Truitt (2000) showed that the
extent to which smokers were able to reject locally consistent but
globally inconsistent smoking-related words was correlated with
the amount of lifetime smoking experience. Finally, although their
findings pertain to age-related differences, Dagerman, MacDonald,
and Harm (2006) suggested that the ability to use contextual
information (but not lexical frequency) is a factor of processing
speed. In their study, computational models were able to simulate
older and younger adults’ ability to use context by means of
varying a processing speed parameter. This idea of processing
speed as an underlying source of individual differences might also
apply to younger adults of varying comprehension skill.
2
Although the current study does not provide evidence concerning the
nature of lexical representations, the idea that they are perceptual representations is consistent with Goldinger’s (1998) episodic theory of the
lexicon, which postulates that episodic traces of perceptual input are stored
in the lexicon from each individual instance of hearing words. These
perceptual lexical representations (e.g., experiences with the word spaghetti) may then be linked to referent representations (e.g., experiences
with actual spaghetti) through co-occurrence.
The Present Study
Bridging the gap between research on lexical ambiguity resolution, individual differences, and perceptual–motor representations
in language comprehension, the present study incorporates the
empirical design of both Gernsbacher’s work on comprehension
skill using homographs and Zwaan and colleagues’ perceptual
mismatch paradigm. The resulting design investigates the time
course of meaning construal in high- and low-span comprehenders
to demonstrate how perceptual–motor representations might act as
a mechanism for construing the contextually appropriate meaning
or sense of a word during language processing. Although high-span comprehenders are clearly expected to use sentence context
to activate perceptual–motor representations more quickly than
low-span comprehenders, the following set of experiments is
aimed at better understanding exactly when and how lexical representations and perceptual–motor representations are activated for
high- and low-span comprehenders.
Experiment 1
The present study focuses on the perceptual aspect of
perceptual–motor representations activated during language processing, so the term will be shortened to perceptual representations in the context of our experimental manipulation. To investigate how perceptual representations act as a mechanism for
construing the contextually appropriate sense of a word, we examined the time course of the perceptual representation at two
levels of reading span and two levels of probe latency. In the
current experiment, high- and low-span comprehenders heard sentences describing the location of a target object. Immediate and
delayed picture presentations were used to probe for context-appropriate and context-inappropriate perceptual features of the
target words. We expected that pictures matching the contextual
constraints of the preceding sentence would be responded to more
quickly than mismatching pictures, provided an adequate perceptual representation had been activated at the time the picture was
presented. Thus, the match advantage was more likely to be
observed for high-span comprehenders and more likely to occur at
the later probe latency.
There are several possible sources for these predicted span
differences. First, the high-span comprehenders may activate the
appropriate perceptual representation more quickly because they
have a more efficient suppression mechanism than low-span comprehenders (Gernsbacher & Faust, 1991; Gernsbacher et al., 1990).
However, this is unlikely to be the cause of the difference in the present
design, as the disambiguation required here is not between mutually exclusive meanings, or even related senses, but rather between
very subtle perceptual features of a given sense of a concept. It is
unlikely that active suppression would have an influence within a
given sense of a concept. Alternatively, the difference might stem
from domain expertise (McNamara & McDaniel, 2004; Zwaan &
Truitt, 2000), such that high-span comprehenders happen to have
more experience with the content of the experimental items and
thus have richer networks of lexical and perceptual–motor representations. However, this explanation is also unlikely within the
context of the current study, as the sentences and pictures used
here are a range of everyday items such as spaghetti, eagles, and
newspapers, which should not represent an area of expertise for
any given subject.
Although comprehenders would not be expected to vary immensely in expertise with these everyday concepts, they might
exhibit wide differences in the degree to which perceptual–motor
representations are activated as a function of seeing or hearing a
word. In other words, the difference between high- and low-span
comprehenders might stem from a related but more procedural
source of expertise, in that the links between the lexical and
perceptual–motor representations might be stronger in high-span
comprehenders. In this case, the process of indirectly activating
perceptual–motor representations through lexical representation
may be reinforced through more frequent reading, and thus the
efficiency of the entire comprehension process would be improved
(MacDonald & Christiansen, 2002; Pearlmutter & MacDonald,
1995). It is unclear whether these strengthened links between
lexical and perceptual–motor representations would allow for increased processing speed in activating representations (Dagerman
et al., 2006) or greater precision in activating richer or perhaps
larger networks of representations (MacDonald & Christiansen,
2002). Thus, the present study aims to provide a better understanding of the time course and relative levels of activation of the lexical
and perceptual–motor representations for high- and low-span
comprehenders.
Method
Participants. One hundred sixty undergraduate students enrolled at
Florida State University participated in the experiment as part of a course
requirement.3 All participants were native English speakers.
Materials. Twenty-eight experimental sentence pairs were adapted
from Zwaan et al. (2002), each describing a target object in a location (see
Appendix A for sample stimuli). The sentences were altered such that the
target object was always the final word of the sentence—for example, In
the box/pot there was spaghetti. The sentence pairs were constructed such
that each target object was described in two locations that implied different
object shapes (e.g., long, straight strands of uncooked spaghetti or raveled
strands of cooked spaghetti). Two images, depicting the object in the two
implied shapes, were also constructed to correspond to each experimental
sentence pair. This yielded two sentences and two pictures for each target
object. The pictures were line drawings, regular drawings, or photos, all of
which were black and white and occupied a square of about 3 in. (7.62 cm)
on the center of the screen. Each experimental sentence could be paired
with a picture that matched or mismatched the implied shape of the target
object, yielding four possible sentence–picture combinations. Participants
were to see only one of these four possible combinations for each target
object, and so four experimental lists were created and counterbalanced
with respect to implied shape and match–mismatch condition of the 28
target objects. Between each of these four lists, the interstimulus interval
(ISI) between the offset of the sentence and the presentation of the picture
was varied. Pictures could appear immediately upon the offset of the final
word of the sentence (0-ms ISI) or 750 ms after the offset of the final word
of the sentence (750-ms ISI), now yielding eight lists. ISI was varied
between lists rather than within list so that participants would not notice
differences in picture onset delay from trial to trial. A given participant was
exposed to only one of the eight lists.
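The list structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says the four lists were counterbalanced but does not say how, so the Latin-square rotation below is one plausible scheme rather than the authors' procedure, and the names build_lists, shape_a, and shape_b are hypothetical.

```python
from collections import Counter
from itertools import product

N_ITEMS = 28                      # experimental target objects (from the text)
ISIS_MS = (0, 750)                # between-list ISI levels (from the text)
SHAPES = ("shape_a", "shape_b")   # hypothetical labels for the two implied shapes
PICTURES = ("match", "mismatch")
COMBOS = list(product(SHAPES, PICTURES))  # four sentence-picture combinations


def build_lists():
    """Return {(isi_ms, list_index): trials} for the eight lists.

    Assumption: combinations are rotated over items Latin-square style,
    so each item appears in every combination exactly once across the
    four lists that share an ISI.
    """
    lists = {}
    for isi_ms in ISIS_MS:
        for list_index in range(len(COMBOS)):
            trials = []
            for item in range(N_ITEMS):
                shape, picture = COMBOS[(item + list_index) % len(COMBOS)]
                trials.append({"item": item, "implied_shape": shape,
                               "picture": picture, "isi_ms": isi_ms})
            lists[(isi_ms, list_index)] = trials
    return lists


lists = build_lists()
per_condition = Counter(t["picture"] for t in lists[(0, 0)])
print(len(lists), per_condition["match"], per_condition["mismatch"])  # prints: 8 14 14
```

Under this scheme each participant sees 14 matching and 14 mismatching experimental pictures, and no target object is repeated within a list.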
3
A total of 184 students actually participated in Experiment 1 to yield
160 data sets that could be used in the analysis design: 10 high-span and 10
low-span comprehenders for each of the four stimulus lists at each of the
two ISI presentations. Similarly, more than 160 students participated in
Experiment 2 to yield the desired number of 10 high-span and 10 low-span
participants per design cell.
In addition, 56 similar filler sentences and pictures were adapted from
the earlier study. Each of the 112 sentences (56 filler sentences and 28 pairs
of experimental sentences) was recorded to a Waveform audio format file
by a female native speaker of American English, and the pictures were
converted to black and white and scaled to occupy a square of about 3 in. (7.62 cm)
on the center of the screen. The pictured object was mentioned in the sentence on
half of the trials (all 28 experimental trials and 14 of the filler trials).
Participants were told to respond as to whether the pictured object had been
mentioned in the preceding sentence, using keys labeled Y and N. On 24 of
the filler trials, a question would appear after the picture-comparison
response had been made. These questions required inferences about the
sentences and were included to ensure that participants would make an
effort to process the sentences at a relatively deep level. For instance, the
sentence There was a flower in the vase was followed by the question Was
the vase empty? Participants answered these questions using keys that were
labeled with Y and N stickers. Because the participants did not know which
sentences would be followed by a question, they had to comprehend each
sentence to ensure a sufficient level of understanding. Both the experimental task and the Reading Span task described below were run on PCs with
19-in. flat-screen displays using the E-Prime stimulus presentation software (Schneider, Eschman, & Zuccolotto, 2002).
All participants completed a computer version of the Reading Span task
(Conway, Cowan, Bunting, Therriault, & Minkoff, 2002). On a given trial
of the Reading Span task, a participant would read aloud a sentence,
answer aloud “yes” or “no” as to whether it made sense, and then read
aloud the capitalized letter at the end of the sentence, remembering the
letter for a later test. Participants would see two, three, four, or five of these
trials in a set before having to write the final letters they could recall for
that set on a formatted sheet. Participants completed 3 practice sets and 12
experimental sets. Although it is impossible to administer this test in such
a way as to prohibit strategy use altogether (see McNamara & Scott, 2001),
the experimenter sat with participants and controlled the progression from
trial to trial to prevent participants from rehearsing the letters before they
had to write them down at the end of the set. The test was scored by adding
together the total number of correctly recalled letters over all sets. Participants were told to write the letters in the correct order, but letters were not
counted as incorrect if they were recalled in the wrong spaces. This scoring
practice was adopted because most misplaced letters ended up in the wrong
space only because one early letter was omitted from recall for that set, and
scoring the remainder of the letters as incorrect seemed to underestimate
the true span size. On the basis of a median split of scores, participants
were classified as high- or low-span comprehenders.
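The scoring rule just described (credit for every correctly recalled letter regardless of position, followed by a median split) can be sketched as follows. The function names are hypothetical, and the handling of scores that fall exactly on the median is an assumption, because the text does not say how ties were classified.

```python
from collections import Counter
from statistics import median


def score_set(presented, recalled):
    """Count recalled letters that were presented in the set, ignoring position."""
    remaining = Counter(presented)
    correct = 0
    for letter in recalled:
        if remaining[letter] > 0:
            remaining[letter] -= 1   # each presented letter is credited once
            correct += 1
    return correct


def span_score(sets):
    """Total correctly recalled letters over all (presented, recalled) sets."""
    return sum(score_set(presented, recalled) for presented, recalled in sets)


def median_split(scores):
    """Classify participants as high- or low-span around the median score.

    Assumption: scores equal to the median are assigned to the low-span group.
    """
    cut = median(scores.values())
    return {pid: ("high" if score > cut else "low") for pid, score in scores.items()}
```

For example, a participant who saw Q, R, T, K and wrote R, T, K receives 3 points for that set under this rule, whereas strict positional scoring would award none.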
The Reading Span task assesses a participant’s ability to maintain
linguistic information in working memory while simultaneously processing
sentences. This is a crucial component of the language comprehension
process, as readers and listeners constantly hold words or clauses in
working memory while processing other words or clauses until both can be
integrated. This measure has been used often in language comprehension
experiments and correlates well with other measures of reading comprehension, such as verbal SAT (Daneman & Carpenter, 1980; Daneman &
Merikle, 1996). The sentence-processing component of the Reading Span
task directly taps into the efficiency of comprehension, which explains the
correlation with other measures of reading comprehension. In addition, the
Reading Span task indirectly measures comprehension, in that the time and
resources remaining after comprehension are devoted to memory storage
and maintenance. Efficient comprehenders activate representations
quickly, have more time and resources available for memory storage and
maintenance, and thus have higher span scores (for a discussion, see
Friedman & Miyake, 2003).
Procedure. Participants were met one at a time in the laboratory and
asked to sign a consent form. Then the participant was shown to another
room to complete the Reading Span task. The experimenter sat next to the
participant as he or she completed this task, advancing the participant from
trial to trial so that rehearsal or other memory strategies would be avoided.
After completing the Reading Span task, participants began the picture-response experiment. On a given trial, participants pressed the spacebar to
hear a sentence over headphones (e.g., “In the box, there was spaghetti”).
The sentence was followed by a picture presentation either immediately
after the offset of the final word (0 ms between sentence offset and picture
presentation) or after a delay (750 ms between sentence offset and picture
presentation). The participant’s task was to make a response as to whether
the pictured object had been mentioned in the preceding sentence. If the
pictured object had been mentioned in the sentence, the participant was to
press the J key, which was covered by a Y sticker. All 28 of the experimental trials (both matching and mismatching pictures) and 14 of the filler
trials required a yes response. If the pictured object had not been mentioned
in the preceding sentence, as was the case for the 42 remaining filler trials,
the participant was to press the F key, which was covered by an N sticker
for “no.” Once the experiment was finished, participants were debriefed,
assigned partial course credit for their participation, and dismissed.
Design. Experiment 1 incorporated a 2 (match vs. mismatch) × 2 (ISI:
0 ms vs. 750 ms) × 2 (Reading Span: high vs. low) mixed design, with
match as a within-subject variable and ISI and reading span as between-subjects variables. On all 28 experimental trials, the presented image
depicted the final word of the sentence (the target object) and thus required
a yes response. However, on these experimental trials, the target object
could be pictured in a shape that either matched or did not match the
contextual constraints of the sentence. For example, if the participant heard
“In the box there was spaghetti,” a matching picture would be spaghetti that
was uncooked and a mismatching picture would be cooked spaghetti. The
participant should respond yes in either case, because the pictured object
was mentioned in the sentence, but we expected the responses to be faster
for matching than for mismatching pictures.
To summarize, the current experiment was designed to detect differences
in the time course of construing subtle shades of meaning for unambiguous
words in high- and low-span comprehenders. We predicted that when the
perceptual properties of the pictured object matched the contextual constraints of the preceding sentence, responses would be facilitated relative to
when the picture mismatched the contextual constraints of the preceding
sentence. Thus, responses should be faster for matching than for mismatching pictures. However, this would be the case only if an adequate perceptual representation had been activated at the time of the picture presentation. Depending on the time course of activating perceptual–motor
representations relative to lexical representations and the demand on resources for the two types of representations, it would be more likely for a
match advantage to be observed in the later ISI condition and in the
high-span group.
Results and Discussion
The dependent measure of interest was the participant’s time to
respond to the presented picture. Analyses using response accuracy
were also conducted and are reported in Appendix B. Although list
was included as a factor in all analyses for both experiments,
effects for the list variable are not reported given the lack of
theoretical relevance (Pollatsek & Well, 1995; Raaijmakers, Schrijnemakers, & Gremmen, 1999). All analyses were conducted both
with variability due to subjects and with variability due to items in
the error term. These analyses are indicated by the subscripts 1 and
2, respectively, for both experiments. Incorrect responses were not
included in the reported analyses, and any response time more than
two standard deviations above or below a participant’s mean for a
given condition was removed prior to running the analyses. This
constituted removal of about 5% of the data.
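The trimming rule described above can be sketched as follows. This is an illustrative reading of the rule, not the authors' analysis script: the function name `trim_outliers`, the tuple layout of the trial records, and the use of the sample standard deviation (n - 1 in the denominator) are assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_outliers(trials, n_sd=2.0):
    """Drop response times more than n_sd SDs from the cell mean.

    trials: list of (participant, condition, rt_ms) tuples containing
    correct responses only. A cell is one participant in one condition.
    Assumption: the sample SD (statistics.stdev) is used.
    """
    cells = defaultdict(list)
    for pid, cond, rt in trials:
        cells[(pid, cond)].append(rt)

    kept = []
    for (pid, cond), rts in cells.items():
        if len(rts) < 2:
            # An SD cannot be computed from a single trial; keep it.
            kept.extend((pid, cond, rt) for rt in rts)
            continue
        cell_mean, cell_sd = mean(rts), stdev(rts)
        kept.extend((pid, cond, rt) for rt in rts
                    if abs(rt - cell_mean) <= n_sd * cell_sd)
    return kept


# Hypothetical response times (ms) for one participant in one condition.
example = [("p1", "match", rt) for rt in (700, 720, 710, 690, 705, 715, 1500)]
trimmed = trim_outliers(example)
```

In this hypothetical cell the 1,500-ms response lies more than two standard deviations above the cell mean and is dropped, leaving six trials.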
To ensure that the reported analyses reflected processes of true
sentence comprehension, 10 participants who scored less than 75%
correct on the comprehension questions were excluded from the
analyses. In all but 3 of these cases, extra participants had been run
in the particular condition, and the low-accuracy data were replaced (high span: 0-ms n = 40, 750-ms n = 40; low span: 0-ms
n = 40, 750-ms n = 37). In addition, one picture yielded particularly low accuracy in both experiments. This item was a rather
amorphous fillet of fish, which, we realized in hindsight, must
have been difficult for participants to recognize and was therefore
excluded from the analyses in Experiments 1 and 2.
The means and standard deviations for the response times from
Experiment 1 are displayed in Table 1. Participants were classified
as high- or low-span comprehenders according to a median split of
scores on the Reading Span task (M = 31.6, SD = 4.2; for
high-span comprehenders, M = 35.0, SD = 2.5; for low-span
comprehenders, M = 28.2, SD = 2.5). The overall mixed analysis
of variance (ANOVA) with list as a between-subjects factor
showed a clear effect of match, indicating that participants were
faster to respond to the picture when it matched rather than
mismatched the contextual constraints of the preceding sentence:
F1(1, 141) = 28.82, p < .001, MSE = 8,868; F2(1, 51) = 34.38,
p < .001, MSE = 15,423. This effect was qualified by a three-way
interaction among match, ISI, and reading span, indicating that the
effect of match varied for high- and low-span comprehenders at
the two ISIs: F1(1, 141) = 4.04, p < .05; F2(1, 51) = 4.79, p <
.05, MSE = 6,462.
To understand the nature of this three-way interaction, separate
analyses were conducted for the 0-ms ISI and the 750-ms ISI. The
overall effect of match was observed in both the 0-ms ISI analysis,
F1(1, 72) = 11.95, p < .001, MSE = 9,482; F2(1, 51) = 19.27,
p < .001, MSE = 10,178, and the 750-ms analysis, F1(1, 69) =
17.37, p < .001, MSE = 8,227; F2(1, 51) = 2,317, p < .001,
MSE = 16,145. However, the 0-ms ISI analysis revealed an
interaction between match and reading span, F1(1, 72) = 4.61, p <
.05; F2(1, 51) = 3.89, p = .05, MSE = 9,154, whereas no such
interaction was present in the 750-ms ISI analysis (F1 < 1.6; F2 <
0.5). This suggests that high-span comprehenders showed a larger
difference between match and mismatch responses than low-span
comprehenders only in the 0-ms ISI condition. Within-group comparisons for high- and low-span comprehenders at each ISI were
conducted to confirm this observation. Indeed, at the 0-ms ISI the
high-span comprehenders showed the predicted match effect, F1(1,
36) = 10.50, p < .01, MSE = 14,142; F2(1, 51) = 18.56, p <
.001, MSE = 10,745, whereas low-span comprehenders did not
show a significant difference between responses to the matching
and mismatching pictures at the early ISI by subjects (F1 < 1.7),
although this comparison almost reached significance by items,
F2(1, 51) = 3.77, p = .06, MSE = 8,587. At the 750-ms ISI both
high- and low-span comprehenders showed the predicted match
effect: high span, F1(1, 36) = 7.53, p < .01, MSE = 6,957; F2(1,
51) = 15.91, p < .001, MSE = 8,717; low span, F1(1, 33) = 9.64,
p < .01, MSE = 9,612; F2(1, 51) = 16.77, p < .001, MSE =
12,487.
Table 1
Means (and Standard Deviations) for Response Times in Experiment 1
          High-span comprehenders     Low-span comprehenders
ISI       Match        Mismatch       Match        Mismatch
0 ms      791 (144)    875 (260)      782 (139)    802 (165)
750 ms    801 (192)    859 (196)      828 (226)    898 (295)
Note. ISI = interstimulus interval.
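The size of the match effect in each cell of the design can be read off the Table 1 means as the mismatch mean minus the match mean. The short sketch below does this arithmetic. It assumes the cell assignment implied by the table layout (values listed column by column, with the 0-ms row before the 750-ms row), and the names `table1_means` and `match_effect` are illustrative.

```python
# Cell means (ms) from Table 1, keyed by (span group, ISI in ms).
# Assumption: values are assigned to cells column by column.
table1_means = {
    ("high", 0): {"match": 791, "mismatch": 875},
    ("high", 750): {"match": 801, "mismatch": 859},
    ("low", 0): {"match": 782, "mismatch": 802},
    ("low", 750): {"match": 828, "mismatch": 898},
}


def match_effect(cell):
    """Match advantage in ms: mismatch mean minus match mean."""
    return cell["mismatch"] - cell["match"]


effects = {key: match_effect(cell) for key, cell in table1_means.items()}
for (span, isi), effect in effects.items():
    print(f"{span}-span, {isi}-ms ISI: {effect} ms")
```

Under that assumption the differences are 84 ms and 58 ms for high-span comprehenders at the 0-ms and 750-ms ISIs, and 20 ms and 70 ms for low-span comprehenders, which agrees with the pattern of significant and nonsignificant contrasts reported above.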
The pattern of results for Experiment 1 demonstrates that the
high-span comprehenders were more quickly able to activate a
perceptual representation of the target object in the appropriate
context. The interactions and within-group comparisons show that
the match effect at the immediate ISI was largely due to the
high-span comprehenders, who showed a stronger effect of match
than the low-span comprehenders. High-span comprehenders were
faster to verify pictures when the preceding sentence context
matched rather than mismatched the shape of the depicted object,
whereas low-span comprehenders showed only a weak advantage
for matching versus mismatching pictures, significant only in the
contrast by items. At the 750-ms ISI, both high- and low-span
comprehenders were faster to verify matching pictures than mismatching pictures. Thus, the current experiment showed that when
context is sufficiently constraining, high-span comprehenders were
able to use sentence context to quickly activate a perceptual
representation of the appropriate context, whereas low-span comprehenders could do so only after the delay.
The observed pattern of responses supports the idea that construal of the contextually appropriate meaning of a sentence requires the activation of a perceptual representation. At the immediate ISI, the low-span comprehenders had not yet activated a
perceptual representation of the sentence, evidenced by a lack of
an effect for matching versus mismatching pictures. Thus, they had
not construed the more subtle aspects of the sentence’s meaning at
the immediate ISI. However, by the time the picture was presented
in the 750-ms ISI, they had had enough time to activate a contextually appropriate perceptual representation of the described situation. At this point, the low-span comprehenders had fully construed the sentence meaning, as they showed the predicted match
effect just as the high-span comprehenders did. The relatively fast
responses of the low-span comprehenders to both matching and
mismatching pictures at the immediate ISI are considered in the
General Discussion.
A potential criticism of the present experiment is that the forced-choice task taps into the comprehension processes relatively late.
Thus, the perceptual representation at the time of response is
susceptible to many postaccess influences (see Simpson, 1994). It
is possible that the high-span comprehenders would also show a
lack of match effect at the early interval if the task were to tap the
process of perceptual representation at an earlier stage. Indeed, less
skilled readers have shown impaired performance relative to
skilled readers on tasks that require explicit comparison between
the test item and the preceding context, yet these groups show
similar performance on a naming task that does not require explicit
comparison (Long, Seely, & Oppy, 1999). To address this issue,
we conducted another experiment in which a naming task was used
in place of the forced-choice task. The naming task is thought to
assess the comprehension process in its early stages, decreasing its
susceptibility to postaccess influences (Forster, 1981; Simpson,
1994) and minimizing response conflict (Long et al., 1999).
Experiment 2
Experiment 2 was identical to Experiment 1 except that instead
of making a response as to whether the pictured object was
mentioned in the sentence, participants were instructed to name the
pictured object as quickly as possible. The naming task is a more
direct measure, designed to tap into the comprehension process at
lexical access, thus eliminating some of the cognitive processes
that are required for the task used in Experiment 1. The decision
task in Experiment 1 required recognition of the picture, lexical
access of the name of the object, comparison of that name to the
words in the sentence that preceded the picture, and an affirmative
or negative response based on that comparison. In contrast, the
naming task in Experiment 2 required only recognition of the
pictured object, lexical access of the name of that object, and a
vocal response, reporting the accessed name. It is important to note
that any comparison of the pictured object to the preceding sentence is eliminated in the naming task of Experiment 2.
Method
Participants. One hundred sixty undergraduate students enrolled at
Florida State University participated in the experiment as part of a course
requirement. All participants were native English speakers, and none had
participated in the previous experiment.
Materials and procedure. The materials and procedure for Experiment
2 were identical to those of Experiment 1, except that during the sentence–
picture experiment, participants were told to simply name the pictured
object as fast as possible, regardless of whether the picture was related to
the sentence. A microphone attached to the headphone set relayed the voice
input to a response box, where E-Prime software logged the latency of
voice onset for each trial. An experimenter sat with the participant during
the experiment to record any misnamed trials or trials in which the
microphone did not record the response correctly. Given that it is possible
for participants to perform this naming task without actually attending to
the sentences, it was especially important that the same yes–no inference
questions on 24 of the filler trials from Experiment 1 were again used in
Experiment 2. Participants answered these questions using keys that were
labeled with Y and N stickers. Because the participants did not know which
sentences would be followed by a question, they had to comprehend each
sentence to ensure a sufficient level of understanding.
Design. The design for Experiment 2 was identical to that for Experiment 1. It was a 2 (match vs. mismatch) × 2 (ISI: 0 ms vs. 750 ms) × 2
(Reading Span: high vs. low) mixed design, with match as a within-subject
variable and ISI and reading span as between-subjects variables. Just as in
Experiment 1, we predicted that when the perceptual properties of the
pictured object matched the contextual constraints of the preceding sentence, responses would be facilitated relative to when the picture mismatched the contextual constraints of the preceding sentence. Thus, responses should be faster for matching than for mismatching pictures, but
only if an adequate perceptual representation has been activated at the time
of the picture presentation.
Results and Discussion
The dependent measure of interest in Experiment 2 was the
participant’s time to name the presented picture. Misnamed trials
and trials in which the microphone did not record the response
correctly were not included in the reported analyses. Because it
was impossible to distinguish between misnamed trials and equipment error (microphone and response box), accuracy data were not
analyzed for Experiment 2. Naming latencies over 3 s as well as
naming latencies over or under two standard deviations from a
participant’s mean for a given condition were removed prior to
running the analyses. This constituted removal of less than 6% of
the data. The data from 2 of the 160 participants were excluded for
having too few usable trials (misnames or equipment error), and 2
new participants were run to replace them. The data from 5 of the
160 participants were excluded for responding too slowly (having
at least one of their condition means above 1,000 ms), and 5 new
participants were run to replace these data. Finally, as in Experiment 1, 7 participants who did not answer at least 75% of the
comprehension questions correctly were replaced (high span: 0-ms
n = 40, 750-ms n = 40; low span: 0-ms n = 40, 750-ms n = 40).
The means and standard deviations for the naming times from
Experiment 2 are displayed in Table 2. Participants were classified
as high- or low-span comprehenders on the basis of a median split
of scores on the Reading Span task, which yielded values very
similar to those observed in Experiment 1 (M = 31.5, SD = 4.7;
for high-span comprehenders, M = 35.3, SD = 2.6; for low-span
comprehenders, M = 27.6, SD = 2.7). As in Experiment 1, the
overall mixed ANOVA with list as a between-subjects factor
showed a clear effect of match, indicating that participants were
faster to name the pictured object when it matched rather than
mismatched the contextual constraints of the preceding sentence:
F1(1, 144) = 10.19, p < .01, MSE = 2,338; F2(1, 51) = 18.05,
p < .001, MSE = 2,902. The predicted three-way interaction
among match, reading span, and ISI did not reach conventional
levels of significance: F1(1, 144) = 3.50, p = .06; F2(1, 51) =
2.52, p = .12, MSE = 1,832. Nonetheless, separate analyses for
the 0-ms ISI and the 750-ms ISI were conducted.
In the immediate ISI analysis, the main effect of match reached
significance only in the analysis by items, suggesting that participants were already able to name the pictured objects more quickly
when they matched rather than mismatched the contextual constraints of the preceding sentence: F1(1, 72) = 2.73, p = .10,
MSE = 2,428; F2(1, 51) = 7.77, p < .01, MSE = 2,560. This main
effect was qualified by a significant interaction between match and
reading span, F1(1, 72) = 5.72, p < .05; F2(1, 51) = 5.97, p < .05,
MSE = 2,487. This indicated that just as in Experiment 1 the
marginal match effect at the immediate ISI was largely due to the
high-span comprehenders. Contrast tests for high- and low-span
comprehenders confirmed this notion, as the high-span comprehenders showed a significant match effect, F1(1, 36) = 6.57, p <
.05, MSE = 3,039; F2(1, 51) = 10.73, p < .01, MSE = 3,219,
whereas low-span comprehenders showed no difference in naming
the matching and mismatching pictures at the immediate ISI (both
Fs < 1).
Table 2
Means (and Standard Deviations) for Response Times in Experiment 2
          High-span comprehenders     Low-span comprehenders
ISI       Match        Mismatch       Match        Mismatch
0 ms      655 (91)     687 (117)      681 (101)    674 (81)
750 ms    662 (110)    681 (120)      701 (98)     724 (117)
Note. ISI = interstimulus interval.
The 750-ms ISI analysis again yielded a main effect of match,
indicating that overall, participants were faster to name the pictures
when they matched rather than mismatched the contextual constraints of the preceding sentence: F1(1, 72) = 8.34, p < .01,
MSE = 2,248; F2(1, 51) = 12.97, p < .001, MSE = 2,574.
However, the 750-ms ISI analysis did not reveal an interaction
between match and reading span (both Fs < 1), suggesting that
both high- and low-span comprehenders were contributing to the
match effect in this condition. The contrast tests for both high- and
low-span comprehenders showed an effect of match: high span,
F1(1, 36) = 5.23, p < .05, MSE = 1,543; F2(1, 51) = 8.93, p <
.01, MSE = 2,431; low span, F1(1, 36) = 3.65, p = .06, MSE =
2,953; F2(1, 51) = 4.52, p < .05, MSE = 2,730.
The pattern of results for Experiment 2 was largely the same as
the pattern observed for Experiment 1. The high-span comprehenders perceptually represented the target object in the appropriate context immediately upon hearing the sentence and therefore
showed a significant match effect at both the immediate ISI and
the 750-ms ISI. Low-span comprehenders, however, were equally
fast to name both the contextually appropriate and inappropriate
pictures at the immediate ISI. Only after 750 ms did they exhibit
the predicted match effect.
Given the task differences between the two experiments, the
effects in Experiment 2 were expected to be weaker. In Experiment
1 participants were forced to compare the response-eliciting picture with the preceding sentence, so it is more likely that participants’ representations would affect the response to the picture. The
task in Experiment 2 did not require the comprehenders to refer to
the preceding sentence while responding to the picture. Speeded
naming of the pictured object could have been executed without
associating it in any way with the preceding sentence, whereas in
Experiment 1, the pictured object needed to be compared with the
preceding sentence before a response could be made. Even though
the sentences had to be kept active in short-term memory in order
to answer the comprehension questions that sometimes followed,
the comprehender might have been able to perform the naming
task independently of the sentence comprehension task. In fact, it
might have been possible for comprehenders to activate a
perceptual–motor representation of the sentence only after the fact,
when a question was encountered. Thus, it is likely that the match
effect was constrained by the somewhat shallow level of processing that the sentence received as participants settled into the task
demands of Experiment 2. Given that the task deemphasized the
sentence context and the naming task assessed the comprehension
process at an earlier stage than in Experiment 1, Experiment 2
provided a stronger test of the hypothesis than Experiment 1.
In sum, although the task in Experiment 2 was not as explicitly
tied to comprehension of the preceding sentence context as it was
in Experiment 1, the results looked qualitatively similar in the two
experiments. In both experiments, only the high-span comprehenders showed the match effect at the immediate ISI, whereas both
high- and low-span comprehenders showed the match effect at the
750-ms ISI. The fact that the pattern of results was similar, even
when a more shallow naming task was used, provided an important
manipulation check with regard to the comparison–response task
used in Experiment 1. The consistent pattern of results from
Experiment 2 addressed the possible criticism that the nature of the
comparison–response task used in Experiment 1 tapped into the
comprehension process at a late stage of comprehension and that
thus the effects might have been caused merely by task demands.4
General Discussion
The current study was aimed at investigating how high- and
low-span comprehenders activate perceptual representations as a
mechanism for construing the contextually appropriate sense of a
word. The two experiments presented here used sentence–picture
tasks to assess the time course of perceptual representations in
high- and low-span comprehenders. Experiment 1 used a
comparison–response task that was modeled after lexical ambiguity studies using homographs. However, in this experiment, rather
than encountering a homograph in a biasing sentence context,
participants had to disambiguate the shape of a described object
during comprehension on the basis of the preceding sentence
context. This experiment yielded a three-way interaction among
contextual match, ISI, and reading span, whereby only the high-span comprehenders showed the predicted match effect at the
immediate ISI, but all comprehenders showed the match effect at
the later ISI. Experiment 2 provides an extension of these findings,
in that a more direct measure, naming time, was used. This
experiment qualitatively replicated the pattern of results reported
in Experiment 1, despite decreased reference to the preceding
sentence while performing the task.
The present pattern of results is consistent with previous studies
of lexical ambiguity from the field of psycholinguistics. In both of
the current experiments, the pattern of results for low-span comprehenders parallels studies that have used homographs to show
that the time course of contextual constraint is often delayed, such
that multiple meanings are initially activated, followed by the
deactivation of the inappropriate meanings (Conrad, 1974; Lucas,
1987; Onifer & Swinney, 1981; Swinney, 1979). In contrast, the
pattern of results for the high-span comprehenders is consistent
with studies that show immediate contextual constraint on representations (Chambers, Tanenhaus, Eberhard, Filip, & Carlson,
2002; Chambers et al., 2004; Dahan & Tanenhaus, 2004; Glucksberg et al., 1986; Hess, Foss, & Carroll, 1995; MacDonald, 1994;
Schvaneveldt et al., 1976; Simpson, 1981). Also, the interaction
between reading span and processing time (ISI) is consistent with
homograph studies showing that more skilled/experienced/
younger/high-span comprehenders are better able to use sentence
context to quickly constrain their representations than less skilled/
experienced/older/low-span comprehenders (Dagerman et al.,
2006; Gernsbacher & Faust, 1991; Gernsbacher et al., 1990; McNamara & McDaniel, 2004; Van Petten et al., 1997).
4 It might be argued that the very presence of a picture in the task
changes the comprehension process from what it might have been, given
the sentence alone. Thus, perhaps perceptual representations are activated
here only because the picture cue was presented. Although this explanation
cannot be ruled out given the current data set, we find it unlikely that the
perceptual representations evident in Experiment 2 are cue dependent,
given that the picture-naming task does not require reference to the preceding sentence. Furthermore, there is evidence from our lab that perceptual representations are activated during language processing in the absence of pictures (see Zwaan & Yaxley, 2003, 2004).
However, this study also represents a departure from previous
research. The perceptual match effects in the current experiments
demonstrate that comprehenders are in fact perceptually representing linguistic input in order to construe subtle shades of meaning,
even for so-called single-meaning words. To adequately represent
a linguistic description, each described entity needs to be disambiguated in terms of aspects such as temporal specification, perceptual properties, spatial region, perspective, highlighted features,
and relation to other entities. These experiments suggest that the
construal or disambiguation of these aspects can be realized
through the activation of perceptual–motor representations. This
entails the partial reactivation of traces of our previous experiences
(see also Zwaan & Madden, 2005). This process of activation
occurs in similar fashion to the memory model proposed by
Hintzman (1986, 1988), in which activation continuously and
automatically flows to any memory traces that are sufficiently
similar to the retrieval environment. Furthermore, this model also
assumes that when context matching occurs between the memory
traces and the currently attended situation, memory retrieval is
facilitated (Hintzman, 2002; Hintzman, Block, & Summers, 1973).
It would be too cumbersome for comprehenders to store all possible construals of all known words in the mental lexicon, so the
activation of more distributed perceptual–motor representations
based on context is required as a mechanism to construe subtle
shades of word meaning. Other studies from our lab and other labs
support this idea and suggest that these perceptual–motor representations are embodied and distributed throughout the various
perceptual and motor modalities (Chao & Martin, 2000; Glenberg
& Kaschak, 2002; Kan et al., 2003; Kaschak et al., 2005; Klatzky
et al., 1989; Pecher et al., 2003; Richardson et al., 2003; Solomon
& Barsalou, 2001; Stanfield & Zwaan, 2001; Zwaan et al., 2002,
2004).
The observed match effect at the immediate ISI for the high-span comprehenders demonstrates that they activate a perceptual
representation of the described object immediately upon hearing
that word in a sentence, and thus the respective word can be
construed. In contrast, the low-span comprehenders do not activate
a perceptual representation of the described situation immediately
upon hearing that word, as there was no difference in responses to
pictures that match rather than mismatch the contextual constraints
of the sentence at the 0-ms ISI. However, at the 750-ms ISI, the
low-span comprehenders had activated a perceptual representation
of the described situation. Although the perceptual representation
takes longer for low-span comprehenders to activate, once it is
activated, it does not appear to be inferior to that of the high-span
comprehenders, as the match effect was equally large for the
low-span comprehenders at the 750-ms ISI as it was for the
high-span comprehenders at either ISI.
One intriguing aspect of the pattern of data concerns the low-span comprehenders’ responses at the immediate picture probe.
They did not show the match effect at the 0-ms ISI, demonstrating
that they had not yet activated an adequate perceptual representation of the described situation. However, the mere fact that they
were able to perform the task indicates that they must have had
some representation available with which they could compare the
probe picture. The underspecified representation of the low-span
comprehenders is consistent with findings from studies of polysemy (Frazier & Rayner, 1990; Frisson & Pickering, 1999; Pickering & Frisson, 2001), suggesting that words with multiple senses
initially activate an underspecified representation. The current data
suggest that this representation is lexical in nature and is not
contextually specified. Otherwise, a difference between the matching and mismatching pictures would be observed at the immediate
ISI for this group. These results support the idea discussed in the
introduction that two separate types of representation are active
and can potentially affect the response—namely, the perceptual–
motor representation as well as a noncontextual lexical representation of the target word. The contribution of these two types of
representations over time is outlined below.
Lexical representations are obligatorily activated when a given
word is read or heard. In the current experiments, the lexical
representation for the target concept is very quickly activated by
automatic priming from the sound of the target word during
sentence presentation. In addition, the lexical representation receives backward priming from the picture probe. As soon as the
picture is perceived, the lexical representation associated with that
concept is activated automatically. This lexical representation is
noncontextual and is activated regardless of the shape of the target
object in the picture probe. Thus, the lexical representation is
highly activated both from mention in the preceding sentence and
through backward activation from the picture probe.
As the lexical representation receives automatic activation from
the sound of the word as well as backward activation from the
picture probe, it sends activation to the associated perceptual
representations. Depending on the speed of the perceptual representation, either the lexical representation or the perceptual representation will have a greater effect on the response. Because the
perceptual representation is a higher level representation, it will
dominate the response if it is completed in time. If the perceptual
representation is not yet fully activated at the time the picture
probe is recognized, then the response will be made on the basis of
the lexical representation. Although perceptual representations are
automatic, they can be delayed or forgone when insufficient contextual information is present (see Frazier & Rayner, 1990) or
insufficient time or resources are present. In these experiments, if
new information had been presented to the low-span comprehenders before they had a chance to activate a perceptual representation
of the preceding sentence, perceptual representations might have
been forgone completely.
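The decision rule in this account can be illustrated with a toy sketch. It is not a fitted or published model: the function `response_basis`, the assumption that perceptual activation is timed from sentence offset, and every latency value are hypothetical and were chosen only so that the rule reproduces the qualitative pattern in the data.

```python
def response_basis(isi_ms, perceptual_latency_ms, picture_recognition_ms):
    """Return which representation drives the response under the toy rule.

    The perceptual representation dominates only if it is complete by the
    time the picture probe has been recognized. Otherwise the response
    falls back on the noncontextual lexical representation.
    Assumption: perceptual activation is timed from sentence offset.
    """
    time_available = isi_ms + picture_recognition_ms
    if perceptual_latency_ms <= time_available:
        return "perceptual"  # match effect expected
    return "lexical"  # no match effect expected


# Hypothetical values (ms), for illustration only.
PICTURE_RECOGNITION_MS = 300
LATENCY_MS = {"high": 200, "low": 600}

predictions = {
    (span, isi): response_basis(isi, latency, PICTURE_RECOGNITION_MS)
    for span, latency in LATENCY_MS.items()
    for isi in (0, 750)
}
```

With these hypothetical values the rule predicts a perceptually based response for high-span comprehenders at both ISIs, but for low-span comprehenders only at the 750-ms ISI.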
So what is it that causes problems for low-span comprehenders
in activating perceptual–motor representations? As discussed in
the introduction, there are several possible sources for the span
difference. Because the disambiguation required here is between
very subtle perceptual features of a given sense of a concept rather
than mutually exclusive meanings or even related senses, it is
unlikely that active suppression would be the cause. Likewise, it is
unlikely that domain expertise can offer a plausible explanation
within the context of the current study, as the variety of everyday
concepts used here does not represent an area of expertise for any
given subject. However, it is highly plausible that the process of
indirectly activating perceptual–motor representations through lexical representations is reinforced through more frequent reading
and thus that the links between the lexical and perceptual–motor
representations become stronger in more experienced comprehenders. In the current experiments, whereas low-span comprehenders had no problems activating lexical representations, they
demonstrated a clear disadvantage in their ability to use these
lexical representations to activate perceptual representations.
Strengthened links between lexical and perceptual–motor representations might affect the comprehension process in two ways.
First, the stronger links might allow for greater precision in activating richer or perhaps larger networks of representations (MacDonald & Christiansen, 2002). However, this is not likely the case,
because a richer or larger network of perceptual representations
would most likely increase the size of the mismatch effect,
whereas the observed effect was equally large for the low-span
comprehenders at the 750-ms ISI and the high-span comprehenders at either ISI. A more likely explanation is that the stronger links
between lexical and perceptual–motor representations increased
processing speed in activating representations (Dagerman et al.,
2006). According to this idea, activation flows more readily from
the lexical representations for spaghetti and box to their associated
perceptual–motor representations in high-span than in low-span
comprehenders.
It is also likely that the strengthened links between lexical and
perceptual–motor representations facilitate the flow of activation
among activated perceptual–motor representations. For instance,
hearing the word box will automatically activate the lexical representation for box, which will automatically activate various
perceptual–motor representations that are linked to that lexical
representation. Likewise, hearing the word spaghetti will automatically activate its lexical representation as well as various
perceptual–motor representations. Within an interconnected network, the perceptual–motor representations for uncooked spaghetti
and the perceptual–motor representations for long, skinny pasta
boxes will support each other, whereas the perceptual–motor representations for cooked spaghetti and moving boxes will receive
only the initial activation from the lexical representations and then
quickly lose support. Thus, the context and the target actually
serve to constrain each other in a bidirectional manner (Dagerman
et al., 2006). Because low-span comprehenders have weaker links
between lexical and perceptual–motor representations, this process
of constraining activation flow takes longer. An alternative to this
idea is suggested by McKoon and Ratcliff’s (1992; Ratcliff &
McKoon, 1995) compound-cue retrieval theory, in which long-term memory is searched using a compound of the items in
short-term memory. In this case, the lexical representations for box
and spaghetti would form a compound cue that sends activation to
perceptual–motor representations that are most related. It is unclear exactly how activation flows to the appropriate perceptual–
motor representations, but it is evident that the activation arrives at
the perceptual–motor representations more quickly in high-span
than in low-span comprehenders.
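The mutual-support account described above can be illustrated with a toy simulation. This is a minimal sketch and not the authors' model: the node names, the parameters `link_strength`, `support`, `decay`, and `threshold`, and all numeric values are hypothetical. Each perceptual–motor representation receives one initial pulse of activation from its lexical representation; the two contextually consistent representations then support each other, whereas the other two only decay.

```python
# Toy interactive network; every name and value below is an illustrative assumption.
CONSISTENT_PARTNER = {
    "uncooked_spaghetti": "pasta_box",
    "pasta_box": "uncooked_spaghetti",
}
NODES = ["uncooked_spaghetti", "pasta_box", "cooked_spaghetti", "moving_box"]


def steps_to_construal(link_strength, support=0.5, decay=0.05,
                       threshold=0.8, max_steps=1000):
    """Count update steps until both consistent representations reach threshold.

    link_strength is the size of the initial pulse that each lexical
    representation sends to its perceptual-motor representations.
    """
    act = {name: link_strength for name in NODES}
    for step in range(1, max_steps + 1):
        updated = {}
        for name, a in act.items():
            partner = CONSISTENT_PARTNER.get(name)
            # Only mutually consistent representations boost each other.
            boost = support * act[partner] * (1.0 - a) if partner else 0.0
            updated[name] = a + boost - decay * a
        act = updated
        if all(act[n] >= threshold for n in CONSISTENT_PARTNER):
            return step, act
    return None, act


# Hypothetical link strengths for the two span groups.
high_steps, high_act = steps_to_construal(link_strength=0.4)
low_steps, low_act = steps_to_construal(link_strength=0.1)
print("high-span steps:", high_steps, "low-span steps:", low_steps)
```

Under these assumptions, both groups eventually settle on the same contextually appropriate representation, but the group with weaker links needs more update steps, which parallels the observed difference in timing rather than in the size of the match effect.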
The activation of perceptual–motor representations as a mechanism for meaning construal is consistent with nonmodular parallel interactive models (Glucksberg et al., 1986; Schvaneveldt et al.,
1976; Simpson, 1981), because lexical access (activation of the
lexical representation) and contextual integration (activation of
perceptual–motor representations) occur at the same time, and
these two phases can affect each other as they are in progress. In
addition, as lexical representations and perceptual–motor representations are activated together, meaning construal can be constrained by many types of information, such as syntactic and
thematic roles, word frequency, and situational context, all of
which stem from our embodied experiences with words and their
referents. This idea of representing multiple types of information
in our meaning representations is inconsistent with traditional
models of the mental lexicon but is similar to the more recent
constraint-based approach to comprehension (MacDonald, Pearlmutter, & Seidenberg, 1994; Seidenberg & MacDonald, 1999).
According to the constraint-based approach (MacDonald et al.,
1994; Seidenberg & MacDonald, 1999), the lexicon contains distributional information on many dimensions, such as syntax, grammar, argument structure, and semantic context. Some of this probabilistic information, such as the relative frequencies of words,
may be stored at the level of lexical representations, whereas other
more contextual information is stored at the level of perceptual–
motor representations. Furthermore, some information may be
stored at the intersection of the two types of representations. For
instance, the verb arrested might occur more often as a reduced
relative than as a main verb in most contexts. However, if the word
officer precedes the word arrested, then the distributional information about semantic roles (e.g., officers being agents of the main
verb arrest) will instead favor the main verb interpretation. In this
sense, many dimensions of distributional information are taken
into account during meaning selection, and comprehenders with
stronger links between lexical and perceptual–motor representations are better able to make use of this information quickly
(Dagerman et al., 2006; Pearlmutter & MacDonald, 1995). It is
possible that meaning can be successfully selected without the
activation of perceptual–motor representations when the probability of a given meaning is highly favored by purely lexical factors
such as word frequency. However, this will result in superficial
comprehension, whereas construal of subtle shades of meaning and
deep, contextual comprehension are contingent on the activation of
perceptual–motor representations.
The current pattern of data suggests that the high-span comprehenders use a faster mechanism of meaning construal. Immediately
upon hearing the final word of the sentence, the high-span comprehenders are able to activate a perceptual representation of the
described situation. The picture probe is compared with this representation, and the match effect emerges when the mismatching
picture takes longer to align with the representation. In contrast,
the low-span comprehenders are slower to use the mechanism of
meaning construal. At the immediate ISI, this group has not
finished activating the perceptual representation, and therefore, the
lexical representation determines the response. The lexical representation is noncontextual, and so responses to both matching and
mismatching pictures are facilitated, as they both match the activated lexical representation. At the later ISI, the perceptual–motor
representation dominates the response for both high- and low-span
comprehenders. Here, only the response to the matching picture is
facilitated. If the picture were compared with a noncontextual
lexical representation, then the mismatching response would be
facilitated to the same extent that the matching response was.
However, the response to the mismatching picture is slowed for
both span groups, indicating that each group has activated a
perceptual representation and that the picture was compared with
this rather than the lexical representation.5
5 Low-span comprehenders did exhibit a slight delay in responding as
compared with high-span comprehenders, which could be attributed to
difficulty in comparing the pictured object with a poor-quality perceptual
representation. However, this delay would likely have been accompanied
by a weakening of the match effect if the quality of the perceptual
representation had been impaired.
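The timing account in the preceding paragraph can be sketched as a simple rule: the response is based on the perceptual representation if it is ready by the time the picture probe is recognized, and on the lexical representation otherwise. The sketch below is illustrative only. The function names and all latencies (`probe_recognition_ms` and the values in `HYPOTHETICAL_READY_MS`) are assumptions chosen to reproduce the qualitative pattern; they are not estimates from the data.

```python
def basis_of_response(isi_ms, perceptual_ready_ms, probe_recognition_ms=300):
    """Return which representation drives the picture response.

    All times are measured from the end of the sentence. The probe appears
    after the ISI and is assumed to take probe_recognition_ms to recognize.
    """
    time_of_response_ms = isi_ms + probe_recognition_ms
    if perceptual_ready_ms <= time_of_response_ms:
        return "perceptual"  # contextual: only the matching picture is facilitated
    return "lexical"         # noncontextual: both pictures are facilitated


def predicts_match_effect(isi_ms, perceptual_ready_ms):
    """A match effect is expected only when the perceptual representation is used."""
    return basis_of_response(isi_ms, perceptual_ready_ms) == "perceptual"


# Hypothetical times at which the perceptual representation becomes available.
HYPOTHETICAL_READY_MS = {"high-span": 200, "low-span": 600}

predictions = {
    (group, isi): predicts_match_effect(isi, ready)
    for group, ready in HYPOTHETICAL_READY_MS.items()
    for isi in (0, 750)
}
for key, value in predictions.items():
    print(key, "match effect expected:", value)
```

With these assumed latencies, the rule yields a match effect for high-span comprehenders at both ISIs and for low-span comprehenders only at the 750-ms ISI.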
Conclusion
The current study investigated how high- and low-span comprehenders use perceptual–motor representations as a mechanism
by which the contextually appropriate meaning of a given word
can be construed during language processing. The results indicate
that the speed of activating perceptual–motor representations differentiates high-span comprehenders from low-span comprehenders. The low-span
comprehenders in this study were university students and thus are
certainly not the poorest sample of comprehenders that could be
tested. These low-span comprehenders were able to activate a
perceptual representation comparable to that of high-span comprehenders well within a second of hearing the sentence’s final word,
and perhaps would have been able to activate a perceptual representation even more quickly if the manipulation of context had
been stronger (here it was merely the implied location in a simple
sentence). However, the slowdown in construal reported here may
be expected to increase with samples of even lower span comprehenders. In the current sample, it is easy to see how comprehension
problems could occur for the low-span group when linguistic
information is presented at a rapid rate. It is important to extend
this finding to even lower span comprehenders, both to increase its external validity and to understand at what point comprehension suffers even at slow presentation rates and how the process of activating perceptual–motor representations can be facilitated.
The experiments reported here suggest that two systems of
representation are at work to construe subtle shades of meaning
upon hearing a sentence—namely, lexical representations and
perceptual–motor representations. Furthermore, the extent to
which each type of representation influences subsequent thoughts
and actions varies for different span groups over time. The most
likely source of this difference is the strength of the links between
the two types of representations. This finding informs theories of
language comprehension and highlights important questions for
future research. The next challenge is to further investigate the
nature of the relationship between lexical representations and
perceptual–motor representations in order to better understand the
process of meaning construal.
References
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain
Sciences, 22, 577–660.
Britton, B. K. (1978). Lexical ambiguity of words used in English text.
Behavior Research Methods & Instrumentation, 10, 1–7.
Chambers, C. G., Tanenhaus, M. K., Eberhard, K. M., Filip, H., & Carlson,
G. N. (2002). Circumscribing referential domains during real-time language comprehension. Journal of Memory and Language, 47, 30–49.
Chambers, C. G., Tanenhaus, M. K., & Magnuson, J. S. (2004). Actions
and affordances in syntactic ambiguity resolution. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 687–696.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal stream. NeuroImage, 12, 478–484.
Conrad, C. (1974). Context effects in sentence comprehension: A study of
the subjective lexicon. Memory & Cognition, 2, 130–138.
Conway, A. R. A., Cowan, N., Bunting, M. F., Therriault, D. J., & Minkoff,
S. R. B. (2002). A latent variable analysis of working memory capacity,
short-term memory capacity, processing speed, and general fluid intelligence. Intelligence, 30, 163–184.
Cruse, D. A. (1986). Lexical semantics. Cambridge, England: Cambridge
University Press.
Dagerman, K. S., MacDonald, M. C., & Harm, M. W. (2006). Aging and
the use of context in ambiguity resolution: Complex changes from
simple slowing. Cognitive Science, 30, 1–35.
Dahan, D., & Tanenhaus, M. K. (2004). Continuous mapping from sound
to meaning in spoken-language comprehension: Immediate effects of
verb-based thematic constraints. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 30, 498–513.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working
memory and reading. Journal of Verbal Learning and Verbal Behavior,
19, 450–466.
Daneman, M., & Merikle, P. M. (1996). Working memory and language
comprehension: A meta-analysis. Psychonomic Bulletin & Review, 3,
422–433.
Forster, K. I. (1981). Priming and the effects of sentence and lexical
contexts on naming time: Evidence for autonomous lexical processing.
Quarterly Journal of Experimental Psychology: Human Experimental
Psychology, 33A, 465–495.
Frazier, L., & Rayner, K. (1990). Taking on semantic commitments:
Processing multiple meanings vs. multiple senses. Journal of Memory
and Language, 29, 181–200.
Friedman, N. P., & Miyake, A. (2003). The reading span test and its
predictive power for reading comprehension ability. Journal of Memory
and Language, 51, 136–158.
Frisson, S., & Pickering, M. J. (1999). The processing of metonymy:
Evidence from eye movements. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 25, 1366–1383.
Gernsbacher, M. A., & Faust, M. E. (1991). The mechanism of suppression: A component of general comprehension skill. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 245–262.
Gernsbacher, M. A., & Varner, K. R. (1988). The Multi-Media Comprehension Battery (Tech. Rep. No. 88-03). Eugene: University of Oregon,
Institute of Cognitive and Decision Sciences.
Gernsbacher, M. A., Varner, K. R., & Faust, M. E. (1990). Investigating
differences in general comprehension skill. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 16, 430–445.
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action.
Psychonomic Bulletin & Review, 9, 558–565.
Glucksberg, S., Kreuz, F. J., & Rho, S. H. (1986). Context can constrain
lexical access: Implications for models of language comprehension.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 323–335.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical
access. Psychological Review, 105, 251–279.
Hess, D. J., Foss, D. J., & Carroll, P. (1995). Effects of global and local
context on lexical processing during language comprehension. Journal
of Experimental Psychology: General, 124, 62–82.
Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory
model. Psychological Review, 93, 411–428.
Hintzman, D. L. (1988). Judgments of frequency and recognition in a
multiple-trace memory model. Psychological Review, 95, 528–551.
Hintzman, D. L. (2002). Context matching and judgments of recency.
Psychonomic Bulletin & Review, 9, 368–374.
Hintzman, D. L., Block, R. A., & Summers, J. J. (1973). Contextual
associations and memory for serial position. Journal of Experimental
Psychology, 97, 220–229.
Kan, I. P., Barsalou, L. W., Solomon, K. O., Minor, J. K., & Thompson-Schill, S. L. (2003). Role of mental imagery in a property verification
task: FMRI evidence for perceptual representations of conceptual
knowledge. Cognitive Neuropsychology, 20, 525–540.
Kaschak, M. P., Madden, C. J., Therriault, D. J., Yaxley, R. H., Aveyard,
M., Blanchard, A. A., & Zwaan, R. A. (2005). Perception of motion
affects language processing. Cognition, 94, B79–B89.
Klatzky, R. L., Pellegrino, J. W., McCloskey, B. P., & Doherty, S. (1989).
Can you squeeze a tomato? The role of motor representations in semantic
sensibility judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 633–642.
Klein, D. E., & Murphy, G. L. (2001). The representation of polysemous
words. Journal of Memory and Language, 45, 259–282.
Klepousniotou, E. (2002). The processing of lexical ambiguity: Homonymy and polysemy in the mental lexicon. Brain and Language, 81,
205–223.
Long, D. L., Seely, M. R., & Oppy, B. J. (1999). The strategic nature of
less skilled readers’ suppression problems. Discourse Processes, 27,
281–302.
Lucas, M. M. (1987). Frequency effect on the processing of ambiguous
words in sentence context. Language and Speech, 30, 25–46.
Lyons, J. (1977). Semantics. Cambridge, England: Cambridge University
Press.
MacDonald, M. C. (1994). Distributional information in language comprehension, production, and acquisition: Three puzzles and a moral. In B.
MacWhinney (Ed.), The emergence of language (pp. 177–196). Mahwah, NJ: Erlbaum.
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working
memory: Comment on Just and Carpenter (1992) and Waters and Caplan
(1996). Psychological Review, 109, 35–54.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The
lexical nature of syntactic ambiguity resolution. Psychological Review,
101, 676–703.
McKoon, G., & Ratcliff, R. (1992). Spreading activation versus compound
cue accounts of priming: Mediated priming revisited. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1155–1172.
McNamara, D. S., & McDaniel, M. A. (2004). Suppressing irrelevant
information: Knowledge activation or inhibition? Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 465–482.
McNamara, D. S., & Scott, J. L. (2001). Working memory capacity and
strategy use. Memory & Cognition, 29, 10–17.
Morrow, D. G., & Clark, H. H. (1988). Interpreting words in spatial
descriptions. Language and Cognitive Processes, 3, 275–291.
Onifer, W., & Swinney, D. A. (1981). Accessing lexical ambiguities during
sentence comprehension: Effects of frequency of meaning and contextual bias. Memory & Cognition, 15, 225–236.
Pearlmutter, N. J., & MacDonald, M. C. (1995). Individual differences and
probabilistic constraints in syntactic ambiguity resolution. Journal of
Memory and Language, 34, 521–542.
Pecher, D., Zeelenberg, R., & Barsalou, L. W. (2003). Verifying properties
from different modalities for concepts produces switching costs. Psychological Science, 14, 119–124.
Pickering, M. J., & Frisson, S. (2001). Processing ambiguous verbs:
Evidence from eye movements. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 27, 556–573.
Pollatsek, A., & Well, A. D. (1995). On the use of counterbalanced designs
in cognitive research: A suggestion for a better and more powerful
analysis. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 21, 785–794.
Raaijmakers, J. G. W., Schrijnemakers, J. M. C., & Gremmen, F. (1999).
How to deal with “the language-as-fixed-effect fallacy”: Common misconceptions and alternative solutions. Journal of Memory and Language, 41, 416–426.
Ratcliff, R., & McKoon, G. (1995). Sequential effects in lexical decision:
Tests of compound-cue retrieval theory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1380–1388.
Richardson, D. C., Spivey, M. J., Barsalou, L. W., & McRae, K. (2003).
Spatial representations activated during real-time comprehension of
verbs. Cognitive Science, 27, 767–780.
Rodd, J., Gaskell, G., & Marslen-Wilson, W. (2002). Making sense of
semantic ambiguity: Semantic competition in lexical access. Journal of
Memory and Language, 46, 245–266.
Schneider, W., Eschman, A., & Zuccolotto, A. (2002). E-Prime 1.0 [Computer software]. Pittsburgh, PA: Psychological Software Tools.
Schvaneveldt, R. W., Meyer, D. E., & Becker, C. A. (1976). Lexical
ambiguity, semantic context, and visual word recognition. Journal of
Experimental Psychology: Human Perception and Performance, 2, 243–256.
Seidenberg, M. S., & MacDonald, M. C. (1999). A probabilistic constraints
approach to language acquisition and processing. Cognitive Science, 23,
569–588.
Simpson, G. B. (1981). Meaning dominance and semantic context in the
processing of lexical ambiguity. Journal of Verbal Learning and Verbal
Behavior, 20, 120–136.
Simpson, G. B. (1994). Context and the processing of ambiguous words. In
M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 359–374).
San Diego, CA: Academic Press.
Solomon, K. O., & Barsalou, L. W. (2001). Representing properties locally. Cognitive Psychology, 3, 129–169.
Stanfield, R. A., & Zwaan, R. A. (2001). The effect of implied orientation
derived from verbal context on picture recognition. Psychological Science, 12, 153–156.
Swinney, D. A. (1979). Lexical access during sentence comprehension:
(Re)consideration of context effects. Journal of Verbal Learning and
Verbal Behavior, 18, 645–659.
Tabossi, P. (1988). Effects of context on the immediate interpretation of
unambiguous nouns. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 14, 153–162.
Van Petten, C., Weckerly, J., McIsaac, H. K., & Kutas, M. (1997).
Working memory capacity dissociates lexical and sentential context
effects. Psychological Science, 8, 238–242.
Zwaan, R. A., & Madden, C. J. (2005). Embodied sentence comprehension.
In D. Pecher & R. A. Zwaan (Eds.), The grounding of cognition: The
role of perception and action in memory, language, and thinking (pp.
224–245). Cambridge, England: Cambridge University Press.
Zwaan, R. A., Madden, C. J., Yaxley, R. H., & Aveyard, M. E. (2004).
Moving words: Dynamic mental representations in language comprehension. Cognitive Science, 28, 611–619.
Zwaan, R. A., Stanfield, R. A., & Yaxley, R. H. (2002). Do language
comprehenders routinely represent the shapes of objects? Psychological
Science, 13, 168–171.
Zwaan, R. A., & Taylor, L. J. (2006). Seeing, acting, understanding: Motor
resonance in language comprehension. Journal of Experimental Psychology: General, 135, 1–11.
Zwaan, R. A., & Truitt, T. P. (2000). Inhibition of smoking-related information in smokers and nonsmokers. Experimental and Clinical Psychopharmacology, 8, 192–197.
Zwaan, R. A., & Yaxley, R. H. (2003). Spatial iconicity affects semantic-relatedness judgments. Psychonomic Bulletin & Review, 10, 954–958.
Zwaan, R. A., & Yaxley, R. H. (2004). Lateralization of object-shape
information in semantic processing. Cognition, 94, B35–B43.
Appendix A
Samples of Sentence–Picture Pairs
In the [skillet/refrigerator] there was an egg.
On the [ice/bench] there was a hockey player.
In the [nest/sky] there was an eagle.
On the [floor/rack] there was a towel.
Appendix B
Accuracy Data for Experiment 1
The accuracy data for Experiment 1 are displayed in Table B1.
The overall mixed analysis of variance with list as a between-subjects factor showed an effect of match whereby responses were
more accurate for matching than for mismatching pictures: F1(1,
141) = 22.59, p < .001, MSE = 1.781E-03; F2(1, 51) = 13.96, p <
.001, MSE = 4.274E-03. In addition, there was an interaction
between match and interstimulus interval (ISI), indicating that the
match effect was stronger at the earlier ISI: F1(1, 144) = 4.80, p <
.05; F2(1, 51) = 4.98, p < .05, MSE = 2.641E-03. There was an
effect of match (but no interactions) at both the 0-ms ISI: F1(1,
72) = 18.15, p < .001, MSE = 2.413E-03; F2(1, 51) = 16.68, p <
.001, MSE = 3.862E-03; and the 750-ms ISI: F1(1, 69) = 5.12, p <
.05, MSE = 1.120E-03; F2(1, 51) = 2.75, p = .10, MSE =
3.053E-03.

Table B1
Mean Percentages (and Standard Deviations) for Accuracy Data in Experiment 1

          High-span comprehenders    Low-span comprehenders
ISI       Match      Mismatch        Match      Mismatch
0 ms      99 (03)    95 (08)         99 (03)    96 (05)
750 ms    98 (03)    97 (05)         98 (03)    97 (05)

Note. ISI = interstimulus interval.
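As a quick arithmetic check on the interaction between match and ISI, the match effect in accuracy (match minus mismatch, in percentage points) can be computed from the cell means in Table B1. The sketch below assumes the cell means as read from that table; the variable names are illustrative, and the calculation is descriptive only and does not reproduce the reported analyses of variance.

```python
# Mean accuracy percentages as read from Table B1 (assumed cell assignment).
accuracy = {
    ("high-span", 0): {"match": 99, "mismatch": 95},
    ("high-span", 750): {"match": 98, "mismatch": 97},
    ("low-span", 0): {"match": 99, "mismatch": 96},
    ("low-span", 750): {"match": 98, "mismatch": 97},
}

# Match effect = accuracy for matching minus mismatching pictures.
match_effect = {
    cell: means["match"] - means["mismatch"]
    for cell, means in accuracy.items()
}

for (group, isi), effect in match_effect.items():
    print(f"{group}, {isi}-ms ISI: match effect = {effect} percentage points")
```

On these means, the difference is larger at the 0-ms ISI than at the 750-ms ISI for both span groups, in line with the reported interaction.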
Received May 25, 2005
Revision received March 8, 2006
Accepted March 23, 2006