Neurophysiological evidence of memory traces for words in the

NEUROREPORT
COGNITIVE NEUROSCIENCE AND NEUROPSYCHOLOGY
Neurophysiological evidence of memory traces
for words in the human brain
Yury ShtyrovCA and Friedemann Pulvermˇller
MRC Cognition and Brain Sciences Unit,15 Chaucer Road, CB2 2EF, Cambridge, UK
CA
Corresponding Author
Received18 January 2002; accepted 21January 2002
Mismatch negativity (MMN), an index of experience-dependent
memory traces, was used to investigate the processing of lexical
contrasts in the human brain.The MMN was elicited either by rare
words presented among repetitive words or pseudowords, or by
pseudowords presented among words. Phonetic and phonological
contrasts were identical in all conditions. MMNs elicited by both
word deviants were larger than that elicited by the deviant pseudoword. The presence of lexical contrast did not signi¢cantly alter
the word-elicited MMNs, which were, however, distinct in amplitude and topography from the MMN evoked by pseudowords.
Thus, our results indicate the existence of word-related MMN enhancement largely independent of the lexical status of the standard
stimulus. This enhancement may re£ect the presence of a longterm memory trace for a spoken word. NeuroReport 13:521^525
c 2002 Lippincott Williams & Wilkins.
Key words: Brain; Event-related potential (ERP); Human; Language; Lexis; Memory trace; Mismatch negativity (MMN, MMF); Pseudoword; Semantics;
Word
INTRODUCTION
Mismatch negativity (MMN), a unique indicator of
automatic cerebral processing of acoustic stimuli, can be
used to investigate neural processing of speech
and language [1–4]. MMN, with its major source of
activity in the supratemporal auditory cortex, is a
brain response elicited by rare (deviant) stimuli
occasionally presented in a sequence of frequent (standard)
stimuli [5,6]. Importantly, MMN can be elicited
in the absence of the subject’s attention [7]. It is considered
to reflect the brain’s automatic reaction to any change
in the auditory sensory input and, furthermore, to
provide an index of experience-dependent memory
traces in the human brain [1,4,8]. It has been lately
suggested to reflect long-term memory traces for language
sounds such as phonemes [1,9,10] and syllables [4,11,12].
Most importantly, we have recently found that mismatch
negativity in response to individual words is greater than
for comparable meaningless word-like (i.e., obeying phonological rules of the language) stimuli [3]. In that study, we
presented subjects with word and pseudoword deviant
stimuli among pseudoword standards (phonetic contrasts
being identical) and found an increased MMN to word
stimuli. This enhancement is best explained by the activation of cortical memory traces for words realised as
distributed strongly connected populations of neurones
[13,14].
The important question, not answered in the previous
study, is whether the enhancement we observed can be
c Lippincott Williams & Wilkins
0959- 4965 explained merely by differences in the lexical status (word
vs pseudoword) between the stimuli. The possible confound
in the earlier study was that word and pseudoword
deviants were always contrasted against pseudoword
standards, so one can argue that it was this lexical
status difference rather than individual words’ memory
traces as such which contributed to the larger MMNs to
words. If this argument is correct then the prediction
should be that larger MMNs could be produced by lexically
non-uniform standard–deviant pairs regardless of the
direction of the contrast, whereas deviant words presented
among standard words would even elicit smaller MMN
response (in other words, both word deviants among
pseudoword standards and pseudoword deviants among
word standards should produce larger MMN than a word
deviant among word standards). On the contrary, if the
original suggestion that this enhancement reflects cortical
memory traces for words per se is true, then word deviants
presented among either word or pseudoword standard
stimuli would produce a larger MMN response compared
with that to pseudoword deviant. Until this question is
answered, it is not possible to draw definite and unambiguous conclusions about the cause of the enhanced MMN to
word stimuli. We have therefore set out to clarify this issue
using high density electroencephalography (EEG) and
decided to test both competing predictions by recording
MMN elicited either by deviant words presented among
words or pseudowords, or by deviant pseudowords
presented among words.
Vol 13 No 4 25 March 2002
521
NEUROREPORT
MATERIALS AND METHODS
Subjects: Ten healthy right-handed (handedness assessed
according to [15]) volunteers (no left-handed family
members, native British English speakers, age 18–31 years,
7 females) with normal hearing and no record of neurological diseases were presented with three sets of spoken
English language stimuli in three separate experimental
conditions.
Three sets of monosyllabic stimuli (Fig. 1) were prepared
with the condition that the phonetic and phonological
difference between the standard and the deviant stimuli is
identical in each set and that the stimuli themselves are as
similar acoustically as possible. In the first pair of stimuli,
we used two words: [taip] (type) and [tait] (tight) as the
standard and deviant sounds, respectively. In the second
pair, a pseudoword [baip] was used as the standard, and the
word [bait] (bite) as the deviant stimulus. Finally, in the
third set, the standard stimulus was a word, [paip] (pipe),
while the deviant was a pseudoword, [pait]. Thus, the point
when the standard and deviant stimuli differ is in each set at
the their last phoneme, which is always [t] for the deviants,
and [p] for the standards. The latter also means that the
acoustic/phonetic contrast between the standard and the
deviant stimuli is the same in all three pairs and so is
the phonological contrast (alveolar vs bilabial articulation of
the final stop). In the meantime, the lexical status difference
Y. SHTYROV AND F. PULVERMFLLER
is varied: there is no lexical difference in the first set, in the
second set the word deviants are presented among pseudowords, whilst the reversed design is used in the third
condition. All stimuli had very similar acoustic structure,
starting with a plosive consonant followed by diphthong
[ai] and terminating with a stop-consonant. All words were
high-frequency items (CELEX [16]), and the pseudowords
were checked for having no meanings in British English. To
ensure that the stimuli are as close acoustically as possible
and that the contrasts between the standard and the deviant
stimuli are identical we recorded multiple repetitions of the
deviant stimuli and additionally two more words [haip]
(hype) and [hait] (height) produced by a female native
speaker of English. With great care we selected a combination of recordings, whose vowels matched in their fundamental frequency (F0), as well as first (F1) and second (F2)
formant frequencies and whose overall length was as
similar as possible. In order to achieve complete similarity
in standard-deviant contrasts across stimulus pairs, we
decided to use identical final phonemes [p] and [t] in each
standard and deviant sound. To achieve this, one needs to
use cross-splicing, i.e. replacing the original sounds in the
recordings by different ones. To avoid differential effects of
early acoustic cues (which would arise if the final stops were
drawn from one of the actual stimulus sets and copied to the
other two) we selected to take them from similar words not
Fig. 1. Spectrograms of acoustic stimuli used in the three experimental conditions.The standard and deviant stimuli in each pair are identical up to the
divergence point (marked with white arrowheads), when -p or -t starts. Final [p] and [t] are identical across the standard and the deviant stimuli respectively.
52 2
Vol 13 No 4 25 March 2002
NEUROREPORT
MEMORY TRACES FOR WORDS
used as such in the experiment. The sound [p] and [t] were
cross-spliced from the recordings of [haip] and [hait] to the
initial segments [tai], [bai] and [pai], which had been
separated from their own terminal [t]-plosives. The pause
of about 85 ms preceding the final stop-consonants in this
type of English monosyllabic stimuli (Fig. 1) makes these
stops an ideal target for cross-splicing, and we were able to
smoothly replace the phonemes producing naturally sounding stimuli with precisely controlled acoustic features. Thus,
the two stimuli in each pair were identical up to the time
point (hereafter called divergence point) when the plosion
of the [p] or [t] started (Fig. 1). The divergence point was
always 130 ms before the offset of the deviant stimulus
(corresponding to 85 ms before the offset of standard
stimulus, since [p] is shorter than [t]).
This technique allowed us to avoid any acoustic differences before the onset of the final plosive and to control
exactly the point in time when the acoustic contrast
occurred. This is essential for MMN experiments, since
MMN is highly sensitive to even minor acoustic differences
between the standard and deviant stimuli. All stimuli were
normalised to have the same peak sound energy. For the
analysis and production of the stimuli we used the Cool Edit
96 program (Syntrillium Software Corp., AZ, USA).
Acoustic stimulation: In Condition I, we used the word
[taip] as the standard stimulus and [tait] as the deviant one.
In Condition II, the pseudoword [baip] was used as the
standard, and the word [bait] as the deviant stimulus. In
Condition III, the word [paip] was the standard stimulus,
while the deviant was the pseudoword stimulus [pait].
Thus, the standard–deviant acoustic–phonetic contrast, the
critical variable determining the MMN [17], was identical in
all conditions, while the lexical contrasts changed: no lexical
status contrast is present in Condition I, in Condition II the
deviant words are presented among pseudoword standards,
and the reversed is employed in Condition III. The three
experimental conditions were performed with every subject,
their order being counter-balanced across the subject group.
The stimuli were binaurally presented at comfortable
sound level (determined using the experimental stimuli) via
headphones connected to a STIM set-up (Neuroscan Labs,
VA, USA). The interstimulus (onset-to-onset) interval was
850 ms. In each condition, the deviant stimulus was
presented with 16.67% probability among the repetitive
standard stimuli: a pseudo-random sequence of 1008 stimuli
was created so that there were always 2 standard stimuli
between any two deviants.
Electroencephalographic recording: Subjects were seated
in an electrically and acoustically shielded chamber and
instructed to watch a silent videofilm of their own choice
and to ignore any auditory signals. During the auditory
stimulation, electric activity of the subjects’ brain was
continuously recorded (passband 0.01–100 Hz, sampling
rate 500 Hz) with a 64-channel EEG set-up (Neuroscan
Labs), using Ag/AgCl electrodes mounted in an extended
10–20 system electrode cap (Quik-Cap, Neuromedical
Supplies, VA, USA). AFz was used as the reference electrode
during the recording. To control for eye movement artifacts,
horizontal and vertical eye movements were recorded using
two bipolar electrooculogram (EOG) electrodes.
EEG data processing: The recordings were later filtered
off-line (passband 1–30 Hz). Event-related potentials were
obtained by averaging epochs, which started 100 ms before
the divergence point and ended 300 ms thereafter; the 100–
0 ms interval was used as a baseline. Epochs with voltage
variation exceeding 100 mV at any EEG channel or at either
of the two EOG electrodes were discarded. The MMN was
obtained by subtracting the response to the standard from
that to the deviant stimulus. For each experiment participant, the averaged MMN responses contained at least 100
accepted deviant trials in each condition. All responses were
recalculated offline against average reference for further
analysis.
Statistical analysis: Recordings from 36 electrodes located over frontal and central areas of both hemispheres
(4 electrode lines in posterior–anterior dimension 9
electrode lines in the left–right dimension), where MMN
responses could be observed, were used for statistical
analysis. Response latencies were determined from their
peak amplitudes at FCz, since electric MMN is usually
reported to be most prominent at the fronto-central sites
[6,9].
The analyses of the data were carried out using 3-way
ANOVA with condition and topography (posterior–anterior
and left–right dimensions taken separately in the analyses)
as within-subject factors; Huynh-Feldt correction of degrees
of freedom was applied. Where necessary for analysing
topography-related effects, the data were normalised [18]
using the following algorithm:
Ae A
qffiffiffiffiffiffiffiffiffiffiffi
AN ¼
:
P
k
Þ2
ðA
A
k1 i
i¼1
where AN is the calculated normalised response amplitude,
is
Ae is the original amplitude at a given electrode, A
the mean amplitude over all electrodes, and k is the total
number of electrodes used in the analysis.
Ethical considerations: All subjects gave their written
informed consent to participate in the experiments and
were paid for their participation. The experiments were
performed in accordance with the Helsinki Declaration.
Ethical permission for the experiments was issued by the
Cambridge Psychology Research Ethics Committee
(Cambridge University, UK).
RESULTS
All sets of stimuli elicited mismatch negativity responses.
The response latency at FCz electrode location did not
differ significantly between the conditions (p 4 0.5) and
its average value (7 s.e.m.) for all conditions was
138 7 20 ms. We therefore calculated mean area amplitudes
for the 20 ms interval around the peak (128–148 ms).
Analysis of mean area amplitudes at FCz indicated
clear difference between the conditions (Fig. 2). Both Conditions I and II yielded higher MMN amplitudes than Condition III (respective mean amplitude 0.60 7 0.19 mV,
Vol 13 No 4 25 March 2002
52 3
NEUROREPORT
0.51 7 0.16 mV and 0.20 7 0.18 mV). This difference in the
MMN amplitude between the word deviants (Conditions I
and II) and the pseudoword deviant (Condition III) was
highly significant (F(1,9) ¼ 26.03, p o 0.001). Further analysis
of differences between individual conditions showed that
MMN in Condition I was significantly larger than in
Condition III (F(1,9) ¼ 6.36, p o 0.02, one-tailed test); ANOVA comparison between Conditions II and III individually
also indicated larger MMN in Condition II (word deviant),
although this was only marginally significant (F(1,9) ¼ 2.88,
p o 0.07). No differences could be found between Conditions I and II in which the deviant stimuli were always
words, but the standards varied.
Analysis of the amplitude scalp distribution of the three
responses indicated clear topographical differences between
the conditions. As shown in Fig. 3, both Conditions I and II
had a very similar fronto-central distribution of the
response, whereas Condition III exhibited lateralised dis-
Fig. 2. Waveforms of electric MMN responses (deviant^ standard subtraction curves) evoked by the words (I and II) and pseudowords (III) in
the three conditions (grand-average at FCz). The MMNs elicited by the
words were larger than that elicited by the pseudoword.
Y. SHTYROV AND F. PULVERMFLLER
tribution, which is rather unusual for the electric MMN
response. This difference was reflected in a significant
Condition Topography
interaction
(F(16,144) ¼ 2.72,
¼ 0.407, p o 0.02), when all three conditions were included
in the analysis. Conditions I and II, however, were not
distinct in their response topography from each other, as
could also be predicted from the maps in Fig. 3
(F(8,72) ¼ 0.774, ¼ 0.333, p 4 0.5).
We also looked for significant differences between the
conditions in later time intervals (4 150 ms) but did not find
any. In addition, we would like to note that a positive
deflection, which followed the MMN and peaked at around
200 ms (Fig. 2), had its maximum at frontal channels
and therefore cannot be characterised as a P3-like
component.
DISCUSSION
In the present study, MMN responses were elicited by
words presented among words or pseudowords, or by
pseudowords presented among words. In order to avoid
acoustically-related effects, phonetic contrasts were identical in all conditions. We found that both MMNs elicited by
word deviants were larger than that elicited by pseudoword. Furthermore, topographies of the MMN differed
between the two word and non-word deviant stimuli, but
not between the conditions when the deviants were words
and the standard was either a word or pseudoword.
Our original motivation was to test two possible
explanations of the enhanced MMN response, which was
found for words vs pseudowords in an earlier study [3]. One
of these explanations suggests that this enhancement is a
sign of an activation of cortical memory traces for words,
whereas the alternative view would be that it is a bare
lexical status difference (word vs non-word) between the
standard and the deviant stimuli which could explain the
Fig. 3. Potential maps of electric MMN responses (deviant^ standard subtractions) evoked by the words (I and II) and pseudowords (III) in the three
conditions.Word-elicited MMNs di¡ered in amplitude and topography from that evoked by pseudoword.
524
Vol 13 No 4 25 March 2002
NEUROREPORT
MEMORY TRACES FOR WORDS
previous result. The latter would predict that lexical
contrasts would elicit equally large MMN regardless of
the direction of the contrast (word deviant vs non-word
standard or non-word deviant vs word standard), whereas
smaller response would be produced in the absence of
lexical contrast. Our results clearly demonstrated the
opposite: the presence of lexical contrast did not play a
major role in the magnitude of the MMN response. Instead,
both of the word deviants produced very similar responses
independently of the lexical status of the standards, whereas
the pseudoword-elicited MMN was distinct from them both
in the amplitude and topographical distribution. Importantly, Condition I, in which the lexical contrast was absent
altogether, produced the largest response, whereas Condition III (whose lexical contrast was identical to Condition II
in magnitude but not in direction), yielded the smallest of
all three MMNs. In addition, the topography of the
responses in Conditions I and II did not differ, whereas
both of them were distinct from Condition III.
Therefore, the current data indicate that only the first
alternative, the memory-trace-based explanation, is viable.
First, the results confirm that words do elicit larger MMN
responses than pseudowords. Second, this word-related
MMN enhancement, as demonstrated here, cannot be
attributed to the lexical status difference between the
standard and the deviant stimuli; it appears to be
independent of the lexical status of the standard stimulus.
This enhancement is probably caused by the activation of
pre-existing long-term memory traces for the word stimuli.
These traces, presumably, had been formed during the
subjects’ previous language experience [1,4,9,13,14]. No
such traces existed for the pseudowords, the MMN
responses to them supposedly reflecting only acousticphonetic and phonologic stimulus contrasts (and possibly
memory traces developed during the experimental session).
In summary, our results provide further support to the
word-elicited MMN responses, which can reflect the
presence of a long-term memory trace for a spoken word.
Interestingly, these memory traces appear to become
active as early as B140 ms after the stimuli could be
unmistakably perceived as words. Moreover, our results
were obtained in a passive paradigm, when subjects’
attention was distracted from the auditory input. This
implies that the brain might be capable of automatic lexical
classification of the incoming speech signals already at very
early stages of speech processing, in the same time range
when MMN was found to reflect phonemic differences
[4,19].
CONCLUSION
We recorded the mismatch negativity, an automatic auditory
change-detection brain response and an index of experience-dependent memory traces, elicited either by words
presented among words or pseudowords or by pseudowords presented among words. We found that MMN
elicited by words is larger than that elicited by pseudowords
and that this enhancement is largely independent of the
lexical status difference between the standard and deviant
stimuli. Words are differentially processed by the human
brain as early as B140 ms after the time point when they can
be identified as such. Thus, our results indicate the existence
of the word-related MMN enhancement. This, in turn,
suggests the existence of the long-term memory trace, or cell
assembly, representing spoken words in the human brain.
Furthermore, these traces can be activated in the absence of
active attention to the auditory input and are available
already at early stages of cerebral speech processing. The
MMN technique may be useful for further revealing
characteristics of these memory traces.
REFERENCES
Näätänen R. Psychophysiology 38, 1–21 (2001).
Näätänen R and Winkler I. Phsychol Bull 125, 826–859 (1999).
Pulvermüller F, Kujala T, Shtyrov Y et al. NeuroImage 14, 607–616 (2001).
Shtyrov Y, Kujala T, Palva S et al. NeuroImage 12, 657–663 (2000).
Alho K. Ear Hear 16, 38–51 (1995).
Picton T, Alain C, Ottert L et al. Audiol Neurootol 5, 111–139 (2000).
Näätänen R. Ear Hear 16, 6–18 (1995).
Kraus N, McGee T, Carrell TD et al. Ear Hear 16, 19–37 (1995).
Näätänen R. EEG Suppl 49, 170–173 (1999).
Näätänen R, Lehtokoski A, Lennes M et al. Nature 385, 432–434 (1997).
Alho K, Connolly JF, Cheour M et al. Neurosci Lett 258, 9–12 (1998).
Shtyrov Y, Kujala T, Ahveninen J et al. Neurosci Lett 251, 141–144 (1998).
Pulvermüller F. Behav Brain Sci 22, 253–336 (1999).
Pulvermüller F. Trends Cogn Sci 1, 517–524 (2001).
Oldfield RC. Neuropsychologia 9, 97–113 (1971).
Baayen RH, Piepenbrock R and Gulikers L. The CELEX Lexical Database
(Release 2; CDROM). Philadelphia, PA: Linguistic Data Consortium,
University of Pennsylvania; 1995.
17. Näätänen R and Alho K. Audiol Neurootol 2, 341–353 (1997).
18. McCarthy G and Wood CC. Electroencephalogr Clin Neurophysiol 62,
203–208 (1985).
19. Rinne T, Alho K, Alku P et al. Neuroreport 10, 1113–1117 (1999).
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
Acknowledgements: We wish to thank Gabriele Holz, Olaf Hauk, Tara Dodson, Bundy Macintosh, Sally Butter¢eld and William
Marslen-Wilson for their contribution at di¡erent stages of this work.
Vol 13 No 4 25 March 2002
52 5