Full Paper - International Phonetic Association

15th ICPhS Barcelona
A Functional MRI Study of the Verbal Transformation Effect
Marc Sato1, Monica Baciu2, Hélène Lœvenbruck1, Christian Abry1 & Christoph Segebarth3
1
Institut de la Communication Parlée, CNRS UMR 5009, INPG/Université Stendhal
Laboratoire de Psychologie et Neurocognition, UMR CNRS 5105, Université Pierre Mendès-France
3
Unité Mixte INSERM/Université Joseph Fourier, U594, Neuroimagerie Fonctionnelle et Métabolique, LRC/ CEA
E-mail: sato,loeven,[email protected], [email protected], [email protected]
2
ABSTRACT
In the present study, we used functional magnetic resonance
imaging (fMRI) to localise the brain areas involved in a specific
auditory-verbal imagery task, the Verbal Transformation Effect
(VTE).
Two conditions were contrasted: a baseline condition involving
simple mental repetition of monosyllables and a first name vs. a
VTE condition involving mental repetition of the same items with
active search for verbal transformation. Significant activations,
predominantly within the left hemisphere, were obtained by
contrasting VTE vs. baseline, within the inferior frontal gyrus and
the insular region, the supramarginal gyrus, the premotor cortex
and supplementary motor area, the caudate nucleus and the
cerebellar cortex.
In line with previous behavioural studies, this study indicates that
during a VTE task the same regions are involved as those relative
to the Phonological Loop, the verbal component of Baddeley’s
working memory model. Given that VTE and baseline conditions
only differ in the presence or absence of a search for syllable
change, we hypothesize that the left inferior frontal and
supramarginal gyri activation indicate a frontal/parietal coupling
for
on-line
linguistic
information
processing
and
decision-making.
INTRODUCTION
While research on mental imagery has mainly focused on the
visual system over the last two decades, very few studies have
addressed the mechanisms underlying verbal auditory imagery
[1,2,3,4,5]
. In the present study, we used functional magnetic
resonance imaging (fMRI) to localise the brain areas involved in a
specific verbal-auditory imagery task: the Verbal Transformation
Effect (VTE) [6].
This word game [7], which bears an analogy with the perceptual
rivalry in ambiguous figures like the Necker cube [8], relies on the
fact that certain speech sequences, when repeated over and over,
yield a soundstream allowing more than one segmentation. For
example, rapid repetition of the word “life” provides a soundflow
fully compatible with the perception of either "life" or "fly". It has
been shown that verbal transformations can appear both during
production, i.e. when subjects repeatedly utter the stimuli either
overtly or covertly [1,2,3] and also during perception, i.e. when
subjects listen to an auditory speech stimulus looped on a tape
[9,10]
.
As the VTE requires that subjects analyse or make judgements
about repeated stimuli, the Phonological Loop – the verbal
component of Baddeley’s working memory model [11] - is
considered to provide a base on which these imagery processes
and judgements could take place [1,2,3].
In order to evaluate the cerebral areas associated with on-line
linguistic analysis and decision-making involved in the VTE, two
conditions were contrasted in the present fMRI study: a baseline
condition involving a simple mental repetition of monosyllables
and a first name vs. a VTE condition involving mental repetition
of the same items with active search for a verbal transformation.
Our hypothesis was that by contrasting VTE vs. baseline, regions
classically involved in syllable parsing and decision-making
should be activated. In addition, in line with the results from
previous behavioural studies, the regions classically involved in
the neural network of the Phonological Loop should also be
activated.
METHOD
Subjects
Ten healthy, right-handed (Edinburgh Handedness Inventory [12])
volunteers were examined (seven males and three females). All
subjects were native speakers of French, without hearing or
speaking disorders. Informed consent was obtained from each
subject prior to the experiment, in accordance with the
institutional review board regulations.
Material
Two couples of monosyllabic non-words, combining two
consonants and a vowel, were used. The stimuli consisted in a
combination of the neutral vowel [‹] and the bilabial [p] and
alveolar [s] consonants. In addition, we also used a couple of
disyllabic first names to test lexical influences during the VTE.
The three pairs of sequences obtained – i.e. [ps‹]/[‹ps],
[sp‹]/[p‹s] and [louis-jean]/[jean-louis] - are all phonotactically
attested in French.
Procedure
In order to evaluate the cerebral activations specifically associated
with the processes of syllable parsing and decision-making
involved in the VTE, two conditions were contrasted:
-
A baseline condition involving simple mental repetition.
- A VTE condition involving mental repetition with active
search for a verbal transformation.
Pre-scan training was carried out to enhance the subjects’
performance during the fMRI scans. They were instructed to
execute both the baseline and the VTE condition, first in an overt
mode, then in a covert mode (without any external movement) and
1639
ISBN 1-876346-48-5 © 2003 UAB
15th ICPhS Barcelona
trained to use a fixed rehearsal rate of about 2Hz. Concerning the
VTE condition, as shown in a previous behavioural study [3], each
of the pair members may be transformed to the other one: for
instance, mental repetition of the [sp‹] sequence may produce a
shift to the [p‹s] sequence, and conversely. For the three pairs of
stimuli, subjects were trained to find this transformation pattern
alternatively, and continue repeating whichever pair member they
perceived. For instance, for the [sp‹]/[p‹s] pair, they were trained
to first repeat [sp‹] and find [p‹s], then continue producing [p‹s]
while searching for [sp‹], and so on.
Figure1: Schematic representation of the trials for the VTE task. It
includes the following events: (i) fixation point, (ii) instruction screen, (iii)
fixation point, (iv) task interval.
The stimuli were generated by means of Psyscope V.1.1
(Carnegie Mellon Department of Psychology) running on a
Macintosh computer (Power Macintosh 9600). They were
transmitted to the subjects by means of a video projector (Eiki LC
6000), a projection screen situated behind the magnet and a mirror
centred above the subject’s eyes. In order to minimize movement
artefacts, all tasks were carried out silently, subjects’ heads being
secured by foam rubber padding. This procedure has proved valid
as a means of evaluating speech processing in previous
investigations [13,14].
Two functional scans were performed. The experiment was
carried out according to a block design: each session of
continuous data acquisition comprised 9 successive periods, each
composed of a baseline epoch and a VTE condition epoch. For
each epoch (50 s.), an instruction screen describing the condition
and the pair of sequences to be used was displayed (during 3000
ms), preceded and followed by a fixation cross (during 1000 ms).
In order to avoid a possible order effect, stimuli were randomly
presented across the subjects.
fMRI Acquisition
Functional MR imaging was performed on a 1.5 Tesla MR imager
(Philips NT) with echo-planar (EPI) acquisition. Twenty-five
adjacent, axial, slices (5 mm thickness each) were imaged 18
times during each condition. The imaging volume was oriented
parallel to the bi-commissural plane. Positioning of the image
planes was performed on scout images acquired in the sagittal
plane. An EPI MR pulse sequence was used for the functional
scans. The major MR parameters of this sequence were: TR =
3700 ms, TE = 45 ms, pulse angle = 90°, acquisition matrix =
64x64, reconstruction matrix = 128x128, field-of-view = 256x256
mm2. Between the first and the second functional MR scans, a
high-resolution 3D anatomical MR scan was obtained from the
volume functionally examined.
Data Analysis
The data were pre-processed and analysed with SPM-99 software
(Wellcome Department of Cognitive Neurology, London, UK) in
the MATLAB environment (Mathworks, Sherbon, USA). Motion
correction of the functional images was applied with respect to a
reference data set acquired halfway through the functional scans.
The T1-weighted structural MRI scan for each subject was
co-registered to the mean motion-corrected T2*-weighted images.
The resultant structural images were then spatially normalized
into a standard stereotaxic space, using as template a
representative brain from the Montreal Neurological Institute.
Functional images were then smoothed with a Gaussian kernel.
Global activity was corrected by grand mean scaling for each
Figure 2: Activation maps obtained for the ‘VTE condition – Baseline’ contrast (SPM 99, n=10, p<0.05 corrected).
ISBN 1-876346-48-5 © 2003 UAB
1640
15th ICPhS Barcelona
Region of maximal activation
No. voxels
Left/Right
Supramarginal Gyrus
Postcentral gyrus
Superior Parietal Lobule
Inferior frontal gyrus
Middle frontal gyrus
Middle frontal gyrus
Supramarginal Gyrus
Cerebellum
Cerebellum
Supplementary motor area
Superior frontal gyrus
Inferior frontal gyrus
Caudate nucleus
Insula
Mid-dorsolateral frontal gyrus
192
L
L
L
L
L
L
R
R
L
B
R
R
L
L
L
248
111
166
145
85
58
57
25
16
1
Talairach coordinates
X
y
z
-44
-59
-24
-55
-28
-44
44
20
-32
-4
24
63
-12
-40
-28
-33
-14
-52
9
-1
2
-32
-64
-52
14
3
13
1
12
36
46
28
47
22
55
39
51
-37
-23
44
61
21
11
5
15
Brodmann areas
Z value
P corrected
40
1/2/3
7
44
6
9
40
6
6
44
46
Inf
7.59
5.80
Inf
Inf
Inf
Inf
Inf
Inf
Inf
Inf
7.83
5.07
5.54
4.65
< .001
< .001
< .001
< .001
< .001
< .001
< .001
< .001
< .001
< .001
< .001
< .001
0.003
< .001
< .001
Table 1: Activation clusters obtained for the ‘VTE condition – Baseline’ contrast (P<0.05 corrected).
scan. Functional responses were determined using ANOVA.
Contrasts between conditions were determined pixelwise using
the General Linear Model. Statistical significance threshold for
individual pixels was established at p < 0.001, uncorrected.
Clusters of activated pixels were then identified, based on the
intensity of the individual responses and the spatial extent of the
clusters. Finally, a significance threshold of p < 0.05 (corrected
for multiple comparisons) was applied for identification of the
activated clusters. The results of the fixed effect group analysis
are reported in Figure 2 and Table 1, which represent the activated
regions and the stereotaxic Talairach coordinates of the cluster
maxima [15,16].
RESULTS
The ‘VTE condition - baseline’ contrast revealed a predominantly
left-lateralised network of areas, including frontal, parietal and
cerebellar regions (Table 1).
The left frontal activation was maximal in the inferior frontal
gyrus (BA 44) and extended superiorly into the middle frontal
gyrus (BA 6/9) and anteriorly into the mid-dorsolateral frontal
gyrus (BA 46). In addition, activation in the precentral gyrus of
the insular cortex was found. Less extensive right hemispheric
activations were found in the inferior frontal gyrus (BA 44) and
the superior frontal gyrus (BA 6). Finally, an activation of the
supplementary motor area (BA 6m) was also detected bilaterally.
Parietal activations were evident in both left and right
supramarginal gyrus (BA 40), a little less extensive within the
right hemisphere. The left parietal activation extended into the left
post-central gyrus (BA 1,2,3) and the superior parietal lobule
region (BA 7). Finally, activations were found in the left caudate
nucleus and in the cerebellar cortex, bilaterally.
DISCUSSION
The main finding of this study, in line with previous behavioural
studies, is that the verbal transformation task was associated with
a left-lateralised network of areas similar to those involved in
verbal working memory (VWM) [17-21], specifically: Inferior
frontal gyrus and insular region (IFG - BA 44 / INS),
supramarginal gyrus (SMG – BA 40), premotor and
supplementary motor areas (PMC - BA 6 / SMA - BA 6m),
dorsolateral prefrontal cortex (PFC - BA 9/46), and cerebellum.
Verbal working memory (VWM) refers to the cognitive
mechanisms that allow us to keep a limited amount of information
active for a brief period of time. The basic processes generally
assumed to be involved in VWM are: (a) storage, during which
phonological material is held temporarily in memory; (b)
rehearsal, during which phonological material is refreshed via
subvocal articulation. A classical model explaining these
mechanisms in the light of psychological, neuropsychological and
developmental findings on VWM is the Phonological Loop [11]. It
comprises two components, a passive Phonological Store and an
active Articulatory Control Process: Verbal material enters the
Phonological Store for temporary storage and subsequently
undergoes rapid decay unless rehearsed via the Articulatory
Control Process. A number of neuroimaging and
neuropsychological studies have provided arguments in favour of
an anatomical dissociation of storage and rehearsal during VWM
tasks [17-21]. These studies concluded that the inferior parietal
cortex (SMG) seems to be involved in storage while the prefrontal
cortex (particularly the inferior frontal gyrus) appears to mediate
rehearsal. Regarding the mechanisms associated with the VTE,
this view of a fronto-parietal dissociation in terms of rehearsal and
storage of verbal representation must be discussed.
First, the classical view production in literature on VWM and
language that the IFG (specifically Broca’s area) is directly
implicated in subvocal rehearsal and speech deserves to be
revised. Considering on one hand that our VTE activations come
from a subtraction of the baseline (involving subvocal rehearsal
itself) and, on another hand, that the IFG is activated during VTE
task, it seems indeed difficult to exclusively assign a purely
rehearsal role to this region. In accordance, the classical view of
Broca’s area as a major structure for speech production has been
recently re-evaluated by Murphy et al. [22] who founded no
activation of the IFG during automatic speech tasks, either overtly
or covertly. Then, building on different studies that have
demonstrated the implication of the IFG in phonological tasks such as phoneme monitoring [23], syllable counting [24] and
rhyming [17] - and in the observation and mental imagery of
actions [25], we hypothesize that the role of the IFG is that of an
attentional matching system for action understanding, which is
well adapted to a linguistic processing such as syllable parsing
during the VTE.
1641
ISBN 1-876346-48-5 © 2003 UAB
15th ICPhS Barcelona
Secondly, given that the VTE task appears to engage little demand
on verbal storage, the hypothesis involvement of the SMG in
short-term storage of phonologically coded verbal material must
be examined. Hickok and Poeppel [26] offer a hypothesis that can
better account for our results. According to them, the SMG should
not be the site of storage of phonemic representations per se but
rather serves to interface sound-based representations of speech in
auditory cortex with articulatory-based representations of speech
in the frontal cortex via sensorimotor recoding. We suggest that a
frontal-parietal circuit is recruited in the VTE, where the parietal
SMG would be involved in the construction of phonological
representations, by activation of long-term speech-body
(vocal-tract) memory representations [27], that would be sent to the
frontal LIFG action parser.
Finally, in addition to this frontal-parietal coupling of the IFG and
the SMG regions – which could be the basis for the
decision-making process crucial in the VTE [28] – it is important to
note that a broad range of speech-related areas were also activated
for the ‘VTE condition - Baseline’ contrast: the supplementary
motor area, the insular cortex, the caudate nucleus and the
cerebellum. These regions are classically considered to be
involved in initiation [29], planification / coordination of
articulatory movements [30], and modulation and temporal control
of such movements [31,32]. We suggest that these activations could
be due to the shift process between each pair-member stimulus
during the VTE. This shift would imply a new configuration of the
articulatory/phonological mental representation of rehearsed item
[3]
and therefore a stronger involvement (activation) of these
speech-related areas.
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
CONCLUSION
In line with the results of previous behavioural studies, this fMRI
study indicates that a verbal transformation task involves the same
regions as those involved in the Phonological Loop, the verbal
component of Baddeley’s working memory model. Given that, in
this study, the verbal transformation task and the baseline
conditions only differed by the search for syllable change, we
hypothesize that the left inferior frontal and supramarginal gyri
activations indicate a frontal/parietal coupling for on-line
linguistic information processing - in the present case, syllable
parsing and temporary storage of the phonological action goal,
respectively - and decision making.
ACKNOWLEDGMENTS
We thank Jean-Luc Schwartz, Marie-Agnès Cathiard and Carole
Peyrin for their contribution to the realization of this study.
REFERENCES
[1]
[2]
[3]
[4]
Reisberg, D., Smith, J.D., Baxter, A.D. & Sonenshine, M.
(1989). Quaterly Journal of Experimental Psychology, 41A:
619-641.
Smith, J.D., Reisberg, D. & Wilson, M. (1995).
Neuropsychologia, 11: 1433-1454.
Sato, M. & Schwartz, J.L. (2003). In Proceedings of the
XVth International Congress of Phonetics Science,
Barcelona, Spain.
McGuire, P.K., Silbersweig, D.A. & Frith, C.D. (1996).
Psychological Medicine, 26: 29-38.
ISBN 1-876346-48-5 © 2003 UAB
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
1642
Shergill, S.S., Bullmore, E.T., Brammer, M.J., Williams,
S.C., Murray, R.M. & McGuire, P.K. (2001). Psychological
Medicine, 31(2): 241-253.
Warren, M.R. & Gregory, R.L (1958). AJP, 71: 612-613.
Treiman, R. (1983). Cognition, 15: 49-74.
Chambers, D. & Reisberg, D. (1985). JEP: General, 11:
317-328.
Warren, M.R. (1961). BJP, 52: 249-258.
Pitt, M. & Shoaf, L. (2002). JEP: Human Perception and
Performance, 28: 150-162.
Baddeley, A.D. (1986). Working Memory. Oxford, Oxford
University Press.
Oldfield, R. C. (1971). Neuropsychologia, 9 (1): 97-113.
Yetkin, F.Z., Hammeke, T.A., Swanson, S.J., Morris, G.L.,
Mueller, W.M., McAuliffe, T.L. & Haughton, V.M. (1995).
AJNR, 16: 1087-1092.
Wildgruber, D., Ackermann, H. & Grodd, W. (2001).
NeuroImage, 13: 101-109.
Talairach, J. & Tournoux, P. (1988). Co-planar stereotaxic
atlas of the human brain. Stuttgart: George Thieme Verlag.
Brett, M., Johnsrude, I.S. & Owen, A.M. (2002). Nature
Neuroscience, vol. 3: 243-249.
Paulesu, E., Frith, C.D. & Frackowiak, R.S.J. (1993).
Nature, 362: 342-344.
Cohen, J.D., Perlstein, W.M., Braver, T.S., Nystrom, L.E.,
Noll, D.C., Jonides, J. & Smith, E.E. (1997). Nature, 386:
604-608.
Smith, E.E., Jonides, J., Marshuetz, C. & Koeppe, R.A.
(1998). Proc. Natl. Acad. Sci., 95: 876-882.
Honey, G., Bullmore, E. & Sharma, T. (2000). NeuroImage,
12: 495-503.
Henson, R.N.A, Burgess, N. & Frith, C.D. (2000).
Neuropsychologia, 38: 426-440.
Murphy, K., Corfield, D.R., Guz, A., Fink, G.R., Wise, R.J.,
Harrison, J. & Adams, L. (1997). JAP, 83(5): 1438-1447.
Demonet, J.F., Price, C., Wise, R. & Frackowiak R.J.
(1994). Brain, 117: 671-682.
Poldrack, R.A., Wagner, A.D., Prull M.W., Desmond J.E.,
Glover, G.H. & Gabrieli J.D.E (1999). NeuroImage, 10:
15-35.
Buccino, G., Binkofski, F., Fink, G.R., Fadiga, L., Fogassi,
L., Gallese, V., Seitz, R.J., Zilles, K., Rizzolatti, G. &
Freund, H.J. (2001). Eur. Journal of Neur.., 13, 400-404.
Hickok, G. & Poeppel, D. (2000). TICS, 4(4): 131-138.
Ruchkin, D.S., Grafman, J., Cameron, K. & Berndt, R.T. (in
press). Behavioral and Brain Sciences.
Kast, B. (2001). Nature, 411: 126-128.
Ziegler, W., Kilian, B. & Deger, K. (1997).
Neuropsychologia, 35: 1197-1208.
Dronkers, N.F. (1996). Nature, 384: 159-161.
Pickett, E.R., Kuniholm, E., Protopapas, A., Friedman. J. &
Liberman, P. (1998) Neuropsychologia, 36: 173-188.
Ivry, R.B & Keele, S.W. (1989). Journal of Cognitive
Neuroscience, 1: 136-152.