Neuropsychologia 48 (2010) 607–618
Neural networks involved in voluntary and involuntary vocal pitch regulation in
experienced singers
Jean Mary Zarate a,b,∗, Sean Wood b,c, Robert J. Zatorre a,b
a Montréal Neurological Institute, McGill University, 3801 University Street, Montréal, Québec, Canada
b International Laboratory for Brain, Music, and Sound Research (BRAMS), 1430 Mont-Royal Boulevard West, Montréal, Québec, Canada
c Department of Computer Science, Université de Montréal, C.P. 6128, Montréal, Québec, Canada
Article info
Article history:
Received 9 February 2009
Received in revised form 16 July 2009
Accepted 24 October 2009
Available online 6 November 2009
Keywords:
Audio–vocal integration
Auditory feedback
fMRI
Pitch shift
Vocal control
Abstract
In an fMRI experiment, we tested experienced singers with singing tasks to investigate neural correlates
of voluntary and involuntary vocal pitch regulation. We shifted the pitch of auditory feedback (±25 or 200
cents), and singers either: (1) ignored the shift and maintained their vocal pitch or (2) changed their vocal
pitch to compensate for the shift. In our previous study, singers successfully ignored and compensated
for 200-cent shifts; in the present experiment, we hypothesized that singers would be less able to ignore
25-cent shifts, due to a prepotent, corrective pitch-shift response. We expected that voluntary vocal regulation during compensate tasks would recruit the anterior portion of the rostral cingulate zone (RCZa)
and posterior superior temporal sulcus (pSTS), as our earlier study reported; however, we predicted that
a different network may be engaged during involuntary responses to 25-cent shifts. Singers were less
able to ignore 25-cent shifts than 200-cent shifts, suggesting that pitch-shift responses to small shifts
are under less voluntary control than responses to larger shifts. While we did not find neural activity
specifically associated with involuntary pitch-shift responses, compensate tasks recruited a functionally
connected network consisting of RCZa, pSTS, and anterior insula. Analyses of stimulus-modulated functional connectivity suggest that pSTS and intraparietal sulcus may monitor auditory feedback to extract
pitch-shift direction in 200-cent tasks, but not in 25-cent tasks, which suggests that larger vocal corrections are under cortical control. During the compensate tasks, the pSTS may interact with the RCZa and
anterior insula before voluntary vocal pitch correction occurs.
© 2009 Elsevier Ltd. All rights reserved.
Abbreviations: ACC, anterior cingulate cortex; aINS, anterior insula; aSTG,
anterior superior temporal gyrus; BA, Brodmann area; IPL, inferior parietal lobule; IPS, intraparietal sulcus; M1, primary motor cortex; mid-PMC, mid-premotor
cortex; PAC, primary auditory cortex; PostC, postcentral gyrus; pre-SMA, presupplementary motor area; pSTG, posterior superior temporal gyrus; pSTS, posterior
superior temporal sulcus; PT, planum temporale; RCZa, anterior portion of rostral
cingulate zone; SMA, supplementary motor area; SMG, supramarginal gyrus; STG,
superior temporal gyrus; STS, superior temporal sulcus; vPMC, ventral premotor
cortex.
∗ Corresponding author at: Montréal Neurological Institute, Cognitive Neuroscience Unit, 3801 University Street, Room 276, Montréal, Québec, Canada H3A 2B4.
Tel.: +1 514 398 8519; fax: +1 514 398 1338.
E-mail address: [email protected] (J.M. Zarate).
0028-3932/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.neuropsychologia.2009.10.025
1. Introduction
Electrophysiological, tracer, and lesion studies in animals have
demonstrated that vocalization recruits a constellation of neural
structures, ranging from motor/premotor cortical areas [i.e., primary motor cortex, supplementary motor area, anterior cingulate
cortex] and subcortical regions (basal ganglia, thalamus) to an array
of brainstem structures, including periaqueductal gray, substantia
nigra, reticular formation, and motoneuron pools (Jurgens, 2002).
Neuroimaging studies have confirmed that many of these regions
are also involved in human vocalization, including speech and
various singing tasks (Brown, Martinez, Hodges, Fox, & Parsons,
2004; Brown, Martinez, & Parsons, 2006; Jeffries, Braun, & Fritz,
2003; Kleber, Birbaumer, Veit, Trevorrow, & Lotze, 2007; Ozdemir,
Norton, & Schlaug, 2006; Paus, Petrides, Evans, & Meyer, 1993;
Perry et al., 1999; Riecker, Ackermann, Wildgruber, Dogil, & Grodd,
2000; Schulz, Varga, Jeffires, Ludlow, & Braun, 2005). Sensory feedback during vocalization not only stems from proprioception from
the vocal apparatus but also from auditory feedback processed by
temporal lobe regions [e.g., superior temporal gyrus (STG), superior temporal sulcus (STS)], which process vocal sounds, speech,
and other auditory stimuli (Belin, Zatorre, & Ahad, 2002; Scott
& Johnsrude, 2003). At times, vocal adjustments are necessary if
there is a mismatch between the intended and actual vocal output or if the environmental tasks change (e.g., noisy background);
this vocal regulation requires the integration of vocal motor
control and auditory processes (also known as “audio–vocal integration”), but the neural substrates involved in this process are not
well understood.
Previous behavioral studies have investigated audio–vocal integration underlying vocal pitch regulation by manipulating auditory
feedback, either by adjusting the feedback amplitude (Lombard,
1911; Siegel & Pick, 1974) or by altering the fundamental frequency
(i.e., perceived pitch) of the auditory feedback (Burnett, Freedland,
Larson, & Hain, 1998; Burnett & Larson, 2002; Burnett, McCurdy, &
Bright, 2008; Donath, Natke, & Kalveram, 2002; Hafke, 2008; Hain
et al., 2000; Jones & Keough, 2008; Jones & Munhall, 2000, 2005;
Larson, 1998; Larson, Burnett, & Kiran, 2000; Natke, Donath, &
Kalveram, 2003; Natke & Kalveram, 2001). Such auditory feedback
perturbations often elicit fast, compensatory adjustments in either
vocal amplitude or pitch, such as the Lombard reflex [an increase
in vocal amplitude in response to decreased feedback amplitude
(Lombard, 1911; Siegel & Pick, 1974)] or the pitch-shift response,
in which the vocal pitch is quickly adjusted, often in the opposite direction of the feedback shift (Burnett et al., 1998; Burnett
& Larson, 2002). In a previous neuroimaging experiment (Zarate
& Zatorre, 2008), we modified the pitch-shift paradigms used by
Larson and Burnett to target cortical substrates of audio–vocal
integration. Rather than delivering pitch-shifted feedback for less
than 1 s as in the Larson/Burnett studies, we maintained a ±200-cent shift in feedback (one whole tone, in musical terminology)
for approximately 3 s to increase the likelihood of capturing neural activity associated with audio–vocal integration. Subjects were
instructed either to: (1) ignore the pitch-shifted feedback and keep
their vocal output steady, or (2) compensate for the pitch shift, so
that the shifted feedback would sound like the original target note
(i.e., cancel out the pitch shift in the feedback). We believed the
latter task would recruit the brain regions involved in audio–vocal
integration, since subjects needed to monitor auditory feedback
while regulating their vocal output to cancel out the feedback shift.
We tested non-musicians and experienced singers to determine if
vocal training would modify neural activity associated with these
singing tasks. During our “compensate” task, we found two possible
substrates for audio–vocal integration, each of which was dependent on vocal experience: (1) non-musicians showed increased
activity in the dorsal premotor cortex (dPMC), and (2) experienced
singers showed increased activity in the anterior portion of the rostral cingulate zone (RCZa) and posterior STS (pSTS). The dPMC has
been implicated in selecting movements associated with particular
sensory cues (Chouinard & Paus, 2006; Petrides, 1986), including auditory–motor interactions (Chen, Penhune, & Zatorre, 2008;
Chen, Zatorre, & Penhune, 2006; Zatorre, Chen, & Penhune, 2007),
and thus may serve as a basic sensorimotor interface as people,
regardless of vocal experience, adjust their vocal output after hearing feedback perturbation. In general, the RCZa is implicated in
conflict monitoring (Botvinick, Cohen, & Carter, 2004; Botvinick,
Nystrom, Fissell, Carter, & Cohen, 1999; Carter et al., 1998; Durston
et al., 2003; MacDonald, Cohen, Stenger, & Carter, 2000; Picard &
Strick, 1996, 2001), while the pSTS processes vocal stimuli (Belin,
Zatorre, Lafaille, Ahad, & Pike, 2000; Kriegstein & Giraud, 2004) and
may be involved in extracting specific sound features (Celsis et al.,
1999; Warren, Scott, Price, & Griffiths, 2006; Warren, Uppenkamp,
Patterson, & Griffiths, 2003). We proposed that as people undergo
more vocal training or experience, the interface between the RCZa
and pSTS may be increasingly recruited for audio–vocal integration
(Zarate & Zatorre, 2008).
Although we outlined possible substrates for voluntary vocal
regulation in this prior study, we did not systematically study the
neural correlates of the pitch-shift response itself, which is also
a form of vocal regulation that relies on audio–vocal integration.
Since the pitch-shift response may be more involuntary, it may
be governed by different substrates than those outlined above for
voluntary vocal regulation. In fact, Burnett et al. (1998) suggested
that the midbrain periaqueductal gray (PAG) may be a possible candidate for audio–vocal integration during the pitch-shift response,
due to its connections and its role in vocalization. Electrical and
pharmacological stimulation of the squirrel monkey PAG elicits
vocalization (Dujardin & Jurgens, 2005; Suga & Yajima, 1988), and
the human PAG is active during voiced speech when compared to
whispered speech, suggesting that the PAG is involved in motor
networks that produce vocal fold activity (Schulz et al., 2005). The
PAG receives input from a huge array of sensory cortical and subcortical regions, including higher order auditory areas (e.g., STS),
superior and inferior colliculi, lateral lemniscus, and the nucleus
gracilis, which suggests that the PAG may be involved in vocal
responses to external stimuli (Dujardin & Jurgens, 2005). The PAG
may receive information about auditory feedback via the inferior
colliculus (Huffman & Henson, 1990) or the lateral lemniscus and
initiate a quick, compensatory vocal response to any changes in
feedback, such as the Lombard reflex (Nonaka, Takahashi, Enomoto,
Katada, & Unno, 1997) or the pitch-shift response.
In our earlier study (Zarate & Zatorre, 2008), we made an
interesting observation: during the ignore task, we saw pitch-shift responses only in the non-musicians; we therefore concluded
that vocal training must have helped singers suppress pitch-shift
responses when asked to ignore a large, 200-cent shift. Given
that only singers suppressed pitch-shift responses when ignoring
large pitch perturbations and generally produced more uniform
behavioral results than non-musicians in our previous experiment, in the current study, we investigated the neural correlates
of audio–vocal integration during both small pitch-shift responses
and larger, intended vocal adjustments only in experienced singers.
In the present experiment, singers performed the same ignore and
compensate tasks from our first experiment, but we utilized two
different shift magnitudes: 200-cent and 25-cent pitch shifts. Since
our previous experiment has already shown that singers can successfully ignore and compensate for a 200-cent shift, we expected
that the response magnitudes between these tasks would be significantly different. In contrast, given that pitch-shift responses
are better suited to fully correct for smaller pitch perturbations
than larger ones (Liu & Larson, 2007), and hence are thought to
be under more automatic control, we hypothesized that singers
would be less able to suppress pitch-shift responses to 25-cent
shifts than to 200-cent shifts; thus, we did not expect significant differences in response magnitudes for ignoring and compensating for
this smaller shift. We predicted that the brain regions that singers
recruited for ignoring and compensating for the large shift would be
similar to those reported in our prior experiment (Zarate & Zatorre,
2008). However, during the 25-cent tasks, we hypothesized not only that
regions similar to those in the large-shift tasks would be recruited,
but also that the PAG would be specifically recruited during
elicited pitch-shift responses in the ignore task.
2. Materials and methods
2.1. Subjects
A total of 13 healthy subjects were recruited from the McGill University community and surrounding areas. All subjects (mean age = 23 ± 3.93 years old) were
right-handed, had normal hearing, and were devoid of neurological or psychological disorders and contraindications for functional magnetic resonance imaging
(fMRI) techniques. All subjects gave informed consent to participate in this study,
in accordance with procedures approved by the Research Ethics Committees of the
McConnell Brain Imaging Centre and the Montréal Neurological Institute. Three subjects were withdrawn from the study due to problems performing the tasks, and
another subject was excluded for moving excessively during the scanning session.
The remaining nine subjects (three male), all categorized as experienced singers,
had an average of 11 years (±4.28 years) of formal vocal training and/or experience,
were currently practicing or performing at the time of the study, and did not participate in our previous experiment (Zarate & Zatorre, 2008). According to self-report,
none of the subjects possessed absolute pitch.
2.2. Equipment
During familiarization sessions, subjects sat in front of a lab computer screen
and were given a microphone (Røde NT5, Silverwater, Australia) and a pair of
headphones (Sennheiser HD 280 PRO, Wedemark, Germany) through which all
auditory stimuli were delivered. During scanning sessions, the subjects were given
magnetic-resonance (MR) compatible headphones (MR-Confon Peltor Optimex,
Magdeburg, Germany) and an MR-compatible microphone (FOM-Optimic 2155,
Optoacoustics, Or-Yehuda, Israel). All visual cues were back-projected onto a screen
at the subjects’ feet, and subjects viewed the screen via a mirror attached to the
head coil. For both sessions, the microphone was connected to a mixer to amplify
the voice signal before it was sent to a VoiceOne digital signal processor (TC Helicon Vocal Technologies, Westlake Village, CA, USA). During the entire experiment,
pink noise was delivered through the headphones to reduce bone conduction, so
that the manipulated vocal signal from the digital signal processor would be the
main source of auditory feedback to the subjects. All auditory stimuli (pink noise,
target vocal waves, and auditory feedback) were delivered to the headphones via
the mixer, and all volume levels were adjusted to comfortable levels for each subject. Pink noise was delivered at an average of 68.3 dB SPL A, while the target waves
were presented at an average of 4.1 dB SPL above the pink noise. The
delivery of target waves, visual prompts to cue subjects for singing, and Musical
Instrument Digital Interface (MIDI) system-exclusive messages to control the digital signal processor were all controlled by Media Control Functions (MCF) software
(DigiVox, Montréal, Canada). Auditory feedback (via the digital signal processor) and
all vocalizations were digitally recorded onto a Marantz PMD-670 digital recorder
(Marantz Professional, Itasca, IL, USA).
2.3. Experimental paradigm
During the familiarization session, subjects practiced singing tasks and control
conditions to prepare for the fMRI scanning session. For all singing tasks, we first
presented a target note and then used a visual cue to prompt subjects to sing the note
back using the syllable /a/. All subjects were trained to sing with minimal mouth
movement to reduce movement artifacts in the fMRI session. They were instructed
to keep their jaws slightly open and lips closed, so that at the beginning and end of
every sung note, only their lips, but not their jaws, moved. Each singing task was presented in blocks of five trials, with the same 2-s target note for each trial [176.99 Hz
(∼F3) for males, 355.03 Hz (∼F4) for females]. In one task, after hearing the target note,
subjects were cued to sing the note for 4 s (“simple singing”). During pitch-shift tasks,
approximately 1 s after the onset of singing (shift onset range: 1000–1500 ms), the
voice was shifted either up or down by 200 cents (one whole tone) or 25 cents via
the digital signal processor and remained shifted until the end of the trial. For these
trials, subjects were instructed to make a different response in each of two distinct
tasks: (1) ignore the shifted feedback and keep the vocal output as steady as possible on the original note (“IGN”), or (2) correct the shifted feedback so that the
feedback sounded like the target note (“COMP”). We maintained the feedback shift
until the end of the trial to increase the probability of finding brain regions involved
in vocal pitch regulation with fMRI techniques. Two control conditions were also
presented: (1) a condition with only pink noise playing in the background, used to
assess “baseline” cortical activity in the MR scanner; and (2) a perception condition,
which presented a target note that subjects did not have to sing back, thus serving
as an auditory control for all singing tasks in the scanner. In both of these control
conditions, subjects were visually cued to breathe out normally, rather than sing;
therefore, these conditions also served as a respiratory control for the singing tasks.
During familiarization, the subjects went through four experimental runs with all
singing tasks and control conditions included in each run.
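The shift magnitudes used in these tasks are expressed in cents, a logarithmic unit in which 1200 cents equal one octave. As a minimal sketch of that relationship (the function names below are hypothetical and are not part of the VoiceOne processor or the MCF software), the feedback frequency produced by shifting a sung note can be computed as follows:

```python
import math

CENTS_PER_OCTAVE = 1200.0


def shift_frequency(f0_hz: float, cents: float) -> float:
    """Frequency obtained by shifting f0_hz by a signed number of cents."""
    return f0_hz * 2.0 ** (cents / CENTS_PER_OCTAVE)


def cents_between(f_hz: float, ref_hz: float) -> float:
    """Signed distance in cents from ref_hz to f_hz."""
    return CENTS_PER_OCTAVE * math.log2(f_hz / ref_hz)


if __name__ == "__main__":
    target_hz = 176.99  # male target note given in the text (~F3)
    for cents in (-200, -25, 25, 200):
        shifted = shift_frequency(target_hz, cents)
        print(f"{cents:+5d} cents -> {shifted:.2f} Hz")
```

Under this definition, a singer who fully compensates for a shift of +200 cents would need to lower the sung pitch by 200 cents, so that the shifted feedback returns to the target frequency.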
A few days after the familiarization session, each subject was tested in a Siemens
Trio 3T MR scanner. While in the scanner, subjects were exposed to all of the
singing tasks and control conditions presented in the familiarization session. Prior
to functional scanning, a high-resolution (voxel = 1 mm³) T1-weighted scan was
obtained for anatomical localization. During the two functional runs, one whole-head frame of forty contiguous T2*-weighted images was acquired in an ascending,
interleaved fashion (TE = 60 ms, TR = 10.3 s, 64 × 64 matrix, voxel size = 3.5 mm³,
FOV = 224 mm²). We utilized a sparse-sampling design (Belin, Zatorre, Hoge, Evans,
& Pike, 1999)—tasks were performed during the silent periods between scan acquisitions to: (1) prevent scanner noise from interfering with the auditory stimuli and
(2) reduce any effect of movement due to vocalization, since scanning occurred after
vocalizations were completed. We also used cardiac-triggered gating to minimize
any pulsatile artifacts in subcortical structures (Guimaraes et al., 1998). Relative timings between scan acquisitions and tasks were systematically varied or “jittered” by
±500 ms to maximize the likelihood of obtaining the peak of the hemodynamic
response for each task (Belin et al., 1999). Each subject went through two experimental runs in the scanner. By the end of the scanning session, the simple singing
task had been presented a total of 20 times, while each pitch-shift task had been presented
a total of 40 times (20 trials for each pitch-shift direction); one brain image was
acquired per trial. The order of all singing tasks and control conditions within each
run was counterbalanced across subjects.
2.4. Behavioral analyses
We automated the statistics extraction process using the Python programming
language in conjunction with de Cheveigné’s Matlab implementation of the YIN
pitch extractor (de Cheveigné & Kawahara, 2002). The individual vocalization files
were first extracted from all subjects’ recordings, each of which spanned the duration
of each experimental session. To facilitate vocalization extraction, the presented
target notes (and auditory feedback via the digital signal processor) were recorded
only in the right channel of a stereo recording, while the left channel received input
only from the microphone directly (i.e., raw vocal output). By subtracting the envelope of the left channel from that of the right, we isolated the target notes from
the rest of the signal. The beginning of each individual vocalization file was defined
as the end of the preceding target. The end of the individual vocalization file was
the point in time when the signal became silent after having risen above half of its
maximum amplitude, or onset of the next target, whichever came first. The isolated
individual vocalization files were then exported as 44.1 kHz audio files.
Once the individual vocalization files were extracted, YIN was used to calculate
fundamental frequency (f0), signal power, and aperiodicity every 32 samples [resulting in a frame rate of 1378.125 Hz, i.e., (44.1/32) kHz]. Within each file, the beginning
and end of a given vocalization were defined as the first and last frames for which
the signal power was greater than 5% of the maximum and the aperiodicity was below
0.1. Since YIN normally calculates f0 in octaves relative to 440 Hz, we modified the
code to determine f0 relative to the frequencies of our target waves and then multiply
each value by 1200 to convert to cents (one octave equals 1200 cents); subsequently,
the mean f0 was calculated for each vocalization. For pitch-shift vocalizations, we
defined a pitch-shift window of 100 ms (50 ms before and after the programmed
shift time) to account for the digital signal processor’s variability in pitch-shift delivery. Therefore, the pre-shift mean included f0 values from the beginning of each
vocalization to the pitch-shift window. The post-shift mean consisted of f0 values
from the last second of each vocalization. We then subtracted the pre-shift mean
from the post-shift mean to calculate average response magnitude for each pitch-shift vocalization. For each subject, all mean f0 values and response magnitudes were
first calculated within each trial and then averaged across all trials within each task
in a specific shift direction (e.g., ignore, shifted up 200 cents; compensate, shifted
down 25 cents).
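The per-trial computation described above can be sketched in Python with NumPy. This is an illustrative reconstruction under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the pitch track is assumed to be already available as arrays of f0 (in Hz), signal power, and aperiodicity at the stated frame rate, the shift time is assumed to be measured from vocalization onset, and every frame inside the detected span is assumed to carry a valid f0 estimate.

```python
import numpy as np

FRAME_RATE_HZ = 44100 / 32  # 1378.125 frames per second, as stated in the text


def voiced_span(power, aperiodicity, power_frac=0.05, max_aperiodicity=0.1):
    """Return (first, last) frame indices where power exceeds 5% of its
    maximum and aperiodicity is below 0.1."""
    power = np.asarray(power, dtype=float)
    aperiodicity = np.asarray(aperiodicity, dtype=float)
    ok = (power > power_frac * power.max()) & (aperiodicity < max_aperiodicity)
    idx = np.flatnonzero(ok)
    if idx.size == 0:
        raise ValueError("no frame satisfies the power and aperiodicity criteria")
    return int(idx[0]), int(idx[-1])


def response_magnitude(f0_hz, power, aperiodicity, target_hz, shift_time_s,
                       window_s=0.100, post_s=1.0):
    """Post-shift mean minus pre-shift mean, in cents relative to the target.

    shift_time_s is assumed to be measured from vocalization onset.
    """
    first, last = voiced_span(power, aperiodicity)
    f0 = np.asarray(f0_hz, dtype=float)[first:last + 1]
    # Octaves relative to the target, times 1200, gives cents.
    cents = 1200.0 * np.log2(f0 / target_hz)
    t = np.arange(cents.size) / FRAME_RATE_HZ
    # Pre-shift: from onset up to the start of the 100-ms pitch-shift window.
    pre = cents[t < shift_time_s - window_s / 2.0]
    # Post-shift: the last second of the vocalization.
    post = cents[t > t[-1] - post_s]
    if pre.size == 0 or post.size == 0:
        raise ValueError("vocalization too short for the requested windows")
    return float(post.mean() - pre.mean())
```

A positive value indicates that the sung pitch rose after the shift and a negative value that it fell, so the sign has to be interpreted against the direction of the feedback shift.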
In our previous experiment, we found no significant differences in the behavioral
results between the familiarization and fMRI sessions, so we present only the fMRI
session results in this paper. In one set of analyses, we used the average response
magnitude as the dependent variable for pitch-shift tasks. After separating tasks by
shift direction (i.e., up or down), the pitch-shift results were analyzed with two-way
repeated-measures analyses of variance (ANOVAs, instruction by shift magnitude).
In a second set of analyses, we converted response magnitudes to percentages of the
shift magnitude by dividing the absolute response magnitude in each pitch-shift trial
by the absolute pitch-shift magnitude and multiplying each value by 100; this helped
us determine how much correction was produced either by pitch-shift responses or
voluntary vocal pitch changes. The percent response magnitudes were analyzed
using a three-way ANOVA (instruction by shift magnitude by shift direction). The
Scheffé test was used for all post hoc analyses.
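The conversion to percent response magnitude is simple arithmetic, sketched below with hypothetical function names. A value of 100 corresponds to a response equal in size to the shift, and values above 100 indicate a response larger than the shift.

```python
def percent_response(response_cents: float, shift_cents: float) -> float:
    """Absolute response magnitude as a percentage of the absolute shift."""
    if shift_cents == 0:
        raise ValueError("shift magnitude must be non-zero")
    return abs(response_cents) / abs(shift_cents) * 100.0


def mean_percent_response(responses_cents, shift_cents: float) -> float:
    """Average the per-trial percentages for one task and shift magnitude."""
    values = [percent_response(r, shift_cents) for r in responses_cents]
    if not values:
        raise ValueError("at least one trial is required")
    return sum(values) / len(values)
```

Because absolute values are used, this measure discards the direction of the response; direction is carried by the signed response magnitudes analyzed in the first set of analyses.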
2.5. fMRI analyses
To correct for motion artifacts, all blood-oxygen-level-dependent (BOLD)
images from both functional runs were realigned with the fourth frame of the first
run using the AFNI software (Cox, 1996). To increase the signal-to-noise ratio of
the imaging data, the images were spatially smoothed with an 8-mm full-width
at half-maximum (FWHM) isotropic Gaussian kernel. The first four
frames were excluded from analyses to remove T1-saturation effects; these
frames were acquired either during practice singing trials or presented instructions. For each subject, we conducted our image analyses in a similar fashion to
that described in our first paper (Zarate & Zatorre, 2008), using fMRISTAT, which
involves a set of four Matlab functions that utilize the general linear model for analyses (Worsley et al., 2002). The motion-correction parameters obtained with the
AFNI software were used as covariates in fMRISTAT to further account for motion
artifacts in the imaging results. Before group statistical maps for each contrast of
interest were generated, in-house software was used to non-linearly transform each
subject’s anatomical and functional images into standardized MNI/ICBM stereotaxic
coordinate space, using the non-linearly transformed, symmetric MNI/ICBM 152
template (Collins, Neelin, Peters, & Evans, 1994; Mazziotta et al., 2001; Talairach
& Tournoux, 1988). The program stat_summary assessed the threshold for significance by selecting the minimum among the values given by a Bonferroni correction,
random field theory, and the discrete local maximum to account for multiple comparisons (Worsley, 2005). The threshold for a significant peak was t = 4.9 at p = 0.05,
using a whole-brain search volume. We report peaks of neural activity if their voxel
or cluster p-values are less than 0.05. While some peaks did not meet the critical
threshold, they fell within regions previously reported in our earlier study (Zarate
& Zatorre, 2008). For these a priori regions, we corrected the threshold for small
volumes and report peaks if their corrected voxel or cluster p-values are 0.05 or
less.
In our analyses of functional connectivity, the general linear model was fitted
to account for the neural activity due to a stimulus (e.g., any singing task). Then,
the remaining or residual activity within a specific voxel (the “seed” voxel) was
regressed on the activity within the rest of the brain (on a voxel-by-voxel basis) to
determine where activity significantly covaries with the activity at that seed voxel,
without the effect of a stimulus (Friston, 1994; Worsley, Charil, Lerch, & Evans, 2005).
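The seed-based regression just described can be illustrated with a short NumPy sketch. It is not the fMRISTAT implementation: the function name is hypothetical, the data are assumed to be arranged as a frames-by-voxels matrix, and temporal autocorrelation of the residuals is ignored for simplicity.

```python
import numpy as np


def seed_connectivity_t(data, design, seed_index):
    """t-statistic per voxel for covariation with the seed voxel after
    task effects have been removed.

    data:   array of shape (n_frames, n_voxels)
    design: array of shape (n_frames, n_regressors) with the task regressors
    """
    data = np.asarray(data, dtype=float)
    design = np.asarray(design, dtype=float)
    n, p = design.shape
    # Step 1: fit the general linear model and keep the residuals.
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    resid = data - design @ beta
    # Step 2: relate each voxel's residual to the seed voxel's residual.
    seed = resid[:, seed_index]
    ss_seed = seed @ seed
    slope = (seed @ resid) / ss_seed
    err = resid - np.outer(seed, slope)
    dof = n - p - 1  # task regressors plus the seed regressor
    sigma2 = (err ** 2).sum(axis=0) / dof
    with np.errstate(divide="ignore", invalid="ignore"):
        t = slope / np.sqrt(sigma2 / ss_seed)
    t[seed_index] = np.nan  # the seed trivially covaries with itself
    return t
```

Voxels whose t-value exceeds the critical threshold would then be reported as functionally connected with the seed.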
We also performed analyses of stimulus-modulated functional connectivity, which
assessed how the connectivity is affected by the stimulus or task of interest (Friston
et al., 1997). Using stat_summary, the critical t-thresholds for connectivity analyses
ranged from 5.00 to 5.08 (all ps = 0.05, corrected for multiple comparisons).
Fig. 1. Average response magnitudes (±S.E.) for tasks with auditory feedback pitch-shifted (a) downwards and (b) upwards. In both directions, the responses to COMP200c
were significantly larger than in the other tasks (marked with *, ps < 0.001). The responses to COMP25c were larger than those in IGN200c (marked with !, p < 0.001; marked
with +, p < 0.07).
To perform conjunction analyses with two contrasts of interest, we utilized an
in-house tool called mincmath to find the minimum t-statistic at each voxel across
both contrast images. The conjunction results were then tested against the “conjunction null hypothesis”, which entailed using the critical t-values for just one contrast,
to determine whether there was significant neural activity in certain brain regions
in both contrasts (Nichols, Brett, Andersson, Wager, & Poline, 2005).
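The minimum-statistic conjunction can be sketched in a few lines of NumPy. This stands in for the voxel-wise minimum operation described above and is not the mincmath tool itself; the function name is hypothetical, and the two t-maps are assumed to be arrays of identical shape.

```python
import numpy as np


def conjunction(t_map_a, t_map_b, t_crit):
    """Voxel-wise minimum of two t-maps, tested against the critical
    t-value for a single contrast (conjunction null hypothesis)."""
    t_a = np.asarray(t_map_a, dtype=float)
    t_b = np.asarray(t_map_b, dtype=float)
    if t_a.shape != t_b.shape:
        raise ValueError("t-maps must have the same shape")
    t_min = np.minimum(t_a, t_b)
    return t_min, t_min >= t_crit
```

A voxel survives only if it exceeds the threshold in both contrasts, which is why the minimum statistic is compared against the threshold for one contrast.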
The locations of peak neural activity or connectivity were classified using: (1)
neuroanatomical atlases (Duvernoy, 1991; Talairach & Tournoux, 1988); (2) probabilistic maps or profiles for Heschl’s gyrus (Penhune, Zatorre, MacDonald, &
Evans, 1996), planum temporale (Westbury, Zatorre, & Evans, 1999), mouth region
of the sensorimotor cortex (Fox et al., 2001), inferior frontal gyrus pars opercularis
(Tomaiuolo et al., 1999), and basal ganglia (Ahsan et al., 2007); and (3) locations
defined by previous reports or reviews on the medial frontal and cingulate areas
(Picard & Strick, 1996, 2001) and subdivisions of the premotor cortex (Chen et al.,
2008).
2.6. Data exclusions
For behavioral analyses, 28 out of 1620 fMRI recordings were excluded from
analyses due to equipment failure, subject-performance error, or problems with
vocalization extraction. For fMRI analyses, 106 out of 2160 frames were excluded
from analyses due to equipment failure or performance errors.
3. Results
3.1. Behavioral results
The two-way ANOVAs performed separately on downward- and
upward-shifted tasks gave similar results. Both ANOVAs revealed
significant two-way interactions between instruction and shift
magnitude [down: F(1,8) = 504.85, p < 0.001; up: F(1,8) = 306.22,
p < 0.001]. Scheffé post hoc tests determined that as expected, the
responses to compensating for a 200-cent shift (COMP200c) were
larger than responses to ignoring the 200-cent shift (IGN200c) and
both 25-cent tasks (COMP25c and IGN25c; Fig. 1, ps < 0.001). The
COMP25c responses were larger than IGN200c responses (Fig. 1;
p < 0.001 for downward pitch-shift, p < 0.07 for upward pitch-shift)
but were not significantly different from IGN25c responses.
While there were no significant differences between IGN200c
and IGN25c response magnitudes, the IGN200c responses in
both directions were closer to 0-cents magnitude than IGN25c
responses, suggesting that singers were more capable of suppressing prepotent pitch-shift responses to 200-cent shifts than to
25-cent shifts. To test for this directly, we converted the absolute
values of all response magnitudes to percentages of the absolute pitch-shift magnitude. The three-way ANOVA performed on
percent response magnitudes revealed a significant two-way interaction between instruction and shift magnitude [F(1,8) = 21.86,
p < 0.01], and post hoc tests determined that percent response
magnitudes in IGN200c, IGN25c, COMP200c, and COMP25c were
significantly different from each other (Fig. 2, all ps < 0.05), with
the exception of COMP200c and IGN25c (p > 0.1). Therefore, singers
produced significantly larger pitch-shift responses while ignoring
a 25-cent shift than a 200-cent shift. While singers did not correct
fully for the 200-cent shift (87.66% correction), they overcompensated for the 25-cent shift (112.67% correction). Additionally,
since percent response magnitudes were significantly different
between both 25-cent tasks, this overcompensation suggests that
the response magnitudes in the COMP25c task cannot be solely
attributed to involuntary pitch-shift responses, but rather that
singers voluntarily attempted to correct for the small perturbation.
Fig. 2. Percent response magnitudes (±S.E.) for IGN and COMP tasks collapsed across
shift direction. The percent response magnitude for IGN200c was smaller than in all
other tasks (marked with *, ps < 0.001). Similarly, the percent response magnitude
for COMP25c was larger than responses in other tasks (marked with !, ps < 0.05). The
percent response magnitudes for COMP200c and IGN25c were both significantly different than IGN200c and COMP25c (marked with #, ps < 0.001) but were not different
from each other.
3.2. fMRI results
3.2.1. Basic functional network for simple singing
Simple singing, when contrasted with perception, recruited a
functional network similar to that seen in our and others’ previous
experiments (Kleber et al., 2007; Perry et al., 1999; Zarate & Zatorre,
2008), including bilateral primary and secondary auditory areas,
bilateral sensorimotor mouth regions, bilateral supplementary
motor areas (SMA), right ventral premotor cortex, left thalamus,
left lateral globus pallidus, and bilateral medial geniculate nuclei
(Supplementary Table S1).
Table 1
Functional networks associated with ignoring pitch-shifted feedback.
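(a) IGN200c ∩ IGN25c

| Category | Region | Left x | Left y | Left z | Left t | Right x | Right y | Right z | Right t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | Planum temporale | −58 | −22 | 10 | 4.5 | 52 | −20 | 6 | 3.7 |
| Frontal | BA 6/44 | | | | | 52 | 10 | 30 | 4.8 |

(b) IGN200c–IGN25c

| Category | Region | Left x | Left y | Left z | Left t | Right x | Right y | Right z | Right t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | STG | | | | | 66 | −16 | 10 | 4.4 |
| Auditory | STS | | | | | 62 | −20 | 2 | 4.2 |
| Auditory | pSTS | | | | | 58 | −36 | 8 | 3.5 |
| Auditory | Planum temporale | −60 | −44 | 20 | 4.4 | | | | |
| Parietal | Supramarginal gyrus | −52 | −46 | 22 | 4.1 | | | | |
| Parietal | Angular gyrus | −58 | −56 | 20 | 3.5 | | | | |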
Regions of peak neural activity during the IGN200c and IGN25c singing tasks. Section (a) shows the shared regions between both IGN tasks, while section (b) displays regions
with more activity during IGN200c than during IGN25c. All peak/cluster ps ≤ 0.05, corrected. Refer to legend for abbreviations.
3.2.2. Additional brain regions involved in ignoring pitch-shifted
feedback
Since we had no specific hypotheses about the direction of
the pitch shift, we combined the imaging results for each task
across both shift directions. When singers ignored either a 200-cent or a 25-cent shift, they recruited a similar network of
regions in addition to the basic network for singing (specific
regions in each task are listed in Supplementary Table S2). Conjunction analyses between IGN200c and IGN25c determined that
right Brodmann area (BA) 6/44 (ventral premotor cortex and pars
opercularis of the inferior frontal gyrus) and bilateral planum
temporale were recruited for both tasks (Table 1a, Fig. 3a). A contrast between these tasks showed that IGN200c required more
activity in right STG, STS and pSTS, and left planum temporale,
supramarginal gyrus, and angular gyrus than IGN25c (Table 1b,
Fig. 3a), but no regions showed significantly increased activity
when IGN25c was contrasted with IGN200c, since overall neural activity was weaker during IGN25c (Supplementary Table
S2).
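As an illustration of the logic of a conjunction analysis, the sketch below marks voxels that exceed a threshold in both of two statistical maps (a minimum-statistic conjunction). This is an assumption about one common approach, not a description of the procedure used in this study; the function name, the t values, and the threshold are hypothetical.

```python
def conjunction(t_map_a, t_map_b, threshold):
    """Return a per-voxel list of booleans: True where BOTH maps exceed threshold."""
    if len(t_map_a) != len(t_map_b):
        raise ValueError("maps must cover the same voxels")
    # A voxel survives only if its smaller t value is above threshold.
    return [min(a, b) > threshold for a, b in zip(t_map_a, t_map_b)]


# Hypothetical t values for four voxels in two tasks.
t_task_1 = [4.5, 3.7, 1.0, 4.4]
t_task_2 = [4.0, 3.6, 0.5, 1.2]
shared = conjunction(t_task_1, t_task_2, threshold=3.5)
print(shared)
```

In this toy example the fourth voxel is strongly active in the first task only, so it is excluded from the shared network even though it would appear in the first task's own map.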
3.2.3. Additional brain regions involved in compensating for
pitch-shifted feedback
As singers corrected for either the 200-cent or the 25-cent shift,
they displayed similar patterns of increased neural activity (specific
regions recruited during each task are shown in Supplementary
Table S3). Conjunction analyses between COMP200c and COMP25c
showed a common network with increased activity in bilateral
BA 6/44, anterior insulae, pre-SMA, right RCZa, bilateral mid-premotor cortex (mid-PMC), intraparietal sulci, and supramarginal
gyri, and right STS and planum temporale (Table 2a, Fig. 3b). A contrast between both tasks revealed more activity within bilateral
planum temporale, STG, STS, and right pSTS during COMP200c than
COMP25c (Table 2b, Fig. 3b), which is similar to the increased auditory cortical activity observed in the contrast between IGN200c and
IGN25c.
3.2.4. Functional connectivity during pitch-shift tasks
For connectivity analyses in IGN tasks, we chose a seed voxel
in the right pSTS since this region displayed more activity in
Fig. 3. Brain regions associated with pitch-shift tasks when compared to simple singing. (a) Left: The shared network of regions recruited during both IGN tasks when
compared with simple singing. Right: Auditory areas and supramarginal gyrus displayed more activity during IGN200c than during IGN25c. (b) Left: Both COMP tasks
engaged a shared functional network when compared to simple singing. Right: Extensive increases in auditory cortical activity were associated with COMP200c when
contrasted with COMP25c. All peak/cluster ps ≤ 0.05, corrected. Refer to legend for abbreviations.
Table 2
Functional networks associated with compensating for pitch-shifted feedback.
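(a) COMP200c ∩ COMP25c

| Category | Region | Left x | Left y | Left z | Left t | Right x | Right y | Right z | Right t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | STS | | | | | 64 | −28 | 4 | 3.9 |
| Auditory | Planum temporale | | | | | 66 | −22 | 10 | 4.1 |
| Motor | RCZa (ACC BA 32) | | | | | 2 | 20 | 40 | 3.8 |
| Motor | Pre-SMA | −4 | 4 | 54 | 5.7 | 2 | 18 | 44 | 3.8 |
| Motor | Mid-PMC | −48 | 0 | 42 | 4.6 | 44 | 0 | 46 | 5.0 |
| Multimodal | Anterior insula | −34 | 22 | 2 | 5.4 | 32 | 22 | 2 | 6.2 |
| Frontal | BA 6/44 | −52 | 8 | 30 | 4.4 | 52 | 10 | 24 | 7.6 |
| Frontal | Inf. frontal (BA 44) | −40 | 16 | 26 | 3.4 | | | | |
| Parietal | Intraparietal sulcus | −42 | −40 | 46 | 4.5 | 40 | −44 | 48 | 4.4 |
| Parietal | Supramarginal gyrus | −36 | −48 | 48 | 4.3 | 58 | −36 | 46 | 4.3 |

(b) COMP200c–COMP25c

| Category | Region | Left x | Left y | Left z | Left t | Right x | Right y | Right z | Right t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | aSTG | −58 | −4 | 0 | 4.1 | 56 | 4 | −2 | 4.6 |
| Auditory | STG | −58 | −24 | 6 | 3.9 | 68 | −16 | 8 | 4.6 |
| Auditory | STS | −66 | −18 | 2 | 3.3 | 64 | −18 | 4 | 4.1 |
| Auditory | pSTS | | | | | 52 | −42 | 12 | 4.1 |
| Auditory | Planum temporale | −62 | −34 | 14 | 4.7 | 64 | −28 | 16 | 5.5 |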
Regions of peak neural activity during the COMP200c and COMP25c singing tasks. Section (a) shows the shared regions between both COMP tasks, while section (b) displays
regions with more activity during COMP200c than during COMP25c. All peak/cluster ps ≤ 0.05, corrected. Refer to legend for abbreviations.
experienced singers than non-musicians in our first experiment
(Zarate & Zatorre, 2008) and was also active during IGN200c in this
experiment. Table 3 and Fig. 4a show a vast network of regions
that are functionally connected to right pSTS, including auditory
regions, motor and premotor regions, insulae, BA 44, postcentral gyri, inferior parietal lobule, and various subcortical regions.
Stimulus-modulated functional connectivity analyses determined
that the IGN200c task modulated the connectivity between right
pSTS and right intraparietal sulcus, bilateral postcentral gyri, right
sensorimotor cortex, and a few regions along the posterior medial
wall, when compared to the effect of simple singing (Table 3,
Fig. 4b). Analyses of stimulus-modulated functional connectivity
revealed no significant differences in task-modulated connectivity between IGN25c and simple singing or between IGN25c and
IGN200c.
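The two kinds of connectivity measures can be sketched in simplified form. Functional connectivity is treated here as the correlation between a seed time series and a target time series, and task modulation as the difference between that correlation in task frames and in baseline frames. This is a deliberately simplified stand-in, not the statistical procedure used in this study, and all names and values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def connectivity_change(seed, target, is_task):
    """Seed-target correlation in task frames minus that in baseline frames."""
    task = [(s, t) for s, t, k in zip(seed, target, is_task) if k]
    base = [(s, t) for s, t, k in zip(seed, target, is_task) if not k]
    if len(task) < 2 or len(base) < 2:
        raise ValueError("need at least two frames per condition")
    return pearson_r(*zip(*task)) - pearson_r(*zip(*base))


# Hypothetical signals: the target tracks the seed only during task frames.
seed = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
target = [2.0, 4.0, 6.0, 8.0, 3.0, 1.0, 4.0, 2.0]
is_task = [True] * 4 + [False] * 4
change = connectivity_change(seed, target, is_task)
print(change)
```

A positive value indicates that the seed and target are more tightly coupled during the task than during baseline, which is the sense in which a task is said to modulate connectivity.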
For all connectivity analyses in COMP tasks, we chose seed voxels in the right pSTS and right RCZa, since these regions were
more active in experienced singers during COMP200c than in non-musicians in our previous experiment (Zarate & Zatorre, 2008). We
also chose a seed voxel in the right anterior insula, since this region
was originally part of our hypothesized network for audio–vocal
integration and was significantly active in all COMP tasks in this
Fig. 4. Functional and stimulus-modulated functional connectivity in IGN200c and COMP200c tasks. (a) The functional connectivity map during IGN200c tasks, generated
with a right pSTS seed voxel (MNI/ICBM152 world coordinates 54, −42, 12; all voxel ps ≤ 0.001, uncorrected). (b) When compared to simple singing, the IGN200c task
specifically enhanced connectivity of the right pSTS seed voxel with right intraparietal sulcus and bilateral postcentral gyri. All peak/cluster ps ≤ 0.05, corrected. (c) The
different overlap patterns between three connectivity maps during COMP200c tasks, generated with seed voxels in right pSTS (54, −42, 12), right RCZa (2, 20, 40), and right
anterior insula (32, 22, 2); all voxel ps ≤ 0.001, uncorrected. The Venn diagram above depicts the color legend used to show overlap in connectivity maps. (d) The COMP200c
task specifically enhanced connectivity between the right pSTS voxel and bilateral intraparietal sulci when compared with simple singing. All peak/cluster ps ≤ 0.05, corrected.
Refer to legend for abbreviations.
Table 3
Connectivity associated with ignoring pitch-shifted feedback.
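| Category | Region | Functional connectivity (R pSTS): x | y | z | t | Stimulus-modulated connectivity (R pSTS): x | y | z | t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | Left PAC | −38 | −30 | 12 | 4.1 | | | | |
| Auditory | Left pSTG | −56 | −54 | 18 | 4.5 | | | | |
| Auditory | Left STS | −56 | −30 | 6 | 5.2 | | | | |
| Auditory | Right STS | 46 | −30 | 0 | 5.1 | | | | |
| Auditory | Left pSTS | −60 | −56 | 20 | 4.6 | | | | |
| Auditory | Left planum temporale | −58 | −44 | 20 | 7.3 | | | | |
| Auditory | Right planum temporale | 36 | −32 | 20 | 5.0 | | | | |
| Motor | Right ACC—BA 32 (RCZa) | 10 | 12 | 40 | 3.9 | | | | |
| Motor | Left SMA | −8 | −12 | 66 | 4.0 | | | | |
| Motor | Right SMA | 10 | −6 | 54 | 5.9 | | | | |
| Motor | Left pre-SMA | −10 | 2 | 58 | 3.3 | | | | |
| Motor | Right pre-SMA | 8 | 2 | 58 | 7.2 | | | | |
| Motor | Right M1 | 40 | −2 | 50 | 4.6 | | | | |
| Motor | Right mid-PMC | 44 | 0 | 40 | 4.3 | | | | |
| Motor | Left vPMC | −58 | 12 | 10 | 4.2 | | | | |
| Motor | Right vPMC | 50 | 4 | 32 | 5.0 | | | | |
| Motor | Left subcentral | −48 | −2 | 10 | 3.8 | | | | |
| Motor | Right subcentral | 50 | −2 | 8 | 4.0 | | | | |
| Motor | Right central/rolandic operculum | 42 | 2 | 16 | 3.9 | | | | |
| Multimodal | Right anterior insula | 30 | 16 | 8 | 3.7 | | | | |
| Multimodal | Left posterior insula | −34 | −26 | 10 | 3.5 | | | | |
| Multimodal | Right posterior insula | 32 | −26 | 20 | 4.8 | | | | |
| Multimodal | Right BA 6/44 | 42 | 12 | 28 | 4.6 | | | | |
| Multimodal | Right sensorimotor cortex | | | | | 30 | −22 | 46 | 4.7 |
| Multimodal | Right posterior cingulate | | | | | 14 | −24 | 40 | 3.2 |
| Multimodal | Left paracentral lobule | | | | | −12 | −32 | 62 | 3.7 |
| Multimodal | Right paracentral lobule | | | | | 8 | −30 | 54 | 3.8 |
| Frontal | Left inferior frontal—BA 44 | −52 | 12 | 6 | 5.5 | | | | |
| Parietal | Left postcentral | −40 | −42 | 58 | 5.6 | −10 | −34 | 68 | 5.2 |
| Parietal | Right postcentral | | | | | 30 | −32 | 50 | 4.6 |
| Parietal | Right postcentral (opercular) | 50 | −20 | 18 | 4.2 | | | | |
| Parietal | Right IPL | 62 | −26 | 20 | 6.0 | | | | |
| Parietal | Right IPS | | | | | 28 | −40 | 38 | 5.3 |
| Parietal | Right parietal operculum | 50 | −22 | 16 | 4.5 | | | | |
| Thalamus | Right thalamus | 12 | −12 | 8 | 4.1 | | | | |
| Basal ganglia | Left putamen | −28 | −10 | 0 | 3.8 | | | | |
| Basal ganglia | Right putamen | 30 | −10 | −2 | 4.6 | | | | |
| Basal ganglia | Left lateral globus pallidus | −26 | −18 | 2 | 3.9 | | | | |
| Basal ganglia | Right lateral globus pallidus | 22 | −16 | 10 | 6.7 | | | | |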
Left: Brain regions whose activity is significantly correlated to activity within right pSTS (54, −42, 12) during IGN200c trials. Right: Brain regions displaying enhanced
connectivity with the right pSTS during the IGN200c task compared with simple singing. All peak/cluster ps ≤ 0.05, corrected. Refer to legend for abbreviations.
experiment. Table 4 (and Supplementary Table S4) shows that
most of the regions recruited during COMP200c are also functionally connected to each other, with the exceptions of the RCZa
and anterior insula seed voxels with pSTS. The pSTS seed voxel,
however, is functionally connected with both the RCZa and anterior insula on a subthreshold level. In fact, Fig. 4c demonstrates
the overlap between connectivity maps (all thresholded at t = 3.17,
uncorrected p = 0.001), and all three seed voxels overlap in the
medial motor regions, bilateral BA 6/44 and anterior to mid-insulae,
and left planum temporale. In stimulus-modulated functional connectivity analyses, we found that the connectivity between the
right pSTS and bilateral intraparietal sulci was significantly modulated by COMP200c when compared to the effect of simple singing
(Table 4, Fig. 4d). Interestingly, this is similar to the connectivity
results for IGN200c—the IGN200c task also significantly enhanced
the connection between right pSTS and right intraparietal sulcus. As seen in the IGN tasks, analyses of stimulus-modulated
connectivity revealed no significant differences in connectivity
between COMP25c and simple singing or between COMP25c and
COMP200c.
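The overlap between thresholded connectivity maps, as depicted by the Venn-style color legend of Fig. 4c, can be sketched as follows. The threshold of t = 3.17 comes from the text; the function name and the t values are hypothetical.

```python
T_THRESHOLD = 3.17  # threshold stated in the text (uncorrected p = 0.001)


def overlap_labels(maps, threshold=T_THRESHOLD):
    """For each voxel, list the seeds whose connectivity map exceeds threshold.

    `maps` is a dict of seed name -> list of t values, one per voxel.
    """
    names = list(maps)
    n_voxels = len(maps[names[0]])
    if any(len(maps[name]) != n_voxels for name in names):
        raise ValueError("all maps must cover the same voxels")
    return [
        tuple(name for name in names if maps[name][i] > threshold)
        for i in range(n_voxels)
    ]


# Hypothetical t values for three voxels and three seed maps.
toy_maps = {
    "pSTS": [4.0, 3.5, 1.0],
    "RCZa": [3.9, 2.0, 0.5],
    "anterior insula": [5.1, 3.3, 3.0],
}
labels = overlap_labels(toy_maps)
print(labels)
```

A voxel labeled with all three seeds corresponds to the central region of the Venn diagram, and an empty label means that the voxel is not significantly connected to any seed at this threshold.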
4. Discussion
4.1. Behavioral results
As seen in our previous experiment (Zarate & Zatorre, 2008),
singers were capable of both ignoring and compensating for the
large shift. However, as predicted, pitch-shift responses to small
feedback perturbations were not easily suppressed in the ignore
task, as demonstrated by a larger percent response magnitude during IGN25c than during IGN200c. For all behavioral analyses, we
analyzed the last second of each vocalization, which corresponds
to a late pitch-shift response that begins 300 ms after the pitch
shift and is subject to voluntary control (Burnett et al., 1998; Hain
et al., 2000)—including our ignore and compensate instructions.
Even though singers were instructed to keep their vocal output
steady in the IGN25c task, their late pitch-shift response already
may have been influenced by the more automatic early pitch-shift
response, which is elicited 100–150 ms after the pitch shift (Burnett
et al., 1998; Hain et al., 2000). Similar pitch-shift aftereffects that
occur much later than the early pitch-shift response have been
Table 4
Connectivity associated with compensating for pitch-shifted feedback.
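| Category | Region | Functional connectivity (R pSTS): x | y | z | t | Stimulus-modulated connectivity (R pSTS): x | y | z | t |
|---|---|---|---|---|---|---|---|---|---|
| Auditory | Left PAC | −46 | −22 | 14 | 3.9 | | | | |
| Auditory | Right PAC | 52 | −14 | 8 | 4.9 | | | | |
| Auditory | Right STG | 58 | −8 | 8 | 5.9 | | | | |
| Auditory | Left pSTG | −58 | −40 | 14 | 5.4 | | | | |
| Auditory | Left STS | −50 | −30 | 0 | 4.5 | | | | |
| Auditory | Right STS | 62 | −20 | 2 | 5.6 | | | | |
| Auditory | Left pSTS | −60 | −48 | 14 | 5.7 | | | | |
| Auditory | Left planum temporale | −60 | −38 | 12 | 5.0 | | | | |
| Auditory | Right planum temporale | 62 | −22 | 4 | 5.7 | | | | |
| Motor | Left ACC—BA 24 | −4 | 4 | 30 | 4.2 | | | | |
| Motor | Right ACC—BA 24 | 2 | −8 | 32 | 4.1 | | | | |
| Motor | Left ACC—BA 32 (RCZa) | −6 | 8 | 40 | 4.9 | | | | |
| Motor | Right ACC—BA 32 (RCZa) | 8 | 14 | 32 | 4.2 | | | | |
| Motor | Right SMA | 2 | 0 | 50 | 4.1 | | | | |
| Motor | Right pre-SMA | 12 | 8 | 54 | 4.2 | | | | |
| Motor | Left M1 | −52 | −4 | 38 | 4.0 | | | | |
| Motor | Right M1 | 56 | −4 | 34 | 4.9 | | | | |
| Motor | Left vPMC | −56 | 8 | 20 | 4.9 | | | | |
| Motor | Right vPMC | 48 | 2 | 38 | 4.0 | | | | |
| Motor | Left subcentral | −52 | −6 | 22 | 3.9 | | | | |
| Motor | Right subcentral | 60 | −10 | 18 | 4.5 | | | | |
| Multimodal | Left posterior insula | −32 | −26 | 16 | 4.8 | | | | |
| Multimodal | Right posterior insula | 32 | −28 | 18 | 4.7 | | | | |
| Multimodal | Left BA 6/44 | −46 | 10 | 32 | 4.8 | | | | |
| Multimodal | Right BA 6/44 | 44 | 10 | 32 | 7.1 | | | | |
| Frontal | Left inferior frontal—BA 44 | −44 | 16 | 18 | 4.4 | | | | |
| Parietal | Left postcentral | −48 | −20 | 18 | 3.8 | | | | |
| Parietal | Right postcentral | 62 | −22 | 26 | 5.4 | | | | |
| Parietal | Left IPL | −60 | −24 | 38 | 4.3 | | | | |
| Parietal | Right IPL | 58 | −26 | 34 | 5.1 | | | | |
| Parietal | Left IPS | −30 | −54 | 40 | 4.2 | −38 | −52 | 56 | 4.2 |
| Parietal | Right IPS | 32 | −46 | 40 | 4.5 | 32 | −44 | 44 | 5.1 |
| Parietal | Left supramarginal | −54 | −38 | 30 | 5.0 | | | | |
| Parietal | Right supramarginal | 60 | −36 | 30 | 4.1 | | | | |
| Thalamus | Left thalamus | −10 | −18 | 6 | 5.3 | | | | |
| Thalamus | Right thalamus | 8 | −14 | 6 | 5.9 | | | | |
| Basal ganglia | Left putamen | −26 | −24 | 14 | 4.3 | | | | |
| Basal ganglia | Right putamen | 30 | 10 | 4 | 4.9 | | | | |
| Basal ganglia | Left lateral globus pallidus | −20 | −4 | 2 | 4.5 | | | | |
| Basal ganglia | Right lateral globus pallidus | 26 | −12 | 2 | 3.4 | | | | |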
Left: Brain regions whose activity is significantly correlated to activity within right pSTS (54, −42, 12) during COMP200c. Right: Brain regions displaying enhanced connectivity
with the right pSTS during the COMP200c task compared with simple singing. All peak/cluster ps ≤ 0.05, corrected. Refer to legend for abbreviations.
reported—even after the shifted feedback was turned off, subsequent vocalizations with normal auditory feedback still showed a
compensatory adjustment from the original vocal pitch (Donath et
al., 2002; Jones & Keough, 2008; Jones & Munhall, 2000, 2005; Natke
et al., 2003). The compensatory early pitch-shift response helps stabilize the vocal motor system and correct for minor errors during
vocalization (Burnett et al., 1998; Liu & Larson, 2007). For smaller
pitch perturbations, we suggest that early pitch-shift responses
may be more robust and thus influence late pitch-shift responses,
making them less amenable to voluntary control during IGN25c
than during IGN200c.
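For a sense of scale, the two shift sizes can be converted to frequency ratios with the standard definition of the cent (1200 cents per octave). The sketch below is illustrative only; the function names and the 220 Hz starting pitch are assumptions, not values from this study.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for an interval in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def shifted_frequency(f0_hz: float, cents: float) -> float:
    """Frequency heard when a pitch of f0_hz is shifted by the given cents."""
    return f0_hz * cents_to_ratio(cents)


# Hypothetical sung pitch of 220 Hz, shifted upward by each magnitude.
for shift in (25.0, 200.0):
    print(shift, round(cents_to_ratio(shift), 4), round(shifted_frequency(220.0, shift), 1))
```

A 200-cent shift corresponds to a frequency ratio of about 1.12 (a whole tone), whereas a 25-cent shift corresponds to a ratio of about 1.01.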
4.2. fMRI results
The functional networks recruited during IGN200c and
COMP200c, when compared to simple singing, were similar to
those reported in our previous paper, despite variability from
slightly different experimental protocols (e.g., subject pool sampling, different magnetic field strengths, the use of different
methods and templates for resampling brain images into stereotaxic space).
Although singers were more successful at ignoring a large shift
than a small shift, we still found that both IGN tasks engaged a similar functional network for maintaining vocal output in the presence
of altered auditory feedback, including bilateral planum temporale
and right BA 6/44; we also determined that these regions were functionally connected with each other and additional cortical regions.
A contrast between the two tasks showed that IGN200c recruited
more activity within various auditory areas (including right pSTS)
and left supramarginal gyrus than IGN25c. The left supramarginal
gyrus has been recently associated with pitch memory (Gaab, Gaser,
& Schlaug, 2006; Gaab, Gaser, Zaehle, Jancke, & Schlaug, 2003) and
therefore may be recruited to keep the original target note in mind
as singers maintain their vocal output on that pitch and ignore the
large 200-cent pitch shift.
While singers undercorrected for the 200-cent shift and overcorrected for the 25-cent shift, they recruited a similar functional
network for voluntary vocal adjustments in both COMP tasks,
including right STS and planum temporale, right RCZa, bilateral
anterior insulae and BA 6/44, and bilateral intraparietal sulci and
supramarginal gyri, regardless of the shift magnitude. We found
that during the COMP200c task, most of the regions within this network were functionally connected with each other, particularly the
pSTS, RCZa, and anterior insula. When we compared the two COMP
tasks, we found that COMP200c required more activity within auditory regions, including right pSTS, than COMP25c.
4.2.1. Posterior STS: a possible substrate for monitoring auditory
feedback
In both 200-cent tasks, we found increased auditory cortical
activity when compared to 25-cent tasks. Recent fMRI experiments have also reported that larger pitch changes in auditory
stimuli engaged more auditory cortical activity than smaller pitch
changes (Hyde, Peretz, & Zatorre, 2008; Rinne et al., 2007). While
this enhancement of auditory cortical activity may be attributed
to the salience of larger pitch changes, we propose that the right
planum temporale, which is involved in pitch processing (Hyde
et al., 2008), and the right pSTS, which extracts particular sound
features from vocal stimuli (Belin et al., 2000; Celsis et al., 1999;
Kriegstein & Giraud, 2004; Warren et al., 2006, 2003), are also
specifically recruited as singers monitor their auditory feedback in
our 200-cent tasks. In support of the pSTS’s proposed role, we note
that connectivity between the right pSTS and intraparietal sulci was
enhanced in both COMP200c and IGN200c tasks, compared to simple singing. Rinne et al. (2007) also found that the intraparietal
sulcus was recruited in response to larger pitch shifts in the auditory discrimination task. The cortex within the intraparietal sulcus
plays a role in somatosensory and visuo-spatial transformations
for motor tasks (Astafiev et al., 2003; Grefkes, Ritzl, Zilles, & Fink,
2004; Tanabe, Kato, Miyauchi, Hayashi, & Yanagida, 2005), and in
our previous paper, we proposed that the intraparietal sulcus may
also be involved in frequency-related transformations (Zarate &
Zatorre, 2008); the intraparietal sulcus’ involvement in these types
of operations is further demonstrated by its recruitment during
musical transposition tasks (Foster & Zatorre, in press). Thus for the
COMP200c tasks, the pSTS may interact with the intraparietal sulcus to extract the pitch-shift direction to prepare the ensuing vocal
correction in the proper direction. During the IGN200c tasks, the
connectivity between the right pSTS and bilateral somatosensory
cortex was also enhanced. The frequency information extracted
within the pSTS and further processed within the intraparietal sulcus may be combined with somatosensory information to maintain
the current vocal output and ensure that it does not change in
response to the pitch-shifted feedback (see Kleber, Veit, Birbaumer,
Gruzelier, & Lotze, in press). Since functional connectivity analyses
determined that similar regions have correlated activity with right
pSTS in both COMP200c and IGN200c, we speculate that if a compensatory pitch-shift response occurred during IGN200c, singers
may then utilize the rest of the functionally connected network,
including RCZa and anterior insula, to readjust their vocal output
and correct the pitch-shift response.
4.2.2. The role of the insula in audio–vocal integration
In our previous experiment, we originally hypothesized that
the anterior insula played a role in audio–vocal integration for
three reasons: (1) this region has reciprocal connections with auditory areas and the anterior cingulate cortex (Mesulam & Mufson,
1982; Mufson & Mesulam, 1982); (2) the anterior insula’s cytoarchitecture and projections make this region more amenable to
integrating auditory input with other modalities, including visual
and vocal motor systems (Ackermann & Riecker, 2004; Bamiou,
Musiek, & Luxon, 2003; Bushara, Grafman, & Hallett, 2001; Lewis,
Beauchamp, & DeYoe, 2000; Rivier & Clarke, 1997); and (3) the
anterior insula may be involved specifically in audio–vocal integration since its activity is enhanced during overt speech and singing
when compared with covert or internal vocalization (Riecker et al.,
2000). Although singers displayed increased activity in the anterior insula during the compensate task in our prior study, this
activity did not survive the group comparison between singers
and non-musicians (Zarate & Zatorre, 2008). The anterior insula
may have been recruited to a much lower, subthreshold level in
non-musicians, and since the insula was only weakly active in
singers, the group contrast did not show any significant differences
in insular activity. Accordingly, we did not report the insula as an
experience-dependent substrate for audio–vocal integration. In the
present experiment, however, the anterior insula was one of the
most strongly recruited regions during both COMP tasks, so we
cannot dismiss its possible role in audio–vocal integration.
While anatomical studies report that the insula shares connections with auditory regions and the anterior cingulate cortex
(Mesulam & Mufson, 1982; Mufson & Mesulam, 1982), most auditory regions have connections with the mid-dorsal and posterior
insula (Augustine, 1996); this is supported by our functional connectivity results. Yet, the anterior insula may still receive input from
auditory regions via intra-insular connections (Augustine, 1996)
and via higher order auditory areas, as demonstrated by a weak
correlation in activity between the anterior insula and planum temporale in this experiment. Additionally, Augustine (1996) reported
that the anterior insula shares connections with BA 24; our overlapping connectivity maps support this and also demonstrate that
the anterior insula is functionally connected with BA 32. Together,
these cingulate areas are classified as the rostral cingulate zone
(RCZ), which can be subdivided into a posterior portion (RCZp) and
an anterior portion, RCZa (Picard & Strick, 1996, 2001). The RCZp is
associated with voluntary response selection, including speech and
singing (Paus et al., 1993; Picard & Strick, 1996, 2001). The RCZa is
involved in conflict monitoring in a variety of contexts (Botvinick
et al., 2004, 1999; Carter et al., 1998; Durston et al., 2003) and may
be recruited in our COMP tasks due to the conflict between the
intended note and altered auditory feedback. In summary, the anterior insula may be classified as a higher order association area due
to its projections and cortical architecture (Rivier & Clarke, 1997).
With its connections with auditory regions and the anterior cingulate cortex, the anterior insula may modulate vocalizations by
integrating auditory processes (via pSTS) with conflict monitoring
(via RCZa) and vocal output selection (via RCZp) and thus contribute
to audio–vocal integration.
4.2.3. The role of BA 6/44 during vocal pitch regulation
In all pitch-shift tasks, regardless of the shift magnitude, the
right ventral premotor cortex (vPMC; BA 6) and the right pars
opercularis of the inferior frontal gyrus (BA 44) were significantly
active when compared to simple singing. Both of these regions
are associated with vocal motor planning and control (Binkofski
& Buccino, 2004, 2006)—the vPMC is implicated in singing and
overt speech-related tasks (Ghosh, Tourville, & Guenther, 2008;
Nakamura, Dehaene, Jobert, Le, & Kouider, 2007; Perry et al.,
1999), whereas lesions or electrical stimulation in pars opercularis
can impair or arrest speech (Jurgens, 2002; Quinones-Hinojosa,
Ojemann, Sanai, Dillon, & Berger, 2003). The vPMC is also involved
in transforming spatial information into a motor response (Fogassi
et al., 2001; Rizzolatti, Fogassi, & Gallese, 2002); spatial information
may stem from the intraparietal sulcus, since a recent diffusion-weighted imaging tractography study reported that the vPMC
has connections with this particular area (Rushworth, Behrens,
& Johansen-Berg, 2006). During our pitch-shift tasks, we propose
that the pitch-shift direction may be encoded by the intraparietal
sulcus, and that information is transmitted to the vPMC, which
is co-activated with the pars opercularis. This co-activation may
be attributed to both cytoarchitectural similarities between the
regions (Binkofski & Buccino, 2004, 2006) and high probabilistic
connections between the vPMC and pars opercularis (Rushworth
et al., 2006). Since BA 44 and premotor areas, including vPMC and
RCZp, interact with the primary motor cortex and other regions
of the vocal motor system (Jurgens, 2002), BA 6/44 may also contribute to executing the correct voluntary vocal response in our
pitch-shift tasks.
4.2.4. Investigating the involuntary pitch-shift response
One of the goals of this experiment was to determine the neural
substrates of the involuntary pitch-shift response. Since the PAG
is implicated in initiating vocal responses to external stimuli (e.g.,
pitch-shifted feedback, Dujardin & Jurgens, 2005), we hypothesized
that this region may play a crucial role in the pitch-shift response.
Unfortunately, we did not find increased neural activity specifically associated with the pitch-shift responses during the IGN25c
task, nor did we find any regions with significantly modulated connectivity due to 25-cent singing tasks when compared to simple
singing or either of the 200-cent tasks. However, it is notable
that the increased functional connectivity between pSTS and intraparietal sulcus was only found in the two 200-cent tasks relative to
simple singing, and not in either of the 25-cent tasks. This finding
needs to be confirmed, but it would be broadly compatible with the
overall concept we propose in this paper—responses to 200-cent
shifts are under greater voluntary control than responses to 25-cent shifts, and in turn, this is the consequence of greater functional
interactions between the two cortical regions.
We suggest that the imaging results reported in this paper may
coincide with the late pitch-shift response. As previously discussed,
this late component may have been influenced by the more robust
early component during the 25-cent tasks. Since the latency of the
early pitch-shift response is only 100–150 ms, while the sparse-sampling fMRI paradigm captures neural activity only on the order
of seconds, the temporal resolution of our fMRI protocol hindered
our ability to capture the neural substrate of this response. Future
experiments designed to capture the neural correlates of the pitch-shift response may require a continuous acquisition sequence to
analyze the hemodynamic response function time delay in cortical
and subcortical regions (see Brass & von Cramon, 2002), but the
scanner noise may interfere with vocal production and auditory
feedback manipulation. Alternatively, EEG/ERP or MEG methods
may complement our fMRI studies, since they have greater temporal resolution than fMRI methods and may reveal crucial temporal
information about the interaction between regions governing the
pitch-shift response.
5. Conclusion
In this experiment, we tested experienced singers to investigate
neural correlates of voluntary (via the COMP tasks) and involuntary vocal pitch regulation (i.e., elicited pitch-shift responses
in the IGN25c task). As seen in our first experiment (Zarate &
Zatorre, 2008), experienced singers were capable of correcting for
and ignoring a 200-cent pitch shift. While singers almost completely suppressed pitch-shift responses in IGN200c, they were less
able to suppress pitch-shift responses during IGN25c; this suggests
that pitch-shift responses to smaller shifts may be more robust
and under less voluntary control than responses to larger shifts.
Although we could not verify the specific neural substrates governing audio–vocal integration during the involuntary pitch-shift
response, we confirmed that the previously hypothesized substrates of audio–vocal integration, the anterior cingulate cortex,
auditory cortex, and the insula (see Ackermann & Riecker, 2004;
Eliades & Wang, 2003; Muller-Preuss, Newman, & Jurgens, 1980;
Perry et al., 1999; Riecker et al., 2000; Zarate & Zatorre, 2008), are
involved in audio–vocal integration. More specifically, subdivisions
of these regions, namely the RCZa, pSTS, and anterior insula, are
recruited as experienced singers voluntarily adjust their vocal pitch
during the COMP tasks, regardless of the pitch-shift magnitude.
Importantly, the stimulus-modulated connectivity results suggest
that the pSTS is specifically involved in monitoring auditory feedback, and via connections with the intraparietal sulcus, encodes the
direction of pitch shifts in our 200-cent tasks, whereas this seems
not to be the case in the 25-cent tasks. Connectivity analyses also
indicate that the pSTS is functionally connected with the anterior
cingulate cortex and the insula. Thus, the pSTS may be involved in
a network that routes pitch-shift information to the RCZa either
directly through its shared connections with the anterior cingulate cortex or indirectly via its connections with the insula. The
RCZa may register cognitive conflict due to the mismatch between
shifted auditory feedback and the intended vocal output, and subsequently initiate proper vocal pitch correction via its connections
with the RCZp and the rest of the vocal motor system.
Broadly speaking, the functional connectivity results observed
in our experienced singers resemble our previous findings of functional connectivity between auditory cortex, anterior cingulate
cortex, and insula in both non-musicians and experienced singers
(Zarate & Zatorre, 2008). Although this network was not specifically recruited by non-musicians during COMP200c tasks in that
study, the fact that non-musicians possess this functionally connected network suggests that they have the potential to engage this
network for audio–vocal integration during voluntary vocal pitch
regulation if they undergo vocal training and practice.
Acknowledgments
We gratefully acknowledge Michael Petrides, Marc Schönwiesner, and the McConnell Brain Imaging Centre staff for their
assistance. This work was supported by grants from the Canadian Institutes of Health Research (CIHR; RJZ), the Eileen Peters
McGill Majors Fellowship (JMZ), and the Centre for Interdisciplinary
Research in Music Media and Technology (CIRMMT; JMZ, SW).
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in
the online version, at doi:10.1016/j.neuropsychologia.2009.10.025.
References
Ackermann, H., & Riecker, A. (2004). The contribution of the insula to motor aspects
of speech production: A review and a hypothesis. Brain and Language, 89,
320–328.
Ahsan, R. L., Allom, R., Gousias, I. S., Habib, H., Turkheimer, F. E., Free, S., et al. (2007).
Volumes, spatial extents and a probabilistic atlas of the human basal ganglia and
thalamus. NeuroImage, 38, 261–270.
Astafiev, S. V., Shulman, G. L., Stanley, C. M., Snyder, A. Z., Van, E., & Corbetta, M.
(2003). Functional organization of human intraparietal and frontal cortex for
attending, looking, and pointing. Journal of Neuroscience, 23, 4689–4699.
Augustine, J. R. (1996). Circuitry and functional aspects of the insular lobe in primates
including humans. Brain Research Brain Research Reviews, 22, 229–244.
Bamiou, D. E., Musiek, F. E., & Luxon, L. M. (2003). The insula (Island of Reil) and
its role in auditory processing, Literature review. Brain Research Brain Research
Reviews, 42, 143–154.
Belin, P., Zatorre, R. J., & Ahad, P. (2002). Human temporal-lobe response to vocal
sounds. Brain Research Cognitive Brain Research, 13, 17–26.
Belin, P., Zatorre, R. J., Hoge, R., Evans, A. C., & Pike, B. (1999). Event-related fMRI of
the auditory cortex. NeuroImage, 10, 417–429.
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P., & Pike, B. (2000). Voice-selective areas in
human auditory cortex. Nature, 403, 309–312.
Binkofski, F., & Buccino, G. (2004). Motor functions of the Broca’s region. Brain and
Language, 89, 362–369.
Binkofski, F., & Buccino, G. (2006). The role of ventral premotor cortex in action
execution and action understanding. Journal of Physiology (Paris), 99, 396–405.
Botvinick, M. M., Cohen, J. D., & Carter, C. S. (2004). Conflict monitoring
and anterior cingulate cortex: An update. Trends in Cognitive Sciences, 8,
539–546.
Botvinick, M., Nystrom, L. E., Fissell, K., Carter, C. S., & Cohen, J. D. (1999). Conflict
monitoring versus selection-for-action in anterior cingulate cortex. Nature, 402,
179–181.
Brass, M., & von Cramon, D. Y. (2002). The role of the frontal cortex in task preparation. Cerebral Cortex, 12, 908–914.
Brown, S., Martinez, M. J., Hodges, D. A., Fox, P. T., & Parsons, L. M. (2004). The song
system of the human brain. Brain Research Cognitive Brain Research, 20, 363–375.
Brown, S., Martinez, M. J., & Parsons, L. M. (2006). Music and language side by side
in the brain: A PET study of the generation of melodies and sentences. European
Journal of Neuroscience, 23, 2791–2803.
Burnett, T. A., Freedland, M. B., Larson, C. R., & Hain, T. C. (1998). Voice F0 responses
to manipulations in pitch feedback. Journal of the Acoustical Society of America,
103, 3153–3161.
Burnett, T. A., & Larson, C. (2002). Early pitch-shift response is active in both steady
and dynamic voice pitch control. Journal of the Acoustical Society of America, 112,
1058–1063.
Burnett, T. A., McCurdy, K. E., & Bright, J. C. (2008). Reflexive and volitional voice
fundamental frequency responses to an anticipated feedback pitch error. Experimental Brain Research, 191, 341–351.
Bushara, K. O., Grafman, J., & Hallett, M. (2001). Neural correlates of auditory-visual
stimulus onset asynchrony detection. Journal of Neuroscience, 21, 300–304.
Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., & Cohen, J. D.
(1998). Anterior cingulate cortex, error detection, and the online monitoring
of performance. Science, 280, 747–749.
Celsis, P., Boulanouar, K., Doyon, B., Ranjeva, J. P., Berry, I., Nespoulous, J. L., et al.
(1999). Differential fMRI responses in the left posterior superior temporal gyrus
and left supramarginal gyrus to habituation and change detection in syllables
and tones. NeuroImage, 9, 135–144.
Chen, J. L., Penhune, V. B., & Zatorre, R. J. (2008). Listening to musical rhythms recruits
motor regions of the brain. Cerebral Cortex, 18, 2844–2854.
Chen, J. L., Zatorre, R. J., & Penhune, V. B. (2006). Interactions between auditory and
dorsal premotor cortex during synchronization to musical rhythms. NeuroImage,
32, 1771–1781.
Chouinard, P. A., & Paus, T. (2006). The primary motor and premotor areas of the
human cerebral cortex. Neuroscientist, 12, 143–152.
Collins, D. L., Neelin, P., Peters, T. M., & Evans, A. C. (1994). Automatic 3D intersubject
registration of MR volumetric data in standardized Talairach space. Journal of
Computer Assisted Tomography, 18, 192–205.
Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29, 162–173.
de Cheveigné, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for
speech and music. Journal of the Acoustical Society of America, 111, 1917–1930.
Donath, T. M., Natke, U., & Kalveram, K. T. (2002). Effects of frequency-shifted auditory
feedback on voice F0 contours in syllables. Journal of the Acoustical Society of
America, 111, 357–366.
Dujardin, E., & Jurgens, U. (2005). Afferents of vocalization-controlling periaqueductal regions in the squirrel monkey. Brain Research, 1034, 114–131.
Durston, S., Davidson, M. C., Thomas, K. M., Worden, M. S., Tottenham, N., Martinez,
A., et al. (2003). Parametric manipulation of conflict and response competition
using rapid mixed-trial event-related fMRI. NeuroImage, 20, 2135–2141.
Duvernoy, H. M. (1991). The human brain: Surface, three-dimensional sectional
anatomy and MRI. Wien, New York: Springer–Verlag.
Eliades, S. J., & Wang, X. (2003). Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations. Journal of Neurophysiology, 89,
2194–2207.
Fogassi, L., Gallese, V., Buccino, G., Craighero, L., Fadiga, L., & Rizzolatti, G. (2001).
Cortical mechanism for the visual guidance of hand grasping movements in the
monkey: A reversible inactivation study. Brain, 124, 571–586.
Foster, N. E. V., & Zatorre, R. J. (in press). A role for the intraparietal sulcus in transforming musical information. Cerebral Cortex.
Fox, P. T., Huang, A., Parsons, L. M., Xiong, J.-H., Zamarippa, F., Rainey, L., et al. (2001).
Location-probability profiles for the mouth region of human primary motor-sensory cortex: Model and validation. NeuroImage, 13, 196–209.
Friston, K. J. (1994). Functional and effective connectivity in neuroimaging: A synthesis. Human Brain Mapping, 2, 56–78.
Friston, K. J., Buechel, C., Fink, G. R., Morris, J., Rolls, E., & Dolan, R. J. (1997). Psychophysiological and modulatory interactions in neuroimaging. NeuroImage, 6,
218–229.
Gaab, N., Gaser, C., & Schlaug, G. (2006). Improvement-related functional plasticity
following pitch memory training. NeuroImage, 31, 255–263.
Gaab, N., Gaser, C., Zaehle, T., Jancke, L., & Schlaug, G. (2003). Functional anatomy of
pitch memory—An fMRI study with sparse temporal sampling. NeuroImage, 19,
1417–1426.
Ghosh, S. S., Tourville, J. A., & Guenther, F. H. (2008). A neuroimaging study of premotor lateralization and cerebellar involvement in the production of phonemes
and syllables. Journal of Speech, Language, and Hearing Research, 51, 1183–1202.
Grefkes, C., Ritzl, A., Zilles, K., & Fink, G. R. (2004). Human medial intraparietal cortex subserves visuomotor coordinate transformation. NeuroImage, 23,
1494–1506.
Guimaraes, A. R., Melcher, J. R., Talavage, T. M., Baker, J. R., Ledden, P., Rosen, B.
R., et al. (1998). Imaging subcortical auditory activity in humans. Human Brain
Mapping, 6, 33–41.
Hafke, H. Z. (2008). Nonconscious control of fundamental voice frequency. Journal
of the Acoustical Society of America, 123, 273–278.
Hain, T. C., Burnett, T. A., Kiran, S., Larson, C. R., Singh, S., & Kenney, M. K. (2000).
Instructing subjects to make a voluntary response reveals the presence of
two components to the audio–vocal reflex. Experimental Brain Research, 130,
133–141.
Huffman, R. F., & Henson, O. W., Jr. (1990). The descending auditory pathway and
acousticomotor systems: Connections with the inferior colliculus. Brain Research
Brain Research Reviews, 15, 295–323.
Hyde, K. L., Peretz, I., & Zatorre, R. J. (2008). Evidence for the role of the right auditory
cortex in fine pitch resolution. Neuropsychologia, 46, 632–639.
Jeffries, K. J., Braun, A. R., & Fritz, J. B. (2003). Words in melody: An H₂¹⁵O PET study
of brain activation during singing and speaking. NeuroReport, 14, 749–754.
Jones, J. A., & Keough, D. (2008). Auditory-motor mapping for pitch control in singers
and nonsingers. Experimental Brain Research, 190, 279–287.
Jones, J. A., & Munhall, K. G. (2000). Perceptual calibration of F0 production: Evidence from feedback perturbation. Journal of the Acoustical Society of America,
108, 1246–1251.
Jones, J. A., & Munhall, K. G. (2005). Remapping auditory-motor representations in
voice production. Current Biology, 15, 1768–1772.
Jurgens, U. (2002). Neural pathways underlying vocal control. Neuroscience and
Biobehavioral Reviews, 26, 235–258.
Kleber, B., Birbaumer, N., Veit, R., Trevorrow, T., & Lotze, M. (2007). Overt and imagined singing of an Italian aria. NeuroImage, 36, 889–900.
Kleber, B., Veit, R., Birbaumer, N., Gruzelier, J., & Lotze, M. (in press). The brain of
opera singers: Experience-dependent changes in functional activation. Cerebral
Cortex.
Kriegstein, K. V., & Giraud, A. L. (2004). Distinct functional substrates along the right
superior temporal sulcus for the processing of voices. NeuroImage, 22, 948–955.
Larson, C. R. (1998). Cross-modality influences in speech motor control: The use of
pitch shifting for the study of F0 control. Journal of Communication Disorders, 31,
489–502.
Larson, C. R., Burnett, T. A., & Kiran, S. (2000). Effects of pitch-shift velocity on voice
F0 response. Journal of the Acoustical Society of America, 107, 559–564.
Lewis, J. W., Beauchamp, M. S., & DeYoe, E. A. (2000). A comparison of visual and auditory motion processing in human cerebral cortex. Cerebral Cortex, 10, 873–888.
Liu, H., & Larson, C. R. (2007). Effects of perturbation magnitude and voice F0 level on
the pitch-shift reflex. Journal of the Acoustical Society of America, 122, 3671–3677.
Lombard, E. (1911). Le signe de l’élévation de la voix. Annales maladies oreille larynx
nez pharynx, 37, 101–119.
MacDonald, A. W., III, Cohen, J. D., Stenger, V. A., & Carter, C. S. (2000). Dissociating
the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive
control. Science, 288, 1835–1838.
Mazziotta, J., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., et al. (2001). A
probabilistic atlas and reference system for the human brain: International Consortium for Brain Mapping (ICBM). Philosophical Transactions of the Royal Society
B: Biological Sciences, 356, 1293–1322.
Mesulam, M. M., & Mufson, E. J. (1982). Insula of the old world monkey. III: Efferent
cortical output and comments on function. Journal of Comparative Neurology,
212, 38–52.
Mufson, E. J., & Mesulam, M. M. (1982). Insula of the old world monkey. II: Afferent
cortical input and comments on the claustrum. Journal of Comparative Neurology,
212, 23–37.
Muller-Preuss, P., Newman, J. D., & Jurgens, U. (1980). Anatomical and physiological evidence for a relationship between the ‘cingular’ vocalization area and the
auditory cortex in the squirrel monkey. Brain Research, 202, 307–315.
Nakamura, K., Dehaene, S., Jobert, A., Le Bihan, D., & Kouider, S. (2007). Task-specific
change of unconscious neural priming in the cerebral language network. Proceedings of the National Academy of Sciences of the United States of America, 104,
19643–19648.
Natke, U., Donath, T. M., & Kalveram, K. T. (2003). Control of voice fundamental
frequency in speaking versus singing. Journal of the Acoustical Society of America,
113, 1587–1593.
Natke, U., & Kalveram, K. T. (2001). Effects of frequency-shifted auditory feedback
on fundamental frequency of long stressed and unstressed syllables. Journal of
Speech, Language, and Hearing Research, 44, 577–584.
Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B. (2005). Valid conjunction
inference with the minimum statistic. NeuroImage, 25, 653–660.
Nonaka, S., Takahashi, R., Enomoto, K., Katada, A., & Unno, T. (1997). Lombard reflex
during PAG-induced vocalization in decerebrate cats. Neuroscience Research, 29,
283–289.
Ozdemir, E., Norton, A., & Schlaug, G. (2006). Shared and distinct neural correlates
of singing and speaking. NeuroImage, 33, 628–635.
Paus, T., Petrides, M., Evans, A. C., & Meyer, E. (1993). Role of the human anterior
cingulate cortex in the control of oculomotor, manual, and speech responses: A
positron emission tomography study. Journal of Neurophysiology, 70, 453–469.
Penhune, V. B., Zatorre, R. J., MacDonald, J. D., & Evans, A. C. (1996). Interhemispheric
anatomical differences in human primary auditory cortex: Probabilistic mapping and volume measurement from magnetic resonance scans. Cerebral Cortex,
6, 661–672.
Perry, D. W., Zatorre, R. J., Petrides, M., Alivisatos, B., Meyer, E., & Evans, A. C.
(1999). Localization of cerebral activity during simple singing. NeuroReport, 10,
3979–3984.
Petrides, M. (1986). The effect of periarcuate lesions in the monkey on the performance of symmetrically and asymmetrically reinforced visual and auditory go,
no-go tasks. Journal of Neuroscience, 6, 2054–2063.
Picard, N., & Strick, P. L. (1996). Motor areas of the medial wall: A review of their
location and functional activation. Cerebral Cortex, 6, 342–353.
Picard, N., & Strick, P. L. (2001). Imaging the premotor areas. Current Opinion in
Neurobiology, 11, 663–672.
Quinones-Hinojosa, A., Ojemann, S. G., Sanai, N., Dillon, W. P., & Berger, M. S. (2003).
Preoperative correlation of intraoperative cortical mapping with magnetic resonance imaging landmarks to predict localization of the Broca area. Journal of
Neurosurgery, 99, 311–318.
Riecker, A., Ackermann, H., Wildgruber, D., Dogil, G., & Grodd, W. (2000). Opposite
hemispheric lateralization effects during speaking and singing at motor cortex,
insula and cerebellum. NeuroReport, 11, 1997–2000.
Rinne, T., Kirjavainen, S., Salonen, O., Degerman, A., Kang, X., Woods, D. L., et al.
(2007). Distributed cortical networks for focused auditory attention and distraction. Neuroscience Letters, 416, 247–251.
Rivier, F., & Clarke, S. (1997). Cytochrome oxidase, acetylcholinesterase, and NADPH-diaphorase staining in human supratemporal and insular cortex: Evidence for
multiple auditory areas. NeuroImage, 6, 288–304.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2002). Motor and cognitive functions of the
ventral premotor cortex. Current Opinion in Neurobiology, 12, 149–154.
Rushworth, M. F., Behrens, T. E., & Johansen-Berg, H. (2006). Connection patterns
distinguish 3 regions of human parietal cortex. Cerebral Cortex, 16, 1418–1430.
Schulz, G. M., Varga, M., Jeffries, K., Ludlow, C. L., & Braun, A. R. (2005). Functional
neuroanatomy of human vocalization: An H₂¹⁵O PET study. Cerebral Cortex, 15,
1835–1847.
Scott, S. K., & Johnsrude, I. S. (2003). The neuroanatomical and functional organization of speech perception. Trends in Neurosciences, 26, 100–107.
Siegel, G. M., & Pick, H. L., Jr. (1974). Auditory feedback in the regulation of voice.
Journal of the Acoustical Society of America, 56, 1618–1624.
Suga, N., & Yajima, Y. (1988). Auditory-vocal integration in the midbrain of the mustached bat: Periaqueductal gray and reticular formation. In J. D. Newman (Ed.),
The physiological control of mammalian vocalization (pp. 87–107). New York:
Plenum Press.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain (3rd
ed.). New York: Thieme Medical.
Tanabe, H. C., Kato, M., Miyauchi, S., Hayashi, S., & Yanagida, T. (2005). The sensorimotor transformation of cross-modal spatial information in the anterior
intraparietal sulcus as revealed by functional MRI. Brain Research Cognitive Brain
Research, 22, 385–396.
Tomaiuolo, F., MacDonald, J. D., Caramanos, Z., Posner, G., Chiavaras, M., Evans, A.
C., et al. (1999). Morphology, morphometry and probability mapping of the pars
opercularis of the inferior frontal gyrus: An in vivo MRI analysis. European Journal
of Neuroscience, 11, 3033–3046.
Warren, J. D., Scott, S. K., Price, C. J., & Griffiths, T. D. (2006). Human brain mechanisms
for the early analysis of voices. NeuroImage, 31, 1389–1397.
Warren, J. D., Uppenkamp, S., Patterson, R. D., & Griffiths, T. D. (2003). Separating
pitch chroma and pitch height in the human brain. Proceedings of the National
Academy of Sciences of the United States of America, 100, 10038–10042.
Westbury, C. F., Zatorre, R. J., & Evans, A. C. (1999). Quantifying variability in the
planum temporale: A probability map. Cerebral Cortex, 9, 392–405.
Worsley, K. J. (2005). An improved theoretical P value for SPMs based on discrete
local maxima. NeuroImage, 28, 1056–1062.
Worsley, K. J., Charil, A., Lerch, J., & Evans, A. C. (2005). Connectivity of anatomical and
functional MRI data. In 2005 International Joint Conference on Neural Networks
Worsley, K. J., Liao, C., Aston, J., Petre, V., Duncan, G. H., Morales, F., et al. (2002). A
general statistical analysis for fMRI data. NeuroImage, 15, 1–15.
Zarate, J. M., & Zatorre, R. J. (2008). Experience-dependent neural substrates involved
in vocal pitch regulation during singing. NeuroImage, 40, 1871–1887.
Zatorre, R. J., Chen, J. L., & Penhune, V. B. (2007). When the brain plays music:
Auditory-motor interactions in music perception and production. Nature
Reviews Neuroscience, 8, 547–558.