Brain & Language 128 (2014) 18–24

Contents lists available at ScienceDirect: Brain & Language
Journal homepage: www.elsevier.com/locate/b&l

Short Communication

Cross-linguistic sound symbolism and crossmodal correspondence: Evidence from fMRI and DTI

Kate Pirog Revill a,*, Laura L. Namy b, Lauren Clepper DeFife b,1, Lynne C. Nygaard b
a Center for Advanced Brain Imaging, Georgia Institute of Technology, Atlanta, GA, USA
b Department of Psychology, Emory University, Atlanta, GA, USA

Article history: Accepted 1 November 2013. Available online 15 December 2013.

Keywords: Sound symbolism; Crossmodal correspondences; Spoken language; fMRI; DTI

Abstract: Non-arbitrary correspondences between spoken words and categories of meanings exist in natural language, with mounting evidence that listeners are sensitive to this sound symbolic information. Native English speakers were asked to choose the meaning of spoken foreign words from one of four corresponding antonym pairs selected from a previously developed multi-language stimulus set containing both sound symbolic and non-symbolic stimuli. In behavioral (n = 9) and fMRI (n = 15) experiments, participants showed reliable sensitivity to the sound symbolic properties of the stimulus set, selecting the consistent meaning for the sound symbolic words at above-chance rates. There was increased activation for sound symbolic relative to non-symbolic words in left superior parietal cortex, and a cluster in left superior longitudinal fasciculus showed a positive correlation between fractional anisotropy (FA) and an individual's sensitivity to sound symbolism. These findings support the idea that crossmodal correspondences underlie sound symbolism in spoken language.

© 2013 Elsevier Inc. All rights reserved.

1.
Introduction

One of the most basic and enduring assumptions regarding natural language is that the relationship between linguistic form and meaning is fundamentally arbitrary (Hockett, 1977; Jackendoff, 2002; Pinker, 1999; Saussure, 1959). Indeed, arbitrary connections between linguistic form and meaning are thought to be a necessary design characteristic of language, granting language its compositional power, referential flexibility, and productivity (Gasser, 2004; Hockett, 1977; Monaghan, Christiansen, & Fitneva, 2011; Saussure, 1959). Despite the apparent arbitrariness of the relationship between linguistic signs and their meaning, both historical and recent evidence suggests that non-arbitrary correspondences between linguistic structure and categories of meaning exist in natural language, and that language users are sensitive to these correspondences (Kohler, 1947; Kovic, Plunkett, & Westermann, 2010; Maurer, Pathman, & Mondloch, 2006; Monaghan, Christiansen, & Chater, 2007; Nygaard, Cook, & Namy, 2009; Ohala, 1984; Perniss, Thompson, & Vigliocco, 2010; Ramachandran & Hubbard, 2001; Sapir, 1929; Sereno & Jongman, 1990). These correspondences, dubbed sound symbolism, include special classes of words such as onomatopoeia, Japanese mimetics (Imai, Kita, Nagumo, & Okada, 2008; Kita, 1997), and phonesthemes (Bergen, 2004) in which the structure of spoken word forms either resembles or reliably predicts characteristics of the referents.

* Corresponding author. Present address: Facility for Education and Research in Neuroscience, Emory University, 36 Eagle Row, Atlanta, GA 30322, USA. Fax: +1 404 727 0372. E-mail address: [email protected] (K.P. Revill).
1 Present address: Communication Sciences and Disorders, Georgia State University, Atlanta, GA, USA.
Although these examples suggest that non-arbitrary mappings exist in natural language, these cases often reflect conventions within particular languages. There is mounting evidence that listeners are also sensitive to cross-linguistic sound symbolism (Nuckolls, 1999), enabling listeners to match unfamiliar foreign words to their correct meanings at rates above chance (Berlin, 1994; Brown, Black, & Horowitz, 1955; Kunihira, 1971). This evidence suggests that sound-to-meaning mappings may display consistency across languages, allowing native speakers of one language to recruit these correspondences in the service of inferring the meaning of words in another language. Recent findings also suggest a facilitative effect of sound symbolic correspondences on foreign word learning. Following Kunihira (1971), Nygaard et al. (2009) taught native English-speaking adults the English equivalents of Japanese antonyms. At test, listeners were more accurate and responded more quickly when the Japanese items had been paired with the actual English equivalent during learning than when paired with a mismatched meaning.

1.1. Sound symbolism, crossmodal correspondence, and synesthesia

One account of the underlying mechanisms of cross-linguistic sound symbolism is based on cross-modal correspondences. Ramachandran and Hubbard (Hubbard, Brang, & Ramachandran, 2011; Ramachandran & Hubbard, 2001) suggest that sound symbolism is a product of cross-modal integration, whereby motor aspects of speech production or acoustic aspects of the speech signal elicit activation of corresponding properties in other sensory modalities and direct attention to aspects of the physical objects to which the words refer.
In signed languages, visual-spatial linguistic forms often directly correspond to perceptual properties of a sign’s referent (Perniss et al., 2010) and signers are faster to match iconic signs with pictures when features of the referent corresponding to the iconic form of the sign are emphasized in the picture (Thompson, Vinson, & Vigliocco, 2010). Although the iconicity of the mapping between a visual property and its auditory analog may not be as literal or as readily recognized, reliable crossmodal correspondences have been reported, for example, between auditory pitch and visual size, with participants preferring pairings between high pitch and small size or low pitch and large size (Spence, 2011). Some (but not all) of these crossmodal correspondences may arise from the correlation of physical properties in the real world, for example where smaller or faster objects have higher resonating frequencies than larger or slower ones (see, e.g., Ohala, 1983; Spence, 2011). That these cross-modal links extend across languages and operate independently of specific language experience suggests a potential basis for cross-linguistic sound-to-meaning mappings. The results of a recent categorization experiment by Kovic and colleagues (Kovic et al., 2010) support this hypothesis. Participants trained with sound symbolic labels categorized objects more quickly than participants trained with non-symbolic labels, and event related potentials (ERP) recorded during the final categorization test showed an increased early negativity when participants viewed objects in the presence of sound symbolic relative to non-symbolic labels. This ERP component has been linked to cross-modal integration and stimulus binding in other tasks (Molholm et al., 2002). 
These findings are also consistent with current neural models of synesthetic crossmodal integration, which typically hypothesize co-activation of early sensory areas as well as activation of parietal areas associated with stimulus binding (Brang, Hubbard, Coulson, Huang, & Ramachandran, 2010). Although there are important differences between the perceptual experiences of true synesthetes and the sensitivity to crossmodal correspondences that most individuals exhibit, several research groups have drawn connections between crossmodal integration and synesthesia (Brang, Williams, & Ramachandran, 2012; Martino & Marks, 2001; Spector & Maurer, 2009). Relative to normal controls, synesthetes show enhanced crossmodal integration effects (Brang et al., 2012) as well as changes in the functional and anatomical properties of parietal areas known to be involved in cross-modal integration (Neufeld, Sinke, Dillo, et al., 2012; Neufeld, Sinke, Zedler, et al., 2012; Rouw & Scholte, 2007; van Leeuwen, den Ouden, & Hagoort, 2011). In previous work (Namy, DeFife, Mathur, & Nygaard, submitted for publication; Tzeng, Nygaard, & Namy, submitted for publication), we developed a multi-language stimulus set to investigate the extent to which native monolingual speakers of English display sensitivity to sound-to-meaning correspondences in words drawn from other natural languages from distinct language families. Native speakers of Albanian, Dutch, Gujarati, Indonesian, Korean, Mandarin, Romanian, Tamil, Turkish, and Yoruba nominated multiple synonyms for each member of 9 antonym pairs. In a 2AFC paradigm, native English speakers were asked to choose the meaning of foreign words from the antonym pairs. Although listeners identified the correct meanings at greater than chance levels across semantic domains and languages, how consistently listeners selected a particular meaning varied across individual items. 
This variation indicates that words differed with respect to their degree of sound symbolism within each language and semantic domain and underscores the probabilistic nature of sound symbolism. From these data, we identified a set of sound symbolic words, items that were judged to mean a particular antonym by at least 80% of participants, as well as a set of non-symbolic words for which there was no consensus on the meaning. In this study, we investigate the cross-modal integration account of sound symbolism by examining whether activation in parietal or perceptual areas differs when participants guess the meanings of sound symbolic versus non-symbolic foreign words, as well as whether an individual participant's sensitivity to sound symbolism is related to structural connectivity in multi-modal regions.

2. Results and discussion

2.1. Behavioral replication and results

In the stimulus selection work (DeFife, Nygaard, & Namy, in preparation; Namy et al., submitted for publication), all items for a single antonym pair were presented in a single block. In order to accommodate fMRI task constraints, the current paradigm intermixes short blocks of five trials from each of four antonym pairs (big/small, round/pointy, still/moving, fast/slow), with sound symbolism classification held constant within each block. To determine whether this adapted paradigm elicited the same sensitivity to sound symbolism, we replicated our previous results with nine participants in a behavioral pilot experiment. Participants heard a foreign word and were asked to guess its meaning from its corresponding English antonym pair. We calculated the proportion of antonym 1 responses for sound symbolic words previously classified as meaning antonym 1, sound symbolic words previously classified as meaning antonym 2, and non-symbolic words that were previously equally likely to be paired with each antonym. Participants showed clear effects of sound symbolism (Fig. 1, left panel).
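The per-category scoring described above can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code; the function name and category labels ('ant1', 'neither', 'ant2') are hypothetical.

```python
def proportion_antonym1(responses, categories):
    """Proportion of antonym-1 choices within each 'sounds-like' category.

    responses:  1 if the listener chose antonym 1 on that trial, else 0
    categories: per-trial label -- 'ant1' (previously judged to sound like
                antonym 1), 'neither' (non-symbolic), or 'ant2'
    Labels and function name are illustrative, not from the paper.
    """
    props = {}
    for cat in ("ant1", "neither", "ant2"):
        trials = [r for r, c in zip(responses, categories) if c == cat]
        props[cat] = sum(trials) / len(trials)
    return props

# Toy listener sensitive to sound symbolism: high antonym-1 rate for
# 'ant1' items, near 0.5 for non-symbolic items, low for 'ant2' items.
resp = [1, 1, 1, 0,  1, 0, 0, 1,  0, 0, 0, 1]
cats = ["ant1"] * 4 + ["neither"] * 4 + ["ant2"] * 4
props = proportion_antonym1(resp, cats)  # {'ant1': 0.75, 'neither': 0.5, 'ant2': 0.25}
```

A graded pattern across the three categories (rather than all-or-none responding) is what motivates treating sound symbolism as probabilistic.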
A repeated measures ANOVA with two levels of meaning (form/motion) and three levels of "sounds-like" category (sounds like antonym 1, sounds like antonym 2, sounds like neither) showed a significant main effect of "sounds-like" category, F(2,16) = 75.3, p < 0.001, but no main effect of meaning type or interaction between category and meaning type (both F < 1). Pairwise comparisons (collapsed across meaning type) show that all three word sets differed reliably from each other, t's(8) = 10.2, 7.1, and 8.8 for antonym 2 vs. neither, neither vs. antonym 1, and antonym 1 vs. 2 respectively, all p's < 0.001. Consistent with our previous work, participants showed similar sensitivity to sound symbolic properties of foreign words despite changes in how the materials were presented.

Fig. 1. Mean proportion of antonym 1 responses for words previously chosen to sound like antonym 1, words equally likely to be paired with either antonym, and words previously chosen to sound like antonym 2 for the behavioral and fMRI experiments. Error bars represent standard error of the mean.

Participants in the imaging study (n = 15) also showed sensitivity to sound symbolism in their pattern of responses (Fig. 1, right panel). A repeated measures ANOVA with two levels of meaning (form/motion) and three levels of "sounds-like" category showed a significant main effect of "sounds-like" category, F(2,28) = 71.4, p < 0.001, but no main effect of meaning type, F(1,14) = 1.76, p > 0.2, or interaction between category and meaning type, F(1,14) = 1.01, p > 0.2. Pairwise comparisons (collapsed across meaning type) show that all three word sets differed reliably from each other, t's(14) = 7.7, 8.2, and 8.7 for antonym 2 vs. neither, neither vs. antonym 1, and antonym 1 vs. 2 respectively, all p's < 0.001.

2.2. Whole-brain fMRI analysis

Relative to rest, task-related activation was seen in a network of brain areas (Table 1, Fig.
2a) including bilateral superior temporal gyrus, bilateral (but predominantly left-lateralized) inferior frontal gyrus, and supplementary motor area. These regions are frequently identified as important components of the network that processes spoken language (Hickok & Poeppel, 2007; Scott, Blank, Rosen, & Wise, 2000), particularly of the dorsal speech perception pathway associated with the integration of auditory and motor representations. Despite participants’ reports of guessing on all stimuli and their lack of awareness of the sound symbolic manipulation, participants did show sensitivity to the manipulation in both their behavior (Fig. 1) and brain activity (Fig. 2a). The contrast between sound symbolic and non-symbolic word blocks revealed an area of significant activation in the intraparietal sulcus in left superior parietal lobe (Fig. 2a, Table 1). No areas surviving correction for multiple comparisons were more reliably active for non-symbolic words than symbolic words, or for the contrast of meaning type (motion versus form). Previous results from several studies of synesthesia show that synesthetic perception is associated with increased activation or volumetric differences in superior and inferior parietal lobes and along the intraparietal sulcus, often in the left hemisphere. The cluster identified here is located near the region showing more activity in audio-visual synesthetes than nonsynesthetes during auditory stimulus processing by Neufeld and colleagues (Neufeld, Sinke, Dillo, et al., 2012). Parietal cortex is known to be involved in multisensory integration (Calvert, 2001; Robertson, 2003), and permanent or temporary lesions to cortex along the inferior parietal sulcus can lead to difficulty with stimulus binding in patients (Robertson, 2003) and synesthetes (Esterman, Verstynen, Ivry, & Robertson, 2006). 
Table 1. Talairach coordinates of the center of mass, volume, and peak t value of significant activation (cluster-based FWE corrected p < 0.05; uncorrected p < 0.001, cluster size 31 voxels) for each contrast of interest. BA = Brodmann's Area.

Contrast | # Voxels | Peak T | x | y | z | Region
All words > rest | 348 | 12.7 | 56.8 | −21.5 | 5.8 | Right superior temporal gyrus
All words > rest | 327 | 14.2 | −54.2 | −20.7 | 8.4 | Left superior temporal gyrus
All words > rest | 203 | 12.5 | −36.7 | 23.1 | 5.6 | Left inferior frontal gyrus (BA 13/45)
All words > rest | 176 | 10.2 | 0.0 | 13.4 | 45.8 | Medial frontal gyrus
All words > rest | 133 | 9.7 | −42.8 | 11.8 | 28.5 | Left middle frontal gyrus (BA 44/45)
All words > rest | 47 | 9.0 | 34.1 | 22.9 | 4.0 | Right insula, right inferior frontal gyrus
Sound symbolic words > non-symbolic words | 35 | 5.0 | −32.8 | −65.4 | 44.8 | Left superior parietal lobe

Fig. 2. (A) Warm colors: significant activation for task relative to baseline. Cool colors: area showing significant activation for sound symbolic words relative to non-symbolic words (collapsed across meaning dimension) in left superior parietal lobe. Activation is thresholded at a corrected p < 0.05 (uncorrected p < 0.001, cluster size 31 voxels). (B) Significant correlations between sensitivity to sound symbolism (% correct on sound symbolic words) and FA were found in two clusters (red) within the left superior longitudinal fasciculus. The FA skeleton (green) is overlaid on the group mean FA image in standard space. The relationship between participants' performance on the sound symbolic trials and mean FA extracted from the significant clusters is shown for illustrative purposes only.

2.3. ROI analyses

Prior evidence has shown that making categorical decisions about object form or motion properties or encountering language stimuli referring to aspects of object form or motion can activate occipitotemporal regions associated with perceiving those properties (Chao, Weisberg, & Martin, 2002; Revill, Aslin, Tanenhaus, & Bavelier, 2008; Willems & Casasanto, 2011). We were able to use independently defined regions of interest to further investigate whether regions involved in perceiving object form (for all participants) and motion (for a subset of the participants) were activated while guessing meanings for words relating to those properties and whether activation in these areas varied based on the sound symbolic status of the words. We observed greater activation for intact abstract shapes relative to scrambled abstract shapes bilaterally in all participants from the visual form localizer data. Mean peak voxel coordinates across all participants were (−47, −75, −2) and (48, −74, −5). The contrast of moving versus static dots revealed bilateral activation in visual areas including area MT+, with average Talairach coordinates of (−50, −73, 5) and (48, −70, 2) in a subset of 11 participants for whom MT localizer data was available. Each individual's peak coordinates were used as the centers of independent ROIs for a targeted analysis of the sound symbolic word task data. Mean beta values, scaled as percent signal change, were extracted for each participant. Despite the small sample size, a repeated measures ANOVA revealed a significant effect of word meaning in the left MT ROI, F(1,10) = 6.29, p < 0.05, with less activation for motion words than form words. No other ROIs showed effects of word meaning.

Although the relative reduction in left MT activity during processing of motion antonym blocks was not predicted, decreased activation in perceptual regions for linguistic stimuli has been observed in other studies (Aziz-Zadeh et al., 2008) and may indicate interference between linguistic stimuli and normal perceptual processing (Landau, Aziz-Zadeh, & Ivry, 2010; Meteyard, Zokaei, Bahrami, & Vigliocco, 2008). We did not observe significant effects or interactions with sound symbolism in any ROI (all F < 1). Planned contrasts did not reveal significant activation differences between symbolic and non-symbolic motion words in the MT ROIs (left: t(10) = 0.9, p > 0.2; right: t(10) = 1.5, p > 0.1) or between symbolic and non-symbolic form words in the LOC ROIs (left: t(14) = 0.7, p > 0.2; right: t(14) = 0.7, p > 0.2). While crossmodal activation theories of synesthesia posit direct connections between and activation of perceptual areas during processing of synesthetic stimuli, we did not observe evidence for direct activation of visual sensory areas by sound symbolic stimuli, though caution is warranted when drawing conclusions about null effects, particularly with a relatively small sample size. However, we do see increased activation for sound symbolic items in integrative areas associated with crossmodal binding of stimuli, an important component of current models of synesthetic processing (Brang et al., 2010; Neufeld, Sinke, Dillo, et al., 2012; Neufeld, Sinke, Zedler, et al., 2012; Rouw & Scholte, 2007; van Leeuwen et al., 2011).

2.4. DTI analyses

As a group, participants were sensitive to the sound symbolic properties of the words, pairing sound symbolic words with the 'correct' antonyms (the meaning agreed upon by more than 80% of the participants in the initial stimulus set construction) at a rate well above chance (67.7% of trials), but not all participants were equally likely to choose the correct meanings (range: 47.5–85.0%).
We used each individual's sound symbolic accuracy score to perform a regression against the fractional anisotropy (FA) skeleton defined by TBSS. Initial whole-brain analyses did not reveal any clusters in which an individual's sensitivity to sound symbolism was significantly correlated with FA after permutation-based correction for multiple comparisons. However, previous research has suggested that FA in parietal/temporal white matter, including the superior longitudinal fasciculus (SLF), correlates with behavior on cross-modal integration tasks (Brang, Taich, Hillyard, Grabowecky, & Ramachandran, 2013) and language tasks, particularly language tasks involving phonological processing (Vandermosten et al., 2012; Wong, Chandrasekaran, Garibaldi, & Wong, 2011). SLF masks from the JHU white matter tractography atlas (Hua et al., 2008) were combined with the group FA skeleton using a region of interest approach. Within this limited search volume, two clusters in the left superior longitudinal fasciculus (−26, −43, 28; −39, −41, 28; Fig. 2b) survived TFCE correction for multiple comparisons (p < 0.05) and show a positive correlation between FA and accuracy on sound symbolic words. Similar clusters show correlations between FA and crossmodal integration in nonsynesthetes (Brang et al., 2013) and with sound-to-meaning mapping in word learners (Wong et al., 2011). Synesthetes also show increased FA in this area compared to healthy controls (Rouw & Scholte, 2007).

3. General conclusions

These data provide support for cross-modal activation as a mechanism by which sound symbolism facilitates word-to-meaning mappings. Heightened activation in left superior parietal lobe for sound symbolic relative to non-symbolic words suggests that sound symbolic foreign words engage cross-modal sensory integration processes to a greater extent than non-symbolic words.
DTI analysis revealed that individual differences in performance on the behavioral task reliably predicted FA in the left superior longitudinal fasciculus, which has previously been linked to individual differences in cross-modal integration (Brang et al., 2013; Rouw & Scholte, 2007). We did not (possibly due to our small sample size) observe predicted differences between sound symbolic and non-symbolic stimuli in lower level sensory ROIs involved in the perception of object form or motion. Future studies with a larger sample size and a stimulus set constructed to maximize sound symbolic properties will be needed to fully explore a direct co-activation hypothesis. While these findings are consistent with research linking increased activity and structural integrity in intraparietal regions with cross-modal integration, these areas are also part of an important dorsal pathway involved in phonological processing (Hickok & Poeppel, 2007; Wong et al., 2011). An important issue to address in future work is whether the relationship between FA and sensitivity to sound symbolism can be explained by individual differences in cross-modal integration or by variability in listeners' phonological processing skills. Although there are no gross phonological differences between the sound symbolic and non-symbolic words in this stimulus set, there are phonological regularities in the sound symbolic stimulus set that provide reliable cues to meaning (Namy et al., submitted for publication). Better attunement to phonological information or sensitivity to phonological regularities, particularly when listening to words from unfamiliar phonologies, might have enabled some participants to capitalize more readily than others upon the cross-modal sound-to-meaning correspondences.
In sum, these findings suggest that cross-modal correspondences between particular auditory stimuli and particular visuospatial features of objects account for at least some aspects of sound symbolism. That these correspondences appear to transcend language families suggests that the associations are not dependent upon language experience, but rather on a general sensitivity to relations across auditory and visual domains, including natural correlations between physical features of objects and their auditory consequences. Recent phonological analyses (Namy et al., submitted for publication) have confirmed that there are common phonological properties associated with particular meanings across languages, and that the prevalence of these features is correlated with accuracy in guessing the meanings of these foreign words. A critical question, of course, is why these particular sound-to-meaning correspondences exist at all. These reliable correspondences may reflect underlying acoustic or articulatory properties of natural language that are non-arbitrarily related to features in other sensory modalities, perhaps through an abstract form of iconicity or embodiment. An additional question is why sound-to-meaning correspondences continue to exist in natural languages, given the clear advantages of arbitrariness in language. Perhaps cross-modal correspondences between sound and meaning persist because they ease the formation of sound-to-meaning mappings in first- or second-language learners (Maurer et al., 2006; Nygaard et al., 2009) or because they render semantic retrieval or categorization faster or more efficient in skilled language users (Kovic et al., 2010; Thompson et al., 2010). These will be important directions for future research.

4. Methods

4.1. Stimuli

Four antonym pairs were employed for this study: two pairs relating to object motion (still/moving and fast/slow) and two pairs relating to object form (big/small and round/pointy).
Stimuli were derived by asking native speakers of 10 foreign languages (6 M, ages 21–29, mean age of first exposure to English 9.7 years, all currently living in the US) to nominate as many synonyms as they could think of for each word and to accept or reject additional synonyms from language-to-English dictionaries (DeFife et al., in preparation). The final list of synonyms was recorded by the same native speaker using neutral, list-like prosody. The ten languages come from seven different language families (four are Indo-European, with one each from Austronesian, Korean, Sino-Tibetan, Dravidian, Altaic, and Niger-Congo language families) with a range of phonological and morphological properties. Two of the languages are tone languages. Vowel and consonant inventories range from moderately small to large, syllable structures range from simple to complex, and morphologies range from isolating to synthetic (see Supplemental materials). However, all participants in the experiments described here were native speakers of English with no exposure to or knowledge of any of these languages. After the stimuli were recorded by the native speakers, groups of 13–15 native English speakers heard each word and guessed which member of the antonym pair it referred to. From these ratings, we identified a subset of words for which at least 80% of listeners assigned the word to a single member of the antonym pair and a subset for which there was no consensus on meaning. Twenty of the sound symbolic (high consensus) items and twenty non-symbolic items were selected from each antonym pair as materials for the current study. For each antonym pair, equal numbers of sound symbolic and non-symbolic words were chosen from each language. 4.2. 
Participants

Twenty-four young adults from the Emory University and Georgia Institute of Technology communities participated in the study, nine in the pilot experiment (8 female, mean age 19.2, SD 0.42, age range 18–20 years), and fifteen in the imaging paradigm (5 female, mean age 22.7, SD 3.8, age range 19–33 years). All participants gave informed consent in accord with Emory and Georgia Tech IRB protocols. Per self-report, all participants were right-handed with normal hearing, normal or corrected-to-normal vision, and no history of language or neurological disorders. All were native speakers of English and none had prior experience with any of the ten languages comprising the stimulus set. Participants were paid for their participation.

4.3. Procedures

At the beginning of each task block, an instruction screen displayed the antonym pair that would be the basis for the next set of trials. The antonym pair remained on the screen throughout the block, with one word presented on each side of a central fixation cross. On each trial, participants heard a single word and indicated which antonym they thought corresponded with the meaning of the spoken word by pressing the first or second button of a response box affixed to the participant's right leg, using their right index and middle fingers. No feedback was provided. To signal the beginning of a new trial, the central fixation cross turned from a dark grey to a light grey color 200 ms prior to the onset of the target word. The fixation cross remained light grey until the participant made a response or until 3.3 s had elapsed and responses were no longer accepted. 500 ms after the time-out, the next trial began. Each block contained 5 trials, for a total block duration of 24 s (4 s instruction screen plus 5 × 4 s trials). In the fMRI experiment, task blocks were separated by a 12 s rest interval where only the fixation cross was present.
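The trial and block timing just described (200 ms cue, 3.3 s response window, 500 ms gap, 4 s instruction screen, 12 s rest) can be sketched as an onset-time generator. This is a simplified reconstruction for illustration, not the authors' actual stimulus-presentation script.

```python
# Timing constants taken from the procedure described above (seconds).
INSTRUCTION = 4.0   # instruction screen at the start of each block
CUE = 0.2           # fixation brightens 200 ms before word onset
RESPONSE = 3.3      # response window from word onset
GAP = 0.5           # pause after the response window
TRIAL = CUE + RESPONSE + GAP           # 4.0 s per trial
TRIALS_PER_BLOCK = 5                   # block = 4 + 5 * 4 = 24 s
REST = 12.0                            # rest between fMRI task blocks

def word_onsets(n_blocks):
    """Word-onset times for n_blocks task blocks separated by rest.

    Hypothetical helper name; returns one onset per trial.
    """
    onsets = []
    t = 0.0
    for _ in range(n_blocks):
        t += INSTRUCTION
        for _ in range(TRIALS_PER_BLOCK):
            onsets.append(t + CUE)     # word starts 200 ms after the cue
            t += TRIAL
        t += REST
    return onsets

onsets = word_onsets(2)  # first word at 4.2 s; second block starts at 36 s
```

With these constants, each task block occupies exactly 24 s, matching the block duration stated in the text.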
Trials were blocked by antonym pair (fast/slow, moving/still, pointy/round, big/small), and sound symbolism status (symbolic/non-symbolic). Participants were not told about sound symbolism or informed of the sound symbolic blocking prior to the experiment, and post-experiment questioning indicated that participants were unaware of this manipulation. Participants completed 32 blocks of trials, four blocks each for every combination of antonym pair and sound symbolism level. In the fMRI experiment, eight blocks (two for each antonym pair, one at each level of sound symbolism) comprised a single functional run lasting 4:54. Participants completed four functional runs. The order of blocks was counterbalanced across participants and runs. Following completion of the main task, imaging participants also viewed stimuli designed to separately localize form- and motion-sensitive areas of visual cortex. An object form localizer was used to select regions in lateral occipital cortex (LOC) responsive to intact versus scrambled objects or shapes. The localizer consisted of alternating blocks of abstract shapes and scrambled abstract shapes presented centrally. Each image was displayed for 800 ms with a 200 ms ISI between images, with 20 images per block. A blank screen was displayed for 12 s between shape and scrambled shape blocks. To ensure attention to the stimuli, participants performed a 1-back task, pressing a button to repeated shapes or scrambled shape images (10% of trials). Six complete cycles of intact and scrambled blocks were presented. During the motion localizer task, participants passively fixated a central cross while twelve 20-s blocks of moving or stationary dots were presented. Dots moved radially at 7°/s in an annulus ranging from 1° to 14° during the motion intervals.
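The 1-back attention task in the localizer (about 10% of trials immediately repeat the previous image) can be sketched as a sequence generator. This is an illustrative sketch with hypothetical names, not the authors' code.

```python
import random

def one_back_sequence(images, n_trials, p_repeat=0.10, seed=0):
    """Build a localizer image sequence in which roughly p_repeat of
    trials immediately repeat the previous image (1-back targets).
    Parameter names are assumptions for illustration."""
    rng = random.Random(seed)
    seq = [rng.choice(images)]
    for _ in range(n_trials - 1):
        if rng.random() < p_repeat:
            seq.append(seq[-1])  # 1-back repeat: participant should respond
        else:
            # pick any image other than the one just shown
            seq.append(rng.choice([im for im in images if im != seq[-1]]))
    return seq

def one_back_targets(seq):
    """Indices where a response is expected (current image == previous)."""
    return [i for i in range(1, len(seq)) if seq[i] == seq[i - 1]]

seq = one_back_sequence(list(range(5)), 200, 0.10, seed=1)
targets = one_back_targets(seq)
```

By construction, every non-target trial differs from its predecessor, so hits and false alarms can be scored unambiguously.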
Due to equipment malfunction, only 11 of the 15 participants viewed the motion localizer stimuli; all participants viewed the form localizer.

4.4. Image acquisition

All MRI data were collected on a Siemens 3T Trio scanner with a 12-channel RF-receive head coil. Functional data were collected using an EPI pulse sequence with the following scan parameters: repetition time (TR) 2000 ms, echo time (TE) 30 ms, flip angle 90°, 64 × 64 matrix, 192 × 192 mm field of view (FoV), GRAPPA parallel imaging with acceleration factor PE = 2, and an isotropic voxel size of 3 mm. Thirty-seven axial slices aligned with the A–P plane were collected in an ascending interleaved order. For the main task, we collected a total of 572 volumes (143 in each of 4 runs). Object form and motion localizers consisted of 192 and 160 functional volumes, respectively. Diffusion tensor images (DTI) were acquired using a diffusion-weighted EPI sequence with a TR of 7700 ms, TE of 90 ms, matrix size 102 × 102 and FoV 204 × 204 mm, with voxel size 2 × 2 × 2 mm. Two repetitions of 30 directions were collected, along with a reference B0 image. In addition, a 3D anatomical image was acquired for each participant using a T1-weighted MP-RAGE sequence at a voxel size of 1 × 1 × 1 mm with a TR of 2300 ms, TE of 3.02 ms, TI of 1100 ms, 256 × 256 matrix, 256 × 256 mm FoV, 192 slices, and GRAPPA parallel imaging with acceleration factor PE = 2.

4.5. fMRI data analysis

Following the removal of three initial volumes, functional data were slice-time corrected, motion-corrected, aligned to each subject's anatomical image, normalized to the colin27 template in Talairach space, and smoothed at 8 mm FWHM using AFNI (Cox, 1996). Data were analyzed using multiple linear regression via AFNI's 3dDeconvolve tool. The regression model included head movement vectors as regressors of no interest. Five task regressors were modeled with gamma variate functions convolved with stimulus timing and duration.
The four conditions of interest were sound symbolic words referring to motion antonym pairs, sound symbolic words referring to object form antonym pairs, non-symbolic motion pairs, and non-symbolic form pairs. The initial instruction screen was modeled separately. Beta weights from the subject-level analysis were submitted to a whole-brain group-level analysis. Data were thresholded at a cluster-based FWE-corrected p = 0.05 (a minimum cluster size of 31 voxels at an uncorrected p < 0.001) using AFNI's cluster-based Monte Carlo simulation method. Data from the motion and form localizers were pre-processed and analyzed separately using identical methods. Due to group-level overlap between LOC and MT/MST (Kourtzi, Bulthoff, Erb, & Grodd, 2002), individual ROIs for visual motion processing and visual form processing were defined for each participant separately (Saxe, Brett, & Kanwisher, 2006) by drawing a sphere with radius 4.4 mm around each individual's peak voxel in each hemisphere for the motion (when present) and form localizer contrasts, respectively.

4.6. DTI data analysis

DTI data analysis was performed in FSL using tract-based spatial statistics (TBSS) (Smith et al., 2006). Data were eddy-current corrected before fitting a diffusion tensor model to calculate fractional anisotropy (FA) values at each voxel in the brain. The FA images were aligned to MNI standard space, and the mean FA map across all participants was thresholded at an FA value of 0.2 to define an FA skeleton representing the centers of all tracts common to the group. Individual subject FA values were projected onto the group skeleton for further analyses. We examined the relationship between behavioral performance and FA using the threshold-free cluster enhancement (TFCE) technique available in FSL's randomise program for permutation testing.
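The FA values analyzed in the TBSS pipeline above are derived from the eigenvalues of the fitted diffusion tensor. The sketch below shows the standard FA definition for illustration; it is not code from the authors' FSL analysis.

```python
import numpy as np

def fractional_anisotropy(l1: float, l2: float, l3: float) -> float:
    """Standard FA from the three diffusion tensor eigenvalues:
    FA = sqrt(3/2) * ||lambda - mean(lambda)|| / ||lambda||.
    """
    ev = np.array([l1, l2, l3], dtype=float)
    num = np.sqrt(((ev - ev.mean()) ** 2).sum())
    den = np.sqrt((ev ** 2).sum())
    if den == 0.0:
        return 0.0
    return float(np.sqrt(1.5) * num / den)

# Isotropic diffusion (e.g. CSF) gives FA = 0; diffusion restricted to a
# single axis (an idealized coherent tract) approaches FA = 1.
assert fractional_anisotropy(1.0, 1.0, 1.0) == 0.0
assert abs(fractional_anisotropy(1.0, 0.0, 0.0) - 1.0) < 1e-9
```

FA thus ranges from 0 to 1, which is why a skeleton threshold of 0.2 serves to exclude voxels dominated by gray matter or CSF.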
To maintain consistency between the fMRI and DTI analyses, all coordinates of significant clusters from the TBSS analysis are reported in Talairach space following the icbm_fsl2tal transformation (Lancaster et al., 2007).

Acknowledgments

This work was supported by a GSU/GT Center for Advanced Brain Imaging seed grant (KPR) and an Emory College Instrumentation, Bridge, Instruction, and Seed grant (LCN & LLN).

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.bandl.2013.11.002.

References

Aziz-Zadeh, L., Fiebach, C. J., Naranayan, S., Feldman, J., Dodge, E., & Ivry, R. B. (2008). Modulation of the FFA and PPA by language related to faces and places. Social Neuroscience, 3(3–4), 229–238.
Bergen, B. K. (2004). The psychological reality of phonaesthemes. Language, 80(2), 290–311.
Berlin, B. (1994). Evidence for pervasive synaesthetic sound symbolism in ethnozoological nomenclature. In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 77–93). New York: Cambridge University Press.
Brang, D., Hubbard, E. M., Coulson, S., Huang, M., & Ramachandran, V. S. (2010). Magnetoencephalography reveals early activation of V4 in grapheme-color synesthesia. Neuroimage, 53(1), 268–274.
Brang, D., Taich, Z., Hillyard, S. A., Grabowecky, M., & Ramachandran, V. S. (2013). Parietal connectivity mediates multisensory facilitation. Neuroimage.
Brang, D., Williams, L. E., & Ramachandran, V. S. (2012). Grapheme-color synesthetes show enhanced crossmodal processing between auditory and visual modalities. Cortex, 48(5), 630–637.
Brown, R. W., Black, A. H., & Horowitz, A. E. (1955). Phonetic symbolism in natural languages. Journal of Abnormal Psychology, 50(3), 388–393.
Calvert, G. A. (2001). Crossmodal processing in the human brain: Insights from functional neuroimaging studies. Cerebral Cortex, 11(12), 1110–1123.
Chao, L. L., Weisberg, J., & Martin, A. (2002). Experience-dependent modulation of category-related cortical activity. Cerebral Cortex, 12(5), 545–551.
Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29(3), 162–173.
DeFife, L. C., Nygaard, L. C., & Namy, L. L. (in preparation). Cross-linguistic consistency and within-language variability of sound symbolism in natural languages.
Esterman, M., Verstynen, T., Ivry, R. B., & Robertson, L. C. (2006). Coming unbound: Disrupting automatic integration of synesthetic color and graphemes by transcranial magnetic stimulation of the right parietal lobe. Journal of Cognitive Neuroscience, 18(9), 1570–1576.
Gasser, M. (2004). The origins of arbitrariness in language. Paper presented at the Proceedings of the Cognitive Science Society.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Hockett, C. F. (1977). The view from language: Selected essays, 1948–1974. Athens: University of Georgia Press.
Hua, K., Zhang, J., Wakana, S., Jiang, H., Li, X., Reich, D. S., et al. (2008). Tract probability maps in stereotaxic spaces: Analyses of white matter anatomy and tract-specific quantification. Neuroimage, 39(1), 336–347.
Hubbard, E. M., Brang, D., & Ramachandran, V. S. (2011). The cross-activation theory at 10. Journal of Neuropsychology, 5(2), 152–177.
Imai, M., Kita, S., Nagumo, M., & Okada, H. (2008). Sound symbolism facilitates early verb learning. Cognition, 109(1), 54–65.
Jackendoff, R. (2002). Foundations of language: Brain, meaning, grammar, evolution. Oxford; New York: Oxford University Press.
Kita, S. (1997). Two-dimensional semantic analysis of Japanese mimetics. Linguistics, 35(2), 379–415.
Kohler, W. (1947). Gestalt psychology: An introduction to new concepts in modern psychology. New York: Liveright Pub. Corp.
Kourtzi, Z., Bulthoff, H. H., Erb, M., & Grodd, W. (2002). Object-selective responses in the human motion area MT/MST. Nature Neuroscience, 5(1), 17–18.
Kovic, V., Plunkett, K., & Westermann, G. (2010). The shape of words in the brain. Cognition, 114(1), 19–28.
Kunihira, S. (1971). Effects of expressive voice on phonetic symbolism. Journal of Verbal Learning and Verbal Behavior, 10(4), 427–429.
Lancaster, J. L., Tordesillas-Gutierrez, D., Martinez, M., Salinas, F., Evans, A., Zilles, K., et al. (2007). Bias between MNI and Talairach coordinates analyzed using the ICBM-152 brain template. Human Brain Mapping, 28(11), 1194–1205.
Landau, A. N., Aziz-Zadeh, L., & Ivry, R. B. (2010). The influence of language on perception: Listening to sentences about faces affects the perception of faces. Journal of Neuroscience, 30(45), 15254–15261.
Martino, G., & Marks, L. E. (2001). Synesthesia: Strong and weak. Current Directions in Psychological Science, 10(2), 61–65.
Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9(3), 316–322.
Meteyard, L., Zokaei, N., Bahrami, B., & Vigliocco, G. (2008). Visual motion interferes with lexical decision on motion words. Current Biology, 18(17), R732–R733.
Molholm, S., Ritter, W., Murray, M. M., Javitt, D. C., Schroeder, C. E., & Foxe, J. J. (2002). Multisensory auditory-visual interactions during early sensory processing in humans: A high-density electrical mapping study. Brain Research. Cognitive Brain Research, 14(1), 115–128.
Monaghan, P., Christiansen, M. H., & Chater, N. (2007). The phonological–distributional coherence hypothesis: Cross-linguistic evidence in language acquisition. Cognitive Psychology, 55(4), 259–305.
Monaghan, P., Christiansen, M. H., & Fitneva, S. A. (2011). The arbitrariness of the sign: Learning advantages from the structure of the vocabulary. Journal of Experimental Psychology: General, 140(3), 325–347.
Namy, L. L., DeFife, L. C., Mathur, N. M., & Nygaard, L. C. (submitted for publication). Cross-linguistic sound symbolism: Phonetic determinants of word meaning.
Neufeld, J., Sinke, C., Dillo, W., Emrich, H. M., Szycik, G. R., Dima, D., et al. (2012a). The neural correlates of coloured music: A functional MRI investigation of auditory-visual synaesthesia. Neuropsychologia, 50(1), 85–89.
Neufeld, J., Sinke, C., Zedler, M., Dillo, W., Emrich, H. M., Bleich, S., et al. (2012b). Disinhibited feedback as a cause of synesthesia: Evidence from a functional connectivity study on auditory-visual synesthetes. Neuropsychologia, 50(7), 1471–1477.
Nuckolls, J. B. (1999). The case for sound symbolism. Annual Review of Anthropology, 28, 225–252.
Nygaard, L. C., Cook, A. E., & Namy, L. L. (2009). Sound to meaning correspondences facilitate word learning. Cognition, 112(1), 181–186.
Ohala, J. (1983). Cross-language use of pitch: An ethological view. Phonetica, 40(1), 1–18.
Ohala, J. J. (1984). An ethological perspective on common cross-language utilization of F0 of voice. Phonetica, 41(1), 1–16.
Perniss, P., Thompson, R. L., & Vigliocco, G. (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology, 1, 227.
Pinker, S. (1999). Words and rules: The ingredients of language (1st ed.). New York: Basic Books.
Ramachandran, V. S., & Hubbard, E. M. (2001). Synesthesia – A window into perception, thought, and language. Journal of Consciousness Studies, 8(12), 3–34.
Revill, K. P., Aslin, R. N., Tanenhaus, M. K., & Bavelier, D. (2008). Neural correlates of partial lexical activation. Proceedings of the National Academy of Sciences USA, 105(35), 13111–13115.
Robertson, L. C. (2003). Binding, spatial attention and perceptual awareness. Nature Reviews Neuroscience, 4(2), 93–102.
Rouw, R., & Scholte, H. S. (2007). Increased structural connectivity in grapheme-color synesthesia. Nature Neuroscience, 10(6), 792–797.
Sapir, E. (1929). A study in phonetic symbolism. Journal of Experimental Psychology, 12, 225–239.
Saussure, F. de (1959). Course in general linguistics. New York: Philosophical Library.
Saxe, R., Brett, M., & Kanwisher, N. (2006). Divide and conquer: A defense of functional localizers. Neuroimage, 30(4), 1088–1096; discussion 1097–1099.
Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123(Pt 12), 2400–2406.
Sereno, J. A., & Jongman, A. (1990). Phonological and form class relations in the lexicon. Journal of Psycholinguistic Research, 19(6), 387–404.
Smith, S. M., Jenkinson, M., Johansen-Berg, H., Rueckert, D., Nichols, T. E., Mackay, C. E., et al. (2006). Tract-based spatial statistics: Voxelwise analysis of multi-subject diffusion data. Neuroimage, 31(4), 1487–1505.
Spector, F., & Maurer, D. (2009). Synesthesia: A new approach to understanding the development of perception. Developmental Psychology, 45(1), 175–189.
Spence, C. (2011). Crossmodal correspondences: A tutorial review. Attention, Perception, & Psychophysics, 73(4), 971–995.
Thompson, R. L., Vinson, D. P., & Vigliocco, G. (2010). The link between form and meaning in British Sign Language: Effects of iconicity for phonological decisions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(4), 1017–1027.
Tzeng, C., Nygaard, L. C., & Namy, L. L. (submitted for publication). The specificity of sound symbolic correspondences in spoken language.
van Leeuwen, T. M., den Ouden, H. E., & Hagoort, P. (2011). Effective connectivity determines the nature of subjective experience in grapheme-color synesthesia. Journal of Neuroscience, 31(27), 9879–9884.
Vandermosten, M., Boets, B., Poelmans, H., Sunaert, S., Wouters, J., & Ghesquiere, P. (2012). A tractography study in dyslexia: Neuroanatomic correlates of orthographic, phonological and speech processing. Brain, 135(Pt 3), 935–948.
Willems, R. M., & Casasanto, D. (2011). Flexibility in embodied language understanding. Frontiers in Psychology, 2, 116.
Wong, F. C., Chandrasekaran, B., Garibaldi, K., & Wong, P. C. (2011). White matter anisotropy in the ventral language pathway predicts sound-to-word learning success. Journal of Neuroscience, 31(24), 8780–8785.