Japanese Psychological Research 1997, Volume 39, No. 3, 182–190
Special Issue: Cognition and behavior of chimpanzees

Auditory–visual intermodal matching by a chimpanzee (Pan troglodytes)1

KAZUHIDE HASHIYA
Research Fellow of the Japan Society for the Promotion of Science, Section of Cognition and Learning, Department of Behavioral and Brain Sciences, Primate Research Institute, Kyoto University, Inuyama 484, Japan

SHOZO KOJIMA
Section of Cognition and Learning, Department of Behavioral and Brain Sciences, Primate Research Institute, Kyoto University, Inuyama 484, Japan

Abstract: Auditory–visual intermodal information processing was studied in a female chimpanzee. Following the presentation of a recorded sound, the subject had to select, from two alternatives, a photograph that corresponded to the sample sound. Various human voices and sounds produced with objects served as auditory sample stimuli. Photographs of speakers or the sound sources of auditory stimuli served as visual comparison stimuli. When the choice alternatives consisted of a picture of a human and an object, the subject showed a generalization of performance, even when the particular auditory and visual stimuli were presented for the first time. The matching performance of the subject was significantly better when novel stimuli were presented as sample or comparison stimuli than when the trial consisted of familiar stimuli. This suggested that the novelty of stimuli facilitated the subject’s performance. In sum, the chimpanzee learned to match an auditory stimulus to the comparison visual stimulus, but intermodal processing was characterized by the deterioration of matching performance with familiar stimuli.

Key words: auditory–visual intermodal integration, chimpanzees, matching to sample, novelty, familiarity.
1 This study was supported by a Grant-in-Aid for Scientific Research, Ministry of Education, Science and Culture, Japan, #04610053 to Shozo Kojima and #2827 to Kazuhide Hashiya. The authors thank Sumiharu Nagumo for his help with programming and his technical assistance in constructing the apparatus. Thanks are also due to the staff of the Laboratory Primate Center of the Primate Research Institute, especially Kiyonori Kumazaki and Norihiko Maeda, for their daily care of the chimpanzees, and to the staff of the Primate Research Institute, especially Michael A. Huffman and Tetsuro Matsuzawa, for their help and support.

© 1997 Japanese Psychological Association. Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA.

Studies of perception and cognition have demonstrated many commonalities in the way information is processed by human and nonhuman primates, especially chimpanzees (Fujita & Matsuzawa, 1990; Kojima, Tatsumi, Kiritani, & Hirose, 1989; Matsuzawa, 1985, 1990). At the same time, researchers have had considerable difficulty training nonhuman primates on auditory tasks (D’Amato, 1988; D’Amato & Salmon, 1984; Dewson & Burlingame, 1975; Owren, 1990). This is in clear contrast to studies of visual cognition in nonhuman primates. Most of our knowledge about perception and cognition in nonhuman primates has been limited to the visual modality; there have been few studies of information processing in the auditory modality (Kojima, 1985, 1990) or of intermodal processing (Davenport, Rogers, & Russell, 1975). Therefore, we do not have sufficient data to discuss the similarities and differences in information processing between the visual and auditory modalities. The present study aims to examine the little-known aspects of auditory–visual intermodal integration in a chimpanzee.
Categorical processing of the intermodal information was examined in Experiment 1, and, based on the result of Experiment 1, we investigated some features of the processing in Experiment 2.

Experiment 1

A chimpanzee was trained to perform an auditory–visual intermodal matching-to-sample (AVMTS) task. The subject was given two alternatives to choose between: photographs of a human and an object. Reward was given when the choice matched the corresponding auditory sample: a human voice or a sound produced by an object. The goal of this experiment was to test the learning of the AVMTS task and to examine the generalization of matching performance by introducing novel stimulus pairs.

Method

Subject. The subject was an 11-year-old female chimpanzee, named Pan. Before this study, she had been trained to perform simultaneous AVMTS tasks for almost 2 years. She had also served as a subject for various kinds of behavioral experiments: auditory and speech perception, visual discrimination, sign language communication, and visual matching-to-sample tasks (Kojima & Kiritani, 1989; Kojima et al., 1989; Fushimi, 1994; Tanaka, 1995, 1996). The subject was not deprived of food. She was housed with a social group of nine chimpanzees who lived in an outdoor compound and an attached indoor residence. She was cared for according to guidelines produced by the Primate Research Institute, Kyoto University.

Auditory stimuli. Sample stimuli were digitally recorded from human voices and sounds made by objects. The duration of each stimulus was 4 s. Sound intensity of each auditory stimulus ranged from 50 to 70 dB sound pressure level (SPL) in an experimental booth. This SPL was higher than the threshold of hearing in chimpanzees as measured by Kojima (1990).

Visual stimuli. Comparison stimuli were color photographs taken on a white background. They were digitally recorded on a laser video disk (TEAC, LV-200, Japan) and displayed on a 21-inch video monitor.
Each stimulus was 24 cm in width by 34 cm in height when measured on the monitor. Each sample stimulus was paired with a photograph of its speaker or its sound source. Choice alternatives consisted of two photographs presented side by side on the monitor. In Experiment 1, the comparison stimuli always consisted of a human stimulus pitted against an object stimulus.

Apparatus. From the outdoor compound, the subject entered an experimental booth (2.4 m × 2.0 m × 1.8 m) equipped with a video monitor on one wall. The subject’s response was detected by a touch panel (Nissha-InterSystems, Hypertouch, Japan) attached to the monitor. A piece of apple or a raisin was delivered as a reward. It was automatically dispensed into a small cup below the monitor upon a correct response. Two personal computers (NEC, PC-9800/F and VM, Japan) controlled the experiment. Auditory stimuli were generated by a digital sound processing board (Canopus, Sound Master-V, Japan) and presented through a loudspeaker (SONY, SRS-160, Japan) placed above the monitor.

Procedure. The subject sat in front of the touch-screen monitor. When the subject touched the start key (a purple rectangle 1 cm × 4 cm) at the center of the monitor, the key disappeared and an auditory sample stimulus was presented for 4 s. Immediately after the termination of the auditory stimulus, the two visual alternatives were presented on the monitor. When the chimpanzee touched the photograph corresponding to the sample auditory stimulus, she received a food reward, and another auditory sample was presented. When the subject’s choice was incorrect, a 10-s time-out followed, during which the only event was the presentation, as feedback, of the auditory stimulus corresponding to the photograph she had touched (see Figure 1). A non-correction procedure was used.

Training and test.
In previous experiments, the subject had been trained to match a sound to the corresponding picture by using object examples. In the present study, we introduced a new set of training stimuli. There were six stimulus pairs, each of which consisted of a human stimulus and an object stimulus. The particular pairing of a human stimulus and an object stimulus was consistent within a session but was changed randomly between sessions. A session consisted of 72 trials. The order of trials was randomized. All six people used had been known to the subject for more than 1 year. The six objects were also familiar to the subject: a rattle, a bell, a cheer horn, a ring-shaped toy, a paper pipe, and a wooden whistle.

After 23 training sessions, a novel stimulus pair was introduced in place of one of the six baseline stimulus pairs; such a session served as a test session. In total, we introduced 26 novel stimulus pairs (52 stimuli). The test stimulus pairs were introduced one at a time over a series of 26 test sessions. To maintain the subject’s performance, sessions without a novel stimulus pair (training sessions) were occasionally inserted among the test sessions. In each test session, two of the 12 baseline stimuli (one human stimulus and one object stimulus) were replaced by two novel stimuli (one human stimulus and one object stimulus). This means that 12 of the 72 trials in a test session were test trials. The pairing of training stimuli was randomized between sessions but fixed within a session.

Results and discussion

The results of the training sessions are shown in Figure 2. To evaluate the extent of generalization, we also show the data of the first trial of test stimuli presentation (Table 1). Table 1 also shows the percentage of correct responses for novel stimuli throughout the test session. The subject usually chose the correct picture in the first trials even though she had no prior experience of the stimuli.
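The counts in Table 1 can be checked against the 50% chance level with an exact binomial test. The following is a minimal sketch, not the authors' analysis code: it assumes a one-sided test (the article does not state sidedness), and the function names `binomial_tail` and `min_correct_for_significance` are illustrative.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact probability of at least `successes` correct responses
    in `trials` two-choice trials when responding at chance level `p`."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def min_correct_for_significance(trials: int, alpha: float = 0.05) -> int:
    """Smallest number correct whose upper-tail probability is below `alpha`."""
    for k in range(trials + 1):
        if binomial_tail(k, trials) < alpha:
            return k
    raise ValueError("no count reaches significance at this alpha")


# Counts taken from Table 1 (assumed one-sided test against chance = 0.5).
first_trial_human = binomial_tail(18, 26)      # first trials, human stimuli
first_trial_object = binomial_tail(23, 26)     # first trials, object stimuli
first_session_total = binomial_tail(223, 312)  # all first-session test trials
threshold_12 = min_correct_for_significance(12)  # per-session criterion (12 test trials)

print(f"human first trials:  p = {first_trial_human:.4f}")
print(f"object first trials: p = {first_trial_object:.6f}")
print(f"first sessions:      p = {first_session_total:.2e}")
print(f"correct needed out of 12 for p < .05: {threshold_12}")
```

Under these assumptions, a session with 12 test trials reaches p < .05 only at 10 or more correct responses, which is why a single session with 9/12 correct is marked as nonsignificant even though the pooled first-session total is far above chance.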
The mean response time to novel stimuli (5.8 s, SD = 2.6) was significantly longer than that to baseline stimuli (2.6 s, SD = 0.8) (p < .005, t-test). This indicates that the subject discriminated novel stimuli from baseline stimuli. Experiment 1 demonstrated that the subject performed the AVMTS task even when a set of novel stimuli was presented. The total percentage of correct responses in the first sessions (71%, 223 correct out of 312 trials) was significantly higher than the chance level (50%). Correct responses were made significantly more often than the chance level in 9 out of 26 first sessions, but not in the other 17 sessions. Although the chimpanzee generalized the intermodal matching skill to totally new stimuli, performance remained relatively low and only slightly above the chance level.

Figure 1. Schematic illustration of one trial from the task. The example shows a set of stimuli in the test session: a human and a traditional Japanese toy. Color photographs were used in the actual trials. Sound spectrograms (KAY CSL model 4300B) represent the auditory stimuli in this illustration. [Outcome labels in the figure: food reward; time-out, 10 seconds.]

Figure 2. Percent correct responses to baseline and novel stimuli in each session. The training phase lasted until the 23rd session. The average percent correct responses in the baseline test phase was 80.1%. [Axes: percent correct response by sessions; series: baseline stimuli, novel stimuli.]

Experiment 2

Experiment 1 suggested that the subject discriminated between human and object stimuli in the AVMTS task. However, it was difficult to maintain high levels of performance through sessions, and performance did not appear to improve between sessions. What conditions facilitate the subject’s performance in the AVMTS task? We hypothesized that the novelty of stimuli might have an important role in intermodal processing. Experiment 2 examined the effects of novelty of stimuli on AVMTS performance.

Method

Subject. The subject was the same as in Experiment 1.

Stimuli. In Experiment 2, there were eight familiar and 28 novel stimuli. Half consisted of human stimuli and the other half were object stimuli. The SPL of auditory stimuli was almost the same, and the size of the visual stimuli was exactly the same, as in Experiment 1. In each session, six stimuli (three humans and three objects) were used. All possible combinations of those six stimuli were examined within a session. In a departure from the method in Experiment 1, “human/human” and “object/object” pairings of stimuli were tested in addition to the “human/object” pairing. The pairing of stimuli was changed within a session, rather than fixed as in Experiment 1.

Apparatus. A digital image processing board (Canopus, Super CVI, Japan) was used to present visual stimuli instead of the laser disk. Apart from this, the apparatus was exactly the same as that used in Experiment 1.

Procedure. The basic procedures were also the same as those in Experiment 1, except for the following points. We presented six stimuli in each session. Four of these stimuli were familiar (two humans and two objects). They were identical to those used in Experiment 1. Each of the familiar stimuli had been presented to the subject for more than 10 sessions in Experiment 1. The remaining two stimuli were novel (one human and one object). A training session or a test session in Experiment 2 consisted of 120 trials. Each of the six stimuli was designated as the correct stimulus with the same probability. The subject received 14 sessions (1,680 test trials in total) in Experiment 2.

Analysis. Each trial can be classified into one of seven categories, according to the combination of choice alternatives. There were two factors defining the categories. The first factor was the human/object dichotomy of stimulus pairs.
There were three variants: the alternatives consisted of two human stimuli, two object stimuli, or one human and one object stimulus. The second factor was the novelty/familiarity dichotomy. There were three variants: the alternatives consisted of two familiar stimuli, two novel stimuli, or one familiar and one novel stimulus. The following distinction was also noted. Categories that included both a human and an object stimulus, or both a novel and a familiar stimulus, were further divided into two subcategories, according to which stimulus was presented as a sample (see Table 2). Of the nine possible combinations of the two factors described above, the categories consisting of two novel human stimuli or of two novel object stimuli were not tested in this study. Because only one novel human stimulus and one novel object stimulus were introduced in each session, these combinations could not be shown to the subject.

Results and discussion

Analysis of variance (ANOVA) was conducted with 4 × 3 and 2 × 5 designs (see Table 2). Each factor was significant in the analysis. Further analysis of each factor (by least significant difference) was done. The main findings are as follows.

1. Percent correct responses in the “familiar/novel” and “novel/novel” conditions were higher than in the “familiar/familiar” condition (F(3, 39) = 5.28, p < .004).
2. Percent correct responses in the “human/object” and “object/object” conditions were higher than in the “human/human” condition (F(3, 39) = 12.8, p < .001).
3. In the “human/object” condition, percent correct responses were higher in the following order: “novel/novel” > “familiar/novel” > “familiar/familiar” (F(2, 26) = 8.00, p < .002).
4. When the sample stimulus was “familiar,” percent correct responses were higher when the choice alternative was “familiar/novel” than when it was “familiar/familiar.” When the choice alternative was “familiar/novel,” the percent correct responses were higher when the sample stimulus was “novel” than when it was “familiar.”

Table 1. Response in the first trial and percent correct responses in the first session for each stimulus

        First trial   First session
Pair    Human         Human     Object    Total     Binomial test
1       ○             4/6       6/6       10/12     *
2       ●             2/6       5/6       7/12      ns
3       ○             5/6       3/6       8/12      ns
4       ○             6/6       2/6       8/12      ns
5       ●             3/6       4/6       7/12      ns
6       ●             1/6       2/6       3/12      ns
7       ○             4/6       6/6       10/12     *
8       ○             4/6       6/6       10/12     *
9       ○             3/6       3/6       6/12      ns
10      ○             3/6       3/6       6/12      ns
11      ○             5/6       4/6       9/12      ns
12      ●             1/6       4/6       5/12      ns
13      ○             6/6       5/6       11/12     **
14      ○             4/6       5/6       9/12      ns
15      ○             3/6       6/6       9/12      ns
16      ●             4/6       6/6       10/12     *
17      ○             6/6       6/6       12/12     **
18      ○             3/6       6/6       9/12      ns
19      ●             5/6       6/6       11/12     **
20      ●             3/6       5/6       8/12      ns
21      ●             3/6       6/6       9/12      ns
22      ○             5/6       5/6       10/12     *
23      ○             4/6       5/6       9/12      ns
24      ○             5/6       4/6       9/12      ns
25      ○             6/6       6/6       12/12     **
26      ○             4/6       2/6       6/12      ns

Summary      First trial           First session
             Human     Object      Human      Object     Total
Total        18/26     23/26       102/156    121/156    223/312
% correct    69        88          65         78         71
Statistics   *         ***         ***        ***        ***

* p < .05, ** p < .01, *** p < .0001. A white unfilled circle (○) means that the subject’s choice at the first trial was correct. A black filled circle (●) means that the subject’s choice was incorrect. Per-pair first-trial outcomes for object stimuli are given only as a total (23 of 26 correct).

Table 2. Percent correct ± standard error (number of correct responses/number of trials) in each condition. Columns give the familiarity/novelty dichotomy of stimuli; rows give the human/object dichotomy of stimuli.

                                Familiar sample                               Novel sample
                                Familiar choice        Novel choice           Familiar choice        Novel choice
Human sample    Human choice    59.8 ± 3.3 (67/112)    66.1 ± 5.0 (74/112)    67.9 ± 4.7 (76/112)    –
                Object choice   73.2 ± 4.2 (164/224)   84.8 ± 4.6 (95/112)    89.3 ± 2.9 (100/112)   85.7 ± 5.1 (48/56)
Object sample   Human choice    73.7 ± 4.5 (165/224)   82.1 ± 4.5 (92/112)    84.8 ± 3.2 (95/112)    87.5 ± 3.5 (49/56)
                Object choice   72.3 ± 4.4 (81/112)    83.9 ± 2.8 (94/112)    90.2 ± 2.3 (101/112)   –

Taken together, the data suggest that the novelty of the stimuli facilitated the subject’s performance of the AVMTS task. Novel auditory stimuli as well as novel visual stimuli improved the subject’s performance. This implies that introducing new auditory and visual stimuli is useful for facilitating matching performance in AVMTS tasks. Wright, Shyan, and Jitsumori (1990) successfully trained rhesus monkeys to perform auditory discrimination tasks using a trial-unique test procedure. The procedure seems to be effective not only because it reduces intertrial interference but also because novel stimuli facilitate the subjects’ performance.

General discussion

The present study demonstrated a chimpanzee’s ability to integrate an auditory stimulus with a corresponding visual stimulus. The skill was generalized to a set of new stimuli. Kojima et al. (1989) demonstrated categorical processing in consonant perception by chimpanzees in an auditory discrimination task. The present study demonstrated that complex auditory stimuli can be matched with the corresponding visual stimuli in the intermodal discrimination task by a chimpanzee. Categorical processing of such complex stimuli was also suggested by this study.
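The trial counts in Table 2 can be related back to the Experiment 2 design with a short enumeration. This is a sketch under stated assumptions rather than a reconstruction of the authors' analysis: it assumes that every ordered pairing of a sample with an incorrect alternative was repeated equally often within a session (4 times, since 120 trials cover 30 ordered pairings), and that "choice" in Table 2 refers to the incorrect alternative. The stimulus labels and variable names are hypothetical. The pooled percentages are descriptive only and do not replace the reported ANOVA.

```python
from collections import Counter
from itertools import permutations

SESSIONS = 14
TRIALS_PER_SESSION = 120

# Hypothetical labels for the six stimuli of one session: (kind, status, id).
stimuli = [
    ("human", "familiar", 1), ("human", "familiar", 2), ("human", "novel", 1),
    ("object", "familiar", 1), ("object", "familiar", 2), ("object", "novel", 1),
]

# Every ordered (sample, incorrect alternative) pairing: 6 * 5 = 30.
ordered_pairs = list(permutations(stimuli, 2))
repeats = TRIALS_PER_SESSION // len(ordered_pairs)  # assumed equal repetition

# Expected number of trials per Table 2 cell across all sessions.
cell_trials = Counter()
for sample, foil in ordered_pairs:
    cell_trials[(sample[0], sample[1], foil[0], foil[1])] += repeats * SESSIONS

# (correct, trials) per cell, copied from Table 2.
# Key: (sample kind, sample status, alternative kind, alternative status).
table2 = {
    ("human", "familiar", "human", "familiar"): (67, 112),
    ("human", "familiar", "object", "familiar"): (164, 224),
    ("human", "familiar", "human", "novel"): (74, 112),
    ("human", "familiar", "object", "novel"): (95, 112),
    ("human", "novel", "human", "familiar"): (76, 112),
    ("human", "novel", "object", "familiar"): (100, 112),
    ("human", "novel", "object", "novel"): (48, 56),
    ("object", "familiar", "human", "familiar"): (165, 224),
    ("object", "familiar", "object", "familiar"): (81, 112),
    ("object", "familiar", "human", "novel"): (92, 112),
    ("object", "familiar", "object", "novel"): (94, 112),
    ("object", "novel", "human", "familiar"): (95, 112),
    ("object", "novel", "object", "familiar"): (101, 112),
    ("object", "novel", "human", "novel"): (49, 56),
}


def pooled_accuracy(keep) -> float:
    """Proportion correct pooled over the Table 2 cells selected by `keep`."""
    selected = [counts for cell, counts in table2.items() if keep(cell)]
    return sum(c for c, _ in selected) / sum(n for _, n in selected)


all_familiar = pooled_accuracy(lambda c: c[1] == "familiar" and c[3] == "familiar")
any_novel = pooled_accuracy(lambda c: "novel" in (c[1], c[3]))

print(f"familiar sample, familiar alternative: {all_familiar:.1%}")
print(f"at least one novel stimulus:           {any_novel:.1%}")
```

Under these assumptions the enumerated cell sizes (56, 112, and 224 trials) coincide with the denominators in Table 2, and the two untested cells are exactly those that would require two novel stimuli of the same kind in one session.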
However, the present study also showed that performance on AVMTS tasks deteriorates over time without the introduction of new stimuli. Some previous studies (Hayes & Thompson, 1953; Mishkin & Delacour, 1975) reported that using a new procedure or introducing novel stimuli facilitated subjects’ performance in visual intramodal tasks. The recognition of familiarity or novelty of auditory stimuli seems to be as highly developed as that of visual stimuli in chimpanzees. The subject seemed to utilize this ability in the AVMTS tasks.

Auditory–visual intermodal integration is an essential part of human speech (Molfese, Morse, & Peters, 1994). The ability to match novel auditory stimuli to visual stimuli, or the reverse, seems to form one of the prerequisites of language acquisition in humans. The present study suggested that a part of this ability is shared with chimpanzees. This finding is important as a starting point for studying the evolutionary uniqueness of human language.

Some studies have already succeeded in training rhesus monkeys on AVMTS tasks (Gaffan & Harrison, 1991; Murray & Gaffan, 1994). However, partly because those studies focused mainly on the neurophysiological basis of intermodal information processing, the behavioral features of auditory–visual intermodal matching performance in nonhuman primates are still unclear. The data from bonobos (Savage-Rumbaugh, McDonald, Sevcik, Hopkins, & Rubert, 1986) are also very important as a demonstration of auditory–visual intermodal information integration in nonhuman primates. But the number of such studies is limited, and a more systematic assessment of the ability is necessary.

Our results also demonstrated that performance on AVMTS tasks was poor and fragile, especially when familiar stimuli were repeatedly used in a discrimination task.
Because the probability of each stimulus being the correct response was the same in a session, the possibility of intertrial interference should be the same for each stimulus. The significant difference between the percent correct responses to novel stimuli and to familiar stimuli in a session therefore cannot be explained solely by such interference. For the same reason, boredom of the subject cannot fully explain the result. Familiar stimuli might not draw the subject’s attention as much as novel stimuli. Though the mechanism of such selective responses to novel stimuli is unclear, the detection of environmental changes is important for animals in responding to danger. It seems to be adaptive for chimpanzees to be more sensitive to novel stimuli, which usually function as a signal of environmental change.

A complex cognitive system and a large memory are presumably necessary to process complex auditory stimuli like human spoken language (Kojima, 1985). However, memory retention of auditory information by chimpanzees seems to be poor, in contrast to that of visual information (Hashiya & Kojima, 1997). For chimpanzees, paying more attention to novel stimuli than to familiar stimuli might be a parsimonious strategy when there are limitations in memory capacity.

In a pragmatic sense, the present study showed that introducing novel stimuli is an effective way of maintaining chimpanzees’ performance on AVMTS tasks. Further studies based on this finding will be necessary to examine the nature of intermodal information processing in chimpanzees.

References

D’Amato, M. R. (1988). A search for tonal pattern perception in cebus monkeys: Why can’t they hum a tune. Music Perception, 5, 453–480.
D’Amato, M. R., & Salmon, D. P. (1984). Tune discrimination in monkeys (Cebus apella) and in rats. Animal Learning & Behavior, 10, 126–134.
Davenport, R. K., Rogers, C. M., & Russell, I. S. (1975). Cross-modal perception in apes: Altered visual cues and delay. Neuropsychologia, 13, 229–235.
Dewson, J. W., & Burlingame, A. C. (1975). Auditory discrimination and recall in monkeys. Science, 187, 267–268.
Fujita, K., & Matsuzawa, T. (1990). Delayed figure construction in a chimpanzee (Pan troglodytes) and humans (Homo sapiens). Journal of Comparative Psychology, 104, 345–351.
Fushimi, T. (1994). Acquisition of demand and reject behaviors in a chimpanzee. In S. C. Hayes, L. J. Hayes, M. Sato, & K. Ono (Eds.), Behavior analysis of language and cognition (pp. 123–144). Reno, NV: Context Press.
Gaffan, D., & Harrison, S. (1991). Auditory–visual associations, hemispheric specialization and temporal–frontal interaction in the rhesus monkey. Brain, 114, 2133–2144.
Hashiya, K., & Kojima, S. (1997). Memory retention of a chimpanzee in the auditory–visual intermodal matching-to-sample task. Manuscript in preparation.
Hayes, K. J., & Thompson, R. (1953). Nonspatial delayed response to trial-unique stimuli in sophisticated chimpanzees. Journal of Comparative and Physiological Psychology, 46, 498–500.
Kojima, S. (1985). Auditory short-term memory in the Japanese monkey. International Journal of Neuroscience, 25, 255–262.
Kojima, S. (1990). Comparisons of auditory functions in the chimpanzee and human. Folia Primatologica, 55, 62–72.
Kojima, S., & Kiritani, S. (1989). Vocal–auditory functions in the chimpanzee: Vowel perception. International Journal of Primatology, 10, 199–213.
Kojima, S., Tatsumi, I. F., Kiritani, S., & Hirose, H. (1989). Vocal–auditory functions in the chimpanzee: Consonant perception. Human Evolution, 4, 403–416.
Matsuzawa, T. (1985). Color naming and classification in a chimpanzee (Pan troglodytes). Journal of Human Evolution, 14, 283–291.
Matsuzawa, T. (1990). Form perception and visual acuity in a chimpanzee. Folia Primatologica, 55, 24–32.
Mishkin, M., & Delacour, J. (1975). An analysis of short-term visual memory in the monkey. Journal of Experimental Psychology: Animal Behavior Processes, 1, 326–334.
Molfese, D. L., Morse, P. A., & Peters, C. J. (1994). Auditory evoked responses to names for different objects: Cross-modal processing as a basis for infant language acquisition. Developmental Psychology, 26, 780–795.
Murray, E. A., & Gaffan, D. (1994). Removal of the amygdala and subjacent cortex disrupts the retention of both intramodal and cross modal associative memories in monkeys. Behavioral Neuroscience, 108(3), 494–500.
Owren, M. J. (1990). Acoustic classification of alarm calls by vervet monkeys (Cercopithecus aethiops) and humans (Homo sapiens): 1. Alarm calls. 2. Synthetic calls. Journal of Comparative Psychology, 104, 20–40.
Savage-Rumbaugh, S., McDonald, K., Sevcik, R. A., Hopkins, W. D., & Rubert, E. (1986). Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (Pan paniscus). Journal of Experimental Psychology: General, 115, 211–235.
Tanaka, M. (1995). Object sorting in chimpanzees (Pan troglodytes): Classification based on identity, complementarity, and familiarity. Journal of Comparative Psychology, 109, 151–161.
Tanaka, M. (1996). Information integration about object-object relationships by chimpanzees. Journal of Comparative Psychology, 110, 123–135.
Wright, A. A., Shyan, M. A., & Jitsumori, M. (1990). Auditory same/different concept learning by monkeys. Animal Learning & Behavior, 18, 287–294.

(Received Sept. 30, 1996; accepted March 15, 1997)