doi:10.1093/scan/nsq036 SCAN (2010) 5, 168–179

The culture ready brain

Charles Whitehead
London

In this article, I examine two hypotheses of language origins: the extended mirror system hypothesis and the vocal grooming hypothesis. These conflict in several respects, partly because their authors were trained in different disciplines and influenced by different kinds of evidence. I note some ethnographic/linguistic and psychological issues which, in my view, have not been sufficiently considered by these authors, and present a ‘play and display’ hypothesis which aims to explain the evolution, not of language, but of the ‘culture ready brain’ (with apologies to Arbib for so extending his original concept). In the second half of the article, I will test all three hypotheses against the available fossil, archaeological and neuroimaging evidence.

Keywords: mirror neurones; social displays; social mirrors; language origins; brain evolution

THE EXTENDED MIRROR SYSTEM HYPOTHESIS

The discovery of mirror neurones led Arbib and Rizzolatti (1997) to propose a mirror system hypothesis of language, according to which:

the parity requirement for language in humans, that what counts for the ‘communicator’ [e.g. speaker] must count approximately the same for the ‘communicatee’ [e.g., hearer], is met because Broca’s area evolved atop the mirror system for grasping with its capacity to generate and recognize a set of actions (Arbib, 2006).

However, mirror neurones for grasping, common to monkeys as well as humans, cannot be sufficient to explain uniquely human abilities such as language. To bridge this explanatory gap, Arbib (2002) proposed seven evolutionary stages, which constitute his extended mirror system hypothesis, namely:

S1: a control system for grasping,
S2: a mirror system for grasping,
S3: a simple imitation system for grasping,
S4: a complex imitation system for grasping,
S5: protosign,
S6: protolanguage,
S7: language.
This is in fact several hypotheses, but there is one core underlying idea. Arbib suggests that brain expansion in apes and humans involved expansion and reduplication of mirror systems, with duplicated systems subsequently evolving to serve different functions.

Received 14 April 2009; Accepted 22 March 2010
Advance Access publication 16 June 2010
The studies of role-play (Whitehead, 2003) and pretend play (Whitehead et al., 2009) were conducted at the Wellcome Trust Centre for Neuroimaging at UCL, and funded by the Wellcome Trust.
Correspondence should be addressed to Charles Whitehead, 19 Rydal Road, London SW16 1QF, UK. Email: [email protected]

Accepting S1 and S2 as uncontroversial, I will discuss S3–S7 under three headings:

Imitation (S3–S4)

Arbib (2006) notes that there is no significant imitation in monkeys, and limited imitative ability in apes. Where chimpanzees typically took 12 trials to learn to imitate a behaviour (Myowa-Yamakoshi and Matsuzawa, 1999), humans can acquire longer sequences of more abstract actions in a single trial. Arbib therefore postulates a simple imitative system (S3) which evolved in an ape/human common ancestor, and a complex imitative system (S4) unique to humans.

Mimesis (S5a–b)

To explain the transition from imitation to communication, Arbib et al. (2008) divide ‘protosign’ into three subsidiary stages:

S5a: pantomime of manual praxic actions,
S5b: pantomime of actions not in repertoire,
S5c: protosign.

By ‘pantomime’ Arbib means mimetic gesture. Mimesis is voluntary representation by resemblance (e.g. ‘drawing pictures in the air’ with the hands: Burling, 1993). Arbib does not explain the transition from imitation (S4) to mimesis (S5a), or why any selfish hominid would wish to share its praxic skills. In S5b, praxic communication becomes extended to include the miming of actions outside the normal repertoire, such as flapping the arms to represent the flight of a bird.
Imitation of other species, such as turtles or sharks, occurs in dolphins (Tayler and Saayman, 1973). When a group of dolphins all follow and mimic a turtle, they are not exchanging useful information about turtles. Such collective behaviour implicates play or performance (see below) rather than communication. Play, an essential aspect of social learning in primates, is a whole-body affair, and it is phylogenetically older than mimesis, which occurs in apes but not in monkeys. Furthermore, apes do not use mimesis to convey information about the world, but to manipulate others (Mithen, 2006). For example, Kubie the gorilla touches Zara lightly and moves his hand in the direction he wants her to move (Tanner and Byrne, 1996). This raises questions concerning the social changes required for mimesis, the associated cortical mechanisms, the relevance of grasping behaviour, and the evolutionary sequence postulated by Arbib.

© The Author (2010). Published by Oxford University Press. For Permissions, please email: [email protected]

Conventionalization (S5c–S7)

Arbib (2002) suggests that mimesis is slow, inefficient, and subject to misinterpretation. Hence the need for conventionalized signals (S5c) to ‘formalize, disambiguate and extend pantomime’, for example, to disambiguate ‘bird’ from ‘flight’. However, the essential difference between mimesis and language is not one of efficiency but function: a mimed demonstration can be ‘worth a thousand words’. Mimetic signals refer to concrete things, agents or actions, whereas language is specialized for abstraction: generalized concepts, ideas and beliefs (Whitehead, 2008a). Arbib (2008) suggests that conventionalized protosign (S5c) is subsequently extended into and combined with ‘protospeech’. Conventionalized manual, facial and vocal signals (protowords) then become separated from pantomime and coalesce to form ‘protolanguage’ (S6).
The final transition, from protolanguage (S6) to fully syntactical language of modern type (S7), is entirely cultural. An important feature of Arbib’s hypothesis is the distinction between a biologically evolved language ready brain and, for example, Dunbar’s biologically evolved language.

THE VOCAL GROOMING HYPOTHESIS

Robin Dunbar’s Vocal Grooming Hypothesis is rooted in research related to primate group sizes. Dunbar (1993) demonstrated that, in living monkeys and apes, there is a positive linear correlation between mean group size and ratio of neocortical to whole brain volume. This, he infers, reflects increasing cognitive demand with increasing group size, due to increasing social complexity. Projecting Dunbar’s graph to intersect with human neocortical ratio predicts that the optimal group size for humans is around 150, now known as ‘Dunbar’s number’. Subsequent research on a range of human groups confirmed that the mean number of people having social contact with each other was between 100 and 200 (Hill and Dunbar, 2003).

The ‘grooming gap’

Dunbar further showed that the time animals spent grooming each other also correlated with group size. The time available for grooming, however, is not infinitely elastic. Feeding, resting and journeying between food sites make irreducible time demands. Dunbar found that the maximum time spent grooming in any modern primate group is 20% of the day. This suggests that groups requiring more than 20% grooming time would fragment into smaller groups, and there is a ‘glass ceiling’ on group size at around 80 individuals. Again, projecting the grooming time vs group size plot to the level of ‘Dunbar’s number’ implied that, if human sociality depended on the standard primate mechanism, we would need to spend around 43% of the day grooming each other.
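The projection just described can be sketched numerically. The following is only an illustration, assuming a straight line through the two points quoted above (about 20% grooming time at about 80 individuals, and about 43% at about 150). It is not Dunbar’s fitted regression, and every name in it is hypothetical.

```python
# Illustrative sketch only: a straight line through two points quoted in the
# text. This is NOT Dunbar's published regression; all names are hypothetical.

CEILING_GROUP = 80    # approximate group size at the primate grooming ceiling
CEILING_PCT = 20.0    # maximum observed grooming time (% of the day)
HUMAN_GROUP = 150     # 'Dunbar's number'
HUMAN_PCT = 43.0      # projected grooming time at that group size (% of the day)


def fit_line(x1, y1, x2, y2):
    """Return (slope, intercept) of the line through two points."""
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


SLOPE, INTERCEPT = fit_line(CEILING_GROUP, CEILING_PCT, HUMAN_GROUP, HUMAN_PCT)


def grooming_pct(group_size):
    """Grooming time (% of day) implied by the illustrative line."""
    return INTERCEPT + SLOPE * group_size


def group_size_for(pct):
    """Group size implied by a given grooming time, inverting the line."""
    return (pct - INTERCEPT) / SLOPE


def grooming_gap(group_size, ceiling=CEILING_PCT):
    """Percentage points of grooming time required beyond the ceiling."""
    return max(0.0, grooming_pct(group_size) - ceiling)


if __name__ == "__main__":
    print(f"Projected grooming time at 150: {grooming_pct(150):.1f}%")
    print(f"Shortfall above the 20% ceiling: {grooming_gap(150):.1f} points")
```

On these assumptions a group of 150 needs 23 percentage points more grooming time than the observed ceiling allows, which is the shortfall any substitute for one-to-one grooming would have to cover.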
However, from studies in seven societies, Dunbar (1998) concluded that humans actually spend around 20% of their time in social interaction, which coincides with the primate grooming maximum. Thus we have a ‘grooming gap’ which requires explanation. Dunbar’s solution to this is his vocal grooming hypothesis, according to which language originated as a time-saving substitute for one-to-one grooming, allowing several individuals to be ‘groomed’ simultaneously, and leaving the groomer free to engage in other activities whilst grooming.

The origins of vocal grooming

Aiello and Dunbar (1993) used Dunbar’s findings to predict group size and grooming time for all hominid crania in their sample (n = 85). They concluded that ‘the evolution of human language involved a gradual and continuous transition from non-human primate communication systems’. This gradual process divides into two phases. First, early Homo, with average grooming times approaching 23%, would have needed vocal grooming in which ‘tone and emotion would be the essential components’, that is, something analogous to the ‘gossiping’ exchanges and choral ‘song’ displays in gelada baboons, who in fact maintain human-sized groups. Second, archaic Homo sapiens, with average grooming times within the modern range, would need speech capable of communicating social information. The authors do not explain why archaic H. sapiens should require speech when geladas can maintain equally large groups without it.

The ‘extended’ vocal grooming hypothesis

Two particularly serious problems with the vocal grooming hypothesis concern the inadequacy of language as a grooming substitute:

First, grooming functions to prevent the problem of freeriders by being costly in terms of time budgets. Words are cheap: they are just too time-saving, and it is too easy to lie. So we have the problem of explaining how such inexpensive and potentially dishonest signals can demonstrate commitment.
Secondly, grooming cements social bonds by raising endorphin levels. Speech lacks the requisite psychopharmacological properties.

To address these problems, Dunbar added three new phases to his hypothesis:

(1) Human laughter has an unvoiced homologue in chimpanzees. Dunbar (2009) therefore suggests that laughter probably achieved a recognisably human form at an early date, perhaps in early Homo erectus. Modern laughter often occurs in rhythmic chorusing bouts, and this may have scaffolded the emergence of his second phase: song-and-dance display.

(2) Dunbar (2009) notes that evidence for modern vocal capacities (expanded hypoglossal and upper thoracic vertebral canals) first appears in archaic H. sapiens by 0.5 million years ago (mya). He notes that this evidence does not discriminate between song and speech, but suggests song and dance probably emerged at this time because of their obvious grooming advantage over language: in common with laughter, song and dance provide the requisite discharge of endorphins.

(3) He then observes that religious techniques such as meditation, fasting and flagellation, because they are arduous or painful, are likely to raise endorphin levels (Dunbar, in press). Perhaps religion evolved as the ‘opium of the people’ in more senses than one.

Dunbar’s complex ideas on religion are beyond the scope of this article. However, his speculations concerning the religious capacities of different hominids seem to add several additional steps to his thesis. But no matter how many stages you insert in the progression towards language, it still remains a poor substitute for grooming. What seems striking about the Vocal Grooming Hypothesis, in its final extended form, is that language has almost become an appendage, simply included because it is necessary for ‘communal religion’.

LINGUISTIC AND ETHNOGRAPHIC ISSUES

Chomsky (2005) describes language as ‘a system of discrete infinity’.
That is, phonemes can be combined to make words, words to make sentences, sentences to make narratives, and so on, in principle to infinity. Digital codes such as language are discontinuous with all other communication systems, which use sliding scales of size, rhythm, pitch, timbre, etc. (Burling, 1993). Chomsky points out that you cannot get from an analogue to a digital system by a gradual transition, or by a series of intermediate steps. The origin of language has to be instantaneous, because a system is either digital or it is not.

The idea that culture, and language, had a ‘big bang’ origin has a long history in social anthropology, where culture is conceived as ‘anti-biological’: ‘in apes, sex controls society, but in humans, society controls sex’ (Sahlins, 1960). The anti-biological features of human culture, such as concealment of the genitals and classificatory kinship, imply that human culture began at some revolutionary moment by inverting an ancient primate social order (review: Whitehead, 2008b).

Such ideas originate with Durkheim (1912), who argued that this revolutionary change was accomplished by ritual, and that language could only have a ritual origin. What distinguishes language from the vocalizations of animals, he argued, is displaced reference: conveying things known, imagined or imaginary. In order to encrypt an intangible, group members would have to engage in a recurring pantomime with self-evident meaning, where participants would know that the same meaning was present in the minds of all. Then it would become possible to refer to that meaning in a cryptic manner.

Speech–act theory points to the same conclusion (e.g. Grice, 1969; Searle, 1969; Austin, 1978). Language could not function without a ‘social contract’ because words are cheap. Like paper banknotes, they could not be taken at face value unless backed by a source of genuine worth or sanctions against lying.
In societies without police, gaols and judicial systems, this can only be accomplished through ritual and ritually-constructed supernatural beliefs (Knight, 1998).

PSYCHOLOGICAL AND DEVELOPMENTAL ISSUES

Mirror systems in the brain are implied by social mirror theory, which holds that ‘mirrors in the mind depend on mirrors in society’ (Whitehead, 2001). Shared social displays make experiential states salient so that we begin to notice them simultaneously in ourselves and others (Dilthey, 1883–1911). However, the discovery of mirror neurones in grasping cortex led the Parma team to favour simulation theory (Harris, 1991), according to which self-awareness is a given, and other-awareness is inferred by ‘mentally simulating’ others. In Harris’s theory, pretence and imagination are involved in mental simulation, but the Parma group emphasizes visuomotor mirroring, which could implicate any kind of shared behaviour. Developmental psychologists such as Gratier and Trevarthen (2008) have stressed the importance of displays in the development of social insight, whilst evidence that self-awareness and other-awareness emerge simultaneously (Gopnik and Meltzoff, 1994) favours social mirror theory.

Social displays include any behaviour that makes experiential states observable, and so broaden the focus of investigation beyond language. Burling (1993) notes that we have at least three modes of communication: affective, mimetic and conventional. But we also have two other kinds of display which have functions over and above communication, and they too have the same three modalities. Play is exploratory and experimental, and, in the case of pretend play, creates a world of shared imagination. Performance combines the functions of communication and play with two others, social grooming and entrainment, ensuring two or more individuals are functioning as one (Whitehead, 2001).

What is remarkable and unique to humans is far more than just language. We have elaborated to an exceptional degree an unprecedented variety of social displays. According to social mirror theory, the reason for our high levels of self- and other-consciousness is this formidable armamentarium of displays. Table 1 summarizes our three types and three modes of social display, with some illustrative examples.

Table 1 Illustrative examples of social display
Types: Communication; Play; Performance
Implicit: Gesture-calls (e.g. laughing, crying); Embodied play (e.g. contingent mirror play); Song-and-dance display; Making marks
Mimetic (projective, introjective): Iconic gesture-calls; Projective pretend play; Role-play; Making representational images; Pantomime
Conventional: Analogical codes (e.g. pictographic writing); Collecting behaviour; Ritual/ceremony; Iconography; Emblems; ‘Fine art’; Music; Cryptic codes (e.g. language, phonetic alphabets, mathematical denotations); Play scripts; Myth; Literary and dramatic arts; Economico-moral codes; Games-with-rules; Socio-economic personae; Wealth displays (material, moral, aesthetic, cultural, spiritual, etc.)
Source: Whitehead (2001): based on data in Bourdieu (1972), Burling (1993), Huizinga (1955), Jennings (1990, 1991), Winnicott (1974).

A ‘PLAY AND DISPLAY’ HYPOTHESIS

Whitehead (2003, 2008a) has proposed a ‘play and display’ hypothesis, holding that the proliferation of social displays was a major factor in human brain expansion. Our prodigious repertoire of displays could not have emerged all at once, and must have evolved in a logical order. Communication has to be primary, because it is so widespread: even cells communicate chemically. If performance is a playful extension of communication, it must be the most recent. A similar argument applies to modes of display. Implicit signals are common, mimesis is rare, and both have to be in place before they can be conventionalized to sustain and constitute modern human culture.
Whitehead further argued that play and performance in one mode could scaffold the emergence of communication in a ‘higher’ mode. Song-and-dance display, for example, creates the preconditions (social trust, social insight and voluntary control over displays) for a major expansion of mimetic abilities. Similarly, ritual pantomime (mimetic performance) is a likely prerequisite for language (conventional communication), as argued by Durkheim (1912) and Knight (1991). If so, the result is a spiral evolution of displays as shown in Table 2.

Table 2 Hypothetical evolutionary sequence of social displays
Types: Communication; Play; Performance
Modes: Implicit; Mimetic; Conventional

This spiral of display behaviours suggests three Rubicons during human evolution. The first, resulting from the emergence of song-and-dance display, would be expected to lead to brain expansion, as would the second, marked by a major elaboration of mimetic abilities up to the modern human level of role-play. There are four reasons why this might be so:

(1) Multimodal integration. Dance and role-play require fine motor control of multiple independent sets of muscles throughout the body, co-ordinated with proprioceptive, auditory, and visual feedback. Song-and-dance display would be expected to lead to expansion of multimodal areas such as the inferior parietal lobule, and higher level sensorimotor areas, possibly pre-adapting the brain for role-play.

(2) Timing precision. Calvin (1983) has shown that the ‘release window’ involved in throwing a missile at a target, which is shorter than the firing times of individual neurones, requires massive neuronal ‘redundancy’, exploiting the statistical accuracy of large numbers, to achieve the necessary precision. Richman (1978) showed that the synchronized choral displays of gelada baboons also involve millisecond timing precision.
In the case of human performance, that of a concert pianist for example, the subtleties of rhythm, rubato and the characteristic ‘pulse’ which distinguishes the work of individual western composers (Clines, 1977; Brown, 1991) demand fine timing precision not only in the performer but also in the listener, involving muscle tone throughout the body (Storr, 1993).

(3) Performative skills. An imaging study of cello players showed that the cortical representations of the fingers of the left hand were larger than those of the right, and this difference correlated with the age at which cellists began to play (Elbert et al., 1995). Another study assessed the effects of skill acquisition in a finger tapping task (Karni et al., 1995). The increased skill was associated with a 25% increase in area of the cortical representations of the fingers. The results were consistent with findings in monkeys relating to both motor and perceptual skill learning. Humans are capable of learning an extraordinary variety of motor and cognitive skills, including those required for performative displays. Presumably our large brains, in part, pre-adapt us for such acquired skills.

(4) Modelling other people. Role-play and ‘theatre of mind’ (ThoM) involve whole-system mind and body representations of multiple personae. Imagined people, such as characters in a novel, commonly behave as though they have minds, desires and beliefs of their own. To process multiple mind/body representations in parallel would presumably require comprehensive expansions of all brain structures required for a ‘toy person’ to behave realistically. This ‘chimerical brain’ hypothesis is consistent with experimental data suggesting multiple dissociated self-representations in normal human minds (Oakley and Eames, 1985; Bliss, 1986; Hilgard, 1986; Laughlin et al., 1992; Mitchell, 1994).
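Point (2) above appeals to the statistical accuracy of large numbers. A minimal sketch of that idea follows, assuming each neurone behaves as an independent timer with Gaussian jitter. This is an assumption made for illustration, not a model taken from Calvin (1983), and the numbers and function names are hypothetical. Under that assumption, averaging n timers reduces the jitter of the pooled signal by a factor of the square root of n, so a k-fold gain in precision needs roughly k squared times as many timers.

```python
import math
import random
import statistics


def pooled_jitter(single_jitter_ms, n_timers):
    """Jitter (standard deviation) of the mean of n independent timers."""
    return single_jitter_ms / math.sqrt(n_timers)


def timers_needed(single_jitter_ms, target_jitter_ms):
    """Smallest number of independent timers whose mean meets the target."""
    return math.ceil((single_jitter_ms / target_jitter_ms) ** 2)


def simulated_pooled_jitter(single_jitter_ms, n_timers, trials=2000, seed=0):
    """Check the square-root law empirically with hypothetical Gaussian timers."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Each timer fires at the intended moment (0) plus Gaussian jitter.
        times = [rng.gauss(0.0, single_jitter_ms) for _ in range(n_timers)]
        means.append(sum(times) / n_timers)
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Hypothetical figures: 10 ms jitter per timer, 1 ms precision wanted.
    print(timers_needed(10.0, 1.0))
    print(round(simulated_pooled_jitter(10.0, 100), 2))
```

The steep cost is the point of the sketch: because precision improves only with the square root of the number of timers, each further gain in timing accuracy requires a disproportionate increase in the number of neurones recruited.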
All the above implicit and mimetic abilities, expanded across the first two Rubicons, would lay the foundations for the third: ritual-based culture, establishing a ‘social contract’ on which the emergence of language depends. This would not be expected to lead to brain expansion; contraction is more likely, because brain tissue is expensive (Aiello and Wheeler, 1995). Societies whose members are controlled ‘from the outside’ by rules, backed by social or sacred sanctions, may not need such highly skilled or finely timed performances to maintain social cohesion.

ARCHAEOLOGICAL AND FOSSIL EVIDENCE

Figure 1 shows cranial capacities in fossil hominids since 3.5 mya. As predicted by the ‘play and display’ hypothesis, there appear to have been two periods of accelerated brain expansion: the first in habiline species between 2.5 and 2.0 mya, and the second from 700 thousand years ago (kya) in archaic H. sapiens. There is also evidence of structural brain changes across these periods, and they coincide with periods of cooling world climates and archaeological evidence of behavioural change.

There is also evidence of a third grade shift, though this did not coincide with the emergence of modern human culture. The rate of brain expansion appears to plateau around 50 kya, with an average of 1500 cm³ (Figure 2). After 10 kya, roughly coinciding with the agricultural revolution, brain volumes fell to their present average level of 1350 cm³.

Figure 3 shows endocranial casts from three fossil hominids, compared with those from a chimpanzee and a contemporary human.

Fig. 1 Hominid cranial capacities over the last 3.5 mya (data from De Miguel and Henneberg, 2001).

Fig. 2 Hominid cranial capacities from 200 to 10 kya (bold rule indicates 1350 cm³ average for living humans) (data from De Miguel and Henneberg, 2001).

Fig. 3 Endocranial casts (after Holloway, 1974).
Apith crania are slightly larger than those of chimps, the main differences being: bilateral transverse expansion, mainly due to increased bulk of the parietal and temporal lobes, and possibly enlarged premotor cortices; increased height, mainly of the parietal lobes; and a well-developed superior parietal lobule (Tobias, 1987). These enlargements include motor and parietal areas central to Arbib’s extended mirror system hypothesis.

The H. erectus cast reflects changes that occurred in habilines. From an examination of six habiline cranial casts, Tobias (1987) notes a number of new features not present in apiths, including a modern pattern of left-right asymmetries, increased bulk of frontal and parietal lobes, a prominent inferior parietal lobule, and pronounced enlargement of Broca’s and Wernicke’s ‘speech’ areas. Tobias infers that Homo habilis could speak, but homologous swellings can be seen in macaques (Deacon, 1992), and, in all primates other than ourselves, implicit vocalizations are processed in the left hemisphere (Falk, 1987). The expanded ‘speech’ and inferior parietal areas are consistent with song-and-dance display in habilines rather than archaic humans as suggested by Dunbar.

There is also archaeological evidence for major behavioural change just before the first grade-shift. The earliest unequivocal stone tools appear around 2.7 mya (Bilsborough, 1992). These were used for butchering meat, and the butchery scatters were all at riverside sites. There are two points to note here. Firstly, chimpanzees cannot butcher meat because they cannot trust each other to share such a valuable resource. When chimpanzees capture an animal they tear it apart and eat it in a general mêlée (Teleki, 1973, 1981; Strum, 1981); they are grabbing their share before the others eat it. Secondly, if you are a prey animal you do not linger near water where large carnivores are likely to drink (Potts, 1994).
Still less would you sit around butchering meat, releasing an alluring smell that could travel a long way downwind. These two points suggest that the earliest stone tool makers had developed levels of social trust unprecedented in any non-human primate, and an ability to deal with dangerous predators which were larger, faster, and better armed with teeth, claws and muscle, than any hominid. This suggests an ability to maintain large, well-coordinated groups, and a requirement to do so. By Dunbar’s own reasoning one would expect song-and-dance display by 2.7 rather than 0.5 mya.

Following the first grade-shift there was little change in H. erectus for around 1.5 million years. Then, during the second grade shift, all the previously expanded cortices became further enlarged, together with at least proportionate increase in prefrontal lobes.

During this second grade shift we find the first unequivocal evidence for new kinds of display behaviour. Pigments (especially red ochre and haematite, perhaps serving as body paint) were increasingly used by African and European hominids from around 300 kya (Watts, 1999), and late H. erectus was assembling collections of non-utilitarian objects, such as attractive pebbles, crystals, shells, whale teeth and fossils (Hayden, 1993). From around 125 kya, Neanderthals engraved geometric patterns on bones and rocks (Marshack, 1976, 1990). Analogous evidence from Africa is scarce but present. The first putative iconic object, the Berekhat Ram ‘figurine’, predates 270 kya and comes from an Acheulian site (Schepartz, 1993). This might be better interpreted as a ‘doll’ than as ‘fine art’ which would imply social hierarchy and class politics of a distinctly modern kind (Whitehead, 2003).

Fig. 4 Peak activations in nine studies involving tool use or hand-object manipulation.
Whitehead (2003) has further argued that all this display material implicates increasing mimetic and pretend play abilities, particularly in the light of negative evidence from earlier periods. The stereotypical character of Acheulian tools, which persisted unchanged for more than a million years, suggests extraordinary social conformity. The play and display hypothesis would associate this with a cohesive ‘song-and-dance’ rather than a more flexible ‘pretend play and mimetic’ culture.

If social displays implicate mirror systems, then the fossil and archaeological evidence is consistent with Arbib’s central thesis of expanding and proliferating mirror systems. The above evidence also favours the sequence and timing of display behaviours proposed by the play and display rather than the extended vocal grooming hypothesis.

NEUROIMAGING EVIDENCE

Tool use and object manipulation

Buccino and colleagues (2001) conducted an imaging study of object manipulations using hands, feet and mouth, and found associated brain activations which mapped onto parietal and opercular prefrontal areas in a somatotopic fashion, resembling a classic motor homunculus, and suggesting that there are mirror systems for many types of body movement, not just manual grasping. Nevertheless, the discovery of a manual grasping mirror system led to more than 40 imaging studies of hand actions, tool use and object manipulation (Grèzes and Decety, 2001).

Figure 4 shows major centres of brain activity from nine of these studies, six of which involved using and/or planning to use tools, whilst two involved naming tools and the other hand-object interactive movement. Table 3 shows the contrasts used in these studies.

Table 3 Contrasts used in nine tool use and object manipulation studies

Planning and naming tools
Johnson-Frey et al. (2005): Planning to mime tool use vs planning a meaningless hand movement: R and L hand
Fridman et al. (2006): Planning a tool-use gesture vs planning a communicative gesture: R hand only
Kan et al. (2006): Naming tools vs naming animals
Martin et al. (1996): Naming tools vs naming animals

Executing/miming tool use or object manipulation
Choi et al. (2001): Miming tool use vs oppositional thumb–index finger movements: R and L hand
Inoui et al. (2001): Using tongs vs using fingers to manipulate an object: R and L hand
Ohgami et al. (2004): Miming tool use vs mimetic gesture representing an object: R and L hand
Johnson-Frey et al. (2005): Miming tool use vs a meaningless hand movement: R and L hand
Fridman et al. (2006): Miming tool use vs a communicative gesture: R hand only
Naito and Ehrsson (2006): Illusory hand flexion with ball vs illusory hand flexion without ball: R and L hand

Observing tool use
Lotze et al. (2006): Observing tool use or instrumental actions with objects vs expressive gestures

Peak activations in these studies clustered in opercular prefrontal and superior parietal areas, corresponding to the assumed grasping mirror system. Most of the activity is in the left hemisphere, whereas studies of non-praxic hand actions show more bilateral activity (Grèzes and Decety, 2001). Note that there is very little activation of medial or prefrontal areas other than opercular.

Dance

To date there have only been four imaging studies of dance. Three of these (Calvo-Merino et al., 2005, 2006; Cross et al., 2006) were primarily intended to investigate mirror systems rather than dance as such. Participants observed videos of familiar vs unfamiliar dance moves: the former showed greater activity than the latter. In the fourth study (Brown et al., 2006) participants performed tango steps on a sloping board, and metric dance was contrasted with non-metric dance, self-paced dance, isometric muscle contractions, and a resting baseline.

Fig. 5 Main activation loci from four studies of dance.

Figure 5 shows activation loci associated with familiar vs unfamiliar dance moves, plus, from the Cross et al. study, all dance vs rest (observing) and, from the Brown et al. study, all metric dance contrasts, plus non-metric dance vs isometric muscle contractions (executing).

Unsurprisingly, dance activates the opercular prefrontal and parietal areas associated with tool-use, mainly in the left hemisphere. Overall, however, there is fairly widespread activity in both hemispheres. Note the superior temporal cluster, most evident in the right hemisphere. Since the three studies of observing dance used silent videos as stimuli, this cannot be a result of heard music, though it could indicate imagined music.

For two observing dance studies the activations shown represent dance minus dance, which may account for less inferior parietal activity than would be expected for a performative display requiring considerable multimodal integration. In any case, the observing dance studies lacked musical accompaniment, and the dance execution study had no visual feedback, so none were truly multimodal.

A further difference between dance and tool-use is the broad scatter of medial prefrontal, parietal and temporal activations. It is unsurprising that a complex multimodal activity such as dance should involve more widespread cortical activation than tool use. Yet for decades palaeoanthropological debates about ‘cognitive evolution’ have been dominated by discussions of tool-making and tool-use, with corresponding concerns that technological innovations do not correlate with brain expansion (Mellars and Stringer, 1989). Interest in the evolution of dance is relatively recent (Whitehead, 2003; Mithen, 2006; Dunbar, 2009). To my knowledge, no previous author has suggested a role for dance display in brain expansion.

Several of the brain structures activated by dance have also been implicated in studies of pretend play. This is consistent with the suggestion that performance in one mode may scaffold the later emergence of a higher mode.
Pretend play

The first imaging study of pretence (Whitehead, 2003) involved six drama students imagining themselves performing Hamlet and Lady Macbeth in rehearsed extracts from Shakespeare’s plays, cued from a rolling text. The role-play tasks were contrasted with readings from control texts, selected for their apparently uninvolving character. Although participants found the role-play tasks considerably more demanding than the control tasks, we found more brain activity during the control tasks.

A possible explanation for this unexpected finding might be that role-play (i.e. ‘ThoM’) is the default activity of the brain in awake adults. Such an idea is consistent with evidence that brain areas supporting default activity are involved in thinking about the self and others (D’Argembeau et al., 2005), watching social scenarios (Iacoboni et al., 2004), and following or constructing narratives (Mar, 2004). Many authors assume continuity between role-play, narrative and default activity (Mar, 2004). If so, all three would be expected to show significant overlap in imaging studies.

Interestingly, narrative and default areas were activated in the role-to-control switch during the role-play study. One tentative suggestion is that the role-to-control switch involved increased activity in role-play areas as they were being dissociated. Clearly this suggestion needs to be tested by a more definitive study. However, the role-to-control switch findings are provocative and I have included them in Figure 6.

The role-play study was followed by two larger studies in which participants observed videos of pretend actions contrasted with instrumental actions. In the first of these (German et al., 2004), an actor manipulated an object (e.g. placing a book on a shelf) or pretended to do so (e.g. placing an imaginary book on the shelf).
One criticism of this study was that brain activity associated with the pretend action might have been due to its unfamiliarity. To control for this possibility, the second study (Whitehead et al., 2009) included two control conditions: using a familiar object in the normal way (e.g. using a pen to write with) and in an unusual way (e.g. using the pen to stir coffee). The pretend condition involved pretending the object was something else (e.g. pretending the pen was an aeroplane). We found minimal differences between the two instrumental tasks.

Figure 6 shows peak activity for the pretend minus instrumental tasks in these two studies. The square markers represent a second task in the Whitehead et al. (2009) study. Participants were shown a picture of the object used in the previous action video and were required to name its use (e.g. ‘pen’, ‘coffee spoon’ or ‘aeroplane’). Pretend use was contrasted with usual plus unusual use.

The two studies of observing pretence showed similar patterns of activation, with several areas close to those associated with dance: bilateral opercular prefrontal; right superior temporal; and bilateral temporal poles. In contrast to dance, these two studies showed relatively extensive activity in ventromedial and orbital prefrontal cortex. Two dance studies (Brown et al., 2006; Cross et al., 2006) contrasted dance with a resting baseline, and these did not show such rostral ventromedial or orbital prefrontal activation.

Fig. 6 Main activation loci from three studies of pretence. Red, observing projective pretence (Whitehead et al., 2009); Orange, observing projective pretence (German et al., 2004); Maroon, role-to-control switch (Whitehead, 2003).

Table 4 Regions of interest associated with three forms of social display, compared with tool use, deactivation areas and ToM
Columns: Tool use, Dance, Pretend play, Narrative, Deactivation areas, ToM
Superior parietal L Prefrontal operculum L Inferior parietal L Posterior cingulate/precuneus Superior temporal Temporal pole Ventromedial/orbital prefrontal Dorsolateral frontal
R<L R<L R<L R+L R>L R+L R>L R+L R+L R R R+L R R>L R+L R+L R+L R+L R>L R>L R+L R+L R R+L R+L R+L R+L R+L
Deactivation areas may equate with ‘theatre of mind’ (ThoM). R, right hemisphere; L, left hemisphere. Italics, role-to-control switch only.

Observing pretence activated temporal and prefrontal areas frequently associated with ‘theory of mind’ (ToM). German et al. (2004) inferred that ToM is automatically engaged when people observe pretend actions. This would be consistent with Leslie’s (1987) view that children show precocious mentalizing abilities in the context of pretend play. However, ToM engages only a subset of the brain structures implicated in the pretend play studies (Table 4). Social mirror theory and Lillard’s (2001) ‘twin earth’ model both imply that pretend play is the primary phenomenon, enabling children to acquire ToM. Whether ToM is necessary for pretend play, or the converse is true, remains an open question.

The two studies of observing pretence also implicated brain regions repeatedly associated with narrative and default activity. If future research confirms that the role-to-control switch in the Whitehead (2003) study does indeed represent role-play, the overlap between narrative, default activity and pretence would be considerably increased (Table 4).

Comment on the imaging evidence

Table 4 summarizes the brain regions implicated in the above imaging research, showing areas of overlap and difference.
In view of the paucity of studies of dance and pretence, and the fact that different controls were used, conclusions based on these data are necessarily tentative. Although Table 4 does not take account of fine-grained functional differences within the broad areas shown, it does suggest a plausible evolutionary sequence, and possible directions for future investigation.

Imaging research suggests that there may be up to five distinct mirror systems in modern human brains, each of which partially incorporates phylogenetically older ones. The first two systems, which we share with monkeys, would be those for reading body actions and affective signals. A possible third would be a song-and-dance system; a fourth, a mimetic and pretend play system; and a putative fifth, a role-play system.

CONCLUSIONS

The evidence reviewed here broadly supports an extended mirror system hypothesis, though Arbib’s more controversial steps of protosign (S5) and protolanguage (S6) seem questionable. I have also argued that we need to take into account a broad range of display behaviours in explaining the evolution of the ‘language ready brain’. Anthropological and linguistic data favour Arbib’s hypothesis of language as a cultural invention. All the evidence reviewed above is consistent with the vocal grooming and ‘play and display’ hypotheses, with cranial cast and archaeological data favouring the earlier date for song-and-dance display proposed by Whitehead.

Overall, this article supports Arbib’s notion of a ‘language ready brain’, as opposed to the evolution of language per se. However, taking into account our rich repertoire of social displays, ‘culture ready brain’ might be a more appropriate term. Trevarthen (1995), after 30 years researching human infants in different cultures, claimed that human children are born ‘hungry for culture’. At least, I would claim, they are born hungry for socialization, and were things otherwise, human culture would be impossible.
REFERENCES

Aiello, L.C., Dunbar, R.I.M. (1993). Neocortex size, group size and the evolution of language. Current Anthropology, 34, 184–93.
Aiello, L.C., Wheeler, P. (1995). The expensive tissue hypothesis: the brain and the digestive system in human and primate evolution. Current Anthropology, 36, 199–221.
Arbib, M.A. (2002). The mirror system, imitation, and the evolution of language. In: Nehaniv, C., Dautenhahn, K., editors. Imitation in Animals and Artefacts. Cambridge, MA: The MIT Press, pp. 229–80.
Arbib, M.A. (2006). Aphasia, apraxia and the evolution of the language-ready brain. Aphasiology, 20, 1125–55.
Arbib, M.A. (2008). Mirror neurons and language. In: Stemmer, B., Whitaker, H.A., editors. Handbook of the Neuroscience of Language. Amsterdam: Elsevier/Academic Press, pp. 237–46.
Arbib, M.A., Liebal, K., Pika, S. (2008). Primate vocalization, gesture, and the evolution of human language. Current Anthropology, 49(6), 1053–76.
Arbib, M.A., Rizzolatti, G. (1997). Neural expectations: a possible evolutionary path from manual skill to language. Communication and Cognition, 29, 393–424.
Austin, J.L. (1978). How to Do Things with Words. Oxford: Oxford University Press.
Bilsborough, A. (1992). Human Evolution. Glasgow: Blackie.
Bliss, E.L. (1986). Multiple Personality, Allied Disorders, and Hypnosis. Oxford: Oxford University Press.
Bourdieu, P. (1972). Outline of a Theory of Practice. Cambridge: Cambridge University Press.
Brown, P. (1991). The Hypnotic Brain: Hypnotherapy and Social Communication. New Haven: Yale University Press.
Brown, S., Martinez, M.J., Parsons, L.M. (2006). The neural basis of human dance. Cerebral Cortex, 16, 1157–67.
Buccino, G., Binkofsky, F., Fink, G.R., et al. (2001). Action observation activates premotor and parietal areas in a somatotopic manner: an fMRI study. European Journal of Neuroscience, 13, 400–4.
Burling, R. (1993). Primate calls, human language, and nonverbal communication. Current Anthropology, 34, 25–53.
Calvin, W.H. (1983). A stone’s throw and its launch window: timing precision and its implications for language and the hominid brain. Journal of Theoretical Biology, 104, 121–35.
Calvo-Merino, B., Glaser, D.E., Grèzes, J., Passingham, R.E., Haggard, P. (2005). Action observation and acquired motor skills: an fMRI study with expert dancers. Cerebral Cortex, 15, 1243–49.
Calvo-Merino, B., Grèzes, J., Glaser, D.E., Passingham, R.E., Haggard, P. (2006). Seeing or doing? Influence of visual and motor familiarity in action observation. Current Biology, 16, 1905–10.
Choi, S.H., Na, D.L., Kang, E., Lee, K.M., Lee, S.W., Na, D.G. (2001). Functional magnetic resonance imaging during pantomiming tool-use gestures. Experimental Brain Research, 139, 311–7.
Chomsky, N. (2005). Three factors in language design. Linguistic Inquiry, 36, 1–22.
Clines, M., editor. (1977). Music, Mind and Brain. New York: Plenum.
Cross, E.S., Hamilton, A.F. de C., Grafton, S.T. (2006). Building a motor simulation de novo: observation of dance by dancers. NeuroImage, 31, 1257–67.
D’Argembeau, A., Collette, F., Van der Linden, M., et al. (2005). Self-referential reflective activity and its relationship with rest: a PET study. NeuroImage, 25, 616–24.
Deacon, T.W. (1992). The human brain. In: Jones, S., Martin, R., Pilbeam, D., editors. The Cambridge Encyclopedia of Human Evolution. Cambridge: Cambridge University Press, pp. 115–23.
De Miguel, C., Henneberg, M. (2001). Variation in hominid brain size: how much is due to method? Homo, 52, 3–58.
Dilthey, W. (1883–1911). In: Rickman, H.P., editor. Selected Writings. Cambridge: Cambridge University Press.
Dunbar, R.I.M. (1993). Coevolution of neocortical size, group size and language in humans. Behavioral and Brain Sciences, 16, 681–735.
Dunbar, R.I.M. (1998). Theory of mind and the evolution of language. In: Hurford, J., Studdert-Kennedy, M., Knight, C., editors. Approaches to the Evolution of Language. Cambridge: Cambridge University Press, pp. 92–110.
Dunbar, R.I.M. (2009). Why only humans have language. In: Botha, R., Knight, C., editors. The Prehistory of Language. Oxford: Oxford University Press, pp. 1–38.
Dunbar, R.I.M. (2009). Mind the bonding gap: constraints on the evolution of hominin societies. In: Shennan, S., editor. Pattern and Process in Cultural Evolution. Berkeley, CA: University of California Press, pp. 223–34.
Durkheim, E. (1912). The Elementary Forms of the Religious Life. London: Allen & Unwin.
Elbert, T., Pantev, C., Weinbruch, C., Rockstroh, B., Taub, E. (1995). Increased cortical representation of the fingers of the left hand in string players. Science, 270, 305–7.
Falk, D. (1987). Brain lateralization in primates and its evolution in hominids. Yearbook of Physical Anthropology, 30, 107–25.
Fridman, E.A., Immisch, I., Hanakawa, T., et al. (2006). NeuroImage, 29, 417–28.
German, T.P., Niehaus, J.L., Roarty, M.P., Giesbrecht, B., Miller, M.B. (2004). Neural correlates of detecting pretense: automatic engagement of the intentional stance under covert conditions. Journal of Cognitive Neuroscience, 16, 1805–17.
Gopnik, A., Meltzoff, A.N. (1994). Minds, bodies and persons: young children’s understanding of the self and others as reflected in imitation and theory of mind research. In: Parker, S.T., Mitchell, R.W., Boccia, M.L., editors. Self-awareness in Animals and Humans. Cambridge: Cambridge University Press, pp. 166ff.
Gratier, M., Trevarthen, C. (2008). Musical narrative and motives for culture in mother-infant vocal interaction. In: Whitehead, C., editor. The Origin of Consciousness in the Social World. Exeter: Imprint Academic, pp. 122–58.
Grèzes, J., Decety, J. (2001). Functional anatomy of execution, mental simulation, observation, and verb generation of actions: a meta-analysis. Human Brain Mapping, 12, 1–19.
Grice, H.P. (1969). Utterer’s meanings and intentions. Philosophical Review, 78, 147–77.
Harris, P. (1991). The work of the imagination. In: Whiten, A., editor. Natural Theories of Mind: Evolution, Development and Simulation of Everyday Mindreading. Oxford: Blackwell.
Hayden, B. (1993). The cultural capacities of Neanderthals: a review and re-evaluation. Journal of Human Evolution, 24, 113–46.
Hilgard, E.R. (1986). Divided Consciousness: Multiple Controls in Human Thought and Action. New York: Wiley.
Hill, R.A., Dunbar, R.I.M. (2003). Social network size in humans. Human Nature, 14, 53–72.
Holloway, R.L. (1974). The casts of fossil hominid brains. Scientific American, 231, 106–15.
Huizinga, J. (1955). Homo ludens: A Study of the Play Element in Culture. Boston: Beacon Press.
Iacoboni, M., Lieberman, M.D., Knowlton, B.J., et al. (2004). Watching social interactions produces dorsomedial prefrontal and medial parietal BOLD fMRI signal increases compared to a resting baseline. NeuroImage, 21, 1167–73.
Inoui, K., Kawashima, R., Sugiura, M., et al. (2001). Activation in the ipsilateral posterior parietal cortex during tool use: a PET study. NeuroImage, 14, 1469–75.
Jennings, S. (1990). Dramatherapy with Families, Groups and Individuals: Waiting in the Wings. London: Jessica Kingsley.
Jennings, S. (1991). Symbolic Play in Therapy. Oxford: Blackwell.
Johnson-Frey, S.H., Newman-Norlund, R., Grafton, S.T. (2005). A distributed left hemisphere network active during planning of everyday tool use skills. Cerebral Cortex, 15, 681–95.
Kan, I.P., Kable, J.W., Van Scoyoc, A., Chatterjee, A., Thompson-Schill, S.L. (2006). Fractionating the left frontal response to tools: dissociable effects of motor experience and lexical competition. Journal of Cognitive Neuroscience, 18(2), 267–77.
Karni, A., Meyer, G., Jezzard, P., Adams, M.M., Turner, R., Ungerleider, L.G. (1995). Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature, 377, 155–8.
Knight, C. (1991). Blood Relations: Menstruation and the Origins of Culture. New Haven & London: Yale University Press.
Knight, C. (1998). Ritual/speech coevolution: a solution to the problem of deception. In: Hurford, J.R., Studdert-Kennedy, M., Knight, C., editors. Approaches to the Evolution of Language: Social and Cognitive Bases. Cambridge: Cambridge University Press, pp. 68–91.
Laughlin, C.D., McManus, J., d’Aquili, E.G. (1992). Brain, Symbol and Experience: Toward a Neurophenomenology of Human Consciousness. New York: Columbia University Press.
Leslie, A.M. (1987). Pretense and representation: the origins of “theory of mind”. Psychological Review, 94, 412–26.
Lillard, A.S. (2001). Pretend play as Twin Earth: a social-cognitive analysis. Developmental Review, 21, 495–531.
Lotze, M., Heymans, U., Birbaumer, N., et al. (2006). Differential cerebral action during observation of expressive gestures and motor acts. Neuropsychologia, 44, 1787–95.
Mar, R.A. (2004). The neuropsychology of narrative: story comprehension, story production and their interrelation. Neuropsychologia, 42, 1414–34.
Marshack, A. (1976). Some implications of the Paleolithic symbolic evidence for the origin of language. Current Anthropology, 17, 274–82.
Marshack, A. (1990). Early hominid symbol and evolution of human capacity. In: Mellars, P., editor. The Emergence of Modern Humans. Ithaca, NY: Cornell University Press, pp. 457–98.
Martin, A., Wiggs, C.L., Ungerleider, L.G., Haxby, J.V. (1996). Neural correlates of category-specific knowledge. Nature, 379, 649–52.
Mellars, P., Stringer, C. (1989). Preface. In: Mellars, P., Stringer, C., editors. The Human Revolution: Behavioural and Biological Perspectives on the Origins of Modern Humans. Edinburgh: Edinburgh University Press, pp. ix–xi.
Mitchell, R.W. (1994). Multiplicities of self. In: Parker, S.T., Mitchell, R.W., Boccia, M.L., editors. Self-awareness in Animals and Humans. Cambridge: Cambridge University Press, pp. 81–107.
Mithen, S. (2006). The Singing Neanderthals: The Origins of Music, Language, Mind and Body. London: Phoenix/Orion Books.
Myowa-Yamakoshi, M., Matsuzawa, T. (1999). Factors influencing imitation of manipulatory actions in chimpanzees (Pan troglodytes). Journal of Comparative Psychology, 113, 128–36.
Naito, E., Ehrsson, H.H. (2006). Somatic sensation of hand-object interactive movement is associated with activity in the left inferior parietal cortex. Journal of Neuroscience, 26, 3783–90.
Oakley, D.A., Eames, L.C. (1985). The plurality of consciousness. In: Oakley, D.A., editor. Brain and Mind. London: Methuen, pp. 217–51.
Ohgami, Y., Matsuo, K., Uchida, N., Nakai, T. (2004). An fMRI study of tool-use gestures: body part as object and pantomime. Neuroreport, 15, 1903–6.
Potts, R. (1994). Variables versus models of early Pleistocene hominid land use. Journal of Human Evolution, 27, 7–24.
Richman, B. (1978). The synchronization of voices by gelada monkeys. Primates, 19, 569–81.
Rizzolatti, G., Arbib, M.A. (1998). Language within our grasp. Trends in Neurosciences, 21(5), 188–94.
Sahlins, M.D. (1960). The origin of society. Scientific American, 203, 76–87.
Schepartz, I.A. (1993). Language and modern human origins. Yearbook of Physical Anthropology, 36, 91–126.
Searle, J.R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Semendeferi, K., Damasio, H. (2000). The brain and its major anatomical subdivisions in living hominoids using magnetic resonance imaging. Journal of Human Evolution, 38, 317–32.
Storr, A. (1993). Music and the Mind. London: Harper Collins.
Strum, S.C. (1981). Processes and products of change. In: Harding, R.S.O., Teleki, G., editors. Omnivorous Primates. New York: Columbia University Press, pp. 255–302.
Tanner, J.E., Byrne, R.W. (1996). Representation of action through iconic gesture in a captive lowland gorilla. Current Anthropology, 37, 162–73.
Tayler, C.K., Saayman, G.S. (1973). Imitative behaviour by Indian Ocean bottlenose dolphins (Tursiops aduncus) in captivity. Behaviour, 44, 286–98.
Teleki, G. (1973). The Predatory Behavior of Wild Chimpanzees. Lewisburg: Bucknell University Press.
Teleki, G. (1981). The omnivorous diet and eclectic feeding habits of chimpanzees in Gombe National Park, Tanzania. In: Harding, R.S.O., Teleki, G., editors. Omnivorous Primates. New York: Columbia University Press.
Tobias, P.V. (1987). The brain of Homo habilis: a new level of organization in cerebral evolution. Journal of Human Evolution, 16, 741–61.
Tomasello, M., Call, J., Nagell, K., Olguin, R., Carpenter, M. (1994). The learning and use of gestural signals by young chimpanzees: a transgenerational study. Primates, 35, 137–54.
Tomasello, M., George, R.L., Kruger, A.C., Farrar, M.J., Evans, A. (1985). The development of gestural communication in young chimpanzees. Journal of Human Evolution, 14, 175–86.
Tomasello, M., Gust, D., Frost, G.T. (1989). A longitudinal investigation of gestural communication in young chimpanzees. Primates, 30, 35–50.
Trevarthen, C. (1995). The child’s need to learn a culture. Children & Society, 9, 5–19.
Watts, I. (1999). The origin of symbolic culture. In: Dunbar, R., Knight, C., Power, C., editors. The Evolution of Culture: An Interdisciplinary View. Edinburgh: Edinburgh University Press, pp. 113–46.
Whitehead, C. (2001). Social mirrors and shared experiential worlds. Journal of Consciousness Studies, 8(4), 3–36.
Whitehead, C. (2003). Social Mirrors and the Brain. PhD dissertation, Department of Anthropology, University College London.
Whitehead, C. (2008a). The neural correlates of work and play: what brain imaging research and animal cartoons can tell us about social displays, self-consciousness, and the evolution of the human brain. In: Whitehead, C., editor. The Origin of Consciousness in the Social World. Exeter: Imprint Academic, pp. 93–121.
Whitehead, C. (2008b). The human revolution: editorial introduction to ‘Honest fakes and language origins’ by Chris Knight. In: Whitehead, C., editor. The Origin of Consciousness in the Social World. Exeter: Imprint Academic, pp. 226–35.
Whitehead, C., Marchant, J.L., Craik, D., Frith, C.D. (2009). Neural correlates of observing pretend play in which one object is represented as another. Social Cognitive and Affective Neuroscience, 4, 369–78.
Winnicott, D.W. (1974). Playing and Reality. London: Penguin.