From The Oxford Handbook of Language Evolution, M. Tallerman & K. Gibson (eds) (Oxford: Oxford UP, 2012), chapter 51

PROTOLANGUAGE

Maggie Tallerman

51.1 WHAT IS PROTOLANGUAGE AND WHEN DID IT EMERGE?

Most researchers suggest that early hominin communication involved some form of pre-language, or protolanguage. Protolanguage is seen as simpler than full language, with a proto-lexicon, i.e. storage for learned, meaningful signals, but no syntax (Bickerton 1990). Protolanguage may have utilized vocal, gestural, and mimed components (see Arbib; Corballis; Donald; Harnad, this volume). However, I assume for brevity that the primary, and dominant, modality was vocal. A gestural protolanguage predicts sign as the dominant modern language modality, obviously incorrectly. Scenarios for a switch from manual to vocal can easily be invoked (for instance, speech frees up the hands for increased tool use), but all lack evidence.

We have no idea when protolanguage evolved, though many proposals link the Homo genus with the first protolanguage, perhaps 2 million years ago (mya). Suggested evidence comes from the archaeological and palaeoanthropological records. Stone tools appear around 2.6 mya (Wynn, this volume), including sharp-edged cutting tools, a kind of tool that is apparently beyond the capability of modern great apes. The fossil record indicates hominin brain growth and a developing vocal tract (MacLarnon; Mann; Wilkins; Wood and Bauernfeind, this volume) but provides only indirect evidence.
Selective pressures for vocal tract changes doubtless come from the emerging speech capacity, but even dating speech would not date language: language could occur without speech (i.e. sign), and the speech capacity could merely indicate a limited protolanguage.

On selective pressures specific to hominins, Bickerton (2009a) suggests a scenario involving construction of new niches: early Homo is postulated to employ cooperative scavenging, vocally recruiting aid to protect large carcasses. This builds in the evolution of cooperation, changes in diet leading to morphological changes in brain and body, and the beginnings of protolanguage. It appears consistent with the archaeological and fossil record: early hominins split animal bones to extract marrow using stone tools. Scavenging constitutes novel behaviour in the hominin lineage, thus might be the novel selection pressure needed for the development of new communicative behaviour.

Lacking any definitive way to date protolanguage, I assume it emerged within the last 2 million years or so, roughly coincident with early stone tools and with brain growth beyond the size of other primate brains. Full language must already be in place by the time of the human diaspora, around 85 kya (thousand years ago), since all modern populations have the same language faculty (i.e. can learn any natural language); at the latest, the full language faculty must be fixed before Australia was settled, around 50–60 kya. Possibly, full language is linked to the speciation of Homo sapiens, around 195 kya (McDougall et al. 2005). We cannot tell whether other species had (any form of) language; for instance, there is debate over the linguistic capacity of Neanderthals (e.g. Mithen 2005; P. Lieberman 2006), which became extinct around 30 kya.

51.2 COMPOSITIONAL PROTOLANGUAGE
A compositional or lexical protolanguage consists of single proto-words, initially uttered separately and slowly, and subsequently joined in short, fairly random sequences (Bickerton, this volume). It has no hierarchical structure, no syntactic combinatorial principles, and only a loose pragmatic relationship between proto-words. What meanings might proto-words express? Bickerton (1990, 2009a), Hurford (2003a), and Tallerman (2007) suggest noun-like and verb-like words, while Heine and Kuteva (2007, this volume) suggest that noun-like words appear first, with all other categories deriving from these, including verbs.

The term ‘proto-word’ reflects significant differences between proto-vocabulary and true words. Modern vocabulary items (almost) all fit into a structured semantic network, and contract obligatory relationships with other words. For instance, transitive verbs and prepositions require syntactic objects with specific semantic properties. No subcategorization relationships (Tallerman, Chapter 48) would occur in protolanguage: proto-words did not obligatorily select other words. Modern ‘defective’ vocabulary items are similar, having phonology and meaning but no syntax; examples include yes/no, hello, ouch, oops, wow, hey, shh, psst, and words for animal calls, like oink and woof. Jackendoff (1999, 2002) regards these as living linguistic fossils. Unlike ordinary words, defective words are used alone as meaningful utterances, and cannot be combined (except in quotatives: She said ‘ouch’). Some are largely involuntary, apparently under right hemisphere control, and some even survive aphasia; such features are suggestive of primate calls. However, unlike primate calls, they are crucially culture-specific, and thus learned.

A crucial development occurs when words are first combined: stringing proto-words together, still without hierarchical structure, brings an advance on a single-word stage (Bickerton, this volume).
Jackendoff (2002) suggests compounding as a pre-syntactic principle, involving mere concatenation. Consider some English compounds: penknife, breadknife, table knife, butter knife. The relationships between the semantic head (knife) and the modifier are idiosyncratic: penknives sharpen pens, breadknives cut bread, table knives are used at the table, and butter knives spread butter. Compounding forms around twenty distinct semantic relationships, potentially producing quite an expressive protolanguage. Spontaneous compound-like (signed) utterances occur in captive apes, such as the chimpanzee Washoe’s ‘water bird’ for duck (Fouts 1975), but we cannot know if chimpanzees intend one sign to modify another, or understand that for the observer, concatenations generate meaning.

What is the evidence for an unstructured word + word (+ word . . . ) protolanguage stage? Bickerton (1990, 1995) argues that the protolanguage capacity is retained by modern humans, emerging in child language and in pidgins (Roberge, this volume), in adults learning a second language naturalistically, in homesign (used by the deaf children of hearing adults; Goldin-Meadow, this volume), and in linguistically-deprived children, who lack adequate input within the critical acquisition period (see Curtiss 1977 on ‘Genie’). Moreover, protolanguage may not be species-specific: trained apes can produce sign or symbol ‘utterances’ with short sequences of proto-words, lacking syntax (e.g. Gardner and Gardner 1969; Patterson 1978; see Gibson, Chapter 3). Though non-human primates never acquire protolanguage in the wild, their abilities in captivity suggest some phylogenetically ancient pre-linguistic capacities. Examples of various putative forms of modern protolanguage are given in (1)–(3); Roberge, this volume, provides samples of pidgins:

(1) Ape protolanguage (Kanzi (bonobo), using a combination of lexigrams on a keyboard and gestures; data from Savage-Rumbaugh et al. 1998)
water chase water balloon food childside childside orange Matata bite chase water bad water chase you juice raisins Austin carry childside carry hide peanut

(2) Child language (Tom, 23 months)
doggie fall I get that want hat there birdie put sock off more milk where gone Tom cup

(3) Genie (Curtiss 1977)
Paint. Paint picture. Take home. Ask teacher yellow material. Blue paint. Yellow green paint. Genie have blue material. Teacher said no. Genie use material paint. I want use material at school.

Protolanguage exhibits the following properties (e.g. Bickerton 1990). First, the ordering of elements is relatively random. No hierarchical syntactic structure constrains surface order, and different word orders have no link to information structure (e.g. given vs. new information). The bonobo Kanzi illustrates this, producing Austin carry when asking to be carried to see Austin (Savage-Rumbaugh et al. 1998: 62), but Matata bite when Matata (his mother) bites him. Savage-Rumbaugh reports definite ordering regularities in Kanzi’s output (Greenfield and Savage-Rumbaugh 1990, 1991), but these differ significantly from the spoken English input that Kanzi receives. Pre-grammatical utterances in young children typically reflect closely the word orders of the ambient language, specifically by being consistently head-initial or head-final; Genie’s utterances in (3) also share this characteristic.

Simple word order regularities do not, though, necessarily indicate syntax. Ancestral protolanguage putatively contained various ‘purely semantically based principle[s] that map into linear adjacency without using anything syntactic’ (Jackendoff 2002: 248). ‘Fossils’ of these principles, such as Agent First and Focus Last, still occur: Agent First produces the subject-initial constituent order found in 90% of languages today.
Another principle, Grouping, ensures that in dog brown eat mouse, the dog is brown, while in dog eat brown mouse, the mouse is brown; Agent First ensures that the dog is eating. Yet no constituent structure is assumed here. Pre-syntactic principles of this nature in protolanguage could start to tie information structure to ordering.

Next, consider the subcategorized arguments of verbs and other syntactic heads. In full languages, these are often phonetically null, but are systematically related to overtly present categories. Below, e stands for ‘empty’, an element understood, not pronounced:

(4) Kim is too mean [e] to make supper. (Here, [e] = Kim)
(5) Kim is too mean [e1] to make supper for [e2]. (Here, [e1] is not Kim, but [e2] = Kim)

The meanings of null elements are not random, but are syntactically regulated. Contrast the protolanguage utterances from Genie: Paint picture (where the agent is missing) or Take home (where the agent and patient, the item to be taken, are both missing); null elements in Kanzi’s utterances appear similarly unconstrained. Bickerton notes that ‘in protolanguage . . . any item may be absent from any position’ (1990: 124); null elements are randomly distributed, so subjects or objects of verbs often get omitted. In ancestral protolanguage, a major rubicon is crossed when words start requiring co-occurring words with specific syntactico-semantic properties.

However, some full languages also exhibit few syntactic restrictions on null elements. In Chinese, ‘He saw him’ translates as in (6), with an overt subject and object, but is also expressed (in an appropriate pragmatic context) as (7a), (b) or (c):

(6) ta kanjian ta le
    he see he aspect
    ‘He saw him.’
(7) a. [e] kanjian ta le
    b. ta kanjian [e] le
    c. [e] kanjian [e] le

Of course, Chinese is not protolanguage, but a full language with regular subcategorization requirements: kanjian ‘see’ is a transitive verb with an animate agent and a visible patient, just as in English.
But the principles regulating null elements in English are not universal, so cannot form a model for how protolanguage became language.

Ancestral protolanguage putatively lacked a mechanism for assembling words into structural units (Bickerton 2009a, 2010, this volume): initially, there were no syntactic relationships between proto-words. For Bickerton, the crucial development is the appearance of ‘Merge’ (more precisely, the novel linguistic use of an existing cognitive ability), which combines two words to form a phrase, then combines that phrase with another word, forming a larger phrase, and so on. Under this view, clausal subordination is not special; it is simply due to repeated applications of Merge. (Modern) protolanguage sometimes exhibits apparent subordination, such as Genie’s I want use material at school, but these are merely prefabricated routines, consisting of I want + state of affairs; they are not productive. Bickerton suggests (2009b) that protolanguage-speaking children may not yet know the kinds of verbs that require subordinate clauses.

Finally, protolanguage lacks a distinction between lexical elements (primarily verbs, nouns, adjectives) and functional elements (grammatical items, including determiners, auxiliaries, and sub-words such as affixes). Modern protolanguages lack grammatical markers (for instance, Genie’s ask teacher yellow material, or Kanzi’s ball slap), while in full languages, functional and lexical elements occur in roughly equal proportions in utterances. There is widespread agreement that ancestral protolanguage would contain at most two categories, the precursors to nouns and verbs. The transition to language involved the gradual accretion of other word classes via the same processes of grammaticalization that occur in all recorded languages: see Bybee; Heine and Kuteva, this volume; Hurford 2003a; Tomasello 2003b.
Some grammatical elements mark phrase and clause boundaries, so presumably by the time these appear, speakers have passed the item-by-item stage of protolanguage production, and instead pre-form phrases before they are uttered.

Of course, modern peoples all possess a full language faculty; thus, Bickerton’s proposals to model ancestral protolanguage on modern child language or on restricted linguistic systems are controversial. Modelling protolanguage on ape ‘language’ capacities is also controversial: the last common ancestor of chimpanzees and humans, around 6 or 7 mya, need not possess any specifically ‘linguistic’ characteristics, since there is plenty of time since the split for the full suite to evolve. In sum, using modern reflexes of ‘protolanguage’ as evidence for ancestral protolanguage is contentious. Nonetheless, the models of protolanguage presented by Bickerton and by Jackendoff, together with pathways of grammaticalization outlined by Heine and Kuteva, go a long way towards elucidating likely processes in language evolution.

51.3 PRIMATE VOCALIZATIONS AND PRIMATE COGNITION

Under the compositional view of protolanguage, the earliest development is the creation of arbitrary signals connecting sounds to simple meanings: a proto-vocabulary of symbols, lacking word classes. Proponents of this view reject the idea that protolanguage emerged from a primate communication system, though it may have utilized essentially the same mechanical means of sound production. Proto-vocabulary thus represents a major discontinuity with primate communication. The evidence draws on known features of vocabulary, which show little overlap with features of natural vocal or gestural communication in other primates. In this scenario (e.g.
Burling 2005) primate gesture-calls are not precursors to words; instead, the critical continuities are cognitive. The primate literature certainly provides increasingly sophisticated evidence concerning primate vocalizations and gestures (in the wild and in captivity), indicating a network of similarities and differences between human and nonhuman primates (see the contributions in Part I). But there is no straightforward pathway via which primate vocalizations and/or gestures could ‘turn into’ linguistic utterances. Below, I focus on comparison of words (or proto-words) and primate calls.

Human (proto-)words differ from primate calls in major ways. The first concerns symbolic reference (see Deacon, this volume). Some primate vocalizations have functional reference: they refer to events or entities external to the caller, and not merely the animal’s own emotional state. This applies both to monkey and ape vocalizations (see Slocombe and Zuberbühler 2006; Slocombe, this volume), but prime examples are monkey alarm calls. Vervets produce different calls in response to each main predator, eagles, leopards or snakes (Cheney and Seyfarth 1990, this volume; Zuberbühler, this volume). Conspecifics clearly associate each alarm call with specific dangers, taking appropriate avoidance action for each predator type. But alarm calls are not word-like. They are more like propositions, rather than ‘referring’ to specific predators. As Bickerton (2009a: 200) points out, alarm calls may simply mean ‘threat from the ground’, ‘threat from above’, and so on. Unlike vocabulary items, each alarm call is tied to one specific context, with no flexibility or nuances of meaning (monkeys can’t indicate a particularly mean leopard). Primate calls never form part of an interconnected network of related symbols, as words do. And critically, alarm calls cannot be used merely to mention the concept of a leopard.
Thus, a second distinction between human (proto-)vocabulary and primate calls is displacement. Using words in the absence of their referent is a novel feature in primate communication, though limited tactical deception (uttering an alarm call when no predator is visible) occurs in some monkeys (Wheeler 2009), and perhaps in chimpanzees too. The earliest hominin vocabulary likely had no displacement. For instance, a word indicating an animal is initially uttered when the animal is visible to both speaker and addressee. Later, the speaker sees the animal while his companion is distracted, and uses the animal’s name, since he can see it. If the companion is smart, he understands. The ability to comprehend displacement thus plausibly emerges before it is used deliberately in production, to refer to entities not present; see also Burling (2002). Displacement must be a highly adaptive feature of protolanguage (Bickerton 2009a).

A scenario like this reveals a third major difference between human use of vocabulary and primate communication: only the former exhibits shared intentionality (Tomasello et al. 2005; Tomasello and Carpenter 2007), the mutual commitment to collaboration found in human interactions. Before they are one year old, human infants can triangulate between themselves, an adult, and external objects, by pointing, gaze-following, or offering objects for inspection: they establish joint attention. Thus, unlike primate calls, words can be used merely to mention, or to point something out. Other primates use vocalizations and gestures to draw attention to themselves (e.g. to initiate play, or beg for food). But only humans engage spontaneously in triadic reference: two people attend to some external object, and agree on a convention for referring to it. Such collaboration is an absolute prerequisite for protolanguage (Tomasello 2003b).
Shared intentionality gives rise to a fourth distinction: human vocabulary involves cultural transmission and learning, unlike non-human primate vocalizations. Moreover, though language has a critical period, vocabulary learning does not atrophy in adults. Learning also brings the prospect of innovation. Other primates are reported to produce some novel vocalizations (in Part I see Gibson; Slocombe), but in the wild their call repertoire is basically fixed, whereas vocabulary is productive and open-ended. Fifth, vocal learning requires both vocal control and vocal imitation. Researchers originally assumed that non-human primates only produce ‘affective’ (i.e. emotionally-driven) vocalizations, and couldn’t vocalize at will, suppress, or modify vocalizations. This is now known to be inaccurate. Audience effects and call modifications occur both in monkeys (Cheney and Seyfarth 1990) and great apes (Slocombe and Zuberbühler 2007; see Slocombe; Zuberbühler, this volume). This, along with tactical deception, suggests elements of vocal control. On the other hand, primate vocalizations are essentially driven by an internal state, rather than being volitional. And vocal imitation is a novelty in the human lineage. Such distinctions between words and primate calls might all appear in the earliest proto-words. Full language displays yet more differences. Duality of patterning is pivotal: a small, discrete set of sounds combines in different ways (phonology), giving rise to open-ended sets of morphemes and words, which themselves combine productively to form phrases and clauses (syntax). We assume protolanguage to lack this duality, and primate calls have nothing analogous (and nothing homologous). Ancestral protolanguage would not yet have a generative phonological system: for instance, Lindblom (1998) demonstrates that pressure for phonological complexity only comes from increases in vocabulary (see MacNeilage; Studdert-Kennedy, this volume). 
In sum, the earliest words had little in common with primate calls, apart from probably using the vocal/auditory modality (inconsequentially, since nearly all mammals vocalize). Protolanguage most likely did not develop from primate calls. An alternative is that primate cognition played a crucial role; see Seyfarth and Cheney (this volume); Hurford (2007, this volume). Hurford argues that human conceptual structure derives from primate perceptual structure. The fundamentals of modern cognition (concepts such as object permanence) probably evolved before the split from the Pan genus; see Tallerman (2009), also Coolidge and Wynn, this volume. Bickerton (1990: 91, 101) notes:

In all probability, language served in the first instance merely to label protoconcepts derived from prelinguistic experience. [ . . . ] Protoconcepts which could serve as referents for nouns and even verbs – nouns and verbs being the basic units from which other linguistic categories are derived – were in place by the time the higher primates had developed.

Strikingly, however, Bickerton (2009a) denies that any species other than H. sapiens had genuine concepts. Unlike mere categories, which other animals do have, concepts involve offline thinking (thinking about an activity or entity you are not currently engaged with) and displacement (e.g. imagining a leopard that’s not present). Concepts have permanent storage in the brain which can be accessed voluntarily (a lexicon). True concepts are triggered, in Bickerton’s view, by the emergence of the earliest words (see also Boeckx, this volume). Discussion of these two extremes (the idea that conceptual structure is in place well before protolanguage emerged versus the view that concepts are impossible without language) cannot be pursued here. However, researchers constantly invent subtle ways of examining animal cognition, producing new data to further the debate.
51.4 HOLISTIC PROTOLANGUAGE

I emphasize critical distinctions between primate vocalizations and human vocabulary, because some recent speculation links primate calls directly to protolanguage. Mithen (2005, 2009) and Wray (1998, 2000, 2002a) assume that primate calls are ancestral to human vocabulary:

Protolanguage would . . . be a phonetically sophisticated set of formulaic utterances, with agreed function-specific meanings, that were a direct development from the earlier noises and gestures, and which had, like them, no internal structure (Wray 1998: 51).

Primate calls are not compositional, but holistic: the entire call is the entire message. Wray and Mithen, also Arbib (2005) and Fitch (2010a), propose that protolanguage was also holistic. They reject the compositional account of protolanguage starting with discrete proto-words representing concepts, and then forming short, unstructured proto-word strings. In holistic protolanguage (HPL), each utterance represents an entire proposition, with arbitrary form and a complex meaning agreed by the community. Wray’s toy examples are tebima ‘give-that-to-her’ and kumapi ‘share-this-with-her’ (2000: 294). The idea is that form/meaning correspondences occasionally occur fortuitously; here, ma occurs in each string, and the meaning ‘her’ also occurs in each. A speaker might assume that ma means ‘her’, and a ‘word’ for ‘her’ then ‘fractionates’ out of the non-compositional sequence. Utterances are thus broken apart to form proto-words.

The HPL idea is highly problematic. First, for fractionation to succeed, holistic calls must contain phonetic break-points (Studdert-Kennedy and Goldstein 2003), and early hominins must notice them. But just like our own innate vocalizations (laughter, crying, shrieks of fear, etc.), holistic primate calls contain no discrete phonetic units.
One element of proposed continuity thus disappears. HPL must be physically very unlike primate calls, and proponents of HPL don’t explain how a presumed complex phonetic system originates; how do holistic primate calls turn into a discrete segmental system? Tallerman (2007) also questions the assumption that calls would be long enough to fractionate: instead of tebima, for instance, a signal in HPL might be simply ma, which is far more likely, given the protosyllable account of MacNeilage (1998, 2008, this volume). If each signal is short, there’s no material to break down, and the account fails. Moreover, the HPL account requires early hominins to possess a sophisticated compositional semantics (Johansson 2008), which, on the basis of comparative biology and the archaeological record, seems improbable. On the compositional account, semantics simply evolves in tandem with words. On physical properties alone, then, HPL is unsupported.

Second, the processes turning a putative HPL into compositional language differ completely from observed processes of language change; see Bybee; Carstairs-McCarthy; Heine and Kuteva, this volume. The bundle of effects termed ‘grammaticalization’ creates syntactic constructions (such as the passive), produces new word classes (such as adjectives), and forms grammatical elements from content words (for instance, creating auxiliaries and complementizers out of verbs). Heine and Kuteva (2007, this volume) argue that grammaticalization is the only process that could produce words of distinct classes from a protolanguage consisting initially of noun-like items. As Bybee (this volume) notes, grammatical constructions are overwhelmingly formed by composition when adjacent elements fuse, not by breaking complex elements apart (cases like back-formation of edit from earlier editor are much rarer). The large-scale deconstruction presumed in accounts of HPL is unsupported by historical linguistics.
Nor do language deficits support HPL: grammatical breakdown in agrammatical aphasia has entirely different properties.

Third, consider the problem of counterexamples. Discussing Wray’s HPL examples above, Tallerman (2007) surmises that many utterances might contain ma but mean nothing to do with ‘her’, or might pertain to a female recipient but not contain ma; the number of counterexamples would overwhelm positive examples. This intuition is confirmed via computational modelling (Johansson 2008; K. Smith 2008). Johansson’s models vary the parameters of a toy HPL by various factors, including total inventory of utterances, number of sound segments, number of meaning elements etc., and:

For all parameter combinations, the number of counterexamples were found to outweigh the number of positive examples by a considerable margin. For no parameter combination did the fraction of all predicates with more positive examples than counterexamples exceed 2% (Johansson 2008: 175).

Fourth, some accounts (Mithen 2005, Arbib 2005) suggest that highly complex meanings could be inferred from holistic utterances. A. Smith (2008) observes that meanings in HPL must be reconstructed purely from context, and while humans can conceptualize simple, cognitively salient meanings, associated with basic-level categories such as ‘dog’ and ‘chair’, it is hard to learn general categories such as ‘animal’ or ‘furniture’ contextually. Moreover, if we show a child a picture of a ‘dax’, a mythical creature dancing on a table, she doesn’t assume that ‘dax’ means This-is-a-dax-dancing-on-a-table, but rather, she associates the novel creature with the label dax (Tallerman 2008a). The intricate, very specific, multi-propositional meanings suggested by some proponents of HPL (see Tallerman 2007 for discussion) thus cannot be reconstructed from context in the first place, let alone transmitted successfully between further individuals. A.
Smith concludes that ‘Unitary, unstructured meanings can only reliably be associated with highly salient, relatively simple meanings, as they must be reconstructable without any linguistic cues’ (2008: 109). Complex propositions are thus unlikely to be associated successfully with holistic utterances. Fifth, consider the use made of HPL. For Wray, its function was not informative, but social and manipulative, like animal communication systems (adding further continuity with primate calls). HPL delivers ‘subtle and complex social messages’ (Wray 2002a: 117), covering threats, greetings, and commands; a primitive compositional protolanguage, lacking grammar, would, Wray suggests, be too ambiguous to function properly. (Though potential ambiguities in complex holistic message strings are not seen as problematic.) However, Bowie (2008), in experiments using restricted language systems, shows that a small compositional system (containing just a few words) significantly enhances communication in novel situations; conversely, a semantically-fixed set of holistic signals is highly inflexible. Moreover, early hominins don’t need a new system to deliver the kind of social messages Wray suggests (Bickerton 2003, 2009a; Tallerman 2007). Our ancestors had, as we still have, all the primate vocal and gestural features necessary: tears, laughter, sighs, snarls, shouts of joy, cringing, plus biochemical signals; see Burling (1993, 2005). HPL adds nothing to this innate repertoire (nor does language). Language doesn’t replace primate signals, but adds an entirely different system alongside them (sometimes literally: cries of pain are involuntary, yet often expressed using language-specific vocalizations; cf. English ow! but French aie!). Finally, proposals for HPL entirely disregard typical communicative attempts made by apes in the lab. 
Far from being holistic, whole propositions, ape utterances are typically short, perhaps two proto-words, crucially revolving around content ‘words’. Apes trying to communicate with humans produce the elements with most meaning (noun-like and verb-like items), and ignore the rest. Their ancestors are our ancestors too, and this strategy (concentrate on producing maximum meaning with minimum effort) likely utilizes the type of cognition early hominins brought to protolanguage. HPL is thus too complex for our ancestors; instead, protolanguage comprised short proto-word sequences of one or two items bearing high meaning.

I conclude, then, that holistic protolanguage is linguistically untenable, and does not achieve continuity with primate communication systems anyway. Primate cognition is more relevant in the continuity debate than primate vocal communication.

51.5 MUSICAL PROTOLANGUAGE

Some recent publications suggest a musical protolanguage as a precursor of HPL (Mithen 2005, 2009, this volume; Fitch 2010a); see Botha (2009a) for critique. Fitch focuses on ‘complex vocal imitation’ (vocal imitation, control, and learning; 2010a: 340). In other primates, few homologues to these vital features of spoken language occur. However, complex vocal imitation occurs widely elsewhere. If the selection pressures giving rise to learned ‘song’ in songbirds, seals, whales etc. also applied to hominins, then novel vocal capacities in speech are a case of convergent evolution.

Problematically, though, sexual selection drives the evolution of animal song: learned song is mostly produced by males, in courtship and defence of territory. But both males and females possess speech. So how would ‘musical protolanguage’ spread from males to females?
Fitch emphasizes social bonding or group cohesion, as in other species where both sexes display complex vocal imitation. But there is no evidence that learned vocalization evolved first in males and later spread to females. Secondly, as is expected with sexually-selected characteristics, song arises at puberty, and is thus highly unlike learned vocalization in humans. Moreover, learned song shows seasonal peaks and is hormonally driven, again unlike speech (see Gibson, Chapter 11). The biological perspective offers little evidence that sexual selection drove human vocal learning.

From a linguistic perspective too, musical protolanguage is dubious. Fitch suggests that a musical protolanguage contained ‘meaningless sung phrases of complex phonological structure’ (2010a: 496), and that ‘the generative aspect of phonology might have emerged before it was put to any meaningful use’ (2010a: 471). Fitch’s musical protolanguage is ‘bare phonology’, like non-lyrical song, and initially lacked meaning entirely. But as noted above, selective pressures for contrastive phonology come from an expanding vocabulary; complex phonology cannot evolve before vocabulary exists. Even allowing that Fitch really means phonetic structure, this scenario still entails a massive evolutionary leap, as other great apes have nothing comparable. More plausible scenarios are outlined by MacNeilage; Studdert-Kennedy, this volume. Both stress the importance of vocabulary (i.e. of meaning) as a selection pressure in learned vocalization.

Fitch suggests that meaningless melodies subsequently become associated with whole events, hence meanings are paired arbitrarily with musical ‘phrases’ (forming a holistic protolanguage). These utterances have complex, hierarchical structure (Fitch 2010a: ch. 14), subsequently exapted for syntax.
But phrases in music or animal song differ radically from syntactic phrases, which start with a semantic/syntactic head that gains dependents, in evolution (Jackendoff 2002) as in child language acquisition; see Tallerman, Chapter 48. Musical protolanguage is thus an evolutionary cul-de-sac (since the musical aspects must ultimately be abandoned for a word-based protolanguage), and moreover, does not provide any observed features of full language.