Language Families and Linguistic Diversity

M Ross, The Australian National University, Canberra, Australia

© 2006 Elsevier Ltd. All rights reserved.

Linguistic Diversity

‘Linguistic diversity’ refers to several interrelated phenomena: 1. phylogenetic (genetic, genealogical) diversity: the number of language families in a geographic area; 2.
intrafamilial and intralinguistic variation: differences among the languages of a family and among the variants of a language; 3. language density: the number of languages in a geographic area; 4. typological diversity: differences among the structures which make up languages. The focus of this article is on (1), (2) and (3) (see also Variation and Language: Overview). More conventional approaches to these topics are discussed first, then a recent alternative. Although (4) will not be discussed further here, the relationship between phylogenetic and typological diversity should be mentioned. It is quite common for the members of a language family to be typologically similar, but there is no necessary reason why they should be so. There is, for example, considerable typological variety among the grammars of the Austronesian family, some of it the result of contact with non-Austronesian languages, but much of it the outcome of language-internal changes. Conversely, languages belonging to different language families may be typologically quite similar, either as the result of contact or as the result of independent parallel innovation.

Language Families

Definitions and Problems

A language family is conventionally defined as a set of languages that share a common ancestor. The family metaphor captures two insights: that languages are systems that are transmitted from one generation to the next (generational continuity) and that these systems change over time so that when a community of speakers divides into, say, three separate communities, the speech of each of the three will change in different ways from the others, leading eventually to mutual unintelligibility, i.e., three languages descended from a shared ancestor. The basis of the metaphor is biological evolution, where one species divides into two or more new species (McMahon, 1994: Chap.
12), even though linguists often use the term ‘family tree,’ which otherwise denotes trees drawn to represent human family relationships. Identifying a language family in the first place depends on finding individual-identifying evidence (Nichols, 1996, 1997), patterned similarities between a set of languages that could not have arisen by chance and must be the outcome of shared inheritance. The Indo-European family, for example, was first definitively identified by Sir William Jones in 1786 on the basis of similarities between the verb paradigms of Sanskrit, Greek, and Latin. Recently, the Trans-New Guinea family has been identified on the basis of formally similar pronouns (Ross, 2005).

Figure 1 A conventional family tree of the Germanic languages.

Figure 1 shows a conventional family tree of the Germanic languages (see Germanic Languages) and serves as a starting point for a discussion of the family-tree model. It shows that the family concept is recursive. Thus English, Friesian, etc. form a West Germanic family. This, together with the North Germanic family and the single-member East Germanic family, forms the Germanic family. The latter, along with the Romance, Balto-Slavic, Indo-Iranian, and other, some single-member, families, is part of the Indo-European family (not shown in Figure 1). The languages at all nodes other than the terminals in Figure 1 are reconstructed proto-languages. The changes that languages undergo are often regular enough to allow quite a safe reconstruction of their parent based on the correspondences among them, provided that the linguistic comparative method is carefully applied. Thus the reconstruction of Proto-Indo-European is reasonably secure.
The reconstruction (and indeed the existence) of its hypothesized parent, Proto-Nostratic, however, is considered dubious by many linguists, as it is not based on a rigorous application of the comparative method (Nichols, 1997), the maximum reach of which is usually reckoned to be around 6000–8000 years (see Long-Range Comparison: Methodological Disputes). The comparative method provides a means for inferring the tree structure of a large family by reconstructing the chronology of innovations relative to the protolanguage. For example, all Germanic languages reflect the so-called Germanic sound shift, a set of changes in consonants relative to the reconstructed system of Proto-Indo-European. It is inferred that this set of changes only occurred once, in Proto-Germanic, and this allows us to insert the Proto-Germanic node into the Indo-European tree (see Subgrouping Methodology). Scholars investigating worldwide linguistic diversity need standardized phylogenetic units to work with. After Morris Swadesh devised the techniques known as lexicostatistics and glottochronology in the 1950s (Swadesh, 1972), there was a wave of optimism about the possibility of quantifying linguistic diversity. Researchers drew trees based on the percentage of putative cognates (words related through shared inheritance, e.g., English house and German Haus) in each pair of languages on Swadesh’s 100- or 200-meaning list, setting percentage ranges for different levels of grouping. For example, in work on Papuan languages two languages were deemed to be dialects of the same language if at least 70% of list items were cognate; at the other extreme, lists with 5–12% cognates were attributed to the same phylum. The basic classification was into dialects, languages, subfamilies, families, stocks, and phyla.
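The lexicostatistical procedure just described can be illustrated with a minimal sketch. It assumes that cognacy judgements for one pair of lects are already available as a list of true/false values, one per meaning on the list; the function names are hypothetical, and only the two percentage bands quoted above for the Papuan work are encoded, since the intermediate bands are not given here.

```python
from typing import Optional, Sequence


def cognate_percentage(judgements: Sequence[bool]) -> float:
    """Percentage of list meanings judged cognate for one pair of lects."""
    if not judgements:
        raise ValueError("the meaning list is empty")
    return 100.0 * sum(judgements) / len(judgements)


def papuan_grouping(pct: float) -> Optional[str]:
    """Map a cognate percentage to a grouping level.

    Only the two bands quoted in the text are encoded; for any other
    percentage the function returns None rather than guess a level.
    """
    if pct >= 70.0:
        return "dialects of one language"
    if 5.0 <= pct <= 12.0:
        return "same phylum"
    return None


# Illustrative input: 75 of 100 meanings judged cognate.
pair = [True] * 75 + [False] * 25
level = papuan_grouping(cognate_percentage(pair))
```

The sketch takes the cognacy judgements as given. How those judgements are reached is a separate matter from the arithmetic, and the result can be no better than the judgements fed into it.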
Today many scholars think such findings have little utility for historical reconstruction, as cognate pairs were often identified on the basis of similarity in form, with no attention to whether the words displayed regular correspondences determined by the comparative method, and cognacy was often not distinguished from borrowing. Wordlists could easily contain gaps and elicitation errors, skewing results. This meant that only lower-level groupings – those in any case obvious by inspection – were reliable. Swadesh assumed that basic vocabulary is replaced at a constant rate, but this assumption no longer has wide acceptance. A language that has undergone more rapid change than its sisters will appear more distantly related to them than it really is. In her study of worldwide typological diversity, Nichols (1992: 24–25) adopted the units ‘family’ and ‘stock.’ The family she defined as a group with about the time depth of one of the older branches of Indo-European (2500–4000 years, e.g., Iranian, Balto-Slavic), recognizable by inspection when regular correspondences between word forms and morpheme paradigms are displayed. The stock is the deepest phylogenetic node at which a protolanguage is reconstructable by the comparative method (5000–8000 years, e.g., Indo-European, Austronesian). Nichols (1997) adds the ‘quasi-stock,’ a grouping of stocks with promising phylogenetic markers but with no regular sound correspondences and few clear cognates. By this rubric, Afro-Asiatic, for example, is a quasi-stock. This approach lacks quantitative support, but it has the advantage that groups of languages under comparison meet the same methodological requirement. The failure of quantitatively defined language groupings has a further consequence. The quantitative approach assumed languages to be composed of dialects, and families to be composed of languages.
Since there is no quantitative distinction between a dialect and a language, linguists sometimes fall back on the assertion that dialects are mutually comprehensible, languages not. But as mutual comprehensibility is also a matter of degree, the distinction has limited objective validity, and ‘lect’ will be used here as a convenient cover term for both (see Language and Dialect: Linguistic Varieties). It follows that if there is no absolute distinction between the dialects of a language and the languages of a family, there is similarly no absolute distinction between a language with dialects and a family.

Complex Internal Structures

The tree structure of some large families, e.g., Indo-European and Austronesian, has been worked out in some detail. The structures of others have not. There are at least three reasons for this. Some putative families such as Afro-Asiatic are generally accepted, even though their time depth is apparently greater than 8000 years. The data reflect too much change and too much divergence to allow a thorough reconstruction of the protolanguage. Other families, such as Sino-Tibetan, have a history of frequent migrations back and forth, in the course of which lects have diversified but remained in contact, influencing each other in ways that make it very difficult to sort out borrowing from shared inheritance (LaPolla, 2001). A third reason is that the relationships among languages within many families are very complicated. In many – and perhaps most – parts of the world, the idealized model of a language family as the outcome of a community of speakers dividing into discrete daughter communities does not fit the data. Instead, a family may arise through the differentiation of an expanding community’s speech into a network of lects, so that speakers of most lects can understand those of communities within a certain radius, but comprehension diminishes the further the speaker moves away from home.
Language families that arise in this way are the subject of Johannes Schmidt’s (q.v.) wave theory, whereby innovations spread out from the center of a network like the ripples when a stone is thrown into a pond. The tree and wave models resist integration into a single model, and this has often been taken as a sign that they are in conflict with each other. This is unfortunate, as they model different phenomena. In the tree model, a community splits, and the lect of one or more of the new communities undergoes innovations that are subsequently inherited into its daughter lects. In the wave model, a community spreads, and innovations spread at different rates through the resulting network. The tree model fits large amounts of Austronesian and Uralic data rather well, while the wave model works better in other areas. However, we would like to be able to examine the history of a language family within a single framework, and the discussion below examines how lectal differentiation can be interpreted within a tree model and how the latter needs to be modified to accommodate it. Dutch and German taken together provide a testbed. They form a network of lects covering the Netherlands and Germany (except the Friesian Islands), nearly half of Belgium, Luxembourg, two-thirds of Switzerland, Liechtenstein, and most of Austria. Until recently, this continuum extended into Alsace and into areas in northern Italy. If we for a moment ignore speakers’ present-day (and often fairly recent) ability to communicate with each other in varieties of either standard Dutch or standard German, then we have a situation in which everyone readily understands nearby lects and in which there are few major boundaries between groups of lects, yet lects at greater distances from each other, especially on a north–south axis, are mutually incomprehensible. How does one split such a network up into languages in order to represent it in a family tree? Not surprisingly, attempts to do so disagree.
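Why such attempts disagree can be seen in a toy model. The sketch below is an illustration under stated assumptions, not a description of the Dutch–German data: lects are numbered along a single chain, and two lects are taken to be mutually comprehensible when they lie at most a fixed number of steps apart. The rule, the radius, and the function names are all hypothetical.

```python
def intelligible(i: int, j: int, radius: int = 2) -> bool:
    """Hypothetical rule: lects i and j on a chain understand each other
    when they are at most `radius` steps apart."""
    return abs(i - j) <= radius


def is_transitive(n_lects: int, radius: int = 2) -> bool:
    """Check whether mutual comprehensibility is transitive on the chain,
    i.e. whether a~b and b~c always imply a~c."""
    lects = range(n_lects)
    return all(
        intelligible(a, c, radius)
        for a in lects
        for b in lects
        for c in lects
        if intelligible(a, b, radius) and intelligible(b, c, radius)
    )


# In a chain of ten lects, 0 understands 2 and 2 understands 4,
# but 0 does not understand 4.
chain_is_transitive = is_transitive(10)
```

Under these assumptions, comprehensibility is not transitive once the chain is longer than the radius allows, so it cannot sort the lects into mutually exclusive ‘languages’. Any boundary drawn through the chain separates lects that understand each other.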
Whatever higher-order groups are posited, each includes lects that are geographically contiguous with resemblant, mutually comprehensible lects in other higher-order groups, reflecting historical relationships that the grouping belies. A tree cannot do justice to the historical relationships. Instead, the best we can do is to redraw Figure 2, which shows the West Germanic part of Figure 1, as Figure 3. However, this avoids too much distortion by representing dialect networks very grossly. Norwegian, Danish, and Swedish similarly form a dialect network, and so does English in Britain and Ireland. The presenting problem with the Dutch–German network is that its speakers (and most linguists) would call its lects dialects, yet the degree of diversity across the network as a whole is intuitively greater than we expect in a single language. In other words, the network is like a (small) family made up of dialects, with no intervening level of ‘language.’ This problem disappears if one recognizes that the distinction between ‘dialects’ and ‘languages’ is artificial and a matter of degree of divergence. But it leaves unanswered the question of whether the Dutch–German network is more appropriately called a language or a family. This question can be answered from two perspectives. One emanates from the line of thought above: the difference between language and family is also one of degree of diversity, and the question is an artifact of an unsubstantiated terminological distinction. But this is a little too simple if we are using ‘family’ recursively, because, ascending the tree, there comes a point at which the diversity within a collection of lects requires us to call it a family.
It might at a pinch be reasonable to call the Dutch–German network a language, but it would be unreasonable to call West Germanic, the next node up the tree, anything but a family, since English, Friesian, Afrikaans, Dutch–German, and Yiddish are all quite distinct from each other. Where does this distinctness reside? Partly in their degree of difference from each other and partly in the fact that they are not linked by transitional dialects. It is tempting to attribute these facts to geography, as the languages are mostly not contiguous, but the crucial factor is the strength of the social boundaries that separate speech traditions. For example, Afrikaans is spoken in South Africa, far from the Dutch lects of which it is an offshoot. Although they remain largely mutually comprehensible, their differences are striking. But if geography were the major factor, we should expect the Englishes of Britain/Ireland and North America to have diverged to a similar degree, yet they have not. After the establishment of British rule in 1795, speakers of the divergent southern African dialect of Dutch became socially isolated from their fellow-speakers in the Netherlands and a separate standard emerged.

Figure 2 A conventional family tree of West Germanic.

Figure 3 A revised family tree of West Germanic.

Looking at the Dutch–German network from the perspective of social boundaries, we find that German speakers recognize dialect groupings such as Low German, Swabian, and Swiss German. Despite fuzzy boundaries, these groupings also have a reality for dialectologists, but no one would normally call them languages. One reason for this is that their social boundaries are also fuzzy, even across national borders. In this context the application of the term ‘language’ is as much a product of political as of linguistic history.
The local lects of speakers who also speak a variety of standard German (in Germany, Switzerland, Liechtenstein, and Austria) are considered to be German dialects. The local lects of those who also speak a variety of standard Dutch (in the Netherlands and Belgium) are considered to be Dutch dialects. The linguistic repertoire of today’s dialect speakers also includes their variety of the standard. In Switzerland, speakers are often diglossic in a Swiss German dialect and Swiss-flavored standard German, i.e., there is a measure of separateness between the two systems. Elsewhere, a speaker’s repertoire is a continuum from a local dialect to a version of the standard, and s/he moves back and forth along the continuum according to whom s/he is speaking with. In these speakers, dialect and standard usually influence each other, with the result that the mutual comprehensibility of local lects across the Dutch–German border is being reduced as speakers accommodate to standard Dutch or standard German, and the standard languages are thereby reinforcing the social boundaries between the political entities with which they are associated. The non-absoluteness of dialect/language and language/family distinctions is also manifest in Sinitic studies. The Sinitic network has conventionally been described as the Chinese language, comprising the Chinese dialects, but the Chinese language is comparable in diversity to the West Germanic family. This terminological situation has arisen because the Chinese language has long been coterminous with the political and social entity of China. The Chinese dialects and Chinese language are now sometimes called the Sinitic languages and the Sinitic family, leaving the term ‘Chinese language’ to denote the standard language. Relationships between dialect and standard and relationships among dialects are not easily rendered in a family tree diagram.
Nor are situations where the historically closest sisters of a language, e.g., Afrikaans, are themselves certain dialects of another language, Dutch. Extensive lectal networks occur in many parts of the world. New Guinea and Island Melanesia host a number, although none with the geographic extent of Dutch–German. Because their speakers’ repertoire usually does not include a related standard language, the dialect/language question is not raised, but the absence of standard languages over millennia has permitted quite complex linguistic events when social boundaries have shifted. The complex history reconstructed in Figure 4 shows one area where networks have broken and later rejoined in a different configuration as the result of shifting boundaries. The basis for this reconstruction is the distribution of innovations. The Austronesian lects of New Ireland, a large island to the east of New Guinea, share certain innovations and appear at first sight to reflect the differentiation of a single lect, Proto-New-Ireland, into a lectal chain running the length of the island, then into a series of small families. But the lects of the southernmost family share other innovations with lects spoken on islands to the southeast. First, it seems, a single speech community rapidly settled New Ireland, and their speech differentiated into a chain of lects. Initially, population density was low, and social boundaries emerged between lects. The southernmost lect underwent certain innovations, and then some of its speakers sailed east and settled Nissan Island. From Nissan they established settlements on Buka Island to the south. The result was the South New Ireland/Northwest Solomonic network. Initially, frequent contact was maintained across the new network, more frequent than with other communities in New Ireland. 
Later on, populations increased, and the south New Ireland community resumed contact with communities to its north and was reintegrated into a New Ireland lectal network, through which innovations now passed (i.e., limited koineization occurred). Links with Nissan diminished to an annual voyage. The speech of the northwest Solomons community underwent certain innovations and became Proto-Northwest-Solomonic, then its speakers scattered through parts of Bougainville and the New Georgia group and across Choiseul and Santa Isabel to form the communities where the member lects of today’s Northwest Solomonic family are spoken (Ross, 1988: 216–218, 258–259, 293–313). Although the lects of New Ireland look as if they form a single node in the tree, the analysis of shared innovations reveals a much more complex history. A similar but more complex series of events has occurred in the history of the Fijian lects and the genesis of Proto-Polynesian (Geraghty, 1983, summarized by Ross, 1997: 227–229). Cases such as New Ireland and Fiji raise a question of method. If, as in southern New Ireland, two sets of innovations overlap in a language or a group of languages, then they cannot both reflect a shared ancestor. One or both sets must reflect diffusion across lectal boundaries. How does one determine whether one set (or neither) reflects a shared ancestor? The answer lies in the nature of the innovations. The innovations common to the languages of southern New Ireland and the northwest Solomons entail bound morphology, which is only very rarely subject to diffusion. The innovations common to New Ireland as a whole do not, and are candidates for diffusion.

Figure 4 Network breaking and joining in New Ireland (after Ross, 1997: 230).

Mechanisms of Diversification

A family tree diagram can lull one into the false assumption that diversity occurs between lects, not within them, and that lects at terminal nodes are homogeneous.
However, we know that an individual’s speech varies – for example, along a continuum from local lect to standard – and that the degree to which variables are manifested differs among speakers. A speaker’s choice of variables on particular occasions is in a complex relationship to her personal social network. We can distinguish between primary and secondary networks (Nettle, 1999: 67). At least in traditional societies, a person’s primary network is likely to be dense, multiplex, kin-based, and enduring, whereas links in the secondary network have single functions such as trade. This distinction is a simplification, but a helpful one. Grossly, the primary network determines the features of a person’s speech, while the secondary network mediates changes in those features. Social network research shows that speakers are likely to use variables – pronunciations, words and phrases, grammatical constructions – which are used by and identify them with others in their primary network, although age and gender are also determinants (Milroy, 1987, 2001). The choice of variables on a particular occasion is also biased by the nature of that occasion and by the speech of the interlocutor. There is thus some diversity in even the smallest unit of a network, namely a speaker’s primary links. How do innovations enter a tight-knit primary network? The answer seems to be that its members typically have weak ties with outsiders through their secondary network, who in turn, because of their social roles, have weak ties with members of other primary networks, and it is these multiple-weak-tie individuals who are less subject to the norm enforcement of a primary network and act as carriers of innovation across the larger network (Milroy and Milroy, 1985).
How an innovation begins is difficult to investigate, but it seems that a speaker repeats a variant (random or deliberate) which is acquired by the speaker’s children and/or copied by other speakers and selected as the marker of a social group (Weinreich et al., 1968; Milroy, 1992: 170; McMahon, 1994: 251; Croft, 2000: 44–78). Among the Takia of Karkar Island (Papua New Guinea), there is a division between coastals and inlanders. Coastals distinguish the phonemes /l/ and /r/. Inlanders merge them as /l/. Every Takia adult knows about this difference. Comparative evidence shows that the innovators are the inlanders. The merger of /l/ and /r/ as inland /l/ must have occurred randomly in the speech of a single speaker, been acquired by others, selected as an inlander marker, and carried from one primary network to another by inlanders with multiple secondary links. Nettle (1999: Chaps. 2–3) finds that an innovation is unlikely to catch on without the amplification afforded by social selection. When a speech community becomes two or three communities and contact among the new communities is reduced, their speech may diverge and new lects appear. The divergent changes are driven by social selection. The deciding factor behind divergence is the strength of social boundaries – the weakening of links between social networks so that new social identities emerge. Some languages are far more resistant to change than others. This difference has been attributed to social network structure. Milroy and Milroy (1985) propose that when the speakers of a language form a set of overlapping tight primary networks, i.e., a tight-knit speech community, with only weak social links to other groups, the language typically changes very slowly. The paradigm case is Icelandic, little changed since early medieval Old Norse, unlike its radically transformed sisters Norwegian, Danish, and Swedish. At the opposite extreme are papuanized Austronesian languages in mainland New Guinea. 
Their speakers entered into symbiotic relationships with speakers of Papuan languages so that their secondary links were with Papuan speakers whose alien speech was the source of extensive innovations (Ross, 1996). However, Labov (2001: Chap. 10) offers an occupation-based interpretation of data sets like the Milroys’, and there are relatively isolated communities (e.g., in Polynesia) where change has been quite fast, and apparently even tight-knit communities where speakers have exaggerated differences from their neighbors, i.e., accelerating change to maintain isolation (Anderson, 1988; Thurston, 1989; Ross, 2003). Nettle (1999: 66–78) argues that economic (inter)dependence is the main reason for language spread. He contrasts the many small speech communities of inland New Guinea, each of them economically largely independent, with the widespread communities of the Hausa, the Fulani, and the Tuareg on the southern edge of the Sahara, where survival requires distant economically based relationships. Hunter-gatherer pygmy groups in central Africa depend on their farmer neighbors to supplement their diet, and each group of pygmies has shifted to the language of the farming group with which it is paired. Hunter-gatherer negrito groups in the Philippines have adopted Austronesian languages from farmer neighbors (Reid, 1994). The difference between these situations and the one which led to the papuanization of Austronesian languages in New Guinea was probably a difference in the strength of primary network links: these links were stronger among Austronesian-speaking horticulturalists than among hunter-gatherers.
With explicable exceptions, language density is greater nearer the Equator than further away from it, perhaps because it is correlated with ecological risk: the more difficult it was to sustain human life, the larger the economic and therefore linguistic networks that came into being (Nettle, 1999: 60–66, 79–93). This economy–language pairing continues in more recent events. The expansion of European colonizers across the planet since the 15th century caused a devastating reduction in language diversity, now being intensified by the push toward economic globalization.

The Origins of Phylogenetic Diversity

The origins of the phylogenetic diversity of languages lie far back beyond knowability. Are all today’s language families descended from a single ‘Proto-World’? Many people have assumed so, not on linguistic evidence but on the basis of a single human origin in east Africa. The comparative method allows us to reconstruct linguistic history only as far back as about 8000 years, yet structurally modern language has probably been around for 100 000 years. Nichols (1992: 221–230) finds a geographic patterning of morphosyntactic features in the world’s languages, which, she suggests, are fossil reflexes of the original spread of languages. She traces a path from the Old World into the Pacific and rapidly south to Australia, followed by circum-Pacific and New World colonization. Those who entered the Americas across the Bering Strait were related to a circum-Pacific group, not an Old World population. Using Nichols’ stocks, Nettle (1999: 113–129) divides the world into nine regions and plots stock and language densities for each of them. Setting aside the New Guinea figures, which are significantly higher than anywhere else, he finds that stock (phylogenetic) density in Africa, Europe, and Asia is far lower than in the rest of the world. In the rest of the world there is a simple correlation: the more languages, the more stocks.
This situation is claimed to reflect time depth. Over time, interaction between human populations has led through language shift to stock extinctions. On archaeological estimates, the time depth of human language in Africa is more than 100 000 years; it ranges from 60 000 to 40 000 years in Europe, Asia, Australia, and New Guinea; and is as little as 12 000 years in the Americas (although this dating is controversially recent). Nettle’s stock densities are in an approximate inverse relationship to these time depths. Bellwood (1997) argues that the Neolithic transition, i.e., the transition from foraging to agriculture, caused the expansion of some of the world’s largest language families, presumably leading to increased extinctions of languages and stocks. He notes four significant cultivation events that apparently caused the expansion of languages into often large families, probably replacing their own closest relatives as well as other families in the process:
1. Wheat, barley and legumes in the Fertile Crescent by 8000 BC: Afro-Asiatic, (controversially) Indo-European, Elamo-Dravidian, Kartvelian.
2. Taro and bananas in the highlands of New Guinea by 8000 BC (Denham et al., 2003): Trans-New Guinea family, limited to the New Guinea region.
3. Rice, foxtail and broomcorn millet in the Yangtze and Yellow River Basins by 6000 BC: Austronesian, Tai-Kadai, Austro-Asiatic, Sino-Tibetan, Hmong-Mien.
4. Sorghum and pearl millet in sub-Saharan Africa by 2000 BC (?): Nilo-Saharan and Niger-Congo.
These events may have sharply reduced the world’s phylogenetic diversity. Relics of what was there before perhaps survive in Basque in the Pyrenees, in the diversity of the language families of the Caucasus, and of parts of New Guinea and parts of the Americas (see Nichols, 1997). In Diamond and Bellwood (2003), the Neolithic-transition hypothesis is extended across the world, but somewhat tentatively, and wisely so.
Although it is probably correct for Austronesian, and no doubt elsewhere, it would be wrong to overgeneralize it. Nilo-Saharan and Niger-Congo are not necessarily phylogenetic units (Nichols, 1997). Pama-Nyungan in Australia, Uralic and Chukchi-Kamchatkan in northern and central Eurasia, Khoisan in southern Africa, and Athabaskan and Eskimo-Aleut in northwest North America have all expanded without a Neolithic transition, although there may have been other economic reasons for their spreads. Various agricultural families have not expanded significantly: Ramu–Lower Sepik in New Guinea, Nakh-Daghestanian and Kartvelian in the Caucasus, and various families in the Americas (Campbell, 2002). The putative Indo-European agricultural expansion (Renfrew, 1987) has been called into question by Nichols (1998), who brings a cohort of arguments, linguistic, geographic and historical, for a homeland in western Central Asia around 3500 BC. This is part of a larger argument that stock densities are also heavily influenced by geographic factors (Nichols, 1997), to which Campbell (2002) would add differences in social behaviors among speech communities.

A New Evolutionary Model

Recently, an evolutionary model has been proposed in which the replicator is what Croft (2000) calls the lingueme. Its basis is a general theory of evolution applicable to change in social phenomena as well as in biology. There are three versions of the model (space compels us to ignore their differences): Nettle (1999), Croft (2000), and Enfield (2003). Croft’s is the most articulated version. A lingueme is any unit of linguistic structure, be it a phoneme, a morpheme, a word or phrase, or a construction, i.e., the units which are the parameters of intra- and interspeaker variation. The linguistic counterpart of the biologist’s DNA string is the utterance, a structured set of linguemes.
The speaker, as the repository of grammar, corresponds to the biological organism (the interactor). As in network theory, speakers form a networked population, and language change occurs when a lingueme is propagated across the network in altered form and becomes manifest in speakers’ utterances. Croft (2000: Chaps. 5–6) builds in a paradigm of possible innovation types as well.
What is new and what gives the model its integrative power is the view of the lingueme as the replicator. The language is deliberately left out of focus, recognizing that a speaker’s repertoire is a structured collection of variables that do not necessarily have common geographical boundaries. The dialect vs. language vs. family issue does not arise.
What is somewhat uncertain, however, is where this leaves the grammar. Enfield stresses that language learners infer “grammatical patterns” from others’ behavior. The idea that a single grammar is shared by speakers or is transmitted from one generation to the next is for him a cultural illusion. Linguistic signs “are best understood as theories, constructed by individual speakers over time by a process of trial and error” (Enfield, 2003: 2–3). Sets of signs come to cohere as systems in individuals’ minds, and speakers’ grammars have a lot in common because of the need for coordination. Croft (2001: 29) grants that the evolutionary model implies looser grammatical organization than either the structuralist or the generative model, but attributes “a high degree of structural organisation” to the lingueme pool. He asserts that through the utterances she hears, a child inductively acquires syntactic constructions, and then from these infers a taxonomic network of constructions (Croft, 2001: 57–58).
Nettle, Croft, and Enfield all give pride of place to contact phenomena as a justification for the new model.
They point out that if the replicator is the lingueme, then it does not matter whether an altered lingueme has its origin in what speakers recognize as “their own language” or “another language.” The process of change is the same. Under the family tree model, a language is only allowed to have one parent, and contact phenomena, they imply, tend to be pushed under the carpet. This is true, but the emphasis on contact risks overlooking what is modeled in a family tree, namely the generational continuity of a language. Although grammar is not itself a replicator and is reconstructed by each new speaker, parents and children usually have a clear sense that they are speaking the same language. Furthermore, contact in most cases entails bilingualism, which causes the linguemes of one language to be modified under the influence of the other. The modified language is usually still recognized as the same language by the next generation, whether altered linguemes have a language-external or language-internal origin. This is so even for the papuanized Austronesian languages of New Guinea. The generational continuity depicted by a family tree is only broken when a social catastrophe occurs, as when the transportation of Melanesians to far-off plantations from the 1860s led to the rapid stabilization of Pacific Pidgin, a language with no previous existence as a system (Ross, 1997: 251–253). Language contact may have been marginalized by the family tree model, but it would be a pity if generational continuity were marginalized under the new model.
See also: Cladistics; Contact-Induced Convergence: Typology and Areality; Cultural Evolution of Language; Early Historical and Comparative Studies; Evolutionary Theories of Language: Current Theories; Evolutionary Theories of Language: Previous Theories; Fijian; Genetics and Language; Germanic Languages; Labov, William (b.
1927); Language and Dialect: Linguistic Varieties; Language Change and Language Contact; Language/Dialect Contact; Long-Range Comparison: Methodological Disputes; Microparametric Variation; Origin and Evolution of Language; Papua New Guinea: Language Situation; Phonemics, Taxonomic; Solomon Islands: Language Situation; Subgrouping Methodology; Variation and Language: Overview; Variation in German.
Bibliography
Andersen H (1988). ‘Centre and periphery: adoption, diffusion and spread.’ In Fisiak J (ed.) Historical dialectology. Berlin: Mouton de Gruyter. 39–85.
Bellwood P (1997). ‘Prehistoric cultural explanations for the existence of widespread language families.’ In McConvell P & Evans N (eds.) Archaeology and linguistics: Aboriginal Australia in global perspective. Melbourne: Oxford University Press. 123–134.
Campbell L (2002). ‘What drives linguistic diversification and language spread?’ In Bellwood P & Renfrew C (eds.) Examining the farming/language dispersal hypothesis. Cambridge: McDonald Institute of Archaeological Research. 49–63.
Croft W (2000). Explaining language change: an evolutionary approach. Harlow: Pearson Education.
Croft W (2001). Radical construction grammar: syntactic theory in typological perspective. Oxford: Oxford University Press.
Denham T P, Haberle S G, Lentfer C, Fullagar T, Field J, Therin M, Porch N & Winsborough B (2003). ‘Origins of agriculture at Kuk Swamp in the Highlands of New Guinea.’ Science 301, 189–193.
Diamond J & Bellwood P (2003). ‘Farmers and their languages: the first expansions.’ Science 300, 597–603.
Enfield N J (2003). Linguistic epidemiology: semantics and grammar of language in mainland Southeast Asia. London: RoutledgeCurzon.
Geraghty P (1983). The history of the Fijian languages. Oceanic Linguistics special publication No. 19. Honolulu: University of Hawaii Press.
Labov W (2001). Principles of linguistic change. 2: Social factors. Oxford: Blackwell.
LaPolla R J (2001). ‘The role of migration and language in the development of the Sino-Tibetan languages.’ In Aikhenvald A & Dixon R M W (eds.) Areal diffusion and genetic inheritance: problems in comparative linguistics. Oxford: Oxford University Press. 225–254.
McMahon A (1994). Understanding language change. Cambridge: Cambridge University Press.
Milroy J (1992). Linguistic variation and change: on the historical sociolinguistics of English. Oxford: Blackwell.
Milroy J & Milroy L (1985). ‘Linguistic change, social network and speaker innovation.’ Journal of Linguistics 21, 339–384.
Milroy L (1987). Language and social networks (2nd edn.). Oxford: Blackwell.
Milroy L (2001). ‘Social networks.’ In Chambers J K (ed.) The handbook of language variation and change. Oxford: Blackwell. 549–572.
Nettle D (1999). Linguistic diversity. New York: Oxford University Press.
Nichols J (1992). Linguistic diversity in space and time. Chicago: Chicago University Press.
Nichols J (1996). ‘The comparative method as heuristic.’ In Durie M & Ross M D (eds.) The comparative method reviewed: regularity and irregularity in language change. New York: Oxford University Press. 39–71.
Nichols J (1997). ‘Modeling ancient population structures and movement in linguistics.’ Annual Review of Anthropology 26, 359–384.
Nichols J (1998). ‘The Eurasian spread zone and the Indo-European dispersal.’ In Blench R M & Spriggs M (eds.) Archaeology and language, 2: Correlating archaeological and linguistic hypotheses. London: Routledge. 220–266.
Reid L A (1994). ‘Unravelling the linguistic histories of Philippine negritos.’ In Dutton T E & Tryon D T (eds.) Language contact and change in the Austronesian world. Berlin: Mouton de Gruyter. 443–475.
Ross M D (1988). Proto Oceanic and the Austronesian languages of western Melanesia. Canberra: Pacific Linguistics.
Ross M D (1996). ‘Contact-induced change and the comparative method: cases from Papua New Guinea.’ In Durie M & Ross M D (eds.) The comparative method reviewed: regularity and irregularity in language change. New York: Oxford University Press. 180–217.
Ross M D (1997). ‘Social networks and kinds of speech-community event.’ In Blench R M & Spriggs M (eds.) Archaeology and language, 1: Theoretical and methodological orientations. London: Routledge. 209–261.
Ross M D (2003). ‘Diagnosing prehistoric language contact.’ In Hickey R (ed.) Motives for language change. Cambridge: Cambridge University Press. 174–198.
Ross M D (2005). ‘Pronouns as a preliminary diagnostic for grouping Papuan languages.’ In Pawley A, Attenborough R, Hide R & Golson J (eds.) Papuan pasts: Investigations into the cultural, linguistic and biological history of the Papuan-speaking peoples. Canberra: Pacific Linguistics.
Swadesh M (1972). The origin and diversification of language. London: Routledge and Kegan Paul.
Thurston W R (1989). ‘How exoteric languages build a lexicon: esoterogeny in West New Britain.’ In Harlow R & Hooper R (eds.) VICAL 1, Oceanic languages: papers from the Fifth International Conference on Austronesian Linguistics. Auckland: Linguistic Society of New Zealand. 555–579.
Weinreich U, Labov W & Herzog M (1968). ‘Empirical foundations for a theory of language change.’ In Lehmann W P & Malkiel Y (eds.) Directions for historical linguistics. Austin: University of Texas Press. 95–195.