Lexical homologies (cognates) Classification and the tempo of language evolution English here sea water when German hier See, Meer Wasser wann French ici mer eau quand Italian qui, qua mare acqua quando Greek edo thalasa nero pote Hittite ka aruna- watar kuwapi Quentin D. Atkinson Meaning Institute of Cognitive and Evolutionary Anthropology, University of Oxford English 1 0 0 0 1 0 0 0 1 0 0 1 German 1 0 0 0 1 1 0 0 1 0 0 1 French 0 1 0 0 0 1 0 0 0 1 0 1 Italian 0 1 0 0 0 1 0 0 0 1 0 1 Greek 0 0 1 0 0 0 1 0 0 0 1 1 Hittite 0 0 0 1 0 0 0 1 1 0 0 1 here sea water when Image from cover of Nature 449, 11 October, 2008 The IndoEuropean Language Family Tree Gray and Atkinson, Nature, 2003 Swadesh 200 word list 2449 cognate sets Likelihood model of cognate birth/death Branch-lengths = change Phylogenetic uncertainty I-E tree showing variation in rates of lexical replacement “One” “Ear” “Sand” Some examples of meanings with small and large numbers of cognate sets ROMANCE CELTIC GERMANIC SLAVIC Cognate sets Examples 1 two, three, five, I, who 2 one, four, we 3 how 4 name, tongue 6 ear, night, thou 10 day, to live, mother, salt, when 27 bark (of a tree), to count, to dig, to float, to flow, if, rub, sand, straight, woods INDOIRANIAN 46 dirty (the most variable word) GREEK 1 Estimating rates of word evolution on a phylogeny Coding the cognate data Languages meanings English here sea water German hier see, meer wasser 0 wan French ici mer eau 1 quand quand Italian qui, qua mare acqua 1 quando acqua quando Greek edo thalasa nero 2 pote thalasa nero pote Hittite ka aruna- watar 0 kuwapi aruna- watar kuwapi English here sea water when German hier See, Meer Wasser wann French ici mer eau Italian qui, qua mare Greek edo Hittite ka English 0 0 0 0 German 0 0, 1 0 0 French 1 1 1 0 Italian 1 1 1 0 Greek 2 2 2 0 Hittite 3 3 0 0 transition model (e.g., water) 0 when phylogeny q01 1 0 q10 q21 q20 + q12 q02 2 numerical estimates of transition rates, q (scaled as expected changes per ten thousand years) Distribution of word replacement rates (rates of lexical evolution) What predicts variation in rates of evolution? genes directional versus purifying selection (conserved and non-conserved elements), expression levels, population size words word frequency Paul (1880) and Zipf (1947), but not tested. 100-fold rate variation Correlated rates in Bantu (Pagel & Meade, 2006) Correlations between frequencies of word use Spoken word frequency in the British National Corpus 350 300 N=4840 words mean = 194 geometric mean = 35.94 median = 25 Count 250 200 150 100 50 0 1 1.5 2 2.5 3 3.5 4 4.5 log(10) of spoken word frequency per million Figure from Pagel et al., Nature, 2007. 2 Frequency vs rate of lexical evolution Parts of speech conjunctions ---prepositions ---adjectives ---verbs ---nouns ---special adverbs---pronouns ---numbers ---- r=-0.37 r=-0.35 r=-0.41 r=-0.32 R2 =0.48 R2 =0.48 R2 =0.50 R2 =0.48 Figure from Pagel, Atkinson & Meade, Nature, 2007 Two models of how frequency influences the rate of lexical evolution i) reduced mutation ii) matching-purifying model Word frequency distribution for “Thunderstorm” thunderstorm n=192 different words for ‘thunderstorm’ in a population of Midwest American speakers. adoption of variants bugs meaning or concept + arbitrary sound e.g., “rabbit” lagomorph thundershower mutation word = “rabbit” innovation Storm, thundercloud Electrical storm, thunder gust bunny Cat squall, thundering in the molly hole, yawl Peter hare Tempo and classification Word frequency distribution for “Thunderstorm” Frequency of word use together with part of speech accounts for 50% of variation in rates of evolution across 87 languages representing ~130,000 language-years of evolution Frequency may act as a linguistic form of ‘purifying selection’ affecting the choice of words Power law curve - bias against infrequent words The mechanism is expected to operate similarly across all languages and time scales, and makes predictions about specific meanings. (e.g. Indo-European and Bantu correlation). Some insights for classification languages evolve initially in less frequently used parts of vocabulary, retaining mutual intelligibility for longer high frequency words may be less likely to be borrowed cultural replicators can evolve more slowly than some human genes (e.g., compare “five” with lactase gene) -- some words persisting for tens of thousands of years slow evolution raises possibility of deep linguistic reconstructions 3 The IndoEuropean Language Family Tree Gray and Atkinson, Nature, 2003 long periods of stability or stasis followed by short punctuational bursts associated with speciation Punctuated Equilibrium and the fossil record Eldredge and Gould 1972 Swadesh 200 word list 2449 cognate sets Likelihood model of cognate birth/death Branch-lengths = change Phylogenetic uncertainty Species Formation through Punctuated Gradualism in Planktonic Foraminifera Bjorn A. M almgren; W. A. Berggren; G. P. L ohmann. S cience, 225 (4659): 317-319. Path lengths may contain components derived from punctuational and gradual processes Do the data sets show evidence of a punctuational effect (β > 0)? • • ! >0 each language splitting event makes some contribution to path length ! =0 path length accumulates as a function of time • Austronesian • punctuational effect in ~100% of trees Bantu • punctuational effect in ~98% of trees Indo-European • punctuational effect in ~67% of trees path length = n! + g Figure from Atkinson et al., Science, 2008 Rate variations through time… • language splitting events have a punctuational effect on lexical evolution • This effect is substantial, and potentially a ubiquitous property of language evolution • There may be more than one process causing punctuational language change… • Founder effect • small founder populations evolve more quickly. • Fits with computer simulations of (Nettle, 1999) • “The underlying cause of sociolinguistic differences…is the human instinct to establish and maintain social identity” Chambers (1995, p 250) • Martha’s Vineyard (Labov, 1963) • Noah Webster - “as an independent nation, our honor requires us to have a system of our own, in language as well as government” - Noah Webster, Dissertations on the English Language (1789, p. 20). Conclusions • Models of language evolution are useful for classification, but can also be used to test hypotheses about the processes underlying language change • We can feed what we learn back into the models to develop better classification and a better understanding of the human language 4 Thanks to: Mark Pagel Chris Venditti Andrew Meade Russell Gray Simon Greenhill The Leverhulme Trust 5
© Copyright 2026 Paperzz