HUMAN DIVERSITY AND LANGUAGE DIVERSITY
WILLIAM S-Y. WANG
Department of Electronic Engineering, City University of Hong Kong
Since language is the defining trait of our species, human evolution and
linguistic evolution are obviously closely intertwined. Recent studies in
genetics suggest that anatomically modern humans emerged at a very late
date, perhaps 50,000 years ago (Bertranpetit 2000; Thompson et al. 2000).
This dating is consistent with the onset of an unprecedented degree of
cultural innovation, in both quality and quantity, as revealed in the
archaeological record (Klein 1999). We share the belief with many students
of human prehistory that the evolution of anatomically modern humans, the
emergence of language, and the burst of cultural innovations, including
extensive cave art and sailing across broad expanses of water, are events
which are all closely linked to each other.
Culturally, there have been several major transitions separating us from
our prehistoric ancestors, such as the use of fire, the invention of tools,
and the advent of agriculture. Similarly, there must have been major
transitions which led from the primitive growls and howls of our ancestors
to the intricate languages we have today. We cannot recover language
evolution in the very distant past in ways comparable to those of the
archaeologist, since the earliest 'material remains' of language, i.e.,
ancient texts, date back no farther than several millennia. However,
linguists have developed methods of reconstruction and taxonomy which are
helpful toward an interdisciplinary understanding of the diversity of
peoples.
Indeed the identity of a people is often intimately coupled to the
language it speaks. Linguistic grouping has been taken, time and again, to
be the first criterion for sorting out human diversity. The celebrated
diagram published by Cavalli-Sforza et al. (1988), comparing a genetic tree
with a linguistic tree, was an eloquent statement on the important
parallelisms between genetic evolution and linguistic evolution on a global
scale. More locally, when a method developed to quantify genetic affinity
was applied to a chain of languages in Micronesia, it was found to yield
comparable results (Cavalli-Sforza and Wang 1986, reprinted in Wang 1991).
At the same time, however, languages and genes do go their separate
ways, and such cases are not hard to find. When one ethnic group conquers
another ethnic group, the common language eventually arrived at may be
that of the conqueror, or that of the conquered. The latter is clearly the
case with the Manchus, an Altaic people from northeastern China who founded
the Qing dynasty and ruled the entirety of China for nearly 300 years.
Although there are numerous monuments and documents which attest to the
glory of their long reign, the Manchu language has been all but replaced by
the language of the Han majority. Li (2000, p. 15) describes the situation this
way:
"A survey done in the People's Republic of China in the 1950's found that
quite a few elderly Manchus who lived in the more remote regions of
Manchuria could still speak Manchu. Those over thirty years old were
likely to understand it, while the younger generation could neither speak or
[sic] understand it. Since then, anthropologists and linguists doing
research in northern Manchuria have been reporting on a rapidly
dwindling number of Manchu speakers. By the 1990s Manchu speakers
have become nearly non-existent."
Such cases of language displacement, by no means rare, remind us that
genes and languages can and do go separate ways. While they match in the
default case, we should not be disturbed when their phylogenies do not
agree. In fact, the cases of mismatch are in a sense more interesting since
they may reveal displacement events long ago which would be difficult to
uncover otherwise.
Potential contributions from linguistics on the question of human
diversity come under three headings:
1. To establish genetic groups and subgroups of languages.
2. To locate the homeland of speakers of ancient languages.
3. To date splits among languages.
The study of language prehistory has a distinguished tradition in many
cultures. In China, reconstructing the rhymes of ancient poetry reached a
high level of scholarship in the 16th century. In the West, historical
linguistics traces its roots to a famous lecture given in 1786 by William
Jones. The following paragraph, with which he announced the genetic
relatedness among some of the languages in Europe and in Asia, is perhaps
the most often quoted in linguistics:
"The Sanskrit language, whatever be its antiquity, is of a wonderful
structure; more perfect than the Greek, more copious than the Latin, and
more exquisitely refined than either, yet bearing to both of them a stronger
affinity, both in the roots of verbs and in the forms of grammar, than could
possibly have been produced by accident; so strong indeed, that no
philologer could examine them all three, without believing them to have
sprung from some common source, which perhaps no longer exists; there is
a similar reason, though not quite so forcible, for supposing that both the
Gothick and the Celtick, though blended with a very different idiom, had
the same origin with the Sanskrit; and the old Persian might be added to
the same family, if this were the place for discussing any questions
concerning the antiquities of Persia..." (Quoted in Cannon 1991: 31)
Building upon Jones's insight, a great deal has been achieved toward
clarifying the relationships among the 6000 or so languages spoken in the
world today. The reconstruction of Proto-Indo-European, the "common
source" that Jones conjectured in the above paragraph, together with the
light it sheds on civilizations of some 7,000 years ago, has become a
standard in scholarship to be emulated everywhere. Many proto-languages of
similar time depths have been reconstructed.
Currently, there is a spectrum of positions on how much time depth is
recoverable in language for determining genetic relationships. At one end
of the spectrum, some linguists have been reluctant to venture beyond the
time depth established by Indo-European studies. Since a living language is
constantly changing, these linguists believe that nothing reliable will be
left of the original language after 7,000 years to be of diagnostic value.
Although this ceiling of 7,000 years has never been objectively justified,
it seems to reflect a bias from Indo-European studies. At the other end of
the spectrum, some linguists propose global etymologies, roots of words
which can be found in all major phyla. These linguists believe that all the
world's languages can be traced to a single monogenetic source.
While monogenesis is the dominant view today, probabilistic
considerations actually favor a scenario in which language was invented
independently at many sources, i.e., polygenesis (Freedman and Wang
1996). In pondering these issues, we should also take into account the
effects of global events such as major glaciations, which must have
scrambled human populations extensively by forcing distant migrations. It
would be difficult to establish linguistic lineages across such barriers of
panmixia.
Although methods of taxonomy are not nearly as well developed in
linguistics as in biology, a general picture is nonetheless emerging,
largely thanks to the pioneering efforts of Joseph H. Greenberg of Stanford
University. Figure 1 shows the dozen or so phyla he proposes for the
languages of the world. This classification is discussed by Ruhlen (1991).
While most of the details remain to be worked out, his proposal is the first
major framework within which future research can be anchored. The phylum
that Greenberg has been investigating in depth himself is one he calls
Eurasiatic. As shown in Figure 2, the Eurasiatic phylum has Indo-European
as one of its branches, but also comprises many other branches,
including the enigmatic Ainu language, which has been considered by most
to be a linguistic isolate. Greenberg's results (2000), which have just
been published, are sure to elicit very different responses from linguists
of various persuasions.
Quite independently of Greenberg's research, a group of Russian linguists,
led by the late Illich-Svitych, has also proposed a large phylum of
languages, which they call Nostratic. For some discussion of the Nostratic
proposal, see the anthology edited by Salmons and Joseph (1998). It is
instructive to compare the memberships of the two proposals, as seen in
Table 1. Much of the original work on the two proposals was done during
the decades when communication across the continents was hampered by
political curtains, and the sharing of data was difficult. Recent years have
seen closer interactions between the linguists of the U.S. and Russia, with
the encouraging result of increasing convergence in their views.
Another phylum of great interest is Dene-Caucasian. The proposal by
Sergei Starostin (1990), a linguist at the Moscow State University, is shown
in Figure 3. Again, while some members of the phylum may be firmly
established, such as Sino-Tibetan, much work needs to be done for the
proposal to reach general acceptance. An example of recent progress here is
the finding of Ruhlen (1998) on Yeniseian and Na-Dene, which are two
branches of Dene-Caucasian. This finding of 36 common etymologies is
of special interest since it definitively connects languages which are
currently distributed on opposite sides of the Pacific.
Figure 1. The language phyla of the world, proposed by Joseph H. Greenberg.

Table 1. Comparison between two classifications, Nostratic and Eurasiatic

                      Nostratic(1)   Eurasiatic(2)
Afro-Asiatic               ✓
Elamo-Dravidian            ✓
Kartvelian                 ✓
Indo-Hittite               ✓               ✓
Uralic-Yukaghir            ✓               ✓
Altaic                     ✓               ✓
Korean                                     ✓
Japanese                                   ✓
Ainu                                       ✓
Gilyak                                     ✓
Chukchi-Kamchatkan                         ✓
Eskimo-Aleut                               ✓

(1) Illich-Svitych, 1971-1984
(2) Greenberg, 2000

There is still no consensus regarding the distant affiliations of the
Chinese language. This is reflected in a monograph edited by Wang (1995),
in which E. G. Pulleyblank discusses the connection between Chinese and
Indo-European, and Laurent Sagart (see Wang 1995) discusses that between
Chinese and Austronesian. In the same monograph, Starostin shows the number
of basic words, defined by Sergei Yakhontov (see Wang 1995), shared among
these language groups. In Table 2, Starostin's numbers have been converted
to percentages. It can be seen there that the subset of Chinese,
Tibeto-Burman, Caucasian and Yeniseian does show a significantly closer
relationship internally than any member has with either Indo-European
or Austronesian.
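The conversion from shared-word counts to percentages can be sketched as follows. This is only an illustration, not Starostin's own procedure: the function name `to_percent` is hypothetical, and the counts are not taken from the source. They are back-calculated so that, on a 35-word list, they round to the Old Chinese column of Table 2 (74, 43, 34, 23, 14).

```python
# Sketch: convert counts of shared basic words into percentages of a word list.
# LIST_SIZE follows the 35-word Yakhontov list mentioned in the text.
LIST_SIZE = 35


def to_percent(shared_count, list_size=LIST_SIZE):
    """Return the share of the list that is apparently cognate, as a whole percent."""
    return round(100 * shared_count / list_size)


# ASSUMED counts of apparent cognates with Old Chinese (not from the source):
# chosen so that they reproduce the Old Chinese column of Table 2.
assumed_counts = {
    "Proto-Tibeto-Burman": 26,
    "Proto-North-Caucasian": 15,
    "Proto-Yenisseian": 12,
    "Proto-Indo-European": 8,
    "Proto-Austronesian": 5,
}

percentages = {group: to_percent(n) for group, n in assumed_counts.items()}
for group, pct in percentages.items():
    print(f"{group}: {pct}%")
```

With only 35 items, one word more or less shifts a figure by about three percentage points, so small differences between cells should not be over-read.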
The Dene-Caucasian languages are largely found in the north, the major
exception being some Tibeto-Burman languages which have migrated deep
into Southeast Asia. In complementary distribution to the linguistic
developments in northern Asia, the languages of southern Asia have become
grouped under the phylum Austric. The reality of this phylum has been
considerably strengthened in recent years with the discovery of
morphological correspondences by Reid (1994). The Austric phylum is a
far-flung group, comprising well over 1000 languages. According to Ruhlen
(1991), the major subgroups are as follows:
I. Miao-Yao
II. Austro-Asiatic
    a. Munda
    b. Mon-Khmer, e.g. Wa, Vietnamese.
III. Austro-Tai
    a. Daic, e.g. Zhuang, Thai, Lao.
    b. Austronesian.
        i. Eastern = Oceanic, e.g. Hawaiian.
        ii. Western, e.g. Malagasy, Tagalog.
A leading authority on Austric languages is Robert Blust (1996) of the
University of Hawaii. Although Blust's latest classification of Austric may
differ somewhat from that of Ruhlen, he offers the following approximate
dates of divergence, which provide a useful temporal framework.
Proto-Austric        8,500 BP
Proto-Austronesian   6,500 BP
Proto-Oceanic        4,000 BP
Reviewing the archaeological evidence, Blust suggests that the last unity of
the Austric phylum may have been at the Yunnan-Burma border, splitting
into various families, which then spread into South China and Southeast
Asia. The paths these early migrants took probably followed the courses of
the great rivers of Asia.
Table 2. The relation of Chinese to other groups of languages, shown as the
percentage of apparent cognates from the 35-word list of Yakhontov

                                OC    PTB   PNC
Old Chinese (OC)
Proto-Tibeto-Burman (PTB)       74
Proto-North-Caucasian (PNC)     43    51
Proto-Yenisseian (PY)           34    40    57
Proto-Indo-European (PIE)       23    14    17
Proto-Austronesian              14    11    11

PY and PIE columns: 7, 14

We are far from having a conclusive prehistory of Asia, though scholars
are beginning to bring together evidence from archaeology, genetics and
linguistics. If we accept the three language phyla discussed above, then a
plausible scenario from linguistics is this. Early humans entered East and
Southeast Asia, bringing with them two linguistic phyla: the Dene-Caucasian
in the north and the Austric in the south. Their domains were later
supplanted by the Eurasiatic phylum, particularly the Altaic family and the
Indo-Iranian branch of the Indo-European family.
The Altaic family of languages has spread like a belt across Central Asia
over several millennia, from Turkey in the west to the Pacific in the east.
Only in recent centuries did Russian, a member of the Slavic branch of the
Indo-European family, colonize large regions of northern Asia. The
Indo-Iranian languages have moved into West Asia and South Asia, where they
claim large communities of speakers in Iran, India and Pakistan. The
eastward expansion of the Eurasiatic phylum covers much of the territory
earlier occupied by speakers of the Dene-Caucasian and Austric. With a few
notable exceptions, such as Chinese, the earlier languages have been
consistently shrinking as the Eurasiatic languages gained the upper hand.
Much of the evidence linguists offer is based on vocabulary. In any
language, the vocabulary contains words which are more cultural, such as
tennis, television, and tea. Cultural words are frequently adopted from
language to language, and hence are not stable indicators of genetic
relations. On the other hand, all languages also have basic words which are
much more stable, such as water, hand, and tree. Although basic words do
get adopted, they are relatively stable. As Morris Swadesh (1952) proposed,
they provide a source of quantitative data for studying relations among
languages.
Table 3 presents in tabular form one of the lists of 100 basic words
Swadesh (1952) proposed that has gained wide acceptance in linguistic
research. Various criticisms have been voiced against the concept of basic
words in general, and against this list of 100 words in particular. Some
scholars feel that the list is too inclusive, and whittle it down to fewer
words. The table Starostin constructed, upon which Table 2 is based, uses a
list of 35 words proposed by Yakhontov. In Table 3, these 35 words are
shown in italics. As can be seen in the table, 32 of the 35 are in the
Swadesh list. The three words Yakhontov proposes that are not in the
Swadesh list are: salt, wind, and year.
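The overlap between the two lists can be checked mechanically. In the sketch below, the 100 Swadesh words are transcribed from Table 3; the 35 Yakhontov words are given as they are commonly cited, which is an assumption of this sketch rather than something read off the table. The variable names are illustrative only.

```python
# Swadesh 100-word list, transcribed from Table 3 by semantic group.
SWADESH_100 = set(
    # Nature
    "ashes bark cloud fire leaf man moon mountain person rain root sand "
    "seed smoke star stone sun tree water woman "
    # Body
    "belly blood bone breast ear egg eye foot hair hand head heart knee "
    "liver meat mouth neck nose skin tongue tooth "
    # Animal
    "bird claw dog feather fish horn louse tail "
    # Verb
    "bite burn come die drink eat fly give hear kill know lie say see sit "
    "sleep stand swim walk "
    # Adjective
    "all big black cold dry fat full good green long many new red round "
    "small warm white yellow "
    # Misc
    "earth I name night not one road that this thou two we what who".split()
)

# ASSUMPTION: the 35-word Yakhontov list as commonly cited.
YAKHONTOV_35 = set(
    "blood bone die dog ear egg eye fire fish full give hand horn I know "
    "louse moon name new nose one salt stone sun tail this thou tongue "
    "tooth two water what who wind year".split()
)

shared = YAKHONTOV_35 & SWADESH_100        # words found in both lists
only_yakhontov = YAKHONTOV_35 - SWADESH_100  # words Yakhontov adds

print(len(SWADESH_100), len(YAKHONTOV_35), len(shared))
print(sorted(only_yakhontov))
```

Under that assumption the set operations reproduce the counts given in the text: 32 shared words, with salt, wind, and year left over.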
Basic words have been used as a method for studying linguistic prehistory
primarily in two contexts. One is to show degrees of affinity, as Starostin
(1990) does in Table 2. The other is to estimate dates of the linguistic
split. A central problem in the historical study of language is that of
sorting out linguistic traits which are vertically transmitted as opposed
to those which are horizontally transmitted. The former mode is also called
inheritance, and the latter mode is also called borrowing. The problem is
extremely difficult because any linguistic trait can be transmitted either
vertically or horizontally.
Table 3. List of 100 basic words, proposed by Morris Swadesh. A smaller
subset of 32 words, plus salt, wind, and year, proposed by Sergei Yakhontov,
are shown in italics

Nature     Body     Animal    Verb    Adjective   Misc
ashes      belly    bird      bite    all         earth
bark       blood    claw      burn    big         I
cloud      bone     dog       come    black       name
fire       breast   feather   die     cold        night
leaf       ear      fish      drink   dry         not
man        egg      horn      eat     fat         one
moon       eye      louse     fly     full        road
mountain   foot     tail      give    good        that
person     hair               hear    green       this
rain       hand               kill    long        thou
root       head               know    many        two
sand       heart              lie     new         we
seed       knee               say     red         what
smoke      liver              see     round       who
star       meat               sit     small       year
stone      mouth              sleep   warm
sun        neck               stand   white
tree       nose               swim    yellow
water      skin               walk
woman      tongue
salt       tooth
wind

Figure 4 illustrates one approach to this problem in the form of a family
tree for the Austronesian languages of Taiwan. Using standard methods of
cluster analysis, I constructed a tree on the basis of a table of numbers of
shared words among these languages (Wang 1989). Such trees are of course
time-honored ways of graphing vertical transmission. On the basis of the
resulting tree, I was able to make another table of the presumed number of
shared words among these languages. Comparing these two tables enabled
me to detect regions of mismatch, which I interpret to be due to horizontal
transmission. These horizontal transmissions are indicated on the tree by
broken lines. While such a modified tree does capture both modes of
transmission, the method of its construction appears to give dominance to
vertical transmission.
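The two-table comparison can be illustrated with a small sketch. It is not a reconstruction of the procedure in Wang (1989): average-linkage clustering is assumed here as one "standard method of cluster analysis", the four languages A to D and their shared-word counts are invented, and the function names and the threshold of 10 words are hypothetical. Each pair of languages is assigned the similarity of the node at which they first join in the tree, and pairs that share clearly more words than the tree predicts are flagged as candidates for horizontal transmission.

```python
from itertools import combinations


def tree_implied_similarity(sim):
    """Cluster by average linkage; return the similarity the tree implies per pair.

    Two languages that first join at a given node are assigned the average
    similarity between the two clusters merged at that node.
    """
    n = len(sim)
    clusters = [frozenset([i]) for i in range(n)]
    implied = [[None] * n for _ in range(n)]

    def average(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the two clusters that share the most words on average.
        a, b = max(combinations(clusters, 2), key=lambda pair: average(*pair))
        level = average(a, b)
        for i in a:
            for j in b:
                implied[i][j] = implied[j][i] = level
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
    return implied


def flag_mismatches(names, sim, threshold):
    """Return pairs whose observed sharing exceeds the tree-implied value."""
    implied = tree_implied_similarity(sim)
    flagged = []
    for i, j in combinations(range(len(names)), 2):
        excess = sim[i][j] - implied[i][j]
        if excess > threshold:
            flagged.append((names[i], names[j], excess))
    return flagged


# INVENTED data: shared basic words (out of 100) among four hypothetical languages.
names = ["A", "B", "C", "D"]
shared_words = [
    [100, 80, 40, 42],
    [80, 100, 60, 41],
    [40, 60, 100, 78],
    [42, 41, 78, 100],
]

print(flag_mismatches(names, shared_words, threshold=10))
```

In this toy data A pairs with B and C pairs with D, while B and C share about 14 more words than the tree accounts for, which is the kind of mismatch the text attributes to borrowing. Because the tree is built first and borrowing is read off the residuals, the sketch shares the bias noted above toward vertical transmission.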
Using similar methods, I made an attempt to estimate the date of the split
of the Sino-Tibetan family of languages, as shown in Figure 5. Details of
this exercise are discussed more fully in Wang (1998). I first constructed
a tree of the major dialects of Chinese, which is shown at the top of the
figure. The tree shown in the middle of the figure is one I constructed for
Indo-European, following identical procedures. The encouraging result when
comparing the two trees is that the 'height' of the tree for Chinese
dialects is approximately the same as that for the three Germanic languages
in the Indo-European tree. Based on these rough yardsticks, it would seem
that the Sino-Tibetan tree at the bottom of the figure should be somewhat
younger than the Indo-European tree. This means that if we assume that the
Indo-European tree is 7,000 years old, then the Sino-Tibetan tree would be
6,000 years old.
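The arithmetic behind such an estimate can be sketched as a simple proportion. The sketch assumes that tree height grows at a constant rate, so that age scales linearly with height; this is an illustrative simplification, not a procedure stated in the text. The two heights are hypothetical values in arbitrary units, chosen only so that the ratio reproduces the 6,000-year figure; the text reports no measured heights.

```python
def scaled_age(reference_age, reference_height, target_height):
    """Scale a reference age by the ratio of tree heights.

    Assumes linguistic change accumulates at a constant rate, so that
    the age of a tree is proportional to its height.
    """
    if reference_height <= 0:
        raise ValueError("reference_height must be positive")
    return reference_age * target_height / reference_height


INDO_EUROPEAN_AGE = 7000  # years; the assumption stated in the text

# HYPOTHETICAL tree heights in arbitrary units (not given in the source).
indo_european_height = 70
sino_tibetan_height = 60

sino_tibetan_age = scaled_age(
    INDO_EUROPEAN_AGE, indo_european_height, sino_tibetan_height
)
print(sino_tibetan_age)
```

Any error in the 7,000-year calibration carries over proportionally to the derived date, which is one reason the text treats these figures as rough yardsticks.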
Although definitive support for this date of 6,000 years, arrived at from
linguistic data, is hard to come by from other disciplines, there is a map
drawn by the Harvard archaeologist K. C. Chang (1986) which is very
suggestive. This map, shown here as Figure 6, illustrates the period of
6,000 years ago in China when for the first time there was wholesale
interaction among the many cultural spheres, based on archaeological finds.
The melting together of these many cultures led Chang to refer to the
period as 'initial China'. There is, then, an encouraging convergence of
results here between archaeology and linguistics.
With the dramatic advances made by genetics in recent years, there is
accumulating an ever increasing body of genetic data that can be compared
with archaeological and linguistic hypotheses. Such comparisons will surely
deepen our understanding of the nature of human diversity and linguistic
diversity, whether or not genetic and linguistic maps always agree. In
either case, it is certain that we had only one past, and mismatches
between the maps can yield important insights on when genes and languages
went separate ways.
Figure 5. Additive trees of Chinese, Indo-European, and Sino-Tibetan.

Figure 6. China in prehistory as revealed by archaeology. [Figure adapted
from K.C. Chang, p. 235.]
Acknowledgements
The research reported here is supported in part by Grant #9010001 from the
City University of Hong Kong and from the RGC of the Hong Kong SAR. I
thank the organizers of the seminar for an excellent interdisciplinary
gathering. I am also grateful to Merritt Ruhlen for many conversations on
theoretical aspects of linguistic taxonomy, and for providing me with some
of the materials included in this paper. As this chapter goes to press, I
have received the sad news that Professor Joseph H. Greenberg passed away
on May 7, 2001 in Stanford. Almost single-handedly, Greenberg created the
field of linguistic taxonomy and has been its most prolific contributor.
This paper and numerous similar studies on language diversity would not be
possible without the foundation he laid.
References
Bertranpetit, J. 2000. Genome, diversity, and origins: The Y chromosome as a
storyteller. Proc. Natl. Acad. Sci. USA 97:6927-6929.
Blust, R. 1996. Beyond the Austronesian homeland: the Austric hypothesis and its
implications for archeology. Trans. Amer. Philos. Soc. 86(5):117-160.
Cannon, G. 1991. Jones's Sprung from Some Common Source. IN: Lamb, S.M. and
Mitchell, E.D. (eds.), Sprung from Some Common Source. Stanford: Stanford
University Press, pp. 23-47.
Cavalli-Sforza, L.L., Piazza, A., Menozzi, P. and Mountain, J. 1988. Reconstruction
of human evolution: Bringing together genetic, archeological and linguistic data.
Proc. Natl. Acad. Sci. USA 85:6002-6006.
Cavalli-Sforza, L.L. and Wang, W.S-Y. 1986. Spatial distance and lexical
replacement. Language 62:38-55. Reprinted in Wang 1991.
Chang, K.C. 1986. The Archeology of Ancient China. 4th ed. New Haven, CT: Yale
University Press.
Freedman, D.A. and Wang, W.S-Y. 1996. Language polygenesis: A probabilistic
model. Anthropolog. Sci. 104(2):131-138.
Greenberg, J.H. 2000. Indo-European and its Closest Relatives: The Eurasiatic
Language Family. Stanford: Stanford University Press.
Klein, R.G. 1999. The Human Career. 2nd ed. Chicago: University of Chicago
Press.
Li, G.R. 2000. Manchu: A Textbook for Reading Documents. Honolulu: University
of Hawaii Press.
Reid, L.A. 1994. Morphological evidence for Austric. Oceanic Linguistics
33:323-344.
Ruhlen, M. 1991. A Guide to the World's Languages. Stanford: Stanford University
Press.
Ruhlen, M. 1998. The origin of the Na-Dene. Proc. Natl. Acad. Sci. USA
95:13994-13996.
Salmons, J.C. and Joseph, B.D. (eds.) 1998. Nostratic: Sifting the Evidence.
Philadelphia: John Benjamins Publishing Co.
Starostin, S. 1990. A statistical evaluation of the time-depth and subgrouping of the
Nostratic macrofamily. Symposium on Molecules to Culture. Cold Spring
Harbor, NY: Cold Spring Harbor Laboratory Press, p. 33.
Swadesh, M. 1952. Lexicostatistic dating of prehistoric ethnic contacts. Proc. Am.
Philos. Soc. 96:452-463.
Thompson, R. et al. 2000. Recent common ancestry of human Y chromosome:
Evidence from DNA sequence data. Proc. Natl. Acad. Sci. USA 97:7360-7365.
Wang, W.S-Y. 1989. The migration of the Chinese people and the settlement of
Taiwan. IN: Anthropological Studies of the Taiwan Area. Taiwan: National
Taiwan University, Department of Anthropology, pp. 15-36.
Wang, W.S-Y. 1991. Explorations in Language. Taiwan: Pyramid Press.
Wang, W.S-Y., ed. 1995. The Ancestry of the Chinese Language. J. Chinese
Linguistics Monograph 8.
Wang, W.S-Y. 1998. Three windows on the past. IN: Mair, V. (ed.), The Bronze
Age and Early Iron Age Peoples of Eastern Central Asia. Philadelphia:
University of Pennsylvania Museum Publications, pp. 508-534.