Revisiting the role of the Himalayas in peopling Nepal

Journal of Human Genetics (2012) 57, 228–234
& 2012 The Japan Society of Human Genetics All rights reserved 1434-5161/12 $32.00
www.nature.com/jhg
ORIGINAL ARTICLE
Revisiting the role of the Himalayas in peopling Nepal:
insights from mitochondrial genomes
Hua-Wei Wang1,2, Yu-Chun Li1, Fei Sun3, Mian Zhao1,4, Bikash Mitra2,5, Tapas Kumar Chaudhuri5,
Pasupati Regmi6, Shi-Fang Wu1,4, Qing-Peng Kong1,4 and Ya-Ping Zhang1,2,4
Himalayas was believed to be a formidably geographical barrier between South and East Asia. The observed high frequency of
the East Eurasian paternal lineages in Nepal led some researchers to suggest that these lineages were introduced into Nepal
from Tibet directly; however, it is also possible that the East Eurasian genetic components might trace their origins to northeast
India where abundant East Eurasian maternal lineages have been detected. To trace the origin of the Nepalese maternal genetic
components, especially those of East Eurasian ancestry, and then to better understand the role of the Himalayas in peopling
Nepal, we have studied the matenal genetic composition extensively, especially the East Eurasian lineages, in Nepalese and its
surrounding populations. Our results revealed the closer affinity between the Nepalese and the Tibetans, specifically, the
Nepalese lineages of the East Eurasian ancestry generally are phylogenetically closer with the ones from Tibet, albeit a few
mitochondrial DNA haplotypes, likely resulted from recent gene flow, were shared between the Nepalese and northeast Indians.
It seems that Tibet was most likely to be the homeland for most of the East Eurasian in the Nepalese. Taking into account the
previous observation on Y chromosome, now it is convincing that bearer of the East Eurasian genetic components had entered
Nepal across the Himalayas around 6 kilo years ago (kya), a scenario in good agreement with the previous results from linguistics
and archeology.
Journal of Human Genetics (2012) 57, 228–234; doi:10.1038/jhg.2012.8; published online 22 March 2012
Keywords: mtDNA; Nepalese; origin
INTRODUCTION
Understanding how the ancestors of modern humans had successfully
settled and adapted to areas with extreme conditions, for example,
highlands, continues to be a hot issue of wide interest. For instance, a
recent study on mitochondrial genome information revealed that the
modern humans had successfully colonized the Tibetan Plateau since
the Late Pleistocene.1 As a neighbor of Tibet, Nepal is located at the
southern piedmont of the Himalayas and is bordered by India and
China. Similar to the Tibetans, the Nepalese are also famous for their
high-altitude living conditions. In consistent with the language
affiliation of the Nepalese populations (viz. Indo-European and
Tibeto-Burman),2–3 recent studies on the Nepalese have detected
substantial genetic contribution from South Asians, East and West
Eurasians,4,5 it becomes evident that the genetic landscape of the
Nepalese has been largely shaped by later immigrants from the
neighboring regions.4–6 For instance, South Asian genetic components
(for example, Y-chromosome haplogroups H-M52*, H-M69*,
H1-M82*, H1-M370*, R1a1-M198 and R2-M124;4,6 mitochondrial
DNA (mtDNA) haplogroups M31, M33, M35, M38, R6 and R305,7)
are prevalent in the Nepalese, reflecting extensive connections between
Nepal and India; whereas the East Eurasian influence on the Nepalese
is surprisingly substantial, as manifested by the presence of Y-chromosome haplogroup O3-M1174,5 and mtDNA haplogroups B5, D4
and G.5,7 As the Himalayas acts as a formidably geographical barrier
between Nepal and East Asia (Tibet), how these East Eurasian
components dispersed to Nepal becomes an issue of hot debate.
Recently, by comparing the Y-chromosome lineages between the
Nepalese and the Tibetan populations, Gayden et al. have proposed
that the East Eurasian genetic components had been introduced into
Nepal from Tibet directly,4,6 a scenario seemingly in agreement with
the hypothesized retreat of Baric speakers.8 However, considering the
close vicinity between Nepal and northeast India in geography, a
possibility that these genetic components of the East Eurasian ancestry
might trace their origins to northeast India could not be ruled out
completely. Indeed, abundant East Eurasian mtDNA lineages have
already been detected in the northeast Indian populations,9–12 raising
1State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan Province, China; 2Laboratory for
Conservation and Utilization of Bio-Resources & Key Laboratory for Microbial Resources of the Ministry of Education, Yunnan University, Kunming, PR China; 3Department of
Parasitology, Medical Teaching Institute, Xinxiang Medical University, Xinxiang, China; 4KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases,
Kunming, China; 5Cellular Immunology Laboratory, Department of Zoology, University of North Bengal Siliguri, Siliguri, India and 6Little Angel’s College Hattiban, Lalitpur, Nepal
Correspondence: Dr Q-P Kong or Dr Y-P Zhang, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming
650223, Yunnan Province, China.
E-mail: [email protected] or [email protected]
Received 30 September 2011; revised 5 January 2012; accepted 8 January 2012; published online 22 March 2012
Reconstructing the origin of the Nepalese
H-W Wang et al
229
Figure 1 Sampling locations of the populations analyzed in this study. The Nepalese populations collected in this study are highlighted with solid pentacles,
and the black dots represent the populations retrieved from the literature.
Table 1 Information of the 43 populations analyzed in this study
Population
Nepal_Kat
Nepal_Mix
Tharu_CI
Tharu_CII
Tharu_E
Kanet
Kashmir
Lobana
Brahmin
Jat Sikh
Kshatriya
Scheduled caste
Sikh
Tharu_UP
Kharia
Bhumij
Santhal
Garo
Lyngnga
Nongtrai
Maram
Bhoi
Khynriam
War_Khas
Pnar
War_Jaint
Adi
Apa_1
Apa_2
Nishi
Tipperah
Naga
Tib_Ng
Tib_R1
Tib_R2
Tib_L
Tib_N1
Tib_N2
Tib_S
Monba
Tib_Ny
Lhoba
Tib_C
Code
Location
Country
Group
Sample size
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
Kathmandu
Eastern Nepal
Central Terai
Central Terai
Eastern Terai
Himachal
Kashmir
Punjab
Punjab
Punjab
Punjab
Punjab
Punjab
Uttar Pradesh
Bihar
Bihar
Bihar
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Meghalaya
Assam
Arunachal Pradesh
Tripura
Tripura
Tripura
Nagaland
Ngari
Shigatse
Shigatse
Lhasa
Nagqu
Nagqu
Shannan
Nyingchi
Nyingchi
Shannan
Chamdo
Nepal
Nepal
Nepal
Nepal
Nepal
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
India
China
China
China
China
China
China
China
China
China
China
China
Nepalese
Nepalese
Nepalese
Nepalese
Nepalese
Northwest Indian
Northwest Indian
Northwest Indian
Northwest Indian
Northwest Indian
Northwest Indian
Northwest Indian
Northwest Indian
North Indian
North Indian
North Indian
North Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Northeast Indian
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
Tibetan
200
46
57
76
40
37
19
65
26
30
34
20
40
38
39
60
45
76
74
27
60
29
82
29
51
17
45
26
21
44
20
43
46
59
220
59
168
58
56
51
53
20
61
Reference
This study
This study
Fornarino et al.5
Fornarino et al.5
Fornarino et al.5
Metspalu et al.11
Kivisild et al.14
Kivisild et al.14
Metspalu et al.11
Metspalu et al.11
Metspalu et al.11
Metspalu et al.11
Cordaux et al.9
Metspalu et al.11; Kivisild et al.14
Kumar et al.17; Banerjee et al.16
Kumar et al.17; Banerjee et al.16
Kumar et al.17
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Reddy et al.12
Cordaux et al.9
Cordaux et al.9
Cordaux et al.9
Cordaux et al.9
Roychoudhury et al.15
Cordaux et al.9
Qin et al.13
Qin et al.13
Zhao et al.1
Qin et al.13
Zhao et al.1
Qin et al.13
Qin et al.13
Qin et al.13
Qin et al.13
Qin et al.13
Qin et al.13
Journal of Human Genetics
Reconstructing the origin of the Nepalese
H-W Wang et al
230
the possibility that the East Eurasian influence could trace its origin
back to the dispersal of Tibeto-Burman people who arrived at Nepal
via northeast India.5 Therefore, it is plausible that the bearer of the
East Eurasian genetic components might have arrived at Nepal either
from Tibet directly (across the Himalayas) or from northeast India
instead. To test both scenarios and get deeper insights into the origin
of the Nepalese, a simple way is to compare the phylogenetic affinity of
the East Eurasian lineages, respectively, observed in Nepal, northern
Table 2 The genetic distance between populations from Nepal, China
(Tibet), northern India based on the East Eurasian matrilineal
components
Table 3 Admixture analysis of the Nepalese by comparing with its
potential parental populations
Nepalese
Nepalese
0.00
northwest Indian
0.03
northwest
north
northeast
Indian
Indian
Indian
Parental group
Kathmandu eastern Nepal
0.00
Tibetan
0.58±0.16
0.45±0.17
0.87±0.23 0.75±0.19 0.12±0.24
northeast Indian
0.24±0.14
0.17±0.14
0.00±0.19 0.00±0.16 0.12±0.20
northwest Indian 0.19±0.12
0.39±0.13
0.00±0.17 0.20±0.14 0.56±0.18
0.01±0.06
0.00±0.07
0.13±0.09 0.06±0.07 0.19±0.09
north Indian
northeast Indian
0.12
0.04
0.17
0.05
0.00
0.12
0.00
Tibetan
0.02
0.03
0.13
0.03
Nep
152
CS# IG#
R58 m306
CS# CS#
R134 R114
Nep
056
Hybrid Nepalese population
Tibetan
Nep
102
Nep
198
CS# Nep
T17 005
CS#
T13
Nep
159
0.00
north Indian
CS# CS# CS#
A64/ B47/ C8
B26 B66
CS# TK# CS#
C48 OR89 R61
TK# Nep SF# TK# Nep
AS32 057 Sin11 AS16 079
14031
16291
16484G
16293 16319
16311
16278 16189 16335 12280
15172
12007
16192 13928C 16296+C 16192
16519 16297
16239 11887 16095 13858
16365
1780 8697
11061
16293C
12425
14509
16111
8056
16390 13708
16344 16209 13105 16519
16093 10565 9615 466
16319
334
3391 5774 15951 12727 9776 8709 464C 7546
16092 4991
16148 12501
16302 9632
16301 16185 10845 16497
16265C
15289
185
456
15859 571
5118
10906 6413
16102 15951
4052 7961 8584 455+TT
16093
16264 16144A 11365
374
14687 146
8572 5417
207
15203 13766
4353
3654 5436d 5363 392
10334 h7226 3407
189
12792
204
6905
14060 13743
8281- 5319
267 4143 195
3915
8886
12279
3399
13820 11914
152
8289d 4851
709
M5b1
8676
5426
3483
151
12813
8143
16304
11167
8005
152
M35a
16519
11827 6713
6020
6260 3144 11986
14410C
575 8930A
15287
1763
2880
10238 3394
16519
16093
12279
593
11016
15262
522- 523+CA
297 2000
15924
10166
333
150
195
523d
9064
10670
6293
150
4916
M3a1
M35b
4454
5432
M5a
460 M5b
146
482
15928
4703
M52
M35
14323
16048
12477
13368
10750
3921
8784T
M3
5460
M5
709
3744
12561
1598
199
16126
1462T
16129
482
1888
16275
16189
16183C
15712
11287
9966
8473
7867
269
Revised Nep
007
CRS
15326
8860
750
263
16519
16271
15787
13968
522523d
H2
4769
1438
H
7028
2706
HV
14766
11719
73
MG#
A156
Nep
138
16325
16224
16311
16185
16193
@15326
15791
12960
243
11437
9098
7444
5875A
4695
195
MG#
A190
Nep
028
16519
16497
16362
16390 16261
15928 16227
13194 16189
12007 16183C
6485 16182C
2392 16181
1019 16129
456 15670
15326
356+C 15467
13782
234 15110
9716
195 12768
8646
154 12285
6131
10003
5911
9153
5510
8022
709
R8
4456A
4388
16519
3645
13215
3310
9449
7759
3384
2755
522-523d
R
Nep
022
TK# CS#
AS15 R64
16344
16093
14233
8580
482
207
15148
11260
10954
3381
1760
16227
5805
152
IG# IG#
k11b t1331
16311
16360 14053
11928 8491
6242
385
3381
183
CS# CS#
T153 T14
CS#
B107
16311
16145
14550
13368
13143
9992
8273 12945
4703 9966
4343 9653
956
214
151
M30d
16166d
M30a
6366
513
16278
16192
13980
5147
152
N
Nep
045
16179d16399 16293
11946 16302
14155 16362 16189
8870 16093
3335 16311 9055
204 10160
1598 16234 6119
152 3504
16093 204
@522
8911
-523d
M30b
CS#
C64
16318T
15924
14148
8950
6185
3540
356+C
Nep CS#
193 T21
15323 16519 16400
7756 @1612616298
2056
12879
10727
5228639
523d
5783
146
12234
146
Tharu_CII
CS# Nep
C182 118
16278
16172
16169
13731
10166
5423
5124A
462
150
16519
16294
15466
15052
14053
12153
11893
8572
3766
207
204
195
Nep
166
16260
13329
12477
11893
8276+C
3798
152
95C
93
60
15868
15514
10680
8802
6261
3469
3202
960d
M3a
15908
8562
4580
Tharu_E
CS#
SW23
204
M33
16519
16362
16324
9344
6293
3221
1719
676
2361
Nep Nep
107 168
CS# CS#
B177 T6
Nep
076
16093 199
16293C 12858
16249 828116223 8289d
16213 @709
16129
16318C
12696
93
11617
16000T 7270
15184 6713
16362
3866 4659
16311
13966
793G
13656
721
11963
153
M30c
Nep
190
16318
16294
16114
16093
15856
7444
4936
4736
3184
195
14305
15259
Nep
075
16357
16266
12136
8723
7853
5471
4047
3894
3592
1664
522523d
195
189
M18
M81
11827
9266
M43
13135
12498
246
194
M30
15431
522-523d
195A
12636
11696
10316
709
12007
16223
12705
15301
10873
10398
9540
8701
NM# CS# CS# CS#
m2 T8 A47 B156
CS#
A68
Tharu_CI
16519
L3
D4
16362
14668
8414
5178A
4883
3010
M60
8345
7912
M
15043
14783
10400
489
Figure 2 Reconstructed mtDNA tree of the completely sequenced representatives of the major Nepalese mtDNA lineages. The phylogenetic tree was
reconstructed on the basis of 58 mitochondrial genomes, among which 21 mtDNAs were sampled from Katmandu, Nepal and generated in this study,
whereas the rest 37 related mtDNAs were collected from the literature.5,26,27,31–33 Mutations are recorded according to the rCRS.19 Suffixes A, C, G, and T
indicate transversions, ‘‘d’’ signifies a deletion and a plus (+) signs an insertion; recurrent mutations are underlined. The prefix ‘‘h’’ indicates heteroplasmy and
‘‘@’’ highlights back mutation. The length polymorphisms (for example, 309+C, 309+2C, 315+C and 315+2C) are ignored during the tree reconstruction.
The reconstruction of highly recurrent mutations (for example, 16519, and the insertion/deletion of ‘‘CA’’ repeats in region 514–523) is tentative at best.
Journal of Human Genetics
Reconstructing the origin of the Nepalese
H-W Wang et al
231
India (including northeast, north and northwest India) and Tibet at
higher molecular resolution, and which becomes feasible with the
recently released Tibetan mtDNA data.1,13
To achieve this objective, mtDNA variation within a number of
246 Nepalese individuals from Kathmandu and eastern Nepal were
collected and studied in this study (Supplementary Table S1, Supplementary Material online), and the recently released mtDNA data from
Nepal5 and the neighboring regions, especially those from Tibet,1,13
northeast and northwest India9–12,14–17 were also considered. To
further understand the Nepalese mtDNA landscape, a total of 21
representative individuals were selected for completely mtDNA
genome sequencing in order to unambiguously determine their
phylogenetic status (Figure 2).
Table 4 Estimated ages of haplogroups G2a2 and M9a1a2a based on
the calibration rates proposed in Forster et al.44 and Soares et al.46
Soares et al.’s rate46
Haplogroup
N
r±s
T (ky)
r±s
T (ky)
G2a2
23
0.30±0.20
5.74 (1.98–9.49)
0.30±0.20
6.14 (2.12–10.16)
M9a1a2a
20
0.30±0.19
5.65 (2.13–9.18)
0.30±0.19
6.05 (2.28–9.83)
Abbreviation: ky, kilo years.
The r and s values were obtained by considering the substitutions in segment 16090–16365
as fully described previously.44,46
MATERIALS AND METHODS
Sample collection
Kashmir
Data analysis
The principle component analysis (PCA) was conducted based on the East
Eurasian haplogroup frequencies as described previously40 (Supplementary
Table S2, Supplementary Material online). Genetic distances (Table 2) were
estimated by using the package Arlequin 3.11.41 Admixture estimation was
performed by the Weighted Least Squares (WLS) Method using the Statistical
Package for the Social Sciences (SPSS) 14.0 software (Table 3).42 The reduced
median networks of haplogroups of interest were constructed by using the
network 4.510 program (http://www.fluxus-engineering.com/sharenet.htm)
and adjusted manually (Figures 4 and 5).43 The ages of the lineages G2a2
(further defined by mutation 16193 and named G2a2 tentatively) and M9a1a2a
(characterized by mutations 16145, 16316 and a back mutation at site 16362,
and designated as M9a1a2a) (Table 4) were estimated by using the r
statistic44,45 with the suggested calibration rates.44,46
RESULTS AND DISCUSSION
On the basis of the combined information from control-region and
partial coding-region segments, the majority (96.34%; 237/246) of the
Nepalese mtDNAs could unambiguously be allocated into the defined
haplogroups of East Eurasian (36.59%; 90/246),1,20–25 South Asian
(51.63%; 127/246)5,7,11,12,26–31 and West Eurasian ancestries (8.13%;
20/246)26,31,47,48 (Supplementary Table S2, Supplementary Material
online). It is apparent that the genetic components of East Eurasian
Scheduled caste
0.500
Group
Nepal Group
North India Group
Northeast India Group
Northwest India Group
Tibet of China Group
Apatani
Jat Sikh
Monba
Nishi
Nep_Mix
Kanet
Lhasa
PC 2 (16.841%)
Total DNA was isolated with the standard phenol/chloroform method and
stored at 80 1C. An mtDNA segment (spanning nucleotide position 16024–
16569/1–576 and covering the whole-mtDNA control region) was amplified
and sequenced as fully described in our previous works.1,18 Mutations are
recorded by comparing with the revised Cambridge reference sequence
(rCRS).19 All the individuals were allocated into specific haplogroup based
on their control-region information; the assignments were further confirmed
by typing additional diagnostic coding-region mutations according to the
reconstructed phylogenetic trees of East Asian,1,20–25 South Asian5,7,12,26–33
and Southeast Asian34–38 (Supplementary Table S1, Supplementary Material
online). For the mtDNA sample of interest, entire genome was amplified and
sequenced as described elsewhere.1,18 To avoid any potential problems in
mtDNA data quality, necessary quality control measures21 and some caveats39
were followed as described previously. The new mtDNA sequences reported in
this study have been deposited in GenBank under accession numbers
JF742217–JF742461 (for control-region sequences) and JF742196–JF742216
(for whole-mitochondrial genomes).
Tri pura
0.750
Blood samples of 246 unrelated individuals were collected from Kathmandu
and eastern Nepal with informed consent (Figure 1 and Table 1), and this
project was approved by the Ethics Committee at Kunming Institute of
Zoology, Chinese Academy of Sciences. The geographical locations of the
populations are displayed in Figure 1.
DNA amplification, sequencing and quality control
Forster et al.’s rate44
Shigatse
Nagqu_2
0.250
Tharu_CI
Adi
Nep_Kat
Nagqu_1
Rikaze
Lobana
Naga
Sikh
Ngari
Sa
Apatani
0.000
Bhumij
Charndo
Tharu_CII
Lhoba
Tharu_E
Nyingchi
Garo
Bhoi
-0.250
Shannan
War_Jaint
Tharu
Pnar
Kshatriya Brahmin
Lyngnga
War_Khas
Kharia
Nongtrai
Maram Khynriam
-0.500
-0.200
0.000
0.200
0.400
0.600
0.800
1.000
PC 1 (37.109%)
Figure 3 Principle component analysis (PCA) of the populations under
study.
(36.59%) and South Asian (51.63%) ancestry have comprised the vast
majority of the Nepalese gene pool (Supplementary Table S1, Supplementary Material online),5 and this pattern remains almost stable for
both the East Eurasian (45.11%; 189/419) and South Asian (47.49%;
199/419) components after taking into account the recently reported
Nepalese mtDNA data.5 As for the 21 samples with ambiguously
phylogenetic status, completely sequencing their mtDNA genomes
revealed that virtually all of these samples in fact belong to the already
defined haplogroups, such as M3, M5, M18, M30, M35, M43, D4, R8
and M60. Of note is that, beside two singular branches identified in
this study, we also defined a novel haplogroup characterized by
variations 9266 and 11827, which was named M81 here (Figure 2).
To get more insights into the origin of the East Eurasian maternal
components observed in the Nepalese and therefore test the two
competing scenarios about how these components had been introduced into Nepal,4,5,8 we focused on the phylogenetic affinity between
the East Eurasian haplogroups identified in the Nepalese and those
from the Tibetan, northeast and northwest Indian populations.
Figure 3 illustrates the principle component analysis plot of the 43
populations under study, which was constructed based merely on the
East Eurasian lineages. Among the five Nepalese populations under
study, three clustered with the Tibetans (Figure 3). After we considered all the Nepalese regional populations as a whole and calculated its
Journal of Human Genetics
Reconstructing the origin of the Nepalese
H-W Wang et al
232
Fst value with the populations from its neighboring regions, the
smallest genetic distance was observed between the Nepalese and the
Tibetans (Table 2). By taking the Tibetans and northern Indians as the
parental populations, the results of the admixture estimation analysis
revealed that the Tibetans made major contribution to virtually all
Nepalese populations (except for the eastern Tharu population;
Table 3). Afterwards, we further compared the phylogenetic affinity
of the East Eurasian lineages observed in Nepalese (including haplogroups A11, C, G2a, M9a, F1c and Z; Figures 4 and 5) with those
from the neighboring regions, for example, Tibet, northeast and
northwest India, by means of median networks.43 On the basis of
the constructed networks (Figures 4 and 5), several features could be
observed: (1) the Nepalese share some basal or internal haplotypes
with the Tibetans; (2) the Nepalese harbor a number of unique
haplotypes at the terminal level, most of which branched off directly
from the nodes occupied almost exclusively by the Tibetan lineages
and (3) only a few haplotypes are shared sporadically between the
Nepalese and the northern Indians. Taken together, the Nepalese
lineages of East Eurasian ancestry generally show much closer affinity
with the ones from Tibet, albeit a few mtDNA haplotypes, likely
resulted from recent gene flow, were shared between the Nepalese and
northern (including northeast) Indians (Figures 4 and 5).
Even though we focused on the East Eurasian lineages identified in
the Nepalese populations, we did observe a number of the Nepalesespecific haplotypes, strongly suggesting their rather ancient origin and
most plausibly de novo differentiation in Nepal. To get some hint at
the arrival time of the lineages, we have focused on two clades from
haplogroups G2a and M9a1a2 simply because both clades contain the
Nepalese haplotypes at their terminal branch or basal node and likely
have differentiated in Nepal; estimating their ages would then help to
N
Nis
@16093
R
R
2R
16278
L
Ng
L
S
16374C
16243
16302
16129
16069
16400
Ng
16172
S
16291
Adi
Ng
16126
16294
16212
@16093
R
L
N-M
Ng
C
4 C II
5N
12R
R
2R
16224
16093
16093
K
16359
16311
C
16234
2R
R
16096
16085
2S
Nag
2Mon
16357
16311
2Nag
16126
16051
Bho
S
3N
C
N
C
Ny
16325
16288
16164
16148
16327
16298
16223
Lho
2R
Tha
Khy
16093
@16223
3N
16319
16293C
16290
16223
16284
16129
16220+A
R
Tri
N
3R
3R
16384
S
16355
16318T
2R
R
3S
Ng
16297
R
S
2Nag
Adi
16362
Adi
4Nis
16150
Ny
R
K
16086
@16298
16239
16319
4S
Ng
Khy
Ny
E
K
Kan
16354
K
Kan
16362
16288
K
@16223
2Ny
R
5R
K
16300
16456
Ng
Haplogroup C (N=64)
Haplogroup A11 (N=61)
C II
5 Apa
K
16092
16343
2R
R
CI
2 N-M
6R
16051
4 C II
E
16189
R
K
16058C
16145
16355
16215
16126
L
16107
2K
16150
16291
16093
K
Bho
Nis
E
16261
16224
Ny
Lyn
Gar
16304
16129
16111
Haplogroup F1c (N=39)
R
L
16284
Note
16212
16302
16266
S
K
N
3C I
Ny
Lho 3 R
2 C II
3K
3 Nis Apa
16362
L
Kan 4N
Nepalese
C
2K
S
C
16311
Adi
3Nis
N
Tibetan
Northeast Indian
R
16298
16260
16223
16185
Northwest Indian
North Indian
Haplogroup Z (N=36)
Figure 4 Median networks of haplogroups A11, C, F1c and Z. The reduced median networks of haplogroups of interest were constructed by using the
network 4.510 program (http://www.fluxus-engineering.com/sharenet.htm) and adjusted manually according to Bandelt et al.43 The data used here were
collected from this study (Supplementary Table S1) and the literature (Table 1). The sequence variation used for network construction was confined to
segment 16047–16497. Suffixes T, C and G refer to transversions; recurrent mutations are underlined and ‘‘@’’ denotes a reverse mutation, 16193+C was
omitted in the median network. Code in the circles refers to the population abbreviation as displayed in Supplementary Table S2.
Journal of Human Genetics
Reconstructing the origin of the Nepalese
H-W Wang et al
233
Non
K
N-M
8 WK
2CI
N-M
K
@16093
WK
E
CI
C II
5 C II
Tha
K
Mar
5 Pna
16051
E
16311
16356
16188+C
16168
K
K
3 C II
16300
C II
2CI
Tha
L
Adi
Lho
G2a2 16256
16209 16129
16193
R
16172
16093
N
R
16288
16093
Ng
L
16311
16189
16291
N
S
16189
16172
Ng
16362
16278
16227
16223
CI
3N
2CI
16293 16390 16256
16344
16129
16278
16311
3R
Mar
2 Adi
16317
4N
16168
R
7 C II
R
3 Ng
16311
N
7CI
R
16218
S
Tha
16325
Khy
9R
16188
2 Pna
Adi
R
5 Lyn
3 Ny
16224
Lho
16265C
Bho
16356
16243
C
Pna
16185
16172
16289
16384
16337
@16223
16158G
16114
16288
16274
16051
16114
Gar
Kha
16093
C I C II
C II
C
16271
2N
R
16319
16272
3M
R
L
@16227
2R
N
L
2N
C
R
16086
L
16215
16173
@16223
16474T
16234
Ny
R
C
Ng
3CI
2E
16348
16319
16081
16311
N
N
Mar
15 Khy
Khy
K
3N
5 Lyn
16093
@16362
CI
Bho
5 WJ
M9a1a2a
6S
4C
16362
16158
@16362
16316
16145
Note
Nepalese
Tibetan
S
Haplogroup G2a (N=56)
2R
L
16234
16223
Haplogroup M9a (N=143)
Northeast Indian
North Indian
Figure 5 Median networks of haplogroups G2a2 and M9a1a2a. For more information, see Figure 4.
date the arrival time of the migration from Tibet. In fact, time
estimation results revealed that haplogroups G2a2 and M9a1a2a
have very similar ages of B5.7 kya, and this age becomes a little
older (B6 kya) when calibration rate proposed by Forster et al.44 was
used. To this end, the very similar ages of both haplogroups, which
likely had in situ differentiated in Nepal, strongly suggest that the
bearers of these East Eurasian maternal components would have
arrived at Nepal no later than 5.7 kya (Table 4). In retrospect, previous
work has suggested that the maternal genetic components from the
northern East Eurasian was introduced into Tibet around 8.2 kya,1 and
our time estimation results fit this dating frame very well. It is then
conceivable that the settlement of Nepal by the bearer of the East
Eurasian genetic components occurred likely before 5.7 kya, a result in
good agreement with the archeological findings reporting shared the
Neolithic features between Nepal and Tibet (references therein).49
Previous studies have observed substantial East Eurasian genetic
components in the Nepalese populations;4,5 however, it remains
controversial whether the East Eurasian lineages have been introduced
into Nepal from Tibet directly (across the Himalayas)4,6 or via
northeast India.5,8,50 By extensively analyzing the mtDNA variation
in Nepal, Tibet, northern India populations, our observations, based
on the principle component analysis, Fst and admixture estimation,
revealed the closer genetic affinity between the Nepalese and the
Tibetans, and this result was further substantiated by the median
networks, (Figures 4 and 5) in which most of the Nepalese mtDNAs
prevalent among northern Asian populations shared the haplotypes
with the Tibetans at root level or branched off directly from the nodes
consisting almost exclusively of the Tibetan lineages. Our results
strongly suggest that most of the East Eurasian maternal components
identified in the Nepalese were introduced directly from Tibet,4,6 and
the time estimation results further date that this peopling scenario
plausibly occurred about 6 kya. Indeed, this inference seems to be in
striking accordance with the historically recorded passes (such as the
Kodari and Rasuwa Passes), which bridged the Nepalese and the
Tibetans since the ancient time.3 However, the observed gene flow
from northeast India suggests genetic contribution, albeit limited,
from this region, a scenario echoing the proposed inland dispersal
route.50 In this spirit, our findings complete the understanding of the
origin of the Nepalese and the way how the East Eurasian genetic
components had been introduced into Nepal. Taking into account the
previous observation on Y chromosome,4 now it is convincing that the
East Eurasian had entered Nepal across the Himalayas around 6 kya, a
scenario in good agreement with the previous findings from linguistics
and archeology.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
ACKNOWLEDGEMENTS
We thank Wen-Zhi Wang, Chun-Ling Zhu, Yang Yang, Shi-Kang Gou and Tao
Sha for their technical assistant. We are grateful to the volunteers for providing
the blood samples to the project. This work was supported by grants from the
Natural Science Foundation of Yunnan Province (No. 2011FB106) and the
National Natural Science Foundation of China (NSFC, Nos. 30900797 and
30621092).
1 Zhao, M., Kong, Q. P., Wang, H. W., Peng, M. S., Xie, X. D., Wang, W. Z. et al.
Mitochondrial genome evidence reveals successful Late Paleolithic settlement on the
Tibetan Plateau. Proc. Natl. Acad. Sci. USA 106, 21230–21235 (2009).
2 Wang, H. W. Nepal (Social Sciences Documentation Publishing House, Beijing, China,
2004).
3 Xue, K. Q., Zhao, C. Q., Sun, S. H., Ge, W. J., Liu, Y., Xing, G. C. et al. Concise
Encyclopaedia Of South Asia and Central Asia (China Social Science Press, Beijing,
China, 2004).
4 Gayden, T., Cadenas, A. M., Regueiro, M., Singh, N. B., Zhivotovsky, L. A., Underhill, P.
A. et al. The Himalayas as a directional barrier to gene flow. Am. J. Hum. Genet. 80,
884–894 (2007).
5 Fornarino, S., Pala, M., Battaglia, V., Maranta, R., Achilli, A., Modiano, G. et al.
Mitochondrial and Y-chromosome diversity of the Tharus (Nepal): a reservoir of genetic
variation. BMC Evol. Biol. 9, 154 (2009).
6 Gayden, T., Mirabal, S., Cadenas, A. M., Lacau, H., Simms, T. M., Morlote, D. et al.
Genetic insights into the origins of Tibeto-Burman populations in the Himalayas.
J. Hum. Genet. 54, 216–223 (2009).
7 Thangaraj, K., Chaubey, G., Kivisild, T., Rani, D. S., Singh, V. K., Ismail, T. et al.
Maternal footprints of Southeast Asians in North India. Hum. Hered. 66, 1–9 (2008).
8 Su, B., Xiao, C. J., Deka, R., Seielstad, M. T., Kangwanpong, D., Xiao, J. H. et al.
Y chromosome haplotypes reveal prehistorical migrations to the Himalayas. Hum.
Genet. 107, 582–590 (2000).
9 Cordaux, R., Saha, N., Bentley, G. R., Aunger, R., Sirajuddin, S. M. & Stoneking, M.
Mitochondrial DNA analysis reveals diverse histories of tribal populations from India.
Eur. J. Hum. Genet. 11, 253–264 (2003).
10 Cordaux, R., Weiss, G., Saha, N. & Stoneking, M. The Northeast Indian passageway: a
barrier or corridor for human migrations? Mol. Biol. Evol. 21, 1525–1533 (2004).
Journal of Human Genetics
Reconstructing the origin of the Nepalese
H-W Wang et al
234
11 Metspalu, M., Kivisild, T., Metspalu, E., Parik, J., Hudjashov, G., Kaldma, K. et al. Most
of the extant mtDNA boundaries in south and southwest Asia were likely shaped during
the initial settlement of Eurasia by anatomically modern humans. BMC Genet. 5, 26
(erratum 26:41) (2004).
12 Reddy, B. M., Langstieh, B. T., Kumar, V., Nagaraja, T., Reddy, A. N. S., Meka, A. et al.
Austro-Asiatic tribes of Northeast India provide hitherto missing genetic link between
South and Southeast Asia. PLoS ONE 2, e1141 (2007).
13 Qin, Z. D., Yang, Y. J., Kang, L. L., Yan, S., Cho, K., Cai, X. Y. et al. A mitochondrial
revelation of early human migrations to the Tibetan Plateau before and after the last
glacial maximum. Am. J. Phys. Anthropol. 143, 555–569 (2010).
14 Kivisild, T., Bamshad, M. J., Kaldma, K., Metspalu, M., Metspalu, E., Reidla, M. et al.
Deep common ancestry of Indian and western-Eurasian mitochondrial DNA lineages.
Curr. Biol. 9, 1331–1334 (1999).
15 Roychoudhury, S., Roy, S., Basu, A., Banerjee, R., Vishwanathan, H., Usha Rani, M. V.
et al. Genomic structures and population histories of linguistically distinct tribal groups
of India. Hum. Genet. 109, 339–350 (2001).
16 Banerjee, J., Trivedi, R. & Kashyap, V. K. Mitochondrial DNA control region sequence
polymorphism in four indigenous tribes of Chotanagpur plateau, India. Forensic Sci.
Int. Genet. 149, 271–274 (2005).
17 Kumar, V., Langstieh, B. T., Madhavi, K. V., Naidu, V. M., Singh, H. P., Biswas, S. et al.
Global patterns in human mitochondrial DNA and Y-chromosome variation
caused by spatial instability of the local cultural processes. PLoS Genet. 2, e53
(2006).
18 Wang, H. W., Jia, X. Y., Ji, Y. L., Kong, Q. P., Zhang, Q. J., Yao, Y. G. et al. Strikingly
different penetrance of LHON in two Chinese families with primary mutation G11778A
is independent of mtDNA haplogroup background and secondary mutation G13708A.
Mutat. Res. 643, 48–53 (2008).
19 Andrews, R. M., Kubacka, I., Chinnery, P. F., Lightowlers, R. N., Turnbull, D. M. &
Howell, N. Reanalysis and revision of the Cambridge reference sequence for human
mitochondrial DNA. Nat. Genet. 23, 147 (1999).
20 Kivisild, T., Tolk, H. V., Parik, J., Wang, Y. M., Papiha, S. S., Bandelt, H. J. et al.
The emerging limbs and twigs of the East Asian mtDNA tree. Mol. Biol. Evol. 19,
1737–1751 (2002).
21 Kong, Q. P., Yao, Y. G., Sun, C., Bandelt, H. J., Zhu, C. L. & Zhang, Y. P. Phylogeny of
east Asian mitochondrial DNA lineages inferred from complete sequences. Am. J. Hum.
Genet. 73, 671–676 (2003).
22 Tanaka, M., Cabrera, V. M., González, A. M., Larruga, J. M., Takeyasu, T., Fuku, N. et al.
Mitochondrial genome variation in eastern Asia and the peopling of Japan. Genome
Res. 14, 1832–1850 (2004).
23 Kong, Q. P., Bandelt, H. J., Sun, C., Yao, Y. G., Salas, A., Achilli, A. et al. Updating the
East Asian mtDNA phylogeny: a prerequisite for the identification of pathogenic
mutations. Hum. Mol. Genet. 15, 2076–2086 (2006).
24 Derenko, M., Malyarchuk, B., Grzybowski, T., Denisova, G., Dambueva, I., Perkova, M.
et al. Phylogeographic analysis of mitochondrial DNA in northern Asian populations.
Am. J. Hum. Genet. 81, 1025–1041 (2007).
25 Kong, Q. P., Sun, C., Wang, H. W., Zhao, M., Wang, W. Z., Zhong, L. et al.
Large-scale mtDNA screening reveals a surprising matrilineal complexity in East Asia
and its implications to the peopling of the region. Mol. Biol. Evol. 28, 513–522
(2011).
26 Palanichamy, M. G., Sun, C., Agrawal, S., Bandelt, H. J., Kong, Q. P., Khan, F. et al.
Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on complete
sequencing: implications for the peopling of South Asia. Am. J. Hum. Genet. 75,
966–978 (2004).
27 Sun, C., Kong, Q. P., Palanichamy, M. G., Agrawal, S., Bandelt, H. J., Yao, Y. G. et al.
The dazzling array of basal branches in the mtDNA macrohaplogroup M from India as
inferred from complete genomes. Mol. Biol. Evol. 23, 683–690 (2006).
28 Thangaraj, K., Chaubey, G., Singh, V. K., Vanniarajan, A., Thanseem, I., Reddy, A. G.
et al. In situ origin of deep rooting lineages of mitochondrial Macrohaplogroup ‘M’ in
India. BMC Genomics 7, 151–156 (2006).
29 Chandrasekar, A., Kumar, S., Sreenath, J., Sarkar, B. N., Urade, B. P., Mallick, S. et al.
Updating phylogeny of mitochondrial DNA macrohaplogroup m in India: dispersal of
modern human in South Asian corridor. PLoS ONE 4, e7447 (2009).
30 Thangaraj, K., Chaubey, G., Kivisild, T., Reddy, A., Singh, V. K., Rasalkar, A. A. et al.
Reconstructing the origin of Andaman Islanders. Science 308, 996 (2005).
31 Kivisild, T., Shen, P. D., Wall, D., Do, B., Sung, R., Davis, K. et al. The role of selection
in the evolution of human mitochondrial genomes. Genetics 172, 373–387 (2006).
32 Maca-Meyer, N., González, A. M., Larruga, J. M., Flores, C. & Cabrera, V. M. Major
genomic mitochondrial lineages delineate early human expansions. BMC Genet. 2, 13 (2001).
33 Ingman, M. & Gyllensten, U. Mitochondrial genome variation and evolutionary history of
Australian and New Guinean aborigines. Genome Res. 13, 1600–1606 (2003).
34 Macaulay, V., Hill, C., Achilli, A., Rengo, C., Clarke, D., Meehan, W. et al. Single, rapid
coastal settlement of Asia revealed by analysis of complete mitochondrial genomes.
Science 308, 1034–1036 (2005).
35 Hill, C., Soares, P., Mormina, M., Macaulay, V., Meehan, W., Blackburn, J. et al.
Phylogeography and ethnogenesis of aboriginal Southeast Asians. Mol. Biol. Evol. 23,
2480–2491 (2006).
36 Hill, C., Soares, P., Mormina, M., Macaulay, V., Clarke, D., Blumbach, P. B. et al.
A mitochondrial stratigraphy for island southeast Asia. Am. J. Hum. Genet. 80, 29–43
(2007).
37 Peng, M. S., Quang, H. H., Dang, K. P., Trieu, A. V., Wang, H. W., Yao, Y. G. et al. Tracing
the Austronesian Footprint in Mainland Southeast Asia: A Perspective from Mitochondrial DNA. Mol. Biol. Evol. 27, 2417–2430 (2010).
38 Tabbada, K. A., Trejaut, J., Loo, J. H., Chen, Y. M., Lin, M., Mirazón-Lahr, M. et al.
Philippine mitochondrial DNA diversity: a populated viaduct between Taiwan and
Indonesia? Mol. Biol. Evol. 27, 21–31 (2010).
39 Kong, Q. P., Salas, A., Sun, C., Fuku, N., Tanaka, M., Zhong, L. et al. Distilling artificial
recombinants from large sets of complete mtDNA genomes. PLoS ONE 3, e3016
(2008).
40 Yao, Y. G., Kong, Q. P., Bandelt, H. J., Kivisild, T. & Zhang, Y. P. Phylogeographic
differentiation of mitochondrial DNA in Han Chinese. Am. J. Hum. Genet. 70,
635–651 (2002).
41 Excoffier, L., Laval, G. & Schneider, S. Arlequin (version 3.0): an integrated software
package for population genetics data analysis. Evol. Bioinform. Online 1, 47–50 (2005).
42 Long, J. C., Williams, R. C., McAuley, J. E., Medis, R., Partel, R., Tregellas, W. M. et al.
Genetic variation in Arizona Mexican Americans: estimation and interpretation of
admixture proportions. Am. J. Phys. Anthropol. 84, 141–157 (1991).
43 Bandelt, H. J., Macaulay, V. & Richards, M. Median networks: speedy construction and
greedy reduction, one simulation, and two case studies from human mtDNA. Mol.
Phylogenet. Evol. 16, 8–28 (2000).
44 Forster, P., Harding, R., Torroni, A. & Bandelt, H. J. Origin and evolution of Native
American mtDNA variation: a reappraisal. Am. J. Hum. Genet. 59, 935–945 (1996).
45 Saillard, J., Forster, P., Lynnerup, N., Bandelt, H. J. & Nørby, S. mtDNA variation among
Greenland Eskimos: the edge of the Beringian expansion. Am. J. Hum. Genet. 67,
718–726 (2000).
46 Soares, P., Ermini, L., Thomson, N., Mormina, M., Rito, T., Röhl, A. et al. Correcting for
purifying selection: an improved human mitochondrial molecular clock. Am. J. Hum.
Genet. 84, 740–759 (2009).
47 Finnilä, S., Lehtonen, M. S. & Majamaa, K. Phylogenetic network for European mtDNA.
Am. J. Hum. Genet. 68, 1475–1484 (2001).
48 Herrnstadt, C., Elson, J. L., Fahy, E., Preston, G., Turnbull, D. M., Anderson, C. et al.
Reduced-median-network analysis of complete mitochondrial DNA coding-region
sequences for the major African, Asian, and European haplogroups. Am. J. Hum.
Genet. 70, 1152–1171 (2002).
49 Sharmai, D. R. Archaeological Remains of the Dang Valley. Ancient Nepal. 88, 8–15
(1988).
50 Peng, M. S., Palanichamy, M. G., Yao, Y. G., Mitra, B., Cheng, Y. T., Zhao, M. et al.
Inland post-glacial dispersal in East Asia revealed by mitochondrial haplogroup M9a’b.
BMC Biol. 9, 2 (2011).
Supplementary Information accompanies the paper on Journal of Human Genetics website (http://www.nature.com/jhg)
Journal of Human Genetics