MEC1314.fm Page 1599 Monday, June 18, 2001 10:24 AM
Molecular Ecology (2001) 10, 1599–1613
INVITED REVIEW
Blackwell Science, Ltd
Haploid chromosomes in molecular ecology:
lessons from the human Y
MATTHEW E. HURLES* and MARK A. JOBLING
Department of Genetics, University of Leicester, University Road, Leicester LE1 7RH, UK
Abstract
We review the potential use of haploid chromosomes in molecular ecology, using recent work
on the human Y chromosome as a paradigm. Chromosomal sex-determination systems, and
hence constitutively haploid chromosomes, which escape from recombination over much
of their length, have evolved multiple times in the animal kingdom. In mammals, where males
are the heterogametic sex, the patrilineal Y chromosome represents a paternal counterpart
to mitochondrial DNA. Work on the human Y chromosome has shown it to contain the same
range of polymorphic markers as the rest of the nuclear genome and these have rendered
it the most informative haplotypic system in the human genome. Examples from research
on the human Y chromosome are used to illustrate the common interests of anthropologists
and ecologists in investigating the genetic impact of sex-specific behaviours and dispersals, as
well as patterns of global diversity. We present some methodologies for extracting information from these uniquely informative yet under-utilized loci.
Keywords: admixture, haploid, human, sex-specific, Y chromosome
Received 1 December 2000; revision received 20 March 2001; accepted 20 March 2001
Introduction
Sex-specific differences in ecology, behaviour and migration are abundant in the animal kingdom (Avise 1993).
Mitochondrial DNA (mtDNA) has been the most widely
used marker for molecular ecological investigations, but
gives information only about females (Avise et al. 1987;
Bermingham & Moritz 1998). In most species, information
about the male has been deduced from the differences
between patterns of diversity of mtDNA and those of
biparentally inherited markers (Fitz-Simmons et al. 1997;
Andersen et al. 1998; Lum et al. 1998; Lyrholm et al. 1999;
Nyakaana & Arctander 1999; Wilmer et al. 1999). However,
the fact that sex is chromosomally determined in many
organisms provides an independent source of sex-specific
information. The last 10 years have witnessed an explosion
in markers for human Y chromosome diversity, and the
potential of these is now being realized in studies of human
origins (Hammer et al. 1998), population histories (e.g.
Zerjal et al. 1997; Karafet et al. 1999), sex-biased admixture (Hurles et al. 1998), and male–female differences in
migration (Seielstad et al. 1998) and other behaviours
(Poloni et al. 1997). The purpose of this review is to persuade
those who study other organisms of the virtues of analysing the diversity of haploid chromosomes, using the
human Y as a paradigm. We summarize recent developments in human Y-chromosomal diversity studies and give
specific examples of anthropological research that mirror
issues in ecology.
*Present address: McDonald Institute for Archaeological Research,
University of Cambridge, Downing Street, Cambridge, CB2 3ER, UK
Correspondence: Mark A. Jobling. Fax: (+44) 116 252 3378; E-mail:
[email protected]
© 2001 Blackwell Science Ltd
Genetics is a relative newcomer to the field of human
evolutionary studies, and only one of a number of tools for
studying human history; many other ‘historical’ records
exist. Anthropologists, archaeologists, palaeontologists
and linguists have their own records of the human past.
Lack of congruence between these different records need
not necessarily be taken as a failure of one compared to
another. It may be the signal of something altogether
more interesting, and emphasizes the complementary
nature of disciplines investigating similar questions.
Genetic and linguistic records as documented in extant
diversity represent the outcome of processes documented
contemporaneously in the archaeological record (Fix
1999). In this sense it is analogous to the positioning of
molecular ecology within other ecological methods (e.g.
tagging and recapture).
Fig. 1 Ideogram of the Y chromosome, showing the locations of
the pseudoautosomal regions (PAR), the testis-determining gene,
SRY, and the long arm heterochromatin.
The human Y chromosome
In humans, as in other mammals, the male is the heterogametic sex, possessing a constitutively haploid Y chromosome which causes testis differentiation and so determines
maleness in a dominant fashion through the action of a
single gene, SRY (Sex determining region, Y) (Sinclair et al.
1990). The Y chromosome shares a number of regions of
homology with the X, and undergoes recombination with
it in the two pseudoautosomal regions, at the tips of the
long and short arms (Fig. 1). Between these lie about 60 Mb
(megabases) of DNA, including the SRY gene, in which
recombination does not occur. It is this region that concerns
us here.
This escape from the scrambling effect of recombination
has profound consequences for the population genetics,
mutation processes, and genes of this unique chromosome.
Although we concentrate in this review on the first of these,
in reality the three are inseparable: the use of molecular
markers in population studies has to include a consideration of mutation dynamics, and the potential effects of
selection via Y-chromosomal genes (Jobling & Tyler-Smith
2000).
Markers for human Y-chromosomal diversity
Paternal lineage analysis through studies of Y chromosome
polymorphism is not a new idea: the first polymorphisms
on the human Y were identified 15 years ago (Casanova
et al. 1985), although subsequent progress in identifying
more markers was slow. This is not the place to present a
history of this process, and therefore we only discuss currently useful markers.
The human Y chromosome has a fundamental advantage over haploid chromosomes in other organisms in that
a great deal of money is being invested in determining its
entire DNA sequence. While this will have an undeniable
effect on the future ease of marker identification, the majority of markers currently in use were found using ‘old-fashioned’ molecular biology. For examples of how to
exploit the variation on haploid chromosomes without a
whole chromosome sequencing project see Box 1.
In contrast to mtDNA, the Y chromosome is expected to
have a mutation rate similar to that of other nuclear loci, as
it is subject to the same repair processes (although it never
passes through female mitosis or meiosis). In addition, it
should contain the same types of polymorphic loci as are
found on the other chromosomes. This has indeed been
found to be the case (Jobling & Tyler-Smith 1995; de Knijff
et al. 1997; Underhill et al. 1997).
Y-chromosomal markers are best classified on the basis
of their mutation rate. This distinguishes between so-called
‘unique’ mutation events that can be considered to have
occurred once in human prehistory and those that are
likely to be recurrent (Jobling & Tyler-Smith 1995). Into the
former class fall single nucleotide polymorphisms (SNPs)
and certain insertion–deletion events, for example the
insertion of an Alu element (Hammer 1994). Multiallelic
markers such as microsatellites and minisatellites have
much higher mutation rates (e.g. 10⁻³ per locus per generation or greater), and thus fall into the latter category.
Unique event markers
Early searches for restriction fragment length polymorphisms (RFLP) on the Y chromosome revealed little
diversity (Jakubiczka et al. 1990; Malaspina et al. 1990).
More recently, studies have focused on sequencing the
same region in a number of individuals from diverse populations, and these efforts have met with greater success:
there are now in excess of 200 unique event markers
available (Underhill et al. 1997, 2000; Hammer et al. 2000;
Jobling & Tyler-Smith 2000; Shen et al. 2000), most of which
are SNPs. Nevertheless, the general picture remains one
of reduced nucleotide diversity compared to non-Y-chromosomal nuclear sequences. In one recent large-scale
study, nucleotide diversity within both coding and noncoding regions of three Y-chromosomal genes encompassing 64 120 bp was found to be about five times lower than
that in the corresponding regions of autosomal genes (Shen
et al. 2000).
Box 1 Finding markers on haploid chromosomes without a genome project
Very few species of interest to ecologists have associated
genome projects. Haploid-chromosome-specific libraries
are equally scarce, although their construction has
been facilitated by recent advances in flow cytometry
and microdissection. Consequently, if the diversity
of haploid chromosomes is to be exploited, strategies
for efficiently identifying markers need to be developed.
These studies are hampered by the rapid divergence
of haploid chromosomes, which results in diminished
conservation of sequence similarity between species.
However, many species exhibit greater sequence
diversity than Homo sapiens (Gagneux et al. 1999) with
its shallow time-depth, and thus correspondingly shorter
lengths of sequence need be assayed to produce an
informative phylogeny.
Two main approaches to generating haploid chromosomal markers have been used. The first is to exploit the
limited homology available in the few highly conserved
genes to characterize sequence variants, for example the
SRY and ZFY genes on the mammalian Y chromosome
(Boissinot & Boursot 1997), and CHD-W (Griffiths et al.
1998) and ASW (O’Neill et al. 2000) genes on the W
chromosome of nonratite birds. By September 2000 SRY
sequences from 86 mammalian species were available
in the EMBL database, many from the highly conserved
HMG box region, though the distribution across
taxonomic units is by no means uniform. Whilst cross-species homologies can be used to identify haploid
chromosome-specific clones from genomic libraries,
construction of such libraries is laborious; therefore
it is of interest to develop PCR-based strategies for
obtaining haploid chromosomal markers. To this end
degenerate PCR primers can be used to characterize
homologous sequences more accurately before using
inverse PCR (Raponi et al. 2000) methods to obtain longer
sequences from the often short stretch of known sequence.
The second approach uses methods that identify
markers positioned randomly throughout a genome,
to subsequently detect haploid chromosome-specific
markers through sex linkage. Amplified fragment length
polymorphism (AFLP) (Griffiths & Orr 1999) and random
amplified polymorphic DNA (RAPD) (Bello & Sanchez
1999) methods have been used to this end and can
subsequently be used to develop sequence characterized
amplified regions (SCARs), which can be screened for
variants. A higher throughput method is that of reduced
representation shotgun (RRS) sequencing (Altshuler
et al. 2000), which can be allied to flow-sorting of chromosomes to get chromosomal-specific markers (Mullikin
et al. 2000).
Once haploid-chromosome-specific sequence has
been obtained, markers can be developed by screening
a diverse subset of samples before developing high-throughput PCR assays for the polymorphisms discovered [e.g. restriction fragment length polymorphism
(PCR–RFLP), allele-specific amplification (PCR–ASA)].
A number of mutation detection techniques developed
in recent years could expedite this process enormously,
for example denaturing high performance liquid
chromatography (DHPLC) (Underhill et al. 1997) and
DNA chips (Cargill et al. 1999), although access to these
technologies remains limited.
It is worth noting that while multiple methods exist
to develop sequence-based markers, the development
of microsatellite markers is less easy to envisage in the
absence of a chromosome-specific library. Novel screening strategies are needed.

This discrepancy can be explained in a number of ways.
One factor contributing to the lack of Y-chromosomal diversity is that, for a species with a 1:1 sex ratio, the effective
population size of the Y chromosome is a quarter that of
any autosome and one third that of the X chromosome.
Neutral theory predicts that Y diversity should be correspondingly reduced, purely as a function of this reduced
population size (Nei 1987). A further explanation is that
mating practices have in the past resulted in certain males
producing many more sons than others. While these two
explanations are widely accepted, the possibility that natural
selection has contributed to reduced Y-chromosomal diversity in humans is more controversial. Several studies have
failed to reject the null hypothesis of neutrality (Hammer
1995; Goldstein et al. 1996), suggesting that there has not
been a recent global selective sweep on the Y chromosome.
However, one study derives an unexpectedly young age
for the most recent common ancestor (MRCA) of modern
Y chromosomes (Thomson et al. 2000), a result explained
either by the strong action of selection, or by the assumptions inherent within the population genetic model used.
Box 2 describes some of the genes and phenotypes associated with the human Y, on which selection may be acting.
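The effective population size argument can be made concrete with a short calculation. The sketch below uses the standard textbook expressions for the effective sizes of autosomes, the X and the Y in a population of breeding males and females; these formulas are not given in this review and are assumed here, as are the function names and the example numbers.

```python
from fractions import Fraction

def effective_sizes(n_males, n_females):
    """Textbook effective population sizes (assumed formulas, not from the
    review) for a population with XY sex determination."""
    nm, nf = Fraction(n_males), Fraction(n_females)
    return {
        "autosome": 4 * nm * nf / (nm + nf),
        "X": 9 * nm * nf / (4 * nm + 2 * nf),
        "Y": nm / 2,
    }

def y_relative_size(n_males, n_females):
    """Return (Y/autosome, Y/X) ratios of effective population size."""
    ne = effective_sizes(n_males, n_females)
    return ne["Y"] / ne["autosome"], ne["Y"] / ne["X"]

# Equal numbers of breeding males and females (hypothetical figures).
print(y_relative_size(500, 500))
# Fewer breeding males than females (hypothetical figures): the Y shrinks further.
print(y_relative_size(100, 900))
```

With a 1:1 sex ratio the two ratios are one quarter and one third, as stated in the text. The second call illustrates, under the same assumed formulas, how a reduced number of breeding males lowers the relative effective size of the Y still further.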
Binary polymorphisms of unique origin can be combined
into monophyletic compound haplotypes, called haplogroups, which are related to one another phylogenetically
by a single most parsimonious tree (Fig. 2). This perfect
phylogeny can be constructed by hand from a character-state table. Ancestral state can usually be determined by
examining the homologous region in an appropriate outgroup such as the chimpanzee, and thus the tree can be
rooted. The coalescence time to the MRCA of extant human
Y chromosomes has been estimated by comparing human
sequence diversity to that seen between humans and
apes and using estimates of the date of the human-ape
divergence to calibrate the measurement. Estimates vary
between 50 000 and 200 000 years ago, with most lying
around 150 000 years ago; all have wide confidence intervals (Underhill et al. 1997; Hammer et al. 1998; Thomson
et al. 2000).

Box 2 Selection on the Y chromosome
Studies of human Y-chromosomal diversity implicitly
assume that the variation being studied is selectively
neutral. This may indeed be true of the SNPs, microsatellites, and other markers themselves; however, the
absence of recombination means that all of these
markers are permanently linked to genes which may
themselves be the targets of natural selection. Positive
selection on a Y-chromosomal gene could lead to the
fixation, in a ‘selective sweep’, of a particular haplotype:
this is the hitch-hiking effect, in which the ‘neutral’
markers are carried as passengers with the selected
locus. If selective sweeps have occurred in human
history, they would result in an unexpectedly recent
common ancestor for all Y chromosomes (Whitfield
et al. 1995). Under neutral expectation and given its
likely long-term effective population size of 5000
(Hammer 1995), we would expect to find a common
ancestor for all extant Y chromosomes, in the absence of
selection, around 200 000 years ago. Most estimates
based upon unique-event polymorphisms have been
consistent with this expectation (Underhill et al. 1997;
Hammer et al. 1998); although a more recent estimate
(Thomson et al. 2000) gives a younger date, this may be
a reflection of assumptions about demography, rather
than selection.
The potential for negative selection on the Y chromosome is obvious given its pivotal role in male reproductive
fitness: the Y bears not only the sex-determining locus,
but also several genes essential in spermatogenesis
(Vogt et al. 1996). Studies are underway to determine
whether particular Y-chromosomal haplotypes predispose to infertility (Jobling et al. 1998b; Kuroki et al.
1999; Previderé et al. 1999) through comparison of
affected and control populations. These studies provide
conflicting evidence, and need careful interpretation if
selective influences are not to be confused with effects
due simply to population structure.
A number of other diverse phenotypes have been
ascribed to the human Y chromosome, including
stature (Salo et al. 1995), tooth-size (Alvesalo & de la
Chapelle 1981), and cerebral asymmetry (Crow 1999). In
primate evolution all of these could have been subject to
sexual selection, though it is unclear to what extent this will
have been important in the evolution of Homo sapiens.
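The neutral expectation quoted in Box 2 (an effective population size of 5000 implying a common ancestor around 200 000 years ago) can be reproduced with a minimal sketch. It assumes the standard neutral coalescent result for a haploid locus, in which the expected time to the MRCA of n sampled chromosomes is 2N(1 − 1/n) generations, and a generation time of 20 years. Neither the formula nor the generation time is stated in this review; the 20-year figure is chosen only because it recovers the quoted date.

```python
def expected_tmrca_generations(n_e, sample_size=None):
    """Expected time to the MRCA, in generations, for a haploid locus under
    the standard neutral coalescent (assumed model). With no sample size the
    large-sample limit 2 * n_e is returned."""
    if sample_size is None:
        return 2.0 * n_e
    return 2.0 * n_e * (1.0 - 1.0 / sample_size)

GENERATION_YEARS = 20  # assumption, not a value given in the review

generations = expected_tmrca_generations(5000)
print(generations * GENERATION_YEARS)  # 200000.0
```

Because the expectation scales linearly with both the effective population size and the assumed generation time, modest changes to either move the expected date substantially, which is one reason the published estimates carry wide confidence intervals.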
Recurrent event markers
Unique-event polymorphisms on the Y chromosome are
often relatively population-specific (e.g. Seielstad et al.
1994; Hammer et al. 1997). Consequently the necessary use
of only a small number of individuals when searching for
variation can result in an ascertainment bias. Useful
measures of diversity require markers that are polymorphic in all populations: microsatellites, tandem arrays
of 1–6 bp repeat units that are highly variable in allele
length, have this property. There are at least 34 known
microsatellite loci on the Y chromosome (Kayser et al. 1997;
White et al. 1999; Ayub et al. 2000). Of these, seven have
been widely used for both evolutionary and forensic
purposes by virtue of their high levels of diversity and
unambiguous allele designation (Roewer et al. 1996; Kayser
et al. 1997). Microsatellites on the Y chromosome show
diversity roughly equivalent to that of autosomal loci
(Roewer et al. 1992; Goldstein et al. 1996), and correspondingly
similar mutation rates; ascertained through pedigree analysis, the per-locus mutation rate for eight tetranucleotide
repeat loci is 2.80 × 10⁻³ per generation (95% confidence
interval 1.72–4.27 × 10⁻³) (Kayser et al. 2000).
Fig. 2 An unrooted maximum parsimony Y chromosome tree
constructed from unique-event, binary polymorphisms. This tree
(Jobling & Tyler-Smith 2000) is unrooted due to uncertainty about
ancestral states and uses 24 polymorphisms to define 26
compound haplotypes (haplogroups) represented by circles.
Arrows point to derived states where this information is known.
Mutations are named where they can be typed by PCR. Several
other trees, constructed using mostly different polymorphisms,
are currently in use (Underhill et al. 1997, 2000; Hammer et al.
2000).
As with unique-event polymorphisms, alleles at multiple
microsatellite loci on the Y chromosome can be combined
into compound haplotypes. Unlike unique-event markers,
however, these haplotypes cannot be used to construct a
single most parsimonious tree (Roewer et al. 1996; Forster
et al. 2000). Each haplogroup, in being a monophyletic
lineage, is founded by a single male with, by definition,
zero diversity at the multiallelic loci. Thus, recurrent polymorphisms, particularly slowly mutating ones, retain some
phylogenetic information and their compound haplotypes
show some statistical correspondence with the tree of
haplogroups (Bosch et al. 1999; Forster et al. 2000).
As well as microsatellites, the human Y chromosome
bears a number of other highly polymorphic systems,
including a single hypervariable minisatellite (Jobling et al.
1998a), which can be assayed using minisatellite variant
repeat (MVR) polymerase chain reaction (PCR) (Jeffreys
et al. 1991). These will not be discussed further here.
How unique is unique?
The simple classification of markers as ‘unique’ and
‘recurrent’ is useful, but not definitive. First, some markers
with very low mutation rates are nonetheless found to be
recurrent. One of the earliest markers to be discovered
(Whitfield et al. 1995), the A to G transition at SRY-1532, has
reverted, such that A alleles are found on both ancestral
and derived haplotypes (haplogroups 7 and 3, respectively
— see Fig. 2). A second recurrent marker has been described
(Shen et al. 2000), but they remain in a very small minority.
It should be noted that no SNP on the Y chromosome can
be considered to be ‘unique’ in an absolute sense, because
even though the mutation rate is very low (about 2 × 10⁻⁸
per site per generation; Thomson et al. 2000), the very large
number of extant human Y chromosomes outweighs this.
However, given the relatively small sample sizes normally
used in human population studies such recurrences will
not often be observed. It is reassuring that they can be easily
recognized when many markers are available, and a single
most parsimonious tree can still be constructed (Hammer
et al. 1998). Thus the label ‘unique’ is a relative term, as it
contains a consideration of the extent of sampling of the
entire intraspecific diversity.
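The scale argument in this paragraph can be illustrated with simple arithmetic. The sketch below multiplies the per-site mutation rate quoted above by a number of chromosomes to give the expected number of new mutations at one given site in a single generation. The figure of three billion extant Y chromosomes and the sample size of 1000 are round illustrative assumptions, not values from this review.

```python
MU_PER_SITE = 2e-8  # per site per generation, as quoted in the text

def expected_new_mutations(n_chromosomes, mu=MU_PER_SITE):
    """Expected number of new mutations at one given site, in one generation,
    among n_chromosomes transmitted Y chromosomes."""
    return n_chromosomes * mu

all_males = expected_new_mutations(3_000_000_000)  # assumed world total
one_study = expected_new_mutations(1_000)          # assumed sample size

print(f"whole species: about {all_males:.0f} per site per generation")
print(f"typical sample: about {one_study:.0e} per site per generation")
```

Under these assumptions every site is hit many times per generation somewhere in the species, yet almost never within the chromosomes that a single study happens to sample, which is the sense in which ‘unique’ depends on the extent of sampling.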
Second, ‘unique’ mutational events can sometimes be
distinguished within the evolution of a multiallelic locus.
For example the microsatellite DYS390 contains short alleles
that result from the deletion of a large number of repeats
(Forster et al. 1998). Such deletions assayed at the sequence
level define monophyletic lineages. Furthermore, when
the time-depth under investigation is sufficiently shallow
(e.g. within a human pedigree containing only a few
generations), specific alleles at even the fastest mutating
loci can usefully define sublineages (Foster et al. 1999).
The genealogical approach
The wide range of independent polymorphic systems
having different mutational properties makes the Y chromosome a powerful tool for investigating anthropological
questions on different time-scales. A combinatorial approach
that uses both unique and recurrent mutations allows
optimum discrimination of both individuals and populations (Jobling & Tyler-Smith 1995; Mitchell & Hammer
1996; Santos & Tyler-Smith 1996). These markers are best
combined hierarchically, defining lineages with unique
event markers before investigating intralineage diversity
with faster mutating recurrent markers (de Knijff et al.
1997). This kind of approach has been termed ‘genealogical’
(Richards et al. 1997).
Many recurrent mutations observed in a phylogeny of
microsatellite haplotypes within a population can be eliminated, if haplotype networks are instead constructed within
individual haplogroups defined by unique mutations, as
illustrated in Fig. 3. This is due to the fact that all chromosomes within a haplogroup are derived from a common
ancestral chromosome that is likely to be far younger than
the ancestral chromosome of a geographically defined
population (de Knijff et al. 1997). A recent study has confirmed this a priori expectation by showing that microsatellite haplotypes are far better structured by lineage
than by population (Bosch et al. 1999). The amount of
diversity observed within a haplogroup defined by unique
mutations, assayed using recurrent mutations with much
higher mutation rates, such as microsatellites or minisatellites, gives information on the demographic history of
that haplogroup (de Knijff et al. 1997).
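The hierarchical logic described here, in which chromosomes are first assigned to haplogroups by unique-event markers and multiallelic diversity is then measured within each haplogroup, can be sketched as follows. The diversity measure used is Nei's unbiased gene diversity applied to whole microsatellite haplotypes, which is one common choice rather than the specific statistic of any study cited above. The haplogroup labels and repeat counts are invented for illustration.

```python
from collections import Counter, defaultdict

def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum of squared
    haplotype frequencies). Returns 0.0 for fewer than two chromosomes."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

def diversity_by_haplogroup(samples):
    """samples: iterable of (haplogroup, microsatellite_haplotype) pairs.
    Haplotypes are grouped by haplogroup before diversity is computed."""
    groups = defaultdict(list)
    for haplogroup, haplotype in samples:
        groups[haplogroup].append(tuple(haplotype))
    return {hg: gene_diversity(h) for hg, h in groups.items()}

# Hypothetical data: repeat counts at three microsatellite loci.
samples = [
    ("hgA", (14, 12, 23)), ("hgA", (14, 12, 23)), ("hgA", (14, 13, 23)),
    ("hgB", (15, 11, 24)), ("hgB", (16, 11, 24)),
    ("hgB", (15, 12, 24)), ("hgB", (15, 11, 25)),
]
for haplogroup, h in diversity_by_haplogroup(samples).items():
    print(haplogroup, round(h, 3))
```

In this toy data set the second haplogroup is the more diverse, which under the reasoning above would be read as a sign of an older or more rapidly expanded lineage, subject to the demographic caveats raised later in the text.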
The advantages of a genealogical approach depend
upon the existence of suitable polymorphic unique event
markers within the population of interest. Some studies
focusing on populations for which few markers are available have attempted to use multiallelic haplotypes to
define lineages. Using individual haplotypes for this purpose is not meaningful, since their diversity is too high;
many are private to particular populations which makes
population comparisons impossible. Consequently a
number of recent studies have defined sublineages on the
basis of a grouping of related compound microsatellite
haplotypes (e.g. Malaspina et al. 1998). At present these
groupings are performed qualitatively, by inspecting haplotypes within a lineage defined by unique event markers
and defining subsets of haplotypes that seem more closely
related to one another than to others outside the grouping.
Fig. 3 Schematic diagram illustrating the minimizing of confusion between identity by state (IBS) and identity by descent (IBD) through
the genealogical approach. The evolution of a single microsatellite locus is followed through three different lineages. Numbers indicate the
length of the microsatellite allele in repeat units. Networks under each lineage are constructed from the extant diversity at the single
microsatellite locus. Circles represent individual microsatellite alleles the length of which is written inside the circle. Circle area is
proportional to the observed frequency of each allele. Note that the network of the young lineage faithfully recapitulates the true
evolutionary connections, whereas the networks of the older lineages confuse IBD and IBS to a limited degree, but not as much as the lowest
network which represents the sum of the two older networks, effectively ignoring the existence of the marker which distinguishes between
the two lineages.
Whilst attempting to define monophyletic lineages
using multiallelic markers can render subsequent age
estimates meaningless if polyphyletic chromosomes are
clumped together, the general application of such markers
to lineage definition is not without merit, and, because
variation is detected in all populations, goes some way to
counteracting the ascertainment bias present in many
collections of unique-event markers. Many haplogroups
show substantial internal substructuring that can subsequently be shown to precisely recapitulate monophyletic
sublineages defined by novel unique event markers (MEH,
unpublished observations). What is needed is an unbiased
method for defining lineages within such networks.
The genealogical approach at work
As we have previously noted, the construction of a most
parsimonious tree from unique event markers is trivial.
Subsequent analysis of multiallelic diversity within lineages
has been performed with two aims in mind — first, to
provide a graphical display of the apparent diversity in an
attempt to reveal any substructure, and second to quantify
the diversity within a lineage. A number of different
measures of diversity have been used to relate intralineage
diversity to the age of a lineage.
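One way to see why tree construction from unique-event markers is straightforward is the pairwise compatibility test sketched below. For binary markers with known ancestral (0) and derived (1) states, a perfect phylogeny exists only if, for every pair of markers, the sets of chromosomes carrying the derived allele are either nested or disjoint; any pair that fails the test flags a recurrent or reverted marker. This is a generic sketch of that test, not the procedure of any particular study, and the haplotype names and states are invented.

```python
from itertools import combinations

def derived_sets(table):
    """table maps haplotype name -> tuple of 0/1 states, one per marker.
    Returns, for each marker, the set of haplotypes carrying the derived state."""
    n_markers = len(next(iter(table.values())))
    return [
        frozenset(name for name, states in table.items() if states[m] == 1)
        for m in range(n_markers)
    ]

def incompatible_pairs(table):
    """Return index pairs of markers whose derived sets overlap without
    one containing the other (no perfect phylogeny for that pair)."""
    sets = derived_sets(table)
    bad = []
    for i, j in combinations(range(len(sets)), 2):
        a, b = sets[i], sets[j]
        if a & b and not (a <= b or b <= a):
            bad.append((i, j))
    return bad

# Hypothetical character-state table: three markers, four haplotypes.
table = {"h1": (1, 1, 0), "h2": (1, 0, 0), "h3": (0, 0, 1), "h4": (0, 0, 0)}
print(incompatible_pairs(table))        # []

# The same table with a recurrence of marker 1 on haplotype h3.
with_recurrence = dict(table, h3=(0, 1, 1))
print(incompatible_pairs(with_recurrence))  # [(0, 1)]
```

When the list is empty the nested sets can be read directly as the branches of the tree. A non-empty list identifies the markers that need to be examined for recurrence.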
Lineage substructure
Diversity is displayed graphically so as to tease patterns
out of the data that may not be obvious on simple inspection.
The most popular methods seek to identify evolutionarily
important links between haplotypes and display them in
networks or trees. In principle, multivariate statistical
methods can summarize such multidimensional data with
minimal loss of information but are rarely used to display
haplotypic diversity.
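As an illustration of the simplest kind of network considered in this section, the sketch below links microsatellite haplotypes by every connection that belongs to at least one minimum spanning tree, so that ties appear as reticulations rather than being resolved arbitrarily. The distance between haplotypes is taken to be the summed absolute difference in repeat number across loci, which is an assumption in keeping with a single-step mutation model and not necessarily the measure used by published programs. The haplotypes are hypothetical.

```python
from itertools import combinations

def step_distance(h1, h2):
    """Summed absolute difference in repeat number across loci (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def minimum_spanning_network(haplotypes):
    """Return edges (i, j, distance) that occur in at least one minimum
    spanning tree of the complete distance graph over the haplotypes."""
    n = len(haplotypes)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (step_distance(haplotypes[i], haplotypes[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    network = []
    k = 0
    while k < len(edges):
        d = edges[k][0]
        group = []
        while k < len(edges) and edges[k][0] == d:
            group.append(edges[k])
            k += 1
        # Keep every edge of this length that joins two clusters, judged
        # before any edge of the same length has been merged.
        joining = [(i, j) for _, i, j in group if find(i) != find(j)]
        network.extend((i, j, d) for i, j in joining)
        for i, j in joining:
            parent[find(i)] = find(j)
    return network

# Four hypothetical two-locus haplotypes forming a square of single steps.
haps = [(14, 12), (15, 12), (14, 13), (15, 13)]
print(minimum_spanning_network(haps))
```

For these four haplotypes all four single-step links are retained, giving a closed loop: the data cannot say which of the two possible paths between opposite corners reflects descent, and the network displays that ambiguity instead of hiding it.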
The likely mechanism of microsatellite mutation (single-step slippage) results in recurrent mutation being common (Ciminelli et al. 1995; Cooper et al. 1996), and in the
existence of very many equally parsimonious trees (Roewer
et al. 1996). These are of limited phylogenetic use because
linkages due to recurrent mutation (identity by state — IBS)
cannot be distinguished from those due to single mutational events on ancestral haplotypes (identity by descent
— IBD). The likelihood that any one of the set of possible
trees represents the real evolutionary relationship of haplotypes is very low. Nonetheless, unique trees are still presented in many studies (Bergen et al. 1999; Ruiz-Linares
et al. 1999). Much attention has focused on the use of networks which include reticulations to represent a set of
equally parsimonious trees. Many analytical applications
do not require a single phylogeny but rather summaries
that encompass multiple trees. Often such summaries can
be highly supported by the data despite weak support for
individual trees. A number of different algorithms have
been applied to network construction, varying in complexity
from the computationally intense to those that can be done
by hand. Minimum spanning networks are probably the
simplest of these methods, though their failure to include
unobserved ancestral haplotypes renders trees within
them evolutionarily implausible (Bandelt et al. 1995).
Recently two new methods, namely reduced median (RM)
and median-joining (MJ) networks, have been developed.
It is claimed that these methods can produce networks
containing all most parsimonious trees within a given
data set (Bandelt et al. 1995, 1999). Performed by a freely
available computer program, these analyses also provide
a number of criteria that can be invoked to reduce the
number of reticulations within the network, thus making
it more tree-like. A further refinement weights loci according to their apparent mutation rate (Forster et al. 2000;
Helgason et al. 2000; Kayser et al. 2000), resulting in additional network reduction. Both network methods have
been used in a number of mtDNA studies to cope with the
homoplasies inherent in sequence data from the hypervariable sequences of the mitochondrial genome. More
recently RM and MJ networks have been used to display
Y-chromosomal microsatellite diversity to good effect (see
Fig. 4; Forster et al. 1998, 2000; Hurles et al. 1999).
The superimposition of additional information upon
networks or trees of multiallelic diversity can be used to
identify relationships that, after further testing for significance, can allow new inferences to be drawn and tested.
Superimposing population affinity allows population
structuring of haplotypes to be easily seen, which can then
be confirmed statistically (Zerjal et al. 1997, 1999; Hurles
et al. 1999; Helgason et al. 2000). In addition, the superimposition of allelic states whose recurrency is unknown can
allow unique events to be distinguished by virtue of their
defining a single cluster within a network (Hurles et al.
1998). Thus the graphical display of data represents a tool
to guide further analyses rather than an endpoint in itself.
Fig. 4 Published networks of human Y chromosomal microsatellite haplotypes. (a) Median-Joining network of haplogroup
22 (see Fig. 2), adapted from Hurles et al. (1999). This single-step
network exhibits little diversity and is identical to a minimum
spanning network based on the same data. (b) Reduced Median
network of global Y-chromosomal microsatellite diversity, adapted
from Forster et al. (2000). This multistep network exhibits great
diversity and reveals the phylogenetic information within haplotypes of slower mutating microsatellites.
Dating
Dating the branchpoints of the tree of lineages is of prime
importance in interpreting geographical distributions of
lineages in terms of population history. A monophyletic
lineage has by definition been founded by a single
chromosome, with zero diversity. Thus the diversity within a
lineage can be related to the time since the lineage was
founded, or more correctly the time since the MRCA of
sampled chromosomes. Multiallelic markers with high
mutation rates can be used to investigate the intralineage
diversity. The markers most often used for this work are
microsatellites, although the minisatellite MSY1 has also
been used (Jobling et al. 1998a). There are a number of
confounding factors that contribute to the extent and
nature of the multiallelic diversity within a lineage. Box 3
details the more popular methods used to estimate the age
of the MRCA of a lineage from its extant diversity; little
work has been done to compare these different methods.
The impact of demography on these estimates and their
confidence limits has only begun to be appreciated (Thomas
et al. 1998), and much work remains before these estimates
can be regarded as being robust to prehistoric demographic perturbations. A recent example has shown
how the inclusion of an expanding population model
within coalescent dating substantially reduces the age of
the MRCA of all extant human Y chromosomes (Thomson
et al. 2000).
Coalescent simulations can give age estimates for lineages, based on tree topology and lineage frequency alone,
independent of intralineage diversity (Hammer et al. 1998).
These methods use estimates of sequence divergence rates
that depend on fossil record calibrations, whereas those
that use intralineage multiallelic marker diversity do not.
Little work has been done to compare dates calculated
using these different methods within the same samples,
and comparisons are problematic given that confidence
intervals are so large (Bosch et al. 1999). If we ignore the
confidence intervals, it seems that ages from intra-allelic
diversity are younger than those from coalescent dating:
for example, microsatellite diversity within haplogroup 16
yields an age of about 4000 years (Zerjal et al. 1997) while
coalescent analysis gives an age of 8400 years (Karafet et al.
1999), although the confidence intervals are ±7000 years.
Possible discrepancies between dating methods might relate to the
differences in their underlying population and mutational
models. The latter should be refined by a consideration of
allele-specific mutation dynamics, given the growing
evidence for the mutational freezing of small alleles
(Carvalho-Silva et al. 1999). For a discussion of factors potentially causing divergent age estimates from different analyses see Bosch et al. (1999).
Once the age of a lineage has been calculated, the question of its anthropological relevance arises. One issue that
has been much debated is the equating of lineage age with
population age. This practice has been widely attacked as
failing to appreciate the differences between a population
and a lineage MRCA (Cavalli-Sforza & Minch 1997). To
adapt a published analogy (Barbujani 1997), if a future
human colony were established on Mars, the Y-chromosomal
lineages of the colonists might coalesce in the Palaeolithic,
which would not mean that Mars had been colonized
15 000 years ago. Estimates of the age of the MRCA of a
single allele are susceptible to stochastic effects and are poor
estimates of population age. In general alleles are thought
to be older than the populations in which they are found.
This debate has unfortunately eclipsed the fact that providing a temporal aspect to patterns of lineage sharing can
inform in other, anthropologically useful ways. A number
of Y-chromosomal studies have used such lineage ages to
exclude one of a number of competing hypotheses for a
lineage’s distribution (Hurles et al. 1998, 1999).
Box 3 Dating
Nodes in the Y-chromosomal phylogeny are commonly
dated using intralineage diversity at fast mutating multiallelic markers. The mutation rates of these markers are
sufficiently high to allow them to be determined by
analysing pedigrees, and with these rate estimates in
hand, a variety of different methods have been used to
relate diversity to age.
Perhaps the simplest dating method rests on the
assumption that the proportion of mutants within a
population is equal to the product of the mutation rate
and the time since the first appearance of the ancestral
allele. This relationship was first hypothesized by Luria
and Delbruck in their study of bacterial mutation in the
1940s (Luria & Delbruck 1943). More recently it has
been adapted to microsatellite haplotypes, by taking
into consideration a single-step mutational model,
when dating the origin of the commonest cystic fibrosis
mutation among Europeans (Bertranpetit & Calafell
1996). This method requires that a root haplotype be
identified and that the number of mutational steps from
each haplotype to the root be averaged over all haplotypes
and all loci. This statistic, when calculated from a tree, is known
as ρ (rho) (Forster et al. 1996) and is divided by the
mutation rate per generation (ρ/µ = t) to give the age
of the MRCA in generations.
Another method that assumes linearity of a statistic
with respect to time since the founding of a lineage is
average squared distance (ASD) dating (Goldstein et al.
1995a, 1995b; Thomas et al. 1998). The ASD statistic is
simply the squared mutational distance between a root
haplotype and any other haplotype within the lineage,
averaged over all loci and all haplotypes. Again dividing
this statistic by the mutation rate gives the age of the
MRCA of the lineage in generations. The linearity of ASD
has also been investigated for dating population splits.
In this case it was found that a related statistic performed
better. This statistic, known as (δµ)², is a version of ASD corrected for intrapopulation variance (Goldstein et al. 1995b).
Based on neutral theory, the root haplotype is assumed
to be the most frequent haplotype, if the number of
mutations is small; this assumption has been little
discussed, however, and is highly sensitive to sampling.
A number of methods do not require that a root be
specified, relating the variance of microsatellite allele
lengths within a lineage to time (Goldstein et al. 1996;
Kittles et al. 1998). Recently, intra-allelic diversity has been
analysed using a Bayesian-based coalescent methodology to estimate the age of an MRCA; such methods are
in their infancy but are likely to become increasingly
popular (Wilson & Balding 1998; Hurles et al. 1999).
Generation time and mutation rate are key parameters
in all dating methods considered above and while confidence limits around mutation rates are being narrowed,
much uncertainty in generation time remains. Although
it is generally accepted that male generation time is
longer than that of females, generation times varying
from 20 to 30 years have been published. One group
consistently uses the figure of 27 years for male generation
time that comes from studies of extant hunter–gatherer
societies (Weiss 1973; Underhill et al. 1996). Longer
generation times have been proposed from the analysis
of well-documented genealogies (Tremblay & Vezina
2000); however, the relevance of relatively modern demographies to prehistorical societies remains doubtful.
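As a minimal sketch of the two root-based estimators described in Box 3, the Python below computes ρ and ASD from microsatellite repeat counts and divides each by a mutation rate to give an age in generations. It assumes a single-step mutation model in which the number of steps at a locus is the absolute difference in repeat count from the root, so recurrent and back mutations are ignored. The haplotypes, the root and the mutation rate of 0.002 per locus per generation are hypothetical values chosen for illustration; the 27-year generation time is the figure quoted in Box 3, and the function names are illustrative only.

```python
def rho_and_asd(haplotypes, root):
    """Return (rho, ASD) for sampled haplotypes relative to a root haplotype.

    Each haplotype is a tuple of repeat counts, one per locus. Under the
    assumed single-step model the number of mutational steps at a locus is
    the absolute difference in repeat count from the root.
    """
    diffs = [hap[i] - root[i] for hap in haplotypes for i in range(len(root))]
    rho = sum(abs(d) for d in diffs) / len(diffs)  # mean steps to the root
    asd = sum(d * d for d in diffs) / len(diffs)   # mean squared distance
    return rho, asd


def age_in_generations(statistic, mutation_rate):
    """Age of the MRCA in generations: statistic divided by mutation rate."""
    return statistic / mutation_rate


# Hypothetical data: five chromosomes typed at three microsatellites.
root = (14, 12, 23)
sample = [(14, 12, 23), (14, 12, 23), (15, 12, 23), (14, 11, 23), (14, 12, 25)]
MU = 0.002             # assumed mutation rate per locus per generation
GENERATION_YEARS = 27  # generation time quoted in Box 3

rho, asd = rho_and_asd(sample, root)
for name, stat in (("rho", rho), ("ASD", asd)):
    generations = age_in_generations(stat, MU)
    print(f"{name}: {stat:.3f} -> {generations:.0f} generations, "
          f"{generations * GENERATION_YEARS:.0f} years")
```

In this toy sample the two-repeat difference at the third locus counts twice towards ρ but four times towards ASD, which is why the two estimators give different ages whenever multi-step differences are present.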
Fig. 5 Schematic graph indicating the
greater geographical differentiation of the
Y chromosome compared to both mtDNA
and autosomal loci (adapted from Seielstad
et al. 1998; see text).
The pattern of human Y chromosome diversity
Most genetic, palaeontological and archaeological evidence concurs in suggesting that anatomically modern
humans arose in Africa around 150 000 years ago (Stringer
& McKie 1996; Quintana-Murci et al. 1999). Because of the
relative youth of our own species, human genetic diversity
is limited compared to that of our closest relatives, the
great apes (Gagneux et al. 1999), and differences between
human populations are also correspondingly less marked
than those between populations of chimps, for example.
Studies using ‘classical’ markers [protein polymorphisms
and blood groups (Lewontin 1972)], and more recent
studies using nuclear DNA polymorphisms (Barbujani
et al. 1997) are consistent in showing that approximately
80–85% of human genetic diversity lies within populations,
rather than between them. The question then arises of
whether this is equally true of all loci within the genome.
Early analyses using Y-chromosomal polymorphisms
(e.g. Torroni et al. 1990) suggested that Y haplotypes might
show a relatively high degree of population specificity, and
this was confirmed as more markers were discovered (e.g.
Seielstad et al. 1994; Underhill et al. 1996; Zerjal et al. 1997;
Hurles et al. 1999).
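To make the idea of diversity lying 'within populations, rather than between them' concrete, the following Python sketch partitions gene diversity in the style of Nei's H_S/H_T. This is a simplified illustration and not the method of the studies cited above; the haplotype frequencies are hypothetical, populations are weighted equally, and the function names are our own.

```python
def gene_diversity(freqs):
    """Gene diversity, 1 - sum(p^2), for a dict of haplotype frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def within_population_fraction(populations):
    """Fraction of total gene diversity found within populations (H_S / H_T).

    `populations` is a list of dicts mapping haplotype -> frequency.
    Populations are weighted equally (a simplifying assumption).
    """
    n = len(populations)
    h_s = sum(gene_diversity(p) for p in populations) / n
    haplotypes = set().union(*populations)
    pooled = {h: sum(p.get(h, 0.0) for p in populations) / n
              for h in haplotypes}
    h_t = gene_diversity(pooled)
    return h_s / h_t


# Hypothetical frequencies: a weakly and a strongly differentiated locus.
weak = [{"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5}]
strong = [{"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9}]

print(f"weakly differentiated locus:   {within_population_fraction(weak):.2f}")
print(f"strongly differentiated locus: {within_population_fraction(strong):.2f}")
```

A locus with population-specific haplotypes, as suggested for the Y chromosome, would behave like the second example, with a much smaller share of its diversity found within populations.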
Carrying out a quantitative comparison across different
loci of the degree of genetic differentiation between different populations is not easy. First, comparisons will not be
valid unless the same population samples are analysed for
all loci. Second, the markers used to analyse diversity on
the Y chromosome, the rest of the nuclear genome, and
mtDNA differ in their mutational properties, and this must
be taken into account (Stoneking 1998). Third, although the
effective population sizes of the Y chromosome and mtDNA
are both expected to be around one quarter of those of any
autosome, there may be less obvious differences between
the effective population sizes of males and females, which
may bias the results (Nei 1987). One study has been carried
out (Seielstad et al. 1998) which at least takes the first of
these issues into account, and this suggests that, within
Africa and globally, the Y chromosome shows far more
genetic differentiation with geographical distance than do
either other nuclear loci or mtDNA (Fig. 5). Despite valid
criticisms of this work (Stoneking 1998), research both
before (e.g. Salem et al. 1996) and after (Kayser et al. 2000;
MEH unpublished results) this landmark study testifies to
the generality of the phenomenon.
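The expectation that the Y chromosome and mtDNA each have about one quarter of the autosomal effective population size can be illustrated with the standard textbook approximations for an idealised population with separate sexes. These formulas are not taken from the studies cited here, and the breeding numbers below are hypothetical; the sketch only shows how an unequal number of breeding males and females shifts the ratios.

```python
def effective_sizes(n_males, n_females):
    """Approximate effective population sizes for an idealised population.

    Standard approximations (assumed here, not taken from the text):
      autosomes: 4 * Nm * Nf / (Nm + Nf)
      Y chromosome: Nm / 2
      mtDNA: Nf / 2
    where Nm and Nf are the numbers of breeding males and females.
    """
    autosome = 4 * n_males * n_females / (n_males + n_females)
    return {"autosome": autosome, "Y": n_males / 2, "mtDNA": n_females / 2}


# Hypothetical breeding numbers: equal sexes, then fewer breeding males.
for n_m, n_f in ((500, 500), (250, 1000)):
    ne = effective_sizes(n_m, n_f)
    print(f"Nm={n_m}, Nf={n_f}: "
          f"Y/autosome = {ne['Y'] / ne['autosome']:.3f}, "
          f"mtDNA/autosome = {ne['mtDNA'] / ne['autosome']:.3f}")
```

With equal numbers of breeding males and females both ratios are one quarter; when fewer males than females breed, the Y-chromosomal ratio falls below one quarter while the mtDNA ratio rises above it.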
In a sense it is not the Y chromosome’s high geographical
differentiation that is surprising, but rather mtDNA’s lack
of it, since both are expected to have similar effective population sizes. The low effective population sizes of both loci
compared to autosomes ought to lead to a greater effect of drift
and more rapid divergence of populations, in both time
and space. The apparent difference between the two loci
was explained in the study of Seielstad et al. (1998), by invoking a higher prehistoric rate of female migration, which
was calculated as eight times higher than that of males.
While this might seem counterintuitive, given that males often
seem to be more mobile during wars and colonizations, for
example, the crucial parameter is intergenerational movement. It is thought that the majority of human populations
practise patrilocality (reviewed in Quinn 1977), where the
children of two people who come from different places are
brought up in the paternal place of origin. If this is so, we expect Y chromosomes to remain relatively static while mtDNAs
are mobile. Over many generations this will have the effect
of homogenizing the human mtDNA genetic landscape.
There are other possible explanations for this difference
in patterns of mtDNA and Y diversity: higher male than
female mortality before reproduction, local selection on Y
chromosomes, and the practice of polygyny, where some
males take many wives, may all contribute. Another factor
may be the intergenerational transmission of offspring
number. Simulation work on the over-representation
of disease alleles (Austerlitz & Heyer 1998) and mtDNA
lineages (Murray-McIntosh et al. 1998) suggests that if
intergenerational transmission of offspring number were
stronger through paternal lineages than through maternal
lineages, as might be expected in a patrilinearly based society, then this could be a powerful mechanism for limiting
local Y-chromosomal diversity and could lead to extensive
population structuring. All of these factors may contribute
towards a significantly lower effective population size for
Y chromosomes than for mtDNAs. A possible alternative
explanation lies in recent suggestions that recombination
between human mtDNAs operates as a confounding factor
for studies of maternal lineages (Awadalla et al. 1999);
however, the data used in this study have been criticised
(Kivisild & Villems 2000), and multiple whole genome
sequences of mitochondria show no evidence of recombination (Ingman et al. 2000).
Whatever its explanation, the geographical differentiation of the Y chromosome is such that many lineages defined
by unique markers are found to have population-specific
distributions (Jobling et al. 1997). Different populations
tend to have very different distributions of haplogroups
(Fig. 6). This means that the Y chromosome is probably the
most sensitive genetic tool in the human genome for
detecting admixture between populations.
Fig. 6 Haplogroup distributions found in four different continental populations, based on the tree of haplogroups shown in
Fig. 5, showing only those haplogroups that can be typed by PCR.
From patterns to events
The presence on the Y chromosome of a range of polymorphic markers with different mutation dynamics and
rates allows inferences to be made about events at multiple
time-depths within human prehistory. Most of these events
are directly analogous to issues in ecological genetics.
Origins of Species
Positioning the origin of modern humans in time and space
is a common aim of global studies of diversity. A study by
Hammer et al. (1998) used nested cladistic analysis to
analyse the global distribution of a Y-chromosomal
phylogeny. The most ancestral Y-chromosomal lineage,
the root of all others, was found exclusively in sub-Saharan
Africa. This finding, together with the consensus age for this
ancestral haplotype within the past 200 000 years, has been
taken to strongly support the Out-of-Africa model for
modern human origins. A more recent paper has dated the
MRCA of all modern Y chromosomes as being significantly
younger, around 50 000 years (Thomson et al. 2000), but is
still consistent with an African origin.
Range expansions into terra incognita
Human prehistory contains a number of dramatic examples
of population movements into lands previously uninhabited.
Historically, the timing and origin of such migrational
movements have been studied within many disciplines.
Recent studies of the Y-chromosomal origins of American
Indians and their movement across the Beringian landbridge have failed to support the linguist Joseph Greenberg’s
hypothesis of three separate migrations (Karafet et al. 1999;
Santos et al. 1999).
Y-chromosomal studies have also challenged the
dominant archaeo-linguistic model of a Taiwanese origin
for the Austronesian diaspora that led to the colonization
of Polynesia. Taiwan houses both the oldest archaeological
evidence (Spriggs 1989) of Austronesian people and the
greatest linguistic diversity (Blust 1999). A recent study
demonstrated an absence of Y-chromosomal haplotypes
ancestral to those found in Polynesia amongst Taiwanese
aborigines (Su et al. 2000).
Admixture
The issue of admixture underpins many of the contentious
issues in human evolutionary genetics. Y-chromosomal
studies have been used to identify sex-biased admixture
with the finding of European Y chromosomes but not
mtDNA in Polynesia (Hurles et al. 1998). The large genetic
and geographical distances between these two populations
make admixture easy to spot. More subtle sex-differences
have been recently noted in the relative contributions of
the more closely related populations of Scandinavia and
Ireland to the colonization population of Iceland (Helgason
et al. 2000). Even harder to distinguish are the relative contributions of Palaeolithic and Neolithic peoples to extant
genetic diversity in Europe. With no reliable outgroups,
this question of admixture in both time and space remains
contentious despite contributions from Y-chromosomal
studies (Semino et al. 1996; Casalotti et al. 1999).
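The logic of detecting sex-biased admixture can be sketched with a simple frequency-based estimator: if a haplogroup has known frequencies in two source populations, its frequency in the admixed population gives a rough estimate of each source's contribution. This is a generic illustration under strong assumptions (two sources only, no drift or selection since admixture, accurately known source frequencies), not the method used in the studies cited above, and all frequencies below are hypothetical.

```python
def admixture_proportion(p_admixed, p_source_a, p_source_b):
    """Estimated contribution of source B to the admixed population.

    Solves p_admixed = (1 - m) * p_source_a + m * p_source_b for m and
    clips the result to the interval [0, 1].
    """
    if p_source_a == p_source_b:
        raise ValueError("haplogroup frequency must differ between sources")
    m = (p_admixed - p_source_a) / (p_source_b - p_source_a)
    return min(1.0, max(0.0, m))


# Hypothetical example: a Y haplogroup absent from source A, at 80% in
# source B and at 20% in the admixed population; an analogous mtDNA
# haplogroup from source B is absent from the admixed population.
m_y = admixture_proportion(p_admixed=0.20, p_source_a=0.0, p_source_b=0.80)
m_mt = admixture_proportion(p_admixed=0.0, p_source_a=0.0, p_source_b=0.70)
print(f"paternal contribution from B: {m_y:.2f}")
print(f"maternal contribution from B: {m_mt:.2f}")
```

A clear difference between the paternal and maternal estimates is the kind of signal interpreted as sex-biased admixture, although with closely related source populations the denominator becomes small and the estimate correspondingly unreliable.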
Isolates (no admixture)
Population isolates have long been important for medical
geneticists, and intriguing for anthropologists, and their
genetic diversity has been much investigated. The usual
suspects of Basque, Finnish and Jewish populations have
all been studied using Y-chromosomal markers. Rather
unexpectedly the Basques share a lineage with other
Iberian populations, and diversity within this lineage indicates that gene-flow must have occurred within the past few
millennia (Hurles et al. 1999). The Finns appear to owe
much of their Y-chromosomal, as well as linguistic, ancestry
to Central Asian populations with whom they share a
diagnostic lineage at high frequency (Zerjal et al. 1997). The
Y chromosome has also been used to confirm genetically
the paternal inheritance within Jewish populations of the
Cohanim priesthood (Thomas et al. 1998), as well as the
more general tendency of Jews not to mix their genes with
gentiles (Hammer et al. 2000), and the Middle-Eastern
origin of the Lemba, the ‘black Jews’ of South Africa (Thomas
et al. 2000).
Sex-differentiated behaviours
The power of comparing Y-chromosomal to mtDNA data
in studying sex-specific behaviours has been demonstrated
by an investigation into the relationship between global
differences in mtDNA and Y-chromosomal diversity and
linguistic affiliations (Poloni et al. 1997). Languages were
found to be better correlated with paternal lineages than
with maternal lineages, and it was suggested that this
might be due to children adopting their father’s language
rather than their mother’s, in contrast to the phrase ‘mother
tongue’ (Poloni et al. 1997). Alternatively, this could be
another effect of patrilocality, since mothers are more
mobile than fathers, and this mobility may be accompanied
by some language shift.
A similar approach was also used in an investigation of
the genetic impact of the Hindu caste system (Bamshad
et al. 1998). As might be expected this study found that this
mating structure, despite its relatively shallow time depth,
has indeed resulted in some degree of genetic stratification
of castes and that there is higher female than male gene flow between castes.
Paternity
The patrilineal inheritance of the Y chromosome has been
exploited in paternity cases where the putative father is
unobtainable; most famously this has supported the claim
that US president Thomas Jefferson fathered a son by a
mulatto slave (Foster et al. 1998). The rate of nonpaternity
in humans has been further elucidated in a study, by the
eponymous Sykes, of haplotype sharing among males
sharing the same surname (Sykes & Irven 2000). A monophyletic origin of this surname was suggested, thus allowing
the nonpaternity rate over the past 700 years to be estimated
as 1.3% per generation.
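The relationship between a per-generation nonpaternity rate and the fraction of present-day surname holders who still carry the founder's Y chromosome can be sketched as follows. The sketch assumes a single founder, a constant rate, and that every break in the link between surname and Y chromosome is counted as nonpaternity; the figure of 23 generations is a hypothetical value (roughly 700 years at about 30 years per generation), not one taken from the study.

```python
def fraction_retaining_founder(rate, generations):
    """Expected fraction of lineages still carrying the founder's Y."""
    return (1.0 - rate) ** generations


def per_generation_rate(fraction_retained, generations):
    """Per-generation rate implied by the fraction retaining the founder's Y."""
    return 1.0 - fraction_retained ** (1.0 / generations)


RATE = 0.013      # per-generation estimate quoted in the text
GENERATIONS = 23  # hypothetical: about 700 years at ~30 years per generation

retained = fraction_retaining_founder(RATE, GENERATIONS)
print(f"fraction retaining founder haplotype: {retained:.2f}")
print(f"rate recovered from that fraction: "
      f"{per_generation_rate(retained, GENERATIONS):.3f}")
```

Because the rate is compounded over generations, the estimate is sensitive to the assumed number of generations, which links this calculation to the uncertainty in generation time discussed in Box 3.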
Future developments
One potential problem with such comparative analyses of
Y-chromosomal and mtDNA diversity is the impact of
stochasticity. These loci, with their small effective population
sizes, are particularly prone to the vagaries of drift. The
question ‘when can differences between these two loci be
ascribed to sex-specific as opposed to stochastic differences?’
has not yet been explicitly addressed.
The need to develop an analytical framework under
which information from multiple phylogenies can be combined has been appreciated by both anthropologists and
ecologists (Avise 1989; Bermingham & Moritz 1998; Jin
et al. 1999). These phylogenies can come from multiple
species, multiple nonrecombining loci within the same
genome or indeed from nongenetic data (Pagel 1999). Phylogenies can be constructed from physical characteristics
such as cranial measurements or indeed cultural characteristics, most notably languages. A cross-fertilization of
analytical ideas between ecologists interested in interspecies comparative phylogeography, and anthropologists
interested in intragenomic comparative phylogeography is
to be encouraged. The human genome leads the way in
terms of published phylogenies constructed for nonrecombining portions of the genome, of which the Y chromosome
is but one (Harding et al. 1997; Jin et al. 1999; Kaessmann
et al. 1999). One recent paper constructed a phylogeny for
markers within a nonrecombining portion of chromosome
21 and compared it to a Y-chromosomal phylogeny to provide greater support for a substantial ancient back migration from Asia to Africa (Jin et al. 1999).
Diversity of haploid chromosomes in other species
Chromosomal sex-determining mechanisms have evolved
several times independently among animals. Furthermore,
after they have arisen, constitutively haploid chromosomes are evolutionarily labile: their gene content degrades
(Rice 1994), and they tend to lose and gain material at a
much greater rate than do other chromosomes, which are
constrained by the requirements of pairing and recombination. As a result, useful sequence homologies between the
Y chromosomes (or W chromosomes) of different species
are the exception, rather than the rule, and this has limited
the study of haploid chromosome diversity in species other
than humans. Most effort in nonhuman animals has gone
into developing markers for sex-testing purposes.
We therefore know little about the diversity of haploid chromosomes in other species. Diversity of nuclear
(Kaessmann et al. 1999) and mtDNA sequences (Gagneux
et al. 1999) in the great apes is much greater than that in
humans, and this is expected to be true of Y-chromosomal
sequences too. Preliminary studies of Y diversity among
orang-utans (Altheide TK & Hammer MF, submitted)
confirm that this is so, although sample sizes are small. As
a result of efforts to characterize the mouse genome, polymorphic markers on the Y chromosomes of species in the
Mus subgenus have been identified (the Mouse Genome
Database lists 71, of which 27 can be typed by PCR) but
these have not been exploited in population studies. A few
Y-specific polymorphisms have been isolated in a smattering of mammalian species including field voles (a pericentric
inversion, Jaarola et al. 1997), cows (microsatellites, Hanotte
et al. 2000), dogs (microsatellites, Olivier et al. 1999) and
North American deer (ZFY sequences, Cathey et al. 1998).
Although some markers have been isolated in bird
species which are specific to either Z or W chromosomes,
no studies of W chromosome diversity have been done.
Such studies may seem redundant given that W chromosomes are female-specific, and therefore mirror the ancestry
of the more easily studied mtDNA; however, W chromosomes should in principle bear a greater diversity of polymorphic systems than mtDNA, and differences in the
patterns of diversity between the two systems would give
information about any irregularities in the sex-specificity
of mtDNA inheritance.
Methods to isolate polymorphic markers specific to the
haploid chromosomes in any species could be designed
(see Box 1), and would allow the kinds of studies which we
have described here to be extended beyond humans. Traits
such as stature (Salo et al. 1995), aggression (Maxson 1996),
attractiveness (Brooks 2000) and the fitness of sperm in
sperm competition may be linked to haploid chromosomes, and will form substrates for the action of natural
selection. Phenomena such as male dominance and differential male and female dispersal will influence the effective
population sizes of haploid chromosomes. These factors
are likely to play a major role in patterning diversity within
many species, and when disentangling them, molecular
ecologists are likely to find haploid chromosomes a uniquely
informative tool.
Acknowledgements
We thank Dan Bradley, Martin Jones and Chris Tyler-Smith for
comments on the manuscript.
References
Altshuler D, Pollara VJ, Cowles CR et al. (2000) An SNP map of the
human genome generated by reduced representation shotgun
sequencing. Nature, 407, 513–516.
Alvesalo L, de la Chapelle A (1981) Tooth sizes in two males with
deletions of the long arm of the Y chromosome. Annals of Human
Genetics, 45, 49–54.
Andersen L, Born E, Gjertz I et al. (1998) Population structure and
gene flow of the Atlantic walrus (Odobenus rosmarus rosmarus) in
the eastern Atlantic Arctic based on mitochondrial DNA and
microsatellite variation. Molecular Ecology, 7, 1323–1336.
Austerlitz F, Heyer E (1998) Social transmission of reproductive
behaviour increases frequency of inherited disorders in a young
expanding population. Proceedings of the National Academy of
Sciences of the USA, 95, 15140–15144.
Avise JC (1989) Gene trees and organismal histories — a phylogenetic approach to population biology. Evolution, 43, 1192–1208.
Avise JC (1993) Molecular Markers, Natural History and Evolution.
Kluwer Academic Publishers, Boston.
Avise JC, Arnold J, Ball RM et al. (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population
genetics and systematics. Annual Review of Ecology and Systematics,
18, 489–522.
Awadalla P, Eyre-Walker A, Smith JM (1999) Linkage disequilibrium
and recombination in hominid mitochondrial DNA. Science, 286,
2524–2525.
Ayub Q, Mohyuddin A, Qamar R et al. (2000) Identification and
characterisation of novel human Y-chromosomal microsatellites
from sequence database information. Nucleic Acids Research, 28,
e8.
Bamshad MJ, Watkins WS, Dixon ME et al. (1998) Female gene
flow stratifies Hindu castes. Nature, 395, 651–652.
Bandelt H-J, Forster P, Röhl A (1999) Median-joining networks
for inferring intraspecific phylogenies. Molecular Biology and
Evolution, 16, 37–48.
Bandelt H-J, Forster P, Sykes BC, Richards MB (1995) Mitochondrial portraits of human populations using median networks.
Genetics, 141, 743–753.
Barbujani G (1997) DNA variation and language affinities. American
Journal of Human Genetics, 61, 1011–1014.
Barbujani G, Magagni A, Minch E, Cavalli-Sforza LL (1997) An
apportionment of human DNA diversity. Proceedings of the National
Academy of Sciences of the USA, 94, 4516–4519.
Bello N, Sanchez A (1999) The identification of a sex-specific DNA
marker in the ostrich using a random amplified polymorphic
DNA (RAPD) assay. Molecular Ecology, 8, 667–669.
Bergen AW, Wang CY, Tsai J et al. (1999) An Asian-Native American
paternal lineage identified by RPS4Y resequencing and by microsatellite haplotyping. Annals of Human Genetics, 63, 63–80.
Bermingham E, Moritz C (1998) Comparative phylogeography:
concepts and applications. Molecular Ecology, 7, 367–369.
Bertranpetit J, Calafell F (1996) Genetic and geographical variability in cystic fibrosis: evolutionary considerations. In: Variation in
the Human Genome (eds Chadwick D, Cardew G), pp. 97–118.
John Wiley & Sons, Chichester.
Blust R (1999) Subgrouping, circularity and extinction: some issues
in Austronesian comparative linguistics. Symposium Series of the
Institute of Linguistics Academia Sinica, 1, 31–94.
Boissinot S, Boursot P (1997) Discordant phylogeographic patterns
between the Y chromosome and mitochondrial DNA in the house
mouse: Selection on the Y chromosome? Genetics, 146, 1019–1034.
Bosch E, Calafell F, Santos FR et al. (1999) Variation in short tandem
repeats is deeply structured by genetic background on the
human Y chromosome. American Journal of Human Genetics, 65,
1623–1638.
Brooks R (2000) Negative genetic correlation between male sexual
attractiveness and survival. Nature, 406, 67–70.
Cargill M, Altshuler D, Ireland J et al. (1999) Characterization of
single-nucleotide polymorphisms in coding regions of human
genes. Nature Genetics, 22, 231–238.
Carvalho-Silva DR, Santos FR, Hutz MH, Salzano FM, Pena SDJ
(1999) Divergent human Y-chromosomal microsatellite evolution rates. Journal of Molecular Evolution, 49, 204–214.
Casalotti R, Simoni L, Belledi M, Barbujani G (1999) Y-chromosome
polymorphisms and the origins of the European gene pool. Proceedings of the Royal Society of London Series B, 266, 1959–1965.
Casanova M, Leroy P, Boucekkine C et al. (1985) A human Y-linked
DNA polymorphism and its potential for estimating genetic
and evolutionary distance. Science, 230, 1403–1406.
Cathey JC, Bickham JW, Patton JC (1998) Introgressive hybridization
and nonconcordant evolutionary history of maternal and paternal
lineages in North American deer. Evolution, 52, 1224–1229.
Cavalli-Sforza LL, Minch E (1997) Paleolithic and neolithic lineages in the European mitochondrial gene pool. American Journal
of Human Genetics, 61, 247–251.
Ciminelli BM, Pompei F, Malaspina P et al. (1995) Recurrent
simple tandem repeat mutations during human Y-chromosome
radiation in Caucasian subpopulations. Journal of Molecular
Evolution, 41, 966–973.
Cooper G, Amos W, Hoffman D, Rubinsztein DC (1996) Network
analysis of human Y microsatellite haplotypes. Human Molecular
Genetics, 5, 1759–1766.
Crow TJ (1999) Cerebral asymmetry, language and psychosis — the
case for a Homo sapiens-specific sex-linked gene for brain growth.
Schizophrenia Research, 39, 219–231.
Fitz-Simmons N, Moritz C, Limpus C, Pope L, Prince R (1997)
Geographic structure of mitochondrial and nuclear gene polymorphisms in Australian green turtle populations and male-biased gene flow. Genetics, 147, 1843–1854.
Fix AG (1999) Migration and Colonization in Human Microevolution.
Cambridge University Press, Cambridge.
Forster P, Harding R, Torroni A, Bandelt H-J (1996) Origin and
evolution of Native American mtDNA variation: a reappraisal.
American Journal of Human Genetics, 59, 935–945.
Forster P, Kayser M, Meyer E et al. (1998) Phylogenetic resolution
of complex mutational features at Y-STR DYS390 in Aboriginal
Australians and Papuans. Molecular Biology and Evolution, 15,
1108–1114.
Forster P, Röhl A, Lünnemann P et al. (2000) A short tandem repeat-based phylogeny for the human Y chromosome. American Journal
of Human Genetics, 67, 182–196.
Foster EA, Jobling MA, Taylor PG et al. (1998) Jefferson fathered
slave’s last child. Nature, 396, 27–28.
Foster EA, Jobling MA, Taylor PG et al. (1999) The Thomas Jefferson
paternity case. Nature, 397, 32.
Gagneux P, Wills C, Gerloff U et al. (1999) Mitochondrial sequences
show diverse evolutionary histories of African hominoids.
Proceedings of the National Academy of Sciences of the USA, 96,
5077–5082.
Goldstein DB, Linares AR, Cavalli-Sforza LL, Feldman MW (1995a)
An evaluation of genetic distances for use with microsatellite
loci. Genetics, 139, 463–471.
Goldstein DB, Linares AR, Cavalli-Sforza LL, Feldman MW (1995b)
Genetic absolute dating based on microsatellites and the origin
of modern humans. Proceedings of the National Academy of Sciences
of the USA, 92, 6723–6727.
Goldstein DB, Zhivotovsky LA, Nayar K et al. (1996) Statistical
properties of the variation at linked microsatellite loci: implications
for the history of human Y chromosomes. Molecular Biology and
Evolution, 13, 1213–1218.
Griffiths R, Double MC, Orr K, Dawson RJG (1998) A DNA test to
sex most birds. Molecular Ecology, 7, 1071–1075.
Griffiths R, Orr K (1999) The use of amplified fragment length
polymorphism (AFLP) in the isolation of sex-specific markers.
Molecular Ecology, 8, 671–674.
Hammer MF (1994) A recent insertion of an Alu element on the Y
chromosome is a useful marker for human population studies.
Molecular Biology and Evolution, 11, 749–761.
Hammer MF (1995) A recent common ancestry for human Y
chromosomes. Nature, 378, 376–378.
Hammer MF, Karafet T, Rasanayagam A et al. (1998) Out of
Africa and back again: Nested cladistic analysis of human Y chromosome variation. Molecular Biology and Evolution, 15, 427–441.
Hammer MF, Redd AJ, Wood ET et al. (2000) Jewish and Middle
Eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes. Proceedings of the National
Academy of Sciences of the USA, 97, 6769–6774.
Hammer MF, Spurdle AB, Karafet T et al. (1997) The geographic
distribution of human Y chromosome variation. Genetics, 145,
787–805.
Hanotte O, Tawah CL, Bradley DG et al. (2000) Geographic distribution and frequency of a taurine Bos taurus and an indicine Bos
indicus Y specific allele amongst sub-Saharan African cattle
breeds. Molecular Ecology, 9, 387–396.
Harding RM, Fullerton SM, Griffiths RC et al. (1997) Archaic
African and Asian lineages in the genetic ancestry of modern
humans. American Journal of Human Genetics, 60, 772–789.
Helgason A, Sigurdardóttir S, Nicholson J et al. (2000) Estimating
Scandinavian and Gaelic ancestry in the male settlers of Iceland.
American Journal of Human Genetics, 67, 697–717.
Hurles ME, Irven C, Nicholson J et al. (1998) European Y-chromosomal lineages in Polynesians: a contrast to the population
structure revealed by mtDNA. American Journal of Human
Genetics, 63, 1793–1806.
Hurles ME, Veitia R, Arroyo E et al. (1999) Recent male-mediated
gene flow over a linguistic barrier in Iberia suggested by analysis
of a Y-chromosomal DNA polymorphism. American Journal of
Human Genetics, 65, 1437–1448.
Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of modern humans.
Nature, 408, 708–719.
Jaarola M, Tegelstrom H, Fredga K (1997) A contact zone with
noncoincident clines for sex-specific markers in the field vole
(Microtus agrestis). Evolution, 51, 241–249.
Jakubiczka S, Arnemann J, Cooke HJ, Krawczak M, Schmidtke J
(1990) A search for restriction fragment length polymorphism of
the human Y chromosome. Human Genetics, 84, 86–88.
Jeffreys AJ, MacLeod A, Tamaki K, Neil DL, Monckton DG (1991)
Minisatellite repeat coding as a digital approach to DNA typing.
Nature, 354, 204–209.
Jin L, Underhill PA, Doctor V et al. (1999) Distribution of haplotypes from a chromosome 21 region distinguishes multiple
prehistoric human migrations. Proceedings of the National Academy
of Sciences of the USA, 96, 3796–3800.
Jobling MA, Bouzekri N, Taylor PG (1998a) Hypervariable digital
DNA codes for human paternal lineages: MVR-PCR at the Y-specific minisatellite, MSY1 (DYF155S1). Human Molecular
Genetics, 7, 643–653.
Jobling MA, Pandya A, Tyler-Smith C (1997) The Y chromosome in
forensic analysis and paternity testing. International Journal of
Legal Medicine, 110, 118–124.
Jobling MA, Tyler-Smith C (1995) Fathers and sons: the Y
chromosome and human evolution. Trends in Genetics, 11,
449–456.
Jobling MA, Tyler-Smith C (2000) New uses for new haplotypes —
the human Y chromosome, disease and selection. Trends in Genetics,
16, 356–362.
Jobling MA, Williams G, Schiebel K et al. (1998b) A selective difference between human Y-chromosomal DNA haplotypes. Current
Biology, 8, 1391–1394.
Kaessmann H, Heissig F, von Haeseler A, Pääbo S (1999) DNA
sequence variation in a non-coding region of low recombination
on the human X chromosome. Nature Genetics, 22, 78–81.
Karafet TM, Zegura SL, Posukh O et al. (1999) Ancestral Asian
source(s) of New World Y-chromosome founder haplotypes.
American Journal of Human Genetics, 64, 817–831.
Kayser M, Brauer S, Weiss G et al. (2001) Independent histories of
human Y chromosomes from Melanesia and Australia. American
Journal of Human Genetics, 68, 173–190.
Kayser M, Caglià A, Corach D et al. (1997) Evaluation of Y-chromosomal STRs: a multicenter study. International Journal
of Legal Medicine, 110, 125–133.
Kayser M, Roewer L, Hedman M et al. (2000) Characteristics and
frequency of germline mutations at microsatellite loci from the
human Y chromosome, as revealed by direct observation in
father/son pairs. American Journal of Human Genetics, 66, 1580–1588.
Kittles RA, Perola M, Peltonen L et al. (1998) Dual origins of Finns
revealed by Y chromosome haplotype variation. American Journal
of Human Genetics, 62, 1171–1179.
Kivisild T, Villems R (2000) Questioning evidence for recombination in human mitochondrial DNA. Science, 288, 1931a.
de Knijff P, Kayser M, Caglià A et al. (1997) Chromosome Y
microsatellites: population genetics and evolutionary aspects.
International Journal of Legal Medicine, 110, 134–140.
Kuroki Y, Iwamoto T, Lee JW et al. (1999) Spermatogenic ability is
different among males in different Y chromosome lineage. Journal
of Human Genetics, 44, 289–292.
Lewontin RC (1972) The apportionment of human diversity. Evolutionary Biology, 6, 381–398.
Lum JK, Cann RL, Martinson JJ, Jorde LB (1998) Mitochondrial
and nuclear genetic relationships among Pacific Island and
Asian populations. American Journal of Human Genetics, 63, 613–
624.
Luria SE, Delbrück M (1943) Mutations of bacteria from virus
sensitivity to virus resistance. Genetics, 28, 491–511.
Lyrholm T, Leimar O, Johanneson B, Gyllensten U (1999) Sex-biased
dispersal in sperm whales: contrasting mitochondrial and nuclear
genetic structure of global populations. Proceedings of the Royal
Society of London Series B, 266, 347–354.
Malaspina P, Cruciani F, Ciminelli BM et al. (1998) Network analyses of Y-chromosomal types in Europe, Northern Africa, and
Western Asia reveal specific patterns of geographic distribution.
American Journal of Human Genetics, 63, 847–860.
Malaspina P, Persichetti F, Novelletto A et al. (1990) The human Y
chromosome shows a low level of DNA polymorphism. Annals
of Human Genetics, 54, 297–305.
Maxson SC (1996) Searching for candidate genes with effects on an
agonistic behavior, offense, in mice. Behavior Genetics, 26, 471–476.
Mitchell RJ, Hammer MF (1996) Human evolution and the Y
chromosome. Current Opinion in Genetics and Development, 6,
737–742.
Mullikin JC, Hunt SE, Cole CG et al. (2000) An SNP map of human
chromosome 22. Nature, 407, 516–520.
Murray-McIntosh RP, Scrimshaw BJ, Hatfield PJ, Penny D (1998)
Testing migration patterns and estimating founding population size in Polynesia by using human mtDNA sequences.
Proceedings of the National Academy of Sciences of the USA, 95,
9047–9052.
Nei M (1987) Molecular Evolutionary Genetics. Columbia University
Press, New York.
Nyakaana S, Arctander P (1999) Population genetic structure of the
African elephant in Uganda based on variation at mitochondrial
and nuclear loci: evidence for male-biased gene flow. Molecular
Ecology, 8, 1105–1115.
O’Neill M, Binder M, Smith C et al. (2000) ASW: a gene with
conserved avian W-linkage and female specific expression in
chick embryonic gonad. Development Genes and Evolution, 210,
243–249.
Olivier M, Breen M, Binns MM, Lust G (1999) Localization and
characterization of nucleotide sequences from the canine Y chromosome. Chromosome Research, 7, 223–233.
Pagel M (1999) Inferring the historical patterns of biological evolution. Nature, 401, 877–884.
Poloni ES, Semino O, Passarino G et al. (1997) Human genetic
affinities for Y-chromosome P49a,f/TaqI haplotypes show strong
correspondence with linguistics. American Journal of Human
Genetics, 61, 1015–1035.
Previderé C, Stuppia L, Gatta V et al. (1999) Y-chromosomal DNA
haplotype differences in control and infertile Italian subpopulations. European Journal of Human Genetics, 7, 733–736.
Quinn N (1977) Anthropological studies of women’s status.
Annual Review of Anthropology, 6, 181–225.
Quintana-Murci L, Semino O, Bandelt HJ et al. (1999) Genetic
evidence of an early exit of Homo sapiens from Africa through
eastern Africa. Nature Genetics, 23, 437–441.
Raponi M, Dawes IW, Arndt GM (2000) Characterization of
flanking sequences using long inverse PCR. Biotechniques, 28,
838–842.
Rice WR (1994) Degeneration of a nonrecombining chromosome.
Science, 263, 230–232.
Richards M, Macaulay V, Sykes B et al. (1997) Reply to Cavalli-Sforza
and Minch. American Journal of Human Genetics, 61, 251–254.
Roewer L, Arnemann J, Spurr NK, Grzeschik KH, Epplen JT (1992)
Simple repeat sequences on the human Y chromosome are equally
polymorphic as their autosomal counterparts. Human Genetics,
89, 389–394.
Roewer L, Kayser M, Dieltjes P et al. (1996) Analysis of molecular
variance (AMOVA) of Y-chromosome-specific microsatellites in
two closely related human populations. Human Molecular Genetics,
5, 1029–1033.
Ruiz-Linares A, Ortíz-Barrientos D, Figueroa M et al. (1999) Microsatellites provide evidence for Y chromosome diversity among the
founders of the New World. Proceedings of the National Academy
of Sciences of the USA, 96, 6312–6317.
Salem A, Badr F, Gaballah M, Pääbo S (1996) The genetics of
traditional living: Y-chromosomal and mitochondrial lineages
in the Sinai Peninsula. American Journal of Human Genetics, 59,
741–743.
Salo P, Kääriäinen H, Page DC, de la Chapelle A (1995) Deletion
mapping of stature determinants on the long arm of the Y chromosome. Human Genetics, 95, 283–286.
Santos FR, Pandya A, Tyler-Smith C et al. (1999) The central Siberian
origin for Native American Y chromosomes. American Journal of
Human Genetics, 64, 619–628.
Santos FR, Tyler-Smith C (1996) Reading the human Y chromosome: the emerging DNA markers and human genetic history.
Brazilian Journal of Genetics, 19, 665–670.
Seielstad MT, Hebert JM, Lin AA et al. (1994) Construction of
human Y-chromosomal haplotypes using a new polymorphic A
to G transition. Human Molecular Genetics, 3, 2159–2161.
Seielstad MT, Minch E, Cavalli-Sforza LL (1998) Genetic evidence
for a higher female migration rate in humans. Nature Genetics,
20, 278–280.
Semino O, Passarino G, Brega A, Fellous M, Santachiara-Benerecetti AS (1996) A view of the Neolithic demic diffusion in
Europe through two Y chromosome-specific markers. American
Journal of Human Genetics, 59, 964–968.
Shen PD, Wang F, Underhill PA et al. (2000) Population genetic
implications from sequence variation in four Y chromosome
genes. Proceedings of the National Academy of Sciences of the USA,
97, 7354–7359.
Sinclair AH, Berta P, Palmer MS et al. (1990) A gene from the
human sex-determining region encodes a protein with homology
to a conserved DNA-binding motif. Nature, 346, 240–244.
Spriggs M (1989) The dating of the Island Southeast Asian
Neolithic — an attempt at chronometric hygiene and linguistic
correlation. Antiquity, 63, 587–613.
Stoneking M (1998) Women on the move. Nature Genetics, 20, 219–
220.
Stringer C, McKie R (1996) African Exodus: the Origins of Modern
Humanity. Jonathan Cape, London.
Su B, Jin L, Underhill P et al. (2000) Polynesian origins: Insights
from the Y chromosome. Proceedings of the National Academy of
Sciences of the USA, 97, 8225–8228.
Sykes B, Irven C (2000) Surnames and the Y chromosome. American
Journal of Human Genetics, 66, 1417–1419.
Thomas MG, Parfitt T, Weiss DA et al. (2000) Y chromosomes
traveling south: The Cohen modal haplotype and the origins of
the Lemba — the ‘black Jews of Southern Africa’. American Journal
of Human Genetics, 66, 674–686.
Thomas MG, Skorecki K, Ben-Ami H et al. (1998) Origins of Old
Testament priests. Nature, 394, 138–140.
Thomson R, Pritchard JK, Shen PD, Oefner PJ, Feldman MW
(2000) Recent common ancestry of human Y chromosomes:
Evidence from DNA sequence data. Proceedings of the National
Academy of Sciences of the USA, 97, 7360–7365.
Torroni A, Semino O, Scozzari R et al. (1990) Y chromosome DNA
polymorphisms in human populations: differences between
Caucasoids and Africans detected by 49a and 49f probes. Annals
of Human Genetics, 54, 287–296.
Tremblay M, Vezina H (2000) New estimates of intergenerational
time intervals for the calculation of age and origins of mutations.
American Journal of Human Genetics, 66, 651–658.
Underhill PA, Jin L, Lin AA et al. (1997) Detection of numerous Y
chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Research, 7, 996–1005.
Underhill PA, Jin L, Zemans R, Oefner PJ, Cavalli-Sforza LL (1996)
A pre-Columbian Y chromosome-specific transition and its
implications for human evolutionary history. Proceedings of the
National Academy of Sciences of the USA, 93, 196–200.
Underhill PA, Shen P, Lin AA et al. (2000) Y chromosome sequence
variation and the history of human populations. Nature Genetics,
26, 356–361.
Vogt PH, Edelmann A, Kirsch S et al. (1996) Human Y chromosome
azoospermia factors (AZF) mapped to different subregions in
Yq11. Human Molecular Genetics, 5, 933–943.
Weiss K (1973) Models in anthropology. American Antiquity, 38, 1–
186.
White PS, Tatum OL, Deaven LL, Longmire JL (1999) New, male-specific microsatellite markers from the human Y chromosome.
Genomics, 57, 433–437.
Whitfield LS, Sulston JE, Goodfellow PN (1995) Sequence variation of the human Y chromosome. Nature, 378, 379–380.
Wilmer J, Hall L, Barratt E, Moritz C (1999) Genetic structure and
male-mediated gene flow in the ghost bat (Macroderma gigas).
Evolution, 53, 1582–1591.
Wilson IJ, Balding DJ (1998) Genealogical inference from microsatellite data. Genetics, 150, 499–510.
Zerjal T, Dashnyam B, Pandya A et al. (1997) Genetic relationships
of Asians and northern Europeans, revealed by Y-chromosomal
DNA analysis. American Journal of Human Genetics, 60, 1174–
1183.
Zerjal T, Pandya A, Santos FR et al. (1999) The use of Y-chromosomal
DNA variation to investigate population history: recent male
spread in Asia and Europe. In: Genomic Diversity: Applications in
Human Population Genetics (eds Papiha SS & Deka R), pp. 91–102.
Plenum, New York.
Matt Hurles is a Research Fellow in Population Genetics
supported by the McDonald Institute for Archaeological
Research, and his interests lie in the use of modern DNA
diversity of humans and other species in the reconstruction of
prehistory, focused in Oceania, and in the analysis of genome
instability and genomic disorders. For further information,
see http://www-mcdonald.arch.cam.ac.uk/Genetics/home.html.
Mark Jobling is a Wellcome Trust Senior Research Fellow in Basic
Biomedical Science (Grant no. 057559), with interests in human
genetic history, infertility, mutation processes, forensics and genealogy centred on the Y chromosome. For further information, see:
http://www.le.ac.uk/genetics/maj4/maj4.html.