Järvinen, Pia
Nucleotide variation of birch (Betula L.) species: population structure and phylogenetic
relationships. - University of Joensuu, 2004, 138 pp.
University of Joensuu, PhD Dissertations in Biology, No. 34. ISSN 1457-2486.
Net version ISBN 952-458-593-6
Key words: BpADH, BpFULL1, BpMADS2, Betula, Betula pendula, Betulaceae, dimorphism,
genetic differentiation, linkage disequilibrium, matK, nucleotide diversity, phylogeny,
recombination
Nucleotide variation in three nuclear genes, BpMADS2, BpFULL1, and BpADH, was studied in
two natural silver birch (Betula pendula Roth) populations. Nuclear sequences were used to
explore the nucleotide variation within a population, among the populations, and between different
genes of silver birch, and to evaluate whether the post-glacial colonisation history of the species
is reflected in the distribution of variation. Many studies on marker genes, both on isozyme
variation and anonymous DNA, have shown that variation within and among silver birch
populations is high, resulting in a prediction that the level of nucleotide variation should also be
high in this species. The observed results, however, do not fully support this prediction. In particular,
the level of nonsynonymous variation (πa) at the BpMADS2 and BpFULL1 loci was very low, 0.00052
and 0, respectively, and only somewhat higher at the BpADH locus (0.0015). The synonymous
site overall variation (πs) for BpMADS2 was also low, only 0.0043. The synonymous site overall
nucleotide diversities for the BpFULL1 and BpADH regions were higher than in the BpMADS2
locus, 0.0134 and 0.0117, respectively. As the detected patterns of polymorphism and divergence
at these loci were concordant, the variable mutation rates between the studied loci could explain
the differences, but the power of some of the selection tests was low in the relatively small
sample with low numbers of segregating sites.
Many aspects of the data are consistent with a large effective population size of silver birch.
The genetic differentiation between the two studied populations was low and the decay of linkage
disequilibrium in the studied genes was very rapid. Furthermore, the recombination rate was
high in the silver birch nuclear genome. These three nuclear loci of silver birch did not show
strong patterns caused by the post-glacial expansion of the species, but were close to demographic
equilibrium. Earlier studies on cpDNA found a strong geographical pattern presumably due to
colonisation history, but there was only a weak suggestion of such a pattern at the nuclear loci.
Silver birch, as a wind-pollinated species, has very efficient gene flow through pollen and this
may have broken down the initial genetic structure of populations established at colonisation.
More loci and populations will be needed to confirm these findings.
The phylogenetic relationships within the genus Betula (Betulaceae) were investigated using
a part of the nuclear ADH, BpMADS2, and BpFULL1 genes. In general, the results obtained
from the nuclear data fit rather well to the infrageneric classifications proposed for birches. In
disagreement with the classical division of the genus Betula, B. schmidtii grouped with the
species in subgenus Betula, and B. ermanii grouped with species in subgenus Chamaebetula.
Pia Järvinen, Department of Biology, University of Joensuu, P.O.Box 111, FIN-80101 Joensuu,
Finland
ABBREVIATIONS
ADH             Alcohol dehydrogenase
AFLP            Amplified Fragment Length Polymorphism
bp              base pair
BP              Before Present
BpADH           Betula pendula ADH
BpFULL1         Betula pendula FRUITFULL-Like 1
BpMADS2         Betula pendula MADS2
c               recombination rate between loci
kb              kilo base
MADS box genes  gene family of transcription factors
MatK            gene coding maturase-like protein in plants
PCR             Polymerase Chain Reaction
PEE             Positive Early Element
PI              PISTILLATA
PLE1,2,3        Positive Late Elements 1,2,3
RAPD            Random Amplified Polymorphic DNA
RFLP            Restriction Fragment Length Polymorphism
Tajima’s D      tests the hypothesis that all mutations are selectively neutral
π               nucleotide diversity
CONTENTS
LIST OF ORIGINAL PUBLICATIONS
1. INTRODUCTION
1.1. General background
1.2. Genus Betula
1.3. Flower development
1.3.1. Flower development in Arabidopsis
1.3.2. MADS box genes FRUITFULL and PISTILLATA
1.3.3. BpFULL1 gene
1.4. Molecular markers
1.4.1. Nuclear and chloroplast DNA
1.5. Nucleotide variation in plants
1.5.1. Nucleotide variation in Arabidopsis
1.5.2. Nucleotide variation in woody plants
1.5.2.1. Betula pendula
2. AIMS OF THE STUDY
3. MATERIALS AND METHODS
3.1. Plant materials
3.1.1. Plant material for the isolation and analysis of BpMADS2
3.1.2. Plant material for the population studies
3.1.3. Plant material for the phylogenetic studies
3.2. Isolation and analysis of BpMADS2 and BpADH genes
3.3. Isolation and analysis of BpMADS2, BpFULL1, and BpADH fragments
3.4. Phylogenetic studies
4. RESULTS
4.1. Isolation and analysis of the BpMADS2 gene
4.1.1. Identification of the putative regulatory elements in the BpMADS2 promoter
4.2. Isolation and analysis of the BpADH gene
4.3. Variation within the two B. pendula populations
4.3.1. Nucleotide polymorphism
4.3.2. Dimorphism of haplotypes
4.3.3. Divergence between populations and demographic equilibrium
4.3.4. Estimates of polymorphism and divergence
4.4. Birch phylogeny
5. DISCUSSION
5.1. BpMADS2 is the PI homologue of birch
5.1.1. Identification of the putative BpMADS2 regulatory regions
5.2. Nucleotide variation in silver birch
5.2.1. Level of nucleotide variation in three nuclear loci of silver birch
5.2.2. No genetic differentiation between the two silver birch populations
5.2.3. Recombination is common in silver birch genome
5.2.4. Nuclear genes of silver birch show few traces of postglacial expansion
5.3. Phylogeny of the genus Betula
5.3.1. Comparison of molecular phylogenies
5.3.2. Phylogenetic relationships of Betula schmidtii
5.3.3. Origin of the two alleles of ADH gene
5.3.4. Reconciling gene trees with a species tree
ACKNOWLEDGEMENTS
REFERENCES
LIST OF ORIGINAL PUBLICATIONS
This thesis is mainly based on the following publications but it also includes some previously
unpublished results. In the text, the publications are referred to by the Roman numerals I-IV.
I    Järvinen, P., Lemmetyinen, J., Savolainen, O. and Sopanen, T. (2003) DNA sequence
     variation in BpMADS2 gene in two populations of Betula pendula. Molecular Ecology
     12(2): 369-384.
II   Järvinen, P., Sopanen, T. and Savolainen, O. Nucleotide polymorphism in BpFULL1
     and BpADH loci of silver birch (Betula pendula) (Betulaceae). Manuscript.
III  Järvinen, P., Palmé, A., Morales, L.O., Lännenpää, M., Sopanen, T., Keinänen, M.
     and Lascoux, M. (2004) Phylogenetic relationships of Betula species (Betulaceae)
     based on chloroplast matK and nuclear ADH gene sequences. American Journal of
     Botany 91(11): 1834-1845.
IV   Järvinen, P., Sopanen, T. and Keinänen, M. Phylogeny of the genus Betula (Betulaceae):
     inferences from two nuclear MADS genes. Manuscript submitted to BMC Evolutionary
     Biology.
Publications I and III are reprinted with permission from publishers. Copyrights for I by Blackwell
Publishing Ltd, and for III by Allen Press.
1. INTRODUCTION
1.1. General background
Genetic variation exists within every species
and forms the basis for natural selection and
evolution. The extent and pattern of nucleotide
variation in natural populations can provide us
useful information on the evolutionary history
of the species, the mechanisms that maintain
genetic variation and the evolutionary forces
acting on species in general. Genetic variation
at the nucleotide level has been studied very little
in woody plants, yet many of these species are
of great economic importance to humans.
In Northern Europe, silver birch, Betula
pendula Roth, is one of the three most
important forest tree species (Anonymous,
1999), and used, for example, in plywood, pulp
and furniture production. Long-lived species,
such as forest trees, are subject to
environmental conditions varying greatly from
year to year and thus maintaining the genetic
diversity in forest tree populations means
maintaining adaptability to changing
environmental conditions. Life history traits,
such as generation time, geographical
distribution, pollination mechanism, mating
system, and seed and pollen dispersal,
influence the amount and apportionment of
genetic variation of the species. Besides natural
factors, direct human impact has already
altered (see Kado et al., 2003), or will in the long
term alter, genetic variation in certain important
forest tree species. So far, forest trees have
undergone relatively little breeding; for
example, in Finland the breeding of trees did
not start until the end of the 1940s (Koski,
1989). However, biotechnological approaches,
such as micro-propagation, gene transfer, and
marker-assisted breeding can change this
situation in the future. Furthermore, birches
have been planted increasingly with material
coming from selected trees, and even though
the proportion of planted birches to those
originating from natural regeneration is still
small, it is increasing all the time. In the long
term this may affect the genetic diversity of
the silver birch. Thus, studies on the current
genetic structure of silver birch will provide
reference values of nucleotide diversity in this
species.
There are also practical reasons for an
interest in the population genetics of silver
birch. Conventional breeding of woody plants
is very slow, and therefore many research
groups are looking for tools to improve wood
quality and other properties of trees using gene
technology. However, the phenotypic variation
of many commercially important and desired
properties of forest trees is complex and
regulated by many loci, or the genetic
background of the desired properties is not known
(Neale and Savolainen, 2004). One potential
approach to overcome these problems could
be association genetics of complex traits. When
designing an association mapping study, one
of the most important issues is the population
structure of the studied species in natural
populations. Knowledge of the current
population structure of silver birch and of the
rate of decay of linkage disequilibrium will
facilitate the designing of association mapping
studies, and thereby potentially accelerate the
breeding of this economically important forest
tree species.
1.2. Genus Betula
The birches (Betula L.) are common trees and
shrubs of the Northern Hemisphere (Furlow,
1990). Phylogenetically, the genus Betula
belongs to the birch family Betulaceae (order
Fagales). The genus comprises approximately
30-35 species (Furlow, 1990; de Jong, 1993),
but the number of species accepted by different
authors ranges from 30 to over 150 (de Jong,
1993), and considerable controversy still exists
regarding the systematics of the genus. The
uncertainty about the number of species and their
phylogenetic relationships is mostly due to
the high polymorphism in morphology.
Furthermore, hybridisation is very frequent and
for this reason introgression, the transfer of
genes between species, may have played an
important part in the evolution of the genus
(Alam and Grant, 1972; Furlow, 1990;
Atkinson, 1992).
The basic chromosome number of genus
Betula is n = 14, but natural polyploidy is very
frequent (Furlow, 1990; de Jong, 1993).
Species of Betula form a polyploid series, with
chromosome numbers of 2n = 28, 56, 70, 84,
and 112 and the ploidy levels differ between
the subsections/subgenera. From the
evolutionary point of view the genus Betula is
still young, which partly explains the polyploid
nature of the genus and the occurrence of
various ploidy levels (Särkilahti and Valanne,
1990). Furthermore, the differences in ploidy
levels among the different subsections/
subgenera indicate that several independent
polyploidisations have occurred within the
genus.
subgenus Betulaster (e.g., B. maximowicziana)
subgenus Betulenta (e.g., B. lenta and B. alleghaniensis)
subgenus Neurobetula (e.g., B. ermanii and B. schmidtii)
subgenus Betula (e.g., B. pendula, B. pubescens, B. papyrifera,
B. populifolia, B. resinifera and B. platyphylla)
subgenus Chamaebetula (e.g., B. humilis, B. fruticosa and B. nana)
Figure 1. Hypothetical phylogenetic relationships between the subgenera of Betula (based on de Jong, 1993).
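The polyploid series described above can be checked with a little arithmetic. The following is a minimal sketch, assuming that each somatic chromosome number is an exact multiple of the basic number 14 given in the text; the function name `ploidy_level` is hypothetical and chosen only for this illustration.

```python
# Basic chromosome number of Betula, as stated in the text.
BASE_NUMBER = 14

def ploidy_level(somatic_count: int, base: int = BASE_NUMBER) -> int:
    """Return the number of chromosome sets implied by a 2n chromosome count."""
    if somatic_count % base != 0:
        raise ValueError(f"{somatic_count} is not a multiple of {base}")
    return somatic_count // base

# Somatic chromosome numbers (2n) listed for the genus.
series = [28, 56, 70, 84, 112]
levels = {count: ploidy_level(count) for count in series}
print(levels)
```

Under this assumption the series corresponds to two, four, five, six and eight chromosome sets, respectively.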
Different attempts to identify sections or
subgenera, and the relationships among the
Betula species have been made on the basis of
morphology, biochemical characters and/or
chromosomal numbers (Regel, 1865; Winkler,
1904; Nakai, 1915; Komarov, 1936;
Pawlowska, 1983; de Jong, 1993; Keinänen et
al., 1999a). Regel (1865), the original
monographer of Betula, divided birches into
two main sections, Eubetula and Betulaster.
The Eubetula section was further divided into
three subsections, Albae (white birches),
Costatae (yellow birches), and Nanae (dwarf
birches). The section Betulaster contained only
a few Asian birches in the subsection
Acuminatae. Since then this division has been
revised numerous times by a number of authors
into different subsections or subgenera
(summarised in Furlow, 1990; de Jong, 1993;
Fig. 1).
Birches are wind-pollinated and the dispersal
of seeds is also by wind (Atkinson, 1992). In
contrast to the great variation in their vegetative
parts, birches are rather uniform in their
reproductive organs, including separate male
and female catkins. The female flowers consist
of a bicarpellate ovary with one anatropous
ovule in each locule (Table 1; Furlow, 1990;
de Jong, 1993). The male flowers consist of a
reduced perianth with 1-4 reduced tepals and
1-4 bifid stamens. The number of stamens and
tepals is three or four in members of the
subgenera Betulenta and Neurobetula, two or
three in members of the subgenus Betula and
one or two in members of the subgenus
Chamaebetula (Table 1; de Jong, 1993). In
general, the number of tepals and stamens is
equal within the species. In the subgenus
Betulaster, however, the number of stamens has
been reduced to two, but four tepals have been
retained.
1.3. Flower development
1.3.1. Flower development in Arabidopsis
During recent years, the flowering plant
Arabidopsis thaliana (hereafter referred to as
Arabidopsis) has become universally
recognised as a model system in molecular,
genetic, and evolutionary research. It has many
advantages, including a small size, short
generation time, and a relatively small genome,
which has been completely sequenced (The
Arabidopsis Genome Initiative, 2000).
Arabidopsis is predominantly a selfing species,
and the reported level of outcrossing is less
than 1 % (Abbot and Gomez, 1989). Therefore
most Arabidopsis plants in nature represent
inbred lines, which are in practice homozygous.
Furthermore, the lack of heterozygous
individuals within this species will presumably
cause recombination to be effectively very rare.
Table 1. Subgeneric taxonomic categories, selected morphological characters, chromosome numbers, and distribution of the birch species.
Sections and subsections follow Winkler (1904); subgenera follow de Jong (1993).

Species                                         Leaf veins(a)  Fruit wings  2n(b)          Distribution

Subgenus Betulaster (section Betulaster, subsection Acuminatae); stamens in male flowers: 2; infructescences: pendulous, clustered
  B. maximowicziana Regel                       10-12          very broad   28             Asia
Subgenus Betulenta (section Eubetula, subsection Costatae); stamens in male flowers: 3-4; infructescences: erect
  B. lenta L.                                   9-12           narrow       28             North America
  B. alleghaniensis Britt.                      9-11           narrow       84             North America
Subgenus Neurobetula (section Eubetula, subsection Costatae); stamens in male flowers: 3-4; infructescences: erect
  B. ermanii Cham.                              7-11           narrow       56             Asia
  B. schmidtii Regel                            9-11           very narrow  28             Asia
Subgenus Betula (section Eubetula, subsection Albae); stamens in male flowers: 2-3; infructescences: pendulous
  B. pendula Roth                               5-7            broad        28             Europe, Asia
  B. pubescens Ehrh.                            5-7            broad        56             Europe, Asia
  B. platyphylla Suk. var. japonica (Miq.) Hara 5-7            broad        28             Asia
  B. papyrifera Marsh.                          6-10           broad        (56), 70, 84   North America
  B. populifolia Marsh.                         6-9            broad        28             North America
  B. resinifera Britt.                          4-5            broad        28             North America
Subgenus Chamaebetula (section Eubetula, subsection Nanae); stamens in male flowers: 1-2; infructescences: erect
  B. fruticosa Pall.                            5-6            narrow       28             Asia
  B. humilis Schrenk                            4-5            narrow       28             Europe, Asia
  B. nana L.                                    2              narrow       28             Europe, Asia, North America

(a) Krüssman (1960)
(b) Based on literature (e.g., Furlow, 1990; de Jong, 1993).

Flowering is a very complex process and it
can be divided into several independent phases,
including induction of flowering, the formation
of inflorescence and floral meristems, and the
formation of floral organs (summarised in
Yanofsky, 1995). The transition from the
vegetative phase to the reproductive phase
during Arabidopsis development is the result
of a complex interaction of environmental (e.g.,
photoperiod, light intensity, light quality, and
temperature) and endogenous (e.g., hormones
and metabolites) factors. Furthermore, at least
four interacting pathways whose signals
regulate the expression of genes involved in
flower development have been described: the
photoperiod response pathway, the
vernalization response pathway, the
autonomous pathway, and the gibberellin
pathway (Koornneef et al., 1998; Mouradov
et al., 2002).
The meristem identity genes control the
transition from vegetative to inflorescence and
from inflorescence to floral meristems
(Yanofsky, 1995). One of the key regulators
of the transition from vegetative to
reproductive phase is the meristem identity
gene LEAFY (LFY), whose activity is proposed
to mediate the initiation of flowers. Also
CAULIFLOWER (CAL), APETALA1 (AP1),
and FRUITFULL (FUL) have important
functions in flower initiation, partly because of
their roles in upregulating LFY expression
(Ferrándiz et al., 2000). CAL, AP1 and FUL
share redundant functions in the establishment
of floral meristem identity and mutations in
these genes cause conversion of flowers into
shoots (Yanofsky, 1995).
The Arabidopsis flower is organised into
four concentric whorls of organs. Starting from
the outermost whorl, these consist of four
sepals, four petals, six stamens and two fused
carpels. During the flower development the
identity of the different floral organs is
determined by the activity of floral organ
identity genes. This specification has been
described in the ABC model (Coen and
Meyerowitz, 1991), which postulates three
gene functions, A, B, and C that act in two
adjacent whorls to specify the floral organs.
According to this model, action of A alone
specifies sepal formation, the combination AB
specifies the development of petals, and the
combination BC specifies stamen formation.
Action of the C function alone determines the
development of carpels. This “classical”
ABC model was later refined and extended to
an ABCDE model, where the D function is
needed for the ovule identity and the E function
for petal, stamen and carpel identities
(Theissen, 2001).
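The combinatorial logic of the classical ABC model can be written out as a small lookup. This is only an illustrative sketch: the names `ORGAN_BY_FUNCTIONS`, `organs` and `without_function` are hypothetical, and the sketch covers only the four combinations listed in the text. It handles loss of the B function but not loss of A or C, which would need additional rules that the text does not describe.

```python
# Active gene functions and the organ they specify, as listed in the text.
ORGAN_BY_FUNCTIONS = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
}

# Gene functions active in whorls 1 to 4 of a wild-type flower.
WILD_TYPE_WHORLS = [{"A"}, {"A", "B"}, {"B", "C"}, {"C"}]

def organs(whorls):
    """Translate the active functions of each whorl into an organ identity."""
    return [ORGAN_BY_FUNCTIONS[frozenset(active)] for active in whorls]

def without_function(whorls, lost):
    """Return the whorls with one gene function removed everywhere."""
    return [set(active) - {lost} for active in whorls]

print(organs(WILD_TYPE_WHORLS))
print(organs(without_function(WILD_TYPE_WHORLS, "B")))
```

With the B function removed, the second whorl is left with A alone and the third with C alone, so the model predicts sepals in place of petals and carpels in place of stamens.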
MADS box genes encode transcription
factors that have essential functions during
flower development and organ differentiation,
and nearly all of the A-, B-, and C-function
genes belong to this gene family. The term
MADS box arose from the first characterised
genes that shared this region, namely MCM1
from yeast, AGAMOUS from Arabidopsis,
DEFICIENS from Antirrhinum majus, and
SRF from human (Schwartz-Sommer et al.,
1990). A typical plant MADS domain protein
has a highly conserved structural
organisation, comprising MADS (M),
intervening (I), keratin-like (K) and C-terminal
(C) domains (so-called MIKC type;
Münster et al., 1997). MADS domain proteins
bind to DNA either as homo- or heterodimers,
and the highly conserved MADS domain is the
major determinant of DNA binding and
dimerization (Riechmann and Meyerowitz,
1997). The resulting homo- and/or
heterocomplexes bind to so-called CArG-box
sequences (consensus CC(A/T)6GG or
CTA(A/T)4TAG). However, the MADS domain does
not contribute significantly to the functional
specificity of floral homeotic proteins (Krizek
and Meyerowitz, 1996; Krizek et al., 1999).
In addition to the MADS domain, the I-region
and in some genes parts of the K domain are
required for the formation of a DNA binding
complex (Riechmann et al., 1996a; Riechmann
et al., 1996b). The I-region and the K box also
have important roles in proper protein function
(Riechmann and Meyerowitz, 1997). The
C-terminal domain is involved in the formation of higher
order complexes and it also functions as the
transcriptional activator domain (Egea-Cortines
et al., 1999; Honma and Goto, 2001).
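The two CArG-box consensus sequences quoted above translate directly into regular expressions. The following is a minimal sketch for scanning one strand of a DNA sequence; the function name `find_carg_boxes` and the example sequence are invented for illustration and are not taken from the birch data.

```python
import re

# The two consensus sequences given in the text, written as regular expressions.
CARG_PATTERNS = {
    "CC(A/T)6GG": r"CC[AT]{6}GG",
    "CTA(A/T)4TAG": r"CTA[AT]{4}TAG",
}

def find_carg_boxes(sequence):
    """Return sorted (start, consensus name, matched motif) tuples for one strand."""
    seq = sequence.upper()
    hits = []
    for name, pattern in CARG_PATTERNS.items():
        # A lookahead keeps overlapping motifs from hiding each other.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)

# Invented example sequence containing one motif of each type.
example = "GGATCCATATATGGTTCTAAAAATAGCC"
for start, name, motif in find_carg_boxes(example):
    print(start, name, motif)
```

Positions are zero-based offsets into the given strand. A match only shows that the motif is present; whether a MADS domain protein actually binds there has to be established experimentally.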
1.3.2. MADS box genes FRUITFULL and
PISTILLATA
The FRUITFULL (FUL, formerly AGL8) gene
encodes a MIKC-type MADS domain-containing
transcription factor and it is
involved in several distinct processes during
Arabidopsis development. It has an early-acting
function in controlling flowering time, floral
meristem identity and cauline leaf morphology
together with two other genes, AP1 and CAL
(Ferrándiz et al., 2000). Later it has a role in
carpel and fruit development (Mandel and
Yanofsky, 1995; Gu et al., 1998). The
expression of FUL is first detected in the
inflorescence meristem at the time when the
development of Arabidopsis is switched from
the vegetative to the reproductive phase
(Mandel and Yanofsky, 1995). Later, FUL is
expressed in the center of the floral meristem,
the region which gives rise to the pistil. In the
mature flower the expression of FUL is
detected in carpel walls. Furthermore, FUL,
along with another meristem identity gene AP1,
appears to be angiosperm-specific (Litt and
Irish, 2003). The correlation of the origin of
the AP1/FUL gene lineage with the origin of
flowers suggests a possible role for these genes
in the evolution of this key angiosperm feature.
The PISTILLATA (PI) gene also encodes a
MIKC-type MADS domain-containing
transcription factor and it is required, along
with the other B-function gene APETALA3
(AP3), to specify petal and stamen identities
in the Arabidopsis flower (Goto and
Meyerowitz, 1994; Jack et al., 1992). PI, along
with AP3, also plays an additional role in
proliferation of the floral meristem (Krizek and
Meyerowitz, 1996). Mutations in the PI gene
cause homeotic conversion of petals to sepals
and of stamens to carpels. At the early stages
of flower development, prior to the first
appearance of the primordia of petals and
stamens, PI is expressed in the second, third,
and fourth whorls of the developing flowers
(Goto and Meyerowitz, 1994). Later on, the
expression of PI is restricted to the second and
third whorls.
1.3.3. BpFULL1 gene
Molecular data have shown that the
mechanisms controlling flower development
are largely conserved even in distantly related
plant species (Yanofsky, 1995), such as
Arabidopsis and birch. Utilising this
information, several MADS genes and/or their
cDNAs regulating the development of birch
inflorescences and/or flowers have been
isolated in our group, e.g., BpMADS1
(Lemmetyinen et al., 2001), similar to
SEPALLATA3, BpMADS3-5 (Elo et al., 2001),
similar to FUL and AP1, BpMADS6
(Lemmetyinen et al., 2004), similar to
AGAMOUS, BpMADS7 (P. Järvinen, J.
Lemmetyinen and T. Sopanen, unpublished
results), similar to AGL11 and BpMADS8 (S.
Parkkinen, J. Lemmetyinen and T. Sopanen,
unpublished results), similar to AP3.
BpFULL1 (formerly BpMADS5) is the FUL
homologue of birch (Elo et al., 2001). The
expression of BpFULL1 is inflorescence
specific (Elo et al., 2001; Lännenpää et al.,
submitted), and starts at the early stages of
inflorescence development. The expression
continued in both male and female
inflorescences thereafter and was also
detected in male inflorescences at anthesis,
and in female inflorescences during seed
development (Elo et al., 2001). No expression
was detected in the vegetative parts. In situ
hybridisation studies have shown that the
expression of BpFULL1 is localised in birch
inflorescence meristems and in male and
female inflorescences, especially in stamen and
carpel primordia (Lännenpää et al., submitted).
These results indicate that BpFULL1, like its
Arabidopsis homologue FUL, might be
involved in the transition from vegetative to
reproductive stage of development and the
initiation of flower development.
1.4. Molecular markers
There are many reasons why molecular data,
particularly DNA sequence data, are much
more powerful for evolutionary studies, both
at population and at species level, compared
to morphological data. First of all, DNA
sequence data represent the highest level of
genetic resolution, and act as a store of genetic
information containing “the code of life” (Li,
1997). Secondly, molecular data are much
more abundant than morphological data. For
example, the genome of Arabidopsis contains
25 498 genes encoding proteins from 11 000
families (The Arabidopsis Genome Initiative,
2000). Also, the three genomes in the plant cell
(nuclear, chloroplast, and mitochondrial)
constitute three independent DNA data sets in
one species, and a combination of these data
sets can provide complementary information
on the evolution of the species and taxa.
Thirdly, DNA sequences generally evolve in a
much more regular manner than do
morphological characters and are often more
amenable to quantitative treatments than are
morphological data, and therefore can provide
a clearer picture of relationships of species (Li,
1997). Furthermore, molecular data offer
potentially huge data sets that are comparable
across a wide taxonomic range (e.g., Yokoyama
and Harry, 1993), and this might help us to
resolve one of the prime goals of evolutionary
biology, “the Tree of Life”. However, it is
important to keep in mind that DNA sequences
are only one of many types of data that can be
used to study phylogenetic relationships of
species and that molecular data and other
approaches are not mutually exclusive, but
rather complement each other.
1.4.1. Nuclear and chloroplast DNA
A plant cell has one nuclear and two organellar
(chloroplast and mitochondrial) genomes. The
biparentally inherited nuclear DNA is the
fastest evolving among these three genomes
(e.g., Wolfe et al., 1987; Wang et al., 2000).
Recombination occurs frequently in nuclear
genomes and the recombination rate varies
considerably from locus to locus, depending
on, for example, the chromosomal location of
the gene. Recombination has an important role
in the evolution of a species because it
rearranges DNA sequences to generate new
combinations of DNA molecules (Posada and
Crandall, 2001). However, it also complicates
phylogenetic studies by creating “mosaic
genes” where different parts of a gene have
different phylogenetic histories.
ADH genes are among the best-characterised
nuclear genes in plants and have become model
genes for studies of sequence variation (Gaut
and Clegg, 1993a; Gaut and Clegg, 1993b;
Innan et al., 1996; Bergelson et al., 1998;
Savolainen et al., 2000), and phylogenetic
studies at both high and low taxonomic levels
(Gaut and Clegg, 1991; Gaut and Clegg, 1993a;
Morton et al., 1996; Sang et al., 1997;
Charlesworth et al., 1998; Miyashita et al.,
1998; Gaut et al., 1999). Alcohol
dehydrogenase (ADH) is an essential enzyme
in anaerobic metabolism and its expression
increases under oxygen stress as well as in
response to cold in both Arabidopsis and Zea
mays and to dehydration in Arabidopsis
(Freeling and Bennett, 1984; Dolferus et al.,
1994). Additionally, ADH may have a role
during seedling development, fruit ripening,
and pollen development. In the majority of
flowering plants, two or three ADH loci have
been identified, each containing ten exons and
nine introns (e.g., Gaut and Clegg, 1991;
Morton et al., 1996; Gaut et al., 1999).
However, in Arabidopsis, ADH is a single copy
gene, and consists of seven exons and six
introns (Chang and Meyerowitz, 1986).
Chloroplast genome structure and variation
have been studied extensively in plants.
Compared to nuclear DNA, chloroplast DNA
has some properties that make it especially
useful in phylogenetic studies. The chloroplast
genome is uniparentally, in most plant
species maternally, inherited and does not
undergo sexual recombination (Radetzky,
1990; Rajora and Dancik, 1992; Dumolin et
al., 1995). Thus a phylogeny based on the
chloroplast genome is not complicated by
recombination. However, chloroplast markers
also have some disadvantages, such as
introgression of chloroplast genes from one
species to another (Wendel and Doyle, 1998).
Furthermore, due to uniparental inheritance,
chloroplast DNA has a smaller effective
population size than nuclear DNA. In
monoecious species, such as birches, the
effective population size for chloroplast genes
is expected to be half of that for nuclear genes.
Because of the smaller effective population size,
the level of genetic variation is also expected
to be lower. The mutation rate of chloroplast
DNA may, however, vary considerably
depending on the parent (male, female) it is
inherited from and this has to be taken into
account when generalisations concerning many
plant species are made (male mutation bias;
Whittle and Johnston, 2002).
The chloroplast gene matK has been one of
the most commonly used sequences for
phylogenetic studies in plants (e.g., Wang et
al., 1999; Wang et al., 2000; Cheng et al., 2000;
Stanford et al., 2000; Soltis et al., 2001; Fukuda
et al., 2001). In many cases it has been found
disequilibrium extended up to 250 kb
(Nordborg et al., 2002; Hagenbland and
Nordborg, 2002). Because of the rare
occurrence of heterozygotes in Arabidopsis
(Abbot and Gomez, 1989), the level of
effective recombination could generally be
expected to be low. However, even though
some species-wide studies of nucleotide
variation have revealed a low level of
recombination within some nuclear loci
(Kawabe et al., 2000), other loci have showed
several recombination events in the history of
the sample (Innan et al., 1996; Kawabe et al.,
1997; Kawabe and Miyashita, 1999;
Pugugganan and Suddith, 1999; Kuittinen and
Aguadé, 2000; Aguadé, 2001; Le Corre et al.,
2002; Miyashita, 2003). Furthermore, studies
on AFLP indicate that outcrossing does occur
in this selfing species (Miyashita et al., 1999).
Thus, studies on nuclear genes (e.g., Innan et
al., 1996; Kuittinen and Aguadé, 2000;
Aguadé, 2001, etc.) and AFLP (Miyashita et
al., 1999) indicate that recombination events
clearly have influenced the pattern of
polymorphism in Arabidopsis.
A well-defined dimorphic haplotypestructure with a clear separation in two highly
differentiated haplotypes has been found in
many Arabidopsis genes, such as ADH (Innan
et al., 1996), ChiA and ChiB (Kawabe et al.,
1997; Kawabe and Miyashita, 1999), Rpm1
(Stahl et al., 1999), FAH1 and F3H (Aguadé,
2001), TFL1 (Olsen et al., 2002), ACL5
(Yoshida et al., 2003), and CRY2 (Olsen et al.,
2004). Dimorphism was, however, restricted
to a few nucleotide differences at the CAL, AP3
and PI (Purugganan and Suddith, 1998, 1999),
and CHI genes (Kuittinen and Aguadé, 2000)
and there was no clear evidence for two major
haplotypes in these genes. Unlike other regions
where no clear haplotype-structure was present
or only two divergent sequence types were
detected, F18L15-130 region containing the
receptor-like protein kinase gene seems to
possess at least three divergent sequence types
(trimorphism, Miyashita, 2003). Relatively
high level of nucleotide variation in the many
studied nuclear regions is mostly caused by
differences between these two divergent
sequence types. Two different explanations
have been proposed for the origin of divergent
to evolve more rapidly than another commonly
analysed chloroplast gene, rbcL, and thus could
be a better sequence candidate for clarifying
relationships among closely related species
(Wang et al., 1999).
The choice of molecular markers for
phylogenetic studies can be very difficult. Both
nuclear and chloroplast markers have
advantages and disadvantages. Genes from two
different genomes may have distinct
phylogenies as a result of different inheritance
pathways and differential responses to
processes discussed above. On the other hand,
if different data sets give us similar trees, it
will give us confidence that both trees reflect
the same evolutionary history, and that the gene
trees are congruent with the true species tree.
1.5. Nucleotide variation in plants
1.5.1. Nucleotide variation in Arabidopsis
Genetic variation within and between
Arabidopsis populations has been studied with
meristem identity, floral developmental and
flowering time genes (e.g., Purugganan and
Suddith, 1998, 1999; Kuittinen et al., 2002;
Hagenblad and Nordborg, 2002; Le Corre et
al., 2002; Olsen et al., 2002; Shepard and
Purugganan, 2003; Olsen et al., 2004), genes
encoding metabolic enzymes (e.g., Hanfstingl
et al., 1994; Innan et al., 1996; Kawabe et al.,
2000; Miyashita, 2001; Aguadé, 2001;
Kuittinen and Aguadé, 2000; Miyashita, 2003;
Yoshida et al., 2003), and pathogen resistance
and defence genes (e.g., Kawabe et al., 1997;
Kawabe and Miyashita, 1999; Stahl et al.,
1999). Nucleotide diversity in these genes
varies from 0.0006 to 0.0558. The level and
pattern of DNA variation in the entire genome
of Arabidopsis has been studied using the
amplified fragment length polymorphism
(AFLP) analysis (Miyashita et al., 1999).
Nucleotide diversity for the entire genome was
estimated to be 0.0106, which is within the
range reported for specific nuclear genes.
Linkage disequilibrium, the nonrandom
association of allelic polymorphisms, among
polymorphic nucleotide sites has been
observed both within and among genes
(Nordborg et al., 2002; Hagenblad and
Nordborg, 2002; Shepard and Purugganan,
2003), and in some cases decaying of linkage
sequence types (allelic dimorphism),
introgression from a related species, or fusion
of previously isolated subpopulations of
Arabidopsis itself (Innan et al., 1996).
2002). Instead, the level of recombination rate
for Pal1 was high. Furthermore, the overall
genetic differentiation between the populations
of Scots pine was low, supporting the idea of
large effective population size in this species.
Nucleotide diversity and linkage
disequilibrium have been estimated among 19
loci in another long-lived outcrossing
gymnosperm, loblolly pine (Pinus taeda L.)
(Brown et al., 2004). The weighted average
diversities at silent (πs) and nonsynonymous
(πa) sites were 0.0064 and 0.0011, both rather
low values. The decay of linkage
disequilibrium was rapid and the observations
suggested substantial recombination in the
history of the sampled alleles.
The nucleotide variation of the sugi tree,
Cryptomeria japonica, has been studied using
several different nuclear loci (Kado et al.,
2003). Cryptomeria japonica is a
predominantly outcrossing and wind-pollinated
species and the distribution of this species is
restricted to Japan. The current population size
of this species is small. Cryptomeria japonica
has a long generation time and sometimes C.
japonica individuals live more than a thousand
years. The average nucleotide diversity for
silent sites was 0.0038, which is similar to that
in Scots pine. No apparent geographic
differentiation was found among studied
populations. The level of population
recombination rate in C. japonica was low and
this seems to be due to both low level of
recombination and small population size.
1.5.2. Nucleotide variation in woody plants
Nucleotide variation in nuclear genes has been
widely studied in herbaceous plants, especially
in the selfing Arabidopsis, as described above,
but there have been only a few published studies
on nucleotide variation in woody plants
(Dvornyk et al., 2002; Kado et al., 2003;
García-Gil et al., 2003; Brown et al., 2004).
On the basis of allozyme data, trees have been
found to contain significantly more variation
than herbaceous plants (Hamrick et al., 1992).
The average genetic diversity within
populations of woody plants was 0.148, which
is 46 % higher than the mean for annual (0.101)
and 51 % higher than the mean for perennial
(0.098) herbaceous species. Based on earlier
morphological and other studies with forest tree
species there is also extensive variation within
populations (Stern, 1964; Howland et al., 1995;
Laitinen et al., 2000). However, at the DNA
level, woody plants have shown lower levels of
nucleotide variation and species divergence
than herbaceous plants (Bousquet et al., 1992;
Savard et al., 1993; Laroche et al., 1997;
Andreasen and Baldwin, 2001).
Scots pine (Pinus sylvestris L.) is a long-lived, predominantly outcrossing perennial, and
its distribution area extends to most of the
Eurasian continent. Scots pine has a large
current population size. In earlier studies Scots
pine has shown high diversity at isoenzyme,
RFLP and microsatellite markers (Muona and
Harju, 1989; Karvonen and Savolainen, 1993;
Karhu et al., 1996). At the nucleotide level,
variation in Scots pine has been studied in
nuclear genes encoding phenylalanine
ammonia-lyase (Pal1; Dvornyk et al., 2002)
and phytochromes P and O (PHYP and PHYO;
García-Gil et al., 2003). The overall silent
variation (πs) for Pal1, PHYP, and PHYO loci
was low, only 0.0049, 0.0024 and 0.0013,
respectively (Dvornyk et al., 2002; García-Gil
et al., 2003). Also the level of nonsynonymous
variation (πa) for these three loci was very low.
There was no linkage disequilibrium even
between closely linked sites (Dvornyk et al.,
1.5.2.1. Betula pendula
Silver birch, or European white birch (Betula
pendula Roth/Betula verrucosa Ehrh.) is
distributed throughout the northern temperate
region (Atkinson, 1992). It is a wind-pollinated, outcrossing, monoecious species with
diclinous flowers. Birch, as
a pioneer species, migrated to Fennoscandia
after the last glacial epoch about 10 000 years
ago as a result of post-glacial migration of
individuals from refugia located to south-west,
south and south-east from Finland (Huntley and
Birks, 1983; Hyvärinen, 1987; Willis et al.,
2000). Based on variation in chloroplast DNA,
today’s silver birches in Europe can be
classified into two main haplogroups, of which
2. AIMS OF THE STUDY
one is dominant in the north-west and the other
in the south-east and east (Palmé et al., 2003).
In Finland the southeastern/ eastern haplogroup
is the dominant one representing about 70-90
% of the sample. Furthermore, the chloroplast
data showed that most variation within B.
pendula was found in central Europe, while
the levels of variation in northern and southern
populations were lower and very similar
to each other.
As earlier mentioned, based on allozyme
data, trees have been found to contain more
variation than herbaceous plants (Hamrick et
al., 1992). The average heterozygosity for
silver birch was 0.141 (Rusanen et al., 2003),
which is only slightly lower than the average
for woody plants presented by Hamrick et al.
(1992), 0.148. Earlier studies based on
restriction fragment length polymorphism
(RFLP) and random amplified polymorphic
DNA (RAPD) analyses have shown a high
degree of polymorphisms within
morphologically variable natural populations
of B. pendula (Howland et al., 1995).
Furthermore, intraspecific variation in
secondary chemistry has been found to be
considerable among and within genotypes of
B. pendula (Keinänen et al., 1999b), and within
a naturally regenerated B. pendula population
(Laitinen et al., 2000, Laitinen et al., 2002).
Resistance to insect herbivory varied also
significantly among genotypes of B. pendula
(Prittinen et al., 2003). These earlier
observations combined with the expected large
current population size of silver birch suggest
that the level of nucleotide variation in this
outcrossing, wind-pollinated species should be
higher than those detected in herbaceous
species, such as Arabidopsis.
The objectives of this thesis were:
1 to isolate the genomic clone of PI
homologue of birch and analyse its
structure.
2 to study the nucleotide variation in two
naturally regenerated, 70-year-old silver
birch populations by analysing the variation
within a population, among the
populations, and between different genes.
3 to study whether the post-glacial
colonisation history is reflected in the
distribution of variation within the
populations of silver birch.
4 to study the phylogenetic relationships
among species within the Betula genus.
3. MATERIALS AND METHODS
Only a brief outline of the materials and
methods is given in this chapter. For instance,
the polymerase chain reaction (PCR)
conditions, primer sequences and details of
sequence analyses are described in detail in
the original papers (I-IV).
3.1. Plant materials
3.1.1. Plant material for the isolation and
analysis of BpMADS2 (I)
For the isolation of total RNA, from which the
first-strand cDNA was prepared, and for the
Southern and Northern hybridisation, male and
female inflorescences of silver birch (Betula
pendula Roth) were collected from wild
trees so that all the main developmental stages
were represented. In addition, leaves and roots
for Northern hybridisation were collected from
4-week-old in vitro grown B. pendula (clone
JR ¼) seedlings. The samples were frozen and
stored at –80 °C.
3.1.2. Plant material for the population
studies (I, II)
The populations selected for the nucleotide
variation studies were naturally regenerated B.
pendula forests situated in Punkaharju, south-eastern Finland (61º49´N, 29º19´E) and in
Rovaniemi, northern Finland (66º20´N,
26º40´E). The forest stands were ca. 65-70
years of age and 20-25 m in height. Samples
Table 2. The eight locations where Betula pendula was sampled. The longitude and latitude do not in all
cases correspond exactly to the sampling location but to the nearest town.

Country    Location      Code    Longitude    Latitude
Finland*   Punkaharju    P.1.    29°19´       61°49´
Finland*   Rovaniemi     R.1.    26°40´       66°20´
Finland    Karjalohja    KL      23° 70´      60° 20´
Russia     Novosibirsk   K       78° 00´      53° 30´
Russia     Kurgan        M       64° 40´      55° 00´
Russia     Orenburg      O       55° 00´      52° 30´
Germany    Harzburg      G       10° 60´      51° 80´
Italy      Lillaz        I       7° 30´       45° 70´

*Individuals added from populations Punkaharju and Rovaniemi (I).
on the different chloroplast haplotypes
identified with PCR-RFLP (Palmé et al., 2003).
Two additional species of the birch family
(Betulaceae), Corylus avellana and Alnus
incana, were included in the studies III and IV
as outgroup members. The C. avellana
individual used as an outgroup in the ADH
analysis (III) and the A. incana individual used
as an outgroup in the BpMADS2 and BpFULL1
analyses (IV) were sampled in the Joensuu
Botanical Garden, Finland. The two C. avellana
individuals used as outgroups in the matK
analysis (III) were sampled in Halltorps Hage,
Sweden and Montejo de la Sierra, Spain. All
the samples were frozen in liquid nitrogen and
stored at –80 °C.
from 20 individuals were collected from both
locations, and the studied 10 individuals were
chosen randomly from these.
Leaf samples of six additional B. pendula
individuals representing different parts of the
distribution area of the species were obtained
from experiments located at the research
station of the Finnish Forest Research Institute
at Punkaharju, Finland, and from the seedlings
growing in Joensuu Botanical Garden (Table
2). All leaf samples were frozen and stored at
–80 °C.
3.1.3. Plant material for the phylogenetic
studies (III, IV)
The species for the phylogenetic studies were
chosen from all three major parts of the Betula
range: Europe, Asia, and North America, and
efforts were made to cover all the subgenera
or sections of the genus Betula (Table 1).
Leaves of 14 birch species representing five
subgenera (de Jong, 1993) were collected
either from individuals growing in botanical
gardens, from a natural population of B.
pendula at Punkaharju, Finland, or obtained
from experiments located at the research
station of the Finnish Forest Research Institute
at Punkaharju, Finland.
The seven additional B. pendula individuals
were used to investigate the within species
variation in matK (III). The individuals
included here came from different locations in
Europe and were chosen for this study based
3.2. Isolation and analysis of BpMADS2 and
BpADH genes (I, III)
A partial cDNA clone of BpMADS2 was first
isolated using PCR with partially degenerate
primers as described in paper I. An almost full-length cDNA clone was isolated using PCR
with a new, BpMADS2 specific primer together
with an oligo d(T)-primer (I). The promoter
region of BpMADS2 (9.4 kb), along with the
missing 21 bp from the 5’-end of the coding
region of BpMADS2, was isolated by the
screening of a λFixII genomic library (obtained
from Prof. J. Kangasjärvi, University of
Helsinki) using the 3´end of BpMADS2 (533
bp, nucleotides 337-870) as the probe. The
isolated promoter fragment was subcloned
pendula individuals from which two BpADH
alleles were amplified (II). These alleles were
only partly or not at all rechecked with direct
sequencing.
Total DNA was extracted from leaves of six
additional silver birch individuals from
different parts of the distribution area (Table
2), and genomic fragments of BpMADS2 5’end
region were isolated as described in paper I
(unpublished results). PCR products were
cloned, and positive clones were selected and
sequenced. The PCR fragments were
completely rechecked with direct sequencing
to obtain both alleles.
Nucleotide sequences of BpMADS2,
BpFULL1 and BpADH were assembled using
GCG software package, program PileUp
(release 10.0, Genetics Computer Group,
Madison, WI, USA (I)) or the EMBOSS
program package (release 2.4.1, The European
Molecular Biology Open Software Suite (II)).
The resulting sequences were aligned with
BioEdit (Hall, 1999) and Genedoc (Nicholas
and Nicholas, 1997) programs and refined
visually. The polymorphism data was analysed
using the program package DnaSP (version 3.5,
Rozas and Rozas, 1999). Insertion/deletion
(indel) and microsatellite length variation was
not included in the estimates of nucleotide
diversity. Microsatellite variation was
analysed by comparing mean numbers of
repeats between populations and haplotypes.
Neighbour-joining (Kimura-2P distance
measure; Saitou and Nei, 1987 (I)), or DNA
parsimony trees (Heuristic search; Fitch, 1971
(II)) were constructed using programs available
in ClustalX (Thompson et al., 1997) or Phylip
(version 3.5, Felsenstein, 1993).
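The nucleotide diversity estimates in this work were obtained with DnaSP. Purely as an illustration of the quantity involved, the following Python sketch computes total nucleotide diversity (π) as the mean number of pairwise differences per site, leaving out every alignment column that contains a gap, in line with the exclusion of indel variation described above. The function name and the toy alignment are hypothetical, and the sketch does not separate silent (πs) from nonsynonymous (πa) sites.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Mean pairwise differences per site (pi), ignoring gapped columns."""
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only columns without an alignment gap in any sequence.
    kept = [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]
    if not kept:
        raise ValueError("no gap-free sites")
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(alignment, 2):
        total_diffs += sum(1 for i in kept if a[i] != b[i])
        n_pairs += 1
    return total_diffs / (n_pairs * len(kept))


# Hypothetical toy alignment (not birch data): 4 alleles, 10 columns, 1 gapped.
toy = ["ACGTAC-TAG", "ACGTACGTAG", "ACATACGTAG", "ACATACGTCG"]
pi_total = nucleotide_diversity(toy)
print(round(pi_total, 4))
```

In the toy alignment nine gap-free columns remain and the six pairwise comparisons differ at seven positions in total, so π is 7/54.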
using PCR with the vector specific forward
primer and the gene specific BpMADS2 reverse
primer (5’-GCTTGTTCTTCTTGCTTGTGG-3’).
The copy number of the BpMADS2 gene was
studied using Southern hybridisation analysis
(I). The expression pattern of the BpMADS2
gene was studied by Northern hybridisation
analysis (I). Expression of the BpMADS2 gene
in certain parts of the plant (roots, leaves,
inflorescences) or different developmental
stages of the male or female inflorescences was
also studied with PCR using first-strand cDNA
as a template (I).
The genomic clone of BpADH (3.1 kb) was
first isolated from a λFixII genomic library
using the partial cDNA of BpADH (nucleotides
55-417, accession number AJ279698, received
from M. Korhonen, University of Helsinki) as
the probe and then subcloned using PCR (III).
The copy number of the BpADH gene was
studied using Southern hybridisation analysis
(III).
The sequence comparisons of BpMADS2 and
BpADH genes were mainly done using GCG
software package, program PileUp (release
10.0; Genetics Computer Group, Madison, WI,
USA), and Genedoc (Nicholas and Nicholas,
1997) and ClustalX (Thompson et al., 1997)
programs. The phylogenetic analyses of
BpMADS2 and BpADH amino acid sequences
were done using Neighbor-joining algorithm
(Saitou and Nei, 1987) with ClustalX software
(I) or using the programs available in Phylip
3.5 (Felsenstein, 1993 (III)).
3.3. Isolation and analysis of BpMADS2,
BpFULL1, and BpADH fragments (I, II)
Total DNA was extracted from young leaves
of silver birch with the DNeasy Plant Mini Kit
(QIAGEN). Genomic fragments were isolated
using PCR with gene specific primers (I, II and
III). The amplified PCR products of
BpMADS2, BpFULL1 and BpADH were
cloned, and positive clones were selected and
sequenced. All sequence polymorphisms were
visually rechecked from chromatograms. Most
of the PCR fragments were also partly or fully
rechecked with direct sequencing and specific
attention was drawn to the microsatellite length
variation. The only exceptions are seven B.
3.4. Phylogenetic studies (III, IV)
Total DNA was extracted from young leaves
of 14 birch species and C. avellana (III) and
A. incana (IV) outgroups as described in paper
III. Genomic fragments of BpMADS2-,
BpFULL1-, and ADH-homologues were
isolated using PCR with gene specific primers
(I, II, III). The amplified fragments were
cloned, and positive clones were selected and
sequenced. All sequence polymorphisms were
visually rechecked from chromatograms. Most
of the PCR fragments were rechecked with
isolated BpMADS2 gene was sequenced and
the sequence was used in sequence
comparisons, and phylogenetic analyses.
According to the sequence comparisons, at
both nucleotide and amino acid level,
BpMADS2 is most similar to the Arabidopsis
and Antirrhinum majus B-function genes PI
and GLOBOSA (GLO).
Hybridisation analyses were conducted to
study the copy number and the expression
pattern of the BpMADS2 gene. Southern
hybridisation revealed that there was only one
genomic fragment hybridising with the
BpMADS2 probe. This result indicates that
BpMADS2 is a single copy gene in birch. The
localisation of the gene expression of
BpMADS2 was carried out using the Northern
hybridisation analysis. BpMADS2 was
expressed in male inflorescences and at the
early stages of development also in female
inflorescences but not in vegetative tissues
(roots, leaves, and shoots). At the early
developmental phases, the expression in male
inflorescences was weak. At the later
developmental phases the expression became
stronger and the strongest expression was
detected in the late developmental phase of
male inflorescences, before flower opening. No
expression was detected in later stages of the
female inflorescence development. PCR
analysis confirmed that BpMADS2 was
expressed in male and young female
inflorescences but not in older female
inflorescences, or in vegetative tissues.
The members of the PI lineage can be further
distinguished from other lineages, especially
from the otherwise very similar AP3 lineage,
by diagnostic sequences at the K domain and
at the C-terminal end of the predicted protein.
The PI-like genes, including e.g. GLO from
Antirrhinum majus and MdPI from Malus
domestica, typically code for consensus
sequence MPFxFRVQPxQPNLQE (PI motif)
at the C-terminal end of the protein, whereas
AP3-like genes code for a different sequence,
D(L/I)TTFALLE (euAP3 motif in higher
eudicots) or YGxHDLRLA (paleoAP3 motif
in most of the Ranunculidae and the magnolid
dicots) (Kramer et al., 1998). The members of
the PI clade are highly conserved also at the K
domain. This region displays a consensus
direct sequencing. The only exceptions are
those gene regions from which two alleles of
unequal lengths were amplified. Due to
technical problems, these alleles were only
partly or not at all rechecked with direct
sequencing.
Nucleotide sequences were analysed using
GCG program package (release 10.0, program
PileUp, Genetics Computer Group, Madison,
WI, USA) or EMBOSS program package
(release 2.4.1, The European Molecular
Biology Open Software Suite). The resulting
sequences were aligned with BioEdit (Hall,
1999), GeneDoc (Nicholas and Nicholas,
1997) and ClustalX (Thompson et al., 1997)
programs, and refined visually. Nucleotide and
haplotype diversity analyses (III) were
conducted using the program package DnaSP
(Rozas and Rozas, 1999). The presence of
recombination and/or gene conversion among
ADH sequences (III) was tested with the
program Geneconv v. 1.81 (Sawyer, 1989;
1999). The phylogenetic trees of the nuclear
DNA sequences were inferred using two
methods: maximum parsimony using heuristic
search and maximum likelihood as
implemented in the program Phylip 3.5
(Felsenstein, 1993). The reliability of the trees
was tested using bootstrapping.
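The bootstrapping itself was run within Phylip. As a hedged illustration of the resampling step only, the sketch below draws alignment columns with replacement to build pseudoreplicate data sets; tree inference on each replicate is not shown. The function name, the fixed seed and the toy alignment are assumptions made for this example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns resampled with replacement."""
    length = len(alignment[0])
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Hypothetical toy alignment (not birch data).
toy = ["ACGTACGTAG", "ACGTACGTAG", "ACATACGTAG", "ACATACGTCG"]
rng = random.Random(1)  # fixed seed so that the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(len(replicates), len(replicates[0]), len(replicates[0][0]))
```

Each replicate has the same number of sequences and sites as the original alignment, so the same tree-building method can be applied to it and the support for a clade counted across replicates.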
Under the assumption that all studied nuclear
data sets used for phylogenetic analyses (III,
IV) share a common evolutionary history, the
data sets were combined. Phylogenetic analysis
of the combined data set was conducted using
the maximum parsimony, maximum likelihood
and Neighbour-joining methods (IV).
4. RESULTS
4.1. Isolation and analysis of the BpMADS2
gene (I)
The partial cDNA clone of BpMADS2 was first
isolated using PCR with degenerate primers.
The corresponding almost full-length cDNA
clone was isolated using PCR with the gene
specific and oligo d(T)-primers. A genomic
clone containing the missing nucleotides from
the 5’-end of the coding region of BpMADS2
and a 9-kb fragment upstream to the cDNA
clone was isolated by screening of the genomic
library with the cDNA as the probe. The
[Alignment rows: FBP1, NtGLO, GLO, PMADS2, SLM2, AiMADS2, BpMADS2, MdPI, PI, EGM2, OsMADS2, OsMADS4, PrDGL, Consensus; the PI motif is marked.]
Figure 2. Alignment of C-terminal PI-motif regions of the predicted protein sequences analysed in this study
(I, IV). The names of genes cloned in this study are highlighted in bold and the consensus is shown
below. FBP1, Petunia hybrida, acc.no. M91190 (Angenent et al., 1992); PMADS2, Petunia hybrida,
acc.no. X69947 (Kush et al., 1993); NtGLO, Nicotiana tabacum, acc.no. X67959 (Hansen et al.,
1993); GLO, Antirrhinum majus, acc.no. S28062 (Trobner et al., 1992); SLM2, Silene latifolia,
acc.no. X80489 (Hardenack et al., 1994); MdPI, Malus domestica, acc.no. AJ291490 (Yao et al.,
2001); PI, Arabidopsis thaliana, acc.no. D30807 (Goto and Meyerowitz, 1994); EGM2, Eucalyptus
grandis, acc.no. AF029976 (Southerton et al., 1998); OsMADS2, Oryza sativa, acc.no. L37526
(Chung et al., 1995); OsMADS4, Oryza sativa, acc.no. L37527 (Chung et al., 1995); PrDGL, Pinus
radiata, acc.no. AF120097 (Mourdarov et al., 1999).
sequence KHExL. The comparable sequence
in the K box of the AP3 homologues is (H/
Q)YexM. Both of these highly conserved PI
motifs are found from BpMADS2, and
especially in PI motif the homology between
the consensus sequence and BpMADS2 was
high (thirteen amino acids out of fourteen
possible were identical, Fig. 2).
initiation and the maintenance of PI and AP3
expression patterns (Tilly et al., 1998; Chen et
al., 2000; Honma and Goto, 2000), a shorter,
3-kb fragment was selected for further analysis
(unpublished results). The promoter sequence
of BpMADS2 was analysed by using PLACE
database that contains previously published
sequence motifs found in plant cis-acting
regulatory elements (Higo et al., 1999). The
sequence of the BpMADS2 promoter was
further compared to the functionally defined
PI and AP3 regulatory elements (Tilly et al.,
1998; Chen et al., 2000).
The 3-kb promoter region contained one
putative sequence motif (from site –2204 to
site –2195) that resembles the MADS domain
protein consensus binding sites, known as the
CArG box (unpublished results). No other
putative CArG boxes were detected within this
3-kb promoter region. In addition to this,
4.1.1. Identification of the putative
regulatory elements in the BpMADS2
promoter
In order to identify the putative regulatory
elements in the BpMADS2 promoter, a 9-kb
fragment upstream of the BpMADS2 first ATG
codon (putative translation start site) was
isolated and sequenced. Because the earlier
results with Arabidopsis B-function genes PI
and AP3 have shown that a much shorter
promoter region is sufficient to confer both the
several other shorter putative regulatory
elements could be identified in comparisons
between the promoter region of BpMADS2 and
PLACE database (data not shown). The 3-kb
promoter region of BpMADS2 was further
compared to the functionally defined PI
promoter elements. Since the BpMADS2
promoter has not been functionally dissected,
identities of fewer than eight consecutive
nucleotides were ignored to minimise the
possibility of their occurring by chance. The
BpMADS2 promoter does show some
similarities to the PI promoter – a consecutive
stretch of ten nucleotides (-477 to -468,
CAAAAGCAAG) corresponds to the positive
late element (PLE1; Chen et al., 2000) found
in PI promoter and nine of these nucleotides
were identical also with the corresponding
region in the AP3 promoter. PI late element 2
identified an 11-nucleotide motif
(TTAAGAAAGTA), out of which 10
nucleotides were identical in the BpMADS2
promoter (nucleotides –250 to –240).
However, BpMADS2 promoter did not contain
regions showing significant similarities to PI
positive late element 3 (PLE3) or positive early
element (PEE).
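The kind of comparison described above can be sketched in Python as follows. The sketch slides a known element along a promoter sequence, reports the ungapped window with the most identical nucleotides, and then checks the length of the longest run of consecutive identities against the threshold of eight used here. The motif TTAAGAAAGTA is taken from the text; the promoter fragment, the function names and the constant name are hypothetical and serve only to make the example runnable.

```python
MIN_CONSECUTIVE = 8  # shorter runs of identity are ignored, as stated in the text


def best_window(motif, sequence):
    """Return (start, identities) for the ungapped window most similar to motif."""
    motif, sequence = motif.upper(), sequence.upper()
    k = len(motif)
    if len(sequence) < k:
        raise ValueError("sequence shorter than motif")
    best_start, best_identities = 0, -1
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        identities = sum(1 for a, b in zip(motif, window) if a == b)
        if identities > best_identities:
            best_start, best_identities = start, identities
    return best_start, best_identities


def longest_identical_run(a, b):
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


motif = "TTAAGAAAGTA"                 # 11-nucleotide motif quoted in the text
promoter = "GGCTTAAGAAAGTCCATG"      # hypothetical fragment, not the BpMADS2 sequence
start, identities = best_window(motif, promoter)
run = longest_identical_run(motif, promoter[start:start + len(motif)])
print(start, identities, run, run >= MIN_CONSECUTIVE)
```

For the invented fragment the best window matches 10 of the 11 motif positions in one uninterrupted run, so it would pass the eight-nucleotide threshold.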
because the whole gene would have been too
long and difficult to amplify as one fragment
(I). Region I of the BpMADS2 gene comprised
the 3´end of the MADS-box, the I-region, the
5´end of the K-box, and two introns. Region II
comprised most of the C-terminal region, one
longer intron, and some of the 3´untranslated
region. The detected level of diversity in the
BpMADS2 gene was low. At Region I of
BpMADS2 locus there were five segregating
sites among 40 alleles and 770 bp sequenced.
At Region II of BpMADS2 there were 42 silent
segregating sites in addition to only one
nonsynonymous site among 20 alleles and
1680 bp sequenced. The overall silent
variation (πs) for BpMADS2, including third
position of codons and noncoding regions, was
0.0043 and the estimate of nonsynonymous
variation (πa) was only 0.00052 (summarised
in Table 3). Intragenic recombination was
detected in both populations in Region II of
the BpMADS2 gene, but no intragenic
recombination has occurred in Region I (Table
4). Instead, significant linkage disequilibria
were detected within Region I, but also in
Region II between one pair of sites in
population Punkaharju.
4.2. Isolation and analysis of the BpADH
gene (III)
The genomic clone of BpADH was isolated by
screening of the genomic library using the
BpADH cDNA clone as the probe. According
to the nucleotide sequence comparisons,
BpADH belongs to the same group as the
Arabidopsis gene ADH1 (Chang and
Meyerowitz, 1986). According to phylogenetic
analyses, BpADH clusters along with other
dicot ADH homologues. Southern
hybridisation revealed that there were two
genomic fragments hybridising with the
BpADH probe, which indicates that there might
be at least two ADH genes in birch. The gene
expression of BpADH in birch has not been
tested yet.
Table 3. Summary of nucleotide polymorphism in
BpMADS2, BpFULL1 and BpADH loci of
silver birch.

Sequence       πs        πa        πtotal
BpMADS2        0.0043    0.00052   0.0045a
BpFULL1        0.0134    0         0.0109
BpADH          0.0117    0.0015    0.0078
BpMADS2*       0.00334   0         0.00283
Mean values    0.00818   ND        0.00682

πs from overall synonymous sites (third position of codons and
noncoding regions); πa from nonsynonymous sites; ND not
determined.
a Regions I and II of the BpMADS2 gene, cloned alleles
combined.
* Region I of the BpMADS2 gene, eight B. pendula individuals
from different parts of the distribution area.
4.3. Variation within the two B. pendula
populations (I, II)
4.3.1. Nucleotide polymorphism
Nucleotide variation of the BpMADS2 gene
was studied in two separate regions (Fig. 3),
Figure 3. The genomic regions of BpMADS2 and BpFULL1 used in sequence variation and phylogenetic
analysis.
Nucleotide variation of the BpFULL1 gene
was studied in a region, which comprised the
3´end of the K-box, most of the C-terminal
region, and four introns (Fig. 3, II). The
detected level of diversity of the BpFULL1
gene was much higher than in the BpMADS2
gene. At BpFULL1 gene there were 38
segregating sites among the 20 alleles and 1217
bp sequenced. The overall silent variation, πs,
was 0.0134 (Table 3). There was no
nonsynonymous variation in the coding region
of BpFULL1, and the only polymorphism
within the coding region was a synonymous
substitution of T to C. No evidence for linkage
disequilibrium was detected. Instead,
intragenic recombination has occurred within
both populations, and the value of RM per
informative site varied from 0.10 to 0.15 (Table
4).
The region selected to study nucleotide
variation of the BpADH gene (II) covers
portions of five exons and four introns (Fig.
4), and corresponds to nucleotides 642-1770
of the Arabidopsis ADH sequence (Miyashita
et al., 1996). Unlike BpMADS2 and BpFULL1
genes, where only one fragment was amplified
from all individuals with the primers used (I,
II), an additional, about 450 bp longer fragment
was amplified from seven silver birch
individuals. The length variation between these
two alleles was due to one long indel. For the
polymorphism analyses only one allele per
individual was chosen from these seven
individuals with two BpADH alleles. At
Table 4. Summary of statistics for intragenic recombinations at the BpFULL1, BpADH, and BpMADS2
genes.

                              BpFULL1                  BpADH                    BpMADS2, Region I        BpMADS2, Region II
                              Punkaharju  Rovaniemi    Punkaharju  Rovaniemi    Punkaharju  Rovaniemi    Punkaharju  Rovaniemi
Length*                       1173        1061         1059        1058         745         747          1662        1662
RM                            5           3            2           1            0           0            2           3
No. of informative sites      33          30           31          28           5           4            31          29
RM/no. of informative sites   0.1515      0.1000       0.0645      0.0357       0.0000      0.0000       0.0645      0.1034
C                             0.0102      0.0837       0.0099      0.0015       0.0006      0.0000       0.0045      0.0404
C per gene                    12.1        96.1         11.0        1.7          0.5         0.001        7.6         67.7

RM, the minimum number of recombination events by Hudson and Kaplan (1985); C (or R), the estimator of the
populations’ recombination rates per site (4Nr) by Hudson (1987). * Number of sites excluding sites with alignment
gaps. BpFULL1, BpADH and BpMADS2 (Region II) fragments: 10 individuals per population, BpMADS2 (Region I): 20
individuals per population. BpMADS2, Regions I and II: calculated from the data of Järvinen et al., 2003.
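The RM values in Table 4 were obtained with DnaSP. To show the idea behind the statistic, the following Python sketch applies the four-gamete test to every pair of sites and then counts the largest set of non-overlapping intervals in which all four gametes occur, which is one common way of arriving at the minimum number of recombination events. It assumes phased haplotypes coded over biallelic segregating sites; the function names and the toy haplotypes are hypothetical, and the sketch is not the DnaSP implementation.

```python
from itertools import combinations


def four_gamete_intervals(haplotypes):
    """Site pairs (i, j), i < j, at which all four two-site gametes occur."""
    n_sites = len(haplotypes[0])
    intervals = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:  # assumes two alleles per site
            intervals.append((i, j))
    return intervals


def minimum_recombination_events(haplotypes):
    """Greedy count of disjoint four-gamete intervals (sorted by right end)."""
    intervals = sorted(four_gamete_intervals(haplotypes), key=lambda iv: iv[1])
    count = 0
    last_right = -1
    for left, right in intervals:
        # Intervals that only share an end site need separate events.
        if left >= last_right:
            count += 1
            last_right = right
    return count


# Hypothetical haplotypes over segregating sites (0/1 coding), not birch data.
recombinant = ["0000", "0101", "1010", "1111"]
tree_like = ["000", "100", "110", "111"]
print(minimum_recombination_events(recombinant))
print(minimum_recombination_events(tree_like))

# Ratio reported in Table 4 for BpFULL1, Punkaharju: RM = 5, 33 informative sites.
print(round(5 / 33, 4))
```

The last line only repeats the arithmetic of the table row "RM/no. of informative sites" for one column.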
Figure 4. The genomic region of BpADH used in sequence variation and phylogenetic analysis.
BpADH there were 33 silent polymorphisms
in addition to the six nonsynonymous sites. The
detected level of overall silent variation of the
BpADH gene was very similar to BpFULL1
(II) and much higher than in BpMADS2 (I,
Table 3). The estimate of the nonsynonymous
nucleotide diversity (πa) in BpADH (0.0015),
however, was much higher than the estimates
for BpMADS2 and BpFULL1.
Intragenic recombination has occurred within
both populations, and the value of RM per
informative site was from 0.04 to 0.06 (Table
4). No evidence for linkage disequilibrium was
detected.
Both alleles of Region I of the BpMADS2
gene were sequenced from 6 additional silver
birch individuals (Table 2) from different parts
of the distribution area, and one individual
from populations Punkaharju and Rovaniemi
was added to this data set to give a total of 16
alleles (unpublished results). A total of four
segregating sites and two microsatellite
polymorphisms were detected within 16 alleles
(Table 5). All of the segregating sites were
located within the introns. The nucleotide
diversity (πtotal) for the entire region was
0.00283. The overall silent variation, πs,
including third position of codons and
noncoding regions, was 0.00334 (Table 3). The
total number of haplotypes among the 16
BpMADS2 sequences was five and the
haplotype diversity, H (Nei, 1987, pp. 259-260), was 0.683 ± 0.091. Significant linkage
disequilibrium was detected between three
pairs of sites [(183, 514) (183, 634) (514,
634)]. The estimated minimum number of
recombination events, RM, was one, indicating
that intragenic recombination has occurred, and
the value of RM per informative site was 0.25.
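Haplotype diversity in the sense of Nei (1987) is H = n/(n − 1) × (1 − Σ p_i²), where n is the number of sampled alleles and p_i the frequency of haplotype i. The Python sketch below evaluates this formula. The haplotype counts used are hypothetical: this section reports only that five haplotypes were found among 16 alleles, not how many alleles carried each one, so the split 8/5/1/1/1 is an assumption chosen for illustration and should not be read as the observed distribution.

```python
def haplotype_diversity(counts):
    """Nei's haplotype diversity from a list of haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    sum_p_squared = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_p_squared)


# Hypothetical split of 16 alleles into five haplotypes (assumption, see above).
example_counts = [8, 5, 1, 1, 1]
h = haplotype_diversity(example_counts)
print(round(h, 3))
```

With these assumed counts the formula gives 164/240, a value of the same size as the reported estimate; the standard deviation is not computed here.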
In estimating the overall level of nucleotide
diversity (πtotal) and mean levels of silent site
nucleotide diversity for these loci, all studied
gene regions from both populations were
aligned together (cloned alleles). The overall
level of nucleotide diversity (πtotal) for silver
birch was 0.00682 (Table 3, unpublished
results). The mean value of πs (third position
of codons and noncoding regions) for the
combined data set was 0.00818.
Because a larger sample size increases the
power of detecting significant linkage
disequilibrium, linkage disequilibrium was
surveyed for nucleotide polymorphisms within
and between the studied genes using the
combined data set. The amount of linkage
disequilibrium was estimated using the r2
statistic (Hill and Robertson, 1968) for
polymorphic sites, and the significance of
pairwise disequilibrium comparisons was
assessed with Fisher’s exact test. Strong levels
of intragenic disequilibrium were observed
only within the BpADH gene (Fig. 5). No
intergenic disequilibrium was observed even
between the nearest sites. These results indicate
that the decay of linkage disequilibrium is
especially rapid in these three studied silver
birch genes.
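As an illustration of the r2 statistic used here, the sketch below computes it for pairs of biallelic sites. The allele columns are my transcription of three site columns of Table 5 (16 range-wide alleles), not the combined data set analysed above, and the function name is hypothetical.

```python
def r_squared(site_a, site_b):
    """r^2 between two biallelic sites given as equal-length allele strings."""
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b                 # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# One character per allele, in the row order of Table 5 (assumed transcription).
SITE_183 = "CCACCAAAACCCCAAC"
SITE_514 = "CCTCCTTTTCCCCTTC"
SITE_634 = "CCTCCTTTTCCCTCTC"

print(r_squared(SITE_183, SITE_514))  # identical patterns: complete disequilibrium
print(r_squared(SITE_183, SITE_634))  # two alleles break the association
```

The statistic is symmetric in the choice of reference allele, so the arbitrary choice of the first allele as reference does not affect the result.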
[Figure 5 plot: r2 (vertical axis, 0.0 to 1.0) against distance between sites in bp (horizontal axis, 0 to 5000).]
Figure 5. Average rate of decay of linkage disequilibrium, measured by the correlation
coefficient between nucleotide sites (r2), in
silver birch based on three nuclear genes.
Table 5. Polymorphic nucleotide sites among 16 alleles (8 individuals) in Region I of the BpMADS2
gene in Betula pendula individuals collected from different parts of the distribution area. Only
differences from the consensus sequence are shown. Dots indicate identity with the consensus
sequence. The positions of the polymorphic sites in two different introns are indicated at the
top. a, cloned allele; b, allele from direct sequencing.

                Intron I                 Intron II
Individual      63-110      153-202
and allele      (CT)n       (CT)n        183    514    618    634
Consensus                                C      C      G      C
P.1.a.          13          14           .      .      .      .
P.1.b.          13          14           .      .      .      .
R.1.a.          24          25           A      T      T      T
R.1.b.          14          18           .      .      .      .
KL.a.           14          18           .      .      .      .
KL.b.           16          18           A      T      C      T
K.a.            17          20           A      T      C      T
K.b.            17          20           A      T      C      T
M.a.            13          19           A      T      C      T
M.b.            14          19           .      .      .      .
O.a.            13          13           .      .      .      .
O.b.            13          13           .      .      .      .
G.a.            14          15           .      .      .      T
G.b.            18          20           A      T      C      .
I.a.            14          20           A      T      C      T
I.b.            13          15           .      .      .      .

4.3.2. Dimorphism of haplotypes
In all of the three studied nuclear genes, there
was a suggestion of two allele classes, and in
BpMADS2 and BpFULL1 genes the more
frequent allele class represented about 75 %
of the whole sample (I, II). However, in
BpFULL1 the allele class A comprised 55 %
of the sequences and the remaining 45 % of
the alleles formed the second group in the
phylogeny (II). The two classes of BpMADS2
alleles were also present in silver birch
individuals collected from different parts of the
distribution area (Fig. 6, unpublished results).
Especially in Region I of BpMADS2 the
presence of significant linkage disequilibrium
confirmed the strong dimorphism in this region
(I).
The two main allele types also showed some
differences at the microsatellite level. In Region I
of the BpMADS2 gene the haplotype B was
associated with the longer microsatellites,
especially in population Rovaniemi (I). In the
BpFULL1 gene the haplotype B was associated
with the long (TC)n repeat, especially in
population Punkaharju (II). In Region II of
BpMADS2 and in BpADH none of the
microsatellites were associated with the two
different allele classes (I, II).
Although all the studied gene regions
exhibited traces of allelic dimorphism, the two
allele classes in each gene region were
differentiated from each other only by a limited
number of nucleotide sites (I, II, unpublished
results). This indicates that even though the
studied genes displayed allelic dimorphism,
which at some parts of the genes was maximal,
it seems likely that allelic dimorphism has
already disappeared or will gradually disappear
from the silver birch nuclear genome.
4.3.3. Divergence between populations and
demographic equilibrium
In all three genes the level of genetic
differentiation between the two populations
was low. In Region I of BpMADS2 and in
BpADH the genetic differentiation between the
two populations was very low (FST –0.0226 and
–0.0453, respectively (I, II)). The Region II
of BpMADS2 and BpFULL1 showed some, but
not significant, genetic differentiation between
the two populations (FST 0.0930 and 0.1082,
respectively (I, II)).
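Slightly negative FST values such as those above arise naturally with sequence-based estimators when two samples are no more different from each other than alleles within a sample. The sketch below assumes the estimator FST = 1 - Hw/Hb (mean pairwise differences within versus between populations); the section does not state which estimator was used in the original analyses, and the sequences and function names are hypothetical toy examples.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(seqs):
    """Mean number of pairwise differences within one sample."""
    pairs = list(combinations(seqs, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean number of pairwise differences between two samples."""
    pairs = list(product(pop1, pop2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def fst(pop1, pop2):
    """FST = 1 - Hw/Hb, with Hw averaged over the two samples (assumed estimator)."""
    h_w = (mean_within(pop1) + mean_within(pop2)) / 2
    h_b = mean_between(pop1, pop2)
    return 1 - h_w / h_b


# Toy data: two samples with identical allele composition, and two fixed samples.
same_a = ["AAAA", "AAAT", "AATT", "ATTT"]
same_b = ["AAAA", "AAAT", "AATT", "ATTT"]
print(fst(same_a, same_b))                            # negative: no differentiation
print(fst(["AAAA", "AAAA"], ["TTTT", "TTTT"]))        # 1.0: complete differentiation
```

Under this estimator a negative value is simply read as no detectable differentiation.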
[Figure 6: tree diagram with B. ermanii as the outgroup and the 16 alleles of Table 5 as tips; bootstrap values are shown at the nodes.]
Figure 6. Gene genealogy of BpMADS2 Region I alleles from eight individuals collected from different parts
of the distribution area. All nodes with < 50% bootstrap support are collapsed; the other bootstrap
values are indicated next to relevant nodes.
The Tajima’s D test (Tajima, 1989) was used
for testing the fit of the frequency distribution
to the neutral expectation. Among the three
studied genes, two regions have negative
values of Tajima’s D (Tajima, 1989), and also
when the cloned alleles of Regions I and II of
the BpMADS2 gene are combined, this region
has negative value of Tajima’s D indicating an
excess of low-frequency polymorphisms within
these loci (I, II). This pattern of variation may
reflect the postglacial expansion of this species.
However, the obtained values were not
statistically significant, and, furthermore, in the
presence of recombination the test is
conservative. In contrast, both Region I of the
BpMADS2 gene and the BpFULL1 gene have
positive values of the Tajima’s D, especially
in population Punkaharju. Positive values of
Tajima’s statistics are associated with an excess
of intermediate-frequency polymorphisms.
Patterns of this kind can arise in different
situations. Balancing selection that maintains
variation will increase the proportion of alleles
at intermediate frequencies. On the other hand,
if previously isolated somewhat differentiated
populations are fused, the portion of
intermediate allelic frequencies is again
increased. Thus, the most likely explanation
for the obtained positive values may be the
fusion of previously isolated subpopulations
of silver birch.
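The two situations contrasted above can be reproduced with Tajima's (1989) statistic on toy data. The sketch below is a minimal implementation under the usual assumptions (aligned sequences, no missing data, at least one segregating site); the two samples are hypothetical and are not taken from the birch data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D from a list of aligned, equal-length sequences."""
    n = len(seqs)
    # S: number of alignment columns with more than one state
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    # k: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical sample with only low-frequency variants (each site a singleton).
singletons = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA", "AAAA", "AAAA"]
# Hypothetical dimorphic sample: two allele classes at equal frequency.
dimorphic = ["AAAA"] * 4 + ["TTTT"] * 4

print(tajimas_d(singletons))  # negative: excess of low-frequency polymorphisms
print(tajimas_d(dimorphic))   # positive: excess of intermediate-frequency polymorphisms
```

The sign of D depends only on whether the mean pairwise difference k is smaller or larger than S/a1, which is why a dimorphic sample pushes the statistic upwards.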
4.3.4. Estimates of polymorphism and
divergence
The MK test of McDonald and Kreitman
(1991) was conducted for detecting selection
(II). BpADH gene had two synonymous and
six nonsynonymous polymorphisms within B.
pendula, compared to three synonymous and
three nonsynonymous fixed differences between
B. pendula and B. ermanii. The total number
of polymorphic sites was larger than that of
fixed sites, reflecting the high polymorphism
and low divergence. However, the result was
not statistically significant (P = 0.58). BpFULL1
had only one synonymous polymorphism, but
no nonsynonymous polymorphisms within B.
pendula, compared to one nonsynonymous
change between B. pendula and B. ermanii.
BpMADS2 had only one nonsynonymous
polymorphism, and no synonymous
polymorphisms within B. pendula, compared
to one nonsynonymous change between B.
pendula and B. ermanii. Due to the lack of
nonsynonymous variation the test did not have
power in BpFULL1 and BpMADS2 genes and
the obtained results were not statistically
significant.
The HKA test of Hudson, Kreitman and
Aguadé (1987) was conducted to examine
whether the level of polymorphism in the
BpMADS2, BpFULL1 and/or BpADH
regions is statistically higher than that of
divergence compared to other loci (II). No
significant discrepancy in the levels of
polymorphism and divergence was detected.
This pattern suggests that the evolutionary
dynamics of these three genes do not differ
significantly from each other. However, the
regions studied, especially in BpMADS2 and
BpFULL1 genes, contained only a few hundred
nucleotides of coding sequence and only a few
nonsynonymous polymorphisms, and due to
the lack of nonsynonymous variation this test
had low power.
4.4. Birch phylogeny (III, IV)
The phylogenetic relationships within the
genus Betula (Betulaceae) were studied
utilising two flower specific silver birch genes,
BpMADS2 and BpFULL1 (I; Elo et al., 2001),
the ADH gene, and the chloroplast matK gene
and parts of its upstream and downstream
flanking regions.
The Region I of the BpMADS2 gene that was
sequenced covers portions of three exons and
two introns. From 13 birch species a single
product that ranged in size from 757 to 785 bp
was amplified (IV). From B. ermanii two
fragments were amplified, one being 666 bp
(putative pseudogene), and the other 793 bp
in length. Thirty sites out of 737 analysed were
variable, and of these 15 were parsimony
informative. The variation in this region with
15 parsimony informative characters divided
the genus Betula into three main groups in the
maximum likelihood (ML) tree: B. lenta and
B. alleghaniensis formed the first group, B.
nana, B. papyrifera, B. maximowicziana, B.
ermanii, B. humilis, and B. fruticosa the second
one, and B. resinifera, B. platyphylla, B.
pubescens, B. populifolia, B. schmidtii, and B.
pendula the third group. When gaps were
excluded from the data set, the maximum
parsimony (MP) method recovered the same
three main groups as the ML method, but
clustered B. nana as a sister species of group
III. When gaps were included in the data set,
the MP method recovered the same three main
groups as the ML and MP (gaps excluded)
methods, but clustered B. papyrifera
and B. ermanii into group III.
The Region II of BpMADS2 comprises most
of the C-terminal region, one longer intron, and
some of the 3’ untranslated region. A single
fragment ranging in size from 1602 to 1701
bp was amplified from all 14 species (IV). Out
of the 1443 sites analysed 182 were variable.
Of these 91 were parsimony informative. The
variation in Region II of BpMADS2
homologues with 91 parsimony informative
characters divided the studied birch species
into three main groups with both ML and MP
methods, and with both data sets (gaps
included/excluded). B. lenta and B.
alleghaniensis formed the first group, B.
maximowicziana, B. ermanii, B. humilis and
B. fruticosa the second one, and all of the
remaining species the third one.
The region of the BpFULL1 gene that was
sequenced covers portions of five exons and
four introns. A single product that ranged in
size from 1082 to 1190 bp was amplified from all
14 species (IV). A hundred and twenty-one
sites out of 957 analysed were variable and of
these 42 were parsimony informative. The
variation in BpFULL1 sequences with 42
parsimony informative characters divided the
14 birch species into four groups with both ML
and MP methods and with both data sets (gaps
included/excluded). As with earlier sequence
regions, B. lenta and B. alleghaniensis formed
the first group, B. maximowicziana, B. ermanii,
B. humilis and B. fruticosa the second, and B.
schmidtii, B. resinifera, B. pendula, B.
pubescens and B. platyphylla the third one, but
now B. nana, B. populifolia and B. papyrifera
formed a separate, fourth group.
The sequenced ADH region is comprised of
portions of five exons and four introns. A single
product that ranged in size from 1061 to 1073
bp was amplified from 11 birch species (III).
Two fragments were amplified from B.
pubescens and B. papyrifera, one being ~1070
bp, the other about 1500 bp. From B. nana an
~1500 bp fragment was amplified. The length
variation between the short (~1070 bp) and
long (~1500 bp) ADH alleles was due to one
long indel in the intron 1 in the region used.
Out of the 1037 sites analysed 82 were
variable, and of these 44 were phylogenetically
informative. The ADH variation with 44
parsimony informative characters divided the
genus Betula into three main groups with both
MP and ML methods and with all data sets,
although the support for different groups
differed among methods. The first group in all
trees was formed by B. fruticosa, B. humilis,
B. ermanii, and the short allele of B. pubescens,
the second one by B. maximowicziana, B. lenta
and B. alleghaniensis, and the third group by
the remaining species. The only notable
difference between the MP tree (gaps included)
and ML tree, and also MP trees when gaps were
ignored or considered as a single character, was
the placement of the long allele of B.
papyrifera, which in the MP tree clustered with
the other long alleles (B. nana and B.
pubescens), and in the ML tree and MP trees
when gaps were ignored or considered as a
single character, with the short allele of B.
papyrifera.
The variation in matK sequences with only
five parsimony informative characters divided
the Betula species into two groups with both
methods: one including the American species
B. lenta, B. alleghaniensis and B. papyrifera
and the other containing the remaining species
(III). The percentages of parsimony
informative sites in all gene regions are
summarised in Table 6.
Under the assumption that all four nuclear
data sets (Regions I and II of BpMADS2,
BpFULL1 and ADH) share a common
evolutionary history, the sequence data sets
were combined to give a total of 206 parsimony
informative characters (IV). With ML and
neighbour-joining (NJ) methods this combined
data set divided the
birch species into four groups: B. lenta and B.
alleghaniensis formed the first group, B.
maximowicziana, B. ermanii, B. humilis and
B. fruticosa the second, B. schmidtii, B.
resinifera, B. pendula, B. pubescens, B.
populifolia and B. platyphylla the third and B.
nana and B. papyrifera the fourth group. With
the MP method the tree topology was very similar
to that obtained with the ML and NJ methods,
but now B. nana
and B. papyrifera clustered with the species in
group III (Fig. 7).

Table 6. Number of polymorphic sites within BpMADS2, BpFULL1, ADH and matK gene regions.

                                                BpMADS2    BpMADS2
                                                Region I   Region II   BpFULL1   ADH    matK
No. of analysed sites                           737        1443        957       1037   2431
No. of parsimony informative sites              15         91          42        82     5
Percentage of parsimony informative sites (%)   2.0        6.3         4.4       7.9    0.2

[Figure 7: tree diagram. Group I: B. alleghaniensis, B. lenta. Group II: B. maximowicziana, B. ermanii, B. fruticosa, B. humilis. Group III: B. papyrifera (short and long alleles), B. populifolia, B. platyphylla, B. pubescens (short and long alleles), B. nana, B. resinifera, B. schmidtii, B. pendula. Legend: subgenera of Betula sensu de Jong (1993): Chamaebetula, Neurobetula, Betula, Betulaster, Betulenta.]
Figure 7. The unrooted Maximum parsimony consensus tree based on combined sequence data set (ADH, BpMADS2
5’end, BpMADS2 3’end and BpFULL1). The bootstrap values are based on 1000 resamplings and the
bootstrap values ≥ 50% are indicated next to the relevant nodes.
5. DISCUSSION
5.1. BpMADS2 is the PI homologue of birch
We have been studying the genetic regulation
of flower development of birch (e.g., Elo et
al., 2001; Lemmetyinen et al., 2001;
Lemmetyinen et al., 2004). Our long-term aim
has been to find out the regulatory chains
leading from the determination of the
inflorescence meristem to the determination of
the identity of flower organs, and use this
information to develop a method to prevent
flower formation. For these purposes we have
isolated from silver birch (Betula pendula
Roth) several genes or cDNAs apparently
involved in the regulation of flower
development. One of the aims of my study was
to isolate putative birch B-function genes,
which are important for the development of
petals and stamens.
The sequence comparisons and phylogenetic
analysis show that BpMADS2 is a member of
the PI clade (I). At the early stages of flower
development PI is expressed in the second,
third and fourth whorls of developing flower
in Arabidopsis (Goto and Meyerowitz, 1994).
At the later stages PI is no longer detected in
whorl four. High expression in male
inflorescences having flowers with stamens and
tepals and low or absent expression in female
inflorescences having flowers consisting only
of carpels further support the notion that
BpMADS2 is a B function gene (I).
The members of the PI lineage can be
distinguished by two diagnostic sequences at
the K domain and at the C-terminal end of the
predicted protein (Kramer et al., 1998). The
PI-like genes typically code for the consensus
sequence MPFxFRVQPxQPNLQE at the
C-terminal end of the protein (the so-called PI motif).
The members of the PI lineage are highly
conserved also at the K domain. This region
displays a consensus sequence of KHExL.
Both of these motifs were also found in the
predicted BpMADS2 protein (Fig. 2), further
supporting the assumption that the BpMADS2 gene
is the PI homologue of birch (I).
In Arabidopsis the C-terminal motif of the
PI gene is essential for the formation and/or
maintenance of higher-order transcriptional
complexes (Egea-Cortines et al., 1999; Lamb
and Irish, 2003). It may also contain activation
domains, or may be subject to posttranslational
modifications that may influence
DNA binding specificity, subcellular
localisation or the ability to attract interacting
partners (Cho et al., 1999; Egea-Cortines et
al., 1999; Vandenbussche et al., 2003). The
importance of the PI motif for the function of
the BpMADS2 gene has not been studied yet.
However, the high degree of conservation of
the PI motif throughout the members of the PI
lineage (both monocots and dicots) suggests
that it has a critical function also in other plant
species, including birch.
5.1.1. Identification of the putative
BpMADS2 regulatory regions (unpublished
results)
In order to identify putative regulatory regions
and/or elements in the BpMADS2 promoter, a
9-kb fragment of genomic DNA upstream of
the BpMADS2 coding region was isolated and
sequenced (I, unpublished results). Because all
the major regulatory elements of the PI
promoter are known to lie within the 1.5-kb
region upstream of the transcription initiation
site (Chen et al., 2000; Honma and Goto,
2000), a shorter, 3-kb fragment of the
BpMADS2 promoter was chosen to be
examined in more detail (this region has also
been used for the BpMADS2::BARNASE
construct discussed later).
The promoter regions of BpMADS2 and PI
genes in general are very different from each
other and do not allow an easy recognition of
regulatory elements or important areas. The 1.5
kb promoter region of the PI gene contains all
the major regulatory elements for the spatial
and temporal expression of the gene and can
be split into two regions (Honma and Goto,
2000). The distal region (from site -1458 to
site –301) promotes the initial expression of
PI in response to induction signals, and the
proximal region (from site -300 to site +1)
promotes the late expression of PI maintained
by the AP3/PI auto-regulatory circuit. The
proximal region of the PI promoter does not
contain any CArG box-like sequences even
though it is sufficient for PI auto-regulatory
expression, indicating that the interaction
between the PI/AP3 complex and the PI
promoter is indirect (Chen et al., 2000). Unlike
the proximal region of the PI promoter, the 3-kb
region of the BpMADS2 promoter contained
one CArG box-like sequence from site –2204
to site –2195 (unpublished results). The
promoter of the other B function gene, AP3,
contains three functional CArG box sequences
which all mediate discrete regulatory effects,
and are necessary for the AP3 feedback control
(Tilly et al., 1998). Furthermore, PI PLE1 and
PLE2 elements showed significant similarity
to the BpMADS2 promoter, but the third
element, PLE3, did not. In PI, all three positive
regions are required for both stamen and petal
expression (Chen et al., 2000). Even if it is not
possible to draw far-reaching conclusions,
these observations indicate that the mechanism
of regulation of the B function genes might be
partly different in these two species.
One of the methods employed in our group
in preventing flower formation is tissue-specific
ablation using the BARNASE gene
(Lemmetyinen et al., 2004). Because
BpMADS2 was expressed in male and female
inflorescences but not in any other parts of the
plant (I), the promoter of BpMADS2 could be
a suitable candidate for the prevention of the
formation of stamens and carpels. So far, the
BpMADS2::BARNASE construct has been
tested in Arabidopsis, and the preliminary
results show that this construct has an effect
on flower formation in most transgenic lines
(M. Lännenpää and T. Sopanen, unpublished
results). Flower formation was totally
prevented and only inflorescence stems
developed in over half of the obtained lines.
In remaining lines, incomplete flowers, from
which petals, stamens and carpels were
missing, or were malformed, were developed.
These results indicate that the 3-kb region
upstream of the BpMADS2 ATG codon
connected to BARNASE is sufficient to prevent
flower formation in most of the transgenic
Arabidopsis lines, and, unexpectedly, the
influence of the BpMADS2::BARNASE
construct extended into the first whorl of
Arabidopsis flower preventing also the
formation of sepals. These results indicate that
the birch promoter, or the used birch promoter
region, might be lacking some of the important
elements needed to function properly in
Arabidopsis.
5.2. Nucleotide variation in silver birch
In this study nucleotide variation of silver birch
was studied in two naturally regenerated
populations located in eastern and northern
Finland. The studied populations were
composed of 65-70-year-old birches that are
apparently unaffected by any human selection.
Studies performed on the current genetic
structure of naturally regenerated populations
are likely to be very valuable in the future, since
they provide us baseline reference values of
diversity.
5.2.1. Level of nucleotide variation in three
nuclear loci of silver birch
In this thesis the nucleotide variation of silver
birch was studied in three nuclear genes,
BpMADS2, BpFULL1, and BpADH (I, II). The
observed results do not fully support the
predictions of high nucleotide polymorphism
in silver birch. The estimates of silent site
nucleotide diversities (πs) in BpFULL1 and
BpADH genes were very similar, and much
higher than the estimate of silent site nucleotide
diversity in BpMADS2 gene (0.0134, 0.0117
and 0.0043, respectively; Table 3). The mean
estimate of πs for the silver birch was 0.00818
(unpublished results), which is only slightly
higher than the mean level of silent site
nucleotide diversity for the highly selfing
Arabidopsis (0.007; Yoshida et al., 2003).
Furthermore, the overall level of nucleotide
diversity (πtotal) for silver birch was 0.00682.
This is much lower than the estimate of the
nucleotide diversity for the entire genome of
Arabidopsis (0.0106; Miyashita
et al., 1999). The estimates of nonsynonymous
nucleotide diversity (π a ), especially in
BpMADS2, BpFULL1, but also BpADH loci
(I, II) were also lower compared to the
estimates of Arabidopsis, especially for ChiA,
CAL, PI and AP3 loci (0.0037, 0.0054, 0.0030
and 0.0040, respectively; summarised in
Aguadé, 2001).
In Arabidopsis, genes like ChiA, CAL, AP3
and PI exhibit a significant excess of within
species replacement polymorphisms (Kawabe
et al., 1997; Purugganan and Suddith, 1998;
1999). This phenomenon has been explained
with recent rapid population expansion, as a
consequence of which Arabidopsis now exists
in small inbred subpopulations. Even though
silver birch has gone through rapid population
expansion to most of Europe after the last
glaciation (Huntley and Birks, 1983), silver
birch genes do not exhibit same kind of excess
of within species replacement polymorphisms
as Arabidopsis (I, II). This is most likely due
to the very efficient pollen flow of silver birch
as will be discussed later.
Demographic factors, such as a recent
population expansion (discussed later), affect
all genes and all regions of a gene equally. In
contrast, selection directly affects the genetic
diversity at linked sites. Selection is thus
expected to result in heterogeneous patterns
of genetic diversity among different genes and
across a given gene. In the nuclear genome of
silver birch, BpFULL1 and BpADH genes
showed more variation (II) than the BpMADS2
gene (I), and for this reason the presence of
selection was studied in the analysed gene
regions (II). However, because the HKA tests
were not significant, and because the detected
patterns of polymorphism and divergence were
concordant, the obtained results indicate that
the variation of mutation rates between the loci
could be a sufficient explanation for the
detected differences in the levels of nucleotide
variation. Furthermore, mutation rates have
been found to vary extensively both among
genes and among groups of plants for the same
gene (e.g., Bousquet et al., 1992; Laroche et
al., 1997; Wang et al., 1999).
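The comparison of polymorphism with divergence that underlies these selection tests can be illustrated with the McDonald-Kreitman counts given for BpADH in section 4.3.4 (two synonymous and six nonsynonymous polymorphisms within B. pendula; three synonymous and three nonsynonymous differences from B. ermanii). The sketch below assumes that the 2 x 2 table is evaluated with a two-sided Fisher's exact test; the text does not state which test statistic was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# BpADH counts from section 4.3.4:
# rows = polymorphic within B. pendula, differences from B. ermanii;
# columns = synonymous, nonsynonymous.
p_value = fisher_exact_two_sided(2, 6, 3, 3)
print(round(p_value, 2))  # 0.58
```

Under this assumption the sketch returns the P value of 0.58 reported for BpADH. With only 14 sites in the table, no departure from neutrality could reach significance unless the table were considerably more skewed, which illustrates the low power noted for these tests.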
In general, long-lived, outcrossing, wind-pollinated
forest tree species, such as silver
birch, have been found to harbour much more
genetic variation than annual, selfing plants
(summarised in Wang and Szmidt, 2001).
However, at nucleotide level many woody
plants have shown a slower substitution rate
when compared with herbaceous annual plants
(Bousquet et al., 1992; Savard et al., 1993;
Laroche et al., 1997; Andreasen and Baldwin,
2001). Scots pine, another long-lived
predominantly outcrossing forest tree has
shown low nucleotide diversity in the coding
region, both synonymous and nonsynonymous
sites (Dvornyk et al., 2002; García-Gil et al.,
2003). Likewise, P. taeda had low diversity in
a large set of genes (Brown et al., 2004). The
lower level of nucleotide diversity in silver
birch than expected is consistent with these
findings. Several hypotheses have been
proposed to explain this rate heterogeneity
among woody perennial and herbaceous annual
plant taxa, which seems to be related with life
history. In this case, generation time seems to be
the most likely factor because the woody
perennials (longer generation times) analysed
showed lower numbers of nucleotide
substitutions per site than herbaceous annual
taxa (shorter generation times; Gaut et al.,
1996). However, recent studies with closely
related species do not support this hypothesis,
rather just the opposite, indicating that life
history cannot explain evolutionary rate
variation in studied species (Whittle and
Johnston, 2003). The mechanisms for the lower
rate of evolution in perennials compared to
annuals are not yet well understood. For this
reason further investigation about the factors
that influence the mutational process (e.g., the
relative frequency of germ-line and somatic
mutations in gametes, metabolic rate of
pregametic cells, and environmental
conditions) is essential for a better
understanding of molecular evolutionary rate
variation in plants.
5.2.2. No genetic differentiation between the
two silver birch populations
Partitioning of the genetic variability within
forest tree species has revealed that in general
more than 90 % of the total genetic variation
resides within populations and less than 10 %
is due to differentiation among tree populations
(Hamrick et al., 1992). In this study, the genetic
differentiation between the two silver birch
populations overall was low, especially in
BpMADS2 and BpADH loci (I, II).
Furthermore, silver birch individuals separated
by thousands of kilometres (e.g., Italy and
Russia) did not show more variation than two
random individuals from populations
Punkaharju or Rovaniemi with different allele
haplogroups (Region I of BpMADS2,
unpublished results). Earlier studies based on
allozyme data have also shown that the genetic
differentiation among the northern silver birch
populations is low (Rusanen et al., 2003). In
forest tree species gene flow is mediated by
seed and pollen dispersal, and the gene flow
through pollen is very efficient especially in
wind-pollinated species, such as silver birch
(e.g., Hamrick and Nason, 2000). Gene flow
is a strong force, which slows down population
differentiation. Conversely, efficient gene flow
among populations can also break down the
existing differentiation, which might have been
formed during the isolation of the populations
(for example during the last glaciation). Silver
birch is distributed throughout the Northern
Hemisphere (Atkinson, 1992), so the
distribution area of the species is large.
Furthermore, due to very efficient pollen flow,
this distribution area forms in many places
continuous populations over Europe. This has
ensured that if there has been some
differentiation among silver birch populations
after the last glaciation, it has most probably
been broken down due to efficient gene flow.
The lack of genetic differentiation indicates
that the effective population size (Ne) of the
species is large. At neutral sites linkage
disequilibrium is governed by 1/(4Nec), where
c is the recombination rate between loci (Hill
and Robertson, 1968). Because close linkage
will restrict effective recombination, the larger
the product 4Nec, the less disequilibrium would be
expected between neutral loci. If c between
closely linked sites is small, Ne must be large
to account for the lack of linkage
disequilibrium. Consistent with the large
effective population size the decay of linkage
disequilibrium in studied genes was very rapid,
even between the closely linked polymorphic
sites (I, II).
5.2.3. Recombination is common in the silver
birch genome
Recombination is one of the key evolutionary
processes that shape the genetic structure of
populations (Posada and Crandall, 2001), and
it has also most likely played a role in
determining patterns of intraspecific variation
in silver birch. Recombination was detected
in all three nuclear loci and the rate of
recombination was very similar among the
genes (Table 4; I, II). Furthermore, due to the
high recombination rate in these loci, the decay
of linkage disequilibrium was very rapid. The
only exception to this was Region I of
BpMADS2, which showed no evidence of
recombination but instead significant linkage
disequilibrium and low variation (I). When the
same region was studied from individuals
collected from different parts of the distribution
area of silver birch recombination was,
however, detected (unpublished results). These
results indicate that recombination is a common
phenomenon in the silver birch nuclear genome.
5.2.4. Nuclear genes of silver birch show few
traces of postglacial expansion
The current pattern of genetic variation of a species
is influenced by both genetic factors and
historical events. Re-colonisation of Europe by
forest tree species after the last glaciation is
well documented in the pollen fossil records
(Huntley and Birks, 1983). Pollen data, as well
as macrofossils, indicate that birch was present
in central Europe during full glacial (Huntley
and Birks, 1983; Willis et al., 2000). Birch
populations were not limited only to a few
southern refugia, but were locally present in a
belt that ran eastwards into Russia, and also
on the Northern European plains. Furthermore,
birch pollen was widely present in parts of the
Central and Northern Europe during the
late-glacial period (Willis et al., 2000). When the
ice started to retreat silver birch, as a pioneer
species, occupied suitable habitats and quickly
spread northwards. Studies on chloroplast
genome have shown that Europe was
reoccupied by two main waves of
re-colonisation after the glaciation: one from east
and one from west (Palmé et al., 2003).
Following these two main waves of
re-colonisation, today’s silver birches in Europe
can be classified into two main chloroplast
haplogroups, of which one is dominant in the
north-west and the other in the south-east and
east. Although in general the silver birch
populations in northern Europe are dominated
by the north-west haplogroup, in Finland the
south-eastern/eastern haplogroup is the
dominant one representing about 70-90 % of
the sample (Palmé et al., 2003; I). Due to the
presence of Scandinavian mountain range the
spread northwards/north-east from south
through Norway and Sweden was probably
slowed down or entirely hindered, while the
spread of populations from east and south-east
to Finland was much more efficient.
A relatively short span of time (in
generations) has elapsed since re-colonisation
by silver birch took place in Finland 10 000
years ago (Huntley and Birks, 1983). This
recent history may have left durable prints in
the genetic structure in the silver birch
populations. However, over time the initial
genetic structure of populations established at
colonisation will break down due to
inter-population gene flow, and the rate at which this
breaking down will occur depends on the way
gene flow is mediated (Petit et al., 1993; Petit
et al., 1997). The gene flow of biparentally
inherited nuclear genes is mediated by both
seed and pollen, whereas only seed mediates
the gene flow of maternally inherited
chloroplast DNA. Since the gene flow through
pollen is very efficient in wind-pollinated
species, such as silver birch, this means that
the breaking down of the initial genetic
structure of populations occurs most rapidly
in biparentally inherited nuclear genes. On the
other hand, maternally inherited markers, such
as chloroplast DNA, retain the initial genetic
structure of populations much longer and
generally reveal much more genetic structure.
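The contrast between the two marker types can be made concrete with a standard island-model approximation. The sketch below is not taken from the thesis: it assumes Wright's infinite-island model at migration-drift equilibrium with random mating, and the effective population size and the seed and pollen migration rates (`n_e`, `m_seed`, `m_pollen`) are hypothetical values chosen only for illustration.

```python
def fst_nuclear(n_e: float, m_seed: float, m_pollen: float) -> float:
    """Equilibrium F_ST for a biparentally inherited nuclear locus.

    A migrating seed carries two nuclear genomes and a pollen grain one,
    so the effective migration rate is m_seed + m_pollen / 2.
    """
    return 1.0 / (1.0 + 4.0 * n_e * (m_seed + m_pollen / 2.0))


def fst_maternal(n_e: float, m_seed: float) -> float:
    """Equilibrium F_ST for a maternally inherited (chloroplast) locus.

    Only seeds move the genome, and the haploid, uniparentally
    transmitted marker gives 2 * n_e * m in place of 4 * n_e * m.
    """
    return 1.0 / (1.0 + 2.0 * n_e * m_seed)


# Hypothetical parameter values (assumptions, not estimates for birch).
n_e = 1000
m_seed = 0.001
m_pollen = 0.05

print("nuclear F_ST: ", round(fst_nuclear(n_e, m_seed, m_pollen), 4))
print("maternal F_ST:", round(fst_maternal(n_e, m_seed), 4))
```

Under these assumed values the maternally inherited marker keeps far more among-population structure than the nuclear marker, which is the qualitative pattern described above.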
Unlike chloroplast data (Palmé et al., 2003),
nuclear loci of silver birch do not show strong
patterns caused by the post-glacial expansion
of the species, but are close to demographic
equilibrium (I, II). Although all gene
genealogies displayed some allelic
dimorphism, and the dimorphism in some
parts of the genes was maximal, the Tajima’s D
tests were not very powerful because the
level of variation was low. The
percentage distribution of putative nuclear
allele haplogroups was very similar to the
chloroplast haplotype distribution,
indicating the same dual origin of the Finnish
birches as suggested by the chloroplast haplotypes. As
mentioned above, inter-population gene flow
of maternally inherited chloroplast DNA, which
occurs only via seeds, is likely to be substantially
lower than that of biparentally inherited nuclear
DNA, which is mediated via both seeds and pollen.
Furthermore, the chloroplast
DNA is non-recombining, whereas in the nuclear
genome of silver birch recombination seems
to be a common phenomenon (I, II). As a
consequence, the initial genetic structure
of silver birch populations has been retained
longer in the chloroplast genome, and this makes
chloroplast DNA much more suitable than
the nuclear genome for the study of historical
processes of silver birch.
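For readers unfamiliar with the Tajima's D test mentioned above, a minimal sketch of the statistic follows. It implements the standard formula of Tajima (1989) and is not the software used in the thesis; the function name and the example numbers (sample size 10, 16 segregating sites, 3.888889 mean pairwise differences) are illustrative assumptions, not data from the birch loci.

```python
import math


def tajimas_d(n: int, s: int, k: float) -> float:
    """Tajima's D for one locus.

    n: number of sampled sequences
    s: number of segregating sites
    k: mean number of pairwise differences (per locus, not per site)
    """
    if n < 3 or s < 1:
        raise ValueError("need at least 3 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by the
    # estimated standard deviation of that difference.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Illustrative values only (not from the thesis).
print(round(tajimas_d(10, 16, 3.888889), 3))
```

Because both estimators in the numerator depend on the number of segregating sites, a locus with very few such sites can only produce a narrow range of D values, which is one way to see why the test has little power when variation is low.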
5.3. Phylogeny of the genus Betula
The birches are a taxonomically difficult group,
not only because of their high vegetative
variability and frequent hybridisation, but also
partly because of confusion in the
binomial nomenclature. Many birch species
have at least two different commonly used
Latin names (e.g., B. pendula/B. verrucosa/B.
alba, B. pubescens/B. alba), and these names
are used in parallel with each other.
Furthermore, ever since Regel (1865), the
subsections or subgenera of the genus Betula
have been revised by a number of authors (see
Furlow, 1990). A phylogenetic classification
of the genus Betula has been suggested by de
Jong (1993), who divided the genus into five
subgenera, namely Betulenta, Betulaster,
Neurobetula, Chamaebetula, and Betula (Fig.
1).
5.3.1. Comparison of molecular phylogenies
In general, the results obtained from the ADH
(III), BpMADS2, and BpFULL1 genes and
from the combined data set (IV) fit rather well
with the infrageneric classifications proposed
for birches (e.g., Regel, 1865; Winkler, 1904;
de Jong, 1993; Table 1), except for B. schmidtii
and B. ermanii. In all phylogenetic trees B.
schmidtii (subgenus Neurobetula) grouped
with the species in the subgenus Betula
(including B. pendula and all other white
birches), and B. ermanii (subgenus
Neurobetula) grouped with the species in
subgenus Chamaebetula (including B. humilis
and B. fruticosa) (III, IV). The phylogenetic
trees were mainly congruent with each other,
but differed somewhat in their resolution, and
in the bootstrap supports for the major clades.
Furthermore, it should be noted that due to the
limited number of species included in this
study, conclusions that rely on the order of
branching of the major gene clades must be
made with caution.
The nuclear data obtained from the species
compared in this study suggest that the diploid
B. pendula, B. resinifera, B. platyphylla, and
B. schmidtii form a continuum of closely
related taxa (Fig. 7; III, IV). The diploid B.
populifolia and polyploid B. pubescens and B.
papyrifera are clearly related to the former
group, but possibly due to hybridisation and/
or introgression the placement of these species
in a phylogeny differs depending on the gene
region used. Betula pendula, B. resinifera, B.
platyphylla, B. populifolia, B. pubescens and
B. papyrifera all belong to the subgenus Betula,
and they form a rather homogeneous group of
pioneer species, with a characteristic white
bark, pendulous catkins, male flowers with two
or three stamens, and leaves with a small
number of veins (Table 1; de Jong, 1993).
Species in the subgenus Betula are considered
to hybridise more or less freely, and this leads
to introgression that complicates the
classification of the species (Johnsson, 1945;
Dugle, 1966; Furlow, 1990; de Jong, 1993).
Particular confusion has centred around the
European (B. pendula, B. pubescens) and the
North American representatives (B. resinifera,
B. populifolia, and B. papyrifera) of the
subgenus (summarised in Furlow, 1990). Many
researchers have recognised the American
species as geographic races of B. pubescens
or hybrids, while others consider them as
separate species. However, most modern
authors have maintained the American birches
as separate species (including de Jong, 1993).
Also in this study the placement of species such
as B. populifolia, B. papyrifera, and B.
pubescens varied depending on the gene region
used and the most likely explanation for this
seems to be hybridisation between the species
(III, IV).
Species in the subgenus Betula and subgenus
Chamaebetula are seen as derived from the
same or different ancestors related to the
subgenus Neurobetula (de Jong, 1993). This
may partly explain why B. schmidtii and B.
ermanii group at the nuclear level with
species in subgenus Betula and Chamaebetula,
respectively.
The tetraploid B. ermanii is morphologically
extremely variable, resembling species in the
subgenus Betulenta in having male flowers
with 3-4 stamens, but the fruiting catkins are
not always sessile and upright, and the bark
resembles that of white birches (subgenus
Betula) in being grayish white and lacking
methyl salicylate (Ashburner, 1980). At the
nucleotide level, B. ermanii was grouped as a
sister taxon of B. fruticosa and B. humilis
(subgenus Chamaebetula), indicating a close
relationship between these two subgenera (III,
IV). On the other hand, the grouping of B.
ermanii might partly be artificial. After the
grouping of B. schmidtii within the subgenus
Betula, B. ermanii was left over as the only
representative of the subgenus Neurobetula,
and grouped with the next most similar species.
If the analysis had included more
species from the subgenus Neurobetula, this
grouping could have been different, indicating that
further studies within the heterogeneous
subgenus Neurobetula will be needed to
establish a reliable phylogeny for these birch
species.
The chloroplast gene matK is one of the most
widely used sequences for phylogenetic studies
in plants (e.g., Wang et al., 1999; Wang et al.,
2000; Cheng et al., 2000; Stanford et al., 2000;
Soltis et al., 2001; Fukuda et al., 2001). In this
thesis variation of matK among the studied
birch species was limited (Table 6, III). The
phylogenetic tree of matK with only five
parsimony informative characters divided the
14 Betula species into two well-supported
groups: one including the three American
species B. lenta, B. alleghaniensis, and B.
papyrifera, and the other containing the
remaining species. DNA sequences of higher
plants evolve at different rates, depending on
whether they are located in the nuclear,
chloroplast, or mitochondrial genome.
Comparisons of chloroplast and nuclear DNA
sequences have shown that chloroplast
DNA evolves at only half the rate of plant
nuclear DNA (Wolfe et al., 1987; Wang et al.,
2000). Furthermore, the chloroplast genome
is haploid (Radetzky, 1990; Rajora and Dancik,
1992; Dumolin et al., 1995) and is therefore
expected to have a smaller effective population
size (Ne) than diploid nuclear genes. In
monoecious species, such as Betula, Ne for
chloroplast genes is expected to be half of that
for nuclear genes and the level of genetic
variation is therefore expected to be smaller.
These facts could, at least partly, explain the
low level of variation in the matK region compared
to the analysed nuclear genes. Further, earlier
studies with chloroplast genes indicate that the
chloroplast genome evolves slowly in
Betulaceae in general (Bousquet et al., 1992;
Kato et al., 1998; Palmé and Vendramin, 2002).
5.3.2. Phylogenetic relationships of Betula
schmidtii
Betula schmidtii is considered to be a rather
peculiar species among birches because of the
blackness of its bark, its hard, heavy wood and
slow growth (Ashburner, 1980). The female
inflorescences of B. schmidtii are erect and
elongated, and the fruits wingless, with only
narrow margins (Nakai, 1915). Regel (1865)
placed B. schmidtii into series Costatae, along
with B. ermanii. De Jong (1993) divided the
section Costatae into the subgenera Betulenta
and Neurobetula, and placed B. schmidtii into
subgenus Neurobetula. However, because of
the morphological differences compared to
other birch species discussed above, B.
schmidtii has sometimes even been placed in
its own subgenus, Asperae (Nakai, 1915).
The four studied nuclear gene sequences of
B. schmidtii closely resembled those of B.
pendula and other white birches (subgenus
Betula). Similar results have been
obtained from flavonoid profiles of B. schmidtii
(Keinänen et al., 1999b). The meiosis of
diploid B. schmidtii is very abnormal, and
suggests a hybrid origin for the species
(Woodworth, 1929). Similarities in flavonoid
composition and nucleotide sequences suggest
that one of the parental species of B. schmidtii
belonged to the ancestors of the subgenus
Betula. However, the non-flavonoid phenolic
compounds characteristic of white
birches were not detected in B. schmidtii, and
of the two main non-flavonoid compounds
present in B. schmidtii, only one was detected
in another species, B. ermanii (Keinänen et al.,
1999b). This, along with the phenotypic
differences between B. schmidtii and white
birches, supports the idea of the hybrid origin
of the species.
Living species of Betula are all n = 14 or
higher, but there is no unanimity on the base
number of the genus (see Furlow, 1990). The
base chromosome number 14 is commonly
accepted, but Brown and Al-Dawoody (1979)
found that meiotic behaviour in hybrid birches
(2n = 42) suggests that these trees are actually
hexaploids, not triploids, which leads to a base
chromosome number of 7. Furthermore, in
meiosis the chromosomes in the 2n = 28 and
2n = 56 plants tend to lie in groups of seven,
and for this reason the original basic
chromosome number of birches is thought to
be seven rather than fourteen (Eriksson and
Jonsson, 1986). The small number of
quadrivalents (multivalents formed from four
chromosomes during meiosis) has also been
thought to support the base number of seven.
Furthermore, recent studies with molecular
markers offer evidence for a base
number of 7 in Betula (Williams and Arnold,
2001). As mentioned earlier, the diploid B.
schmidtii shows features that are characteristic
of plants of hybrid origin (Woodworth,
1929). If the base chromosome number of the
genus Betula is 7, then B. schmidtii would
simply be an allotetraploid species (and
all other 2n = 28 species would also be tetraploid).
However, if the base chromosome number is 14, as
commonly accepted, this would mean that B.
schmidtii is a homoploid diploid. The origin
of new homoploid species via hybridisation is
theoretically difficult because it requires the
development of reproductive isolation in
sympatry (Rieseberg, 1997). However, it is not
impossible, as documented examples of
homoploid diploid and allotetraploid hybrid
species in nature show (Rieseberg, 1997;
Ferguson and Sang, 2001).
5.3.3. Origin of the two alleles of ADH gene
The ADH gene differed from the
other genes studied in that two fragments,
instead of one, were amplified from some birch
species (II, III). The birch
ADH alleles were classified into two classes according to the
presence or absence of one long indel spanning
positions 66 to 524 bp. Two versions of the ADH
gene were amplified from one diploid species,
B. pendula (II), and two polyploid species, B.
pubescens and B. papyrifera (III). From the
diploid B. nana only the long allele was
amplified. The occurrence of two ADH alleles
in four different birch species can be explained
by recent or ancient hybridisation and/or
introgression. Hybridisation is a common
phenomenon among birches (Johnsson, 1945;
Dugle, 1966; Furlow, 1990) and it is therefore
expected that introgression may have played
an important role in the evolution of this genus
(Furlow, 1990; Atkinson, 1992). Hybridisation
is common also between species in subgenus
Betula, including B. pendula, B. pubescens,
and B. papyrifera (Johnsson, 1945; Thórsson
et al., 2001; Palmé et al., 2004). However,
hybridisation studies with B. pendula and B.
pubescens have shown that the cross B.
pubescens x B. pendula and reciprocal gives
only a few progeny that are extremely sterile,
indicating that hybridisation between these two
species is not common (Johnsson, 1945).
Furthermore, indirect evidence on the
origin of the 42-chromosome trees, which have
been argued to be hybrids between B. pendula
and B. pubescens, supports the hypothesis that
these trees are actually aneuploid B. pubescens,
not hybrids (Brown and Al-Dawoody, 1979).
Also, the distribution area of B. papyrifera is
limited to North America, while B. pendula
and B. pubescens are found in Europe and Asia.
Due to these geographical limits hybridisation
between these three species would be very
difficult. However, hybridisation might be the
most likely explanation for the occurrence of
long ADH alleles in B. pubescens and B. nana
(Anamthawat-Jónsson and Tomasson, 1990;
Jonsell, 2000; Thórsson et al., 2001).
Species in subgenus Betula are relatively
young and probably still evolving (Jäger, 1980;
de Jong, 1993). It is possible that hybridisation
has occurred earlier in the evolution of
subgenus Betula and that the two different
ADH alleles found in B. pendula, B.
pubescens and B. papyrifera are relics of
these events. Palmé et al. (2004) considered
the possibility that two of the chloroplast
haplotypes shared between B. pendula, B.
pubescens and B. nana could be ancient and
most likely were present in the common
ancestor of these three species. However, their
final conclusion was that the haplotype sharing
among these three species is most likely caused
by hybridisation and subsequent cytoplasmic
introgression. This conclusion was justified by
the fact that geography was more important
than species identity in influencing the
haplotype composition of a population. Several
hypotheses have been suggested about the
origin of the tetraploid B. pubescens, generally
including B. pendula (see Howland et al.,
1995). Based on one hypothesis, B. pubescens
is an ancient allotetraploid, with B. pendula as
one of the ancestral parents, while the other
parental species might be B. humilis. An
alternative hypothesis suggests that B.
pubescens is an autotetraploid of B. pendula.
Recent studies have demonstrated that most
polyploid species examined have formed
recently from different populations of their
progenitors (multiple origins; summarised in
Soltis and Soltis, 1999). When genetically
different diploids have been involved in this
polyploidisation, the result can be a series of
genetically distinct polyploid populations.
Combined with the fact that the chloroplast
genome evolves slowly in Betulaceae
(Bousquet et al., 1992; Kato et al., 1998; Palmé
and Vendramin, 2002) and in the genus Betula
(matK gene (III)), these findings could explain
both shared haplotypes and the influence of
geography in the haplotype composition of
populations.
It is also possible that the two ADH alleles
are much older forms and typical of all or most
of the birch species, but that, due to the PCR primers
and the stringency of the amplification conditions
used, we were unable to isolate the longer allele
from the other birch species studied. Stebbins
(1971) has estimated that approximately one
third of the angiosperm plants possess more
than two complete genomes (i.e. multiplied sets
of the diploid chromosome number of the
genus), and it is probable that the present
basic chromosome number of the genus Betula is also of
ancient polyploid origin.
5.3.4. Reconciling gene trees with a species
tree
When studying molecular phylogenies it is
important to keep in mind that a phylogenetic
tree (gene tree) constructed from DNA
sequences does not necessarily mirror the
actual species tree (the evolutionary pathway
of the species; Pamilo and Nei, 1988).
Processes such as hybridisation and
introgression, recombination, lineage sorting,
and gene duplication can frequently cause
incongruences among different gene trees and
the actual species tree. In fact, the gene tree
can be quite different from the species tree,
especially when the time of divergence between
different species is short. Furthermore, when
the studied species are relatively closely
related, such as species in subgenus Betula, the
number of nucleotides required for obtaining
the correct species tree with a probability of
95 % is considerable (Pamilo and Nei, 1988).
However, if several independent data sets result
in similar trees this will give us confidence that
the obtained gene trees truly reflect the same
evolutionary history as the species tree.
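The effect of a short divergence time can be illustrated with the classical three-species result discussed by Pamilo and Nei (1988): with one allele sampled per species, the gene tree has the species-tree topology with probability 1 - (2/3)exp(-T/2N), where T is the number of generations between the two speciation events and N is the effective size of the ancestral population during that interval. The sketch below simply evaluates this expression; it assumes neutrality and a constant diploid effective size, and the function name and parameter values are hypothetical.

```python
import math


def prob_gene_tree_matches_species_tree(t_generations: float, n_e: float) -> float:
    """Probability that a three-taxon gene tree matches the species tree.

    t_generations: generations between the two successive speciation events
    n_e: effective size of the diploid ancestral population in that interval
    """
    if t_generations < 0 or n_e <= 0:
        raise ValueError("t_generations must be >= 0 and n_e must be > 0")
    return 1.0 - (2.0 / 3.0) * math.exp(-t_generations / (2.0 * n_e))


# Hypothetical internode lengths, expressed in units of 2N generations.
n_e = 10_000
for scaled_time in (0.0, 0.5, 1.0, 3.0):
    p = prob_gene_tree_matches_species_tree(scaled_time * 2 * n_e, n_e)
    print(f"T/2N = {scaled_time:>3}: P(congruent) = {p:.3f}")
```

When the interval between speciation events is short relative to 2N generations, the probability approaches 1/3, i.e. a single gene tree is hardly better than a random guess, which is why agreement among several independent loci is informative.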
In a genus that is known for its high levels
of hybridisation and introgression, such as
Betula (Johnsson, 1945; Dugle, 1966; Furlow,
1990), transfer of genes across the species
boundaries is undoubtedly most extensive.
When an introgressed allele is sequenced
instead of one of the “original” alleles this will
affect the structure of the gene tree and
normally it will not mirror the majority of the
other genes in the species (Wendell and Doyle,
1998). If introgression is widespread,
bifurcating trees may simply no longer reflect
the evolutionary process, and this could
potentially be the case for several Betula
species. In species such as B. nana, which is
known to hybridise extensively, the diverse
grouping in different gene trees (III, IV) could
be explained by hybridisation and/or
introgression (in this case hybridisation
and introgression between B. nana and B.
pubescens; Ashburner, 1980; Anamthawat-Jónsson
and Tomasson, 1990; Thórsson et al.,
2001; Palmé et al., 2004). Furthermore,
phylogenetic trees of the studied nuclear and
chloroplast genes did not give fully congruent
results (III, IV). Incongruences between
nuclear and cytoplasmic markers have often
been reported (e.g., Soltis et al., 1996; Erdogan
and Mehlenbacher, 2000; Semerikov et al.,
2003), and this incongruence can be due to
many factors, such as those mentioned above. In the
present case cytoplasmic introgression seems
to be the most likely explanation for the
differences between phylogenetic trees of
BpMADS2, BpFULL1, ADH and matK genes
(III, IV). Cytoplasmic introgression has proven
to be common among other birch species
(Anamthawat-Jónsson and Tomasson, 1990;
Thórsson et al., 2001; Palmé et al., 2004). The
three species in group B in the phylogenetic
tree of matK are all American species with
overlapping geographical distributions, so
there has been plenty of opportunity for
hybridisation (III). Furthermore, both
morphological (Furlow, 1990; de Jong, 1993)
and nuclear data (III, IV) indicate that B.
papyrifera clearly belongs to the white birches
(subgenus Betula).
Recombination is a key evolutionary process
that shapes the genetic structure of populations
and architecture of genomes (Posada and
Crandall, 2001). Recombination also violates
the main assumption of most phylogenetic
methods, the idea of only one phylogenetic tree
underlying the evolution of the sequences
under study, by generating “mosaic genes”
where different regions have different
phylogenetic histories. Recombination also seems
to be a common phenomenon in the silver
birch genome (I, II), indicating that it cannot
be ignored when evaluating phylogenetic
relationships of the genus Betula based on gene
trees. However, the power of statistical
methods to detect recombination varies greatly
and most methods have trouble detecting
recombination when it is rare, especially when sequence
divergence is low (Posada and Crandall, 2001;
Wiuf et al., 2001). For most methods, a
minimum sequence divergence of 5% seems
necessary to attain substantial power to detect
recombination, but this limit was achieved only
in two gene regions used in this study (Table
6).
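The 5% threshold refers to pairwise sequence divergence, which can be checked directly from an alignment. The following sketch computes the uncorrected proportion of differing sites (p-distance) and the largest pairwise value in an alignment. It is an illustration only: the toy sequences and taxon labels are hypothetical, sites with gaps or ambiguous bases are skipped (an assumption, not necessarily the treatment used in the thesis), and no substitution-model correction is applied.

```python
from itertools import combinations

SKIPPED = {"-", "?", "N"}  # gap and ambiguity symbols excluded from comparison


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in SKIPPED or b in SKIPPED:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def max_divergence(alignment: dict) -> float:
    """Largest pairwise p-distance among all sequences in the alignment."""
    return max(p_distance(a, b) for a, b in combinations(alignment.values(), 2))


# Hypothetical toy alignment (not birch data).
toy_alignment = {
    "taxon_a": "ACGTACGTACGTACGTACGT",
    "taxon_b": "ACGTACGTACGAACGTACGT",
    "taxon_c": "ACGT--GTACGAACGTACGA",
}
print(max_divergence(toy_alignment) >= 0.05)
```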
Other processes that complicate
relationships between gene trees and species
trees are lineage sorting (the extinction of
ancestral gene polymorphisms through
stochastic processes) and gene duplications
(Wendell and Doyle, 1998). However, lineage
sorting is likely to be a problem only if the
time that the alleles need to
coalesce within a lineage is longer than the interval between
successive speciation events. Gene
duplications may result in a species containing
a number of distinct but related sequences.
Duplications are common evolutionary events
in which a gene is copied to one or more
additional locations in the genome, after which the
copies evolve independently of each other.
When studying duplicated genes there is a
danger of inadvertently including paralogous
genes, resulting in a gene tree that reflects the
duplication of the gene rather than a possible
species tree. Two versions of the ADH gene
were isolated from four birch species, but the
fact that the coding regions of the short and
long alleles of B. papyrifera were identical (III)
and the coding regions of the short and long
alleles of B. pendula were almost identical (II)
strongly suggests that these two sequences
represent two different alleles of the same ADH
gene, not two different ADH genes.
Finally, different phylogenetic methods treat
gaps (indels and microsatellite length variation)
differently: while the maximum likelihood method
ignores gaps entirely, maximum parsimony
considers each base pair within a gap as a
character. These differences between the two
methods clearly affect the clustering of certain
birch species in phylogenies, as discussed in
papers III and IV.
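How strongly the treatment of gaps can change the apparent distance between two sequences is easy to show with a toy comparison. The sketch below is not an implementation of maximum likelihood or maximum parsimony; it only counts differing alignment columns between two hypothetical sequences, once with gapped columns skipped and once with each gap position counted as a character state of its own.

```python
GAP = "-"


def count_differences(seq_a: str, seq_b: str, gaps_as_state: bool) -> int:
    """Count differing alignment columns between two aligned sequences.

    gaps_as_state=False: columns containing a gap are skipped entirely.
    gaps_as_state=True:  every gap position counts as its own character.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if not gaps_as_state and (a == GAP or b == GAP):
            continue
        if a != b:
            differences += 1
    return differences


# Hypothetical sequences differing by a 4 bp indel and one substitution.
seq_x = "ACGT----ACGT"
seq_y = "ACGTAAAAACGA"

print(count_differences(seq_x, seq_y, gaps_as_state=False))
print(count_differences(seq_x, seq_y, gaps_as_state=True))
```

In this toy case a single indel event contributes four of the five differences when each gapped base pair is counted, so taxa that share or lack a long indel can be pulled together or apart depending on the method.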
As many other studies based on morphology,
biochemical characters and chromosome
numbers have shown (Regel, 1865; Winkler,
1904; Nakai, 1915; Komarov, 1936;
Pawlowska, 1983; de Jong, 1993; Keinänen et
al., 1999b), the relationships among the Betula
species are complex. This study has been a first
step towards understanding the relationships
among different Betula species using nuclear
genes. As the results of this study have shown,
many processes, such as those mentioned above, can
cause incongruences among different gene
trees and the actual species tree. However, the study
also revealed that molecular data can be
a powerful tool in reconstructing evolutionary
histories between morphologically
differentiated species, such as B. schmidtii and
white birches (III, IV). To further understand
the relationships among the species of this
complex genus, a larger number of unlinked
genes and more birch species have to be
studied.
ACKNOWLEDGEMENTS
First of all, I would like to thank all the people who have contributed to this thesis. I am very
grateful to my supervisors Professor Tuomas Sopanen, Professor Outi Savolainen and Dr. Markku
Keinänen for their advice and support throughout the work on this thesis.
I am especially grateful to Riitta Pietarinen for her help with the laboratory work. Without
your assiduous work in the lab this thesis would not have been possible. Many thanks go to
my colleagues and friends at the Department of Biology. Especially I would like to thank the
members of our birch group, Kaija Keinonen, Ilkka Porali, Juha Lemmetyinen, Mika Lännenpää
and Luis Orlando Morales for their friendship, support, amusing discussions and valuable advice.
It has been a pleasure to work with you all these years!
I would also like to thank Docent Matti Rousi and Dr. Risto Jalkanen from the Finnish Forest
Research Institute, Punkaharju and Rovaniemi Research Stations, and Maisa Viljanen at Joensuu
Botanical Garden for sending me the population and birch species samples, and Professor Jaakko
Kangasjärvi for providing the genomic library of silver birch. I thank Hanni Sikanen for the help
with fieldwork, Minna Korhonen for the cDNA clone of the BpADH gene, and Anna Palmé for
three birch samples. I also thank Anna Palmé and Martin Lascoux for the co-operation on paper
III.
This study was carried out at the Department of Biology, University of Joensuu. I thank Professor
emeritus Heikki Hyvärinen, Professor Jussi Kukkonen, Dr. Markku Kirsi and Dr. Pertti Huttunen,
the heads of the Department of Biology, for providing excellent facilities. The study was funded
by the TEKES (as a part of Finnish Biodiversity Programme, FIBRE), the Graduate School of
Forest Sciences (former Graduate School of Biology and Biotechnology of Forest Trees), the
Department of Biology, University of Joensuu, and the Faculty of Science, University of Joensuu.
Last but not least, my special thanks go to my family, Sanni and Vesa, for your love and patience
during these years. Without you and your support this project would not have been possible!
The learned are proud
because they know so much;
the wise are humble
because they know so little.
William Cowper
REFERENCES
Abbot RJ, Gomez MF (1989) Population
genetic structure and outcrossing rate of
Arabidopsis thaliana (L.) Heynh. Heredity
62:411-418.
Aguadé M (2001) Nucleotide sequence
variation at two genes of the
phenylpropanoid pathway, the FAH1 and
F3H genes, in Arabidopsis thaliana.
Molecular Biology and Evolution 18:1-9.
Alam MT, Grant WF (1972) Interspecific
hybridization in birch (Betula). Le
Naturaliste Canadien 99:33-40.
Anamthawat-Jónsson K, Tomasson T (1990)
Cytogenetics of hybrid introgression in
Icelandic birch. Hereditas 112:65-70.
Andreasen K, Baldwin BG (2001) Unequal
evolutionary rates between annual and
perennial lineages of checker mallows
(Sidalcea, Malvaceae): evidence from
18S-26S rDNA internal and external transcribed
spacers. Molecular Biology and Evolution
18:936-944.
Angenent GC, Busscher M, Franken J, Mol
JNM, van Tunen AJ (1992) Differential
expression of two MADS box genes in
wild-type and mutant Petunia flowers. The Plant
Cell 4:983-993.
Anonymous (1999) Finnish statistical
yearbook of forestry. Finnish Forest
Research Institute.
Atkinson MD (1992) Betula pendula Roth (B.
verrucosa Ehrh.) and B. pubescens Ehrh.
Journal of Ecology 80:837-870.
Bergelson J, Stahl E, Dudek S, Kreitman M
(1998) Genetic variation within and among
populations of Arabidopsis thaliana.
Genetics 148:1311-1323.
Bousquet J, Strauss SH, Doerksen AH, Price
RA (1992) Extensive variation in
evolutionary rate of rbcL gene sequences
among seed plants. Proceedings of the
National Academy of Sciences 89:7844-7848.
Brown GR, Gill GP, Kuntz RJ, Langley CH,
Neale DB (2004) Nucleotide diversity and
linkage disequilibrium in loblolly pine.
Proceedings of the National Academy of
Sciences 101:15255-15260.
Brown IR, Al-Dawoody D (1979)
Observations on meiosis in three cytotypes
of Betula alba L. New Phytologist 83:801-811.
Chang C, Meyerowitz EM (1986) Molecular
cloning and DNA sequence of the
Arabidopsis thaliana alcohol dehydrogenase
gene. Proceedings of the National Academy
of Sciences 83:1408-1412.
Charlesworth B, Morgan MT, Charlesworth D
(1993) The effect of deleterious mutations
on neutral molecular variation. Genetics
134:1289-1303.
Charlesworth D, Liu FL, Zhang L (1998) The
evolution of the alcohol dehydrogenase gene
family by loss of introns in plants of the
genus Leavenworthia (Brassicaceae).
Molecular Biology and Evolution 15:552-559.
Chen X, Riechmann JL, Jia D, Meyerowitz E
(2000) Minimal regions in the Arabidopsis
PISTILLATA promoter responsive to the
APETALA3/PISTILLATA feedback control
do not contain a CArG box. Sexual Plant
Reproduction 13:85-94.
Cheng Y, Nicolson RG, Tripp K, Chaw SM
(2000) Phylogeny of Taxaceae and
Cephalotaxaceae genera inferred from
chloroplast matK gene and nuclear rDNA
ITS region. Molecular Phylogenetics and
Evolution 14:353-365.
Cho S, Jang S, Chae S, Chung KM, Moon YH,
An G, Jang SK (1999) Analysis of the
C-terminal region of Arabidopsis thaliana
APETALA1 as a transcription activation
domain. Plant Molecular Biology 40:419-429.
Chung Y-Y, Kim S-R, Kang H-G, Noh Y-S,
Park MC, An G (1995) Characterization of
two rice MADS box genes homologous to
GLOBOSA. Plant Science 109:45-56.
Coen E, Meyerowitz EM (1991) The war of
the whorls: genetic interactions controlling
flower development. Nature 350:31-37.
de Jong PC (1993) An introduction to Betula:
its morphology, evolution, classification and
distribution, with a survey of recent work.
International Dendrology Society, Great
Britain.
Dolferus R, Jacobs M, Peacock WJ, Dennis
ES (1994) Differential interactions of
promoter elements in stress response of the
Arabidopsis Adh gene. Plant Physiology
105:1075-1087.
Dugle JR (1966) A taxonomic study of western
Canadian species in the genus Betula.
Canadian Journal of Botany 44:929-1007.
Dumolin S, Demesure B, Petit RJ (1995)
Inheritance of chloroplast and mitochondrial
genomes in pedunculate oak investigated
with an efficient PCR method. Theoretical
and Applied Genetics 91:1253-1256.
Dvornyk V, Sirviö A, Mikkonen M, Savolainen
O (2002) Low nucleotide diversity at the
Pal1 locus in the widely distributed Pinus
sylvestris. Molecular Biology and Evolution
19:179-188.
Egea-Cortines M, Saedler H, Sommer H
(1999) Ternary complex formation between
the MADS-box proteins SQUAMOSA,
DEFICIENS and GLOBOSA is involved in
the control of floral architecture in
Antirrhinum majus. The EMBO Journal
18:5370-5379.
Elo A, Lemmetyinen J, Turunen M-L, Tikka
L, Sopanen T (2001) Three MADS-box
genes similar to APETALA1 and
FRUITFULL from silver birch (Betula
pendula). Physiologia Plantarum 112:95-103.
Erdogan V, Mehlenbacher SA (2000)
Phylogenetic relationships of Corylus
species (Betulaceae) based on nuclear
ribosomal DNA ITS region and chloroplast
matK gene sequences. Systematic Botany
25:727-737.
Eriksson G, Jonsson A (1986) A review of the
genetics of Betula. The Scandinavian Journal
of Forest Research 1:421-434.
Felsenstein J (1993) PHYLIP (Phylogeny
Inference Package) version 3.5. Computer
program and documentation distributed by
the author. Website:
http://evolution.genetics.washington.edu/phylip/software.pars.html#PHYLIP.
Ferguson D, Sang T (2001) Speciation through
homoploid hybridization between
allotetraploids in peonies (Paeonia).
Proceedings of the National Academy of
Sciences 98:3915-3919.
Ferrándiz C, Gu Q, Martienssen R, Yanofsky
MF (2000) Redundant regulation of
meristem identity and plant architecture by
FRUITFULL, APETALA1 and
CAULIFLOWER. Development 127:725-734.
Fitch WM (1971) Toward defining the course
of evolution: minimum change for a
specified tree topology. Systematic Zoology
20:406-416.
Freeling M, Bennett DC (1985) Maize Adh1.
Annual Review of Genetics 19:297-323.
Fukuda T, Yokoyama J, Ohashi H (2001)
Phylogeny and biogeography of the genus
Lycium (Solanaceae): inferences from
chloroplast DNA sequences. Molecular
Phylogenetics and Evolution 19:246-258.
Furlow J (1990) The genera of Betulaceae in
the southeastern United States. Journal of the
Arnold Arboretum 71:1-67.
García-Gil MR, Mikkonen M, Savolainen O
(2003) Nucleotide diversity at the two
phytochrome loci along a latitudinal cline in
Pinus sylvestris. Molecular Ecology
12:1195-1206.
Gaut BS, Clegg MT (1991) Molecular
evolution of alcohol dehydrogenase 1 in
members of the grass family. Proceedings
of the National Academy of Sciences
88:2060-2064.
Gaut BS, Clegg MT (1993a) Molecular
evolution of the Adh1 locus in the genus Zea.
Proceedings of the National Academy of
Sciences 90:5095-5099.
Gaut BS, Clegg MT (1993b) Nucleotide
polymorphism in the Adh1 locus of pearl
millet (Pennisetum glaucum) (Poaceae).
Genetics 135:1091-1097.
Gaut BS, Morton BR, McCaig BC, Clegg MT
(1996) Substitution rate comparisons
between grasses and palms: synonymous rate
differences at the nuclear gene Adh parallel
rate differences at the plastid gene rbcL.
Proceedings of the National Academy of
Sciences 93:10274-10279.
Gaut BS, Peek AS, Morton BR, Clegg MT
(1999) Patterns of genetic diversification
within the Adh gene family in the grasses
(Poaceae). Molecular Biology and Evolution
16:1086-1097.
Goto K, Meyerowitz EM (1994) Function and
regulation of the Arabidopsis floral homeotic
gene PISTILLATA. Genes & Development
8:1548-1560.
Gu Q, Ferrándiz C, Yanofsky MF, Martienssen
R (1998) The FRUITFULL MADS-box gene
mediates cell differentiation during
Arabidopsis fruit development.
Development 125:1509-1517.
Hagenblad J, Nordborg M (2002) Sequence
variation and haplotype structure
surrounding the flowering time locus FRI in
Arabidopsis thaliana. Genetics 161:289-298.
Hall TA (1999) BioEdit: a user-friendly
biological sequence alignment editor and
analysis program for Windows 95/98/NT.
Nucleic Acids Symposium Series 41:95-98.
Hamrick JL, Godt MJW, Sherman-Broyles SL
(1992) Factors influencing levels of genetic
diversity in woody plant species. New
Forests 6:95-124.
Hamrick JL, Nason JD (2000) Gene flow in
forest trees. In Boyle TJB, Young A, Boshier
D (eds) Forest Conservation Genetics:
Principles and Practice. CIFOR and CSIRO,
Australia.
Hanfstingl U, Berry A, Kellogg EA, Costa III
JT, Rudiger W (1994) Haplotype divergence
coupled with lack of diversity at the
Arabidopsis thaliana alcohol dehydrogenase
locus: role for both balancing and directional
selection? Genetics 138:811-828.
Hansen G, Estruch JJ, Sommer H, Spena A
(1993) NTGLO: a tobacco homologue of the
GLOBOSA floral homeotic gene of
Antirrhinum majus: cDNA sequence and
expression pattern. Molecular and General
Genetics 239:310-312.
Hardenack S, Ye D, Saedler H, Grant S (1994)
Comparison of MADS box gene expression
in developing male and female flowers of
the dioecious plant white campion. The Plant
Cell 6:1775-1787.
Higo K, Ugawa Y, Iwamoto M, Korenaga T
(1999) Plant cis-acting regulatory DNA
elements (PLACE) database: 1999. Nucleic
Acids Research 27:297-300.
Hill WG, Robertson A (1968) Linkage
disequilibrium in finite populations.
Theoretical and Applied Genetics 38:226-231.
Honma T, Goto K (2000) The Arabidopsis
floral homeotic gene PISTILLATA is
regulated by discrete cis-elements responsive
to induction and maintenance signals.
Development 127:2021-2030.
Howland DE, Oliver RP, Davy AJ (1995)
Morphological and molecular variation in
natural populations of B. pendula. The New
Phytologist 130:117-124.
Hudson RR, Kreitman M, Aguadé M (1987)
A test of neutral molecular evolution based
on nucleotide data. Genetics 116:153-159.
Huntley B, Birks HJ (1983) An atlas of past
and present pollen maps for Europe: 0-13000
years ago. Cambridge University Press,
Cambridge, United Kingdom.
Hyvärinen H (1987) History of forests in
northern Europe since the last glaciation.
Annales Academiae Scientiarum Fennicae.
Series A. III, Geologica-Geographica 145:7-18.
Innan H, Tajima F, Terauchi R, Miyashita NT
(1996) Intragenic recombination in the Adh
locus of the wild plant Arabidopsis thaliana.
Genetics 143:1761-1770.
Jack T, Brockman LL, Meyerowitz EM (1992)
The homeotic gene APETALA3 of
Arabidopsis thaliana encodes a MADS box
and is expressed in petals and stamens. Cell
68:683-697.
Jäger EJ (1980) Progressionen im
Synfloreszensbau und in der Verbreitung bei
den Betulaceae. Flora 170:91-113.
Johnsson H (1945) Interspecific hybridization
within the genus Betula. Hereditas 31:163-176.
Jonsell B, Ed. (2000) Flora Nordica, vol. 1.
The Bergius Foundation, Royal Swedish
Academy of Sciences, Stockholm, Sweden.
Kado T, Yoshimaru H, Tsumura Y, Tachida H
(2003) DNA variation in a conifer,
Cryptomeria japonica (Cupressaceae sensu
lato). Genetics 164:1547-1559.
Karhu A, Hurme P, Karjalainen M, Karvonen
P, Kärkkäinen K, Neale D, Savolainen O
(1996) Do molecular markers reflect patterns
of differentiation in adaptive traits of
conifers? Theoretical and Applied Genetics
93:215-221.
Karvonen P, Savolainen O (1993) Variation
and inheritance of ribosomal DNA in Pinus
sylvestris L. (Scots pine). Heredity 71:614-622.
Kato H, Oginuma K, Gu Z, Hammel B, Tobe
H (1998) Phylogenetic relationships of
Betulaceae based on matK sequences with
particular reference to the position of
Ostryopsis. Acta Phytotaxonomica et
Geobotanica 49:89-97.
Kawabe A, Innan H, Terauchi R, Miyashita NT
(1997) Nucleotide polymorphism in the
Acidic Chitinase locus (ChiA) region of the
wild plant Arabidopsis thaliana. Molecular
Biology and Evolution 14:1303-1315.
Kawabe A, Miyashita NT (1999) DNA
variation in the basic Chitinase locus (ChiB)
region of the wild plant Arabidopsis
thaliana. Genetics 153:1445-1453.
Kawabe A, Yamane K, Miyashita NT (2000)
DNA polymorphism at the cytosolic
Phosphoglucose Isomerase (PgiC) locus of
the wild Arabidopsis thaliana. Genetics
156:1339-1347.
Keinänen M, Julkunen-Tiitto R, Mutikainen P,
Walls M, Ovaska J, Vapaavuori E (1999a)
Trade-offs in secondary metabolism: effects
of fertilization, defoliation, and genotype on
birch leaf phenolics. Ecology 80:1970-1986.
Keinänen M, Julkunen-Tiitto R, Rousi M,
Tahvanainen J (1999b) Taxonomic
implications of phenolic variation in leaves
of birch (Betula L.) species. Biochemical
Systematics and Ecology 27:243-256.
Komarov V (1936) Flora of the U.S.S.R., vol.
5. Izdatel´stvo Akademii Nauk SSSR,
Moskva-Leningrad, the Soviet Union.
Koornneef M, Alonso-Blanco C, Peeters AJM,
Soppe W (1998) Genetic control of
flowering time in Arabidopsis. Annual
Review of Plant Physiology 49:345-370.
Koornneef M, Alonso-Blanco C, Vreugdenhil
D (2004) Naturally occurring genetic
variation in Arabidopsis thaliana. Annual
Review of Plant Biology 55:141-172.
Koski V (1989) Metsäpuiden jalostus/
Breeding of forest trees.
Ammattikasvatushallitus, Helsinki, Finland.
Kramer EM, Dorit RL, Irish VF (1998)
Molecular evolution of genes controlling
petal and stamen development: duplication
and divergence within the APETALA3 and
PISTILLATA MADS-box gene lineages.
Genetics 149:765-783.
Krizek BA, Meyerowitz EM (1996) The
Arabidopsis homeotic genes APETALA3 and
PISTILLATA are sufficient to provide the B
class organ identity function. Development
122:11-22.
Krizek BA, Riechmann JL, Meyerowitz EM
(1999) Use of the APETALA1 promoter to
assay the in vivo function of chimeric MADS
box genes. Sexual Plant Reproduction 12:14-26.
Krüssman G (1960) Handbuch der
Laubgehölze, Band I. Paul Parey, Berlin,
Germany.
Kuittinen H, Aguadé M (2000) Nucleotide
variation at the Chalcone isomerase locus
in Arabidopsis thaliana. Genetics 155:863-872.
Kuittinen H, Salguero D, Aguadé M (2002)
Parallel patterns of sequence variation within
and between populations at three loci of
Arabidopsis thaliana. Molecular Biology
and Evolution 19:2030-2034.
Kush A, Brunelle A, Shevell D, Chua NH
(1993) The cDNA sequence of two MADS
box proteins in Petunia. Plant Physiology
102:1051-1052.
Laitinen M-L, Julkunen-Tiitto R, Rousi M
(2000) Variation in phenolic compounds
within a birch (Betula pendula) population.
Journal of Chemical Ecology 26:1609-1622.
Laitinen M-L, Julkunen-Tiitto R, Rousi M
(2002) Foliar phenolic composition of
European white birch during bud unfolding
and leaf development. Physiologia
Plantarum 114:450-460.
Lamb RS, Irish VF (2003) Functional
divergence within the APETALA3/
PISTILLATA floral homeotic gene lineages.
Proceedings of the National Academy of
Sciences 100:6558-6563.
Laroche J, Li P, Maggia L, Bousquet J (1997)
Molecular evolution of angiosperm
mitochondrial introns and exons.
Proceedings of the National Academy of
Sciences 94:5722-5727.
Le Corre V, Roux F, Reboud X (2002) DNA
polymorphism at the FRIGIDA gene in
Arabidopsis thaliana: extensive
nonsynonymous variation is consistent with
local selection for flowering time. Molecular
Biology and Evolution 19:1261-1271.
Lemmetyinen J, Pennanen T, Lännenpää M,
Sopanen T (2001) Prevention of flower
formation in dicotyledons. Molecular
Breeding 7:341-350.
Lemmetyinen J, Hassinen M, Elo A, Porali I,
Keinonen K, Mäkelä H, Sopanen T (2004)
Functional characterisation of SEPALLATA3
and AGAMOUS orthologues in silver birch.
Physiologia Plantarum 121:149-162.
Li WH (1997) Molecular evolution. Sinauer
Associates, Sunderland, Massachusetts,
USA.
Litt A, Irish VF (2003) Duplication and
diversification in the APETALA1/
FRUITFULL floral homeotic gene lineage:
implications for the evolution of floral
development. Genetics 165:821-833.
McDonald JH, Kreitman M (1991) Adaptive
protein evolution at the Adh locus in
Drosophila. Nature 351:652-654.
Mandel MA, Yanofsky MF (1995) The
Arabidopsis AGL8 MADS box gene is
expressed in inflorescence meristems and is
negatively regulated by APETALA1. The
Plant Cell 7:1763-1771.
Miyashita NT, Innan H, Terauchi R (1996)
Intra- and interspecific variation of the
alcohol dehydrogenase locus region in wild
plants Arabis gemmifera and Arabidopsis
thaliana. Molecular Biology and Evolution
13:433-436.
Miyashita NT, Kawabe A, Innan H, Terauchi
R (1998) Intra- and interspecific DNA
variation and codon bias of the Alcohol
Dehydrogenase (Adh) locus in Arabis and
Arabidopsis species. Molecular Biology and
Evolution 15:1420-1429.
Miyashita NT, Kawabe A, Innan H (1999)
DNA variation in the wild plant Arabidopsis
thaliana revealed by amplified fragment
length polymorphism analysis. Genetics
152:1723-1731.
Miyashita NT (2001) DNA variation in the 5’
upstream region of the Adh locus of the wild
plants Arabidopsis thaliana and Arabis
gemmifera. Molecular Biology and
Evolution 18:164-171.
Miyashita NT (2003) Trimorphic DNA
variation in the receptor-like protein kinase
gene in the F18L15-130 region of the wild
plant Arabidopsis thaliana. Genes & Genetic
Systems 78:221-227.
Morton BR, Gaut BS, Clegg MT (1996)
Evolution of alcohol dehydrogenase genes
in the palm and grass families. Proceedings
of the National Academy of Sciences
93:11735-11739.
Mouradov A, Hamdorf B, Teasdale RD, Kim
JT, Winkler KU, Theissen G (1999) A DEF/
GLO-like MADS-box gene from a
gymnosperm: Pinus radiata contains an
ortholog of angiosperm B class floral
homeotic genes. Developmental Genetics
25:245-252.
Mouradov A, Cremer F, Coupland G (2002)
Control of flowering time: interacting
pathways as a basis for diversity. The Plant
Cell 14:S111-130.
Münster T, Pahnke J, Di Rosa A, Kim JT,
Martin W, Saedler H, Theissen G (1997)
Floral homeotic genes were recruited from
homologous MADS-box genes preexisting
in the common ancestor of ferns and seed
plants. Proceedings of the National Academy
of Sciences 94:2415-2420.
Muona O, Harju A (1989) Effective population
sizes, genetic variability, and mating system
in natural stands and seed orchards of Pinus
sylvestris. Silvae Genetica 38:221-228.
Nakai T (1915) Praecursores ad Floram
Sylvaticam Koreanam. II. (Betulaceae). The
Botanical Magazine (Tokyo) 29:35-47.
Neale DB, Savolainen O (2004) Association
genetics of complex traits in conifers.
Trends in Plant Science 9:325-330.
Nei M (1987) Molecular Evolutionary
Genetics. Columbia University Press, New
York, USA.
Nicholas KB, Nicholas HB Jr. (1997)
GeneDoc: a tool for editing and annotating
multiple sequence alignments. Distributed by
the author.
Nordborg M, Borevitz JO, Bergelson J, Berry
CC, Chory J, Hagenblad J, Kreitman M,
Maloof JN, Noyes T, Oefner PJ, Stahl EA,
Weigel D (2002) The extent of linkage
disequilibrium in Arabidopsis thaliana.
Nature Genetics 30:190-193.
Olsen KM, Womack A, Garrett AR, Suddith
JI, Purugganan MD (2002) Contrasting
evolutionary forces in the Arabidopsis
thaliana floral developmental pathway.
Genetics 160:1641-1650.
Olsen KM, Halldorsdottir SS, Stinchcombe JR,
Weinig C, Schmitt J, Purugganan MD (2004)
Linkage disequilibrium mapping of
Arabidopsis CRY2 flowering time alleles.
Genetics 167:1361-1369.
Palmé A, Vendramin G (2002) Chloroplast
DNA variation, postglacial recolonisation
and hybridisation in hazel, Corylus avellana.
Molecular Ecology 11:1769-1779.
Palmé A, Su Q, Rautenberg A, Manni F,
Lascoux M (2003) Postglacial recolonisation
and cpDNA variation of silver birch, Betula
pendula. Molecular Ecology 12:201-212.
Palmé AE, Su Q, Palsson S, Lascoux M (2004)
Extensive sharing of chloroplast haplotypes
among European birches indicates
hybridization among Betula pendula, B.
pubescens and B. nana. Molecular Ecology
13:167-178.
Pamilo P, Nei M (1988) Relationships between
gene trees and species trees. Molecular
Biology and Evolution 5:568-583.
Pawlowska L (1983) Biochemical and
systematic study of the genus Betula L. Acta
Societatis Botanicorum Poloniae 52:301-314.
Petit RJ, Kremer A, Wagner DB (1993)
Geographic structure of chloroplast DNA
polymorphisms in European oaks.
Theoretical and Applied Genetics 87:122-128.
Petit RJ, Pineau E, Demesure B, Bacilieri R,
Ducousso A, Kremer A (1997) Chloroplast
DNA footprints of postglacial recolonization
by oaks. Proceedings of the National
Academy of Sciences 94:9996-10001.
Posada D, Crandall KA (2001) Evaluation of
methods for detecting recombination from
DNA sequences: computer simulations.
Proceedings of the National Academy of
Sciences 98:13757-13762.
Prittinen K, Pusenius J, Koivunoro K, Roininen
H (2003) Genotypic variation in growth and
resistance to insect herbivory in silver birch
(Betula pendula) seedlings. Oecologia
442:572-577.
Purugganan MD, Suddith JI (1998) Molecular
population genetics of the Arabidopsis
CAULIFLOWER regulatory gene: Nonneutral
evolution and naturally occurring variation
in floral homeotic function. Proceedings of
the National Academy of Sciences 95:8130-8134.
Purugganan MD, Suddith JI (1999) Molecular
population genetics of floral homeotic loci:
departures from equilibrium-neutral model
at the APETALA3 and PISTILLATA genes of
Arabidopsis thaliana. Genetics 151:839-848.
Radetzky R (1990) Analysis of mitochondrial
DNA and its inheritance in Populus. Current
Genetics 18:429-434.
Rajora OP, Dancik BP (1992) Chloroplast
DNA inheritance in Populus. Theoretical
and Applied Genetics 84:280-285.
Regel E (1865) Bemerkungen über die
Gattungen Betula und Alnus nebst
Beschreibung einiger neuer Arten. Bulletin
de la Société Impériale des Naturalistes de
Moscou 38:388-434.
Riechmann JL, Wang M, Meyerowitz EM
(1996a) DNA-binding properties of
Arabidopsis MADS domain homeotic
proteins APETALA1, APETALA3,
PISTILLATA and AGAMOUS. Nucleic
Acids Research 24:3134-3141.
Riechmann JL, Krizek BA, Meyerowitz EM
(1996b) Dimerization specificity of
Arabidopsis MADS domain homeotic
proteins APETALA1, APETALA3,
PISTILLATA, and AGAMOUS.
Proceedings of the National Academy of
Sciences 93:4793-4798.
Riechmann JL, Meyerowitz EM (1997) MADS
domain proteins in plant development.
Biological Chemistry 378:1079-1101.
Rieseberg LH (1997) Hybrid origins of plant
species. Annual Review of Ecology and
Systematics 28:359-389.
Rozas J, Rozas R (1999) DnaSP version 3: an
integrated program for molecular population
genetics and molecular evolution analysis.
Bioinformatics 15:174-175.
Rusanen M, Vakkari P, Blom A (2003) Genetic
structure of Acer platanoides and Betula
pendula in northern Europe. Canadian
Journal of Forest Research 33:1110-1115.
Saitou N, Nei M (1987) The neighbor-joining
method: a new method for reconstructing
phylogenetic trees. Molecular Biology and
Evolution 4:406-425.
Sang T, Donoghue MJ, Zhang D (1997)
Evolution of alcohol dehydrogenase genes
in peonies (Paeonia): phylogenetic
relationships of putative nonhybrid species.
Molecular Biology and Evolution 14:994-1007.
Särkilahti E, Valanne T (1990) Induced
polyploidy in Betula. Silva Fennica 24:227-234.
Savard L, Michaud M, Bousquet J (1993)
Genetic diversity and phylogenetic
relationships between birches and alders
using ITS, 18S rRNA, and rbcL gene
sequences. Molecular Phylogenetics and
Evolution 2:112-118.
Savolainen O, Langley CH, Lazzaro BP,
Fréville H (2000) Contrasting patterns of
nucleotide polymorphism at the alcohol
dehydrogenase locus in the outcrossing
Arabidopsis lyrata and the selfing
Arabidopsis thaliana. Molecular Biology
and Evolution 17:645-655.
Sawyer SA (1989) Statistical test for detecting
gene conversion. Molecular Biology and
Evolution 6:526-534.
Sawyer SA (1999) GENECONV: a computer
package for the detection of gene conversion.
Computer program and documentation
distributed by the author. Website:
http://www.wusl.edu/~sawyer.
Schwarz-Sommer Z, Huijser P, Nacken W,
Saedler H, Sommer H (1990) Genetic control
of flower development by homeotic genes
in Antirrhinum majus. Science 250:931-936.
Semerikov V, Zhang H, Sun M, Lascoux M
(2003) Conflicting phylogenies of Larix
(Pinaceae) based on cytoplasmic and nuclear
DNA. Molecular Phylogenetics and
Evolution 27:173-184.
Shepard KA, Purugganan MD (2003)
Molecular population genetics of the
Arabidopsis CLAVATA2 region: the genomic
scale of variation and selection in a selfing
species. Genetics 163:1083-1095.
Soltis DE, Johnson LA, Looney C (1996)
Discordance between ITS and chloroplast
topologies in the Boykinia group
(Saxifragaceae). Systematic Botany 21:169-185.
Soltis DE, Soltis PS (1999) Polyploidy:
recurrent formation and genome evolution.
Trends in Ecology and Evolution 14:348-352.
Soltis DE, Tago-Nakazawa K, Xiang QY
(2001) Phylogenetic relationships and the
evolution in Chrysosplenium
(Saxifragaceae) based on matK sequence
data. American Journal of Botany 88:883-893.
Southerton SG, Marshall H, Mouradov A,
Teasdale RD (1998) Eucalypt MADS-box
genes expressed in developing flowers. Plant
Physiology 118:365-372.
Stahl EA, Dwyer G, Mauricio R, Kreitman M,
Bergelson J (1999) Dynamics of disease
resistance polymorphism at the Rpm1 locus
of Arabidopsis. Nature 400:667-671.
Stanford AM, Harden R, Parks CR (2000)
Phylogeny and biogeography of Juglans
(Juglandaceae) based on matK and ITS
sequence data. American Journal of Botany
87:872-882.
Stebbins GL (1971) Chromosomal evolution
of higher plants. Addison-Wesley, Reading,
Massachusetts, USA.
Stern K (1964) Herkunftsversuche für Zwecke
der Forstpflanzenzüchtung, erläutert am
Beispiel zweier Modellversuche. Der
Züchter 34:181-219.
Tajima F (1989) Statistical methods for testing
the neutral mutation hypothesis by DNA
polymorphism. Genetics 123:585-595.
Tang W, Perry SE (2003) Binding site selection
for the plant MADS domain protein AGL15:
an in vitro and in vivo study. The Journal
of Biological Chemistry 278:28154-28159.
The Arabidopsis Genome Initiative (2000)
Analysis of the genome sequence of the
flowering plant Arabidopsis thaliana. Nature
408:796-815.
Theissen G (2001) Development of floral
organ identity: stories from the MADS
house. Current Opinion in Plant Biology
4:75-85.
Thompson JD, Gibson TJ, Plewniak F,
Jeanmougin F, Higgins DG (1997) The
ClustalX windows interface: flexible
strategies for multiple sequence alignment
aided by quality analysis tools. Nucleic
Acids Research 25:4876-4882.
Thórsson Æ, Salmela TE, Anamthawat-Jónsson
K (2001) Morphological,
cytological, and molecular evidence for
introgressive hybridisation in birch. Journal
of Heredity 92:404-408.
Tilly JJ, Allen DW, Jack T (1998) The CArG
boxes in the promoter of the Arabidopsis
floral organ identity gene APETALA3
mediate diverse regulatory effects.
Development 125:1647-1657.
Troebner W, Ramirez L, Motte P, Hue I,
Huijser P, Loennig WE, Saedler H, Sommer
H, Schwarz-Sommer Z (1992) GLOBOSA:
a homeotic gene which interacts with
DEFICIENS in the control of Antirrhinum
floral organogenesis. The EMBO Journal
11:4693-4704.
Vandenbussche M, Theissen G, Van de Peer
Y, Gerats T (2003) Structural diversification
and neo-functionalization during floral
MADS-box gene evolution by C-terminal
frameshift mutations. Nucleic Acids
Research 31:4401-4409.
Wang X-Q, Tank DC, Sang T (2000)
Phylogeny and divergence times in Pinaceae:
evidence from three genomes. Molecular
Biology and Evolution 17:773-781.
Wang X-R, Tsumura Y, Yoshimaru H,
Nagasaka K, Szmidt AE (1999)
Phylogenetic relationships of Eurasian pines
(Pinus, Pinaceae) based on chloroplast rbcL,
matK, rpl20-rps18 spacer, and trnV intron
sequences. American Journal of Botany
86:1742-1753.
Wang X-R, Szmidt AE (2001) Molecular
markers in population genetics of forest
trees. Scandinavian Journal of Forest
Research 16:199-220.
Wendel J, Doyle J (1998) Phylogenetic
incongruence: window into genome history
and molecular evolution. In D. Soltis, P.
Soltis and J. Doyle [eds.], Molecular
systematics of plants II: DNA sequencing.
Kluwer Academic Press, Boston, USA.
Whittle C-A, Johnston MO (2002) Male-driven
evolution of mitochondrial and
chloroplastidial DNA sequences in plants.
Molecular Biology and Evolution 19:938-949.
Whittle C-A, Johnston MO (2003) Broad-scale
analysis contradicts the theory that
generation time affects molecular
evolutionary rates in plants. Journal of
Molecular Evolution 56:223-233.
Williams JH, Arnold ML (2001) Sources of
genetic structure in the woody perennial
Betula occidentalis. International Journal of
Plant Sciences 162:1097-1109.
Willis KJ, Rudner E, Sümegi P (2000) The
full-glacial forests of central and southeastern
Europe. Quaternary Research 53:203-213.
Winkler H (1904) Betulaceae. In Das
Pflanzenreich, Heft 19 (IV.61). 149 p. Edited
by Engler A, W. Engelmann, Leipzig,
Germany.
Wiuf C, Christensen T, Hein J (2001) A
simulation study of the reliability of
recombination detection. Molecular Biology
and Evolution 18:1929-1939.
Wolfe KH, Li W-H, Sharp PM (1987) Rates
of nucleotide substitution vary greatly among
plant mitochondrial, chloroplast, and nuclear
DNAs. Proceedings of the National
Academy of Sciences 84:9054-9058.
Woodworth RH (1929) Cytological studies in
the Betulaceae, I. Betula. The Botanical
Gazette 87:331-364.
Yanofsky MF (1995) Floral meristems to floral
organs: genes controlling early events in
Arabidopsis flower development. Annual
Review of Plant Physiology and Plant
Molecular Biology 46:167-188.
Yao JL, Dong YH, Morris BA (2001)
Parthenocarpic apple fruit production
conferred by transposon insertion mutations
in a MADS-box transcription factor.
Proceedings of the National Academy of
Sciences 98:1306-1311.
Yokoyama S, Harry DE (1993) Molecular
phylogeny and evolutionary rates of alcohol
dehydrogenases in vertebrates and plants. Molecular Biology
and Evolution 10:1215-1226.
Yoshida K, Kamiya T, Kawabe A, Miyashita
NT (2003) DNA polymorphism at the
ACAULIS5 locus of the wild plant
Arabidopsis thaliana. Genes & Genetic
Systems 78:11-21.