Järvinen, Pia
Nucleotide variation of birch (Betula L.) species: population structure and phylogenetic relationships. - University of Joensuu, 2004, 138 pp.
University of Joensuu, PhD Dissertations in Biology, No. 34. ISSN 1457-2486. Net version ISBN 952-458-593-6

Key words: BpADH, BpFULL1, BpMADS2, Betula, Betula pendula, Betulaceae, dimorphism, genetic differentiation, linkage disequilibrium, matK, nucleotide diversity, phylogeny, recombination

Nucleotide variation in three nuclear genes, BpMADS2, BpFULL1, and BpADH, was studied in two natural silver birch (Betula pendula Roth) populations. Nuclear sequences were used to explore the nucleotide variation within a population, among the populations, and between different genes of silver birch, and to evaluate whether the post-glacial colonisation history of the species is reflected in the distribution of variation. Many studies on marker genes, both on isozyme variation and anonymous DNA, have shown that variation within and among silver birch populations is high, resulting in a prediction that the level of nucleotide variation should also be high in this species. The observed results, however, do not fully support this prediction. In particular, the level of nonsynonymous variation (πa) in the BpMADS2 and BpFULL1 loci was very low, 0.00052 and 0, respectively, and only somewhat higher in the BpADH locus (0.0015). The overall synonymous site variation (πs) for BpMADS2 was also low, only 0.0043. The overall synonymous site nucleotide diversities for the BpFULL1 and BpADH regions were higher than in the BpMADS2 locus, 0.0134 and 0.0117, respectively. As the detected patterns of polymorphism and divergence at these loci were concordant, the variable mutation rates between the studied loci could explain the differences, but the power of some of the selection tests was low in the relatively small sample with low numbers of segregating sites.
Many aspects of the data are consistent with a large effective population size of silver birch. The genetic differentiation between the two studied populations was low and the decay of linkage disequilibrium in the studied genes was very rapid. Furthermore, the recombination rate was high in the silver birch nuclear genome. These three nuclear loci of silver birch did not show strong patterns caused by the post-glacial expansion of the species, but were close to demographic equilibrium. Earlier studies on cpDNA found a strong geographical pattern presumably due to colonisation history, but there was only a weak suggestion of such a pattern at the nuclear loci. Silver birch, as a wind-pollinated species, has very efficient gene flow through pollen and this may have broken down the initial genetic structure of populations established at colonisation. More loci and populations will be needed to confirm these findings.

The phylogenetic relationships within the genus Betula (Betulaceae) were investigated using a part of the nuclear ADH, BpMADS2, and BpFULL1 genes. In general, the results obtained from the nuclear data fit rather well with the infrageneric classifications proposed for birches. In disagreement with the classical division of the genus Betula, B. schmidtii grouped with the species in subgenus Betula, and B. ermanii grouped with species in subgenus Chamaebetula.
Pia Järvinen, Department of Biology, University of Joensuu, P.O.Box 111, FIN-80101 Joensuu, Finland

ABBREVIATIONS
ADH: Alcohol dehydrogenase
AFLP: Amplified Fragment Length Polymorphism
bp: base pair
BP: Before Present
BpADH: Betula pendula ADH
BpFULL1: Betula pendula FRUITFULL-Like 1
BpMADS2: Betula pendula MADS2
c: recombination rate between loci
kb: kilo base
MADS box genes: gene family of transcription factors
MatK: gene coding maturase-like protein in plants
PCR: Polymerase Chain Reaction
PEE: Positive Early Element
PI: PISTILLATA
PLE1,2,3: Positive Late Elements 1,2,3
RAPD: Random Amplified Polymorphic DNA
RFLP: Restriction Fragment Length Polymorphism
Tajima’s D: tests the hypothesis that all mutations are selectively neutral
π: nucleotide diversity

CONTENTS
LIST OF ORIGINAL PUBLICATIONS
1. INTRODUCTION
1.1. General background
1.2. Genus Betula
1.3. Flower development
1.3.1. Flower development in Arabidopsis
1.3.2. MADS box genes FRUITFULL and PISTILLATA
1.3.3. BpFULL1 gene
1.4. Molecular markers
1.4.1. Nuclear and chloroplast DNA
1.5. Nucleotide variation in plants
1.5.1. Nucleotide variation in Arabidopsis
1.5.2. Nucleotide variation in woody plants
1.5.2.1. Betula pendula
2. AIMS OF THE STUDY
3. MATERIALS AND METHODS
3.1. Plant materials
3.1.1. Plant material for the isolation and analysis of BpMADS2
3.1.2. Plant material for the population studies
3.1.3. Plant material for the phylogenetic studies
3.2. Isolation and analysis of BpMADS2 and BpADH genes
3.3. Isolation and analysis of BpMADS2, BpFULL1, and BpADH fragments
3.4. Phylogenetic studies
4. RESULTS
4.1. Isolation and analysis of the BpMADS2 gene
4.1.1. Identification of the putative regulatory elements in the BpMADS2 promoter
4.2. Isolation and analysis of the BpADH gene
4.3. Variation within the two B. pendula populations
4.3.1. Nucleotide polymorphism
4.3.2. Dimorphism of haplotypes
4.3.3. Divergence between populations and demographic equilibrium
4.3.4. Estimates of polymorphism and divergence
4.4. Birch phylogeny
5. DISCUSSION
5.1. BpMADS2 is the PI homologue of birch
5.1.1. Identification of the putative BpMADS2 regulatory regions
5.2. Nucleotide variation in silver birch
5.2.1. Level of nucleotide variation in three nuclear loci of silver birch
5.2.2. No genetic differentiation between the two silver birch populations
5.2.3. Recombination is common in silver birch genome
5.2.4. Nuclear genes of silver birch show few traces of postglacial expansion
5.3. Phylogeny of the genus Betula
5.3.1. Comparison of molecular phylogenies
5.3.2. Phylogenetic relationships of Betula schmidtii
5.3.3. Origin of the two alleles of ADH gene
5.3.4. Reconciling gene trees with a species tree
ACKNOWLEDGEMENTS
REFERENCES

LIST OF ORIGINAL PUBLICATIONS
This thesis is mainly based on the following publications but it also includes some previously unpublished results. In the text, the publications are referred to by the Roman numerals I-IV.

I Järvinen, P., Lemmetyinen, J., Savolainen, O. and Sopanen, T. (2003) DNA sequence variation in BpMADS2 gene in two populations of Betula pendula. Molecular Ecology 12(2): 369-384.
II Järvinen, P., Sopanen, T. and Savolainen, O. Nucleotide polymorphism in BpFULL1 and BpADH loci of silver birch (Betula pendula) (Betulaceae). Manuscript.
III Järvinen, P., Palmé, A., Morales, L.O., Lännenpää, M., Sopanen, T., Keinänen, M. and Lascoux, M. (2004) Phylogenetic relationships of Betula species (Betulaceae) based on chloroplast matK and nuclear ADH gene sequences. American Journal of Botany 91(11): 1834-1845.
IV Järvinen, P., Sopanen, T. and Keinänen, M. Phylogeny of the genus Betula (Betulaceae): inferences from two nuclear MADS genes. Manuscript submitted to BMC Evolutionary Biology.

Publications I and III are reprinted with permission from publishers.
Copyrights for I by Blackwell Publishing Ltd, and for III by Allen Press.

1. INTRODUCTION
1.1. General background
Genetic variation exists within every species and forms the basis for natural selection and evolution. The extent and pattern of nucleotide variation in natural populations can provide us with useful information on the evolutionary history of the species, the mechanisms that maintain genetic variation and the evolutionary forces acting on species in general. Genetic variation at the nucleotide level has been studied very little in woody plants, yet many of these species are of great economic importance to humans. In Northern Europe, silver birch, Betula pendula Roth, is one of the three most important forest tree species (Anonymous, 1999), and is used, for example, in plywood, pulp and furniture production. Long-lived species, such as forest trees, are subject to environmental conditions varying greatly from year to year, and thus maintaining the genetic diversity in forest tree populations means maintaining adaptability to changing environmental conditions. Life history traits, such as generation time, geographical distribution, pollination mechanism, mating system, and seed and pollen dispersal, influence the amount and apportionment of genetic variation of the species. Besides natural factors, direct human impact has already altered (see Kado et al., 2003), or will in the long term alter, genetic variation in certain important forest tree species. So far, forest trees have undergone relatively little breeding; for example, in Finland the breeding of trees did not start until the end of the 1940s (Koski, 1989). However, biotechnological approaches, such as micro-propagation, gene transfer, and marker-assisted breeding can change this situation in the future.
Furthermore, birches have been planted increasingly with material coming from selected trees, and even though the proportion of planted birches to those originating from natural regeneration is still small, it is increasing all the time. In the long term this may affect the genetic diversity of the silver birch. Thus, studies on the current genetic structure of silver birch will provide reference values of nucleotide diversity in this species. There are also practical reasons for an interest in the population genetics of silver birch. Conventional breeding of woody plants is very slow, and therefore many research groups are looking for tools to improve wood quality and other properties of trees using gene technology. However, the phenotypic variation of many commercially important and desired properties of forest trees is complex and regulated by many loci, or the genetic background of the desired properties is not known (Neale and Savolainen, 2004). One potential approach to overcome these problems could be association genetics of complex traits. When designing an association mapping study, one of the most important issues is the population structure of the studied species in natural populations. Knowledge of the current population structure of silver birch and of the rate of decay of linkage disequilibrium will facilitate the design of association mapping studies, and thereby potentially accelerate the breeding of this economically important forest tree species.

1.2. Genus Betula
The birches (Betula L.) are common trees and shrubs of the Northern Hemisphere (Furlow, 1990). Phylogenetically, the genus Betula belongs to the birch family Betulaceae (order Fagales). The genus comprises approximately 30-35 species (Furlow, 1990; de Jong, 1993), but the number of species accepted by different authors ranges from 30 to over 150 (de Jong, 1993), and considerable controversy still exists regarding the systematics of the genus.
The uncertainty about the number of species and their phylogenetic relationships is mostly due to the high polymorphism in morphology. Furthermore, hybridisation is very frequent and for this reason introgression, the transfer of genes between species, may have played an important part in the evolution of the genus (Alam and Grant, 1972; Furlow, 1990; Atkinson, 1992). The basic chromosome number of genus Betula is n = 14, but natural polyploidy is very frequent (Furlow, 1990; de Jong, 1993). Species of Betula form a polyploid series, with chromosome numbers of 2n = 28, 56, 70, 84, and 112 and the ploidy levels differ between the subsections/subgenera. From the evolutionary point of view the genus Betula is still young, which partly explains the polyploid nature of the genus and the occurrence of various ploidy levels (Särkilahti and Valanne, 1990). Furthermore, the differences in ploidy levels among the different subsections/subgenera indicate that several independent polyploidisations have occurred within the genus.

Figure 1. Hypothetical phylogenetic relationships between the subgenera of Betula (based on de Jong, 1993). Subgenera shown in the figure:
subgenus Betulaster (e.g., B. maximowicziana)
subgenus Betulenta (e.g., B. lenta and B. alleghaniensis)
subgenus Neurobetula (e.g., B. ermanii and B. schmidtii)
subgenus Betula (e.g., B. pendula, B. pubescens, B. papyrifera, B. populifolia, B. resinifera and B. platyphylla)
subgenus Chamaebetula (e.g., B. humilis, B. fruticosa and B. nana)

Different attempts to identify sections or subgenera, and the relationships among the Betula species have been made on the basis of morphology, biochemical characters and/or chromosomal numbers (Regel, 1865; Winkler, 1904; Nakai, 1915; Komarov, 1936; Pawlowska, 1983; de Jong, 1993; Keinänen et al., 1999a). Regel (1865), the original monographer of Betula, divided birches into two main sections, Eubetula and Betulaster.
The Eubetula section was further divided into three subsections, Albae (white birches), Costatae (yellow birches), and Nanae (dwarf birches). The section Betulaster contained only a few Asian birches in the subsection Acuminatae. Since then this division has been revised numerous times by a number of authors into different subsections or subgenera (summarised in Furlow, 1990; de Jong, 1993; Fig. 1). Birches are wind-pollinated and the dispersal of seeds is also by wind (Atkinson, 1992). In contrast to the great variation in their vegetative parts, birches are rather uniform in their reproductive organs, including separate male and female catkins. The female flowers consist of a bicarpellate ovary with one anatropous ovule in each locule (Table 1; Furlow 1990, de Jong, 1993). The male flowers consist of a reduced perianth with 1-4 reduced tepals and 1-4 bifid stamens. The number of stamens and tepals is three or four in members of the subgenera Betulenta and Neurobetula, two or three in members of the subgenus Betula and one or two in members of the subgenus Chamaebetula (Table 1; de Jong, 1993). In general, the number of tepals and stamens is equal within the species. In the subgenus Betulaster, however, the number of stamens has been reduced to two, but four tepals have been retained.

1.3. Flower development
1.3.1. Flower development in Arabidopsis
During recent years, the flowering plant Arabidopsis thaliana (hereafter referred to as Arabidopsis) has become universally recognised as a model system in molecular, genetic, and evolutionary research. It has many advantages, including a small size, short generation time, and a relatively small genome, which has been completely sequenced (The Arabidopsis Genome Initiative, 2000). Arabidopsis is predominantly a selfing species, and the reported level of outcrossing is less than 1 % (Abbot and Gomez, 1989). Therefore most Arabidopsis plants in nature represent inbred lines, which are in practice homozygous.
Furthermore, the lack of heterozygous individuals within this species will presumably cause recombination to be effectively very rare. Flowering is a very complex process and it can be divided into several independent phases, including induction of flowering, the formation of inflorescence and floral meristems, and the formation of floral organs (summarised in Yanofsky, 1995). The transition from the vegetative phase to the reproductive phase during Arabidopsis development is the result of a complex interaction of environmental (e.g., photoperiod, light intensity, light quality, and temperature) and endogenous (e.g., hormones and metabolites) factors.

Table 1. Subgeneric taxonomic categories, selected morphological characters, chromosome numbers, and distribution of the birch species.
Section or subsection (Winkler, 1904) | Subgenus (de Jong, 1993) | Infructescences | No. of stamens in male flowers | Species
Betulaster, Acuminatae | Betulaster | pendulous, clustered | 2 | B. maximowicziana Regel
Eubetula, Costatae | Betulenta | erect | 3-4 | B. lenta L.; B. alleghaniensis Britt.
Eubetula, Costatae | Neurobetula | erect | 3-4 | B. ermanii Cham.; B. schmidtii Regel
Eubetula, Albae | Betula | pendulous | 2-3 | B. pendula Roth; B. pubescens Ehrh.; B. platyphylla Suk. var. japonica (Miq.) Hara; B. papyrifera Marsh.; B. populifolia Marsh.; B. resinifera Britt.
Eubetula, Nanae | Chamaebetula | erect | 1-2 | B. fruticosa Pall.; B. humilis Schrenk; B. nana L.
Species-level columns of the table: leaf veins (a), fruit wings, 2n (b), distribution.
(a) Krüssman (1960). (b) Based on literature (e.g., Furlow, 1990, de Jong, 1993).
Furthermore, at least four interacting pathways whose signals regulate the expression of genes involved in flower development have been described: the photoperiod response pathway, the vernalization response pathway, the autonomous pathway, and the gibberellin pathway (Koornneef et al., 1998; Mouradov et al., 2002). The meristem identity genes control the transition from vegetative to inflorescence and from inflorescence to floral meristems (Yanofsky, 1995). One of the key regulators of the transition from vegetative to reproductive phase is the meristem identity gene LEAFY (LFY), whose activity is proposed to mediate the initiation of flowers. CAULIFLOWER (CAL), APETALA1 (AP1), and FRUITFULL (FUL) also have important functions in flower initiation, partly because of their roles in upregulating LFY expression (Ferrándiz et al., 2000). CAL, AP1 and FUL share redundant functions in the establishment of floral meristem identity and mutations in these genes cause conversion of flowers into shoots (Yanofsky, 1995). The Arabidopsis flower is organised into four concentric whorls of organs. Starting from the outermost whorl, these consist of four sepals, four petals, six stamens and two fused carpels. During flower development the identity of the different floral organs is determined by the activity of floral organ identity genes. This specification has been described in the ABC model (Coen and Meyerowitz, 1991), which postulates three gene functions, A, B, and C, each acting in two adjacent whorls to specify the floral organs. According to this model, action of A alone specifies sepal formation, the combination AB specifies the development of petals, and the combination BC specifies stamen formation. Action of the C function alone determines the development of carpels.
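The whorl-by-whorl logic of the ABC model described above can be written down as a small lookup. The following Python sketch is illustrative only: the names ABC_ACTIVITY, ORGAN_BY_FUNCTIONS and organ_identities are hypothetical, and the optional removal of a function simply deletes it from every whorl without modelling any interaction between the remaining functions.

```python
# Functions active in each floral whorl under the classical ABC model
# (whorl 1 is the outermost). All names are illustrative assumptions.
ABC_ACTIVITY = {1: {"A"}, 2: {"A", "B"}, 3: {"B", "C"}, 4: {"C"}}

# Organ identity specified by each combination of active functions.
ORGAN_BY_FUNCTIONS = {
    frozenset({"A"}): "sepal",
    frozenset({"A", "B"}): "petal",
    frozenset({"B", "C"}): "stamen",
    frozenset({"C"}): "carpel",
}


def organ_identities(missing=frozenset()):
    """Return {whorl: organ} after removing the functions in `missing`."""
    result = {}
    for whorl, functions in ABC_ACTIVITY.items():
        active = frozenset(functions - set(missing))
        # Combinations the model does not define are reported as such.
        result[whorl] = ORGAN_BY_FUNCTIONS.get(active, "undefined")
    return result


if __name__ == "__main__":
    print(organ_identities())       # all three functions present
    print(organ_identities({"B"}))  # B function removed from every whorl
```

With no function removed, the lookup returns sepals, petals, stamens and carpels for whorls 1 to 4. Removing B leaves A alone in whorl 2 and C alone in whorl 3, so the sketch reports sepals and carpels there.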
Later this “classical” ABC model has been refined and extended to an ABCDE model, where the D function is needed for the ovule identity and the E function for petal, stamen and carpel identities (Theissen, 2001). MADS box genes encode transcription factors that have essential functions during flower development and organ differentiation, and nearly all of the A-, B-, and C-function genes belong to this gene family. The term MADS box arose from the first characterised genes that shared this region, namely MCM1 from yeast, AGAMOUS from Arabidopsis, DEFICIENS from Antirrhinum majus, and SRF from human (Schwartz-Sommer et al., 1990). A typical plant MADS domain protein has a very conserved structural organisation, including MADS (M-), intervening (I-), keratin-like (K-) and C-terminal (C-) domains (so-called MIKC type; Münster et al., 1997). MADS domain proteins bind to DNA either as homo- or heterodimers, and the highly conserved MADS domain is the major determinant of DNA binding and dimerization (Riechmann and Meyerowitz, 1997). The formed homo- and/or heterocomplexes bind to so-called CArG-box sequences (consensus CC(A/T)6GG or CTA(A/T)4TAG). However, the MADS domain does not contribute significantly to the functional specificity of floral homeotic proteins (Krizek and Meyerowitz, 1996; Krizek et al., 1999). In addition to the MADS domain, the I-region and in some genes parts of the K domain are required for the formation of a DNA binding complex (Riechmann et al., 1996a; Riechmann et al., 1996b). The I-region and the K box also have important roles in proper protein function (Riechmann and Meyerowitz, 1997). The C-terminal domain is involved in the formation of higher order complexes and it also functions as the transcriptional activator domain (Egea-Cortines et al., 1999; Honma and Goto, 2001).

1.3.2.
MADS box genes FRUITFULL and PISTILLATA
The FRUITFULL (FUL, formerly AGL8) gene encodes a MIKC-type MADS domain-containing transcription factor and it is involved in several distinct processes during Arabidopsis development. It has an early acting function in controlling flowering time, floral meristem identity and cauline leaf morphology together with two other genes, AP1 and CAL (Ferrándiz et al., 2000). Later it has a role in carpel and fruit development (Mandel and Yanofsky, 1995; Gu et al., 1998). The expression of FUL is first detected in the inflorescence meristem at the time when the development of Arabidopsis is switched from the vegetative to the reproductive phase (Mandel and Yanofsky, 1995). Later, FUL is expressed in the centre of the floral meristem, the region which gives rise to the pistil. In the mature flower the expression of FUL is detected in carpel walls. Furthermore, FUL, along with another meristem identity gene AP1, appears to be angiosperm-specific (Litt and Irish, 2003). The correlation of the origin of the AP1/FUL gene lineage with the origin of flowers suggests a possible role for these genes in the evolution of this key angiosperm feature.

The PISTILLATA (PI) gene also encodes a MIKC-type MADS domain-containing transcription factor and it is required, along with the other B-function gene APETALA3 (AP3), to specify petal and stamen identities in the Arabidopsis flower (Goto and Meyerowitz, 1994; Jack et al., 1992). PI, along with AP3, also plays an additional role in proliferation of the floral meristem (Krizek and Meyerowitz, 1996). Mutations in the PI gene cause homeotic conversion of petals to sepals and of stamens to carpels. At the early stages of flower development, prior to the first appearance of the primordia of petals and stamens, PI is expressed in the second, third, and fourth whorls of the developing flowers (Goto and Meyerowitz, 1994). Later on, the expression of PI is restricted to the second and third whorls.

1.3.3. BpFULL1 gene
Molecular data have shown that the mechanisms controlling flower development are largely conserved even in distantly related plant species (Yanofsky, 1995), such as Arabidopsis and birch. Utilising this information several MADS genes and/or their cDNAs regulating the development of birch inflorescences and/or flowers have been isolated in our group, e.g., BpMADS1 (Lemmetyinen et al., 2001), similar to SEPALLATA3; BpMADS3-5 (Elo et al., 2001), similar to FUL and AP1; BpMADS6 (Lemmetyinen et al., 2004), similar to AGAMOUS; BpMADS7 (P. Järvinen, J. Lemmetyinen and T. Sopanen, unpublished results), similar to AGL11; and BpMADS8 (S. Parkkinen, J. Lemmetyinen and T. Sopanen, unpublished results), similar to AP3. BpFULL1 (former BpMADS5) is the FUL homologue of birch (Elo et al., 2001). The expression of BpFULL1 is inflorescence specific (Elo et al., 2001; Lännenpää et al., submitted), and starts at the early stages of inflorescence development. The expression continued in both male and female inflorescences thereafter and expression was detected also in male inflorescences at anthesis, and in female inflorescences during seed development (Elo et al., 2001). No expression was detected in the vegetative parts. In situ hybridisation studies have shown that the expression of BpFULL1 is localised in birch inflorescence meristems and in male and female inflorescences, especially in stamen and carpel primordia (Lännenpää et al., submitted). These results indicate that BpFULL1, like its Arabidopsis homologue FUL, might be involved in the transition from vegetative to reproductive stage of development and the initiation of flower development.

1.4. Molecular markers
There are many reasons why molecular data, particularly DNA sequence data, are much more powerful for evolutionary studies, both at population and at species level, compared to morphological data. First of all, DNA sequence data represent the highest level of genetic resolution, and act as a store of genetic information containing “the code of life” (Li, 1997). Secondly, molecular data are much more abundant than morphological data. For example, the genome of Arabidopsis contains 25 498 genes encoding proteins from 11 000 families (The Arabidopsis Genome Initiative, 2000). Also, the three genomes in the plant cell (nuclear, chloroplast, and mitochondrial) compose three independent DNA data sets in one species, and a combination of these data sets can provide complementary information on the evolution of the species and taxa. Thirdly, DNA sequences generally evolve in a much more regular manner than do morphological characters and are often more responsive to quantitative treatments than are morphological data, and therefore can provide a clearer picture of relationships of species (Li, 1997). Furthermore, molecular data offer potentially huge data sets that are comparable across a wide taxonomic range (e.g., Yokoyama and Harry, 1993), and this might help us to resolve one of the prime goals of evolutionary biology, “the Tree of Life”. However, it is important to keep in mind that DNA sequences are only one of many types of data that can be used to study phylogenetic relationships of species and that molecular data and other approaches are not mutually exclusive, but rather complement each other.

1.4.1. Nuclear and chloroplast DNA
A plant cell has one nuclear and two organellar (chloroplast and mitochondrial) genomes. The biparentally inherited nuclear DNA is the fastest evolving among these three genomes (e.g., Wolfe et al., 1987; Wang et al., 2000). Recombination occurs frequently in nuclear genomes and the recombination rate varies considerably from locus to locus, depending on, for example, the chromosomal location of the gene. Recombination has an important role in the evolution of a species because it rearranges DNA sequences to generate new combinations of DNA molecules (Posada and Crandall, 2001). However, it also complicates phylogenetic studies by creating “mosaic genes” where different parts of a gene have different phylogenetic histories.

ADH genes are among the best-characterised nuclear genes in plants and have become model genes for studies of sequence variation (Gaut and Clegg, 1993a; Gaut and Clegg, 1993b; Innan et al., 1996; Bergelson et al., 1998; Savolainen et al., 2000), and phylogenetic studies at both high and low taxonomic levels (Gaut and Clegg, 1991; Gaut and Clegg, 1993a; Morton et al., 1996; Sang et al., 1997; Charlesworth et al., 1998; Miyashita et al., 1998; Gaut et al., 1999). Alcohol dehydrogenase (ADH) is an essential enzyme in anaerobic metabolism and its expression increases under oxygen stress as well as in response to cold in both Arabidopsis and Zea mays and to dehydration in Arabidopsis (Freeling and Bennett, 1984; Dolferus et al., 1994). Additionally, ADH may have a role during seedling development, fruit ripening, and pollen development. In the majority of flowering plants, two or three ADH loci have been identified, each containing ten exons and nine introns (e.g., Gaut and Clegg, 1991; Morton et al., 1996; Gaut et al., 1999). However, in Arabidopsis, ADH is a single copy gene, and consists of seven exons and six introns (Chang and Meyerowitz, 1986).

Chloroplast genome structure and variation have been studied extensively in plants. Compared to nuclear DNA, chloroplast DNA has some properties that make it especially useful in phylogenetic studies. The chloroplast genome is uniparentally, in most plant species maternally, inherited and does not undergo sexual recombination (Radetzky, 1990; Rajora and Dancik, 1992; Dumolin et al., 1995). Thus a phylogeny based on the chloroplast genome is not complicated by recombination. However, chloroplast markers also have some disadvantages, such as introgression of chloroplast genes from one species to another (Wendel and Doyle, 1998). Furthermore, due to uniparental inheritance, chloroplast DNA has a smaller effective population size than nuclear DNA. In monoecious species, such as birches, the effective population size for chloroplast genes is expected to be half of that for nuclear genes. Because of the smaller effective population size, the level of genetic variation is also expected to be smaller. The mutation rate of chloroplast DNA may, however, vary considerably depending on the parent (male, female) it is inherited from and this has to be taken into account when generalisations concerning many plant species are made (male mutation bias; Whittle and Johnston, 2002).

The chloroplast gene matK has been one of the most commonly used sequences for phylogenetic studies in plants (e.g., Wang et al., 1999; Wang et al., 2000; Cheng et al., 2000; Stanford et al., 2000; Soltis et al., 2001; Fukuda et al., 2001). In many cases it has been found to evolve more rapidly than another commonly analysed chloroplast gene, rbcL, and thus could be a better sequence candidate for clarifying relationships among closely related species (Wang et al., 1999).

The choice of molecular markers for phylogenetic studies can be very difficult. Both nuclear and chloroplast markers have advantages and disadvantages. Genes from two different genomes may have distinct phylogenies as a result of different inheritance pathways and differential responses to the processes discussed above. On the other hand, if different data sets give us similar trees, it will give us confidence that both trees reflect the same evolutionary history, and that the gene trees are congruent with the true species tree.

1.5. Nucleotide variation in plants
1.5.1. Nucleotide variation in Arabidopsis
Genetic variation within and between Arabidopsis populations has been studied with meristem identity, floral developmental and flowering time genes (e.g., Purugganan and Suddith, 1998, 1999; Kuittinen et al., 2002; Hagenblad and Nordborg, 2002; Le Corre et al., 2002; Olsen et al., 2002; Shepard and Purugganan, 2003; Olsen et al., 2004), genes encoding metabolic enzymes (e.g., Hanfstingl et al., 1994; Innan et al., 1996; Kawabe et al., 2000; Miyashita, 2001; Aguadé, 2001; Kuittinen and Aguadé, 2000; Miyashita, 2003; Yoshida et al., 2003), and pathogen resistance and defence genes (e.g., Kawabe et al., 1997; Kawabe and Miyashita, 1999; Stahl et al., 1999). Nucleotide diversity in these genes varies from 0.0006 to 0.0558. The level and pattern of DNA variation in the entire genome of Arabidopsis has been studied using the amplified fragment length polymorphism (AFLP) analysis (Miyashita et al., 1999). Nucleotide diversity for the entire genome was estimated to be 0.0106, which is within the range reported for specific nuclear genes.

Linkage disequilibrium, the nonrandom association of allelic polymorphisms, among polymorphic nucleotide sites has been observed both within and among genes (Nordborg et al., 2002; Hagenblad and Nordborg, 2002; Shepard and Purugganan, 2003), and in some cases decaying of linkage disequilibrium extended up to 250 kb (Nordborg et al., 2002; Hagenblad and Nordborg, 2002). Because of the rare occurrence of heterozygotes in Arabidopsis (Abbot and Gomez, 1989), the level of effective recombination could generally be expected to be low. However, even though some species-wide studies of nucleotide variation have revealed a low level of recombination within some nuclear loci (Kawabe et al., 2000), other loci have shown several recombination events in the history of the sample (Innan et al., 1996; Kawabe et al., 1997; Kawabe and Miyashita, 1999; Purugganan and Suddith, 1999; Kuittinen and Aguadé, 2000; Aguadé, 2001; Le Corre et al., 2002; Miyashita, 2003). Furthermore, studies on AFLP indicate that outcrossing does occur in this selfing species (Miyashita et al., 1999). Thus, studies on nuclear genes (e.g., Innan et al., 1996; Kuittinen and Aguadé, 2000; Aguadé, 2001) and AFLP (Miyashita et al., 1999) indicate that recombination events clearly have influenced the pattern of polymorphism in Arabidopsis.

A well-defined dimorphic haplotype structure with a clear separation into two highly differentiated haplotypes has been found in many Arabidopsis genes, such as ADH (Innan et al., 1996), ChiA and ChiB (Kawabe et al., 1997; Kawabe and Miyashita, 1999), Rpm1 (Stahl et al., 1999), FAH1 and F3H (Aguadé, 2001), TFL1 (Olsen et al., 2002), ACL5 (Yoshida et al., 2003), and CRY2 (Olsen et al., 2004). Dimorphism was, however, restricted to a few nucleotide differences at the CAL, AP3 and PI (Purugganan and Suddith, 1998, 1999), and CHI genes (Kuittinen and Aguadé, 2000) and there was no clear evidence for two major haplotypes in these genes. Unlike other regions where no clear haplotype structure was present or only two divergent sequence types were detected, the F18L15-130 region containing the receptor-like protein kinase gene seems to possess at least three divergent sequence types (trimorphism, Miyashita, 2003). The relatively high level of nucleotide variation in many of the studied nuclear regions is mostly caused by differences between these two divergent sequence types. Two different explanations have been proposed for the origin of divergent sequence types (allelic dimorphism): introgression from a related species, or fusion of previously isolated subpopulations of Arabidopsis itself (Innan et al., 1996).

1.5.2. Nucleotide variation in woody plants
Nucleotide variation in nuclear genes has been widely studied in herbaceous plants, especially in the selfing Arabidopsis, as described above, but there have been only a few published studies on nucleotide variation in woody plants (Dvornyk et al., 2002; Kado et al., 2003; García-Gil et al., 2003; Brown et al., 2004). On the basis of allozyme data, trees have been found to contain significantly more variation than herbaceous plants (Hamrick et al., 1992). The average genetic diversity within populations of woody plants was 0.148, which is 46 % higher than the mean for annual (0.101) and 51 % higher than the mean for perennial (0.098) herbaceous species. Based on earlier morphological and other studies with forest tree species there is also extensive variation within populations (Stern, 1964; Howland et al., 1995; Laitinen et al., 2000). However, at the DNA level, woody plants have shown a lower level of nucleotide variation and species divergence than herbaceous plants (Bousquet et al., 1992; Savard et al., 1993; Laroche et al., 1997; Andreasen and Baldwin, 2001).

Scots pine (Pinus sylvestris L.) is a long-lived predominantly outcrossing perennial, and its distribution area extends to most of the Eurasian continent. Scots pine has a large current population size. In earlier studies Scots pine has shown high diversity at isoenzyme, RFLP and microsatellite markers (Muona and Harju, 1989; Karvonen and Savolainen, 1993; Karhu et al. 1996). At the nucleotide level, variation in Scots pine has been studied in nuclear genes encoding phenylalanine ammonia-lyase (Pal1; Dvornyk et al., 2002) and phytochromes P and O (PHYP and PHYO; García-Gil et al., 2003). The overall silent variation (πs) for Pal1, PHYP, and PHYO loci was low, only 0.0049, 0.0024 and 0.0013, respectively (Dvornyk et al., 2002; García-Gil et al., 2003). Also the level of nonsynonymous variation (πa) for these three loci was very low. There was no linkage disequilibrium even between closely linked sites (Dvornyk et al., 2002). Instead, the level of recombination rate for Pal1 was high. Furthermore, the overall genetic differentiation between the populations of Scots pine was low, supporting the idea of large effective population size in this species.

Nucleotide diversity and linkage disequilibrium have been estimated among 19 loci in another long-lived outcrossing gymnosperm, loblolly pine (Pinus taeda L.) (Brown et al., 2004). The weighted average diversities at silent (πs) and nonsynonymous (πa) sites were 0.0064 and 0.0011, both rather low values. The decay of linkage disequilibrium was rapid and the observations suggested substantial recombination in the history of the sampled alleles.

The nucleotide variation of the sugi tree, Cryptomeria japonica, has been studied using several different nuclear loci (Kado et al., 2003). Cryptomeria japonica is a predominantly outcrossing and wind-pollinated species and the distribution of this species is restricted to Japan. The current population size of this species is small. Cryptomeria japonica has a long generation time and sometimes C. japonica individuals live more than a thousand years. The average nucleotide diversity for silent sites was 0.0038, which is similar to that in Scots pine. No apparent geographic differentiation was found among studied populations. The level of population recombination rate in C. japonica was low and this seems to be due to both low level of recombination and small population size.

1.5.2.1. Betula pendula
Silver birch, or European white birch (Betula pendula Roth/Betula verrucosa Ehrh.) is distributed throughout the northern temperate region (Atkinson, 1992). It is a wind-pollinated, outcrossing species with monoecious and diclinous flowers.
Birch, as a pioneer species, colonised Fennoscandia after the last glacial epoch, about 10 000 years ago, through the post-glacial migration of individuals from refugia located to the south-west, south and south-east of Finland (Huntley and Birks, 1983; Hyvärinen, 1987; Willis et al., 2000). Based on variation in chloroplast DNA, today’s silver birches in Europe can be classified into two main haplogroups, of which one is dominant in the north-west and the other in the south-east and east (Palmé et al., 2003). In Finland the southeastern/eastern haplogroup is the dominant one, representing about 70-90 % of the sample. Furthermore, the chloroplast data showed that most variation within B. pendula was found in central Europe, while the levels of variation in northern and southern populations were lower and very similar to each other.

As mentioned earlier, based on allozyme data, trees have been found to contain more variation than herbaceous plants (Hamrick et al., 1992). The average heterozygosity for silver birch was 0.141 (Rusanen et al., 2003), which is only slightly lower than the average of 0.148 presented by Hamrick et al. (1992). Earlier studies based on restriction fragment length polymorphism (RFLP) and random amplified polymorphic DNA (RAPD) analyses have shown a high degree of polymorphism within morphologically variable natural populations of B. pendula (Howland et al., 1995). Furthermore, intraspecific variation in secondary chemistry has been found to be considerable among and within genotypes of B. pendula (Keinänen et al., 1999b), and within a naturally regenerated B. pendula population (Laitinen et al., 2000; Laitinen et al., 2002). Resistance to insect herbivory also varied significantly among genotypes of B. pendula (Prittinen et al., 2003). These earlier observations, combined with the expected large current population size of silver birch, suggest that the level of nucleotide variation in this outcrossing, wind-pollinated species should be higher than that detected in herbaceous species, such as Arabidopsis.

2. AIMS OF THE STUDY

The objectives of this thesis were:

1. to isolate the genomic clone of the PI homologue of birch and analyse its structure.
2. to study the nucleotide variation in two naturally regenerated, 70-year-old silver birch populations by analysing the variation within a population, among the populations, and between different genes.
3. to study whether the post-glacial colonisation history is reflected in the distribution of variation within the populations of silver birch.
4. to study the phylogenetic relationships among species within the Betula genus.

3. MATERIALS AND METHODS

Only a brief outline of the materials and methods is given in this chapter. For instance, the polymerase chain reaction (PCR) conditions, primer sequences and details of sequence analyses are described in detail in the original papers (I-IV).

3.1. Plant materials

3.1.1. Plant material for the isolation and analysis of BpMADS2 (I)

For the isolation of total RNA, from which the first-strand cDNA was prepared, and for the Southern and Northern hybridisation, male and female inflorescences of silver birch (Betula pendula Roth) were collected from wild trees so that all the main developmental stages were represented. In addition, leaves and roots for Northern hybridisation were collected from 4-week-old in vitro grown B. pendula (clone JR ¼) seedlings. The samples were frozen and stored at –80 °C.

3.1.2. Plant material for the population studies (I, II)

The populations selected for the nucleotide variation studies were naturally regenerated B. pendula forests situated in Punkaharju, southeastern Finland (61º49´N, 29º19´E) and in Rovaniemi, northern Finland (66º20´N, 26º40´E). The forest stands were ca.
65-70 years of age and 20-25 m in height. Samples from 20 individuals were collected from both locations, and the 10 studied individuals were chosen randomly from these. Leaf samples of six additional B. pendula individuals representing different parts of the distribution area of the species were obtained from experiments located at the research station of the Finnish Forest Research Institute at Punkaharju, Finland, and from the seedlings growing in Joensuu Botanical Garden (Table 2). All leaf samples were frozen and stored at –80 °C.

Table 2. The eight locations where Betula pendula was sampled. The longitude and latitude do not in all cases correspond exactly to the sampling location but to the nearest town.

Country    Location      Code   Longitude   Latitude
Finland*   Punkaharju    P.1.   29°19´      61°49´
Finland*   Rovaniemi     R.1.   26°40´      66°20´
Finland    Karjalohja    KL     23°70´      60°20´
Russia     Novosibirsk   K      78°00´      53°30´
Russia     Kurgan        M      64°40´      55°00´
Russia     Orenburg      O      55°00´      52°30´
Germany    Harzburg      G      10°60´      51°80´
Italy      Lillaz        I      7°30´       45°70´

*Individuals added from populations Punkaharju and Rovaniemi (I).

3.1.3. Plant material for the phylogenetic studies (III, IV)

The species for the phylogenetic studies were chosen from all three major parts of the Betula range: Europe, Asia, and North America, and efforts were made to cover all the subgenera or sections of the genus Betula (Table 1). Leaves of 14 birch species representing five subgenera (de Jong, 1993) were collected either from individuals growing in botanical gardens, from a natural population of B. pendula at Punkaharju, Finland, or obtained from experiments located at the research station of the Finnish Forest Research Institute at Punkaharju, Finland. The seven additional B. pendula individuals were used to investigate the within-species variation in matK (III). The individuals included here came from different locations in Europe and were chosen for this study based on the different chloroplast haplotypes identified with PCR-RFLP (Palmé et al., 2003). Two additional species of the birch family (Betulaceae), Corylus avellana and Alnus incana, were included in studies III and IV as outgroup members. The C. avellana individual used as an outgroup in the ADH analysis (III) and the A. incana individual used as an outgroup in the BpMADS2 and BpFULL1 analyses (IV) were sampled in Joensuu Botanical Garden, Finland. The two C. avellana individuals used as outgroups in the matK analysis (III) were sampled in Halltorps Hage, Sweden and Montejo de la Sierra, Spain. All the samples were frozen in liquid nitrogen and stored at –80 °C.

3.2. Isolation and analysis of BpMADS2 and BpADH genes (I, III)

A partial cDNA clone of BpMADS2 was first isolated using PCR with partially degenerate primers as described in paper I. An almost full-length cDNA clone was isolated using PCR with a new, BpMADS2-specific primer together with an oligo d(T)-primer (I). The promoter region of BpMADS2 (9.4 kb), along with the missing 21 bp from the 5’-end of the coding region of BpMADS2, was isolated by screening a λFixII genomic library (obtained from Prof. J. Kangasjärvi, University of Helsinki) using the 3´end of BpMADS2 (533 bp, nucleotides 337-870) as the probe. The isolated promoter fragment was subcloned using PCR with the vector-specific forward primer and the gene-specific BpMADS2 reverse primer (5’-GCTTGTTCTTCTTGCTTGTGG-3’). The copy number of the BpMADS2 gene was studied using Southern hybridisation analysis (I). The expression pattern of the BpMADS2 gene was studied by Northern hybridisation analysis (I). Expression of the BpMADS2 gene in certain parts of the plant (roots, leaves, inflorescences) or different developmental stages of the male or female inflorescences was also studied with PCR using first-strand cDNA as a template (I).

The genomic clone of BpADH (3.1 kb) was first isolated from a λFixII genomic library using the partial cDNA of BpADH (nucleotides 55-417, accession number AJ279698, received from M. Korhonen, University of Helsinki) as the probe and then subcloned using PCR (III). The copy number of the BpADH gene was studied using Southern hybridisation analysis (III).

The sequence comparisons of the BpMADS2 and BpADH genes were mainly done using the GCG software package, program PileUp (release 10.0; Genetics Computer Group, Madison, WI, USA), and the GeneDoc (Nicholas and Nicholas, 1997) and ClustalX (Thompson et al., 1997) programs. The phylogenetic analyses of BpMADS2 and BpADH amino acid sequences were done using the Neighbour-joining algorithm (Saitou and Nei, 1987) with ClustalX software (I) or using the programs available in Phylip 3.5 (Felsenstein, 1993 (III)).

3.3. Isolation and analysis of BpMADS2, BpFULL1, and BpADH fragments (I, II)

Total DNA was extracted from young leaves of silver birch with the DNeasy Plant Mini kit (QIAGEN). Genomic fragments were isolated using PCR with gene-specific primers (I, II and III). The amplified PCR products of BpMADS2, BpFULL1 and BpADH were cloned, and positive clones were selected and sequenced. All sequence polymorphisms were visually rechecked from chromatograms. Most of the PCR fragments were also partly or fully rechecked with direct sequencing, and particular attention was paid to the microsatellite length variation. The only exceptions are seven B. pendula individuals from which two BpADH alleles were amplified (II). These alleles were only partly or not at all rechecked with direct sequencing.

Total DNA was extracted from leaves of six additional silver birch individuals from different parts of the distribution area (Table 2), and genomic fragments of the BpMADS2 5’end region were isolated as described in paper I (unpublished results). PCR products were cloned, and positive clones were selected and sequenced. The PCR fragments were completely rechecked with direct sequencing to obtain both alleles.

Nucleotide sequences of BpMADS2, BpFULL1 and BpADH were assembled using the GCG software package, program PileUp (release 10.0, Genetics Computer Group, Madison, WI, USA (I)) or the EMBOSS program package (release 2.4.1, The European Molecular Biology Open Software Suite (II)). The resulting sequences were aligned with the BioEdit (Hall, 1999) and GeneDoc (Nicholas and Nicholas, 1997) programs and refined visually. The polymorphism data were analysed using the program package DnaSP (version 3.5, Rozas and Rozas, 1999). Insertion/deletion (indel) and microsatellite length variation was not included in the estimates of nucleotide diversity. Microsatellite variation was analysed by comparing mean numbers of repeats between populations and haplotypes. Neighbour-joining (Kimura-2P distance measure; Saitou and Nei, 1987 (I)) or DNA parsimony trees (Heuristic search; Fitch, 1971 (II)) were constructed using programs available in ClustalX (Thompson et al., 1997) or Phylip (version 3.5, Felsenstein, 1993).

3.4. Phylogenetic studies (III, IV)

Total DNA was extracted from young leaves of 14 birch species and the C. avellana (III) and A. incana (IV) outgroups as described in paper III. Genomic fragments of BpMADS2-, BpFULL1-, and ADH-homologues were isolated using PCR with gene-specific primers (I, II, III). The amplified fragments were cloned, and positive clones were selected and sequenced. All sequence polymorphisms were visually rechecked from chromatograms. Most of the PCR fragments were rechecked with direct sequencing. The only exceptions are those gene regions from which two alleles of unequal lengths were amplified. Due to technical problems, these alleles were only partly or not at all rechecked with direct sequencing.

Nucleotide sequences were analysed using the GCG program package (release 10.0, program PileUp, Genetics Computer Group, Madison, WI, USA) or the EMBOSS program package (release 2.4.1, The European Molecular Biology Open Software Suite). The resulting sequences were aligned with the BioEdit (Hall, 1999), GeneDoc (Nicholas and Nicholas, 1997) and ClustalX (Thompson et al., 1997) programs, and refined visually. Nucleotide and haplotype diversity analyses (III) were conducted using the program package DnaSP (Rozas and Rozas, 1999). The presence of recombination and/or gene conversion among ADH sequences (III) was tested with the program Geneconv v. 1.81 (Sawyer, 1989; 1999).

The phylogenetic trees of the nuclear DNA sequences were inferred using two methods: maximum parsimony using heuristic search and maximum likelihood as implemented in the program Phylip 3.5 (Felsenstein, 1993). The reliability of the trees was tested using bootstrapping. Under the assumption that all studied nuclear data sets used for phylogenetic analyses (III, IV) share a common evolutionary history, the data sets were combined. Phylogenetic analysis of the combined data set was conducted using the maximum parsimony, maximum likelihood and Neighbour-joining methods (IV).

4. RESULTS

4.1. Isolation and analysis of the BpMADS2 gene (I)

The partial cDNA clone of BpMADS2 was first isolated using PCR with degenerate primers. The corresponding almost full-length cDNA clone was isolated using PCR with the gene-specific and oligo d(T)-primers. A genomic clone containing the missing nucleotides from the 5’-end of the coding region of BpMADS2 and a 9-kb fragment upstream of the cDNA clone was isolated by screening the genomic library with the cDNA as the probe. The isolated BpMADS2 gene was sequenced and the sequence was used in sequence comparisons and phylogenetic analyses. According to the sequence comparisons, at both nucleotide and amino acid level, BpMADS2 is most similar to the Arabidopsis and Antirrhinum majus B-function genes PI and GLOBOSA (GLO).

Hybridisation analyses were conducted to study the copy number and the expression pattern of the BpMADS2 gene. Southern hybridisation revealed that there was only one genomic fragment hybridising with the BpMADS2 probe. This result indicates that BpMADS2 is a single-copy gene in birch. The localisation of BpMADS2 expression was studied using Northern hybridisation analysis. BpMADS2 was expressed in male inflorescences and, at the early stages of development, also in female inflorescences, but not in vegetative tissues (roots, leaves, and shoots). At the early developmental phases, the expression in male inflorescences was weak. At the later developmental phases the expression became stronger, and the strongest expression was detected in the late developmental phase of male inflorescences, before flower opening. No expression was detected in later stages of the female inflorescence development. PCR analysis confirmed that BpMADS2 was expressed in male and young female inflorescences but not in older female inflorescences, or in vegetative tissues.

The members of the PI lineage can be further distinguished from other lineages, especially from the otherwise very similar AP3 lineage, by diagnostic sequences at the K domain and at the C-terminal end of the predicted protein. The PI-like genes, including e.g. GLO from Antirrhinum majus and MdPI from Malus domestica, typically code for the consensus sequence MPFxFRVQPxQPNLQE (PI motif) at the C-terminal end of the protein, whereas AP3-like genes code for a different sequence, D(L/I)TTFALLE (euAP3 motif in higher eudicots) or YGxHDLRLA (paleoAP3 motif in most of the Ranunculidae and the magnoliid dicots) (Kramer et al., 1998). The members of the PI clade are highly conserved also at the K domain. This region displays a consensus sequence KHExL. The comparable sequence in the K box of the AP3 homologues is (H/Q)YExM. Both of these highly conserved PI motifs are found in BpMADS2, and especially in the PI motif the homology between the consensus sequence and BpMADS2 was high (thirteen amino acids out of fourteen possible were identical, Fig. 2).

[Figure 2: sequence alignment. Rows: FBP1, NtGLO, GLO, PMADS2, SLM2, AiMADS2, BpMADS2, MdPI, PI, EGM2, OsMADS2, OsMADS4, PrDGL, Consensus. Marked region: PI motif.]

Figure 2. Alignment of C-terminal PI-motif regions of the predicted protein sequences analysed in this study (I, IV). The names of genes cloned in this study are highlighted in bold and the consensus is shown below. FBP1, Petunia hybrida, acc.no. M91190 (Angenent et al., 1992); PMADS2, Petunia hybrida, acc.no. X69947 (Kush et al., 1993); NtGLO, Nicotiana tabacum, acc.no. X67959 (Hansen et al., 1993); GLO, Antirrhinum majus, acc.no. S28062 (Trobner et al., 1992); SLM2, Silene latifolia, acc.no. X80489 (Hardenack et al., 1994); MdPI, Malus domestica, acc.no. AJ291490 (Yao et al., 2001); PI, Arabidopsis thaliana, acc.no. D30807 (Goto and Meyerowitz, 1994); EGM2, Eucalyptus grandis, acc.no. AF029976 (Southerton et al., 1998); OsMADS2, Oryza sativa, acc.no. L37526 (Chung et al., 1995); OsMADS4, Oryza sativa, acc.no. L37527 (Chung et al., 1995); PrDGL, Pinus radiata, acc.no. AF120097 (Mouradov et al., 1999).

4.1.1. Identification of the putative regulatory elements in the BpMADS2 promoter

In order to identify the putative regulatory elements in the BpMADS2 promoter, a 9-kb fragment upstream of the BpMADS2 first ATG codon (putative translation start site) was isolated and sequenced. Because the earlier results with Arabidopsis B-function genes PI and AP3 have shown that a much shorter promoter region is sufficient to confer both the initiation and the maintenance of PI and AP3 expression patterns (Tilly et al., 1998; Chen et al., 2000; Honma and Goto, 2000), a shorter, 3-kb fragment was selected for further analysis (unpublished results). The promoter sequence of BpMADS2 was analysed using the PLACE database, which contains previously published sequence motifs found in plant cis-acting regulatory elements (Higo et al., 1999). The sequence of the BpMADS2 promoter was further compared to the functionally defined PI and AP3 regulatory elements (Tilly et al., 1998; Chen et al., 2000). The 3-kb promoter region contained one putative sequence motif (from site –2204 to site –2195) that resembles the MADS domain protein consensus binding sites, known as the CArG box (unpublished results). No other putative CArG boxes were detected within this 3-kb promoter region. In addition to this, several other shorter putative regulatory elements could be identified in comparisons between the promoter region of BpMADS2 and the PLACE database (data not shown).

The 3-kb promoter region of BpMADS2 was further compared to the functionally defined PI promoter elements. Since the BpMADS2 promoter has not been functionally dissected, identities of fewer than eight consecutive nucleotides were ignored to minimise the possibility of their occurring by chance. The BpMADS2 promoter does show some similarities to the PI promoter: a consecutive stretch of ten nucleotides (–477 to –468, CAAAAGCAAG) corresponds to the positive late element 1 (PLE1; Chen et al., 2000) found in the PI promoter, and nine of these nucleotides were also identical with the corresponding region in the AP3 promoter. For PI late element 2, an 11-nucleotide motif (TTAAGAAAGTA), 10 nucleotides were identical in the BpMADS2 promoter (nucleotides –250 to –240). However, the BpMADS2 promoter did not contain regions showing significant similarities to PI positive late element 3 (PLE3) or the positive early element (PEE).

4.2. Isolation and analysis of the BpADH gene (III)

The genomic clone of BpADH was isolated by screening the genomic library using the BpADH cDNA clone as the probe. According to the nucleotide sequence comparisons, BpADH belongs to the same group as the Arabidopsis gene ADH1 (Chang and Meyerowitz, 1986). According to phylogenetic analyses, BpADH clusters along with other dicot ADH homologues. Southern hybridisation revealed that there were two genomic fragments hybridising with the BpADH probe, which indicates that there might be at least two ADH genes in birch. The gene expression of BpADH in birch has not been tested yet.

Table 3. Summary of nucleotide polymorphism in BpMADS2, BpFULL1 and BpADH loci of silver birch.

Sequence      πs        πa        πtotal
BpMADS2       0.0043    0.00052   0.0045 (a)
BpFULL1       0.0134    0         0.0109
BpADH         0.0117    0.0015    0.0078
BpMADS2*      0.00334   0         0.00283
Mean values   0.00818   ND        0.00682

πs from overall synonymous sites (third position of codons and noncoding regions); πa from nonsynonymous sites; ND, not determined.
(a) Regions I and II of the BpMADS2 gene, cloned alleles combined.
* Region I of the BpMADS2 gene, eight B. pendula individuals from different parts of the distribution area.

4.3. Variation within the two B. pendula populations (I, II)

4.3.1. Nucleotide polymorphism

Nucleotide variation of the BpMADS2 gene was studied in two separate regions (Fig. 3), because the whole gene would have been too long and difficult to amplify as one fragment (I). Region I of the BpMADS2 gene comprised the 3´end of the MADS-box, the I-region, the 5´end of the K-box, and two introns. Region II comprised most of the C-terminal region, one longer intron, and some of the 3´untranslated region.

The detected level of diversity in the BpMADS2 gene was low. At Region I of the BpMADS2 locus there were five segregating sites among 40 alleles and 770 bp sequenced. At Region II of BpMADS2 there were 42 silent segregating sites in addition to only one nonsynonymous site among 20 alleles and 1680 bp sequenced. The overall silent variation (πs) for BpMADS2, including third position of codons and noncoding regions, was 0.0043 and the estimate of nonsynonymous variation (πa) was only 0.00052 (summarised in Table 3). Intragenic recombination was detected in both populations in Region II of the BpMADS2 gene, but no intragenic recombination has occurred in Region I (Table 4). Instead, significant linkage disequilibria were detected within Region I, but also in Region II between one pair of sites in population Punkaharju.

Figure 3. The genomic regions of BpMADS2 and BpFULL1 used in sequence variation and phylogenetic analysis.

Nucleotide variation of the BpFULL1 gene was studied in a region which comprised the 3´end of the K-box, most of the C-terminal region, and four introns (Fig. 3, II). The detected level of diversity of the BpFULL1 gene was much higher than in the BpMADS2 gene. At the BpFULL1 gene there were 38 segregating sites among the 20 alleles and 1217 bp sequenced. The overall silent variation, πs, was 0.0134 (Table 3). There was no nonsynonymous variation in the coding region of BpFULL1, and the only polymorphism within the coding region was a synonymous substitution of T to C. No evidence for linkage disequilibrium was detected. Instead, intragenic recombination has occurred within both populations, and the value of RM per informative site varied from 0.10 to 0.15 (Table 4).

The region selected to study nucleotide variation of the BpADH gene (II) covers portions of five exons and four introns (Fig. 4), and corresponds to nucleotides 642-1770 of the Arabidopsis ADH sequence (Miyashita et al., 1996). Unlike the BpMADS2 and BpFULL1 genes, where only one fragment was amplified from all individuals with the primers used (I, II), an additional, about 450 bp longer fragment was amplified from seven silver birch individuals. The length variation between these two alleles was due to one long indel. For the polymorphism analyses only one allele per individual was chosen from these seven individuals with two BpADH alleles. At BpADH there were 33 silent polymorphisms in addition to the six nonsynonymous sites. The detected level of overall silent variation of the BpADH gene was very similar to BpFULL1 (II) and much higher than in BpMADS2 (I, Table 3). The estimate of the nonsynonymous nucleotide diversity (πa) in BpADH (0.0015), however, was much higher than the estimates for BpMADS2 and BpFULL1. Intragenic recombination has occurred within both populations, and the value of RM per informative site was from 0.04 to 0.06 (Table 4). No evidence for linkage disequilibrium was detected.

Figure 4. The genomic region of BpADH used in sequence variation and phylogenetic analysis.

Table 4. Summary of statistics for intragenic recombinations at the BpFULL1, BpADH, and BpMADS2 genes.

Gene region          Population   Length*   RM   Informative sites   RM/informative sites   C        C per gene
BpFULL1              Punkaharju   1173      5    33                  0.1515                 0.0102   12.1
BpFULL1              Rovaniemi    1061      3    30                  0.1000                 0.0837   96.1
BpADH                Punkaharju   1059      2    31                  0.0645                 0.0099   11.0
BpADH                Rovaniemi    1058      1    28                  0.0357                 0.0015   1.7
BpMADS2, Region I    Punkaharju   745       0    5                   0.0000                 0.0006   0.5
BpMADS2, Region I    Rovaniemi    747       0    4                   0.0000                 0.0000   0.001
BpMADS2, Region II   Punkaharju   1662      2    31                  0.0645                 0.0045   7.6
BpMADS2, Region II   Rovaniemi    1662      3    29                  0.1034                 0.0404   67.7

RM, the minimum number of recombination events by Hudson and Kaplan (1985); C (or R), the estimator of the population recombination rate per site (4Nr) by Hudson (1987).
* Number of sites excluding sites with alignment gaps.
BpFULL1, BpADH and BpMADS2 (Region II) fragments: 10 individuals per population; BpMADS2 (Region I): 20 individuals per population. BpMADS2, Regions I and II: calculated from the data of Järvinen et al., 2003.

Both alleles of Region I of the BpMADS2 gene were sequenced from six additional silver birch individuals (Table 2) from different parts of the distribution area, and one individual from each of the populations Punkaharju and Rovaniemi was added to this data set to give a total of 16 alleles (unpublished results). A total of four segregating sites and two microsatellite polymorphisms were detected within the 16 alleles (Table 5).
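As a rough cross-check of the summary statistics for this 16-allele data set, the site patterns of Table 5 can be tallied directly. The sketch below is illustrative only: the allele strings are transcribed by hand from Table 5 (sites 183, 514, 618 and 634, consensus C, C, G, C), microsatellite lengths are left out, and the function names are hypothetical rather than taken from DnaSP. Haplotype diversity is computed in the usual unbiased form H = n/(n - 1) (1 - Σ p_i²), and r² is the squared allelic correlation for a pair of biallelic sites.

```python
from collections import Counter

# Alleles at sites 183, 514, 618 and 634, transcribed from Table 5.
# The transcription (and the column alignment it relies on) is an assumption.
ALLELES = {
    "P.1.a": "CCGC", "P.1.b": "CCGC", "R.1.a": "ATTT", "R.1.b": "CCGC",
    "KL.a": "CCGC", "KL.b": "ATCT", "K.a": "ATCT", "K.b": "ATCT",
    "M.a": "ATCT", "M.b": "CCGC", "O.a": "CCGC", "O.b": "CCGC",
    "G.a": "CCGT", "G.b": "ATCC", "I.a": "ATCT", "I.b": "CCGC",
}
seqs = list(ALLELES.values())


def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def haplotype_diversity(seqs):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    sum_p2 = sum((count / n) ** 2 for count in Counter(seqs).values())
    return n / (n - 1) * (1 - sum_p2)


def r_squared(seqs, i, j):
    """Squared allelic correlation between two biallelic sites i and j (0-based)."""
    if len({s[i] for s in seqs}) != 2 or len({s[j] for s in seqs}) != 2:
        raise ValueError("both sites must be biallelic")
    n = len(seqs)
    a, b = seqs[0][i], seqs[0][j]          # reference alleles
    p_a = sum(s[i] == a for s in seqs) / n
    p_b = sum(s[j] == b for s in seqs) / n
    p_ab = sum(s[i] == a and s[j] == b for s in seqs) / n
    d = p_ab - p_a * p_b                   # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


print(segregating_sites(seqs))               # 4
print(len(set(seqs)))                        # 5 distinct haplotypes
print(round(haplotype_diversity(seqs), 3))   # 0.683
print(round(r_squared(seqs, 0, 3), 3))       # sites 183 and 634
```

Under this transcription the tally gives four segregating sites, five haplotypes and H of about 0.683. Site 618 carries three states, so r² as defined here applies only to the other three sites.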
All of the segregating sites were located within the introns. The nucleotide diversity (πtotal) for the entire region was 0.00283. The overall silent variation, πs, including third positions of codons and noncoding regions, was 0.00334 (Table 3). The total number of haplotypes among the 16 BpMADS2 sequences was five, and the haplotype diversity, H (Nei, 1987, pp. 259-260), was 0.683 ± 0.091. Significant linkage disequilibrium was detected between three pairs of sites [(183, 514), (183, 634), (514, 634)]. The estimated minimum number of recombination events, RM, was one, indicating that intragenic recombination has occurred, and the value of RM per informative site was 0.25.

In estimating the overall level of nucleotide diversity (πtotal) and the mean level of silent site nucleotide diversity for these loci, all studied gene regions from both populations were aligned together (cloned alleles). The overall level of nucleotide diversity (πtotal) for silver birch was 0.00682 (Table 3, unpublished results). The mean value of πs (third positions of codons and noncoding regions) for the combined data set was 0.00818.

Because a larger sample size increases the power of detecting significant linkage disequilibrium, linkage disequilibrium was surveyed for nucleotide polymorphisms within and between the studied genes using the combined data set. The amount of linkage disequilibrium was estimated using the r² statistic (Hill and Robertson, 1968) for polymorphic sites, and the significance of pairwise disequilibrium comparisons was assessed with Fisher’s exact test. Strong levels of intragenic disequilibrium were observed only within the BpADH gene (Fig. 5). No intergenic disequilibrium was observed even between the nearest sites. These results indicate that the decay of linkage disequilibrium is especially rapid in these three studied silver birch genes.

Figure 5. Average rate of decay of linkage disequilibrium, measured by the correlation coefficient between nucleotide sites (r²), in silver birch based on three nuclear genes. [Plot of r² (0.0 to 1.0) against distance between sites (0 to 5000 bp).]

Table 5. Polymorphic nucleotide sites among 16 alleles (8 individuals) in Region I of the BpMADS2 gene in Betula pendula individuals collected from different parts of the distribution area. Only differences from the consensus sequence are shown. Dots indicate identity with the consensus sequence. The positions of the polymorphic sites in two different introns are indicated at the top. a, cloned allele; b, allele from direct sequencing.

                 Intron I              Intron II
Individual       63-110    153-202
and allele       (CT)n     (CT)n       183   514   618   634
Consensus                              C     C     G     C
P.1.a.           13        14          .     .     .     .
P.1.b.           13        14          .     .     .     .
R.1.a.           24        25          A     T     T     T
R.1.b.           14        18          .     .     .     .
KL.a.            14        18          .     .     .     .
KL.b.            16        18          A     T     C     T
K.a.             17        20          A     T     C     T
K.b.             17        20          A     T     C     T
M.a.             13        19          A     T     C     T
M.b.             14        19          .     .     .     .
O.a.             13        13          .     .     .     .
O.b.             13        13          .     .     .     .
G.a.             14        15          .     .     .     T
G.b.             18        20          A     T     C     .
I.a.             14        20          A     T     C     T
I.b.             13        15          .     .     .     .

4.3.2. Dimorphism of haplotypes

In all three studied nuclear genes, there was a suggestion of two allele classes, and in the BpMADS2 and BpFULL1 genes the more frequent allele class represented about 75 % of the whole sample (I, II). However, in BpFULL1 the allele class A comprised 55 % of the sequences and the remaining 45 % of the alleles formed the second group in the phylogeny (II). The two classes of BpMADS2 alleles were also present in silver birch individuals collected from different parts of the distribution area (Fig. 6, unpublished results). Especially in Region I of BpMADS2 the presence of significant linkage disequilibrium confirmed the strong dimorphism in this region (I). The two main allele types also showed some differences at the microsatellite level. In Region I of the BpMADS2 gene the haplotype B was associated with the longer microsatellites, especially in population Rovaniemi (I). In the BpFULL1 gene the haplotype B was associated with the long (TC)n repeat, especially in population Punkaharju (II). In Region II of BpMADS2 and in BpADH none of the microsatellites were associated with the two different allele classes (I, II).

Although all the studied gene regions exhibited traces of allelic dimorphism, the two allele classes in each gene region were differentiated from each other only by a limited number of nucleotide sites (I, II, unpublished results). Thus, even though the studied genes displayed allelic dimorphism, which in some parts of the genes was maximal, it seems likely that allelic dimorphism has already disappeared or will gradually disappear from the silver birch nuclear genome.

4.3.3. Divergence between populations and demographic equilibrium

In all three genes the level of genetic differentiation between the two populations was low. In Region I of BpMADS2 and in BpADH the genetic differentiation between the two populations was very low (FST = –0.0226 and –0.0453, respectively (I, II)). Region II of BpMADS2 and BpFULL1 showed some, but not significant, genetic differentiation between the two populations (FST = 0.0930 and 0.1082, respectively (I, II)).

Figure 6. Gene genealogy of BpMADS2 Region I alleles from eight individuals collected from different parts of the distribution area, with B. ermanii as the outgroup. All nodes with < 50% bootstrap support are collapsed; the other bootstrap values are indicated next to the relevant nodes.

Tajima’s D test (Tajima, 1989) was used for testing the fit of the frequency distribution to the neutral expectation. Among the three studied genes, two regions have negative values of Tajima’s D, and when the cloned alleles of Regions I and II of the BpMADS2 gene are combined, this region also has a negative value of Tajima’s D, indicating an excess of low-frequency polymorphisms within these loci (I, II).
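As an illustration of the statistic, the following is a minimal sketch of Tajima’s D computed from an aligned, gap-free set of sequences using the constants given by Tajima (1989). The function name and the toy alignments are made up for this example; the sketch is not the software used in the original studies and it does not handle gaps or missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(alignment):
    """Return Tajima's D for a list of equal-length, gap-free sequences."""
    n = len(alignment)
    if n < 4:
        # The variance term is zero for n < 4, so D cannot be computed.
        raise ValueError("need at least four sequences")
    columns = list(zip(*alignment))
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for col in columns if len(set(col)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences over all sequence pairs.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(x, y))
        for x, y in combinations(alignment, 2)
    ) / n_pairs
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignments (hypothetical data, for illustration only).
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]   # every variant is a singleton
balanced_variants = ["AAA", "AAA", "TTT", "TTT"]   # every variant is at 50 %
print(tajimas_d(rare_variants) < 0)      # True: excess of low-frequency variants
print(tajimas_d(balanced_variants) > 0)  # True: excess of intermediate-frequency variants
```

In this sketch a negative D arises when the mean number of pairwise differences falls below S/a1 (an excess of rare variants), and a positive D when it exceeds S/a1 (an excess of intermediate-frequency variants).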
This excess of low-frequency polymorphisms may reflect the postglacial expansion of the species. However, the obtained values were not statistically significant, and, furthermore, in the presence of recombination the test is conservative. In contrast, both Region I of the BpMADS2 gene and the BpFULL1 gene have positive values of Tajima’s D, especially in population Punkaharju. Positive values of Tajima’s D are associated with an excess of intermediate-frequency polymorphisms. This kind of pattern can arise in different situations. Balancing selection that maintains variation will increase the proportion of alleles at intermediate frequencies. On the other hand, if previously isolated, somewhat differentiated populations are fused, the proportion of alleles at intermediate frequencies is again increased. Thus, the most likely explanation for the obtained positive values may be the fusion of previously isolated subpopulations of silver birch.

4.3.4. Estimates of polymorphism and divergence

The MK test of McDonald and Kreitman (1991) was conducted for detecting selection (II). The BpADH gene had two synonymous and six nonsynonymous polymorphisms within B. pendula, compared to three synonymous and three nonsynonymous fixed differences between B. pendula and B. ermanii. The total number of polymorphic sites was larger than that of fixed sites, reflecting the high polymorphism and low divergence. However, the result was not statistically significant (P = 0.58). BpFULL1 had only one synonymous polymorphism and no nonsynonymous polymorphisms within B. pendula, compared to one nonsynonymous change between B. pendula and B. ermanii. BpMADS2 had only one nonsynonymous polymorphism and no synonymous polymorphisms within B. pendula, compared to one nonsynonymous change between B. pendula and B. ermanii. Due to the lack of nonsynonymous variation the test did not have power in the BpFULL1 and BpMADS2 genes, and the obtained results were not statistically significant.

The HKA test of Hudson, Kreitman and Aguadé (1987) was conducted to examine whether the level of polymorphism in the BpMADS2, BpFULL1 and/or BpADH regions is statistically higher than that of divergence compared to the other loci (II). No significant discrepancy in the levels of polymorphism and divergence was detected. This pattern suggests that the evolutionary dynamics of these three genes do not differ significantly from each other. However, the regions studied, especially in the BpMADS2 and BpFULL1 genes, contained only a few hundred nucleotides of coding sequence and only a few nonsynonymous polymorphisms, and due to the lack of nonsynonymous variation this test had low power.

4.4. Birch phylogeny (III, IV)

The phylogenetic relationships within the genus Betula (Betulaceae) were studied utilising two flower-specific silver birch genes, BpMADS2 and BpFULL1 (I; Elo et al., 2001), the ADH gene, and the chloroplast matK gene and parts of its upstream and downstream flanking regions.

The Region I of the BpMADS2 gene that was sequenced covers portions of three exons and two introns. From 13 birch species a single product that ranged in size from 757 to 785 bp was amplified (IV). From B. ermanii two fragments were amplified, one being 666 bp (putative pseudogene) and the other 793 bp in length. Thirty sites out of 737 analysed were variable, and of these 15 were parsimony informative. The variation in this region with 15 parsimony informative characters divided the genus Betula into three main groups in the maximum likelihood (ML) tree: B. lenta and B. alleghaniensis formed the first group, B. nana, B. papyrifera, B. maximowicziana, B. ermanii, B. humilis, and B. fruticosa the second one, and B. resinifera, B. platyphylla, B. pubescens, B. populifolia, B. schmidtii, and B. pendula the third group. When gaps were excluded from the data set, the maximum parsimony (MP) method recovered the same three main groups as the ML method, but clustered B. nana as a sister species of group III. When gaps were included in the data set, the MP method recovered the same three main groups as the ML and MP (gaps excluded) methods, but clustered B. papyrifera and B. ermanii into group III.

Region II of BpMADS2 comprises most of the C-terminal region, one longer intron, and some of the 3’ untranslated region. A single fragment ranging in size from 1602 to 1701 bp was amplified from all 14 species (IV). Out of the 1443 sites analysed 182 were variable. Of these 91 were parsimony informative. The variation in Region II of the BpMADS2 homologues with 91 parsimony informative characters divided the studied birch species into three main groups with both ML and MP methods, and with both data sets (gaps included/excluded). B. lenta and B. alleghaniensis formed the first group, B. maximowicziana, B. ermanii, B. humilis and B. fruticosa the second one, and all of the remaining species the third one.

The region of the BpFULL1 gene that was sequenced covers portions of five exons and four introns. A single product that ranged in size from 1082 to 1190 bp was amplified from all 14 species (IV). A hundred and twenty-one sites out of 957 analysed were variable and of these 42 were parsimony informative. The variation in BpFULL1 sequences with 42 parsimony informative characters divided the 14 birch species into four groups with both ML and MP methods and with both data sets (gaps included/excluded). As with the earlier sequence regions, B. lenta and B. alleghaniensis formed the first group, B. maximowicziana, B. ermanii, B. humilis and B. fruticosa the second, and B. schmidtii, B. resinifera, B. pendula, B. pubescens and B. platyphylla the third one, but now B. nana, B. populifolia and B. papyrifera formed a fourth group of their own.

The sequenced ADH region comprises portions of five exons and four introns. A single product that ranged in size from 1061 to 1073 bp was amplified from 11 birch species (III). Two fragments were amplified from B. pubescens and B. papyrifera, one being ~1070 bp, the other ~1500 bp. From B. nana a ~1500 bp fragment was amplified. The length variation between the short (~1070 bp) and long (~1500 bp) ADH alleles was due to one long indel in intron 1 in the region used. Out of the 1037 sites analysed 82 were variable, and of these 44 were phylogenetically informative. The ADH variation with 44 parsimony informative characters divided the genus Betula into three main groups with both MP and ML methods and with all data sets, although the support for different groups differed among methods. The first group in all trees was formed by B. fruticosa, B. humilis, B. ermanii, and the short allele of B. pubescens, the second one by B. maximowicziana, B. lenta and B. alleghaniensis, and the third group by the remaining species. The only notable difference between the MP tree (gaps included) and the ML tree, and also the MP trees when gaps were ignored or considered as a single character, was the placement of the long allele of B. papyrifera, which in the MP tree (gaps included) clustered with the other long alleles (B. nana and B. pubescens), and in the ML tree and the MP trees when gaps were ignored or considered as a single character, with the short allele of B. papyrifera.

The variation in matK sequences with only five parsimony informative characters divided the Betula species into two groups with both methods: one including the American species B. lenta, B. alleghaniensis and B. papyrifera and the other containing the remaining species (III). The percentages of parsimony informative sites in all gene regions are summarised in Table 6.

Table 6. Number of polymorphic sites within BpMADS2, BpFULL1, ADH and matK gene regions.

                                                  BpMADS2     BpMADS2
                                                  Region I    Region II   BpFULL1   ADH     matK
No. of analysed sites                             737         1443        957       1037    2431
No. of parsimony informative sites                15          91          42        82      5
Percentage of parsimony informative sites (%)     2.0         6.3         4.4       7.9     0.2

Under the assumption that all four nuclear data sets (Regions I and II of BpMADS2, BpFULL1 and ADH) share a common evolutionary history, the sequence data sets were combined to give a total of 206 parsimony informative characters (IV). With ML and NJ methods this combined data set divided the birch species into four groups: B. lenta and B. alleghaniensis formed the first group, B. maximowicziana, B. ermanii, B. humilis and B. fruticosa the second, B. schmidtii, B. resinifera, B. pendula, B. pubescens, B. populifolia and B. platyphylla the third and B. nana and B. papyrifera the fourth group. With the MP method the tree topology was very similar to that obtained with the ML and NJ methods, but now B. nana and B. papyrifera clustered with the species in group III (Fig. 7).

Figure 7. The unrooted maximum parsimony consensus tree based on the combined sequence data set (ADH, BpMADS2 5’ end, BpMADS2 3’ end and BpFULL1). The bootstrap values are based on 1000 resamplings, and bootstrap values ≥ 50% are indicated next to the relevant nodes. [Tree labels: Groups I to III; subgenera Chamaebetula, Neurobetula, Betula, Betulaster and Betulenta sensu de Jong (1993).]

and Irish, 2003). It may also contain activation domains, or may be subject to posttranslational modifications that may influence DNA binding specificity, subcellular localisation or the ability to attract interacting partners (Cho et al., 1999; Egea-Cortines et al., 1999; Vandenbussche et al., 2003).
The importance of the PI motif for the function of the BpMADS2 gene has not been studied yet. However, the high degree of conservation of the PI motif throughout the members of the PI lineage (both monocots and dicots) suggests that it has a critical function also in other plant species, including birch.

5. DISCUSSION

5.1. BpMADS2 is the PI homologue of birch

We have been studying the genetic regulation of flower development of birch (e.g., Elo et al., 2001; Lemmetyinen et al., 2001; Lemmetyinen et al., 2004). Our long-term aim has been to find out the regulatory chains leading from the determination of the inflorescence meristem to the determination of the identity of flower organs, and to use this information to develop a method to prevent flower formation. For these purposes we have isolated from silver birch (Betula pendula Roth) several genes or cDNAs apparently involved in the regulation of flower development. One of the aims of my study was to isolate putative birch B-function genes, which are important for the development of petals and stamens.

The sequence comparisons and phylogenetic analysis show that BpMADS2 is a member of the PI clade (I). At the early stages of flower development PI is expressed in the second, third and fourth whorls of the developing flower in Arabidopsis (Goto and Meyerowitz, 1994). At the later stages PI is no longer detected in whorl four. High expression in male inflorescences having flowers with stamens and tepals, and low or absent expression in female inflorescences having flowers consisting only of carpels, further supports the notion that BpMADS2 is a B-function gene (I). The members of the PI lineage can be distinguished by two diagnostic sequences at the K domain and at the C-terminal end of the predicted protein (Kramer et al., 1998). The PI-like genes typically code for the consensus sequence MPFxFRVQPxQPNLQE at the C-terminal end of the protein (the so-called PI motif).
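As a small illustration, a consensus of this kind can be searched for in a predicted protein sequence with a regular expression. The sketch below assumes that "x" in the consensus stands for any residue and requires an exact match at every other position, which is stricter than a consensus implies, since real PI-like proteins may deviate from it. The helper names and the toy peptide are hypothetical and are not taken from the original studies.

```python
import re


def consensus_to_regex(consensus):
    """Translate a consensus string into a regex; 'x' matches any residue."""
    return re.compile(
        "".join("." if ch == "x" else re.escape(ch) for ch in consensus)
    )


# PI motif consensus as quoted in the text (Kramer et al., 1998).
PI_MOTIF = consensus_to_regex("MPFxFRVQPxQPNLQE")


def find_motif(protein, pattern=PI_MOTIF):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


# Made-up peptide for illustration only; this is not the BpMADS2 sequence.
toy_peptide = "AAAMPFAFRVQPLQPNLQEKK"
print(find_motif(toy_peptide))  # [(3, 'MPFAFRVQPLQPNLQE')]
```

A position-specific scoring approach would tolerate partial matches better than this strict pattern; the regular expression is used here only because it maps directly onto the written consensus.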
The members of the PI lineage are highly conserved also at the K domain. This region displays a consensus sequence of KHExL. Both of these motifs were found also from the predicted BpMADS2 protein (Fig. 2), further supporting the assumption that BpMADS2 gene is PI-homologue of birch (I). In Arabidopsis the C-terminal motif of the PI gene is essential for the formation and/or maintenance of higher-order transcriptional complexes (Egea-Cortines et al., 1999; Lamb 5.1.1. Identification of the putative BpMADS2 regulatory regions (unpublished results) In order to identify putative regulatory regions and/or elements in the BpMADS2 promoter, a 9-kb fragment of genomic DNA upstream of the BpMADS2 coding region was isolated and sequenced (I, unpublished results). Because all the major regulatory elements of the PI promoter are known to lie within the 1.5-kb region upstream of the transcription initiation site (Chen et al., 2000; Honma and Goto, 2000), shorter, only a 3-kb fragment of the BpMADS2 promoter was chosen to be examined in more detail (this region has also been used for the BpMADS2::BARNASE construct discussed later). The promoter regions of BpMADS2 and PI genes in general are very different from each other and do not allow an easy recognition of regulatory elements or important areas. The 1.5 kb promoter region of the PI gene contains all the major regulatory elements for the spatial and temporal expression of the gene and can be split into two regions (Honma and Goto, 2000). The distal region (from site -1458 to site –301) promotes the initial expression of PI in response to induction signals, and the proximal region (from site -300 to site +1) promotes the late expression of PI maintained by the AP3/PI auto-regulatory circuit. 
The proximal region of the PI promoter does not contain any CArG box-like sequences even though it is sufficient for PI auto-regulatory expression, indicating that the interaction between the PI/AP3 complex and the PI promoter is indirect (Chen et al., 2000). Unlike the proximal region of the PI promoter, the 3-kb region of the BpMADS2 promoter contained one CArG box-like sequence from site –2204 to site –2195 (unpublished results). The promoter of the other B-function gene, AP3, contains three functional CArG box sequences which all mediate discrete regulatory effects and are necessary for the AP3 feedback control (Tilly et al., 1998). Furthermore, the PI elements PLE1 and PLE2 showed significant similarity to the BpMADS2 promoter, but the third element, PLE3, did not. In PI, all three positive regions are required for both stamen and petal expression (Chen et al., 2000). Even if it is not possible to draw far-reaching conclusions, these observations indicate that the mechanism of regulation of the B-function genes might be partly different in these two species.

One of the methods employed in our group for preventing flower formation is tissue-specific ablation using the BARNASE gene (Lemmetyinen et al., 2004). Because BpMADS2 was expressed in male and female inflorescences but not in any other parts of the plant (I), the promoter of BpMADS2 could be a suitable candidate for the prevention of the formation of stamens and carpels. So far, the BpMADS2::BARNASE construct has been tested in Arabidopsis, and the preliminary results show that this construct has an effect on flower formation in most transgenic lines (M. Lännenpää and T. Sopanen, unpublished results). Flower formation was totally prevented and only inflorescence stems developed in over half of the obtained lines. In the remaining lines, incomplete flowers developed, from which petals, stamens and carpels were missing or were malformed.
These results indicate that the 3-kb region upstream of the BpMADS2 ATG codon connected to BARNASE is sufficient to prevent flower formation in most of the transgenic Arabidopsis lines, and, unexpectedly, the influence of the BpMADS2::BARNASE construct extended into the first whorl of Arabidopsis flower preventing also the formation of sepals. These results indicate that the birch promoter, or the used birch promoter region, might be lacking some of the important elements needed to function properly in Arabidopsis. 5.2. Nucleotide variation in silver birch In this study nucleotide variation of silver birch was studied in two naturally regenerated populations located in eastern and northern Finland. The studied populations were composed of 65-70-year-old birches that are apparently unaffected by any human selection. Studies performed on the current genetic structure of naturally regenerated populations are likely to be very valuable in the future, since they provide us baseline reference values of diversity. 5.2.1. Level of nucleotide variation in three nuclear loci of silver birch In this thesis the nucleotide variation of silver birch was studied in three nuclear genes, BpMADS2, BpFULL1, and BpADH (I, II). The observed results do not fully support the predictions of high nucleotide polymorphism in silver birch. The estimates of silent site nucleotide diversities (πs) in BpFULL1 and BpADH genes were very similar, and much higher than the estimate of silent site nucleotide diversity in BpMADS2 gene (0.0134, 0.0117 and 0.0043, respectively; Table 3). The mean estimate of πs for the silver birch was 0.00818 (unpublished results), which is only slightly higher than the mean level of silent site nucleotide diversity for the highly selfing Arabidopsis (0.007; Yoshida et al., 2003). Furthermore, the overall level of nucleotide diversity (πtotal) for silver birch was 0.00682. 
This is much lower than the estimate of the nucleotide diversity for the entire genome of Arabidopsis (0.0106; Miyashita et al., 1999). The estimates of nonsynonymous nucleotide diversity (πa), especially in the BpMADS2 and BpFULL1 loci but also in the BpADH locus (I, II), were also lower than the estimates for Arabidopsis, especially for the ChiA, CAL, PI and AP3 loci (0.0037, 0.0054, 0.0030 and 0.0040, respectively; summarised in Aguadé, 2001). In Arabidopsis, genes like ChiA, CAL, AP3 and PI exhibit a significant excess of within-species replacement polymorphisms (Kawabe et al., 1997; Purugganan and Suddith, 1998; 1999). This phenomenon has been explained by a recent rapid population expansion, as a consequence of which Arabidopsis now exists in small inbred subpopulations. Even though silver birch has gone through a rapid population expansion to most of Europe after the last glaciation (Huntley and Birks, 1983), silver birch genes do not exhibit the same kind of excess of within-species replacement polymorphisms as Arabidopsis (I, II). This is most likely due to the very efficient pollen flow of silver birch, as will be discussed later.

Demographic factors, such as a recent population expansion (discussed later), affect all genes and all regions of a gene equally. In contrast, selection directly affects the genetic diversity at linked sites. Selection is thus expected to result in heterogeneous patterns of genetic diversity among different genes and across a given gene. In the nuclear genome of silver birch, the BpFULL1 and BpADH genes showed more variation (II) than the BpMADS2 gene (I), and for this reason the presence of selection was studied in the analysed gene regions (II).
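As a worked example of such a selection test, the McDonald-Kreitman comparison reported for BpADH in Section 4.3.4 (two synonymous and six nonsynonymous polymorphisms within B. pendula, against three synonymous and three nonsynonymous differences from B. ermanii) can be written as a 2x2 table. The sketch below assumes that significance is assessed with a two-sided Fisher’s exact test, a common choice for MK tables; the text does not state which test was used, although this assumption reproduces the reported P value. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)

    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# BpADH counts from Section 4.3.4:
# rows = polymorphic within species / fixed between species,
# columns = synonymous / nonsynonymous.
p_value = fisher_exact_two_sided(2, 6, 3, 3)
print(round(p_value, 2))  # 0.58
```

With only 14 sites in the table, no configuration with these margins can produce a small P value for a modest departure, which illustrates why the test has little power in short coding regions.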
However, because the HKA tests were not significant, and because the detected patterns of polymorphism and divergence were concordant, the obtained results indicate that the variation of mutation rates between the loci could be a sufficient explanation for the detected differences in the levels of nucleotide variation. Furthermore, mutation rates have been found to vary extensively both among genes and among groups of plants for the same gene (e.g., Bousquet et al., 1992; Laroche et al., 1997; Wang et al., 1999).

In general, long-lived, outcrossing, wind-pollinated forest tree species, such as silver birch, have been found to harbour much more genetic variation than annual, selfing plants (summarised in Wang and Szmidt, 2001). However, at the nucleotide level many woody plants have shown a slower substitution rate when compared with herbaceous annual plants (Bousquet et al., 1992; Savard et al., 1993; Laroche et al., 1997; Andreasen and Baldwin, 2001). Scots pine, another long-lived, predominantly outcrossing forest tree, has shown low nucleotide diversity in the coding region, at both synonymous and nonsynonymous sites (Dvornyk et al., 2002; García-Gil et al., 2003). Likewise, P. taeda had low diversity in a large set of genes (Brown et al., 2004). The lower than expected level of nucleotide diversity in silver birch is consistent with these findings. Several hypotheses have been proposed to explain this rate heterogeneity among woody perennial and herbaceous annual plant taxa, which seems to be related to life history. In this case, generation time seems to be the most likely factor because the woody perennials (longer generation times) analysed showed lower numbers of nucleotide substitutions per site than herbaceous annual taxa (shorter generation times; Gaut et al., 1996).
However, recent studies with closely related species do not support this hypothesis, rather just the opposite, indicating that life history cannot explain evolutionary rate variation in the studied species (Whittle and Johnston, 2003). The mechanisms for the lower rate of evolution in perennials compared to annuals are not yet well understood. For this reason further investigation of the factors that influence the mutational process (e.g., the relative frequency of germ-line and somatic mutations in gametes, metabolic rate of pregametic cells, and environmental conditions) is essential for a better understanding of molecular evolutionary rate variation in plants.

5.2.2. No genetic differentiation between the two silver birch populations

Partitioning of the genetic variability within forest tree species has revealed that in general more than 90 % of the total genetic variation resides within populations and less than 10 % is due to differentiation among tree populations (Hamrick et al., 1992). In this study, the genetic differentiation between the two silver birch populations overall was low, especially in the BpMADS2 and BpADH loci (I, II). Furthermore, silver birch individuals separated by thousands of kilometres (e.g., Italy and Russia) did not show more variation than two random individuals from populations Punkaharju or Rovaniemi with different allele haplogroups (Region I of BpMADS2, unpublished results). Earlier studies based on allozyme data have also shown that the genetic differentiation among the northern silver birch populations is low (Rusanen et al., 2003).

In forest tree species gene flow is mediated by seed and pollen dispersal, and the gene flow through pollen is very efficient especially in wind-pollinated species, such as silver birch (e.g., Hamrick and Nason, 2000). Gene flow is a strong force, which slows down population differentiation. Moreover, efficient gene flow among populations can also break down existing differentiation, which might have been formed during the isolation of the populations (for example during the last glaciation). Silver birch is distributed throughout the Northern Hemisphere (Atkinson, 1992), so the distribution area of the species is large. Furthermore, due to very efficient pollen flow, this distribution area forms in many places continuous populations over Europe. This has ensured that if there has been some differentiation among silver birch populations after the last glaciation, it has most presumably been broken down by efficient gene flow.

The lack of genetic differentiation indicates that the effective population size (Ne) of the species is large. At neutral sites linkage disequilibrium is governed by 1/(4Nec), where c is the recombination rate between loci (Hill and Robertson, 1968). Because close linkage will restrict effective recombination, the larger the product 4Nec, the less disequilibrium would be expected between neutral loci. If c between closely linked sites is small, Ne must be large to account for the lack of linkage disequilibrium. Consistent with the large effective population size, the decay of linkage disequilibrium in the studied genes was very rapid, even between closely linked polymorphic sites (I, II).

5.2.3. Recombination is common in the silver birch genome

Recombination is one of the key evolutionary processes that shape the genetic structure of populations (Posada and Crandall, 2001), and it has also most likely played a role in determining patterns of intraspecific variation in silver birch. Recombination was detected in all three nuclear loci and the rate of recombination was very similar among the genes (Table 4; I, II). Furthermore, due to the high recombination rate in these loci, the decay of linkage disequilibrium was very rapid. The only exception to this was Region I of BpMADS2, which showed no evidence of recombination but instead significant linkage disequilibrium and low variation (I). When the same region was studied from individuals collected from different parts of the distribution area of silver birch, recombination was, however, detected (unpublished results). These results indicate that recombination is a common phenomenon in the silver birch nuclear genome.

5.2.4. Nuclear genes of silver birch show few traces of postglacial expansion

The current pattern of genetic variation of a species is influenced by both genetic factors and historical events. Re-colonisation of Europe by forest tree species after the last glaciation is well documented in the pollen fossil records (Huntley and Birks, 1983). Pollen data, as well as macrofossils, indicate that birch was present in central Europe during the full glacial (Huntley and Birks, 1983; Willis et al., 2000). Birch populations were not limited to only a few southern refugia, but were locally present in a belt that ran eastwards into Russia, and also on the Northern European plains. Furthermore, birch pollen was widely present in parts of Central and Northern Europe during the late-glacial period (Willis et al., 2000). When the ice started to retreat, silver birch, as a pioneer species, occupied suitable habitats and quickly spread northwards.

Studies on the chloroplast genome have shown that Europe was reoccupied by two main waves of recolonisation after the glaciation: one from the east and one from the west (Palmé et al., 2003). Following these two main waves of recolonisation, today’s silver birches in Europe can be classified into two main chloroplast haplogroups, of which one is dominant in the north-west and the other in the south-east and east. Although in general the silver birch populations in northern Europe are dominated by the north-west haplogroup, in Finland the south-eastern/eastern haplogroup is the dominant one, representing about 70-90 % of the sample (Palmé et al., 2003; I). Due to the presence of the Scandinavian mountain range the spread northwards/north-east from the south through Norway and Sweden was probably slowed down or entirely hindered, while the spread of populations from the east and south-east to Finland was much more efficient.

A relatively short span of time (in generations) has elapsed since re-colonisation by silver birch took place in Finland 10 000 years ago (Huntley and Birks, 1983). This recent history may have left lasting imprints in the genetic structure of the silver birch populations. However, over time the initial genetic structure of populations established at colonisation will break down due to interpopulation gene flow, and the rate at which this breaking down will occur depends on the way gene flow is mediated (Petit et al., 1993; Petit et al., 1997). The gene flow of biparentally inherited nuclear genes is mediated by both seed and pollen, whereas only seed mediates the gene flow of maternally inherited chloroplast DNA. Since the gene flow through pollen is very efficient in wind-pollinated species, such as silver birch, the breaking down of the initial genetic structure of populations occurs most rapidly in biparentally inherited nuclear genes. On the other hand, maternally inherited markers, such as chloroplast DNA, retain the initial genetic structure of populations much longer and generally reveal much more genetic structure.

Unlike chloroplast data (Palmé et al., 2003), nuclear loci of silver birch do not show strong patterns caused by the post-glacial expansion of the species, but are close to demographic equilibrium (I, II). Although all gene genealogies displayed some allelic dimorphism, which in some parts of the genes was maximal, the Tajima’s D tests were not very powerful because of the low amount of variation.
The detected percentual distribution of putative nuclear allele haplogroups was very similar to the detected chloroplast haplotype distribution, indicating the same dual origin of the Finnish birches as the chloroplast haplotypes. As mentioned above, for maternally inherited chloroplast DNA inter-population gene flow via seeds is likely to be substantially lower than biparentally inherited nuclear DNA, where inter-population gene flow is mediated via seeds and pollen. Furthermore, the chloroplast DNA is non-recombining, whereas in nuclear genome of silver birch recombination seems to be a common phenomenon (I, II). As a consequence of this the initial genetic structure of silver birch populations have been retained longer in chloroplast genome, and this makes chloroplast DNA much more suitable than nuclear genome for the study of historical processes of silver birch. 5.3. Phylogeny of the genus Betula The birches are a difficult group taxonomically, not only because of their high vegetative variability and frequent hybridisation, but also partly because of the confusions related to the binomial nomenclature. Many birch species have at least two different commonly used Latin names (i.e., B. pendula/B. verrucosa/B. alba, B. pubescens/B. alba) and these names are used in parallel with each other. Furthermore, ever since Regel (1865), the subsections or subgenera of the genus Betula have been revised by a number of authors (see Furlow, 1990). A phylogenetic classification of the genus Betula has been suggested by de Jong (1993), who divided the genus into five subgenera, namely Betulenta, Betulaster, Neurobetula, Chamaebetula, and Betula (Fig. 1). 5.3.1. Comparison of molecular phylogenies In general, the results obtained from the ADH (III), BpMADS2, and BpFULL1 genes and from the combined data set (IV) fit rather well with the infrageneric classifications proposed for birches (e.g., Regel, 1865; Winkler, 1904; de Jong, 1993; Table 1), except for B. schmidtii and B. 
ermanii. In all phylogenetic trees B. schmidtii (subgenus Neurobetula) grouped with the species in the subgenus Betula (including B. pendula and all other white birches), and B. ermanii (subgenus Neurobetula) grouped with the species in the subgenus Chamaebetula (including B. humilis and B. fruticosa) (III, IV). The phylogenetic trees were mainly congruent with each other, but differed somewhat in their resolution and in the bootstrap support for the major clades. Furthermore, it should be noted that, due to the limited number of species included in this study, conclusions that rely on the branching order of the major gene clades must be made with caution. The nuclear data obtained for the species compared in this study suggest that the diploid B. pendula, B. resinifera, B. platyphylla, and B. schmidtii form a continuum of closely related taxa (Fig. 7; III, IV). The diploid B. populifolia and the polyploid B. pubescens and B. papyrifera are clearly related to the former group, but, possibly due to hybridisation and/or introgression, the placement of these species in a phylogeny differs depending on the gene region used. Betula pendula, B. resinifera, B. platyphylla, B. populifolia, B. pubescens and B. papyrifera all belong to the subgenus Betula, and they form a rather homogeneous group of pioneer species, with a characteristic white bark, pendulous catkins, male flowers with two or three stamens, and leaves with a small number of veins (Table 1; de Jong, 1993). Species in the subgenus Betula are considered to hybridise more or less freely, and this leads to introgression that complicates the classification of the species (Johnsson, 1945; Dugle, 1966; Furlow, 1990; de Jong, 1993). Particular confusion has centred around the European (B. pendula, B. pubescens) and the North American representatives (B. resinifera, B. populifolia, and B. papyrifera) of the subgenus (summarised in Furlow, 1990).
Many researchers have recognised the American species as geographic races of B. pubescens or as hybrids, while others consider them separate species. However, most modern authors have maintained the American birches as separate species (including de Jong, 1993). In this study, too, the placement of species such as B. populifolia, B. papyrifera, and B. pubescens varied depending on the gene region used, and the most likely explanation for this seems to be hybridisation between the species (III, IV). Species in the subgenus Betula and subgenus Chamaebetula are seen as derived from the same or different ancestors related to the subgenus Neurobetula (de Jong, 1993). This may partly explain why B. schmidtii and B. ermanii group at the nuclear level with species in the subgenera Betula and Chamaebetula, respectively. The tetraploid B. ermanii is morphologically extremely variable, resembling species in the subgenus Betulenta in having male flowers with 3-4 stamens, but the fruiting catkins are not always sessile and upright, and the bark resembles that of white birches (subgenus Betula) in being greyish white and lacking methyl salicylate (Ashburner, 1980). At the nucleotide level, B. ermanii grouped as a sister taxon of B. fruticosa and B. humilis (subgenus Chamaebetula), indicating a close relationship between these two subgenera (III, IV). On the other hand, the grouping of B. ermanii might partly be an artefact. After the grouping of B. schmidtii within the subgenus Betula, B. ermanii was left as the only representative of the subgenus Neurobetula, and grouped with the next most similar species. If the analysis had included more species from the subgenus Neurobetula, this grouping could have been different, indicating that further studies within the heterogeneous subgenus Neurobetula will be needed to establish a reliable phylogeny for these birch species.
The chloroplast gene matK is one of the most widely used sequences for phylogenetic studies in plants (e.g., Wang et al., 1999; Wang et al., 2000; Cheng et al., 2000; Stanford et al., 2000; Soltis et al., 2001; Fukuda et al., 2001). In this thesis, variation of matK among the studied birch species was limited (Table 6, III). The phylogenetic tree of matK, with only five parsimony-informative characters, divided the 14 Betula species into two well-supported groups: one including the three American species B. lenta, B. alleghaniensis, and B. papyrifera, and the other containing the remaining species. DNA sequences of higher plants evolve at different rates, depending on whether they are located in the nuclear, chloroplast, or mitochondrial genome. Comparisons of chloroplast and nuclear DNA sequences have shown that chloroplast DNA evolves at only half the rate of plant nuclear DNA (Wolfe et al., 1987; Wang et al., 2000). Furthermore, the chloroplast genome is haploid (Radetzky, 1990; Rajora and Dancik, 1992; Dumolin et al., 1995) and is therefore expected to have a smaller effective population size (Ne) than diploid nuclear genes. In monoecious species, such as Betula, Ne for chloroplast genes is expected to be half of that for nuclear genes, and the level of genetic variation is therefore expected to be smaller. These facts could, at least partly, explain the low level of variation in the matK region compared to the analysed nuclear genes. Further, earlier studies with chloroplast genes indicate that the chloroplast genome evolves slowly in Betulaceae in general (Bousquet et al., 1992; Kato et al., 1998; Palmé and Vendramin, 2002).

5.3.2. Phylogenetic relationships of Betula schmidtii

Betula schmidtii is considered to be a rather peculiar species among birches because of the blackness of its bark, hard, heavy wood and slow growth (Ashburner, 1980). The female inflorescences of B. schmidtii are erect and elongated, and the fruits wingless, with only narrow margins (Nakai, 1915). Regel (1865) placed B. schmidtii into series Costatae, along with B. ermanii. De Jong (1993) divided the section Costatae into the subgenera Betulenta and Neurobetula, and placed B. schmidtii into subgenus Neurobetula. However, because of the morphological differences compared to other birch species discussed above, B. schmidtii has sometimes been placed even in its own subgenus, Asperae (Nakai, 1915). The four studied nuclear gene sequences of B. schmidtii closely resembled those of B. pendula and other white birches (subgenus Betula). Similar results have been obtained for the flavonoid profiles of B. schmidtii (Keinänen et al., 1999b). The meiosis of diploid B. schmidtii is very abnormal, and suggests a hybrid origin for the species (Woodworth, 1929). Similarities in flavonoid composition and nucleotide sequences suggest that one of the parental species of B. schmidtii belonged to the ancestors of the subgenus Betula. However, the phenolic compounds other than flavonoids that are characteristic of white birches were not detected in B. schmidtii, and of the two main non-flavonoid compounds present in B. schmidtii, only one was detected in another species, B. ermanii (Keinänen et al., 1999b). This, along with the phenotypic differences between B. schmidtii and white birches, supports the idea of the hybrid origin of the species.

Living species of Betula are all n = 14 or higher, but there is no unanimity on the base number of the genus (see Furlow, 1990). The base chromosome number 14 is commonly accepted, but Brown and Al-Dawoody (1979) found that meiotic behaviour in hybrid birches (2n = 42) suggests that these trees are actually hexaploids, not triploids, which leads to a base chromosome number of 7. Furthermore, in meiosis the chromosomes in the 2n = 28 and 2n = 56 plants tend to lie in groups of seven, and for this reason the original basic chromosome number of birches is thought to be seven rather than fourteen (Eriksson and Jonsson, 1986). The small number of quadrivalents during meiosis (multivalents formed from four chromosomes) has also been thought to support the base number of seven. Furthermore, the latest studies with molecular markers offer molecular evidence for a base number of 7 in Betula (Williams and Arnold, 2001). As mentioned earlier, the diploid B. schmidtii shows features that are characteristic of plants of hybrid origin (Woodworth, 1929). If the base chromosome number of the genus Betula is 7, then B. schmidtii would simply be an allotetraploid species (and all other 2n = 28 species would likewise be tetraploid). But if the base chromosome number is 14, as commonly accepted, this would mean that B. schmidtii is a homoploid diploid. The origin of new homoploid species via hybridisation is theoretically difficult because it requires the development of reproductive isolation in sympatry (Rieseberg, 1997). However, it is not impossible, as documented examples of homoploid diploid and allotetraploid hybrid species in nature show (Rieseberg, 1997; Ferguson and Sang, 2001).

5.3.3. Origin of the two alleles of ADH gene

The ADH gene differed from the other genes studied in that two fragments instead of one were amplified from some birch species (II, III). The classification of the birch ADH alleles into two classes followed the presence/absence of one long indel from position 66 to 524 bp. Two versions of the ADH gene were amplified from one diploid species, B. pendula (II), and two polyploid species, B. pubescens and B. papyrifera (III). From the diploid B. nana only the long allele was amplified. The occurrence of two ADH alleles in four different birch species can be explained by recent or ancient hybridisation and/or introgression.
Hybridisation is a common phenomenon among birches (Johnsson, 1945; Dugle, 1966; Furlow, 1990), and introgression is therefore expected to have played an important role in the evolution of this genus (Furlow, 1990; Atkinson, 1992). Hybridisation is also common between species in the subgenus Betula, including B. pendula, B. pubescens, and B. papyrifera (Johnsson, 1945; Thórsson et al., 2001; Palmé et al., 2004). However, hybridisation studies with B. pendula and B. pubescens have shown that the cross B. pubescens x B. pendula and its reciprocal give only a few progeny that are extremely sterile, indicating that hybridisation between these two species is not common (Johnsson, 1945). Furthermore, the indirect evidence about the origin of the 42-chromosome trees, which have been argued to be hybrids between B. pendula and B. pubescens, supports the hypothesis that these trees are actually aneuploid B. pubescens, not hybrids (Brown and Al-Dawoody, 1979). Also, the distribution area of B. papyrifera is limited to North America, while B. pendula and B. pubescens are found in Europe and Asia. Due to these geographical limits, hybridisation between these three species would be very difficult. However, hybridisation might be the most likely explanation for the occurrence of long ADH alleles in B. pubescens and B. nana (Anamthawat-Jónsson and Tomasson, 1990; Jonsell, 2000; Thórsson et al., 2001). Species in the subgenus Betula are relatively young and probably still evolving (Jäger, 1980; de Jong, 1993). It is possible that hybridisation occurred earlier in the evolution of the subgenus Betula and that the two different ADH alleles found in B. pendula, B. pubescens and B. papyrifera are relics of these events. Palmé et al. (2004) speculated on the possibility that two of the chloroplast haplotypes shared between B. pendula, B. pubescens and B. nana could be ancient and were most likely present in the common ancestor of these three species.
However, their final conclusion was that the haplotype sharing among these three species is most likely caused by hybridisation and subsequent cytoplasmic introgression. This conclusion was justified by the fact that geography was more important than species identity in influencing the haplotype composition of a population. Several hypotheses have been suggested about the origin of the tetraploid B. pubescens, generally involving B. pendula (see Howland et al., 1995). According to one hypothesis, B. pubescens is an ancient allotetraploid, with B. pendula as one of the ancestral parents, while the other parental species might be B. humilis. An alternative hypothesis suggests that B. pubescens is an autotetraploid of B. pendula. Recent studies have demonstrated that most polyploid species examined have formed recently from different populations of their progenitors (multiple origins; summarised in Soltis and Soltis, 1999). When genetically different diploids have been involved in this polyploidisation, the result can be a series of genetically distinct polyploid populations. Combined with the fact that the chloroplast genome evolves slowly in Betulaceae (Bousquet et al., 1992; Kato et al., 1998; Palmé and Vendramin, 2002) and in the genus Betula (matK gene (III)), these findings could explain both the shared haplotypes and the influence of geography on the haplotype composition of populations. It is also possible that the two ADH alleles are much older forms and typical of all or most of the birch species, but that, due to the PCR primers and the stringency of the amplification conditions used, we were unable to isolate the longer allele from the other birch species studied. Stebbins (1971) has estimated that approximately one third of the angiosperms possess more than two complete genomes (i.e. multiplied sets of the diploid chromosome number of the genus), and it is probable that the present basic chromosome number of the genus Betula is also of ancient polyploid origin.

5.3.4. Reconciling gene trees with a species tree

When studying molecular phylogenies it is important to keep in mind that a phylogenetic tree (gene tree) constructed from DNA sequences does not necessarily mirror the actual species tree (the evolutionary pathway of the species; Pamilo and Nei, 1988). Processes such as hybridisation and introgression, recombination, lineage sorting, and gene duplication can frequently cause incongruences among different gene trees and the actual species tree. In fact, the gene tree can be quite different from the species tree, especially when the time of divergence between different species is short. Furthermore, when the studied species are relatively closely related, such as the species in the subgenus Betula, the number of nucleotides required for obtaining the correct species tree with a probability of 95% is considerable (Pamilo and Nei, 1988). However, if several independent data sets result in similar trees, this gives us confidence that the obtained gene trees truly reflect the same evolutionary history as the species tree. In a genus that is known for its high levels of hybridisation and introgression, such as Betula (Johnsson, 1945; Dugle, 1966; Furlow, 1990), the transfer of genes across species boundaries is undoubtedly extensive. When an introgressed allele is sequenced instead of one of the “original” alleles, this will affect the structure of the gene tree, and normally the tree will not mirror that of the majority of the other genes in the species (Wendell and Doyle, 1998). If introgression is widespread, bifurcating trees may simply no longer reflect the evolutionary process, and this could potentially be the case for several Betula species. In species such as B. nana, which is known to hybridise extensively, the diverse grouping in different gene trees (III, IV) could be explained by hybridisation and/or introgression (in this case by hybridisation and introgression between B. nana and B.
pubescens; Ashburner, 1980; Anamthawat-Jónsson and Tomasson, 1990; Thórsson et al., 2001; Palmé et al., 2004). Furthermore, phylogenetic trees of the studied nuclear and chloroplast genes did not give fully congruent results (III, IV). Incongruence between nuclear and cytoplasmic markers has often been reported (e.g., Soltis et al., 1996; Erdogan and Mehlenbacher, 2000; Semerikov et al., 2003), and it can be due to many factors, such as those mentioned above. In the present case, cytoplasmic introgression seems to be the most likely explanation for the differences between the phylogenetic trees of the BpMADS2, BpFULL1, ADH and matK genes (III, IV). Cytoplasmic introgression has proven to be common among other birch species (Anamthawat-Jónsson and Tomasson, 1990; Thórsson et al., 2001; Palmé et al., 2004). The three species in group B in the phylogenetic tree of matK are all American species with overlapping geographical distributions, so there has been plenty of opportunity for hybridisation (III). Furthermore, both morphological (Furlow, 1990; de Jong, 1993) and nuclear data (III, IV) indicate that B. papyrifera clearly belongs to the white birches (subgenus Betula). Recombination is a key evolutionary process that shapes the genetic structure of populations and the architecture of genomes (Posada and Crandall, 2001). Recombination also violates the main assumption of most phylogenetic methods, namely that only one phylogenetic tree underlies the evolution of the sequences under study, by generating “mosaic genes” in which different regions have different phylogenetic histories. Recombination also seems to be a common phenomenon in the silver birch genome (I, II), indicating that it cannot be excluded when evaluating the phylogenetic relationships of the genus Betula based on gene trees.
However, the power of statistical methods to detect recombination varies greatly, and most methods have trouble detecting recombination when it is rare, especially when sequence divergence is low (Posada and Crandall, 2001; Wiuf et al., 2001). For most methods, a minimum sequence divergence of 5% seems necessary to attain substantial power to detect recombination, but this limit was reached in only two of the gene regions used in this study (Table 6). Furthermore, other processes that complicate the relationships between gene trees and species trees are lineage sorting, extinction of ancestral gene polymorphisms through stochastic processes, and gene duplications (Wendell and Doyle, 1998). However, lineage sorting is likely to be a problem only if the time that the alleles within a lineage need to coalesce is longer than the interval between successive speciation events. Gene duplications may result in a species containing a number of distinct but related sequences. Duplications are common evolutionary events in which a gene located along a DNA strand is copied to multiple places, after which all of the copies evolve independently of each other. When studying duplicated genes there is a danger of inadvertently including paralogous genes, resulting in a gene tree that reflects the duplication of the gene rather than a possible species tree. Two versions of the ADH gene were isolated from four birch species, but the fact that the coding regions of the short and long alleles of B. papyrifera were identical (III) and the coding regions of the short and long alleles of B. pendula were almost identical (II) strongly suggests that these two sequences represent two different alleles of the same ADH gene, not two different ADH genes. Finally, different phylogenetic methods treat gaps (indels and microsatellite length variation) differently: while the maximum likelihood method ignores gaps entirely, maximum parsimony considers each base pair within a gap as a character.
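The effect of the two gap treatments can be illustrated with a minimal sketch that simply counts differing alignment columns between two sequences, once with gapped columns ignored and once with every gapped base pair counted as a character. The function name and the two short sequences are hypothetical and are not birch ADH data; the sketch is neither a maximum likelihood nor a maximum parsimony implementation, it only shows how much signal a single long indel can add or remove.

```python
def count_differences(seq_a, seq_b, gaps_as_characters):
    """Count differing columns between two aligned sequences.

    If gaps_as_characters is False, columns containing a gap ("-") are
    skipped; if True, every gapped base pair counts as one difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if "-" in (a, b) and not gaps_as_characters:
            continue  # gapped column contributes nothing
        if a != b:
            differences += 1
    return differences


# Hypothetical "short" and "long" alleles differing by one indel of
# six base pairs and by one substitution at the last position.
short_allele = "ACGT------TTGA"
long_allele = "ACGTCCGATATTGC"

ignoring_gaps = count_differences(short_allele, long_allele, gaps_as_characters=False)
counting_gaps = count_differences(short_allele, long_allele, gaps_as_characters=True)
```

With gaps ignored the two toy alleles differ at a single site, whereas with each gapped base pair counted they differ at seven, so a long indel can dominate the distance between otherwise nearly identical sequences.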
These differences between the two methods clearly affect the clustering of certain birch species in phylogenies, as discussed in papers III and IV. As many other studies based on morphology, biochemical characters and chromosome numbers have shown (Regel, 1865; Winkler, 1904; Nakai, 1915; Komarov, 1936; Pawlowska, 1983; de Jong, 1993; Keinänen et al., 1999b), the relationships among the Betula species are complex. This study has been a first step towards understanding the relationships among different Betula species using nuclear genes. As the results of this study have shown, many processes, such as those mentioned above, can cause incongruences among different gene trees and the actual species tree. However, the study also revealed that molecular data can be a powerful tool in reconstructing the evolutionary histories of morphologically differentiated species, such as B. schmidtii and white birches (III, IV). To further understand the relationships among the species of this complex genus, a larger number of unlinked genes and more birch species have to be studied.

ACKNOWLEDGEMENTS

First of all, I would like to thank all the people who have contributed to this thesis. I am very grateful to my supervisors Professor Tuomas Sopanen, Professor Outi Savolainen and Dr. Markku Keinänen for their advice and support throughout the work on this thesis. I am especially grateful to Riitta Pietarinen for her help with the laboratory work. Without your assiduous work in the lab this thesis would not have been possible. Many thanks go to my colleagues and friends at the Department of Biology. I would especially like to thank the members of our birch group, Kaija Keinonen, Ilkka Porali, Juha Lemmetyinen, Mika Lännenpää and Luis Orlando Morales, for their friendship, support, amusing discussions and valuable advice. It has been a pleasure to work with you all these years! I would also like to thank Docent Matti Rousi and Dr.
Risto Jalkanen from the Finnish Forest Research Institute, Punkaharju and Rovaniemi Research Stations, and Maisa Viljanen at Joensuu Botanical Garden for sending me the population and birch species samples, and Professor Jaakko Kangasjärvi for providing the genomic library of silver birch. I thank Hanni Sikanen for the help with fieldwork, Minna Korhonen for the cDNA clone of the BpADH gene, and Anna Palmé for three birch samples. I also thank Anna Palmé and Martin Lascoux for the co-operation on paper III. This study was carried out at the Department of Biology, University of Joensuu. I thank Professor emeritus Heikki Hyvärinen, Professor Jussi Kukkonen, Dr. Markku Kirsi and Dr. Pertti Huttunen, the heads of the Department of Biology, for providing excellent facilities. The study was funded by TEKES (as a part of the Finnish Biodiversity Programme, FIBRE), the Graduate School of Forest Sciences (former Graduate School of Biology and Biotechnology of Forest Trees), the Department of Biology, University of Joensuu, and the Faculty of Science, University of Joensuu. Last, but not least, my special thanks to my family, Sanni and Vesa, for your love and patience during these years. Without you and your support this project would not have been possible!

The learned are proud because they know so much; the wise are humble because they know so little.
William Cowper

REFERENCES

Abbot RJ, Gomez MF (1989) Population genetic structure and outcrossing rate of Arabidopsis thaliana (L.) Heynh. Heredity 62:411-418.
Aguadé M (2001) Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana. Molecular Biology and Evolution 18:1-9.
Alam MT, Grant WF (1972) Interspecific hybridization in birch (Betula). Le Naturaliste Canadien 99:33-40.
Anamthawat-Jónsson K, Tomasson T (1990) Cytogenetics of hybrid introgression in Icelandic birch. Hereditas 112:65-70.
Andreasen K, Baldwin BG (2001) Unequal evolutionary rates between annual and perennial lineages of checker mallows (Sidalcea, Malvaceae): evidence from 18S-26S rDNA internal and external transcribed spacers. Molecular Biology and Evolution 18:936-944.
Angenent GC, Busscher M, Franken J, Mol JNM, van Tunen AJ (1992) Differential expression of two MADS box genes in wild-type and mutant Petunia flowers. The Plant Cell 4:983-993.
Anonymous (1999) Finnish statistical yearbook of forestry. Finnish Forest Research Institute.
Atkinson MD (1992) Betula pendula Roth (B. verrucosa Ehrh.) and B. pubescens Ehrh. Journal of Ecology 80:837-870.
Bergelson J, Stahl E, Dudek S, Kreitman M (1998) Genetic variation within and among populations of Arabidopsis thaliana. Genetics 148:1311-1323.
Bousquet J, Strauss SH, Doerksen AH, Price RA (1992) Extensive variation in evolutionary rate of rbcL gene sequences among seed plants. Proceedings of the National Academy of Sciences 89:7844-7848.
Brown GR, Gill GP, Kuntz RJ, Langley CH, Neale DB (2004) Nucleotide diversity and linkage disequilibrium in loblolly pine. Proceedings of the National Academy of Sciences 101:15255-15260.
Brown IR, Al-Dawoody D (1979) Observations on meiosis in three cytotypes of Betula alba L. New Phytologist 83:801-811.
Chang C, Meyerowitz EM (1986) Molecular cloning and DNA sequence of the Arabidopsis thaliana alcohol dehydrogenase gene. Proceedings of the National Academy of Sciences 83:1408-1412.
Charlesworth B, Morgan MT, Charlesworth D (1993) The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.
Charlesworth D, Liu FL, Zhang L (1998) The evolution of the alcohol dehydrogenase gene family by loss of introns in plants of the genus Leavenworthia (Brassicaceae). Molecular Biology and Evolution 15:552-559.
Chen X, Riechmann JL, Jia D, Meyerowitz E (2000) Minimal regions in the Arabidopsis PISTILLATA promoter responsive to the APETALA3/PISTILLATA feedback control do not contain a CArG box. Sexual Plant Reproduction 13:85-94.
Cheng Y, Nicolson RG, Tripp K, Chaw SM (2000) Phylogeny of Taxaceae and Cephalotaxaceae genera inferred from chloroplast matK gene and nuclear rDNA ITS region. Molecular Phylogenetics and Evolution 14:353-365.
Cho S, Jang S, Chae S, Chung KM, Moon YH, An G, Jang SK (1999) Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription activation domain. Plant Molecular Biology 40:419-429.
Chung Y-Y, Kim S-R, Kang H-G, Noh Y-S, Park MC, An G (1995) Characterization of two rice MADS box genes homologous to GLOBOSA. Plant Science 109:45-56.
Coen E, Meyerowitz EM (1991) The war of the whorls: genetic interactions controlling flower development. Nature 350:31-37.
de Jong PC (1993) An introduction to Betula: its morphology, evolution, classification and distribution, with a survey of recent work. International Dendrology Society, Great Britain.
Dolferus R, Jacobs M, Peacock WJ, Dennis ES (1994) Differential interactions of promoter elements in stress response of the Arabidopsis Adh gene. Plant Physiology 105:1075-1087.
Dugle JR (1966) A taxonomic study of western Canadian species in the genus Betula. Canadian Journal of Botany 44:929-1007.
Dumolin S, Demesure B, Petit RJ (1995) Inheritance of chloroplast and mitochondrial genomes in pedunculate oak investigated with an efficient PCR method. Theoretical and Applied Genetics 91:1253-1256.
Dvornyk V, Sirviö A, Mikkonen M, Savolainen O (2002) Low nucleotide diversity at the Pal1 locus in the widely distributed Pinus sylvestris. Molecular Biology and Evolution 19:179-188.
Egea-Cortines M, Saedler H, Sommer H (1999) Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. The EMBO Journal 18:5370-5379.
Elo A, Lemmetyinen J, Turunen M-L, Tikka L, Sopanen T (2001) Three MADS-box genes similar to APETALA1 and FRUITFULL from silver birch (Betula pendula). Physiologia Plantarum 112:95-103.
Erdogan V, Mehlenbacher SA (2000) Phylogenetic relationships of Corylus species (Betulaceae) based on nuclear ribosomal DNA ITS region and chloroplast matK gene sequences. Systematic Botany 25:727-737.
Eriksson G, Jonsson A (1986) A review of the genetics of Betula. The Scandinavian Journal of Forest Research 1:421-434.
Felsenstein J (1993) PHYLIP (Phylogeny Inference Package) version 3.5. Computer program and documentation distributed by the author. Website: http://evolution.genetics.washington.edu/phylip/software.pars.html#PHYLIP.
Ferguson D, Sang T (2001) Speciation through homoploid hybridization between allotetraploids in peonies (Paeonia). Proceedings of the National Academy of Sciences 98:3915-3919.
Ferrándiz C, Gu Q, Martienssen R, Yanofsky MF (2000) Redundant regulation of meristem identity and plant architecture by FRUITFULL, APETALA1 and CAULIFLOWER. Development 127:725-734.
Fitch WM (1971) Toward defining the course of evolution: minimum change for a specified tree topology. Systematic Zoology 20:406-416.
Freeling M, Bennett DC (1985) Maize Adh1. Annual Review of Genetics 19:297-323.
Fukuda T, Yokoyama J, Ohashi H (2001) Phylogeny and biogeography of the genus Lycium (Solanaceae): inferences from chloroplast DNA sequences. Molecular Phylogenetics and Evolution 19:246-258.
Furlow J (1990) The genera of Betulaceae in the southeastern United States. Journal of the Arnold Arboretum 71:1-67.
García-Gil MR, Mikkonen M, Savolainen O (2003) Nucleotide diversity at the two phytochrome loci along a latitudinal cline in Pinus sylvestris. Molecular Ecology 12:1195-1206.
Gaut BS, Clegg MT (1991) Molecular evolution of alcohol dehydrogenase 1 in members of the grass family. Proceedings of the National Academy of Sciences 88:2060-2064.
Gaut BS, Clegg MT (1993a) Molecular evolution of the Adh1 locus in the genus Zea. Proceedings of the National Academy of Sciences 90:5095-5099.
Gaut BS, Clegg MT (1993b) Nucleotide polymorphism in the Adh1 locus of pearl millet (Pennisetum glaucum) (Poaceae). Genetics 135:1091-1097.
Gaut BS, Morton BR, McCaig BC, Clegg MT (1996) Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proceedings of the National Academy of Sciences 93:10274-10279.
Gaut BS, Peek AS, Morton BR, Clegg MT (1999) Patterns of genetic diversification within the Adh gene family in the grasses (Poaceae). Molecular Biology and Evolution 16:1086-1097.
Goto K, Meyerowitz EM (1994) Function and regulation of the Arabidopsis floral homeotic gene PISTILLATA. Genes & Development 8:1548-1560.
Gu Q, Ferrándiz C, Yanofsky MF, Martienssen R (1998) The FRUITFULL MADS-box gene mediates cell differentiation during Arabidopsis fruit development. Development 125:1509-1517.
Hagenblad J, Nordborg M (2002) Sequence variation and haplotype structure surrounding the flowering time locus FRI in Arabidopsis thaliana. Genetics 161:289-98.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41:95-98.
Hamrick JL, Godt MJW, Sherman-Broyles SL (1992) Factors influencing levels of genetic diversity in woody plant species. New Forests 6:95-124.
Hamrick JL, Nason JD (2000) Gene flow in forest trees. In Boyle TJB, Young A, Boshier D (eds) Forest Conservation Genetics: Principles and Practice. CIFOR and CSIRO, Australia.
Hanfstingl U, Berry A, Kellog EA, Costa III JT, Rudiger W (1994) Haplotype divergence coupled with lack of diversity at the Arabidopsis thaliana alcohol dehydrogenase locus: role for both balancing and directional selection? Genetics 138:811-828.
Hansen G, Estruch JJ, Sommer H, Spena A (1993) NTGLO: a tobacco homologue of the GLOBOSA floral homeotic gene of Antirrhinum majus: cDNA sequence and expression pattern. Molecular and General Genetics 239:310-312.
Hardenack S, Ye D, Saedler H, Grant S (1994) Comparison of MADS box gene expression in developing male and female flowers of the dioecious plant white campion. The Plant Cell 6:1775-1787.
Higo K, Ugawa Y, Iwamoto M, Korenaga T (1999) Plant cis-acting regulatory DNA elements (Place) database: 1999. Nucleic Acids Research 27:297-300.
Hill WG, Robertson A (1968) Linkage disequilibrium in finite populations. Theoretical and Applied Genetics 38:226-231.
Honma T, Goto K (2000) The Arabidopsis floral homeotic gene PISTILLATA is regulated by discrete cis-elements responsive to induction and maintenance signals. Development 127:2021-2030.
Howland DE, Oliver RP, Davy AJ (1995) Morphological and molecular variation in natural populations of B. pendula. The New Phytologist 130:117-124.
Hudson RR, Kreitman M, Aguadé M (1987) A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.
Huntley B, Birks HJ (1983) An atlas of past and present pollen maps for Europe: 0-13000 years ago. Cambridge University Press, Cambridge, United Kingdom.
Hyvärinen H (1987) History of forests in northern Europe since the last glaciation. Annales Academiae Scientiarum Fennicae. Series A. III, Geologica-Geographica 145:7-18.
Innan H, Tajima F, Terauchi R, Miyashita NT (1996) Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana. Genetics 143:1761-1770.
Jack T, Brockman LL, Meyerowitz EM (1992) The homeotic gene APETALA3 of Arabidopsis thaliana encodes a MADS box and is expressed in petals and stamens. Cell 68:683-697. Jäger EJ (1980) Progressionen im Synfloreszensbau und in der Verbreitung bei den Betulaceae. Flora 170:91-113. Johnsson H (1945) Interspecific hybridization within the genus Betula. Hereditas 31:163176. Jonsell B, Ed. (2000) Flora Nordica, vol. 1. The Bergius Foundation, Royal Swedish Academy of Sciences, Stockholm, Sweden. Kado T, Yoshimaru H, Tsumura Y, Tachida H (2003) DNA variation in a conifer, Cryptomeria japonica (Cupressaceae sensu lato). Genetics 164:1547-1559. Karhu A, Hurme P, Karjalainen M, Karvonen P, Kärkkäinen K, Neale D, Savolainen O (1996) Do molecular markers reflect patterns of differentiation in adaptive traits of conifers? Theoretical and Applied Genetics 93:215-221. 41 Karvonen P, Savolainen O (1993) Variation and inheritance of ribosomal DNA in Pinus sylvestris L. (Scots pine). Heredity 71:614622. Kato H, Oginuma K, Gu Z,Hammel B, Tobe H (1998) Phylogenetic relationships of Betulaceae based on matK sequences with particular reference to the position of Ostryopsis. Acta Phytotaxonomica et Geobotanica 49: 89-97. Kawabe A, Innan H, Terauchi R, Miyashita NT (1997) Nucleotide polymorphism in the Acidic Chitinase locus (ChiA) region of the wild plant Arabidopsis thaliana. Molecular Biology and Evolution 14:1303-1315. Kawabe A, Miyashita NT (1999) DNA variation in the basic Chitinase locus (ChiB) region of the wild plant Arabidopsis thaliana. Genetics 153:1445-1453. Kawabe A, Yamane K, Miyashita NT (2000) DNA polymorphism at the cytosolic Phosphoglucose Isomerase (PgiC) locus of the wild Arabidopsis thaliana. Genetics 156:1339-1347. Keinänen M, Julkunen-Tiitto R, Mutikainen P, Walls M, Ovaska J, Vapaavuori E (1999a) Trade-offs in secondary metabolism: effects of fertilization, defoliation, and genotype on birch leaf phenolics. Ecology 80:1970-1986. 
Keinänen M, Julkunen-Tiitto R, Rousi M, Tahvanainen J (1999b) Taxonomic implications of phenolic variation in leaves of birch (Betula L.) species. Biochemical Systematics and Ecology 27:243-256. Komarov V (1936) Flora of the U.S.S.R, vol. 5. Moskva-Leningrad, Izdatel´stvo Akademii Nauk SSSr, Moskva-Leningrad, the Soviet Union.. Koornneef M, Alonso-Blanco C, Peeters AJM, Soppe W (1998) Genetic control of flowering time in Arabidopsis. Annual Review of Plant Physiology 49:345-370. Koornneef M, Alonso-Blanco C, Vreugdenhil D (2004) Naturally occuring genetic variation in Arabidopsis thaliana. Annual Review of Plant Biology 55:141-172. Koski V (1989) Metsäpuiden jalostus/ Breeding of forest trees. Ammattikasvatushallitus, Helsinki, Finland. Kramer EM, Dorit RL, Irish VF (1998) Molecular evolution of genes controlling petal and stamen development: duplication and divergence within the APETALA3 and PISTILLATA MADS-box gene lineages. Genetics 149:765-783. Krizek BA, Meyerowitz EM (1996) The Arabidopsis homeotic genes APETALA3 and PISTILLATA are sufficient to provide the B class organ identity function. Development 122:11-22. Krizek BA, Riechmann JL, Meyerowitz EM (1999) Use of the APETALA1 promoter to assay the in vivo function of chimeric MADS box genes. Sexual Plant Reproduction 12:1426. Krüssman G (1960) Handbuch der Laubgehölze, Band I. Paul Parey, Berlin, Germany. Kuittinen H, Aguadé M (2000) Nucleotide variation at the Chalcone isomerase locus in Arabidopsis thaliana. Genetics 155:863872. Kuittinen H, Salguero D, Aguadé M (2002) Parallel patterns of sequence variation within and between populations at three loci of Arabidopsis thaliana. Molecular Biology and Evolution 19:2030-2034. Kush A, Brunelle A, Shevell D, Chua NH (1993) The cDNA sequence of two MADS box proteins in Petunia. Plant Physiology 102:1051-1052. Laitinen M-L, Julkunen-Tiitto R, Rousi M (2000) Variation in phenolic compounds within a birch (Betula pendula) population. 
Journal of Chemical Ecology 26:1609-1622. Laitinen M-L, Julkunen-Tiitto R, Rousi M (2002) Foliar phenolic composition of European white birch during bud unfolding and leaf development. Physiologia Plantarum 114:450-460. Lamb RS, Irish VF (2003) Functional divergence within the APETALA3/ PISTILLATA floral homeotic gene lineages. Proceedings of the National Academy of Sciences 100:6558-6563. Laroche J, Li P, Maggia L, Bousquet J (1997) Molecular evolution of angiosperm mitochondrial introns and exons. Proceedings of the National Academy of Sciences 94:5722-5727. Le Corre V, Roux F, Reboud X (2002) DNA polymorphism at the FRIGIDA gene in 42 Arabidopsis thaliana: extensive nonsynonymous variation is consistent with local selection for flowering time. Molecular Biology and Evolution 19:1261-1271. Lemmetyinen J, Pennanen T, Lännenpää M, Sopanen T (2001) Prevention of flower formation in dicotyledons. Molecular Breeding 7:341-350. Lemmetyinen J, Hassinen M, Elo A, Porali I, Keinonen K, Mäkelä H, Sopanen T (2004) Functional characterisation of SEPALLATA3 and AGAMOUS orthologues in silver birch. Physiologia Plantarum 121:149-162. Li WH (1997) Molecular evolution. Sinauer Associates, Sunderland, Massachusetts, USA. Litt A, Irish VF (2003) Duplication and diversification in the APETALA1/ FRUITFULL floral homeotic gene lineage: implications for the evolution of floral development. Genetics 165:821-833. McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654. Mandel MA, Yanofsky MF (1995) The Arabidopsis AGL8 MADS box gene is expressed in inflorescence meristems and is negatively regulated by APETALA1. The Plant Cell 7:1763-1771. Miyashita NT, Innan H, Terauchi H (1996) Intra- and interspecific variation of the alcohol dehydrogenase locus region in wild plants Arabis gemmifera and Arabidopsis thaliana. Molecular Biology and Evolution 13:433-436. 
Miyashita NT, Kawabe A, Innan H, Terauchi R (1998) Intra- and interspecific DNA variation and codon bias of the Alcohol Dehydrogenase (Adh) locus in Arabis and Arabidopsis species. Molecular Biology and Evolution 15: 1420-1429. Miyashita NT, Kawabe A, Innan H (1999) DNA variation in the wild plant Arabidopsis thaliana revealed by amplified fragment length polymorphism analysis. Genetics 152:1723-1731. Miyashita NT (2001) DNA variation in the 5’ upstream region of the Adh locus of the wild plants Arabidopsis thaliana and Arabis gemmifera. Molecular Biology and Evolution 18:164-171. Miyashita NT (2003) Trimorphic DNA variation in the receptor-like protein kinase gene in the F18L15-130 region of the wild plant Arabidopsis thaliana. Genes & Genetic Systems 78:221-227. Morton BR, Gaut BS, Clegg MT (1996) Evolution of alcohol dehydrogenase genes in the palm and grass families. Proceedings of the National Academy of Sciences 93:11735-11739. Moudarov A, Hamdorf B, Teasdale RD, Kim JT, Winkler KU, Theissen G (1999) A DEF/ GLO-like MADS-box gene from a gymnosperm: Pinus radiata contains an ortholog of angiosperm B class floral homeotic genes. Developmental Genetics 25:245-252. Moudarov A, Cremer F, Coupland G (2002) Control of flowering time: interacting pathways as a basis for diversity. The Plant Cell 14:S111-130. Münster T, Pahnke J, Di Rosa A, Kim JT, Martin W, Saedler H, Theissen G (1997) Floral homeotic genes were recruited from homologous MADS-box genes preexisting in the common ancestor of ferns and seed plants. Proceedings of the National Academy of Sciences 94:2415-2420. Muona O, Harju A (1989) Effective population sizes, genetic variability, and mating system in natural stands and seed orchards of Pinus sylvestris. Silvae Genetica 38:221-228. Nakai T (1915) Praecursores ad Floram Sylvaticam Koreanam. II. (Betulaceae). The Botanical Magazine (Tokyo) 29:35-47. Neale DB, Savolainen O (2004) Association genetics of complex traits in conifers. 
Trends in Plant Science 9:325-330. Nei M (1987) Molelecular Evolutionary Genetics. Columbia University Press, New York, USA. Nicholas KB, Nicholas HB Jr. (1997) GeneDoc: a tool for editing and annotating multiple sequence alignments. Distributed by the author. Nordborg M, Borevitz JO, Bergelson J, Berry CC, Chory J, Hagenblad J, Kreitman M, Maloof JN, Noyes T, Oefner PJ, Stahl EA, Weigel D (2002) The extent of linkage disequilibrium in Arabidopsis thaliana. Nature Genetics 30:190-193. 43 Olsen KM, Womack A, Garrett AR, Suddith JI, Purugganan MD (2002) Contrasting evolutionary forces in the Arabidopsis thaliana floral developmental pathway. Genetics 160:1641-1650. Olsen KM, Halldorsdottir SS, Stinchcombe JR, Weinig C, Schmitt J, Purugganan MD (2004) Linkage disequilibrium mapping of Arabidopsis CRY2 flowering time alleles. Genetics 167:1361-1369. Palmé A, Vendramin G (2002) Chloroplast DNA variation, postglacial recolonisation and hybridisation in hazel, Corylus avellana. Molecular Ecology 11:1769-1779. Palmé A, Su Q, Rautenberg A, Manni F, Lascoux M (2003) Postglacial recolonisation and cpDNA variation of silver birch, Betula pendula. Molecular Ecology 12:201-212. Palmé AE, Su Q, Palsson S, Lascoux M (2004) Extensive sharing of chloroplast haplotypes among European birches indicates hybridization among Betula pendula, B. pubescens and B. nana. Molecular Ecology 13:167-178. Pamilo P, Nei M (1988) Relationships between gene trees and species trees. Molecular Biology and Evolution 5:568-583. Pawlowska L (1983) Biochemical and systematic study of the genus Betula L. Acta Societatis Botanicorum Poloniae 52:301314. Petit RJ, Kremer A, Wagner DB (1993) Geographic structure of chloroplast DNA polymorphisms in European oaks. Theoretical and Applied Genetics 87:122128. Petit RJ, Pineau E, Demesure B, Bacilieri R, Ducousso A, Kremer A (1997) Chloroplast DNA footprints of postglacial recolonization by oaks. Proceedings of the National Academy of Sciences 94:9996-10001. 
Posada D, Crandall KA (2001) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proceedings of the National Academy of Sciences 98:13757-13762. Prittinen K, Pusenius J, Koivunoro K, Roininen H (2003) Genotypic variation in growth and resistance to insect herbivory in silver birch (Betula pendula) seedlings. Oecologia 442:572-577. Purugganan MD, Suddith JI (1998) Molecular population genetics of the Arabidopsis CAULIFLOWER regulatory gene: Nonneutral evolution and natural occurring variation in floral homeotic function. Proceedings of the National Academy of Sciences 95:81308134. Purugganan MD, Suddith JI (1999) Molecular population genetics of floral homeotic loci: departures from equilibrium-neutral model at the APETALA3 and PISTILLATA genes of Arabidopsis thaliana. Genetics 151:839848. Radetzky R (1990) Analysis of mitochondrial DNA and its inheritance in Populus. Current Genetics 18: 429-434. Rajora OP, Dancik BP (1992) Chloroplast DNA inheritance in Populus. Theoretical and Applied Genetics 84: 280-285. Regel E (1865) Bemerkungen über die Gattungen Betula und Alnus nebst Beschreibung einiger neuer Arten. Bulletin de la Société Impériale des Naturalistes de Moscou 38:388-434. Riechmann JL, Wang M, Meyerowitz EM (1996a) DNA-binding properties of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA and AGAMOUS. Nucleic Acids Research 24:3134-3141. Riechmann JL, Krizek BA, Meyerowitz EM (1996b) Dimerization specificity of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proceedings of the National Academy of Sciences 93:4793-4798. Riechmann JL, Meyerowitz EM (1997) MADS domain proteins in plant development. Biological Chemistry 378:1079-1101. Rieseberg LH (1997) Hybrid origins of plant species. Annual Review of Ecology and Systematics 28:359-389. 
Rozas J, Rozas R (1999) DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175. Rusanen M, Vakkari P, Blom A (2003) Genetic structure of Acer platanoides and Betula pendula in northern Europe. Canadian Journal of Forest Research 33:1110-1115. 44 Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4:406-425. Sang T, Donoghue MJ, Zhang D (1997) Evolution of alcohol dehydrogenase genes in peonies (Paeonia): phylogenetic relationships of putative nonhybrid species. Molecular Biology and Evolution 14:9941007. Särkilahti E, Valanne T (1990) Induced polyploidy in Betula. Silva Fennica 24:227234. Savard L, Michaud M, Bousquet J (1993) Genetic diversity and phylogenetic relationships between birches and alders using ITS, 18S rRNA, and rbcL gene sequences. Molecular Phylogenetics and Evolution 2:112-118. Savolainen O, Langley CH, Lazzaro BP, Fréville H (2000) Contrasting patterns of nucleotide polymorphism at the alcohol dehydrogenase locus in the outcrossing Arabidopsis lyrata and the selfing Arabidopsis thaliana. Molecular Biology and Evolution 17:645-655. Sawyer SA (1989) Statistical test for detecting gene conversion. Molecular Biology and Evolution 6: 526-534. Sawyer SA (1999) GENECONV: a computer package for the detection of gene conversion. Computer program and documentations distributed by the author. Website: http:// www.wusl.edu/~sawyer. Schwartz-Sommer Z, Huijser P, Nacken W, Saedler H, Sommer H (1990) Genetic control of flower development by homeotic genes in Antirrhinum majus. Science 250:931-936. Semerikov V, Zhang H, Sun M, Lascoux M (2003) Conflicting phylogenies of Larix (Pinaceae) based on cytoplasmic and nuclear DNA. Molecular Phylogenetics and Evolution 27:173-184. 
Shepard KA, Purugganan MD (2003) Molecular population genetics of the Arabidopsis CLAVATA2 region: the genomic scale of variation and selection in a selfing species. Genetics 163:1083-1095. Soltis DE, Johnson LA, Looney C (1996) Discordance between ITS and chloroplast topologies in the Boykinia group (Saxifragaceae). Systematic Botany 21:169185. Soltis DE, Soltis PS (1999) Polyplody: recurrent formation and genome evolution. Tree 14:348-352. Soltis DE, Tago-Nakazawa K, Xiang QY (2001) Phylogenetic relationships and the evolution in Chrysosplenium (Saxifragaceae) based on matK sequence data. American Journal of Botany 88: 883893. Southerton SG, Marshall H, Moudarov A, Teasdale RD (1998) Eucalypt MADS-box genes expressed in developing flowers. Plant Physiology 118:365-372. Stahl EA, Dwyer G, Mauricio R, Kreitman M, Bergelson J(1999) Dynamics of disease resistance polymorphism at the Rpm1 locus of Arabidopsis. Nature 400:667-671. Stanford AM, Harden R, Parks CR (2000) Phylogeny and biogeography of Juglans (Juglandaceae) based on matK and ITS sequence data. American Journal of Botany 87:872-882. Stebbins GL (1971) Chromosomal evolution of higher plants. Addison-Wesley, Reading, Massachusetts, USA. Stern K (1964) Herkunftsversuche für Zwecke der Forstpflanzenzüchtung, erläutert am Beispiel zweier Modellversuche. Der Züchter 34:181-219. Tajima F (1989) Statistical methods for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595. Tang W, Perry SE (2003) Binding site selection for the plant MADS domain protein AGL15: an in vitro and in vivo studies. The Journal of Biological Chemistry 278:28154-28159. The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815. Theissen G (2001) Development of floral organ identity: stories from the MADS house. Current Opinion in Plant Biology 4:75-85. 
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment 45 aided by quality analysis tools. Nucleic Acids Research 24:4876-4882. Thórsson Æ, Salmela TE, AnamthawatJónsson K (2001) Morphological, cytological, and molecular evidence for introgressive hybridisation in birch. Journal of Heredity 92:404-408. Tilly JJ, Allen DW, Jack T (1998) The CArG boxes in the promoter of the Arabidopsis floral organ identity gene APETALA3 mediate diverse regulatory effects. Development 125:1647-1657. Troebner W, Ramirez L, Motte P, Hue I, Huijser P, Loennig WE, Saedler H, Sommer H, Schwarz-Sommer Z (1992) GLOBOSA: a homeotic gene which interacts with DEFICIENS in the control of Antirrhinum floral organogenesis. The EMBO Journal 11;4693-4704. Vandenbussche M, Theissen G, Van de Peer Y, Gerats T (2003) Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations. Nucleic Acids Research 31:4401-4409. Wang X-Q, Tank DC, Sang T (2000) Phylogeny and divergence times in Pinaceae: evidence from three genomes. Molecular Biology and Evolution 17:773-781. Wang X-R, Tsumura Y, Yoshimaru H, Nagasaka K, Szmidt AE (1999) Phylogenetic relationships of Eurasian pines (Pinus, Pinaceae) based on chloroplast rbcL, matK, rpl20-rps18 spacer, and trnV intron sequences. American Journal of Botany 86:1742-1753. Wang X-R, Szmidt AE (2001) Molecular markers in population genetics of forest trees. Scandinavian Journal of Forest Research 16:199-220. Wendel J, Doyle J (1998) Phylogenetic incongruence: window into genome history and molecular evolution. In D. Soltis, P. Soltris and J. Doyle [eds.], Molecular systematics of plants II: DNA sequencing. Kluwer Academic Press, Boston, USA. Whittle C-A, Johnston MO (2002) Male-driven evolution of mitochondrial and chloroplastidial DNA sequences in plants. Molecular Biology and Evolution 19:938949. 
Whittle C-A, Johnston MO (2003) Broad-scale analysis contradicts the theory that generation time affects molecular evolutionary rates in plants. Journal of Molecular Evolution 56:223-233. Williams JH, Arnold ML (2001) Sources of genetic structure in the woody perennial Betula occidentalis. International Journal of Plant Sciences 162:1097-1109. Willis KJ, Rudner E, Sümegi P (2000) The fullglacial forests of central and southeastern Europe. Quaternary Research 53:203-213. Winkler H (1904) Betulaceae. In Das Pflanzenreich, Heft 19 (IV.61). 149 p. Edited by Engler A, W. Engelmann, Leipzig, Germany. Wiuf C, Christensen T, Hein J (2001) A simulation study of the reliability of recombination detection. Molecular Biology and Evolution 18:1929-1939. Wolfe KH, Li W-H, Sharp PM (1987) Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proceedings of the National Academy of Sciences 84:9054-9058. Woodworth RH (1929) Cytological studies in the Betulaceae, l. Betula. The Botanical Gazette 87:331-364. Yanofsky MF (1995) Floral meristems to floral organs: genes controlling early events in Arabidopsis flower development. Annual Review of Plant Physiology and Plant Molecular Biology 46:167-188. Yao JL, Dong YH, Morris BA (2001) Parthenocarpic apple fruit production conferred by transposon insertion mutations in a MADS-box transcription factor. Proceedings of the National Academy of Sciences 98:1306-1311. Yokoyama S, Harry DE (1993) Molecular phylogeny and evolutionary rates of alcohol in vertebrates and plants. Molecular Biology and Evolution 10:1215-1226. Yoshida K, Kamiya T, Kawabe A, Miyashita NT (2003) DNA polymorphism at the ACAULIS5 locus of the wild plant Arabidopsis thaliana. Genes & Genetic Systems 78:11-21. 46 47