
Syst. Biol. 58(3):328–345, 2009
Copyright © Society of Systematic Biologists
DOI:10.1093/sysbio/syp030
Advance Access publication on July 3, 2009
Hybrid Origins and Homoploid Reticulate Evolution within Heliosperma (Sileneae,
Caryophyllaceae)—A Multigene Phylogenetic Approach with Relative Dating
BOŽO FRAJMAN1,2∗, FRIDA EGGENS1, AND BENGT OXELMAN1,3
1 Department of Systematic Botany, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-75236 Uppsala, Sweden;
2 Biology Department, Biotechnical Faculty, University of Ljubljana, Večna pot 111, SI-1000 Ljubljana, Slovenia;
3 Department of Plant and Environmental Sciences, Göteborg University, Box 461, SE-405 30 Göteborg, Sweden;
E-mail:[email protected];
∗ Correspondence to be sent to: Biology Department, Biotechnical Faculty, University of Ljubljana, Večna pot 111, SI-1000 Ljubljana,
Slovenia; E-mail: [email protected].
Abstract.—We used four potentially unlinked nuclear DNA regions from the gene family encoding the second largest subunit of the RNA polymerases, as well as the psbE-petG spacer and the rps16 intron from the chloroplast genome, to evaluate
the origin of and relationships within Heliosperma (Sileneae, Caryophyllaceae). Relative dates of divergence times are used to
discriminate between hybridization and gene duplication/loss as alternative explanations for topological conflicts between
gene trees. The observed incongruent relationships among the three major lineages of Heliosperma are better explained by
homoploid hybridization than by gene duplication/losses because species branching events exceed gene coalescence times
under biologically reasonable population sizes and generation times, making lineage sorting an unlikely explanation. The
origin of Heliosperma is complex and the gene trees likely reflect both reticulate evolution and sorting events. At least two
lineages have been involved in the origin of Heliosperma, one most closely related to the ancestor of Viscaria and Atocion and
the other to Eudianthe and/or Petrocoptis. [BEAST; homoploid hybridization; incongruence; lineage sorting; PATHd8; r8s;
relative dating; RPA2; RPB2; RPD2.]
Molecular phylogenetics in plants is dominated by
studies of chloroplast (cp) and nuclear ribosomal (nr)
DNA sequences, but there is also an increasing number
of studies based on low-copy nuclear genes (reviewed
by Sang 2002). There are at least two good reasons
why a large number of unlinked DNA sequence regions
are desirable for a proper understanding of plant phylogenetic relationships. First, provided that the method
used for phylogenetic inference is consistent, an increased data size will potentially increase accuracy (resolution and support). Second, if the phylogenetic history
of the studied lineages includes reticulate events like hybridization, introgression, or lateral gene transfer, data
from several unlinked regions can be used to detect and
potentially distinguish these events from intragenomic
processes such as gene duplication, recombination, and
lineage sorting (Pamilo and Nei 1988; Linder and
Rieseberg 2004).
The application of low-copy nuclear gene trees has
improved the understanding of reticulate evolution, in
particular of allopolyploids (e.g., Cronn et al. 1999;
Ferguson and Sang 2001; Popp and Oxelman 2001;
Doyle et al. 2003; Mason-Gamer 2004; Popp et al. 2005;
Huber et al. 2006) and to a lesser extent also of diploid
taxa (e.g., Cronn and Wendel 2003; Howarth and Baum
2005; Joly and Bruneau 2006; Poke et al. 2006). Hybridization has played an important role in the evolutionary history of plants (reviewed by e.g., Rieseberg
and Wendel 1993; Rieseberg and Carney 1998; Soltis
et al. 2003). Discordances between gene trees based
on cpDNA and nrDNA have often been explained as
a result of hybridization (e.g., Soltis and Kuzoff 1995;
McKinnon et al. 2001; Okuyama et al. 2005). However,
incongruent phylogenetic patterns can result from a variety of processes (Wendel and Doyle 1998; Slowinski
and Page 1999) that can be classified as interlineage
(hybridization, lateral gene transfer between organismal
lineages) or intralineage (incomplete lineage sorting, orthology/paralogy conflation), both of which may be further complicated by recombination between alleles or
genes. Moreover, phylogenetic trees inferred from different genes may disagree due to stochastic errors or systematic errors (where the phylogenetic models and methods applied fail
to converge to the correct solution). A major challenge for contemporary plant phylogenetics is how to
distinguish different causes of incongruent phylogenetic
patterns from each other.
The extent of homoploid hybridization in nature,
leading to the formation of new organismal lineages of
the same ploidy level as their progenitors, is unclear (reviewed by Rieseberg 1997). There are a few documented
cases of homoploid hybrid speciation, thoroughly investigated in Gossypium (Wendel et al. 1991; reviewed
by Cronn and Wendel 2003), Helianthus (e.g., Rieseberg
1991; Welch and Rieseberg 2002; Gross et al. 2003) and
some other genera (e.g., Wolfe et al. 1998; Ferguson and
Sang 2001; Wang et al. 2001; Howarth and Baum 2005;
James and Abbott 2005; Mir et al. 2006; Poke et al. 2006).
A homoploid hybrid species tends to have a combination of alleles and/or loci that are specific to either parent (Ferguson and Sang 2001), and it is likely that such
hybridization leaves traces in more than one nuclear
gene (Cronn et al. 1999). A phylogenetic analysis of sequences from several low-copy nuclear genes can therefore be a tool to trace the parental lineages involved in
the formation of homoploid hybrids.
Comparing the relative ages of conflicting groups in
different gene phylogenies can be useful for determining which processes are responsible for incongruent
trees. For example, a single hybridization event should
result in two predominant but conflicting phylogenies
where each topology is generally consistent over genes.
By contrast, incomplete lineage sorting of alleles and
gene duplications followed by random extinctions are
expected to generate many conflicting topologies. If the
ages of the conflicting groups are within the coalescence
time intervals of the descendent lineages, it is hard, if
not impossible, to distinguish hybridization from incomplete lineage sorting if just two conflicting gene
phylogenies are at hand (Holder et al. 2001; Joly et al.
2006). Hybridizations and incongruences due to orthology/paralogy conflation can be more readily distinguished if the conflicting divergence ages are outside of
these intervals. Thus, coalescence theory (reviewed by
Rosenberg and Nordborg 2002), knowledge of the ages
of divergences, population sizes, and generation times
can be used to test whether incomplete lineage sorting
of alleles can be rejected as an explanation for observed
incongruent gene phylogenies (Maureira-Butler et al.
2008). Recently, Avise and Robinson (2008) introduced
a new term “hemiplasy” for “topological discordance
between a gene tree and a species tree attributable to
lineage sorting.”
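The coalescent reasoning above can be sketched numerically. This is a minimal illustration under assumptions not stated in the text: a constant-size, randomly mating diploid population, in which two gene lineages remain uncoalesced after t generations with probability exp(-t/(2Ne)), and in which a three-taxon gene tree is discordant with probability (2/3)exp(-t/(2Ne)) for an internal branch of t generations. The function names and all numerical inputs are hypothetical.

```python
import math


def prob_no_coalescence(duration_years, generation_time_years, effective_pop_size):
    """Probability that two gene lineages have NOT coalesced after a given time.

    Assumes a constant-size diploid population (standard coalescent).
    """
    generations = duration_years / generation_time_years
    return math.exp(-generations / (2.0 * effective_pop_size))


def prob_discordant_gene_tree(internal_branch_years, generation_time_years,
                              effective_pop_size):
    """Probability that a three-taxon gene tree disagrees with the species tree
    because of incomplete lineage sorting alone."""
    return (2.0 / 3.0) * prob_no_coalescence(
        internal_branch_years, generation_time_years, effective_pop_size)


# Hypothetical inputs: 1 Myr internal branch, 2-year generations, Ne = 100 000.
p = prob_discordant_gene_tree(1_000_000, 2, 100_000)
print(f"P(discordance by lineage sorting) = {p:.4f}")
```

If such a probability is very small for biologically reasonable population sizes and generation times, incomplete lineage sorting becomes an unlikely explanation for an observed conflict.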
In Figure 1a, the gene trees I and II have identical
topologies but differ significantly with respect to the
divergence time between A and B. If Tree I represents
the species lineage (“lineage” in the following text)
phylogeny, then the gene phylogeny in Tree II is best
explained by graph IV, where a duplication of the gene
preceded the split between the lineages A and B, each of
which has subsequently lost a copy. A similar pattern would
also result from incomplete lineage sorting. Conversely,
if Tree II reflects the organismal phylogeny, then Tree I is
best explained by graph III, where hybridization/lateral
gene transfer between species A and B after their divergence has caused the age of the copy present in B to
reflect the age of this event rather than the lineage split
event, and where the original copy of the B lineage has
gone extinct.
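The logic of Figure 1a can be written as a simple decision rule: a gene divergence older than the lineage split points to duplication/loss (or incomplete lineage sorting), whereas a gene divergence younger than the lineage split points to hybridization or lateral gene transfer. The sketch below is only an illustration of that rule. The function name and the `tolerance` parameter (a stand-in for dating uncertainty) are assumptions, and the ages in the usage lines are hypothetical.

```python
def explain_gene_divergence(gene_age, lineage_split_age, tolerance=0.0):
    """Classify a gene divergence between A and B relative to the A/B lineage split.

    Ages are in the same (relative) time units; `tolerance` is a hypothetical
    allowance for uncertainty in the age estimates.
    """
    if gene_age > lineage_split_age + tolerance:
        # Gene copies diverged before the lineages did.
        return "duplication/loss or incomplete lineage sorting"
    if gene_age < lineage_split_age - tolerance:
        # Gene copies are younger than the lineage split: gene flow is required.
        return "hybridization/lateral gene transfer"
    return "consistent with the lineage split"


print(explain_gene_divergence(gene_age=10.0, lineage_split_age=5.0))
print(explain_gene_divergence(gene_age=2.0, lineage_split_age=5.0))
```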
The situation in Tree V (Fig. 1b) is analogous to the
scenario for Tree I. Tree V is best explained by graph
VII, assuming that Tree VI reflects the lineage tree.
There is an ancient split between the two lineages B
and AC, followed by a split between the lineages A
and C. After this, hybridization takes place between
lineages A and B, leading to the loss of the copy in B
followed by continued divergence of the lineages A
and B. Note that the age of the group CBA in Tree V
is then expected to be equal to the age of CA in Tree
VI. By contrast, if the gene tree looks like Tree VI and
Tree V reflects the lineage tree, the divergence of A, B,
and C must have been preceded by gene duplication,
as seen in graph VIII. One copy from each taxon has
later gone extinct (or is not sampled), causing the observed pattern. Thus, the divergence between C and
A in Tree VI must be older than that between B and
A in Tree V, and the age of the root in Tree VI is expected to be older than the root of Tree V because the
age of the root in Tree VI reflects the age of the gene
duplication.
FIGURE 1. a) Two gene trees (I and II) with identical topologies but
different divergence times between A and B, and hypothetical explanations of their history, III (hybridization/lateral gene transfer) and IV
(gene duplication followed by gene loss), explaining I and II, respectively, assuming that the other gene tree (II and I, respectively) reflects
the “true” species tree. b) Two gene trees (V and VI) with different
topologies and different divergence times between A and B, and hypothetical explanations of their history, VII (hybridization/lateral gene
transfer) and VIII (gene duplication preceding the divergence of A, B,
and C), explaining V and VI, respectively, assuming that the other gene
tree (VI and V, respectively) reflects the “true” species tree. Letters in
gray indicate extinctions/nonsampled gene copies.
Provided that the gene trees have a common root,
the relative ages of nodes in the different trees can be
assessed. This can give us the order in time of the interior nodes of the tree, making it possible to compare
gene trees in the way described above. Reconciling discordant trees and identifying the underlying processes
of gene duplication, gene loss, and horizontal gene
transfer is currently an area of active research (e.g.,
Sang and Zhong 2000; V’yugin et al. 2002; Górecki 2004;
Hallett et al. 2004). Ideally, a model should be able to
consider these processes simultaneously while inferring
the species tree from multiple gene trees. However, we
are not aware of any successful implementation of this.
Our aim here is to demonstrate the usefulness of the
relative ages in a collection of gene trees to discriminate
between duplication and horizontal events.
Popp and Oxelman (2004) devised a protocol for
polymerase chain reaction (PCR) amplification and
sequencing of several presumably unlinked nuclear
DNA regions from the gene family encoding the second
largest subunit of the RNA polymerases (RNAP) and
applied it to the tribe Sileneae (Caryophyllaceae), a
monophyletic group supported by nrDNA and cpDNA
sequences (e.g., Oxelman and Lidén 1995; Oxelman
et al. 1997). A phylogenetic analysis of the combined
RNAP DNA regions together with cpDNA and nrDNA
internal transcribed spacer (ITS) sequences resulted in
a well-resolved and supported phylogeny of Sileneae.
Erixon and Oxelman (2008) thoroughly analyzed approximately 25 kb of cpDNA sequence data for roughly
the same set of taxa, which resolved most branches
of the tree with high support. However, Heliosperma
(Rchb.) Rchb., nom. cons. prop. (Frajman and Rabeler
2006), was not included in the RNAP study and was represented by a single sequence (Heliosperma alpestre) in the
cpDNA study.
Heliosperma includes several diploid (2n = 24) taxa
from the Southern and Central European mountains
(e.g., Jalas and Suominen 1986). Monophyly of
Heliosperma, clearly separated from the core of Silene,
is supported by ITS and cpDNA rps16 intron sequences,
but its phylogenetic position remains uncertain (Frajman
and Oxelman 2007). Within Heliosperma, three major
clades were resolved with strong support, corresponding to H. alpestre, H. macranthum, and the H. pusillum
group. The phylogenetic relationships among these
three differ between the nuclear and the plastid data
sets, possibly as a result of ancient hybridization or
orthology/paralogy conflation. Also within the
H. pusillum group, the plastid and nuclear data revealed
incongruent patterns, with a clear geographical division
into a Western and an Eastern group in the plastid data.
These were hypothetically explained by ancient (2–20
Ma) divergence reflected in the cpDNA and subsequent
hybridization caused by secondary contacts due to geological and climatological changes and reflected by
extensive additive polymorphic sites and homoplasy in
the ITS data (Frajman and Oxelman 2007).
In this study, we explore the potential of multiple gene
phylogenies to address the origin of Heliosperma as well
as the relationships between H. alpestre, H. macranthum,
and the H. pusillum group. We use the relative ages
of gene tree divergences to address conflicting gene
phylogenies concerning the relationships between the
Heliosperma lineages and their implications for the hypothesized hybridization history within the genus and
the relationships to other Sileneae taxa. Specifically,
we do this by using four low-copy nuclear DNA sequence regions from the second largest subunit of the
RNAP gene family (RPA2, RPB2, RPD2a, and RPD2b),
as well as the psbE-petG spacer and the rps16 intron
from the chloroplast genome. The ITS region is not included in this study because its presumed concerted
evolution preceded by hybridization within Heliosperma
makes it unsuitable for dating (Frajman and Oxelman
2007).
MATERIAL AND METHODS
Plant Material, Sequence Regions, and DNA Extraction
Taxa from all major clades of Sileneae were selected
on the basis of previous studies (Oxelman and Lidén
1995; Oxelman et al. 1997; Popp and Oxelman 2004;
Frajman and Oxelman 2007). Voucher data and EMBL/
GenBank accession numbers of specimens are listed
in the online Appendix 1 (http://sysbio.oxfordjournals.
org/).
We sequenced four RNAP regions, three of them
(RPA2, RPD2a, and RPD2b) corresponding to the regions
used in the study by Popp and Oxelman (2004), whereas
the RPB2 region was extended and includes the sequence corresponding to the region between the middle
of exon 18 and the beginning of exon 24 in Arabidopsis
(GenBank accession no. AL035527). The chloroplast
regions used in our study were the psbE-petG spacer
region and the rps16 intron (see e.g., Oxelman et al.
1997; Frajman and Oxelman 2007; Erixon and Oxelman
2008). For dating purposes, we also used the chloroplast
matK region (online Appendix 2). The taxon sampling
across the DNA regions is more or less identical,
with some exceptions (see online Appendix 1); for example, all RPD2b sequences of the taxa belonging to
Silene subg. Silene are missing, as this RPD2 paralogue
is probably extinct in this lineage (Popp and Oxelman
2004).
Total genomic DNA was extracted from herbarium
specimens or silica gel–dried material following the
protocol of Oxelman et al. (1997) and purified using the
QIAquick purification kit protocol (Qiagen, Alameda,
CA) or GFX PCR DNA and Gel Band Purification Kit
(Amersham Biosciences, Piscataway, NJ).
PCR, Cloning, Sequencing, and Contig Assembly
PCRs were performed as described by Popp and
Oxelman (2004). Additionally, some amplifications were
performed using the Phusion High-Fidelity PCR kit.
The psbE-petG region was amplified (and sequenced)
using the psbE-RF and petG-R primers and the same
PCR conditions as Erixon and Oxelman (2008). The
rps16 intron sequences were amplified and sequenced
as described by Oxelman et al. (1997) and Frajman and
Oxelman (2007). For the RNAP regions, Sileneae-specific
primers (Popp and Oxelman 2004) were used, mainly
applying the reaction conditions described there.
Additional H. pusillum/macranthum-specific primers were designed for amplification of
the RPD2a region (Table 1) because amplification with
Sileneae-specific primers (that successfully amplified this
region in H. alpestre) was not successful in H. pusillum
and H. macranthum. The RPB2 region was amplified
and sequenced between exons 18 and 24,
using the RPB2 SE18F and r7586 primers (Popp and
Oxelman 2004; Table 1). The matK region was amplified
and sequenced using matK2.1F and matK-3.2R primers
(Table 1) and applying the following PCR conditions:
one cycle of 94°C 1 min, followed by 40 cycles of 94°C
30 s, 53°C 40 s, 72°C 40 s, and 72°C 5 min.
TABLE 1. Primers, not published elsewhere, designed for this study
Region    Primer name         Sequence
RPB2      RPB2 SE18F          GGCAGCTTCCWGCAGGAATTGT
RPB2      RPB2 SE23R7380      GCGTCTCCTTCYTTACCCACATGAGCA
RPB2      RPB2 HmacI23Ra      AAGGGAAGCACGCCATAAGCA
RPB2      RPB2 HmacI23Rb      AAGGGATGCACCATAAGCA
RPB2      RPB2 HI22Ra         GATTATTGAGAGCTGATCAGG
RPB2      RPB2 HI22Rb         AATTATTGAGAGCTGATCAGC
RPD2a     RPD2FSaHelPus       GGTATCCATTTAMGACTKTCYTTTG
RPD2a     RPD2RSaHelPus       TCGACAGAATCCAGCCCTGCA
matK      matK2.1F            CCTATCCATCTGGAAATCTTAG
matK      matK-3.2R           CTTCCTCTGTAAAGAATTC
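Several primers in Table 1 contain IUPAC degenerate bases (W, Y, M, K), so each such entry stands for a small pool of oligonucleotides. The following sketch, which is not part of the original protocol, expands a degenerate primer into its concrete variants. The helper names are hypothetical; the code table is the standard IUPAC nucleotide alphabet.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC_BASES[base])
    return count


def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC_BASES[base] for base in primer.upper()]
    return ["".join(variant) for variant in product(*choices)]


# RPD2FSaHelPus from Table 1 carries three two-fold degenerate positions (M, K, Y).
variants = expand_primer("GGTATCCATTTAMGACTKTCYTTTG")
print(len(variants), "variants")
```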
PCR products were purified using Multiscreen PCR
(Millipore, Billerica, MA) and then sequenced with
the PCR or nested primers (Popp and Oxelman 2004;
Table 1), using cycle sequencing terminator kits
provided by Amersham Pharmacia Biotech (APB,
Piscataway, NJ) or Applied Biosystems (AB, Foster City,
CA), and visualized on a MegaBACE 1000 (APB) or
on ABI 3700/3730XL (AB). In most cases, we obtained
unambiguous sequences, but in some cases direct sequencing resulted in polymorphic sequences. In order
to recover single RPA2 sequences from the H. pusillum
group, we designed internal sequencing primers, with
the 3’-end being specific for polymorphic regions/sites.
To survey for the presence of sequence variants,
amplification products of the RNAP introns from some
Heliosperma accessions of all three main lineages were
cloned with the TOPO TA Cloning kit (Invitrogen,
Carlsbad, CA) according to the manufacturer’s manual,
using half-volume reactions. Between 20 and 40 positive colonies from each reaction were screened by direct
PCR using the T7 and M13R universal primers. Between 10 and 20 PCR products from individual colonies
were then sequenced using either the T7 or the M13R
primer. The quality of these sequences was usually
high; therefore, both strands were sequenced only if
it was not possible to read the entire sequence of the
cloned fragment with a single sequencing primer. On
the basis of these sequences, RPB2 paralogue-specific
PCR/sequencing primers were designed (Table 1). To
search for paralogues/gene copies within the accessions
belonging to the three main lineages of Heliosperma, all
possible PCR/sequencing forward and reverse primer
combinations were used in amplifications, but only the
copies presented here were found.
Contigs were assembled and edited using Sequencher
3.1.1 (Gene Codes Corporation). Base polymorphisms
were coded using the NC-IUPAC ambiguity codes. A
polymorphism was coded when more than one peak
was present at the same position in the electropherogram using the same criteria as described by Frajman
and Oxelman (2007). If different sequences were obtained from clones of the same accession, they were all
included in preliminary phylogenetic analyses. Polymorphism consensus sequences were constructed from
clones from the same plant that formed monophyletic
groups. In the final phylogenetic analyses, only the consensus sequences were used from these plants.
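As an illustration of how a polymorphism consensus could be built from aligned clone sequences, the sketch below collapses each alignment column into the IUPAC ambiguity code for the set of bases observed there. It assumes clones of equal aligned length containing only A, C, G, and T (no gaps), and it is not the procedure implemented in Sequencher; the function name is hypothetical.

```python
# Map each set of observed bases to its IUPAC ambiguity code.
IUPAC_FROM_BASES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def polymorphism_consensus(clones):
    """Collapse aligned clone sequences into one IUPAC-coded consensus sequence."""
    if len({len(seq) for seq in clones}) != 1:
        raise ValueError("clones must be non-empty and of equal aligned length")
    return "".join(
        IUPAC_FROM_BASES[frozenset(column)]
        for column in zip(*(seq.upper() for seq in clones))
    )


# Hypothetical clones from one plant.
print(polymorphism_consensus(["ACGT", "ACAT", "ATGT"]))
```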
The sequences were aligned manually using Se-Al
(Rambaut 1996) and BioEdit (Hall 1999) following
the criteria described by Popp and Oxelman (2004).
Alignments are available on TreeBASE (study number
SN4178). Gaps (indels) were coded as binary characters using SeqState version 1.25 (Müller 2005) applying
simple gap coding (Simmons and Ochoterena 2000).
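A rough sketch of simple gap coding, for readers unfamiliar with the idea: every gap with a distinct pair of start and end positions becomes one binary character, scored 1 where a sequence has exactly that gap, 0 where it has none, and "?" (inapplicable) where a longer gap completely encompasses it. This is a simplified illustration and not the SeqState implementation. In particular, it treats leading and trailing gaps like any other gap, which real analyses usually score as missing data. The function names are hypothetical.

```python
import re


def find_gaps(sequence):
    """Return (start, end) half-open intervals of every run of '-' in a sequence."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", sequence)]


def simple_gap_coding(alignment):
    """Code indels as binary characters, one per distinct (start, end) gap."""
    gaps_per_sequence = [find_gaps(seq) for seq in alignment]
    characters = sorted({gap for gaps in gaps_per_sequence for gap in gaps})
    matrix = []
    for gaps in gaps_per_sequence:
        row = []
        for start, end in characters:
            if (start, end) in gaps:
                row.append("1")      # this exact gap is present
            elif any(s <= start and e >= end for s, e in gaps):
                row.append("?")      # a longer gap encompasses it: inapplicable
            else:
                row.append("0")      # gap absent
        matrix.append("".join(row))
    return characters, matrix


characters, matrix = simple_gap_coding(["ACGTACGT", "AC--ACGT", "A----CGT"])
print(characters)
print(matrix)
```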
To screen for putative recombination breakpoints,
Genetic Algorithm Recombination Detection (GARD;
Kosakovsky Pond et al. 2006) was used online (http://
www.datamonkey.org/GARD). We used the GARD detection method using nucleotide substitution models
as suggested by the model selection tool on the GARD
Web page for each data set, with gamma substitution
rate variation along the sequence alignment. GARD can
detect recombination between ancestral sequences and
identify nonrecombinant and recombinant sequences. It
outperforms other methods in terms of levels of Type I
and Type II error (Kosakovsky Pond et al. 2006) and can
employ complex models of substitution.
Phylogenetic Analyses
Maximum parsimony (MP) and maximum parsimony bootstrap (MPB) analyses of all RNAP
data sets and of the chloroplast data set (concatenated
rps16 and psbE-petG sequences) were performed using
PAUP* version 4.0b10 (Swofford 2002). The most parsimonious trees were searched for heuristically with
100 replicates of random sequence addition, tree bisection reconnection (TBR) swapping, and the multrees
option on. In cases where these search settings resulted
in large numbers of trees, we performed swapping on
a maximum of 1000 trees (nchuck = 1000). All characters were equally weighted and unordered. The data
were bootstrapped using full heuristics, 1000 replicates,
TBR branch swapping, multrees option off, and random addition sequence with five replicates. Agrostemma
githago was used as the out-group for rooting, based on
previous studies (Oxelman and Lidén 1995; Oxelman
et al. 1997). Bayesian analyses were performed on the
RNAP, cpDNA, and Caryophyllaceae matK data sets
using MrBayes version 3.1 (Huelsenbeck and Ronquist
2001; Ronquist and Huelsenbeck 2003). The data
were partitioned into a nucleotide set and an indel
set, which was treated as morphological data according to the model of Lewis (2001). Each data set was
analyzed with the default prior distributions and applying a model of evolution proposed by MrAIC.pl version
1.4 (Nylander 2004) using the Akaike criterion. The
Markov chain Monte Carlo (MCMC) chains were run
for 1 000 000 generations. Every 100th tree was saved,
resulting in 10 000 trees for each data set, of which the
first 2000 were discarded as a conservative characterization of the burn-in phase. The AWTY (Are We There
Yet?) system (Wilgenbusch et al. 2004; Nylander et al.
2008) was used to explore topological convergence in
the MCMC runs.
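The sampling arithmetic of the MrBayes runs can be checked with a few lines. The helper below is a hypothetical convenience, not part of MrBayes; it ignores the sample taken at generation 0, in line with the round numbers given in the text.

```python
def mcmc_sample_counts(generations, sample_every, burn_in_samples):
    """Return (saved samples, samples kept after burn-in, burn-in fraction)."""
    saved = generations // sample_every
    if burn_in_samples > saved:
        raise ValueError("burn-in exceeds the number of saved samples")
    return saved, saved - burn_in_samples, burn_in_samples / saved


# 1 000 000 generations, every 100th tree saved, first 2000 trees discarded.
saved, kept, burn_in_fraction = mcmc_sample_counts(1_000_000, 100, 2000)
print(saved, kept, burn_in_fraction)
```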
As a null hypothesis, we expect that the individual
gene trees all reflect the same phylogenetic tree. Gene
copies in aberrant positions were identified as strongly
supported if there was mutual conflict, where MPB >
75% and posterior probability (PP) > 0.95. We regard
this as a powerful method because the topological location of the conflict is directly identified. We also use
these support levels to identify well-supported branches
in the individual gene trees.
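The conflict criterion can be sketched as follows, reading it as requiring both thresholds (MPB > 75% and PP > 0.95) on each of the two conflicting clades. Two clades from rooted gene trees are taken to conflict when they share taxa but neither contains the other. The `Clade` class, the function names, and the support values in the usage lines are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Clade:
    taxa: frozenset   # terminals (lineages) included in the clade
    mpb: float        # maximum parsimony bootstrap support, in percent
    pp: float         # Bayesian posterior probability


def is_well_supported(clade, min_mpb=75.0, min_pp=0.95):
    return clade.mpb > min_mpb and clade.pp > min_pp


def clades_conflict(a, b):
    """Clades conflict if they overlap without one being nested in the other."""
    shared = a.taxa & b.taxa
    return bool(shared) and not (a.taxa <= b.taxa or b.taxa <= a.taxa)


def supported_conflicts(tree_a, tree_b):
    """All pairs of mutually conflicting, well-supported clades from two gene trees."""
    return [
        (a, b) for a, b in product(tree_a, tree_b)
        if is_well_supported(a) and is_well_supported(b) and clades_conflict(a, b)
    ]


# Hypothetical clades and support values for three lineages A, B, and C.
gene_tree_1 = [Clade(frozenset({"A", "B"}), 90.0, 0.99)]
gene_tree_2 = [Clade(frozenset({"B", "C"}), 88.0, 1.0),
               Clade(frozenset({"A", "B", "C"}), 100.0, 1.0)]
print(supported_conflicts(gene_tree_1, gene_tree_2))
```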
We used DupTree version 1.48 (Wehe et al. 2008)
with full queue (500) heuristics to infer a species tree
that minimizes the number of gene duplications. Binary gene trees from cpDNA, RPA2, RPB2, RPD2a,
and RPD2b were used as input. All optimal trees were
output; PAUP* version 4.0b10 (Swofford 2002) was used
to determine the number of distinct topologies with the
“condense collapse = no” command. Primetv (Sennblad
et al. 2007) was used to visualize the gene trees reconciled in the species tree using the Web service at
http://prime.sbc.su.se/cgi-bin/primetv.cgi.
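The quantity that this kind of analysis minimizes, the number of gene duplications needed to reconcile a gene tree with a candidate species tree, can be counted with the classical least-common-ancestor (LCA) mapping. The sketch below scores a single binary gene tree against a given species tree; it does not search for the optimal species tree and is not DupTree's implementation. Trees are nested tuples, and the convention that gene names look like `Species_copy` is a hypothetical choice for the example.

```python
def species_clusters(species_tree):
    """Collect the leaf set (cluster) of every node in a nested-tuple species tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            cluster = frozenset([node])
        else:
            cluster = frozenset().union(*(walk(child) for child in node))
        found.append(cluster)
        return cluster

    walk(species_tree)
    return found


def count_duplications(gene_tree, species_tree,
                       species_of=lambda gene: gene.split("_")[0]):
    """Count duplication nodes in a binary gene tree using LCA mapping."""
    clusters = species_clusters(species_tree)

    def lca(taxa):
        # Smallest species-tree cluster containing all the given taxa.
        return min((c for c in clusters if taxa <= c), key=len)

    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca(frozenset([species_of(node)]))
        left, right = (walk(child) for child in node)
        mapped = lca(left | right)
        if mapped == left or mapped == right:
            duplications += 1   # node maps to the same species node as a child
        return mapped

    walk(gene_tree)
    return duplications


species = (("A", "B"), "C")
print(count_duplications((("A_1", "C_1"), "B_1"), species))
```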
Dating
We estimated the age of the tribe Sileneae using the chloroplast matK gene sequence data by Fior et al.
(2006), complemented by some additional Sileneae taxa (online Appendix 2), using PATHd8 (Britton et al.
2006, 2007), r8s version 1.7 (Sanderson 2003), and BEAST version 1.4.7 (Drummond and Rambaut 2007). For the
dating performed with r8s, we used the penalized likelihood (PL) method (Sanderson 2002). The node ages in
PATHd8 were estimated by their corresponding mean path length (MPL), that is, the average of all path lengths
from the node to the terminals descended from that node (Britton et al. 2002).
For calibration of the Caryophyllaceae matK tree, we used the inflorescence fossil (Caryophylloflora paleogenica
G. J. Jord. & Macphail) described by Jordan and Macphail (2003), which they determine as belonging to
Caryophyllaceae. More specifically, they conducted a phylogenetic analysis based on morphological characters
scored for both fossil and extant species and molecular characters for the extant species and position the fossil as
sister to or within the subfamilies Alsinoideae and Caryophylloideae. When dating the matK-tree with r8s
and PATHd8, we used the extremes of the age interval 34–45 Ma, which corresponds to the middle to late
Eocene age of the fossil (Gradstein et al. 2004) as fixed ages for the Alsinoideae/Caryophylloideae node. The
majority rule tree with mean branch lengths obtained from posterior distributions of the MCMC analyses with
MrBayes was used as input to r8s and PATHd8. The BEAST analyses of the matK data set were performed
with a Yule tree prior, GTR+Γ substitution model parameters, and an uncorrelated relaxed lognormal
clock (Drummond et al. 2006). The prior age of the Alsinoideae/Caryophylloideae was set using a lognormal
distribution with mean = 0.0, standard deviation = 1.0,
and offset = 34 Myr (i.e., end of Eocene), thus setting a
hard lower (younger) bound to the age of the group of
34 Myr. Two independent MCMC chains were run for
30 000 000 generations with tree and parameter values
saved every 3000th generation.
The dating of the Sileneae RNAP and cpDNA data
sets was performed using the same methods as for dating the matK tree (see above). The trees were rooted
with the out-group (A. githago), forcing the in-group to
become a monophyletic sister group. The median age of
the in-group crown group obtained from the matK analysis (16.5 Myr) was used as a fixed age to calibrate the
gene trees in the r8s and PATHd8 analyses. For BEAST,
the prior age of the root was set to the date obtained
from the matK analysis (20.81 Myr), with a normally distributed standard deviation of 2.9, which corresponds
well to the combined 95% highest posterior densities
(HPD) from the two BEAST runs (15.14–26.49 Myr).
Tracer version 1.3 (Rambaut and Drummond 2004) was used to determine the degree of mixing, the shape of
the probability density distributions, and 95% credibility intervals for estimated divergence dates. Effective
sample sizes (ESS), that is, the number of effectively independent draws from the posterior distribution, and
mixing, that is, efficiency with which MCMC algorithm samples a parameter (Drummond and Rambaut 2007),
in all BEAST analyses were good (i.e., ESS always exceeded 100 and the parameter traces showed the chain
fluctuating rapidly around the equilibrium).
FigTree (Rambaut 2006) was used to display the trees (one for each data set) after combining the tree files
using LogCombiner and summarizing the information using TreeAnnotator (both programs available in
BEAST package). In order to check whether the PPs were actually affected by the data, we also ran analyses
in which BEAST only sampled from the prior distributions (Drummond et al. 2006). The resulting ages from
these runs were in general well outside the range of ages from the runs with data, indicating informativeness of
the data.
RESULTS
No evidence for recombination could be found by the GARD analyses. Consequently, the tree model is considered
appropriate for all the individual gene data sets, and no sequences were removed from the alignments.
The estimated median age of Sileneae from the BEAST analyses of the matK data set is 20.81 Myr (Fig. 2; 95%
HPD: 15.13–26.5 Myr). The age of Sileneae estimated by
PL/MPL was 23/20 or 31/26 Myr when using the calibration extremes of 34 or 45 Myr, respectively.
FIGURE 2. Bayesian consensus chronogram (obtained with BEAST) of the Caryophyllaceae matK sequences, with the fixage
node marked (from the age of the inflorescence fossil Caryophylloflora paleogenica). Caryophylloideae and Alsinoideae are shaded. Numbers
below branches indicate PPs obtained from Bayesian analyses. Only values above 0.70 are shown. Numbers (in bold) associated with nodes in
the chronogram indicate the median crown group age in millions of years of the clade above that node. The calculated age of the tribe Sileneae
is indicated.
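To make the MPL idea concrete, the sketch below computes the mean path length of a node and converts it to an age by simple proportional scaling against one fixed calibration node. This is only the core quantity, assuming a single fixed calibration; PATHd8 itself does more than this. The tree encoding, function names, and branch lengths are hypothetical. Under this proportional scaling, moving the calibration from 34 to 45 Myr multiplies every age by 45/34 (about 1.32), which is roughly the proportion between the reported MPL estimates of 20 and 26 Myr.

```python
def tip_path_lengths(children):
    """Distances from a node to every terminal descended from it.

    A node is a list of (subtree, branch_length) pairs; a terminal subtree
    is a string, an internal subtree is another such list.
    """
    lengths = []
    for subtree, branch_length in children:
        if isinstance(subtree, str):
            lengths.append(branch_length)
        else:
            lengths.extend(branch_length + d for d in tip_path_lengths(subtree))
    return lengths


def mean_path_length(children):
    lengths = tip_path_lengths(children)
    return sum(lengths) / len(lengths)


def scaled_age(node_mpl, calibration_mpl, calibration_age):
    """Age of a node, assuming ages are proportional to MPL."""
    return calibration_age * node_mpl / calibration_mpl


# Hypothetical branch lengths (substitutions per site).
inner = [("A", 0.10), ("B", 0.14)]
root = [(inner, 0.08), ("C", 0.22)]
for calibration_age in (34, 45):
    age = scaled_age(mean_path_length(inner), mean_path_length(root), calibration_age)
    print(calibration_age, round(age, 2))
```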
Phylogenetic Relationships and Relative Dating
Gene-specific information about the size of the alignments, parsimony metrics, the model of evolution proposed by MrAIC, as well as the coefficient of variation
from BEAST analyses are presented in Table 2. The coefficient of variation gives an indication of how clock-like
the data are. Values close to 0.0 indicate that the data
are clock-like, whereas values much greater than 1.0
indicate substantial rate heterogeneity among lineages
(Drummond et al. 2006). The PPs for the clades obtained by MrBayes and BEAST analyses were similar in
most cases, with an exception in the RPA2 data (0.51/1.0
PP of the H. macranthum/H. alpestre clade obtained by
BEAST/MrBayes analyses).
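For orientation, the coefficient of variation of branch rates is the standard deviation of the rates divided by their mean. The sketch below uses the population standard deviation, which is an assumption on our part (BEAST's exact estimator is not specified here), and the rate values are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(branch_rates):
    """Standard deviation of branch rates divided by their mean."""
    average = mean(branch_rates)
    if average == 0:
        raise ValueError("mean rate must be non-zero")
    return pstdev(branch_rates) / average


# Identical rates on every branch correspond to a strict clock (value 0.0).
print(coefficient_of_variation([1.0, 1.0, 1.0]))
print(coefficient_of_variation([0.5, 1.0, 1.5, 3.0]))
```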
The ages of selected clades as inferred from the three
dating methods (PATHd8, r8s, and BEAST) are presented in Figure 3; the age estimation of all major groups
is available in the online Appendix 3. The dates estimated with the three methods are largely correlated,
but some differences can be observed (Fig. 3, online
Appendix 3). However, the HPDs of the age estimations
in BEAST, that is, the credible sets that contain 95%
of the sampled values, are rather broad, up to almost
16.4 Myr for the in-group age estimations (see online
Appendix 3).
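A 95% HPD interval can be approximated from posterior samples as the narrowest interval containing 95% of the sampled values. The following is a generic sketch of that idea, not Tracer's or BEAST's code; the function name and the sample values are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the sampled values."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples given")
    window = max(1, math.ceil(mass * n))   # number of samples inside the interval
    # Slide a window of that many consecutive samples and keep the narrowest one.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


# Hypothetical posterior sample of node ages (Myr).
ages = [9.1, 9.8, 9.9, 10.0, 10.2, 10.4, 10.9, 11.5, 12.3, 17.0]
low, high = hpd_interval(ages, mass=0.5)
print(low, high, "width:", round(high - low, 2))
```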
cpDNA
The cpDNA tree (Fig. 4) suggests a sister group position of Heliosperma to a moderately supported (MPB
85%, 0.92 PP) clade including Eudianthe, Viscaria, and
Atocion. Interspecific relationships within Heliosperma
are congruent with the rps16 phylogeny in Frajman
and Oxelman (2007): H. macranthum is sister to the
H. pusillum group, and they constitute a sister group
to H. alpestre. The age of the in-group estimated with
BEAST is c. 17.1 Myr, whereas the age of Heliosperma estimated with all three methods is c. 9.9 (BEAST) to 12.4
(MPL) Myr. The age estimation of the H. pusillum group
is 3.8 (BEAST) to 4.5 (MPL), whereas the age of the Western H. pusillum group varies between 0.9 (BEAST) and
1.4 (PL) Myr and the Eastern group varies between 2.5
and 3.2 Myr.
RPA2
All Heliosperma RPA2 sequences form a strongly supported monophyletic group (Fig. 5). The Heliosperma sequences are strongly supported as sister to the Viscaria
and Atocion sequences, all three forming a monophyletic
group with Petrocoptis and Eudianthe (MPB 84% and
0.97 PP). Heliosperma macranthum and H. alpestre form
a sister clade to the sequences from the H. pusillum
group, where two putative paralogous copies are found.
Heliosperma macranthum possesses one extra copy, which
is sister to one of the two H. pusillum group paralogues
(MPB 90% and 0.97 PP). The age estimations of the
Heliosperma sequences range between 6.3 (BEAST) and
7.8 (PL) Myr.
RPB2
The RPB2 phylogeny identifies one group of
Heliosperma sequences (Fig. 6) that share a common
origin with the Viscaria and Atocion sequences but
with moderate support (MPB 76% and 1 PP). These
Heliosperma sequences and A. rupestre form a monophyletic group with weak support and poor resolution
within. Heliosperma alpestre and H. macranthum possess
a second copy of RPB2, which groups confidently with
Eudianthe (MPB 100%, 1 PP) and these together form
a clade that is sister to Petrocoptis with strong support
(MPB 100% and 1 PP). The median age of the
in-group estimated with BEAST (13.4 Myr) is considerably lower
than in the other gene trees, where this age is
15.2 (RPA2) to 17.1 Myr (cpDNA).
TABLE 2. Matrix and phylogenetic analyses statistics for the different sequence regions analyzed, and substitution models proposed by MrAIC, and posterior coefficient of variation distribution from BEAST analyses
Region | Number of terminals | Number of included characters (nucleotides/indels) | Number/percentage of parsimony informative characters | Number/length of MP trees | Consistency index (excluding uninformative characters)/retention index | Substitution model | Coefficient of variation, median (95% lower, upper HPD)
cpDNA | 38 | 3180 (2881/299) | 525/16.5% | 12/1494 | 0.747 (0.620)/0.849 | GTR+Γ | 0.486 (0.339, 0.668)
RPA2 | 54 | 3977 (3839/138) | 328/8.2% | 97000/905 | 0.785 (0.689)/0.879 | GTR+Γ | 0.646 (0.468, 0.854)
RPB2 | 44 | 2120 (1897/223) | 593/28.0% | 49191/1519 | 0.777 (0.700)/0.828 | GTR+Γ | 0.612 (0.421, 0.834)
RPD2a | 38 | 3311 (3172/139) | 315/9.5% | 50/794 | 0.822 (0.741)/0.892 | GTR+Γ | 0.507 (0.316, 0.736)
RPD2b | 34 | 2164 (2048/116) | 252/11.6% | 162/774 | 0.832 (0.719)/0.851 | GTR+Γ | 0.663 (0.431, 0.937)
FIGURE 3. Relative comparison of the ages in million years obtained from PATHd8, r8s, and BEAST for the groups discussed (see
Discussion). The in-group age 16.5 Myr was set as a calibration point
in the r8s and PATHd8 analyses.
RPD2a
A single RPD2a sequence was amplified from each
of the Heliosperma accessions. The relationships among
basal branches of this gene tree are poorly resolved
(Fig. 7). Heliosperma alpestre is sister to Petrocoptis (MPB
88% and 1 PP), whereas H. macranthum is sister to the
H. pusillum group (MPB 100% and 1 PP). Viscaria vulgaris
is sister to Atocion with strong support (MPB 100% and
1 PP). The age of the in-group estimated with BEAST is
c. 15.5 Myr.
RPD2b
The relationships among basal branches in the RPD2b
tree are poorly resolved (Fig. 8). The Heliosperma sequences form a strongly supported group, but its location in the tree is ambiguous. One RPD2b sequence
per Heliosperma accession was obtained, and H. alpestre/
H. macranthum form a monophyletic group (MPB 76%
and 1 PP). The age estimation of the in-group with
BEAST is 15.8 Myr and for Heliosperma it varies between
6.3 (BEAST) and 6.9 (PL) Myr.
Congruence Assessment
The relationships among the main groups of Sileneae
and Heliosperma and the median ages from the BEAST
analyses are summarized in diagrams in Figure 9, where
all the nodes with support lower than MPB 75% and 0.95
PP are collapsed. The Duptree analysis for these taxa
resulted in 21 distinct optimal species trees that each
requires 11 gene duplications to reconcile the strongly
supported incongruences among the five data sets. The
H. macranthum cpDNA sequences are strongly supported as sister to the H. pusillum group sequences,
whereas RPA2 and RPD2b group H. macranthum and
H. alpestre together as sister to the H. pusillum group,
which is in agreement with nrDNA ITS sequences
(Frajman and Oxelman 2007). RPA2 is duplicated in the
H. pusillum group, and there is one extra H. macranthum
copy sister to one of the H. pusillum copies. In RPD2a,
H. alpestre groups with Petrocoptis and does not form
a monophyletic group with the other Heliosperma taxa.
Monophyletic Heliosperma groups are associated with
Viscaria and Atocion in RPA2 and RPB2 with strong
and moderate support, respectively. A similar pattern
is seen in the cpDNA data, with the difference that
the relationship to Eudianthe is uncertain. The position
of the root of the in-group varies among the data sets
and is weakly supported, except in the cpDNA, where
Petrocoptis forms a sister lineage to a clade consisting
of two subclades, one including Lychnis and Silene, the
other including Eudianthe, Heliosperma, Viscaria, and
Atocion.
DISCUSSION
Our results demonstrate intricate phylogenetic patterns from the nuclear RNAP genes. There is topological
incongruence not only with the chloroplast data but also
among the nuclear genes. The situation is further complicated by the fact that there is often more than one
sequence type per gene region. We will discuss how
relative dates can be used to interpret our results; we
believe that the reasoning could be extended to other
examples.
The major aim of our study was to infer the phylogenetic position of Heliosperma within the tribe Sileneae
and to test the hypothesis of reticulate evolution among
its three major lineages put forward by Frajman and
Oxelman (2007). We will therefore focus on the phylogenetic relationships among the genera that show
affinities with Heliosperma in our gene trees (Figs 4–8),
namely Atocion, Eudianthe, Petrocoptis, and Viscaria. The
relationships within Silene and Lychnis based on these
genes and a similar sampling of taxa were discussed by
Popp and Oxelman (2004).
Comparison of Results from Three Dating Methods
We have used three dating methods to infer divergence times in the gene trees (Fig. 3, online Appendix 3).
The Bayesian approach (as implemented in BEAST)
takes topological and branch length uncertainties into
account, as opposed to our PL and MPL analyses, where
ages are estimated using a single input tree. The BEAST
analyses also differ in how the trees were calibrated:
calibration was applied at the root of Sileneae. The age of
the in-group (Sileneae excluding Agrostemma) had to
be fixed in the r8s and the PATHd8 analyses. Thus, it
may be argued that the results are not strictly comparable. However, our main goal is to make relative age
comparisons, and for this we need only identify relative age differences that are demonstrably larger than
the coalescence time intervals. The median age estimations of Sileneae and the in-group obtained from the
BEAST analyses of the matK data set were arbitrarily
chosen for calibration purposes in the analyses of the
RNAP and cpDNA data sets. The inferred absolute ages
are strictly speaking estimates of the cpDNA splits and
not of “species” lineage splits. In principle, we could
have set the root age to some arbitrary value, but we
think that our approach provides a starting point for
further absolute dating of the phylogenetic history of
Sileneae.
FIGURE 4. Bayesian consensus chronogram (obtained with BEAST) of the cpDNA data set (rps16 intron and psbE-petG spacer). Numbers/letters next to the taxon names correspond to accession identifiers in Table 1. Numbers below (above) branches indicate parsimony
bootstrap percentages (MPB)/PPs obtained from Bayesian analyses. Only values above 60%/0.70 are shown. Numbers (in bold) associated
with nodes in the chronogram indicate the mean crown group age in millions of years of the clade above that node. A black dot on a node
represents strong support (MPB > 75%/PP > 0.95) for the clade above that node.
Agrostemma githago was used as the out-group. It is an
annual plant and there is some evidence that annuals
tend to have higher substitution rates than perennials
(e.g., Andreasen and Baldwin 2001), which might be due
to the shorter generation times of the former (Smith and
Donoghue 2008). Agrostemma is rather divergent from
the in-group, and the sequences are difficult to align in
some cases, especially in the case of RPA2, and to some
extent also in the RPD2 regions, where the alignable
part with Agrostemma is short. It is easy to imagine that
this can be a source of large stochastic effects. Unfortunately, the sequences of the closest relatives of Sileneae,
the Caryophylleae (e.g., Dianthus and Saponaria), are to
a large extent not alignable in the intron regions of the
RNAP genes used here, which is the reason why the age
of the root could not be used for calibrating the trees
in the PATHd8 and r8s analyses, where an additional
out-group is needed to assign branch lengths from the
root.
A less desirable property of the MPL method is
that negative branch lengths may occur. In such cases,
PATHd8 assigns a zero branch length. An example
of this is the Petrocoptis/Eudianthe/H. macranthum/
H. alpestre clade in the RPB2 tree (Fig. 6). The PL method
penalizes rate changes between ancestral and descendant branches, which may give rise to more clock-like
trees than those produced by BEAST, which does not
assume autocorrelation. In our analyses, the covariance
is close to zero, which suggests that there is no strong
evidence of autocorrelation in the data (Drummond and
Rambaut 2007), that is, rates are not “inherited” from
ancestor lineages to their immediate descendants.
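The negative branch lengths mentioned above can be illustrated with a small sketch. It assumes the simplest reading of the mean path length idea, in which a node's age is proportional to the mean path length from that node to its descendant tips and is scaled to a fixed root age. It is not the PATHd8 algorithm itself; the `Node` class, the function names, and the toy tree are hypothetical and serve only to show how a child node can come out older than its parent and how such a branch is then set to zero.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    length: float = 0.0  # length of the branch leading to this node
    children: list = field(default_factory=list)


def tip_paths(node):
    """Path lengths from `node` down to each of its descendant tips."""
    if not node.children:
        return [0.0]
    return [child.length + p for child in node.children for p in tip_paths(child)]


def mpl_ages(root, root_age):
    """Node ages proportional to mean path length, scaled to a fixed root age."""
    root_paths = tip_paths(root)
    scale = root_age / (sum(root_paths) / len(root_paths))
    ages = {}

    def visit(node, parent_age):
        paths = tip_paths(node)
        raw_age = scale * sum(paths) / len(paths)
        # A raw age above the parent's age implies a negative branch length;
        # clamp it so that the branch gets length zero instead.
        ages[node.name] = min(raw_age, parent_age)
        for child in node.children:
            visit(child, ages[node.name])

    visit(root, root_age)
    return ages


# Hypothetical toy tree: clade A has two long-branched tips, tip b has a short branch.
clade_a = Node("A", 1.0, [Node("a1", 10.0), Node("a2", 10.0)])
toy_root = Node("root", 0.0, [clade_a, Node("b", 2.0)])
toy_ages = mpl_ages(toy_root, root_age=16.0)
```

In this toy tree the short branch to `b` pulls the root's mean path length below that of clade `A`, so the unclamped age of `A` would exceed the root age; the clamp places `A` at the same age as its parent, which corresponds to a zero-length branch.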
FIGURE 5. Bayesian consensus chronogram (obtained with BEAST) of the RPA2 sequences. Numbers/letters next to the taxon names correspond to accession identifiers in Table 1. Numbers below (above)
branches indicate parsimony bootstrap percentages (MPB)/PPs obtained from Bayesian analyses. Only values above 60%/0.70 are shown.
Numbers (in bold) associated with nodes in the chronogram indicate the mean crown group age in millions of years of the clade above that node.
A black dot on a node represents strong support (MPB > 75%/PP > 0.95) for the clade above that node.
For the sake of clarity, we will from here on refer
only to the median ages estimated with BEAST. Our
estimates of the absolute divergence ages show that the
Heliosperma lineage probably originated during the middle to late Miocene. The estimated absolute age for
diversification of the H. pusillum group genes is 2.0–6.2
Myr. Thus, the suggestion of a pre-Pleistocene origin for
this group (Frajman and Oxelman 2007) is supported.
FIGURE 6. Bayesian consensus chronogram (obtained with BEAST) of the RPB2 sequences. Numbers/letters next to the taxon names correspond to accession identifiers in Table 1. Numbers below (above) branches indicate parsimony bootstrap percentages (MPB)/PPs obtained
from Bayesian analyses. Only values above 60%/0.70 are shown. Numbers (in bold) associated with nodes in the chronogram indicate the mean
crown group age in millions of years of the clade above that node. A black dot on a node represents strong support (MPB > 75%/PP > 0.95) for
the clade above that node.
Using Relative Age Differences to Exclude Lineage Sorting
as an Explanation for Incongruences Among the Three
Heliosperma Lineages
Conservative coalescence times (in the sense that they
favor accepting the null hypothesis of lineage sorting
as an explanation for incongruence) for the bifurcations
observed in our gene trees can be calculated using large
population sizes and generation times (Maddison and
Knowles 2006). Although very little is known about
effective population sizes and generation times for
Heliosperma, setting effective population size (N) =
10 000 and generation time (t) = 5 years potentially
allows for stringent discrimination between incomplete
lineage sorting and other causes of gene tree incongruence. Although members of Heliosperma are perennial plants, they usually flower and set seed the first
year. Effective population sizes can be calculated from
gene trees using coalescent theory. It is not an easy
task however, and several simplistic assumptions have
to be made. For example, the BEST method (Liu and
Pearl 2007), which ignores hybridization and gene duplications, can be used to estimate ancestral effective
population sizes. Unfortunately, these estimates tend
to be heavily influenced by the prior distributions (Liu
and Pearl 2007). The sizes of extant spatially restricted
groups of plants could be used as crude guesses, but
the relationships between extant and ancestral effective sizes are unknown. Extant population sizes, defined spatially, are probably much smaller than 10 000
individuals, in particular for H. macranthum, as estimated in the field. Hudson and Coyne (2002) showed
that it takes roughly 9–12 N generations for 95% of
nuclear loci to show reciprocal monophyly in sister
species. In the absence of other dating errors, we regard
differences in divergence times larger than 500 000 years
(10Nt) as unlikely to be attributable to incomplete lineage sorting of nuclear alleles. For organelle DNA, the
corresponding interval is 250 000 years, given that the
population size for a haploid gene in a hermaphroditic
population is 2N.
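The arithmetic behind these thresholds can be written out as a short sketch. The values N = 10 000 and t = 5 years, the factor of 10 (within the 9–12 N range of Hudson and Coyne 2002), and the halving for organelle DNA are taken from the text; the function and parameter names are hypothetical.

```python
def coalescence_window_years(effective_size, generation_time_years,
                             generations_in_units_of_n=10, copy_number_factor=1.0):
    """Time window (in years) within which incomplete lineage sorting stays plausible."""
    return (generations_in_units_of_n * effective_size
            * generation_time_years * copy_number_factor)


N = 10_000  # effective population size assumed in the text
t = 5       # generation time in years assumed in the text

# Nuclear loci: 10Nt
nuclear_window = coalescence_window_years(N, t)
# Organelle DNA: half the nuclear window, as stated in the text
organelle_window = coalescence_window_years(N, t, copy_number_factor=0.5)
# Bounds implied by the 9-12 N generations of Hudson and Coyne (2002)
hudson_coyne_range = (coalescence_window_years(N, t, 9),
                      coalescence_window_years(N, t, 12))
```

Under these assumptions the nuclear window is 500 000 years and the organelle window 250 000 years; a larger assumed N or t widens both windows proportionally, which is why large values are the conservative choice.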
FIGURE 7. Bayesian consensus chronogram (obtained with BEAST) of the RPD2a sequences. Numbers/letters next to the taxon names
correspond to accession identifiers in Table 1. Numbers below (above) branches indicate parsimony bootstrap percentages (MPB)/PPs obtained
from Bayesian analyses. Only values above 50%/0.70 are shown. Numbers (in bold) associated with nodes in the chronogram indicate the mean
crown group age in millions of years of the clade above that node. A black dot on a node represents strong support (MPB > 75%/PP > 0.95) for
the clade above that node.
Frajman and Oxelman (2007) suggested that the observed incongruences between cpDNA and nrDNA
ITS trees regarding the relationships among H. alpestre,
H. macranthum, and the H. pusillum group could be due to
either incomplete lineage sorting or ancient hybridization. The RPA2 and RPD2b trees (Figs 5 and 8) agree
in part topologically with the ITS tree (Frajman and
Oxelman 2007), whereas the RPB2 tree (Viscaria/Atocion/
Heliosperma clade) is poorly resolved. Heliosperma alpestre
groups with Petrocoptis in the RPD2a tree. Incomplete
lineage sorting can be rejected as a plausible explanation because of the large differences between the divergence times. The difference of 1.3–2.0 Myr between
the divergence of Heliosperma and the divergence of
H. alpestre/H. macranthum estimated from the nuclear
gene trees (RPA2, RPD2b) is outside of the estimated
conservative coalescence time of 250 000 years for chloroplast genes, and the difference of c. 1.3 Myr in the
divergence of Heliosperma and the divergence of the
H. pusillum group/H. macranthum estimated from
the cpDNA is greater than 500 000 years, which is the
estimated conservative coalescence time for the nuclear
genes applied here.
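The comparison made in this paragraph amounts to checking each reported age difference against a coalescence window. The sketch below uses the smallest age differences quoted above and pairs each with the window named in the text; the labels, constant names, and the helper function are hypothetical.

```python
NUCLEAR_WINDOW_MYR = 0.5     # 10Nt with N = 10 000 and t = 5 years
ORGANELLE_WINDOW_MYR = 0.25  # half the nuclear window

# (label, smallest reported age difference in Myr, window it is compared with in the text)
comparisons = [
    ("Heliosperma vs. H. alpestre/H. macranthum (RPA2, RPD2b)", 1.3, ORGANELLE_WINDOW_MYR),
    ("Heliosperma vs. H. pusillum group/H. macranthum (cpDNA)", 1.3, NUCLEAR_WINDOW_MYR),
]


def lineage_sorting_unlikely(age_difference_myr, window_myr):
    """True when the age difference is larger than the coalescence window."""
    return age_difference_myr > window_myr


verdicts = {label: lineage_sorting_unlikely(diff, window)
            for label, diff, window in comparisons}
```

Both differences also exceed the larger (nuclear) window, so the outcome does not depend on which of the two windows is applied.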
Chloroplast Capture Is Not Responsible for Observed
Incongruence Among the Three Heliosperma Lineages
Chloroplast capture via hybridization, that is, recent introgression of chloroplasts from one species
into another, is often a preferred explanation for incongruence between chloroplast and nuclear gene trees
(e.g., Rieseberg and Soltis 1991; McKinnon et al. 2001;
Tsitrone et al. 2003; Okuyama et al. 2005). It is possible
that the sister–group relationship between H. alpestre
and H. macranthum shown by the nuclear gene trees
(RPA2, RPD2b, ITS) reflects the “species” phylogeny
and that H. macranthum has captured plastids of the
H. pusillum lineage (Fig. 10a). However, the relatively
young age of the H. macranthum stem lineage in the
nuclear genes (Fig. 10b), compared with the chloroplast genes (Fig. 10c), rejects this explanation. If we
assume that the chloroplast tree represents the “species
tree” (cf. Fig. 1), we may instead explain the incongruent pattern using the reasoning in graph VII in
Figure 1 (see Introduction), that is, there must have been
hybridization going on between the H. alpestre and
H. macranthum lineages, with subsequent extinction of
the “original” H. alpestre nuclear copies. The younger
age of Heliosperma seen in the nuclear chronograms is in
line with this. Moreover, chloroplast gene duplications
and interlineage recombinations are rare (Rice and
Palmer 2006; but see Erixon and Oxelman 2008). Therefore, we favor explanations that do not require cpDNA
gene duplications.
FIGURE 8. Bayesian consensus chronogram (obtained with BEAST) of the RPD2b sequences. Numbers/letters next to the taxon names
correspond to accession identifiers in Table 1. Numbers below (above) branches indicate parsimony bootstrap percentages (MPB)/PPs obtained
from Bayesian analyses. Only values above 50%/0.70 are shown. Numbers (in bold) associated with nodes in the chronogram indicate the mean
crown group age in millions of years of the clade above that node. A black dot on a node represents strong support (MPB > 75%/PP > 0.95) for
the clade above that node.
From Figure 1 we can predict that the age of the
H. macranthum/H. pusillum clade in the cpDNA tree
should agree with the age of Heliosperma in the nuclear
data. However, the H. macranthum/H. pusillum clade
was estimated to be considerably older in the cpDNA
tree than Heliosperma in the nuclear trees (Fig. 3). Apart
from errors in the dating analysis, this discrepancy
could possibly be explained by incomplete concerted
evolution between the parental copies/alleles in the
hybrid (Wendel and Doyle 1998). Concerted evolution
will tend to homogenize different sequences, resulting
in underestimation of divergence times (Teshima and
Innan 2004). This is likely to have taken place in the
ITS sequences of the H. pusillum group, where extensive
hybridization appears to have taken place (Frajman and
Oxelman 2007). However, concerted evolution of RNAP
genes remains speculative because no traces of recombination (i.e., concerted evolution; Teshima and Innan
2004) have been found, despite thorough manual and
automated checking using GARD (see Results) and also
various methods implemented in the software RDP2
(Martin et al. 2005; results not shown).
Gene duplication of RPA2 has likely occurred in the
evolutionary history of the H. pusillum group, where
two RPA2 copies (Fig. 5) form two well-supported
clades. A gene duplication preceding the diversification of the H. pusillum group is supported because all
sampled specimens have two sequence copies, one in
each of the two major clades of the H. pusillum group
sequences. Heliosperma macranthum possesses an additional RPA2 sequence-type, which is sister to one of the
two H. pusillum-group paralogues. These must be paralogues and not alleles because both sequence types are
present in two variants per plant. The divergence time
between the first H. macranthum copy and its sister copy
of the H. pusillum group (3.3 Ma) is younger than the
divergence time between the other H. macranthum copy
and H. alpestre (5 Ma). If we disregard lineage sorting as
a plausible explanation, a hybridization event between
the H. pusillum and H. macranthum lineages is a more
parsimonious explanation than gene duplication before
the diversification of Heliosperma RPA2 sequences, as
this would require a single event (hybridization) rather
than an additional gene duplication and at least three
gene losses. However, extinction of one of the duplicated gene copies due to selection or drift might not be
very “costly” from a biological perspective, as most duplicated genes are lost within a few million years (Lynch
and Conery 2000). Moreover, as the posterior intervals
of the age estimates are large, rejection of the duplication/loss hypothesis of RPA2 is not strongly supported.
FIGURE 9. Summary diagrams presenting relationships among the main groups of Sileneae. All nodes in the trees with support lower than
MPB 75% and 0.95 PP are collapsed here. The numbers associated with the nodes are the median ages in million years of the clades inferred
with BEAST.
The RPD2a tree (Fig. 7) shows a different incongruent pattern. The sister–group relationship between the
H. macranthum and the H. pusillum group sequences is
congruent with the cpDNA tree. However, H. alpestre
groups with Petrocoptis, a topology not seen in any
other gene tree. The RPD2a divergence time between
Petrocoptis and H. alpestre (8.2 Ma) is younger than the
age of Heliosperma inferred by cpDNA (9.9 Myr; online
Appendix 3). One hypothetical explanation is ancient
hybridization between Heliosperma and Petrocoptis, the
latter serving as pollen donor. This topological pattern is
not seen in any other RNAP tree, but it might have been
distorted by the more recent hybridization between the
H. alpestre and H. macranthum lineages discussed above
(in the case of RPA2).
FIGURE 10. Hypothetical scenario that would support chloroplast capture within Heliosperma (a). The arrow indicates hypothetical hybridization and capture of cpDNA. (b, c) Relationships inferred by nrDNA and cpDNA in Heliosperma, respectively. Dashed line represents
the age of H. macranthum stem lineage inferred by nuclear DNA. The numbers associated with the nodes are the median ages in million years
of the clades inferred with BEAST. Crossed circle and gray font represent extinction (or nonsampled gene copies). A, Heliosperma alpestre; M,
Heliosperma macranthum; P, Heliosperma pusillum.
Phylogenetic Position of Heliosperma in Sileneae
Heliosperma is strongly supported as monophyletic
in cpDNA, RPA2, and RPD2b trees (Figs 4, 5, 8, respectively), as well as by ITS data (Frajman and Oxelman
2007), whereas in the RPB2 and RPD2a trees (Figs 6
and 7, respectively) the Heliosperma sequences fail to
form a monophyletic group. Various RNAP regions put
the Heliosperma sequences together with at least two
lineages. One lineage is closer to Viscaria/Atocion (supported by RPA2, partly RPB2), and the other is closer
to Eudianthe and/or Petrocoptis (supported by RPB2 and
RPD2a). It is notable that all data sets except RPD2a
either support a sister group relationship of Heliosperma
to Viscaria/Atocion or do not reject it strongly.
The RPA2 and RPD2b phylogenies group Viscaria and
Atocion as well as Petrocoptis and Eudianthe as sister
groups, although the MPB support for Petrocoptis and
Eudianthe is moderate (89% and 82%, respectively). In
the case of cpDNA and RPD2a, the Atocion/Viscaria
clade is weakly supported as sister to Eudianthe (MPB
85% and 70%, respectively), which is not in line with
some other chloroplast regions, where Heliosperma is sister to Eudianthe or Atocion/Viscaria (Erixon and Oxelman
2008). Despite a very large data set of cpDNA sequences in the latter study, it is still ambiguous whether
Heliosperma or Eudianthe is sister to Atocion/Viscaria.
Viscaria and Atocion did not form a monophyletic group
in the RPB2 tree (Fig. 6), which is in contrast to all other
evidence (see Popp and Oxelman 2004; Erixon and
Oxelman 2008; Frajman et al. 2009). However, this incongruence is weakly supported and could potentially
be explained as sampling error.
Heliosperma alpestre and H. macranthum possess two
RPB2 sequence variants (Fig. 6). As in the case of the
RPA2, different copies of both RPB2 paralogues were
found, which are likely alleles. One of the RPB2 paralogues groups with Eudianthe, with Petrocoptis sister
to this clade, whereas the other is positioned in the
clade with the H. pusillum group and Atocion/Viscaria.
The cpDNA sequences confidently place Petrocoptis,
together with Agrostemma, outside the rest of the
tribe (Fig. 4; Erixon and Oxelman 2008). However, it
is not found in this position in any of the RNAP trees
presented here, but instead is often found as sister, or
closely associated, to Eudianthe. If one assumes that the
chloroplast tree reflects the “species tree” in this regard,
the patterns seen in the RNAP trees are possibly the
result of hybridization between Petrocoptis and some
ancestor in the Eudianthe/Heliosperma/Viscaria/Atocion
lineage. In contrast, the complicated relationships seen
in the RPB2 tree can be explained by a tree (Fig. 11) similar to graph VIII in Figure 1, that is, the hypothesis of
an ancient duplication followed by the loss of some
copies. In one paralogue, Heliosperma and Atocion/
Viscaria form a monophyletic group, but the copy in
Eudianthe/Petrocoptis is lost. In the other paralogue, it
is not only the copy in the Viscaria/Atocion lineage that
is lost but also the copy in the H. pusillum lineage. The
duplication of RPB2 must then be older than the divergence between Eudianthe and the other groups, which
is consistent with the chronogram in Figure 6. The reasoning depends on the validity of the assumption that
the chloroplast tree splits reflect the “species tree” relationships. There is no need for chloroplast gene duplications in the particular cases discussed above, something
which supports the hybridization hypothesis, but it is
admittedly fragile. Future studies might test the hypothesis using additional genes and methodological
approaches to infer species trees from multiple gene
trees (e.g., those of Liu and Pearl 2007).
FIGURE 11. Hypothetical explanation of the phylogenetic relationships inferred by the RPB2 sequences, assuming the framework presented in Fig. 1. The arrow indicates hypothetical gene duplication.
Crossed circles and gray font represent extinction (or nonsampled
gene copies).
CONCLUSIONS
In this paper, we have evaluated different hypotheses
about conflicting gene trees for the same taxa. Divergence times can be useful for deciding between hybridization and gene duplication/loss as the most likely
explanation for topological conflicts between gene trees,
as long as the absolute time differences are large enough
to exclude incomplete lineage sorting of alleles. Using
reconciliation procedures without taking times into account may lead to improbable results. For example, all
21 most parsimonious reconciliations applied to our
data set favored at least one duplication of chloroplast
genes, which from an empirical perspective appears
unrealistic. Instead, we conclude that the incongruences
seen between the cpDNA and at least some of the nuclear regions regarding the relationships among the
three major groups of Heliosperma are likely to be best
explained by homoploid hybridization. The pattern
regarding the origin of Heliosperma is more complicated and is likely to include several events, perhaps
including both gene duplications/extinctions and hybridizations. Clearly, development of more sophisticated analytical tools (e.g., Maureira-Butler et al. 2008,
who present a probabilistic method to test the null hypothesis of lineage sorting) as well as more data from
nuclear loci are needed in order to improve our understanding of reticulate phylogenetic histories in groups
like Heliosperma.
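To make concrete what a reconciliation that ignores time actually counts, the following sketch scores gene duplications with the standard least-common-ancestor (LCA) mapping between a rooted binary gene tree and a rooted species tree. It is a minimal illustration and not the Duptree implementation; the nested-tuple tree encoding, the `Species_copy` naming of gene sequences, and the toy trees (which only loosely echo the RPB2 pattern) are assumptions made for this example.

```python
def leaf_set(tree):
    """Set of leaf labels below a node; leaves are strings, internal nodes are tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def all_clades(tree):
    """Leaf sets of every node in the tree, including the leaves themselves."""
    if isinstance(tree, str):
        return [leaf_set(tree)]
    clades = [leaf_set(tree)]
    for child in tree:
        clades.extend(all_clades(child))
    return clades


def count_duplications(gene_tree, species_tree,
                       species_of=lambda gene: gene.split("_")[0]):
    """Number of gene-tree nodes inferred as duplications under LCA mapping."""
    species_clades = all_clades(species_tree)

    def lca(a, b):
        # Smallest species-tree clade that contains both mapped clades.
        return min((c for c in species_clades if a | b <= c), key=len)

    duplications = 0

    def mapping(node):
        nonlocal duplications
        if isinstance(node, str):
            return frozenset([species_of(node)])
        left, right = (mapping(child) for child in node)
        here = lca(left, right)
        # A node mapping to the same species-tree node as one of its
        # children cannot be a speciation and is scored as a duplication.
        if here == left or here == right:
            duplications += 1
        return here

    mapping(gene_tree)
    return duplications


# Hypothetical species tree and gene trees, for illustration only.
species_tree = ((("Heliosperma", "Viscaria"), "Eudianthe"), "Petrocoptis")
congruent = ((("Heliosperma_1", "Viscaria_1"), "Eudianthe_1"), "Petrocoptis_1")
extra_copy = ((("Heliosperma_1", "Viscaria_1"),
               ("Heliosperma_2", "Eudianthe_1")), "Petrocoptis_1")
```

With these toy trees, `count_duplications(congruent, species_tree)` finds no duplication, whereas the second Heliosperma copy grouping with Eudianthe in `extra_copy` forces one. The score is based on topology alone, so it cannot distinguish an old duplication from a more recent hybridization, which is the limitation that the relative dates are meant to address.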
SUPPLEMENTARY MATERIAL
Supplementary material can be found at: http://sysbio.oxfordjournals.org/.
ACKNOWLEDGMENTS
The Swedish Research Council for Environment,
Agricultural Sciences and Spatial Planning and the
Swedish Research Council kindly provided financial
support for this study. The field work and analyses
were partially supported by the scholarships of Helge
Ax:son Johnsons Stiftelse to B. Frajman and molecular
work by EMBO Short Term Fellowship ASTF 45-2005
to B. Frajman. We are most grateful to D. Petrović,
M. Ronikier, S. Strgulc Krajšek, A. Strid, and B. Surina
for collecting some of the plant samples used and to
the curator of the herbarium WU for providing us with
specimens. Many thanks are due to N. Heidari and
M. Popp who were an indispensable help in the laboratory, to C. Anderson for help with and advice concerning dating methods, and to M. Thulin, P. Schönswetter,
G. Schneeweiss, B. Pfeil, the associate editor S. Renner,
and the reviewers for the comments on draft manuscript
versions.
REFERENCES
Andreasen K., Baldwin B.G. 2001. Unequal evolutionary rates between
annual and perennial lineages of checker mallows (Sidalcea, Malvaceae): evidence from 18S–26S rDNA internal and external transcribed spacers. Mol. Biol. Evol. 18:936–944.
Avise J.C., Robinson T.J. 2008. Hemiplasy: a new term in the lexicon of
phylogenetics. Syst. Biol. 57:503–507.
Britton T., Anderson C.L., Jacquet D., Lundqvist S., Bremer K. 2006.
PATHd8—a new method for estimating divergence times in large
phylogenetic trees without a molecular clock. Available from:
www.math.su.se/PATHd8.
Britton T., Anderson C.L., Jacquet D., Lundqvist S., Bremer K. 2007.
Estimating divergence times in large phylogenetic trees. Syst. Biol.
56:741–752.
Britton T., Oxelman B., Vinnersten A., Bremer K. 2002. Phylogenetic
dating with confidence intervals using mean path lengths. Mol.
Phylogenet. Evol. 24:58–65.
Cronn R., Wendel J.F. 2003. Cryptic trysts, genomic mergers, and plant
speciation. New Phytol. 161:133–142.
Cronn R.C., Small R.L., Wendel J.F. 1999. Duplicated genes evolve independently after polyploid formation in cotton. Proc. Natl. Acad.
Sci. USA. 96:14406–14411.
Doyle J.J., Doyle J.L., Rauscher J.T., Brown A.H.D. 2003. Diploid and
polyploidy reticulate evolution throughout the history of the perennial soybeans (Glycine subgenus Glycine). New Phytol. 161:121–132.
Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006. Relaxed
phylogenetics and dating with confidence. PLoS Biol. 4:699–710.
Drummond A.J., Rambaut A. 2007. BEAST: Bayesian evolutionary
analysis by sampling trees. BMC Evol. Biol. 7:214.
Erixon P., Oxelman B. 2008. Reticulate or treelike chloroplast DNA
evolution in Sileneae (Caryophyllaceae)? Mol. Phylogenet. Evol.
48:313–325.
Ferguson D., Sang T. 2001. Speciation through homoploid hybridization between allotetraploids in peonies (Paeonia). Proc. Natl. Acad.
Sci. USA. 98:3915–3919.
Fior S., Karis P.O., Casazza G., Minuto L., Sala F. 2006. Molecular
phylogeny of the Caryophyllaceae (Caryophyllales) inferred from
chloroplast matK and nuclear rDNA ITS sequences. Am. J. Bot.
93:399–411.
Frajman B., Heidari N., Oxelman B. 2009. Phylogenetic relationships of Atocion and Viscaria (Sileneae, Caryophyllaceae) inferred
from chloroplast, nuclear ribosomal, and low-copy gene DNA sequences. Taxon. 58(4).
Frajman B., Oxelman B. 2007. Reticulate phylogenetics and phytogeographical structure of Heliosperma (Sileneae, Caryophyllaceae)
inferred from chloroplast and nuclear DNA sequences. Mol. Phylogenet. Evol. 43:140–155.
Frajman B., Rabeler R.K. 2006. Proposal to conserve the name Heliosperma against Ixoca (Caryophyllaceae, Sileneae). Taxon 55:807–808.
Górecki P. 2004. Reconciliation problems for duplication, loss and horizontal gene transfer. In: Bourne P.E., Gusfield D., editors. RECOMB
’04: proceedings of the eighth annual international conference on
research in computational molecular biology. New York: Association for Computing Machinery. p. 316–325.
Gradstein F.M., Ogg J.G., Smith A.G., editors. 2004. A geologic time
scale 2004. Cambridge: Cambridge University Press. 589 p.
Gross B.L., Schwarzbach A.E., Rieseberg L.H. 2003. Origin(s) of the
diploid hybrid species Helianthus deserticola (Asteraceae). Am. J.
Bot. 90:1708–1719.
Hall T.A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic
Acids Symp. Ser. 41:95.
Hallett M.T., Lagergren J., Tofigh A. 2004. Simultaneous identification
of duplications and lateral transfers. In: Bourne P.E., Gusfield D.,
editors. RECOMB ’04: proceedings of the eighth annual international conference on research in computational molecular biology.
New York: Association for Computing Machinery. p. 347–356.
Holder M.T., Anderson J.A., Holloway A.K. 2001. Difficulties in detecting hybridization. Syst. Biol. 50:978–982.
Howarth D.G., Baum D.A. 2005. Genealogical evidence of homoploid
hybrid speciation in an adaptive radiation of Scaevola (Goodeniaceae) in the Hawaiian Islands. Evolution. 59:948–961.
Huber K.T., Oxelman B., Lott M., Moulton V. 2006. Reconstructing the
evolutionary history of polyploids from multilabeled trees. Mol.
Biol. Evol. 23:1784–1791.
Hudson R.R., Coyne J.A. 2002. Mathematical consequences of the genealogical species concept. Evolution. 56:1557–1565.
Huelsenbeck J.P., Ronquist F. 2001. MrBayes: Bayesian inference of
phylogenetic trees. Bioinformatics. 17:754–755.
Jalas J., Suominen J., editors. 1986. Atlas Florae Europaeae 7,
Caryophyllaceae (Silenoideae). Helsinki (Finland): Committee for
mapping the flora of Europe and Societas Biologica Fennica
Vanamo. p. 85–88.
James J.K., Abbott R.J. 2005. Recent, allopatric, homoploid hybrid speciation: the origin of Senecio squalidus (Asteraceae) in the British
Isles from a hybrid zone on Mount Etna, Sicily. Evolution. 59:
2533–2547.
Joly S., Bruneau A. 2006. Incorporating allelic variation for reconstructing the evolutionary history of organisms from multiple genes: an
example from Rosa in North America. Syst. Biol. 55:623–636.
Joly S., Starr J.S., Lewis W.H., Bruneau A. 2006. Polyploid and hybrid evolution in roses east of the Rocky Mountains. Am. J. Bot. 93:
412–425.
Jordan G.J., Macphail M.K. 2003. A middle-late Eocene inflorescence of Caryophyllaceae from Tasmania, Australia. Am. J. Bot. 90:
761–768.
Kosakovsky Pond S.L., Posada D., Gravenor M.B., Woelk C.H., Frost
S.D.W. 2006. Automated phylogenetic detection of recombination
using a genetic algorithm. Mol. Biol. Evol. 23:1891–1901.
Lewis P.O. 2001. A likelihood approach to estimating phylogeny from
discrete morphological character data. Syst. Biol. 50:913–925.
Linder C.R., Rieseberg L.H. 2004. Reconstructing patterns of reticulate
evolution in plants. Am. J. Bot. 91:1700–1708.
Liu L., Pearl D.K. 2007. Species trees from gene trees: reconstructing
Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Syst. Biol. 56:504–514.
Lynch M., Conery J.S. 2000. The evolutionary fate and consequences
of duplicate genes. Science. 290:1151–1155.
Maddison W.P., Knowles L.L. 2006. Inferring phylogeny despite incomplete lineage sorting. Syst. Biol. 55:21–30.
Martin D.P., Williamson C., Posada D. 2005. RDP2: recombination
detection and analysis from sequence alignments. Bioinformatics.
21:260–262.
Mason-Gamer R.J. 2004. Reticulate evolution, introgression, and intertribal gene capture in an allohexaploid grass. Syst. Biol. 53:25–37.
Maureira-Butler I.J., Pfeil B.E., Muangprom A., Osborn T.C., Doyle
J.J. 2008. The reticulate history of Medicago (Fabaceae). Syst. Biol.
57:466–482.
McKinnon G.E., Vaillancourt R., Jackson H.D., Potts B.M. 2001.
Chloroplast sharing in the Tasmanian eucalypts. Evolution. 55:
703–711.
Mir C., Toumi L., Jarne P., Sarda V., Di Giusto F., Lumaret R. 2006. Endemic North African Quercus afares Pomel originates from hybridisation between two genetically very distant oak species (Q. suber L.
and Q. canariensis Willd.): evidence from nuclear and cytoplasmic
markers. Heredity. 96:175–184.
Müller K. 2005. SeqState—primer design and sequence statistics for
phylogenetic DNA data sets. Appl. Bioinform. 4:65–69.
Nylander J.A.A. 2004. MrAIC.pl. Program distributed by the author.
Uppsala (Sweden): Evolutionary Biology Centre, Uppsala University.
Nylander J.A.A., Wilgenbusch J.C., Warren D.L., Swofford D.L. 2008.
AWTY (are we there yet?): a system for graphical exploration
of MCMC convergence in Bayesian phylogenetics. Bioinformatics.
24:581–583.
Okuyama Y., Fujii N., Wakabayashi M., Kawakita A., Ito M., Watanabe
M., Murakami N., Kato M. 2005. Nonuniform concerted evolution and chloroplast capture: heterogeneity of observed introgression patterns in three molecular data partition phylogenies of Asian
Mitella (Saxifragaceae). Mol. Biol. Evol. 22:285–296.
Oxelman B., Lidén M. 1995. Generic boundaries in the tribe Sileneae
(Caryophyllaceae) as inferred from nuclear rDNA sequences. Taxon.
44:525–542.
Oxelman B., Lidén M., Berglund D. 1997. Chloroplast rps16 intron
phylogeny of the tribe Sileneae (Caryophyllaceae). Plant Syst. Evol.
206:393–410.
Pamilo P., Nei M. 1988. Relationships between gene trees and species
trees. Mol. Biol. Evol. 5:568–583.
Poke F.S., Martin D.P., Steane D.A., Vaillancourt R.E., Reid J.B. 2006.
The impact of intragenic recombination on phylogenetic reconstruction at the sectional level in Eucalyptus when using a single
copy nuclear gene (cinnamoyl CoA reductase). Mol. Phylogenet.
Evol. 39:160–170.
Popp M., Erixon P., Eggens F., Oxelman B. 2005. Origin and evolution
of a circumpolar polyploid species complex in Silene (Caryophyllaceae) inferred from low-copy nuclear RNA polymerase introns,
rDNA, and chloroplast DNA. Syst. Bot. 30:302–313.
Popp M., Oxelman B. 2001. Inferring the history of the polyploid Silene
aegaea (Caryophyllaceae) using plastid and homoeologous nuclear
DNA sequences. Mol. Phylogenet. Evol. 20:474–481.
Popp M., Oxelman B. 2004. Evolution of a RNA polymerase gene
family in Silene (Caryophyllaceae)—incomplete concerted evolution and topological congruence among paralogues. Syst. Biol. 53:
914–932.
Rambaut A. 1996. Se-Al: Sequence Alignment Editor. Available from:
http://tree.bio.ed.ac.uk/software/seal/.
Rambaut A. 2006. FigTree v1.0. Available from:
http://tree.bio.ed.ac.uk/software/figtree/.
Rambaut A., Drummond A.J. 2004. Tracer v1.3. Available from:
http://evolve.zoo.ox.ac.uk/software.html.
Rice D.W., Palmer J.D. 2006. An exceptional horizontal gene transfer
in plastids: gene replacement by a distant bacterial paralog and evidence that haptophyte and cryptophyte plastids are sisters. BMC
Biol. 4:31.
Rieseberg L.H. 1991. Homoploid reticulate evolution in Helianthus
(Asteraceae): evidence from ribosomal genes. Am. J. Bot. 78:
1218–1237.
Rieseberg L.H. 1997. Hybrid origins of plant species. Annu. Rev. Ecol.
Syst. 28:359–389.
Rieseberg L.H., Carney S.E. 1998. Plant hybridisation. New Phytol.
140:599–624.
Rieseberg L.H., Soltis D.E. 1991. Phylogenetic consequences of cytoplasmic gene flow in plants. Evol. Trends Plants. 5:65–84.
Rieseberg L.H., Wendel J.F. 1993. Introgression and its consequences in
plants. In: Harrison R.G., editor. Hybrid zones and the evolutionary
process. New York: Oxford University Press. p. 70–109.
Ronquist F., Huelsenbeck J.P. 2003. MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 19:1572–1574.
Rosenberg N.A., Nordborg M. 2002. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet.
3:380–390.
Sanderson M.J. 2002. Estimating absolute rates of molecular evolution
and divergence times: a penalized likelihood approach. Mol. Biol.
Evol. 19:101–109.
Sanderson M.J. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock.
Bioinformatics. 19:301–302.
Sang T. 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Crit. Rev. Biochem. Mol. Biol. 37:121–147.
Sang T., Zhong Y. 2000. Testing hybridization hypotheses based on incongruent gene trees. Syst. Biol. 49:422–434.
Sennblad B., Schreil E., Sonnhammer A.B., Lagergren J., Arvestad L.
2007. Primetv: a viewer for reconciled trees. BMC Bioinform. 8:148.
Simmons M.P., Ochoterena H. 2000. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 49:369–381.
Slowinski J., Page R.D.M. 1999. How should species phylogenies be
inferred from sequence data? Syst. Biol. 48:814–825.
Smith S.A., Donoghue M.J. 2008. Rates of molecular evolution are linked
to life history in flowering plants. Science. 322:86–89.
Soltis D., Kuzoff R.K. 1995. Discordance between nuclear and chloroplast phylogenies in the Heuchera group (Saxifragaceae). Evolution.
49:727–742.
Soltis D.E., Soltis P.S., Tate J.A. 2003. Advances in the study of polyploidy since Plant speciation. New Phytol. 161:173–191.
Swofford D.L. 2002. PAUP*. Phylogenetic analysis using parsimony
(*and other methods). Version 4.0b10. Sunderland (MA): Sinauer
Associates.
Teshima K.M., Innan H. 2004. The effect of gene conversion on
the divergence between duplicated genes. Genetics. 166:1553–1560.
Tsitrone A., Kirkpatrick M., Levin D.A. 2003. A model for chloroplast
capture. Evolution. 57:1776–1782.
V’yugin V.V., Gelfand M.S., Lyubetsky V.A. 2002. Identification of
horizontal gene transfer from phylogenetic gene trees. Mol. Biol.
37:571–584.
Wang X.-R., Szmidt A.E., Savolainen O. 2001. Genetic composition and
diploid hybrid speciation of a high mountain pine, Pinus densata,
native to the Tibetan Plateau. Genetics. 159:337–346.
Wehe A., Bansal M.S., Burleigh J.G., Eulenstein O. 2008. DupTree: a
program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics. 24:1540–1541.
Welch M.E., Rieseberg L.H. 2002. Patterns of genetic variation suggest
a single, ancient origin for the diploid hybrid species Helianthus
paradoxus. Evolution. 56:2126–2137.
Wendel J.F., Doyle J.J. 1998. Phylogenetic incongruence: window into
genome history and molecular evolution. In: Soltis P., Soltis D.,
Doyle J., editors. Molecular systematics of plants II. Dordrecht (the
Netherlands): Kluwer Academic Press. p. 265–296.
Wendel J.F., Stewart J. McD., Rettig J.H. 1991. Molecular evidence
for homoploid reticulate evolution among Australian species of
Gossypium. Evolution. 45:694–711.
Wilgenbusch J.C., Warren D.L., Swofford D.L. 2004. AWTY: a system
for graphical exploration of MCMC convergence in Bayesian phylogenetic inference. Available from: http://ceb.csit.fsu.edu/awty.
Wolfe A.D., Xiang Q.-Y., Kephart S.R. 1998. Diploid hybrid speciation in Penstemon (Scrophulariaceae). Proc. Natl. Acad. Sci. USA.
95:5112–5115.
Received 4 April 2008; reviews returned 9 June 2008; accepted 6 May 2009
Associate Editor: Susanne Renner