Journal of Systematics and Evolution 47 (5): 349–368 (2009) doi: 10.1111/j.1759-6831.2009.00044.x

Estimating ancestral distributions of lineages with uncertain sister groups: a statistical approach to Dispersal–Vicariance Analysis and a case using Aesculus L. (Sapindaceae) including fossils

A.J. HARRIS∗ Qiu-Yun (Jenny) XIANG∗
(North Carolina State University, Department of Plant Biology, Raleigh, North Carolina, USA)

Abstract We propose a simple statistical approach for using Dispersal–Vicariance Analysis (DIVA) software to infer biogeographic histories without fully bifurcating trees. In this approach, ancestral ranges are first optimized for a sample of Bayesian trees. The probability P of an ancestral range r at a node is then calculated as

$$P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t \, P_t$$

where Y is a node, $F(r_Y)_t$ is the frequency of range r among all the optimal solutions resulting from DIVA optimization at node Y for topology t, t is one of n topologies optimized, and $P_t$ is the probability of topology t. Node Y is a hypothesized ancestor shared by a specific crown lineage and the sister of that lineage “x”, where x may vary due to phylogenetic uncertainty (polytomies and nodes with posterior probability <100%). Using this method, the ancestral distribution at Y can be estimated to provide inference of the geographic origins of the specific crown group of interest. This approach takes into account phylogenetic uncertainty as well as uncertainty from DIVA optimization. It is an extension of the previously described method called Bayes-DIVA, which pairs Bayesian phylogenetic analysis with biogeographic analysis using DIVA. Further, we show that the probability P of an ancestral range at Y calculated using this method does not equate to $pp \times F(r_Y)$ on the Bayesian consensus tree when both variables are <100%, where pp is the posterior probability and $F(r_Y)$ is the frequency of range r for the node containing the specific crown group.
We tested our Bayes-DIVA approach using Aesculus L., which has major lineages unresolved as a polytomy. We inferred the most probable geographic origins of the five traditional sections of Aesculus and of Aesculus californica Nutt. and examined range subdivisions at parental nodes of these lineages. Additionally, we used the Bayes-DIVA data from Aesculus to quantify the effects on biogeographic inference of including two wildcard fossil taxa in phylogenetic analysis. Our analysis resolved the geographic ranges of the parental nodes of the lineages of Aesculus with moderate to high probabilities. The probabilities were greater than those estimated using the simple calculation of $pp \times F(r_Y)$ at a statistically significant level for two of the six lineages. We also found that adding fossil wildcard taxa in phylogenetic analysis generally increased P for ancestral ranges including the fossil’s distribution area. The increase in P was more dramatic for ranges that include the area of a wildcard fossil with a distribution area underrepresented among extant taxa. This indicates the importance of including fossils in biogeographic analysis. Examination of range subdivision at the parental nodes revealed potential range evolution (extinction and dispersal events) along the stems of A. californica and sect. Parryana.

Key words Aesculus, biogeography, DIVA, fossil wildcards, MrBayes, phylogenetic uncertainty.

Studies in historical biogeography based on phylogeny have accumulated rapidly due to the recent increase in availability of molecular phylogenetic data (see Xiang et al., 1998a, 2004, 2005, 2006; Wen, 1999; Sanmartín et al., 2001; Donoghue & Smith, 2004; Sanmartín & Ronquist, 2004; Soltis et al., 2006). One of the most widely used methods of inferring biogeographic histories based on phylogeny is Dispersal–Vicariance Analysis (DIVA) (Ronquist, 1997, 2001).

Received: 11 March 2009 Accepted: 19 June 2009
∗ Authors for correspondence. A.J. Harris. E-mail: <[email protected]>; Tel.: 1-336-6842314. Jenny Xiang. E-mail: <[email protected]>; Tel.: 1-919-5152728; Fax: 1-919-5153436.
© 2009 Institute of Botany, Chinese Academy of Sciences

DIVA is a method of reconstructing biogeographic history that falls under the broad heading of event-based methods, in which biogeographic processes that help drive speciation are incorporated a priori into the methodology (Ronquist, 1996, 1997; Sanmartín et al., 2001). Specifically, DIVA uses a parsimony approach that minimizes extinctions and dispersals and assumes vicariance as the null hypothesis (Ronquist, 1996). The program estimates distributions of hypothesized ancestors at internal nodes on a fully bifurcating phylogenetic tree based on the distributions of terminal taxa (Ronquist, 1996). Results of biogeographic analysis using DIVA are optimized ancestral ranges at each internal node under the parsimony criterion. Frequently, multiple equally parsimonious biogeographic pathways (MP pathways) are obtained from a given tree, and these are summarized as multiple optimal solutions at some or all internal nodes of the tree.

Although new model-based likelihood and Bayesian methods of reconstructing biogeographic histories have recently been developed (Ree et al., 2005; Ree & Smith, 2008; Sanmartín et al., 2008; also Lemmon & Lemmon, 2008), a quick advanced search using Google Scholar for reports published in 2008 containing the words “biogeography” and “DIVA” illustrates that DIVA continues to be widely used in historical biogeographic studies. The primary advantage of DIVA over the likelihood method of Ree et al. (2005) is that less prior information is required (Ree et al., 2005; Ree & Smith, 2008).
DIVA is also fast, simple, and user-friendly and gives results congruent with the model-based likelihood method Lagrange (http://code.google.com/p/lagrange/) for most lineages that have been compared (Ree et al., 2005; Burbrink & Lawson, 2007; Ree & Smith, 2008; Velazco & Patterson, 2008; Xiang & Thomas, 2008; Xiang et al., 2009) when analyses using DIVA included outgroups that are not widely distributed or the root range was used for area coding for outgroups at higher rank than species (see Ronquist, 1996).

Running the DIVA program requires that two parameters be defined: the phylogeny and the distributions of terminal taxa. Aside from any questions that might arise regarding the underlying assumptions implemented in the program, uncertainty in the results of DIVA arises from two areas: phylogenetic uncertainty and uncertainty in DIVA optimization. Biogeographic reconstruction using DIVA is typically carried out using a single tree topology: the author’s “best” tree representing the true phylogeny (e.g., Fiz et al., 2008; Jeandroz et al., 2008; Lim, 2008). The single tree approach is a common practice in phylogenetic biogeography using many methods including Component analysis (Page, 1993a, 1993b), Bremer’s ancestral area analysis (Bremer, 1992), and the model-based likelihood methods of Ree et al. (2005) and Ree & Smith (2008). Of the five reports published in the American Journal of Botany and Systematic Biology in 2008 in which a primary research goal was to reconstruct historical biogeography, five used DIVA, four used a single tree (Calviño et al., 2008; Hines, 2008; Huttunen et al., 2008; Mansion et al., 2008) and one showed that alternative resolutions of polytomies had no effect on biogeographic reconstruction (Mast et al., 2008). Using a single tree rarely accounts for the full range of possible, slightly less optimal topologies given the data.
Additionally, the “best” phylogeny is not always fully resolved or strongly supported for all nodes; some clades may be weakly supported or there may be polytomies. Polytomies are particularly problematic. The backbone phylogeny used in DIVA analysis must be fully bifurcating as the program is unable to accept polytomies, but polytomies present a problem for most methods of biogeographic analysis using phylogeny, as reconstruction necessarily breaks down at these unresolved nodes.

The other area of uncertainty from DIVA is the multiple, equally parsimonious biogeographic scenarios for a given phylogeny. The program does not provide any quantifiable method of selecting between the multiple possibilities. However, authors can use information from area connections and divergence times to rule out certain hypotheses or to favor one hypothesis over another, as also discussed by Ronquist (1996).

Both types of uncertainty in DIVA have been recognized and handled by Nylander et al. (2008) using posterior probabilities (pp). Nylander et al. (2008) recently showed the utility of a probabilistic approach to DIVA in reconstructing the biogeographic history of the avian genus Turdus L. Specifically, they optimized 20,000 Bayesian trees in DIVA and used the results of these optimizations to determine the marginal distributions of alternative ancestral ranges at each node of interest, dependent on the node’s occurrence in the sampled topologies. Thus, alternative ancestral ranges at each node in the tree (Fig. 1a of Nylander et al., 2008) can be assumed to have a probability equal to the product of the clade pp (phylogenetic uncertainty) and the occurrence of the alternative ranges for the clade in DIVA (the uncertainty in the biogeographic reconstruction).
The occurrence of each alternative range was determined as a fraction of all optimal ranges; that is, for a given tree, for a node with three optimal ancestral ranges “A, B, or AB”, the occurrence of each range was recorded as “A:1/3, B:1/3, AB:1/3”. This approach accounts for both uncertainty in the location of a node in the broader tree topology (i.e., phylogenetic uncertainty) and uncertainty in ancestral range reconstructions (multiple, equally parsimonious DIVA optimizations). Nylander et al. (2008) referred to this as a Bayes-DIVA analysis. Using a subset of Bayesian trees to account for uncertainty in phylogeny has been used before (e.g., Lutzoni et al., 2001; Pagel et al., 2004). In biogeography, this methodology was also suggested by Lemmon and Lemmon (2008) and was previously used by Huelsenbeck and Imennov (2002). Nylander et al. (2008) were the first to apply this approach to use with DIVA.

Fig. 1. Graphical explanation of parent nodes, crown nodes, and unspecified sister groups. A, Hypothetical phylogeny containing well-supported crown groups marked by triangular symbols and incomplete resolution of relationships among them. Open circles indicate crown nodes of crown groups 1–4. Closed circles indicate parent nodes (node, sensu this study). Numbered parent nodes correspond to numbered crown groups. B, Unspecified sister groups (x) for crown groups 1–4. Node numbers in closed circles correspond to those in A.

Here, we extend the Bayes-DIVA method to allow estimation of the geographic origin of a lineage in a polytomy. We first redefined a node as the parent node (parent node, hereafter) of a crown group node, where a crown group node (crown node, hereafter) represents the last shared common ancestor of all constituents of a crown group with an undefined sister (x) (Fig. 1).
Therefore, the parent node is inherently present on every tree in the posterior distribution of phylogenetic trees in which the crown group occurs, regardless of the relationship of the crown group to other groups. Using this definition allows for estimation of the ancestral range of the stem lineage of a highly supported terminal taxon or crown group even if the lineage is resolved as a member of a polytomy in the phylogeny (Fig. 1). The probability (P) of an ancestral range r at a node of interest is calculated as

$$P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t \, P_t \qquad (1)$$

where Y is the parent node, t is one of the randomly selected Bayesian trees, n is the total number of sampled trees, $F(r_Y)_t$ is the occurrence of an ancestral range r at node Y for tree t, and $P_t$ is the probability of tree t, which is the proportion of the tree in the pool of the sampled trees (which can be extended to the proportion of the tree in the pool of the entire posterior distribution of trees). $F(r_Y)_t$ is calculated as the actual frequency of r within the pool of biogeographic pathways optimized using DIVA for each sampled tree:

$$F(r_Y)_t = \frac{i}{R_t}$$

The value i is the number of times a range (r) occurs in the total number of MP pathways ($R_t$) over the tree. The actual frequency can be obtained by using the command “printrecs” in DIVA. An alternative estimation of $F(r_Y)_t$ uses the method of Nylander et al. (2008), that is, 1/N, where N is the total number of alternative ancestral distributions at node Y. An example of this method of probability calculation and both methods of deriving $F(r_Y)_t$ are illustrated in Fig. 2. This revised Bayes-DIVA approach can provide statistical confidence in inferred biogeographic origins of lineages of interest with unresolved or poorly supported phylogenetic placement, for which the traditional DIVA analysis or the Bayes-DIVA approach used by Nylander et al. (2008) are uninformative.
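As a minimal sketch of Equation 1, the calculation could be written in Python as follows. It assumes that the ranges reconstructed at node Y have already been extracted from the DIVA output for each sampled tree. The function names, the input layout (one list of node-Y ranges per sampled tree, with one entry per MP pathway), and the three example trees are hypothetical and are not taken from the study. Each listed tree is given $P_t = 1/n$.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def range_frequencies(pathway_ranges, method="actual"):
    """Return F(r_Y)_t for one tree.

    pathway_ranges: the range found at node Y in each MP pathway of the tree.
    method="actual": i / R_t, the share of MP pathways showing each range.
    method="equal":  1 / N, every distinct optimal range weighted equally.
    """
    counts = Counter(pathway_ranges)
    if method == "actual":
        total = len(pathway_ranges)
        return {r: Fraction(c, total) for r, c in counts.items()}
    if method == "equal":
        return {r: Fraction(1, len(counts)) for r in counts}
    raise ValueError("method must be 'actual' or 'equal'")


def range_probabilities(trees, method="actual"):
    """Equation 1: P(r_Y) = sum over trees of F(r_Y)_t * P_t, with P_t = 1/n."""
    p_t = Fraction(1, len(trees))
    probabilities = defaultdict(Fraction)  # missing ranges start at 0
    for pathway_ranges in trees:
        for r, f in range_frequencies(pathway_ranges, method).items():
            probabilities[r] += f * p_t
    return dict(probabilities)


# Hypothetical node-Y ranges for three sampled trees (illustration only).
example_trees = [["A", "A", "AB"], ["A"], ["B", "AB"]]

actual = range_probabilities(example_trees, "actual")  # A: 5/9, AB: 5/18, B: 1/6
equal = range_probabilities(example_trees, "equal")    # A: 1/2, AB: 1/3, B: 1/6
print(actual)
print(equal)
```

Exact fractions are used so that the probabilities over all ranges at Y can be checked to sum to 1 under either way of deriving $F(r_Y)_t$.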
The parent node Y in this study is similar to the floating node described by Pagel et al. (2004) in that both Y and the floating node do not always include the same crown groups. However, the floating node must include two specific crown groups of interest, although it may contain other clades or taxa as well (Pagel et al., 2004). Y differs in that it is the parent of exactly two groups: a specific crown group of interest and its sister x, which is undefined. Another important difference is that the two clades of interest at a floating node of Pagel et al. (2004) can have any level of support, whereas Y applies only to the nodes connecting the well-supported crown clade and its unspecified sister. Therefore, the floating node is not suitable as a substitute for Y.

Using simulated data, we tested whether the range probabilities of a parent node can be accurately inferred as the product of the pp at the node containing the crown group and a defined sister, and the frequency of occurrence of the range at that node optimized by DIVA on the Bayesian consensus tree topology, that is,

$$pp \times F(r_Y) \qquad (2)$$

We further tested the utility of our approach using data from Aesculus L., a genus of woody trees and shrubs with a disjunct Laurasian distribution. We also illustrate two additional applications of this method. First, we estimated the impact of two fossil wildcard taxa (sensu Nixon & Wheeler, 1992) on biogeographic reconstruction of Aesculus. Second, we examined range subdivisions at the parental nodes of lineages of interest and estimated the most probable ranges inherited by these lineages (referred to as post-Y range hereafter) to gain some insights into range evolution along the stem branches.
The primary goals of this study are: (i) to describe an alternative method of using the Bayes-DIVA analysis under phylogenetic uncertainty which can provide estimation of geographic origin for crown groups with unknown sister relationships; and (ii) to test the method and its possible applications using Aesculus L.

Fig. 2. Example of calculation of $P(r_Y)$ and of $F(r_Y)$ using two methods. A, Hypothetical sample of three Bayesian trees, T1–T3. Node Y (circles) is parent node of Lineage 1. A, B, C, and D are distribution areas. Ranges of terminals are given below lineage names. Possible ranges for node Y include A, B, C, D and widespread areas including two or more of these. In B and C, only areas with $F(r_Y) > 0$ for at least one tree are shown. B, Calculation of $F(r_Y)$ using actual frequency of areas from dispersal–vicariance analysis output (i.e., $i/R_t$). C, Calculation of $F(r_Y)$ assuming all optimal areas equally probable for each t (i.e., 1/N).

Aesculus (Sapindales, Sapindaceae) is a genus of 13–19 species belonging to six major lineages, which are supported by phylogenetic studies using molecular and morphological data: sect. Aesculus (2 species), sect. Macrothyrsus (1 species), sect. Parryana (1 species), sect. Pavia (4 species), an Asian clade (3–10 species), and the species Aesculus californica Nutt. (Xiang et al., 1998b; Forest et al., 2001; Harris et al., 2009). Extant Aesculus species are distributed across the Northern Hemisphere and each lineage is restricted to one of the following areas: East Asia (EA); western North America (wNA); eastern North America (eNA); and Europe (EU), except sect. Aesculus, which is disjunct in EA and EU.
Aesculus has a rich fossil record from EA, EU, and wNA, with fossils found in strata ranging from the Paleocene to the Quaternary (Hu & Chaney, 1940; Condit, 1944; Puri, 1945; Szafer, 1947, 1954; Tanai, 1952; Schloemer-Jäger, 1958; Prakash & Barghoorn, 1961; Axelrod, 1966; Budantsev, 1983; de Lumley, 1988; Mai & Walther, 1988; Wehr, 1998; Golovneva, 2000; Manchester, 2001; Jeong et al., 2004; Dilhoff et al., 2005).

Aesculus is an ideal genus for biogeographic study owing to its small number of species, pan-Northern Hemisphere distribution, extensive fossil record, and the continental endemism of most lineages and all species. However, molecular phylogenetic studies of Aesculus using several DNA regions (Xiang et al., 1998b; Harris et al., 2009) have resulted in poorly supported or unresolved relationships among the six major lineages despite strong support for the polytypic lineages (i.e., crown groups). Thus, the utility of DIVA applied in the traditional way for biogeographic reconstruction of the genus is limited.

In addition to deep node polytomies, biogeographic reconstruction of Aesculus presents another challenge due to uncertainties in positions of some fossil species. Recently, many authors have cited the need for inclusion of fossils in phylogenetic reconstruction and phylogeny-based biogeographic analyses (Manchester, 1999; Rothwell, 1999; Wen, 1999; Lieberman, 2003; Crane et al., 2004; Donoghue & Smith, 2004; Xiang et al., 2005, 2006, 2009; Hilton & Bateman, 2006; Rothwell & Nixon, 2006). Excluding fossils can produce a false or incomplete biogeographic history of a group (Manchester, 1999; Lieberman, 2003; Crane et al., 2004).
The limitations of including fossils, for which often only incomplete morphological data and rarely ancient DNA data are available, have been discussed (Nixon & Wheeler, 1992; Kearney, 2002; Kearney & Clark, 2003; Wiens, 2003, 2006) and observed in empirical studies (e.g. Rothwell & Nixon, 2006; Harris et al., 2009; but see Manos et al., 2007). Fossil taxa for which little informative data is available may act as wildcard taxa (Nixon & Wheeler, 1992) in phylogenetic analysis. Wildcard taxa are defined as those that, due to significant missing characters, may be placed algorithmically at many or all nodes on the tree topology (Nixon & Wheeler, 1992; Kearney & Clark, 2003). Two geographically and temporally important complete leaf (leaflets attached to a petiole) fossil species of Aesculus offer few phylogenetically informative characters. These are Aesculus longipedunculus Schloemer-Jäger (Eocene, EU) and Aesculus “magnificum” (Budantsev, 1983; Manchester, 2001) (Paleocene, EA). In preliminary analyses, these fossil species behave as wildcards, limiting phylogenetic resolution for the fossils and for otherwise well-supported groups. In the example using Aesculus, we use the revised Bayes-DIVA to provide a statistical measure of shifts in ancestral range probabilities when fossils are included versus excluded.

1 Material and methods

1.1 Assessing the difference between Equation 1 and Equation 2

It is of interest to determine if the product of the pp and the frequency of a range ($F(r_Y)$) derived from DIVA analysis of the Bayesian consensus tree (with compatible groupings below 50% allowed) effectively reflects the estimation using Equation 1 of the revised Bayes-DIVA method (i.e., Equation 2 versus Equation 1) because the former is so much simpler.
To accomplish this, 10 random DNA sequences of 200 bp in length were generated using a JavaScript sequence generator (http://www.faculty.ucr.edu/~mmaduro/random.htm) (M. Maduro, pers. comm., 2008). These sequences were used to represent 10 hypothetical lineages, Lineage 1–Lineage 10. These lineages represent 10 unique operational taxonomic units where each might be a species or a clade containing multiple species with 100% pp. This is a simplistic example, but our analysis of Aesculus L. provides an example of data calculation for clades supported by pp less than 100% of the data. The 10 simulated sequences were treated as aligned and placed in a data matrix. Any true relationship among these sequences was unknown and inessential, as the objective was not to test the utility of Bayesian analysis in recovering true relationships. The random sequences were expected to provide phylogenetic uncertainty sufficient to test whether Equation 1 results in ancestral range probabilities at a node significantly different from those resulting from Equation 2. The 10 random sequences are available from the authors by request.

Phylogenetic analysis of the simulated data was carried out using MrBayes 3.1.2 (Huelsenbeck & Ronquist, 2001; Huelsenbeck & Ronquist, 2003). The program was run using default priors for two simultaneous runs of 22 million generations each. Each run used one hot chain and two cold chains with default settings. Burnin was set to 2,200,000 generations (or 10%) and trees were sampled every 2000 generations. Resulting post-burnin trees were assembled into a PHYLIP format file and a majority rule consensus with compatible groupings >50% was generated using Consense in the PHYLIP 3.68 package (Felsenstein, 1989; Felsenstein, 2008). Lineage 3 was randomly selected as an outgroup.
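The tree counts implied by these settings can be checked with a few lines of Python. This is only an arithmetic sketch. The function name is hypothetical, and it assumes exactly one tree is retained per sampling interval in each run.

```python
def post_burnin_trees(generations, sample_every, burnin_generations, runs):
    """Count the trees kept after burn-in, pooled over all runs."""
    sampled_per_run = generations // sample_every        # trees written per run
    burnin_per_run = burnin_generations // sample_every  # trees discarded per run
    return runs * (sampled_per_run - burnin_per_run)


# Settings reported in the text: 22 million generations, sampling every
# 2000 generations, burn-in of 2.2 million generations, two runs.
pooled = post_burnin_trees(22_000_000, 2000, 2_200_000, 2)
print(pooled)  # 19800
```

Under these assumptions each run yields 11,000 sampled trees, of which 1100 are discarded, which agrees with the pool of 19,800 post-burnin trees used in the study.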
The consensus tree was used to identify four lineages forming two sister pairs that would be used to test our hypothesis: Lineages 1 and 8; and Lineages 4 and 9 (Fig. 3: A). One hundred trees from the 19,800 post-burnin dataset were randomly selected using RandomTree (Kauff, 2005). Four ancestral areas, A, B, C, and D, were randomly assigned to each of the 10 lineages, with each area being used at least once and with each lineage endemic to a single area. The 100 trees were optimized using DIVA 1.1 for Windows (Ronquist, 1996, 1997) with default settings. The ancestral ranges of the parent nodes were recorded in a Microsoft Excel 2007 spreadsheet. The spreadsheet format was used for calculation of ancestral range probabilities at each node of interest and for statistical tests.

Lineages 1, 4, 8, and 9 (Fig. 3: A) were used to compare the probabilities calculated using Equation 1 and Equation 2. Probabilities of ancestral ranges for the node shared by Lineage 1 + Lineage 8 and the node shared by Lineage 4 + Lineage 9 (occurring in the 50% consensus topology) were first calculated using Equation 2 to provide an estimation of ancestral origin of these lineages. The results were then compared to those estimated using Equation 1, in which the sisters of Lineages 1, 4, 8, and 9 were undefined (x). A two-tailed z-test was used to determine if there was a significant difference between probabilities for ranges obtained using the two methods. The goal of these comparisons, and of similar comparisons made in the empirical example using Aesculus, was to determine whether Equation 1 could recover more informative range data than Equation 2 for a parental node that has <100% pp in the Bayesian consensus tree. Any significant differences between Equation 1 and Equation 2 indicate that there is additional useful range information present in the subset of Bayesian trees that is discarded by using Equation 2. In all DIVA analyses constraints on maximum areas (“maxareas” command) were not implemented.

Fig. 3. Results of Bayesian analysis of simulated data. A, Consensus trees for 19,800 (left) and 100 (right) Bayesian trees. Values of posterior probability support are shown above branches, actual occurrences are given in parentheses. Geographic ranges of terminals subtend terminal names. Parent and crown nodes used in Bayes-dispersal–vicariance analysis simulation are highlighted, expanded in B. MJ, majority. B, Explanation of nodes of interest for Bayes-dispersal–vicariance analysis simulation.

1.2 Reconstructing ancestral ranges in Aesculus L.

1.2.1 DNA and morphological data

DNA sequences from matK, the rps16 intron, and internal transcribed spacer (ITS), available from a previous study for 16 species of Aesculus as well as for outgroup taxa Handeliodendron bodinieri Redhr., Billia columbiana Planch. & Linden ex Triana & Planch. and Billia hippocastanum Peyr., were used in this study (Appendix I). For information on outgroup selection see Hardin (1957a), Judd et al. (1994), Xiang et al. (1998b), Forest et al. (2001), Harrington et al. (2005), and Harris et al. (2009). DNA sequences were aligned manually using MacClade 4.02 (Maddison & Maddison, 2001). The 39-character morphological matrix of Forest et al. (2001) was modified by: (i) excluding all outgroup taxa used in their study except those noted above; (ii) eliminating Aesculus glabra Willd. var. arguta (Buckley) B.L. Rob.; and (iii) combining the species of Billia into a single taxonomic entry, Billia sp. Fossil taxa, A. longipedunculus and A. “magnificum”, were scored based on published reports (Schloemer-Jäger, 1958; Budantsev, 1983; Golovneva, 2000; Manchester, 2001) for three characters: petiolulate leaflets (as opposed to sessile); serrate margins (as opposed to entire); and having palmately compound leaves (as opposed to ternate).
The presence of petiolulate leaflets is a parsimony informative character in Aesculus (Hardin, 1957a; Forest et al., 2001; Manchester, 2001; Harris et al., 2009). All extant species of Aesculus except (arguably) Aesculus parryi (sect. Parryana) have some degree of leaf serration (Hardin, 1957a; Forest et al., 2001). Outgroup taxa Handeliodendron and Billia have entire leaflets (Wiggins, 1932; Hardin, 1957a, 1957b; Forest et al., 2001; Harris et al., 2009). Palmately compound leaves are common to all extant Aesculus and Handeliodendron, whereas leaves of Billia are ternate (Forest et al., 2001; Hardin, 1957a, 1957b, 1960).

1.2.2 Phylogenetic analysis

Three independent phylogenetic analyses were carried out. In Analysis 1, gaps in matK were coded using ambiguous region coding (ARC) (Kauff et al., 2003) for ambiguously aligned regions and simple gap coding for unambiguous gaps. In Analysis 2, ARC and simple gap coding were applied for all genes in the concatenated sequences. Analysis 3 included the extant species as well as the two fossil species A. longipedunculus and A. “magnificum” and was carried out using a matrix of combined morphological and molecular data with the same ARC and gap codings as Analysis 2. Analyses were carried out using MrBayes 3.1.2. Data were partitioned into four sets: matK, rps16, ITS, and morphology, the last including the modified morphological matrix of Forest et al. (2001) and the standard states from ARC and simple gap coding. For each gene region, ModelTest 3.0 (Posada & Crandall, 1998) was used to determine the best model of evolution. Although character state ratios and other specific information were dependent on the use of ARC and simple gap coding, the basic models were not affected by use of these coding methods.
The Akaike Information Criterion in ModelTest returned the following models: TVM + I + G for matK, TRN + I for ITS, and K81uf for rps16. Models were implemented in MrBayes using the PRSET and LSET commands. For each analysis, two simultaneous, independent Markov chains were run for 22 million generations to check convergence. Trees were sampled every 2000 generations. Burnin was set to 2.2 million generations or 1100 trees, and was checked using Tracer 1.3 (Rambaut & Drummond, 2003). The 19,800 post-burnin trees from each analysis were combined independently and summarized by generating a 50% majority rule consensus tree in PAUP* 4.0b10 (Swofford, 2002).

1.2.3 Biogeographic analysis using the revised Bayes-DIVA method

Nine nodes of interest were identified on the Bayesian consensus tree from analysis of combined data with gaps in matK coded using ARC and simple gap coding (Analysis 1). These were the parent nodes of sect. Aesculus, sect. Macrothyrsus, sect. Parryana, sect. Pavia, the Asian clade, and A. californica, and the crown nodes of each of the polytypic lineages: sect. Aesculus, sect. Pavia, and the Asian clade. One hundred trees from the combined post-burnin Bayesian tree files from each analysis were randomly sampled using RandomTree. Terminals were coded as belonging to one of five ancestral areas: Europe (A), East Asia (B), eastern North America (C), western North America (D), and Latin America (E) to cover distributional ranges of Aesculus and its outgroup Billia. Trees were optimized using default settings in DIVA 1.1 for Macintosh. Results from DIVA for each of the nine nodes of interest were recorded in a Microsoft Excel spreadsheet which was used for subsequent calculations. Ancestral range probability at each node of interest was calculated using Equation 1. For those nodes present in the Bayesian consensus topology, the probability of alternative ancestral ranges was also calculated using Equation 2 for comparison.
Individual topologies of sampled trees were examined using TreeView 1.6.6 (Page, 1996, 2001) and PAUP* 4.0b10 (Swofford, 2002).

Biogeographic analysis of the Analysis 3 phylogenies using the revised Bayes-DIVA method considered only the ancestral ranges of the six parent nodes and did not include the three crown group nodes. We used the floating node of Pagel et al. (2004) in cases where crown clades contained fossil species. Under the floating node, a crown clade was considered present on a tree as long as the floating node included only the crown clade alone or only the crown clade plus one or both fossil species. The floating node was not a substitute for Y. Instead, Y included the crown clade (plus any fossils) and x. On some topologies for some crown clades of interest, x was a fossil, and this was acceptable. The revised Bayes-DIVA analysis including fossils was done using a sample of 100 trees from the post-burnin posterior distribution of trees. This analysis was repeated for the same set of 100 trees with fossils pruned from the topologies. Z statistics were used to compare the results of these two analyses (fossils included and fossils pruned) and the results from Analysis 1 including only extant species.

Post-Y range analyses for each of the six major lineages of Aesculus were carried out using Bayes-DIVA results from Analysis 1 data. For each node Y of the six major lineages, all possible ranges that the branch leading to the crown group of interest could inherit from ranges at Y with P > 0 were determined. Inheritance of each possible range from splitting of ranges at Y was considered equally probable and was then weighted by the probability of the ancestral range at Y. The probability of each possible range inherited from node Y by each of the descendant branches was calculated as the sum of the probabilities of that range over all ranges with P > 0 at Y.
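The weighting just described can be sketched in Python as follows. This is not the authors' RAD@Y program. The function names are hypothetical, areas are assumed to be coded as single letters, and the inheritable ranges are assumed to be every non-empty combination of the areas in the range at Y, which the text illustrates only for one-area and two-area ranges. The input probabilities are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def inheritable_ranges(range_at_y):
    """List every non-empty combination of the areas in a range at Y.

    Assumption: a descendant branch may inherit any such combination,
    e.g. "AB" gives A, B, and AB.
    """
    areas = sorted(range_at_y)
    return [
        "".join(combo)
        for size in range(1, len(areas) + 1)
        for combo in combinations(areas, size)
    ]


def post_y_probabilities(p_at_y):
    """Weight equally probable inherited ranges by P of the range at Y."""
    post = defaultdict(float)
    for range_at_y, p in p_at_y.items():
        options = inheritable_ranges(range_at_y)
        for inherited in options:
            post[inherited] += p / len(options)
    return dict(post)


# Hypothetical probabilities at node Y: P(A) = 0.50 and P(AB) = 0.50.
post = post_y_probabilities({"A": 0.50, "AB": 0.50})
print(post)
```

With these inputs the post-Y probability of range A is 0.50 × 1.0 + 0.50 × 1/3, or about 0.667, and B and AB each receive about 0.167.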
For example, suppose that at node Y of Lineage L, P(A) = 0.50 and P(AB) = 0.50. For range A at node Y, the probability of inheritance of range A by the two descendant lineages is 1.0. For range AB at node Y, the descendant lineages may inherit A, B, or AB, each with a probability of 1/3. The post-Y probability of range A for Lineage L is, therefore,

$$\text{post-}Y\ P(A) = 0.50 \times 1.0 + 0.50 \times 0.333 = 0.667$$

Post-Y range probability calculations were carried out using RAD@Y, a Python 2.5 user interface program developed by the authors for this purpose and available upon request. The post-Y ranges provide information on range inheritance of the descendant lineages and range evolution along the stem of crown groups.

For all comparisons between Equation 1 and Equation 2 in the empirical example using Aesculus, a quick and conservative approach was used by allowing $F(r_Y)$ to have its largest possible value, that is, $F(r_Y) = 1$ (when there was no uncertainty from DIVA optimization for node Y), thus Equation 2 = pp. If the maximized values of Equation 2 are still significantly smaller than those found by using Equation 1, the conclusion that Equation 2 does not effectively reflect the probability estimated by Equation 1 can be made.

2 Results

2.1 Equation 1 vs. Equation 2 in simulated data

Relationships between all lineages were poorly supported (Fig. 3: A). The highest pp support was observed for the sister relationships between Lineages 1 and 8 (pp = 52%) and Lineages 4 and 9 (pp = 48%). In the randomly selected subset of Bayesian trees, the monophyly of Lineages 1 + 8 was supported in 58% of the data and the monophyly of Lineages 4 + 9 was supported in 47% of the data (Fig. 3: A). Results from DIVA using the 50% majority rule tree with nodes compatible (Fig. 3: A) indicated that the geographic range for the node shared by Lineages 8 + 1 was A only (no alternative solutions), thus $F(r_Y) = 1.0$.
For the node shared by Lineages 4 and 9, results from DIVA showed an ancestral range of BD only with F(rY) = 1.0. Therefore, the probabilities of ancestral ranges for these nodes based on Equation 2 were P(A) = 0.54 ∗ 1.0 for L8 + 1 and P(BD) = 0.47 ∗ 1.0 for L4 + 9, implying that the geographic origins of both L1 and L8 are most likely to have occurred in A with a probability of 0.54, whereas the geographic origins of L4 and L9 are both most likely to have occurred in BD with a probability of 0.47.

In the revised Bayes-DIVA approach applying Equation 1, the most probable ancestral ranges at four parent nodes (Fig. 3: B), Lineage 1 + x, Lineage 4 + x, Lineage 8 + x, and Lineage 9 + x, inferred from the sample of 100 Bayesian trees were A (P = 0.744), BD (P = 0.484), A (P = 0.755), and BD (P = 0.643), respectively (Fig. 4: A). For each parent node of interest, the probability of the most highly supported ancestral range was significantly greater than that of the second most highly supported ancestral range (Fig. 4: A) and significantly greater than the value obtained by using Equation 2, except in the case of Lineage 4 + x (Table 1). The probability of BD was significantly higher for Lineage 9 than for Lineage 4 (Fig. 4: B), whereas P(BD) was equal for Lineages 4 and 9 when using Equation 2.

2.2 Results of analyses using Aesculus

Phylogenetic analyses of different data partitions showed strong support for the monophyly of polytypic groups but poor resolution of relationships among the major lineages (Fig. 5). In the analysis including fossils, support for the monophyly of all polytypic lineages greatly decreased (Fig. 6). Fossil species were observed to ally variously with all major lineages and, rarely, with outgroup species, with low support (Fig. 6).

Results from the modified Bayes-DIVA analysis (below) are not presented on the consensus tree or other graphical representation of the relationships between clades of Aesculus. This is for three reasons.
First, Y cannot be accurately reflected on a consensus tree or other single topology. Second, the probabilities of ancestral ranges calculated using Equation 1 are not dependent on the position of the clades on the tree nor on the pp support shown on the tree for clades, though they are weighted by these values. Finally, assuming that x is best represented by the sister group indicated on the consensus tree topology limits confidence in the results of the revised Bayes-DIVA analysis to confidence in nodal support.

Fig. 4. Results of Bayes-dispersal–vicariance analysis of simulated data. A, Relative frequency graphs showing probability (P) of ancestral ranges for the parent nodes of Lineages 1, 4, 8, and 9 and their unspecified sisters (x). Circled numbers correspond to numbered lineages. Ranges are shown above graphs. Results of Z-test comparing the most highly supported range to the second most highly supported range shown below frequency boxes. Arrows point to bars compared in B. B, Comparison of P(BD) as ancestral range of Lineages 4 and 9.

Table 1 Comparison of pp∗F(rY) (Equation 2) and $P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t P_t$ (Equation 1) for ancestral areas of lineages of interest from the Java script simulated data

| Lineage | Sister in Bayesian consensus tree | Ancestral area from consensus tree optimization | pp support for sister in consensus of 19,800 | Equation 2 results† | Most highly supported ancestral area from Bayes-DIVA analysis | Equation 1 results | z statistic for comparison of EQ1 and EQ2 | Significant difference at α/2 = .005 | p value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Lineage 8 | A | 0.52 | 0.58 | A | 0.744 | 3.90 | yes | < 0.0001 |
| 4 | Lineage 9 | BD | 0.48 | 0.47 | BD | 0.484 | 0.286 | no | 0.779 |
| 8 | Lineage 1 | A | 0.52 | 0.58 | A | 0.755 | 4.37 | yes | 0.0002 |
| 9 | Lineage 4 | BD | 0.48 | 0.47 | BD | 0.643 | 3.80 | yes | < 0.0001 |

†pp∗F(rY) was estimated with F(rY) = 1 for a more conservative test on the majority rule tree of the 100 sampled trees; ‡A, B, C, and D were used in dispersal–vicariance analysis (DIVA) of simulated data to represent four hypothetical, unique areas. pp, posterior probability.

Fig. 5. Bayesian trees from phylogenetic Analyses 1 and 2 of Aesculus. Consensus trees were condensed, showing major lineages. Values of posterior probability support are above branches, and bootstrap support values are below branches. Modern ranges subtend terminal names corresponding to areas indicated in C. A, Results of analysis of extant taxa only (Analysis 1 with ambiguous region coding and simple gap coding in matK). Numbered nodes correspond to nine nodes of interest considered in Bayes-dispersal–vicariance analysis. 1, Asian clade + x; 2, sect. Aesculus + x; 3, Aesculus californica + x; 4, sect. Macrothyrsus + x; 5, sect. Pavia + x; 6, sect. Parryana + x; 7–9, last shared ancestor of species of polytypic lineages, that is, crown nodes. B, Results of Analysis 2 including extant species only and with ambiguous region coding and simple gap coding for all gene regions. C, Geographic map indicating areas used in Bayes-dispersal–vicariance analysis, created using Online Map Creation (Weinelt, 1999). EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

Using the revised Bayes-DIVA analysis, the ancestral ranges of the crown nodes of interest (Fig. 5: A, nodes 7–9), sect. Aesculus, sect. Pavia, and the Asian clade, were estimated to be EA-EU, eNA, and EA, respectively, in all sampled trees. Therefore, the probabilities for these ranges at these crown nodes are all equal to 1.0. In this case, there is no difference between Equation 1 and Equation 2 because all groups in question were supported by 100% pp and there was no optimization uncertainty in DIVA for any of the sampled trees.

Fig. 6. Results from phylogenetic analysis of Aesculus including extant species and wildcard fossils (Analysis 3). A, Bayesian consensus trees from Analysis 3. Values of posterior probability support from 19,800 trees are shown above branches and those from 100 randomly sampled trees are below branches. Fossils highlighted in gray. Dashed lines indicate placement of Aesculus “magnificum” in consensus of 19,800 trees (lower) and 100 trees (upper). Distributional ranges are provided to the right of terminals corresponding to those indicated in B. B, Geographic map indicating areas used in Bayes-dispersal–vicariance analysis, created using Online Map Creation (Weinelt, 1999). EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

In contrast, the ancestral ranges at the parent nodes of the Asian clade, sect. Aesculus, sect. Pavia, A. californica, sect. Parryana, and sect. Macrothyrsus (Fig. 5: A, nodes 1–6) were sensitive to topological rearrangements. More than one optimal geographic range was resolved for each of these parent nodes using Bayes-DIVA (Fig. 7). For five of the six lineages, a most probable range with P ≥ 0.5 was recovered. The most probable range for the parent node of the Asian clade was shown to be EA with P = 0.755 (Table 2, Fig. 7: A). An EA distribution was also revealed to be the most likely for the parent node of sect. Aesculus (P = 0.832) (Table 2, Fig. 7: B). For the parent nodes of sect. Pavia and A. californica, the most likely ancestral ranges were shown to be widespread in eNA-wNA and EA-wNA, respectively (P = 0.663 and 0.76) (Table 2, Fig. 7: D, C), whereas the parent nodes of sect. Parryana and sect. Macrothyrsus were both shown to be widespread in eNA-wNA (P = 0.90 and 0.395, respectively) (Table 2, Fig. 7: E, F). Some of these probability values are greater than the pp support for the nodes shared by these lineages and a specific sister, and all are greater than the Equation 2 values for nodes present in the 50% majority rule Bayesian consensus (Table 3). The probabilities obtained for ranges at the parent nodes of sect. Parryana + x and sect. Aesculus + x were significantly different from those obtained using Equation 2 (Table 2), for which the relationships sect. Parryana + sect. Pavia and sect. Aesculus + the Asian clade were used for DIVA analysis (Table 3).

Fig. 7. Probabilities of ancestral ranges for the six major lineages of Aesculus L. Highest probabilities are given in black text in beveled slices. A, Asian clade. B, Section Aesculus. C, Aesculus californica. D, Section Pavia. E, Section Parryana. F, Section Macrothyrsus. EA, East Asia; eNA, eastern North America; EU, Europe; LA, Latin America; wNA, western North America.

Table 2 Most probable ancestral ranges of the stem node of six lineages inferred from analysis without fossils and comparison between Equation 1 and Equation 2 calculations of probability

| Lineage | Most probable range | Equation 1 results | Equation 2 results† | Z-statistic‡ for comparison of Eqn 1 to Eqn 2 | P value§ |
|---|---|---|---|---|---|
| Asian clade | EA | 0.755 | 0.73 | 0.580 | 0.5619 |
| Aesculus | EA | 0.832 | 0.73 | 2.719 | 0.0065 |
| Aesculus californica | EA-wNA-eNA | 0.760 | 0.75 | 0.234 | 0.815 |
| Pavia | eNA-wNA | 0.663 | 0.70 | −0.781 | 0.4348 |
| Parryana | eNA-wNA | 0.900 | 0.70 | 6.628 | < 0.0001 |
| Macrothyrsus | eNA-wNA | 0.395 | none ≥ 0.50 | n/a | n/a |

†pp∗F(rY) of Equation 2 was estimated with F(rY) = 1 (no optimization uncertainty), leading to pp∗F(rY) = pp for conservative test. See Material and Methods, 1.2.3; ‡Two-tailed z-test; §Highlighting indicates significance at Zα/2, α = 0.01; n/a, Eqn 2 not used for calculation of ancestral range probability for sect. Macrothyrsus as no posterior probability (pp) value is available from 50% majority rule consensus of Bayesian topologies. EA, East Asia; eNA, eastern North America; wNA, western North America.

Table 3 Differences between pp and pp∗F(rY) of Equation 2 for Aesculus stem lineage nodes on Bayesian consensus tree derived from analysis without fossils (Analysis 1)†. F(rY) was calculated using 1/N

| Lineage | Sister lineage in 50% MJ rule‡ | pp supporting relationship to sister in consensus topology | Most probable range(s) from DIVA analysis of consensus tree‡ | Equation 2 results |
|---|---|---|---|---|
| Asian clade | Aesculus | 0.70 | EA | 0.365 |
| | | | EU-EA | 0.365 |
| Aesculus | Asian clade | 0.70 | EA | 0.365 |
| | | | EU-EA | 0.365 |
| A. californica | (Aesculus + Asian clade) | 0.75 | EU-EA | 0.250 |
| | | | EA-wNA | 0.250 |
| | | | EU-EA-wNA | 0.250 |
| Pavia | Parryana | 0.70 | eNA-wNA | 0.70 |
| Parryana | Pavia | 0.70 | eNA-wNA | 0.70 |

†Sect. Macrothyrsus pruned for this analysis to produce fully bifurcating tree topology; ‡See Fig. 5: A. DIVA, dispersal–vicariance analysis; EA, East Asia; eNA, eastern North America; EU, Europe; pp, posterior probability; wNA, western North America.

When fossils were included in the Bayes-DIVA analysis, the probability of any ancestral range including Europe, P(EU ∈ R), increased significantly for three of six parent nodes (Fig. 8, Table 4) when compared to results from trees with fossils pruned. The value of P(EU ∈ R) increased significantly for all six parent nodes when compared to results from trees resulting from phylogenetic analysis including extant taxa only (Table 4). In contrast, changes in the probability of ranges including East Asia, P(EA ∈ R), were less dramatic when fossils were included vs. excluded (Fig. 8, Table 4).

Fig. 8. Comparison of P(EU ∈ R) and P(EA ∈ R) for the six parent nodes of interest when fossils are included, pruned, and excluded. Probability (P; y axis) is the probability of any ancestral area, including widespread areas, that includes Europe (left) and East Asia (right).
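The quantity P(EU ∈ R) compared in Fig. 8 is the summed probability of every ancestral range, single-area or widespread, that contains the area in question. A minimal sketch of that marginalization follows. The function name `prob_area_in_range` and the range probabilities are hypothetical, chosen only for illustration, and are not taken from the Aesculus results.

```python
def prob_area_in_range(range_probs, area):
    """Sum the probabilities of all ancestral ranges that contain `area`.

    range_probs maps a range (frozenset of area codes) to its probability
    at a parent node Y.
    """
    return sum(p for areas, p in range_probs.items() if area in areas)


# Hypothetical range probabilities at one parent node (they sum to 1).
node_y = {
    frozenset({"EA"}): 0.50,
    frozenset({"EA", "EU"}): 0.25,
    frozenset({"EU"}): 0.125,
    frozenset({"eNA", "wNA"}): 0.125,
}

p_eu = prob_area_in_range(node_y, "EU")  # ranges EA-EU and EU
p_ea = prob_area_in_range(node_y, "EA")  # ranges EA and EA-EU
print(f"P(EU in R) = {p_eu:.3f}")
print(f"P(EA in R) = {p_ea:.3f}")
```

Because the alternative ranges at a node are mutually exclusive, the marginal value for an area can exceed the probability of any single range, so an area may be well supported even when no individual range is.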
Table 4 Change in probabilities of ancestral ranges including Europe (EU) and East Asia (EA) when fossils excluded vs. included. A, Comparison of probabilities calculated using Bayes-dispersal–vicariance analysis with fossils pruned vs. fossils included on trees from analysis including both extant and fossil species. Highlighting indicates significant change in P. B, Comparison of probabilities with fossils included (trees from Analysis 3) vs. excluded (trees from analysis including only extant species). Highlighting indicates significant change in P

A

| Lineage | P(EU ∈ R), fossils excl. | P(EU ∈ R), fossils incl. | Change in P(EU ∈ R)† | P value‡ | P(EA ∈ R), fossils excl. | P(EA ∈ R), fossils incl. | Change in P(EA ∈ R)† | P value‡ |
|---|---|---|---|---|---|---|---|---|
| Asian clade | 0.745 | 0.655 | 0.090 ↓ | 0.1648 | 1.000 | 0.992 | 0.008 ↓ | 0.3703 |
| Aesculus | 0.678 | 0.707 | 0.029 ↑ | 0.6599 | 0.674 | 0.568 | 0.106 ↓ | 0.1223 |
| Aesculus californica | 0.000 | 0.183 | 0.183 ↑ | < 0.0001 | 0.000 | 0.192 | 0.192 ↑ | < 0.0001 |
| Pavia | 0.033 | 0.114 | 0.081 ↑ | 0.0282 | 0.033 | 0.044 | 0.011 ↑ | 0.9681 |
| Parryana | 0.044 | 0.112 | 0.068 ↑ | 0.0735 | 0.033 | 0.062 | 0.029 ↑ | 0.3371 |
| Macrothyrsus | 0.000 | 0.200 | 0.200 ↑ | < 0.0001 | 0.000 | 0.179 | 0.179 ↑ | < 0.0001 |

B

| Lineage | P(EU ∈ R), fossils excl. | P(EU ∈ R), fossils incl. | Change in P(EU ∈ R)† | P value‡ | P(EA ∈ R), fossils excl. | P(EA ∈ R), fossils incl. | Change in P(EA ∈ R)† | P value‡ |
|---|---|---|---|---|---|---|---|---|
| Asian clade | 0.000 | 0.655 | 0.655 ↑ | < 0.0001 | 1.000 | 0.992 | 0.008 ↓ | 0.3703 |
| Aesculus | 0.088 | 0.707 | 0.619 ↑ | < 0.0001 | 0.990 | 0.568 | 0.422 ↓ | < 0.0001 |
| Aesculus californica | 0.080 | 0.183 | 0.103 ↑ | < 0.0001 | 0.765 | 0.192 | 0.573 ↑ | < 0.0001 |
| Pavia | 0.000 | 0.114 | 0.114 ↑ | < 0.0001 | 0.022 | 0.044 | 0.022 ↑ | 0.3843 |
| Parryana | 0.007 | 0.112 | 0.105 ↑ | 0.0017 | 0.036 | 0.062 | 0.026 ↑ | 0.3953 |
| Macrothyrsus | 0.058 | 0.200 | 0.142 ↑ | 0.0027 | 0.337 | 0.179 | 0.158 ↓ | 0.0107 |

†Absolute value of change, arrow indicating direction of change when fossils included; ‡From z-test comparing two means.

Table 5 Possible post-Y ranges inherited from the most probable ancestral range (see Fig. 7) for each of the six major lineages of Aesculus

| Lineage† | Possible post-Y ranges‡ | Probability for each post-Y range |
|---|---|---|
| Asian clade (5) | EA | 0.83700 |
| | EA-wNA | 0.06000 |
| | wNA | 0.06000 |
| Aesculus (11) | EA | 0.88800 |
| | EU | 0.03250 |
| | EA-EU | 0.02905 |
| Aesculus californica (11) | wNA | 0.32500 |
| | EA | 0.26000 |
| | EA-wNA | 0.26000 |
| Pavia (7) | eNA | 0.54300 |
| | wNA | 0.22100 |
| | eNA-wNA | 0.22100 |
| Parryana (23) | wNA | 0.34400 |
| | eNA-wNA | 0.30670 |
| | eNA | 0.30670 |
| Macrothyrsus (15) | eNA | 0.47000 |
| | wNA | 0.15200 |
| | eNA-wNA | 0.15100 |

†Total number of non-zero post-Y ranges is shown in parentheses with lineage names; ‡Only the three highest post-Y ranges are shown. EA, East Asia; eNA, eastern North America; wNA, western North America.

Post-Y range calculations yielded moderate to high support for a post-Y range of EA for the Asian clade and sect. Aesculus (post-Y P(EA) = 0.837 and 0.888, respectively) (Table 5). The most probable post-Y range for sect. Pavia was eNA, but with lower support (post-Y P(eNA) = 0.543) (Table 5). For the other three major lineages of Aesculus, no single post-Y range received support greater than 0.500 (Table 5).
These preliminary results, which do not represent all of the available molecular and fossil data (see Harris et al., 2009), revealed possible extinction in EA and migration to wNA of the A. californica lineage, extinction in wNA and dispersal to eNA of the sect. Pavia lineage, and extinction in eNA and dispersal to wNA of sect. Parryana.

3 Discussion

3.1 Accounting for phylogenetic and DIVA optimization uncertainties

Accounting for uncertainties in phylogeny and optimization is a major challenge in biogeographic analysis. The Bayes-DIVA method provides a simple and sound solution to this problem. The Bayes-DIVA method of Nylander et al. (2008) applies to nodes with fixed bipartitions (i.e., the two sister lineages at a node are clearly defined), and only trees containing these fixed nodes are considered. The revised Bayes-DIVA approach extends the method to allow estimation of geographic ranges at a node with only one of the two lineages defined, so that all trees containing the defined lineage contribute to the estimation. This revision to Bayes-DIVA provides a method of estimating, with statistical confidence, the biogeographic origins of lineages with uncertain sister affiliation.

Both Bayes-DIVA methods require optimization of a large set of Bayesian topologies and subsequent analyses of the results. It would be easier if the product of the pp value and F(rY) obtained from the 50% majority rule tree (i.e., Equation 2) could accurately reflect the full extent of range information inherent in the sampled Bayesian trees. However, our comparisons showed that this is not the case (e.g., P(BD) for Lineages 4 and 9, Fig. 4: B) and that probabilities calculated using Equation 2, even when F(rY) is equal to its maximum value of 1.0, are usually lower than the probabilities obtained using the revised Bayes-DIVA method.
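The contrast between the two calculations can be made concrete with a small sketch. Equation 1 sums F(rY)t Pt over every sampled topology that contains the crown lineage, whatever its sister x, whereas Equation 2 multiplies the pp of one specific sister relationship by F(rY) from the consensus tree. The data layout, the function names, the five-tree sample, and the equal topology probabilities Pt = 1/n below are illustrative assumptions, not values from the analyses reported here.

```python
from collections import Counter


def equation_1(trees, target_range):
    """P(rY) = sum over sampled topologies t of F(rY)_t * P_t.

    Each tree is a dict with:
      "prob"   : P_t, the probability of topology t
      "ranges" : the optimal ranges reconstructed at node Y on tree t
                 (one entry per optimal solution, so repeats are possible)
    """
    total = 0.0
    for tree in trees:
        counts = Counter(tree["ranges"])
        freq = counts[target_range] / len(tree["ranges"])  # F(rY)_t
        total += freq * tree["prob"]
    return total


def equation_2(pp, freq_on_consensus):
    """pp * F(rY), evaluated once on the consensus tree."""
    return pp * freq_on_consensus


# Hypothetical sample of five equally probable topologies (P_t = 0.2).
# "sister" records which lineage is x on each tree.
sample = [
    {"prob": 0.2, "sister": "L8", "ranges": ["A"]},
    {"prob": 0.2, "sister": "L8", "ranges": ["A"]},
    {"prob": 0.2, "sister": "L8", "ranges": ["A"]},
    {"prob": 0.2, "sister": "L3", "ranges": ["A", "AB"]},
    {"prob": 0.2, "sister": "L5", "ranges": ["B"]},
]

p_eq1 = equation_1(sample, "A")
# pp of the sister relationship shown on the (hypothetical) consensus tree
pp_sister = sum(t["prob"] for t in sample if t["sister"] == "L8")
p_eq2 = equation_2(pp_sister, 1.0)  # F(rY) = 1 on the consensus node

print(f"Equation 1: {p_eq1:.2f}")
print(f"Equation 2: {p_eq2:.2f}")
```

In this toy sample, the tree on which the lineage has a different sister but still reconstructs range A at Y adds to P(A) under Equation 1; that contribution is what a calculation restricted to the consensus sister relationship leaves out.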
An alternative way to simplify the calculation of Equation 1 is to use 1/N (where N is the number of alternative optimal ranges from DIVA for tree t) for F(rY). However, we found that 1/N (implying occurrence of each unique alternative range with equal frequency) can be very different from the actual frequencies (Rit) (Fig. 9). The values of F(rY) calculated using Rit can be substantially different in two trees showing identical sister relationships at the node of interest but differing elsewhere (Fig. 9). Using Rit as a calculation of F(rY) accurately reflects the frequencies of ranges given the data, which is important because the actual frequencies better reflect the uncertainty of DIVA optimization. Just as a range with 100% occurrence at node Y on a given tree suggests no uncertainty in DIVA optimization, a range at node Y occurring more frequently in the optimal MP pathways indicates greater certainty of that range in DIVA optimization compared to other ranges occurring at lower frequencies. However, 1/N may be used if one prefers to weight the alternative ranges at a node equally. Software for automating the analysis of results from Bayes-DIVA and the calculation of probabilities is desirable because, at present, this can be time consuming.

The revised Bayes-DIVA approach is not in disagreement with Nylander et al. (2008) or Huelsenbeck and Immenov (2002), but rather provides an alternative method of accommodating phylogenetic and optimization uncertainties extending to parent nodes of crown groups with uncertain sisters. The model-based, full Bayesian approach of Sanmartín et al. (2008) has also been developed to account for phylogenetic and optimization uncertainties in inferring biogeographic dispersal events; this approach is well suited for island biogeography, but may not be suitable for continental biogeography.

Nylander et al. (2008) raised the question of how range probabilities obtained using Bayes-DIVA should be interpreted because the optimal ranges for each node from DIVA represent only the most parsimonious solutions, rather than including all possible solutions that may be statistically equally likely. This is no less of a concern for the revised Bayes-DIVA approach presented here. Nylander et al. (2008) hypothesized that optimal solutions from DIVA might be treated as approximating ML solutions and that the Bayes-DIVA method could then be treated as a non-parametric empirical Bayesian method. The method relies on empirical observations to approximate the actual stochastic distribution (see Johns, 1957). However, as noted by Nylander et al. (2008), it is not currently possible to determine how effectively DIVA MP solutions approximate ML solutions because there is no stochastic model for DIVA and, thus, no way of estimating the full range of distribution of solutions. Nonetheless, studies comparing biogeographic inference using DIVA and the model-based likelihood methods (e.g., Ree et al., 2005; Ree & Smith, 2008) have found that results are often largely congruent (e.g., Ree et al., 2005; Xiang & Thomas, 2008; Xiang et al., 2009). This may support the necessary assumption that MP solutions from DIVA are reasonable approximations of the ML solutions.

3.2 Biogeographical inference of extant Aesculus

Conflicting biogeographic hypotheses have been proposed for Aesculus (Hardin, 1957a; Xiang et al., 1998b; Forest et al., 2001; Harris et al., 2009). The most recent hypothesis was proposed by Harris et al. (2009) based on results of DIVA using phylogenies inferred from a combination of DNA sequences, morphology, and fossils. The study of Harris et al. (2009) included more molecular data and more fossils than were included here for testing the Bayes-DIVA method. We do not attempt to describe the biogeographic history of lineages of Aesculus with the data presented here.
Rather, this portion of the discussion focuses on the utility of this approach to Bayes-DIVA with respect to ancestral distributions at certain nodes of interest.

Despite low to moderate support for placement of five of six lineages in phylogenetic analysis of extant taxa (Analysis 1), we were able to obtain high to moderate statistical support for the biogeographic origins of these lineages (Fig. 5: A, nodes 1–3, 5–6) using the new approach described here (Fig. 7, Table 3). For example, the placement of sect. Parryana was supported by pp = 70% in phylogenetic analysis (Fig. 5: A) and we obtained higher support (P = 0.90) for its ancestral range in eNA-wNA (Table 2, Fig. 7: E). The support for the ancestral range of eNA-wNA for the section was much lower (P = 0.70) when estimated using Equation 2 (Table 3). The increased probability support for P(eNA-wNA) for sect. Parryana using the revised Bayes-DIVA occurred because some alternate placements of the section, for example, sect. Parryana + sect. Macrothyrsus and sect. Parryana + (sect. Macrothyrsus + A. californica), yielded non-zero F(eNA-wNA) at the node sect. Parryana + x.

For the node Y including sect. Pavia and x, the ancestral range eNA-wNA is supported weakly to moderately (P = 0.663) (Table 2, Fig. 7: D). However, all four possible ancestral ranges of sect. Pavia contain eNA, resulting in P(eNA ∈ R) = 1.0 and providing high confidence for inference that the ancestral range of sect. Pavia included eNA. Similarly, this type of inference can be applied to sect. Macrothyrsus, represented by a single extant species known from southeastern USA, which has an unresolved sister (i.e., part of a polytomy) in the Bayesian consensus topology (Fig. 5: A, node 4). Although no single ancestral range with P ≥ 0.5 emerged for sect. Macrothyrsus, the two ancestral ranges with the highest probabilities, eNA (P = 0.395) and eNA-wNA (P = 0.245) (Fig. 7: F), can be combined for a total probability of P = 0.640 for ranges that include eNA. Further exploration of the biogeographic history of this group might begin with eNA as a working hypothesis. This finding highlights the utility of the Bayes-DIVA analysis in cases of polytomies. The ancestral ranges of the lineages of interest inferred in this study are largely congruent with those inferred in Harris et al. (2009), but here we show statistical support deriving from analysis that takes into account topological and optimization uncertainties.

Fig. 9. Comparison of Rit for an identical node in two different Bayesian trees from analysis including fossils (Analysis 3). A, B, Two Bayesian trees from sample of 100 from Analysis 3. Section Aesculus highlighted dark gray; sect. Macrothyrsus (Aesculus parviflora) + Aesculus californica are highlighted in light gray. Dots indicate the parent node of sect. Aesculus. The alternative range frequencies at this node are presented in C–E. C, Relative frequencies of nine alternative optimal ancestral areas determined using 1/N. Arrow indicates area BC (EA – eNA), an example referred to in text. D, Relative frequencies of the alternative ancestral areas determined based on actual occurrences by Rit for tree A. Arrow indicates area BC (EA – eNA), example referred to in text. E, Relative frequencies determined using Rit for tree B. Arrow indicates area BC (EA – eNA), example referred to in text. A, Europe; B, East Asia (EA); C, eastern North America (eNA); D, western North America.

3.3 Adding fossil wildcards

The addition of a European fossil appears to have had a more significant effect than the addition of an East Asian fossil on ranges estimated for the parent nodes of interest (Table 4, Fig. 8).
Bayes-DIVA results inferring P(EU ∈ R) changed significantly for all six parent nodes of interest (Table 4), whereas the P(EA ∈ R) increased significantly for only two of the six nodes (Table 4). This phenomenon can be explained by the fact that only one extant species of Aesculus occurs in Europe, Aesculus hippocastanum L., forming sect. Aesculus with Aesculus turbinata Blume (EA) with a pp = 100% (Fig. 5), but there are several extant species in two major clades occurring in EA. When EU is specified as the range for only one, highly stable terminal taxon, the impact of EU on optimal ancestral ranges for the major lineages is expected to be small compared to EA. Adding a wildcard fossil from Europe to the phylogeny thus heavily influenced the outcomes, that is, increased the probability of EU in the ancestral ranges of the parent nodes. This finding suggests that including fossils with phylogenetic uncertainty from species-poor areas will have a dramatic impact on results of biogeographic analysis. We therefore recommend that special care be taken to reduce a fossil’s wildcard behavior (see Kearney, 2002; Kearney & Clark, 2003), especially when introducing a fossil from a geographic area unrepresented or poorly represented by extant species. Lineages most affected by the inclusion of wildcard fossil taxa appear to be those that have a very low probability of a range including the distribution of the fossil (i.e., P(Rfossil ∈ R)) when fossils are not included in the analysis.

3.4 Determining probable post-Y ranges

How an ancestral range is subdivided and inherited by daughter lineages immediately following speciation (i.e., at the base of the internode of a branch) can provide additional information about the total historical biogeographic pathway of a lineage of interest. Recently, range evolution along internodes on a phylogenetic tree has been addressed by, and can be calculated using, the model-based likelihood method of Ree et al. (2005) and Ree and Smith (2008).
Here we show that the marginal probabilities obtained using Bayes-DIVA can also be used to make inferences about range inheritance at the base of the internode and evolution along the branches. Our analyses on range divisions at the parental nodes revealed potential extinction and dispersal events along branches of two Aesculus lineages (comparing results of Fig. 7 and Table 5). Future studies could compare the range evolution data from Bayes-DIVA and the likelihood method implemented in Lagrange.

4 Conclusions

As other authors have previously argued, it is best to include all available and relevant information when using phylogeny to reconstruct biogeography (Tiffney & Manchester, 2001; Huelsenbeck & Immenov, 2002; Ree et al., 2005; Nylander et al., 2008). Historical biogeography is a synthetic discipline that produces the most reliable results when analyses include data from divergence time, evidence from paleobotany, geological and ecological data, as well as highly resolved and robust phylogenies (e.g., Tiffney & Manchester, 2001; Emerson & Hewitt, 2005; Ree et al., 2005; Carstens & Richards, 2007; Nylander et al., 2008). Biogeographic analysis using DIVA, which requires little prior information, is perhaps not the best method of biogeographic reconstruction when information in addition to phylogenetic pattern and distributions of extant taxa is available. However, given that DIVA is fast, user friendly, and produces results similar to those from the model-based methods that implement prior information into the optimization, our revised Bayes-DIVA approach provides a solution for authors who favor DIVA but face the problem of polytomies. In biogeographic studies using DIVA, the prior information on divergence time and area connections can be used to distinguish among multiple optimal solutions.
DIVA remains advantageous when working with groups for which little or unreliable prior information is available, for its ease of use and freedom from potential error associated with model selection and model parameter determination required for the model-based methods (Ree et al., 2005; Nylander et al., 2008). Bayes-DIVA offers an advantage over using DIVA in the traditional way as well as over using many methods that require only a single input tree. This is because the Bayes-DIVA analysis provides statistical support for inferred ranges, allows for inference at poorly supported parent nodes of lineages of interest, and allows for other types of analyses of support for biogeographic reconstruction including the two applications we have shown here. As suggested by previous authors (Lemmon & Lemmon, 2008; Nylander et al., 2008; Sanmartín et al., 2008), this approach is not limited to use with DIVA and is applicable to other types of biogeographic analyses.

Acknowledgements

The authors are highly indebted to François LUTZONI (Duke University, Durham, NC, USA) for his assistance in developing this approach to use of DIVA software. We also acknowledge Beau DABBS (University of Chicago, Chicago, IL, USA) for his assistance with mathematics and statistics, David THOMAS (formerly North Carolina State University) for helpful discussion, Morris MADURO (University of California, Riverside, CA, USA) for correspondence regarding the random sequence generation script, Holly FORBES (University of California, Berkeley, CA, USA) for collection of fresh leaf materials, and the Gray Herbarium at Harvard University for the loan of herbarium specimens. This manuscript is a part of the thesis of AJ Harris submitted to the NCSU graduate school in 2007. This study has benefited from a National Science Foundation (USA) grant made to Xiang (DEB-0444125).
For travel support to workshops and symposia we thank the Deep Time Research Coordination Network, supported by a NSF grant funded to D.E. Soltis (DEB-0090283), and the Phytogeography of the Northern Hemisphere Working Group and the Clock Workgroup supported by NSF through NESCent. References Axelrod DI. 1966. The Eocene Copper Basin flora of northeastern Nevada. University of California Publications in Geological Science 59: 1–83. Bremer K. 1992. Ancestral areas: a cladistic reinterpretation of the center of origin concept. Systematic Biology 41: 436– 445. Budantsev LJ. 1983. History of the Arctic flora of the early Cenophytic epoch. Nauka, Leningrad. (in Russian). Burbrink FT, Lawson R. 2007. How and when did Old World rat snakes disperse into the New World? Molecular Phylogenetics and Evolution 43: 173–189. Calviño CI, Martı́nez SG, Downie SR. 2008. Morphology and biogeography of Apiaceae subfamily Saniculoideae as inferred by phylogenetic analysis of molecular data. American Journal of Botany 95: 196–214. Carstens BC, Richards CL. 2007. Integrating coalescent and ecological niche modeling in comparative phylogeography. Evolution 61: 1439–1454. Condit C. 1944. The Remington Hill flora. Washington: Carnegie Institute of Washington Publication 553: 21–55. Crane PR, Herendeen P, Friis EM. 2004. Fossils and plant phylogeny. American Journal of Botany 91: 1683–1699. de Lumley H. 1988. La stratigraphie du remplissage de la Grotte du Vallonnet. L’Anthropologie 92: 407–428. Dilhoff RM, Leopold EB, Manchester SR. 2005. The McAbee flora of British Columbia and its relation to the early-middle C 2009 Institute of Botany, Chinese Academy of Sciences 365 Eocene Okanagan Highlands flora of the Pacific Northwest. Canadian Journal of Earth Science 42: 151–166. Donoghue MJ, Smith SA. 2004. Patterns in the assembly of the temperate forest around the Northern Hemisphere. Philosophical Transactions of the Royal Society of London: Biology 359: 1633–1644. Emerson BC, Hewitt GH. 2005. 
Phylogeography. Current Biology 15: 367–371.
Felsenstein J. 1989. PHYLIP (Phylogeny Inference Package) Version 3.2. Cladistics 5: 164–166.
Felsenstein J. 2008. PHYLIP (Phylogeny Inference Package) Version 3.68. Distributed by the author. Seattle: Department of Genome Sciences, University of Washington.
Forest F, Drouin JN, Charest R, Brouillet L, Bruneau A. 2001. A morphological phylogenetic analysis of Aesculus L. and Billia Peyr. (Sapindaceae). Canadian Journal of Botany 79: 154–169.
Fiz O, Vargas P, Alarcón M, Aedo C, Garcia JL, Aldasoro JJ. 2008. Phylogeny and historical biogeography of Geraniaceae in relation to climate changes and pollination ecology. Systematic Botany 33: 326–342.
Golovneva L. 2000. Early Paleogene floras of Spitzbergen and North Atlantic floristic exchange. Acta Universitatis Carolinae Geologica 44: 39–50.
Hardin JW. 1957a. A revision of the American Hippocastanaceae. Brittonia 9: 145–171.
Hardin JW. 1957b. A revision of the American Hippocastanaceae, II. Brittonia 9: 173–195.
Hardin JW. 1960. Studies in the Hippocastanaceae, V. Species of the Old World. Brittonia 12: 26–38.
Harrington MG, Edwards KJ, Johnson SA, Chase MW, Gadek PA. 2005. Phylogenetic inference in Sapindaceae sensu lato using plastid matK and rbcL DNA sequences. Systematic Botany 30: 366–382.
Harris AJ, Thomas DT, Xiang QY. 2009. Phylogeny, origin, and biogeographic history of Aesculus L. (Sapindales): an update from combined analysis of DNA sequences, morphology, and fossils. Taxon 58: 108–126.
Hilton J, Bateman RM. 2006. Pteridosperms are the backbone of seed-plant phylogeny. Journal of the Torrey Botanical Society 133: 119–168.
Hines HM. 2008. Historical biogeography, divergence times, and diversification patterns of bumble bees (Hymenoptera: Apidae: Bombus). Systematic Biology 57: 58–75.
Hu HH, Chaney RW. 1940. A Miocene flora from Shantung Province, China. Washington: Carnegie Institute of Washington Publication 507: 1–147.
Huelsenbeck JP, Imennov NS. 2002.
Geographic origin of human mitochondrial DNA: accommodating phylogenetic uncertainty and model comparison. Systematic Biology 51: 155–165.
Huelsenbeck JP, Ronquist F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
Huelsenbeck JP, Ronquist F. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572–1574.
Huttunen S, Hedenäs L, Ignatov MS, Devos N, Vanderpoorten A. 2008. Origin and evolution of the Northern Hemisphere disjunction in the moss genus Homalothecium (Brachytheciaceae). American Journal of Botany 95: 720–730.
Jeandroz S, Murat C, Wang Y, Bonfante P, Tacon FL. 2008. Molecular phylogeny and historical biogeography of the genus Tuber, the “true truffles”. Journal of Biogeography 35: 815–829.
Jeong EK, Kim K, Kim JH, Suzuki M. 2004. Fossil woods from the Janggi Group (Early Miocene) in Pohang Basin, Korea. Journal of Plant Research 117: 183–189.
Johns MV Jr. 1957. Non-parametric empirical Bayes procedures. The Annals of Mathematical Statistics 28: 649–669.
Judd WS, Saunders RW, Donoghue MJ. 1994. Angiosperm family pairs: preliminary analyses. Harvard Papers in Botany 5: 1–51.
Kauff F, Miadlikowska J, Lutzoni F. 2003. ARC: a program for ambiguous region coding. Available online at http://www.lutzonilab.net/ and select “Downloadable Programs” [Accessed 10 October 2006].
Kauff F. 2005. RandomTree: random tree sampling. Available online at http://www.lutzonilab.net/ and select “Downloadable Programs” [Accessed 10 October 2006].
Kearney M. 2002. Fragmentary taxa, missing data, and ambiguity: mistaken assumptions and conclusions. Systematic Biology 51: 369–381.
Kearney M, Clark JM. 2003. Problems due to missing data in phylogenetic analyses including fossils: a critical review. Journal of Vertebrate Paleontology 23: 263–274.
Lemmon AR, Lemmon EM. 2008. A likelihood framework for estimating phylogeographic history on a continuous landscape.
Systematic Biology 57: 544–561.
Lieberman BS. 2003. Paleobiogeography: the relevance of fossils to biogeography. Annual Review of Ecology and Systematics 34: 51–69.
Lim K. 2008. Historical biogeography of New World emballonurid bats (tribe Diclidurini): taxon pulse diversification. Journal of Biogeography 35: 1385–1401.
Lutzoni F, Pagel M, Reeb V. 2001. Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411: 937–940.
Maddison DR, Maddison WP. 2001. MacClade 4: analysis of phylogeny and character evolution. Version 4.02. Sunderland: Sinauer Associates.
Mai DH, Walther H. 1988. Die pliozaenen Floren von Thueringen, Deutsche Demokratische Republik. Quartaerpalaeontologie 7: 55–297.
Manchester SR. 1999. Biogeographical relationships of North American Tertiary floras. Annals of the Missouri Botanical Garden 86: 472–522.
Manchester SR. 2001. Leaves and fruits of Aesculus (Sapindales) from the Paleocene of North America. International Journal of Plant Sciences 162: 985–996.
Manos PS, Soltis PS, Soltis DE, Manchester SR, Oh SH, Bell CD, Dilcher DL, Stone DE. 2007. Phylogeny of extant and fossil Juglandaceae inferred from the integration of molecular and morphological data sets. Systematic Biology 56: 412–430.
Mansion G, Rosenbaum G, Schoenenberger N, Bacchetta G, Rosselló JA, Conti E. 2008. Phylogenetic analysis informed by geological history supports multiple, sequential invasions of the Mediterranean Basin by the angiosperm family Araceae. Systematic Biology 57: 269–285.
Mast AR, Willis CL, Jones EH, Downs KM, Weston PH. 2008. A smaller Macadamia from a more vagile tribe: inference of phylogenetic relationships, divergence times, and diaspore evolution in Macadamia and relatives (tribe Macadamieae; Proteaceae). American Journal of Botany 95: 843–870.
Nixon KC, Wheeler QD. 1992. Extinction and the origin of species. In: Novacek MJ, Wheeler QD eds. Extinction and phylogeny. New York: Columbia University Press. 119–143.
Nylander JAA, Olsson U, Alström P, Sanmartín I. 2008. Accounting for phylogenetic uncertainty in biogeography: a Bayesian approach to Dispersal–Vicariance Analysis of the thrushes (Aves: Turdus). Systematic Biology 57: 257–268.
Page RDM. 1993a. COMPONENT: tree comparison software for Microsoft Windows, Version 2.0, User’s Guide. London: Natural History Museum.
Page RDM. 1993b. Genes, organisms, and areas: the problem of multiple lineages. Systematic Biology 42: 77–84.
Page RDM. 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Computer Applications in Bioscience 12: 357–358.
Page RDM. 2001. TreeView for Windows. Version 1.6.6. Available online at http://taxonomy.zoology.gla.ac.uk/ and select “Software”.
Pagel M, Meade A, Barker D. 2004. Bayesian estimation of ancestral character states on phylogenies. Systematic Biology 53: 673–684.
Posada D, Crandall KA. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817–818.
Prakash U, Barghoorn ES. 1961. Miocene fossil woods from the Columbia Basalts of Central Washington, II. Journal of the Arnold Arboretum 42: 165–203.
Puri GS. 1945. Some fossil leaflets of Aesculus indica Colebr. from the Karewa Beds at Laredura and Ningal Nullah, Pir Panjal, Kashmir. Journal of the Indian Botanical Society 24.
Rambaut A, Drummond J. 2003. Tracer. Version 1.3. Available online at http://evolve.zoo.ox.ac.uk/Evolve/Welcome.html.
Ree RH, Smith SA. 2008. Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Systematic Biology 57: 4–14.
Ree RH, Moore BR, Webb CO, Donoghue MJ. 2005. A likelihood framework for inferring the evolution of geographic range of phylogenetic trees. Evolution 59: 2299–2311.
Ronquist F. 1996. Dispersal Vicariance Analysis (DIVA) 1.1. User’s manual. Available online at http://www.ebc.uu.se/syszoo/research/diva/diva.html.
Ronquist F. 1997.
Dispersal–vicariance analysis: a new approach to the quantification of historical biogeography. Systematic Biology 46: 195–203.
Ronquist F. 2001. Dispersal Vicariance Analysis (DIVA) 1.2. Available online at http://www.ebc.uu.se/syszoo/research/diva/diva.html.
Rothwell GW. 1999. Fossils and ferns in the resolution of land plant phylogeny. Botanical Review 65: 189–218.
Rothwell GW, Nixon KC. 2006. How does the inclusion of fossil data change our conclusions about the phylogenetic history of euphyllophytes? International Journal of Plant Sciences 167: 737–749.
Sanmartín I, Ronquist F. 2004. Southern Hemisphere biogeography inferred by event-based models: plant versus animal patterns. Systematic Biology 53: 216–243.
Sanmartín I, Enghoff H, Ronquist F. 2001. Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biological Journal of the Linnean Society 73: 345–390.
Sanmartín I, Van Der Mark P, Ronquist F. 2008. Inferring dispersal: a Bayesian approach to phylogeny-based island biogeography, with special reference to the Canary Islands. Journal of Biogeography 35: 428–449.
Schloemer-Jäger A. 1958. Alttertiare pflanzen aus flozen der bragger-halbinsel Spitzbergens. Paleontographica Abt B 39–103.
Soltis DE, Morris AB, MacLachlan JS, Manos PS, Soltis PS. 2006. Comparative phylogeography of unglaciated eastern North America. Molecular Ecology 15: 4261–4293.
Swofford DL. 2002. PAUP* – Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sunderland: Sinauer Associates.
Szafer W. 1947. The Pliocene flora of Kroscienko in Poland. Rozpr Wydz mat-przyr Akad Urn. 72: 91–162. (in Polish and English).
Szafer W. 1954. Pliocene flora from the vicinity of Czorsztyn (West Carpathians) and its relationship to the Pleistocene. Institute of Geology of Warzawa 111: 1–238. (in Polish and English).
Tanai T. 1952.
The fossil vegetation from the coalified basin of Nishitagawa, Prefecture of Yamagata, Japan. Japanese Journal of Geology and Geography 22: 119–135. (in French).
Tiffney BH, Manchester SR. 2001. Integration of paleobotanical and neobotanical data in the assessment of phylogeographic history of Holarctic angiosperm clades. International Journal of Plant Sciences 162: S19–S27.
Velazco PM, Patterson BD. 2008. Phylogenetics and biogeography of the broad-nosed bats, genus Platyrrhinus (Chiroptera: Phyllostomidae). Molecular Phylogenetics and Evolution 49: 479–459.
Wehr WC. 1998. Middle Eocene insects and plants of the Okanagan Highlands. In: Martin JE ed. Contributions to the paleontology and geology of the West Coast: in honor of V. Standish Mallory. Seattle: Thomas Burke Memorial Washington State Museum Research. 99–109.
Weinelt M. 1999. Online Map Creation (OMC) version 4.1. Available online at http://www.aquarius.ifm-geomar.de [Accessed 1 Jan 2008].
Wen J. 1999. Evolution of the eastern Asian and eastern North American disjunct distributions in flowering plants. Annual Review of Ecology and Systematics 30: 421–455.
Wiens JJ. 2003. Missing data, incomplete taxa, and phylogenetic accuracy. Systematic Biology 52: 528–538.
Wiens JJ. 2006. Missing data and the design of phylogenetic analyses. Journal of Biomedical Informatics 39: 34–42.
Wiggins IL. 1932. The lower California buckeye, Aesculus parryi A. Gray. American Journal of Botany 19: 406–410.
Xiang QY, Thomas DT. 2008. Tracking character evolution and biogeographic history through time in Cornaceae—Does choice of methods matter? Journal of Systematics and Evolution 46: 349–374.
Xiang QY, Soltis DE, Soltis PS. 1998a. The eastern Asian and eastern and western North American disjunction: congruent phylogenetic patterns in seven diverse genera. Molecular Phylogenetics and Evolution 10: 178–190.
Xiang QY, Crawford DJ, Wolfe AD, Tang YC. 1998b. Origin and biogeography of Aesculus L.
(Hippocastanaceae): a molecular phylogenetic perspective. Evolution 52: 988–997.
Xiang QY, Zhang WH, Ricklefs RE, Qian H, Chen ZD, Wen J, Li JH. 2004. Regional differences in rates of plant speciation and molecular evolution: a comparison between eastern Asia and eastern North America. Evolution 58: 2175–2184.
Xiang QY, Manchester SR, Thomas DT, Zhang WH, Fan C. 2005. Phylogeny, biogeography, and molecular dating of cornelian cherries (Cornus, Cornaceae): tracking Tertiary plant migration. Evolution 58: 1685–1700.
Xiang QY, Thomas DT, Zhang WH, Manchester SR, Murrell Z. 2006. Species level phylogeny of the genus Cornus (Cornaceae) based on molecular and morphological evidence – implications for taxonomy and Tertiary intercontinental migration. Taxon 55: 9–30.
Xiang QY, Smith SA, Harris AJ, Feng C. 2009. Use of fossils in biogeographic analysis – challenges and possible solutions. Abstract. Invited presentation: 4th International conference of the International Biogeography Society, Merida, Mexico. 69.

Appendix I: DNA sequences of Aesculus and outgroups.

Notes: For each taxon, information reads as taxon, accession number, and gene sequence data available. GenBank accessions are given following gene names. Internal transcribed spacer (ITS) accessions are given in the order ITS1, ITS2, and 5.8s if available. Superscripts correspond to numbered accessions in Fig. 3a of Harris et al. (2009).

Ingroup.—Section Aesculus—. A. hippocastanum L., Kew 00-69.11289-263, rps16 (EU687697) matK (EU687725) ITS (EU687600, EU687637); A. turbinata Blume, D.J. Crawford 411¹, rps16 (EU687695) matK (EU687723) ITS (EU687598, EU687635); A. turbinata Blume, JC Raulston Arboretum 950016², rps16 (EU687696) matK (EU687724) ITS (EU687599, EU687636, EU687666);
Section Calothyrsus (traditional)—. A. assamica Griff., Mongolia Expedition 10039, rps16 (EU687676) ITS (EU687578, EU687615, EU687651); A. californica (Spach.) Nutt., D.J.
Crawford 406¹, rps16 (EU687689) matK (EU687715) ITS (EU687590, EU687627, EU687659); A. californica (Spach.) Nutt., T.M. Hardig 2795², rps16 (EU687690) matK (EU687716) ITS (EU687591, EU687628, EU687660); A. californica (Spach.) Nutt., J.C. Raulston arboretum 950413³, rps16 (EU687691) matK (EU687717) ITS (EU687592, EU687629, EU687661); A. californica (Spach.) Nutt., UC Berkeley 93.1203⁴, rps16 (EU687692) matK (EU687718) ITS (EU687593, EU687630, EU687662); A. californica (Spach.) Nutt., UC Berkeley 93.1116⁵, rps16 (EU687693) matK (EU687719) ITS (EU687594, EU687631, EU687663); A. chinensis Bunge, Q.Y. Xiang 305¹, rps16 (EU687678) ITS (EU687580, EU687617, EU687652); A. chinensis Bunge, Q.Y. Xiang 04-C88², rps16 (EU687677) matK (EU687706) ITS (EU687579, EU687616); A. indica (Camb.) Hook, Q.Y. Xiang 301¹, rps16 (EU687686) matK (EU687711) ITS (EU687587, EU687624); A. indica (Camb.) Hook, J.C. Raulston Arboretum 001405², rps16 (EU687687) matK (EU687712) ITS (EU687588, EU687625, EU687658); A. polyneura Hu & Fang, Q.Y. Xiang 02-255, rps16 (EU687681) matK (EU687707) ITS (EU687582, EU687619, EU687654); A. tsiangii Hu & Fang, Q.Y. Xiang 04-C37, rps16 (EU687685) matK (EU687710) ITS (EU687586, EU687623, EU687657); A. wilsonii Rehder, Q.Y. Xiang 02-105¹, rps16 (EU687684) ITS (EU687585, EU687622, EU687656); A. wilsonii Rehder., Q.Y. Xiang 04-C9², rps16 (EU687683) matK (EU687709) ITS (EU687584, EU687621, EU687655); A. wangii Hu, Q.Y. Xiang 303, rps16 (EU687682) matK (EU687708) ITS (EU687583, EU687620);
Section Macrothyrsus—. A. parviflora Walter, J.C. Raulston arboretum sene non., rps16 (EU687694) matK (EU687721) ITS (EU687596, EU687633, EU687664);
Section Pavia—. A. glabra Willd., D.J. Crawford 413, rps16 (EU687702) matK (EU687734) ITS (EU687607, EU687644, EU687671); A. flava Sol., C.W. DePamphilis F-MI-4¹, matK (EU687737); A. flava Sol., Q.Y.
Xiang 98-150², rps16 (EU687703) matK (EU687738) ITS (EU687610, EU687647, EU687672); A. pavia L., Q.Y. Xiang 01-54¹, rps16 (EU687700) matK (EU687732) ITS (EU687605, EU687642, EU687669); A. pavia L., Q.Y. Xiang 98-135², rps16 (EU687701) matK (EU687733) ITS (EU687606, EU687643, EU687670); A. sylvatica Bart., Q.Y. Xiang 01-251¹, rps16 (EU687698) matK (EU687726) ITS (EU687601, EU687638, EU687667); A. sylvatica Bart., Q.Y. Xiang 98-110², rps16 (EU687699) matK (EU687728) ITS (EU687602, EU687639, EU687668);
Section Parryana—. A. parryi Gray, Epling 1936 sene non, rps16 (EU687688) matK (EU687714).
Outgroup.—Handeliodendron bodinieri (Levl.) Rehd., Q.Y. Xiang 302, rps16 (EU687674) ITS (EU687575, EU687612, EU687649); Billia Peyr sp., Q.Y. Xiang 02-12, rps16 (EU687675) matK (EU687705) ITS (EU687577, EU687614, EU687650).

© 2009 Institute of Botany, Chinese Academy of Sciences