Syst. Biol. 47(4):617–624, 1998

Using a Nonrecursive Formula to Determine Cladogram Probabilities

J. STONE (1, 3) AND J. REPKA (2)

1 Department of Zoology, University of Toronto, Toronto, Ontario M5S 3G5, Canada; E-mail: [email protected]
2 Department of Mathematics, University of Toronto, Toronto, Ontario M5S 3G3, Canada; E-mail: [email protected]

Abstract. Three properties of bifurcating branching diagrams that are used for representing a specific number of taxa are (1) the number of possible arrangements, (2) the number of possible topologies, and (3) the probabilities of formation according to particular models of cladogenesis. Of these, the probabilities have received the least attention in the literature. Indeed, many biologists would be astonished by the observation that the probability of a commonly cited cladogram containing 35 phyla of the animal kingdom is < 0.0072% of the value of the average probability taken over all possible cladograms! We reviewed works on cladogram arrangements and topologies and developed a computer-generated table of enumerations that extends and corrects such tables in the literature. We also developed a nonrecursive formula for the determination of cladogram probabilities. This formula facilitates calculation and thereby should promote use of cladogram probabilities, which might provide more accurate null hypotheses for tests of cladogenic events than do considerations of cladogram arrangements or topologies. [Bifurcating branching diagram; cladogram arrangement; cladogram topology; likelihood; null hypothesis; phylogenetic tree.]

An important question confronts every systematist who performs a cladistic analysis: How many distinct cladograms (bifurcating branching diagrams) representing n taxa are possible?
Many researchers (Cayley, 1856, 1889; Schröder, 1870; Edwards and Cavalli-Sforza, 1964; Cavalli-Sforza and Edwards, 1967; Moon, 1970; Harding, 1971; Dobson, 1974; Harper, 1976; Phipps, 1976a, 1976b; Felsenstein, 1978) have considered various mathematical versions of this question, have provided methods for enumerating the number of distinct bifurcating or multifurcating branching diagrams consisting of n labeled terminal nodes and labeled or unlabeled nonterminal nodes, and have tabulated values. An important distinction considered by only a few researchers (Harding, 1971; Phipps, 1976a; Simberloff et al., 1981; Savage, 1983; Slowinski and Guyer, 1989) is the distinction between cladogram arrangement and cladogram topology. For example, three taxa can be arranged into three distinct cladogram arrangements, but there is only one cladogram topology. Enumerations of cladogram arrangements and topologies are presented in Table 1, which extends and corrects such tables in the literature.

A third cladogram property, which has received less attention in published works than has arrangement or topology, concerns probability. Cladogram probability refers to the likelihood of obtaining a particular bifurcating branching diagram according to a particular model of cladogenesis. In this paper, formulae for the enumeration of cladogram arrangements and topologies are reviewed and a nonrecursive formula for calculating cladogram probabilities is presented. The simple, nonrecursive formula is more amenable to computer programming and so will facilitate and increase the efficiency of calculating cladogram probabilities.

3 Present address (and address for correspondence): Department of Ecology and Evolution, State University of New York, Stony Brook, NY 11794-5245, USA.
CLADOGRAM ARRANGEMENT

The number of distinct rooted bifurcating branching diagrams with n labeled terminal nodes can be calculated by using the formula

\[ \prod_{i=2}^{n} (2i - 3) \tag{1} \]

(Edwards and Cavalli-Sforza, 1964; Cavalli-Sforza and Edwards, 1967; Moon, 1970; Harding, 1971; Phipps, 1976a; Felsenstein, 1978). To simplify calculations, the formula usually is presented (Harding, 1971; Phipps, 1976a; Felsenstein, 1978) in the form

\[ \frac{(2n - 3)!}{2^{\,n-2}\,(n - 2)!}. \tag{2} \]

In more recent publications concerning bifurcating branching diagram enumeration (Swofford et al., 1996:410; Trueman, 1993), various authors have provided the formula for enumerating the number of distinct unrooted bifurcating branching diagrams with n labeled terminal nodes:

\[ \prod_{i=3}^{n} (2i - 5). \tag{3} \]

TABLE 1. Numbers of cladogram arrangements and topologies representing n taxa.

n   Arrangements                               Topologies
1   1                                          1
2   1                                          1
3   3                                          1
4   15                                         2
5   105                                        3
6   945                                        6
7   10,395                                     11
8   135,135                                    23
9   2,027,025                                  46
10  34,459,425                                 98
11  654,729,075                                207
12  13,749,310,575                             451
13  316,234,143,225                            983
14  7,905,853,580,625                          2,179
15  213,458,046,676,875                        4,850
16  6,190,283,353,629,375                      10,905
17  191,898,783,962,510,625                    24,631
18  6,332,659,870,762,850,625                  56,011
19  221,643,095,476,699,771,875                127,912
20  8,200,794,532,637,891,559,375              293,547
21  319,830,986,772,877,770,815,625            676,157
22  13,113,070,457,687,988,603,440,625         1,563,372
23  563,862,029,680,583,509,947,946,875        3,626,149
24  25,373,791,335,626,257,947,657,609,375     8,436,379
25  1,192,568,192,774,434,123,539,907,640,625  19,680,277

CLADOGRAM TOPOLOGY

The number of distinct bifurcating branching diagram topologies with n terminal nodes can be enumerated by using a two-term recursive formula (Phipps, 1976a):

\[ T(n) = \begin{cases} \sum_{i} T(i)\,T(n-i), & \text{if } n \text{ is odd, or} \\[4pt] \sum_{i} T(i)\,T(n-i) + \dfrac{T(n/2)\,[T(n/2) + 1]}{2}, & \text{if } n \text{ is even,} \end{cases} \tag{4} \]

in which T(j) is the number of topologically distinct bifurcating branching diagrams for j terminal nodes and the summation is performed over all integers i satisfying $1 \le i < n/2$. A simpler but equivalent recursive enumeration was provided by Wedderburn (1922):

\[ W(n) = \sum_{i=1}^{n-1} \frac{W(i)\,W(n-i)}{2} + E(n), \tag{5} \]

where E(n) = 0 if n is odd, or E(n) = W(n/2)/2 if n is even, and W(1) = 1.

CLADOGRAM PROBABILITY

A natural extension of the preceding works is the calculation of probabilities of constructing the various distinct bifurcating branching diagrams under the assumption that there is an equal likelihood of bifurcating (or terminating) at every node during the (Markovian) construction of the diagrams. Harding (1971: Formula 4.1) provided a recursive formula for the calculation of such probabilities. Specifically, for a bifurcating branching diagram with n terminal nodes, if the two clades that originate from the most basal node yield $n'$ and $n''$ terminal nodes and the probabilities of constructing these clades are $p'$ and $p''$, respectively, then the probability of obtaining that particular diagram among all distinct bifurcating branching diagrams with $n$ ($= n' + n''$) terminal nodes is

\[ \begin{cases} 2\,(n-1)^{-1}\,p'\,p'', & \text{if the topologies of the two clades are distinct, or} \\[4pt] (n-1)^{-1}\,p'\,p'' = (n-1)^{-1}\,(p')^{2}, & \text{if the topologies of the two clades are identical.} \end{cases} \tag{6} \]

This formula can be used recursively to calculate the probability of constructing any cladogram, beginning at the most basal node and continuing toward the terminal nodes. It is natural to distinguish cases in which the topologies of two clades originating from a nonterminal node are identical or distinct.
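The enumeration formulas 1 to 5 are straightforward to program. The following is a minimal Python sketch, not the program used to generate Table 1; the function names are hypothetical, and the base case T(1) = 1 for formula 4 is an assumption taken from the first row of Table 1.

```python
from functools import lru_cache
from math import factorial


def rooted_arrangements(n):
    """Formula 1: product of (2i - 3) for i = 2, ..., n."""
    count = 1
    for i in range(2, n + 1):
        count *= 2 * i - 3
    return count


def rooted_arrangements_closed(n):
    """Formula 2: (2n - 3)! / [2^(n - 2) (n - 2)!], for n >= 2."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def unrooted_arrangements(n):
    """Formula 3: product of (2i - 5) for i = 3, ..., n."""
    count = 1
    for i in range(3, n + 1):
        count *= 2 * i - 5
    return count


@lru_cache(maxsize=None)
def topologies(n):
    """Formula 4 (Phipps, 1976a); T(1) = 1 is assumed as the base case."""
    if n == 1:
        return 1
    # Sum over all integers i with 1 <= i < n/2.
    total = sum(topologies(i) * topologies(n - i) for i in range(1, (n + 1) // 2))
    if n % 2 == 0:
        half = topologies(n // 2)
        total += half * (half + 1) // 2
    return total


@lru_cache(maxsize=None)
def wedderburn(n):
    """Formula 5 (Wedderburn, 1922), computed as 2 W(n) and then halved."""
    if n == 1:
        return 1
    doubled = sum(wedderburn(i) * wedderburn(n - i) for i in range(1, n))
    if n % 2 == 0:
        doubled += wedderburn(n // 2)  # 2 E(n) = W(n/2) when n is even
    return doubled // 2


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, rooted_arrangements(n), topologies(n))
```

Working with twice the Wedderburn sum keeps the arithmetic in integers, which avoids rounding for the larger values of n in Table 1.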
For a nonterminal node u, the function $\delta(u)$ (modeled after the Kronecker delta function) is defined to be 0, if the topologies of the two clades originating from u are distinct, or 1, if they are identical. Then formula 6 can be written as

\[ [2 - \delta(u_0)]\,(n-1)^{-1}\,p'\,p'', \tag{6$'$} \]

in which $u_0$ represents the most basal node.

Formula 6 can be used to derive a nonrecursive formula that is easy to use. The probability of constructing a cladogram with a particular topology (among all those with n terminal nodes) is

\[ \frac{1}{(n-1)!} \prod_{u} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)], \tag{7} \]

where the product is over all nonterminal nodes u, C denotes the combinatorial function (i.e., $C(n, k) = n!/[k!\,(n-k)!]$), and $n_1(u)$ and $n_2(u)$ are the numbers of terminal nodes contained in the two clades originating from u (a proof is provided in the Appendix).

REMARKS

The numbers exhibited adjacent to the cladograms in Figure 1 are the numerators of the probabilities obtained with use of formula 7 (i.e., omitting the factor $1/(n-1)!$). Dots in the figure indicate nonterminal nodes from which two identical clades originate (i.e., nonterminal nodes with $\delta(u) = 1$). With a little practice, probabilities are easy to calculate.

If cladogram topologies are arranged in order of increasing probability (n = 2 to 9, Fig. 1), some interesting generalizations can be made: For any particular value of n, (1) the least probable cladogram topology is the one that contains terminal sister group pairs, themselves sister to a completely pectinate branching sequence of (n - 4) terminal nodes; (2) (a corollary of 1) the completely pectinate cladogram never is the least probable; in fact, it is twice as probable as the least-probable topology; and (3) cladograms can be partitioned according to the numbers of terminal nodes contained in each of the two clades originating from the most-basal node (k and n - k, for any integer k satisfying $1 \le k \le n/2$).
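Formula 7 and the recursion of formula 6 can be compared numerically. The sketch below is illustrative only: it assumes a cladogram topology is encoded as nested 2-tuples with the empty tuple as a terminal node, and every name in it (`LEAF`, `canonical`, `numerator`, `formula7`, `harding`, `all_topologies`, `pectinate`) is hypothetical rather than taken from the paper.

```python
from fractions import Fraction
from math import comb, factorial

LEAF = ()  # assumed encoding: a terminal node is the empty tuple


def size(t):
    """Number of terminal nodes in clade t."""
    return 1 if t == LEAF else size(t[0]) + size(t[1])


def canonical(t):
    """Order the two clades at every node so equal topologies compare equal."""
    if t == LEAF:
        return LEAF
    return tuple(sorted((canonical(t[0]), canonical(t[1]))))


def delta(t):
    """delta(u) for the basal node of t: 1 if its two clades are identical."""
    return 1 if canonical(t[0]) == canonical(t[1]) else 0


def numerator(t):
    """Product in formula 7 over all nonterminal nodes of t."""
    if t == LEAF:
        return 1
    n1, n2 = size(t[0]), size(t[1])
    here = comb(n1 + n2 - 2, n1 - 1) * (2 - delta(t))
    return here * numerator(t[0]) * numerator(t[1])


def formula7(t):
    """Nonrecursive probability: numerator divided by (n - 1)!."""
    return Fraction(numerator(t), factorial(size(t) - 1))


def harding(t):
    """Recursive probability from formula 6."""
    if t == LEAF:
        return Fraction(1)
    return Fraction(2 - delta(t), size(t) - 1) * harding(t[0]) * harding(t[1])


def all_topologies(n):
    """Every distinct topology with n terminal nodes, in canonical form."""
    if n == 1:
        return [LEAF]
    found = set()
    for k in range(1, n // 2 + 1):
        for a in all_topologies(k):
            for b in all_topologies(n - k):
                found.add(canonical((a, b)))
    return sorted(found)


def pectinate(n):
    """Completely pectinate (comb-shaped) topology with n terminal nodes."""
    t = LEAF
    for _ in range(n - 1):
        t = (LEAF, t)
    return t


if __name__ == "__main__":
    for t in all_topologies(5):
        print(numerator(t), formula7(t), harding(t))
```

For small n the two routes should give the same fractions, and the numerators for each n should add up to (n - 1)!; the exhaustive enumeration is meant only for small n such as those shown in Figure 1.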
As has been shown (Feller, 1968; Slowinski and Guyer, 1989; Heard, 1992), the total probability of all cladograms in any such partition (clades of k and n - k terminal nodes originating from the most basal node) equals $2/(n-1)$ except that, when n is even, the cladograms with clades containing n/2 and n/2 terminal nodes have total probability $1/(n-1)$.

A cladogram having a low probability represents a bifurcating branching diagram that can be constructed in very few ways. The cladograms having the lowest probabilities generally are highly asymmetric; however, some highly symmetric cladograms also have comparatively low probabilities. For example, whenever $n = 2^k$, for any k, there is a very symmetric diagram in which every nonterminal node joins two identical clades (such nonterminal nodes are indicated by dots in Fig. 1) and every terminal node is exactly k nonterminal nodes away from the root node (Fig. 1: k = 1, 2, 3; i.e., n = 2, 4, 8). Such diagrams have low probabilities. Both highly asymmetric and highly symmetric diagrams display a rigidity that makes it difficult to introduce much variety into the process by which they are constructed.

FIGURE 1. Cladograms with n terminal nodes for n = 2, . . . , 9. For each value of n, cladograms are presented in order of increasing probability. Numbers adjacent to the cladograms are the numerators of the corresponding probabilities if they are expressed as fractions with denominator (n - 1)! (for each n, the numerators, counted with repetitions, sum to (n - 1)!). Dots in the figure indicate nonterminal nodes from which two identical clades originate.

AN EXAMPLE

An intriguing exercise that yields an astonishing result is the application of formula 7 to a particular cladogram that has been inferred to represent the phylogeny of the animal kingdom (Meglitsch and Schram, 1991: Fig. 38.2; and Fig. 2, this paper).
The probability of this particular hypothesis of metazoan evolution equals approximately 1/13,912 times the value of the average probability taken over all possible cladogram topologies representing 35 phyla! The implications of this small value are either that (1) this particular hypothesized scenario of animal evolution was a very improbable event in the history of life on this planet (if cladogenesis truly has been Markovian and the cladogram accurately reconstructs the history of metazoan evolution) or (2) the assumption of Markovian cladogenesis is unrealistic. There is no reason to favor implication (1); however, evidence has been presented to support arguments that rejection of this analysis because of implication (2) might be either too drastic (Savage, 1983) or justified (Guyer and Slowinski, 1991, 1993; Heard, 1992; Mooers, 1995). In either case, this cladogram is presented for illustrative purposes only (its terminal nodes represent phyla, which cannot speciate; i.e., it is incomplete [Mooers, 1995; Mooers and Heard, 1997]) to demonstrate how a cladogram probability can be used to establish a null hypothesis. Nevertheless, cladogram probabilities should be considered cautiously, because implicit in the Markovian model are many assumptions, some of which might be inappropriate in particular instances (Losos and Adler, 1995; Mooers and Heard, 1997). For example, different taxa live in different environments, contain different numbers of species or individuals, and undergo different rates of mutation and rates and modes of speciation; in a Markovian model, however, all such factors are assumed to be of no consequence.

FIGURE 2. Cladogram of the Metazoa (adapted from Meglitsch and Schram, 1991: Fig. 38.2); Placozoa and Mesozoa are excluded. The nonterminal node immediately preceding that from which Sipunculida and Echiura originate originally consisted of a trichotomy comprising the clades Sipunculida + Echiura, Mollusca, and the taxa originating from the nonterminal node at the base of the Annelida. Numbers adjacent to labeled nonterminal nodes are factors corresponding to the product of formula 7. Dots in the figure indicate nonterminal nodes from which two identical clades originate. Each of these contributes a factor of 1 to the product of formula 7, except for the one that is labeled, which contributes a factor of 12. Each of the other unlabeled nonterminal nodes in the figure contributes a factor of 2. The probability of this cladogram is 2(2)2(2)2(6)12(2)11,705,850(42)2(2)2(2)2(2)2(151,164)2(2)2(18)2(2)2(2)2/34! = 1/1,461,601,812,825,000. Using formula 4 or 5, there are 105,061,603,969 cladogram topologies representing 35 taxa, so the average probability is 1/105,061,603,969, which is approximately 13,912/1,461,601,812,825,000. Thus, the probability of this cladogram of the Metazoa is 1/13,912 or 0.00719% of the value of the average probability taken over all possible cladograms. Less conservative, smaller estimates of approximately 0.0064% or 0.00080% result if the clades Mollusca and Sipunculida + Echiura originate from a common nonterminal node or if the clades Sipunculida + Echiura and the taxa originating from the nonterminal node at the base of the Annelida originate from a common nonterminal node, respectively.
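The arithmetic in the Figure 2 caption can be reproduced with exact rational numbers. This is a minimal sketch: the list of factors is transcribed from the caption as printed, the count of topologies for 35 taxa is the value stated there rather than recomputed, and the variable names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod

# Factors of the formula 7 product, transcribed from the Figure 2 caption.
# Nodes contributing a factor of 1 are omitted because they do not change it.
factors = [
    2, 2, 2, 2, 2, 6, 12, 2, 11_705_850, 42,
    2, 2, 2, 2, 2, 2, 2, 151_164,
    2, 2, 2, 18, 2, 2, 2, 2, 2,
]

N_TAXA = 35
N_TOPOLOGIES = 105_061_603_969  # stated in the caption for 35 taxa

# Formula 7: product of the node factors divided by (n - 1)!.
probability = Fraction(prod(factors), factorial(N_TAXA - 1))

# Average probability if every topology were equally likely.
average = Fraction(1, N_TOPOLOGIES)

ratio = probability / average

if __name__ == "__main__":
    print("probability =", probability)
    print("ratio to average = 1 /", round(1 / ratio))
    print("as a percentage of the average =", float(ratio) * 100)
```

Using `Fraction` keeps the division by 34! exact, so the comparison with the caption does not depend on floating-point rounding.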
DISCUSSION

Enumerating the cladogram arrangements possible for a number of taxa is requisite to determining the likelihood of obtaining stochastically a bifurcating branching diagram with particular sister group relationships (equal to the reciprocal of the values obtained by using formula 1 or formula 3) and thereby provides a measure of confidence of that particular hypothesis of evolutionary relationships (similar to the use of the proportional-to-distinguishable-arrangements null hypothesis of Simberloff et al. [1981], $H_D$). Enumerating the cladogram topologies possible for a specified number of taxa is requisite to determining the likelihood of obtaining a particular bifurcating branching diagram topology under the assumption that all topologies representing that number of taxa are equiprobable and thereby provides a measure of confidence of that particular classification scheme (similar to the use of the equiprobable null hypothesis of Simberloff et al. [1981], $H_E$). Calculating a particular cladogram probability is requisite to determining the likelihood of obtaining a particular hierarchy of classification under the assumption of a Markovian bifurcating branching process and thereby provides a measure of confidence of those particular cladogenic events (similar to the use of the Markovian branching null hypothesis of Simberloff et al. [1981], $H_M$). Therefore, the enumeration of cladogram arrangements and topologies concerns the diagrams (or patterns) resulting from (the process of) cladogenesis, whereas the calculation of cladogram probability more directly concerns cladogenesis itself (Savage, 1983); the nonrecursive formula also can be applied to gene trees within populations (Maddison and Slatkin [1991], for example).

Felsenstein (1978) concluded the abstract of his paper by stating that one of the principal uses of enumerating cladogram arrangements was to frighten taxonomists.
In the same spirit, the determination of cladogram probabilities might assist frightened systematists with their interpretations of their works: The quantification of certain vague concepts, such as the diversity of clades contained in cladograms, might be facilitated by measures of probability. For example, given a particular hierarchical classification of n taxa containing a taxon-rich clade, the probabilities of other distinct hierarchical classifications of n taxa that include a clade with at least as many taxa can be determined. Hitherto, such quantifications have consisted of visual comparisons of diversity diagrams of real clades with those of clades generated by stochastic branching models (Raup et al., 1973; Gould et al., 1977), calculations of numerical indices of balance (Shao and Sokal, 1990; Heard, 1992), examinations of distributions of taxonomic subunits within taxonomic units (Dial and Marzluff, 1989), interpretations of fractal dimensions as indicators of diversity (Burlando, 1990), utilization of stratigraphic or molecular dating methods (Hey, 1992; Nee et al., 1992), comparisons of cladogram topologies with nonparametric distributions of computer-simulated topologies (Kirkpatrick and Slatkin, 1993), and considerations of bifurcating branching diagrams (Farris, 1976; Slowinski and Guyer, 1989). It is hoped that the nonrecursive formula presented in this paper will increase the utility of cladogram probabilities, provoke discussion, promote additional discoveries, and thereby ultimately provide a basis for selecting among equally parsimonious but distinct competing hypotheses.

ACKNOWLEDGMENTS

Concerning this work, M. Telford provided calculated commentary; S. B. Heard, W. P. Maddison, and J. B. Slowinski provided informative, detailed, and thought-provoking reviews; and D. Cannatella recursively provided editorial suggestions and encouragement.

REFERENCES

BURLANDO, B. 1990.
The fractal dimension of taxonomic systems. J. Theor. Biol. 146:99–114.
CAVALLI-SFORZA, L. L., AND A. W. F. EDWARDS. 1967. Phylogenetic analysis: Models and estimation procedures. Evolution 21:550–570.
CAYLEY, A. 1856. Note sur une formule pour la réversion des séries. J. Reine Angewandte Math. 52:276–284. [Collect. Pap. (Cambr.) 4:30–37, 1897.]
CAYLEY, A. 1889. A theorem on trees. Q. J. Math. 23:376–378. [Collect. Pap. (Cambr.) 13:26–28, 1897.]
DIAL, K. P., AND J. M. MARZLUFF. 1989. Nonrandom diversification within taxonomic assemblages. Syst. Zool. 38:26–37.
DOBSON, A. J. 1974. Unrooted trees for numerical taxonomy. J. Appl. Probab. 11:32–42.
EDWARDS, A. W. F., AND L. L. CAVALLI-SFORZA. 1964. Reconstruction of evolutionary trees. Pages 66–67 in Phenetic and phylogenetic classification (W. H. Heywood and J. McNeill, eds.). Publ. No. 6, Systematics Association, London.
FARRIS, J. S. 1976. Expected asymmetry of phylogenetic trees. Syst. Zool. 25:196–198.
FELLER, W. 1968. An introduction to probability theory and its application. John Wiley and Sons, New York.
FELSENSTEIN, J. 1978. The number of evolutionary trees. Syst. Zool. 27:27–33.
GOULD, S. J., D. M. RAUP, J. J. SEPKOSKI, T. J. M. SCHOPF, AND D. S. SIMBERLOFF. 1977. The shape of evolution: A comparison of real and random clades. Paleobiology 3:23–40.
GUYER, C., AND J. B. SLOWINSKI. 1991. Comparisons of observed phylogenetic topologies with null expectations among three monophyletic lineages. Evolution 45:340–350.
GUYER, C., AND J. B. SLOWINSKI. 1993. Adaptive radiation and the topology of large phylogenies. Evolution 47:253–263.
HARDING, E. F. 1971. The probabilities of rooted tree shapes generated by random furcations. Adv. Appl. Probab. 3:44–77.
HARPER, C. W. 1976. Phylogenetic inference in paleontology. J. Paleontol. 50:180–193.
HEARD, S. B. 1992. Patterns in tree balance among cladistic, phenetic, and randomly generated phylogenetic trees. Evolution 46:1818–1826.
HEY, J. 1992.
Using phylogenetic trees to study speciation and extinction. Evolution 46:627–640.
KIRKPATRICK, M., AND M. SLATKIN. 1993. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution 47:1171–1181.
LOSOS, J. B., AND F. R. ADLER. 1995. Stumped by trees? A generalized null model for patterns of organismal diversity. Am. Nat. 145:329–342.
MADDISON, W. P., AND M. SLATKIN. 1991. Null models for the number of evolutionary steps in a character on a phylogenetic tree. Evolution 45:1184–1197.
MEGLITSCH, P. A., AND F. R. SCHRAM. 1991. Invertebrate zoology. Oxford Univ. Press, Oxford, England.
MOOERS, A. Ø. 1995. Tree balance and tree completeness. Evolution 49:379–384.
MOOERS, A. Ø., AND S. B. HEARD. 1997. Inferring evolutionary process from phylogenetic tree shape. Q. Rev. Biol. 72:31–54.
MOON, J. W. 1970. Counting labelled trees. Canadian Mathematical Monographs, No. 1. Canadian Mathematical Congress. W. Clowes, London.
NEE, S., A. Ø. MOOERS, AND P. H. HARVEY. 1992. Tempo and mode of evolution revealed from molecular phylogenies. Proc. Natl. Acad. Sci. USA 89:8322–8326.
PHIPPS, J. B. 1976a. Dendrogram topology: Capacity and retrieval. Can. J. Bot. 54:679–685.
PHIPPS, J. B. 1976b. The numbers of classifications. Can. J. Bot. 54:686–688.
RAUP, D. M., S. J. GOULD, T. J. M. SCHOPF, AND D. S. SIMBERLOFF. 1973. Stochastic models of phylogeny and the evolution of diversity. J. Geol. 81:525–542.
SAVAGE, H. M. 1983. The shape of evolution: Systematic tree topology. Biol. J. Linn. Soc. 20:225–244.
SCHRÖDER, E. 1870. Vier combinatorische Probleme. Z. Math. Phys. 15:361–376.
SHAO, K., AND R. R. SOKAL. 1990. Tree balance. Syst. Zool. 39:266–276.
SIMBERLOFF, D. S., K. L. HECHT, E. D. MCCOY, AND E. F. CONOR. 1981. There have been no statistical tests of cladistic biogeographical hypotheses. Pages 40–63 in Vicariance biogeography: A critique (G. Nelson and D. E. Rosen, eds.). Columbia Univ. Press, New York.
SLOWINSKI, J. B., AND C. GUYER. 1989.
Testing the stochasticity of patterns of organismal diversity: An improved null model. Am. Nat. 134:907–921.
SWOFFORD, D. L., G. J. OLSEN, P. J. WADDELL, AND D. M. HILLIS. 1996. Phylogenetic inference. Pages 407–514 in Molecular systematics, 2nd edition (D. M. Hillis, C. Moritz, and B. K. Mable, eds.). Sinauer, Sunderland, Massachusetts.
TRUEMAN, J. W. H. 1993. Randomization confounded: A response to Carpenter. Cladistics 9:101–109.
WEDDERBURN, J. H. M. 1922. The functional equation $g(x^2) = 2\alpha x + [g(x)]^2$. Ann. Math., 2nd Ser. 24:121–140.

Received 10 April 1997; accepted 2 February 1998
Associate Editor: D. Cannatella

APPENDIX. PROOF OF FORMULA 7

The proof is by mathematical induction. It is supposed that the result is known for all patterns with fewer than n terminal nodes. In particular, the result is known for the two clades originating from the most basal node $u_0$, which have $n'$ and $n''$ terminal nodes ($n = n' + n''$). This means that the probabilities of constructing these clades are

\[ p' = \frac{1}{(n'-1)!} \prod_{u'} C\bigl(n_1(u') + n_2(u') - 2,\; n_1(u') - 1\bigr)\,[2 - \delta(u')], \tag{8$'$} \]

\[ p'' = \frac{1}{(n''-1)!} \prod_{u''} C\bigl(n_1(u'') + n_2(u'') - 2,\; n_1(u'') - 1\bigr)\,[2 - \delta(u'')], \tag{8$''$} \]

where products are performed over the nonterminal nodes ($u'$ and $u''$) in each branch. Formula 6$'$ can be used to show that the probability of constructing the entire cladogram is

\[ \begin{aligned} p &= [2 - \delta(u_0)]\,(n-1)^{-1}\,p'\,p'' \\ &= [2 - \delta(u_0)]\,(n-1)^{-1} \times \frac{1}{(n'-1)!} \prod_{u'} C\bigl(n_1(u') + n_2(u') - 2,\; n_1(u') - 1\bigr)\,[2 - \delta(u')] \\ &\qquad \times \frac{1}{(n''-1)!} \prod_{u''} C\bigl(n_1(u'') + n_2(u'') - 2,\; n_1(u'') - 1\bigr)\,[2 - \delta(u'')]. \end{aligned} \tag{6$'$} \]

In this formula, the two products are performed over the nonterminal nodes in the two clades. Together they form a product over every nonterminal node in the whole cladogram except the most basal node $u_0$. So

\[ \begin{aligned} p &= [2 - \delta(u_0)]\,(n-1)^{-1}\,\frac{1}{(n'-1)!}\,\frac{1}{(n''-1)!} \prod_{u \ne u_0} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)] \\ &= [2 - \delta(u_0)]\,\frac{1}{(n-1)!}\,\frac{(n-1)!}{n-1}\,\frac{1}{(n'-1)!}\,\frac{1}{(n''-1)!} \prod_{u \ne u_0} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)] \\ &= [2 - \delta(u_0)]\,\frac{1}{(n-1)!}\,\frac{(n-2)!}{(n'-1)!\,(n''-1)!} \prod_{u \ne u_0} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)] \\ &= [2 - \delta(u_0)]\,\frac{1}{(n-1)!}\,C(n-2,\; n'-1) \prod_{u \ne u_0} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)], \end{aligned} \tag{9} \]

where the product runs over all nonterminal nodes except $u_0$. Recalling that $n = n' + n''$,

\[ p = \frac{1}{(n-1)!} \prod_{u} C\bigl(n_1(u) + n_2(u) - 2,\; n_1(u) - 1\bigr)\,[2 - \delta(u)], \]

where the product now runs over all nonterminal nodes. This is formula 7. The initial step in the induction is trivial, because there is only one topology for a bifurcating branching diagram with two terminal nodes, that is, its probability is 1, so the proof is complete.
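As a small worked instance of the induction step (an illustration added here, not part of the original proof), consider a cladogram with n = 5 terminal nodes whose most basal node joins a clade of n' = 2 terminal nodes to a clade of n'' = 3 terminal nodes. Both clades have only one possible topology, so p' = p'' = 1, and the two clades differ, so the delta function at the most basal node is 0.

```latex
\begin{aligned}
\text{Formula 6$'$:}\quad
p &= [2 - 0]\,(5-1)^{-1}\,p'\,p'' = \tfrac{2}{4}\cdot 1 \cdot 1 = \tfrac{1}{2}, \\[4pt]
\text{Key identity:}\quad
\frac{(n-2)!}{(n'-1)!\,(n''-1)!} &= \frac{3!}{1!\,2!} = 3 = C(3,\,1) = C(n-2,\; n'-1), \\[4pt]
\text{Formula 7:}\quad
p &= \frac{1}{4!}\,
\underbrace{C(3,1)\cdot 2}_{\text{most basal node}}\;
\underbrace{C(0,0)\cdot 1}_{\text{2-taxon clade}}\;
\underbrace{C(1,0)\cdot 2}_{\text{3-taxon clade, basal}}\;
\underbrace{C(0,0)\cdot 1}_{\text{3-taxon clade, pair}}
= \frac{12}{24} = \tfrac{1}{2}.
\end{aligned}
```

Both routes give 1/2, which is the agreement the induction step establishes in general.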