OIKOS 112: 392–405, 2006

Diversity: between neutrality and structure

Salvador Pueyo

Pueyo, S. 2006. Diversity: between neutrality and structure. – Oikos 112: 392–405.

Here I present an integrated framework for species abundance distributions (SADs) that goes beyond the neutral theory without relying on complex mechanistic models. I give some general mathematical results on the relationship between SADs and their underlying dynamics, and analyse an extensive set of marine phytoplankton data in order to test the neutral theory against this broader framework. The main theoretical and empirical results are: (i) the logseries, which is the SAD produced by simple neutral models without migration, is quite robust in response to additional factors, including some forms of niche segregation; (ii) when there is a small but significant deviation from a logseries, the SAD will generally have the form of a power law, regardless of the specific mechanisms; (iii) when the deviation is moderate, the SAD will generally have the form of a lognormal, regardless of the specific mechanisms; (iv) although in a wide range of situations neutral and non-neutral dynamics cannot be distinguished from the SAD alone, some empirical SADs do have the fingerprint of non-neutrality: this is the case of marine dinoflagellates, in contrast to marine diatoms, which adjust to neutral theory predictions. The results for marine phytoplankton illustrate that both neutral and non-neutral mechanisms coexist in nature, and seem to have different weights in different groups of organisms. In addition to the above findings, I discuss several related contributions and point out some important pitfalls in the literature.

S. Pueyo, Dept. d’Ecologia, Univ. de Barcelona, Avgda. Diagonal 645, ES-08028 Barcelona, Catalonia, Spain.

Ecologists are currently engaged in a strong controversy (Whitfield 2002) about the unified neutral theory of biodiversity and biogeography (Hubbell 2001).
This theory maintains that the patterns of diversity must be explained without considering differences between species (reviewed by Chave 2004). At present, the neutral theory is centred on two aspects of diversity patterns (Hubbell 2001): (i) the proportions between different species, as captured by the statistical distribution of n, which is the abundance of a species chosen at random among those represented in a sample taken from a natural community, and (ii) the species–area relationship. Here I address species abundance distributions (SADs). An extension of my approach to the species–area relationship (SAR) can be found in Pueyo (in press).

Many possible SADs have been proposed in the ecological literature (May 1975, Engen 1978, McGill 2003a). Among these, it is particularly frequent to choose either the lognormal (Preston 1948) or the logseries (Fisher 1943) for fitting empirical data. These two distributions were introduced on empirical grounds and have proved quite useful, but the reasons for their success remain unclear. It has generally been thought that the proportions between the abundances of different species have a direct relation with the proportions between the sizes of their ecological niches: SADs would thus reflect a rather rigid ecosystem structure. At the other end, some authors have maintained that SADs result from random fluctuations in species abundances. The neutral theory belongs to this second category.

Accepted 4 July 2005. Copyright © OIKOS 2006. ISSN 0030-1299.

Following Chave (2004), we can distinguish two types of neutral community models, depending on whether or not these are spatially structured. Here I call the neutral community models without space “simple neutral models” (SNMs). Watterson (1974) introduced SNMs in ecology in a review in which he imported several models from the population genetics literature. This approach was later popularised and expanded by Caswell (1976). SNMs have three ingredients:
• Ecological drift (in Hubbell’s terms): changes in species abundances caused by stochastic reproduction and death events that have the same rules for all organisms, regardless of species (reproduction is assumed asexual).
• Regulation of total community size, equally affecting all organisms regardless of species.
• Rare speciation events consisting of the introduction of an organism that belongs to a new species.

Watterson (1974) showed that the SNMs that he reviewed produced a logseries SAD (given a large number of individuals N and species S).

Hubbell (2001) introduced a spatially structured model with two levels of integration: community and metacommunity. A metacommunity is a large set of local communities and has the dynamics of an SNM (Watterson’s “model 2”, which is a limiting case of a model by Karlin and McGregor 1967). Therefore, at a large spatial scale, Hubbell expects an SAD close to a logseries. A community has the same dynamics except that, instead of rare speciation events that give rise to new species, there is a given rate of immigration m of organisms that belong to a finite number of species and have a logseries distribution of abundances (because they are assumed to come at random from any site in the metacommunity). Hubbell calls the resulting SAD a “zero-sum multinomial” (ZSM). This distribution was analytically formalized by Volkov et al. (2003), Vallade and Houchmandzadeh (2003), McKane et al. (2004) and Alonso and McKane (2004). The ZSM equals a logseries for large m, and is more “humped” for small m, like a lognormal.

The empirical support for Hubbell’s (2001) unified neutral theory lies in the similarity between empirical SADs and the ZSM and that between empirical SARs and those in a more elaborate version of his model, with explicit space. In particular, the main argument used by Hubbell (2001) is that the ZSM would fit two large samples of tropical forest trees better than the lognormal.
However, McGill (2003b) carried out several statistical tests, the results of which did not support Hubbell’s claim. Volkov et al. (2003) responded with a new statistical analysis that seemed to support the ZSM over the lognormal, but Pueyo (unpubl.) shows that their analysis was incorrect and, as far as we know, the ZSM and the lognormal fit the data equally well.

Many ecologists are uncomfortable with the neutral theory because there is much evidence of ecological niches and other factors that are overlooked by the theory, and it is difficult to believe that these are irrelevant for the global features of ecological communities. Hubbell himself (2001) expects a future unification of the neutral and niche theories, although meanwhile he and his collaborators focus on defending the former (Hubbell 2003, Volkov et al. 2003, 2004). It is not clear how this unification can be achieved. The introduction of an increasing number of parameters in the models will not produce a fundamental theory. In addition, even though the distribution predicted by the neutral theory does not fit tropical forest tree SADs better than the lognormal, as far as we know it does fit them, and this must be explained by opponents of the theory.

An idea gaining momentum is that both neutral and non-neutral models will give rise to similar SADs. This was argued by Chave et al. (2002) on the basis of a series of simulations, and also by Mouquet and Loreau (2003). McGill (2003a) outlined a theoretical framework for this point of view. He proposed that SADs would generally belong to a class of statistical distributions that he called POLO, which would cover both the lognormal and the power law (or Pareto, or Zipf-Mandelbrot distribution). He classified these two distributions together because the lognormal becomes a power law at the limit of infinite $\sigma$ (Montroll and Shlesinger 1982; $\sigma$ is the standard deviation of the variable after taking logarithms).
He conjectured that “any complex theory that invokes multiplication of complex factors will produce POLO-like distributions”, so “almost any theory will match almost any data as long as we only look at the shape of the distribution”.

This line of thought is enlightening but still lacks an analytical foundation. It also lacks a strong empirical justification of the need to transcend the neutral theory, as would be given by a test that clearly rejects neutrality in favour of a broader alternative such as the one mentioned above. These are the two main points that I address in this paper, which includes a theoretical and an empirical part.

In the theoretical part I tentatively present an integrated framework for diversity patterns, focusing on SADs. I analyse the SADs to be expected under a set of “minimal assumptions”. This must not be confused with the use of “minimal models” as in the neutral theory. Neutral models can be considered “minimal” in the sense that they have few parameters. However, the simplicity of neutral models is not due to minimal but to major assumptions: each of the parameters that a more complex model would contain is set to zero. Instead, in my approach, zero or any other specific value for any possible parameter is not assumed. My main assumption is that the contribution of these additional parameters to the SAD is small to moderate.

In the empirical part, I analyse an exceptionally large set of marine phytoplankton data, in the light of the integrated framework proposed. In particular, I test whether or not these data are consistent with the neutral theory.

The final discussion includes a critical review of related contributions. I describe some pitfalls in the explanations of the SADs given by Bell (2000, 2001) and Pachepsky et al. (2001).
I also show how the integrated approach sheds light on some of the main controversies in the literature, such as those related to Preston’s (1962) “canonical” distribution or the possible “left skew” of SADs as compared to a lognormal.

Methods

This paper presents theoretical and empirical results on species abundance distributions (SADs). There is a methodological choice crucial for both types of results: the formalism that we use to capture the information held in an empirical sample. The relevance of this choice for the empirical part is clear. In my approach, this choice is equally relevant for the theoretical part, because my theoretical research explicitly addresses the fraction of reality that is empirically observable by means of a sample.

In most scientific fields, the statistical distribution of a given variable x is represented either as a set of probabilities {p(x)} when x is discrete, or as a continuum of probability densities {f(x)} when x is continuous. Since species abundances n are discrete, we may consider estimating the set of discrete probabilities {p(n)} by means of a common histogram. However, there is a drawback with this method. As noted by other authors (McGill 2003a) and clearly supported by the results in this paper, SADs approach the power law statistical distribution, also called Pareto or Zipf-Mandelbrot,

$$p(n) \propto n^{-\beta} \qquad (1)$$

($\propto$ means “proportional to”), with $\beta$ not far from unity. The power law is a heavy-tailed distribution that differs considerably from common “textbook” statistical functions, and makes a common histogram inappropriate in most cases. In a histogram, most species are concentrated in a few bins at the lower end of the distribution, followed by a long tail of bins with one or a few sparse species among many empty bins. Instead of the equal-sized bins of a common histogram, unequal sizes must be used in order to obtain a more homogeneous number of data per bin.
The probabilities of bins of varying sizes can be compared if these are previously standardised by dividing by the size of the bin. The result obtained is the density of probability for each bin. Therefore, in spite of the discrete nature of abundances, it is more useful to represent probability densities {f(n)} than discrete probabilities {p(n)}. In particular, when dealing with power laws, it is convenient to use multiplicative intervals for the bins and to represent the results in a log–log scale. This produces a plot with an array of equally spaced points arranged in a straight line and with quite homogeneous error bars. These are completely homogeneous for $\beta = 1$ in Eq. 1.

Multiplicative intervals have the form $[\lambda^{j}, \lambda^{j+1})$ for some constant $\lambda$ and a series of discrete $j$ beginning with $j = 0$. I use $\lambda = 2$ because this is the minimum $\lambda$ for which each species unambiguously belongs to a single bin, considering that abundances are in fact discrete. For each bin $j$, the (logarithmically) central value is $n_j = \sqrt{2^{j}\,2^{j+1}} = 2^{j+1/2}$, and its estimated probability density is $\hat{f}(n_j) = \frac{1}{2^{j}}\,\frac{s_j}{S}$, where $s_j$ is the number of species in bin $j$, $S$ is the total number of species and $2^{j}$ is the width of the bin.

In his well known representation, Preston (1948) also took multiplicative intervals with $\lambda = 2$, but with three differences: (i) the probability of each bin was calculated without standardising by size, (ii) the probabilities were represented in an arithmetic scale, and (iii) overlapping intervals were taken for successive bins. The reason for the first two differences is that Preston’s representation was designed in such a way as to obtain the shape of a Gauss bell for a lognormal SAD. Therefore, this might be considered a method specifically designed for working with the lognormal hypothesis rather than a general-purpose method.
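The binning scheme with $\lambda = 2$ can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the procedure of Appendix A: the function name `log2_binned_density` and the toy abundance list are hypothetical, and abundances are assumed to be positive integers.

```python
def log2_binned_density(abundances):
    """Estimate the probability density on the bins [2^j, 2^(j+1)), j = 0, 1, ...

    Returns a list of (j, central value, estimated density) for non-empty bins.
    Assumes every abundance is a positive integer (hypothetical input format).
    """
    S = len(abundances)  # total number of species in the sample
    counts = {}
    for n in abundances:
        # For a positive integer n, bit_length() - 1 equals floor(log2(n)) exactly.
        j = n.bit_length() - 1
        counts[j] = counts.get(j, 0) + 1
    result = []
    for j in sorted(counts):
        centre = 2 ** (j + 0.5)              # logarithmic centre of the bin
        density = counts[j] / (S * 2 ** j)   # s_j / S, divided by the bin width 2^j
        result.append((j, centre, density))
    return result


# Toy sample of eight species (made-up numbers, for illustration only).
toy_sample = [1, 1, 2, 3, 4, 7, 8, 100]
binned = log2_binned_density(toy_sample)
for j, centre, density in binned:
    print(j, round(centre, 3), density)
```

Multiplying each density by its bin width and summing recovers 1, which is a quick check that the standardisation by bin size has been applied consistently.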
Preston’s third difference, the use of overlapping intervals, is inconvenient for statistical tests such as $\chi^2$ because it eliminates the independence in statistical error between bins.

Appendix A gives details on methods of parameter estimation and hypothesis testing.

Theory

An integrated framework for species abundance distributions

The meaning of the logseries

In this subsection I investigate the range of circumstances in which a logseries SAD can be expected, and in the next subsection I investigate the SADs to be expected when these circumstances are not met.

Watterson (1974) showed that simple neutral models (SNMs) with a large number of species S and individuals N produce a logseries. This is the reason why Hubbell (2001) expects this SAD at a metacommunity level. While neutrality leads to the logseries, to what extent is the logseries evidence of neutrality?

The logseries distribution has the form

$$f(n) = k\,n^{-1} e^{-\phi n} \qquad (2)$$

Equation 2 has two parts. The first is a power law

$$f(n) = k\,n^{-\beta} \qquad (3)$$

with $\beta = 1$. The second is an exponential function. Appendix B shows that these two parts are the independent outcome of two distinct mechanisms: the power law with $\beta = 1$ is the result of ecological drift, while the exponential results from the regulation of total community size. These two mechanisms are not equally important for all species abundances, as shown by Fig. 1, which is the probability density function (p.d.f.) of a logseries with realistic parameters (those fitted to Mediterranean diatoms in the “Case study” below). In this case, the logseries (solid line) overlaps with the power law (dotted line) for most of its range. This “power law” region becomes larger when sample size increases. The exponential part of the logseries is apparent only at the upper end of the distribution.
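As a numerical aside, the ratio between the logseries density (Eq. 2) and its power law part (Eq. 3 with $\beta = 1$) is simply the exponential factor. The sketch below evaluates that ratio over several orders of magnitude of n. The value of `phi` is a hypothetical choice made only for illustration, not a parameter fitted to any data set in this paper, and the function names are my own.

```python
import math


def power_law_density(n, k=1.0, beta=1.0):
    """Power law part, Eq. 3: f(n) = k * n**(-beta)."""
    return k * n ** (-beta)


def logseries_density(n, k=1.0, phi=1e-4):
    """Logseries, Eq. 2: f(n) = k * n**(-1) * exp(-phi * n).

    phi = 1e-4 is an arbitrary illustrative value (assumption).
    """
    return k * n ** (-1.0) * math.exp(-phi * n)


def bending_factor(n, phi=1e-4):
    """Ratio of the logseries to the pure power law with beta = 1."""
    return logseries_density(n, phi=phi) / power_law_density(n)


for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(n, bending_factor(n))
```

With these assumed values the ratio stays close to 1 while n is much smaller than 1/phi and collapses only for the largest abundances, which is the sense in which the exponential part is visible only at the upper end.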
This implies that, in SNMs, the trajectory of the abundance of each species is essentially governed by drift, and only when a species becomes exceptionally abundant is it clearly affected by the limited size of the community. The main role of the exponential part is to set an upper bound, resulting from community size, to the power law. In the physical literature, a function with this role is called a bending function. Therefore, a logseries is a power law distribution with $\beta = 1$ and an exponential bending function.

Fig. 1. Probability density function of a logseries distribution (solid line) with realistic parameters, with indication of its two components: (i) the power law (dotted line) with slope parameter $\beta = 1$, which results from ecological drift in simple neutral models (SNMs), and (ii) the bending function (arrow), which results from the regulated community size (short dash line) in these same models. Axes: $\log_{10}(n)$ (horizontal) and $\log_{10}(f(n))$ (vertical).

In a different context, Mandelbrot (1963) showed that the power law has a special property that makes it robust in response to factors other than its generating mechanism. Let us call this property “invariance under assemblage”. Take a given range of abundances $[n_0, n_M]$. Assume that the dynamics of a species $i$ in this range is governed by drift. Then the density of probability $f(n_i)$ of finding an abundance $n_i$ at time $t$ is

$$f(n_i) = k_i\,n_i^{-\beta} \quad \text{for } n_i \in [n_0, n_M] \qquad (4)$$

Now take a set of species that satisfy Eq. 4, with $k_i$ either equal or different for each species. The probability density of an abundance $n$ for one such species chosen at random is

$$f(n) = \sum_i \left[ f(k_i)\,f(n \mid k_i) \right] = \sum_i \left[ f(k_i)\,k_i\,n^{-\beta} \right] = \left\{ \sum_i \left[ f(k_i)\,k_i \right] \right\} n^{-\beta} = k\,n^{-\beta} \qquad (5)$$

i.e. it reduces to Eq. 3. In SNMs, every species has the same $k_i$, but Eq. 5 implies that the SAD to be expected may be quite similar without requiring this assumption. Three outstanding situations in which we will find a power law (Eq. 3) with $\beta = 1$ in some range of abundances are thus (for large N and S):

1) SNMs.
2) When the reproduction rate nearly equals the mortality rate for each species but these rates differ between species, while the premises of SNMs are satisfied in other respects. Then the abundance of each species will fluctuate at a different speed, but each will satisfy Eq. 4 and the ensemble will satisfy Eq. 3.
3) When there are several guilds with many species in each, in such a way that the niche of each guild does not overlap with the others, but there is full niche overlap within each guild, and the dynamics of the set of species in the guild agrees with either point 1 or point 2 above.

Case 3 above will render an SAD really close but not necessarily equal to the logseries, because the principle of invariance under assemblage does not extend to the bending function.

More generally, when there is an appreciable overlap of niches among different species but this does not have the simple form in the third of the above cases, the species will still undergo wide fluctuations driven by drift. However, this will not be pure drift, but drift modulated by the influence of niches. Therefore, the outcome may still be close to a logseries, but will differ more clearly. This divergence will be larger the stronger the niche segregation, or any other factor besides niche segregation and drift. In the next subsection I show what happens in this case.

Of course, in addition to the extension of SNMs in this section, we cannot rule out that a completely different mechanism may lead to the logseries or a similar distribution (e.g. by producing a logseries niche structure).

When the logseries is not enough

In the previous subsection, I show that the logseries distribution produced by SNMs is robust in response to the presence of factors that are not contemplated in such models, but the robustness is not unlimited.
Here I study which SADs are to be expected when these additional factors are strong enough for the distribution to differ significantly from a logseries.

If we have a given range of abundances fully dominated by ecological drift, the probability density function (p.d.f.) will have the form

$$f_0(n) = k\,n^{-1} \qquad (6)$$

which corresponds to a straight line in a log–log plot, with slope −1. If there are some other factors of importance for the p.d.f. $f$ in a community, there may be some deviation from this straight line. The only relevant deviations are those that are statistically significant. Therefore, it is convenient to measure these deviations $\rho$ in terms of the log–log plot,

$$\rho(n) = \log[f(n)] - \log[f_0(n)] = \log[f(n)] - \log[k\,n^{-1}] \qquad (7)$$

In this way, the significance of a given $\rho$ does not depend on $n$. As stated in the Methods section, the error bars for $f_0$ are homogeneous in this representation. The deviations $\rho$ may result from a number of factors, i.e.

$$\rho(n) = \rho(n;\, z_1, z_2, \ldots, z_v) \qquad (8)$$

where {z} are the parameters for niche segregation, environmental noise, relative fitness, life history traits, compensatory mortality, Allee effect, migratory exchanges, anthropogenic influences, non-stationarity, etc., and for the variation in these factors between different species.

At this point there are several possible strategies. Hubbell’s (2001) model assumes $z_i = 0$ for all $i$ except one. The chosen parameter is a measure of the overall intensity of migratory exchange, which Hubbell tries to fit from the SAD. Another option is to attempt to model all of the processes that affect the dynamics of the community and give a reasonable value to each parameter involved. This would be extremely difficult, and we might eventually find that the addition of so much complexity does not produce results that differ qualitatively from those obtained with much simpler models. This is precisely what opens the door to the alternative presented here: the development of an approach to SADs based on those properties of $\rho$ that do not depend on the particular values taken by the set of parameters {z}.

If the sample is small, the observable range of $n$ will also be small and the occurrence of significant deviations $\rho$ is less likely. As the sample increases, the range of $n$ will increase and the deviations will become progressively larger. If the new shape that emerges is well matched by a continuous and infinitely differentiable function, the set of deviations can be decomposed into a Taylor series. Let us choose an arbitrary $n_\gamma$ in the range of observable abundances and define (Fig. 2):

$$\Delta\log(n) = \log(n) - \log(n_\gamma) \qquad (9)$$

$$\Delta\rho(n) = \rho(n) - \rho(n_\gamma) \qquad (10)$$

When there is only ecological drift (Eq. 6), $\Delta\rho = 0$. Otherwise, we can perform the expansion

$$\Delta\rho(n) = \sum_{j=1}^{\infty} c_j\,[\Delta\log(n)]^{j} \qquad (11)$$

Fig. 2. Graphical representation of the function $\rho$, which I use to express the probability density function of abundances (solid line) as a modification of a power law with $\beta = 1$ (dotted line).

Taylor series such as Eq. 11 are often used in physics for simplifying complex functions whose details are not well known. Since, when $\Delta\log(n)$ is small, $[\Delta\log(n)]^{j}$ vanishes for large $j$, we often find that a few terms with small $j$ suffice for fitting the observable range of a complex function. This procedure lies at the basis of results as important for physics as, for example, the principle of minimum entropy production (Nicolis and Prigogine 1977).

For a better understanding of the Taylor series approach, I illustrate its application with a large sample of Mediterranean marine phytoplankton (Margalef 1994), which I analyse in detail in the next section. Fig. 3a compares the empirical SAD with the straight line that would correspond to Eq. 6. The empirical SAD is close to what we would expect from neutral fluctuations alone, but there are indeed other factors at work.

Fig. 3. Incorporation of successive terms of the Taylor expansion in Eq. 11 for fitting the Mediterranean phytoplankton species abundance probability densities. a, 0th order approach, which corresponds to a power law (Eq. 1) with $\beta = 1$. b, 1st order approach, which corresponds to a power law with arbitrary $\beta$ (in this case, $\beta \approx 1.2$). c, 2nd order approach, which corresponds to a lognormal. Axes in each panel: $\log_{10}(n)$ (horizontal) and $\log_{10}(f(n))$ (vertical).

If the observable range is small in view of the strength of the factors other than ecological drift, it is likely that we require only one term in the expansion in Eq. 11, which corresponds to $j = 1$. Then, from Eq. 7 and 9–11 we obtain a power law (Eq. 3) with $\beta \neq 1$. More specifically,

$$\beta = 1 - c_1 \qquad (12)$$

We preserve the straight line in the log–log plot, but inclined. Fig. 3b shows that this suffices for quite a good fit to the empirical data (in this case, $\beta \approx 1.2$). The intuition behind our mathematical result is that (i) the simplest thing that any additional factor can do is slightly increase or decrease the number of species of small abundance as compared to those of large abundance, (ii) the simplest tool for capturing one such deviation is a straight line (this is the reason why linear regressions are so widely used in all fields), and (iii) if the range of $n$ is small enough, there may not be room for a more involved outcome. Therefore, when a power law with $\beta \neq 1$ fits our empirical data well, as we see in Fig. 3b, we cannot determine which of the ecological factors corresponding to different $z_i$ are responsible, because several will primarily have the same outcome. A measure of their combined strength is $\beta - 1$. Furthermore, power laws are subject to the principle of invariance under assemblage, as explained in the subsection “The meaning of the logseries” above, which makes this result robust.

If two terms in Eq. 11 are required instead of one ($j = 1, 2$), a lognormal is obtained from Eq. 7 and 9–11

$$f(n) \propto n^{-1}\, e^{-\frac{1}{2}\left(\frac{\log(n) - \mu}{\sigma}\right)^{2}} \qquad (13)$$

($\propto$ means “proportional”), satisfying

$$\mu = -\frac{c_1(n_\gamma)}{2 c_2} + \log(n_\gamma), \qquad \sigma^{2} = -\frac{1}{2 c_2} \qquad (14)$$

In Eq. 14, $c_1$ appears as a function of $n_\gamma$, unlike in Eq. 12. This occurs because $c_1$ is a measure of slope in the log–log plot. In Eq. 12 I assume a straight line (Fig. 3b), so $c_1$ is a constant. In contrast, in this second approach I assume a curve (Fig. 3c), so the slope depends on the point where it is measured. However, the term $\log(n_\gamma)$ cancels this effect and the resulting parameter $\mu$ does not depend on $n_\gamma$. Fig. 3c shows that the lognormal fits our data even better than the power law. The intuition behind our mathematical result is that (i) in addition to increasing or decreasing the number of species of small abundance as compared to those of large abundance, the simplest thing that any additional factor can do is slightly increase or decrease the number of species of intermediate abundance as compared to those of small or large abundance, (ii) the simplest way of capturing such a deviation is by introducing a degree of curvature by means of a quadratic term, and (iii) if the range of $n$ is small enough, there may not be room for a more involved outcome. Again, the simple fact that an SAD is lognormal will not provide information on the relative importance of each of the mechanisms involved.

One point must be clarified in relation to the above developments. In the subsection “The meaning of the logseries” above, I distinguish two domains in the logseries distribution produced by SNMs (Fig. 1). The first domain covers most of the range of $n$ in large samples and consists of a power law with $\beta = 1$, resulting from ecological drift.
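It is within this first domain that the expansion of Eq. 11 operates. As a rough numerical check of Eq. 12–14, the sketch below maps the coefficients $c_1$ and $c_2$ to $\beta$, $\mu$ and $\sigma^2$, and verifies that the second-order expansion and the lognormal of Eq. 13 differ only by an additive constant in the logarithm. The coefficient values are hypothetical, natural logarithms are assumed throughout, and the function names are my own.

```python
import math


def beta_from_c1(c1):
    """Eq. 12: first-order expansion gives a power law with beta = 1 - c1."""
    return 1.0 - c1


def lognormal_params(c1, c2, n_gamma):
    """Eq. 14: (mu, sigma squared) from the second-order coefficients.

    Assumes natural logarithms and c2 < 0 (downward curvature).
    """
    if c2 >= 0:
        raise ValueError("c2 must be negative for a lognormal shape")
    mu = -c1 / (2.0 * c2) + math.log(n_gamma)
    sigma2 = -1.0 / (2.0 * c2)
    return mu, sigma2


def log_density_taylor(n, c1, c2, n_gamma):
    """log f(n) up to a constant: -log n + c1*dlog + c2*dlog**2 (Eq. 7, 9-11)."""
    dlog = math.log(n) - math.log(n_gamma)
    return -math.log(n) + c1 * dlog + c2 * dlog ** 2


def log_density_lognormal(n, mu, sigma2):
    """log of Eq. 13 up to a constant."""
    return -math.log(n) - (math.log(n) - mu) ** 2 / (2.0 * sigma2)


# Hypothetical coefficients, chosen only to exercise the formulas.
c1, c2, n_gamma = -0.2, -0.05, 100.0
mu, sigma2 = lognormal_params(c1, c2, n_gamma)
offsets = [
    log_density_taylor(n, c1, c2, n_gamma) - log_density_lognormal(n, mu, sigma2)
    for n in (1.0, 10.0, 1000.0, 1e5)
]
print(beta_from_c1(c1), mu, sigma2, offsets)
```

The offsets are the same at every abundance, so the two expressions describe the same distribution once normalised.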
The second domain is a deviation from the power law at the upper end of the distribution, produced by the exponential bending function caused by the finite size of the system. The previous developments in this section refer to the modification in the first domain when we add other mechanisms that we assume to be relevant but with less impact on the SAD than ecological drift. These results do not directly affect the bending function. The precise shape of this function will be modified in an uncertain manner with these added factors, but we may assume that it will not change in its essentials, i.e. there will be a decrease in probabilities at the upper end of the distribution because of the finite size of the system. The power law with $\beta \neq 1$ that we found will thus display an upper bound: this could not be otherwise, because, without such a bound, the expected abundance would be infinite for any $\beta < 2$ (Mandelbrot 1983). In principle, the lognormal that we obtain in this manner also has an upper bound. We can often overlook this bound, because a lognormal decays by itself faster than a power law and has a finite expectation, but, on the other hand, it decays more slowly than an exponential bending function. Whenever the bound is unduly overlooked when fitting the parameters of the lognormal, there will be a mismatch between the empirical and the expected SAD, with an “excess” of rarity in the first, which may be perceived as a left skew in Preston’s representation.

Further generalization

In the previous subsections I show that, when an SAD is shaped mainly by ecological drift but other mechanisms also have some relevance, this SAD is likely to be either a power law or a lognormal. However, the conditions that lead to lognormality are even broader. We will generally find a lognormal when there is a combination of:

1) One or several mechanisms that, in isolation, would produce a power law SAD. This is the case of ecological drift, but also of simple forms of environmental noise (Appendix B) or a scale-invariant niche structure (Morse et al. 1985).
2) One or several mechanisms that slightly favour intermediate abundances. This is, for example, the case of migration, compensatory mortality or some possible forms of non-scaling niches.

The sequence in which I introduced the series of analytical steps in this section is based on the hypothesis that it is mainly because of ecological drift that SADs approach a power law, and that the other factors introduce minor variations on this theme. However, it must be pointed out that there are other possible sources for the power law and thus for the lognormal. Some of these mechanisms produce power laws with $\beta \neq 1$ without the requirement of additional factors. For example, in the extreme case of population fluctuations driven entirely by environmental noise with a completely different effect on each species, a power law with $\beta = 2$ would be obtained (Appendix B).

Case study: a glimpse at marine phytoplankton

In the light of the integrated approach outlined in the previous sections, here I analyse two large sets of marine phytoplankton data published by Margalef (1994), one from the Mediterranean and the other from the Caribbean. Each set results from grouping more than 1000 samples from a number of sites in a large area, so they are expected to capture the features of their metacommunities, according to Hubbell (2001). There are 162 478 identified cells in the Mediterranean set and 883 352 in the Caribbean. This is an exceptional amount of data: both sets exceed the size of the sample taken by Siemann et al. (1996, 1999), which the authors claimed to be the most thorough sample of an ecological community to date, and also the sample sizes in Hubbell (2001).

Details of the data sets are shown in the upper part of Table 1. All the samples were taken in the photic zone (down to 110 m).
The Mediterranean samples were obtained at several sites in the Catalan sea, while the Caribbean samples were obtained along the eastern coast of Venezuela. I considered only the cells identified to species. For each sea, I studied three sets: (i) the complete set including all the groups of phytoplankton, (ii) diatoms, and (iii) dinoflagellates. These last two groups are important because most cells belong to one of these and, in addition, they are the only ones exhaustively identified to species level. It is of interest to analyse them separately because their differences are not only taxonomic but also ecological (Margalef 1978).

In each case, I represent the empirical probability density function (p.d.f.) as explained in the Methods section, and perform the following operations: (i) the logseries is fitted by maximum likelihood estimation and its adequacy tested, (ii) the power law slope parameter $\beta$ is estimated by simple regression for the whole range of abundances, and (iii) $\beta$ is fitted by maximum likelihood estimation in the interval of abundances [10, 1000), which displays no significant deviation from a power law, and the 90% confidence intervals are quantified. I then examine the goodness of fit to the distributions with standard chi-square tests (which might however suffer a slight bias in favour of the null hypothesis when applied to SADs, according to recent results by Alonso and McKane 2004). Appendix A gives the procedures for this set of statistical treatments.

Table 1 shows the results. Fig. 4 compares the empirical SADs with the logseries, and Fig. 5 displays the set of power laws that best fit the whole range of abundances.

Table 1. Statistics of marine phytoplankton diversity. NT: total sample size, including cells identified and not identified to species level. N: sample size without unidentified cells (these were not used in the analyses). S: number of species identified. β: “slope” parameter of the power law (Eq. 1). c.i.: confidence interval. df: degrees of freedom. o: minimum significance level that allows the rejection of a given distribution. α: parameter of the logseries (Appendix A).

                  Mediterranean                               Caribbean
                  All           Diatoms       Dinofl.        All           Diatoms       Dinofl.
NT                197 535       116 409       14 055         1 113 581     779 347       102 558
N                 162 478       112 352       10 874         883 352       759 794       60 851
S                 353           107           209            257           118           122

Power law in the interval [10, 1000), by maximum likelihood estimation
β                 1.23          1.02          1.46           1.20          1.02          1.44
90% c.i.          (1.19, 1.26)  (0.91, 1.07)  (1.40, 1.51)   (1.15, 1.24)  (0.92, 1.07)  (1.35, 1.50)
χ²                6.12          0.45          18.0           2.23          4.47          6.83
df                5             5             6              5             5             4
o                 0.29          0.99          0.21           0.82          0.48          0.15

Power law in the whole range, by regression
β                 1.31          1.13          1.40           1.24          1.14          1.34
r²                0.99          0.98          0.98           0.99          0.98          0.99

Logseries
α                 42.8          11.7          36.7           24.5          10.6          14.6
χ²                52.5          3.9           24.3           51.7          28.1          31.0
df                11            11            7              14            14            10
o                 < 10⁻⁶        0.972         0.001          3×10⁻⁵        0.014         6×10⁻⁴

The logseries can definitely be rejected for the phytoplankton as a whole and for dinoflagellates in particular, in both seas. Only Mediterranean diatoms strictly adhere to the logseries, while those in the Caribbean allow this distribution to be rejected but not as strongly as dinoflagellates and phytoplankton in general.

For all the sets, the overall SADs are close to power laws. In the range [10, 1000), diatoms have $\beta \approx 1.0$ in both seas, as corresponds to a logseries. Dinoflagellates have $\beta \approx 1.45$ in both seas, while phytoplankton as a whole displays $\beta \approx 1.2$, also in both.

Discussion

Neutral theory, niche theory and the integrated framework

In this paper I present an integrated theoretical framework for diversity patterns. While recognising the great importance of the mechanisms highlighted by the neutral theory (random drift, community-level regulation, migration), I consider that there are many other ecological mechanisms that must not be overlooked, and show a simple way to incorporate them into the theory of diversity patterns.
A key argument in this study is that many different models will produce the same few diversity patterns, as maintained by other authors, such as Chave et al. (2002), McGill (2003a) and Mouquet and Loreau (2003). Therefore, a given species abundance distribution (SAD) or species area relationship (SAR) will rarely suffice for supporting a narrowly defined model, not even the neutral theory. On the other hand, the SADs of some natural communities allow the neutral theory to be rejected, in principle. In the ‘‘Case study’’ above, I show that this applies to marine dinoflagellates (in contrast to diatoms, whose SADs are consistent with neutrality).

The analytical findings that result from my minimal assumptions support the conjecture by McGill (2003a): under broad conditions, complex systems involving multiplicative processes will render POLO-like distributions. ‘‘POLO’’ is the term that he proposes for embracing the power law and the lognormal. Neutral models are just an instance of this type of system.

At a metacommunity level, Hubbell (2001) expects a logseries distribution, which is a particular case of power law distribution (with b = 1 and an exponential bending function). At a community level, he expects what he calls a ‘‘zero-sum multinomial’’ (ZSM), which can be assimilated to a lognormal. Admittedly, the equations for the lognormal and for the ZSM differ, and Hubbell (2001) maintains that the ZSM fits tropical forest tree data better than the lognormal. However, according to recent analyses (S. Pueyo, unpubl.), the ZSM and the lognormal fit the data equally well. Since we cannot currently distinguish between these two distributions from empirical data, for practical purposes a ZSM is a lognormal. When Preston (1948) and other authors assert that a given sample from a natural community ‘‘displays a lognormal distribution’’, what is meant is that it ‘‘displays a statistical distribution that cannot be distinguished from a lognormal in practice’’.
Therefore, when we search for a mechanism that generates a lognormal, what we are actually looking for (or must look for) is a mechanism that generates a distribution that cannot be distinguished from a lognormal in practice. Hubbell gives an option for one such mechanism: the combination of ecological drift and migration (plus community regulation). However, these two factors can be replaced or complemented by other factors, which are listed in the subsection ‘‘Further generalization’’ above. For example, there is evidence of compensatory mortality in tropical forest trees (Peters 2003), and this mechanism will produce an effect on the SAD that is difficult to distinguish from the effect of migration.

Fig. 4. Species abundance probability densities of marine phytoplankton (empty spots), compared with the best-fit logseries distribution (full spots). Panels: Mediterranean phytoplankton, Mediterranean diatoms, Mediterranean dinoflagellates, Caribbean phytoplankton, Caribbean diatoms, Caribbean dinoflagellates. Vertical axis: log10 (f(n)).

Fig. 5. Species abundance probability densities of marine phytoplankton, with power laws fitted by regression. Panels: Mediterranean phytoplankton, Mediterranean diatoms, Mediterranean dinoflagellates, Caribbean phytoplankton, Caribbean diatoms, Caribbean dinoflagellates. Axes: log10 (n) (horizontal) and log10 (f(n)) (vertical).

The set of phytoplankton data examined here is especially relevant in this context, for two reasons: the huge amount of data, and the fact that these result from sampling at a metacommunity level. At this level, the distribution expected from the neutral theory has a single parameter to fit, which implies a more specific prediction than at a community level and is thus advantageous for testing the theory.
The results obtained for the Mediterranean and Caribbean are very similar, which suggests that these have quite a general validity. The analyses indicate that marine phytoplankton is not neutral, at least with respect to dinoflagellates. On the other hand, diatoms largely agree with the predictions of the neutral theory in the Caribbean and completely so in the Mediterranean.

SADs departing from the neutral theory expectations are reasonably well fitted by the power law distribution, as the integrated framework predicts for small departures from the logseries. In both seas, dinoflagellates display a power law with a slope parameter b ≈ 1.45, which differs significantly from the value b = 1.0 that we would expect from neutrality and that we do find in diatoms. For all phytoplankton together, we obtain b ≈ 1.2 in both seas, which also differs significantly from b = 1.0. The power law for the full samples results from assembling the power law for each taxonomic group, which illustrates the principle of invariance under assemblage (Eq. 5). In this case, however, we assemble power laws with different b, unlike Eq. 5, but not different enough for the result to clearly deviate from a power law.

Other authors have previously used power laws with b ≠ 1 to fit empirical SADs. Siemann et al. (1996) fitted their huge samples of grassland arthropods with an expression equivalent to a power law with b = 1.5.

Marine phytoplankton SADs give evidence of non-neutrality, but this does not imply that ecological drift is unimportant in these organisms. My theoretical developments begin with the assumption that drift is the factor with the strongest influence on the type of shape that SADs display, while making clear that this is not the sole option. However, this option is plausible for phytoplankton.
According to the well-known ‘‘paradox of the plankton’’ enunciated by Hutchinson (1961), ‘‘the problem that is presented by phytoplankton is essentially how it is possible for a number of species to coexist in a relatively isotropic or unstructured environment all competing for the same sort of materials’’. The diversity of phytoplankton is difficult to explain from niche segregation alone, which suggests that niche overlap must be high. If this is the case, drift will be a key factor for SADs.

The above results also suggest that niche overlap might be broader in diatoms than in dinoflagellates (which must not be confused with a total absence of niche segregation in diatoms). This is a conjecture that can and must be tested. At least on first inspection, it appears to be congruent with other ecological differences between diatoms and dinoflagellates. Diatoms are mainly associated with mixed waters, well matched by Hutchinson’s above description, while dinoflagellates are more often found in stratified waters, and have traditionally been attributed the characteristics of late stages of succession (Margalef 1978), in which the community would be more structured (Margalef 1963).

Given the number of models that produce the same few SADs, little can be said from the SADs alone, but our case study strongly supports a ‘‘between neutrality and structure’’ paradigm: SADs seem to result from a combination of ecological drift and ecosystem organisation, with these two elements having different weight in different groups of organisms.

Critical review of some related contributions

Here I discuss a few recent contributions that explain the origin of SADs and that do not entirely coincide with either Hubbell’s (2001) unified neutral theory or the integrated framework enunciated in this paper. I also show how the integrated approach sheds light on some old controversies in the ecological literature.
Bell (2000, 2001) developed a neutral model that differed from that of Hubbell. This must be considered a simple neutral model (SNM) as defined in the introduction, because it has the characteristic ingredients: ecological drift, a form of global regulation equally affecting all individuals regardless of their species, and a process analogous to speciation. The latter process is labelled as ‘‘migration’’ because it is based on a finite pool of species. However, for practical purposes it is equivalent to speciation, because the pool of species is large, the rate of ‘‘migration’’ small, and ‘‘migration’’ events are equiprobable for all species instead of following a plausible metacommunity distribution as in Hubbell’s model. Surprisingly, Bell finds a distribution that resembles the lognormal and Hubbell’s ZSM in that it is more humped than the logseries. This result is due to an artifact in Bell’s simulations. Take, for example, Fig. 1 in Bell (2000). This was obtained by prescribing an equal initial abundance for all species and then running 2000 iterations of his model. However, 2000 iterations are not enough to reach the final steady-state SAD. After some tens of thousands of iterations, the SAD that results from Bell’s model is a logseries.

Pachepsky et al. (2001) presented another dynamic model to explain plant SADs. They explicitly introduced many physiological traits that differ between species, but eventually reduced this setting to a tradeoff between fecundity and time to reproduction. This implies a difference between species in the statistical distribution of reproduction events, but (i) reproduction and mortality events are still random events at an individual level, (ii) reproduction rate equals mortality rate for all species, and (iii) there is no regulation mechanism differentiating between species.
If we take Appendix B and the subsection ‘‘The meaning of the logseries’’ into account, it is clear that the above model is a neutral model and must lead to a logseries, like other neutral models. However, Pachepsky et al. report a lognormal. This is because they introduced no speciation or immigration, so all species except one must eventually become extinct, and what they studied is a transient state, like Bell (2000, 2001). In this case, there is a depression of probabilities at the lower end of the range because it is there that species become extinct without being replaced. This model is not valid for explaining the lognormal-like shape of SADs in nature, but is an example of how moderate modifications of the logseries lead to the lognormal.

Magurran and Henderson (2003) analysed the SAD of an estuarine fish community and reached some conclusions on its origin. They examined the SAD for the whole data set and also the SADs for the species that had been recorded for either less or more than 10 years out of 21. The set of long-lasting species had a lognormal distribution. The authors thus proposed establishing a distinction between the core species in a community, which would have a lognormal distribution, and occasional immigrants that usually attain low numbers and would be responsible for the ‘‘left skew’’ as compared to the lognormal, often claimed for empirical SADs. I reanalysed the data and found that the full set approaches a power law with b ≈ 1.3. For most conceivable models, either neutral or non-neutral, sporadic species are more likely to be rare. It is not surprising that their removal makes the shape of the distribution more humped and that this must then be fitted by a lognormal instead of a power law, as expected from my Taylor series approach. The set of occasional species still approaches a power law, but with a larger b (b ≈ 1.65), which is not surprising either.
These observations might suggest that the results reported by Magurran and Henderson (2003) do not contribute to our theoretical framework, but in fact they do, thanks to an interesting observation in their paper: their ‘‘occasional’’ species were attributed to non-estuarine habitats in the literature much more often than their ‘‘core’’ species. This is empirical proof of a specific non-neutral mechanism affecting the SAD in this community. In particular, it is the mechanism operating in the model of ‘‘source–sink competitive metacommunity’’ described by Mouquet and Loreau (2003): community-level SADs are affected by the presence of species with low local fitness that repeatedly immigrate from other communities with habitats to which these species are better adapted. There can be little doubt that this mechanism contributes to the estuarine fish species studied by Magurran and Henderson displaying b ≈ 1.3 instead of the value b = 1, which we would expect from the neutral theory.

Besides the origin of the lognormal-like SAD, the main related issues discussed in the literature (Magurran and Henderson 2003) have been its seeming left skew in Preston’s representation, and the observations by Preston (1962) himself on the ‘‘canonical’’ lognormal (Sugihara 1980). In the subsection ‘‘When the logseries is not enough’’, I give a possible explanation for the left skew. The integrated approach also gives some clues on Preston’s canonical lognormal.

Preston (1948) distributed the abundances of species by multiplicative intervals similar to my intervals $[2^j, 2^{j+1})$ (the difference is that Preston’s intervals overlap, but this does not affect the following results). Preston’s best known representation of species abundances consists of j vs the number of species in the bin. However, in addition to this ‘‘species curve’’, he also calculated what he called the ‘‘individuals curve’’: j vs the sum of the abundances of the species that belong to the bin.
In several samples, he found a shape like the lower half of a Gauss bell. Assuming that abundances are lognormally distributed, Preston (1962) noted that this empirical result suggested a constraint in the relationship between the two parameters of the lognormal. He called this particular case of lognormal the ‘‘canonical’’ lognormal.

According to the integrated approach, the lognormal SAD results from a small deviation from a power law SAD, with a slope parameter b not far from b = 1. In the case of a power law, the number of individuals in bin j of Preston’s ‘‘individuals curve’’ will be

\[ S \int_{2^j}^{2^{j+1}} x \, k x^{-b} \, dx = S k \, \frac{2^{2-b} - 1}{2-b} \, e^{[(2-b)\log(2)] j} \]

in a continuous approximation, where S is the number of species in the sample. This function increases exponentially with increasing j if b < 2, as is the case for b close to 1. This will hold for the whole distribution range except at the upper end, where the bending function will produce a downward inflexion. This result is similar to half a Gauss bell. If lognormal SADs result from small deviations from power laws, these will often display the effect found by Preston (1962).

Practical consequences

The analysis of an empirical SAD can be performed as follows:

1) Represent the SAD as explained in Methods.
2) Determine the number of terms in Eq. 11 required for fitting the SAD. As shown in the ‘‘Case study’’ above, a single factor will sometimes give a reasonable fit, which simplifies matters. On other occasions, two factors and, exceptionally, more will be required.
3) Estimate the corresponding parameters. At this step all the information in the SAD will probably have been exhausted.
4) Try to find quantitative relationships between functional parameters (related to migration rate, niche segregation, compensatory mortality, etc.) and the descriptive parameters of the SAD. This is not possible by studying the SAD alone, because each descriptive parameter may well be a function of more than one functional parameter.
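Steps 1 and 3 of the list above can be sketched in a few lines for the simplest case in which a single power law term suffices. The sketch below is only an illustration under stated assumptions: it bins abundances into the multiplicative intervals $[2^j, 2^{j+1})$ mentioned in connection with Preston's representation, divides each count by the bin width and by S to obtain a density, and estimates b as minus the least-squares slope on log-log axes. The exact representation described in Methods may differ in detail, and the function names (`sample_power_law`, `binned_density`, `fit_slope`) are hypothetical helpers introduced here, not part of the original analysis.

```python
import math
import random


def sample_power_law(b, size, n0, n_max, seed=0):
    """Draw `size` abundances from f(n) proportional to n**(-b) on [n0, n_max).

    Inverse-CDF sampling for a continuous power law with b != 1
    (synthetic data used only to exercise the functions below).
    """
    rng = random.Random(seed)
    lo, hi = n0 ** (1.0 - b), n_max ** (1.0 - b)
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / (1.0 - b))
            for _ in range(size)]


def binned_density(abundances):
    """Empirical density on the bins [2**j, 2**(j+1)).

    Returns (log10 of the geometric bin centre, log10 of the density) for
    every non-empty bin, where density = species in bin / (S * bin width).
    """
    s = len(abundances)
    counts = {}
    for n in abundances:
        j = int(math.floor(math.log2(n)))
        counts[j] = counts.get(j, 0) + 1
    points = []
    for j in sorted(counts):
        width = 2.0 ** j              # 2**(j+1) - 2**j
        centre = 2.0 ** (j + 0.5)     # geometric mid-point of the bin
        density = counts[j] / (s * width)
        points.append((math.log10(centre), math.log10(density)))
    return points


def fit_slope(points):
    """Ordinary least-squares slope of y against x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic community with a known slope parameter b = 1.3.
    sample = sample_power_law(b=1.3, size=5000, n0=1.0, n_max=2.0 ** 17)
    print("estimated b:", -fit_slope(binned_density(sample)))
```

For a pure power law the binned density falls on a straight line of slope -b in this representation, so the printed estimate should lie close to the value used to generate the sample. With real data, the upper bins would also carry the bending function and would need the additional terms discussed in step 2.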
Supporters of the neutral theory could argue that, when two functional parameters suffice for explaining the type of SAD, it is unnecessary to consider other parameters. This line of reasoning is seriously flawed: while the inclusion of more parameters will not have a qualitative effect on the shape of the predicted SAD (or the SAR), it may substantially alter any quantitative prediction. For example, the speciation rate that can sustain a given number of species will change dramatically with just a small niche segregation or compensatory mortality, and so will the effects of ecosystem fragmentation. If we reach some results at the fourth of the above levels, we will be in a better position to predict how different forms of anthropogenic interference will affect diversity patterns, and perhaps also to reach a deeper understanding of the role of diversity in ecosystems.

Neutral theory will have played a pivotal role in paving the road to this stage, but along the way it will have to cease to be neutral. Everything seems to indicate that the differences between species have much greater ecological importance than neutral theory might suggest.

Acknowledgements – I thank a number of colleagues for useful comments and discussions: D. Alonso, E. Clavero, J. Flos, E. Gutiérrez, D. Jou, J. Martínez-Alier, A. McKane, J. L. Pretus, M. A. Rodríguez, R. V. Solé, and especially B. McGill and R. Margalef. I also thank J. Flos for facilitating the continuity of my research. I dedicate this paper to the late Ramon Margalef, who was one of my main sources of inspiration (and data).

References

Alonso, D. and McKane, A. J. 2004. Sampling Hubbell’s neutral theory of biodiversity. – Ecol. Lett. 7: 901–910.
Bell, G. 2000. The distribution of abundance in neutral communities. – Am. Nat. 155: 606–617.
Bell, G. 2001. Neutral macroecology. – Science 293: 2413–2418.
Bulmer, M. G. 1974. Fitting Poisson lognormal distribution to species–abundance data. – Biometrics 30: 101–110.
Caswell, H. 1976. Community structure: a neutral model analysis. – Ecol. Monogr. 46: 327–354.
Chave, J. 2004. Neutral theory and community ecology. – Ecol. Lett. 7: 241–253.
Chave, J., Muller-Landau, H. C. and Levin, S. A. 2002. Comparing classical community models: theoretical consequences for patterns of diversity. – Am. Nat. 159: 1–23.
Engen, S. 1978. Stochastic abundance models. – Chapman and Hall.
Engen, S. and Lande, R. 1996. Population dynamic models generating the lognormal species abundance distribution. – Math. Biosci. 132: 169–183.
Fisher, R. A. 1943. A theoretical distribution for the apparent abundance of different species. – J. Anim. Ecol. 12: 54–57.
Frieden, R. 1985. Estimating occurrence laws with maximum probability, and the transition to entropic estimators. – In: Smith, C. R. and Grandy Jr., W. T. (eds), Maximum-entropy and Bayesian methods in inverse problems. Reidel, Dordrecht, pp. 133–169.
Hubbell, S. P. 2001. The unified neutral theory of biodiversity and biogeography. – Princeton Univ. Press.
Hubbell, S. P. 2003. Modes of speciation and the lifespans of species under neutrality: a response to the comment of Robert E. Ricklefs. – Oikos 100: 193–199.
Hutchinson, G. E. 1961. The paradox of the plankton. – Am. Nat. 95: 137–145.
Jaynes, E. T. 1983. Papers on probability, statistics and statistical physics (Rosenkrantz, R. D., ed.). – Reidel, Dordrecht.
Karlin, S. and McGregor, J. 1967. The number of mutant forms maintained in a population. – In: Proc. 5th Berkeley Symp. Math. Statist. Prob. IV, pp. 415–438.
MacArthur, R. 1960. On the relative abundance of species. – Am. Nat. 94: 25–36.
Magurran, A. E. and Henderson, P. A. 2003. Explaining the excess of rare species in natural species abundance distributions. – Nature 422: 714–716.
Mandelbrot, B. 1963. New methods in statistical economics. – J. Polit. Econ. 71: 421–440.
Mandelbrot, B. B. 1983. The fractal geometry of nature. – W. H. Freeman.
Margalef, R. 1963. On certain unifying principles in ecology. – Am. Nat. 97: 357–373.
Margalef, R. 1978. Life-forms of phytoplankton as survival alternatives in an unstable environment. – Oceanol. Acta 1: 493–509.
Margalef, R. 1994. Through the looking glass: how marine phytoplankton appears through the microscope when graded by size and taxonomically sorted. – Sci. Mar. 58: 87–101.
May, R. M. 1975. Patterns of species abundance and diversity. – In: Cody, M. L. and Diamond, J. M. (eds), Ecology and evolution of communities. The Belknap Press of Harvard Univ. Press, pp. 81–120.
McGill, B. J. 2003a. Strong and weak tests of macroecological theory. – Oikos 102: 679–685.
McGill, B. J. 2003b. A test of the unified neutral theory of biodiversity. – Nature 422: 881–885.
McKane, A. J., Alonso, D. and Solé, R. V. 2004. Analytic solution of Hubbell’s model of local community dynamics. – Theor. Popul. Biol. 65: 67–73.
Montroll, E. W. and Shlesinger, M. F. 1982. On 1/f noise and other distributions with long tails. – Proc. Natl Acad. Sci. USA 79: 3380–3383.
Morse, D. R., Lawton, J. H., Dodson, M. M. et al. 1985. Fractal dimension of vegetation and the distribution of arthropod body lengths. – Nature 314: 731–733.
Mouquet, N. and Loreau, M. 2003. Community patterns in source–sink metacommunities. – Am. Nat. 162: 544–557.
Nicolis, G. and Prigogine, I. 1977. Self-organization in nonequilibrium systems. From dissipative structures to order through fluctuations. – John Wiley & Sons.
Pachepsky, E., Crawford, J. W., Bown, J. L. et al. 2001. Towards a general theory of biodiversity. – Nature 410: 923–926.
Peters, H. A. 2003. Neighbour-regulated mortality: the influence of positive and negative density dependence on tree populations in species-rich tropical forests. – Ecol. Lett. 6: 757–765.
Preston, F. W. 1948. The commonness, and rarity, of species. – Ecology 29: 254–283.
Preston, F. W. 1962. The canonical distribution of commonness and rarity. – Ecology 43: 185–215, 410–432.
Pueyo, S. 2006. Self-similarity in species abundance distribution and in species area relationship. – Oikos 112: 156–162.
Siemann, E., Tilman, D. and Haarstad, J. 1996. Insect species diversity, abundance and body size relationships. – Nature 380: 704–706.
Siemann, E., Tilman, D. and Haarstad, J. 1999. Abundance, diversity and body size: patterns from a grassland arthropod community. – J. Anim. Ecol. 68: 824–835.
Sugihara, G. 1980. Minimal community structure: an explanation of species abundance patterns. – Am. Nat. 116: 770–787.
Vallade, M. and Houchmandzadeh, B. 2003. Analytical solution of a neutral model of biodiversity. – Phys. Rev. E 68: 061902.
Volkov, I., Banavar, J. R., Hubbell, S. P. et al. 2003. Neutral theory and relative species abundance in ecology. – Nature 424: 1035–1037.
Volkov, I., Banavar, J. R., Maritan, A. et al. 2004. The stability of forest biodiversity. – Nature 427: 696–697.
Wagensberg, J., López, D. and Valls, J. 1988. Statistical aspects of biological organization. – J. Phys. Chem. Solids 49: 695–700.
Watterson, G. A. 1974. Models for the logarithmic species abundance distributions. – Theor. Popul. Biol. 6: 217–250.
Whitfield, J. 2002. Neutrality versus the niche. – Nature 417: 480–481.

Subject Editor: Per Lundberg

Appendix A: Data analysis

The main methodological innovation in data analysis in this paper is the type of SAD representation (Methods). Here I add some details on parameter estimation and confidence intervals.

Whenever there is a truly good fit, I seek the maximum likelihood estimator (m. l. e.) of the parameters (Bulmer 1974). Take the abundances $n_i$ for species i = 1 to S in the sample. For the statistical distribution that we assume, these will have some probabilities $\{p(n_i; w_1 \ldots w_q)\}$ or densities of probability $\{f(n_i; w_1 \ldots w_q)\}$, depending on a set of parameters $w_1 \ldots w_q$. The m. l. e.
consists of the values of the parameters that maximise either $\sum_i \log(p(n_i))$ or $\sum_i \log(f(n_i))$, which is equivalent to maximising the ensemble probability of our set of abundances.

For the interval of abundances $n \in [10, 1000)$, the marine phytoplankton data in my case study are well fitted by a continuous power law

\[ f(n) = \frac{b-1}{n_0^{1-b} - n_M^{1-b}} \, n^{-b} \]

where $n_0$ and $n_M$ are the lower and upper bounds, $n_0 = 10$ and $n_M = 1000$. In this interval, I obtain the m. l. e. of b by iteratively searching the value $\hat{b}$ that maximises

\[ \log\left( \frac{\hat{b}-1}{n_0^{1-\hat{b}} - n_M^{1-\hat{b}}} \right) - \hat{b} \log(g) \]

where g is the geometric mean of the data. It is straightforward to find confidence intervals and perform hypothesis tests, because the only source of error for this estimator is the variability in log(g), which has a Gaussian distribution when the number of species is large.

When referring to the whole SAD instead of a particular interval, the power law is a convenient but inexact approximation, and the m. l. e. is not reliable. Therefore, I estimate b by simple regression.

In the case of the logseries, I seek the m. l. e. of Fisher’s parameter α of the logseries in its discrete form

\[ p(n) = k n^{-1} e^{-\phi n} \]

Fisher (1943) himself gave the method. It consists of iteratively searching the values of k and φ (Eq. 1) that satisfy:

\[ \begin{cases} k^{-1} = \log\left( 1 + \dfrac{N}{kS} \right) \\[1ex] \phi = \log\left( 1 + \dfrac{kS}{N} \right) \end{cases} \]

and then calculating α = kS. α is a parameter independent of sample size, which allows k and φ to be obtained as a function of N.

Appendix B: why simple neutral models produce a logseries SAD

In the subsection ‘‘The meaning of the logseries’’ above, I state that the logseries distribution which we obtain from simple neutral models (SNMs) has two components, a power law with b = 1 and an exponential bending function, and each has a different origin (Fig. 1).

The origin of the power law is easy to find by means of the diffusion equations by Engen and Lande (1996). If we take a continuous abundance n, the probability density function (p. d. f.)
of a set of noninteracting species with the same dynamics will have the form

\[ f_0(n) \propto \frac{1}{v(n)} \exp\left\{ \int_1^n \frac{2\eta(u)}{v(u)} \, du \right\} \tag{B1} \]

(from Eq. 11 in Engen and Lande 1996), where ∝ means ‘‘proportional to’’, η(n) is the expected change in the abundance of a given species in a small interval of time when its initial abundance is n, and v(n) is the variance of this change.

In a community entirely driven by ecological drift,

\[ \eta(n) = 0 \tag{B2} \]

\[ v(n) \propto n \tag{B3} \]

Equation B3 results from the fact that the variance of the sum of a set of independent variables is the sum of their variances (in this case, each variable is the number of descendants of one of the members of a given species, born and not dead in a short interval of time, minus one in case the parent dies in this same interval). From Eq. B1–B3, we find:

\[ f_0(n) \propto n^{-1} \tag{B4} \]

In contrast, take the extreme situation of population fluctuations driven entirely by environmental noise with a completely different effect on each species. Environmental noise synchronises the organisms of the same species, so $v(n) \propto n^2$. Since we assume independence between species, we will obtain $f_0(n) \propto n^{-2}$.

In either of the two cases, the power law results from random reproduction and death events. The only additional factor in SNMs to which we can attribute the bending function is the regulation of total community size.

Why does the regulation of community size specifically produce an exponential bending function (for large N and S)? This can be explained by applying a method which is well known in statistical physics under the name of ‘‘maximum entropy formalism’’ (MAXENT). Interestingly, shortly after Jaynes introduced this method in statistical physics in 1957 (Jaynes 1983), MacArthur (1960) introduced it in the field of biological diversity, with no specific name.
While this became an established method in statistical physics, in the case of ecology I am aware of only some isolated attempts at ‘‘reintroduction’’ following the work of E. T. Jaynes (Wagensberg et al. 1988). Here I use MAXENT, but it would not be appropriate to apply it in its original form. Instead, I apply a generalised version called the Kullback-Leibler norm (Frieden 1985). I do not explain the theoretical foundations of this methodology; however, these can be found in the above references.

In broad conditions, the generalised MAXENT method allows the transformation of the statistical distribution $f_0$ for a set of non-interacting entities into the statistical distribution f to expect when we add a constraint of the form

\[ \sum_{i=1}^{S} h(n_i) = k \tag{B5} \]

for a constant k and a given function h. The p. d. f. that results for large N and S is

\[ f(n) \propto f_0(n) \, e^{-\phi h(n)} \tag{B6} \]

The regulation of community size in neutral models either has the form of a zero-sum rule or is nearly equivalent. The zero-sum rule consists of imposing a fixed community size N, i. e. applying Eq. B5 with h(n) = n and k = N:

\[ \sum_{i=1}^{S} n_i = N \]

From Eq. B6, the p. d. f. that results from this constraint will be:

\[ f(n) \propto f_0(n) \, e^{-\phi n} \tag{B7} \]

Since $f_0$ satisfies Eq. B4 for SNMs, we obtain the logseries (Eq. 2).

MacArthur (1960) developed his own version of MAXENT in order to find the SAD that results from the zero-sum rule alone, without ecological drift. He assumed that all abundances are equally probable a priori, i. e. a uniform $f_0$. It follows from Eq. B7 that the resulting SAD is exponential. This SAD became widely known in the ecological literature under the name ‘‘broken stick distribution’’.

I conclude that the logseries equation combines the independent outcome of two distinct mechanisms: the power law with b = 1 results from ecological drift, and the exponential bending function results from the constraint on community size.
The first part of the equation is unaffected by the presence or absence of this or similar constraints, while the second part is not necessarily affected by the rules that govern the abundances of single species in the absence of constraints.
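As a numerical companion to the two estimators described in Appendix A, the following sketch may help in checking the formulas. It is a minimal illustration under stated assumptions, not the procedure actually used for Table 1. The logseries parameters are found by bisection on S = α log(1 + N/α), which is the first equation of Fisher's system rewritten with α = kS. The power law estimate maximises the log-likelihood per species given in Appendix A by golden-section search, which is one possible choice of iterative search. All function names are hypothetical.

```python
import math
import random


def fisher_parameters(n_individuals, n_species):
    """Return (alpha, k, phi) for the logseries p(n) = k * n**-1 * exp(-phi*n).

    Solves S = alpha * log(1 + N / alpha) by bisection, then uses
    alpha = k * S and phi = log(1 + k * S / N).
    """
    if not 0 < n_species < n_individuals:
        raise ValueError("need 0 < S < N")

    def excess(alpha):
        return alpha * math.log1p(n_individuals / alpha) - n_species

    lo, hi = 1e-12, 1.0
    while excess(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):             # excess() increases with alpha
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    return alpha, alpha / n_species, math.log1p(alpha / n_individuals)


def power_law_mle(abundances, n0=10.0, n_max=1000.0):
    """Estimate b for f(n) proportional to n**(-b) on [n0, n_max)."""
    data = [n for n in abundances if n0 <= n < n_max]
    if not data:
        raise ValueError("no abundances inside the interval")
    log_g = sum(math.log(n) for n in data) / len(data)  # log geometric mean

    def loglik(b):
        # log((b - 1) / (n0**(1-b) - n_max**(1-b))) - b * log(g)
        if abs(b - 1.0) < 1e-9:      # limit of the normalisation at b = 1
            norm = math.log(n_max / n0)
        else:
            norm = (n0 ** (1.0 - b) - n_max ** (1.0 - b)) / (b - 1.0)
        return -math.log(norm) - b * log_g

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.01, 5.0               # assumed search range for b
    while hi - lo > 1e-8:            # golden-section search for the maximum
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if loglik(c) > loglik(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


def sample_power_law(b, size, n0, n_max, seed=0):
    """Synthetic abundances from a continuous power law (b != 1)."""
    rng = random.Random(seed)
    lo, hi = n0 ** (1.0 - b), n_max ** (1.0 - b)
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / (1.0 - b))
            for _ in range(size)]


if __name__ == "__main__":
    # Full Mediterranean set from Table 1: N = 162 478 cells, S = 353 species.
    print(fisher_parameters(162478, 353))
    print(power_law_mle(sample_power_law(1.45, 3000, 10.0, 1000.0)))
```

With the Mediterranean totals, the first value printed should be close to the α = 42.8 reported in Table 1, and the second line should recover a slope near the b = 1.45 used to generate the synthetic sample. The search range for b and the synthetic sampler are conveniences of this sketch only.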