Palaeontological community and diversity analysis - brief notes

Øyvind Hammer
Paläontologisches Institut und Museum, Zürich
[email protected]

Zürich, June 3, 2002

Contents
1 Introduction
2 The basics of palaeontological community analysis
3 Comparing samples
4 Cluster analysis
5 Ordination and gradient analysis
6 Diversity
7 Curve fitting
8 Time series analysis
Bibliography

Chapter 1
Introduction

Palaeontology is becoming a quantitative subject, like other sciences such as biology and geology. The demands are increasing on palaeontologists to support their conclusions using statistics and large databases. Palaeontology is also moving in the direction of becoming more analytical, in the sense that fossil material is used to answer questions about environment, evolution and ecology. A quick survey of recent issues of journals such as Paleobiology or Lethaia will show this very clearly indeed.

In addition to hypothesis testing, quantitative analysis serves other important purposes. One of them is sometimes referred to as 'fishing' (or data mining), that is, searching for unknown patterns in the data that may give us new ideas. Finally, quantitative methods will often indicate that the material is not sufficient, and can give information about where we should collect more data.

This short text will describe some methods for quantitative treatment of palaeontological material with respect to community analysis, biogeography and biodiversity. In practice, this means using computers. It goes without saying that all the methods rely on good data. With incomplete or inaccurate data, almost any result can be achieved. Quantitative analysis has not made 'old-fashioned' data collection redundant - quite the opposite is true.

In this little course we cannot go into any depth about the mathematical and statistical basis for the methods - the aim is to give a practical overview. But if such methods are used in work that is to be published, it is important that you know more about the underlying assumptions and the possible pitfalls. For such information I can only refer to the literature.

The PAST software

In this course we will use a simple, but relatively comprehensive computer program for Windows, called PAST (PAlaeontological STatistics). This program has been designed for teaching, and contains a collection of data analysis methods that are commonly used by palaeontologists in a user-friendly package. It can also be used for 'real' work. PAST is free, and can be downloaded to your own computer from this address:

http://folk.uio.no/ohammer/past/

Here you will also find the manual for the program, and a number of 'case studies' demonstrating the different methods. Some of these examples will be used in the course.

Chapter 2
The basics of palaeontological community analysis

The starting point for most community analysis is the occurrence matrix. This table consists of rows representing samples, and columns representing taxa (often species). The occurrences of the taxa in the different samples can be in the simple form of presence or absence, conventionally coded with a 1 or a 0, or they can be given in terms of specimen counts (abundance data). Whether to use presence/absence or abundance depends on the material and the aims of the analysis, but it is generally preferable to try to collect abundance data if possible. Abundance can easily be converted to presence/absence, but the converse is impossible!

                 Quarry A   Outcrop X   Outcrop Y
    A. expansus       143          12           9
    P. ridiculos       13          56          73
    M. limbata          0           3          14
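As a concrete illustration, the small occurrence matrix above can be represented and manipulated directly in a few lines of code. This is only a sketch (PAST itself handles this through its spreadsheet interface); the array layout, with one row per sample as in the convention of this text, is the transpose of the printed table.

    import numpy as np

    # Occurrence matrix: one row per sample, one column per taxon.
    # Columns: A. expansus, P. ridiculos, M. limbata
    abundance = np.array([
        [143,  13,  0],   # Quarry A
        [ 12,  56,  3],   # Outcrop X
        [  9,  73, 14],   # Outcrop Y
    ])

    # Abundance is easily reduced to presence/absence; the reverse is impossible.
    presence = (abundance > 0).astype(int)
    print(presence)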
The samples (rows) may come from different localities that are supposed to be of the same age, or they may come from different levels in a single or composite section, in which case the rows should be arranged in stratigraphical order. The former type of data forms the basis of biogeographical and ecological studies, while the latter involves geological time and is unique to palaeontology. The analysis of stratigraphically ordered samples borders upon biostratigraphy, but pure biostratigraphical analysis with the purpose of correlation will not be treated in this text.

In practice, most occurrence matrices have a number of features that can cause problems for analysis. They are normally sparse, meaning that they have many zero entries. They are almost always noisy, meaning that if there is some structure present it will normally be degraded by errors, taxonomical confusion, missing data and other 'random' variation. And they are commonly redundant, meaning that many samples will have similar taxon composition and many taxa will have similar distributions over samples.

Ideally, a sample should represent an unbiased random selection of individuals that actually lived together in the same place and at the same time (a census). This is rarely the case in palaeontology, where post-mortem transportation and time-averaging due to slow sedimentation and bioturbation cause mixing of fossil communities both in space and time. In addition, sorting and differential preservation potential can severely bias both presence/absence data and (even more) abundance data. This can invalidate some assumptions of some statistical tests, but it does not invalidate the whole field of palaeontological community analysis. In many cases the samples probably do represent unbiased selections, within a fossil group at least. Unless there has been some very selective and serious hydrodynamical sorting, we can hope that, for example, a sample of gastropod shells within a limited size range is relatively unbiased. Time-averaging on the order of a few hundred or perhaps even a few thousand years is not necessarily detrimental if the communities were reasonably stable throughout this time. Still, we need to always keep these potential problems in mind.

The analysis of occurrence matrices can take many different directions. We may simply want to compare samples in a pairwise manner, to test statistically whether two samples should be considered to have different compositions. Large numbers of samples may be divided into groups according to similarity (cluster analysis), and these groups may be interpreted in terms of biogeographical regions or facies. Samples may also be ordered in a continuum according to their taxon content (ordination), which can be interpreted in terms of an environmental gradient. So far we have discussed similarities between samples, which we can refer to as sample-centered analysis (also known as Q-mode analysis). We could also compare taxa, and look at which species tend to co-occur. This can be called taxon-centered (or R-mode) analysis.

The occurrence matrix represents a multivariate data set, where each data point (sample) is described using a number of values (taxon occurrences). Analysis is much simplified if we can reduce this data set by extracting a single parameter for each sample, describing some aspect of its taxon composition.
Many such parameters are used by ecologists, attempting to measure qualities such as species richness or dominance (the numerical dominance of one or a few species). When such a parameter, for example the number of species, is extracted for a number of samples in stratigraphical order, we have a univariate time series which can be analyzed in order to detect trends or cycles, perhaps associated with changes in climate or sea level.

All these methods will be covered in the course, with examples from the 'real world'.

Chapter 3
Comparing samples

The comparison of samples from different localities or stratigraphical levels forms the basis of much of community analysis. Such comparison can be done within a stringent statistical framework by using the chi-squared test, or we can use a 'heuristic' distance measure.

The chi-squared test (counted data in categories)

The chi-squared ($\chi^2$) test is designed for comparing two samples consisting of the number of occurrences within a set of categories. This makes it an appropriate method for testing whether two samples with taxon abundance data are likely to have been taken from the same community. The test gives the probability that the samples are the same, and if this value is very low (say $p < 0.05$) we can say that the null hypothesis of the samples being taken from the same community is rejected at a significance level of 0.05. As with most statistical tests we can sometimes reject the null hypothesis of equality, but we can never confirm it with statistical significance.

The chi-squared test assumes that all taxon abundance values within a sample are larger than 5. A proper execution of this test may therefore require the removal of rare taxa or fossil-poor samples. Another detail concerns the number of constraints, which needs to be set by the user. This would normally be left at 0, but should be set to 1 if the abundance data have been normalized to percentage values.

As an example, consider the data from the Llanvirnian (Ordovician) as presented in Case Study 8. Abundance data have been collected at ten levels in a section. For statistically testing whether the samples come from the same population using the chi-squared test, we first remove the rarest taxa, but the nature of this data set is such that we must retain some abundance values with fewer than 5 specimens. This possible source of error must be kept in mind, but the chi-squared test should be reasonably robust to this violation of the assumptions. Comparing successive samples, we find that the null hypothesis of the samples coming from the same population can be rejected at $p < 0.05$ for all the successive sample pairs from sample 3 to sample 8. The community thus seems to be changing throughout this interval. The null hypothesis cannot be rejected for sample pairs 1-2, 2-3, 8-9 and 9-10, meaning that we cannot assume that these sample pairs are from different populations.
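In code, a pairwise chi-squared comparison of two abundance samples might look like the sketch below. The counts are made up for illustration, and scipy's chi2_contingency is used in place of PAST's own implementation (it computes the expected counts and degrees of freedom from the 2-by-k contingency table automatically).

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical abundance counts for the same four taxa in two samples
    sample1 = np.array([34, 18, 56, 7])
    sample2 = np.array([21, 30, 42, 11])

    # Stack into a 2 x k contingency table and test for homogeneity
    chi2, p, dof, expected = chi2_contingency(np.array([sample1, sample2]))
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
    if p < 0.05:
        print("Reject the null hypothesis of a common parent community.")
    else:
        print("Cannot reject the null hypothesis.")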
Sample similarity measures for presence/absence data

A large number of heuristic indices have been defined for measuring the distance between two samples containing taxon occurrences. They can be divided into two groups: those using presence/absence data and those using abundance data. For presence/absence data the following distance measures should be mentioned:

Jaccard similarity. A match is counted for all taxa with presences in both samples. Using $M$ for the number of matches and $N$ for the total number of taxa with presences in just one sample, we have

    Jaccard similarity = $M / (M + N)$

Dice (Sorensen) coefficient. Puts more weight on joint occurrences ($M$) than on mismatches:

    Dice similarity = $2M / (2M + N)$

Simpson similarity, defined as $M / N_{min}$, where $N_{min}$ is the smaller of the numbers of presences in the two samples. This index treats two associations as identical if one is a subset of the other, making it useful for fragmentary data.

Raup-Crick index for presence/absence data. This index (Raup & Crick 1979) uses a randomization ('Monte Carlo') procedure, comparing the observed number of species occurring in both samples with the distribution of co-occurrences in 200 pairs of random replicates of the pooled sample. It is an example of a more general class of similarity index based on bootstrapping (see the chapter on Diversity).

All these indices range from 0 (no similarity) to 1 (identity). Further information can be found in Krebs (1989), Magurran (1988) and Ludwig & Reynolds (1988).

Sample similarity measures for abundance data

For two samples with abundances $x_i$ and $y_i$ of taxon $i$, the following measures are available:

The Euclidean distance:

    $d = \sqrt{\sum_i (x_i - y_i)^2}$

Correlation (of the variables along rows) using Pearson's $r$.

Correlation using Spearman's rho (basically the $r$ value of the ranks).

Bray-Curtis distance, sensitive to absolute abundances:

    $d = \frac{\sum_i |x_i - y_i|}{\sum_i (x_i + y_i)}$

Chord distance for abundance data. This index is sensitive to species proportions and not to absolute abundances. It projects the two multivariate sample vectors onto a hypersphere and measures the distance between these points, thus normalizing abundances to 1:

    $d = \sqrt{2 - 2\,\frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}}}$

Morisita's similarity index for abundance data. With $n_1 = \sum_i x_i$ and $n_2 = \sum_i y_i$, define

    $\lambda_1 = \frac{\sum_i x_i (x_i - 1)}{n_1 (n_1 - 1)}$,   $\lambda_2 = \frac{\sum_i y_i (y_i - 1)}{n_2 (n_2 - 1)}$,

    Morisita similarity = $\frac{2 \sum_i x_i y_i}{(\lambda_1 + \lambda_2)\, n_1 n_2}$   (3.1)

This index was recommended by Krebs (1989).

The existence of all these indices is highly confusing. The Euclidean index is often used, but the Chord distance and the Morisita index may perform better for community analysis. See also Krebs (1989). If your samples are characterized by high dominance (overwhelming numerical abundance of one or a few species), you may choose to take the logarithm of all abundance values before measuring distance. This will put a smaller weight on the dominant taxa, allowing the rarer taxa to contribute more to the distance value. A code sketch of a few of these indices is given below.
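The indices above are all short one-liners in code. The following sketch implements a few of them with numpy, directly from the definitions given in this chapter; the function names are my own.

    import numpy as np

    def jaccard(a, b):
        """Jaccard similarity for two presence/absence (0/1) vectors."""
        a, b = np.asarray(a, bool), np.asarray(b, bool)
        m = np.sum(a & b)                 # joint presences
        n = np.sum(a ^ b)                 # presences in just one sample
        return m / (m + n)

    def dice(a, b):
        """Dice (Sorensen) coefficient for two presence/absence vectors."""
        a, b = np.asarray(a, bool), np.asarray(b, bool)
        m, n = np.sum(a & b), np.sum(a ^ b)
        return 2 * m / (2 * m + n)

    def bray_curtis(x, y):
        """Bray-Curtis distance for two abundance vectors."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        return np.sum(np.abs(x - y)) / np.sum(x + y)

    def chord(x, y):
        """Chord distance: Euclidean distance between normalized vectors."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        cos = np.sum(x * y) / (np.linalg.norm(x) * np.linalg.norm(y))
        return np.sqrt(2 - 2 * cos)

    def morisita(x, y):
        """Morisita similarity for two abundance (count) vectors."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        n1, n2 = x.sum(), y.sum()
        l1 = np.sum(x * (x - 1)) / (n1 * (n1 - 1))
        l2 = np.sum(y * (y - 1)) / (n2 * (n2 - 1))
        return 2 * np.sum(x * y) / ((l1 + l2) * n1 * n2)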
Chapter 4
Cluster analysis

Cluster analysis means finding groupings of samples (or taxa), based on an appropriate distance measure. Such groups can then be interpreted in terms of biogeography, environment and evolution. Hierarchical cluster analysis will produce a so-called dendrogram, where similar samples are grouped together. Similar groups are further combined in 'superclusters', etc. (fig. 4.1).

Figure 4.1: Dendrogram. The vertical axis is in units of group similarity.

Using cluster analysis of samples, we can for example see whether limestone samples group together with shale samples, or whether samples from Germany group together with those from France or those from England. For a stratigraphic sequence of samples, we can detect turnover events in the composition of communities. We can also cluster taxa (R mode). In this way we can detect associations (or 'guilds') of taxa, for example whether a certain brachiopod is usually found together with a certain crinoid. Many of the distance measures described above for comparing samples can also be used when comparing the distributions of taxa.

There are several algorithms available for hierarchical clustering. Most of these algorithms are agglomerative, meaning that they cluster the most similar items first, and then proceed by grouping the most similar clusters until we are left with a single, connected supercluster. In PAST, the following algorithms are implemented:

Mean linkage, also known as unweighted pair-group method using arithmetic averages (UPGMA). Clusters are joined based on the average distance between all members of the two groups.

Single linkage (nearest neighbour). Clusters are joined based on the smallest distance between the two groups.

Ward's method. Clusters are joined such that the increase in within-group variance is minimized. Being based on variance, this method makes most sense using the Euclidean distance measure.

For community analysis, the UPGMA algorithm is recommended (a code sketch is given at the end of this chapter). Ward's method seems to perform better when the Euclidean distance measure is chosen, but this is not the best distance measure for community analysis. It may however be useful to compare the dendrograms given by the different algorithms and different distance measures in order to informally assess the robustness of the groupings. If a grouping changes when trying another algorithm, that grouping should perhaps not be trusted.

It must be emphasized that cluster analysis by itself is not a statistical method, in the sense that no significance values are given. Whether a cluster is 'real' or not must be more or less informally decided on the basis of how well it is separated from other clusters (fig. 4.2). One approach may be to decide a priori on a cut-off value for the across-cluster similarity. More formal tests of significance exist as extensions to the basic clustering algorithms, but they are not in common use. Significance values based on testing whether two clusters could have been taken from the same population are not valid, because these clusters have already been constructed precisely in order to maximize the distance between them. This would be circular reasoning. Investigating the robustness of the clusters after random perturbations of the data might be a somewhat more fruitful approach.

More information on cluster analysis can be found in Krebs (1989), Ludwig & Reynolds (1988) and Jongman et al. (1995).

Figure 4.2: Dendrogram A shows two well separated clusters, while dendrogram B (with the same branching topology) is quite unresolved. The groups in dendrogram B must be interpreted with great caution.

Figure 4.3: Clustering of Ordovician trilobite families, with a distance measure based on the correlation of their generic diversities in four intervals. From Adrain et al. (1998). This analysis has been part of the foundation for splitting the Ordovician trilobites into two major 'evolutionary faunas' (Ibex and Whiterock). The two clusters are however not very well separated.
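To show what such an analysis can look like outside PAST, the sketch below runs UPGMA (average linkage) on a small, invented occurrence matrix using scipy; with Bray-Curtis as the distance measure this roughly mirrors the recommended setup for community analysis.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage

    # Hypothetical abundance matrix: 5 samples (rows) x 4 taxa (columns)
    samples = np.array([
        [143, 13,  0,  2],
        [130, 20,  1,  0],
        [ 12, 56,  3, 40],
        [  9, 73, 14, 35],
        [  1, 60, 22, 30],
    ])

    # Pairwise Bray-Curtis distances, then UPGMA (= 'average' linkage)
    dist = pdist(samples, metric='braycurtis')
    tree = linkage(dist, method='average')

    # Print the merge order; scipy.cluster.hierarchy.dendrogram would plot it
    print(tree)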
Chapter 5
Ordination and gradient analysis

Ordination means ordering the samples or the taxa along a line, or placing them in a low-dimensional space, in such a way that distances between them are preserved as far as possible. Concentrating on samples, each original sample is a data point in a high-dimensional space, with a number of variables equal to the number of taxa. Ordination means projection of this very complicated data set onto a low-dimensional space, be it 3D space, a plane or a line. If the variation in the original data set is mostly controlled by a single environmental gradient, we might be able to find a way of optimally projecting the points onto a line such that distances between samples are to a large degree preserved. This will simplify our study of the data, and the line (axis) found by the algorithm may be given an ecological interpretation.

There are two main types of ecological gradient analysis. Indirect gradient analysis proceeds as described above, where the gradient axis is found from the data in such a way that distances along the axis are preserved as much as possible. Direct gradient analysis means analysing the samples in terms of a gradient that was known a priori, such as a measured temperature or depth gradient. This latter type of analysis is rarely possible in palaeontology, and we will therefore concentrate on indirect methods. A thorough introduction to ordination is given by Jongman et al. (1995).

Principal components analysis

Principal components analysis (PCA) is a method that produces hypothetical variables (components), accounting for as much of the variation in the data as possible. The components are linear combinations of the original variables. This is a method of data reduction that in well-behaved cases makes it possible to present the most important aspects of a multivariate data set in two dimensions, in a coordinate system with axes that correspond to the two most important (principal) components. In addition, these principal components may be interpreted as reflecting real, underlying environmental variables (fig. 5.1).

Figure 5.1: Hypothetical example of PCA. 12 communities have been sampled from the Barents Sea. Only two species are included (polar bear and walrus). The 12 samples are plotted according to their species compositions. PCA implies constructing a new coordinate system with the sample centroid at the origin and with axes normal to each other, such that the first axis explains as much of the variation in the data as possible. In this case, we might for example interpret axis 1 in terms of temperature.

PCA is tricky to grasp in the beginning. What is the meaning of those abstract components? Consider another example, this time from morphometry. We have measured shell size (call it $a$), shell thickness ($b$) and a colour index ($c$) on 1000 foraminiferans of the same species but from different climatic zones. From these three variables the PCA produces three components. We are told that the first of these (component A) can explain 73 percent of the variation in the data, the second (B) explains 24 percent, while the last (C) explains 3 percent. We then assume that component A represents an important hypothetical variable which may be related to environment. The program also presents the 'loadings' of component A, that is, how much each original variable contributes to the component: in this case a large negative loading on $a$, a positive loading on $b$, and a loading close to zero on $c$. This tells us that A is a hypothetical variable that decreases sharply as $a$ (shell size) increases, but increases when $b$ (shell thickness) increases. The colour index has almost no correlation with A. We guess that A is an indicator of temperature. When temperature increases, shell size diminishes (organisms are often larger in colder water), but shell thickness increases (it is easier to precipitate carbonate in warm water). Plotting the individual specimens in a coordinate system spanned by the first two components supports this interpretation: we find specimens collected in cold water far to the left in the diagram (small A), while specimens from warm water are found to the right (large A).

It is sometimes argued that PCA assumes some statistical properties of the data set, such as multivariate normality and uncorrelated samples. While it is true that violation of these assumptions may degrade the explanatory strength of the axes, this is not a major worry. PCA, like other indirect ordination methods, is a descriptive method without statistical significance anyway.
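A minimal PCA can be written directly from this description: center the data, take the eigenvectors of the covariance matrix, and project. The sketch below does this with numpy on random stand-in data; in practice one would feed it the measurement or occurrence matrix.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 3))        # stand-in for 1000 specimens x 3 variables

    # Center each variable on its mean
    Xc = X - X.mean(axis=0)

    # Eigen-decomposition of the covariance matrix
    evals, evecs = np.linalg.eigh(np.cov(Xc.T))
    order = np.argsort(evals)[::-1]       # sort components by explained variance
    evals, evecs = evals[order], evecs[:, order]

    print("explained variance (%):", 100 * evals / evals.sum())
    print("loadings of component A:", evecs[:, 0])

    # Scores: positions of the specimens along the principal components
    scores = Xc @ evecs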
Ecological ordination involves sorting samples along one, two or sometimes more axes that are expected to relate to underlying environmental parameters or geography. PCA is a good method for ordination, but as we have seen it assumes linear relationships between the components and the original variables (fig. 5.2). This may sometimes indeed hold true to some extent, but it is also common that an original variable, such as the number of individuals of a species, displays a peak for a certain value of the environmental parameter. This value is then referred to as the optimum for the species. For example, the species may prefer a certain temperature, and become rarer for both higher and lower temperatures (fig. 5.3).

Figure 5.2: Hypothetical abundance of four species (A-D) along an environmental gradient. Each species has a linear dependence on the environmental parameter. B is indifferent with respect to the parameter. Such a linear abundance pattern is assumed by PCA. A figure like this (and the one in fig. 5.3) is called a coenocline.

Correspondence analysis

Correspondence analysis (CA) is a method for ordination which has been constructed specifically for situations where different taxa have localized optimal positions on the gradients (fig. 5.3). As in PCA, 'hypothetical variables' are constructed (in decreasing order of importance) which the original data points can be plotted against. CA can also produce diagrams showing both taxon-oriented (R-mode) and sample-oriented (Q-mode) ordination simultaneously.

Instead of maximizing the amount of variance along the axes as in PCA, CA maximizes the correspondence between species scores (positions along the gradient axis) and sample scores. To understand this, it may help to consider one of the possible algorithms for correspondence analysis, known as reciprocal averaging. We start with the species in a random order along the ordination axis. The samples are placed along the axis at positions decided by a weighted mean of the scores of the species they contain. The species scores are then updated to weighted means of the scores of the samples in which they are found. In this way, the algorithm goes back and forth between species scores and sample scores until they have stabilized. It can be shown that this will lead to optimal correspondence between species scores and sample scores whatever the initial random ordering.
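The reciprocal averaging loop is short enough to write out in full. The sketch below extracts a first CA axis from an abundance matrix exactly as described above (abundance-weighted means back and forth, with rescaling so that the scores do not collapse onto the trivial constant solution); it is a bare-bones illustration, not a replacement for PAST's CA module.

    import numpy as np

    def reciprocal_averaging(X, n_iter=100, seed=0):
        """First CA axis for abundance matrix X (samples x species)."""
        X = np.asarray(X, float)
        w = X.sum(axis=0)                     # species totals, used as weights
        species = np.random.default_rng(seed).random(X.shape[1])
        for _ in range(n_iter):
            # Sample scores: abundance-weighted means of species scores
            samples = X @ species / X.sum(axis=1)
            # Species scores: abundance-weighted means of sample scores
            species = X.T @ samples / w
            # Remove the trivial constant component and fix the spread
            species -= np.sum(w * species) / w.sum()
            species /= np.sqrt(np.sum(w * species**2) / w.sum())
        return samples, species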
Correspondence analysis can often give diagrams where the data points are organized in a horseshoe-like shape (the 'arch effect'), and where points towards the edges of the plot are compressed together. This is to some extent an artefact of the mathematical method, and many practitioners prefer to 'detrend' and 'rescale' the result of the CA such that these effects disappear. This is called Detrended Correspondence Analysis (DCA), and it is presently the most popular type of ecological ordination (Hill & Gauch 1980). An interesting effect of the rescaling is that the average width of each species response along the gradient (tolerance) becomes 1. We can then use the total length of an axis to say something about how well the species are spread out along that gradient (beta diversity). If for example an axis has length 5, it means that species at one end of the gradient have little or no overlap with those at the other end.

Figure 5.3: Hypothetical abundance of five species (A-E) along an environmental gradient. Each species has an abundance peak for its optimal living conditions. C has a wide distribution (high tolerance for variation in the environmental parameter), while E is a specialist with a narrow range. Correspondence analysis is suitable for this situation.

Correspondence analysis without detrending is also used as the basis for a divisive clustering method known as TWINSPAN. The first ordination axis is divided in two, and the species/samples on the different sides of the dividing line are assigned to two clusters. This is continued until all clusters are subdivided into single species/samples.

Other ordination methods

Seriation

Seriation was developed by archaeologists, and can be regarded as a simple ordination method for presence/absence data. Rows and columns in the matrix are moved around in such a way that presences are concentrated along the diagonal of the matrix. This diagonal can be regarded as an ordination axis, along which samples (rows) and taxa (columns) are sequenced.

Principal coordinates analysis

Principal coordinates (PCO) analysis starts from distance values between all pairs of data points, using any distance (or similarity) measure. The points are then placed in a low-dimensional space such that the distances are preserved as far as possible. PCO is also known as metric multidimensional scaling (MDS). PCO is attractive because it allows the use of any distance measure, including those based on presence/absence data. For some reason it is not used much in ecology, and its behaviour is not very well studied. Quite often it suffers from arch effects similar to those of correspondence analysis.

Figure 5.4: Detrended Correspondence Analysis of five samples from the Silurian of Wales (Case Study 9). The horizontal ordering corresponds to the presumed distance from the coastline, and we therefore interpret axis 1 as an onshore-offshore gradient.

Figure 5.5: Detrended Correspondence Analysis of plant fossil communities from the Permian. Sample ordination to the left, taxon ordination to the right. Axis 1 correlates well with latitude. In the sample ordination, open symbols are low latitude (China, Euramerica, North Africa and northern South America), filled squares are high southern latitude (Gondwana), and filled triangles and circles are mid- to high-northern latitude (Russia and Mongolia). From Rees et al. (2002).

Non-parametric multidimensional scaling

Non-parametric multidimensional scaling (NMDS; Kruskal 1964) starts from a ranking (ordering) of distance values between all pairs of data points, using any distance measure. These ranks are then used in an iterative procedure in order to try to place the points in a low-dimensional space such that ranked distances are preserved. One of the problems with this procedure is technical: available algorithms do not guarantee an optimal ordination solution.
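For illustration, a non-metric MDS of a Bray-Curtis distance matrix can be run with scikit-learn (one of several available implementations; PAST has its own). Because the iteration can get stuck in local optima, as noted above, it is common to run it from several random starts and keep the solution with the lowest stress.

    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from sklearn.manifold import MDS

    # Hypothetical abundance matrix: 5 samples x 4 taxa
    samples = np.array([
        [143, 13,  0,  2],
        [130, 20,  1,  0],
        [ 12, 56,  3, 40],
        [  9, 73, 14, 35],
        [  1, 60, 22, 30],
    ])
    dist = squareform(pdist(samples, metric='braycurtis'))

    # Non-metric MDS in 2 dimensions, several random restarts
    nmds = MDS(n_components=2, metric=False, dissimilarity='precomputed',
               n_init=10, random_state=0)
    coords = nmds.fit_transform(dist)
    print("stress:", nmds.stress_)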
Examples

'Case studies' nr. 7, 9 and 10.

Figure 5.6: Result of seriation. Taxa in rows, samples in columns. A black square means presence.

Chapter 6
Diversity

Diversity is roughly the same as species richness (sometimes the former is used in a general sense, while the latter refers to the number of species). We can measure diversity in different ways. The simplest approach is of course simply to count the number of species, but often we would like to include the distribution of the numbers of individuals over the different species. Such diversity indices will vary over time and space, and can be important environmental indicators.

The somewhat confusing concepts of alpha, beta and gamma diversity need to be briefly explained:

Alpha diversity is the local diversity (the diversity of one community).

Beta diversity is the rate of change in species composition along a gradient.

Gamma diversity is the diversity of a region.

Diversity indices

The following diversity indices can be calculated in PAST, for a sample with $S$ taxa, $n$ individuals in total and $n_i$ individuals of taxon $i$ (a code sketch implementing several of them follows the list):

Number of taxa ($S$).

Total number of individuals ($n$).

Dominance = 1 - Simpson index: $D = \sum_i (n_i/n)^2$. Ranges from 0 (all taxa are equally present) to 1 (one taxon dominates the community completely).

Simpson index = 1 - dominance. Measures the 'evenness' of the community, from 0 to 1. Note the confusion in the literature: the dominance and Simpson indices are often interchanged, and sometimes they are defined as being reciprocal (Simpson = 1/Dominance).

Shannon index (entropy): $H = -\sum_i (n_i/n) \ln(n_i/n)$. A diversity index taking into account the number of individuals as well as the number of taxa. Varies from 0 for communities with only a single taxon to high values (up to about 5.0) for communities with many taxa, each with few individuals.

Menhinick's richness index: $S/\sqrt{n}$, the ratio of the number of taxa to the square root of sample size. This is an attempt to correct for sample size - larger samples will normally contain more taxa.

Margalef's richness index: $(S-1)/\ln n$.

Equitability: Shannon diversity divided by the logarithm of the number of taxa, $H/\ln S$. This measures the evenness with which individuals are divided among the taxa present.

Fisher's alpha - a diversity index, defined implicitly by the formula $S = \alpha \ln(1 + n/\alpha)$, where $\alpha$ is Fisher's alpha. This index refers to a parameter in a logarithmic abundance model (see below), and is thus only applicable to samples where such a model fits.

Discussions of these and other diversity indices are found in Magurran (1988) and Krebs (1989).
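Most of these indices are one-line computations once the abundance vector is in hand. A sketch follows; Fisher's alpha is found by numerically solving its implicit definition, with scipy's brentq root-finder standing in for whatever method PAST uses internally.

    import numpy as np
    from scipy.optimize import brentq

    def diversity_indices(counts):
        counts = np.asarray([c for c in counts if c > 0], float)
        S, n = len(counts), counts.sum()
        p = counts / n
        dominance = np.sum(p**2)
        shannon = -np.sum(p * np.log(p))
        # Fisher's alpha: solve S = alpha * ln(1 + n/alpha) for alpha
        alpha = brentq(lambda a: a * np.log(1 + n / a) - S, 1e-6, 1e6)
        return {
            'taxa (S)': S,
            'individuals (n)': n,
            'dominance': dominance,
            'simpson': 1 - dominance,
            'shannon (H)': shannon,
            'menhinick': S / np.sqrt(n),
            'margalef': (S - 1) / np.log(n),
            'equitability': shannon / np.log(S),
            'fisher alpha': alpha,
        }

    print(diversity_indices([143, 13, 12, 56, 3, 14]))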
The confusing multitude of indices can be approached pragmatically: use the index you like best (the one that best supports your theory!), but also check some other indices to see whether your conclusions would change with the index used. This approach has been formalized by Tothmeresz (1995), who suggested using a family of diversity indices dependent upon a single continuous parameter. One example is the so-called Renyi family, which depends on a parameter $\alpha$ as follows:

    $H_\alpha = \frac{1}{1-\alpha} \ln \sum_{i=1}^{S} p_i^\alpha$

Here, $S$ is the number of species and $p_i$ is the proportional abundance of species $i$. It can be shown that this index gives the logarithm of the number of species for $\alpha = 0$, the Shannon index for $\alpha \to 1$, and a number behaving like the Simpson index for $\alpha = 2$. We can then plot a diversity profile for a single sample, letting $\alpha$ vary from say 0 to 4. For comparing the diversities of two samples, we can plot their diversity profiles in the same figure. If the curves cross, the ordering of the two samples according to diversity depends upon $\alpha$. The diversities are then said to be non-comparable.

A word on bootstrapping

The diversity indices above may be practically useful for comparing diversity in different samples, but they have little statistical value. If we are told that one community has Shannon index 2.0 and another has index 2.5, is the latter significantly more diverse? This is like asking whether 7 is close to 8 - it is a meaningless question unless we know the variances of the parent populations. What we need is some kind of idea about how the diversity index would vary when taking repeated samples from the same two populations. If these variances are small relative to the difference between the populations, the difference is statistically significant.

So how can we estimate confidence intervals for diversity parameters? One possible method is bootstrapping. This is a general and very simple way of estimating confidence intervals for almost any type of statistical problem, and it has become extremely popular in ecological data analysis, morphometry and systematics. The basic idea is to use the sample we have (or preferably several samples which we hope are from the same population) as an estimate of the statistical distribution in the parent population. This is of course only an approximation, and sometimes a very bad one, but often it is the best we can do. We then ask a computer to produce a large number (for example 1000) of random simulated samples from the estimated parent population, and see what range of variation we get in this set of samples. This variation is used as an estimate of the 'real' variance.

To make this more concrete, we can take the example of diversity indices. Say that we have collected abundance data for 273 individuals of 12 species in one sample, and calculated a Shannon index of 2.5. We want to know what range of Shannon indices we might expect if we had collected many samples with the same total number of individuals from the same parent population. We proceed as follows. First, take all the individual fossils we have collected and put them in a hat (it might be more practical to make one piece of paper for each fossil, with the species name). Assume, or rather hope, that the relative abundances represent a reasonable approximation to the 'real' distribution of abundances in the field. Then pick a fossil from the hat 273 times, with replacement, meaning that you put each fossil back into the hat, and calculate the Shannon index for this random sample. Repeat this whole procedure 1000 times, producing a set of 1000 Shannon indices. The mean represents an estimate of the mean of Shannon indices from the parent population. Then disregard the 25 smallest and 25 largest indices, leaving 950 indices with a range corresponding to a 95 percent confidence interval.
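The 'hat' procedure translates almost word for word into code. The sketch below draws 1000 bootstrap replicates of a made-up 12-species abundance vector (totalling 273 individuals, as in the example) and reports the 95 percent interval of the resulting Shannon indices:

    import numpy as np

    def shannon(counts):
        counts = np.asarray([c for c in counts if c > 0], float)
        p = counts / counts.sum()
        return -np.sum(p * np.log(p))

    rng = np.random.default_rng(0)

    # Hypothetical sample: 12 species, 273 individuals in total
    counts = np.array([60, 45, 38, 30, 25, 20, 18, 15, 10, 6, 4, 2])
    n = counts.sum()
    p = counts / n

    # 1000 bootstrap replicates: draw n individuals with replacement
    boot = [shannon(rng.multinomial(n, p)) for _ in range(1000)]

    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"Shannon = {shannon(counts):.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")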
A similar approach is useful for comparing the diversity indices from two samples. We first pool the samples, meaning that we put all the specimens from both (or more) samples into the same hat. A number of random replicate pairs of samples are then made, and the diversities are compared for each pair. If we rarely observe a difference in diversity between the replicates as large as the difference between the original samples, we conclude that the difference is significant. The same method can of course be used for estimating significance values for any community similarity measure, not only for differences in diversity indices. This gives an alternative to the chi-squared test, and can be used also for presence/absence data. A special case of this approach, using the number of shared taxa as the similarity measure, is known as 'Raup-Crick similarity' (Raup & Crick 1979).

Abundance plots

A useful way of summarizing the distribution of abundances over the members of a community is to plot species abundances in descending order. This is called an abundance plot (fig. 6.1). If the curve drops very rapidly and then levels off, we have a community dominated by a few taxa. It is quite commonly seen, in particular for species-poor communities, that the curve drops exponentially, so that plotting logarithms ('Whittaker plot') produces a straight descending line. This type of curve, known as a geometric series or geometric distribution, is sometimes seen in 'severe' environments or in early stages of a succession. Another common abundance pattern, especially in species-rich communities, fits the log-normal model, where many taxa have a certain abundance and fewer taxa have lower or higher abundance. This produces a Whittaker plot with a plateau in the middle. This is sometimes taken as an indication of a situation where many independent random factors decide the abundance of the taxa, and is expected in environments which are randomly fluctuating (fig. 6.2). The significance of the fit to a specific abundance model can be approximated with specially designed chi-squared tests.

Figure 6.1: Ranked abundance plot for horizon 8 in Case Study 8 (Ordovician of Wales), showing the number of specimens (vertical axis) of the different species (horizontal axis). The function is close to negative exponential, such that taking the logarithms of abundances would have produced an almost straight descending line.

Figure 6.2: Ranked log-abundance (Whittaker) plot for three contemporary communities from the Silurian Waldron Shale, Indiana. The Biohermal community, above storm wave base, approximates a log-normal distribution. The Inter-reef community, below storm wave base, follows a geometric distribution (or perhaps rather a so-called log-series distribution, which flattens off for the rarest species). The Deeper Platform community approximates a so-called broken stick model, typical of stable environments with strong inter-species interactions. From Peters & Bork (1999).

All the common species abundance models (geometric, log-series, log-normal and broken stick) refer to some simple theory of how the ecospace is divided into niches, occupied by the different species under different models of competition (a plotting sketch is given below).
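The geometric series is the easiest of these models to check informally: in a Whittaker plot it gives a straight line, whose slope can be estimated with a linear fit to the log abundances. A sketch with matplotlib and made-up counts (the formal chi-squared fitting mentioned above is left to PAST):

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical abundance vector, ranked in descending order
    counts = np.sort(np.array([143, 73, 56, 14, 13, 12, 9, 3, 2, 1]))[::-1]
    rank = np.arange(1, len(counts) + 1)

    # Whittaker plot: log abundance against rank
    plt.semilogy(rank, counts, 'o-')
    plt.xlabel('Species rank')
    plt.ylabel('Abundance')

    # A straight line in this plot suggests a geometric series;
    # its slope follows from a linear fit to the log abundances.
    slope, intercept = np.polyfit(rank, np.log(counts), 1)
    print(f"log-abundance slope: {slope:.3f}")
    plt.show()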
A new, comprehensive model, covering many aspects such as immigration, speciation and extinction, has been put forward by Hubbell (2001). Known as the 'neutral' or 'ecological drift' model, it is a null hypothesis with random drift of abundances, much like the genetic drift model in population genetics. This model predicts a certain shape of abundance plot, somewhat like the log-normal model but with a larger number of rare species, which seems to fit the communities studied so far better than any previous model. Being a theory which makes very few assumptions and which incorporates evolutionary aspects, it should be of great interest to palaeontologists.

Rarefaction

It is unfortunately the case that the number of taxa (diversity) in a sample increases with sample size. We find more conodont species in a 10 kilo sample than in a 100 gram sample. To compare the numbers of taxa in samples of different sizes we must therefore try to compensate for this effect. Some of the diversity indices described above try to account for sample size, but rarefaction (e.g. Krebs 1989) is a much more precise method. The rarefaction program must be told how many specimens we have of each taxon in the largest sample. The program then computes how many taxa we would expect to find in samples containing smaller numbers of specimens, with standard deviations (fig. 6.3). Technically this can be done using bootstrapping, or with a faster 'direct' method (Krebs 1989). These numbers can then be compared with the numbers of taxa in real samples of corresponding sizes. Another way of using rarefaction curves, which may be less sensitive to differences in composition between the samples, is to perform the rarefaction on each sample separately. Normalized diversities can then be found by standardizing on a small sample size and reading the corresponding expected taxon count from each rarefaction curve.

Figure 6.3: Rarefaction of a sample from the Ordovician (sample 9 in the data set from Case Study 8) with 7 species and 57 specimens. By extrapolation, the curve indicates that further sampling would have increased the number of taxa. The curve also shows how many taxa we would expect to find if the number of specimens in the sample were lower. Standard deviations are not shown.
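The bootstrap variant of rarefaction is easy to sketch: repeatedly draw a subsample of m specimens without replacement and count the taxa. (Krebs's 'direct' method gives the same expectation analytically from hypergeometric probabilities; the resampling version below is slower but simpler to follow. The abundance vector is invented, with the same totals as in fig. 6.3.)

    import numpy as np

    def rarefy(counts, m, n_rep=1000, rng=None):
        """Expected number of taxa (with s.d.) in a subsample of m specimens."""
        rng = rng or np.random.default_rng(0)
        # Expand the abundance vector into one label per specimen
        pool = np.repeat(np.arange(len(counts)), counts)
        taxa = [len(np.unique(rng.choice(pool, size=m, replace=False)))
                for _ in range(n_rep)]
        return np.mean(taxa), np.std(taxa)

    # Made-up sample: 7 species, 57 specimens in total
    counts = np.array([25, 12, 8, 6, 3, 2, 1])
    for m in (10, 20, 40, 57):
        mean, sd = rarefy(counts, m)
        print(f"m = {m:2d}: expected taxa = {mean:.2f} +/- {sd:.2f}")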
Diversity curves

Curves showing diversity as a function of time have become popular in studies of the history of life. Such curves may (or may not!) be correlated with environmental parameters, and can show interesting phenomena such as adaptive radiations and mass extinctions. The compilation of diversity curves from the fossil record is not as easy as just counting taxa.

First, we have to decide on a taxonomic level for our study. Some classical diversity curves have been based on counts of families or genera, but it must always be remembered that these taxonomic units are the results of quite arbitrary decisions, and that they are influenced as much by disparity (levels of morphological difference) as by diversity. A consensus is now emerging that diversity curves should ideally be based on species counts. However, this leads to other problems. How do we delineate the species? How do we deal with synonyms? Any diversity study has to consider taxonomical issues very carefully in order to make reasonable species counts.

The second major problem is that of incompleteness of the data. The fossil record itself is relatively sparse, and even worse, the completeness varies wildly through the stratigraphic column, either because of preservational factors or because of different intensities of collection. Ideally one should try to compensate for this. One method is based on rarefaction, where samples with abundances have been collected. The sample size is standardized using the smallest sample, and rarefaction is used to answer the question of how many taxa we would have seen in each larger sample if it had been as small as the standard size. Another, similar method involves randomized resampling (bootstrapping) in order to see how sampling intensity and structure influence diversity. This can be done even with presence/absence data.

Figure 6.4: Diversity curve for the Ordovician of Norway (upper curve, spanning the Tremadoc, Arenig, Caradoc and Ashgill), produced from a large database of published first and last occurrences at a number of localities. Diversities are counted within 1-million-year intervals. The lower curves show the upper and lower limits of the 90 percent confidence interval resulting from random resampling of localities with replacement. The curve correlates well with sea level, with low diversity at highstands.

A third problem involves imprecise stratigraphical correlation, which will invariably add noise to the diversity curve. In order to reduce this problem, and also to simplify data collection, diversity is often simply counted within each stratigraphical unit (often at the stage level), in the hope that the unit boundaries are reasonably well correlated. However, this reduces time resolution, and it forces us to define standing diversity more carefully. Should we correct for the time duration of the unit? It is obvious that if species longevity is very short compared with unit duration, there will be many more species within the unit than there ever were at any particular point in time. We should then divide the taxon count by unit length in order to get a standardized standing diversity estimate. A related issue is illustrated by the fact that if two units of equal duration have different turnover rates, they will have different taxon counts even if standing diversity was in reality the same, resulting in artificial diversity spikes in units containing turnover events (fig. 6.5). This can to some extent be corrected by letting taxa that originate or disappear within a unit count as 1/2 instead of 1. In addition one may choose to let a taxon that exists only within the unit count as 1/3. This reflects the mean longevity of a taxon within the time unit in the case of a uniform distribution of first and last appearances.

Figure 6.5: Range chart of seven species. The diversity count in interval A is unquestionably 4. In interval C there are altogether 3 species, but mean standing diversity is perhaps closer to 2.3 due to the species which disappears in the interval. Interval B has high turnover. The total species count (7) in this interval is much higher than the maximum mean standing diversity of 4. By letting species that appear or disappear in the interval count as 0.5 units, we get estimated mean standing diversities of A=4, B=3.5, C=2.5.

We usually make the 'range-through assumption', meaning that a taxon is supposed to have been present from its first to its last appearance. Gaps in the record are disregarded. This means that the diversity curves will usually be artificially depressed near the beginning and end of the studied time period, due to gaps in these regions not being filled in by assuming range-through from possible occurrences outside the time period we are studying ('Lazarus taxa'). This so-called edge effect is more serious when taxon longevities are
In spite of all these problems, it has been shown theoretically and by computer simulation that the inaccuracies mentioned above are not necessarily serious as long as they are unsystematic. They may add noise and obscure patterns, but they will rarely produce false, strong signals, at least as long as parts of the biotas are at all preserved. A further comfort comes from the fact that in the few cases where published diversity curves have been tested by others using different (improved) data sets and methods, they normally turn out to be robust except details. However, the question of the reliability of diversity curves is still being debated. Testing for extinction and radiation Given a stratigraphically ordered sequence of diversity estimates, we may note some points of sudden decrease or increase, and wonder whether these represent extinction or radiation ’events’. It has turned out to be rather difficult to test this statistically, for many reasons. First, there is always minor extinction going on (’background extinction’). The event must be significantly more severe than the background extinction in order to classify as a mass extinction. As a first approach, a bootstrap test may be attempted to investigate this. But such a test can only show that the event is large relative to the background extinction, and whether it should be called an extinction event is a matter of definition. Another problem is the so-called Signor-Lipps effect, which will always bias the signal towards gradual rather than sudden extinction. This is related to the edge effect mentioned above, and comes about because even if a sudden extinction of many taxa took place at some boundary, it is unlikely that we will find all these taxa in the very small volume of rock just below that level. In fact, the probability of finding a taxon above a certain level drops inversely with the distance below the boundary, producing an artificial gradual decline. To some extent, one can try to correct for this effect. In recent years, there has been interest in testing palaeontological time series (whether from morphology or community analysis) against the null hypothesis of a so-called random walk, where the positive or CHAPTER 6. DIVERSITY 24 negative change from each time step to the next is randomly distributed. Such random walks can display both gradual and sudden patterns which might well be mis-interpreted as meaningful if observed in the fossil record. Statistical testing for extinction in the fossil record is being much debated right now, and one should refer to recent literature for possible methods. Chapter 7 Curve fitting Many data sets consist of pairs of measurements. Examples are lengths and thicknesses of a number of bones, grain sizes at a number of given levels in a section, and the number of species at different points in geological time. Such data sets can be plotted with points in a coordinate system (scatter plot). Often we wish to see if we can fit the points to a mathematical function (straight line, exponential function etc.), perhaps because we have a theory about an underlying mechanism which is expected to bring the observations into conformation with such a function. Most curve fitting methods are based on least squares, meaning that the computer finds the parameters that give the smallest possible sum of squared error between the curve and the data points. 
Fitting to a straight line

The most common type of curve fitting consists in finding the parameters $a$ and $b$ that give the best possible fit to a straight line:

    $y = ax + b$

There are two forms of linear curve fitting. The most common type is regression, which assumes that the given $x$ values are exact and independent of $y$, such that the measurement error or random deviation is found only in the $y$ values. In the example of grain sizes we can perhaps assume that the level in metres is a given, almost exact, strictly increasing value, while the grain size is a more randomly varying variable. The other form of linear curve fitting, called RMA (Reduced Major Axis), is to be preferred in the example of lengths and thicknesses of bones. The $x$ and $y$ values are here of a more comparable nature, and errors in $x$ and $y$ both contribute to the total squared error. Regression and RMA can often give quite similar results, but in some cases the difference may be substantial.

Correlation, significance, error estimates

A linear regression or an RMA analysis will produce some numbers indicating the degree of fit. It should be noted that the significance value and the estimation of standard errors on the parameters depend upon several assumptions, including normal distribution of the residuals (distances from the data points to the fitted line) and independence of the residuals from the independent variable. Least-squares curve fitting as such is perfectly valid even if these assumptions do not hold, but significance values can then not be trusted. PAST produces the following values:

$r$: Correlation coefficient. This value shows the strength of the correlation. When $x$ and $y$ increase together and are placed perfectly on a straight line, we have $r = 1$.

$p(\mathrm{uncorr})$: The probability that $x$ and $y$ are uncorrelated. If $p(\mathrm{uncorr})$ is small (say less than 0.05), you can use the values below.

Standard error in $a$.

Standard error in $b$.

Figure 7.1: Example of linear regression. Note that in this case, the assumption of independence of the standard deviation of the residuals from the independent variable does not seem to hold well (the points scatter more for larger $x$).

Log and log-log transformation

We can use linear regression also for fitting the data points to an exponential curve. This is done simply by fitting a straight line to the logarithms of the $y$ values (taking the logarithm transforms an exponential function into a straight line). If we use the natural logarithm, the parameters $a$ and $b$ from the regression are to be interpreted as follows:

    $y = e^{ax+b}$

In PAST there is also a function for taking the base-10 logarithms of both the $x$ and the $y$ values. The data points are then fitted to the power function

    $y = 10^b x^a$

For the special case $a = 1$ we are back to a linear function.

Examples

'Case studies' nr. 1 (last part), 3 (first part) and 5.

Fitting to periodic functions

Some geological and palaeontological phenomena are periodic, meaning that they vary in a cyclical pattern. Examples are climatic cycles (Milankovitch), annual cycles in isotope data from belemnites and ammonites, and possibly periodic mass extinctions. We may try to fit such data to a sinusoid (fig. 7.2):

    $y = A \cos(2\pi x / T - \phi)$   (7.1)

Here we have three parameters:

$A$ (amplitude)

$T$ (period): decides the duration of each cycle

$\phi$ (phase): translates the curve left or right

In PAST the user must set an assumed $T$. The machine then optimizes the values of $A$ and $\phi$ to give the best fit.

Example

'Case study' nr. 11.
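With $T$ fixed, finding $A$ and $\phi$ is in fact a linear least-squares problem, since $A\cos(2\pi x/T - \phi) = p\cos(2\pi x/T) + q\sin(2\pi x/T)$. Whether PAST uses this particular trick internally I cannot say; the sketch below simply illustrates the idea on synthetic data:

    import numpy as np

    rng = np.random.default_rng(2)
    x = np.linspace(0, 10, 200)
    y = 4 * np.cos(2 * np.pi * x / 5 - np.pi / 4) \
        + rng.normal(scale=0.5, size=x.size)

    T = 5.0                                   # assumed period, set by the user
    c = np.cos(2 * np.pi * x / T)
    s = np.sin(2 * np.pi * x / T)

    # Solve y ~ p*cos + q*sin by linear least squares
    (p, q), *_ = np.linalg.lstsq(np.column_stack([c, s]), y, rcond=None)

    A = np.hypot(p, q)                        # amplitude
    phi = np.arctan2(q, p)                    # phase
    print(f"A = {A:.2f}, phi = {phi:.2f} rad")   # expect about 4 and pi/4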
Nonlinear curve fitting

All the functions above are linear in their parameters, or they can be linearized. It is more tricky for the computer to fit data to nonlinear functions. One example is the logistic curve

    $y = \frac{a}{1 + b e^{-cx}}$

The logistic curve is often used to describe growth with saturation (fig. 7.3). It was used as a model for the marine Palaeozoic diversity curve by Sepkoski (1984).

The question of whether, for example, a logistic curve fits a data set better than a straight line does is difficult to answer. We can always produce better fits by using models with more parameters. If Mr. A has a theory, and Mrs. B also has a theory but with more parameters, whom shall we believe? There are formal ways of attacking this problem, but they will not be described here.

Figure 7.2: Sinusoid with $A=4$, $T=5$, $\phi=\pi/4$, i.e. $y = 4\cos(2\pi x/5 - \pi/4)$.

Figure 7.3: Logistic curve with $a=3$, $b=30$, $c=7$, i.e. $y = 3/(1 + 30e^{-7x})$.
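Nonlinear fits like the logistic curve are usually handled by an iterative optimizer. A sketch using scipy's curve_fit on synthetic data; note that nonlinear optimizers can be sensitive to the initial guess p0:

    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(x, a, b, c):
        return a / (1 + b * np.exp(-c * x))

    rng = np.random.default_rng(3)
    x = np.linspace(0, 1, 100)
    y = logistic(x, 3, 30, 7) + rng.normal(scale=0.05, size=x.size)

    # Iterative least squares, starting from a rough initial guess
    params, cov = curve_fit(logistic, x, y, p0=[2, 10, 5])
    print("a, b, c =", np.round(params, 2))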
Chapter 8
Time series analysis

Data sets where we have measured a quantity at a sequence of points in time are called time series. Such data sets can be studied with curve fitting as described above, but there are also analysis methods that have been constructed specifically for time series.

Spectral analysis

Spectral analysis involves searching for periodicities in the time series, preferably with a measure of statistical significance. Such periodicities may be difficult to spot by eye in the original data set, but the spectral analysis may bring them out clearly. The analysis consists in calculating how much 'energy' we have at different frequencies, that is, how strong a presence we have of sinusoidal components at the different frequencies. There are many different methods for spectral analysis. Some of them involve the use of the Fourier transform, which is simply correlation of the signal with a harmonic series of sine and cosine functions. One spectral analysis method which I would like to promote is the Lomb periodogram (Press et al. 1992). This method has the advantage of being able to handle data points that are not evenly spaced.

It is important to understand that such spectral analysis only attempts to detect sinusoidal periodicities. Other periodic functions, for example a 'sawtooth curve', will appear in the spectrogram as a 'fundamental' with 'harmonics' at whole-number multiples of the fundamental frequency. The function is thus decomposed into its sinusoidal parts.

Spectrograms such as the one in fig. 8.2 must be interpreted correctly. There are a number of pitfalls to consider, most of them having to do with the fact that it is impossible for the analysis to increase the information content of the signal. The following checklist applies to the Fourier transform, but similar limitations exist for the unevenly spaced case and for other algorithms:

The highest frequency that can be studied (the Nyquist frequency) is the one corresponding to the period of two consecutive samples.

The lowest frequency inspected by the algorithm is the one corresponding to the period given by the total length of the analyzed time series. However, effects such as spectral leakage (see below) cause the lowest trustworthy frequency channel to be the one corresponding to four periods over the duration of the time series. In other words, you need four full cycles to be able to detect a periodicity with confidence (this rule of thumb is rather conservative, and some people would push the number down to maybe three).

The frequency resolution is limited by the total number of samples in the signal, so that the number of analysis channels is half the number of samples.

The use of a finite-length time series implies a truncation of the infinitely long signal expected by the Fourier transform. This leads to so-called spectral leakage, limiting the frequency resolution further and potentially producing spurious low-amplitude peaks in the spectrogram.

A simple test of statistical significance involves comparing the strength of a spectral peak with the distribution of peaks expected from a random signal ('white noise'). A similar test involves random reordering of the sample points in order to remove their temporal relationships. If the original spectral peaks are not much stronger than the peaks observed in the 'shuffled' spectrum, we have low significance.

Autocorrelation

Autocorrelation is a simple form of time series analysis which in some cases may show periodicities more clearly than spectral analysis. As the name indicates, the time series is correlated with a copy of itself. This gives of course perfect correlation (value 1). Then the copy is translated by a small time difference, called the lag time, and we get a new (lower) correlation value. This is repeated for increasing lag times, and we get a diagram showing correlation as a function of lag time. If the time series is periodic, we will get high correlation for lag times corresponding to the period, which will show up as peaks in the autocorrelogram (fig. 8.3).

Wavelets

Wavelet analysis is a new type of time series analysis that has lately become popular in geophysics and petrology, but it should also have potential in palaeontology. Using the so-called quasi-continuous wavelet transform we can study a time series at several different scales simultaneously. This is done by correlating the time series against a particular short-duration time series ('mother wavelet') at all possible locations in time, and scaled (compressed) to different extents. We can say that the wavelet function is like a magnifying glass that we use to observe the time series at all points in time, and the analysis also continuously adjusts the magnification so that we can see the time series at different scales. In this way we can see both long-term trends and short-term details. Wavelet analysis was used by Prokoph et al. (2000) to illustrate a 30-million-year cycle in diversity curves for planktic foraminifera.

Example: Neogene isotope data (Case Study 13)

Oxygen isotope data from foraminiferan shells in core samples give a good indication of temperature change through time. In this example we shall look at such an oxygen isotope log (Shackleton et al. 1990). The data have already been fitted to an age model, so that we can treat the data set as a time series (fig. 8.1).

Figure 8.1: Oxygen isotope data (d18O) from a core sample, one million years back in time (present time to the left).

We can first try to find sinusoidal periodicities using spectral analysis. Figure 8.2 shows the Lomb periodogram, where the $x$ axis shows cycles per million years and the $y$ axis shows the strength of the sinusoidal components.
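Such a periodogram can be reproduced with scipy's lombscargle, which, like the Lomb method promoted above, accepts unevenly spaced samples. The sketch below runs it on a synthetic series with a known 0.1-million-year cycle rather than the actual Case Study 13 data:

    import numpy as np
    from scipy.signal import lombscargle

    rng = np.random.default_rng(4)
    t = np.sort(rng.uniform(0, 1, 330))      # unevenly spaced ages (Myr)
    y = np.sin(2 * np.pi * t / 0.1) + rng.normal(scale=0.5, size=t.size)

    # Angular frequencies for 1 to 60 cycles per million years
    cycles = np.linspace(1, 60, 500)
    power = lombscargle(t, y - y.mean(), 2 * np.pi * cycles)

    print("strongest peak at", cycles[np.argmax(power)], "cycles/Myr")  # ~10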
Autocorrelation

Autocorrelation is a simple form of time series analysis which in some cases may show periodicities more clearly than spectral analysis. As the name indicates, the time series is correlated with a copy of itself. At zero offset this of course gives perfect correlation (a value of 1). The copy is then shifted by a small time difference, called the lag time, and we get a new (lower) correlation value. This is repeated for increasing lag times, giving a diagram of correlation as a function of lag time. If the time series is periodic, we will get high correlation for lag times corresponding to the period, which shows up as peaks in the autocorrelogram (fig. 8.3).
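A minimal Python sketch of such an autocorrelogram (an illustration, not necessarily how PAST computes it), with a synthetic periodic series invented for the example:

import numpy as np

def autocorrelogram(y, max_lag):
    # Pearson correlation of the series with a lagged copy of itself,
    # for lags 0..max_lag; lag 0 gives 1 by definition.
    y = np.asarray(y, dtype=float)
    acf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        acf[lag] = np.corrcoef(y[: y.size - lag], y[lag:])[0, 1]
    return acf

# Toy series with a period of 20 samples plus noise.
rng = np.random.default_rng(2)
t = np.arange(300)
y = np.sin(2 * np.pi * t / 20) + rng.normal(0.0, 0.3, t.size)

acf = autocorrelogram(y, 60)
# Peaks at lags near 20, 40 and 60 indicate the period and its multiples.
print("highest peak beyond lag 5 at lag", 5 + int(np.argmax(acf[5:])))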
Wavelets

Wavelet analysis is a relatively new type of time series analysis that has lately become popular in geophysics and petrology, but it should also have potential in palaeontology. Using the so-called quasi-continuous wavelet transform we can study a time series at several different scales simultaneously. This is done by correlating the time series with a particular short-duration time series (the 'mother wavelet') at all possible positions in time, and scaled (compressed or stretched) to different extents. We can say that the wavelet function is like a magnifying glass that we move along the time series, while the analysis continuously adjusts the magnification so that the series is seen at different scales. In this way we can see both long-term trends and short-term details. Wavelet analysis was used by Prokoph et al. (2000) to illustrate a 30-million-year cycle in diversity curves for planktic foraminifera.

Example: Neogene isotope data (case study 13)

Oxygen isotope data from foraminiferan shells in core samples give a good indication of temperature change through time. In this example we shall look at such an oxygen isotope log (Shackleton et al. 1990). The data have already been fitted to an age model, so that we can treat the data set as a time series (fig. 8.1).

Figure 8.1: Oxygen isotope data (d18O) from a core sample, one million years back in time (present time to the left). The horizontal axis is in millions of years BP.

We can first try to find sinusoidal periodicities using spectral analysis. Figure 8.2 shows the Lomb periodogram, where the horizontal axis shows frequency in cycles per million years and the vertical axis shows the strength of the sinusoidal components.

The peaks around 8 and 11 cycles per million years correspond to periods of roughly 0.12 and 0.09 million years, respectively. These periods fit well with the 100,000-year Milankovitch cycle connected with orbital eccentricity, or alternatively with periodicity in orbital inclination. The peak at 24 cycles per million years indicates a 41,000-year cycle (axial obliquity), while the peak at 43 cycles per million years indicates a 23,000-year cycle (precession). We see that the Milankovitch cycles come out very prominently with this type of analysis.

Figure 8.2: Spectral analysis of isotope data from the core sample (1 million years BP to the present). The peaks in the spectrum indicate strong periodicities. The frequency axis is in units of cycles per million years.

The autocorrelogram (fig. 8.3) indicates periodicities of 14, 30 and 39 samples. The samples in the time series are spaced 3000 years apart, so this corresponds to periodicities of about 42, 90 and 117 thousand years, in reasonable accordance with the Milankovitch cycles. In this case the periodicities are shown better by spectral analysis than by autocorrelation, partly because the roughly sinusoidal nature of the cycles is well suited for spectral methods.

Figure 8.3: Autocorrelogram of isotope data from the core. The peaks in the curve indicate periodicities. The horizontal axis shows lag time in units of 3000 years.

Finally we can study the time series at different scales using the quasi-continuous wavelet transform (fig. 8.4). The horizontal axis shows samples in units of 3000 years, while the vertical axis shows the base-2 logarithm of the scale (in samples) at which the time series is observed. Thus, the value 3 on the vertical axis means that at this level in the diagram the signal is observed at a scale of 2^3 = 8 samples, or 24,000 years. We can glimpse periodicities at scales corresponding to roughly 96,000, 45,000 and 29,000 years, in relatively good accordance with the Milankovitch periodicities. An advantage of wavelet analysis over spectral analysis is that we can see how the periodicities change through time. The spectral analysis treats the time series as a whole, and gives no information localized in time.

Figure 8.4: Quasi-continuous wavelet diagram of isotope data from the core. The horizontal axis shows time in units of 3000 years, while the vertical axis shows the scale at which the time series is observed, from about 380,000 years (top) down to 6,000 years (bottom). Periodicities can be glimpsed at three different levels.
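To close the chapter, here is a hedged Python sketch of a quasi-continuous wavelet scalogram of the kind shown in fig. 8.4. It assumes the PyWavelets package (pywt) is available; the Morlet mother wavelet, the scale range and the synthetic input are choices made for this illustration, not taken from the case study.

import numpy as np
import pywt  # PyWavelets

rng = np.random.default_rng(3)

# Synthetic series: 3000-year sampling over about 1 Myr, with two
# Milankovitch-like periodicities (invented for the example).
dt = 3000.0                       # years per sample
n = 334                           # roughly one million years of record
t = np.arange(n) * dt
y = (np.sin(2 * np.pi * t / 96000.0)
     + 0.5 * np.sin(2 * np.pi * t / 41000.0)
     + rng.normal(0.0, 0.3, n))

# Scales from 2 to 128 samples, i.e. 6,000 to 384,000 years, as in fig. 8.4.
scales = 2.0 ** np.linspace(1, 7, 49)

# Continuous wavelet transform with a Morlet mother wavelet; rows of 'coefs'
# correspond to scales, columns to positions in time.
coefs, freqs = pywt.cwt(y - y.mean(), scales, 'morl', sampling_period=dt)
power = np.abs(coefs)

# A horizontal band of high power marks a periodicity at that scale; 'freqs'
# converts each scale to an approximate frequency in cycles per year.
best = power.mean(axis=1).argmax()
print("strongest mean response near a period of %.0f years" % (1.0 / freqs[best]))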
Bibliography

[1] Adrain, J.M., Fortey, R.A. & Westrop, S.R. 1998. Post-Cambrian trilobite diversity and evolutionary faunas. Science 280:1809.
[2] Harper, D.A.T. (ed.). 1999. Numerical Palaeobiology. John Wiley & Sons.
[3] Hill, M.O. & Gauch, H.G. Jr. 1980. Detrended correspondence analysis: an improved ordination technique. Vegetatio 42:47-58.
[4] Hubbell, S.P. 2001. The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press.
[5] Jongman, R.H.G., ter Braak, C.J.F. & van Tongeren, O.F.R. (eds.). 1995. Data Analysis in Community and Landscape Ecology. Cambridge University Press.
[6] Krebs, C.J. 1989. Ecological Methodology. Harper & Row, New York.
[7] Kruskal, J.B. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29:1-27.
[8] Ludwig, J.A. & Reynolds, J.F. 1988. Statistical Ecology. A Primer on Methods and Computing. John Wiley & Sons.
[9] Magurran, A.E. 1988. Ecological Diversity and its Measurement. Princeton University Press.
[10] Peters, S.E. & Bork, K.B. 1999. Species-abundance models: an ecological approach to inferring paleoenvironment and resolving paleoecological change in the Waldron Shale (Silurian). Palaios 14:234-245.
[11] Press, W.H., Teukolsky, S.A., Vetterling, W.T. & Flannery, B.P. 1992. Numerical Recipes in C. Cambridge University Press.
[12] Prokoph, A., Fowler, A.D. & Patterson, R.T. 2000. Evidence for periodicity and nonlinearity in a high-resolution fossil record of long-term evolution. Geology 28:867-870.
[13] Raup, D.M. & Crick, R.E. 1979. Measurement of faunal similarity in paleontology. Journal of Paleontology 53:1213-1227.
[14] Rees, P.M., Ziegler, A.M., Gibbs, M.T., Kutzbach, J.E., Behling, P.J. & Rowley, D.B. 2002. Permian phytogeographic patterns and climate data/model comparisons. Journal of Geology 110:1-31.
[15] Sepkoski, J.J. 1984. A kinetic model of Phanerozoic taxonomic diversity. Paleobiology 10:246-267.
[16] Shackleton, N.J., Berger, A. & Peltier, W.R. 1990. An alternative astronomical calibration of the lower Pleistocene timescale based on ODP Site 677. Transactions of the Royal Society of Edinburgh: Earth Sciences 81:251-261.
[17] Tóthmérész, B. 1995. Comparison of different methods for diversity ordering. Journal of Vegetation Science 6:283-290.