Ecology Letters, (2011) 14: 1043–1051
doi: 10.1111/j.1461-0248.2011.01675.x
LETTER
Combining α- and β-diversity models to fill gaps in our
knowledge of biodiversity
Karel Mokany,1* Thomas D. Harwood,1 Jacob McC. Overton,2
Gary M. Barker2 and Simon Ferrier1
1 CSIRO Ecosystem Sciences, Climate Adaptation Flagship, PO Box 1700,
Canberra ACT 2601, Australia
2 Landcare Research, Private Bag 3127, Hamilton, New Zealand
*Correspondence: E-mail:
[email protected]
Abstract
For many taxonomic groups, sparse information on the spatial distribution of biodiversity limits our capacity to
answer a variety of theoretical and applied ecological questions. Modelling community-level attributes (α- and
β-diversity) over space can help overcome this shortfall in our knowledge, yet individually, predictions of α- or
β-diversity have their limitations. In this study, we present a novel approach to combining models of α- and
β-diversity, with sparse survey data, to predict the community composition for all sites in a region. We applied
our new approach to predict land snail community composition across New Zealand. As we demonstrate, these
predictions of metacommunity composition have diverse potential applications, including predicting γ-diversity
for any set of sites, identifying target areas for conservation reserves, locating priority areas for future ecological
surveys, generating realistic compositional data for metacommunity models and simultaneously predicting the
distribution of all species in a taxon consistent with known community diversity patterns.
Keywords
Alpha, beta, community, dissimilarity, diversity, gamma, metacommunity, macroecology, richness, snail.
INTRODUCTION
Reliable knowledge of how biodiversity varies over space is vital in our
quest to answer a range of important ecological questions, both
applied and theoretical. For example, predicting how climate change
will affect biodiversity in the future first requires good knowledge of
current spatial patterns in biodiversity (Botkin et al. 2007). Testing
theoretical models for how communities are structured also requires
reliable information on spatial patterns in biodiversity (e.g. Driscoll &
Lindenmayer 2009). Despite the importance of information on how
biodiversity varies across space, reliable data on community composition are generally available for only a small number of ecological
survey sites. This limitation in our knowledge of the spatial
distribution of species has been termed the 'Wallacean shortfall'
(Lomolino 2004) and severely restricts our capacity to address a range
of important questions in ecology.
One common approach to dealing with the Wallacean shortfall is
to predict the spatial distribution of individual species over a region
using statistical or mechanistic models (Elith & Leathwick 2009;
Kearney & Porter 2009). Species-distribution modelling can generate
useful information on likely patterns in the occurrence of species
across an area, and is particularly applicable to taxonomic groups
that are well studied across the region of interest (e.g. vertebrates)
(Araújo et al. 2006; Thuiller et al. 2006; Hole et al. 2009). In such
cases, species-distribution modelling can play an important role in
filling gaps in our knowledge of biodiversity as a whole (i.e. all
species in a taxon). However, many taxonomic groups are highly
speciose, with little information on the distribution or other
attributes of every component species and with new species still
being discovered (e.g. tropical plants and insects, marine invertebrates). In these situations, species-distribution modelling is limited
in its ability to predict occurrences for all species in a taxon and
hence predict patterns for biodiversity as a whole (Mokany & Ferrier
2011).
Community-level modelling has the capacity to complement
species-level approaches by predicting spatial patterns in biodiversity
for highly diverse, poorly studied taxa. Although there is a broad range
of community-level modelling approaches (Ferrier & Guisan 2006),
two of the most commonly employed approaches predict either the
species richness of communities (α-diversity) occurring at individual
sites within a region, or the dissimilarity in community composition
between pairs of sites (β-diversity), as a function of underlying
environmental gradients (Mokany & Ferrier 2011). While a variety of
alternative methods can be applied to quantify β-diversity (Tuomisto
2010), in this study we focus on pair-wise dissimilarity, due to its utility
in describing diversity patterns over large regions at a fine spatial
resolution. Community-level modelling approaches can generate
useful predictions of how community properties, such as richness
or compositional dissimilarity, vary across a region for poorly studied
taxa (e.g. Marsh et al. 2010). However, these approaches do not
provide information on the spatial distribution of component species
within each community, which is important for a variety of applied
and theoretical objectives. In addition, models of species richness and
compositional dissimilarity each predict how one property of
biodiversity (either richness or dissimilarity) varies across a region
and it is not immediately evident how these separate predictions can
best be combined to produce a more holistic assessment of spatial
patterns in community diversity.
To date, there have been few attempts to synthesise information
from community-level modelling of α- and β-diversity. Approaches
focusing on the partitioning of diversity into its components (α-, β-,
γ-diversity) (Loreau 2000; Crist & Veech 2006) are less relevant here,
as they view β-diversity in terms of a single value for whole regions
rather than as a property relating to pairs of communities at specific
sites, and therefore exhibiting variation within a given region (Baselga
2010). One approach synthesising community richness and dissimilarity information is the maximisation of complementary richness
(MCR: Arponen et al. 2008). This approach is focused on conservation
planning and selects a given number of sites that best represent the
diversity of a region based on predicted patterns in α- and β-diversity.
Arponen et al. (2008) raise the possibility of combining modelled
estimates of α- and β-diversity in more powerful ways, but alternative
approaches are yet to emerge.
© 2011 Blackwell Publishing Ltd/CNRS
In this study, we present and apply a novel approach for
synthesising information from community-level models of α- and
β-diversity (the Dynamic Framework for Occurrence Allocation in
Metacommunities: DynamicFOAM). Our approach applies an optimisation algorithm, which constructs species lists for each community in
a region (i.e. every cell in a regular grid) under the constraints of
modelled estimates of the number of species present (α-diversity), the
dissimilarity in species composition between each pair of communities
(β-diversity) and any available data on the occurrences of specific
species at specific sites (Fig. 1). Our approach essentially predicts the
composition of all communities across a region (or the distribution of
every species in a taxon), based on the specified models of α- and
β-diversity. Depending on the level of knowledge for a taxon,
predicted communities can comprise real species, purely hypothetical
species (e.g. undescribed species), or a mixture of both.
In this study, we describe our approach and demonstrate its
application by extending existing models of α- and β-diversity for land
snails in New Zealand (Overton et al. 2009). We show how
community composition predicted with this approach can be used
for a wide range of ecological applications, including predicting
γ-diversity for any set of sites; identifying target areas for new
conservation reserves; locating priority areas for future ecological
surveys; generating realistic compositional data for metacommunity
models; testing macroecological theory; and predicting the spatial
distribution of all species in a taxon simultaneously. Finally, we
highlight the emergence of species-level phenomena, such as
aggregated distributions and realised environmental niches, from the
community-level diversity models. These analyses illustrate the utility
in combining community-level models of α- and β-diversity, especially
for highly diverse taxonomic groups where there are substantial gaps
in our knowledge of biodiversity.
[Figure 1: flow diagram. Limited survey data (site × species presence/absence matrix) and environmental
variables feed the α- and β-diversity models, which give predicted richness (α) for each site and predicted
dissimilarity (β) for each pair of sites; DynamicFOAM combines these into the predicted composition
(site × species presence/absence matrix).]
Figure 1 Graphical depiction of the procedure for predicting the composition of all
communities using limited community survey data, relevant environmental
variables, models of α- and β-diversity and the Dynamic Framework for
Occurrence Allocation in Metacommunities (DynamicFOAM).
METHODS
Predicting the composition of communities from α- and β-diversity
The approach described in this study predicts the presence/absence of
all species in all communities (i.e. sites, or cells in a regular grid) by
applying output from community-level modelling of species richness
and compositional dissimilarity (e.g. Overton et al. 2009; Marsh et al.
2010). Our approach can apply predictions generated from either
correlative or mechanistic models of α- and β-diversity. Most commonly,
correlative modelling techniques would be applied to model how the
number of species in a community (α-diversity), or the difference in
species composition between two communities (β-diversity), changes
with important environmental variables. These models then predict α- or
β-diversity over all sites, based on their environmental conditions, and
this is the starting point for our analysis (Fig. 1).
Our approach requires pair-wise compositional dissimilarity ($\beta_{ij}$) to
be modelled as the complement of Sørensen's similarity coefficient
(i.e. Sørensen's dissimilarity), $\beta_{ij} = 1 - 2c_{ij}/(\alpha_i + \alpha_j)$, where $\alpha_i$ and $\alpha_j$
are the number of species in sites $i$ and $j$ respectively, and $c_{ij}$ is the
number of species in common between the two sites. Current
applications of community-level diversity modelling can predict the
richness of each site ($\alpha_i$ and $\alpha_j$ above) and the dissimilarity between
each pair of communities ($\beta_{ij}$) (Ferrier et al. 2007), from which the
expected number of species in common between site pairs ($c_{ij}$) can be
derived (Appendix S1). The objective of the DynamicFOAM optimisation algorithm is to allocate occurrences (presence/absence) for
each species in each site such that the total number of species in each
site equals the predicted richness ($\alpha_i$, $\alpha_j$) and the number of species
shared between all pairs of sites ($c_{ij}$) equals that derived from the
predicted dissimilarity in composition ($\beta_{ij}$).
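The relationship between richness, dissimilarity and shared species can be illustrated with a short sketch. It simply rearranges the Sørensen dissimilarity formula given above; the function names are hypothetical, and the derivation in Appendix S1 may include further steps (e.g. rounding or bounding) that are not reproduced here.

```python
def sorensen_dissimilarity(alpha_i: float, alpha_j: float, c_ij: float) -> float:
    """Sorensen dissimilarity between two sites with richness alpha_i and
    alpha_j that share c_ij species."""
    return 1.0 - (2.0 * c_ij) / (alpha_i + alpha_j)


def shared_species(alpha_i: float, alpha_j: float, beta_ij: float) -> float:
    """Expected number of shared species implied by the two richness values
    and the pair-wise dissimilarity (algebraic inverse of the formula above)."""
    return (1.0 - beta_ij) * (alpha_i + alpha_j) / 2.0


# Two sites of 38 species each with a dissimilarity of 0.5 share 19 species.
print(shared_species(38, 38, 0.5))  # 19.0
```

The value returned is an expectation and need not be an integer, so any use as a target for whole-species counts would have to decide how to treat fractional values.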
Although this optimisation problem is conceptually reasonably
straightforward, the search space for an optimal solution is very large,
even for simple cases (e.g. for 20 species over 10 sites, there are
$1.6 \times 10^{60}$ permutations of species presences/absences). We reduce
the search space by setting the number of species present in each site to
the modelled species richness (α-diversity). The optimisation process
then seeks to organise the designated number of presences and absences
in the species × site matrix so as to match as closely as possible the
modelled number of species in common between each pair of sites (Fig. 2).
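The size of the search space quoted above, and the reduction obtained by fixing the richness of each site, can be checked with a few lines of arithmetic. This is a sketch only: the richness of five species per site is a hypothetical value chosen for illustration, and the function names are not from the original software.

```python
from math import comb, prod


def unconstrained_states(n_species: int, n_sites: int) -> int:
    """Every cell of the species x site matrix is free to be 0 or 1."""
    return 2 ** (n_species * n_sites)


def richness_constrained_states(n_species: int, richness: list[int]) -> int:
    """Each site must hold exactly its modelled number of species, so it has
    C(n_species, richness) possible species lists."""
    return prod(comb(n_species, a) for a in richness)


total = unconstrained_states(20, 10)
reduced = richness_constrained_states(20, [5] * 10)  # hypothetical richness values
print(f"{total:.2e}")  # 1.61e+60
print(f"{reduced:.2e}")
```

Even with richness fixed, the number of candidate matrices remains far too large to enumerate, which is why a stochastic search is used.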
[Figure 2: flow diagram with panels A–G. Species × site presence/absence matrices for the initial solution,
the new permuted solution, the best solution so far and the final solution, together with site × site
matrices of the modelled number of species in common.]
Figure 2 A graphical representation of the DynamicFOAM procedure. An initial
solution is generated randomly (A), given the predicted richness of each site. The
best solution so far is permuted by selecting a site at random (B), then selecting a
species presence (= 1) and absence (= 0) at random (C) and permuting the selected
presence and absence (D). The objective value is obtained as the absolute difference
between the modelled number of species in common between each pair of sites and
that generated by the permuted species list (E). The solution with the lowest
objective value (best solution so far vs. new permuted solution) is retained as the
best solution so far for the next iteration (F). The final solution is obtained when
either the objective value reaches zero (perfect solution), a stipulated time elapses,
or a stipulated number of permutations are made without improvement in the
objective value (G).
A further input requirement is an estimate of the total number of
species across all communities in the region (total γ-diversity), which
dictates the size of the species list to be solved. The γ-diversity for a
metacommunity (all the communities in a region) can be specified
using prior knowledge, estimated using rarefaction techniques (Gotelli
& Colwell 2001), or for modest-sized metacommunities, γ-diversity
can be estimated using the DynamicFOAM algorithm itself. In the
latter case, the algorithm is applied to a range of feasible γ-diversities.
The estimated γ-diversity is identified as described in Appendix S2, as
the γ-diversity whose solution is better (smaller error) than the next in
the ascending sequence of tested γ-diversities (i.e. the first local
minimum in solution error). This approach is robust in estimating the
lower bounds on γ-diversity (Appendix S2).
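The selection rule described above (the first tested value whose solution error is smaller than that of the next value in the ascending sequence) can be sketched as follows. The function name and the example error values are hypothetical, and the full procedure in Appendix S2 may differ in detail.

```python
def estimate_gamma(candidates, errors):
    """Return the first candidate gamma-diversity, in ascending order, whose
    solution error is smaller than that of the next candidate tested."""
    pairs = sorted(zip(candidates, errors))
    for (gamma, err), (_, next_err) in zip(pairs, pairs[1:]):
        if err < next_err:
            return gamma
    return None  # no local minimum within the tested range


# Hypothetical solution errors for four tested gamma-diversities.
print(estimate_gamma([10, 20, 30, 40], [5.0, 3.0, 4.0, 1.0]))  # 20
```

Returning `None` when the error keeps falling is a design choice of this sketch: it signals that the tested range should be extended rather than silently reporting the largest value tried.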
The sequence of steps in the DynamicFOAM algorithm is shown
diagrammatically in Fig. 2A–G. First, an initial solution is generated by
allocating the predicted number of presences (modelled richness) to
species in each site (A). This initial state may be purely random, or can
incorporate some prior knowledge (e.g. weighted-random allocation; see
Appendix S1, Fig. S6). The objective value for this initial solution is then
determined, as the absolute difference between the modelled and
generated number of species in common ($c_{ij}$), summed over all site pairs.
An iterative procedure is then applied, where a randomly selected
presence and absence (C) from a randomly selected site (B) are swapped
(D), the objective value for the altered species × site list is assessed (E)
and the permuted species list retained for the next iteration if the
objective value is lower than the previous best (F). This iterative
procedure continues until (1) the global minimum is achieved or (2) a
stipulated time period elapses (maximum processing time), or (3) no
further improvement is achieved after a given number of iterations
(maximum unimproved iterations) (G). This optimisation process will
often terminate at local minima. Consequently, the full algorithm (Fig. 2)
is repeated multiple times, generating a population of solutions, with the
optimal solution identified as that with the lowest objective value.
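The steps in Fig. 2A–G can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the authors' software: all names are hypothetical, the initial state is purely random, the time limit and the constraints from known occurrences are omitted, and the objective is recomputed in full at each iteration for clarity rather than speed.

```python
import random
from itertools import combinations


def objective(sites, target_c):
    """Absolute difference between modelled and generated numbers of shared
    species, summed over all site pairs (step E)."""
    return sum(abs(target_c[i][j] - len(sites[i] & sites[j]))
               for i, j in combinations(range(len(sites)), 2))


def initial_solution(richness, n_species, rng):
    """Random species list of the modelled richness for each site (step A)."""
    return [set(rng.sample(range(n_species), a)) for a in richness]


def dynamic_foam_sketch(richness, target_c, n_species,
                        max_unimproved=2000, seed=0):
    rng = random.Random(seed)
    best = initial_solution(richness, n_species, rng)
    best_val = objective(best, target_c)
    unimproved = 0
    # Stop at a perfect solution or after too many unimproved swaps (step G).
    while best_val > 0 and unimproved < max_unimproved:
        unimproved += 1
        i = rng.randrange(len(best))                       # step B
        present = sorted(best[i])
        absent = [s for s in range(n_species) if s not in best[i]]
        if not present or not absent:
            continue
        trial = [set(s) for s in best]
        trial[i].remove(rng.choice(present))               # steps C and D
        trial[i].add(rng.choice(absent))
        val = objective(trial, target_c)
        if val < best_val:                                 # step F
            best, best_val, unimproved = trial, val, 0
    return best, best_val


def best_of_restarts(richness, target_c, n_species, n_restarts=5):
    """Repeat the search from different random starts and keep the solution
    with the lowest objective value."""
    runs = (dynamic_foam_sketch(richness, target_c, n_species, seed=s)
            for s in range(n_restarts))
    return min(runs, key=lambda run: run[1])
```

Because each swap exchanges one presence for one absence within a single site, the richness of every site is preserved throughout the search.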
Where there are no data available on the presence/absence of
specific species in specific sites, the algorithm generates a species × site matrix of presences/absences using purely hypothetical
species. However, where data on the presence/absence of specific
species in specific sites are available, they can be added as further
constraints in the optimisation process. In this case, DynamicFOAM
predicts the occurrence of these specific species over all sites.
Including known occurrences of specific species in the algorithm can
markedly reduce the error in the predicted occurrences of species,
with perfect solutions obtainable where there is no error in the input
α- and β-diversity models (Fig. S1). A more detailed description of the
procedure is given in Appendix S1, and software to apply
DynamicFOAM is available from the author on request.
Applying DynamicFOAM to New Zealand land snails
We assessed the practical utility of the DynamicFOAM procedure by
applying it to the land snail fauna of New Zealand, which is both
highly diverse (998 recognised species) and highly endemic (99.5% of
species) (Barker 2005; Overton et al. 2009). Overton et al. (2009)
recently modelled both richness (α-diversity) and compositional
dissimilarity (β-diversity) of land snails in New Zealand, based on
2330 snail community surveys and a range of environmental and
vegetation variables. The richness model of Overton et al. (2009) was
developed using generalised additive modelling (within the GRASP
software package: Lehmann et al. 2002) with 14 environmental
variables, explaining 27% of variation in α-diversity. Compositional
dissimilarity was modelled using generalised dissimilarity modelling
(Ferrier et al. 2007) with the same 14 environmental variables, with the
final model explaining 57% of variation in β-diversity (Overton et al.
2009). We applied these predictions of richness and compositional
dissimilarity, along with the known community composition of all
2330 survey sites, to predict land snail community composition over
all of New Zealand. The approach was applied using a 200 m
resolution grid for all New Zealand, such that communities were
defined as square 4 ha areas. Although the 2330 community surveys
detected 845 of the 998 recognised species, rarefaction analyses
suggest that the total number of land snail species in New Zealand
is likely to be 1350–1400 (Appendix S3). We therefore applied a
γ-diversity of 1350 species in the DynamicFOAM algorithm, 845
species of known identity and 505 'unknown' species. The original
predictions of α-diversity (Overton et al. 2009) were scaled by a factor
of 2.5 to extend them from a richness value per 20 × 20 m sample
plot to a richness value for each 200 × 200 m grid cell (Appendix S4).
The algorithm could not be applied to all sites in New Zealand
simultaneously (c. 6.7 million sites) due to computational constraints.
We therefore applied DynamicFOAM sequentially to contiguous square
blocks of 2500 sites (100 km²). All sites within a block were solved
simultaneously, with 200 randomly selected sites from any adjacent
blocks already solved included as 'known' compositional data constraints, along with the 2330 community survey sites. This allowed
practical application of the algorithm in a sequential process, gradually
solving the entire land area of New Zealand, whilst ensuring consistency
in solutions between each block solved. A weighted-random procedure
was used in generating the initial state (Fig. 2A) for each 2500 site block,
to minimise error in the final solution (Appendix S1, Fig. S6).
Error quantification
Error in the community compositions predicted by DynamicFOAM
was quantified in two ways. First, we calculated the mean absolute error
in the predicted compositional dissimilarity ($\mathrm{MAE}_{\beta}$) of each site-pair
relative to the modelled (target) dissimilarity between each site-pair.
This measure indicates how closely the procedure generated communities with the target level of compositional dissimilarity, and is directly
related to the objective value of the optimisation algorithm. Second, we
used known compositional data from the 2330 survey sites to quantify
the proportion of correctly predicted species occurrences in those sites
(the number of observed species correctly predicted divided by the
total number of observed species). We determined the significance of
this proportion by comparing it with 1000 random selections of the
specified number of species from a pool of 1350 species.
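Both error measures can be sketched as follows. The sketch assumes that each community is held as a non-empty set of integer species identifiers, and that significance is taken as the fraction of random draws scoring at least as well as the prediction; that p-value definition and all function names are assumptions made for illustration.

```python
import random
from itertools import combinations


def sorensen(a, b):
    """Sorensen dissimilarity between two non-empty species sets."""
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


def mae_beta(predicted, target_beta):
    """Mean absolute error between generated and modelled (target)
    dissimilarities over all site pairs."""
    pairs = list(combinations(range(len(predicted)), 2))
    return sum(abs(sorensen(predicted[i], predicted[j]) - target_beta[i][j])
               for i, j in pairs) / len(pairs)


def proportion_correct(observed, predicted):
    """Observed species that were correctly predicted, divided by the total
    number of observed species."""
    return len(observed & predicted) / len(observed)


def random_p_value(observed, predicted, pool_size=1350, n_draws=1000, seed=0):
    """Fraction of random draws of len(predicted) species from the pool that
    match the observed list at least as well as the prediction does."""
    rng = random.Random(seed)
    actual = proportion_correct(observed, predicted)
    hits = sum(
        proportion_correct(
            observed, set(rng.sample(range(pool_size), len(predicted)))
        ) >= actual
        for _ in range(n_draws)
    )
    return hits / n_draws
```

With 1000 draws the smallest non-zero value this estimate can take is 0.001, so a result of zero should be read as "below 0.001" rather than as exactly zero.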
Error in the prediction of species occurrences can come from three
sources: (1) error in the specified species richness of each site (the
α-diversity model), (2) error in the specified compositional dissimilarity between each pair of sites (the β-diversity model) and (3) the
DynamicFOAM procedure failing to identify the optimal solution,
given the specified constraints. We quantified the relative importance
of these three sources of error in our predictions of land snail
community composition using the 2330 survey sites, where both
known and predicted composition, richness and pair-wise dissimilarities were available. To do this, we used the algorithm to predict the
composition of 50 randomly selected survey sites, with the remaining
2280 survey sites included as 'known data', as with the analysis for all
New Zealand. We applied the algorithm using all four combinations
of known and predicted richness (α) and dissimilarities (β) for the 50
randomly selected survey sites (i.e. (i) known-α known-β; (ii) known-α
predicted-β; (iii) predicted-α known-β; (iv) predicted-α predicted-β).
The error in predicted occurrences from each of these combinations
enabled us to quantify the relative contribution of the richness
predictions [(iii − i)/iv], the dissimilarity predictions [(ii − i)/iv] and the
DynamicFOAM procedure itself (i/iv) to the total error in predicted
occurrences (iv). This was repeated 10 times.
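The partitioning arithmetic can be written out directly from the three ratios given above. The function name is hypothetical and the error values in the example are invented for illustration; they are not results from the study.

```python
def partition_error(e_i, e_ii, e_iii, e_iv):
    """Relative contribution of each error source, where e_i..e_iv are the
    errors in predicted occurrences under combinations (i) to (iv)."""
    return {
        "alpha_model": (e_iii - e_i) / e_iv,   # richness predictions
        "beta_model": (e_ii - e_i) / e_iv,     # dissimilarity predictions
        "optimisation": e_i / e_iv,            # DynamicFOAM procedure itself
    }


# Hypothetical errors for the four combinations.
shares = partition_error(e_i=0.20, e_ii=0.45, e_iii=0.25, e_iv=0.50)
print({name: round(value, 3) for name, value in shares.items()})
```

The three shares sum to one only when the errors are additive, i.e. when e_ii + e_iii − e_i equals e_iv, as they do in this invented example.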
RESULTS
Predicting the composition of all land snail communities
The DynamicFOAM procedure successfully generated predictions of
the land snail community composition for each of the 6.7 million grid
cells across New Zealand. As the algorithm fixes the species richness
of each site to the richness specified by the α-diversity model, the fit
of the predicted community compositions is assessed by how well the
resulting pair-wise compositional dissimilarities match those specified
by the β-diversity model. For the New Zealand land snails, the mean
absolute error in the pair-wise dissimilarities ($\mathrm{MAE}_{\beta}$) of the predicted
community compositions was a Sørensen dissimilarity value of 0.075
(± 0.0004 SE). For a pair of sites, each with average richness (= 38)
and with half their species in common (= 19), this level of error is
equivalent to community compositions with either three too few, or
three too many species in common.
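The conversion from an error in dissimilarity to an error in shared species follows from the Sørensen formula, since a change in dissimilarity scales with the change in shared species by a factor of 2/(α_i + α_j). A worked version for the example above, with both sites at the average richness of 38:

```latex
\Delta c_{ij} = \frac{\alpha_i + \alpha_j}{2}\,\Delta\beta_{ij}
              = \frac{38 + 38}{2} \times 0.075 = 2.85 \approx 3

% Check against the formula directly:
\beta(c = 19) = 1 - \tfrac{38}{76} = 0.500, \quad
\beta(c = 16) = 1 - \tfrac{32}{76} \approx 0.579, \quad
\beta(c = 22) = 1 - \tfrac{44}{76} \approx 0.421
```

A shift of three shared species in either direction therefore changes the dissimilarity by about 0.079, close to the reported mean absolute error of 0.075.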
The capacity of the procedure to correctly predict the occurrences
of specific species in specific sites can be quantified by comparing the
predicted composition with the known composition for all the
community survey sites. The mean proportion of correctly predicted
occurrences for all survey sites was 0.498 (± 0.004 SE). This
represents correct prediction for approximately half the species
known to occur in a specific site. For 72 survey sites, no species
occurrences were correctly predicted and for all but two of the
remaining survey sites, the proportion of correctly predicted
occurrences was significantly higher than that for a random selection
of species (mean P-value = 0.0005). Our error partitioning analysis
revealed that the greatest sources of error in the prediction of
the occurrences of specific species in specific sites were the underlying
β-diversity model (responsible for 49.8% of total error) and the
DynamicFOAM optimisation procedure itself (40.0% of error), with
a relatively small proportion of error coming from the underlying
α-diversity model (10.2% of error). The proportion of correctly
predicted occurrences was robust to decreases in the amount of
known compositional information included in the algorithm (Fig. S5).
Extending the predictions of community composition
To demonstrate how the predictions of community composition can
be used to estimate γ-diversity in any sub-region, we extracted the
total number of species within circular regions of radius 200 m and
3200 m centred on every grid cell in New Zealand (Fig. 3a,b). We also
identified the most suitable regions in which to discover undescribed
species, by determining the total number of 'unknown' species
predicted to occur within a circular region of arbitrary radius 400 m
centred on every grid cell (Fig. 3c).
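Extracting the number of species within a circular region amounts to taking the union of the predicted species lists of the cells in that region. The sketch below assumes that the grid is held as a nested list of species sets and that a cell belongs to the region when its centre lies within the radius; that inclusion rule is an assumption, although it does give five cells for a 200 m radius. All names are hypothetical.

```python
from math import hypot


def cells_within(row, col, radius_m, n_rows, n_cols, cell_m=200.0):
    """Grid cells whose centres lie within radius_m of the focal cell centre."""
    reach = int(radius_m // cell_m)
    return [(r, c)
            for r in range(max(0, row - reach), min(n_rows, row + reach + 1))
            for c in range(max(0, col - reach), min(n_cols, col + reach + 1))
            if hypot(r - row, c - col) * cell_m <= radius_m]


def regional_gamma(grid, row, col, radius_m, cell_m=200.0, species_filter=None):
    """Number of distinct species predicted within the circular region.
    Pass species_filter (e.g. the set of 'unknown' species) to count only those."""
    species = set()
    for r, c in cells_within(row, col, radius_m, len(grid), len(grid[0]), cell_m):
        species |= grid[r][c]
    if species_filter is not None:
        species &= species_filter
    return len(species)


# Toy 3 x 3 grid in which every cell holds one unique species.
grid = [[{r * 3 + c} for c in range(3)] for r in range(3)]
print(regional_gamma(grid, 1, 1, 200))  # 5
```

Restricting the count with `species_filter` corresponds to the second use described above, where only the species without survey records are tallied.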
The predictions of community composition for all sites are
equivalent to predicting the spatial distributions of all 845 'known'
and 505 'unknown' land snail species (e.g. Fig. 4a–c). For the 845
'known' species for which occurrence data were available, the spatial
pattern of predicted occurrences closely matched predictions
obtained from previously applied species-distribution modelling for
individual species (see Fig. S2 for examples). When the predicted
occurrences of individual species were related to a key environmental variable (mean annual temperature), clear species–environment
correlations (realised environmental niches) emerged (e.g. Fig. 4d–f).
The predicted area of occupancy for the 845 species surveyed was
highly correlated with the number of survey sites in which those
species were recorded (Pearson's r = 0.902, P < 0.001; Fig. S3). This
indicates a high level of consistency between the predicted
occurrences and the survey data in terms of which species were
common and rare.
We assessed the reliability of the γ-diversity predictions that emerge
from the results by comparing the observed γ-diversity across
randomly selected subsets of community survey sites with the
predicted γ-diversity over the same sites. Note that here we are
examining γ-diversity for a selected number of sites, rather than the
global γ-diversity (1350 species) used as a model input. We assessed
500 random combinations of sites, with the number of sites in a
combination ranging from 3 to 400, according to a stratified random
sample. The predicted γ-diversities explained 99% of the variation in
observed γ-diversities of the survey sites (Fig. 5). Note that the
predicted γ-diversities are greater in magnitude than the observed
values due to the scaling of α-diversity by a factor of 2.5 from the plot
level (where the observations were made) to the grid cell level (where
the predictions were made) (Appendix S4).
Applying our predictions of land snail γ-diversity for regions of
varying size (e.g. Fig. 3a,b), we explored more theoretical aspects of
how γ-diversity changes with area. For 100 randomly selected sites in
New Zealand, we extracted our predictions of γ-diversity for
concentric circular regions of different radii (200 m, 400 m, 800 m,
1600 m and 3200 m) centred around those sites. We then examined
the correlation between the γ-diversity for each sized region and the
mean α- and β-diversity within those regions. As the size of the
region increased, the correlation of γ-diversity with mean α-diversity
decreased, while the correlation of γ-diversity with mean β-diversity
increased (Fig. 6). The threshold at which β-diversity became more
strongly correlated with γ-diversity was an area with radius of c. 600 m
(Fig. 6).
[Figure 3: three map panels (a)–(c), each with a 'Number of species' colour legend.]
Figure 3 The predicted number of land snail species within a circular region of
radius (a) 200 m (= total area of five grid cells) and (b) 3200 m, centred around
each grid cell in New Zealand and (c) the predicted number of unsurveyed land
snail species within a circular region of radius 400 m centred around each grid cell
in New Zealand.
DISCUSSION
Filling gaps in our knowledge of biodiversity
Major shortfalls in current knowledge restrict our capacity to best
conserve, manage and understand biodiversity (Lomolino 2004). In
this study, we have presented a novel approach to filling gaps in our
knowledge of biodiversity, by combining models of α- and β-diversity.
As we have demonstrated, our procedure predicts the composition of
all communities and this information can be used to help answer a
variety of theoretical and applied ecological questions. Our approach
is particularly well suited to speciose taxonomic groups for which
there is sparse information on the distribution of all species across the
region of interest, where there are likely to be a substantial number of
undescribed species, and hence, where species-level modelling for all
species in the taxon is infeasible. For the land snail fauna in New
Zealand, our procedure generated predictions of the composition of
all 6.7 million communities, which exactly matched the underlying
model of α-diversity and approached the modelled β-diversities to
within an average Sørensen dissimilarity of 0.075 between site pairs.
The spatial patterns in biodiversity of the predicted communities
therefore closely match the community-level diversity models.
A unique feature of our procedure is that properties of individual
species emerge from the constraints of the community-level models of
α- and β-diversity, combined with the community survey data. For
example, the predicted occurrences of individual species over space
(e.g. Fig. 4a–c) show clearly aggregated spatial patterns, and closely
match the distributions predicted through conventional species-distribution modelling approaches (e.g. Fig. S2). The procedure also
performed moderately well at predicting the occurrence of specific
species in specific sites, with almost half (49.8%) of the known species
occurrences correctly predicted. We believe that this level
of predictive accuracy is quite high, given the inherent errors in the
α- and β-diversity models, and the fact that occurrences for all 1350
species (including many rare species) are being predicted simultaneously over millions of locations consistent with observed patterns in
α- and β-diversity. Through a data thinning test, we found that the
accuracy of occurrence predictions was relatively insensitive to
reductions in the amount of known compositional data included in
the algorithm, suggesting that this approach will be useful for much
sparser data sets than that analysed here (Fig. S5). Of further interest
is the emergence of clear and distinct relationships between the
occurrence frequency of individual species and key environmental
variables such as mean annual temperature (Fig. 4d–f), akin to realised
environmental niches (Hutchinson 1957).
[Figure 4: panels (a)–(c), maps of predicted occurrence; panels (d)–(f), relative frequency of occurrence (f)
against mean annual temperature (MAT).]
Figure 4 Emergence of species-level attributes from community-level models. The predicted occurrences across New Zealand for three of the 1350 land snail species applied
in the DynamicFOAM procedure (A = Chaureopa planulatu, B = Allodiscus austrodimorphus, C = Cavellia irregularis). Black indicates predicted presence, grey indicates predicted
absence, with known occurrences shown in white. Also shown are the relative frequency of occurrence (f) as a function of mean annual temperature (MAT) for each species
(black bars) against the background environmental frequency for all New Zealand (grey bars) for the same three snail species (D–F).
Figure 5 Observed vs. predicted γ-diversity (y- and x-axis, respectively) for 500 random combinations of the
community survey sites, with the number of sites in each combination ranging from
3 to 400. Predicted γ-diversity is extracted from the predicted land snail community
compositions across New Zealand. The grey line is a linear model fit to the data
(y = 0.524x − 5.096, R² = 0.99, P < 0.001).
Figure 6 The correlation coefficient (Pearson's R) between the γ-diversity within a circular area of a given
radius (km) and either the mean α-diversity (black circles) or mean β-diversity (grey
triangles) within that area. Data are for circular areas of different radii centred
around 100 randomly selected sites across New Zealand.
The emergence of species-level attributes, such as aggregated
distributions and realised niches, from the higher level community
diversity patterns contrasts starkly with the common approach of
examining emergent community-level properties from the combination
of many species-level models (e.g. Hole et al. 2009). One
advantage of our approach is that while sensible predictions emerge at
the species level (Fig. S2), predictions at the community level tightly
adhere to the observed patterns of biodiversity (i.e. the α- and
β-diversity models). In contrast, summing many species-level predictions
may generate predictions of community diversity that conflict with
observed community-level diversity patterns (Pineda & Lobo 2009),
such as the α- and β-diversity models.
Applying complete metacommunity predictions
The predictions made by our procedure have a broad range of
possible applications, including conservation planning, ecological
survey design, exploration of macroecological theory and application
in mechanistic metacommunity models. One of the most obvious
benefits of predicting the composition of all communities across a
region is that we can assess the total number of species (γ-diversity)
predicted to occur in any set of sites within that region, regardless of
their spatial arrangement. Estimates of γ-diversity extracted from our
predictions of community composition were strongly related to actual
γ-diversity for land snails in New Zealand (R² = 0.99, Fig. 5),
indicating a high level of reliability. Note that the capacity for our
algorithm to predict γ-diversity for any set of sites within a region is
very different from diversity partitioning approaches (e.g. Loreau
2000) where a single region-wide β-diversity value (Baselga 2010)
can be combined with a single mean α-diversity value to predict a
single γ-diversity value for a whole region.
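The distinction can be made concrete with a short sketch. Assuming the predicted compositions are stored as a mapping from each site to its set of species (the dictionary layout, the function name `gamma_diversity` and the toy labels are assumptions for illustration, not the authors' implementation), the gamma-diversity of any chosen set of sites is simply the size of the union of their species sets:

```python
def gamma_diversity(composition, sites):
    """Number of distinct species predicted across the given sites."""
    species = set()
    for site in sites:
        species |= composition[site]
    return len(species)


# Hypothetical predicted compositions for three grid cells.
composition = {
    "cell_1": {"sp_a", "sp_b"},
    "cell_2": {"sp_b", "sp_c"},
    "cell_3": {"sp_d"},
}

# Any subset of sites can be queried, whatever its spatial arrangement.
print(gamma_diversity(composition, ["cell_1", "cell_2"]))  # 3
print(gamma_diversity(composition, ["cell_1", "cell_3"]))  # 3
print(gamma_diversity(composition, composition))           # 4
```

Because the query works on explicit species lists, two site sets with the same mean richness can return different gamma-diversity, which a single region-wide partition cannot express.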
We have illustrated the potential of DynamicFOAM predictions to
assess γ-diversity of any set of sites with a simple example, where we
determined the number of land snail species predicted to occur in a
circular area of radius 200 m or 3200 m centred around each grid cell
in New Zealand (Fig. 3a,b). Although the region around some sites
possessed high γ-diversity regardless of the area examined, other sites
possessed relatively low γ-diversity for the 200 m radius, but relatively
high γ-diversity for the 3200 m radius. The most notable example of a
rapid increase in γ-diversity with increasing area was in the
mountainous parts of the South Island of New Zealand (Fig. 3a,b), where richness tended to be low, but pair-wise β-diversity was
relatively high. The capacity to interrogate the predictions of
community composition and extract a prediction of the γ-diversity
for any combination of sites has obvious practical applications for
conservation planning, where we may want to identify the best area or
set of areas to situate new conservation reserves, so as to maximise the
number of species represented (Margules & Pressey 2000).
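One simple way to explore such a reserve-selection question is a greedy complementarity heuristic, sketched below in Python. This is a generic illustration rather than a method used in this study: the function name `greedy_reserves`, the site-to-species dictionary and the toy labels are all assumptions, and a greedy search is not guaranteed to find the optimal set of sites.

```python
def greedy_reserves(composition, n_reserves):
    """Pick up to n_reserves sites, each adding the most not-yet-covered species."""
    candidates = dict(composition)
    chosen, covered = [], set()
    for _ in range(min(n_reserves, len(candidates))):
        # Sorting the names makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=lambda s: len(candidates[s] - covered))
        if not candidates[best] - covered:
            break  # no remaining site adds a new species
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered


# Hypothetical predicted compositions for four candidate sites.
composition = {
    "cell_1": {"sp_a", "sp_b", "sp_c"},
    "cell_2": {"sp_c", "sp_d"},
    "cell_3": {"sp_a", "sp_b"},
    "cell_4": {"sp_e"},
}

chosen, covered = greedy_reserves(composition, 2)
print(chosen)        # ['cell_1', 'cell_2']
print(len(covered))  # 4
```

Here `cell_3` is never selected even though it is species-rich, because everything it holds is already represented in `cell_1`; this is the sense in which compositional predictions carry more information than richness alone.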
Our procedure also incorporates species that are yet to be
discovered in its predictions of community composition. In the case
of New Zealand land snails, it was necessary to include 'unknown' or
'undescribed' species in the optimisation procedure, together with the
845 species encountered in the community surveys. Including
unknown species in the procedure enables the identification of areas
that are predicted to have the largest numbers of undescribed species
(e.g. Fig. 3c), based on the community-level diversity models and the
community surveys. This information could be used in planning the
best sites for future ecological surveys to both identify new species
and improve the community-level diversity models. These predictions
could save valuable ecological research funding and effort, whilst
maximising our capacity for documenting and describing the Earth's
biota, bridging the so-called 'Linnaean Shortfall' in our knowledge of
biodiversity (Brown & Lomolino 1998).
The development and testing of macroecological theory is another
endeavour which can be restricted by shortfalls in our knowledge of
biodiversity (Gaston & Blackburn 2000). For example, the way in
which α- and β-diversity interact to influence the γ-diversity for a
region continues to be an important topic of debate among ecologists
(e.g. Loreau 2000; Crist & Veech 2006). In this study, we demonstrate
how the DynamicFOAM predictions can contribute to our understanding of basic macroecological forces, such as the relative
importance of α- and β-diversity in influencing γ-diversity. In a
simple analysis, we assessed the correlation between predicted land
snail γ-diversity and both mean α- and mean β-diversity, for areas of
different size centred on 100 randomly selected sites in New Zealand.
This analysis suggests that as the size of a region increases, mean
α-diversity becomes less important in influencing γ-diversity while
mean β-diversity becomes more important (Fig. 6). A wide variety of
alternative macroecological explorations are conceivable using the
DynamicFOAM predictions of metacommunity composition.
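The kind of analysis described above can be sketched in a few lines of Python. The sketch assumes that mean beta-diversity within an area is the mean pairwise Sørensen dissimilarity among its communities and that each area is supplied as a list of species sets; these choices, the function names and the toy data are illustrative assumptions rather than the exact procedure used here.

```python
from itertools import combinations
from math import sqrt


def sorensen(a, b):
    """Sørensen dissimilarity between two species sets (0 = identical)."""
    total = len(a) + len(b)
    return 0.0 if total == 0 else 1.0 - 2.0 * len(a & b) / total


def area_summary(communities):
    """Return (gamma, mean alpha, mean pairwise beta) for a non-empty list of species sets."""
    gamma = len(set().union(*communities))
    mean_alpha = sum(len(c) for c in communities) / len(communities)
    pairs = list(combinations(communities, 2))
    mean_beta = sum(sorensen(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return gamma, mean_alpha, mean_beta


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy areas: each is the list of predicted communities falling inside one circle.
areas = [
    [{"a", "b"}, {"b", "c"}],
    [{"a"}, {"a"}],
    [{"a", "b", "c"}, {"d", "e", "f"}],
]
summaries = [area_summary(area) for area in areas]
gammas = [s[0] for s in summaries]
r_alpha = pearson_r(gammas, [s[1] for s in summaries])
r_beta = pearson_r(gammas, [s[2] for s in summaries])
```

Repeating this for circles of increasing radius around the same centres would give one pair of correlation coefficients per radius, which is the form of the comparison plotted in Fig. 6.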
As we have demonstrated, our procedure can generate community
compositions for all sites across large regions that approach known
patterns in biodiversity (α- and β-diversity) and in which component
species possess realistic distributions (Figs 4 and S2). Such predictions
could facilitate the first application of mechanistic metacommunity
models to predict biodiversity change over large areas under scenarios
of global change (Kerr et al. 2007). Utilising complete predictions of
metacommunity composition from the DynamicFOAM procedure as
an initial state, metacommunity models could incorporate key
processes (e.g. dispersal, interspecific interactions) in predicting likely
impacts of global change on biodiversity as a whole (Mokany &
Ferrier 2011). This exciting new opportunity for metacommunity
modelling may require the adaptation of existing model frameworks
(Holyoak et al. 2005), or the derivation of new metacommunity
modelling approaches, to enable their application to the large regions
for which biodiversity management decisions are often made.
Looking forward
The reliability of community composition predictions made with the
DynamicFOAM procedure depends on a number of factors, including
the amount of ecological survey data available, the robustness of the
underpinning community-level models (the α- and β-diversity models)
and the capacity of the algorithm to generate a solution that best
meets the constraints. Our error partitioning analysis for the land snail
example suggested that improvements in the β-diversity model would
yield the greatest improvement in the accuracy of predicting specific
species in specific sites (responsible for 49.8% of the total error).
Despite the modest amount of variation in species richness explained
by the α-diversity model (27%), an improved richness model would
yield at most a 10.2% reduction in the error of predicted occurrences.
Our analyses also found a large component of error (40.0%) stemming
from the DynamicFOAM procedure failing to reach the globally
optimal solution. This error can be reduced directly by increasing the
processing time; however, there are decreasing marginal returns in
error reduction as processing time increases (data not shown).
It is highly likely that the predictive accuracy and computation time
of our approach can be improved through further methodological
development. These improvements could be
achieved through programming efficiencies, parallelisation or alternative optimisation algorithms (though alternative approaches, such as
binary linear programming and binary genetic algorithms, were found
to be inferior in preliminary testing of our approach). There are also
numerous possible approaches for sequentially solving community
compositions for regions with large numbers of communities and for
initialising species lists prior to the iterative permutation procedure
(Fig. 2A). The solution presented in this study, for land snails in New
Zealand, required c. 6 days of computing time on a standard desktop.
Following substantial gains in computational efficiency, multiple
solutions could be generated for a given taxon and region, allowing for
quantification of confidence intervals associated with the predictions
of species occurrences and gamma diversities. Testing our generic
approach with different taxa in different regions is another important
step in further improving the DynamicFOAM procedure.
In conclusion, our approach to combining community-level models
of α- and β-diversity represents a powerful way to help fill the gaps in
our knowledge of biodiversity, especially for highly diverse, poorly
studied taxa. Our results are consistent with theory at both the species
and community levels. As we have demonstrated, the predictions
made from our procedure have a wide range of potential applications,
including conservation planning, survey design, testing macroecological theory and facilitating the application of metacommunity models
to predict the impacts of global change on biodiversity.
ACKNOWLEDGEMENTS
We thank S. H. Roxburgh for advice on optimisation algorithms, as
well as D. R. Paini, J. B. Pichancourt and four anonymous reviewers
for comments on earlier versions of this manuscript.
AUTHORSHIP
KM and SF identified the problem; KM designed the study, derived
the method, wrote the manuscript; KM and TDH analysed the data;
JMO and GMB provided data and expert advice; TDH, JMO, GMB
and SF contributed to the manuscript.
REFERENCES
Araújo, M.B., Thuiller, W. & Pearson, R.G. (2006). Climate warming and the
decline of amphibians and reptiles in Europe. J. Biogeogr., 33, 1712–1728.
Arponen, A., Moilanen, A. & Ferrier, S. (2008). A successful community-level
strategy for conservation prioritization. J. Appl. Ecol., 45, 1436–1445.
Barker, G.M. (2005). The character of the New Zealand land snail fauna and
communities: some evolutionary and ecological perspectives. Rec. West. Aust. Mus.
Supp., 68, 53–102.
Baselga, A. (2010). Partitioning the turnover and nestedness components of beta
diversity. Glob. Ecol. Biogeogr., 19, 134–143.
Botkin, D.B., Saxe, H., Araujo, M.B., Betts, R., Bradshaw, R.H.W., Cedhagen, T.
et al. (2007). Forecasting the effects of global warming on biodiversity. Bioscience,
57, 227–236.
Brown, J. & Lomolino, M. (1998). Biogeography, 2nd edn. Sinauer Press, Sunderland,
Massachusetts.
Crist, T.O. & Veech, J.A. (2006). Additive partitioning of rarefaction curves and
species-area relationships: unifying alpha-, beta- and gamma-diversity with
sample size and habitat area. Ecol. Lett., 9, 923–932.
Driscoll, D.A. & Lindenmayer, D.B. (2009). Empirical tests of metacommunity
theory using an isolation gradient. Ecol. Monogr., 79, 485–501.
Elith, J. & Leathwick, J.R. (2009). Species distribution models: ecological explanation
and prediction across space and time. Annu. Rev. Ecol. Evol. Syst., 40, 677–697.
Ferrier, S. & Guisan, A. (2006). Spatial modelling of biodiversity at the community
level. J. Appl. Ecol., 43, 393–404.
Ferrier, S., Manion, G., Elith, J. & Richardson, K. (2007). Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional
biodiversity assessment. Divers. Distrib., 13, 252–264.
Gaston, K.J. & Blackburn, T.M. (2000). Pattern and Process in Macroecology. Blackwell
Publishing, Oxford.
Gotelli, N.J. & Colwell, R.K. (2001). Quantifying biodiversity: procedures and
pitfalls in the measurement and comparison of species richness. Ecol. Lett., 4,
379–391.
Hole, D.G., Willis, S.G., Pain, D.J., Fishpool, L.D., Butchart, S.H.M., Collingham,
Y.C. et al. (2009). Projected impacts of climate change on a continent-wide
protected area network. Ecol. Lett., 12, 420–431.
Holyoak, M., Leibold, M. & Holt, R. (2005). Metacommunities: Spatial Dynamics and
Ecological Communities. The University of Chicago Press, Chicago.
Hutchinson, G. (1957). Concluding remarks. Cold Spring Harb. Symp. on Quant. Biol.,
22, 415–427.
Kearney, M. & Porter, W. (2009). Mechanistic niche modelling: combining physiological and spatial data to predict species' ranges. Ecol. Lett., 12, 334–350.
Kerr, J.T., Kharouba, H.M. & Currie, D.J. (2007). The macroecological contribution
to global change solutions. Science, 316, 1581–1584.
Lehmann, A., Overton, J.M. & Leathwick, J.R. (2002). GRASP: generalized
regression analysis and spatial prediction. Ecol. Modell., 160, 165–183.
Lomolino, M. (2004). Conservation biogeography. In: Frontiers of Biogeography: New
directions in the Geography of Nature (eds Lomolino, M. & Heaney, L.). Sinauer
Associates, Sunderland, Massachusetts, pp. 293–296.
Loreau, M. (2000). Are communities saturated? On the relationship between α, β
and γ diversity. Ecol. Lett., 3, 73–76.
Margules, C.R. & Pressey, R.L. (2000). Systematic conservation planning. Nature,
405, 243–253.
Marsh, C.J., Lewis, O.T., Said, I. & Ewers, R.M. (2010). Community-level diversity
modelling of birds and butterflies on Anjouan, Comoro Islands. Biol. Conserv.,
143, 1364–1374.
Mokany, K. & Ferrier, S. (2011). Predicting impacts of climate change on biodiversity: a role for semi-mechanistic community-level modelling. Divers. Distrib.,
17, 374–380.
Overton, J., Barker, G. & Price, R. (2009). Estimating and conserving patterns of
invertebrate diversity: a test case of New Zealand land snails. Divers. Distrib., 15,
731–741.
Pineda, E. & Lobo, J.M. (2009). Assessing the accuracy of species distribution
models to predict amphibian species richness patterns. J. Anim. Ecol., 78, 182–
190.
Thuiller, W., Broennimann, O., Hughes, G., Alkemade, J.R.M., Midgley, G.F. &
Corsi, F. (2006). Vulnerability of African mammals to anthropogenic climate
change under conservative land transformation assumptions. Glob. Change Biol.,
12, 424–440.
Tuomisto, H. (2010). A diversity of beta diversities: straightening up a concept gone
awry. Part 2. Quantifying beta diversity and related phenomena. Ecography, 33,
23–45.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online
version of this article:
Figure S1 Increasing the accuracy of occurrence predictions by
including known data as a constraint (a test with synthetic
metacommunities).
Figure S2 Examples of species distributions predicted using
DynamicFOAM compared with those using individual species-distribution modelling.
Figure S3 Predicted vs. surveyed extent of occurrence for the 845
surveyed species.
Figure S4 Rank-occurrence relationships for both predicted occurrence and surveyed occurrence.
Figure S5 The effect of data scarcity on the accuracy of occurrence
predictions.
Figure S6 The effect of a weighted-random procedure for generating
an initial solution on the predictive accuracy of the algorithm.
Appendix S1 Detailed description of the DynamicFOAM algorithm.
Appendix S2 Applying DynamicFOAM directly to predict gamma-diversity.
Appendix S3 Estimation of the total number of land snail species.
Appendix S4 Scaling richness predictions from 20 m sample plot to
200 m grid cell.
As a service to our authors and readers, this journal provides
supporting information supplied by the authors. Such materials are
peer-reviewed and may be re-organised for online delivery, but are not
copy-edited or typeset. Technical support issues arising from
supporting information (other than missing files) should be addressed
to the authors.
Editor, Howard Cornell
Manuscript received 28 March 2011
First decision made 03 May 2011
Second decision made 1 July 2011
Manuscript accepted 15 July 2011