Supporting Information: Figures S1-S9

Figure S1: Number of bivariate models used to build ESMs (a) for different sample sizes (N1 ≤ 25; 25 < N2 ≤ 50; N3 > 50) and variable importance for ESMs (b and c) and standard SDMs (d). The plots show the number of times a predictor was considered in a bivariate model for ESMs based on GLMs, GBMs and Maxent (b) and the mean Somers' D of the bivariate models per predictor, which was used as weight for ESMs (c). The variable importance for standard SDMs (d) is 1 minus the correlation coefficient between models including and excluding a variable.
The median number of bivariate models used to build an ESM (a model was used if its Somers' D > 0) was 50, with a minimum of 20 (a). For some species, more than half of the bivariate models were excluded from the ESM; however, in each ESM every predictor was retained in at least two bivariate models (b). In contrast to ESMs, every modelling technique of standard SDMs frequently dropped variables completely (i.e. variable importance is 0 in d). The overall pattern of variable importance for ESMs is the same as that for standard SDMs; however, each of the 11 predictors was always kept in the ensemble (c). This means that ESMs are not superior to standard SDMs because they use cross-validation to drop non-informative predictors; rather, they are superior because they keep such predictors in the ensemble, although down-weighted, instead of dropping them. Thus, ESMs are much more informative than standard SDMs, in which a variable is either in or out.
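The weighting described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the analyses: the function names, the dictionary layout and the toy numbers are hypothetical, and each bivariate model is assumed to have been fitted and evaluated beforehand. It relies only on the identity Somers' D = 2 × AUC − 1 and on the rule that bivariate models with Somers' D ≤ 0 are left out.

```python
from itertools import combinations


def somers_d(auc):
    """Rescale AUC to Somers' D, using D = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


def esm_prediction(pair_predictions, pair_aucs):
    """Somers' D weighted mean of bivariate-model predictions for one site.

    pair_predictions: {(var_a, var_b): predicted suitability} (hypothetical layout)
    pair_aucs:        {(var_a, var_b): evaluation AUC of that bivariate model}
    Bivariate models with Somers' D <= 0 are left out of the ensemble.
    """
    weights = {pair: somers_d(auc) for pair, auc in pair_aucs.items()}
    kept = {pair: w for pair, w in weights.items() if w > 0}
    if not kept:
        raise ValueError("no bivariate model has Somers' D > 0")
    total = sum(kept.values())
    return sum(pair_predictions[pair] * w for pair, w in kept.items()) / total


def predictor_counts(pairs):
    """Count how often each predictor occurs in the given bivariate models."""
    counts = {}
    for pair in pairs:
        for name in pair:
            counts[name] = counts.get(name, 0) + 1
    return counts


# 11 predictors give 11 * 10 / 2 = 55 candidate bivariate models.
predictors = [f"x{i}" for i in range(1, 12)]
all_pairs = list(combinations(predictors, 2))

# Toy values for three bivariate models (made up for illustration).
preds = {("x1", "x2"): 0.8, ("x1", "x3"): 0.4, ("x2", "x3"): 0.1}
aucs = {("x1", "x2"): 0.9, ("x1", "x3"): 0.6, ("x2", "x3"): 0.45}
ensemble = esm_prediction(preds, aucs)  # the third model (D < 0) is skipped
```

In the toy example the third model has an AUC below 0.5 and therefore receives no weight, while its two predictors still enter the ensemble through the other two models. Applying `predictor_counts` to the retained pairs gives the kind of tally shown in panel (b).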
Overcoming limitations of modelling rare species
Figure S2: Correlation matrix and histograms of model performance based on the five evaluation indices used: area under the curve (AUC), True Skill Statistic (TSS), sensitivity (Se), specificity (Sp) and Boyce index. Both standard SDMs and ESMs, as well as each of the three modelling techniques and their ensemble, are included in the figure. TSS, Se and Sp were highly correlated with AUC (|r| ≥ 0.7, shown in bold). Only the Boyce index was negligibly correlated with the other indices.
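For readers less familiar with these indices, the following Python sketch shows how four of them relate to each other. It is an illustration only: the function names, the 0.5 threshold and the toy data are assumptions, and the Boyce index is omitted. Sensitivity and specificity come from a thresholded confusion matrix, TSS = Se + Sp − 1, and AUC is the probability that a presence receives a higher prediction than an absence.

```python
def sensitivity_specificity_tss(observed, predicted, threshold):
    """Return (Se, Sp, TSS) for binary observations and a chosen threshold."""
    tp = fn = tn = fp = 0
    for obs, pred in zip(observed, predicted):
        is_positive = pred >= threshold
        if obs == 1 and is_positive:
            tp += 1
        elif obs == 1:
            fn += 1
        elif is_positive:
            fp += 1
        else:
            tn += 1
    se = tp / (tp + fn)  # share of presences predicted as present
    sp = tn / (tn + fp)  # share of absences predicted as absent
    return se, sp, se + sp - 1.0


def auc(observed, predicted):
    """Rank-based AUC: share of presence/absence pairs ranked correctly."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    score = 0.0
    for pres in presences:
        for absn in absences:
            if pres > absn:
                score += 1.0
            elif pres == absn:
                score += 0.5  # ties count as half
    return score / (len(presences) * len(absences))


# Toy data (made up): three presences followed by three absences.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
se, sp, tss = sensitivity_specificity_tss(observed, predicted, threshold=0.5)
auc_value = auc(observed, predicted)
```

Because TSS, Se and Sp all derive from the same confusion matrix and AUC summarises the same ranking over all thresholds, a strong correlation among these four indices is plausible.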
Figure S3: Model performance according to True Skill Statistic (TSS), sensitivity (Se) and specificity (Sp) and its relationship with different modelling strategies (Standard and ESM) and techniques (GBM, GLM and Maxent) as well as their ensemble prediction (EP).
Figure S4: Boxplots of Pearson correlation coefficients between the predictions of two modelling techniques when used as standard SDMs and within the ESM framework. The predictions of different techniques were clearly more similar with ESMs.
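The similarity measure used here can be written out as a short Python sketch. The function names and the toy predictions are hypothetical, and the sketch assumes that the predictions of two techniques are available for the same set of sites. The helper `variable_importance` applies the same statistic to the definition given for Figure S1 (d), 1 minus the correlation between models including and excluding a variable.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant predictions")
    return sxy / sqrt(sxx * syy)


def variable_importance(pred_with, pred_without):
    """1 minus the correlation between predictions with and without a variable."""
    return 1.0 - pearson_r(pred_with, pred_without)


# Toy predictions of two techniques at four sites (made up for illustration).
glm_pred = [0.1, 0.4, 0.6, 0.9]
gbm_pred = [0.2, 0.3, 0.7, 0.8]
r = pearson_r(glm_pred, gbm_pred)
```

A value of `r` close to 1 means the two techniques rank and scale the sites almost identically. An importance of 0 arises when removing a variable leaves the predictions perfectly correlated with those of the full model.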
Figure S5: Evaluation of the species distribution models by AUC (top) and the Boyce index (bottom) using the calibration dataset of the transferability assessment. For the transferability assessment, only the 34 species in the group 25 < n ≤ 50 (Fig. 3) were considered; their data were split into one half for calibrating and one half for independently evaluating the models. Significant differences (***: p ≤ 0.001, **: p ≤ 0.01, *: p ≤ 0.05, ƚ: p < 0.1) between standard SDMs and ESMs are indicated for each technique (GBM, GLM and Maxent) and the ensemble prediction (EP).
Figure S6: Mean AUC score (left) and Boyce index (right), with standard error bars, of ESMs and standard SDMs evaluated with the transferability assessment. The evaluation measures on the x-axes are based on 50-fold split sampling of the calibration dataset (western partition of Switzerland, see Fig. 4), while the evaluation measures on the y-axes are based on the independent test dataset (eastern partition of Switzerland, see Fig. 4).
Figure S7: AUC against an increasing number of averaged bivariate models to build an ESM evaluated on the test dataset of the transferability assessment for three species. Bivariate models were randomly added to the ESM until the ESM contained all 55 possible bivariate models. The grey lines in each row of panels represent 20 random runs, the red lines in each panel indicate the mean of the 20 runs and the black lines represent the standard error. The horizontal black lines show the AUC of the corresponding standard SDM. The species shown are a) Asplenium adulterinum, b) Sedum rubens and c) Typha minima.
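The accumulation procedure behind this figure can be sketched in Python as follows. This is a minimal illustration, not the original analysis code: the function names, the use of a weighted average with positive weights, and the toy data with four models instead of 55 are all assumptions. In each run the bivariate models are added in a random order and the AUC of the running ensemble is recorded after each addition.

```python
import random


def auc(observed, predicted):
    """Rank-based AUC: share of presence/absence pairs ranked correctly."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    score = 0.0
    for pres in presences:
        for absn in absences:
            if pres > absn:
                score += 1.0
            elif pres == absn:
                score += 0.5
    return score / (len(presences) * len(absences))


def accumulation_curve(observed, model_predictions, weights, rng):
    """AUC of the ensemble after each model is added in random order."""
    order = list(range(len(model_predictions)))
    rng.shuffle(order)
    weighted_sum = [0.0] * len(observed)
    weight_total = 0.0
    curve = []
    for idx in order:
        for site, value in enumerate(model_predictions[idx]):
            weighted_sum[site] += weights[idx] * value
        weight_total += weights[idx]
        ensemble = [v / weight_total for v in weighted_sum]
        curve.append(auc(observed, ensemble))
    return curve


# Toy test data (made up): six sites, four bivariate models.
observed = [1, 1, 0, 0, 1, 0]
model_predictions = [
    [0.9, 0.6, 0.2, 0.4, 0.7, 0.1],
    [0.5, 0.8, 0.6, 0.3, 0.4, 0.2],
    [0.7, 0.3, 0.5, 0.1, 0.9, 0.6],
    [0.6, 0.7, 0.3, 0.5, 0.8, 0.4],
]
weights = [0.8, 0.2, 0.4, 0.6]  # assumed positive, e.g. Somers' D

rng = random.Random(1)
runs = [accumulation_curve(observed, model_predictions, weights, rng)
        for _ in range(20)]
# Mean AUC across the 20 runs at each ensemble size.
mean_curve = [sum(step) / len(step) for step in zip(*runs)]
```

Every run ends at the same value, because the full ensemble does not depend on the order in which the models were added. The runs differ only in how quickly they approach that value.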
Figure S8: Boxplots of the performance of Ensemble Small Models (ESM) in comparison to different approaches to build standard SDMs. In addition to the approach used in the main text (BIC variable selection for GLMs and no further variable selection for Maxent), models for both techniques were built following an all-subsets approach, selecting the best-performing model based on cross-validated (CV) AUC scores. 50-fold split sampling of the training data from the transferability assessment was used for model evaluation. Model performance was not significantly improved by an all-subsets approach with cross-validated variable selection compared to selection using stepwise BIC for GLM and no pre-selection for Maxent. Performance with cross-validated variable selection was even lower when evaluation was based on the Boyce index. These results also hold when models were evaluated using the test data of the transferability assessment (Fig. S9).
Figure S9: Performance of Ensemble Small Models (ESM) in comparison to different approaches to build standard SDMs, analogous to Table S2 but evaluated with the testing data from the transferability assessment. Model performance was always highest for ESM.