Supporting Information: Figures S1-S9

Overcoming limitations of modelling rare species

Figure S1: Number of bivariate models used to build ESMs (a) for different sample sizes (N1 ≤ 25; 25 < N2 ≤ 50; N3 > 50) and variable importance for ESMs (b and c) and standard SDMs (d). The plots show the number of times a predictor was considered in a bivariate model for ESMs based on GLMs, GBMs and Maxent (b) and the mean Somers' D of the bivariate models per predictor, which was used as the weight for ESMs (c). The variable importance for standard SDMs (d) is 1 minus the correlation coefficient between models including and excluding a variable. The median number of bivariate models considered to build ESMs (considered if Somers' D > 0) was 50, with a minimum of 20 bivariate models (a). For some species more than half of the bivariate models were excluded from the ESMs; however, for each ESM every predictor was kept at least twice in a bivariate model (b). In contrast to ESMs, every modelling technique used for standard SDMs frequently dropped variables completely (i.e. variable importance is 0 in d). The overall pattern of variable importance for ESMs is the same as for standard SDMs; however, each of the 11 predictors was always kept for the ensemble (c). This means that ESMs are not superior to standard SDMs because they use cross-validation to drop non-informative predictors; rather, they are superior because they do not drop such predictors but keep them (although down-weighted) for the ensemble. Thus, ESMs are much more informative than standard SDMs, where a variable is either in or out.

Figure S2: Correlation matrix and histograms of model performance based on the five evaluation indices used: Area Under the Curve (AUC), True Skill Statistic (TSS), sensitivity (Se), specificity (Sp) and Boyce index.
Both standard SDMs and ESMs, as well as each of the three modelling techniques and their ensemble, are included in the figure. TSS, Se and Sp were highly correlated with AUC (|r| ≥ 0.7, shown in bold). Only the Boyce index was negligibly correlated with the other indices.

Figure S3: Model performance according to the True Skill Statistic (TSS), sensitivity (Se) and specificity (Sp) and its relationship with different modelling strategies (standard and ESM) and techniques (GBM, GLM and Maxent) as well as their ensemble prediction (EP).

Figure S4: Boxplots of Pearson correlation coefficients between the predictions of two modelling techniques when used as standard SDMs and within the ESM framework. The predictions of different techniques were clearly more similar with ESMs.

Figure S5: Evaluation of the species distribution models by AUC (top) and the Boyce index (bottom) using the calibration dataset of the transferability assessment. For the transferability assessment, only the 34 species in the group 25 < n ≤ 50 (Fig. 3) were considered, and their data were split into one half for calibrating and one half for independently evaluating the models. Significant differences (***: p ≤ 0.001, **: p ≤ 0.01, *: p ≤ 0.05, †: p < 0.1) between standard SDMs and ESMs are indicated for each technique (GBM, GLM and Maxent) and the ensemble prediction (EP).

Figure S6: Mean AUC score (left) and Boyce index (right), with standard error bars, of ESMs and standard SDMs evaluated with the transferability assessment. The evaluation measures on the x-axes are based on 50-fold split sampling of the calibration dataset (western partition of Switzerland, see Fig.
S4), while the evaluation measures on the y-axes are based on the independent test dataset (eastern partition of Switzerland, see Fig. 4).

Figure S7: AUC against an increasing number of averaged bivariate models used to build an ESM, evaluated on the test dataset of the transferability assessment for three species. Bivariate models were randomly added to the ESM until the ESM contained all 55 possible bivariate models. The grey lines in each row of panels represent 20 random runs, the red lines in each panel indicate the mean of the 20 runs and the black lines represent the standard error. The horizontal black lines show the AUC of the corresponding standard SDM. The species shown are a) Asplenium adulterinum, b) Sedum rubens and c) Typha minima.

Figure S8: Boxplots of the performance of Ensemble Small Models (ESM) in comparison to different approaches to building standard SDMs. In addition to the approach used in the main text (BIC variable selection for GLMs and no further variable selection for Maxent), models for both techniques were built following an all-subset approach, selecting the best-performing model based on cross-validated (CV) AUC scores. 50-fold split sampling of the training data from the transferability assessment was used for model evaluation. Model performance was not significantly improved by an all-subset approach with cross-validated variable selection compared to selection using stepwise BIC for GLM and no pre-selection for Maxent. The performance of variable selection using cross-validation was even lower when evaluation was based on the Boyce index. These results also hold when models were evaluated using the test data of the transferability assessment (Fig. S9).
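
As a rough illustration of the weighting described for Figure S1 and of the 55 bivariate models mentioned for Figure S7, the following Python sketch averages bivariate-model predictions using Somers' D as the weight and skips models with D ≤ 0. It assumes that Somers' D is derived from AUC as D = 2·AUC − 1 and that the ensemble is a plain weighted mean; the function names and all toy numbers are hypothetical and are not taken from the study.

```python
from itertools import combinations


def somers_d(auc):
    """Somers' D rescaled from AUC (assumed relation: D = 2 * AUC - 1)."""
    return 2.0 * auc - 1.0


def esm_weighted_mean(predictions, aucs):
    """Average bivariate-model predictions, weighted by Somers' D.

    predictions: one list of per-site predictions per bivariate model.
    aucs: one cross-validated AUC per bivariate model (hypothetical input).
    Models with Somers' D <= 0 are skipped, as described for Figure S1.
    """
    weights = [somers_d(auc) for auc in aucs]
    kept = [(pred, w) for pred, w in zip(predictions, weights) if w > 0]
    if not kept:
        raise ValueError("no bivariate model has Somers' D > 0")
    total_weight = sum(w for _, w in kept)
    n_sites = len(kept[0][0])
    return [
        sum(pred[i] * w for pred, w in kept) / total_weight
        for i in range(n_sites)
    ]


# All unordered pairs of 11 predictors: one bivariate model per pair.
predictor_pairs = list(combinations(range(11), 2))

# Toy example with three bivariate models and two sites.
toy_predictions = [[0.2, 0.8], [0.6, 0.4], [1.0, 1.0]]
toy_aucs = [0.75, 0.75, 0.5]  # the third model has D = 0 and is skipped
toy_ensemble = esm_weighted_mean(toy_predictions, toy_aucs)
```

With 11 predictors, `predictor_pairs` has one entry per possible bivariate model, which matches the 55 models stated in the caption of Figure S7. Down-weighting rather than dropping means that a predictor still contributes to the ensemble as long as at least one of its pairs has a positive weight.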
Figure S9: Performance of Ensemble Small Models (ESM) in comparison to different approaches to build standard SDMs analogous to Table S2 but evaluated with the testing data from the transferability assessment. The model performance was always highest for ESM.
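
The threshold-dependent indices used in Figures S2 and S3 can be illustrated with a minimal Python sketch. It assumes the standard definitions Se = TP / (TP + FN), Sp = TN / (TN + FP) and TSS = Se + Sp − 1, and an arbitrary threshold of 0.5; the function name, the threshold and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity_tss(observed, predicted_prob, threshold=0.5):
    """Return (Se, Sp, TSS) for binary observations and predicted probabilities.

    observed: 1 for presence, 0 for absence.
    predicted_prob: predicted suitability per site, same order as observed.
    threshold: cut-off for converting probabilities to presence/absence
               (0.5 is an arbitrary choice for this sketch).
    """
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted_prob):
        predicted_presence = prob >= threshold
        if obs == 1 and predicted_presence:
            tp += 1
        elif obs == 1:
            fn += 1
        elif predicted_presence:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)  # share of presences predicted as present
    specificity = tn / (tn + fp)  # share of absences predicted as absent
    return sensitivity, specificity, sensitivity + specificity - 1.0


# Toy data: three presences and four absences.
toy_observed = [1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4]
toy_se, toy_sp, toy_tss = sensitivity_specificity_tss(toy_observed, toy_probs)
```

Because TSS is a direct function of Se and Sp at one threshold, some correlation among these three indices is expected by construction.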