Appendix S2: Model sensitivity to heterogeneity in survey coverage
Introduction
The geographical distribution of line transect surveys available for this study was very patchy, with the highest concentration of effort occurring in Longhurst’s (2007) North West Atlantic Shelves (NWCS), Gulf Stream (GFST), and Caribbean (CARB) biogeographical provinces, within 200 nmi of the United States and southeast Canada (Fig. 1 of main text). Little or no survey effort was available for large portions of the AFTT study area, either because the region had never been surveyed or because we were unable to establish collaborations that would grant us access to extant surveys. This prompted the question: how well do our models predict cetacean density in these regions?
When a model fitted to data in one region yields accurate predictions in a novel region not used in model fitting, the model is said to be transferable to the novel region (Randin et al. 2006). A large part of our methodology was concerned with obtaining models that transferred well to the unsurveyed regions of our study area, e.g., by limiting models to a small number of covariates, constraining them to simple smooth relationships, and using covariates that have sound ecological relationships with cetacean distributions (Wenger & Olden 2012). To evaluate model transferability, we performed several qualitative assessments, such as mapping where model predictions were made outside sampled covariate ranges, examining alternate models, and comparing predicted density surfaces to maps of cetacean sightings taken from the OBIS-SEAMAP repository (Halpin et al. 2009), which catalogued sightings from a wide range of sources that could not be used in our models. We present and discuss these qualitative assessments in the taxon-specific reports that accompany this paper.
Building on prior studies, Wenger and Olden (2012) proposed a more formal method for evaluating model transferability using cross-validation of non-random subsets of modeled data. Traditionally, cross-validation is used when all data must be used to fit a model and none are available to validate it. In this situation, error estimates obtained from the fitted model will be biased low. Cross-validation yields a less biased estimate by splitting the data into subsets and, for each subset, withholding it, refitting the model on the remaining data, and using the refitted model to predict the held-out subset. Typically, the records are split randomly into equal-sized subsets. In the most exhaustive approach, called leave-one-out cross-validation, each subset consists of a single record.
This approach can provide an unbiased estimate of model error for the modeled data but not for new data; thus, it does not assess how transferable the model might be to unsampled regions. To do that, Wenger and Olden proposed cross-validating non-random groups of the modeled data (e.g., geographic subsets) and using the results as a surrogate estimate of the error that would occur on an independent data set. The idea is that if cross-validation groups differ from each other in the same way that an independent data set would differ from them, then the resulting cross-validation error would be a reasonable surrogate for what could occur with the independent data. Here, we present results of such an experiment for our study.
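The contrast between random and non-random cross-validation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic two-region data, the straight-line model fitted with NumPy, and the helper names `random_folds`, `group_folds`, and `cv_rmse` are hypothetical and are not the models or code used in this study.

```python
import numpy as np

def random_folds(n_records, k, seed=0):
    """Traditional k-fold: shuffle record indices and cut them into k near-equal folds."""
    rng = np.random.default_rng(seed)
    return [np.sort(fold) for fold in np.array_split(rng.permutation(n_records), k)]

def group_folds(groups):
    """Non-random folds: one fold per distinct group label (e.g., a geographic region)."""
    groups = np.asarray(groups)
    return {g: np.flatnonzero(groups == g) for g in np.unique(groups)}

def cv_rmse(x, y, folds):
    """Withhold each fold, refit a straight line on the rest, and score the held-out fold."""
    residuals = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        slope, intercept = np.polyfit(x[train], y[train], deg=1)
        residuals.append(y[held_out] - (slope * x[held_out] + intercept))
    return float(np.sqrt(np.mean(np.concatenate(residuals) ** 2)))

# Synthetic example (assumption): two regions that occupy different covariate ranges,
# with a curved response that a straight line can only approximate.
x = np.concatenate([np.linspace(0.0, 1.0, 20), np.linspace(2.0, 3.0, 20)])
y = x ** 2
region = np.array(["A"] * 20 + ["B"] * 20)

random_rmse = cv_rmse(x, y, random_folds(len(y), k=5))
region_rmse = cv_rmse(x, y, group_folds(region).values())
print(f"random 5-fold RMSE: {random_rmse:.2f}")
print(f"leave-one-region-out RMSE: {region_rmse:.2f}")
```

In this toy case the random folds always leave records from both regions in the training set, so the held-out error stays small, whereas withholding a whole region forces the line to extrapolate and the error grows. That gap is the kind of transferability signal the grouped approach is meant to expose.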
Methods
Cross-Validation Regions
Wenger and Olden (2012) provided some general guidelines for how to divide data sets into cross-validation groups but noted that these decisions require a degree of professional judgement and depend on how the predictions are to be used. For our study, the key question was: how would results differ if line transect surveys were available for the regions of the AFTT study area for which we had none? Of particular concern was the northern part of the study area, which had been surveyed by organizations operating in Canada and Greenland (Lawson & Gosselin 2009; Hansen & Heide-Jørgensen 2013) that were unwilling to share data for our analysis. Therefore, we sought to divide the surveys into cross-validation regions that resembled the scale and geographical configuration of typical cetacean line transect survey programs that had been conducted or could be conducted in the future.
To define these regions, we first separated the surveys into broad geographic groups: North America, Europe, the greater Caribbean, and the Mid-Atlantic Ridge. Next, for North America and Europe, we split the data at the continental shelf break (usually at the 125 m isobath) into “shelf” and “offshore” subsets. The shelf break is an important ecological feature or boundary for many cetacean species in the western North Atlantic (Roberts et al. 2016). Also, in the U.S., where most of the surveys available to us occurred, the shelf break was often the boundary between on-shelf aerial programs and off-shelf shipboard programs. We did not split data from the Caribbean this way, owing to the relatively limited data in the region.
Next, we split the surveys in North America into Gulf of Mexico, Southeast, and Northeast regions, with the Southeast/Northeast split occurring at Cape Hatteras, North Carolina, at 35.25°N. These splits reflected both the spatial scale of survey programs and the differences in cetacean species communities found in the regions (Schick et al. 2011; Roberts et al. 2016). Finally, we split the Northeast on-shelf region at 41°N, reflecting both the ecological differences between the greater Gulf of Maine region and the Mid-Atlantic Bight and the substantial heterogeneity in survey effort between the two regions. Figure 1 shows the final regions.
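As a rough illustration of how these rules combine into 11 regions, the following Python sketch assigns a region label to a single survey segment. It is a simplification under stated assumptions, not the procedure actually used: the function name `assign_region`, its arguments, and the label "European Shelf" are hypothetical, the shelf break is treated as a fixed 125 m depth threshold even though the text says "usually", the North American sub-area is supplied by the caller rather than derived from coordinates, and the part of the Northeast shelf north of 41°N is taken to be the Northeast Shelf region.

```python
def assign_region(broad_group, depth_m, subarea=None, lat=None, shelf_break_m=125.0):
    """Return a cross-validation region label for one survey segment.

    broad_group: "North America", "Europe", "Caribbean", or "Mid-Atlantic Ridge".
    depth_m: positive water depth in metres at the segment.
    subarea: North America only: "Gulf of Mexico", "Southeast", or "Northeast".
    lat: latitude in degrees north; only used for Northeast shelf segments.
    """
    if broad_group in ("Caribbean", "Mid-Atlantic Ridge"):
        return broad_group  # these groups were not split at the shelf break
    zone = "Shelf" if depth_m <= shelf_break_m else "Offshore"
    if broad_group == "Europe":
        return f"European {zone}"
    if broad_group != "North America":
        raise ValueError(f"unknown broad group: {broad_group!r}")
    if subarea not in ("Gulf of Mexico", "Southeast", "Northeast"):
        raise ValueError(f"unknown North American sub-area: {subarea!r}")
    if subarea == "Northeast" and zone == "Shelf":
        if lat is None:
            raise ValueError("latitude is required for Northeast shelf segments")
        # Assumption: Gulf of Maine side (north of 41N) keeps the "Northeast Shelf" label.
        return "Northeast Shelf" if lat >= 41.0 else "Mid-Atlantic Shelf"
    return f"{subarea} {zone}"

# Enumerate every combination of inputs to list the distinct region labels.
# The latitude is ignored for everything except Northeast shelf segments.
labels = set()
for depth in (50.0, 2000.0):
    labels.add(assign_region("Caribbean", depth))
    labels.add(assign_region("Mid-Atlantic Ridge", depth))
    labels.add(assign_region("Europe", depth))
    for subarea in ("Gulf of Mexico", "Southeast", "Northeast"):
        for lat in (38.0, 43.0):
            labels.add(assign_region("North America", depth, subarea, lat))
print(len(labels), sorted(labels))
```

Enumerating the combinations this way is a quick check that the splitting rules produce the expected number of regions before any models are refitted.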
Cross-Validation Procedure
For each taxon, we performed the following procedure. First, for each of the cross-validation regions included in the taxon’s full model, we withheld the survey effort segments for that region and performed the entire model fitting and selection procedure (see Methods in main text) on the remaining segments, obtaining the model that would have resulted had survey segments from the withheld region not been available. For taxa for which we had excluded some surveys from their full models, we restricted the cross-validation to the regions comprising the surveys that were included. For example, for the sei whale (Balaenoptera borealis) model, for which we had excluded surveys from Europe, we restricted the cross-validation to the 9 included regions instead of the full 11. (The accompanying taxon-specific reports describe the data used for each taxon.)
With traditional cross-validation, the next step is to use each cross-validation model to predict the data that were withheld from it, and then, once such predictions have been obtained across all withheld sets, concatenate them and produce summary statistics across the aggregate. For our experiment, we wanted to examine how well each cross-validation model predicted all of the regions, not just how well each model predicted its withheld region. We wanted to know, for each model, whether only the withheld region would be affected by the loss of data or whether multiple regions would be affected. Therefore, we applied each cross-validation model across the entire data set, obtaining a prediction for every region from each of the models. To summarize the results, we produced bar plots for each region showing the mean density estimated by each model for the survey segments in the region. When computing the mean, we weighted each segment by the area it covered (length multiplied by detection function truncation distance). For comparison, we included bars for the observed density and the density predicted by the full model that included all segments.
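The area-weighted summary described above can be sketched in Python as follows. This is only an illustration under assumptions: the function names, the kilometre units, the series names, and the four toy segments are hypothetical, and the per-segment densities would in practice come from the observations, the full model, and each cross-validation model.

```python
import numpy as np

def weighted_mean_density(density, length_km, truncation_km):
    """Mean density over segments, weighting each by length x truncation distance."""
    area = np.asarray(length_km, dtype=float) * np.asarray(truncation_km, dtype=float)
    return float(np.sum(np.asarray(density, dtype=float) * area) / np.sum(area))

def summarize_by_region(region, length_km, truncation_km, densities):
    """Return {region: {series name: weighted mean density}} for every series given."""
    region = np.asarray(region)
    length_km = np.asarray(length_km, dtype=float)
    truncation_km = np.asarray(truncation_km, dtype=float)
    table = {}
    for r in np.unique(region):
        in_region = region == r
        table[str(r)] = {
            name: weighted_mean_density(
                np.asarray(values, dtype=float)[in_region],
                length_km[in_region],
                truncation_km[in_region],
            )
            for name, values in densities.items()
        }
    return table

# Toy data (assumption): four segments in two regions, with one density series for the
# observations, one for the full model, and one per model that withheld a region.
region = ["A", "A", "B", "B"]
length_km = [10.0, 30.0, 5.0, 5.0]
truncation_km = [1.0, 1.0, 2.0, 2.0]
densities = {
    "observed": [2.0, 4.0, 0.0, 1.0],
    "full model": [2.5, 3.5, 0.2, 0.8],
    "without A": [1.0, 1.5, 0.1, 0.9],
    "without B": [2.4, 3.6, 1.5, 2.5],
}

table = summarize_by_region(region, length_km, truncation_km, densities)
for name, per_series in table.items():
    print(name, per_series)
```

Each inner dictionary corresponds to one group of bars in the region's bar plot: one bar for the observed density, one for the full model, and one for each model that withheld a region.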
Figure 1. Cross-validation regions used in this experiment. Inset bar plot summarizes the survey effort in each region.
Results
We present results for the three taxa presented in the main text of the paper, representing three ecologically distinct cetacean families. When interpreting them, it is important to keep in mind that each result represents the mean density of the survey segments that occurred in the region, not of the whole region polygon. Because the survey segments were distributed heterogeneously within each region according to the objectives of the surveying organizations, the density of the segments may not accurately estimate the density across the region’s geographic extent.
Sei whale
The sei whale, one of the least studied baleen whales, inhabits temperate and subpolar waters of the northern and southern hemispheres and is believed to migrate seasonally to high-latitude feeding areas in summer and lower-latitude calving areas in winter, although the locations of calving grounds are presently unknown (Prieto et al. 2012). Of the 9 regions included in the sei whale model, the Mid-Atlantic Ridge showed the highest observed density, an order of magnitude higher than any other region (Fig. 2). This region was surveyed only once, from 4 June to 2 July 2004, with 53 sightings reported, all in the northern half of the survey and especially concentrated around the Charlie Gibbs Fracture Zone, where 80 individuals were reported (Waring et al. 2008). The density predicted by our full model was only half of the observed density. Models that excluded other Offshore regions, the Caribbean, and the Northeast Shelf all predicted substantially higher density, yet still not as high as the observed density, while the model that excluded the Mid-Atlantic Ridge predicted very low density.
Together, these results suggest that the Mid-Atlantic Ridge is unique among the regions we studied and that it cannot be easily modeled without including data from within the region. We caution that these results are based on a single month-long survey. Seasonal and inter-annual variability in sei whale density here remains unknown, as the area has never been resurveyed. It is also important to note that the single survey occurred during peak summer, while other regions received survey effort across the entire seasonal period used in the sei whale model (April–October).
In the other regions, observed density was highest in the Northeast Shelf, substantially lower in the Mid-Atlantic Shelf and Northeast Offshore, and near zero in the other five regions (Fig. 3), consistent with reports in the literature. In the Northeast Shelf, where most sightings were reported, predicted densities closely matched observed density for all models except the cross-validation model that dropped the Northeast Shelf segments. Still, that model’s predicted density was within ~25% of observed density, indicating that extrapolating into this region from external data is much more feasible than doing so for the Mid-Atlantic Ridge.
In the Mid-Atlantic Shelf, the region of next highest observed density, all models predicted ~50–100% higher density than was observed, except when the Northeast Shelf was excluded. In the Northeast Offshore, where low density was observed, all models over-predicted, but less so when another Offshore region or the Northeast Shelf was excluded. Extreme over-prediction occurred when the Northeast Offshore itself was excluded. We suspect this resulted from the similarity in covariate space between the Northeast Offshore and the northern Mid-Atlantic Ridge, both cold, deep, and productive environments. Still, this prediction was only ~25% of that predicted for the Mid-Atlantic Ridge by the same model, indicating some ability to discriminate between the two environments.
Figure 2. Sei whale cross-validation results for each region (Fig. 1). Bars are the densities observed (black), modeled from all segments (gray), and modeled via cross-validation (colors). Densities are the mean for all survey segments in the region weighted by their areas (segment length multiplied by detection function truncation distance). Sei whale was modeled with segments from 9 of the 11 regions (see the taxon-specific report for more information).
Figure 3. Reproduction of Fig. 2 with Mid-Atlantic Ridge region dropped and y axis rescaled, to allow better inspection of results for other regions.
Kogia spp.
The dwarf sperm whale (Kogia sima) and pygmy sperm whale (Kogia breviceps), modeled together as the Kogia guild, are endemic to tropical and temperate offshore waters (Bloodworth & Odell 2008; McAlpine 2009). Consistent with this, all models predicted near-zero density for shelf regions (Fig. 4). Highest densities were observed in the Gulf of Mexico Offshore and Caribbean (which included both offshore and shelf segments), where waters are warm year-round. Lower densities were observed in the Southeast and Northeast Offshore regions. The full model predicted the highest densities in the Gulf of Mexico and Southeast Offshore, and lower densities in the Caribbean and Northeast Offshore.
In the Gulf of Mexico Offshore, where the most Kogia sightings were made, the full model predicted density close to the observed density but over-predicted in the other offshore regions, most substantially in the Southeast Offshore, which is similar to and strongly connected to the Gulf of Mexico by the Loop Current and Gulf Stream. When the Gulf of Mexico Offshore segments were excluded, predicted densities in the Southeast and Northeast Offshore were much closer to observed densities, while predictions for the Caribbean switched from moderate over-prediction to moderate under-prediction.
These results collectively show that the models performed very well at isolating Kogia to offshore environments (the on-shelf/offshore pattern was successfully reproduced no matter which cross-validation region was excluded) but that prediction of offshore densities was strongly influenced by whether the Gulf of Mexico Offshore region was included.
Figure 4. Kogia spp. cross-validation results for each region (Fig. 1). Bars are the densities observed (black), modeled from all segments (gray), and modeled via cross-validation (colors). Densities are the mean for all survey segments in the region weighted by their areas (segment length multiplied by detection function truncation distance). Kogia spp. was modeled with segments from 8 of the 11 regions (see the taxon-specific report for more information).
Striped dolphin
The striped dolphin (Stenella coeruleoalba) is generally believed to inhabit tropical and warm temperate waters, and is usually found outside the continental shelf (Archer & Perrin 1999). In the North Atlantic, sea surface temperature (SST) may be an important constraint on the species’ range, with oceanographic features such as the meanderings of the Gulf Stream possibly determining the northern limit (Bloch et al. 1996; Archer & Perrin 1999). Observed densities were highest in the Northeast Offshore, Mid-Atlantic Ridge, and European Offshore, with lower densities observed in the Gulf of Mexico Offshore and Caribbean, and negligible density reported in all Shelf regions (Fig. 5). The observed densities are consistent with the view that the species does not inhabit the shelf and suggest that in the North Atlantic the species is more common in the cooler portions of its range. Perhaps reflecting this, none of the five lowest-AIC full models selected SST as a covariate; instead, they selected covariates related to productivity and dynamic features such as SST fronts and sea surface height anomalies (see the accompanying striped dolphin taxon-specific report).
The density predictions from the selected full model were generally consistent with observed densities, correctly isolating the species to offshore regions and reproducing the pattern of higher densities in cooler regions (Fig. 5). However, the model under-predicted density in the two areas of highest observed density: in the Northeast Offshore, predicted density was ~60% of observed density, while in the Mid-Atlantic Ridge, it was only ~10%. In areas of low density (the Gulf of Mexico Offshore, Caribbean, and Southeast Offshore), the model modestly over-predicted density in absolute terms.
The cross-validation revealed strong sensitivity to data loss in three regions. In the Northeast Offshore, predicted density dropped by ~75% when the survey segments from this region were excluded, exacerbating the under-prediction of density here. In the European Offshore, predicted density increased by a factor of 6 when these segments were excluded, resulting in a large over-prediction of density. Finally, in the Gulf of Mexico Offshore, predicted density tripled when these segments were excluded.
Together, these results suggest that the available survey data were sufficient for correctly modeling striped dolphin as an offshore species that inhabits warm and temperate waters, with highest densities occurring in dynamic, productive waters, but that the models are sensitive to data gaps at the geographic scale of the cross-validation regions.
Figure 5. Striped dolphin cross-validation results for each region (Fig. 1). Bars are the densities observed (black), modeled from all segments (gray), and modeled via cross-validation (colors). Densities are the mean for all survey segments in the region weighted by their areas (segment length multiplied by detection function truncation distance). Striped dolphin was modeled with segments from all 11 regions.
Discussion and conclusion
In this experiment, we analyzed the sensitivity of three density models to heterogeneity in survey coverage using a non-random cross-validation approach (Wenger & Olden 2012). We split the available line transect surveys into 11 geographical regions along boundaries based on cetacean ecology and patterns in survey design, then excluded each region in turn and examined the predictions of the resulting models.
The results indicated that the models remained generally capable of reproducing overall inter-regional patterns in taxa distributions when data from one region were withheld. In nearly all cross-validation scenarios, the models were able to correctly determine whether a taxon was present or absent in each region. In the regions where taxa were present, the models were often able to rank the regions from highest to lowest density in an order similar to what was observed.
When densities predicted by cross-validation models are compared to densities predicted by full models, some patterns in model sensitivity emerge. For all three taxa, when the region of highest observed abundance was excluded, the cross-validation model substantially under-predicted density in that region while the full model performed much better. Similarly, in some regions of relatively low but non-zero density, dropping these regions resulted in a substantial over-prediction of density relative to the full model, e.g., for sei whale in the Northeast Offshore, or striped dolphin in the European or Gulf of Mexico Offshore. Finally, when the region with the highest number of sightings was dropped (for sei whale: Northeast Shelf; Kogia: Gulf of Mexico Offshore; striped dolphin: Northeast Offshore), predictions generally improved in other regions.
In conclusion, this experiment suggests that our models are likely to offer plausible predictions of species occupancy (presence or absence) in unsurveyed areas of the AFTT area. They may also plausibly indicate where density is higher or lower for regions of the geographic scale we tested. However, absolute density predictions in unsurveyed areas should be viewed as speculative and interpreted cautiously. If a taxon’s area of highest density is unsurveyed, this experiment suggests our models will under-predict density there. Unsampled areas of intermediate density may be over- or under-predicted. The highest caution is advised for species believed to inhabit cold-temperate and subpolar waters, which represent a large portion of the AFTT area but for which little survey effort was available.
Literature Cited
Archer FI, Perrin WF. 1999. Stenella coeruleoalba. Mammalian Species 603:1–9.
Bloch D, Desportes G, Petersen A, Sigurjónsson J. 1996. Strandings of striped dolphins (Stenella coeruleoalba) in Iceland and the Faroe Islands and sightings in the northeast Atlantic, north of 50°N latitude. Marine Mammal Science 12:125–132.
Bloodworth BE, Odell DK. 2008. Kogia breviceps (Cetacea: Kogiidae). Mammalian Species 819:1–12.
Halpin P et al. 2009. OBIS-SEAMAP: The World Data Center for Marine Mammal, Sea Bird, and Sea Turtle Distributions. Oceanography 22:104–115.
Hansen RG, Heide-Jørgensen MP. 2013. Spatial trends in abundance of long-finned pilot whales, white-beaked dolphins and harbour porpoises in West Greenland. Marine Biology 160:2929–2941.
Lawson JW, Gosselin J-F. 2009. Distribution and preliminary abundance estimates for cetaceans seen during Canada’s Marine Megafauna Survey - A component of the 2007 TNASS. Canadian Science Advisory Secretariat = Secrétariat canadien de consultation scientifique. Available from http://biblio.uqar.qc.ca/archives/30125408.pdf (accessed March 25, 2014).
Longhurst AR. 2007. Ecological geography of the sea. Academic Press.
McAlpine DF. 2009. Pygmy and dwarf sperm whales. Pages 936–938 in Encyclopedia of marine mammals, 2nd edition. Academic Press.
Prieto R, Janiger D, Silva MA, Waring GT, Gonçalves JM. 2012. The forgotten whale: a bibliometric analysis and literature review of the North Atlantic sei whale Balaenoptera borealis: North Atlantic sei whale review. Mammal Review 42:235–272.
Randin CF, Dirnböck T, Dullinger S, Zimmermann NE, Zappa M, Guisan A. 2006. Are niche-based species distribution models transferable in space? Journal of Biogeography 33:1689–1703.
Roberts JJ et al. 2016. Habitat-based cetacean density models for the U.S. Atlantic and Gulf of Mexico. Scientific Reports 6:22615.
Schick R et al. 2011. Community structure in pelagic marine mammals at large spatial scales. Marine Ecology Progress Series 434:165–181.
Waring GT, Nøttestad L, Olsen E, Skov H, Vikingsson G. 2008. Distribution and density estimates of cetaceans along the mid-Atlantic Ridge during summer 2004. Journal of Cetacean Research and Management 10:137–146.
Wenger SJ, Olden JD. 2012. Assessing transferability of ecological models: an underappreciated aspect of statistical validation. Methods in Ecology and Evolution 3:260–267.