Appendix S1 – Statistical properties of the abundance-weighted average estimator for center of gravity

Several previous studies have used an “abundance-weighted average” estimator for the center of gravity for a population in each year. This estimator averages the location of sample observations $s_i$, where location is generally summarized using spatial coordinates, $s_i = (\mathrm{Lat}(i), \mathrm{Lon}(i))^T$, and where each location is weighted by the observed biomass at that location (where other metrics, including counts of individuals or abundance, have also sometimes been used):

$$\bar{s}(t) = \sum_{i=1}^{n_{obs}(t)} \frac{c_i}{\sum_{j=1}^{n_{obs}(t)} c_j} \, s_i$$

where $n_{obs}(t)$ is the number of survey occasions in year $t$, and $c_i$ is biomass (in kg) for the $i$-th sample in year $t$. Assuming that samples are proportional to local density, $E(c_i) \propto d(s,t)$, the expected value of this estimator is a function of the sampling distribution function $\mathcal{P}(s,t)$ and the species density function $d(s,t)$:

$$E(\bar{s}(t)) = \int s \, \frac{\mathcal{P}(s,t) \, d(s,t)}{c(t)} \, ds$$

where $E(\bar{s}(t))$ is the expected value for the abundance-weighted average estimator, and $c(t) = \int \mathcal{P}(s,t) \, d(s,t) \, ds$ is an integration constant for total sample-weighted abundance. This equation implies that $E(\bar{s}(t))$ will decrease if $E(s\mathcal{P}(s,t))$ decreases and vice-versa. Therefore, we conclude a priori that the abundance-weighted estimator will result in biased estimates of trends in the center-of-gravity whenever the sampling intensity function $\mathcal{P}(s,t)$ itself has a spatial trend over time. The expectation $E(\bar{s}(t))$ also includes population density $d(s,t)$ only via its product with sampling intensity $\mathcal{P}(s,t)$, so we conclude that the two processes are perfectly confounded by the estimator.

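The confounding of sampling intensity and density can be illustrated numerically. The following is a minimal Python sketch, assuming a one-dimensional grid of northings; the function names, the toy density, and the two sampling-intensity profiles are hypothetical and are not taken from the case study.

```python
def abundance_weighted_average(locations, catches):
    """Abundance-weighted average of 1-D locations: sum(c_i * s_i) / sum(c_j)."""
    total = sum(catches)
    if total <= 0:
        raise ValueError("total catch must be positive")
    return sum(c * s for s, c in zip(locations, catches)) / total


def expected_awa(grid, sampling_intensity, density):
    """Discretized E[s_bar(t)] = sum(s * P * d) / sum(P * d) over grid cells."""
    weights = [p * d for p, d in zip(sampling_intensity, density)]
    return abundance_weighted_average(grid, weights)


# Hypothetical northings (km) and a density that is identical in both cases.
grid = [0.0, 100.0, 200.0, 300.0, 400.0]
density = [1.0, 2.0, 3.0, 2.0, 1.0]

uniform_sampling = [1.0, 1.0, 1.0, 1.0, 1.0]   # equal effort everywhere
northern_sampling = [0.5, 0.5, 1.0, 1.5, 1.5]  # effort concentrated in the north

true_cog = expected_awa(grid, uniform_sampling, density)
biased_cog = expected_awa(grid, northern_sampling, density)
print(true_cog, biased_cog)
```

In this toy example the density is unchanged, yet the expected estimate moves from 200 km to roughly 244 km solely because sampling effort is concentrated further north.
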
Appendix S2 – Calculating center of gravity, population boundaries, the population kernel, and area occupied for the species distribution function model

Our species distribution model involves estimating a species density function $d(s,t)$ representing population density at every location $s$ and time $t$:

$$d(s,t) = \Phi\left(\mathbf{x}_p(s_i)^T \boldsymbol{\beta}_p + \varepsilon_p(s_i,t_i)\right) \times \exp\left(\mathbf{x}_r(s_i)^T \boldsymbol{\beta}_r + \varepsilon_r(s_i,t_i)\right)$$

where this equation predicts density while accounting for spatiotemporal variation in encounter probability $p(s,t)$ and density when encountered $r(s,t)$, but based upon reference values for catchability variables (i.e., such that $\mathbf{z}_{p,i} = \mathbf{z}_{r,i} = \mathbf{0}$; see Appendix S3 for symbol definitions). The species distribution model therefore “filters out” the average effect of catchability covariates upon observed catch rates when predicting densities and derived statistics. The species density function can then be used to calculate metrics for monitoring shifts in species distribution, including (1) center of gravity, (2) the area occupied by the core of the species distribution, and (3) population boundaries.

1. Center of gravity

We are specifically interested in the center-of-gravity for the distribution of a given species:

$$\mu(\delta(t)) = \int \delta(s,t) \, \frac{d(s,t)}{c(t)} \, ds$$

where $\delta(s,t)$ is any measured variable that is useful for tracking changes in spatial distribution over time, and $c(t) = \int d(s,t) \, ds$ is an integration constant representing total abundance in time $t$. In the following, we track spatial changes in the center of gravity and therefore define $\delta(s,t) = \mathrm{Latitude}(s)$ or $\delta(s,t) = \mathrm{Longitude}(s)$.

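On a discrete grid, the integral above becomes a density-weighted average over cells. The following minimal Python sketch assumes equal-area cells, so that cell area cancels from the numerator and denominator; the function name and the cell values are hypothetical.

```python
def center_of_gravity(delta, density):
    """Discrete analogue of mu(delta(t)): sum(delta_j * d_j) / sum(d_j).

    Assumes equal-area grid cells, so cell area cancels out of the ratio.
    """
    c_t = sum(density)  # total abundance c(t)
    if c_t <= 0:
        raise ValueError("total density must be positive")
    return sum(x * d for x, d in zip(delta, density)) / c_t


# Hypothetical cell centres and predicted densities for one year.
latitude = [34.0, 36.0, 38.0, 40.0]
longitude = [-120.0, -122.0, -123.0, -124.0]
density = [1.0, 3.0, 4.0, 2.0]

cog_lat = center_of_gravity(latitude, density)
cog_lon = center_of_gravity(longitude, density)
print(cog_lat, cog_lon)
```

The same function serves for either coordinate because only the choice of the measured variable differs.
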
2. Area occupied

We also calculate the variance of the species density function (termed the “inertia” in Woillez et al. (2009)):

$$\nu(\delta_i(t), \delta_j(t)) = \int \left(\delta_i(s,t) - \mu(\delta_i(t))\right) \left(\delta_j(s,t) - \mu(\delta_j(t))\right) \frac{d(s,t)}{c(t)} \, ds$$

The center-of-gravity and variance can then be used to estimate a kernel $K(t)$ that provides a second-order approximation to the spatial distribution of the species in time $t$:

$$K(s,t) = MVN(\mu_K, \Sigma_K) \cong d(s,t)$$

where

$$\mu_K(t) = \left(\mu(\mathrm{Lat}(t)), \mu(\mathrm{Lon}(t))\right)^T$$

and

$$\Sigma_K(t) = \begin{bmatrix} \nu(\mathrm{Lat}(t), \mathrm{Lat}(t)) & \nu(\mathrm{Lat}(t), \mathrm{Lon}(t)) \\ \nu(\mathrm{Lat}(t), \mathrm{Lon}(t)) & \nu(\mathrm{Lon}(t), \mathrm{Lon}(t)) \end{bmatrix}$$

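The kernel mean and variance can be computed from gridded densities as density-weighted first and second moments. The following is a minimal Python sketch, assuming equal-area cells; the function names and the four-cell example are hypothetical.

```python
def weighted_mean(values, density):
    """Density-weighted mean, the discrete analogue of mu(delta(t))."""
    return sum(v * d for v, d in zip(values, density)) / sum(density)


def inertia(delta_i, delta_j, density):
    """Density-weighted covariance, the discrete analogue of nu(delta_i, delta_j)."""
    mu_i = weighted_mean(delta_i, density)
    mu_j = weighted_mean(delta_j, density)
    total = sum(density)
    return sum((a - mu_i) * (b - mu_j) * d
               for a, b, d in zip(delta_i, delta_j, density)) / total


def kernel_moments(lat, lon, density):
    """Return (mu_K, Sigma_K) for the second-order kernel approximation."""
    mu_k = (weighted_mean(lat, density), weighted_mean(lon, density))
    cov = inertia(lat, lon, density)
    sigma_k = [[inertia(lat, lat, density), cov],
               [cov, inertia(lon, lon, density)]]
    return mu_k, sigma_k


# Hypothetical cells on the corners of a square with equal density.
lat = [0.0, 0.0, 2.0, 2.0]
lon = [0.0, 2.0, 0.0, 2.0]
density = [1.0, 1.0, 1.0, 1.0]

mu_k, sigma_k = kernel_moments(lat, lon, density)
print(mu_k, sigma_k)
```
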
This kernel can then be summarized to visualize an ellipse that contains a fixed proportion $p$ of the area under the kernel approximation to population density $d(t)$. This ellipse is defined by the locations $s$ satisfying:

$$|K(s,t) - \mu_K(t)| = F^{-1}(p, 2)$$

where $|K(s,t) - \mu_K(t)|$ is the effective distance of location $s$ from the center of the population $\mu_K(t)$ in year $t$, and $F^{-1}(p, 2)$ is the inverse of the chi-squared cumulative distribution function (i.e., the chi-squared quantile function) evaluated at proportion $p$ and with 2 degrees of freedom.

Hypothetically, a northward shift in population center-of-gravity might be caused by either an expansion along the northern boundary or a contraction along the southern boundary of the population. We therefore calculate an index of population area to distinguish between these two possibilities. Specifically, we calculate an index of the area occupied by the population, where this index $a_t$ in year $t$ is calculated as:

$$a_t = \pi F^{-1}(p, 2) \sqrt{|\Sigma_K(t)|}$$

where $|\Sigma_K(t)|$ is the determinant of the variance $\Sigma_K(t)$ of the kernel approximation to the species density function.

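The area index can be evaluated without a statistics library because the chi-squared distribution with 2 degrees of freedom has the closed-form cumulative distribution function $F(x) = 1 - \exp(-x/2)$, so that $F^{-1}(p, 2) = -2\ln(1 - p)$. The following minimal Python sketch uses that identity; the function names and the example covariance matrices are hypothetical, and the proportion p must be supplied by the user because no value is stated above.

```python
import math


def chi2_quantile_2df(p):
    """Inverse chi-squared CDF with 2 degrees of freedom: -2 * ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -2.0 * math.log(1.0 - p)


def area_occupied(sigma_k, p):
    """Area index a_t = pi * F^{-1}(p, 2) * sqrt(det(Sigma_K))."""
    (a, b), (c, d) = sigma_k
    det = a * d - b * c
    if det <= 0:
        raise ValueError("Sigma_K must have a positive determinant")
    return math.pi * chi2_quantile_2df(p) * math.sqrt(det)


# Hypothetical kernel variances (km^2): a contraction along one axis
# shrinks the determinant and therefore the area index.
wide = [[400.0, 0.0], [0.0, 100.0]]
narrow = [[100.0, 0.0], [0.0, 100.0]]
print(area_occupied(wide, 0.95), area_occupied(narrow, 0.95))
```

Because the index scales with the square root of the determinant, halving the standard deviation along one axis halves the index.
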
3. Population boundaries

Finally, we calculate a metric representing population boundaries along a pre-defined axis. To do so, we calculate the cumulative distribution for population density along the axis (e.g., northings), and use quantiles from this distribution (i.e., the 5th and 95th percentiles) as the boundary along this axis.

The cumulative distribution is calculated by first extrapolating the density function to each of 15,979 grid cells within the domain of the triennial and annual surveys (each cell is 2x2 nautical miles). Each cell is assumed to have density equal to the density of the nearest knot. We then calculate the cumulative distribution:

$$q_t(\delta) = \sum_{j=1}^{n_{cells}} \frac{d(s_j, t) \, I(s_j < \delta)}{c(t)}$$

where $d(s_j, t)$ is the extrapolated density for the $j$th cell (centered at location $s_j$), $n_{cells}$ is the number of cells, $I(s_j < \delta)$ is an indicator function that equals one if $s_j$ is less than $\delta$ and zero otherwise, $c(t) = \sum_{j=1}^{n_{cells}} d(s_j, t)$ is an integration constant that ensures that $q_t(\delta)$ is one when $\delta \to \infty$, and $q_t(\delta)$ is the cumulative distribution function in year $t$, returning the proportion of abundance that is located at positions less than $\delta$ along the axis. Once this cumulative distribution is calculated, we then identify the lower and upper bounds such that

$$0.05 = q_t(\delta_{lower}(t))$$

and

$$0.95 = q_t(\delta_{upper}(t))$$

although other quantiles could also have been used.

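With a finite number of cells, the cumulative distribution is a step function, so the equalities above can generally only be met approximately. The following minimal Python sketch assumes one possible convention, taking the boundary as the first cell (ordered along the axis) at which the cumulative share of density reaches the target proportion; the function names, this convention, and the ten-cell example are hypothetical.

```python
def cumulative_proportion(northings, density, delta):
    """q_t(delta): share of total density in cells with s_j < delta."""
    c_t = sum(density)
    return sum(d for s, d in zip(northings, density) if s < delta) / c_t


def boundary(northings, density, prob):
    """First cell coordinate at which the cumulative share reaches `prob`.

    Assumed convention: cells are accumulated from south to north and the
    cell that pushes the running share to `prob` or above is returned.
    """
    c_t = sum(density)
    running = 0.0
    for s, d in sorted(zip(northings, density)):
        running += d
        if running / c_t >= prob:
            return s
    return max(northings)


# Hypothetical cell northings (km) and extrapolated densities for one year.
northings = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
density = [1, 3, 10, 20, 30, 20, 10, 4, 1, 1]

lower = boundary(northings, density, 0.05)
upper = boundary(northings, density, 0.95)
print(lower, upper)
```
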
Appendix S3 – Detailed description of methods for the spatiotemporal species distribution function model

We seek to estimate a function $d(s,t)$ representing density $d$ at any given location $s$ and time $t$. This function is decomposed into the probability $p(s,t)$ of encountering the species at a given location, and the expected density $r(s,t)$ of the species when encountered, where $d(s,t) = p(s,t) \, r(s,t)$. Each component is in turn modeled as a spatiotemporal process:

$$\Phi^{-1}(p_i) = \mathbf{x}_p(s_i)^T \boldsymbol{\beta}_p + \varepsilon_p(s_i, t_i) + \mathbf{z}_{p,i} \boldsymbol{\gamma}_p$$

$$\log(r_i) = \mathbf{x}_r(s_i)^T \boldsymbol{\beta}_r + \varepsilon_r(s_i, t_i) + \mathbf{z}_{r,i} \boldsymbol{\gamma}_r$$

where encounter probability $p_i$ for sample $i$ at location $s_i$ and time $t_i$ is specified via a logit link function $\Phi^{-1}$, and densities when encountered $r_i$ are specified via a logarithmic link function; $\mathbf{x}(s)$ is a multivariate function representing measured covariates at location $s$, and $\boldsymbol{\beta}$ is a vector of coefficients estimated for variables $\mathbf{x}$; $\varepsilon(s,t)$ represents spatiotemporal variation in the species distribution function; and $\mathbf{z}_{p,i}$ and $\mathbf{z}_{r,i}$ are vectors of variables affecting catch rates independent of local densities (termed “catchability” variables), where $\boldsymbol{\gamma}_p$ and $\boldsymbol{\gamma}_r$ are vectors of coefficients allowing catchability variables $\mathbf{z}_p$ and $\mathbf{z}_r$ to impact encounter probabilities and densities.

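The two linear predictors and the resulting density can be sketched in a few lines. The following minimal Python example assumes the logit link described above and treats covariates and coefficients as plain lists; all function names and numerical values are hypothetical. Setting the catchability variables to zero gives the predicted density used in Appendix S2.

```python
import math


def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def encounter_probability(x_p, beta_p, eps_p, z_p, gamma_p):
    """p_i from logit(p_i) = x_p' beta_p + eps_p + z_p gamma_p."""
    return logistic(dot(x_p, beta_p) + eps_p + dot(z_p, gamma_p))


def positive_density(x_r, beta_r, eps_r, z_r, gamma_r):
    """r_i from log(r_i) = x_r' beta_r + eps_r + z_r gamma_r."""
    return math.exp(dot(x_r, beta_r) + eps_r + dot(z_r, gamma_r))


def predicted_density(x_p, beta_p, eps_p, x_r, beta_r, eps_r):
    """d = p * r evaluated at reference catchability (z_p = z_r = 0)."""
    p = encounter_probability(x_p, beta_p, eps_p, [], [])
    r = positive_density(x_r, beta_r, eps_r, [], [])
    return p * r


# Hypothetical values: intercept-only covariates and one catchability variable.
p_obs = encounter_probability([1.0], [0.2], 0.1, [1.0], [0.5])
p_ref = encounter_probability([1.0], [0.2], 0.1, [0.0], [0.5])
d_ref = predicted_density([1.0], [0.2], 0.1, [1.0], [1.5], -0.3)
print(p_obs, p_ref, d_ref)
```
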
We constrain this function by invoking the First Law of Geography (Tobler 1970) wherein nearby locations on average have greater similarity than geographically distant locations. We therefore specify that spatial variation follows a stationary stochastic process that exhibits geometric anisotropy, while temporal variation follows a random-walk process in time:

$$\varepsilon_p(t) \sim GP\left(\varepsilon_p(t-1), \sigma^2_{\varepsilon,p} \mathbf{C}_p\right)$$

where $\mathbf{C}_p$ is a correlation function governing the decrease in similarity in encounter probabilities as a function of distance:

$$C_p(s, s+h) = \mathrm{Matern}(|\mathbf{H}h|; \kappa_p)$$

and where $\mathbf{H}$ is a 2x2 matrix representing geometric anisotropy, $h$ is a vector representing the displacement of two locations $s$ and $s+h$, and therefore $|\mathbf{H}h|$ represents the effective distance between these two locations (Thorson et al. 2015b), and where $\kappa_p$ and $\kappa_r$ are estimated to represent the spatial scale of similarity in probability of encounter and positive density components. Spatiotemporal variation $\varepsilon_r$ is defined similarly for positive catch rates. We chose to use a random-walk temporal process, rather than an autoregressive process, to ensure that locations and/or time intervals without sampling do not exhibit mean-reversion, which could otherwise shrink center-of-gravity estimates for undersampled time periods towards the average center-of-gravity for better sampled periods. However, we confirm that all results are qualitatively similar when re-estimated via an autoregressive model where the magnitude of autoregression is estimated as a fixed effect.

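The mean-reversion argument can be illustrated with the conditional expectation of an unsampled deviation. The following is a minimal Python sketch, assuming a zero-mean first-order autoregressive process at a single location with coefficient rho, which is a deliberate simplification of the spatiotemporal process above; the function name and numerical values are hypothetical.

```python
def expected_deviation(last_value, steps_ahead, rho):
    """E[eps(t + k) | eps(t)] for a zero-mean AR(1) process.

    rho = 1 corresponds to a random walk (no mean-reversion);
    |rho| < 1 shrinks the expectation towards the long-term mean of zero.
    """
    return (rho ** steps_ahead) * last_value


# Hypothetical deviation of 2.0 in the last sampled year, then a 5-year gap.
random_walk = expected_deviation(2.0, 5, 1.0)
autoregressive = expected_deviation(2.0, 5, 0.5)
print(random_walk, autoregressive)
```

Under the random walk the expected deviation stays at its last estimated value across the gap, whereas under the autoregressive process it decays towards zero.
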
Appendix S4 – Details regarding a simulation experiment evaluating the likely performance of AWA and SDF estimators

We conduct a simulation experiment to illustrate the magnitude of bias that arises from using either the abundance-weighted average (AWA) or species distribution function (SDF) estimators given the timing and location of samples that are available. In this experiment, we simulate population density $d(s,t) = p(s,t) \, r(s,t)$ in each of 15,979 grid cells within the domain of the triennial and annual surveys (each cell is 2x2 nautical miles), where each component is modeled as a spatiotemporal process:

$$p(s,t) = \Phi\left(\varepsilon_p(s,t) + \omega_p(s)\right) \, n(s(1); \mu_s(t), \sigma_s)$$

$$r(s,t) = \exp\left(\varepsilon_r(s,t) + \omega_r(s)\right) \, n(s(1); \mu_s(t), \sigma_s)$$

where encounter probability $p$ at location $s$ and time $t$ is specified as the logistic transformation $\Phi$ of $\varepsilon_p(s,t)$ and $\omega_p(s)$, representing spatiotemporal and purely spatial variation in encounter probability (see Appendix S3 for more details regarding notation), and where the density given encounters $r$ is defined similarly except using an exponential transformation. Each component is also affected by a unimodal preference function based on the northings $s(1)$ of location $s = (\mathrm{Northings}, \mathrm{Eastings})^T$. We use a Gaussian probability density function $n(s(1); \mu_s, \sigma_s)$ for this preference function, where the northward center $\mu_s(t)$ of this preference function varies among years, thereby inducing shifts in the species center of gravity, and where the dispersion $\sigma_s$ is fixed equal to one-quarter of the total range north to south of the population domain (i.e., $\sigma_s$ = 402.5 km).

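The role of the preference function can be seen by evaluating the simulated density at two northings while holding the random effects fixed. The following minimal Python sketch follows the two equations above; the function names and the numerical values of the random effects are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Gaussian probability density n(x; mean, sd)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def logistic(x):
    """Logistic transformation used for encounter probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulated_density(northing, eps_p, omega_p, eps_r, omega_r, mu_s, sigma_s):
    """d(s,t) = p(s,t) * r(s,t), with the preference function in both terms."""
    preference = gaussian_pdf(northing, mu_s, sigma_s)
    p = logistic(eps_p + omega_p) * preference
    r = math.exp(eps_r + omega_r) * preference
    return p * r


SIGMA_S = 402.5  # km, dispersion of the preference function
MU_S = 0.0       # hypothetical centre, expressed relative to the domain mean

d_center = simulated_density(MU_S, 0.1, -0.2, 0.0, 0.3, MU_S, SIGMA_S)
d_offset = simulated_density(MU_S + SIGMA_S, 0.1, -0.2, 0.0, 0.3, MU_S, SIGMA_S)
print(d_offset / d_center)
```

Because the preference function multiplies both components, density one dispersion unit from the center is reduced by a factor of exp(-1) rather than exp(-0.5).
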
We explore model performance given four scenarios regarding changes in center of gravity (COG) over time:

1. Constant – The preference function is constant for all years, $\mu_s(t) = \mu$, where $\mu$ is the average northings of the 15,979 grid cells. This scenario still has small interannual variation in COG due to random variation in $\varepsilon_p(s,t)$ and $\varepsilon_r(s,t)$ among years.
2. Variable – The centroid of the preference function varies among years, $\mu_s(t) \sim N(\mu, \sigma_s)$, where $\sigma_s$ is again one-quarter of the total range north to south of the population domain. This induces high interannual variation in northward center of gravity.
3. Northward shift – The centroid of the preference function shifts progressively northward, $\mu_s(t) = -0.5\sigma_s + (t-1)\sigma_s/(n_t - 1)$, where $n_t$ is the number of years of data (37 years), such that the centroid of the preference function moves northward 402.5 km over this period.
4. Southward shift – The centroid of the preference function shifts progressively southward, $\mu_s(t) = 0.5\sigma_s - (t-1)\sigma_s/(n_t - 1)$, in the mirror image of the Northward shift scenario.

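The four scenarios can be written as a single function returning the preference-function center for a given year. The following is a minimal Python sketch under two labeled assumptions: the trend in the shift scenarios is treated as an offset added to $\mu$ (with the default of zero this reproduces the formulas exactly as written), and $\sigma_s$ in the Variable scenario is treated as a standard deviation. The function name, scenario labels, and seed are hypothetical.

```python
import random

SIGMA_S = 402.5  # km
N_YEARS = 37     # number of years of data


def preference_center(scenario, t, mu=0.0, sigma_s=SIGMA_S, n_t=N_YEARS, rng=None):
    """Centre mu_s(t) of the preference function for year t = 1..n_t."""
    if scenario == "constant":
        return mu
    if scenario == "variable":
        rng = rng or random.Random()
        # Assumption: sigma_s is the standard deviation of the annual draws.
        return rng.gauss(mu, sigma_s)
    trend = -0.5 * sigma_s + (t - 1) * sigma_s / (n_t - 1)
    if scenario == "northward":
        return mu + trend  # assumption: trend is an offset from mu
    if scenario == "southward":
        return mu - trend
    raise ValueError("unknown scenario: " + scenario)


north = [preference_center("northward", t) for t in range(1, N_YEARS + 1)]
south = [preference_center("southward", t) for t in range(1, N_YEARS + 1)]
rng = random.Random(2016)  # hypothetical seed for reproducibility
variable = [preference_center("variable", t, rng=rng) for t in range(1, N_YEARS + 1)]
print(north[0], north[-1], south[0], south[-1])
```
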
Sampling data are then simulated as follows:

$$c_i \sim b_i \, LN\left(\log(r(s_i, t_i)), \sigma_c^2\right)$$

where $LN(\log(a), b)$ is a lognormal distribution with log-mean $\log(a)$ and log-variance $b$, and $b_i$ indicates whether sample $i$ encounters the species or not:

$$b_i \sim Bern(p(s_i, t_i))$$

where $Bern(a)$ is a Bernoulli distribution with probability $a$.

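The two-stage sampling model can be simulated with the Python standard library. The following minimal sketch assumes that the second argument of the lognormal distribution is interpreted as a variance, so that the log-standard deviation passed to the generator is sigma_c; the function name, seed, and parameter values are hypothetical.

```python
import math
import random


def simulate_catch(p, r, sigma_c, rng):
    """Draw c_i = b_i * LN(log(r), sigma_c^2) with b_i ~ Bernoulli(p)."""
    encountered = rng.random() < p  # Bernoulli draw for the encounter b_i
    if not encountered:
        return 0.0
    # random.lognormvariate takes the mean and standard deviation of log(c).
    return rng.lognormvariate(math.log(r), sigma_c)


rng = random.Random(42)  # hypothetical seed for reproducibility
catches = [simulate_catch(0.3, 2.0, 1.0, rng) for _ in range(5000)]
share_positive = sum(1 for c in catches if c > 0.0) / len(catches)
print(share_positive)
```

The share of positive catches should be close to the encounter probability, and the positive catches have median r on the natural scale.
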
In this exercise, we simulate sampling to occur at the location and year of each sample in our case study application, such that the sampling intensity function $\mathcal{P}(s,t)$ is identical to that in our case study. We also specify large residual variation in sampling data, $\sigma_c = 1$, moderate levels of spatial variation, $\sigma_{\omega,p} = \sigma_{\omega,r} = 0.5$, and low levels of spatiotemporal variation, $\sigma_{\varepsilon,p} = \sigma_{\varepsilon,r} = 0.2$, where the range of spatial covariance is defined such that correlations are 10% at a distance of 1000 km for encounter probabilities $p$, and 500 km for catch rates given encounters $r$.

Appendix S5 – Visualizing the spatial distribution of sampling for triennial and annual surveys

Fig. S1 – Spatial distribution of sample locations for the triennial survey (red) and annual survey (blue) in each year (the area without samples in the Southern California Bight is a conservation area that is excluded from sampling, and hence densities are not extrapolated into this area).

Appendix S6 – Changes in the northern and southern population boundary for West Coast fishes

We here show estimates of the 5th and 95th percentiles of the northward cumulative distribution of abundance for each of 18 West Coast species. The 50th percentile corresponds to the median of the northward distribution, and this median shows similar trends to the center of gravity for most species (Fig. 5 in main text). Several species show large shifts in the population boundary, including Pacific hake, which has a north-skewed distribution in 1992 relative to other years. Similarly, sharpchin and darkblotched rockfish show a northward movement of the southern population edge. For darkblotched rockfish, this coincides with a northward shift in the population center-of-gravity (Fig. 5 in main text), and corroborates a decrease in the area occupied by this species (Fig. 7 in main text).

Fig. S2 – The 5th, 50th, and 95th percentiles of the northward cumulative distribution function for each West Coast species.