Graph theory for species persistence
Interpretation of graph theory metrics for species
persistence in a metapopulation. The Gulf of
Lion study case.
Andrea Costa1,2,*, Andrea M. Doglioli1,2, Katell Guizien3, Anne A. Petrenko1,2
1 - Aix Marseille Université, CNRS/INSU, IRD, Mediterranean Institute
of Oceanography (MIO), UM 110, 13288 Marseille
2 - Université de Toulon, CNRS/INSU, IRD, Mediterranean Institute
of Oceanography (MIO), UM 110, 83957 La Garde
3 - Laboratoire d’Ecogeochimie des Environnements Benthique, CNRS,
Universite Paris VI, UMR8222, Av. du Fontaule - F-66651 Banyuls-sur-Mer (France)
* [email protected]
keywords
Connectivity, Graph Theory, Metapopulation Model, Directed Weighted
Bridging Centrality, Modularity, Clustering
Abstract
New challenges in biodiversity conservation focus on taking into account the connectivity of biological networks. Graph theory has recently been used to investigate the conditions for species persistence from different measures of connectivity in such networks. In the present study, for the first time, a set of metrics defined in graph theory is compared to a metapopulation modelling approach. The two approaches are compared to evaluate the persistence of soft-bottom polychaete populations in the Gulf of Lion (Mediterranean Sea). Various classical graph analysis concepts (betweenness centrality and modularity) are tested. New descriptors (directed weighted bridging centrality, minimum cycles identification) are also derived and evaluated. The major innovation of this work is the introduction of a novel metric for measuring the distance between nodes in a graph when dealing with connectivity matrices containing larval transfer probabilities. The new metric ensures a physically meaningful interpretation of shortest paths and, consequently, of betweenness, which highlights the nodes ensuring an efficient transport in a biological network. The comparison with metapopulation simulations makes it possible to ground the interpretation of the graph descriptors in terms of species persistence. Graph theory complements the results of the metapopulation model by adding an exhaustive analysis of the spatial information. In particular, modularity and bridging centrality are shown to characterize clusters of interconnected nodes (communities) and to highlight the rescuing sites. The sites that ensure the regional persistence of the species are best indicated by bridging centrality and shortest cycle lengths.
Introduction
Losses of biodiversity at sea due to deleterious effects of human activities (e.g., habitat destruction, overfishing) are currently expected to be mitigated by the implementation of Marine Protected Areas (MPAs); see Duraiappah and Shahid (2005) for a comprehensive report on the subject. The basic assumption of this approach is that, if a carefully chosen portion of the whole marine biological network is protected, the network will not be subject to breakdowns generating critical losses of biodiversity and individual abundances in the marine ecosystem. Indeed, if a sufficient number of populations in the sea are sufficiently connected with other similar populations, the survival of the species is ensured. Thus, the challenge is to identify the sub-networks that could permit species persistence in a whole habitat by minimizing mortality only in some areas within it. Equivalently, the problem is to identify the minimal sub-network that can maximise the connectivity of the whole network. In this way it will be possible to protect the network while minimizing the costs of implementing MPAs.
However, there are two major problems in identifying key sites for the conservation of a set of species distributed in disjunct sites between which species may disperse. First, great dissimilarities in dispersal ability among species translate into different connection patterns. Second, species interactions with the environment or among themselves in the different sites also affect the persistence of species in the network differently. As an example, in the case of marine benthic invertebrates whose dispersal occurs during the pelagic larval stage, differences among species in spawning period or pelagic larval duration lead to different hydrological connectivity, measured by the probability of transfer of larvae from one site to another (Guizien et al., 2014). This variety of aspects has led to different methodologies for tackling the problem and, consequently, to different ways of defining connectivity. The reader can refer to Kool et al. (2013) and references therein for a review of the different techniques for estimating population connectivity in different contexts.
In the present work we compare two techniques that rely on different concepts of connectivity: metapopulation modelling and graph theory.
On the one hand, graph theory regards as well connected those networks that are unfragmented and have an efficient transfer within them. Graph theory analysis has been successfully applied to different biological networks at varying levels of integration. It was first introduced in ecology by Urban and Keitt (2001) in a study of landscape connectivity, in which connections were identified with the distance between different patches. Schick and Lindley (2007) extended its use to river networks, whose connections were set according to the carrying capacity of the rivers. Treml et al. (2008) studied a network of marine islands whose connections were estimated from flow direction and intensity. Rozenfeld et al. (2008) exploited graph theory tools to infer gene flow in a network of marine populations from genetic similarity. Jacobi et al. (2012) used graph theory for the identification of marine communities. Andrello et al. (2013) used graph theory concepts in the estimation of the connectivity among marine oases.
On the other hand, the metapopulation model regards as well connected those networks that ensure the regional persistence of the species, even if only at a very local scale. This kind of model links local demography and regional dispersal. Population dynamics models have been extensively used to investigate conditions of species persistence, identified by eigenvalues of the growth transfer matrix larger than unity (Caswell, 2001; Hastings and Botsford, 2005). However, to our knowledge, no systematic methodology has been proposed for identifying the sites of the metapopulation that are essential for the regional persistence of the species. Recently, ad hoc simulations based on threat scenarios have been used to identify core sites in a metapopulation model of the Gulf of Lion (NW Mediterranean Sea) by Guizien et al. (2014).
In the present study, we take advantage of the different definitions of a well connected network in the two techniques. In particular, we identify (among both classic and new graph theory measures) the graph theory tools that give information coherent with that of the metapopulation model analysis by Guizien et al. (2014). In this way we can incorporate the metapopulation model point of view into the graph theory concept of a well connected network. Such an interpretation is made possible by using a metapopulation model with spatially uniform demographic parameters and a short-lived species, which avoids demographic effects such as accumulation over multiple generations. It is important to note that, in return, graph analysis made it possible to systematically extend the spatial resolution of the information inferred from the metapopulation scenario simulations. As a result, our comprehension of the mechanisms underlying high connectivity of marine biological networks is enriched in both directions.
The Gulf of Lion (GoL) was selected as study case because of the numerous studies, both physical and biological, already performed in this area that can be used to interpret and validate our results. The GoL is located in the north-western Mediterranean Sea and is characterized by a large continental margin (Figure 1) dominated by soft bottom, which forms the habitat of uniform polychaete assemblages in the 10 to 30 m depth range. Its hydrodynamics is complex and highly variable (Millot, 1990). The circulation is strongly influenced by the Northern Current (NC), which constitutes an effective dynamical barrier blocking coastal waters on the continental shelf (Petrenko, 2003) and delimits the regional scale of hydrodynamical connectivity. Exchanges between the GoL and offshore waters are mainly induced by processes associated with the NC (Petrenko et al., 2005). Hydrodynamical connectivity was quantified by the larval transfer probability between 32 sites along the shore of the GoL (Figure 1). During stratified conditions, barotropic eastward currents can be detected, mostly in the western part of the gulf (Petrenko et al., 2008). Alternatively, in unstratified winter conditions, eastward or south-westward currents can occur due to the rotational wind field (Estournel et al., 2003).
The present study also gives high relevance to the way graph analysis is implemented in order to obtain meaningful results, especially when calculating betweenness centrality. Concerned about the validity of some of the metrics adopted in previous literature as the distance between nodes, we propose here a new methodology that yields physically meaningful results from graph theory analysis. This is a crucial aspect of the analysis that can help avoid groundless choices in environmental policy and management decisions, especially when assessing the efficiency of existing MPAs and the effect of potential ones.
The paper is organized as follows. In the Methods Section we explain the methodology used for obtaining the common input of graph theory and the metapopulation model: the connectivity matrices from Lagrangian dispersal simulations. We also report the basic concepts of the metapopulation modelling carried out by Guizien et al. (2014). We then summarize some basic concepts of graph theory and explain why it is possible to apply them to our problem. Lastly, we introduce a new metric for measuring the node-to-node distance in graphs built with current-based connectivity matrices. In the Results Section we recall the main results from Guizien et al. (2014) and present the systematic analysis of the hydrological connectivity matrices with graph theory. In the Discussion we examine the graph theory results in the light of the metapopulation simulations.
Methods
In the first part of this section, we recapitulate the characteristics of the Lagrangian dispersal simulations that resulted in the 20 variant connectivity matrices used in Guizien et al. (2014) and in the present study. We then recall the principal characteristics of the metapopulation model used by Guizien et al. (2014). In the subsequent part, we present the essential concepts of graph theory that we used in data analysis and in the interpretation of the results. Finally, we introduce a novel metric for the node-to-node distance in the graphs built on current-based connectivity matrices, and we analyse the consequences of its use.
Fundamentals about metapopulation models
The metapopulation model used by Guizien et al. (2014) describes explicitly, in discrete time and for a set of sites connected by larval transfer, the spatial dynamics of the population density of the sedentary adult stage of a soft-bottom polychaete.
Larval transfer is derived from Lagrangian dispersal simulations with a three-dimensional circulation model (see Marsaleix et al., 2006) at a horizontal resolution of 750 m. Spawning was simulated by releasing 30 particles in the center of each of the 32 sites, on the 30 m isobath, every hour from January 5 at 0h until April 13 at 23h in 2004 and 2006 (Guizien et al., 2012). The final positions of larvae after three, four and five weeks were processed to compute the proportion of larvae coming from an origin site and arriving at a settlement site. Connectivity matrices were then built for ten consecutive 10-day spawning periods in each year and for each of the three pelagic larval durations (3, 4 and 5 weeks). See Table 2 for the periods to which each connectivity matrix corresponds.
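To make the construction of a connectivity matrix concrete, the following minimal sketch turns a list of particle fates into transfer proportions. It is an illustration only, not the processing chain of Guizien et al. (2012, 2014): the function name `connectivity_matrix`, the `(origin, settlement)` tuple layout and the normalisation by the number of particles released from the origin site are all assumptions made here.

```python
from collections import Counter


def connectivity_matrix(tracks, n_sites):
    """Build a[i][j], the proportion of particles released at site i
    that end at site j.

    `tracks` is a hypothetical list of (origin, settlement) pairs, where
    settlement is None when the particle ends outside every site.
    Assumption: proportions are normalised by the number of particles
    released from the origin site.
    """
    released = Counter(origin for origin, _ in tracks)
    arrived = Counter((o, d) for o, d in tracks if d is not None)
    a = [[0.0] * n_sites for _ in range(n_sites)]
    for (o, d), count in arrived.items():
        a[o][d] = count / released[o]
    return a


# Toy example with two sites: four particles released at site 0, two at site 1.
toy_tracks = [(0, 1), (0, 1), (0, None), (0, 0), (1, 0), (1, None)]
toy_matrix = connectivity_matrix(toy_tracks, n_sites=2)
```

With this convention each row of the matrix sums to at most one, the missing fraction being the particles lost outside the sites.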
Population density at a given time and a given site results from spatially structured local survivorship and reproductive success inputs potentially depending on all the other sites in the system. The model accounts for both (i) recruitment limitation due to space availability at the destination site (computed as the proportion of free space based on the saturating density of adults), and (ii) the variability in propagule transfer rate. See Appendix A for the mathematical details of the model.
The analysis by Guizien et al. (2014) assessed the effect of connectivity on population persistence and spatial distribution at the limit of population density equilibrium. Other simulations explored the resistance of a short-lived species to two types of scenarios. The first scenario aimed to quantify the resistance of the metapopulation to habitat loss around the four main ports (Figure 1), by increasing the number of unsuitable sites starting from each port and proceeding symmetrically around it. This procedure is consistent with the most likely scenario of habitat loss due to an expansion of industrial activities in the coastal zone. The second scenario assessed the resistance of the metapopulation to recruitment failure affecting either the eastern or the western part of the GoL.
Basic concepts of graph theory
Here we present both the essential bases of graph theory and some new concepts that are necessary to develop a coherent methodology for the analysis of the connectivity matrices obtained as described above.
Mathematically speaking, a graph G is a pair (V, E) of a set of nodes V and a set of edges E. The set V represents the collection of objects under study, which are pair-wise linked by edges representing a relation of interest between two objects. When the relation is symmetric, the graph is said to be ‘undirected’; otherwise it is ‘directed’. An example of an undirected graph in the context of biological network studies is the genetic distance among populations used in Rozenfeld et al. (2008), while an example of a directed graph is the probability of connection due to the current field between two zones of the sea, as in Rossi et al. (2014). If every existing edge has the same importance as the others, the graph is said to be ‘binary’: edges either exist or not. If each edge has a specific relative importance, a weight can be associated with it and the graph is then called ‘weighted’. The total weight of the connections of a node i ∈ V is called the total degree k(i). In an undirected binary graph, it is simply the number of edges incident on the node. In a directed graph, it is possible to distinguish between the in-degree and the out-degree. The first is the sum of the values of the edges terminating in the node, $k^{in}(i) = \sum_j a_{ji}$, while the second is $k^{out}(i) = \sum_j a_{ij}$, with j ∈ V and i ≠ j. Here the values $a_{ij}$ are the terms of the connectivity matrix, in which the values of the edges from node i to node j are stored. The density ρ of a graph can be defined as the ratio between the number of existing edges and its maximum possible value. For a directed graph with N nodes we have:

\[ \rho = \frac{\text{number of non-null edges}}{N \cdot (N - 1)} \]

When ρ = 1, the graph is said to be complete.
We also introduce two quantities useful in our analysis. The network length is defined as the sum of all the elements of a connectivity matrix. The number of non-null elements of a connectivity matrix is the number of non-zero entries that it contains.
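The quantities defined above can be computed directly from a connectivity matrix stored as a list of rows. The sketch below is only illustrative: the function names are hypothetical, the diagonal is excluded from the degrees (following the condition i ≠ j), and the density counts only off-diagonal entries, which is an assumption consistent with the N·(N − 1) denominator.

```python
def in_degree(a, i):
    """k_in(i): sum of the weights of the edges ending in node i (j != i)."""
    return sum(a[j][i] for j in range(len(a)) if j != i)


def out_degree(a, i):
    """k_out(i): sum of the weights of the edges leaving node i (j != i)."""
    return sum(a[i][j] for j in range(len(a)) if j != i)


def density(a):
    """Share of the N*(N-1) possible directed edges that are non-null.
    Assumption: self-connections on the diagonal are not counted."""
    n = len(a)
    edges = sum(1 for i in range(n) for j in range(n) if i != j and a[i][j] != 0)
    return edges / (n * (n - 1))


def network_length(a):
    """Sum of all the elements of the connectivity matrix."""
    return sum(sum(row) for row in a)


# Toy 3-site matrix (values are made up for illustration).
toy = [[0.5, 0.2, 0.0],
       [0.1, 0.0, 0.3],
       [0.0, 0.0, 0.4]]
```

For this toy matrix three of the six possible off-diagonal edges exist, so `density(toy)` is 0.5.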
In our case, we deal with weighted directed graphs. The nodes of our graphs represent the sites used in the metapopulation model, while the edges represent the non-null probability with which a Lagrangian particle released in one of these sites is transported, after a period corresponding to the pelagic larval duration, to another of these sites.
In a directed unweighted graph, it is possible to define the shortest path $\sigma_{ij}$ connecting two nodes i ∈ V and j ∈ V as the shortest possible alternating sequence of nodes and edges, beginning with i and ending with j, such that each edge connects the preceding node with the succeeding one. The definition can be extended to directed weighted graphs: the shortest path is the one with the lowest cost between two nodes. The most frequent choice to define the cost of a path is the sum of its edges’ weights. Nonetheless, other alternatives are possible and will be discussed in more detail later. The definition of the centrality measure called betweenness BC(k), k ∈ V, is based on the concept of shortest path. The betweenness estimates the relative importance of a node k within a graph by counting, for each pair of nodes, the fraction of the existing shortest paths $\sigma_{ij}$ that pass through this node, $\sigma_{ij}(k)$:

\[ BC(k) = \sum_{i \neq k \neq j} \frac{\sigma_{ij}(k)}{\sigma_{ij}} \tag{1} \]
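As an illustration of Equation (1), the sketch below evaluates betweenness on a small directed unweighted graph by counting shortest paths with a breadth-first search. It is a didactic, brute-force version and not the implementation used in this study; the adjacency-dictionary format and the function names are assumptions, and no normalisation is applied.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """Breadth-first search returning, for every node reachable from
    `source`, its distance in steps and the number of shortest paths."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def betweenness(adj):
    """BC(k) = sum over pairs (i, j), i != k != j, of sigma_ij(k) / sigma_ij."""
    nodes = sorted(set(adj) | {v for targets in adj.values() for v in targets})
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    bc = {k: 0.0 for k in nodes}
    for i in nodes:
        dist_i, count_i = info[i]
        for j in nodes:
            if j == i or j not in dist_i:
                continue
            for k in nodes:
                if k in (i, j) or k not in dist_i:
                    continue
                dist_k, count_k = info[k]
                # k lies on a shortest i -> j path when the distances add up.
                if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                    bc[k] += count_i[k] * count_k[j] / count_i[j]
    return bc


# Toy graph: two equally short routes from 0 to 3, then a single edge to 4.
toy_graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
toy_bc = betweenness(toy_graph)
```

In the toy graph, node 3 lies on every shortest path ending in node 4, so it receives the highest betweenness, while nodes 1 and 2 share the paths leaving node 0.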
A widely used method for identifying clusters in physical networks is the maximum modularity criterion, first introduced by Newman and Girvan (2004). It arises from the observation that simply counting edges is not an effective way to quantify the concept of community structure. The partition of a network that simply minimizes the number of inter-cluster connections while maximizing the intra-cluster ones does not necessarily prove to be a good one when confronted with reality. A good division would rather be one in which there are more (or fewer) edges between clusters than expected. A method to quantify this idea, namely that true cluster structures (i.e., communities) in a network are mirrored by a statistically unexpected arrangement of edges, was proposed by Newman and Girvan (2004). Their method is based on the concept of modularity. Modularity Q is defined, up to a multiplicative constant, as the difference between the number of edges falling within given groups of nodes and the number of such edges expected by chance in a random network that preserves the degree distribution of the original graph, that is, a network that conserves the degree values but has randomly placed edges (further details can be found in Newman, 2006). The values of modularity can be either positive or negative, with positive values indicating the possible presence of community structure. Therefore, we can investigate the community structure of a network by looking for the divisions of the network associated with a maximum value of modularity. Given a network, let $c_i$ be the community to which node i is assigned. For a directed weighted graph the modularity takes the form (see Nicosia et al., 2009, for details):

\[ Q = \frac{1}{m} \sum_{i,j \in V} \left[ a_{ij} - \frac{k_i^{out} k_j^{in}}{m} \right] \delta(c_i, c_j) \tag{2} \]

where $k_i^{out}$ and $k_j^{in}$ are the degrees of the nodes i and j (out-degree and in-degree, respectively), $m = \sum_i k_i$, and $\delta(c_i, c_j)$ is the Kronecker δ-function.
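Equation (2) can be evaluated for any given assignment of nodes to communities. The sketch below does so for a small matrix; it is a hedged illustration rather than the code used in this study. In particular, it assumes that m is the total edge weight (the sum of the out-degrees, which equals the sum of the in-degrees) and that diagonal terms are kept in the sums; the function name `directed_modularity` is hypothetical.

```python
def directed_modularity(a, community):
    """Modularity Q of a directed weighted graph for a given partition.

    a[i][j]      weight of the edge from node i to node j
    community[i] label of the community to which node i is assigned
    Assumption: m is the total edge weight, m = sum_i k_out(i).
    """
    n = len(a)
    k_out = [sum(a[i]) for i in range(n)]
    k_in = [sum(a[j][i] for j in range(n)) for i in range(n)]
    m = sum(k_out)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:  # Kronecker delta
                q += a[i][j] - k_out[i] * k_in[j] / m
    return q / m


# Toy network: two pairs of mutually connected sites, with no link between pairs.
toy_net = [[0, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
q_split = directed_modularity(toy_net, [0, 0, 1, 1])   # the natural division
q_single = directed_modularity(toy_net, [0, 0, 0, 0])  # everything in one cluster
```

The natural division into two pairs gives Q = 0.5, whereas placing all nodes in a single community gives Q = 0, in line with the statement that positive values indicate community structure.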
Exploiting a reformulation of modularity in matrix formalism, it is possible to recursively explore the possible divisions of a network in order to identify the one that maximizes the modularity value of the network without requiring exceedingly high computational power (see Newman, 2006, for details). One drawback of the algorithm is an intrinsic variability that makes the results not completely reproducible between different runs of the analysis. For example, certain nodes could be assigned to different clusters without changing the maximum value of Q. This inconvenience can be bypassed by running the analysis multiple times and taking, as the best division, the one that is most frequently found. In the present work we ran the analysis 10 000 times on each of the 20 variant matrices, for a total of two hundred thousand runs.
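The selection of the most frequently found division can be sketched as a simple vote over the partitions returned by repeated runs. The example below assumes that each run returns a list of community labels, one per node, and that two runs describe the same division when they differ only in the naming of the labels; both the data layout and the function names are assumptions, and the community detection itself is not reproduced here.

```python
from collections import Counter


def canonical(labels):
    """Relabel communities in order of first appearance, so that
    [1, 1, 0, 0] and [0, 0, 1, 1] are recognised as the same division."""
    mapping = {}
    return tuple(mapping.setdefault(label, len(mapping)) for label in labels)


def most_frequent_partition(runs):
    """Return the division found most often and the number of runs giving it."""
    votes = Counter(canonical(labels) for labels in runs)
    return votes.most_common(1)[0]


# Four hypothetical runs on a 4-node network; three agree up to label names.
toy_runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 1], [5, 5, 2, 2]]
best_partition, n_votes = most_frequent_partition(toy_runs)
```

The relabelling step matters because stochastic algorithms usually number the communities arbitrarily, so identical divisions would otherwise be counted separately.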
In order to extract all the possible information from the connectivity matrices about the role played by the different sites, we also used the bridging centrality $C_{BR}$. This measure was first proposed by Hwang et al. (2008) for undirected unweighted graphs. For our analysis we extended its use to directed weighted graphs.
Bridging centrality highlights those nodes that connect different clusters of a network. It is derived both from the betweenness value of a node and from the bridging coefficient, a topological factor that accounts for the probability of leaving the direct neighbourhood of the node when starting from one of the nodes composing it. Intuitively, nodes with a high number of such edges fall on the boundary of clusters. In Hwang et al. (2008), for a node i ∈ V, the topological factor is defined as:

\[ \Psi_{uu}(i) = \frac{1}{k(i)} \sum_{v \in N(i)} \frac{\Delta(v)}{k(v) - 1} \tag{3} \]
where k(i) is the degree of the node i ∈ V and N(i) is the direct neighbourhood of i, that is, the set of nodes reachable from i in one step. Δ(v) is the out-degree of a node v ∈ N(i) once the edges going from v to other nodes in N(i) have been deleted. The generalization to directed weighted graphs basically consists in accounting for the weight of the edges and in checking which edges actually leave the neighbourhood of the node. We therefore correct the out-degree of v via the term $-a_{vi}$ and the total degree of v via the term $-(a_{iv} + a_{vi})$. Note that, for this calculation, all the terms $a_{vv}$ on the diagonal of the connectivity matrix are ignored because they are irrelevant. The redefinition of the topological factor in the directed weighted case is then:

\[ \Psi_{dw}(i) = \frac{1}{k^{tot}(i)} \sum_{v \in N(i)} \frac{\Delta(v) - a_{vi}}{k^{tot}(v) - (a_{iv} + a_{vi})} \tag{4} \]
where $k^{tot}(i) = k^{in}(i) + k^{out}(i)$ is the total degree of the node i ∈ V. In this way, we retain both the information on the flux of information through a node (given by the betweenness) and the topological information on the position of this node relative to clusters (given by the bridging coefficient). In fact, a node falling on the border of a cluster and channelling a high flux of information will have both a high bridging coefficient and a high betweenness value. The removal of such a high bridging centrality node would therefore have a much more disruptive effect than the removal of a node having only a high betweenness value or only a high bridging coefficient (see Hwang et al., 2008, for an analysis and discussion of this phenomenon in the undirected case). An important aspect to pay attention to, when calculating the betweenness centrality and the topological factor of a node, is the different orders of magnitude in play. While the first is normalized to one, the second is not: its value depends upon the particular metric used to define the distance between the nodes. However, we do not want to give excessive importance to either the betweenness or the bridging coefficient: we want the two parameters to contribute with equal importance to characterizing the centrality of a node. Thus, following the suggestions of Hwang et al. (2008), we: (1) calculate the betweenness centrality and the topological factor for each node, (2) calculate the rank vectors of the nodes on the basis of their betweenness and bridging coefficient values, and (3) calculate the bridging centrality as:

\[ H_{BR}(i) = \Gamma_{BR(i)} \cdot \Gamma_{\Psi(i)} \tag{5} \]

where $\Gamma_{BR(i)}$ is the rank of a node i in the betweenness vector and $\Gamma_{\Psi(i)}$ is the rank of a node i in the topological factor vector. Bridging centrality allows us to identify the nodes which are likely to be on the boundaries of the clusters and hence able to prevent the fragmentation of the network into isolated components.
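The directed weighted topological factor of Equation (4) and the rank product of Equation (5) can be sketched as follows. This is one possible reading of the definitions, not the code used in this study. Three choices are assumptions made here: N(i) is taken as the set of nodes v ≠ i with a non-null edge from i to v; a neighbour whose corrected total degree is zero contributes nothing to the sum; and ranks are assigned in ascending order (rank 1 for the smallest value), with ties broken by node index.

```python
def bridging_coefficient(a, i):
    """Topological factor Psi_dw(i) for the connectivity matrix `a`."""
    n = len(a)
    neighbours = [v for v in range(n) if v != i and a[i][v] > 0]

    def k_tot(u):
        # Total degree (in + out) without the diagonal term a[u][u].
        return sum(a[u][w] + a[w][u] for w in range(n) if w != u)

    k_i = k_tot(i)
    if k_i == 0:
        return 0.0
    total = 0.0
    for v in neighbours:
        # Delta(v): out-degree of v without the edges towards other nodes of N(i).
        delta = sum(a[v][w] for w in range(n) if w != v and w not in neighbours)
        denominator = k_tot(v) - (a[i][v] + a[v][i])
        if denominator > 0:
            total += (delta - a[v][i]) / denominator
    return total / k_i


def ranks(values):
    """Ascending ranks starting at 1 (assumed convention)."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def bridging_centrality(betweenness_values, psi_values):
    """Product of the betweenness rank and the topological factor rank."""
    return [rb * rp for rb, rp in zip(ranks(betweenness_values), ranks(psi_values))]


# Toy 4-site matrix (values are made up for illustration).
toy = [[0.0, 0.5, 0.0, 0.5],
       [0.2, 0.0, 0.3, 0.0],
       [0.0, 0.0, 0.0, 0.0],
       [0.0, 0.4, 0.0, 0.0]]
psi_0 = bridging_coefficient(toy, 0)
toy_centrality = bridging_centrality([0.0, 3.0, 1.0], [0.2, 0.1, 0.5])
```

Working with ranks rather than raw values is what gives betweenness and the topological factor equal weight, whatever their respective orders of magnitude.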
Another concept of graph theory is the cycle; despite its simplicity, it turns out to be useful in the study of the multi-generational persistence of species. Cycles are defined as those paths that, starting from a node i ∈ V, end at the node i itself after a certain number L of steps. Note that, in our work, we want to distinguish between the effect of the particles remaining at the same site and the effect of those leaving the site and coming back. The latter effect can be evaluated by taking L ≥ 2. One of the essential requisites for ensuring the persistence of a species in a given zone is a high probability of seeing the larvae return home after a certain number of generations (see Hastings and Botsford, 2005, for details). This means that the shorter the cycle starting from a given node, the more likely the site is important for persistence. In fact, in this case, the survival of the site would be largely independent of the import of larvae from other sites. Thus it can act as a source in our network.
The main practical problem of this kind of analysis is the generally overwhelming computational power required. We used an algorithm that recursively finds all the possible cycles for every node of the network, thus involving a complexity of order $(N - 1)^{L-1}$. Our analysis was feasible only because the number of nodes (N = 32) in our network is small. Nonetheless, we had to limit L to 5; hence L is between 2 and 5.
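A recursive search for the shortest cycle through each node, limited to L between 2 and 5, can be sketched as below. This is a minimal illustration and not the algorithm used in this study: it only reports the length of the shortest cycle, it ignores self-connections on the diagonal, and it does not allow a cycle to revisit intermediate nodes, which are assumptions made here. The function names are hypothetical.

```python
def has_cycle(a, start, length):
    """True if a path of exactly `length` steps leaves `start` and returns to it."""
    n = len(a)

    def walk(node, steps_left, visited):
        for nxt in range(n):
            if nxt == node or a[node][nxt] <= 0:
                continue  # skip self-connections and null edges
            if steps_left == 1:
                if nxt == start:
                    return True
            elif nxt != start and nxt not in visited:
                if walk(nxt, steps_left - 1, visited | {nxt}):
                    return True
        return False

    return walk(start, length, {start})


def shortest_cycle_length(a, start, max_len=5):
    """Smallest L in [2, max_len] for which a cycle through `start` exists,
    or None when there is none."""
    for length in range(2, max_len + 1):
        if has_cycle(a, start, length):
            return length
    return None


# Toy matrix: sites 0, 1 and 2 form a loop; site 3 only retains its own larvae.
toy = [[0.3, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.4, 0.0],
       [0.2, 0.0, 0.0, 0.1],
       [0.0, 0.0, 0.0, 0.9]]
cycle_lengths = [shortest_cycle_length(toy, i) for i in range(4)]
```

Each additional step multiplies the number of candidate paths by up to N − 1, which is the origin of the $(N - 1)^{L-1}$ growth mentioned above.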
A new metric for node-to-node distance
An essential aspect in analysing the stability and structure of biological networks with graph theory is the choice of the metric used to define the distance between the nodes of the corresponding graph. Above all, this choice has important consequences for the physical interpretation of the results. In principle, many choices are possible: the genetic distance was used in Rozenfeld et al. (2008), the connection time between sites in Treml et al. (2008), and the larval transfer probability in, for example, Andrello et al. (2013).
Here we propose the use of a new metric to define the distance between nodes when dealing with larval transfer probabilities, in order to ensure that the largest larval transfer probability between two nodes corresponds to the smallest node-to-node distance. Such a transformation makes it possible to maintain a meaningful calculation of the betweenness values of nodes when applying the Dijkstra algorithm for shortest path finding (Dijkstra, 1959).
We define the distance between two nodes i and j as:

\[ d_{ij} = \ln\left(\frac{1}{a_{ij}}\right) \tag{6} \]
where $a_{ij}$ is the connectivity probability given by the connectivity matrices used in the metapopulation model. Notice that Equation (6) respects the physical properties of a distance (see Appendix B for a detailed demonstration). This definition combines two functions: h(x) = 1/x and f(x) = ln(x). The use of h(x) = 1/x is, among different possibilities, the transformation we prefer for reversing the ordering of the metric in order to make it physically compatible with the concept of shortest path, which is at the base of betweenness. The use of f(x) = ln(x) is due to the nature of the connectivity values and of the shortest path algorithms. In fact, we must bear in mind three facts about the connectivity values: (1) these values are calculated by considering the position of the Lagrangian particles only at the beginning and at the end of the advection period; (2) we are discarding the information on the actual path taken by a particle: the probability of going from i to j is independent of the zone from which the particle arrived in i; and (3) the calculation of the shortest paths implies the summation of a variable number of these connectivity values (this is equivalent to saying that, in the calculation of betweenness, we are considering paths whose values are calculated over different numbers of generations). Considering these facts, it is clear that our probabilities are intrinsically independent of one another, so that the probability of a multi-step path is the product of the probabilities of its edges. A problem arises here: as mentioned above, most algorithms calculate the length of a path as the sum of the edges composing it (e.g., the Dijkstra algorithm, Dijkstra, 1959). This is incompatible with the independence of the probabilities at play here. The metric we propose above, thanks to the basic property of logarithms, ln(x) + ln(y) = ln(xy), allows us to use classical shortest path algorithms while dealing correctly with the independence of our connectivity values: summing the distances $d_{ij}$ along a path is equivalent to calculating the value of the path as the product of the values of its edges. It is worth mentioning that the values $d_{ij} = \infty$, resulting from the values $a_{ij} = 0$, do not influence the calculation of betweenness values via the Dijkstra algorithm. The reader is referred to Appendix B for details.
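The effect of the metric of Equation (6) can be checked on a small example: with $d_{ij} = \ln(1/a_{ij})$, the shortest path returned by the Dijkstra algorithm is the path with the largest product of transfer probabilities, since $\sum \ln(1/a) = -\ln \prod a$. The sketch below is an illustration under stated assumptions and not the code used in this study: the function names are hypothetical, null probabilities are simply treated as missing edges, and the connectivity values are assumed to be probabilities not larger than one, so that all distances are non-negative as the Dijkstra algorithm requires.

```python
import heapq
import math


def log_distance(a_ij):
    """d_ij = ln(1 / a_ij); a null probability gives an infinite distance."""
    return math.inf if a_ij <= 0 else math.log(1.0 / a_ij)


def dijkstra(a, source):
    """Shortest distances from `source` using the metric d_ij = ln(1 / a_ij).
    Assumes 0 <= a[i][j] <= 1, so that every edge length is non-negative."""
    n = len(a)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # outdated heap entry
        for v in range(n):
            if v == u or a[u][v] <= 0:
                continue  # null edges (infinite distance) are never used
            candidate = d + log_distance(a[u][v])
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Toy matrix: the direct transfer 0 -> 2 has probability 0.1, whereas the
# two-step route 0 -> 1 -> 2 has probability 0.5 * 0.5 = 0.25.
toy = [[0.0, 0.5, 0.1],
       [0.0, 0.0, 0.5],
       [0.0, 0.0, 0.0]]
distances = dijkstra(toy, 0)
best_probability = math.exp(-distances[2])
```

Here the two-step route is selected as the shortest path even though it contains more edges, because its overall probability (0.25) exceeds that of the direct transfer (0.1). Using the raw probabilities as edge lengths would instead have favoured the least likely connection.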
Results
In general, the values of the connectivity matrices depend strongly on the circulation present in the gulf during the period of the dispersal simulation. The typical circulation of the Gulf of Lion is a westward current regime (Figure 1). This was the case for matrices #7, #11, #12, #15 and #17. In this study, other types of circulation were also present. In particular, matrix #1 was obtained after a period of reversed (eastward) circulation, which is less frequent than the westward circulation (Petrenko et al., 2008). Matrices #4, #10 and #13 correspond to a circulation pattern with an enhanced recirculation in the centre of the gulf. Finally, matrices #2, #3, #5, #6, #8, #9, #14, #16, #18, #19 and #20 correspond to a rather diffuse circulation with no clear patterns.
Relying on these matrices, the first scenario of the metapopulation model study found evidence of a rescue mechanism of the sites located in the western part of the gulf by the sites in the eastern part. Indeed, a deletion of sites only in the eastern part always turned out to be critical. Conversely, a deletion of sites in the western part never was. This clearly shows that the sites in the eastern half of the gulf act as an essential source of larvae that prevents the extinction of the species in the GoL.
The second scenario highlighted how Sète is crucial to the persistence
378
of polychaete in the GoL. In fact the deletion of this site and of few neighbouring
379
ones brought to the species extinction in the whole gulf. On the other
20
Graph theory for species persistence
380
hand, the deletion of the other harbours required a simultaneous loss of
381
a higher number of near sites to be critical.
We now present the results from graph theory.
Figure 2 shows a geographical representation of the connectivity matrices and betweenness values. It highlights the strong dependency of betweenness on the larval release period, that is, on the circulation pattern present in the gulf. The representation used in Figure 2 shows simultaneously: (1) the geographical distribution of the sites, (2) the geographical direction of the connectivity by advection $a_{ij}$ (with the arrows pointing in the $i \to j$ direction) and (3), for each pair of sites, the difference between the probabilities of going from one site to the other and vice versa, indicated by the colors of the arrows. For clarity, connectivity values lower than 2/3 of the maximum one are not represented. When both the $i \to j$ and $j \to i$ probabilities are plotted, each arrow reaches only the midpoint between the nodes.
This kind of representation also captures the circulation patterns. For example, in Figure 2a there are fewer arrows in the west-to-east direction than in Figure 2b. Moreover, these arrows are almost always weaker than the corresponding east-to-west ones. The dominance of east-to-west arrows in Figure 2a reflects the westward circulation driving the connectivity in matrix #7, in contrast to Figure 2b (matrix #1), which is dominated by eastward circulation.
Figure 2d displays the connectivity values of the mean of the 20 variant matrices. The betweenness values are the mean of the betweenness values obtained for each node with the 20 different matrices. Note that this calculation is different from calculating betweenness from the mean matrix, which is a superposition of different current dynamics, because the betweenness measure is intrinsically nonlinear. A simple comparison of the betweenness values obtained from the mean matrix with the mean values of betweenness for each node (obtained from the 20 variant matrices) shows that the two procedures differ significantly. In our case, the betweenness values often differ by one order of magnitude or more between the two procedures (data not shown). We prefer to rely on the mean betweenness because it is statistically representative of 20 different realizations of the network, while the mean connectivity matrix has a nearly null probability of occurring. In Figure 2d, the representation of the mean connectivity values is useful to give an idea of the predominant circulation pattern over the periods related to the different connectivity matrices.
A simple but interesting feature of the connectivity matrices helps to clarify the influence of the circulation on the connectivity matrices and, consequently, on betweenness values (Figure 3): the number of non-null elements that they contain (see Figure 4a). This quantity helps explain the effects of the different circulation patterns on the connectivity between the analysed sites in the GoL, especially when combined with the network length (given in Figure 4b). In order to avoid infinite network length values, we substituted the infinities, which arise when null $a_{ij}$ values are inserted in Equation 6, with a constant. After a sensitivity analysis, we set this constant to 1000 times the maximum value of $d_{ij}$ in the different matrices. As this metric depends on the number of Lagrangian particles that do not reach any of the 32 sites, a high value of network length indicates that few connections exist in the network.
Thus, in a situation of eastward circulation (matrix #1), there are many more connections between the sites than in almost all the other connectivity matrices (892 non-null elements out of 1024). This implies that this kind of circulation retains many particles alongshore (at least during the initial and final parts of their larval period). Consequently, the fact that the network length value ($1.67 \times 10^{7}$) is much lower than the mean one ($4.90 \times 10^{7}$) tells us that there are many connections with small values: many paths share a limited amount of network length. Therefore, only very few paths are highly probable in the graph corresponding to matrix #1. For this reason, the betweenness values for matrix #1 are low for all the nodes (see Figure 3).
Case #11 is also peculiar. It has the lowest number of existing connections (376, Figure 4a) and the highest network length ($7.5 \times 10^{6}$, Figure 4b). The first figure indicates circulation dynamics that disperse many particles offshore, so there are only very few paths in the corresponding graph. Furthermore, considering network length, the paths with a probability much higher than the others are likely very few. In this situation, only the predominant paths are left and they all have approximately the same probability. The fact that node 21 has a significantly high betweenness value in matrix #11 (see Figure 3) is therefore a strong indication of its importance in the dynamics of the connectivity network.
All the other cases combine an intermediate number of null connections with an intermediate value of network length, so they cannot be interpreted as easily as cases #1 and #11 presented above.
In general, the results in Figure 3 show that in all these cases only node 21 has a much greater betweenness value (roughly one order of magnitude) than the other nodes. It corresponds to the site in front of Port Camargue, roughly at the centre of the Gulf of Lion. The consistency of this result across almost all the circulation patterns that can occur in the gulf (see Figure 3) makes us confident of the importance of this node in maintaining connectivity across the Gulf of Lion. Indeed, the highest-betweenness node is the node in which the highest flux of larvae settles and from which it restarts. Consequently, a high-betweenness node is likely a zone through which most of the gulf-scale migration flow over two successive reproductions passes. Such a zone thus plays an essential role in the efficiency of larval transfer across the network, as any recruitment failure limiting offspring production at that site will drastically reduce the number of larvae dispersed in subsequent generations in the entire region.
Minimum cycles computed with the new distance metric defined in the Methods section showed that the nodes with the greatest probability of seeing their particles return home after a period ranging from 2 to 5 generations are nodes 13 to 16 (Figure 5).
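As a rough illustration of the notion of a most probable return loop, the sketch below searches, for one start node, the highest-probability closed route of 2 to 5 steps. It is only a sketch under our own assumptions: the function name `best_return_probability` and the toy matrix are hypothetical, self-retention (the diagonal) is ignored, and the quantity is not the one plotted in Figure 5. Because every probability is at most 1, the best closed walk found this way always corresponds to a simple cycle.

```python
def best_return_probability(a, start, max_steps=5):
    """Highest probability of leaving `start` and returning in 2..max_steps steps."""
    n = len(a)
    # best[j]: highest probability of a walk start -> j with the current step count
    best = [a[start][j] if j != start else 0.0 for j in range(n)]
    result = 0.0
    for _ in range(2, max_steps + 1):
        nxt = [0.0] * n
        for j in range(n):
            # extend every walk by one edge k -> j, ignoring self-retention (k == j)
            nxt[j] = max(
                (best[k] * a[k][j] for k in range(n) if k != j),
                default=0.0,
            )
        best = nxt
        result = max(result, best[start])
    return result

# Hypothetical 3-site transfer probabilities (illustrative values only).
a = [
    [0.3, 0.5, 0.0],
    [0.2, 0.0, 0.4],
    [0.9, 0.0, 0.0],
]
p_return = best_return_probability(a, 0)
```

Here the three-step loop 0 → 1 → 2 → 0 (probability 0.5 × 0.4 × 0.9) beats the two-step loop 0 → 1 → 0 (probability 0.5 × 0.2).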
In addition, the nodes that appear most frequently in the total set of minimum cycles are nodes 13, 14, 15 and 16 (data not shown). They are all in the centre of the gulf. This zone thus seems to be a central core of sites that ensures the persistence of the species in the Gulf of Lion.
Cluster analysis based on the maximization of the modularity computed on transfer probabilities (see Figure 6) shows a fairly simple division of the network into two clusters: a western one (sites 1 to 18) and a central-eastern one (sites 19 to 32). This division is the one found most frequently after 200 000 runs of the modularity maximization algorithm (40% of the time against 10% for the other possible divisions). The majority of the other cases were almost identical but with a different attribution of the sites on the geographical boundary between the two clusters: sites 18, 19 and, in some cases, 20. The corresponding value of modularity for this division of the network is $Q = 0.16$.
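To make the meaning of a modularity value concrete, the sketch below evaluates $Q$ for a given partition of a small directed, weighted network. It assumes a commonly used directed form of modularity, $Q = \frac{1}{m}\sum_{ij}\left[w_{ij} - k_i^{out} k_j^{in}/m\right]\delta(c_i, c_j)$, which may differ in detail from the definition given in Methods. The function name `directed_modularity` and the toy weights are hypothetical, and the sketch only scores a partition; it does not perform the maximization.

```python
def directed_modularity(w, labels):
    """Modularity of a partition `labels` for a directed weighted matrix w[i][j] (i -> j)."""
    n = len(w)
    m = sum(sum(row) for row in w)            # total edge weight
    k_out = [sum(w[i]) for i in range(n)]     # outgoing strength of each node
    k_in = [sum(w[i][j] for i in range(n)) for j in range(n)]  # incoming strength
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected under a random null model
                q += w[i][j] - k_out[i] * k_in[j] / m
    return q / m

# Hypothetical 4-site network: strong exchange inside {0, 1} and {2, 3}, weak across.
w = [
    [0.0, 0.4, 0.1, 0.0],
    [0.4, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.4],
    [0.0, 0.1, 0.4, 0.0],
]
q_two_clusters = directed_modularity(w, [0, 0, 1, 1])
q_one_cluster = directed_modularity(w, [0, 0, 0, 0])
```

With these toy weights the two-cluster partition gives a positive value, while placing all sites in a single cluster gives zero, since the observed and expected weights then cancel exactly.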
This result means that the transfer of larvae with a pelagic larval duration of 3 to 5 weeks is organized in two groups of nodes that exchange more larvae within themselves than with the other group. These two clusters are likely to retain a substantial portion of their particles.
We finally present results on bridging centrality. For its calculation, we use the $c_{ij}$ connectivity values to be consistent with the idea underlying bridging centrality as formulated in Methods. The three nodes with the highest bridging centrality values are nodes 11, 12 and 16 (see Figure 7), with values of 600, 572 and 513 respectively. Following Hwang et al. (2008), these three nodes, representing the top ten percent of the 32 nodes, are likely to be crucial for the integrity of the network. That is, these nodes prevent the network from easily breaking into separate sub-networks.
Discussion and conclusions
The present study uses graph theory concepts to analyse the structure of connectivity within a regional metapopulation of soft-bottom polychaetes in the Gulf of Lion. The aim is to refine the physical-biological interpretation of different graph theory tools by comparing them with the results of a metapopulation model analysis of the same metapopulation.
For the first time, graph analysis is applied to Lagrangian trajectories based on a fully 3-D circulation model. Moreover, the dispersal time of the numerical particles is set to mimic the principal biological characteristics of polychaetes. Both these aspects complicate the dispersion dynamics. To the best of our knowledge, it is also the first time that graph analysis is used on such a restricted coastal spatial domain for conservation purposes. This adds difficulty to the analysis since it results in dense connectivity matrices. Nonetheless, the analysis converged to a meaningful result even under such constraints, and many results shed light on the convergence between metapopulation models and graph theory.
The first breakthrough of the present study is the demonstration that graph metrics are sensitive to flow variability: different circulation patterns correspond to networks with different connection patterns and different centrality values. To our knowledge, no previous work has shown this quantitatively.
An important measure in graph analysis is betweenness centrality. It has been widely used in the previous literature on connectivity estimation to indicate the nodes that are likely to act as crucial hubs for multi-generational connectivity (see for example Rozenfeld et al., 2008).
In contrast with graphs built with genetic distance (Rozenfeld et al., 2008) or connection time (Treml et al., 2008), particular attention is required when dealing with larval transfer probabilities between the sites forming the metapopulation network. In this case, the most immediate choice for the edge weights would be the probabilities themselves. This was the choice of Andrello et al. (2013) when studying the network of Mediterranean MPAs. With this choice, however, one obtains conceptually wrong results when applying the shortest path algorithm to calculate betweenness. In fact, with this metric, the shortest path is the most improbable one. This is meaningless if one wants to identify the crucial zones that maintain the connectivity within a network, since these sites would be the least frequented ones. The new node-to-node metric we propose resolves this inconsistency. It also consistently redefines how shortest paths are calculated, in accordance with the independence of the probabilities in the connectivity matrices.
High betweenness values have been used in the previous literature to identify “gateways” through which larvae have to pass during multi-generational migrations (Rozenfeld et al., 2008). In our study, site 21, offshore of Port Camargue, has the highest value of betweenness in almost all cases of circulation patterns in spring 2004 and 2006 in the Gulf of Lion. However, the metapopulation modelling study showed that losing this site would not endanger species persistence in the region at minimum recruitment success. Hence we draw attention to the fact that a “gateway” does not indicate a site essential for the persistence of the species. Indeed, even if a site with a high betweenness value is well preserved, its preservation ends up being useless from a conservation point of view if the other sites are lost. This is a remarkable example of how different analyses can cross-check each other in the estimation of connectivity.
We also specifically examine the spatial scale at which multi-generational flows form retention loops. A first attempt to identify the sites essential for persistence is to find the sites from which the shortest cycles start. The presence of a cycle explicitly means that the nodes that compose it are a potentially self-persistent network (Hastings and Botsford, 2005). The set of sites highlighted by this technique is located in the centre of the GoL, where the two points identified as critical by the metapopulation model (sites 10 and 18) are also found. However, the sites forming the shortest cycles lie between them (sites 13 to 16). This discrepancy cannot be due to the influence of the density-dependent factors in the metapopulation model used in Guizien et al. (2014), which are not considered in Hastings and Botsford (2005), as recruitment success was set to the minimum value and the saturating capacity was never reached in the metapopulation simulations. Conversely, the discrepancy between shortest path analysis and metapopulation modelling could be due to the particular scenarios considered for habitat loss in the metapopulation model analysis. Firstly, consistent with the fact that anthropic pressure decreases as distance from the harbours increases, the scenarios included only the four harbour sites. Secondly, under the hypothesis that anthropic pressure acts predominantly on neighbouring sites, the habitat removal procedure progressively eliminated neighbouring sites. Shortest path analysis supports this point of view: it shows that the nodes composing the shortest cycles are all close to one another, so the hypothesis that the survivorship of a site depends on the survivorship of the neighbouring ones is reasonable. Nevertheless, shortest path analysis does not include any assumption on geographical proximity within the path. Consequently, the results of shortest path analysis would be fully comparable only with a more general removal procedure than the one used in Guizien et al. (2014).
By exploiting the modularity maximization criterion, we identify two clusters of interconnected sites with a corresponding modularity value $Q = 0.16$. In general, there is no absolute threshold to discriminate between low and high values of modularity. Research on modularity has developed considerably since its introduction by Newman and Girvan (2004), and various shortcomings of this quantity are now well known; see for example Fortunato and Barthelemy (2006) and Kehagias and Pitsoulis (2013). In practice, a good general strategy is to study many networks of the same type (biological, genetic, social, etc.) with similar characteristics (betweenness, number of edges, etc.) and establish an appropriate modularity threshold. In our case, this would mean analysing the network of polychaetes in regions other than the GoL, or considering a different species. Although this was not possible in the framework of this study, considering that, by definition, $-1 < Q < 1$, we are confident in stating two things: (1) as our value of $Q$ is positive, there is a cluster structure and (2) as $Q$ is less than a fifth of the maximum possible value (that is, 0.2), we can define it as low. This means that the clusters exist but are not sharply separated and that, within the gulf, there is a considerable migration flow of polychaetes that is not highly organized spatially. This is not surprising given the layout of the studied sites, the topography of the Gulf of Lion and the complex circulation inside it, which all contribute to the high number of connections in the network. This is very likely a characteristic of a coastal environment where all the sites are alongshore and there is no considerable physical and/or dynamical barrier between them.
One possible objection to this method concerns the validity (see Methods) of a random model as a null hypothesis (as pointed out, for example, by Thomas et al., 2014). On this point, we are convinced that a random null model is the best choice to highlight the effect of a fundamental forcing of the biological system's dynamics that, although chaotic, is deterministic: the current field. Indeed, community structure in marine biological networks is likely to arise as a result of the action of currents.
The result of the modularity analysis is indirectly in agreement with the results of the metapopulation model. In fact, the analysis by Guizien et al. (2014) showed the presence of a rescue mechanism of the sites in the western part of the gulf by the eastern sites. This result indicates an organization of the metapopulation in two large sub-populations, in the same way as graph theory clustering does. The presence of a rescue mechanism is mirrored by the low value of modularity, that is, by the non-negligible amount of exchange between the different sites.
In addition, we can refine the description of the rescue mechanism using the results on bridging centrality. This measure is specifically built to identify the nodes that drive the communication between clusters (i.e., communities). We calculated the bridging centrality of each node in order to identify the sites essential to maintain connectivity between the two communities. In particular, we found that nodes 11, 12 and 16 are the top 10% of nodes by bridging centrality value. Bridging centrality targets the nodes crucial for the integrity of the network as a whole. Since the diagonal terms of the connectivity matrices do not influence the calculation of bridging centrality (see Methods), this measure characterizes the net effect of the regional-scale transport of larvae, disregarding the local retention of each site. It thus identifies the potential of a node to sustain the spreading of the species through the entire network. Following Hwang et al. (2008), the importance of each node in the network can be assessed through the number of isolated components created by removing the nodes sequentially. No isolated component was created by removing a single node. Isolated sub-networks appeared only after removing a large number of nodes. This result reflects the high average edge density of the 20 variant matrices ($\rho = 0.604$). However, many of the connections between the 32 sites are weak (data not shown), and the robustness of the biological network must be assessed independently of these numerous weak connections, which are not efficient. By setting a threshold to discard the many low-weight edges, the predominant connections were highlighted. The threshold value was set at 0.001 on the basis of the arguments presented in Appendix C. The deletion of nodes 11 and 12 led to nine isolated components (sets of nodes disjoint from other portions of the graph), mainly consisting of isolated single nodes or pairs of nodes. In contrast, the single removal of the other nodes created, on average, only seven isolated components (data not shown). We can thus conclude that nodes 11 and 12 are the most important nodes for the robustness of regional connectivity. In contrast, site 16 was no longer highlighted by the analysis. This is reasonable as the bridging centrality value of this node was lower than for nodes 11 and 12. In any case, the choice to consider the top ten percent of the nodes as important is just a guideline. Remarkably, the removal of node 21 had no particularly important effect on the fragmentation of the network. It is thus clear that bridging centrality adds information on the structure of the network beyond what betweenness alone can give. In fact, whereas high-betweenness nodes are the ones that ensure a high flux through the network, high bridging centrality identifies the nodes that prevent a critical fragmentation of the network, thus playing an important role in the spreading of the species in the region.
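The node-removal test described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `count_components` and the toy matrix are hypothetical, edges are kept only when their weight is strictly above the threshold, and components are counted ignoring edge direction (weak connectivity), which may differ from the convention used in the actual analysis.

```python
def count_components(a, threshold, removed=()):
    """Number of weakly connected components after thresholding and node removal."""
    removed = set(removed)
    nodes = [i for i in range(len(a)) if i not in removed]
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1          # a new, not yet visited component
        seen.add(start)
        stack = [start]
        while stack:             # depth-first search over retained edges
            i = stack.pop()
            for j in nodes:
                if j in seen or j == i:
                    continue
                # assumption: direction is ignored when testing connectivity
                if a[i][j] > threshold or a[j][i] > threshold:
                    seen.add(j)
                    stack.append(j)
    return components

# Hypothetical 4-site chain 0 -> 1 -> 2 -> 3 plus one weak edge 0 -> 3.
a = [
    [0.0, 0.01, 0.0, 0.0001],
    [0.0, 0.0, 0.01, 0.0],
    [0.0, 0.0, 0.0, 0.01],
    [0.0, 0.0, 0.0, 0.0],
]
intact = count_components(a, 0.001)
without_node_1 = count_components(a, 0.001, removed=[1])
without_node_1_all_edges = count_components(a, 0.0, removed=[1])
```

In this toy case, removing node 1 splits the thresholded network in two, whereas the weak edge hides the fragmentation when no threshold is applied. This is the reason for discarding low-weight edges before counting components.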
This study revisited the previous state-of-the-art interpretation of some graph theory metrics from a biological conservation point of view. Table 1 summarizes the principal measures used in this study and their physical-biological interpretation based on the comparison of graph theory analysis with a metapopulation model. This comparison highlights a potential misuse of betweenness values, which do not identify sites important for species persistence. Conversely, the comparison pinpoints the potential of shortest paths to identify multi-generational loops of persistence, and of bridging centrality to identify nodes important to the regional colonisation of the species. Another important message is that connectivity metrics based on larval transfer probability must be adapted in order to ensure a physically meaningful interpretation of betweenness and shortest paths.
Acknowledgements
This study was partially funded by the European CoCoNet Project and by the Ministère de l’Éducation Nationale, de l’Enseignement Supérieur et de la Recherche.
Table 1: Recapitulation of the four main measures used in this study. For each one we indicate scope and physical-biological interpretation.
Measure | Scope | Interpretation
------- | ----- | --------------
Minimum cycles | Identifying sites whose larvae have a high probability of returning home | Source sites.
Betweenness | Nodes through which the highest percentage of most probable paths pass | Nodes increasing larval transport at the base of persistence. Likely to result in a rescue mechanism.
Modularity | Find sets of nodes more than randomly connected | Communities.
Bridging centrality | Find nodes leading the communication between clusters | Nodes maintaining the connectivity at the whole network scale.
Appendices
A Metapopulation model
The model used by Guizien et al. (2014) can be written in matrix form as follows:
$$P(t + \Delta t) = \min\left(G(t)\,P(t),\; P_{\max}\right)$$
with a time step of $\Delta t$ and the growth transfer matrix $G$ defined as
$$G_{ij} = l_i\, a_{ij}(t)\, b_j + s_{jj}\,\delta(i,j)$$
where: $P_{\max} = 1/\alpha_A$, with $\alpha_A$ the mean cross-sectional area of one adult, is the site carrying capacity; $P(t) \in \mathbb{R}^{N}$ contains the spatial density of adults in each site $i \in [1, 32]$ at time $t$; $l_i$ [number of larvae per adult] is the propagule production rate at site $i$; $b_j$ [number of adults per larva] is the recruitment success at site $j$; $s_{jj}$ [no units] is the adult survivorship rate at site $j$; $\delta(i,j)$ is the Kronecker $\delta$-function; and $a_{ij}$ is the propagule transfer rate from site $i$ to site $j$. The larvae production rate $l_i$ is equal to the number of larvae produced by each adult female, $F \cdot SR \cdot f$, where $F$ is the fecundity rate, $SR$ is the sex ratio in the adult population, and $f$ is the probability of an egg being fertilized. The recruitment success $b_j$ accounts for all mortality losses from egg release until the first reproduction of new recruits, and includes mortality during the larval dispersal, settlement and juvenile stages. Notice that the adult survivorship rate can be related to the species life expectancy $L_E$ as $s_{jj} = e^{\ln(0.01)\,\Delta t / L_E}$, where the life expectancy $L_E$ is the age at which 99% of individuals of the same generation have died.
B Node-to-node metric properties
We verify that the new metric we propose for measuring the distance between nodes, $d_{ij} = \ln(1/a_{ij})$, with $a_{ij}$ being the probability of advection from site $i$ to site $j$ in a given time, is meaningful and respects the linearity property we expect for a distance. In fact, it satisfies both homogeneity and additivity:
1. Homogeneity:
$$\alpha\, d_{ij} = \alpha \ln\frac{1}{a_{ij}} = \ln\frac{1}{a_{ij}^{\alpha}}$$
that is, a distance multiplied by a scalar is still a distance.
2. Additivity:
$$d_{ik} + d_{kj} = \ln\frac{1}{a_{ik}} + \ln\frac{1}{a_{kj}} = \ln\frac{1}{a_{ik} \cdot a_{kj}} = \ln\frac{1}{a_{ij}} = d_{ij}$$
that is, the sum of two distances is still a distance and, in particular, the final distance is obtained by the multiplication of the partial original metric values. This aspect adapts particularly well to the problem addressed in the paper. In fact, the value of a path is calculated as the product of the probabilities associated with its components.
Note that, as the elements $a_{ij}$ have no units, the elements $d_{ij}$ also have no units.
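As a quick numerical check of additivity, consider the illustrative values $a_{ik} = 0.5$ and $a_{kj} = 0.2$ (chosen for convenience, not taken from the connectivity matrices of this study):
$$d_{ik} + d_{kj} = \ln\frac{1}{0.5} + \ln\frac{1}{0.2} = \ln 2 + \ln 5 = \ln 10 \approx 2.303 = \ln\frac{1}{0.5 \times 0.2}$$
The summed distance therefore corresponds to a two-step path of probability $0.5 \times 0.2 = 0.1$.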
C Guess estimation of a threshold for good connectivity
A threshold for establishing which edges play a major role in maintaining good persistence in a biological network can be established via an educated guess on the basis of physical reasoning. In particular, given a probability-based connectivity matrix, we can expect that, in an efficient network, a transfer rate $C_{ij}$ between 1 and the minimum non-null value $C^{m}$ of the matrix is necessary for the maintenance of an overall good connectivity. Following a common methodology for the estimation of the order of magnitude of unknown quantities, we can estimate this transfer rate $M$ as the geometric mean of 1 and $C^{m}$:
$$M = \sqrt{C^{m} \cdot 1}.$$
This kind of estimation has the advantage of considering a vast range of values for the variables at play in determining the unknown quantity, while not being biased by the choice of too large or too small extreme values. Therefore, after some steps of this kind, we are statistically confident in the quality of the estimate.
Dealing with living organisms, we also have to improve this estimate by accounting for the survivorship of the propagules. In an efficient network, we can expect that a percentage $S$ of propagules between $M$ and at least $\frac{1}{e}M$ is likely needed for the maintenance of the persistence of the species. We then estimate
$$S = \sqrt{\frac{M}{e} \cdot M}.$$
As a last step, we also account for the percentage of the surviving particles that successfully reproduce. A percentage $T$ between $S$ and $\frac{1}{e}S$ is likely needed for good persistence of the species in the habitat studied with graph theory. We estimate
$$T = \sqrt{\frac{S}{e} \cdot S}.$$
In our case, using the mean value of $M$ resulting from the 20 variant matrices, we have $T = 0.0041$, which we round to the value 0.001 used in our analysis.
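The chain of geometric means can be written as a short sketch. The function name `connectivity_threshold` and the input value are hypothetical; the minimum non-null transfer rate used below is chosen only for illustration and is not the value from the study. Algebraically the three steps collapse to $S = M/\sqrt{e}$ and $T = M/e$.

```python
import math

def connectivity_threshold(c_min):
    """Return (M, S, T) from the minimum non-null transfer rate c_min."""
    m = math.sqrt(c_min * 1.0)          # geometric mean of 1 and c_min
    s = math.sqrt((m / math.e) * m)     # geometric mean of M and M/e
    t = math.sqrt((s / math.e) * s)     # geometric mean of S and S/e
    return m, s, t

# Illustrative input only: a minimum non-null transfer rate of 1e-4.
m, s, t = connectivity_threshold(1e-4)
```

For this illustrative input, $M = 0.01$ and $T = 0.01/e$, that is, a threshold of a few thousandths.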
Please note that we also tested this estimate qualitatively. We found a sharp difference in the number of connections when changing the threshold by an order of magnitude. In fact, for a threshold equal to 0.01, we obtained an almost completely disconnected network, whereas, when retaining all values above 0.0001, hardly anything changed. Thus a threshold equal to 0.001 seems to be the threshold we were looking for in order to obtain what we call a connected minimal network.
List of captions
FIGURE 1. Schematic representation of the typical circulation in the Gulf of Lion. The thick arrow represents the dominant alongshore Northern Current. The thinner arrow represents the eastward currents that can be detected in stratified conditions or under particular wind field conditions. The positions of the 32 studied sites are plotted. Sites 3, 10, 18 and 32, used for the habitat loss scenario in the metapopulation model, are highlighted by bigger grey dots. Node 21 is the smallest of the grey dots. The grey lines correspond to the 100 m, 200 m, 1000 m and 2000 m isobaths.
FIGURE 2. Spatial representation of the connectivity matrices and betweenness values in different circulation situations. (a) Westward drift (matrix #7), (b) eastward drift (matrix #1), (c) central retention (matrix #10). Panel (d) shows the mean connectivity matrix values and the mean of the betweenness values. In all panels a threshold on the value of connectivity was imposed for clarity: connectivity values lower than 2/3 of the maximum one are not plotted. Values of betweenness are normalized to the maximum value found in each case.
FIGURE 3. Values of betweenness for the 32 sites using the 20 variant connectivity matrices. The normalization is done on the maximum value of betweenness obtained using the different variant matrices.
FIGURE 4. (a) Number of non-null elements in the 20 variant matrices. Different circulation patterns have different effects on the connectivity inside the Gulf of Lion. (b) Network length of the 20 variant matrices.
FIGURE 5. Sum of the products of the weights of all the cycles of length from 2 to 5 steps that start from each of the 32 sites. The minima correspond to the nodes for which the probability of particles returning home (in a time span of 2 to 5 generations) is higher.
FIGURE 6. Clusters identified with a criterion of maximization of modularity. The result is the average assignment of a node to one of the two clusters after 200 000 code runs with the 20 variant connectivity matrices. Different colors correspond to different clusters.
FIGURE 7. Bridging centrality values for the 32 sites. Geographical representation. Nodes 11 and 12 have the highest values: 600 and 572 respectively.
References
Andrello, M., Mouillot, D., Beuvier, J., Albouy, C., Thuiller, W., and
Manel, S. (2013). Low connectivity between Mediterranean marine
protected areas: a biophysical modeling approach for the dusky
grouper Epinephelus marginatus. PLoS ONE, 8.
Caswell, H. (2001). Matrix population models (second edition). Sinauer
Associates, Sunderland, MA, USA.
Dijkstra, E. (1959). A note on two problems in connexion with graphs.
Numerische Mathematik, 1:269–271.
Duraiappah, A. and Shahid, N. (2005). Millennium Ecosystem
Assessment. Ecosystems and human well-being: Biodiversity synthesis.
World Resources Institute, Washington, DC, USA.
Estournel, C., de Madron, X. D., Marsaleix, P., Auclair, F., Julliand, C.,
and Vehil, R. (2003). Observation and modeling of the winter coastal
oceanic circulation in the Gulf of Lion under wind conditions influenced
by the continental orography (FETCH experiment). Journal of Geophysical
Research, 108.
Fortunato, S. and Barthelemy, M. (2006). Resolution limit in community
detection. PNAS, 104:36–41.
Guizien, K., Belharet, M., Moritz, C., and Guarini, J. (2014). Vulnerability
of marine benthic metapopulations: implications of spatially structured
connectivity for conservation practice. Diversity and Distributions, pages
1–11.
Guizien, K., Belharet, P., Marsaleix, P., and Guarini, J. (2012). Using
larval dispersal simulations for marine protected area design: application
to the Gulf of Lions (NW Mediterranean). Limnology and Oceanography,
57.
Hastings, A. and Botsford, L. (2005). Persistence of spatial populations
depends on returning home. PNAS, 103:6067–6072.
Hwang, W., Taehyong, K., Murali, R., and Aidong, Z. (2008). Bridging
centrality: Graph mining from element level to group level. Proceedings
KDD 2008, pages 366–344.
Jacobi, M., André, C., Döös, K., and Jonsson, P. (2012). Identification of
subpopulations from connectivity matrices. Ecography, 35:31–44.
Kehagias, A. and Pitsoulis, L. (2013). Bad communities with high
modularity. The European Physical Journal B, 86.
Kool, J., Moilanen, A., and Treml, E. (2013). Population connectivity:
recent advances and new perspectives. Landscape Ecol., 28:165–185.
Marsaleix, P., Auclair, P., and Estournel, C. (2006). Considerations on
open boundary conditions for regional and coastal ocean models. Journal
of Atmospheric and Oceanic Technology, 23:1604–1613.
Millot, C. (1990). The Gulf of Lions hydrodynamics. Continental Shelf
Research, 10:885–894.
Newman, M. (2006). Modularity and community structure in networks.
PNAS, 103:8577–8582.
Newman, M. and Girvan, M. (2004). Finding and evaluating community
structure in networks. Physical Review E, 69.
Nicosia, V., Mangioni, G., Carchiolo, V., and Malgeri, M. (2009).
Extending the definition of modularity to directed graphs with
overlapping communities. Journal of Statistical Mechanics, 3.
Petrenko, A. (2003). Variability of circulation features in the Gulf of Lion
NW Mediterranean Sea. Importance of inertial currents. Oceanologica Acta,
26:323–338.
Petrenko, A., Dufau, C., and Estournel, C. (2008). Barotropic eastward
currents in the western Gulf of Lion, north-western Mediterranean Sea,
during stratified conditions. Journal of Marine Systems, 74:406–428.
Petrenko, A., Leredde, Y., and Marsaleix, P. (2005). Circulation in a
stratified and wind-forced Gulf of Lions, NW Mediterranean Sea: in situ
and modeling data. Continental Shelf Research, 25:7–27.
Rossi, V., Giacomi, E. S., Cristobal, A. L., and Hernandez-Garcia, E.
(2014). Hydrodynamic provinces and oceanic connectivity from a
transport network help designing marine reserves. Geoph. Res. Lett., 41.
Rozenfeld, A., Arnaud-Haond, S., Hernandez-Garcia, E., Eguiluz, V.,
Serrao, E., and Duarte, C. (2008). Network analysis identifies weak and
strong links in a metapopulation system. PNAS, 105:18824–18829.
Schick, R. and Lindley, S. (2007). Directed connectivity among fish
populations in a riverine network. Journal of Applied Ecology,
44:1116–1126.
Thomas, C., Lambrechts, J., Wolansky, E., Traag, V., Blondel, V.,
Deleersnijder, E., and Hanert, E. (2014). Numerical modelling and graph
theory tools to study ecological connectivity in the Great Barrier Reef.
Ecological Modelling, 272:160–174.
Treml, E., Halpin, P., Urban, D., and Pratson, L. (2008). Modeling
population connectivity by ocean currents, a graph theoretic approach
for marine conservation. Landsc. Ecol., 23:19–36.
Urban, D. and Keitt, T. (2001). Landscape connectivity: a graph theoretic
perspective. Ecology, 82:1205–1218.
FIGURES
Figure 1
Figure 2 [panels (a) to (d): maps of the 32 sites in the Gulf of Lion, between 3°E and 5°E around 43°N; color scales: Connectivity and Betweenness]
Figure 3 [axes: Sites and Variant Matrices]
Figure 4 [panel (a): Number of Not Null Elements against Matrices; panel (b): Network Length against Matrices]
Figure 5 [axes: Sites and Cycles Length]
Figure 6 [map of the 32 sites; axes: Longitude and Latitude]
Figure 7 [map of the 32 sites; color scale: Bridging Centrality]
TABLES
Matrix   Period                  Circulation Pattern
#1       Jan 5 to Jan 14         Eastward
#2       Jan 15 to Jan 24        Mixed
#3       Jan 25 to Feb 3         Mixed
#4       Feb 4 to Feb 13         Central Retention
#5       Feb 14 to Feb 23        Mixed
#6       Feb 24 to Mar 4         Mixed
#7       Mar 5 to Mar 15         Westward
#8       Mar 16 to Mar 26        Mixed
#9       Apr 27 to May 5         Mixed
#10      May 6 to May 16         Central Retention
#11      Jan 5 to Jan 14         Westward
#12      Jan 15 to Jan 24        Westward
#13      Jan 25 to Feb 3         Central Retention
#14      Feb 4 to Feb 13         Mixed
#15      Feb 14 to Feb 23        Westward
#16      Feb 24 to Mar 4         Mixed
#17      Mar 5 to Mar 15         Westward
#18      Mar 16 to Mar 26        Mixed
#19      Apr 27 to May 5         Mixed
#20      May 6 to May 16         Mixed
Table 2: Time period corresponding to each connectivity matrix; matrices
1 to 10 refer to year 2004 and matrices 11 to 20 to year 2006. The
circulation regime present in the Gulf of Lion in each period is also
indicated. A ‘Mixed’ regime indicates the simultaneous presence of more
than one of the simple regimes: westward, eastward or central retention
(see the Results for further details).