Graph theory for species persistence

Interpretation of graph theory metrics for species persistence in a metapopulation. The Gulf of Lion study case.

Andrea Costa1,2,*, Andrea M. Doglioli1,2, Katell Guizien3, Anne A. Petrenko1,2

1 - Aix Marseille Université, CNRS/INSU, IRD, Mediterranean Institute of Oceanography (MIO), UM 110, 13288 Marseille
2 - Université de Toulon, CNRS/INSU, IRD, Mediterranean Institute of Oceanography (MIO), UM 110, 83957 La Garde
3 - Laboratoire d’Ecogeochimie des Environnements Benthique, CNRS, Universite Paris VI, UMR8222, Av. du Fontaule - F-66651 Banyuls-sur-Mer (France)
* [email protected]

keywords: Connectivity, Graph Theory, Metapopulation Model, Bridging Centrality, Modularity, Clustering

Directed Weighted Graph theory for species persistence

Abstract

New challenges in biodiversity conservation focus on taking into account the connectivity of biological networks. Graph theory has recently been used to investigate the conditions for species persistence from different measures of connectivity in such networks. In the present study, for the first time, a set of metrics defined in graph theory is compared to a metapopulation modelling approach. The two approaches are compared to evaluate the persistence of soft-bottom polychaete populations in the Gulf of Lion (Mediterranean Sea). Various classical graph analysis concepts (betweenness, centrality and modularity) are tested. New descriptors (directed weighted bridging centrality, minimum cycles identification) are also derived and evaluated. The major innovation of this work is the introduction of a novel metric for measuring the distance between nodes in a graph when dealing with connectivity matrices containing larval transfer probabilities. The new metric ensures a physically meaningful interpretation of shortest paths and, consequently, of betweenness, which highlights the nodes ensuring efficient transport in a biological network.
The comparison with metapopulation simulations makes it possible to ground the interpretation of the graph descriptors in terms of species persistence. Graph theory complements the results of the metapopulation model by adding an exhaustive analysis of the spatial information. In particular, modularity and bridging centrality are shown to characterize clusters of interconnected nodes (communities) and to highlight the rescuing sites. The sites ensuring the species’ regional persistence are better indicated by bridging centrality and shortest cycle lengths.

Introduction

Losses of biodiversity at sea due to deleterious effects of human activities (e.g., habitat destruction, overfishing, etc.) are currently expected to be mitigated by the implementation of Marine Protected Areas (MPAs); see Duraiappah and Shahid (2005) for a comprehensive report on the subject. The basic assumption of this approach is that, if a carefully chosen portion of the whole marine biological network is protected, the network will not be subject to breakdowns generating critical losses of biodiversity and individual abundances in the marine ecosystem. Indeed, if a sufficient number of populations in the sea are sufficiently connected with other similar populations, the survival of the species is ensured. Thus, the challenge is to identify the sub-networks that could permit species persistence in a whole habitat by minimizing mortality only in some areas within it. Equivalently, the problem is to identify the minimal sub-network that can maximise the connectivity of the whole network. In this way it will be possible to protect the network while minimizing the costs of implementing MPAs.
However, there are two major problems in identifying key sites for the conservation of a set of species distributed in disjunct sites between which species may disperse. First, great dissimilarities in dispersal ability among species translate into different connection patterns. Second, species interactions with the environment or among themselves in the different sites also affect the persistence of species in the network differently. As an example, in the case of marine benthic invertebrates whose dispersal occurs during the pelagic larval stage, differences among species in spawning period or pelagic larval duration lead to different hydrological connectivity, measured by the probability of transfer of larvae from one site to another (Guizien et al., 2014). This variety of aspects has led to different methodologies for tackling the problem and, consequently, to different ways of defining connectivity. The reader can refer to Kool et al. (2013) and references therein for a review of the different techniques for estimating population connectivity in different contexts.

In the present work we carry out a comparison between two techniques with different concepts of connectivity: the metapopulation model and graph theory.

On the one hand, graph theory identifies well-connected networks with integral networks that have an efficient transfer within them. Graph theory analyses have been successfully applied to different biological networks at varying levels of integration. Graph theory was first introduced in ecology by Urban and Keitt (2001) in a study of landscape connectivity, in which connections were identified with the distance between different patches. Schick and Lindley (2007) extended its use to river networks, whose connections were set according to the carrying capacity of the rivers. Treml et al.
(2008) studied a network of marine islands whose connections were estimated from flow direction and intensity. Rozenfeld et al. (2008) exploited graph theory tools to infer gene flux in a network of marine populations from genetic similarity. Jacobi et al. (2012) used graph theory for the identification of marine communities. Andrello et al. (2013) used graph theory concepts in the estimation of the connectivity among marine oases.

On the other hand, the metapopulation model identifies well-connected networks with networks ensuring the regional persistence of the species, even if at a very local scale. This kind of model links local demography and regional dispersal. Population dynamics models have been extensively used to investigate conditions of species persistence, identified by eigenvalues of the growth transfer matrix larger than unity (Caswell (2001) and Hastings and Botsford (2005)). Meanwhile, to our knowledge, no systematic methodology has been proposed for identifying the sites of the metapopulation that are essential for the regional persistence of the species. Recently, ad hoc simulations based on threat scenarios have been used to identify core sites in a metapopulation model of the Gulf of Lion (NW Mediterranean Sea) by Guizien et al. (2014).

In the present study, we take advantage of the different definitions of a well-connected network in the two techniques. In particular, we identify (among both classic and new graph theory measures) the graph theory tools that give information coherent with that given by the metapopulation model analysis of Guizien et al. (2014). In this way we can incorporate the metapopulation model point of view into the graph theory concept of a well-connected network.
Such an interpretation becomes possible by using a metapopulation model with spatially uniform demographic parameters and a short-lived species, thereby avoiding demographic effects such as accumulation over multiple generations. It is important to notice that, in return, graph analysis made it possible to extend systematically the spatial resolution of the information inferred from the metapopulation scenario simulations. As a result, our comprehension of the mechanisms underlying high connectivity of marine biological networks is enriched in both directions.

The Gulf of Lion (GoL) was selected as a case study because of the numerous studies, both physical and biological, already performed in this area that can be used to interpret and validate our results. The GoL is located in the north-western Mediterranean Sea and is characterized by a large continental margin (Figure 1) dominated by soft bottoms forming the habitat of uniform polychaete assemblages in the 10 to 30 m depth range. Its hydrodynamics is complex and highly variable (Millot, 1990). The circulation is strongly influenced by the Northern Current (NC), which constitutes an effective dynamical barrier blocking coastal waters on the continental shelf (Petrenko, 2003) and delimits the regional scale of hydrodynamical connectivity. Exchanges between the GoL and offshore waters are mainly induced by processes associated with the NC (Petrenko et al., 2005). Hydrodynamical connectivity was quantified by larval transfer probability between 32 sites along the shore of the GoL (Figure 1).

During stratified conditions, barotropic eastward currents can be detected, mostly in the western part of the gulf (Petrenko et al., 2008). Alternatively, in unstratified winter conditions, eastward or south-westward currents can occur due to the rotational wind field (Estournel et al., 2003).
The present study also gives high relevance to the way graph analysis is implemented in order to obtain meaningful results, especially when calculating betweenness centrality. Concerned about the validity of some of the metrics adopted in previous literature as distance between nodes, we propose here a new methodology that makes it possible to obtain physically meaningful results from graph theory analysis. This is a crucial aspect of the analysis that potentially avoids groundless choices in environmental policy and management decisions, especially when dealing with the efficiency assessment of existing MPAs and the effect of potential ones.

The paper is organized as follows. In the Methods Section we explain the methodology used for obtaining the common input of graph theory and the metapopulation model: the connectivity matrices from Lagrangian dispersal simulations. We also report the basic concepts of the metapopulation modelling carried out by Guizien et al. (2014). We then summarize some basic concepts of graph theory and explain why it is possible to apply them to our problem. Lastly, we introduce a new metric for measuring the node-to-node distance in graphs built with current-based connectivity matrices. In the Results Section we recall the main results from Guizien et al. (2014) and present the systematic analysis of the hydrological connectivity matrices with graph theory. In the Discussion we examine the results of the graph theory analysis in the light of the metapopulation simulations.

Methods

In the first part of this section, we recapitulate the characteristics of the Lagrangian dispersal simulation that resulted in the 20 variant connectivity matrices used in Guizien et al. (2014) and in the present study.
We then recall the principal characteristics of the metapopulation model used by Guizien et al. (2014). In the subsequent part, we present the essential concepts of graph theory that we used in data analysis and in the interpretation of results. Finally, we introduce a novel metric for the node-to-node distance in graphs built on current-based connectivity matrices. We also analyse the consequences of its use.

Fundamentals about metapopulation models

The metapopulation model used by Guizien et al. (2014) describes explicitly, in discrete time and for a set of sites connected by larval transfer, the dynamics of the spatial population density of the sedentary adult stage of soft-bottom polychaetes.

Larval transfer is derived from Lagrangian dispersal simulations with a three-dimensional circulation model (see Marsaleix et al., 2006) at a horizontal resolution of 750 m. Spawning was simulated by releasing 30 particles in the center of each of the 32 sites, on the 30 m isobath, every hour from January 5 at 0h until April 13 at 23h in 2004 and 2006 (Guizien et al., 2012). The final positions of larvae after three, four and five weeks were processed to compute the proportion of larvae coming from an origin site and arriving at a settlement site. Connectivity matrices were then built for ten consecutive 10-day spawning periods in each year and for each of the three different pelagic larval durations (3, 4 and 5 weeks). See Table 2 for the periods to which each connectivity matrix corresponds.

Population density at a given time at a given site results from spatially structured local survivorship and reproductive success inputs potentially depending on all the other sites in the system.
The model accounts for both (i) recruitment limitation due to space availability at the destination site (computed as the proportion of free space based on the saturating density of adults), and (ii) the variability in propagule transfer rate. See Appendix A for the mathematical details of the model.

The analysis by Guizien et al. (2014) assessed the effect of connectivity on population persistence and spatial distribution at the limit of population density equilibrium. Other simulations explored the resistance of a short-lived species to two types of scenarios. The first scenario aimed to quantify the resistance of the metapopulation to habitat loss around the four main ports (Figure 1), by increasing the number of unsuitable sites starting from each port and proceeding symmetrically around it. This procedure is consistent with the most likely scenario of habitat loss due to an expansion of industrial activities in the coastal zone. The second scenario assessed the resistance of the metapopulation to recruitment failure affecting either the eastern or western part of the GoL.

Basic concepts of graph theory

Here we present both the essential bases of graph theory and some new concepts that are necessary to develop a coherent methodology for the analysis of connectivity matrices obtained as described above.

Mathematically speaking, a graph G is a set (V, E) of nodes V and edges E. The set V represents the collection of objects under study, which are pair-wise linked by an edge representing a relation of interest between the two objects. When the relation is symmetric, the graph is said to be ‘undirected’, otherwise it is ‘directed’.
An example of an undirected graph in the context of biological network studies is the genetic distance among populations used in Rozenfeld et al. (2008), while an example of a directed graph is the probability of connection due to the current field between two zones of the sea, as in Rossi et al. (2014). If every existing edge has the same importance as the others, the graph is said to be ‘binary’, that is, the edges can exist or not. If each edge has a specific relative importance, a weight can be associated with each of them and the graph is then called ‘weighted’. The total weight of the connections of a node $i \in V$ is called the total degree $k(i)$. In an undirected graph, it is simply the number of edges incident on the node. In a directed graph, it is possible to distinguish between the in-degree and the out-degree. The first is the sum of the values of the edges terminating in the node, $k^{in}(i) = \sum_j a_{ji}$, while the second is $k^{out}(i) = \sum_j a_{ij}$, with $j \in V$ and $i \neq j$. Here the values $a_{ij}$ are the terms of the connectivity matrix, where the values of the edges from node $i$ to node $j$ are stored. The density $\rho$ of a graph can be defined as the ratio between the number of existing edges and its maximum possible value. For a directed graph with $N$ nodes we have:

$$\rho = \frac{\text{number of not null edges}}{N \cdot (N-1)}$$

When $\rho = 1$, the graph is said to be complete.

We also introduce two quantities useful in our analysis. The network length is defined as the sum of all the elements of a connectivity matrix. The number of not-null elements of a connectivity matrix is the number of non-zero elements that it contains.

In our case, we deal with weighted directed graphs.
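As an illustration of the degree and density definitions given above, the following minimal Python sketch computes them with NumPy for a small connectivity matrix. It is not part of the original analysis: the function name `degree_summary` and the 3-site matrix are hypothetical, and the density is assumed to count only off-diagonal non-zero entries.

```python
import numpy as np

def degree_summary(a):
    """In-degree, out-degree and density of a directed weighted graph.

    `a` is a square connectivity matrix with a[i, j] the weight of the
    edge from node i to node j. Diagonal terms are ignored (i != j).
    """
    off = np.asarray(a, dtype=float).copy()
    np.fill_diagonal(off, 0.0)      # drop self-connections
    n = off.shape[0]
    k_out = off.sum(axis=1)         # k_out(i) = sum_j a_ij
    k_in = off.sum(axis=0)          # k_in(i)  = sum_j a_ji
    # Assumption: density counts non-zero off-diagonal edges only.
    density = np.count_nonzero(off) / (n * (n - 1))
    return k_in, k_out, density

# Hypothetical 3-site connectivity matrix (illustrative values only).
a = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.3, 0.4],
              [0.0, 0.0, 0.6]])
k_in, k_out, density = degree_summary(a)
```

The total degree used later for the bridging coefficient would then be `k_in + k_out`.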
The nodes of our graphs represent the sites used in the metapopulation model, while the edges represent the non-null probability with which a Lagrangian particle released in one of these sites is transported, after a certain amount of time corresponding to the larval duration period, to another of these sites.

In a directed unweighted graph, it is possible to define the shortest path $\sigma_{i,j}$ connecting two nodes $i \in V$ and $j \in V$ as the shortest possible alternating sequence of nodes and edges, beginning with $i$ and ending with $j$, such that each edge connects the preceding node with the succeeding one. The definition can be extended to directed weighted graphs: the shortest path is the one with the lowest cost between two nodes. The most frequent choice to define the cost of a path is the sum of its edges’ weights. Nonetheless, other alternatives are possible and will be discussed in more detail later.

The definition of the centrality measure called betweenness $BC(k)$, $k \in V$, is based on the concept of shortest path. The betweenness estimates the relative importance of a node $k$ within a graph by counting the fraction of existing shortest paths $\sigma_{i,j}$ that effectively pass through this node, $\sigma_{i,j}(k)$:

$$BC(k) = \sum_{i \neq k \neq j} \frac{\sigma_{i,j}(k)}{\sigma_{i,j}} \qquad (1)$$

A widely used method for identifying clusters in physical networks is the maximum modularity criterion first introduced by Newman and Girvan (2004). It arises from the observation that simply counting edges is not an effective way to quantify the concept of community structure. The partition of a network that simply minimizes the number of inter-cluster connections while maximizing the intra-cluster ones does not necessarily prove to be a good one when confronted with reality. A good division would rather be one in which there are more (or fewer) edges between clusters than expected.
A method to quantify this idea, that true cluster structures (i.e., communities) in a network are mirrored by a statistically unexpected disposition of edges, was proposed by Newman and Girvan (2004). Their method is based on the concept of modularity. Modularity $Q$ is defined, up to a multiplicative constant, as the difference between the number of edges falling within given groups of nodes and the number of such edges expected by chance in a random network that preserves the degree distribution of the original graph. The latter is a network that conserves the degree values but with randomly placed edges (further details can be found in Newman, 2006). The values of modularity can be either positive or negative, with positive values indicating the possible presence of community structure. Therefore we are able to investigate the community structure of a network by looking for the divisions of the network associated with a maximum value of modularity. Given a network, let $c_i$ be the community to which node $i$ is assigned. For a directed weighted graph the modularity assumes the form (see Nicosia et al., 2009, for details):

$$Q = \frac{1}{m} \sum_{i,j \in V} \left[ a_{ij} - \frac{k_i^{out} k_j^{in}}{m} \right] \delta(c_i, c_j) \qquad (2)$$

where $k_i^{out}$ and $k_j^{in}$ are the degrees of the nodes $i$ and $j$; $m = \sum_i k_i$ and $\delta(c_i, c_j)$ is the Kronecker $\delta$-function.

Exploiting a reformulation of modularity in matrix formalism, it is possible to recursively explore all the possible divisions of a network in order to identify the one that maximizes the modularity value of the network without requiring exceedingly high computational power (see Newman, 2006, for details). One drawback of the algorithm is an intrinsic variability that eventually makes the results not completely compatible between different runs of the analysis.
For example, certain nodes could be assigned to different clusters without changing the maximum value of $Q$. This inconvenience can be bypassed by running the analysis multiple times and taking, as the best division, the one that is most frequently found. In the present work we ran the analysis 10 000 times on each of the 20 variant matrices, hence a total of two hundred thousand runs.

In order to extract all the possible information from the connectivity matrices about the role played by the different sites, we also used the bridging centrality $C_{BR}$. This measure was first proposed by Hwang et al. (2008) for undirected unweighted graphs. For our analysis we extended its use to directed weighted graphs.

Bridging centrality highlights those nodes that connect different clusters of a network. It is derived both from the betweenness value of a node and from the bridging coefficient, a topological factor that accounts for the probability of leaving the direct neighbourhood of the node by starting from one of the nodes composing it. Intuitively, nodes with a high number of such edges fall on the boundary of clusters. In Hwang et al. (2008), for a node $i \in V$, the topological factor is defined as:

$$\Psi_{uu}(i) = \frac{1}{k(i)} \sum_{v \in N(i)} \frac{\Delta(v)}{k(v) - 1} \qquad (3)$$

where $k(i)$ is the degree of the node $i \in V$ and $N(i)$ the direct neighbourhood of $i$, that is, the set of nodes reachable from $i$ in one step. $\Delta(v)$ is the out-degree of a node $v \in N(i)$ once the edges going from $v$ to other nodes in $N(i)$ have been deleted. The generalization to directed weighted graphs basically consists in accounting for the weight of the edges and in checking which edges are effectively leaving the neighbourhood of the node. Then, we correct the out-degree $\Delta(v)$ via the term $a_{vi}$ and the total degree of $v$ via the term $-(a_{iv} + a_{vi})$.
Note that, for this calculation, all the terms $a_{vv}$ on the diagonal of the connectivity matrix are suppressed because they are irrelevant. The redefinition of the bridging coefficient in the directed weighted case is then:

$$\Psi_{dw}(i) = \frac{1}{k^{tot}(i)} \sum_{v \in N(i)} \frac{\Delta(v) - a_{vi}}{k^{tot}(v) - (a_{iv} + a_{vi})} \qquad (4)$$

where $k^{tot}(i) = k^{in}(i) + k^{out}(i)$ is the total degree of the node $i \in V$. In this way, we retain both the information on the flux of information through a node (given by the betweenness) and the topological information on the position of this node relative to clusters (given by the bridging coefficient). In fact, a node falling on the border of a cluster and channelling a high flux of information will have both a high bridging coefficient and a high betweenness value. The removal of such a high bridging centrality node would therefore have a much more disruptive effect than the removal of a node having only a high betweenness value or a high bridging coefficient alone (see Hwang et al., 2008, for an analysis and discussion of this phenomenon in the undirected case). An important aspect to pay attention to, when calculating the betweenness centrality and the topological factor of a node, is the different orders of magnitude in play. While the first is normalized to one, the second is not: its value depends upon the particular metric used to define the distance between the nodes. However, we do not want to give excessive importance to either the betweenness or the bridging coefficient. We want the two parameters to contribute with equal importance in characterizing the centrality of a node. Thus, following the suggestions of Hwang et al.
(2008), we: (1) calculate the betweenness centrality and the topological factor for each node, (2) calculate the rank vectors of the nodes on the basis of their betweenness and bridging coefficient values, and (3) calculate the bridging centrality as:

$$H_{BR}(i) = \Gamma_{BR}(i) \cdot \Gamma_{\Psi}(i) \qquad (5)$$

where $\Gamma_{BR}(i)$ is the rank of a node $i$ in the betweenness vector and $\Gamma_{\Psi}(i)$ is the rank of a node $i$ in the topological factor vector. Bridging centrality allows us to identify the nodes which are likely to be on the boundaries of the clusters and hence able to prevent the fragmentation of the network into isolated components.

Another concept of graph theory is the cycle; despite its simplicity it turns out to be useful in the study of species multi-generational persistence. Cycles are defined as those paths that, starting from node $i \in V$, end up at the node $i$ itself after a certain number $L$ of steps. Note that, in our work, we want to distinguish between the effect of the particles remaining at the same site and the effect of the ones leaving the site and coming back. The latter effect can be evaluated taking $L \geq 2$. One of the essential requisites for ensuring the persistence of a species in a given zone is a high probability of seeing the larvae return home after a certain number of generations (see Hastings and Botsford, 2005, for details). This means that the shorter the cycle starting from a given node, the more likely the site is important for persistence. In fact, in this case, the survival of the site would be quite independent of the import of larvae from other sites. Thus it can act as a source in our network.

The main practical problem of this kind of analysis is the generally overwhelming computational power required.
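To make the notion of a shortest cycle concrete, here is a minimal Python sketch. It is an illustrative simplification, not the algorithm used in this study: instead of enumerating every cycle, it uses a breadth-first search that only returns the length of the shortest cycle through each node. The function name, the `max_len` parameter and the 5-site matrix are hypothetical; an edge is assumed to exist whenever the matrix entry is positive, and self-connections are ignored so that only larvae leaving and coming back are counted.

```python
from collections import deque

def shortest_cycle_lengths(a, max_len=5):
    """Shortest cycle length (2 <= L <= max_len) through each node.

    Returns None for a node with no such cycle. Self-loops (a[i][i])
    are skipped on purpose.
    """
    n = len(a)
    succ = [[j for j in range(n) if j != i and a[i][j] > 0] for i in range(n)]
    lengths = []
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        found = None
        while queue and found is None:
            node = queue.popleft()
            if dist[node] >= max_len:      # any return would exceed max_len
                break
            for nxt in succ[node]:
                if nxt == start:           # closed the loop back to start
                    found = dist[node] + 1
                    break
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        lengths.append(found)
    return lengths

# Hypothetical 5-site matrix: 0 <-> 1, 1 -> 2 -> 3 -> 1, site 4 isolated.
a = [[0.0, 0.3, 0.0, 0.0, 0.0],
     [0.2, 0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.4, 0.0],
     [0.0, 0.1, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.9]]
lengths = shortest_cycle_lengths(a)
```

In this toy example sites 0 and 1 lie on a two-step cycle, sites 2 and 3 on a three-step cycle, and site 4 only retains its own larvae, so it has no cycle with L of at least 2.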
We used an algorithm that recursively finds all the possible cycles for every node of the network, thus involving a complexity of $(N-1)^{L-1}$. Indeed, our analysis was feasible because the number of nodes ($N = 32$) in our network is small. Nonetheless, we were constrained to limit $L$ to 5; hence $L$ is between 2 and 5.

A new metric for node-to-node distance

An essential aspect in analysing biological network stability and structure with graph theory is the choice of the metric used to define the distance between the nodes of the corresponding graph. Above all, this choice has important consequences for the physical interpretation of the results. In principle, many choices are possible: the genetic distance was used in Rozenfeld et al. (2008), the connection time between sites in Treml et al. (2008), and the larval transfer probability in, for example, Andrello et al. (2013).

Here we propose the use of a new metric to define the distance between nodes when dealing with larval transfer probabilities, in order to ensure that the largest larval transfer probability between two nodes corresponds to the smallest node-to-node distance. Such a transformation makes it possible to maintain a meaningful calculation of the betweenness values of nodes when applying the Dijkstra algorithm for shortest path finding (Dijkstra, 1959).

We define the distance between two nodes $i$ and $j$ as:

$$d_{ij} = \ln\left(\frac{1}{a_{ij}}\right) \qquad (6)$$

where $a_{ij}$ is the connectivity probability given by the connectivity matrices used in the metapopulation model. Notice that Equation (6) respects the physical properties of a distance (see Appendix B for a detailed demonstration). This definition combines two functions: $h(x) = 1/x$ and $f(x) = \ln(x)$.
The use of $h(x) = 1/x$ is, among different possibilities, the transformation we prefer to reverse the ordering of the metric in order to make it physically compatible with the concept of shortest path, at the base of betweenness. The use of $f(x) = \ln(x)$ is due to the nature of the connectivity values and of the shortest path algorithms. In fact, we must bear in mind three facts about the connectivity values: (1) these values are calculated by considering the position of the Lagrangian particles only at the beginning and at the end of the advection period; (2) we are discarding the information on the effective path taken by a particle: the probability to go from $i$ to $j$ is independent of the zone from which the particle arrived in $i$; and (3) the calculation of the shortest paths implies the summation of a variable number of these connectivity values (this is equivalent to saying that, in the calculation of betweenness, we are considering paths whose values are calculated on a different number of generations). Considering these facts, we clearly understand that our probabilities are intrinsically independent of one another. But a problem arises here: as we said above, most algorithms calculate the shortest paths as the summation of the edges composing them (e.g., the Dijkstra algorithm, Dijkstra, 1959). This is incompatible with the independence of the probabilities at play here, which should be multiplied rather than summed. The metric we propose above, thanks to the basic property of logarithms, $\ln(x) + \ln(y) = \ln(xy)$, allows us to use classical shortest path algorithms while dealing correctly with the independence of our connectivity values. In fact, we are de facto calculating the value of a path as the product of the values of its edges.
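The equivalence between summing the distances of Equation (6) and multiplying the transfer probabilities can be checked with a short Python sketch. This is only an illustration under stated assumptions, not the code used in this study: the function names and the 3-site matrix are hypothetical, and the Dijkstra implementation is a plain textbook version based on the standard `heapq` module. Since probabilities are at most 1, all distances are non-negative, which is what the Dijkstra algorithm requires.

```python
import heapq
import math

def log_distance(a_ij):
    """d_ij = ln(1 / a_ij); a null probability gives an infinite distance."""
    return math.log(1.0 / a_ij) if a_ij > 0 else math.inf

def most_probable_path(a, source, target):
    """Dijkstra on d_ij = ln(1/a_ij); returns (path, path probability)."""
    n = len(a)
    dist = [math.inf] * n
    prev = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue                      # stale heap entry
        for j in range(n):
            if j == i or a[i][j] <= 0:    # skip self-loops and null edges
                continue
            nd = d + log_distance(a[i][j])
            if nd < dist[j]:
                dist[j] = nd
                prev[j] = i
                heapq.heappush(heap, (nd, j))
    if math.isinf(dist[target]):
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # exp(-sum of ln(1/a)) equals the product of the edge probabilities.
    return path, math.exp(-dist[target])

# Hypothetical example: the two-step route 0 -> 1 -> 2 (0.5 * 0.5 = 0.25)
# is more probable than the direct edge 0 -> 2 (0.1).
a = [[0.0, 0.5, 0.1],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
path, prob = most_probable_path(a, 0, 2)
```

With the raw probabilities used directly as edge costs, a summation-based algorithm would instead favour the direct edge, whose cost of 0.1 is lower than 0.5 + 0.5, although it is the less probable route.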
It is worth mentioning that the values $d_{ij} = \infty$, resulting from the values $a_{ij} = 0$, do not influence the calculation of betweenness values via the Dijkstra algorithm. The reader is referred to Appendix B for details.

Results

In general the values of the connectivity matrices depend strongly on the circulation present in the Gulf during the period of the dispersal simulation. The typical circulation of the Gulf of Lion is a westward current regime (Figure 1). This was the case for matrices #7, #11, #12, #15, #17. In this study, other types of circulation were also present. In particular, matrix #1 was obtained after a period of reversed (eastward) circulation. Indeed this type of circulation is less frequent than the westward circulation (Petrenko et al., 2008). Matrices #10, #4 and #13 correspond to a circulation pattern with an enhanced recirculation in the centre of the gulf. Finally, matrices #2, #3, #5, #6, #8, #9, #14, #16, #18, #19, #20 correspond to a rather diffuse circulation with no clear patterns.

Relying on these matrices, the first scenario of the metapopulation model study found evidence of a rescue mechanism of the sites located in the western part of the gulf by the sites in the eastern part. Indeed a deletion of sites only in the eastern part always turned out to be critical. Conversely, a deletion of sites in the western part never was. This fact clearly shows that the sites in the eastern half of the gulf act as an essential source of larvae that prevents the extinction of the species in the GoL.

The second scenario highlighted how Sète is crucial to the persistence of polychaetes in the GoL. In fact the deletion of this site and of a few neighbouring ones led to the extinction of the species in the whole gulf. On the other
On the other 20 Graph theory for species persistence 380 hand, the deletion of the other harbours required a simultaneous loss of 381 a higher number of near sites to be critical. 382 We now present the results from graph theory. 383 Figure 2 shows a geographical representation of the connectivity matrices 384 and betweenness values. It highlights the strong dependency of betweenness 385 on the releasing period of the larvae, that is on the circulation pattern 386 present in the gulf. The representation we use in Figure 2 permits us to 387 see simultaneously: (1) the geographical distribution of the sites, (2) the 388 geographical direction of the connectivity by advection aij (with the arrows 389 pointing in the i → j direction) and (3), for each couple of sites, the difference 390 between the probability to go from one site to the other or vice versa by 391 looking at the different colors of the arrows. For clarity the connectivity 392 values lower than 2/3 of the maximum one are not represented. When 393 both probabilities in i → j and j → i directions are plotted, the arrows 394 reach only the mid-distance between the nodes. 395 This kind of representation permits also to capture the circulation 396 patterns. For example, we can see that, in Figure 2a, there are less arrows 397 in the west-to-east direction than in Figure 2b. Moreover these arrows 398 are almost always weaker than the corresponding east-to-west ones. For 399 example the westward circulation driving the connectivity in matrix #7 400 reflects the dominance of east-to-west arrows in Figure 2a compared to 401 Figure 2b (matrix #1), dominated by eastward circulation. 402 Figure 2d displays the connectivity values of the mean of the 20 variant 21 Graph theory for species persistence 403 matrices. The betweenness values are the mean of the betweenness values 404 obtained for each node with the 20 different matrices. 
Note that, because betweenness is an intrinsically non-linear measure, this calculation is different from calculating betweenness from the mean matrix, which is a superposition of different current dynamics. A simple comparison of the betweenness values obtained from the mean matrix with the mean values of betweenness for each node (obtained from the 20 variant matrices) shows that the two ways of proceeding are significantly different. In our case the betweenness values often differ by one order of magnitude or more between the two approaches (data not shown). We prefer to rely on the mean betweenness because it is statistically representative of 20 different realizations of the network, while the mean connectivity matrix has a nearly null probability of occurring. In Figure 2d the representation of the mean connectivity values is nevertheless useful to give an idea of the predominant circulation pattern over the periods related to the different connectivity matrices.

A simple but interesting feature of the connectivity matrices helps to clarify the influence of the circulation on the connectivity matrices and, consequently, on betweenness values (Figure 3): the number of non-null elements that they contain (see Figure 4a). This quantity reveals the effects of the different circulation patterns on the connectivity between the analysed sites in the GoL, especially when it is combined with the network length (given in Figure 4b). In order to avoid infinite network length values, we substituted the infinities, which come from inserting the null $a_{ij}$ values in Equation 6, with a constant. After a sensitivity analysis we set this constant to 1000 times the maximum value of $d_{ij}$ in the different matrices.
As this metric depends on the number of Lagrangian particles that do not reach any of the 32 sites, a high value of network length indicates that few connections exist in the network.

Thus, in a situation of eastward circulation (matrix #1), there are many more connections between the sites than in almost all the other connectivity matrices (892 non-null elements out of 1024). This implies that this kind of circulation retains many particles alongshore (at least during the initial and final parts of their larval period). Consequently, the fact that the network length value ($1.67 \times 10^{7}$) is much lower than the mean one ($4.90 \times 10^{7}$) tells us that there are many connections with small values. In other words, many paths share a limited amount of network length. Therefore only very few paths are highly probable in the graph corresponding to matrix #1. For this reason the betweenness values for matrix #1 are low for all the nodes (see Figure 3).

Case #11 is also peculiar. It has the lowest number of existing connections (376, Figure 4a) and the highest network length ($7.5 \times 10^{6}$, Figure 4b). The first value indicates a circulation dynamics that disperses many particles offshore, so that only very few paths exist in the corresponding graph. Furthermore, considering the network length, the paths with a probability much higher than the others are likely to be very few. In this situation only the predominant paths are left and they all have approximately the same probability. The fact that node 21 has a significantly high betweenness value in matrix #11 (see Figure 3) is therefore a strong indication of its importance in the dynamics of the connectivity network.
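The two diagnostics used above (number of non-null elements and network length) can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the network length is taken here as the plain sum of all distances $d_{ij} = \ln(1/a_{ij})$, which may differ from the definition given in Methods; infinite distances are replaced by 1000 times the largest finite distance of the matrix at hand (the study uses the maximum over all matrices); the function names and the 3-site matrix are hypothetical and are not data from the study.

```python
import math

def count_nonzero(a):
    """Number of non-null transfer probabilities in the connectivity matrix a."""
    return sum(1 for row in a for value in row if value > 0.0)

def distances(a):
    """Node-to-node distances d_ij = ln(1 / a_ij); infinite where a_ij = 0."""
    return [[math.log(1.0 / v) if v > 0.0 else math.inf for v in row] for row in a]

def network_length(a, factor=1000.0):
    """Sum of all d_ij, with infinities replaced by factor * (largest finite d_ij).

    Assumption: 'network length' is taken as the plain sum of the distances.
    The matrix must contain at least one non-null element.
    """
    d = distances(a)
    finite = [v for row in d for v in row if math.isfinite(v)]
    penalty = factor * max(finite)
    return sum(v if math.isfinite(v) else penalty for row in d for v in row)

# Hypothetical 3-site connectivity matrix (not data from the study).
a_toy = [
    [0.5, 0.1, 0.0],
    [0.2, 0.4, 0.1],
    [0.0, 0.0, 0.3],
]

print(count_nonzero(a_toy), network_length(a_toy))
```

With this construction, every missing connection adds a large constant to the network length, so a matrix with few non-null elements tends to have a high network length, which is the behaviour described for matrix #11.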
All the other cases combine an intermediate number of null connections with an intermediate value of network length, and thus cannot be interpreted as easily as cases #1 and #11 presented above.

In general, from the results in Figure 3, we see that in all these cases only node 21 has a much greater value of betweenness (roughly one order of magnitude) than the other nodes. It corresponds to the site in front of Port Camargue, roughly at the centre of the Gulf of Lion. The consistency of this result across almost all the circulation patterns that can occur in the gulf (see Figure 3) makes us confident of the importance of this node in maintaining the connectivity across the Gulf of Lion. Indeed, the highest-betweenness node is the node in which the highest flux of larvae settles and from which it restarts. Consequently, a high-betweenness node is likely to be a zone crossed by most of the gulf-scale migration flow over two successive reproductions. Such a zone therefore plays an essential role in the efficiency of the larval transfer across the network, as any recruitment failure limiting the offspring production at that site will drastically reduce the number of larvae dispersed in the subsequent generations in the entire region.

Minimum cycles computed with the new metric distance defined in the Methods section showed that the nodes with the greatest probability of seeing their particles return home after a period ranging from 2 to 5 generations are nodes 13 to 16 (Figure 5).

In addition, the nodes that appear most frequently in the total set of minimum cycles are nodes 13, 14, 15 and 16 (data not shown). They are all in the centre of the gulf. This zone thus seems to be a central core of sites that ensures the persistence of the species in the Gulf of Lion.
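To make the link between the metric, shortest paths and minimum cycles concrete, the sketch below runs Dijkstra's algorithm on the distances $d_{ij} = \ln(1/a_{ij})$, so that the shortest path is the most probable one and its probability is recovered as $e^{-d}$. It is a simplified illustration and not the procedure used in this study: local retention (diagonal terms) is ignored, no limit is imposed on the number of generations (the analysis above is restricted to cycles of 2 to 5 generations), and the function names and the 3-site matrix are hypothetical.

```python
import heapq
import math

def shortest_distances(a, source):
    """Dijkstra on d_ij = ln(1 / a_ij); edges with a_ij = 0 are simply absent."""
    n = len(a)
    dist = [math.inf] * n
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d_u, u = heapq.heappop(queue)
        if d_u > dist[u]:
            continue  # stale queue entry
        for v in range(n):
            if v != u and a[u][v] > 0.0:
                d_v = d_u + math.log(1.0 / a[u][v])
                if d_v < dist[v]:
                    dist[v] = d_v
                    heapq.heappush(queue, (d_v, v))
    return dist

def most_probable_cycle(a, node):
    """Probability of the most probable cycle leaving `node` and returning to it."""
    out = shortest_distances(a, node)
    best = math.inf
    for j in range(len(a)):
        if j != node and math.isfinite(out[j]):
            back = shortest_distances(a, j)[node]
            best = min(best, out[j] + back)
    return math.exp(-best)  # 0.0 when no cycle exists

# Hypothetical 3-site connectivity matrix (not data from the study).
a_toy = [
    [0.0, 0.5, 0.1],
    [0.0, 0.0, 0.5],
    [0.4, 0.0, 0.0],
]

# Most probable transfer from site 0 to site 2: via site 1 (0.5 * 0.5), not the direct edge (0.1).
print(math.exp(-shortest_distances(a_toy, 0)[2]))
print(most_probable_cycle(a_toy, 0))
```

Since all $a_{ij} \leq 1$, the distances are non-negative and Dijkstra's algorithm applies; a node with a high return probability in this sketch corresponds to a short minimum cycle in the sense used above.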
Cluster analysis based on the maximization of the modularity computed on transfer probabilities (see Figure 6) shows a fairly simple division of the network into two clusters: a western one (sites 1 to 18) and a central-eastern one (sites 19 to 32). This division of the network is the one found most frequently after 200 000 runs of the modularity maximization algorithm (40% of the runs, against 10% for each of the other possible divisions). The majority of the other cases were almost identical but with a different attribution of the sites on the geographical boundary between the two clusters: sites 18, 19 and, in some cases, 20. The corresponding value of modularity for this division of the network is Q = 0.16.

This result means that the transfer of larvae with a pelagic larval duration of 3 to 5 weeks is organized in two groups of nodes that exchange more larvae within each group than with the other group. These two clusters are likely to retain a substantial portion of their particles.

We finally present the results on bridging centrality. For its calculation, we use the $c_{ij}$ connectivity values to be consistent with the idea underlying bridging centrality as formulated in Methods. The three nodes with the highest values of bridging centrality are nodes 11, 12 and 16 (see Figure 7), with values of 600, 572 and 513 respectively. Following Hwang et al. (2008), these three nodes, representing the top ten percent of the 32 nodes, are the nodes likely to be crucial for the integrity of the network. That is, these nodes prevent the network from easily breaking into separate sub-networks.
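As a rough illustration of the idea behind bridging centrality, the sketch below implements the unweighted, undirected form usually attributed to Hwang et al. (2008): betweenness multiplied by a bridging coefficient, the latter being the inverse degree of a node divided by the sum of the inverse degrees of its neighbours. This is an assumption made for illustration only; the directed weighted variant used in this study is defined in Methods and is not reproduced here. The function names and the 7-node toy graph (two triangles joined through node 3) are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph {node: set(neighbours)}."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {v: -1 for v in adj}
        dist[s] = 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return {v: c / 2.0 for v, c in cb.items()}  # each unordered pair was counted twice

def bridging_coefficient(adj, v):
    """Inverse degree of v over the sum of inverse degrees of its neighbours (no isolated nodes)."""
    return (1.0 / len(adj[v])) / sum(1.0 / len(adj[u]) for u in adj[v])

def bridging_centrality(adj):
    cb = betweenness(adj)
    return {v: cb[v] * bridging_coefficient(adj, v) for v in adj}

# Hypothetical toy graph: triangles {0, 1, 2} and {4, 5, 6} joined through node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
adj_toy = {v: set() for v in range(7)}
for u, v in edges:
    adj_toy[u].add(v)
    adj_toy[v].add(u)

bc_toy = bridging_centrality(adj_toy)
print(max(bc_toy, key=bc_toy.get), bc_toy)
```

In this toy graph the node joining the two triangles obtains the highest bridging centrality, which mirrors the interpretation given above: high values flag nodes whose removal would tend to split the network into separate sub-networks.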
Discussion and conclusions

The present study uses graph theory concepts to analyse the structure of connectivity within a regional metapopulation of soft-bottom polychaetes in the Gulf of Lion. The aim is to refine the physical-biological interpretation of different graph theory tools by comparing them with the results of a metapopulation model analysis of the same metapopulation.

For the first time, graph analysis is applied to Lagrangian trajectories based on a fully 3-D circulation model. Moreover, the dispersal time of the numerical particles is set to mimic the principal biological characteristics of polychaetes. Both these aspects complicate the dispersion dynamics. To the best of our knowledge, it is also the first time that graph analysis is used on such a restricted coastal spatial domain for conservation purposes. This adds difficulty to the analysis since it results in dense connectivity matrices. Nonetheless, the analysis converged to a meaningful result even under such constraints, and many results shed light on the convergence between metapopulation models and graph theory.

The first important finding of the present study is the sensitivity of graph metrics to flow variability: different circulation patterns correspond to networks with different connection patterns and different centrality values. To our knowledge, no previous work has shown this in a quantitative way.

An important measure in graph analysis is betweenness centrality. It has been widely used in the previous literature on connectivity estimation as an indication of those nodes that are likely to play the role of crucial hubs for multi-generational connectivity (see for example Rozenfeld et al., 2008).
In contrast with graphs built on genetic distance (Rozenfeld et al., 2008) or connection time (Treml et al., 2008), particular attention is required when dealing with larval transfer probabilities between the sites forming the metapopulation network. In this case, the most immediate choice for the edge weights would be the probabilities themselves. This was the choice of Andrello et al. (2013) when studying the Mediterranean MPA network. With this choice, however, one obtains conceptually wrong results when applying the shortest path algorithm to calculate betweenness. In fact, with this metric, the shortest path is the most improbable one. This is meaningless if one wants to identify the crucial zones that maintain the connectivity within a network, since these sites would be the least frequented ones. The new node-to-node metric we propose resolves this inconsistency. It also coherently redefines how shortest paths are calculated, in accordance with the independence of the probabilities in the connectivity matrices.

High betweenness values have been used in the previous literature to identify “gateways” through which larvae have to pass during multi-generational migrations (Rozenfeld et al., 2008). In our study, site 21, offshore of Port Camargue, has the highest value of betweenness for almost all the circulation patterns of spring 2004 and 2006 in the Gulf of Lion. However, the metapopulation modelling study showed that losing this site would not endanger species persistence in the region at minimum recruitment success. Hence we draw attention to the fact that a “gateway” does not point out an essential site for the persistence of the species.
Indeed, even if a site with a high betweenness value is well preserved, its preservation ends up being useless from a conservation point of view if the other sites are lost. This is a remarkable example of how different analyses can cross-check each other in the estimation of connectivity.

We also specifically examine the spatial scale at which multi-generational flows form retention loops. A first attempt to identify the sites essential for persistence is to find the sites from which the shortest cycles start. The presence of a cycle explicitly means that the nodes composing it are a potentially self-persistent network (Hastings and Botsford, 2005). The set of sites highlighted by this technique is located in the centre of the GoL, where the two points identified as critical by the metapopulation model (sites 10 and 18) are also found. However, the sites highlighted by the shortest path analysis lie between them (sites 13 to 16). This discrepancy cannot be due to the influence of the density-dependent factors in the metapopulation model used in Guizien et al. (2014), which are not considered in Hastings and Botsford (2005), as recruitment success was set to the minimum value and the saturating capacity never reached its maximum in the metapopulation simulations. The discrepancy between shortest path analysis and metapopulation modelling could instead be due to the particular habitat loss scenarios considered in the metapopulation model analysis. Firstly, consistently with the fact that anthropic pressure decreases as the distance from the harbours increases, the scenarios included only the four harbour sites.
Secondly, under the hypothesis that anthropic pressure acts predominantly on neighbouring sites, the habitat removal procedure was carried out by progressively eliminating neighbouring sites. Shortest path analysis supports this point of view. In fact, it points out that the nodes composing the shortest cycles are all close to one another, so that the hypothesis that the survivorship of a site depends on the survivorship of the neighbouring ones is reasonable. Nevertheless, shortest path analysis does not include any assumption on geographical proximity within the path. Consequently, the results of the shortest path analysis would be fully comparable only with a more general removal procedure than the one used in Guizien et al. (2014).

By maximizing the modularity value, we identify two clusters of interconnected sites with a corresponding modularity value Q = 0.16. In general, there is no absolute threshold to discriminate between low and high values of modularity. Research on modularity has developed considerably since its introduction by Newman and Girvan (2004), and various shortcomings of this quantity are nowadays well known; see for example Fortunato and Barthelemy (2006) and Kehagias and Pitsoulis (2013). In practice, a good general strategy is to study many networks of the same type (biological, genetic, social, etc.) with similar characteristics (betweenness, number of edges, etc.) and to establish an appropriate modularity threshold. In our case this would mean analysing the network of polychaetes in regions different from the GoL, or considering a different species.
Although this was not possible in the framework of this study, considering that, by definition, $-1 < Q < 1$, we are confident in stating two things: (1) as our value of Q is positive, there is a cluster structure, and (2) as Q is lower than 0.2, that is, a fifth of the maximum possible value, we can define it as low. This means that the clusters exist but are not sharply separated. As a consequence, the division between the two clusters is not clear-cut and, within the gulf, there is a considerable migration flow of polychaetes that is not highly organized in space. This is not surprising given the arrangement of the studied sites, the topography of the Gulf of Lion and the complex circulation inside it, which all contribute to the high number of connections in the network. This is very likely a characteristic of a coastal environment where all the sites are alongshore and no considerable physical or dynamical barrier exists between them.

One possible objection to this method concerns the validity (see Methods) of a random model as null hypothesis (as pointed out, for example, by Thomas et al., 2014). In this respect we are convinced that a random null model is the best choice to bring out the effect of a fundamental forcing of the dynamics of the biological system that, although chaotic, is deterministic: the current field. Indeed, community structure in marine biological networks is likely to arise as a result of the action of currents.

The result of the modularity analysis is indirectly in agreement with the results of the metapopulation model. In fact, the analysis by Guizien et al. (2014) showed the presence of a rescue mechanism of the sites in the western part of the gulf by the eastern sites. This result indicates an organization of the metapopulation in two large sub-populations, in the same way as graph theory clustering does.
The presence of a rescue mechanism is mirrored by the low value of modularity, that is, by the non-negligible amount of communication between the different sites.

In addition, we can refine the rescue mechanism using the result of the bridging centrality. This measure is specifically built to identify those nodes that drive the communication between clusters (i.e., communities). We calculated the bridging centrality of each node in order to identify the sites essential to maintain connectivity between the two communities. In particular, we found that nodes 11, 12 and 16 are the top 10% of nodes with the highest bridging centrality values. Bridging centrality targets those nodes that are crucial for the integrity of the network as a whole. Since the diagonal terms of the connectivity matrices do not influence the calculation of bridging centrality (see Methods), this measure characterizes the net effect of the regional-scale transport of the larvae, disregarding the values of local retention at each site. Thus it identifies the potential of a node to sustain the spreading of the species through the entire network. Following Hwang et al. (2008), the importance of each node in the network can be assessed through the number of isolated components created by removing the other nodes sequentially. No isolated component was created by removing a single node. Isolated sub-networks appeared only after removing a large number of nodes. This result reflects the high average edge density of the 20 variant matrices ($\rho = 0.604$). Nonetheless, many weak connections between the 32 sites are present (data not shown), and the solidity of the biological network must be assessed independently of these numerous weak connections, which are not efficient.
By setting a threshold to get rid of the many low-weight edges, the predominant connections were highlighted. The threshold value was set at 0.001 on the basis of the arguments presented in Appendix C. The deletion of nodes 11 and 12 led to nine isolated components (sets of nodes disjoint from the other portions of the graph), mainly consisting of isolated single nodes or pairs of nodes. In contrast, the single removal of each of the other nodes created, on average, only seven isolated components (data not shown). We can thus conclude that nodes 11 and 12 are the most important nodes for the solidity of regional connectivity. Site 16, on the contrary, was no longer highlighted by the analysis. This is reasonable, as the bridging centrality value of this node was lower than those of nodes 11 and 12. In any case, the choice of considering the top ten percent of the nodes as important is just a guideline. Remarkably, the removal of node 21 had no particularly important effect on the fragmentation of the network. It is thus clear that bridging centrality provides information on the structure of the network beyond what betweenness alone can give. In fact, while high-betweenness nodes are the ones that ensure a high flux through the network, high bridging centrality identifies the nodes that prevent a critical fragmentation of the network, thus playing an important role in the spreading of the species in the region.

This study revisited the previous state-of-the-art interpretation of some graph theory metrics from a biological conservation point of view. Table 1 summarizes the principal measures used in this study and their physical-biological interpretation based on the comparison of graph theory analysis with a metapopulation model.
This comparison highlights a potential misuse of betweenness values, which do not identify sites important for species persistence. Conversely, the comparison pinpoints the potential of shortest paths to identify multi-generational loops of persistence, and of bridging centrality to identify nodes important for the regional colonisation of the species. A further important message is that connectivity metrics based on larval transfer probabilities must be adapted to ensure a physically meaningful interpretation of betweenness and shortest paths.

Acknowledgements

This study was partially funded by the European CoCoNet Project and by the Ministère de l’Éducation Nationale, de l’Enseignement Supérieur et de la Recherche.

Table 1: Recapitulation of the four main measures used in this study. For each one we indicate its scope and physical-biological interpretation.

Measure | Scope | Interpretation
Minimum cycles | Identify sites whose larvae have a high probability of returning home. | Source sites.
Betweenness | Identify nodes through which the highest percentage of the most probable paths pass. | Nodes increasing larval transport at the base of persistence. Likely to result in a rescue mechanism.
Modularity | Find sets of nodes more than randomly connected. | Communities.
Bridging centrality | Find nodes leading the communication between clusters. | Nodes maintaining the connectivity at the whole network scale.

Appendices

A Metapopulation model

The model used by Guizien et al. (2014) can be written in matrix form as follows:

$$P(t + \Delta t) = \min\left(G(t)\,P(t),\; P_{max}\right)$$

with a time step of $\Delta t$ and the growth transfer matrix $G$ defined as

$$G_{ij} = l_i\, a_{ij}(t)\, b_j + s_{jj}\, \delta(i, j)$$

where: $P_{max} = 1/\alpha_A$ is the site carrying capacity, with $\alpha_A$ the mean cross-sectional area of one adult; $P(t) \in \mathbb{R}^{N}$ contains the spatial density of adults in each site $i \in [1, 32]$ at time $t$; $l_i$ [number of larvae per adult] is the propagule production rate at site $i$; $b_j$ [number of adults per larva] is the recruitment success at site $j$; $s_{jj}$ [no units] is the adult survivorship rate at site $j$; $\delta(i, j)$ is the Kronecker $\delta$-function; and $a_{ij}$ is the propagule transfer rate from site $i$ to site $j$. The larval production rate $l_i$ is equal to the number of larvae produced by each adult female, $F \cdot SR \cdot f$, where $F$ is the fecundity rate, $SR$ is the sex ratio in the adult population, and $f$ is the probability of an egg being fertilized. The recruitment success $b_j$ accounts for all mortality losses from egg release until the first reproduction of new recruits, and includes mortality during the larval dispersal, settlement and juvenile stages. Notice that the adult survivorship rate can be related to the species life expectancy $L_E$ as

$$s_{jj} = e^{\ln(0.01)\, \Delta t / L_E}$$

where the life expectancy $L_E$ is the age at which 99% of the individuals of the same generation have died.

B Node-to-node metric properties

We verify that the new metric we propose for measuring the distance between nodes, $d_{ij} = \ln\left(\frac{1}{a_{ij}}\right)$, $a_{ij}$ being the probability of advection from site $i$ to site $j$ in a given time, is meaningful and respects the linearity property we expect for a distance. In fact it satisfies both homogeneity and additivity:

1. Homogeneity:

$$\alpha\, d_{ij} = \alpha \ln\left(\frac{1}{a_{ij}}\right) = \ln\left(\frac{1}{a_{ij}^{\alpha}}\right)$$

that is, a distance multiplied by a scalar is still a distance.

2. Additivity:

$$d_{ik} + d_{kj} = \ln\left(\frac{1}{a_{ik}}\right) + \ln\left(\frac{1}{a_{kj}}\right) = \ln\left(\frac{1}{a_{ik} \cdot a_{kj}}\right) = \ln\left(\frac{1}{a_{ij}}\right) = d_{ij}$$

that is, the sum of two distances is still a distance and, in particular, the final distance is obtained through the multiplication of the partial original metric values. This aspect is particularly well suited to the problem addressed in the paper. In fact, the length of a path is calculated from the product of the probabilities associated with its components.

Note that, as the elements $a_{ij}$ have no units, the elements $d_{ij}$ have no units either.

C Guess estimation of a threshold for good connectivity

A threshold for establishing which edges play a major role in maintaining a good persistence in a biological network can be established via an educated guess on the basis of physical reasoning. In particular, given a probability-based connectivity matrix, we can expect that, in an efficient network, a transfer rate $C_{ij}$ between 1 and the non-null minimum value $C_m$ of the matrix is necessary for the maintenance of an overall good connectivity. Following a common methodology for estimating the order of magnitude of unknown quantities, we can estimate this transfer rate, $M$, as the geometric mean of 1 and $C_m$:

$$M = \sqrt{C_m \cdot 1}$$

This kind of estimation has the advantage of considering a vast range of values for the variables at play in determining the unknown quantity, while not being biased by the choice of too large or too small extremal values. Therefore, after some steps of this kind, we are statistically confident in the quality of the estimation.

Dealing with living organisms, we also have to improve this estimation by accounting for the survivorship of the propagules. In an efficient network, we can expect that a percentage $S$ of propagules between $M$ and at least $M/e$ is likely needed for the maintenance of the persistence of the species.
We then estimate

$$S = \sqrt{M \cdot \frac{M}{e}}$$

As a last step, we also account for the percentage of the surviving particles that successfully reproduce. A percentage $T$ between $S$ and $S/e$ is likely needed for a good persistence of the species in the habitat studied with graph theory. We estimate

$$T = \sqrt{S \cdot \frac{S}{e}}$$

In our case, using the mean value of $M$ resulting from the 20 variant matrices, we have $T = 0.0041$, which we round to the value 0.001 used in our analysis.

Please note that we also tested this estimation qualitatively. We found a sharp difference in the number of connections when changing the threshold by an order of magnitude. In fact, for a threshold equal to 0.01, we obtained an almost completely disconnected network, while, when retaining all values above 0.0001, hardly anything changed. Thus a threshold equal to 0.001 seems to be the threshold we were looking for in order to obtain what we call a connected minimal network.

List of captions

FIGURE 1. Schematic representation of the typical circulation in the Gulf of Lion. The thick arrow represents the dominant alongshore Northern Current. The thinner arrow represents the eastward currents that can be detected in stratified conditions or under particular wind field conditions. The positions of the 32 studied sites are plotted. Sites 3, 10, 18 and 32, used for the habitat loss scenario in the metapopulation model, are highlighted by bigger grey dots. Node 21 is the smallest of the grey dots. The grey lines correspond to the 100 m, 200 m, 1000 m and 2000 m isobaths.

FIGURE 2. Spatial representation of the connectivity matrices and betweenness values in different circulation situations. (a) Westward drift (matrix #7), (b) eastward drift (matrix #1), (c) central retention (matrix #10).
In the panel (d) the mean connectivity matrix values and the mean 750 of the betweenness values are represented. In all the figures a threshold on 751 the value of connectivity was imposed for clarity: connectivity values lower 752 than the 2/3 of the maximum one are not plotted. Values of betweenness 753 are normalized to the maximum value that was found in each case. 754 FIGURE 3. Values of betweenness for the 32 sites using the 20 variant 40 REFERENCES Graph theory for species persistence 755 connectivity matrices. The normalization is done on the maximum value of 756 betweenness obtained using the different variant matrices. 757 FIGURE 4. (a) Number of not null elements in the 20 variant matrices. 758 Different circulation patterns have different effects on the connectivity 759 inside the Gulf of Lion. (b) Network length of the 20 variant matrices. 760 FIGURE 5. Sum of the products of the weights of all the cycles of 761 length from 2 to 5 steps that start from each of the 32 sites. The minima 762 correspond to the nodes for which the probability of particles returning 763 home (in a 2 to 5 generations time span) is higher. 764 FIGURE 6. Clusters identified with a criteria of maximization of 765 modularity. The result is the average assignation of a node to one of the 766 two clusters after 200 000 code runs with the 20 variant connectivity matrices. 767 To different color it corresponds a different cluster. 768 FIGURE 7. Bridging centrality values for the 32 sites. Geographical 769 representation. Nodes 11 and 12 have the highest values: 600 and 572 respectively. 770 References 771 Andrello, M., Mouillot, D., Beuvier, J., Albouy, C., Thuiller, W., and 772 Manel, S. (2013). Low connectivity between mediterranean marine 773 protected areas: a biophhysical modeling approach for the dusky 774 grouper: Epinephelus Marginatus. PLOS One, 8. 775 Caswel, H. (2001). Matrix population models (second edition). 
Sinauer 41 REFERENCES 776 777 778 779 Graph theory for species persistence Associates, Sunderland, MA, USA. Dijkstra, E. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1:269–271. Duraiappah, A. and Shahid, N. (2005). Millennium ecosystem 780 assessment.ecosystems and humanwell-being: Biodiversity synthesis. 781 World Resources Institute,Washington, DC. USA. 782 Estournel, C., de Madron, X. D., Marsaleix, P., Auclair, F., Julliand, C., 783 and Vehil, R. (2003). Observation and modeling of the winter coastal 784 oceanic circulation in the gulf of lion under wind conditions influenced 785 by the continental orography (fetch experiment). Journal of Geophysical 786 Research, 108. 787 788 Fortunato, S. and Barthelemy, M. (2006). Resolution limit in community detection. PNAS, 104:36–41. 789 Guizien, K., Belharet, M., Moritz, C., and Guarini, J. (2014). Vulnerability 790 of marine benthic metapopulations: implications of spatially structured 791 connectivity for conservation practice. Diversity and Distributions, pages 792 1–11. 793 Guizien, K., Belharet, P., Marsaleix, P., and Guarini, J. (2012). Using 794 larval dispersal simulations for marine protected area design: application 795 to the gulf of liones (nw mediterranean). Limnology and Oceanography, 796 57. 42 REFERENCES 797 798 799 Graph theory for species persistence Hastings, A. and Botsford, L. (2005). Persistence of spatial populations depends on returning home. PNAS, 103:6067–6072. Hwang, W., Taehyong, K., Murali, R., and Aidong, Z. (2008). Bridging 800 centrality: Graph mining from element level to group level. Proceedings 801 KDD 2008, pages 366–344. 802 803 804 805 Jacobi, M., Andr, C., Doos, K., and Jonsson, P. (2012). Identification of subpopulations from connectivity matrices. Ecography, 35:31–44. Kehagias, A. and Pitsoulis, L. (2013). Bad communities with high modularity. The European Physical Journal B, 86. 806 Kool, J., Moilansen, A., and Treml, E. (2013). 
Population connectivity: recent advances and new perspectives. Landscape Ecology, 28:165–185.

Marsaleix, P., Auclair, P., and Estournel, C. (2006). Considerations on open boundary conditions for regional and coastal ocean models. Journal of Atmospheric and Oceanic Technology, 23:1604–1613.

Millot, C. (1990). The Gulf of Lions' hydrodynamics. Continental Shelf Research, 10:885–894.

Newman, M. (2006). Modularity and community structure in networks. PNAS, 103:8577–8582.

Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69.

Nicosia, V., Mangioni, G., Carchiolo, V., and Malgeri, M. (2009). Extending the definition of modularity to directed graphs with overlapping communities. Journal of Statistical Mechanics, 3.

Petrenko, A. (2003). Variability of circulation features in the Gulf of Lion NW Mediterranean Sea. Importance of inertial currents. Oceanologica Acta, 26:323–338.

Petrenko, A., Dufau, C., and Estournel, C. (2008). Barotropic eastward currents in the western Gulf of Lion, north-western Mediterranean Sea, during stratified conditions. Journal of Marine Systems, 74:406–428.

Petrenko, A., Leredde, Y., and Marsaleix, P. (2005). Circulation in a stratified and wind-forced Gulf of Lions, NW Mediterranean Sea: in situ and modeling data. Continental Shelf Research, 25:7–27.

Rossi, V., Ser-Giacomi, E., Lopez, C., and Hernandez-Garcia, E. (2014). Hydrodynamic provinces and oceanic connectivity from a transport network help designing marine reserves. Geophysical Research Letters, 41.

Rozenfeld, A., Arnaud-Haond, S., Hernandez-Garcia, E., Eguiluz, V., Serrao, E., and Duarte, C. (2008). Network analysis identifies weak and strong links in a metapopulation system. PNAS, 105:18824–18829.

Schick, R. and Lindley, S. (2007).
Directed connectivity among fish populations in a riverine network. Journal of Applied Ecology, 44:1116–1126.

Thomas, C., Lambrechts, J., Wolanski, E., Traag, V., Blondel, V., Deleersnijder, E., and Hanert, E. (2014). Numerical modelling and graph theory tools to study ecological connectivity in the Great Barrier Reef. Ecological Modelling, 272:160–174.

Treml, E., Halpin, P., Urban, D., and Pratson, L. (2008). Modeling population connectivity by ocean currents, a graph theoretic approach for marine conservation. Landscape Ecology, 23:19–36.

Urban, D. and Keitt, T. (2001). Landscape connectivity: a graph theoretic perspective. Ecology, 82:1205–1218.

Figures

Figure 1

Figure 2. Panels (a) to (d): maps of the 32 sites (longitude 3°E to 5°E) with color scales for Connectivity and Betweenness.

Figure 3. Axes: Sites (horizontal), Variant Matrices (vertical).

Figure 4. (a) Number of Not Null Elements versus Matrices. (b) Network Length versus Matrices.

Figure 5. Cycles Length versus Sites.

Figure 6. Map of the 32 sites; axes: Longitude, Latitude.

Figure 7. Map of the 32 sites with a color scale for Bridging Centrality (0 to 600).

Tables

Matrix  Period                   Circulation Pattern
#1      from Jan 5 to Jan 14     Eastward
#2      from Jan 15 to Jan 24    Mixed
#3      from Jan 25 to Feb 3     Mixed
#4      from Feb 4 to Feb 13     Central Retention
#5      from Feb 14 to Feb 23    Mixed
#6      from Feb 24 to Mar 4     Mixed
#7      from Mar 5 to Mar 15     Westward
#8      from Mar 16 to Mar 26    Mixed
#9      from Apr 27 to May 5     Mixed
#10     from May 6 to May 16     Central Retention
#11     from Jan 5 to Jan 14     Westward
#12     from Jan 15 to Jan 24    Westward
#13     from Jan 25 to Feb 3     Central Retention
#14     from Feb 4 to Feb 13     Mixed
#15     from Feb 14 to Feb 23    Westward
#16     from Feb 24 to Mar 4     Mixed
#17     from Mar 5 to Mar 15     Westward
#18     from Mar 16 to Mar 26    Mixed
#19     from Apr 27 to May 5     Mixed
#20     from May 6 to May 16     Mixed

Table 2: Time period covered by each connectivity matrix; matrices 1 to 10 refer to year 2004, matrices 11 to 20 to year 2006. The circulation regime present in the Gulf of Lion in each period is also indicated. A 'Mixed' regime indicates the simultaneous presence of more than one of the simple regimes: westward, eastward or central retention (see the Results for further details).