Predicting fish diet composition using a bagged classification tree approach: a case study using yellowfin tuna (Thunnus albacares)

Petra M. Kuhnert
CSIRO Mathematics, Informatics and Statistics, Private Bag 2, Glen Osmond SA 5064, Australia
E-mail: [email protected]; Phone: +61 8 8303 8775; Fax: +61 8 8303 8763

Leanne M. Duffy
Inter-American Tropical Tuna Commission, 8604 La Jolla Shores Drive, La Jolla CA 92037-1508 USA

Jock W. Young
CSIRO Marine and Atmospheric Research and Wealth from Oceans Flagship, GPO Box 1538, Hobart TAS 7001 Australia

Robert J. Olson
Inter-American Tropical Tuna Commission, 8604 La Jolla Shores Drive, La Jolla CA 92037-1508 USA

Abstract:
We provided a classification tree modeling framework for investigating complex feeding relationships, and illustrated the method using stomach contents data for yellowfin tuna (Thunnus albacares) collected by longline fishing gear deployed off eastern Australia between 1992 and 2006. The non-parametric method is both exploratory and predictive, can be applied to varying size datasets and therefore is not restricted to a minimum sample size. The method uses a bootstrap approach to provide standard errors of predicted prey proportions, variable importance measures to highlight important variables, and partial dependence plots to explore the relationships between explanatory variables and predicted prey composition. Our results supported previous studies of yellowfin tuna feeding ecology in the region. However, the method provided a number of novel insights. For example, significant differences were noted in the prey of yellowfin tuna sampled north of 30°S in summer where oligotrophic waters dominate. The analysis also identified that sea-surface temperature, latitude and yellowfin size were the most important variables associated with dietary differences. The methodology is appropriate for delineating ecosystem-level trophic dynamics, as it can easily incorporate large datasets comprising multiple predators to explore trophic interactions among members of a community. Broad-scale relationships among explanatory variables (environmental, biological, temporal and spatial) and prey composition elucidated by this method then serve to focus and lend validity to subsequent fine-scale analyses of important parameters using standard diet methods and chemical tracers such as stable isotopes.

Keywords: Yellowfin tuna diet; Classification and regression trees (CART); Bootstrapping; Feeding habits; Predator-prey relationships; Spatial; Trophic ecology

Introduction
Characterising the feeding relationships of top predators in complex marine ecosystems is a key component of ecosystem modeling (Olson and Watters 2003; Griffiths et al. 2010). Typically, fish diets have been characterised using stomach contents data. While stable-isotope and fatty-acid analyses continue to provide valuable information about food webs, diet data from stomach contents are essential for defining the taxonomic components and links in ecosystem models. Yet, there is no unified approach for analysing stomach contents (hereafter “diet”) data used to inform these models. Previous approaches have been exploratory, consisting of several types of multivariate analyses accompanied by non-parametric statistical tests to examine a-priori hypotheses about the data. The relationships between the type of prey eaten by a predator and their respective environmental niches are also difficult to synthesize. Given the growing focus on understanding predator-prey relationships under increasing environmental and fishing pressures (Pikitch et al. 2004; Marasco et al. 2007), it is imperative to develop methodology that is both exploratory and predictive, and able to inform ecosystem models for exploratory analyses and eventual ecosystem-based management.

A large body of literature focusing on diet studies based on stomach contents has depended on multivariate analyses (Mardia et al. 1979). Methods including principal component analysis, non-metric multidimensional scaling and cluster analysis are used to explore an n × p matrix of prey weights, where n represents the number of predators and p represents the number of prey observed in the stomach contents of predators at the time of sampling (Chipps and Garvey 2007). Not all prey are represented in the stomach of each predator, hence the matrix usually consists of many zeros. Examples of recent diet analyses that adopt one or more of these multivariate approaches include the work of Young et al. (2010), who analysed 10 species of pelagic fishes, Griffiths et al. (2007), who analysed the feeding dynamics of longtail tuna (Thunnus tonggol), Potier et al. (2007), who investigated prey composition of three large pelagic fish predators and Young et al. (2006), who analysed the feeding ecology of broadbill swordfish (Xiphias gladius) from eastern Australian waters. In many of these analyses, the multivariate methods were accompanied by non-parametric statistical tests, including the Wilcoxon signed rank test and the Kruskal-Wallis rank sum tests to examine specific differences in diet composition as expressed through the results of the multivariate methods.

Alternative analyses have typically accompanied the multivariate investigations to provide a thorough evaluation of feeding. These include the calculation of daily consumption rates of the predator and how this varies with fish size. Studies by Olson and Galván-Magaña (2002), Young et al. (2010), Griffiths et al. (2009) and Griffiths et al. (2007) provide examples.
Regression-based approaches have included quantile regression (Koenker and Bassett 1978; Scharf et al. 1998), to examine prey-predator length relationships and provide a broad overview and synthesis of what different sized predators eat (Ménard et al. 2006; Young et al. 2010; Logan et al. 2011); Generalised Linear Models (GLMs) for analysing the presence or absence of a particular prey in the stomach of a predator (McCullagh and Nelder 1983); or Generalised Linear Mixed Models (GLMMs) (McCulloch and Searle 2001). The latter can be used to address pseudo-replication, which is a concern when the stomachs of several predators are collected at the same sampling event (e.g. in the purse-seine fishery, fish that are collected from the same purse-seine set, i.e. sampling event, are not considered independent) (M. Hunsicker, personal communication). Classification and Regression Trees (CART) (Breiman et al. 1984) have been used as an exploratory tool to predict individual prey weights (Olson and Galván-Magaña 2002) and the occurrence of fish in the diet (Baker and Sheaves 2005). They have not been used as a tool to predict diet composition or to identify the relationships between explanatory variables and the distribution of prey groupings.

This paper extends the classification tree approach proposed by Breiman et al. (1984) to provide a method for exploring and predicting diet composition. The method provides variable importance measures for identifying important predictor variables in the model. It produces bagged predictions with standard errors using bootstrap techniques similar to Breiman (1996) and Kuhnert et al. (2010), but with a spatial component to account for sampling design. It also includes partial dependence plots to examine the relationships between the explanatory variables and the response in the model (Breiman 2001). Finally, it provides a method for visualising the results by mapping the predictions back to the pruned tree for interpretation and synthesis (Kuhnert and Mengersen 2003). We demonstrate this approach using a case study from Australia (Young et al. 2010), where the diet composition of yellowfin tuna (Thunnus albacares) was investigated to determine if relationships existed with environmental, biological, temporal and spatial covariates.

Materials and methods
Study sites and sample collection
The yellowfin tuna is one of four tuna and billfish species fished commercially off eastern Australia (Young et al. 2011). The fishery for yellowfin tuna is situated largely within the East Australian Current. As this current is predicted to increase spatially because of ocean warming, understanding its impact on the distribution of yellowfin tuna in the region is an important research goal (Hartog et al. 2011). Part of this understanding will require detailed dietary information. As trophic relations are at the base of most marine ecosystem models (Christensen and Walters 2004; Fulton et al. 2007), diet studies that have the capacity to predict prey concentrations under different environmental conditions will be particularly valuable to evaluate indirect effects of future management scenarios on exploited ecosystems. Although previous studies have documented the diet of yellowfin tuna in these waters (Young et al. 2001; Young et al. 2010), none have attempted such a task.

Yellowfin tuna were caught by longline fishing gear deployed off eastern Australia between 1992 and 2006 using the survey approach outlined in Young et al. (2010) and the stomachs were removed for diet analysis. Of the 818 stomachs sampled, only the 528 that contained prey remains were used in the analysis. Any prey contributing less than 1% to the total wet weight across all stomach samples was also excluded from the analysis. This resulted in 19 prey taxa for analysis (Table 1). For each prey group in each stomach, a corresponding prey proportion, based on the wet weight of the stomach remains, was calculated.

Explanatory variables consisted of categorical versions of longitude (LN.reg) and latitude (LT.reg) that broke the survey region into south (south of 30°S), central (latitude between 20°S and 30°S) and northern locations (north of 20°S) and inshore (longitude <155°E) and offshore regions (longitude between 155°E and 160°E). These regions were based on where samples were taken in relation to the main features of the East Australian Current (see Young et al. (2006) for details). A seasonal categorical variable identified samples taken in summer (September to March) or winter (April to August). Sea-surface temperature (SST) was recorded at time of capture from an underway thermosalinograph (Young et al. 2006). Missing values were later filled in from SST satellite imagery using the spatial dynamics ocean data explorer (SDODE) interface (Hobday et al. 2006) after pairwise comparisons showed a significant relationship between satellite and vessel-derived data (R² = 0.87). These data were represented as a continuous variable ranging between 14.6 and 29.1°C (mean 21.67°C). Mixed layer depth (MLD), defined as the depth from the surface of water with equal density, was assigned to each sample also using the SDODE interface. Values ranged between 12.56 and 161.7 m (mean 37.69 m). The fork length (Length) of yellowfin tuna was represented by a continuous variable and ranged between 73 and 214 cm (mean 130 cm). Moon phase was recorded as a continuous variable between 0 and 1, where 0 indicates a new moon, 1 indicates a full moon and values in between represent the intermediate phases of the moon. There were missing values of fork length (6.7% missing), sea-surface temperature (2.2% missing) and mixed layer depth (2.2% missing).

Classification tree for diet data
Classification trees
Classification and Regression Trees (CART) is a popular non-parametric modeling approach that is extensively described in Breiman et al. (1984), available for application using the RPART package (Therneau et al. 2009) in the R programming language (R Development Core Team 2005) and more recently, applied with references to ecology, by Zuur et al. (2007). We briefly describe the methodology here with extensions to diet data. Through a greedy algorithm, data are partitioned by successive splitting on explanatory variables that seek to minimize an error criterion. For classification problems, the Gini index, i(t) (Breiman et al. 1984), represents a measure of diversity amongst categories being predicted and is calculated as a function, φ, of class probabilities predicted at node, t. On production of a large tree, v-fold cross-validation (typically 10-fold) is implemented to prune the tree to the lowest cross-validated error rate. Cross-validation is a technique used to test the predictive performance of a model and how well it will generalize to a new set of data. Cross-validation is implemented by dividing the data into v portions, holding back one portion as a test set, constructing the model on the remaining set of observations (regarded as a training set) and then predicting using the test set data. This is repeated for each of the v portions to provide a cross-validated prediction error. As Breiman et al. (1984) highlight, the selection of the final tree can be subjective and may result in a series of trees within one standard error of the tree yielding the minimum, known as the 1SE tree. Breiman et al.
(1984) advocate using the 1SE rule for selecting what is regarded as “the right size tree” and reporting the cross-validated error rate and its accompanying standard error as a measure of the uncertainty. Although different types of trees (e.g. unpruned, stumps, selecting trees based on the number of splits or varying error rates) can be explored, using the 1SE rule, as suggested by Breiman, makes the model selection less subjective. We therefore followed Breiman’s suggestion to use cross-validation and the 1SE rule to identify an optimal tree. Predictions are formed by partitioning a new observation down the tree until it resides in a terminal node with labels in accordance to the naming convention set out by Breiman et al. (1984). For classification tree problems, terminal node predictions are represented as the most dominant prey group that is partitioned to that node. Missing values are easily accommodated in CART models through surrogates that represent splits closely related to the primary split, which can be used to partition data in the event of missing values.

Although classification trees have typically been developed on datasets whereby the response has consisted of a categorical variable, such as the presence or absence of a species or the classification of two or more categories, classification trees can also be developed for an $n \times p$ matrix of diet data, where $\pi_{ij}$ represents the proportion of prey, $j$, consumed by predator, $i$, based on the recorded wet weights and subject to the constraint that the sum of the proportions equals 1. This type of model is equivalent to a multinomial model (McCullagh and Nelder 1983), where $\pi_{ij} = \Pr\{Y_i = j\}$ with a probability distribution represented as

$$\Pr\{Y_{i1} = y_{i1}, \ldots, Y_{ip} = y_{ip}\} = \binom{n_i}{y_{i1}, \ldots, y_{ip}} \pi_{i1}^{y_{i1}} \cdots \pi_{ip}^{y_{ip}} \qquad (1)$$

where $\sum_{j=1}^{p} \pi_{ij} = 1$ and $\sum_{j} y_{ij} = n_i$ represents the number of predators consuming prey, $i$.
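As a worked illustration of Equation 1, the short Python sketch below evaluates the multinomial probability for a single predator. It is only a sketch under stated assumptions: the function name `multinomial_probability` and the example counts are hypothetical, and the counts are treated as integers that sum to $n_i$, whereas the diet data in this paper are proportions of wet weight.

```python
from math import factorial, prod


def multinomial_probability(counts, proportions):
    """Evaluate Eq. 1 for one predator: counts y_i1..y_ip, proportions pi_i1..pi_ip."""
    if len(counts) != len(proportions):
        raise ValueError("counts and proportions must have the same length")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    # Multinomial coefficient n_i! / (y_i1! ... y_ip!), kept as an exact integer.
    coefficient = factorial(sum(counts))
    for y in counts:
        coefficient //= factorial(y)
    # Product of pi_ij ** y_ij over the prey groups.
    return coefficient * prod(p ** y for p, y in zip(proportions, counts))


# Hypothetical predator: three prey groups with proportions 0.2, 0.1 and 0.7
# and illustrative counts of 2, 1 and 7 (assumed values, not data from the study).
example_probability = multinomial_probability([2, 1, 7], [0.2, 0.1, 0.7])
print(round(example_probability, 4))
```

The check on the proportions mirrors the constraint stated with Equation 1; the integer division keeps the multinomial coefficient exact before it is multiplied by the floating-point product.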
We explored relationships between the prey composition and matrix of explanatory variables, $X_i$, in the usual way through a logit link that linearly relates the response to the explanatory variables through the following expression

$$\eta_{ij} = \log\left(\frac{\pi_{ij}}{\pi_{ip}}\right) = \alpha_j + \beta_j X_i \qquad (2)$$

where $\alpha_j$ represents the mean proportion weight (log-scale) for each prey classification and $\beta_j$ represents a vector of regression coefficients for $j = 1, 2, \ldots, J-1$, describing the contribution of each explanatory variable in the model. Put more simply, we modelled the prey proportions such that the probability of residing in the prey class represents a function of the explanatory variables. Models of this form may be investigated for stomach contents data but they rely on a-priori knowledge of how each explanatory variable should be represented in the model and what, if any, interactions might be explored. Furthermore, the investigation of interactions may be difficult, especially for prey groups that appear in low proportions. The number of prey groups that can be examined using a multinomial regression approach may also be limited due to the sheer lack of data and missing values that either need to be removed or imputed.

In the classification tree framework, we can fit a similar model to predict the prey distribution of a predator, given important explanatory variables. If we consider a transformation of the data and let each row, $l$, of the transformed dataset represent a unique predator-prey combination, where the proportion of a prey consumed by a predator is calculated as a case weight, $W_l$, for the classification tree model, then we can retransform the $n \times p$ matrix of prey proportions into an ($L = n \times k$) vector of prey classes, $Y_l$ $(l = 1, \ldots, L)$, with case weights, $W_l$, representing the proportion of prey, $Y_l$, eaten by a predator. In this parameterisation, $k < p$ and represents the prey classes with weights observed. To illustrate, consider prey eaten by a yellowfin tuna captured along the east coast of Australia. For a particular yellowfin, we find that it consumes squid of the family Ommastrephidae, crustacea of the order Decapoda and fishes from the family Carangidae in proportions of 0.2, 0.1 and 0.7 respectively. In the transformed dataset, this predator would appear three times with prey labels of $Y_1$ = Ommastrephidae, $Y_2$ = Decapoda and $Y_3$ = Carangidae and corresponding case weights of $W_1 = 0.2$, $W_2 = 0.1$ and $W_3 = 0.7$ respectively.

We calculated two types of predictions for each unique yellowfin tuna, $i$, from this type of model. The first is the predicted prey composition, $\hat{\pi}_{ij}$, which represents the predicted proportion of prey, $j$, for each unique predator, $i$. The second is the predicted prey group, $\hat{Y}_i$ $(i = 1, \ldots, n)$, where $\hat{Y}_i = \max\{\hat{\pi}_{i1}, \ldots, \hat{\pi}_{ip}\}$ and therefore represents the prey group that yields the maximum predicted prey proportion.

Variable importance
We identified important variables that contribute to the model using a variable importance ranking, similar to the approach described by Breiman et al. (1984). The ranking, $M(x_m)$, for a variable, $x_m$, is developed from surrogate splits, $\tilde{s}_m$, appearing at node $t$ of the fitted classification tree model. Recall that surrogate splits represent alternative splits with high correlation to the primary split and we used these to partition the data if missing data are present. The reason for using surrogate variables in the calculation of variable importance is to provide recognition to variables that are important predictors but are either masked or hidden by other variables in the model. That is, if two variables are closely associated with one another and therefore provide a similar partitioning, although only one will be selected by the model, the contributions of both are recognized in the variable importance ranking. Using the notation defined in Breiman et al. (1984), the variable importance is defined for a variable, $x_m$, as the change in the node impurity, $i$, as $t$ is partitioned by a surrogate split into left and right daughter nodes, which are represented by $t_L$ and $t_R$, respectively. In other words, the importance of a variable in a tree based model represents the contribution that each variable makes as a surrogate to the primary split. We regard node impurity as the error in predicting the prey group classified at node, $t$. The difference between the impurity calculated at a parent node and the impurities calculated at the left and right daughter nodes therefore provides an estimate of the error based on the split. The variable importance calculation is mathematically expressed as

$$M(x_m) = \sum_{t \in T} \Delta i(\tilde{s}_m, t), \quad \text{where} \quad \Delta i(\tilde{s}_m, t) = i(t) - i(t_L) - i(t_R) \qquad (3)$$

and $i(t) = \phi\big(p(1|t), \ldots, p(J|t)\big)$, with $J$ the number of classes. For each split of the tree and each variable, the node impurity is calculated at each node and summed across all nodes of the tree. Thus, the final ranking represents a ranking relative to the variable that yielded the largest variable importance.
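The transformation to case weights and the impurity change in Equation 3 can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the paper: the helper names are hypothetical, the Gini index is written in its common form $1 - \sum_j p(j|t)^2$, and each daughter node's impurity is scaled by its share of the parent's case weight (the usual CART convention), whereas Equation 3 leaves that scaling implicit. The second predator in the example is invented for illustration.

```python
from collections import defaultdict


def to_case_weights(diet_matrix, prey_names):
    """Turn an n x p matrix of prey proportions into (predator, prey, weight) rows."""
    rows = []
    for predator, proportions in enumerate(diet_matrix):
        for prey, weight in zip(prey_names, proportions):
            if weight > 0:  # keep only prey classes with weights observed
                rows.append((predator, prey, weight))
    return rows


def weighted_gini(rows):
    """Gini index 1 - sum_j p(j|t)^2, with class probabilities from case weights."""
    totals = defaultdict(float)
    for _, prey, weight in rows:
        totals[prey] += weight
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


def impurity_decrease(rows, goes_left):
    """Change in impurity when a node is split; goes_left(predator) returns a bool."""
    left = [r for r in rows if goes_left(r[0])]
    right = [r for r in rows if not goes_left(r[0])]
    w_all = sum(r[2] for r in rows)
    w_left = sum(r[2] for r in left)
    w_right = sum(r[2] for r in right)
    # Assumption: daughters are weighted by their share of the parent case weight.
    return (weighted_gini(rows)
            - (w_left / w_all) * weighted_gini(left)
            - (w_right / w_all) * weighted_gini(right))


prey_names = ["Ommastrephidae", "Decapoda", "Carangidae"]
# Predator 0 is the yellowfin from the text; predator 1 is a made-up decapod feeder.
diet_matrix = [[0.2, 0.1, 0.7], [0.0, 1.0, 0.0]]
rows = to_case_weights(diet_matrix, prey_names)
root_gini = weighted_gini(rows)
decrease = impurity_decrease(rows, lambda predator: predator == 0)
print(len(rows), round(root_gini, 3), round(decrease, 3))
```

Summing such decreases over the nodes at which a variable acts as a primary or surrogate split, and then dividing by the largest total, would give a relative ranking of the kind described above.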
Bagging
Various bootstrapping techniques have been applied to classification trees to provide more accurate predictions due to the instability noted with the greedy algorithm that produces the partitions of the tree (Breiman 1996; Freund and Schapire 1996; Breiman 1998; Breiman 2001). Traditional bootstrap aggregation, referred to as bagging, develops $B$ unpruned trees using re-samples of the data and forms predictions at each of the $B$ trees for each observation in the dataset. Although unpruned trees are generally used, pruned versions can be considered, e.g. “stumps”, trees containing one split. We can then form predictions by averaging across the $B$ trees to yield an aggregated (or bagged) prediction and, as Kuhnert et al. (2010) demonstrated using Random Forests, we can calculate standard errors. For visualisation we adopt the approach proposed by Kuhnert and Mengersen (2003), where the bagged predictions are mapped back to the pruned tree for interpretation.

In terms of predicting diet composition, we extend the bootstrap aggregation approach of Breiman (1996) by adopting a spatial bootstrap to account for spatial dependence in the data using the methods described by Hall (1985) and Cressie (1993). The spatial bootstrap ensures that bootstrap samples of the data are stratified according to a defined spatial resolution, typically one that is small enough such that the correlation between spatial grids is negligible. We can formally test this assumption by fitting a variogram to the residuals resulting from the bootstrap predictions. The approach therefore takes $B$ spatial bootstrap samples of the transformed data to form a resampled vector of prey classes, $Y_l^*(b)$, with corresponding case weights, $W_l^*(b)$, representing the proportion of prey, $Y_l^*(b)$, eaten by a predator as represented in each bootstrap sample, $b$. Averaged prey proportions are formed by taking the mean of the bootstrapped proportions for each predator with a corresponding estimate of the variance as follows:

$$\hat{\pi}_{ij}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\pi}_{ij}^*(b) \quad (i = 1, \ldots, n;\ j = 1, \ldots, p)$$

$$\mathrm{var}(\hat{\pi}_{ij}^*) = \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\pi}_{ij}^*(b) - \hat{\pi}_{ij}^*(\cdot) \right)^2 \qquad (4)$$

The predicted prey therefore represents the bootstrapped average prey proportion yielding the maximum probability and is represented as $\hat{Y}_i^*(\cdot)$ $(i = 1, \ldots, n)$, where $\hat{Y}_i^*(\cdot) = \max\{\hat{\pi}_{i1}^*(\cdot), \ldots, \hat{\pi}_{ip}^*(\cdot)\}$.

Visualisation of results and partial dependence plots
We can visualize the bagged predictions using the pruned classification tree in a similar approach outlined by Kuhnert and Mengersen (2003), where they took bootstrap samples of their data, fitted a regression tree to each resample, formed bootstrap predictions and mapped the predictions back to the original tree to examine the error in each terminal node of the tree. Using this approach, we mapped the bagged predictions back to the intermediate and terminal nodes of a pruned tree to visualise the bagged prey distribution in terms of the proportion of prey eaten and produce summary statistics accordingly.

Partial dependence plots represent a useful visual aid for exploring the relationship between explanatory variables and the predicted response. In terms of their calculation, partial dependence is calculated using the bootstrap predictions, $\hat{\pi}_{ij}^*(b) \mid x_{\cdot j},\ x_{\cdot -j} = \bar{x}_{\cdot -j}$, for each variable $x_{\cdot j}$, conditional on holding all other explanatory variables constant at their respective means. The approach used here is similar to that described by Breiman (2001).
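The aggregation step in Equation 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it assumes the per-tree predicted proportions for one predator are already available (the three trees and their values below are invented), and the function name `bagged_composition` is hypothetical. Fitting the trees and drawing the spatial bootstrap samples are outside the scope of the sketch.

```python
from statistics import mean, variance


def bagged_composition(boot_predictions):
    """Aggregate per-tree predictions for one predator.

    boot_predictions[b][j] is the predicted proportion of prey j from tree b.
    Returns the bagged means and the variances with a B - 1 denominator (Eq. 4).
    """
    n_prey = len(boot_predictions[0])
    columns = [[pred[j] for pred in boot_predictions] for j in range(n_prey)]
    means = [mean(column) for column in columns]
    variances = [variance(column) for column in columns]  # sample variance, B - 1
    return means, variances


# Invented predictions from B = 3 trees for three prey groups.
boot_predictions = [
    [0.2, 0.1, 0.7],
    [0.3, 0.1, 0.6],
    [0.1, 0.1, 0.8],
]
means, variances = bagged_composition(boot_predictions)
# Index of the prey group with the largest bagged proportion.
predicted = max(range(len(means)), key=lambda j: means[j])
print([round(m, 3) for m in means], [round(v, 3) for v in variances], predicted)
```

Standard errors for the bagged proportions follow by taking square roots of the variances; with the 500 trees used later in the paper the same function would simply receive a longer list.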
The plotted partial dependence results either represent a step function (for continuous variables) showing the relationship between the explanatory variable and the predicted response, or alternatively, a barplot (for categorical variables) showing the contribution of each category to the prey composition relative to the most dominant response category.

Results
A classification tree for diet data
The bagged classification tree approach was applied to the yellowfin dataset using the spatial and environmental covariates described in the methods with the aim of predicting prey composition. The tree was pruned to 1 standard error, yielding a cross-validated error rate of 0.859 (SE=0.023). The resulting model consists of terminal nodes with colors reflecting the prey yielding the highest proportion in the prey composition (Figure 1). Prey codes used in the figure are outlined in Table 1. Important splits are indicated by the length of the splits (Figure 1), with the longer splits highlighting the importance of seasonal and spatial effects. The first split separates summer and winter samples, followed by splits on latitudinal region separating the central and southern regions from the northern region during summer months, and the central and northern from the southern region in winter months. Further splits were on sea-surface temperature (SST), mixed layer depth and yellowfin length. The variable importance ranking indicated SST had the highest rank (1.00) followed by latitude (0.74), season (0.51), fork length (0.49), mixed layer depth (0.36) and low relative importance values contributed by longitude region (0.19) and moon phase (0.14).

A map highlighting prey diversity for the yellowfin tuna was based on the calculated Gini index, i(t) (expression in Equation 3), that was used to partition the data and form splits of the tree (Figure 2). Values of the Gini index, hereafter termed diversity, ranged between 0 and 1 where low values (purple and blue in Figure 2) indicated a dominant prey and high values (yellow and orange) indicated highly diversified prey species composition. We showed the observed diversity for each yellowfin and a smoothed representation using the results from a fitted generalized additive model (GAM) to latitude and longitude using smoothing splines in Figures 2(a) and 2(b) respectively. Overall, high diversity is represented amongst the yellowfin sampled, indicating that the yellowfin ate a varied diet. Of notable interest was the high diversity of prey predicted by the GAM in the diets of yellowfin sampled in the southern areas of the map compared to central and northern regions.

Predicting diet composition
Bootstrapping was performed to provide a bagged predicted diet composition for each yellowfin tuna with corresponding standard errors. We investigated the residuals from the fitted classification tree to determine whether spatial dependence had been adequately captured and for this application, spatial bootstrapping was not required. We showed the results from bagging in Figure 3 (panels labeled bootstrapped proportions) based on 500 trees developed on bootstrap samples of the data. This number was based on investigations by Kuhnert and Mengersen (2003) to ensure sampling error was minimized. Predictions were mapped to the terminal nodes of the pruned tree (Figure 1), providing a summary of the predictions.

We used Figure 1 to facilitate interpretation of the bagged estimates and examined internal and terminal nodes of the tree to make sense of the predictions produced.
To illustrate the method, we selected a terminal node from the tree in Figure 1 (node 8) and an internal node (node 6). We showed in Figure 3(a) the results from the internal node (6) representing the location of samples taken in winter months in the southern region off the east coast of Australia. The observed proportion of prey consumed by the tuna at that node is presented along with the corresponding bootstrap predictions. The results indicated a diverse prey composition (Gini diversity index of 0.795), highlighting yellowfin that eat a range of prey in low proportions but with high precision as demonstrated from the narrow bootstrap percentile intervals (Figure 3(a)). The most dominant prey appearing in this node were crustacean decapods, and scomberesocid, carangid, gempylid, scombrid and tetraodontid fishes. The predicted prey composition for terminal node (8) of the tree (Figure 3(b)) showed lower prey diversity (Gini diversity index of 0.542), highlighting crustacean decapods as the most dominant prey (and prey class prediction), with some uncertainty surrounding that prediction as demonstrated by the wide bootstrap percentile intervals. All other prey appeared in lower proportions.

Bootstrap predictions for all terminal nodes of the tree are shown in Figure 4. Although some nodes clearly showed a dominant prey (nodes 8 and 29), most terminal nodes comprised 2 or 3 dominant prey, but predicted in low proportions. Of note was the prevalence of myctophid fishes in northern waters during summer (node 5), highlighting the method’s usefulness in distinguishing prey differences at smaller spatial and temporal scales.

Investigating relationships between predictor variables and diet composition
Partial dependence plots were constructed to examine the relationship between each of the predictor variables and the predicted prey composition for a subset of prey. We presented partial dependence plots for the 3 most important variables, sea-surface temperature, length and latitude. We investigated the interaction between latitude and longitude to aid with the interpretation. Partial dependence plots for SST and length (Figures 5 and 6 respectively) are shown for a subset of prey, crustacean decapods, and fishes from the families Carangidae, Gempylidae, Scombridae, Scomberesocidae and Tetraodontidae, which were identified as important in nodes 6 and 8 of the tree (Figure 3). Rug plots, represented by black segments at the base of the graph, showed the distribution of samples collected for each variable. Reduced proportions of scomberesocids (blue) and carangids (green) were noted in warmer waters compared to cooler waters, whereas greater proportions were noted for scombrids (orange) in warmer waters up to 25°C (Figure 5). Crustacean decapods (black) appeared in higher proportions as SST increased. In contrast, gempylid (gold) and tetraodontid (red) fishes appeared in small proportions consistently across the range of sea surface temperatures.

Figure 6 shows the contribution of yellowfin fork length to the predicted proportions of prey across a yellowfin size range between 73 cm and 170 cm with an outlier at 214 cm. Greater proportions of scomberesocids (blue) were found in smaller yellowfin compared to larger fish, whereas consistent proportions, between about 0.1 and 0.25, were noted for carangids (green), scombrids (orange) and decapods (black) for fish between 100 cm and 160 cm in length. Tetraodontid (red) and gempylid (gold) fishes were consistently predicted in low proportions across the range of yellowfin tuna sizes. Although there was a marked decline predicted for decapods as fork length increased (Figure 6), this was more likely an artifact of the data (or lack of data for large fish) than related to biological factors.

We showed the partial dependence between latitude and longitude in Figure 7 for four prey identified in node 6 of the tree. The predicted range of proportions is between 0 and 0.4 for each prey group. The plots indicated greater proportions of decapods in the inshore, central and southern regions with lower proportions noted in offshore regions and survey sites to the north. Greater proportions were also noted in the southern inshore region for scombrids. Carangid fishes were more prevalent as prey in offshore regions and regions to the north, whereas tetraodontid fishes appeared in greater proportions offshore and in regions to the south.

Discussion
The yellowfin analyzed in this study exhibited a diverse diet composition, rarely dominated by any one particular prey taxon, a similar conclusion to that reached previously for yellowfin tuna off eastern Australia (Young et al. 2010). However, there were a number of novel insights delivered by this analysis. For example, significant differences were noted in the prey of yellowfin tuna sampled north of 30°S in summer where oligotrophic (nutrient poor) waters dominate (Ridgway 2007). These fish fed mainly on small crustacean megalopa and lanternfish. The predicted expansion of these waters in future years may result, therefore, in a diet shift to smaller crustaceans and fish. How this will impact on stocks of yellowfin tuna in the region needs further study, but should be considered in modelling scenarios (e.g. Griffiths et al. 2010). Other insights included spatial mapping of prey diversity, and the relationships between individual prey taxa and environmental correlates. For example, carangid fishes (F.Car) were increasingly dominant prey in areas with decreased SST.
As tunas and other large predators are considered effective samplers of smaller prey species, these analyses could be used to build a better understanding of the environmental tolerances of many lower-trophic-level species for which there is only limited information. A similar analysis being conducted for yellowfin tuna in the eastern Pacific Ocean (R. Olson and colleagues, personal communication) has, in contrast, found less diverse food habits than those of the Australian-caught yellowfin. The diet of eastern Pacific tuna was dominated by only a few prey taxa, depending on temporal, spatial, environmental and biological covariates.

Classification trees are highly suited to analysing data that may include non-linearity, outliers, high-order interactions, lack of balance and missing values, all of which are relevant to diet data. In this context, we present a model for predicting diet composition that is both exploratory and predictive. From an exploration perspective, we produced an easily interpretable classification tree accompanied by partial dependence plots for exploring the relationship between the explanatory variables and the predicted response. From a prediction perspective, we predicted the composition of prey consumed by each predator and produced a predicted prey distribution at each terminal node of the tree, with bootstrap percentile intervals. This methodology therefore provides a significant improvement over existing methods that are purely exploratory or that involve multiple methodologies to determine relationships between the explanatory variables and the response, and it can be used as an effective tool in broader analyses of environmental variables.
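The bootstrap percentile intervals mentioned above can be illustrated with a simplified Python sketch. It resamples the stomachs falling in a single node with replacement and recomputes the mean prey proportions, rather than refitting a tree to every bootstrap sample as the bagged method does. The node data, the number of replicates, the interpolation rule for quantiles and the seed are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple

Proportions = Dict[str, float]


def mean_proportions(stomachs: Sequence[Proportions], prey: Sequence[str]) -> Proportions:
    """Mean proportion of each prey group across stomachs."""
    n = len(stomachs)
    return {k: sum(s.get(k, 0.0) for s in stomachs) / n for k in prey}


def percentile(sorted_values: List[float], q: float) -> float:
    """Empirical quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (pos - lower) * (sorted_values[upper] - sorted_values[lower])


def bootstrap_intervals(
    stomachs: Sequence[Proportions],
    prey: Sequence[str],
    n_boot: int = 1000,
    level: float = 0.95,
    seed: int = 1,
) -> Dict[str, Tuple[float, float]]:
    """Percentile intervals for the mean proportion of each prey group."""
    rng = random.Random(seed)
    draws: Dict[str, List[float]] = {k: [] for k in prey}
    for _ in range(n_boot):
        # Resample stomachs with replacement, keeping the sample size fixed.
        resample = [rng.choice(stomachs) for _ in stomachs]
        for k, v in mean_proportions(resample, prey).items():
            draws[k].append(v)
    alpha = (1.0 - level) / 2.0
    return {
        k: (percentile(sorted(draws[k]), alpha), percentile(sorted(draws[k]), 1.0 - alpha))
        for k in prey
    }


# Hypothetical wet-weight proportions for five stomachs in one node.
node = [
    {"Cr.Dec": 0.7, "F.Car": 0.3},
    {"Cr.Dec": 0.2, "F.Scom": 0.8},
    {"F.Car": 1.0},
    {"Cr.Dec": 0.5, "F.Car": 0.1, "F.Scom": 0.4},
    {"Cr.Dec": 0.9, "F.Scom": 0.1},
]
groups = ["Cr.Dec", "F.Car", "F.Scom"]
point = mean_proportions(node, groups)
intervals = bootstrap_intervals(node, groups)
for k in groups:
    print(k, round(point[k], 3), tuple(round(x, 3) for x in intervals[k]))
```

With so few stomachs the intervals are wide, which mirrors the role the intervals play in the paper: they flag nodes where the predicted composition is poorly supported by data.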
By providing mapped distributions of prey composition and diversity, the method presents a baseline for future comparison, particularly relevant in understanding distribution changes with respect to climate change (Chen et al. 2011). Moreover, this methodology represents a tool for hypothesis development involving potential changes in prey biodiversity under model scenarios of climate change (e.g. Polovina et al. 2011).

One potential limitation of this approach is its limited ability to perform fine-scale analyses of any single dietary predictor. An example is the investigation of predator length, which was highlighted by this method as an important predictor and shown through partial dependence plots to have distinct relationships with prey species. Methods such as quantile regression (Koenker and Bassett 1978; Scharf et al. 1998) can be a useful accompaniment to this broader-scale analysis, to investigate the relationships between tuna length and minimum, median and maximum prey lengths in relation to dietary differences, as highlighted in Young et al. (2010). We therefore recommend using this tree-based methodology initially in a broader-scale analysis, to identify specific biological questions relating to dietary differences that can then be explored using other techniques.

A second potential limitation of the tree-based approach presented here is its limited ability to handle dependent observations. Although not an issue for the data presented in this manuscript, pseudoreplication, where multiple fish are sampled from the same purse-seine set, can induce a bias in the tree-based model. In a regression analysis, the bias can be easily accounted for through the random effects of a generalized linear mixed model.
In the case of trees, we can overcome this issue by implementing a sub-sampling approach within the bootstrap procedure. This approach randomly samples a subset of observations from each set, fits the tree model, and compares the predictions with the observed proportions using a Hellinger distance to identify whether bias is an issue for the sample size taken and whether an analysis of the entire set of data is reasonable.

Although the classification tree methodology can be applied to quite varied diet data with a large variety of prey taxa, we recommend selecting a threshold, say 1%, and applying the methodology to stomach samples where the percent of wet weight is greater than the nominated threshold. We found that applying this threshold gave more informative analyses, as it omitted many rare prey groups and outliers that tended to make interpretation of model outputs difficult.

We illustrated the method in this paper using diet data for a single predator. Nevertheless, the methodology can be easily applied to a multi-predator dataset, in which stomachs from more than one predator species in a region are sampled. Including multiple predator species simply introduces an additional explanatory variable into the model indicating the predator species. The classification tree therefore has the capacity to split on the predator species variable if it identifies a different prey composition for the different species being analyzed.

The methodology presented here extends the classification tree methodology of Breiman et al. (1984) by incorporating spatial bootstrap techniques (where required) in a bagged approach. This provides bootstrap predictions and standard errors that can be mapped back to a pruned tree for visualization or used to form predictions of prey distributions for each predator in the dataset.
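The sub-sampling check described in this section compares predicted with observed prey proportions using a Hellinger distance. The Python sketch below uses the standard definition H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies between 0 and 1. The text does not state which normalisation was used, so the factor of 0.5 is an assumption, as are the helper `subsample_per_set`, the `set_id` field and the example proportions.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List


def hellinger(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Hellinger distance between two discrete distributions (0 = identical)."""
    keys = set(p) | set(q)
    total = sum((math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return math.sqrt(0.5 * total)


def subsample_per_set(records: List[dict], k: int, seed: int = 1) -> List[dict]:
    """Randomly keep at most k stomachs from each fishing set."""
    rng = random.Random(seed)
    by_set = defaultdict(list)
    for rec in records:
        by_set[rec["set_id"]].append(rec)
    kept = []
    for set_id in sorted(by_set):
        group = by_set[set_id]
        kept.extend(rng.sample(group, min(k, len(group))))
    return kept


# Hypothetical observed and predicted prey proportions for one node.
observed = {"Cr.Dec": 0.36, "F.Car": 0.64}
predicted = {"Cr.Dec": 0.64, "F.Car": 0.36}
print(round(hellinger(observed, predicted), 3))

# Hypothetical stomach records: three fish from set 1, one from set 2.
records = [
    {"set_id": 1, "fish": "a"},
    {"set_id": 1, "fish": "b"},
    {"set_id": 1, "fish": "c"},
    {"set_id": 2, "fish": "d"},
]
kept = subsample_per_set(records, k=1)
print([r["fish"] for r in kept])
```

In a full check, the model would be refitted to each sub-sample and the distance between its predictions and the observed proportions tracked across sub-sample sizes; distances that stay small suggest that analysing all fish from each set is reasonable.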
The partial dependence plots provide a valuable addition for visualizing relationships between each explanatory variable and the predicted proportions of each prey of interest. Although it is tempting to extrapolate beyond the range of the data to provide forecasts of the prey composition, such predictions are not based on data and must be treated with caution. To ensure predictions do not fall outside the range of the data, we have restricted predictions to only those based on the data provided. For example, in Figure 7, we have only provided predictions for the regions that were sampled, as indicated by the black points shown on each map.

The choice of explanatory variables to include in this analysis requires some preliminary investigation. Although the inclusion of additional variables in a model may lead to low error rates, they may also introduce complex and noisy splits that are very difficult to interpret. For example, the use of the spatial variables latitude and longitude in their continuous form made this tree difficult to interpret. As an alternative, we created categorical versions of these variables that made more sense ecologically when examining the prey composition at each split of the tree. A second example is the consideration of a year term in the model. While data were collected over a 14-year period (1992-2006), splits on year were not interpretable ecologically because samples were not collected homogeneously across sampling years and locations. As balanced sampling regimes are not always possible, we recommend that, prior to any classification tree modeling, a thorough exploratory analysis be conducted on the explanatory variables to be incorporated into the model.

Conclusions

The classification tree methodology provides new insights into the complex feeding habits of top predators like yellowfin tuna. Compared to existing approaches, this methodology offers a robust approach for analysing diet data, providing exploratory summaries in addition to predicted prey composition that can be useful for examining differences between predator feeding characteristics in different spatial regions, temporal zones and environmental regimes. The bootstrap implementation also provides a way of incorporating uncertainty into the model by providing bootstrap percentile intervals around the estimated diet composition. By providing mapped distributions of prey composition and diversity, the method presents a baseline for future comparison, particularly relevant in understanding distribution changes with respect to climate change. This method shows promise as a framework for clarifying what have until now been approximations of the trophic structure underlying ecologically important large marine ecosystems, which is an essential prerequisite for understanding future effects of environmental and anthropogenic forces.

Software and code

The methodology presented in this paper was implemented in R (R Development Core Team 2005), making use of the rpart (Therneau et al. 2009) and maps (Becker et al. 2010) packages from the CRAN website. A diet R package that implements the methodology presented in this paper is currently under development, with a corresponding publication; details can be obtained from the first author.

Acknowledgements

The authors wish to acknowledge Ross Sparks and the three anonymous reviewers who kindly reviewed this manuscript. We acknowledge Alex Aires-DaSilva for making the R code and resources available for producing the maps in this paper. We thank Christine Patnode for assistance with the graphics. We thank skippers from the Eastern Tuna and Billfish fishery for help in collecting the yellowfin samples; Matt Lansdell and S. Riddoch (CSIRO) for taxonomic support; and Scott Cooper (CSIRO) for structuring the database that housed this information.
We thank Alistair Hobday, Klaus Hartmann, Jason Hartog and Sophie Bestley for developing and making available the SDODE interface for exploring global data sets. Finally, we acknowledge the funding support from CSIRO through the Julius Award that was granted to the first author.

References

Baker R, Sheaves M (2005) Redefining the piscivore assemblage of shallow estuarine nursery habitats. Marine Ecology Progress Series 291: 197-213. doi: 10.3354/meps291197
Becker RA, Wilks AR, Brownrigg R, Minka TP (2010) maps: Draw Geographical Maps. R package version 2.1-5, http://CRAN.R-project.org/package=maps
Breiman L (1996) Bagging Predictors. Machine Learning 24: 123-140. doi: 10.1023/A:1018054314350
Breiman L (1998) Arcing Classifiers (with Discussion). Annals of Statistics 26: 801-824. doi: 10.2307/120055
Breiman L (2001) Random Forests. Machine Learning 45: 5-32. doi: 10.1023/A:1010933404324
Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and Regression Trees. Wadsworth, Belmont, California
Chipps SR, Garvey JE (2007) Assessment of diets and feeding patterns. In: Guy CS, Brown ML (eds) Analysis and interpretation of Freshwater Fisheries Data. American Fisheries Society, Bethesda, Maryland, USA, pp 473-514
Christensen V, Walters CJ (2004) Ecopath with Ecosim: methods, capabilities and limitations. Ecological Modelling 172(2-4): 109-139. doi: 10.1016/j.ecolmodel.2003.09.003
Cressie NAC (1993) Statistics for Spatial Data. Wiley, New York
Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Kauffman M (ed) Machine Learning: Proceedings of the Thirteenth International Conference, San Francisco, pp 148-156
Fulton EA, Smith ADM, Smith DC
(2007) Alternative Management Strategies for Southeast Australian Commonwealth Fisheries: Stage 2: Quantitative Management Strategy Evaluation. CSIRO Marine and Atmospheric Research, Hobart, Tasmania
Griffiths SP, Fry GC, Manson FJ, Pillans RD (2007) Feeding dynamics, consumption rates and daily ration of longtail tuna (Thunnus tonggol) in Australian waters, with emphasis on the consumption of commercially important prawns. Marine and Freshwater Research 58: 376-397. doi: 10.1071/MF06197
Griffiths SP, Kuhnert PM, Fry GF, Manson FJ (2009) Temporal and size-related variation in the diet, consumption rate and daily ration of mackerel tuna (Euthynnus affinis) in neritic waters of eastern Australia. ICES Journal of Marine Science 66: 720-733. doi: 10.1093/icesjms/fsp065
Griffiths SP, Young JW, Lansdell JW, Campbell RA, Hampton J, Hoyle SD, Langley A, Bromhead D, Hinton MG (2010) Ecological effects of longline fishing and climate change on the pelagic ecosystem off eastern Australia. Reviews in Fish Biology and Fisheries 20: 239-272. doi: 10.1007/s11160-009-9157-7
Hall P (1985) Resampling a coverage pattern. Stochastic Processes and their Applications 20: 231-246
Hartog J, Hobday AJ, Matear R, Feng M (2011) Habitat overlap of southern bluefin tuna and yellowfin tuna in the east coast longline fishery - implications for present and future spatial management. Deep Sea Research II 58: 746-752. doi: 10.1016/j.dsr2.2010.06.005
Hobday AJ, Hartmann K, Hartog J, Bestley S (2006) SDODE: spatial dynamics ocean data explorer. User Guide. CSIRO Marine and Atmospheric Research, Hobart, 8 pp
Chen I-C, Hill JK, Ohlemüller R, Roy DB, Thomas CD (2011) Rapid Range Shifts of Species Associated with High Levels of Climate Warming. Science 333: 1024-1026.
doi: 10.1126/science.1206432
Koenker RW, Bassett GW (1978) Regression quantiles. Econometrica 46: 33-50
Kuhnert PM, Kinsey-Henderson A, Bartley R, Herr A (2010) Incorporating uncertainty in gully erosion calculations using the random forests modelling approach. Environmetrics 21: 493-509. doi: 10.1002/env.999
Kuhnert PM, Mengersen K (2003) Reliability measures for local nodes assessment in classification trees. Journal of Computational and Graphical Statistics 12: 398-416. doi: 10.1198/1061860031734
Logan JM, Rodriguez-Marin E, Goñi N, Barreiro S, Arrizabalaga H, Golet W, Lutcavage M (2011) Diet of young Atlantic bluefin tuna (Thunnus thynnus) in eastern and western Atlantic foraging grounds. Marine Biology 158: 73-85. doi: 10.1007/s00227-010-1543-0
Marasco RJ, Goodman D, Grimes CB, Lawson PW, Punt AE, Quinn TJI (2007) Ecosystem-based fisheries management: some practical suggestions. Canadian Journal of Fisheries and Aquatic Sciences 64: 928-939. doi: 10.1139/F07-062
Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic, London
McCullagh P, Nelder JA (1983) Generalized linear models. Chapman and Hall, London
McCulloch CE, Searle SR (2001) Generalized, linear and mixed models. John Wiley and Sons, New York
Ménard F, Labrune C, Shin Y-J, Asine A-S, Bard F-X (2006) Opportunistic predation in tuna: a size-based approach. Marine Ecology Progress Series 323: 223-231. doi: 10.3354/meps323223
Olson RJ, Galván-Magaña F (2002) Food habits and consumption rates of common dolphinfish (Coryphaena hippurus) in the eastern Pacific Ocean. U.S. National Marine Fisheries Service, Fisheries Bulletin 100: 279-298
Olson RJ, Watters GM (2003) A model of the pelagic ecosystem in the eastern tropical Pacific Ocean.
Inter-American Tropical Tuna Commission, Bulletin 22: 133-218
Pikitch EK, Santora C, Babcock EA, Bakun A, Bonfil R, Conover DO, Dayton P, Doukakis P, Fluharty D, Heneman B, Houde ED, Link J, Livingston PA, Mangel M, McAllister MK, Pope J, Sainsbury KJ (2004) Ecosystem-based fishery management. Science 305: 346-347. doi: 10.1126/science.1098222
Polovina JJ, Dunne JP, Woodworth PA, Howell EA (2011) Projected expansion of the subtropical biome and contraction of the temperate and equatorial upwelling biomes in the North Pacific under global warming. ICES Journal of Marine Science. doi: 10.1093/icesjms/fsq198
Potier M, Marsac F, Cherel Y, Lucas V, Sabatie R, Maury O, Ménard F (2007) Forage fauna in the diet of three large pelagic fishes (lancetfish, swordfish and yellowfin tuna) in the western equatorial Indian Ocean. Fisheries Research 83: 60-72. doi: 10.1016/j.fishres.2006.08.020
R Development Core Team (2005) R: A language and environment for statistical computing, reference index version 2.12.1. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org
Ridgway KR (2007) Long-term trend and decadal variability of the southward penetration of the East Australian Current. Geophysical Research Letters 34: 5pp. doi: 10.1029/2007GL030393
Scharf FS, Juanes F, Sutherland M (1998) Inferring ecological relationships from the edges of scatter diagrams: comparison of regression techniques. Ecology 79: 448-460
Therneau TM, Atkinson B, R port by Brian Ripley (2009) RPART: Recursive Partitioning. R package version 3.1-45, http://CRAN.R-project.org/package=rpart
Young JW, Hobday AJ, Campbell RA, Kloser RJ, Bonham PI, Clementson LA, Lansdell MJ (2011) The biological oceanography of the East Australian Current and surrounding waters in relation to tuna and billfish catches off eastern Australia.
Deep Sea Research II 58: 720-733. doi: 10.1016/j.dsr2.2010.10.005
Young JW, Lamb TD, Bradford R, Clementson LA, Kloser RJ, Galea H (2001) Yellowfin tuna (Thunnus albacares) aggregations along the shelfbreak of southeastern Australia: links between inshore and offshore processes. Marine and Freshwater Research 52. doi: 10.1071/MF99168
Young JW, Lansdell JW, Riddoch S, Revill A (2006) Feeding ecology of broadbill swordfish, Xiphias gladius (Linnaeus, 1758), off eastern Australia in relation to physical and environmental variables. Bulletin of Marine Science 79: 793-811
Young JW, Lansdell MJ, Campbell RA, Cooper SP, Juanes F, Guest MA (2010) Feeding ecology and niche segregation in oceanic top predators off eastern Australia. Marine Biology 157: 2347-2368. doi: 10.1007/s00227-010-1500-y
Zuur AF, Ieno EN, Smith GM (2007) Analysing Ecological Data. Springer, New York

Figure Captions

Figure 1: Pruned classification tree that predicts the diet composition of yellowfin tuna off the east coast of Australia. The prey groups identified at each terminal node are those with the highest proportion composition among a suite of prey in the diet. Colors represent broad taxonomic groupings: light blue for salps, dark blue for squids, red for crustaceans, and green and yellow for fish. Covariates used to develop the tree were Season (Summer and Winter), Latitude Region (LT.reg: Central, South and North), Longitude Region (LN.reg: Offshore and Inshore), sea-surface temperature (SST), mixed layer depth (MLD) and fork length (Length). Refer to Table 1 for prey group abbreviations. This tree yielded a cross-validated error rate of 0.859 ± 0.023.

Figure 2: Map of diversity values ranging between 0 and 1 showing the diversity of the distribution of prey predicted for each yellowfin tuna.
Diversity is mapped as (a) a set of points corresponding to individual predators and (b) a spatially interpolated map produced using generalized additive modelling, overlaid with contours showing the standard errors of the predicted diversity and points indicating sampling locations.

Figure 3: Observed and bagged prey proportions for (a) node 6, representing an internal node of the classification tree and (b) node 8, representing a terminal node of the classification tree. The largest predicted prey class is colored according to the legend, and the locations of samples are portrayed on the map, with the diversity index, D, listed beneath the map. 95% bootstrap percentile intervals are shown for each figure.

Figure 4: Terminal node bootstrapped predictions and 95% bootstrap percentile intervals are presented for each node in the pruned classification tree for the 19 prey examined.

Figure 5: Partial dependence plots showing the relationship between SST and the predicted proportion for Crustaceans (Cr.Dec), and fishes from the families Scomberesocidae (F.Scombs), Carangidae (F.Car), Gempylidae (F.Gem), Scombridae (F.Scom), and Tetraodontidae (F.Tet). A rug plot is shown beneath the plot to indicate the distribution of SST measurements taken.

Figure 6: Partial dependence plots showing the relationship between fork length and the predicted proportion for Crustaceans (Cr.Dec), and fishes from the families Scomberesocidae (F.Scombs), Carangidae (F.Car), Gempylidae (F.Gem), Scombridae (F.Scom), and Tetraodontidae (F.Tet). A rug plot is shown beneath the plot to indicate the distribution of length measurements taken.

Figure 7: Partial dependence plots of four dominant prey appearing in node 6 of the classification tree showing the relationship between the spatial variables and the predicted proportion of (a) Cr.Dec, (b) F.Scombs, (c) F.Car and (d) F.Tet.
Predictions are based on averaging across all other variables in the model.
Tables

Table 1: Prey groupings, broad categorizations and codes used in the analysis of yellowfin tuna diets.

Prey Group         Broad Categorisation   Code
Salpidae           Salps                  S-Sal
Octopodidae        Molluscs               Ce-Oct
Argonautidae       Molluscs               Ce-Arg
Ommastrephidae     Molluscs               Ce-Om
Decapoda           Crustaceans            Cr-Dec
Clupeidae          Fishes                 F-Clu
Emmelichthyidae    Fishes                 F-Emm
Alepisauridae      Fishes                 F-Ale
Myctophidae        Fishes                 F-Myc
Scomberesocidae    Fishes                 F-Scombs
Hemiramphidae      Fishes                 F-Hemi
Carangidae         Fishes                 F-Car
Bramidae           Fishes                 F-Bra
Gempylidae         Fishes                 F-Gem
Scombridae         Fishes                 F-Scom
Nomeidae           Fishes                 F-Nom
Monacanthidae      Fishes                 F-Mon
Ostraciidae        Fishes                 F-Ost
Tetraodontidae     Fishes                 F-Tet