The general surplus production model under the non-equilibrium condition with multiple data sources being combined: application to assessment of Georges Bank yellowtail flounder

Saang-Yoon Hyun
School of Marine Sciences, University of Massachusetts
School for Marine Science and Technology, University of Massachusetts at Dartmouth
706 S. Rodney French Blvd, New Bedford, MA 02744 USA
Phone: (508) 999-8875; FAX: (508) 910-6374
Email: [email protected]

Surplus production models; ICESms110819

Abstract

Under limited data and information, a surplus production model is useful for stock assessment because it has far fewer parameters than age-structured models. I propose the general surplus production (Pella-Tomlinson) model (Pella and Tomlinson 1969) under the non-equilibrium condition, because its shape is flexible and the non-equilibrium condition is more realistic than equilibrium. Unlike in the logistic surplus production (Schaefer) model (Schaefer 1954), the shape parameter of the general model is not fixed. The shape parameter is defined as the ratio $B_{MSY}/K$, where $B_{MSY}$ is the biomass that produces the maximum sustainable yield (MSY), and $K$ is the virgin biomass or the carrying capacity.

Estimating the parameters of the general surplus production model is not a trivial task, because the model is non-linear and suffers from an overparameterization problem. The logistic surplus production model, in contrast, is free of the overparameterization problem because its shape parameter is fixed at 0.5. Thanks to the fixed value, estimates from the logistic model are precise, and the logistic model could be considered to outperform the other surplus production models (Prager 2002). However, the fixed value could lead to serious bias in estimation.
The shape parameter is affected by natural mortality, the steepness of the stock-recruitment model (steepness = the percentage of virgin recruitment produced when the biomass is at a given percentage of the virgin biomass), and $K$ (Maunder 2003). A shape parameter fixed at 0.5 is too strict to be realistic.

A second new element of this study is the simultaneous incorporation of different survey data. The ASPIC software is available from the US NOAA tool box (http://nft.nefsc.noaa.gov/) for applying a surplus production model, but it fails to accommodate different survey data together. When different surveys take place in the same year, they are likely to be correlated with each other. Because of this dependence, combining the data systematically should be more efficient and less biased than using them separately.

Finally, I warn that population biomass or abundance could be overestimated by the routine practice, found in almost all stock assessment models, of treating the logarithm of survey index or catch data as a normal (Gaussian) random variable. Because the mean of the logarithm of a random variable is not the same as the logarithm of its mean, the estimate of population size (abundance or biomass) is likely to be positively biased (overestimated) under this practice. I suggest a more accurate expected value of the logarithm. For demonstration purposes, I apply the above model and ideas to data on Georges Bank (GB) yellowtail flounder (Limanda ferruginea).

Data

Annual commercial yield and survey data are available. Four surveys have been conducted: the US NEFSC spring, fall, and scallop surveys, and the Canada DFO survey. I use data from the NEFSC spring and fall surveys, because both have been collected for much longer than the NEFSC scallop and DFO surveys: the spring survey since 1968, and the fall survey since 1963 (Legault et al. 2010).
Difference equation of non-equilibrium surplus production model

Management practice is generally based on annual data. Using a time increment of one year, I therefore use a difference equation of the general surplus production (Pella-Tomlinson) model under the non-equilibrium condition. I use Gilbert's (1992) formulation of the Pella-Tomlinson model so that the shape parameter appears explicitly:

$$B_{t+1} = B_t + \frac{n}{n-1}\, r\, B_t \left[ 1 - \left( \frac{B_t}{K} \right)^{n-1} \right] - Y_t \qquad (1)$$

where $B_t$ and $Y_t$ = the biomass and the yield in year $t$; $n$ = the parameter that determines $B_{MSY}/K$ (eq. 2), called the shape parameter; $K$ = the virgin biomass, or the carrying capacity; $r$ = the production rate at maximum production.

Management references are calculated as follows (Maunder 2001):

$$\frac{B_{MSY}}{K} = \left( \frac{1}{n} \right)^{1/(n-1)} \qquad (2)$$

where $B_{MSY}$ = the biomass that corresponds to MSY, and the ratio of $B_{MSY}$ to $K$ represents the position of maximum production. MSY is:

$$MSY = r\, B_{MSY} \qquad (3)$$

Eqs 2 and 3 follow from eq. 1: setting the derivative of the production term with respect to $B_t$ to zero gives $(B_t/K)^{n-1} = 1/n$, and substituting this value back into the production term gives $r\, B_{MSY}$.

Yield in the difference equation (eq. 1) is the product of fishing mortality and biomass:

$$Y_t = F_t B_t \qquad (4)$$

It should be noted that in the difference equation model, $F_t$ is a dimensionless rate that ranges from 0 to 1. That is, its range differs from that of the instantaneous rate, which comes from a differential equation model.

Survey data are available, which represent a relative total biomass or a scaled catch (in weight) per unit effort. Denoting the index from survey $s$ in year $t$ as $I_{st}$,

$$I_{st} = q_s \left( B_t - Y_t/2 \right) \qquad (5)$$

where $q_s$ is the scaled catchability coefficient of survey $s$. Because the survey index is not raw but scaled (expanded from the survey areas to the total population area), $q_s$ can in theory be larger than one; the only restriction is $q_s > 0$. The term $(B_t - Y_t/2)$ in eq. 5 represents the average of the biomass at the beginning and at the end of the year. For example, given the recruits (births) in year $t$, the fish that survived from previous years, and the removals by fisheries during the year, the average is "[(recruits + survivals) + (recruits + survivals – harvest)]/2" (Schnute and Richards 1995).
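The recursion in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the ADMB code used in the study. It assumes that eq. 1 has the form $B_{t+1} = B_t + \frac{n}{n-1} r B_t [1 - (B_t/K)^{n-1}] - Y_t$ and that eq. 2 is $B_{MSY}/K = (1/n)^{1/(n-1)}$, which is the pairing that reproduces the $B_{MSY}/K$ values listed in Table 1. The function names and the numerical values of `K` and `r` are hypothetical.

```python
def shape_parameter(n):
    """B_MSY / K from eq. 2; undefined at n = 1, so that value is rejected."""
    if n <= 0 or n == 1:
        raise ValueError("n must be positive and different from 1")
    return (1.0 / n) ** (1.0 / (n - 1.0))


def project_biomass(n, K, r, B1, yields):
    """Iterate the assumed form of eq. 1; returns B_t for each year of yield data."""
    biomass = [B1]
    for Y in yields[:-1]:
        B = biomass[-1]
        if B <= 0:
            raise ValueError("biomass became non-positive; parameters are implausible")
        production = n / (n - 1.0) * r * B * (1.0 - (B / K) ** (n - 1.0))
        biomass.append(B + production - Y)
    return biomass


def predicted_index(q, biomass, yields):
    """Eq. 5: index proportional to the average of start- and end-of-year biomass."""
    return [q * (B - Y / 2.0) for B, Y in zip(biomass, yields)]


# Hypothetical values, chosen only to exercise the functions.
n, K, r = 1.2, 100000.0, 0.3
B_msy = shape_parameter(n) * K       # eq. 2
msy = r * B_msy                      # eq. 3
path = project_biomass(n, K, r, B_msy, [msy] * 5)
```

Starting at $B_{MSY}$ and removing MSY every year leaves the biomass unchanged in this sketch, which is a quick check that the assumed forms of eqs 1 to 3 agree with one another.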
Probability error & Dependence between survey data

As in most surplus production models, I consider that survey data have an error. I assume that the logarithm of the survey index follows a normal (Gaussian) distribution, and I treat the yield data as constants:

$$\log I_{st} \sim N\left( E(\log I_{st}),\; V(\log I_{st}) \right) \qquad (6)$$

where V( ) = the variance operator. The expected value of $\log I_{st}$ is:

$$E(\log I_{st}) = \log q_s + \log\left( B_t - Y_t/2 \right) \qquad (7)$$

where $B_t$ is a function of four free parameters: $n$, $K$, $r$ and the initial biomass $B_1$ (eq. 1). $V(\log I_{st})$ is treated as an additional parameter.

When data are collected from different surveys in a common year, those survey data are all likely to depend on the unknown population biomass in that year. For example, index data from the spring and fall surveys for GB yellowtail flounder are significantly correlated (Fig. 1). In such a case, the dependence between the different survey data must not be ignored. For this purpose, I assume that the different survey data are conditionally independent given the unknown biomass:

$$\Pr(\log I_{1t}, \log I_{2t}, \ldots, \log I_{it} \mid B_t) = \prod_{s=1}^{i} \Pr(\log I_{st} \mid B_t) \qquad (8)$$

The conditional independence between survey data in a year incorporates their dependence on the common biomass in that year.

Likelihood

Denoting the parameters as the vector $\theta$, the likelihood of the parameters is:

$$L(\theta \mid I) = \prod_{t=1}^{y} \prod_{s=1}^{i} \Pr(\log I_{st} \mid \theta) = \prod_{t=1}^{y} \prod_{s=1}^{i} \frac{1}{\sqrt{2\pi\, V(\log I_{st})}} \exp\left( -\frac{\left( \log I_{st} - E(\log I_{st}) \right)^2}{2\, V(\log I_{st})} \right) \qquad (9)$$

where $y$ = the number of years, and $i$ = the number of surveys. To save degrees of freedom, I do not treat $V(\log I_{st})$ as a free parameter, but calculate it analytically. The maximum likelihood estimate (MLE) of the variance is:

$$\hat{V}(\log I_{st}) = \frac{\sum_{t=1}^{y} \left( \log I_{st} - E(\log I_{st}) \right)^2}{y} \qquad (10)$$

$E(\log I_{st})$ in eqs 9 and 10 is replaced by the right side of eq. 7.
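As a rough illustration of how eqs 7 to 10 fit together, the following Python sketch evaluates the negative log-likelihood with the variance of each survey replaced by its analytical MLE. It is not the ADMB implementation used here. The function names, the dictionary layout keyed by survey, and the assumption that every survey has an index value in every year are all hypothetical simplifications.

```python
import math


def expected_log_index(q, biomass, yields):
    """Eq. 7: E(log I_st) = log q_s + log(B_t - Y_t / 2)."""
    return [math.log(q) + math.log(B - Y / 2.0) for B, Y in zip(biomass, yields)]


def survey_nll(observed, expected_log):
    """Negative log of eq. 9 for one survey, with V set to its MLE (eq. 10).

    Substituting eq. 10 into the Gaussian density reduces the sum over years
    to 0.5 * y * (log(2 * pi * V_hat) + 1). A perfect fit (V_hat = 0) is not handled.
    """
    y = len(observed)
    ss = sum((math.log(i) - e) ** 2 for i, e in zip(observed, expected_log))
    v_hat = ss / y
    nll = 0.5 * y * (math.log(2.0 * math.pi * v_hat) + 1.0)
    return nll, v_hat


def joint_nll(indices, q, biomass, yields):
    """Sum over surveys, as implied by the conditional independence in eq. 8."""
    total = 0.0
    for s, observed in indices.items():
        nll, _ = survey_nll(observed, expected_log_index(q[s], biomass, yields))
        total += nll
    return total
```

A call such as `joint_nll({"spring": spring, "fall": fall}, {"spring": q1, "fall": q2}, biomass, yields)` would return the quantity to minimize over $n$, $K$, $r$, $B_1$ and the catchabilities, with `biomass` produced by iterating eq. 1.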
It is not trivial to estimate the four parameters (i.e., $n$, $K$, $r$, and $B_1$) in the above likelihood function, because the likelihood surface is flat and the global maximum is hard to find. For this reason, the constraint $B_1 = K$ has been suggested (Polacheck et al. 1993). I estimate the parameters both with and without the constraint. I use ADMB software (Fournier 2007) to differentiate the likelihood function (eq. 9) with respect to the parameters.

With the constraint of $B_1 = K$

I use data that start in 1969 (Figure 2) when applying the constraint $B_1 = K$, because the average of the spring and fall survey indices was largest in 1969, and starting from the largest value makes the constraint less arbitrary.

Without the constraint of $B_1 = K$

I estimate the parameters without the constraint $B_1 = K$, using different values of $n$ (eq. 1). The possible range of $n$ is from 0.1 to 2.5, which corresponds to a shape parameter ($B_{MSY}/K$) of 0.08 to 0.54. I use data that start in 1976 (Table 1, Figure 4), when the survey values are low enough to justify removing the constraint. Another reason for choosing 1976 as the start year is that only US and Canadian commercial fisheries have been permitted since then, and the yield data since then have been relatively well monitored. Fisheries from other foreign countries operated before 1976, and their yield data may not be reliable.

Questionable practice in using a lognormal density function

Fish stock assessment models commonly apply the logarithm to catch (Fournier and Archibald 1982) or survey index data (Polacheck et al. 1993, Maunder 2001). The logarithm scales down the raw data and also brings their distribution closer to a normal (Gaussian) distribution. The reduced scale and the normality increase stability in parameter estimation. I have no problem with this role of the logarithm.
However, I am concerned that the routine practice of the lognormal application may lead to overestimation of the true population size (in number or biomass), because the expected value of the logarithm of a random variable, say $X$, is not the same as the logarithm of the expected value of the variable: i.e., $E(\log X) \neq \log E(X)$. Because $\log X$ is a concave function for $X > 0$, the Jensen inequality indicates that

$$E(\log X) \leq \log(E X) \qquad (11)$$

That is, in the routine treatment in the fish stock assessment literature, E(log Catch) = log E(Catch) and E(log Index) = log E(Index), the equality (=) really stands for an approximate equality ($\approx$), which is the first-order Taylor series approximation. The Jensen inequality (eq. 11) means that stock assessment models that contain the routine treatment are likely to overestimate population size (abundance or biomass). Surplus production and catch-at-age models apply the routine practice to both catch-per-unit-effort (e.g., survey index) and catch data, and the cumulative effect might increase the positive bias.

To remove the potential for overestimation, I consider the expected value of the logarithm up to the second-order Taylor series approximation. Letting $X \sim N(\mu, \sigma^2)$, then

$$E(\log X) \approx \log \mu - \frac{\sigma^2}{2\mu^2} \qquad (12)$$

Applying the same principle to survey index data,

$$E(I_{st}) = \mu_{st} = q_s \left( B_t - Y_t/2 \right), \qquad Var(I_{st}) = \sigma_s^2 \qquad (13)$$

Like eq. 10, the analytical MLE of the variance is:

$$\hat{\sigma}_s^2 = \frac{\sum_{t=1}^{y} \left( I_{st} - \mu_{st} \right)^2}{y} \qquad (14)$$

$$E(\log I_{st}) \approx \log \mu_{st} - \frac{\sigma_s^2}{2\mu_{st}^2} \qquad (15)$$

$\mu_{st}$ in eqs 14 and 15 is replaced by $q_s (B_t - Y_t/2)$ from eq. 13.

Key preliminary results are shown below as Figures and Table.

[Figure 1: Index (mt) plotted against Year, 1970-2010; annotation: r = 0.72 (P-value < 0.000).]

Figure 1. A significant correlation between the US NEFSC spring (solid line) and fall (dotted line) survey indices for Georges Bank yellowtail flounder.
The significant correlation suggests that we should correctly incorporate the dependence between the different survey data into the stock assessment.

[Figure 2: four panels plotted against Year, 1970-2010. (a) Index (mt); (b) Biomass (mt) and Yield (mt); (c) Ratio; (d) Fishing mortality (%).]

Figure 2. On the basis of survey and commercial yield data from 1969-2009 on Georges Bank yellowtail flounder, performance of the non-equilibrium surplus production model where the shape parameter is treated as a free parameter. Panel (a): the predicted survey indices are overlaid on the observed values. The red solid line indicates the predicted spring survey index, and the red dots on the broken red line are the observed spring survey index. Counterparts in blue represent the fall survey index. Panel (b): the solid line is the predicted biomass, whose values are on the left y-axis, and the black dots are the yield data, whose values are on the right y-axis. The horizontal broken line indicates MSY (= 9869.6 mt). Panel (c): $B_t/K$ (solid line) is compared with $B_{MSY}/K$ (broken line), the shape parameter, whose estimate is 0.44. Panel (d): Fishing mortality (%).

[Figure 3: Biomass (mt) plotted against Year, 1970-2010; lines labeled n = 0.7, 0.8, 0.9, 1.1, and 1.2.]

Figure 3. Predicted biomass from data in 1976-2009, given different values of $n$ (eq. 2; Table 1), where the constraint of $B_1 = K$ was removed. $n$ controls the shape parameter. The green line is the biomass predicted by the best model, where $n$ = 1.2. For comparison, the red line is added, which is the predicted biomass from data in 1969-2009 where the constraint of $B_1 = K$ was assumed.

[Figure 4: four panels plotted against Year, 1970-2010. (a) Index (mt); (b) Biomass (mt) and Yield (mt); (c) Ratio; (d) Fishing mortality (%).]

Figure 4.
On the basis of survey and commercial yield data from 1976-2009 on Georges Bank yellowtail flounder, performance of the non-equilibrium surplus production model where the constraint of $B_1 = K$ is relieved. Panel (a): the predicted survey indices are overlaid on the observed values. The red solid line indicates the predicted spring survey index, and the red dots on the broken red line are the observed spring survey index. Counterparts in blue represent the fall survey index. The predicted spring and fall survey indices were very close to each other. Panel (b): the solid line is the predicted biomass, whose values are on the left y-axis, and the black dots are the yield data, whose values are on the right y-axis. The horizontal broken line indicates MSY (= 7599.6 mt). Panel (c): $B_t/K$ (solid line) is compared with $B_{MSY}/K$ (broken line), the shape parameter, whose estimate is 0.40. Panel (d): Fishing mortality (%).

Table 1. Evaluation of the non-equilibrium surplus production model fitted to data from 1976-2009 without the constraint of $B_1 = K$. The model fitted best when the shape parameter (= $B_{MSY}/K$) was 0.40 (i.e., $n$ = 1.2 in eq. 2), where (i) the residuals between the observed survey index values and the model fitted values, measured as the mean of the absolute values of the relative residuals, were the smallest (60.9% in the spring survey, and 44.3% in the fall survey); and (ii) the scaled negative log-likelihood value was the smallest (-4.8). The other values of $n$ (i.e., outside the range of 0.7 – 1.2) did not lead to stable estimation of parameters, or they resulted in unreasonable estimates (e.g., $B_t < Y_t$).
                   Mean of the absolute values of
                   relative residual (%)               Scaled negative
n      BMSY / K    Spring survey      Fall survey      log-likelihood
0.7    0.30        63.6               46.2             -1.9
0.8    0.33        63.2               45.8             -2.5
0.9    0.35        62.7               45.4             -3.0
1.1    0.39        61.6               44.7             -4.2
1.2    0.40        60.9               44.3             -4.8

Acknowledgements

Chris Legault at the US NOAA Northeast Fisheries Science Center (NEFSC) and Heath Stone at the Canada DFO provided me with data and valuable information about Georges Bank yellowtail flounder ecology and management. Brian Rothschild, Yue (June) Jiao, and Steve Cadrin in the School for Marine Science & Technology at the Univ. of Massachusetts Dartmouth were consulted. Mark Maunder at the Inter-American Tropical Tuna Commission helped me with ADMB and gave advice. The work is part of the project of New England Multi-Species Survey, funded by the NOAA NMFS (NA10NMF4720287).

References

Fournier, D.A., and Archibald, C.P. 1982. A general theory for analyzing catch at age data. Can. J. Fish. Aquat. Sci. 39: 1195–1207.
Fournier, D.A. 2007. An introduction to AD Model Builder Version 8.0.2: for use in nonlinear modeling and statistics. Otter Research Ltd., Sidney, B.C., Canada.
Gilbert, D.J. 1992. A stock production modeling technique for fitting catch histories to stock index data. New Zealand Fisheries Assessment Res. Doc. 92/15. [Available from National Institute of Water and Atmospheric Research (NIWA), Greta Point, P.O. Box 297, Wellington, N.Z.]
Legault, C.M., Alade, L., and Stone, H.H. 2010. Stock assessment of Georges Bank yellowtail flounder for 2010. TRAC Reference Document 2010/06.
Maunder, M.N. 2001. A general framework for integrating the standardization of catch-per-unit-of-effort into stock assessment models. Can. J. Fish. Aquat. Sci. 58: 795–803.
Maunder, M.N. 2003. Letter to the editor. Is it time to discard the Schaefer model from the stock assessment scientist's toolbox? Fish. Res. 61: 145–149.
Pella, J.J., and Tomlinson, P.K. 1969.
A generalized stock production model. IATTC Bull. 13: 421–458.
Polacheck, T., Hilborn, R., and Punt, A.E. 1993. Fitting surplus production models: comparing methods and measuring uncertainty. Can. J. Fish. Aquat. Sci. 50: 2597–2607.
Prager, M.H. 2002. Comparison of logistic and generalized surplus production models applied to swordfish, Xiphias gladius, in the north Atlantic Ocean. Fish. Res. 58: 41–57.
Schaefer, M.B. 1954. Some aspects of the dynamics of populations important to the management of commercial marine fisheries. IATTC Bull. 1: 25–56.
Schnute, J.T., and Richards, L.J. 1995. The influence of error on population estimates from catch-age models. Can. J. Fish. Aquat. Sci. 52: 2063–2077.