A numerical simulation is used to evaluate the accuracy and precision of the annual trends in CPUE estimates for each method. Robustness and limitations of each method are also investigated with consideration given to target change over years, multispecies abundance trends, and different catchability by operational areas (e.g., eastern and western areas). Figure 1 provides a flowchart of the methodology of the numerical simulation.
Definition of target strategy
Catch data are generated by mimicking actual Japanese longline fishery data at Kesennuma, on the Pacific coast in northeastern Japan. The main targets of this fishery are swordfish, blue shark, and tunas such as bigeye Thunnus obesus and albacore. The gear configurations such as line materials and depth of hooks are tailored to the target species (Hiraoka et al. 2016). It is therefore assumed that the longline fishery frequently changes the target strategy and that the target species are mutually exclusive. Based on these assumptions, the fishing operations at this fishery are grouped into three target strategies:
1. Strategy I: Target species is Species 1 (i.e. blue shark) only,
2. Strategy II: Target species is Species 2 (i.e. swordfish) only,
3. Strategy III: Target species is neither Species 1 nor 2.
The proportions of the three target strategies in year i (\(p_{i,\text{Strategy I}}\), \(p_{i,\text{Strategy II}}\), and \(p_{i,\text{Strategy III}}\); \(\sum_{\theta} p_{i,\theta} = 1\), \(\theta \in \{\text{Strategy I}, \text{Strategy II}, \text{Strategy III}\}\)) are based on skipper records of the longline fishery in Kesennuma between 2004 and 2008. The target strategy of each operation is assigned by Monte Carlo sampling from a multinomial distribution with probabilities \(p_{i,\text{Strategy I}}\), \(p_{i,\text{Strategy II}}\), and \(p_{i,\text{Strategy III}}\). The annual transition of target species is expressed using a linear equation:
$$p_{i,\text{Strategy I}} = a + b\,\text{Year}_{i},$$
(1)
where a and b are the fixed intercept and slope, respectively. Based on the skipper records, the first-year value \(p_{1,\text{Strategy I}}\) is arbitrarily set to 1.5 times the average ratio of operations targeting blue shark (0.514 \(\times\) 1.5 = 0.772), and the value decreases linearly from 0.772 to 0.257 over the 10 years, i.e., a = 0.828 and b = −0.057. The major target strategy therefore shifts gradually from Strategy I to Strategy II over the years (Fig. 2). The ratio of Strategy III, \(p_{i,\text{Strategy III}}\), is fixed at 0.100 over the 10 years across all scenarios (Fig. 2). The ratio of Strategy II is given as \(p_{i,\text{Strategy II}} = 1 - (p_{i,\text{Strategy I}} + p_{i,\text{Strategy III}})\).
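The strategy proportions and the per-operation assignment can be sketched as follows. This is a minimal illustration using only the values of a, b, and the Strategy III ratio stated in the text; the function names, the seed, and the use of Python's `random.choices` for the multinomial draw are assumptions made for this example, not the authors' implementation.

```python
import random

A, B = 0.828, -0.057   # intercept and slope of Eq. 1, as stated in the text
P_III = 0.100          # fixed proportion of Strategy III

def strategy_proportions(year):
    """Return (p_I, p_II, p_III) for year = 1..10."""
    p1 = A + B * year
    p2 = 1.0 - (p1 + P_III)   # Strategy II takes the remainder
    return p1, p2, P_III

def assign_strategies(year, n_ops, rng):
    """Draw one target strategy per operation (one multinomial trial each)."""
    weights = strategy_proportions(year)
    return rng.choices(["I", "II", "III"], weights=weights, k=n_ops)

rng = random.Random(1)  # arbitrary seed, for reproducibility only
props = {year: strategy_proportions(year) for year in range(1, 11)}
draws = assign_strategies(1, 10_000, rng)
share_I = draws.count("I") / len(draws)  # should be close to p_I in year 1
```

With these coefficients, the Strategy I proportion is about 0.771 in year 1 and 0.258 in year 10, which matches the stated range up to rounding.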
Data generation
The annual abundance trends over the 10-year period are generated using the following equation:
$$\begin{gathered} N_{i + 1, \rho } = N_{i, \rho } \times e^{{r_{\rho } }} , \hfill \\ i \in \left\{ {1,2, \ldots , 9} \right\}, \hfill \\ \rho \in { }\left\{ {{\text{Sp}}1,{\text{ Sp}}2} \right\} \hfill \\ \end{gathered}$$
(2)
where \(N_{i, \rho }\) is the abundance of Species 1 or 2 in year i, and \(r_{\rho }\) is the annual rate of change in abundance. Two scenarios of annual abundance trends (i.e. increasing or decreasing) are assumed for each of Species 1 and 2. \(N_{10, \rho }\) is set to be 30% larger than \(N_{1, \rho }\) in the increasing scenario and 30% smaller than \(N_{1, \rho }\) in the decreasing scenario. The ratio of 30% is given in accordance with the actual minimum and maximum ranges of abundance estimates for swordfish and blue shark in the North Pacific (Brodziak and Ishimura 2011; Hiraoka et al. 2016). Combining the two scenarios for the two species gives a total of four scenarios (i.e. increasing and increasing, increasing and decreasing, decreasing and increasing, and decreasing and decreasing) (Fig. 3).
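Because Eq. 2 is applied nine times between year 1 and year 10, the stated 30% change implies \(r_{\rho } = \ln (1.3)/9\) or \(\ln (0.7)/9\). The sketch below illustrates this; the starting abundance of 1.0 (a relative scale) and the function name are assumptions for the example, since the text does not give absolute values of \(N_{1, \rho }\).

```python
import math

def abundance_series(n1, ratio_10_to_1, n_years=10):
    """Apply N[i+1] = N[i] * exp(r) so that N[10] / N[1] equals the ratio."""
    r = math.log(ratio_10_to_1) / (n_years - 1)
    series = [n1]
    for _ in range(n_years - 1):
        series.append(series[-1] * math.exp(r))
    return r, series

# N_1 = 1.0 is a hypothetical relative scale, not a value from the text.
r_inc, n_inc = abundance_series(1.0, 1.3)   # increasing scenario
r_dec, n_dec = abundance_series(1.0, 0.7)   # decreasing scenario

# The four combinations of trends for (Species 1, Species 2).
trends = ("increasing", "decreasing")
scenarios = [(sp1, sp2) for sp1 in trends for sp2 in trends]
```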
The mean catch amount (\(\mu_{i,\text{Sp1}}\), \(\mu_{i,\text{Sp2}}\)) is given using the following general relationship (Maunder and Punt 2004):
$$\begin{gathered} \mu_{i, \rho ,\kappa ,\alpha } = q_{\rho } z_{\kappa ,\rho } D_{\alpha ,\rho } N_{i,\rho } E, \hfill \\ i \in \left\{ {1,2, \ldots , 10} \right\}, \hfill \\ \rho \in \left\{ {{\text{Sp}}1,{\text{ Sp}}2} \right\}, \hfill \\ \kappa \in \left\{ {{\text{tar}},{\text{ non}}} \right\}, \hfill \\ \alpha \in \left\{ {{\text{Area}}1,{\text{ Area}}2} \right\} \hfill \\ \end{gathered}$$
(3)
where \(q_{\rho }\) is the catchability coefficient per hook with neither target effect nor area effect, \(z_{\kappa ,\rho }\) represents the effect of targeting behavior on \(q_{\rho }\), \(D_{\alpha ,\rho }\) is the relative density by area, and E is the fishing effort. Descriptions of the notations are shown in Table 1. The value of \(q_{\text{Sp1}}\) is set to 0.004 and that of \(q_{\text{Sp2}}\) to 0.002. These values are based on data from longline fisheries that mainly catch blue shark and swordfish (Hiraoka et al. 2016; Brodziak and Ishimura 2011).
Table 1 Descriptions of the notations used in the simulation study
The subscript \(\kappa\) (tar or non) indicates whether the species is targeted or not targeted, respectively. The values of \(z_{\text{tar},\text{Sp1}}\) and \(z_{\text{tar},\text{Sp2}}\) are set to 1, and those of \(z_{\text{non},\text{Sp1}}\) and \(z_{\text{non},\text{Sp2}}\) are set to 0.2.
The subscript \(\alpha\) indicates the fishing area (Area1 or Area2), and \(D_{\alpha ,\rho }\) is the relative density in that area. The area in which each operation is conducted is determined by a Bernoulli trial with success probability \(v_{i}\), the average proportion of operations in Area1 in year i, so that \(D_{\alpha ,\rho }\) takes the value of the assigned area. The value of \(v_{i}\) varies over the years to simulate the yearly transition of fishing areas:
$$v_{i} = c + d{\text{Year}}_{i} ,$$
(4)
where c and d are the intercept and slope, respectively. We used three scenarios of area effect (A, B, and C). In all three scenarios, the major target strategy shifts from Strategy I to Strategy II over the years, and the ratio of Strategy III is fixed over years. In Scenario A, we assume no area effect, i.e. \(D_{\text{Area1}, \rho } = 1\) and \(D_{\text{Area2}, \rho } = 1\). In Scenarios B and C, we assume \(D_{\text{Area1}, \rho } = 0.5\) and \(D_{\text{Area2}, \rho } = 1\). In Scenario B, the value of \(v_{i}\) is fixed at 0.5 over years, i.e. c = 0.5 and d = 0. In Scenario C, the value of \(v_{i}\) increases on a yearly basis from 0.1 to 0.9, i.e. c = 0.011 and d = 0.089. The difference between the variables \(z_{\kappa ,\rho }\) and \(D_{\alpha ,\rho }\) is that the former is an unobserved (latent) variable, whereas the latter is an observed variable.
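The three area-effect scenarios and the Bernoulli assignment of operations to areas can be sketched as follows. The dictionary layout, function names, and seed are assumptions for illustration. The values of c and d for Scenario A are also an assumption, because the text does not state them; they have no influence on the mean catch when both densities equal 1.

```python
import random

# c, d, and relative densities per scenario. B and C follow the text;
# the c and d of Scenario A are placeholders (assumption).
AREA_SCENARIOS = {
    "A": {"c": 0.5, "d": 0.0, "density": {"Area1": 1.0, "Area2": 1.0}},
    "B": {"c": 0.5, "d": 0.0, "density": {"Area1": 0.5, "Area2": 1.0}},
    "C": {"c": 0.011, "d": 0.089, "density": {"Area1": 0.5, "Area2": 1.0}},
}

def area1_probability(scenario, year):
    """Eq. 4: v_i = c + d * Year_i, the probability of operating in Area1."""
    cfg = AREA_SCENARIOS[scenario]
    return cfg["c"] + cfg["d"] * year

def assign_area(scenario, year, rng):
    """One Bernoulli trial: Area1 with probability v_i, otherwise Area2."""
    return "Area1" if rng.random() < area1_probability(scenario, year) else "Area2"

rng = random.Random(2)  # arbitrary seed
areas = [assign_area("C", 10, rng) for _ in range(10_000)]
share_area1 = areas.count("Area1") / len(areas)  # close to v_10 in Scenario C
```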
The values of \(\mu_{i, \rho ,\kappa ,\alpha }\) per operation used in each abundance scenario are shown in Fig. 4. The fishing effort E is fixed at 2000 hooks for all operations.
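As a worked illustration of Eq. 3, the sketch below multiplies the stated parameter values. The abundance argument is expressed on a hypothetical relative scale (N = 1), because the text does not give absolute abundances; the resulting means are therefore illustrative and are not the values plotted in Fig. 4.

```python
Q = {"Sp1": 0.004, "Sp2": 0.002}   # catchability per hook (from the text)
Z = {"tar": 1.0, "non": 0.2}       # targeting effect (from the text)
E = 2000                           # hooks per operation (from the text)

def mean_catch(species, targeted, density, abundance):
    """Eq. 3: mu = q * z * D * N * E for one operation."""
    z = Z["tar"] if targeted else Z["non"]
    return Q[species] * z * density * abundance * E

# abundance=1.0 is an assumed relative scale, not a value from the text.
mu_sp1_target = mean_catch("Sp1", True, 1.0, 1.0)       # targeted, no area effect
mu_sp2_bycatch = mean_catch("Sp2", False, 0.5, 1.0)     # not targeted, Area1 density 0.5
```

Under this assumed scale, a targeted Species 1 operation has a mean catch of about 8 fish, and a non-targeted Species 2 operation in the low-density area about 0.4 fish.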
The catch amount “Catch” of operation s is generated as a random variable following a Poisson distribution with mean \(\mu_{i, \rho ,\kappa ,\alpha }\):
$$\begin{gathered} {\text{Catch}}_{s,i, \rho ,\kappa ,\alpha } \sim {\text{Poisson}}\left( {\mu_{i, \rho ,\kappa ,\alpha } } \right) \hfill \\ s = 1,2, \ldots , 10000 \hfill \\ \end{gathered}$$
(5)
The annual sample size (i.e. the number of fishing operations) is set to 10,000. The simulation is repeated 100 times for each combination of four scenarios of abundance and three scenarios of area effect.
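The Poisson draw in Eq. 5 can be sketched with the standard library alone. The sampler below uses Knuth's multiplication method, which is an implementation choice for this example and suits the small means involved here; the mean of 8.0 and the seed are illustrative assumptions.

```python
import math
import random

def poisson_draw(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method."""
    limit = math.exp(-mu)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(3)      # arbitrary seed
mu = 8.0                    # hypothetical mean catch for one stratum
n_operations = 10_000       # annual sample size stated in the text
catches = [poisson_draw(mu, rng) for _ in range(n_operations)]
sample_mean = sum(catches) / n_operations
```

In the full simulation, the mean would differ between operations according to the assigned target strategy, area, year, and species, and the whole procedure would be repeated 100 times per scenario combination.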
Model structure for CPUE standardization
The eight candidate CPUE standardization methods and the two control scenarios used for comparison are summarized in Table 2. Each method is described below:
Table 2 Summary of CPUE standardization model structures with Poisson and lognormal error distribution
(i) Nominal CPUE (Nominal)
The CPUE is calculated without considering target effect.
(ii) Nominal CPUE with positive catch (Nominal-P)
The CPUE is calculated using only positive catches. It is assumed that the catchability coefficient and the catch number for a target species are both greater than zero. The rationale is that, if the positive-catch records represent a homogeneous target strategy, the CPUEs estimated from them would not be affected by changes in target strategy. Nominal-P is applied in actual stock assessments in Japan, e.g., in the CPUE analyses for Japanese bottom-trawl fisheries targeting Alaska pollock Gadus chalcogrammus (Hamatsu et al. 2017) and brown sole Pseudopleuronectes herzensteini (Yamashita et al. 2019) in the Pacific Ocean.
(iii) Two-step delta-lognormal method (Delta)
One of the conventional methods to estimate CPUE is the two-step delta-lognormal method (Lo et al. 1992). This model is commonly used when most of the errors are effectively modeled with a lognormal distribution, although the catches include zeros. This model is used in particular to address catch data containing a large proportion of zeros due to target effect, i.e. the species of interest was not targeted. This approach is generally applied to actual stock assessments, e.g., Pacific bluefin tuna Thunnus orientalis (Ichinokawa et al. 2014), albacore (ISC 2017), and blue marlin Makaira nigricans (Forrestal et al. 2019a).
In this analysis, we applied the two-step delta-lognormal method (Lo et al. 1992) to catch data generated from a Poisson distribution and evaluated its performance. The proportion of positive catches was modeled using a binomial GLM with a logit link function, and the positive CPUEs were modeled using a normal linear model for the log-transformed positive CPUE.
(iv) Ranking method (RM)
The CPUE is standardized using RM in a GLM with a Poisson error distribution and the main effects of year and rank. The RM divides the generated data into 10 groups at each 10th percentile of the nominal CPUE for Species 2 (Hiraoka et al. 2016). Rank 1 contains the operations with the lowest nominal CPUE for Species 2, whereas group number “10” contains those with the highest, indicating that Species 2 is the main target species. Rank is modeled as a categorical variable and is treated in the same manner in all subsequent methods using rank. Method (v), RM-intrc, below, estimates the mean CPUE of all ranks, whereas method (vi), RM-intrc-Rank1, estimates only the mean CPUE of Rank 1.
(v) Ranking method with the interaction term (RM-intrc)
The CPUE is standardized using RM in a GLM with the interaction term between rank and year.
(vi) Estimating the mean CPUE of Rank 1 using RM-intrc (RM-intrc-Rank1)
The nominal CPUE at Rank 1 for Species 2 (i.e. the group with the lowest nominal CPUE for Species 2) is standardized using the GLM with the output of RM-intrc. The operations of Rank 1 theoretically contain a large proportion of operations that mainly target Species 1. The mean CPUE is expected to be robust, because the catchability coefficients for Species 1 are relatively high and less variable in operations targeting Species 1, irrespective of changes in abundance.
(vii) Directed residual mixture (DRM) model
The CPUE is estimated using the DRM method (Okamura et al. 2017). To deal with the zeros, we used an additive constant, 0–13 (Appendix F in ESM.txt). The explanatory variable (i.e. year) was treated as a continuous variable, unlike the categorical variables of the other models (i.e. year and target strategies in RM and FMM, and control models CP and CL described below). The regression model, which is a three-component mixture model similar to the FMM, was implemented for the catches of Species 1 and 2.
(viii) Finite mixture model (FMM)
The CPUE is estimated using the FMM, which directly estimates the target strategy from a dataset containing multiple species (Cosgrove et al. 2014). The regression model, which is a three-component mixture model, is used to estimate the CPUE (i.e. number of fish in the catch) of Species 1 and 2, and to separate the components into the three target strategies. Details are described in Online Appendix A. The FMM is carried out using R version 3.3.1 (R Core Team 2018) and “flexmix” package version 2.3–13 (Leisch 2004; Grün and Leisch 2007, 2008).
(ix) Complete data with Poisson model (CP)
CP is one of two control scenarios we used for comparison. The CPUE is estimated using a GLM with a Poisson distribution for the complete dataset (target strategy is known). Although such a dataset is unrealistic, comparing the results of this method with those of other methods enables us to effectively show the degree of bias and variance that the other methods would produce.
(x) Complete data with lognormal model (CL)
CL is the other control scenario. Here, the CPUE is estimated using a GLM with a lognormal distribution for the complete dataset. To deal with the zeros, we again used an additive constant, 0–13 (Appendix F in ESM.txt). As with the CP model, we used this method to examine and validate the suitability of the assumption for lognormality inherent in the DRM method (Okamura et al. 2017) in the modeling of count data with low catch numbers.
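The two nominal indices (methods i and ii) and the decile grouping used by the ranking methods (methods iv to vi) can be illustrated with a short sketch. Several details are assumptions for this example: CPUE is expressed as catch per hook, ties in the ranking are broken by sort order, and the ranking is applied to a single pool of operations. The text does not specify these points, and the toy data are hypothetical.

```python
def nominal_cpue(catches, hooks):
    """Total catch divided by total effort over all operations."""
    return sum(catches) / sum(hooks)

def nominal_cpue_positive(catches, hooks):
    """Same ratio, restricted to operations with a positive catch."""
    kept = [(c, h) for c, h in zip(catches, hooks) if c > 0]
    return sum(c for c, _ in kept) / sum(h for _, h in kept)

def decile_ranks(cpue_sp2):
    """Assign ranks 1..10 by ascending nominal CPUE of Species 2.

    Rank 1 holds the lowest tenth of the values, rank 10 the highest.
    """
    n = len(cpue_sp2)
    order = sorted(range(n), key=lambda i: cpue_sp2[i])
    ranks = [0] * n
    for position, index in enumerate(order):
        ranks[index] = 1 + (position * 10) // n
    return ranks

# Hypothetical toy data: ten operations of 2000 hooks each.
sp1_catch = [9, 0, 7, 1, 8, 0, 10, 2, 6, 0]
hooks = [2000] * 10
cpue_all = nominal_cpue(sp1_catch, hooks)
cpue_pos = nominal_cpue_positive(sp1_catch, hooks)
ranks = decile_ranks([0.0, 0.5, 0.1, 0.9, 0.3, 0.2, 0.8, 0.4, 0.7, 0.6])
```

The rank returned by `decile_ranks` would then enter the GLM as a categorical covariate; fitting that GLM is outside the scope of this sketch.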
The least squares mean (LSMEAN; SAS Institute Inc. 2009) was calculated to estimate the annual CPUE for each method. For Delta, the annual CPUE was calculated as the product of the predicted year effects (i.e. LSMEAN) from the binomial and lognormal components.
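The product used for the Delta index can be sketched for the simplest case in which year is the only explanatory variable. In that case the fitted year effect of the binomial component equals the observed proportion of positive catches, and the fitted year effect of the normal linear model equals the mean of the log-transformed positive CPUE. The sketch applies no lognormal bias correction, which is a simplification of this example; the function name and toy data are hypothetical.

```python
import math

def delta_index(cpue_by_year):
    """Year index = P(positive catch) * exp(mean log positive CPUE)."""
    index = {}
    for year, cpues in cpue_by_year.items():
        positives = [c for c in cpues if c > 0]
        prop_positive = len(positives) / len(cpues)
        mean_log = sum(math.log(c) for c in positives) / len(positives)
        index[year] = prop_positive * math.exp(mean_log)
    return index

# Hypothetical CPUE values (each year needs at least one positive record).
toy = {1: [0.0, 2.0, 8.0, 0.0], 2: [1.0, 1.0, 0.0, 0.0]}
delta = delta_index(toy)
```

For year 1 in the toy data, half of the records are positive and the geometric mean of the positive values is 4, so the index is 2.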
Model evaluation and comparison
In the performance evaluation of the 10 models, the annual trend residual (ATR) was used to represent variance, bias, and hyper-stability (i.e. an estimated annual trend that is flatter than the actual trend) in the annual trends of the abundance estimates. The ATR is defined as:
$${\text{ATR}}_{i} = \left( {{\text{log}}\left( {\frac{{{\text{CPUE}}_{i} }}{{{\text{CPUE}}_{1} }}} \right) - {\text{log}}\left( {\frac{{N_{i} }}{{N_{1} }}} \right)} \right)/\left( {i - 1} \right),$$
(6)
where the first and second terms indicate the estimated and true annual trends in stock abundance, respectively.
A zero value of ATR indicates that the estimated annual trend is unbiased, while a positive or negative value indicates the direction of the bias. The whiskers in the box plots are visual illustrations of variance, a measure of the precision of the estimated trend. The calculation of ATR includes just one species, Sp1.
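Eq. 6 can be sketched as a small function. The natural logarithm is assumed here because the equation does not state the base, and the ATR is computed only for years 2 and later, since the denominator is zero for the first year. The function name and the example series are hypothetical.

```python
import math

def annual_trend_residual(cpue, abundance, i):
    """Eq. 6 for year i (1-based), using natural logs (assumption)."""
    if i < 2:
        raise ValueError("ATR is undefined for the first year (i - 1 = 0)")
    estimated = math.log(cpue[i - 1] / cpue[0])
    true = math.log(abundance[i - 1] / abundance[0])
    return (estimated - true) / (i - 1)

# Hypothetical series: abundance grows 10% per year.
abundance = [1.0, 1.1, 1.21]
proportional_cpue = [2.0, 2.2, 2.42]   # tracks abundance exactly
flat_cpue = [2.0, 2.0, 2.0]            # hyper-stable: no trend detected

atr_unbiased = annual_trend_residual(proportional_cpue, abundance, 3)
atr_flat = annual_trend_residual(flat_cpue, abundance, 3)
```

A CPUE series proportional to abundance gives an ATR of zero, whereas the flat series gives a negative ATR of about −0.095 per year, the sign expected when an increasing trend is underestimated.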