Modeling and comparison of count data containing zero values: a case study of Setipinna taty in the south inshore of Zhejiang, China

  • Review Article
Environmental Science and Pollution Research

Abstract

To make effective use of fishery count data containing zero values, this study used data on Setipinna taty collected in the south inshore waters of Zhejiang, China, from 2017 to 2019. Environmental factors such as water temperature, water depth, and salinity were selected to build and compare four models: a generalized additive model (GAM) based on the Tweedie distribution (Tweedie-GAM), a two-stage GAM, an ad hoc method, and a generalized additive mixed model (GAMM). The results showed that zero values made up a high proportion of the records at each station. The two-stage GAM had a higher deviance explained than the other models, with GAM I and GAM II explaining 19.6% and 60.4% of the deviance, respectively. Cross-validation showed that the two-stage GAM performed best, with the highest R² value, the lowest mean absolute error, and a relatively small root mean square error. The abundance of S. taty in the south inshore of Zhejiang was highest at around 21°C and 18°C in spring and autumn, and peaked at a water depth of about 20 m. Spatially, high abundance values of S. taty were mostly found in the coastal waters south of 28°N. In future research, models should be fitted and compared across different proportions of zero values in the samples, and more environmental factors should be included, in order to identify an optimal model more accurately and to provide a reference for the conservation of fishery resources.
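The abstract does not give formulas, so the following is only a minimal sketch of how a two-stage prediction and the three cross-validation metrics are commonly computed. It assumes that the first stage (GAM I) yields a probability of presence and the second stage (GAM II) yields an abundance conditional on presence, and that R² is taken as the squared Pearson correlation between observed and predicted values. The function names and all numbers are hypothetical and are not taken from the study.

```python
import math


def two_stage_prediction(p_presence, abundance_if_present):
    """Expected abundance = P(presence) * E[abundance | presence] (assumed combination rule)."""
    return [p * mu for p, mu in zip(p_presence, abundance_if_present)]


def mean_absolute_error(observed, predicted):
    """Average of the absolute differences between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the average squared difference."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


def r_squared(observed, predicted):
    """Squared Pearson correlation (one common definition of R²; an assumption here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical held-out stations: observed counts contain many zeros.
observed = [0.0, 0.0, 3.0, 0.0, 8.0]
p_presence = [0.1, 0.2, 0.7, 0.1, 0.9]            # stand-in for GAM I output
abundance_if_present = [2.0, 2.5, 4.0, 1.0, 8.0]  # stand-in for GAM II output

predicted = two_stage_prediction(p_presence, abundance_if_present)
print("MAE :", mean_absolute_error(observed, predicted))
print("RMSE:", root_mean_square_error(observed, predicted))
print("R2  :", r_squared(observed, predicted))
```

In a cross-validation setting, these metrics would be computed on stations withheld from model fitting and then averaged over the folds. A higher R² and lower MAE and RMSE indicate better predictive performance.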

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

We would like to thank the teachers and students from the laboratory of Shanghai Ocean University and Zhejiang Mariculture Research Institute for their work and help in sample collection and biological analysis and for the valuable comments on the revision of the paper.

Funding

This work was funded by the National Natural Science Foundation of China (31902372) and Zhejiang Mariculture Research Institute of China (325000).

Author information

Contributions

XL contributed to the conception of the study, performed the data analyses, and wrote the manuscript; CG contributed to the conception of the study and provided ideas; JZ contributed significantly to the analysis and manuscript preparation; ST helped perform the analysis through constructive discussions; SY helped perform the analysis through constructive discussions and financially supported this work; JM analyzed the existing literature and contributed substantially to the revision of the paper; all authors read and approved the final manuscript.

Corresponding author

Correspondence to Jin Ma.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent to publish

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Responsible Editor: Marcus Schulz

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, X., Gao, C., Zhao, J. et al. Modeling and comparison of count data containing zero values: a case study of Setipinna taty in the south inshore of Zhejiang, China. Environ Sci Pollut Res 28, 46827–46837 (2021). https://doi.org/10.1007/s11356-021-13440-5

  • DOI: https://doi.org/10.1007/s11356-021-13440-5
