Statistical comparison between SARIMA and ANN’s performance for surface water quality time series prediction

Abstract

Studies comparing the autoregressive integrated moving average (ARIMA) model with the artificial neural network (ANN) have mostly compared a few model structures selected by trial and error, so their conclusions are strongly influenced by model structure uncertainty. This research aims to address that shortcoming. First, a surface water quality prediction case study covering eight monitoring sites in China is introduced. Second, the performance of ARIMA and ANN is compared statistically across 6,912 seasonal ARIMA (SARIMA) models and 110,592 feedforward ANN models with different structures, based on the mean square error (MSE) distributions depicted by boxplots. Statistically, the ANN models obtained a significantly lower median and a more concentrated distribution of validation MSEs, which indicates less overfitting and better generalization ability. Furthermore, in the case study even the optimal SARIMA models performed worse than the median ANN model. Compared with previous comparisons among selected models, the statistical comparison in this study is subject to lower uncertainty.
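The kind of distribution-level comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`mse`, `box_summary`, `compare_families`) are hypothetical, and the MSE values are synthetic numbers invented for the example, not results from the study. It assumes each model structure in a family has already been fitted and scored by its validation MSE.

```python
import random
import statistics


def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def box_summary(values):
    """Statistics a boxplot displays: quartiles, median, extremes, IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }


def compare_families(mse_a, mse_b):
    """Compare two families of models by their validation-MSE distributions.

    Each list holds one validation MSE per model structure in the family.
    """
    a, b = box_summary(mse_a), box_summary(mse_b)
    return {
        "a": a,
        "b": b,
        # Family b is typically more accurate if its median MSE is lower.
        "b_has_lower_median": b["median"] < a["median"],
        # A smaller interquartile range means a more concentrated distribution.
        "b_is_more_concentrated": b["iqr"] < a["iqr"],
        # Even the best model of family a is worse than the median of family b.
        "best_a_worse_than_median_b": a["min"] > b["median"],
    }


# Synthetic, made-up MSE values purely for illustration (assumption):
# 200 hypothetical structures per family, drawn from arbitrary ranges.
rng = random.Random(0)
family_a_mse = [rng.uniform(0.8, 2.0) for _ in range(200)]
family_b_mse = [rng.uniform(0.3, 0.7) for _ in range(200)]

result = compare_families(family_a_mse, family_b_mse)
print(result["b_has_lower_median"], result["best_a_worse_than_median_b"])
```

Summarizing every structure in a family, rather than one hand-picked model per family, is what makes the comparison less sensitive to which structure happened to be chosen.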

Data availability

The datasets and code are available in our GitHub repository: https://github.com/MrBrenda/WaterResourcesFNNModels.git.

Acknowledgements

This study was financially supported by the Major Science and Technology Project of Water Pollution Control and Management in China (grant no. 2018ZX07208006) and the National Natural Science Foundation of China (grant no. 51778451). We also thank the 111 Project (B13017) of Tongji University.

Funding

Major Science and Technology Project of Water Pollution Control and Management in China (grant no. 2018ZX07208006). National Natural Science Foundation of China (grant no. 51778451). 111 Project (B13017) of Tongji University.

Author information

Contributions

Xuan Wang: conceptualization, methodology, software, data curation, writing (original draft preparation).

Wenchong Tian: methodology, investigation, writing (review and editing).

Zhenliang Liao: supervision, writing (review and editing).

Corresponding author

Correspondence to Zhenliang Liao.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Responsible Editor: Xianliang Yi

Supplementary Information

ESM 1

(DOCX 1090 kb).

About this article

Cite this article

Wang, X., Tian, W. & Liao, Z. Statistical comparison between SARIMA and ANN’s performance for surface water quality time series prediction. Environ Sci Pollut Res (2021). https://doi.org/10.1007/s11356-021-13086-3

Keywords

  • ANN
  • ARIMA
  • Surface water quality
  • Time series prediction
  • Statistical comparison
  • Grid sampling