Assessment of input data selection methods for BOD simulation using data-driven models: a case study

  • Azadeh Ahmadi
  • Zahra Fatemi
  • Sara Nazari


Using multivariate statistical methods, this study interprets a data set of 23 water quality parameters collected over 5 years at 10 quality monitoring stations on the Karkheh River in southwestern Iran. Cluster analysis classifies the stations into three quality classes, and factor analysis identifies the most important factors for the whole set of parameters and for each class. The results indicate that natural factors, soil weathering and erosion, urban and human wastewater, and agricultural and industrial wastewater affect water quality to varying degrees at each location. Five input selection methods are then used for modeling biochemical oxygen demand (BOD): a correlation model, principal component analysis, gamma test combined with backward regression, gamma test combined with a genetic algorithm, and gamma test by the elimination method. Their efficiency is evaluated in simulating BOD with local linear regression, an artificial neural network, and genetic programming. Of the five input selection methods, gamma test with backward regression is the best for simulation by local linear regression (RMSE of 0.27); gamma test based on the genetic algorithm is the best for simulation by the artificial neural network (RMSE of 0.28) and is also the most appropriate for simulation with genetic programming (RMSE of 0.1303).
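Three of the five input selection methods named above build on the gamma test, which estimates the noise variance of an output from near-neighbour statistics of the candidate inputs. The following is a minimal sketch of the gamma test used in an elimination-style comparison. It is not the authors' implementation: it runs on synthetic data rather than the Karkheh River measurements, and the function name `gamma_test`, the neighbour count `p`, and the test function are all illustrative assumptions.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Return (Gamma, slope): the estimated noise variance of y given X and the fitted slope."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise squared Euclidean distances between input vectors.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    # Indices of the p nearest neighbours of every point, nearest first.
    nn = np.argsort(d2, axis=1)[:, :p]
    rows = np.arange(len(X))[:, None]
    # delta(k): mean squared input distance to the k-th nearest neighbour.
    delta = d2[rows, nn].mean(axis=0)
    # gamma(k): half the mean squared output difference to the k-th neighbour.
    gamma = 0.5 * ((y[nn] - y[:, None]) ** 2).mean(axis=0)
    # Fit gamma = slope * delta + Gamma; the intercept is the gamma statistic.
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept, slope


# Synthetic example (assumption): y depends on columns 0 and 1; column 2 is irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(600, 3))
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=600)

gamma_full, _ = gamma_test(X, y)

# Elimination: drop one input at a time and recompute the gamma statistic.
gamma_without = []
for j in range(X.shape[1]):
    g, _ = gamma_test(np.delete(X, j, axis=1), y)
    gamma_without.append(g)
    print(f"without input {j}: Gamma = {g:.4f}")
print(f"all inputs:      Gamma = {gamma_full:.4f}")
```

In this sketch, removing an informative input should raise the gamma statistic noticeably, while removing an irrelevant one should leave it close to the value obtained with all inputs. That contrast is what an elimination-based selection relies on.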


Surface water quality; Factor analysis; Principal component analysis; Gamma test; Genetic programming; Karkheh River


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Civil Engineering, Isfahan University of Technology, Isfahan, Iran
