Pre-processing and Input Vector Selection Techniques in Computational Soft Computing Models of Water Engineering

Computational Intelligence for Water and Environmental Sciences

Part of the book series: Studies in Computational Intelligence ((SCI,volume 1043))

Abstract

Input feature selection plays a crucial role in predictive soft computing models. This chapter explores appropriate pre-processing techniques and input vector selection methods for such models. Four pre-processing techniques, namely principal component analysis (PCA), the Boruta feature selection (BFS) algorithm, the gamma test (GT), and subset selection by maximum dissimilarity (SSMD), are introduced and applied to bedload transport prediction as a test case. The results highlight the effectiveness of pre-processing, input variable selection, and identification of the dominant input features, and provide a practical reference for soft computing model development.
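
Of the four techniques named in the abstract, the gamma test is the simplest to illustrate without external libraries. The sketch below follows the commonly described near-neighbour formulation: for each neighbour rank p, average the squared input distance (delta) and half the squared output difference (gamma), then regress gamma on delta and read the intercept as an estimate of the output noise variance. It is not the chapter's implementation, which is not visible in this preview; the function name `gamma_test`, the parameter `p_max`, and the toy data are assumptions made for illustration.

```python
def gamma_test(X, y, p_max=10):
    """Near-neighbour gamma test (illustrative sketch).

    X: list of input vectors (lists of floats); y: list of float outputs.
    Returns (intercept, slope) of the least-squares line gamma = a + b * delta.
    The intercept is the gamma statistic, an estimate of output noise variance.
    """
    n = len(X)
    p_max = min(p_max, n - 1)
    deltas = [0.0] * p_max  # mean squared input distance to the p-th neighbour
    gammas = [0.0] * p_max  # mean half squared output difference to that neighbour
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = [
            (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
            for j in range(n)
            if j != i
        ]
        dists.sort()  # nearest neighbour first
        for p in range(p_max):
            d2, j = dists[p]
            deltas[p] += d2 / n
            gammas[p] += 0.5 * (y[j] - y[i]) ** 2 / n
    # Ordinary least-squares fit of gamma against delta.
    mean_d = sum(deltas) / p_max
    mean_g = sum(gammas) / p_max
    sxx = sum((d - mean_d) ** 2 for d in deltas)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(deltas, gammas))
    slope = sxy / sxx
    intercept = mean_g - slope * mean_d
    return intercept, slope


# Toy data (assumed): a noise-free linear relation y = 2x on a regular grid,
# so the estimated noise variance should be close to zero.
X_demo = [[i / 20] for i in range(21)]
y_demo = [2 * x[0] for x in X_demo]
intercept, slope = gamma_test(X_demo, y_demo)
print(intercept, slope)
```

In input selection, a routine like this would typically be run on each candidate subset of inputs, with the subset giving the smallest gamma statistic preferred. That usage is a general description of the method rather than a statement of the procedure followed in the chapter.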

Author information

Corresponding author

Correspondence to Hossien Riahi-Madvar.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Riahi-Madvar, H., Gharabaghi, B. (2022). Pre-processing and Input Vector Selection Techniques in Computational Soft Computing Models of Water Engineering. In: Bozorg-Haddad, O., Zolghadr-Asli, B. (eds) Computational Intelligence for Water and Environmental Sciences. Studies in Computational Intelligence, vol 1043. Springer, Singapore. https://doi.org/10.1007/978-981-19-2519-1_20
