Groundwater Potential Mapping in Hubei Region of China Using Machine Learning, Ensemble Learning, Deep Learning and AutoML Methods

  • Original Paper
Natural Resources Research

Abstract

Freshwater scarcity has become more widespread globally in recent years. Surface water resources are no longer sufficient to meet the demands of human productivity and survival, and groundwater resources are now being utilized extensively. Exploring potential groundwater resources is therefore critical for their rational development and utilization. In this study, four categories of methods (machine learning, ensemble learning, deep learning, and automated machine learning (AutoML)) were selected, and a representative model of each was used for comprehensive groundwater potential mapping (GPM) in Hubei Province, China. In total, 812 samples were collected; about 80% of them were selected randomly as the training data set and the remaining 20% as the test data set. Considering local hydrological, geological and climatic conditions, the following 15 factors were selected: slope, elevation, curvature, landforms, geology, distance to fault, land type, soils, precipitation, evaporation, topographic wetness index, stream power index, distance to rivers, normalized difference vegetation index, and distance to residential area. The four models were validated using the receiver operating characteristic (ROC) area under the curve (AUC) and classification reports. The SHAP value of each factor was calculated as a measure of its contribution to groundwater potential. The ROC–AUC values of random forest, Stacking, convolutional neural network, and AutoML were 0.82, 0.85, 0.87 and 0.88, and the precision values were 0.793, 0.784, 0.807 and 0.844, respectively. These results indicate that AutoML, applied to GPM for the first time in this study, was feasible and had the best predictive ability and accuracy of the four methods. In addition, the SHAP values indicated that geology and precipitation were the two factors with the greatest influence on groundwater potential. The results can provide technical support to the local government for groundwater exploration and development in Hubei Province, and the newly introduced AutoML method can offer experts and scholars new ideas for groundwater potential assessment.
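The evaluation protocol summarized in the abstract (a random split of roughly 80%/20%, then ROC–AUC and precision on the held-out samples) can be illustrated with a minimal, standard-library sketch. This is not the authors' code: the function names, the fixed seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and ROC–AUC is computed here through its pairwise-ranking (Mann-Whitney) interpretation rather than by tracing the curve.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly partition sample indices into training and test sets.

    The seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def precision(labels, scores, threshold=0.5):
    """Precision = TP / (TP + FP) at an assumed decision threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if p and y == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


# 812 samples, as reported in the abstract; the split itself is random.
train_idx, test_idx = split_indices(812)
print(len(train_idx), len(test_idx))

# Hypothetical labels (1 = groundwater present) and model scores,
# used only to exercise the two metrics.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(roc_auc(toy_labels, toy_scores))
print(precision(toy_labels, toy_scores))
```

In practice a library routine would normally replace the hand-written metrics; the pairwise form is shown because it makes explicit what an AUC of, say, 0.88 means: a randomly chosen positive sample outranks a randomly chosen negative one about 88% of the time.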

Acknowledgments

We sincerely thank the editors and reviewers for their valuable comments, which greatly improved this paper. This work was supported by the Natural Science Foundation of Anhui Province under grant number 1908085ME145.

Author information

Corresponding author

Correspondence to Qimeng Liu.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

About this article

Cite this article

Bai, Z., Liu, Q. & Liu, Y. Groundwater Potential Mapping in Hubei Region of China Using Machine Learning, Ensemble Learning, Deep Learning and AutoML Methods. Nat Resour Res 31, 2549–2569 (2022). https://doi.org/10.1007/s11053-022-10100-4

  • DOI: https://doi.org/10.1007/s11053-022-10100-4
