Abstract
Freshwater scarcity has become more widespread globally in recent years. Surface water resources are no longer sufficient to meet the demands of human production and survival, and groundwater is now being used extensively. Exploring potential groundwater resources is therefore critical for their rational development and utilization. In this study, four classes of methods (machine learning, ensemble learning, deep learning, and automated machine learning (AutoML)) were selected, and a representative model of each was used for comprehensive groundwater potential mapping (GPM) in Hubei Province, China. In total, 812 samples were collected; about 80% of them were randomly selected as the training data set and the remaining 20% as the test data set. Considering local hydrological, geological, and climatic conditions, the following 15 factors were selected: slope, elevation, curvature, landforms, geology, distance to fault, land type, soils, precipitation, evaporation, topographic wetness index, stream power index, distance to rivers, normalized difference vegetation index, and distance to residential area. The four models were validated using the receiver operating characteristic (ROC) area under the curve (AUC) and classification reports. The SHAP value of each factor was calculated as a measure of that factor's contribution to groundwater potential. The results showed that the ROC–AUC values of random forest, Stacking, convolutional neural network, and AutoML were 0.82, 0.85, 0.87, and 0.88, and the precision values were 0.793, 0.784, 0.807, and 0.844, respectively. These results indicate that applying AutoML to GPM, done for the first time in this study, is feasible, and that AutoML had the best predictive ability and accuracy of the four methods. In addition, the SHAP values indicated that the two factors with the greatest influence on groundwater potential were geology and precipitation. The results of this study can provide technical support to the local government for groundwater exploration and development in Hubei Province, and the newly introduced AutoML method can offer experts and scholars a new approach to groundwater potential assessment.
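The evaluation protocol summarized in the abstract (a random split of roughly 80%/20%, followed by ROC–AUC and precision on the test set) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the fixed random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than with any particular library.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly split samples into (train, test); seed and fraction are illustrative."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Fraction of samples predicted positive (score >= threshold) that are truly positive."""
    predicted_positive = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_positive:
        return 0.0
    return sum(predicted_positive) / len(predicted_positive)


# 812 sample indices, matching the sample count reported in the abstract.
train_ids, test_ids = train_test_split(range(812))
print(len(train_ids), len(test_ids))  # 650 162

# Hypothetical test labels (1 = groundwater present) and model scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))    # 8 of 9 positive/negative pairs ranked correctly
print(precision(toy_labels, toy_scores))  # 2 of 3 predicted positives are true positives
```

With 812 samples, rounding 20% gives 162 test samples and 650 training samples, which is consistent with the "about 80%" wording above. In the toy example, one negative (score 0.7) outranks one positive (score 0.4), so the AUC is 8/9 rather than 1.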
Acknowledgments
We sincerely thank the editors and reviewers for their valuable comments, which greatly improved this paper. This work was supported by the Natural Science Foundation of Anhui Province under grant number 1908085ME145.
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
About this article
Cite this article
Bai, Z., Liu, Q. & Liu, Y. Groundwater Potential Mapping in Hubei Region of China Using Machine Learning, Ensemble Learning, Deep Learning and AutoML Methods. Nat Resour Res 31, 2549–2569 (2022). https://doi.org/10.1007/s11053-022-10100-4
DOI: https://doi.org/10.1007/s11053-022-10100-4