Abstract
Groundwater resources are vitally important in arid and semi-arid areas, so spatial planning tools are required for their exploration and mapping. Accordingly, this research compared the predictive power of five machine learning models for the spatial mapping of groundwater potential in the Wadi az-Zarqa watershed in Jordan. The five models were random forest (RF), boosted regression tree (BRT), support vector machine (SVM), mixture discriminant analysis (MDA), and multivariate adaptive regression spline (MARS). These algorithms explored the spatial distributions of 12 hydrological-geological-physiographical (HGP) conditioning factors (slope, altitude, profile curvature, plan curvature, slope aspect, slope length (SL), lithology, soil texture, average annual rainfall, topographic wetness index (TWI), distance to drainage network, and distance to faults) that determine where groundwater springs are located. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to evaluate the prediction accuracy of each of the five models. Ranked in descending order of AUC, the models were MDA (83.2%), RF (80.6%), SVM (80.2%), BRT (78.0%), and MARS (75.5%). The results show good potential for further use of machine learning techniques for mapping groundwater spring potential in other places where the use and management of groundwater resources are essential for sustaining rural or urban life.
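The AUC-based ranking described in the abstract can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen spring location receives a higher potential score than a randomly chosen non-spring location, which is equivalent to the area under the ROC curve. The function name `roc_auc`, the toy labels, and the two model names are hypothetical and are not taken from the study; the scores are invented for illustration and do not reproduce the reported results.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one spring and one non-spring location")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation set: 1 = spring present, 0 = spring absent.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical potential scores from two placeholder models (assumed values).
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    "model_b": [0.9, 0.3, 0.4, 0.5, 0.6, 0.2],
}

# Score every model, then rank the models in descending order of AUC.
auc_by_model = {name: roc_auc(labels, s) for name, s in model_scores.items()}
ranking = sorted(auc_by_model, key=auc_by_model.get, reverse=True)

for name in ranking:
    print(f"{name}: AUC = {auc_by_model[name]:.3f}")
```

The pairwise form is quadratic in the number of locations, which is acceptable for a small validation set; for larger data a rank-based or library implementation would be the usual choice.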
Acknowledgements
The contribution of Hamid Reza Pourghasemi was supported by the College of Agriculture, Shiraz University (Grant No. 97GRC1M271143). The contribution of Adrian L. Collins to this manuscript was funded by a grant award provided by the British Biotechnology and Biological Sciences Research Council (BBS/E/C/000I0330). The researchers thank this council for its support. The authors also thank the Editor-in-Chief, Prof. Dr. Olaf Kolditz, and two anonymous reviewers for their positive comments.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Al-Fugara, A., Pourghasemi, H.R., Al-Shabeeb, A.R. et al. A comparison of machine learning models for the mapping of groundwater spring potential. Environ Earth Sci 79, 206 (2020). https://doi.org/10.1007/s12665-020-08944-1
DOI: https://doi.org/10.1007/s12665-020-08944-1