Abstract
This study evaluated and compared groundwater spring potential maps produced with two models, multivariate adaptive regression spline (MARS) and random forest (RF), in a geographic information system (GIS). In total, 234 spring locations were identified in Boujnord, North Khorasan, Iran, and a GIS spring inventory map was prepared. Of these, 176 (70 %) locations were used to produce the spring potential maps (training), while the remaining 58 (30 %) were used to validate the models. The explanatory variables used to predict spring location were altitude, slope aspect, slope degree, slope length, topographic wetness index (TWI), plan curvature, profile curvature, land use, lithology, distance to rivers, drainage density, distance to faults, and fault density. The spatial relationships between spring occurrence and the explanatory variables were analyzed using a certainty factor (CF) model. For validation, the area under the receiver operating characteristic (ROC) curve (AUC) was used. The validation results showed that the calibration AUC was almost identical (0.79) for both models, while for prediction the MARS model (73.26 %) performed better than the RF model (70.98 %). These results indicate that the MARS and RF models are good estimators of groundwater spring potential in the study area. The resulting groundwater spring potential maps can be applied to groundwater management and groundwater resource exploration.
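The validation workflow summarized above (a random training/validation split of the spring inventory, followed by an AUC score) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is not described here; the function names `split_inventory` and `auc`, the seed, and the toy scores are hypothetical and chosen only for illustration. The AUC is computed with the standard rank formulation: the fraction of (spring, non-spring) pairs in which the spring cell receives the higher predicted potential.

```python
import random


def split_inventory(locations, train_fraction=0.7, seed=0):
    """Randomly split an inventory into training and validation subsets.

    `train_fraction` and `seed` are illustrative defaults, not values
    taken from the study.
    """
    shuffled = list(locations)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    `labels` are 1 (spring present) or 0 (spring absent); `scores` are
    predicted potentials. Ties between a positive and a negative score
    count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy inventory of ten hypothetical location IDs.
    train, valid = split_inventory(range(10))
    print(len(train), len(valid))
    # Toy validation labels and model scores.
    print(auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for an inventory of a few hundred points; a sort-based rank computation would be the usual choice for larger rasters.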
Acknowledgments
The authors would like to thank Dr. Michael Fienen at the USGS Wisconsin Water Science Center for revising the language of the manuscript. We also gratefully acknowledge Editor-in-Chief Prof. James W. LaMoreaux and the two anonymous reviewers for their helpful comments on the previous version of the manuscript.
Cite this article
Zabihi, M., Pourghasemi, H.R., Pourtaghi, Z.S. et al. GIS-based multivariate adaptive regression spline and random forest models for groundwater potential mapping in Iran. Environ Earth Sci 75, 665 (2016). https://doi.org/10.1007/s12665-016-5424-9
DOI: https://doi.org/10.1007/s12665-016-5424-9