Abstract
The objective is to propose an approach for protecting groundwater quality from anthropogenic threats when funds are minimal or data are poor, a setting in which popular statistical methods such as principal component analysis and K-means yield non-significant results. This is a frequent dilemma in developing countries. To overcome it, data mining (DM) techniques were applied to uncover hidden patterns between 15 hydrogeochemical parameters from 29 production wells and the DRASTIC vulnerability index (DVI), in order to identify the specific parameters related to each threat, whether natural or anthropogenic. The DM classifiers yielded four clusters of wells, whose locations corresponded to their DVI-scaled areas on the map. The DM informational and differential weights, together with the interaction and multi-test procedures, singled out the key water quality parameters as reliable predictors, making the remaining parameters unnecessary. The approach could help policy-makers establish warning criteria while saving funds in groundwater quality control analysis. Groundwater quality was adequate, but areas with high and moderate DVI values need preventive measures. DM identified critical physicochemical parameters as predictors of aquifer vulnerability: Mg2+, HCO3−, Cl−, K+, and Na+ concentrations, electrical conductivity, and total dissolved solids characterize the highest DVIs. The main surface threats were urban and industrial activities in the center, agriculture along the flanks, and cheese manufacturing in the north.
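As background for the DVI mentioned above, the following is a minimal sketch of how a generic DRASTIC index is computed as a weighted sum of seven rated hydrogeologic factors. It assumes the standard weights of the original US EPA formulation (Aller et al. 1987); the abstract does not state which weights or ratings were used for the Tulancingo aquifer, so the function name `drastic_index` and the example ratings are hypothetical and purely illustrative.

```python
# Standard DRASTIC weights (Aller et al. 1987). Whether the chapter used
# these exact weights is an assumption; modified schemes also exist.
DRASTIC_WEIGHTS = {
    "D": 5,  # Depth to water
    "R": 4,  # net Recharge
    "A": 3,  # Aquifer media
    "S": 2,  # Soil media
    "T": 1,  # Topography (slope)
    "I": 5,  # Impact of the vadose zone
    "C": 3,  # hydraulic Conductivity
}


def drastic_index(ratings):
    """Return the weighted sum of the seven factor ratings (each 1 to 10)."""
    missing = set(DRASTIC_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for factor in DRASTIC_WEIGHTS:
        if not 1 <= ratings[factor] <= 10:
            raise ValueError(f"rating for {factor} must be between 1 and 10")
    return sum(DRASTIC_WEIGHTS[f] * ratings[f] for f in DRASTIC_WEIGHTS)


# Hypothetical ratings for one map cell (not data from the study).
example_ratings = {"D": 7, "R": 6, "A": 8, "S": 5, "T": 10, "I": 6, "C": 4}
example_dvi = drastic_index(example_ratings)
print(example_dvi)
```

With these weights the index ranges from 23 (all ratings equal to 1) to 230 (all ratings equal to 10); higher values indicate greater intrinsic vulnerability, which is the quantity the wells' hydrogeochemical parameters are compared against.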
Acknowledgements
The authors thank the Hidalgo State Autonomous University and the CONACYT National Water Management Network. AEMC, IAS and MLHF thank CONACYT for postdoctoral and doctoral scholarships, respectively. The authors also thank COTAS Tulancingo A.C. (Groundwater Technical Committee Civil Association of Tulancingo) and the Hidalgo State Water and Sewerage Commission (CEAA) for funding and for supplying data.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Marín-Celestino, A.E., de los Ángeles Alonso-Lavernia, M., de la Luz Hernández-Flores, M., Árcega-Santillán, I., Romo-Gómez, C., Otazo-Sánchez, E.M. (2020). Unveiling Groundwater Quality—Vulnerability Nexus by Data Mining: Threats Predictors in Tulancingo Aquifer, Mexico. In: Otazo-Sánchez, E., Navarro-Frómeta, A., Singh, V. (eds) Water Availability and Management in Mexico. Water Science and Technology Library, vol 88. Springer, Cham. https://doi.org/10.1007/978-3-030-24962-5_8
DOI: https://doi.org/10.1007/978-3-030-24962-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24961-8
Online ISBN: 978-3-030-24962-5
eBook Packages: Earth and Environmental Science (R0)