Abstract
In this study, we assess the validity of an African-scale groundwater pollution model for nitrate. In a previous study, we identified a statistical continental-scale model of groundwater nitrate pollution from a pan-African meta-analysis of available nitrate groundwater pollution studies. The model was implemented in both Random Forest (RF) and multiple regression formats. For both approaches, we compiled as predictors a comprehensive GIS database of 13 spatial attributes related to land use, soil type, hydrogeology, topography, climatology, region typology, nitrogen fertiliser application rate, and population density. In this paper, we validate the continental-scale model of groundwater contamination using a nitrate measurement dataset from three African countries, and we discuss data availability, data quality, and scale as challenges in validation. Although the modelling procedure performed very well on the continental-scale dataset (e.g. R² = 0.97 in the RF format using a cross-validation approach), the continental-scale model could not be used without recalibration to predict nitrate pollution at the country scale using regional data. In addition, when the model is recalibrated using country-scale datasets, the order of the model explanatory factors changes. This suggests that the structure and the parameters of a statistical, spatially distributed groundwater degradation model for the African continent are strongly scale dependent.
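The contrast between a high continental-scale R² and poor country-scale performance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the three regions and the single predictor are hypothetical, and a one-variable least-squares line stands in for the RF and multiple regression models. It is not the authors' dataset or procedure.

```python
# Synthetic illustration: a model fitted on pooled ("continental") data can
# score a high R2 overall while failing within each region.
# All numbers below are made up for the sketch.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, preds):
    """Coefficient of determination; negative when worse than the mean."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical regions: (predictor values, nitrate-like response values).
# Between regions the response rises with the predictor; within each
# region it falls, so the dominant control differs with scale.
regions = {
    "region_a": ([1, 2, 3, 4], [12, 11, 10, 9]),
    "region_b": ([11, 12, 13, 14], [32, 31, 30, 29]),
    "region_c": ([21, 22, 23, 24], [52, 51, 50, 49]),
}

# "Continental" model: one line fitted to all regions pooled together.
pooled_x = [x for xs, _ in regions.values() for x in xs]
pooled_y = [y for _, ys in regions.values() for y in ys]
slope, intercept = fit_line(pooled_x, pooled_y)

pooled_r2 = r_squared(pooled_y, [slope * x + intercept for x in pooled_x])

# Apply the pooled model, without recalibration, inside each region.
regional_r2 = {
    name: r_squared(ys, [slope * x + intercept for x in xs])
    for name, (xs, ys) in regions.items()
}

print(f"pooled R2: {pooled_r2:.2f}")
for name, value in regional_r2.items():
    print(f"{name} R2 with pooled model: {value:.2f}")
```

In this synthetic case the pooled fit explains most of the variance because the differences between regions dominate, yet within every region the same model does worse than simply predicting the regional mean. Refitting the line per region would reverse the sign of the slope, loosely analogous to the change in the ranking of explanatory factors after recalibration.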
Acknowledgments
This study was carried out within the framework of a doctoral research programme, and has been supported by the Islamic Development Bank (IDB) under the Merit Scholarship Programme (MSP) for theses and the ‘Fonds Spécial de Recherche’ (FSR) of the Université Catholique de Louvain. Several people from across the world helped with data acquisition, namely T. Gleeson (McGill University), N. Moosdorf (Hamburg University), and M. Cissé (DGPRE/Senegal).
Additional information
Responsible editor: Kenneth Mei Yee Leung
Cite this article
Ouedraogo, I., Defourny, P. & Vanclooster, M. Validating a continental-scale groundwater diffuse pollution model using regional datasets. Environ Sci Pollut Res 26, 2105–2119 (2019). https://doi.org/10.1007/s11356-017-0899-9
DOI: https://doi.org/10.1007/s11356-017-0899-9