Introduction

Groundwater corresponds to the largest freshwater volume in continents (World Resources Institute 1990), and aquifer replenishment, i.e., recharge, becomes a key component to fully understanding the hydrologic cycle processes at local, intermediate, and regional scales.

Groundwater recharge estimates are crucial to improve water management strategies and are usually required in several hydrogeological applications such as baseflow analysis in shallow aquifer–river interactions (Zomlot et al. 2015), riparian ecosystem administration (Singh et al. 2021), unsaturated zone solute transport (Nativ et al. 1995), urban hydrogeology (Tubau et al. 2017), contaminant plume development (Birla et al. 2020), wellhead protection zone delineation (Fienen et al. 2022), and vulnerability mapping (Torkashvand et al. 2021), to name a few.

Overall, recharge evaluation poses a major scientific challenge, as this factor is spatial- and time-dependent, and relies on a vast parametrization, which in turn is based on surface/subsurface geology, soil type, vegetation cover, topography, hydrology, climatology, and human influence. Thus, several methods have been developed using a wide range of techniques and considering different data sources as inputs. Moreover, actual groundwater recharge that reaches the water table is more complicated to assess than potential recharge. The latter represents the water that infiltrates into deep soil layers without the influence of roots and evapotranspiration eventually becoming real recharge (Westenbroek et al. 2018).

Depending on the theoretical framework and principle, each strategy provides advantages, drawbacks, and inherent uncertainty ranges for point or diffuse recharge estimates, at different space–time scales. The most common frameworks involve water budget, modeling, surface water data, and physical, chemical, and heat tracer methods (Cartwright et al. 2017; Águila et al. 2019; Walker et al. 2019; Burri et al. 2021; Post et al. 2022). Scanlon et al. (2002) and Healy (2010) summarize the attributes of these approaches, including the applicable time and space scales.

Regional recharge assessments are particularly crucial in arid basins and aquifer-dependent regions where groundwater is the main source of domestic supply. However, the reliability of the recharge outcomes highly depends on the quality and quantity of available data, which otherwise is often limited in poorly monitored catchments and groundwater systems.

This is the case of the so-called Basin of Mexico, a ~9,000-km2 basin in central Mexico, which is home to ~25 million people. The basin includes Mexico City and its Metropolitan area, one of the world’s major megacities, which largely relies on groundwater (INEGI 2020). Current water use amounts to 61 m3/s basin-wide, out of which ~65% is supplied by ~2,800 wells that continuously pump groundwater from a regional aquifer system composed of Quaternary–Tertiary alluvial sediments, pyroclastic rocks, and fractured lavas, overlain by a compressible lacustrine aquitard of variable extent and thickness.

Evidence has shown that intensive groundwater overdraft in the Basin of Mexico (BM) has been carried out since the mid-twentieth century (Marsal and Mazari 1959), under conditions where long-term pumping exceeds recharge. Several negative effects were triggered as a result, including excessive drawdown rates (>1.5–2 m/year), hydrogeochemical degradation, inversion of the vertical hydraulic gradient (downwards from the lacustrine formation to the aquifer), aquitard consolidation, and severe land subsidence, reaching a maximum rate of ~40 cm/year (Cabral-Cano et al. 2008; Chaussard et al. 2014, 2021; Cigna and Tapete 2021; CONAGUA 2020; Fernández-Torres et al. 2022; Hernández-Espriú et al. 2014; Huizar-Alvarez et al. 2004; Solano-Rojas et al. 2015).

Overall, the space/time distribution of groundwater recharge in the BM is a relatively understudied problem. To the best of the authors’ knowledge, very few studies have rigorously assessed recharge in the BM. Carrera-Hernández and Gaskin (2008) developed a robust methodology in which a daily soil–water balance was applied using different vegetation and soil types for the period 1975–1986, estimating values of potential recharge in the order of ~11–24 m3/s, or ~36–78 mm/year. In addition, two other reports described potential recharge values in the study area (Morales-Escalante et al. 2020; Palma-Nava et al. 2022).

Surprisingly, and despite its importance, very little has been discussed and published in the scientific literature about this topic. One reason might be that open hydrogeological data in the BM (e.g., groundwater levels, geological borehole logs, pumping tests, hydrogeochemical analysis) are scarce, missing, or even lacking in terms of coverage and timespan, thus complicating recharge computation and modeling.

By using remote sensing information and global data sources, this paper aims to provide a practical approach to estimating the temporal and spatial distribution of potential groundwater recharge in data-scarce regions and its application in the BM. Precipitation, air temperature, land cover data, and soil properties were acquired by means of global, long-term, continuous datasets as forcings of a physically based daily soil–water-balance model (the SWB model; Dripps and Bradbury 2007; Westenbroek et al. 2010). Previous research demonstrated that a soil water-balance scheme was successfully tested in the BM as a feasible method to analyze recharge distribution (Carrera-Hernández and Gaskin 2008).

The main objectives of this research comprise: (1) the validation of remote/global products of soil types, land use, precipitation, and air temperature with local information and ground stations in the BM, and (2) the evaluation and comparison between simulated recharge outcomes using different data sources in the BM in both temporal and spatial scales.

The advantage of this strategy is that it allows for the depiction of regional changes in potential recharge patterns by using: (1) pseudo-continuous hydrological information in areas with limited ground observations, and (2) a simple potential recharge model that represents an appropriate balance between complexity, accuracy and the need for practical design an analysis (Dripps and Bradbury 2007). Therefore, this approach can be extrapolated to other parts of the world with similar settings, for monitoring shifts in regional recharge, particularly in low and middle-income countries where specific data are often insufficient, incomplete and outdated. Moreover, it can be beneficial for groundwater practitioners, environmental and water managers who are interested in straightforward yet robust regional recharge estimates.

Methods and materials

Study area

The study area (Fig. 1) corresponds to the Basin of Mexico (BM), an endorheic basin of 8,830 km2 (Fig. 1a) that, during the last century, had been connected through artificial channels to the Tula River in the northwest of Mexico City to drain the excess of surface water into the Gulf of Mexico (SACMEX 2018). The BM is located between the coordinates 99°25′47″ W to 98°11′48″ W longitude and 19°03′19″ N to 20°11′20″ N latitude and presents a topographic elevation range from 2,220 to 5,200 m above sea level (asl), with a mean elevation of 2,550 m asl. The BM extent includes Mexico City and its Metropolitan area, with ~21.8 million people (INEGI 2020), which represents ~88% of the total population in the basin.

Fig. 1
figure 1

Study zone and overall attributes in the Basin of Mexico: a Groundwater management units, location of groundwater wells, and climatological ground stations, b land cover from the Climate Change Initiative Land Cover (Bontemps et al. 2013), c soil classes derived from SoilGrids (Hengl et al. 2017), d surface geology/lithology from the Mexican Geological Survey (SGM 2017), e geographical features for further statistical analysis

The BM exhibits a temperate climate with a mean temperature that ranges from 15 to 25 °C, and a mean precipitation of ~680 mm/year, where 80% of annual precipitation is accumulated from June to October; however, annual precipitation varies from ~1,300 mm in the Sierra de las Cruces, in the southwest (Fig. 1e), to 430 mm in the northern part of the basin.

Croplands cover ~50% of the BM extension, followed by urban areas with ~19% (Fig. 1b), while other predominant land uses are evergreen forest (~15%) and scrub (~8%). These four land uses correspond to 92% of the BM area; however, the urban area represents ~50% of Mexico City, followed by evergreen forest and cropland with 24 and 18%, respectively.

Phaeozem is the predominant soil type (Fig. 1c), representing ~45% of the BM area, mostly located in the northern and northwest parts of the basin. Luvisols cover ~22% of the BM and are located in the Sierra de las Cruces, Sierra de Chichinautzin, and the Sierra Nevada, in the southwest, south, and southeast of Mexico City (Fig. 1e), respectively. Leptosols and vertisols cover ~15 and ~9% of the BM, respectively, where vertisols extend over the former Texcoco Lake system.

The surface geology is shown in Fig. 1d. The Basin of Mexico (BM) is located in the eastern part of the Trans-Mexican Volcanic Belt, a ~1,000-km continental volcanic arc that crosses from Nayarit to Veracruz states (Ferrari et al. 1999). The Sierra de las Cruces (Fig. 1e) in the southwest part of the BM is characterized by the presence of andesite-dacite and andesite tuffs from the Neogene and Quaternary periods. The Sierra de Chichinautzin, in the south, is made up of Quaternary basalts and andesite rocks.

The Sierra Nevada is formed by the Popocatepetl, Iztaccíhuatl, and other stratovolcanoes with andesitic, dacitic, and rhyolitic compositions from the Pliocene–Pleistocene periods (Arce et al. 2019). The north and northeast parts of the basin exhibit intercalation of basalt-andesite and andesitic tuff rocks. Overall, a regional aquifer system composed of Quaternary–Tertiary alluvial sediments, pyroclastic rocks, and fractured lavas represents an unconfined aquifer with confined sections of up to 800 m of saturated thickness in some areas (Hernández-Espriú et al. 2014; Carrera-Hernández and Gaskin 2008). A ~40–400 m thickness compressible lacustrine aquitard overlays the regional volcanic aquifer.

On the other hand, water demand in Mexico City and its Metropolitan Zone exceeds natural water incomes (Huerta-Vergara et al. 2022; Carrera-Hernández and Gaskin 2007). In Mexico City, water demand reached 32.55 m3/s in 2017, where internal water sources corresponded to ~64% of the total water supply and external water sources to ~36% (SACMEX 2018). Moreover, groundwater represents the main water source in the Metropolitan Zone, totaling ~68% of the total water use (SACMEX 2018; Escolero et al. 2016).

Official or permitted wells (groundwater concessions) and annual groundwater withdrawals considering the BM and Mexico City are presented for context in Table 1. Almost 30% of groundwater concessions (permitted wells) are in Mexico City, which totals ~50% of the groundwater withdrawals in the BM [~1,300 × 106 m3 (Mm3)]. Domestic supply is the dominant groundwater use in the study area, representing ~70% of total groundwater use basin-wide, and ~94% in Mexico City.

Table 1 Number of groundwater concessions and total groundwater withdrawals by water use in the Basin of Mexico and in Mexico City. The number of titles and volume were derived from the National Water Commission database (REPDA) for 1994–2019

Datasets used and workflow

Remotely sensed data and global hydrological products were used primarily to model potential recharge in the BM, including climatological data, vegetation classes, and soil types. In addition, ground observations and local information were used for comparison purposes. Table 2 shows the main features of each global dataset involved. The overall workflow of this study is shown in Fig. 2.

Table 2 Remote sensing and global datasets used in this study
Fig. 2
figure 2

Conceptual framework implemented in this study; M Models

Daily precipitation was obtained from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) version 2.0 (Funk et al. 2015) from the UC Santa Barbara, USA, considering a 21-year period (2000–2021). Gridded daily minimum, mean, and maximum temperatures from the same period were obtained from the Daymet V4 product developed by the Environmental Sciences Division at Oak Ridge National Laboratory, Oak Ridge, and NASA (NASA EarthData 2022).

The land use was derived from the Climate Change Initiative Land Cover (CCI-LC) for the year 2015 (Fig. 1b), which is a 300-m resolution global product of land cover classification developed from remote sensing data and machine learning algorithms (Bontemps et al. 2013), where similar classes were grouped, preserving the most frequent land use. Soil types were analyzed from the SoilGrids dataset (Hengl et al. 2017), a 250-m resolution global product of soil physical attributes and taxonomical classification, originated from machine learning using remote sensing data and in-situ soil profiles.

Local daily precipitation and temperature data were collected from ground stations managed by the National Meteorological Service (SMN 2021) (blue triangles in Fig. 1a; Table 3). Ground stations were found in an elevation range of 1,530–3,300 m asl, and only 47 stations were selected, considering those with less than 20% of missing data during the period 2000–2016. Climatological data for recent years were not available.

Table 3 Ground datasets used in this study

Likewise, the land cover maps of the National Institute of Statistics and Geography (INEGI) and the soil types of the National Institute of Forestry, Agriculture and Livestock Research (INIFAP) and the National Commission for the Knowledge and Use of the Biodiversity (CONABIO) were collected. More information on the local datasets is shown in Table 3. The local datasets of land use and soil types were compared with the global products of CCI-LC and SoilGrids, respectively.

For calibration purposes, local daily streamflow records were obtained from the National Bank of Surface Water Data (CONAGUA 2021) from the National Water Commission (CONAGUA). Table 4 shows the attributes of the two streamflow gauges located in the BM (Fig. 1a), considering available information from 2000–2014: The San Lucas gauge station (ID 26275) on the Río de la Compañía with a catchment area of 304 km2, and the El Molinito gauge station (ID 26053), located on the Río Hondo, draining ~143 km2 of surface area.

Table 4 Streamflow gauges used for model calibration in the Basin of Mexico

Potential groundwater recharge model and calibration

The daily potential groundwater recharge in the BM was estimated using the soil–water-balance (SWB) model version 2.0 (Westenbroek et al. 2018), available at Westenbroek (2021). SWB is a distributed model developed by the US Geological Survey to estimate water balance components at a daily time step using a modified version of the Thornthwaite-Mather soil-moisture-balance approach (Thornthwaite and Mather 1957). SWB v2.0 is an evolutionary product of the SWB model developed by Dripps and Bradbury (2007) and has been improved to facilitate data input and net infiltration modeling.

The SWB model solves a water budget approach (potential evapotranspiration, snowmelt, interception, runoff, infiltration, actual evapotranspiration, leakage, percolation, flow routine) for each grid cell of the simulation domain. The net infiltration at any time step is assumed when the soil moisture exceeds the total water value, which depends on the field capacity, wilting point, and effective rooting depth of vegetation (Westenbroek et al. 2018). The hydrological processes are computed in sequential order, from the upper to deeper soil layers. The main limitations of the SWB approach in the BM are the lack of temporal groundwater abstraction to include irrigation returns in croplands, and the unrealistic flow routine in flat terrains, where the model incorporates preferential paths using the single flow direction raster.

The SWB model relies on data being available on land use, soil hydrologic group, surface water flow direction, gridded precipitation, and minimum and maximum air temperature. The land use was directly used from the CCI Land Cover product, and the soil hydrologic groups were derived from the SoilGrids soil texture following the classes proposed by Hawkins et al. (2009)—shown in Table S1 in the electronic supplementary material (ESM). Moreover, soil water capacity was established using the soil textures and following Westenbroek et al. (2018) (shown in Table S2 in the ESM). The flow direction raster without sinks required in the SWB model was computed through the r.watershed algorithm in GRASS GIS v7.8.5 (Neteler et al. 2012) using the Shuttle Radar Topography Mission (SRTM) v4, at 90 m pixel resolution (Bamler 1999).

The potential recharge was simulated within a 500 × 500 m2 grid including a computing domain of 271 rows and 269 columns, where the daily records from ground stations were interpolated to fit the model grid size using an inverse distance weighted algorithm (IDW) (Shepard 1968) computed in Python 3.9 (Van Rossum and Drake 2009) and using an exponent of 2. Precipitation from CHIRPS and air temperature from Daymet (see Table 1) were spatially averaged to fit the 500 × 500 m2 grid through Xarray (Hoyer and Hamman 2017), a Python library for multidimensional arrays computing.

Four SWB model configurations were implemented to compare different climatological data sources (M1, M2, M3, and M4), as is shown in Table 5. Within the model setup M3, the CHIRPS precipitation bias correction (CHIRPSC) using interpolated ground data consisted of two steps: monthly bias correction and daily bias correction. The former was performed using a power function (Wörner et al. 2019; Goshime et al. 2019; Vernimmen et al. 2012) as follows:

$${P}_{\mathrm{c},i}={a}_{m}{P}_{i}^{{b}_{m}}$$
(1)

where Pi is the monthly precipitation from CHIRPS for date i (year/month), Pc,i is the bias-corrected monthly precipitation for date i, and am and bm are calibration coefficients for each month, i.e., 24 parameters per pixel of the grid to compute. Parameters am and bm were calibrated using the historical interpolated monthly precipitation (Ph,i) from ground stations reducing the root mean square error between Pc,i and Ph,i.

Table 5 Model configurations used to simulate potential recharge in the BM

Within the latter computation, daily precipitation from CHIRPS was corrected using the ratio between the CHIRPS monthly bias-corrected precipitation and the original CHIRPS monthly precipitation:

$${P}_{\mathrm{c},t}={P}_{t}\frac{{P}_{c,i}}{{P}_{i}}$$
(2)

where Pc,t is the CHIRPS daily precipitation bias-corrected for day t, Pt is the CHIRPS daily precipitation for day t, Pc,i is the CHIRPS monthly precipitation bias-corrected, and Pi is the CHIRPS monthly precipitation. The air temperature time series from Daymet were directly used without correction, since the bias-correction methodology tested with ground data showed poor improvements.

The SWB model uses the curve number approach (Woodward et al. 2003) to compute runoff volume as:

$$R=\frac{{\left(P-{I}_{\mathrm{a}}\right)}^{2}}{\left(P+{S}_{\mathrm{max}}-{I}_{\mathrm{a}}\right)}$$
(3)

where R is runoff, P the daily precipitation, Ia the initial abstraction, and Smax the maximum soil moisture-holding capacity. Initial abstraction is computed as Ia = 0.05 × Smax, and Smax is computed as follows:

$${S}_{\mathrm{max}}=\frac{1000}{\mathrm{CN}}-10$$
(4)

where CN is the curve number. Initial values for the hydrological soil group were obtained from Westenbroek et al. (2018), shown in Table S3 in the ESM.

The SWB model calibration was performed in two steps. The first step consisted of a manual iterative change of the initial root zone depth values (shown in Table S3 in the ESM) at rates of 1% to reduce the root mean squared error (RMSE) of the simulated annual actual evapotranspiration (AET) against remote-sensing-derived evapotranspiration products and two empirical formulations. The AET from SWB is sensitive to the root zone depth and is computed as the fraction of soil moisture extracted by potential evapotranspiration (PET), where PET was estimated using the Thornthwaite approach (Thornthwaite 1948). The annual AET values used for validation were extracted from global products as reported in Table 2, such as the Moderate Resolution Imaging Spectrometer (MODIS), product MOD16 (Mu et al. 2011), the Global Land Evaporation Amsterdam Model (GLEAM v3.5) (Martens et al. 2017) and the TerraClimate dataset (Abatzoglou et al. 2018). Finally, the empirical models of Turc (Turc 1961) and Coutagne (Demertzi et al. 2021; Coutagne 1949) were also implemented to compute AET at the annual scale using climatology from the ground stations (2000–2016).

The second step consisted of a manual iterative change of the curve number (from Table S3 in the ESM) to reduce the RMSE between simulated and observed monthly runoff values for the period 2001–2012; however, other metrics, such as the Kling-Gupta-Efficiency (KGE) (see Clark et al. 2021) and the Pearson correlation coefficient, were used to evaluate the model performance. The observed runoff was computed from the local daily streamflow records of the San Lucas and El Molinito gauges (Table 3) using the recursive digital filter proposed by Lyne and Hollick (1979):

$${\mathrm{Qd}}_{t}=\alpha\; {\mathrm{Qd}}_{t-1}+\frac{1+\alpha }{2} \left({\mathrm{Qt}}_{t}-{\mathrm{Qt}}_{t-1}\right)$$
(5)

where subindex t refers to the date, Qd is the computed runoff, Qt is the observed streamflow (runoff + baseflow), and α is the filter parameter. This filter was applied three times, alternating the time orientation (where the computed baseflow becomes the total flow for the next iteration), and with a fixed α value of 0.925, as suggested Nathan and McMahon (1990) for daily streamflow hydrographs.

Recharge application (app) development

To provide accessible information on the potential recharge estimates in the BM for education, science communication, and water management purposes, a web application was developed to share the main results from this research.

The web application was developed using Streamlit (Streamlit Inc. 2022), a Python library for fast development of web applications for data science. Moreover, interactive plots were generated using the Plotly library (Plotly 2022). The “BMRecharge-app” is freely available at The Hydrogeology Group (2022) and allows one to easily visualize and download potential recharge time series considering the four modeling setups (M1–M4), and to visualize basin-wide maps of annual potential recharge for different time spans.

Results

Comparison and correction of data sources

The monthly and annual precipitation time series and the annual air temperature using ground gauges, CHIRPS, CHIRPS bias-corrected (CHIRPSC), and Daymet datasets are shown in Fig. 3. The spatial averaged monthly precipitation over the BM using CHIRPS followed the temporal patterns of interpolated precipitation from gauges (Fig. 3a), with a Pearson correlation coefficient of 0.96 and an RMSE of 14.1 mm. CHIRPSC slightly changed the correlation coefficient and RMSE to 0.97 and 12.6 mm, respectively.

Fig. 3
figure 3

Comparison of a monthly precipitation and d annual precipitation using climatologic ground gauges, CHIRPS, and bias-corrected CHIRPS (CHIRPSC), spatially averaged over the Basin of Mexico extension. Pixel-based comparison of precipitation time series from CHIRPS vs ground gauges for monthly and annual time steps are shown in parts b and e, respectively, while CHIRPSC vs ground gauges at monthly and annual time steps are shown in parts c and f, respectively. The comparison between annual temperature using the Daymet dataset and ground gauges averaged over the BM is shown in parts g and h. Soil types and land use comparison of global products with local data are shown in i and j, respectively

Point-based comparisons of monthly precipitation are shown in Fig. 3b, c. The correlation coefficient and RSME for the pixel-based comparison between global (CHIRPS, model M2) and local rainfall (gauges, model M1) were 0.87 and 35.2 mm, respectively. Using the CHIRPSC dataset (model M3), these metrics improved to 0.9 and 30.9 mm. However, some discrete precipitation events higher than 600 mm, determined from gauges, were underestimated by CHIRPS and CHIRPSC; this was related to large local precipitation gradients at the higher elevations in the south and southwest parts of the BM (not shown), for instance, in the Sierra de las Cruces (see Fig. 1e for location). Arciniega-Esparza et al. (2022) have shown how CHIRPS fails to detect some rainy and dry days in high precipitation gradients across Costa Rica.

At the annual time step (Fig. 3d), the correlation coefficient from the spatially averaged precipitation from CHIRPS and CHIRPSC decreased to 0.57 and 0.63, with RMSE values of 80 and 72.4 mm, respectively. This decrease in correlation is related to the lower number of records to compute the correlation coefficient, and due to the large error observed in 2015, with more than 110 mm of difference between ground gauges and remotely sensed precipitation. However, at the pixel-scale (Fig. 3e, f), the correlation coefficient was 0.7 and 0.81, considering CHIRPS and CHIRPSC configurations, respectively. Such differences in spatially averaged precipitation may be related to local heavy precipitation events that cannot be detected by satellites and due to the lack of ground precipitation data within the Cuautitlán-Pachuca and Ápan groundwater management units (GMU), and in the Sierra Nevada and Sierra de Calpulalpan (see Fig. 1 for location).

The spatially averaged air temperature over the BM estimated from gauges and Daymet is shown in Fig. 3g. The Daymet product exhibits a bias of –0.4 °C in relation to the mean air temperature from the ground dataset (period 2000–2016), a bias of –0.55 °C for the period 2000–2013, and a correlation coefficient of 0.72. However, air temperature from Daymet exhibited an increase in 1.8 °C from 2012–2013. Both series displayed a positive trend of 0.03 and 0.1 °C/year using gauges and the TerraClimate product, respectively.

Figure 3i, j shows the comparison of soil types and land use. Most of the soil types between SoilGrids and the INIFAP-CONABIO datasets showed discrepancies, where SoilGrids exhibited a larger extent of Luvisols (22 against 0.6%, respectively), but lower extents of Andosol, Cambisol, Litosol, Regosol, and Solonchak soil types (Fig. 3i). Moreover, the INIFAP-CONABIO dataset contains the urban and water categories, which are related to land use instead of soil types and may be affecting the comparison. Finally, the extent of land uses from the CCI-LC and the INEGI datasets were in general consistent (Fig. 3j); however, the CCI-LC showed a higher urban area (19 against 13%) related to the more recent period of CCI-LC concerning INEGI dataset (2015 and 2013, respectively).

Model calibration and evaluation

The rooting depth parameter and the curve number for different soil-land covers were iteratively changed to improve the actual evapotranspiration (AET) and runoff simulations, respectively, and the calibrated parameters are shown in Table S4 in the ESM. Figure 4 shows a comparison of the best performance simulation for each model setup (Fig. 2), considering monthly runoff at the streamflow gauges and using the spatially averaged AET for the BM, compared with empirical formulations and global products.

Fig. 4
figure 4

Monthly runoff comparison for a El Molinito and b San Lucas streamflow gauges. The comparison of spatial averaged annual actual evapotranspiration (AET) is shown in c, where black color lines correspond to the reference values. M Models

The computed metrics for simulated monthly runoff using the four model setups are shown in Table 6. The CHIRPSC model setup (M3) depicted the best performance for both streamflow gauges, considering the RSME, KGE, and Pearson metrics. The worst performance was obtained with the CHIRPS model setup according to RMSE and KGE metrics. As shown in Fig. 4a, the CHIRPS model setup overestimated more than two times the monthly runoff in El Molinito gauge for the period 2009–2014. Moreover, the local and CHIRPS-Daymet model setups were comparable in El Molinito station, but the model M1 showed better performance at the San Lucas station, where the CHIRPS-Daymet model (M4) overestimated the average monthly runoff two-fold.

Table 6 Metrics for monthly runoff simulations using the four model setups. RMSE root mean squared error; KGE Kling-Gupta Efficiency; r Pearson correlation coefficient

Figure 4c shows annual actual evapotranspiration (AET) outcomes considering different data sources. The Turc and Coutagne formulations estimated annual AET values in the order of 358 and 603 mm, with mean values of 410 and 530 mm, respectively. Remote-sensing-derived mean annual AET from MODIS, TerraClimate, and GLEAM estimated 643, 636, and 734 mm, respectively. Simulated mean annual AET over the BM by means of the SWB model ranged from 472 to 490 mm. Similar results were found for models M2, M3 (380–580 mm/year, respectively), and for M1, M4 (360–700 mm/year, respectively).

As noted in Fig. 4c, GLEAM overestimated the annual AET, representing ~97% of the mean annual precipitation in the BM (even in some years, AET exceeds precipitation). AET from MODIS and TerraClimate represent, on average, 86% of annual precipitation, but in some years, TerraClimate showed AET values higher than precipitation. AET derived from Turc and Coutagne equations represented 55 and 70% of the mean annual precipitation, respectively, which are consistent with previous findings by Birkle et al. (1998). Simulated AET/precipitation from the model setups using the SWB approach ranged between 55 and 95%, with a mean value of ~64%.

Potential groundwater recharge estimation in the Basin of Mexico

The 2000–2021 mean annual potential recharge for each model setup is shown in Fig. 5. The spatial average recharge considering the Local (M1), CHIRPS (M2), CHIRPSC (M3), and CHIRPS-Daymet (M4) models were 37, 45, 38, and 45 mm/year (equivalent to an infiltration rate of 10.38, 12.57, 10.73, 12.61 m3/s), and medians of 21, 16, 18, 24 mm/year (5.9, 4.5, 5, 6.7 m3/s), respectively.

Fig. 5
figure 5

Spatial distributions of mean annual potential recharge for the period 2000–2016 using the four model configurations: a Local Model (M1), b CHIRPS Model (M2), c CHIRPSC Model (M3), d CHIRPS-Daymet Model (M4)

All model setups agreed that the BM southwest area, along the Sierra de las Cruces and Sierra de Chichinautizn (see Fig. 1e for locations), represents the dominant groundwater recharge surface area. In contrast, potential recharge in a large share of Mexico City and its northern metropolitan region is practically negligible.

It is worth noting that the lack of climate stations at high elevations tends to simulate lower recharge rates in ground-based models, with differences higher than 150 mm/year (Fig. 6), such as in the high mountainous regions of the Chalco-Amecameca management unit (see Fig. 1b for location), considering the model M1 (Fig. 6a, c). Also, simulated recharge using the M3 model (CHIRPSC) showed similar spatial patterns in comparison to model M1 (Fig. 6b), suggesting that the bias correction technique would reduce precipitation at higher elevations within the Sierra Nevada and Sierra Calpulalpan (see Fig. 1e for location).

Fig. 6
figure 6

Differences of mean annual simulated potential recharge for a Model 2-Model 1, b Model 3-Model 1, c Model 4-Model 1, d Model 3-Model 2, e Model 4-Model 2, f Model 4-Model 3

On the other hand, the spatial distribution of potential recharge for the years 2000, 2005, 2010, 2015, and 2020 using the four model setups are shown in Fig. 7. Similar spatial patterns were observed from 2000 to 2005 for all models; yet, from 2006 to 2015, simulations derived from model M1 (Local) exhibited higher recharge values over the Cuautitlán-Pachuca and Texcoco management units.

Fig. 7
figure 7

Spatial distribution of annual potential recharge simulations for 2000, 2005, 2010, 2015, and 2020 using the four model setups. M Models

High pixel-scale recharge was estimated in the Sierra de las Cruces and Sierra de Chichinautzin during the wet period from 2000 to 2004 (Fig. 3d), reaching values up to 800 mm/year; however, isolated high values were derived due to significant flow accumulation as the effect of the flow routine implemented in the SWB model. In addition, recharge is very close to zero over the metropolitan area since percolation through faults in deep urban drainage was not included in the models.

Within the simulation period, 2005 was the driest year, resulting in a 20% recharge reduction, particularly in the higher mountainous zones of the basin (Fig. 7). Moreover, a severe recharge decrease (~80%) in relation to the long-term mean was observed during 2015 in the Chalco-Amecameca, Texcoco, Cuautitlán-Pachuca, and Ápan management units (see Fig. 1 for locations), associated with negative precipitation anomalies of about 55 mm and positive anomalies of temperature higher than 0.4 °C (Fig. 3).

Spatial recharge patterns

The spatial averaged potential recharge over the BM (annual basis) using the four model setups ranged from 7.4 to 110 mm (or 2.1–30.8 m3/s), as shown in Fig. 8a (period: 2000–2021). The larger variations between models M1 and M2 were observed from 2000–2004, with divergences from 21–33 mm (5.8–9.2 m3/s). A higher similarity between setups was observed during the periods 2005–2007 and 2010–2015. All models computed a lower potential recharge in 2005, 2012, and 2015, of 25.8, 23.4, and 13.4 mm (7.22, 6.55, 3.75 m3/s), respectively; however, in high recharge years, some discrepancies were detected. For instance, the largest recharge value considering model 1 (local model) was computed as 55.5 mm (15.53 m3/s) for the year 2010, in contrast with 71.8 mm (20.1 m3/s) for 2003 by means of the remote sensing products. Furthermore, the CHIRPS-Daymet setup (Model M4) showed a potential recharge recovery in 2018 and 2021, with ~80 and ~110 mm/year (22.4 and 30.8 m3/s), respectively.

Fig. 8
figure 8

Temporal and statistical analysis of potential groundwater recharge in the study area. a Spatial averaged time-series of annual potential recharge over the Basin of Mexico (BM) is shown. b Spatial averaged mean monthly (1–12) potential recharge over the BM is shown. cd Mean annual potential recharge by groundwater management units and by geographical features (see locations in Fig. 1e), respectively, are shown, where large boxes correspond to the interquartile range (25th–75th), the inner line within the widest box corresponds to the median and the remaining boxes to a 5%-quantile increment

Overall, 90% of annual recharge is distributed from June to September, reaching values of 7–13 mm/month (Fig. 8b). All models yielded good agreement for August to September (7–11 mm/month), with some minor differences of 4.5 mm/month in July by means of the CHIRPS-Dayment setup (M4) in comparison to the local model (M1).

Figure 8c shows annual recharge statistical distributions, separated by GMU in the BM, illustrated by boxenplots (an improved version of classic boxplots, which describes error distribution using quantiles). Higher potential recharge values were estimated for the Zona Metropolitana GMU with a mean recharge value of 77.4–93.7 mm/year, or 4.7–5.7 m3/s. The lower value corresponds to the recharge rate derived from model M1 and the larger to the CHIRPS setup (model M2), as shown in Table 7. However, ~50% of this GMU is covered by the urban area where recharge is negligible, while less than 40% of the area corresponds to the Sierra de las Cruces, with a recharge interquartile range (25th–75th) of 30–300 mm/year. Figure 8d reveals the importance of surrounding mountain ranges in the BM composed of fractured lavas, such as the Sierra de las Cruces and Sierra del Chichinautzin, since they both signify the most important recharge areas in the basin, representing 2–3 times the recharge rates in Sierra Nevada and Sierra de Calpulalpan, and up to 5–20 times the recharge rates in valleys (Table 7).

Table 7 Mean annual potential recharge by groundwater management units (GMUs) and geographical features

The statistical analysis of several geographical features in the BM suggested similarity between CHIRPS (M2) and CHIRPS-Daymet (M4) outcomes, and between local (M1) and CHIRPSC (M3) models, as shown in Fig. 8c, d. All setups generated similar recharge intervals in the Sierra de las Cruces, the Zona Metropolitana, and Tecomulco GMUs; however, CHIRPS and CHIRPS-Daymet predicted, on average, more than 2.5 times the recharge in the Sierra Nevada and more than two times the recharge in the Sierra de Chichinautzin.

The annual precipitation and terrain elevation showed the strongest control on mean annual potential recharge. Figure 9 shows the nonlinear relationship between mean annual recharge and mean annual precipitation (GWR-P) for all model setups, where the circles’ size corresponds to the terrain elevation and colors represent the predominant surficial lithology in the BM.

Fig. 9
figure 9

ad Non-linear correlations between mean annual potential recharge (GWR) vs mean annual precipitation (precipitation in the X axis), terrain elevation (circles’ size in m asl), and surficial lithology (circles’ color) over the Basin of Mexico

Both mean annual recharge (GWR) and precipitation (P) are related by a potential power function [GWR = a(P)b]. As noted in the empirical equations from Fig. 9, a precipitation rate lower than 500 mm/year does not generate recharge at all. Overall, similar GWR-P relationships were observed between the CHIRPS and CHIRPSC setups; however, CHIRPS showed a clear relationship between precipitation and elevation, and CHIRPSC tends to underestimate precipitation in high terrain elevations. The GWR-P relationship from M1 (local model) resulted in lower recharge rates within a precipitation range of 600–900 mm/year. Finally, the CHIRPS-Daymet Model (M4) resulted in an underestimation of recharge considering large precipitation heights. From this analysis (Fig. 9), it is concluded that the areas more prone to important groundwater recharge development in the BM are those with precipitation rates higher than ~800 mm in terrains with elevation higher than ~3,000 m asl, along basaltic-andesitic and andesitic-basaltic rocks. In contrast, areas with rainfall rates less than ~650 mm/year, within less than 2,500 m asl of elevation, along alluvial deposits and rhyolitic rocks, develop very low recharge rates or even no recharge at all.

Discussion

In this research, the potential recharge in the BM was simulated and compared using climatological data from ground stations and remote-sensing/global products to understand the benefits and limitations of the global products against local data for groundwater recharge assessments. Most of the previous research on potential recharge in the BM is based on ground gauges (Palma-Nava et al. 2022; Mautner et al. 2020; Carrera-Hernández and Gaskin 2008; Birkle et al. 1998; Durazo and Farvolden 1989), where solely Carrera-Hernández and Gaskin (2008) considered the elevation effect by means of rainfall interpolation.

The locally available data sources provide valuable climatological information; however, dense station networks are required in regional analysis and large-scale modeling. Wohl et al. (2012) showed how the number of precipitation stations has been decreasing in tropical regions since the decade of the 1970s, including Mexico, reducing the availability of long time series with common periods. Other limitations for ground stations are the lack of recent periods on publicly available datasets, whereas remote-sensing and global products provide information with a delay of a few months or near real-time. Nevertheless, global datasets require validation and correction with local data. The lack of ground data to correct satellite products has been discussed in previous works under complex topographical and wet-dry conditions (Arciniega-Esparza et al. 2022; Argüeso et al. 2013). Hence, remote sensing products remain the main source of past and current climatological data in regional regions with low density, interrupted, discontinuous, or even private data from ground monitoring networks.

Larger differences in soil types than land use were observed using SoilGrids and CCI-LC compared to local datasets. The dataset of local soil types showed categories such as urban or water classes (more related to land use), which limits the comparison with the global product. Besides, the CCI-Land Cover product and the local data showed a good agreement on land cover, where the differences were related to the increase in the urban area in 2015 (from CCI-Land Cover) concerning 2013 (from the INEGI dataset).

Precipitation in the BM from CHIRPS showed good agreement with ground gauges at both monthly and annual scales (r = 0.87 and 0.7, respectively), where the bias correction slightly improves by 3 and 15% of the remotely sensed precipitation, respectively. The outcomes suggest that precipitation from CHIRPS and air temperature from Daymet are adequate for modeling the potential recharge process in the BM (CHIRPS-Daymet Model; M4). Moreover, potential recharge values estimated using the remote sensing data and global products resulted in similar spatial patterns and rates in those zones with a high density of climatological stations (Figs. 5, 6 and 8). The lack of ground gauges in the Sierra Nevada and Sierra de Calpulalpan resulted in discrepancies of ~20–50 mm of the averaged potential recharge in geographical features (Fig. 8), with pixel value differences above 150 mm (Fig. 6) between local and remote sensing models, even when applying bias correction.

The AET from global products derived from remote sensing data was found to overestimate the evapotranspiration in the BM. For instance, AET from GLEAM represented more than 97% of the annual precipitation, and TerraClimate showed years where AET was higher than precipitation. Overestimation of AET from GLEAM has been associated with forest areas in China (Yang et al. 2017) and showed low performance across Africa in comparison to other high-resolution models (Weerasinghe et al. 2020). Similar results have been found by Salazar-Martínez et al. (2022), where MOD16 and GLEAM products tend to overestimate evapotranspiration in forest vegetation and underestimate this variable in nonforest areas. The modeled AET outcomes using the SWB approach agreed with previous estimations of Birkle et al. (1998), considering that annual AET represents 70–80% of annual precipitation in the study area.

The spatial averaged mean annual potential recharge estimated in this study using four SWB model configurations (shown as the Basin-wide region in Table 7) ranged from 37–45 mm/year (or 10.38–12.61 m3/s), with spatial averaged annual values that ranged between 7.4 and 110 mm (2.1–30.8 m3/s) for the period 2000–2021. Such estimations are consistent with previous works. Carrera-Hernández and Gaskin (2008), for example, reported an annual potential recharge flowrate of 10.9–23.8 m3/s considering the period 1975–1986, while Birkle et al. (1998) reported 13–13.8 m3/s for the period 1980–1985. The spatial patterns of annual recharge simulated in this work showed more similarity with results from Carrera-Hernández and Gaskin (2008), where the Sierra de las Cruces, Sierra de Chichinautzin, and Sierra de Nevada are the largest and most important recharge areas, basin-wide. The low recharge rates in the valleys are associated with low precipitation values and high temperatures within clayey and low-permeability surficial deposits.

Overall, annual recharge over the BM represents from 2 to 12.5% of annual precipitation, with a mean value of ~5%, as shown in Fig. 10a. The year 2015 represented the worst recharge condition during the analyzed period, whereas 2018 and 2021 showed the largest proportion of precipitation potentially recharging the main aquifer. However, annual groundwater withdrawals represent more than four times the computed potential recharge volume (Fig. 10b), and even more than tenfold solely in 2015. Groundwater overdraft represents one of the most important issues for the BM’s water security (Escolero et al. 2016) since this condition caused excessive drawdowns and large subsidence rates (Chaussard et al. 2021; Hernández-Espriú et al. 2014; Carrera-Hernández and Gaskin 2007). Moreover, the increase in temperature, urban cores growing in the south part of Mexico City, and climate change will likely reduce the potential recharge in the long run.

Fig. 10
figure 10

Temporal variation of a percent of annual precipitation that becomes potential recharge (GWR), and b dimensionless groundwater stress index as the ratio of groundwater (GW) withdrawals and potential recharge (GWR) in the Basin of Mexico (period: 2000–2021)

Summary and conclusions

In this research, a practical approach to characterizing the temporal and spatial distribution of regional groundwater recharge is proposed by primarily using remote sensing information and global hydrological data. This strategy was applied to the Basin of Mexico (BM), a heavily water-stressed region that relies on groundwater for most of its domestic supply.

Climatological records from ground stations in combination with remotely sensed precipitation from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) were used in combination with daily temperature from the Daymet product, land use derived from the Climate Change Initiative Land Cover (CCI-LC), soil types analyzed from the global SoilGrids dataset, and actual evapotranspiration from GLEAM, MOD16, and TerraClimate products. These data were used as forcings to feed a conceptual-based daily SWB model (Dripps and Bradbury 2007; Westenbroek et al. 2010), which estimates water balance components at a daily time step using a modified version of the Thornthwaite-Mather soil-moisture-balance approach.

The key motivation of this research was to assess the reliability of global data sources on recharge modeling outcomes. Thus, four recharge model configurations were tested and calibrated (using runoff as the target variable), combining both local, i.e., ground gauges and remote/global information.

The main results showed a good correlation between monthly gauges-derived and remotely sensed precipitation and also by local and global (Daymet) air temperature (0.87 ≤ r ≤ 0.9 and r = 0.72, respectively). The 2000–2016 spatial average recharge considering the Local (M1), CHIRPS (M2), CHIRPSC (M3), and CHIRPS-Daymet (M4) models were 37, 45, 38, and 45 mm/year (equivalent to a potential recharge rate of 10.38, 12.57, 10.73, 12.61 m3/s), and medians of 21, 16, 18, 24 mm/year (5.9, 4.5, 5, 6.7 m3/s), respectively. All model setups agreed that the BM southwest, along the Sierra de las Cruces and Sierra de Chichinautizn, represents the dominant groundwater recharge area, while potential recharge in a large share of Mexico City and its northern metropolitan region is virtually zero.

Surprisingly, results suggest that precipitation and groundwater recharge are not linearly correlated but dependent, following a power function (GWR = a(P)b, where GWR stands for recharge and P for precipitation). Basin-wide, the areas more prone to major groundwater recharge are those with annual precipitation rates greater than ~800 mm in elevation terrains higher than ~3,000 m asl, along basaltic-andesitic and andesitic-basaltic rocks. In contrast, areas with rainfall rates less than ~650 mm/year, within areas less than 2,500 m asl of elevation, along alluvial deposits and rhyolitic rocks, develop minimal recharge rates. In addition, a precipitation rate lower than 500 mm/year does not generate recharge.

The outcomes suggest that precipitation from CHIRPS and air temperature from Daymet are adequate for modeling the potential recharge process in the study area by means of the CHIRPS-Daymet Model (M4). As demonstrated, groundwater recharge estimates using the remote sensing data and global products derived similar regional patterns and rates than in those areas with a high density of climatological ground stations.

However, remote sensing/global data showed discrepancies with local data even after bias corrections (Fig. 3), where more complex correction methods (e.g., quantile mapping and machine learning approaches) could improve the results. The soil types from SoilGrids showed differences with the local information, but the local data showed inconsistent soil types (such as urban and water types) that made the comparison difficult. A better agreement of land use derived from remote sensing data with local data was observed, but a limitation of this study is the consideration of time-constant land use, where the comparison of land use from CCI-LC in 2015 with the local land use from 2013 showed an increase in the urban area, affecting other land uses.

Another limitation of this study was the manual calibration procedure, where the use of automatic parameter estimation programs, such as PEST, could improve the model performance. Finally, the AET derived from remote sensors generated unrealistic values, with AET values higher than or close to the annual precipitation; hence, these products would also have to be corrected before they can be used for multiobjective calibration in the BM.

It is concluded that these remote sensing and global data can be used to depict regional changes in recharge patterns by using pseudo-continuous hydrological information in areas with limited ground observations to feed a simple yet robust recharge model (SWB model). However, data validation from remote-sensing/global products with local data is recommended to understand how these global data could affect the model results.

The presented approach in this paper can be extrapolated to other parts of the world with similar hydrogeological settings for establishing changes in regional recharge, which can be particularly beneficial in low-income countries where site-specific data are often incomplete and out of date.