Introduction

As the largest freshwater storage component of hydrological systems, groundwater interacts with other land water components in rivers, lakes, soil, snow, ice and plants, and responds to changes in climate at regional and global scales (Green et al. 2011; Perez-Valdivia et al. 2012; Lambert et al. 2013; Watras et al. 2014). It is well known that the amount of groundwater storage varies in time and space due to recharge and discharge processes and the spatial variation in aquifer capacity and properties. Yet, estimates of groundwater storage (GWS) variations remain poorly known (Richey et al. 2015b), as does a methodology to link large-scale estimates with groundwater-monitoring well level (GWWL) observations, especially in regions with sparse observation data (Henry et al. 2011).

In Canada, GWWL observations are made by several government jurisdictions and private industries, and span shallow water-table levels to deep hydraulic heads. The distribution of these observations is spatially variable, with most monitoring wells located in the southern part of the country, and extremely sparse or non-existent data in the northern part; thus, it is challenging, if possible at all, to obtain a regional and national representation of GWS variation patterns solely from monitoring wells. Where GWWL observations exist, accurate estimates of the change in groundwater storage can be limited because of uncertainties in specific yield, effective porosity, or storage coefficient.

The Gravity Recovery and Climate Experiment (GRACE) mission, launched in March 2002, aimed to measure water mass redistribution in the Earth system (Tapley et al. 2004). The main weakness of GRACE is the coarse spatial resolution of about 400 km, which makes it challenging to link GRACE estimation to point-scale ground observation. The GRACE measurements have shown good agreement with in-situ soil moisture and groundwater observations over large regions, and have been successfully used to study groundwater depletion trends in large water basins, sub-basins, regions and sub-regions worldwide (Yeh et al. 2006; Rodell et al. 2007, 2009; Swenson et al. 2008a; Strassberg et al. 2009; Tiwari et al. 2009; Wada et al. 2010; Famiglietti et al. 2011; Scanlon et al. 2012; Feng et al. 2013; Huang et al. 2015). These studies are mostly based on monthly GRACE models. Direct assimilation of instantaneous tracking data can also be used for groundwater studies (see Han et al. 2009). In the regions where in-situ soil moisture data are not available, the soil moisture predictions from the land surface models (LSMs) were used to separate the GWS variations from the GRACE total water storage (TWS) variations. However, several studies have suggested that the GRACE-derived seasonal GWS variations are not realistic due to uncertainties in the simulated soil moisture predictions from the LSMs (Huang et al. 2012; Scanlon et al. 2012; Long et al. 2013). These findings also bring into question the soil moisture trends derived from the LSMs, and consequently the resulting GWS trends; thus, it has become necessary to compare the GWS results from GRACE and the LSMs with GWWL observations to better understand the applicability of estimating GWS variation, and potential limitations when attempting to link such large-scale data with local GWWL observation (Alley and Konikow 2015).

In this study, the plausibility of applying GRACE to evaluate GWS variations is tested by comparing directly with GWWL observations by a large-scale groundwater-monitoring network. The Canadian province of Alberta is chosen because of its varying geologic setting (shallow Quaternary-Neogene aquifers and deep sedimentary basin) and a well-established groundwater-monitoring network, the Groundwater Observation Well Network (GOWN). Groundwater is a vital resource in Alberta, supplying various domestic, municipal, agricultural, industrial, and environmental water needs. Although groundwater exists across all parts of Alberta, it has been estimated that only 4 km³ is recoverable (approximately 6 mm across the entire province), which is predominantly found in shallow bedrock aquifers and buried channel aquifers (AEP 2014).

As in many other regions, effective groundwater management relies on understanding its use and availability. To develop a better understanding of groundwater resources, regional groundwater mapping and flow modelling is a priority, which is dependent on making the most of point-scale observations to explain broad-scale patterns. Broad-scale observation patterns, such as represented by GRACE, have not been investigated in great detail in Alberta. Point-scale observations are provided by the GOWN, which is operated by the Government of Alberta. This network consists of groundwater-monitoring wells located in various aquifers throughout the province, with most wells instrumented with data loggers and sensors that continually record groundwater levels. Current and historical groundwater level information for the active and inactive wells in the network can be accessed from the provincial website (GOWN 2015).

This study has two main objectives. The first is to test the applicability of GRACE to adequately represent the broad-scale pattern of GWS variations in Alberta, Canada. The second is to examine the link between the GRACE GWS variations and GWWL observations. The methodology presented in this study uses the Release 05 (RL05) monthly Earth gravity models derived from the GRACE measurements for the 2002 to 2014 period, and the LSMs. A key question to answer in this case study is: How representative are the GWS variations mapped from the GRACE TWS variations, when the best LSM predictions are used to account for soil moisture? First, the seasonal GWS variations are derived as 12 monthly maps from January to December, averaged over 2002–2014; these maps, along with the GWS trend map and time series, are used to characterize seasonal and inter-annual patterns of GWS variations. Second, the broad-scale patterns are compared with point-scale GWWL observations from the GOWN. The study identifies conditions where point-scale GWWL observations best represent the broad-scale patterns of GRACE GWS variations, thereby illustrating requirements for future strategic GWWL observations.

Data

Monthly GRACE gravity models

The CSR RL05, GFZ RL05a and JPL RL05.1 monthly GRACE models for the period of April 2002 to October 2014 have been used for this study. There are 139 models available for this period and 12 missing months (June and July 2002; June 2003; January and June 2011; May and October 2012; March, August and September 2013; February and July 2014). The GRACE processing standards for these models can be found in Bettadpur (2012), Dahle et al. (2013) and Watkins and Yuan (2012). The GRACE models do not include the spherical harmonic (SH) degree-one terms which correspond to geo-centre motion due to the Earth’s mass redistribution. Swenson et al.’s (2008b) predictions of degree-one terms are added to each monthly GRACE model. The monthly degree-two term \( C_{20} \) time series is better determined by satellite laser ranging (Cheng et al. 2013), and is thus used to replace the corresponding \( C_{20} \) of each GRACE model. In addition, a gravity rate map derived from the GPS vertical velocity map using the ratio between the gravity change rates by absolute gravity measurements and the GPS vertical velocities at the same locations was used to remove the mass redistribution effect due to the glacial isostatic adjustment (Lambert et al. 2013).

Land surface models

The Global Land Data Assimilation System (GLDAS), GLDAS-1, 1.0 by NASA, USA, has been used for this study. GLDAS contains four LSMs: CLM V2.0, Mosaic (MOS) Model, VIC Model and Noah V2.7. The details of these models can be found in Rodell et al. (2004) and Rui (2011). The GLDAS LSMs have been the most widely used models to separate the GWS components from the GRACE TWS variations (see e.g. Yeh et al. 2006; Rodell et al. 2007, 2009; Strassberg et al. 2009; Tiwari et al. 2009; Famiglietti et al. 2011; Scanlon et al. 2012; Huang et al. 2012; Feng et al. 2013; Huang et al. 2015). They are arguably the state-of-the-art LSMs.

Groundwater monitoring well level observations

Groundwater monitoring data were obtained from the Groundwater Information Network (GIN) portal (GIN 2014), which connects groundwater-monitoring data from authoritative sources in most Canadian provinces including Alberta’s GOWN. For the period corresponding to GRACE observations (2002–2014) there are 431 groundwater observation wells with either partial or complete data. Among these wells, there are multiple wells in close proximity at 97 locations. In most cases, these coincident data have the same pattern of water-level variations. For each cluster of wells, the one representative well that had the longest monitoring period and the greatest overlap with the GRACE data was selected. After this culling procedure, 256 monitoring wells were used in this study, as shown in Fig. 1; they are located in various types of aquifers, including bedrock, surficial and buried-valley aquifers, under both confined and unconfined conditions.

Fig. 1
The 256 selected groundwater-monitoring wells are shown as blue dots in the study region (Alberta, AB) marked by the red polygon. Gray polygons show the distribution of small lakes. The water storage variations for the 12 major lakes (blue polygons) are estimated and used for separating the GWS variations from the GRACE TWS variations. In addition, the largest lake within the study region, Lesser Slave Lake, is shown as a green polygon. BC British Columbia, SK Saskatchewan

Lake water-level observations

Lake water-level observations for 12 major lakes in Fig. 1 were retrieved from Environment Canada (Wateroffice 2015). They include Great Bear Lake, Great Slave Lake, Lake Athabasca, Reindeer Lake, Lake Winnipegosis, Lake Manitoba, Lake Winnipeg, Lake Superior, Lake Michigan, Lake Huron, Lake Erie, and Lake Ontario. Although most of these lakes are outside the study region, lake water storage variations need to be considered when estimating GWS from the GRACE TWS variations to reduce the spectral leakage error from these lakes. The lake water-level observations were averaged into monthly values corresponding to the GRACE data months. Lesser Slave Lake is the largest lake within the study region. Its water storage varies from about −0.5 to 0.5 km³ for the study period, which is below the sensitivity of GRACE and is thus negligible. All other lakes in the study region are too small to affect the GWS estimation.

Methods

GIA correction, filtering and leakage errors of monthly GRACE models

The monthly GRACE models contain signals of the Earth’s gravitational potential variations caused by the integrated mass redistribution over time within the Earth surface, which can be expressed as the following SH expansion series (Bettadpur 2012; Dahle et al. 2013; Watkins and Yuan 2012):

$$ V\left(r,\varphi, \lambda, t\right)=\frac{GM}{r}{\displaystyle \sum_{n=0}^N}{\left(\frac{a}{r}\right)}^n{\displaystyle \sum_{m=0}^n}{\overline{P}}_{nm}\left( \sin \varphi \right)\left[{\overline{C}}_{nm}(t) \cos m\lambda +{\overline{S}}_{nm}(t) \sin m\lambda \right] $$
(1)

where r is the geocentric radius, (φ, λ) are the geographical latitude and longitude, respectively, t is time, the pair {GM, a} are the scaling geocentric gravitational constant and reference radius of the GRACE models, \( {\overline{P}}_{nm} \) are the fully normalized associated Legendre functions of the first kind, and the pair \( \left\{{\overline{C}}_{nm},{\overline{S}}_{nm}\right\} \) are the model coefficients of degree n and order m.

The mass redistribution corresponding to the potential variations can be represented by the potential of a mass layer on the Earth’s surface with the water thickness equivalent (WTE; Wahr et al. 1998):

$$ \Delta z\left(r,\varphi, \lambda, t\right)=\frac{5.517a}{3}{\displaystyle \sum_{n=0}^N}\frac{2n+1}{1+{k}_n}{\displaystyle \sum_{m=0}^n}{\overline{P}}_{nm}\left( \sin \varphi \right)\left[{\overline{\Delta C}}_{nm}(t) \cos m\lambda +{\overline{\Delta S}}_{nm}(t) \sin m\lambda \right] $$
(2)

where \( k_n \) is the load Love number of degree n, and the pair \( \left\{{\overline{\Delta C}}_{nm},{\overline{\Delta S}}_{nm}\right\} \) are coefficient residuals with respect to their means, which represent a mean or static potential model.
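As a brief check on the constant in Eq. (2), the following sketch assumes the surface mass density form of Wahr et al. (1998), with a mean Earth density of about 5,517 kg/m³ and a water density of 1,000 kg/m³. Neither value is stated in the text, so both are assumptions here, as are the symbols \( \Delta \sigma \), \( {\rho}_{\mathrm{ave}} \) and \( {\rho}_{\mathrm{w}} \):

$$ \Delta \sigma =\frac{a{\rho}_{\mathrm{ave}}}{3}\sum_{n=0}^N\frac{2n+1}{1+{k}_n}\sum_{m=0}^n{\overline{P}}_{nm}\left( \sin \varphi \right)\left[{\overline{\Delta C}}_{nm}(t) \cos m\lambda +{\overline{\Delta S}}_{nm}(t) \sin m\lambda \right],\kern1em \Delta z=\frac{\Delta \sigma }{{\rho}_{\mathrm{w}}},\kern1em \frac{{\rho}_{\mathrm{ave}}}{{\rho}_{\mathrm{w}}}\approx \frac{5517}{1000}=5.517 $$

Under these assumptions, dividing the surface mass density by the density of water yields the prefactor 5.517a/3 and expresses the result as a water thickness.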

In Canada, non-water mass (mainly the Earth’s crust and mantle) redistribution is largely caused by the glacial isostatic adjustment (GIA) around the Hudson Bay, which is removed using the GPS-derived GIA estimation as discussed in Lambert et al. (2013). The TWS variations after the GIA correction have been extracted from the GRACE models by the two-step method (Huang et al. 2012), which has been modified by adapting its parameters to RL05 models. It is briefly described below.

First, the de-striping step of Swenson and Wahr (2006) is used to correct the GRACE model coefficients above SH degree n = 11 and order m = 2. In the coefficient fitting, a fourth-order polynomial is chosen when m < 31; otherwise a third-order polynomial is chosen. To minimize the loss of signal, de-striping is applied only when the root mean square (RMS) ratio between the original and de-striped coefficients is greater than 2. Second, a statistical test step is used to identify signal-dominant coefficient time series for all degrees and orders, so that only noise-dominant coefficient time series are filtered. The 7-parameter model (Huang et al. 2012) is used to fit each coefficient time series. Two criteria are applied to test whether a series is signal-dominant: (1) the signal-noise ratio is greater than 4 for at least one of the following four parameters in the 7-parameter model: velocity, acceleration, annual and semi-annual amplitudes; (2) the confidence level of the F-test on the least-squares fitting (Davis et al. 2008) is greater than 99.9 %. In addition, when the series passes the test, the fitting residuals of all coefficients in the series are tested against three times the RMS of the residuals. The coefficients passing both tests remain unchanged, while the non-isotropic Gaussian filtering with \( r_0 \) = 300 km, \( r_1 \) = 450 km and \( m_1 \) = 60 (see Han et al. 2005) is applied to the remaining coefficients only. This two-step method minimizes the signal loss resulting from the three methods while retaining their efficiency and effectiveness in extracting the signal.

In the processing, the reference mean GRACE model is derived from the 72 monthly models between 2004 and 2009. The mean model must be removed from all the monthly models prior to the de-striping step. The critical and parametric values above are selected through a full range of experiments and give the best correlation between the resulting GRACE TWS variation and the terrestrial water storage variation from all four GLDAS LSMs for the period of study.

The non-tidal variability in oceans has been removed from the monthly GRACE models using the baroclinic OMCT ocean model (Bettadpur 2012; Dahle et al. 2013; Watkins and Yuan 2012). However, uncorrected oceanic effects and modelling errors are still present in the GRACE TWS results over oceans; because of the coarse spatial resolution of GRACE, they leak into the TWS results over continental land. A procedure has been used to remove these leakage errors and mask out the Antarctic. This procedure consists of the following steps:

  • GRACE TWS values over oceans and the Antarctic are set as zeros.

  • Remaining land TWS values are converted into monthly SH models to predict the leakage errors into oceans.

  • GRACE TWS values over oceans are corrected for the leakage errors from the previous step.

  • Corrected TWS values from the previous step are converted into monthly SH models to predict the leakage errors into the land.

  • Land TWS values are corrected for the leakage errors from the previous step.
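The five steps above can be sketched on a one-dimensional toy profile. This is only an illustration of the order of operations under one reading of the list: the moving-average `smooth` function is a hypothetical stand-in for the conversion to a truncated SH model and back, and the names `correct_leakage`, `tws` and `is_land` are assumptions, not part of the original processing.

```python
import numpy as np

def smooth(field, width=5):
    """Hypothetical stand-in for 'convert to an SH model and synthesize back'.

    A moving average mimics the limited spatial resolution; it is NOT a
    spherical-harmonic transform.
    """
    kernel = np.ones(width) / width
    return np.convolve(field, kernel, mode="same")

def correct_leakage(tws, is_land, width=5):
    """Apply the five-step leakage correction to a 1-D TWS profile."""
    tws = np.asarray(tws, dtype=float)
    # Step 1: set values over oceans (and the Antarctic) to zero.
    land_only = np.where(is_land, tws, 0.0)
    # Step 2: low-resolution image of the land signal = predicted leakage into oceans.
    leak_into_ocean = smooth(land_only, width)
    # Step 3: correct the ocean values for that leakage (land set to zero).
    ocean_corrected = np.where(is_land, 0.0, tws - leak_into_ocean)
    # Step 4: low-resolution image of the corrected ocean signal = leakage into land.
    leak_into_land = smooth(ocean_corrected, width)
    # Step 5: correct the land values; ocean cells stay masked at zero.
    return np.where(is_land, tws - leak_into_land, 0.0)
```

In this toy setting, a profile whose ocean values consist only of leaked land signal is returned with its land values unchanged, because the corrected ocean residual in step 3 is then zero.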

Filtering of monthly soil moisture, snow and ice storage predictions and surface water storage observations

The geophysical parameters on terrestrial water storage in the LSMs consist of soil moisture, snow and ice water and canopy water (except for VIC). Their sum is termed SMSI. Soil moisture storage is dominant in the TWS outside ice caps over northern Canada. The SMSI storage predictions have a higher spatial resolution than the GRACE TWS variation results. To make their resolutions comparable, these predictions are first converted to GRACE-like SH models, then truncated and filtered in the same way as the GRACE models (Huang et al. 2012). The same conversion has also been applied to the monthly lake water storage data described in section ‘Lake water-level observations’. Strictly speaking, water storage variations in rivers and small water bodies should also be corrected for; however, at the resolution of GRACE, they are too small to be measured and difficult to estimate from the available in situ observations, and are thus neglected in this study.

Separation of the groundwater storage variation

Given the GRACE TWS \( \Delta {z}_{\mathrm{GRC}} \), the SMSI storage \( \Delta {z}_{\mathrm{SMSI}} \), and the surface-water storage in lakes and rivers \( \Delta {z}_{\mathrm{SW}} \) in WTE, the groundwater storage variation \( \Delta {z}_{\mathrm{GW}} \) can be separated by means of the water storage balance in Eq. (3), where each term represents the deviation from the mean value of a reference period:

$$ \Delta {z}_{\mathrm{GW}}=\Delta {z}_{\mathrm{GRC}}-\Delta {z}_{\mathrm{SMSI}}-\Delta {z}_{\mathrm{SW}} $$
(3)

Isolating the GWS component from GRACE TWS variation depends on how well the SMSI and surface-water components can be estimated. The latter can be accurately estimated from water level gauges on major lakes, but is usually neglected for rivers and small lakes due to differences in spatial resolution and a lack of monitoring data. The former is generally dominant in the TWS variation, and challenging to estimate accurately because SMSI monitoring data are very sparse; thus, the LSMs are often used to provide the SMSI component for the separation of the GWS component. However, the SMSI predictions differ widely from model to model, more in magnitude than in phase (Fan et al. 2011; Huang et al. 2012); these differences are due to the disparity in forcing data (precipitation, pressure, temperature, etc.), input data, modelling methods and parameters (e.g. Rodell et al. 2004; Fan et al. 2011; Wang et al. 2013). They introduce uncertainty into the resultant GWS variation; however, this uncertainty can be assessed by comparing the GWS variation with GWWL observations. Using GWWL observations for a specific region can thus better inform the accuracy of an LSM, which leads to a better estimation of the GWS variation.
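A minimal sketch of the balance in Eq. (3) for a single time series may help make the role of the reference period concrete. The function names, the `ref_mask` argument and the array layout are illustrative assumptions; all inputs are taken to be in the same WTE unit and already filtered to a comparable resolution.

```python
import numpy as np

def anomaly(series, ref_mask):
    """Deviation of a series from its mean over the reference period.

    ref_mask is a boolean array selecting the reference months (an assumption
    about how the reference period is specified).
    """
    series = np.asarray(series, dtype=float)
    return series - series[ref_mask].mean()

def separate_gws(tws, smsi, sw, ref_mask):
    """Eq. (3): GWS anomaly = TWS anomaly - SMSI anomaly - surface-water anomaly."""
    return anomaly(tws, ref_mask) - anomaly(smsi, ref_mask) - anomaly(sw, ref_mask)
```

Because the three components are simply subtracted, any error in the SMSI or surface-water term passes directly into the GWS estimate, which is why the choice of LSM matters in what follows.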

Results

Total water storage variations

Using CSR RL05 models, the monthly TWS variations have been estimated for 139 months from April 2002 to October 2014, which have been used to produce three types of results: twelve mean monthly TWS maps from January to December; one trend map; and one TWS variation time series for the province of Alberta.

The 12 monthly TWS maps are derived by removing the mean map for 139 months, then averaging the same month maps across all the years of the study period. The linear trend is mostly averaged out because of the removal of the mean map, which effectively shifts the zero TWS reference to the middle of the study period. The resultant maps reflect the annual TWS variation patterns for the study region, and can be considered as long-term normal TWS patterns for each month. These maps can be used as references to indicate the TWS anomaly for a month of interest.
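The averaging described above can be sketched as follows, assuming the monthly maps are stacked in an array of shape (epochs, latitude, longitude) and that the calendar month of each epoch is known. The function name and array layout are assumptions for illustration; missing months are handled simply by being absent from the stack.

```python
import numpy as np

def monthly_climatology(maps, months):
    """Mean map for each calendar month after removing the overall mean map.

    maps   : array of shape (n_epochs, n_lat, n_lon), e.g. 139 monthly grids
    months : calendar month (1-12) of each epoch
    """
    maps = np.asarray(maps, dtype=float)
    months = np.asarray(months)
    # Remove the mean map over all available epochs.
    anomalies = maps - maps.mean(axis=0)
    clim = np.full((12,) + maps.shape[1:], np.nan)
    for m in range(1, 13):
        sel = months == m
        if sel.any():
            # Average all Januaries, all Februaries, and so on.
            clim[m - 1] = anomalies[sel].mean(axis=0)
    return clim
```

A calendar month with no data at all would be left as NaN in this sketch.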

Figure 2 shows the 12 monthly TWS variation maps. The seasonal pattern is evident from January to December, with a strong increase in storage from February to June and a decrease from August to October. The change in TWS variation from May to June in the southern half of the map area is dramatic, and requires further investigation. The presence of the Canadian Rocky Mountains (oriented southeast to northwest along the southwest margin of the study region) provides a distinct division of TWS variation amplitudes. The TWS variation is highest in the southwest, adjacent to the Rocky Mountains, and significantly lower in the northeast. A similarly low TWS variation continues eastward from the study region. This division is consistent with the precipitation pattern in western Canada (NRCan 2015). Furthermore, the coarse spatial resolution of GRACE spreads the signal from a region of higher TWS variation over its surroundings, which causes signal leakage from the southwest to the northeast of the Rocky Mountains; hereafter this is termed the spatial leakage error.

Fig. 2
The mean monthly total water storage (TWS) variation maps from GRACE observation in Alberta, for the period of April 2002 to October 2014

The trend map shown in Fig. 3 in WTE per year is derived from a linear least-squares regression, in which monthly values of the TWS time series at each grid point are equally weighted. It is not re-scaled by any factors because the objective of this study is to map variations based solely on the GRACE data. The scaling factors are often derived from the water-storage-simulation data such as soil moisture predictions from GLDAS CLM and may cause distortion in the resulting GRACE trend map (e.g. Landerer and Swenson 2012). In this region, the TWS trend ranges from −4 to 23 mm/year (mean of 7 mm/year) and decreases from southwest to northeast. The TWS accumulation in the southeast portion of the study area is counterintuitive when compared with the precipitation pattern (as this area tends to have low precipitation); however, accumulation appears to reflect a recovery from the drought conditions in Alberta prior to the study period (e.g. Armstrong et al. 2015).
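An equally weighted linear regression at every grid point can be sketched as below. The function name and array layout are assumptions; the epochs are passed as decimal years, so gaps in the monthly record need no special handling.

```python
import numpy as np

def trend_map(t_years, maps):
    """Equally weighted least-squares linear trend at each grid point.

    t_years : epoch of each monthly map in decimal years
    maps    : array of shape (n_epochs, n_lat, n_lon) in WTE
    Returns the slope per year with shape (n_lat, n_lon).
    """
    t = np.asarray(t_years, dtype=float)
    maps = np.asarray(maps, dtype=float)
    n_epochs = maps.shape[0]
    # Design matrix: constant term plus centred time for numerical stability.
    A = np.column_stack([np.ones(n_epochs), t - t.mean()])
    # Solve all grid points at once; each column of the right-hand side is one point.
    coef, *_ = np.linalg.lstsq(A, maps.reshape(n_epochs, -1), rcond=None)
    return coef[1].reshape(maps.shape[1:])
```

No scaling factor is applied in this sketch, in keeping with the choice described above.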

Fig. 3
The total water storage (TWS) variation trend in water thickness equivalent (WTE) from GRACE observation for the period of April 2002 to October 2014

The average TWS variation time series over the study region has been estimated using the 2-D window weight function (Huang et al. 2012) to reduce the spatial leakage error. The mean GRACE TWS variation series over Alberta is displayed in Fig. 4. It shows strong seasonal, inter-annual and secular variations with a standard deviation of 56 mm. The annual amplitude of the series is estimated to be 53 mm with a phase of 0.72 year, and the TWS trend rate is 9 ± 1 mm/year. The TWS increases in 2005 and 2011 contributed most to the trend, while the TWS decreases in 2008 and 2014 offset the trend. A 6-year inter-annual cycle appeared from 2005 to 2011, but needs to be characterized with a longer time series.
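One common way to obtain a trend, annual amplitude and phase from such a series is a joint least-squares fit of a line and an annual sinusoid. The sketch below is an illustration only: the phase convention (time of the annual maximum as a fraction of a year) is an assumption, since the text does not define it, and this simplified model is not the 7-parameter model used for the coefficient series.

```python
import numpy as np

def fit_trend_and_annual(t_years, y):
    """Fit y = bias + trend*(t - mean(t)) + A*cos(2*pi*(t - phase)) by least squares.

    Returns (trend per year, annual amplitude A, phase in years within [0, 1)).
    """
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 2.0 * np.pi
    # A*cos(w*(t - phase)) = c*cos(w*t) + s*sin(w*t), which is linear in c and s.
    A = np.column_stack([np.ones_like(t), t - t.mean(), np.cos(w * t), np.sin(w * t)])
    (bias, trend, c, s), *_ = np.linalg.lstsq(A, y, rcond=None)
    amplitude = np.hypot(c, s)
    phase = (np.arctan2(s, c) / w) % 1.0
    return trend, amplitude, phase
```

With only about 12 years of data, a longer inter-annual cycle such as the 6-year feature noted above would not be separated from the trend by this simple model.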

Fig. 4
The average total water storage (TWS) variations from GRACE observation in Alberta for the period of April 2002 to October 2014

A simulation analysis has been performed to estimate how the spatial leakage and filtering errors affect the TWS time series for the region. The SMSI grids from GLDAS Noah for the study period are used for the simulation. The simulation results are shown in Fig. 5. They indicate a slight trend of about −1 mm/year. The annual amplitude of the unfiltered Noah predictions is 41 mm with a phase of 0.79 year. The standard deviation of the differences before and after the GRACE-equivalent filtering is 10 mm in comparison with the standard deviation of the soil-moisture storage variations of 35 mm. Thus, a relative error level of 8 % \( \left(\frac{10^2}{35^2}\times 100\%\right) \) can be inferred from the variance of the GRACE TWS variations over Alberta assuming that the simulation with the Noah model is realistic in terms of both magnitude and phase.

Fig. 5
The SMSI storage variations from GLDAS Noah, before and after the GRACE-equivalent filtering

Two other GRACE processing centres, GFZ (Germany) and JPL NASA (USA), also produce the monthly GRACE models operationally. To understand the uncertainty of GRACE processing, the monthly TWS variations have also been estimated from the GFZ RL05a and JPL RL05.1 models for the same 139 months. They differ from the CSR RL05 ones by 29 and 30 mm in RMS, respectively, for the domain of Fig. 3 and the 139 months. In contrast, the RMS of the CSR TWS results is 77 mm for the domain, suggesting a relative uncertainty level of 14–15 % from the GRACE processing. These differences are likely due to different processing methods and procedures. Each monthly map in Fig. 2 represents a mean map of about 12 months, so that the corresponding processing uncertainty is about \( \frac{30}{\sqrt{12}}=9 \) mm for each of the 12 monthly maps. Trend estimates from the GFZ RL05a and JPL RL05.1 models differ from the CSR trend in Fig. 3 by about 2 mm in RMS, or a relative uncertainty of 4 %.
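The relative error of 8 % quoted for the filtering simulation is written explicitly as a ratio of variances. Reading the 14–15 % processing uncertainty in the same way reproduces the quoted range, although that reading is an interpretation rather than something the text states. The short sketch below, with hypothetical function names, repeats the arithmetic.

```python
import math

def relative_error_percent(error_rms, signal_rms):
    """Relative error level as a variance ratio, in percent."""
    return 100.0 * error_rms**2 / signal_rms**2

def rms_of_mean(rms_single, n):
    """RMS of the mean of n values, assuming their errors are independent."""
    return rms_single / math.sqrt(n)

filtering = relative_error_percent(10, 35)    # filtering simulation, 10 mm vs 35 mm
gfz_vs_csr = relative_error_percent(29, 77)   # GFZ minus CSR, 29 mm vs 77 mm
jpl_vs_csr = relative_error_percent(30, 77)   # JPL minus CSR, 30 mm vs 77 mm
monthly_map = rms_of_mean(30, 12)             # mean of about 12 monthly maps
```

The last line relies on the errors of the individual months being independent, an assumption that is implicit in dividing by the square root of 12.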

Groundwater storage variations

Successfully isolating the GWS component from the GRACE TWS largely depends on the accuracy of the SMSI and SW components. The SMSI storage predictions from LSMs vary from model to model significantly. Figure 6 shows variability of the predictions from the four GLDAS models. They show similar seasonal patterns but different amplitudes.

Fig. 6
Comparisons of the SMSI storage time series from the four GLDAS models and their means (AVG) in WTE in Alberta

The correlation between the GRACE TWS variation and the SMSI predictions from the four GLDAS models has been computed for the 139 GRACE months, as shown in Fig. 7. For the entire province of Alberta, VIC gives the best correlation (0.54), while CLM gives the worst (0.39). In addition, one average SMSI prediction set, named AVG, has been created from the four GLDAS SMSI predictions. As expected, it gives an intermediate correlation (0.52) when compared with the four individual models. All four GLDAS models as well as the AVG model have been used to separate the GWS component in this study. To match the GRACE resolution, all the monthly SMSI grids are first converted into monthly SH models, which are then filtered identically to the GRACE models, and finally synthesized into SMSI grids that are spectrally comparable with the GRACE TWS grids.

Fig. 7
Correlation between the GRACE TWS variations and the SMSI predictions from the GLDAS Mosaic (MOS), Noah (NOAH), VIC, CLM, and the average of the four models as AVG. Both the TWS and SMSI grids are in 1 by 1 arcdeg. The 256 well locations are marked in the AVG panel

To minimize the large-scale spatial leakage error caused by major lakes outside the study region, the monthly water storage variations from these lakes are removed by converting them into GRACE-equivalent models, following the same procedure that is applied to the SMSI grids as described above. Major rivers in Alberta may also have water storage variations, but these are expected to be below the sensitivity of GRACE and are not accounted for in this study. By removing the filtered SMSI and SW grids, which are spectrally comparable with the GRACE TWS results, the monthly GWS results are derived for each GLDAS model and AVG.

The 12 monthly GWS variation maps in Fig. 8 show the average GWS variations using AVG over a year for the period of study. Within Alberta, the overall GWS increases from February to June, decreases from July to October, then increases from November to December and decreases in January. Using Noah gives a similar annual pattern to the one shown in Fig. 8. The amount of GWS variation is approximately half of the TWS variation, and more closely reflects the hydrologic seasons in Alberta. The seasonal amplitude of GWS variations within Alberta is lower than the southwest region of the maps in Fig. 8. Similar to the TWS variations shown in Fig. 2, spatial signal leakage from the southwest is evident for 6 of 12 months due to the coarse resolution of the GRACE TWS results, and presence of the Rocky Mountains.

Fig. 8
The monthly groundwater storage (GWS) variation maps from GRACE observation and the GLDAS AVG model for the period of April 2002 to October 2014

The GWS trend using AVG shown in Fig. 9 ranges from −4 to 25 mm/year with a mean of 11 mm/year. It generally increases from west to east. The GWS trend is greater than the TWS trend shown in Fig. 3 implying an overall decreasing trend of the SMSI and SW components, in particular over northern Alberta. The GWS trend has also been estimated using each of the four GLDAS models. The mean standard deviation of the four resulting trend distributions is smaller than 2 mm/year. Within the province of Alberta, the standard deviation is predominantly 1 mm/year, which represents the uncertainty level of about 3 % in the trend estimation due to the removal of AVG.

Fig. 9
The groundwater storage (GWS) variation trend map from GRACE observation (using the AVG model) for the period of April 2002 to October 2014

The GWS variation time series using AVG is given in Fig. 10. It is noisier and seasonally weaker than the TWS variation in Fig. 4. Its trend is 12 ± 1 mm/year, in contrast to 9 ± 1 mm/year for the TWS trend. The GWS shows an increasing trend through the study period except for a drop in 2007. The annual amplitude is 22 mm with a phase of 0.65 year, compared with 53 mm and 0.72 year, respectively, for the TWS variation. The GWS variation time series using each GLDAS model is also computed and shown in Fig. 10. The standard deviation of the GWS values using the four models at each month ranges from 2 to 32 mm with a mean of 11 mm. In comparison to the annual amplitude of 22 mm, this represents an uncertainty level of 25 % on average in the estimation of GRACE GWS annual variations.

Fig. 10
The average GWS variations from GRACE observation using the four GLDAS models and AVG in Alberta, for the period of April 2002 to October 2014

Discussion

GRACE groundwater variations and groundwater-monitoring well levels

The objective of this study is to test the applicability of GRACE for representing GWS variations. The GWWLs can be compared with the corresponding GRACE results at the regional scale. This approach requires: (1) a dense distribution of groundwater-monitoring wells for the study region; and (2) realistic specific yield or effective porosity values at each well, which are representative of the sub-region surrounding it. Given the large-scale study region (as would be expected for most parts of the world), the first requirement is not met, while the second requirement is met at least from a first-order estimate given the nature of the aquifer material. Ideally, a groundwater observation network with uniformly distributed wells, complete time series data, and very well characterized aquifer properties would be used to test the applicability of GRACE. Such an observation network does not exist at a spatial scale comparable to GRACE measurement. The GWWL observations in this study do not span the whole GRACE period, and have data gaps from well to well. These data are what can be expected for many large regions, and in the case of Alberta, are moderately well distributed across a significantly large area. Not all groundwater wells in a given region will reflect regional groundwater storage changes due to hydrological settings and human activities such as withdrawal, injection, and mining. Some of the GOWN wells have been installed to monitor transient response to pumping and interaction with surface water, and do not adequately capture the regional groundwater response. The wells which reflect and respond to groundwater variations over regional aquifers and watersheds need to be identified so that their water levels are more comparable with the GRACE water storage results.

One approach for linking GRACE results to finer spatial scales (e.g. observation well data) would be to invoke a scaling routine. Landerer and Swenson (2012) estimate the grid gain factors from an LSM simulation, which are then used to extrapolate the GRACE results to finer scales. The applicability of these gain factors is dependent on how well the LSM simulates TWS variations. As the objective of this study is to find the connection between the GRACE GWS and GWWL variations directly, the grid gain factors are not applied to the GRACE GWS. To link GWWL data with GRACE, an approach needs to take into consideration the fact that the GRACE GWS variations are spatially coarse, and are not strictly comparable with well observations. Regional GWS variations cannot be estimated directly from these wells; however, GRACE GWS can be directly compared. In this study, a different scaling approach is developed that links GWWL data with GRACE; however, a first step is to identify where GWWL time series generally match GRACE results. This selection is intended to reveal conditions where GRACE might reflect the regional GWWL variation.

The selection criterion is defined in terms of the statistical correlation coefficient between the GRACE TWS time series and the GWWL time series at each well for the period of the well observation. If the coefficient is greater than 0.5, the well is selected. The intention of this selection criterion is to identify under what conditions observation well data match GRACE. Applying this criterion, 36 wells were selected out of the original 256 wells; they are shown in Fig. 11 and Table 1, and have a mean correlation coefficient of 0.68. These wells are located in the middle and southern parts of the study region, where the majority of Alberta’s 256 GOWN wells, as shown in Fig. 1, are located. The selected wells also better represent broad-scale (i.e. regional) conditions than wells installed for monitoring smaller-scale hydrologic regimes such as interaction with surface water. The GWS variation from GRACE and the LSMs is likely realistic around these wells, though further hydrogeological investigation is required to verify this finding. However, the 36 wells are also clustered in the southeast corner of the study region, bringing into question the geologic difference and its implication for detecting GRACE-scale changes in storage (for more discussion see section ‘Conditions for the correlation of GRACE data with groundwater observations’).
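The selection step can be illustrated with a minimal Python sketch. It assumes that, for each well, the GRACE TWS series at the well location and the GWWL series have already been placed on common monthly epochs, with NaN marking months without data. The function names, the dictionary layout and the `min_months` guard are illustrative assumptions and are not taken from the study's processing chain.

```python
import numpy as np

def correlation_over_overlap(grace_tws, gwwl, min_months=3):
    """Pearson correlation over epochs where both series have data.

    Returns None when the overlap is too short or either series is constant.
    """
    x = np.asarray(grace_tws, dtype=float)
    y = np.asarray(gwwl, dtype=float)
    # Restrict to the well's observation period (months present in both series).
    valid = ~np.isnan(x) & ~np.isnan(y)
    if valid.sum() < min_months:
        return None
    x, y = x[valid], y[valid]
    if x.std() == 0.0 or y.std() == 0.0:
        return None
    return float(np.corrcoef(x, y)[0, 1])

def select_wells(series_by_well, threshold=0.5):
    """Keep wells whose correlation with GRACE TWS exceeds the threshold.

    series_by_well maps a (hypothetical) well id to a pair
    (grace_tws_at_well, gwwl) of equal-length monthly arrays.
    """
    selected = {}
    for well_id, (grace_tws, gwwl) in series_by_well.items():
        r = correlation_over_overlap(grace_tws, gwwl)
        if r is not None and r > threshold:
            selected[well_id] = r
    return selected
```

Restricting the calculation to the overlapping months mirrors the statement that the coefficient is computed for the period of the well observation; how gaps were actually handled in the study is not specified here.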

Fig. 11

The 36 wells with observations highly correlated with the GRACE-derived total water storage (TWS) variations

Table 1 Formation-lithology, GWWLs’ correlation with the GRACE TWS (TWS Corr.) variations, well depths, numbers of months, specific yields or effective porosities, and conversion coefficients using AVG at 36 wells

The correlation coefficients between the GRACE GWS variations and the GWWL time series have been computed as shown in Fig. 12 and Table 2. All four GLDAS models give similar mean correlations, ranging from 0.72 to 0.74, while AVG gives a slightly better mean correlation of 0.75 than any individual model. These correlation values are slightly higher than the mean correlation of 0.68 with TWS. In other words, removing the SMSI and SW components has improved the overall correlation, implying that the SMSI predictions from the GLDAS models are reflective of the GRACE TWS variations. Furthermore, the GRACE GWS results give a significant trend in the area of the selected wells, as shown in Fig. 9. Accordingly, similar trends are expected to be seen in the GWWL data.

Fig. 12

The 36 wells with observations highly correlated with the GRACE-derived groundwater storage (GWS) variations using AVG. The pixel codes are labelled in black. Each cell corresponds to a GRACE GWS ‘pixel’

Table 2 Correlation between the GRACE GWS variations using the four GLDAS models as well as AVG and the GWWLs at the 36 selected wells

To understand the influence of the coarse spatial resolution of GRACE on estimating GWS, a simulation analysis was completed that computed the correlation coefficients between the unfiltered and filtered Noah SMSI time series at the grid point closest to each GOWN well. Only 10 out of the 256 wells show coefficients smaller than 0.5, and only 3 of these 10 wells are among the 36 selected wells (well IDs 121, 123, and 278). The mean value of the correlation coefficients from this simulation is 0.70, which is close to the mean of 0.74 for the GRACE GWS results using Noah. This agreement is surprising, as the Noah SMSI simulation can reflect the GWS variation only indirectly due to the lack of a groundwater component. Nevertheless, the simulation indicates how much the low resolution of GRACE affects the correlation analysis. It suggests that the comparison between the GRACE GWS results and the GWWL observations can help validate and understand the coarse GWS results. Furthermore, it can be concluded that the GWS component has been extracted from the GRACE TWS variation to a significant level when using the GLDAS models in the area with the selected wells.

The remaining question is how to interpret the coarse GRACE GWS variations and connect them to the GWWL observations at each well. The statistical correlation coefficients have indicated the consistency in pattern (phase and/or trend) between the GRACE GWS variation and GWWL time series at the 36 wells. However, the GRACE GWS variations are generally not comparable with the GWWL observations in magnitude for two reasons: (1) GRACE data represent the GWS variations in WTE, while the GWWL observations represent a number of processes occurring in a given aquifer; (2) GRACE data have a spatial resolution of hundreds of kilometres, while the GWWL observation data can be local or regional depending on the connectivity of the aquifer with surrounding aquifers.

To quantify the relation in magnitude between the GRACE GWS variations and the GWWL variations, a scale factor for a site p is defined by

$$ f\left({\phi}_p,{\lambda}_p\right)=\frac{SD\left[GW{S}_{GRC}\left({\phi}_p,{\lambda}_p,{t}_k\right)\right]}{SD\left[ GWWL\left({\phi}_p,{\lambda}_p,{t}_k\right)\right]}\times 100\% $$
(4)

where \( {\phi}_p \) and \( {\lambda}_p \) are the geographical latitude and longitude of the site p, \( {t}_k \) refers to time epochs and SD is a mathematical operator for standard deviation. This factor represents the magnitude ratio between the GRACE GWS and local GWWL variations.
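As an illustration of Eq. (4), the following Python sketch computes the f factor for one well. It assumes that both series are expressed in the same length unit (e.g. mm) on common epochs, and that months missing in either series are marked as NaN and skipped; the function name and the gap handling are assumptions made for this example only.

```python
import numpy as np

def scale_factor(gws_grace, gwwl):
    """Eq. (4): f = SD[GWS_GRC] / SD[GWWL] x 100 %, over common epochs."""
    gws = np.asarray(gws_grace, dtype=float)
    lvl = np.asarray(gwwl, dtype=float)
    # Use only epochs t_k at which both series are available.
    valid = ~np.isnan(gws) & ~np.isnan(lvl)
    if valid.sum() < 2:
        raise ValueError("need at least two common epochs")
    # The same ddof is used for both series, so the choice cancels in the ratio.
    sd_gws = np.std(gws[valid], ddof=1)
    sd_lvl = np.std(lvl[valid], ddof=1)
    if sd_lvl == 0.0:
        raise ValueError("GWWL series has zero variance")
    return 100.0 * sd_gws / sd_lvl
```

For a well whose level changes are exactly five times the GRACE GWS changes, this returns 20, i.e. f = 20 %.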

Assuming that GRACE and GLDAS give the local GWS variations at a well site precisely, the f factor would be equal to the specific yield or effective porosity at the site. Equation (4) extends the basic equation for estimating specific yield from GWWL and WTE, applying it to two time series rather than to a single set of samples at one time. Alternatively, the factor can be estimated by linear regression. Due to the coarse resolution of GRACE, the derived GWS variations only represent smooth mean values at the GRACE scale. Thus, this factor can be expressed as the product of specific yield \( {S}_{\mathrm{y}} \) (or effective porosity) and an up-scaling ratio β, i.e.

$$ f\left({\phi}_p,{\lambda}_p\right)=\beta \left({\phi}_p,{\lambda}_p\right){S}_{\mathrm{y}}\left({\phi}_p,{\lambda}_p\right) $$
(5)

The up-scaling ratio β can be mathematically expressed as

$$ \beta \left({\phi}_p,{\lambda}_p\right)=\frac{{\overline{S}}_{\mathrm{y}}\times \overline{GWWL}}{S_y\left({\phi}_p,{\lambda}_p\right)\times GWWL\left({\phi}_p,{\lambda}_p,{t}_k\right)} $$
(6)

where \( {\overline{S}}_{\mathrm{y}} \) and \( \overline{GWWL} \) are regional averages of specific yield and GWWL, respectively, the product of which is directly comparable with the GRACE GWS variation. The up-scaling ratio β reflects how a groundwater-monitoring well responds in magnitude to the GRACE-scale groundwater storage variations. The GWWLs are over-responsive in the case of β < 1 (or f < \( {S}_{\mathrm{y}} \)), and under-responsive in the opposite case. Thus the f factor can be used to up-scale the GWWL variations to the GRACE-scale GWS variations, or vice versa, so that a direct comparison between them can be made. Examining the outcome of the f factor estimation may reveal a linkage with an attribute of a well that could be applied to a different region (e.g. relation with lithology or depth).
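The use of Eq. (5) and of the f factor for up-scaling can be sketched in Python as follows. Here f is taken as a fraction (the percentage from Eq. 4 divided by 100), and the well levels are reduced to anomalies about their mean before scaling; the mean removal and both function names are assumptions for illustration and are not stated in the text.

```python
import numpy as np

def upscaling_ratio(f_fraction, specific_yield):
    """beta = f / S_y, rearranged from Eq. (5)."""
    if specific_yield <= 0.0:
        raise ValueError("specific yield must be positive")
    return f_fraction / specific_yield

def upscale_gwwl(gwwl, f_fraction):
    """Scale well-level anomalies by f to GRACE-scale GWS (same length unit)."""
    lvl = np.asarray(gwwl, dtype=float)
    # Assumption: compare anomalies about the series mean, ignoring gaps (NaN).
    return f_fraction * (lvl - np.nanmean(lvl))
```

For example, f = 0.1 with a specific yield of 0.2 gives β = 0.5, which corresponds to an over-responsive well in the sense defined above, and a level anomaly of 200 mm at that well maps to a GRACE-scale storage anomaly of 20 mm.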

The f factors have been estimated for the 36 wells at which the GWWLs highly correlate with the GRACE GWS variations. Table 1 shows the f factors, as well as the aquifer storage parameters: specific yields for wells in unconfined aquifers, effective porosities for wells in confined aquifers, and both specific yields and effective porosities for aquifers of unknown type. It can be seen that the f factors are within 5 % of the ranges of storage parameters at 16 wells, most of which are in unconfined aquifers. This suggests that these wells reflect the GRACE-scale GWS variations in both pattern and magnitude. At the other 20 wells, the f factors are outside the parameter ranges, and appear to be somewhat related to lithology. Bedrock sandstone aquifers have a relatively narrow range and lower value of the f factor (0.11–0.53, mean of 0.27) compared with surficial sand deposits (0.07–1.23, mean of 0.37). This suggests that the f factor may be related to geological setting, in turn offering a method for predicting it across other regions. Furthermore, many of the f factors that are outside the parameter ranges correspond to channel sand or gravel deposits, which tend to be subject to more complex recharge/discharge processes than widespread aquifers.

Figure 13 shows the GRACE GWS variations and the GWWLs up-scaled by the f factors for each corresponding GRACE GWS ‘pixel’ of 1 × 1 arc-deg surrounding a grid node. The 29 wells that have more than 65 months of observation are included. These wells are located in 16 pixels, and the number of wells within each pixel ranges from 1 to 6. A common feature among all the wells is that the GWS increases over the period of study, which generally agrees with the GRACE GWS trends at the well sites. This suggests that the GRACE GWS trends are realistic over the corresponding region, and that correlation between the GRACE signal and groundwater levels is apparent to subsurface depths on the order of at least 100 m (i.e. the depth extent of typical groundwater observation wells). Many of the GWS variations also appear to represent the seasonal variability exhibited by GWWLs, further suggesting that the GRACE GWS trends are realistic. In many cases (e.g. P-31), channel sand aquifers can be seen to respond differently than a bedrock aquifer, which is to be expected given the differing hydrogeologic settings. Although the GRACE GWS variations capture seasonal variations observed at most of the wells, there are a couple of locations where the groundwater observation data are more subdued than predicted by the GRACE GWS estimate (e.g. P-26 and P-33). These locations demonstrate that the local hydromechanical properties of a specific aquifer, or the nature of an individual well (e.g. construction details), are not captured by GRACE.

Fig. 13

Groundwater monitoring well levels vs. the GRACE-GLDAS (AVG) groundwater storage (GWS) estimates in blue. The wells are grouped into each pixel with a code corresponding to the code in Fig. 12

Conditions for the correlation of GRACE data with groundwater observations

GRACE-based estimates of groundwater depletion (e.g. Richey et al. 2015a) have certainly revealed global changes in groundwater storage; however, applicability at a spatial scale for regional monitoring requires that any GWS trends estimated from GRACE accurately represent local groundwater observations. Compared to the broad-scale patterns that are readily apparent globally, point-scale observations will reflect groundwater conditions unique to a given region, which are likely to exhibit both a decrease (i.e. depletion) and an increase in groundwater levels, and some degree of seasonality depending on the characteristics of the aquifer. For the Canadian province of Alberta, the GRACE GWS variations derived in this study correspond well with observations at 36 of the 256 wells in the groundwater-monitoring network, with correlation coefficients greater than 0.5. An interesting outcome from this study is that GRACE GWS variations correlated better with wells located in the southern portion of Alberta than with wells located in the northern portion. Although there is a difference in spatial coverage of wells (sparser in the northern portion of Alberta), the study compared the location of highly correlated wells with two key landscape factors that may be worthwhile to consider when applying this approach in other regions: (1) geologic setting, and (2) land use.

Compared with the bedrock geology of Alberta (Prior et al. 2013), the GRACE GWS variations derived in this study appear to correlate well with groundwater observations in almost all bedrock formations that outcrop or subcrop in the southern portion of Alberta (Fig. 14a). Wells with the highest correlation (~0.8–0.9) correspond to the Belly River Formation in east-central Alberta, which has productive aquifers in this region. Bedrock wells with a lower correlation (~0.6), but still considered highly correlated, correspond to sandstone bodies within the Paskapoo and Horseshoe Canyon formations. However, many of the bedrock wells completed in these formations did not correlate with GRACE GWS variations, possibly suggesting that the high degree of heterogeneity (e.g. Grasby et al. 2008) may be a factor to consider. In this regard, a better correlation between GRACE GWS variations and GWWLs may occur for regionally extensive hydrogeologic conditions, and less so for small sandstone bodies of limited extent or lateral connectivity. For groundwater observations in channel aquifers and surficial deposits, the GRACE GWS variations derived in this study also appear to correlate well in the southern portion of Alberta. Generally, the Quaternary-Neogene sediments are thicker in the central and northern parts of Alberta (MacCormack et al. 2015); however, high correlation between GRACE GWS variations and GWWLs was found for wells completed in pre-glacial channel aquifers that are buried by glacial sediments (e.g. Andriashek and Meeks 2000) and in surficial deposits varying from less than 25 m to greater than 100 m in thickness. Combined with the findings related to bedrock geology, it is proposed that conditions for correlation between GRACE data and groundwater observations may be unrelated to the geologic setting.

Fig. 14

a Comparison of groundwater-monitoring wells and bedrock geology of Alberta, with correlation of the GRACE-derived GWS variations for bedrock wells shown as scaled black dots. b Comparison of highly correlated GWWLs and GRACE GWS variations with land use in Alberta

A common landscape factor for the high correlation between GRACE GWS variations and GWWLs in southern Alberta is land use (Fig. 14b). This portion of the province is composed of grassland and parkland (i.e. a prairie landscape), which generally has few forests. Almost all of the highly correlated wells are located in this prairie landscape, whereas the remaining portion of the province is composed of a forested landscape (e.g. boreal and foothills). It is hypothesized here that the influence of a forested landscape and of the thousands of discrete water bodies (wetlands) of the Boreal Plains has not been accounted for appropriately in the estimated GRACE GWS variations. It is possible that the dynamic hydrology of Boreal wetlands, such as seasonal freezing effects (e.g. Smerdon and Mendoza 2010), needs to be considered as part of the land surface modelling to accurately represent soil moisture effects that differ from those of the prairie landscape.

The strong correlation of GRACE GWS variations with GWWLs in southern Alberta begins to demonstrate the conditions where local groundwater observations can be represented by large-scale data, which is promising for regional monitoring of groundwater systems in similar prairie landscapes that are undergoing stress; however, this study also proposes conditions where local groundwater observations are not represented by large-scale data. A few factors could influence the pattern of groundwater well levels and result in low correlation with the GRACE TWS results across study regions such as Alberta. These include:

  • Presence of vast forested and wetland systems, such as the boreal

  • Open pit mine developments, where well levels are influenced by de-watering (e.g. Fort McMurray oil sand mining development for wells 57, 62, and 49)

  • The role of perched aquifer systems

Summary and conclusions

This study has tested the applicability of GWS variations, derived from the global gravity models of the GRACE satellite mission and the LSMs of the GLDAS, for representing the broad-scale pattern of GWS variations, and has examined the connection between the GWS variations and GWWL data at wells in the province of Alberta, Canada.

First, the TWS variations have been estimated using the Release 05 (RL05) monthly GRACE models for the period of April 2002 to October 2014. The two-step method adapted to the RL05 models is used to extract the TWS variation signals from the GRACE models, while the glacial isostatic adjustment correction based on GPS and absolute gravity observations is used to separate the TWS variations from the total mass variations within the Earth measured by GRACE. The twelve mean monthly TWS variation maps are derived to show the TWS variation pattern over Alberta, which increases from February to June, decreases from July to October, and slightly increases from November to January. The highest TWS is in June and the lowest in October. The average TWS variations have an annual amplitude of 53 mm in WTE and a phase of 0.72 year, and show an increasing trend of 9 ± 1 mm/year over Alberta for the period of April 2002 to October 2014. A simulation analysis based on the GLDAS Noah model suggests that the TWS results are subject to a relative error level of 8 % on the variance of the TWS variations for the region of study.

Second, the monthly GWS variations are derived from the TWS variations for the period of study. The four GLDAS models (Mosaic, Noah, VIC and CLM) and their average (AVG) are used to separate the GWS variation signals from the TWS variations after the GRACE-equivalent filtering. The twelve mean monthly GWS variation maps show a seasonal pattern similar to the TWS. The average GWS variations using AVG have an annual amplitude of 22 mm and a phase of 0.65 year, and show an increasing trend of 12 ± 1 mm/year for the period of study.

Third, GWWL data at 256 wells in the Alberta GOWN network are compared with the GRACE GWS variations. A correlation criterion between the GRACE TWS variations and GWWLs is introduced to identify wells which reflect and respond to groundwater variations over regional aquifers and watersheds. Thirty-six out of 256 wells are significantly correlated with the GRACE TWS variations, with a mean correlation coefficient of 0.68. They show an increased mean correlation coefficient of 0.72–0.74 with the GWS variations, implying that the GLDAS models are realistic to a significant degree. A simulation correlation analysis suggests the same level of correlation that the GRACE GWS variations can achieve; thus, it can be concluded that the broad-scale GWS variations derived from the GRACE TWS and GLDAS models are likely realistic in the region of study.

Finally, a site factor f is defined as the standard deviation ratio between the GRACE broad-scale GWS variations and GWWLs. The f factors have been estimated at all 36 wells in southern Alberta, and are within 5 % of the ranges of storage parameters at 16 wells. The f factors appear to depend on geological setting. The bedrock aquifers are slightly better represented by the GRACE data than the surficial sands or channel aquifers. Applying these estimates of f to up-scale the GWWLs leads to a convincing agreement between the GRACE GWS and GWWL trends at all 36 wells, which can thus be used to monitor regional GWS conditions in Alberta. For most of the highly correlated wells, even the general pattern of seasonal variability is captured in the GWS variations.

The 36 selected wells are predominantly located in the southern portion of Alberta, while the wells in northern Alberta do not have strong correlation with the GRACE GWS variations. This distribution coincides most notably with land use, and appears to be less related to the geological setting or depth. The 36 wells with high correlation span a depth range that extends to at least 100 m below ground surface. The poor correlation in the northern portion of Alberta indicates that the dynamic hydrology of a forested- and wetland-dominated landscape may need to be considered differently in land surface modelling to correct for soil moisture. Other factors, such as water withdrawal and open pit mine developments, also affect how well the GRACE GWS variations represent local conditions.