Introduction

Evaluating dynamically stable storage and flows of groundwater is a foundation for its sustainable use along with equitable governance and management (Gleeson et al. 2020). However, in many areas, groundwater resources are in rapid decline (Clarke et al. 1996; Schwartz and Ibaraki 2011), and better governance requires more attention to physical aspects including measurement, estimation, modelling, and monitoring. Specific storage (SS) is a key hydrogeological property of semiconfined and confined groundwater systems. Many studies have demonstrated that the extent and rate of drawdown propagation, both vertically and horizontally around the pumping well, are sensitive to SS (Wang 2020). Yet SS is traditionally considered to range within a few orders of magnitude, much less than the nearly 13 orders of magnitude exhibited by hydraulic conductivity (Freeze and Cherry 1979). It is common practice to assume SS values where site-specific data are limited or unavailable for groundwater investigations and modelling (Hoeksema and Kitanidis 1985; Zhao and Illman 2021). Choosing unrealistically high values of SS could lead to a rapid depletion of storage and unexpected land subsidence, whereas choosing values that are too low restricts optimal use of groundwater resources. Both scenarios have potential social, environmental, and economic consequences.

Groundwater models are widely used as decision support tools to quantify the possible consequences of natural variations and human interventions (Barnett et al. 2012). Because SS is one of the crucial storage properties, a reliable modelling result requires that the distribution and variability of SS in the model domain be representative of the actual subsurface conditions. In addition, representative site-specific estimates of SS, or an appropriate probability density function (PDF), can be used to improve groundwater models with probabilistic outcomes. A probabilistic modelling approach enables the uncertainties of model results to be evaluated, providing more confidence in the models that support decision-making (White et al. 2020). However, inadequate field measurement of SS makes it challenging to parameterize fully distributed fields in the model domain. As a consequence, a single, uniform value of SS within and across model layers is often adopted in local/regional scale models irrespective of the varied hydrogeological conditions (Anderson et al. 2018).

A lognormal frequency distribution of hydraulic conductivity (K) has been widely used in stochastic groundwater analysis (Loáiciga et al. 2006). However, PDFs for specific storage are limited because most aquifer tests focus on hydraulic conductivity and do not often provide robust SS data. Hoeksema and Kitanidis (1985) analysed storage coefficients (S) and K values for 31 aquifers and suggested that both parameters can be approximated as being log-normally distributed. A study of fractured rock transmissivity and storativity by Shapiro et al. (1998) assumed log-normal and bi-modal distributions of storativity (SS multiplied by the vertical extent of the formation), depending on fracture infill, in the analytical models for interpretation of slug tests. In addition, Mace et al. (1999) measured SS for the Carrizo-Wilcox aquifer, Texas, USA, which is composed of loose sediments and decomposed rock, using pumping test and slug test analysis, and found that SS can be approximated by a log-normal distribution ranging from about 10–7 to 10–3 m–1 with a geometric mean of 1.5 × 10–5 m–1.

A compilation of literature-derived SS values can provide reference to its realistic range and the foundation of PDF for different subsurface materials. To the best of the authors’ knowledge, only two studies have extensively compiled, analysed and compared SS values reported in the scientific literature. Quinn et al. (2016) compiled SS values for sandstone from 14 studies with limited estimation methods, while Kuang et al. (2020) collated 182 SS values determined by field-based methods. However, they did not consider commonly used geotechnical core test results or SS values derived from and in context with calibrated transient groundwater models.

This study aims to (1) compile, analyse and interpret literature-derived SS values representative of a wide variety of subsurface materials and estimation methods, (2) derive detailed statistical interpretations of groundwater storage through probability distributions of SS values classified by estimation methods and lithological conditions, (3) analyse how groundwater modelling practice uses SS values in numerical models, and (4) provide an overview of the representative scale and uncertainties associated with different SS estimation methods. The results provide insight into the SS values for different lithologies that can improve the understanding of compressible groundwater storage and confidence in the modelling of groundwater systems. The research reported here extends that of Kuang et al. (2020) by outlining the implications, considering results from laboratory studies, deriving statistical relationships that can be used by the modelling community, and by reviewing the practice of numerical modelling for which SS is mostly used.

Background and Methods

Theory and estimation of uniaxial specific storage

Specific storage (SS) can be defined as the volume of water that a unit volume of aquifer releases from storage because of expansion of the water and compression of the aquifer matrix under a unit decline in the average head (Freeze and Cherry 1979; Hantush 1960). Here, it is important to consider the coupled physics of the two-phase aquifer system, i.e. solid matrix and liquid water. Terzaghi (1925) established the mechanism of sharing stress between solid grains and pore water. This was followed by Meinzer (1928), who introduced the concept that both pore water and a porous medium are elastically compressible allowing an aquifer to produce more water than the total pore volume. In addition, solid grain is also compressible and has influence over the volume of water production derived from the storage, especially for the formation with low porosity and compressibility (Van der Kamp and Gale 1983). When the hydrostatic pressure is reduced, water expands and solid grains in the aquifer matrix can be rearranged while the pore space remains saturated (Domenico and Schwartz 1998).

Even though this behavior is dominant in confined saturated aquifers, it can also occur in unconfined water-table aquifers (Batu 1998). Unconfined or water-table aquifers show a delayed yield during extraction of water through pumping, with a time variable decline of water level due to gravity drainage and elastic storage (Boulton 1963). After groundwater pumping commences, the produced water comes initially from the elastic behaviour of the aquifer and the pore water around the intake screen, caused by lowered pressure (Neuman 1972). Due to a decrease of pressure around the intake screen, the hydraulic head declines rapidly, and a cone of depression gradually starts to form. As pumping continues, contribution from elastic storage gradually dissipates and gravity drainage becomes dominant (Kruseman et al. 1970). Therefore, short-duration pumping tests in unconfined aquifers lead to values that are representative of elastic storage rather than specific yield (Maréchal et al. 2010).

Specific storage (SS) can be expressed as follows:

$${S}_\mathrm{s}={\rho}_\mathrm{w}g\ \left(\alpha +\varnothing \beta \right)$$
(1)

where ρw is the density of the pore fluid (water) [ML−3], g is the gravitational acceleration at the surface of the Earth [LT−2], α is the uniaxial (vertical) matrix compressibility of the aquifer formation [L T2M−1], β is the compressibility of the water [L T2M−1], and ∅ is the total porosity of the aquifer matrix [-]. The term ρwgα denotes the volume of water released from storage by the compression of the aquifer and ρwg∅β denotes the volume of water released from storage by the expansion of pore water. This expression is widely used in the field of hydrogeology after neglecting lateral deformation and individual grain compressibility (Jacob 1940).
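As a minimal numerical sketch of Eq. (1), the following Python snippet evaluates the two storage terms separately. The water compressibility is an approximate textbook value, and the matrix compressibility and porosity are hypothetical values chosen only for illustration; none of them are taken from the compiled data set.

```python
# Eq. (1): S_s = rho_w * g * (alpha + porosity * beta)

RHO_W = 1000.0    # density of water [kg m^-3]
G = 9.81          # gravitational acceleration [m s^-2]
BETA_W = 4.6e-10  # compressibility of water [Pa^-1], approximate value near 20 degC


def specific_storage(alpha, porosity, beta=BETA_W, rho_w=RHO_W, g=G):
    """Return (S_s, matrix term, water term) in m^-1 following Eq. (1)."""
    matrix_term = rho_w * g * alpha           # release by compression of the matrix
    water_term = rho_w * g * porosity * beta  # release by expansion of pore water
    return matrix_term + water_term, matrix_term, water_term


# Hypothetical material (assumed values): alpha = 1e-8 Pa^-1, porosity = 0.30
ss, matrix_term, water_term = specific_storage(alpha=1e-8, porosity=0.30)
print(f"S_s = {ss:.2e} m^-1 (matrix {matrix_term:.2e}, water {water_term:.2e})")
```

With these assumed inputs the matrix term is roughly two orders of magnitude larger than the water term, which illustrates why the water term is sometimes neglected for compressible materials.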

SS can also be calculated from barometric efficiency (BE) using the following equation (Acworth et al. 2017):

$${S}_\mathrm{s}=\frac{\rho_\mathrm{w}\ g\varnothing \beta }{\mathrm{BE}}$$
(2)

Barometric efficiency, which varies between 0 and 1, is usually calculated by relating a change in water level to the change in atmospheric pressure in an open well system. Groundwater response to atmospheric pressure changes is common in confined, semiconfined, and deeper portions of unconfined aquifers. BE can also be calculated from the loading efficiency (LE) using the relationship BE = 1 – LE (Acworth et al. 2017).
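Equation (2) and the relationship BE = 1 – LE can be sketched in a few lines of Python. The function names, the approximate water compressibility and the example inputs (porosity of 0.25, loading efficiency of 0.6) are assumptions made for illustration only.

```python
def barometric_from_loading_efficiency(le):
    """BE = 1 - LE, with both efficiencies expressed as fractions."""
    return 1.0 - le


def ss_from_barometric_efficiency(be, porosity, beta=4.6e-10, rho_w=1000.0, g=9.81):
    """Eq. (2): S_s = rho_w * g * porosity * beta / BE, returned in m^-1."""
    if not 0.0 < be <= 1.0:
        raise ValueError("BE must be greater than 0 and no larger than 1")
    return rho_w * g * porosity * beta / be


# Hypothetical example: loading efficiency 0.6 and porosity 0.25 (assumed values)
be_example = barometric_from_loading_efficiency(0.6)
ss_example = ss_from_barometric_efficiency(be_example, porosity=0.25)
print(f"BE = {be_example:.2f}, S_s = {ss_example:.2e} m^-1")
```

Because BE appears in the denominator, a small BE leads to a large SS for the same porosity, which is why the sketch rejects BE values of zero.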

For deeper groundwater, SS is often aggregated into the storage coefficient or storativity (S) of a confined aquifer with saturated thickness, b [L] and can be expressed as follows:

$$S={S}_\mathrm{s}\times b$$
(3)

It is to be noted that this parameter assumes completely confined conditions and horizontal flow only.

In highly compressible subsurface materials (e.g., clay), the water produced from the expansion of pore water is very small compared to that produced from compression of the aquifer matrix and is often neglected in the calculation (Domenico and Mifflin 1965). In soil mechanics, SS is therefore expressed as follows for compressible fine-grained subsurface materials (Batu 1998):

$${S}_\mathrm{s}=\frac{K_\mathrm{v}}{C_\mathrm{v}}={\rho}_\mathrm{w} g\alpha$$
(4)

Here, Kv is the vertical hydraulic conductivity [LT−1] and Cv is the coefficient of consolidation [L2T−1].
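Equations (3) and (4) are simple ratios, and the short Python sketch below shows how SS follows from a storativity and thickness, or from a vertical hydraulic conductivity and coefficient of consolidation. The function names and all input values are hypothetical and serve only to show the unit handling.

```python
def ss_from_storativity(storativity, thickness_m):
    """Eq. (3) rearranged: S_s = S / b, with b in metres, giving m^-1."""
    if thickness_m <= 0:
        raise ValueError("aquifer thickness must be positive")
    return storativity / thickness_m


def ss_from_consolidation(kv_m_per_s, cv_m2_per_s):
    """Eq. (4): S_s = K_v / C_v, with K_v in m s^-1 and C_v in m^2 s^-1."""
    if cv_m2_per_s <= 0:
        raise ValueError("coefficient of consolidation must be positive")
    return kv_m_per_s / cv_m2_per_s


# Hypothetical confined aquifer: S = 5e-4 over a thickness of 50 m
ss_aquifer = ss_from_storativity(5e-4, 50.0)
# Hypothetical clay core: K_v = 1e-10 m/s and C_v = 1e-6 m^2/s
ss_clay = ss_from_consolidation(1e-10, 1e-6)
print(f"aquifer S_s = {ss_aquifer:.1e} m^-1, clay S_s = {ss_clay:.1e} m^-1")
```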

SS is an important storage parameter for water balance studies considering transient flow in aquifer systems (Folnagy et al. 2013). It determines the time response of an aquifer subject to pumping or recharge (Haitjema 2006). The use of an unrealistically high value of SS may lead to under-prediction of the pressure drawdown associated with long-term groundwater extraction and may also increase the risk of land surface subsidence (LSS; Chen et al. 2018). LSS occurs when storage is permanently lost, decreasing the specific storage due to consolidation of strata, and can continue for years after pumping has ceased (Smith and Majumdar 2020). SS also influences the migration of contaminants through aquifers and therefore plays a vital role when the effectiveness of remedial measures is evaluated (Alexander et al. 2011). Lastly, SS is regarded as one of the four poroelastic coefficients that are essential for characterizing a fluid-saturated, elastic porous medium (Green and Wang 1990; Wang 2000).
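The influence of SS on aquifer time response can be illustrated with a simple scaling argument. The sketch below assumes the standard diffusion-type scaling, in which hydraulic diffusivity is K/SS and a pressure change takes a time of order L² SS / K to propagate over a distance L. This scaling is not stated in the text above and gives only an order-of-magnitude indication; the hydraulic conductivity and distance are hypothetical.

```python
def hydraulic_diffusivity(k_m_per_s, ss_per_m):
    """Hydraulic diffusivity D = K / S_s [m^2 s^-1]."""
    return k_m_per_s / ss_per_m


def characteristic_time_s(distance_m, k_m_per_s, ss_per_m):
    """Order-of-magnitude response time t ~ L^2 * S_s / K [s] (assumed scaling)."""
    return distance_m ** 2 / hydraulic_diffusivity(k_m_per_s, ss_per_m)


# Hypothetical case: K = 1e-5 m/s, observation point 500 m from the stress
K_EXAMPLE = 1e-5
DISTANCE_EXAMPLE = 500.0
response_days = {
    ss: characteristic_time_s(DISTANCE_EXAMPLE, K_EXAMPLE, ss) / 86400.0
    for ss in (1e-6, 1e-5, 1e-4)
}
for ss, days in response_days.items():
    print(f"S_s = {ss:.0e} m^-1 -> response time of roughly {days:.1f} days")
```

Under this assumed scaling, each order-of-magnitude increase in SS delays the response by an order of magnitude, which is why an unrealistically high SS tends to under-predict drawdown at a given time.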

Depth dependency of SS

Deeper subsurface materials tend to be more compacted due to total stress of overburden, thereby decreasing porosity and compressibility leading to lower SS. Van der Gun (1979) proposed the following equation to estimate storage coefficient or storativity (S) (SS multiplied by the vertical extent of the formation) as a function of the depth for a confined or semiconfined aquifer (Boonstra and de Ridder 1981):

$$S=1.8\times {10}^{-6}\left({d}_2-{d}_1\right)+8.6\times {10}^{-4}\left({d}_2^{0.3}-{d}_1^{0.3}\right)$$
(5)

where d1 and d2 denote the depth [L] of the top and bottom of the aquifer. Kuang et al. (2021) proposed the following empirical SS-depth model:

$$\mathit{\log}\;{S}_\mathrm{s}=\mathit{\log}\;{S}_{\mathrm{sr}}+\left(\mathit{\log}{S}_{\mathrm{s}0}-\mathit{\log}{S}_{\mathrm{sr}}\right)\ {\left(1+z\right)}^{-\lambda }$$
(6)

where SS is the calculated value at depth z (km), Ss0 is the value at the ground surface, Ssr is the residual specific storage (m-1) and λ is the decay index.

SS can be estimated from either in situ observations or laboratory tests of rock core or sediment samples. In situ methods are based on utilizing pressure changes in the aquifer due to hydraulic stress that either occurs naturally (e.g., Earth tides or atmospheric pressure) or is induced (e.g., by pumping, underground excavation or changing the surface load through mining or building). Hydraulic testing, e.g., through pumping of groundwater or application of a “slug” (or by compressing air to displace water), creates artificial hydraulic stress and causes a pressure change in the monitoring bore. To estimate the hydrogeological properties of an aquifer, typically hydraulic conductivity and SS, the drawdown response in the test well and neighbouring monitoring bores is monitored over time and analysed by matching the transient data to analytical or numerical solutions that are based on the conceptual hydraulic model of the system (Bohling and Butler Jr 2001; Cooper Jr and Jacob 1946; Cooper Jr et al. 1967; Hantush 1964; Hyder et al. 1994; Neuman and Witherspoon 1972; Papadopulos 1965; Rathod and Rushton 1991; Theis 1935). These methods have been widely used in groundwater investigations; further details about each model/solution and its applicability are beyond the scope of this paper, and the reader is referred to further literature (Batu 1998; Butler 2019; Kruseman et al. 1970).

In situ estimates of SS can also be obtained by analysing pore pressure perturbation in the aquifer resulting from naturally occurring Earth tides (Bredehoeft 1967; Hsieh et al. 1987; Rojstaczer 1988; Van der Kamp and Gale 1983; Xue et al. 2016), sea tides (Carr and Van Der Kamp 1969; Erskine 1991), atmospheric pressure variations (barometric loading) (Acworth et al. 2016; Jacob 1940), or seismic waves (Folnagy et al. 2013; Shih 2009; Sun et al. 2018). In recent years, the combined effect of Earth and atmospheric pressure or tides (EAT) has also been used to estimate SS in order to widen methods and improve knowledge (Cutillo and Bredehoeft 2011; McMillan et al. 2020; Rau et al. 2020; Shen et al. 2020). This approach requires high-resolution (approximately hourly) measurement of groundwater head and atmospheric pressure for a continuous period of at least 60 days (Schweizer et al. 2021), as well as site-specific predictions of Earth tides that can be calculated using standard astronomical tables (McMillan et al. 2019). By using the driving force and response, hydraulic and geomechanical properties can be quantified.

Core testing methods are based on collecting rock cores or sediment samples using minimally disturbed coring methods and applying standard geotechnical laboratory tests to obtain estimates of material properties (e.g. compressibility, porosity) to calculate SS using established equations (Fatt 1958; Grisak and Cherry 1975; Neuzil et al. 1981; Shaver 1998; Sneed 2001). Measurement of the compressibility of a sample is obtained by standard uniaxial and triaxial compression tests or oedometer tests (e.g., ASTM D5731, ASTM D7012, AS 4133.4.3.1). Laboratory tests must be conducted under controlled environmental and stress conditions to reveal the behaviour of the subsurface materials under varying water content, temperature and pressure (Bouzalakos et al. 2016; Eaton et al. 2000; Masoumi et al. 2017).

Compilation of SS values and analysis of relationships

In this study, SS values were collected from 183 papers published in peer-reviewed journals and hydrogeological technical reports (grey literature), provided that the method of derivation and lithology were clearly stated. SS was calculated using Eq. (3) from studies where values of the storage coefficient and aquifer thickness were provided. Furthermore, Eq. (4) was used to calculate SS where constituent parameter values were available from geotechnical core test analysis. A total of 430 SS values measured using eight different methods for 26 types of materials from unconfined to confined groundwater conditions were compiled. Materials with similar constituents were grouped into a broad category for analysis; for instance, (1) silty clay, sandy silty clay, sandy clay and clay were grouped as clayey materials, (2) clayey silt, sandy silt and silt were grouped as silty materials, (3) silty sand, clayey sand and sand were grouped as sandy materials, (4) carbonate rock, chalk, marl and limestone were grouped with limestone and dolomite, (5) siltstone, mudstone and claystone were grouped with sedimentary rocks, (6) metamorphic rock and marble were grouped with fractured igneous and metamorphic rock, (7) sand and cobbles were grouped with sand and gravel, and (8) clayey gravel was grouped with gravel. These values have been collected from many countries located in Europe, America, Asia, the Middle East, Africa, Canada and Australia and cover a wide range of geological conditions. Where more than one value of SS was provided for the same subsurface material, the geometric mean was taken as the representative value. In addition, other information such as depth of measurement, thickness of the aquifer, state of confinement, hydraulic conductivity, porosity, barometric efficiency, and compressibility of the subsurface materials was also collected where available.
Linear least-squares regression analysis was done to evaluate the correlation between SS and these properties. A complete summary of the compiled values of SS and other parameters for different subsurface materials is provided in Table S1 of the electronic supplementary material (ESM).
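The two numerical steps described above, taking a geometric mean of repeated SS values and fitting a least-squares line to log-transformed data, can be sketched as follows. This is an illustrative outline rather than the procedure actually used for the compilation; the function name and all input numbers are hypothetical.

```python
import math
from statistics import geometric_mean


def loglog_fit(x, y):
    """Least-squares fit of log10(y) = intercept + slope * log10(x).

    Returns (slope, intercept, r_squared); the equivalent power function is
    y = 10**intercept * x**slope.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Representative value for three hypothetical S_s estimates of one material
representative_ss = geometric_mean([1e-6, 1e-5, 1e-4])

# Hypothetical thickness [m] and S_s [m^-1] pairs that follow an exact power law
thickness = [10.0, 100.0, 1000.0]
ss_values = [1e-4 * b ** -0.5 for b in thickness]
slope, intercept, r_squared = loglog_fit(thickness, ss_values)
print(f"geometric mean = {representative_ss:.1e} m^-1")
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R2 = {r_squared:.2f}")
```

The geometric mean is the natural average for a quantity that spans orders of magnitude, since it corresponds to the arithmetic mean of the log-transformed values.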

Determination of frequency distributions

An independent and random variable follows the lognormal distribution when its log-transformed values fit a Gaussian/normal distribution, with 68.3 and 95.5% of values falling within (mean ± standard deviation) and (mean ± 2 × standard deviation), respectively (Maymon 2018). For a perfectly lognormal variable, the log-transformed values have equal mean and median and zero skewness and excess kurtosis (in practice, values ranging from –1 to +1). Moreover, in a quantile-quantile (Q-Q) plot (probability plot) (Easton and McCulloch 1990), the scatter of theoretical quantiles of the normal distribution against actual quantiles of the log-transformed values follows nearly a straight line. Lognormality of a variable can also be verified by the chi-squared (χ2) goodness-of-fit test (Sudicky 1986) and D’Agostino’s K-squared test (D’Agostino and Pearson 1973), which examines normality based on skewness and kurtosis. The frequency distribution of this type of variable can be expressed as the log-normal PDF with a mean (μ) and standard deviation (σ). The distribution of the compiled SS values was evaluated for log-normality and the appropriate PDF was calculated.
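The descriptive checks listed above (mean versus median, skewness, kurtosis and the fraction of values within one and two standard deviations) can be sketched with the Python standard library. The sample below is synthetic, drawn with a hypothetical mean and standard deviation, and is not the compiled data set; the formal chi-squared and K-squared tests are not implemented here.

```python
import math
import random
import statistics


def lognormality_summary(values):
    """Summarise how closely log10(values) resemble a normal distribution."""
    logs = [math.log10(v) for v in values]
    n = len(logs)
    mean = statistics.fmean(logs)
    sd = statistics.pstdev(logs)         # population standard deviation
    z = [(x - mean) / sd for x in logs]  # standardised log-values
    return {
        "mean": mean,
        "median": statistics.median(logs),
        "sd": sd,
        "skewness": sum(t ** 3 for t in z) / n,
        "excess_kurtosis": sum(t ** 4 for t in z) / n - 3.0,
        "within_1sd": sum(abs(t) <= 1 for t in z) / n,
        "within_2sd": sum(abs(t) <= 2 for t in z) / n,
    }


# Synthetic sample (NOT the compiled data): 430 values whose log10 is drawn
# from a normal distribution with hypothetical mean -5 and standard deviation 1.
rng = random.Random(42)
synthetic_ss = [10 ** rng.gauss(-5.0, 1.0) for _ in range(430)]
summary = lognormality_summary(synthetic_ss)
for name, value in summary.items():
    print(f"{name:>16s}: {value: .3f}")
```

For a formal test, a library routine such as `scipy.stats.normaltest`, which implements the D’Agostino and Pearson test, could be applied to the log-transformed values.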

Review of numerical groundwater modelling practices to adopt SS values

A review was done of 45 publicly available multilayer transient models that include confined aquifers. The primary focus was on how SS values are treated in the modelling domain, calibration practices and the sensitivity of the model results to the SS values. Altogether, 83 SS values were collected from these modelling reports wherever the lithology was mentioned and were compared to the compiled estimated values for similar lithologies. The selection of these model studies was based on the robustness of the modelling approach and the adoption of SS values in the modelling domain. Among these models, 36 were developed using the finite difference modelling software MODFLOW (Harbaugh and McDonald 1996; McDonald and Harbaugh 1988), seven using the finite element modelling software FEFLOW (Diersch 2005; Trefry and Muffels 2007), one using iMOD (Vermeulen et al. 2018) and one using FRAC3DVS-OPG (Therrien et al. 2010). Approximately one third of these studies focused on groundwater resource management, another third was associated with coal and coal seam gas projects, and the rest were used for waste disposal as well as mining operations. These models were used to inform decisions on water resources management for groundwater systems in the USA, Australia, China, Canada, Vietnam, Ireland, Belgium, the Netherlands, Sweden and New Zealand. These studies were completed between 2000 and 2020, with the majority after 2016. A complete summary of the reviewed models is provided in Table S2 of the ESM.

Confidence in model outputs can be enhanced by sensitivity analysis of the influence of parameter uncertainty on modelled flow and drawdown (Anderson et al. 2015). However, sensitivity analysis has limitations, particularly if the perturbed models are uncalibrated. Leading modelling practices apply quantitative uncertainty analysis (QUA) to evaluate the relative importance of parameters for model results, and to objectively evaluate data worth (Middlemis and Peeters 2018; Turnadge et al. 2018). There are several types of QUA; for example, the pilot point parameterization scheme is widely used through nonlinear parameter estimation tools such as PEST (Doherty et al. 2010). The sensitivity of model output to SS, as reported in the reviewed models, was evaluated.

Results

SS by estimation method

A comparative plot of the collated SS values grouped by estimation method is shown in Fig. 1. These data mostly originate from traditional measurement methods, e.g., pumping tests and geotechnical core testing. SS values measured from pumping tests range from 1.3 × 10–8 to 6 × 10–3 m–1 (six orders of magnitude), representing applications in different types of subsurface materials. Within this subset, 23 types of individual materials are measured, with more than two-thirds of the values for sandy and silty materials, and almost 90% of the values lie between 10–6 and 10–3 m–1. Core tests provide higher values relative to other methods and, combined with pumping test data, account for almost half of the SS values in the order of 10–5 m–1 (25th to 75th percentile).

Fig. 1

SS values measured using different methods. Green dots indicate the data points, and n denotes the total number of data points compiled for each estimation method. Note that horizontal spacing between points in each category was added for improved visualization. An explanation of the box plot is provided in the top-right corner and applies to all the box plots in this work. EAT: Earth and atmospheric pressure or tides

Pore pressure responses to natural events, such as Earth tides and atmospheric pressure, result in comparatively lower SS values with almost half in the order of 10–6 m–1. Slug testing results show a wide range of values (3.2 × 10–9 to 5.7 × 10–3 m–1) despite the limited number of data points compared to other methods. Two extremely low values in this data set in the order of 10–9 m–1 were measured by slug tests.

Overall, it is observed that although the measured SS values vary over seven orders of magnitude (3.2 × 10–9 to 6 × 10–3 m–1), about one-third (33%) of the values are in the order of 10–5 and another one-third in the order of 10–6 m–1. By comparison, compiled SS values from the model analyses show a spread over five orders of magnitude with almost half of the values in the order of 10–6 m–1.

SS by subsurface lithology

A comparative plot of SS values for 13 broad categories of subsurface materials, subdivided into eight different test methods, is presented in Fig. 2. Note that this analysis does not include SS values obtained from the numerical modelling studies. The plot shows that the SS values for most of the subsurface materials range over several orders of magnitude. More than two-thirds of the values for consolidated materials (e.g., granite, limestone, igneous and metamorphic rock, basalt, sandstone, shale) are less than 10–5 m–1. In contrast, two-thirds of unconsolidated loose materials (e.g., sandy, silty, clayey materials, glacial till) have values higher than 10–5 m–1. Granite has the highest variation of any material type, ranging from 3.18 × 10–9 to 1 × 10–3 m–1 with an interquartile range (IQR) between 1.70 × 10–7 and 1.27 × 10–5 m–1.

Fig. 2

SS values grouped by subsurface materials. Different markers indicate the method of measurement and n denotes the total number of data points. Materials are ordered by an increasing minimum of SS value in each category

Limestone and dolomite have the smallest IQR of any material type, with almost 90% of the values falling between 1.45 × 10–7 and 9.82 × 10–6 m–1. For igneous and metamorphic rock, more than half of the SS values were between 1.0 × 10–7 and 3.0 × 10–6 m–1. Values for basalt show lower variability than the other consolidated materials and fall between 1.3 × 10–7 and 4.5 × 10–6 m–1. Likewise, SS values of different types of sedimentary rock range from 4 × 10–7 to 3 × 10–5 m–1.

With the highest number of samples, sandstone shows uniformly distributed values within four orders of magnitude, but nearly 85% of the data fall within two orders of magnitude (10–6 and 10–5 m–1). Similarly, shale has more than two-thirds of its values within the same range but with a smaller IQR.

Glacial till, which exhibits the highest SS value of 6 × 10–3 m–1 in this data set, spans over five orders of magnitude and shows a variation similar to unsorted sediments. However, 60% of the values are in the order of 10–5 and 10–4 m–1. Silty, clayey and sandy materials and the mixture of sand with gravel all show values greater than 1 × 10–6 m–1, with nearly 90, 90, 75 and 70% of the values in the order of 10–5 and 10–4 m–1, respectively. Overall, most of the SS values for consolidated materials are in the order of 10–6 and 10–5 m–1 and for unconsolidated materials in the order of 10–5 and 10–4 m–1.

SS dependence on thickness and depth

SS values show a negative relationship with aquifer thickness. This becomes clear as the thickness increases from 50 to 1,000 m (Figs. 3 and 4). An approximately linear relationship between the log-transformed data was inferred by linear least-squares regression, for which R2 is 0.29; the corresponding power function is shown in Fig. 3. For aquifer thicknesses less than 50 m, assumptions regarding thickness and uniform extent may affect the reliability of SS estimates from conventional pumping and slug test methods. This is because SS is derived from storativity by dividing by the thickness of the aquifer, which averages across the depth and does not account for heterogeneous conditions. The spread of SS values grouped by increasing thickness of the aquifer is highlighted in Fig. 4. It is evident that most of the higher SS values are estimated for thin aquifers.

Figure 5 shows that the SS values in the compiled data decrease with increasing depth, which is consistent with previous studies (Kuang et al. 2021; Rau et al. 2018; Smith et al. 2013).
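The two depth relationships given in Eqs. (5) and (6) can be evaluated with the short Python sketch below. Depths in Eq. (5) are assumed to be in metres, and the parameter values passed to the Eq. (6) function (surface value, residual value and decay index) are hypothetical placeholders, not the fitted parameters of this study or of Kuang et al. (2021).

```python
import math


def storativity_van_der_gun(d1_m, d2_m):
    """Eq. (5): storativity from the depths of the aquifer top (d1) and bottom (d2)."""
    return 1.8e-6 * (d2_m - d1_m) + 8.6e-4 * (d2_m ** 0.3 - d1_m ** 0.3)


def ss_depth_model(z_km, ss0, ssr, decay_index):
    """Eq. (6): S_s at depth z [km] from surface (ss0) and residual (ssr) values."""
    log_ss = math.log10(ssr) + (math.log10(ss0) - math.log10(ssr)) * (1.0 + z_km) ** (-decay_index)
    return 10.0 ** log_ss


# Hypothetical aquifer between 50 and 100 m depth
s_example = storativity_van_der_gun(50.0, 100.0)
print(f"S from Eq. (5) = {s_example:.2e}")

# Hypothetical parameters: S_s0 = 1e-4 m^-1, S_sr = 1e-7 m^-1, decay index = 2
depth_profile = [(z, ss_depth_model(z, 1e-4, 1e-7, 2.0)) for z in (0.0, 0.5, 1.0, 2.0)]
for z, ss in depth_profile:
    print(f"z = {z:.1f} km -> S_s = {ss:.2e} m^-1")
```

In this form the model returns the surface value at z = 0 and approaches the residual value at large depth, so SS decreases monotonically with depth whenever the surface value exceeds the residual value.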

Fig. 3

Relationship between SS and thickness of the aquifer for different materials. Different colors indicate the method of measurement and marker shapes indicate the material type

Fig. 4

Relationship between SS and thickness of the aquifer. Red dots indicate the SS values. Note that vertical spacing between points in each category was added for improved visualization


Fig. 5

Empirical SS–depth model (Kuang et al. 2021) fitted to compiled SS values for (a) different types of materials and (b) sandy materials. Different shapes indicate the method of measurement. Black continuous lines indicate the best-fitted values with increasing depth and the red dotted line denotes values calculated with the empirical parameters suggested by Kuang et al. (2021)

Relationship between SS and other properties

There are 217 data points for which values of both SS and hydraulic conductivity (K) are available at the same site. The values of K vary between 1.0 × 10–14 and 1.9 × 10–2 m s–1 with a geometric mean of 3.4 × 10–6 m s–1 and represent a variety of consolidated and unconsolidated lithologies. Overall and as expected, the values of K for consolidated materials were lower than for unconsolidated materials. No meaningful relationship between these two parameters was observed in the compiled values.

There are 149 data points for which values of both SS and porosity (∅) are available. Geotechnical core testing is the most common method to estimate the porosity of subsurface materials. Additionally, porosity can be interpreted from some of the passive methods, e.g., Earth tide and atmospheric pressure analysis. While pumping and slug testing cannot provide porosity values, a number of studies provided values from auxiliary methods. Theoretically, values of SS should show a positive relationship with porosity, as defined in Eq. (1). From the compiled data, an overall moderate positive trend was observed, resulting in an approximately linear relationship in log scale by linear least-squares regression with R2 = 0.232 (shown in Fig. 6). This result shows that there are more data for porosity >0.1 and that the variation in SS grows from 2 to 4 orders of magnitude as the porosity increases, indicating that SS values become less sensitive to porosity as its value increases, especially for highly compressible materials, i.e., clayey, sandy and silty materials. Some higher values of SS were estimated for granite, limestone and shale, which have lower porosity, indicating that secondary porosity (i.e., fractures) may not have been considered due to test limitations. As expected, porosity is higher for unconsolidated loose materials compared to consolidated types.

Fig. 6

Relationship between SS and porosity (∅) for different subsurface materials. Different marker colors indicate the method of measurement, whereas shapes indicate the material type

There are 58 SS data points for which formation or bulk compressibility values are also available. These data confirm the increase of SS with increasing formation compressibility, as expected; the majority of these data points are from Earth tide analysis, the only method that provides an in-situ estimate of formation compressibility. There is a strong relationship evident in Fig. 7 for the majority of the data, resulting in a power-law relationship by linear least-squares regression with R2 = 0.73. Note that most of the values used to develop this relationship are derived from pore pressure response to naturally occurring stress (Earth tides, barometric pressure) methods, where Eq. (1a) is generally used to calculate compressibility and SS. Therefore, the strong correlation with R2 = 0.73 can be skewed by the linearity inherent in the equation.

As expected, there was also a relationship between decreasing SS and increasing BE, although there was considerable scatter confirming that SS is more related to compressibility than hydraulic properties. 

Fig. 7

Relationship between SS and formation compressibility for different subsurface materials. Different marker colors represent the method of measurement and shapes indicate the material type

Frequency distributions of SS

Frequency distributions of the SS data (n = 430) for all types of subsurface materials are presented in Fig. 8a. The log-transformed SS (log10SS) data set has a nearly equal mean (–4.96) and median (–4.95), and skewness and kurtosis of –0.07 and –0.01 (close to zero), respectively. D’Agostino’s K-squared test (D’Agostino and Pearson 1973) gives a p-value of 0.81 (>0.05), so normality of the log-transformed values is not rejected, and the quantile-quantile (Q-Q) plot (Easton and McCulloch 1990) follows nearly a straight line. The chi-squared (χ2) goodness-of-fit test on a randomly selected subset of the log-transformed SS values showed that a log-normal distribution can be assumed at the 95% confidence level.

Fig. 8

Histogram and probability density plot of the log10 transformed SS for all materials measured by (a) all methods, (b) passively acquired pore pressure responses to natural events

Based on these tests, SS can be regarded as log-normally distributed. Thus, the frequency distribution of SS can be expressed as the log-normal PDF with a mean (μ) of –4.96 (geometric mean 1.1 × 10–5 m–1) and standard deviation (σ) of 1.04, as shown in Fig. 8a. Likewise, the PDF of SS for different subsurface materials measured by passively acquired pore pressure responses to natural stresses (Earth tide, barometric loading, seismic waves, sea tide; Fig. 8b) shows a relatively lower mean (–5.65, geometric mean 2.2 × 10–6 m–1) with a smaller standard deviation (0.66). The lower standard deviation of this distribution indicates that the values are clustered more closely around the mean.
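As an illustration of how such a PDF might be used in a probabilistic model, the sketch below treats log10(SS) as a normal variable with the all-materials mean (–4.96) and standard deviation (1.04) reported above, and derives a geometric mean, percentiles and random draws. The use of Python's `statistics.NormalDist`, the choice of the 5th and 95th percentiles and the sample size are assumptions made for this example.

```python
from statistics import NormalDist

# log10(S_s) for all materials and methods: mean -4.96, standard deviation 1.04
LOG10_SS = NormalDist(mu=-4.96, sigma=1.04)

# Geometric mean of S_s is 10 raised to the mean of the log-transformed values
ss_geometric_mean = 10 ** LOG10_SS.mean

# 5th and 95th percentiles of S_s, back-transformed from log space
ss_p05 = 10 ** LOG10_SS.inv_cdf(0.05)
ss_p95 = 10 ** LOG10_SS.inv_cdf(0.95)

# Random S_s realisations, e.g. as inputs to an ensemble of model runs
ss_samples = [10 ** v for v in LOG10_SS.samples(1000, seed=1)]

print(f"geometric mean = {ss_geometric_mean:.2e} m^-1")
print(f"5th to 95th percentile = {ss_p05:.2e} to {ss_p95:.2e} m^-1")
print(f"drew {len(ss_samples)} samples")
```

Sampling in log space and back-transforming keeps every realisation positive and preserves the log-normal shape of the distribution.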

The PDFs for all consolidated, unconsolidated, and individual types of subsurface materials with more than ten data points are presented in Fig. 9, for application in groundwater models. The distributions are based on the mean and standard deviation of the compiled data for each type of material. D’Agostino’s K-squared test gave a low p-value (<0.05) for some materials (e.g., sandstone, limestone and dolomite), which therefore failed the test for log-normality. Moreover, the lack of adequate data for individual material types resulted in gaps within some frequency distribution plots. The PDFs of consolidated and unconsolidated materials differ markedly in the mean value (–5.63 and –4.34, respectively), but their standard deviations are similar (0.84 and 0.79). Individual consolidated and unconsolidated materials show similar patterns, except for granite and glacial till, which have higher standard deviations (1.33 and 1.13) that could be attributed to the range of fracturing and weathering conditions of these materials.

Fig. 9 Histogram and probability density plot of the log10 transformed SS for different types of material

SS from numerical groundwater models

An evaluation of the typical source of SS values and parametrization approaches in the 45 groundwater models that were reviewed is summarized in Table 1. A small fraction of these models was based on measured SS values with confidence that can be regarded as moderate to high. Measured SS values used in the numerical models typically originate from the testing of bores within the model area and less commonly from lab tests of cores. However, the majority of the models reviewed used SS from previous studies to assign initial SS values to each layer. The data sources for a few models are not reported, even for a secondary source from prior studies or literature. There is relatively low confidence in SS values in these models that rely on literature values or previous studies without consideration of the primary SS data.

Table 1 Summary of groundwater modelling approaches with SS data sources and parameterization

Initially assigned SS values were often refined during model calibration and the calibrated values were used for further analysis and prediction. Seven models used single and fixed SS values for every layer after calibration and 22 models used SS values that were uniform within layers but variable across layers after calibration (Table S2 of the ESM). In these models, SS values ranged from 1.0 × 10–2 to 1.0 × 10–7 m–1 with most values between 1.0 × 10–5 and 1.0 × 10–7 m–1.

Some models were not calibrated for SS, and instead relied on the initially assigned values, based directly on the field investigation, laboratory tests or previous studies. Four of the models reviewed were assigned uniform and fixed SS values for all layers despite differences in subsurface materials (Ackerman et al. 2010; CDM-Smith 2016; Jacobs 2018; SKM-NSW 2010). Five of the models reviewed assigned different SS values across layers, while the values were uniform within each layer (Geofirma 2011; GHD 2013; HydroSimulations-Hume 2018; Mackie 2013; Zheng et al. 2018). In these models, SS values range from 1.6 × 10–3 to 2.0 × 10–7 m–1 with most of the values of the order of 10–5 and 10–6 m–1. Larger values are assigned to the top layers, which are unconfined or semiconfined and may not contribute a significant amount of water from elastic storage. Some models of this type were refined with other parameters (e.g., horizontal hydraulic conductivity Kh, vertical hydraulic conductivity Kv, recharge) during calibration; however, SS values were fixed.

Discussion

SS dependency on depth

The empirical SS-depth model proposed by Kuang et al. (2021) has been fitted to the compiled data (n = 283, depth < 1 km) (Eq. 6). Figure 5a shows a model fit (R2 = 0.32) with logSs0 = –4.265, logSsr = –6.003 and λ = 8.742, whereas the line with the parameters used in Kuang et al. (2021) (logSs0 = –3.884, logSsr = –5.853 and λ = 15.47) provides R2 = 0.30. This poor fit (low R2 value) can partly be attributed to the different lithologies that are grouped together and the greater scatter of SS values at depths of up to 200 m.

The model was also fitted for sandy materials, for which 39 of the 42 SS values were measured within 200 m below ground surface. Figure 5b shows that the best-fit line (R2 = 0.56) was generated with logSs0 = –3.352, logSsr = –5.227 and λ = 32.909. This model-fit analysis indicates that a depth model for a single type of material could provide more useful information on how that material behaves with increasing depth. It is observed that the model provides an almost constant value (~logSsr) when the depth exceeds half of the total depth (z/2) considered during the fitting of the model to the dataset. This shows that SS values do not change much at depths greater than 500 and 100 m in the subsurface, as shown in Fig. 5a,b, respectively.
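The flattening of the fitted curves can be illustrated numerically. Eq. (6) is not reproduced in this section, so the sketch below assumes the exponential-type decay form log Ss = log Ssr + (log Ss0 – log Ssr)(1 + z)^(–λ), with depth z in kilometres; this form is an assumption and should be checked against Eq. (6) before reuse. The function name is hypothetical, and the parameter values are those quoted above for Fig. 5a,b.

```python
def log10_ss_at_depth(z_km, log_ss0, log_ssr, lam):
    """Assumed depth model: log10(Ss) decays from log_ss0 at the surface
    towards the residual value log_ssr as depth z_km (km) increases."""
    return log_ssr + (log_ss0 - log_ssr) * (1.0 + z_km) ** (-lam)

# Fitted parameters quoted in the text (log_ss0, log_ssr, lambda).
ALL_MATERIALS = (-4.265, -6.003, 8.742)   # Fig. 5a, depth < 1 km
SANDY = (-3.352, -5.227, 32.909)          # Fig. 5b, mostly within 200 m

for label, params, depths in (
    ("all materials", ALL_MATERIALS, (0.0, 0.1, 0.5, 1.0)),
    ("sandy", SANDY, (0.0, 0.05, 0.1, 0.2)),
):
    for z in depths:
        value = log10_ss_at_depth(z, *params)
        print(f"{label}: z = {z * 1000:.0f} m, log10(Ss) = {value:.3f}")
```

Under this assumed form, the modelled log10(SS) lies within about 0.1 log units of logSsr at 500 m for all materials and at 100 m for sandy materials, consistent with the depths noted above.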

Relatively high SS values with large variations are evident up to a depth of 100 m below ground level. The results shown in Figs. 3, 4 and 5 suggest that estimation of SS may be obscured by relatively shallow and thin strata, through which leakage may influence the interpretation of conventional aquifer pumping and slug tests (Rau et al. 2018). At sites where vertical leakage may occur, verification of SS values greater than 10–3 m–1 is warranted.

SS dependency on method and scale

Estimated values of SS at a single field site can vary by several orders of magnitude depending on the measurement technique (Quinn et al. 2016). Therefore, it is important to consider the uncertainties associated with each method. Pumping tests typically provide order-of-magnitude estimates of hydraulic parameters, provided that an appropriate conceptual model is selected (Kruseman et al. 1970). There are limitations and assumptions of each analytical method selected for interpreting data, including whether the wells are fully or partially penetrating an aquifer. Also, applying an analytical model developed for a fully confined aquifer to a leaky aquifer will overestimate SS (Turnadge et al. 2019). Likewise, overestimation of SS due to leakage from the adjacent layers in multilayered and karstified aquifer systems is also possible (Bergelson et al. 1998). In addition, to obtain the most robust estimates of SS when matching drawdown curves to type curves, it is important to avoid early-time data, which are affected by wellbore storage, and late-time data, which are affected by boundary effects (Chapuis 1992; De Marsily 1986). In this compilation, SS estimated from aquifer pumping test analyses tends to be higher than that from other methods for similar subsurface materials with reduced aquifer thickness (Fig. 3). This finding aligns with a previous hypothesis that higher values are influenced by violations of the conceptual models (Rau et al. 2018).

Compiled data show that slug test analyses result in relatively low SS values for several types of subsurface materials (e.g., granite, limestone, sand and gravel, glacial till) compared to other methods (Fig. 2). The outliers possibly indicate higher uncertainty, likely due to lower test stress and area of influence. Although slug tests provide low-cost in-situ data, this method can have significant uncertainty due to the smaller test scale resulting in skin effects from the gravel pack around the screen interval. Moreover, estimated values can differ by more than two orders of magnitude due to the lack of sensitivity of the type curve fitting to changes in SS values (Fitts 2013). It is noted that most consolidated rock under natural conditions is nonuniformly fractured, and these fractures can be considered to provide a major source of water storage. Because of the smaller area of influence of a slug test, the estimated SS values may not capture the influence of fractures.

This meta-review provides the most comprehensive compilation of SS estimated by passive methods, particularly those that combine barometric and Earth tide analysis in confined and semiconfined aquifers (McMillan et al. 2019). Earth tide analysis before McMillan et al. (2020) relied on an a priori estimation of Poisson’s ratio (Smith et al. 2013). Owing to the difficulties related to in situ measurement of Poisson’s ratio, laboratory measurements or literature-derived values are commonly used in SS estimation. It is noted that these methods are only applicable under semiconfined to confined conditions where the possibility of leakage from the upper layers is comparatively low. This is also reflected by values for consolidated materials (e.g., sandstone, limestone, granite) that are typically lower than those for unconsolidated materials (e.g., sandy, silty materials).

Difficulties in extracting individual tidal components from the pore pressure data can contribute to uncertainty in the estimated values (Rau et al. 2020; Schweizer et al. 2021). Skin and wellbore storage effects can also introduce errors in phase analysis of the tidal response (Gao et al. 2020). Solid grain compressibility is often neglected when considering formation compressibility; however, this assumption may overestimate SS values in the case of consolidated rock with lower porosity and compressibility, where the amount of extractable water from the aquifer is always less than the change in bulk volume (Turnadge et al. 2019). Van der Kamp and Gale (1983) found that SS values can be 5–12% larger for sandstones when compressibility of individual grains is neglected. In addition, the difference between actual and theoretical Earth tide strain near the Earth’s surface can introduce considerable uncertainty in the calculated SS values when the theoretical gravity tide is used in the calculation (Cutillo and Bredehoeft 2011). The ratio between the actual and calculated Earth tide strain can be 50% in some cases (Berger and Beaumont 1976); therefore, the use of calculated strain may result in half the porosity and compressibility values compared to the actual strain (Rojstaczer and Agnew 1989).

This overview analysis shows that there is higher uncertainty in SS values derived from testing of core samples compared to in situ methods (Fig. 2). Major sources of uncertainty in geotechnical core tests usually come from disturbance of the sample during collection, storage, handling, transporting, and testing under conditions in the laboratory that differ from in situ conditions (Clayton et al. 1995; Timms and Acworth 2005). Clark (1998) noted that it can be challenging to collect a completely undisturbed sample for laboratory testing, particularly for brittle hard rock and softer subsurface materials. For example, laboratory tests of core samples from sandstone and shale resulted in much higher SS values than those obtained from in situ passive methods (David et al. 2017). In some cases, laboratory test values of SS were one to three orders of magnitude larger (Smith et al. 2013; van der Kamp 2001). It is also challenging to capture the spatial heterogeneity of natural conditions in small samples. For example, Beavan et al. (1991) found that the uniaxial strain modulus derived from a laboratory test was 50% higher than the values estimated from tidal analysis of a sandstone aquifer, which was attributed to the presence of heterogeneity and fractures at the field scale. Laboratory tests of rock core are typically conducted using dry samples, whereas a small increase of water content in rock can degrade its mechanical properties (Masoumi et al. 2017). Consequently, rocks within aquifers should be tested under water-saturated conditions.

Some consolidated rock, and silty and clayey materials, can be regarded as aquitards, given their very low hydraulic conductivity. Due to their slow response to hydraulic stresses, laboratory testing of small-scale core specimens is commonly used for estimating SS values. Therefore, the number of SS values derived from in situ methods is much lower for this type of formation compared to laboratory measurements. However, aquitards can be characterized using the pore pressure response to naturally occurring stresses (e.g., Earth tides, barometric pressure) (Acworth et al. 2017), which could provide more reliable in situ SS estimates compared to laboratory testing of small cores. The results show that laboratory measurements for aquitards lead to values that are one or two orders of magnitude higher than those from pore pressure responses to naturally occurring stresses. One possible reason for this is the fact that hydraulic properties derived from transient methods are frequency dependent (Valois et al. 2022); however, the exact reason remains unknown and requires further research.

The compiled data in this study show that measured SS values for clayey materials can be more uncertain than for other types of materials. The use of total porosity estimated from the total moisture content in the laboratory for clay and clay-rich materials can overestimate SS, as a certain amount of water cannot contribute to SS due to adsorption characteristics (Galperin 1993; Jury and Horton 2004; Rau et al. 2018). Further, the compressibility of clayey materials in laboratory test conditions is typically higher than for in situ conditions (Klohn 1965; Radhakrishna and Klym 1974). Therefore, laboratory-measured SS values for clayey materials may not be reliable for in situ conditions.

Estimation of SS values can occur at various horizontal scales, particularly as the radius of influence from a borehole depends on the magnitude of pumping and hydraulic stress. Pressure head drawdown can range from several millimetres to hundreds of meters under different test conditions, while the volume of aquifer material that is influenced varies significantly. Smaller-scale tests and observations (e.g., passive methods) within individual bores provide site-specific locally scaled values and could be useful to evaluate the spatial distribution of heterogeneity, while larger-scale tests (e.g., aquifer pumping test) provide lumped average values of SS at a scale that better matches that of numerical modelling.

A comparison of the testing scale of each method considered in this study is described in Table 2. During an aquifer pumping test, the distance between the pumping and observation wells could be tens to hundreds of meters (Allègre et al. 2016). The spatial scale of testing could also be defined by the distance where the drawdown decays to 5% of its maximum from the centre of the well (Zhang et al. 2019), regardless of the type of testing. The smallest spatial scale of SS is sampled with geotechnical tests of rock core or oedometer tests of unconsolidated materials.

Table 2 Test scale and stress for different SS measurement techniques

The scale of testing is also related to the magnitude of hydraulic stress, since drawdown also influences the volume of material that is sampled. The magnitude of changes in well water levels caused by tidal stresses can be millimetres to a few centimetres, which is representative of a relatively small aquifer volume (Allègre et al. 2016; Hsieh 1998). In contrast, a slug test can result in drawdown of up to a few meters and the resulting test scale can be a radius of a few meters around the bore (Frus and Halford 2018; Zhang et al. 2019).

Earlier studies have compared the SS values measured from pumping tests and tidal stress for the same aquifer and found that the values are similar (Allègre et al. 2016; Shen et al. 2020). A possible explanation for this resemblance is the spatial extent of tidal stress. Tidal stress is ubiquitous, and even though the hydraulic stress is relatively small, it acts uniformly over a larger area compared to an aquifer pumping test, with spatial influence that could extend over kilometres (Narasimhan et al. 1984). Another investigation found discrepancies, which were attributed to borehole skin effects (Valois et al. 2022). Overall, further research is necessary to investigate the representative scale for the estimation methods based on natural forces (McMillan et al. 2020).

It is important to note that measurement of SS for a formation can occur under drained or undrained conditions. In undrained conditions, the stress occurs at a rate that is too fast for pore water to flow, and the fluid mass remains constant. Under drained conditions, a slow change of stress in the subsurface allows the pore water to move laterally and the pore pressure remains constant (Rau et al. 2018). Barometric loading and Earth tides occur over a large area and are transient, leading to the assumption of undrained conditions (Narasimhan et al. 1984). However, natural stresses occur at different frequencies which, in combination with subsurface properties, could be at the transition between drained and undrained conditions. Further research is necessary to elucidate the appropriateness of conceptual models under different stress conditions.

SS dependency on materials and physical limits

The lower SS of consolidated materials (Fig. 2) is due to lower porosity and matrix compressibility, but higher values are also possible, caused by the presence of interconnected fractures. However, some materials with a narrow range of SS (e.g. sedimentary rocks, gravel) in the compiled data may result from a lower number of data points compared to the other material types. The large variability of SS found for granite could be attributed to its nature, which occurs as a massive, consolidated rock with various degrees of fracturing and weathering. Several SS values higher than 1 × 10–5 m–1 measured by pump testing can be attributed to the larger test scale and the presence of fractures or vertical leakage.

The minimum SS value proposed by Rau et al. (2018) is for consolidated rock, represented by marble; it was derived from poroelastic parameters measured on rock cores in the laboratory (Wang 2000), assuming a maximum loading efficiency (LE) of 0.2 (or BE of 0.8). Maximum SS values for extractable water from unconsolidated clayey materials were estimated assuming negligible grain compressibility. Overall, extractable (i.e. free water) storage limits of SS between 2.3 × 10–7 and 1.3 × 10–5 m–1 were proposed for consolidated rock and unconsolidated materials (Rau et al. 2018).

Field measurements and poroelastic theory indicate that fine sands can have higher SS than clays (Rau et al. 2018). In that study, a maximum SS of approximately 2.0 × 10–5 m–1 was derived from seismic testing of a 25-m-thick profile of fine sands in the Botany Sands Aquifer, Australia. SS decreased with increasing depth through this profile of unconfined fine sands. This estimated SS assumed a midpoint value for sand grain compressibility that was reported from testing of Ottawa Sands, USA (Richardson et al. 2002). However, the SS values of clastic unconsolidated materials with grain size larger than fine sand (e.g., coarse sand and gravel) were not considered. Thus, the possibility of a larger upper limit of SS for coarse clastic unconsolidated materials remains unclear.

This study has found that almost half of the compiled SS values and more than 75% of values for sandy, silty, clayey, sand and gravel and glacial till are higher than the estimated upper limit of extractable SS of 1.3 × 10–5 m–1. In contrast, only a few SS values in this meta-review were lower than the suggested lower limit of 2.3 × 10–7 m–1. While this points to a violation of conceptual models used in the estimation, further research is required to reconcile the relatively high SS values that were reported in some studies compared to the estimated maximum of extractable SS suggested by Rau et al. (2018). Moreover, additional sensitivity testing of a possible range of extractable SS is needed beyond consideration of marble, clayey materials, and fine sand (Rau et al. 2018), to provide a wider representation of consolidated and unconsolidated materials and subsurface conditions. Therefore, a relatively high upper limit of SS for fractured or leaky conditions and coarser-grained clastic unconsolidated materials requires further consideration.
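These proportions can be cross-checked against the fitted log-normal PDF for all data (μ = –4.96, σ = 1.04 in log10 units). The sketch below is an illustrative calculation rather than part of the original analysis; it evaluates the fractions of that idealized distribution lying above 1.3 × 10–5 m–1 and below 2.3 × 10–7 m–1, using variable names chosen here for illustration.

```python
import math
from statistics import NormalDist

# Fitted distribution of log10(Ss) for all compiled data (values from the text).
log10_ss = NormalDist(mu=-4.96, sigma=1.04)

UPPER_LIMIT = 1.3e-5   # m^-1, upper limit of extractable Ss (Rau et al. 2018)
LOWER_LIMIT = 2.3e-7   # m^-1, lower limit quoted in this paragraph

# Fraction of the fitted distribution above the upper and below the lower limit.
frac_above = 1.0 - log10_ss.cdf(math.log10(UPPER_LIMIT))
frac_below = log10_ss.cdf(math.log10(LOWER_LIMIT))

print(f"fraction above upper limit: {frac_above:.2f}")
print(f"fraction below lower limit: {frac_below:.2f}")
```

The fitted PDF places a little under half of its probability mass above the upper limit and roughly one-twentieth below the lower limit, in line with the counts described above.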

Comparison of SS from estimation and numerical models

SS values used in the 45 models that were reviewed are generally not consistent with values reported in the literature for similar lithologies. In the models that were reviewed, there were 83 SS values for which lithologies were specifically mentioned. These values were for 13 types of materials among which 12 types have estimated values from other methods as compiled in this study. A comparison of SS values gathered from these two types of sources is shown in Fig. 10, which reveals that a significant number of model values are beyond the interquartile range of the compiled values estimated by different testing methods, especially for granite, limestone, silty clay, sand, clay as well as sand and gravel.

Fig. 10 Box plots showing the range of SS values estimated for different geologic materials using different methods compiled in this study. Red dots are the SS values that are used for the calibration of studied numerical GW models. nc and nm denote the total number of data points for compiled and model values, respectively

There are more data for basalt and sandstone than for other types of lithology. The comparison between estimated and modelled SS values is not representative of geologic materials where there are too few data or model points. For basalt, almost all the model SS values lie beyond the maximum of the SS values found in the compiled data, although the number of data points measured by other methods is limited (n = 8). However, values higher than ~10–4 m–1 are not realistic for consolidated material (Fig. 2). For sandstone, SS values typically fall within the interquartile range of the estimated values.

An exceptionally high value of 2.3 × 10–3 m–1 for granite was used for a model that included decomposed rock; however, values for shale, claystone and marble are more consistent with the estimated values. Overall, this analysis indicates that the parameterization of models should be improved with more realistic SS values that are consistent with observed data for similar lithologies. Measurement of SS values from bore sites within the model area can provide higher confidence in SS values used in models compared to reliance on previous studies or literature.

Implications for transient numerical modelling

Four of the reviewed models demonstrated that some model outputs were sensitive to the value of SS, although most of the model results were more influenced by the vertical hydraulic conductivity of the aquitards, horizontal hydraulic conductivity of the major aquifer, recharge and head at the boundaries. For example, Golder (2007) found that the modelled water balance was sensitive to the SS of the aquitard: as the SS values decreased for the aquitard, leakage to and from the adjacent aquifer increased during recharge and pumping, respectively. In addition, Auctus (2017) reported that modelled drawdown was highly sensitive to order-of-magnitude changes in SS values. Therefore, the model was calibrated with the lower end (~10–5 m–1) of the likely SS range calculated from field tests for all layers. Furthermore, Pattle (2017) indicated that simulated groundwater levels and discharge to surface-water bodies are sensitive to SS. These models included a sensitivity analysis that involved varying SS upwards and downwards, although the results were generated from models that were no longer calibrated.

A comprehensive model sensitivity study by Turnadge et al. (2018) found that the timing of maximum drawdown prediction was highly sensitive to SS along with hydraulic conductivity for some but not all layers of the model. Their methods assessed the sensitivities of four types of model predictions to three types of model parameters (i.e. horizontal and vertical hydraulic conductivity, specific storage) for 10 hydrostratigraphic units represented in the transient model developed by CDM-Smith (2016). In addition to the sensitivity to the timing of maximum drawdown, this study also found that the magnitude of maximum drawdown and vertical fluxes along with the spatial extent of drawdown propagation predictions are moderately sensitive to SS.

Seven of the reviewed models represented spatially variable SS using pilot point parameterization with a spatial interpolation method (kriging) for specific layers (Bilge 2012; Frans and Olsen 2016; Khan et al. 2002; Liu et al. 2014; OGIA 2019; Pawel and Matthew 2018; Vermeulen and Kelder 2020). Pawel and Matthew (2018) modelled the Heretaunga aquifer for groundwater resource assessment with spatially variable SS that was calibrated for each of the two layers. In this model, parameterization of spatial variability in Kh, SS and specific yield (Sy) generated a smooth spatial variability using pilot points (Certes and de Marsily 1991), by assuming values at arbitrary points and interpolating between the points. The model was calibrated using the automated software PEST, with SS bounds for 185 pilot points across the model area set between 10–3 and 10–7 m–1 with a preferred value of 10–6 m–1. The PEST calibration also enabled quantitative uncertainty analysis (QUA) to be undertaken on this model, using calibration-constrained Monte-Carlo analysis (Knowling et al. 2018). OGIA (2019) also incorporated spatially variable SS in a more complex regional model, initially generated from random realizations of porosity and compressibility based on probability distributions for each lithology developed from core test results and literature values. A grid-based SS was eventually obtained by interpolation of pilot points to the model grid, and calibration was performed with the PEST software suite.
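The idea of generating SS from random realizations of porosity and compressibility can be sketched as follows. The example uses the classical relation Ss = ρg(α + nβ), which neglects grain compressibility and is not necessarily the form of Eq. (1a) or the procedure used by OGIA (2019); the water compressibility, the distribution parameters and the function names are all assumptions chosen for illustration.

```python
import random

RHO_W = 1000.0    # water density, kg m^-3
G = 9.81          # gravitational acceleration, m s^-2
BETA_W = 4.6e-10  # water compressibility, Pa^-1 (assumed typical value)

def specific_storage(alpha, porosity):
    """Classical Ss = rho * g * (alpha + n * beta), grain compressibility neglected.

    alpha is the formation compressibility in Pa^-1; the result is in m^-1.
    """
    return RHO_W * G * (alpha + porosity * BETA_W)

def ss_realizations(n, log10_alpha_mean, log10_alpha_sd, n_min, n_max, seed=0):
    """Draw n Ss values from a log-normal compressibility and a uniform porosity."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        alpha = 10 ** rng.gauss(log10_alpha_mean, log10_alpha_sd)
        porosity = rng.uniform(n_min, n_max)
        values.append(specific_storage(alpha, porosity))
    return values

# Hypothetical distributions for one lithology (illustrative numbers only).
realizations = ss_realizations(1000, log10_alpha_mean=-9.5, log10_alpha_sd=0.5,
                               n_min=0.05, n_max=0.25)
print(f"min = {min(realizations):.2e}, max = {max(realizations):.2e} m^-1")
```

In practice, such realizations would be assigned to pilot points and then constrained during calibration, rather than used directly as final parameter fields.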

Model layers with both SS and Sy values enable aquifers to behave as either confined or unconfined, with water released from storage according to Sy instead of SS if the water level declines below the confining layer (Pawel and Matthew 2018). However, a few models assigned SS values only to deeper layers, which were assumed to be confined, and Sy values only to upper layers, which were assumed to be unconfined (Barnett 2013; Bilge 2012; HydroSimulations 2019). Note that none of the reviewed models considered a variation of SS with depth and thickness of the layers, in contrast to the general trend of decreasing SS with increasing depth and thickness found in this study (Figs. 4 and 5).
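The switch between confined and unconfined storage described here can be expressed as a simple rule. The sketch below is a simplified illustration with a hypothetical function name, not the storage formulation of any particular modelling code: a layer releases water according to SS multiplied by layer thickness while the head is at or above the layer top, and according to Sy once the head falls below it.

```python
def storage_coefficient(head, top, bottom, ss, sy):
    """Dimensionless storage coefficient of a convertible model layer.

    head, top and bottom are elevations in metres; ss is in m^-1 and
    sy is dimensionless. Simplified rule for illustration only.
    """
    if head >= top:
        # Confined: elastic storage integrated over the full layer thickness.
        return ss * (top - bottom)
    # Unconfined: drainage of pore space at the water table dominates.
    return sy

# Hypothetical 30-m-thick layer with Ss = 1e-5 m^-1 and Sy = 0.1.
confined = storage_coefficient(head=50.0, top=40.0, bottom=10.0, ss=1e-5, sy=0.1)
unconfined = storage_coefficient(head=30.0, top=40.0, bottom=10.0, ss=1e-5, sy=0.1)
print(confined, unconfined)
```

The contrast of several orders of magnitude between the two results shows why the choice of SS matters mainly while a layer remains confined.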

The PDFs of SS presented here can be used to improve groundwater models with probabilistic outcomes and to evaluate uncertainties (QUA) of model results, especially when adequate information about the true distribution of SS is unavailable for the subsurface materials in a study area. They can also be useful for applying the principle of parsimony (Hill 1998) to combine hydrogeologic units in the model domain. Furthermore, the spatial distribution of SS resulting from subsurface heterogeneity can be represented by the provided PDFs during model upscaling, as similar lithologies occur worldwide.

Conclusion

This study conducted a meta-review of specific storage (SS) and compiled 430 values alongside other hydrogeological information such as porosity, hydraulic conductivity and formation compressibility. This database is more than twice the size of previous compilations and includes a multifactor analysis to enable a more advanced interpretation of its relevance in groundwater systems under different conditions. Overall, it is observed that although the SS values for similar types of material varied by up to six orders of magnitude, SS values are typically between 10–6 and 10–5 m–1 for consolidated materials, and between 10–5 and 10–4 m–1 for unconsolidated materials.

It is also found that conventional methods (i.e., pumping test, geotechnical core test) provide relatively higher values of SS than methods based on passively acquired pore pressure responses to natural forces (e.g., Earth tide, barometric loading, sea tide, seismic wave) for similar material types. Differences may be related to test scale, analytical model assumptions used in the interpretation of observational data and the overall stress conditions of the different methods. However, site-specific comparison would be required to resolve these possible influences. SS values were found to be larger in unconsolidated materials (aggregated means), especially sandy lithologies, whereas the spread (standard deviation) is greater within consolidated materials.

A review of the use of SS values in transient regional-scale groundwater models found that most models assumed a fixed value of SS for each layer or aquifer material and did not reference any source for this choice. The simplest models assume a constant SS across the whole model domain. Very few models evaluated or reported that model output was sensitive to SS values, although further evaluations are needed to determine the influence on the magnitude and timing of groundwater drawdown in upper layers of a model if relatively large SS values are used. Leading modelling practices enable variable SS across and within layers and apply QUA methods for evaluating the influence of SS values in modelling outcomes to improve the confidence of modelling results.

Values of SS compiled for a variety of subsurface materials can support a first estimate of SS for specific sites where direct measurement is not possible, particularly where modelling shows low sensitivity to SS parameterisation. Compiled SS values along with porosity and formation compressibility were used to develop a correlation. Empirical equations (shown in Figs. 6 and 7) developed in this study can be used to estimate SS values for a formation based on other parameter measurements. In addition, SS values, along with other parameters for different material types presented here (Table S1 of the ESM), can be used as a reference value and may help to constrain groundwater models or assist with analysis or calibration. The compiled SS data revealed a log-normal PDF from which a series of PDFs for 10 different types of consolidated and unconsolidated materials were derived. These PDFs can form a foundation for stochastic analysis of groundwater systems.

Further research is required to evaluate the physical basis for relatively high SS values (>1.3 × 10–5 m–1), particularly for sandy materials, and the extractable water content for different lithologies. Open questions that need further investigation include the sensitivity of SS estimation to varying magnitudes of hydraulic stress in monitoring bores, the conditions during on-site testing, the scale of influence for pore pressure around a borehole, and the scale dependency of SS for a range of subsurface conditions and stresses. Results from this study demonstrate that SS is a hydrogeological parameter that must be given more attention and, if estimated in situ at an appropriate scale, could improve understanding of groundwater storage and increase confidence in the modelling of groundwater systems.