Introduction

Hydraulic properties must be distributed spatially to simulate groundwater flow through complex aquifer systems. Typically, hydraulic properties are estimated based on relatively sparse measurements within a small volume of the subsurface, from which relations between geologic materials and hydraulic properties can be inferred. Geologic structure can be mapped broadly, whereas hydraulic properties cannot be measured directly at regional scales. Thus, geologic frameworks are regularly used in groundwater-flow modeling to define the structure of zones for distributing hydraulic properties throughout large simulated volumes of the subsurface. The implicit assumption of this well-established approach is that the relations between geology and hydraulic properties at the scale where measurements are available can be extrapolated throughout the entire model domain. However, for regional-scale groundwater modeling it is important to consider whether this type of extrapolation benefits from increasingly complex geologic frameworks.

Early geologic frameworks for groundwater models simply separated basin fill from bedrock (e.g., Williamson et al. 1985), which does not account for structural or stratigraphic complexity within the basin itself. However, during the last decade, geologic frameworks have relied on more extensive geologic information to subdivide simulated volumes into many dozens of hydrogeologic units (e.g., Belcher 2004; Stoller-Navarro Joint Venture 2009). Typically, hydraulic properties were distributed into zones by assuming minimal or no variability within hydrogeologic units. Calibration of early models by trial and error (e.g., Williamson et al. 1985) was more expedient when the variability of a hydraulic property such as transmissivity within a given zone was minimized. For similar practical reasons, subsequent model calibrations with formal parameter estimation approaches also initially avoided representing variability within hydrogeologic units. The rationale for this was that hydraulic properties were distributed with zones of homogeneous properties (e.g., Czarnecki and Waddell 1984), where heterogeneity was accommodated by subdividing zones into refined geologic frameworks (Belcher 2004; Hsieh et al. 2007). Heterogeneity within individual zones can be accommodated when sufficient data are available to constrain such variability. However, for geologic units to have hydrologic significance, the variability in hydraulic properties within those units should be less than the variability between different units. Thus, when extrapolating hydraulic properties across regional scales, homogeneous units are generally preferred.

The number of potential free parameters rapidly increases as geologic frameworks are subdivided further, but additional geologic units do not necessarily improve hydraulic property estimates. This condition can be identified where the mean and variance of hydraulic properties in subdivided hydrogeologic units differ little from the parent hydrogeologic unit—for example, Tertiary volcanic rocks in southern Nevada were differentiated into eight hydrogeologic units, but the mean log-hydraulic conductivities of 64 % of the combinations were statistically identical (Belcher et al. 2002). Differentiating Tertiary volcanic rocks also did not appreciably reduce the standard deviation of log-hydraulic conductivity in the eight hydrogeologic units, which ranged between 0.8 and 1.7. In such situations, excessive zonation introduces unnecessary complexity that can be misleading when extrapolating assumed relations between geology and hydraulic properties for regional-scale groundwater models. Although it is possible to expend considerable efforts developing extremely detailed geologic frameworks across large scales, complex zonation of groundwater models is only useful if it can be supported by hydraulic data.

One way investigators have attempted to handle uncertainty resulting from complex geology is through multi-model analysis (Burnham and Anderson 2002; Poeter and Hill 2007). Multiple groundwater model results are generated based on multiple geologic frameworks and resolved with multi-model analysis. Results from multiple calibrated models are synthesized because models appear similar and cannot be differentiated with their objective functions. Predictive uncertainty typically increases with multi-model analysis because the approach retains all model results. Ultimately, multi-model analysis can be problematic because the objective function does not guarantee that erroneous models are rejected. Thus, a practical approach for evaluating the suitable level of geologic complexity for a regional-scale groundwater model is still needed.

Regrettably, regional-scale constraints on model complexity and parameter estimation, such as regional groundwater discharge, are frequently disregarded. This is because they are not easily incorporated into an objective function, owing to differences in scale between investigation and observation. For example, 25 calibrated groundwater models of northern Yucca Flat in southern Nevada simulated flow rates that ranged between 15,000 and 33,000 acre-ft/year (0.019 and 0.040 km³/year; Ye et al. 2010). Northern Yucca Flat is a 170-mi² (440-km²) sub-area of a 4,200-mi² (10,880-km²) area that discharges only 18,000 acre-ft/year (0.022 km³/year) to the regional discharge point at Ash Meadows (Laczniak et al. 1999). An upper bound for groundwater discharge should therefore have been used in the objective function for models of northern Yucca Flat, since sub-area models that simulate flows in excess of groundwater discharge from an entire basin can be rejected easily. These types of simple but useful constraints on model calibration are often avoided because a formulaic approach for estimating flow through a small sub-area of a relatively large contributing basin is difficult to implement.
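The screening test implied by the Yucca Flat example can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the list of simulated flows contains just the two end-member values quoted above rather than the full set of 25 model results.

```python
# Screening sketch: reject sub-area models whose simulated throughflow
# exceeds the discharge measured for the whole contributing basin.
ACRE_FT_TO_KM3 = 1233.48 / 1e9  # 1 acre-ft = 1,233.48 m^3; 1 km^3 = 1e9 m^3


def acre_ft_to_km3(volume_acre_ft):
    """Convert a volume (or annual flow) from acre-ft to km^3."""
    return volume_acre_ft * ACRE_FT_TO_KM3


def exceeds_basin_discharge(simulated_flow, basin_discharge):
    """True when a sub-area flow is larger than the whole-basin discharge."""
    return simulated_flow > basin_discharge


basin_discharge = 18_000           # acre-ft/year discharged at Ash Meadows
simulated_flows = [15_000, 33_000]  # acre-ft/year, end members quoted in the text

rejected = [q for q in simulated_flows
            if exceeds_basin_discharge(q, basin_discharge)]
print(rejected)                                    # [33000]
print(round(acre_ft_to_km3(basin_discharge), 3))   # 0.022
```

The comparison is deliberately one-sided: a sub-area flow below the basin discharge is not thereby shown to be correct, only not rejected by this test.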

Aquifer-test results provide flexible, large-scale constraints for evaluating the degree to which a complex geologic framework is useful for defining zones within a regional-scale model. Unlike the contributing area to regional groundwater discharge, pumping and observation well locations can be specified so that the framework can be tested where desired; furthermore, the rate and duration of pumping discharges are known. Bulk hydraulic properties in large volumes of an aquifer can be estimated with confidence because drawdowns can be detected more than a mile from a pumping well (Geldon 2004; Garcia et al. 2013). Drawdowns from multiple, interfering (i.e., overlapping) aquifer tests at multiple depths define a volume of aquifer where hydraulic properties can be investigated in a manner similar to hydraulic tomography (Yeh and Liu 2000; Fienen et al. 2008; Bohling and Butler 2010). This investigated volume serves as a control for constraining multiple, competing groundwater flow models based on different conceptualizations of the hydrogeology.

This paper presents a novel approach for evaluating the suitability and, hence, the hydrologic utility of competing geologic frameworks for distributing zones of hydraulic properties across regional scales. Hydrologic utility is defined to increase as the uniqueness of individual zones increases; hydrologic utility decreases when splitting of zones does not result in one or more unique zones. Zones are more unique as standard deviations of log-hydraulic conductivities decrease and differences between mean log-hydraulic conductivities increase. The approach presented here quantifies differences between hydraulic properties as distributed by alternative geologic frameworks and estimated during groundwater flow model calibration with Tikhonov regularization. The hydrologic utility of competing geologic frameworks is ranked qualitatively with these criteria. The method facilitates an evaluation of the relative benefits of explaining heterogeneity by subdividing a hydrogeologic feature into multiple independent features or by increasing dependent variability within the existing feature. The approach is demonstrated with an example from Pahute Mesa, Nevada, where detailed geologic mapping facilitates the development of multiple geologic frameworks and extensive aquifer testing facilitates the calibration of the corresponding groundwater-flow models.

Pahute Mesa

Pahute Mesa is located in southern Nevada, within the Nevada National Security Site (NNSS) in the USA (Fig. 1). The area lies within the southwestern Nevada volcanic field and includes layered, variably welded, ash-flow tuff sheets and intercalated lavas derived from a series of caldera collapse-related eruptions between approximately 15 and 7.5 Ma (Winograd and Thordarson 1975; Byers et al. 1976; Sawyer et al. 1994; Laczniak et al. 1996; Fenelon et al. 2010).

Fig. 1

Location of well sites, pilot points (for hydraulic conductivity and storage coefficients, and for fault multipliers), and discretized fault structures associated with multi-well aquifer tests at Pahute Mesa, Nevada National Security Site (NNSS), 2009–2012. Inset map shows location of the NNSS, Nevada, USA

The volcanic-rock section has been divided both vertically and laterally into eight hydrostratigraphic units (HSUs) based on geologic and hydrogeologic data from surface mapping and boreholes (Table 1). The HSUs are broadly classified as aquifer, confining unit, or composite unit (Prothro et al. 2009a). Each HSU is a composite of multiple hydrogeologic units that are grouped by stratigraphy and inferred hydraulic properties. Rhyolitic lavas or welded ash-flow tuffs, such as in the BA and TSA (Table 1), generally comprise aquifers; bedded and nonwelded, zeolitized tuffs, such as the UPCU and LPCU (Table 1), typically comprise confining units; and stratigraphically complex mixtures of rhyolite lava and zeolitic nonwelded tuff, such as in the CHZCM (Table 1), typically comprise composite units (Blankennagel and Weir 1973; Prothro and Drellack 1997; Bechtel Nevada 2002).

Table 1 Description of hydrostratigraphic units (HSUs) in the study area (after Bechtel Nevada 2002)

The layered sequences of volcanic rocks beneath Pahute Mesa are disrupted by caldera margin structures (Timber Mountain caldera complex structural margin, Fig. 2) and further faulted into distinct structural blocks (Warren et al. 2000). More than a half dozen faults with offsets in excess of 700 ft (200 m) have been mapped previously in Pahute Mesa (McKee et al. 2001; Bechtel Nevada 2002) and additional faults are mapped as well drilling continues (for example, National Security Technologies, LLC 2010). Across the NNSS, faults have been characterized as potentially exhibiting distinct hydraulic properties from the surrounding rocks (Prothro et al. 2009b).

Fig. 2

Cross section through Pahute Mesa showing complex geometry of faulted HSUs (Table 1). The inferred locations of lava flows from alternative frameworks are indicated by cross-hatching. Model layers 2–28 are shown. Layer 1 is 1 ft (0.3 m) thick and layer 29 extends another 1,312 ft (400 m) down; neither is shown, to preserve the area of interest. Modified from Bechtel Nevada (2002)

The depth to the water table at Pahute Mesa exceeds 2,000 ft (600 m) and sparsely distributed wells, generally more than 0.5 mi (0.8 km) apart, penetrate more than 5,000 ft (1,500 m) of the complex volcanic-rock-dominated hydrogeologic system (Fenelon et al. 2010). Water-level data are collected from multi-level monitoring wells at 14 locations (Figs. 1 and 2; Navarro-Intera, LLC 2013). Environmental water-level fluctuations are substantial beneath Pahute Mesa because of the thick unsaturated zone and high pneumatic and hydraulic diffusivity of the volcanic rocks.

Geologic frameworks

Hydraulic properties in groundwater models of Pahute Mesa must be distributed spatially to simulate flow and transport so that radionuclide migration can be evaluated (Laczniak et al. 1996). Borehole logs and geologic mapping of faults and outcrops provide sufficient information to develop geologic frameworks of the subsurface beneath Pahute Mesa that depict the three-dimensional relationships of HSUs and structural features (Bechtel Nevada 2002; Fig. 2). A standard geologic framework has been developed for the underground test area (UGTA) activity that is a primary basis for distributing hydraulic properties (Bechtel Nevada 2002). The standard geologic framework was discretized vertically into 251 layers from 1,700 ft (500 m) below sea level to 6,500 ft (2,000 m) above sea level, where each layer was about 33 ft (10 m) thick.

Although considerable effort was involved in defining and subdividing the geology into hydrostratigraphic units (HSUs) throughout the NNSS, it remains unclear what level of geologic complexity actually benefits groundwater-model development at Pahute Mesa. The hydrologic utility of the standard geologic framework and several alternative geologic frameworks is investigated within the study area, where about 40 mi³ (167 km³) of aquifer have been characterized with eight multi-well aquifer tests (Halford et al. 2012a; Table 2). The alternative geologic frameworks include a range of greater and lesser complexity than the standard framework. The least complex is an undifferentiated case where the entire aquifer is a single HSU. In a more complex framework, standard HSUs are subdivided into lava and non-lava units, and in the most complex, fault structures are further subdivided as hydraulically unique features. Finally, the approach is used to test the hypothesis that analysis of these four frameworks can inform the development and testing of a fifth, simplified geologic framework for Pahute Mesa with hydraulically unique units. Figure 3 illustrates these five conceptual frameworks.

Table 2 Summary of eight aquifer tests and six groundwater-flow models, Pahute Mesa, NNSS (US Geological Survey 2011). (Well locations are shown in Fig. 1.)

Fig. 3

Comparative hydrostratigraphic columns of investigated geologic frameworks with abbreviated HSU names (Table 1)

Methodology

The hydrologic utility of geologic frameworks is evaluated by calibrating groundwater flow models to drawdowns from multiple interfering aquifer-tests and interpreting differences between preferred and estimated hydraulic properties. Tikhonov regularization is employed to specify that the preferred hydraulic conductivity distribution within each HSU is homogeneous, which reflects a known ignorance of the natural variability within HSUs. The hydrologic utility is assessed by examining the uniqueness of the HSUs determined for each alternative geologic framework, which is used to inform the development of a simplified framework with the appropriate level of geologic complexity for groundwater modeling in the region.

Groundwater model

Groundwater flow and drawdowns from each aquifer test are simulated with MODFLOW-2000 (Harbaugh et al. 2000). Each model grid extends laterally about 200,000 ft (61,000 m) away from the pumping well. All models extend vertically from 1,700 ft (500 m) below sea level to 4,200 ft (1,300 m) above sea level, which coincides with the water table. All models are discretized vertically into 29 layers (Fig. 2; right-hand side). Rows and columns in the grid are assigned widths of 100 ft (30 m) at the pumped well and are expanded successively by a factor of 1.25 away from the pumped well to the flow-model edges. The number of rows and columns differs among models but ranges between 118 and 121. All external boundaries are specified as no-flow boundaries. Changes in the wetted thickness of the aquifer system are not simulated because the maximum drawdown near the water table is very small relative to the total saturated thickness. Simulation periods are subdivided into stress periods that simulate simplified pumping schedules for each pumping well site, with between 6 and 12 stress periods each (Table 2).

Within the models, hydraulic conductivity is distributed throughout each HSU with many pilot points because variability within each HSU is possible. Pilot points are locations in the model domain that guide the estimation of hydraulic properties (RamaRao et al. 1995) and are assigned to HSUs at multiple depths for 76 mapped locations (Fig. 1). Fewer than 76 pilot points exist in most HSUs because pilot points are not defined where a HSU is absent. Hydraulic conductivity is distributed with a total of 509 pilot points. Hydraulic properties are interpolated by kriging from pilot points to node locations as defined for each groundwater-flow model. Pilot points are spaced 4,000 ft (1,200 m) apart and distributed with sufficient density and coverage to accommodate information between pumping and observation wells (Fig. 1). The spatial variability of log-hydraulic conductivity is defined with an isotropic, exponential variogram, where the specified range is 15,000 ft (4,600 m) and no nugget is specified. Although the variograms play a lesser role than the prior information (see section ‘Tikhonov regularization’), an exponential variogram is selected and this large range is specified to allow multiple pilot points to contribute to each quadrant about a MODFLOW cell.
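The shape of the variogram described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not say whether 15,000 ft is the exponential length parameter or the practical range (roughly three times that parameter), so the sketch assumes the former, and the unit sill is also an assumption.

```python
import math


def exponential_variogram(h, a=15_000.0, sill=1.0):
    """Semivariance at separation h (ft) for an exponential model, no nugget.

    Assumes gamma(h) = sill * (1 - exp(-h / a)), with `a` taken to be the
    15,000-ft value quoted in the text. Both choices are assumptions.
    """
    return sill * (1.0 - math.exp(-h / a))


# Semivariance at zero separation, at the pilot-point spacing, at the
# quoted range, and at three times the quoted range.
for h in (0.0, 4_000.0, 15_000.0, 45_000.0):
    print(h, round(exponential_variogram(h), 3))
```

Under these assumptions, pilot points one spacing apart (4,000 ft) remain strongly correlated, which is consistent with the stated aim of letting several pilot points contribute to each interpolated cell.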

Calibration constraints

More than 23 million gallons of water were pumped from Pahute Mesa during eight aquifer tests between 2009 and 2012. Drawdowns from 83 pumping-observation well pairs define the investigated volume, where the maximum distance between pumping and observation wells exceeded 2 mi (3.2 km; Halford et al. 2012a). Within the investigated volume, hydraulic properties are constrained by the aquifer-test data, whereas beyond the investigated volume, hydraulic properties are unknown and thus are extrapolated based on assumed relations defined by the geologic framework.

Specific yield and specific storage are estimated as uniform values for each HSU since it is assumed they can be well constrained for the calibration based on values typically observed for most aquifer materials (US Geological Survey 2013). Specific yield of fractured rocks is expected to range between 0.005 and 0.05. Specific storage is initially assigned as 1.5 × 10⁻⁶ ft⁻¹ (4.9 × 10⁻⁶ m⁻¹) and allowed to range between 1 × 10⁻⁷ and 3 × 10⁻⁶ ft⁻¹ (3.2 × 10⁻⁷ and 9.8 × 10⁻⁶ m⁻¹). Vertical-to-horizontal anisotropy is assumed equal to 1 and is not estimated.

Hydraulic properties are estimated initially by simultaneously calibrating six separate groundwater-flow models to the pumping responses from the eight aquifer tests for each framework that is tested. Independent groundwater-flow models allow grid refinement near each pumping well, different pumping schedules to match each aquifer test, and drawdown estimates specific to one or two tests. Two pairs of tests at nested sites are each analyzed with a single model because drawdowns from paired tests interfered and could not be isolated (Table 2).

Log-hydraulic conductivity and storage coefficients are estimated by minimizing a weighted composite sum-of-squares objective function. The hydraulic-conductivity distributions are estimated by adjusting 494 of 509 pilot-point values with PEST (Doherty 2008). Pilot-point values are fixed and not estimated where whole HSUs were minimally investigated by aquifer tests. The objective function compares differences between simulated and measured drawdowns and differences between log-hydraulic conductivity estimates within each HSU. The measurement objective function comprises 24,563 drawdown comparisons, of which 18,055 are weighted >0.5 (Table 2).
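The measurement part of such an objective function can be sketched as a sum of squared weighted residuals. The function name and the drawdown values below are hypothetical, and the convention that each weight multiplies its residual before squaring is an assumption about how the weights enter.

```python
def weighted_sum_of_squares(simulated, measured, weights):
    """Return phi = sum((w * (sim - meas))**2) over paired observations."""
    if not (len(simulated) == len(measured) == len(weights)):
        raise ValueError("inputs must have the same length")
    return sum((w * (s - m)) ** 2
               for s, m, w in zip(simulated, measured, weights))


# Hypothetical drawdowns in feet; values are for illustration only.
measured = [0.10, 0.25, 0.40]
simulated = [0.12, 0.24, 0.43]
weights = [1.0, 1.0, 0.5]   # the third observation is down-weighted

phi = weighted_sum_of_squares(simulated, measured, weights)
print(round(phi, 6))
```

Down-weighting an observation reduces its contribution quadratically, so an observation with weight 0.5 counts one quarter as much as an equally misfit observation with weight 1.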

Tikhonov regularization

Tikhonov regularization minimizes differences between log-hydraulic conductivity estimates within HSUs (Doherty and Johnson 2003). Preferred hydraulic conductivity distributions define the prior information for each of the alternative geologic frameworks, where differences between log-hydraulic conductivity estimates equaled 0 within each HSU. Therefore, each geologic framework uses different prior information equations to reflect the alternative geologic interpretations which inform the objective function of assumed relations between geology and the distribution of hydraulic properties. Prior information weights are equal to 1 where pilot points are separated by less than 4,000 ft (1,200 m), decreased from 1 to 0.2 between 4,000 and 20,000 ft (1,200 and 6,100 m) and equal to 0 where separation exceeds 20,000 ft (6,100 m). About 11,000 regularization equations define the preferred hydraulic conductivity distribution of the standard geologic framework. Variability of log-hydraulic conductivity within each HSU is reduced by the prior-information equations to the minimum variation needed to simulate the observed drawdowns. The mean log-hydraulic conductivity of each HSU is unconstrained because no preferred value is specified.
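The distance-dependent weighting of the prior-information equations can be sketched as below. This is an illustration under stated assumptions: the taper from 1 to 0.2 is taken to be linear (the text gives only the end values), separation is measured as straight-line distance between pilot points, and the pilot-point names and coordinates are hypothetical.

```python
import math
from itertools import combinations


def prior_weight(separation_ft):
    """Weight of a prior-information equation for two pilot points."""
    if separation_ft < 4_000.0:
        return 1.0
    if separation_ft > 20_000.0:
        return 0.0
    # Assumed linear taper from 1.0 at 4,000 ft to 0.2 at 20,000 ft.
    return 1.0 - 0.8 * (separation_ft - 4_000.0) / 16_000.0


def regularization_equations(pilot_points):
    """List (name_a, name_b, weight) for pairs in the same HSU.

    Each tuple stands for the preferred condition
    log10(K_a) - log10(K_b) = 0; pairs with zero weight are dropped.
    pilot_points is a list of (name, hsu, (x, y, z)) with coordinates in ft.
    """
    equations = []
    for (na, ha, pa), (nb, hb, pb) in combinations(pilot_points, 2):
        if ha != hb:
            continue  # homogeneity is preferred only within an HSU
        w = prior_weight(math.dist(pa, pb))
        if w > 0.0:
            equations.append((na, nb, w))
    return equations


# Hypothetical pilot points: name, HSU, (x, y, z) in feet.
points = [
    ("pp1", "TSA", (0.0, 0.0, 0.0)),
    ("pp2", "TSA", (3_000.0, 0.0, 0.0)),
    ("pp3", "TSA", (12_000.0, 0.0, 0.0)),
    ("pp4", "BA", (1_000.0, 0.0, 0.0)),
    ("pp5", "TSA", (30_000.0, 0.0, 0.0)),
]
equations = regularization_equations(points)
for eq in equations:
    print(eq)
```

Because equations are written only between pilot points of the same unit, changing the geologic framework changes which pairs are tied together. This is why each framework carries a different set, and a different number, of prior-information equations.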

Regardless of which geologic framework is tested, all models are calibrated to within the irreducible measurement error, which means that each geologic framework performs equally well at matching drawdown observations within the investigated volume. Irreducible measurement and numerical model errors are estimated from water-level modeling results (Garcia et al. 2013; Halford et al. 2012b). The expected measurement root-mean-square error (RMSE) is 0.02 ft (0.6 cm), which equals a sum-of-squares error of 7 ft² (0.7 m²). Simulated drawdowns for all models match measured drawdowns to within the limits of the irreducible RMSE of 0.02 ft (0.6 cm). Simulated and measured drawdowns in well ER-EC-6 shallow during five of the eight aquifer tests exhibit typical differences (Fig. 4).
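The stated equivalence between the RMSE and the sum-of-squares error can be checked with the relation SSE = n × RMSE². The text does not state which observation count was used; the sketch below assumes the 18,055 drawdowns weighted above 0.5, because that count reproduces the quoted value.

```python
# Check SSE = n * RMSE**2 for the quoted measurement error.
n_weighted = 18_055      # assumed count: drawdowns weighted > 0.5
rmse_ft = 0.02           # expected measurement RMSE, ft
FT2_TO_M2 = 0.3048 ** 2  # square feet to square metres

sse_ft2 = n_weighted * rmse_ft ** 2
sse_m2 = sse_ft2 * FT2_TO_M2
print(round(sse_ft2, 1), round(sse_m2, 2))  # 7.2 0.67
```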

Fig. 4

Simulated and measured drawdowns in well ER-EC-6 S during the ER-EC-11 main, ER-20-8 main upper and lower zones (ER-20-8I+D), and ER-EC-12 main upper and lower zones (ER-EC-12S+I) aquifer tests as interpreted with the standard and alternative geologic frameworks

Evaluating alternative frameworks

Areal extent and volume of investigation are mapped using the maximum simulated drawdown, which is the maximum drawdown at a given cell within the model domain from any aquifer test. For example, maximum simulated drawdown in well ER-EC-6 shallow using the standard framework is 0.45 ft (14 cm), which occurred during the 18th day of the ER-EC-11 main aquifer test (Fig. 4). A threshold of 0.05 ft (1.5 cm) for detecting simulated drawdown is supported by residual errors from water-level modeling to estimate drawdowns (Garcia et al. 2013). The extent of investigation where the maximum simulated drawdown exceeds this threshold is a two-dimensional area that is defined by the maximum simulated drawdown at any depth (Fig. 5). The hydrologic utility of alternative geologic frameworks is evaluated by estimating hydraulic property distributions within each framework. Undifferentiated, separated-lavas, fault-structure, and simplified are the four alternative geologic frameworks with differing complexity relative to the standard framework (Fig. 3).
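The mapping of investigated volume and areal extent reduces to two maximum operations followed by a threshold. The sketch below uses plain nested lists and a tiny hypothetical grid (two tests, two layers, three cells per layer); the function names are illustrative and the drawdown values are invented apart from the 0.45 ft maximum and 0.05 ft threshold quoted above.

```python
THRESHOLD_FT = 0.05  # detection threshold for simulated drawdown


def max_drawdown_by_cell(drawdowns):
    """drawdowns[test][layer][cell] -> maximum over tests as [layer][cell]."""
    n_layers = len(drawdowns[0])
    n_cells = len(drawdowns[0][0])
    return [[max(test[k][c] for test in drawdowns) for c in range(n_cells)]
            for k in range(n_layers)]


def investigated_volume(max_dd, threshold=THRESHOLD_FT):
    """Three-dimensional mask: cells whose maximum drawdown exceeds threshold."""
    return [[dd > threshold for dd in layer] for layer in max_dd]


def investigated_extent(max_dd, threshold=THRESHOLD_FT):
    """Two-dimensional mask: maximum over all depths exceeds threshold."""
    n_cells = len(max_dd[0])
    return [max(layer[c] for layer in max_dd) > threshold
            for c in range(n_cells)]


# Hypothetical simulated drawdowns (ft) for two aquifer tests.
test_a = [[0.45, 0.06, 0.01], [0.10, 0.02, 0.00]]
test_b = [[0.02, 0.03, 0.01], [0.01, 0.08, 0.04]]

max_dd = max_drawdown_by_cell([test_a, test_b])
volume = investigated_volume(max_dd)
extent = investigated_extent(max_dd)
print(max_dd)
print(extent)
```

A map location belongs to the areal extent if any depth at that location exceeds the threshold, so the extent can be larger than the footprint of the investigated volume in any single layer.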

Fig. 5

Maximum simulated drawdown from standard geologic framework and extents of investigation for each geologic framework as defined by the corresponding 0.05 ft (1.5 cm) contours shown. Fault traces, NNSS boundary, and pumping and observation well locations are shown. Black-filled circles and the black and yellow-filled circles represent observation wells and pumping wells, respectively (see also Fig. 1)

The undifferentiated framework is the simplest geologic framework and lumps the entire subsurface into a single HSU. It establishes a baseline for the hydraulic significance of other frameworks, but is not considered a viable framework itself. The separated-lavas geologic framework intersects an independent geologic model of lava distributions with the standard framework, and four HSUs are subdivided into lava and non-lava (Fig. 2). Lavas generally are expected to be more permeable, based on flow-log data (Garcia et al. 2010). The fault-structure geologic framework has the same HSUs as the separated-lavas geologic framework and also differentiates seven major fault structures as hydraulically unique features (Prothro et al. 2009a). Simulated faults are multiplier arrays that further modify hydraulic conductivities that are distributed by HSUs in the fault-structure geologic framework. This approach has been applied in previous models of Pahute Mesa (Stoller-Navarro Joint Venture 2009). Hydraulic conductivity multipliers in faults, like HSUs, are expected to be spatially variable. The fault multiplier array is defined with 56 pilot points, is interpolated by kriging, and affects a zone 820 ft (250 m) wide, measured perpendicular to the fault traces (Fig. 1). Faults are classified by orientation because Dickerson (2001) proposes that groundwater flow is restricted by north–south trending faults and enhanced by east–west trending faults. As with the HSUs, variability of multipliers within each of the two fault classes is minimized by prior-information equations.

The results for the more complex geologic frameworks (see section ‘Results’) support the development of a simplified framework that is better suited for extrapolating hydraulic properties than the four previously tested frameworks. In this simplified framework, hydraulically similar HSUs are combined so that the resulting four HSUs are unique and conceptually consistent (Fig. 3). The sUPCU, TCA, LPCU, TSA, and sCHZCM are combined into a single HSU. The lava and non-lava components of the sCHZCM are not separated, despite being hydraulically unique. This is because the estimated mean hydraulic conductivity of the lava is less than the non-lava, which is inconsistent with the conceptual model of lavas at NNSS. Lavas in the sUPCU are combined with the BA/SPA because they are hydraulically similar and are conceptualized as having been originally misclassified. The sFCCU and sCFCM remain unchanged in the simplified framework.

Hydraulic properties at pilot points from the calibrated standard framework model are the initial estimates for all alternative geologic frameworks. Calibration of each alternative model therefore consists primarily of minimizing differences in prior information, because simulated and measured drawdowns already agree within the tolerances of measurement error using the initial set of hydraulic properties. Alternative sets of prior information range between 9,000 and 68,000 equations for the alternative geologic frameworks.

The utility of a geologic framework for extrapolating hydraulic properties beyond the zone of investigation increases with greater agreement between preferred and estimated hydraulic conductivity distributions. Differences between preferred and estimated log-hydraulic conductivities are measured using the standard deviation within each HSU or structure. This approach directly measures departure from the assumed distribution because preferred variability in each HSU or structure is 0. The standard deviation of log-hydraulic conductivity estimates for each HSU or structure underestimates the true variability, because variability is minimized by the Tikhonov regularization equations.

Mean and standard deviation of log-hydraulic conductivity for each HSU or structure are estimated in the investigated volume where simulated drawdowns exceed 0.05 ft (1.5 cm; Fig. 5). Hydraulic conductivities within the investigated volume are inferred primarily from drawdown data. As a result, hydraulic conductivity estimates within the investigated volumes are similar, irrespective of geologic framework, and the influence of the frameworks is largely on the distribution outside the investigated volume. Thus, where aquifer-test data exist to constrain the model, the framework has little influence and hydraulic properties are similar, whereas where no hydraulic data exist to constrain the model, the assumptions about geology are highly influential.

For each geologic framework, log-hydraulic conductivities are sampled directly from six MODFLOW models at geologic framework cells and averaged in each framework cell, such that the distribution of hydraulic properties reflects the average values calibrated for the different aquifer tests. Despite minor differences between the grid spacing for each MODFLOW model (Table 2), the standard deviation of log-hydraulic conductivity is less than 0.3 log (ft/day) in more than 95 % of the geologic framework cells within the investigated volume. This small standard deviation of the six samples in each framework cell shows that little additional variability is introduced by interpreting aquifer tests with multiple MODFLOW models that use slightly different grid spacing for each of the six aquifer tests.

Hydrologic utility increases as the uniqueness of HSUs and structures increases within a geologic framework. For a given framework, these features are more unique as the standard deviation of hydraulic conductivities in a HSU or structure decreases and the differences between mean hydraulic conductivities of different HSUs and structures increase. According to this definition of utility, a framework with low utility will include units that are not significantly different from each other, whereas high utility is identified by geologic units that are easily distinguished from each other in terms of their hydraulic properties. These measures of utility and uniqueness are qualitative, but relate directly to how suitable a geologic framework is for simulating hydraulic responses.
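The two statistics behind this definition, the mean and the standard deviation of log-hydraulic conductivity per HSU, can be computed as sketched below. The function name and the cell values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def hsu_statistics(cells):
    """Return {hsu: (mean_log10_K, std_log10_K)} for (hsu, K) pairs.

    K is hydraulic conductivity in ft/day for framework cells inside the
    investigated volume. Population standard deviation is assumed.
    """
    logs_by_hsu = defaultdict(list)
    for hsu, k in cells:
        logs_by_hsu[hsu].append(math.log10(k))
    return {hsu: (mean(v), pstdev(v)) for hsu, v in logs_by_hsu.items()}


# Hypothetical cell values: unit "A" is variable, unit "B" is uniform.
cells = [("A", 10.0), ("A", 100.0), ("B", 0.1), ("B", 0.1)]
stats = hsu_statistics(cells)
for hsu, (mu, sigma) in stats.items():
    print(hsu, round(mu, 2), round(sigma, 2))
```

In this toy case the two units would count as unique: their means differ by 2.5 log units while neither standard deviation exceeds 0.5.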

Results

The analyses presented here include a comparison of HSUs and structures in standard, separated-lavas, fault-structure, and simplified frameworks to the single HSU in the undifferentiated framework. Mean and standard deviation of hydraulic conductivity are 0.9 ft/day (0.3 m/day) and a factor of 3, respectively, for the single HSU in the undifferentiated framework (Fig. 6). Displacing the mean hydraulic conductivity of a HSU or structure outside of the range between 0.3 and 3 ft/day (0.1 to 1 m/day) or reducing the standard deviation to less than a factor of 3 are considered to be improvements.
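The improvement criterion stated above can be written as a small predicate. The function name is hypothetical and the example values are invented; only the baseline bounds (0.3 to 3 ft/day) and the baseline factor of 3 come from the text. A standard deviation expressed as a factor is taken here to mean 10 raised to the standard deviation of log10 K.

```python
def is_improvement(mean_k, std_factor,
                   lower=0.3, upper=3.0, base_factor=3.0):
    """True if a unit departs from the undifferentiated baseline.

    mean_k      mean hydraulic conductivity of the unit, ft/day
    std_factor  standard deviation expressed as a multiplicative factor
    A unit improves on the baseline when its mean falls outside
    [lower, upper] or its standard deviation factor is below base_factor.
    """
    mean_displaced = mean_k < lower or mean_k > upper
    spread_reduced = std_factor < base_factor
    return mean_displaced or spread_reduced


# Hypothetical units for illustration.
print(is_improvement(0.9, 3.0))   # False: identical to the baseline
print(is_improvement(10.0, 4.0))  # True: mean displaced above 3 ft/day
print(is_improvement(1.0, 2.0))   # True: standard deviation reduced
```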

Fig. 6

Mean and standard deviation of hydraulic conductivity in hydrostratigraphic units (HSUs) and structures that were defined by the undifferentiated, standard, separated-lavas, and fault-structure frameworks. Background colors correspond to the HSUs illustrated in Fig. 3. Note: 1 ft/day ≈ 0.3 m/day

The hydraulic conductivities of the sFCCU, BA/SPA, sCFCM, and sCHZCM in the standard framework are clearly distinguished from the single HSU in the undifferentiated framework (Fig. 6). The sFCCU and sCHZCM differ greatly because the standard deviations decrease in the standard framework to less than a factor of 2. Mean hydraulic conductivities of the BA/SPA and sCFCM differ by more than a factor of 10 from the single HSU in the undifferentiated framework (Fig. 6). Mean hydraulic conductivity of the TCA in the standard framework is a factor of 4 lower than in the undifferentiated framework. A marked decrease in hydraulic conductivity of the TCA is inconsistent with this HSU being classified as an aquifer (Prothro et al. 2009a) and therefore is not considered an improvement.

Hydraulic conductivity distributions of the BA/SPA and sUPCU are improved marginally by subdividing these HSUs into lava and non-lava units in the separated-lavas framework (Fig. 6). Mean and standard deviation of hydraulic conductivities differ most in the sUPCU when the original HSU is separated into lava and non-lava HSUs. Mean hydraulic conductivities differ by a factor of 5, and standard deviations of the lava and non-lava HSUs are about half of the standard deviation of the original HSU (Fig. 6). Lava and non-lava HSUs in the sCHZCM differ, but the mean hydraulic conductivity of the lava HSU is less than that of the non-lava HSU. This contradicts the conceptual model and empirical evidence of lavas as being generally more permeable units.

Adding fault structures as unique hydrologic features degrades the hydrologic utility of all HSUs in the fault-structure framework. Standard deviations of every HSU in the fault-structure framework are greater than in comparable HSUs in the separated-lavas framework. Standard deviations increase the most in lava and non-lava HSUs of the sUPCU and sCHZCM (Fig. 6).

The HSUs of the simplified framework are hydraulically unique and, because of the calibration procedure, explain the field observations as well as the HSUs and structures in the other frameworks. Simulated and measured drawdowns agree to within the irreducible measurement error of 0.02 ft (0.6 cm; Fig. 4). The hydraulic conductivity estimates of the sFCCU and combined “sUPCU, TCA, LPCU, TSA, and sCHZCM” are most similar, but overlap between their probability density functions (PDFs) is less than 60 % (Fig. 6). By comparison, overlaps in the standard framework, such as between the LPCU and TSA, exceed 90 % (Fig. 6). The BA/SPA and sCFCM differ most in the simplified framework, where mean hydraulic conductivities differ by more than two orders of magnitude and PDFs overlap by less than 3 %.
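The overlap between two PDFs can be illustrated with a minimal sketch. It assumes that each HSU is summarized by a normal distribution of log10 hydraulic conductivity and that overlap is the integral of the pointwise minimum of the two densities (the overlap coefficient). Neither assumption is confirmed by the text, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def pdf_overlap(mean_a, sd_a, mean_b, sd_b, n=20001):
    """Overlap coefficient of two normal PDFs, in percent.

    Assumption: means and standard deviations are given in log10(K) units.
    The integral of min(f_a, f_b) is approximated with the trapezoid rule.
    """
    lo = min(mean_a - 8.0 * sd_a, mean_b - 8.0 * sd_b)
    hi = max(mean_a + 8.0 * sd_a, mean_b + 8.0 * sd_b)
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * dx
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end weights
        total += weight * min(normal_pdf(x, mean_a, sd_a),
                              normal_pdf(x, mean_b, sd_b))
    return 100.0 * total * dx

# Hypothetical units: equal spread, means two standard deviations apart.
overlap_percent = pdf_overlap(0.0, 1.0, 2.0, 1.0)
```

Under these assumptions, two units with identical means and standard deviations overlap by 100 %, and the overlap falls toward zero as the means separate or the standard deviations shrink.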

Discussion

Geologic framework selection significantly affects hydraulic property estimates beyond the extent of investigation, which also will affect flow and transport model results. Hydraulic property estimates within the extent of investigation are constrained primarily by field observations and are similar, irrespective of the geologic framework. Geologic framework selection becomes highly relevant beyond the extent of investigation, because assumed relations affect hydraulic property estimates more than field observations.

The effects of geologic frameworks on hydraulic property distributions within and beyond the extent of investigation are presented with maps of transmissivity (Fig. 7). Transmissivity integrates effects of multiple HSUs with variable thicknesses and simplifies comparison of the 29-layer models. Figure 7 shows transmissivity distributions of the undifferentiated and fault-structure frameworks, which reflect the widest range of independent HSUs, and the simplified framework (Fig. 3). Distributions of transmissivity are similar within the investigated extents of the undifferentiated, fault-structure, and simplified frameworks (Fig. 7). Transmissivity generally exceeds 10,000 ft²/day (929 m²/day) between the pumping wells ER-20-8#2 main, ER-EC-11 main, and ER-20-7 and decreases to the east and west. These transmissivity distributions are similar in the investigated extents because hydraulic conductivity estimates are informed primarily by field observations.
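As a minimal sketch of how transmissivity integrates layered properties, the transmissivity of one vertical column can be computed as the sum of hydraulic conductivity times thickness over its layers. The layer values below are hypothetical and are not taken from the Pahute Mesa models.

```python
FT2_TO_M2 = 0.3048 ** 2  # 1 ft2 = 0.09290304 m2

def column_transmissivity(k_ft_per_day, thickness_ft):
    """Sum K * b over the layers of one vertical column (ft2/day)."""
    if len(k_ft_per_day) != len(thickness_ft):
        raise ValueError("need one thickness per layer")
    return sum(k * b for k, b in zip(k_ft_per_day, thickness_ft))

# Hypothetical three-layer column (illustrative values only).
k = [50.0, 2.0, 0.1]        # hydraulic conductivity, ft/day
b = [150.0, 400.0, 1000.0]  # saturated layer thickness, ft

t_ft2 = column_transmissivity(k, b)  # 7500 + 800 + 100 = 8400 ft2/day
t_m2 = t_ft2 * FT2_TO_M2             # same value in m2/day
```

In this example the thin, permeable top layer contributes most of the total, which is why a single transmissivity map can summarize many layers at once.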

Fig. 7

Transmissivity distributions and extents of investigation from undifferentiated, fault-structure, and simplified geologic frameworks. Note: 1 ft²/day ≈ 0.09 m²/day

Distributions of transmissivity differ to a greater degree beyond the investigated extent of the undifferentiated, fault-structure, and simplified frameworks (Fig. 7). Transmissivity of the undifferentiated framework beyond the investigated extent is relatively uniform because estimates default to the one unconstrained mean hydraulic conductivity of the single HSU. Transmissivity of the fault-structure framework beyond the investigated extent is highly variable because the 14 HSUs and structures allow 14 unconstrained mean hydraulic conductivities to be estimated (Fig. 3). Transmissivity estimates in fault structures beyond the extent of investigation exceed transmissivities in areas where field observations control hydraulic conductivity estimates. The distribution of transmissivity varies beyond the investigated extent in the simplified framework because hydraulically unique HSUs are retained (Fig. 7). For example, transmissivities of 100–300 ft²/day (9–28 m²/day) are extrapolated northeast of the investigated extent because the sCFCM is prevalent and therefore retained as a unique HSU in the simplified framework. Likewise, transmissivities of 3,000–10,000 ft²/day (278–929 m²/day) are extrapolated southeast of the investigated extent because the BA/SPA occurs there and retains a generally higher hydraulic conductivity than other HSUs. The variability in transmissivity beyond the investigated extent is justified by the quantitative evaluation of the uniqueness of the HSUs within the simplified geologic framework (Fig. 6).

Conclusions

This work demonstrates that the hydrologic utility of geologic frameworks can be evaluated directly with aquifer-test results and geologic observations. Aquifer-test results provide flexible large-scale constraints because known volumes of water can be displaced at specified rates and locations. Hydraulic properties are distributed in numerical flow models with many pilot points. Assumed relations from the geologic framework are specified flexibly through Tikhonov regularization of the pilot points. The approach presented here allows hydraulic properties to vary within hydrogeologic units so observed hydraulic responses can be matched. Differences between assumed hydraulic property distributions and those necessary to simulate measured drawdowns are minimized by the geologic observations that are imposed through Tikhonov regularization. This approach is used to test multiple geologic frameworks of highly fractured volcanic rocks at Pahute Mesa, NNSS, where about 40 mi³ (167 km³) of aquifer have been characterized with eight multi-well aquifer tests.
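The balance between matching drawdowns and honoring the geologic framework can be sketched as a regularized objective function. The form below is one simple variant, assumed for illustration only: a weighted drawdown misfit plus a penalty on the departure of each pilot-point log-hydraulic conductivity from the mean of its unit. The function name, arguments, and penalty form are hypothetical and are not claimed to be the formulation used in this study.

```python
def regularized_objective(sim, obs, weights, log_k, unit_of, reg_weight):
    """Weighted measurement misfit plus a within-unit homogeneity penalty.

    sim, obs, weights: simulated drawdowns, measured drawdowns, and weights.
    log_k, unit_of: log10(K) at each pilot point and the unit it belongs to.
    reg_weight: assumed scalar that trades misfit against homogeneity.
    """
    misfit = sum((w * (s - o)) ** 2 for s, o, w in zip(sim, obs, weights))

    # Group pilot-point values by hydrogeologic unit.
    by_unit = {}
    for value, unit in zip(log_k, unit_of):
        by_unit.setdefault(unit, []).append(value)

    # Penalize departures from the unit mean (assumed homogeneity).
    penalty = 0.0
    for values in by_unit.values():
        unit_mean = sum(values) / len(values)
        penalty += sum((v - unit_mean) ** 2 for v in values)

    return misfit + reg_weight * penalty

# Hypothetical values: two observations, three pilot points in two units.
phi = regularized_objective(
    sim=[1.0, 2.0], obs=[1.0, 1.5], weights=[1.0, 2.0],
    log_k=[1.0, 3.0, 2.0], unit_of=["A", "A", "B"], reg_weight=0.5,
)
```

With a penalty of this kind, pilot points depart from the unit mean only where the drawdown observations require it, so unconstrained pilot points revert to the assumed homogeneous value.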

The utility of geologic frameworks for extrapolating hydraulic properties is quantified with estimates of means and standard deviations of log-hydraulic conductivity of units and structures. Log-hydraulic conductivities are sampled exclusively from a simulated volume where drawdowns from multiple, interfering aquifer tests exceeded a detection threshold. Sampling is limited to this investigated volume so that hydraulic conductivity estimates are defined by the observed hydraulic responses, and so that the standard deviation primarily measures the departure from assumed homogeneity in a unit or structure. Hydrologic utility of geologic frameworks increases as hydraulic conductivity varies less within hydrogeologic units and differs more between hydrogeologic units. Thus, hydrologic utility provides a useful metric for testing the suitability of geologic frameworks for extrapolating hydraulic conductivity across regional scales.
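The sampling described above can be sketched as follows. Model cells are assumed to be available as (unit, hydraulic conductivity, maximum simulated drawdown) tuples. The tuple layout, the unit names, the threshold value, and the use of a population standard deviation are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def unit_log_k_stats(cells, threshold_ft):
    """Mean and standard deviation of log10(K) for each unit.

    cells: iterable of (unit_name, k_ft_per_day, max_drawdown_ft) tuples.
    Only cells whose drawdown exceeds the detection threshold are sampled,
    so statistics reflect the investigated volume alone.
    """
    samples = defaultdict(list)
    for unit, k, drawdown in cells:
        if drawdown > threshold_ft:
            samples[unit].append(math.log10(k))
    return {unit: (mean(v), pstdev(v)) for unit, v in samples.items()}

# Hypothetical cells; the 0.05 ft threshold is illustrative only.
cells = [
    ("unit_A", 10.0, 0.50),
    ("unit_A", 1000.0, 0.08),
    ("unit_A", 1.0e6, 0.00),  # outside the investigated volume, ignored
    ("unit_B", 1.0, 0.10),
]
stats = unit_log_k_stats(cells, threshold_ft=0.05)
```

A framework has greater hydrologic utility in this sense when the per-unit standard deviations are small and the per-unit means are well separated.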

An appropriately simplified framework for extrapolating hydraulic properties can be inferred from testing multiple alternative geologic frameworks with varying degrees of complexity. The simplified framework for Pahute Mesa is developed by retaining hydraulically unique hydrogeologic units and structures while combining redundant units. This approach for reducing geologic complexity retains hydraulically useful variability with fewer independent hydrogeologic units than the other frameworks considered. The results of this analysis support simplification of the geologic framework as it is incorporated in flow and transport models, not in the creation of the original geologic frameworks. This is because mapping of hydraulically unique hydrogeologic units can require a detailed understanding of both the geologic structures and the hydraulic properties to reduce redundancy in model units. A multi-model analysis approach would retain all five of the geologic framework models tested at Pahute Mesa because they match observed drawdown data equally well. In contrast, the approach presented here illustrates that some of these frameworks are poorly suited for extrapolation beyond the region where hydraulic properties are constrained by aquifer testing.