Introduction

Groundwater represents an important freshwater resource, especially during periods of drought when surface-water resources are scarce (Dennehy et al. 2015). In the 1970s, the US Geological Survey (USGS) began a national effort to quantify groundwater resources of the United States (US) through development of groundwater-availability models of regional aquifer systems (Jorgensen et al. 1996; US Geological Survey 2015a; Dennehy et al. 2015). Numerical groundwater models provide calibrated hydrologic budgets that allow water resource managers to evaluate future groundwater availability under different pumping scenarios and climate conditions. Quantifying the groundwater-use component of a hydrologic budget is a vital but often challenging endeavor because of limited historical data before 1950 (Committee on USGS Water Resources Research et al. 2002), lack of directly reported water-use data for some locations or water-use categories (Brown 2000), and changes in data-collection methods over time (Perrone et al. 2015). The compilation, analysis, and spatial and temporal extrapolation of water-use data remain challenging tasks for water scientists (Johnson and Belitz 2015; Perrone et al. 2015; Barato 2016) but are of paramount importance for better quantifying water availability (Fishman 2016).

Development of long-term water-use records is crucial for the calibration of water-availability models (Feinstein et al. 2004); however, there is a lack of published research on the topic. Researchers have explored long-term water quality (Cosby et al. 1985; Hall and Smol 1996; Swank et al. 2001; Burt et al. 2014), but long-term water quantity research is generally limited to less than 15 years of record (Salas-La Cruz and Yevjevich 1972; Hansen and Narayanan 1981; Tsakiri and Zurbenko 2013; Yan et al. 2016) and future water use is often projected using trend analysis (Brown 2000). Predictive modeling of recent water-use datasets (less than 20 years old) has helped to highlight explanatory variables that drive water use (Levin and Zarriello 2013; Committee on USGS Water Resources Research et al. 2002), allowing water use to be estimated based on changing climate, population, or agriculture; however, techniques for modeling the temporal variation in water use may be limited in spatial scale.

Spatial extrapolation of water-use data is challenging because of the disparity in where water-use data are collected or estimated. Site-specific data are reported by individual users but are often the hardest to obtain because data availability varies state-to-state. Aggregate data compiled at the county, hydrologic unit, aquifer, or state level provide more temporally consistent datasets but must be disaggregated to individual well locations. Water-use datasets may be developed by compiling multiple site-specific datasets for local-scale water-availability models (Reed and Czarnecki 2006; Richards 2010), extrapolating to areas with limited data using geostatistical techniques (Ahmad et al. 2005; Torak and Painter 2011), or estimating water use assuming that the spatial distribution of groundwater withdrawals did not vary over time (Clark and Hart 2009). The US Geological Survey (USGS) has compiled and published a water-use census for the US every 5 years since 1950 for state-level aggregations and since 1985 for county-level aggregations for eight major categories from surface water or groundwater and fresh or saline water sources (USGS 2015b). These censuses are an important resource for consistent, historical water-use estimates but the data must be disaggregated for use in predictive models of water use or as components of groundwater-flow models (USGS 2015b); thus, the combination of techniques to model water use using explanatory variables that allow for extrapolation over both time and space is limited.

As part of a regional-scale analysis of the Ozark Plateaus aquifer system (Ozark system) in the central US, a groundwater-use record was required for quantifying the hydrologic budget and developing a groundwater-flow model of the system (USGS 2015a; Hays et al. 2016). The Ozark system is being modeled with 2.6-km² grid cells, which requires spatially accurate site-specific groundwater-use data. Additionally, the Ozark system groundwater model, which currently is being calibrated, will simulate hydrologic conditions beginning in 1900, necessitating historical groundwater-use data. The challenge for creating such a water-use record is to capture variability in site-specific groundwater-withdrawal rates over both time and space for a regional aquifer system. Because development of historical water-use records poses major challenges, detailed descriptions of methods for water-use data compilation and manipulation are necessary to advance understanding of water use through time. To create the best available groundwater-use record for the Ozark system, the following questions were addressed: (1) What is the most appropriate method to model site-specific groundwater-withdrawal rates to 1900, given sparse historical data? (2) What is a meaningful range of values (estimate interval) for the modeled groundwater-withdrawal rates? and (3) How does the 110-year record of historic groundwater use fit into the context of freshwater availability across the Ozark system? The methods and results of modeling the water-use component of the groundwater-flow model are described here, and details of the Ozark system groundwater-availability study, including the flow model, can be found at USGS (2015a).

Methods

Study area

The Ozark system underlies the Ozark Plateaus Physiographic Province (Ozark Plateaus) in the central US (Fig. 1) and is composed of interbedded Cambrian to Pennsylvanian clastic and carbonate lithologies (Fig. 2; Jorgensen et al. 1993; Adamski et al. 1995; Kresse et al. 2014; Westerman et al. 2016a). In stratigraphic order, the Ozark system includes the Basement confining unit, St. Francois aquifer, St. Francois confining unit, Ozark aquifer, Ozark confining unit, Springfield Plateau aquifer, and Western Interior Plains confining system (Fig. 2; Imes and Emmett 1994). The Ozark aquifer is generally unconfined where units are exposed on the Salem Plateau and confined where overlain by the Ozark confining unit (Hays et al. 2016; Westerman et al. 2016a). The Ozark aquifer has a median thickness of 576 m and owing to variable hydraulic properties is sub-divided into upper, middle, and lower sections (Westerman et al. 2016a; Westerman et al. 2016b). The middle Ozark aquifer includes low permeability dolomites and regionally acts as a confining unit for the lower Ozark aquifer (Hays et al. 2016). The Springfield Plateau aquifer is generally unconfined where units are exposed on the Springfield Plateau and confined where overlain by the Western Interior Plains confining system (Hays et al. 2016). Carbonate units of the Ozark system, including the Springfield Plateau aquifer and portions of the Ozark aquifer, have undergone karstification, resulting in variable hydraulic properties between primary bedrock porosity (low permeability and storativity) and secondary porosity from fractures and dissolution-enlarged conduits (higher localized permeability; Hays et al. 2016).

Fig. 1

Hydrogeologic units of the Ozark Plateaus aquifer system and plateaus of the Ozark Plateaus Physiographic Province

Fig. 2

Conceptual model of the hydrologic budget for the Ozark Plateaus aquifer system, showing the general structure and stratigraphy of hydrogeologic units in the study area (modified from Haley et al. 1993; Stoeser et al. 2007; and Hays et al. 2016)

The Ozark system is predominantly freshwater, with surface water and groundwater generally flowing away from the higher topography of the Ozark dome (approximately centered at the Saint Francois Mountains) outward towards streams at the margins of the Ozark system and neighboring groundwater systems (Fig. 2; Jorgensen et al. 1993; Hays et al. 2016). The ultimate source of water to the Ozark system is precipitation, where approximately 24% of precipitation falling over the Ozark Plateaus contributes to recharge, the remainder being lost to evapotranspiration, interception by vegetation, and flow out of the modeled area (Hays et al. 2016). Water inputs to the Ozark system include recharge from precipitation, stream leakage from losing stream reaches, and inflow from neighboring surface-water and groundwater systems; and loss of water from the Ozark system includes groundwater flow to gaining stream reaches, groundwater outflow to neighboring systems, and groundwater withdrawals for household, industrial, and agricultural water use (Hays et al. 2016). Generally, the lower Ozark aquifer is the primary source of groundwater across much of Missouri, and the Springfield Plateau aquifer is used across northern Arkansas (Miller and Vandike 1997; Kresse et al. 2014; Hays et al. 2016).

Data acquisition and interpolation of groundwater-withdrawal rates

Site-specific water-use (SSWU) data were compiled from a variety of federal and state sources (Table S1 of the electronic supplementary material (ESM); Barnett 2003; Center for Applied Research and Environmental Systems 2015; Hallberg 2016; Kansas Department of Agriculture 2012; Kansas Department of Health and Environment 2004, 2016; Kansas Geological Survey 2015; Kansas Rural Water Association 2016; Missouri Department of Natural Resources 1973, 1975, 1977, 1980, 1982, 1985, 1987, 2007, 2013a, b; Missouri Division of Health 1962, 1966, 1969, 1971; Oklahoma Department of Environmental Quality 2012; Oklahoma Water Resources Board 1998; Sturgess 2006; University of Missouri 2013; US Environmental Protection Agency 2016a; US Geological Survey 2015b, d). Water-use data aggregated at the county level (CNTY) were downloaded from the USGS National Water Information System (NWIS; USGS 2015c; Table S1 of the ESM). Groundwater-withdrawal rates stored in disparate formats were compiled into a single database to create a comprehensive water-use dataset for the Ozark system—for example, historical Missouri Department of Natural Resources (MODNR) Census of Missouri Public Water Suppliers reports were available only in hard-copy format, so they were scanned, digitized, and reviewed for errors. Missing water-use metadata were compiled using driller water-well logs, federal and state agency well-information systems, and spatial data information services (Table S1 of the ESM).

Groundwater-withdrawal rates from CNTY data had to be disaggregated to individual wells, which necessitated acquiring well locations in addition to those with known withdrawal rates because of the limited SSWU dataset. The SSWU dataset included 2,693 unique well locations with groundwater-withdrawal rates for years ranging from 1900 to 2014 and was expanded to 148,836 well locations using MODNR’s Well Information System and USGS NWIS wells (Table S1 of the ESM). Wells were classified by water-use category based on (1) primary water-use information for the well where available (e.g., a municipal supplier was categorized as public supply use), (2) metadata compiled in the SSWU dataset, or (3) land-use/land-cover (LULC) classes from the Cropland Data Layer (US Department of Agriculture 2011), which was mostly used for wells without pumping data (Table 1). Water-use categories for the SSWU and CNTY data were then aggregated into five divisions to simplify data analysis and modeling (Table 1).

Table 1 Site-specific water-use (SSWU) categories, land-use/land-cover (LULC) classes, and US Geological Survey (USGS) county-aggregate categories used to derive groundwater-use divisions

Both the CNTY and SSWU datasets included years when water-use data were not collected; for CNTY data, water-withdrawal rates were aggregated every 5 years between 1985 and 2010 and in the SSWU dataset, groundwater-withdrawal rates were collected for public suppliers in Missouri irregularly from 1962 to 1987 and yearly from 1996 through 2010 (Table S1 of the ESM). Therefore, linear interpolation of groundwater-withdrawal rates was completed using the interp1d method (Scipy Community 2014) in Python version 2.7 (Python Software Foundation 2016) at the county level for both CNTY and SSWU to create datasets with yearly time steps (Fig. 3). Groundwater-withdrawal rates from CNTY data were generally greater than SSWU data for all water-use divisions for the period of record (Fig. 3)—for example, domestic water use is often not in SSWU databases because typically small withdrawal rates do not meet reporting requirements. Therefore, SSWU and CNTY data were used to model non-agriculture and public supply water-use divisions and only CNTY data were used to model agriculture, domestic, and livestock water-use divisions because of either limited or no SSWU data (Fig. 3).
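The interpolation step described above can be illustrated with a minimal sketch. The county record below is hypothetical (the years match the CNTY census schedule, but the rates are invented for illustration), and the code is written for a current Python and SciPy rather than the Python 2.7 environment used in the study. Only interp1d itself is taken from the text.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical county record: CNTY census years and withdrawal rates (ML/d).
census_years = np.array([1985, 1990, 1995, 2000, 2005, 2010])
rates_mld = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])

# Linear interpolation within the period of record only (no extrapolation).
to_yearly = interp1d(census_years, rates_mld, kind="linear")

# Evaluate on a yearly time step between the first and last census years.
years = np.arange(census_years[0], census_years[-1] + 1)
yearly_rates = to_yearly(years)
```

With its default settings, interp1d raises an error for years outside the period of record, which is consistent with interpolation here being restricted to years between reported values.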

Fig. 3

Groundwater-withdrawal rates for site-specific water use (SSWU) compared to county-level aggregate water-use (CNTY) data through time for (a) agriculture, (b) domestic, (c) non-agriculture, (d) public supply, and (e) livestock water-use divisions. Values were linearly interpolated within the period of record to create a dataset with a yearly time step

Extrapolation of groundwater-withdrawal rates

Population was used as a predictive variable to model domestic and public supply groundwater-withdrawal rates beginning in 1900. This method assumed the same ratio of groundwater use between public supply versus domestic self-supply from 1900 to 1985 (when CNTY domestic data were first available, Fig. 3). Population was available in 10-year increments from 1900 to 2010 at the county scale (Minnesota Population Center 2011) and was linearly interpolated using the interp1d method (Scipy Community 2014) in Python version 2.7 (Python Software Foundation 2016) to provide a yearly time-step. The USGS estimates population served by public supply versus domestic self-supply at the state level (Maupin et al. 2014), but to maintain a consistent dataset over time, total county-level population data were used for this analysis (Minnesota Population Center 2011). Prior knowledge about the Ozark system suggests that few domestic wells and no public supply wells were active in 1900 (Miller and Vandike 1997), so for comparison to the linear, population-based model (LIN), a multiplier was used to constrain the public supply and domestic groundwater withdrawal predictions to 0 million liters per day (ML/d) in 1900 (LIN-0), which assumes any water use was from surface-water resources. The multiplier was the same length as the number of modeled years and incrementally ranged from zero in 1900 to one in 2010. Groundwater-withdrawal rates for the other water-use divisions (agriculture, livestock, and non-agriculture) were modeled using simple linear extrapolation from the last year of reported data—either CNTY or SSWU, depending on which was greater (Fig. 3)—to 0 ML/d in 1900 (LIN-0).
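A minimal sketch of the LIN and LIN-0 calculations for a single county follows, under stated assumptions. The population values and the reported rate are hypothetical, and the proportional scaling reflects the 1:1 relation between changes in population and withdrawal rates that the linear models assume. For brevity, the scaling is applied to every year, whereas in the study reported data are used once they exist.

```python
import numpy as np
from scipy.interpolate import interp1d

years = np.arange(1900, 2011)

# Hypothetical decadal county population, interpolated to a yearly time step.
decades = np.arange(1900, 2011, 10)
pop_decadal = np.array([20, 22, 23, 25, 26, 28, 30, 34, 40, 45, 50, 55]) * 1000.0
population = interp1d(decades, pop_decadal)(years)

# Hypothetical first reported rate (ML/d) and the year it was reported.
first_data_year = 1985
reported_rate = 5.0

# LIN: scale the reported rate by population relative to the reference year.
ref_pop = population[years == first_data_year][0]
lin = reported_rate * population / ref_pop

# LIN-0: multiplier rising incrementally from 0 in 1900 to 1 in 2010.
multiplier = np.linspace(0.0, 1.0, years.size)
lin0 = lin * multiplier
```

The multiplier leaves the 2010 value unchanged and forces the 1900 value to 0 ML/d, so the two models diverge most in the early part of the record.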

Well selection

To model groundwater-withdrawal rates across the Ozark system, Python version 2.7 was used to create a well-selection tool that followed a hierarchical process for applying SSWU and CNTY data based on the level of confidence that a well pumped groundwater for a specific use during a specific year (Fig. 4). Priority was given to SSWU data, followed by disaggregating CNTY data to wells with two levels of confidence. Level 1 included wells where groundwater-withdrawal rates were greater than 0 ML/d for a given year or the drill date of the well was older than the given year, while level 2 wells did not have pumping data, but provided coordinates for potential groundwater withdrawals. The use of level 2 wells necessitated estimating the number of total wells required within a county each year to disaggregate CNTY groundwater-withdrawal rates; otherwise, realistic groundwater-withdrawal rates per well may not have been achieved.

Fig. 4

Logic used to model groundwater-withdrawal rates using either site-specific water-use (SSWU) or county-level aggregate water-use (CNTY) data, depending on the water-use division. Level 1 wells have a higher level of confidence because pumping was known to occur and level 2 wells have a lower level of confidence because they serve only as a well location without pumping information

Based on water-use data from level 1 wells, the median groundwater-withdrawal rate per well per year was 0.76 ML/d for non-agriculture and 0.49 ML/d for public supply water use. For domestic, agriculture, and livestock water-use divisions (where SSWU data were limited or not available), the groundwater-withdrawal rate was assumed to be 0.0011 ML/d per well (US Environmental Protection Agency 2016b). Dividing CNTY values by either the calculated median rate per well (for public supply and non-agriculture use) or 0.0011 ML/d (for domestic, agriculture, and livestock use), the median number of wells required per county per year was 135 for agriculture, 1,497 for domestic, 209 for livestock, 1 for non-agriculture, and 5 for public supply throughout the Ozark system. The estimated median numbers of wells per county for each category were used as upper-limit thresholds for level 2 well selection, and if fewer wells were available within a county, then all wells for that water-use division were selected. The combined modeled groundwater-withdrawal rates were based on the combination of site-specific level 1 wells identified in a county plus a sufficient number of level 2 wells to match the combined SSWU and CNTY withdrawals (Fig. 4).
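The well-count estimate and the disaggregation step can be sketched as follows. This is one plausible reading of the logic, not the well-selection tool itself: the function names, the data layout, and the choice to spread the CNTY remainder evenly over the selected level 2 wells are assumptions made for illustration.

```python
import math

def wells_required(cnty_rate_mld, rate_per_well_mld):
    """Number of wells needed to carry a county rate at a fixed rate per well."""
    return int(math.ceil(cnty_rate_mld / rate_per_well_mld))

def disaggregate(cnty_rate_mld, sswu_rates, level2_wells, max_level2):
    """Assign a county rate to wells (hypothetical helper).

    sswu_rates: dict of well id -> reported rate (ML/d), used first.
    level2_wells: list of well ids that only provide a location.
    max_level2: upper-limit threshold on the number of level 2 wells.
    """
    modeled = dict(sswu_rates)
    # County withdrawal not already accounted for by site-specific data.
    remaining = max(cnty_rate_mld - sum(sswu_rates.values()), 0.0)
    # Slicing selects all available wells when fewer than the threshold exist.
    selected = level2_wells[:max_level2]
    if selected and remaining > 0.0:
        share = remaining / len(selected)  # even distribution across wells
        for well_id in selected:
            modeled[well_id] = share
    return modeled

example = disaggregate(1.0, {"A": 0.4}, ["B", "C", "D"], max_level2=2)
```

In this example the reported well keeps its 0.4 ML/d, and the remaining 0.6 ML/d is split equally between the two selected level 2 wells.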

Integration into the groundwater flow model

The Ozark system is simulated with nine layers representing six hydrogeologic units (Westerman et al. 2016a) and horizontally uniform 2.6-km² cells using MODFLOW-NWT (Niswonger et al. 2011; USGS 2015a). Each groundwater-withdrawal rate modeled using the well-selection tool (Fig. 4) and associated well location was assigned a model row, column, and layer corresponding to a model cell; row and column were determined through an intersection of the well coordinates with the spatially referenced model grid, and model layer was determined through a comparison of the altitude of the bottom of the well to the altitude of each hydrogeologic unit (Westerman et al. 2016b). Because withdrawals in MODFLOW-NWT are simulated at the center of a model cell, all modeled groundwater-withdrawal rates for wells located within a single layer, row, and column were summed to produce a single groundwater-withdrawal rate within the cell for each year. The 110-year water-use record is available in a companion digital dataset from Knierim et al. (2016). Domestic water use included a large number (139,911) of small groundwater-withdrawal rates distributed across the Ozark system, so to aid in distribution of the digital dataset, the water-use data provided in Knierim et al. (2016) re-aggregate model cells with domestic use to the county level.

The method to assign each well to a model layer assumes that the bottom of the well (which generally corresponds to the bottom of the open interval) represents the hydrogeologic unit that groundwater is withdrawn from and, therefore, does not explicitly model groundwater wells open to multiple hydrogeologic units. This assumption is valid for the Ozark system because approximately 53% of the wells are located in areas where the Ozark aquifer is exposed at the surface in the Salem Plateau (Fig. 1), such that groundwater is dominantly withdrawn from the highly productive lower Ozark aquifer. The assumption may be problematic, however, in the western and southern extents of the study area where the Springfield Plateau aquifer and the Western Interior Plains confining system are exposed at the surface (Fig. 1). Even in these areas, wells open to multiple intervals will withdraw most water from higher permeability units, which tends to be the lower Ozark aquifer (Hays et al. 2016). If well depth was not available, which represented approximately 3% of the wells, the layer was assigned as the most common, productive hydrogeologic unit in the study area, or the lower Ozark aquifer (Hays et al. 2016). Additionally, if well depth was below the top of the Basement confining unit, the layer was assigned as the lowest aquifer unit at that location using the hydrogeologic framework for the Ozark system (Westerman et al. 2016b).
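The cell assignment and summation described in this section can be sketched as below. The grid origin, the cell size, the zero-based indices, and the helper names are hypothetical, and the fallback for wells deeper than the lowest layer is a simplification of the rule stated above. MODFLOW itself uses one-based layer, row, and column numbers.

```python
import math
from collections import defaultdict

def row_col(x, y, x_min, y_max, cell_size):
    """Zero-based row and column of a point on a uniform grid whose
    upper-left corner is at (x_min, y_max)."""
    col = int(math.floor((x - x_min) / cell_size))
    row = int(math.floor((y_max - y) / cell_size))
    return row, col

def layer_for(well_bottom_alt, layer_bottom_alts, default_layer):
    """Zero-based layer containing the bottom of the well.

    layer_bottom_alts: bottom altitude of each layer, ordered top to bottom.
    default_layer: layer used when well depth is unknown.
    """
    if well_bottom_alt is None:
        return default_layer
    for k, bottom in enumerate(layer_bottom_alts):
        if well_bottom_alt >= bottom:
            return k
    # Well extends below the lowest layer: fall back to the lowest layer.
    return len(layer_bottom_alts) - 1

def sum_by_cell(wells):
    """Sum rates of all wells that share a (layer, row, col) cell."""
    totals = defaultdict(float)
    for layer, row, col, rate in wells:
        totals[(layer, row, col)] += rate
    return dict(totals)

cells = sum_by_cell([(1, 0, 0, 0.5), (1, 0, 0, 0.25), (2, 0, 0, 1.0)])
```

Summing by cell reflects the point made above that withdrawals are simulated at the cell center, so wells sharing a layer, row, and column are indistinguishable to the flow model.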

Groundwater-use statistical models, Missouri public supply

The LIN-0 model was compared to six machine learning models for public supply groundwater-withdrawal rates in Missouri. The public supply data for Missouri were selected as the benchmark dataset because the Missouri Census of Public Water Suppliers and MODNR’s Major Water Users Database provided the longest (1962–2010) and most complete SSWU dataset (Table S1 of the ESM). Groundwater-withdrawal rates from 1901 to 2010 for 1,579 groundwater public supply wells in Missouri were predicted using weighted least squares (WLS), K-nearest neighbors regression (KNN), single regression tree (TREE), multivariate adaptive regression spline (MARS), local polynomial regression (LOESS), and a gradient boosted regression tree (GBRT). The models were built in R using various functions from the rpart, kknn, caret, and gbm libraries (Ridgeway 2015; Therneau et al. 2015; Schliep et al. 2016; Kuhn et al. 2016).

Training data consisted of 173,690 observations, and initial predictors included population (US Census Bureau 2010), precipitation (PRISM Climate Group 2015), LULC (US Department of Agriculture 2011), and well depth (from the SSWU metadata). Zip code tabulation areas (ZCTA) provided higher-resolution population data for 2010 (US Census Bureau 2010) compared to county-level data (Minnesota Population Center 2011), and population for ZCTA was extrapolated to 1900 using the ratio of the ZCTA population to the county population in 2010, which assumed a constant ratio between the two population units over time. Observations in the WLS model were weighted by years, where heavier weights were placed on observations that fall within the earlier and later years. Tuning parameters for the KNN and GBRT models were optimized using cross validation. Predictor selection was also treated as a free parameter for all models. Predicted groundwater-withdrawal rates from the machine learning models were also constrained to 0 ML/d using the same linear multiplier used for the LIN-0 model, again assuming that any water use in 1900 was from surface-water resources.
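The constant-ratio assumption used to carry ZCTA population back to 1900 can be written out in a few lines. The function name and the example values are hypothetical. The statistical models themselves were built in R and are not reproduced here.

```python
import numpy as np

def backcast_zcta_population(county_pop_by_year, zcta_pop_2010, county_pop_2010):
    """Estimate ZCTA population for each year by holding the 2010
    ZCTA-to-county population ratio constant through time."""
    ratio = zcta_pop_2010 / float(county_pop_2010)
    return ratio * np.asarray(county_pop_by_year, dtype=float)

# Hypothetical county population for two years and a ZCTA holding 10% of it in 2010.
zcta_pop = backcast_zcta_population([20000, 40000], 5000, 50000)
```

Because the ratio is fixed at its 2010 value, any shift of population between ZCTAs within a county over the century is not represented.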

Results

Population and groundwater use

As population increased, groundwater-withdrawal rates for domestic, non-agriculture, and public supply use also increased, whereas livestock and agriculture groundwater-withdrawal rates were relatively invariant with changes in population (Fig. 5). Although domestic, public supply, and non-agriculture water use showed a relation with population, the r² values were generally low (Fig. 5), such that county-level population may only explain a small portion of changes in groundwater-withdrawal rates through time. The slopes for both domestic and public supply correlations were less than 1, such that a 10% increase in population, for example, resulted in an approximately 8% increase in public supply and 6% increase in domestic groundwater-withdrawal rates. Because of the low r² values and high variability county-to-county (as represented by spread in the data), the linear models for extrapolating groundwater-withdrawal rates to 1900 assumed a 1:1 relation between changes in population and groundwater-withdrawal rates for public supply and domestic water use. Population was not used to model historical non-agriculture, agriculture, and livestock groundwater-withdrawal rates.

Fig. 5

Population compared to groundwater-withdrawal rates for site-specific water use (SSWU) and county-level aggregate water use (CNTY) by county for (a) agriculture, (b) domestic, (c) non-agriculture, (d) public supply, and (e) livestock water-use divisions. Darker colors represent CNTY values and lighter colors represent SSWU values. Livestock water-use data (e) were only available from CNTY values

Linear models

Groundwater-withdrawal rates using the LIN model were 445 ML/d in 1900 because domestic (164 ML/d) and public supply (281 ML/d) were modeled using population change and were not constrained to 0 ML/d in 1900 (Fig. 6). With the LIN-0 model, all water-use divisions were constrained to 0 ML/d in 1900 (Fig. 7). Total groundwater withdrawals in 2010 were 1,438 ML/d with most (41%) of the groundwater withdrawals for Missouri public supply (Table 2). Most groundwater withdrawals (52–60%) were from the lower Ozark aquifer, followed by use from the middle Ozark aquifer (12–21%), with other aquifer units each ranging from 3 to 17%.

Fig. 6

Modeled groundwater-withdrawal rates for the Ozark system through time by water-use division. Groundwater-withdrawal rates were extrapolated to 1900 using a linear decrease to 0 ML/d (LIN-0) for livestock, agriculture, and non-agriculture water-use divisions, or a population-based linear model (LIN) for domestic and public supply water-use divisions

Fig. 7

Modeled groundwater-withdrawal rates for the Ozark system through time by water-use division. Groundwater-withdrawal rates were extrapolated to 1900 using a linear decrease to 0 ML/d (LIN-0) for livestock, agriculture, and non-agriculture water-use divisions, or a population-based linear model constrained to 0 ML/d (LIN-0) for domestic and public supply water-use divisions

Table 2 Summary of modeled groundwater-withdrawal rates (in million liters per day) for selected years across the Ozark system. Italics denote yearly total for all water-use divisions and all states in the Ozark system

The number of active cell nodes and rates of groundwater withdrawals increased over time for all water-use divisions (Video S1 of the ESM), illustrating that the well-selection tool was able to programmatically select SSWU and disaggregate CNTY data (extrapolated to 1900) to well locations and model layers across the Ozark system. Additionally, as a quality-control check, model cell values were summed by division for each county and compared to published CNTY values (USGS 2015c); each water-use division showed an increase in modeled groundwater-withdrawal rates over time as CNTY rates increased (data not shown). Livestock and agricultural use had an r² of 1 because only CNTY data were used to model groundwater-withdrawal rates. Although only CNTY data were used to model domestic water use, an r² of 0.98 was likely because, for some counties, the number of available wells to disaggregate CNTY data was not large enough to model the total groundwater-withdrawal rates for that county. Non-agriculture (r² of 0.76) and public supply (r² of 0.87) water-use divisions had r² values less than 1 because a combination of CNTY and SSWU data was used (data not shown).
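A quality-control comparison of this kind can be sketched as below. The text does not state how r² was computed, so the squared Pearson correlation used here is an assumption, and the county totals are invented for illustration.

```python
import numpy as np

def r_squared(modeled, published):
    """Squared Pearson correlation between modeled and published county totals."""
    r = np.corrcoef(np.asarray(modeled, dtype=float),
                    np.asarray(published, dtype=float))[0, 1]
    return r ** 2

# Hypothetical county totals (ML/d) for one water-use division.
cnty_published = [1.2, 3.4, 0.8, 5.0]
modeled_full = [1.2, 3.4, 0.8, 5.0]    # every county fully disaggregated
modeled_short = [1.2, 3.0, 0.8, 4.1]   # two counties with too few wells

r2_full = r_squared(modeled_full, cnty_published)
r2_short = r_squared(modeled_short, cnty_published)
```

The second case mimics the situation described for domestic use, where a shortage of available wells in some counties lowers the modeled total and pulls r² slightly below 1.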

Statistical models, Missouri public supply

The statistical models produced similar predictions to the unconstrained linear model (LIN), such that the raw predictions (not constrained to 0 ML/d in 1900) ranged from approximately 100 to 300 ML/d in 1901 (data not shown). When constrained to 0 ML/d in 1900, the statistical models also produced results similar to the LIN-0 model, with groundwater-withdrawal rates approximately 350–500 ML/d in 2010 (Fig. 8). Well depth was not used as a predictor in the models because 25% of the depth values (from wells that also included groundwater-withdrawal rates) were erroneously recorded as being equal to zero. Land use was also removed from the model as it did not provide any explanatory power based on cross-validation metrics, and precipitation was not used as a predictor because it introduced unrealistically high inter-year variance in the predictions, although precipitation did improve the overall prediction accuracy. Ultimately, models using only population as a predictor best satisfied both the realistic consecutive year variance and goodness-of-fit criteria. The median groundwater-withdrawal rate in 2010 for the zero-constrained, statistical models was 385 ML/d. The statistical models did not perform substantially better than the simpler linear model, so methods were not extended to data other than Missouri public supply, and LIN-0 was ultimately used to model groundwater-withdrawal rates from the Ozark system (Fig. 7).

Fig. 8

Groundwater-withdrawal rates for Missouri public supply through time, comparing statistical models—weighted least squares (WLS), K-nearest neighbor (KNN), regression tree (TREE), multivariate adaptive regression splines (MARS), local polynomial regression (LOESS), and gradient boosting regression tree (GBRT)—to the original site-specific water-use dataset (SSWU) and modeled groundwater withdrawal rates (LIN-0)

Discussion

Modeled groundwater-withdrawal rates: challenges and data limitations

Estimating historic groundwater-withdrawal rates before the mid-1900s is difficult because water-use data are scarce prior to systematic water-use compilation efforts by state agencies and the USGS (USGS 2015b). Although groundwater-withdrawal rates from the Ozark system increased with population for most water-use divisions (Fig. 5), the relations were based on the SSWU dataset in which consistent water-use records began in 1962. Therefore, the population-based linear model adjusted for CNTY data (LIN) performed poorly before the mid-1900s and over-estimated historical groundwater withdrawals (Fig. 6). For example, public supply groundwater-withdrawal rates in 1950 for the entire state of Missouri (which provides a reasonable comparison because Missouri includes the largest portion of the Ozark system) were approximately 95 ML/d (MacKichan 1951), compared to 211 and 96 ML/d from only the Ozark system in Missouri for the LIN and LIN-0 models, respectively. Therefore, modeled groundwater withdrawals (even from LIN-0) may over-estimate historical groundwater withdrawals, especially considering that the historic, state-level water-use records may also be high compared to post-1985 when water-use data-aggregation methods were standardized (Perrone et al. 2015).

Groundwater use likely increased in a non-linear pattern through time as well-drilling technology advanced across the Ozark Plateaus. Development of the Ozark Plateaus from western expansion of European settlement began as early as the 1700s, but generally increased in the late 1800s during a period of extensive timber logging (Jacobson and Primm 1997). Groundwater use was first concentrated around springs, which were utilized for drinking water and livestock watering (Rafferty 2001), and records of the first hand-dug wells occurred in the late 1800s (Miller and Vandike 1997). Groundwater wells increased throughout the Ozark system after the 1930s once drilling machines were available to construct public supply wells (Miller and Vandike 1997). During the period from 1960 to 1964, the number of water wells drilled in Missouri increased 57% (Meyer and Wyrick 1966), and this period of increasing groundwater access corresponded with the first census of public water suppliers in Missouri (Missouri Division of Health 1962); therefore, groundwater use in the period between 1900 and 1950 may have increased rapidly after the 1930s, which is not fully reflected with the LIN-0 model (Fig. 7). Without earlier water-use data, however, the LIN-0 model provides the best mechanism to model historic groundwater use. If groundwater-withdrawal rates were not constrained to 0 ML/d in 1900, total groundwater-withdrawal rates (from the LIN model) were 445 ML/d in 1900, which is too large based on settlement patterns and qualitative records of water use in the Ozark system; accordingly, and without additional detail, the LIN-0 model was selected for integration in the groundwater model.

When SSWU data were not available, CNTY values were disaggregated using the well-selection tool (Fig. 4), and this method evenly distributed groundwater withdrawals across available wells in the county. Equal distribution of groundwater withdrawals across a county will not accurately reflect locations with greater pumping and may not be suitable for modeling water use in some aquifer systems. For example, for the Mississippi Embayment Regional aquifer system (MERAS) groundwater availability model, CNTY groundwater-withdrawal rates were disaggregated based on a pumpage fraction per well (calculated from SSWU data) rather than even distribution throughout the county (Clark and Hart 2009). This method could be used in the MERAS model because public supply, industrial, and especially irrigation withdrawals, which had greater SSWU data availability for MERAS, dominated water use, such that domestic and livestock use, which only had CNTY data available, were ignored (Clark and Hart 2009). In contrast in the Ozark system, domestic, livestock, and agriculture use constituted between 42 and 60% of total groundwater use, which necessitated disaggregating CNTY data without the ability to calculate a pumpage fraction per well. Non-agriculture and public supply water use were more likely to have greater ranges in groundwater-withdrawal rates per well in the Ozark system, but again, these water-use divisions were better captured in the SSWU dataset. Therefore, an even distribution of groundwater withdrawals associated with agriculture, domestic, and livestock water use likely captured the generally small withdrawal rate per well, and the land-use patterns and resulting spatial distribution of wells throughout the Ozark system (Video S1 of the ESM).

Ranges in modeled groundwater-withdrawal rates

Aggregated groundwater-withdrawal rates at the regional or state level are more widely available than SSWU data, but relating those larger-scale estimates to individual well locations throughout an aquifer system is difficult. For example, using the current minimum per-capita water-use rate of 303 L/d (USGS 2016) and the 1900 population of 2.6 million people, total water use from the Ozark system (at least for domestic use) would have been 788 ML/d. Between 1985 and 2010, groundwater use accounted for approximately 3 to 7% of total water use across the Ozark system (USGS 2015c); assuming the same ratio, total groundwater-withdrawal rates in 1900 may have been 23–55 ML/d. However, a constant per-capita water-use rate is not realistic because increased efficiencies in water distribution and consumption and increased consumption following wider availability of technology and urbanization (Meyer and Wyrick 1966; Brown et al. 2013) have led to changes in the per-capita water-use rate. Groundwater withdrawals in 1900 from the LIN model were 445 ML/d (Fig. 6). Although the LIN model over-estimated groundwater withdrawals from 1900 to approximately 1960, the approach was reasonable considering data availability, including consistent population data beginning in 1900. Using these values, groundwater-withdrawal rates from the Ozark system may have been between 23 and 445 ML/d, or a range of approximately 422 ML/d. Is there a better mechanism to bracket historic groundwater-withdrawal rates?
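The bracketing arithmetic in the preceding paragraph can be reproduced in a few lines. All inputs are the rounded values quoted in the text; the variable names are illustrative.

```python
LITERS_PER_MEGALITER = 1_000_000

per_capita_l_per_day = 303             # minimum per-capita rate (USGS 2016)
population_1900 = 2_600_000            # people in 1900
gw_share_low, gw_share_high = 0.03, 0.07   # groundwater share, 1985 to 2010
lin_1900 = 445.0                       # ML/d, unconstrained LIN model in 1900

# Total water use in 1900 if the per-capita rate had been constant.
total_use = per_capita_l_per_day * population_1900 / LITERS_PER_MEGALITER

# Groundwater portion, assuming the 1985 to 2010 share also applied in 1900.
gw_low = total_use * gw_share_low
gw_high = total_use * gw_share_high

print(f"total water use:  {total_use:.0f} ML/d")
print(f"groundwater use:  {gw_low:.1f} to {gw_high:.1f} ML/d")
print(f"spread up to LIN: {lin_1900 - gw_low:.0f} ML/d")
```

With these rounded inputs the lower bound evaluates to about 23.6 ML/d. That is close to the 23 ML/d quoted above, given that the 3 to 7% shares are themselves approximate.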

The statistical models relied on the best-available SSWU dataset (Missouri public supply) and provided an estimate interval for historical groundwater-withdrawal rates. If the LIN-0 model (which is adjusted for CNTY groundwater-withdrawal rates) was used as the “true” value for comparison to the statistical models, the percent difference between the models ranged from 3 to 75%, with an average of plus or minus 38%. For example, in 1950 the groundwater-withdrawal rate from LIN-0 was 96 ML/d; the lowest statistical model value (MARS, 53 ML/d) was 45% smaller than LIN-0, and the highest statistical model value (GBRT, 143 ML/d) was 49% greater than LIN-0. Although the statistical models provide a mechanism to bracket ranges in groundwater-withdrawal rates, overall, the models did not perform better than the simpler LIN-0 model. Variability in groundwater-withdrawal rates for the other water-use divisions may be greater than ranges calculated for public supply. Applying the percent differences calculated from public supply to total water use and smoothing over a 5-year span, groundwater-withdrawal rates in 1950 were bracketed between 217 and 565 ML/d (Fig. 9). The later periods of the SSWU record are better constrained because more data are available, and the percent difference generally decreased through time. The statistical models did not account for the additional water use estimated from CNTY data, so the bracket around LIN-0 was skewed, with a smaller upper range (Fig. 9). Constraining water use to 0 ML/d in 1900 caused an artificial decrease in the range of groundwater-withdrawal rates in the early 1900s, although the mean percent difference was still 44% before 1950 compared to 33% after 1950. Quantifying the estimate interval of groundwater use over time provides a mechanism to test the sensitivity of groundwater flow models to water use, which is generally a parameter held constant during model calibration.
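The 1950 percent differences quoted above follow from a simple relative-difference calculation, sketched here with the values given in the text. The helper names are illustrative, and the sketch omits the 5-year smoothing applied to the published bracket.

```python
def percent_difference(model_value, reference_value):
    """Signed percent difference of a model value relative to a reference."""
    return 100.0 * (model_value - reference_value) / reference_value


def bracket(reference_value, pct_low, pct_high):
    """Scale a reference rate by signed percent differences to get bounds."""
    return (reference_value * (1.0 + pct_low / 100.0),
            reference_value * (1.0 + pct_high / 100.0))


lin0_1950 = 96.0    # ML/d, LIN-0 public supply estimate for 1950
mars_1950 = 53.0    # ML/d, lowest statistical model (MARS)
gbrt_1950 = 143.0   # ML/d, highest statistical model (GBRT)

pct_low = percent_difference(mars_1950, lin0_1950)
pct_high = percent_difference(gbrt_1950, lin0_1950)
low, high = bracket(lin0_1950, pct_low, pct_high)

print(f"MARS vs LIN-0: {pct_low:.0f}%")
print(f"GBRT vs LIN-0: {pct_high:+.0f}%")
print(f"bracket around LIN-0: {low:.0f} to {high:.0f} ML/d")
```

The same `bracket` helper could be applied to a total groundwater-withdrawal rate. The result would differ from the smoothed bracket in Fig. 9, because no smoothing is performed here.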

Fig. 9

Modeled site-specific groundwater-withdrawal rates produced from the linear, zero-constrained model (LIN-0) bracketed by the range of groundwater-withdrawal rates produced from the statistical models smoothed over a 5-year period.

Groundwater-withdrawal rates and groundwater availability

Accurately modeling both the spatial and temporal components of historical groundwater use is important for understanding how changing patterns in land use may affect groundwater resources in the future. Land use across the Ozark system is a mosaic of forest and agriculture with local urban development (Hays et al. 2016), which has been the general pattern of land use since development (Jacobson and Primm 1997) and ultimately controls patterns in water use. Modeled groundwater use from the Ozark system included higher groundwater-withdrawal rates concentrated in areas for public supply and non-agriculture use, and relatively more distributed groundwater use with smaller groundwater-withdrawal rates for agriculture, domestic, and livestock use (Video S1 of the ESM). Table 2 provides an overview of groundwater-withdrawal rates by water-use division and state to highlight the dominant uses of groundwater withdrawn from the Ozark system. The 110-year water-use dataset modeled using LIN-0 and assigned to Ozark system aquifer units is available in a companion dataset from Knierim et al. (2016).

Although groundwater withdrawals accounted for less than 10% of the total withdrawals in the Ozark Plateaus (USGS 2015c), groundwater has historically provided an important drinking-water resource for people living in the Ozark Plateaus and continues to do so (Hays et al. 2016). In combination, public supply and domestic use accounted for between 48 and 74% of total groundwater use since 1901, with withdrawals in Missouri—which has the largest land-surface area and population in the Ozark system—accounting for the largest portion of groundwater-withdrawal rates (Table 2). Public supply use was concentrated around urban areas, whereas domestic use was more widely distributed throughout rural areas (Video S1 of the ESM). Water used for domestic self-supply is nearly 100% from groundwater resources (Missouri Department of Natural Resources 2003; Maupin et al. 2014; Pugh and Holland 2015), and parts of the Ozark Plateaus remain rural and without public-supply infrastructure, further highlighting the importance of groundwater as a drinking-water source in the Ozark system.

In combination, agriculture and livestock groundwater use ranged from 23 to 38% of total groundwater use since 1901. Groundwater used for irrigation has not been as intensive throughout the Ozark Plateaus compared to nearby areas in Kansas, Nebraska, or eastern Arkansas because agriculture has generally not included row crops such as wheat and corn, owing to the chert regolith and nutrient-poor soils (Hays et al. 2016). The highest groundwater-withdrawal rates for agriculture use were concentrated in southwestern Missouri, and the highest rates for livestock use were concentrated in southwestern Missouri and northwestern Arkansas (Video S1 of the ESM). These areas are dominated by hay and sorghum (Missouri Department of Natural Resources 2003) and by poultry, hog, and cattle production (Hays et al. 2016; US Department of Agriculture 2016). Conversion of forested lands to agriculture or open areas appropriate for pasturing cattle or siting poultry facilities will increase the demands for freshwater resources, especially groundwater. The groundwater-use data modeled by LIN-0 (Knierim et al. 2016) provide a tool to better predict how changing land-use patterns may affect groundwater-withdrawal rates.

Groundwater use for the non-agriculture division ranged from 1 to 18% of total groundwater use since 1901 (Table 2). Similar to groundwater withdrawals used for public supply, non-agriculture use was generally concentrated around urban areas, but also included groundwater withdrawals around mines (Video S1 of the ESM), as lead, zinc, iron ore, and barite mining has been an important industry in the Ozark Plateaus (Rafferty 2001; Missouri Department of Natural Resources 2015). Aquifer properties of the Ozark system, such as low hydraulic conductivity and storativity in carbonate units (Hays et al. 2016), have contributed to the development of cones of depression with steep hydraulic gradients around pumping centers where groundwater is used for non-agriculture and public supply (Imes and Emmett 1994; Richards 2010). Quantifying groundwater-withdrawal rates and monitoring groundwater levels in these areas are especially important because potential declines in recharge, such as during periods of drought, can create substantial decreases in groundwater availability relatively quickly (Hays et al. 2016).

Conclusion

Accurately estimating historical groundwater use is of critical importance for quantifying variables that control groundwater-withdrawal rates and reliably simulating future water-use scenarios. The best-available site-specific water-use (SSWU) data combined with county-level water-use estimates (CNTY) were used to model site-specific groundwater withdrawals from the Ozark system from 1900 to 2010. Substantial effort was exerted to acquire and quality-control check SSWU data, disaggregate CNTY data to accurate well locations, and produce a historical groundwater-use record with reasonable groundwater-withdrawal rates in the early to mid-1900s. Additionally, by intersecting well depths with the hydrogeologic framework for the Ozark system (Westerman et al. 2016b), groundwater-withdrawal rates were assigned to aquifer units (assuming that groundwater was withdrawn only from the unit intersecting the bottom of the well); this information is generally not reported in, or may be difficult to acquire from, SSWU and CNTY datasets.
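The assignment of withdrawals to aquifer units by well depth can be sketched as a simple lookup. This is an illustration under stated assumptions rather than the procedure used with the published framework: the unit names, depths, well records, and function name are hypothetical, and the sketch treats the framework as a single vertical column, whereas unit depths in a real hydrogeologic framework vary by location.

```python
from bisect import bisect_left


def assign_unit(well_depth, unit_bottoms):
    """Return the unit that contains the bottom of the well.

    unit_bottoms is a list of (unit name, depth to unit bottom) ordered
    from shallowest to deepest. Returns None if the well is deeper than
    the deepest unit in the framework.
    """
    depths = [bottom for _, bottom in unit_bottoms]
    index = bisect_left(depths, well_depth)  # first unit bottom >= well depth
    return unit_bottoms[index][0] if index < len(unit_bottoms) else None


# Hypothetical framework at one location (depths in meters below land surface).
framework = [("upper aquifer", 100.0),
             ("confining unit", 150.0),
             ("lower aquifer", 400.0)]

# Hypothetical wells: (well id, depth in m, withdrawal rate in ML/d).
wells = [("w1", 60.0, 0.2), ("w2", 220.0, 1.5), ("w3", 390.0, 0.8)]

withdrawal_by_unit = {}
for well_id, depth, rate in wells:
    unit = assign_unit(depth, framework)
    # The whole withdrawal is credited to the unit at the well bottom.
    withdrawal_by_unit[unit] = withdrawal_by_unit.get(unit, 0.0) + rate

print(withdrawal_by_unit)
```

Crediting the full rate to a single unit follows the stated assumption that groundwater is withdrawn only from the unit intersecting the bottom of the well. Wells open to several units would need a different rule.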

A simple population-based, linear model constrained to 0 ML/d in 1900 (LIN-0) provided the best means to extrapolate groundwater-withdrawal rates. Non-linear statistical models were tested on a subset of the water-use data (Missouri public supply), but performed comparably to the unconstrained, population-based, linear model (LIN), with poor performance in the early to mid-1900s because of a paucity of water-use data. Therefore, modeled groundwater-withdrawal rates from LIN-0 were used as input to the Ozark system groundwater flow model (Knierim et al. 2016; USGS 2015a). The statistical models provided a mechanism to bracket historic groundwater use, such that groundwater-withdrawal rates ranged, on average, plus or minus 38% from modeled values. Although there was a large degree of uncertainty in the modeled groundwater-use record, especially for the older data, quantifying the range of groundwater use provides possible scenarios for groundwater availability models, such that the groundwater-withdrawal rates can be increased (or decreased) to assess how water availability changes. This assessment is especially important in aquifers with similar hydrogeology to the Ozark system, where low storativity can cause localized and steep cones of depression around pumping centers (Hays et al. 2016).

Groundwater use from the Ozark system was relatively evenly split among water-use divisions (public supply, domestic, agriculture, non-agriculture, and livestock), so accurately modeling groundwater-withdrawal rates for each division was critical for capturing total groundwater use. In combination, groundwater withdrawn for public supply and domestic use accounted for between 48 and 74% of total groundwater use since 1901, highlighting that groundwater provides an important drinking-water resource to people in the Ozark Plateaus. Total groundwater use from the Ozark system was 392 ML/d in 1950 and 1,438 ML/d in 2010, representing an increase of 267% (the 2010 rate is about 3.7 times the 1950 rate). Future access to groundwater resources will continue to be an important driver of economic and environmental security throughout the Ozark system.

Methods used to develop the water-use record for the Ozark system can be applied to other aquifer systems. Lessons learned of relevance to other water-use researchers include: (1) USGS county- and state-level compilations provide valuable starting points to assess the overall magnitude of regional water use within the US, (2) assembling site-specific water-use values from multiple datasets requires a high degree of quality assurance because of data duplication and typographical errors, (3) the relative magnitude of water use among water-use divisions is critical for assessing the level of detail necessary to accurately represent site-specific groundwater withdrawals, and (4) the paucity of historic water-use records can hamper estimates of water use over time and increase uncertainty, but reasonable estimates that fit the context of the aquifer system, including the pattern of development and hydrogeology, provide critical quantifications for hydrologic budgets. Additionally, local-scale refinements to the regional water-use record (Knierim et al. 2016)—for example by scaling the historic record to a smaller-scale (and likely more detailed) assessment of water use—can further provide a quantitative planning tool to water-resource managers. Therefore, quantification of historical groundwater use allows development of more accurate hydrologic budgets and overall greater understanding of groundwater availability.