Introduction

Geographically isolated depression wetlands—such as vernal pools, Carolina bays, playas, and karst sinkhole wetlands—provide important habitats to plant and animal species and thus support regional biodiversity. However, such depression wetlands may be increasingly vulnerable to anthropogenic climate and landcover change and to human demands on water resources (Tiner 2003; Wolfe et al. 2004; Gómez-Rodríguez et al. 2010; Bird and Day 2014; Bolpagni et al. 2019; Cartwright 2019). Many small depression wetlands are seasonally inundated—holding water for only part of each year—with inundation patterns driven by wetland and drainage-basin characteristics (e.g., size, geomorphology, substrate) and climate variability (Brooks 2004; Calhoun et al. 2017; Bertassello et al. 2019; Davis et al. 2019). Primary impacts of climate change and human water use in these depression wetland ecosystems can arise through hydrologic changes such as shifts in seasonal timing of inundation and changes in hydroperiod (i.e., the number of days per year that a wetland is inundated; Brooks 2009; Gómez-Rodríguez et al. 2010; Russell et al. 2020). Because hydroperiod commonly functions as a master control on wetland community composition and ecosystem dynamics (Babbitt 2005; Foti et al. 2012), anticipating wetland ecological trajectories under climate change commonly involves hydrologic modeling to predict and quantify shifts in the hydroperiod regime, i.e. changing patterns of seasonal and interannual variation of inundation over decadal time scales (Greenberg et al. 2015; Lee et al. 2015; Chandler et al. 2016; Qi et al. 2019). The need for such evaluation is heightened by noting that a lag of several decades may separate the hydrologically driven crossing of an ecological threshold from the accumulation of effects sufficient to trigger a management response (Wolfe et al. 2004).

Compared to other wetland types, hydrologic modeling of geographically isolated depression wetlands presents a distinct set of challenges. For example, whereas hydroperiod in alluvial wetlands can generally be assessed or at least inferred using streamflow records from streamgage networks (Townsend 2001; Berkowitz et al. 2019), leveraging such networks is more problematic—and in some cases impossible—for depression wetlands, which tend to be partially or completely disconnected from local surface-drainage systems. Furthermore, hydroperiods of coastal and riverine wetlands are driven by tidal and streamflow regimes that affect large areas, potentially making them more tractable by remote-sensing analysis than small depression wetlands where inundation may be partially obscured by canopy cover. Moreover, coastal and alluvial wetlands in many regions have been the subjects of relatively robust research and extensive monitoring networks (e.g., Foti et al. 2012; Crase et al. 2013; Berkowitz et al. 2019), reflecting threats to human infrastructure from floods, storm surges, erosion, etc., inherent in the tidal and streamflow regimes that control hydroperiod in these wetland types. By contrast, availability of hydrologic data and models tends to be more limited for small depression wetlands (Naughton et al. 2012; Cartwright and Wolfe 2016) and commonly focused on their aggregate effects on downstream streamflow rather than their internal hydrodynamics (e.g., Golden et al. 2014; Fossey et al. 2015; Golden et al. 2016; Jones et al. 2019).

Models of depression wetland hydroperiod can vary in a number of important ways, including differences in model architecture (i.e., structure and connectivity of model components such as storage compartments and fluxes between them), differing approaches to model parameterization, and ways to cope with inevitable limitations on availability and spatial resolution of environmental data (Clark et al. 2017). Such differences—especially varying degrees of model complexity—can have major implications for model transferability either in space (i.e., extrapolation to new wetlands in other geographic locations) or in time (i.e., model forecasting to predict future hydrologic conditions or hindcasting to simulate wetland hydrology over the decades prior to the beginning of hydrologic observations; Wenger and Olden 2012; Clark et al. 2017; Lute and Luce 2017). Substantial evidence indicates that model transferability benefits from parsimony, meaning that simpler models (i.e., with fewer parameters and storage compartments) tend to perform as well or better than more complex models when validated using independent data sets (Beven 1989; Jakeman and Hornberger 1993; Leplastrier et al. 2002; Perrin et al. 2003; Chien and Mackay 2014; Lute and Luce 2017; Chien and Mackay 2020). Whereas more complex models may better approximate localized physical processes and generally enable closer fit in calibration, favorable calibration statistics for highly complex models may represent a form of overfitting (i.e., “overparameterization”; Beven 1989; Perrin et al. 2001) causing such models to perform poorly when extrapolated to novel contexts (Wenger and Olden 2012; Lute and Luce 2017; Chien and Mackay 2020). Hydroclimatic non-stationarity (e.g., from climate change) implies that the hydroclimatic conditions of the model calibration and validation time periods (generally the recent past) may be substantially different from those of the more distant past (hindcasting) or future (forecasting), creating potential for transferability problems (Vaze et al. 2010; Coron et al. 2012; Coron et al. 2014; Magand et al. 2015; Duethmann et al. 2020). Thus, when hydroclimatic similarity cannot be assumed between the calibration/validation periods and simulation/prediction periods, model parsimony may be important inasmuch as it increases (or at least does not decrease) model transferability to novel hydroclimatic conditions (Vaze et al. 2010; Coron et al. 2014).

Even for relatively parsimonious hydrologic models, hydroclimatic variability and non-stationarity may necessitate periodic model updates or recalibrations over time (Grigg and Hughes 2018; Duethmann et al. 2020). Because of data limitations, hydrologic models for depression wetlands are commonly calibrated and validated using less than a decade of hydrologic observations (e.g., Sun et al. 2006; Barton et al. 2008; Naughton et al. 2012; Lee et al. 2015; Chandler et al. 2016; Qi et al. 2019). Even in the more data-rich field of streamflow modeling, 25-year datasets are considered “long term” and relatively rare (Chien and Mackay 2014; Chien and Mackay 2020). In this context, calibration and validation datasets typically represent only a subset of the long-term hydroclimatic variability and thus may poorly represent the hydroclimatic extremes of droughts and pluvial periods (Perrin et al. 2001; Kling et al. 2015). Such limitations can constrain methods to assess model transferability to dissimilar hydroclimatic conditions, such as the differential split-sample test (Klemes 1986). These issues make model hindcasting and forecasting especially problematic in regions where hydroclimatic variability is projected to intensify (Karl and Knight 1998; Huntington 2006; Kling et al. 2015), highlighting the importance of periodic model recalibration. While such updates are sometimes performed for widely used streamflow models (e.g., Grigg and Hughes 2018), they are much less common for localized models for depression wetlands.

Here, we revisit a simple two-compartment, four-parameter hydrologic accounting model that simulates daily water levels in Sinking Pond (Wolfe et al. 2004), a 35-ha seasonally ponded karst depression on the Eastern Highland Rim (Fenneman 1938) of Tennessee, USA. Hindcasting from the original model based on about 5 years of continuous water-level data showed an abrupt increase in hydroperiod around 1970, coinciding with reported increases in regional streamflow and precipitation (Karl et al. 1996; McCabe and Wolock 2002) and supported by dendrochronological evidence (Wolfe et al. 2004). This paper presents a revised hydrologic model that preserves the original structure and parameterization (Wolfe et al. 2004) but incorporates an expanded (22 versus 5 years) calibration/verification dataset and more robust calibration and verification procedures. Using the recalibrated model, we assessed temporal trends in precipitation and temperature across a 165-year simulation period (1855–2019) and compared seasonal and interannual variability of precipitation and temperature among the input dataset and two calibration/verification datasets. We show that the model revision reduced bias and improved performance and that the revised simulated time series of pond stage supports the previous conclusion that increased precipitation since around 1970 has driven a discernible increase in Sinking Pond’s hydroperiod.

Data and Methods

Site Description

Sinking Pond is a 35-ha seasonally inundated karst-depression wetland located on Arnold Air Force Base, Tennessee, USA, within the Eastern Highland Rim subsection of the Interior Low Plateaus physiographic province (Fig. 1; Fenneman and Johnson 1946). The name “Sinking Pond” refers to the abrupt seasonal rises and falls of pond water levels. For example, the pond can transform from a fully drained condition to 2 m or more of inundation within one or two days (Wolfe et al. 2004). In most years, Sinking Pond becomes inundated in autumn or winter, remains inundated through spring, and drains by mid to late summer. The deepest parts of Sinking Pond are narrow, steep-sided swallets and their interconnecting channels that constitute a relatively small percent of pond area and are topographically distinct from the broad expanse of shallower and flatter areas at higher elevations within the pond. Sinking Pond’s internal drainage system—a network of coalesced sinkholes and connected karst conduits—receives concentrated recharge from the Sinking Pond basin (Wolfe 1996a; Wolfe et al. 2004; Haugh 2006). Surface water flows seasonally to Crumpton Creek, a tributary of the Duck River (Fig. 1). Groundwater in the Sinking Pond area follows a flow path roughly parallel to Crumpton Creek and discharges through several springs, of which the most prominent is Big Spring at Rutledge Falls (Fig. 1; Haugh 2006).

Fig. 1
figure 1

Location of the Sinking Pond study area within Arnold Air Force Base on the Eastern Highland Rim of Tennessee, USA. Figure modified from Wolfe et al. (2004)

Sinking Pond is located within The Barrens: a region characterized by open-canopy ecosystems including prairie-like “barrens”, oak savannas, and woodlands, with ponds and other wetlands embedded throughout a landscape of low (< 20 m) relief and thick, infertile soils atop karstic Mississippian limestone (Wolfe 1996a; Pyne 2000). The study site has mean annual precipitation of roughly 150 cm and monthly mean temperatures ranging from 2.7 °C in January to 25 °C in July (Wolfe et al. 2004). Landcover is primarily temperate broadleaf forest (Wolfe et al. 2004). Vegetation within the 335-ha Sinking Pond basin ranges from oak-hickory (Quercus-Carya) forest along well-drained slopes and ridges to stands of sweetgum (Liquidambar styriciflua), blackgum (Nyssa sylvatica), red maple (Acer rubrum), and willow oak (Quercus phellos) in poorly drained areas; water tupleo (Nyssa aquatica), overcup oak (Quercus lyrata), and river birch (Betula nigra) dominate the deepest or most frequently inundated zones (Patterson 1989; Wolfe 1996a; Wolfe et al. 2004).

Climate Data

Daily precipitation and temperature values from January 1, 1854 through September 30, 2019 served as inputs to the Sinking Pond hydrologic model. Precipitation and temperature observations were compiled from National Centers for Environmental Information (NCEI) weather stations in the vicinity of the study site (National Centers for Environmental Information 2020), as described by Wolfe et al. (2004). Daily precipitation and temperature records beginning March 1, 1893 and February 1, 1895, respectively, were primarily from the station in Tullahoma, Tennessee (NCEI station 409,155). Gaps in the Tullahoma record (approximately 0.1 % of daily records) were filled using records from the station in McMinnville, Tennessee (NCEI station 405,882), according to Wolfe et al. (2004). Daily precipitation and temperature records prior to March 1, 1893 and February 1, 1895, respectively, were produced by disaggregating monthly temperature records. Disaggregation was performed by scaling reference daily values using ratios of the monthly values being disaggregated divided by the corresponding reference monthly values. Wolfe et al. (2004) describe selection of reference values and the disaggregation procedure in detail.

Stage Data

Daily measurements of pond stage (i.e., gage height) at the Sinking Pond study site were obtained from U.S. Geological Survey site number 03596075 (“Sinking Pond at AEDC, TN”; U.S. Geological Survey 2020). Records from this gage began in September 1992 and continued, with some interruptions, through and beyond the end of this study in September 2019. Water years (annual periods beginning October 1 of the previous calendar year and ending September 30 of the calendar year) from 1993 to 2019 with no more than 10 % missing daily records were used for model recalibration (Online Resource 1).

Model Formulation

Using a classical water-budget approach, a hydrologic model was developed to evaluate climate-driven hydrologic changes at the Sinking Pond study site. This model is described in detail by Wolfe et al. (2004) and summarized briefly here. On a daily time-step, the model accounts for precipitation and runoff inputs to the pond and outflows to the atmosphere, surface water, and the local groundwater system (Fig. 2). The model begins with a basic water budget (Eq. 1):

$$ \mathrm{P}+\mathrm{GI}+\mathrm{RI}=\Delta \mathrm{S}+\mathrm{GO}+\mathrm{RO}+\mathrm{ET} $$
(1)
Fig. 2
figure 2

Conceptual diagram of the Sinking Pond hydrologic model showing daily flux variables (arrows), daily state variables (SB and SP), and rate coefficients (green diamonds). Model terms subject to calibration are in red. All abbreviations are defined in Table 1

where P is precipitation, GI is groundwater inflow, RI is surface-runoff inflow, ΔS is change in volume of stored water (i.e., storage), GO is groundwater outflow, RO is surface-runoff outflow, and ET is evapotranspiration. Equation 1 is then decomposed into two linked functions (Eqs. 2 and 3) representing the contributing basin and Sinking Pond, respectively:

$$ \mathrm{BP}=\Delta \mathrm{SB}+\mathrm{BG}+\mathrm{BR}+\mathrm{BE} $$
(2)

and

$$ \mathrm{PP}+\mathrm{BR}=\Delta \mathrm{SP}+\mathrm{PG}+\mathrm{PR}+\mathrm{PE} $$
(3)

where BP is precipitation falling on the contributing basin outside the pond, SB is water storage in the basin vadose zone outside the pond, BG is groundwater recharge from the basin vadose zone, BR is surface runoff from the basin into the pond, BE is ET from the basin outside the pond, PP is precipitation falling on the pond, SP is water storage in the pond, PG is groundwater recharge from the pond, PR is surface runoff from the pond, and PE is ET from the pond (Fig. 2; Table 1). Daily flux and storage terms were expressed in basin centimeters (bcm; Table 1). Pond volume was calculated from daily SP values and then transformed into daily modeled stage values using a stage-volume rating curve from Wolfe et al. (2004).

Table 1 Key components of the Sinking Pond hydrologic model

Four model terms were subject to calibration (Fig. 2; Table 1). Three of these terms (KBG, KBR, and KPR) are rate coefficients that scale water transfers from a given compartment of the model (i.e., the basin or pond) to storage in that compartment (i.e., SB or SP) at each time step. The KBR and KPR runoff coefficients determine daily runoff from the basin and pond, respectively, as functions of storage (i.e., SB and SP). The KBG coefficient determines the daily groundwater recharge flux from the basin vadose zone as a function of SB. The fourth calibrated model term, PG, represents recharge from the pond when it contains water and accumulation of groundwater-storage deficit when the pond is drained. PG is conceptualized as a constant daily flux value, in part because characterizing hydraulic gradients is problematic when the pond is drained but also because the notable absence of seasonal and diurnal variability in discharge at Big Spring (Williams and Farmer 2003; Haugh 2006; see site description) suggests that a relatively steady accumulation of groundwater deficit is physically plausible.

Daily potential evapotranspiration (PET) was calculated using daily temperature and an empirically determined PET scaling factor, based on published annual pan evaporation rates (Wolfe et al. 2004). BE and PE were calculated using PET and storage (i.e., SB and SP, respectively), as well as basin and pond water-availability thresholds as described in Wolfe et al. (2004). The upper boundary conditions for storage in the basin and pond are constrained by runoff generation once basin-soil saturation or pond filling is exceeded (SB or SP > 0). The lower bound of basin storage is set by a spatially weighted average available moisture capacity of -23.4 bcm (Wolfe et al. 2004). We opted not to impose an explicit lower limit on pond storage because that would have required a fifth model calibration term, increasing model complexity and the risk of overfitting (i.e., improved calibration fit without necessarily improving model simulation). Thus, the model allows pond storage deficits to accumulate until reversed through recharge driven by precipitation and basin-runoff inputs. This storage conceptualization was reasonable given the thickness of the limestone substrate in which Sinking Pond is embedded, however, we note that this conceptualization may not be transferable to other wetlands with different hydrogeologic characteristics.

Original Model Calibration

Original calibration of the Sinking Pond hydrologic model, as presented in Wolfe et al. (2004), used a 2-year calibration period (i.e., observed stage records) from February 1993 through February 1995 and a 3-year validation period (water years 2000 through 2002). Model calibration and validation were performed using graphical comparison of observed and modeled values for pond volume, pond stage, and pond runoff (Wolfe et al. 2004). Notably, this original calibration process did not include an assessment of how the calibration and validation water years compared to the simulation period of record (water years 1855 through 2002) in terms of the distributions and variability of temperature and precipitation, nor any explicit evaluation of water years with anomalous climate conditions (e.g., very wet or very dry years).

We assessed the performance of the original model calibration from Wolfe et al. (2004) using an updated dataset containing an additional 17 years (22 years total) of climate and stage observations (Online Resource 1). We assessed calibration performance using three quantitative metrics, along with graphical comparisons of modeled versus observed stage hydrographs for each water year. The first two quantitative metrics, root mean-squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE), are commonly used in hydrologic model calibration (Ritter and Muñoz-Carpena 2013; Moriasi et al. 2015). In addition to calculating NSE using daily values across all water years, we also calculated NSE for each water year independently and then recorded the 25th percentile of these annual NSE values (NSEp25). The NSEp25 metric provides a means of assessing model performance for the 5 water years (out of 22) that were the least well-predicted by the model. To complement RMSE and NSE, we developed a third metric, which we refer to as hydroperiod discrepancy (HPD), to represent the degree to which the calibrated model overpredicted (HPD > 0) or underpredicted (HPD < 0) pond hydroperiod. Hydroperiod is defined here as the number of days in each water year with pond stage ≥ 323 m. We selected this stage value for the hydroperiod definition because it is approximately mid-way between the stage at which the pond is dry (approximately 321.5 m) and the spillway elevation (approximately 324.5 m). Because of the pond-basin topography, the pond contains very little water at a pond stage of 323 m, with pond volume of roughly 2.5 % of the spillway-stage pond volume. Because Sinking Pond typically fills and drains rapidly each year (Wolfe et al. 2004), selection of a slightly different stage value to define hydroperiod would have had minimal effect on the calculated hydroperiods. HPD was calculated as percent overprediction of the observed pond hydroperiod following Eq. (4):

$$ \mathrm{HPD}=\frac{{\mathrm{HP}}_M{\mathrm{HP}}_O}{{\mathrm{HP}}_O}\times 100 $$
(4)

where HPM and HPO are modeled and observed pond hydroperiod, respectively.

Model Recalibration

Recalibration of the Sinking Pond hydrologic model relied on the full 22 water years of daily observed pond stage (Online Resource 1), because this approach has been demonstrated to produce the most robust parameter sets (Arsenault et al. 2018). Recalibration involved evaluating all possible combinations of the model parameters subject to calibration (KBG, KBR, KPR, and PG) using the ranges and intervals in Online Resource 2, which produced 81,900 potential calibrations (i.e., unique combinations of model parameters) to be evaluated. For each potential calibration, the model was run on a daily time step from January 1, 1854 through September 30, 2019, to produce daily values of modeled pond stage over this time period. Modeled pond stage was then compared to the daily values of observed pond stage over 22 water years (Online Resource 1) to calculate model performance metrics (RMSE, NSE, NSEp25, and HPD). In this way, model performance metrics were generated for each of the 81,900 potential calibrations considered (Cartwright 2021).

Recalibration of the Sinking Pond hydrologic model was then conducted using a hierarchical multi-objective calibration procedure. First, a subset of potential calibrations was identified having NSE ≥ 98 % of the maximum NSE value across all potential calibrations, with NSE having been calculated across all water years in the recalibration dataset. This first criterion ensured that the final calibration had near-optimal NSE. Second, a reduced subset of potential calibrations was identified having HPD ≤ ± 2 %. This second criterion ensured that the final calibration was not biased in approximating pond hydroperiod. Lastly, from the subset of potential calibrations meeting the first two criteria, the final calibration was selected which had the highest value of NSEp25, which optimized model representation of the five worst-performing water years. We did not use RMSE as a recalibration criterion because it was largely redundant to NSE (correlation = -0.98, p < 0.001).

We validated the recalibration approach using leave-one-out cross-validation (LOOCV). Iteratively for each water year in the recalibration dataset, we (1) withheld the water year in question and performed model calibration using data from all other water years, (2) used the resulting calibration to generate modeled daily stage values for the withheld water year, and (3) compiled modeled daily stage values for all water years in this manner, one year at a time. We then calculated model performance metrics (RMSE, NSE, and HPD) using the modeled daily stage values for the entire recalibration period of record that were generated by the LOOCV process.

Hydrologic Simulation (hindcasting)

Because the model ran on a daily time step using climate inputs from January 1, 1854 through September 30, 2019, modeled daily stage values over a 165-year period (water years 1855 through 2019) could then be used to simulate (i.e., hindcast) historical patterns of pond hydroperiod and hydrologic responses to climatic variability. The start date for hydrologic simulation was set to January 1, 1854 to give the model time to stabilize before the period of analysis beginning October 1, 1854 (i.e., the start of the 1855 water year) and because a January start date makes plausible an initial condition where both the basin and the pond compartments were at full saturation (i.e., not producing runoff but ready to do so with an incremental input of water to either compartment).

Analysis of Climate Variability

We assessed climate variability by examining ranges and median values of temperature and precipitation across water years and by season (October through December, etc.). We compared climate variability across three time periods: (1) the period of record (POR) used in the original model calibration and validation (5 water years; Online Resource 1), (2) the recalibration POR (22 water years; Online Resource 1), and (3) the hindcasting POR (165 water years from 1855 to 2019). This analysis allowed an appraisal of how well the model calibration and recalibration represented the climatic variability of the simulation period.

Trend Analysis

Using the 165-year daily time series of simulated pond-stage values from the recalibrated model over the hindcasting period (1855-2019), we checked for possible hydroperiod trends using three common test statistics: Mann-Kendall (Mann 1945; Kendall 1975; Yue et al. 2002a), Pettitt (1979), and Sen’s slope (1968). We also checked for trends in climate variables driving the model, including precipitation, temperature, and water balance (precipitation minus potential evapotranspiration, PET). We checked for serial autocorrelation in annual climate variables and hydroperiod by computing the sample autocorrelation function using time lags from one to 20 years (Yue et al. 2002b). We visualized trends in climate variables and hydroperiod using boxplots to depict variability within three 50-year periods: 1870–1919, 1920–1969, and 1970–2019. Splitting the time series in this way allowed us to compare climate variables and hydroperiod pre- and post-1970, based on previously identified hydrologic changes circa 1970 at this site (Wolfe et al. 2004) and the broader region (Karl et al. 1996; McCabe and Wolock 2002). We quantified differences among these 50-year periods using analysis of variance (ANOVA) and Tukey’s post-hoc tests.

To check whether trends in the precipitation records used in the Sinking Pond hydrologic model were consistent or discrepant with larger regional patterns of precipitation trends (or lack thereof), we obtained precipitation records from National Centers for Environmental Information (2020) for all NCEI weather stations located within 100 km of the Sinking Pond study site having at least 70 years of annual precipitation data between 1920 and 2019, which yielded 8 weather stations (Online Resource 3). For each station, we performed a Mann-Kendall test on total annual precipitation and on the number of precipitation days, defined by NCEI as the number of days per year with at least 0.254 mm of precipitation. All modeling and analyses were performed in the R statistical language (R Core Team 2017) using the R packages Kendall (McLeod 2011), trend (Pohlert 2020), and forecast (Hyndman et al. 2020). Data and R processing scripts used in this study are available from Cartwright (2021).

Results

Climate Variability Represented by the Sinking Pond Hydrologic Model

As expected, interannual variability in precipitation and temperature was greater across the 22 water years used in model recalibration than across the 5 years used in the original calibration/validation (Fig. 3). Compared to the original calibration dataset, climatic variability across the recalibration water years more closely approximated the range of climatic conditions in the model simulation (hindcasting) period from 1855 to 2019; however, precipitation and temperature extremes across the simulation period were still not fully captured by the 22 water years used in model recalibration. Relatively dry water years (total precipitation < 130 cm) accounted for roughly a third of water years in the simulation period (57 years) but were completely absent from the dataset used in the original calibration. The updated dataset used in recalibration contained four such dry years (18 % of recalibration years), including two extremely dry years (2007 and 2008) with slightly less than 100 cm total precipitation. Similarly, warm years in the simulation period (mean annual temperature > 15.5 °C) were unrepresented by the original calibration but were represented by four years (18 %) in the recalibration. Improvements in representation of interannual climatic variability for the recalibration over the original calibration were especially pronounced for years with high October through December precipitation (Fig. 3b), high October through December temperatures (Fig. 3g), and low January through March temperatures (Fig. 3h). Notably, very cold water years in the simulation period were not well represented by either the original calibration or the recalibration, when examining entire water years (Fig. 3f) or three of the four quarters (Fig. 3g, i and j).

Fig. 3
figure 3

Variability in (ae) total precipitation and (fj) mean temperature for water years used in the Sinking Pond hydrologic model. Orange: 5 water years used in the original model calibration and validation; gray: 22 water years used in model recalibration; blue: 165 water years used in model simulation (i.e., hindcasting) from 1855 through 2019. All boxes show minimum, maximum, and median values. Boxes in (a) and (f) show variability across entire water years (October 1 through September 30). Boxes in (b) through (e) and (g) through (j) show variability within quarters of water years

Performance of the Original Model Calibration and Recalibration

Model performance was improved by the recalibration compared to the original model calibration according to all performance metrics calculated using the 22 water years in the recalibration period of record (Table 2, Online Resource 4). Model recalibration reduced RMSE by approximately 0.13 m (13 cm) and increased NSE by approximately 0.1. For 14 out of 22 water years (64 %), NSE was increased by the recalibration relative to the original calibration (Online Resource 5). Model bias in approximating pond hydroperiod was dramatically reduced by recalibration: whereas hydroperiod was underestimated by almost 20 % using the original calibration, the recalibration procedure reduced hydroperiod bias to within 2 % (Online Resource 4). For 16 water years (73 %), hydroperiod discrepancy was improved (i.e., became closer to zero) by the recalibration (Online Resource 5). Moreover, model performance in unusual or difficult-to-predict water years was improved by the recalibration, as indicated by an increase in NSEp25 of almost 0.2 (Table 2, Online Resource 4).

Table 2 Model parameters and performance metrics for the original calibration and recalibration of the Sinking Pond hydrologic model

Two model parameters—KBG and PG, representing groundwater recharge from the basin and pond, respectively—were largely unchanged by model recalibration (Table 2). Of the remaining two parameters, KBR was increased and KPR was decreased by the recalibration, reflecting increased basin runoff into the pond and decreased surface runoff from the pond, respectively. Changes in these two model parameters both served to increase pond storage and corresponding hydroperiod and thus were largely effective in removing hydroperiod bias from the model (i.e., reducing overall underestimation of pond hydroperiod close to zero).

Changes in the hydrologic simulation produced by the recalibration were especially apparent for the 2006, 2007, and 2008 water years (Fig. 4, Online Resource 5), the three water years in the recalibration period of record with lowest annual precipitation. Although pond inundation did occur in these drought years, the model under the original calibration either failed to fill at all (2007 and 2008) or was inundated only briefly (2006), producing large underestimates of pond hydroperiod (Online Resource 5). Model recalibration substantially improved performance in two of these drought years, lessening the underprediction of hydroperiod in 2006 and representing hydroperiod almost perfectly in 2008. In 2007, recalibration shifted the model from complete underprediction of hydroperiod (no simulated pond filling), to a roughly 200 % hydroperiod overprediction.

Fig. 4
figure 4

Hydrographs showing pond stage for the 22 water years in the period of record used for recalibration of the Sinking Pond hydrologic model. Tick marks on the horizontal axis show starts of each water year (WY) on October 1

Recalibration also produced moderate improvements under unusually wet conditions in 2017 and 2018 (Fig. 4). Both water years had above average annual precipitation (by approximately 16 and 22 cm, respectively) and the 2017 growing season was especially wet, with April through June and July through September precipitation roughly 8 and 14 cm above average, respectively. July through September conditions in 2017 were also abnormally cool (0.7 °C below average). As a result, Sinking Pond never drained at the end of the 2017 water year, instead remaining inundated from late winter of 2017 until early autumn of 2018 (Fig. 4). Although neither the original calibration nor the recalibration successfully replicated this unusual event, the recalibrated model drained later and filled earlier than the original calibration, thus performing slightly better during these unusually cool and wet conditions. We also note that although the recalibrated model was largely successful in simulating pond stage and hydroperiod in most water years, it was considerably less successful in years with unusually complex hydrographs such as years with multiple filling and draining events (e.g., 2007 and 2009; Fig. 4, Online Resource 5).

For many water years in the hindcasting period (1855 through 2019), simulated pond stage and hydroperiod were largely unchanged by the recalibration relative to the original calibration (Online Resource 6). However, overall increases in simulated hydroperiod produced by the recalibration—presumably resulting from the substantial correction of hydroperiod bias—were apparent for several water years in the hindcasting period, particularly for water years for which simulated hydroperiod had been particularly low under the original calibration.

The LOOCV process produced a distribution of model parameters (i.e., calibrated parameters that differed across the water years as each was withheld for validation; Online Resources 7 and 8) that was generally similar to the parameter values from the recalibrated hydrologic model (Table 2). Model performance metrics from the LOOCV process varied across water years withheld for validation (Online Resources 7 and 8). Performance metrics from LOOCV were poor for a few atypical water years, including 2007, 2009, and 2017–18 which contained more than one distinct period of pond inundation or in which pond drainage did not occur (Fig. 4) but were generally acceptable in most water years (Online Resource 8).

Based on the recalibrated model, inputs to the pond were dominated by BR, which averaged roughly 2.5 times the magnitude of PP; pond outflows were dominated by PG and PR, with PE accounting for roughly one-eighth of pond outflows (Table 3).

Table 3 Means and standard deviations of model terms from the recalibrated Sinking Pond hydrologic model
Table 4 Analysis of trends in climate and hydroperiod from the Sinking Pond hydrologic model

Trends in Pond Hydroperiod and Climate Variables

Simulated hydroperiod from the recalibrated model showed clear positive trends over time (Fig. 5), related to increases in total annual precipitation and water balance (precipitation minus PET; Figs. 5 and 6; Table 4). Based on Sen’s slope over the entire hindcasting period (1855–2019), hydroperiod increased on average by 3.3 days per decade. The 95 % confidence interval for this Sen’s slope value did not include zero, indicating > 95 % confidence of positive Sen’s slope over the hindcasting period. Sen’s slope for hydroperiod over the past 100 years (1920–2019) was greater (Fig. 5b; Table 4), with hydroperiod increasing on average by 5.9 days per decade. Pettitt’s tests indicated breakpoints in hydroperiod around the year 1970 (Table 4), and hydroperiod from 1970 to 2019 was significantly greater than pre-1970 hydroperiod based on ANOVA and a Tukey’s post-hoc test (Fig. 6d). Consistent with a breakpoint circa 1970, hydroperiod showed no significant trends when examining only the post-1970 period (Table 4).

Fig. 5
figure 5

Hydroperiod (days per water year with stage ≥ 323 m) from the Sinking Pond hydrologic model, showing (a) yearly values and (b) trends over time, along with (c) trends in mean annual precipitation minus potential evapotranspiration (PET). Details on Sen’s slope (for the entire hindcasting period and for the last 100 years, in gray and black dashed lines, respectively) and Pettitt breakpoint analysis are presented in Table 4; these statistics were calculated using yearly values, not moving averages

Fig. 6
figure 6

Boxplots of (ac) climate variables and (d) hydroperiod from the Sinking Pond hydrologic model, grouped into 50-year time periods from 1870 through 2019. Analysis of variance (ANOVA) results are presented above each plot and letters (A, B, AB) represent significant (p < 0.05) differences or lack thereof from a Tukey’s post-hoc test

Increasing hydroperiod mirrored increases in local precipitation over time. Over the entire hindcasting period, total annual precipitation increased by 1.9 cm per decade on average according to Sen’s slope (Table 4). Total annual precipitation was significantly greater after 1970 than before based on ANOVA and a Tukey’s post-hoc test (Fig. 6a). Unlike precipitation, temperature at the Sinking Pond site did not show clear trends (Table 4), instead increasing significantly over the earlier part of the hindcasting period (from 1870to1919 to 1920–1969) and then decreasing slightly but not significantly over the latter part (from 1920to1969 to 1970–2019; Fig. 6b). Similar to hydroperiod, neither temperature nor precipitation showed significant temporal trends if looking only at the post-1970 period. Unlike with daily or seasonal variables, serial autocorrelation was not a major concern for hydroperiod and annual climate variables. Because maximum autocorrelation values (from one to 10-year lags) and 1-year lag values generally indicated only weak to moderate autocorrelation (Table 4), trend statistics presented here were calculated without pre-whitening (Yue et al. 2002b).

The increasing precipitation trend at the Tullahoma weather station (from which 99.9 % of daily climate records were obtained for use in the Sinking Pond hydrologic model) does not appear to be a regional outlier. Of eight NCEI weather stations selected for comparison (Online Resource 3), four showed significant (p < 0.05) positive trends in total annual precipitation based on a Mann-Kendall test and a partially overlapping set of four stations showed significant positive trends in the number of precipitation days per year (Online Resource 9). Three stations showed no significant trend, and none showed significant negative trends in either total annual precipitation or number of precipitation days per year.

Discussion

In this study, quadrupling the length of the climatologic and hydrologic record used for calibration and validation and applying more explicit model performance metrics improved the Sinking Pond hydrologic model from Wolfe et al. (2004). The improvement consisted principally of a reduction in bias (i.e., hydroperiod underprediction). In turn, this bias reduction improved model performance in most water years and largely resolved a shortcoming of the original calibration in which the model failed to fill during abnormally dry water years (2006–2008). The recalibrated model confirmed a finding by Wolfe et al. (2004) of increasing hydroperiod at Sinking Pond beginning around 1970, apparently linked to increasing precipitation (Table 4; Figs. 5 and 6). Notably, this hydroperiod increase is consistent with previous studies showing increased precipitation and streamflow since 1970 in the eastern United States (Karl and Knight 1998; McCabe and Wolock 2002). Although landcover change was not accounted for in the model, forest cover in the Sinking Pond basin increased from the 1940 s through the early 2000 s (Wolfe et al. 2004), a trend which, if anything, would be expected to reduce pond inundation because of greater evapotranspiration (Sun et al. 2000; Jones et al. 2018), suggesting a more likely role for precipitation rather than landcover change in driving hydroperiod increase. Moreover, the potential for hydrologic effects from localized alterations in and around the Sinking Pond site—including structural changes to the pond spillway or internal drain as well as waterfowl impoundments in adjacent watersheds—were investigated by Wolfe et al. (2004) using field observations and hydrologic records and were ultimately deemed unlikely to have caused the increase in pond hydroperiod.

Although recalibration improved model performance overall and for several years that were unusually wet or dry, both the original calibration and the recalibration appeared generally successful in simulating daily water levels in hydrologically typical years, indicating adequacy for the purpose of characterizing secular trends in Sinking Pond’s hydrologic regime. Several factors probably account for the adequacy of the original calibration, despite the relatively few (five) years of calibration/validation data that were originally available. The magnitude of the increase in pond hydroperiod produced a reasonably strong signal that might be more readily detected than more subtle or variable trends. In addition, the original modeling effort began with a reasonably firm conceptual model (Fig. 2), based on more than a decade of hydrologic and ecological investigations in the study area (Patterson 1989; Wolfe 1996a, b; Wolfe and League 1996; McCarthy and Evans 2000), whose essential features are preserved in the numerical model’s structure and parameterization. Finally, the model’s relative simplicity (two compartments, eight transfer terms) and parsimony (four parameters) may have helped mitigate against overspecification and false precision (Beven 1989; Perrin et al. 2001; Lute and Luce 2017).

These considerations suggest that the model structure described here may have some transfer value to similar depression wetlands where inflow is dominated by runoff and internal drainage is tightly coupled to groundwater conditions. It is unclear whether or how that structure could be adapted for other depression wetland systems in the region, such as the shallow, hydraulically perched “karst pans” that coexist along with Sinking Pond and similar features in the Eastern Highland Rim (Wolfe 1996b). In general, model transferability among wetland sites is likely to depend both on intrinsic model characteristics—including simplicity, parsimony, and conceptual basis—as well as climatological, hydrogeologic, and geomorphic similarity among wetland sites. Efforts to adapt the model structure presented here for similar wetland systems could benefit from (1) empirical observations at the sites in question to relate wetland storage to evapotranspiration, groundwater flow, and runoff, (2) evaluation of alternative model formulations (e.g., by scaling PG to SP and/or imposing a lower limit on SP), and (3) use of the freely available and reproducible data and code for the model available from Cartwright (2021).

Although recalibration improved performance of the Sinking Pond hydrologic model and succeeded in approximating pond-stage patterns in most years, poor model performance during two hydrologically atypical years—2007 and 2009, which each involved multiple filling and draining events—reflects the difficulty of replicating unusually complex hydrographs (Fig. 4, Online Resource 5). Model underperformance in these unusual years is unsurprising in a parsimonious model but raises questions about the degree to which such anomalous water years affected hindcasting accuracy. Of the 138 years prior to the recalibration period beginning in 1993, 11 years (8 %) were drier (had lower total annual precipitation) than the driest year in the recalibration dataset, whereas 3 years (2 %) were wetter than the wettest recalibration year. Thus, a relatively small percentage of hindcasted years would be considered extreme or anomalous based on climate conditions. Furthermore, of the 22 years in the recalibration dataset, all but 3 (86 %) had a single, relatively clear pond filling event and a single draining event, including water years 2006 and 2008 which were almost as dry as 2007 (Fig. 4). These two lines of evidence suggest (1) that hydrologically unusual and possibly poorly modelled years probably did occur in the hindcasting period, but also (2) that these years were likely rare compared to years with more moderate climate and simpler hydrographs. Thus, given the overall strength of the hydroperiod trend signal (Table 4), any model limitations in predicting water levels in hydrologically complex or anomalous years are unlikely to have strongly influenced our finding of increased pond hydroperiod over time.

A shift in hydrologic regime of the magnitude shown for Sinking Pond may have profound ecological implications. In common with wetlands generally, biological communities in depression wetlands tend to be shaped by ranges of hydrologic variation and especially by hydroperiod (Brooks 2000; Babbitt 2005; Liston 2006; Stroh et al. 2008; Greenberg et al. 2015; Cartwright and Wolfe 2016; Chandler et al. 2016). Therefore, substantial hydrologic shifts in either direction may degrade wetland ecological integrity. A shift to a drier hydrologic regime may facilitate encroachment of plants and animals from adjacent upland habitats; conversely, a shift to wetter conditions may exceed the physiological tolerance of key wetland species (Cartwright 2019). Hydroperiod changes can produce cascading consequences through wetland ecosystems, as hydroperiod influences diverse ecological processes including amphibian reproduction and population dynamics, litter decomposition and carbon cycling, wetland geochemistry and water quality, and food webs including invertebrate community composition (Golladay et al. 1997; Brooks 2000; Leibowitz and Brooks 2007; Boven et al. 2008; Brooks 2009; Goldyn et al. 2015; Cartwright and Wolfe 2016; Chandler et al. 2016). These linkages underscore the potential importance of climate-driven hydroperiod changes for wetland communities and ecosystems (Greenberg et al. 2015; Lee et al. 2015). Further evaluation of these possible changes at Sinking Pond could employ future hydroperiod projections, leveraging the existing open-source model (Cartwright 2021) combined with temperature and precipitation projections from multiple downscaled climate models to encompass the range of uncertainty across climate models and greenhouse-gas emissions scenarios (Greenberg et al. 2015; Lee et al. 2015; Terando et al. 2020).

In Sinking Pond, the abrupt hydroperiod increase around 1970 appears to have triggered a collapse of tree regeneration, setting the system on a trajectory to transition from a seasonally inundated forested wetland toward a seasonally drained open-water pond (Wolfe et al. 2004). Such pronounced effects of wetland hydrology on vegetation structure have similarly been documented in other depression wetland systems of the southeastern United States (De Steven and Toner 2004; Kirkman et al. 2012), such as Carolina bays (Collins and Battaglia 2001; Stroh et al. 2008) and karst-depression wetlands embedded in fire-dependent pine forests (Kirkman et al. 2000). More broadly, hydroperiod has been documented to be a primary determinant of vegetation structure and plant community composition in a variety of wetland types, ranging from sawgrass marsh (Foti et al. 2012) to alluvial floodplain forests (Townsend 2001) to coastal mangroves (Crase et al. 2013). As the potential impacts of climate change on small, geographically isolated wetlands attract further study (Brooks 2009; Cartwright and Wolfe 2016), hydrologic models may become increasingly important for linking meteorological changes in water availability to hydroperiod shifts and resulting ecosystem effects.

Robust hydrologic models for seasonally inundated depression wetlands are important for natural-resource management more broadly, because depression wetlands tend to make substantial contributions to regional biodiversity and because their ecohydrology may be sensitive to climatic fluctuations, human water withdrawals, land-use change, and ecosystem restoration efforts (Tsai et al. 2007; Barton et al. 2008; Brooks 2009; Gómez-Rodríguez et al. 2010; Bird and Day 2014; Park et al. 2014; Cartwright and Wolfe 2016; Bolpagni et al. 2019; Bertassello et al. 2020). In particular, there is a need for models of wetland hydrology to provide temporal extrapolation (i.e., hindcasting and forecasting) in the context of climatic non-stationarity (Greenberg et al. 2015; Lee et al. 2015; Chandler et al. 2016). Although hydroperiod changes may also be detected through other means such as remote sensing (Gómez-Rodríguez et al. 2010; Russell et al. 2020), hydrologic models are generally required for evaluation of hydroperiod dynamics over long (> 50-year) timeframes. Hydrologic hindcasting and forecasting provide unique windows into wetland dynamics that can help wetland managers anticipate future ecological consequences from past and present hydrologic trajectories (Lee et al. 2015; Chandler et al. 2016). To optimize model transferability in time under non-stationary hydroclimatic conditions, wetland models would ideally (1) be grounded in conceptual models informed by prior investigations of wetland geomorphology and landscape context; (2) be calibrated and validated using hydrologic observations from a wide range of climatic conditions, including droughts and wet years; (3) adhere to the parsimony principle to avoid over-fitting; and (4) undergo periodic recalibration as longer hydrologic records become available (Wolfe 1996b; Sun et al. 2006; Kling et al. 2015; Lute and Luce 2017; Jones et al. 2018).

Conclusions

In this study, we present a recalibrated hydrologic model for Sinking Pond, a seasonally inundated karst-depression wetland on the Eastern Highland Rim in Tennessee, USA. Using 22 years of hydrologic and climate observations, model recalibration provided substantial improvements over the previous calibration (which had used 5 years of observations) and was generally successful in representing pond water levels and hydroperiod across a variety of climatic conditions. We used the recalibrated model to simulate daily water levels over a 165-year hindcasting period and found strong evidence for precipitation-driven hydroperiod increase after 1970.

These findings suggest several implications for future hydrologic modeling studies in depression wetlands. First, reasonably well-performing hydrologic models for depression wetlands are achievable and can be helpful to anticipate hydroperiod changes and their ecological consequences over time. Second, simplicity and parsimony are valuable features of hydrologic models and in some cases may aid model transferability in time (e.g., hindcasting and forecasting). Third, our findings support the idea that longer periods of hydrologic observations—and especially the representation of climatically extreme years—can help strengthen model robustness in the context of hydroclimatic non-stationarity and increasing climatic variability. Finally, this study underscores the importance of considering the range of climatic variability across the model calibration and validation period in comparison to that of the time period used for temporal extrapolation (e.g., hindcasting), to optimize model accuracy for detecting potential hydroperiod changes over time.