Introduction

Urban streams receive excess nitrogen (N) from multiple sources in the watershed and transport N downstream because supply from the watershed is greater than demand in the stream (Grimm et al. 2005; Kaushal et al. 2014a, b). Channel degradation from flashy runoff, incision, and floodplain disconnection impairs N transformation and uptake and exacerbates downstream N transport (Paul and Meyer 2001). Consequently, stream restoration designed to repair and reconnect stream channels is an increasingly popular approach for managing N in urban streams. Such restoration attempts to create hydrologic conditions favorable for N transformation and denitrification by reducing flashiness, increasing residence times, and adding organic carbon for denitrifiers (Kaushal et al. 2008; Gift et al. 2010; Mayer et al. 2010b; Duan et al. 2019). Our recent synthesis suggested that there is potential for reducing N through stream restoration (Newcomer Johnson et al. 2016), though it is not clear which methods are most effective. Identifying effective watershed restoration and stormwater best management practices (BMPs) is critical for establishing protocols to meet nutrient management goals in watersheds like the Chesapeake Bay, USA, where this study was conducted (Urban Stormwater Work Group 2020). While some restoration approaches have shown short-term benefits (Bukaveckas 2007; Kaushal et al. 2008; Filoso and Palmer 2011; Filoso et al. 2015), most projects receive little or no post-restoration assessment (Bernhardt et al. 2005; Hassett et al. 2005). Most stream restoration studies are short-term evaluations employing space-for-time substitutions that examine restored sites alongside unrestored or reference sites simultaneously (but see Bukaveckas 2007). Even fewer studies have examined groundwater-surface water interactions in urban or restored streams (Striz and Mayer 2008; Mayer et al. 2010b).
Our research was intended to fill a gap in long-term studies of restoration, improve our understanding of N behavior in groundwater and surface water of restored streams, and elucidate possible BMPs for N management in urban ecosystems.

We investigated the long-term effects of geomorphic stream restoration on riparian and in-stream N transport and transformation at Minebank Run. Initially, Minebank Run was a geomorphically degraded urban stream near Baltimore, Maryland, USA, in the Chesapeake Bay watershed. However, it was later restored by hydrologically reconnecting the stream channel with the floodplain to dissipate erosive force (Kaushal et al. 2008). Over almost a decade, we examined relationships between hydrology, chemistry, and biology before and after channel and floodplain restoration. Before restoration, Minebank Run exhibited “urban stream syndrome” characteristics (Walsh et al. 2005), which included highly eroded stream banks and flashy hydrology stemming from increased runoff from impervious surfaces (Doheny et al. 2006). Channel incision and meandering had exposed buried stormwater and sewer infrastructure prompting Baltimore County Department of Environmental Protection and Sustainability (BCDEPS) to restore Minebank Run. Minebank Run was restored in two phases: (phase 1) an upstream section in 1999 and (phase 2) a downstream section in 2004 and 2005. We hypothesized that stream restoration designed to reconnect the stream to its floodplain would reduce surface and groundwater NO3 concentration by creating hydrologic conditions that positively affect microbial activity such as denitrification, improve stream bank stability, and expand hyporheic zones (Mayer et al. 2010b).

Naturally flowing stream channels are hydrologically connected to their floodplains. Urban streams often are disconnected from their floodplains because of stream channel incision from flashy stormwater runoff (Paul and Meyer 2001), altered subsurface flow from engineered urban karst (Kaushal and Belt 2012), and stream burial in pipes (Elmore and Kaushal 2008; Beaulieu et al. 2014, 2015; Pennino et al. 2014). However, increasing hydrologic connectivity at the groundwater-surface water interface can foster “hot spots” and “hot moments” of N removal via denitrification when proper redox conditions develop and when inorganic N and organic C are available to denitrifying bacteria in the subsurface sediments (Hedin et al. 1998; Sobczak and Findlay 2002; Sobczak et al. 2002; McClain et al. 2003; Mayer et al. 2010b; Vidon et al. 2010). The current stream restoration protocol in the Chesapeake Bay region focuses on reconnecting floodplains as the most effective way to increase N uptake in urban streams (Urban Stormwater Work Group 2020). To test the efficacy of floodplain reconnection to reduce N in groundwater and surface water of degraded streams, we examined two independent groundwater and surface water datasets collected at our intensively monitored study site. We targeted stream reaches for intensive study where channel geomorphology was reengineered to reconnect hydrology. We use both empirical and modeling approaches to corroborate results and examined our site both before and after restoration. We examined two different post-restoration time periods (2002–2008 and 2002–2012) because there was evidence for failure of restoration features after 2008. Our results are intended to inform future stream restoration efforts designed to manage N in urban ecosystems by analyzing long-term data, which is rarely available for restored sites.

Methods

Study design

We investigated how stream restoration influenced NO3 flux and concentration in groundwater and surface water at Minebank Run (Baltimore County, Maryland, USA) from November 2001 to October 2008. Studies of biogeochemistry, geomorphology, hydrology, salinity dynamics, and denitrification processes at Minebank Run have been published elsewhere (e.g., Mayer et al. 2003, 2010b, 2013; Groffman et al. 2005; Doheny et al. 2006, 2007, 2012; Kaushal et al. 2008; Gift et al. 2010; Striz and Mayer 2008; Klocker et al. 2009; Newcomer et al. 2012; Harrison et al. 2011, 2012a, b, 2014; Cooper et al. 2014; Pennino et al. 2016; Wood et al. this issue). Here, we sought to place our research within the context of assessing the restoration of ecosystem processes, hydrologic transport, and the identification of factors limiting and influencing those processes and transport (Palmer 2009). Our specific objectives were to examine relationships among DOC, NO3, and Cl in surface water and groundwater pre- and post-restoration by employing a Before/After-Control/Impact (BACI) study design (Underwood 1992; Thompson et al. 2018). Our study combined long-term monitoring of a stream before and after restoration with intensive groundwater and surface water characterization. We used two independent data sets to produce empirical and modeling results: our own empirical data (EPA) and US Geological Survey (USGS) stream gage data, which we used to develop corroborative N flux models spanning pre- and post-restoration study periods. We expected that N transport and processing would be controlled by geomorphology, hydrology, and carbon supply.

Study area

Minebank Run is a 2nd order urban stream located within an 8.47 km2 watershed in Baltimore County, Maryland, USA, in the eastern section of the Piedmont physiographic province (39°24′43″N and 76°33′12″W; Fig. 1). Minebank Run flows in a northeasterly direction at approximately a 1% grade for 5.2 km, where it enters Gunpowder Falls, eventually draining into Chesapeake Bay (Doheny et al. 2006). Land use in the Minebank Run watershed is over 80% urban/suburban (Doheny et al. 2006). During the 1960s and 1970s, rapid urbanization led to severe channel degradation that was addressed by installing concrete flumes in the channel, which, by the 1990s, had been eroded out of place (Sortman 2004). The high proportion of impervious surface in the watershed, including the Interstate-695 Beltway, in a region of significant topographic relief, combined to produce flashy hydrology, eroded banks, an incised channel bed, and general geomorphic instability. Hydrographs of storm events at Minebank Run (Doheny et al. 2006) are typical of urban streams in areas of high impervious surface (Paul and Meyer 2001).

Fig. 1

Aerial view of Minebank Run watershed and stream showing locations (red boxes) of study sites. Yellow line delineates the watershed. Blue line represents the stream. Stream flow is to the northeast where Minebank Run confluences with Gunpowder Falls. (Aerial photo courtesy of Baltimore County Department of Environmental Protection and Sustainability)

Urban development around Minebank Run predates stormwater management regulations implemented in the mid-1980s, and thus, uncontrolled runoff entering the stream was a significant water quality problem. Portions of the channel were encased in concrete, thereby increasing the flashiness of storm flows. Sewer lines and storm drains were eroded and exposed. Riparian buffers were cleared for residential and commercial development.

Before restoration, the study reach was characterized by channel incision, revealing bedrock in some places, and causing lateral movement of the stream that impacted property and sewer infrastructure. Pre-restoration channel width ranged from 0.15 to 15.5 m, and depth ranged from 0.04 to 0.9 m, yielding channel cross-sectional areas ranging from 0.007 to 13.908 m2 (Doheny et al. 2006). Mean bank height along the study reach was 0.77 ± 0.11 m (Mayer et al. 2010b) with some extremely incised banks up to 1.5 m (Fig. 2). Minebank Run was classified as a B4c channel type (Rosgen 1996), characterized by a single-thread channel with an entrenchment ratio of 1.4–2.2, width-to-depth ratio > 12, moderate sinuosity > 1.2, water-surface slope of 3–4%, and a median particle diameter in the gravel range of 2 to 64 mm (Doheny et al. 2007).

Fig. 2

Aerial photos of a portion of the downstream study reach of Minebank Run showing the reconstructed stream before and after the restoration (aerial photos by K. Jewell). Note the bend in the stream channel (future oxbow) that became the oxbow after the restoration

Phase I of the restoration, which addressed the upstream 2,400 m of stream beginning from the headwaters, began in 1999 and was completed in 2002. Phase II of the restoration, lasting from June 2004 to February 2005, addressed the remaining 3,300 m of the stream to the confluence of the Gunpowder River (USEPA 2006; Doheny et al. 2012). The restored study site is in Cromwell Valley Park (CVP) in approximately the middle of the 3,300 m Phase II restoration reach and is referred to as the restored site (Fig. 1). The control study site is in the Intervale (IV) neighborhood of Towson, MD, USA, in approximately the middle of the 2,400 m Phase I restoration reach (Fig. 1). We refer to this as the control rather than a reference site because it is not pristine (sensu Brinson and Rheinhardt 1996), yet it is within the same watershed and exposed to the same runoff and conditions as the restored site. Rather, the control IV site represents a reach that was not manipulated during the study while the downstream site was manipulated by the process of restoration. Our objective was to determine if the downstream reach developed N retention, transport, and/or transformation behavior that was different after the restoration. The control and restored reaches are 2.4 km apart. The restoration included re-engineering approximately 1,800 m of mainstem channel and 600 m of tributary channels (Sortman 2004).

Throughout the 5.2 km length of Minebank Run, various stream restoration techniques were used based on the condition of the channel and surrounding land use and infrastructure. For example, the stream channel was redesigned to move the thalweg away from an exposed sewer line to protect against further erosion and channel meandering (Fig. 2). At two points, the channel was redesigned to reduce bank erosion by creating oxbow wetlands (Harrison et al. 2012a, b, 2014), which effectively straightened the channel but allowed for greater overbank flow and stormwater retention. Prior to restoration, about 40–70% of the reach was riffles with runs and pools making up the remainder whereas, after restoration, the proportions of riffles, runs, and pools were more equitable at about 30–40% each (Doheny et al. 2012).

The restoration was intended to reconnect the stream channel with the floodplain by mimicking natural valley and floodplain morphology. For example, the project included root wads and imbricated riprap to reduce bank erosion, cross vanes, step-pools, and pool-riffle features to slow stream flow, as well as a stable meander pattern and cross-section. Natural channel design methods (Rosgen 1996, 2011) were also applied to control flow and erosion by: a) raising the stream bed by filling the channel with gravel and cobble, b) removing concrete liners, c) reconstructing point bars, riffles and meander features, d) creating step-pool structure, e) armoring banks, f) creating oxbow wetlands, and g) re-vegetating the riparian zone (Sortman 2004; Duerksen and Snyder 2005). The restoration corresponded to typologies A, C, and I as described in Newcomer-Johnson et al. (2016) where the stream bottom was raised (A), the floodplain lowered (C), and oxbow wetlands created (I).

Although the approach to stream restoration at Minebank Run was primarily intended to address channel erosion and protect sewer infrastructure, we hypothesized that restoration could also affect the hydrology and biogeochemistry of the system (Mayer et al. 2003; Fig. 3). Specifically, we speculated that the physical manipulations designed to accommodate the change in stream discharge rates would also have the potential to change surface and groundwater hydrology (Bukaveckas 2007; Tague et al. 2008). Changes in hydrology could reduce the hydrologic drought (drying of riparian zones due to hydrologic disconnection with the stream and groundwater flowpaths) common in urban streams (Groffman et al. 2003). We speculated that the approach of reshaping the banks and raising the stream bed to eliminate bank incision also might allow carbon-rich riparian soils to become saturated and/or remain wetter, resulting in biogeochemical conditions favorable for nutrient transformations such as denitrification (Kaye et al. 2006). Specifically, we expected flow control structures installed in the stream channel to reduce erosion also may trap organic matter long enough to create enriched anoxic zones conducive for denitrification to occur (Groffman et al. 2005). We also expected re-vegetating the riparian zone could provide litter and organic matter inputs and root biomass to supply carbon to denitrifiers (Gift et al. 2010).

Fig. 3

Conceptual figure of stream restoration to improve floodplain reconnection. Eliminating incision and reconnecting the floodplain unites saturated soils with organic matter from plants and roots and allows greater stream channel and groundwater interaction leading to more denitrification and subsequent reductions in stream nitrogen

Precipitation and hydrology

Daily, monthly, and annual precipitation data for the study period (2002–2012) were downloaded from the PRISM Climate Group (https://prism.oregonstate.edu/explorer/; latitude/longitude: 39.4200/-76.5788). Stream discharge (m3/s) was measured at 5-min intervals at the Minebank Run restored CVP reach by the USGS from 2002–2012 with an automated USGS stream gage (USGS ID 0158397967, Minebank Run near Glen Arm, Maryland, USA; http://waterdata.usgs.gov/nwis/uv?0158397967). To construct regression models of the relationship between discharge and chemistry, the date of each water chemistry sampling was matched with the mean daily stream discharge for the same date provided by the USGS stream gage. While there is no stream gage at the control IV site, discharge measurements and samples collected by USGS are available (USGS ID 0158397925; https://waterdata.usgs.gov/md/nwis/inventory/?site_no=0158397925).

Seasonal groundwater and surface water chemistry

Groundwater and surface water were collected during 19 sampling events at the restored CVP reach (2001–2008) and 14 corresponding sampling events at the control IV reach (2003–2008). We refer to these samples as EPA data. Sampling dates before 26 May 2004 are considered pre-restoration and sampling dates after 2 March 2005 are post-restoration. At CVP, groundwater and surface water were collected 9 times pre-restoration (28–30 Nov 2001, 5–7 March 2002, 6–9 May 2002, 22–25 July 2002, 15–17 Oct 2002, 7–9 April 2003, 29–31 July 2003, 15–17 Oct 2003, 10–12 May 2004) and 10 times post-restoration (29 Nov–1 Dec 2005, 24–27 April 2006, 28–30 Aug 2006, 5–6 Dec 2006, 16–19 April 2007, 26–27 July 2007, 27–29 Nov 2007, 21–24 April 2008, 7–8 July 2008, 27–29 Oct 2008). Groundwater and surface water were collected at control IV beginning April 2003 and continuing through Oct 2008 for the same 14 sampling dates as restored CVP during that period.
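The date rule above can be sketched in a few lines of Python. This is only an illustration: the cutoff dates come from the text, while the function name, the labels, and the use of "construction" for dates between the two cutoffs are assumptions made for the example.

```python
from datetime import date

# Cutoff dates taken from the text; names and labels are illustrative only.
PRE_END = date(2004, 5, 26)    # sampling before this date is pre-restoration
POST_START = date(2005, 3, 2)  # sampling after this date is post-restoration


def restoration_period(sample_date: date) -> str:
    """Label a sampling date as 'pre', 'construction', or 'post'."""
    if sample_date < PRE_END:
        return "pre"
    if sample_date > POST_START:
        return "post"
    # Assumption: anything between the two cutoffs falls in the construction window.
    return "construction"


if __name__ == "__main__":
    # First day of three sampling events listed in the text.
    for d in (date(2001, 11, 28), date(2004, 5, 10), date(2005, 11, 29)):
        print(d.isoformat(), restoration_period(d))
```

Under this rule, all 9 events through May 2004 are labeled pre-restoration and all 10 events from Nov 2005 onward are labeled post-restoration.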

Groundwater was collected from the piezometers using low-flow pumping methods (Puls and Barcelona 1996) with a peristaltic pump through a flow cell and multi-meter instrument (Hach Co., Loveland, CO, USA). Surface water was collected via peristaltic pump for consistency with groundwater sampling. Field measurements for all samples included dissolved oxygen (DO; mg/L), pH, temperature (°C), oxidation reduction potential (ORP; mV), and specific conductance (mS/cm). Samples for C, N, and ion analysis were acidified to pH 2 and/or filtered with 0.45-micron filters and stored on ice until they could be analyzed in the lab.

Piezometers were installed along transects aligned perpendicular to stream flow in groups of 3 (one group in the channel, and one group on each bank) at 61, 122, and 183 cm below the surface to capture longitudinal and lateral flow (as described in Striz and Mayer 2008; Kaushal et al. 2008). Transects crossed the stream at geomorphic and restoration features of interest including cutbanks, gravel bars, terrace, riprap, and oxbows (Fig. 2). A total of 18 piezometers and 2 surface water stations were sampled at our control site at Intervale (IV) and 33 piezometers and 3 surface water stations were sampled at our restored site at Cromwell Valley Park (CVP). At control site IV, piezometers were arranged in 2 transects located 38 m apart across 2 consecutive meander bends. At restored CVP, piezometers were located downstream of USGS stream gage 0158397967 (Fig. 2) and arranged in 3 transects (71 m and 49 m apart). After restoration, some piezometers were replaced at the approximate original pre-restoration locations, where possible, or in comparable locations along the new post-restoration channel. Restoration involved redesigning the channel to fill heavily incised reaches that threatened damage to sewer infrastructure and, in the process, two bends in the channel were cut off to create oxbow wetlands (Fig. 2) that were the focus of previous studies (Harrison et al. 2011, 2012a, b, 2014).

Bi-weekly surface water chemistry

Independent of the 19 groundwater EPA sampling episodes described above, a second set of surface water samples for NO3 and specific conductance was collected by USGS (https://www.usgs.gov/mission-areas/water-resources/science/national-field-manual-collection-water-quality-data-nfm?qt-science_center_objects=0#qt-science_center_objects) approximately every 2 weeks at restored CVP (N = 278 sampling events) beginning 5 March 2002 and beginning 3 June 2004 at control IV (N = 201 sampling events), and continuing at both sites until 30 June 2008, capturing both pre- and post-restoration periods.

Laboratory chemical analyses

Chemical analyses followed methodology described in APHA (1998), USEPA (1983), and standard operating procedures (e.g. K-GCRD-SOP-1151–1) performed at the Robert S. Kerr Environmental Research Center, US Environmental Protection Agency (EPA; https://www.epa.gov/research/science-action-robert-s-kerr-environmental-research-center-ada-oklahoma) and at USGS (USGS National Water Quality Laboratory: https://www.usgs.gov/labs/nwql). Dissolved organic carbon (DOC) was measured directly on a Tekmar–Dohrmann instrument (Teledyne Technologies Inc., Los Angeles, CA, USA) via the UV-persulfate digestion method. Nitrite (NO2¯) and nitrate (NO3¯) were measured using a Lachat Flow Injection Analyzer (Hach Co., Loveland, CO, USA). Because nitrite was negligible in our samples, we refer to combined nitrite and nitrate as nitrate (NO3¯). Cl¯ was measured using capillary electrophoresis with indirect UV detection (Waters Corp., Milford, MA, USA).

Estimation of N fluxes and hydrologic metrics

Changes in NO3 and total nitrogen (TN) flux (kg/yr) were estimated using the R package EGRET (Exploration and Graphics for RivEr Trends) created by the USGS (Hirsch and De Cicco 2015), along with mean daily discharge and concentration data collected every 2 weeks at the Minebank Run stream gage (USGS 0158397967) by the USGS from 2002–2008 and by Pennino et al. (2016) from 2010–2012. The EGRET package applies the Weighted Regressions on Time, Discharge, and Season (WRTDS) smoothing method (Hirsch et al. 2010) to obtain estimates of annual, monthly, and daily flux, and annual flow-normalized (FN) flux and concentrations. Flow normalization is used to remove the impact of year-to-year variations in discharge, which helps in assessing changes over time due to changes in the watershed (Medalie et al. 2012). For the FN flux, the mean daily discharge for a particular day is averaged across all years of the data and then multiplied by the concentration on that day instead of just using the concentration and mean daily Q of that particular day (Medalie et al. 2012).
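As a rough illustration of the flow-normalization idea described above, the following Python sketch averages discharge for each calendar day across years and multiplies it by that day's concentration. It is not the WRTDS procedure implemented in EGRET, which estimates concentration by weighted regression; here concentrations are simply taken as given. All function names and the sample records are hypothetical. The unit factor follows from mg/L × m3/s = g/s, and 86,400 s/day ÷ 1,000 g/kg = 86.4.

```python
from collections import defaultdict
from datetime import date

# mg/L * m3/s = g/s; 86,400 s/day / 1,000 g/kg = 86.4 kg/day per (mg/L * m3/s)
KG_PER_DAY_FACTOR = 86.4


def daily_flux_kg(conc_mg_l: float, q_m3_s: float) -> float:
    """Daily flux (kg/day) from concentration (mg/L) and mean daily discharge (m3/s)."""
    return conc_mg_l * q_m3_s * KG_PER_DAY_FACTOR


def mean_q_by_calendar_day(records):
    """Average discharge for each (month, day) across all years in the records.

    records: list of (date, concentration mg/L, discharge m3/s) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for day, _conc, q in records:
        key = (day.month, day.day)
        totals[key][0] += q
        totals[key][1] += 1
    return {key: q_sum / n for key, (q_sum, n) in totals.items()}


def flow_normalized_flux_kg(records):
    """Simplified FN flux: that day's concentration times the across-year mean Q."""
    mean_q = mean_q_by_calendar_day(records)
    return [
        (day, daily_flux_kg(conc, mean_q[(day.month, day.day)]))
        for day, conc, _q in records
    ]


if __name__ == "__main__":
    # Hypothetical values for the same calendar day in two different years.
    demo = [(date(2003, 6, 1), 2.0, 1.0), (date(2004, 6, 1), 1.0, 3.0)]
    for day, flux in flow_normalized_flux_kg(demo):
        print(day.isoformat(), round(flux, 1), "kg/day")
```

In the hypothetical example, the raw discharges differ threefold between years, but both days are evaluated at the same mean discharge, so the remaining difference in flux reflects concentration alone.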

We calculated the flashiness index, a metric to assess the variability in mean daily discharge over a given period of time (Baker et al. 2004; Poff et al. 2006; Sudduth et al. 2011; Violin et al. 2011; Pennino et al. 2016). Greater hydrologic flashiness would indicate more rapid changes in discharge from one day to the next, which is more typical in urban streams in watersheds with high impervious surface compared to less developed or forested watersheds (Konrad et al. 2005; Walsh et al. 2005; Meierdiercks et al. 2010; Smith et al. 2013; Loperfido et al. 2014). Additionally, we applied the same methodology to precipitation data to calculate a “precipitation flashiness index,” based on mean daily rainfall data.
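A minimal sketch of a flashiness calculation follows, assuming the index takes the Richards-Baker form associated with Baker et al. (2004): the sum of absolute day-to-day changes in mean daily discharge divided by the sum of mean daily discharge over the same period. The function name and sample series are hypothetical, and the exact variant used in this study may differ.

```python
def flashiness_index(daily_values):
    """Richards-Baker style flashiness: sum(|q_i - q_(i-1)|) / sum(q_i).

    daily_values: sequence of mean daily discharge (or daily rainfall) values
    for the period of interest, in chronological order.
    """
    if len(daily_values) < 2:
        raise ValueError("need at least two daily values")
    total = sum(daily_values)
    if total == 0:
        raise ValueError("total flow is zero; index is undefined")
    # Path length of the hydrograph: absolute change from each day to the next.
    path_length = sum(abs(b - a) for a, b in zip(daily_values, daily_values[1:]))
    return path_length / total


if __name__ == "__main__":
    steady = [2.0, 2.0, 2.0, 2.0]   # hypothetical steady baseflow
    flashy = [1.0, 3.0, 1.0, 3.0]   # hypothetical rapid rise and fall
    print(flashiness_index(steady), flashiness_index(flashy))
```

The same function can be applied to daily rainfall totals to obtain an analogue of the precipitation flashiness index described above.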

NO3 flux and the flashiness index were calculated over two different time periods: 1) using just the USGS concentration dataset from 2002–2008 and 2) including an additional concentration dataset collected from 2010–2012 (Pennino et al. 2016). The NO3 and TN concentrations were taken from the same Minebank Run USGS gage station and show similar means and ranges between the two datasets (Fig. S1).

Statistical analyses

We used ANOVA to test for differences in chemistry (NO3, DOC, and Cl concentrations) between groundwater and surface water and between pre-restoration and post-restoration periods at restored CVP and control Intervale. We used mixed-model ANOVA with the restoration treatment (pre- and post-restoration) as a fixed effect and time (sampling period) as a random effect. Separate ANOVAs were run for each constituent (NO3, DOC, and Cl). Separate models were constructed for the restored CVP and control Intervale reaches and for groundwater and surface water. We considered samples as independent replicates because stream studies, including previous studies at Minebank Run (e.g. Groffman et al. 2005; Kaushal et al. 2008), have shown high variability across features or microhabitats in streams, especially in groundwater samples. This variability was likely due to a combination of stream geomorphology, hot spots of denitrification and organic carbon accumulation, and hydrology causing alternate flushing or legacy effects (Striz and Mayer 2008; Mayer et al. 2010b). Accordingly, we examined the spatial variability in NO3, DOC, and Cl concentrations across stream features because we expected that chemical behavior would be driven by spatial and temporal dynamics affected by geomorphic differences. We performed one-way ANOVA to compare NO3, DOC, and Cl concentrations among stream features, with separate ANOVAs for each constituent and for the pre-restoration and post-restoration periods. Tukey's post-hoc tests were performed to compare means within each comparison. We performed regression analyses on the USGS surface water samples over time. We developed separate regression models for the pre-restoration, construction, and post-restoration periods in order to capture trends across these distinct conditions. Data were analyzed using Systat 13.0 and SigmaPlot 14.0 software (https://systatsoftware.com).
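For readers unfamiliar with the one-way ANOVA used to compare stream features, the sketch below computes the F statistic from between-group and within-group sums of squares in plain Python. It is an illustration only: the analyses in this study were run in Systat and SigmaPlot, the mixed-model and Tukey steps are not reproduced, and the feature names and concentrations are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a fixed-effect one-way ANOVA.

    groups: list of lists, one list of observations per group
    (for example, NO3 concentrations grouped by stream feature).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical concentrations (mg/L) for three hypothetical features.
    cutbank = [1.0, 2.0, 3.0]
    gravel_bar = [2.0, 3.0, 4.0]
    oxbow = [5.0, 6.0, 7.0]
    print(one_way_anova_f([cutbank, gravel_bar, oxbow]))
```

A p-value would then be read from the F distribution with the returned degrees of freedom, which statistical packages handle directly.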

Results

Evaluating evidence for failure of stream restoration features after 2008

Annual peakflow events were used to estimate the 100-year recurrence interval discharge (Q100) to determine the size of storm events that may be large enough to cause damage to stream channels. Using a bootstrapping approach and the recurrence interval, we calculated Q100 for Minebank Run to be 32 cubic meters per second (m3/s) (https://www.usgs.gov/special-topics/water-science-school/science/100-year-flood; Fig. 4). Hawley (2018) suggested that, to reduce the impacts of 100-year flood events, “Practitioners are encouraged to size their grade control armor to actually resist entrainment at a defensible recurrence interval (e.g., Q100 with an approximately 25%–50% factor of safety) and provide adequate thicknesses of the stone layers both vertically and tied into the banks laterally.” At Minebank Run, based on monthly peakflow events, we found two large storm events producing discharge above 32 m3/s, in June 2006 and June 2012 (Fig. 4). Five more monthly peakflow events were within 25% of the Q100; these occurred in July 2008, August 2008, August 2011, September 2011, and August 2012 (Fig. 4). This corroborates the observed failure of many of the restoration features after 2008 (personal communication, Ed Doheny, USGS; Fig. 4).
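The screening of peakflow events against Q100 can be sketched as below. The Q100 value of 32 m3/s is from the text; interpreting "within 25% of Q100" as peakflows from 24 m3/s up to 32 m3/s is an assumption, as are the function name, labels, and sample values.

```python
Q100 = 32.0           # m3/s, 100-year recurrence interval discharge from the text
NEAR_FRACTION = 0.25  # assumption: "within 25%" means at least 75% of Q100


def classify_peakflow(peak_q_m3_s: float) -> str:
    """Classify a peakflow value relative to Q100."""
    if peak_q_m3_s > Q100:
        return "above Q100"
    if peak_q_m3_s >= Q100 * (1 - NEAR_FRACTION):
        return "within 25% of Q100"
    return "below"


if __name__ == "__main__":
    # Hypothetical monthly peakflow values (m3/s), not measured data.
    for q in (10.0, 24.0, 30.5, 33.0):
        print(q, classify_peakflow(q))
```

Events in the top two classes are the ones that would fall above the red line or inside the blue shaded band in Fig. 4 under this interpretation.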

Fig. 4

Monthly (a) and daily (b) peakflow, the discharge at which stormflow hydrograph is at its peak (m3/s) during the study period, calculated based on methods of Pennino et al. (2016). The red line is at 32 cubic meters per second (m3/s) and represents the 100-year recurrence interval discharge amount (Q100). The blue shaded region shows any peakflow values within 25% of Q100

Precipitation trends and hydrologic response to stream-floodplain reconnection

Annual precipitation during the study period ranged from 80 to 174 cm and was greatest in 2003 and 2011 and lowest in 2002 and 2007 (Fig. 5). Average annual discharge was highest in 2004 and 2003 and lowest in 2002 and 2007 (ranging from 900 to 3500 m3/s) (Fig. 6). The highest mean daily discharge days occurred on 27 Oct 2003, 8 Oct 2005, 13 Aug 2008, 11 Sept 2009, 26 Dec 2009, 30 Sept 2010, 10 March 2011, 7 Sept 2011, and 29 Oct 2012 (Fig. 6c). The top 10 peak flow events occurred on 12 June 2003, 8 Oct 2005, 25 June 2006, 23 July 2008, 13 Aug 2008, 28 Aug 2009, 14 Aug 2011, 7 Sept 2011, 1 June 2012, and 14 Aug 2012 (Fig. 4).

Fig. 5

Total annual (a), monthly (b), and daily precipitation (c), downloaded from the PRISM Climate Group (https://prism.oregonstate.edu/explorer/; latitude: 39.4200/-76.5788)

Fig. 6

Mean annual (a), monthly (b), and daily (c) discharge at Minebank Run restored CVP reach (USGS gage 0158397967)

A fundamental objective of the restoration was to reduce flashy flows during precipitation events. Monthly flashiness index, based on mean monthly discharge, decreased over time between 2002 and 2008 (Fig. 7a), but when extending the timeframe from 2002–2012, the flashiness index no longer shows a decline (Fig. 7b). Similarly, when specifically comparing the pre-restoration period (2002–2004) with the post restoration period of 2005–2008, there is a decline in the stream flashiness index (p = 0.03, Fig. 8a), but there is no difference when the post-restoration period is extended to 2012 (p = 0.11, Fig. 8b). By comparison, the flashiness of daily rainfall data (precipitation flashiness index) shows the opposite pattern with an increase in variability post-restoration (p = 0.05) in the 2002 to 2008 period and no significant change in the 2002 to 2012 period (p = 0.14, Figs. S2 and S3).

Fig. 7

The average monthly Hydrologic Flashiness Index: (a) from 2002–2008 and (b) from 2002–2012. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Fig. 8

Hydrologic flashiness compared during pre- and post-restoration periods for (a) 2002–2008 and (b) 2002–2012. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Comparing groundwater vs surface water chemistry

Groundwater NO3 concentrations did not differ from surface water NO3 concentrations during either the pre- or post-restoration periods at either the restored CVP or the control IV sites based on mixed-model ANOVA (p ≥ 0.09; Table 1; Fig. 9). In other words, groundwater NO3 resembled surface water NO3 at both reaches. DOC was always significantly higher in surface water than in groundwater both before and after restoration at both the control IV and the restored CVP reaches (p < 0.001; Table 1; Fig. 9), suggesting either that DOC was transported in surface water without being stored in groundwater or that DOC was consumed while in the subsurface. Groundwater and surface water Cl concentrations were similar at the restored CVP reach both before and after restoration (p ≥ 0.07; Table 1; Fig. 9). However, at the control IV reach, surface water Cl was double that in the groundwater both before and after restoration (p < 0.001; Table 1), suggesting that local runoff events in the headwaters influenced surface water salt chemistry. Cl concentrations at the downstream restored CVP reach (Fig. 9) were chronically elevated compared to the upstream control IV reach due to effects of the I-695 Beltway, a major freeway that received heavy deicer salt inputs. Groundwater was a reservoir for salt loads (Cooper et al. 2014), leading to similar groundwater and surface water Cl concentrations downstream (Fig. 9).

Table 1 Mixed-model ANOVA comparisons of nitrate (NO3), dissolved organic carbon (DOC), and chloride (Cl) concentrations (mg/L) in groundwater versus surface water at restored CVP and control Intervale (IV) sites during pre- and post-restoration time periods*. Water type (groundwater [GW] vs surface water [SW]) was used as a fixed factor while time (sampling period) was used as a random factor in the models. Table shows sample size (N), mean, standard error (SE), degrees of freedom (df), F statistics (F), and p-values (p). Statistical comparisons (df, F, p) represent a comparison of groundwater vs. surface water at each site during each time period. *Six outliers (NO3 > 6 mg/L) excluded from NO3 analyses. *One outlier (Cl > 400 mg/L) excluded from Cl analyses

Fig. 9

Minebank Run restored CVP and control Intervale (IV) reaches, NO3, DOC, and Cl concentrations (mg/L) in groundwater and surface water. Time is not to scale

Fig. 10

At Minebank Run, at both the restored CVP reach and the control Intervale (IV) reach, groundwater NO3 versus molar ratios of C:N exhibited a similar negative curvilinear relationship. This pattern is exhibited in aquatic ecosystems elsewhere (Taylor and Townsend 2010), and is indicative of microbial control of NO3 dynamics

Pre- and post-restoration patterns in groundwater and surface water chemistry

Groundwater NO3 concentration was significantly lower after restoration at the restored CVP reach based on mixed-model ANOVA (F1, 422 = 3.98; p = 0.05; Table 2; Fig. 9). The decline in groundwater NO3 concentration after restoration based on the EPA data was overall about 15% from a mean of 1.56 mg/L (SE ± 0.08) to 1.33 mg/L (SE ± 0.06). Surface water NO3 concentration also was lower after restoration at CVP (F1, 44 = 5.49; p = 0.02; Table 2; Fig. 9). The decline in surface water NO3 concentration after restoration based on the EPA data was overall about 33% from a mean of 1.61 mg/L (SE ± 0.03) to 1.08 mg/L (SE ± 0.08). However, both surface water and groundwater NO3 concentration at the control IV reach remained similar throughout the post-restoration period (p ≥ 0.18, Table 2; Fig. 9), suggesting that conditions upstream were not the cause of the downstream changes in NO3 and that the effect on NO3 at the CVP reach was due to the restoration.
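The percent declines quoted above follow directly from the reported means, as this short Python check shows. The means are from the text; the function name is illustrative.

```python
def percent_decline(before: float, after: float) -> float:
    """Percent decrease from a pre-restoration mean to a post-restoration mean."""
    return (before - after) / before * 100.0


# Mean NO3 concentrations (mg/L) at restored CVP, as reported in the text.
groundwater = percent_decline(1.56, 1.33)    # about 14.7%, reported as about 15%
surface_water = percent_decline(1.61, 1.08)  # about 32.9%, reported as about 33%

if __name__ == "__main__":
    print(f"groundwater: {groundwater:.1f}%")
    print(f"surface water: {surface_water:.1f}%")
```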

Table 2 Mixed-model ANOVA comparisons of nitrate (NO3), dissolved organic carbon (DOC), and chloride (Cl) concentrations (mg/L) during pre- versus post-restoration periods at restored CVP and control Intervale (IV) sites in groundwater (GW) and surface water (SW). Site (CVP vs Intervale) was used as a fixed factor while time (sampling period) was used as a random factor in the models. Table shows sample size (N), mean (\(\overline{\mathrm{X} }\)), standard error (SE), degrees of freedom (df), F statistics (F), and p-values (p). Statistical comparisons (df, F, p) represent a comparison of pre- vs post-restoration at each site for each type of water. *Six outliers (NO3 > 6 mg/L) excluded from NO3 analyses. *One outlier (Cl > 400 mg/L) excluded from Cl analyses

Neither surface water nor groundwater DOC differed after restoration at the CVP reach (p ≥ 0.65; Table 2; Fig. 9). Likewise, surface water DOC at the control IV reach did not differ after the downstream restoration (F1, 24 = 0.37; p = 0.55; Table 2; Fig. 9). However, groundwater DOC was lower after the restoration at the control IV reach for reasons we were not able to ascertain (F1, 181 = 10.71; p = 0.001; Table 2; Fig. 9). This difference did not seem to propagate downstream to the restored reach suggesting that more local sources of organic matter were transported to the stream.

Neither groundwater nor surface water Cl concentrations at the restored CVP reach differed after the restoration (p ≥ 0.21; Table 2; Fig. 9). This suggests that the change in NO3 concentration at CVP was due to biological control, because Cl is conservative and is not significantly altered by biological processes (Mayer et al. 2010b). Likewise, neither groundwater nor surface water Cl concentrations at the control IV reach differed after the restoration (p ≥ 0.26; Table 2; Fig. 9).

Restoration effects on biweekly surface water chemical concentrations and fluxes

Based on linear regression, surface water NO3 concentration in the intensive USGS surveys showed increasing trends prior to restoration and during construction at restored CVP (p ≤ 0.008; Table 3; Fig. 10). However, after restoration, NO3 concentration declined steadily (p < 0.001; Table 3; Fig. 10). Seasonal cycles were evident, with higher NO3 observed in winter, when temperatures and microbial and plant activity were lower, likely reducing uptake of NO3. Winter maximum concentrations exceeded 4 mg/L before restoration and 2 mg/L even after restoration (Table 3). NO3 was especially low (0.23 mg/L; Table 3) during a severe drought in 2002 and then rose concurrently with a rapid shift to a wet season in 2003 (Fig. 10). Similar trends occurred at control IV (Table 3; Fig. 10), suggesting that seasonal and interannual NO3 cycles and runoff effects propagated downstream to the restored site. However, there are no comparable pre- and post-restoration comparisons of NO3 at control IV because the start of sampling coincided with the restoration construction period at CVP (Table 3).
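To illustrate what a linear trend in concentration over time involves, the following is a minimal ordinary least-squares sketch. It is not the authors' analysis script: the function name `linear_trend` and the sample values are hypothetical, and the significance test behind the reported p-values is omitted.

```python
from statistics import mean


def linear_trend(t, y):
    """Ordinary least-squares slope, intercept, and R^2 of y regressed on t."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    ss_res = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not data from this study):
# time in years since the end of construction, NO3 in mg/L.
years = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
no3 = [2.0, 1.9, 1.7, 1.8, 1.5, 1.4, 1.3]

slope, intercept, r2 = linear_trend(years, no3)
print(f"slope = {slope:.3f} mg/L per year, R^2 = {r2:.2f}")
```

A negative slope corresponds to a declining trend of the kind described for the post-restoration period; strong seasonal cycles, as noted above, would lower R^2 even when the underlying trend is clear.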

Table 3 Linear regression trends of nitrate (NO3) concentrations (mg/L) and specific conductance (µS/cm) over time (2003–2008) in surface water collected by USGS during bi-weekly surveys at Minebank Run. *One outlier (specific conductance > 5000 µS/cm) excluded from analyses
Fig. 11

USGS bi-weekly Minebank Run surface water NO3 concentrations (mg/L) at restored CVP and control Intervale (IV). NO3 shows increasing trends prior to the restoration and during construction at the restored, downstream CVP reach. NO3 trends decline steadily after the restoration. Seasonal cycles are evident and NO3 was especially low during a severe drought in 2002 and then rose concurrently with a rapid shift to a wet season in 2003

Biweekly surface water sampling by USGS at restored CVP did not include regular Cl analysis. Instead, we relied on specific conductance as a surrogate measure of Cl, because specific conductance can reflect patterns similar to those of Cl flux (Cooper et al. 2014; Pennino et al. 2016). Like Cl, specific conductance was chronically higher at the downstream restored CVP site than upstream at control IV (Fig. 11). Specific conductance was relatively variable, exhibiting peaks and outliers (Fig. 11). Based on linear regression, specific conductance increased over time at restored CVP and control IV prior to restoration (p ≤ 0.05; Table 3). Post-restoration specific conductance trends were not significant at either restored CVP or control IV (p ≥ 0.11; Table 3).

Fig. 12

USGS bi-weekly Minebank Run surface water specific conductance (µS/cm) at restored CVP and control Intervale (IV). Surface water samples were collected bi-weekly at the restored CVP reach. Specific conductance is generally higher at restored CVP than at the upstream control Intervale because of the influence of the I-695 beltway and associated inputs of road salts. Trends overall appear to be increasing, with occasional extreme peaks from storm runoff, suggesting that road salts and/or other ions are influencing water chemistry at Minebank Run

Relationship between NO3 and DOC

Groundwater NO3 plotted against C:N at both the restored CVP reach (N = 477; \(\overline{\mathrm{X} }\) ± SE: 2.15 ± 0.35) and the control IV reach (N = 232; \(\overline{\mathrm{X} }\) ± SE: 3.17 ± 0.97) showed the same negative curvilinear relationship (Fig. 12). The highest NO3 concentrations were observed in piezometers where the C:N ratio was lowest. NO3 concentrations approached zero in groundwater samples where C:N was about 10:1.
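For readers unfamiliar with molar C:N ratios, the sketch below shows one way such a ratio could be computed from mass concentrations. It rests on assumptions not stated in this section: DOC is taken as mg C/L, NO3 as mg N/L, and NO3 is treated as the only N pool. The function name `molar_c_to_n` is hypothetical.

```python
C_MOLAR_MASS = 12.011  # g/mol, carbon
N_MOLAR_MASS = 14.007  # g/mol, nitrogen


def molar_c_to_n(doc_mg_c_per_l: float, no3_mg_n_per_l: float) -> float:
    """Molar C:N ratio, assuming DOC in mg C/L and NO3 in mg N/L."""
    if no3_mg_n_per_l <= 0:
        raise ValueError("NO3 concentration must be positive")
    mol_c = doc_mg_c_per_l / C_MOLAR_MASS
    mol_n = no3_mg_n_per_l / N_MOLAR_MASS
    return mol_c / mol_n


# Hypothetical sample for illustration only (not data from this study).
ratio = molar_c_to_n(doc_mg_c_per_l=1.2, no3_mg_n_per_l=0.2)
print(f"molar C:N = {ratio:.1f}")
```

Because the ratio has NO3 in its denominator, a plot of NO3 against C:N will tend to be curvilinear by construction, which is worth keeping in mind when interpreting the relationship.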

Chemistry response to channel geomorphology

NO3, DOC, and Cl concentrations in groundwater differed among stream features at the restored CVP reach before and after the restoration (p ≤ 0.05; Table 4). Before restoration, NO3 was highest in cutbanks (\(\overline{\mathrm{X} }\) ± SE: 2.6 ± 0.3 mg/L; Table 4) followed by concentrations below the stream channel and in terrace features (\(\overline{\mathrm{X} }\) ± SE: 1.53 ± 0.06 mg/L and 1.51 ± 0.08 mg/L, respectively; Table 4). DOC was highest in the subsurface of the stream channel (\(\overline{\mathrm{X} }\) ± SE: 1.18 ± 0.08 mg/L; Table 4), likely as a function of transport and groundwater-surface water mixing. Cl was highest in groundwater of gravel bars associated with meander features and below the stream channel (\(\overline{\mathrm{X} }\) ± SE: 163.5 ± 11.3 mg/L and 135.0 ± 8.3 mg/L, respectively; Table 4) and lowest in cutbank features (\(\overline{\mathrm{X} }\) ± SE: 59.8 ± 6.3 mg/L; Table 4) where the stream was most disconnected.

Table 4 ANOVA comparisons of nitrate (NO3), dissolved organic carbon (DOC), and chloride (Cl) concentrations (mg/L) in groundwater among stream features at Minebank Run. *Tukey's post-hoc tests were performed to compare means during the pre-restoration and post-restoration periods, respectively. Means with the same letter are not significantly different (p > 0.05)

After restoration, oxbows and riprap structures became new features of the system. NO3 was highest in oxbow features (\(\overline{\mathrm{X} }\) ± SE: 2.52 ± 0.25 mg/L; Table 4), suggesting higher retention of NO3 at this floodplain reconnection feature (Harrison et al. 2012a). Cutbank features were mostly eliminated after the restoration; however, NO3 was relatively low for the few samples collected (\(\overline{\mathrm{X} }\) ± SE: 0.91 ± 0.09 mg/L; Table 4). NO3 concentrations in the stream and at terrace features designed to connect the floodplain to the stream channel were similar (Table 4), suggesting again that the stream and terraced features were hydrologically connected. DOC remained highest (\(\overline{\mathrm{X} }\) ± SE: 1.23 ± 0.06 mg/L; Table 4) below the stream channel, perhaps owing to higher transport of incoming organic matter mixing in the subsurface. Chloride was highest at gravel bars and below the stream channel (\(\overline{\mathrm{X} }\) ± SE: 212.3 ± 13.9 and 128.5 ± 3.3 mg/L, respectively; Table 4), suggesting enhanced Cl storage in these features. Chloride concentration was similar among oxbows, riprap, and terrace features (Table 4).
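The comparisons among stream features rest on ANOVA, which asks whether variation between group means is large relative to variation within groups. As a minimal sketch of that idea, the code below computes a one-way ANOVA F statistic by hand. The function name `one_way_anova_f` and the sample values are hypothetical, and the Tukey post-hoc comparisons reported in Table 4 are not reproduced.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Between-group sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical groundwater NO3 values (mg/L) by feature, for illustration only.
features = {
    "cutbank": [2.4, 2.9, 2.5],
    "channel": [1.5, 1.6, 1.4],
    "terrace": [1.4, 1.6, 1.5],
}
f_stat, df_b, df_w = one_way_anova_f(list(features.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F indicates that at least one feature differs from the others; identifying which ones differ requires a post-hoc test such as Tukey's.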

Trends in NO3 and TN fluxes from bi-weekly monitoring at Minebank Run

Monthly NO3 flux estimates showed an increasing trend pre-restoration (R2 = 0.2, p = 0.01), and a decreasing trend post-restoration from 2005–2008 (R2 = 0.13, p = 0.01; Fig. 13a), but there was no change in flux post-restoration from 2005–2012 (R2 = 0.003, p = 0.26; Fig. 13b). A similar pattern was seen for TN flux (Fig. 13c, d). Comparing monthly pre- and post-restoration flux values showed a significant decline in flux in the first three years post-restoration (p < 0.001; Fig. 14a) and in the first seven years post-restoration (p = 0.01; Fig. 14b). The same pattern was found for TN flux (Fig. 14c, d). Annual flux and FN flux for both NO3 and TN declined steadily from 2002–2008 (p = 0.01 and p < 0.001, respectively; Fig. 15a, c). For the 2002–2012 period there was a less pronounced decline in FN NO3 and TN flux (p < 0.001), but no significant decline in annual flux (p = 0.22 and p = 0.27, respectively; Fig. 15b, d), perhaps because several constructed in-stream features were eroded and damaged after storms and the oxbow wetland began filling in around 2008.
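Flux in kg/month combines concentration with discharge. The sketch below shows only the unit conversion involved, assuming paired daily concentration (mg/L) and discharge (m3/s) values are available; it is a simplification and does not reproduce the estimation method used in the study. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_flux_kg(conc_mg_per_l: float, discharge_m3_per_s: float) -> float:
    """Mass flux in kg/day; note that 1 mg/L is equal to 1 g/m^3."""
    grams_per_second = conc_mg_per_l * discharge_m3_per_s
    return grams_per_second * SECONDS_PER_DAY / 1000.0


def monthly_flux_kg(daily_conc, daily_discharge):
    """Sum of daily fluxes (kg) over paired daily concentration and discharge."""
    return sum(daily_flux_kg(c, q) for c, q in zip(daily_conc, daily_discharge))


# Hypothetical 30-day month at constant 1.5 mg/L and 0.1 m^3/s (illustration only).
flux = monthly_flux_kg([1.5] * 30, [0.1] * 30)
print(f"monthly flux = {flux:.1f} kg")
```

Because flux scales with discharge, wetter periods can raise flux even when concentration falls, which is the motivation for the flow normalized (FN) values reported alongside the annual fluxes.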

Fig. 13

Trends in NO3 flux (kg/month) compared during pre- and post-restoration periods for (a) 2002–2008 and (b) 2002–2012. Trends in TN flux (kg/month) compared during pre- and post-restoration periods for (c) 2002–2008 and (d) 2002–2012. The blue regions represent the timeframe for the restoration and the pink region represents the timeframe for high peakflows, sufficient to cause geomorphic changes. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Fig. 14

Pre- and post-restoration box plots of monthly NO3 flux (kg/month) at Minebank Run for (a) 2002–2008 and (b) 2002–2012. Pre- and post-restoration box plots of monthly TN flux (kg/month) at Minebank Run for (c) 2002–2008 and (d) 2002–2012. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Fig. 15

Annual average NO3 flux (black dots and blue trendline) and flow normalized (FN; green) flux at Minebank Run for the (a) 2002–2008 and (b) 2002–2012. Annual average total nitrogen (TN) flux (black dots and blue trendline) and flow normalized (FN; green) flux at Minebank Run for the (c) 2002–2008 and (d) 2002–2012. The blue regions represent the timeframe for the restoration and the pink region represents the timeframe for high peakflows, sufficient to cause geomorphic changes. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

The EGRET package was also used to calculate mean annual flow normalized (FN) concentrations, and we found that FN NO3 and TN concentrations declined over the 2003–2008 period (R2 = 1.0, p < 0.001; Fig. 16a) and the 2003–2012 period (R2 = 0.99, p < 0.001; Fig. 16b). However, we found no relationship between NO3 concentration and discharge for either the 2003–2008 or 2003–2012 time periods (p ≥ 0.11). Only TN post-restoration had a significant positive relationship with discharge in both the 2002–2008 and 2002–2012 periods (p = 0.01; Fig. 17c, d). This is likely because discharge is more closely linked to particulate N than to dissolved N forms.
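A concentration-discharge relationship is often summarized by the slope of a regression of concentration on discharge. The sketch below fits that slope in log-log space, which is one common convention; whether the study used log-transformed values is not stated here, so this is an assumption, as are the function name `cq_slope` and the sample values.

```python
import math


def cq_slope(discharge, concentration):
    """Least-squares slope b of log10(C) = a + b * log10(Q)."""
    x = [math.log10(q) for q in discharge]
    y = [math.log10(c) for c in concentration]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical paired samples for illustration only (Q in m^3/s, TN in mg/L).
q = [0.05, 0.1, 0.5, 1.0, 5.0]
tn = [1.8, 1.9, 2.3, 2.4, 3.0]
b = cq_slope(q, tn)
print(f"log-log C-Q slope = {b:.2f}")
```

A slope near zero means concentration is insensitive to flow, while a positive slope means concentration rises with flow, as would be expected for constituents mobilized during storms.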

Fig. 16

Annual mean NO3 concentration (black dots and blue trendline) and flow normalized concentration trends (green) for the (a) 2002–2008 and (b) 2002–2012. Annual mean total nitrogen (TN) concentration (black dots and blue trendline) and flow normalized concentration trends (green) for the (c) 2002–2008 and (d) 2002–2012. The blue regions represent the timeframe for the restoration and the pink region represents the timeframe for high peakflows, sufficient to cause geomorphic changes. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Fig. 17

Nitrate (NO3) concentration vs discharge relationship during pre- and post-restoration periods for (a) 2002–2008 and (b) 2002–2012. Total nitrogen (TN) concentration vs. discharge relationship during pre- and post-restoration periods for (c) 2002–2008 and (d) 2002–2012. Two post-restoration periods are shown because there is evidence for failure of restoration features after 2008 which we attribute to damaging peak flows

Discussion

Chemistry trends at Minebank Run suggest that stream restoration involving floodplain reconnection has the potential to reduce NO3 concentration and NO3 flux at a reach scale, provided that the stream restoration remains stable and the geomorphic features that contribute to the processing and transformation of N remain intact. Based on the BACI design, there was no change in NO3 concentration in the control IV reach after restoration while there was a significant reduction in NO3 concentration in the impacted (restored CVP) reach, suggesting that the effect was due to restoration. Furthermore, the lower concentrations of NO3 in surface water and groundwater after restoration, with no coinciding change in conservative Cl and/or specific conductance, suggest that the decreases in NO3 are a function of biological uptake or denitrification (Mayer et al. 2010b) and not of simple dilution (Altman and Parizek 1995). Previous work at Minebank Run has shown that when stream channels are restored to allow overbank flow, denitrification rates are higher than where the channel flows through high, incised banks (Kaushal et al. 2008; Klocker et al. 2009). Below, we discuss the impacts of stream restoration on long-term patterns and processes of N transport and retention.

Stream-floodplain reconnection (and other management activities) influences long-term N transport and retention

Our study demonstrated that floodplain reconnection was an effective restoration approach for reducing NO3 concentration and flux in an urban stream. Both groundwater and surface water NO3 decreased at restored CVP after the restoration despite no change in NO3 concentration at the control site. Furthermore, NO3 flux continued to trend downward after the restoration. These results suggest that the restoration was effective at a reach scale in reducing NO3. Despite immediate positive effects of the restoration on nitrogen transformation, long-term flux measures of N showed that the restoration approach did not remain effective after several years because high flow events destroyed many of the restoration features.

While atmospheric N declines in the Chesapeake Bay area (Lovett et al. 2000; Linker et al. 2013) have contributed to corresponding declines in stream N (Eshleman et al. 2013), it is unlikely that such declines contributed to changes in chemistry at the restored reach of Minebank Run. The lack of observable declines at the control IV reach over time suggests that any probable reduction in atmospheric inputs had a much lesser effect on NO3 flux at Minebank Run than did restoration. Furthermore, terrestrial inputs are unlikely to have declined over time given the consistent urbanization pressure. Restoration at Minebank Run appears to have made an impact on NO3 flux through improved groundwater-surface water interaction and the initial reconnection of the floodplain to the channel. However, with numerous structural failures appearing along the restored reach (see Fig. S4a and b), long-term efficacy is in question. While restoration effects suggest that such management was effective at stabilizing current N inputs for the period of study, significant future reductions in NO3 at Minebank Run will likely require additional management efforts and/or reductions in watershed nutrient inputs.

Evidence that altered stream morphology enhances N transformation by increasing retention times in stream features and floodplains

The similarity of groundwater and surface water NO3 and Cl at the restored CVP reach suggests mixing of groundwater and surface water. NO3 and Cl patterns among stream restoration features varied before and after restoration suggesting that some features were more retentive of N. The oxbows created by the restoration had higher NO3 concentrations perhaps because they were designed to retain stormwater runoff (Harrison et al. 2014). However, these features promoted high rates of denitrification (Harrison et al. 2011), demonstrating that such wetlands have the potential to reduce NO3 in urban watersheds. The change in the proportions of riffles, runs, and pools may also have influenced the uptake of NO3. The proportion of pools, which were found to have higher sediment denitrification potential than riffles (Harrison et al. 2012b), increased after restoration relative to riffles (Doheny et al. 2012). Overall, this suggests that altering stream geomorphology features may enhance N transformation by maintaining anaerobic conditions and microbial activity that stimulate denitrification (Harrison et al. 2012b).

N sink and source dynamics vary among geomorphic structures depending on the influence of geomorphology on hydrology and subsequent microbial activity (Munn and Meyer 1990; Jones and Holmes 1996; Kemp and Dodds 2002; Fisher et al. 2005). There was considerable change in channel morphology (e.g. increased proportions of riffles, runs, and pools; reduced sinuosity; reduced bank elevation and slope; reduced incision). Collectively, these changes influenced hydrology (i.e. flashiness), and thus nutrient transformation processes. Furthermore, boundary shear stress was reduced, thereby reducing the erosive forces on the channel bed and banks of the stream (Doheny et al. 2007, 2012). Boundary shear stress and mean velocity values for Minebank Run were generally greater than those of non-urban B or C channel types (Doheny et al. 2012; Doheny and Baker 2018). The slope of the shear stress regression line for Minebank Run is considerably flatter than those for non-urban streams, suggesting that small changes in mean velocity and discharge result in large changes in boundary shear stress and susceptibility to erosion (Doheny and Baker 2018), which may be an underlying reason for structural failures at Minebank Run. This could have resulted in observed changes in the slope of the NO3 flux trend over the long term due to subsequent loss of N transformation functionality. Thus, reducing shear stress, flashiness, and scouring of organic matter, and instead controlling flows to facilitate formation of debris dams, may improve denitrification at Minebank Run and other similar urban streams (Groffman et al. 2005; Harrison et al. 2012b), but changes over time due to failures in geomorphic stability may have consequences for maintaining water quality functions.

Altered stream morphology after restoration also likely contributed to a reduction in hydrologic flashiness at Minebank Run. A previous study at Minebank Run showed that restoration lessened the positive relationship between precipitation and daily peak discharge, suggesting that restoration reduced overall flashiness of the system (Pennino et al. 2016). In another Minebank Run study, channel depth in the restored CVP reach was unchanged by restoration; however, channel width increased and, consequently, cross-sectional area increased, which caused a proportional decrease in mean flow velocity for comparable discharges (Doheny et al. 2012). Increased stream surface area has been shown across stream restoration approaches to improve nutrient retention (Newcomer-Johnson et al. 2016; Grant et al. 2018). Therefore, the increase in cross-sectional area may have contributed to the reduction in NO3 flux that we observed in this study.
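Flashiness can be quantified in several ways. The cited study assessed it through the relationship between precipitation and daily peak discharge; the sketch below instead uses the Richards-Baker flashiness index, a different and widely used metric, purely as an illustration of how flashiness might be computed from a daily discharge record. The function name and the sample series are hypothetical.

```python
def richards_baker_index(daily_q):
    """Sum of absolute day-to-day changes in flow divided by total flow.

    The first value serves only as the starting point for the first change,
    so the denominator sums the remaining days.
    """
    if len(daily_q) < 2:
        raise ValueError("at least two daily discharge values are required")
    total = sum(daily_q[1:])
    if total <= 0:
        raise ValueError("total discharge must be positive")
    path_length = sum(abs(b - a) for a, b in zip(daily_q, daily_q[1:]))
    return path_length / total


# Hypothetical daily discharge series (m^3/s), for illustration only.
steady = [1.0, 1.1, 1.0, 1.1, 1.0]
flashy = [1.0, 6.0, 1.5, 8.0, 1.0]
print(f"steady: {richards_baker_index(steady):.2f}")
print(f"flashy: {richards_baker_index(flashy):.2f}")
```

Higher values indicate a stream whose flow changes more abruptly from day to day, so a reduction in the index after restoration would be consistent with reduced flashiness.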

Enhanced hydrologic connectivity in stream restoration is linked to N retention and transformation

The decline in post-restoration NO3 concentrations and flux is supported by previous studies at Minebank Run showing that restored low “connected” banks consistently had higher in situ denitrification rates (Kaushal et al. 2008; Mayer et al. 2013). The restoration dramatically altered stream bed elevations in many places along the reach (Doheny et al. 2012), likely leading to improved connection between the stream channel and banks (Striz and Mayer 2008). Wider channel width and decreased channel incision may have increased hydrologic connectivity between groundwater and surface water and, thereby, affected denitrification rates (Groffman et al. 2002, 2003). Changes in channel morphology to a mix of runs, riffles, and pools may have also enhanced N uptake by increasing the turbulence along bioreactive microbial films that control denitrification (Grant et al. 2018). The step-pool sequences, stream barbs, and meanders such as those in low connected reaches have been shown to increase hydrologic residence times and nitrogen retention in transient storage zones at the riparian-stream interface (Kasahara and Hill 2006) relative to straighter runs (Hill et al. 1998; Gücker and Boechat 2004). Hydrologic flowpaths in the more connected banks may have fostered higher denitrification rates, as was demonstrated in previous work at Minebank Run using conservative tracer injections showing that lateral groundwater inputs along the riparian-stream interface can be substantial (Klocker et al. 2009).

Additional importance of organic matter and potential role for enhancing N transformation

DOC showed no change after restoration at CVP. However, DOC and NO3 are linked. NO3 declined with increasing C:N in a negative curvilinear trend, suggesting that N transformation is C limited, a result consistent with previous studies at Minebank Run (Mayer et al. 2010b) as well as patterns across ecosystem types showing DOC limitation of denitrification (Taylor and Townsend 2010). Therefore, factors that influence accumulation and processing of C are strong regulators of N dynamics in streams. NO3 approached zero where C:N was about 10:1, suggesting that management of stream N will be enhanced if riparian zones are kept intact to provide leaf litter as a C source (Wood et al., this issue), or where C is available from the rooting zone via exudates and decomposition (Gift et al. 2010), and where organic matter can be retained in the stream for denitrification (Groffman et al. 2005; Lazar et al. 2014). Current protocols for restoring streams in the Chesapeake Bay to manage N emphasize engineering floodplains to maintain a zone of rooted vegetation to maximize NO3 transformation potential (Urban Stormwater Work Group 2020).

Our observation of consistently higher DOC in surface water versus groundwater at Minebank Run suggests that organic matter does not mix in the subsurface and/or is consumed rapidly. Previous work has shown that DOC concentration is highest in surface water and declines with depth in the channel (Mayer et al. 2010b) likely due to limited water influx into the hyporheic zone. Organic matter is more abundant in upper root zone layers (Gift et al. 2010) and declines with depth in riparian soils (Groffman et al. 2002). Furthermore, particulate matter may be less likely to remain entrenched in interstitial zones or persist in debris and leaf packs in flashy streams (Groffman et al. 2005), thereby limiting the availability of organic matter in the subsurface. Because DOC is a critical limit to denitrification, restoration techniques designed to supply DOC more effectively to hyporheic zones and floodplains can optimize N removal (Newcomer-Johnson et al. 2016; Duan et al. 2019).

Previous research at Minebank Run showed that denitrification enzyme activity (DEA) and microbial biomass C were both higher in hyporheic sediments than in deep floodplain sediments, suggesting that the hyporheic zone is responding to and processing C and NO3 from upstream and/or riparian sources (Mayer et al. 2010b). These results also suggest that restoration that increases carbon availability in sediments could enhance denitrification capacity in stream ecosystems. At Minebank Run and other streams in the Baltimore area, denitrification potential was highest in organic debris dams and other features high in organic matter (Groffman et al. 2005). None of the stream features in Minebank Run were hot spots of high organic matter accumulation, probably because flashy stream flows frequently wash debris from the channel and downstream. Also, pools in both the restored and unrestored reaches of Minebank Run had lower denitrification enzyme activity than pools from other streams in the Baltimore area (Groffman et al. 2005). Strong positive relationships at Minebank Run between root biomass and soil organic matter, and between soil organic matter and denitrification potential, suggest that deep rooted vegetation may be particularly important for maintaining an active denitrification zone in restored riparian zones (Gift et al. 2010). However, stream restoration approaches that improve hydrologic connectivity between hyporheic zones and floodplains (so that organic matter reaches subsurface zones where there is low DO and adequate NO3 for anaerobic activity) are key to enhancing denitrification.

Reductions in NO3 and TN flux and flow normalized concentrations post-restoration

NO3 and TN flux showed a significant decline in the 2005–2008 post-restoration period compared to the 2002–2004 pre-restoration period, but the absolute NO3, TN, and FN fluxes increased slightly during the 2010–2012 post-restoration period. Mean annual concentrations and flow normalized concentrations continued to decline during the full post-restoration period 2005–2012. There are several possible explanations for the increase in NO3 flux in 2010–2012, including greater mean annual discharge and the observed failure of many of the restoration features after 2008. There is evidence that damage occurred to the restored features by 2013 and, based on the flow record, there were likely damaging peaks in 2009, 2011, and 2012 (personal communication, Ed Doheny, USGS; Fig. 4). There were peak flow events > 30 m3/s at least three times between August of 2011 and June of 2013 (Fig. 4). There was evidence of erosion at the toe of the banks in some of the cross sections in 2008 and again in 2011–2012, particularly August through September 2011, which was a wet period due to tropical storms. Some restoration features were degrading and many of the cut banks had re-established themselves. Based on observations by USGS (Ed Doheny, personal communication), by 2013, the area downstream of the sampling wells at restored CVP looked like degraded pre-restoration conditions, with cut banks re-established, cross vanes buried and damaged, rock weirs collapsed, and no maintenance of the grade control of the channel bed (Fig. S4a and b). By 2016, the restoration at CVP had largely reverted to a pre-restoration condition where incised banks had re-appeared and the stream was no longer connected to the floodplain. Additionally, the large oxbow feature created in 2004 to accept excess stormflow upstream of the USGS gage had filled with sediment by 2016, and perhaps as early as 2010, significantly reducing its original depth of about 1.5 m (Harrison et al. 2012a), and was no longer functioning to accept stormflow connected to the channel (personal communication, Ed Doheny).

The failure of restoration features after 2008 may have not only contributed to less NO3 removal due to less biogeochemical processing, but also less attenuation of the discharge (Woltemade and Potter 1994; Hammersmark et al. 2008; Sholtes and Doyle 2011; Jacobson et al. 2015). This may partly explain why discharge was higher in 2010–2012 even though precipitation did not differ between the periods of 2005–2008 and 2010–2012 (Fig. 5). Magnitude and intensity of individual storms may also have been a factor. Also, less infiltration and less attenuation of peak discharge may have occurred after the oxbow feature was dysfunctional and the stream was no longer connected to the floodplain (Fink and Mitsch 2007; Hudson et al. 2012; Harrison et al. 2014; Palmer et al. 2014). There may have been less removal of N from the stream compared to when the oxbows were at peak function in the first years after restoration (Harrison et al. 2014). Consequently, the increase in NO3 flux in 2010–2012 may be explained by the higher discharge and lesser biogeochemical processing due to the geomorphic failure of the restoration features and it may explain why stream flashiness, which initially declined after restoration, increased by 2008. Similarly, Thompson et al. (2018) observed little reduction in hydrologic flashiness at a reach scale after restoration at a stream in Maryland, USA.

Despite geomorphic failure of restoration features, the decline of flow normalized NO3 concentrations over time indicates that the stream was still able to continue removing NO3 through biogeochemical processing. Also, the lower slope of the NO3 and TN concentration-to-discharge relationships during post-restoration compared to pre-restoration indicates that, even at higher flows, N concentrations were lower after the restoration, which supports the interpretation that the restoration improved biogeochemical NO3 uptake activity across a range of flows.

Conclusions and recommendations

Our studies support the idea that in-stream processes and hydrologic connectivity between the stream channel and subsurface zones may influence N processing in urban streams (Craig et al. 2008; Kaushal et al. 2008). Restoration activities focused on increasing hydrologic connectivity in riparian zones may enhance denitrification rates by increasing soil organic C availability and altering hydrologic flowpaths (e.g. Fennessy and Cronk 1997; Groffman et al. 2003; Boulton 2007; Mayer et al. 2007). Because riparian soils, geomorphology, hydrologic flowpaths, N loads, and geology all play roles in explaining variation in denitrification rates (e.g. Alexander et al. 2000; Stanley and Doyle 2002; Groffman and Crawford 2003; Gücker and Boechat 2004; Wollheim et al. 2005; Mulholland et al. 2008), all of these factors should be considered and further evaluated when assessing the efficacy of restoration designs aimed at increasing both denitrification rates and mass removal of NO3 in riparian zones.

Restoration practices that improve hydrologic connectivity between the stream and the riparian zone can increase NO3 removal. Yet, more work is necessary to better quantify the effectiveness of stream restoration practices under various applications and conditions and over time (Stanley and Doyle 2002; Kasahara and Hill 2006; Bukaveckas 2007; Roberts et al. 2007). For example, little is known about how N removal is affected by changes in riparian plant communities, changes in soil organic matter, or variable stormwater flows and discharge. Furthermore, urban watersheds may behave differently than streams in less developed watersheds which have been more thoroughly studied (e.g. Peterson et al. 2001; Mulholland et al. 2004; Kaushal and Lewis 2005; Doheny and Baker 2018). Comparisons of streams in other regions and land use types will be critical in determining effectiveness of restoration and establishment of metrics to assess water quality improvements associated with stream and riparian restoration.

Performance among restoration designs and hydrologic conditions is likely to be highly variable, suggesting that stream restoration by itself is currently not adequate to mitigate excess N inputs or to compensate for stream destruction and degradation (Craig et al. 2008) and that a comprehensive approach must be taken for watershed management (USEPA 2008), including reducing effective impervious surface coverage in uplands (Walsh et al. 2005) and repairing aging infrastructure (Doyle et al. 2008). A combination of restoration features and stormwater management (Reisinger et al. 2019) may improve long-term efficacy of restoration. Structural failures among restored stream features also underscore the importance of designing restoration projects to be stable at Q100, with a factor of safety (Hawley 2018).

Minebank Run is one of thousands of streams leading to the Chesapeake Bay, representing about 5.2 km of the estimated 160,000 km of streams within the 166,000 km² watershed of the Chesapeake Bay (USEPA 2009). Total cost of restoration at Minebank Run was $4.4 million (in 2005 US$; Baltimore County Department of Environmental Protection and Sustainability, unpublished). Employing this type of restoration solely for managing N is prohibitively expensive and likely will not address the impacts of future N loads and sources. While stream restoration is not the primary solution to N management in Chesapeake Bay, it has numerous cumulative, potential benefits that may justify the costs of such efforts, including sediment and erosion control, protection of property, increased property values, fish and wildlife habitat and migration corridors, green space, stream temperature control, improved ecosystem metabolism, and maintenance of riparian zones.
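
To put these figures in perspective, a rough back-of-the-envelope calculation using only the numbers above can be sketched as follows. The per-kilometer cost assumes, purely for illustration, that the full $4.4 million applies to the full 5.2 km of stream length; the length actually restored may differ, so this value should not be read as a reported unit cost.

```latex
\[
\frac{5.2\ \text{km}}{160{,}000\ \text{km}} \approx 3.3 \times 10^{-5} \approx 0.003\%
\qquad
\frac{\$4.4\ \text{million}}{5.2\ \text{km}} \approx \$0.85\ \text{million per km (illustrative assumption)}
\]
```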

The longevity and efficacy of restoration projects, especially under repeated storm effects, increasing urbanization, or the effects of salinization and chemical cocktails (Cooper et al. 2014; Kaushal et al. 2020), are currently unknown. Storms after the restoration may have exceeded the channel design discharge for which the restoration at Minebank Run was engineered, and/or the bankfull dimensions may have been too difficult to identify accurately (Sortman 2004). Soil instability and poor vegetation reestablishment also may lead to erosion during overbank flows (Sortman 2004). Natural channel design (NCD) restoration approaches integrate fluvial processes of “self-formed and self-maintained natural rivers” (Rosgen 2011). Because urban streams self-form in highly altered landscapes, applying NCD techniques in an effort to reconstruct a stable reference reach can yield unexpected outcomes, and the geomorphologic evolution of such restored streams has not been fully documented. While NCD may produce a more functional stream in terms of flashiness or nutrient uptake, metrics of success and failure have not yet been fully established and protocols for stream restoration intended to reduce N transport are evolving (Urban Stormwater Work Group 2020). Furthermore, ecosystem functions in urban streams such as nutrient uptake may be more resilient to disturbance, storm events, and floods because of functional redundancy of the microbial communities driving processes such as denitrification (Utz et al. 2016; Reisinger et al. 2017). Future studies should investigate effects of varying N loads, hydrologic residence times, hydrological connectivity, and seasonality on denitrification rates in restored streams and under different land uses and stream flow conditions (e.g. Tague et al. 2008).
Further research on coupled restoration practices and stormwater management may be useful because it may be desirable to create conditions that favor high denitrification rates in urban areas where water from the landscape is concentrated (Pennino et al. 2016). In addition, the structure and function of urban streams can evolve over time with management or degradation (Kaushal et al. 2014a, b; Kaushal et al. 2015). To our knowledge, this study is among the first to examine long-term changes in the biogeochemistry and hydrology of a restored stream, and more work is needed to analyze long-term changes in material transport and retention in restored streams.

Restoration designs are heterogeneous efforts consisting of various components such as bank re-shaping, bank stabilization, channel reconstruction, and riparian re-vegetation. Therefore, study designs that distinguish the influence of individual restoration components will help to identify those techniques that contribute most to nutrient uptake and other ecosystem functions of concern. Also, because in situ N uptake and transformation are notoriously difficult to measure (Groffman et al. 2006), studies that measure denitrification or surrogates of denitrification (e.g. oxidation–reduction potential, DO) at watershed scales and over time will be most useful in quantifying restoration effects. Finally, long-term monitoring will better elucidate short- and long-term patterns of nutrient dynamics (Goodale et al. 2003, 2005; White et al. in prep). Thus, long-term studies like ours (Pickett et al. 2001; Mayer et al. 2010a) are critical for understanding restoration effectiveness.