Introduction

Forests are a major contributor to the terrestrial carbon sink1,2, which offsets approximately \(25\,\hbox {to}\,30\,\%\) of human-caused carbon emissions3. However, extreme events such as floods, droughts and storms pose a major threat to the viability of natural systems and are responsible for much of the interannual fluctuation in the global carbon cycle4,5. Extreme drought and heat events in particular have a severe negative impact on vegetation carbon uptake and storage capacity6,7, especially since they often occur simultaneously as so-called compound events8,9. Under the prevailing conditions of global warming, these events are expected to increase in both duration and intensity10,11. Understanding how those events alter forest carbon cycling is therefore critical for predicting future trends of the terrestrial carbon sink.

Continuous monitoring of ecosystems is an important source of information for studying the terrestrial carbon and water cycles. An effective approach is the eddy covariance method, which can be used to derive the exchange of trace gases such as carbon dioxide and water vapor between the land surface and the atmosphere12. Eddy covariance measurements provide valuable insight into how the forest responds to changes in water availability, which in turn affects the forest’s ability to sequester carbon. Especially when conducted over several years, these measurements allow changes in the terrestrial carbon sink to be detected over time and attributed to different drivers13,14,15. The method also serves as a source of ground truth data for developing land surface models16,17,18,19 and has contributed to our understanding of the role of extreme events in the terrestrial carbon cycle20,21,22.

Droughts can reduce the CO\(_2\) uptake of forests because water scarcity causes plants to alter their stomatal behavior to compensate for water loss23. Other non-stomatal plant processes such as changes in mesophyll conductance, Rubisco carboxylation capacity, and maximum electron transport rate14,24,25 can limit photosynthetic activity during droughts, but their interplay and implications for forest carbon cycling are unclear and the subject of ongoing research26,27. The study of drought impacts is further complicated by the fact that not all responses are immediately apparent28. Delayed effects can arise from, for example, hydraulic damage and dieback29,30,31,32, shifts in carbon allocation33,34,35 and carbon depletion36,37,38, changes in nutrient cycling39,40 and reduced resistance to pests41,42.

The challenge in identifying lagged effects is separating the impact of previous causes from that of current conditions. Lagged effects have mainly been studied on annual to monthly time scales using tree ring data43,44, remote sensing data45, carbon cycle models46, or a mixture of those47, but studies on the ecosystem scale using daily eddy covariance measurements remain rare28. Legacy effects have also been studied using regression analysis by identifying differences between predicted and actual ecosystem productivity in years following a drought event47,48. While regression models can be a powerful tool for understanding the functional relationships between ecosystem fluxes and potential drivers, attributing model errors to legacy effects remains tentative, as errors may originate from other sources such as unobserved disturbances or model misspecification49,50,51.

Here we take a similar approach, but aim to incorporate information about past climate states directly into the analysis. We combine local measurements from the Hohes Holz (DE-HoH) research site, which provide detailed insights into the carbon and water cycles and associated meteorological conditions from 2015 to 2020, with a high-resolution dataset of daily standardized drought information. This dataset is intended to support drought research at ICOS ecosystem sites by providing information on deviations from long-term climatology. Overall, our goal is to understand and analyse the (potentially delayed) responses of the terrestrial carbon cycle to the 2018 drought and heat event in a mixed deciduous forest in Germany. In particular, we aim to answer the following questions:

  • (a) what impact did the extreme drought and heat event in 2018 have on carbon, water and energy exchange at the forest ecosystem Hohes Holz?

  • (b) what are the temporal patterns at which aggregated climate conditions impact the ecosystem fluxes?

  • (c) can incorporating drought information in statistical models improve predictions, especially during legacy years?

Results and discussion

Meteorology of the years of observation and relation to long-term climatology

During the observation period from 2015 to 2020, years were generally warmer than average, while annual precipitation varied. Putting these measurements in historical context (1950 – 2021)52, Fig. 1 shows that recent years have been mostly drier than usual, with the exception of the second half of 2017. Panel (a) shows a time series of the Standardized Precipitation-Evapotranspiration Index (365 days SPEI)53, with blue colours indicating periods that are wetter than usual and red colours indicating periods that are drier than usual. While wet and dry periods generally alternate, the colour red dominates during the period of our measurements at DE-HoH. The extreme drought event in 2018 stands out in particular, with SPEI values below -2 standard deviations indicating an occurrence probability of less than \(2.3\,\%\). An event of similar severity was last recorded around 1960.
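The stated occurrence probability follows from the fact that SPEI values are expressed in units of standard deviations. A minimal sketch of that arithmetic, assuming the index follows a standard normal distribution exactly (the function name is illustrative):

```python
import math

def prob_below(z: float) -> float:
    """Probability that a standard normal variable falls below z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Probability of an SPEI value below -2 under the normality assumption.
p_extreme = prob_below(-2.0)
print(f"P(SPEI < -2) = {p_extreme:.4f}")  # prints P(SPEI < -2) = 0.0228
```

The result of about 2.3 % corresponds to the threshold quoted above.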

Figure 1

Climatic conditions of the observation years 2015–2020 in the context of long-term climatology (1950 - 2020) at the study site: (a) time series of 12-month SPEI from 1950 to 2020; the dashed vertical line indicates the start of the eddy covariance measurements, (b) annual values of precipitation and temperature of the six observation years at DE-HoH from 2015 to 2020 relative to the long-term data and (c) 3- and 12-month SPEI as well as local precipitation and soil water content at \(50\,\hbox {cm}\) soil depth during the observation period.

Examining the annual temperature and precipitation anomalies in panel (b) reveals the underlying patterns of recent dry years. The empirical distribution of precipitation during the observation period (blue) roughly matches the distribution of historical precipitation data (grey). In contrast, annual temperatures since 2015 have all been 1 to \(2\,^{\circ }\hbox {C}\) warmer than the long-term mean, suggesting that increased evapotranspiration due to warm conditions is having an additional negative impact on the water balance. Furthermore, 2018 stands out as the year with the highest temperature anomaly (\(+1.86\,^{\circ }\hbox {C}\)) and the third lowest precipitation sum (\(301\,\hbox {mm}\)), placing it at the extreme end of the joint distribution. This observation at the study site is consistent with other regions in Central Europe11. In comparison to the long-term annual values since 1950, the year of the last extreme drought event, 1959, had the lowest precipitation with \(296\,\hbox {mm}\), but temperatures were only slightly higher than average (\(+0.4\,^{\circ }\hbox {C}\)).

A detailed look at conditions during the measurement period is provided in panel (c) of Fig. 1, which shows the temporal evolution of local precipitation patterns, soil water content (SWC) at \(50\,\hbox {cm}\) soil depth and SPEI estimates for two aggregation periods of 90 days and 365 days. These reflect the short-term (seasonal, affecting the upper soil layer) and long-term (annual, affecting deeper soil layers) variability of the governing weather conditions. Soil moisture usually follows a periodic pattern, decreasing at the beginning of the growing season and replenishing after its end. Additionally, short-term rewetting due to large rain events occurs during the growing season, particularly in the summers of 2015 and 2017, when persistent rain resulted in a strikingly small decrease in soil moisture. Both SPEI time series reflect the respective cold-wet and warm-dry conditions, with the persistence of these conditions depending on the aggregation period. The extreme event in 2018 as well as the exceptionally quick transition from wet to extremely dry conditions54 is well reflected in both SPEI time series. The short-term SPEI (90 d) indicates that drought conditions persisted until the end of the year, making 2018 the only year in which soil moisture replenishment did not start until the following year. The beginning of 2019 was a short wet period, followed by another dry and warm, but not extreme, growing season. At the same time, the long-term SPEI (365 d) indicates persistent drought conditions, reaching its most extreme levels during spring and early summer 2019 and lasting until the end of summer 2019. The SPEI time series for the two aggregation periods highlight the difference between 2018 and 2019: while the meteorological conditions of 2019 were warm and dry, they do not in themselves meet drought categories (i.e. SPEI (90 d) \(> -1\)). Because winter precipitation was only moderate, the short-term drought was alleviated and soil water was replenished in early 2019, whereas the long-term drought (SPEI (365 d) \(< -1\)) continued throughout the summer of 2019.

Fingerprint of drought on ecosystem fluxes

Eddy covariance data from 2015 to 2020 show that the forest acts as a carbon sink with an average net ecosystem productivity (NEP) of \(\sim 362\,\hbox {g}\,\hbox {m}^{-2}\,\hbox {yr}^{-1}\). Annual and growing season values of eddy covariance and associated meteorological measurements are displayed in Table 1. In 2018, NEP was \(\sim 16\,\%\) higher than in the years before the drought. In contrast, NEP was \(\sim 25\,\%\) lower in 2019 and returned to an average level in 2020. The only year that had a higher NEP than 2018 was 2017, which received the most rainfall of the six years of observations with more than 600 mm annual precipitation. Table 1 indicates that these wet conditions in 2017 resulted in the lowest average vapor pressure deficit (VPD), air temperature (TA), and sensible heat flux (H). In contrast, the dry conditions in 2018 led to the highest VPD and H fluxes.

Table 1 Annual sums of net ecosystem exchange (NEE), gross primary productivity (GPP), ecosystem respiration (Reco) and precipitation (Prec) as well as annual averages of incoming short-wave radiation (SW\(_\text {IN}\)), latent heat flux (LE), sensible heat flux (H), net radiation (Rn), air temperature (TA), vapour pressure deficit (VPD) and length of the growing season derived from Phenocam pictures (cf. Fig. S1). The uncertainty estimates for the eddy covariance carbon fluxes are based on the bootstrapped u\(^*\) threshold, while for the other variables it is the standard error of the annual mean.

The growing season values of the carbon fluxes, shown in Table 1, highlight the severity of the drought in 2018. Only \(43\,\%\) (\(130\,\hbox {mm}\)) of the total precipitation (\(302\,\hbox {mm}\)) fell during the growing season that year, compared to \(67\,\%\) (\(390\,\hbox {mm}\)) in 2016, the second driest year of the study period. Nevertheless, net carbon uptake in 2018 was average over the observation period. In contrast, despite average precipitation and incoming radiation, 2019 had the lowest net carbon uptake. This suggests that the extreme heat and drought event in 2018 had a lasting impact on ecosystem fluxes in 2019. Both of these years also had significantly longer growing seasons, with an additional 19 days according to Phenocam estimates (see Supplement Fig. S1). Bastos et al.55 have shown that in Europe in 2018, as during the 2003 drought, high spring productivity offset the drought-induced reduction in summer carbon sequestration in terms of the annual carbon balance, similar to what we observed at our research site. This highlights the need to consider potential impacts on legacy years, as consequences for the carbon balance might not be prominent in the year of the heat wave itself.

Figure 2

Time series of the ecosystem fluxes net ecosystem productivity (NEP), gross primary productivity (GPP), ecosystem respiration (Reco), latent heat (LE) and sensible heat (H) at (a) monthly, (b) daily and (c) diurnal scale.

Fig. 2 shows how compounding dry and hot conditions affected ecosystem fluxes at seasonal to hourly time scales. On a monthly scale (panel (a)), net carbon sequestration in May and June was equal or even higher in 2018 than in other years, indicating that warm temperatures, clear skies and high radiation boosted gross primary productivity (GPP) until water deficits most likely caused stomatal closure to regulate transpiration from July on56. While monthly averages of NEP were only slightly below average during the growing season in 2019, GPP and ecosystem respiration (Reco) were strongly reduced. The consequences for the carbon balance become especially apparent for 2019 when fluxes are accumulated over the whole year, as done in panel (b), where deviations from the usual annual progression of accumulated fluxes start notably in July.

Regarding the energy fluxes, sensible heat (H) usually peaks in April and decreases once the growing season begins, as vegetation activity alters the Bowen ratio57. Due to an early onset of the growing season, H is lower in April 2018 and 2019 compared to the average of the remaining years. Also remarkable is the increase of H in July 2018, which demonstrates how the drought decreased evapotranspiration, i.e. LE relative to H. This behaviour of the Bowen ratio is known to appear under dry soil conditions58,59,60. LE shows a remarkable pattern in 2018, with above-average values through May, followed by the lowest monthly averages of the observation period during the summer months, giving an overall picture of a forward-shifted growing period. From July on, sensible heat in summer 2019 also increased to levels comparable to 2018, indicating recurring drought stress on the ecosystem61.
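The Bowen ratio referred to above is the ratio of sensible to latent heat flux. A minimal sketch of how a shift from LE towards H shows up in this ratio; the flux values are hypothetical and chosen for illustration only, not taken from the measurements:

```python
def bowen_ratio(h: float, le: float) -> float:
    """Ratio of sensible (H) to latent (LE) heat flux, both in W m-2."""
    if le == 0:
        raise ValueError("latent heat flux must be non-zero")
    return h / le

# Hypothetical monthly mean fluxes (W m-2), for illustration only.
wet_month = bowen_ratio(h=40.0, le=100.0)  # energy mostly used for evapotranspiration
dry_month = bowen_ratio(h=80.0, le=50.0)   # dry soil: energy shifts to sensible heat

print(f"wet: {wet_month:.2f}, dry: {dry_month:.2f}")
```

A ratio above 1 means that more of the available energy heats the air than evaporates water, which is the pattern described for July 2018.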

Panel (c) of Fig. 2 reveals the average diurnal timing of anomalies in the fluxes over the vegetation period. The strongest anomalies in the carbon fluxes occur during the afternoon, while H shows the strongest differences between drought and non-drought conditions around midday. Note that for visual reasons, we averaged the diurnal cycle for both short- and long-term drought. This shows that despite temporarily increased photosynthesis in the early summer of 2018, the drought had a negative impact on diurnal carbon sequestration over the entire drought period.

Figure 3

Relationship between net ecosystem productivity (NEP) and photosynthetic photon flux density (PPFD, a) and evapotranspiration (ET, b), respectively, plotted on a monthly basis and separately for 2018, 2019 and the other years. Curve in panel a is the Michaelis-Menten equation fitted as described in Falge62, while in panel b it is a local polynomial smoothing curve63.

Table 2 Parameters of Michaelis-Menten equation of Fig. 3a, fitted as described in Falge (2001)62. \(\alpha \) is the ecosystem apparent quantum yield in \(\upmu \,\hbox {mol}({\hbox {CO}_2}) / \upmu \,\hbox {mol}({\hbox {Photon}})\) and reference GPP is the ecosystem productivity in \(\upmu \,\hbox {mol}_{\hbox {CO}_{2}}\,\hbox {m}^{-2}\,\hbox {s}^{-1}\) at light saturation of 2000 \(\upmu \,\hbox {mol}\,\hbox {m}^{-2}\,\hbox {s}^{-1}\).

The relationship between daytime CO\(_2\) fluxes and PPFD or evapotranspiration, respectively, reveals the efficiency of forest carbon sequestration under given light and water conditions. Net carbon sequestration per photon (Fig. 3a) and per unit of water evaporated (Fig. 3b) was average or even higher in May and June 2018, indicating the positive effect of the previous wet winter, which left soils well saturated, in combination with the warm and sunny early summer of 2018. Yet, in July 2018, carbon sequestration efficiency drops. The parameters for the light response curve in Table 2 show that reference GPP (at 2000 \(\upmu \,\hbox {mol}\,\hbox {m}^{-2}\,\hbox {s}^{-1}\)) reached only 75 % of the productivity of non-drought years in July and August 2018. Even though soil water content replenished to some extent in 2019, light efficiency remained at similarly low levels as in 2018. The relationship between carbon sequestration and evapotranspiration is very similar, except that it rises to near-normal conditions in July 2019 before dropping significantly again in August.
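To illustrate how the two parameters in Table 2 shape the light response, the following sketch evaluates a rectangular-hyperbola (Michaelis-Menten type) curve that is parameterised by the apparent quantum yield and a reference GPP at 2000 µmol m⁻² s⁻¹. This particular functional form is an assumption chosen because it matches the parameter definitions in the caption; the function name and all parameter values are hypothetical, and only the 75 % ratio is taken from the text.

```python
def gpp_light_response(ppfd, alpha, gpp_ref, ppfd_ref=2000.0):
    """GPP (umol CO2 m-2 s-1) at a given PPFD (umol m-2 s-1).

    alpha   : apparent quantum yield (initial slope of the curve)
    gpp_ref : GPP at the reference light level ppfd_ref
    The curve passes through gpp_ref when ppfd == ppfd_ref.
    """
    return alpha * ppfd / (1.0 - ppfd / ppfd_ref + alpha * ppfd / gpp_ref)

# Hypothetical parameters; only the 75 % ratio follows the text.
alpha = 0.05
gpp_ref_normal = 30.0
gpp_ref_drought = 0.75 * gpp_ref_normal

for ppfd in (250.0, 500.0, 1000.0, 2000.0):
    normal = gpp_light_response(ppfd, alpha, gpp_ref_normal)
    drought = gpp_light_response(ppfd, alpha, gpp_ref_drought)
    print(f"PPFD {ppfd:6.0f}: normal {normal:5.1f}, drought {drought:5.1f}")
```

With the same initial slope, a lower reference GPP mainly reduces uptake at high light levels, which is where the drought months separate most clearly from the others in a plot such as Fig. 3a.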

If the lower light use efficiency in panel (a) and water use efficiency in panel (b) are taken as indications of stress, the low efficiencies in June and August 2019 could be interpreted as water stress. Water stress starting in July 2018 is in accordance with the soil moisture data. However, although observed soil moisture down to \(50\,\hbox {cm}\) and short-term SPEI (90 d) were near normal at the beginning of the growing season 2019, both light and water use efficiency were surprisingly low. This indicates a (potentially) lagged effect from the previous year that is not related to the soil water content in the upper soil layers or to short-term drought.

Relationship between drought at various time scales and eddy covariance fluxes

To better understand how time scales of droughts and their timing during the growing season affect the ecosystem fluxes, we looked at their relationship with SPEI for multiple aggregation periods. Panel (a) in Fig. 4 shows the multi-year relationship between SPEI at aggregation times from 5 to 365 days and daily standardized ecosystem fluxes during the growing season. The relationship was examined for each day of the year using a rolling correlation with a window of five days (to avoid low statistical power64,65) and standardized observations over the six observational years. For each ecosystem flux, four areas (indicated with boxes in Fig. 4a) are highlighted where the standardized measurements are plotted against the SPEI values during that period (Fig. 4b).
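The rolling correlation described above can be sketched as follows. This is a minimal illustration, assuming that fluxes are z-transformed per day of year across the observation years and that all years within each five-day window are pooled before computing Pearson's r; the function name and the synthetic data are hypothetical and do not reproduce the published analysis.

```python
import numpy as np

def rolling_doy_correlation(flux, spei, window=5):
    """Pearson r between standardized flux and SPEI for each day of year.

    flux, spei : arrays of shape (n_years, n_days)
    For each centre day, values from all years inside the window are pooled.
    Days too close to the edges for a full window are returned as NaN.
    """
    flux = np.asarray(flux, dtype=float)
    spei = np.asarray(spei, dtype=float)
    # z-transform each day of year across the years
    z = (flux - flux.mean(axis=0)) / flux.std(axis=0, ddof=1)
    half = window // 2
    n_days = flux.shape[1]
    r = np.full(n_days, np.nan)
    for d in range(half, n_days - half):
        a = z[:, d - half:d + half + 1].ravel()
        b = spei[:, d - half:d + half + 1].ravel()
        r[d] = np.corrcoef(a, b)[0, 1]
    return r

# Synthetic example: 6 years, 150 growing-season days, flux tied to SPEI.
rng = np.random.default_rng(0)
spei_demo = rng.normal(size=(6, 150))
flux_demo = 2.0 * spei_demo + rng.normal(scale=0.5, size=(6, 150))
r_by_day = rolling_doy_correlation(flux_demo, spei_demo, window=5)
```

Repeating this for every SPEI aggregation period from 5 to 365 days would give one row per aggregation period, i.e. a correlation surface of the kind shown in Fig. 4a.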

Figure 4

Time series of running correlation between SPEI at aggregation scales from 5 to 365 days and standardized ecosystem fluxes (a). Panel (b) contains scatter plots of the z-transformed eddy covariance fluxes against the SPEI values of four selected areas from (a). (1) and (2) reflect the responses to short-term and long-term drought, respectively, at the beginning of the growing season, while (3) and (4) reflect the responses to short-term and long-term drought, respectively, in August. The blue line is a simple regression to illustrate the overall relationship within the individual windows.

The pattern of temporal correlation is similar among both carbon fluxes and energy fluxes, with the sign reversed between latent and sensible energy. Especially for the second half of the growing season, ET is mostly positively correlated across all aggregation scales of SPEI, while H is mostly negatively correlated. This implies that the Bowen ratio is more sensitive to climatic conditions during the second half of the growing season which can most likely be explained by the availability of soil moisture59. Soil moisture availability is contingent on weather patterns during the summer. Fig. 4 shows that the Bowen ratio in the latter part of the summer was influenced by both short-term and long-term weather conditions.

The patterns of carbon fluxes primarily show a negative correlation at the beginning of the growing season in May for shorter aggregation periods of SPEI, which transitions to a positive correlation for longer periods. Because ecosystem fluxes fluctuate strongly from day to day while the drought indices exhibit much higher temporal persistence, the strength of the relationship also varies from day to day. It therefore makes sense to also look at the correlations pooled over larger time periods, which we do as an example for May and August in panel (b) of Fig. 4, thereby reflecting the conditions at the start and in the middle of the growing season, respectively. While the wide dispersion of the data reflects the multiple influencing factors, the trend line (blue) shows whether or not there is a relationship with the drought indices when viewed over the entire period of the respective area. Here the slope of the regression line corresponds roughly to the average correlation within each of the four selected areas.

Area 1 corresponds to SPEI with aggregation periods of 30 to 90 days in May, reflecting short- to mid-term deviations from normal climate conditions. NEP and GPP in particular are notably negatively correlated, meaning that warm and dry spells before and at the beginning of the growing season enhance carbon sequestration. Note that at the start of the growing period in spring, despite moderately drier conditions (compared to climatology), forest soils hold sufficient moisture (in absolute terms) to adequately support phenological development. Thus, warm conditions and clear skies during spring further enhance vegetation development. However, as these conditions deplete the soil moisture storage during spring, forest productivity declines if rainfall does not supply adequate water in the subsequent summer66. In contrast, the long-term drought index in May, shown in area 2, is positively correlated with carbon fluxes as well as with ET. Areas 3 and 4 correspond to short- and long-term drought in August. NEP, GPP, and LE are negatively affected by drought regardless of aggregation time, which in turn favors an increase in H. Reco seems to be relatively unaffected by drought, as indicated by a slope close to zero. However, a closer look at the time series in panel (a) shows that the period included in the sample covers times with both positive and negative correlation, which average out when aggregated over longer periods.

Estimating forest productivity under extreme conditions

We found that carbon sequestration of the forest ecosystem was noticeably impacted during the extreme heat and drought of 2018, and that low light and water use efficiency persisted into the year 2019 (Fig. 3). Although this can be an indicator of prolonged drought stress67,68, reduced carbon sequestration in 2019 could also result from unfavorable weather conditions. To understand how much ecosystem productivity could be expected given the hydro-meteorological conditions, we use a regression model (i.e. Restricted Cubic Spline; RCS regression) to estimate GPP as a function of Photosynthetic Photon Flux Density (PPFD) and Soil Water Content (SWC) at \(50\,\hbox {cm}\) depth, and compare it against a model that includes additional information on drought conditions (SPEI_90 and SPEI_365). The presence of legacy effects may be indicated by a notable difference in the performance of the two regression models, as only one of the models has information on past climate conditions. We chose GPP rather than NEP as the response variable because changes in GPP can be attributed more directly to changes in the predictors than changes in NEP, which is the balance between GPP and Reco and would thus require additional consideration of changes in autotrophic and heterotrophic respiration.
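The model comparison described above can be sketched with a hand-written restricted cubic spline basis and ordinary least squares. This is a minimal illustration and not the fitting procedure used for the results: the truncated-power basis, the four knots at the 5th, 35th, 65th and 95th percentiles, the AIC computed up to an additive constant, all function names, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis: one linear column plus len(knots) - 2
    spline columns that are constrained to be linear beyond the outer knots."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    scale = (t[-1] - t[0]) ** 2  # keeps spline columns on a similar scale to x
    gap = t[-1] - t[-2]

    def cube(u):
        return np.clip(u, 0.0, None) ** 3

    cols = [x]
    for tj in t[:-2]:
        term = (cube(x - tj)
                - cube(x - t[-2]) * (t[-1] - tj) / gap
                + cube(x - t[-1]) * (t[-2] - tj) / gap)
        cols.append(term / scale)
    return np.column_stack(cols)

def design(predictors, n_knots=4):
    """Stack the RCS bases of several predictors, knots at fixed quantiles."""
    q = np.linspace(0.05, 0.95, n_knots)
    return np.column_stack([rcs_basis(p, np.quantile(p, q)) for p in predictors])

def fit_ols(X, y):
    """Least-squares fit; returns adjusted R^2 and AIC (up to a constant)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, p = A.shape
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    adj_r2 = 1.0 - (rss / (n - p)) / (tss / (n - 1))
    aic = n * np.log(rss / n) + 2 * p
    return adj_r2, aic

# Synthetic data for illustration only; not the DE-HoH measurements.
rng = np.random.default_rng(42)
n = 600
ppfd = rng.uniform(100, 700, n)
swc = rng.uniform(10, 30, n)
spei_90 = rng.normal(size=n)
spei_365 = rng.normal(size=n)
gpp = (12 * ppfd / (ppfd + 300) - 0.02 * (swc - 21) ** 2
       + 1.0 * spei_365 + rng.normal(scale=0.5, size=n))

r2_simple, aic_simple = fit_ols(design([ppfd, swc]), gpp)
r2_full, aic_full = fit_ols(design([ppfd, swc, spei_90, spei_365]), gpp)
print(f"simple: adj. R2 = {r2_simple:.2f}, AIC = {aic_simple:.0f}")
print(f"full:   adj. R2 = {r2_full:.2f}, AIC = {aic_full:.0f}")
```

Because the synthetic response depends on the long-term index by construction, the fuller model fits better here; with real data, the size of that gap is the quantity of interest.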

Figure 5

Scatter plot of observed vs. predicted GPP using restricted cubic spline regression for years 2015 to 2020. Predictions are based on soil water content (SWC) and photosynthetic photon flux density (PPFD) (gray dots) and considering additionally SPEI_90 and SPEI_365 (blue dots). Results are plotted for each year separately and with root mean squared error (RMSE), mean percentage error (MPE) and R\(^2\) adjusted for number of predictors.

Overall, the RCS model with SPEI had higher explanatory power (adj. R\(^2\) = 0.64) compared to the model based solely on SWC and PPFD (adj. R\(^2\) = 0.54). The Akaike information criterion (AIC) of the complex model (AIC = 3572) was substantially lower than that of the simple model (AIC = 3786), indicating that the additional complexity is worth considering in terms of the increase in explanatory power. A comparison of actual versus predicted GPP for each year shows that in 2019, the mean percentage error (MPE) was -42.8 % (Fig. 5). This represents a significant overestimation of actual productivity, which is not seen in any other year. Including SPEI in the model reduces the MPE to -23.65 %, but the adj. R\(^2\) of 0.4 is still the lowest of all observation years. In 2020, there are no significant differences between the two models, indicating that the observed productivity can be explained by the model just as well as in other years without drought, and that GPP was not affected by legacy effects.
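The two error measures reported per year can be computed as sketched below. The sign convention for the mean percentage error (observed minus predicted, relative to observed) is an assumption, chosen so that negative values indicate overestimation as described above; the input values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mean_percentage_error(observed, predicted):
    """Mean percentage error in percent.

    Negative values mean that predictions exceed observations on average.
    Observed values must be non-zero.
    """
    n = len(observed)
    return 100.0 * sum((o - p) / o for o, p in zip(observed, predicted)) / n

# Hypothetical daily GPP values, for illustration only.
obs = [4.0, 5.0, 8.0]
pred = [5.0, 6.0, 10.0]

print(f"RMSE = {rmse(obs, pred):.2f}")                  # prints RMSE = 1.41
print(f"MPE  = {mean_percentage_error(obs, pred):.1f} %")  # prints MPE  = -23.3 %
```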

Other studies in Central Germany have found decreases in tree growth in 2019 in a floodplain forest69, but stronger legacy effects in 2020 than in 2019 in an old-growth and more diverse forest (DE-Hai)48. Yu et al.48 used a random forest regression model to quantify legacy effects, but focused on residuals rather than the difference in explanatory power. It is unclear whether the difference in lagged responses between this study site (DE-HoH) and the more diverse forest site (DE-Hai)48 is due to differences in forest structure or can be attributed to the different regression approaches. Both forests have European beech (Fagus sylvatica L.) as the dominant species, but DE-HoH also consists of 46% (basal area) sessile oak (Quercus petraea (Matt.) Liebl.), while DE-Hai has 28% ash (Fraxinus excelsior). Some studies attribute a higher sensitivity to environmental changes to ash than to oak70,71, but whether this explains the differences in lagged responses remains unclear and requires further research.

Figure 6

Global interpretation of the restricted cubic spline regression models with gross primary productivity (GPP) as response. Bivariate partial dependence plot between photosynthetic photon flux density (PPFD) and soil water content in 50 cm (SWC) (a) and SPEI_90 and SPEI_365 (b). Feature importance (c) is defined by \(\chi ^2\) minus degrees of freedom and degree of nonlinearity (d) is defined as sum of squares explained by the splines.

Parametric models with splines can account for non-linear responses. While machine learning algorithms can outperform classical statistical methods, they often come at the cost of reduced interpretability. In this study, we minimized the complexity of the model to allow a more robust evaluation of the model results. Optimal daily gross primary productivity of our forest ecosystem was found to occur at a soil water content (SWC) of approximately 20–22.5 vol. % and a daily average photosynthetic photon flux density (PPFD) of \(400\,-\,600\,\upmu \,\hbox {mol}\,\hbox {m}^{-2}\,\hbox {s}^{-1}\) (Fig. 6a). Based on the results of the RCS model, we notice that both long-term drought and short-term wet spells reduce forest productivity (Fig. 6b). This is in line with this region being potentially both energy and water limited. Energy-limited ecosystems can benefit in the short run from an increase in available energy72, as indicated by low SPEI (90 d). However, long-term drought can shift them to water limitation, and therefore values of SPEI (365 d) around zero are preferred.

The partial dependence plots in panels (a) and (b) of Fig. 6 need to be interpreted in conjunction with the relative importance of each feature (Fig. 6c). The majority of the additional contribution from the more complex model comes from the Standardized Precipitation-Evapotranspiration Index (SPEI) with a time window of 365 days, while SPEI with a time window of 90 days has only a small influence. PPFD is the most important variable, explaining most of the day-to-day variability, followed closely by SWC. Both were also identified as the most dominant predictors in other data-driven studies on forest productivity73,74,75.

The inclusion of SPEI as a covariate slightly reduces the relative importance of PPFD and SWC as dominant predictors. In environmental data, it is common for predictors to be correlated with each other to some extent. Therefore, the introduction of additional predictors is expected to affect the partial sum of squares. However, the small reduction in the chi-squared statistic for PPFD and SWC suggests that most of the information contributed by the additional predictors is independent of them rather than collinear. The RCS model also allows for the attribution of explained variance to the linear and spline terms in the equation (Fig. 6d). Overall, the splines accounted for approximately \(25\,\%\) of the explained variance, while the linear relationship accounted for the remaining \(75\,\%\).

Possible reasons for legacy effects and limitations of the study

The Standardised Precipitation-Evapotranspiration Index provides useful insights in two important ways. Firstly, it serves as a versatile multi-index due to its flexible parameterisation in terms of accumulation time. This allows different time periods to be considered and both short and long term anomalies to be assessed. Secondly, our study provides some evidence that the long-term SPEI can be an important tool for improving estimates of carbon uptake under legacy situations where it operates as a surrogate for otherwise challenging-to-measure responses at the leaf-level. However, the study is not sufficient to make a general recommendation, as the short observation period and the limitation to one site mean that the transferability to other sites and extreme events has not yet been tested. The use of a standardised drought index dataset76 would make it possible to identify potential common patterns of extreme events in similar ecosystems in a large-scale, multi-site study.

Possible mechanisms that could explain the legacy effects include hydraulic damage to the trees, which limits radial growth directly by decreasing photosynthetic capacity and indirectly through preferential allocation of photoassimilates to replenish non-structural carbohydrate stores28,77. This could partly explain the low carbon fluxes at the beginning of the growing season in the year following the extreme event, which is in line with other studies reporting delayed spring phenology in the subsequent year78. Measurements of gas exchange at leaf level, water potential and hydraulic conductivity would be required to advance the understanding of the mechanisms leading to the lagged effects, also in relation to the different responses of the individual species28. Furthermore, separate measurements of autotrophic and heterotrophic respiration would make it possible to differentiate between the response patterns of the individual respiration fluxes79. However, these measurements are very laborious and would need to be maintained for many years to capture both pre- and post-drought situations, so they may be more feasible in experimental and controlled environments.

Other factors may involve the loss of deep water reservoirs, which are challenging to observe but known to be linked to forest die-off80. In fact, our water level measurements show that the level dropped significantly in spring 2019 following the 2018 drought, and the actual depth could not be determined since then due to inaccessibility (cf. Fig. S2). It is reasonable to assume that this has implications for the water supply at the research site, as oaks, for example, are known to be able to reach deep water resources81. However, the specific rooting depth at our research site remains unknown, so the effect of the drop in water level is unclear.

Research into the mechanisms behind legacy effects is still in its infancy, with most evidence coming from dendrochronological analysis28. This has led to a range of drought effects being found in temperate and boreal forests, but the results of these studies are diverse, as the effects can be related to a variety of factors. Furthermore, the recent spread of eddy covariance data and drought studies has shown that during drought, tree-ring legacy effects are decoupled from gross primary productivity28,82,83. To attribute lagged responses in forest fluxes with certainty, future studies will need comprehensive measurements from before and after drought situations. Additionally, studies need to strategically integrate experimental research with model development to facilitate robust scaling from individual to ecosystem and remote sensing levels.

Conclusion

Extreme drought events pose a significant threat to forests and can disrupt ecosystem processes for multiple years. However, the rarity of such extreme events limits the opportunities for targeted ecosystem studies. Additionally, our process understanding of the impacts of drought on ecosystem functioning is still limited, making it difficult to accurately model the future trajectory of the terrestrial carbon cycle. In the face of global warming, it is crucial that we gain a deeper understanding of the consequences of extreme droughts in order to effectively model and mitigate their impacts. To assess the effects of an extreme drought event on ecosystem fluxes, we used a combination of eddy covariance and complementary hydrometeorological measurements from a deciduous forest from 2015 to 2020, along with the standardized precipitation-evapotranspiration index. In this context, it is important to consider the extreme event from two viewpoints. The 3-month SPEI (90 days) shows an extreme water deficit in summer 2018, while summer 2019 falls within the expected long-term climate variability. In contrast, the 12-month SPEI (365 days) identifies 2019 as an extreme drought year based on the aggregated water deficit. This emphasizes the importance of using standardized indices for accurately communicating and interpreting legacy effects to avoid misunderstandings about actual and lagged responses to extreme events76.

In 2018, the combination of a well-saturated soil and an early start of the growing season enhanced carbon sequestration. Despite severe drought stress later in the summer, the carbon uptake for 2018 was above average. This is consistent with other studies on the effects of drought on European forests, which have shown that the carbon balance during extreme event years may be less severely affected than expected due to compensatory effects within the annual cycle. In 2019, however, carbon fluxes decreased to record lows, likely due to a combination of legacy effects from the previous year at the start of the growing season and soil moisture stress later in the summer. Regression analysis indicates that this reduced carbon uptake could not be explained simply by the concurrent hydrometeorological conditions given by radiation and soil moisture measurements. However, using the standardized precipitation-evapotranspiration index (SPEI) as a covariate in regression modeling improved estimates of forest productivity under extreme conditions.

While further information, such as the dynamics of deep water reservoirs, would help to identify the causes of lagged responses, we demonstrated that using SPEI with different aggregation times as a covariate in regression modelling can improve estimates of forest productivity, making it a promising tool for improving our understanding of ecosystem carbon balance under extreme conditions. Although carbon fluxes returned to normal levels in 2020, it is too early to conclude that there will be no lasting impacts from the extreme drought event. Such impacts can persist for several years, and longer measurement records and further investigation are required to fully understand drought effects on ecosystem behaviour.

Materials and methods

Site description

The study area is part of the mixed deciduous forest ’Hohes Holz’ in the region of the Magdeburger Boerde in Central Germany (\(52^{\circ }\) 05’ N, \(11^{\circ }\) 13’ E, \(200\,\hbox {m}\) above sea level). Climate in the study area is subatlantic-submontane with a mean annual temperature of \(9.1\,^{\circ }\,\hbox {C}\) (climatic period 1981-2010, station Ummendorf of the German Weather Service), a mean minimum daily temperature in the coldest month (January) of \(0.7\,^{\circ }\,\hbox {C}\) and a mean maximum of the warmest month (July) of \(18.3\,^{\circ }\,\hbox {C}\). Annual mean precipitation was \(563\,\hbox {mm}\) during the climatic period 1981-2010, while annual precipitation during the investigated period, measured locally at the site, ranged from \(301\,\hbox {mm}\) in 2018 to \(610\,\hbox {mm}\) in 2017 (see Table 1). The forest stand is located in a mainly municipal forest area of about \(15\,\hbox {km}^{2}\), managed by regional forestry. Within a fenced area of about 1 ha, intensive measurement equipment related to the carbon, water and energy cycles of the forest ecosystem has been in place since 2013, including an eddy covariance tower for measurements of trace gas fluxes between the forest ecosystem and the atmosphere.

The stand within the fenced area comprises 245 trees and is dominated by European beech (Fagus sylvatica L.) and sessile oak (Quercus petraea (Matt.) Liebl.) (38 % and 45 % of total basal area, respectively), accompanied by hornbeam (Carpinus betulus L., 13 %). Tree height and diameter at breast height in 2020 were 24.0 ± 10.5 m and 0.34 ± 0.21 m for Fagus, 29.6 ± 3.1 m and 0.47 ± 0.11 m for Quercus, and 17.7 ± 5.6 m and 0.21 ± 0.07 m for Carpinus. Since 2011, the only trees lost from the fenced area were those harvested because they were in danger or those that fell during storms; the surrounding area, which is thinned regularly, therefore has a lower stocking degree. There were no further treatments of plants or trees on the research site. Additional selected tree plots (CPs), partly located outside the fenced area, were investigated according to the ICOS sampling design84. Within these plots, beech and oak are also the dominant species (41 % and 46 % of total basal area, respectively), accompanied by hornbeam (10 %), sycamore (Acer pseudoplatanus L., 2 %) and birch (Betula pendula Roth, 1 %). Inventory data outside the fenced area are mainly from field inventories performed during winter 2017 / 2018. The bedrock is Pleistocene loess (Weichsel), with Haplic Luvisol and Stagnic Gleysol as the predominant soil types. Soil texture at 0–20 cm depth is 3.0 ± 1.8 % sand, 87.1 ± 2.1 % silt, and 10.0 ± 2.2 % clay, with a pH-value of 8.0.

The study site ’Hohes Holz’ was established in 2013 within the framework of the TERENO-project as part of the Central German Lowland Observatory85 (https://www.tereno.net/) managed by the UFZ. Since the beginning of 2019, the station has met all instrumentation and sampling requirements of the ICOS ecosystem standards for class 1 stations86,87 (https://www.icos-cp.eu/observations/ecosystem). In addition to the ICOS requirements, several continuous and campaign-based measurements are performed, most of them related to the water balance of the forest ecosystem. Only those considered for the present analysis are detailed below. The station is equipped with line power and internet access, so that all continuous data are transferred to an institutional SFTP server on a daily basis.

Eddy covariance measurements

Continuous flux measurements with the eddy covariance (EC) technique have been performed at a height of 49 m on the scaffolding tower since July 2014 using an ultrasonic anemometer (CSAT-3, Campbell Scientific Inc., Logan, UT, USA) and an open-path infrared gas analyser (LI-7500, LiCor Inc., Lincoln, NE, USA) for carbon dioxide (CO\(_2\)), water vapour (H\(_2\)O), sensible heat (H) and momentum (\(\tau \)) fluxes88,89. The sonic anemometer is oriented towards west-south-west, the main wind direction prevailing in the area. High-frequency raw data are acquired with a CR3000 data logger (Campbell Scientific Inc., Logan, UT, USA) and collected, pre-processed and archived with the EDDYMEAS data acquisition module of Eddysoft90. Sampling frequency for wind components, sonic temperature and CO\(_2\) and H\(_2\)O concentrations is 20 Hz. In spring 2016, an additional EC system according to ICOS standards (GILL HS-50, Gill Instruments Ltd., Lymington, UK and LI-7200, LiCor Inc., Lincoln, NE, USA) was installed on the tower at the same height, where it remained until October 2018; it was moved to a height of 45 m thereafter.

Storage fluxes

CO\(_2\) and H\(_2\)O concentrations are measured along a vertical profile from 0.1 m above the soil surface up to a height of 49 m by drawing air through heated tubes of equal length to a valve system that delivers the air sequentially to a LI810A gas analyzer (LiCor Inc., Lincoln, NE, USA). The air from each of the 9 levels was analyzed for CO\(_2\) and H\(_2\)O for 60 s before switching to the next level, until the system was modified on June 14, 2018 to meet ICOS standards. With this change, 3 additional levels were added (0.1 m, 0.4 m and 10 m) and the time sampled per level was reduced accordingly.

Meteorological and hydrological variables

The station is equipped with sensors on the tower for atmospheric variables and in the surroundings of the tower for soil variables. The main variables are short- and long-wave radiation components of both hemispheres, photosynthetic photon flux density, air temperature, air humidity, air pressure, precipitation, soil heat flux, soil temperature and soil moisture (see Table 3). These measurements are replicated at different heights along the tower or at different depths in the soil, respectively, sampled at a frequency of 0.05 Hz and aggregated to 10 min values by the data loggers (CR3000 or CR1000, Campbell Scientific Inc., Logan, UT, USA). Table 3 contains further details regarding measurement location, abbreviations, sensor types and manufacturers.

Table 3 Main sensors for hydro-meteorological variables measured at the ecosystem station ’Hohes Holz’.

Data processing

Flux computation from high frequency (20 Hz) raw data of the CSAT-3 / LI-7500-EC-system is performed with the EddyPro\(\circledR \) software91 [LI-COR-Biosciences, 2017] with commonly used settings such as block averaging, Webb correction and planar fit for 4 sectors92. Subsequent post-processing steps such as estimating the u*-threshold, gap-filling and flux partitioning for net ecosystem exchange of CO\(_2\) (NEE) are performed with the REddyProc open source software package93 after adding the CO\(_2\)-storage change from the profile data94. Thresholds for u* are estimated with the moving point method95 using data where SW_IN < 10 W m\(^{-2}\), i.e. nighttime data. Bootstrapping is used to estimate the distribution of the u* threshold, and all subsequent processing is performed with the median threshold; the uncertainty range reported in Table 1 is based on the \(5\,\%\) and \(95\,\%\) quantiles of the threshold. Gaps in NEE, air temperature and vapor pressure deficit (VPD) were filled using the marginal distribution sampling method96 with default variables and margins (SW_IN \(50\,\hbox {W}\,\hbox {m}^{-2}\), TA \(2.5\,^{\circ }\,\hbox {C}\) and VPD \(5.0\,\hbox {hPa}\)). NEE was partitioned into gross primary productivity (GPP) and ecosystem respiration (Reco) using both the nighttime (NT)96 and the daytime (DT)97 approach. Both approaches rely on the assumption that NEE measured at night is essentially ecosystem respiration. The NT method is based on fitting the Lloyd and Taylor equation98, whereas the DT method is based on fitting a rectangular hyperbolic light-response curve62. We repeated all statistical analyses with partitioned fluxes of both methods (not shown) and found that the choice of partitioning method did not notably affect the analysis, so we only present results from the NT method.
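For orientation, the two functional forms named above can be written down compactly. The following Python sketch is illustrative only and is not the REddyProc implementation: the function names, the reference temperature of 15 °C, the temperature offset of -46.02 °C, the sign convention (negative NEE denotes net uptake) and all parameter values are assumptions made for this example.

```python
import math

# Temperature offset (deg C) commonly used with the Lloyd and Taylor form;
# treated here as an assumption, not as the setting used in this study.
T0_C = -46.02


def lloyd_taylor(temp_c, r_ref, e0, t_ref_c=15.0):
    """Respiration at temp_c, given the respiration r_ref at the reference temperature."""
    return r_ref * math.exp(e0 * (1.0 / (t_ref_c - T0_C) - 1.0 / (temp_c - T0_C)))


def light_response_nee(light, alpha, beta, reco):
    """NEE from a rectangular hyperbolic light response; negative means net uptake."""
    gpp = alpha * beta * light / (alpha * light + beta)
    return reco - gpp


# Hypothetical parameter values, for illustration only.
reco_20 = lloyd_taylor(20.0, r_ref=3.0, e0=200.0)
nee_noon = light_response_nee(800.0, alpha=0.05, beta=30.0, reco=reco_20)
```

In such a formulation, respiration equals r_ref at the reference temperature and increases with temperature, while GPP saturates towards beta at high light. In an actual partitioning the parameters would be estimated from the measured fluxes rather than prescribed.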

Phenocam

Pictures of the canopy were taken multiple times per day with a Stardot NetCam SC5 (Stardot Technologies, Buena Park, CA), which is mounted at a height of \(45\,\hbox {m}\), facing west, on the same tower as the eddy covariance system. Using the ’phenopix’ R-package99, we performed the following steps to extract information about the phenological state of the ecosystem. For each picture and a fixed region of interest (ROI), the RGB channel values were extracted and the relative greenness (G\(_{CC}\)) of each image was calculated100. Derived G\(_{CC}\) values were filtered for low illumination (G\(_{CC}<0.2\)), outliers14 and changes in scene illumination101. Afterwards, a double logistic sigmoid function was fitted to the filtered time series of G\(_{CC}\)102. Finally, the phenological transition dates were estimated using local extrema in the rate of change of curvature of the fit102,103. To estimate the uncertainty of the transition dates, the fit was repeated 100 times on data perturbed with random noise derived from the residuals between the original model fit and the observed data. The resulting fit is visualized in Supplementary Fig. S1. For 2015, the Phenocam images could not be analyzed using the described method due to incorrect color settings, so the phenological states were estimated from the images using expert knowledge.
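The greenness calculation and the low-illumination filter can be illustrated with a short Python sketch. It assumes that relative greenness is the green fraction of the summed red, green and blue digital numbers averaged over the ROI, which is the usual definition of the green chromatic coordinate. It does not reproduce the phenopix code, and the function names and pixel values are hypothetical.

```python
def green_chromatic_coordinate(red, green, blue):
    """Green fraction of the summed RGB digital numbers (ROI means)."""
    total = red + green + blue
    if total == 0:
        return float("nan")  # completely dark image, greenness undefined
    return green / total


def drop_low_gcc(gcc_series, threshold=0.2):
    """Discard values below the threshold; NaN values are dropped as well."""
    return [g for g in gcc_series if g >= threshold]


# Hypothetical ROI means (red, green, blue) for three images.
images = [(100, 150, 50), (90, 30, 80), (0, 0, 0)]
gcc = [green_chromatic_coordinate(*rgb) for rgb in images]
gcc_filtered = drop_low_gcc(gcc)
```

The outlier and scene-illumination filters as well as the double logistic fit described above are not part of this sketch.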

Standardized precipitation evapotranspiration index

To objectively quantify the drought at the research site, we used the Standardized Precipitation Evapotranspiration Index (SPEI)104. The SPEI takes into account both precipitation and temperature, the latter via potential evapotranspiration, when evaluating water conditions. Studies have demonstrated that the response of vegetation to water conditions can vary significantly, with some effects being observed within a few months and others taking several years to manifest105,106. Because the SPEI can be computed for different aggregation periods, it can be used to study vegetation response across different timescales. We used a dataset tailored to the ICOS sites with daily temporal resolution, constructed from the E-OBS dataset52. For transparency, we state that the authors of this study are also among the authors of the dataset. For a detailed description of the methodology used to create the drought indices, we refer to the data descriptor76.
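The role of the aggregation period can be made concrete with a simplified Python sketch. It only computes trailing sums of a daily climatic water balance (precipitation minus potential evapotranspiration) over a chosen window. The subsequent standardisation, in which a probability distribution is fitted to the aggregated balance and transformed to a standard normal variate, is deliberately omitted, so the output is not an SPEI value. The function name and the input numbers are hypothetical and do not describe how the dataset used here was produced.

```python
def aggregated_water_balance(precip, pet, window):
    """Trailing sums of daily (precipitation - PET) over `window` days.

    Returns None for days on which the window is not yet complete.
    """
    balance = [p - e for p, e in zip(precip, pet)]
    sums = []
    running = 0.0
    for i, value in enumerate(balance):
        running += value
        if i >= window:
            running -= balance[i - window]  # drop the day that left the window
        sums.append(running if i >= window - 1 else None)
    return sums


# Hypothetical daily values in mm; a real analysis would use windows of
# 90 or 365 days on multi-year series.
precip = [1.0, 2.0, 3.0, 4.0]
pet = [0.0, 1.0, 1.0, 1.0]
two_day_balance = aggregated_water_balance(precip, pet, window=2)
```

A short window responds quickly to a dry spell, whereas a long window carries a deficit forward for many months, which is why the 3-month and 12-month indices can classify the same summer differently.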

Statistical analysis

We used regression analysis to investigate the relationship between GPP and potential predictors. Predictors were selected according to the equation of photosynthesis, which states that both water and photons are needed to absorb \(\hbox {CO}_{2}\) from the atmosphere. Consequently, measurements of photosynthetic photon flux density (PPFD) were used as a predictor for light energy and upscaled soil water content measurements from \(50\,\hbox {cm}\) depth (SWC_50) were used as a predictor for water availability. Additionally, we tested whether drought indices could be useful predictors for including climatological effects in the models.

We used restricted cubic spline (RCS) regression to account for potential non-linear relationships. A cubic spline is a series of piecewise cubic polynomials joined at so-called knots, which define the number of pieces, and constructed so that the function is smooth. Restricted cubic splines are additionally constrained to be linear for \(x<k_{min}\) and \(x>k_{max}\), i.e. beyond the outermost knots, to optimize their behaviour in the tails107. The equation has the form of:

$$\begin{aligned} g(y) = \beta _0 + \beta _1x + \sum ^{k-1}_{i=2} \beta _i \cdot C_i(x), \end{aligned}$$
(1)

where g is a link function, k is the number of knots and \(C_i(x)\) is the cubic component108. It has been shown that the fit is not very sensitive to the position of the knots109, so default quantiles can be used. In practice, 3 – 5 knots are usually used109,110.
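To illustrate Eq. (1), the following Python sketch constructs the predictor columns of a restricted cubic spline in one common parameterisation, the truncated power basis with a normalisation by the squared knot range. This is a sketch under stated assumptions: the function name and the knot positions are hypothetical, and the software used for the analysis may scale the cubic components differently.

```python
def rcs_basis(x, knots):
    """Return [x, C_2(x), ..., C_{k-1}(x)] for sorted knots t_1 < ... < t_k."""
    k = len(knots)
    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2  # keeps the cubic terms on the scale of x

    def cube_plus(u):
        # Truncated cubic: u^3 for positive u, zero otherwise.
        return u ** 3 if u > 0 else 0.0

    terms = [x]
    for t_j in knots[: k - 2]:
        c = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        terms.append(c / scale)
    return terms


# Hypothetical knots; with k = 4 knots the basis has k - 1 = 3 columns.
knots = [0.0, 1.0, 2.0, 3.0]
row = rcs_basis(1.5, knots)
```

Below the first knot all cubic components are zero, and above the last knot the cubic and quadratic contributions cancel, so the fitted function is linear in both tails as required.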

We used Akaike’s Information Criterion (AIC) to find the optimal number of knots and to compare the overall performance of the regression models. AIC is a measure of the relative quality of statistical models that takes into account both the goodness of fit and the complexity of the model. Because of the latter, it is also suited to comparing models of different complexity, i.e. with different numbers of parameters, and to identifying whether the added complexity is justified. AIC is calculated as follows:

$$\begin{aligned} AIC = 2k - 2\ln (\hat{L}) \end{aligned}$$
(2)

where k is the number of parameters in the model (not the number of knots as in Eq. 1) and \(\hat{L}\) is the maximized value of the likelihood function of the model. A model with a lower AIC is considered to offer a better trade-off between model fit and degrees of freedom spent than a model with a higher AIC.
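As a worked illustration of Eq. (2), the Python sketch below computes the AIC from a log-likelihood. It assumes independent Gaussian errors with the variance estimated by maximum likelihood, which is an assumption of this example and not a statement about the models fitted in this study. The function names and the residuals are hypothetical.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized log-likelihood of a model with i.i.d. Gaussian errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def aic(n_params, log_likelihood):
    """Akaike's Information Criterion: 2k - 2 ln(L_hat)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical residuals of a simpler and a more flexible model.
resid_simple = [1.0, -1.0, 1.0, -1.0]
resid_flexible = [0.9, -0.9, 0.9, -0.9]
aic_simple = aic(2, gaussian_log_likelihood(resid_simple))
aic_flexible = aic(5, gaussian_log_likelihood(resid_flexible))
```

In this made-up case the slightly smaller residuals of the flexible model do not compensate for its three additional parameters, so the simpler model has the lower AIC.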

We calculated \(\chi ^2\) to determine feature importance of the predictors110:

$$\begin{aligned} \chi ^2 = \hat{\beta }_{S}^\top \widehat{\Sigma }_{S}^{-1}\hat{\beta }_S \end{aligned}$$
(3)

where S is a set of terms associated with the sub-model tested, \(\hat{\beta }_S\) is the corresponding subset of coefficient estimates and \(\widehat{\Sigma }_{S}\) is their covariance matrix. The sub-model is a version of the full model without the predictor in question. This is equivalent to performing an F-test on the significance of a predictor and multiplying the result by the degrees of freedom; the resulting \(\chi ^2\) statistic is used here as the feature importance. Likewise, we expressed the degree of non-linearity of a predictor as the ratio between the partial sum of squares of its spline terms and the partial sum of squares of both the linear and the spline terms of that predictor109.
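The quadratic form in Eq. (3) can be evaluated without explicitly inverting the covariance matrix by solving a linear system instead. The Python sketch below does this with a small Gaussian elimination routine so that it runs without external libraries; in practice a linear algebra library would normally be used. The function names, coefficients and covariance values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def wald_chi_square(beta_s, sigma_s):
    """Quadratic form beta_S^T Sigma_S^{-1} beta_S."""
    z = solve_linear(sigma_s, beta_s)  # z = Sigma_S^{-1} beta_S
    return sum(b * zi for b, zi in zip(beta_s, z))


# Hypothetical estimates for the linear and spline terms of one predictor.
beta_s = [2.0, 1.0]
sigma_s = [[4.0, 0.0], [0.0, 1.0]]
chi2 = wald_chi_square(beta_s, sigma_s)
```

With uncorrelated coefficient estimates, as in this example, the statistic reduces to the sum of squared coefficients divided by their variances, here 2²/4 + 1²/1.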