Introduction

Evapotranspiration (ET) is considered a key parameter needed to fine-tune irrigation, optimize water use, and enhance crop quality (Hargreaves and Samani 1982; Jensen et al. 1990; Allen et al. 1998; Kustas et al. 2018, 2019). Satellite remote-sensing technology offers the possibility of providing routine sub-field and inter-field-scale ET estimates and plant stress indices (Knipper et al. 2019a). Commonly, these ET estimates are based on process-based models using a combination of thermal, visible, shortwave, and near infra-red imagery (Glenn et al. 2007; Anderson et al. 2018; Gonzalez-Dugo et al. 2009; Anderson et al. 2012; Knipper et al. 2019a, b; Chen and Liu 2020). However, given the complexity of physical and biological processes controlling evaporation and transpiration, ET estimates from different modeling approaches might be subject to substantial uncertainty (Long et al. 2014; Chen and Liu 2020). Consequently, many remote sensing-based ET methods use ground observations for validation, improving model parameterizations, or embedded data within a given assimilation scheme.

Various methods, including surface energy flux partitioning and the soil water balance, have been used to ground-truth remotely sensed ET estimates. Multiple studies have compared ET derivations from modeling approaches based on remote sensing data at a wide range of temporal and spatial scales. These include lysimetry (e.g., Dhungel et al. 2021; Evett et al. 2012; Sánchez et al. 2019), profiles of soil water content (e.g., Guderle and Hildebrandt, 2015), eddy covariance (EC) (Dhungel et al. 2021; García-Gutiérrez et al. 2021; Knipper et al. 2020a; Sánchez et al. 2019, surface renewal (SR) (e.g., Xue et al. 2020), Bowen ratio systems (e.g., Evett et al. 2012), scintillometry (e.g., Evett et al. 2012; Geli et al. 2020), and other methods. The various approaches have different limitations or requirements regarding spatial representation, frequency of measurements, calibration of sensors, and environmental/landscape characteristics. Nevertheless, significant advances in the design and quality of sensors used in EC systems (Mauder et al. 2020) have led to heavy dependence on this technique as a reliable method for ET ground-truthing. In fact, flux networks, such as Ameriflux, Euroflux, AsiaFlux, and the global network FLUXNET (https://fluxnet.org/about/history/), rely on EC towers for monitoring water, carbon, and energy fluxes.

The EC method has not only evolved in terms of the instrumentation used but also in understanding the theoretical basis of eddy covariance measurements related to the quantification of energy, momentum, and mass transfer between ecosystems and the atmosphere (Webb et al. 1980; Leuning et al. 2012; Mauder et al. 2013, 2020; Stoy et al. 2013). In addition, while storage and data processing were challenging a few decades ago, drastic improvements in computing technologies have facilitated the conversion of raw instrumental data streams into quality-controlled fluxes. In addition, the micrometeorological community has also moved toward more standardized practices regarding sensor deployment, data processing, and corrections (Pastorello et al. 2020).

The EC method works well under ideal conditions, but these can be challenging to fulfill, especially in long-term studies that continuously characterize surface fluxes. For example, a flat, large, and homogeneous surface is ideal for implementing the EC method, which is uncommon in many agricultural settings. Also, flux observations represent a dynamic spatial area depending on wind and stability conditions; this area is referred to as the flux “footprint” (Chu et al. 2021). Thus, EC measurements can be challenging to interpret. When the source of fluxes is influenced by a heterogeneous landscape or during stable or weakly turbulent periods (i.e., usually at nighttime), the flux footprint might cover areas beyond the scope of the landscape of interest (Foken et al. 2005; Mauder et al. 2013). Another critical issue regarding the EC method is the assumption that advective fluxes can be ignored, but such conditions are common over multiple landscapes such as heterogeneous irrigated agricultural regions (French et al. 2012). Moreover, the sum of the components of the energy balance (EB) is often not zero, and this imbalance can make up 10 to 40% of the net radiation (Wilson et al. 2002; Franssen et al. 2010; Leuning et al. 2012; Kutikoff et al. 2019; Mauder et al. 2020; Dhungel et al. 2021). Energy Balance Residual (EBR) is commonly used to assess this imbalance with the following equation:

$$EBR = Q_{{{\text{net}}}} - G - H-\lambda E$$
(1)

where Qnet is the net radiation, G is the soil heat flux, H is the sensible heat flux, and λE is the latent heat flux. Canopy heat storage in biomass and water content and photosynthesis are often neglected energy terms (Lindroth et al. 2010), yet they can significantly contribute to the EBR, particularly the heat storage in trees at full canopy cover and at sub-daily timescales (Meyers and Hollinger 2004; Leuning et al. 2012).

The Grape Remote sensing Atmospheric Profile and Evapotranspiration eXperiment (GRAPEX) aims to develop a remote sensing-based ET toolkit using earth observations for monitoring vineyard water use and stress from within vineyard block to regional scales in California. In this context, ET estimates derived from eddy covariance (ETEC) are commonly used as a source of validation of remote-sensing ET estimates (Anderson et al. 2018; Knipper et al. 2019a, b, 2020b). Considering the challenges related to EC flux estimates under non-ideal conditions commonly found in California vineyards, it is crucial to quantify the uncertainty of daily ET fluxes for use in precision irrigation management. This study compares nine approaches to derive daily ET estimates from EC flux measurements. The analyses were performed based on a rich dataset collected across five vineyards between 2018 and 2020. We investigated how meteorological conditions, atmospheric stability, and advection serve as potential sources of uncertainty.

Approaches for daily ET calculations

The EC method can be used to directly quantify the exchange of water vapor fluxes from a crop surface to the atmosphere by the covariance between turbulent fluctuations of the vertical wind \(\left({w}^{^{\prime}}\right)\) and water vapor \(\left({q}^{^{\prime}}\right)\). Co-spectral analyses showed that an averaging period (AP) of 30 min was appropriate to capture low-frequency contributions to the fluxes. Therefore, the daily ET flux (ETEC) can be computed as the sum of all the ET flux for each AP \(\left( {ET_{EC}^{AP} = \overline{w^{\prime}q^{\prime}} } \right)\) in a given day. Therefore:

$$ET_{EC} = \sum\limits_{t = 0}^{t = n} {ET_{EC,t}^{AP} \times dt}$$
(2)

The corresponding time dependency of each \(ET_{EC}^{AP}\) is represented by t, thus \(n=24-AP/60\) to complete a full day cycle, and \(dt = AP \times 60{\text{ s}}\). Consequently, ETEC represents the total daily water vapor mass flux (i.e., g H2O m−2 d−1).

Since stable conditions are often prevalent during nighttime, the flux footprint can cover more extensive areas than the surface of interest, turbulence can be highly intermittent, nighttime ET fluxes are assumed to be negligible (Wever et al. 2002; Novick et al. 2004; Maltese et al. 2018). However, there is evidence that further efforts should be considered to quantify night-time ET (Fisher et al. 2007; Novick et al. 2009). Thus, \(ET_{EC - DT}^{AP} = \overline{w^{\prime}q^{\prime}} \times f\left( {SW} \right)\), and \(f\left( {SW} \right) = \left\{ {\begin{array}{*{20}c} {1,} & {SW_{IN} > 50\left[ {{\text{WM}}^{ - 2} } \right]} \\ {0,} & {SW_{IN} \le 50\left[ {{\text{WM}}^{ - 2} } \right]} \\ \end{array} } \right.\). In this case, a threshold of low incoming shortwave radiation is set as a proxy of an expected photosynthetic light below the compensation point for grapevines (Greer and Weedon 2012). Therefore, the \(ET_{EC - DT}^{AP}\) for each period becomes dependent on incoming shortwave radiation (\({SW}_{IN}\)). Thus, the “EC-DT” subscript refers to cases when only daytime fluxes contribute to the total daily ET. Therefore, in the same fashion that Eq. 2, the daily ET estimate can be estimated by Eq. 3.

$$ET_{EC - DT}^{{}} = \sum\limits_{t = 0}^{t = n} {ET_{EC - DT,t}^{AP} } \times dt$$
(3)

Observed lack of closure in the EB is a common issue in EC flux systems, in which sensible (H) and latent heat (λE) fluxes are not in close agreement with the total available energy (i.e., \(\left( {Q_{{{\text{net}}}} - G} \right)\). Therefore, based on conservation of energy principles, it is common practice to use the Bowen (Bo) ratio (H/λE) to partition the EBR (Twine et al. 2000; Barr et al. 2006; Widmoser and Wohlfahrt 2018). Then, the ET flux corrected by Bo can be estimated by Eq. 4.

$$ET_{B - SD}^{AP} = ET_{EC}^{AP} + \frac{{\left( {1 + B_{o}^{AP} } \right)^{ - 1} \times R^{AP} }}{\lambda } \times dt$$
(4)

where λ is the latent heat of vaporization of water. Similar to Eq. 2, once the Bo correction is applied, this daily ET flux can be computed by Eq. 5.

$$ET_{B - SD}^{{}} = \sum\limits_{t = 0}^{t = n} {ET_{B - SD,t}^{AP} } \times dt$$
(5)

Then, the limitations of nighttime fluxes could also be considered. In that case, the daily ET corrected by Bo assuming only daytime fluxes are relevant can be computed by Eq. 6.

$$ET_{B - SD - DT}^{{}} = \sum\limits_{t = 0}^{t = n} {ET_{B - SD,t}^{AP} \times f\left( {SW} \right) \times dt}$$
(6)

The Bo approach for H and λE correction is supported by evidence of meso-scale turbulent measurements (Mauder et al. 2020) and based on comparative studies using lysimeter ET estimates against EC λE estimates (Mauder et al. 2018). Nevertheless, when the EBR is considerable relative to the total available energy and there is evidence of less than ideal experimental conditions, it might be somewhat arbitrary to assume that the energy imbalance should follow the same ratio as the fluxes derived by the EC method (Charuchittipan et al. 2014). Alternatively, the issues regarding the EB closure are commonly mitigated when considering total daily fluxes. For example, observed hysteresis in the different fluxes is expected to be compensated when sub-daily fluxes are integrated throughout the day (Dhungel et al. 2021). Also, storage components might become less relevant on a daily scale. Furthermore, daily adjustment has been linked to less scatter than a complete partitioning of the residual for every AP (Mauder et al. 2018). Therefore, in a similar fashion to Eq. 4, total daily ET flux can be estimated using Eq. 7.

$$ET_{B - D}^{{}} = ET_{EC}^{{}} + \frac{{\left( {1 + B_{o}^{{}} } \right)^{ - 1} \times R}}{\lambda }$$
(7)

where the total residual is estimated as \(R = \sum\nolimits_{t = 0}^{t = n} {R_{t}^{AP} } \times dt\), and its respective partitioning corresponds to \({B}_{o}=\frac{{\sum }_{0}^{n}{H}^{AP}}{{\sum }_{0}^{n}{\lambda E}^{AP}}\). In this case, Bo is a function of daytime and nighttime fluxes, which can lead to bias in the H and λE correction since the magnitude of fluxes is much larger at daytime. Thus, implementing this correction only to daytime fluxes seems relevant. In this case, the daily ET for daytime fluxes (ETB-D-DT) is calculated by Eq. 8.

$$ET_{B - D - DT} = ET_{EC - DT} + \frac{{\left( {1 + B_{oDT} } \right)^{ - 1} \times R_{DT} }}{\lambda }$$
(8)

Similar to the previous approach, the total residual is \(R_{DT} = \sum\nolimits_{t = 0}^{t = n} {R_{t}^{AP} \times dt \times f\left( {SW} \right)}\) and \(B_{oDT} = \frac{{\sum\nolimits_{0}^{n} {H^{AP} } }}{{\sum\nolimits_{0}^{n} {\lambda E^{AP} } }} \times f\left( {SW_{IN} } \right)\). Another alternative correction based on the Bo is deriving a factor resulting from a centered moving median over a ± 15 days sliding window on each AP (Pastorello et al. 2020). AP during sunrise (03:00–09:00) and sunset (15:00–21:00) local standard time is excluded from the analyses to avoid issues related to significant changes in heat storage expected during those hours. Then, the daily ET estimation is based on the moving median of \(\alpha =\frac{{Q}_{\mathrm{net}}^{AP}-{G}^{AP}}{{H}^{AP}+{\lambda E}^{AP}}\) parameter can be computed by Eq. 9.

$$ET_{B - SD - MM}^{{}} = \sum\limits_{t = 0}^{t = n} {ET_{EC,t}^{AP} \times \alpha_{MM} \times dt}$$
(9)

Where \({\alpha }_{MM}\) is the moving median of \(\alpha\). A similar approach has been implemented by FLUXNET (Pastorello et al. 2020), which uses an alternative method when there is not enough data to estimate a reliable \({\alpha }_{MM}\). However, large data gaps in the GRAPEX datasets did not exist so this method was not required.

Direct estimations of \({ET}_{EC}^{AP}\) depend on measurements of high-frequency atmospheric water vapor, which is costly and requires a significant amount of maintenance to work reliably. Consequently, a common approach is to estimate λE, and then ET, as the energy balance residual. Therefore,

$$ET_{EB} = \frac{{\sum\nolimits_{t = 0}^{t = n} {Q_{{{\text{net}}}}^{AP} - H^{AP} - G^{AP} } }}{\lambda } \times dt$$
(10)

In this case, a comparable estimate of only daytime fluxes is computed using Eq. 11.

$$ET_{EB - DT} = \frac{{\sum\nolimits_{t = 0}^{t = n} {Q_{{{\text{net}}}}^{AP} - H^{AP} - G^{AP} } }}{\lambda } \times f\left( {SW_{IN} } \right) \times dt$$
(11)

Nine methodological approaches are used in this study to calculate micrometeorological daily ET estimates and are summarized in Table 1.

Table 1 Summary description of micrometeorological methodological approaches for daily ET estimates

Materials and methods

Study sites

This study focuses on EC flux measurements collected as part of GRAPEX on 5 vineyards located over the North Coast and the Central Valley of California. In the North Coast area, two flux towers were deployed at the study site: one in a Cabernet Sauvignon (BAR_A12) and another in a Petite Sirah vineyard (BAR_A07). In the Central Valley, fluxes were captured from three towers; one in the Lodi area and two in the Madera area were analyzed. The site in Lodi hereafter will be referred to as SLM_001, and the sites at Madera will be referred to as RIP_720 and RIP_760. In SLM_001, fluxes were measured for a Pinot Noir vineyard during 2018 and 2019, while in 2020, the block was converted to Cabernet Sauvignon by cutting the vines at the rootstock and re-grafting the former variety. In Madera, the flux tower in RIP_720 is deployed in a Merlot vineyard, while RIP_760 is a Chardonnay vineyard. Additional details regarding location, vineyard characteristics, soil texture, irrigation system, and other relevant information are summarized in Table 2.

Table 2 Study sites and vineyard characteristics for EC flux towers

EC flux towers instrumentation

EC flux towers at each site were instrumented with sensors to measure the main components of a surface EB (e.g., Qnet, H, λE, and G). Each tower had a very similar array of sensors, yet some differences were unavoidable due to the availability of sensors at the time of installation and changes in technology. In Table 3, a detailed list of sensors at each EC flux tower is presented. G was estimated as the mean of five heat flux plates installed at a depth of 8 cm along a diagonal transect across the inter-row space. A set of soil thermocouples at depths of 2 cm and 6 cm and a soil moisture sensor at a depth of 5 cm were co-located with each soil heat flux plate to account for soil heat storage. Agam et al. (2019) analyzed a network array of 11 sensors in the inter-row and determined that a transect of 5 sensors evenly spaced across the vine inter-row provided a reliable area-averaged soil heat flux.

Table 3 Flux towers instrumentationa

Flux data processing

H and λE were computed as functions of 30 min average covariance of the corresponding variables sampled at 20 Hz. Anomalous measurements in the high-frequency time series for each computed variable were removed following the Median Absolute Deviation method implemented by Mauder et al. (2013a). Wind velocity components were rotated into the mean streamwise flow following the 2-D coordinate rotation method described by Tanner and Thurtell (1969) (Kaimal and Finnigan 1994; Foken and Napo 2008). Wind components and scalar quantities were adjusted in time to account for sensor displacement, and also, frequency response attenuation corrections were performed (Massman 2000). The air temperature was derived from sonic temperatures and the atmospheric water vapor density estimations were based on Schotanus et al. (1983). Then, the resulting fluxes were adjusted by the Webb, Pearman, and Leuning (WPL) density corrections (Webb et al. 1980). G measurements collected over the vineyard inter-row space were corrected to represent a surface approximation by accounting for the heat storage in the overlying soil layer. Then, the adjusted G estimations representing the flux at the surface level were averaged to obtain a unique representative G flux at each vineyard (Agam et al. 2019).

Atmospheric stability and advection criteria

Energy imbalance was analyzed relative to atmospheric stability conditions. The stability parameter ξ was computed based on turbulence data, and fluxes were classified in discrete stability category cases: stable (ζ ≥ 0.1), neutral (− 0.1 < ζ < 0.1), unstable (− 0.5 ≤ ζ ≤  − 0.1), and very unstable (ζ <  − 0.5). ζ is a dimensionless quantity that depends on the Obukhov length scale parameter (L). L can be physically interpreted as the height above the surface at which buoyancy predominantly controls turbulence. Thus, \(\zeta = \frac{z - d}{L}\), and the Obukhov length is given by:

$$L = \frac{{ - \overline{{\theta_{v} }} u{}_{*}^{3} }}{{kg\left( {w^{\prime}\theta_{v} ^{\prime}} \right)_{s} }}$$
(12)

where z is the turbulence measurements height, d is the zero-plane displacement height (often taken as ~ 2/3 canopy height), \(\overline{{\theta }_{v}}\) is the mean virtual potential temperature, u* is the friction velocity, \(k\) is the von Kármán constant, \(g\) is the gravitational acceleration, and \(\left( {\overline{{w^{\prime}\theta_{v} ^{\prime}}} } \right)_{s}\) is the mean kinematic eddy heat flux at the surface.

Similarly, the energy imbalance was also analyzed depending on advective conditions. Cases under daytime advective conditions were identified when measured H was negative (H < 0 W m2) for at least three continuous hours, or days when H was consistently low throughout the day (H < 50 W m−2 at least 8 h) and λE exceeded available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) for at least 1 h. Days, when neither of these criteria were met, were classified as non-advective. These criteria were based on previous studies, which classified advection with similar indicators (e.g., Tolk et al. 2006; French et al. 2012; Kutikoff et al. 2019). It should be noted that these daytime stable cases caused by advection do not follow the classic definition of stable atmospheric conditions where turbulence is suppressed/weak with a shallow boundary layer since the turbulent transport of water vapor is significantly enhanced with a fully convective boundary layer; hence, we call these daytime stable cases “pseudo-stable”.

Data processing and statistical analyses

Flux and meteorological data were processed and analyzed using Python and built upon the SciPy library. Linear least-squares regressions were performed to determine sub-daily and daily agreement between available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) and EC turbulent fluxes (H + λE). Violin and box plots were created using the Python-Plotly data visualization package. Box plots depict data distribution based on five parameters: minimum, first quartile (25th percentile), median, third quartile (75th percentile), and maximum. Minimum and maximum values are estimated as the first/third quartile minus/plus 1.5 the interquartile range. Violin plots depict the same statistical parameters that are found in a box plot, and add a rotated kernel density distribution plot on each side of the center line of the box plot.

We also performed a supplementary analysis comparing our daily ET estimates to the new flux processing algorithm flux-data-qaqc (Volk et al. 2021). Flux-data-qaqc is a new open-source and object-oriented Python package to post-process EC flux data to estimate daily or monthly ET that has been corrected for energy balance closure. This software can also be used to perform visual or flag-based quality control and quality assurance checks on EC flux or meteorological data.

Results and discussion

How large is the uncertainty of daily ET estimates from flux measurements?

Daily ET estimates derived from nine methods utilizing EC flux measurements resulted in a mean difference of 48% when comparing the minimum and maximum daily ET relative to the ensemble mean from all methods. When considering the growing season period, the ET average difference was 44%, equivalent to 1.71 mm d−1 (mean ET = 3.9 mm d−1). ETB-MM led to the largest estimates in 88% of the cases, and ETEC-DT was the lowest in 87% of the cases (Fig. 1). While most methods are near the mean ET estimate, certain conditions, especially during the peak of the summer, lead to important discrepancies. Significantly above-average discrepancies were observed in the sites RIP_720, RIP_760, and BAR_A07. Despite the potential sources of uncertainty identified in our analysis, we also found that ET flux estimates are more uncertain under nighttime stable and strongly advective or pseudo-stable conditions. While these atmospheric conditions are quite rare during daytime hours, some sites were more prone to these conditions, and we unfurl details of these conditions later in the discussion.

Fig. 1
figure 1

Daily evapotranspiration estimated based on eddy covariance flux and surface available energy measurements in five California vineyards. Colored dots represent daily sum of ET fluxes computed using nine different approaches as described in the “Approach to Daily ET calculations” section. Abbreviation in the legend refer to the corresponding method as named in Eq. 211. Black and red lines represent the ensemble mean and median of all methods, respectively. Each sub-plot represents the observations for a given year (2018–2020) at each column and sites (rows)

Overall, our ET estimates derived using the post-processing analysis described previously compare closely to results using other standardized EC processing software (i.e., EddyPro, EasyFlux). For instance, we compared the mean daily ET estimates derived in this study for four GRAPEX sites (Fig. S1) against estimates obtained using flux-data-qaqc (Volk et al. 2021). Overall, good agreement was observed, yet in cases of springtime precipitation or highly advective conditions as observed in RIP_760, we found generally greater scatter (Fig. S1). The flux-data-qaqc data processing method bases the EBR correction in a 15 day centered sliding window filtering threshold, which in some cases can lead to attenuation of unusual weather conditions. We identified similar biases for the estimates based on the ETB-SD-MM, which are based on a similar algorithm.

The annual accumulated discrepancy in ET can range between 20% in the case of SLM_001 and about 50% in RIP760 (Fig. 2). In the context of our efforts to ground truth remotely sensed ET models, the magnitude of these differences is not trivial. For instance, the expected ET uncertainty could lead to different irrigation strategies and undesired water stress or excessive application in certain circumstances. Multiple studies have used the EC technique to estimate crop water use based on ET, yet different approaches are followed to address the energy imbalance. A common approach to close the energy budget uses the Bo ratio (Paço et al. 2006), while others use ET measurements directly from the EC system without considering the gap in EB closure (i.e., Zhang et al. 2012; Poblete-Echeverría and Ortega-Farias 2013; Zanotelli et al. 2019; Anapalli et al. 2020). Moreover, many studies derive ET as the EBR, given that the other components are directly measured (Qnet, H, and G). For studies using this approach in agricultural systems, H is typically derived based on EC or surface renewal methods (i.e., Spano et al. 2000; Castellví et al. 2008; French et al. 2012; Anapalli et al. 2018). On average, we found that ETEB was ~ 21% greater than ETEC, and that difference was up to ~ 25% during the growing season. Those differences are expected to propagate to other estimations such as crop coefficients calculations. We argue that considering and discussing the assumptions underlying ground-based ET estimates from micrometeorological techniques is fundamental when extrapolating and determining crop water use to inform irrigation decisions.

Fig. 2
figure 2

Annual evapotranspiration, years 2018–2020, based on eddy covariance flux and surface available energy measurements in five California vineyards. Each bar represents the annual sum of daily ET fluxes computed using nine different approaches as described in the “Daily ET calculations” section. Abbreviation in the legend refer to the corresponding method as named in Eqs. 211. Each sub-plot represents the observations for a given year (2018–2020) at each column and sites (rows)

Energy balance closure

Lack of EB closure is commonly a significant source of uncertainty in daily ET estimates. Previous studies using the EC technique on agricultural sites have reported a lack of closure between 20 and 30% when comparing the sum of H and λE to the residual of Qnet minus G (Stoy et al. 2013; Eshonkulov et al. 2019). For the analyzed GRAPEX sites, we found a mean closure of 75% across all sites and years (Fig. 3) when comparing half-hour fluxes. As expected, when comparing fluxes at the daily scale, a slightly better closure of ~ 78% was found (Fig. S2). The magnitude of this imbalance is comparable to the lack of closure reported by other studies measuring EC fluxes over vineyards as well (Li et al. 2008; Parry et al. 2019; Sánchez et al. 2019; Vendrame et al. 2020). High inter-annual consistency of the closure was found across sites, however, sites of similar characteristics within a given region can exhibit quite distinct imbalances (e.g., RIP_720 and RIP_760).

Fig. 3
figure 3

Scatter plots and linear least-squares regressions of 30 min available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) and eddy covariance turbulent fluxes (H + λE). Each sub-plot represents the observations for a given year (2018–2020; columns) and sites (rows). Colors of scattered dots illustrate the time of the day of the fluxes. b0 represents the intercept of the respective regression line (black), b1 is the slope of the regression, and R2 is the coefficient of determination for each sub-plot. Fig.S2 depicts the same analysis for daily fluxes

Usually EC accuracy is assessed based on the slope of the regression line between available energy and EC fluxes; the slope is assumed to define the overall EB closure capability of the EC system. However, using only this parameter can prevent a detailed understanding of underlying factors leading to a given energy imbalance. In Fig. 4, we illustrate the mean behavior of the \(\frac{H+\lambda E}{{Q}_{net}-G}\) ratio throughout the day at each site and analyzed year. Distinct features are observed across different sites, yet a consistent high closure (~ 1) is achieved at mid-afternoon (~ 1530 local time). Also, while small, nighttime fluxes feature a large imbalance in relative magnitude (~ 60–70%). Higher and more consistent closure ratios during daytime were found for RIP_720 and BAR_A07, which had EC measurements 1.5 to 2 m above the vine canopy compared to the other tower locations with measurements ~ 3 m above the vines. This might suggest that considering the complexity of vineyard canopies and their influence on surface turbulence (Alfieri et al. 2019), sub-meso-scale transport processes and secondary circulation might strongly influence the EC measurements under certain conditions.

Fig. 4
figure 4

Eddy covariance turbulent fluxes (H + λE) to available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) ratio for 30 min time average observations. Each sub-plot represents the observations for a given year (2018–2020; columns) and sites (rows). Solid blue lines represent the mean ratio estimated at each 30 min average period throughout the growing season (April–October). Shaded light blue boundaries for each line represents the 95% confidence interval. Black dotted lines illustrate the “ideal” case or total EB closure

One of the most common methods to close the energy balance is partitioning the EBR based on the Bowen ratio. If a significant portion of the lack of closure is due to meso-scale transport and secondary circulation, then it is assumed that these processes behave similarly to small-scale turbulence (Mauder et al. 2007). In fact, recent studies usually approach the partitioning of the EBR by applying an eddy covariance flux correction somewhere between a buoyancy-flux-based and the abovementioned Bo preserving adjustment method, which in most cases leads to a larger correction of H compared to λE (Mauder et al. 2020). However, latent heat fluxes are much larger in irrigated agricultural fields, such as California vineyards; therefore, corrections based on the Bo ratio tend to significantly increase ET estimates. Vendrame et al. (2020) concluded that the reason for the energy imbalance over two vineyard sites could be attributed to the systematic low accuracy of λE measured by EC, given that the larger imbalance was found in the sites with larger λE. This is consistent with the low closure found in RIP_760, in which large irrigation inputs lead to the overall highest λE estimates when compared to other GRAPEX sites. However, Widmoser and Wohlfahrt (2018) found that in a grassland ecosystem about one-third of the energy imbalance could be attributed to λE, while only 10% was attributed to H and the large remaining fraction (~ 58%) to the available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\). When flux measurements aim to represent fluxes over a heterogeneous surface, or in cases when the vertical source distribution of scalars differs, it is expected that similarity assumptions are violated (Huang et al. 2009). Thus, there is enough evidence to assume that compounding factors are influencing the lack of closure at GRAPEX sites, and there is not a particular method that can be considered more plausible when partitioning the lack of closure to ET.

Energy imbalance and meteorological conditions

The magnitude of the energy imbalance increased throughout the growing season for all sites. Increases in temperature and VPD seem to affect not only the magnitude of the imbalance but also the \(\frac{H+\lambda E}{{Q}_{\mathrm{net}}-G}\) proportionality. As VPD increases, there is a larger range of the magnitude of the imbalance found at any given \(\frac{H+\lambda E}{{Q}_{\mathrm{net}}-G}\) ratio. Extreme VPD observations, especially at BAR_A12 and RIP_720, were linked to some of the largest imbalance in magnitude and lower \(\frac{H+\lambda E}{{Q}_{\mathrm{net}}-G}\) ratio (Fig. 5). These results do not imply causality, yet it is important to consider that when atmospheric evaporative demands increase the expected uncertainty of ET estimates from EC might be greater. Extremely high horizontal wind speed also affected the imbalance by increasing the \(\frac{H+\lambda E}{{Q}_{\mathrm{net}}-G}\) ratio (Fig. 5). Windy conditions might be related to ET enhancement due to horizontal advection. Further discussions on this relationship is presented in the subsection “Energy imbalance and Advection”.

Fig. 5
figure 5

Relationship between eddy covariance turbulent fluxes (H + λE) to available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) ratio for the daily sum of daytime observations and the absolute daily energy balance residual \(\left( {EBR = Q_{{{\text{net}}}} - G - H - \lambda E} \right)\). Colored dots illustrate the magnitude of daily maximum VPD (top row) and wind speed (bottom row). The shape of the symbols represents the years, and each column of subplots depicts observations for each study site. Cool to warm colors represent low to high VPD and wind speeds, respectively

Energy imbalance and atmospheric stability conditions

There is evidence that the energy imbalance decreases for increasing frictional velocities (u*) and increasing instability (Wilson et al. 2002; Barr et al. 2006; Franssen et al. 2010). Therefore, convection is not suppressed under unstable conditions leading to smaller relative energy balance deficits. Also, under these unstable atmospheric conditions, the ergodic (i.e., time average converges to the ensemble average on a given appropriate time interval) and Taylor hypotheses (i.e., the temporal average can substitute the spatial average) are better fulfilled. However, very unstable conditions are also related to low-frequency turbulence (i.e., larger eddies) due to the development of organized convection, meso-scale circulation systems, or an increase in the boundary layer height (Franssen et al. 2010), which can negatively affect energy balance closure. In this study, as expected, the largest fraction of ET takes place under unstable to very unstable conditions (Fig. 6), yet the sites in Madera County (RIP_720 and RIP760) have an important contribution of fluxes under daytime pseudo-stable conditions caused by advection. Very unstable conditions tend to dominate compared to the unstable case; however, while not as predominant, the unstable conditions were related to less EBR (Fig. 7). While the energy imbalance absolute mean and median are greater in the very unstable conditions compared to unstable, the very unstable tends to be associated with larger fluxes in magnitude overall.

Fig. 6
figure 6

Daily ETEC flux for each analyzed study site (columns) and year (rows). Colors represent the contribution to the total daily ET of fluxes from each atmospheric stability category. Gray bars represent periods of time with missing data

Fig. 7
figure 7

Integrated violin and box plot of the energy balance residual \(\left( {EBR = Q_{{{\text{net}}}} - G - H - \lambda E} \right)\) for 30 min time average measurements classified based on atmospheric stability categories. Statistics for this analysis were based on flux observations from all five analyzed study sites and years (2018–2020) during the growing season (April–October). A more detailed analysis per study site and analyzed year is presented in Fig S3

Greater uncertainty of ET estimates across methods associated with a larger energy imbalance was consistently found for days and sites where daytime pseudo-stable conditions were more prevalent (Fig. S3). In 2020, both sites in Madera County showed a significant contribution of surface energy fluxes under daytime pseudo-stable conditions (Fig. 6), and fluxes under such conditions were also important in 2018.

Energy imbalance and advection

Advective conditions tended to increase the magnitude of the energy imbalance, thus the uncertainties of surface fluxes derived by the EC method are less reliable in those cases (Fig. 8). On days with strong advection, the magnitude of the energy imbalance was much larger than the non-advective fluxes. We found that advective conditions were very prevalent in Madera County, especially in 2020, when 96 days during the growing season had extreme advective conditions for the RIP_760 site (Fig. S4). A clear impact on the energy imbalance was observed in this specific case (RIP_760 in 2020), when the mean observed daily imbalance was about twice as large on days with strong advection (Fig. 8).

Fig. 8
figure 8

Integrated violin and box plots of the daily energy balance residuals \(\left( {EBR = Q_{{{\text{net}}}} - G - H - \lambda E} \right)\) classified based on advective conditions for all five analyzed study sites and years (2018–2020) during the growing season (April–October). Absence of violin/box plot for the advective case means that there were no advective conditions for that given analyzed period/study site. Fig.S4 shows the number of days with advective conditions for each site and analyzed year

EC estimates of surface fluxes are reliant on two important assumptions: (1) that the flow of air moving through a fixed location of turbulent flux measurement is representing a homogeneous landscape; and (2) such fluxes are stationary (i.e., the mean flux does not change over the period used to compute the mean) (Foken and Napo 2008). The first assumption of landscape homogeneity is often violated. Most of California’s agriculture takes place in a patchy landscape in which cropland is usually a composite of grasslands, annual crops, vegetables, and woody perennial orchards and vineyards. During the growing season, it is not uncommon to observe that irrigated crops are influenced by hot and dry air blowing from senescent grasslands, dry native ecosystems, and fallow lands, which would affect EC measurements based on the homogeneity assumption. This advective air flow provides additional energy, which enhances ET while decreasing sensible heat flux (French et al. 2012). Attempts to quantify the advection term is very difficult given limited observations typically available from the upwind source (Alfieri et al. 2012). Under strong advective conditions, especially during the afternoon, such energy contribution can reduce the magnitude of the EBR or cause an overestimation relative to local available energy (French et al. 2012; Kutikoff et al. 2019), such as is illustrated in Fig. 4. Second, the stationary assumption is not commonly fulfilled due to diurnal variations in atmospheric stability, indicating that turbulence conditions might change significantly over the chosen flux averaging time (Stoy et al. 2013). As a result, changes in heat storage between the soil and the point of measurements can play an important role in the observed energy imbalance throughout the day. Yet, heat storage can be difficult to quantify in irrigated vineyards due to vines canopy and soil moisture distribution heterogeneity.

We observed that the energy imbalance is greatest in the late afternoon when pseudo-stable conditions are more prevalent. The case of RIP_760 in 2020 is of special interest given that it illustrates the challenges of EC measurements under strongly advective conditions. In this case, the total growing season ET range across methods is 433 mm (~ 42% of the mean of all methods). The magnitude of this uncertainty was heavily influenced by advection enhanced ET and hysteresis effects due to prevailing pseudo-stable conditions (Fig. 9).

Fig. 9
figure 9

Sunburst plot hierarchically illustrating the proportion of ET 30 min time average flux data at each site (rows) and analyzed year (column) classified based on advective conditions and atmospheric stability

While these issues raise concern regarding the reliability of studies based on EC flux estimates over patchy irrigated agricultural landscapes, we also show that during daytime these pseudo-stable and strongly advective conditions are very rare (Fig. 9). Thus, most study sites were not significantly influenced by strong advection, and a more consistent impact on the energy imbalance is related to a lack of closure under the much more frequent condition of very unstable.

Conclusion

In this study, we assess nine different approaches to calculate daily ET based on EC flux measurements over five vineyards spanning a significant climate gradient from northern, central and southern areas of the California Central Valley. The implemented methods address the eddy covariance energy imbalance issue differently, which led to a mean daily uncertainty of 46% during the growing season. This uncertainty was remarkably consistent across years and sites, ranging between 43 and 50%. We found that this uncertainty is usually underestimated when considering the slope of the relationship between the available energy \(\left( {Q_{{{\text{net}}}} - G} \right)\) and the surface fluxes (H + λE). This simple analysis can hide issues, such as hysteresis, advection, and heat storage, which can mislead the interpretation of surface fluxes derived through the EC method. The debate on how best to deal with the energy imbalance within the EC method, in our opinion, remains open. While there has been some significant progress toward understanding the causes behind the energy imbalance in the last two decades, there is a lack of agreement regarding best practices to deal with it once it arises. Major challenges remain regarding creating frameworks to systematically identify and quantify the sources of the energy imbalance and further understanding specific-site dependencies. Nevertheless, while the debate remains open, we believe that this study illustrates a general framework to quantify the uncertainty of EC ET estimates. This approach can be especially relevant when those estimates are used to compare with other measuring approaches, validating models, and/or ground-truthing remote sensing estimates.