Introduction

More than 7 Mha of the Earth’s surface is devoted to grapevine production according to the Organisation Internationale de la Vigne et du Vin (OIV, 2022), and the crop holds particular historical, cultural and economic importance in the Mediterranean region (Limier et al., 2018). While vineyards have traditionally been rainfed, irrigation is becoming increasingly widespread globally due to changes in agronomic management along with technological advances. These largely focus on maximizing green water use while limiting water losses from soil evaporation (Medrano et al., 2015). This is especially relevant within the context of climate change, as the increased probability of more severe and frequent extreme heat and drought events seriously threatens the viability of viticulture in semi-arid areas such as the Mediterranean region (Rienth & Scholasch, 2019; Romero et al., 2022).

The precise quantification of water availability and vine water demand is therefore essential to provide guidelines on agronomic management solutions, such as irrigation scheduling (Engler et al., 2016), to optimize grape production and water use efficiency in light of these changing conditions. Determining the spatial variability of vine water stress is particularly useful to improve water resource management. Large within-field heterogeneity has been previously observed (Bellvert et al., 2012, 2016), which potentially leads to lower water use efficiency when irrigation is applied homogeneously. In recent years, large advances have been made in the use of unmanned aerial vehicles (UAV) to support the precise monitoring of agronomic activities through very high spatial resolution imagery (e.g., de Castro et al., 2021; Gago et al., 2015). In particular, the use of UAVs in conjunction with red–green–blue (RGB), visible-near-infrared (VNIR) and thermal infrared (TIR) sensors has shown great promise to monitor vine water stress (Aboutalebi et al., 2020; Nieto et al., 2019; Romero et al., 2018). Radiometric land surface temperature (LST) retrieved from TIR remote sensing is well established to detect physiological changes in vegetation, particularly due to water stress. Limited water availability induces partial stomatal closure in plants, decreasing water evaporation through transpiration. Stomatal closure reduces the cooling effect stemming from the use of intercepted radiation as transpiration, promoting an increase of leaf and canopy temperatures. As such, there is a direct link between leaf temperature and stomatal conductance, which is highly relevant for both transpiration and photosynthesis in plants. For this reason, large efforts have been made since the 1970s to use canopy temperature as an indicator of plant status (Jackson et al., 1977). Most notably, the crop water stress index (CWSI) was first proposed by Jackson et al. (1981) to normalize the LST between conditions when the plant is transpiring at its full potential and when the plant is fully stressed with no transpiration occurring. The simplicity of the CWSI as a water stress indicator has made it widely used by the research community (Bellvert et al., 2014; Gonzalez-Dugo & Zarco-Tejada, 2022) and different derivations have been proposed, ranging from its analytical to its empirical form (see Maes & Steppe, 2012 for a review).

On the other hand, VNIR sensing is well established for the retrieval of vegetation traits (Homolová et al., 2013), such as the leaf area index (LAI) or leaf chlorophyll content, while certain studies have also found a relation with vine physiology (Marino et al., 2014) or large-scale ecosystem function (Badgley et al., 2017). Both Poblete et al. (2017) and Romero et al. (2018) found that individual VNIR vegetation indices (VIs) did not correlate significantly with in situ water status indicators, yet both obtained good results when mapping water status indicators, such as vine stem water potential, with artificial neural networks trained on combinations of numerous broadband VNIR indices. Baluja et al. (2012), in turn, showed that VNIR VIs were significantly correlated to the water status of vines undergoing high and continuous water stress, which had suffered large changes to vegetation growth due to this accumulated stress. The authors suggested that the VNIR region was able to capture long-term water stress impacts during the later phenological periods, when plant growth was shown to be impeded by water limitation, while TIR sensing captured short-term responses (Baluja et al., 2012). In the shortwave VNIR spectral region, narrow-band hyperspectral indices, such as the Photochemical Reflectance Index (Gamon et al., 1992), are more closely associated with short-term vegetation physiological responses and have even been shown to perform similarly to the TIR-based CWSI when tracking water stress proxies in vineyards (e.g. Zarco-Tejada et al., 2013).

More recently, 3D models derived from high-resolution RGB imagery and photogrammetry have been able to provide detailed estimates of crop structural characteristics, giving quantitative insights on crop growth and development. For example, Comba et al. (2020) demonstrated the effectiveness of estimating vine LAI using 3D-based canopy descriptors such as canopy height, thickness and leaf density distribution. Rossi et al. (2022) proposed an automatic 3D model segmentation to extract canopy traits using a low-cost phenotyping platform equipped with an RGB camera, demonstrating significant differences in the canopy traits of tomatoes under various water stress treatments. Aboutalebi et al. (2020), in turn, incorporated 3D products from UAV imagery to better parameterize a surface energy balance model for the estimation of evapotranspiration (ET) in vineyards.

As such, it is clear that the use and combination of TIR, VNIR and RGB sensors onboard UAVs can provide highly relevant information on vine physiological and structural response to water limitation, which can serve as a spatially distributed warning system of vine water stress in support of irrigation management. However, few studies have robustly evaluated and compared the utility of each set of sensors to track vine water status, especially over different phenological periods. Therefore, the main objective of this study was to evaluate the capabilities of a UAV payload equipped with TIR, VNIR and RGB sensors to detect water stress in grapevines subjected to different irrigation regimes throughout the vine phenological period. Both linear and non-linear models were trained and evaluated against in situ measurements to assess the effectiveness and importance of each UAV-based variable to estimate the physiological response of grapevines to water stress. In situ physiological proxies, such as stomatal conductance (gs), leaf (Ψleaf) and stem (Ψstem) water potential, and canopy traits, such as LAI, were collected in the study site during each UAV overpass and served to benchmark model performance.

Materials and methods

Case study and experimental design

The case study was a 0.5 ha vineyard (Petit Verdot variety) at the ‘El Socorro’ experimental farm (40.14 N 3.37 W, Belmonte de Tajo, Madrid, Spain; altitude 755 m a.s.l.) located roughly 60 km southeast of the city of Madrid. The area has a typical semi-arid continental Mediterranean climate with mean annual air temperatures of about 14 °C and average annual rainfall of 420 mm (Guerra et al., 2022), with a seasonal drought period during the summer characterized by low precipitation and high temperatures. The soil has a clay-loam texture and the inter-rows were mowed with a flail mower (Guerra et al., 2022), leaving no cover crop understory. The grapevines were trained on a vertical shoot position (VSP) trellis system with vegetation reaching heights of roughly 1.5 m. The grapevines were planted following a 2 m inter-row spacing by 1.1 m inter-vine spacing, with rows oriented in the North–South direction. A drip irrigation system, consisting of a self-compensating integrated drip irrigation pipe with 2.2 l/h emitters at 75 cm spacing, was installed in 2021 to implement precise variable irrigation regimes in order to study the effect of water stress on grapevines. The experimental design was established following a randomized block design with three replications for three different treatments (Fig. 1). The three irrigation treatments consisted of maintaining different weekly crop coefficients (Kc) relative to the reference ET (ETo) as calculated by the FAO56 Penman–Monteith method (Allen et al., 1998). Daily ETo and rainfall were retrieved from a nearby agrometeorological station (Chinchón, UTM30N X: 460,101, Y: 4,449,200) belonging to the Spanish Agroclimatic Information System for Irrigation (SIAR, https://servicio.mapa.gob.es/websiar/). The typical irrigation practice for vineyards of the region maintains Kc values close to 0.4 throughout the seasonal growing period (Romero et al. 2016; Rallo et al. 2021). To induce a large contrast and variability of vine water stress in the case study, we implemented treatments that imposed severe deficit irrigation (0.2Kc; 20% of ETo), typical deficit irrigation (0.4Kc; 40% of ETo) and highly irrigated practices (0.8Kc; 80% of ETo). The irrigation period in 2022 and 2023 began in early June and ended at the end of September, with irrigation inputs adjusted weekly for the three treatments taking into account the weekly accumulated rainfall and ETo. The average annual irrigation inputs for the 0.2Kc, 0.4Kc and 0.8Kc treatments were 93, 194 and 371 mm, respectively.
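The weekly adjustment of irrigation inputs can be pictured as a simple water balance. The exact rule applied in the experiment is not stated here, so the formula below (target Kc times weekly ETo, minus weekly rainfall, floored at zero), the function name and the example numbers are all assumptions made purely for illustration.

```python
def weekly_irrigation_mm(kc, eto_mm, rain_mm):
    """Hypothetical weekly irrigation depth (mm) for a target crop coefficient.

    Assumed rule (not taken from the study): irrigation covers the part of
    Kc * ETo that was not already supplied by rainfall, and is never negative.
    """
    return max(kc * eto_mm - rain_mm, 0.0)


# Made-up example week: 40 mm of accumulated ETo and 3 mm of rainfall.
for kc in (0.2, 0.4, 0.8):
    print(f"{kc}Kc -> {weekly_irrigation_mm(kc, eto_mm=40.0, rain_mm=3.0):.1f} mm")
```

Under this assumed rule, a rainy week can reduce the dose of the most restrictive treatment to zero while the 0.8Kc treatment still receives water.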

Fig. 1

Aerial view of the experimental scheme at the ‘El Socorro’ farm with three different irrigation treatments and repetitions (0.2Kc in light blue, 0.4Kc in blue and 0.8Kc in dark blue) along with 27 permanent sampling points (orange circles) and the local meteorological tower. The orthomosaic was produced from the RGB camera (DJI P1) on 2022-08-30 (Color figure online)

In-situ measurements

Twelve field campaigns (Table 1) were carried out during the main vine growth period (June to August) in 2022 and 2023 to acquire UAV imagery and in situ vine biophysical measurements. Three permanent sampling points (i.e., vines) spaced evenly at each repetition (3) of each treatment (3) resulted in a total of 27 field measurements for each campaign (Fig. 1). At each field campaign, midday leaf (Ψleaf) and stem (Ψstem) water potential along with leaf area index (LAI) were measured at each sampling point using a Scholander pressure chamber Model 600 (PMS Instruments, Albany, USA. Scholander et al., 1965) and Licor’s LAI-2200C (LI-COR Biosciences USA, 2011), respectively. In situ measurements of stomatal conductance (gs) were additionally acquired with the LI-600 porometer (LI-COR model 600, LI-COR Biosciences, Lincoln, NE) only during the 2023 campaigns (the instrument was not available in 2022). In addition, the phenological growth stage at each sampled vine was recorded in the field following the extended BBCH scale (Lorenz et al., 1995) and the mode value during each field campaign is presented in Table 1.

Table 1 Date, UAV takeoff time and meteorological conditions during flight time for each field campaign performed at El Socorro experimental farm

LAI measurements with the LAI-2200C (LI-COR Biosciences USA, 2011) were acquired following the protocol for vineyards suggested by White et al. (2019). As such, four measurements were taken at each sampling point: one below the vine-row and then at 1/4, 1/2 and 3/4 of the distance from the vine-row, with the sensor roughly 30 cm above the ground to avoid the drip line intercepting the LAI-2200C lens. The four measurements were averaged to obtain an ‘ecosystem-level’ LAI considering both the grapevine and the interrow (see White et al., 2019 for more details on the LAI measurement protocol).

Midday Ψleaf, Ψstem and gs were acquired simultaneously with the UAV overpass over the 27 sampling points. Ψleaf and Ψstem were sampled using the Scholander pressure chamber (Model 600). For Ψleaf, a well-developed sunlit leaf was excised at each sampled vine during the UAV overpass. The same procedure was used for Ψstem, except that the leaf was covered with an opaque aluminum zip bag for at least an hour prior to excision. The measurements were acquired in the field in bar and converted to MPa as presented in this study. Simultaneously, gs was measured using the LI-600 porometer over six different leaf samples at each vine (three in the upper canopy and three in the lower canopy).

In addition to vine-level measurements, a local meteorological station was installed on the eastern edge of the experiment (Fig. 1). The tower was equipped with an integrated open-path infrared gas analyzer and 3D sonic anemometer from Campbell Scientific (IRGASON, Campbell Scientific, Logan, Utah), which measures carbon, heat and water exchanges while also sampling meteorological scalars such as air temperature, humidity and wind speed at a half-hourly time step. In addition, shortwave and longwave radiation were measured using a four-component net radiometer (SN-500-SS, Apogee, Logan, Utah). Since the tower was installed on 2022-08-03, meteorological data for the campaigns prior to the tower installation were acquired from the Agroclimatic Information System for Irrigation (SIAR) at the Chinchón station (i.e. about 10 km from the study site). The SIAR station provides data at the daily scale, which were adjusted to the half-hourly time step of the UAV overpass and to local conditions using a linear regression model (see Fig. 11) relating daily meteorological data from the Chinchón weather station to the conditions at the overpass time at the El Socorro experimental farm. This was mostly relevant to the calculation of the surface to air temperature gradient (dT), which was one of the variables assessed in this study (see Sect. 2.4). Table 1 shows the meteorological conditions during each UAV campaign used in this study.

UAV payload and image processing

The DJI Matrice 300 UAV (DJI Technology Co., Ltd, Shenzhen, China) was used to acquire visible-near-infrared (VNIR), thermal infrared (TIR) and RGB imagery using three different sensors: Parrot Sequoia+ (Parrot S.A., Paris, France), DJI’s Zenmuse H20T and DJI’s Zenmuse P1, respectively. The Sequoia+ camera has four separate bands in the green (0.48–0.52 µm), red (0.64–0.68 µm), red-edge (0.73–0.74 µm) and near-infrared (0.77–0.81 µm) spectral regions with a horizontal field-of-view (FOV) of 61.9° and a vertical FOV of 48.5°. The Zenmuse H20T is a radiometric single-band microbolometer camera sensitive between 8 and 14 µm with a FOV of 40.6°, while the Zenmuse P1 is an RGB sensor equipped with a 35 mm lens acquiring images at 45 megapixels, which provided very high resolution RGB imagery. During each campaign, two flights were carried out. Firstly, an overpass at 40 m above the surface acquired VNIR and TIR images simultaneously with 70% and 80% of frontal and lateral overlap, respectively, resulting in a native pixel resolution of roughly 4 cm for both. Secondly, RGB imagery was captured at 15 m above the surface (also with 70% and 80% of frontal and lateral overlap, respectively) to acquire a dense point cloud through photogrammetric techniques, which resulted in a native pixel resolution of about 0.2 cm for the orthomosaic.

UAV images were processed using OpenDroneMap (ODM, https://www.opendronemap.org/), an open-source photogrammetry software. Raw TIR H20T image tiles, in R-JPEG format, were first converted to single-band radiometric temperature images using the DJI Thermal SDK software (https://www.dji.com/downloads/softwares/dji-thermal-sdk). These individual temperature image tiles were then mosaicked together within ODM using an incremental structure-from-motion algorithm and a Fast Library for Approximate Nearest Neighbors (FLANN) matcher algorithm. In parallel, multispectral images from the Sequoia+ sensor were radiometrically calibrated by applying camera corrections, such as vignetting, black level and gain/exposure compensations, using the routines available in OpenDroneMap (https://github.com/OpenDroneMap/ODM/blob/master/opendm/multispectral.py) and following the documentation from Sequoia (Parrot, 2017). The RGB orthomosaics and Digital Surface Models (DSM) were generated through a completely automatic processing chain as described in de Castro et al. (2018).

Generation and extraction of UAV-based variables

Table 2 shows the different variables derived from the UAV payload, taking advantage of the TIR, VNIR and 3D imagery acquired during each campaign. For the multispectral VNIR data, we computed different vegetation indices (VIs) exploiting all the band combinations available and those most typically used to monitor vegetation status. The normalized difference vegetation index (NDVI) is the most widely applied VI and has been shown to correlate with vegetation density (e.g. Gitelson, 2004). The optimized soil-adjusted vegetation index (OSAVI, Rondeaux et al., 1996) was proposed to limit the effect of the soil signal on NDVI, especially for conditions of low vegetation cover such as vineyards, where vegetation is clumped and planted in rows. In addition, red-edge reflectance has been shown to be less affected by canopy structure and sensitive to vegetation traits, such as LAI or chlorophyll content, for different crop types (Dong et al., 2015; Nguy-Robertson & Gitelson, 2015). As such, we also examined the red-edge NDVI (reNDVI, Gitelson & Merzlyak, 1994) and the green chlorophyll index (CIgreen, Gitelson et al., 2003), both of which show limited saturation at high LAI values. In addition, we also tested the near-infrared reflectance of vegetation (NIRv, Badgley et al., 2017), which is the product of NDVI and near-infrared (NIR) reflectance, since it has been shown to correlate better than NDVI with vegetation function such as photosynthetic capacity or gross primary production. Indeed, Badgley et al. (2017) argued that NIRv better isolates the vegetation signal as it represents the proportion of NIR reflectance attributable to the vegetation within the pixel.
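As a rough illustration of how these indices relate to the four Sequoia bands, the sketch below computes them for a single pixel. The formulas are the forms commonly given in the cited literature and are assumed here, since Table 2 is not reproduced in this section; in particular the OSAVI soil term of 0.16 without an additional scaling factor is an assumption, and the function name and reflectance values are hypothetical.

```python
def vegetation_indices(green, red, red_edge, nir):
    """Compute the VNIR indices for one pixel of surface reflectance (0-1).

    Formulas are the commonly published forms and are assumptions here,
    not copied from the study's Table 2.
    """
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,
        "OSAVI": (nir - red) / (nir + red + 0.16),   # 0.16 = assumed soil term
        "reNDVI": (nir - red_edge) / (nir + red_edge),
        "CIgreen": nir / green - 1.0,
        "NIRv": ndvi * nir,                          # product of NDVI and NIR
    }


# Made-up reflectances for a vegetated pixel.
pixel = vegetation_indices(green=0.08, red=0.05, red_edge=0.25, nir=0.45)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

The same arithmetic applies element-wise to whole reflectance rasters once the bands are co-registered.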

Table 2 The different variables generated from the UAV payload and used in this study, categorized into groups based on the use of multispectral visible to near infrared (VNIR) images, thermal infrared (TIR) images and 3D structural variables from RGB-based point clouds

For the TIR variables, other than land surface temperature (LST), we also examined the LST to air temperature gradient (dT), which normalizes the LST for given air temperature conditions (dT = LST – Ta). Indeed, LST is temporally very dynamic and affected by quick changes to air temperature or wind speed that affect the interchange of heat from surface to atmosphere. In addition, we computed an empirical derivation of the widely applied crop water stress index (CWSI, Jackson et al., 1981). The CWSI is related to the ratio between actual evapotranspiration (ET) and potential ET without water limitations, with the CWSI most commonly being retrieved empirically by applying threshold limits to normalize the canopy temperature between conditions of maximum and minimum water stress (Maes & Steppe, 2012). In this study, we applied the CWSI as proposed by Veysi et al. (2017), which normalizes the LST using the temperature of a seemingly well-irrigated pixel (LSTmin, 1st percentile of vine LST) and that of a pixel seemingly suffering maximum stress (LSTmax, 99th percentile of vine LST).
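The two normalizations described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the percentile helper uses linear interpolation, the final clipping of CWSI to the 0 to 1 range is an added assumption, and the function names and temperatures are hypothetical rather than taken from the study's processing chain.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in 0-100) of a list of numbers."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def surface_air_gradient(lst, air_temp):
    """dT = LST - Ta, both in the same temperature unit."""
    return lst - air_temp


def empirical_cwsi(vine_lst, lst):
    """Empirical CWSI of one pixel given all vine LST values of the scene.

    LSTmin and LSTmax are the 1st and 99th percentiles of vine LST.
    Clipping to [0, 1] is an assumption added for this sketch.
    """
    lst_min = percentile(vine_lst, 1)
    lst_max = percentile(vine_lst, 99)
    cwsi = (lst - lst_min) / (lst_max - lst_min)
    return min(max(cwsi, 0.0), 1.0)


# Made-up vine temperatures (degrees Celsius) and air temperature.
vine_temps = [30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0]
print(surface_air_gradient(35.0, 32.5))
print(empirical_cwsi(vine_temps, 35.0))
```

Because LSTmin and LSTmax come from the image itself, the index is relative to the range of conditions present in each flight, which may limit its comparability between dates.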

The 3D structural features of the vines were obtained by using an automated object-based image analysis (OBIA) algorithm, as developed by de Castro et al. (2018). This method automatically classifies the position and dimensions of the vines based on a digital surface model (DSM = DEM − DTM). The OBIA algorithm, developed with the eCognition Developer 9 software (Trimble GeoSpatial, Munich, Germany), employs a checkerboard segmentation algorithm to divide the DSM and identify the pixels corresponding to grapevines, avoiding possible weed vegetation pixels.

From there, a 2 × 2 m window centered on each of the sample points (Fig. 2) was applied and vine geometric features, including the average canopy height (CH), volume (CV) and area (CA) along with mean values of all other variables listed in Table 2, were extracted from these objects. The CH of the vegetation cover was determined by subtracting the DTM values from the DEM. The CV was calculated using the voxelization technique by summing the volumes of all pixels corresponding to the vine within the 2 × 2 m window around each sample point. Similarly, CA within the 2 × 2 m square was obtained as the number of pixels classified as vine multiplied by the pixel area. For more details on the grapevine classification and 3D algorithm, refer to the work of de Castro et al. (2018).
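The extraction of CH, CA and CV from one sampling window can be illustrated with a small grid. This is a simplified sketch and not the OBIA workflow of de Castro et al. (2018): it assumes the canopy heights and a vine/non-vine mask are already available, and it treats each vine pixel as a column whose volume is its height times the pixel area. Function and variable names are hypothetical.

```python
def canopy_metrics(heights, vine_mask, pixel_size):
    """Mean canopy height (m), canopy area (m2) and canopy volume (m3).

    heights    : 2D list of canopy heights in metres inside the window
    vine_mask  : 2D list of booleans, True where the pixel is classified as vine
    pixel_size : pixel side length in metres
    """
    pixel_area = pixel_size ** 2
    vine_heights = [
        h
        for row_h, row_m in zip(heights, vine_mask)
        for h, is_vine in zip(row_h, row_m)
        if is_vine
    ]
    if not vine_heights:
        return {"CH": 0.0, "CA": 0.0, "CV": 0.0}
    return {
        "CH": sum(vine_heights) / len(vine_heights),
        "CA": len(vine_heights) * pixel_area,
        # Assumed column approximation: volume = height x pixel area.
        "CV": sum(h * pixel_area for h in vine_heights),
    }


# Made-up 2 x 2 pixel window with 0.5 m pixels; one pixel is bare soil.
heights = [[1.0, 2.0], [0.0, 1.5]]
mask = [[True, True], [False, True]]
print(canopy_metrics(heights, mask, pixel_size=0.5))
```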

Fig. 2

Example of spatial distribution of UAV variables over the study area including the normalized difference vegetation index (NDVI), Green chlorophyll index (CIgreen), land surface temperature (LST) and canopy height (CH) on 2022-08-02. Orange circles represent the permanent sampling points and the surrounding white squares are the 2 × 2 m window used to extract all the UAV-based variables to compare against in-situ measurements. The coordinates on X–Y axis are projected in UTM zone 30N with units in meters (Color figure online)

Empirical models

Using the variables listed in Table 2, various empirical models were trained to better understand the relation between the variables generated from UAV imagery and the in situ physiological (Ψleaf, Ψstem and gs) and canopy (LAI) traits. The data were standardized by subtracting the mean of the dataset from each sample and then dividing by the standard deviation (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html). As a first step, a correlation matrix, based on Spearman’s coefficient (\(\rho\)) to account for non-linear monotonic relations, was used to characterize the relations between variables and to discard potentially redundant variables that were highly related to each other and, therefore, provided similar information. This was processed using the Python Seaborn package (Waskom, 2021; https://seaborn.pydata.org/). In addition, an analysis of variance (ANOVA) was performed for each campaign to test whether the three irrigation regimes caused significant treatment effects. Along with this, a post-hoc Tukey’s test was used to investigate pairwise differences between each treatment. These were implemented in Python using the statsmodels package (Seabold & Perktold, 2010).
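To make the standardization and the rank-based correlation concrete, the following sketch reimplements both with the Python standard library only. It is an illustration, not the code used in the study (which relied on scikit-learn and Seaborn); the helper names are hypothetical, and the population standard deviation is used to mirror the default behaviour of `StandardScaler`.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# A monotonic but non-linear relation yields rho = 1 while Pearson's r < 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

Since Spearman's coefficient depends only on ranks, it is unaffected by the standardization step; standardization matters for the PCA-based models described next.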

Subsequently, empirical models were developed in Python using the scikit-learn package (Pedregosa et al. 2011; https://scikit-learn.org/stable/). A principal component analysis (PCA) was performed on the UAV-based predictor variables, which was subsequently used to develop principal component regression (PCR) models. To evaluate the importance of each group of variables stemming from each of the TIR, VNIR and RGB sensors, different variable combinations were used as inputs to the PCR models, resulting in a total of seven PCR models (i.e. 1 = TIR, 2 = VNIR, 3 = 3D, 4 = TIR + VNIR, 5 = TIR + 3D, 6 = VNIR + 3D, 7 = TIR + VNIR + 3D). For each PCR model, a multiple linear regression model was applied and the number of principal components (PCs) was equal to the number of variable groups used. For example, the PCR model using TIR, VNIR and 3D variable types (as described in Table 2) used three PCs as independent variables, while the model using only TIR and VNIR variables used two PCs. In addition, non-linear empirical models were implemented using Random Forest (RF) (Pedregosa et al. 2011) and the Gini importance (calculated using scikit-learn and based on Breiman, 2001) was used to quantify the importance of each variable within the developed RF models. For the RF models, to evaluate possible overfitting issues, the dataset was divided 70/30 to calibrate and validate the model, respectively. This analysis was done using all vine-level data samples available, typically 27 measurements per campaign. However, clear outliers were eliminated based on the interquartile range (IQR) method (i.e. values lower than the 25th percentile minus 1.5 times the IQR or greater than the 75th percentile plus 1.5 times the IQR). An additional analysis was performed by spatially aggregating the dataset at the row level. This was done by averaging the three vine-level measurements acquired per row, leaving a total of nine samples per campaign for this scale of analysis. The row-level aggregation served to analyze the effect of the spatial unit and the inherent uncertainties of the in-situ measurements on the performance of the empirical models.
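The interquartile range screening mentioned above can be sketched as follows, assuming the conventional fences at 1.5 times the IQR below the 25th and above the 75th percentile. The quartile interpolation method ("inclusive") and the function name are assumptions of this sketch, and the sample values are made up.

```python
from statistics import quantiles


def remove_iqr_outliers(values, factor=1.5):
    """Keep values inside [Q1 - factor * IQR, Q3 + factor * IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if lower <= v <= upper]


# Made-up sample with one implausible reading.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(remove_iqr_outliers(sample))
```

With only nine row-level samples per campaign, such a filter should be applied with care, since a single removal already discards more than 10% of the data.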

Model evaluation metrics

Modelled results were evaluated against in situ measurements and their performance was quantified using the root-mean-square error (RMSE, Eq. 1a), the relative RMSE (rRMSE, Eq. 1b), the percent bias (PBIAS, Eq. 1c) and Pearson’s correlation coefficient (r). In addition, for the PCR models implemented with different variable group types, the Akaike Information Criterion (AIC) (Akaike, 1974) is reported, which penalizes the goodness of fit of a model by its complexity (i.e. the number of parameters used), such that a lower AIC value indicates a better model.

$$RMSE=\sqrt{\frac{\sum {\left({Y}_{obs}-{Y}_{mod}\right)}^{2}}{N}}$$
(1a)
$$rRMSE=\frac{RMSE}{{Y}_{obs,max}-{Y}_{obs,min}}$$
(1b)
$$PBIAS=\frac{\sum \left({Y}_{mod}-{Y}_{obs}\right)}{\sum \left({Y}_{obs}\right)}\times 100$$
(1c)

where Yobs are the measured observations, Ymod are the modelled values and N is the sample size.
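Equations 1a to 1c, together with Pearson's r, translate directly into a few lines of code. The sketch below uses only the standard library; the function name and the example values are hypothetical.

```python
import math


def evaluate(obs, mod):
    """Return RMSE, rRMSE, PBIAS (%) and Pearson's r for paired observations."""
    n = len(obs)
    rmse = math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, mod)) / n)
    rrmse = rmse / (max(obs) - min(obs))          # normalized by observed range
    pbias = 100.0 * sum(m - o for o, m in zip(obs, mod)) / sum(obs)
    mean_obs, mean_mod = sum(obs) / n, sum(mod) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(obs, mod))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_mod = sum((m - mean_mod) ** 2 for m in mod)
    r = cov / math.sqrt(var_obs * var_mod)
    return {"RMSE": rmse, "rRMSE": rrmse, "PBIAS": pbias, "r": r}


# Made-up example: a model that is perfectly correlated but biased by +1.
print(evaluate(obs=[1.0, 2.0, 3.0, 4.0], mod=[2.0, 3.0, 4.0, 5.0]))
```

The example shows why r alone is insufficient: a constant offset leaves r at 1 while RMSE and PBIAS reveal the systematic error. Note also that PBIAS changes sign when the observations sum to a negative value, as is the case for water potentials expressed in MPa.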

Results

In-situ and UAV variable time series

The seasonal patterns of in-situ and UAV-based variables, separated by the three irrigation treatments, are shown in Figs. 3 and 4, respectively. In general, for both years, differences between the three irrigation regimes were more pronounced starting from mid-July, corresponding roughly to the end of fruit development and the beginning of berry ripening (BBCH scale 79–81). The leaf area index (LAI) of grapevines was generally larger for the 0.8Kc treatment than for both the 0.2Kc and 0.4Kc treatments. This trend is particularly noticeable after mid-July in both years, when the mean LAI values of the 0.8Kc treatment were roughly 2.5 m2/m2 and 2 m2/m2 in 2022 and 2023, respectively, whereas the LAI for the 0.2Kc and 0.4Kc treatments remained below 2 m2/m2 in 2022 and around 1 m2/m2 in 2023. Significant treatment effects for LAI were only observed in 2022 starting from August 16th and were driven principally by the large differences between 0.8Kc and the two other treatments. In fact, pairwise Tukey’s tests revealed non-significant (p-value > 0.05) differences between 0.2Kc and 0.4Kc for all dates assessed in 2022 and 2023. It should be noted that only nine LAI measurements were acquired (instead of 27 samples) during the July 4th campaign in 2023 due to instrumentation issues and, therefore, limited conclusions can be drawn for this date. By contrast, the physiological measurements showed more consistent differences between the three irrigation regimes, especially for Ψstem and gs. Ψleaf showed significant differences between the three irrigation treatments for all campaigns in 2022 (except June 21st), but non-significant treatment effects were found for all dates in 2023, with rather inconsistent results throughout the seasonal period. Ψstem, on the other hand, showed highly significant differences between treatments in both 2022 and 2023, especially during the peak summer period between mid-July and mid-August. The gs measurements followed very similar patterns to Ψstem, with significant treatment effects mostly apparent during the mid-summer campaigns.

Fig. 3

Daily means and standard deviation, separated by treatments (0.2Kc in light blue, 0.4Kc in blue and 0.8Kc in dark blue), of in-situ leaf area index (LAI), leaf water potential (Ψleaf), stem water potential (Ψstem), and stomatal conductance (gs) collected in the El Socorro experimental farm during the 2022 and 2023 campaigns. Irrigation treatment effects were quantified through an Analysis of Variance (ANOVA) implemented at each date where ns = p-value > 0.05, *0.01 < p-value < 0.05, **0.001 < p-value < 0.01 and ***p-value < 0.001 (Color figure online)

Fig. 4

Daily means and standard deviation, separated by irrigation treatments (0.2Kc in light blue, 0.4Kc in blue and 0.8Kc in dark blue), of normalized difference vegetation index (NDVI), green chlorophyll index (CIgreen), surface to air temperature gradient (dT), and canopy area (CA) extracted from UAV imagery over the vine sampling points within the ‘El Socorro’ experimental farm during the 2022 and 2023 campaigns. Irrigation treatment effects were quantified through an Analysis of Variance (ANOVA) implemented at each date where ns = p-value > 0.05, *0.01 < p-value < 0.05, **0.001 < p-value < 0.01 and ***p-value < 0.001 (Color figure online)

UAV-based variables showed similar temporal trends and treatment effects to the in-situ measurements (Fig. 4). For example, the mean NDVI values for 0.8Kc were consistently the highest, while values for 0.2Kc and 0.4Kc were generally closer to each other. In 2022, there was only a significant difference in NDVI between 0.2Kc and 0.4Kc on August 2nd, with all other dates showing non-significant differences. By contrast, the LST to air temperature difference (dT) more consistently and significantly captured mean differences between the different treatments, especially after the mid-July period, as similarly observed in the in situ Ψstem and gs measurements (Fig. 3). Moreover, Tukey’s test revealed significant pairwise differences in dT between 0.2Kc and 0.4Kc in six different campaigns, whereas the VNIR-based indices (e.g., NDVI, CIgreen) only showed a significant difference between both treatments during the aforementioned 2022-08-02 campaign. The vine canopy area (CA) demonstrated very consistent and significant differences between irrigation regimes throughout the entire phenological period. However, while significant treatment effects were observed for all dates, all pairwise Tukey’s tests between 0.2Kc and 0.4Kc resulted in non-significant differences, similar to the results obtained from in-situ LAI and VNIR-based indices.

Relation between UAV and in-situ biophysical variables

The relationships between all variables assessed in this study, quantified with Spearman’s ρ, are shown in Fig. 5. LAI was most related to CA (ρ = .54) and CV (ρ = .53), along with VNIR indices such as OSAVI, NDVI and NIRv (ρ = .44/.45). In general, TIR-based variables had little relation with LAI (|ρ| < .4), except for dT (ρ = − .40). By contrast, LST was the only UAV-based variable to show some relation with Ψleaf (ρ = − .43), with the rest showing very little correlation with this variable. Ψstem was generally more correlated to UAV-based variables than Ψleaf, especially with dT (ρ = .50), NIRv (ρ = .50) and LST (ρ = .48). For gs, only TIR-based variables showed significant correlations, namely dT (ρ = − .62) and LST (ρ = − .53). Interestingly, CWSI did not demonstrate strong correlations with any of the in-situ variables. Based on this, three variables were discarded when developing the empirical models described in the following sections: CWSI; NDVI, which showed very similar albeit slightly weaker relations compared to OSAVI (the two being highly correlated with each other, i.e. ρ = .98); and CV, which showed a similar pattern to CA. NIRv was also highly correlated to NDVI and OSAVI (ρ = .97) but was kept as it generally had stronger correlations with the physiological variables (Ψleaf, Ψstem and gs) compared to the other VNIR VIs.

Fig. 5

Correlation matrix of variables grouped by VNIR (green labels), TIR (red labels), 3D structural (blue labels) and in situ (black labels) data (Color figure online)

Principal component analysis and multiple linear regression models

Based on the correlation matrix shown in Fig. 5, the eight variables selected to develop empirical models were OSAVI, reNDVI, CIgreen, NIRv, LST, dT, CH and CA. To reduce dimensionality and potential multicollinearity issues, a PCA was performed to extract three PCs (Fig. 6) to be used as predictor variables to estimate in situ variables using multiple linear regression models. The three PCs explained 78% of the variability in the original predictor variables, where the 1st principal component (PC1) is largely represented by CA and VNIR indices (OSAVI, NIRv, CIgreen), the 2nd principal component (PC2) is mostly composed of CH, reNDVI and dT and the 3rd principal component (PC3) is dominated by TIR variables (LST and dT).

Fig. 6

Loadings of the three principal components (explaining 78% of variability) of the UAV-based predictor variables (Color figure online)

In addition to the PCA applied using all variables (as shown in Fig. 6 for TIR + 3D + VNIR variables), a series of PCAs was applied with different input combinations separating the variable group types, namely TIR, VNIR and 3D variables, as shown in Table 2 and described in Sect. 2.4. A total of seven PCR models were applied to estimate LAI, gs, Ψleaf, and Ψstem; the performance metrics are displayed in Table 3, and the scatter plots of the PCR models using all variables are illustrated in Fig. 7. In general, PCR models using all three variable types (TIR, VNIR and 3D) performed best for all dependent variables, having the highest correlations, lowest errors and even the lowest AIC scores. For LAI, the model based solely on the 3D and VNIR variables achieved very similar results for all performance metrics compared to when using all variables (RMSE = .66 vs .6 m2/m2). Indeed, among the PCR models using only 1 PC, the model based only on TIR variables had the worst results when compared to LAI measurements (r = .44 and RMSE = .75 m2/m2). By contrast, TIR variables demonstrated very high importance when establishing PCR models for gs, Ψleaf, and Ψstem. Indeed, all model combinations without TIR variables had very low performance for gs (r < .1), Ψleaf (r < .4) and Ψstem (r < .4). In fact, for Ψstem, the model using only TIR variables achieved very similar error statistics (r = .61, RMSE = .20 MPa) compared to using the entire set of variables (r = .65, RMSE = .19 MPa), while gs and Ψleaf models benefited more when TIR was combined with the different variable types (Table 3).
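The error metrics used to compare these models can be sketched as follows. This is an illustrative sketch under stated assumptions: rRMSE is assumed here to be the RMSE normalised by the mean of the observations, and AIC is assumed to take the common Gaussian least-squares form n·ln(RSS/n) + 2k (constant terms dropped). The exact definitions applied in this study are those of the methods section, and the function names below are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def rrmse(obs, pred):
    """RMSE as a percentage of the mean observed value (assumed normalisation)."""
    return 100 * rmse(obs, pred) / (sum(obs) / len(obs))


def aic_least_squares(obs, pred, n_params):
    """AIC for a least-squares fit, up to an additive constant (assumed form).

    n_params counts the fitted coefficients, e.g. 4 for an intercept plus
    three principal components.
    """
    n = len(obs)
    rss = sum((p - o) ** 2 for o, p in zip(obs, pred))
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical LAI observations and model predictions (m2/m2).
lai_obs = [1.0, 2.0, 3.0]
lai_pred = [1.5, 2.0, 2.5]
print(rmse(lai_obs, lai_pred), rrmse(lai_obs, lai_pred))
print(aic_least_squares(lai_obs, lai_pred, n_params=4))
```

Because AIC penalises the number of fitted parameters, a model using three PCs has to reduce the residual error enough to offset the extra coefficients relative to a single-PC model, which is why a lower AIC for the full model is informative.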

Table 3 Correlation (r), root-mean-square error (RMSE), relative root-mean-square error (rRMSE) and Akaike information criterion (AIC) of principal component regression models using different input combinations from RGB, VNIR and TIR variables for leaf area index (LAI), stomatal conductance (gs), leaf water potential (Ψleaf) and stem water potential (Ψstem)
Fig. 7

Scatter plots and coefficient values of independent variables for multiple linear regression model applied with the three principal components of all variables from UAV payload (TIR, VNIR and 3D) for leaf area index (LAI, a), stomatal conductance (gs, b), stem water potential (Ψstem, c) and leaf water potential (Ψleaf, d) (Color figure online)

The importance of TIR variables to estimate these physiological measurements was also supported by the model coefficient values shown in Fig. 7. Indeed, PC3, largely representing TIR variables, had the largest weight in the PCR models developed for gs, Ψleaf and Ψstem (Fig. 7b, c, d). By contrast, PC1, representing mostly CA, OSAVI and NIRv, was by far the most important predictor for LAI (Fig. 7a). As shown in Fig. 7, the Ψleaf PCR model showed the lowest correlation (r = .51) while the PCR model for Ψstem had the highest correlation (r = .65). The LAI PCR model showed relatively high correlation (r = .62) but a somewhat large scatter of results (RMSE = .65 m2/m2, rRMSE = 37%).

Random Forest regression models

The results of the developed RF models are shown in Fig. 8 for the calibration (70%) and validation (30%) datasets, together with the feature importance of each variable in the trained RF models. In general, there is some evidence of model overfitting, with very high correlations and low errors observed during the model calibration phase (r > .92, rRMSE < 20%) but comparatively larger errors observed when the models were applied to the validation dataset, most notably for LAI and Ψleaf. Indeed, the LAI RMSE increased from .36 to .73 m2/m2 from the calibration to the validation dataset, with the latter having poorer performance metrics than the PCR model (RMSE = .65 m2/m2) and large biases apparent for both low (< 1 m2/m2) and high (> 2.5 m2/m2) LAI samples. The most important features for the LAI RF model were the 3D structural (CA and CH) and VNIR (OSAVI, NIRv) variables, which is a similar result to the PCR models discussed above. The Ψleaf RF validation results (r = .64, RMSE = .26 MPa) were, by contrast, slightly better than those of the PCR model (r = .51, RMSE = .28 MPa). Interestingly, the most important variable for the Ψleaf RF model was CH, which was not observed in the PCR analysis, followed by LST and then NIRv. There were seemingly fewer overfitting issues with gs and Ψstem, with smaller differences in model performance between calibration and validation. Indeed, the gs errors barely changed between calibration (rRMSE = 19%) and validation (rRMSE = 20%), and were much lower than those of the PCR model (rRMSE = 32%). The dT was by far the most important variable for the gs RF model, followed by CIgreen, reNDVI and LST, each having similar model weights. The Ψstem RF model (r = .78, RMSE = .17 MPa) also performed better than the PCR model (r = .65, RMSE = .19 MPa), with TIR variables (LST and dT) having the largest importance followed by NIRv.

Fig. 8

Scatter plots of calibration (70%, left column) and validation (30%, center column) along with feature importance coefficients (right column) of Random Forest regression models developed using TIR (red bars), VNIR (green bars) and RGB (blue bars) variables from UAV payload for leaf area index (LAI), stomatal conductance (gs), leaf water potential (Ψleaf) and stem water potential (Ψstem) (Color figure online)

Spatial–temporal effects on model performance

The RF model was also assessed for the different seasonal periods sampled in this study (Fig. 9), which spanned the BBCH phenological growth stages 75 (beginning of fruit development) to 85 (end of berry ripening). The trained RF model for each dependent variable was evaluated independently for each campaign of the 2022 and 2023 datasets, except for gs, which only had data available for 2023. In general, model performance was best during the mid-summer period (July 18th to August 16th), with the highest correlations and lowest errors. This was particularly apparent for the 2023 campaigns, where lower correlations were consistently shown for the first and last campaigns of the season for all variables. For example, Ψstem showed very low correlations (r < .2) and high errors (rRMSE > 55%) for the first two campaigns on 2023-06-22 and 2023-07-04 (and the last on 2023-08-29) but had much higher correlations (r > .5) and lower errors (rRMSE < 30%) for the other dates. This was also observed to some extent for Ψleaf, which had higher errors in these first two campaigns of 2023, although its correlations were generally lower than those of Ψstem. The rRMSE was relatively constant for gs throughout the period assessed, roughly oscillating between 30 and 40%. However, the correlation for gs was also generally higher between 2023-07-04 and 2023-08-16, with a notably lower correlation for the first campaign on 2023-06-22 (r < .3). LAI errors were fairly consistent throughout the phenological period in 2022 (rRMSE ranged between 30 and 50%) but greater seasonal variability was observed in 2023, with rRMSE ranging from 30 to 75%. Indeed, errors were generally lower and had less seasonal variation in 2022 compared to the 2023 campaigns for all available variables.

Fig. 9

Model performance (Pearson correlation r, top row) and relative root-mean-square error (rRMSE, bottom row) of developed random forest model evaluated for each campaign of 2022 (left column) and 2023 (right column) throughout the phenological period. Area shaded in grey corresponds to the fruit development stage (BBCH 75–79) and area shaded in blue corresponds to the ripening of berries stage (BBCH 81–85) (Color figure online)

The dataset was also spatially aggregated to obtain mean values for each row, where the three sampling points of each row were averaged (Fig. 1) for both the dependent and independent variables. The PCR and RF models were trained and evaluated using these row means, as illustrated in Fig. 10. Both the PCR and RF model results improved considerably compared with those obtained using the original vine-level dataset, especially for LAI and Ψleaf. Indeed, the LAI rRMSE was below 30% for both the PCR and RF models, which also better captured the full variability of values. The Ψleaf model performance also improved considerably, especially for the RF model (r = .77, rRMSE = 15%). The spatial aggregation also showed improvements for Ψleaf with both PCR and RF models, while the gs RF model had slightly worse performance metrics compared to the original non-aggregated dataset.
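The row-level aggregation described above amounts to grouping the vine-level samples by row and averaging each variable. A minimal sketch is given below; the record layout (a `row` key plus one key per variable) and the sample values are hypothetical and only serve to illustrate the operation.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_row(samples, keys):
    """Average the given variables over all sampling points of each row.

    samples: list of dicts, each with a 'row' identifier and numeric values
             for every name in keys (predictors and in situ variables alike).
    Returns {row: {key: row mean}}.
    """
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["row"]].append(sample)
    return {
        row: {key: mean(s[key] for s in group) for key in keys}
        for row, group in grouped.items()
    }


# Hypothetical vine-level records: three sampling points per row.
samples = [
    {"row": 1, "dT": 1.0, "psi_leaf": -1.0},
    {"row": 1, "dT": 2.0, "psi_leaf": -1.2},
    {"row": 1, "dT": 3.0, "psi_leaf": -1.1},
    {"row": 2, "dT": 4.0, "psi_leaf": -1.4},
    {"row": 2, "dT": 5.0, "psi_leaf": -1.5},
    {"row": 2, "dT": 6.0, "psi_leaf": -1.6},
]
row_means = aggregate_by_row(samples, ["dT", "psi_leaf"])
print(row_means)
```

Averaging both predictors and response variables before model fitting reduces the sample size by a factor of three in this layout, so the gain in signal-to-noise ratio comes at the cost of fewer training records.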

Fig. 10

Scatter plots of principal component regression (left column), using the entire row averaged dataset, and the Random Forest regression (right column), using 30% of the row averaged dataset as validation, when aggregating model inputs (UAV payload) and in situ variables (LAI, gs, Ψleaf, Ψstem) as the mean of each row (three sampling points per row) (Color figure online)

Discussion

Results from this study assessed the applicability of various indicators estimated from a UAV payload equipped with TIR, VNIR and RGB sensors to track water stress in a vineyard under different irrigation regimes. The analysis revealed the importance of the TIR spectral region to monitor grapevine water status, being the only set of variables to individually correlate significantly with in situ physiological measurements of gs, Ψleaf and Ψstem (Fig. 5), with perhaps the only exception being NIRv and Ψstem (Spearman’s ρ = .5). This was further confirmed through both the PCR and RF models, where the TIR variables had the largest weight in the empirical models trained for gs, Ψleaf and Ψstem. Moreover, the PCR models which did not use TIR variables were poorly related to in-situ physiological proxies (r < .4), particularly for gs (r < .1). This was further supported by the RF models, where the variables with the highest feature importance for physiological proxies were dT or LST, with the exception of Ψleaf, for which CH and then LST were the most important features. These findings confirm past results which demonstrated the suitability of TIR imagery to accurately detect water stress in vegetation (Jackson et al., 1981; Wang et al., 2024), including vineyards (Baluja et al., 2012; Bellvert et al., 2014; Santesteban et al., 2017). The large importance of CH for Ψleaf was a surprising result but perhaps pertains to the sensitivity of Ψleaf to changes in the canopy architecture. Indeed, as discussed in García-Tejera et al. (2021), changes to plant canopy morphology affect the resistance of water movement from soil to leaves and, thus, values of Ψleaf. For example, canopy pruning will reduce the root-to-leaf ratio, decreasing resistance due to plant hydraulic architecture, and induce higher Ψleaf for the same given soil and atmospheric conditions (García-Tejera et al., 2021). Hence, care should be taken when using leaf water potential as a metric for crop stress.

The low performance of CWSI in this study (Spearman’s ρ < 0.37) was also a relatively surprising result, especially given its widespread use in the literature for vineyard water stress detection (e.g. Bellvert et al., 2014, 2016; Matese et al., 2018; Santesteban et al., 2017). This likely stemmed from the sensitivity of CWSI to the selection of the so-called ‘hot’ and ‘cold’ pixels, representing limiting conditions where the vegetation canopy is meant to be suffering maximum stress (i.e. no transpiration) and minimum stress (i.e. fully transpiring at potential rate), respectively. Indeed, there are various ways to estimate CWSI, ranging from direct to empirical and analytical methods (see Maes & Steppe, 2012 for a comprehensive review). This study applied the histogram approach by acquiring the ‘cold’ and ‘hot’ pixels as simply the 1st and 99th percentiles of the LST of vine pixels observed in each UAV overpass, as similarly applied in many other studies (De Swaef et al., 2021; Maimaitiyiming et al., 2020; Veysi et al., 2017). However, fully transpiring and fully stressed conditions were likely not present during the UAV overpasses, even though the three different irrigation regimes applied induced a certain level of variability. In addition, the minimum and maximum LST values were likely changing between overpass dates due to different atmospheric conditions influencing the evaporative demand, and also throughout the phenological period due to changes in crop development and water stress, resulting in different ‘anchor’ pixels on each UAV overpass date. Bellvert et al. (2015) demonstrated that the relationship between Ψleaf and the empirical derivation of CWSI changed throughout the vine phenological stages due to changes in canopy cover and density, suggesting the need to re-calibrate empirical models to account for these effects. Alternatively, physically-based methods can estimate evapotranspiration (ET) through resistance-based energy balance modeling, such as the two-source energy balance (TSEB, Norman et al., 1995) model, to more directly compute CWSI as the ratio between actual ET and potential ET (Nieto et al., 2022). In fact, Nieto et al. (2022) suggested the use of alternative crop water stress indices based on the ET partitioning (i.e. transpiration) inherently performed by TSEB, to better capture the stress signal from the vegetation. The UAV payload and variables assessed in this study could be ingested into the TSEB model, as shown by Burchard-Levine et al. (2024), to better track water stress in different seasonal periods without re-calibration, with the added advantage of quantifying crop water demand in physical units (e.g., mm/day) rather than solely relying on relative water stress mapping.
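The histogram approach to CWSI discussed in this paragraph can be sketched as follows, taking the common empirical form CWSI = (T − T_cold) / (T_hot − T_cold). This is an illustrative sketch only: the linear-interpolation percentile, the clipping to [0, 1] and the function names are assumptions, and the pixel values are synthetic.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0-100)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def cwsi_histogram(lst_vine_pixels, cold_q=1, hot_q=99):
    """CWSI per pixel, anchored on percentiles of the overpass's own vine LST.

    T_cold and T_hot are taken from the LST distribution of a single UAV
    overpass, so the resulting index is relative to that overpass only.
    """
    t_cold = percentile(lst_vine_pixels, cold_q)
    t_hot = percentile(lst_vine_pixels, hot_q)
    span = t_hot - t_cold
    if span <= 0:
        raise ValueError("hot and cold anchors coincide; CWSI is undefined")
    # Pixels beyond the anchors are clipped to the [0, 1] range (assumption).
    return [min(1.0, max(0.0, (t - t_cold) / span)) for t in lst_vine_pixels]


# Synthetic vine-pixel LST values (deg C) for one overpass: 30.0 to 40.0.
pixels = [30 + i / 10 for i in range(101)]
cwsi = cwsi_histogram(pixels)
print(cwsi[0], cwsi[50], cwsi[-1])
```

Because both anchors are drawn from the scene itself, the same canopy temperature can map to different CWSI values on different dates, which is consistent with the date-to-date variation in ‘anchor’ pixels discussed above.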

The robustness of empirical models, such as those shown in this study, is largely dependent on the quality and quantity of the reference datasets. Considering the dimensions of the case study (~ 0.5 ha), sufficient data were acquired to robustly analyze and understand which UAV-based variables were most related to the vine physiological response to water stress using the developed empirical models. However, the results of the RF models do show some evidence of overfitting, especially for Ψleaf and LAI, with strong differences in model performance between the calibration (r > 0.9) and validation (r ~ 0.6) datasets (Fig. 8). Overfitting issues were less apparent when spatially aggregating the reference datasets at the row level (Fig. 10), with overall better model performance during the validation phase. This suggests that the row-level aggregation likely cancelled out some of the inherent uncertainty of the vine-level dataset, which was being captured during the training phase and made the models less generalizable. For instance, the validation of the RF LAI model showed relatively large errors (RMSE = .73 m2/m2; rRMSE = 44%), especially compared to other studies estimating LAI in vineyards, such as Gao et al. (2022) or Aboutalebi et al. (2020), who reported an RMSE of roughly 0.30 m2/m2 when combining VNIR, TIR and 3D variables derived from UAVs with machine learning methods. Similarly to our study, Aboutalebi et al. (2020) showed that vine structural variables such as area or volume were highly correlated with LAI. Kang et al. (2022) also reported a LAI RMSE between ~ .35 and .60 m2/m2 using empirical models trained with satellite-based VIs. However, in our study, both the PCR and RF LAI modeling performance improved notably when aggregating both the predictors and response variables at the row level (RMSE ~ .5 m2/m2; rRMSE < 30%). Indeed, while the sampling protocol used, as described in White et al. (2019), is well established to measure LAI in vineyards, there are always important sources of measurement uncertainty, including environmental conditions and different field operators during the various field campaigns. In this case, it may be necessary to perform more measurement repetitions at each sample point to limit inherent sampling errors and develop more generalizable models. For example, Gao et al. (2022) used over 1000 sample records across various years (2014–2019) and sites (three), including augmenting their LAI sample variability using bare-soil LAI (LAI = 0 m2/m2), to train and test their LAI models. As such, it might be necessary to increase the training set variability to improve results, as the LAI observations had relatively low variability (largely between 1 and 3 m2/m2 for all irrigation treatments and dates), which may be linked to the poorer modeling performance observed especially for high and low LAI values as shown in Fig. 8. Similar patterns were also shown for Ψleaf, with the RF model improving substantially when the dataset was aggregated at the row scale, obtaining an r and RMSE of 0.79 and 0.18 MPa, respectively. These results were largely in line with past results such as Bellvert et al. (2015, 2016), who reported an R2 between 0.55 and 0.76 and an RMSE between 0.15 and 0.21 MPa using both general and seasonal empirical models relating CWSI with Ψleaf. Measurement uncertainties are especially relevant for water potential sampling using the pressure chamber method (Scholander et al., 1965), which is susceptible to important errors related to the sampling protocol employed and the subjective decision making of different operators, which were demonstrated to lead to significant biases by Rodriguez‐Dominguez et al. (2022). Indeed, using combined datasets from different field sites with large variability in vineyard characteristics (e.g. structural characteristics, variety) would improve the generality of the models and potentially serve to upscale the results for different sites or management scales.

As shown in Fig. 9, there were important seasonal trends in the performance of the trained RF model. Notably in 2023, both Ψleaf and Ψstem showed much larger errors in the first two campaigns (June 22 and July 4) and, to a lesser extent, in the last campaign (August 29). Indeed, for all response variables assessed, the empirical models performed best during the peak vine biomass between mid-July and mid-August, the beginning of the fruit ripening or veraison phase. The lower modeling performance during the early and late phenological stages, especially for the 2023 campaigns as shown in Fig. 9, may be related to the greater possible uncertainties stemming from the in-situ measurements, where vine leaves have greater heterogeneity in development, size and health. Certainly, the sampling protocols for in-situ physiological measurements have an implicit bias towards sampling well-illuminated and healthy leaves, which are then compared against averages of the entire vine canopy as observed from the UAV imagery, which contain sunlit, shaded and, possibly, diseased leaves. For instance, due to time constraints during the UAV overpass, only one leaf sample per vine was collected for each of the Ψleaf and Ψstem measurements, whereas past works suggested the acquisition of five or six leaves per vine (Acevedo-Opazo et al., 2008; Rodríguez-Pérez et al., 2007; Van Leeuwen et al., 2009). While six leaf-level samples were acquired per vine for the gs measurements, which is similar to the sampling strategy recommended in Loveys et al. (2005) to limit relative sampling error to 5%, the intrinsic bias of sampling well-illuminated and healthy leaves hinders the comparability of these leaf-level measurements to canopy-scale observations, which may be accentuated in early and late phenological periods when there is a larger heterogeneity of leaf conditions. Möller et al. (2007) attributed a lower correlation between CWSI and Ψstem during the late season to the effects of partial leaf senescence within the canopy. In addition, Jones et al. (2002) demonstrated large differences in gs values between sunlit and shaded leaves, with sunlit leaves having, on average, double the gs values of shaded leaves. This was particularly the case for Ψleaf, which generally showed more ‘noisy’ patterns, both in the performance of the trained empirical models and in the seasonal observations themselves. For example, Ψleaf showed no consistent irrigation treatment effects in 2023, whereas significant treatment patterns were observed for both Ψstem and gs during the mid-summer period that year (Fig. 3). Another factor that may explain the seasonality of the model performance is the different atmospheric conditions present throughout the phenological cycle. For example, Bahat et al. (2024) were able to improve model predictions of Ψstem by adding meteorological scalars as model predictors, showing that shortwave incoming radiation was the most important feature in the developed model. The authors stated that the seasonality effect in model predictions was mitigated by taking into account the changes in evaporative demand between the different dates, which may have played a role in the results presented here, where only air temperature (through the dT variable) was used.

Ψstem and gs were generally better predicted using the empirical models developed as compared to LAI or Ψleaf. The results were in line with past studies such as Santesteban et al. (2017), who reported R2 values of .69 and .71 for Ψstem and gs, respectively, by fitting a non-linear model using a UAV-based and empirically derived CWSI. Baluja et al. (2012) also achieved an RMSE of ~ 0.1 MPa and ~ 0.2 mol m−2 s−1 for Ψstem and gs, respectively, using VNIR and TIR indices over a single UAV overpass during the end of the phenological phase (veraison period). As Baluja et al. (2012) pointed out, VNIR-based indicators may be suitable to detect the vine response to water stress after a prolonged period of water limitation, but TIR information more directly captures short-term physiological changes. In this study, VNIR variables did not demonstrate effectiveness as predictors for vegetation water stress proxies (Fig. 5; Table 3). Romero et al. (2018) also showed that multi-spectral VIs had rather low correlations with Ψstem in a vineyard in China, but considerably improved correlations when applying an artificial neural network model with a series of VIs from the Sequoia sensor. They achieved similar results to those presented here by solely using VNIR indices (r = 0.73 and 0.62 for validation and testing datasets, respectively) but their model performances were only tested during the late veraison period. TIR information is likely necessary to detect the rapid effects of water stress before visible changes to canopy structure are apparent, in order to properly implement water management strategies that optimize water use efficiency. This was demonstrated for winter wheat by Wang et al. (2024) and is also supported by the in-situ data collected. For instance, LAI observations were only significantly larger for the 0.8Kc treatment, while no significant differences were found between the 0.2 and 0.4Kc irrigation regimes. This is a very similar result to Möller et al. (2007), who showed only a significantly larger LAI for the mild water stress treatment and no significant difference between the moderate and severely stressed treatments. This is in contrast to the vine physiological measurements, particularly Ψstem and gs, which generally had more significant differences throughout the phenological period (Fig. 3), showing a clearer physiological response to irrigation treatments. A similar trend was observed with the VNIR and TIR indices in 2022, where NDVI values were only significantly different between irrigation treatments from mid-July onwards while dT was significantly different for all campaigns of that year. Interestingly, during the first campaign of 2023, which was right after a rainfall event (Burchard-Levine et al., 2024), NDVI values showed significant differences, likely stemming from the legacy treatment effects of the prior year, but dT, along with in-situ physiological observations, did not show any significant differences as the recent rainfall limited water stress over all treatments. This further demonstrated that TIR imagery can capture short-term vegetation physiological changes while VNIR information may only capture the longer-term canopy growth or structural changes responding to the water limitation. It should be noted that the radiometric quality of the multispectral data was not assessed in this study, with certain studies demonstrating an effect from atmospheric conditions and biases, particularly for the green and red-edge bands of the Sequoia sensor (Fawcett et al., 2018; Olsson et al., 2021). This could have affected the performance of the VNIR indices in this analysis, and future work should evaluate the radiometric quality over the seasonal period to understand how this could affect their performance and feature importance within the developed empirical models.

Conclusion

These results demonstrated the applicability of high-resolution imagery from UAVs to track water stress in vineyards by combining TIR, VNIR and 3D structural variables. Notably, TIR variables were shown to be the most important predictors to track in-situ water stress proxies such as Ψstem and gs, while VNIR and 3D structural variables, such as canopy area and height, were most related to LAI measurements. The grapevines demonstrated a notable physiological response to the different irrigation treatments through rather consistent and significant differences in the in-situ physiological observations, but structural changes in foliage were not as sensitive to water stress, as suggested by the field LAI measurements. Empirical models taking advantage of all three variable groups (i.e. TIR + VNIR + 3D) showed the best performance for all response variables. RF models generally outperformed PCR models; however, evidence of model overfitting was observed, particularly for LAI and Ψleaf. These overfitting effects were less apparent when spatially aggregating the datasets to the row level instead of individual vines, with improved results for both LAI and Ψleaf at this scale, perhaps indicating the presence of noisy data at the individual vine canopy level for these variables. In addition, the model performance showed certain seasonal trends, with the best results being obtained during the peak biomass period from mid-July to mid-August. Future work should apply these developed empirical models to further understand the irrigation treatment effects on the agronomic management of grapevines to better support and monitor yield and grape quality, while improving water use efficiency.