Forecasting carrot yield with optimal timing of Sentinel 2 image acquisition

Suarez, L. A.; Robertson-Dean, M.; Brinkhoff, J.; Robson, A.

doi:10.1007/s11119-023-10083-z

Open access
Published: 02 November 2023

Volume 25, pages 570–588, (2024)

Precision Agriculture

L. A. Suarez ORCID: orcid.org/0000-0002-4233-2172¹,
M. Robertson-Dean²,
J. Brinkhoff¹ &
…
A. Robson¹


Abstract

Accurate, non-destructive forecasting of carrot yield is difficult due to the crop's subterranean growing habit. Furthermore, forecasting usually occurs when the crop is mature, limiting the opportunity to implement alternative management decisions to improve yield during the growing season. This study aims to improve the accuracy of carrot yield forecasting by exploring time series and multivariate approaches. Using Sentinel-2 satellite imagery in three Australian vegetable regions, we established a time series of carrot phenological stages (PhS) from ‘days after sowing’ (DAS) to enhance prediction timing. Numerous vegetation indices (VIs) were analyzed to derive temporal growth patterns. Correlations with yield at different PhS were established. Although the average root yield (t ha⁻¹) did not significantly differ across the regions, the temporal VI signatures, indicating different regional crop growth trends, did vary, as did the PhS at which the maximum correlation with yield occurred ($PhS_{{R2_{max} }}$), with two of the regions producing a delayed $PhS_{{R2_{max} }}$ (i.e. 90–130 DAS). The best multivariate model was identified at 70 DAS, extending the forecasting window before harvest by 20 to 60 days. The performance of this model was validated with new crops, producing an average error of 16.9 t ha⁻¹ (27% of total yield). These results demonstrate the potential of the model at such an early stage under varying growing conditions, offering growers and stakeholders the chance to optimize farming practices, make informed decisions on selling, harvesting, and labor planning, and adopt precision agriculture methods.


Introduction

Accurate pre-harvest yield forecasting is essential for all agricultural and horticultural industries as it helps growers (and the greater industry) be better prepared to meet market demands and respond to potential production issues such as seasonal, location and varietal variability and climate volatility. Whilst there has been extensive research on the accuracy of remote sensing (RS) for the yield forecasting of grain crops (Weiss et al., 2020), there has been little work on similar applications in horticulture and even less on carrots. This paper addresses this shortfall by evaluating the accuracy of RS for forecasting yield in commercial carrot crops grown over multiple seasons and locations within Australia.

Carrot is an important vegetable crop due to the high nutrient content and benefits for human health (Que et al., 2019). However, forecasting carrot yield is not a common practice. This is likely because carrots grow underground, and the farming systems used are often intensive, characterised by small planting areas, spatially dispersed crop planting (i.e. crops distributed with spatial gaps), and usually involve crop rotation. These characteristics also limit the development and adoption of commercial carrot yield monitors or other technological developments that allow growers to have a clear understanding of the within-field yield variability.

In a recent study by Schauberger et al. (2020), 362 studies on yield forecasting were identified from a query to the Web of Science® from 2004 to 2019. Among horticultural crops, 12 papers explored potatoes, followed by olives (9), citrus (5), apples (4), mangoes (3) and strawberry (2), with none on carrots. Subsequent to this review, Suarez (2020) explored the accuracies of hyperspectral data, Sentinel-2 (S2) and Worldview-3 (WV3) satellite imagery for forecasting carrot yield prior to harvesting as a surrogate approach for yield monitors.

While accurate crop yield forecasting can be achieved using various RS platforms and sensors (satellite, airborne, UAV, proximal) (Weiss et al., 2020), it is crucial to consider the intricate interactions between canopy reflectance properties and factors such as crop type, phenological stages, plant densities, soil type, and agro-climatic zones (Al-Gaadi et al., 2016; Mkhabela et al., 2011). Vegetation indices (VIs) have been developed to measure different biophysical or biochemical variables of crops, such as water status, chlorophyll content, or biomass (Zarco-Tejada et al., 2005). These VIs change during the growing season, reflecting variations in crop variables that, in turn, impact crop status, strongly correlated with yield (Schlemmer et al., 2013). VIs thus act as proxies for yield forecasting (Shanahan et al., 2001). To establish robust relationships with yield, reflectance-based data must be calibrated to account for these influences.

The simplest calibration approaches are linear or non-linear regressions using reflectance information, usually in the form of VIs, to estimate biophysical variables (Schauberger et al., 2020). Although simple, these methods are statistically preferable as they are easier to interpret, are less likely to overfit and have been proven accurate (Robson et al., 2017; Suarez et al., 2020). Bolton and Friedl (2013) established univariate linear regressions relating the Normalised difference vegetation index (NDVI), the two-band Enhanced vegetation index (EVI2) and the Normalised difference water index (NDWI), derived from MODIS, to maize and soybean yield at the regional level. Due to the influence of phenological growth stage on the reflectance properties of the crop, yield predictions from the linear models improved when they included days after sowing (DAS) or ‘greenup’ instead of the “day of the year”. From this analysis, 70 and 80 DAS were identified as the optimum growth stages that produced the highest correlation between the VIs and maize and soybean yield, respectively. Of the VIs tested, NDWI produced the highest correlation to yield in maize (R² = 0.58) and EVI2 in soybean (R² = 0.70). For potato (another subterranean crop), Al-Gaadi et al. (2016) obtained prediction errors of between 7.9 and 13.5% from Landsat-8 and between 3.8 and 10.2% from S2 using a univariate approach a few days prior to harvesting.

Although in recent times more robust regression approaches including Machine Learning (ML) and Artificial Intelligence (AI) algorithms have gained popularity for the forecasting of crop yield from remote sensing, their capacity to accurately extrapolate or forecast under unknown events is often limited in comparison to regression models (Johnson et al., 2016; Shaub, 2020). These approaches also require large datasets for calibration and then validation of the models. Johnson et al. (2016) tested multivariate linear regressions (MLR), Bayesian neural networks (BNN) and model-based recursive partitioning models for predicting the yield of barley, canola and spring wheat over the Canadian Prairies between 2000 and 2011. The authors used NDVI and EVI from AVHRR and MODIS as predictors (i.e. MODIS-NDVI, MODIS-EVI and NOAA-NDVI) and hierarchically clustered the crops (i.e. prairies) at different geographical levels. Whilst accuracies did vary per crop, the MLR models with NDVI and EVI as predictors produced significantly higher forecasting accuracies. The interaction of NDVI and EVI (NDVI x EVI) was found to be more accurate in the forecasting of barley than canola or spring wheat yields. The rationale behind this result is the linear relationship found between MODIS-NDVI and each crop yield. In contrast, a study by Gomez et al. (2019) reported that machine learning algorithms produced the lowest prediction error (11.2% at the block level) using S2 for potato yield forecasting 2 months prior to harvesting.

From previous studies, there is not one conclusively superior statistical approach for forecasting yield from remotely sensed data. The accuracies vary according to the crop, level of association (clustering, global, region and block level), resolution of the imagery and timing of capture in relation to the crop growing stage (Tedesco et al., 2021). However, one common trend is to find the point in time with the highest correlation between RS data and crop yield (Bala & Islam, 2009; Tedesco et al., 2021; Zhao et al., 2007). Tedesco et al. (2021) demonstrated that VIs serve as effective proxies for monitoring the temporal changes in sweet potato crops and distinguishing between their phenological stages. They found that the period of active growth (200–500 growing degree days, GDD) resulted in the smallest yield prediction errors, regardless of the season (i.e., summer or winter). In a similar vein, Ayu Purnamasari et al. (2019) identified the greenup period as the most suitable for predicting cassava yields using VIs derived from S2 satellite imagery and biophysical properties. Bala and Islam (2009) identified the optimal period for forecasting potato tuber yield using MODIS imagery as being between 48 and 64 DAS within a growing season of approximately 96 days. In contrast, Suarez et al. (2020) achieved overall accuracies, represented as percentage errors (%), ranging from 9.2 to 12.7% when estimating carrot root yield using WV3 satellite imagery. These estimations were made approximately 4 weeks before harvest and were conducted across various vegetable growing regions in Australia. The optimal periods for yield estimation often fall between the middle and the end of the season, with the latter being too late for growers to implement alternative management changes to maximise yields.

As one methodology to achieve earlier yield forecasts, Rahman and Robson (2016) developed a two-step ‘time series’ approach using historic Landsat imagery and corresponding annual yield data that accurately depicted the growth profile of sugarcane using the Green normalised difference vegetation index (GNDVI). The authors fitted a quadratic equation using the annual growing profile and identified single images acquired in April (3 months prior to harvest) achieved the highest correlations to yield as this period represented when all regional crops had reached full canopy cover and had not yet started senescence. From these quadratic equations it was possible to estimate the maximum GNDVI value from any capture date and then input this value into a second linear equation between maximum GNDVI and yield. This methodology provided a large window in which imagery could be used to forecast yield (Feb–June). This is particularly useful in regions of continual cloud cover and extended planting periods.

For the forecasting of carrot yield from RS data an understanding of the seasonal growth profile, as measured from changes in canopy reflectance, needs to be established. This will indicate spectral changes associated with growth stages, seasonal, locational, varietal and management influences. From here, the attempt to develop yield forecasting can be better addressed in terms of identifying what period is best correlated with yield and how robust that relationship is. Therefore, the aim of this study is to develop a yield forecasting algorithm for carrot from remotely sensed imagery and to identify the optimum capture window (OCW), under the hypotheses that:

1. VI values change per growing region, sowing arrangement, and per growing stage (Rapaport et al., 2014; Suarez et al., 2017; Tedesco et al., 2021);
2. The relationship between VIs and yield varies during the growing period (Wang et al., 2016);
3. VIs can be used as predictors of crop yield (Robson et al., 2017; Shanahan et al., 2001);
4. The integration of more than one VI increases the prediction accuracy earlier in the season compared to using a single VI (to be tested);
5. The regression fit needs to be both robust and simple enough to facilitate interpretation and industry adoption.

This paper addresses an important root crop (carrots) where research is lacking and the provision of yield forecasts early in the season is not available. We will demonstrate that the growing pattern as indicated by VIs varies per growing region. We will optimize the capture window to provide yield forecast as early as possible regardless of region by integrating different structure-based, pigment-based and water-related VIs in a multivariate analysis so a unique generic algorithm can be used among seasons and locations, facilitating the adoption by growers.

Material and methods

Study area

For this study three major vegetable (carrot) growing regions were selected from both the sub-tropical (Western Australia, WA, and Queensland, Qld) and the temperate (Tasmania, Tas) climatic regions of Australia (Fig. 1). The soil type was variable across the regions: arenosol soils with low water holding capacity dominate in WA, alluvial vertisols or cracking clay soils dominate in Qld, and nitosoil soils occur in Tas. More information regarding the growing window of each region and management practices can be found in Suarez et al. (2020).

Crop distribution and field data collection

The carrot crops selected for this study exhibited similar planting and harvest dates within each growing region. Carrots are predominantly grown during the winter-spring season in Qld (with a crop duration ranging from 115 to 150 days), in summer in Tas (125 days), and throughout the year in WA (with a duration of 130–165 days). Data collection occurred between March 2017 and January 2019, encompassing four growing periods in Qld and two in both Tas and WA (Table 1).

Table 1 Carrot field crops growing periods by region included in this study

Crop boundaries of between seven and sixteen carrot fields per region were manually delineated from high resolution WV3 images acquired over each region (Fig. 1). WV3 provides 8 multispectral bands in the visible (VIS) and near-infrared (NIR) with a spatial resolution of 1.24 m, 8 Short-wave infrared (SWIR) bands (3.7 m) and 12 bands to map clouds, aerosol, water vapor, ice and snow (CAVIS at 30 m spatial resolution) (DigitalGlobe, 2018). For each carrot field crop, NDVI derived from WV3 images was calculated and Iso Cluster unsupervised classification (Ball & Hall, 1965) was used to assign each NDVI pixel value into one of 8 vigor classes (from very low to very high). Classification thresholds were assigned per field. Six sample areas (located over low, medium and high vigor zones) were randomly selected and a total of 18 areas per field crop were manually sampled for whole plant carrot yield assessment in an area defined according to the sowing arrangement (~ 1–2 m²). The manually harvested carrot yields from the 18 areas per crop were averaged and converted into t ha⁻¹. This sampling methodology was applied to ensure that the variability of canopy reflectance and therefore yield variation was encompassed within each field crop (Suarez et al., 2020). Crop quality assessment (grading) was performed for each of the samples (visual assessment of individual carrots) using standards defined in commercial practices. Leaves (fresh biomass) were removed from roots, weighed and converted into t ha⁻¹ for additional analysis (data not presented).
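The unit conversion in the sampling step can be sketched as follows. This is a minimal illustration, not the authors' script: the function names and the sample masses are hypothetical, and it assumes each sample is recorded as root mass in kilograms over a known area in square metres.

```python
from statistics import mean


def sample_yield_t_per_ha(root_mass_kg: float, area_m2: float) -> float:
    """Convert one hand-harvested sample to tonnes per hectare."""
    # 1 ha = 10,000 m2 and 1 t = 1,000 kg, so kg per m2 times 10 gives t per ha.
    return root_mass_kg / area_m2 * 10.0


def field_yield_t_per_ha(samples):
    """Average the per-sample yields (t/ha) of one field crop.

    `samples` is an iterable of (root_mass_kg, area_m2) pairs; the study
    used 18 such sample areas per field crop.
    """
    return mean(sample_yield_t_per_ha(kg, m2) for kg, m2 in samples)


# Hypothetical example: two samples of 6 kg over 1 m2 and 14 kg over 2 m2.
example_field = [(6.0, 1.0), (14.0, 2.0)]
print(field_yield_t_per_ha(example_field))
```

Converting each sample before averaging keeps samples of different areas (1 to 2 m² depending on the sowing arrangement) on the same footing.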

Satellite image acquisition for time series analysis

S2 satellite imagery (Level 1C product), available in Google Earth Engine (GEE) (Gorelick et al., 2017), was evaluated for each of the growing seasons. This product level (1C) provides Top-Of-Atmosphere (TOA) reflectance and is produced by the European Space Agency (ESA) (European Space Agency, 2023). The lowest 1% of pixel values in each tile per band were selected to remove the darkest pixels in the image, which are likely candidates for dark objects. Cloud cover analysis was performed for each of the carrot crop fields across the growing periods and only cloud-free images over the fields were retained. A total of 99 captures were analyzed, including 43 over the Qld crops, and 28 for each of the WA and Tas regions.

Satellite data extraction

The field crop boundaries, including the respective field crop ID, sowing date (SDate) and average yield (t ha⁻¹) were imported into GEE using Google Fusion Tables. In GEE, a query based on sowing and harvest date was used to select the S2-L1C imagery available for each growing season. VIs (listed in Table 2) were calculated and the mean value of each multispectral band and VI was extracted per field crop. A data table was generated per region, which contained the crop information (i.e. crop ID, yield, SDate), capture date (CDate) and the reflectance values for all the available multispectral bands and the VIs. An R software script (R Core Team, 2014), was designed to import the resulting data tables from GEE and to run the required exploratory and statistical analysis. Time series were produced based on the calculation of the DAS:

$${\text{DAS}} = {\text{CDate }} - {\text{SDate}}$$

(1)
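Equation 1 is a plain date difference. A minimal sketch, assuming capture and sowing dates are available as calendar dates (the function name and the example dates are illustrative only):

```python
from datetime import date


def days_after_sowing(capture_date: date, sowing_date: date) -> int:
    """DAS = CDate - SDate, expressed in whole days (Eq. 1)."""
    return (capture_date - sowing_date).days


# Hypothetical crop sown on 1 March 2017 and imaged on 10 May 2017.
print(days_after_sowing(date(2017, 5, 10), date(2017, 3, 1)))  # 70
```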

Table 2 Sentinel-2 bands and vegetation indices (VIs) tested in this study

VI and reflectance values were interpolated to ± 10 day intervals, stopping at 150 ± 10 DAS. This produced a time series consisting of eight phenological stages (PhS) (Bolton & Friedl, 2013): 10, 30, 50, 70, 90, 110, 130 and 150 DAS with 0 DAS equivalent to the SDate. An initial attempt to reduce the interpolation period to ± 5 days was performed. However, many crops did not have sufficient images available due to cloud conditions, limiting sample size and the ability to undertake the subsequent statistical analysis. Hence, the broader ± 10 day interval was used to guarantee image availability for each PhS.
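One way to build such a PhS series is sketched below. The text does not specify the interpolation method, so this sketch assumes linear interpolation between the nearest captures and reports a PhS as missing when no capture lies within the ± 10 day window; the function name and the example values are hypothetical.

```python
PHS_GRID = (10, 30, 50, 70, 90, 110, 130, 150)


def interpolate_to_phs(das, values, grid=PHS_GRID, window=10):
    """Map an irregular VI series onto the PhS grid.

    Returns {phs: value or None}. A PhS gets a value only if at least one
    capture falls within +/- `window` days of it (assumed rule).
    """
    pairs = sorted(zip(das, values))
    result = {}
    for phs in grid:
        if not any(abs(d - phs) <= window for d, _ in pairs):
            result[phs] = None  # no usable image for this stage
            continue
        before = [(d, v) for d, v in pairs if d <= phs]
        after = [(d, v) for d, v in pairs if d >= phs]
        if before and after:
            (d0, v0), (d1, v1) = before[-1], after[0]
            if d1 == d0:
                result[phs] = v0  # capture falls exactly on the PhS
            else:
                result[phs] = v0 + (v1 - v0) * (phs - d0) / (d1 - d0)
        else:
            # Captures on one side only: fall back to the nearest capture.
            result[phs] = (before[-1] if before else after[0])[1]
    return result


# Hypothetical NDVI captures at 5, 25, 65 and 75 DAS.
series = interpolate_to_phs([5, 25, 65, 75], [0.1, 0.3, 0.6, 0.7])
print(series)
```

Narrowing `window` to 5 leaves more stages empty, which mirrors the sample-size problem described above for the ± 5 day attempt.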

Carrot canopy reflectance profile and Optimal Capture Window (OCW) for yield forecasting per growing region

The VIs measured across each growing region were compared to better understand the variation of the canopy reflectance both temporally and spatially. The proposed VIs (Table 2) include structure-related, pigment-related and water-related indices, enabling them to effectively indicate the crop condition within the carrot fields.
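For orientation, a few of the structure-related indices named in this paper can be computed from band reflectances as sketched below. The formulas are the commonly published definitions; Table 2 is not reproduced here, so the exact band choices and constants used by the authors are an assumption, and the function names are illustrative.

```python
def normalized_difference(a: float, b: float) -> float:
    """Generic normalised difference (a - b) / (a + b)."""
    return (a - b) / (a + b)


def ndvi(nir: float, red: float) -> float:
    return normalized_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    return normalized_difference(nir, green)


def simple_ratio(nir: float, red: float) -> float:
    return nir / red


def evi2(nir: float, red: float) -> float:
    # Two-band EVI with its usual constants (2.5, 2.4, 1).
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    # soil_factor = 0.5 is the conventional default, assumed here.
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical field-mean reflectances for a green canopy.
nir, red, green = 0.5, 0.1, 0.12
print(ndvi(nir, red), gndvi(nir, green), simple_ratio(nir, red))
print(evi2(nir, red), savi(nir, red))
```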

The time series of the aggregated VIs values were analyzed per region from which the VIs variability could be established (hypothesis 1). To identify the OCW, several univariate linear models were fitted, with the derived VIs (Table 2) used as predictor of yield (t ha⁻¹) per region and at each PhS. By analyzing the temporal relationship between individual VI and yield, we tested the hypothesis that such a relationship is not stable but that it changes according to the PhS and the VI (hypothesis 2). As such, once the OCW is established, we test the hypothesis that a specific VI can be used to forecast yield (hypothesis 3). The R² was plotted per growing region from 0 to 150 DAS and a smoothing method using the local polynomial regression fitting (loess) was added to better identify trends.
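The search for the stage of maximum correlation can be sketched as below: fit a simple linear regression of yield on one VI at each PhS and keep the stage with the highest R². This is an illustration with hypothetical names and data; it omits the loess smoothing and is not the authors' R script.

```python
def r_squared(x, y):
    """R2 of the simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains nothing
    return sxy ** 2 / (sxx * syy)


def phs_r2_max(vi_by_phs, yields):
    """Return (PhS with the highest R2, {phs: R2}) for one VI and region.

    `vi_by_phs` maps each PhS to the per-field VI values, ordered like
    `yields` (t/ha).
    """
    r2 = {phs: r_squared(vi, yields) for phs, vi in vi_by_phs.items()}
    return max(r2, key=r2.get), r2


# Hypothetical data: four fields, one VI observed at two stages.
yields = [40.0, 50.0, 60.0, 70.0]
vi_by_phs = {30: [0.2, 0.5, 0.3, 0.4], 70: [0.4, 0.5, 0.6, 0.7]}
best_phs, r2_table = phs_r2_max(vi_by_phs, yields)
print(best_phs, r2_table)
```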

Accelerating the optimal yield forecasting window capture and validation

Multivariate models for predicting yield were developed in an attempt to reduce the capture gap (CG) (i.e. time between the SDate and the OCW identified with the univariate linear models). We aimed to test the hypothesis that using this method, yield forecasts can be provided earlier in the season (hypothesis 4). Spatial variability (across the 3 regions) was also included with the derived VIs in the multivariate linear models. These new models were fitted for each DAS using a stepwise regression to identify which VIs best described the variability in $log\left( {yield} \right)$. $Log\left( {yield} \right)$ was identified as a more suitable response variable than $Yield$ due to the non-constant variability exhibited by the residuals in all fitted models. Upon transforming the response variable, model assumptions were satisfied for all DAS (normality, constant variability of residuals, independence). The stepwise regression method was carried out based upon the Akaike Information Criterion (AIC) to identify the model with the optimal AIC (Burnham & Anderson, 2004). To identify the OCW, the coefficient of determination (R²) value was compared for the models fitted for each DAS. Independent variables exhibiting multicollinearity were removed from the model, according to the generalized variance inflation factor (GVIF) (Fox & Monette, 1992), to produce a simplified model for predicting $log\left( {yield} \right)$ at the OCW across the regions.
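For reference, the two selection criteria named above have the following standard definitions. The symbols are introduced here for illustration and are not taken from the paper: $k$ is the number of estimated parameters, $\hat{L}$ the maximised likelihood, $n$ the number of field crops and $RSS$ the residual sum of squares.

$$\mathrm{AIC} = 2k - 2\ln \hat{L}, \qquad \text{which for a Gaussian linear model equals } n\ln\left( \frac{RSS}{n} \right) + 2k \text{ up to an additive constant}$$

$$\mathrm{GVIF} = \frac{\det \left( R_{11} \right)\det \left( R_{22} \right)}{\det \left( R \right)}$$

Here $R$ is the correlation matrix of all predictors, $R_{11}$ that of the columns belonging to the term being assessed and $R_{22}$ that of the remaining columns. For a term with one degree of freedom (a single VI) the GVIF reduces to the ordinary variance inflation factor $1/(1 - R_{j}^{2})$. For a multi-column term such as the three-level region factor, $\mathrm{GVIF}^{1/(2 \cdot df)}$ is the quantity usually compared across terms. The threshold applied by the authors is not stated in this section.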

The resulting ‘best’ models were validated with independent datasets from the Tas and WA regions, which were withheld from the training process. The validation dataset included new crops from seasons represented in the training data as well as from new seasons. An independent dataset from the Qld region was not available; therefore, validation results are only shown for the Tas and WA regions. Figure 2 shows the steps for satellite data extraction, processing and statistical analysis performed in this study.

Results and discussion

Crop profile characterization: spatio-temporal VI variability

Average reflectance and VI values were calculated at ten-day intervals throughout the growing season, displaying distinct profiles across regions (Fig. 3). The spectral curves at sampled sites transitioned from very low VI values (similar to bare soils) to increasing values, reaching a plateau, and then decreasing between 90 and 130 DAS. This shift signifies the change in crop canopy from active growth to declining condition, consistent with the physiological growth stage when carrots maximize photosynthetic capacity (Johansen et al., 2015) and when the cessation of the carrot root growth coincides with the fall of the shoot weight (Nilsson, 1987). The alignment of VI profiles with crop growth underscores the predominant influence of crop development on RS data changes, with soil type playing a minor role. This is evidenced by the spectral profiles depicted by structural-related indices (e.g., NDVI, EVI2, SAVI) and pigment-related VIs (CHI, NDRE), whose values constantly increased as the crop developed.

VIs such as CHI, NDRE, NDVI and SAVI constantly increased, reaching peak values around 90 DAS in the Tas region and 110 DAS in the WA and Qld regions. This suggests that crops in the Tas region reach their maximum photosynthetic capacity earlier in the growing period than the WA and Qld crops and, as a result, are harvested earlier. Other bands or indices clearly showed that the growth profile differed between regions during the entire growing period (i.e. NDRE740, NDRE783, NDRE865 and TCARI), whilst the signatures of other VIs were similar over certain periods. The latter was the case for SR (up to 50 DAS), SIPI (70 DAS–110 DAS), EVI2 and SAVI (up to 90 DAS). These results validate hypothesis 1, as the temporal variability of the VIs fluctuates based on both the growing region and the specific VI.

Vegetation indices and carrot root yield: variability per region and growing period

Univariate linear models (yield vs. VI) were calculated at each PhS per region to determine the peak of maximum correlation for yield forecasting, $PhS_{{R2_{max} }}$. The coefficients of determination (R²) varied per region and VI at different PhS, confirming hypothesis 2 (Fig. 4). However, $PhS_{{R2_{max} }}$ did not always coincide with the PhS at which the maximum VI value was achieved ($PhS_{{VI_{max} }}$), the latter typically falling between 90 and 110 DAS (Fig. 3). Most of the VIs in the WA region reached maximum correlation with yield early in the season (~ 30–50 DAS), after which the relationships declined sharply until about 90 DAS. This response may indicate that rapid early vegetative development is crucial for the efficient utilization of resources, in terms of yield potential, in a short growing period (Evers, 1988; Suojala, 2000b). In the Qld region, $PhS_{{R2_{max} }}$ occurred at around 130 DAS, indicating that canopy growth did not decline until later in the season and that the interaction of leaf senescence with carrot growth was different than in the WA crops. This interaction may be affected by genotype and the nutritional characteristics of the crops. The lengthy vegetative growth indicates that the Qld crops took more time to accumulate the final harvested yield than those in WA (Nilsson, 1987). However, a prolonged growing season does expose the carrots to increased risk of unfavorable environmental conditions such as frost.

In the Tas region, the $PhS_{{VI_{max} }}$ and $PhS_{{R2_{max} }}$ were both around 90 DAS for many of the indices evaluated (e.g. NDRE, GNDVI, NDRE740, SR_G), suggesting that the PhS at which the maximum photosynthetic capacity occurred coincided with the peak of vegetation development indicating that root growth gain did not vary much from 110 DAS until harvest (around 125 DAS). However, EVI2, SAVI and to some extent SR, showed two peaks of correlation to yield: between 30–50 DAS and 90–110 DAS. This suggests that there is potential for yield to be estimated earlier in the growing season. Hole et al. (1987) reported that the highest differences in relative root growth, defined by the shoot to root ratio, can be estimated between 27 and 48 DAS and Suojala (2000a) found that nearly 60% of the total harvested carrot yield was gained by the middle of the growing season after which there was no significant increments in yield gain. This situation can explain the capability of RS-derived data for forecasting yield at such early growing stages supporting hypothesis 3.

Reducing the capture gap (CG) for early yield forecasting

The $PhS_{{R2_{max} }}$ differed per region and according to the VI used. It occurred early, in the middle and late in the growing season for WA, Tas and Qld, respectively. Therefore, it is essential to minimize the CG (i.e. number of days from sowing to the forecast date) among the regions so the early yield forecast can be used in the current season to quantify and identify the extent of underperforming areas.

Multivariate models that included all the multispectral bands, the VIs and all regions were generated at each DAS to investigate if it was possible to reduce the CG, in other words, to provide earlier yield forecasts. However, these models were overfitted as many of the predictors (bands and VIs) showed multi-collinearity. Simplified models were tested based on the GVIF values and the VIs with high GVIF were removed until the influence of multi-collinearity was reduced. The parameters of the best models per DAS are presented in Fig. 5. The predictive capability is similar for 30, 50, 70 and 110 DAS, with moderate R² values in the range of 0.5 to 0.62, while the R² values of the models for 10 and 90 DAS are lower, at 0.35 and 0.18 respectively. At 130 DAS (close to harvest), the model performs very well, with R² = 0.8 (Fig. 5).

Some VIs are common across most of the best models. NDWI was present in the ‘best’ model for 6 of the DAS models, while NDRE740, NDRE783 and TCARI were present in 5 models (Table 3). The region variable is present in all models except 90 DAS. Notably, Vegetation Indices (VIs) related to water and pigment content play a significant role in accurately estimating carrot yields. This is attributed to key limiting factors in carrot crop growth, development, and yield. These factors include a larger photosynthetic surface, often quantified as the Leaf Area Index (LAI), which can store more macronutrients such as nitrogen (N), phosphorus (P), and potassium (K) (Abdel-Mawly, 2004), as well as an ample supply of water (Jeptoo et al., 2013; Reid & Gillespie, 2017). Increased leaf nitrogen levels enhance the photosynthetic capacity of vegetation and, consequently, the chlorophyll (Chl) content (Gitelson et al., 2003). It's worth noting that the Red edge bands in remote sensing data are highly sensitive to changes in chlorophyll content, which explains their consistent presence in the models utilizing these bands directly or via VIs.

Table 3 Variables included in the best models per ‘days after sowing’ (DAS)

The resulting multivariate models were further validated with an independent dataset of 18 carrot field crops located in the WA (12) and Tas (6) regions. The actual average carrot root yield (t ha⁻¹) per field crop was provided by the respective growers and compared with the forecasted yield (t ha⁻¹). The Root Mean Square Error (RMSE) was calculated for each DAS model. The best performing model, in terms of adjusted R², was at 130 DAS. However, consideration of the best prediction model overall was based on a number of factors, including RMSE (Fig. 6) and usefulness of the model in terms of reducing the capture gap. The model for 70 DAS performs well in terms of R² and RMSE, for both the training and the validation datasets.

The final optimal model developed for 70 DAS is shown below (Eq. 2):

$$\begin{aligned} Log\left( {yield_{{t\,ha^{ - 1} }} } \right) &= 8.25 \mp 0.390Re_{TAS} + 0.21Re_{WA} \\
&\quad + \left( { - 3.32GNDVI - 2.55NDRE740 - 19.89NDRE783 + 16.21NDRE865 + 4.91NDWI} \right. \\
&\quad \left. { - 4.66RENDVI + 0.74RE740 - 0.322SIPI + 0.12SR} \right)_{70DAS} \end{aligned}$$

(2)

where Re represents the region effect, comparing Tas and WA regions to the Qld region. The adjusted R² value for this model at 70 DAS is 0.50, and the RMSE is 10.21 t ha⁻¹.
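Applying Eq. 2 to a new field can be sketched as follows. Several points are assumptions rather than statements from the paper: the logarithm is taken to be natural (the default in R), the VI inputs are field means at 70 DAS on the same scale as the training data, and the function and variable names are hypothetical. The region effect is left as an argument supplied by the caller (0 for the Qld baseline) because the sign of the Tas term is printed as ∓ in the equation and is therefore not hard-coded here.

```python
import math

INTERCEPT = 8.25

# VI and band coefficients copied from Eq. 2 (70 DAS model).
VI_COEFFICIENTS = {
    "GNDVI": -3.32,
    "NDRE740": -2.55,
    "NDRE783": -19.89,
    "NDRE865": 16.21,
    "NDWI": 4.91,
    "RENDVI": -4.66,
    "RE740": 0.74,
    "SIPI": -0.322,
    "SR": 0.12,
}


def forecast_yield_t_per_ha(vis_70das, region_effect=0.0):
    """Back-transform the Eq. 2 log-yield prediction to t/ha.

    `vis_70das` maps each predictor name to its field-mean value at 70 DAS.
    `region_effect` is the additive term for the field's region
    (0.0 for the Qld baseline; caller-supplied for Tas and WA).
    """
    missing = sorted(set(VI_COEFFICIENTS) - set(vis_70das))
    if missing:
        raise KeyError(f"missing predictors: {missing}")
    log_yield = INTERCEPT + region_effect + sum(
        coef * vis_70das[name] for name, coef in VI_COEFFICIENTS.items()
    )
    # Assumes a natural logarithm was used when fitting the model.
    return math.exp(log_yield)
```

Exponentiating a log-scale prediction gives an estimate closer to the median than the mean of yield; the paper does not state whether any bias correction was applied, so none is included.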

By integrating several VIs in the prediction model, the correlations of crop reflectance properties to yield variability may be better explained as different VIs relate to different plant properties i.e. vegetative cover, nutritional and water status (Zhao et al., 2007). This may explain why the final model that includes multiple variables performed better, as the respective VIs have been related to biomass and the physiological condition of the crops (SR, GNDVI), as well as biochemical composition (NDRE) and water status (NDWI) (Zarco-Tejada et al., 2005). Furthermore, as the generic yield forecast model incorporates the spatial variability associated with growing location (region) and its interactions with the different VIs, it is therefore more likely to compensate for a wide range of constraints that may limit yield. These results validate hypotheses 4 and 5.

Validation of the final optimal model

The total harvested yield for each sampled field crop was provided by the respective growers. This value was compared against the predicted yield from Eq. (2). These comparisons are presented in Fig. 7, with the gray colored points corresponding to the data used for training the model in Eq. (2). RMSE for the training dataset was 10.21 t ha⁻¹. Furthermore, yield forecast of 18 additional crops (12 in the WA region and 6 in the Tas region) was calculated at 70 DAS to validate the fitted model (2). Results indicated that the model performed moderately well at predicting yield for WA and Tas crops, with a reasonably small RMSE of 16.97 t ha⁻¹ considering that the standard deviation of the validation dataset was calculated as 19.32 t ha⁻¹. The validation data is presented in Fig. 7 as the black markers.
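The error metric used for validation can be sketched as below. The function names and the example numbers are hypothetical, and expressing RMSE as a percentage of the mean observed yield is an assumption about how a relative error might be derived, not a statement of the authors' calculation.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between grower-reported and forecast yields."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed yield (assumed definition)."""
    return 100.0 * rmse(observed, predicted) / (sum(observed) / len(observed))


# Hypothetical validation fields (t/ha).
observed = [60.0, 60.0]
predicted = [57.0, 64.0]
print(rmse(observed, predicted), relative_rmse_percent(observed, predicted))
```

Comparing the RMSE with the standard deviation of the observed yields, as done above (16.97 against 19.32 t ha⁻¹), indicates whether the model improves on simply predicting the mean yield for every field.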

Limitations

Figure 7 shows that the model tended to underestimate yield: the majority of the fields in the validation dataset fell below the parity line. There are two outliers in the validation dataset (one for each region), both with unusually large observed yields. Yields around 90–100 t ha⁻¹ were not common across the sampled fields and are therefore not well represented in the training dataset. The model also tends to under-predict the high-yielding fields in the training dataset, but to a lesser degree. Including more training data from higher-yielding crops could improve the model's ability to predict them. Future research could also explore cumulative VIs over time. As noted by Lai et al. (2018), time-integrated VIs represent the phenological cycle more comprehensively than a single-date approach, which has the potential to improve the accuracy of the estimates.
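One simple way to build a time-integrated VI is to accumulate the area under the VI curve across image dates. The sketch below uses the trapezoidal rule over DAS; the integration rule, the function name `time_integrated_vi` and the sample values are assumptions chosen for illustration, and they are not the procedure of Lai et al. (2018) or of this study.

```python
def time_integrated_vi(das, vi):
    """Area under a VI time series (VI x days), by the trapezoidal rule.

    das: days after sowing of each cloud-free image
    vi:  vegetation index value extracted at each of those dates
    """
    if len(das) != len(vi) or len(das) < 2:
        raise ValueError("need at least two (DAS, VI) pairs of equal length")
    pairs = sorted(zip(das, vi))  # tolerate unordered acquisition dates
    total = 0.0
    for (d0, v0), (d1, v1) in zip(pairs, pairs[1:]):
        total += 0.5 * (v0 + v1) * (d1 - d0)
    return total


# Hypothetical NDVI series for one field, for illustration only.
das = [20, 40, 55, 70]
ndvi = [0.25, 0.55, 0.70, 0.80]
cumulative_ndvi = time_integrated_vi(das, ndvi)
```

Because the trapezoidal rule weights each interval by its length, irregular gaps between cloud-free acquisitions are handled without resampling, although long gaps still reduce how well the curve is captured.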

Conclusion

This study explored the potential of remote sensing for predicting carrot yield across multiple growing regions and seasons and at different growth stages. When a single VI was used as the predictor, the OCW varied per region and per VI. In two regions, the OCW was close to harvest. Whilst this outcome offers some benefit for pre-harvest yield forecasting, e.g. for forward selling and harvesting logistics (labor, storage, transport etc.), it is likely too late to assist growers with remedial actions to maximize production. For the first time in root crops, the methodology proposed in this study reduced the capture gap by more than 60 days for some regions by incorporating different RS data and the region as input parameters. This alone greatly improves the potential of optical remote sensing for yield forecasting in growing regions and times of the year that are cloud dominated, and it offers the immediate advantage of narrowing down yield predictions as early as 70 DAS. The fitted model presents a simple linear relationship between the regions, VIs, a multispectral band and yield. It is plausible that interactions exist between the predictors, which are yet to be explored. As more data become available, more complex models incorporating such interactions can be explored, with the potential to improve the accuracy of yield predictions at this stage of the growing season. The outcomes presented in this study are important to industry considering the subterranean growth habit of the carrot and the limited ability to derive an accurate, non-destructive pre-harvest prediction of yield.

Data availability

Data presented in this study is confidential.

References

Abdel-Mawly, S. (2004). Growth, yield, N uptake and water use efficiency of carrot (Daucus carota L.) plants as influenced by irrigation level and nitrogen fertilization rate. Assiut University Bulletin for Environmental Researches, 7(1), 111–122.
Al-Gaadi, K. A., Hassaballa, A. A., Tola, E., Kayad, A. G., Madugundu, R., Alblewi, B., et al. (2016). Prediction of potato crop yield using precision agriculture techniques. PLoS ONE, 11(9), 1–16. https://doi.org/10.1371/journal.pone.0162219
Ayu Purnamasari, R., Noguchi, R., & Ahamed, T. (2019). Land suitability assessments for yield prediction of cassava using geospatial fuzzy expert systems and remote sensing. Computers and Electronics in Agriculture, 166, 105018. https://doi.org/10.1016/j.compag.2019.105018
Bala, S. K., & Islam, A. S. (2009). Correlation between potato yield and MODIS-derived vegetation indices. International Journal of Remote Sensing, 30(10), 2491–2507. https://doi.org/10.1080/01431160802552744
Ball, G. H., & Hall, D. J. (1965). ISODATA, a novel method of data analysis and pattern classification. Stanford Research Inst.
Barnes, E. M., Clarke, T. R., Richards, S. E., Colaizzi, P. D., Haberland, J., Kostrzewski, M., et al. (2000). Coincident detection of crop water stress, nitrogen status and canopy density using ground based multispectral data. In Proceedings of the fifth international conference on precision agriculture, Bloomington, MN (pp. 16–19).
Bolton, D. K., & Friedl, M. A. (2013). Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agricultural and Forest Meteorology, 173(52), 74–84. https://doi.org/10.1016/j.agrformet.2013.01.007
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261–304. https://doi.org/10.1177/0049124104268644
DigitalGlobe. (2018). Worldview-3: Above and beyond. Retrieved August 15, 2019, from http://worldview3.digitalglobe.com/
European Space Agency. (2023). Level-1C Algorithm. Retrieved September, 2023, from https://sentinel.esa.int/web/sentinel/technical-guides/sentinel-2-msi/level-1c/algorithm-overview
Evers, A.-M. (1988). Effects of different fertilization practices on the growth, yield and dry matter content of carrot. Agricultural and Food Science, 60, 135–152.
Fox, J., & Monette, G. (1992). Generalized collinearity diagnostics. Journal of the American Statistical Association, 87(417), 178–183. https://doi.org/10.1080/01621459.1992.10475190
Gitelson, A. A., Gritz, Y., & Merzlyak, M. N. (2003). Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. Journal of Plant Physiology, 160(3), 271–282. https://doi.org/10.1078/0176-1617-00887
Gitelson, A. A., Kaufman, Y. J., & Merzlyak, M. N. (1996). Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sensing of Environment, 58(3), 289–298. https://doi.org/10.1016/S0034-4257(96)00072-7
Gitelson, A. A., & Merzlyak, M. N. (1994). Quantitative estimation of chlorophyll-a using reflectance spectra: Experiments with autumn chestnut and maple leaves. Journal of Photochemistry and Photobiology B: Biology, 22(3), 247–252. https://doi.org/10.1016/1011-1344(93)06963-4
Gobron, N., Pinty, B., Verstraete, M. M., & Widlowski, J. (2000). Advanced vegetation indices optimized for up-coming sensors: Design, performance, and applications. IEEE Transactions on Geoscience and Remote Sensing, 38(6), 2489–2505. https://doi.org/10.1109/36.885197
Gomez, D., Salvador, P., Sanz-Justo, J., & Casanova, J.-L. (2019). Potato yield prediction using machine learning techniques and Sentinel 2 data. Remote Sensing, 11, 1745. https://doi.org/10.3390/rs11151745
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18–27. https://doi.org/10.1016/j.rse.2017.06.031
Hole, C. C., Morris, G. E. L., & Cowper, A. S. (1987). Distribution of dry matter between shoot and storage root of field-grown carrots. I. Onset of differences between cultivars. Journal of Horticultural Science, 62(3), 335–341. https://doi.org/10.1080/14620316.1987.11515789
Huete, A. R. (1988). A soil-adjusted vegetation index (SAVI). Remote Sensing of Environment, 25(3), 295–309.
Jeptoo, A., Aguyoh, J. N., & Saidi, M. (2013). Improving carrot yield and quality through the use of bio-slurry manure. Sustainable Agriculture Research, 2(1), 164–172.
Jiang, Z., Huete, A. R., Didan, K., & Miura, T. (2008). Development of a two-band enhanced vegetation index without a blue band. Remote Sensing of Environment, 112(10), 3833–3845. https://doi.org/10.1016/j.rse.2008.06.006
Johansen, T. J., Thomsen, M. G., Løes, A.-K., & Riley, H. (2015). Root development in potato and carrot crops—Influences of soil compaction. Acta Agriculturae Scandinavica, Section B, 65(2), 182–192. https://doi.org/10.1080/09064710.2014.977942
Johnson, M. D., Hsieh, W. W., Cannon, A. J., Davidson, A., & Bédard, F. (2016). Crop yield forecasting on the Canadian Prairies by remotely sensed vegetation indices and machine learning methods. Agricultural and Forest Meteorology, 218–219, 74–84. https://doi.org/10.1016/j.agrformet.2015.11.003
Jordan, C. F. (1969). Derivation of leaf-area index from quality of light on the forest floor. Ecology, 50(4), 663–666. https://doi.org/10.2307/1936256
Kim, M. S., Daughtry, C., Chappelle, E., McMurtrey, J., & Walthall, C. (1994). The use of high spectral resolution bands for estimating absorbed photosynthetically active radiation (APAR).
Lacaux, J. P., Tourre, Y. M., Vignolles, C., Ndione, J. A., & Lafaye, M. (2007). Classification of ponds from high-spatial resolution remote sensing: Application to Rift Valley Fever epidemics in Senegal. Remote Sensing of Environment, 106(1), 66–74. https://doi.org/10.1016/j.rse.2006.07.012
Lai, Y. R., Pringle, M. J., Kopittke, P. M., Menzies, N. W., Orton, T. G., & Dang, Y. P. (2018). An empirical model for prediction of wheat yield, using time-integrated Landsat NDVI. International Journal of Applied Earth Observation and Geoinformation, 72, 99–108. https://doi.org/10.1016/j.jag.2018.07.013
Mkhabela, M. S., Bullock, P., Raj, S., Wang, S., & Yang, Y. (2011). Crop yield forecasting on the Canadian Prairies using MODIS NDVI data. Agricultural and Forest Meteorology, 151(3), 385–393. https://doi.org/10.1016/j.agrformet.2010.11.012
Nilsson, T. (1987). Carbohydrate composition during long-term storage of carrots as influenced by the time of harvest. Journal of Horticultural Science, 62(2), 191–203. https://doi.org/10.1080/14620316.1987.11515769
Peñuelas, J., Baret, F., & Filella, I. (1995). Semi-empirical indices to assess carotenoids/chlorophyll a ratio from leaf spectral reflectance. Photosynthetica, 31(2), 221–230.
Que, F., Hou, X.-L., Wang, G.-L., Xu, Z.-S., Tan, G.-F., Li, T., et al. (2019). Advances in research on the carrot, an important root vegetable in the Apiaceae family. Horticulture Research, 6(1), 69. https://doi.org/10.1038/s41438-019-0150-6
R Core Team. (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Rahman, M. M., & Robson, A. J. (2016). A novel approach for sugarcane yield prediction using landsat time series imagery: A case study on Bundaberg region. Advances in Remote Sensing. https://doi.org/10.4236/ars.2016.52008
Rapaport, T., Hochberg, U., Rachmilevitch, S., & Karnieli, A. (2014). The effect of differential growth rates across plants on spectral predictions of physiological parameters. PLoS ONE, 9(2), e88930. https://doi.org/10.1371/journal.pone.0088930
Reid, J. B., & Gillespie, R. N. (2017). Yield and quality responses of carrots (Daucus carota L.) to water deficits. New Zealand Journal of Crop and Horticultural Science, 45(4), 299–312. https://doi.org/10.1080/01140671.2017.1343739
Robson, A., Rahman, M., & Muir, J. (2017). Using worldview satellite imagery to map yield in Avocado (Persea americana): A case study in Bundaberg, Australia. Remote Sensing, 9(12), 1223.
Rouse, J. W., Haas, R. H., Schell, J. A., & Deering, D. W. (1974). Monitoring vegetation systems in the Great Plains with ERTS. Paper presented at the Third ERTS Symposium, Washington, DC, USA.
Schauberger, B., Jägermeyr, J., & Gornott, C. (2020). A systematic review of local to regional yield forecasting approaches and frequently used data resources. European Journal of Agronomy, 120, 126153. https://doi.org/10.1016/j.eja.2020.126153
Schlemmer, M., Gitelson, A., Schepers, J., Ferguson, R., Peng, Y., Shanahan, J., et al. (2013). Remote estimation of nitrogen and chlorophyll contents in maize at leaf and canopy levels. International Journal of Applied Earth Observation and Geoinformation, 25, 47–54. https://doi.org/10.1016/j.jag.2013.04.003
Sentinel-2 PDGS Project Team. (2011). Sentinel-2 payload data ground segment (PDGS): Products definition document (p. 92). European Space Agency (ESA).
Shanahan, J. F., Schepers, J. S., Francis, D. D., Varvel, G. E., Wilhelm, W. W., Tringe, J. M., et al. (2001). Use of remote-sensing imagery to estimate corn grain yield. Agronomy Journal, 93(3), 583–589.
Shaub, D. (2020). Fast and accurate yearly time series forecasting with forecast combinations. International Journal of Forecasting, 36(1), 116–120. https://doi.org/10.1016/j.ijforecast.2019.03.032
Suarez, L. A., Apan, A., & Werth, J. (2017). Detection of phenoxy herbicide dosage in cotton crops through the analysis of hyperspectral data. International Journal of Remote Sensing, 38(23), 6528–6553. https://doi.org/10.1080/01431161.2017.1362128
Suarez, L. A., Robson, A., McPhee, J., O’Halloran, J., & van Sprang, C. (2020). Accuracy of carrot yield forecasting using proximal hyperspectral and satellite multispectral data. Precision Agriculture. https://doi.org/10.1007/s11119-020-09722-6
Suojala, T. (2000a). Growth of and partitioning between shoot and storage root of carrot in a northern climate. Agricultural and Food Science. https://doi.org/10.23986/afsci.5646
Suojala, T. (2000b). Pre-and postharvest development of carrot yield and quality. University of Helsinki.
Tedesco, D., de Oliveira, M. F., dos Santos, A. F., Costa Silva, E. H., de Souza Rolim, G., & da Silva, R. P. (2021). Use of remote sensing to characterize the phenological development and to predict sweet potato yield in two growing seasons. European Journal of Agronomy, 129, 126337. https://doi.org/10.1016/j.eja.2021.126337
Tucker, C. J. (1979). Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment, 8(2), 127–150.
Wang, H., Lin, H., Munroe, D. K., Zhang, X., & Liu, P. (2016). Reconstructing rice phenology curves with frequency-based analysis and multi-temporal NDVI in double-cropping area in Jiangsu, China. Frontiers of Earth Science, 10(2), 292–302. https://doi.org/10.1007/s11707-016-0552-9
Weiss, M., Jacob, F., & Duveiller, G. (2020). Remote sensing for agricultural applications: A meta-review. Remote Sensing of Environment, 236, 111402. https://doi.org/10.1016/j.rse.2019.111402
Xu, H. (2006). Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. International Journal of Remote Sensing, 27(14), 3025–3033. https://doi.org/10.1080/01431160600589179
Zarco-Tejada, P. J., Ustin, S. L., & Whiting, M. L. (2005). Temporal and spatial relationships between within-field yield variability in cotton and high-spatial hyperspectral remote sensing imagery. Agronomy Journal, 97(3), 641–653. https://doi.org/10.2134/agronj2003.0257
Zhao, D., Reddy, K. R., Kakani, V. G., Read, J. J., & Koti, S. (2007). Canopy reflectance in cotton for growth assessment and lint yield prediction. European Journal of Agronomy, 26(3), 335–344. https://doi.org/10.1016/j.eja.2006.12.001

Acknowledgements

This project (VG16009) has been funded by Horticulture Innovation Australia, using the vegetable research and development levy and contributions from the Australian Government. Hort Innovation is the grower-owned, not-for-profit research and development corporation for Australian horticulture. Special thanks to Dr. Surantha Salgadoe (University of New England), Julie O’Halloran, Rhianna Robinson and Zara Hall (Department of Agriculture and Fisheries Queensland), Phillip Beveridge and John McPhee (Tasmanian Institute of Agriculture, University of Tasmania) and Allan McKay for their assistance during fieldwork activities. The authors appreciate the thoughtful suggestions provided by the reviewers and editors, which have helped us improve this paper.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions.

Author information

Authors and Affiliations

Applied Agricultural Remote Sensing Centre, University of New England, Armidale, NSW, 2350, Australia
L. A. Suarez, J. Brinkhoff & A. Robson
School of Science and Technology, University of New England, Armidale, NSW, 2351, Australia
M. Robertson-Dean

Corresponding author

Correspondence to L. A. Suarez.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Suarez, L.A., Robertson-Dean, M., Brinkhoff, J. et al. Forecasting carrot yield with optimal timing of Sentinel 2 image acquisition. Precision Agric 25, 570–588 (2024). https://doi.org/10.1007/s11119-023-10083-z

Accepted: 28 September 2023
Published: 02 November 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11119-023-10083-z
