Introduction

Carrot (Daucus carota L.), grown in more than 25 countries, is one of the most economically important vegetables in the world because of its significance to food security and dietary benefits to humans. In Australia, carrots are the third largest (by volume/tonnage) vegetable crop produced (around 318 000 t) and the fifth most valuable vegetable grown (about $231 million, 2017 value) (Horticulture Innovation Australia Limited 2018). There are five major carrot production areas in Australia: Western Australia (Gingin and Preston), Victoria (East Gippsland), South Australia (Riverland) and Tasmania (Forth). The production area increased by 15% from 2015 to 2016. About two-thirds of consumption is domestic with more than 89% of Australian households regularly purchasing fresh carrots. Only 10 t were imported, while nearly 103 000 t were exported, in 2016/2017 (Horticulture Innovation Australia Limited 2018).

Accurate yield estimations are important for all cropping industries to support improved decision making for crop management, labour requirements, storage, transport and marketing (Ye et al. 2007). Remote sensing (RS) is one of the most important technologies available to precision agriculture (PA) (Yang et al. 2004; Lee et al. 2010) and has been demonstrated to accurately explain spatial and temporal variability of many important agronomic parameters including abiotic and biotic constraints and yield (Magney et al. 2017). Several studies have assessed RS technologies for PA purposes in different broadacre crops (Zarco-Tejada et al. 2005) and many tree crops (Rahman et al. 2018; Robson et al. 2017b). Yang and Everitt (2002) compared grain sorghum yield estimations from remote sensing data obtained during the growing season to yield data measured from yield monitors during harvest. Yield maps derived from reflectance-based imagery, captured around peak vegetative development, were highly comparable to the yield records obtained from yield monitors, explaining up to 85% of yield variability and indicating the potential of remote sensing as a surrogate for yield monitors. Additionally, continual site-specific crop condition information (e.g. leaf area index—LAI, biomass, yield, severity of pest and disease infestations, levels of nitrogen deficiency) is essential for precision crop management during the growing period and between seasons (Thenkabail 2003) but traditional methods for collecting this data are costly and time consuming. RS technologies, although requiring calibration (i.e. field data measurements), can accurately measure spatial and temporal variability that can be used for crop management and incorporated in crop models (Thenkabail 2003). Ye et al. (2007) showed that the ability to predict annual variability in yield several months prior to harvest offers significant benefit to growers, processors and marketers. The authors tested different remote sensing sensors (i.e. hyperspectral and multispectral) to predict fruit yield (cv. Satsuma mandarin) several months before harvesting. Results proved the capabilities of these sensors to estimate yield one season ahead under the natural alternate yield bearing process of fruit trees.

Meaningful information regarding crop health also can be gained by transforming reflectance data into vegetation indices (VIs). VIs are mathematical expressions, usually composed of two or three spectral bands computed as subtractions, ratios or linear combinations of light reflectance in the visible (VIS), near-infrared (NIR) and shortwave infrared (SWIR) spectral regions (Table). VIs are developed to identify a particular plant condition. Pigment-related, structural-related and water-related indices generally include the VIS spectrum (blue, green, red), NIR bands and SWIR bands, respectively. Evidence shows that the absorption dynamics in the visible part of the spectra, are predominately driven by leaf pigments and their interactions (Peñuelas et al. 1995). The NIR reflectance changes are mainly influenced by leaf structures in the mesophyll cell wall (Pinter et al. 2003) and whilst the SWIR range is highly influenced by specific constituents such as lipids, oils, water, cellulose and lignin (Rapaport et al. 2015; Daughtry 2000). As a result, VIs have been used as strong indicators of plant health (Henry et al. 2004; Apan et al. 2004), water availability (Detar et al. 2006; Rapaport et al. 2015), nitrogen (Ryu et al. 2011) and other nutrient deficiencies (Schlemmer et al. 2013) in different crops such as cotton (Ortiz et al. 2011), soybean (Huang et al. 2015), rice (Gnyp et al. 2014), tomatoes, potatoes, onions (Marino and Alvino 2015; Trout et al. 2008), avocados and macadamia (Robson et al. 2017b). However, the literature is limited for root crops such as carrots. Al-Gaadi et al. (2016) demonstrated that satellite multispectral sensors (i.e. Landsat-8 (OLI) and Sentinel-2) were successful in providing accurate estimations of yield variability and total yield of potatoes but the accuracies varied according to the spatial resolution. Bala and Islam (2009) tested normalised difference vegetation index (NDVI), the fraction of photosynthetically active radiation (fPAR) and LAI products derived from MODIS TERRA images (500 m, 1 km and 1 km spatial resolution, respectively) to estimate potato yield prior to harvesting with overall accuracies of about 75%. The authors found that the peak of maximum correlation between NDVI and potato yield coincided with the peak of the fPAR and concluded that higher spatial resolution could increase the accuracies in future studies due to the high discrepancies between the size of the fields and the spatial resolution of these products (i.e. NDVI, LAI and fPAR). Most recently, Gomez et al. (2019) developed a potato-yield prediction model using Sentinel-2 data and machine learning algorithms. The results showed an improved accuracy to those studies reported previously. However, the model evidenced greater uncertainty due to the poor capacity of the model to account for extreme events or abnormal crop conditions and the lack of calibration data from different regions.

Before the era of high temporal and spatial resolution satellite sensors, the low temporal and/or spatial resolution of some satellites (e.g. Landsat, MODIS) limited their usability in medium to small broadacre or tree crops for PA purposes. The launch of satellite platforms such as IKONOS, Quickbird, GeoEYE and, more recently, Sentinel-2 and Worldview-3 and 4 (WV-3/ WV-4) have greatly increased the development and potential adoption of RS applications for PA (Battude et al. 2016; Yang et al. 2009). Furthermore, the commercial cost of these imagery sources has reduced, making these technologies economically attractive. WV-3, launched on 14 August 2014, has been successfully used to characterise tree health, yield and fruit quality in orchards (Rahman et al. 2018; Robson et al. 2017a, b). Sentinel-2, launched on 23 June 2015, has mostly been used for land cover and crop and tree species classifications (Immitzer et al. 2016; Qiu et al. 2017). However, the accuracy of these sensors, including hyperspectral sensors, has not been assessed for yield forecasting in vegetable crops.

Statistical models calibrated for yield estimation that have a VI as the predictor variable can produce highly accurate results (Bolton and Friedl 2013; Al-Gaadi et al. 2016). However, yield is a complex variable dependent on many different factors that could limit the use of VIs as yield predictors. In such cases, more robust statistical methods, such as multivariate regression, are needed (Suarez et al. 2016). Despite the extensive use of VIs in agriculture, there is limited implementation of VIs derived from RS for yield prediction in many horticultural crops, and particularly carrots.

Due to the importance of this vegetable crop, the main objective of this study was to test proximal (hyperspectral) and remote (multispectral satellite) sensors for the prediction of carrot root yield. More specifically the work aimed to: (1) explore the relationship between carrot yield and VIs derived from different platforms; (2) assess the prediction accuracy of VIs derived from hyperspectral and multispectral sensors; and (3) determine the influence of spatial resolution on model performance.

Materials and methods

Study area and cropping systems

This study included three intensive vegetable production regions in Australia: Western Australia (WA), Queensland (Qld) and Tasmania (Tas) (Fig. 1). The WA site is an overhead irrigation system (centre pivot and fixed sprinkler systems) with dune-swale terrain characterised by leached sands as the predominant soil type. Due to the arenosol soils and low water holding capacity, crops are generally irrigated daily unless rainfall exceeds evaporation. In this region, the crops grow in a subtropical climatic region and the preplant fertiliser program includes superphosphate plus manganese and boron. Phosphorus rate is guided by soil tests. Multiple top dressings are applied in crop at 10–14 day intervals comprising urea, potassium-sulphate (400 kg/ha) and magnesium-sulphate (100 kg/ha). Urea is applied depending on crop appearance and vigour. Within Qld, carrot production is predominantly limited to fertile alluvial vertisols (cracking clay soils) in subtropical regions. A blend of nitrogen, phosphorus, potassium and sulphur is applied pre-planting with top dressing of calcium and potassium nitrate. Boron is also applied in crop. Irrigation scheduling is based on applying 25 mm per week. In Tas, carrot cropping is mostly summer-autumn based, and occurs in cool temperate climatic regions with red nitosol soils in an undulating landscape. Standard management includes a small amount of N applied at planting with a top dressing of calcium nitrate applied during the growing season if required. Crops are irrigated at a rate of 25 ml every 7 days.

Fig. 1
figure 1

Study area: location of the three growing regions and the distribution of the sampled fields

Each region and the respective growing periods included in the study were exposed to different seasonal weather conditions according to the monthly climate statistics. These statistics were calculated from weather stations located in Gingin (WA) (31.46° S; 15.86° E), Amberly (Qld) (27.63° S; 152.71° E) and Devonport (Tas) (41.17° S; 146.43° E) about 29 km, 35.7 km and 15.5 km from the field sites, respectively. The WA and Tas weather data covers the years 1996–2018, and 1941–2018 for Qld (Bureau of Meteorology 2017). During the study period, the rainfall in WA was the highest of the three sites (62 mm), followed by Qld (59 mm) and Tas (50 mm).

Five to twelve commercial carrot crops were sampled per region in their respective growing seasons. Carrots grown in WA have an expected crop duration of 130–165 days, in Qld, 115–150 days and in Tas about 125 days. WA is a single crop system whilst in Tas and Qld, the crops are rotated with a preferred minimum return period of four and three years, respectively. Rotation crops include onions, green beans, sweet corn and some pumpkins in Qld; and onions, potatoes, broccoli, poppies, peas, beans or grains in Tas. Table 1 summarises the planting schedules of the study sites in each region. From each of the sampled locations (more than 400 across all regions), yield and above ground biomass (AGB) were collected, as well as hyperspectral and multispectral canopy measures.

Table 1 Planting and harvesting schedule of the fields per growing region

Sampling method

The stratified random sampling method was based on the methodology reported by Robson et al. (2017a) but adjusted to suit the planting configuration of carrots. This protocol ensured that a large extent of the spatial variability of crop vigor was included. For this, a vigour map per crop was generated from the WV-3 capture by classifying the NDVI or SR indices (Table 2) into 8 vigour classes. Iso Cluster unsupervised classification (Ball and Hall 1965) was used to assign each pixel value into the classes (from very low to very high). Sample points were randomly distributed across the crop (18 in total) to characterise the three major vigour classes: low, medium and high. Each major class was represented by six sample points. The location of the points was done using ArcGIS 10.4 (Environmental Systems Research Institute. Redlands, CA, USA). A Trimble DGPS (Trimble, Sunnyvale, CA, USA), with accuracies ranging from 0.05 to 0.8 m, was used to locate the sample points in the field. Once the points were located in the field, the sampling area (i.e. sample plot) was established based on the sowing arrangement per region described below.

Table 2 Vegetation indices used in this study

Yield and above ground biomass data collection

Measured yield (i.e. calibration data) and above ground biomass (AGB) data were collected at each sample plot as close as possible to commercial harvest to ensure minimal maturity and quality change between sampling and harvest. Each sample plot was 1 m long, with a plot width of 1.525 m for two rows of twin row planting (Qld) and 2 m and 1.125 m for four twin rows per bed in WA and Tas, respectively. From these sampled plots, carrot root yield was calculated in t/ha for all regions. Following sampling, a standard procedure was used for quality grading which comprised of counting and classifying individual roots based on length, width, shape, incidence of cavity spot, etc. Four classes were used for quality grading: Premium carrot or class 1 (C1), class 2 (C2), oversize and waste. Only straight carrots between 120 and 200 mm length and between 25 to 43 mm diameter at the shoulder without any incidence of disease or cavities were classified as C1. C2 had the same length and diameter of C1 but slight bend in the carrots. Carrots longer than 200 mm and/or greater than 43 mm diameter, irrespective of the shape were classified as oversize. Waste class included carrots shorter than 100 mm, misshapen, and/or with or without spot cavities and disease. The total number of carrots per sampled plot was used to estimate plant density (plant/ha).

A one-way ANOVA analysis was undertaken to investigate the differences between regions in terms of AGB and average yield (t/ha). This method avoids the propagation of the Type I error (5%) carried out in ANOVA when comparing more than two categories (Aerd statistics 2019). Furthermore, Games Howell post-hoc test (Games and Howell 1976) was used to determine which growing regions produced significantly different yields and AGB and the respective magnitude of such differences. The one-way ANOVA and the post-hoc tests were performed using the userfirendlyscience R-package (Peters 2018).

Reflectance-based data acquisition

Satellite multispectral imagery

WV-3 provides high spectral resolution with 8 multispectral bands in the visible (VIS) and near-infrared (NIR), 8 SWIR bands and 12 bands to map clouds, aerosol, water vapor, ice and snow (CAVIS at 30 m spatial resolution) (DigitalGlobe 2018). This sensor also provides very high spatial resolution: 0.31 m in the panchromatic, 1.24 m in the VIS–NIR bands and 3.7 m in the shortwave infrared (SWIR) bands (DigitalGlobe 2018). Sentinel-2 offers a lower spatial resolution: ranging between 10 and 60 m depending on the respective 13 multispectral band width: 4 in the VIS, 5 in the NIR (including 4 bands in the red edge region—VNIR) and 4 in the SWIR region. This sensor provides worldwide coverage every 5–10 days with image data being freely available (European Space Agency 2019). Because the various spatial resolutions of Sentinel-2 may limit its use in small area carrot crops, this study only incorporated the spectral bands of 10 m spatial resolution. Figure 2 shows the spectral configuration of the multispectral sensors used in this study and their respective overlap, whilst Table 3 shows the VIs generated from each sensor.

Fig. 2
figure 2

Spectral configuration of the WV-3 (black bars) and Sentinel-2 (green bars) bands covered in this study. For WV-3, only the multispectral bands with spatial resolution of 1.24 m were included. For Sentinel-2, only the spectral bands of 10 m spatial resolution were included (green boxes) (Color figure online)

Table 3 Data collection dates per sensor, growing period and region

Based on the planting schedule, the acquisition of high-resolution satellite imagery (WV-3) was targeted to coincide with mature crops for each of the three regions (i.e. about 4 weeks before commercial harvest). Once a successful WV-3 capture was available, a coarser resolution Sentinel-2 level 1C image was also acquired for a similar date. The level 1C images are ortho-rectified products (geometrically corrected at sub-pixel level) providing ‘Top-of-atmosphere’ (TOA) reflectance (radiometrically-corrected) (Sentinel-2 PDGS Project Team 2011). However, due to differences in rectification accuracies (3 m and 10 m for WV-3 and Sentinel-2, respectively), Sentinel-2 images were “registered” on to coincident points from the WV-3 captures (co-registration). This geo-referencing procedure was performed using ArcGIS v10.4 (Environmental Systems Research Institute. Redlands, CA, USA). The digital numbers of the WV-3 imagery were transformed to TOA reflectance following the equation presented in Kuester (2016).

Proximal hyperspectral data collection and analysis

A hand-held spectroradiometer FieldSpec Handheld4 (Analytical Spectral Devices Inc., Colorado, USA) was used to collect five reflectance readings randomly selected per sample plot. The ASD spectrometer acquires continuous spectra from 350 to 2500 nm with wavelength accuracy of ± 1 nm and spectral resolution of 3 nm at 700 nm and 8 nm at 1400 nm and 2100 nm. Spectral data were collected with a bare fiber optic of 25° field of view at 0.3 m above the canopy in order to get a sample size of 0.13 m diameter. The equipment was calibrated with a Spectralon® white panel at each sampling location. Each reflectance was an average of 10 repeated scans taken under cloudless conditions between 8:00 h and 16:00 h. Hyperspectral data were collected prior to manual harvesting sampling.

Raw spectral data (from 350 to 2500 nm) were exported into ASCII format using ViewSpec Pro which is integrated with RS3 Spectral Acquisition Software (Analytical Spectral Devices Inc., Colorado, USA). The ASCII data format was imported into R software (R Core Team 2014) and converted into hyperspectral object (R class) with the hyperSpect R-package to facilitate hyperspectral data manipulation and analysis (Beleites 2014). Pre-processing of spectral data included exploratory and cleaning processes to identify and remove outlier reflectance data as they tend to increase the error in statistical models (Wold et al. 2001; Zainol Abdullah et al. 2014). Other pre-processing techniques such as smoothing or normalization of reflectance data were omitted as these techniques can change the spectral characteristics of the data leading to inaccurate results (Suarez et al. 2017; Vaiphasa 2006). Furthermore, this study did not use the entire spectrum, but rather selected only those spectral regions that corresponded with those used as inputs to calculate the VIs (see Table 2). However, the strong water absorption wavelengths (1356–1480 nm; 1791–2021 nm and 2396–2500 nm) were removed (Apan et al. 2006). From the five randomly selected reflectance readings collected at each sampling location, the mean spectra value was calculated. VIs were then calculated following the equations in Table 3. Due to weather conditions, hyperspectral data could not be collected in Tas. Table 3 shows the information pertaining to the sensors with their respective capture date per region and growing period.

Vegetation indices and analysis

For the extraction of multispectral satellite data, the location of each sample point was used as a centroid that was then buffered to create a 2 m (for WV-3) and 10 m (Sentinel-2) area of interest (AOI). The mean reflectance value of all pixels within the AOI (pixel inclusion higher than 50%) was calculated for each multispectral band allowing for number of vegetation indices to be calculated (Table 2). In cases where the equation did not specify a particular wavelength (broadband VIs), blue (B), green (G), red (R) and near-infrared (NIR) were replaced by the reflectance values recorded at the 445 nm, 550 nm, 675 nm and 800 nm wavelength, respectively. This was only required in the cases of hyperspectral data and broadband VIs.

Several linear models were generated following the equation:

$$y = a + bx$$
(1)

where \(y\) corresponds to yield (t/ha), and \(x\) the VI (per sensor, per region). The optimal sensor was identified as the one with the overall highest coefficient of determination (R2). Although VIs performance can change according to the sensor (Al-Gaadi et al. 2016), the overall R2 was used as indicator of the stability of the sensor to explain yield variability. The optimal VI for yield estimation was defined by the highest coefficient of determination (R2) and the lowest residual standard error (σ). Scatter plots were used to identify indices with saturated values (low index variability) which were removed from the analysis. Optimised regression fits were tested by fitting non-linear models for each region. The resulting new models were validated using the Global Validation of Linear Model Assumptions (gvlma) package in R (Peña and Slate 2006). Only those models that passed the gvlma test and reduced the root mean squared error (RMSE) were chosen over the linear models for yield forecasting. Linear mixed models were tested with location as a random effect; however, results did not receive a higher accuracy than the empirical regression models per region.

A yield map (t/ha) for each sampled crop was generated using the optimal VI and the best regression fit per region. The average forecasted yield per crop (FY) was calculated by averaging the pixel values in t/ha of the resulting yield map. Commercial harvest (CH) (t/ha) reported by the grower was used as validation data. CH follows the form:

$$CH = CH_{T} /A$$
(2)

where \({CH}_{T}\) is the total commercially harvested yield (t) per crop and \(A\) is the area (ha) of the crop. FY (t/ha) was validated against CH at the crop level and the accuracies (indicated by the percent error, % error) were calculated following the equation:

$$\% \;error = \left( {\frac{FY - CH}{{CH}}} \right)* 100$$
(3)

The CH information was available for WA1 and WA2, Tas1 and Qld1 crops and therefore, only 16 out of the 24 fields were validated. Figure 3 shows the main activities performed in this study.

Fig. 3
figure 3

Flowchart for yield forecasting using different sensors. AGB: Above ground biomass

Results and discussion

Yield and grading quality variability across growing regions

WA recorded the highest yield followed by Tas and Qld (72 t/ha, 68 t/ha and 64 t/ha, respectively) (Fig. 4). The only statistically significant difference in yields was between WA and Qld across all sample sites. Qld had the lowest yield variability followed by Tas and WA. One-way ANOVA was used to further explore the relationships between the vigour classes and total carrot yield (t/ha) (Fig. 4). On average for all regions, yield increased by 16 t/ha between the low and high vigour classes, 10 t/ha between low and medium and 6 t/ha between medium and high. There were significant differences between total yield (t/ha) for each vigour class (p < 0.001) in Tas, whilst for WA, vigour class differences were not statistically significant different (p > 0.05). In Qld, yield was significantly different between low and medium and low and high vigour classes (p < 0.05) but not between medium and high vigour class.

Fig. 4
figure 4

Measured total carrot root yield (t/ha) per region and per vigour class derived from WV-3. The dotted lines delimit the minimum and maximum values, the boxes delimit the 25th and the 75th percentile while the hollow points represent outliers and the filled points represent the median. Means sharing the same letter are not statistically significantly different (p > 0.05)

It was not possible to clearly establish a relationship between plant density (plants/ha) and total yield (t/ha). WA had the lowest plant population per ha (~ 644,000 plants/ha) but the highest average total yield of all regions. In comparison to WA, Tas and Qld reported, on average, nearly 40% and 25% more plants/ha but 6% and 11% lower yields, respectively. Carrot quality grading was significantly (p < 0.05) related to plant density. Percentage of C1 decreased as the plant density increased while the percentage of C2 increased in line with plant density. As such, the results indicate that optimal plant density is important to produce a larger proportion of premium carrots (as defined by the market requirements of a particular size (class)) (Lana 2012). Significant relationships between percentage of oversize and waste carrots to plant density could not be established in this study.

Above ground biomass (AGB) variability

AGB was recorded at each of the 414 sampling locations. The variability per region was statistically significant (p < 0.05) with the Qld region producing the most variability and the highest AGB (average = 19 t/ha) followed by Tas (average = 16 t/ha) and WA (average = 11 t/ha). To validate the relationship between AGB and the vigour classes generated from the satellite imagery, one-way ANOVA and Games Howell post-hoc test were used. Samples from all regions displayed an AGB increase in line with the vigour class, with the low, medium and high vigour class producing 12 t/ha, 16 t/ha and 18 t/ha, respectively. Whilst the differences between the three vigour classes were significant for Tas and WA (p < 0.05), only the low and high vigour classes were significantly different (p < 0.05) for Qld (Fig. 5).

Fig. 5
figure 5

Above ground biomass (t/ha) distribution per vigour classes generated from WV-3 imagery. The dotted lines delimit the minimum and maximum values, the boxes delimit the 25th and the 75th percentile while the hollow points represent outliers and the filled points represent the median. Means sharing the same letter are not statistically significantly different (p > 0.05)

Total yield and AGB relations across regions

For all regions, the relationship between the measured AGB (t/ha) and total carrot yield (t/ha) was significant but weak (r = 0.24, p < 0.001). However, stronger relationships were achieved at the regional level (Fig. 6). Moderate (r = 0.58) and strong (r = 0.88) relationships were identified for Qld and Tas, respectively. This result is consistent with prior studies of other crops where total biomass was found to be directly related to marketable cucumber weight (R2 = 0.66) and wheat grain yield (R2 = 0.8) (Agegnehu et al. 2014; Nair and Ngouajio 2010). A significant relationship could not be established for WA. Measured AGB was 12.5 t/ha for WA1 and decreased to 9.4 t/ha in WA2 while the average total carrot root yield (t/ha) was not significantly different (p > 0.05). Issues related to monoculture cropping during WA2, are believed to be the main reason. The sampled fields had high infestations of wild carrots (Daucus carota L.), a weed that produces a large amount of green vegetation, potentially limiting the canopy development of the commercial carrots. As the roots of the wild carrot plants do not grow to any significant size, they do not compete against the root development of the commercial plants. The differential impact of canopy and root development between weed and commercial carrots affected the ratio of AGB and root yield of the commercial crop.

Fig. 6
figure 6

Relationship between above ground biomass (t/ha) and total carrot root yield (t/ha) in WA, Qld and Tas

Spectral response of carrots

To compare the influence of growing location on spectral response, the mean spectral profiles per region were plotted (Fig. 7). Spectral characteristics varied across each location and could be explained by differences in canopy architecture, potentially different pigment concentration, sowing configuration and soil types as reported by Broge and Leblanc (2001), Daughtry et al. (2000) and Qi et al. (1994). Other studies have identified different growing stages (Suarez et al. 2017) and varieties (Rama Rao 2008) as factors influencing the reflectance curves but, for this study, this influence is considered to be minimal as both growth stages at the time of data collection and variety were consistent.

Fig. 7
figure 7

Mean reflectance response per region and sensor. a WV-3, b Sentinel-2 and c hyperspectral. SD standard deviation

Reflectance changes were also observed per sensor and particularly between WV-3 and Sentinel-2. Qld produced the most consistent and the highest canopy reflectance values in the NIR region (for both hyperspectral and multispectral sensors) when compared to the other sites. For WA and Tas, the mean reflectance measured by WV-3 and Sentinel-2 was inconsistent: the reflectance profile was higher for Tas and lower for WA in WV-3 but this trend reversed for Sentinel-2. The trend changes could be explained by the coarser spatial resolution of the Sentinel-2 as the pixel values within the buffer area could include non-crop areas influencing the reflectance values.

Relationship between vegetation indices and carrot root yield

The relationship between total carrot root yield (t/ha) and vegetation indices varied between sensors and regions (Table 4). Overall, the multispectral sensors (i.e. Sentinel-2 and WV-3) better explained yield variability than the hyperspectral sensor at the field level. This can be explained due to the ground-based hyperspectral measurements being collected from a point source location and therefore being less capable of measuring the dynamics of major changes while also being highly sensitive to the presence of other vegetation, such as weeds, or soil exposure at the canopy level. The differences between the canopy structure-related (for example RDVI, NDVI, EVI2 or GEMI) and the pigment-related indices (such as SIPI or GNDVI) could not be clearly identified in WA. In Qld (where the relationship between AGB and yield was moderate), the differences were more evident with the structure-related indices better explaining yield variability. Hyperspectral data in the form of VIs did not provide significant correlations (p > 0.05) with yield and therefore, results are not presented in Table 4.

Table 4 Linear model parameters of satellite-derived vegetation indices as predictors of total carrot yield in WA, Qld and Tas

Irrespective of the VI or the region, the derived-VIs from WV-3 produced stronger relationships with yield than Sentinel-2 (Table 4). The R2 increased from 0.13 to 0.21, from 0.15 to 0.35 and from 0.50 to 0.70 for WA, Qld and Tas, respectively whilst the residual standard error (sigma, σ), decreased as the spatial resolution increased (Table 4). This is considered primarily to be the result of the higher spatial resolution of the WV-3 satellite sensor. This result aligns with Lima et al. (2019); Forkuor et al. (2018) and Wang et al. (2018) where Sentinel-2 (10 m spatial resolution) yielded higher classification accuracies over the Landsat 8-OLI sensor (30 m spatial resolution). Although Forkuor et al. (2018) indicated that the narrow bands of the Sentinel-2 could also have a positive influence on their results, the WV-3 (with slightly wider bandwidths than Sentinel-2) produced better results in this current study. Similar results were found by Yang et al. (2006). In their study, airborne imagery and QuickBird imagery were resampled at 2.8 m and 8.4 m. Regardless of the spectral configuration (i.e. airborne sensor had broader spectral bands than QuickBird sensor), higher spatial resolution gave better results for mapping sorghum yield and yield patterns.

The optimal VI for each region was defined by the highest R2 and the lowest σ obtained from the linear models. Tas produced the highest R2 when RDVI, SAVI and OSAVI were the predictor variables (R2 = 0.77), with the lowest σ (10.75 t/ha) achieved with RDVI. EVI2 performed better in Qld (R2 = 0.55), whilst for the WA region, GNDVI was optimal (R2 = 0.29). These results indicate that VIs were more susceptible to regional changes than the pure reflectance per band as their prediction performance varied across regions while the reflectance curves remained similar (Fig. 7).

Fig. 8
figure 8

Optimal regression fits of VIs derived from WV-3 for total carrot root yield (t/ha) for WA (left), Qld (middle) and Tas (right). The linear fit and the power or exponential curves are represented by the solid and dotted lines, respectively. All the models are significant with p < 0.001

The spectral reflectance and absorption properties of plants can be influenced by canopy structure, health and the presence, absence or concentration of specific plant constituents (Huete et al. 1994; Rondeaux et al. 1996). Therefore without undertaking destructive sampling and subsequent chemical analysis, the exact drivers of spectral variation can only be assumed based on what parameters have been measured in the field and prior literature. For Tas, yield had a stronger correlation to NIR bands (r > 0.8, p < 0.05) than the VIS bands (particularly the green region, r < 0.1), therefore it can be suggested that the structural parameters of the plant were the main driver. This is supported by the higher correlation coefficient (r) achieved between AGB and yield (r = 0.88) and the structural-related optimal index for this region (RDVI). Similarly for Qld, higher correlations between the NIR spectral range and yield were also dominant over the VIS region. The poor relationship with the red band (r < − 0.2) suggests that concentrations of photosynthetic pigments, or potentially plant nutrition, was not a major driver of yield variability (Pinter et al. 2003) (this was supported by leaf tissue testing at the sample sites, where no statistical variation in plant foliar nutrition and yield were identified, data not shown). The moderate relationship between AGB and yield (r = 0.58) and the higher correlations achieved with EVI2, support the hypothesis that the structural characteristics of the canopy, such as LAI, are the main indicators of yield variability for this region.

In WA, significant and moderate negative correlation coefficients were found between yield and the VIS bands (r < − 0.4, p < 0.05). GNDVI, the best performing index in this region, was initially derived to maximise the sensitivity to chlorophyll (chl-a) concentration, for the measurement of photosynthesis rate and to monitor plant stress. This index includes the green band as it has been found to be sensitive to pigment concentration by increasing reflectance in response to a decreasing chl-a concentration. The red spectrum has a similar behaviour but only in leaves with low chlorophyll concentrations (Gitelson et al. 1996). As such, pigment concentration is suggested as an important factor in yield variability. In WA, the low correlation between AGB and yield (r = 0.16) indicates that the structural properties of the plant were not strongly related to yield performance. The high weed population (wild carrot) evident in WA2 likely altered the canopy architecture due to high canopy densities and competition for light and nutrient availability, however this did not impact the total root yield (t/ha). This is also supported by differences between WA1 and WA2 (results not shown) where the reflectance in the green and red bands was higher in WA2 than WA1, indicating a lower pigment concentration due to a possible nutrient competition in AGB-reduced WA2 (Suarez et al. 2017; Blackburn 2007; Gitelson et al. 2003). Increased reflectance in the NIR bands in WA2 is likely the result of changes in canopy structure caused by the weed population (Schlemmer et al. 2013).

In an attempt to improve the accuracy from the linear models, several non-linear models were tested. The resulting new models were validated using gvlma test (Peña and Slate 2006) and only the models that passed the test and reduced the RMSE were considered as an optimised regression model. In WA and Tas, the parameters of the non-linear models remained mostly the same or increased the RMSE (particularly for Tas) compared to the linear fits, while failing the gvlma test. In Qld, the exponential fit improved the prediction performance of the EVI2 by producing a higher R2 and lower RMSE. This non-linear fit passed the gvlma test. The best regression fits per region and their respective parameters are shown in Fig. 8 whilst examples of yield maps derived from the regression fits and the location of sample points with the respective measured yield values (t/ha) are presented in Fig. 9.

Fig. 9
figure 9

Yield map examples derived from WV-3 for WA (a) and Tas (b) regions. Points and labels indicate the location and the range of manually harvested yield from samples taken

The forecasted yield (FY) (t/ha) from imagery at the crop level was compared against the commercially harvested yield (CH) for WA1, WA2, Qld1 and Tas1 (Table 5). The % error ranged from 3.3% to 15.8% in WA, from 2 to 12% in Qld and from 1.1% to 24% in Tas. Although WA produced the lowest R2 and the highest σ, the overall % error of the validation for this region (10.2% equivalent to 7.16 t/ha) was well below the error of the calibration model (15.24 t/ha). In Tas, only one season of data was available and this may have contributed to the overall highest % error (12.7%). However, this percentage (equivalent to 10.16 t/ha) was still below the error of the calibration model (10.60 t/ha). The regression fit in Qld included three seasons which may explain why it achieved the lowest overall % error (9.2% equivalent to 5 t/ha) when compared to the CH measured from Qld1 crops. This result highlighted the importance of including as many seasons as possible to develop the calibration models as it better encompasses seasonal variation.

Table 5 Accuracies (% error) of forecasted yield (FY). Please refer to Eqs. (2) and (3) for the calculations of commercial harvest (CH) and % error

It is worth noting that although the best efforts were made to obtain an exact measure of actual CH for each of the sampled crops, it is likely that harvest losses associated with carrots left in the ground or on top of the ground may have occurred. It is also possible that carrots harvested from a block other than the one sampled were added to the final CH. This sometimes occurs when growers need to fill trucks to send to the processors. An example of this is Block CE and Pivot E P1 where the CH (74 t/ha and 73 t/ha, respectively) were well above the maximum yield measured by the point source samples (65 t/ha and 57 t/ha, respectively).

Unfortunately, commercial yield monitors are not commonly used in Australian carrot production so obtaining actual accuracies at the individual point level could not be determined. From this study, the accuracies achieved in forecasting carrot yield from satellite imagery demonstrated the potential of this technology to be used as a surrogate of yield monitoring (in the absence of commercial harvester-based systems), and as an accurate pre-harvest predictor of total crop yield (Pinter et al. 2003). Better predicting pre-harvest yield offers significant benefit to growers by supporting harvesting, transport, factory logistics and marketing related activities. For precision farming, the provision of RS imagery identifying crop variability within the growing season supports the improved detection of biotic and abiotic stresses and the subsequent application of remedial actions (Moran et al. 1997).

Conclusion

This study evaluated the potential of remote sensing for predicting carrot yield and for the derivation of pre-harvest yield maps. The performance of vegetation indices (VIs) derived from both a handheld hyperspectral sensor and multispectral satellite sensors were evaluated. Comparison of the regression parameters for each sensor indicated that the spatial resolution was more important than the spectral configuration of the sensors or the inclusion of specific wavelengths in VIs.

High overall accuracy was attained with regional algorithms. Yield regression coefficients ranging from 0.29 to 0.77 gave up to 1% error when validated with commercially harvested yield. These results highlight the potential of this technology in precision agriculture practices. It was shown that AGB may not directly relate to yield potential, particularly when the crop is influenced by high weed populations or had a medium-to-high AGB. As a result, NDVI or SR, two of the most widely used VIs in the remote sensing and agronomic communities, were not within the top performing VIs. As such, these indices or their derived maps need to be used judiciously when employed as indicators of yield variability for commercial carrot production.

In this study, the data collected in the field (i.e. site-specific information) proved to be essential for the proper calibration of reflectance-based sensors as predictors of yield and indicators of yield variability. In the absence of harvester-based carrot yield monitors, satellite multispectral sensors proved to be a highly applicable alternative to monitor yield performance and yield potential at the crop level. Further research is required to determine what minimum growth stage RS imagery can be acquired to accurately predict yield variability. The earlier the information is provided to growers the more opportunity they have to undertake remedial action on low performing regions and to achieve optimum yield potential. The limited seasonal or geographical impacts on yield forecast performance indicate that a generic algorithm may be possible, thereby avoiding the need for calibration data once sufficient and representative information at the field level has been incorporated in the prediction models. Therefore, further research is needed to evaluate the accuracy of a generic algorithm across regions.