Introduction

Remote sensing techniques have played a significant role in precision agriculture for many decades (Zhang & Kovacs, 2012). In recent years, technological advancements in unoccupied aerial systems (UAS, drones) and in miniaturised imaging sensors have improved the flexibility and feasibility of UAS-based remote sensing practices. This has led to wide adoption of the technology among scientists, industry, and the broader community (Nex et al., 2022).

This study aims to develop precise UAS-based methods for grass quality and quantity estimation. Previous studies have already shown the potential of digital UAS-based remote sensing for this task (Oliveira et al., 2020; Feng et al., 2020; Oliveira et al., 2022; Karila et al., 2022; Freitas et al., 2022; Fernández-Habas et al., 2022). However, this study develops the use of new-generation hyperspectral cameras for grass quantity and quality parameter estimation, particularly focusing on UAS cameras covering the short-wave infrared spectral range (SWIR; 900–1700 nm), which has not previously been applied to this task.

Currently, most commercial UAS remote sensing cameras are broadband systems operating within the visible to near-infrared spectral range (VNIR; 400–900 nm). RGB cameras are the most widespread system, providing high spatial resolution at a relatively low cost compared to other sensors. Although they have only three bands, RGB images can be used to compute vegetation indices and photogrammetric 3D point clouds to estimate grassland biomass (Viljanen et al., 2018; Niu et al., 2019; Grüner et al., 2019). UAS multispectral cameras have become popular in vegetation applications due to their relatively low cost, light weight, and spectral bands in the red edge and near-infrared (NIR) regions, which are highly correlated with the plant’s internal structure (Ustin & Jacquemoud, 2020; Zeng et al., 2020). UAS multispectral cameras have been applied in the estimation of grass biomass (Liu et al., 2019; Pranga et al., 2021) and quality parameters (Geipel et al., 2016; Osco et al., 2019; Lussem et al., 2022). UAS hyperspectral cameras bring detailed spectral information about plants through high spectral resolution and narrow bandwidths, as well as high spatial resolution. UAS hyperspectral VNIR cameras have been successfully demonstrated for biomass and quality parameter estimation in pasture (Barnetson et al., 2020; Wijesingha et al., 2020), alfalfa (Feng et al., 2020), and grass (Capolupo et al., 2015; Oliveira et al., 2020; Karila et al., 2022; Franceschini et al., 2022), as well as in other agricultural applications such as the yield estimation of wheat (Montesinos-López et al., 2017), maize (Aguate et al., 2017), and potato (Li et al., 2020) crops.

The hyperspectral SWIR range has mostly been used in laboratory analysis based on near-infrared spectroscopy (NIRS) or in field spectroscopy. NIRS systems typically operate in a range of 700 nm to 2500 nm with narrow bands, and NIRS has become one of the most common techniques applied to predict plants’ biophysical parameters such as the concentrations of protein, lignin, and fibre (Norris et al., 1976; Stuth et al., 2003). Field spectroscopy is also adopted for the prediction of foliage nutrient concentration, but based on field spectral measurements employing a wide spectral range with narrow bands (400–2500 nm) (Kawamura et al., 2010; Pullanagari et al., 2012; Fernández-Habas et al., 2022). Because of the high redundancy of spectral bands in hyperspectral data, it is common to apply techniques that identify optimal features/bands for building predictive models, where the optimal features can occur either in the VNIR region or in the SWIR region, depending on the parameter (Thenkabail et al., 2000; Kawamura et al., 2010; Fernández-Habas et al., 2022). For example, the SWIR range includes spectral absorption features relevant to many grass quality parameters, such as nitrogen concentration (Stuth & Tolleson, 2003; Berger et al., 2020; Pullanagari et al., 2021; Féret et al., 2021).

Togeiro de Alckmin et al. (2020) and Pullanagari et al. (2021) investigated different spectral ranges (full spectra, VNIR, and SWIR) from field spectrometer data to estimate crude protein and nitrogen concentration, respectively. Togeiro de Alckmin et al. (2020) focused on crude protein yield (CP) and crude protein as a dry matter fraction (%CP) in perennial ryegrass, combining data from different sites. In their results, the VNIR spectral range obtained the best accuracy for CP, while the full-spectra model was the best for %CP. Pullanagari et al. (2021) estimated nitrogen concentration using a large time series of spectral data collected in temperate grasslands during winter, spring, summer, and autumn. Unlike Togeiro de Alckmin et al. (2020), they observed higher accuracy when using the full-spectra model than when using the VNIR and SWIR ranges alone, and the SWIR model outperformed VNIR. Thomson et al. (2022) investigated the performance of four proximal spectrometers in estimating pasture quality parameters against NIRS measurements: the standard ASD field spectrometer (400–2500 nm) and three lower-cost sensors, namely a hyperspectral camera (397–1004 nm) and two handheld spectrometers (908–1676 nm and 1345–2555 nm). The hyperspectral camera’s performance was similar to the ASD. In the same study, when the ASD data were divided into three spectral ranges, the spectral region 914–1676 nm was the best for estimating crude protein, water-soluble carbohydrates (WSC), and dry matter. The authors highlighted that more research was needed on hyperspectral imaging and its application to nutrient characteristic prediction in pastures.

The availability of SWIR cameras for UAS platforms has been limited until now; thus, few studies in UAS-based remote sensing have applied SWIR bands. Jenal et al. (2020) investigated the potential of a UAS multispectral VNIR/SWIR camera for forage dry matter (DM) yield estimation with simple linear regression (SLR) models. The system acquired four bands (910 nm, 980 nm, 1100 nm, and 1200 nm); the selection of these bands was based on the vegetation indices NRI (Koppe et al., 2010) and GnyLi (Gnyp et al., 2014). Honkavaara et al. (2016) studied the potential of VNIR and SWIR hyperspectral frame cameras based on a tuneable Fabry-Pérot interferometer (FPI) in measuring a 3D digital surface model and the surface moisture of a peat production area, and Tuominen et al. (2018) applied the same camera technology to the estimation of species distribution in a highly diverse forest area.

Information about the performance of various sensors can guide users and industry in their choice of sensors for precision agriculture applications. It is relevant to investigate whether the expensive and heavy hyperspectral cameras are advantageous compared to commercial off-the-shelf RGB and multispectral cameras. Zheng et al. (2018) evaluated RGB, colour-infrared, and multispectral data for the estimation of nitrogen accumulation in rice, observing that the multispectral indices provided better performance. Studies by Askari et al. (2019) and Oliveira et al. (2020) showed that hyperspectral imaging provided better accuracy than RGB and multispectral imaging in estimating grass quality and quantity parameters when using partial least squares regression (PLSR) and random forest regression (RF) respectively. The study by Karila et al. (2022) indicated that with deep learning neural networks, the RGB images provided results comparable to hyperspectral images for many quality and quantity parameters. Tahmasbian et al. (2021) conducted a laboratory investigation with two hyperspectral cameras (VNIR 400–1000 nm and SWIR 1000–2500 nm) and a NIRS system for predicting carbon and nitrogen concentrations in ground samples of Australian wheat. The authors reported that the best results for carbon prediction were obtained using the hyperspectral VNIR camera, with the most important spectral region being 400–550 nm. Conversely, the SWIR hyperspectral camera provided the most accurate predictions of nitrogen, with the most important regions at 1451–1600 nm, 1901–2050 nm, and 2051–2200 nm.

This study’s objectives were to develop and assess a machine-learning pipeline for novel VNIR and SWIR hyperspectral imaging sensors to estimate the quantity and quality parameters of grass sward. The feasibility of these hyperspectral cameras was compared to that of more widespread RGB and multispectral cameras, and with data fusion. Machine-learning pipelines using random forest and recursive feature selection were implemented for each remote sensing technology in an empirical study on a controlled trial site, where remote sensing datasets were collected using various sensors. Specifically, this study develops and explores the use of state-of-the-art hyperspectral cameras, particularly a SWIR hyperspectral camera with a spectral range that has not previously been explored in research on UAS-based grass quality and quantity estimation.

Methods

Test area and experimental setup

The experimental trial was established in 2018 by the Natural Resources Institute Finland (Luke) in Maaninka, Finland (Central Finland, 63° 8′ 43.22″ N, 27° 18′ 46.20″ E, loam soil). The experiment tested the effects of mineral and organic N fertilisation on grass yield production with a split-plot design including four replicates, three main plots (fertilisation type: no cow slurry, cow slurry applied once, and cow slurry applied twice during the growing season), and five subplots (total nitrogen 0, 150, 250, 350, or 450 kg N ha−1 year−1). The plot dimensions were 1.5 × 8 m, and each plot was separated by a similarly sized cover plot (Fig. 1). The plant material was pure timothy (Phleum pratense L., cultivar ‘Nuutti’) cultivated as is typical for silage grass. The agronomic details of the experimental setup are described in Termonen et al. (2022). The data selected for the spectral sensor tests were from the third yield, harvested on 2 September 2021 and fertilised with 0, 30, 50, 70, or 90 kg N ha−1 one day after the second harvest (19 July). The first harvest was on 10 June. The variation in yield quantity and quality due to the carry-over effects (slower release of nutrients) of organic fertilisers is included in the data. The precipitation sum and the effective temperature sum, starting from the beginning of the growing season (11 May 2021), reached 296 mm and 1343 °Cd, respectively, by the third harvest (2 September 2021). The effective temperature sum was calculated as the sum of the positive differences between diurnal mean temperatures and 5 °C. The temperature sum was 215 °Cd above the long-term average (1990–2020).
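The stated effective temperature sum definition can be expressed directly; a minimal sketch (function name illustrative):

```python
def effective_temperature_sum(daily_mean_temps, base=5.0):
    """Effective temperature sum in degree-days (°Cd): the sum of the
    positive differences between diurnal mean temperatures and 5 °C."""
    return sum(max(t - base, 0.0) for t in daily_mean_temps)
```

For example, diurnal means of 10 °C, 4 °C, and 7 °C contribute 5 + 0 + 2 = 7 °Cd.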

Fig. 1

Experimental site in Maaninka, Finland (63° 8′ 43.22″ N, 27° 18′ 46.20″ E): (a) RGB orthomosaic with nitrogen fertiliser rates, from 0 to 450 kg ha−1 year−1; (b) canopy height model; (c) AFX10 orthomosaic (colour composite: B 540 nm, G 680 nm, R 840 nm); and (d) AFX17 orthomosaic (colour composite: B 1000 nm, G 1200 nm, R 1680 nm)

The field data for training consisted of 60 samples of fresh yield (FY), dry matter yield (DMY), digestible organic matter in dry matter (D-value), neutral detergent fibre (NDF), water-soluble carbohydrates (WSC), and nitrogen concentration (Ncont) (Table 1). The quality parameters were determined by near-infrared (NIR) spectroscopy in the laboratory of Valio Ltd.

Table 1 Descriptive statistics of the third harvest timothy data samples, mean, standard deviation (std), minimum (min), 25th, 50th, and 75th percentiles and maximum (max)

UAS and camera systems

The UAS data collection was carried out on 26 August 2021 using a DJI Matrice 300 RTK (M300) and a DJI Matrice 600 (M600). The M600 was employed in the hyperspectral data acquisition with two Specim AFX pushbroom hyperspectral cameras (Fig. 2), flown in separate flights. The Specim AFX hyperspectral cameras integrate a computer and a high-end GNSS/IMU unit. The AFX10 VNIR camera (2.1 kg, 400–1000 nm, spectral binning of 2) had a spectral resolution of 5.5 nm, a spectral sampling of 2.68 nm, 224 bands, 1024 spatial pixels, and a focal length of 15 mm. The AFX17 SWIR camera (2.4 kg, 900–1700 nm, no spectral binning) had a spectral resolution of 8 nm, a spectral sampling of 3.5 nm, 224 bands, 640 spatial pixels, and a focal length of 18 mm (Specim Spectral Imaging Ltd., 2022). The flights with the AFX10 and AFX17 were conducted from approximately 100 m above ground level, using the same flight trajectory. Both datasets consisted of six flight lines with a length of 425 m and an overlap of approximately 23%. The M300 (Fig. 2) carried a Zenmuse P1 RGB camera, with a 45-megapixel full-frame sensor (4.4 μm pixel size) with interchangeable fixed-focus lenses on a 3-axis stabilised gimbal, a DJI DL 35 mm f/2.8 lens, and RTK GNSS, as well as a Micasense Altum camera composed of five VNIR bands (Table 2). Ten ground control points (GCPs) were measured using a Topcon GNSS receiver to support the geometric quality of the data. Table 2 presents the details of the flights. Growth and changes in the quality of pure timothy stands are slow during the regrowth in the latter part of summer in Finland (Hyrkäs et al., 2016). Therefore, significant differences in yield and quality characteristics were not expected between 26 August and 2 September, despite this period covering 7 days of the total 46-day growing period since the previous cut.

Fig. 2

Unoccupied aircraft systems used in the campaigns. Left: Specim AFX17 onboard DJI Matrice 600 (Gremsy T7—Bundle for M600 gimbal is not in the picture). Right: DJI Matrice 300 RTK with Zenmuse P1 photogrammetric camera and the Micasense Altum multispectral camera

Table 2 Data and flights

Image processing

The hyperspectral AFX10 and AFX17 datasets were georectified and calibrated to radiances using the Specim CaliGeoPRO v2.3.12 software. The post-processed kinematic (PPK) GNSS/IMU solutions were calculated for the flight trajectories using the Applanix PosUAV v 8.6 software. The GNSS base station data were obtained from the National Land Survey FINPOS service, using the virtual reference station method. The DSM from the Zenmuse P1 sensor was used in the georectification phase. The boresight calibration was carried out in a signalised test field and enhanced in the campaign area.

The AFX10 dataset was collected in uniform illumination conditions, whereas the AFX17 dataset was collected in varying conditions: flight lines 1–3 were captured in uniform conditions, while during flight lines 4–6, the level of illumination was decreasing. To obtain a more uniform radiometric response over the area, a relative correction was computed using three regions with similar reflectance characteristics in the area to determine the linear parameters, considering the flight line with the panels as the reference. To identify channels strongly affected by atmospheric absorption, datasets collected from an FGI remote sensing test field in Sjökulla, Finland, were used. Pixels from a 1 m × 1 m area were extracted from three spectrally uniform gravel targets to compute the average and standard deviation. Bands where the standard deviation was three times higher than the coefficient of variation of all bands were removed. For AFX10, bands at 908–1003 nm, and for AFX17, bands at 1347–1485 nm and 1698–1720 nm, were not used. In total, 188 bands were used for AFX10 and 175 bands for AFX17, out of the 224 bands of each camera.
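The band-screening step can be approximated as follows; this is a sketch of one plausible reading of the criterion (the per-band coefficient of variation over a spectrally uniform target compared against the typical CoV of all bands), not the exact implementation:

```python
import numpy as np

def flag_noisy_bands(target_pixels, factor=3.0):
    """Flag bands dominated by atmospheric-absorption noise.

    target_pixels: (n_pixels, n_bands) radiances sampled from a
    spectrally uniform target (e.g. a 1 m x 1 m gravel area).
    A band is flagged when its coefficient of variation exceeds
    `factor` times the median CoV over all bands.
    """
    mean = target_pixels.mean(axis=0)
    std = target_pixels.std(axis=0)
    cov = std / np.maximum(mean, 1e-12)   # per-band coefficient of variation
    return cov > factor * np.median(cov)  # boolean mask of bands to drop
```

The returned mask can then be used to drop the affected channels before feature extraction.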

The geometric processing of RGB and multispectral camera datasets was carried out using Agisoft Metashape 1.7 software. The workflow included aligning images through a self-calibrating process, using a “high” quality configuration and the pre-calibrated interior calibration parameters of each camera. GCPs were measured manually. For the P1 datasets, RTK GNSS observations were utilised in georeferencing. All parameters and observations were optimised, and tie points were inspected using automatic and manual outlier removal. After the final optimisation, the dense point cloud and a digital surface model (DSM) were generated using the options “high” density and “mild” for depth filters respectively, followed by the orthomosaic generation for P1 and Altum images. The radiometric calibration of the multispectral images was done in Agisoft, as described in Agisoft (2022). The canopy height model (CHM) was generated from the RGB dense point cloud, by classifying the ground points to create a digital terrain model (DTM) and then subtracting the DTM from the DSM. The final CHM for the study area is presented in Fig. 1 (b).
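The CHM derivation is a per-cell difference of the two surfaces; a minimal sketch, assuming co-registered raster grids held as NumPy arrays:

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """CHM = DSM - DTM; small negative residuals caused by DTM
    interpolation noise are clipped to zero."""
    return np.clip(dsm - dtm, 0.0, None)
```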

The reflectance transformation was performed for the orthomosaics using four reflectance reference panels of 1 m × 1 m and nominal reflectance equal to 50%, 25%, 10%, and 5%, which were built in-house by installing Zenith Polymer films (SphereOptics GmbH, Uhldingen, Germany) on flat aluminium honeycomb panels (6 mm Potmacore panels by Potma, Pello, Finland). The panels were placed approximately 1 m above the ground. An additional three panels with a nominal reflectance of 50%, 10%, 3% were placed directly on the ground. The empirical line method (ELM) was applied for reflectance transformation with an exponential function for the RGB dataset and a linear function for the multispectral and hyperspectral datasets.
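The linear empirical line method fits, per band, a line mapping at-sensor values over the panels to their nominal reflectances, and applies it to the whole orthomosaic. A minimal sketch (array layout and names are illustrative; the RGB dataset used an exponential function instead):

```python
import numpy as np

def empirical_line(panel_dns, panel_refl, image_dns):
    """Linear empirical line method per band.

    panel_dns: (n_panels, n_bands) mean image values over each reference panel
    panel_refl: (n_panels,) nominal panel reflectances (e.g. 0.50, 0.25, 0.10, 0.05)
    image_dns: (..., n_bands) image values to transform to reflectance
    """
    out = np.empty_like(image_dns, dtype=float)
    for b in range(panel_dns.shape[1]):
        # least-squares line: reflectance = slope * dn + intercept
        slope, intercept = np.polyfit(panel_dns[:, b], panel_refl, 1)
        out[..., b] = slope * image_dns[..., b] + intercept
    return out
```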

Machine learning and assessment

The machine-learning process included feature extraction, feature selection, and a supervised learning process. Polygons were manually created in QGIS to match the locations of the grass experimental plots (8 m × 1 m) and used to extract several features from each camera's reflectance orthomosaic and from the CHM. These features consisted of spectral statistics (Appendix Table 7), as described in Nevalainen et al. (2017), computed for each band of each camera dataset. In total, there were 1880 features for AFX10, 1750 features for AFX17, 50 features for Altum, and 30 features for the RGB camera. Vegetation indices (Appendix Tables 8, 9, and 10) commonly applied in the literature were computed according to each sensor's band configuration. In addition to the previous features, the simple band ratio (SBR) of the reflectance spectra (Eq. 1) (Chappelle et al., 1992) and the two-band normalised difference (NBR) (Eq. 2) (Thenkabail et al., 2013) were computed for all possible band combinations for the two hyperspectral cameras. Altogether, 17 578 and 15 225 combinations were computed for AFX10 and AFX17 respectively.

$$SBR= \frac{{R}_{i}}{{R}_{j}}$$
(1)
$$NBR= \frac{{R}_{i}- {R}_{j}}{{R}_{i}+ {R}_{j}}$$
(2)

where i and j (i ≠ j) are the spectral bands of the reflectance (R).
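Enumerating all band pairs reproduces the stated feature counts: the 188 retained AFX10 bands give 188 × 187 / 2 = 17 578 combinations, and the 175 AFX17 bands give 15 225. A minimal sketch:

```python
import numpy as np
from itertools import combinations

def band_ratio_features(spectra):
    """SBR and NBR for all band pairs (i, j) with i < j.

    spectra: (n_samples, n_bands) mean plot reflectances.
    Returns sbr, nbr arrays of shape (n_samples, n_bands*(n_bands-1)//2).
    """
    pairs = list(combinations(range(spectra.shape[1]), 2))
    i = np.array([p[0] for p in pairs])
    j = np.array([p[1] for p in pairs])
    ri, rj = spectra[:, i], spectra[:, j]
    sbr = ri / rj                     # Eq. 1
    nbr = (ri - rj) / (ri + rj)       # Eq. 2
    return sbr, nbr
```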

Table 3 Summary of feature datasets
Table 4 Dry matter yield (DMY) and fresh yield (FY) mean and standard deviations (in parenthesis) across 3-times repeated tenfold cross-validation for root-mean-squared error (RMSE), normalised root-mean-squared error (NRMSE), coefficient of determination (R2) and Pearson correlation coefficient (R)

As the number of extracted features was much higher than the number of training samples, feature selection techniques were applied to identify the most relevant predictors of each grass parameter (Hennessy et al., 2020). First, a filter method based on the Pearson correlation coefficient was applied to remove highly inter-correlated features (correlation above 0.98) and features with a low correlation with the target variable (less than 0.3). Moreover, constant or quasi-constant features were removed. Next, a recursive feature elimination with cross-validation (RFECV) method was applied with tenfold cross-validation (CV), based on random forest (RF) estimator importance. RF is a supervised machine-learning algorithm (Breiman, 2001) that has proved to be one of the most accurate prediction methods for agricultural biophysical parameters (Prado Osco et al., 2019; Kganyago et al., 2021; Freitas et al., 2022; Angel & McCabe, 2022). In the RFECV with the RF estimator, for each CV split, feature importance is computed with the RF estimator, initially with all features, and then one or a few features are dropped iteratively until the minimum number of features is reached. The mean result from each split is computed, and the best-scoring iteration is selected. The maximum number of features was limited to 40, as RFECV selected hundreds of features in some cases.
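The two-stage selection (correlation filter, then RFECV with an RF importance ranking) can be sketched with Scikit-learn as below; the thresholds come from the text, while the RF hyperparameters and the way the 40-feature cap is enforced are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

def select_features(X, y, max_features=40):
    """Correlation filter followed by RFECV with an RF estimator
    (a sketch of the paper's pipeline; thresholds from the text)."""
    X = pd.DataFrame(X)
    # drop constant / quasi-constant features
    X = X.loc[:, X.std() > 1e-8]
    # drop features weakly correlated with the target (|r| < 0.3)
    target_r = X.apply(lambda col: np.corrcoef(col, y)[0, 1])
    X = X.loc[:, target_r.abs() >= 0.3]
    # drop one of each highly inter-correlated pair (|r| > 0.98)
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    X = X.drop(columns=[c for c in upper.columns if (upper[c] > 0.98).any()])
    # recursive feature elimination with 10-fold CV and RF importance ranking
    rfecv = RFECV(RandomForestRegressor(n_estimators=200, random_state=0),
                  cv=10, min_features_to_select=1)
    rfecv.fit(X, y)
    selected = X.columns[rfecv.support_]
    return list(selected[:max_features])  # cap at 40 as in the paper
```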

The selected features were used to create RF prediction models for the parameters analysed in this study. The accuracy of the models was assessed through leave-one-out cross-validation (LOOCV) and repeated tenfold CV. LOOCV is recommended when the amount of sample data is low. However, it can lead to higher variance, and the computational cost is high. As the results of both CV techniques were compatible, only repeated tenfold CV is presented. Implementations were based on the Scikit-learn python library (Pedregosa et al., 2011). The metrics for performance assessment included coefficient of determination (R2) as in Eq. 3 (see Scikit-learn, 2023), Pearson correlation coefficient (R), root-mean-squared error (RMSE), and normalised (by the mean) root-mean-squared error (NRMSE). The metrics represent the mean and the standard deviation of all runs in the 3-times repeated tenfold CV, resulting in 30 runs.

$${R}^{2}= 1- \frac{\sum {({y}_{i}-{\widehat{y}}_{i})}^{2}}{\sum {({y}_{i}-\overline{y })}^{2}}$$
(3)

where \({y}_{i}\) is the actual measured value, \({\widehat{y}}_{i}\) is the predicted value, and \(\overline{y }\) is the mean value of the actual measurements.
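The repeated cross-validation and the four reported metrics can be sketched as follows, with per-fold scores aggregated into the mean and standard deviation as in the tables (RF hyperparameters are illustrative):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RepeatedKFold

def repeated_cv_metrics(X, y, n_splits=10, n_repeats=3, seed=0):
    """3-times repeated 10-fold CV returning mean and std of RMSE,
    NRMSE (% of the mean measured value), R2 (Eq. 3), and Pearson R."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    scores = {"rmse": [], "nrmse": [], "r2": [], "r": []}
    cv = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    for train, test in cv.split(X):
        model = RandomForestRegressor(n_estimators=200, random_state=seed)
        model.fit(X[train], y[train])
        pred = model.predict(X[test])
        rmse = np.sqrt(np.mean((pred - y[test]) ** 2))
        scores["rmse"].append(rmse)
        scores["nrmse"].append(100 * rmse / y[test].mean())
        scores["r2"].append(1 - np.sum((pred - y[test]) ** 2)
                              / np.sum((y[test] - y[test].mean()) ** 2))
        scores["r"].append(pearsonr(pred, y[test])[0])
    return {k: (np.mean(v), np.std(v)) for k, v in scores.items()}
```

Note that with small test folds the per-fold R2 can be strongly negative, which is consistent with the negative mean R2 values reported for some quality parameters.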

The process described above was performed separately for each dataset, i.e., RGB, CHM, multispectral data (Altum), and the hyperspectral datasets (AFX10 and AFX17). For the RGB and multispectral orthomosaics, the extracted features (Sect. 2.4) were divided into spectral features and vegetation indices. For the two hyperspectral sensors, the extracted features were separated into four feature sets: spectral metrics, vegetation indices, SBRs, and NBRs. The RFECV feature selection and RF workflow was thus performed once per feature group (four times for the hyperspectral cameras) and then once more for the combination of the features selected from each feature set (Fig. 3). Additionally, two sensor combinations (data fusion) were evaluated alongside the analyses of each separate dataset, including features from the multispectral Altum and AFX17 (MS-VNIR_HS-SWIR) cameras and from the AFX10 and AFX17 (HS-VNIR_HS-SWIR) cameras. Table 3 shows the label and description of each feature set. The RF models formed from the different datasets were then compared using the Scott-Knott test (Scott & Knott, 1974) to assess the differences in prediction accuracy and to rank the best models. The Scott-Knott test was conducted using the NRMSEs obtained from the cross-validation process of each RF model. As 3-times repeated tenfold cross-validation was performed for each RF model, 30 NRMSEs were obtained per model. The lists of NRMSEs for all models were then used as input for the Scott-Knott test, using the implementation by Tantithamthavorn et al. (2019).
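The Scott-Knott procedure recursively partitions the ordered group means into statistically homogeneous clusters. A rough sketch of the classical test is given below; note that the paper used the Non-Parametric ScottKnott ESD implementation by Tantithamthavorn et al. (2019), which additionally accounts for non-normality and effect size:

```python
import numpy as np
from scipy import stats

def scott_knott(groups, alpha=0.05):
    """Classical Scott-Knott clustering (simplified sketch, not the
    Non-Parametric ScottKnott ESD variant used in the paper).

    groups: dict mapping model name -> list of NRMSEs from repeated CV.
    Returns a list of lists of model names, best (lowest mean NRMSE) first.
    """
    names = sorted(groups, key=lambda k: np.mean(groups[k]))
    means = np.array([np.mean(groups[k]) for k in names])
    n = len(means)
    if n < 2 or np.allclose(means, means[0]):
        return [names]
    grand = means.mean()
    # choose the split of the ordered means maximising between-group SS
    bss = [i * (means[:i].mean() - grand) ** 2
           + (n - i) * (means[i:].mean() - grand) ** 2 for i in range(1, n)]
    i = int(np.argmax(bss)) + 1
    # likelihood-ratio style test of the split (Scott & Knott, 1974)
    lam = np.pi / (2 * (np.pi - 2)) * max(bss) / means.var()
    if stats.chi2.sf(lam, df=n / (np.pi - 2)) > alpha:
        return [names]  # no significant split: one homogeneous group
    return (scott_knott({k: groups[k] for k in names[:i]}, alpha)
            + scott_knott({k: groups[k] for k in names[i:]}, alpha))
```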

Fig. 3

Feature groups by sensor. CHM: canopy height model, SPEC: spectral features, VI: vegetation indices, SBR: simple band ratio, NBR: normalised band ratio, RFECV: recursive feature elimination with cross-validation, RF: random forest, RMSE: root-mean-squared error, NRMSE: normalised root-mean-squared error, R2: coefficient of determination (source images: http://www.dji.com/fi/zenmuse-p1, micasense.com/es/altum/ and http://www.specim.fi/afx/)

Results

Spectral data analysis

Figure 4 shows a comparison of the mean reflectance among the five nitrogen rates, considering averages of all plots of the same rate, for all sensors. It can be observed that the reflectance spectra of the different sensors are of high quality and mutually compatible after the radiometric and reflectance calibrations. The experimental design produced the expected variation among the plots. The reflectance in the NIR region of 750–1300 nm increased with the N fertiliser rate applied, as higher N-rates increase LAI, and chlorophyll absorption declines in this spectral range. The spectra separated clearly for the N-rates of 0 and 90 kg ha−1; for the higher N-rates, the differences were smaller, as the spectral reflectance increased more slowly in parts of the spectrum, which is known as the vegetation saturation problem (Thenkabail et al., 2000). Correspondingly, greater differences appeared in the green (~ 550 nm) and red (~ 630 nm) bands.

Fig. 4

Average and standard deviation spectral reflectance of each nitrogen level sample plots for RGB, multispectral (Altum), and hyperspectral (VNIR AFX10 and SWIR AFX17) cameras

Quantity parameters prediction

Table 4 presents the mean and standard deviation results of the repeated tenfold CV for DMY and FY prediction validation, organised by sensor and sensor data fusion. Figure 5 displays boxplots of the NRMSE and R2 values for each iteration of the repeated tenfold CV. These results indicated that the sensor choice impacted the performance metrics of the quantity parameter estimation: the 3D and RGB features presented the worst results, and the hyperspectral cameras the best results, with lower interquartile ranges and fewer outliers. The multispectral features provided an NRMSE of 9.88% for DMY and 10.1% for FY, which was similar to the results of HS-VNIR, with NRMSEs of 9.18% for DMY and 9.56% for FY. However, the best accuracy for DMY was obtained by HS-SWIR, with an NRMSE of 8.4% and an R2 of 0.89. For FY, the best NRMSE (8.36%) and R2 (0.92) were obtained with the HS-VNIR_HS-SWIR combination. Thus, DMY and FY were estimated with comparable accuracies of about 8.4%. The Scott-Knott test (Table 6) ranked the feature sets HS-SWIR, MS-VNIR_HS-SWIR, and HS-VNIR_HS-SWIR in the same group as the best models for DMY, and MS-VNIR_HS-SWIR and HS-VNIR_HS-SWIR as the best models for FY.

Fig. 5

a Normalised root-mean-squared error (NRMSE) and b coefficient of determination (R2) over tenfold cross-validation and three repeats for dry matter yield (DMY) and fresh yield (FY). A few R2 outliers were removed from the graph to improve legibility. Boxplot components: midline = median, box = interquartile range, whiskers = data range (excluding outliers), points = outliers

Figure 6 shows the 10 most important features for the best models (HS-VNIR, HS-SWIR, MS-VNIR_HS-SWIR, HS-VNIR_HS-SWIR) selected by the RF. Feature importance scores were calculated as the (normalised) total reduction of the mean-squared error achieved by the feature over all decision trees in the forest (criterion “squared_error” in Scikit-learn). The higher the feature importance score, the more important the feature is in predicting the target variable. NBRs combining bands between 408 and 540 nm with bands between 550 and 700 nm appeared in the models for DMY and FY. The SWIR bands and ratios were also important for DMY and FY, where the NBR of the 1088.2 nm and 1098.65 nm bands was among the most important for the three models including HS-SWIR features.
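The importance ranking underlying Fig. 6 can be reproduced with Scikit-learn's impurity-based importances; a minimal sketch (hyperparameters illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def top_features(X, y, names, k=10):
    """Rank features by RF impurity importance: the normalised total
    reduction of squared error achieved by each feature over all trees."""
    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1][:k]
    return [(names[i], rf.feature_importances_[i]) for i in order]
```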

Fig. 6

The 10 most important features for HS-VNIR, HS-SWIR, HS-VNIR_HS-SWIR, and MS-VNIR_HS-SWIR models for a dry matter yield (DMY) and b fresh yield (FY). The higher the feature importance score, the more important the feature is in predicting the target variable

Quality parameters prediction

Regarding the quality parameters, the models with only 3D features gave the poorest predictions for almost all parameters, with the exception of NDF, followed by the RGB model. In contrast, the hyperspectral cameras and their combinations produced the best results (Table 5).

Table 5 Quality parameters’ mean and standard deviations (in parenthesis) across 3-times repeated tenfold cross-validation for normalised root-mean-squared error (NRMSE), coefficient of determination (R2), and Pearson correlation coefficient (R), for nitrogen concentration (Ncont), digestible organic matter in dry matter (D-value), neutral detergent fibre (NDF), and water-soluble carbohydrates (WSC)

For Ncont, the mean R2, R, and NRMSE varied from −0.77 to 0.33, from 0.44 to 0.81, and from 7.44% to 12.17%, respectively. Despite the low mean R2 values, Fig. 7 shows that the median values were above 0.6 for most of the experiments; the best results were obtained with HS-VNIR, while the Scott-Knott test indicated that the performances of HS-VNIR and HS-VNIR_HS-SWIR were similar (Table 6). In the HS-VNIR model, the best features were located around 430–604 nm, 726–760 nm, 800 nm, and 816 nm. The best bands for HS-SWIR were located around 960 nm, 1217–1330 nm, 1575 nm, and 1688–1695 nm (Fig. 8).

Fig. 7

Normalised root-mean-squared error (NRMSE) (%) and coefficient of determination (R2) over tenfold cross-validation and three repeats for a, b nitrogen concentration (Ncont), c, d digestible organic matter in dry matter (D-value), e, f neutral detergent fibre (NDF), and g, h water-soluble carbohydrates (WSC). Two R2 outliers, one for 3D and one for RGB, were removed from the graph to improve legibility. Boxplot components: midline = median, box = interquartile range, whiskers = data range (excluding outliers), points = outliers

Table 6 Best models according to the Non-Parametric ScottKnott ESD test for dry matter yield (DMY), fresh yield (FY), nitrogen concentration (Ncont), digestible organic matter in dry matter (D-value), neutral detergent fibre (NDF), and water-soluble carbohydrates (WSC)
Fig. 8

The 10 most important features for HS-VNIR, HS-SWIR, HS-VNIR_HS-SWIR, and MS-VNIR_HS-SWIR models for a nitrogen concentration (Ncont), b digestible organic matter in dry matter (D-value), c neutral detergent fibre (NDF), and d water-soluble carbohydrates (WSC). The higher the feature importance score, the more important the feature is in predicting the target variable

For the D-value, NRMSE values ranged from 1% to 1.4%, R2 from −0.96 to −0.10, and R from −0.09 to 0.53. The best results were very similar for the MS-VNIR_HS-SWIR, HS-VNIR_HS-SWIR, HS-VNIR, and HS-SWIR models (Table 6). The D-value had a very low correlation with the 3D features (R = −0.09). The most important features for the models with HS-VNIR features included NBRs in regions between 400 and 586 nm; for the models with HS-SWIR features, NBR bands were selected around the regions of 1074–1200 nm and 1620–1690 nm (Fig. 8).

The NDF models reached the poorest R values of all parameters, with mean R from 0.18 to 0.46 and median R2 below 0 in the tenfold CV (Fig. 7), indicating that the models were poorly fitted to this dataset. The RGB feature set provided the lowest accuracies (NRMSE 1.52%, R2 −0.43, R 0.19), and MS-VNIR_HS-SWIR provided the best performance (NRMSE 1.27%, R2 −0.04, R 0.46). The 10 best features for the HS-SWIR, MS-VNIR_HS-SWIR, and HS-VNIR_HS-SWIR models were similar and mostly composed of NBRs of HS-SWIR bands in the spectral region between 1488 and 1635 nm (Fig. 8).

As expected, the WSC estimation accuracies were comparable to those of Ncont due to their correlation. MS-VNIR and HS-VNIR provided the best NRMSE values for WSC; MS-VNIR presented the best NRMSE, at 12.02%, with an R2 of 0.31. For HS-VNIR, the most relevant features were SBRs in the 408–605 nm range and bands in the 722–750 nm spectral region. Additionally, VIs related to chlorophyll, which is correlated with nitrogen, were also among the best features. In the SWIR region, the selected features included bands around 940–1245 nm, 1340 nm, and 1695 nm.

Model comparison

An evaluation and comparison of all models using the Scott-Knott test (Table 6) showed that the HS-VNIR_HS-SWIR combination was among the best-performing group for all quality and biomass parameter estimations. Similarly, MS-VNIR_HS-SWIR was in the group of best models for all parameters except Ncont. The multispectral camera feature set alone (MS-VNIR) was among the best models for WSC and NDF. Considering estimations with single sensors, the best models were obtained with HS-SWIR for DMY, D-value, and NDF; with HS-VNIR for Ncont, D-value, and WSC; and with MS-VNIR for NDF and WSC.

Discussion

Quantity parameters

The predicted grass quantity results (Table 4) had NRMSE values ranging from 8.4% to 18.02% (128 to 276 kg DM ha−1) for DMY and from 8.36% to 23.62% (682 to 1868 kg ha−1) for FY, which is consistent with other works. Geipel et al. (2021) estimated DMY and FY for two years of a grass field using PLSR and achieved an NRMSE of 14.2% for FY and 15.2% for DMY. Oliveira et al. (2020) studied silage grass in a multi-date research setting using 3D, RGB, multispectral, and hyperspectral VNIR cameras. The results for the LOOCV of the regrowth training data provided the best NRMSEs of 17.18% for FY and 15.23% for DMY, with hyperspectral features. Although the RGB and multispectral datasets did not achieve the best results, they were comparable to other studies using 3D, RGB, and multispectral cameras (Lussem et al., 2022; Pranga et al., 2021).

The use of only 3D features led to the worst accuracies, as also observed by Pranga et al. (2021) and Lussem et al. (2022). Although not presented in the results, fusing the 3D data with spectral features decreased the estimation accuracy, which differed from the previous study by Oliveira et al. (2020). The poor accuracy of the 3D features could be due to the use of the third-growth data, where growth takes place through the increased density of the foliar stand, whereas in primary growth, growth occurs by extension of the stems. Photogrammetric 3D models may therefore be suboptimal data for regrowth analysis. Further studies could investigate ultra-high-resolution LiDAR 3D models.

The models built with MS-VNIR_HS-SWIR and HS-VNIR_HS-SWIR features outperformed models based only on 3D, RGB, multispectral, or hyperspectral VNIR data. Interestingly, the best results for both DMY and FY included SWIR data (Table 6). In the DMY estimations, the 10 most significant features for the best models (HS-SWIR, HS-VNIR_HS-SWIR, and MS-VNIR_HS-SWIR) included the NBR of the 1088.2 nm and 1089.65 nm bands and the normalised mean of the 1276 nm band, which is similar to the 1100 nm and 1200 nm bands studied by Jenal et al. (2020) with their four-band multispectral camera covering a SWIR range (910 nm, 980 nm, 1100 nm, and 1200 nm).
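The narrow band ratio (NBR) features discussed above can be sketched numerically. The normalised-difference form used below is an assumption for illustration only (the exact feature definitions follow Nevalainen et al., 2017), and the wavelength grid and spectrum are hypothetical stand-ins for a plot's mean reflectance:

```python
import numpy as np

def narrow_band_ratio(reflectance, i, j):
    """Illustrative narrow band ratio between bands i and j.

    Assumes the normalised-difference form (R_i - R_j) / (R_i + R_j);
    other ratio forms are possible and the exact definition should be
    taken from the feature table (Nevalainen et al., 2017).
    """
    r_i, r_j = reflectance[..., i], reflectance[..., j]
    return (r_i - r_j) / (r_i + r_j)

# Hypothetical plot spectrum: 200 bands spanning the SWIR range 900-1700 nm
wavelengths = np.linspace(900.0, 1700.0, 200)
spectrum = np.random.default_rng(0).uniform(0.1, 0.5, 200)

# Bands closest to the 1088 nm and 1276 nm regions mentioned in the text
i = int(np.argmin(np.abs(wavelengths - 1088.0)))
j = int(np.argmin(np.abs(wavelengths - 1276.0)))
print(narrow_band_ratio(spectrum, i, j))
```

With positive reflectances, this form is bounded to (-1, 1), which makes NBRs of different band pairs directly comparable as model inputs.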

Quality parameters

At their best, the quality parameters presented NRMSEs of 7.44% for Ncont, 1% for D-value, 1.27% for NDF, and 12.02% for WSC. For Ncont, the HS-VNIR and HS-VNIR_HS-SWIR sets had similar performance. In the HS-VNIR_HS-SWIR model, nine of the 10 most important features were from the VNIR range. These most relevant features were in the 400–500 nm and 700–800 nm wavelength regions, which are known to be sensitive to nitrogen (Thenkabail et al., 2011). When using HS-SWIR, the 1200–1700 nm spectral region among the most important features was also in accordance with other studies on grass nitrogen estimation (Pullanagari et al., 2021; Togeiro de Alckmin et al., 2020). When using MS-VNIR_HS-SWIR features, the 10 most significant features belonged to both sensors, but the highest contribution came from the TCI, a chlorophyll-related VI, which is consistent with the correlation between nitrogen and chlorophyll. Pullanagari et al. (2021) also obtained comparable N concentration prediction performance for the VNIR and SWIR regions using a 1D-CNN model and field spectrometer data.

Feature sets HS-VNIR and HS-SWIR achieved the best models for the D-value and WSC estimations. However, when analysing the data fusion between HS-SWIR and MS-VNIR, as well as between HS-SWIR and HS-VNIR, for the D-value and WSC models, the 10 best features were predominantly from the HS-SWIR data. This suggests that the SWIR region was a more significant contributor to the accuracy of the estimations, particularly when combined with the VNIR data. The correlation between the WSC and Ncont parameters was consistently observed, as evidenced by the similarity in the 10 most important features for both parameters. Furthermore, some of these best features were also included in the top features for DMY and FY, which are likewise correlated with Ncont and WSC. The good NRMSE values but low R2 values for D-value and NDF indicate that, although the models were unable to explain much of the variation in D-value and NDF, their predictions were relatively close to the actual values. This is probably related to the low variation and narrow range of the D-value and NDF training data (Table 1). These parameters describe different aspects of the forage's composition and its interaction with the digestive system of animals, so their estimation can be a more challenging task, requiring additional data from various sites and time periods (Stuth et al., 2003). The best features selected for the data-fusion models between HS-SWIR and MS-VNIR, as well as HS-SWIR and HS-VNIR, for NDF were predominantly from the HS-SWIR region. In the models using only HS-VNIR, bands in the 400–600 nm and 700–800 nm spectral regions were selected for all the estimated quality parameters, while the most commonly selected HS-SWIR bands for all parameters were around 1050–1300 nm and 1600–1700 nm. This observation suggests that the SWIR region is more representative, which aligns with previous studies indicating that this region is highly sensitive to lignin and cellulose (Thenkabail et al., 2013).
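The combination of good NRMSE but low R2 on a low-variance target can be reproduced with a small numerical sketch. The values below are synthetic, chosen only to mimic a target with little variation (in the spirit of the D-value data); they are not the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical target with a narrow range: mean 680, standard deviation 5
y_true = rng.normal(680.0, 5.0, 100)
# Predictions close in absolute terms, but carrying no information
# about the (small) variation around the mean
y_pred = y_true.mean() + rng.normal(0.0, 6.0, 100)

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
nrmse = rmse / y_true.mean() * 100.0          # expressed as % of the mean
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# NRMSE is around 1% (errors are tiny relative to the mean level),
# yet R2 is negative (none of the within-sample variation is explained)
print(f"NRMSE {nrmse:.2f}%  R2 {r2:.2f}")
```

Because NRMSE here is normalised by the mean, a target whose values cluster tightly around a large mean can yield a small NRMSE even from an uninformative predictor, while R2, which is normalised by the target variance, exposes the lack of explanatory power.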

Comparative analysis of remote sensing technologies and future studies

The presented results showed that the VNIR and SWIR hyperspectral cameras provided advantages for grass quantity and quality estimation compared with the multispectral and RGB cameras and the photogrammetric 3D point cloud. Interestingly, the combination of the multispectral camera and the SWIR hyperspectral camera provided performance comparable to the combination of the VNIR and SWIR hyperspectral cameras; however, for the quality parameters, most of the best features were selected from the SWIR camera. The multispectral camera obtained performance comparable to the hyperspectral data for the NDF and WSC parameters. The SWIR hyperspectral camera provided advantages particularly for the biomass parameters, as HS-SWIR and its combinations with both MS-VNIR and HS-VNIR presented better results than the VNIR cameras alone. The data fusion of the VNIR and SWIR ranges was among the best-performing models for both quantity and quality parameters. Considering the features selected for the estimations (Fig. 6 and Fig. 8), several narrow band ratios were utilised in constructing the models. This highlights the significance of the hyperspectral data, particularly its high spectral resolution, in achieving accurate estimation results.

The findings of this study were based on data from only one harvest and one environment. Further studies are therefore needed to validate these findings in more challenging conditions, such as mixed legume-grass setups or swards with different species and weeds, and for the earlier two harvests, when the variation in the studied parameters and the rate of quality decrease and yield increase are much higher. In particular, the variation in D-value and NDF was limited, which should be considered when drawing conclusions from the results. A small sample size may challenge the predictive power of the algorithms; however, the repeated tenfold cross-validation indicated good stability in the models.

Efforts to improve animal feeding are important in the context of the environment and climate change, as they can be effective mitigation measures (Rojas-Downing et al., 2017). To this end, the use of UAS and remote sensing data can provide valuable information about grass sward yield and nitrogen concentration. This knowledge can facilitate the calculation of the site-specific soil surface balance of nitrogen, helping optimise economical fertiliser application and minimise nitrogen losses (Valkama et al., 2016). This would be especially useful when manure is used as a nitrogen source, as roughly half the manure nitrogen supply is used by crops, while the rest is lost to the environment in different forms (Keskinen et al., 2022).

Conclusions

This study implemented and evaluated machine-learning pipelines for novel visible to short-wave infrared range (VNIR, SWIR) hyperspectral cameras and compared their performance to state-of-the-art methods based on 3D (crop height model), RGB, and multispectral cameras for estimating various grass quantity and quality parameters with a random forest estimator. To overcome the high dimensionality of the features, recursive feature elimination with cross-validation was used to select the most significant features for the random forest estimation models. The results showed that the hyperspectral sensors predicted grass quantity and quality traits more accurately than the other tested systems. The SWIR hyperspectral camera in particular provided advantages for the biomass parameters. This was the first study in which a wide spectral range (400–1700 nm) with high spatial resolution, measured from a UAS, was employed to estimate grass parameters. While more studies covering other datasets and variability are needed for robust applications, hyperspectral UAS-based data were shown to be a promising instrument for agricultural remote sensing tasks such as grass parameter estimation.
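The feature-selection step described above can be sketched with scikit-learn's RFECV wrapped around a random forest regressor. The synthetic data, feature count, and cross-validation settings below are illustrative stand-ins, not the study's configuration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold

# Synthetic stand-in for a spectral feature matrix (bands, NBRs, VIs):
# 96 samples by 60 features, of which only a few are informative
X, y = make_regression(n_samples=96, n_features=60, n_informative=8,
                       noise=5.0, random_state=0)

# Recursive feature elimination with cross-validation: repeatedly fit
# the estimator, drop the least important features, and keep the
# feature count that minimises the cross-validated error
selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    step=5,                                        # features removed per iteration
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
selector.fit(X, y)
print("selected features:", selector.n_features_)
```

The selected subset is then what the final random forest model is trained on; `selector.support_` gives a boolean mask over the original feature columns.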

Appendix

See Tables 7, 8, 9, and 10.

Table 7 Spectral features extracted for each band of the camera datasets (Nevalainen et al., 2017)
Table 8 Vegetation indices for RGB camera
Table 9 Visible near infra-red vegetation indices
Table 10 Short wavelength infra-red vegetation indices