Introduction

Grazing pasture at the optimal time for nutritional value is crucial for farmers seeking to capitalise on home grown feed resources in grazing livestock systems (Nuthall, 2012). Tools which are currently available to help farmers identify the right time point for grazing often focus on pasture quantity and not pasture quality. These include rising plate meters (Earle & McGowan, 1979), ultrasound sensors (Legg & Bradley, 2020), and multispectral cameras borne on unmanned aerial vehicles (Karunaratne et al., 2020) or satellites (Punalekar et al., 2018). However, optimising ruminant nutrition is related to the pasture’s nutritive characteristics more so than yield. The ability to easily monitor nutritive characteristics would therefore allow for precise manipulation of grazing ruminant diets so that nutrient supply is made equal to nutrient demand (Duranovich et al., 2021).

The analysis of fresh pasture for nutritive characteristics has traditionally been completed by laboratory near infra-red spectroscopy (NIRS), which is available to farmers at minimal cost but is slow as samples must be sent away to be analysed (Givens et al., 1997). These laboratory hyperspectral instruments measure the absorbance and reflectance of thousands of individual wavelengths (or narrow bands of multiple wavelengths) in the visible (VIS, 380–700 nm), near infra-red (NIR, 700–1400 nm) and short wave infra-red (SWIR, 1400–3000 nm) regions. A series of absorptions is present at different wavelengths across the VIS–NIR and SWIR spectrum. These absorptions, known as overtones, are the result of interactions with the fundamental vibrations of the chemical bonds (C–H, O–H and N–H) associated with the atoms of known chemical groups (Reddy et al., 2011). The functional groups of a nutrient molecule determine the overtone regions in which a spectral response is found. For example, wavebands centred around 1610 nm, which correspond to an N–H overtone, have been reported as informative for protein determination (Punalekar et al., 2021); similarly, wavebands around 2325 nm, which correspond to a C–H overtone, have been associated with fibre (Kawamura et al., 2008). In this way, detailed reflectance information can be related to the chemical composition of forage samples with high accuracy, particularly for dry matter and crude protein (Ariza-Nieto et al., 2018; Thomson et al., 2020). Other nutritional components such as ash, fibres and sugars can also be estimated by this method but reported accuracies for these nutrients tend to be lower than for protein and dry matter fractions (e.g. Rukundo et al., 2021). The reason for this is unconfirmed but may be a lack of strong spectral features or overlapping absorption regions with other nutrients.

It is possible to take non-destructive hyperspectral measurements directly in the field using handheld instruments, thus providing real-time nutritive characteristics. However, the expense of these instruments has so far limited their application to scientific studies rather than practical applications. The Analytical Spectral Devices (ASD) range of spectrometers (ASD Inc., Falls Church, VA, USA) are commonly used instruments in research studies that take detailed hyperspectral scans, similar in spectral range and resolution to benchtop machines. A number of studies using ASD instruments, or similar, have provided proof of concept that proximal, in-field, spectral data can be used to predict nutritive concentrations albeit with lesser accuracy than in highly controlled laboratory conditions (Adjorlolo et al., 2015; Duranovich et al., 2020; Kawamura et al., 2008; Pullanagari et al., 2012; Smith et al., 2019, 2020; Thomson et al., 2020). However, the bulky ASD instruments must be placed in a backpack when used in situ, and the operator needs to simultaneously carry a light shroud to minimise effects of changing light conditions, making them laborious to use in practice (Kawamura et al., 2009).

To make hyperspectral technology accessible to the farming industry, instruments that are lower cost, easier to use and more portable must be found. Ideally, such instruments would estimate pasture nutrient concentrations as accurately as, if not more accurately than, high-end scientific units such as the ASD FieldSpec 4 (FS4). New handheld instruments have come to the market in the last five years at more affordable price-points and with greater ease of use (Adão et al., 2017; Yakubu et al., 2020). The compromise with such instruments is that their spectral ranges and resolutions are typically lower than those of ASD instruments and therefore results may not be as robust. More scrutiny of these newer sensors is warranted as, at the time of writing and to the author’s knowledge, fewer than ten studies have been published focussing on the application of novel spectrometers to estimate pasture nutrient concentrations (Carreira et al., 2021; Geipel et al., 2021; Rukundo et al., 2021; Suzuki et al., 2008; Wijesingha et al., 2020).

The sensing principles of next generation instruments are varied in terms of light source, detectors, wavelength ranges, wavelength resolutions, and scanning mechanisms (Beć et al., 2021). In addition, sensors are now available that output data as hyperspectral images, where there is spectral information in each pixel rather than simply a single spectrum per sample. Adão et al. (2017) compiled a list of 29 hyperspectral cameras available at that time but also reported that studies using these for agricultural or forestry applications were still scarce, likely due to cost and the complexity of data processing. With increases in computing power and lowering of instrument cost over time, hyperspectral imaging offers new avenues to explore pasture nutrient prediction, such as image classification to isolate only the desired substrate within the field of view (Behmann et al., 2018; Gu et al., 2017). The potential of attaching hyperspectral imagers to all-terrain vehicles (Suzuki et al., 2008) and to unmanned aerial vehicles (Wijesingha et al., 2020) for paddock-scale nutrient mapping has also been demonstrated, albeit with some specialist post-processing tools required to stitch hyperspectral images together at a larger spatial scale.

One drawback of the literature on novel hyperspectral instruments to date is that studies tend to focus on a single sensor, offering no comparison between instruments with different specifications. The present study aimed to fill this gap in the literature by comparing three novel, low-cost handheld hyperspectral instruments, each with distinctive specifications, to the ASD FieldSpec 4 (FS4, the scientific control instrument) for proximal pasture nutrient assessment. Two research questions were asked: (i) could any of the novel instruments equal or exceed the predictive accuracy of the FS4 and (ii) could any of the novel instruments equal or exceed the predictive accuracy of a subset of FS4 data matching the spectral range of the novel instrument. The latter question enabled separation of the effects of spectral range and sensing principle (i.e. data type and capture method) on predictive accuracy by controlling for spectral range as an influencing factor.

Methods

Locations

This study was carried out in the months of November and December 2020. Data collection occurred in two paddock areas, B4–6 and B15 at the Ellinbank Smart Farm [Agriculture Victoria Research, Australia (38.2408 S, 145.9414 E)], located in the temperate dairying region of Gippsland, Victoria. The farm has facilities for research but is managed as a commercial farm, with paddock management representative of standard practice for the region. These paddock areas covered 3.02 ha and 3.40 ha for areas B4–6 and B15 respectively and were chosen because they have contrasting topography: B4–6 being drier and sloped, and B15 being flat and prone to waterlogging. The contrast between the two paddock areas was expected to provide variation in the resulting sample set with which to train robust models. Both paddocks had previously been sown with perennial ryegrass (Lolium perenne L.; variety ‘Bealey’).

Sensor selection

The purpose of this study was to compare novel handheld sensors with a current scientific standard (control) sensor for in-field application. Three candidate sensors were selected for inclusion based on having contrasting specifications. The control sensor, the FS4, was selected based on previously published studies in which this sensor was used successfully for the measurement of pasture nutritive concentrations (Adjorlolo et al., 2015; Duranovich et al., 2020; Kawamura et al., 2008; Pullanagari et al., 2012; Smith et al., 2019) and because it is considered an accurate and field-appropriate instrument for scientific research. The candidate sensors chosen were the Specim IQ (Specim; Oulu, Finland), the MicroNIR OnSite-W (Viavi; Scottsdale, Arizona, USA), and the NeoSpectra Scanner (Si-Ware; Menlo Park, California, USA), henceforth referred to as SPM, MOW, and NSS respectively. The candidate sensors cost between $7500 and $35000 AUD and were all considerably lower in cost than the FS4 (typically > $100000 AUD), as well as more appropriate for uptake in the agricultural sector due to their portability and intuitive user interfaces. These sensors were all commercially available, but none were specifically targeted at agricultural applications. The candidate instruments varied in their spectral ranges, spectral resolutions, and data capture methods, all of which are reported in Table 1. Of note was the SPM, a camera which captured data as hyperspectral images, where a full spectrum is recorded for every pixel. All other sensors provided single spectra which were utilised for data analysis. The main criterion used to compare the instruments was the accuracy of the nutritive models they could generate.

Table 1 A description of three candidate hyperspectral instruments (SPM, MOW, and NSS) and one ‘gold standard’ instrument (FS4) used in the present study

Sampling duration and interval

Sampling in each paddock area took place during the regrowth period between two grazing events. The study occurred in late Spring (November) and early Summer (December) when pasture accumulation rate was still high, but soil water was becoming limited, which is typical of Summer conditions in this region. During these two months, an onsite weather station recorded 42.2 mm and 56.2 mm of rainfall respectively, which was drier than the 5-year average for these months in this region (81 mm and 67 mm respectively). The grazing of the paddocks was completed in line with normal farm practice using the commercial herd at the site. Both paddocks were at a similar stage of the grazing rotation and were ready to be grazed simultaneously. The grazing events that marked the start of the study period occurred over two consecutive days (one day in each paddock) to synchronise their regrowth periods. The first two weeks of regrowth were avoided because a previous study showed that soil visibility during this early stage leads to mixed spectral signatures in the resulting data (Thomson et al., 2020). Sampling then occurred at three timepoints in each paddock area: during the third (W3), fourth (W4) and fifth (W5) weeks of the regrowth cycle. The paddocks were ready to be grazed again after this period. Sampling was deliberately staggered over time to provide variation in plant physiological stage in the dataset, because plant maturity is known to influence nutrient concentrations (Nelson & Moser, 1994).

Sampling design

Reference measurements were obtained in pre-set areas to enable calibration and validation of predictive models. This sampling approach was strategically randomised and has been published previously (Thomson et al., 2020). In brief: nine pre-identified 10 m × 10 m spatially separated plots within each of the two paddock areas were chosen (18 plots in total). Within each of those plots, three quadrat samples were collected using a 31 cm × 64 cm quadrat, giving 27 datapoints per paddock area, per timepoint. For the purposes of this study, the quadrat cut was the main unit of calibration, not the 10 m × 10 m plot. The three sampling locations within the plot were strategically randomised by following a set of predetermined bearings and positions, and sampling at ¼, ½, and ¾ distances through the plot, thus removing subjectivity and preventing the same spots from being sampled on subsequent visits. Two paddock areas, sampled three times each with 27 quadrats collected each time, gave a maximum sample set of 162 samples. However, 17 quadrats fell on areas of low yield and the resulting samples were later determined to be too small for nutritive analysis. Therefore, only 145 samples were used for modelling.

Acquisition of hyperspectral datasets

The four hyperspectral instruments were all used to collect data from the chosen sampling areas prior to the destructive cutting of the sample. On collection days, sampling occurred between 1000 and 1300 h when the sun was close to its zenith. All instruments were operated at the same time by three different individuals to ensure the data were comparable. Each instrument had its own method of operation, as illustrated in Fig. 1.

Fig. 1

Four hyperspectral instruments assessed for their ability to predict pasture nutrient composition in situ: (a) the ASD ‘FieldSpec 4’ (FS4) with a custom light shroud; (b) the Specim ‘IQ’ Camera (SPM); (c) the Viavi ‘MicroNIR Onsite-W’ (MOW); and (d) the Si-Ware ‘NeoSpectra Scanner’ (NSS)

The FS4 unit (Fig. 1a) was fitted with a pistol grip sensor and a 10° field of view focussing lens. The instrument was mounted in a backpack and the pistol grip was inserted into a holder in the top of a battery-powered custom light shroud, 40 cm above ground level, so that it sat at nadir above the pasture surface. The design of the battery-powered custom light shroud has been published previously (Thomson et al., 2020) and its purpose was to prevent any effect of changing ambient light conditions, as is recommended for the use of the FS4 in field conditions (Kawamura et al., 2009). The sensor was calibrated at the start of each sampling session using a Spectralon® white reflectance tile and re-calibrated in between each 10 m × 10 m plot. Data were collected in the form of single spectra (350–2500 nm). Three spectra were taken per quadrat and the spectra averaged to give sufficient coverage of the quadrat.

The SPM (Fig. 1b) was a hyperspectral camera that was mounted on a tripod set at 1.05 m above ground level. The camera was tilted to a 75° angle—this was set as close to nadir as possible without capturing parts of the tripod or the tripod’s shadow within the images. At this height the majority of a quadrat could be viewed within a single image so only one image was required for each sample. A white reflectance tile, provided with the camera, was placed in every image, to the side of the quadrat and the camera had an in-built mechanism for recognising the tile and thereby performing a calibration. To capture an image of correct exposure, the user had to manually set a suitable exposure time which depended on the intensity of ambient light. Feedback helped the user set an appropriate time, and if this quality check failed, the image was retaken with a better exposure setting. Spectra were extracted from selected pixels within each image in the range of 397–1004 nm, as described below.

The MOW (Fig. 1c) and NSS (Fig. 1d) sensors were both contact sensors that consisted of a handheld spectrometer with its own light source and a small (~ 2 cm wide) sampling window that was pressed against the material to be scanned (i.e. the leaves of the pasture). The MOW was operated while attached to a laptop to allow viewing and naming of the captured spectra, whereas the NSS was linked by Bluetooth to a mobile phone app (NeoSpectra Scan) for operation. Data were collected in the form of single spectra for MOW (950–1650 nm) and NSS (1345–2555 nm). Because these sensors could only survey a small portion of the forage, three measurements were taken within each quadrat and the obtained spectra were averaged for analysis to provide better coverage.

Reference data, structural and nutritive characteristics

Once hyperspectral data had been acquired and prior to destructive sampling of pasture, covariate information associated with each quadrat, including GPS location, pasture sward height, visual percentage cover, and leaf stage, was recorded. Sward height, without compression, was measured using a 70 cm ruler in three spots through the quadrat and the results were averaged. Leaf stage was measured using the Dairy NZ scoring system (McCarthy et al., 2016). Sward composition was visually estimated by dividing into categories of Live, Bare and Dead with maximum intervals of 5%. These ancillary data helped describe the nature of the samples collected, and statistical significance was determined by two-way ANOVA testing for effects of paddock, week, and their interaction on the measured pasture parameters. Destructive sampling of the pasture within the quadrat was carried out using battery-powered hand shears. First the biomass > 5 cm, mainly comprising aerial leaves, was removed and retained as one sample for future nutritive analysis. Pasture > 5 cm is accessible for livestock to eat and is therefore the most important part of the pasture to analyse. Then the remaining biomass < 5 cm was removed and retained for determination of total biomass.

The > 5 cm pasture samples were subsampled into 25% and 75% portions using a randomised manual quartering technique. Fresh weights of each subsample were recorded with the larger subsample oven dried at 60 °C until a constant weight was achieved (normally 72 h), then ground to 1-mm (Retsch Mill SM 300, Metrohm Australia Pty Ltd, Gladesville, Australia), and retained for future chemical composition analysis. All 145 samples were sent as one batch at the end of the study to DairyOne (Ithaca, NY, USA) for wet chemistry determination of eight nutritive characteristics: residual dry matter (DM), crude protein (CP), metabolisable energy (ME), ash, acid detergent fibre (ADF), neutral detergent fibre (NDF), water-soluble carbohydrate (WSC), and non-fibre carbohydrate (NFC). DairyOne forage analyses were undertaken according to laboratory accredited methods which have been described previously (Moate et al., 2020). The 25% subsample was oven dried at 100 °C to determine an accurate dry matter percentage for the > 5 cm fraction so total DM yield could be calculated.

The < 5 cm pasture samples were then washed in cold water, because they contained excessive contamination compared to the > 5 cm samples, prior to being dried at 100 °C until a constant weight was achieved (normally 72 h), weighed, and discarded. Final dry weights for both the < 5 cm and the > 5 cm samples were combined to give the total DM yield (applying the dry matter percentage from the smaller > 5 cm subsample to the total fresh weight of the > 5 cm sample).
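As an illustration only, the total DM yield calculation described above can be sketched as follows. The function names, the example weights, and the conversion to kg DM/ha are assumptions made for this sketch; only the quadrat dimensions (31 cm × 64 cm) and the order of the steps come from the text.

```python
# Quadrat dimensions from the sampling design (31 cm x 64 cm).
QUADRAT_AREA_M2 = 0.31 * 0.64


def dm_fraction(sub_fresh_g, sub_dry_g):
    """Dry matter fraction of the smaller (25%) subsample dried at 100 degC."""
    return sub_dry_g / sub_fresh_g


def total_dm_yield_kg_ha(fresh_above_g, sub_fresh_g, sub_dry_g, dry_below_g):
    """Total DM yield of one quadrat, expressed per hectare (assumed unit).

    fresh_above_g: total fresh weight of the > 5 cm sample
    sub_fresh_g, sub_dry_g: fresh and dry weights of the 25% subsample
    dry_below_g: dry weight of the washed < 5 cm sample
    """
    dry_above_g = fresh_above_g * dm_fraction(sub_fresh_g, sub_dry_g)
    total_dry_g = dry_above_g + dry_below_g
    g_per_m2 = total_dry_g / QUADRAT_AREA_M2
    return g_per_m2 * 10.0  # 1 g/m2 equals 10 kg/ha


# Hypothetical weights (not data from the study).
example_yield = total_dm_yield_kg_ha(
    fresh_above_g=200.0, sub_fresh_g=50.0, sub_dry_g=10.0, dry_below_g=19.52
)
```

With these hypothetical weights the > 5 cm fraction contributes 40 g DM and the < 5 cm fraction 19.52 g DM over a 0.1984 m2 quadrat.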

Hyperspectral data pre-processing

For three of the four instruments (FS4, MOW and NSS) data were obtained in the form of single (averaged) spectra that were immediately ready for processing. Images from the SPM instrument required additional processing to obtain equivalent data. The quadrat area within the hyperspectral image was first extracted through manual digitisation followed by clipping using ENVI software (L3 Harris Geospatial, Broomfield, Colorado, USA). The resulting clipped image was then classified using the spectral angle mapper (SAM) algorithm with the goal of isolating the spectra of photosynthetic vegetation in the image and discarding spectra from non-photosynthetic vegetation (leaf litter), soil, or shadowed areas (Kruse et al., 1993). The SAM algorithm is a supervised technique that matches unknown pixels to classes from a user-defined spectral library based on similarity to the reference spectra. A comparison of a raw hyperspectral image and a classified image is illustrated in Fig. 2. Finally, six images, one from each paddock and sampling week, were selected as validation images. In each validation image, 15 user-defined areas for each of the three classes were digitised and checked for correct classification. Following classification, the mean spectrum of the pixels classed as ‘photosynthetic vegetation’ was extracted to create a single spectrum per sample for modelling.
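The principle of SAM classification followed by mean-spectrum extraction can be sketched in a few lines. This is a generic illustration, not the ENVI implementation used in the study; the four-band reference spectra, the 0.10 rad angle threshold, and all function names are hypothetical.

```python
import numpy as np


def spectral_angles(pixels, references):
    """Angle (radians) between every pixel spectrum and every reference.

    pixels: (n_pixels, n_bands); references: (n_classes, n_bands).
    """
    p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
    r = references / np.linalg.norm(references, axis=1, keepdims=True)
    cosine = np.clip(p @ r.T, -1.0, 1.0)  # guard against rounding above 1
    return np.arccos(cosine)


def sam_classify(pixels, references, max_angle=0.10):
    """Assign each pixel to the closest reference; -1 if no angle is small enough."""
    angles = spectral_angles(pixels, references)
    labels = angles.argmin(axis=1)
    labels[angles.min(axis=1) > max_angle] = -1
    return labels


# Hypothetical library: 0 = photosynthetic vegetation, 1 = litter, 2 = background.
refs = np.array([
    [0.05, 0.08, 0.04, 0.50],
    [0.10, 0.15, 0.20, 0.30],
    [0.02, 0.02, 0.02, 0.03],
])
# Brighter copies of each reference plus one spectrum unlike any class.
pixels = np.vstack([refs * 2.0, [[1.0, 0.0, 0.0, 0.0]]])
labels = sam_classify(pixels, refs)

# Mean spectrum of the pixels classed as photosynthetic vegetation.
vegetation_mean = pixels[labels == 0].mean(axis=0)
```

Because the angle ignores vector length, a pixel that is simply a brighter or darker copy of a reference spectrum is still assigned to that class, which is one reason the method is often chosen for scenes with uneven illumination.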

Fig. 2

An example of a hyperspectral image acquired by the SPM (a) in its raw format and (b) classified using a trained spectral angle mapper (SAM) algorithm. Blue represents photosynthetic vegetation, green represents non-photosynthetic vegetation, and red represents background (soil and shadowed areas)

Using the Unscrambler 11 software (Camo Software; Oslo, Norway), the spectral data collected from all four instruments were treated identically for modelling. The edges of spectra were firstly trimmed of noise (Table 1) before a variety of standard spectral processing techniques were applied. These included: (a) derivatisation (1st or 2nd order), gap, and smoothing; (b) linear baseline correction and (c) scatter correction (using either standard normal variate detrending, normal multiplicative scatter correction, or weighted multiplicative scatter correction) for reducing the impact of light scattering during the spectral data acquisition. All combinations of these spectral pre-processing methods were applied and tested to find the best combination of techniques for each variable and for each instrument (Supplementary Table 1).
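To make the pre-processing steps more concrete, the sketch below shows generic forms of standard normal variate (SNV) scaling, polynomial detrending, and a simple gap first derivative. These are textbook versions written for illustration; they are not the Unscrambler routines, and the function names, the second-order detrend, the gap size and the toy spectra are all assumptions.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def detrend(spectra, wavelengths, order=2):
    """Subtract a per-spectrum polynomial baseline fitted against wavelength."""
    # Rescale wavelengths so the polynomial fit is well conditioned.
    x = (wavelengths - wavelengths.mean()) / (wavelengths.max() - wavelengths.min())
    coefs = np.polyfit(x, spectra.T, order)      # (order + 1, n_samples)
    baseline = np.vander(x, order + 1) @ coefs   # (n_bands, n_samples)
    return spectra - baseline.T


def gap_derivative(spectra, wavelengths, gap=2):
    """First derivative as a central difference across 2 * gap bands."""
    rise = spectra[:, 2 * gap:] - spectra[:, :-2 * gap]
    run = wavelengths[2 * gap:] - wavelengths[:-2 * gap]
    return rise / run


# Toy data: the same spectral shape recorded with a different gain and offset.
wl = np.linspace(400.0, 1000.0, 61)
shape = np.exp(-((wl - 700.0) / 120.0) ** 2)
spectra = np.vstack([1.0 * shape + 0.1, 1.5 * shape + 0.3])

scaled = snv(spectra)
processed = gap_derivative(detrend(scaled, wl), wl, gap=2)
```

In this toy case the two rows of `scaled` coincide, which shows how SNV removes additive and multiplicative differences between scans of the same material.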

Hyperspectral modelling

Models were fitted using the Unscrambler 11 software (Camo Software; Oslo, Norway) using the Partial Least Squares Regression (PLSR) method, which has historically been used extensively for the analysis of spectral data (Wold et al., 2001). Models were built for each of the eight measured nutritive parameters and for each of the four instruments, leading to 32 final models. Firstly, the optimum data pre-processing combination had to be chosen. To do this, a PLSR model was created for each differently pre-treated dataset using the entire sample set to develop the model and 20-fold cross validation to obtain statistics on predictive performance. The optimum data pre-processing method was selected based on the best cross validated result for each PLSR model. Secondly, the dataset was split into training (80%; 116 samples) and validation (20%; 29 samples) datasets to produce final models and perform independent validation. For fairness, the same data split was used to test all four instruments. The splitting method used was the conditional Latin hypercube sampling algorithm [from the cLHS package of R software (R Core Team, 2017)], which uses existing ancillary information (in this case ‘paddock’, ‘week’, and ‘DM yield’) to optimise sampling stratification so that the resulting calibration set had the same variance for the specified parameters as the original dataset (Minasny & McBratney, 2006). The remaining samples were assigned to a validation set. This method was used in preference to alternative methods that split the data based on spectral characteristics (such as the Kennard–Stone method) because these other methods would have resulted in a different optimum split for each instrument and prevented fair comparison.
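For illustration only, the cross-validation step might be sketched as below. The study fitted its models in Unscrambler; the NIPALS-style single-response PLS shown here is a generic stand-in rather than that software's implementation, and the function names, synthetic data and random fold assignment are assumptions. The conditional Latin hypercube split is not reproduced.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Single-response PLS regression; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk                  # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # scores for this factor
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X and y before the next factor
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef


def cross_val_predict_pls1(X, y, n_factors, n_folds=20, seed=0):
    """Out-of-fold predictions from k-fold cross validation."""
    order = np.random.default_rng(seed).permutation(len(y))
    predictions = np.empty(len(y))
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        coef, intercept = pls1_fit(X[train_idx], y[train_idx], n_factors)
        predictions[test_idx] = X[test_idx] @ coef + intercept
    return predictions


# Synthetic stand-in for 145 pre-processed spectra and one nutrient.
rng = np.random.default_rng(0)
X = rng.normal(size=(145, 30))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=145)

# Cross-validated RMSE for one to five factors.
rmsecv = [
    float(np.sqrt(np.mean((cross_val_predict_pls1(X, y, a) - y) ** 2)))
    for a in range(1, 6)
]
```

Repeating the last step for each differently pre-treated dataset and keeping the lowest cross-validated error mirrors, in outline, the selection procedure described above.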

The final model was developed using the pre-processing technique that was identified as optimum for that nutritive parameter (Supplementary Table 1), and the training sample set. Both 20-fold cross validation and independent validation through application to the 29 validation samples were performed for final models. The optimum number of factors to include in the final PLSR model was selected as the lowest number of factors (therefore the least complex model) that simultaneously gave the highest cross validation performance, to minimise any overfitting. At the cross-validation stage a means separation test (two-tailed Wilcoxon’s signed-rank test for paired samples) was applied to the absolute errors of each predicted datapoint. This tested the null hypothesis that there was no difference between the mean absolute error (MAE) of the candidate instrument and that of the control instrument for each instrument and nutrient combination. From the independent test, the following model metrics were determined: (a) root mean square error (RMSE), which represents model accuracy; (b) normalised RMSE, which is RMSE as a percentage of the measured mean of the data; (c) Lin’s concordance correlation coefficient (LCCC), which measures the fit of the predicted Y and the measured Y to a 1:1 line of parity; (d) the coefficient of determination (R2), which measures the degree of correlation between predicted Y and measured Y; (e) the ratio of the standard deviation of the measured dataset to the prediction error [RPD (Williams, 2014)]; (f) bias (mean error), which is the difference between the measured mean and the predicted mean of the dataset; and (g) MAE, which considers the average magnitude of errors without regard for their direction (positive or negative). Lin’s concordance correlation coefficient was considered the most important metric on which to base comparisons of model quality between instruments because it simultaneously considers both accuracy and precision (Tedeschi, 2006). Model prediction performance was considered superior when LCCC was closest to one. Computations for performance metrics were completed in a combination of Unscrambler 11 software (Camo Software; Oslo, Norway), R software (R Core Team, 2019) package epiR, and Microsoft Excel (Redmond, WA, USA). Plotting of spectra was performed using the R software package spectrolab (R Core Team, 2019).
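The validation metrics listed above can be written compactly. The following is a minimal sketch using common textbook definitions, with several assumptions: LCCC uses population (1/n) variances, R2 is taken as the squared Pearson correlation, RPD uses the sample standard deviation divided by RMSE, and bias is computed as predicted minus measured. The study's software may differ in these details, and the example values are hypothetical.

```python
import numpy as np


def prediction_metrics(measured, predicted):
    """Common agreement metrics between measured and predicted values."""
    y = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = p - y
    rmse = float(np.sqrt(np.mean(err ** 2)))
    covariance = np.mean((y - y.mean()) * (p - p.mean()))
    # Lin's CCC: penalises both scatter and departure from the 1:1 line.
    lccc = 2 * covariance / (y.var() + p.var() + (y.mean() - p.mean()) ** 2)
    r = np.corrcoef(y, p)[0, 1]
    return {
        "RMSE": rmse,
        "nRMSE_%": float(100 * rmse / y.mean()),
        "LCCC": float(lccc),
        "R2": float(r ** 2),
        "RPD": float(y.std(ddof=1) / rmse),
        "bias": float(err.mean()),   # sign convention is an assumption
        "MAE": float(np.mean(np.abs(err))),
    }


# Hypothetical predictions that are perfectly correlated but offset by one unit.
metrics = prediction_metrics([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

In this toy case R2 is 1 while LCCC is 0.8, which illustrates why a metric that accounts for departure from the 1:1 line can be preferred to correlation alone when comparing instruments.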

Separating effects of spectral range and data type/capture method

A second exercise aimed to understand to what extent each instrument’s performance could be attributed to the spectral range measured versus its sensing principles. To do this, the FS4 spectra were subset into the following three ranges: 400–920 nm, 914–1676 nm, and 1400–2555 nm, henceforth referred to as FS4_400–920, FS4_914–1676, and FS4_1400–2555. These subsets matched the trimmed spectral ranges of the three candidate instruments, SPM, MOW and NSS respectively; however, it should be noted that the FS4 spectral range ended at 2450 nm so the third spectral range for NSS could not be matched exactly. Final PLSR models were developed for each of the subset FS4 ranges in the same way as described previously. These were then compared against their matching instrument’s best models using the same model metrics described above. By controlling for the sensor’s spectral range, any remaining discrepancy in performance was considered to be due to the data type/capture method of the instrument. The data type and capture method were considered a combination of the sensor’s remaining attributes including distance to substrate, size of the source, lighting effects, and its native spectral resolution. It was acknowledged that these factors remained confounded as further separation of these effects was not within the scope of the present study.
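Subsetting the control spectra to each candidate instrument's range amounts to masking the wavelength axis. The sketch below assumes, purely for illustration, an FS4 wavelength axis of 350–2500 nm at 1 nm intervals before any noise trimming; the function and variable names are hypothetical and the placeholder spectra carry no data.

```python
import numpy as np

# Trimmed ranges (nm) of the candidate instruments, as given in the text.
RANGES_NM = {"SPM": (400, 920), "MOW": (914, 1676), "NSS": (1400, 2555)}


def subset_range(wavelengths, spectra, low_nm, high_nm):
    """Keep only the bands whose wavelength lies within [low_nm, high_nm]."""
    mask = (wavelengths >= low_nm) & (wavelengths <= high_nm)
    return wavelengths[mask], spectra[:, mask]


# Assumed FS4 wavelength axis and placeholder spectra for two samples.
fs4_wl = np.arange(350, 2501)
fs4_spectra = np.zeros((2, fs4_wl.size))

subsets = {
    name: subset_range(fs4_wl, fs4_spectra, low, high)
    for name, (low, high) in RANGES_NM.items()
}
band_counts = {name: wl.size for name, (wl, _) in subsets.items()}
```

Where the requested upper limit lies beyond the available axis, as for the NSS range, the mask simply returns fewer bands, which corresponds to the inexact match noted above.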

A second test was also conducted to examine the importance of spectral range, in which a jack-knife analysis (Martens’ uncertainty test) was applied to the results of the original FS4 PLSR modelling for each nutrient to determine the significance of the regression coefficients (in this case, the wavelengths). Significant wavelengths selected for each nutrient are shown in Supplementary Fig. 1.
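The idea behind a jack-knife test of regression coefficients can be sketched as follows. This is a simplified illustration rather than the Unscrambler procedure: ordinary least squares is swapped in for PLSR to keep the example short, the variance scaling of (M − 1)/M over M left-out segments is one commonly cited form, and the function names and synthetic data are assumptions.

```python
import numpy as np


def ols_coefs(X, y):
    """Stand-in estimator: least-squares slopes (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(design, y, rcond=None)[0][1:]


def jackknife_t(X, y, fit_coefs, n_segments=20, seed=0):
    """Full-data coefficients and jack-knife t-like ratios for each variable."""
    order = np.random.default_rng(seed).permutation(len(y))
    b_full = fit_coefs(X, y)
    squared_dev = np.zeros_like(b_full)
    for left_out in np.array_split(order, n_segments):
        keep = np.setdiff1d(order, left_out)
        # Deviation of each segment-deleted estimate from the full-data estimate.
        squared_dev += (b_full - fit_coefs(X[keep], y[keep])) ** 2
    std_err = np.sqrt(squared_dev * (n_segments - 1) / n_segments)
    return b_full, b_full / std_err


# Synthetic example: only the variable at index 2 carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 2] + 0.1 * rng.normal(size=60)
coefs, t_values = jackknife_t(X, y, ols_coefs, n_segments=10)
```

Variables whose ratio is large in absolute value have coefficients that stay stable when segments of samples are removed, which is the property used to flag wavelengths as significant.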

Results

Reference dataset description

The reference sample set was collected from the two paddock areas over three weekly timepoints corresponding to weeks three to five of a regrowth cycle post grazing. Table 2 reports whether there were significant effects of ‘paddock’ (B4–6 or B15) and ‘week’ (W3–W5) on the structural and nutritive characteristics of the sampled pasture. The pasture in paddock area B4–6 exhibited a slower accumulation rate and less total biomass than the pasture in B15, as shown by the effect of ‘paddock’ on sward height, leaf stage, and total biomass (all P < 0.005). However, for the nutritive characteristics, the effect of ‘paddock’ was only significant for DM and ME, where pasture in B4–6 had higher DM and ME than in B15. The effect of sampling week created the most variation in the sample set, with all nutritive parameters and all but one structural parameter (dead matter, %) significantly affected by ‘week’. The week in which the pasture was likely optimal for grazing in both paddocks was W4, as ME was higher in this week than in either W3 or W5 (P < 0.005). Crude protein was also high in W3 and W4 before a significant decline was observed in W5 (P < 0.005).

Table 2 The mean effects of paddock area (B4–6 or B15) and sampling week (W3 to W5, corresponding to the third, fourth and fifth regrowth weeks) on structural and nutritive characteristics of a perennial ryegrass-based pasture measured during the same regrowth cycle in late Spring/early Summer on a commercial dairy farm in Victoria (Australia)

Table 3 provides descriptive statistics for the entire resulting sample set combined for the eight nutritive parameters for modelling. For many nutrients a large variation was observed as shown by coefficients of variation (CV%) greater than 10%. For WSC, NFC and DM the measured ranges were very wide with CV% above 20%. The notable exceptions to this were ADF, NDF and ME with CV% of 8.4%, 7.0% and 3.9% respectively.
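For reference, the coefficient of variation used above is the standard deviation expressed as a percentage of the mean. A minimal sketch, assuming the sample (n − 1) standard deviation and using hypothetical values rather than study data:

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%): sample standard deviation over the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical concentrations (not data from the study).
example_cv = cv_percent([8.0, 10.0, 12.0])
```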

Table 3 The reference sample set which was used for modelling eight nutritive parameters of pastures

Spectral dataset descriptions

An example of a hyperspectral image acquired by the SPM is illustrated in Fig. 2. The SPM required classification of the image into three categories so that spectra of ‘photosynthetic vegetation’ could be extracted without distortion from spectra of ‘non-photosynthetic vegetation’ (plant litter), and soil and deeply shadowed areas (combined into one category, ‘background’). The SAM method was successful at classifying the images with a mean overall accuracy of 93.4% correct classification in the validation test (data not shown). Each image comprised 101 070 pixels in total, and using the six validation images, the average number of pixels that were classified as ‘photosynthetic vegetation’ to be used for spectral extraction was 82 725 or 81%. For the ‘photosynthetic vegetation’ category, which was the most important to correctly classify, the errors of omission (missed pixels or ‘false negatives’) were 0.91% while the errors of commission (pixels incorrectly included or ‘false positives’) were 8.55%. Where errors of commission occurred, it was most commonly pixels that should have been ‘non-photosynthetic vegetation’ that were incorrectly classified as ‘photosynthetic vegetation’.
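Overall accuracy and the errors of omission and commission are all derived from a confusion matrix of reference versus assigned classes. The sketch below uses hypothetical counts, not the study's validation data, and assumes rows hold the reference class and columns the class assigned by the classifier.

```python
import numpy as np


def classification_errors(confusion):
    """Overall accuracy plus per-class omission and commission errors.

    confusion[i, j] = number of validation areas of reference class i
    that the classifier assigned to class j.
    """
    cm = np.asarray(confusion, dtype=float)
    correct = np.diag(cm)
    overall_accuracy = correct.sum() / cm.sum()
    omission = 1.0 - correct / cm.sum(axis=1)    # reference items that were missed
    commission = 1.0 - correct / cm.sum(axis=0)  # assigned items that were wrong
    return overall_accuracy, omission, commission


# Hypothetical counts; class order: photosynthetic vegetation,
# non-photosynthetic vegetation, background.
confusion = [
    [90, 5, 5],
    [8, 80, 12],
    [2, 3, 95],
]
accuracy, omission, commission = classification_errors(confusion)
```

A class can therefore show a low omission error alongside a higher commission error, as reported above for ‘photosynthetic vegetation’, when few of its true pixels are missed but pixels from other classes are absorbed into it.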

Figure 3 shows the spectral data, as raw reflectance, obtained from each instrument, including the SPM data that were extracted as one mean spectrum for the ‘photosynthetic vegetation’ category after classification. In general, the FS4 spectra overlaid each other closely by sampling week whereas, for the other instruments, less consistent overlaying by sampling week was observed (Fig. 3). Furthermore, there were differences in the absolute values of reflectance observed between the instruments (on the y axis). The MOW instrument gave a maximum reflectance value in the NIR region 0.2 higher than that observed by the FS4 instrument (0.7 vs. 0.5, respectively).

Fig. 3

Processed and trimmed spectral signatures, shown as reflectance value (y axis), from each of four hyperspectral instruments (FS4, SPM, MOW and NSS) that each scanned the same 145 samples in situ. Each line is the signature from one sample, and these are coloured according to the sampling week in which they were collected (W3–W5). The spectral ranges of the three candidate instruments are marked on the FS4 plot. Known overtone regions for key chemical bonds (N–H, O–H and C–H) and their combinations are shown as bars for reference (Reddy et al., 2011). (FS4 ASD FieldSpec 4; MOW Viavi MicroNIR Onsite-W; NSS Si-Ware NeoSpectra Scanner; SPM Specim IQ Camera.)

Comparing the performance of the four instruments

Table 4 shows the calibration and 20-fold cross-validation results of the optimised model for each instrument and nutritive parameter combination. The eight calibration models for the FS4 showed the best fit, with R2 values of 0.81 on average in comparison to 0.67, 0.50 and 0.45 on average for the SPM, MOW and NSS respectively. However, the performance of the FS4 was reduced at the cross-validation stage: it was exceeded by the cross-validation performance of the SPM model for ADF (P = 0.036) and was closely matched by the SPM for prediction of CP, ME, NDF, DM, NFC and WSC. In comparison, the MOW and NSS models had numerically poorer cross-validation statistics than the FS4 for most parameters, and the difference reached statistical significance in some cases (the MOW model for DM had significantly higher error in comparison to the FS4, as did the NSS model for CP).

Table 4 Calibration and 20-fold cross-validation (CV) of models used to predict eight nutritive parameters of pastures produced by data from four different hyperspectral instruments

Table 5 and Fig. 4 show the results of an independent test of the optimum models by applying them to 29 validation samples that were withheld from the calibration set. A range of prediction performance metrics for the models are presented in Table 5 while Fig. 4 focusses on a comparison of LCCC scores for each of the model types (Fig. 4a) and for each instrument in comparison to the FS4 control (Fig. 4b). In every instance the instrument producing the model with the highest LCCC was either the FS4 or SPM. According to Fig. 4b, SPM exceeded the performance of the control FS4 instrument in six out of eight parameters with Δ LCCCs up to + 0.10. Ash and ADF were the two parameters for which SPM did not match the prediction performance of the FS4. However, ADF was poorly predicted in general with no instrument able to exceed an LCCC of 0.40, which would be considered poor performance. If an arbitrary LCCC threshold value for acceptable performance of a predictive equation was set at 0.70 then the FS4 was able to predict five out of eight nutrients acceptably (CP, WSC, NFC, Ash and DM) in comparison to four (CP, WSC, NFC and DM), two (CP and DM) and two (NFC and DM) nutrients respectively for SPM, NSS and MOW. The performance of the MOW and NSS instruments lagged behind the other two instruments in terms of predictive performance for every parameter tested, with Δ LCCCs exceeding − 0.3 for MOW and − 0.2 for NSS in comparison to the performance of the FS4 for some parameters.
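For readers unfamiliar with the metric, Lin's Concordance Correlation Coefficient penalises both scatter and systematic offset from the line of parity. The sketch below uses Lin's standard definition with population (1/n) variances. The function name and the example values are hypothetical, and the software used in the study may differ in detail.

```python
from statistics import fmean


def lccc(measured, predicted):
    """Lin's Concordance Correlation Coefficient for two equal-length series."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_m, mean_p = fmean(measured), fmean(predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted)) / n
    # Denominator grows with any mean offset, so bias lowers the score.
    return 2 * cov / (var_m + var_p + (mean_m - mean_p) ** 2)


# Hypothetical reference values and predictions (illustration only).
measured = [150.0, 170.0, 190.0, 210.0, 230.0]
control_pred = [155.0, 165.0, 195.0, 205.0, 235.0]
candidate_pred = [160.0, 172.0, 188.0, 200.0, 220.0]

delta_lccc = lccc(measured, candidate_pred) - lccc(measured, control_pred)
print(f"Delta LCCC (candidate minus control): {delta_lccc:+.3f}")
```

A positive Δ LCCC in this convention means the candidate agrees with the reference values more closely than the control does.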

Table 5 Independent validation of models using 29 samples to predict eight nutritive parameters of pastures produced by data from four different hyperspectral instruments
Fig. 4

Shown in part a are the Lin’s Concordance Correlation Coefficient (LCCC) of models used to predict eight nutritive parameters of pastures produced by data from four different hyperspectral instruments (FS4, SPM, MOW and NSS). Part b shows the change (Δ LCCC) for each instrument versus the gold standard FS4, for each of the eight nutrients. (ADF acid detergent fibre; CP crude protein; DM dry matter; FS4 ASD FieldSpec 4; ME metabolisable energy; MOW Viavi MicroNIR Onsite-W; NDF neutral detergent fibre; NFC non-fibre carbohydrate; NSS Si-Ware NeoSpectra Scanner; SPM Specim IQ Camera; WSC water soluble carbohydrate)

Separating effects of spectral range and data type

To understand why some instruments were more successful at predicting nutrient concentrations than others, a further test was undertaken to separate the effect of the instruments’ spectral range from their data type/capture method. A new set of ‘control’ models was made for each nutrient, this time using only subset ranges of the FS4 data, where each subset matched the spectral range of one of the candidate instruments. The performances of the FS4 subset models are shown in Fig. 5 (for LCCC) and in Table 6 (other metrics). Using LCCC as the key metric of predictive performance, the FS4 914–1676 nm subset produced the best model three times (CP, WSC and DM) whilst the FS4 1400–2555 nm subset produced the best model four times (ME, NFC, Ash and ADF). The FS4 914–1676 nm and FS4 1400–2555 nm subsets tied for the best model of NDF. Interestingly, the FS4 400–920 nm subset, which had the same spectral range as the successful SPM instrument, never produced the best model. However, in many cases, differences in the predictive power of the spectral ranges were small, and the jack-knifing analysis to assess significant wavelengths (Supplementary Fig. 1) also suggested that there were significantly informative regions throughout the entire 450–2450 nm range for all nutrients.
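The sub-setting step amounts to trimming each FS4 spectrum to the wavelength window of a candidate instrument before model fitting. A minimal sketch follows, using the three windows given in the caption of Fig. 5. The function name, the 50 nm sampling interval and the flat reflectance values are hypothetical placeholders.

```python
# Wavelength windows (nm) matching each candidate instrument, as given for Fig. 5.
SUBSET_RANGES_NM = {
    "SPM": (400, 920),
    "MOW": (914, 1676),
    "NSS": (1400, 2555),
}


def subset_spectrum(wavelengths_nm, reflectance, low_nm, high_nm):
    """Keep only the bands whose wavelength lies within [low_nm, high_nm]."""
    kept = [(w, r) for w, r in zip(wavelengths_nm, reflectance)
            if low_nm <= w <= high_nm]
    return [w for w, _ in kept], [r for _, r in kept]


# Hypothetical full-range spectrum sampled every 50 nm (illustration only).
wavelengths = list(range(400, 2451, 50))
reflectance = [0.1] * len(wavelengths)

subsets = {}
for instrument, (low, high) in SUBSET_RANGES_NM.items():
    subsets[instrument] = subset_spectrum(wavelengths, reflectance, low, high)
    kept_wavelengths = subsets[instrument][0]
    print(instrument, kept_wavelengths[0], kept_wavelengths[-1], len(kept_wavelengths))
```

Each trimmed spectrum would then be passed to the same calibration procedure as the full-range data, so that any difference in performance reflects spectral range alone.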

Fig. 5

The Lin’s Concordance Correlation Coefficient (LCCC) of models used to predict eight nutritive parameters of pastures produced first by the full spectral range of the ASD FieldSpec 4 (FS4, 400–2450 nm) instrument and then using three subset ranges of FS4 data (400–920 nm, 914–1676 nm, 1400–2555 nm) which matched the spectral ranges of three other candidate instruments used in this study (SPM, MOW and NSS, respectively). (ADF acid detergent fibre; CP crude protein; DM dry matter; ME metabolisable energy; MOW Viavi MicroNIR Onsite-W; NDF neutral detergent fibre; NFC non-fibre carbohydrate; NSS Si-Ware NeoSpectra Scanner; SPM Specim IQ Camera; WSC water soluble carbohydrate)

Table 6 Independent validation of models used to predict eight nutritive parameters of pastures produced by three subset spectral ranges originating from a larger dataset from the FieldSpec 4 instrument

Figure 6 shows the difference between each FS4 subset model and the instrument with a matching spectral range. The SPM was compared against the FS4 400–920 nm subset, the MOW against the FS4 914–1676 nm subset, and the NSS against the FS4 1400–2555 nm subset. The SPM instrument exceeded the performance of its matching FS4 400–920 nm subset for every nutrient. For half the nutrients the Δ LCCCs were ≥ 0.10 (CP, WSC, NDF and NFC). The NSS instrument also marginally exceeded the performance of its matching FS4 1400–2555 nm subset for two out of eight nutrients (NDF and NFC) but otherwise under-performed, particularly for ME where the Δ LCCC was − 0.4. The MOW instrument always underperformed against its matching FS4 914–1676 nm subset (Δ LCCC was always negative).

Fig. 6

The change in Lin’s Concordance Correlation Coefficient (Δ LCCC) between the best model of each of three candidate instruments (SPM, MOW and NSS as illustrated in Fig. 4a) and their matching FS4 subset range: 400–920 nm, 914–1676 nm, 1400–2555 nm respectively (as illustrated in Fig. 5). (ADF acid detergent fibre; CP crude protein; DM dry matter; FS4 ASD FieldSpec 4; ME metabolisable energy; MOW Viavi MicroNIR Onsite-W; NDF neutral detergent fibre; NFC non-fibre carbohydrate; NSS Si-Ware NeoSpectra Scanner; SPM Specim IQ Camera; WSC water soluble carbohydrate)

The best candidate instrument

Based on the aforementioned results, the SPM camera exceeded the predictive capabilities of the other candidate sensors and was therefore judged to be superior. Validation plots using 29 samples and reporting predicted data versus the reference data for the SPM are shown in Fig. 7. For CP, WSC, NFC and DM the linear regression line of the data showed good agreement with the line of parity, indicating no issue with the slope of the modelled data. In the case of CP, care is advised when applying the model above 200 g/kg DM as the model appeared to underpredict beyond this point. Ash was also predicted well but the result was influenced by one extreme datapoint. For NDF, ADF and ME the slopes of the regression lines deviated from the lines of parity, indicating poor predictive performance, particularly at high and low values. These nutrients were also poorly predicted by the other sensors including the FS4. They were also the same nutrients for which there was limited variation in the dataset, as shown by the CV% from Table 3.
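The comparison between a regression line and the line of parity can be reduced to checking how far the fitted slope is from 1 and the intercept from 0. The sketch below fits an ordinary least-squares line of measured values on predicted values. That choice of axes is an assumption, and the function name and data are hypothetical.

```python
from statistics import fmean


def regression_line(predicted, measured):
    """Ordinary least-squares fit of measured (y) on predicted (x).

    Returns (slope, intercept); the line of parity has slope 1, intercept 0.
    """
    mean_x, mean_y = fmean(predicted), fmean(measured)
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, measured))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical case where predictions are compressed towards the mean,
# so high values are underpredicted and low values overpredicted.
measured = [100.0, 150.0, 200.0, 250.0, 300.0]
predicted = [150.0, 175.0, 200.0, 225.0, 250.0]

slope, intercept = regression_line(predicted, measured)
print(f"slope {slope:.2f}, intercept {intercept:.1f}")
```

A slope well away from 1, as in this made-up example, corresponds to the pattern described for NDF, ADF and ME, where predictions were weakest at the extremes of the range.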

Fig. 7

Validation plots showing the relationships between measured and predicted nutritive concentrations for eight parameters of pasture derived from an independent validation test of the most promising candidate instrument: the Specim IQ (SPM). The line of parity is shown with a solid black line and a linear regression line (measured Y versus predicted Y) is shown as a dashed grey line

Benchmarking performance relative to previous work

The results shown in Fig. 8 benchmark the candidate units against seven FS4 calibrations published by others. The only performance metric reported consistently across the seven suitable studies was R2, so this was the parameter used for comparison; however, it should be noted that R2 is influenced by the variance of the dataset to which it is applied and therefore results should be interpreted with caution. It was observed that CP and ME were the most consistently well-predicted nutrients (R2 > 0.7 on average) in previous pasture studies. The predictive R2 of models built in the present study for all instruments were below average for both CP and ME, though considerably more so for ME. Previously reported performance for WSC and ash was also consistent but with lower mean R2 than for CP and ME (0.5–0.6 on average). For these nutrients, the present study produced above average quality of prediction for the FS4 (and for the SPM but for WSC only). For DM, ADF and NFC there was no clear consensus from previous literature as to how well these are typically predicted from spectral data, with some studies reporting high R2 (above 0.70) and others low R2 (below 0.40) for the same nutrient. In the present study, ADF and NDF had far below average prediction performance by all instruments whereas DM had far above average prediction performance for all instruments.
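The caution about R2 can be illustrated numerically: identical prediction errors give very different R2 values depending on how much variation the reference data contain. In this sketch R2 is computed as the coefficient of determination (1 minus the ratio of residual to total sum of squares). That is an assumption, since individual studies may instead report the squared correlation, and all values are hypothetical.

```python
from statistics import fmean


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_m = fmean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


# The same absolute errors applied to a narrow and a wide reference range.
errors = [5.0, -5.0, 5.0, -5.0, 5.0]
narrow = [190.0, 195.0, 200.0, 205.0, 210.0]
wide = [100.0, 150.0, 200.0, 250.0, 300.0]

r2_narrow = r_squared(narrow, [m + e for m, e in zip(narrow, errors)])
r2_wide = r_squared(wide, [m + e for m, e in zip(wide, errors)])
print(f"narrow range R2 = {r2_narrow:.3f}, wide range R2 = {r2_wide:.3f}")
```

Both hypothetical datasets have the same RMSE, yet the wide-range set scores a much higher R2, which is why comparisons of R2 across studies with different sample variation need care.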

Fig. 8

Benchmarking the R2 of prediction of each of four hyperspectral instruments used in the present study (FS4, SPM, MOW, NSS) against the mean (and standard deviation—shown as bars) R2 from up to seven previously published studies for eight nutritive parameters of pasture. The number of studies reporting each nutrient were: CP = 7; ME = 3; Ash = 4; DM = 3; ADF = 6; NDF = 6; NFC = 1; WSC = 3. Included studies were: (Adjorlolo et al., 2015; Duranovich et al., 2020; Kawamura et al., 2008; Pullanagari et al., 2012; Smith et al., 2019, 2020; Thomson et al., 2020). (ADF acid detergent fibre; CP crude protein; DM dry matter; FS4 ASD FieldSpec 4; ME metabolisable energy; MOW Viavi MicroNIR Onsite-W; NDF neutral detergent fibre; NFC non-fibre carbohydrate; NSS Si-Ware NeoSpectra Scanner; SPM Specim IQ Camera; WSC water soluble carbohydrate)

Discussion

The performance of the control instrument

For this study, the FS4 was considered the control instrument for proximal prediction of pasture nutrient concentrations, given there have been many proof-of-concept studies reporting calibration models for this purpose using ASD FieldSpec instruments. Examples include Australian calibrations designed for perennial ryegrass plant breeding plots (Smith et al., 2019, 2020) and commercial dairy pastures (Thomson et al., 2020), New Zealand calibrations for practical use in diverse dairy pastures (Duranovich et al., 2020; Pullanagari et al., 2012), calibrations for Japanese mixed species pastures (Kawamura et al., 2008), and calibrations for alpine pastures in South Africa (Adjorlolo et al., 2015). Thomson et al. (2020) reported a study using the FS4 that preceded the present study, conducted at the same site, with similar sample numbers, and with a similar study design for gathering reference data. The main difference between the two studies (excluding comparison of different hyperspectral instruments) was that the work reported by Thomson et al. (2020) took place in mid to late Spring (October/November) as opposed to late Spring/early Summer (November/December) for the present study. Using normalised RMSE of prediction for comparison, there was a slight improvement in the validation performance for CP, NFC and ash in the present study versus the preceding study, and a marked improvement for WSC (15.5% versus 23.1%). However, predictions of the remaining parameters (NDF, ADF, DM and ME) had a higher normalised RMSE in the present study in comparison to that of Thomson et al. (2020), indicating worse predictive performance. This comparatively poor performance may, in part, be due to the season in which the study took place. In the present study, the dry Summer period had started to take effect in paddock B4–6, as shown by elevated pasture DM in the last week of the study and a low rate of biomass accumulation observed throughout the three sampling timepoints (Table 2). The worst predictive equations for the FS4 (NDF, ADF and ME) were all for nutrients with a low CV% in the reference dataset (Table 3), indicating a lack of variation present for those parameters even though the samples were taken during a period of increased moisture depletion and the onset of the reproductive stage (Clark et al., 2013). Ideally reference sample sets should have a wide variation present for each parameter of interest for them to be well modelled (Cougnon et al., 2014). However, this did not hinder the present study, which aimed to compare three candidate instruments to the FS4 when calibrated using the same sample set, not to create the most robust models possible.

The performance of the candidate instruments

The candidate instruments varied in their ability to predict nutrient concentrations of pasture. The most promising instrument, and the only instrument to exceed the predictive performance of the FS4 in some cases, was the SPM (the hyperspectral camera). The NSS and MOW sensors underperformed in comparison, both against the SPM and against the FS4 control. Only one of the candidate instruments had, at the time of writing and to the author’s knowledge, been calibrated for pasture nutrient concentrations previously in the literature. Carreira et al. (2021) recently published MOW cross-validation results for CP and NDF calibrations with RMSE of 23.7 and 36.3 g/kg DM respectively. In comparison the present study achieved a slightly improved RMSE of 20.6 and 35.3 g/kg DM for CP and NDF respectively using the MOW (Table 4). For further comparison, the performance of the instruments can be benchmarked (i) against FS4 calibrations published by others (Fig. 8), (ii) against other calibrated novel spectrometers in the literature, and (iii) against the performance of laboratory NIRS instruments.

When the candidate instruments were benchmarked against FS4 calibrations from the literature, results varied by nutrient (Fig. 8). For CP, ME, WSC and Ash where previous results were quite consistent, the FS4 and SPM predictions were often either within the expected range or close to it. The exception was ME where predictions in the present study were not as accurate as those presented previously. For DM, ADF and NFC there was no clear consensus from previous literature as to how well these are typically predicted. One reason for this inconsistency, as stated previously, might be due to the reference dataset used (e.g. sample number, range, and standard deviation) and the conditions under which it was collected (e.g. seasons, years, climates), leading to either more or less variation being present with which to model. This makes it more difficult to benchmark these parameters.

Another benchmarking approach was to examine how the candidate instruments compared against other examples of next generation sensors for which calibrations have been published. There are two notable studies published recently that can be discussed. The first study compared two handheld sensors to a laboratory NIRS machine for their ability to predict nutrient concentrations in American forage types (Rukundo et al., 2021). One sensor was a high-end unit similar to the FS4 and the second a less expensive unit similar to the MOW. Rukundo et al. (2021) used RPD as the main metric of performance. The models developed by Rukundo et al. (2021) for the handheld devices had high accuracy for prediction of nitrogen, which is used to calculate CP (RPD ≥ 2.3), and lower prediction accuracy for ADF and NDF (RPD ≥ 1.6). In each case the performance exceeded the best RPD values reported in the present study (which were for models produced by the SPM instrument in both instances): 1.67 for CP and 1.17 for NDF. A likely explanation for this difference is that Rukundo et al. (2021) cut the forage samples and placed them in a cup prior to scanning with the handheld devices, thus homogenising them and excluding distortion from soil. A more comparable example was a Norwegian study (Geipel et al., 2021) that used a hyperspectral camera similar to the SPM with a 450–800 nm spectral range, attached to an unmanned aerial vehicle to survey a larger area. Geipel et al. (2021) achieved a model that was able to predict CP with a normalised RMSE of 11.7% and NDF with a normalised RMSE of 4.8%. In comparison the normalised RMSE values for CP and NDF predicted by the SPM instrument in the present study were remarkably similar: 11.1% and 5.8% respectively, despite the fact that a larger dataset (n = 707) was used for calibration and validation in the study of Geipel et al. (2021). This supports the evidence that the SPM is a strong candidate for the estimation of nutritive characteristics, and its attachment to a ground-based or aerial vehicle is of interest to future studies.
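The two metrics used for these cross-study comparisons, normalised RMSE and RPD, are both simple ratios built on the RMSE of prediction. The sketch below assumes that RMSE is normalised by the mean of the measured values and that RPD is the sample standard deviation of the measured values divided by the RMSE. Conventions vary between studies (for example, normalising by the range instead), and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def rmse(measured, predicted):
    """Root mean square error of prediction."""
    return sqrt(fmean([(m - p) ** 2 for m, p in zip(measured, predicted)]))


def normalised_rmse(measured, predicted):
    """RMSE as a percentage of the mean measured value (assumed convention)."""
    return 100 * rmse(measured, predicted) / fmean(measured)


def rpd(measured, predicted):
    """Ratio of performance to deviation: SD of measured values / RMSE."""
    return stdev(measured) / rmse(measured, predicted)


# Hypothetical crude protein values in g/kg DM (illustration only).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]

print(f"RMSE = {rmse(measured, predicted):.1f} g/kg DM")
print(f"normalised RMSE = {normalised_rmse(measured, predicted):.1f}%")
print(f"RPD = {rpd(measured, predicted):.1f}")
```

Because RPD scales with the standard deviation of the reference data, it rewards datasets with wide variation in the same way that R2 does, whereas RMSE normalised by the mean does not.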

There have been much larger scale studies completed using laboratory NIRS instruments in order to understand the maximum possible accuracy for nutritive characteristics estimation in vegetation. One recent publication demonstrates that it is possible to have a prediction performance of R2 > 0.88 for DM and R2 > 0.90 for CP, ADF and NDF with a robust enough sample set scanned under ideal laboratory conditions, i.e. dried and ground to remove variance attributable to water and increase homogeneity of the sample (Ariza-Nieto et al., 2018). Whilst it is likely that in-field spectrometry will never be capable of the same level of accuracy as can be achieved in a laboratory setting, this gives hope that future in-field developments will be able to improve upon present results, perhaps by considering instrument design to remove sources of light scattering, methods of increasing sample homogeneity in the field, and the gathering of larger reference datasets.

Relative importance of spectral range and data type

The SPM was shown to excel at capturing data in a useful form, as it outperformed a matching subset spectral range from the FS4 sensor (Fig. 6). The explanation for this was the hyperspectral imaging datatype, which allowed for precise classification and subsequent extraction of a spectrum that related only to the photosynthetic vegetation in the sample, free from distortion from litter, soil and shadow. The FS4, and other scientific instruments like it, are flawed in this regard because the spectra they record will be a mixed response of the reflectance of all the different material types captured within their field of view. Previous studies have noted the problem of bare soil exposure distorting spectral signals from grasslands (Asner et al., 2000). This was one reason why Thomson et al. (2020) concluded that the FS4 was not appropriate for use in the first two weeks post grazing in dairy pastures, until sufficient canopy closure has been reached. Hyperspectral imaging offers a method to overcome this limitation because, even when there is minimal grass present, the vegetative regions can be isolated and extracted for prediction. This could facilitate starting pasture nutrient monitoring directly post grazing. Another advantage the SPM had over the FS4 was a larger field of view, allowing the majority of the quadrat to be captured in a single image. In comparison, the FS4 required three spectra to be taken and averaged to be sure of good sample coverage, which may have reduced the quality of the resulting spectral information.

A notable result of the present study was the comparatively poor performance of the NSS and MOW instruments. In the majority of cases, both instruments failed to equal the predictive performance of a matching subset spectral range from the FS4. The regions they measured were potentially informative. In fact, when the subset FS4 spectral ranges were compared, these two subset ranges were marginally better at predicting nutrient concentrations than the spectral range that matched the SPM instrument (Fig. 5). This is also consistent with previous literature that has suggested that the SWIR spectral region contains several wavelengths that are sensitive to various nutrients, particularly fibrous components like cellulose (Adjorlolo et al., 2013, 2015). However, this may vary by nutrient, as a different study found more informative wavelengths for CP in the VIS–NIR region than the SWIR region (Togeiro de Alckmin et al., 2020), which also agreed with the results of the jack-knifing analysis from the present study (Supplementary Fig. 1). According to the results of the subset calibrations in the present study, it would have been expected that either the MOW or the NSS instrument would outperform the SPM instrument for most nutrients. The fact that this was not the case suggests some deficiency in their methods of data capture. While it is not possible to isolate the exact cause of the lower data quality of these two units, the factors that seem most likely to cause the problem would be their small sampling area and lower spectral resolution in comparison to the FS4 and SPM. Both instruments were ‘contact’ style sensors requiring the sample to be pressed against a window, approximately 2 cm wide, for scanning, which restricted the area that could be scanned at any one time, even though three scans were taken and averaged for each sample to try to improve coverage. Furthermore, the canopy surface is not homogeneous, so decaying leaves and dirt particles could have been caught in the scanned area, though care was taken to try to avoid this. For future work, the number of scans per sample taken by such contact style sensors may need increasing, or the sample area should be decreased to better match the amount scanned. In comparison, the SPM was averaging the results of tens of thousands of scans from each of the classified pixels in each image, which was likely key to its comparative success.
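The averaging step that distinguishes the imaging approach can be sketched as follows: only the pixels classified as photosynthetic vegetation contribute to the mean spectrum, so litter, soil and shadow are excluded before modelling. The function name, class labels and the tiny four-pixel, three-band image are hypothetical; the real images contained around 100 000 pixels.

```python
def mean_class_spectrum(pixel_spectra, pixel_labels, target_label):
    """Average, band by band, the spectra of pixels assigned to one class."""
    selected = [spectrum for spectrum, label in zip(pixel_spectra, pixel_labels)
                if label == target_label]
    if not selected:
        raise ValueError(f"no pixels were classified as {target_label!r}")
    n_bands = len(selected[0])
    return [sum(spectrum[band] for spectrum in selected) / len(selected)
            for band in range(n_bands)]


# Hypothetical reflectance values for four pixels and three bands.
pixel_spectra = [
    [0.05, 0.40, 0.45],  # green leaf
    [0.07, 0.44, 0.47],  # green leaf
    [0.20, 0.25, 0.30],  # plant litter
    [0.02, 0.03, 0.03],  # deep shadow
]
pixel_labels = ["vegetation", "vegetation", "litter", "background"]

vegetation_mean = mean_class_spectrum(pixel_spectra, pixel_labels, "vegetation")
print([round(value, 3) for value in vegetation_mean])
```

A non-imaging spectrometer, by contrast, effectively records the average over every row above, which is the mixed response described for the FS4.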

Recommendations for future instrument development and on-farm uptake

Recommendations can be drawn from this work to inform future instrument design. The following instrument specifications would seem advantageous for the purposes of pasture prediction: (a) a higher spectral resolution, (b) a wide field of view and/or the ability to scan a large sample area, and (c) a hyperspectral imaging datatype as opposed to single spectra. Results from comparing different subsets of FS4 data (Table 6) showed that having a spectral range in the NIR-SWIR region was often more informative for prediction purposes than a VIS–NIR spectral range, although both ranges were sensitive to nutritive changes to different extents. As a result, the success of the SPM camera was attributed to its sensing principles rather than any superiority of its spectral range. This would suggest that a hyperspectral camera of similar specification to the SPM but with a spectral range in the NIR-SWIR region might have been the optimal specification for predicting nutrient concentrations of pastures. Based on the present study, it appears that the SPM might already be capable of superseding the FS4 as the ‘gold standard’ for in situ pasture nutrient prediction. However, the reference dataset used in the present study was limited, and further tests on a larger and more diverse sample set would be required to make this recommendation.

The SPM camera is an example of an instrument that would enable farmers or consultants to start analysing their own pasture for nutrient concentrations because it is relatively inexpensive (about three times cheaper than the FS4), easier to use and more portable than the FS4. The potential importance of regular nutrient monitoring with a portable spectrometer has already been demonstrated. One study showed that cows grazing in New Zealand were often not receiving the optimum nutrition from their pasture due to an excess of CP supply and limited ME availability (Duranovich et al., 2021). Analysis of weekly pasture nutrient concentrations for the two paddock areas observed in the current study showed that these data would have enabled a farmer to identify the best timepoint for grazing: the fourth week of regrowth. This did not differ between the two paddocks, which were grazed one after the other, despite them having quite different characteristics e.g. moisture limited versus moisture retaining (Table 2). Therefore, an approach that farmers could take to utilising these technologies would be to select a single sentinel paddock representative of a group of paddocks grazed in a similar time period, then undertake weekly observations of its nutritive characteristics watching for a peak in ME and a sustained CP value that has not yet started to drop in order to decide on the suitability of paddocks for grazing. The frequency of observation might need to differ throughout the year as the rapidity of changes in nutrient concentrations fluctuates. Duranovich et al. (2021) found that variation in pasture nutrient concentrations could be better explained by the month within the season than by either the season or the paddock. Further work would be warranted to find a balance of the best practice method that is least labour intensive for collection of these data as part of farming practice.

Conclusions

Out of three ‘next generation’ spectrometers that were compared to the current ‘gold standard’ instrument for pasture nutrient prediction, one instrument was able to match and even improve upon the performance of the control sensor: a hyperspectral camera with a VIS–NIR spectral range. However, by sub-setting and contrasting data obtained from the control instrument to match the three candidate instruments, it was ascertained that the NIR-SWIR regions were potentially more informative than the VIS–NIR regions. Therefore, it was notable that the two candidate sensors tested that had spectral ranges in the NIR-SWIR regions did not meet that potential. Both these sensors had small fields of view, requiring direct contact with the substrate to be scanned, and also lower spectral resolutions. As a result, instruments with specifications similar to these would not be recommended for use in proximal pasture nutrient prediction based on the results of the present study. Instead, more research is warranted into hyperspectral imaging and its application for nutrient characteristics prediction in pastures.