1 Introduction

1.1 Background

Wheat ranks third among US field crops in terms of planted area and production following corn and soybeans (USDA 2021). Winter wheat production accounts for approximately 78% of US total wheat production (USDA 2021). In 2021, US farmers produced a total of 34.84 billion kilograms of winter wheat from an estimated harvested area of 10.32 million hectares (USDA 2021). However, with increasingly extreme and variable climate, loss of fertile soil, limited water resources, as well as increased competition in global wheat markets, there has been a general downward trend in US winter wheat production over the last two decades (Ray et al. 2013; Lollato et al. 2017; USDA 2021). Previous research suggests that agricultural land expansion is not a sustainable solution to achieve higher productions and is often associated with environmental impacts such as land degradation, groundwater pollution, and loss of biodiversity (Borlaug and Dowswell 2003). Thus, it is argued that increasing crop yield in existing cultivated lands, rather than expanding them, would be necessary to meet increasing demands for food and biomass (Borlaug and Dowswell 2003). Plant breeding programs strive to achieve higher yields by identifying crop varieties that are more resistant to different stresses under variable conditions. This requires continuous monitoring of agricultural crops over time and their characteristics, which are referred to as “phenotypic traits.”

Plant phenotypes are a set of observable “traits”, such as structural and biochemical characteristics, that are formed during plant growth from dynamic interactions between plant genetic constitution (genotype) and the environment (Granier et al. 2006; Houle et al. 2010). Quantitative measurement of these plant phenotypes is known as plant phenotyping. Through plant phenotyping, it is possible to improve crops and develop new cultivars that are resistant to a variety of disturbances such as water stress and drought (White and Conley 2013; Li et al. 2014; Thorp et al. 2015). However, plant phenotyping efforts have been impeded by lack of efficient technology to measure plant traits in the field (Furbank and Tester 2011; Zaman-Allah et al. 2015; Yang et al. 2017).

Traditional methods of plant phenotyping often rely on lab-based approaches or ground vehicles equipped with sensors to measure plant traits. These approaches are costly, destructive, and time-intensive, and cannot be applied over large field trials. Satellite and airborne remote sensing have been widely used to partially address some of these limitations through non-destructive estimation of plant traits (Sankaran et al. 2015; Yang et al. 2017). However, satellite data often lack the needed spatial resolution to reliably estimate plant traits and data acquisition using airborne remote sensing is costly (Han-Ya et al. 2010; Gevaert et al. 2015; Sankaran et al. 2015). In addition, even the finer spatial resolution of airborne platforms often exceeds the size of research plots and individual plants. Due to this scale-mismatch between remote sensing observations and plant size in croplands, the capability of these data to estimate plant traits is often limited. Lastly, plant phenotyping requires multi-temporal assessment of plant traits over the growing season. However, multi-temporal airborne and field data collection campaigns are very costly. Unmanned/unoccupied aerial systems (UASs; also called drones) are capable of collecting remotely sensed data over large field trials with very fine spatial resolution. In addition, UASs can collect data with high temporal resolution at relatively low cost. In this study, we tested the capability of UAS-based remote sensing to monitor several winter wheat traits and yield over time.

1.2 UAS-Based Remote Sensing of Phenotypic Traits

Crops, depending on their status and health, have different structural and biochemical traits, as well as phenology (i.e., changes of these traits over time), all of which affect plants’ spectral response (Ustin and Gamon 2010). In other words, differences in spectral patterns detected by optical remote sensing are largely related to plant structural, biochemical, and phenological properties, which vary among species. Therefore, we posit that spectral responses recorded by UAS-based remote sensing can be used to monitor plant structural, biochemical, and phenological properties. UAS-based remote sensing, in general, has been successful at estimating plant phenotypic traits using three main approaches including (1) vegetation indices (VIs), (2) machine learning algorithms (MLAs), and (3) structure-from-motion (SfM).

VIs combine data from two or more spectral bands to build a single index value that is capable of characterizing the abundance and status of vegetation. There is a large body of literature that suggests UAS-derived VIs can estimate different crop structural and biochemical traits (Sultana et al. 2014; Wang et al. 2014; Geipel et al. 2016; Haghighattalab et al. 2017; Cai et al. 2019; Tao et al. 2020). While plant trait estimation based on UAS-derived VIs has shown promise, the use of VIs as indicators of plant structural and biochemical traits is constrained by specific limitations. For instance, a plant trait may affect specific wavelengths of the spectrum, although a VI may not include specific wavelength(s) relevant to that plant trait. Furthermore, the relationship between spectral data and plant traits can vary due to environmental conditions, time of data acquisition, and crop phenological status (Epiphanio and Huete 1995; Colombo et al. 2003; Gitelson et al. 2003). Thus, these relationships are often site- and time-specific and need to be re-calibrated.

As an alternative approach to VIs, MLAs have the ability to establish adaptive relationships between independent and dependent variables (Elarab et al. 2015; Tilly et al. 2015) and potentially, can provide more robust prediction of plant traits (Gholizadeh et al. 2015). Recent studies have demonstrated the potential of MLAs to estimate crop traits and yield (Cipollini et al. 2001; Verrelst et al. 2011; Vilas et al. 2011; Hassan-Esfahani et al. 2014). For instance, Maimaitijiang et al. (2017) assessed the performance of UAS-derived VIs and two MLAs including support vector regression (SVR) (Cortes and Vapnik 1995) and extreme learning regression (ELR) in estimating soybean chlorophyll content, N content, LAI, fresh biomass, and dry biomass. Their results showed that compared with VIs, MLAs improved estimation of soybean traits, among which ELR yielded the most accurate results. Similarly, Li et al. (2019) were able to predict winter wheat yield during the early, middle, and late growing stages of winter wheat using UAS-derived SVR. Collectively, these studies have suggested that within a machine learning framework, UAS data can provide strong estimations of different plant traits for a variety of crops.

SfM is a topographic surveying technique that has emerged from computer vision and photogrammetry (Carrivick et al. 2016). SfM often uses common red–green–blue (RGB) images to create three-dimensional (3D) models of the Earth’s surface (Carrivick et al. 2016; Madec et al. 2017; Walter et al. 2018). The 3D models obtained from SfM technique are widely used to generate surface and elevation models and estimate crop structural traits and yield (Turner et al. 2012; Ballesteros et al. 2014; Bendig et al. 2014; Chapman et al. 2014; Holman et al. 2016; Du and Noguchi 2017). For instance, several phenotyping studies used SfM to estimate crop height, LAI, and biomass from UAS-based RGB data (Maimaitijiang et al. 2017; Hu et al. 2018). These studies suggested that SfM is a reliable approach for estimating crop structural traits.

Given the opportunities provided by UASs, we assessed their capability in estimating a suite of winter wheat traits including N content (%), plant height (cm), fresh biomass (gr), dry biomass (gr), LAI (m2/m2), and yield (kg/ha) over the growing season. Specifically, we addressed two objectives: (1) assess the ability of UAS-derived metrics to estimate winter wheat biochemical and structural traits and yield over the growing season and (2) determine whether using MLAs improves estimation of winter wheat traits and yield. Our central hypothesis was that plant traits affect remote sensing spectral signatures and that, as such, UAS-derived metrics would be capable of estimating crop N content, structural traits, and yield and their temporal changes. Additionally, we expected that MLAs would provide more robust estimation of plant traits than other techniques such as VIs. To achieve these objectives, we used UAS-mounted RGB and multispectral sensors to collect remote sensing data from winter wheat fields at different stages of growth, including tillering, booting, and harvest, at our experimental plots located in Stillwater, Oklahoma, US. To validate these remotely sensed data, we measured a suite of plant traits on the ground, including N content, plant height, biomass, LAI, and yield. We then estimated winter wheat traits using several VIs, MLAs, and SfM during different stages of the growing season. Findings of this research can potentially improve monitoring practices by breeders through non-destructive and rapid acquisition of winter wheat phenotypic information, provide alternative strategies for improving winter wheat production while minimizing environmental consequences, and eventually help winter wheat producers reduce associated costs.

2 Methods

2.1 Study Site

This experiment was conducted at the Efaw site (36° 08′ N, 97° 06′ W) in Stillwater, Oklahoma, US. The soil type was silty loam (fine-silty, mixed, superactive, thermic fluventic Haplustolls). The experiment covered the active winter wheat growing season from November 2019 to June 2020. The experimental field was rainfed, and the plots received five pre-plant N rates with four replications each. In total, there were 20 plots, each 3 m × 6 m in size, with an alley of 3 m between plots (Fig. 1). Soil N treatments included five different rates of 0, 50, 100, 150, and 200 kg/ha. Treatment 0 kg/ha served as an unfertilized check plot and treatment 200 kg/ha served as a non-limiting N reference.

Fig. 1 The study area is shown with a true color composite (date of imagery: March 29, 2020). Red numbers below each plot represent pre-plant applied soil N treatment (kg/ha)

2.2 Data Collection

2.2.1 Collecting Ground Reference Data for Model Development and Validation

We measured LAI (m2/m2), plant height (cm), fresh biomass (weights of fresh samples; gr), dry biomass (weights of dry samples; gr), N content (%), and yield (kg/ha) of winter wheat within 20 research plots during winter wheat tillering, booting, and harvest growing stages. The LAI measurements were taken using a LP-80 ceptometer (METER Group, Inc., Washington, US), which quantifies LAI based on photosynthetically active radiation (PAR) measurements in the 400–700 nm range. Five LAI measurements were taken within each plot using a 1 m × 1 m quadrat. We collected 100 LAI measurements during each field campaign, including tillering, booting, and harvest. We then took the average of the five readings per plot and used it as the LAI of the respective plot. The LAI readings were taken while facing the sun to avoid shadow on the ceptometer probe. Plant height was measured as the distance from ground to the top of canopy using a ruler. We measured plant height of 25 randomly selected plants from each plot during each field campaign. We then took the average of these 25 measurements to represent the mean plant height in each plot. For biomass measurements, the whole winter wheat plants were harvested approximately 2 cm above the soil surface from the middle row of each plot within a north-facing 25 cm × 25 cm quadrat after each UAS data collection (i.e., one sample per plot; 20 biomass samples in total for each field campaign) (see Fig. S1 in Supplementary material for an example). Fresh biomass weight was quantified by weighing the harvested samples in the field using a portable balance (Ohaus SJX1502N/E Scout, OHAUS Corporation, Parsippany, US). The samples were then bagged and transferred to the Soil, Water, and Forage Analytical Laboratory (SWFAL), Oklahoma State University, and oven-dried at 70 °C for 72 h. After weight stabilization, dry samples were weighed to obtain dry biomass weight. 
The whole-plant dried biomass samples were then used to quantify N concentration for winter wheat, using a combustion analyzer (Leco CN628, LECO Corporation, St. Joseph, Michigan, US) at the SWFAL lab. During the harvest stage, using a Kincaid 8XP plot combine (Kincaid Equipment Manufacturing, Haven, Kansas, US), the middle two rows of each plot were harvested, total yield weight was determined for each plot, and grain samples were collected from each plot (i.e., 20 samples in total). These grain samples were bagged and transferred to the SWFAL lab for grain N content analysis using the combustion analyzer.

2.2.2 UAS Data Collection for Spectral Sampling

Four flights were conducted on November 23, 2019, March 29, 2020, May 09, 2020, and June 11, 2020 during clear-sky conditions to collect RGB and multispectral data. All data collections started at 10:00 am (15:00 GMT) and ended before 12:00 pm local time (17:00 GMT). The RGB imagery was acquired with a DJI Phantom 4 quadcopter UAS platform (DJI, Shenzhen, China). Final RGB images had 2 cm spatial resolution (i.e., pixel size). Multispectral imagery was acquired with a MicaSense RedEdge-MX multispectral sensor (MicaSense, Seattle, Washington, US) mounted on a DJI Matrice 600 Pro hexacopter (see Fig. S2 in Supplementary material). The MicaSense RedEdge-MX has five 1.2 megapixel global-shutter single band cameras, imaging at 1280 × 960 pixel resolution in the blue (center wavelength 475 nm), green (center wavelength 560 nm), red (center wavelength 668 nm), red-edge (center wavelength 717 nm), and near-infrared (NIR; center wavelength 840 nm) spectral bands. Final multispectral images had spatial resolution of 3 cm. All flight missions were planned using Mission Planner at a flight altitude of 30 m above ground level with an overlap of 90% and sidelap of 80% to ensure sufficient overlap for 3D model construction and mosaicking.

2.3 Image Pre-processing

Image pre-processing steps included orthomosaic and digital surface model (DSM) generation for RGB and multispectral imagery and radiometric correction only for multispectral imagery. The process of generating orthomosaics included camera alignment, importing ground control points (GCPs) and geometric correction, building dense point clouds, building digital elevation models (DEMs), and generating orthomosaics in Agisoft Metashape software package, version 1.5.3.8469 (Agisoft LLC., St. Petersburg, Russia). To improve the positional accuracy of UAS data, nine permanent GCPs were established across the experimental field and their coordinates were recorded using a Trimble R4 RTK (Trimble Inc., Sunnyvale, California, US) global positioning system (GPS) receiver. Specifically, we used two of our nine GCPs as check points (meaning that we did not use them for georeferencing). The root mean square errors (RMSEs) of our GCPs for each data collection campaign are reported in Table S1 in the Supplementary material. For radiometric correction, a calibration tarp was placed in the field before each flight mission. The calibration tarp consisted of six panels with different colors including black, blue, green, gray, red, and white. Using an ASD FieldSpec 3 Standard spectroradiometer (ASD Inc., Boulder, Colorado, US), spectral reflectance of these panels was taken during the UAS flights. Then, these data were used to derive surface reflectance using the empirical line correction method (Conel et al. 1987).
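The empirical line step described above can be illustrated with a minimal sketch. It assumes a single band and a simple least-squares line relating the raw image values over the calibration panels to the panel reflectance measured with the spectroradiometer; the function names and all numeric values are hypothetical and are not taken from this study.

```python
def fit_empirical_line(panel_dn, panel_reflectance):
    """Least-squares fit of reflectance = gain * DN + offset for one band."""
    n = len(panel_dn)
    mean_dn = sum(panel_dn) / n
    mean_ref = sum(panel_reflectance) / n
    sxx = sum((d - mean_dn) ** 2 for d in panel_dn)
    sxy = sum((d - mean_dn) * (r - mean_ref)
              for d, r in zip(panel_dn, panel_reflectance))
    gain = sxy / sxx
    offset = mean_ref - gain * mean_dn
    return gain, offset


def apply_empirical_line(dn_values, gain, offset):
    """Convert raw image values of the same band to surface reflectance."""
    return [gain * d + offset for d in dn_values]


# Hypothetical raw values over six tarp panels and their field-measured reflectance.
panel_dn = [1000.0, 3000.0, 5000.0, 7000.0, 9000.0, 11000.0]
panel_reflectance = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55]

gain, offset = fit_empirical_line(panel_dn, panel_reflectance)
pixels = apply_empirical_line([2000.0, 6000.0], gain, offset)
print(gain, offset, pixels)
```

In practice one line would be fitted per band and per flight, since illumination differs between dates.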

2.4 Traits Estimation from Imagery

2.4.1 Estimating Winter Wheat Traits and Yield Using VIs

A set of VIs was extracted from the UAS multispectral orthomosaics, including the normalized difference vegetation index (NDVI), normalized difference red-edge index (NDRE), red-edge chlorophyll index (CIre), enhanced vegetation index (EVI), and excess green index (ExG) (Table S2). NDVI is a widely used index for estimating plant biochemical and structural traits. However, NDVI tends to saturate at high biomass (Mutanga and Skidmore 2004). NDRE is a red-edge-based index that has been commonly used as an indicator of plant N content (Fitzgerald et al. 2006; Li et al. 2018) and has been shown to perform better than NDVI for estimating plant biochemical traits (Fitzgerald et al. 2006). CIre is another red-edge-based index that has been commonly used for estimating plant chlorophyll and N content (Li et al. 2018). EVI is an adjusted version of NDVI and is considered less sensitive to soil background and atmospheric aerosol scattering than NDVI (Huete et al. 2002). ExG uses bands in the visible portion of the spectrum and has been used as an indicator of plant structural traits (Woebbecke et al. 1995; Li et al. 2018).
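For illustration, the five indices can be computed from band reflectance as sketched below. The formulas are the commonly published forms of each index and are assumed here; the exact definitions used in this study are those in Table S2 and may differ (for example, in whether ExG uses normalized chromatic coordinates). The reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def ndre(nir, red_edge):
    """Normalized difference red-edge index."""
    return (nir - red_edge) / (nir + red_edge)


def ci_red_edge(nir, red_edge):
    """Red-edge chlorophyll index (CIre)."""
    return nir / red_edge - 1.0


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def exg(red, green, blue):
    """Excess green index on normalized chromatic coordinates (assumed form)."""
    total = red + green + blue
    r, g, b = red / total, green / total, blue / total
    return 2.0 * g - r - b


# Hypothetical plot-mean surface reflectance for one plot.
blue, green, red, red_edge, nir = 0.04, 0.08, 0.05, 0.20, 0.45

indices = {
    "NDVI": ndvi(nir, red),
    "NDRE": ndre(nir, red_edge),
    "CIre": ci_red_edge(nir, red_edge),
    "EVI": evi(nir, red, blue),
    "ExG": exg(red, green, blue),
}
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```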

2.4.2 Estimating Winter Wheat Traits and Yield Using MLAs

In addition to VIs, the relationship between UAS-derived reflectance and winter wheat LAI, plant height, fresh biomass, dry biomass, N content, and yield was established using linear regression and four MLAs, including SVR, bagging ensemble (Breiman 1996), Gaussian process regression (GPR) (Rasmussen 2003), and artificial neural network (ANN) (Huang 2003; Haykin and Network 2004). For each winter wheat trait and yield, a separate model was developed in MATLAB 2020b (MathWorks Inc., Natick, Massachusetts, US). These approaches were selected because they cover different paradigms of MLAs, including a kernel-based algorithm (i.e., SVR), ensemble learning (i.e., bagging ensemble), a Bayesian algorithm (i.e., GPR), and ANN. The parameters used in our MLAs are shown in Table S3 in Supplementary material.

2.4.3 Estimating Winter Wheat Traits and Yield Using SfM

We applied SfM to RGB imagery to generate DSM layers for our four flight campaigns. The first flight was conducted before crop emergence and therefore served as the reference ground elevation model (i.e., zero plant height). We calculated the difference between bare-ground height obtained from the first flight and top of canopy height calculated from the other three flights to estimate plant height during different growing stages, including tillering, booting, and harvest. We then used the estimated plant height through SfM and tested its association with measured winter wheat plant height, fresh biomass, dry biomass, LAI, and yield.
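The height calculation described above amounts to differencing two co-registered surface models. A minimal sketch follows; the small grids stand in for DSM rasters clipped to one plot, and both the clipping of negative differences to zero and the use of the plot mean as the summary statistic are assumptions made for illustration, not details reported here.

```python
def canopy_height(canopy_dsm, ground_dsm):
    """Cell-by-cell difference between a canopy DSM and the bare-ground DSM.

    Negative differences (noise in either model) are clipped to zero,
    which is an assumption of this sketch.
    """
    return [
        [max(0.0, top - ground) for top, ground in zip(row_top, row_ground)]
        for row_top, row_ground in zip(canopy_dsm, ground_dsm)
    ]


def plot_mean(height_grid):
    """Mean height over all cells that fall inside one plot."""
    cells = [h for row in height_grid for h in row]
    return sum(cells) / len(cells)


# Hypothetical elevations (m) for a 2 x 2 window inside one plot.
ground = [[270.00, 270.02], [270.01, 270.03]]   # first flight, before emergence
booting = [[270.40, 270.47], [270.43, 270.51]]  # later flight, top of canopy

heights_m = canopy_height(booting, ground)
mean_height_cm = 100.0 * plot_mean(heights_m)
print(f"SfM-derived plant height: {mean_height_cm:.2f} cm")
```

Because the result is a difference of two models, any vertical misregistration between flights propagates directly into the height estimate, which is one reason accurate GCPs matter for this approach.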

2.4.4 Assessing Model Performance

To assess the performance of UAS-derived VIs and SfM, the relationships between winter wheat traits and UAS-derived metrics (i.e., VIs and SfM) were evaluated for the entire dataset containing 60 samples and also for each growing stage individually in terms of correlation coefficient (r) and p-values. Relationships with p-value < 0.05 were considered significant.
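As a sketch of this evaluation, the snippet below computes a correlation coefficient and a two-sided p-value for one metric and one trait. It assumes that r is the Pearson coefficient and that significance is judged with the usual t-test on r with n − 2 degrees of freedom; the paper does not state these details, and the data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_density(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for zero correlation, using t = r * sqrt((n - 2) / (1 - r^2))."""
    if abs(r) >= 1.0:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the density integrated from 0 to t.
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)


# Hypothetical plot-level NDVI and measured plant height (cm) for six plots.
ndvi_values = [0.42, 0.55, 0.61, 0.48, 0.70, 0.66]
height_cm = [18.0, 24.5, 27.0, 21.0, 31.5, 28.0]

r = pearson_r(ndvi_values, height_cm)
p = two_sided_p(r, len(ndvi_values))
print(f"r = {r:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```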

To assess the performance of MLAs, 60% of all data points collected throughout the growing season were randomly selected for developing (or training) MLAs. Specifically, spectral reflectance from UAS data were used as independent variables and winter wheat N content, structural traits, and yield were used as the dependent variables. The remaining 40% of the data were used for model validation using correlation coefficient and RMSE. This training-validation process was repeated 100 times, and each time a new and random subset of data was used for model development and performance assessment.
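The repeated random 60/40 split can be sketched as follows. For brevity the model is a one-predictor least-squares line, whereas the models in this study used several spectral bands (and, in some runs, SfM-derived plant height) as predictors; the data are synthetic and the function names are hypothetical.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def repeated_holdout(x, y, train_fraction=0.6, repeats=100, seed=0):
    """Refit on a fresh random training subset each time; score on the rest."""
    rng = random.Random(seed)
    n = len(x)
    n_train = round(train_fraction * n)
    r_values, rmse_values = [], []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        observed = [y[i] for i in test]
        predicted = [slope * x[i] + intercept for i in test]
        r_values.append(pearson_r(observed, predicted))
        rmse_values.append(rmse(observed, predicted))
    return r_values, rmse_values


# Synthetic example: 60 samples of one predictor and a noisy linear response.
data_rng = random.Random(42)
x = [data_rng.uniform(0.2, 0.9) for _ in range(60)]
y = [40.0 * v + data_rng.gauss(0.0, 3.0) for v in x]

r_values, rmse_values = repeated_holdout(x, y)
print(f"r = {statistics.mean(r_values):.2f} +/- {statistics.stdev(r_values):.2f}")
print(f"RMSE = {statistics.mean(rmse_values):.2f}")
```

Reporting the mean and standard deviation over the repeats gives a sense of how sensitive each model is to the particular split, which matters with only 60 samples.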

3 Results

3.1 Spectral Characteristics of Winter Wheat over the Growing Season

Spectral characteristics of winter wheat changed significantly over the growing season in our experiment. The blue band had a significantly higher reflectance during the harvest stage than during the tillering and booting stages under all soil N treatments at α = 0.05 (Fig. 2). The reflectance of the green band was significantly higher during the harvest and tillering stages than during the booting stage. Additionally, the red band had a significantly higher reflectance in the harvest stage than in the other growing stages. The results also showed a general decreasing pattern in red-edge and NIR reflectance as the growing season progressed. The average spectra among the five N fertilizer rates over the growing season are also shown in Fig. S3 in Supplementary material.

Fig. 2 Comparison of UAS spectral reflectance over the growing season. The test was applied to 20 mean spectral reflectance measurements for each UAS band from 20 plots over each growing stage. Band acronyms: B blue, G green, R red, RE red-edge, NIR near-infrared

3.2 Estimating Winter Wheat N content, Structural Traits, and Yield Using VIs and SfM

N content estimation: We assessed the capability of all VIs and SfM-derived metrics to estimate N content, structural traits, and yield. The relationships between measured and estimated traits were determined both for the entire dataset containing 60 samples and for each growing stage individually (Table 1; Fig. S4a–l). The results showed that red-edge-based VIs, including NDRE and CIre, had significant relationships with N content only at the tillering stage (Table 1; Fig. S4a, b). Moreover, NDRE and ExG were the only UAS-derived metrics that showed significant relationships with grain N content, at the tillering and booting stages, respectively (Table 1; Fig. S4c, d).

Table 1 Optimal UAS-derived metrics for estimating winter wheat traits and yield during the individual growing stages and for the entire datasets

Plant height estimation: NDVI was found to be the best VI for estimating plant height at tillering and booting stages (Table 1; Fig. S4e). NDRE and CIre also showed significant and similar performances in estimating plant height at tillering and booting stages. At harvest, no significant correlations were found between UAS-derived metrics and plant height.


Biomass estimation: The results for fresh biomass showed that NDVI had the best predictive performance at both tillering and booting stages (Table 1; Fig. S4g). Furthermore, NDRE and CIre showed significant relationships with fresh biomass only at the booting stage. UAS-derived metrics did not show a significant relationship with fresh biomass during the harvest stage. For dry biomass, no UAS-derived metric provided significant estimations at any individual growing stage.


LAI estimation: LAI was best estimated by NDVI at the tillering stage (Table 1; Fig. S4i). At booting, NDVI, NDRE, CIre, and SfM-derived plant height showed significant and similar correlations with LAI. At harvest, SfM-derived plant height was the only metric that showed a significant correlation with LAI (Table 1; Fig. S4j). Overall, NDVI and SfM-derived plant height had the best performances at estimating LAI (Table 1; Fig. S4i, j).


Yield estimation: UAS-derived metrics were also capable of predicting yield over the growing season, albeit with varying predictive power. NDVI showed the highest correlation with yield at the tillering stage (Table 1; Fig. S4k). At booting, NDVI and ExG best predicted yield, with similar performances. Moreover, NDRE and CIre showed significant correlations with yield at both tillering and booting stages. At harvest, no significant relationships between VIs and yield were observed. This was expected due to senescence of winter wheat at harvest (i.e., very low greenness). Overall, NDVI and NDRE seemed to perform better than other metrics at estimating yield (Table 1; Fig. S4k–l).

After we explored the capability of UAS-derived VIs and SfM in estimating winter wheat traits during each growing stage, we also determined the best UAS-derived metric for estimating each trait using the whole dataset by combining data points from different dates. Results from UAS-derived metrics for the entire dataset showed that CIre was strongly associated with winter wheat N content (Table 1; Fig. S4b); SfM-derived plant height consistently had the strongest relationship with measured plant height (Table 1; Fig. S4f), fresh biomass (Table 1; Fig. S4h), and LAI (Table 1; Fig. S4j) for the entire datasets. NDRE was also the strongest predictor for yield (Table 1; Fig. S4l). However, UAS-derived metrics showed no significant correlations with grain N content and dry biomass when data from all dates were pooled together. Pooling the data points also created a larger dataset that was suitable for training and validating MLAs (see Sect. 3.3 below).

3.3 Estimating Winter Wheat N content, Structural Traits, and Yield Using MLAs

We used linear regression with all UAS spectral bands and four MLAs (SVR, bagging ensemble, GPR, and ANN) to estimate winter wheat traits and yield from UAS-derived data. The results for the optimal UAS-derived metrics, linear regression using all spectral bands, and MLAs for the validation datasets are shown in Fig. 3 and in the Supplementary material.

Fig. 3 Correlation coefficients of estimated winter wheat traits and yield using validation dataset. a N content (%), b plant height (cm), c fresh biomass (gr), d dry biomass (gr), e LAI (m2/m2), and f yield (kg/ha). Vertical error bars show ± 1 standard deviation from 100 permutations. Due to lodging at harvest stage in our site, SfM-derived plant heights at harvest were excluded from linear regression and MLAs; see Sect. 4.3 for further discussion on the issue of lodging at harvest


N content estimation: For winter wheat N content, when only the spectral bands were used, linear regression showed the strongest validation performance across the 100 runs, and GPR and bagging ensemble showed the second-best predictive performances (Fig. 3a). The ANN model had the weakest agreement with N content based on validation data. Our results showed that, in general, using both SfM-derived plant height and spectral bands in linear regression and MLAs improved the retrieval accuracy of N content (Fig. 3a). No models produced significant relationships with grain N content.


Plant height estimation: For plant height, similar to N content, when we used only spectral bands, linear regression produced the best estimations based on validation data (Fig. 3b), followed by GPR and bagging ensemble. ANN showed the weakest performance in predicting plant height. In general, the performance of linear regression and MLAs at estimating plant height improved once SfM-derived plant height was added as an input variable to our models (Fig. 3b).


Biomass estimation: Bagging ensemble using only spectral bands had the best performance at estimating fresh biomass (Fig. 3c). Linear regression using all bands and GPR showed very similar performances in estimating fresh biomass (Fig. 3c). For dry biomass, bagging ensemble using only spectral bands outperformed the other MLAs, and linear regression and GPR showed comparable performances (Fig. 3d). In general, we observed slight improvements in estimating fresh and dry biomass once we added SfM-derived plant height to our linear regression and MLAs (Fig. 3c, d).


LAI estimation: For LAI, the GPR and SVR models provided the best predictive performances based on validation data when we used only spectral bands (Fig. 3e). Additionally, SfM-derived plant height rendered results that were close to those of linear regression using only spectral bands and outperformed bagging ensemble (using only spectral bands) and ANN in estimating LAI. Our results showed that adding SfM-derived plant height improved the performance of linear regression and MLAs at estimating LAI (Fig. 3e).


Yield estimation: GPR and linear regression provided the best results for the winter wheat yield validation dataset (Fig. 3f). Validation results showed a marginally superior performance of GPR over linear regression using all bands in predicting yield when we used spectral bands only. UAS-derived NDRE showed a comparable performance to SVR and ANN in predicting yield (Fig. 3f). Additionally, using both SfM-derived plant height and spectral bands improved the performance of linear regression, bagging ensemble, and GPR in predicting yield (Fig. 3f).

3.4 Mapping Winter Wheat N content, Structural Traits, and Yield

The optimal UAS-derived metrics for estimating winter wheat traits and yield during the tillering, booting, and harvest stages and for the entire datasets are shown in Table 1. We mapped winter wheat traits and yield at the tillering stage by applying the optimal prediction models identified for the entire datasets (Fig. 4). We applied the optimal models to the UAS imagery taken during the tillering stage since it was considered the peak of the growing season. These optimal models included GPR using SfM-derived plant height and spectral bands for estimating N content (Fig. 4a), plant height (Fig. 4b), fresh biomass (Fig. 4c), and LAI (Fig. 4e), bagging ensemble using SfM-derived plant height and spectral bands for dry biomass (Fig. 4d), and linear regression using SfM-derived plant height and spectral bands for estimating yield (Fig. 4f). These maps can be used to assess the spatial pattern of different traits across plots. In our case, the maps of structural traits (i.e., plant height, dry and fresh biomass, and LAI; Fig. 4b–e) and yield (Fig. 4f) showed somewhat similar patterns.

Fig. 4 Winter wheat traits and yield maps estimated by applying the optimal prediction models to the UAS multispectral imagery taken at the tillering stage. a N content map estimated by applying GPR algorithm using SfM-derived plant height and spectral bands, b plant height map by applying GPR algorithm using SfM-derived plant height and spectral bands, c fresh biomass map estimated by applying GPR algorithm using SfM-derived plant height and spectral bands, d dry biomass map estimated by applying bagging ensemble algorithm using SfM-derived plant height and spectral bands, e LAI map estimated by applying GPR algorithm using SfM-derived plant height and spectral bands, and f yield map estimated by applying linear regression using SfM-derived plant height and spectral bands. Trait acronyms: N nitrogen, LAI leaf area index

Additionally, we assessed the difference between estimated traits under different soil N treatments (Fig. S6). Specifically, we calculated the average estimated trait values for each plot (n = 20) from Fig. 4 and tested whether the differences in estimated traits and yield under different soil N fertilizer treatments were significant. Our results showed that estimated winter wheat N content (Fig. S6a), plant height (Fig. S6b), and yield (Fig. S6f) did not change significantly as a result of N fertilizer rates. However, estimated fresh (Fig. S6c) and dry biomass (Fig. S6d) were significantly lower under 50 kg/ha N treatment. Moreover, estimated LAI (Fig. S6e) values were significantly lower under unfertilized N treatment.

4 Discussion

4.1 Selecting Appropriate Remote Sensing Metrics for Estimating Winter Wheat Traits During the Growing Season

4.1.1 Optimal VIs for Estimating Winter Wheat Traits

Our results showed that UAS data can successfully estimate winter wheat traits and yield over the course of the growing season. Additionally, our results suggested that selecting the appropriate remote sensing metrics plays an important role in quantifying winter wheat traits during the growing season. Specifically, we identified red-edge-based indices, including NDRE and CIre, as the best UAS-based VIs for estimating winter wheat N content during the tillering stage and also for all temporal datasets combined. Red-edge-based indices use the red-edge region of the electromagnetic spectrum, which is a transition zone from very low reflectance in the red region due to strong pigment absorption to high reflectance in the NIR due to plant structure. The red-edge region includes important wavelength features for estimating N content, although N does not have known absorption features within this region (Jiang et al. 2020; Wang et al. 2020).

Furthermore, our results showed that, similar to N content estimation, red-edge-based indices derived from UAS can improve the prediction accuracy of yield in winter wheat. We presume that the superior performance of red-edge-based indices in estimating yield was due to the association between yield and N content in winter wheat; yield rates increase with higher concentration of N-containing photosynthetic pigments (Li et al. 2019).

For structural traits, including LAI, plant height, and fresh biomass, as well as yield, our results identified NDVI as a reliable metric during early and middle growing stages. Previous work has suggested that NDVI tends to saturate over dense vegetation (Mutanga and Skidmore 2004). Nevertheless, our results showed the superior performance of NDVI in estimating LAI, plant height, fresh biomass, and yield even during the peak of the winter wheat growing season, likely because the vegetation cover in our study area was relatively low, based on the average LAI of 1.01 ± 0.12 m²/m².

In addition to NDVI, our results highlighted the potential of SfM-derived plant height in estimating winter wheat structural traits. We presume that the superior performance of SfM-derived plant height in estimating winter wheat structural traits was due to the correlation between winter wheat plant height, LAI, and fresh biomass during the harvest stage. Plant height, LAI, and fresh biomass decrease gradually during the harvest stage as winter wheat leaves lose their moisture and shrink and kernels droop toward the ground (Large 1954). These findings are consistent with previous research reporting that, in contrast to UAS-derived VIs, SfM-derived plant height did not drop drastically during the harvest stage and performed better in estimating winter wheat structural changes, including LAI, plant height, and fresh biomass (Maimaitijiang et al. 2017; Hu et al. 2018; Gong et al. 2021).

We note that, except for LAI, our remotely sensed data were not able to estimate winter wheat traits and yield during the harvest stage. We assume that the insignificant relationship between UAS-derived VIs and winter wheat traits and yield was mainly due to senescence of winter wheat at harvest when photosynthetic activity of winter wheat significantly decreases (Cai et al. 2019).

4.1.2 Optimal MLAs for Estimating Winter Wheat Traits

MLAs, in most cases, slightly improved the capability of UAS-derived data in estimating winter wheat traits and yield. This improvement in estimation accuracy is in part due to combining visible, red-edge, and near-infrared bands, which altogether provide the necessary information for estimating structural and biochemical properties of winter wheat. Nevertheless, our results confirmed that the choice of appropriate MLA is essential when using UAS data for estimating different winter wheat phenotypic traits.

Specifically, GPR was the best-performing MLA for estimating N content and structural traits, including plant height, fresh biomass, and LAI. A possible explanation for the superior performance of GPR is that it uses a Bayesian approach that can perform adaptive, nonlinear data fitting for complex datasets, such as when data from different dates are combined (Verrelst et al. 2011; Van Wittenberghe et al. 2014; Song et al. 2017).
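To make the idea of nonlinear Bayesian fitting concrete, the sketch below implements the posterior mean of a one-dimensional Gaussian process with a squared-exponential kernel. It is a simplified illustration, not the configuration used in this study: the single predictor, the kernel, the fixed hyperparameters (`noise`, `length_scale`), the helper names, and the data are all assumptions.

```python
import math


def rbf(x1, x2, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gpr_fit(xs, ys, noise=1e-2, length_scale=1.0):
    """Return the weights alpha = (K + noise * I)^-1 (y - mean) and the mean."""
    k = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    y_mean = sum(ys) / len(ys)
    alpha = solve(k, [y - y_mean for y in ys])
    return alpha, y_mean


def gpr_predict(x_new, xs, alpha, y_mean, length_scale=1.0):
    """Posterior mean prediction at a new input."""
    return y_mean + sum(a * rbf(x_new, x, length_scale)
                        for a, x in zip(alpha, xs))


# Hypothetical plot data: SfM-derived plant height (m) vs. fresh biomass (kg/m2).
heights = [0.10, 0.25, 0.40, 0.55, 0.70]
biomass = [0.20, 0.55, 1.10, 1.40, 1.50]
alpha, y_mean = gpr_fit(heights, biomass, noise=1e-2, length_scale=0.2)
print(round(gpr_predict(0.50, heights, alpha, y_mean, length_scale=0.2), 3))
```

In practice, the kernel hyperparameters would be optimized from the data and several predictors (spectral bands and plant height) would be used together.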

For fresh and dry biomass, bagging ensemble regression had the strongest agreement with validation datasets when we used only spectral bands. Bagging ensemble has been reported as a useful proxy in estimating biomass of forest trees, wetland species, and agricultural crops (Mutanga et al. 2012; Wang et al. 2014; Dube and Mutanga 2015), possibly because it reduces the variance in prediction relative to the observations (i.e., a given model performs well on both training and test datasets) and prevents model overfitting (Breiman 1996; Liaw and Wiener 2002).
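The variance-reduction mechanism of bagging can be sketched as follows: many base models are fitted to bootstrap resamples of the training data and their predictions are averaged. For brevity, this sketch uses a simple linear base learner rather than the regression trees commonly used in bagging ensembles; the function names, the predictor, and the data are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate bootstrap sample (one repeated point): constant model.
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def bagging_fit(xs, ys, n_models=50, seed=0):
    """Fit one base model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all base models."""
    return mean(slope * x + intercept for slope, intercept in models)


# Hypothetical plot data: NIR reflectance vs. dry biomass (kg/m2).
nir = [0.30, 0.34, 0.38, 0.41, 0.45, 0.48, 0.52, 0.55]
dry_biomass = [0.21, 0.26, 0.27, 0.33, 0.36, 0.37, 0.44, 0.45]
models = bagging_fit(nir, dry_biomass, n_models=100, seed=42)
print(round(bagging_predict(models, 0.40), 3))
```

Averaging over resamples dampens the influence of any single plot on the final prediction, which is the property cited above as protecting against overfitting.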

Our results also showed that ANN had the highest error rates for winter wheat traits and yield. Similar issues with ANN’s poor prediction performance have been reported in previous studies on remote detection of plant traits (Verrelst et al. 2011; Gholizadeh et al. 2015). This weak performance is due to the sensitivity of ANN to several network architecture parameters, including the number of hidden layers and the number of nodes in each layer, which can largely affect the estimation capability of ANN.

Our findings provided evidence that while MLAs applied to UAS-derived data slightly improved estimation accuracies in most cases, simple and computationally efficient VIs consistently had high correlations with winter wheat traits and yield during the growing season. In addition, SfM-derived plant height showed great potential in estimating winter wheat traits. Specifically, the addition of SfM-derived plant height to our models, in general, improved the performance of our MLAs for predicting N content and structural traits, such as plant height, fresh biomass, and LAI. Using MLAs to estimate crop traits requires continuous acquisition of a large amount of in-situ reference data that exemplify variations in crop phenotypic traits over varying crop growth conditions. Thus, using MLAs for individual growing stages may not be feasible in all cases, since collecting a large amount of in-situ data for training and validating MLAs is very expensive and time-intensive. Moreover, the relationships that MLAs learn between remote sensing data and crop traits vary across species and over time, so they may not be transferable to other crop types or dates (Weiss et al. 2020). Thus, we argue that, for specific crop traits, simple and computationally efficient metrics including UAS-derived plant height, CIre, and NDRE can be used reliably.
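The kind of simple VI-to-trait relationship referred to here can be quantified with a Pearson correlation coefficient, as in the minimal sketch below. The function name and the NDRE and N content values are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical plot-level NDRE values and measured N content (%).
ndre_values = [0.21, 0.25, 0.28, 0.31, 0.35, 0.38]
n_content = [2.1, 2.4, 2.5, 2.9, 3.0, 3.3]
r = pearson_r(ndre_values, n_content)
print("r =", round(r, 3), "R2 =", round(r * r, 3))
```

Unlike an MLA, this calculation needs no hyperparameter tuning and only a modest number of reference plots, which is the practical advantage argued for above.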

4.2 The Role of UAS-Based Remote Sensing for Developing Successful Breeding Programs

We identified three requirements for a successful integration of remote sensing and breeding sciences. First, this integration requires remote sensing methods to be applied and tested under a variety of environmental conditions. Breeding programs continuously monitor crop phenotypic traits over large breeding trials and across a range of varieties, developmental stages, and soil fertilizers. However, there are technological constraints to effectively monitoring crop traits. Traditional field-based phenotyping approaches are usually conducted at a limited number of locations that by design cannot represent all environmental gradients. As a result, breeding efforts have adopted remote sensing approaches to retrieve information about crop phenotypic traits over large field trials and under a variety of environmental settings. Previous studies have shown evidence that using optical remote sensing outperforms traditional field-based phenotyping methods (Bhandari et al. 2020; Guo et al. 2021).

Second, to effectively monitor crop traits remotely, remote sensing data need to be collected with appropriate spatial resolution. However, the application of optical remote sensing for crop phenotyping remains challenging due to the spatial resolution of remote sensing platforms. For example, the spatial resolution of satellite and airborne platforms is often coarser than the size of individual crops. The coarse spatial resolution of satellite and airborne remote sensing data, therefore, increases the probability of having mixed pixels within the research plots.
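The effect of mixed pixels can be illustrated with a simple linear mixing sketch, in which the reflectance of a pixel is the area-weighted average of the crop and soil reflectances it contains. The linear mixing assumption, the function names, and the reflectance values are ours for illustration and were not tested in this study.

```python
def mixed_reflectance(fraction_crop, crop, soil):
    """Area-weighted (linear) mixture of crop and soil reflectance per band."""
    return {band: fraction_crop * crop[band] + (1.0 - fraction_crop) * soil[band]
            for band in crop}


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


# Hypothetical pure-cover reflectances.
crop = {"red": 0.05, "nir": 0.45}
soil = {"red": 0.20, "nir": 0.30}

for fraction in (1.0, 0.75, 0.5, 0.25):
    pixel = mixed_reflectance(fraction, crop, soil)
    print(f"crop fraction {fraction:.2f}: NDVI = "
          f"{ndvi(pixel['nir'], pixel['red']):.2f}")
```

As the crop fraction within a pixel decreases, the index drifts toward the soil value, so a plot-level estimate based on coarse pixels reflects both the crop and its background.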

Third, in addition to spatial resolution of remote sensing data, paying specific attention to temporal resolution is required when using remote sensing-based crop monitoring approaches. Our experiment and previous research have provided evidence that phenology (i.e., temporal variation in plant traits as a result of plant life cycle events) significantly affects remote sensing reflectance signatures (Anderegg 2015; Wang et al. 2020). Therefore, there is a critical need within breeding programs for developing multi-temporal experiments to monitor crop traits over the growing season. However, collecting multi-temporal data with traditional field-based methods and using airborne campaigns is not feasible due to the associated costs.

Our findings showed that UAS-based remote sensing can address these requirements by developing an operational monitoring system which is characterized by flexibility, low cost of operation, and fine spatial and temporal resolution. Our experiment, and other UAS-based experiments, can potentially aid winter wheat phenotyping programs by providing efficient proxies for estimating plant traits and yield during the growing season.

4.3 Limitations and Future Research

In our experiment, we collected data at four points in time, including the emergence, tillering, booting, and harvest growing stages of winter wheat. However, winter wheat developmental stages go beyond these four stages and include other developmental stages, such as stem elongation, dough development, and flowering. The significant cost of in-situ data collections and trait analysis limited our ability to repeat our data collection campaigns for more developmental stages. Conducting similar remote sensing experiments with finer temporal resolution would be critical to understand the impact of phenology on crop traits and the ability of remote sensing to track temporal changes of these traits.

In addition to temporal scale, the impact of spatial and spectral resolutions on plant phenotyping (that is, the degree to which our ability to remotely predict specific traits varies with changes in spatial and spectral resolutions) remains an open question. Although we have not tested the impact of these dimensions of scale on remote sensing of phenotypic traits, identifying optimal spatial and spectral scales for capturing these traits can contribute to the development of cheaper crop-specific monitoring sensors.

Also, our experiment was limited to one site and one crop type. The capability of remote sensing to estimate plant traits can vary significantly depending on the crop type. Even for the same crop, depending on biotic and abiotic factors, we might observe different temporal patterns in phenotypic traits. Therefore, expanding the current experiment to other crop types and environmental conditions is a promising avenue for future research.

Finally, at our site, significant lodging occurred during the harvest stage (Fig. 5). Lodging, here defined as the displacement of crop stems from their upright position, is a common phenomenon in cereal crops such as wheat (Crook and Ennos 1993; Li et al. 2022; Pinthus 1974). Lodging is often the outcome of interactions between crop variety, environmental parameters (e.g., rainfall and wind events), management practices (e.g., soil nitrogen treatment), and crop traits such as crop N content and crop height (Berry et al. 2000; Chauhan et al. 2019). For instance, excessive soil nitrogen application can increase the risk of lodging by increasing crop height and self-weight moment (i.e., the force exerted on the plant base by the weight of its upper parts) (Chen et al. 2018; Crook and Ennos 1995; Zhang et al. 2017). At our site, crop lodging occurred within all plots; therefore, we presume that lodging was not driven by soil N treatment and instead attribute it to environmental parameters such as rain and wind. During the harvest stage, winter wheat stems lose their moisture and become weak; as a result, the crop heads become too heavy for the stems to hold up and lodge toward the ground (Gravelle et al. 1988; Large 1954; Loyce et al. 2008). These structural changes during the harvest stage make winter wheat even more susceptible to lodging by wind and rain events (Niu et al. 2016). In our study, the lodged patches were under-represented in our in-situ plant height measurements; therefore, we did not use SfM-derived plant height at harvest in model development (Fig. 3).

Fig. 5

RGB imagery taken at harvest at the Efaw study site, Stillwater, Oklahoma, US. Lodged patches of winter wheat are shown inside red rectangles. Calibration panels used for empirical line atmospheric correction can be seen on the left

5 Conclusions

In this study, we developed a UAS-based remote sensing monitoring system and examined its capability to estimate winter wheat traits and yield over the growing season. Our experiment showed that using MLAs, in most cases, slightly improved estimation accuracies, presumably through combining visible, red-edge, and near-infrared bands that provide critical spectral information for estimating winter wheat traits and yield. However, based on our findings, UAS-derived red-edge-based vegetation indices, including CIre and NDRE, showed comparable performance to some computationally complex MLAs in estimating N content and yield. Additionally, SfM-derived plant height outperformed our best MLA in estimating structural traits, including plant height, fresh biomass, and LAI. Considering the expensive and time-intensive process of collecting the in-situ data required for developing MLAs, we conclude that relatively simple UAS-derived VIs and SfM are good options for developing operational monitoring approaches for winter wheat over the growing season.

The methodology and findings of our study can potentially benefit breeding programs. Crop breeders strive to achieve higher yields through continuous monitoring of crop traits and selection of cultivars that are resistant to stress under varying environmental conditions. These efforts have traditionally relied on ground-based monitoring methods, whose application is constrained by high operational costs. UAS-based remote sensing approaches can be reliably used to monitor some critical phenotypic traits with high spatial and temporal resolution and a relatively low cost of operation.