Introduction

Adoption of best management practices and development of superior food crop cultivars are necessary to cope with pressures imposed by different biotic and abiotic factors (e.g., pests, diseases, drought, water logging, salinity, nutrient deficiencies, extreme temperatures, etc.) and secure food production. General agricultural management practices include nutrient management, pest and disease management, irrigation and drainage (Boman, 2012; Vincent et al., 2021). These practices require regular field monitoring to identify problems and examine crop responses to management. This is often accompanied by specialized laboratory analyses to assess crop physiological responses, which are costly and time-consuming.

Recent advances in plant breeding have accelerated the development of new crop cultivars to cope with a rapidly changing production environment affected by diseases and other plant stresses (Lenaerts et al., 2019). These new cultivars require evaluation on a large-scale in a commercial production setting to assess their horticultural traits, physiological needs and economic potential before being released for widespread commercial adoption. Identification of plant physiological needs through field scouting and hand sampling of tissues for laboratory analysis has been the traditional way to evaluate new cultivars and fine-tune management practices. As these processes are laborious, costly and prone to human error, new phenotyping techniques are needed to advance crop selection and improve crop production (Li et al., 2014). New high-throughput phenotyping techniques are now available that utilize unmanned aerial vehicle (UAVs) (Abdulridha et al., 2020a, b; Costa et al., 2020a) or ground-based remote sensing and artificial intelligence (Burud et al., 2017; Cruz et al., 2019; Reynolds et al., 2019). These techniques have been used to determine various horticultural traits in several crops. For example, the correlation of different models and crop indices based on spectral reflectance data has been studied to assess crop yield in wheat (Hassan et al., 2019; Mirasi et al., 2019), biomass in oat (Coelho et al., 2018), leaf area index in wheat (Xie et al., 2014), plant nutrient content in citrus and grapevine (Osco et al., 2019; Moghimi et al., 2020), detection of pests and diseases in citrus and avocado (Abdulridha et al., 2019a; Partel et al., 2019a, b).

In citrus, remote sensing and AI have also been used to determine crop yield (Vijayakumar et al., 2021; Ye et al., 2007) and canopy volume (Ampatzidis et al., 2019), to count trees (Csillik et al., 2018; Ampatzidis et al., 2020), determine leaf stomatal properties (Costa et al., 2021) and to detect diseases such as huanglongbing (Cerreta et al., 2018; Garza et al., 2020), canker (Abdulridha et al., 2019b) and foot rot (Garza et al., 2020). In this study, citrus was used as a model system for developing novel high-throughput phenotyping techniques utilizing UAV imagery and machine learning to accelerate cultivar selection and improve crop production.

Nutrient management is one of the most important factors in citrus production as it directly influences tree health and productivity (Galvez-Sola et al., 2015; Morgan & Graham, 2019). The plant nutrient status is influenced by the nutrient uptake efficiency of the plant, which is affected by many factors including rootstock cultivar (Uygur & Yetisir, 2009; Toplu et al., 2011; Yilmaz et al., 2018), soil type, season and plant developmental stage (Scagel et al., 2007) as well as soil-borne and other diseases such as huanglongbing (Cao et al., 2015; Morgan & Graham, 2019). To determine the nutrient status of a plant and correct potential deficiencies, regular analysis of leaf nutrient concentrations is essential (Galvez-Sola et al., 2015; Stammer & Mallarino, 2018). Because the nutrient status of a plant is variable, multi-monthly and multi-annual analyses may be necessary to assess plant responses to environmental factors and management practices accurately.

Plant nutrient analysis requires the chemical analysis of leaf samples in a specialized laboratory, which is expensive and uses toxic chemicals with negative impacts on the environment (Galvez-Sola et al., 2015). Moreover, it is prone to human error resulting from inconsistencies and bias during leaf sampling and during the analysis process, which may compromise interpretation and significance of the data. Faster, cheaper and more environmentally friendly alternatives to conventional nutrient analysis methods are being developed at a rapid pace. Qamar-uz-Zaman & Schumann (2006) calculated NDVI using aerial photography and correlated it with citrus leaf macro and micronutrient concentrations. Machine learning regression models were used with close-range spectroscopy devices to scan leaves and determine concentrations of leaf nitrogen and other leaf nutrients in citrus (Galvez-Sola et al., 2015; Osco et al., 2019, 2020). These methods may help overcome some of the limitations of the traditional method of leaf nutrient analysis or complement it. However, some of these methodologies were developed for specific nutrients and are therefore not applicable to assess all nutrients simultaneously. Moreover, some approaches (e.g., Osco et al., 2020), although precise, use expensive ground-based sensors (e.g., spectroradiometer) for data collection, which requires more resources (e.g., driver, sensor operator) and time to scan large orchards than a UAV-based sensing system. Developing a more efficient methodology using novel machine learning algorithms and UAV-based sensing may help improve the overall consistency and data collection speed.

The objectives of this study were to: (i) develop a novel high-throughput method to determine leaf nutrient concentrations in citrus (as a case study), (ii) utilize this method to identify nutrient deficient zones and (iii) create fertility maps compatible with variable rate technologies (e.g., variable rate fertilizers) for zone-based management. The proposed technique and model can be applied to different crops and production systems.

Materials and methods

Study site

Four field trials with 5-year-old grafted sweet orange (Citrus sinensis) trees growing in a commercial citrus production system in Florida, USA (Lykes Bros. Inc.) were used for nutrient analysis and model development. Trees were composed of 2 different scion cultivars and more than 30 different rootstock cultivars. The scion cultivars were ‘Hamlin’ orange, an early maturing cultivar, and ‘Valencia’ orange, a late maturing cultivar. The rootstocks included more than 30 commercial and experimental cultivars with different taxonomic backgrounds (Kunwar et al., 2021; Ampatzidis et al., 2019).

Trials A (10.76 ha) and B (11.25 ha) were located in Southeast Florida near Fort Basinger, Highlands County (27° 22′ 16.0′′ N 81° 08′ 08.0′′ W), and trials C (8.78 ha) and D (two areas with 6.10 ha and 3.43 ha) were located in Central Florida near Lake Wales, Polk County (27° 56′ 07.4′′ N 81° 30′ 00.1′′ W) (Fig. 1). The soil type in trials A and B is a poorly drained sandy Entisol with Spodosol-like properties (Mylavarapu et al., 2016), and trees were planted in double-rows on raised beds separated by furrows at a spacing of 2.4 m along the rows and 7.6 m between the rows. The soil in trials C and D is a well-drained sandy Entisol, and trees were planted in single rows without beds at a spacing of 2.4 m along the rows and 6.7 m between the rows. Trees in all trials were produced in a commercial citrus nursery (Lykes Bros. Inc., Basinger, FL, USA) and were planted in 2015. Trees in trials A and C were composed of ‘Hamlin’ scion on 32 and 35 different rootstock cultivars, and trees in trials B and D were composed of ‘Valencia’ scion on 32 and 35 rootstock cultivars (Table 1). The experimental design was randomized with rootstocks replicated as linear plots of eight trees. Irrigation was by under-tree micro-sprinklers. Nutrient, disease and weed management was per the grower’s standards, as outlined in Kunwar et al., (2021) and was similar for all the trials.

Fig. 1
figure 1

Aerial views of a trial A, b trial B, c trial C, and d trial D-east

Table 1 Details of study sites used for model development

Study workflow

The framework of this study was divided into two main phases (Fig. 2). In the data acquisition phase (Fig. 2, green), the spectral measurements of the canopy reflectance were acquired using a UAV-based multispectral camera (red, green, blue, red edge and near-infrared wavelengths), and leaf samples were collected and analyzed in a laboratory as described in the next paragraph to generate the dataset. The second phase (Fig. 2, blue) consisted of: (i) pre-analysis to evaluate the dataset for each nutrient; (ii) model development using the multispectral bands as inputs and 5-fold cross-validation (80% of samples for training and validation of the model, 20% for testing) to ensure the repeatability of the methodology used; and (iii) with the model generated, the evaluation metrics of mean absolute percentage error (MAPE), root mean square error (RMSE), MAPE-CV ratio and difference plot were applied to evaluate the results.

Fig. 2
figure 2

Workflow of the development of the estimation model for nutrient analysis that includes dataset collection and analysis, and model development and evaluation (Color figure online)

Leaf nutrient analysis

Six replications per scion/rootstock combination in each trial were used for leaf nutrient analysis using traditional procedures. A random sample of 16 mature leaves (4 from each cardinal direction) from the most recent spring flush was collected from the 3rd and the 6th trees from each replicated plot in September 2019. Because there were more than six replications of each scion/rootstock combination in each trial, not all areas at each trial site were equally sampled. Leaf samples from the two trees were pooled to generate one leaf sample for analysis. In total, 804 leaf samples were collected, 192 from each trial at the Polk County location and 210 from each trial at the Highlands County location (402 leaf samples each from ‘Hamlin’ and ‘Valencia’ scion). Leaf nutrient analysis was conducted by a commercial analysis service (Waters Agricultural Laboratories, Georgia, USA). Total leaf nitrogen (N) was determined using the Dumas combustion method slightly modified by Sweeny (1989), which measures all forms of nitrogen (ammonium, nitrate, heterocyclic and protein). For the other nutrients—phosphorus (P), sulfur (S), potassium (K), zinc (Zn), magnesium (Mg), calcium (Ca), iron (Fe), manganese (Mn), copper (Cu) and boron (B) - leaf samples were first digested using nitric acid and hydrogen peroxide solution followed by inductively coupled argon plasma (ICAP) analysis (Havlin & Soltanpour, 1980).

Spectral data acquisition

The canopy reflectance of trees was measured with a 5-band multispectral camera (Micasense Altum, Micasense, USA) [blue (465–485 nm), green (550–570 nm), red (663–673 nm), red edge (712–722 nm) and near-infrared (820–1000 nm)] mounted on a quadcopter UAV (Matrice 210, DJI, Shenzehen, China). For UAV flight planning and mission control, the Pix4DCapture (Pix4D S.A., Prilly, Switzerland) software app was used on an iPad (Apple, Cupertino, CA, USA) connected to the remote controller of the UAV. All flights were conducted around solar noontime with a clear sky to minimize atmospheric interference. The UAV flew at 122 m height. The generated ortho-mosaic map had a resolution of 50 mm/pixel. Figure 3 presents the workflow of data collection to generate the study’s dataset. The dataset was divided into (1) ‘Hamlin’ scion (containing all rootstock cultivars), and (2) ‘Valencia’ scion (containing all rootstock cultivars).

Fig. 3
figure 3

Data collection and dataset generation workflow

5-Fold cross-validation

A 5-fold cross-validation scheme was used to validate the model’s ability to estimate the objectives. It is a popular strategy since it is straightforward to understand and produces a less biased or optimistic estimate of model competence than other approaches, such as a simple train/test split. The general procedure of the 5-fold cross-validation used is presented in Fig. 4.

Fig. 4
figure 4

General procedure for a 5-fold cross-validation

Data analysis

The dataset was analyzed using standard statistical procedures to obtain the mean, standard deviation, skewness and coefficient of variation (CV) values. The skewness evaluates the distribution of the data, where zero corresponds to a symmetric distribution. Negative and positive skewness presents a distribution of values clustered on smaller values (negative) or higher values (positive). The coefficient of variation is presented in percent and shows the variability of the dataset concerning its mean value.

The recommended ranges for each nutrient based on the University of Florida’s (UF) Institute of Food and Agricultural Sciences (IFAS) guidelines for citrus nutrient management (Kadyampakeni & Morgan, 2020) are presented in Table 2. These values were used as guidelines for data interpretation in this study.

Table 2 Guidelines for interpretation of leaf analysis based on 4-6 month-old spring flush from non-fruiting twigs (based on Kadyampakeni & Morgan 2020)

Regression model

The collected spectral reflectance (five bands) from the multispectral camera was used to generate a regression model capable of using these five inputs to estimate the nutrient concentrations of the crop. For this study, multiple regression algorithms such as ElasticNet, Lasso regression, Linear SVM, PLSR, Random Forest and Ridge regression were evaluated on their ability to generate a model in this dataset without overfitting or with acceptable error. The gradient boosting regression model proved to have the best results and therefore is used in this study.

Gradient boosting regression tree

The gradient boosting regression tree is an ensemble model. Ensemble-based models consist of multiple base models (in this case, regression trees) where each base model provides a solution to the problem, whose outputs are combined in some way (typically by weighted or unweighted voting or averaging) to produce the final ensemble model output (Zhang & Haghani, 2015). The success of the ensemble comes from the reduced total error by correcting the mistakes of individual models. Combining individual base models with different errors can reduce the final error of the ensemble model. Trees are one type of base model that is commonly used for ensembles. They are sensitive to small variability in the training data. This unique property makes them good candidates for ensembles (Zhang & Haghani, 2015).

A single tree regression model is partitioned in nodes occupied each by a simple constant model. In these nodes, a decision is made based on the value of the variable being analyzed. For each subsequent node, a new decision is based on an input variable until it reaches the end of the tree and gives the output. Figure 5a presents a schematic plot for a simple regression tree with decision nodes, where X1 and X2 are the variables being analyzed, b1, b2, b3 and b4 are values of the simple constant function, and Ya, Yb, Yc, Yd, Ye are the possible outputs of the model. Figure 5b presents a schematic example of an ensemble model composed of multiple regression trees where B1, B2, B3, B4 and B5 represent each a base model (single regression tree) and the output Y is the weighted (w1, w2, w3, w4, w5) vote of them.

Fig. 5
figure 5

a Single regression tree example; and b ensemble model example

The gradient boosting methodology generates base models sequentially. Estimation accuracy is improved by emphasizing on the training cases that are more difficult to estimate. In the boosting process, examples that are harder to determine appear more often in the training phase. Each new base model is designed to correct the mistakes made by its previous base models.

The purpose of resampling of the training data is to provide the most useful information for each consecutive model. The adjusted distribution for each step of training is based on the error produced by the previous models. Figure 6 presents the steps of the gradient boosting algorithm.

Fig. 6
figure 6

Work flow of the gradient boosting algorithm

Evaluation metrics

Mean absolute percentage error (MAPE), root mean squared error (RMSE), and MAPE-CV ratio

The mean absolute percentage error (MAPE) is the computed average of absolute percentage errors by which outputs of a model differ from actual values of what is being estimated. The MAPE is given by the sum of the absolute difference between both the ground truth (Gt) and the model output (Y) divided by the ground truth, which is then divided by the number of individuals (n). The formula for the MAPE is presented in Eq. 1.

$$\text{M}\text{A}\text{P}\text{E}=1/n \sum \left|\frac{Gt-Y}{Gt}\right|$$
(1)

The root-mean-squared error (RMSE) is the root of the squared difference between the ground truth (Gt) and output value (Y), divided by the number of individuals (n) (Eq. 2).

$$\text{R}\text{M}\text{S}\text{E}=\surd \left(\sum {(Gt-Y)}^{2}/n\right)$$
(2)

When analyzing the performance of different algorithms and models, the quality of the dataset must also be considered. Datasets with high CV values present a greater challenge for regression algorithms, as the possible outputs are spread into a larger range. In contrast, datasets with low CV values present a lesser challenge for regression algorithms. In these cases, a good analysis is the MAPE-CV ratio given in Eq. 3, which is a ratio of error by challenge level, an indicator of performance where smaller values correspond to better performance.

$$MAPE-CV Ratio=MAPE/CV$$
(3)
Measure of agreement

The agreement between measurements refers to the degree of concordance between two sets of measurements of the same individual by two different methodologies. The Pearson correlation coefficient is often inappropriately used to evaluate for agreement since it is an incorrect measure of reproducibility or repeatability (Watson & Petrie, 2010). This invalidates the approach for comparison between other works with different datasets. Statistical methods to test agreement are used to decide whether one technique for measuring a variable can substitute another (Ranganathan et al., 2017).

The difference plot is a display of the pattern and agreement of one variable being measured by two different methodologies (Watson & Petrie, 2010). The diagram plots the difference between a measurement pair on the vertical axis and the mean of the pair on the horizontal axis. To determine the repeatability of the proposed approach, the method assumes a normal distribution of differences, where 95% of them are expected to lie between \(d\pm 1.96\,{\text{s}}\), where d is the mean of observed differences and s is the standard deviation, generating an error in a confidence range (95%).

Interpretation

Given all evaluation metrics, an important step is understanding how to interpret these results. MAPE presents the absolute regression error in percentage, being suited to compare the results when the datasets have different units or magnitudes. RMSE presents the same absolute regression error in the original units used in the dataset, being suited for comparing different regressions in the same dataset. The MAPE-CV ratio is a score of regression performance, being the ratio between efficacy and challenge.

Lastly, the difference plot presents the agreement between two sets of variables. This also generates an error range in a confidence level (95% in this study). Essentially, this is the maximum error expected for each regression with a 95% confidence. The key interpretation here is the difference between this range and the MAPE, where the first is the worst-case scenario for a single estimation, while the latter is the average error expected for a large number of estimations. These metrics are suited for specifically scaled comparisons; the error range should be used as the error of a small-scale estimation (for example, when comparing the estimation of nutrient content between two citrus trees), and the MAPE should be used on a large-scale estimation (for example, when comparing the average nutrient content of two groves, each containing more than 5,000 citrus trees).

Results

Dataset analysis

‘Hamlin’ dataset

Table 3 presents the leaf nutrient data (lab chemical analysis) for the ‘Hamlin’ trials. In this case, N, P and S returned low variability, but the data are still viable as these are the normal ranges for each of the nutrients for these trees. As this low variability could force the model to overfit, to evaluate this possibility, the MAPE of the model created was compared to the CV of the data itself. This was used to verify that the model was aiming to determine the correct values.

The larger values of CV, namely for Fe and Cu, present a dataset with larger variability, which could mean that these nutrients present a more challenging problem for the model to determine the correct value. Based on the nutrient recommendations for citrus (Kadyampakeni & Morgan, 2020), the average leaf nutrient concentrations for the ‘Hamlin’ trees were in the optimal to high range, except for Ca, which was low, and Cu, which was in excess.

Table 3 Mean, minimum and maximum, CV and skewness of the ‘Hamlin’ dataset

‘Valencia’ dataset

Table 4 presents the leaf nutrient data for the ‘Valencia’ trials. Although the mean, minimum and maximum values are similar to the ‘Hamlin’ dataset, the data were more spread (larger CV). This can produce a larger error in the model, which had to be investigated further in the model evaluation. Based on Kadyampakeni and Morgan (2020), the average leaf nutrient concentrations for the ‘Valencia’ trees were in the optimal to high range, except for Cu, which was in excess.

Table 4 Mean, minimum and maximum, CV and skewness of the ‘Valencia’ dataset

Model: ‘Hamlin’ dataset (UAV imagery)

The models were generated using the open-source library scikit-learn for Python. The five-fold cross validation ensures that all data points were used for training and testing, where the average error for all five folds is considered the overall error of the model and algorithm. Table 5 shows the MAPE values for all nutrients for each validation (five-fold), the average MAPE, the average RMSE and MAPE-CV ratio.

Table 5 Mean absolute percentage error (MAPE), root mean squared error (RMSE) and MAPE-CV ratio for the estimation models (five-fold validation)

Nutrients like N, P and S, which had a lower variability in the dataset (lower CV in Table 3), also had lower MAPE values (Table 5), while nutrients with higher variability in the dataset (higher CV in Table 3) had higher MAPE values (Table 5). A lower variability in the dataset will usually generate models with higher precision, as these values are spread in smaller ranges and therefore generate smaller errors. In such cases with different variabilities, the MAPE-CV ratio is useful to compare results, with a smaller ratio indicating a better performance.

Overall, the MAPE shows a good regression for most of the nutrients, with errors under 10%. For Zn, Mn, Fe and Cu, which returned larger errors (Table 5), the model was less accurate, but satisfactory, considering that data acquisition was done by UAV (and not by time-consuming and laborious laboratory analysis). The error distribution is shown in Fig. 7. For N and S, the spread of errors is under 10% of the absolute error. For P, K, Mg, Ca and B, the model resulted in errors of less than 20%. These are encouraging results for a procedure that uses UAV imagery.

Fig. 7
figure 7

Whisker plot of the distribution of errors for each nutrient in the ‘Hamlin’ dataset

The difference plots were used to compare the standard method of measuring nutrient concentrations (lab chemical analysis) with the new proposed UAV- and AI-based method. Figure 8 shows the difference plot for N as an example. It shows that the operating range for N determination by spectral canopy reflectance analysis works on a \(\pm\,0.25\) range of error for the percentage of the nutrient in the plant. This is equivalent to a \(\pm\, 8.39{\%}\) range of error in operation.

Fig. 8
figure 8

Difference plot for nitrogen measurements in the ‘Hamlin’ (UAV) dataset

The difference plot was applied to all nutrients. Table 6 shows the upper and bottom range of operation in their nutrient concentrations and equivalent values in percentage. The table shows that the maximum error for almost half of the nutrients was less than 20%.

Table 6 Upper and bottom range for the difference plot for each nutrient in the ‘Hamlin’ dataset

Model: ‘Valencia’ dataset (UAV imagery)

This model was generated following the same procedures described for the ‘Hamlin’ model. Table 7 shows the MAPE for all nutrients for all five-fold validations, the average MAPE, RMSE and MAPE-CV ratio. It shows a larger error for the estimation as expected from a dataset with more variability, but the average MAPE for most nutrients can be considered precise enough for a UAV-based measurement. The same pattern was found for the distribution of errors, where the larger CV in the dataset analysis shows a higher average MAPE.

Table 7 Mean absolute percentage error (MAPE), root mean squared error (RMSE) and MAPE-C ratio for the estimation models (five-fold validation)

The error distributions are presented in Fig. 9. Although errors are larger compared with the ‘Hamlin’ model, the errors for N, P and K are still less than 20%. The error distribution for Fe and Cu are shown as infinite, possibly due to outliers in the dataset. The MAPE-CV ratio for this dataset is larger, showing a poorer performance.

Fig. 9
figure 9

Whisker plot of the distribution of errors for each nutrient in the ‘Valencia’ (UAV) dataset

The difference plot was again applied to all nutrient data. Table 8 presents the upper and bottom range of operation in their nutrient concentrations and equivalent values in percentage. This model presents a larger error range of operation, expected from a wider spread of errors. When comparing to the other UAV model (‘Hamlin’ dataset), these ranges show that the presence of outliers can be highly influential. This can be minimized by collecting a larger dataset to better represent these outliers.

Table 8 Upper and bottom range for the difference plot for each nutrient in the ‘Valencia’ (UAV) dataset

Discussion

To evaluate the performance of the proposed model, leaf nutrient data were used from two different citrus scion cultivars grafted on more than 30 different rootstock cultivars and growing in two field locations with different soil and environmental conditions. The best overall fit was found for N, for which the model achieved an 8.39% range of operational error (difference plot range) for the ‘Hamlin’ dataset, and a 25.51% error for the ‘Valencia’ dataset. For the other nutrients (P, K, Mg, Ca, S and B), the error ranges were less than 20% for the ‘Hamlin’ dataset, and up to 40% for the ‘Valencia’ dataset. The model was not able to achieve acceptable errors for the micro-nutrients Zn, Mn, Fe and Cu. However, on a large-scale analysis, the MAPE better presents the model’s estimation accuracy (see “Evaluation metrics” section); the model determined N with a 3.82% MAPE and 9.85% MAPE for the ‘Hamlin’ and ‘Valencia’ datasets, respectively.

Assessing leaf nutrient concentrations of citrus trees is important for determining the nutrient status of the plants, which can fluctuate depending on the developmental stage of the plant, environmental conditions and pest/disease pressure. Regular nutrient analysis is important in crop production to identify and correct potential nutrient deficiencies and prevent production losses (Shaw et al., 2002; Kadyampakeni & Morgan, 2020; Bahtiar et al., 2020). Rather than collecting leaves manually and subjecting them to costly chemical analysis procedures, combining UAV imaging (spectral leaf reflectance measurements) and AI is less laborious, safer for the environment and more cost-efficient. This approach not only allows determination of the crop nutrient status but also development of fertility maps for variable rate fertilizer applications for further cost savings and environmental benefits. The aerial maps produced may also identify other problems. These include differences in the topography or micro-environment of the production site as well as disease zones that may affect nutrient availability, uptake or retention, or technical problems that may have arisen from malfunctions of machinery and other equipment associated with crop management.

To demonstrate the development of fertility maps, and the correlation of the canopy reflectance data and the nutrient concentrations obtained by chemical analysis, nitrogen maps were generated for the ‘Hamlin’ trees in trial B (Fig. 10). These maps and zones were based on the UF/IFAS guidelines for nutrient ranges of citrus trees (Table 2). One map (Fig. 10a) was generated using the proposed methodology (AI-based estimation model) based on canopy spectral reflectance values for all 4925 trees that are part of the trial in that location, and the second map (Fig. 10b) was generated using the ground truth data (laboratory chemical analysis) from a subset of 192 tree samples (red dots represent individual tree samples in the study site).

Fig. 10
figure 10

Nitrogen maps based on UF/IFAS guidelines (Table 2), with optimum (green), high (blue), and excess (pink) zones: a map developed by the proposed methodology; and b map developed from ground truth samples (chemical analysis; red dots represent sample points) (Color figure online)

The major advantage of the proposed methodology is that large populations of plants can be assessed quickly and at low cost, while also reducing inaccuracies resulting from sampling a limited subset of plants. This is demonstrated in Fig. 10b. It can be seen that there is a larger discrepancy between maps in the areas where fewer trees were sampled (red dots). For example, only three leaf samples were obtained from the northeast area of this trial site, which contained 202 trees. Similarly, only three samples were collected from the southeast area, which contained 307 trees. This difference in sampling frequency is likely the major reason for the discrepancy of the two maps in these areas of the trial.

In addition to aiding growers in nutrient management and disease identification, the proposed technology could be used in breeding programs to study cultivar performance and assess nutritional requirements and cultivar adaptation to different environmental conditions and diseases. A recent study by Osco et al., (2020) presented a framework based on machine learning algorithms to assess leaf nutrient concentration based on ground-based hyperspectral imaging. The study used different algorithms such as k-nearest neighbor (KNN), lasso regression, ridge regression, support vector machine (SVM), artificial neural network (ANN), decision tree (DT) and random forest (RF). Leaf nutrient determination with the proposed UAV-based methodology presented here yielded results comparable to those obtained by all algorithms used in Osco et al., (2020). For example, in that study, the best regression achieved for N was with a MAPE of 2.39%. In contrast, the methodology proposed in this study achieved a MAPE of 3.82%. Although most nutrient values in the dataset were similar to values measured by Osco et al., (2020), Cu concentrations differed considerably (157.57 ppm in current dataset, and 72.20 ppm in Osco et al., 2020) as did CVs (44.01% in current dataset, and 36.14% in Osco et al., 2020). The reason for the higher Cu concentrations and variance measured in this study is the frequent foliar sprays of this element to combat the bacterial disease citrus canker (Behlau et al., 2010). Achieving 80–90% accuracy of measurement for most of the other nutrients demonstrates the superiority of the proposed UAV- and AI-based methodology compared to the methodology used by Osco et al. (2020). The UAV based approach is ideal for evaluation of large-scale commercial fields, where larger areas can be covered by UAVs in considerably less time than by ground-based sensing systems.

In this study, the difference plots were incorporated into the analysis to evaluate accuracy of the model. Difference plots better represent accuracy of an application by considering most of the errors and eliminating some outliers to generate a range of operational errors. This is useful when comparing different datasets, as demonstrated in this study.

Conclusions

Herein, a methodology for generating and applying regression models for determining nutrient concentrations of citrus trees was developed. Different datasets including two different scion cultivars (‘Valencia’ and ‘Hamlin’) in combination with more than 30 different rootstock cultivars and two different locations were used. The proposed method incorporated a gradient boosting regression tree to determine different macro- and micro-nutrients and was evaluated with a cross validation method for overall errors and by a difference plot. This approach developed a precise estimation model, with less than 15% MAPE for most of the nutrients analyzed. The gradient boosting regression algorithm proved to be suitable for small datasets (less than 400 individuals) with both larger and smaller CV. For this analysis, it surpassed other regression algorithms without overfitting the data. It is a reliable methodology for crop spectral analysis, as larger datasets are difficult to create. Although this model was tested in commercial citrus production systems, it could be adapted to other crop systems. This new technology will allow the generation of prescription maps for variable rate application of fertilizers based on UAV imagery.