Using Hyperspectral Signatures for Predicting Foliar Nitrogen and Calcium Content of Tissue Cultured Little-leaf Mockorange (Philadelphus microphyllus A. Gray) Shoots

Determining foliar mineral status of tissue cultured shoots can be costly and time consuming, yet hyperspectral signatures might be useful for determining mineral contents of these shoots. In this study, hyperspectral signatures were acquired from tissue cultured little-leaf mockorange (Philadelphus microphyllus) shoots to determine the feasibility of using this technology to predict foliar nitrogen and calcium contents. After a spectroradiometer was used to acquire hyperspectral signatures for determining foliar N and Ca contents, the correlations between these contents and the hyperspectral bands, vegetation indices, and hyperspectral features calculated from the spectra were computed. Features with high correlations were selected to develop the models via different regression methods, including linear, random forest (RF), and support vector machine regression. The results showed that non-linear regression models developed through machine learning techniques, including RF and support vector machines, provided satisfactory prediction models with high R² values (%N by RF with R² = 0.72, and %Ca by RF with R² = 0.99) that can estimate the nitrogen and calcium content of little-leaf mockorange shoots grown in vitro. Overall, the RF regression method provided the most accurate and satisfactory models for both foliar N and Ca estimation of little-leaf mockorange shoots grown in tissue culture.


Introduction
Hyperspectral sensing is the measurement of the spectral characteristics of materials using sensing systems with more than 60 spectral bands and spectral resolutions finer than 10 nm. This resolution can produce a continuous portion of the light spectrum that characterizes the chemical composition of an object through its spectral signature (Gomez, 2020). With substantial developments in recording spectral bands of electromagnetic waves, hyperspectral sensors can provide data with a large number of spectral bands owing to their high resolution in the range of 350 to 2500 nm, and the spectral bands are acquired by passive optical sensors. Spectral data can be detected from any surface that reflects, absorbs, or transmits electromagnetic radiation (Hruška et al., 2018).
Hyperspectral imaging makes it possible to perform reflectance or fluorescence spectroscopy on every spatial pixel of a spectral image, thereby discerning characteristics that cannot be seen by the human eye (Robila, 2004; Gomez, 2020). The basic shape of a curve over the spectral range is characteristic of the parent material of the object being analyzed by spectroscopy (Liang, 2004). In the visible (VIS) to near infrared (NIR) spectrum (approximately between 400 and 1100 nm), characteristics of water, soil, or plant canopy give rise to specific curvatures in the reflectance spectrum, which make them recognizable (Liang, 2004; Robila, 2004).
Perhaps the biggest advantage of hyperspectral data over simpler red-green-blue (RGB) imagery and multispectral data is that hyperspectral data capture more detailed information about an object because more spectral bands are recorded. Hyperspectral acquisition devices, including sensor types, acquisition modes, and unmanned aerial vehicle (UAV)-compatible sensors, provide information used for both research and commercial purposes (Adão et al., 2017). Hyperspectral sensors and UAVs have been useful in many areas of study, including material identification, precision agriculture (vegetative coverage, nutrition deficiencies, foliar water content, physiological disorders, etc.), environmental aspects (wetlands, hydrology, etc.), health care (medical diagnoses, food safety, food quality assessment, etc.), and many more applied fields (Adão et al., 2017; Gomez, 2020).
A vegetation index (VI) is an equation that processes spectral data to derive information about plant health. Vegetation indices (VIs) computed from hyperspectral signatures can provide estimates and analyses of several plant characteristics, such as biophysical, physiological, or even biochemical parameters in crops, including leaf chlorophyll content (LCC), leaf water content (LWC), leaf area index (LAI), fractional photosynthetically active radiation (FPAR) absorbed by a canopy, surface roughness, and phenology, which are some of the most important inputs to land surface process models (Liang, 2004; Adão et al., 2017; Morcillo-Pallarés et al., 2019). These VIs can be applied in regression models to help estimate plant status, such as foliar mineral contents.
Because nitrogen is important for yield efficiency and crop health, the use of hyperspectral signatures to prevent nitrogen deficiencies in the field has become widespread. Hence, much research has been conducted using remote sensing and hyperspectral signatures to determine crop nitrogen deficiency, required rates of fertilizers to increase crop production, or even the amount of nitrogen uptake by plants to improve agricultural production and yield efficiency (Maes and Steppe, 2019). DeOliveira et al. (2017) applied selected vegetation indices to estimate foliar N concentration in three Eucalyptus tree clones grown in the field. Liu et al. (2016) applied multiple linear regression and neural network analysis to find a relationship between the leaf nitrogen content of field grown winter wheat and narrow-band vegetation indices. Other studies have used hyperspectral indices to check the sodium and potassium content of grass (Capolupo et al., 2015), potassium deficiency level in canola (Severtson et al., 2016), nitrogen concentration in field grown oat (Van Der Meij et al., 2017), corn (Gabriel et al., 2017), rice (Wen et al., 2018), and wheat (Zhu et al., 2018), and leaf N, P, K, Ca, Mg, and a few micronutrients of corn and soybean plants (Pandey et al., 2017).
Putting a new plant species into tissue culture to establish in vitro shoot cultures may require adjusting the nutrient medium components to optimize shoot growth of the new species. Finding the optimum concentration of each component is critical and requires time and money. Estimating an explant's foliar mineral status to check its health is important for attaining optimal in vitro growth. Usually, destructive methods are applied to estimate foliar mineral contents, especially for tissue cultured plants. Nondestructive methods, such as applying hyperspectral signatures, can help growers reduce their production costs and save time.
To date, reports on using hyperspectral devices and hyperspectral vegetation indices in tissue culture environments are lacking. To check the feasibility of using this technology to evaluate the mineral content of tissue cultured little-leaf mockorange shoots, we used a spectroradiometer during the shoot proliferation stage of micropropagation to determine whether hyperspectral imaging could help in estimating the nutrition status of the explants during stage 2. If hyperspectral imaging proves successful, it can help tissue culture plant producers save money by avoiding destructive sampling for foliar nutrient analysis and save the time spent waiting for nutrient analyses to be completed.

Preparing the spectroradiometer and taking readings
For this research, we used either an Analytical Spectral Devices FieldSpec 4 High-Resolution spectroradiometer (Malvern Panalytical Ltd., Westborough, MA, USA) or an Analytical Spectral Devices FieldSpec HandHeld-2 spectroradiometer (Analytical Spectral Devices Company, Boulder, CO, USA). After 30 minutes of spectroradiometer warm up, the device was optimized and calibrated with a Spectralon® 99% white reference panel (Labsphere Inc., North Sutton, NH, USA). During calibration, 100 dark current measurements were averaged, and an average of 50 scans of the Spectralon® white reference was recorded every two minutes (Beck, 2019). Target reference recordings were an average of 20 scans at an optimized integration time of approximately 1 second.
Reflectance readings of mockorange shoots were made immediately (within 2 minutes) after the shoots were taken out of the jar and before the reflectance spectra procedure was completed. Measurements were made in a dark room on a black-colored bench to exclude external light. The probe was held about 5 to 10 cm above the explants to record the reflectance. Measurements were taken on all six shoots grown within each culture jar. Three replicate readings were recorded for the shoots grown in each jar in order to reduce error effects.
After every 10 to 12 readings, a new calibration was completed to reduce the error from external white light. All measurements were acquired using RS3 software version 6.4 (Malvern Panalytical Ltd., Westborough, MA, USA).
Reflectance spectral data represented the full range of VIS, NIR, and short wave infrared (SWIR) light between 350 and 2500 nm, with a resolution of 1 nm. The spectral sampling interval was automatically interpolated from 1.4 nm to 1 nm at the time of each individual measurement by RS3 software, so a single value for each wavelength from 350 to 2500 nm was recorded (Beck, 2019). Data were exported by the ViewSpec Pro software version 6.2 (Malvern Panalytical Ltd., Westborough, MA, USA). The average of three readings of the reflectance from the group of six explants (per jar) was used to create a single treatment reflectance spectrum for each jar of shoots.

Tissue analysis for mineral content
After the hyperspectral reflectance was recorded, the shoots were separated from the agar medium, placed in an envelope, and dried at 70 °C for 72 hours. Dried shoots were ground using a mortar and pestle. Dried tissues were sent to the tissue analysis lab (Brookside Laboratories, Inc., New Bremen, OH) for foliar nutrient analysis. Total N content was determined by a combustion method using a Carlo Erba 1500 C/N analyzer (method B2.20, Miller et al., 2013). For Ca, samples were digested with nitric acid and hydrogen peroxide in a closed Teflon vessel in a CEM Mars microwave and analyzed on a Thermo 6500 Duo ICP (method B4.30, Miller et al., 2013). Results from the foliar analyses were used for correlation analysis and model training with the hyperspectral signatures.

Hyperspectral data analysis
Preprocessing the spectral signatures was the first step in hyperspectral dataset analysis, particularly for spectra collected by the spectrometer. To further reduce noise, spectra were preprocessed with a Savitzky-Golay smoothing filter (window size = 5 and polynomial order = 4) (Ge et al., 2019). The order and window size were selected by trial and error, with the goal of smoothing out only large changes on a signature's surface.
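The smoothing step can be sketched in Python with SciPy's `savgol_filter`. This is a minimal illustration rather than the authors' code: the synthetic spectrum, the helper name `smooth_spectrum`, and the wider (11, 2) setting are assumptions made for the example, and only the window size of 5 and polynomial order of 4 come from the text.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_spectrum(reflectance, window_length=5, polyorder=4):
    """Savitzky-Golay smoothing of one reflectance spectrum.

    The defaults mirror the window size and polynomial order quoted in the text.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    return savgol_filter(reflectance, window_length=window_length, polyorder=polyorder)


# Synthetic stand-in for a 350-2500 nm spectrum at 1 nm spacing (not study data).
rng = np.random.default_rng(0)
wavelengths = np.arange(350, 2501)
clean = 0.4 + 0.1 * np.sin(wavelengths / 150.0)
noisy = clean + rng.normal(0.0, 0.01, wavelengths.size)

smoothed_paper = smooth_spectrum(noisy)        # window 5, order 4 (from the text)
smoothed_wide = smooth_spectrum(noisy, 11, 2)  # hypothetical wider setting
```

One point worth checking when reusing these settings: a fourth-order polynomial fitted to a five-point window passes through all five points, so this combination returns the input essentially unchanged. A wider window or a lower order, as in the hypothetical second call, is needed for visible smoothing.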
The success of developing regression models is contingent upon the number of features assigned to the feature space (Zhao et al., 2019). Apart from the reflectance value at each wavelength, the hyperspectral dataset was used to extract spectral indices and geometric features from continuum removal regions. The number of features used for regression becomes even more critical when hyperspectral datasets are used, because the large number of spectral bands makes it difficult to determine whether spectral bands, spectral vegetation indices generated from spectral bands, or both are associated with foliar chemical or physiological status, in this case leaf mineral content. To address this question, related features (explained in the following sections) were extracted from the spectra, and feature selection approaches were then applied to train the model with fewer but more informative features.

A. Spectral indices
Spectral indices, defined by mathematical operations between two or more spectral bands, are also widely used for feature extraction in remote sensing (Lu et al., 2020). Many spectral indices used in agricultural applications are suitable for the specific purpose of plant monitoring. In this study, some commonly used spectral indices for mineral estimation were selected (Table 1).
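As an illustration of how such indices are computed from band reflectances, the sketch below implements two indices named elsewhere in this paper, NDVI and the Cellulose Absorption Index (CAI), in their commonly published forms. The band positions used as defaults (670, 800, 2000, 2100, and 2200 nm), the helper names, and the example reflectance values are assumptions for the example; the definitions actually used in Table 1 may differ.

```python
def band(spectrum, wavelength_nm):
    """Reflectance at one wavelength; `spectrum` maps wavelength (nm) to reflectance."""
    return spectrum[wavelength_nm]


def ndvi(spectrum, red_nm=670, nir_nm=800):
    """Normalized difference of a NIR band and a red band (band choice is an assumption)."""
    red, nir = band(spectrum, red_nm), band(spectrum, nir_nm)
    return (nir - red) / (nir + red)


def cai(spectrum, left_nm=2000, centre_nm=2100, right_nm=2200):
    """Cellulose Absorption Index: mean of the two shoulder bands minus the centre band."""
    shoulders = 0.5 * (band(spectrum, left_nm) + band(spectrum, right_nm))
    return shoulders - band(spectrum, centre_nm)


# Made-up reflectance values, used only to show the arithmetic.
example = {670: 0.05, 800: 0.45, 2000: 0.20, 2100: 0.15, 2200: 0.18}
print(round(ndvi(example), 3), round(cai(example), 3))
```

Each index reduces a full spectrum to one number, which is what allows it to be used as a single input feature in the regression models.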

B. Continuum removal
The absorption bands of the electromagnetic spectrum contain valuable information about the minerals or chemical compounds present in the target. This information has been used in various studies. Huanga et al. (2004) and Gomeza et al. (2008) used absorption features to estimate the amount of clay and calcium in the soil and the nitrogen concentration in a tree's canopy leaf surface, respectively.
The presence of organic components on the surfaces of plant leaves results in absorptions in the VIS and NIR wavelength ranges. The relevant chemical bonds include C-N, N-H, and O-H (Hunt, 1980), which indicate significant biochemical substances found on plant surfaces, such as lignin and starch, as well as nitrogen-containing components found in plants, such as protein and chlorophyll. These chemical and organic compounds may produce absorption in the spectral signature of plants due to the electron transfer phenomenon in the VIS region of the electromagnetic spectrum. On the other hand, specific absorptions in the SWIR region of the plant's spectral signature may be connected to the cellulose, glucose, and water content of the plant's leaf structure.
To describe the geometrical differences between absorption regions, spectra need to be transformed into numerical features. To extract numerical information from an absorption region, the spectrum's general concave shape must be removed. This approach to normalization is referred to as "continuum removal" or the "convex hull" method, and it enables the comparison of spectra acquired with different equipment or under varying lighting conditions (Sowmya and Giridhar, 2017).
The continuum removal, spectral signatures, and convex hulls of spectra can be shown graphically (Figure 1). Three characteristics are defined in this study by the geometry of the spectral signature following continuum removal.
The depth, area, and asymmetry features in Figure 1 correspond to the continuum value at the lowest point of absorption, the area under the continuum curve in an absorption region, and the ratio of the left to right area, respectively. In this study, fifteen ranges in the spectral signatures were selected (Table 2). To choose these spectral ranges, the spectral signature was carefully examined, and the absorption regions were selected based on a visual comparison between the absorption regions and the surrounding (left and right extremum) wavelengths.
With $A_{left}$ denoting the area between the continuum line and the continuum removed spectrum on the left of the absorption feature, and $A_{right}$ the area between the continuum line and the continuum removed spectrum on the right, the features are defined as follows (Aspinall et al., 2002):
• D = the absorption depth (the lowest point in the continuum region)
• Area = $A_{left} + A_{right}$
• Asy = $A_{left} / A_{right}$
For example, Asy 2 means the asymmetry in the second wavelength range.
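The three geometric features can be sketched numerically as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the continuum is taken as the upper convex hull of the spectrum within one absorption range, the left and right areas are split at the absorption minimum, and the function names and the synthetic Gaussian absorption dip are hypothetical. Both the minimum of the continuum removed spectrum and the band depth (one minus that minimum) are returned, since the text does not state which convention was used.

```python
import numpy as np


def upper_hull(x, y):
    """Indices of the upper convex hull of points sorted by increasing x."""
    hull = []
    for i in range(len(x)):
        # Drop the last hull point while it lies on or below the chord to point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return hull


def trapezoid_area(values, x):
    """Trapezoidal integral of `values` over `x`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))


def absorption_features(wavelengths, reflectance):
    """Minimum, depth, area and asymmetry of one continuum removed absorption range."""
    x = np.asarray(wavelengths, dtype=float)
    y = np.asarray(reflectance, dtype=float)
    idx = upper_hull(x, y)
    continuum = np.interp(x, x[idx], y[idx])  # hull interpolated to every band
    removed = y / continuum                   # continuum removed spectrum (<= 1)
    k = int(np.argmin(removed))               # position of the absorption minimum
    gap = 1.0 - removed                       # space between continuum line and spectrum
    a_left = trapezoid_area(gap[: k + 1], x[: k + 1])
    a_right = trapezoid_area(gap[k:], x[k:])
    return {
        "min": float(removed[k]),
        "depth": float(1.0 - removed[k]),
        "area": a_left + a_right,
        "asymmetry": a_left / a_right,
    }


# Synthetic symmetric absorption dip (not study data).
x_demo = np.arange(0, 101)
y_demo = 0.5 - 0.2 * np.exp(-(((x_demo - 50) / 10.0) ** 2))
features = absorption_features(x_demo, y_demo)
```

For this symmetric dip the asymmetry is close to 1; a feature whose minimum sits nearer the right edge gives a value above 1, and one nearer the left edge gives a value below 1. A minimum located exactly at an edge of the range would make one side's area zero, so a real implementation would need to guard that case.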

Model development
From the feature selection section, relevant features from spectral signatures were identified for tissue cultured shoots.
The next steps were to 1) fit the regression models using machine learning methods and 2) validate their significance using test data. Linear, Random Forest, and Support Vector Machine regression were the three models used in this research and are briefly explained below.
• Linear Regression: a linear model that assumes a linear relationship between the input variable (x) and the single output variable (y). A correlation test was used to select the relevant features for a linear model from the defined features (independent variables), such as reflectance values, continuum removal features, and spectral indices.
Pearson's correlation coefficient is the covariance of the two variables divided by the product of their standard deviations (Freedman et al., 2007). The Pearson correlation coefficient was used so that features with high correlation values were first recognized and selected from the list of defined features. In addition to a single-variable linear model, a multivariable linear model was also examined to determine the performance of different combinations of spectral features on the estimation results.
• Random Forest Regression (RF): a supervised learning algorithm that uses an ensemble learning method for regression and is constructed from a set of decision trees. This ensemble (group) learning technique combines multiple decision trees rather than relying on a single regression model, which can enable RF to obtain satisfactory results, where an R-square (R²) value close to 1 or a root mean square error (RMSE) close to zero indicates ideal estimation. For this reason, RF has been widely used by researchers in regression and classification problems. The performance of a random forest model depends on the number of trees and the input variables. Therefore, in this paper, different random forest regression models were trained to achieve the best model.
• Support-Vector Machines (SVM): a supervised learning model with associated learning algorithms that analyze data for classification and regression analysis. Various models can be produced by changing the SVM parameters, including the kernel type and the penalty constant C, which balances maximizing the separator margin in feature space (for example, a two-dimensional space constructed from reflectance values at two wavelengths) against fitting errors. In this study, to reach an optimal model, parameter tuning was first carried out using the RBF (Radial Basis Function), linear, and polynomial kernels (commonly used built-in functions of the SVM algorithm that map the values of a variable into another space). These kernels were combined with C values of 10, 100, and 1000, and the models with satisfactory results were then selected.
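The kernel and penalty search described for the SVM can be sketched with scikit-learn's `SVR`. This is a minimal illustration under stated assumptions: the data are synthetic, the feature standardization step and the 70/30 train/test split are choices made for the example, and only the three kernel types and the C values of 10, 100, and 1000 come from the text.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for two spectral features and a mineral content (not study data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(0.0, 0.05, 60)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

results = []
for kernel in ("rbf", "linear", "poly"):
    for C in (10, 100, 1000):
        # Standardizing the inputs is an assumption of this sketch, not stated in the text.
        model = make_pipeline(StandardScaler(), SVR(kernel=kernel, C=C))
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        results.append((rmse, float(r2_score(y_test, pred)), kernel, C))

# Keep the kernel/C combination with the lowest RMSE on the held-out samples.
best_rmse, best_r2, best_kernel, best_C = min(results)
```

With so few samples, a cross-validated search (for example scikit-learn's `GridSearchCV`) may give a more stable choice than a single split, at the cost of departing from the simple trial-and-error loop shown here.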
To organize the results, variables were first added into the model separately, and the coefficient of determination (R²), RMSE, and the correlation coefficient (Corr) were calculated. Next, a combination of variables was added to the model (multiple inputs), and R², RMSE, and Corr were calculated again. The best model was chosen by comparing the results, using the best R² and Corr values together with error bar plots and scatter plots. The error bar plots showed the error between observed and predicted values, and the scatter plots showed the correlation between observed and estimated values.
For the evaluation criteria (R², RMSE, and Corr), $N$ is the number of observations, $O_i$ is the observed value, $P_i$ is the estimated value, $\bar{O}$ is the mean of the observed values, $\bar{P}$ is the mean of the estimated values, $\sigma_O$ is the standard deviation of the observations, and $\sigma_P$ is the standard deviation of the estimated values.
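For reference, the standard forms of the three criteria can be written in terms of the quantities just defined: the observed values $O_i$, the estimated values $P_i$, their means $\bar{O}$ and $\bar{P}$, and their standard deviations $\sigma_O$ and $\sigma_P$. These are textbook definitions offered as a sketch. The exact expressions used in this study are not recoverable from the text, and R² in particular is sometimes reported as the square of Corr rather than in the form shown, so the first equation is an assumption.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{N} (O_i - P_i)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2}

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (O_i - P_i)^2}

\mathrm{Corr} = \frac{\frac{1}{N} \sum_{i=1}^{N} (O_i - \bar{O})(P_i - \bar{P})}{\sigma_O \, \sigma_P}
```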

Model evaluation criteria (Statistical criteria for numerical evaluation of the developed model)
A schematic diagram of the methods used for developing a regression model from the hyperspectral bands and the mineral content in little-leaf mockorange shoots is shown in Figure 2. The evaluation criteria were calculated separately for foliar N and Ca contents.
The flowchart can be divided into the following steps:
• Step 1: Adding variables into the model separately and calculating R², RMSE, and Corr.
• Step 2: Adding a combination of variables (multiple inputs) into the model and then calculating R², RMSE, and Corr.
• Step 3: Comparing the results and choosing the best models according to the evaluation criteria, with the results shown using error bar plots and scatter plots.
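The three steps can be sketched for the linear case as below. This is an illustration under stated assumptions rather than the study's code: the feature names and data are synthetic, ordinary least squares stands in for whichever regression method is being evaluated, the fixed train/test split is arbitrary, and R² is computed as one minus the ratio of residual to total sum of squares.

```python
from itertools import combinations

import numpy as np


def fit_predict_linear(X_train, y_train, X_test):
    """Ordinary least squares with an intercept; returns predictions for X_test."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef


def evaluate(observed, predicted):
    """R2, RMSE and Pearson correlation between observed and predicted values."""
    residual = observed - predicted
    r2 = 1.0 - np.sum(residual**2) / np.sum((observed - observed.mean()) ** 2)
    rmse = np.sqrt(np.mean(residual**2))
    corr = np.corrcoef(observed, predicted)[0, 1]
    return {"r2": float(r2), "rmse": float(rmse), "corr": float(corr)}


# Synthetic stand-in features and target (hypothetical names, not study data).
rng = np.random.default_rng(1)
names = ["feat_a", "feat_b", "feat_c"]
X = rng.uniform(0.0, 1.0, size=(40, 3))
y = 1.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0.0, 0.02, 40)
train, test = np.arange(0, 28), np.arange(28, 40)

scores = {}
for size in (1, 2):  # Step 1: single inputs; Step 2: combinations of inputs
    for cols in combinations(range(len(names)), size):
        cols = list(cols)
        pred = fit_predict_linear(X[train][:, cols], y[train], X[test][:, cols])
        scores[tuple(names[c] for c in cols)] = evaluate(y[test], pred)

# Step 3: compare the candidates and keep the one with the lowest RMSE.
best_single = min((k for k in scores if len(k) == 1), key=lambda k: scores[k]["rmse"])
best_overall = min(scores, key=lambda k: scores[k]["rmse"])
```

The `scores` dictionary holds the values that would be tabulated for each candidate model; the error bar and scatter plots described in Step 3 would be drawn from the observed and predicted values of the selected model.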

Results
The correlations of leaf N content with spectral features, including spectral bands, spectral indices, and continuum removal features, were calculated. Spectral bands with higher correlation to leaf N content were used in regression model training (Figures 3 and 5). The wavelengths from 648 to 651 nm had a moderately high correlation with %N, with a correlation value of 0.30 (Figure 3). In general, leaf reflectance between 505 nm and 670 nm had the highest correlation with the N content of microshoots and was used for developing a linear model for N estimation (Figure 3).
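The band-wise correlation screening described here can be sketched as follows. The function name, the reduced set of 20 bands, and the synthetic spectra are assumptions for the example; the study computed the correlation at every wavelength of the measured spectra.

```python
import numpy as np


def band_correlations(spectra, target):
    """Pearson r between each wavelength column of `spectra` and `target`.

    `spectra` has one row per sample and one column per wavelength.
    """
    spectra = np.asarray(spectra, dtype=float)
    target = np.asarray(target, dtype=float)
    centred = spectra - spectra.mean(axis=0)
    t = target - target.mean()
    covariance = centred.T @ t / len(t)
    return covariance / (spectra.std(axis=0) * target.std())


# Synthetic data in which the 650 nm band drives the target (not study data).
rng = np.random.default_rng(0)
wavelengths = np.arange(500, 700, 10)
spectra = rng.uniform(0.05, 0.15, size=(30, wavelengths.size))
target = 2.5 - 3.0 * spectra[:, 15] + rng.normal(0.0, 0.01, 30)

r = band_correlations(spectra, target)
order = np.argsort(-np.abs(r))     # rank bands by absolute correlation
top_bands = wavelengths[order[:3]]
```

Ranking by absolute value keeps strongly negative correlations, which are as informative for regression as positive ones.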

Model development
Results showed that the reflectance value at the wavelength of 648 nm, the asymmetric feature in the range from 1819 nm to 2150 nm (Asy 11), and the area from 559 to 772 nm (Area 3) had correlation values of 0.30, 0.31, and 0.37 with %N content (Figure 5). These spectral features provided the information needed to generate a linear model for predicting %N. Based on these spectral data, N content was estimated by the linear model with R² = 0.21, RMSE = 0.54, and Corr = -0.45 (Figure 4).
Random Forest regression was used for the next model. One of the main advantages of RF regression is that the number of input variables has little effect on the model (Horning, 2010). The algorithm is able to identify the most effective variables according to their entropy values and then develop the regression model using those variables, meaning that the RF algorithm can also serve as a feature selection method. All the spectral bands selected from the correlation test and all the spectral features (indices and continuum removal) were added to the RF model. Based on the results, the RF regression model revealed that the asymmetric point from 1819 to 2150 nm (Asy 11), the asymmetric point from 559 to 772 nm (Asy 3), the reflectance value at the wavelength of 2480 nm, the reflectance at the wavelength of 525 nm, and the Double Peak Index (DPI) were the most effective features for generating a nonparametric (non-linear) model (Figure 5).
To develop an RF model, besides selecting optimal features as effective inputs, the number of trees in the model must be determined. By testing various models with different combinations of the mentioned features and/or indices, the most accurate model was eventually selected (Table 3). The model fitted with the DPI index and the reflectance at 525 nm, with a tree number of 5, was the most accurate model fitted by RF regression compared to the other fitted models, with R² = 0.72, RMSE = 0.30, and correlation = 0.84 (Figure 6).
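The search over feature combinations and tree numbers can be sketched with scikit-learn's `RandomForestRegressor`. This is a minimal illustration under stated assumptions: the four features `f0` to `f3` and the target are synthetic, the candidate tree numbers (5, 10, 50) and the train/test split are choices made for the example, and scikit-learn ranks features by impurity-based importance, which may differ from the criterion used in the study.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in features and target (hypothetical names, not study data).
rng = np.random.default_rng(0)
names = ["f0", "f1", "f2", "f3"]
X = rng.uniform(0.0, 1.0, size=(80, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.05, 80)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# RF used as a feature ranker: impurity-based importance of each input.
ranker = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
importance = dict(zip(names, ranker.feature_importances_))

# Try single features and pairs with several tree numbers; score on held-out data.
results = []
for size in (1, 2):
    for cols in combinations(range(len(names)), size):
        cols = list(cols)
        for n_trees in (5, 10, 50):
            model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
            model.fit(X_train[:, cols], y_train)
            pred = model.predict(X_test[:, cols])
            rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
            results.append((rmse, n_trees, tuple(names[c] for c in cols)))

best_rmse, best_trees, best_features = min(results)
```

Fixing `random_state` makes the comparison repeatable; with very few trees the ranking of candidate models can change noticeably from one random seed to another.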
Support vector machine regression, one of the most commonly used regression methods, was used to develop another regression model. In the SVM model, two main objectives were considered. First, the features selected from the correlation test and those selected by the RF method were added to an SVM model. Second, the parameters of the SVM model, including the kernel type and the penalty term, were evaluated by trial and error so that the SVM model with the lowest RMSE was selected as the most accurate.
SVM regression estimated foliar %N content more accurately than the linear model. The fitted SVM model that included the Double Peak Index (DPI) with the asymmetric point from 1819 to 2150 nm (Asy 11), and another model that included DPI with the asymmetric point from 559 to 772 nm (Asy 3), provided approximately accurate estimates of foliar N content for little-leaf mockorange shoots produced in tissue culture, with R² = 0.58 and RMSE = 0.32, and R² = 0.61 and RMSE = 0.33, respectively (Table 4, Figure 7).

Foliar calcium content
After the hyperspectral bands were analyzed and checked for their correlation with the Ca content of the shoots obtained from the tissue analysis, the bands with higher correlations were selected. These were 721 nm, 541 nm, 1293 nm, 1805 nm, and 2209 nm, with correlation values of 0.35, 0.33, 0.30, 0.28, and 0.26, respectively (Figure 8).
Examining the correlation values between %Ca and the different spectral features and VIs showed that the minimum (depth) extremum in the wavelength range from 1819 to 2150 nm (Min 11) and the minimum (depth) extremum in the wavelength range from 1287 to 1670 nm (Min 8) had the highest correlation values with Ca, 0.59 and 0.45, respectively (Figure 9).

Model development
Model development showed that Ca content could be estimated by a linear model consisting of the minimum (depth) extremum in the wavelength range from 1819 to 2150 nm (Min 11) and the area from 559 to 772 nm (Area 3), with R² = 0.83 and RMSE = 0.09. Nevertheless, the coefficient of Area 3 was low enough to be ignored when drawing the error bar graph (Figure 10):
%Calcium = 1.13 × (Min 11) + 0.08
The Random Forest algorithm provided a successful model to estimate the %Ca of little-leaf mockorange shoots.
After several models with different feature combinations and tree numbers were examined, the model with a tree number of 5 that included four features, the minimum (depth) from 838 to 843 nm (Min 4), the area from 2428 to 2490 nm (Area 15), the asymmetric point from 1670 to 1714 nm (Asy 9), and the Cellulose Absorption Index (CAI), was the most effective nonparametric (non-linear) model (Figure 11, Table 5), yielding R² = 0.99, RMSE = 0.03, and a correlation value of 0.99 (Figure 12, right). The error bar plot in Figure 12 (left) reveals only slight differences between observed and estimated Ca among the test samples, indicating the success of the developed RF model for shoot Ca estimation.
The specific spectral features and the selected index (CAI) identified by the RF algorithm as the best variables for model development were also used to develop a fitted model for SVM regression. After several models were developed and run with different penalty terms (costs = 10, 50, or 100) and different kernels (linear, polynomial, or radial) (Table 6), a model with a linear kernel was eventually developed that included all four features: the minimum (depth) reflectance from 838 to 843 nm (Min 4), the area from 2428 to 2490 nm (Area 15), the asymmetric point from 1670 to 1714 nm (Asy 9), and CAI.
This model had R² = 0.59 and RMSE = 0.16 and was determined to be the better SVM model, regardless of the penalty term (cost value) (Table 6, Figure 13).

Discussion
Regression modeling plays an important role in estimating various plant characteristics, such as mineral content and water content. Accurate prediction of these parameters can assist in better understanding plant growth and development and in improving agricultural practices. In this context, several regression models have been developed for hyperspectral data analysis, including the Random Forest (RF) and Support Vector Machine (SVM) models. In this study, we compared the performance of linear, RF, and SVM regression models in predicting the nitrogen (%N) and calcium (%Ca) content of tissue-cultured shoots. Additionally, we evaluated the importance of selecting the best features and wavelengths from the hyperspectral bands for accurate prediction. Our findings indicated that the RF model outperformed the SVM model in predicting %N, and that %Ca was also better predicted by the RF model, with higher R² and lower RMSE values. These results demonstrated the importance of selecting the appropriate regression model and optimal features for hyperspectral data analysis in predicting plant characteristics.
This research demonstrated that hyperspectral imaging can be used to predict the percentages of N and Ca in little-leaf mockorange shoots produced in tissue culture. Linear, RF, and SVM regression procedures were used to obtain an accurate model to estimate the %N and %Ca in little-leaf mockorange shoots produced in tissue culture.
Among the three regression models developed to estimate and predict the foliar nitrogen content, the random forest and SVM regressions could estimate %N more accurately than the linear regression model. Nevertheless, the models developed to predict %N were slightly less accurate than those developed for predicting %Ca in the tissue cultured shoots.
The RF model (tree number = 5) could estimate %N better than SVM, regardless of the cost (penalty term) used for the SVM regression model. For %Ca, the RF model had a higher R² (0.99) and a lower RMSE (0.03) and provided a better model than SVM, which had a lower R² (0.59) and a higher RMSE (0.16). Finding the best regression model and the best features or indices, as well as the best wavelengths throughout the hyperspectral bands, is highly important for predicting a specific mineral content or other plant characteristics, such as water content.
Although the linear regression model provided an acceptable R² value, the model failed to predict %Ca. Hence, the RF and SVM regression models were considered as alternatives. Based on the results obtained from this research, foliar %Ca content could best be estimated using a non-linear regression model rather than a linear model. Although the features used in the model (including the Cellulose Absorption Index) worked for both the RF and SVM regression models, the RF regression had a stronger R² and correlation and therefore was a better model to estimate the %Ca of tissue cultured shoots of little-leaf mockorange. Cellulose is an important component in the structure of primary cell walls of green plants (Anonymous, Cellulose, 2021). Calcium interacts with cellulose as a cellular structural component. A high correlation between %Ca and CAI is likely due to this relationship, and in the future more detailed experiments can be conducted to determine any possible relationship between %Ca and the CAI index.
To date, research using hyperspectral images to estimate the mineral contents of shoots or plantlets produced in tissue culture (in vitro) is lacking. Studies, however, have been conducted to estimate the N content of agronomic field crops, such as estimating N in winter wheat at different growth stages based on NIR wavelengths via multivariate linear regression and a Back Propagation (BP) neural network using vegetation indices (Liu et al., 2016), estimating leaf N content of winter wheat via selected spectral indices around NIR wavelengths (Zhu et al., 2018), estimating N content in potato plants in the NIR (Clevers and Kooistra, 2012), N estimation in maize via VIs such as NDVI, the Renormalized Difference Vegetation Index (RDVI), or the Optimized Soil-Adjusted Vegetation Index (OSAVI) (Gabriel et al., 2017), N estimation in rice with a Gaussian process regression (GPR) model (Wen et al., 2018), N estimation of eucalyptus using red-edge NDVI and modified red-edge NDVI (DeOliveira et al., 2017), and estimation of macro- and micronutrients such as N and Ca in soybean and maize via partial least squares regression (PLSR) models (Pandey et al., 2017).
Although some reports describe the use of NIR or lower short wave infrared (SWIR) wavelengths to provide effective estimates of N, almost all of these studies have used only vegetation indices such as NDVI or other VIs. The difference between this study and other hyperspectral studies was the application of different geometric features generated from continuum removal, such as minimum reflectance (depth), area under the spectrum, and asymmetric point of the spectrum, alongside the reflectance spectrum acquired from little-leaf mockorange shoots produced in tissue culture.
Applying these geometric features to plants grown in an in vitro environment resulted in satisfactory R² and RMSE values from the regression models used to predict N and Ca contents in the shoots.
An interesting aspect of %N and %Ca estimation was that both were predictable in the spectral ranges from 1819 to 2150 nm (Range 11) and from 559 to 772 nm (Range 3). Using different features of these ranges provided information for each of these two minerals in little-leaf mockorange shoots. In addition, correlation plots of estimated and measured values for N and Ca concentrations revealed a small gap between higher and lower concentrations of these two minerals, probably due to the limited number of samples (fewer than 100) used for predicting their concentrations. The other possibility for the gap was that hyperspectral images could estimate N or Ca only at higher concentrations, because the tiny size of the leaves and stems on the shoot cultures meant less information was acquired from their reflectance.
A closer look at the scatter plot of %Ca obtained from the RF algorithm (Figure 12, error bar plot) showed that samples with CAI feature values higher than 0.15 had much smaller differences between measured and estimated values than samples with CAI values less than 0.15. This result indicated that, for a more accurate prediction, features with higher correlation values must be selected. On the other hand, except for two samples (error bars shown in Figure 12, left), the developed model either accurately estimated or slightly overestimated %Ca.
Most earlier foliar nutrient content studies have relied on vegetation indices to estimate canopy minerals, especially N. Unfamiliarity with hyperspectral features for predicting foliar mineral status may limit the use of this technique in comparison with vegetation indices. Recruiting a team of plant scientists, plant nutritionists, and hyperspectral scientists may provide an opportunity to apply these features more effectively.
This study illustrated the potential for success of such a team of plant scientists and hyperspectral scientists.
All of these results were obtained from one selected mockorange genotype. Hyperspectral imaging was applied successfully to shoots of this little-leaf mockorange grown in vitro, but the success of this method for other mockorange species, as well as other plant species, still needs to be tested.
This study showed that hyperspectral imaging could help to predict foliar nutrient contents (N and Ca in particular) of little-leaf mockorange shoots produced in tissue culture and could help to avoid destructive methods of foliar mineral analysis. This nondestructive method can save tissue culture producers the time needed for drying and grinding the samples, sending them to a tissue analysis lab, and waiting for the results, as well as the cost of shipping and foliar tissue analyses.

Conclusion
This study demonstrated that strong regression models could be developed to predict N and Ca contents of tissue cultured little-leaf mockorange shoots. The best RF model for %N estimation resulted in an R² = 0.72 and correlation = 0.84. Likewise, the best RF model for %Ca estimation resulted in an R² = 0.99 and correlation = 0.99. These strong statistical values demonstrated that hyperspectral imaging can be used to accurately predict %N and %Ca in tissue cultured shoots from one selected little-leaf mockorange genotype.
Other mockorange species as well as other plant species produced in tissue culture would need to be tested to validate using hyperspectral imaging to predict N and Ca contents of their shoots.

Tables
Table 1
The most highly correlated vegetation indices determined by using hyperspectral imaging in this study. Formula calculations were obtained from Anonymous, Index Database, 2021

Legends to Figures
Step 4: Plotting the best results in an error plot (showing the error between observed and estimated values) and a scatterplot (showing observed and predicted values plotted against each other).

N content measurement. The best single-variable linear model was obtained from the asymmetric feature in Range 11, shown below.
%Nitrogen = 4.47 × (Asy 11) − 3.45

Fig. 1 Spectral signature of a given wavelength range (X axis shows wavelength and Y axis shows reflectance values), its continuum removal, and the geometric features extracted from the continuum removal region, which include the area on the left side (Area L), the area on the right side (Area R), and depth (D)
Data sets were divided into model training and model test groups for generating the optimum regression model. Data partitioning, that is, splitting the data set (hyperspectral recorded samples) into training and test groups, was one of the crucial steps in regression. In our case, 39 of the 56 samples (reflectance spectra; approximately 70%) were used for model training and the remaining 17 samples were used for model testing. The training data set was then used to develop a regression model relating the foliar nutrient content from lab analysis to wavelengths in the spectral signature, vegetation indices calculated from those spectral signatures, and features generated from those spectral signatures. The developed model was validated and evaluated by using the test data set.
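The partitioning and evaluation steps described above can be sketched as follows. This is a hedged illustration only: the data are synthetic (not the study's measurements), a single-variable least-squares line stands in for the RF and SVM models, and all function names are hypothetical. R² is computed here as 1 − SS_res/SS_tot; the text does not state which definition was used in the study.

```python
import math
import random


def split_train_test(samples, n_train, seed=0):
    """Shuffle a copy of the samples and cut it into training and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(observed, predicted):
    """RMSE, R^2 (1 - SS_res / SS_tot), and Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
        "correlation": cov / math.sqrt(ss_tot * ss_pred),
    }


# Synthetic (feature, nutrient %) pairs, NOT the study's measurements.
rng = random.Random(42)
samples = []
for _ in range(56):
    x = rng.uniform(1.0, 2.0)
    samples.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.2)))

# 39 training samples and 17 test samples, matching the split sizes in the text.
train, test = split_train_test(samples, n_train=39)
slope, intercept = fit_line(train)
observed = [y for _, y in test]
predicted = [slope * x + intercept for x, _ in test]
metrics = evaluate(observed, predicted)
print(len(train), len(test), metrics)
```

Fitting on the training group only and computing the metrics on the held-out group keeps the reported RMSE, R², and correlation from being inflated by samples the model has already seen.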
The best features to estimate %N were reflectance values at the wavelength of

Table 2
Fifteen wavelength ranges taken from spectra extracted from little-leaf mockorange shoot cultures by using a

Table 3
Various models developed for %N estimation in little-leaf mockorange shoots produced in tissue culture. Each model has a different feature combination and a different number of trees via the Random Forest algorithm

Table 4
Various models developed for %N estimation of little-leaf mockorange shoots produced in tissue culture. The models had different feature combinations and different penalty terms via the SVM algorithm. RMSE, R², and correlation values resulted from the SVM model tuned with the linear kernel and three different values of the penalty term

Table 5
Various models developed for %Ca estimation in little-leaf mockorange shoots produced in tissue culture. The models contain different feature combinations and different numbers of trees via the Random Forest algorithm

Table 6
Various models developed for %Ca estimation of little-leaf mockorange shoots produced in tissue culture. The models used different feature combinations and different penalty terms via the SVM algorithm