The effect of drought stress of sorghum grains on the textural features evaluated using machine learning

This study aimed to determine the discriminatory power of textural features for differentiating sorghum grains subjected to normal, mild deficit, and severe deficit irrigation. The study used image processing, discrimination analysis, analysis of variance, and cluster analysis based on selected texture parameters calculated for images from the individual color channels L, a, b, R, G, B, U, V, S, X, Y, and Z. The results indicated that sorghum grains produced under different irrigation levels can be discriminated with an accuracy of up to about 100%. For each irrigation level, most of the genotypes differed in the values of textural features and formed separate homogeneous groups. Drought is one of the limiting factors contributing to a decrease in sorghum grain productivity and nutritional quality, especially when the crop is cultivated in marginal areas. Therefore, low-quality grains produced under water stress should be recognized before they enter the food and feed chain. Image analysis based on the textures of sorghum grain images proved useful for discriminating sorghum grains subjected to drought stress. The applied procedure provided fast, objective results that may be applied in practice for screening sorghum grains produced under different irrigation levels.


Introduction
Sorghum [Sorghum bicolor (L.) Moench] is the fifth most important cereal crop worldwide after wheat, rice, maize, and barley, and is cultivated as a fodder crop or staple food in tropical and semi-tropical areas of Asia and Africa [1]. Sorghum grains are gluten-free and have considerable potential in food and beverage markets, including cookies [2], noodles [3,4], tortilla chips [5], bread [6,7], and chicken nuggets [8].
A major constraint on crop production under a climate change scenario is water shortage during the crop season. As sorghum is a drought-tolerant C4 crop, it is predominantly cultivated in water-limited areas [9,10], although it has been proven to respond positively to irrigation [1]. Even in sorghum, drought stress reduces grain quality and quantity at both pre- and post-anthesis growth stages [11,12]. Water stress during seed filling influences various metabolic processes in the leaves that are involved in the biosynthesis of functional constituents such as seed reserves and minerals [13]. Drought is extremely detrimental to seed production because it reduces seed size, seed number, and seed quality [13]. The negative impact of water stress on chemical parameters including protein, nitrogen-free extracts, sugar, crude fiber, and total ash has been reported [14,15]. Under water stress conditions, plants accumulate more neutral detergent fiber (NDF) and lignin [16]. Therefore, screening for sorghum seeds produced under well-watered conditions is essential before processing.
Post-harvest classification and sorting of healthy and vigorous seeds can be an important marketing issue from both the farmers' and the consumers' points of view. It has been reported that discriminating underdeveloped, poorly filled seeds from fully developed, well-filled seeds can be difficult [17]. This is due to the increased occurrence under drought of seeds that are mainly hollow inside but retain their relative volume. Therefore, it is vital to differentiate between poorly and well-filled seeds at post-harvest quality control [17].
Ewa Ropelewska and Leyla Nazari contributed equally to this work.
The application of image-based technology to the analysis of seed surface texture may lead to more powerful measurement of quality factors. Numerous devices such as CCD cameras, flatbed scanners, hyperspectral cameras, ultraviolet cameras, CT, and NMR have been employed for imaging [18]. Computer vision requires no prior information about the sample seeds. Owing to its low cost, high accuracy, high flexibility, and repeatability, machine vision has been successfully used in industry, medicine, and plant biology. It has been implemented in seed research for the investigation of diversity, cultivar identification, cultivar registration, and plant breeding [18,19]. Image processing has been applied for post-harvest assessment of wheat and barley kernel infections with Fusarium [20] and for the classification of boiled potatoes [18]. To the best of our knowledge, the discrimination of seeds produced under normal irrigation or drought has never been evaluated by image analysis. Thus, this study aimed to evaluate the efficacy of textural features for the differentiation of seeds produced under normal or drought conditions using discriminative models.

Field experiment
Ten sorghum genotypes were cultivated at the experimental field of Fars Agricultural and Natural Resources Research and Education Center on 6 June 2018 (52°43′ E and 29°46′ N; altitude 1604 m). While genotypes MGS2 and KGS23 were promising lines improved by SPII, the other eight genotypes were landraces kindly provided by the National Plant Gene-Bank of Iran, SPII.
The experiment was conducted as a split-plot. Deficit irrigation was considered the main factor and genotype the sub-factor. Subplots were 12 m² and included four rows 5 m long with a row distance of 0.6 m. The fertilizers were distributed based on soil test results. Irrigation treatments were applied to main plots as normal irrigation [irrigation when the evaporation from a class A pan reached 60 mm (Ir60)], mild water-deficit irrigation [irrigation when the evaporation from a class A pan exceeded 120 mm (Ir120)], and severe water-deficit irrigation [irrigation when the evaporation from a class A pan exceeded 180 mm (Ir180)]. Water stress treatment was started 26 days after sowing and continued during the season. FAO-CROPWAT 8.0 was used to calculate the reference crop evapotranspiration (ETo) and to schedule the different levels of irrigation according to Doorenbos and Pruitt [21]. Grains were harvested at physiological maturity.
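The threshold-based scheduling described above can be illustrated with a minimal sketch. It assumes that irrigation is triggered when cumulative pan evaporation since the last irrigation reaches the treatment threshold and that the counter then restarts; the function name, the reset rule, and the constant 10 mm/day evaporation series are hypothetical and are not taken from the study or from CROPWAT.

```python
def irrigation_days(daily_pan_evap_mm, threshold_mm):
    """Return the 0-based day indices on which irrigation is triggered.

    Assumption: evaporation accumulates from the last irrigation and the
    counter restarts at zero after each irrigation event.
    """
    events = []
    cumulative = 0.0
    for day, evap in enumerate(daily_pan_evap_mm):
        cumulative += evap
        if cumulative >= threshold_mm:
            events.append(day)
            cumulative = 0.0
    return events


# Hypothetical series: constant class A pan evaporation of 10 mm/day for 36 days.
daily = [10.0] * 36
thresholds = {"Ir60": 60, "Ir120": 120, "Ir180": 180}
schedule = {name: irrigation_days(daily, t) for name, t in thresholds.items()}

for name, days in schedule.items():
    print(name, "irrigated on day indices", days)
```

Under this toy series the normal treatment is irrigated three times as often as the severe deficit treatment, which is the contrast the three thresholds are meant to create.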

Image processing and textural features
Grain images were acquired using a camera (Canon PowerShot G11, 10 megapixels) placed on a tripod stand in a completely dark room. The images of sorghum grains are presented in Fig. 1. The light source was a 10-W fluorescent lamp with a 400 mm diameter placed above the samples on the stand. The zoom was set to 5× so that the grains, arranged on a black background, filled the camera window. For each sorghum genotype, images of 100 grains positioned on the dorsal or the ventral side were acquired and saved in TIFF format. Sorghum grain images were processed using the MaZda software (Łódź University of Technology, Institute of Electronics, Poland) [22]. Each sorghum seed clearly separated from the black background was considered as one region of interest (ROI). Afterward, the sorghum seed images were converted to the individual color channels L, a, b, R, G, B, U, V, S, X, Y, and Z. For each ROI in each color channel, about 200 textures were calculated [22].
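The ROI and texture steps can be sketched in simplified form. The sketch below separates grain pixels from a black background with a fixed brightness threshold and computes two histogram-based features (mean and variance) for one channel inside the ROI. It is an illustration only: it does not reproduce the MaZda procedure, the threshold of 30 and the tiny 3 × 3 image are hypothetical, and reading a feature name such as RHMean as "histogram mean of channel R" is an assumption.

```python
def roi_mask(image, threshold=30):
    """Mark pixels brighter than the dark background as grain (ROI) pixels.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    The threshold value is a hypothetical choice for illustration.
    """
    return [[max(px) > threshold for px in row] for row in image]


def histogram_features(image, mask, channel):
    """Mean and variance of the intensity histogram of one channel in the ROI."""
    values = [
        px[channel]
        for row, mask_row in zip(image, mask)
        for px, inside in zip(row, mask_row)
        if inside
    ]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return {"n_pixels": n, "mean": mean, "variance": variance}


# Hypothetical 3 x 3 RGB image: black background with four grain pixels.
BG = (0, 0, 0)
image = [
    [BG, (200, 120, 60), BG],
    [(180, 100, 40), (220, 140, 80), BG],
    [BG, (200, 120, 60), BG],
]
mask = roi_mask(image)
r_features = histogram_features(image, mask, channel=0)  # channel 0 = R
print(r_features)
```

In practice a texture package computes many more feature groups per channel than these two first-order statistics.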

Statistical analysis
The analyses of the obtained results were performed using the WEKA 3.9 application (Machine Learning Group, University of Waikato) [23], Statistica 13 software, and GenStat 12. WEKA was used for discrimination analysis to distinguish the sorghum grains produced under different levels of deficit irrigation. First, attribute selection was performed using the Best First search with the correlation-based feature selection (CFS) algorithm. In this step, the textural features with the highest discriminative power were chosen for the analysis. The attribute selection was carried out for the set including the textures from all color channels (R, G, B, L, a, b, X, Y, Z, U, V, S), and then the textures were selected for individual color spaces. Among the individual color spaces, the highest discrimination accuracies were obtained for the Lab color space. Therefore, only these results are presented in this paper. In the following step of the analysis, the discriminative models were developed based on the selected texture parameters. The discrimination analysis was carried out using the dataset manually split into training and test sets with a ratio of 70/30. For the discrimination of grains produced under different levels of deficit irrigation, the three classifiers that provided the highest accuracies were used: Bayes Net from the Bayes group, SMO (Sequential Minimal Optimization) from the Functions group, and Random Forest from the Decision Trees group [24]. The discrimination analyses were carried out for a set of selected textures from all color channels and then for textures selected from the Lab color space. The results are presented in the form of confusion matrices and average accuracy for sorghum grains with three irrigation levels.
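The evaluation scheme (a 70/30 split, a confusion matrix, and average accuracy) can be sketched as follows. This is not the WEKA workflow: a nearest-centroid rule stands in for Bayes Net, SMO, and Random Forest purely to keep the example self-contained, the split is stratified and seeded rather than manual, and the two-feature dataset is synthetic.

```python
import random


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split (features, label) pairs into train/test sets class by class."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append((features, label))
    train, test = [], []
    for label in sorted(by_class):
        items = by_class[label][:]
        rng.shuffle(items)
        cut = round(train_fraction * len(items))
        train += items[:cut]
        test += items[cut:]
    return train, test


def fit_centroids(train):
    """Stand-in classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=sq_dist)


def confusion_matrix(centroids, test, labels):
    """Rows are actual classes, columns are predicted classes."""
    matrix = {actual: {pred: 0 for pred in labels} for actual in labels}
    for features, actual in test:
        matrix[actual][predict(centroids, features)] += 1
    return matrix


def accuracy(matrix):
    correct = sum(matrix[lab][lab] for lab in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


# Synthetic, well-separated data: 10 grains per irrigation level, 2 features each.
labels = ["Ir60", "Ir120", "Ir180"]
samples = []
for k in range(10):
    samples.append(([10 + 0.1 * k, 5 + 0.1 * k], "Ir60"))
    samples.append(([20 + 0.1 * k, 9 + 0.1 * k], "Ir120"))
    samples.append(([30 + 0.1 * k, 13 + 0.1 * k], "Ir180"))

train, test = stratified_split(samples)
centroids = fit_centroids(train)
matrix = confusion_matrix(centroids, test, labels)
print(matrix, accuracy(matrix))
```

Each diagonal entry of the matrix counts correctly classified test grains for one irrigation level; the average accuracy is the sum of the diagonal divided by the number of test grains.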
One-way analysis of variance (one-way ANOVA) was performed using Statistica 13 to determine differences in the texture attributes between sorghum grains belonging to different genotypes. The analysis was performed separately for each irrigation level at a significance level of P ≤ 0.05. The normality of distribution was tested using the Shapiro-Wilk, Lilliefors, and Kolmogorov-Smirnov tests. The homogeneity of variance was checked with the Brown-Forsythe test and Levene's test. The Newman-Keuls and the Kruskal-Wallis tests were applied.
Factorial ANOVA and clustering were performed using GenStat 12.0 software. A factorial ANOVA was performed to determine the statistical significance of the effects of water stress, genotype, classifier, and their interactions. Means were compared using Duncan's multiple range test at the 0.05 level of probability. Average linkage was applied as the criterion in hierarchical cluster analysis, with the Euclidean distance as the distance measure.
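As a worked illustration of the one-way ANOVA step, the sketch below computes the F statistic from the between-group and within-group sums of squares. It is a minimal example on hypothetical values of one texture feature for three genotypes; it does not compute the P value (which requires the F distribution) and does not replace the assumption checks or post hoc tests listed above.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values of one texture feature for three genotypes.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

Here the between-group sum of squares is 26 with 2 degrees of freedom and the within-group sum of squares is 6 with 6 degrees of freedom, so F = 13.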
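Average-linkage hierarchical clustering with the Euclidean distance can be sketched as below. This is a minimal illustration and not the GenStat routine: the points are hypothetical mean feature vectors, and merging stops at a requested number of clusters rather than at a similarity cut-off.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, n_clusters):
    """Agglomerative clustering; returns clusters as sorted lists of point indices.

    The distance between two clusters is the mean of all pairwise Euclidean
    distances between their members (average linkage).
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical mean feature vectors for five genotypes.
points = [(1.0, 1.0), (1.3, 0.9), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0)]
groups = average_linkage(points, n_clusters=3)
print(groups)
```

Cutting a dendrogram at a fixed similarity level, as done for the figures in this paper, corresponds to stopping the merging once the smallest inter-cluster distance exceeds the matching threshold.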

Results and discussion
The results of the discriminant analysis revealed that sorghum grains differed in the textures of the outer surface in the images depending on the irrigation level. The classification accuracies (%) of sorghum grains produced under normal irrigation and drought, based on the selected textures from all color channels (R, G, B, L, a, b, X, Y, Z, U, V, S) using the three analyzed classifiers, are presented in Table 1. Analysis of variance (ANOVA) on accuracy values revealed significant differences in discrimination accuracy among genotypes, irrigation levels, and the genotype × irrigation interaction (P < 0.001), while the effects of classifier and its interactions with genotype or irrigation level were not statistically significant (P ≥ 0.399) (Table 2). The ANOVA results in Table 2 confirm the statistical differences in accuracy of discrimination that are also shown in Table 1. To determine which genotypes, irrigation levels, and classifiers differed significantly from each other, Duncan's multiple range test was applied (Fig. 2). The highest discrimination accuracy based on textural features from color channels belonged to genotype MGS2 (99.9%), followed by genotypes TN-04-142 (97.4%) and KGS23 (97.3%). It can be concluded that textural features could not reliably distinguish the irrigation levels for genotype TN-04-86, which showed low classification accuracy under each level of irrigation. Moreover, a comparison of means revealed the highest accuracy for normal irrigation (90.6%) and the lowest for mild stress (83.4%) (Fig. 2). In the food industry, textural properties have been considered to be influenced by cultivar and maturity at harvest [25-28].
Abbreviations: SMO, sequential minimal optimization; Ir60, normal irrigation; Ir120, mild deficit irrigation; Ir180, severe deficit irrigation.

The highest accuracy under normal irrigation was attained by genotypes MGS2, KGS23, TN-04-134, TN-04-142, and TN-04-79, ranging from 98% to 100%. The highest accuracy under mild stress was associated with genotype MGS2, while under severe water stress the highest values were related to genotypes MGS2, KGS23, and TN-04-142 (Fig. 2).
For discrimination of different sorghum genotypes, large sets of textural features from the color channels were selected using the WEKA application: 139 textures in the case of normal irrigation, 120 textures for grains subjected to maternal mild deficit irrigation, and 109 textures for grains produced under severe deficit irrigation. Four textural features, RHMean, BHMean, bHMean, and UHMean, were chosen to exemplify the discriminatory power of textural features to differentiate sorghum grains of the genotypes produced under different levels of irrigation (Table 3). The mean values of these features for each level of irrigation, with the results of the comparison of means, are presented in Table 3. According to these mean values, most of the genotypes formed separate homogeneous groups. However, some genotypes formed a common group. For instance, the 10 genotypes were categorized into eight groups for the RHMean parameter under normal irrigation, as genotypes TN-04-59 and TN-04-86 were placed in the same group, as were genotypes TN-04-79 and TN-04-142. The scatter plots for selected variables (VHMean vs. SHMean) between the genotypes are shown in Fig. 3. Separation of the analyzed genotypes is more evident under normal irrigation and severe water stress than under mild water stress. This confirms the classification accuracies presented in Table 1. Clustering the textural features of the grains by hierarchical cluster analysis, with the Euclidean distance as the similarity index, placed the genotypes in different groups (Fig. 4). The figure indicates that the 10 sorghum genotypes could be divided into three groups for the grains produced under normal irrigation and severe deficit irrigation, while the genotypes whose grains were produced under mild water stress could be categorized into four groups (slicing at similarity = 0.8).
Moreover, water stress can create textural changes of different degrees on the surfaces of grains, which might change the similarities and dissimilarities between the genotypes. For instance, under normal irrigation, genotype TN-04-79 was clustered into a group along with TN-04-134 and TN-04-142, while under mild and severe water stress it had the greatest distance from the other genotypes and clustered alone (Fig. 4).
In the following steps, the discrimination analyses were performed for the Lab color space, for which the highest accuracies among the individual color spaces were obtained (Table 4). The discriminatory power of the selected textural features from the Lab color space was slightly lower than that of the features selected from all color channels. Based on the selected features from the Lab color space, sorghum grains subjected to different maternal irrigation levels were discriminated most efficiently by the SMO classifier, with an average accuracy of up to 99.67% in the case of genotype MGS2. Very high accuracies were also obtained for KGS23 (up to 92.33%), TN-04-79 (up to 95.00%), TN-04-134 (up to 96.67%), and TN-04-142 (up to 97.67%). The lowest accuracies, ranging from 54.00% to 56.67%, were determined for genotype TN-04-86 (Table 4).
The results indicated that image processing based on textural parameters is useful for discriminating between grains produced under normal irrigation and those harvested under deficit irrigation. Texture parameters, considered as quantitative values calculated from the images, may provide significant and reliable data on the structure and quality of tested objects. Textures make it possible to notice changes that are difficult to observe visually. Digital images may have different textures if the distribution of color is dissimilar, even though they have the same number of pixels and the same color histograms [29-31]. This is why image analysis based on texture parameters of grain images is important. The applied procedure provided objective results independent of the human factor, and it may be useful in practice to screen for healthy sorghum grains harvested from fields with optimum irrigation management.
Abiotic stresses significantly affect the reproductive growth of various crops, resulting in a reduction in ultimate economic yields. Drought stress can decrease grain yield by affecting processes such as fertilization, nutrient assimilation, and mobilization to reproductive organs, as well as seed development [13]. Water stress during the seed filling stage may negatively influence seed weight and seed composition, resulting in a reduction in quantitative and qualitative yield [32]. Water stress can prevent the accumulation of various nutrients in seeds, primarily protein and starch [33,34]. Although sorghum crops are tolerant to drought stress, the quantity and quality of the grain yield can be adversely affected when the crop is exposed to water stress [1]. The traceability of the effects of biotic and abiotic stress on phenotypic characteristics of crops in the field through computer vision, machine learning, and image processing techniques has been demonstrated in several reports [35-37]. Crops display several morphological, physiological, and biochemical mechanisms to mitigate the effect of water stress, resulting in different phenotypes that can be differentiated using feature extraction methods [38].
Sorghum may be used for human food, for feed, or for export [39]. Water stress could result in the deterioration of grain quality and of grain products. Therefore, a method to trace which grains are produced under well-watered conditions is vital in the food industry.
[Fig. 3 residue: scatter plot of VHMean vs. SHMean; legend entries G1T120 to G10T120]

Conclusions
The possibility of applying image processing to distinguish seeds from maternal plants grown under different levels of irrigation has not been studied previously. Irrigation applied at three levels considerably affected the analyzed texture parameters of sorghum grains. This was shown by discrimination methods, ANOVA, and clustering. The discriminatory power of textural features for the identification of sorghum grains produced under normal irrigation and water stress was genotype-dependent. The accuracy of discrimination based on color channels ranged from 99.9% for genotype MGS2 down to 80.0% and 57.8% for genotypes TN-04-90 and TN-04-86, respectively. Averaged over all genotypes, the highest accuracy was obtained for normal irrigation (90.6%) and the lowest for mild stress (83.4%). There were no significant differences in the performance of the classifiers, with mean accuracies of 87.3%, 87.8%, and 87.4% for Bayes Net, SMO, and Random Forest, respectively.