Integrated selection criteria in sugarcane breeding programs using discriminant function analysis

Abstract

Background

Selection indices help plant breeders to identify desirable genotypes on the basis of phenotypic performance. The present study therefore evaluated thirty sugarcane genotypes (clones) along with two check cultivars over two cropping seasons at Mattana Agricultural Research Station.

Results

The studied traits differed significantly among all genotypes. Eleven traits were used to discriminate between low and high sugar yield genotypes: sugar yield (ton/fed), cane yield (ton/fed), number of stalks/m2, stalk weight (kg), stalk height (cm), stalk diameter (cm), number of internodes, Brix %, sucrose %, purity %, and sugar recovery %. High sugar yield genotypes were selected by discriminant analysis. The discriminant score (DS) explained 79.2% of the variation in sugar yield and had a significant canonical correlation (0.89**). Discriminant function analysis (DFA) indicated that the most important traits, in order of appearance, were stalk weight, stalk height, purity %, Brix %, and cane yield.

Conclusions

Genotypes G.2017-43, G.2017-42, G.2017-29, G.2017-33, and G.2017-44 showed the highest values of the discriminant score and were recognized as the highest-yielding sugarcane genotypes. In contrast, genotypes G.2017-30, G.2017-10, G.2017-27, G.2017-25, G.2017-70, G.2017-41, G.2017-40, G.2017-35, and G.2017-58 had the lowest values of the discriminant score and were recognized as the lowest-yielding sugarcane genotypes.

Background

Sugarcane (Saccharum sp. hybrids) is one of the world’s main crops and the major source of sugar and ethanol (Silva et al., 2016). It is the world’s most-produced crop by total production and ranks among the ten most widely grown crops worldwide. Global sugarcane production in 2016–2017 was 1.9 billion tons, grown in approximately 100 countries on an area of ~ 26 million hectares (FAOSTAT, 2018). The early stages of sugarcane breeding are commonly associated with low accuracy and selection efficiency because of the large number of genotypes and their interactions with the environment. In general, methods used to predict genotypic and breeding values provide paths to more accurate selection (Piepho et al., 2008). Seema et al. (2020) noted that sugarcane breeding programs have sought new analytical methodologies to optimize the process of obtaining and selecting superior genotypes, in order to develop genetic materials with high yield and agronomic traits of interest. The use of selection indices is an alternative method recommended by breeders. A selection index selects plants for crop improvement on the basis of several important characters. This method was proposed by Smith (1937) using the discriminant function of Fisher (1936).

Also, Smith (1937) suggested that a better way of exploiting genetic correlation with several traits having high heritability is to construct an index, called selection index, which combines information on all the characters associated with the dependent variable like yield. Thus selection index refers to a linear combination of characters associated with yield. The best-known selection indices involve discriminant functions based on the relative economic importance of various characters. The discriminant function analysis measures the efficiency of various character combinations in selection. The selection index leads to simultaneous manipulation of several characters for genetic improvement of economic yield. This technique provides information on yield components and thus aids in indirect selection for genetic improvement.

The discriminant analysis provides an equation that gives maximum separation of high- and low-yield genotypes (Abdolshahi et al., 2015). The linear discriminant analysis can be used not only to examine multivariate differences between groups but also to determine which variables are the most useful for discriminating between groups and also whether one subclass of variables works as well as another and which groups are similar and which are different (Hadavani et al., 2018; Patel and Raval, 2018).

The aim of the present study was to develop a selection index approach that considers the information of several sugarcane traits using the discriminant analysis to better understand the relationship between the traits and sugar yield and find a rank to select superior genotypes in sugarcane breeding.

Methods

Experimental design and plant materials

This study was conducted at Mattana Agricultural Research Station (latitude 25° 17′ N, longitude 32° 33′ E), Luxor Governorate. The climate of Luxor is classified by the Köppen-Geiger system as desert, with rainfall of about 2 mm/year, a summer mean temperature of 32.4 °C, a winter mean temperature of 23.2 °C, and relative humidity of 61.6% during 2017–2018 and 2018–2019. Plant materials comprised 30 sugarcane genotypes, which were tested along with two commercial check cultivars (GT-54-9 and Ph8013) (Table 1). Genotypes were grown in a randomized complete block design with three replications. The plot area was 15 m2, consisting of 3 rows of sugarcane, each 5 m long and spaced 1.0 m apart. Planting was done in March 2017 using fifteen 3-budded cane pieces in each row. The field was irrigated right after planting and all other agronomic practices were carried out as recommended. Some physical and chemical properties of representative soil samples taken from the experimental site before sowing in the 2017 and 2018 seasons are shown in Table 2. The plant cane was allowed to ratoon after harvest. Both the plant cane and its first ratoon crop were harvested at the age of 12 months. At harvest, a sample of ten stalks from each plot was collected to determine the following traits:

  1. Number of stalks per m2 (NSm−2)

  2. Stalk weight (kg) (SW)

  3. Stalk height (cm), which was measured from soil surface to the visible dewlap (SH)

  4. Stalk diameter (cm), which was measured at the middle part of stalk (SD)

  5. Number of internodes (NI)

  6. Brix (total soluble solids %), which was determined using a hydrometer (Br)

  7. Sucrose percentage, which was determined using automatic saccharimeter, according to A.O.A.C. (1980) (SC)

  8. Juice purity %, which was estimated as (sucrose % × 100)/Brix % (Pr %)

  9. Sugar recovery %, which was calculated according to the formula described by Yadav and Sharma (1980): [Sucrose % − 0.4 (Brix % − sucrose %)] × 0.73 (SR)

  10. Cane yield/fed (ton), which was calculated on plot basis (one feddan = 0.42 ha) (CY)

  11. Sugar yield/fed (ton), which was estimated by multiplying net cane yield/fed (ton) by sugar recovery % (SY)
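The three derived traits in the list above (juice purity, sugar recovery, and sugar yield) are simple arithmetic on the measured values. The following is a minimal sketch of those formulas; the function names and the Brix/sucrose readings are illustrative assumptions, and dividing by 100 in the sugar yield step is assumed because recovery is expressed as a percentage.

```python
def juice_purity(sucrose_pct, brix_pct):
    """Juice purity % = (sucrose % x 100) / Brix %."""
    return sucrose_pct * 100.0 / brix_pct


def sugar_recovery(sucrose_pct, brix_pct):
    """Sugar recovery % = [sucrose % - 0.4 (Brix % - sucrose %)] x 0.73."""
    return (sucrose_pct - 0.4 * (brix_pct - sucrose_pct)) * 0.73


def sugar_yield(cane_yield_ton_fed, recovery_pct):
    """Sugar yield (ton/fed); recovery is a percentage, hence the /100 (assumed)."""
    return cane_yield_ton_fed * recovery_pct / 100.0


# Hypothetical juice readings, for illustration only.
brix, sucrose = 20.0, 16.0
purity = juice_purity(sucrose, brix)
recovery = sugar_recovery(sucrose, brix)
print(round(purity, 2), round(recovery, 2))

# Combined-season means reported in the Results for the check cultivar G.T.54-9.
print(round(sugar_yield(54.30, 10.41), 2))
```

With the combined means reported for G.T.54-9 (54.30 ton/fed cane yield and 10.41% sugar recovery), this arithmetic gives about 5.65 ton/fed, which matches the sugar yield reported for that cultivar.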

Table 1 Genotype code, pedigree of the studied sugarcane genotypes
Table 2 Some physical and chemical properties of representative soil samples of the experimental site before sowing for 2017/2018 and 2018/2019 seasons

Statistical analysis

Regular analysis of variance for the randomized complete block design (RCBD) and combined analyses of variance of the collected data were run as outlined by Gomez and Gomez (1984), who stated that the combined analysis can be applied if the coefficient of variation (CV %) of the individual experiments is lower than 20%. Simple correlation coefficients between all pairs of the studied characters were computed according to Gomez and Gomez (1984), using RStudio statistical software (version 3.6.1). The selection index developed by Smith (1937) using the discriminant function approach (Fisher, 1936) was used to discriminate the genotypes based on all the characters.
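The two quantities named in this paragraph, the coefficient of variation and the simple correlation coefficient, can be sketched as follows. This is an illustration only, not the authors' R workflow: the function names and all numeric inputs are assumptions, and the CV is taken in its usual form as the square root of the error mean square divided by the grand mean.

```python
import math


def coefficient_of_variation(error_mean_square, grand_mean):
    """CV % = sqrt(error mean square) / grand mean x 100."""
    return math.sqrt(error_mean_square) / grand_mean * 100.0


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two traits."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical ANOVA summary: a CV below 20 would permit the combined analysis.
cv = coefficient_of_variation(error_mean_square=4.0, grand_mean=40.0)
print(round(cv, 2))

# Hypothetical paired trait values.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))

# Assumes the correlations are based on the 32 genotype means (not stated in the text).
print(round(t_statistic(0.47, 32), 2))
```

Under the assumption of n = 32, a coefficient of 0.47 gives a t value of about 2.9, which exceeds the tabulated two-sided 1% critical value of roughly 2.75 for 30 degrees of freedom.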

The 30 sugarcane genotypes and two check cultivars (Table 1) were ranked by average sugar yield over the 2 years; the 16 highest-yielding entries formed group one and the remaining 16 formed group two. Discriminant function analysis (DFA) was then performed on these two groups using SPSS software version 14 (Table 6).
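The grouping step described here amounts to ranking the entries by mean sugar yield and splitting the ranking in half. A minimal sketch follows; the entry names and yields are made up for illustration, and the function name is an assumption.

```python
def split_into_yield_groups(mean_sugar_yield, n_high):
    """Rank entries by mean sugar yield and label the top n_high as group 1."""
    ranked = sorted(mean_sugar_yield, key=mean_sugar_yield.get, reverse=True)
    return {name: (1 if position < n_high else 2)
            for position, name in enumerate(ranked)}


# Hypothetical two-season mean sugar yields (ton/fed) for six made-up entries.
mean_sugar_yield = {
    "entry_a": 5.1, "entry_b": 2.9, "entry_c": 4.4,
    "entry_d": 3.2, "entry_e": 4.8, "entry_f": 2.5,
}
groups = split_into_yield_groups(mean_sugar_yield, n_high=3)
print(groups)
```

In the study itself the same split would use 32 entries with `n_high=16`.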

Discriminant function analysis (DFA) provides an equation that gives maximum separation or discrimination between two groups of genotypes. All trait values were standardized before running the discriminant analysis. Also, the main terms related to DFA in our study were as follows:

  1. Independent variables (eleven measured traits): these are the discriminating variables or “Predictors”

  2. Dependent variable (two groups of 30 genotypes and two checks): this is the grouping variable, which is the object of classification efforts

  3. Discriminant function: it is a latent variable, which is created as a linear combination of discriminating (independent) variables

For the purpose of this study, the DFA was used to determine which traits (independent variables) discriminate between two groups of genotypes. In simple words, the discriminant function can be thought of as a multiple regression equation. Accordingly, the latent variables which are created as a linear combination of discriminating (independent) variables would be as follows:

$$ {D}^2=a+{b}_1\ast {X}_1+{b}_2\ast {X}_2+\dots +{b}_n\ast {X}_n $$
(1)

where D² is the discriminant function, i.e., the predicted score (discriminant score); a is the intercept; b1 through bn are the discriminant coefficients (analogous to regression coefficients); and X1 through Xn are the discriminating variables.

The contribution of each variable to the discrimination between groups is determined by its standardized discriminant coefficient (b1 to bn) in each discriminant function. A larger standardized coefficient, in absolute value, indicates a greater contribution of the respective variable to separating the groups of genotypes.
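To make the linear combination in Eq. (1) concrete, the sketch below fits a two-group Fisher discriminant in plain Python on a small, entirely hypothetical data set with two traits. It is not the SPSS procedure used in the study: the weights are obtained from the pooled within-group covariance matrix, so the scores differ from SPSS canonical scores by a scale factor, and all names and numbers are assumptions for illustration.

```python
import math


def standardize(rows):
    """Convert each column to z-scores (mean 0, sample SD 1)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def group_mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(g1, g2):
    """Pooled within-group covariance matrix of two groups."""
    p = len(g1[0])
    s = [[0.0] * p for _ in range(p)]
    for rows in (g1, g2):
        m = group_mean(rows)
        for r in rows:
            d = [r[j] - m[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    s[a][b] += d[a] * d[b]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fisher_discriminant(g1, g2):
    """Return (a, [b1..bn]) so that a + sum(b_i * x_i) is zero midway between groups."""
    m1, m2 = group_mean(g1), group_mean(g2)
    weights = solve(pooled_covariance(g1, g2), [u - v for u, v in zip(m1, m2)])
    midpoint = [(u + v) / 2 for u, v in zip(m1, m2)]
    intercept = -sum(w * c for w, c in zip(weights, midpoint))
    return intercept, weights


def discriminant_score(intercept, weights, x):
    return intercept + sum(w * v for w, v in zip(weights, x))


# Hypothetical plot means: [stalk weight (kg), cane yield (ton/fed)].
high = [[1.10, 50.0], [1.05, 48.0], [1.12, 52.0], [1.00, 47.0]]
low = [[0.85, 40.0], [0.90, 42.0], [0.80, 38.0], [0.88, 43.0]]

z = standardize(high + low)
z_high, z_low = z[:len(high)], z[len(high):]
intercept, weights = fisher_discriminant(z_high, z_low)
high_scores = [discriminant_score(intercept, weights, x) for x in z_high]
low_scores = [discriminant_score(intercept, weights, x) for x in z_low]
print([round(s, 2) for s in high_scores])
print([round(s, 2) for s in low_scores])
```

Because the traits are standardized and the two groups are of equal size, the intercept in this sketch is numerically zero, so positive scores fall on the high-yield side and negative scores on the low-yield side.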

The first statistic from the DFA is the eigenvalue of each discriminant function. In this investigation there is only one discriminant function because only two groups are used, namely “group 1” (high yielders) and “group 2” (low yielders), so a single eigenvalue is displayed. It reflects the importance of the function in classifying cases of the dependent variable (groups), that is, the percent of variance explained, which cumulates to 100% for the single function.

The canonical correlation measures the association between the discriminant scores, computed from the predictors (eleven measured traits), and group membership. It provides an index of overall model fit, and its square (R²) is interpreted as the proportion of variance explained.

The second statistic is Wilks’ lambda, which is used to test the significance of the discriminant function as a whole. Wilks’ lambda ranges between 0 and 1; a significant value close to 0 means that the DFA differentiates the genotypes into the two groups well, whereas a value close to 1 means that it does not. It therefore gives the proportion of variance in the dependent variable (two groups of 30 genotypes and two check cultivars) that is not explained by the discriminant function.
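The eigenvalue, the canonical correlation, and Wilks’ lambda described above are linked by standard identities when there is a single discriminant function. Writing the eigenvalue as λ, the canonical correlation as R_c, and Wilks’ lambda as Λ (symbols introduced here for illustration):

$$ {R}_c^2=\frac{\lambda }{1+\lambda },\kern2em \Lambda =\frac{1}{1+\lambda }=1-{R}_c^2 $$

As a worked example, the canonical correlation of 0.89 reported in the Abstract gives R_c² ≈ 0.792, i.e., the 79.2% of explained variation, and Λ ≈ 0.208. This is close to the Wilks’ lambda of 0.22 reported later in the Results; the small gap is consistent with the canonical correlation having been rounded to two decimals.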

The DFA output also includes two important items: the standardized canonical discriminant function coefficients and the structure matrix. The first indicates the relative contribution of each variable to the discriminant function. The structure matrix offers another way of investigating the relationship between the discriminating variables and the discriminant function. Finally, the discriminant scores were obtained as a weighted linear combination (sum) of the discriminating variables. The genotypes were ranked on these discriminant scores, which served as the selection index.

Results

The coefficient of variation (CV %) of each individual experiment was lower than 20%, which permits a combined analysis as proposed by Gomez and Gomez (1984).

Mean performance

The mean values of sugar yield and its related characters for the thirty sugarcane genotypes and two check cultivars in the plant cane crop, the first ratoon crop, and across the two seasons are given in Tables 3, 4, and 5. There were significant differences among genotypes (clones), seasons, and their interaction for all studied characters, with the following exceptions: the seasonal effect was not significant for the number of internodes per plant and sugar yield (tons per fed), and the interaction between seasons and genotypes was not significant for stalk diameter, stalk weight, cane yield, and sugar yield. A non-significant interaction between genotype and season means that the sugarcane genotypes behaved similarly in the two seasons, so it is sufficient to discuss the combined averages across the two seasons. The coefficient of variation (CV %) for all studied characters was within the statistically acceptable range (less than 20%). The first season had higher mean values than the second season for all studied traits except the number of stalks m−2 and cane yield.

Table 3 Mean performance of 30 sugarcane genotypes and two check cultivars for growth characters in the first, second, and across the two seasons
Table 4 Mean performance of 30 sugarcane genotypes and two check cultivars for quality characters in the first, second, and across the two seasons
Table 5 Mean performance of 30 sugarcane genotypes and two check cultivars for stalk weight, cane yield, and sugar yield in the first, second, and across the two seasons

Means listed in Tables 3, 4, and 5 indicated that the number of stalks per m2 is an important character for cane yield. The number of stalks per m2 (averaged over the two seasons) ranged from 14.33 for G.2017-16 to 27.83 stalks m−2 for G.2017-59. Clones G.2017-25 and G.2017-59 gave the maximum number of stalks/m2 (27 and 27.83, respectively), without significant differences from the check cultivars. In the 1st season, the maximum number of stalks/m2 was produced by genotypes G.2017-25, G.2017-42, and G.2017-59, which differed significantly from the check cultivar Ph.8013; in the 2nd season, it was produced by genotypes G.2017-17 and G.2017-43, which did not differ significantly from Ph.8013.

With respect to stalk length (Tables 3, 4, and 5), the tallest plants across the two seasons were those of genotypes G.2017-35 (307.83 cm) and G.2017-44 (304.83 cm), but they did not surpass the two check cultivars. However, in the 1st season, genotypes G.2017-17, G.2017-18, and G.2017-42 surpassed the check cultivar G.T.54-9 in stalk length. In the 2nd season, the tallest genotypes (G.2017-25, G.2017-37, G.2017-41, and G.2017-58) differed significantly from the check cultivar Ph.8013.

None of the tested genotypes surpassed the check cultivars in mean stalk diameter in the 1st season, the 2nd season, or across the two seasons. The largest stalk diameter was obtained by the check cultivar G.T.54-9 (2.87, 2.77, and 2.82 cm, respectively), with a non-significant interaction between seasons and genotypes.

Regarding the number of internodes per stalk (Tables 3, 4, and 5), there were significant differences among sugarcane genotypes. Values ranged from the lowest, recorded by G.2017-59 (13.33, 13, and 13.17), to the highest, recorded by G.2017-10 (22.67, 23.33, and 23), in the 1st season, the 2nd season, and across both seasons, respectively, with significant differences from the other sugarcane genotypes.

Data in Tables 3, 4, and 5 revealed that genotype G.2017-27 had the highest Brix percent (22.41%) across the two seasons, followed by G.2017-35 (22.08%), with no significant differences from the two check cultivars G.T.54-9 (21.65%) and Ph.8013 (20.10%). In the 1st season, the highest Brix percent was obtained by G.2017-35 (23.82%) and G.2017-52 (23.85%), while genotypes G.2017-27 and G.2017-59 gave the maximum Brix percent of 22% in the 2nd season.

Concerning sucrose % (Tables 3, 4, and 5), genotypes G.2017-29 and G.2017-33 and the check cultivar G.T.54-9 had the maximum sucrose percentages across the two seasons (15.62, 15.99, and 16.37%, respectively). In the 1st season, genotype G.2017-63 and the check cultivar G.T.54-9 produced the highest sucrose % (17 and 17.07%, respectively). In the 2nd season, genotype G.2017-33 and the check cultivar G.T.54-9 gave the maximum sucrose % (15.39 and 15.67%, respectively). The considerable variability among the studied genotypes provides a good opportunity to improve sucrose content.

Across the two seasons, data in Tables 3, 4, and 5 showed that the highest purity % values were obtained by genotypes G.2017-33 and G.2017-63 and the two check cultivars G.T.54-9 and Ph.8013 (81.93, 76.88, 75.67, and 75.76%, respectively), with no significant differences among them. In the 1st season, genotypes G.2017-16 (80.74%) and G.2017-42 (78.36%) and the check cultivar G.T.54-9 (78.62%) gave the maximum purity %, while in the 2nd season the highest values were obtained by genotypes G.2017-33, G.2017-44, and G.2017-63 (86.13, 79.03, and 77.51%, respectively).

Genotype G.2017-33 and the check cultivar G.T.54-9 surpassed the others in sugar recovery %, recording 10.71, 10.51, and 10.61% and 11.09, 9.72, and 10.41%, respectively, in the first season, the second season, and combined over both seasons.

Cane and sugar yields are the final expressions of most physiological processes, which interact with the weather and environment during growth. Variation in stalk weight, cane yield, and sugar yield among the studied sugarcane genotypes was relatively high, as shown in Tables 3, 4, and 5.

The check cultivars G.T.54-9 and Ph.8013 gave the maximum values of stalk weight, cane yield, and sugar yield (tons fed-1) in the 1st and 2nd seasons and across the two seasons, with significant differences from the other genotypes. G.T.54-9 gave the maximum cane and sugar yields (54.30 and 5.65 ton fed-1, respectively) across the two seasons, significantly higher than Ph.8013, which recorded 46.44 and 4.51 ton fed-1, respectively. No significant difference was found between G.T.54-9 and Ph.8013 in stalk weight (1.12 and 1.13 kg, respectively) across both seasons.

Simple correlation coefficient

The coefficients of correlation between all pairs of the studied traits were computed and are illustrated graphically in Fig. 1. The distribution of each variable is shown on the matrix diagonal. The bivariate scatter plots with a fitted line are displayed below the diagonal, while the correlation values, with significance levels marked by stars, are shown above the diagonal.

Fig. 1

Correlation matrix among SY, sugar yield (ton/fed); CY, cane yield (ton/fed); NSm−2, number of stalk m−2; SW, stalk weight; SH, stalk height/cm; SD, stalk diameter/cm; NI, number of internodes; Br, Brix %; Sc, sucrose %, Pr, purity %; and SR, sugar recovery %

Data in Fig. 1 showed that sugar yield (SY) was positively and highly significantly correlated with the number of stalks/m2 (r = 0.77**), stalk weight (r = 0.84**), stalk height (r = 0.53**), purity % (r = 0.47**), sugar recovery % (r = 0.58**), and cane yield (r = 0.92**). Cane yield was positively and highly significantly correlated with the number of stalks per m2 (0.84**), stalk weight (0.77**), and stalk height (0.53**). Sugar recovery % showed positive and highly significant correlations with stalk weight (0.53**), stalk diameter (0.57**), sucrose % (0.94**), and juice purity % (0.48**). Sucrose % was positively and highly significantly correlated with stalk weight (0.48**), stalk diameter (0.59**), and Brix (0.66**).

There was also a significant negative correlation between the number of stalks per m2 and stalk weight (r = − 0.48**), while the number of stalks per m2 was positively correlated with stalk height (0.54**). A highly significant positive correlation was found between stalk weight and stalk diameter (0.51**).

Discriminant function analysis

Based on average yield over 2 years, the sugarcane genotypes were ranked in descending order of sugar yield. The 16 highest-yielding genotypes were selected as group one (high yielder genotypes) and the remaining 16 genotypes as group two (low yielder genotypes). The results of the discriminant analysis are presented under the following headings.

The group statistics and tests of equality of group means

Table 6 shows the means and standard deviations of the studied traits for the two groups, together with Wilks’ lambda tests of the differences between the two groups; proceeding further with the analysis would not be meaningful if there were no significant group differences. Examining the group means and standard deviations helps to give a rough idea of the variables that may be important. Wilks’ lambda is of great analytic importance: the smaller the Wilks’ lambda, the more important the independent variable (measured trait) is to the discriminant function.

Table 6 Comparing between the two supposed groups (high and low yielder genotypes) using a test of equality of group means

Wilks’ lambda ranges between 0 and 1. Values close to 0 indicate different group means while the values close to 1 indicate that the two group means are not different (equal to 1 indicates all means are the same).
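For a single trait and two groups, the Wilks’ lambda in a test of equality of group means is the ratio of the within-group sum of squares to the total sum of squares, and it converts to an F statistic with 1 and n − 2 degrees of freedom. A minimal sketch, using hypothetical trait values rather than the data in Table 6:

```python
def univariate_wilks_lambda(group1, group2):
    """Return (Wilks' lambda, F) for one trait measured in two groups."""
    values = group1 + group2
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_within = 0.0
    for group in (group1, group2):
        mean = sum(group) / len(group)
        ss_within += sum((v - mean) ** 2 for v in group)
    wilks = ss_within / ss_total
    f_stat = (1.0 - wilks) / wilks * (n - 2)  # df = (1, n - 2) for two groups
    return wilks, f_stat


# Hypothetical trait with clearly different group means (e.g., stalk weight, kg).
lam_sep, f_sep = univariate_wilks_lambda([1.10, 1.05, 1.12, 1.00],
                                         [0.85, 0.90, 0.80, 0.88])
# Hypothetical trait with identical group means (e.g., Brix %).
lam_flat, f_flat = univariate_wilks_lambda([20.0, 21.0, 19.0, 22.0],
                                           [21.0, 20.0, 22.0, 19.0])
print(round(lam_sep, 3), round(f_sep, 1))
print(round(lam_flat, 3), round(f_flat, 1))
```

The first trait yields a lambda near 0 with a large F, and the second yields a lambda of 1 with an F of 0, which illustrates the two ends of the range described above.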

According to Wilks’ lambda, the two groups differed significantly for all studied traits except the number of stalks per square meter, stalk diameter, number of internodes, Brix %, sucrose %, and sugar recovery %. The traits with significant differences may therefore be good discriminators between the two genotype groups.

Standardized canonical discriminant function coefficient and structure matrix

First, the standardized canonical discriminant function coefficients (b) in Table 7 were used to build the following highly significant discriminant function model:

$$ {\mathrm{DS}}^2=5.53\,\mathrm{CY}-4.65\,\mathrm{SY}+3.54\,\mathrm{Pr}+2.11\,\mathrm{Br}+0.58\,\mathrm{SW}-0.52\,\mathrm{SH}-0.27\,\mathrm{NI}-0.22\,{\mathrm{NSm}}^{-2}+0.134\,\mathrm{Sc}+0.043\,\mathrm{SD} $$
(2)
Table 7 Standardized canonical discriminant function coefficients and structure matrix

where DS² is the discriminant score, CY is cane yield (ton/fed), SY is sugar yield (ton/fed), Pr is purity %, Br is Brix %, SW is stalk weight, SH is stalk height, NI is number of internodes, NSm−2 is number of stalks/m2, Sc is sucrose %, and SD is stalk diameter.

Discriminant analysis not only describes numerically the general distance between the clones through the discriminant score (DS²), but also identifies the characters that distinguish the genotypes. The genotypes can be classified using these characters, weighted by the coefficients of the canonical discriminant function. A character whose coefficient exceeds 0.5 in absolute value is regarded as a distinguishing factor (Tatsuoka 1971).
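Applying the ± 0.5 rule to the standardized coefficients of Eq. (2) is a simple filter on absolute values. The sketch below copies the coefficients from Eq. (2); the variable names are illustrative.

```python
# Standardized canonical discriminant function coefficients copied from Eq. (2).
coefficients = {
    "CY": 5.53, "SY": -4.65, "Pr": 3.54, "Br": 2.11, "SW": 0.58,
    "SH": -0.52, "NI": -0.27, "NSm-2": -0.22, "Sc": 0.134, "SD": 0.043,
}

THRESHOLD = 0.5  # cut-off for a distinguishing factor, as stated in the text

# Keep traits whose coefficient exceeds the threshold in absolute value,
# ordered from the largest to the smallest absolute coefficient.
distinguishing = [trait for trait, b in coefficients.items() if abs(b) > THRESHOLD]
distinguishing.sort(key=lambda trait: abs(coefficients[trait]), reverse=True)
print(distinguishing)
```

The filter retains six traits (CY, SY, Pr, Br, SW, and SH), the same six that the Discussion names as the most distinguishing traits.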

The second item is the structure matrix. Its entries are interpreted like factor loadings, with 0.30 taken as the cut-off between important and less important variables. The independent variables with high structure matrix values contribute most to the separation of the groups.
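A structure matrix entry is commonly computed as the pooled within-group correlation between a trait and the discriminant scores. The sketch below shows that calculation for one trait; the trait values and scores are hypothetical, and the function name is an assumption rather than anything taken from the SPSS output.

```python
import math


def pooled_within_correlation(trait_by_group, score_by_group):
    """Correlation between a trait and the scores after centring within each group."""
    sxy = sxx = syy = 0.0
    for trait, scores in zip(trait_by_group, score_by_group):
        mean_t = sum(trait) / len(trait)
        mean_s = sum(scores) / len(scores)
        sxy += sum((t - mean_t) * (s - mean_s) for t, s in zip(trait, scores))
        sxx += sum((t - mean_t) ** 2 for t in trait)
        syy += sum((s - mean_s) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical stalk weights (kg) and discriminant scores for two groups of four.
stalk_weight = [[1.10, 1.05, 1.12, 1.00], [0.85, 0.90, 0.80, 0.88]]
scores = [[2.0, 1.4, 2.3, 0.6], [-1.7, -0.9, -2.4, -1.3]]

loading = pooled_within_correlation(stalk_weight, scores)
print(round(loading, 3), abs(loading) > 0.30)
```

A loading above the 0.30 cut-off, as in this made-up case, would mark the trait as an important contributor to the separation of the groups.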

The canonical correlation measures the degree of association between the discriminant scores, fitted from the independent variables (eleven measured traits), and group membership. As shown in Table 7, a canonical correlation of 0.89 suggests that the discriminant function model explains 79.2% (0.89² × 100) of the variation among the 32 genotypes. Table 7 also shows a small, highly significant value of Wilks’ lambda (0.22).

Discriminant score and ranking the aimed genotypes for yield potential based on all studied traits

Table 8 shows that genotype 18 (G.2017-42) is a low-yield genotype (group 2) but was classified by the discriminant function (Eq. (2)) into group 1 (high yield). Other genotypes, such as G31 (G.T.54-9), G32 (Ph.8013), and G19 (G.2017-43), were classified into their expected groups. Thus, the correct classification rate was 31/32 = 96.8%. Ranked by discriminant score (Table 8 and Fig. 2), the superior genotypes (considering sugar yield and most studied traits) were G19, G31, G32, G18, G9, G12, G20, G13, G4, G27, G21, G25, G26, and G5 (G.2017-43, G.T.54-9, Ph.8013, G.2017-42, G.2017-29, G.2017-33, G.2017-44, G.2017-34, G.2017-17, G.2017-67, G.2017-52, G.2017-63, G.2017-65, and G.2017-18), with discriminant scores of 3.47, 3.28, 2.86, 2.82, 2.63, 2.34, 1.88, 1.73, 1.41, 1.38, 1.31, 1.11, 1.07, and 0.81, respectively. The lowest discriminant scores were recorded by G10, G1, G8, G6, G29, G17, G16, G14, and G23 (G.2017-30, G.2017-10, G.2017-27, G.2017-25, G.2017-70, G.2017-41, G.2017-40, G.2017-35, and G.2017-58), with values of − 3.40, − 3.39, − 2.81, − 2.79, − 2.66, − 2.10, and − 1.96. Plotting the discriminant scores helps researchers to easily visualize and select the superior genotypes.

Table 8 Discriminant score and classification for two groups of high- and low-yield genotypes
Fig. 2

Ranking genotypes based on discriminant scores

Discussion

According to the data in Tables 3, 4, and 5, the number of stalks is very important in sugarcane because it is directly related to the final millable cane population at harvest. The variation in the number of stalks/m2 may be attributed to genetic differences among sugarcane genotypes in addition to their interaction with environmental conditions. Abu-Ellail (2015) reported varying responses of different sugarcane genotypes for the number of stalks per m2, and Singh and Dey (2002) attributed poor yield to reduced plant population caused by poor establishment of the plant crop or by pests and diseases. Genotypes G.2017-43, G.2017-42, G.2017-29, G.2017-33, and G.2017-44 showed the highest values of cane yield and sugar yield, owing to their performance and genetic makeup; moreover, they recorded the highest stalk weight, stalk diameter, and stalk number as well as the highest Brix and sucrose percentages. Abu-Ellail et al. (2018) found that crop cycles had a negative effect on cane and sugar yields; it is therefore important to study the characteristics associated with the best clones so that they can be used as selection criteria in the breeding program. These results agree with those of Masri et al. (2014) and Abu-Ellail et al. (2019), who found significant differences among the tested sugarcane genotypes for cane and sugar yields and other agronomic and physiological characters. In contrast, genotypes G.2017-30, G.2017-10, G.2017-27, G.2017-25, G.2017-70, G.2017-41, G.2017-40, G.2017-35, and G.2017-58 were recognized as the lowest-yielding sugarcane genotypes, owing to significant differences among genotypes in stalk length and diameter and to the interaction with seasons. Ahmed and Obeid (2012) and Milligan et al. (1996) suggested that stalk diameter is indicative of better cultivars. They also reported that the number of millable canes and stalk weight are the most useful and reasonable selection criteria for high cane yield. There were significant differences among genotypes for technological quality traits such as Brix and sucrose %, which are commonly used for the selection of genotypes. The best clones selected should perform well in a series of yield and quality-related traits such as sucrose and sugar recovery percentages. Highly significant genotype effects indicate differences that can be exploited during selection; apparent sucrose content in the juice and Brix % emerged as the traits most suitable for selection, while purity % was unsuitable (Zhou et al., 2012; Azeredo et al., 2017; Silva et al., 2017).

Data in Fig. 1 indicated that sugar yield had a strong and significant correlation with cane yield, followed by stalk weight, number of stalks per m2, and stalk height, in addition to purity % and sugar recovery %. Kwajaffa and Olaoye (2014), Masri et al. (2014), and Feven et al. (2018) found significant and positive correlations between sugar yield and these traits. Because sugarcane production is important to Egypt’s economy, improving production is essential. The high correlation of these traits with cane yield suggests that they are better criteria for improving yield.

Discriminant function analysis (DFA) is a better technique than multiple regression for improving yield (Farshadfar 2012). Because it is used to predict group membership, the two groups were first tested for significant differences on each of the independent variables (studied traits). Examining the group means and standard deviations helps to give a rough idea of the variables that may be important. Wilks’ lambda is of great analytic importance: it ranges between 0 and 1, and the smaller its value, the more important the independent variable (measured trait) is to the discriminant function. Values close to 0 indicate that the group means differ, while values close to 1 indicate that they do not (a value of 1 indicates that all means are the same). There were significant differences between the two groups for stalk weight, stalk height, purity %, cane yield, and sugar yield, suggesting that these traits may be good discriminators between the two genotype groups (Table 6).

Table 7 includes two items: the standardized canonical discriminant function coefficients and the structure matrix. The standardized coefficients, which are not affected by the units of measurement, indicate the contribution of each variable to the discriminant function. The structure matrix offers another way of examining the relationship between the studied traits and the discriminant function, and reflects the relative importance of each trait in separating the two yielding groups of genotypes. Consequently, six traits (CY, SY, Pr %, Br %, SW, and SH) emerged as the most distinguishing traits. The discriminant function discarded sugar recovery % to avoid multicollinearity, because this trait was computed from two other traits, sucrose % and Brix %.

Discriminant analysis complements univariate analysis of variance and regression and supports their interpretation, rather than replacing them. With a correct classification rate of 96.8%, DFA proved to be a powerful multivariate method for discriminating between sugarcane genotypes in group 1 (high yield) and group 2 (low yield). Cane yield, purity %, Brix %, stalk weight, and stalk height were identified as the most valuable traits, with high breeding value for improving sugar yield in sugarcane, because of their high coefficients (Eq. (2)), their Wilks’ lambda values (Table 6), and their positive significant correlations with yield (Fig. 1). Selection based on these traits can therefore improve sugar yield. The importance of these traits in sugarcane has been confirmed by Hiremath and Nagaraja (2016) and Mohammed et al. (2019). Researchers have previously assessed the importance of such traits for improving yield in maize (Ahmet 2012), wheat (Patel and Raval 2018; Abdolshahi et al. 2015), barley (Aram et al. 2018), green gram (Das and Baisakh 2019), melon (Naroui et al. 2017), vegetable cowpea (Sivakumar et al. 2017), and bean (Hadavani et al. 2018).

The integrated selection criterion proposed in this study (Eq. (2)) could explain 79.2% of the sugar yield variation between the two groups. The results in Table 7 give the relative importance of the predictors by identifying the largest loadings on the discriminant function. All the structure matrix values are relatively small (less than or equal to 0.30) except those for stalk weight, stalk height, purity %, and cane and sugar yields, indicating the effectiveness of these traits as discriminators between the high and low sugar yield genotype groups. Mohammed et al. (2019) observed that sugar yields have generally been improved through cane yield, stalk weight, and increased total biomass rather than directly by increasing sugar concentration in the stalks.

Wilks’ lambda measures how well the function separates the 32 genotypes into two yielding groups. It ranges between 0 and 1 and gives the proportion of total variability that is not explained by the discriminant function model, so smaller values indicate better separation. The results in Table 7 showed a small, highly significant value of Wilks’ lambda (0.22), which indicates the great distinguishing ability of the function.
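The link between Wilks’ lambda and the explained variation can be checked with a short calculation. With two groups there is a single discriminant function, and in that case Wilks’ lambda equals one minus the squared canonical correlation. The sketch below applies this identity to the canonical correlation of 0.89 reported in the abstract; the function name is a hypothetical helper. Because 0.89 is itself a rounded value, the computed lambda (about 0.21) differs slightly from the reported 0.22, presumably through rounding.

```python
def wilks_lambda_from_canonical_r(r_c: float) -> float:
    """Wilks' lambda for a single discriminant function: 1 - r_c squared."""
    return 1.0 - r_c ** 2

r_c = 0.89                                   # canonical correlation (abstract)
explained = r_c ** 2                         # share of variation explained
unexplained = wilks_lambda_from_canonical_r(r_c)

print(f"explained share = {explained:.3f}")
print(f"Wilks' lambda   = {unexplained:.3f}")
```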

The discriminant function technique involves developing selection criteria based on a combination of various characters and aids the breeder in indirect selection for genetic improvement in yield. In plant breeding, the selection index refers to a linear combination of characters associated with yield. The results of this study are in accordance with those of Muhammad et al. (2014), who reported that the important traits to consider for increasing sugar yield are cane productivity and stalk crop.

Discriminant scores are the predicted values obtained by fitting the discriminant function model. Based on these scores, all genotypes were classified and ranked into two classes, the high- and low-yielding groups (Table 8 and Fig. 2). Because the discriminant scores in Eq. (2) are calculated from standardized data, the cut-off value is zero: genotypes with a score above zero belong to group 1 (high-yield genotypes) and those with a score below zero belong to group 2 (low-yield genotypes). In this study, discriminant analysis was used as a powerful multivariate method to find an integrated selection criterion that uses all studied traits, not only yield. These results confirmed the efficiency of the proposed integrated selection criterion.
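The scoring and classification rule described above can be sketched as follows. This is an illustration under stated assumptions, not the fitted model: the trait values and the coefficients are hypothetical and are not those of Eq. (2), and the function names are invented for the example.

```python
import numpy as np

def discriminant_scores(X, coefs):
    """Standardize each trait column (z-scores), then form the linear score."""
    X = np.asarray(X, dtype=float)
    z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return z @ np.asarray(coefs, dtype=float)

def classify(scores, cutoff=0.0):
    """Group 1 (high yield) above the cut-off, group 2 (low yield) otherwise."""
    return np.where(scores > cutoff, 1, 2)

# Hypothetical genotypes: columns are stalk weight (kg) and stalk height (cm).
X = [[1.2, 280.0],
     [0.9, 250.0],
     [1.4, 300.0],
     [0.8, 240.0]]
coefs = [0.6, 0.4]  # hypothetical coefficients, NOT those of Eq. (2)

scores = discriminant_scores(X, coefs)
groups = classify(scores)
ranking = np.argsort(-scores)  # genotype indices from highest to lowest score

print("scores :", np.round(scores, 3))
print("groups :", groups)
print("ranking:", ranking)
```

Because every trait is centered before weighting, the scores average to zero across genotypes, which is why zero is a natural cut-off in this sketch.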

Conclusion

In the current investigation, the coefficient of variation (CV %) of the individual experiments was lower than 20%, which permits a combined analysis. Significant differences were found among genotypes (clones), seasons, and their interaction for all studied characters, with two exceptions: the seasonal effect was not significant for the number of internodes and sugar yield, and the season × genotype interaction was not significant for stalk diameter, stalk weight, cane yield, and sugar yield.

The first season had higher mean values than the second season for all studied traits except stalk diameter and sugar yield, reflecting seasonal differences.

Furthermore, the correlation coefficients showed that cane yield was the major factor contributing to sugar yield, followed by stalk weight, the number of stalks per m2, and stalk height, in addition to purity % and sugar recovery %.

The discriminant function model explains 79.2% of the variation among the 32 genotypes. Also, a small, highly significant value of Wilks’ lambda (0.22) indicates the great distinguishing ability of the function. Using the effective traits concurrently, discriminant scores (DS2) were calculated for all genotypes. This indicator could successfully discriminate between high and low sugar yield genotypes.

Consequently, traits such as stalk weight, stalk height, purity %, Brix %, and cane yield could be employed in future breeding programs to improve sugar yield. These traits are suitable because they are cheap and simple to measure, have positive and significant correlations with sugar yield, and have high coefficients in the discriminant function. In addition, this criterion could successfully separate high- and low-yield genotypes (Fig. 2). Genotypes G.2017-43, G.2017-42, G.2017-29, G.2017-33, and G.2017-44 showed the highest values of the discriminant score and were recognized as the highest-yielding sugarcane genotypes, while the genotypes named vis., G.2017-30, G.2017-10, G.2017-27, G.2017-25, G.2017-70, G.2017-41, G.2017-40, G.2017-35, and G.2017-58 showed the lowest values of the discriminant score and were recognized as the lowest-yielding sugarcane genotypes (Table 8 and Fig. 2).

Availability of data and materials

All data generated or analyzed during this study are included in this published article.

Abbreviations

  • Br: Brix %
  • Sc: Sucrose %
  • Pr: Purity %
  • SR: Sugar recovery %
  • CV %: Coefficient of variation
  • LSD: Least significant difference
  • NSm-2: Number of stalks/m2
  • SH: Stalk height (cm)
  • SD: Stalk diameter (cm)
  • NI: Number of internodes
  • SW: Stalk weight (kg)
  • CY: Cane yield (ton/fed)
  • SY: Sugar yield (ton/fed)
  • DFA: Discriminant function analysis
  • RCBD: Randomized complete block design

References

  1. Abdolshahi R, Nazari M, Safarian A, Sadathossini T, Salarpour M, Amiri H (2015) Integrated selection criteria for drought tolerance in wheat (Triticum aestivum L.) breeding programs using discriminant analysis. Field Crops Res 174:20–29

  2. Abu-Ellail FFB, Abd El-Azez YM, Bassiony NA (2019) Assessment of ratooning ability and genetic variability of promising sugarcane varieties under middle Egypt conditions. Electronic J. Plant Breed. 10(1):143–154

  3. Abu-Ellail FFB, Masri MI, El-Taib ABA (2018) Performance of some new sugarcane clones for yield and its components at two different crop cycles. Indian J. Sugarcane Technol. 33(1):27–34

  4. Ahmed AO, Obeid A (2012) Investigation on variability, broad sensed heritability and genetic advance in sugarcane (Saccharum spp). International J. Agri. Sci. 2(9):839–844

  5. Ahmet OZ (2012) Use of discriminant analysis for selection of hybrid maize parent lines. Turk. J. Agric. 36:533–542

  6. Aram A, Ezzat K, Asgar S, Mehdi Z (2018) Application of secondary traits in barley for identification of drought tolerant genotypes in multi-environment trials. AJCS 12(1):157–167

  7. Azeredo AAC, Bhering LL, Brasileiro BP, Cruz CD, Silveira LCI, Oliveira RA, BespalhokFilho JC, Daros E (2017) Comparison between different selection indices in energy cane breeding. Genet Mol Res 16(1):1–11

  8. Das TR, Baisakh B (2019) Selection indices and discriminant function analysis for grain yield in green gram (Vigna radiata (L.) Wilczek). E-planet 17(1):13–21

  9. FAOSTAT (2018) Food and Agricultural Organization: Sugarcane production countries. Available online: http://www.fao.org/faostat (accessed on 10 January 2020).

  10. Farshadfar E (2012) Application of integrated selection index and rank-sum for screening drought-tolerant genotypes in bread wheat. Int J Agric Crop Sci. 4(6):325–332

  11. Feven M, Hussein M, Esayas T (2018) Correlation of traits among cane yield and its component in sugarcane (Saccharum Spp) genotypes at metahara sugar estate. Int. J. Adv. Res. Biol. Sci. 5(11):56–61

  12. Abu-Ellail FFB (2015) Breeding for yield and quality traits in sugarcane. Ph.D. Thesis. Fac. of Agric., Cairo Univ., Egypt

  13. Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann. Eugen. 7:179–189

  14. Gomez KA, Gomez AA (1984) Statistical Procedures for Agricultural Research, 2nd edn. Wiley, New York, p 680

  15. Hadavani JK, Mehta DR, Kanani DK (2018) Discriminate function analysis in Indian bean (Lablab purpureus L.). J. Pharmacognosy and Phytochemistry 7(5):119–121

  16. Hiremath G, Nagaraja TE (2016) Selection indices for cane yield in mid-late maturing clones of sugarcane (Saccharum officinarum L.). Res. Environ. Life Sci. 9(8):1022–1024

  17. Kwajaffa AM, Olaoye G (2014) Flowering behaviour, pollen fertility and relationship of flowering with cane yield and sucrose accumulation among sugarcane germplasm accessions in a savanna ecology of Nigeria. Inter. J. Current Agric. Res. 3(12):104–108

  18. Masri MI, Shaban Sh A, El-Hennawy HH, El-Taib ABA, Abu-Ellail FFB (2014) Evaluation of some sugarcane genotypes for yield and quality traits at the first clonal selection stage. Egypt. J. of Appl. Sci 29(12 B):709–730

  19. Milligan SB, Gravois KA, Martin FA (1996) Inheritance of sugarcane ratooning ability and relationship of younger crop traits to older crop traits. Crop Sci. 36:45–50

  20. Mohammed AK, Ishaq MN, Gana AK, Agboire S (2019) Evaluation of sugarcane hybrid clones for cane and sugar yield in Nigeria. African J. Agric. Res. 14(1):34–39

  21. Muhammad K, Hidayat R, Rabbani MA, Farha T, Amanullah K (2014) Qualitative and quantitative assessment of newly selected sugarcane varieties. Sarhad J. Agric. 30(2):187–191

  22. Naroui Rad MR, Fanaei HR, Ghalandarzehi A (2017) Integrated selection criteria in melon breeding. Inter. J. Vegetable Sci. 23(2):125–134

  23. Patel NS, Raval LJ (2018) Selection indices for yield improvement in bread wheat (Triticum aestivum L.) under late sown condition. J. Pharmacognosy and Phytochemistry 7(5):1586–1588

  24. Piepho HP, Möhring J, Melchinger AE, Büchse A (2008) BLUP for phenotypic selection in plant breeding and variety testing. Euphytica 161:209–228 https://doi.org/10.1007/s10681-007-9449-8

  25. Silva LA, Resende RT, Ferreira RADC, Silva GN, Kist V, Barbosa MHP, Nascimento M, Bhering LL (2016) Selection index using the graphical area applied to sugarcane breeding. Genetics and Molecular Res 15(3):gmr.15038711

  26. Silva LA, Teodoro PE, Peixoto LA, Assis C, Gasparini K, Barbosa MHP, Bhering LL (2017) Selecting sugarcane genotypes by the selection index reveals high gain for technological quality traits. Genetics and Molecular Res 16(2):1–12

  27. Singh RK, Dey P (2002) Genetic variability in plant and ratoon of sugarcane genotypes grown under saline conditions. Indian Sugar 10:725–727

  28. Sivakumar V, Celine VA, Venkata RC (2017) Discriminant function method of selection in vegetable cowpea genotypes. Int. J. Curr. Microbial. App. Sci. 6(10):4954–4958

  29. Smith FH (1937) A discriminant function for plant selection. Ann. Eugen. 7:240–250

  30. Tatsuoka MM (1971) Multivariate Analysis, 2nd edn. Macmillan, New York

  31. Yadav RL, Sharma RK (1980) Effect of nitrogen levels and harvest dates on quality characters and yield of four sugar cane genotypes. Indian J. 50(7):581–589

  32. Yadav S, Jackson P, Wei X, Ross EM, Aitken K, Deomano E, Atkin F, Hayes BJ, Voss-Fels KP (2020) Accelerating genetic gain in sugarcane breeding using genomic selection. Agronomy 10:585. https://doi.org/10.3390/agronomy10040585

  33. Zhou MM, Lichakane M, Joshi SV (2012) Family evaluation for quality traits in South Africa sugarcane breeding programmes. Proc. S. Afr. Sug. Technol. Ass. 85:221–236

Acknowledgements

The authors wish to thank Dr. Waleed Fars, head researcher at Cent. Lab. For Design & Stat. Analysis Res., Agricultural Research Centre, 12619, Giza, Egypt.

Funding

Not applicable

Author information

Contributions

All authors shared in this work. FFBA was a major contributor in writing the manuscript and performed experiments. EMAH analyzed and interpreted the data and contributed to reviewing the manuscript before its submission to the journal. AA performed the experiments. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Farrag F. B. Abu-Ellail.

Ethics declarations

Ethics approval and consent to participate

The manuscript does not contain experiments using animals. The manuscript does not contain human studies.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Abu-Ellail, F.F.B., Hussein, E.M.A. & El-Bakry, A. Integrated selection criteria in sugarcane breeding programs using discriminant function analysis. Bull Natl Res Cent 44, 161 (2020). https://doi.org/10.1186/s42269-020-00417-6

Keywords

  • Sugarcane breeding
  • Discriminant function analysis
  • Selection criterion