Introduction

Cotton is widely known as "white gold", a name that reflects its economic importance. It is not only the world's leading fiber crop but also the second most important oilseed crop. It is grown in more than 70 countries and plays an essential role in the global economy. India is the largest producer, with an output of 5.9 million tons, followed by China, the USA, Pakistan, and Brazil1.

Cotton is the main cash crop of Egypt. Since 1820, no other crop has been as continuously tied to the life and livelihood of the Egyptian people. It plays an essential role in the national economy through its contribution to trade, the textile industry, employment, and foreign exchange. Increasing productivity per unit area is required to meet domestic and export needs. The expression of economic traits often varies with the breeding materials used and the environmental conditions2,3,4.

In Egypt, water scarcity, especially at the ends of the canals, is a significant factor limiting cotton cultivation2,5. Drought is a common abiotic stress during the cotton growing season and causes a series of adverse effects on cotton growth, yield, and fiber quality2,6,7. Cotton is highly sensitive to drought, which can cause substantial yield reductions, because drought stress is a complex phenomenon that affects the physiology of the cotton plant6. Cotton is also very sensitive to the quantity of irrigation water, which makes irrigation management complicated. The flowering and boll-forming stage is the critical period for yield determination, and water stress during this stage seriously affects cotton development and final productivity8,9,10. Water-deficit stress affects physiological processes in plants, resulting in alterations in photosynthetic rate, transpiration rate, stomatal conductance, carboxylation efficiency, and water use efficiency. Most practical drought tolerance breeding programs emphasize direct selection for yield under stress. However, genotypes that are high yielding under water stress could be low yielding in well-watered environments11,12,13. In addition, other authors14,15,16 observed that genotypes bred in optimum conditions are not likely to sustain yield under stress because of genotype-environment interaction, and that selection under water stress conditions is therefore more fruitful.

Direct selection based on yield alone is difficult to practice in cotton breeding because yield is a complex trait that is highly affected by environmental conditions. The presence of genotype × environment interactions reduces the efficiency of using yield as the sole selection criterion and thus complicates selection efforts17,18,19. In addition to environmental effects, other factors such as polygenic nature, low heritability, linkage, and non-additive gene action may make selection less efficient, mainly in early segregating generations20,21,22. The selection index technique was proposed23,24 for the simultaneous improvement of several traits and for selecting relatively more heritable correlated characters. Furthermore, the selection index aims to determine the most valuable genotypes and the most suitable combination of traits in plants8,12,25,26.

Some comparisons of indices with direct selection conclude that using indices as the selection criterion achieves superior results. Several authors have confirmed the efficacy of the selection index11,14,16,27,28,29,30,31. In most of these studies, the selection index method was more efficient in isolating superior families for most studied traits. This method can therefore be relied on in breeding programs to obtain elite genotypes that combine superior yield and fiber traits. In contrast, the pedigree selection method was inferior in detecting superior genotypes17,18,19,20,21,22.

The first objective of this research was to select cotton genotypes adapted to semi-arid climate conditions under irrigation that combine high yield with the fiber quality standards required by the textile industry. The second objective was to determine the predicted and realized gains from different selection indices for improving some economic characters under water stress conditions.

Results

Broad-sense heritability, phenotypic (PCV) and genotypic (GCV) coefficients of variation, and mean performance for all studied traits are presented in Table 1. Heritability was high (over 50%) for all studied traits in the F2, F3, and F4 generations except BW, S/B, and SI in the F2 generation. B/P, SI, LI, UI, FL, and PI showed a decrease in broad-sense heritability from the F3 to the F4 generation, whereas LCY/P, BW, S/B, LP%, and MR showed an increase. The observed phenotypic and genotypic coefficients of variation were generally larger in F2 and F3 than in F4. Except for LP% and PI, the F4 generation showed reduced PCV and GCV values for all studied characters.

Table 1 Estimates of broad-sense heritability (h2b), phenotypic (PCV), and genotypic (GCV) coefficients of variation, means, and standard errors (\({\text{S}}\overline{x}\)) for the eleven studied characters in F2, F3, and F4 generations.

Mean performances for the studied traits in the F2, F3, and F4 generations are presented in Table 1. Except for FL and MR, mean performance was higher in the F4 than in the F3 generation for all studied characters. MR was lower (desirable) in F4 than in F3.

Phenotypic and genotypic correlation coefficients

The phenotypic and genotypic correlation coefficients among different character combinations are given in Table 2. B/P, BW, S/B, and LP% showed positive and significant correlations with lint cotton yield/plant in most studied generations. Fiber length was also positively correlated with LCY/P in the F3 and F4 generations. B/P was negatively associated with SI and LI for both rp and rg. BW was positively and significantly correlated with SI, LI, and MR for both rp and rg. The phenotypic and genotypic correlation coefficients of S/B with LP% were positive and increased from the F2 to the F4 generation, but S/B showed negative associations with SI and LI. SI exhibited a positive relationship with LI but a negative association with LP%. Except for the negative relationship between PI and MR in F3, there was no clear relationship among the remaining fiber traits.

Table 2 Phenotypic (rp) (above diagonal) and genotypic (rg) (below diagonal) correlation coefficients in F2, F3, and F4 generations between all pairs of studied traits.
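As an illustration only, the following minimal Python sketch shows how a phenotypic or genotypic correlation coefficient follows from the corresponding covariance and variance components, using the standard definition of a correlation as a covariance divided by the square root of the product of the two variances. The function name and all numerical values are hypothetical and are not taken from Table 2; the exact estimators of the cited references may differ in detail.

```python
import math

def correlation(cov_xy, var_x, var_y):
    """Correlation coefficient from a covariance and the two variances."""
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical variance and covariance components for two traits x and y.
genotypic = {"cov": 1.2, "var_x": 4.0, "var_y": 1.0}
phenotypic = {"cov": 1.5, "var_x": 6.25, "var_y": 1.44}

# Genotypic correlation: uses genotypic (co)variance components only.
rg = correlation(genotypic["cov"], genotypic["var_x"], genotypic["var_y"])
# Phenotypic correlation: uses phenotypic (co)variance components.
rp = correlation(phenotypic["cov"], phenotypic["var_x"], phenotypic["var_y"])

print(f"rg = {rg:.3f}, rp = {rp:.3f}")
```

In this made-up example the genotypic coefficient is larger than the phenotypic one because the environmental component inflates the phenotypic variances more than the phenotypic covariance.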

Predicted and realized gains from the selection

Predicted and realized advances from the selection procedures for yield and yield components are presented in Table 3. The highest predicted genetic advances in the F2, F3, and F4 generations for LCY/P and B/P were obtained with Ped.1 and I.1. The high genetic correlation (more than 0.93) between LCY/P and B/P (Table 2) could explain the improvement of both traits using direct selection for lint cotton yield/plant (Ped.1) and the selection index involving yield and its components (I.1). The highest realized and predicted genetic advances in the F4 generation were obtained with Ped.1 (direct selection for LCY/P). The results from Ped.1 were expected, given the positive correlations of lint index with LCY/P and B/P in F4. These results indicate that the genetic variation for lint cotton yield/plant was not exhausted in the early generations and that improvement of yield/plant could be continued in further generations.

Table 3 Predicted and realized gains from the different selection procedures for yield and yield components in the three segregating generations.

On the other hand, the lowest realized and predicted genetic advances in the F4 generation for LCY/P were obtained with index I.2 (the index involving BW, S/B, SI, fiber length, and PI). The predicted and realized responses of the eight selection procedures did not agree completely (simple correlation between predicted and realized gains = 0.874 for LCY/P and 0.769 for B/P). In addition, the predicted improvements in F3 were higher than the realized gains in F4, which indicates the predominance of additive and dominance genetic variances in the inheritance of LCY/P and B/P.

Regarding BW and S/B (Table 3), the highest predicted genetic advances in the F2 and F3 generations were obtained with index I.1. In the F4 generation, I.4 for S/B and I.5 for BW gave the highest realized and predicted genetic advances. Across the eight selection procedures, the simple correlation between predicted (F3) and realized (F4) gains was 0.434 for BW and 0.667 for S/B; the predicted gains in F3 were also higher than the realized gains in F4 for most selection procedures.

For SI and LI (Table 3), complete agreement was found between the two traits: Ped.3 (direct selection for LI) gave the highest expected genetic gains in the F2, F3, and F4 generations, owing to the strong genetic correlation of both seed index and lint percentage with lint index. In contrast, Ped.2 (direct selection for LP%) showed the lowest realized and predicted genetic advances in all generations for SI and LI. The predicted gains in F3 were higher than the realized gains in F4 for most selection procedures. The simple correlation between predicted and realized gains across the eight selection procedures was 0.820 for LI. Thus, the predicted and realized responses for this trait were in relatively good agreement.

Considering LP% (Table 3), the maximum predicted advance was obtained with Ped.2 (direct selection for LP%) in the F2, F3, and F4 generations. Thus, direct selection for LP% is recommended for its own improvement. The simple correlation between predicted and realized gains across the eight selection procedures was 0.813. The differences between predicted and realized genetic gains fluctuated in both value and direction.

Regarding UI and fiber length (Table 4), the selection index involving LP%, LI, FL, and UI (I.3) gave the highest predicted gains in the F3 generation. I.2 (the index involving BW, S/B, SI, FL, and PI) showed high realized and predicted genetic advances in the F4 generation. For both UI and fiber length, the eight selection procedures showed differences between predicted and realized genetic gains in value and direction. The simple correlation between predicted and realized gains was 0.680 for UI and 0.702 for fiber length.

Table 4 Predicted and realized gains from the different selection procedures for fiber properties in the three segregating generations.

For PI and MR (Table 4), the predicted genetic response in the F3 generation showed desirable values with I.2 for both traits. In the F4 generation, I.2 and I.3 showed the desirable predicted and realized gains for PI and MR, respectively. The simple correlation between predicted and realized gains was 0.572 for PI and 0.052 for MR.

Five superior families were identified from these selection procedures in the F4 generation (Table 5). The first family was selected by five procedures (Ped.1, Ped.2, Ped.3, I.1, and I.5), the second by six (Ped.1, Ped.3, I.1, I.2, I.4, and I.5), the third by six (Ped.1, Ped.2, I.1, I.3, I.4, and I.5), the fourth by five (Ped.1, I.2, I.3, I.4, and I.5), and the fifth by five (Ped.1, Ped.2, I.1, I.2, and I.4). These five families exceeded the better parent for LCY/P, B/P, BW, S/B, and LI and had reasonable fiber traits. They could be advanced to further generations as breeding material for developing water deficit tolerant genotypes. Similar findings were reported by11,12,15,31.

Table 5 Means and their percent from the better parent of the superior four families scored by using two selection procedures for studied characters in the F4 generation.

Discriminant analysis (DA)

Discriminant analysis32 was performed using the MASS package in R software version 4.1.033. The purpose was to identify the traits that efficiently discriminate among the three generations. These discriminant traits have been improved by the selection procedures and can be used to judge selection efficiency. The linear discriminant analysis (LDA) results in Table 6 give the linear combinations of the studied traits (linear discriminants) that characterize or separate the three generations. As shown in Table 6, the first and second discriminants (LD1 and LD2) are linear combinations of the seven traits, and the percentage separations achieved by LD1 and LD2 were 99.7% and 0.3%, respectively. For LD1, LI was the most important trait (coefficient = 5.92), followed by SI and BW (−3.21 and 1.24). LD2, on the other hand, was not effective in separating the three generations.

Table 6 Linear Discriminants of lint yield/plant (LY), boll number/plant (BN), boll weight (BW), seed number/boll (SN), seed index (SI), lint % (LP), and lint index (LI).

Figure 1 shows density plots of the studied traits for the three generations, obtained with a smoothed kernel density function of the values. Each density plot shows how the values of a trait are distributed. The values on the Y-axis represent the estimated probability density, and the area under the curve over an interval represents the probability of values in that interval. The X-axis value that corresponds to the peak is the most frequent value of the trait, which approximates the mean when the distribution is roughly symmetric. The peaks of each density plot indicate where the values of each trait are most concentrated, which means higher probability.

Figure 1 Density plot of boll number/plant (BN), boll weight (BW), lint % (LP), lint yield/plant (LY), seed index (SI), lint index (LI), and seed number/boll (SN).

In comparison, the tails of each density plot indicate a lower concentration of values, which means lower probability. The value corresponding to the peak of LI for F2 differs from that for F3 and F4, where the means of LI were higher than in F2. The same trend was observed for SI, while the reverse was the case for SN. The density plots show that selection from F2 improved only the LI and SI means and decreased the variation of all studied traits.
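To make the reading of a density plot concrete, the following is a minimal Python sketch of a Gaussian kernel density estimate. It is not the code used to draw Figure 1: the data are hypothetical lint index values, and the bandwidth uses the normal reference rule h = 1.06·s·n^(−1/5), which is an assumption because the bandwidth used for the figure is not stated.

```python
import math

def gaussian_kde(values, bandwidth=None):
    """Return a function giving a Gaussian kernel density estimate."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Normal reference rule for the bandwidth (assumed, not from the source).
    h = bandwidth if bandwidth is not None else 1.06 * sd * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)

    return density

# Hypothetical lint index values for one generation.
li_values = [5.1, 5.4, 5.6, 5.8, 5.9, 6.0, 6.1, 6.3, 6.6, 7.2]
f = gaussian_kde(li_values)

step = 0.01
grid = [3.0 + step * i for i in range(701)]   # evaluation points from 3.0 to 10.0
dens = [f(x) for x in grid]

area = sum(d * step for d in dens)            # Riemann sum of the curve
peak_x = grid[dens.index(max(dens))]          # X value at the peak (the mode)

print(f"area under curve = {area:.3f}, peak at x = {peak_x:.2f}")
```

The area under the whole curve is approximately one, and the peak marks the most concentrated region of the values rather than the mean itself.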

Figure 2 shows the stacked histogram of the discriminant function values based on LD1. No overlap was detected between F2 and either F3 or F4, whereas F3 and F4 overlapped. These results indicate that selection among F2 plants led to an improvement in the F3 generation, whereas selection among F3 plants did not lead to a further improvement in the F4 generation.

Figure 2 Stacked histogram for discriminant function values based on LD1.

Figure 3 shows the biplot based on LD1 and LD2. F2 was clearly separated, while F3 and F4 overlapped. Based on the arrows, LI contributed most to separating F3 and F4 from F2, while SI contributed most to separating F2 from F3 and F4 (Table 7). BW, on the other hand, did not discriminate among the three generations. LY, BN, SN, and LP showed no specific trend and cannot be used to describe the variation or to separate the three generations.

Figure 3 Biplot based on LD1 and LD2.

Table 7 Means of lint yield/plant (LY), boll number/plant (BN), boll weight (BW), seed number/boll (SN), seed index (SI), lint % (LP), and lint index (LI) for the three generations.

Path analysis

Path analysis34 was performed in R software version 4.1.0 (2021) using the function sem (structural equation modeling) in the lavaan package. The path diagrams were then drawn using the function semPaths, which is provided by the semPlot package. The results of the path analysis are shown in the path diagrams in Figs. 4, 5, and 6. All path diagrams included the effects of lint index (LI) and seed index (SI) on lint % (LP), of boll number/plant (BN) and seed number/boll (SN) on boll weight (BW), and of lint % (LP) and boll weight (BW) on lint yield/plant (LY) for the three generations. Three types of arrows characterize a path diagram. The first type is a single-headed arrow (path), which defines a causal relationship in which the trait at the tail of the arrow affects the trait at the head. The second type is a double-headed arrow connecting two traits, which defines their covariance. The third type is a double-headed arrow pointing to the same trait, which represents the variance of that trait.

LP was strongly affected by LI and SI, with R2 values of 0.990, 0.834, and 0.980 for F2, F3, and F4, respectively. BW was also strongly affected by LI, SI, and SN, with R2 values of 0.986 and 0.978 for F2 and F3, respectively; however, it was only moderately affected in F4 (R2 = 0.487). LY was directly and strongly influenced by LI, SI, BN, and SN, with R2 values of 0.980, 0.992, and 0.942 for F2, F3, and F4, respectively. Each direct effect (standardized estimate) is shown in the middle of the arrow, and each indirect effect is shown in parentheses beside the direct effect. The direct effect of BN on LY was larger than the direct effects of LP and BW in the three generations. The indirect effects of LI and SI through LP followed the same trend in the three generations: LI had a positive effect, while SI had a negative effect. SN had no indirect effect on LY through BW, while LI and SI had positive indirect effects on LY through BW in the three generations. The direct and indirect effects on LY were larger in F4 than in F2 and F3. These larger effects in F4 may be due to selection, which led to a stronger relationship between lint yield and its components and consequently stabilized the progeny of that generation.
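The relation between direct and indirect effects in a path diagram can be sketched as follows. In path analysis, the indirect effect of a trait along a chain of single-headed arrows is the product of the standardized coefficients on that chain. The coefficients below are hypothetical values chosen for illustration and are not the estimates shown in Figs. 4, 5, and 6; the function name is likewise an assumption.

```python
def indirect_effect(*path_coefficients):
    """Product of the standardized coefficients along one causal chain."""
    product = 1.0
    for coefficient in path_coefficients:
        product *= coefficient
    return product

# Hypothetical standardized direct effects (tail trait, head trait).
paths = {
    ("LI", "LP"): 0.9,    # lint index -> lint %
    ("SI", "LP"): -0.6,   # seed index -> lint %
    ("LP", "LY"): 0.3,    # lint % -> lint yield/plant
}

# Indirect effects on LY that pass through LP.
li_via_lp = indirect_effect(paths[("LI", "LP")], paths[("LP", "LY")])
si_via_lp = indirect_effect(paths[("SI", "LP")], paths[("LP", "LY")])

print(f"LI -> LP -> LY: {li_via_lp:.2f}")
print(f"SI -> LP -> LY: {si_via_lp:.2f}")
```

With these made-up values, the sign pattern mirrors the one described in the text: a positive indirect effect of LI and a negative indirect effect of SI through LP.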

Figure 4 Path diagram of the direct and indirect effects of the studied traits in F2.

Figure 5 Path diagram of the direct and indirect effects of the studied traits in F3.

Figure 6 Path diagram of the direct and indirect effects of the studied traits in F4.

Discussion

The results indicated a high magnitude of genetic components and suggested that selection in the early segregating generations under water deficit stress could succeed, in agreement with35,36,37,38. The observed phenotypic and genotypic coefficients of variation were larger in F2 and F3 than in F4 for all the studied traits; this was due to a reduction in genetic variability and heterozygosity resulting from the different selection procedures, which exhausted a significant part of the variability. Similar results were reported by7,39,40,41,42,43.

Except for FL and MR, mean performance in the F4 generation was higher than in the F3 generation for all studied characters. MR was lower (desirable) in F4 than in F3. This is attributed to the possible accumulation of favorable alleles due to the efficiency of the selection procedures applied in this study, and these results indicate the feasibility of selection for these traits. Several researchers obtained similar results44,45,46,47,48.

Plant breeders must be concerned with the total array of economic characters and not just one character. Correlation analysis therefore provides a useful index for predicting the change in one character that accompanies a proportionate change in another. Generally, genotypic correlations were higher than phenotypic correlations; this may be due to the relative stability of the genotypes, as the majority of them had been subjected to a certain amount of selection14,28,30. Similar results were obtained by11,15,16,31,49.

Generally, from the previous results under water deficit stress, direct selection for lint index (Ped.3) was the most efficient in improving lint cotton yield/plant, bolls/plant, seeds/boll, uniformity index, and fiber length. However, the multiplicative index of50 involving all studied characters (I.5) exhibited the highest values for boll weight. The Ped.2 procedure (direct selection for lint percentage) proved to be the most efficient in improving seed and lint indexes. Direct selection for lint cotton yield/plant (Ped.1) could produce the highest desirable values for lint percentage and micronaire reading with a relatively reasonable yield. The selection index involving boll weight, seeds/boll, seed index, fiber length, and Pressley index (I.2) is recommended for improving the Pressley index.

The main objective of any breeder is to obtain high yield with acceptable fiber quality, under normal conditions in general9,10,25 and under water deficit stress conditions in particular3,6,36,51,52,53. This work suggests that selection for favorable characters and combinations of characters under water deficit stress could be carried out in early generations, because genetic variation was exhausted in the early generations and the improvement of traits could not be continued in further generations. Applying these two selection methods in the early generations of the cross G.86 × Menoufi could improve lint yield with desirable fiber quality, one of the essential aims under water deficit stress.

Materials and methods

Genetic materials and selection procedures

This work was carried out at Sakha Agricultural Research Station during the 2017, 2018, and 2019 growing seasons. The materials used were the F2, F3, and F4 generations of the intraspecific cotton (Gossypium barbadense L.) cross Giza 86 × Menoufi. Giza 86 is a long-staple variety characterized by high yield and good fiber properties. Menoufi (Giza 36) is an extra-long-staple variety. The characteristics of these two parents under water deficit stress are presented in Table 8. The F2, F3, and F4 generations were evaluated under water deficit stress by applying one irrigation at planting and two supplemental irrigations 30 and 20 days after planting. The ordinary practices of cotton cultivation were applied.

Table 8 Characteristics of the cotton parental genotypes under this study.

In the 2017 season, the F2 generation and the original parents were grown in non-replicated rows 5.0 m long, with 30 cm between hills and 70 cm between rows. One plant was left per hill at thinning time. Self-pollination was practiced for all F2 plants, giving selfed and open-pollinated bolls on each of 439 guarded plants. Selection of superior progenies was based on the following traits: lint yield/plant (LY/P) (g), bolls/plant (B/P), boll weight (BW) (g), seeds/boll (S/B), seed index (SI) (g), lint percentage (LP%), and lint index (LI).

Using 15% selection intensity, the plants with the highest performance for lint yield/plant, bolls/plant, boll weight, seeds/boll, seed index, lint percentage, and lint index were saved. These gave a total of 76 F3 selected progenies.

In the 2018 season, part of the selfed seed of the 76 selected progenies was evaluated together with the original parents in a randomized complete block design with three replicates. The experimental plot consisted of a single row 5.0 m long, with 30 cm between hills and 70 cm between rows. One plant was left per hill at thinning time.

The 76 progenies were ranked using eight selection procedures. The eight superior progenies of each selection procedure were selected using 10% selection intensity. In the 2019 season, the selfed seeds of the selected progenies (19) were evaluated together with the original parents in a randomized complete block design with three replicates. The experimental plot was laid out as in 2018. Selection of superior progenies was based on the following yield and fiber quality traits: lint yield/plant (LY/P) (g), bolls/plant (B/P), boll weight (BW) (g), seeds/boll (S/B), seed index (SI) (g), lint percentage (LP%), lint index (LI), fiber length at 2.5% span length (FL) (mm), uniformity index (UI%), Pressley index (PI), and micronaire reading (MR).

Selection procedures were as follows:

Ped.1: Direct selection for lint cotton yield/plant

Ped.2: Direct selection for lint percentage

Ped.3: Direct selection for lint index

I.1: Selection index involving yield and its components

I.2: Selection index involving boll weight, seeds per boll, seed index, fiber length, and Pressley index

I.3: Selection index involving lint percentage, lint index, fiber length, and uniformity index

I.4: Selection index involving all characters24

I.5: Selection index involving all characters50

Statistical and genetic analysis

The phenotypic (PCV) and genotypic (GCV) coefficients of variation were estimated according to54. The variance and covariance components from the regular randomized complete block design analysis were used to estimate the phenotypic and genotypic variances and covariances, as outlined in Table 9.

Table 9 Analysis of variance and covariance on plot mean basis in F3 and F4 generations.

Heritability in a broad sense (h2b) was calculated as follows:

$${\text{h}}_{{\text{b}}}^{2} \,({\text{in}}\,{\text{F}}_{2} \,{\text{generation}}) = \frac{{{\text{VF}}_{2} - ({\text{VP}}_{1} + {\text{VP}}_{2} )}}{{{\text{VF}}_{2} }} \times 100$$
$${\text{h}}_{{\text{b}}}^{2} \,({\text{in}}\,{\text{F}}_{3} \,{\text{and}}\,{\text{F}}_{4} \,{\text{generations}}) = \frac{{\sigma^{2} g}}{{\sigma^{2} {\text{p}}}} \times 100$$

where VF2 is the phenotypic variance of the F2 generation, VP1 and VP2 are the variances of the first and second parents, respectively, σ2g is the genotypic variance of the F3 and F4 generations, and σ2p is the phenotypic variance of the F3 and F4 generations. The phenotypic and genotypic correlation coefficients between the studied characters in the F2, F3, and F4 generations were estimated as outlined by55,56.
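The second heritability expression above, together with the coefficients of variation, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. It assumes that the genotypic and phenotypic variances have already been estimated, and it assumes the commonly used definition of PCV and GCV as the square root of the respective variance expressed as a percentage of the trait mean; the source only cites a reference for these coefficients. All numerical values are hypothetical.

```python
import math

def broad_sense_heritability(var_g, var_p):
    """h2b (%) for F3 and F4: genotypic over phenotypic variance, times 100."""
    return var_g / var_p * 100.0

def coefficient_of_variation(variance, trait_mean):
    """Coefficient of variation (%): sqrt(variance) / mean * 100 (assumed form)."""
    return math.sqrt(variance) / trait_mean * 100.0

# Hypothetical estimates for one trait in one generation.
var_g = 4.0        # genotypic variance
var_p = 5.0        # phenotypic variance
trait_mean = 20.0  # generation mean of the trait

h2b = broad_sense_heritability(var_g, var_p)
gcv = coefficient_of_variation(var_g, trait_mean)
pcv = coefficient_of_variation(var_p, trait_mean)

print(f"h2b = {h2b:.1f}%, GCV = {gcv:.2f}%, PCV = {pcv:.2f}%")
```

Because the phenotypic variance includes the genotypic variance, PCV is never smaller than GCV under these definitions.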

The appropriate index weights (b's) were calculated from the following formula postulated by23,24:

$$\left( {\text{b}} \right) = \left( {\text{P}} \right)^{{ - {1}}} \cdot \left( {\text{G}} \right) \cdot \left( {\text{a}} \right)$$

where (b) is the vector of relative index coefficients, (P)−1 is the inverse of the phenotypic variance-covariance matrix, (G) is the genotypic variance-covariance matrix, and (a) is the vector of relative economic values, with all characters treated as equally important, i.e., (a)w = (a)1 = (a)2 = (a)3 = 1.
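The matrix expression above can be illustrated with a small, self-contained Python sketch. Rather than inverting (P) explicitly, it solves the equivalent linear system (P)(b) = (G)(a) by Gaussian elimination. The two-trait matrices are hypothetical and the helper names are assumptions introduced for the example; this is not the software used in the study.

```python
def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def index_weights(P, G, a):
    """Index weights b = P^-1 G a, obtained by solving P b = G a."""
    Ga = [sum(G[i][j] * a[j] for j in range(len(a))) for i in range(len(G))]
    return solve(P, Ga)

# Hypothetical two-trait example with equal economic values.
P = [[4.0, 1.0], [1.0, 2.0]]   # phenotypic variance-covariance matrix
G = [[2.0, 0.4], [0.4, 1.2]]   # genotypic variance-covariance matrix
a = [1.0, 1.0]                 # relative economic values

b = index_weights(P, G, a)
print([round(w, 4) for w in b])
```

Solving the linear system gives the same weights as multiplying by the inverse matrix and is usually the more numerically stable choice.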

The following formula, suggested by23,24, was used to calculate the various selection indices:

$${\text{I}} = {\text{b}}_{{1}} {\text{x}}_{{1}} + {\text{b}}_{{2}} {\text{x}}_{{2}} + \cdots + {\text{b}}_{{\text{n}}} {\text{x}}_{{\text{n}}}$$
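As a sketch of how such an index can be applied, the Python example below scores a few families with the linear index and keeps the top fraction. The weights, family names, trait values, and the rounding rule used for the number of families kept are all hypothetical choices made for illustration.

```python
def index_score(b, x):
    """Selection index I = b1*x1 + b2*x2 + ... + bn*xn for one family."""
    return sum(bi * xi for bi, xi in zip(b, x))

def select_top(families, b, intensity):
    """Rank families by index score and keep the top fraction."""
    scores = {name: index_score(b, traits) for name, traits in families.items()}
    n_keep = max(1, round(intensity * len(scores)))  # assumed rounding rule
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep], scores

# Hypothetical index weights and trait means (trait 1, trait 2) per family.
b = [0.5, 0.25]
families = {
    "fam1": [40.0, 20.0],
    "fam2": [30.0, 24.0],
    "fam3": [44.0, 16.0],
    "fam4": [28.0, 20.0],
    "fam5": [36.0, 12.0],
}

selected, scores = select_top(families, b, intensity=0.4)
print(selected, scores)
```

With the rounding rule assumed here, a 10% selection intensity applied to 76 progenies keeps eight progenies, which matches the number reported for each selection procedure.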

Predicted improvement in lint yield on the basis of an index was estimated according to the following expression57:

$${\text{SA}} = {\text{SD}}\left( {\sum {\text{b}}_{{\text{i}}} \cdot \sigma_{{\text{giw}}} } \right)^{1/2}$$

where SA denotes the selection advance, SD denotes the selection differential in standard units, bi denotes the index weights for the characters considered in an index, and σgiw denotes the genotypic covariances of the characters with yield.
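A minimal Python sketch of the selection advance expression follows. The weights and covariances are hypothetical. The selection differential of 1.755 standard units is the commonly tabulated value for 10% selection intensity under a normal distribution and is used here only as an assumed example value.

```python
import math

def selection_advance(sd_units, b, cov_g_with_yield):
    """SA = SD * sqrt(sum_i b_i * sigma_giw)."""
    total = sum(bi * ci for bi, ci in zip(b, cov_g_with_yield))
    if total < 0:
        raise ValueError("sum of b_i * sigma_giw must be non-negative")
    return sd_units * math.sqrt(total)

# Hypothetical inputs for an index built from two characters.
sd_units = 1.755              # assumed selection differential (10% intensity)
b = [0.5, 0.5]                # index weights
cov_g_with_yield = [2.0, 6.0] # genotypic covariances of each character with yield

sa = selection_advance(sd_units, b, cov_g_with_yield)
print(f"predicted selection advance = {sa:.3f}")
```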

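Predicted genetic advance in lint yield based on direct selection was estimated from the following expression58:

$$\Delta {\text{G}}_{{\text{w}}} \,({\text{due}}\,{\text{to}}\,{\text{selection}}\,{\text{for}}\,{\text{X}}_{{\text{i}}} ) = {\text{K}} \cdot \frac{{\sigma_{{\text{gwi}}} }}{{\sigma_{{\text{pi}}} }}$$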
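The expression for direct selection can be sketched in Python as below. The symbols are not defined in the text, so the sketch assumes their usual meaning in correlated-response formulas: K is the selection differential in standard units, σgwi is the genotypic covariance between yield (w) and the selected character Xi, and σpi is the phenotypic standard deviation of Xi. All values are hypothetical.

```python
import math

def gain_from_direct_selection(k, cov_g_yield_trait, var_p_trait):
    """Delta G_w = K * sigma_gwi / sigma_pi (symbol meanings are assumed)."""
    sd_p_trait = math.sqrt(var_p_trait)  # phenotypic standard deviation of X_i
    return k * cov_g_yield_trait / sd_p_trait

# Hypothetical inputs.
k = 1.755                # assumed selection differential in standard units
cov_g_yield_trait = 3.0  # genotypic covariance of yield with X_i
var_p_trait = 4.0        # phenotypic variance of X_i

delta_gw = gain_from_direct_selection(k, cov_g_yield_trait, var_p_trait)
print(f"predicted gain in yield = {delta_gw:.4f}")
```

When Xi is yield itself, the covariance becomes the genotypic variance of yield and the expression reduces to the usual predicted response to direct selection.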

Also, the predicted response in any selected and unselected character was calculated as suggested by57,59.

The realized gains were calculated as deviation of generation mean for each character from the procedure mean of that character.
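The realized gain defined above, and the simple correlation between predicted and realized gains reported in the Results, can be sketched as follows. The sketch assumes that the realized gain is the mean of the families retained by a procedure minus the generation mean. The data and function names are hypothetical.

```python
import math
from statistics import mean

def realized_gain(procedure_values, generation_values):
    """Procedure mean minus generation mean for one character."""
    return mean(procedure_values) - mean(generation_values)

def simple_correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical character values in one generation.
generation_values = [10.0, 12.0, 14.0, 16.0]  # all evaluated families
procedure_values = [14.0, 16.0]               # families kept by one procedure
gain = realized_gain(procedure_values, generation_values)

# Hypothetical predicted and realized gains for three procedures.
predicted = [1.0, 2.0, 3.0]
realized = [2.0, 4.0, 6.5]
r = simple_correlation(predicted, realized)

print(f"realized gain = {gain:.2f}, r = {r:.3f}")
```

A correlation close to one indicates that the procedures rank similarly for predicted and realized gains, even when the predicted values overestimate the realized ones.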

Conclusion

In this study, under water deficit stress, direct selection for lint index (Ped.3) was the most efficient in improving lint cotton yield/plant, bolls/plant, seeds/boll, uniformity index, and fiber length. However, the multiplicative index involving all studied characters (I.5) exhibited the highest values for boll weight. The Ped.2 procedure (direct selection for lint percentage) proved to be the most efficient in improving seed and lint indexes. Direct selection for lint cotton yield/plant (Ped.1) could produce the highest desirable values for lint percentage and micronaire reading with a relatively reasonable yield. The selection index involving boll weight, seeds/boll, seed index, fiber length, and Pressley index (I.2) is recommended for improving the Pressley index.