Introduction

Cotton is one of the most important multi-purpose crops because it supplies fiber, feed, protein and oil for human and animal use. Despite its importance for many countries, cotton yield and productivity, like those of any crop, are affected by many environmental factors, including biotic and abiotic stresses. Plant breeders therefore face a major challenge in improving cotton yield, yield components and fiber quality under these harmful stresses [1]. Cotton improvement has been directed mainly towards yield and yield component characters, for instance number of locules, boll size, number of bolls per plant, seeds per boll, seed size, lint index, seed index, and ginning outturn. Knowing how gene actions influence economic characteristics is essential for developing high-yielding, high-quality cultivars [2, 3]. Evaluating the combining ability of candidate lines is critical not only for identifying superior combiner parents but also for determining the type of gene action regulating trait inheritance [4]. Combining ability effects are divided into the general combining ability (GCA) of parents and the specific combining ability (SCA) of their crosses, which are linked to additive and non-additive gene actions, respectively [5]. In combining ability analysis, the entire genetic variability of each trait can be partitioned into GCA and SCA. Many authors have reported that SCA effects are caused by non-additive gene action (dominance or epistasis), whereas GCA effects are caused by additive gene action. They also emphasized the importance of non-additive gene action for specific cotton characteristics [6,7,8,9,10,11,12]. Previous reports also stressed the appreciable degree of variance due to GCA and showed that the mean squares due to GCA and SCA were highly significant; however, the genetic variances due to SCA were greater than those due to GCA for the yield traits, indicating non-additive gene action [13, 14].

The main focus of cotton improvement programs has been on developing hybrids, which has helped increase cotton productivity [15,16,17]. Hybridization is the most effective way to break yield barriers. Choosing parents on phenotypic performance alone is not advisable, because lines with good phenotypes may produce poor recombinants in subsequent generations [18]. It is therefore important to choose parents on the basis of their combining ability. Combining ability analysis is the most popular biometrical tool for identifying potential parents and formulating an appropriate breeding strategy [19, 20].

The selection of parents or inbreds on the basis of phenotypic diversity and strong combining ability is critical for developing better hybrids in a heterosis breeding program. The investigation of general and specific combining ability aids in identifying parents or inbreds for the generation of superior hybrids. Line x Tester analysis is one of the easiest and most efficient methods of assessing the combining ability of many inbreds/parents, and commercially viable hybrids can be produced on the basis of its results. Yield is a complex, polygenically inherited character that results from the multiplicative interaction of its component characters. Because yield is heavily influenced by the environment, selecting solely on yield may limit future improvement. The yield component traits, on the other hand, are less complex in inheritance and are influenced by the environment to a smaller extent. Thus, selection on yield components can result in effective yield improvement. To differentiate between high- and low-performing parents in a hybrid combination, data on the nature of gene action must be evaluated. The line by tester approach helps identify the gene action responsible for the expression of traits of interest in both small and large sample sizes [21, 22]. It also aids in selecting promising parents and crosses for developing high-yielding hybrids [23,24,25,26,27,28,29], provides information about the GCA (additive) of parents and the SCA (non-additive) of crosses, and helps identify the best heterotic crosses [30,31,32,33]. The most notable advantage of the line x tester technique over other crossing methods is that it requires fewer experimental materials for the mating procedure.
The line x tester technique is used in cotton to study yield, its components, and fiber quality parameters [17, 34,35,36,37,38,39,40].

Against this background, the main objectives of the current research were to assess the combining ability, heterosis and performance of cotton for yield and yield components using a line × tester mating design, helping cotton breeders identify superior parents and progenies for the simultaneous improvement of cotton yield, yield components and quality.

Materials and methods

The present investigation was carried out at Sakha Agricultural Research Station during the three growing seasons 2015, 2016 and 2017. The genetic materials were twelve genotypes, four used as male tester parents and eight as female parents. The names and origins of these cotton genotypes are given in Table 1. In the 2015 season, the four male testers and the eight female parents were crossed according to the line x tester design to produce 32 F1 top crosses, as outlined by [41]. The twelve genotypes were grown with their 32 F1 hybrids for two years, 2016 and 2017. The experimental design was a randomized complete block design with three replications. Each plot consisted of one row 4 m long and 0.7 m wide, with hills spaced 40 cm apart and one plant left per hill. The agronomic and cultural management practices (thinning, hoeing, fertilization, irrigation, etc.) recommended by the Agriculture Research Center (ARC) were applied at the proper time as required.

Table 1 The origin and the main characters of the parents

At maturity, data were collected from the middle five plants of each row, leaving two plants at either end of the row to avoid border effects. Data were recorded on the following traits as described by [42]. Boll weight (B.W, g) was determined from a random sample of 18 bolls collected from each plot. Seed cotton yield per plant (S.C.Y/P, g) is the mean seed cotton yield harvested up to the final picking from the center row of each plot. Lint cotton yield per plant (L.C.Y/P, g) is the mean lint yield harvested and ginned up to the final picking from the center row of each plot. For lint percentage (ginning outturn, GOT; L %), the seed cotton from eighteen randomly chosen bolls per plot was ginned and the lint obtained was used to calculate GOT by the following formula:

$$\mathrm{GOT}\,\% = \frac{\text{weight of lint}}{\text{weight of seed cotton}} \times 100$$

Seed index (S.I, g) was obtained by weighing one hundred seeds. Number of bolls per plant (No. B./P) was recorded on random plants. Lint index (L.I) was obtained by weighing the lint from one hundred seeds.
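As a small worked illustration of the trait definitions above, the following Python sketch computes the ginning outturn from the formula given in the text. The sample weights are hypothetical, and treating boll weight as the sample's seed cotton weight divided by the number of bolls is an assumption, since the calculation is not spelled out here.

```python
def ginning_outturn(lint_weight_g, seed_cotton_weight_g):
    """Lint percentage (GOT %) = weight of lint / weight of seed cotton x 100."""
    if seed_cotton_weight_g <= 0:
        raise ValueError("seed cotton weight must be positive")
    return lint_weight_g / seed_cotton_weight_g * 100.0


# Hypothetical plot sample (not data from the study):
# 18 bolls giving 54.0 g of seed cotton and 20.5 g of lint after ginning.
n_bolls = 18
seed_cotton_g = 54.0
lint_g = 20.5

got = ginning_outturn(lint_g, seed_cotton_g)
# Assumption: boll weight taken as the mean seed cotton weight per boll.
boll_weight = seed_cotton_g / n_bolls

print(f"GOT = {got:.2f} %, boll weight = {boll_weight:.2f} g")
```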

Statistical analysis

Statistical analysis was performed for each year. A combined analysis over the two years was done whenever homogeneity of error mean squares was detected for the studied characters, according to [43]. The combining ability analysis followed the line x tester procedure suggested by [41]. Heritability estimates were obtained as described by [38, 44]. Path coefficient analysis was performed with single plant yield as the dependent variable and the yield component and yield-related traits as independent variables, based on the scale of [45]. All statistical analyses were conducted in R (version 3.5.2; R Core Team, 2020): the correlation diagram was produced with the corrplot package, the cluster dendrogram and PCA biplot with the factoextra package, and the path analysis with the lavaan package.
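To make the line x tester decomposition concrete, the sketch below estimates GCA and SCA effects from a table of cross means using the usual textbook form: each GCA effect is a marginal mean minus the grand mean of the crosses, and each SCA effect is the remaining interaction deviation. This is an illustration under stated assumptions, not the exact procedure of [41]; the function name and the 3 x 2 table are hypothetical, and significance testing is not shown.

```python
def combining_ability(cross_means):
    """Estimate GCA and SCA effects from a lines x testers table of cross means.

    cross_means[i][j] is the mean of the cross between line i and tester j.
    """
    n_lines = len(cross_means)
    n_testers = len(cross_means[0])
    grand = sum(sum(row) for row in cross_means) / (n_lines * n_testers)

    line_means = [sum(row) / n_testers for row in cross_means]
    tester_means = [sum(row[j] for row in cross_means) / n_lines
                    for j in range(n_testers)]

    # GCA: deviation of each parent's marginal mean from the grand mean.
    gca_lines = [m - grand for m in line_means]
    gca_testers = [m - grand for m in tester_means]

    # SCA: what is left of each cross mean after removing both parental effects.
    sca = [[cross_means[i][j] - line_means[i] - tester_means[j] + grand
            for j in range(n_testers)]
           for i in range(n_lines)]
    return gca_lines, gca_testers, sca


# Hypothetical cross means for 3 lines (rows) and 2 testers (columns).
means = [[10.0, 14.0],
         [12.0, 12.0],
         [17.0, 19.0]]
gca_l, gca_t, sca = combining_ability(means)
print(gca_l, gca_t, sca)
```

By construction the GCA effects of the lines, the GCA effects of the testers, and every row and column of the SCA table each sum to zero, which is a quick check on any such calculation.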

The Shapiro–Wilk and Brown–Forsythe tests were used to determine whether the data had a normal distribution and whether the variances were homogeneous. Analyses of variance (ANOVA) and mean discrimination analysis were conducted on variables that met both assumptions.
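The variance-homogeneity check mentioned above can be sketched as follows. The Brown–Forsythe statistic is a one-way ANOVA F ratio computed on absolute deviations from each group's median. This is a minimal standard-library illustration with hypothetical data; it returns only the F statistic (no p-value), and the Shapiro–Wilk test is not reproduced here.

```python
from statistics import mean, median


def brown_forsythe(groups):
    """Return the Brown-Forsythe F statistic for a list of numeric groups."""
    # Replace each observation by its absolute deviation from the group median.
    z = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(z)                      # number of groups
    n = sum(len(g) for g in z)      # total number of observations
    grand = sum(sum(g) for g in z) / n

    # One-way ANOVA on the transformed values.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in z for x in g) / (n - k)
    return between / within


# Hypothetical replicate values for two groups with different spread.
f_stat = brown_forsythe([[1, 2, 3], [0, 5, 10]])
print(round(f_stat, 4))
```

The statistic would be compared with an F distribution on (k - 1, n - k) degrees of freedom; a large value suggests unequal variances.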

Results

Dissecting the relationship among parent genotypes (lines and testers)

Cluster analysis of the six traits was conducted on Euclidean distances using the unweighted pair group method with arithmetic mean (UPGMA). Based on the matrix of dissimilarity coefficients, a dendrogram was constructed (Fig. 1), in which the twelve parental cotton genotypes were classified into three clusters. Cluster I included four parents in two sub-clusters: Giza 96 and Giza 70 in the first, and G.89 × G.86 and G.86 in the second. Cluster II had the two closest parents, Suvin and C.B58, in its first sub-cluster and a single parent (TNB) in its second. Cluster III consisted of five parents in two sub-clusters: the first contained four parents, with Pima S6 and Kar. in one sub-subcluster and Giza 93 and G.94 in the other, while the second contained only one parent (BBB). Genotypes grouped in the same cluster (intra-cluster) are expected to be genetically more similar than genotypes grouped in different clusters (inter-cluster).
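The clustering step described above can be illustrated with a short standard-library sketch of UPGMA (average linkage) on Euclidean distances. The genotype labels and trait values are hypothetical, only two traits are used for brevity, and the traits are assumed to be on comparable scales (standardization is not shown). The study itself used R, so this is only an illustration of the technique.

```python
from itertools import combinations
from math import dist


def upgma(points):
    """Average-linkage (UPGMA) clustering on Euclidean distances.

    points maps a genotype name to a tuple of trait values.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in points]
    merges = []

    def avg_dist(a, b):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(dist(points[p], points[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical genotypes described by two traits.
pts = {"P1": (0.0, 0.0), "P2": (0.0, 1.0), "P3": (5.0, 5.0)}
history = upgma(pts)
for a, b, d in history:
    print(a, b, round(d, 3))
```

The merge distances in the history correspond to the heights at which branches join in a dendrogram.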

Fig. 1

Dendrogram based on dissimilarity coefficients for yield, and yield component measured on twelve parental cotton genotypes over two years of study

Analysis of variance

Analysis of variance (ANOVA; Table 2) showed the individual effects of growing season, lines, testers and their interactions on the six studied traits. The effect of growing season (years) was significant for all studied traits, and the genotype effect was highly significant for all studied traits. Among the interactions between seasons, lines and testers, the year x treatment interaction was highly significant for all studied traits. The mean squares for lines, testers and their interaction (L x T) were highly significant in each year and in the combined analysis over the two years, which could be due to the genetic diversity of the parents used to generate the hybrids and to environmental influences. The interactions of crosses and their partitions (L, T and L x T) with years (L x Y, T x Y and L x T x Y) were highly significant for all the studied traits, meaning that the genotypes and their partitions were affected by years.

Table 2 The Mean squares of twelve parents and F1 for yield and yield components traits in two years and their combined data in line X testers hybrids of cotton

Mean performance of genotypes

The mean performances of lines, testers, and their interaction for all studied traits in each year are shown in Table 3. The performance of the genotypes varied across years for all the studied traits. The parent G.94 had the highest and most desirable mean values for seed cotton yield, lint cotton yield, lint percentage, and boll weight, while the parent Pima S6 had the most desirable values for number of bolls/plant and seed index. The tester G.86 gave the highest mean values for all the studied traits except seed index, for which the tester BBB had the highest value.

Table 3 The mean performances of eight parental lines and four testers for yield and yield components traits in two years

The mean performances of the thirty-two top crosses in each year for all the studied traits are shown in Table 4. The cross G.86 X (G.89 X G.86) had the highest values for seed cotton yield/plant, lint cotton yield/plant and number of bolls/plant, with mean values of 387.14, 151.36 and 119.50, respectively. The cross G.93 X Suvin gave the highest value for seed index (11.55). The highest value for lint percentage (40.30) was obtained from the cross G.86 X Suvin, while the highest value for boll weight (3.59 g) was obtained from the cross Kar. X TNB.

Table 4 The mean performances of the thirty-two top crosses for yield and yield components traits in two years

Combining ability

General combining ability

The mean performances of the line x tester crosses were used to estimate general combining ability (GCA) effects. Estimates of the GCA effects of lines and testers for all the studied traits in each year are presented in Table 5. The most desirable GCA effects for seed cotton yield/plant, lint cotton yield/plant, boll weight, number of bolls/plant and lint index were found in the parents Suvin, G.89 X G.86 and TNB. For lint percentage, the favorable parents were Suvin, G.96 and Pima S6, which exhibited highly significant positive GCA estimates. The tester G.86 exhibited highly significant positive (desirable) GCA effects for seed cotton yield, lint cotton yield, lint percentage, number of bolls/plant and seed index. The tester G.93 also exhibited highly significant positive GCA effects for seed cotton yield, lint cotton yield and boll weight.

Specific combining ability

The SCA mean squares were significant for all studied traits. The significance of SCA (variances due to lines x testers) implied that both additive and non-additive types of variation were available for all the characters, yet additive genes were more important than dominant genes because the variance due to GCA was higher than that due to SCA. Estimates of the SCA effects of the 32 crosses for all the studied traits in each year are presented in Tables 5 and 6. Eight, six, three, six, five and five crosses exhibited highly significant positive (desirable) effects for seed cotton yield, lint yield, lint percentage, boll weight, number of bolls/plant and seed index, respectively. The cross G.86 X (G.89 X G.86) expressed highly significant SCA effects for seed cotton yield, lint percentage, boll weight and number of bolls/plant in each year and over the two years. The cross G.93 X C.B58 also had desirable SCA effects for seed cotton yield, lint yield, boll weight and number of bolls/plant. It could be concluded that the crosses with the best (desirable) SCA effects for most traits in each year and over both years might be of prime importance in breeding programs. The other crosses exhibited significant negative, or non-significant negative or positive, SCA effects (undesirable) for these traits (Table 6).

Table 5 General combining ability effects of parental genotypes for yield and yield component traits in two years
Table 6 Estimates of specific combining ability for yield and yield components in two years

Estimation of heterosis

The mean squares due to crosses and parents vs. crosses were highly significant, indicating the presence of heterosis in the F1 generation (Table 2). The (parents vs. crosses) x years interaction was significant, indicating that heterosis for boll weight, lint percentage, seed cotton yield/plant and lint yield/plant was inconsistent across years, which reflects the importance of selecting crosses within each year to maximize yielding ability. Heterosis, expressed as the percentage deviation (increase or decrease) of the F1 mean performance from the corresponding better parent, is presented for all the studied traits in Table 7.
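The better-parent heterosis defined above is the percentage deviation of the F1 mean from the better parent. The following minimal sketch shows the arithmetic; the function name and the example means are hypothetical, and the `higher_is_better` switch is an added convenience for traits where a lower value is desirable.

```python
def better_parent_heterosis(f1_mean, parent1_mean, parent2_mean,
                            higher_is_better=True):
    """Percentage deviation of the F1 mean from the better parent."""
    if higher_is_better:
        better = max(parent1_mean, parent2_mean)
    else:
        better = min(parent1_mean, parent2_mean)
    return (f1_mean - better) / better * 100.0


# Hypothetical means: the F1 yields 120 g/plant, its parents 100 and 90 g/plant.
h_bp = better_parent_heterosis(120.0, 100.0, 90.0)
print(f"Heterosis over better parent: {h_bp:.1f} %")
```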

Table 7 Heterosis relative to the better parents for yield and yield components traits in the two years

For seed cotton yield/plant, twelve crosses showed highly significant positive heterosis relative to the better parent in each year; among them, G.93 × Suvin exhibited highly significant positive heterosis for SCY. For lint cotton yield, eleven crosses exhibited highly significant positive heterosis relative to the better parent in each year. For lint percentage, sixteen crosses showed highly significant positive heterosis relative to the better parent in each year and in the combined data.

For boll weight, seven crosses gave significant positive heterotic effects relative to the better parent in each year. The 32 F1 crosses exhibited highly significant positive heterosis relative to the better parent for number of bolls/plant in each year. Only four crosses, G.86 × Suvin, Kar. X TNB, G.93 X (G.89 X G.86) and G.93 × G.70, had highly significant positive heterosis relative to the better parent in each year. The remaining crosses exhibited significant negative, or non-significant positive or negative, values for these traits. It could be concluded that most of the crosses that exhibited highly significant positive heterosis relative to the better parent could be utilized in prospective cotton breeding programs for improving these traits.

Path analysis

The direct and indirect correlation coefficients regressed on seed cotton yield are presented in Fig. 2, and the path analysis diagram is shown in Fig. 3. Lint yield had the highest significant positive direct effect on seed cotton yield (r = 0.99), which implies that lint yield could be used as a marker for direct selection. Boll weight and number of bolls/plant also showed significant positive direct effects on SCY (r = 0.95 and r = 0.69, respectively). A significant direct effect on SCY was also recorded for BW (r = -0.59). The path coefficient analysis of the indirect and direct effects of the traits associated with seed cotton yield revealed that LCY (r = 0.57) had the highest indirect contribution to seed cotton yield, followed by NOBP (r = 0.41) and BW (r = 0.16), indicating the importance of these traits for SCY. These traits need to be considered simultaneously when selecting for yield improvement in cotton.
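The split of a correlation into direct and indirect effects can be sketched with the classical correlation-based form of path analysis: the direct effects p solve R p = r, where R holds the correlations among the predictor traits and r their correlations with the dependent trait, and the indirect effect of trait i through trait j is r_ij × p_j. Note that the analysis reported here was run with the lavaan package in R; the version below is a simplified stand-in, and the correlation values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def path_coefficients(r_xx, r_xy):
    """Return (direct, indirect) effects from predictor and outcome correlations."""
    direct = solve(r_xx, r_xy)
    n = len(direct)
    # indirect[i][j]: effect of trait i on the outcome routed through trait j.
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    return direct, indirect


# Hypothetical correlations for two predictor traits and one dependent trait.
r_xx = [[1.0, 0.5],
        [0.5, 1.0]]
r_xy = [0.8, 0.6]
direct, indirect = path_coefficients(r_xx, r_xy)
print(direct, indirect)
```

For each trait, the direct effect plus the sum of its indirect effects reproduces its correlation with the dependent trait, which is why a high correlation does not by itself imply a large direct effect.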

Fig. 2

Pearson's genotypic correlation coefficients between the traits. LCY, lint cotton yield; SCY, seed cotton yield; SI, seed index; BW, boll weight; NoBP, number of bolls per plant; L, lint percentage. Crossed cells indicate non-significant correlations and uncrossed cells indicate significant correlations by t-test at the 5% probability level

Fig. 3

Path diagram showing the direct effect of the 6 explanatory variables on seed cotton yield examined for parents, testers and F1 crosses that evaluated over two seasons of 2016 and 2017. Bidirectional arrows show correlation between the variables, and unidirectional arrows indicate a direct effect on the direction of the arrow, blue and red arrows represent positive and negative effects. Solid arrows indicate P < 0.05 and dashed arrows indicate P > 0.05. LCY, lint cotton yield; SCY, seed cotton yield; SI, seed index; BW, boll weight; NoBP, number of bolls per plant; L, lint percentage

Regression analysis

Figure 4 presents the regression coefficients for all attributes together with the coefficients of determination (R2), giving a graphic representation of the dependence of seed cotton yield (SCY) on key yield-related variables. LCY had the highest coefficient of determination (0.99), followed by No. B/P (0.95) and SI (0.69), while L % had the lowest coefficient of determination (0.37), followed by No. B/P (0.59). The regression coefficients of LCY on BW, No. B/P, L % and SI showed that a one-unit change in these traits was associated with a 59, 93, 50 and 70 percent change (increase or decrease), respectively, in lint cotton yield (the dependent variable). The regression coefficients of L % ranged from 0.24 (BW) to 0.38 (SI), whereas those of BW ranged from 0.39 (No. B/P) to 0.56 (SI).
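For readers who want to see how a regression coefficient and a coefficient of determination of this kind are obtained, the sketch below fits a simple least-squares line to paired observations. The data are hypothetical and the fit is an ordinary single-predictor regression, which is an assumption about the simplest panel type in Fig. 4 rather than a reproduction of the models used there.

```python
def linear_fit(x, y):
    """Least-squares fit of y = intercept + slope * x; returns slope, intercept, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical paired observations of a yield component (x) and yield (y).
x_obs = [1.0, 2.0, 3.0, 4.0]
y_obs = [1.0, 3.0, 2.0, 4.0]
slope, intercept, r2 = linear_fit(x_obs, y_obs)
print(round(slope, 2), round(intercept, 2), round(r2, 2))
```

In a single-predictor fit, R2 equals the squared Pearson correlation between the two traits, so it describes the share of variation explained and not the size of the change per unit.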

Fig. 4

Linear regression of the yield traits, LCY, lint cotton yield; SCY, seed cotton yield; SI, seed index; BW, boll weight; No. BP, number of bolls per plant; L, lint percentage, linear mixed-effects models (second to left-most column), and log-transformed mixed-effects regression (right-most column)

Discussion

In this study, a "line x tester" approach was used to develop 32 F1 hybrids by crossing eight cotton genotypes ("female parental lines") with four high-yielding testers. Cluster analysis showed how the lines and testers used to build heterotic pools were related in terms of the six yield attributes examined. Based on the cluster analysis findings, the twelve parental genotypes (female lines and testers) were crossed, resulting in the development of superior heterotic cotton hybrids. These results suggest that cluster analysis can help identify cotton genotypes suitable for crossbreeding. Similar findings were reported in earlier experiments [6, 46,47,48]. Because of the wide range in combining ability among the male parents studied, the four testers were classified into two distinct clusters. Genotypic performance was used to categorize the cotton genotypes into distinct groups, which may be used in cross-breeding to produce transgressive segregants in the early generations [49,50,51,52,53].

The F1 crosses, together with the twelve parent genotypes, were examined for six yield and yield component traits. Heterosis breeding can only succeed if the genetic diversity between female parent genotypes and tester lines can be accurately measured. ANOVA revealed substantial genetic differences among the 44 genotypes (P < 0.01) for all traits, so further analyses were performed to assess combining ability [41, 54,55,56]. In addition, highly significant differences were exhibited among female parental genotypes, testers, and their interaction for the studied traits. In practice, the combining ability of genotypes is dissected to discover genotypes with high genetic potential for developing cross combinations with desired traits and to study the action of genes involved in trait expression [33, 57,58,59,60]. Line X Tester analysis, a well-established approach based on biometrical genetics, allows essential quantitative traits to be better estimated and predicted [5, 22, 25, 54, 61,62,63,64].

Combining ability is measured via two genetic parameters, GCA and SCA, which may be controlled by the additive genetic effects and the non-allelic interactions of the parents, respectively [33, 65, 66]. In this investigation, both positive and negative GCA effects were exhibited by different genotypes among the female parental lines and testers, indicating the presence of good and poor combiners for specific traits. Female parental genotypes with a strong capacity to impart desired traits to their cross offspring could be used as valuable material to improve the traits of interest, as these genotypes have good general combining ability [9, 67, 68]. The significant GCA effects revealed in this study were consistent with previous investigations [24, 47, 69,70,71,72]. The female parental genotypes and testers (pollinator genotypes) with positive and significant GCA effects observed in the current study are of great importance, since crossing such good combiners would result in favorable hybrid combinations in subsequent segregating populations, improving the selection process for specific traits. Theoretically, high GCA effects could be attributed to additive gene effects or additive x additive gene interaction effects [33, 54, 73,74,75]. The highly significant GCA of female parental lines and testers for LCY noticed in this study further reveals the vital role of additive gene effects in this trait. It is noteworthy that good combiner parents for SCY were also shown to be good combiners for the majority of its yield components [76,77,78]. As lint yield is highly important in fiber crops, the female parents Suvin, G.89 X G.86 and TNB were the best general combiners for LCY and thus the most favorable genotypes among the female parents.

Having promising genotypes with excellent fiber quality is also urgently needed due to the increasing global demand for textile products and fierce competition from current synthetic fibers and textile industry technologies [70, 79,80,81]. As a result of this study, we have provided favorable crossing material for quality traits characterized with highly positive GCA effects recorded by both female parents and testers for different quality traits. This includes female parents Suvin, G.96, Pima S6 as well as G.86 and G.93 as testers for seed cotton yield, lint cotton yield, lint percentage, number of bolls/plant and seed index.

Lint yield is often negatively linked with fiber quality in cotton, an unfavorable association that has impeded cotton breeding efforts to enhance multiple fiber properties [12, 82, 83]. This highlights the need to design breeding programs around such parents that are promising for yield traits.

Furthermore, the findings of the current investigation showed that specific combining ability was highly significant for all yield traits, revealing the role of non-additive gene effects (dominance or epistasis) in controlling these traits. One of the most promising hybrids with respect to its specific combining ability for LCY, SCY, No. B/P, and SI is G.86 X (G.89 X G.86). Such promising hybrids could be selected for further recombination breeding programs based on their performance and significant specific combining ability. Nevertheless, not all F1 hybrid combinations showed positive SCA values for all the evaluated traits concurrently, indicating that the hybrid combinations with highly significant SCA for several traits had both parents with good GCA [7, 61, 84, 85].

In general, the variances of GCA and SCA indicate the magnitude of gene action, which further helps in developing an appropriate breeding strategy for future breeding programs [54, 86, 87]. Variances due to GCA effects (mean squares due to lines and testers) were lower than those due to SCA (mean squares due to lines x testers) for some traits such as LCY, SCY, SI, and NoBP, indicating that non-additive gene action (dominance or epistasis) played an important role in governing these traits. In contrast, GCA variances were greater than SCA variances for BW, L%, and SI, which reflects the importance of additive over non-additive genes in controlling these traits. These findings were consistent with those reported previously [8, 12, 68, 88,89,90,91].

The regression analysis was conducted to investigate the dependence between several variables. All the yield-related traits are correlated with each other such that an increase or decrease in one trait directly affects the others. Thus, estimating the associations among yield and yield components helps in choosing the most appropriate breeding methods [92]. Estimation of phenotypic correlation among the recorded traits showed that SCY had a significant and highly positive correlation with each of NBP and LCY, indicating that selection for these two traits in a yield improvement program will increase the lint yield. Similar patterns of correlation were reported in previous studies [3, 4, 15, 22, 23, 90, 92,93,94,95]. Boll weight showed a negative correlation with NB in all three environments [96]. Likewise, a significant and positive association between BW and NSB is highly beneficial, since an increase in boll weight will increase the number of seeds per boll, which in turn will result in increased surface area, enhancing the maximum lint percentage. In line with earlier investigations, the current study found negative correlations among yield-related and fiber quality traits. The correlations of SCY and LCY with fiber quality traits were non-significant [97,98,99].

Although essential, the correlation coefficient can be misleading about the relationship between two traits and is not necessarily an accurate measure of cause and effect. The strength of the correlation between two characters may be due to the influence of a third trait or a group of traits, so the correlation does not give the precise relative importance of the direct and indirect effects of the factors being studied [93]. For these reasons, path analysis was used in the current study. Strikingly, the largest direct effect of SCY on the dependent variable LCY in this study implied that SCY could be used as a marker to improve LCY through direct selection. These findings were in line with recent investigations [100, 101]. Furthermore, NBP showed the highest positive indirect contribution to lint cotton yield through SCY, followed by LI through SI and BW through SI, which pinpoints the importance of these traits because of their indirect role in the improvement of LCY. These results indicate that the traits should be considered carefully and simultaneously when selecting for yield improvement in cotton, and confirm that selection for LCY should also depend on such marker traits.

Conclusion

The parents G.94 and Pima S6, as well as the tester G.86, had the best means for all the traits. The crosses G.86 × (G.89 × G.86), G.93 × Suvin, and G.86 × Suvin were the elite genotypes for all studied traits. The parents Suvin, G.89 × G.86, and TNB had the most desirable GCA effects for SCY, LCY, BW, NOBP, and LI. Suvin, G.96, and Pima S6 had the most desirable GCA effects for L%. The cross G.86 × (G.89 × G.86) showed highly significant SCA effects for SCY, L%, BW, and NOBP, whereas the crosses G.86 × Suvin, Kar. × TNB, G.93 × Suvin, and G.93 × TNB had highly significant positive heterotic effects for all the studied traits. These crosses could therefore be recommended for use in future breeding programmes to improve both lint yield and fibre length.