Examining two sets of introgression lines across multiple environments reveals background-independent and stably expressed quantitative trait loci of fiber quality in cotton

Key message Background-independent (BI) and stably expressed (SE) quantitative trait loci (QTLs) were identified using two sets of introgression lines across multiple environments. Genetic background affected fiber quality traits more strongly than environmental factors. Sixty-one SE-QTLs, including two BI-QTLs, were novel, and 48 SE-QTLs, including seven BI-QTLs, were previously reported.
Abstract Cotton fiber quality traits are controlled by QTLs and are susceptible to environmental influence. Fiber quality improvement is an essential goal in cotton breeding but is hindered by limited knowledge of the genetic basis of fiber quality traits. In this study, two sets of introgression lines of Gossypium hirsutum × G. barbadense were used to dissect the QTL stability of three fiber quality traits (fiber length, strength and micronaire) across environments using 551 simple sequence repeat markers selected from our high-density genetic map. A total of 76 and 120 QTLs were detected in the CCRI36 and CCRI45 backgrounds, respectively. Nine BI-QTLs were found, and 78 (41.71%) of the detected QTLs were reported previously. Thirty-nine and 79 QTLs were SE-QTLs in at least two environments in the CCRI36 and CCRI45 backgrounds, respectively. Forty-eight SE-QTLs, including seven BI-QTLs, were confirmed in previous reports, and 61 SE-QTLs, including two BI-QTLs, were considered novel. These results indicate that genetic background has a stronger impact on fiber quality traits than environmental factors. Twenty-three clusters with BI- and/or SE-QTLs were identified, 19 of which harbored favorable alleles from G. barbadense for two or three fiber quality traits. This study is the first report using two sets of introgression lines to identify fiber quality QTLs across environments in cotton. It provides insights into the effect of genetic backgrounds and environments on the expression of fiber quality QTLs, as well as important information on the genetic basis of fiber quality traits for QTL cloning and molecular breeding.
Electronic supplementary material The online version of this article (10.1007/s00122-020-03578-0) contains supplementary material, which is available to authorized users.


Introduction
Communicated by Brent Hulke. Yuzhen Shi, Aiying Liu and Junwen Li have contributed equally to this work.
Cotton is an important economic crop worldwide that produces natural fibers used in the textile industry. It is essential that fiber quality is improved in order to keep pace with the development of spinning technology and cotton harvesting mechanization. However, the narrow genetic variation in Upland cotton limits the improvement in cotton varieties (Qin et al. 2008). It has been a long-term challenge for cotton breeders to improve fiber quality and yield to meet the needs of cotton producers and the textile industry. Cotton (Gossypium spp.) contains 52 species, including two important cultivated tetraploid species: G. hirsutum (Upland cotton), with a high fiber yield, wide adaptability and medium fiber quality, and G. barbadense (Sea-Island, Egyptian or Pima cotton), with a low fiber yield and narrow adaptability but high fiber quality (Shi et al. 2016).
Therefore, introducing desirable genes from G. barbadense into Upland cotton cultivars and mapping the quantitative trait loci (QTLs) for fiber quality traits transferred from G. barbadense using introgression lines or chromosome segment substitution lines in the Upland cotton background could facilitate improvements in the fiber quality of Upland cotton.
Fiber length (FL), strength (FS) and micronaire (FM) are the three most important traits for evaluating fiber quality. Fiber quality traits are complex quantitative traits controlled by multiple genes and are susceptible to environmental impacts. Therefore, the use of traditional breeding methods alone for fiber quality breeding is neither accurate nor efficient. Molecular marker-assisted selection (MAS) is a fast and effective method for improving Upland cotton fiber quality.
In the past 20 years, researchers have identified a large number of QTLs related to fiber quality in G. barbadense using interspecific segregating populations of G. hirsutum × G. barbadense or natural populations of G. barbadense (Abdullaev et al. 2017; Said et al. 2015a), but most of the mapping populations are early segregating populations of G. hirsutum × G. barbadense such as F2 populations (Jiang et al. 1998; Kohel et al. 2001; Lin et al. 2005; Mei et al. 2004; Paterson et al. 2003), F2:3 populations (He et al. 2007) and early backcross generation populations (BC1, BC2, BC2S1) (Lacape et al. 2005; Shi et al. 2015, 2016). A few mapping populations are recombinant inbred line (RIL) populations (Lacape et al. 2009, 2010) or backcross introgression line (BIL) populations (Nie et al. 2015; Yu et al. 2013), as well as natural populations of G. barbadense (Abdullaev et al. 2017; Wang et al. 2013a). Due to the complex background of these mapping populations, it is difficult to accurately identify and precisely locate QTLs (Islam et al. 2016). Therefore, most of the QTL mapping results cannot be applied to the genetic improvement in Upland cotton.
Introgression lines (ILs), also known as chromosome segment introgression lines (CSILs) or chromosome segment substitution lines (CSSLs), are constructed by hybridization, backcrossing, self-pollination and MAS. Only the introgressed segment differs between a CSSL and its recipient parent. Because the CSSLs in one set share the same or a similar genetic background and differ only in a specific genetic region on one chromosome (thus eliminating the influence of complex genetic backgrounds), CSSLs are ideal materials for studying quantitative traits and QTL mapping in crops. The construction and utilization of CSSLs have been widely reported in tomato, rice, maize and other crops (Balakrishnan et al. 2019; Bouchez et al. 2002; Monforte and Tanksley 2000; Okada et al. 2018; Qi et al. 2013; Qiu et al. 2017).
In cotton, chromosome substitution lines (CSLs) were constructed first, in which a pair of chromosomes or chromosome arms of Upland TM-1 (the recipient parent) was replaced by those of 3-79 (the donor parent, G. barbadense) (Stelly et al. 2005). CSLs differ from CSSLs or ILs in that a pair of recipient parent chromosomes or chromosome arms is replaced by the corresponding pair from the donor parent, whereas CSSLs or ILs contain one or a few chromosome segments from the donor parent in the recipient parent background. Many researchers have evaluated and performed genetic studies on CSLs (Saha et al. 2013; Wu et al. 2006), and CSSLs have been gradually constructed and used. Wang et al. (2008) constructed a set of CSSLs using the standard genetic line TM-1 as the recipient parent and Hai7124 as the donor parent through backcrossing and then used them to map QTLs for fiber quality. A QTL for FS, qFS-D11-1, was fine mapped using the ILs of TM-1 (G. hirsutum L.) × H102 (G. barbadense L.) (Su et al. 2013), and a QTL for FL, qFL-chr1, was fine mapped and analyzed using near-isogenic introgression lines (NIILs) of Upland Tamcot 2111 (G. hirsutum, the recurrent parent) × Pima S-6 (G. barbadense, the donor parent). Some QTLs related to fiber quality in G. barbadense were identified based on segregating populations of derived progenies of one IL (Wang et al. 2011, 2016). To date, most of the CSLs, ILs or CSSLs reported in cotton have been developed in the obsolete TM-1 genetic background. Although many QTLs for fiber quality traits have been detected in cotton, few of them have been used in MAS in breeding (Cao et al. 2014). This may be due to inaccurate QTL mapping in populations with complex genetic backgrounds, the use of different environments or the use of different genetic backgrounds in QTL mapping populations and breeding populations. In addition, applying the QTLs identified in mapping populations to breeding populations is a challenge.
To date, the genetic background effects on QTL expression have not been reported.
To transfer beneficial genes from G. barbadense into Upland cotton cultivars, we constructed two sets of CSSLs with two different Upland cotton genetic backgrounds in which Hai1 (G. barbadense) was the donor parent and CCRI36 and CCRI45 (G. hirsutum) were the recipient parents (Li et al. 2019a; Lu et al. 2017; Ma et al. 2013; Yang et al. 2009). Some of the CSSLs were genetically evaluated, and some QTLs for yield and fiber quality were identified using secondary segregating populations derived from one, two or four CSSLs as parents (Li et al. 2019b; Song et al. 2017; Zhai et al. 2016). With the rapid development of biotechnology, multiple cotton genomes have been sequenced, providing a foundation for further cotton gene identification and molecular breeding at the genome level (Hu et al. 2019; Li et al. 2015; Wang et al. 2019).
In this paper, two sets of CSSLs were evaluated and used to dissect the genetic basis of the stability of cotton fiber quality traits across multiple environments and multiple genetic backgrounds, including the identification of more BI- and/or SE-QTLs for fiber quality traits, thus providing new and important stable QTLs with known genomic segments for fine gene mapping, gene cloning and molecular breeding. To the best of our knowledge, this study represents the first report using two sets of CSSLs with different genetic backgrounds but the same donor parent to dissect the stability of QTLs affecting fiber quality traits and to compare and analyze the influence of genetic backgrounds and environments on the expression of fiber quality QTLs in cotton.

Development of two sets of cotton CSSLs and multi-environment field experiments
Two sets of CSSLs were derived from two interspecific crosses with Hai1 (G. barbadense) as the donor parent and CCRI36 or CCRI45 (G. hirsutum) as the recipient parent. CCRI45 (also called CCRI221) is a late-maturing Upland cotton (G. hirsutum) cultivar, and CCRI36 is an early-maturing Upland cotton cultivar; both cultivars have high yield and were bred by the Institute of Cotton Research (ICR), the Chinese Academy of Agricultural Sciences (CAAS), Anyang, Henan Province. Hai1 is a cultivated line of G. barbadense with very high fiber quality.
First, two crosses (resulting in two F1 populations) were performed, with Hai1 as the male parent and CCRI36 or CCRI45 (G. hirsutum) as the female parent. Subsequently, BC5F3 populations with the CCRI36 background were obtained by five generations of successive backcrossing (with CCRI36 as the recurrent parent) and two generations of self-pollination and MAS (Li et al. 2019a). Similarly, BC4F3 populations with the CCRI45 background were obtained by four generations of backcrossing (with CCRI45 as the recurrent parent) and two generations of self-pollination and MAS (Yang et al. 2009; Li et al. 2016).
On the basis of the above design, one subpopulation was randomly selected from each genetic background: 408 CSSLs in the CCRI36 background, named 36Pop, and 332 CSSLs in the CCRI45 background, named 45Pop.
Subsequently, 36Pop was evaluated in a total of five environments as follows. In 2011, individual CSSLs in 36Pop (BC5F3:5) and their recurrent parent were grown in three environments at three different locations: Anyang in Henan Province (11HNA), Liaoyang in Liaoning Province (11LNL) and Shihezi in Xinjiang Autonomous Region (11XJS). In 2014, the same population and the recurrent parent were further evaluated at the Shihezi experimental farm of the ICR, CAAS, and at the experimental field of the Cotton Research Institute of the Xinjiang Academy of Agricultural and Reclamation Science in Xinjiang Autonomous Region (14XJS and 14XJN, respectively).
45Pop was evaluated in a total of seven environments as follows. Individual CSSLs in 45Pop (BC4F3:5) and their recurrent parent were grown in seven environments at four locations.
A randomized incomplete-block design with two replicates was adopted in all the environments except 15AY, in which a single replicate was used. In each environment, the recurrent parent, used as a control, was planted at intervals with 19 CSSLs. Single-row plots with a 5 m length and 0.8 m width were used in 11HNA and 15AY, whereas single-row plots with a 5 m length and 1 m width were used in 14HNZ and 15HNZ. Two-row plots with a 3 m length and 0.4 m width between the rows were used in 11LNL, whereas two-narrow-row plots with a 3 m length and 0.2 m width between the narrow rows, a plastic membrane cover and a wide/narrow row-spacing pattern (a 0.6 m width between two wide rows) were adopted in 11XJS, 11XJK, 14XJS, 14XJN, 14XJK and 14XJA.
Local field management practices were carried out in each of the environments or locations.

Evaluation of phenotypic traits
Naturally opened bolls were collected from the BC5F3 individuals of CCRI36 × Hai1 and BC4F3 individuals of CCRI45 × Hai1 in 09AY, and 30 naturally opened bolls (from the middle of plants) were harvested from every plot (row) in the other environments (10HNA, 11HNA, 11LNL, 11XJS, 11XJK, 14XJN, 14XJS, 14XJA, 14XJK, 14HNA, 15HNA and 15HNZ) to test three important fiber quality parameters: fiber strength (FS, cN/tex), fiber length (FL, mean upper-half length, mm) and fiber micronaire value (FM, an integrated fiber quality parameter of fineness and maturity, unit). Measurements were made using HFT9000 (Premier Evolvics Pvt. Ltd, India) instruments with HVICC calibration at the Cotton Quality Supervision, Inspection and Testing Center, Ministry of Agriculture, Anyang, China.

Molecular markers and genotype detection
Genomic DNA was extracted from young leaves of the CSSLs and their parents using a modified cetyl trimethylammonium bromide (CTAB) method (Paterson et al. 1993). The details for PCR amplification, PCR product electrophoresis and silver staining were the same as in the report of Sun et al. (2012).
Based on the genetic linkage map comprising 2292 marker loci distributed on 26 chromosomes and covering almost the whole cotton AD genome (5115.16 cM) with an average marker interval of 2.23 cM, 551 simple sequence repeat (SSR) markers with an average interval of 10 cM between two markers were selected for the screening of genotypes in two sets of CSSLs. Chromosome (C) 4 had the fewest markers (11 SSRs), while C11, C19 and C21 had the most (30 SSRs). The longest genetic distance between two markers was 25.99 cM, and the shortest was 0.45 cM. The details of selected markers and their adjacent markers on the genetic map are in Table S1. The sequences of each primer used in this report can be downloaded at http://www.cottonmarker.org and were synthesized by Bioethics Engineering Co., Ltd (Shanghai).
Descriptive statistical analysis, correlation analysis and analysis of variance (ANOVA) were performed using SPSS 20.0 software (SPSS, Chicago, IL, USA). Genotypic analysis of the populations and analysis of the introgressed chromosome segments (including the background recovery rate of the CSSLs and the number and length of introgressed segments) were performed using GGT 2.0 software developed by van Berloo (http://www.plantbreeding.wur.nl/UK/software_ggt.html) (van Berloo 2008).
QTL mapping was performed with QTL IciMapping (version 4.0), and the RSTEP-LRT-ADD mapping method was applied with a logarithm of odds (LOD) threshold of 2.5 (Wang et al. 2019).
The QTLs were named as follows: (q + trait abbreviation) + chromosome/linkage group + QTL number. QTLs for the same trait in different environments and populations were considered stable when their confidence intervals overlapped (Sun et al. 2012).
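The naming rule and the overlap criterion described above can be illustrated with a minimal Python sketch. The `QTL` class, its field names and the interval positions used in the example are hypothetical and chosen only for illustration; they are not taken from the mapping software or from Table S3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    """Hypothetical record for one detected QTL (field names are assumptions)."""
    trait: str        # trait abbreviation: "FL", "FS" or "FM"
    chromosome: int   # chromosome number (1-26)
    number: int       # serial number of the QTL on that chromosome
    ci_start: float   # start of the confidence interval (cM)
    ci_end: float     # end of the confidence interval (cM)

    @property
    def name(self) -> str:
        # q + trait abbreviation + chromosome + QTL number, e.g. qFS-C7-4
        return f"q{self.trait}-C{self.chromosome}-{self.number}"


def intervals_overlap(a: QTL, b: QTL) -> bool:
    """True when the two confidence intervals share at least one position."""
    return a.ci_start <= b.ci_end and b.ci_start <= a.ci_end


def same_locus(a: QTL, b: QTL) -> bool:
    """Same trait, same chromosome and overlapping confidence intervals."""
    return (a.trait == b.trait
            and a.chromosome == b.chromosome
            and intervals_overlap(a, b))


# Illustrative positions only (not real map positions).
env1 = QTL("FS", 7, 4, ci_start=40.0, ci_end=52.5)
env2 = QTL("FS", 7, 4, ci_start=48.0, ci_end=60.0)
env3 = QTL("FS", 7, 5, ci_start=75.0, ci_end=90.0)
```

Under these assumed positions, `env1` and `env2` would be treated as the same stable QTL, whereas `env3` would be a separate locus on the same chromosome.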

Evaluation of CSSLs and fiber quality
The results of the descriptive statistical analysis of each trait in the different populations are shown in Table 1. The average FL, FS and FM values of the recurrent parent were generally consistent with those of the corresponding population in all environments. The FL of the populations was slightly greater than that of the recurrent parents in all environments except 14XJN; the FS of the populations was slightly higher than that of the recurrent parents in all environments except 10AY; and the FM value of the populations was slightly lower than that of the recurrent parents in all environments. The average FL, FS and FM values of the CCRI36 and CCRI45 recurrent parents were similar in all environments, with medium fiber quality. The descriptions of the statistical analysis of quality for 45Pop in 09HNA, 10HNA, 11HNA and 11XJK follow those reported by Ma et al. (2013).
In all environments, the range and coefficient of variation of each trait in all the populations were large. For the same population in all environments, the variation in FM was the greatest among the three traits, and the variation in FS was greater than that in FL.
These results indicate abundant genetic variation in the CSSL populations produced by advanced backcrossing and continuous self-crossing.
The absolute skewness of all traits in all populations and environments was less than 1, indicating that the traits approximately followed a normal distribution (Table 1, Fig. S1).
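As an illustration of the skewness criterion, the following Python sketch computes the simple moment-based skewness coefficient and applies the |skewness| < 1 rule. It is an assumption-laden example rather than the procedure used in SPSS, which may apply a small-sample adjustment; the function names and the test values are hypothetical.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def roughly_normal(values, limit=1.0):
    """Apply the rule of thumb used in the text: |skewness| below `limit`."""
    return abs(sample_skewness(values)) < limit


# Hypothetical trait values, for illustration only.
symmetric = [27.5, 28.0, 28.5, 29.0, 29.5]
right_skewed = [1.0, 1.0, 1.0, 10.0]
```

A symmetric sample such as `symmetric` has a skewness of zero and passes the check, whereas a sample dominated by one extreme value, such as `right_skewed`, does not.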
The correlation coefficients of each trait between different environments were significant (Table S2), indicating that these materials were stable across multiple environments. Most of the correlation coefficients among environments for FL and FM were larger than 0.5, whereas those for FS were smaller than 0.5, indicating that FS was more greatly affected by the environment than FL and FM.

ANOVA revealed highly significant effects of genotype (G), the environment (E) and the interaction between genotype and the environment (G × E interaction) on all three traits in the populations (Table 2). The broad-sense heritability values, calculated by partitioning the variance into genetic and G × E effects, were above 85% for all three traits.
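The exact heritability formula is not given here. A commonly used entry-mean estimator for multi-environment trials is H² = σ²G / (σ²G + σ²GE/e + σ²ε/(e·r)), where e is the number of environments and r the number of replicates. The Python sketch below implements this estimator under that assumption; the function name and the variance components in the example are hypothetical and are not values from Table 2.

```python
def broad_sense_heritability(var_g, var_ge, var_error, n_env, n_rep):
    """Entry-mean broad-sense heritability (assumed estimator, see lead-in).

    var_g     : genotypic variance component
    var_ge    : genotype-by-environment variance component
    var_error : residual variance component
    n_env     : number of environments
    n_rep     : number of replicates per environment
    """
    phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    return var_g / phenotypic


# Hypothetical variance components for five environments and two replicates.
h2 = broad_sense_heritability(var_g=9.0, var_ge=2.0, var_error=4.0,
                              n_env=5, n_rep=2)
```

With these illustrative inputs the G × E and residual components are each divided down by the number of environments and replicates, so the estimate is dominated by the genotypic variance.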
Through evaluation and analysis across multiple environments, some CSSLs with excellent and stable fiber quality were screened out (Lu et al. 2017). These materials with outstanding fiber quality can be used in cotton breeding and gene identification and cloning.

Genotype analysis
The maximum length of the introgressed segments from Hai1 in each individual in the 36Pop population was 488.2 cM; the minimum length was 4.5 cM, and the average length was 125.80 cM. The percent return to the background of the recurrent parent in the population was 90.5-99.8%, with an average of 97.5%. The number of introgressed Hai1 segments was generally 5-20, and the length of the introgressed Hai1 segments was mainly between 30 and 210 cM (Fig. 1). The maximum length of the introgressed segments from Hai1 in each individual in the 45Pop population was 514.1 cM; the minimum length was 94.5 cM, and the average length was 125.80 cM. The percent return to the background of the recurrent parent in the population was 90.5-99.8%, with an average of 97.5%. The number of introgressed Hai1 segments was generally 5-20, and the length of the introgressed Hai1 segments was mainly between 150 and 390 cM (Fig. 1).
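To make the quantities above concrete, the following Python sketch approximates the background recovery rate and the number of introgressed segments from ordered marker calls. It is a simplified, marker-count-based illustration and not the algorithm implemented in GGT 2.0, which works with map distances. The genotype coding ("R" for the recurrent-parent allele, "D" for the donor allele, "H" for heterozygous) and the example calls are assumptions.

```python
from itertools import groupby


def background_recovery(calls):
    """Percentage of marker calls matching the recurrent parent ('R')."""
    all_calls = [c for chrom_calls in calls.values() for c in chrom_calls]
    return 100.0 * sum(c == "R" for c in all_calls) / len(all_calls)


def count_introgressed_segments(calls):
    """Count runs of consecutive non-'R' markers within each chromosome."""
    segments = 0
    for chrom_calls in calls.values():
        # groupby splits the ordered calls into alternating runs of
        # recurrent-parent and non-recurrent-parent markers.
        for is_introgressed, _run in groupby(chrom_calls, key=lambda c: c != "R"):
            if is_introgressed:
                segments += 1
    return segments


# Hypothetical marker calls for one line, ordered by map position.
line_calls = {
    "C7": ["R", "D", "D", "R", "R"],
    "C16": ["R", "R", "H", "R", "D"],
}
```

In this example six of ten markers carry the recurrent-parent allele, and the line carries three introgressed segments (one on C7 and two on C16).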

Analysis of QTLs in CSSL populations in the CCRI36 background
In the CCRI36 background, a total of 76 QTLs were identified, with a phenotypic variation explained (PVE) of 2.77-13.91% for all three fiber quality traits, 39 of which were detected in at least two environments (Fig. 2, Table S3).
Six of 26 FS-QTLs were detected in at least two environments: Two QTLs (qFS-C7-4 and qFS-C16-4) were detected in four environments with a PVE of 3.19-4.21%, and four QTLs were detected in two environments. Five of the six stable QTLs had positive additive effects, and the Hai1 alleles increased FS.

Analysis of QTLs in CSSL populations in the CCRI45 background
In the CCRI45 background, a total of 120 QTLs were identified with a PVE ranging from 3.40 to 22.94% for all three fiber quality traits, 79 of which were detected in at least two environments (Fig. 2, Table S3).
FL: A total of 49 FL-QTLs were detected on 18 chromosomes in the CSSL populations with the CCRI45 background, while no FL-QTL was detected on C3, C4, C8, C9, C11, C18, C23 and C24. C12 and C14 contained the most FL-QTLs (5-6 QTLs). The PVE by these QTLs ranged from 3.12 to 19.95%; 34 of 49 QTLs had positive additive effects, and the Hai1 alleles increased FL.

QTLs detected in both genetic backgrounds simultaneously
Among the above QTLs, nine background-independent QTLs (BI-QTLs) were simultaneously detected in both backgrounds, including four FL-QTLs, four FS-QTLs and one FM-QTL (Table 3, Table S3).
Among the four FL-QTLs, qFL-C7-3 near NAU1085 on C7 was detected in six environments in the CCRI36 background and in five environments in the CCRI45 background; qFL-C15-1 near NAU3177 on C15 was detected in five environments in the CCRI36 background and in eight environments in the CCRI45 background; qFL-C16-2 near BNL2634 on C16 was detected in eight environments in the CCRI45 background and in one environment in the CCRI36 background; and qFL-C2-6 near NAU2277 on C2 was detected in four environments in the CCRI45 background and in one environment in the CCRI36 background. The Hai1 alleles in the four FL-QTLs all increased FL. Among the four FS-QTLs, qFS-C7-4 near NAU1085 on C7 was detected in four environments in the CCRI36 background and in three environments in the CCRI45 background; qFS-C16-3 near BNL2634 on C16 was detected in two environments in the CCRI36 background and in three environments in the CCRI45 background; qFS-C17-3 near HAU0195a on C17 was detected in two environments in each background; and qFS-C15-3 near NAU3177 on C15 was detected in four environments in the CCRI45 background and in one environment in the CCRI36 background. The Hai1 alleles in the four FS-QTLs increased FS.
Only one FM-QTL (qFM-C17-2) near NAU2909 on C17 was detected in six environments in the CCRI36 background and in two environments in the CCRI45 background.

Fiber quality QTL clusters
A QTL cluster was defined as a QTL-rich region that contained two or more QTLs for different traits within a common confidence region. Some of the QTLs formed

Characteristics of the materials used in this study
CSSLs are valuable genetic resources for basic and applied research on the improvement in complex traits (Balakrishnan et al. 2019). The materials used in this paper were CSSLs with the Upland cotton background and one or more introgressed segments from G. barbadense. Only the introgressed segments differed between the CSSLs and their recipient parents. A set of CSSLs, which had the same or a similar genetic background and differed only in a specific genetic region, can eliminate the influence of a complex genetic background, making CSSLs ideal materials for researching quantitative trait inheritance and gene identification in crops and advantageous in the identification of QTLs. The CSSLs are similar to their recurrent parents in terms of field-observed phenotypes but with one or more specific traits of G. barbadense (Ma et al. 2013;Li et al. 2019b).

In this paper, the CSSLs exhibited high genetic diversity in fiber quality traits (Table 1, Fig. S1). Through evaluation in multiple environments, some stable and high-quality lines were obtained (Lu et al. 2017). These CSSLs enriched our understanding of the genetic basis of traits in Upland cotton and will serve as useful materials for further QTL/gene fine mapping and genetic improvement in fiber quality in breeding.

QTLs for fiber quality traits
Cotton fiber quality traits are very important traits that are largely affected by both genetic backgrounds and environmental factors. In the present study, a total of 76 QTLs (29 FL-QTLs, 26 FS-QTLs and 21 FM-QTLs) and 120 QTLs (49 FL-QTLs, 52 FS-QTLs and 19 FM-QTLs) were detected in the two sets of CSSLs of the CCRI36 and CCRI45 backgrounds, respectively (Fig. 2, Table S3). Nine QTLs (four FL-QTLs, four FS-QTLs and one FM-QTL) were simultaneously detected in both backgrounds (Table 3, Table S3). Thus, a total of 187 QTLs were identified in this study, including 74 FL-QTLs, 39 FM-QTLs and 74 FS-QTLs.
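The totals above follow from counting each QTL shared by both backgrounds only once. The short Python check below reproduces this arithmetic using only the counts stated in the text; the variable names are illustrative.

```python
# QTL counts per trait as stated in the text.
ccri36 = {"FL": 29, "FS": 26, "FM": 21}   # 76 QTLs in the CCRI36 background
ccri45 = {"FL": 49, "FS": 52, "FM": 19}   # 120 QTLs in the CCRI45 background
shared = {"FL": 4, "FS": 4, "FM": 1}      # 9 QTLs detected in both backgrounds

# A QTL found in both backgrounds is counted once in the overall total.
unique = {trait: ccri36[trait] + ccri45[trait] - shared[trait]
          for trait in ccri36}
total_unique = sum(unique.values())

# Share of all QTLs that were detected in both backgrounds (percent).
bi_percent = round(100 * sum(shared.values()) / total_unique, 2)
```

The per-trait totals come to 74 FL-QTLs, 74 FS-QTLs and 39 FM-QTLs, giving 187 QTLs overall, of which the nine shared QTLs make up 4.81%.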
In the present study, more QTLs were located on the D subgenome than the A subgenome (29 on the A subgenome and 47 on the D subgenome in the CCRI36 background, and 50 on the A subgenome and 70 on the D subgenome in the CCRI45 background), which was consistent with most previous reports (Jiang et al. 1998; Lacape et al. 2010; Paterson et al. 2003; Said et al. 2013; Yang et al. 2015).

Effects of genetic backgrounds and environments on the expression of QTLs
Fiber quality traits in cotton are complex quantitative traits. One of the difficulties in improving complex traits is the environmental sensitivity of the identified QTLs. The percentages of SE-QTLs for the three traits (FL, FS and FM) were 60.00% and 49.00% in previous papers reported by Sun et al. (2012) and Jamshed et al. (2016), respectively. In the present paper, the overall percentage of SE-QTLs was 62.50%. These results were consistent with the results of previous reports (Jamshed et al. 2016), indicating that environmental factors have a large influence on fiber quality traits.
To date, there has been no report on the effect of genetic background on the QTL expression of fiber quality in cotton. In the present study, among the 187 QTLs detected overall, only nine (4.81%) were BI-QTLs, which indicated that genetic background has a strong influence on fiber quality traits in cotton. In rice, some studies show that the expressions of the QTLs for complex traits are strongly affected by genetic background (Qiu et al. 2017;Wan et al. 2005;Zhao et al. 2016;Zheng et al. 2011). The consistency of the QTLs among different genetic backgrounds in rice is relatively low for complex traits, such as appearance quality (14.5%) (Qiu et al. 2017), salt tolerance (15.4%) (Cheng et al. 2012) and drought tolerance (17.9%) (Wang et al. 2013b). Our results are consistent with these reports on other traits in rice.
Comparatively, the percentage of BI-QTLs was much lower than that of SE-QTLs in this study, indicating that genetic background has a stronger impact than the environment on fiber quality. For this reason, breeders should pay close attention to the effects of different genetic backgrounds as well as different environments when QTL information is used in molecular breeding for fiber quality traits.
This article is distributed under a Creative Commons licence. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.