Theoretical and Applied Genetics, Volume 132, Issue 4, pp 989–999

Whole genome sequencing of a MAGIC population identified genomic loci and candidate genes for major fiber quality traits in upland cotton (Gossypium hirsutum L.)

  • Gregory N. Thyssen
  • Johnie N. Jenkins
  • Jack C. McCarty
  • Linghe Zeng
  • B. Todd Campbell
  • Christopher D. Delhom
  • Md. Sariful Islam
  • Ping Li
  • Don C. Jones
  • Brian D. Condon
  • David D. Fang (corresponding author)
Original Article


Key message

Significant associations between candidate genes and six major cotton fiber quality traits were identified in a MAGIC population using GWAS and whole genome sequencing.


Upland cotton (Gossypium hirsutum L.) is the world’s major renewable source of fibers for textiles. To identify causative genetic variants that influence the major agronomic measures of cotton fiber quality, which are used to set discount or premium prices on each bale of cotton in the USA, we measured six fiber phenotypes from twelve environments, across three locations and 7 years. Our 550 recombinant inbred lines were derived from a multi-parent advanced generation intercross population and were whole-genome-sequenced at 3× coverage, along with the eleven parental cultivars at 20× coverage. The segregation of 473,517 single nucleotide polymorphisms (SNPs) in this population, including 7506 non-synonymous mutations, was combined with phenotypic data to identify seven highly significant fiber quality loci. At these loci, we found fourteen genes with non-synonymous SNPs. Among these loci, some had simple additive effects, while others were only important in a subset of the population. We observed additive effects for elongation and micronaire, when the three most significant loci for each trait were examined. In an informative subset where the major multi-trait locus on chromosome A07:72-Mb was fixed, we unmasked the identity of another significant fiber strength locus in gene Gh_D13G1792 on chromosome D13. The micronaire phenotype only revealed one highly significant genetic locus at one environmental location, demonstrating a significant genetic by environment component. These loci and candidate causative variant alleles will be useful to cotton breeders for marker-assisted selection with minimal linkage drag and potential biotechnological applications.
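The additive effects described above for the three most significant loci of a trait can be illustrated with a minimal sketch. It assumes each inbred line carries either the favorable (1) or unfavorable (0) allele at each locus, groups lines by their number of favorable alleles, and fits a least-squares slope. The function names and the toy values are hypothetical and are not taken from the study; the authors' actual analysis may differ.

```python
from collections import defaultdict
from statistics import mean


def favorable_allele_counts(genotypes):
    """Count favorable alleles (coded 1) per line across the loci examined."""
    return [sum(g) for g in genotypes]


def mean_by_count(counts, phenotypes):
    """Mean phenotype for each favorable-allele count."""
    groups = defaultdict(list)
    for count, value in zip(counts, phenotypes):
        groups[count].append(value)
    return {count: mean(values) for count, values in sorted(groups.items())}


def additive_slope(counts, phenotypes):
    """Ordinary least-squares slope of phenotype on favorable-allele count."""
    cx, cy = mean(counts), mean(phenotypes)
    sxy = sum((c - cx) * (y - cy) for c, y in zip(counts, phenotypes))
    sxx = sum((c - cx) ** 2 for c in counts)
    return sxy / sxx


# Toy data, made up for illustration only (NOT measurements from the study).
# Each tuple holds the allele state at three hypothetical loci for one line.
toy_genotypes = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
toy_phenotypes = [5.0, 5.5, 5.5, 6.0, 6.0, 6.5]

counts = favorable_allele_counts(toy_genotypes)
group_means = mean_by_count(counts, toy_phenotypes)
slope = additive_slope(counts, toy_phenotypes)
print(group_means, slope)
```

Under a purely additive model, the group means rise by roughly the same amount with each added favorable allele, so the slope approximates the average per-allele effect. A locus that matters only in a subset of the population would break this pattern.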



This research was funded mainly by the USDA Agricultural Research Service CRIS Projects 6054-21000-017-00D (GNT, DDF), 6064-21000-016-00D (JNJ, JCM), 6066-21000-051-00D (LZ), and 6082-21000-008-00D (BTC). Additional funding was provided by Cotton Incorporated Projects 10-747 and 15-751 awarded to DDF, and Project 09-541 awarded to JNJ. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA, which is an equal opportunity provider and employer.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Availability of data and materials

All relevant data reported in this paper are within the paper and its online supplementary files.

Supplementary material

122_2018_3254_MOESM1_ESM.tif (1.8 mb)
Fig. S1. VanRaden kinship matrix of 550 RILs in the MAGIC population. Color key and histogram of values is shown at top left (TIFF 1817 kb)
122_2018_3254_MOESM2_ESM.pdf (13.7 mb)
Fig. S2. GWAS Manhattan plots for each of six traits at each of twelve location-years. Chromosomes are labeled 1–13 for Chr. A01–A13, and 14–26 for D01–D13. Location and trait abbreviations: Stoneville, MS, USA (STV), Starkville, MS, USA (MSU), Florence, SC, USA (FLO), elongation (ELO), micronaire (MIC), short fiber index (SFI), fiber strength (STR), upper half mean length (UHML), and uniformity index (UI) (PDF 14026 kb)
122_2018_3254_MOESM3_ESM.tif (197 kb)
Fig. S3. Violin plot of MIC values for RILs based on genotypes at the three loci discussed in the text. Genotypes are presented along the horizontal axis, with the high-MIC haplotype indicated with a dark gray rectangle and the low-MIC haplotypes with a light gray rectangle. The number (N) of RILs in each group is indicated. See also Fig. 6. For pairwise t test p values see Table S10 (TIFF 197 kb)
122_2018_3254_MOESM4_ESM.xlsx (3.2 mb)
Supplementary material 4 (XLSX 3229 kb)
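Fig. S1 refers to a VanRaden kinship matrix. As a rough illustration, the sketch below computes the commonly used form G = ZZ' / (2 Σ p(1 − p)), where Z holds alternate-allele counts (0, 1, 2) centered by twice the allele frequency p. The function name and toy marker matrix are hypothetical, and the code assumes complete genotypes with no monomorphic markers; the software and settings the authors used may differ.

```python
def vanraden_kinship(markers):
    """Kinship matrix from rows of 0/1/2 alternate-allele counts (one row per line).

    Assumes no missing calls and at least one polymorphic marker;
    if every marker were monomorphic, the scaling term would be zero.
    """
    n_lines = len(markers)
    n_snps = len(markers[0])
    # Alternate-allele frequency at each marker
    freqs = [sum(row[j] for row in markers) / (2 * n_lines) for j in range(n_snps)]
    # Center each genotype by its expected value, 2p
    centered = [[row[j] - 2 * freqs[j] for j in range(n_snps)] for row in markers]
    # Scale by the summed expected marker variance
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [
        [sum(a * b for a, b in zip(row_i, row_k)) / scale for row_k in centered]
        for row_i in centered
    ]


# Toy marker matrix, made up for illustration only (NOT data from the study):
# four fully inbred lines (genotypes 0 or 2) scored at four markers.
toy_markers = [
    [0, 2, 0, 2],
    [0, 2, 2, 0],
    [2, 0, 0, 2],
    [2, 0, 2, 0],
]
kinship = vanraden_kinship(toy_markers)
for row in kinship:
    print(row)
```

In this toy case the diagonal values are 2.0, which is what fully inbred lines are expected to show under this scaling, and lines with opposite genotypes at every marker receive a negative value.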



Copyright information

© This is a U.S. government work and its text is not subject to copyright protection in the United States; however, its text may be subject to foreign copyright protection 2018

Authors and Affiliations

  • Gregory N. Thyssen (1, 2)
  • Johnie N. Jenkins (3)
  • Jack C. McCarty (3)
  • Linghe Zeng (4)
  • B. Todd Campbell (5)
  • Christopher D. Delhom (6)
  • Md. Sariful Islam (7)
  • Ping Li (1)
  • Don C. Jones (8)
  • Brian D. Condon (2)
  • David D. Fang (1, corresponding author)
  1. Cotton Fiber Bioscience Research Unit, USDA-ARS-SRRC, New Orleans, USA
  2. Cotton Chemistry and Utilization Unit, USDA-ARS-SRRC, New Orleans, USA
  3. Genetics and Sustainable Agriculture Research Unit, USDA-ARS, Mississippi State, USA
  4. Crop Genetics Research Unit, USDA-ARS, Stoneville, USA
  5. Coastal Plain Soil, Water and Plant Conservation Research Unit, USDA-ARS, Florence, USA
  6. Cotton Structure and Quality Research Unit, USDA-ARS-SRRC, New Orleans, USA
  7. Sugarcane Production Research Unit, USDA-ARS, Canal Point, USA
  8. Cotton Incorporated, Cary, USA
