Theoretical and Applied Genetics, Volume 127, Issue 2, pp 283–295

Genetic diversity and population structure in the US Upland cotton (Gossypium hirsutum L.)

  • Priyanka Tyagi
  • Michael A. Gore
  • Daryl T. Bowman
  • B. Todd Campbell
  • Joshua A. Udall
  • Vasu Kuraparthy
Original Paper


Key message

Genetic diversity and population structure in the US Upland cotton were established, and core sets of allelic richness were identified for developing association mapping populations in cotton.


Elite plant breeding programs could likely benefit from the unexploited standing genetic variation of obsolete cultivars without the yield drag typically associated with wild accessions. A set of 381 accessions from the United States cotton belt, comprising 378 Upland (Gossypium hirsutum L.) and 3 G. barbadense L. accessions, was genotyped using 120 genome-wide SSR markers to establish the genetic diversity and population structure in tetraploid cotton. These accessions represent more than 100 years of Upland cotton breeding in the United States. Genetic diversity analysis identified a total of 546 alleles across 141 marker loci. Twenty-two percent of the alleles in Upland accessions were unique, that is, specific to a single accession. Population structure analysis revealed extensive admixture and identified five subgroups: four corresponding to the Southeastern, Midsouth, Southwest, and Western cotton-growing zones of the United States, and a fifth formed by the three G. barbadense accessions. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The average genetic distance between G. hirsutum accessions was 0.195, indicating low levels of genetic diversity in the Upland cotton germplasm pool. The results from both population structure and phylogenetic analysis agreed with pedigree information, with a few exceptions. Further, core sets of different sizes representing different levels of allelic richness in Upland cotton were identified. The genetic diversity estimates, population structure, and core sets established in this study could be useful for genetic and genomic analysis and for systematic utilization of the standing genetic variation in Upland cotton.
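As a minimal illustration of how the total allele count and the accession-specific ("unique") alleles reported above could be tallied, the following Python sketch works on a toy genotype table. The data layout, the function name allele_summary, and all accession names and fragment sizes are hypothetical; this is not the authors' pipeline and does not use the study's data.

```python
from collections import defaultdict


def allele_summary(genotypes):
    """Count distinct alleles and find alleles carried by exactly one accession.

    genotypes: dict mapping accession name -> dict of locus -> set of allele sizes.
    This layout is an assumption made for illustration only.
    """
    carriers = defaultdict(set)  # (locus, allele) -> set of accessions carrying it
    for accession, loci in genotypes.items():
        for locus, alleles in loci.items():
            for allele in alleles:
                carriers[(locus, allele)].add(accession)

    total = len(carriers)
    # An allele is "unique" when exactly one accession carries it.
    unique = {
        key: next(iter(accs))
        for key, accs in carriers.items()
        if len(accs) == 1
    }
    return total, unique


# Toy data: three accessions scored at two SSR loci (invented fragment sizes).
toy = {
    "acc_A": {"ssr1": {150}, "ssr2": {200}},
    "acc_B": {"ssr1": {150}, "ssr2": {204}},
    "acc_C": {"ssr1": {154}, "ssr2": {200}},
}

total_alleles, unique_alleles = allele_summary(toy)
fraction_unique = len(unique_alleles) / total_alleles
```

In this toy table, four distinct alleles are observed and two of them occur in a single accession, so half of the alleles are unique. The same ratio computed on real genotype data would correspond to the percentage of unique alleles quoted in the abstract.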


Keywords (added by machine, not by the authors): Simple Sequence Repeat Marker; Allelic Richness; Polymorphism Information Content; Simple Sequence Repeat Locus; Upland Cotton



Acknowledgments

We thank Dr. Gina Brown-Guedira for providing access to the genotyping facility and Jared Smith, Kim Howell, and Blake Bowen for their technical assistance. We are grateful to Cotton Incorporated, the NC Agricultural Research Service, and the NC Cotton Producers Association for funding support. The authors would like to thank the NC State University Plant Breeding Center and Monsanto Company for providing a PhD assistantship to Priyanka Tyagi.

Conflict of interest

The authors declare that there are no conflicts of interest in the reported research.

Ethical standards

The authors note that this research was performed and reported in accordance with the ethical standards of scientific conduct.

Supplementary material

122_2013_2217_MOESM1_ESM.pdf (108 kb)
Figure S1. Neighbor-joining tree of the Upland cotton diversity panel. Colors in the dendrogram correspond to the groups of the Upland cotton diversity panel identified in the STRUCTURE analysis: Group 1 (red, Western), Group 2 (green, Southeastern), Group 3 (blue, Southwestern), and Group 4 (yellow, Midsouth) (PDF 107 kb)
122_2013_2217_MOESM2_ESM.xlsx (25 kb)
Table S1. List of G. hirsutum accessions with identification number (XLSX 25 kb)
122_2013_2217_MOESM3_ESM.xlsx (15 kb)
Table S2. List of SSR primers used to genotype a panel of 381 cotton accessions (XLSX 14 kb)
122_2013_2217_MOESM4_ESM.xlsx (15 kb)
Table S3. Summary statistics for SSR loci used to genotype G. hirsutum accessions (XLSX 15 kb)
122_2013_2217_MOESM5_ESM.xlsx (126 kb)
Table S4. List of G. hirsutum accessions with unique alleles (present in only one accession) (XLSX 125 kb)
122_2013_2217_MOESM6_ESM.xlsx (33 kb)
Table S5. Proportional membership of cotton accessions in clusters as determined by model-based analysis using STRUCTURE. Lines were assigned to a group when their membership probability was higher than 0.70. The identified clusters roughly correspond to the following geographical areas of the cotton belt: cluster 1, Western; cluster 2, Eastern; cluster 3, Southwest; cluster 4, Midsouth; and cluster 5, G. barbadense (XLSX 33 kb)
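
The 0.70 membership rule described for Table S5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assign_cluster and the membership values are hypothetical, and treating lines below the threshold as admixed (returning None) is an assumption about how unassigned lines would be handled, not something stated in the table caption.

```python
def assign_cluster(memberships, threshold=0.70):
    """Return the 1-based cluster number whose membership exceeds the threshold.

    memberships: list of membership proportions for one accession, one per cluster.
    Returns None when no cluster exceeds the threshold (treated here as admixed,
    which is an assumption made for illustration).
    """
    best = max(range(len(memberships)), key=lambda i: memberships[i])
    if memberships[best] > threshold:
        return best + 1
    return None


# Invented membership vectors over five clusters.
pure = assign_cluster([0.85, 0.05, 0.05, 0.03, 0.02])
admixed = assign_cluster([0.40, 0.35, 0.15, 0.05, 0.05])
```

Here the first accession would be placed in cluster 1, while the second has no membership above 0.70 and is left unassigned.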



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Priyanka Tyagi (1)
  • Michael A. Gore (2)
  • Daryl T. Bowman (3)
  • B. Todd Campbell (4)
  • Joshua A. Udall (5)
  • Vasu Kuraparthy (1)

  1. Crop Science Department, North Carolina State University, Raleigh, USA
  2. Department of Plant Breeding and Genetics, Cornell University, Ithaca, USA
  3. North Carolina Foundation Seed Producers Inc., Zebulon, USA
  4. Coastal Plains Soil, Water and Plant Research Center, USDA-ARS, Florence, USA
  5. Department of Plant and Wildlife Sciences, Brigham Young University, Provo, USA
