Because of its diversity and economic significance, the cotton genus (Gossypium) has been subjected to decades of taxonomic, cytogenetic, and phylogenetic analyses. Accordingly, a reasonably well-documented phylogenetic and taxonomic understanding has developed, as recently summarized (Wendel and Grover 2015). Work published since that time also supports the emerging picture of species relationships and diversification (Grover et al. 2015a, 2015b; Chen et al. 2016; Gallagher et al. 2017). Our purpose here is to provide a brief summary of this understanding, and to introduce the chromosomal context for the genome designations that are widely used by cotton researchers worldwide.

As shown in Table 1, the genus is divided into eight diploid genome groups (A through G, and K), as well as one allopolyploid clade (AD genome) formed by an ancient merger of A and D genome ancestors followed by chromosome doubling. These genome groups were initially defined based on comparative chromosome sizes and chromosome behavior in interspecific hybrids (Beasley 1942; Stephens 1947; Phillips 1966; Endrizzi et al. 1985; Fryxell 1992). Subsequent phylogenetic work (reviewed in Wendel and Grover 2015) has confirmed that each of these genome groups is monophyletic; that is, each has a single origin and comprises a natural set of more or less closely related species. Genome groups vary widely in species diversity, from a single species (F genome) to more than a dozen species each (D, K). The important allopolyploid clade, which includes G. hirsutum and G. barbadense, contains seven species, including two described only in the last 10 years (G. ekmanianum, G. stephensii) (Krapovickas and Seijo 2008; Gallagher et al. 2017).

Table 1 Taxonomic, cytogenetic and geographic diversity of species of Gossypium. Included are descriptions of each genome group and the collective geographic distribution of the included species. Genomic placements and taxonomic status of species enclosed in parentheses remain to be established, as these species are poorly known

Many Gossypium species are taxonomically well understood, whereas others are poorly known and not well established as bona fide species; these are indicated with parentheses in Table 1. Most notable in this respect are species or putative species from the Horn of Africa and the Arabian Peninsula; several of these are poorly represented in herbarium collections, and no living material has been available for study (last five in Table 2). Remarkably, new species remain to be discovered and/or taxonomically described, as evidenced by the recent publication of G. anapoides (Stewart et al. 2014) and G. stephensii (Gallagher et al. 2017). Also notable is the relatively poorly understood diversity in the Mexican arborescent clade, in which additional species likely remain to be described (Feng et al. 2011; Wendel and Grover 2015). More species might be discovered in Australia and in Southern America with further exploration.

Table 2 Nomenclature of individual genomes and chromosomes for each species in Gossypium, with Chinese translation of species names

This taxonomic framework provides a justification for a nomenclature for individual genomes and chromosomes in each species in Gossypium. A stable and accepted nomenclature will facilitate comparisons among the many kinds of studies that might be conducted in Gossypium, which range from basic taxonomic exploration to breeding and germplasm introgression. Table 2 suggests designations for the genomes and chromosomes of each species, subspecies and variety in Gossypium. These include A1-a for the genome of G. herbaceum subsp. africanum, which might be useful for ongoing genomic studies, and D12 for one Mexican arborescent species that may soon be described as a new D genome species. The last five species in Table 2 have not been assigned genome designations because they are poorly represented in herbarium collections and no living material has been available for study.

A comparative nomenclature of individual chromosomes in Gossypium is facilitated by the many cotton karyotype studies conducted in the later decades of the previous century (Edwards 1977, 1979a, 1979b; Endrizzi et al. 1985; Wang et al. 1994; Wendel and Grover 2015). These earlier investigations provided an important foundation for the transition into the molecular biology era. Using cultivars or wild species of Gossypium, researchers reported with increasing frequency on genetic mapping of linkage groups (Brubaker et al. 1999; Rong et al. 2004; Khan et al. 2016), on the identification of individual chromosomes (Wang et al. 2007, 2008; Gan et al. 2011, 2012, 2013; Shan et al. 2016), and even on the microdissection and microcloning of individual chromosomes (Peng et al. 2012). De novo sequence-based genomic studies have since provided detailed information on single chromosomes or pseudo-chromosomes (Wang et al. 2012; Paterson et al. 2012; Li et al. 2014, 2015; Zhang et al. 2015; Liu et al. 2015; Yuan et al. 2015).

These and many other studies collectively indicate that a clear nomenclature of individual chromosomes in Gossypium will be useful to facilitate communication and to provide consistency in chromosome designations. Here we suggest such designations (Table 2) for all clades in Gossypium, with the exception of the last five taxa, which are too poorly understood taxonomically and cytogenetically to be included. For the seven allotetraploid species, the first letters of their specific epithets are used instead of their corresponding genome designations. Ah and Dh designate the chromosome sets of the A and D subgenomes, respectively, of G. hirsutum; likewise, Ab and Db are used for G. barbadense, Att and Dtt for G. tomentosum, Am and Dm for G. mustelinum, Ad and Dd for G. darwinii, Ae and De for G. ekmanianum, and As and Ds for G. stephensii. At and Dt are more broadly used terms that designate the chromosomes of the A and D subgenomes, respectively, in all allotetraploid cottons; Att and Dtt are used for G. tomentosum to distinguish them from these general terms. In diploid species, the designations for individual chromosomes generally correspond to the individual genomes. A further exception applies to G. armourianum and G. harknessii because of the historical use of the genome designation D2 for both of these species; accordingly, ‘a’ and ‘h’, the first letters of their specific epithets, are added to resolve this ambiguity, so that the individual chromosomes of the two species are designated D2a1–D2a13 and D2h1–D2h13, respectively. The other break with tradition is the simplifying omission of dashes in individual chromosome designations for G. herbaceum subsp. africanum, G. sturtianum var. nandewarense, G. armourianum, G. harknessii, G. davidsonii and G. klotzschianum.
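The naming rules described above can be illustrated with a minimal Python sketch. It is not part of the proposed nomenclature itself. The prefixes and the count of 13 chromosomes per set are taken from the text; the dictionary and function names are hypothetical, and the unpadded numbering (for example Ah1 rather than Ah01) is an assumption modeled on the D2a1 and D2h13 forms given above, so Table 2 should be consulted for the authoritative labels.

```python
# Chromosome-set prefixes for the seven allotetraploid species, as listed
# in the text: (A-subgenome prefix, D-subgenome prefix).
ALLOTETRAPLOID_PREFIXES = {
    "G. hirsutum": ("Ah", "Dh"),
    "G. barbadense": ("Ab", "Db"),
    "G. tomentosum": ("Att", "Dtt"),  # doubled 't' avoids a clash with the general At/Dt
    "G. mustelinum": ("Am", "Dm"),
    "G. darwinii": ("Ad", "Dd"),
    "G. ekmanianum": ("Ae", "De"),
    "G. stephensii": ("As", "Ds"),
}

# The two diploid species that historically shared the D2 designation.
D2_PREFIXES = {
    "G. armourianum": "D2a",
    "G. harknessii": "D2h",
}

# Chromosomes per set, taken from the D2a1 to D2a13 range given in the text.
CHROMOSOMES_PER_SET = 13


def chromosome_names(prefix, count=CHROMOSOMES_PER_SET):
    """Return [prefix1, ..., prefixN]; unpadded numbering is an assumption."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


def designations_for(species):
    """Return the chromosome designations for one of the species covered here."""
    if species in ALLOTETRAPLOID_PREFIXES:
        a_prefix, d_prefix = ALLOTETRAPLOID_PREFIXES[species]
        # A-subgenome chromosomes first, then D-subgenome chromosomes.
        return chromosome_names(a_prefix) + chromosome_names(d_prefix)
    if species in D2_PREFIXES:
        return chromosome_names(D2_PREFIXES[species])
    raise ValueError(f"no prefix rule encoded for {species!r}")
```

Under these assumptions, `designations_for("G. hirsutum")` yields 26 labels beginning with Ah1, and `designations_for("G. harknessii")` yields 13 labels ending with D2h13. Because every prefix is distinct and ends in a letter, no two species covered by the sketch share a chromosome label.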