Introduction

Driven by the need for greater energy security and a reduced environmental impact of fuel consumption, interest in producing biofuels from cellulosic biomass has grown in recent years. Switchgrass (Panicum virgatum L.), a perennial warm season C4 grass native to North America, is one of the crops that was identified in the 1990s as a potential feedstock for the production of cellulosic biofuels because of its high biomass yield, low nutrient and water requirements, and adaptation to marginal lands [1–3]. Switchgrass belongs to the family Poaceae, subfamily Panicoideae, tribe Paniceae and comprises two ecotypes, upland and lowland, that largely vary in their ploidy level, zone of adaptation, and plant architecture. Switchgrass has extensive genetic diversity and a large potential for genetic improvement [1]. The improvement of switchgrass for traits important for bioenergy production, including total biomass yield and biomass composition, is key to making switchgrass an integral part of a successful cellulosic biofuel production program.

To enhance the breeding process, genetic maps have been developed for switchgrass [4–8], and its genome has been sequenced (www.phytozome.net). However, genetic analyses have been hampered by the complexity of the switchgrass genome. In its lowest ploidy form, switchgrass is an allotetraploid (2n = 4x = 36) and has a haploid DNA content of 1.35 pg (1,300 Mb) [9]. It is also largely an obligate outcrossing species, and consequently, switchgrass genotypes are heterozygous at most loci. Most genetic maps published to date have been developed in pseudo-F1 populations resulting from crossing two heterozygous switchgrass genotypes [4–6, 8]. The exception is the genetic map generated by Liu et al. [7], who used an F2 population obtained by selfing a heterozygous self-compatible accession. Analyzing marker segregation in pseudo-F1 progeny measures, in essence, the amount of recombination that took place in each of the parental genomes. The output is two linkage maps per chromosome, one that represents recombination in the female (egg) parent and one that represents recombination in the male (pollen) parent. While algorithms have been written to integrate parental maps (e.g., JoinMap [10] and OneMap [11]), such maps have not yet been produced in switchgrass.

An additional layer of integration can be achieved by combining maps that have been generated in different mapping populations but comprise common markers into a consensus map (e.g., maize- Falque et al. [12]; sugar cane- Ming et al. [13]; pearl millet- Qi et al. [14], Rajaram et al. [15]; rye- Gustafson et al. [16]; wheat- Marone et al. [17]). Integrated maps compensate for inconsistencies in individual maps, provide better genome coverage, allow integration of QTL data, and increase marker densities for practical map applications such as marker-assisted selection, map-based cloning, and cross-species comparative analyses.

The genome of the lowland switchgrass genotype AP13 has been sequenced to 15× using mainly the Roche 454 platform (www.phytozome.net). Sequence assembly, however, remains a challenge due to the heterozygosity and tetraploidy of AP13 and the repeat content (≥33 % [18]) of the switchgrass genome. Sequence contigs can be ordered by anchoring them to a high-density genetic map. However, marker densities on the individual classically constructed switchgrass genetic maps are relatively low (average of 0.32 markers/cM) [5, 6]. Recently, the Joint Genome Institute (JGI) has sequenced the progeny of the AP13 × VS16 mapping population [6] and used these data, combined with information from the assembly of the genome sequence of Panicum hallii, a close relative of P. virgatum, to order the switchgrass sequence contigs (assembly v1.1; www.phytozome.net). A consensus map integrating three published genetic maps could be used as an independent tool to validate the sequence-based map. Lu et al. [8] used genotyping by sequencing (GBS) to generate a high-density map containing an average of 2,450 markers per linkage group. However, marker ordering in this map was achieved by alignment against the foxtail millet (Setaria italica) genome and thus does not take into account potential structural differences between the foxtail millet and switchgrass genomes.

Foxtail millet is an annual C4 grass which, like switchgrass, belongs to the Poaceae family, subfamily Panicoideae, tribe Paniceae. The two species diverged around 13.1 million years ago (MYA) [19]. Because of its close phylogenetic relationship with biofuel grasses and relatively simple genome (C = 500 Mb, 2n = 2x = 18), the foxtail millet genome underwent whole-genome Sanger shotgun sequencing in January 2010 so that it could be used as a model for switchgrass genome analyses [20]. The Setaria genome was assembled (http://www.phytozome.net/) into 336 scaffolds and anchored to a 992-locus genetic map [20]. Comparative analyses with sorghum and rice identified at least seven chromosomal inversions in addition to a complex set of rearrangements involving three ancestral chromosomes that occurred in Setaria or in the lineage leading to Setaria since its divergence from a common ancestor with sorghum approximately 26 MYA. Although the foxtail millet genome has been used as a “surrogate” for switchgrass [8], the structural relationship between the two genomes has not yet been established.

The objectives of this research were (1) to integrate several of the published switchgrass genetic maps into a consensus map with an increased marker density that can be used to validate the sequence-based genetic map and in practical breeding applications and (2) to employ the consensus map in a comparative analysis with the foxtail millet genome sequence.

Material and Methods

Mapping Populations and Markers

The dataset used to build the consensus maps combined segregation data from Missaoui et al. [4], Okada et al. [5], and Serba et al. [6]. Serba and colleagues analyzed 191 F1 plants from a cross between the lowland genotype “AP13” (female parent), derived from cv. “Alamo,” and the upland genotype “VS16” (male parent), derived from cv. “Summer.” Sixty-two of the progeny were also analyzed by Missaoui et al. [4]. The Okada et al. [5] mapping population consisted of 238 F1 plants obtained from a cross between selected genotypes from the lowland cvs. “Kanlow” (female parent) and Alamo (male parent). Because markers that are heterozygous in the female parent are typically analyzed separately from those that are heterozygous in the male parent in F1 populations, there were two datasets available for each of the mapping populations. For the purpose of generating the consensus maps, female and male datasets were considered separate populations. The Missaoui dataset comprised 110 loci that segregated in the female parent and 102 that segregated in the male parent, the Okada dataset consisted of 563 female and 542 male loci, and the Serba dataset consisted of 515 female and 363 male loci. Also included in the Serba datasets are 32 loci (18 in the female and 14 in the male) detected by 19 primer pairs developed against genes that were of interest to our group or to collaborating groups and that were newly mapped in the AP13 × VS16 population. Primer sequences for these newly mapped loci are listed in Online Resource 1. The markers were amplified in a total volume of 20 μl consisting of 50 ng genomic DNA, 0.4 μM of forward and reverse primers, 0.8 U GoTaq DNA Polymerase (Promega, Madison, WI), 2.5 mM MgCl2, and 0.2 mM dNTPs in 1× buffer. PCR conditions consisted of an initial denaturation of 3 min at 94 °C followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at temperatures listed in Online Resource 1 for 30 s, and extension at 72 °C for 1 min. 
The final extension was held at 72 °C for 15 min after which the samples were cooled to 4 °C. PCR products were separated on 0.5 % Mutation Detection Enhancement (MDE) gels (Lonza, Rockland, ME, USA) to reveal single-strand conformation polymorphisms that were visualized by silver staining. Segregation patterns were scored manually.

Identification of Putative Foxtail Millet Orthologs for the Switchgrass Markers

The nucleotide sequences of the markers included in the Okada et al. [5] and Serba et al. [6] maps were used as queries in a BLASTn search against the P. virgatum genome (www.phytozome.net). For each search, the region that aligned best to the query sequence (E value < 1e-30) plus, when available, 1 kb of sequence on either side of the alignment was then used as a query in a BLASTn search against the foxtail millet genome (nonmasked; www.phytozome.net). Top hits (E value < 1e-10) were considered as being putatively orthologous to the mapped P. virgatum markers. For the RFLP markers included in the Missaoui et al. [4] map, sequence information was available only for the RZ loci, which were detected by rice cDNA clones [21]. RZ clone end sequence was downloaded from GenBank, and the corresponding full-length gene sequence was obtained through a BLASTn search of the end sequences against the rice genome sequence present in GenBank. The sequence of the full-length genes was then used in a BLASTn search against the foxtail millet genome, and top hits (E value < 1e-10) were retained as putatively orthologous to the RZ markers.
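The E value filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: it assumes BLASTn results are available in the standard 12-column tabular format (column 11 = E value, column 12 = bit score), and the function name `top_hits` and the example rows are hypothetical.

```python
import csv
import io

def top_hits(tabular_text, max_evalue):
    """Keep, for each query, the highest-scoring hit with E value < max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the significance threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best

# Hypothetical tabular rows (not data from this study).
example = "\n".join([
    "markerA\tscaffold_3\t95.0\t800\t40\t0\t1\t800\t1000\t1799\t1e-50\t900",
    "markerA\tscaffold_7\t85.0\t300\t45\t0\t1\t300\t500\t799\t1e-12\t200",
    "markerB\tscaffold_1\t80.0\t100\t20\t0\t1\t100\t10\t109\t1e-05\t60",
])

# Threshold of 1e-10, as used for the search against foxtail millet.
hits = top_hits(example, max_evalue=1e-10)
print(hits)
```

In this toy input, the second markerA hit passes the threshold but scores lower than the first, and markerB has no hit below 1e-10, so only one putative ortholog is retained.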

Generation of a Consensus Map

First, we used Mapmaker v.3.0 [22] to identify RFLP markers from the partial map by Missaoui et al. [4] that could be integrated into the Serba et al. [6] maps based on the 62 common progeny. RFLP markers that did not link at a logarithm (base 10) of odds (LOD) score of 2.0 using the “group” command, or that mapped to the ends of linkage groups using the “try” command, were removed from the dataset. The Serba female and male datasets were then merged with the Okada female and male datasets to represent one single large mapping population consisting of 858 plants. Markers were assigned genotypic scores in the populations in which they had been mapped and were given missing data scores for the other population(s). A separate data file was made for each linkage group (Online Resource 2).
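The merging step can be illustrated with a small Python sketch. It assumes each population is held as a dictionary of marker scores and that "-" denotes missing data; the function name, the marker names, and the toy scores are hypothetical and do not reproduce the format of the JoinMap or Mapmaker input files.

```python
def merge_populations(populations, missing="-"):
    """Concatenate per-population score lists, padding absent markers with `missing`.

    populations maps a population name to (number_of_plants, {marker: [scores]}).
    """
    order = list(populations)
    markers = sorted({m for name in order for m in populations[name][1]})
    merged = {}
    for marker in markers:
        row = []
        for name in order:
            n_plants, scores = populations[name]
            # A marker not scored in this population gets missing data scores.
            row.extend(scores.get(marker, [missing] * n_plants))
        merged[marker] = row
    return merged

# Toy example with two tiny hypothetical populations.
toy = {
    "P": (3, {"m1": ["a", "h", "a"], "m2": ["h", "h", "a"]}),
    "K": (2, {"m2": ["a", "h"], "m3": ["h", "a"]}),
}
merged = merge_populations(toy)
print(merged["m1"])

# Size of the merged population when female and male datasets of the
# 191-plant and 238-plant progenies are treated as four populations.
print(2 * (191 + 238))
```

The last line gives 858, which is consistent with the size of the merged mapping population stated above.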

Consensus maps were generated using a multistep process. First, we generated a framework consensus map for each switchgrass linkage group using only markers that mapped to a syntenic foxtail millet chromosome. We used JoinMap 4.0 [10] with the population type set as backcross and a regression mapping algorithm at a pairwise recombination frequency estimate <0.40, a LOD score ≥3, a goodness-of-fit jump threshold of 5, and a ripple value of 1 to determine marker orders. Although JoinMap assigned a specific location to each marker, most markers could only be placed within a bin because they were segregating in only a subset of the four populations. We therefore manually scrutinized recombination events and used comparative information from foxtail millet to reorder markers. The assumption was that marker orders were conserved between foxtail millet and switchgrass unless we had evidence from recombination data that this was not the case. We then used the try command of Mapmaker v. 3.0 to place markers that did not map to syntenic locations in foxtail millet and markers for which no putative foxtail millet ortholog had been identified relative to the framework map. Because Mapmaker considers markers that are scored in different phases as unlinked, use of Mapmaker required conversion of all marker scores to the same phase. The conversion was done in Excel using phase information provided by the JoinMap 4.0 data tabsheet obtained for each LG group node output. Map distances were calculated using Mapmaker, and the Kosambi mapping function was used to convert recombination units into genetic distances. Finally, we manually verified recombination events to indicate the range of locations that markers without comparative information could occupy on the map. Linkage group designations are based on the names given in Okada et al. [5] and Serba et al. [6] and do not reflect a specific subgenome allocation.
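For readers unfamiliar with the Kosambi mapping function, the conversion from a recombination fraction r to a map distance d is d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The Python sketch below implements this standard formula and its inverse; the function names are illustrative and are not part of Mapmaker or JoinMap.

```python
import math

def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))

def kosambi_r(d_cm):
    """Inverse of the Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)

for r in (0.01, 0.10, 0.30):
    print(r, round(kosambi_cm(r), 2))
```

At small recombination fractions the distance in cM is close to 100 × r, whereas at larger fractions the function adds distance to account for undetected double crossovers under interference.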

Comparative Analyses

Relationships between marker positions in the switchgrass consensus map and their corresponding positions in foxtail millet were visualized in a circle diagram using the Circos software [23]. Comparative dot plots were drawn using R [24]. Putative centromere locations in switchgrass were inferred from the centromere locations in colinear foxtail millet chromosomes. A rearrangement between the foxtail millet genome and the switchgrass genetic map was defined as a minimum of three adjacent markers being present in inverted orientation (inversion) or on a nonsyntenic chromosome (translocation). To investigate whether overall marker densities and the relative number of colinear markers varied along the length of a switchgrass linkage group, we divided each linkage group into five regions. The “pericentromeric” region is the region of a linkage group that corresponds to 33 % of the syntenic foxtail millet chromosome surrounding each centromere. The remainder of each arm is then split into two so that the two regions in each arm have corresponding regions in foxtail millet that are equal in length. Since short and long arms have not been assigned in either switchgrass or foxtail millet, data from the distal regions on each arm were combined into a single distal region for further analysis. Similarly, data from the two proximal regions were combined into a single proximal region. Chi-square tests were conducted to identify significant differences in marker distributions. Information on comparative relationships between the foxtail millet, sorghum, and rice genomes was from Bennetzen et al. [20].
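One way to express the division into regions is sketched below in Python. The sketch makes assumptions that the text does not specify: the pericentromeric window is taken to be centered on the centromere and clipped at the chromosome ends, and positions are physical coordinates (Mb) on the syntenic foxtail millet chromosome. The function name and the example coordinates are hypothetical.

```python
def classify_region(pos_mb, chrom_len_mb, centromere_mb, peri_fraction=0.33):
    """Assign a foxtail millet position to a distal, proximal, or pericentromeric region."""
    half = peri_fraction * chrom_len_mb / 2
    # Assumption: window centered on the centromere, clipped to the chromosome.
    peri_start = max(0.0, centromere_mb - half)
    peri_end = min(chrom_len_mb, centromere_mb + half)
    if peri_start <= pos_mb <= peri_end:
        return "pericentromeric"
    if pos_mb < peri_start:
        # Left arm: the outer half is distal, the inner half proximal.
        return "distal" if pos_mb < peri_start / 2 else "proximal"
    # Right arm: split the remainder at its midpoint.
    midpoint = (peri_end + chrom_len_mb) / 2
    return "distal" if pos_mb > midpoint else "proximal"

# Hypothetical 40-Mb chromosome with the centromere at 20 Mb.
for pos in (2, 10, 20, 30, 38):
    print(pos, classify_region(pos, chrom_len_mb=40, centromere_mb=20))
```

Because the two distal regions and the two proximal regions are pooled for analysis, the function returns the same label for corresponding regions on either arm.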

Results

Consensus Map

A total of 1,459 loci from three different maps were integrated into the consensus map. Fifty RFLP loci originated from Missaoui et al. [4], 599 simple sequence repeat (SSR) loci were from Okada et al. [5], 778 loci including 336 Diversity Array Technology (DArT) loci, 439 SSR loci, and 3 sequence-tagged site markers developed from expressed sequence tags (EST-STSs) were from Serba et al. [6], and 32 loci were newly mapped in the AP13 × VS16 mapping population. Of the 1,459 loci, 314 were shared by at least two of the four mapping populations (K Okada female and A Okada male, or P Serba female and S Serba male) that constituted the mega mapping dataset, and of these, 138 were shared between the Okada and Serba datasets (Fig. 1), bringing the number of unique integrated loci to 1,321. The 1,321 loci were organized in nine sets of homoeologous linkage groups (Fig. 2). The total length of the consensus map was 2,122 cM with individual linkage groups spanning from 70 to 179 cM (Table 1). The average distance between markers was 1.6 cM. Nevertheless, a total of 13 gaps larger than 10 cM were present on LG I-a (1 gap), LG II-a (2), LG II-b (2), LG III-a (3), LG V-a (1), LG VI-a (1), LG VII-a (1), LG VII-b (1), and LG VIII-a (1) (Fig. 2).
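The locus counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers given in the text and assumes that the 138 loci shared between the Okada and Serba datasets are each counted once in the consensus map; the variable names are illustrative.

```python
# Locus counts as reported in the text.
rflp_missaoui = 50
ssr_okada = 599
serba = {"DArT": 336, "SSR": 439, "EST-STS": 3}
newly_mapped = 32
shared_okada_serba = 138
map_length_cm = 2122

serba_total = sum(serba.values())
total_loci = rflp_missaoui + ssr_okada + serba_total + newly_mapped
unique_loci = total_loci - shared_okada_serba
mean_spacing_cm = map_length_cm / unique_loci

print(serba_total, total_loci, unique_loci, round(mean_spacing_cm, 1))
# prints: 778 1459 1321 1.6
```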

Fig. 1

Venn diagram showing the number of markers that are common between the different mapping populations. Letters and letter combinations indicate the population(s) with P AP13 (female) [6], S VS16 (male) [6], K Kanlow (female) [5], A Alamo (male) [5]. For each population or combination of populations, the total number of markers and, in brackets, the number of markers included in the consensus map framework is given

Fig. 2

Switchgrass consensus genetic maps. Loci with orthologs in foxtail millet are indicated in bold. The population(s) in which each marker was scored is indicated in parentheses: (P) AP13 (female) [6], (S) VS16 (male) [6], (K) Kanlow (female) [5], (A) Alamo (male) [5]. For markers that could not be mapped unambiguously, the range to which they can map is indicated by a vertical line on the right-hand side of linkage groups

Table 1 The number of framework loci, the total number of loci, the length and percent coverage of switchgrass linkage groups

Comparative Analysis

Of the 1,321 loci that were integrated into the switchgrass consensus map, 1,215 loci (92.0 %) identified hits in the foxtail millet genome at the E value threshold of 1e-10. Of those, 70.9 % mapped to syntenic linkage groups/chromosomes in switchgrass and foxtail millet. Plotting of the switchgrass genetic map locations for these loci against the position of their putative orthologs on the foxtail millet genome sequence displayed a high level of colinearity between the switchgrass and foxtail millet genomes (Fig. 3, Online Resource 3) with most of the switchgrass linkage groups corresponding to a single foxtail millet chromosome. The exceptions were switchgrass LG III which largely corresponds to foxtail millet chromosome III but has a distal region composed of four loci that are orthologous to the distal region of foxtail millet chromosome VII, and switchgrass LG VII which largely corresponds to foxtail millet chromosome VII but has a distal region composed of seven loci that are orthologous to the distal region of foxtail millet chromosome III. In addition to this reciprocal translocation, nine inversions differentiate the switchgrass and foxtail millet genomes (Fig. 3, Online Resource 3, Table 2). The inversions are located on switchgrass LGs I, III (2 inversions), IV (3 inversions), V, VI, and VII (Fig. 3, Online Resource 3). With the exception of two inversions on LG IV-b and one inversion on LG VII-a that could not be resolved on LGs IV-a and VII-b, respectively, all rearrangements were present on both switchgrass homoeologous chromosomes (Table 2). Fifty-eight markers (6.7 % of syntenic datapoints) mapped to non-colinear positions in the two species (Fig. 3, Online Resource 3). Overall, some 95 % of the foxtail millet genome is covered by the switchgrass consensus map with a minimum of 83.1 % for LGVIII-a and a maximum of 99.5 % for LGIII-a (Table 1).

Fig. 3

Comparative relationship between loci on the switchgrass genetic map and their putative orthologs in the foxtail millet genome sequence. Foxtail millet chromosomes are in blue on the left-hand side of the circle diagram. Switchgrass linkage groups are in green on the right-hand side of the circle diagram. Because no rearrangements were identified between the homoeologous switchgrass chromosomes, only the linkage group (either a or b) with the highest number of comparative datapoints is represented. Markers that are colinear in switchgrass and foxtail millet are connected by orange lines, markers that characterize an inversion in switchgrass relative to foxtail millet are connected by green lines, markers that characterize an interchromosomal translocation in switchgrass relative to foxtail millet are connected by purple lines, and markers that are syntenic but non-colinear are connected by blue lines. Putative locations of centromeres on foxtail millet chromosomes are indicated with black bars

Table 2 Breakpoints of rearrangements that differentiate the switchgrass linkage groups and foxtail millet chromosomes

Marker Distribution

The positions of centromeres on the switchgrass linkage groups were extrapolated from the centromere locations in foxtail millet [20] (Online Resource 3). To evaluate marker distributions, each linkage group was divided into distal, proximal, and pericentromeric regions. Marker densities were calculated only for regions that could be clearly delineated (several regions with inversions were excluded) and that spanned at least 1 cM. The marker density across the pericentromeric regions was 1.29 markers per cM, across the proximal regions 0.75 markers per cM, and across the distal regions 0.49 markers per cM. The distribution of the different marker types (EST or gene-derived, PstI genomic SSRs, random genomic SSRs, and DArT markers) across the three chromosomal regions is given in Table 3. Genic markers made up a significantly higher percentage of the total number of markers in the distal chromosome regions compared to the pericentromeric regions (χ² = 21.37; p < 0.001), while the reverse was true for genomic SSR markers (χ² = 13.35; p < 0.001). There was no significant difference in the proportion of DArT markers in the distal compared to the pericentromeric regions (χ² = 2.12; p = 0.15). Different marker types varied significantly in their ability to identify orthologous sequences in foxtail millet (χ² = 513.80; p < 0.001). Colinear orthologs were found for 86 % of genic markers but only for 31 % of PstI-SSRs. Genes that mapped to pericentromeric regions were as likely to identify colinear orthologs as genes that mapped to distal chromosome regions (χ² = 2.83; p = 0.09). While statistically, this was also true for genomic SSRs (χ² = 0.02; p = 0.88) and DArT markers (χ² = 3.18; p = 0.07), the proportion of genomic markers for which orthologs were identified tended to increase from the pericentromeric to the proximal to the distal regions (Table 3).
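The pairwise comparisons above (e.g., distal versus pericentromeric regions) can be illustrated with a chi-square test on a 2 × 2 table. The Python sketch below is an assumption about the form of the test, not a description of the software used: it applies no continuity correction and obtains the p value for one degree of freedom from the complementary error function. The counts in the example table are hypothetical; the marker counts themselves are in Table 3.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denom

def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts: rows = regions, columns = marker type present/absent.
stat = chi2_2x2(10, 20, 20, 10)
print(round(stat, 2), round(p_value_df1(stat), 3))

# p values implied by three of the chi-square statistics reported in the text.
for reported in (2.12, 2.83, 3.18):
    print(reported, round(p_value_df1(reported), 2))
```

Under the one-degree-of-freedom assumption, the statistics 2.12, 2.83, and 3.18 give p values of approximately 0.15, 0.09, and 0.07, in line with the values reported above.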

Table 3 Distribution of different marker types across distal, proximal, and pericentromeric chromosome regions

Discussion

Consensus Map

The consensus map consisted of 1,321 loci and covered approximately 95 % of the switchgrass genome as estimated from the alignment of the switchgrass maps with the foxtail millet genome sequence. Our procedure for generating the consensus map combined the ease of use of JoinMap with the better error handling ability of Mapmaker. In our hands, merging all data (K Okada female, A Okada male, P Serba female, S Serba male) into a single large data file yielded consensus maps that were highly similar to the individual maps produced by Serba et al. [6] and Okada et al. [5]. Overall, 93 % of the markers mapped to the same relative position in the consensus map as in the individual published map. The two most likely explanations for the 7 % of discrepancies in map positions that we observed are scoring errors in one of the populations, which can cause interstitially located markers to map to the end of a linkage group in an individual map, and the way Mapmaker and JoinMap handle scoring errors, which can lead to markers being ordered differently by different software programs [6]. The differences observed between the consensus and individual maps highlight the importance of marker segregation quality and software choice for developing a genetic map.

The distribution of loci is not uniform along the linkage groups. As expected, marker density (number of markers per cM) is highest in the pericentromeric regions of the genetic maps due to suppressed recombination. We also investigated whether genic and genomic markers were distributed differently along a linkage group. Gene-derived markers accounted for 44 % of the total marker number. However, they were overrepresented (52 %) in the distal regions of the linkage groups and underrepresented (30 %) in the pericentromeric regions (Table 3). This is consistent with the fact that genes are mostly located in distal chromosome regions [20, 25–28]. In contrast, genomic SSR markers and DArT markers made up a higher percentage of the total markers in pericentromeric regions than in distal chromosome regions (Table 3), suggesting that the genomic markers that map to the pericentromeric regions likely represent repeats. Consequently, orthologous sequences were identified more frequently for genic markers than for genomic markers (Table 3). There was no significant difference between the chromosome regions in the percentage of genic markers with syntenic orthologs. Although more SSR markers that mapped to the distal chromosome regions had a foxtail millet ortholog compared to markers that mapped to pericentromeric regions (Table 3), this difference was not significant, possibly because the power of the statistical analysis was reduced by the small number of markers in the pericentromeric regions. Nevertheless, the observed trend suggests that genomic SSRs that mapped to distal chromosome regions were more likely to be associated with genes, while genomic SSRs that mapped to pericentromeric regions were more likely to be located within more rapidly evolving repetitive DNA. The SSRs for which foxtail millet orthologs were identified were not necessarily located within genes. Because the SSR-containing switchgrass sequences were extended by 1 kb on either end by conducting a BLASTn search of the SSR clone sequences against the switchgrass genome sequence before using them as queries in a BLASTn search against the foxtail millet genome sequence, the extended sequence might contain part of a switchgrass coding region if the SSR was located within ~1 kb of a gene. Therefore, foxtail millet orthologs would likely be identified for any switchgrass SSR located within ~1 kb of a coding region.

Comparative Analysis with Setaria

The outcrossing nature and polyploidy of switchgrass complicate genetic analyses. Having a closely related diploid species with a well-assembled genome can be very helpful in unraveling traits into their genetic components and cloning the underlying genes. However, it is important to know the structural relationship between the model and target genomes. In 2010, the genome of foxtail millet, a species that diverged some 13.1 MYA from switchgrass, was sequenced as a model for switchgrass [20]. We therefore conducted a comparative analysis between the published foxtail millet genome and our switchgrass consensus map. Both switchgrass and foxtail millet have a basic chromosome number of nine, and the majority of loci that mapped to a single switchgrass linkage group identified a single foxtail millet chromosome indicating high levels of synteny between switchgrass and foxtail millet (Fig. 3, Online Resource 3). We therefore discarded the 29 % of BLASTn hits that mapped to nonsyntenic chromosomes, many of which represented repetitive sequences, as nonorthologous. The exceptions were four and three BLASTn hits to the region 33–35.2 Mb on foxtail millet chromosome VII that were identified when adjacent markers at the top of switchgrass LGs III-a and III-b, respectively, were used as queries, and seven BLASTn hits to the region 0.73–2 Mb on foxtail millet chromosome III identified using adjacent markers at the bottom of both switchgrass LGs VII-a and VII-b (Fig. 3, Online Resource 3). The close linkage of the putative orthologous loci on switchgrass LGs III and VII, and foxtail millet chromosomes VII and III, respectively, indicated the presence of a reciprocal translocation that took place in either the switchgrass or foxtail millet lineage. In addition to the reciprocal translocation, nine inversions differentiated the switchgrass and foxtail millet genomes. The breakpoints of these rearrangements are listed in Table 2. 
Centromere locations in switchgrass were extrapolated from the centromere locations in foxtail millet and mapped to the region with the highest ratio of physical distance (measured in foxtail millet) to genetic distance (measured in switchgrass) (Online Resource 3).
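The idea of locating a centromere in the interval with the highest ratio of physical to genetic distance can be sketched as follows. This Python example is a simplified illustration under stated assumptions: markers are colinear, each is given as a (cM in switchgrass, Mb in foxtail millet) pair, and intervals with zero genetic distance are skipped rather than treated as infinite. The function name and coordinates are hypothetical.

```python
def highest_ratio_interval(markers):
    """Return ((cM_start, cM_end), (Mb_start, Mb_end), ratio) for the marker
    interval with the largest physical-to-genetic distance ratio (Mb per cM)."""
    ordered = sorted(markers)  # sort by genetic position
    best = None
    for (cm1, mb1), (cm2, mb2) in zip(ordered, ordered[1:]):
        genetic = cm2 - cm1
        if genetic <= 0:
            continue  # cosegregating markers: ratio undefined, skipped here
        ratio = abs(mb2 - mb1) / genetic
        if best is None or ratio > best[2]:
            best = ((cm1, cm2), (mb1, mb2), ratio)
    return best

# Hypothetical colinear markers: (switchgrass cM, foxtail millet Mb).
toy_markers = [(0, 0.5), (10, 2.0), (20, 4.0), (22, 20.0), (30, 22.0), (40, 24.0)]
interval = highest_ratio_interval(toy_markers)
print(interval)
```

In the toy data, 16 Mb of foxtail millet sequence corresponds to only 2 cM in switchgrass between 20 and 22 cM, the pattern expected where recombination is suppressed around a centromere.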

A comparative analysis of the foxtail millet genome with that of sorghum had previously revealed that sorghum chromosomes 6, 8, and 9 corresponded to foxtail millet chromosomes III and VII and that both sets of chromosomes were differentiated by a complex set of rearrangements [20]. The remaining sorghum and foxtail millet chromosomes showed one-to-one relationships, with inversions differentiating the syntenic foxtail millet (sorghum) chromosomes I (4) (2 inversions), IV (10) (4 inversions), V (3) (1 inversion), VI (7) (3 inversions), and VIII (5) (1 inversion) [20]. Of these, seven inversions also differentiated foxtail millet and rice, and hence occurred in the foxtail millet lineage after its divergence from a common ancestor with sorghum. Three inversions were specific to the sorghum lineage. For one inversion (on chromosome V), the evolutionary time point could not be established. Because the breakpoints of the inversions and translocations were not included in Bennetzen et al. [20], we have listed those that occurred in the foxtail millet lineage and that coincide with rearrangements that differentiate foxtail millet and switchgrass in Table 2. All rearrangements that differentiate switchgrass from foxtail millet, except one inversion on switchgrass LG IV-b, were rearrangements that had previously been shown to have occurred in foxtail millet after the divergence of foxtail millet and sorghum from a common ancestor (Table 2). The inversion on switchgrass LG IV-b included only three comparative data points and could not be confirmed on LG IV-a due to a lack of data in that region. Our analyses also showed that the inversion on foxtail millet chromosome V for which the timing had not previously been determined and part of the complex rearrangement that formed foxtail millet chromosomes III and VII were also specific to foxtail millet. Some of the foxtail millet lineage-specific rearrangements (Fig. 4) identified in comparative analyses with sorghum and rice were not covered by markers in the switchgrass consensus map, and hence, we could not determine their status in switchgrass.

Fig. 4

Model showing the evolution from three ancestral chromosomes to current-day foxtail millet chromosomes III and VII. Numbers on the right-hand side of each chromosome give the ancestral chromosome number and regions. Chromosome regions are delineated by rearrangement breakpoints. Numbers on the left-hand side of the chromosome indicate the location of the chromosomal breakpoints in the sorghum genome, which was taken as a proxy for the chromosome structure of the common ancestor to sorghum and switchgrass/foxtail millet. Evolutionary events are numbered (in parentheses), and each number is followed by Y, N, or U indicating whether the rearrangement took place (Y), did not take place (N), or could not be assessed (U) in switchgrass. Event (1) is a nested chromosome insertion, event (10) is a reciprocal translocation, and the remaining events are intrachromosomal inversions

The fact that part of the complex chromosome III–chromosome VII rearrangement that occurred in the foxtail millet lineage after its divergence from sorghum predates the divergence of switchgrass and foxtail millet and part occurred after the divergence of the two species allowed us to develop a model for the evolution of those chromosomes (Fig. 4). The model is based on the assumption that the most likely path of chromosome evolution from ancestral state to current-day state involved a minimal number of rearrangements. We considered the structure of current-day sorghum chromosomes 6, 8, and 9, bar an inversion on chromosome 8 in sorghum which postdates the divergence of the sorghum and foxtail millet lineages, as the ancestral chromosome configuration. The model hypothesizes a chromosome fusion in which ancestral chromosome 9 was inserted into the pericentromeric region of chromosome 8. This was followed by three inversions that led to the structure of current-day switchgrass chromosome III. The relative order of the rearrangements in the model is random, except when rearrangements were known to predate (rearrangements 1, 2, 3, and 4) or postdate (rearrangements 5, 8, 10, and 11) the divergence of switchgrass and foxtail millet, or when they involved overlapping chromosome segments (e.g., rearrangements 2, 3, and 5).

Because of the limited resolution of the switchgrass genetic map, our analysis could only identify chromosomal rearrangements that spanned, on average, at least 1.2 Mb in foxtail millet. Nevertheless, at the gross chromosomal level, chromosome evolution seems to have occurred at a faster pace in foxtail millet compared to switchgrass during the past 13.1 MY. An interesting question is whether chromosome level evolution is correlated with rearrangements at the gene level. Massa et al. [29] found that the rate of chromosomal rearrangements and the rate of single-gene deletions and insertions in the grass species Aegilops tauschii, Brachypodium distachyon, Oryza sativa, and Sorghum bicolor were highly correlated. However, single-gene inversions were not analyzed in the Massa et al. [29] study. An analysis of the orientation of genes in switchgrass BAC-end sequences relative to both foxtail millet and sorghum has shown that single-gene inversions occur at a higher rate in foxtail millet compared to sorghum [20], suggesting that the same mechanism may underlie single-gene inversions and inversions of large gene blocks.

The accumulation of a higher number of rearrangements in foxtail millet compared to switchgrass was a somewhat surprising finding considering that switchgrass is a tetraploid with, presumably, a greater tolerance to rearrangements than the diploid foxtail millet. However, the divergence of the switchgrass A and B genomes and their subsequent hybridization might have been too recent for rearrangements to have accumulated in the switchgrass genome. Current phylogenies suggest that the two diploid genomes that formed tetraploid switchgrass diverged after the split of Panicum section Panicum, to which P. hallii belongs, and Panicum section Virgata, to which switchgrass belongs [30–32]. Because the divergence of P. hallii and P. virgatum has been dated to approximately 5.3 MYA [33], tetraploid P. virgatum must have been formed in the past 5 MY. This number is in close agreement with estimates based on 6,800 bp of sequence data, which place the divergence of the switchgrass A and B genomes at 6 MYA and the split between the closely related A genome of switchgrass and E genome of Panicum rudgei at 3.8 MYA (B. Bahri and K.M. Devos, unpublished data).

Fate of Nonfunctional Centromeres after Chromosome Fusion

The insertion of ancestral chromosome 9 into ancestral chromosome 8 to form switchgrass and foxtail millet chromosomes III provided us with an opportunity to analyze the fate of the two centromeres in the fused chromosome. Luo and colleagues [34] noted that nested chromosome insertions during the evolution of Ae. tauschii chromosomes led to the loss of the centromere of the chromosome in which the insertion occurred (recipient chromosome). The same observation was made for nested chromosome insertions in B. distachyon and sorghum [34] and, in this study, in switchgrass/foxtail millet. In an attempt to uncover what happens after centromere inactivation, we plotted the ratio of the length of sorghum and foxtail millet synteny blocks, as determined by Bennetzen et al. [20], along each of the nine foxtail millet chromosomes (Fig. 5). The average ratio was 2.11, which corresponds to the approximately 2-fold larger genome size of sorghum compared to foxtail millet. However, as expected, the ratio was not constant across the length of the chromosomes but peaked in the regions corresponding to the ten sorghum centromeres, indicating that most of the genome expansion in sorghum occurred in the centromeric region. Interestingly, whereas the average ratio of the length of a sorghum block to the length of the syntenic foxtail millet block was 3.93 around the active foxtail millet centromeres, this ratio was 15.71 in the region of the “lost” foxtail millet centromere. A similar analysis of the relative length of syntenic rice and foxtail millet regions showed an average value of 1.08 along the length of the chromosome, a value of 0.84 around the active foxtail millet centromeres, and a value of 3.58 around the lost centromere on foxtail millet chromosome III. The <1 ratio around the foxtail millet centromeres indicates that centromeric regions in the 510-Mb foxtail millet genome have expanded more than the centromeric regions in the 430-Mb rice genome. Furthermore, the value of 3.58 at the position of the lost centromere in foxtail millet indicates that this region has been contracting in foxtail millet. We hypothesize that during a nested insertion, the centromere in the recipient chromosome is effectively moved to a more distal region where recombination rates are higher, which promotes removal of repeats around the inactive centromere through homologous recombination. It has been a long-standing question whether the accumulation of transposable elements in low-recombination regions is due to reduced recombination rates (ectopic recombination model [35]) or due to weaker selection against deleterious insertions in gene-poor regions [36]. Analysis of the repeat content around inactivated centromeres that have been moved to regions with higher recombination rates following a nested insertion event could help in uncoupling the effects of recombination and gene density.
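The block-length ratio described above can be sketched as follows. This is only an illustration of the calculation, not the procedure used to produce Fig. 5: the coordinates are invented, the class and field names are hypothetical, and we assume here that the average is taken as the mean of the per-block ratios.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SyntenyBlock:
    chrom: str        # foxtail millet chromosome carrying the block
    fm_start: int     # block start on the foxtail millet chromosome (bp)
    fm_end: int       # block end on the foxtail millet chromosome (bp)
    other_start: int  # start of the syntenic block in the compared genome (bp)
    other_end: int    # end of the syntenic block in the compared genome (bp)

    @property
    def midpoint(self) -> float:
        """Position on the foxtail millet chromosome against which the ratio is plotted."""
        return (self.fm_start + self.fm_end) / 2

    @property
    def ratio(self) -> float:
        """Length of the block in the compared genome divided by its foxtail millet length."""
        return (self.other_end - self.other_start) / (self.fm_end - self.fm_start)


# Invented coordinates for three adjacent blocks on one chromosome.
blocks = [
    SyntenyBlock("III", 0, 1_000_000, 0, 2_000_000),
    SyntenyBlock("III", 1_000_000, 1_500_000, 2_000_000, 4_000_000),
    SyntenyBlock("III", 1_500_000, 2_500_000, 4_000_000, 7_000_000),
]

# One (midpoint, ratio) point per block, ready to be plotted along the chromosome.
points = [(block.midpoint, block.ratio) for block in blocks]
average_ratio = mean(block.ratio for block in blocks)

print(points)         # [(500000.0, 2.0), (1250000.0, 4.0), (2000000.0, 3.0)]
print(average_ratio)  # 3.0
```

A ratio above the genome-wide average marks a region that is relatively expanded in the compared genome or contracted in foxtail millet, and a ratio below 1 marks a region that is larger in foxtail millet.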

Fig. 5

Graph showing syntenic block length ratios along the length of the foxtail millet chromosomes. a Ratio of length of syntenic blocks in sorghum and foxtail millet. b Ratio of length of syntenic blocks in rice and foxtail millet. Foxtail millet chromosomes (labeled LG I to LG IX) are separated by vertical lines. For each set of syntenic blocks, the ratio is plotted against the location of the midpoint of the syntenic block on the foxtail millet chromosome

Conclusion

Switchgrass consensus maps were developed as a tool for the switchgrass community to facilitate trait analyses and to provide the marker density needed to conduct comparative analyses between switchgrass and its genomic model, foxtail millet. The two species differed by one reciprocal translocation and at least nine inversions, all but one of which had occurred in the foxtail millet lineage during the past 13 MY. Considering that there are further rearrangements in the foxtail millet genome for which the status could not be determined in switchgrass due to a lack of markers in the relevant regions, it is likely that comparative analyses at the whole genome level will uncover additional differences between the switchgrass and foxtail millet genomes. The rearrangements need to be taken into account when using foxtail millet as a proxy for switchgrass in genetic and genomic analyses.

In addition to providing a resource for practical applications in breeding and genomic analyses, the newly gained comparative knowledge yielded insight into the evolutionary events that led to a reduction in chromosome number from 10 in the tribe Andropogoneae to 9 in the tribe Paniceae and, ultimately, to the formation of current-day switchgrass and foxtail millet chromosomes. The structure of the switchgrass genome, which is intermediate between the structures of the sorghum and foxtail millet genomes and appears to have been relatively stable within the past 13 MY, makes switchgrass an interesting species to include in evolutionary studies within the Panicoideae.