A high-resolution physical map integrating an anchored chromosome with the BAC physical maps of wheat chromosome 6B
A complete genome sequence is an essential tool for the genetic improvement of wheat. Because the wheat genome is large, highly repetitive and complex due to its allohexaploid nature, the International Wheat Genome Sequencing Consortium (IWGSC) chose a strategy that involves constructing bacterial artificial chromosome (BAC)-based physical maps of individual chromosomes and performing BAC-by-BAC sequencing. Here, we report the construction of a physical map of chromosome 6B with the goal of revealing the structural features of the third largest chromosome in wheat.
We assembled 689 informative BAC contigs (hereafter referred to as contigs) representing 91 % of the entire physical length of wheat chromosome 6B. The contigs were integrated into a radiation hybrid (RH) map of chromosome 6B, with one linkage group consisting of 448 loci with 653 markers. The order and direction of 480 contigs, corresponding to 87 % of the total length of 6B, were determined. We also characterized the contigs that contained a part of the nucleolus organizer region or centromere based on their positions on the RH map and the assembled BAC clone sequences. Analysis of the virtual gene order along 6B using the information collected for the integrated map revealed the presence of several chromosomal rearrangements, indicating evolutionary events that occurred on chromosome 6B.
We constructed a reliable physical map of chromosome 6B, enabling us to analyze its genomic structure and evolutionary progression. More importantly, the physical map should provide a high-quality and map-based reference sequence that will serve as a resource for wheat chromosome 6B.
Keywords: Centromere, Chromosomal rearrangement, Chromosome 6B, DNA marker, Gene order, Nucleolus organizer region, BAC physical map, RH map, Synteny, Wheat
Abbreviations
BAC: Bacterial artificial chromosome
BLAST: Basic Local Alignment Search Tool
EST: Expressed sequence tag
FISH: Fluorescence in situ hybridization
IGS: Intergenic spacer sequence
ISBP: Insertion site-based polymorphism
IWGSC: International Wheat Genome Sequencing Consortium
MTP: Minimal tiling path
NOR: Nucleolus organizer region
PCR: Polymerase chain reaction
PLUG: PCR-based Landmark Unique Gene
RFLP: Restriction fragment length polymorphism
RIL: Recombinant inbred line
SSR: Simple sequence repeat
WGP: Whole Genome Profiling
Common wheat (Triticum aestivum L.) is an allohexaploid species with three distinct genomes, A, B and D, which have been defined by genome analysis. The wheat genome consists of 21 chromosomes, and each of the three subgenomes contributes 7 chromosomes [2, 3]. The genome has a total length of 16.9 Gb, which is extremely large compared to the sequenced genomes of other grass species: rice (389 Mb), Brachypodium distachyon (272 Mb), sorghum (730 Mb), maize (2.3 Gb) and barley (5.1 Gb). Most of the wheat genome is occupied by various types of repetitive DNA sequences, which account for more than 80 % of the entire genome. Although wheat has been widely used in cytogenetics because of the high visibility of its chromosomes, complex features, such as its large size, high repeat content and polyploidy, have hampered molecular approaches to date, particularly whole-genome sequencing. However, recent technological advances have paved the way for genomic approaches that work for even complex genomes, such as wheat. For example, whole-genome sequences of the common wheat cultivar ‘Chinese Spring’ (CS), wild diploid wheat Triticum urartu (A genome progenitor of common wheat) and wild diploid goat grass Aegilops tauschii (D genome progenitor of common wheat) were analyzed using next-generation sequencing (NGS) technology [11, 12, 13].
The International Wheat Genome Sequencing Consortium (IWGSC) has been coordinating wheat genome sequencing to enhance knowledge of the structure and function of the bread wheat genome (http://www.wheatgenome.org/) and to provide tools to facilitate the breeding of improved varieties. The consortium has adopted a strategy that relies on the purification of individual chromosome arms from wheat telosomic lines by flow-cytometric sorting to reduce the sample complexity and obtain chromosome-specific genomic data. Recently, the IWGSC presented a draft sequence of all 21 wheat chromosomes in which the DNA of flow-sorted fractions was sequenced by NGS. To obtain a reference genome sequence using BAC-by-BAC sequencing, physical mapping of the BAC contigs (hereafter referred to as contigs) is currently underway for each of the 21 chromosomes. To date, BAC-based physical maps have been developed for chromosomes 1A, 1B, 3B and 6A [16, 17, 18, 19, 20, 21]. Most recently, during the preparation of this paper, two additional physical maps for chromosome arms 3DS and 5DS were released [22, 23].
Our group is responsible for sequencing wheat chromosome 6B (http://komugigsp.dna.affrc.go.jp/index.html) under the framework of IWGSC. This chromosome is the third largest chromosome in common wheat, with an estimated molecular size of 914 Mb: 415 Mb representing the short arm (6BS) and 498 Mb representing the long arm (6BL). As a first step, we conducted whole-chromosome shotgun sequencing using DNA that had been amplified from the flow-sorted chromosome arms 6BS and 6BL with massively parallel 454 pyrosequencing. The survey sequence data were highly informative for determining the genomic composition of chromosome 6B. Within a total of 508 Mb of assembled sequence (56 % of the size of 6B), 4798 gene loci were predicted, and more than 70 % of the 6B assembly consisted of repetitive sequences. Functional non-protein-coding RNAs, such as micro-RNA, transfer RNA and ribosomal RNA (rRNA), were also identified. In particular, the short arm (6BS) is characterized by the presence of a secondary constriction, the nucleolus organizer region (NOR), which features a locus for the rRNA genes (rDNA locus) [25, 26]. The NOR is designated as Nor-B2. We identified several contigs containing the rDNA locus, and they exhibited extremely high read depths, indicating that these contigs were part of the NOR on 6BS.
In this study, we established a BAC-based physical map of chromosome 6B with the goal of developing a high-quality reference sequence for the wheat genome. To achieve this goal, two BAC libraries were constructed using arm-specific DNA samples. In addition, we used two recently developed genomic tools to physically map all of the contigs onto chromosome 6B. This process was performed using insertion site-based polymorphism (ISBP) markers (Kaneko et al. in preparation) that were designed from the junction sequences between transposable elements and their flanking sequences , and a radiation hybrid (RH) mapping panel (Watanabe et al. in preparation) that was derived from a series of chromosome deletion lines generated by γ-ray irradiation . Combining the above genomic resources with other wheat-specific data, we were able to develop a robust system to anchor and order contigs onto the high-resolution RH map of chromosome 6B. The physical map not only allows us to understand key structural features and the evolutionary history of chromosome 6B but will also enable the use of genome sequencing to create a high-quality map-based reference sequence in the near future.
Results and discussion
Construction of 6B arm-specific BAC libraries
Arm-specific BAC libraries on chromosome 6B
Number of clones
Average insert size
Purity of the sorted fraction
Construction of a BAC-based physical map of wheat chromosome 6B
Physical maps of chromosome arms 6BS and 6BL
Rows: number of BACs in the contigs; number of contigs; estimated size (chromosome coverage a); contig N50 (kb); number of MTP BACs
Estimated size (chromosome coverage a): 6BS, 359 Mb (87 %); 6BL, 475 Mb (95 %)
The above process greatly improved our BAC-based physical map. The first physical maps revealed estimated sizes of 492 and 495 Mb that corresponded to 119 % and 99 % of the arm lengths, N50 values of 1503 and 2422 kb and L50 values of 87 and 65 contigs for 6BS and 6BL, respectively (Additional file 4). However, the final maps showed remarkable increases in the N50 values (from 1503 to 2302 kb for 6BS and from 2422 to 2508 kb for 6BL) and a decrease in the L50 values (from 87 to 52 for 6BS and from 65 to 61 for 6BL), although the total lengths for both arms decreased to 359 and 475 Mb, covering approximately 87 and 95 % of the entire genomic region for 6BS and 6BL, respectively (Additional file 4; Table 2). The reason for the lower coverage of 6BS remains unclear, but it is likely a consequence of the purity of the sorted chromosome arm DNA (Table 1), the number of WGP tags on each BAC (Additional file 2) or differences in the genomic structure between 6BS and 6BL (e.g., the Nor-B2 region with highly repetitive rRNA gene sequences).
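As a minimal illustration of the N50 and L50 statistics quoted above, the following Python sketch computes both values from a list of contig sizes. The function name `n50_l50` and the toy contig sizes are hypothetical and are not taken from the 6B data. The sketch assumes the usual definitions: N50 is the size of the contig at which the cumulative length of the size-sorted contigs first reaches half of the total, and L50 is the number of contigs needed to get there.

```python
def n50_l50(contig_sizes_kb):
    """Return (N50, L50) for a list of contig sizes (any consistent unit)."""
    sizes = sorted(contig_sizes_kb, reverse=True)  # largest contig first
    half_total = sum(sizes) / 2
    running = 0
    for count, size in enumerate(sizes, start=1):
        running += size
        if running >= half_total:
            # 'size' is the N50; 'count' contigs were needed, which is the L50
            return size, count
    raise ValueError("contig list is empty")


# Hypothetical contig sizes in kb (illustration only, not 6B data)
toy_contigs_kb = [4000, 3000, 2000, 1000, 500, 500]
n50, l50 = n50_l50(toy_contigs_kb)
print(f"N50 = {n50} kb, L50 = {l50} contigs")
```

Under these definitions, merging contigs raises N50 and lowers L50 even when the total assembled length shrinks, which is the pattern described for the final 6BS and 6BL maps.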
Further overlap analysis of neighboring clones within each contig using the FPC MTP module allowed us to select 3076 and 4557 minimal tiling path (MTP) clones along the two arms, laying a foundation for the map-based genomic sequencing of the entire chromosome 6B (Table 2).
Anchoring of chromosome 6B-specific markers to determine the physical location of the BAC contigs
Molecular markers used for the physical mapping
Anchored to BAC contig
On integrated physical map
1164 (39.2 %)
SSR and STS-RFLP markers
127 (61.1 %)
1293 (64.7 %)
2584 (50.0 %)
Characteristics of the BAC contigs in the physical map of chromosome 6B
Values are the number of contigs (size, coverage a) for 6BS, 6BL and the whole of 6B.
BAC contigs on the RH map: 6BS, 201 (340.2 Mb, 82.0 %); 6BL, 261 (454.5 Mb, 91.3 %); 6B, 462 (794.8 Mb, 87.0 %)
BAC contigs not on the RH map but anchored with markers: 6BS, 6 (1.05 Mb, 0.25 %); 6BL, 7 (1.12 Mb, 0.22 %); 6B, 13 (2.17 Mb, 0.24 %)
BAC contigs not anchored with markers: 6BS, 97 (17.8 Mb, 4.3 %); 6BL, 117 (19.0 Mb, 3.8 %); 6B, 214 (36.8 Mb, 4.0 %)
Integrated physical map and map resolution
Mapped contig size (Mb)
Map resolution (Mb)
Our RH mapping could offer more refined deletion bins for chromosome 6B than were previously available. The 18 breakpoints in the 6B deletion lines and 653 markers defined 20 chromosomal blocks (7 on 6BS and 13 on 6BL). These 20 blocks were designated as improved deletion bins (Fig. 3a). Each deletion bin was denoted by the name of the deletion line and the breakpoint at the proximal end of the bin (Fig. 3a). Based on the number of obligate breaks and the bin size (calculated from the cumulative size of the contigs), the resolution of all 20 deletion bins was estimated to range from 0.4 to 1.8 Mb/break (Table 5). Five bins (6BS8, 6BL3, 6BL8, 6BL9 and 6BL14) showed a relatively low resolution with >900 kb/break, which may have occurred for the following reasons: 1) the large cumulative size of the contigs with a limited number of obligate breaks (e.g., 6BS8, 6BL3 and 6BL9); 2) one locus mapped by a large number of markers (e.g., 6BL14); and 3) a bin mapped with only one large contig (e.g., 6BL8). Deletion bins that mapped to regions close to the centromere showed a remarkably high resolution. For example, bin C-6BL12 had the highest resolution, 0.40 Mb/break (Table 5). This result shows that RH mapping is useful for the chromosomal assignment of contigs, even within the centromeric region, where recombination events are extremely rare.
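The resolution values above follow from dividing the cumulative contig size of a bin by its number of obligate breaks. The Python sketch below shows this calculation; the bin names and numbers are hypothetical placeholders rather than values from Table 5, and the 0.9 Mb/break threshold simply mirrors the >900 kb/break figure used in the text.

```python
def resolution_mb_per_break(bin_size_mb, obligate_breaks):
    """Map resolution of a deletion bin: cumulative contig size per obligate break."""
    if obligate_breaks <= 0:
        raise ValueError("a bin needs at least one obligate break")
    return bin_size_mb / obligate_breaks


LOW_RESOLUTION_MB = 0.9  # threshold taken from the >900 kb/break figure in the text

# Hypothetical bins: name -> (cumulative contig size in Mb, obligate breaks)
toy_bins = {"binA": (36.0, 90), "binB": (50.0, 40)}

for name, (size_mb, breaks) in toy_bins.items():
    res = resolution_mb_per_break(size_mb, breaks)
    label = "relatively low resolution" if res > LOW_RESOLUTION_MB else "high resolution"
    print(f"{name}: {res:.2f} Mb/break ({label})")
```

A smaller value means more breaks per megabase and therefore a finer map, so a bin with many breaks over a short cumulative contig length resolves best.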
Notably, the largest space between markers on the RH map was located between KGS0314 (6BS9) and ISBP0092 (6BS8), which were separated by 29.2 cR (Fig. 3c). Clearly, this chromosomal site was associated with the chromosome breakpoint of deletion line 6BS-9, which was located near Nor-B2. These results suggest that this space of 29.2 cR corresponds to the NOR (see the following section).
Although 350 of the 480 contigs (152 on 6BS and 198 on 6BL) were specifically positioned in order along chromosome 6B (Additional file 5), 130 contigs were localized within a total of 47 loci (16 on 6BS and 31 on 6BL), which indicated that multiple contigs resided within the same locus. For example, 4 of the 13 loci within the bin of 6BL14 carried 5 to 7 contigs in which the relative order was unknown at this stage. However, 80 of these 130 contigs (21 on 6BS and 59 on 6BL) were anchored by genic markers with orders that were estimated from the syntenic relationships of rice, B. distachyon and sorghum. These efforts led to the successful determination of the physical locations of a total of 430 contigs, representing a total length of 761.6 Mb and covering approximately 83 % of chromosome 6B (Additional file 5).
In the present study, we developed an RH map for chromosome 6B and determined the order of the contigs. Our study clearly demonstrates the appropriateness and effectiveness of the RH mapping approach for ordering contigs on wheat chromosomes compared to typical strategies, such as genetic mapping and deletion bin mapping. The RH mapping method has important advantages: it produces uniform chromosomal breaks, allows for direct localization and does not require marker polymorphisms. Thus, it allows for the determination of the chromosomal locations and order of contigs with high resolution and accuracy.
RH mapping was able to define the order of 2860 markers based on contig order (Additional file 5). The marker order on the physical RH map provides more detail and a higher resolution compared to previous deletion bin mapping of chromosome 6B [35, 40, 41, 42, 43]. Although the marker density of the 6B RH map (3.6 markers/Mb) is lower than those of other physical maps (10.1 and 11 markers/Mb on 1BS and 1BL, respectively; [18, 19]), all of our markers are PCR markers, which will be valuable for future map-based gene cloning and marker-assisted selection in breeding compared to the array-based markers used in the previous studies.
Fine mapping of the Nor-B2 region on the short arm of chromosome 6B
The integration of FPC-assembled BAC contigs with a high-resolution RH map of chromosome 6B allowed us to characterize several features that are specific to the genomic composition and structure of chromosome 6B. In common wheat, the majority of the highly repeated rRNA genes for 18S, 5.8S and 25S RNAs (rDNA unit) are located at the NOR loci on chromosome arms 1BS and 6BS, as well as at minor sites on chromosome arms 1AS and 5DS, which have been designated Nor-B1, Nor-B2, Nor-A1 and Nor-D3, respectively . Nor-B2 is recognized as the secondary constriction on satellite chromosome 6B, which indicates that genes in Nor-B2 are transcriptionally active .
Previous research has identified approximately 5500 rRNA genes on chromosome 6B. Considering that the length of each copy (rDNA unit and adjacent IGS) is approximately 9 kb, the complete region encompassing the Nor-B2 locus could be estimated as approximately 50 Mb (5500 × 9 kb ≈ 49.5 Mb). Although the five BAC contigs showed a total physical length of 9.4 Mb, the copy numbers of the rRNA genes detected within them were extremely limited (Fig. 4). Therefore, the current BAC-based physical map partially covers the border region (proximal to the centromere) of the Nor-B2 locus, and it does not include the core rDNA array region of Nor-B2. Furthermore, the physically mapped BAC contigs in the genomic region on the other side (proximal to the telomere) of the Nor-B2 locus do not contain the rRNA genes.
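The size of the rDNA array can be checked by multiplying the two figures quoted above. The Python sketch below takes the copy number (about 5500) and the unit length (about 9 kb) at face value; the variable names are ours, and the result is only as good as those two approximate inputs.

```python
RRNA_GENE_COPIES = 5500    # approximate copy number on chromosome 6B (from the text)
UNIT_LENGTH_KB = 9         # approximate length of one rDNA unit plus IGS (from the text)
MAPPED_CONTIGS_MB = 9.4    # total length of the five BAC contigs near Nor-B2 (from the text)

# Size of a purely tandem array: copies x unit length, converted from kb to Mb
array_size_mb = RRNA_GENE_COPIES * UNIT_LENGTH_KB / 1000

print(f"Estimated rDNA array size: {array_size_mb:.1f} Mb")
print(f"Mapped contigs near Nor-B2: {MAPPED_CONTIGS_MB} Mb")
```

The product is several times larger than the 9.4 Mb of mapped contigs, which agrees with the interpretation that the map reaches only the border of Nor-B2 and not its core array.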
Interestingly, all of the rRNA genes discovered on the above contigs that mapped near the Nor-B2 locus appeared to be interrupted by other types of genomic sequences, including transposons and other genes. This is in striking contrast to the NOR region in rice. In rice (Oryza sativa L. ssp. japonica), the rRNA genes are distributed in a uniform array throughout the NOR, encompassing a region of approximately 3.5 Mb on chromosome 9 (Additional file 6) [47, 48]. Moreover, the rDNA units have been found to exist as clusters in tandem arrays, even within the border region of the NOR, demonstrating a clear boundary between the rDNA repeat and its flanking region. In the region flanking the rDNAs on rice chromosome 9, Ty3/gypsy retrotransposons accumulate at a density twice as high as that of the entire rice genome [5, 49]. Regarding chromosome 6B, two ISBP markers, ISBP0092 and ISBP0465, were designed based on the junction sequences, including the gypsy-type retrotransposon (Additional file 6), and these mapped close to the positions of the rRNA genes on CTG151 and CTG142, respectively (Fig. 4). The other five ISBP markers mapped to the same region (Fig. 4), including other types of repeat sequences, such as DNA transposons (CACTA and MITE) and unknown repeat sequences (Additional file 6). We previously found that DNA transposon diffusion is involved in the propagation of micro RNA genes and specific transfer RNA genes in wheat chromosome 6B based on the analysis of 6B survey sequences. The ambiguity found in the border region of Nor-B2 might be another example of specific transposons containing rRNA genes becoming diffused around the border region of Nor-B2, resulting in insertions of the rRNA genes into the genomic sequence outside of the rDNA array. We identified three gene loci (GZ6H023 and TNAC3867 on CTG516, TNAC3875 on CTG142) that displayed colinearity between wheat and rice (Figs. 3e and 4), and other mapped genes on CTG516 were also found to have conserved gene order relative to rice chromosome 2. These results support the concept that the border region of Nor-B2 originally possessed a structure similar to that of rice; however, the occurrence of genomic rearrangements due to the transposition of various repeat sequences has resulted in the present ambiguity at the proximal border of Nor-B2. This feature may reflect general characteristics of the wheat genome, which contains highly repetitive sequences. A detailed evaluation of the structure of Nor-B2 and its relationship to the repetitive sequences will be presented in future sequencing studies.
Fine mapping of the centromere on wheat chromosome 6B
A shift in the position of the centromere since the divergence of wheat and rice has been suggested based on comparisons between wheat chromosome 3B and rice chromosome 1. To determine whether the centromeric shift occurred between chromosome 6B and rice chromosome 2, we investigated the relationships of orthologous genes in the centromeric region between wheat 6B and rice 2. A total of 54 genic markers were mapped on the contigs at the centromeric deletion bins (C-6BS1, C-6BL12 and 6BL12), of which 47 markers were associated with pericentromeric rice genes on chromosome 2 (Fig. 5). This result supports a conservation of the centromeric region, as shown in a previous study. The centromeric region, from contig CTG307 (C-6BS1) to contig CTG482 (C-6BL12), was interposed between the genic markers TNAC8059 (CTG307 on C-6BS1) and GZ6H037 (CTG2358 on C-6BL12) (Fig. 5; Additional file 5). These two markers aligned to rice genes Os02g0241600 and Os02g0244700, respectively, on the short arm and at a position 5.6 Mb distal to the centromere of rice chromosome 2 (Fig. 5). This suggests that the shift in the position of the centromere occurred between wheat 6B and rice 2, as observed between wheat 3B and rice 1.
Distribution of genes along wheat chromosome 6B
On chromosomes 1B and 3B, the distribution of syntenic genes had no impact on the overall gene-density gradient [18, 19, 55]. We found that 1538 of 2015 genes assigned on chromosome 6B had orthologs on rice chromosome 2, B. distachyon chromosome 3 or sorghum chromosome 4 and the remaining 477 genes were nonsyntenic. The distribution patterns of the nonsyntenic group on 6BS showed a clear gradient of gene density, becoming increasingly dense from centromere to telomere, whereas there was no discernable gradient pattern for the syntenic group (Fig. 6b and c). This shows that the gradient of gene density along 6BS is due to the presence of nonsyntenic genes, which was consistent with the general pattern of gene density on wheat chromosomes 1B and 3B [18, 19, 55]. In contrast, on 6BL, both the nonsyntenic and syntenic genes increased in density from centromere to telomere (Fig. 6b and c). This result suggests that the general pattern discussed above may not apply to 6BL. However, our result did not completely cover the positions of the genes on the 6B physical map; we used 2015 genes, representing only half of the 3746 high-confidence genes annotated on chromosome 6B based on the IWGSC survey sequencing data . Completion of the genomic sequencing and gene annotation will reveal the gene organization on chromosome 6B more definitively.
Local rearrangements on wheat chromosome 6B
Genomic regions containing chromosome rearrangements between 6B and its homologous chromosomes in other species
Wheat chromosome 6B
Wheat chromosome 6B
Rice chromosome 2
B. distachyon chromosome 3
Sorghum chromosome 4
(position on RH map)
(estimated size a)
6BS: 188.3–226.9 Mb
6BS: 230.2–267.1 Mb
6BL: 516.9–555.0 Mb
6BL: 632.1–650.0 Mb
6BL: 602.5–713.8 Mb
6BL: 715.8–786.4 Mb
6BL: 516.9–555.0 Mb
6BL: 559.5–630.8 Mb
6BL: 632.1–650.0 Mb
A comparison of wheat chromosome 6B with Sb04 led to the discovery of one large inversion (559.5-630.8 Mb on the physical map) of the long arm 6BL (Table 6; Additional file 7C). Because wheat 6B and Os02 or Bradi3 share the same inversion relative to Sb04, this chromosomal structural change might be unique to the sorghum lineage. We discovered two reciprocal translocations between wheat 6B and Sb04 (Table 6; Additional file 7C), which corresponded exactly to the two Pooideae-specific inversions described above (Table 6; Additional file 7A).
Disruption of synteny in the proximal region of wheat chromosome 6BL
A large rearrangement in the proximal region (deletion bins, C-6BL12, 6BL12 and 6BL13; Fig. 7d; 340.2–421.8 Mb in Additional file 7; Additional file 5) was observed between 6B and the Sb04, Os02, and Bradi3 chromosomes; that is, fewer syntenic genes were mapped to this genomic region. Such an interruption of synteny in the proximal region of wheat homoeologous group 6 chromosomes relative to rice chromosome Os02 has been previously described by the mapping of orthologous genes. It has also been documented between chromosome 6D of Ae. tauschii and Os02 based on high-resolution physical mapping. Using the genomic sequences of the MTP BAC clones in this region (unpublished results), we estimated the number of genes in these bins that were homologous to the rice genes by a BLAST analysis. From 864 MTP clones (85 Mb) in the three bins, we identified 517 gene sequences homologous to the rice genes. We found 70 and 69 gene sequences that were homologous to Os01 and Os02, respectively, and 25–56 gene sequences were derived from other chromosomes in rice. We investigated the colinearity of the 69 sequences that were homologous to Os02 based on their positions on Os02, but we failed to detect any appreciable level of colinearity. The estimated proportion of nonsyntenic genes in this region was very high (87 %; 449 in 517 sequences) compared with the proportion (57 %) in the entire chromosome 6B based on our survey sequence data, as noted previously. Furthermore, a recent genome sequencing study reported that the proportion of nonsyntenic genes in the proximal region of chromosome 3B was 28 %, and even in distal regions, the proportions were 44 and 53 %.
These results suggest that the interruption was unusual; it was not due to the simple integration of chromosomal segments from a rice chromosome other than Os02 into this region but rather to the abundant and random accumulation of nonsyntenic genes that might disrupt the synteny between wheat chromosome 6B and rice chromosome 2. Although we do not currently have any data to explain the mechanisms underlying the disruption in colinearity in this region, it might be due to structural differences in the centromeric region, including the previously described centromere shifting.
Large interruptions of approximately 30–40 Mb of the genomic region of Bradi3 and Sb04 were also found (Fig. 7d; Additional files 7B and C). The interruption between 6B and Bradi3 revealed characteristics of the chromosomal structures of Bradi3, which was postulated to correspond to three rice chromosomes (Os02, Os08 and Os10), and its pericentromeric 10–45 Mb region was syntenic with Os08 and Os10. Therefore, the interruption between 6B and Bradi3 resulted from the insertion of other chromosomal segments in Bradi3. The cause of the interruption between 6B and Sb04 remains unclear, although the small number of annotated genes in this 30-Mb genomic region corresponding to the proximal region of Sb04 may be one of the causes.
In summary, we could infer the structural features of chromosome 6B with high resolution based on the physical localizations of genic markers obtained by RH mapping (Fig. 3e). The results provide information to understand the molecular and biological mechanisms underlying the genomic divergence and chromosomal evolution of wheat.
Here, we provide a BAC-based physical map of wheat chromosome 6B with an estimated size of 833.8 Mb that covers 91 % of the chromosome, which allowed us to select MTP BAC clones for the map-based genomic sequencing of chromosome 6B. The development of 2860 anchor markers and the construction of a high-resolution RH map permitted the successful localization of 480 of the assembled contigs to their chromosomal regions, representing 87 % (794.8 Mb) of the entire chromosome. The establishment and analysis of the integrated physical map also led to the discovery of several important features of chromosome 6B, including the fine chromosomal localization and organization of Nor-B2 and the centromere, the gene distribution patterns and the evolutionary history of chromosomal rearrangements among grass species. Furthermore, the use of marker information from syntenic genes led to the identification of chromosomal regions that were highly conserved or frequently rearranged, which is useful for our understanding of the complexity of genome evolution in wheat.
Construction of BAC libraries
The short and long arms of wheat chromosome 6B were isolated from a double ditelosomic line (2n = 40 + 2t6BS + 2t6BL) of CS  that was obtained from NBRP-Wheat Japan (accession number LPGKU0038). Aqueous suspensions of intact mitotic chromosomes were prepared from synchronized root tips of young seedlings as described by Vrána et al. , and both arms of 6B were purified by flow cytometry . The purity of the flow-sorted fractions was determined by FISH using probes for GAA microsatellites and telomeric repeats as described by Janda et al. . A total of 5,200,000 6BS arms and 5,150,000 6BL arms were sorted by flow cytometry, embedded in agarose miniplugs (each containing approximately 200,000 arms) and used to construct BAC libraries according to Šimková et al. . Briefly, high-molecular-weight DNA was partially digested with HindIII and subjected to two rounds of size selection using pulsed-field gel electrophoresis. The DNA was electroeluted from the gel and ligated into the pIndigoBAC-5 vector (Epicentre, Madison, WI, USA). The recombinant vector was used to transform Escherichia coli DH10B competent cells (Invitrogen, Carlsbad, CA, USA). The libraries were ordered into 384-well plates filled with freezing medium consisting of 2YT, 6.6 % glycerol and 12.5 μg/ml chloramphenicol and stored at −80 °C. The average insert size was estimated based on an analysis of 240 randomly selected BAC clones in each library.
Fingerprinting by whole-genome profiling
Fingerprinting of chromosome arm 6BS- and 6BL-specific BAC libraries was performed by using a WGP™ method that was developed based on NGS technology to establish the chromosome physical maps . Using 41,472 clones of 6BS (plate no. 41–148) and 49,920 clones of 6BL (plate no. 69–198), a multi-dimensional pool of BAC DNAs with a high concentration and low E. coli level was developed by Amplicon Express, Inc. (Pullman, WA, USA). The pooled BAC DNAs were subjected to sample preparation and sequencing by KeyGene N.V. (Wageningen, The Netherlands): a) digestion with the restriction enzymes EcoRI and MseI, ligation of sequencing adaptors containing sample identification tags and PCR amplification; b) pooling of the PCR products; c) cluster amplification; and d) sequencing from the EcoRI side using the HiSeq2000 sequencer (Illumina, San Diego, CA, USA) with a read length of 81 nucleotides. The high-quality reads were used for WGP data processing, which included the following steps: a) deconvolution, i.e., assignment of sequencing reads to individual BACs in the pools; b) assignment of WGP tags to BACs based on the deconvoluted reads; and c) filtering of WGP tags and BACs using various quality control measures to minimize noise.
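The clone counts quoted above match the plate ranges if every plate is a fully occupied 384-well plate (the plate format stated for the libraries) and the plate ranges are inclusive. Both points are assumptions of this small Python check, and the function name is hypothetical.

```python
WELLS_PER_PLATE = 384  # plate format stated for the BAC libraries


def clones_in_plate_range(first_plate, last_plate, wells=WELLS_PER_PLATE):
    """Number of clones in an inclusive plate range, assuming one clone per well."""
    n_plates = last_plate - first_plate + 1
    return n_plates * wells


print("6BS:", clones_in_plate_range(41, 148))   # plates 41-148
print("6BL:", clones_in_plate_range(69, 198))   # plates 69-198
```

The 6BS range covers 108 plates and the 6BL range covers 130 plates, giving 41,472 and 49,920 clones, respectively.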
BAC contig assembly
The assembly of BAC clones was performed using FPC software , which was improved by KeyGene N.V. to be capable of processing the WGP data. The assembly procedure was based on the previously described guidelines for physical map assembly of IWGSC and chromosome 3B physical mapping [20, 32] as shown below: a) the initial assembly was performed using incremental contig-building with a cut-off of 1e−75; b) a single-to-end and end-to-end merge was conducted using the Match 1 parameter by increasing the cut-off (1e−5 at each step) to a final cut-off of 1e−05; c) the DQer function was used at each cut-off step to separate all of the contigs containing more than 10 % of Questionable (Q) clones with the Step 3 parameter; d) finally, a single optimal cut-off value was determined. The following parameters were used for the WGP analysis: a band size (CB unit, average distance between tags) of 5220 and 5075 for 6BS and 6BL, respectively; a gel length (corresponding to the number of unique tags) of 120,913 and 112,559 for 6BS and 6BL, respectively; a FromEnd value (corresponding to half of the average number of tags per BAC) of 12; and a tolerance of 0. The MTP was selected using the FPC MTP module .
Elimination of low-confidence BAC contigs
To detect BAC clones derived from other chromosomal DNAs that contaminated our preparation, a BLASTN search was performed against the survey sequence of wheat using the WGP tag sequences on the BAC clones as queries with thresholds of 95 % coverage and 95 % identity. First, we identified WGP tags with high similarity to each chromosome group. Next, we compared the number obtained for the group 6 chromosomes with the largest number among the other chromosome groups. If the number obtained for one of the other groups was larger (more than four) than the number obtained for group 6, we judged that the BAC clone was derived from the other chromosomes. Contigs in which such BAC clones accounted for more than 66 % of the total number of BAC clones were labeled as low-confidence contigs and eliminated from further experiments.
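The filtering rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline that was actually used: we read "larger (more than four)" as the best non-group-6 tag count exceeding the group 6 count by more than four, the per-BAC tag counts are assumed to have been tallied already from the BLASTN hits, and all function names and example counts are hypothetical.

```python
def is_contaminant(tag_hits_by_group, target_group=6, margin=4):
    """Flag a BAC whose best non-target group beats the target group by more than `margin` tags.

    tag_hits_by_group maps a homoeologous group number (1-7) to the number of
    WGP tags on the BAC that matched that group (assumed precomputed).
    """
    target = tag_hits_by_group.get(target_group, 0)
    others = [n for group, n in tag_hits_by_group.items() if group != target_group]
    best_other = max(others, default=0)
    return best_other - target > margin


def is_low_confidence(contig_bacs, max_fraction=0.66):
    """A contig is low-confidence if more than 66 % of its BACs are flagged."""
    if not contig_bacs:
        return False
    flagged = sum(is_contaminant(hits) for hits in contig_bacs)
    return flagged / len(contig_bacs) > max_fraction


# Hypothetical per-BAC tag counts (group -> matching tags)
foreign_bac = {6: 2, 3: 9}    # group 3 exceeds group 6 by 7 tags
clean_bac = {6: 10, 1: 3}     # group 6 dominates

print(is_low_confidence([foreign_bac] * 3 + [clean_bac]))   # 3 of 4 BACs flagged
print(is_low_confidence([foreign_bac] + [clean_bac] * 3))   # 1 of 4 BACs flagged
```

Working at the contig level rather than discarding single BACs keeps contigs in which only a few clones look foreign, which may simply reflect sequences shared with homoeologous chromosomes.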
Marker selection and development
We used several sources to develop the markers used in this study, which were divided into three main groups. The first group comprised markers that had been genetically mapped to chromosome 6B and deposited in public databases: GrainGenes CMap and NBRP-Wheat, Japan. From these databases, we selected 208 PCR-based markers such as STS-RFLP and SSR markers. Additionally, 751 PLUG markers [34, 35] (http://plug.dna.affrc.go.jp/) were selected based on synteny of wheat group-6 chromosomes to rice chromosome 2.
The second group consisted of markers derived from wheat and barley expressed sequences, i.e., full-length cDNAs [66, 67] and ESTs (Ta#56: wheat ESTs deposited in build #56 of the NCBI UniGene dataset; http://www.ncbi.nlm.nih.gov/unigene). In the Genome Zipper, we found 2304 non-redundant NGS reads mapped on chromosome 6H of barley. We identified wheat ESTs homologous to the above barley 6H reads by BLAST (390 markers). Additionally, we selected the following wheat sequences as marker sources: a) full-length cDNAs with ≥70 % coverage and ≥50 % identity with the syntenic genes common to Os02, Bradi3 and Sb04 (orthologous set, 905 markers); b) full-length cDNAs and ESTs with more than 80 and 85 % identities to genes of Os02 and Bradi3, respectively (predicted orthologous set, 482 FLcDNA and 309 EST markers); and c) full-length cDNAs with more than 85 % identity with Bradi3 genes listed in the barley 6H Genome Zipper (predicted orthologous set from B. distachyon, 133 markers).
The third group contained ISBP markers, which were developed from the survey sequence data for 6BS and 6BL according to Paux et al. Among the high-confidence primer sets identified (Kaneko et al. in preparation), 2000 sets were selected based on the frequency of the number of junctions between the transposable element subfamilies.
To determine the chromosomal locations of contigs that lacked the above anchors, additional PCR markers such as ISBP and SSR markers were developed using the genomic sequences of the MTP BAC clones and were used for RH mapping. High-confidence ISBP markers were developed as described by Paux et al., and Primer3 was used to design SSR primers at positions flanking the microsatellite sequences (>4 units of di-, tri- and tetra-nucleotides) predicted by Perl scripts. To develop 6B-specific markers, a BLASTN search using the primer sequences as queries was performed against the wheat survey sequence data. Primer pairs with complete identity only to chromosome 6B sequences yielded 1076 ISBP (426 for 6BS and 650 for 6BL contigs) and 1163 SSR (447 for 6BS and 716 for 6BL contigs) markers, from which 79 ISBP (37 for 6BS and 42 for 6BL contigs) and 74 SSR (38 for 6BS and 36 for 6BL contigs) markers were used for anchoring.
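The microsatellite prediction step can be illustrated with a short regular-expression scan. The study used Perl scripts that are not reproduced here; the Python sketch below is a stand-in, with a hypothetical function name, and it assumes that ">4 units" means at least five tandem copies of the motif.

```python
import re
from typing import List, Tuple

MIN_UNITS = 5  # ">4 units" read as at least five tandem copies (assumption)


def find_ssrs(seq: str) -> List[Tuple[int, int, str, int]]:
    """Return (start, end, motif, units) for di-, tri- and tetra-nucleotide repeats.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len in (2, 3, 4):
        # One captured motif followed by at least MIN_UNITS - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, MIN_UNITS - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run, not a di/tri/tetra repeat
            if unit_len == 4 and motif[:2] == motif[2:]:
                continue  # e.g. ATAT is a dinucleotide repeat, already reported
            units = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, units))
    return sorted(hits)
```

The returned coordinates could then be passed to a primer design tool so that primers are placed in the flanking sequence on either side of each repeat.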
PCR screening of BAC libraries
Prior to BAC library screening, the PCR-based molecular markers were tested to determine their quality and specificity to chromosome 6B. For the PCR amplification, total DNA was extracted from CS, the Nullisomic-6B Tetrasomic-6A (N6BT6A) line and M808 using standard procedures, and the DNAs of the sorted chromosome arms 6BS and 6BL were amplified by multiple displacement amplification using the illustra™ GenomiPhi V2 DNA Amplification Kit (GE Healthcare Bio-Science Corp., Piscataway, NJ, USA). PCR reactions were performed using Quick Taq Solution (Toyobo, Osaka, Japan) according to the manufacturer’s recommendations. Touchdown PCR amplification was performed as follows: initiation at 94 °C for 2 min, 10 cycles (94 °C for 20 s, 65 °C minus 0.5 °C each cycle for 20 s, 72 °C for 20 s), 20 cycles (94 °C for 2 min, 60 °C for 20 s, 72 °C for 20 s) and termination at 72 °C for 1 min. The amplified products were analyzed by using a MultiNA microchip electrophoresis system (Shimadzu, Kyoto, Japan) according to the manufacturer’s recommendations.
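The annealing temperatures implied by the touchdown program can be written out explicitly. The sketch below is only a worked illustration with a hypothetical function name; it assumes that the first touchdown cycle anneals at 65 °C and that each subsequent cycle is 0.5 °C lower.

```python
from typing import List


def touchdown_annealing(start_c: float = 65.0, step_c: float = 0.5,
                        touchdown_cycles: int = 10, final_c: float = 60.0,
                        final_cycles: int = 20) -> List[float]:
    """Return the annealing temperature (in degrees C) for every cycle in order."""
    # Touchdown phase: the temperature drops by step_c after each cycle.
    temps = [start_c - step_c * cycle for cycle in range(touchdown_cycles)]
    # Constant phase at the final annealing temperature.
    temps += [final_c] * final_cycles
    return temps
```

With the default values, the touchdown phase runs from 65 °C down to 60.5 °C, so the following 20 cycles at 60 °C continue the same 0.5 °C decrement.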
Screening was performed using the BAC DNA pools, which were created from the above BAC libraries for 6BS and 6BL, comprising 57,600 and 76,032 BAC clones, respectively. The BAC DNAs were extracted from three-dimensional (plate, row, column) pooled BACs and purified using the alkaline lysis method . Each pool contained 100 μg of BAC DNA. The PCR reaction was performed in 15 μL containing 30 ng of BAC pooled DNA, 0.2 μM of each primer and 5 μL of GoTaq Green Master Mix (Promega, Madison, WI, USA). Amplification was initiated at 94 °C for 1 min, followed by 35 cycles of 94 °C for 30 s, 60 °C for 1 min, 72 °C for 1 min, and termination at 72 °C for 1 min. The resulting fragments were electrophoresed in a 2 % agarose gel.
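Decoding a three-dimensional pool screen amounts to intersecting the positive plate, row and column pools. The following sketch uses hypothetical function names and assumes that each row pool and column pool combines the same well position across all plates; the exact pooling layout is not specified above.

```python
from itertools import product
from typing import Iterable, List, Tuple


def candidate_clones(plates: Iterable[int], rows: Iterable[str],
                     cols: Iterable[int]) -> List[Tuple[int, str, int]]:
    """List every (plate, row, column) address consistent with the positive pools."""
    return sorted(product(sorted(set(plates)), sorted(set(rows)), sorted(set(cols))))


def needs_verification(plates: Iterable[int], rows: Iterable[str],
                       cols: Iterable[int]) -> bool:
    """Return True when the positive pools do not identify the clones uniquely.

    Ambiguity arises only if two or more dimensions each contain several
    positive pools, because some candidate addresses may then be false.
    """
    multi = sum(len(set(dim)) > 1 for dim in (plates, rows, cols))
    return multi >= 2
```

For example, positive pools for plate 12, row C and column 7 point to a single clone, whereas two positive plates combined with two positive rows give four candidates that would have to be tested individually.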
The RH panel of chromosome 6B was produced by Watanabe et al. (in preparation). We selected 355 lines as the RH panel based on the genotypes determined using the 21 6B-specific SSR markers. In addition to the RH panel, 21 chromosome deletion lines (9 lines for 6BS and 12 lines for 6BL) were used for the approximate mapping of markers to each chromosome bin. CS and the N6BT6A line were included as positive and negative controls, respectively, for the amplification of the PCR markers. The seeds of CS (LPGKU2269), the N6BT6A line (LPGKU0075) and the 21 chromosome deletion lines (LPGKU1247-1267) were obtained from NBRP-Wheat, Japan.
To genotype the RH panel and chromosome deletion lines, we used 653 markers that were anchored on 462 contigs and defined based on their specificity for chromosome 6B using the N6BT6A line. The presence/absence of the markers was analyzed by agarose gel electrophoresis.
The data obtained for the 653 markers were used to group and order the markers in Carthagene version 1.3.beta. The linkage group was determined by using the group command with a two-point LOD threshold of 4.0 and a maximum distance of 100 cR. The heap command was used to identify the map with the highest likelihood. The mapocb command was used to calculate the number of obligate breaks.
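The idea of obligate breaks can be made concrete with a small example. For a given marker order, an obligate break is counted each time the presence/absence score changes between consecutive scored markers within one RH line. The sketch below is an independent illustration with hypothetical names and data layout; it is not Carthagene code.

```python
from typing import Dict, List, Optional, Sequence


def obligate_breaks(order: Sequence[str],
                    panel: Dict[str, List[Optional[int]]]) -> int:
    """Count obligate breaks for a marker order across all RH lines.

    panel maps a marker name to its scores per RH line:
    1 = present, 0 = absent, None = missing data (skipped).
    """
    if not order:
        return 0
    n_lines = len(panel[order[0]])
    breaks = 0
    for line in range(n_lines):
        previous = None
        for marker in order:
            score = panel[marker][line]
            if score is None:
                continue  # missing genotypes neither add nor hide a break
            if previous is not None and score != previous:
                breaks += 1
            previous = score
    return breaks
```

Comparing this count between alternative orders gives a simple parsimony check: the order that requires fewer breaks is the more plausible one for the same data.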
Calculation of gene density along chromosome 6B
We used two sources of gene information to calculate the gene density along chromosome 6B. First, we extracted markers representing independent genes from the DNA markers used for anchoring the BAC clones. Redundancy in the marker set was removed based on sequence similarity using CAP3 with 98 % identity. BLASTX searches of the resulting non-redundant markers were conducted against the rice, Brachypodium and sorghum genes with an E-value < 10⁻⁵. Markers that hit genes on Os02, Bradi3 and Sb04 were then used for further analysis. Second, we used gene models annotated from the survey sequencing data by the IWGSC. We mapped 731,925 and 998,825 WGP tags on 6BS and 6BL, respectively, to the IWGSC sequence contigs assembled from the 6B survey sequencing by megablast with default settings. Gene information on the 6B survey sequence contigs was assigned to BAC contigs based on the WGP tags uniquely mapped to 6B survey sequence contigs with 100 % identity and 100 % coverage. If tags located in distant contigs were mapped to the same survey sequence contig, we discarded the hit information.
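The tag-based assignment of survey sequence contigs to BAC contigs can be sketched as follows. The function name and the input tuples are hypothetical, the hits are assumed to be pre-filtered to 100 % identity and 100 % coverage, and "distant contigs" is simplified here to "more than one BAC contig".

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def assign_survey_contigs(hits: Iterable[Tuple[str, str, str]]) -> Dict[str, str]:
    """Map each survey sequence contig to a single BAC contig.

    hits holds (tag_id, bac_contig, survey_contig) tuples for perfect matches.
    """
    hits = list(hits)

    # A tag is uniquely mapped if it hits exactly one survey contig.
    targets_per_tag = defaultdict(set)
    for tag, _bac, survey in hits:
        targets_per_tag[tag].add(survey)

    bac_per_survey = defaultdict(set)
    for tag, bac, survey in hits:
        if len(targets_per_tag[tag]) == 1:
            bac_per_survey[survey].add(bac)

    # Discard survey contigs supported by tags from more than one BAC contig.
    return {survey: next(iter(bacs))
            for survey, bacs in bac_per_survey.items() if len(bacs) == 1}
```

Gene models annotated on a survey contig that survives this filter can then be attached to the corresponding BAC contig.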
Comparison of the virtual gene order based on the physical map with the syntenic chromosomes of other cereals
To compare the gene order between 6B and its homologous chromosomes, Os02, Bradi3 and Sb04, a BLASTX search with an E-value < 10⁻⁵ was conducted against the protein sequences of the three grasses using nucleotide sequences of the aforementioned gene set. The positions of the genes on the 6B physical map were compared with the physical genomic positions of the orthologous sequences on the homologous chromosomes in the three cereals. The graphical displays of the relationships between orthologous genes were drawn using Circos-0.67-7 software.
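Applying the E-value threshold to BLASTX results can be illustrated with a small parser. The sketch assumes the 12-column tabular output of BLAST+ (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and keeps the highest-scoring hit per query; both the output format and the best-hit rule are assumptions, since the text specifies only the E-value cutoff.

```python
from typing import Dict, Iterable, Tuple

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def best_hits(lines: Iterable[str],
              cutoff: float = E_VALUE_CUTOFF) -> Dict[str, Tuple[str, float]]:
    """Return {query: (subject, bitscore)} for the best hit below the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= cutoff:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best
```

The retained query-subject pairs would then be joined with the gene positions on the 6B physical map and on the syntenic chromosome before plotting.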
Sequencing of MTP BAC clones
BAC clones were sequenced using Roche GS FLX (Basel, Switzerland). BAC DNAs were prepared in a 96-deepwell plate, extracted using the alkaline lysis method and purified with the MultiScreen Filter plate (Millipore, Billerica, MA, USA). After fragmentation with the Covaris Acoustic Solubilizer LE220 (Covaris, Woburn, MA, USA), the sequencing libraries were constructed using the NEBNext DNA Library Prep Master Mix Set for 454 (New England Biolabs, Ipswich, MA, USA) and tagged with Multiplex Identifier (MID) Adaptors (Roche, Basel, Switzerland). The tagged DNAs were subjected to size selection using agarose gel electrophoresis to obtain DNA fragments of the appropriate size (600–1200 bp) and then quantified using the Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA). The resulting libraries were sequenced according to the manufacturer’s recommendations. The resulting SFF files were split into separate files of each BAC clone based on MID by using the sffile program in SFF Tools (Roche, Basel, Switzerland). The split SFF files were used for de novo assembly with the GS De Novo Assembler v2.6 (Roche, Basel, Switzerland) with default parameters, and the BAC vector and E. coli genome sequences were trimmed.
To detect ribosomal RNA genes, a BLASTN search was performed against the BAC sequences with E-values < 10⁻¹⁰. For use as queries, four rRNA gene sequences from wheat, 5S (accession no. 3IZ9), 5.8S (3IZ9), 18S (3IZ7) and 25S (3IZ9), were obtained from the Protein Data Bank (PDB; http://www.pdb.org/pdb/home/home.do), and the IGS sequence (X07841) was obtained from DDBJ/EMBL/GenBank (http://www.ddbj.nig.ac.jp).
To identify BAC clones containing centromeric sequences, a BLASTN search was run against the BAC sequences with an E-value < 10⁻⁴, and a low-complexity filter was used. Eighteen centromeric sequences were used as queries: 15 probe sequences (KC290868-KC290916), the CCS1-R11H-2 gene (AB048245), the rye centromere-specific marker (KF719092) and contig 310431 of the 5xCS genome sequence.
Availability of supporting data
A genome browser of the physical map of the wheat chromosome 6B is available from the Unité de Recherche Génomique Info website (https://urgi.versailles.inra.fr/gb2/gbrowse/wheat_phys_pub/) and the Komugi Genome Sequence Program web site (http://komugigsp.dna.affrc.go.jp/index.html). The other supporting data are included as additional files.
The authors wish to thank Etienne Paux for critical editing of the manuscript. This work was supported by the Ministry of Agriculture, Forestry, and Fisheries of Japan (Genomics for Agricultural Innovation, grant number KGS-1003 and −1004; Genomics-based Technology for Agricultural Improvement, NGB-1003), the Czech Science Foundation (award P501/12/G090), and the National Program of Sustainability of the Czech Republic (LO1204).
- 1. Kihara H, Nishiyama I. Genomanalyse bei Triticum und Aegilops. I Genomaffinitäten in tri-, tetra- und pentaploiden Weizenbastarden. Cytologia. 1930;1:270–84.
- 2. Sears ER. The aneuploids of common wheat. Missouri Agricultural Experiment Station Research Bulletin. 1954;572:1–58.
- 9. The International Barley Genome Sequencing Consortium. A physical, genetic and functional sequence assembly of the barley genome. Nature. 2012;491:711–6.
- 27. McIntosh RA, Yamazaki Y, Dubcovsky J, Rogers J, Morris C, Appels R, et al. Catalogue of gene symbols for wheat. 2013 [http://www.shigen.nig.ac.jp/wheat/komugi/genes/download.jsp].
- 30. Sears E, Sears L. The telocentric chromosomes of common wheat. In: Proc 5th Int Wheat Genet Symp. New Delhi, India: Agricultural Research Institute; 1978. p. 389–407.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.