Integrated physical, genetic and genome map of chickpea (Cicer arietinum L.)
- First Online:
- Cite this article as:
- Varshney, R.K., Mir, R.R., Bhatia, S. et al. Funct Integr Genomics (2014) 14: 59. doi:10.1007/s10142-014-0363-6
- 1.8k Downloads
Physical map of chickpea was developed for the reference chickpea genotype (ICC 4958) using bacterial artificial chromosome (BAC) libraries targeting 71,094 clones (~12× coverage). High information content fingerprinting (HICF) of these clones gave high-quality fingerprinting data for 67,483 clones, and 1,174 contigs comprising 46,112 clones and 3,256 singletons were defined. In brief, 574 Mb genome size was assembled in 1,174 contigs with an average of 0.49 Mb per contig and 3,256 singletons represent 407 Mb genome. The physical map was linked with two genetic maps with the help of 245 BAC-end sequence (BES)-derived simple sequence repeat (SSR) markers. This allowed locating some of the BACs in the vicinity of some important quantitative trait loci (QTLs) for drought tolerance and reistance to Fusarium wilt and Ascochyta blight. In addition, fingerprinted contig (FPC) assembly was also integrated with the draft genome sequence of chickpea. As a result, ~965 BACs including 163 minimum tilling path (MTP) clones could be mapped on eight pseudo-molecules of chickpea forming 491 hypothetical contigs representing 54,013,992 bp (~54 Mb) of the draft genome. Comprehensive analysis of markers in abiotic and biotic stress tolerance QTL regions led to identification of 654, 306 and 23 genes in drought tolerance “QTL-hotspot” region, Ascochyta blight resistance QTL region and Fusarium wilt resistance QTL region, respectively. Integrated physical, genetic and genome map should provide a foundation for cloning and isolation of QTLs/genes for molecular dissection of traits as well as markers for molecular breeding for chickpea improvement.
KeywordsChickpea Physical map Genetic maps Reference genome sequence
Chickpea is a self-pollinated diploid (2n = 2x = 16) annual grain legume with a genome size of approximately 740 Mbp (Arumuganathan and Earle 1991). It is cultivated mostly on residual soil moisture in semi-arid regions of South Asia and sub-Saharan Africa. India is the largest producer of chickpea contributing about 60 % of total world’s production. Other important chickpea producing countries are Pakistan, Turkey, Mexico, USA, Canada and Australia (http://www.cgiar.org/our-research/crop-factsheets/chickpea/). The seeds of chickpea are rich in protein (24.6 %) and carbohydrate (64.6 %) and a good source of minerals and fibers and a source of food grain with low cholesterol levels (Abu-Salem and Abou 2011). There are two distinct chickpea types, Desi and Kabuli, different in their morphology. Desi-type chickpeas possess purple flower and small, dark and angular seeds and are largely consumed in Indian subcontinent and Pakistan, while Kabuli chickpeas have white flowers and large, cream-coloured seeds and are preferred in the Mediterranean basin and Central Asia where they are mainly consumed as a whole seed. Despite the increase in area and production during the last decade, there has not been a significant increase in the productivity of chickpea (Varshney et al. 2010). Considerable breeding efforts have been made across the globe to overcome the abiotic (drought, salinity, heat) and biotic (Helicoverpa, Fusarium wilt, Ascochyta blight) production constraints in chickpea. The genetic gains obtained through the conventional breeding efforts are not on par with the growing demands of the crop. Therefore, molecular breeding is now becoming one of the integral components of chickpea breeding programs (Chamarthi et al. 2011; Varshney et al. 2013a).
Recent advances in chickpea genomics have made it possible to not only develop large-scale molecular markers, genetic maps, transcriptomic resources for conducting large-scale and high-throughput marker genotyping but also sequence the genome of this important crop (Varshney et al. 2013b; Jain et al. 2013). For instance, >3,000 simple sequence repeat (SSR) markers (Nayak et al. 2010; Thudi et al. 2011; Choudhary et al. 2012), 15,360 Diversity Arrays Technology (DArT) loci (Thudi et al. 2011) and >3,000 single nucleotide polymorphism (SNP) markers (Hiremath et al. 2012; Gaur et al. 2012) have been developed. Several genetic maps have been constructed (Nayak et al. 2010; Millán et al. 2010; Gujaria et al. 2011; Thudi et al. 2011; Choudhary et al. 2012; Hiremath et al. 2012; Gaur et al. 2012), and hundreds of quantitative trait loci (QTLs) for drought tolerance (Varshney et al. 2014), salinity (Vadez et al. 2012), Ascochyta blight (Kottapalli et al. 2009; Anbessa et al. 2009), Botrytis grey mould (Anuradha et al. 2011) and Fusarium wilt (Tekeoglu et al. 2002; Udupa and Baum 2003) have been mapped. Further, large-scale expressed sequence tags (ESTs; Varshney et al. 2009) were generated and transcriptome assembly-constructed (Garg et al. 2011; Hiremath et al. 2012; Kudapa et al. 2014). In addition, several large-insert bacterial artificial chromosome (BAC) and binary bacterial artificial chromosome (BIBAC)-based libraries were also constructed for chickpea (Lichtenzveig et al. 2005; Zhang et al. 2010). The available genomic resources for chickpea molecular breeding community have been recently reviewed extensively by Upadhyaya et al. (2011) and Varshney et al. (2012).
In the context of development of physical maps, a BAC/BIBAC-based physical map was developed (Zhang et al. 2010). On this physical map, three large contigs closely linked to QTLs contributing to Ascochyta blight resistance and flowering time in chickpea were identified (Zhang et al. 2010). However, a comprehensive genome-wide physical map, and its integration with genetic maps possessing QTLs for important targeted traits and draft genome of chickpea, is the need of the hour for facilitating cloning of candidate genes and enhancing molecular breeding programs in this important crop.
In this study, we have constructed a genome-wide physical map employing large-insert BAC libraries and integrated it with the genetic maps and chickpea reference genome sequence for the identification of BAC clones covering important genomic regions for drought tolerance, Ascochyta blight and Fusarium wilt resistance. A comprehensive analysis of the targeted QTL regions in the integrated genetic, physical and genome maps has provided candidate genes that may be used for functional analysis as well as molecular breeding for drought tolerance and resistance to Ascochyta blight and Fusarium wilt.
Materials and methods
ICC 4958, a drought tolerant Desi chickpea genotype, was used for the construction of BAC libraries. DNA was isolated from etiolated young seedling as per Cuc et al. (2008).
Two BAC libraries were constructed from high molecular weight (HMW) genomic DNA processed at Amplicon Express, Pullman, Washington, using the method of Tao et al. (2002). Purified large DNA fragments were ligated with pCCBAC1 (Epicentre, Omaha, NE, USA). Ligations were transformed into DH10B Escherichia coli cells (Invitrogen, Grand Island, NY, USA) and plated on Luria-Bertani (LB) agar with appropriate chloramphenicol, X-gal and IPTG concentrations. Clones were robotically picked with a Genomic Solutions G3 into 384-well plates containing LB freezing media. Plates were incubated for 18 h, replicated and then frozen at −80 °C. The replicated copy was used as a source plate for physical mapping.
High information content fingerprinting
We used the four-colour high information content fingerprinting (HICF) SNaPshot method of Luo et al. (2003) with modification suited for ABI Genetic Analyzer 3730 platform (Gu et al. 2009). For each 384-well plate from the library, four 96-well blocks were inoculated using a 96-pin replicator. To control for plate orientation and fingerprinting, quality wells E7 and H12 were replaced in each 96-well block with a control BAC clone. The cultures were grown for 20 h at 37 °C and 400 rpm on an orbital shaker. The BAC DNA was purified using the Qiagen R.E.A.L. 96 Prep Kit (Qiagen, Valencia, CA, USA). Each BAC was simultaneously digested with four 6-bp recognizing restriction endonucleases generating 3′ recessed ends. Each of the four recessed 3′ ends was labeled with a different fluorescent dye using the SNaPshot kit (Applied Biosystems, Foster City, CA, USA). Restriction fragments were sized with a capillary DNA analyzer ABI 3730XL (Applied Biosystems, Foster City, CA, USA) using an internal GeneScan 1200 LIZ size standard. Fragment size calling was performed with the GeneMapper v. 3.7 (Applied Biosystems, Foster City, CA, USA).
Fingerprinting data editing
Outputs of size-calling files from GeneMapper software were automatically edited with the FP Miner program (Gu et al. 2009). This software package was used to distinguish peaks corresponding to restriction fragments from peaks generated by a background noise in the profile of each BAC fingerprint and to remove vector restriction fragments from the profiles. The program also removed substandard profiles that could negatively affect contigs assembly. The files generated by FP Miner were used in the fingerprinted contig (FPC) assembly.
Physical map contig assembly
Contigs were assembled from fragments within a size range of 70–1,000 bp using FPC software (version 9.3, http://www.agcol.arizona.edu/software/fpc/), using the following assembly strategy. We set tolerance at 6 (0.6 bp) and an initial cut-off of 1E − 60 (1 × 10−60). We followed the initial assembly with several rounds of DQer, until no contig containing 15 % or more questionable (Q) clones. This was followed by several rounds of end-to-end merging and single-to-end merging at progressively lower cut-off stringencies. The “best of” function was set to 100 builds.
FPC displays the length of each BAC clone and each BAC contig in consensus band (CB) units. To convert CB units to kilobytes, we estimated the lengths of inserts in 100 BAC clones in kb by pulsed field gel electrophoresis (PFGE) and divided the total length by the number of restriction fragments in the fingerprints of the clones. The conversion factor for the BAC clones was 1.92 kb/CB unit.
Anchoring of physical map with genetic maps
For anchoring the physical map with the genetic map, a set of 337 BAC clones corresponding to 337 polymorphic BAC-end sequence-derived SSR (BES-SSR) markers in inter-specific genetic map (ICC 4958 × PI 489777; Thudi et al. 2011) and two intra-specific genetic maps (ICC 4958 × ICC 1882 and ICC 283 × ICC 8261) were subjected to fingerprinting in the same way as mentioned above. The fingerprinted clones were tried for their assembly with the FPC contig physical map for anchoring genetic map with the physical map.
Alignment of physical map with draft genome of chickpea
A set of 4,290 BAC clones forming the minimum tilling path (MTP) of FPC contig were picked, and plasmid DNA was isolated as described by Huo et al. (2008). BAC clones were sequenced with BigDye Terminator v3.1 (Applied Biosystems, Foster City, CA, USA) using the pIndigoBAC-5 reverse primer (5′-CTCGTATGTTGTGTGGAATTGTGAGC-3′). Amplifications were carried out in 96-well plate (10 μl reaction volumes containing 1 μl of BigDye, 1.75 μl sequencing reaction buffer, 2 mM MgCl2, 5 % DMSO, 0.5 μM primer and 200–500 ng of BAC DNA) or 384-well plate (5 μl reaction volume). PCR was performed with 49 cycles of 96 °C for 10 s, 50 °C for 10 s and 60 °C for 4 min. The PCR product was purified and then analyzed on the 3730XL DNA analyzer (Applied Biosystems, Foster City, CA, USA). Base calling of ABI trace file was performed with PHRED (Ewing and Green 1998; Ewing et al. 1998), and sequences with quality scores of less than 20 were trimmed. Vector sequences were removed using Cross_match. BES less than 100 bp was also filtered.
Similarly, another set of 46,270 BAC-end sequences for CAH1 library that were used to develop novel SSR markers and development of high-density genetic map in chickpea were also used for in silico mapping onto draft chickpea genome during the present study. The details of construction of BAC library, BAC-end sequences is available elsewhere (Thudi et al. 2011).
In addition, efforts have also been made to map onto draft genome sequence the genetically mapped 1,328 marker loci including novel 625 Chickpea KASPar Assay Markers (CKAMs) and 314 tentative orthologous genes (TOGs)-SNPs from Hiremath et al. (2012) for integrating physical, sequence and genetic maps. For repeat analysis, a repeat database for chickpea genome was first built from the Kabuli draft genome already available with us (Varshney et al. 2013b), followed by search for different types of repeat sequences from the 53,316 BESs using RepeatMasker v 3.2.7 (Tarailo-Graovac and Chen 2009). The unique sequences were isolated from repetitive ones using NCBI BLAST+ ver 2.2.25 with a cut-off E value of less than 10−50. All the 53,316 BESs were used as Basic Local Alignment Search Tool (BLAST) query against chickpea repeat database.
The clean sequences after repetitive sequence analysis were subjected to mapping onto draft genome of Kabuli chickpea using NCBI BLAST+. From all of the BLAST alignments, clean BESs were extracted according to the following criteria (Katagiri et al. 2004; Asamizu et al. 2012): (i) sequence identity >90 % and alignment coverage >50 %, (ii) mapped positions of each pair of ends >100 bp and <200 kb apart in the same chromosome, (iii) direction of each paired end is correct, (iv) BLASTN E < 10−100, (v) a minimum of one hit for one of the paired ends and (vi) no redundant chromosomal locations.
Identification of genes in important QTL regions
In order to identify candidate genes present in the QTL regions for some important traits such as drought tolerance (“QTL-hotspot”, Varshney et al. 2014), Ascochyta blight resistance (Udupa and Baum 2003) and Fusarium wilt resistance (Sabbavarapu et al. 2013), the markers present in these QTL regions were subjected to BLAST against chickpea genome assembly (Varshney et al. 2013b) and the corresponding UniProt IDs were retrieved. For functional categorization of the genes, the UniProt IDs of the genes were mapped onto UniProt KB database (http://www.uniprot.org/).
New BAC libraries
In addition, another BAC library (CAH1) constructed from the chickpea cultivar ICC 4958 using HindIII restriction enzyme (Thudi et al. 2011) was also used in the present study. From this library, a set of 1,110 BACs containing resistance gene homolog (RGH) and BES-SSRs were used for developing the physical map.
Statistics of BAC fingerprinting of different libraries in chickpea
Clones with usable fingerprinting data
Clones in FPC
Summary of statistics of the chickpea physical map
Total no. of BAC clones targeted
71,094 (~12× coverage)
No. of BAC clones with usable data
Contig assembly results
No. of clones in assembly
No. of clones in contigs
No. of clones as singletons
3,256 (~300 Mb)
No. of contigs
Total contiguous coverage
Contig size distribution (no. of contigs by group)
No. of contigs with 2 clones
No. of contigs with 3–9 clones
No. of contigs with 10–24 clones
No. of contigs with 25–49
No. of contigs with 50–99 clones
No. of contigs with 100–199 clones
No. of contigs with ≥200 clones
Q clones statistics in contigs
No. of contigs without Q clones
No. of contigs with Q clones
Total no. of Q clones
Range of Q clones in contigs
1 to 89
Clone statistics in contigs
Total no. clones in 1,174 contigs
Range of clone in contigs
2 to 3,007
Average no. of clones in each contig
Genome coverage (total clones ×130 kb)
Band statistics in clones
Total no. of bands in clones
Average no. of bands in clones
Range of bands in clones
34 to 2,268
Minimum tilling path (MTP)
Total no. of clones in contigs
Physical map assembly
An analysis of Q clones in our data set indicated that 443 contigs out of total 1,174 contigs (37.73 %) contained a total of 2,093 Q clones (2,093 / 49,368 = 4.23 %) with a range of 1 to 89 Q clones/contig (Table 2). The low number (4.23 %) of Q clones in our assembly indicated the good quality and proper cut-off and tolerance values used for the assembly.
Anchoring physical map with genetic maps
In order to anchor the contig map with the genetic map, a total of 337 polymorphic BES-SSRs were used for mapping on three mapping populations (one inter- and two intra-specific crosses). However, only 259 BES-SSR markers were integrated on genetic maps based on three populations, namely, ICC 4958 × PI 489777 (inter-specific mapping population) and ICC 4958 × ICC 1882 and ICC 283 × ICC 8261 (intra-specific populations) (Supplementary Table 1). A consensus map was also developed based on genetic maps for intra-specific mapping populations (http://cmap.icrisat.ac.in/cmap/sm/cp/varshney/).
Distribution of 259 BES-SSR markers mapped on FPC assembly of chickpea
In addition to above, 619 genetically mapped marker loci including CKAMs and TOGs that were mapped onto chickpea draft genome sequence/scaffolds were found very close/linked to all 163 MTP clones of physical map mapped onto eight pseudo-molecules of draft genome sequence. This has also helped to anchor a physical map with the second-generation genetic map developed based on cost-effective SNP marker assays (Hiremath et al. 2012).
QTLs assigned to integrated genetic and physical map
A number of QTLs were identified for abiotic and biotic stresses earlier; in addition, we also identified QTLs for abiotic stress (drought tolerance, salinity) and biotic stress (like Fusarium wilt, Ascochyta blight and Botrytis grey mould) from published studies, and their corresponding location on the inter-specific map based on ICC 4958 × PI 489777 (Thudi et al. 2011) was determined (Supplementary Table 1).
Anchoring physical map with the draft genome of Kabuli chickpea
Results of BAC-end sequencing of ICC 4958
Total BAC clones of MTP
No. of BES generated
No of BAC clones targeted from CAH1 library
No. of BES generated
Total number of clones for in silico mapping
No. of BES generated
37.8 Mb (including redundant clones)
Mate pairs seq
2,978 (2,978 clones)
Results of in silico mapping of BAC clones on the chickpea reference genome
Length of pseudo-molecule (Mb)
No. of contigs
Pseudo-molecule coverage (bp)
Similarly, we also tried to map sequences of 1,328 genetically mapped loci from Hiremath et al. (2012) onto the daft genome sequence of chickpea. As a result, 619 CKAMs and TOGs were mapped onto the chickpea draft genome sequence/scaffolds and were found to be very close/linked to MTP clones of a physical map mapped onto draft genome sequence. The number of mapped marker loci on the draft genome sequence ranged from 43 (on Ca8) to 113 (on Ca4) with an average of ~77 loci/pseudo-molecule (Supplementary Fig. 1a–h). This helped us to anchor a physical map with the draft genome sequence of chickpea and genetic maps.
Mining candidate genes in select stress responsive QTL regions
After identification of BAC contigs/clones in the region or proximity QTLs for drought tolerance (“QTL-hotspot”), Ascochyta blight and Fusarium wilt resistance, an effort was made to mine candidate genes in the genomic regions after aligning molecular markers present in/associated with the QTL regions on genome map/sequence assembly. For instance, in the case of “QTL-hotspot” region, analysis of seven SSR markers (ICCM0249, NCPGR127, TAA170, NCPGR21, TR11, STMS 11 and GA24) on CaLG04 with a genome sequence identified a region of ca. 7 Mb (starting at 9.1 Mb and ending at 16.1 Mb) on Ca4 chromosome. Genome annotation of this region has identified a total of 654 genes (Fig. 5; Supplementary Table 3). In the case of Ascochyta blight resistance, analysis of two SSR markers (GA16 and CaM1135) present in QTL region/genes ara1 and ara2a (Thudi et al. 2011) identified a 3 Mb region (starting at 31 Mb and ending at 34 Mb) on Ca2 that contained a total of 306 genes (Supplementary Table 4). However, the markers TA130 and CaM0232 on LG4 (Thudi et al. 2011; Fig. 5) corresponding to ara2b QTL/gene could not be mapped; hence, this QTL region was not analyzed in details. For Fusarium wilt resistance QTL region, analysis of two markers (CaM1125 and TR44; Sabbavarapu et al. 2013) on genome sequence provided a relatively smaller region (0.6 Mb, between 29 and 29.6 Mb), comprising 23 genes (Supplementary Table 5).
Functional categorization of genes present in the QTL regions of drought tolerance ("QTL-hotspot"), Fusarium wilt and Ascochyta blight resistance
Number of genes in different QTL regions
“QTL-hotspot” (Varshney et al. 2014)
Ascochyta blight resistance ara1 and ara2a (Udupa and Baum 2003)
Fusarium wilt resistance (Sabbavarapu et al. 2013)
Transcription factor binding transcription factor activity
Structural molecule activity
Electron carrier activity
Enzyme regulator activity
Sequence-specific DNA binding transcription factor activity
Signal transducer activity
Nutrient reservoir activity
Superoxide dismutase activity
Structural constituent of ribosome
MAP kinase activity
Response to stimulus
Establishment of localization
Cellular component organization or biogenesis
Multicellular organismal process
Immune system process
Single organism signaling
Developmental process involved in reproduction
Defense response to bacterium, incompatible interaction
Single-multicellular organism process
Regulation of biological process
Response to other organism
Systemic acquired resistance
Multicellular organismal process
In the case of “QTL-hotspot” region, out of 654 genes, 362 genes were assigned to “molecular function” category, 252 genes to “cellular component” category and 370 genes to “biological process” category, respectively. It is important to note here that the sum of genes assigned to different functional categories (984) is higher than the total number of genes (654), as a given gene may fall in more than one category. In the molecular function category, the highest number of genes fell into catalytic activity (239) followed by binding (219). Under cellular component category, the highest number of genes fell into cell part (190) followed by organelle (160). Similarly, in the biological processes category, a maximum number of genes fell into metabolic process (296) followed by cellular process (239).
For Ascochyta blight QTL region, 157 genes were assigned to molecular function, 112 genes to cellular component and 162 genes to biological processes. The highest number of genes under molecular function category fell into catalytic activity (95). In the cellular component category, a maximum number of genes (88) belonged to cell part class. Similarly, under biological processes category, the highest number of genes fell to metabolic process class that contained the highest number of genes (118).
In the case of Fusarium wilt resistance QTL region, 14 genes were assigned to molecular function category, 9 genes to cellular component category and 14 genes to biological processes, respectively. Among these categories, binding class with 11 genes, cell part with 9 genes and cellular process with 12 genes were found in abundance under molecular function, cellular component and biological process categories, respectively.
Chickpea is one of the important diploid legume crops (with moderate genome size of 738 Mb) for the poor, largely vegetarian people living in semi-arid tropics (SAT) and other climate vulnerable regions in the world. The availability of physical maps in crop plants serves as an indispensable tool and have been used for a variety of purposes including (i) QTL fine mapping and positional (map-based) cloning of QTLs/genes (Bakker et al. 2003), (ii) anchoring chromosomes using fluorescence in situ hybridization (FISH) (Islam-Faridi et al. 2002), (iii) repeat classification (Cardle et al. 2000), (iv) draft genome sequence assembly (Sasaki et al. 2005), (v) marker development experiments (van der Vossen et al. 2000) and (vi) analysis of structural variation in the genome (Kidd et al. 2008). In the recent past, significant progress and tremendous advancements have been made in the area of sequencing/resequencing crop genomes using next-generation sequencing (NGS) technologies (Hillier et al. 2008; Wheeler et al. 2008; Thudi et al. 2012). However, despite the advancements in NGS technologies, the need and value for development of high-quality physical maps still remains high (Lewin et al. 2009). For instance, the sequencing/assembly of crops possessing complex genomes with large fractions of repetitive genome will not be easily addressed by NGS alone but will be facilitated by physical maps which would provide anchor points to link sequence contigs and bridge gaps due to large repeat regions. One of the effective ways for providing these anchor points is by construction of a whole genome physical map using bacterial artificial chromosome (BAC) clones (Shizuya et al. 1992; Rounsley et al. 2009) and BAC-end sequencing (Nelson et al. 2005). The development of physical maps using BAC clones has been proven effective for the construction of genome-wide physical maps, since generation and storage of these clones is somewhat relatively easy (Gregory et al. 1997; Marra et al. 1997; Klein et al. 2000; Wu et al. 2004). Therefore, BAC-based physical maps combined with BAC-end sequencing have formed the basis of several whole genome sequencing projects (Sasaki et al. 2005; Wei et al. 2007). Keeping the importance of physical maps in view, a BAC-based physical map has been developed for chickpea in the present study.
Physical map assembly and genome coverage/representation of chickpea genome
In the present study, we reported the construction of a high-quality, high-coverage BAC-based physical map of chickpea genome using genomic DNA of cultivar ICC 4958. The physical map has a total of 49,368 BAC clones assembled into 1,174 contigs. The number of clones in each contig varied from 2 to 3,007 with an average of ~39 clones/contig spanning from >200 kb to 4.1 Mb, with an average physical length of ~489 kb (574 / 117 = 0.4889 Mb). These contigs collectively span 615 Mb, smaller than the expected 738 Mb estimated genome size of chickpea genome by 17 %.
The genome coverage (eight times) of chickpea genome (46,112 clones) assembled into chickpea physical map assembly (Table 2) was sufficient. Comparative genome coverage (10× coverage) has been found earlier to be sufficient for developing a physical map of the chickpea genome (Zhang et al. 2010). Similarly, the number of clones used for fingerprinting (70,321) has been also found sufficient for generating a quality physical map, since less number of clones (67,584) were fingerprinted in a previous study of chickpea physical map development (Zhang et al. 2010).
The chickpea physical map was developed using two BAC libraries based on patterns of fragments generated by digestion of restriction enzymes (BamHI, EcoRI, XbaI, XhoI and HaeIII). The genome size estimated by K-mer analysis revealed a genome size of 738 Mb. However, genome size of only 615 Mb could be assembled into contigs, while 300 Mb remained in the form of singletons. The estimation of coverage of physical map was based on contigs only, and if singletons were also included, then the genome coverage of chickpea physical map might be overestimated (615 + 300 = 915 Mb). The discrepancy in genome size estimation is not unusual as in earlier studies; also, overestimation of genome size has been reported such as in Populus (about 20 % larger; Kelleher et al. 2007) and soybean (26.3 % larger; Wu et al. 2004). However, in the case of papaya, the genome size estimated based on the physical map was close to the actual size (Yu et al. 2009). The smaller genome size obtained in the present study may be attributed to the following: (i) false end-merges of clones and formation of large contigs; (ii) more overlaps between contigs; (iii) 3,256 or 6.5 % of clones remained as singletons, and 4.4 % (2,137) of the clones were located on small contigs (containing fewer than ten clones); (iv) average insert size estimation was not perfect based on representative samples of BAC libraries; (v) genome size of chickpea not perfectly estimated earlier; and (vi) the use of only two restriction enzymes for BAC library construction, since the use of more number of RE in BAC library construction has been found to increase the actual coverage of map (Zhang et al. 2010). The existence of more overlap than the expected between the contigs will reduce the overall size of contigs and, hence, the physical map assembly. Discrepancies in genome size of chickpea was also reported in an earlier study of the development of physical map in chickpea; however, they reported more genome size than the expected 738 Mb (Zhang et al. 2010). Despite the genome size estimation discrepancy, the quality of the chickpea physical map developed in the present study was found to be sufficiently high for use in various applications of genomics research.
Integrating physical map with the genetic/QTL maps
A genetic linkage map is constructed by placing genetic loci on chromosomes based on recombination frequencies, while a physical map is constructed based on overlapping restriction patterns of large insert BAC clones. Therefore, integration of these two different types of maps may reveal recombination hotspots as well as regions suppressed for recombination. In addition, this may allow an estimation of physical distances between genetic markers, thereby providing a framework for assembling the whole genome shotgun sequences.
Keeping in view the importance of integrating physical maps with genetic maps, we tried to anchor our physical map assembly with the genetic map by fingerprinting of 337 BAC clones and assembly of 319 of these clones into physical map assembly. The SSR markers derived from BAC-end sequencing of these clones had been found to be polymorphic and were mapped in the three bi-parental mapping populations derived from crosses ICC 4958 × PI 489777, ICC 4958 × ICC 1882 and ICC 283 × ICC 8261. Therefore, integrating these BAC clones into the physical map assembly helped to anchor our physical map with these three genetic maps (Supplementary Table 1).
The integration of physical map with genetic maps has been reported earlier in different model plant species including Arabidopsis (Meinke et al. 2009), rice (Chen et al. 2002) and fruit trees including peach (Zhebentyayeva et al. 2008), papaya (Yu et al. 2009) and apple (Han et al. 2011). The anchoring of physical map with the genetic map helped us to identify the BAC clones within or close to some of the QTLs already reported for important traits in chickpea (Supplementary Table 1). Therefore, the chickpea physical map may greatly help in cloning and fine mapping of some of these important genes in the future, thereby facilitating molecular breeding in this important crop.
Further, additional efforts are necessary to integrate and correlate the physical contig map with the other available genetic maps of chickpea for locating all mapped genes and QTLs to physical map contigs. These efforts have been further accelerated with the recent availability of the whole genome sequences of chickpea from NIPGR, New Delhi, India (Jain et al. 2013), and by the International Chickpea Genome Sequencing Consortium (Varshney et al. 2013b).
Integrated physical and genetic map is expected to act as an important resource for expediting the mapping and cloning of target genes controlling complex quantitative traits like drought and yield in chickpea. The availability of gene information will facilitate molecular breeding for improving important targeted traits in this important legume crop.
Integrating physical map with the Kabuli chickpea genome map
Genome sequencing of all the model legumes including Lotus (Sato et al. 2008), soybean (Schmutz et al. 2010) and Medicago (Young et al. 2011) has been completed. Following model genomes, several other legume genomes have been sequenced including pigeon pea (Varshney et al. 2012), common bean (Scott Jackson, personal communications) and, recently, chickpea (Varshney et al. 2013b), and efforts are being made to sequence peanut, lentil and cowpea. This revolution in whole genome sequencing of crop genomes has made the integration of physical maps with the draft genome sequence possible. Chromosome-anchored physical maps may serve as an important tool for (i) improving the draft genome sequence by filling sequence gaps; (ii) resolving assembly errors caused due to repetitive sequences, large gene families and segmental duplications; (iii) providing anchor pints; (iv) ordering of sequence contigs/scaffolds; (v) map-based cloning; and (vi) cloning sequences that are too large or repetitive for PCR-based cloning (Ha et al. 2012).
Mapping of BESs from one variety onto the reference genome sequence of a different variety may help to identify putative chromosomal structural rearrangements between the two varieties. This has been done successfully for the identification of structural rearrangements between Glycine soja and Glycine max by aligning BESs from G. soja against the G. max reference sequence (Ha et al. 2012). Thus, physical maps will also help to investigate a structural evolution that has occurred between two genomes and to allow researchers to effectively shuttle between the genomes to capture useful information for crop improvement and basic genetics research (Ha et al. 2012).
The availability of genome sequence of CDC Frontier, a Canadian Kabuli chickpea variety (Varshney et al. 2013b), has facilitated the integration of BESs from the physical map including BES of MTP clones form the chickpea cultivar ICC 4958 with the daft sequence of chickpea. A set of only ~965 BACs including 163 MTP clones could be mapped onto the draft genome of chickpea. These BACs formed 491 hypothetical contigs representing 54,013,992 bp (~54 Mb) of reference genome. The mapping of 163 MTP clones on eight chickpea pseudo-molecule ranged from 11 (on Ca8) to 29 (Ca4) with an average of 20 clones/pseudo-molecule (Table 5). These mapped MTP clones coming from different contigs of the physical map represent ~70 Mb genome of physical map integrated into draft genome and genetic map. The low number of BACs mapped on the reference genome during the present study may be due to the following: (i) exclusion of 37,230 BES having different types of repeat elements; (ii) exclusion of BAC clones, where paired BESs aligned <100 and >200 kb apart; (iii) exclusion of majority of BACs showing multiple region mapping on either the same pseudo-molecule or different pseudo-molecules; and (iv) exclusion of several paired BES from mapping showing orientation in either the same or opposite directions indicating potential inversions. Similar type of potential inversions, repeat elements, mapping on different regions on pseudo-molecules have been reported earlier during the in silico mapping and integration of physical maps with the genome sequence in rice (Katagiri et al. 2004), tomato (Asamizu et al. 2012) and soybean (Ha et al. 2012). The removal of 37,230 repeat BES from chromosomal mapping was aimed to avoid the complications of map construction. Integration of the physical map of a Desi variety during the present study with the draft genome sequence of Kabuli chickpea will help in locating chromosomal structural arrangements between Desi (ICC 4958) and Kabuli (CDS Frontier) genotypes in future studies.
Candidate stress-responsive genes
Among 654 genes present in the “QTL-hotspot”, the GO annotation-indicated genes like ERECTA-like kinase have a role in drought tolerance, and it has been reported that this gene enhances transpiration efficiency (Masle et al. 2005) and water use efficiency and transpiration efficiency in Arabidopsis (Xing et al. 2011). Several other genes like thiamine thiazole synthase and cysteine-rich receptor-like protein kinase, calmodulin-binding heat shock protein as well as other heat shock transcription factors, tubulin beta-chain, have roles in adaptation to various stress conditions. In addition, genes like fructose-bisphosphate aldolase, transmembrane protein which respond to salt stress as well as CRT/DRE-binding factor 4 gene which has shown a prominent role in abiotic stress response in Medicago (Li et al. 2011) were also found in this region, indicating the potential of this region to enhance the tolerance to several abiotic stresses.
In the case of Ascochyta blight resistance QTL region, among 306 genes, genes like BED finger-nbs resistance protein and gene with leucine-rich repeat domain are typically involved in host resistance mechanism like DNA-directed RNA polymerase subunit beta, receptor-like protein kinase and Ser-Thr protein kinase. Further, this region also harbours NAC domain protein for systemic acquired resistance as well as NB-LRR-type disease resistance protein Rps1-k-2 and ABC-type transport system involved in resistance to organic solvents’ periplasmic component.
Similarly, in the case of Fusarium wilt resistance QTL region, genes that have shown their activity in defense response to bacterium in Medicago like pheophorbide A oxygenase were found. This region also harbours superoxide dismutase which has shown differential expression in WR 315 compared to JG 62 in infected state as well as zinc finger protein and translation initiation factor (Gupta et al. 2013).
These candidate genes may be validated by using some functional genomics approaches like Targeting Induced Local Lesions IN Genome (TILLING), quantitative RT-PCR, over expression, etc., using the appropriate genetic material. In summary, integration of the physical map with the genetic maps and genome sequence of chickpea should help the international chickpea community for cloning of genes related to important tolerance/resistance to abiotic and biotic stresses as well as agronomic traits.
The authors are thankful to Douglas R. Cook, University of California, Davis, USA, and Aamir W. Khan for their help during analysis and interpretation of data. This study was supported as a part of the Tropical Legumes I (TLI- Objective 4: Chickpea) project funded by Bill & Melinda Gates Foundation (BMGF) through CGIAR Generation Challenge Programme (RKV) and National Institute of Plant Genome Research (NIPGR), Department of Biotechnology, Government of India (SB). This work has been undertaken as part of the CGIAR Research Program on Grain Legumes. ICRISAT is a member of CGIAR Consortium.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.