Sequencing the Chickpea Genome

  • Aamir Waseem Khan
  • Mahendar Thudi
  • Rajeev K. Varshney
  • David Edwards
Part of the Compendium of Plant Genomes book series (CPG)


The importance of chickpea and the constraints on its production created a need for a chickpea genome sequence. In 2013, Varshney and colleagues reported the draft genome of kabuli chickpea. The genome assembly was 532.29 Mb, spanning 7,163 scaffolds, and contained 28,269 gene models. The estimated size of the chickpea genome was 738.09 Mb based on k-mer analysis. The draft genome assembly covered 73.8% of the total estimated genome size for chickpea. The predicted gene models were annotated, though UTRs and promoters have not yet been predicted. Genome duplication and synteny analysis with closely related legume crops showed gene conservation and segmental duplications spread across the draft genome assembly. The genome assembly provides a resource for targeting disease resistance genes, which are of agronomic importance. It has been used for genome-assisted breeding and is being further used to study the diversity and domestication of chickpea.


  1. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402
  2. Attwood TK, Bradley P, Flower DR, Gaulton A, Maudling N, Mitchell AL, Moulton G, Nordle A, Paine K, Taylor P, Uddin A (2003) PRINTS and its automatic supplement, prePRINTS. Nucleic Acids Res 31(1):400–402
  3. Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27(2):573
  4. Corpet F, Servant F, Gouzy J, Kahn D (2000) ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res 28(1):267–269
  5. Delcher AL, Salzberg SL, Phillippy AM (2003) Using MUMmer to identify similar regions in large sequence sets. Current Protocols in Bioinformatics 10.3. doi: 10.1002/0471250953.bi1003s00
  6. Edgar RC, Myers EW (2005) PILER: identification and classification of genomic repeats. Bioinformatics 21(1):i152–i158
  7. Elsik CG, Mackey AJ, Reese JT, Milshina NV, Roos DS, Weinstock GM (2007) Creating a honey bee consensus gene set. Genome Biol 8(1):1
  8. Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296(5565):92–100
  9. Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: ten years of next-generation sequencing technologies. Nature Rev Genet 17(6):333–351
  10. Jain M, Misra G, Patel RK, Priya P, Jhanwar S, Khan AW, Shah N, Singh VK, Garg R, Yadav M, Kant C, Sharma P, Bhatia S, Tyagi AK, Chattopadhya D (2013) A draft genome sequence of the pulse crop chickpea (Cicer arietinum L.). Plant J 74:715–729
  11. Jurka J (1995) Database of repetitive elements (Repbase). NCBI Database Repository
  12. Kanehisa M, Goto S (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28(1):27–30
  13. Kent WJ (2002) BLAT: the BLAST-like alignment tool. Genome Res 12(4):656–664
  14. Koboldt DC, Steinberg KM, Larson DE, Wilson RK, Mardis ER (2013) The next-generation sequencing revolution and its impact on genomics. Cell 155(1):27–38
  15. Letunic I, Doerks T, Bork P (2012) SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res 40:D302–D305
  16. Li L, Stoeckert CJ, Roos DS (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13(9):2178–2189
  17. Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25(5):955–964
  18. Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, He G, Chen Y, Pan Q, Liu Y, Tang J (2012) SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1(1):1
  19. Magrane M, UniProt Consortium (2011) UniProt Knowledgebase: a hub of integrated protein data. Database p bar009
  20. Metzker ML (2010) Sequencing technologies: the next generation. Nature Rev Genet 11(1):31–46
  21. Nawrocki EP, Kolbe DL, Eddy SR (2009) Infernal 1.0: inference of RNA alignments. Bioinformatics 25(10):1335–1337
  22. Parra G, Bradnam K, Korf I (2007) CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23(9):1061–1067
  23. Paterson AH, Bowers JE, Bruggmann R, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M (2008) The Sorghum bicolor genome and the diversification of grasses. Nature 457(LBNL-6812E)
  24. Price AL, Jones NC, Pevzner PA (2005) De novo identification of repeat families in large genomes. Bioinformatics 21(suppl 1):i351–i358
  25. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger A (2011) The Pfam protein families database. Nucleic Acids Res p gkr1065
  26. Ruperao P, Chan KCK, Azam S, Karafiátová M, Hayashi S, Čížková J, Saxena RK, Šimková H, Song C, Vrána J, Chitikineni A, Visendi P, Gaur PM, Millán T, Singh KB, Taran B, Wang J, Batley J, Doležel J, Varshney RK, Edwards D (2014) A chromosomal genomics approach to assess and validate the desi and kabuli draft chickpea genome assemblies. Plant Biotechnol J 12:778–786
  27. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326(5956):1112–1115
  28. Schuler GD (1997) Sequence mapping by electronic PCR. Genome Res 7(5):541–550
  29. Sigrist CJ, Cerutti L, De Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N (2010) PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res 38(suppl 1):D161–D166
  30. Soderlund C, Bomhoff M, Nelson WM (2011) SyMAP v3.4: a turnkey synteny system with application to plant genomes. Nucleic Acids Res p gkr123
  31. Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K, Muruganujan A, Narechania A (2003) PANTHER: a library of protein families and subfamilies indexed by function. Genome Res 13(9):2129–2141
  32. Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MT, Azam S, Fan G, Whaley AM, Farmer AD (2012) Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nat Biotechnol 30(1):83–89
  33. Varshney RK, Song C, Saxena RK, Azam S, Yu S, Sharpe A, Cannon S, Baek J, Rosen BD, Tar’an B, Millan T, Zhang X, Ramsay LD, Iwata A, Wang Y, Nelson W, Farmer AD, Gaur PM, Soderlund C, Penmetsa RV, Xu C, Bharti AK, He W, Winter P, Zhao S, Hane JK, Garcia NC, Condie JA, Upadhyaya HD, Luo MC, Thudi M, Gowda CLL, Singh NP, Lichtenzveig J, Gali KK, Rubio J, Nadarajan N, Dolezel J, Bansal KC, Xu X, Edwards D, Zhang G, Kahl G, Gil J, Singh KB, Datta SK, Jackson SA, Wang J, Cook DR (2013) Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement. Nat Biotechnol 31:240–246
  34. Varshney RK, Terauchi R, McCouch SR (2014) Harvesting the promising fruits of genomics: applying genome sequencing technologies to crop breeding. PLoS Biol 12(6):e1001883
  35. Xu Z, Wang H (2007) LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res 35:W265–W268
  36. Zdobnov EM, Apweiler R (2001) InterProScan: an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17(9):847–848

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Aamir Waseem Khan (1, 2)
  • Mahendar Thudi (1)
  • Rajeev K. Varshney (1)
  • David Edwards (2, corresponding author)

  1. International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
  2. School of Biological Sciences, The University of Western Australia, Perth, Australia
