Genomic Tools for the Study of Azospirillum and Other Plant Growth-Promoting Rhizobacteria

  • Víctor González
  • Luis Lozano
  • Patricia Bustos
  • Rosa I. Santamaría


Bioinformatics tools are essential for extracting biological knowledge from bacterial genomes. Many computational applications, algorithms, and programs are now available for deciphering genome structure, function, and evolution, and the specialized databases for depositing and retrieving genomic information have also grown in recent years. In this chapter, we highlight the basic bioinformatics procedures, databases, and web resources commonly used in bacterial genomics, as applied to Azospirillum and related bacteria.


Bacterial Genome; Genomic Island; Integrate Microbial Genome; KEGG Orthology; Protein Family Database
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The work on bacterial genomics is supported by grants to VG from CONACYT CB 131499 and PAPIIT-UNAM IN2084143. We thank Olga M. Carrascal and Irma Martínez for valuable comments on the manuscript and José Espíritu for computational help.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Víctor González (1), corresponding author
  • Luis Lozano (1)
  • Patricia Bustos (1)
  • Rosa I. Santamaría (1)

  1. Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Mexico
