Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

  • Victor M. Markowitz
  • Nikos C. Kyrpides
Part of the Methods in Molecular Biology™ book series (MIMB, volume 395)


Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries, and occurrence profile tools for comparing genes, pathways, and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.
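
As a rough illustration of what an occurrence profile captures (this is not IMG's actual implementation), the sketch below records the presence or absence of a gene family across a set of genomes and then selects the families found in every query genome but in none of the reference genomes. The genome names, family identifiers, and function names are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set

# Hypothetical input: genome name -> set of gene family identifiers found in it.
GenomeFamilies = Dict[str, Set[str]]


def occurrence_profile(genomes: GenomeFamilies, family: str, order: List[str]) -> str:
    """Return a presence/absence string ('1' or '0') for one family across genomes."""
    return "".join("1" if family in genomes[name] else "0" for name in order)


def unique_to(genomes: GenomeFamilies, present_in: List[str], absent_from: List[str]) -> Set[str]:
    """Families found in every `present_in` genome and in no `absent_from` genome."""
    # `present_in` is assumed to be non-empty.
    shared = set.intersection(*(genomes[name] for name in present_in))
    excluded = set().union(*(genomes[name] for name in absent_from))
    return shared - excluded


# Made-up toy data, for illustration only.
example = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam3"},
    "genome_C": {"fam1", "fam4"},
}
order = ["genome_A", "genome_B", "genome_C"]

profile_fam3 = occurrence_profile(example, "fam3", order)
only_ab = unique_to(example, ["genome_A", "genome_B"], ["genome_C"])
```

In this toy data set, `fam3` occurs in the first two genomes only, so it is the single family returned when genome_A and genome_B are compared against genome_C.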

Key Words

Comparative genome data analysis; integrated microbial genomes; occurrence profiles; microbial genome data management; gene occurrence profile; functional occurrence profile; gene model validation



We thank Krishna Palaniappan, Ernest Szeto, Frank Korzeniewski, Iain Anderson, Natalia Ivanova, Athanasios Lykidis, Kostas Mavrommatis, Phil Hugenholtz, Anu Padki, Kristen Taylor, Xueling Zhao, Shane Brubaker, Greg Werner, and Inna Dubchak for their contributions to the development and maintenance of IMG. Krishna Palaniappan and Iain Anderson helped improve the examples in this chapter with their comments and suggestions. Eddy Rubin and James Bristow provided support, advice, and encouragement throughout the IMG project. IMG uses tools and data from a number of publicly available resources; their availability and value are gratefully acknowledged. The work presented in this paper was supported by the Director, Office of Science, Office of Biological and Environmental Research, Life Sciences Division, US Department of Energy under contract no. DE-AC03-76SF00098.



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Victor M. Markowitz (1)
  • Nikos C. Kyrpides (2)
  1. Lawrence Berkeley National Laboratory, Berkeley, CA
  2. Lawrence Berkeley National Laboratory, Berkeley, CA
