
Using the COG Database to Improve Gene Recognition in Complete Genomes

Genetica

Abstract

A complete understanding of the biology of an organism necessarily starts with knowledge of its genetic makeup. Proteins encoded in a genome must be identified and characterized, and the presence or absence of specific sets of proteins must be noted in order to determine the possible biochemical pathways or functional systems utilized by that organism. The COG database presents a set of tools suited to these purposes, including the ability to select protein families (COGs) that contain proteins from a specified set of species. The selection is based upon a phylogenetic pattern, which is a shorthand representation of the presence or absence of a particular species in a COG. Here we present the use of phylogenetic patterns as a means to perform targeted searches for undetected protein-coding genes in complete genomes.
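As a rough illustration of the idea described above, the following minimal sketch represents each COG as the set of species that have a member protein in it, renders that set as a phylogenetic pattern string, and then selects COGs that contain every species in a chosen set but lack one target genome. Such COGs would be natural starting points for a targeted search for an undetected gene in that genome. The COG identifiers, the one-letter species codes, the `-` placeholder for absence, and the function names are all hypothetical and chosen for illustration; they are not taken from the COG database or from the authors' software.

```python
from typing import Dict, List, Set

# Hypothetical toy data: COG identifier -> species codes that have a member protein.
# Identifiers and one-letter codes are illustrative only, not real COG database content.
COGS: Dict[str, Set[str]] = {
    "COG_A": {"e", "h", "r", "m"},
    "COG_B": {"e", "h", "m"},
    "COG_C": {"e", "r"},
    "COG_D": {"h", "r", "m"},
}

# Fixed species order used when writing a pattern (an assumption of this sketch).
SPECIES_ORDER: List[str] = ["e", "h", "r", "m"]


def phylogenetic_pattern(members: Set[str], species_order: List[str]) -> str:
    """Write the species code where the species is present, '-' where it is absent."""
    return "".join(code if code in members else "-" for code in species_order)


def cogs_missing_only(target: str, required: Set[str],
                      cogs: Dict[str, Set[str]]) -> List[str]:
    """Return COGs that contain every species in `required` but lack `target`."""
    return sorted(
        cog_id
        for cog_id, members in cogs.items()
        if required <= members and target not in members
    )


if __name__ == "__main__":
    for cog_id, members in sorted(COGS.items()):
        print(cog_id, phylogenetic_pattern(members, SPECIES_ORDER))
    # COGs present in e, h and m but absent from r: candidates for a missed gene in r.
    print(cogs_missing_only("r", {"e", "h", "m"}, COGS))
```

In this toy data set only one COG is present in all three required species while lacking the target, so it alone would be flagged for a closer look at the target genome's sequence.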




Cite this article

Natale, D.A., Galperin, M.Y., Tatusov, R.L. et al. Using the COG Database to Improve Gene Recognition in Complete Genomes. Genetica 108, 9–17 (2000). https://doi.org/10.1023/A:1004031323748


  • DOI: https://doi.org/10.1023/A:1004031323748
