Volume 106, Issue 1–2, pp 159–170

Functional genomics and enzyme evolution

  • M.Y. Galperin
  • E.V. Koonin


Computational analysis of complete genomes, followed by experimental testing of the emerging hypotheses (the area of research often referred to as ‘functional genomics’), aims at deciphering the wealth of information contained in genome sequences and at using it to improve our understanding of the mechanisms of cell function. This review centers on recent progress in genome analysis, with special emphasis on new insights into enzyme evolution. Standard methods of predicting functions for new proteins are listed, and common errors in their application are discussed. A phylogenetic approach to improving functional predictions is introduced, as implemented in the recently constructed Clusters of Orthologous Groups (COG) database, which is available online. This approach provides a convenient way to characterize the protein families (and metabolic pathways) that are present or absent in any given organism. Comparative analysis of microbial genomes based on this approach shows that metabolic diversity generally correlates with genome size: parasitic bacteria encode fewer enzymes and fewer metabolic pathways than their free-living relatives. Comparison of different genomes reveals another evolutionary trend, non-orthologous gene displacement, in which an enzyme is replaced by an unrelated protein with the same cellular function. An examination of the phylogenetic distribution of such cases provides new clues to the problems of biochemical evolution, including the evolution of glycolysis and the TCA cycle.
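The presence/absence analysis described in the abstract can be illustrated with a minimal sketch. It assumes that each protein family is represented simply as the set of genomes encoding at least one member; the family and genome names, the data, and the function names are hypothetical placeholders, not content of the COG database or code from the authors. The sketch builds a presence/absence pattern for each family, counts families per genome, and flags pairs of families that never co-occur yet together cover every genome. Such complementary pairs are only candidates for non-orthologous gene displacement, since the shared function would have to be established separately.

```python
from itertools import combinations

# Toy, made-up presence/absence data: family -> genomes encoding a member.
# All names are hypothetical placeholders, not real COG content.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D"]
FAMILIES = {
    "family_1": {"genome_A", "genome_B"},
    "family_2": {"genome_C", "genome_D"},
    "family_3": {"genome_A", "genome_B", "genome_C", "genome_D"},
    "family_4": {"genome_A"},
}


def phyletic_pattern(present_in, genomes):
    """Return one character per genome: '1' if the family is present, else '0'."""
    return "".join("1" if g in present_in else "0" for g in genomes)


def families_per_genome(families, genomes):
    """Count how many families each genome encodes."""
    return {g: sum(g in members for members in families.values()) for g in genomes}


def complementary_pairs(families, genomes):
    """Return pairs of families that never co-occur but jointly cover all genomes."""
    all_genomes = set(genomes)
    pairs = []
    for name_a, name_b in combinations(sorted(families), 2):
        set_a, set_b = families[name_a], families[name_b]
        if set_a and set_b and not (set_a & set_b) and (set_a | set_b) == all_genomes:
            pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    for name in sorted(FAMILIES):
        print(name, phyletic_pattern(FAMILIES[name], GENOMES))
    print(families_per_genome(FAMILIES, GENOMES))
    print(complementary_pairs(FAMILIES, GENOMES))
```

In this toy data set only `family_1` and `family_2` have complementary patterns. A real analysis would also need to tolerate imperfect complementarity, for example genomes that encode both forms of an enzyme.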





Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • M.Y. Galperin (1)
  • E.V. Koonin (1)
  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
