Journal of Molecular Evolution, Volume 44, Issue 1, pp 66–73

Conserved Clusters of Functionally Related Genes in Two Bacterial Genomes

  • Javier Tamames
  • Georg Casari
  • Christos Ouzounis
  • Alfonso Valencia


An approach for genome comparison, combining function classification of gene products and sequence comparison, is presented. The genomes of Haemophilus influenzae and Escherichia coli are analyzed, and all genes are classified into nine major functional classes, corresponding to important cellular processes. To study gene order relationships and genome organization in the two bacteria, we compiled statistics on neighboring pairs of genes. To estimate the significance of the observations, a statistical model based on binomial distributions has been developed. Significant patterns of gene order are observed within, as well as between, the two bacterial genomes: Functionally related genes tend to be neighbors more often than do unrelated genes. Some of these groups represent well-known operons, but additional gene clusters are identified. These clusters correspond to genomic elements that have been conserved during bacterial evolution. In addition to nearest-neighbor relationships, the method is useful for studying the relative direction of transcription in genomes, which is also highly conserved between homologous gene pairs. This new approach combines the high-level description of molecular function with pair statistics that express genome organization. It is expected to complement traditional methods of sequence analysis in the study of genomic structure, function, and evolution.

Key words: Genome analysis — Bacterial evolution — Genome organization — Functional classification — Gene clusters 
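
The neighbor-pair statistic described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the binomial model, so everything below is an assumption made for illustration: the null hypothesis (functional classes assigned to positions independently, in proportion to their genome-wide frequencies), the treatment of the genome as circular, and the function names, which are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import comb


def same_class_neighbor_pairs(classes, circular=True):
    """Count adjacent gene pairs that share a functional class.

    `classes` lists one class label per gene, in genome order.
    Returns (number of same-class pairs, total number of pairs).
    """
    n = len(classes)
    n_pairs = n if circular else n - 1
    observed = sum(classes[i] == classes[(i + 1) % n] for i in range(n_pairs))
    return observed, n_pairs


def null_match_probability(classes):
    """Chance that two genes share a class if labels were drawn
    independently from the observed class frequencies (assumed null)."""
    n = len(classes)
    return sum((count / n) ** 2 for count in Counter(classes).values())


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Toy genome: eight genes in two functional classes (made-up data).
toy = list("AAAABBBB")
observed, n_pairs = same_class_neighbor_pairs(toy)
p_null = null_match_probability(toy)
p_value = binomial_upper_tail(observed, n_pairs, p_null)
print(observed, n_pairs, p_null, p_value)
```

A small tail probability would suggest that same-class genes are adjacent more often than the assumed null predicts. Treating neighboring pairs as independent trials is itself a simplification, since consecutive pairs share a gene.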





Copyright information

© Springer-Verlag New York Inc. 1997

Authors and Affiliations

  • Javier Tamames (1)
  • Georg Casari (2)
  • Christos Ouzounis (3)
  • Alfonso Valencia (1)

  1. Protein Design Group, CNB-CSIC, Campus U. Autonoma, Cantoblanco, E-28049 Madrid, Spain
  2. Biological Structures and BioComputing Programme, EMBL-Heidelberg, Meyerhofstrasse 1, D-69012 Heidelberg, Germany
  3. Artificial Intelligence Center, SRI International, Menlo Park, CA 94025-3493, USA
