Journal of Molecular Evolution, Volume 62, Issue 1, pp 1–14

A Comparative Categorization of Protein Function Encoded in Bacterial or Archeal Genomic Islands

  • Rainer Merkl


Genomes of prokaryotes harbor genomic islands (GIs), which are frequently acquired via horizontal gene transfer (HGT). Here I present an analysis of GIs with respect to gene-encoded functions. GIs were identified by statistical analysis of codon usage and clustering. Genes classified as putatively alien (pA) or putatively native (pN) were categorized according to the COG database. Among pA and pN genes, the distribution of COG functions and classes was studied for different groupings of prokaryotes. Groups were formed according to taxonomic relationship or habitat. In all groups, genes related to class L (replication, recombination, and repair) were statistically significantly overrepresented in GIs. GIs of bacteria and archaea showed distinct patterns of preferences. In archaeal GIs, genes belonging to COG class M (cell wall/membrane/envelope biogenesis) or Q (secondary metabolites biosynthesis, transport, and catabolism) were more frequent. In bacterial GIs, genes of classes U (intracellular trafficking, secretion, and vesicular transport), N (cell motility), and V (defense mechanisms) were predominant. Underrepresentation was strongest for genes belonging to class J (translation, ribosomal structure, and biogenesis). Among single COG functions overrepresented in GIs were transferases and transporters. In both superkingdoms, HGT enhances genomic content by meeting demands that are independent of the studied habitats. These findings are in agreement with the complexity theory, which predicts the preferential import of operational genes. However, only specific subsets of operational genes were enriched in GIs. Modification of the cell envelope, cell motility, secretion, and protection of cellular DNA are major issues in HGT.
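To make the notion of a COG class being "statistically significantly overrepresented" among pA genes concrete, the following is a minimal sketch of one common way to test such enrichment: a one-sided hypergeometric test. The abstract does not state which statistical test was used, so the choice of test is an assumption, and the function name and all counts are hypothetical values chosen only for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for a hypergeometric variable X.

    N: all COG-annotated genes (pA + pN)
    K: genes belonging to the COG class of interest
    n: genes classified as putatively alien (pA)
    k: pA genes that belong to the COG class of interest
    """
    upper = min(K, n)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical toy counts, NOT taken from the paper.
N_GENES = 1000      # all COG-annotated genes in one genome
N_CLASS_L = 60      # genes assigned to COG class L
N_PA = 100          # genes classified as pA
N_PA_CLASS_L = 15   # pA genes assigned to class L

p_value = hypergeom_upper_tail(N_PA_CLASS_L, N_CLASS_L, N_PA, N_GENES)
expected = N_PA * N_CLASS_L / N_GENES
print(f"expected {expected:.1f} class L genes among pA, "
      f"observed {N_PA_CLASS_L}, p = {p_value:.2e}")
```

Underrepresentation, as reported for class J, could be tested analogously with the lower tail. When many COG classes are tested across several groups of genomes, a multiple-testing correction would normally be applied as well.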


Genomic islands; Horizontal gene transfer; Lateral gene transfer; Clusters of Orthologous Groups of genes; Xenologous genes



This project was carried out within the framework of the Competence Network Göttingen “Genome Research on Bacteria” (BiotechGenoMik) financed by the German Federal Ministry of Education and Research (BMBF). I thank Rüdiger Schmidt for his help in preparing the manuscript and the referees for their constructive comments.



Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  1. Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Regensburg, Germany
