Gecko and GhostFam

Rigorous and Efficient Gene Cluster Detection in Prokaryotic Genomes
  • Thomas Schmidt
  • Jens Stoye
Part of the Methods In Molecular Biology™ book series (MIMB, volume 396)


A popular approach in comparative genomics is to locate groups, or clusters, of orthologous genes in multiple genomes and to postulate a functional association between the genes in each cluster. Rigorous and efficient detection across multiple genomes requires an appropriate model of gene clusters together with efficient algorithms for locating them. The Gecko method described herein is founded on a formal string-based gene cluster model and was designed to serve as a basic tool for the detection and visualization of gene clusters in prokaryotic genomes.
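To make the string-based model more concrete, the following is a minimal sketch of the common-intervals idea named in the key words. It assumes that each genome is given as a sequence of gene-family identifiers and that a cluster is a set of families occurring contiguously, in any order, in both genomes. The function names (`character_sets`, `common_intervals`) and the `min_size` parameter are hypothetical, and the naive enumeration shown here is for illustration only; it is not the algorithm implemented in Gecko.

```python
from collections import defaultdict


def character_sets(genome):
    """Map the family set of every contiguous interval to its (start, end) positions.

    `genome` is a sequence of gene-family identifiers (an assumption of this sketch).
    Positions are 0-based and inclusive.
    """
    occurrences = defaultdict(list)
    for start in range(len(genome)):
        seen = set()
        for end in range(start, len(genome)):
            seen.add(genome[end])
            occurrences[frozenset(seen)].append((start, end))
    return occurrences


def common_intervals(genome_a, genome_b, min_size=2):
    """Return the family sets that occur as a contiguous interval in both genomes.

    The result maps each shared set to a pair of position lists, one per genome.
    Naive enumeration of all intervals; illustrative only.
    """
    sets_a = character_sets(genome_a)
    sets_b = character_sets(genome_b)
    shared = sets_a.keys() & sets_b.keys()
    return {
        families: (sets_a[families], sets_b[families])
        for families in shared
        if len(families) >= min_size
    }
```

For example, for the two toy genomes `[1, 2, 3, 4, 5]` and `[3, 1, 2, 6, 5, 4]`, the sketch reports the family sets {1, 2}, {1, 2, 3}, and {4, 5}: each occurs contiguously in both genomes even though the gene order differs. Because intervals are compared as sets, a family that appears more than once in a genome (paralogs) is handled without special treatment.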

Key Words

Comparative genomics; gene cluster; Gecko; GhostFam; common intervals



The authors wish to thank Christian Rückert and Jörn Kalinowski for their helpful discussions on the topic of gene clusters and their valuable feedback during the development of GhostFam and Gecko.



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Thomas Schmidt (1)
  • Jens Stoye (2)
  1. Technische Fakultät, Universität Bielefeld, International NRW Graduate School in Bioinformatics and Genome Research, Germany
  2. Technische Fakultät, Universität Bielefeld, Germany
