Integer Linear Programs for Discovering Approximate Gene Clusters

  • Sven Rahmann
  • Gunnar W. Klau
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4175)

Abstract

We contribute to the discussion about the concept of approximate conserved gene clusters by presenting a class of definitions that (1) can be written as integer linear programs (ILPs) and (2) allow several variations that include existing definitions such as common intervals, r-windows, and max-gap clusters or gene teams. While the ILP formulation does not directly lead to optimal algorithms, it provides unprecedented generality and is competitive in practice for those cases where efficient algorithms are known. It allows for the first time a non-heuristic study of large approximate clusters in several genomes. Source code and datasets are available at http://gi.cebitec.uni-bielefeld.de/assb.
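
For orientation, the simplest of the existing definitions named above, common intervals, can be illustrated with a short brute-force sketch. It assumes two genomes given as permutations of the same gene set, and treats a common interval as a set of at least two genes that occupies consecutive positions in both. The function name and the toy genomes are hypothetical; this is neither the ILP formulation of the paper nor one of the efficient algorithms the abstract alludes to.

```python
def common_intervals(genome_a, genome_b):
    """Return gene sets (size >= 2) that are contiguous in both genomes.

    Assumption: both genomes are permutations of the same gene set,
    i.e. every gene occurs exactly once in each genome.
    """
    if len(set(genome_a)) != len(genome_a) or sorted(genome_a) != sorted(genome_b):
        raise ValueError("genomes must be permutations of the same gene set")

    # Position of every gene in the second genome.
    pos_b = {gene: i for i, gene in enumerate(genome_b)}

    found = []
    n = len(genome_a)
    for start in range(n):
        lo = hi = pos_b[genome_a[start]]
        for end in range(start + 1, n):
            p = pos_b[genome_a[end]]
            lo, hi = min(lo, p), max(hi, p)
            # genome_a[start..end] holds end - start + 1 distinct genes.
            # They are contiguous in genome_b exactly when the positions
            # they cover there span the same number of slots.
            if hi - lo == end - start:
                found.append(frozenset(genome_a[start:end + 1]))
    return found


if __name__ == "__main__":
    # Toy genomes, made up for illustration only.
    a = [1, 2, 3, 4, 5]
    b = [3, 1, 2, 5, 4]
    for cluster in common_intervals(a, b):
        print(sorted(cluster))
```

The sketch tolerates no missing or inserted genes at all, which is the kind of strictness that approximate cluster definitions are meant to relax.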



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Sven Rahmann (1, 2)
  • Gunnar W. Klau (3, 4)

  1. Algorithms and Statistics for Systems Biology group, Genome Informatics, Technische Fakultät, Bielefeld University, Bielefeld, Germany
  2. International NRW Graduate School in Bioinformatics and Genome Research
  3. Mathematics in Life Sciences group, Dept. of Mathematics and Computer Science, Free University Berlin, Berlin, Germany
  4. DFG Research Center Matheon “Mathematics for key technologies”, Berlin
