Integer Linear Programs for Discovering Approximate Gene Clusters
We contribute to the discussion about the concept of approximate conserved gene clusters by presenting a class of definitions that (1) can be written as integer linear programs (ILPs) and (2) allow several variations that include existing definitions such as common intervals, r-windows, and max-gap clusters (gene teams). While the ILP formulation does not directly lead to optimal algorithms, it provides unprecedented generality and is competitive in practice for those cases where efficient algorithms are known. It allows, for the first time, a non-heuristic study of large approximate clusters in several genomes. Source code and datasets are available at http://gi.cebitec.uni-bielefeld.de/assb.
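To make one of the existing definitions named above concrete: a common interval is a gene set that occupies contiguous positions in every genome under consideration. The following is a minimal brute-force sketch, assuming each genome is given as a permutation of integer gene labels. The function names are hypothetical, and the code is neither the ILP formulation nor the authors' implementation.

```python
from typing import List, Sequence, Set


def is_common_interval(gene_set: Set[int], genomes: Sequence[Sequence[int]]) -> bool:
    """Return True if gene_set is contiguous in every genome.

    Assumption: each genome lists every gene at most once.
    """
    for genome in genomes:
        # Positions of the set's genes in this genome, in increasing order.
        positions = [i for i, gene in enumerate(genome) if gene in gene_set]
        if len(positions) != len(gene_set):
            return False  # a gene is missing (or duplicated) in this genome
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False  # the genes are interrupted by a foreign gene
    return True


def common_intervals(genomes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Enumerate all common intervals of size >= 2 by brute force.

    Every common interval is an interval of the first genome, so it is
    enough to test each of those candidates against all genomes.
    """
    first = genomes[0]
    found = []
    for start in range(len(first)):
        for end in range(start + 1, len(first)):
            candidate = set(first[start:end + 1])
            if is_common_interval(candidate, genomes):
                found.append(sorted(candidate))
    return found


genomes = [[1, 2, 3, 4, 5], [3, 1, 2, 5, 4]]
result = common_intervals(genomes)
print(result)
```

The brute-force check only serves to pin down the exact (error-free) definition. Approximate clusters, which tolerate missing or additional genes, are what the ILP formulation is meant to address.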
Keywords: Gene Cluster; Gene Content; Integer Linear Program; Target Function; Genome Selection