A Pseudo-boolean Programming Approach for Computing the Breakpoint Distance Between Two Genomes with Duplicate Genes

Angibaud, Sébastien; Fertin, Guillaume; Rusu, Irena; Thévenin, Annelyse; Vialette, Stéphane

doi:10.1007/978-3-540-74960-8_2

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4751)

Included in the following conference series:

RECOMB International Workshop on Comparative Genomics

Abstract

Comparing genomes of different species has become a crucial problem in comparative genomics. Recent research has produced several genomic distance definitions: number of breakpoints, number of common intervals, number of conserved intervals, Maximum Adjacency Disruption number (MAD), etc. Classical methods for computing genomic distances between whole genomes (usually based on permutations of gene order) are, however, seriously compromised when several copies of the same gene may be scattered across a genome. Most approaches to overcoming this difficulty are based on the exemplar method (keep exactly one copy of each duplicated gene in each genome) and the maximum matching method (keep as many copies as possible of each duplicated gene in each genome). Unfortunately, in the presence of duplications most of these problems turn out to be NP-hard, and hence several heuristics have recently been proposed.

Extending research initiated in [2], we propose in this paper a novel generic pseudo-boolean approach for computing the exact breakpoint distance between two genomes in the presence of duplications, for both the exemplar and maximum matching methods. We illustrate the application of this methodology on a well-known public benchmark dataset of γ-Proteobacteria.
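
To make the exemplar method concrete, the following is a minimal brute-force sketch in Python. It is not the pseudo-boolean formulation proposed in the paper: it simply enumerates every way of keeping one copy per gene family and keeps the smallest breakpoint count, so it is only usable on toy inputs. Several points are assumptions made for illustration: genomes are lists of signed integers, both genomes contain the same gene families, an adjacency (a, b) is considered conserved if the other genome contains (a, b) or (-b, -a) consecutively, and genome extremities are not counted. The function names `breakpoints`, `exemplarizations`, and `exemplar_breakpoint_distance` are hypothetical.

```python
from itertools import product


def breakpoints(g1, g2):
    """Count adjacencies of g1 that are not conserved in g2.

    Assumes g1 and g2 are signed permutations of the same genes.
    """
    # Store each adjacency of g2 in both reading directions.
    conserved = set()
    for a, b in zip(g2, g2[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))
    return sum((a, b) not in conserved for a, b in zip(g1, g1[1:]))


def exemplarizations(genome):
    """Yield every genome obtained by keeping exactly one copy per gene family."""
    positions = {}
    for index, gene in enumerate(genome):
        positions.setdefault(abs(gene), []).append(index)
    # Pick one position per family, then restore the original gene order.
    for choice in product(*positions.values()):
        yield [genome[i] for i in sorted(choice)]


def exemplar_breakpoint_distance(g1, g2):
    """Minimum breakpoint count over all pairs of exemplarizations (exhaustive)."""
    if {abs(g) for g in g1} != {abs(g) for g in g2}:
        raise ValueError("this sketch assumes both genomes share the same gene families")
    return min(
        breakpoints(e1, e2)
        for e1 in exemplarizations(g1)
        for e2 in exemplarizations(g2)
    )
```

For example, with `g1 = [1, 2, 1, 3]` and `g2 = [2, 1, 3]`, keeping the second copy of gene 1 in `g1` gives `[2, 1, 3]`, which is identical to `g2`, so `exemplar_breakpoint_distance(g1, g2)` returns 0. The number of candidate pairs grows exponentially with the number of duplicated genes, which is consistent with the NP-hardness noted in the abstract and is why an exact solver-based approach is of interest.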

References

Angibaud, S., Fertin, G., Rusu, I., Vialette, S.: How pseudo-boolean programming can help genome rearrangement distance computation. In: Bourque, G., El-Mabrouk, N. (eds.) Comparative Genomics. LNCS (LNBI), vol. 4205, pp. 75–86. Springer, Heidelberg (2006)
Angibaud, S., Fertin, G., Rusu, I., Vialette, S.: A general framework for computing rearrangement distances between genomes with duplicates. Journal of Computational Biology 14(4), 379–393 (2007)
Barth, P.: A Davis-Putnam based enumeration algorithm for linear pseudo-boolean optimization. Technical Report MPI-I-95-2-003, Max Planck Institut Informatik, p.13 (1995)
Blin, G., Chauve, C., Fertin, G.: The breakpoint distance for signed sequences. In: Proc. 1st Algorithms and Computational Methods for Biochemical and Evolutionary Networks (Comp. Bio. Nets.), pp. 3–16. KCL publications (2004)
Blin, G., Chauve, C., Fertin, G.: Genes order and phylogenetic reconstruction: Application to γ-proteobacteria. In: McLysaght, A., Huson, D.H. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3678, pp. 11–20. Springer, Heidelberg (2005)
Blin, G., Rizzi, R.: Conserved intervals distance computation between non-trivial genomes. In: Wang, L. (ed.) COCOON 2005. LNCS, vol. 3595, pp. 22–31. Springer, Heidelberg (2005)
Bourque, G., Yacef, Y., El-Mabrouk, N.: Maximizing synteny blocks to identify ancestral homologs. In: McLysaght, A., Huson, D.H. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3678, pp. 21–35. Springer, Heidelberg (2005)
Bryant, D.: The complexity of calculating exemplar distances. In: Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment, and the Evolution of Gene Families, pp. 207–212. Kluwer Academic Publishers, Dordrecht (2000)
Chai, D., Kuehlmann, A.: A fast pseudo-boolean constraint solver. In: Proc. 40th ACM IEEE Conference on Design Automation, pp. 830–835. ACM Press, New York (2003)
Chauve, C., Fertin, G., Rizzi, R., Vialette, S.: Genomes containing duplicates are hard to compare. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J.J. (eds.) ICCS 2006. LNCS, vol. 3992, pp. 783–790. Springer, Heidelberg (2006)
Chen, X., Zheng, J., Fu, Z., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2(4), 302–315 (2005)
Eén, N., Sörensson, N.: Translating pseudo-boolean constraints into SAT. Journal on Satisfiability, Boolean Modeling and Computation 2, 1–26 (2006)
Lerat, E., Daubin, V., Moran, N.A.: From gene tree to organismal phylogeny in prokaryotes: the case of γ-proteobacteria. PLoS Biology 1(1), 101–109 (2003)
Marron, M., Swenson, K.M., Moret, B.M.E.: Genomic distances under deletions and insertions. Theoretical Computer Science 325(3), 347–360 (2004)
Sankoff, D.: Genome rearrangement with gene families. Bioinformatics 15(11), 909–917 (1999)
Sankoff, D., Haque, L.: Power boosts for cluster tests. In: McLysaght, A., Huson, D.H. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3678, pp. 11–20. Springer, Heidelberg (2005)
Schrijver, A.: Theory of Linear and Integer Programming. John Wiley and Sons, Chichester (1998)
Sheini, H.M., Sakallah, K.A.: Pueblo: A hybrid pseudo-boolean SAT solver. Journal on Satisfiability, Boolean Modeling and Computation 2, 165–189 (2006)

Author information

Authors and Affiliations

Laboratoire d’Informatique de Nantes-Atlantique (LINA), FRE CNRS 2729, Université de Nantes, 2 rue de la Houssinière, 44322 Nantes Cedex 3, France
Sébastien Angibaud, Guillaume Fertin & Irena Rusu
Laboratoire de Recherche en Informatique (LRI), UMR CNRS 8623, Faculté des Sciences d’Orsay - Université Paris-Sud, 91405 Orsay, France
Annelyse Thévenin & Stéphane Vialette

Editor information

Glenn Tesler, Dannie Durand

Cite this paper

Angibaud, S., Fertin, G., Rusu, I., Thévenin, A., Vialette, S. (2007). A Pseudo-boolean Programming Approach for Computing the Breakpoint Distance Between Two Genomes with Duplicate Genes. In: Tesler, G., Durand, D. (eds) Comparative Genomics. RECOMB-CG 2007. Lecture Notes in Computer Science, vol 4751. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74960-8_2

DOI: https://doi.org/10.1007/978-3-540-74960-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74959-2
Online ISBN: 978-3-540-74960-8
eBook Packages: Computer Science, Computer Science (R0)