Abstract
We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. After certain inequalities are dualized, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by dynamic programming. By reformulating the problem with respect to additionally introduced variables prior to relaxation, we improve the convergence rate dramatically while still being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem.
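To make the "pairwise alignment by dynamic programming" step concrete, here is a minimal sketch of a textbook global pairwise alignment recurrence. It is not the authors' implementation: the function names, the linear gap penalty, and the `weight(i, j)` callback are assumptions introduced for illustration only. In a Lagrangian setting one could imagine the callback returning weights already adjusted by the current multipliers, but how the paper does this is not stated in the abstract.

```python
from typing import Callable


def pairwise_alignment_score(
    a: str,
    b: str,
    weight: Callable[[int, int], float],
    gap: float = -1.0,
) -> float:
    """Return the maximum total weight of a global alignment of a and b.

    weight(i, j) is the (hypothetical) profit of aligning a[i] with b[j];
    gap is a linear per-character gap penalty (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # dp[i][j] = best score for aligning the prefixes a[:i] and b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + weight(i - 1, j - 1),  # align a[i-1] with b[j-1]
                dp[i - 1][j] + gap,                       # a[i-1] against a gap
                dp[i][j - 1] + gap,                       # b[j-1] against a gap
            )
    return dp[n][m]


def match_weight(a: str, b: str, match: float = 1.0, mismatch: float = -1.0):
    """Build a simple match/mismatch weight callback (illustrative only)."""
    return lambda i, j: match if a[i] == b[j] else mismatch


if __name__ == "__main__":
    s, t = "ACGT", "AGT"
    print(pairwise_alignment_score(s, t, match_weight(s, t)))
```

The recurrence fills an (n+1) by (m+1) table, so it runs in time proportional to the product of the two sequence lengths. Passing the weights as a callback keeps the recurrence unchanged when the weights themselves change between iterations.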
Supported by the German Academic Exchange Service (DAAD).
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Althaus, E., Canzar, S. (2007). A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem. In: Dress, A., Xu, Y., Zhu, B. (eds) Combinatorial Optimization and Applications. COCOA 2007. Lecture Notes in Computer Science, vol 4616. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73556-4_29
DOI: https://doi.org/10.1007/978-3-540-73556-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73555-7
Online ISBN: 978-3-540-73556-4
eBook Packages: Computer Science, Computer Science (R0)