A Lagrangian relaxation approach for the multiple sequence alignment problem
We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is obtained from a Lagrangian relaxation of an integer linear programming formulation of MSA. When certain inequalities are dualized, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by dynamic programming. By reformulating the problem with respect to additionally introduced variables before relaxing it, we improve the convergence rate dramatically while still being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem. Furthermore, the quality of the alignments is among the best computed so far.
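To illustrate the kind of dynamic program that solves a pairwise alignment problem, the following is a minimal sketch of the classical Needleman-Wunsch recurrence for a global alignment score. It is not the authors' implementation: the function name `pairwise_alignment_score`, the linear gap penalty, and the default scores are assumptions chosen for illustration, and the Lagrangian subproblem described above would use weights adjusted by the multipliers rather than these fixed values.

```python
def pairwise_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b.

    Illustrative scoring only (assumed, not from the paper):
    fixed match/mismatch scores and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # a[i-1] against a gap
                score[i][j - 1] + gap,               # b[j-1] against a gap
            )
    return score[n][m]
```

The table has (n + 1)(m + 1) entries and each is filled in constant time, so the sketch runs in time proportional to the product of the two sequence lengths. This is what makes a pairwise subproblem cheap to solve repeatedly inside an iterative bounding scheme.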
Keywords: Sequence comparison, Lagrangian relaxation, Branch and bound
Open Access: This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.