A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4616)

Abstract

We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. When certain inequalities are dualized, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by dynamic programming. Reformulating the problem with additionally introduced variables prior to relaxation improves the convergence rate dramatically while still allowing the Lagrangian problem to be solved efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem.
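
The dynamic programming step mentioned in the abstract can be illustrated with a minimal sketch of pairwise global alignment. This is not the authors' implementation: it assumes a linear gap cost and a maximization objective, and the names `pairwise_alignment_score`, `pair_weight`, and `gap` are hypothetical. The `pair_weight` callable stands in for position-dependent weights, such as those that Lagrangian multipliers might induce in a relaxed problem.

```python
def pairwise_alignment_score(a, b, pair_weight=None, gap=-2):
    """Return the maximum global alignment score of sequences a and b.

    pair_weight(i, j) gives the score for aligning a[i] with b[j]
    (hypothetical hook; defaults to +1 for a match, -1 for a mismatch).
    gap is the score added for each gap character (linear gap cost,
    an assumption made to keep the sketch short).
    """
    if pair_weight is None:
        def pair_weight(i, j):
            return 1 if a[i] == b[j] else -1

    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(
                prev[j - 1] + pair_weight(i - 1, j - 1),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # a[i-1] against a gap
                cur[j - 1] + gap,                         # b[j-1] against a gap
            )
        prev = cur
    return prev[len(b)]
```

The table is filled row by row, so the sketch runs in time proportional to the product of the two sequence lengths and keeps only two rows in memory. Passing a different `pair_weight` changes the objective without touching the recurrence.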

Keywords

Multiple Sequence Alignment; Integer Linear Programming; Target Node; Lagrangian Relaxation; Pairwise Alignment
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  1. Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
  2. Université Henri Poincaré, LORIA, B.P. 239, 54506 Vandœuvre-lès-Nancy, France
