Skip to main content

LASA: A Tool for Non-heuristic Alignment of Multiple Sequences

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 13))

Abstract

We have developed a non-heuristic tool (LASA) for the multiple sequence alignment problem (MSA), one of the most important problems in computational molecular biology. It is based on a dynamic programming algorithm for solving a Lagrangian relaxation of an integer linear programming (ILP) formulation for MSA. The objective function that is optimized by LASA models the sum-of-pairs scoring scheme and “truly” affine gap costs. Due to a reformulation w.r.t. additionally introduced variables prior to relaxation we improve the convergence rate dramatically while at the same time being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation LASA outperforms all exact algorithms for the multiple sequence alignment problem. Furthermore, the quality of the alignments ranks among the best computed so far.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   89.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Althaus, E., Canzar, S.: A lagrangian relaxation approach for the multiple sequence alignment problem. In: Dress, A.W.M., Xu, Y., Zhu, B. (eds.) COCOA. LNCS, vol. 4616. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  2. Althaus, E., Caprara, A., Lenhof, H.-P., Reinert, K.: Aligning multiple sequences by cutting planes. Mathematical Programming 105, 387–425 (2006)

    Article  MATH  MathSciNet  Google Scholar 

  3. Beasley, J.: Lagrangian Relaxation. In: Modern heuristic techniques for combinatorial problems. Blackwell Scientific Publications (1993)

    Google Scholar 

  4. Caprara, A., Fischetti, M., Toth, P.: A heuristic method for the set cover problem. Operations Research 47, 730–743 (1999)

    MATH  MathSciNet  Google Scholar 

  5. Edgar, R.C.: Muscle: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32(5), 1792–1797 (2004)

    Article  Google Scholar 

  6. Elias, I.: Settling the intractability of multiple alignment. In: Ibaraki, T., Katoh, N., Ono, H. (eds.) ISAAC 2003. LNCS, vol. 2906, pp. 352–363. Springer, Heidelberg (2003)

    Google Scholar 

  7. Eppstein, D.: Sequence comparison with mixed convex and concave costs. Journal of Algorithms (11), 85–101 (1990)

    Article  MATH  MathSciNet  Google Scholar 

  8. Gupta, S., Kececioglu, J., Schaeffer, A.: Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. J. Comput. Biol. 2, 459–472 (1995)

    Article  Google Scholar 

  9. Gusfield, D.: Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press, Cambridge (1997)

    MATH  Google Scholar 

  10. Held, M., Karp, R.: The traveling salesman problem and minimum spanning trees: part ii. Mathematical Programming 1, 6–25 (1971)

    Article  MATH  MathSciNet  Google Scholar 

  11. Katoh, K., Ichi Kuma, K., Toh, H., Miyata, T.: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33, 511 (2005)

    Article  Google Scholar 

  12. Lee, C., Grasso, C., Sharlow, M.F.: Multiple sequence alignment using partial order graphs. Bioinformatics 18(3), 452–464 (2002)

    Article  Google Scholar 

  13. Lipman, D., Altschul, S., Kececioglu, J.: A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. U.S.A. 86, 4412–4415 (1989)

    Article  Google Scholar 

  14. Lucena, A.: Steiner problem in graphs: Lagrangean relaxation and cutting-planes. COAL Bulletin 21, 2–7 (1993)

    Google Scholar 

  15. Mehlhorn, K., Näher, S.: The LEDA Platform of Combinatorial and Geometric Computing. Cambridge University Press, Cambridge (1999)

    Google Scholar 

  16. Morgenstern, B.: DIALIGN: multiple DNA and protein sequence alignment at BiBiServ. Nucl. Acids Res. 32(2), 33–36 (2004)

    Article  Google Scholar 

  17. Notredame, C., Higgins, D.G., Heringa, J.: T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302(1), 205–217 (2000)

    Article  Google Scholar 

  18. Reinert, K.: A Polyhedral Approach to Sequence Alignment Problems. PhD thesis, Universität des Saarlandes (1999)

    Google Scholar 

  19. Stoye, J., Evers, D., Meyer, F.: Rose: generating sequence families (1998)

    Google Scholar 

  20. Thompson, J.D., Higgins, D.G., Gibson, T.J.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22(22), 4673–4680 (1994)

    Article  Google Scholar 

  21. Thompson, J.D., Plewniak, F., Poch, O.: BAliBASE: A benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15(1), 87–88 (1999)

    Article  Google Scholar 

  22. Walle, I.V., Lasters, I., Wyns, L.: SABmark - a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21, 1267–1268 (2005)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Mourad Elloumi Josef Küng Michal Linial Robert F. Murphy Kristan Schneider Cristian Toma

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Althaus, E., Canzar, S. (2008). LASA: A Tool for Non-heuristic Alignment of Multiple Sequences. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_38

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-70600-7_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70598-7

  • Online ISBN: 978-3-540-70600-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics