Making the shortest-paths approach to sum-of-pairs multiple sequence alignment more space efficient in practice

Extended abstract
  • Sandeep K. Gupta
  • John D. Kececioglu
  • Alejandro A. Schäffer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 937)

Abstract

The MSA program, written and distributed in 1989, is one of the few existing programs that attempt to find optimal alignments of multiple protein or DNA sequences. MSA applies a branch-and-bound technique to a variant of Dijkstra's shortest-paths algorithm in order to prune the basic dynamic programming graph. We have made substantial improvements in the time and space usage of MSA. On some runs, we reduce space usage by an order of magnitude and speed up the running time by a significant multiplicative factor. To explain these improvements, we give a much more detailed description of MSA than has previously been available.
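
To make the idea of pruning a shortest-paths search concrete, here is a minimal sketch in Python. It is not the MSA code and does not reproduce its bounds or data structures. It assumes unit mismatch and linear gap costs (the constants `GAP` and `MISMATCH` are illustrative), and it prunes with a simple lower bound built from optimal pairwise suffix alignments. The function names and the `upper_bound` parameter are hypothetical.

```python
import heapq
from itertools import combinations, product

GAP = 1       # assumed cost of aligning a letter with a gap
MISMATCH = 1  # assumed cost of aligning two different letters


def column_cost(col):
    """Sum-of-pairs cost of one alignment column (None marks a gap)."""
    cost = 0
    for x, y in combinations(col, 2):
        if x is None and y is None:
            continue  # a gap against a gap is free
        if x is None or y is None:
            cost += GAP
        elif x != y:
            cost += MISMATCH
    return cost


def suffix_costs(a, b):
    """back[i][j] = optimal pairwise cost of aligning a[i:] with b[j:]."""
    n, m = len(a), len(b)
    back = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(back[i + 1][j] + GAP)
            if j < m:
                options.append(back[i][j + 1] + GAP)
            if i < n and j < m:
                sub = MISMATCH if a[i] != b[j] else 0
                options.append(back[i + 1][j + 1] + sub)
            back[i][j] = min(options)
    return back


def sp_align_cost(seqs, upper_bound=float("inf")):
    """Dijkstra over the alignment lattice; returns the optimal
    sum-of-pairs cost, or None if every path exceeds upper_bound."""
    k = len(seqs)
    pairs = list(combinations(range(k), 2))
    back = {(p, q): suffix_costs(seqs[p], seqs[q]) for p, q in pairs}

    def lower(v):
        # Each induced pairwise alignment costs at least the pairwise optimum.
        return sum(back[p, q][v[p]][v[q]] for p, q in pairs)

    source = (0,) * k
    sink = tuple(len(s) for s in seqs)
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        if v == sink:
            return d
        # Each outgoing edge advances a nonempty subset of the sequences.
        for step in product((0, 1), repeat=k):
            if not any(step):
                continue
            w = tuple(i + s for i, s in zip(v, step))
            if any(wi > len(s) for wi, s in zip(w, seqs)):
                continue
            col = tuple(seqs[p][v[p]] if step[p] else None for p in range(k))
            nd = d + column_cost(col)
            if nd + lower(w) > upper_bound:
                continue  # prune: no path through this edge can beat the bound
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return None
```

For example, `sp_align_cost(["ACGT", "AGT", "ACT"], upper_bound=4)` explores only vertices that can still lie on an alignment of cost at most 4. Vertices cut off by the bound are never stored in `dist` or the priority queue, which is where the space saving in this kind of search comes from. Any upper bound at least as large as the optimal cost leaves the result unchanged.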

Keywords

Multiple Sequence Alignment; Priority Queue; Pairwise Alignment; Outgoing Edge; Optimal Alignment
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Sandeep K. Gupta (1)
  • John D. Kececioglu (2)
  • Alejandro A. Schäffer (3)
  1. Department of Computer Science, Rice University, Houston
  2. Department of Computer Science, The University of Georgia, Athens
  3. Department of Computer Science, Rice University, Houston
