
Approximation algorithms for multiple sequence alignment

  • Vineet Bafna
  • Eugene L. Lawler
  • Pavel A. Pevzner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 807)

Abstract

We consider the problem of aligning k sequences of length n. The cost function is the sum of pairs and satisfies the triangle inequality. Earlier approximation algorithms for this problem are due to Gusfield, 1991, who achieved an approximation ratio of 2 − 2/k, and Pevzner, 1992, who improved it to 2 − 3/k. We generalize this approach, assembling an alignment of k sequences from optimally aligned subsets of l < k sequences, to obtain an improved performance guarantee. For arbitrary l < k, we devise deterministic and randomized algorithms yielding performance guarantees of 2 − l/k. For fixed l, the running times of these algorithms are polynomial in n and k.
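
As a rough illustration of the quantities named in the abstract, the sketch below scores a given multiple alignment under a sum-of-pairs cost and evaluates the stated guarantee 2 − l/k. It is not the authors' algorithm: the gap symbol, the unit mismatch and indel costs (which form a metric on symbols, so the triangle inequality holds), and the function names `pair_cost`, `sum_of_pairs`, and `guarantee` are all assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

GAP = "-"  # assumed gap symbol; the abstract does not fix a notation


def pair_cost(row_a, row_b, mismatch=1, indel=1):
    """Cost of the pairwise alignment induced by two rows of a multiple alignment.

    Unit costs are an assumption chosen for illustration.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows of an alignment must have equal length")
    cost = 0
    for x, y in zip(row_a, row_b):
        if x == GAP and y == GAP:
            continue  # a gap-gap column vanishes in the induced pairwise alignment
        if x == GAP or y == GAP:
            cost += indel
        elif x != y:
            cost += mismatch
    return cost


def sum_of_pairs(rows, mismatch=1, indel=1):
    """Sum-of-pairs cost: total induced pairwise cost over all unordered row pairs."""
    return sum(pair_cost(a, b, mismatch, indel) for a, b in combinations(rows, 2))


def guarantee(l, k):
    """Performance guarantee 2 - l/k quoted in the abstract, as an exact fraction."""
    if not 2 <= l < k:
        raise ValueError("expected 2 <= l < k")
    return 2 - Fraction(l, k)
```

For example, `sum_of_pairs(["AC-T", "A-GT", "ACGT"])` evaluates to 4 under these unit costs. With k = 4, `guarantee(2, 4)` gives 3/2, matching the 2 − 2/k ratio attributed to Gusfield, and `guarantee(3, 4)` gives 5/4, matching the 2 − 3/k ratio attributed to Pevzner.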

Keywords

Multiple Sequence Alignment; Pairwise Alignment; Performance Guarantee; Center Vertex; Computer Science Division
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. [AL89]
    Altschul S.F., Lipman D.J., Trees, stars, and multiple biological sequence alignment. SIAM J. Appl. Math., 49, (1989), pp. 197–209.
  2. [B75]
    Baranyai, Z., On the factorization of the complete uniform hypergraph, Infinite and Finite Sets, A. Hajnal, T. Rado, V. T. Sós, eds., North-Holland, Amsterdam, (1975), pp. 91–108.
  3. [B90]
    Bósak, J., Decompositions of Graphs, Kluwer Academic Publishers, (1990).
  4. [CW79]
    Carter J.L., Wegman M.N., Universal classes of hash functions, Journal of Computer and System Sciences, 18(1979), pp. 143–154.
  5. [CWC92]
    Chan S.C., Wong A.K.C., Chiu D.K.Y., A survey of multiple sequence comparison methods, Bull. Math. Biol., 54(1992), pp. 563–598.
  6. [FD87]
    Feng D., Doolittle R., Progressive sequence alignment as a prerequisite to correct phylogenetic trees, Journal of Molec. Evol., 25(1987), pp. 351–360.
  7. [G91]
    Gusfield, D., Efficient methods for multiple sequence alignment with guaranteed error bounds. Tech. Report, Computer Science Division, University of California, Davis, CSE-91-4, (1991).
  8. [G93]
    Gusfield, D., Efficient methods for multiple sequence alignment with guaranteed error bounds, Bulletin of Mathematical Biology, 55(1993), pp. 141–154.
  9. [K93]
    Kececioglu J., The maximum weight trace alignment problem in multiple sequence alignment, eds. A. Apostolico, M. Crochemore, Z. Galil, U. Manber, Combinatorial Pattern Matching 93, Padova, Italy, June 1993, LNCS 684, pp. 106–119.
  10. [LAK89]
    Lipman D.J., Altschul S.F., Kececioglu J.D., A tool for multiple sequence alignment, Proc. Natl. Acad. Sci. USA, 86(1989), pp. 4412–4415.
  11. [L73]
    Lorimer, P., Finite Projective Planes and Sharply 2-transitive Subsets of Finite Groups, Proc. Second Internat. Conf. Theory of Groups, Canberra, (1973), pp. 432–436.
  12. [P92]
    Pevzner, P., Multiple Alignment, Communication Cost, and Graph Matching, SIAM J. Applied Math., 52, (1992), pp. 1763–1779.
  13. [S75]
    Sankoff D., Minimum mutation tree of sequences, SIAM J. Appl. Math., 28, (1975), pp. 35–42.
  14. [S85]
    Sankoff D., Simultaneous solution of the RNA folding, alignment and protosequence problems, SIAM J. Appl. Math., 45 (1985), pp. 810–825.
  15. [SS90]
    Schmidt J., Siegel A., The analysis of closed hashing under limited randomness, Proceedings of the 22nd ACM Symposium on Theory of Computing, (1990), pp. 224–234.
  16. [WJ93]
    Wang L., Jiang, T., On the Complexity of Multiple Sequence Alignment, 1993, J. of Comp. Biol. (to appear).
  17. [WSB76]
    Waterman M.S., Smith T.F., Beyer W.A., Some biological sequence metrics. Adv. in Math., 20(1976), pp. 367–387.

Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Vineet Bafna (1)
  • Eugene L. Lawler (2)
  • Pavel A. Pevzner (1)

  1. Department of CSE, The Pennsylvania State University, University Park
  2. Computer Science Division, University of California, Berkeley
