# Approximation algorithms for multiple sequence alignment

Conference paper

## Abstract

We consider the problem of aligning *k* sequences of length *n*. The cost function is the sum-of-pairs cost and satisfies the triangle inequality. Earlier approximation algorithms for this problem are due to Gusfield (1991), who achieved an approximation ratio of 2 − 2/*k*, and Pevzner (1992), who improved it to 2 − 3/*k*. We generalize this approach, assembling an alignment of *k* sequences from optimally aligned subsets of *l* < *k* sequences, to obtain an improved performance guarantee. For arbitrary *l* < *k*, we devise deterministic and randomized algorithms yielding performance guarantees of 2 − *l*/*k*. For fixed *l*, the running times of these algorithms are polynomial in *n* and *k*.
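To make the objective and the stated guarantees concrete, here is a minimal Python sketch. It is not taken from the paper: the unit column cost (0 for identical symbols, 1 otherwise, with the gap treated as an ordinary symbol) is an assumed example of a cost that satisfies the triangle inequality, and the names `sum_of_pairs` and `guarantee` are hypothetical. The sketch scores a given alignment under the sum-of-pairs cost and evaluates the ratio 2 − *l*/*k* exactly.

```python
from fractions import Fraction
from itertools import combinations

GAP = "-"  # gap symbol; treated as one more letter of the alphabet


def column_cost(a: str, b: str) -> int:
    """Assumed unit cost: 0 if the two symbols agree, 1 otherwise.

    This is the discrete metric on the alphabet plus GAP, so it
    satisfies the triangle inequality required by the abstract.
    """
    return 0 if a == b else 1


def pair_cost(row_a: str, row_b: str) -> int:
    """Cost of the pairwise alignment induced by two rows."""
    return sum(column_cost(a, b) for a, b in zip(row_a, row_b))


def sum_of_pairs(alignment: list[str]) -> int:
    """Sum-of-pairs cost: total induced cost over all pairs of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    return sum(pair_cost(x, y) for x, y in combinations(alignment, 2))


def guarantee(l: int, k: int) -> Fraction:
    """Performance guarantee 2 - l/k stated in the abstract.

    The lower bound l >= 2 is an assumption of this sketch; l = 2 and
    l = 3 give the ratios attributed to Gusfield and Pevzner.
    """
    if not 2 <= l < k:
        raise ValueError("need 2 <= l < k")
    return 2 - Fraction(l, k)


alignment = ["AC-T", "ACGT", "A-GT"]
print(sum_of_pairs(alignment))  # 4
print(guarantee(2, 4))          # 3/2
print(guarantee(3, 4))          # 5/4
```

With *k* = 4 sequences, moving from *l* = 2 to *l* = 3 tightens the bound from 3/2 to 5/4, which illustrates how larger optimally aligned subsets improve the guarantee.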

## Keywords

Multiple Sequence Alignment; Pairwise Alignment; Performance Guarantee; Center Vertex; Computer Science Division

## References

- [AL89] Altschul S.F., Lipman D.J., *Trees, stars, and multiple biological sequence alignment*, SIAM J. Appl. Math., **49** (1989), pp. 197–209.
- [B75] Baranyai, Z., *On the factorization of the complete uniform hypergraph*, Infinite and Finite Sets, A. Hajnal, T. Rado, V. T. Sós, eds., North-Holland, Amsterdam, (1975), pp. 91–108.
- [B90] Bósak, J., *Decompositions of Graphs*, Kluwer Academic Publishers, (1990).
- [CW79] Carter J.L., Wegman M.N., *Universal classes of hash functions*, Journal of Computer and System Sciences, **18** (1979), pp. 143–154.
- [CWC92] Chan S.C., Wong A.K.C., Chiu D.K.Y., *A survey of multiple sequence comparison methods*, Bull. Math. Biol., **54** (1992), pp. 563–598.
- [FD87] Feng D., Doolittle R., *Progressive sequence alignment as a prerequisite to correct phylogenetic trees*, Journal of Molec. Evol., **25** (1987), pp. 351–360.
- [G91] Gusfield, D., *Efficient methods for multiple sequence alignment with guaranteed error bounds*, Tech. Report CSE-91-4, Computer Science Division, University of California, Davis, (1991).
- [G93] Gusfield, D., *Efficient methods for multiple sequence alignment with guaranteed error bounds*, Bulletin of Mathematical Biology, **55** (1993), pp. 141–154.
- [K93] Kececioglu J., *The maximum weight trace alignment problem in multiple sequence alignment*, eds. A. Apostolico, M. Crochemore, Z. Galil, U. Manber, Combinatorial Pattern Matching 93, Padova, Italy, June 1993, LNCS 684, pp. 106–119.
- [LAK89] Lipman D.J., Altschul S.F., Kececioglu J.D., *A tool for multiple sequence alignment*, Proc. Natl. Acad. Sci. USA, **86** (1989), pp. 4412–4415.
- [L73] Lorimer, P., *Finite Projective Planes and Sharply 2-transitive Subsets of Finite Groups*, Proc. Second Internat. Conf. Theory of Groups, Canberra, (1973), pp. 432–436.
- [P92] Pevzner, P., *Multiple Alignment, Communication Cost, and Graph Matching*, SIAM J. Applied Math., **52** (1992), pp. 1763–1779.
- [S75] Sankoff D., *Minimum mutation tree of sequences*, SIAM J. Appl. Math., **28** (1975), pp. 35–42.
- [S85] Sankoff D., *Simultaneous solution of the RNA folding, alignment and protosequence problems*, SIAM J. Appl. Math., **45** (1985), pp. 810–825.
- [SS90] Schmidt J., Siegel A., *The analysis of closed hashing under limited randomness*, Proceedings of the 22nd ACM Symposium on Theory of Computing, (1990), pp. 224–234.
- [WJ93] Wang L., Jiang, T., *On the Complexity of Multiple Sequence Alignment*, J. of Comp. Biol., (1993), to appear.
- [WSB76] Waterman M.S., Smith T.F., Beyer W.A., *Some biological sequence metrics*, Adv. in Math., **20** (1976), pp. 367–387.

## Copyright information

© Springer-Verlag Berlin Heidelberg 1994