# Optimal sequence alignment using affine gap costs


## Abstract

When comparing two biological sequences, it is often desirable for a gap to be assigned a cost not directly proportional to its length. If affine gap costs are employed, in other words if opening a gap costs *v* and each null in the gap costs *u*, the algorithm of Gotoh (1982, *J. molec. Biol.* **162**, 705) finds the minimum cost of aligning two sequences in order *MN* steps. Gotoh's algorithm attempts to find only one from among possibly many optimal (minimum-cost) alignments, but does not always succeed. This paper provides an example for which this part of Gotoh's algorithm fails and describes an algorithm that finds all and only the optimal alignments. This modification of Gotoh's algorithm still requires order *MN* steps. A more precise form of path graph than previously used is needed to represent accurately all optimal alignments for affine gap costs.
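
As a rough illustration of the cost model described above, the following sketch computes the minimum alignment cost when a gap of *k* nulls costs *v* + *uk*, using on the order of *MN* steps. It is not taken from the paper: the function name `affine_alignment_cost`, the default values of `u` and `v`, and the simple substitution cost (0 for a match, `mismatch` otherwise) are assumptions made for the example. It returns only the cost and does not attempt the traceback step that the abstract identifies as the part of Gotoh's algorithm that can fail.

```python
def affine_alignment_cost(a, b, u=1.0, v=2.0, mismatch=1.0):
    """Minimum cost of aligning a with b when a gap of k nulls costs v + u*k.

    Hypothetical helper for illustration; the substitution cost
    (0 for a match, `mismatch` otherwise) is an assumption.
    """
    INF = float("inf")
    m, n = len(a), len(b)

    # C[j]: best cost of aligning a[:i] with b[:j] (one row kept at a time).
    # D[j]: best cost when that alignment ends with a[i-1] opposite a null.
    C = [0.0] * (n + 1)
    D = [INF] * (n + 1)
    for j in range(1, n + 1):
        C[j] = v + u * j  # row 0: b[:j] lies opposite a single gap

    for i in range(1, m + 1):
        diag = C[0]        # value of C[i-1][0]
        C[0] = v + u * i   # column 0: a[:i] lies opposite a single gap
        I = INF            # best cost ending with b[j-1] opposite a null
        for j in range(1, n + 1):
            # Extend an existing gap (cost u) or open a new one (cost v + u).
            D[j] = min(D[j], C[j] + v) + u    # C[j] still holds row i-1
            I = min(I, C[j - 1] + v) + u      # C[j-1] already holds row i
            sub = diag + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            diag = C[j]
            C[j] = min(sub, D[j], I)
    return C[n]
```

With the default costs, `affine_alignment_cost("ACGT", "AT")` charges one gap of two nulls (*v* + 2*u*) rather than two separate gaps (2*v* + 2*u*), which is the behavior affine gap costs are meant to capture.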

## Keywords

Optimal Path, Vertical Edge, Optimal Alignment, Horizontal Edge, Cost Assignment

## Literature

- Altschul, S. F. and B. W. Erickson. 1986. “A Nonlinear Measure of Subalignment Similarity and its Significance Levels.” *Bull. math. Biol.* **48**, 617–632.
- Erickson, B. W. and P. H. Sellers. 1983. “Recognition of Patterns in Genetic Sequences.” In *Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison*, D. Sankoff and J. B. Kruskal (Eds), pp. 55–91. Reading, MA: Addison-Wesley.
- Fitch, W. M. and T. F. Smith. 1983. “Optimal Sequence Alignments.” *Proc. natn. Acad. Sci. U.S.A.* **80**, 1382–1386.
- Gotoh, O. 1982. “An Improved Algorithm for Matching Biological Sequences.” *J. molec. Biol.* **162**, 705–708.
- Needleman, S. B. and C. D. Wunsch. 1970. “A General Method Applicable to the Search for Similarities in the Amino Acid Sequences of Two Proteins.” *J. molec. Biol.* **48**, 443–453.
- Schwartz, R. M. and M. O. Dayhoff. 1978. “Matrices for Detecting Distant Relationships.” In *Atlas of Protein Sequence and Structure*, Vol. 5, Suppl. 3, M. O. Dayhoff (Ed.), pp. 345–358. Washington, DC: National Biomedical Research Foundation.
- Sellers, P. H. 1974. “On the Theory and Computation of Evolutionary Distances.” *SIAM J. appl. Math.* **26**, 787–793.
- Smith, T. F., M. S. Waterman and W. M. Fitch. 1981. “Comparative Biosequence Metrics.” *J. molec. Evol.* **18**, 38–46.
- Taniguchi, T., H. Matsui, T. Fujita, C. Takaoka, N. Kashima, R. Yoshimoto and J. Hamuro. 1983. “Structure and Expression of a Cloned cDNA for Human Interleukin-2.” *Nature* **302**, 305–310.
- Taylor, P. 1984. “A Fast Homology Program for Aligning Biological Sequences.” *Nucl. Acids Res.* **12**, 447–455.
- Waterman, M. S. 1984. “Efficient Sequence Alignment Algorithms.” *J. theor. Biol.* **108**, 333–337.
- Waterman, M. S., T. F. Smith and W. A. Beyer. 1976. “Some Biological Sequence Metrics.” *Adv. Math.* **20**, 367–387.
- Ukkonen, E. 1983. “On Approximate String Matching.” *Proc. Int. Conference on the Foundations of Computer Theory, Lecture Notes in Computer Science*, Vol. 158, pp. 487–496. Berlin: Springer-Verlag.
- Yokota, T., N. Arai, F. Lee, D. Rennick, T. Mosmann and K. Arai. 1985. “Use of a cDNA Expression Vector for Isolation of Mouse Interleukin 2 cDNA Clones: Expression of T-Cell Growth Factor Activity After Transfection of Monkey Cells.” *Proc. natn. Acad. Sci. U.S.A.* **82**, 68–72.