Skip to main content
Log in

Sequence comparison with concave weighting functions

  • Published:
Bulletin of Mathematical Biology Aims and scope Submit manuscript

Abstract

We consider efficient methods for computing a difference metric between two sequences of symbols, where the cost of an operation to insert or delete a block of symbols is a concave function of the block's length. Alternatively, sequences can be optimally aligned when gap penalties are a concave function of the gap length. Two algorithms based on the ‘candidate list paradigm’ first used by Waterman (1984) are presented. The first computes significantly more parsimonious candidate lists than Waterman's method. The second method refines the first to the point of guaranteeingO(N 2 lgN) worst-case time complexity, and under certain conditionsO(N 2). Experimental data show how various properties of the comparison problem affect the methods' relative performance. A number of extensions are discussed, among them a technique for constructing optimal alignments inO(N) space in expectation. This variation gives a practical method for comparing long amino sequences on a small computer.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Literature

  • Fickett, J. W. 1984. “Fast optimal alignment.”Nucleic Acids Res. 12, 175–179.

    Google Scholar 

  • Fitch, W. M. and T. F. Smith. 1983. “Optimal sequence alignments.”Proc. natn. Acad. Sci. U.S.A. 80, 1382–1386.

    Article  Google Scholar 

  • Fredman, M. L. 1984. “Algorithms for computing evolutionary similarity measures with length independent gap penalties.”Bull math. Biol. 46, 553–566.

    Article  MATH  MathSciNet  Google Scholar 

  • Gotoh, O. 1982. “An improved algorithm for matching biological sequences.”J. molec. Biol. 162, 705–708.

    Article  Google Scholar 

  • — 1986. “Alignment of three biological sequences with an efficient traceback procedure.”J. theor. Biol. 121, 327–337.

    Article  MathSciNet  Google Scholar 

  • Hirschberg, D. S. 1975. “A linear space algorithm for computing maximal common subsequences.”Communications of ACM 18, 341–343.

    Article  MATH  MathSciNet  Google Scholar 

  • — and L. L. Larmore. 1987. “The least weight subsequence problem.”SIAM J. on Computing 16, 628–638.

    Article  MATH  MathSciNet  Google Scholar 

  • Miller, W. and E. W. Myers. 1986. “A simple row-replacement method.” TR 86-28 Dept. of Computer Science, University of Arizona, Tucson, AZ 85721 (submitted toSoftware—Practice and Experience).

    Google Scholar 

  • Myers, E. W. and W. Miller. 1988. “Row replacement algorithms for screen editors.”Trans. on Prog. Lang. and Systems (in press).

  • Needleman, S. B. and C. D. Wunsch. 1970. “A general method applicable to the search for similarities in the amino acid sequences of two proteins.”J. molec. Biol. 48, 443–453.

    Article  Google Scholar 

  • Sankoff, D. and J. B. Kruskal. 1983.Time Warps, Strings Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Reading, MA: Addison-Wesley.

    Google Scholar 

  • Smith, T. F., M. S. Waterman. 1981. “Identification of common molecular subsequences.”J. molec. Biol. 147, 195–197.

    Article  Google Scholar 

  • —— and W. M. Fitch. 1981. “Comparative biosequence metrics.”J. molec. Evol. 18, 38–46.

    Article  Google Scholar 

  • Ukkonen, E. 1985. “Algorithms for approximate string matching.”Information and Control 64, 100–118.

    Article  MATH  MathSciNet  Google Scholar 

  • Waterman, M. S. 1984. “Efficient sequence alignment algorithms.”J. theor. Biol. 108, 333–337.

    MathSciNet  Google Scholar 

  • -Waterman, M. S. and T. F. Smith. “Rapid dynamic programming algorithms for RNA secondary structure.”Adv. Appl. Math. 7, 455–464.

  • —— and W. A. Beyer. 1976. “Some biological sequence metrics.”Adv. Math. 20, 367–387.

    Article  MATH  MathSciNet  Google Scholar 

  • Zuker, M. and D. Sankoff. 1984. “RNA structures and their prediction.”Bull. math. Biol. 44, 591–621.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Additional information

This work was supported in part by NSF Grant DCR-8511455.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Miller, W., Myers, E.W. Sequence comparison with concave weighting functions. Bltn Mathcal Biology 50, 97–120 (1988). https://doi.org/10.1007/BF02459948

Download citation

  • Received:

  • Revised:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF02459948

Keywords

Navigation