A local algorithm for DNA sequence alignment with inversions

Bulletin of Mathematical Biology

Abstract

A dynamic programming algorithm to find all optimal alignments of DNA subsequences is described. The alignments use not only substitutions, insertions and deletions of nucleotides but also inversions (reversed complements) of substrings of the sequences. The inversion alignments themselves contain substitutions, insertions and deletions of nucleotides. We study the problem of alignment with non-intersecting inversions. To provide a computationally efficient algorithm we restrict candidate inversions to the K highest scoring inversions. An algorithm to find the J best non-intersecting alignments with inversions is also described. The new algorithm is applied to the regions of mitochondrial DNA of Drosophila yakuba and mouse coding for URF6 and cytochrome b, and the inversion of the URF6 gene is found. The open problem of intersecting inversions is discussed.
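The core idea of scoring an inversion as a local alignment against a reversed complement can be illustrated with a minimal Python sketch. This is not the algorithm of the paper: it scores only a single candidate inversion, it does not select the K highest scoring inversions or combine them into non-intersecting alignments, and the function names, the scoring values (match, mismatch, gap) and the `inversion_penalty` parameter are all assumptions made for illustration.

```python
# Illustrative sketch only; names and scoring values are assumptions,
# not taken from the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reversed complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman recurrence
    with a linear gap penalty), keeping only two rows of the matrix."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                # start a new local alignment
                prev[j - 1] + s,  # substitution or match
                prev[j] + gap,    # deletion
                curr[j - 1] + gap,  # insertion
            )
            best = max(best, curr[j])
        prev = curr
    return best


def best_inversion_score(a, b, inversion_penalty=2, **scoring):
    """Score a against the reversed complement of b, charging a
    hypothetical fixed penalty for using an inversion."""
    return local_score(a, reverse_complement(b), **scoring) - inversion_penalty


if __name__ == "__main__":
    a = "GATTACA"
    b = reverse_complement(a)  # b carries a as an inverted segment
    print(local_score(a, b))           # direct alignment misses the inversion
    print(best_inversion_score(a, b))  # alignment against the reversed complement
```

In this toy case the direct local alignment cannot reach the score of a perfect match, while the alignment against the reversed complement recovers the full segment minus the assumed penalty.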

Cite this article

Schöniger, M., Waterman, M.S. A local algorithm for DNA sequence alignment with inversions. Bulletin of Mathematical Biology 54, 521–536 (1992). https://doi.org/10.1007/BF02459633

  • DOI: https://doi.org/10.1007/BF02459633
