CPM 1997: Combinatorial Pattern Matching, pp 180-190
Aligning coding DNA in the presence of frame-shift errors
Conference paper
Abstract
The problem of aligning two DNA sequences while taking into account that they code for proteins is discussed. Criteria for a good alignment of coding DNA are presented, together with an algorithm that satisfies them. The algorithm is robust against frame-shifts and forgiving towards silent substitutions. The important choice of objective function is examined and several variants are proposed.
Keywords
Generalize Substitution; Silent Mutation; Silent Substitution; Dynamic Programming Matrix; Ambiguity Code
Copyright information
© Springer-Verlag Berlin Heidelberg 1997