Abstract
A scoring scheme is presented for measuring the similarity between two biological sequences, in which matches are weighted according to their context. The scheme generalizes a widely used scoring scheme. A dynamic programming algorithm is developed to compute a largest-scoring alignment of two sequences of lengths m and n in O(mn) time and O(m+n) space. Also developed is an algorithm for computing a largest-scoring local alignment between two sequences in quadratic time and linear space. Both algorithms are implemented as portable C programs. An experiment is conducted to compare protein alignments produced by the new global alignment program with those produced by an existing program.
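To make the time and space bounds concrete, the following is a minimal sketch of how a largest alignment score can be computed in O(mn) time while keeping only one row of the dynamic programming matrix. It is not the paper's algorithm: the constants `MATCH`, `MISMATCH`, and `GAP` and the function names are assumptions chosen for illustration, the scoring is a plain context-independent scheme with a per-symbol gap penalty, and only the score is returned. Recovering the alignment itself in linear space requires a divide-and-conquer step, which is omitted here.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scoring parameters, not taken from the paper. */
enum { MATCH = 1, MISMATCH = -1, GAP = -2 };

static int max2(int x, int y) { return x > y ? x : y; }

/* Score of a best global alignment of a and b.
 * Uses a single row of n+1 ints; returns INT_MIN if allocation fails. */
int global_score(const char *a, const char *b)
{
    size_t m = strlen(a), n = strlen(b);
    int *row = malloc((n + 1) * sizeof *row);
    if (row == NULL)
        return INT_MIN;

    /* Row 0: aligning the empty prefix of a against b[0..j) costs j gaps. */
    for (size_t j = 0; j <= n; j++)
        row[j] = (int)j * GAP;

    for (size_t i = 1; i <= m; i++) {
        int diag = row[0];              /* cell (i-1, j-1) */
        row[0] = (int)i * GAP;
        for (size_t j = 1; j <= n; j++) {
            int up = row[j];            /* cell (i-1, j), saved before overwrite */
            int sub = diag + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            row[j] = max2(sub, max2(up + GAP, row[j - 1] + GAP));
            diag = up;
        }
    }

    int best = row[n];
    free(row);
    return best;
}

/* Score of a best local alignment of a and b (0 if nothing scores above 0).
 * Same recurrence, but every cell is floored at 0 and the maximum over
 * all cells is tracked. Returns INT_MIN if allocation fails. */
int local_score(const char *a, const char *b)
{
    size_t m = strlen(a), n = strlen(b);
    int *row = calloc(n + 1, sizeof *row);  /* row 0 is all zeros */
    if (row == NULL)
        return INT_MIN;

    int best = 0;
    for (size_t i = 1; i <= m; i++) {
        int diag = row[0];
        row[0] = 0;
        for (size_t j = 1; j <= n; j++) {
            int up = row[j];
            int sub = diag + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            int v = max2(0, max2(sub, max2(up + GAP, row[j - 1] + GAP)));
            row[j] = v;
            if (v > best)
                best = v;
            diag = up;
        }
    }

    free(row);
    return best;
}
```

With these illustrative parameters, `global_score("ACGT", "AGT")` is 1 (three matches and one gap). A context-dependent scheme would replace the fixed `MATCH` constant with a weight that depends on neighboring aligned positions, which this sketch does not attempt.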
Keywords
- Linear Space
- Dynamic Programming Algorithm
- Recursive Call
- Biological Sequence
- Optimal Alignment
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This research was supported in part by NSF Grant DIR-9106510.
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huang, X. (1994). A context dependent method for comparing sequences. In: Crochemore, M., Gusfield, D. (eds) Combinatorial Pattern Matching. CPM 1994. Lecture Notes in Computer Science, vol 807. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58094-8_5
DOI: https://doi.org/10.1007/3-540-58094-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58094-2
Online ISBN: 978-3-540-48450-9
eBook Packages: Springer Book Archive
