A context dependent method for comparing sequences

Part of the Lecture Notes in Computer Science book series (LNCS, volume 807)

Abstract

A scoring scheme is presented to measure the similarity between two biological sequences, in which matches are weighted according to their context. The scheme generalizes a widely used scoring scheme. A dynamic programming algorithm is developed to compute a largest-scoring alignment of two sequences of lengths m and n in O(mn) time and O(m+n) space. An algorithm is also developed for computing a largest-scoring local alignment between two sequences in quadratic time and linear space. Both algorithms are implemented as portable C programs. An experiment compares protein alignments produced by the new global alignment program with those produced by an existing program.
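
To make the idea of context-weighted matches concrete, here is a minimal sketch in Python. It is not the scoring scheme or the algorithm of the paper, whose details are not given in this abstract. It assumes one simple, hypothetical notion of context: a match that directly follows another match earns an extra `bonus`. The function name `context_score` and all parameter values are illustrative assumptions. The sketch computes only the optimal global score, keeping two rows of the table so that memory grows with the length of one sequence; recovering the alignment itself in linear space would additionally need a divide-and-conquer step in the style of Hirschberg, which is omitted here.

```python
NEG_INF = float("-inf")


def context_score(a, b, match=1.0, mismatch=-1.0, gap=-2.0, bonus=0.5):
    """Best global alignment score under an assumed context rule.

    Assumption (not from the paper): a match that directly follows
    another match earns an extra `bonus`.

    Two values are tracked per cell (i, j):
      m: best score for a[:i] vs b[:j] ending in a matched pair
      x: best score ending in anything else (mismatch, gap, or nothing)
    Only the previous and current rows are stored, so memory is O(len(b)).
    """
    n = len(b)
    # Row 0: the empty prefix of `a` against b[:j] is all gaps.
    prev_m = [NEG_INF] * (n + 1)
    prev_x = [j * gap for j in range(n + 1)]

    for i in range(1, len(a) + 1):
        cur_m = [NEG_INF] * (n + 1)
        cur_x = [NEG_INF] * (n + 1)
        cur_x[0] = i * gap  # a[:i] against the empty prefix of `b`
        for j in range(1, n + 1):
            diag_m, diag_x = prev_m[j - 1], prev_x[j - 1]
            if a[i - 1] == b[j - 1]:
                # The bonus applies only when the previous column was a match.
                cur_m[j] = max(diag_m + match + bonus, diag_x + match)
                sub = NEG_INF
            else:
                sub = max(diag_m, diag_x) + mismatch
            up = max(prev_m[j], prev_x[j]) + gap          # gap in b
            left = max(cur_m[j - 1], cur_x[j - 1]) + gap  # gap in a
            cur_x[j] = max(sub, up, left)
        prev_m, prev_x = cur_m, cur_x

    return max(prev_m[n], prev_x[n])


if __name__ == "__main__":
    print(context_score("ACGT", "ACGT"))             # run of matches, with bonus
    print(context_score("ACGT", "ACGT", bonus=0.0))  # context-free scoring
```

Setting `bonus=0.0` reduces the sketch to an ordinary context-free scheme with a linear gap penalty, which mirrors, in a loose way, the abstract's remark that the context-dependent scheme generalizes a widely used one.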

Keywords

  • Linear Space
  • Dynamic Programming Algorithm
  • Recursive Call
  • Biological Sequence
  • Optimal Alignment

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This research was supported in part by NSF Grant DIR-9106510.

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, X. (1994). A context dependent method for comparing sequences. In: Crochemore, M., Gusfield, D. (eds) Combinatorial Pattern Matching. CPM 1994. Lecture Notes in Computer Science, vol 807. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58094-8_5

  • DOI: https://doi.org/10.1007/3-540-58094-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58094-2

  • Online ISBN: 978-3-540-48450-9

  • eBook Packages: Springer Book Archive