A context dependent method for comparing sequences

  • Xiaoqiu Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 807)


A scoring scheme is presented to measure the similarity between two biological sequences, where matches are weighted according to their context. The scheme generalizes a widely used scoring scheme. A dynamic programming algorithm is developed to compute a largest-scoring alignment of two sequences of lengths m and n in O(mn) time and O(m+n) space. Also developed is an algorithm for computing a largest-scoring local alignment between two sequences in quadratic time and linear space. Both algorithms are implemented as portable C programs. An experiment is conducted to compare protein alignments produced by the new global alignment program with those produced by an existing program.
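
To illustrate what weighting matches by context can mean, here is a minimal Python sketch. It is not the scoring scheme or the algorithm of the paper, whose details are not given in this abstract. The context rule (a match column that directly follows another match column earns an extra `bonus`), the function name `context_score`, and all parameter values are assumptions chosen for illustration. The sketch computes only the score of a best global alignment, keeping two rows of the table at a time, so it runs in O(mn) time and O(n) extra space; recovering the alignment itself in linear space needs additional machinery that is not shown.

```python
NEG_INF = float("-inf")


def context_score(a, b, match=2, bonus=1, mismatch=-1, gap=-2):
    """Score of a best global alignment of a and b under a toy context rule.

    Hypothetical rule (an assumption, not taken from the paper): a match
    column scores `match`, plus `bonus` if the column before it is also a
    match. Mismatch columns score `mismatch`; each gap column scores `gap`.
    """
    n = len(b)
    # m_prev[j]: best score for the prefixes seen so far, ending in a match column.
    # n_prev[j]: best score ending in any other column (or the empty alignment).
    m_prev = [NEG_INF] * (n + 1)
    n_prev = [j * gap for j in range(n + 1)]

    for i in range(1, len(a) + 1):
        m_cur = [NEG_INF] * (n + 1)
        n_cur = [i * gap] + [NEG_INF] * n
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # Match column: the bonus applies only right after a match.
                m_cur[j] = max(m_prev[j - 1] + match + bonus,
                               n_prev[j - 1] + match)
                diag = NEG_INF
            else:
                # Mismatch column: the preceding context does not matter.
                diag = max(m_prev[j - 1], n_prev[j - 1]) + mismatch
            up = max(m_prev[j], n_prev[j]) + gap          # a[i-1] against a gap
            left = max(m_cur[j - 1], n_cur[j - 1]) + gap  # b[j-1] against a gap
            n_cur[j] = max(diag, up, left)
        # Only the current row is needed for the next iteration.
        m_prev, n_prev = m_cur, n_cur

    return max(m_prev[n], n_prev[n])
```

Under these assumed parameters, `context_score("AAC", "AAG")` is 4 while `context_score("ACA", "AGA")` is 3: both pairs have two matches and one mismatch, but only the first has adjacent matches. Setting `bonus=0` removes the context dependence and gives both pairs the same score.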


Linear Space, Dynamic Programming Algorithm, Recursive Call, Biological Sequence, Optimal Alignment
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Xiaoqiu Huang (1)
  1. Department of Computer Science, Michigan Technological University, Houghton
