Identifying periodic occurrences of a template with applications to protein structure

  • Vincent A. Fischetti
  • Gad M. Landau
  • Jeanette P. Schmidt
  • Peter H. Sellers
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 644)

Abstract

We consider a string matching problem in which the pattern is a template that matches many different strings with varying degrees of perfection. The quality of a match is given by a penalty matrix that assigns each pair of characters a score characterizing how well the two characters match. Superfluous characters may occur in the text or in the pattern, and the penalties for the resulting gaps in the alignment are also given by the penalty matrix. For a text T of length n and a template P of length m, we wish to find the best alignment of T with P^n, the concatenation of n copies of P (m will typically be much smaller than n). Such an alignment can be obtained by ignoring the periodic character of P^n and solving a dynamic programming problem of size O(n^2 m), since P^n has length nm. We show that the structure of P^n can be exploited and the problem reduced to essentially solving a dynamic programming problem of size O(mn). If the complexity of computing gap penalties is O(1), which is frequently the case, our algorithm runs in O(mn) time. The problem was motivated by a protein structure problem.
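
The abstract gives no pseudocode, so the following is only a minimal sketch of how a table with n rows and m columns can stand in for the much larger table against all n copies of P. It is not presented as the authors' algorithm. It assumes constant gap penalties, and it scores T against the best prefix of P repeated without bound rather than against exactly n copies. The names wraparound_cost and unit_cost and the parameters gap_text and gap_pat are hypothetical. Column j holds the best cost over all consumed prefix lengths congruent to j modulo m, and the cyclic dependency inside a row is resolved by sweeping twice around the columns.

```python
from math import inf


def unit_cost(a, b):
    """Hypothetical penalty: 0 for identical characters, 1 otherwise."""
    return 0.0 if a == b else 1.0


def wraparound_cost(text, pattern, sub=unit_cost, gap_text=1.0, gap_pat=1.0):
    """Minimum cost of aligning all of `text` with some prefix of
    `pattern` repeated indefinitely, using O(len(text) * len(pattern)) work.

    Assumes constant gap penalties; gap_pat must be positive so that
    skipping a whole period of the pattern is never free.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    if gap_pat <= 0:
        raise ValueError("gap_pat must be positive")

    def sweep(row):
        # Relax the 'skip one pattern character' move around the cycle.
        # Visiting columns 1..m-1, then 0..m-1 again, covers every chain
        # of at most m-1 consecutive skips from any starting column.
        for step in range(1, 2 * m):
            j = step % m
            row[j] = min(row[j], row[(j - 1) % m] + gap_pat)
        return row

    # Row for the empty text: only pattern characters have been skipped.
    prev_row = sweep([0.0] + [inf] * (m - 1))

    for ch in text:
        row = []
        for j in range(m):
            before = (j - 1) % m
            row.append(min(
                prev_row[j] + gap_text,                       # text char unmatched
                prev_row[before] + sub(ch, pattern[before]),  # pair the two chars
            ))
        prev_row = sweep(row)

    return min(prev_row)


if __name__ == "__main__":
    print(wraparound_cost("abcabcbc", "abc"))
```

Two sweeps are enough here because a full lap of skipped pattern characters costs m times gap_pat and returns to the same column, so no best path within a row goes around the cycle more than once. With gap penalties that depend on gap length, the row update would need a different treatment.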

Keywords

Dynamic Programming; Edit Distance; Coiled Coil; String Match; Protein Structure Determination
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1992

Authors and Affiliations

  • Vincent A. Fischetti (1)
  • Gad M. Landau (2)
  • Jeanette P. Schmidt (2)
  • Peter H. Sellers (1)
  1. Rockefeller University, New York
  2. Polytechnic University, Brooklyn