Discovering Sequence-Structure Patterns in Proteins with Variable Secondary Structure

  • Tom Milledge
  • Gaolin Zheng
  • Giri Narasimhan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3992)


Proteins that share a similar function often exhibit conserved sequence patterns. Sequence patterns help to classify proteins into families where the exact function may or may not be known. Research has shown that these domain signatures often exhibit specific three-dimensional structures. We have previously shown that sequence patterns combined with structural information generally discriminate better than those derived without structural information. However, in some cases, divergent backbone configurations and/or variable secondary structure in otherwise well-aligned proteins make it difficult to identify conserved regions of sequence and structure. In this paper, we describe improvements in our method of designing biologically meaningful sequence-structure patterns (SSPs) starting from a seed sequence pattern from any of the existing sequence pattern databases. Pattern precision is improved by including conserved residues from coil regions that are not readily apparent from examination of multiple sequence alignments alone. Pattern recall is improved by systematically comparing the structures of all known true family members and including all allowable variations in the pattern residues.
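
To make the terms precision and recall concrete, the following is a minimal sketch of how a seed sequence pattern written in PROSITE syntax might be matched against a set of sequences and scored. It is not the authors' method and uses no structural information. The function names `prosite_to_regex` and `precision_recall`, the zinc-finger-style pattern string, and all sequences are assumptions made up for illustration.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles the common syntax: '-' separators, 'x' wildcards, [ABC] allowed
    residues, {ABC} forbidden residues, (n) / (n,m) repeats, and the '<' / '>'
    terminus anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"    # forbidden residues
        else:
            token = core                       # single residue or [ABC]
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + token + suffix)
    return "".join(parts)

def precision_recall(regex, sequences, true_members):
    """Score a pattern against labelled sequences.

    precision = true hits / all hits; recall = true hits / all true members.
    """
    hits = {name for name, seq in sequences.items() if re.search(regex, seq)}
    true_hits = len(hits & true_members)
    precision = true_hits / len(hits) if hits else 0.0
    recall = true_hits / len(true_members) if true_members else 0.0
    return precision, recall

# Hypothetical seed pattern in PROSITE syntax (illustration only).
SEED_PATTERN = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"

# Invented toy sequences, not real proteins.
TOY_SEQUENCES = {
    "member_a": "AACPECGKAFSRSDELTRHIRTHTG",     # family member, matches
    "member_b": "MKCPESDGCGKAFSRSDELTRHIRTHTG",  # family member, Cys spacing too long
    "decoy": "CAACAAALAAAAAAAAHAAAH",            # non-member that happens to match
    "unrelated": "MKTAYIAKQR",                   # non-member, no match
}
TOY_TRUE_MEMBERS = {"member_a", "member_b"}

if __name__ == "__main__":
    regex = prosite_to_regex(SEED_PATTERN)
    print(regex)
    print(precision_recall(regex, TOY_SEQUENCES, TOY_TRUE_MEMBERS))
```

In this toy set the decoy lowers precision and the unmatched family member lowers recall. These are the two failure modes the abstract addresses: adding conserved residues to tighten the pattern, and widening the allowed variations to cover all true members.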


Keywords: Protein Data Bank, Sequence Pattern, Zinc Finger Domain, Coil Region, PROSITE Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Tom Milledge (1)
  • Gaolin Zheng (1)
  • Giri Narasimhan (1)
  1. Bioinformatics Research Group (BioRG), School of Computer Science, Florida International University, Miami, USA
