Discovering Sequence-Structure Patterns in Proteins with Variable Secondary Structure
Proteins that share a similar function often exhibit conserved sequence patterns. Sequence patterns help to classify proteins into families where the exact function may or may not be known. Research has shown that these domain signatures often exhibit specific three-dimensional structures. We have previously shown that sequence patterns combined with structural information generally discriminate better than those derived without structural information. However, in some cases, divergent backbone configurations and/or variable secondary structure in otherwise well-aligned proteins make it difficult to identify conserved regions of sequence and structure. In this paper, we describe improvements in our method of designing biologically meaningful sequence-structure patterns (SSPs) starting from a seed sequence pattern taken from any of the existing sequence pattern databases. Pattern precision is improved by including conserved residues from coil regions that are not readily apparent from examination of multiple sequence alignments alone. Pattern recall is improved by systematically comparing the structures of all known true family members and including all allowable variations in the pattern residues.
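The precision and recall terms used above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' method: it translates a PROSITE-style pattern into a regular expression and scores it against a labeled set of sequences. The function names (`prosite_to_regex`, `precision_recall`), the three patterns, and the toy sequences are all hypothetical, and no structural information is modeled; the extra conserved residue and the widened residue class are simply written in by hand to show how each change moves precision or recall.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Supports x (any residue), [..] (allowed set), {..} (forbidden set),
    repeat counts such as (3) or (2,4), and the < and > terminus anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count, e.g. "x(2,4)" -> ("x", "2,4").
        m = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = m.groups()
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # Single letters and [..] sets are already valid regex fragments.
        repeat = "{" + count + "}" if count else ""
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def precision_recall(pattern, labeled_sequences):
    """Score a pattern against (sequence, is_family_member) pairs."""
    regex = re.compile(prosite_to_regex(pattern))
    tp = fp = fn = 0
    for sequence, is_member in labeled_sequences:
        hit = regex.search(sequence) is not None
        if hit and is_member:
            tp += 1
        elif hit:
            fp += 1
        elif is_member:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical patterns in PROSITE syntax (zinc-finger-like, for illustration).
SEED = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
# Widened residue class: admits a variation seen in a true family member.
WIDENED = "C-x(2,4)-C-x(3)-[LIVMFYWCT]-x(8)-H-x(3,5)-H"
# Additionally requires a conserved G inside the former x(8) spacer.
REFINED = "C-x(2,4)-C-x(3)-[LIVMFYWCT]-x(3)-G-x(4)-H-x(3,5)-H"

# Made-up toy sequences, not real proteins.
TOY = [
    ("CAACAAALAAAGAAAAHAAAH", True),     # member, matches the seed
    ("CAACAAATAAAGAAAAHAAAH", True),     # member with a T variation
    ("CAAACAAAMAAAAAAAAHAAAAH", False),  # non-member that fits the seed
    ("MKTAYIAKQR", False),               # unrelated non-member
]

for name, pattern in [("seed", SEED), ("widened", WIDENED), ("refined", REFINED)]:
    print(name, precision_recall(pattern, TOY))
```

On this toy set, widening the residue class recovers the missed family member (higher recall), and the added conserved residue excludes the non-member that the seed pattern accepted (higher precision).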
Keywords: Protein Data Bank, Sequence Pattern, Zinc Finger Domain, Coil Region, PROSITE Pattern