Identification of Sequence-Specific Tertiary Packing Motifs in Protein Structures using Delaunay Tessellation
An approach to recognizing recurrent sequence-structure patterns in proteins has been developed, based on Delaunay tessellation of protein structure. Starting with a united-residue (side chain centroid) representation of a protein structure, tessellation partitions the structure into a unique set of irregular tetrahedra, or simplices, whose vertices correspond to four nearest-neighbor residues. Tetrahedral clusters composed of residues not adjacent along the polypeptide chain have been classified according to their amino acid composition and the three distances separating the residues along the sequence, defined as the sequence lengths from the first to the second, the second to the third, and the third to the fourth residue. An elementary tertiary packing motif is defined as a Delaunay simplex with a specific amino acid composition, together with three sequence distances (i.e., numbers of residues along the sequence) between vertex residues. Analysis of three databases of diverse protein structures (<30% sequence identity between any pair, 1922 structures total) identified 224 motifs, each found in at least two proteins from different fold families. To further substantiate the methodology, three groups of proteins representing unique structural and functional families were analyzed, and packing motifs characteristic of each group were identified. The proposed methodology is termed Simplicial Neighborhood Analysis of Protein Packing (SNAPP). SNAPP can be used to locate recurrent tertiary structural motifs as well as sequence-specific, functionally relevant patterns similar to Prosite (Hofmann et al., 1999) signatures. We anticipate that the SNAPP methodology will be useful in automating the analysis and comparison of protein structures determined in structural and functional genomics projects.
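The motif definition above can be illustrated with a minimal Python sketch. It is not the SNAPP implementation, and it rests on several assumptions that the abstract does not confirm: the tessellation has already been computed elsewhere (for example with a routine such as `scipy.spatial.Delaunay` applied to side chain centroid coordinates), each simplex is given as four residue indices, amino acid composition is treated as unordered, and a simplex is discarded if any two of its vertices are sequence neighbors. The names `motif_key` and `recurrent_motifs`, the dictionary layout, and the toy data are hypothetical.

```python
from collections import defaultdict


def motif_key(simplex, sequence):
    """Describe one simplex as (composition, (d1, d2, d3)).

    simplex  -- four residue indices (positions along the chain)
    sequence -- one-letter amino acid string for the same chain

    Returns None when any two vertex residues are adjacent along the
    chain (assumed reading of "residues not adjacent along the chain").
    """
    i, j, k, l = sorted(simplex)
    # Sequence distances: first to second, second to third, third to fourth.
    gaps = (j - i, k - j, l - k)
    if min(gaps) <= 1:
        return None
    # Assumption: composition is order-independent, so letters are sorted.
    composition = "".join(sorted(sequence[v] for v in (i, j, k, l)))
    return composition, gaps


def recurrent_motifs(proteins, min_families=2):
    """Keep motifs observed in at least `min_families` distinct fold families."""
    families_by_motif = defaultdict(set)
    for protein in proteins:
        for simplex in protein["simplices"]:
            key = motif_key(simplex, protein["sequence"])
            if key is not None:
                families_by_motif[key].add(protein["family"])
    return {
        key: families
        for key, families in families_by_motif.items()
        if len(families) >= min_families
    }


# Toy input (fabricated for illustration only, not real protein data).
toy_proteins = [
    {"family": "A", "sequence": "ACDEFGHIKL",
     "simplices": [(0, 3, 5, 9), (1, 2, 5, 8)]},
    {"family": "B", "sequence": "AGGEGGGGGL",
     "simplices": [(0, 3, 5, 9)]},
]

if __name__ == "__main__":
    for (composition, gaps), families in recurrent_motifs(toy_proteins).items():
        print(composition, gaps, sorted(families))
```

Keying each motif on the composition together with the three sequence distances means that two simplices count as the same motif only when both match exactly, which mirrors the definition of an elementary tertiary packing motif given in the abstract.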
Keywords: structural motif; sequence pattern; fold recognition; structural and functional genomics