Identification of Sequence-Specific Tertiary Packing Motifs in Protein Structures using Delaunay Tessellation

  • Stephen A. Cammer
  • Charles W. Carter Jr.
  • Alexander Tropsha (corresponding author)
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 24)


An approach to recognizing recurrent sequence-structure patterns in proteins has been developed, based on Delaunay tessellation of protein structure. Starting with a united-residue (side chain centroid) representation of a protein structure, tessellation partitions the structure into a unique set of irregular tetrahedra, or simplices, whose vertices correspond to four nearest-neighbor residues. Tetrahedral clusters composed of residues not adjacent along the polypeptide chain have been classified according to their amino acid composition and the three distances separating the residues along the sequence, defined as the sequence lengths from the first to the second, the second to the third, and the third to the fourth residue. An elementary tertiary packing motif is defined as a Delaunay simplex with a specific amino acid composition, together with the three sequence distances (i.e., numbers of residues along the sequence) between vertex residues. Analysis of three databases of diverse protein structures (< 30% sequence identity between any pair, 1922 structures total) identified 224 motifs, each found in at least two proteins from different fold families. To further substantiate the methodology, three groups of proteins representing unique structural and functional families were analyzed, and packing motifs characteristic of each group were identified. The proposed methodology is termed Simplicial Neighborhood Analysis of Protein Packing (SNAPP). SNAPP can be used to locate recurrent tertiary structural motifs as well as sequence-specific, functionally relevant patterns similar to Prosite (Hofmann et al. 1999) signatures. We anticipate that the SNAPP methodology will be useful in automating the analysis and comparison of protein structures determined in structural and functional genomics projects.
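The motif definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the tessellation has already been computed elsewhere (for example with `scipy.spatial.Delaunay` applied to side chain centroids, whose `simplices` attribute lists four vertex indices per tetrahedron) and only shows how a simplex might be turned into a motif key. The names `motif_key`, `recurrent_motifs` and `toy` are hypothetical, and two readings of the abstract are assumptions: "not adjacent" is taken to mean that no two vertex residues are consecutive in sequence, and "composition" is taken to be order-independent.

```python
from collections import defaultdict


def motif_key(simplex, sequence):
    """Return (composition, distances) for one simplex, or None when
    any two vertex residues are adjacent along the chain.

    simplex  -- four residue indices (positions in `sequence`)
    sequence -- one-letter amino acid string
    """
    i, j, k, l = sorted(simplex)            # vertices in sequence order
    distances = (j - i, k - j, l - k)       # 1st->2nd, 2nd->3rd, 3rd->4th
    if min(distances) < 2:                  # assumed reading of "not adjacent"
        return None
    # Assumption: composition is order-independent, so sort the letters.
    composition = "".join(sorted(sequence[v] for v in (i, j, k, l)))
    return composition, distances


def recurrent_motifs(proteins, min_families=2):
    """Collect motifs seen in at least `min_families` distinct fold families.

    proteins -- iterable of (fold_family, sequence, simplices) tuples
    """
    families = defaultdict(set)
    for family, sequence, simplices in proteins:
        for simplex in simplices:
            key = motif_key(simplex, sequence)
            if key is not None:
                families[key].add(family)
    return {key: fams for key, fams in families.items()
            if len(fams) >= min_families}


# Made-up toy data, for illustration only (not taken from any real structure).
toy = [
    ("alpha", "ACDEFGHIKL", [(0, 3, 6, 9), (1, 2, 5, 8)]),
    ("beta", "MLQQHQQEQQAQ", [(1, 4, 7, 10), (0, 2, 5, 9)]),
]

shared = recurrent_motifs(toy)
```

In this toy input, the simplex `(1, 2, 5, 8)` is discarded because residues 1 and 2 are chain neighbors, and only the motif with composition `AEHL` and sequence distances `(3, 3, 3)` occurs in both fold families. Keying on a sorted composition is a design choice for the sketch; a key that preserves residue order along the sequence would be stricter.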


Keywords: structural motif; sequence pattern; fold recognition; structural and functional genomics




  1. Aurenhammer, F. (1991). Voronoi Diagrams: A survey of a fundamental data structure. ACM Comput. Surveys 23, 345–405.
  2. Altschul SF, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W, Lipman D. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
  3. Bryant SH, Lawrence CE. (1993). An empirical energy function for threading protein sequence through the folding motif. Proteins 16, 92–112.
  4. Carter CW Jr, LeFebvre BC, Cammer SA, Tropsha A, Edgell MH. (2001). Four-body potentials reveal protein-specific correlations to stability changes caused by hydrophobic core mutations. J. Mol. Biol. 311, 625–638.
  5. Casari G, Sippl M. (1992). Structure-derived hydrophobic potential. Hydrophobic potential derived from X-ray structures of globular proteins is able to identify native folds. Proteins 13, 258–271.
  6. Chothia, C. (1975). Structural invariants in protein folding. Nature 254, 304–308.
  7. Chothia, C., Levitt, M., Richardson D. (1977). Structure of Proteins: Packing of α-helices and pleated sheets. Proc. Natl. Acad. Sci. USA 74, 4130–4134.
  8. Chothia, C., Janin, J. (1980). Packing of α-Helices onto β-pleated Sheets and the Anatomy of α/β Proteins. J. Mol. Biol. 143, 95–128.
  9. Finney, J.L. (1970). Random packing and the structure of simple liquids. Proc. R. Soc. A319, 479–493.
  10. Chothia C, Levitt M, Richardson D. (1981). Helix to Helix Packing in Proteins. J. Mol. Biol. 145, 215–250.
  11. Gan, H.H., Tropsha, A. and Schlick, T. (2000). Generating Folded Protein Structures with a Lattice Chain Algorithm. J. Chem. Phys. 113, 5511–5524.
  12. Gan, H.H., Tropsha, A. and Schlick, T. (2001). Lattice Protein Folding with Two and Four-Body Statistical Potentials. Proteins: Struct. Funct. Genetics 43, 161–174.
  13. Gernert K.M., Thomas B.D., Plurad J.C., Richardson J.S., Richardson D.C., Bergman L.D. (1996). Puzzle pieces defined: locating common packing units in tertiary protein contacts. In: Pacific Symposium on Biocomputing ’96, Hawaii, Jan. 3–6, 1996, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 331–349.
  14. Gerstein, M., Tsai, J., and Levitt, M. (1995). The volume of atoms on the protein surface: calculated from simulation using Voronoi polyhedra. J. Mol. Biol. 249, 955–966.
  15. Godzik A, Skolnick J. (1992). Sequence Structure Matching in Globular Proteins: Application to Supersecondary and Tertiary Structure Determination. Proc. Natl. Acad. Sci. USA 89, 12098–12102.
  16. Harpaz, Y., Gerstein, M., and Chothia, C. (1994). Volume changes on protein folding. Structure 2, 641–649.
  17. Henikoff S, Henikoff J, Pietrokovski S. (1999). Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15, 471–479.
  18. Hobohm U, Scharf M, Schneider R, Sander C. (1992). Selection of Representative Protein Data Sets. Prot. Sci. 1, 409–417.
  19. Hofmann K, Bucher P, Falquet L, Bairoch A. (1999). The PROSITE database, its status in 1999. Nucleic Acids Res. 27, 215–219.
  20. Holm L, Sander C. (1998). Touring protein fold space with Dali/FSSP. Nucleic Acids Res. 26, 316–319.
  21. Hooft RWW, Sander C, Vriend G. (1996). Verification of protein structures: Sidechain planarity. J. Appl. Cryst. 29, 714–716.
  22. Hutchinson EG, Sessions R, Thornton J, Woolfson D. (1998). Determinants of strand register in antiparallel β-sheets of proteins. Protein Sci. 7, 2287–2300.
  23. Jonassen I, Eidhammer I, Taylor W. (1999). Discovery of local packing motifs in protein structures. Proteins 34(2), 206–219.
  24. Jones D, Thornton J. (1996). Potential energy functions for threading. Curr. Opin. Struct. Biol. 6, 210–216.
  25. Koretke KK, Luthey-Schulten Z, Wolynes PG. (1996). Self-consistently optimized statistical mechanical energy functions for sequence structure alignment. Protein Sci. 5, 1043–1059.
  26. Lahr SJ, Broadwater A, Carter CW Jr, Collier ML, Hensley L, Waldner JC, Pielak GJ, Edgell MH. (1999). Patterned library analysis: a method for the quantitative assessment of hypotheses concerning the determinants of protein structure. Proc. Natl. Acad. Sci. USA 96, 14860–14865.
  27. Maiorov VN, Crippen GM. (1992). Contact potential that recognizes the correct folding of globular proteins. J. Mol. Biol. 227, 876–888.
  28. Miyazawa S, Jernigan RL. (1996). Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J. Mol. Biol. 256, 623–644.
  29. Munson P, Singh R. (1997). Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence structure alignment. Protein Sci. 6, 1467–1481.
  30. Murzin AG, Brenner SE, Hubbard T, Chothia C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540.
  31. Okabe, A., Boots, B., and Sugihara, K. (1992). Spatial tessellations: concepts and applications of Voronoi diagrams. Chichester, Wiley.
  32. Richards, F.M. (1974). The interpretation of protein structures: total volume, group volume distribution and packing density. J. Mol. Biol. 82, 1–14.
  33. Singh R, Tropsha A, Vaisman I. (1996). Delaunay tessellation of proteins: four body nearest-neighbor propensities of amino acid residues. J. Comput. Biol. 3, 213–222.
  34. Sippl, MJ. (1995). Knowledge-based potentials for proteins. Curr. Opin. Struct. Biol. 5, 229–235.
  35. Tropsha, A., Singh, R.K., Vaisman, I.I., and Zheng, W. (1996). Statistical Geometry Analysis of Proteins: Implications for Inverted Structure Prediction. In: Pacific Symposium on Biocomputing ’96, Hawaii, Jan. 3–6, 1996, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 614–623.
  36. Young M, Skillman A, Kuntz I. (1999). A rapid method for exploring the protein structure universe. Proteins 34, 317–332.
  37. Wako H, Yamato T. (1998). Novel method to detect a motif of local structures in different protein conformations. Protein Eng. 11, 981–990.
  38. Watson, DF. (1981). Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Comp. J. 24, 167–172.
  39. Zheng W, Cho J, Vaisman I, Tropsha A. (1997). A new approach to protein fold recognition based on Delaunay tessellation of protein structure. In: Pacific Symposium on Biocomputing ’97, Hawaii, Jan. 6–9, 1997, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 486–497.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Stephen A. Cammer (1)
  • Charles W. Carter Jr. (2)
  • Alexander Tropsha (1, corresponding author)

  1. The Laboratory for Molecular Modeling, Division of Medicinal Chemistry and Natural Products, School of Pharmacy, CB # 7360, Beard Hall, University of North Carolina, Chapel Hill, USA
  2. Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, USA
