Statistical measures on residue-level protein structural properties

Journal of Structural and Functional Genomics

Abstract

The atomic-level structural properties of proteins, such as bond lengths, bond angles, and torsion angles, have been well studied and are well understood from chemical knowledge and statistical analysis. Similar properties at the residue level, such as the distances between two residues and the angles formed by short sequences of residues, can be equally important for structural analysis and modeling, but they have not been examined and documented on a similar scale. While these properties are difficult to measure experimentally, they can be estimated statistically in meaningful ways from their distributions in known protein structures. Here, residue-level structural properties, including various types of residue distances and angles, are estimated statistically. A software package is built to provide direct access to the statistical data for these properties, including some important correlations not previously investigated. The distributions of residue distances and angles may vary with the residue sequence but, in most cases, are concentrated in a few high-probability ranges that correspond to their frequent occurrence in either α-helices or β-sheets. Strong correlations among neighboring residue angles, similar to those between neighboring torsion angles at the atomic level, are revealed by their statistical measures. Residue-level statistical potentials can be defined using the statistical distributions and correlations of the residue distances and angles. Ramachandran-like plots for strongly correlated residue angles are generated and analyzed, and their applications to structural evaluation and refinement are demonstrated. With the increase in both the number and quality of known protein structures, many structural properties can be derived from sets of protein structures by statistical analysis and data mining, and these can even be used to supplement the experimental data in structure determination. Indeed, the statistical measures on various types of residue distances and angles provide more systematic and quantitative assessments of these properties, which can otherwise be estimated only individually and qualitatively. Their distributions and correlations in known protein structures provide insights into how proteins may fold naturally to various residue-level structures.
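
To make the quantities named in the abstract concrete, the following is a minimal sketch, not the authors' software package. It assumes each residue is represented by a single point (for example its Cα atom, which the abstract does not specify), and it assumes a simple inverse-Boltzmann form with a uniform reference distribution for the statistical potential. All function names and parameters here are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def residue_distance(p, q):
    """Distance between two residue positions (assumed: one point per residue)."""
    return _norm(_sub(p, q))

def residue_angle(p, q, r):
    """Angle in degrees at q formed by three consecutive residue positions."""
    u, v = _sub(p, q), _sub(r, q)
    c = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def residue_torsion(p, q, r, s):
    """Torsion angle in degrees defined by four consecutive residue positions."""
    b1, b2, b3 = _sub(q, p), _sub(r, q), _sub(s, r)
    y = _norm(b2) * _dot(b1, _cross(b2, b3))
    x = _dot(_cross(b1, b2), _cross(b2, b3))
    return math.degrees(math.atan2(y, x))

def statistical_potential(values, lo, hi, nbins, pseudo=1.0):
    """Per-bin energy in units of kT: E_i = -ln(p_i / p_uniform).

    Assumption: a uniform reference distribution and a pseudocount so that
    empty bins get a finite energy.
    """
    width = (hi - lo) / nbins
    counts = [pseudo] * nbins
    for v in values:
        if lo <= v < hi:
            # Clamp guards against rounding at the upper edge of the range.
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    total = sum(counts)
    uniform = 1.0 / nbins
    return [-math.log((c / total) / uniform) for c in counts]

if __name__ == "__main__":
    # Toy sample of angles (made-up numbers, for illustration only).
    sample = [95.0] * 8 + [150.0] * 2
    energies = statistical_potential(sample, 0.0, 180.0, 6)
    print(energies.index(min(energies)))  # index of the most populated bin
```

In this sketch, heavily populated bins receive low energies, so a structure whose residue angles fall in rarely observed ranges would score poorly. A real application would collect the values from a curated set of known structures and would likely condition on residue types.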

Acknowledgments

This work is partially supported by the NIH/NIGMS grants R01GM072014 and R01GM081680 and by the NSF/DMS grant DMS0914354.

Author information

Corresponding author

Correspondence to Zhijun Wu.

About this article

Cite this article

Huang, Y., Bonett, S., Kloczkowski, A. et al. Statistical measures on residue-level protein structural properties. J Struct Funct Genomics 12, 119–136 (2011). https://doi.org/10.1007/s10969-011-9104-4

  • DOI: https://doi.org/10.1007/s10969-011-9104-4
