Abstract
The atomic-level structural properties of proteins, such as bond lengths, bond angles, and torsion angles, have been well studied and are well understood on the basis of either chemical knowledge or statistical analysis. Similar properties at the residue level, such as the distances between two residues and the angles formed by short sequences of residues, can be equally important for structural analysis and modeling, but they have not been examined and documented on a similar scale. While these properties are difficult to measure experimentally, they can be estimated statistically in meaningful ways from their distributions in known protein structures. In this work, residue-level structural properties, including various types of residue distances and angles, are estimated statistically. A software package is built to provide direct access to the statistical data for these properties, including some important correlations not previously investigated. The distributions of residue distances and angles may vary with the sequence but, in most cases, are concentrated in a few high-probability ranges, corresponding to their frequent occurrence in either α-helices or β-sheets. Statistical measures reveal strong correlations among neighboring residue angles, similar to those between neighboring torsion angles at the atomic level. Residue-level statistical potentials can be defined using the statistical distributions and correlations of the residue distances and angles. Ramachandran-like plots for strongly correlated residue angles are produced and analyzed, and their applications to structural evaluation and refinement are demonstrated. With the increase in both the number and the quality of known protein structures, many structural properties can be derived from sets of protein structures by statistical analysis and data mining, and these can even be used to supplement experimental data in structure determination.
Indeed, the statistical measures on various types of residue distances and angles provide more systematic and quantitative assessments of these properties, which can otherwise be estimated only individually and qualitatively. Their distributions and correlations in known protein structures show their importance for providing insights into how proteins may fold naturally into various residue-level structures.
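To make the quantities in the abstract concrete, the following is a minimal sketch of how residue-level distances, angles, and a statistical potential might be computed. It rests on several assumptions that are not taken from the article: each residue is represented by a single point (for example its Cα atom), the potential is obtained by inverse-Boltzmann weighting against a uniform reference state, and all function names are hypothetical. It is not the article's software package.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(p, q):
    """Euclidean distance between two residue positions."""
    d = sub(p, q)
    return math.sqrt(dot(d, d))

def bond_angle(p, q, r):
    """Angle in degrees at q formed by three consecutive residue positions."""
    u, v = sub(p, q), sub(r, q)
    c = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))  # clamp rounding error

def torsion_angle(p, q, r, s):
    """Signed dihedral in degrees defined by four consecutive residue positions."""
    b1, b2, b3 = sub(q, p), sub(r, q), sub(s, r)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = dot(cross(n1, n2), b2) / math.sqrt(dot(b2, b2))
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def histogram(values, lo, hi, nbins):
    """Count values falling in nbins equal-width bins over [lo, hi)."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            counts[min(nbins - 1, int((v - lo) / width))] += 1
    return counts

def statistical_potential(counts, kT=1.0, pseudocount=1.0):
    """Inverse-Boltzmann energies per bin, relative to a uniform reference.

    ASSUMPTION: the uniform reference state and the pseudocount are
    illustrative choices, not the article's definition.
    """
    total = sum(counts) + pseudocount * len(counts)
    uniform = 1.0 / len(counts)
    return [-kT * math.log(((c + pseudocount) / total) / uniform)
            for c in counts]

# Usage on a made-up four-residue trace (coordinates in arbitrary units).
trace = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
d12 = distance(trace[0], trace[1])
angle = bond_angle(trace[0], trace[1], trace[2])
torsion = torsion_angle(*trace)
energies = statistical_potential(histogram([95.0, 92.0, 91.0, 140.0], 0.0, 180.0, 6))
```

In this sketch, heavily populated bins receive lower energies than sparsely populated ones, which is one common way to turn an observed distribution into a scoring term. The choice of bin width, reference state, and pseudocount all affect the resulting values and would need to be matched to the data set at hand.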
Acknowledgments
This work is partially supported by the NIH/NIGMS grants R01GM072014 and R01GM081680 and by the NSF/DMS grant DMS0914354.
Cite this article
Huang, Y., Bonett, S., Kloczkowski, A. et al. Statistical measures on residue-level protein structural properties. J Struct Funct Genomics 12, 119–136 (2011). https://doi.org/10.1007/s10969-011-9104-4
DOI: https://doi.org/10.1007/s10969-011-9104-4