Abstract
One class of secondary structure prediction algorithms uses statistics on the residue pairs found in secondary structural elements. Because the protein folding process is dominated by backbone hydrogen bonding, an approach based on backbone hydrogen-bonded residue pairings would improve the predictive capability of this class of algorithms. The reliability of these prediction algorithms depends on the quality of the statistics and, therefore, on the quality of the data set. This study aimed to determine the propensities of backbone hydrogen-bonded residue pairings for the α-helix and β-sheet secondary structural elements of globular proteins, using a new and comprehensive data set created from the peptides deposited in the Worldwide Protein Data Bank. A master data set of 4882 globular peptide chains with resolution better than 2.5 Å, sequence identity below 25% and length of at least 100 residues was created. Separate subsets for helix and sheet structures were also created from the master set, containing 4594 and 4483 chains, respectively. Backbone hydrogen-bonded residue pairings in helices and sheets were detected, and their propensities were expressed as odds ratios (observed/[random or expected]) in matrices. The propensities assigned by this study to residue pairings in secondary structural elements (helix, overall strands, parallel strands and antiparallel strands) differ from those of previous studies by 19 to 34%. These differences are important and would lead to further improvements in secondary structure prediction algorithms.
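The odds-ratio propensity described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the random (expected) count is defined, so the sketch assumes one common choice: the expected count of a pairing is derived from how often each residue occurs among all paired positions. The function name `pair_odds_ratios` and the input format (a list of two-letter residue tuples) are hypothetical and not taken from the article.

```python
from collections import Counter
from itertools import combinations_with_replacement


def pair_odds_ratios(pairs):
    """Return {(a, b): observed / expected} for unordered residue pairs.

    Assumption (not from the article): the expected count comes from
    random pairing given each residue's frequency among paired positions.
    """
    # Count unordered pairs: ("A", "L") and ("L", "A") are the same pairing.
    observed = Counter(tuple(sorted(p)) for p in pairs)
    total = sum(observed.values())

    # Frequency of each residue over all paired positions (two per pair).
    residue_counts = Counter(r for p in pairs for r in p)
    positions = 2 * total
    freq = {r: n / positions for r, n in residue_counts.items()}

    ratios = {}
    for a, b in combinations_with_replacement(sorted(freq), 2):
        # Unlike residues can pair in two orders, hence the factor of 2.
        expected = total * freq[a] * freq[b] * (1 if a == b else 2)
        ratios[(a, b)] = observed[(a, b)] / expected
    return ratios


if __name__ == "__main__":
    # Toy input: four hydrogen-bonded pairings between Ala (A) and Leu (L).
    demo = [("A", "L"), ("L", "A"), ("A", "A"), ("L", "L")]
    for pair, ratio in pair_odds_ratios(demo).items():
        print(pair, round(ratio, 3))
```

Under this assumption, a ratio above 1 means the pairing occurs more often than random pairing would predict, and a ratio below 1 means it occurs less often. Collecting the ratios for all 20 amino acids gives a symmetric 20 × 20 matrix of the kind the abstract mentions.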
Change history
01 February 2020
In the original version of this article, under the Introduction section in paragraph starting "Some findings of this study..." the “Sect. 10” should be changed to “Sect. 3”.
Funding
N/A
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by the author.
Cite this article
Nacar, C. Propensities of Amino Acid Pairings in Secondary Structure of Globular Proteins. Protein J 39, 21–32 (2020). https://doi.org/10.1007/s10930-020-09880-6
DOI: https://doi.org/10.1007/s10930-020-09880-6