Propensities of Amino Acid Pairings in Secondary Structure of Globular Proteins

The Protein Journal

A Correction to this article was published on 01 February 2020

This article has been updated

Abstract

One class of secondary structure prediction algorithms uses statistics of the residue pairs found in secondary structural elements. Because the protein folding process is dominated by backbone hydrogen bonding, an approach based on backbone hydrogen-bonded residue pairings would improve the predictive capability of this class of algorithms. The reliability of these algorithms depends on the quality of the statistics and, therefore, of the data set. This study aimed to determine the propensities of backbone hydrogen-bonded residue pairings for the α-helix and β-sheet secondary structural elements in globular proteins, using a new and comprehensive data set created from the peptides deposited in the Worldwide Protein Data Bank. A master data set of 4882 globular peptide chains with resolution better than 2.5 Å, sequence identity lower than 25%, and length of at least 100 residues was created. Separate data subsets for helix and sheet structures were also created from the master set; they include 4594 and 4483 chains, respectively. Backbone hydrogen-bonded residue pairings in helices and sheets were detected, and their propensities were represented in matrices as odds ratios (observed/[random or expected]). The propensities assigned by this study to the residue pairings in secondary structural elements (helix, overall strands, parallel strands, and antiparallel strands) differ from those of previous studies by 19 to 34%. These differences are important and would lead to further improvements in secondary structure prediction algorithms.
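As a rough illustration of the odds-ratio idea described in the abstract, the sketch below computes observed/expected ratios for ordered residue pairs. The abstract does not state how the expected counts are derived, so the independence model used here (expected count = first-position total × second-position total / number of pairs), the function name `pair_odds_ratios`, and the toy data are assumptions made for illustration only, not the author's method or results.

```python
from collections import Counter


def pair_odds_ratios(pairs):
    """Return {(a, b): observed / expected} for ordered residue pairs.

    ASSUMPTION: expected counts come from a simple independence model,
    expected(a, b) = count(a in first position) * count(b in second position) / N.
    The article may define the "random or expected" term differently.
    """
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        return {}

    observed = Counter(pairs)                # how often each pair occurs
    first = Counter(a for a, _ in pairs)     # marginal counts, first position
    second = Counter(b for _, b in pairs)    # marginal counts, second position

    ratios = {}
    for (a, b), obs in observed.items():
        expected = first[a] * second[b] / total
        ratios[(a, b)] = obs / expected
    return ratios


# Hypothetical toy pairings (one-letter residue codes), not data from the study.
toy_pairs = [("A", "L"), ("A", "L"), ("A", "E"),
             ("V", "I"), ("V", "I"), ("V", "L")]
ratios = pair_odds_ratios(toy_pairs)

for pair, ratio in sorted(ratios.items()):
    print(pair, round(ratio, 3))
```

Under this model, a ratio above 1 means a pairing occurs more often than chance would suggest, and a ratio below 1 means it occurs less often. Pairs that never occur are simply absent from the result and would have a ratio of 0 in a full 20 × 20 matrix.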

Change history

  • 01 February 2020

    In the original version of this article, in the Introduction section, in the paragraph starting "Some findings of this study...", “Sect. 10” should be changed to “Sect. 3”.

Funding

N/A

Author information

Corresponding author

Correspondence to Cevdet Nacar.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by the author.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nacar, C. Propensities of Amino Acid Pairings in Secondary Structure of Globular Proteins. Protein J 39, 21–32 (2020). https://doi.org/10.1007/s10930-020-09880-6

  • DOI: https://doi.org/10.1007/s10930-020-09880-6
