
Improved 3-D Protein Structure Predictions using Deep ResNet Model

Abstract

Protein structure prediction (PSP) is considered a complicated problem in computational biology. Despite the remarkable progress made by co-evolution-based methods, PSP remains a challenging and unresolved problem. Recently, deep learning approaches have been introduced alongside co-evolutionary relationships, leading to significant progress. In this paper, a novel methodology using a deep ResNet architecture for predicting inter-residue distances and dihedral angles is proposed. The method generates, on average, 125 homologous sequences from a customized sequence database, and these sequences are used to generate input features. From the outputs of the neural networks, a pool of structures is generated, from which the lowest-potential structure is chosen as the final predicted 3-D protein structure. The proposed method is trained on 6521 protein sequences extracted from the Protein Data Bank (PDB). For testing, 48 protein sequences shorter than 400 residues are chosen from the 13th Critical Assessment of protein Structure Prediction (CASP13) dataset. The model is compared with AlphaFold, Zhang, and RaptorX. The template modeling (TM) score is used to evaluate the accuracy of the predicted structures. The proposed method produces the better performance for 52% of the target sequences, compared with 10%, 22.9%, and 6% for AlphaFold, Zhang, and RaptorX, respectively. Additionally, for 37.5% of the target sequences, the proposed method achieved an accuracy greater than or equal to 0.80. The TM scores obtained for the sequences under consideration were 0.69, 0.67, 0.65, and 0.58 for the proposed method, AlphaFold, Zhang, and RaptorX, respectively.
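As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of the TM score for two structures that are assumed to be already superimposed and aligned residue by residue. It is not the authors' evaluation code. The function names `tm_d0` and `tm_score_fixed_superposition` are hypothetical, and the lower bound of 0.5 on the distance scale is an assumption borrowed from common implementations. The full TM score is the maximum of this quantity over all superpositions, which the sketch does not search for.

```python
import numpy as np


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.

    Assumption: d0 is clamped to at least 0.5 (and set to 0.5 for L <= 15),
    so that very short chains do not give a tiny or undefined scale.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(pred_ca, native_ca) -> float:
    """TM score for one fixed superposition of C-alpha coordinates.

    Both inputs are (L, 3) arrays in which row i of the prediction is
    aligned with row i of the native structure. No optimal superposition
    is computed here, so the result is a lower bound on the true TM score.
    """
    pred = np.asarray(pred_ca, dtype=float)
    native = np.asarray(native_ca, dtype=float)
    if pred.shape != native.shape or pred.ndim != 2 or pred.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (L, 3)")

    length = native.shape[0]
    d0 = tm_d0(length)
    # Per-residue distance between aligned C-alpha atoms
    distances = np.linalg.norm(pred - native, axis=1)
    # Each residue contributes a value in (0, 1]; the score is the mean
    return float(np.sum(1.0 / (1.0 + (distances / d0) ** 2)) / length)
```

Identical coordinates give a score of 1.0, and a uniform displacement of exactly d0 gives 0.5, which shows how the score rewards close agreement while depending only weakly on a few badly placed residues.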



Author information

Corresponding author

Correspondence to S. Geethu.


Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 433 kb)

Supplementary file2 (XLSX 391 kb)


About this article


Cite this article

Geethu, S., Vimina, E.R. Improved 3-D Protein Structure Predictions using Deep ResNet Model. Protein J 40, 669–681 (2021). https://doi.org/10.1007/s10930-021-10016-7


Keywords

  • Protein
  • 3-D protein structure prediction
  • CASP
  • Experimental and Computational techniques
  • Deep ResNet Architecture
  • Distance prediction