CSI-LSTM: a web server to predict protein secondary structure using bidirectional long short term memory and NMR chemical shifts

Journal of Biomolecular NMR

Abstract

Secondary structure carries rich structural information, so describing and understanding a protein's structure relies heavily on it, and identifying or predicting secondary structure plays an important role in protein research. In protein NMR studies, predicting secondary structure from chemical shifts is more convenient than the traditional determination based on inter-nuclear distances from NOESY experiments. Deep neural networks have improved significantly in recent years and have been applied in many research fields. Here we propose a deep neural network based on bidirectional long short-term memory (biLSTM) that predicts 3-state protein secondary structure from the NMR chemical shifts of backbone nuclei. Compared with existing methods, the proposed method shows better prediction accuracy. Based on this method, we have built a web server that provides a protein secondary structure prediction service.
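
As a rough illustration of the kind of architecture named in the abstract, the pure-Python sketch below runs one LSTM over the residues from the N-terminus to the C-terminus and a second one in the opposite direction, concatenates the two hidden states at each residue, and maps them to probabilities over three states. Everything specific in it is an assumption and not the authors' implementation: the class names `LSTMCell` and `BiLSTMTagger`, the H/E/C labels, the six input features per residue, the hidden size, and the random, untrained weights. The predicted string is therefore meaningless until such a model is trained.

```python
import math
import random

STATES = ("H", "E", "C")  # assumed 3-state labels: helix, strand, coil


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


class LSTMCell:
    """Standard LSTM cell with input (i), forget (f), output (o) and candidate (g) parts."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, acting on the concatenation [x_t, h_{t-1}].
        self.w = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifog"}
        self.b = {g: [0.0] * n_hidden for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.w[g], z), self.b[g])]
               for g in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        """Return the hidden state at every position of the sequence xs."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        out = []
        for x in xs:
            h, c = self.step(x, h, c)
            out.append(h)
        return out


class BiLSTMTagger:
    """Untrained bidirectional LSTM that emits one 3-state label per residue."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.fwd = LSTMCell(n_in, n_hidden, rng)
        self.bwd = LSTMCell(n_in, n_hidden, rng)
        self.w_out = rand_matrix(len(STATES), 2 * n_hidden, rng)

    def predict_proba(self, xs):
        hf = self.fwd.run(xs)
        # Run the second LSTM on the reversed sequence, then restore residue order.
        hb = self.bwd.run(xs[::-1])[::-1]
        return [softmax(matvec(self.w_out, a + b)) for a, b in zip(hf, hb)]

    def predict(self, xs):
        probs = self.predict_proba(xs)
        return "".join(STATES[max(range(len(STATES)), key=p.__getitem__)]
                       for p in probs)


if __name__ == "__main__":
    # Hypothetical input: 5 residues, 6 made-up shift-derived features each.
    shifts = [[0.1 * (i + j) for j in range(6)] for i in range(5)]
    model = BiLSTMTagger(n_in=6, n_hidden=4, seed=1)
    print(model.predict(shifts))  # one label per residue; arbitrary without training
```

Because the backward pass starts at the last residue, the prediction for the first residue depends on the whole sequence, which is the property that distinguishes a bidirectional model from a single left-to-right LSTM.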

Data availability

The supporting information is attached as supplementary information.

Software availability

The service is provided by an online web server: http://www.proteindeeplearning.info/.

Acknowledgements

This research was funded by the National Key R&D Program of China (Grant Nos. 2018YFA0704002, 2018YFE0202300, 2017YFA0505400), the National Natural Science Foundation of China (Grant Nos. 21735007, 21991080, 21921004), and the CAS Key Research Program of Frontier Sciences (Grant No. QYZDJ-SSW-SLH027).

Author information

Corresponding authors

Correspondence to Bin Jiang or Maili Liu.

Ethics declarations

Conflict of interest

The authors declare no competing financial interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

10858_2021_383_MOESM1_ESM.pdf

The Supplementary Information includes the PDB IDs, BMRB entries, residue numbers, and numbers of assigned backbone chemical shifts of the proteins used as the training and test datasets; the secondary structure prediction accuracies of three methods on the test dataset; and the secondary structure information and prediction accuracies for the training and validation datasets. The Supplementary Information is available free of charge on the publication's website.

Supplementary material 1 (PDF 2050.9 kb)

Supplementary material 2 (PDF 342.9 kb)

Supplementary material 3 (DOCX 154.5 kb)

About this article

Cite this article

Miao, Z., Wang, Q., Xiao, X. et al. CSI-LSTM: a web server to predict protein secondary structure using bidirectional long short term memory and NMR chemical shifts. J Biomol NMR 75, 393–400 (2021). https://doi.org/10.1007/s10858-021-00383-9

  • DOI: https://doi.org/10.1007/s10858-021-00383-9