Abstract
Protein structure prediction is one of the main research areas in bioinformatics. Because of the importance of proteins in drug design, researchers seek to determine the tertiary structure of a protein accurately, and the tertiary structure depends on the secondary structure. In this paper, we focus on improving the accuracy of protein secondary structure prediction. To this end, we propose a multi-scale convolutional neural network combined with a gated recurrent neural network (MCNN-GRNN). A novel amino acid encoding method, together with layered convolutional neural network and gated recurrent neural network blocks, helps capture local and global relationships between features, which in turn allows the input protein sequence to be classified effectively into 3 and 8 states. We evaluated our algorithm on the CullPDB, CB513, PDB25, CASP10, CASP11, CASP12, CASP13, and CASP14 datasets and compared it with state-of-the-art algorithms such as DCNN-SS, DCRNN, MUFOLD-SS, DLBLS_SS, and CGAN-PSSP. Across these datasets, the proposed algorithm achieves a Q3 accuracy of 82–87% and a Q8 accuracy of 69–77%.
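The Q3 and Q8 figures quoted above are per-residue accuracies over 3-state and 8-state secondary structure labels. The following is a minimal sketch of how such scores are typically computed, not the authors' evaluation code. The 8-to-3 state reduction shown (H, G, I to helix; E, B to strand; everything else to coil) is a commonly used convention and is an assumption here, since the abstract does not state which mapping was used. The function names and the example sequences are hypothetical.

```python
# Assumed DSSP-style 8-state alphabet with a common 8-to-3 reduction.
# The paper's exact mapping is not given in the abstract.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",  # helix types -> helix (H)
    "E": "E", "B": "E",            # strand and bridge -> strand (E)
    "T": "C", "S": "C", "L": "C",  # turn, bend, loop -> coil (C)
}


def reduce_to_three_states(ss8: str) -> str:
    """Map an 8-state label string to the 3-state alphabet."""
    return "".join(EIGHT_TO_THREE[state] for state in ss8)


def q_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted label equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 12-residue example; these labels are made up for illustration.
observed_ss8 = "LHHHHGTEEBSL"
predicted_ss8 = "LHHHGGTEEESL"

q8 = q_accuracy(predicted_ss8, observed_ss8)
q3 = q_accuracy(
    reduce_to_three_states(predicted_ss8),
    reduce_to_three_states(observed_ss8),
)
print(f"Q8 = {q8:.2f}%, Q3 = {q3:.2f}%")
```

In this toy example the two mispredicted residues (H predicted as G, B predicted as E) fall within the same 3-state class, so they count as errors for Q8 but not for Q3. This is one reason Q3 scores are generally higher than Q8 scores for the same predictor.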
Data availability
The data used in this study are publicly available.
Acknowledgements
We thank the Department of Computer Science and Engineering, Shri Ramdeobaba College of Engineering and Management, Nagpur, for providing an NVIDIA A100 GPU for experimentation.
Funding
The authors state that no funding was involved.
Ethics declarations
Conflict of interest
The authors declare that they have no relevant financial or non-financial competing interests to disclose in relation to any material discussed in this article.
About this article
Cite this article
Bongirwar, V., Mokhade, A.S. An improved multi-scale convolutional neural network with gated recurrent neural network model for protein secondary structure prediction. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09822-8
DOI: https://doi.org/10.1007/s00521-024-09822-8