ACP-Dnnel: anti-coronavirus peptides’ prediction based on deep neural network ensemble learning

  • Original Article
  • Amino Acids

Abstract

The ongoing COVID-19 pandemic has caused a dramatic loss of human life, and there is an urgent need for safe and efficient drugs against coronavirus infection. Anti-coronavirus peptides (ACovPs) can inhibit coronavirus infection. With high efficiency, low toxicity, and broad-spectrum inhibitory effects on coronaviruses, they are promising candidates for a new type of anti-coronavirus drug. ACovPs are traditionally identified by experiment, which is comparatively inefficient and expensive. As experimental data on ACovPs accumulate, computational prediction provides a cheaper and faster way to find candidate anti-coronavirus peptides. In this study, we combined several state-of-the-art machine learning methodologies to build nine classification models for the prediction of ACovPs. These models were pre-trained using deep neural networks, and the performance of our ensemble model, ACP-Dnnel, was evaluated on three datasets and an independent dataset. We followed Chou's 5-step rules: (1) we constructed the benchmark datasets data1, data2, and data3 for training and testing, and introduced the independent validation dataset ACVP-M; (2) we analyzed the sequence composition features of the peptides in the benchmark dataset; (3) we constructed the ACP-Dnnel model, using a deep convolutional neural network (DCNN) merged with a bi-directional long short-term memory network (BiLSTM) as the base model for pre-training to extract the features embedded in the benchmark dataset, and then combined nine classification algorithms into an ensemble that makes its prediction by voting; (4) tenfold cross-validation was used during training, and the final model performance was evaluated; (5) finally, we constructed a user-friendly web server, publicly accessible at http://150.158.148.228:5000/. The highest accuracy (ACC) of ACP-Dnnel reaches 97%, and its Matthews correlation coefficient (MCC) exceeds 0.9. Across the three datasets, its average accuracy is 96.0%. On the latest independent validation dataset, the MCC, specificity (SP), and ACC values of ACP-Dnnel were higher by 6.2%, 7.5%, and 6.3%, respectively. These results suggest that ACP-Dnnel can assist the laboratory identification of ACovPs and speed up anti-coronavirus peptide drug discovery and development.
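As an illustration of the voting step and the reported metrics, the following is a minimal sketch, not the authors' implementation. It assumes hard majority voting over the binary labels (1 = ACovP, 0 = non-ACovP) produced by nine base classifiers; the abstract does not state the exact voting scheme, and the function names and toy votes below are hypothetical. ACC, SP, and MCC are computed from their standard confusion-matrix definitions.

```python
import math


def majority_vote(votes):
    """Hard voting: return 1 if more than half of the base classifiers say 1."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def evaluate(y_true, y_pred):
    """Compute ACC, SN, SP, and MCC from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Toy data (assumed for illustration): one row per peptide,
# one column per base classifier (nine in total).
votes_per_peptide = [
    [1, 1, 1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 1, 0, 1, 0],
]
y_true = [1, 0, 1, 0]

y_pred = [majority_vote(v) for v in votes_per_peptide]
metrics = evaluate(y_true, y_pred)
print(y_pred, metrics)
```

With an odd number of binary voters, a tie cannot occur, which is one practical reason to use nine classifiers in a hard-voting scheme.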

Data availability statement

All datasets used in this paper can be freely downloaded from http://150.158.148.228:5000/Download.

Acknowledgements

This project was supported by the Provincial Health Commission Science and Technology Foundation of Guizhou (No. gzwkj2023-590) and the Guizhou Medical University National Natural Science Foundation Cultivation Project (No. 21NSFCP40). We also thank all the authors involved in the project.

Funding

This work was supported by the Provincial Health Commission Science and Technology Foundation of Guizhou (No. gzwkj2023-590), the Guizhou Medical University National Natural Science Foundation Cultivation Project (No. 21NSFCP40), and the National Natural Science Foundation of China (Nos. 62071099 and 32160668).

Author information

Contributions

M-YL conceived and designed the study and wrote the paper. M-YL, TW, and Y-XZ configured the experimental environment. M-YL and TW developed the web server. H-ML interpreted the biological significance of anti-coronavirus peptides and gave guidance. Y-WZ and Z-RH participated in project discussions and provided constructive suggestions. C-CX and JH provided ideas for algorithm optimization, and JH participated in all research work and gave guidance. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Changcheng Xiang or Jian Huang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Institutional review board statement

Not applicable.

Informed consent statement

Not applicable.

Additional information

Handling editor: F. Albericio.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, M., Liu, H., Wu, T. et al. ACP-Dnnel: anti-coronavirus peptides’ prediction based on deep neural network ensemble learning. Amino Acids 55, 1121–1136 (2023). https://doi.org/10.1007/s00726-023-03300-6

  • DOI: https://doi.org/10.1007/s00726-023-03300-6
