Vaxi-DL: An Artificial Intelligence-Enabled Platform for Vaccine Development

Protocol in Computational Vaccine Design

Abstract

Vaccine development is a long and complex process involving several steps, including computational studies, experimental analyses, animal model studies, and clinical trials. In silico antigen screening can accelerate this process by identifying potential vaccine candidates. In this chapter, we describe a deep learning-based technique that uses 18 biological and 9154 physicochemical properties of proteins to find potential vaccine candidates. Using this technique, we developed a web-based system, Vaxi-DL, which helps identify new vaccine candidates from bacteria, protozoa, viruses, and fungi. Vaxi-DL is available at https://vac.kamalrawal.in/vaxidl/.
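To illustrate what sequence-derived physicochemical properties can look like, the following is a minimal sketch that turns a protein sequence into a fixed-length numeric vector using amino acid composition and dipeptide composition. These two descriptor families are standard, but they are only an assumed, simplified stand-in: this is not the Vaxi-DL feature set, and the function names and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the fraction of each ordered residue pair in seq (400 values)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Ignore pairs that contain ambiguous or non-standard residues.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(valid)
    return [counts[a + b] / len(valid) for a, b in product(AMINO_ACIDS, repeat=2)]


def featurize(seq):
    """Concatenate both descriptor families into one 420-value vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    print(len(featurize(example)))
```

A vector of this kind, computed for every protein in a pathogen proteome, is the sort of input a classifier could be trained on; the actual Vaxi-DL model uses a much larger set of descriptors than shown here.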

Acknowledgments

The computational facility used in this work was supported by the Robert J. Kleberg Jr. and Helen C. Kleberg Foundation. We also thank Amity University and ICMR [BMI/12(66)/2021 2021-6442] for the support provided during this study. Preeti P received financial support from SERB [File Number: CVD/2020/000842]. The computational facility used for hosting the server was provided by DBT, Government of India [BT/PR17252/BID/7/708/2016].

Author information

Corresponding author

Correspondence to Kamal Rawal.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Preeti, P. et al. (2023). Vaxi-DL: An Artificial Intelligence-Enabled Platform for Vaccine Development. In: Reche, P.A. (eds) Computational Vaccine Design. Methods in Molecular Biology, vol 2673. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3239-0_21

  • DOI: https://doi.org/10.1007/978-1-0716-3239-0_21

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3238-3

  • Online ISBN: 978-1-0716-3239-0

  • eBook Packages: Springer Protocols
