Introduction

A vaccine is an immunobiological substance derived from a disease-causing pathogen that triggers the immune system to mount an effective immune response against that specific pathogen (Khan et al. 2022a). Vaccines blunt the lethality of an infectious microorganism in a manner analogous to natural immunity (Dey et al. 2022a). Infectious diseases caused by microbial pathogens such as viruses, bacteria, and fungi are responsible for increased morbidity and mortality worldwide (Mahapatra et al. 2022a). To date, over 6.8 million people have died in the COVID-19 (Coronavirus Disease 2019) pandemic caused by SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2) (Sahoo et al. 2022), and the death toll continues to rise (Shawan et al. 2021a, b). Other viruses, such as HIV (Human Immunodeficiency Virus), Ebola, Zika, and Dengue, also cause high mortality (Xin et al. 2023). Besides viruses, deadly bacteria are responsible for numerous infectious diseases (Dey et al. 2022a, b; Khan et al. 2022a, b). Given the opportunity, even commensal bacteria such as Staphylococcus aureus can become deadly (Oh et al. 2016). In this context, vaccines are among the most valuable tools in medicine, offering protection against various deadly infectious diseases and saving millions of lives. They have raised life expectancy in both developed and underdeveloped countries (Xin et al. 2023).

At present, multi-epitope-based peptide vaccine design and development is an emerging area of research that focuses on using specific components of a pathogen, known as epitopes, to create a vaccine (Abass et al. 2022). Epitopes are short amino acid sequences that are recognized by the immune system and trigger an immune response. By using epitopes, vaccines can be designed to target specific parts of a pathogen, leading to a more targeted and effective immune response (Shawan et al. 2014). Epitope-based peptides are desirable vaccine candidates because of their simpler production, non-infectious nature, and chemical stability (Obaidullah et al. 2021). One of the main promises of epitope-based peptide vaccine design is its potential for relatively quick and inexpensive development: it requires the production of only a small number of antigenic peptides rather than the entire pathogen, which makes such vaccines well suited for responding to emerging infectious diseases. Another advantage of this type of vaccine is its potential for improved safety (Dey et al. 2022a). Traditional vaccines use either inactivated or attenuated forms of the pathogen, which can cause adverse reactions and/or autoimmune responses in some individuals. In contrast, epitope-based peptide vaccines are biologically harmless and highly effective at eliciting the desired immune response (Purcell et al. 2007; Kar et al. 2020; Mahapatra et al. 2022a). The molecular mechanism of action of an epitope-based peptide vaccine is depicted in Fig. 1 (Kar et al. 2020).

The natural immune response can be triggered by whole microorganisms or by their parts, which act as antigens that elicit the host’s immune response and the production of antibodies against them. Antigenicity is the capacity of an antigen to react with a particular antibody and is linked to immunoreactivity and/or immunogenicity. Immunoreactivity and/or immunogenicity is a complex network of antigen-specific biological reactions mediated by the humoral immunity of the host’s adaptive immune system (Shawan et al. 2014). When an antigen is exposed to the immune system, B-cells are stimulated and, with the aid of CD4+ helper T-cells, differentiate into plasma cells that produce antigen-specific antibodies (Nicholson et al. 2016). In addition, the immune system relies on CD8+ cytotoxic T-cells and IFN (Interferons, a group of cytokines), along with CD4+ helper T-cells, to neutralize the antigen. The T-cell-mediated immune response depends heavily on the MHC (Major Histocompatibility Complex) molecules, whose binding to peptides is analogous to the binding of an antigen to its specific antibody. The human leukocyte antigen (HLA) genes encode the MHC molecules. Every HLA allele corresponds to a set of peptides presented on the infected cell surface and recognized by the receptors on T-cells (TCRs). Thus, T-cells and B-cells provide cellular and humoral immunity, respectively, both of which are critically needed to evoke an effective immune response (Rakib et al. 2020).

The conventional approach to designing and developing an efficient vaccine candidate requires identifying target antigens, conducting in-depth research, and establishing an immunological correlation with the vaccine construct (Rappuoli et al. 2019). The traditional/experimental approach to vaccine development is time-consuming, expensive, fraught with challenges, and requires the cultivation of large amounts of the pathogen. The process typically takes significant time to yield a commercially viable vaccine and involves a high rate of failure. That is why researchers are keenly interested in designing and developing vaccines using computer-assisted tools and techniques (Obaidullah et al. 2021). Recent research has shown that in silico approaches to vaccine design are much more effective than earlier methods (Pyasi et al. 2021). Using novel resources (computational tools, techniques, and databases) and similar bioinformatics strategies, this process has successfully established potent vaccine candidates that can induce strong immune responses against different types of human infectious pathogens, such as viruses [e.g., SARS-CoV-2 (Srivastava et al. 2022), mammarenavirus (Khan et al. 2022b) etc.], bacteria [e.g., Achromobacter xylosoxidans (Khan et al. 2022a), Enterococcus faecium (Dey et al. 2022a), Klebsiella pneumoniae (Dey et al. 2022b), Acinetobacter baumannii (Mahapatra et al. 2022a) etc.], as well as fungi [e.g., Candida auris (Khan et al. 2022c)]. Creating a safe and new vaccine using in silico design and development requires expertise in reverse vaccinology, multiple vaccine databases, and high-throughput methods. Databases such as Cytomegalovirus-db, Mammarenavirus-db, Hantavirus-db etc., are repositories of valuable information regarding experimentally validated vaccine components (Khan et al. 2021a). In contrast, high-throughput methods are potent bioinformatics protocols to anticipate novel vaccine candidates (Srivastava et al. 2022). Furthermore, peptide candidates that may serve as potent epitope vaccines with improved expression patterns can be detected by in silico models that use various computational algorithms. These robust and more sophisticated algorithms are central to identifying immune epitopes for T and B cells. Various high-throughput screening approaches have already been developed to evaluate a vaccine construct’s efficacy (Abass et al. 2022).

In this article, we provide an outline for designing and developing multi-epitope-based peptide vaccines with the aid of different bioinformatics/immunoinformatics tools, database repositories, and computational algorithms in a simple, basic, and straightforward fashion. We expect that developments in bioinformatics and computational technologies will make vaccinology protocols more effective and accessible for researchers, enhancing the future of immunology.

Fig. 1

Molecular mechanism of action of epitope-based peptide vaccine triggering cellular and humoral immunity. (A) The vaccine is taken up, processed, and presented by antigen-presenting cells (APC) with the help of the MHC I receptor to the T-cell receptor (TCR) of CD8+ cytotoxic T-cell (Tc-cell). This interaction activates the Tc-cell development and elicits the production of IFN/Th1 cytokines by CD4+ helper T-cell (Th-cell). IFN/Th1 cytokine results in the activation of Tc-cells to divide and attack the infected cell. The activated Tc-cells are also converted to memory Tc-cells. (B) Likewise, the antigenic vaccine is taken up, processed, and presented by MHC II of APC to TCR of Th-cell. This causes Th-cell activation, resulting in the secretion of IFN/Th2 cytokines. IFN/Th2 cytokine activates B-cells which differentiate into activated plasma cells and memory B-cells. Activated plasma cells and memory B-cells are capable of producing antigen-specific antibodies that can neutralize an infection. This figure was generated using BioRender.com

Materials and methods

The complete step-by-step methodology for the in silico design and development of a multi-epitope-based peptide vaccine is visualized as a flow chart in Fig. 2. The web addresses of the servers/databases and software used in the vaccinomics approach, together with additional comments, are listed in Tables 1 and 2.

Fig. 2

A schematic illustration exhibiting the overall systematic immunoinformatics strategy/approach adopted for the in silico epitope curation, designing, and development of a multi-epitope-based peptide vaccine. Initially, an antigenic target protein sequence from a desired microbe (virus, bacteria, fungi etc.) is extracted to select promiscuous T-cell (Tc and Th) and B-cell (LBL) epitopes. Appropriate linkers can then join these novel epitopes to construct a multi-epitope-based peptide vaccine candidate. After the evaluation (BLAST, disulfide engineering, CBL epitope prediction, NMA, and immune simulation) and structural assessment (2D and 3D), the newly formulated vaccine construct can be subjected to molecular docking analysis with TLR4 immune receptor. A molecular dynamics simulation is carried out to predict the stability of the docked complex. This flowchart is generated using Microsoft Office (PowerPoint) 2019

Table 1 Web addresses with additional comments on different servers/databases that are implemented for in silico vaccine discovery process
Table 2 Web addresses with additional comments on different software that are implemented in in silico vaccine discovery process

Retrieval of Target Protein Sequence

The amino acid sequence of the target protein from desired pathogenic microbes can be acquired using different protein databases like National Center for Biotechnology Information (NCBI) (Database resources of the NCBI 2016) or UniProt (The UniProt Consortium 2021). This retrieved amino acid sequence is used to generate a novel vaccine construct. The NCBI and UniProt databases provide a huge amount of biological protein information (Narang et al. 2021; Panda et al. 2022). The amino acid sequence of the target protein can be extracted in FASTA format (Shawan et al. 2014, 2018).
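
As an illustration of handling the retrieved sequence, the following minimal Python sketch parses FASTA-formatted text into a dictionary of sequences. The function name `read_fasta` and the example record (identifier and residues) are hypothetical and serve only to show the format; they are not taken from NCBI or UniProt.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical record; the identifier and residues are made up for illustration.
example = """>demo_protein hypothetical antigen
MKTAYIAKQR
QISFVKSHFS
"""
sequences = read_fasta(example)
print(sequences)
```

The joined sequence string can then be pasted into, or submitted to, the downstream prediction servers.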

Target Protein Sequence Analysis

Considering the default threshold value, the target protein’s antigenicity can be determined using the VaxiJen v2.0 web server (Shawan et al. 2014). Afterward, the allergenicity of the target protein can be assessed using the AllergenFP v1.0 server (Dimitrov et al. 2014b). Later, the TMHMM v2.0 server can be used to predict the target protein’s transmembrane (TM) helices (Doytchinova and Flower 2007). Ultimately, non-allergenic and highly antigenic amino acid sequences with fewer TM helices are selected for further evaluation (Dey et al. 2022a).

Prediction and Analysis of CTL (Cytotoxic T Lymphocyte) Epitopes

CTL Epitopes Prediction

Within the immune system, CTLs interact with and kill infected cells, thus playing a crucial role in the host’s defense mechanism. To detect the CTL epitopes within a target protein, the NetCTL v1.2 server can be used, which predicts 9-mer epitopes for 12 HLA class I supertypes (A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62). With the default parameter values (weight on C-terminal cleavage, 0.15; threshold for epitope identification, 0.75; weight on antigen processing transport efficiency, 0.05), this tool detects epitopes with great precision, and the CTL epitopes having the highest combined score are then selected for further analysis (Larsen et al. 2007).
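
To make the notion of 9-mer candidates concrete, the sketch below enumerates every overlapping 9-residue window of a sequence. This only illustrates the search space that a predictor scores; it is not the NetCTL algorithm, and the function name `kmers` and the toy sequence are assumptions introduced here.

```python
def kmers(sequence, k=9):
    """Return (1-based start position, peptide) for every window of length k."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Toy 11-residue sequence (made up); it yields 11 - 9 + 1 = 3 candidate 9-mers.
toy_sequence = "MKTAYIAKQRQ"
for start, peptide in kmers(toy_sequence):
    print(start, peptide)
```

A protein of length n therefore contributes n − 8 candidate 9-mers, each of which receives a combined score from the server.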

Identification of MHC I Binding Allele

After the detection of CTL epitopes, the MHC I binding allele for each of the epitopes can be identified using MHC I binding module within IEDB (Immune Epitope Database) server. A consensus percentile rank score of less than or equal to 2.0 is usually considered to choose effective CTL epitopes, as a lower rank score represents higher affinity (Moutaftsi et al. 2006).
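
The percentile-rank cut-off described above can be applied to exported prediction results with a few lines of Python. The sketch below assumes the results have already been reduced to (peptide, allele, percentile rank) tuples; the rows shown are hypothetical values, not real IEDB output, and the function names are illustrative.

```python
def filter_by_percentile_rank(predictions, max_rank=2.0):
    """Keep (peptide, allele, rank) rows with rank <= max_rank,
    sorted so that the strongest predicted binders (lowest rank) come first."""
    kept = [row for row in predictions if row[2] <= max_rank]
    return sorted(kept, key=lambda row: row[2])


def alleles_per_epitope(rows):
    """Group the binding alleles of each peptide into a dict of lists."""
    grouped = {}
    for peptide, allele, _rank in rows:
        grouped.setdefault(peptide, []).append(allele)
    return grouped


# Hypothetical rows: (peptide, HLA allele, consensus percentile rank).
rows = [
    ("KTAYIAKQR", "HLA-A*03:01", 0.4),
    ("KTAYIAKQR", "HLA-A*11:01", 1.8),
    ("TAYIAKQRQ", "HLA-B*07:02", 5.0),   # rejected: rank above 2.0
    ("MKTAYIAKQ", "HLA-A*01:01", 2.0),   # kept: rank equal to the cut-off
]
selected = filter_by_percentile_rank(rows)
print(alleles_per_epitope(selected))
```

The same filter applies unchanged to MHC II predictions, since both modules use the convention that a lower percentile rank means higher affinity.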

Predicted CTL Epitopes Analysis

Afterward, each of the refined CTL epitopes can be analyzed for antigenicity, allergenicity, toxicity, and immunogenicity through VaxiJen v2.0, AllerTOP v2.0, ToxinPred, and IEDB MHC I Immunogenicity tool of IEDB server respectively (Doytchinova and Flower 2007; Gupta et al. 2013; Calis et al. 2013; Dimitrov et al. 2014a). The CTL epitopes, which are highly antigenic, non-toxic, non-allergenic, and extremely immunogenic, are considered for vaccine preparation.

Prediction and Analysis of HTL (Helper T Lymphocyte) Epitopes

HTL Epitopes Prediction

HTLs are a crucial part of the adaptive immune system, as they can identify foreign antigens and stimulate B-cell proliferation and CTLs to eliminate the infectious entity. HTL epitopes within a desired protein sequence can be predicted with the MHC II binding tool of the IEDB server. This module detects 15-mer HTL epitopes, and a consensus percentile rank of 2.0 or less can be used as the threshold to select efficient HTL epitopes. As with the MHC I binding module, a lower percentile rank indicates higher binding affinity (Wang et al. 2010).

Predicted HTL Epitopes Analysis

Each of the selected HTL epitopes can then be scrutinized for antigenicity, allergenicity, and toxicity using VaxiJen v2.0, AllerTOP v2.0, and ToxinPred server, respectively (Doytchinova and Flower 2007; Gupta et al. 2013; Dimitrov et al. 2014a). Later on, extremely antigenic, non-allergic, and non-toxic epitopes against HTLs can further be considered to check their cytokine-inducing capacity.

Cytokine-inducing Capacity Analysis of Predicted HTL Epitopes

In microbial infection, interferon-gamma (IFN-γ) plays a pivotal role in specific and innate immune responses through the activation of macrophages and natural killer cells. The IFNepitope server can be applied to predict and design potent IFN-γ-inducing, MHC II-binding HTL epitopes with an accuracy of 81.39% (Wang et al. 2008; Ashrafi et al. 2019). The interleukin-4 (IL-4) and interleukin-10 (IL-10) inducing ability of the selected HTL epitopes can be evaluated with the IL4pred and IL10pred servers, using threshold values of 0.2 and −0.3, respectively (Dhanda et al. 2013; Nagpal et al. 2017). After the analysis, HTL epitopes having all three cytokine-inducing capacities are chosen to construct the final vaccine candidate.

Prediction and Analysis of LBL (Linear B Lymphocyte) Epitopes

LBL Epitopes Prediction

Antigens having epitopes capable of eliciting B-cell response are critical mediators for antibody-associated humoral immunity. ABCpred server is the most popular one to identify LBL epitopes within a given set of protein sequences with a threshold of 100 for sensitivity, specificity, and accuracy (Saha and Raghava 2007). Subsequently, the probability score of each of the LBL epitopes can be predicted using iBCE-EL server considering default parameters (Manavalan et al. 2018).

Predicted LBL Epitopes Analysis

The predicted LBL epitopes’ antigenicity, allergenicity, and toxicity can be assessed through VaxiJen v2.0, AllerTOP v2.0, and ToxinPred server, respectively, accepting default parameters (Doytchinova and Flower 2007; Gupta et al. 2013; Dimitrov et al. 2014a). LBL epitopes having good scores are then chosen for vaccine construction.

Conservancy Analysis of the Predicted CTL and HTL Epitopes

The conservancy (conservation across antigens) of the previously selected MHC I and MHC II epitopes can be analyzed with the epitope conservancy analysis tool, which is part of the epitope analysis tools of the IEDB server. For a given sequence identity, this tool determines the occurrence of a single epitope across a range of strains, here with a threshold value greater than or equal to 100% (Bui et al. 2007). MHC epitopes with 100% maximum identity can be selected to construct a vaccine candidate.
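
At a 100% identity threshold, conservancy reduces to asking in what fraction of strain sequences the epitope occurs as an exact substring. The following sketch computes that fraction; it is a simplification of what the IEDB tool does (which also supports lower identity thresholds), and the function name and strain sequences are hypothetical.

```python
def conservancy(epitope, strain_sequences):
    """Percentage of strain sequences containing the epitope as an exact match."""
    if not strain_sequences:
        raise ValueError("at least one strain sequence is required")
    hits = sum(1 for seq in strain_sequences if epitope in seq)
    return 100.0 * hits / len(strain_sequences)


# Made-up strain sequences for illustration only.
strains = ["AAAKTAYIAKQRGG", "CCKTAYIAKQRTT", "KTAYIAKQAAA"]
print(conservancy("TAYIAKQ", strains))              # present in all three strains
print(round(conservancy("KTAYIAKQR", strains), 2))  # absent from the third strain
```

Under the selection rule stated above, only epitopes for which this value equals 100 would be carried forward.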

Human Homology Analysis of the Predicted CTL and HTL Epitopes

Identifying epitopes that are homologous to the human proteome is vital to designing a potent vaccine, as epitopes similar to human sequences may hamper an adequate immune response. The homology of an epitope to the human proteome can be determined with the BLAST (Basic Local Alignment Search Tool) module, mainly blastp (protein BLAST), within the NCBI database. In this analysis, a search for homologous sequences can be done using default parameters by selecting Homo sapiens (taxid: 9606) at a threshold e-value of 0.05 (Altschul et al. 1990; Mehla and Ramana 2016). Because a lower e-value indicates a more significant match, epitopes with an e-value above 0.05 can be regarded as non-homologous to human proteins and selected for vaccine construction (Mehla and Ramana 2016).

3D Modeling and Molecular Docking Analysis of the Selected CTL and HTL Epitopes with HLA Antigens

CTL and HTL Epitopes Modeling

To design a reliable vaccine, evaluating the binding affinity of HLA alleles with CTL and HTL epitopes is crucial and can be done by exploiting molecular docking studies. For that, the epitopes (CTL and HTL) must first be modeled with the sOPEP scheme of the PEP-FOLD v3.5 server employing 200 simulations (Lamiable et al. 2016).

Molecular Docking Between CTL and HTL Epitopes with HLA Alleles

Before molecular docking simulation, the energy of each modeled epitope can be computed and minimized with Swiss-Pdb Viewer v4.1.0 software. The 3D structures with the lowest energy are then considered (Guex and Peitsch 1997). Two widely distributed alleles, namely HLA-A*01:01 and HLA-DRB1*01:01, can be selected to represent MHC I and MHC II alleles to examine the binding affinity with CTL and HTL epitopes. To check the molecular interaction, the 3D X-ray crystallographic structures of HLA-A*01:01 and HLA-DRB1*01:01 can be downloaded in pdb format from the RCSB Protein Data Bank under the PDB IDs 6AT9 and 1QEW, respectively. To validate the docking simulation, the co-crystallized ligands within the PDB structures can be used as the positive control (Berman et al. 2002). UCSF Chimera v1.11.2 is freely available software for preparing large protein molecules. The preparation can be done by removing attached ligands from the co-crystallized structure and adding hydrogens and Gasteiger-Marsili (GM) charges (Pettersen et al. 2004). Afterward, OpenBabel can be used to minimize ligand energy and save both structure files (protein and ligand) in pdbqt format (O’Boyle et al. 2011). AutoDock Vina v1.2.0 is a widely used and widely cited software package for molecular docking simulation (Rahman et al. 2016). Throughout the molecular interaction analysis, all the parameters can be kept at default, and the grid box for HLA-A*01:01 and HLA-DRB1*01:01 can be set at (X)60.64 × (Y)73.76 × (Z)45.49 Å and (X)61.25 × (Y)48.69 × (Z)72.95 Å, respectively. The results of docking studies are reported as negative values (kcal mol−1), and a lower (more negative) score indicates stronger binding affinity (Trott and Olson 2009). BIOVIA DS (Discovery Studio) v4.5 can be utilized to visualize the molecular docking simulation results, and the figure can be generated using UCSF Chimera (Accelrys Software Inc: San Diego 2012).

Population Coverage Assessment of Selected CTL and HTL Epitopes

The expression and the distribution pattern of HLA alleles (class I and II) differ by ethnic groups and regions around the globe. Population coverage analysis is pivotal for developing an effective epitope-based peptide vaccine. The population coverage of the selected CTL and HTL epitopes can be assessed by the population coverage tool in the IEDB server. After the calculation, predicted CTL and HTL epitopes and their corresponding HLA binding alleles (MHC I, MHC II, and combined) can be analyzed (Bui et al. 2006).

Cluster Analysis for Class I and Class II MHC Molecules

In humans, the genes for both classes of MHC molecules are highly polymorphic, and this extreme polymorphism in HLA antigens encompasses thousands of alleles. MHC I and II molecules with similar binding specificity can be recognized by MHC clustering analysis with the help of the MHCcluster 2.0 server. Using the default parameters, this tool generates phylogenetic trees and intuitive heat maps of the functional clusters among MHC class I and II molecules (Thomsen et al. 2013).

Establishment of the Vaccine Construct

An effective vaccine construct can be formulated by combining the previously selected CTL, HTL, and LBL epitopes that outperformed the others on the different selection criteria. For this purpose, CTL, HTL, and LBL epitopes can be linked with AAY (Ala-Ala-Tyr), GPGPG (Gly-Pro-Gly-Pro-Gly), and KK (Lys-Lys) linkers, respectively (Dorosti et al. 2019). The AAY linker improves the immunogenicity of a vaccine candidate by influencing protein stability and epitope presentation capacity. The glycine-proline (GPGPG) and bi-lysine (KK) linkers facilitate immune processing and immunogenic activity of the newly constructed vaccine, respectively (Nain et al. 2020). To achieve a stronger immune response, an adjuvant such as the 50S ribosomal protein L7/L12 (TLR4 agonist) can be linked at the N-terminal end of the construct with a bifunctional EAAAK linker (Glu-Ala-Ala-Ala-Lys) (Olejnik et al. 2018).
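
The linker scheme described above can be sketched as a simple string assembly. In the example below, the function name `build_construct` is hypothetical, the epitope and adjuvant strings are placeholder tokens rather than real sequences, and the choice of linker at the junction between two epitope blocks (GPGPG before the HTL block, KK before the LBL block) is an assumption, since the text specifies only the linkers used within each block.

```python
def build_construct(adjuvant, ctl, htl, lbl):
    """Assemble adjuvant-EAAAK-[CTL block]-[HTL block]-[LBL block].

    Within-block linkers follow the text: AAY (CTL), GPGPG (HTL), KK (LBL).
    Assumption: each block junction reuses the linker of the following block.
    """
    ctl_block = "AAY".join(ctl)
    htl_block = "GPGPG".join(htl)
    lbl_block = "KK".join(lbl)
    return adjuvant + "EAAAK" + ctl_block + "GPGPG" + htl_block + "KK" + lbl_block


# Placeholder tokens stand in for real sequences so the layout is easy to read.
construct = build_construct("ADJ", ["C1", "C2"], ["H1", "H2"], ["L1", "L2"])
print(construct)
```

Replacing the placeholder tokens with the selected epitope sequences and the adjuvant sequence gives the full-length construct that is evaluated in the following sections.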

Evaluation of the Newly Constructed Vaccine Candidate

Physicochemical Property Analysis of the Vaccine Construct

The physicochemical properties, i.e., the number of amino acids, molecular weight (MW), theoretical isoelectric point (pI), amino acid composition, the total number of negatively charged residues, the total number of positively charged residues, atomic composition, formula, extinction coefficient, estimated half-life, instability index (II), aliphatic index (AI), and grand average of hydropathicity (GRAVY) of the formulated vaccine can be assessed using the ProtParam tool within the ExPASy proteomics server (Gasteiger 2003; Narang et al. 2021; Panda et al. 2022).
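
Two of these quantities have simple closed-form definitions and can be reproduced locally as a sanity check. The sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy value per residue and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·[X(Ile) + X(Leu)], where X is the mole percent of each residue. It is an independent illustration, not ProtParam's own code, and the function names and the toy peptide are assumptions.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)
    mole_pct = {aa: 100.0 * sequence.count(aa) / n for aa in "AVIL"}
    return (mole_pct["A"] + 2.9 * mole_pct["V"]
            + 3.9 * (mole_pct["I"] + mole_pct["L"]))


toy_peptide = "AVIL"  # made-up four-residue example
print(round(gravy(toy_peptide), 3), round(aliphatic_index(toy_peptide), 1))
```

A negative GRAVY value indicates an overall hydrophilic sequence, while a higher aliphatic index is generally associated with greater thermostability.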

Allergenicity, Antigenicity, and Solubility Profile Analysis of the Vaccine Construct

A newly designed vaccine construct must be non-allergenic, highly antigenic, and highly soluble to elicit a strong immune response. Allergenicity can be profiled with the AllerTOP v2.0, AllergenFP v1.0, and AlgPred v2.0 servers (Saha and Raghava 2006; Dimitrov et al. 2014b). The antigenicity of the construct can be assessed with the VaxiJen v2.0 and ANTIGENpro servers (Doytchinova and Flower 2007; Magnan et al. 2010). The solubility of a vaccine can be analyzed with the SOLpro tool, and a given peptide is expected to be soluble if the calculated score is greater than or equal to 0.5 (Magnan et al. 2009). As a cross-check, another solubility prediction server, namely Protein-Sol, can be utilized, and a protein with a solubility score greater than 0.45 is considered highly soluble (Hebditch et al. 2017). Next, the transmembrane helices and potential signal peptides within the vaccine construct can be determined using the TMHMM v2.0 and SignalP 4.1 servers (Krogh et al. 2001; Nielsen 2017a; Panda et al. 2022).

BLAST and Human Homology Checking of the Constructed Vaccine

To minimize an autoimmune response, relative homology analysis between the final vaccine candidate and human proteome can be done with the BLASTp module of the PSIBLAST algorithm within the NCBI database (Altschul et al. 1990; Altschul et al. 1997; Narang et al. 2022). In this step, a search must be restricted to H. sapiens (taxid:9606), and the query sequence must exhibit less than 40% human homology.

Secondary Structure Analysis of the Vaccine Construct

The secondary structure, as well as the peptide configuration of the final vaccine, can be examined through PSIPRED v4.0 and SOPMA applications (Geourjon and Deléage 1995; Buchan et al. 2013). Considering default parameters, the two servers calculate the percentage of 2D configurations such as alpha helix, random coil, and beta-turn. The PSIPRED v4.0 and SOPMA servers generate the secondary structure of a query protein sequence with a result accuracy of 78.1% and 80%, respectively (Montgomerie et al. 2006).

Development and Analysis of the Tertiary (3D) Structure of the Vaccine Construct

Homology Modeling to Create the 3D Model of the Constructed Vaccine

The RaptorX web server can be employed to build a 3D model of the vaccine candidate. To predict the tertiary structure, this server applies a homology modeling technique, and the 3D model with the lowest p-value is accepted as the best model (Wang et al. 2016).

3D Model Refinement and Validation

A vaccine model’s tertiary (3D) structure can be refined using the GalaxyRefine module on the GalaxyWEB server, which generates five refined models as output. These refined models are ranked according to the scores of different parameters, including GDT-HA, RMSD, MolProbity, Clash score, Poor rotamers, and Rama favored (Ko et al. 2012). Afterward, the refined model can be validated with the ProSA-web server, which calculates the Z-score of that particular model. This server can be used to analyze the stereochemical quality of a protein model by evaluating the geometry of both individual residues and the overall structure (Wiederstein and Sippl 2007). The validated model can then be further assessed using the Verify3D and ERRAT web servers. The Verify3D algorithm assesses a query protein model against its three-dimensional profile obtained from X-ray crystallographic, NMR spectroscopic, and/or computational methods (Eisenberg et al. 1997). In contrast, the ERRAT program assesses a 3D model by identifying imprecise regions within a protein structure based on the errors resulting from the random distribution of atoms (Colovos and Yeates 1993). The PROCHECK application can be used to assess the Ramachandran plot, providing valuable information about the overall quality of the refined vaccine model. Based on the dihedral angles [psi (ψ) and phi (ϕ)], the Ramachandran plot visualizes the percentage of amino acid residues within the most favored, additionally allowed, generously allowed, and disallowed regions. A good-quality model should have over 90% of its amino acid residues in the most favored regions (Morris et al. 1992).

Engineering Disulfide Bonds Inside the Constructed Vaccine Candidate

Disulfide bonds within a protein molecule are critical to stabilizing its tertiary/quaternary structure, interactions, and dynamics. Following refinement, the vaccine construct can be submitted to the Disulfide by Design v2.12 server for disulfide engineering. For disulfide bridging, default values (in degrees) can be kept for the χ3 and Cα-Cβ-Sγ angles. A χ3 angle between −87° and +97° and an energy score of less than 2.2 kcal/mol suggest effective disulfide bridging (Craig and Dombkowski 2013).
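
The two acceptance criteria stated above can be applied to a list of candidate residue pairs as follows. The sketch assumes that the server output has been transcribed into dictionaries with `chi3` (degrees) and `energy` (kcal/mol) fields; these field names, the function name, and the candidate values are hypothetical.

```python
def acceptable_disulfides(candidates, chi3_range=(-87.0, 97.0), max_energy=2.2):
    """Keep residue pairs whose chi3 angle lies within chi3_range (inclusive)
    and whose energy score is strictly below max_energy."""
    low, high = chi3_range
    return [
        c for c in candidates
        if low <= c["chi3"] <= high and c["energy"] < max_energy
    ]


# Hypothetical candidate pairs; the values are made up for illustration.
candidates = [
    {"pair": ("ALA12", "GLY45"), "chi3": -85.0, "energy": 1.9},   # accepted
    {"pair": ("SER30", "THR77"), "chi3": 110.0, "energy": 1.2},   # chi3 out of range
    {"pair": ("LYS51", "VAL90"), "chi3": 60.0, "energy": 3.4},    # energy too high
]
for c in acceptable_disulfides(candidates):
    print(c["pair"])
```

Only the pairs that pass both criteria would be considered for mutation to cysteine.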

Scanning for CBL (Conformational B Lymphocyte) Epitopes Within the Newly Formulated Vaccine

The CBL epitopes within the formulated vaccine construct can be predicted with the ElliPro: Antibody Epitope Prediction tool within the IEDB analysis resource. The discontinuous B-cell epitopes can be detected using the default values of a minimum protrusion index (PI) score of 0.5 and a maximum distance between the residues’ centers of mass (R) of 6 Å. A larger value of R and of PI indicates a larger conformational B-cell epitope and greater solvent accessibility, respectively (Ponomarenko and Bourne 2007).

Normal Mode Analysis (NMA) of the Vaccine Construct

NMA is needed to understand the collective functional motions of a protein complex in its internal (dihedral) coordinates. The iMODS server can be used to analyze the normal modes of the designed vaccine candidate. This technique, which is quicker and less costly than a full MD (Molecular Dynamics) simulation, facilitates the prediction of the eigenvalues, deformability, B-factors, and covariance (López-Blanco et al. 2014).

Computational Immune Simulation Analysis of the Constructed Vaccine

A vaccine candidate’s immunogenicity and immune response can be understood by exploiting the C-ImmSim web server. This server applies an immune simulation technique, setting the parameters as defaults (Dellagostin et al. 2017).

Molecular Docking Simulation Study Between Vaccine Construct and TLR4 (Toll-Like Receptor) Complexes

Computer-assisted molecular docking can predict the molecular interaction and binding affinity between a TLR and the vaccine. TLRs are closely associated with strong immunity (Rafi et al. 2022).

TLR Preparation

For docking analysis, the X-Ray crystallographic structure of the human TLR4 complex with MD-2 and LPS (PDB ID 4G8A) can be downloaded from the RCSB protein data bank bearing a resolution of 2.4 Å. The ligands, along with B, C, and D chains, can be removed by BIOVIA DS (Discovery Studio) v4.5. Later on, the energy of the protein structure can be minimized with Swiss-Pdb Viewer v4.1.0 applying GROMOS 43B1 force field (Guex and Peitsch 1997; Berman et al. 2002; Accelrys Software Inc: San Diego 2012).

Docking Simulation Analysis

Next, the vaccine candidate and the prepared TLR4 can be docked by a protein-protein docking server, i.e., ClusPro v2.0 (Land and Humble 2018). The TLR4-vaccine docked complex with the lowest docking score can be considered to have high-affinity binding, and the molecular interaction can be observed using BIOVIA DS (Discovery Studio) v4.5 (Mahapatra et al. 2022b).

MD (Molecular Dynamics) Simulation Study of the Vaccine Construct and TLR4 Docked Complex

Molecular dynamics simulation allows researchers to examine the potential vaccine’s molecular and atomic motions. The molecular dynamics simulation is employed to analyze the association between the receptor proteins (TLRs) and the vaccine candidate (multi-epitope-based subunit vaccine) (Kozakov et al. 2017). The molecular docking technique initially assesses the stability of the vaccine-receptor complex, which is further supported and verified by molecular dynamics simulation (Mahapatra et al. 2022b). The process generally suggests whether the developed vaccine would trigger TLR stimulation, which could support stronger immune reactions inside the human body (Kozakov et al. 2017). The YASARA (Yet Another Scientific Artificial Reality Application) Dynamics (v22.9.24) software package may be adopted to carry out the MD simulation of the vaccine-TLR4 complex. During the simulation, the AMBER14 force field can be employed (Chatterjee et al. 2018). Before the MD simulation, the complex is cleaned by deleting unknown ligands, water molecules, and metal ions. Similarly, the H-bond network is optimized to reorient hydrogen bonds and add the missing ones (Pyasi et al. 2021). The protein complex can be solvated in a simulation cell using the TIP3P solvation model, where the solvent density may be maintained at 0.997 g mL−1 (Harrach and Drossel 2014). The AMBER force fields are generally integrated with the most regularly utilized TIP3P solvent model. While the TIP3P framework has no impact on the thermodynamic characteristics of the solutes, it dramatically lowers the distances among these stages, speeding up the dynamics and thereby improving sampling in the computations (Krieger et al. 2012). The protonation arrangement of proteins is critical for their structural rigidity. Before initiating a traditional MD simulation, the protonation states should be established and assigned (Florová et al. 2010). The SCWRL algorithm manages the protonation state of every amino acid within a protein molecule, which helps calculate each amino acid’s pKa (acid dissociation constant) value. Furthermore, Na+ and Cl− ions can be added to preserve the physiological environment at pH 7.4 and a temperature of 298 K (Krieger et al. 2012; Pyasi et al. 2021). The Particle Mesh Ewald (PME) approach can be used to calculate the long-range interactions, short-range Coulomb, and vdW contacts (Varma et al. 2006). When PME is used to handle electrostatic interactions, molecular dynamics simulations of a protein in explicit water are significantly affected by the addition of Cl− and Na+ ions (Alam et al. 2019). When the ionic solution equilibrates, the overall architecture and movements of the protein’s flexible regions are influenced by the presence of salt ions and charge-stabilizing counterions (Alam et al. 2019). The steepest descent method is preferable for reducing the high-energy features of the starting configuration (Hsieh et al. 2009). Together with simulated annealing, the steepest descent approach can be used to minimize the energy of the TLR4-vaccine docked complex. For the simulation, the time step can be set to 2.0 fs, and long-range electrostatic interactions can be calculated with a cutoff radius of 8 Å (Grote et al. 2005). The simulation may be conducted for 100 ns, and the trajectories can be stored at 100 fs intervals. The data within the trajectory files can be used to analyze RMSD (Root Mean Square Deviation), RMSF (Root Mean Square Fluctuation), Rg (Radius of Gyration), SASA (Solvent Accessible Surface Area), and H-bonds (Solanki and Tiwari 2018). Despite several successes, MD simulation still faces challenges, such as the lack of more refined force fields and the high computational power required for simulation times beyond a microsecond (Durrant and McCammon 2011).
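
Two of the trajectory metrics listed above can be written down in a few lines, which helps when interpreting the plots produced by the simulation package. The sketch below computes RMSD between two coordinate sets and an unweighted radius of gyration. It assumes that the frames are already superimposed (no fitting is performed) and that all atoms have equal mass; the function names and coordinates are made up, and this is not YASARA's implementation.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate lists.
    Assumes the two structures are already superimposed (no fitting is done)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def radius_of_gyration(coords):
    """Unweighted radius of gyration (every atom treated as having equal mass)."""
    n = len(coords)
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    total = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(total / n)


# Toy two-atom "structure" (coordinates in angstroms, made up for illustration).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, frame), radius_of_gyration(reference))
```

A plateau in RMSD over the simulation time and a steady Rg are commonly read as signs that the docked complex remains stable and compact.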

In Silico Codon Optimization and Molecular Cloning of the Constructed Vaccine

Efficient cloning and expression of a multi-epitope-based peptide vaccine construct are needed to develop an effective vaccine. Therefore, codon adaptation, optimization, and cloning of the vaccine can be carried out for E. coli K12 (Solanki and Tiwari 2018). Since codon usage in humans differs from that in E. coli, JCat (Java Codon Adaptation Tool), an online application, can be employed to reverse translate and optimize the final vaccine construct. This step increases the expression of the final vaccine construct in the E. coli host. The JCat output for the adapted and optimized construct reports the nucleotide sequence, the CAI (Codon Adaptation Index), and the GC content (%), which are essential for proper expression in a particular host (Grote et al. 2005). The CAI ranges from 0 to 1, and for effective expression of a vaccine construct it should be close to 1 (values above 0.8 are generally regarded as favorable), while the GC content should lie within 30–70%. Finally, BglII and ApaI restriction sites can be added at the N- and C-terminal ends of the newly formulated vaccine sequence. The resulting codon-optimized sequence can be cloned in silico into the pET-28a (+) vector using SnapGene v6.1 software (Solanki and Tiwari 2018).
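
The quantities discussed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the JCat or SnapGene procedure: the function names are hypothetical, the CAI is computed as the geometric mean of per-codon relative adaptiveness weights, and the weight table in the example is a made-up toy table rather than real E. coli K12 codon usage. The recognition sequences used are AGATCT for BglII and GGGCCC for ApaI.

```python
import math

BGLII_SITE = "AGATCT"  # BglII recognition sequence
APAI_SITE = "GGGCCC"   # ApaI recognition sequence

def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)

def codon_adaptation_index(dna, weights):
    """Geometric mean of relative adaptiveness weights over the codons of `dna`.

    `weights` maps codons to values in (0, 1]; codons absent from the table
    are skipped. The table must come from the chosen expression host.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon of the sequence is present in the weight table")
    return math.exp(sum(logs) / len(logs))

def add_restriction_sites(dna, five_prime=BGLII_SITE, three_prime=APAI_SITE):
    """Flank the coding sequence with restriction sites at its 5' and 3' ends.

    Raises an error if either site already occurs inside the insert, since
    the enzyme would then cut within the construct.
    """
    dna = dna.upper()
    for site in (five_prime, three_prime):
        if site in dna:
            raise ValueError(f"internal {site} site found in the insert")
    return five_prime + dna + three_prime

def gc_within_range(dna, low=30.0, high=70.0):
    """Check the 30-70% GC window quoted in the text."""
    return low <= gc_content(dna) <= high

# Toy example with a hypothetical weight table (not real codon usage data).
toy_weights = {"GCT": 1.0, "GCC": 0.5}
toy_insert = "GCTGCC"
toy_cai = codon_adaptation_index(toy_insert, toy_weights)
toy_construct = add_restriction_sites(toy_insert)
```

Checking for internal recognition sites before flanking the insert is a design choice of this sketch; it reflects the general requirement that the chosen enzymes should not cut within the vaccine sequence itself.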

Conclusion and Future Scope

Developing a swift and highly effective vaccinology technique is critical for responding to unexpected health catastrophes and lowering infection-related death rates. By stimulating the immune response, vaccination offers protection against infectious diseases, reducing morbidity and mortality. Vaccine development must be efficient and prompt to tackle emergent health crises. However, conventional vaccine design and development procedures are time-consuming and expensive. In contrast, computational vaccinology, supported by vaccinomics and immunoinformatics strategies, has made it possible to screen and detect antigens of interest in an economical and time-saving manner and to develop vaccine candidates against emerging pathogens. The wealth of genomics and proteomics data allows informatics to expand its contribution to medical innovation, especially in vaccine science. In the post-genomic age, the construction of multi-epitope-based peptide vaccines has emerged as a unique concept. The availability of entire microbial genome and proteome sequences, together with bioinformatic tools and techniques for analyzing them, enables the detection of top immunogenic protein candidates and the design of multi-epitope-based peptide vaccines. Thus, designing a multi-epitope-based peptide vaccine offers a promising avenue for efficient and cost-effective therapy and for generating a robust immune response against infectious disease.

This review delivers a modest, elementary, and typical procedure/protocol for designing and developing multi-epitope-based peptide vaccines with the aid of different databases, computational tools, and algorithms. Interested researchers/immunologists might utilize the information in this article to guide the design of multi-epitope-based peptide vaccine candidates for subsequent pre-clinical and clinical studies. This concise and comprehensive review encompasses a range of essential resources and databases needed to identify the most potent and novel antigenic sequences (CTL, HTL, and LBL epitopes), assess MHC (both class I and II) binding, create a putative vaccine construct through homology modeling, analyze the interaction between the constructed vaccine and immune receptors (TLR4) using molecular docking and dynamics simulation, perform normal mode and immune simulation analyses of the vaccine candidate, and finally carry out molecular cloning of the newly constructed vaccine (Fig. 3). We hope that this summarized review may offer a more effective and accessible vaccinology protocol for future researchers, allowing them to computationally design vaccines against the pathogen of interest.

In the near future, multi-epitope-based peptide vaccine design and development will likely become one of the fastest-growing fields of biological science, particularly in combating infectious diseases. With advancements in bioinformatics and computational modeling, researchers can more easily and accurately predict the epitopes most likely to elicit a potent and effective immune response, making the development of new vaccines much more economical, rapid, and efficient. Using multi-epitope-based peptide vaccines may help reduce the global burden of infectious diseases by providing a safe and effective means of preventing and treating those illnesses. Additionally, as our understanding of the immune system and the mechanisms of antigen recognition and presentation continues to grow, new strategies for enhancing the immunogenicity of epitopes and improving the efficacy and durability of epitope-based peptide vaccines are likely to emerge. Overall, multi-epitope-based peptide vaccine design and development holds great promise for preventing and controlling infectious diseases in the years to come.

Fig. 3

Concise and comprehensive representation showing different applications employed for the vaccinomics/immunoinformatics governed multi-epitope-based peptide vaccine developmental process. This illustration is generated using Microsoft Office (PowerPoint) 2019