Introduction

Mycoplasma pneumoniae is a significant respiratory pathogen that causes disease ranging in severity from mild upper respiratory tract infection to severe atypical pneumonia (walking pneumonia), predominantly in children and young adults. The disease usually presents with symptoms of respiratory tract and lung infection, and the mortality rate is approximately 10–30% among hospitalized patients [1,2,3,4]. The organism also produces a wide spectrum of extra-pulmonary manifestations, including neurological, dermatological and congenital tract infections [5, 6]. Several reports have been published on the association of M. pneumoniae with community-acquired infections [6, 7]. Even though a wide array of data is available from these reports, effective therapeutic management of M. pneumoniae infections remains a challenge.

Mycoplasmas are cell wall-deficient bacteria and the simplest self-replicating organisms capable of cell-free survival [8, 9]. M. pneumoniae has an extremely small genome (approx. 816 kilobase pairs), which accounts for its limited biosynthetic capabilities and slow replication rate [10]. M. pneumoniae is transmitted by respiratory droplets through close contact [11]; contact with the respiratory tract mucosa leads to local inflammation and additional manifestations. The primary phase of M. pneumoniae respiratory tract infection involves cytadherence of the organism to the ciliated columnar epithelium, which protects the organism from local cytotoxic effects and mucociliary clearance [6].

A specialized organelle mediates cellular adherence to sialoglycoproteins and sulphated glycolipids. It has a central core consisting of dense filaments and a tip-like structure composed of adhesins and accessory proteins [12, 13]. The P1, P30 and P65 adhesins are the crucial proteins involved in receptor binding. Proteins A, B and C, together with high-molecular-weight protein 1 (HMW-1), HMW-2 and HMW-3, act as accessory proteins that interact with the P1, P30, P65 and P165 adhesins and thus facilitate cytadherence [14, 15]. These proteins could serve as druggable targets for effective therapeutic and vaccine approaches against M. pneumoniae.

Mycoplasma pneumoniae is highly susceptible to macrolides, tetracyclines and the newer quinolone antibiotics. Macrolide resistance has been reported from China, from European countries such as France and Germany, and from other parts of the world including the USA [16,17,18,19]. Macrolide resistance in M. pneumoniae is associated with nucleotide mutations in the 23S rRNA gene, which can increase the minimal inhibitory concentrations of azithromycin, erythromycin and clarithromycin. Successful vaccine development would not only help to avert critical cases of M. pneumoniae infection but also serve as a preventive approach against outbreaks, particularly in closed community settings [20].

Epitope-based vaccine design has recently drawn much attention in the treatment of infectious diseases [21,22,23]. Immunization with epitope-based vaccines can stimulate the humoral and/or cellular arms of the immune system [24]. Such vaccines are composed of highly immunogenic T and B cell epitopes, which direct cytotoxic T lymphocytes (CTL), B cells or T helper cells to particular epitopes [25, 26]. T helper cells and B cells play a significant role in stimulating a protective immune response in numerous bacterial infections; therefore, identifying peptides that elicit B and T cell responses is critical for the design of potent epitope-based peptide vaccines [27, 28]. Epitope-based peptide vaccines have several prospective advantages, such as cost-effective production, the ability to select the type of immunity, and increased safety [23]. Immunoinformatics and computational biology approaches have made an indispensable contribution to the design of epitope-based peptide vaccines.

In the present study, we designed an epitope-based peptide vaccine from membrane-associated proteins and cytadherence proteins of M. pneumoniae using an immunoinformatics approach. The presented vaccine strategy could provide promising pathogen-specific candidates with extensive therapeutic applications against pneumonia and its associated infectious diseases.

Materials and methods

Dataset construction

We used a dataset of 12 membrane-associated proteins reported by Gupta et al. [29] and 5 cytadherence proteins of M. pneumoniae from the UniProt database (http://www.uniprot.org/). The dataset was then annotated with the protein accession ID, protein family name, subcellular location and corresponding references.

Unraveling protein antigenicity and allergenicity

To assess antigenicity, all 17 proteins were submitted to the VaxiJen v2.0 server, which predicts potent antigens and subunit vaccines, using preset parameters [30]. Similarly, allergenicity was predicted for all 17 proteins with the AllerHunter server [31], using plain sequence format as input and bacteria as the target organism. The 17 proteins with the highest antigenicity scores were selected for further evaluation. Allergens and non-allergens in the dataset were thus predicted with high sensitivity and specificity.

Primary and secondary structure analysis

The SOPMA server [32] was used to identify the secondary structural elements and physicochemical properties of all the proteins. The major properties extracted include transmembrane helices, solvent accessibility, globular regions, random coils and coiled-coil regions.

T cell epitope prediction from the conserved sequences

T cell epitope prediction is crucial for rational vaccine design and was performed using the EpiDOCK server [33], the first structure-based server for the prediction of MHC class II binding. MHC class II proteins present peptides derived from endocytosed extracellular proteins and are encoded by three loci: HLA-DR, HLA-DQ and HLA-DP. EpiDOCK predicts binding to the 23 most frequent human MHC class II alleles: 12 HLA-DR, 6 HLA-DQ and 5 HLA-DP alleles. The binding cleft of MHC class II proteins is open-ended and allows longer peptides to bind. The protein sequences were submitted in FASTA format to predict binding to the 23 MHC class II alleles, with a threshold assigned to each HLA molecule. The peptides that bound the maximum number of MHC class II alleles were identified as epitopes and subjected to further analysis.
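
As an illustration of how a protein sequence is broken into candidate 9-mer binding cores before any scoring takes place, the following minimal Python sketch enumerates every overlapping window of a sequence. It is not the EpiDOCK implementation; the function name `overlapping_kmers` and the toy input (two epitopes from this study joined end to end) are assumptions made for the example.

```python
def overlapping_kmers(sequence, k=9):
    """Yield (start_position, kmer) for every overlapping window.

    Positions are 1-based, as is usual for protein sequences.
    """
    seq = sequence.strip().upper()
    for i in range(len(seq) - k + 1):
        yield i + 1, seq[i:i + k]


# HYPOTHETICAL toy input: two epitopes named in this paper, concatenated.
toy_sequence = "WIHGLILLF" + "VILLFLLLF"
cores = list(overlapping_kmers(toy_sequence))
```

A sequence of length n yields n - 8 candidate 9-mers, so the 18-residue toy input produces 10 windows.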

Modeling and validation of best selected epitopes

For the docking and simulation studies, the best conserved epitopes were submitted to the PEP-FOLD server [34], a de novo method for predicting peptide structures from amino acid sequences. The best peptide models returned by the server were selected for further analysis.

The stereochemical properties of the predicted models were assessed with the RAMPAGE server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php), which reports the residues present in the favored, allowed and disallowed regions. The PDB coordinates of the predicted epitope models were given as input, and a Ramachandran plot was constructed to validate each structure. Epitope models with more than 75% of residues in the favored region were carried forward for further analysis.
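
The acceptance criterion can be expressed in a few lines. The sketch below assumes that each evaluated residue has already been assigned a region label by a Ramachandran tool; the label strings and the example data are hypothetical, and only the > 75% cutoff comes from the text.

```python
def percent_favored(region_labels):
    """Percentage of evaluated residues that fall in the favored region."""
    if not region_labels:
        raise ValueError("no residues to evaluate")
    favored = sum(1 for label in region_labels if label == "favored")
    return 100.0 * favored / len(region_labels)


def passes_validation(region_labels, cutoff=75.0):
    """Apply the > 75% favored-region criterion described in the text."""
    return percent_favored(region_labels) > cutoff


# HYPOTHETICAL labels: a 9-mer has 7 internal residues with both phi and psi.
example_labels = ["favored"] * 6 + ["allowed"]
example_ok = passes_validation(example_labels)
```

Because the criterion is a strict inequality, a model with exactly 75% of residues in the favored region would be rejected in this sketch.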

Molecular docking study

The binding affinity between the HLA molecules and the predicted epitopes was evaluated by docking with the CLUSPRO server [35], a web-based program for computational docking of protein structures. The HLA proteins and predicted peptide structures were given as input to obtain the lowest binding energy score. The docking algorithm evaluates a large number of complexes and retains a preset number with favorable surface complementarity. A filtering step is then applied to this set of structures to select those with good electrostatic and desolvation free energies. Docking was conducted between the predicted epitopes and the HLA-DP (3LQZ), HLA-DQ (1S9V) and HLA-DR (2G9H) molecules. The PDB coordinates for HLA-DP, -DQ and -DR were obtained from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank. Chains A and B of HLA-DP, HLA-DQ and HLA-DR were used for docking with the predicted epitopes, and the binding energies were analyzed.
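
The two-stage selection described above (retain a preset number of poses by surface complementarity, then filter by electrostatic and desolvation energy) can be sketched as follows. This is only a schematic of the idea and not the server's code: the `Pose` fields, the use of a simple sum of the two energy terms, and all numeric values are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    name: str
    shape_score: float    # assumed convention: higher means better complementarity
    electrostatic: float  # assumed convention: lower means more favorable
    desolvation: float    # assumed convention: lower means more favorable


def two_stage_filter(poses, keep_by_shape, keep_final):
    """Keep the best poses by shape, then rank the survivors by energy."""
    by_shape = sorted(poses, key=lambda p: p.shape_score, reverse=True)
    retained = by_shape[:keep_by_shape]
    # ASSUMPTION: the two energy terms are combined by an unweighted sum.
    by_energy = sorted(retained, key=lambda p: p.electrostatic + p.desolvation)
    return by_energy[:keep_final]


# HYPOTHETICAL poses; the numbers carry no physical meaning.
example_poses = [
    Pose("a", 10.0, -5.0, -1.0),
    Pose("b", 9.0, -8.0, -2.0),
    Pose("c", 8.0, -1.0, 0.0),
    Pose("d", 1.0, -50.0, -50.0),
]
selected = [p.name for p in two_stage_filter(example_poses, 3, 2)]
```

In this toy case pose `d` has the most favorable energy but is discarded in the first stage because of its poor shape score, which is the behavior the two-stage design is meant to produce.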

Molecular dynamics simulation studies

All molecular dynamics simulations were performed using the GROMACS 4.5.2 package [36] with the OPLS-AA force field [37]. First, all the models were solvated with 1.09 nm simple point charge (SPC) water in the simulation boxes. Each system was then subjected to steepest-descent energy minimization until a tolerance of 100 kJ/mol was reached. Subsequently, the solvent molecules were equilibrated with the protein held stable at 300 K for a period, and the entire system was gradually relaxed and heated to 300 K. Finally, 15-ns MD simulations were carried out under normal temperature and pressure with a coupling time constant of 1.0 ps. The particle mesh Ewald method was used to treat long-range electrostatic interactions, and a cutoff of 1.4 nm was used for van der Waals interactions. The g_rmsd command was used to calculate the RMSD from the trajectory file, and the corresponding graphs were generated using GRACE software.
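
For readers unfamiliar with the quantity reported by g_rmsd, the following sketch computes the root-mean-square deviation between two sets of atomic coordinates. It is a simplified illustration: it assumes the two structures are already superimposed and weights all atoms equally, whereas trajectory tools normally perform a least-squares fit first. The function name and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples.

    No superposition is performed; inputs are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# HYPOTHETICAL two-atom example: every atom is shifted by 0.1 nm along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
example_rmsd = rmsd(reference, shifted)
```

Evaluating this quantity for each frame against the starting structure gives the RMSD-versus-time curves used to judge equilibration.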

Potential B cell epitope prediction

The Kolaskar and Tongaonkar method [38] from the Immune Epitope Database (IEDB) (http://tools.immuneepitope.org/bcell/) was used to predict linear B-cell epitopes. This method predicts epitopes with around 75% accuracy. Furthermore, the properties of the B-cell epitopes were predicted using IEDB, including Emini surface accessibility [39], Parker hydrophilicity [40], Karplus and Schulz flexibility [41] and BepiPred linear epitope prediction [42]. Because the antigenic parts of proteins tend to be confined to beta-turn regions [43], these regions were predicted with the Chou and Fasman beta-turn prediction tool (http://tools.iedb.org/bcell/) [44].

Ab-initio modeling and refinement

To predict the three-dimensional (3D) structures of the proteins from which the best epitopes were derived, the I-TASSER server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) [45] was used. The protein sequences were given as input in FASTA format, and the output comprised secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, and structure-based functional annotations for enzyme classification, gene ontology terms and protein-ligand binding sites. All predictions were associated with a confidence score that indicates their accuracy.

Protein–Ligand docking

Molecular docking analysis was performed to assess the binding of existing drugs for M. pneumoniae infection to the proteins from which the best epitopes were predicted, which may be potential druggable targets. Azithromycin, telithromycin, tetracycline, doxycycline and roxithromycin, antibiotics of the macrolide and tetracycline classes, were used in the study. Protein–ligand docking was performed using the CLUSPRO server [35]. PDB coordinates for the drugs were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov/), energy minimization was performed with the MOPAC server (http://www.openmopac.net), and the docked complexes were visualized with PyMOL software [46].

Results

Structure prediction and analysis of antigenic protein

The choice of immunogen or epitope is the initial step in effective vaccine design; hence, to find the most probable antigenic proteins, the protein sequences of M. pneumoniae were retrieved for dataset construction. A total of 12 membrane-associated proteins from the study reported by Gupta et al. and 5 cytadherence proteins from the UniProtKB database were retrieved and are listed in Table 1. The antigenicity of each protein was determined from the overall score generated by the VaxiJen server [30]. The scores of the 12 membrane-associated proteins and 5 cytadherence proteins from M. pneumoniae indicated their potential to induce an immune response. The putative adhesin P1-like protein (UniprotKB ID: P75491) and the uncharacterized lipoprotein MG186 homolog (UniprotKB ID: P75265) showed the highest antigenic scores, 0.9674 and 0.9342, respectively. All the query sequences were categorized as non-allergens by the AllerHunter server [31]. Table 2 summarizes the antigenic scores and allergenicity assessment of all the selected proteins. All the antigenic proteins were analyzed for secondary structural features such as alpha helices, beta sheets, beta turns and random coils, which are presented in Table 3.
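
The ranking step can be illustrated with the five antigenicity scores quoted in this paper (two above, three in the Discussion). The sketch below is not part of the original workflow; in particular, the cutoff of 0.4 is an assumption made for the example, since the text only states that preset parameters were used.

```python
# Antigenicity scores quoted in this paper, keyed by UniProtKB accession.
ANTIGENICITY = {
    "P75491": 0.9674,
    "P75265": 0.9342,
    "P75588": 0.6274,
    "Q50327": 0.6902,
    "P75330": 0.7167,
}

# ASSUMPTION: illustrative cutoff; the paper does not state the value used.
THRESHOLD = 0.4


def rank_antigens(scores, threshold=THRESHOLD):
    """Return (accession, score) pairs at or above threshold, highest first."""
    kept = [(acc, score) for acc, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


ranked = rank_antigens(ANTIGENICITY)
```

With this cutoff all five proteins are retained, which is consistent with the statement that every protein in the dataset was predicted to be antigenic.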

Table 1 Membrane associated and cytadherence proteins of M. pneumoniae
Table 2 Antigenicity and allergenicity prediction
Table 3 Secondary structure prediction for all proteins

Identification of T-cell epitopes

Antigenic peptide vaccines consist of T-cell epitopes. The prime prerequisite for a peptide to act as a T-cell epitope is that it binds to a major histocompatibility complex (MHC) protein in order to promote an immune response. Hence, T-cell epitopes in the 12 membrane-associated proteins and 5 cytadherence proteins of M. pneumoniae were predicted. A total of 113 antigenic peptides with 9-mer core sequences were identified as T-cell epitopes in these proteins using the EpiDOCK server (S1 Table). The server identifies overlapping 9-mers in the given sequence using a docking score-based quantitative matrix (DS-QM) and scores peptide-MHC class II binding; its predictions are characterized by sensitivity, specificity, accuracy and the area under the curve (AUC) of the sensitivity versus 1-specificity plot.
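
A quantitative matrix assigns each residue a contribution at each position of the binding core, and the score of a peptide is the sum of those contributions. The sketch below shows this additive scheme in general terms. It is not the DS-QM used by the server: the three-position matrix, its values, the threshold and the convention that higher scores indicate binders are all assumptions chosen to keep the example small.

```python
def qm_score(peptide, matrix):
    """Sum the position-specific contributions of each residue."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the number of matrix columns")
    return sum(column.get(res, 0.0) for res, column in zip(peptide, matrix))


def predict_binders(sequence, matrix, threshold):
    """Score every overlapping window and keep those at or above threshold."""
    k = len(matrix)
    hits = []
    for i in range(len(sequence) - k + 1):
        core = sequence[i:i + k]
        score = qm_score(core, matrix)
        if score >= threshold:  # ASSUMPTION: higher score means predicted binder
            hits.append((i + 1, core, score))
    return hits


# HYPOTHETICAL 3-position matrix (a real one would have 9 columns per allele).
toy_matrix = [{"L": 1.0, "F": 0.5}, {"L": 0.5}, {"F": 2.0}]
toy_hits = predict_binders("ALLFLLF", toy_matrix, threshold=3.0)
```

In practice one such matrix and one threshold would be needed per allele, which matches the statement in the Methods that a threshold is assigned to each HLA molecule.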

Superior immune response predominantly depends on successful recognition of the epitopes by HLA molecules with substantial affinity. A reliable T-cell epitope should interact with maximum number of HLA alleles in order to induce strong immune response. Therefore, the best peptides from each protein with highest number of binding HLA-DP, HLA-DQ and HLA-DR alleles were selected as putative T-cell epitope candidates. The details of predicted putative T-cell epitopes along with their respective binding HLA-DP, HLA-DQ and HLA-DR alleles are shown in Table 4.
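
The selection rule in this paragraph (choose the peptide that binds the largest number of alleles) can be sketched as below. The peptide and allele names are placeholders rather than data from Table 4; the panel size of 23 alleles is the figure given in the Methods, and the function names are hypothetical.

```python
TOTAL_ALLELES = 23  # size of the MHC class II allele panel stated in the Methods


def allele_coverage(n_bound, total=TOTAL_ALLELES):
    """Fraction of the allele panel bound by a peptide."""
    if not 0 <= n_bound <= total:
        raise ValueError("n_bound must lie between 0 and total")
    return n_bound / total


def best_peptide(binders):
    """Pick the peptide binding the most alleles; ties go to the alphabetically first."""
    return min(binders, key=lambda pep: (-len(binders[pep]), pep))


# HYPOTHETICAL input: placeholder peptides mapped to placeholder allele labels.
example_binders = {
    "peptide_a": {f"allele_{i:02d}" for i in range(17)},
    "peptide_b": {f"allele_{i:02d}" for i in range(9)},
}
winner = best_peptide(example_binders)
winner_coverage = allele_coverage(len(example_binders[winner]))
```

A peptide binding 17 of 23 alleles covers roughly 74% of the panel, and one binding 16 of 23 covers roughly 70%.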

Table 4 Putative T-cell epitopes with interacting MHC class II molecules

Structure prediction and validation of putative T-cell epitopes

The 3D structures of WIHGLILLF epitope (Fig. 1), VILLFLLLF epitope (Fig. 2), and LLAWMLVLF (Fig. 3) were predicted and their scrambled versions were generated through the PEP-FOLD de novo modeling server and validated using RAMPAGE web-based tool.

Fig. 1 Model and validation of peptide WIHGLILLF

Fig. 2 Model and validation of peptide VILLFLLLF

Fig. 3 Model and validation of peptide LLAWMLVLF

Through RAMPAGE analysis, the percentage of amino acid residues in the favored region was obtained for each epitope and is displayed in Table 5. The epitopes FKTNLNLAF, KITVYKLIF, WIHGLILLF, LLAFVILLF, VILLFLLLF, LLAFVILLF, LLDAAIPAF, FAVLEILLF, SAVLELPLF, FAILEIPLF, VKQMALDAF, QDQQIKDHF, IQQNQKPDF, PPRRKLKLF, LLAWMLVLF, KDKLDNAIF, NLQIMKQNF and DKERAKLAK had more than 75% of their residues in the favored region and were therefore subjected to docking studies with HLA molecules.

Table 5 Validation of predicted T cell epitopes by RAMPAGE analysis

Binding energy determination of HLA-epitope interaction

Using the CLUSPRO server, binding models of the predicted T cell epitopes with their respective HLA-DP, -DQ and -DR molecules were generated; the docking scores are displayed in Table 6. The predicted epitopes WIHGLILLF with HLA-DP (Fig. 4), VILLFLLLF with HLA-DQ (Fig. 6) and LLAWMLVLF with HLA-DR (Fig. 8) showed binding energies of −915.8 kcal/mol, −982.1 kcal/mol and −893 kcal/mol, respectively. The 2D structure representations and LigPlot analysis (Figs. 5, 7, 9) depict their hydrophobic interactions.

Table 6 Docking score of epitopes with HLA-DP, -DQ and -DR
Fig. 4 Docking of peptide (WIHGLILLF) derived from protein (Uniprot ID: P75588) with HLA-DP

Fig. 5 2D structure representation and hydrophobic interaction of HLA-DP with the peptide WIHGLILLF

Fig. 6 Docking of peptide (VILLFLLLF) derived from protein (Uniprot ID: Q50327) with HLA-DQ

Fig. 7 2D structure representation and hydrophobic interaction of HLA-DQ with the peptide VILLFLLLF

Fig. 8 Docking of peptide (LLAWMLVLF) derived from protein (Uniprot ID: P75330) with HLA-DR

Fig. 9 2D structure representation and hydrophobic interaction of HLA-DR with the peptide LLAWMLVLF

Molecular dynamics simulation of protein and protein-peptide complex

The dynamic behavior of the HLA proteins (DP, DQ and DR) and of the HLA-peptide complexes was studied using molecular dynamics simulation. We analyzed the RMSD and total energy of the native proteins and of the protein-peptide complexes. Figure 10 shows that the RMSD of HLA-DP equilibrated after 5 ns at 0.15 nm, and the protein-peptide complex likewise equilibrated at 5 ns with an RMSD of 0.15 nm. Figure 11 shows that the RMSD of HLA-DQ equilibrated after 12.5 ns at 0.16 nm, while the protein-peptide complex equilibrated at 12.5 ns with an RMSD of 0.17 nm. The RMSD of HLA-DR equilibrated at 6 ns with a value of 0.17 nm, while the protein-peptide complex equilibrated at 5.5 ns with an RMSD of 0.157 nm (Fig. 12).

Fig. 10 Analysis of backbone RMSD of the HLA-DP protein and the protein-peptide complex (epitope and HLA-DP). Color coding: HLA-DP (orange) and protein-peptide complex (turquoise blue). (Color figure online)

Fig. 11 Analysis of backbone RMSD of the HLA-DQ protein and the protein-peptide complex (epitope and HLA-DQ). Color coding: HLA-DQ (orange) and protein-peptide complex (turquoise blue). (Color figure online)

Fig. 12 Analysis of backbone RMSD of the HLA-DR protein and the protein-peptide complex (epitope and HLA-DR). Color coding: HLA-DR (orange) and protein-peptide complex (turquoise blue). (Color figure online)

The total energy of the protein-peptide complex was higher than that of the native protein for HLA-DP-WIHGLILLF (Fig. 13), whereas the total energy of the HLA-DQ-VILLFLLLF complex was lower than that of the HLA-DQ protein (Fig. 14). The HLA-DR-LLAWMLVLF complex showed higher energy than the HLA-DR protein (Fig. 15).

Fig. 13 Total energy of the HLA-DP protein and the protein-peptide complex (epitope and HLA-DP). Color coding: HLA-DP (orange) and protein-peptide complex (turquoise blue). (Color figure online)

Fig. 14 Total energy of the HLA-DQ protein and the protein-peptide complex (epitope and HLA-DQ). Color coding: HLA-DQ (orange) and protein-peptide complex (turquoise blue). (Color figure online)

Fig. 15 Total energy of the HLA-DR protein and the protein-peptide complex (epitope and HLA-DR). Color coding: HLA-DR (orange) and protein-peptide complex (turquoise blue). (Color figure online)

Identification of B-cell epitopes

Prediction and identification of B-cell epitopes in the target antigens is one of the main steps in epitope-driven vaccine design. Putative B-cell epitopes have various features which are essential for the successful recognition by B-cells. These characteristics include hydrophilicity, surface accessibility and beta turn prediction. Thus, to obtain B-cell epitope candidates in the predicted putative T-cell epitopes of proteins P2, P3, and P30, in silico identification of the B-cell epitopes based on the IEDB database was carried out.

The Kolaskar and Tongaonkar antigenicity prediction tool evaluated the proteins for B cell epitopes on the basis of the physicochemical properties of amino acids and their abundance in known B cell epitopes. The average antigenic propensity for the predicted epitopes of the P2 protein was 1.037 (minimum 0.873, maximum 1.285), that of the P3 protein was 1.048 (minimum 0.925, maximum 1.235), and that of the P30 protein was 1.005 (minimum 0.889, maximum 1.188) (Fig. 16).
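
Scale-based predictors of this kind average a per-residue propensity over a sliding window and flag the positions whose averaged score exceeds a threshold, often the mean of the profile. The sketch below illustrates that procedure and how the average, minimum and maximum are obtained. The three-letter scale is a made-up placeholder, not the published Kolaskar and Tongaonkar scale, and the window size and the use of the profile mean as the threshold are assumptions for the example.

```python
def window_profile(sequence, scale, window=7):
    """Average a per-residue scale over a sliding window.

    Returns (center_position, score) pairs with 1-based positions.
    """
    half = window // 2
    values = [scale[res] for res in sequence]
    profile = []
    for start in range(len(values) - window + 1):
        score = sum(values[start:start + window]) / window
        profile.append((start + half + 1, score))
    return profile


def summarize(profile):
    """Average, minimum and maximum of the windowed scores."""
    scores = [score for _, score in profile]
    return {
        "average": sum(scores) / len(scores),
        "minimum": min(scores),
        "maximum": max(scores),
    }


def positions_above(profile, threshold):
    """Center positions whose windowed score exceeds the threshold."""
    return [pos for pos, score in profile if score > threshold]


# HYPOTHETICAL scale and sequence, chosen only to make the arithmetic easy.
TOY_SCALE = {"A": 1.0, "L": 1.5, "K": 0.5}
profile = window_profile("AAALLLKKK", TOY_SCALE, window=3)
stats = summarize(profile)
antigenic_positions = positions_above(profile, stats["average"])
```

The same windowing logic applies to the hydrophilicity, flexibility and beta-turn scales used later in this section; only the per-residue scale changes.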

Fig. 16 Kolaskar and Tongaonkar antigenicity prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), threshold 1.037; b P3 protein (Uniprot ID: Q50327), threshold 1.048; c P30 protein (Uniprot ID: P75330), threshold 1.029. The X-axis and Y-axis denote the sequence position and antigenic propensity, respectively. The regions above the threshold value are antigenic and are shown in yellow. (Color figure online)

Surface accessibility and hydrophilicity are also key features for predicting B-cell epitopes. The Emini surface accessibility and Parker hydrophilicity prediction tools were used. The results of the Emini surface accessibility tool for the P2, P3 and P30 proteins are shown in Fig. 17, and the results of the Parker hydrophilicity tool for the same proteins are shown in Fig. 18.

Fig. 17 Emini surface accessibility prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), b P3 protein (Uniprot ID: Q50327), c P30 protein (Uniprot ID: P75330). The X-axis and Y-axis denote the sequence position and surface probability, respectively. The threshold is 1.0. The regions above the threshold value are antigenic and are shown in yellow. (Color figure online)

Fig. 18 Parker hydrophilicity prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), threshold 1.297; b P3 protein (Uniprot ID: Q50327), threshold 0.317; c P30 protein (Uniprot ID: P75330), threshold 1.121. The X-axis and Y-axis denote the position and score, respectively. The regions above the threshold value, which have beta turns in the protein, are shown in yellow. (Color figure online)

Beta turns in a protein are commonly surface accessible and hydrophilic in nature. Chou and Fasman beta-turn prediction was performed for the P2, P3 and P30 proteins (Fig. 19) to find beta-turn regions in the query sequences, as beta turns contribute substantially to antigenicity. The results revealed consistently predicted beta-turn regions at residues 345 to 355 of the P2 membrane protein, 113 to 115 of the P3 membrane protein and 250 to 254 of the P30 adhesin protein.

Fig. 19 Chou and Fasman beta-turn prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), threshold 0.984; b P3 protein (Uniprot ID: Q50327), threshold 0.826; c P30 protein (Uniprot ID: P75330), threshold 0.740. The X-axis and Y-axis denote the position and score, respectively. The regions above the threshold value, which have beta turns in the protein, are shown in yellow. (Color figure online)

Experimental data indicate that the region of a peptide that interacts with an antibody tends to be flexible. Hence, the Karplus and Schulz flexibility prediction tool was used to identify the flexible regions of all the query proteins. The regions from 345 to 350 of the P2 membrane protein, 116 to 122 of the P3 membrane protein and 122 to 124 of the P30 adhesin protein (Fig. 20) were the most favorable in the flexibility prediction analysis.

Fig. 20 Karplus and Schulz flexibility prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), threshold 1.004; b P3 protein (Uniprot ID: Q50327), threshold 0.989; c P30 protein (Uniprot ID: P75330), threshold 1.005. The X-axis and Y-axis denote the position and score, respectively. The regions above the threshold value are the flexible regions of the protein and are shown in yellow. (Color figure online)

The BepiPred tool, which is based on a hidden Markov model, was used to determine linear B cell epitopes. It was included because a single-scale amino acid propensity profile cannot consistently predict antigenic epitopes, and because it gives better results than single-scale epitope prediction tools as assessed by receiver operating characteristic (ROC) plots. The BepiPred-predicted epitopes for all the proteins are depicted in Fig. 21.

Fig. 21 Bepipred linear epitope prediction of the best antigenic proteins. a P2 protein (Uniprot ID: P75588), threshold 0.376; b P3 protein (Uniprot ID: Q50327), threshold 0.578; c P30 protein (Uniprot ID: P75330), threshold 0.740. The X-axis and Y-axis denote the position and score, respectively. The regions having beta turns are shown in yellow. The highest peak region depicts the most potent B-cell epitope.

Subsequently, by cross-referencing the data obtained from the B-cell epitope prediction tools above, the regions from 345 to 355 of the P2 membrane protein, 115 to 122 of the P3 membrane protein and 122 to 139 of the P30 adhesin protein were found to be the regions most capable of inducing a B-cell response.
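
Cross-referencing regions from several predictors amounts to comparing residue intervals. The text does not state the exact rule that was applied, so the sketch below simply shows two common options, the overlap of two intervals and the envelope that spans both, using the beta-turn and flexibility regions reported above for the P2 and P3 proteins. The function names are hypothetical.

```python
def overlap(a, b):
    """Intersection of two inclusive residue intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def envelope(a, b):
    """Smallest inclusive interval that spans both input intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


# Regions reported in this section (inclusive residue numbers).
P2 = {"beta_turn": (345, 355), "flexibility": (345, 350)}
P3 = {"beta_turn": (113, 115), "flexibility": (116, 122)}

p2_overlap = overlap(P2["beta_turn"], P2["flexibility"])
p2_envelope = envelope(P2["beta_turn"], P2["flexibility"])
p3_overlap = overlap(P3["beta_turn"], P3["flexibility"])
p3_envelope = envelope(P3["beta_turn"], P3["flexibility"])
```

For P2 the envelope reproduces the 345 to 355 region reported above, whereas for P3 the two predictions are adjacent rather than overlapping, so the final region presumably also reflects the other predictors.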

Model building and refinement

Models of the three proteins (P75588, Q50327 and P75330) were generated by I-TASSER and validated by Ramachandran plot analysis. The modelled proteins were then docked with the available therapeutics (antibiotics) used in the treatment of M. pneumoniae infection, for comparison with the binding efficiency of the nonameric peptides.

Protein–Ligand docking analysis

The ligand information for the docking analysis is shown in Table 7. The proteins from which the best MHC class II binders were derived were docked with existing drugs to determine binding affinity (Table 8). Clarithromycin, azithromycin, doxycycline, telithromycin and tetracycline were docked against P75588, Q50327 and P75330. The binding energy of the nonameric peptide derived from P75588 was −915.8 kcal/mol, while the conventional drugs docked against this disease-inducing outer membrane protein showed binding energies in the range of −110.3 kcal/mol to −127.6 kcal/mol. The nonameric peptide derived from Q50327, ATP synthase subunit b, exhibited a binding energy of −982.1 kcal/mol with HLA-DQ, while the drugs showed binding energies in the range of −101.1 kcal/mol to −121.7 kcal/mol. Similarly, the epitope derived from the P30 adhesin protein showed a binding energy of −893 kcal/mol, while the drugs exhibited binding energies in the range of −105.5 kcal/mol to −149.8 kcal/mol. The docking scores indicate that the existing drugs used to treat the infection have lower binding affinity for the antigenic outer membrane proteins, whereas the peptides derived from these antigenic proteins exhibit better binding affinity with HLA, which supports the view that nonameric peptides could be used as effective therapeutic agents for the treatment of disease caused by M. pneumoniae.

Table 7 Ligand information for docking analysis and protein–ligand interaction
Table 8 Docking score for protein–ligand interaction

Discussions

The prevalence of bacterial infections such as pneumonia, meningitis and tuberculosis points to an urgent need for improved and rapidly deployable vaccines. For these pathogens, the immunity associated with protection remains largely unknown. These gaps in the understanding of protective immunity make vaccine development for newly emerging infectious diseases both more crucial and more challenging [47].

The present study aimed to screen and investigate the best antigenic proteins of M. pneumoniae and to identify the T-cell and B-cell epitopes located on the most antigenic proteins. An immunoinformatics-driven approach was used to screen for dominant immunogens against M. pneumoniae. The results showed that all the membrane-associated and cytadherence proteins were antigenic, with high antigenicity scores. Moreover, most studies of pneumonia vaccines have focused on the membrane-associated and cytadherence proteins of M. pneumoniae [29, 48, 49], because attachment and the stimulation of local damage at the cellular level are recognized as responsible for M. pneumoniae disease. Thus, the logical approach is to block attachment and thereby prevent the initiation of disease. Further, T-cell-based cellular immunity is important for eliminating M. pneumoniae infection, yet vaccines against the membrane-associated and cytadherence proteins mainly elicit a neutralizing antibody response. Notably, a high mutation rate in the membrane-associated and cytadherence proteins of M. pneumoniae may allow escape from neutralizing antibodies against these proteins. Thus, an ideal target should be highly conserved and should elicit both cellular immunity and neutralizing antibodies against M. pneumoniae, which is important for effective pneumonia vaccine development. The membrane-associated and cytadherence proteins of M. pneumoniae are abundantly produced during disease progression and exhibit strong conservation and immunogenicity, and can therefore act as immunogens that elicit both humoral and cell-mediated immune responses [29, 50].

A T-cell epitope is considered strong and effective if it is well conserved among the selected proteins. The selected epitopes WIHGLILLF, VILLFLLLF and LLAWMLVLF were validated on the basis of the docking score, which predicts whether a nonameric peptide sequence binds to class II MHC. WIHGLILLF and VILLFLLLF interact with 17 of the 23 MHC class II alleles, while LLAWMLVLF interacts with 16 of the 23 alleles. Similar data from another study indicate that such high binding affinity is desirable, because the efficiency of an epitope vaccine relies significantly on the specific interaction between the epitope and HLA alleles [51]. In addition, the selected epitopes were predicted to be non-allergens, an essential characteristic for a vaccine. These results imply that the vaccine has potential for the management of the disease.

The epitopes were docked to evaluate their binding efficacy with MHC class II alleles. The results revealed that the WIHGLILLF epitope had the highest binding affinity with HLA-DP, the VILLFLLLF epitope with HLA-DQ, and the LLAWMLVLF epitope with HLA-DR. This computational analysis supports the predicted binding affinity of the epitopes for MHC-II molecules.

The best selected membrane-associated and cytadherence proteins were also searched for B-cell epitopes, because B-cell epitopes can activate both primary and secondary immunity. Several tools from the IEDB database were used to examine the proteins for the key characteristics of B-cell epitopes. We then cross-referenced all the data: residues 345 to 355 of the P2 membrane protein, 115 to 120 of the P3 membrane protein, and 135 to 139 of the P30 adhesin protein were predicted to be putative B-cell epitopes by the BepiPred tool. These regions are flexible and contain beta turns. They are also more hydrophilic and surface accessible than other regions, and were predicted to be antigenic. According to the predicted results, the 9-mer epitopes WIHGLILLF from the P2 adhesin protein, VILLFLLLF from the P3 adhesin protein, and LLAWMLVLF from the P30 adhesin protein are the most favorable B-cell epitopes.

According to VaxiJen and AllerHunter, all the proteins were eligible targets for vaccine development because of their high antigenicity and low or null allergenicity. The proteins from which the three potent T-cell epitopes WIHGLILLF, VILLFLLLF and LLAWMLVLF were derived showed antigenicity scores of 0.6274, 0.6902 and 0.7167, respectively, and all of these proteins were classified as non-allergens by AllerHunter. Secondary structure analysis of the proteins with SOPMA provided further insight: the abundance of alpha helices and coiled regions in the SOPMA results implies greater stability and conservancy of the proteins.

The three-dimensional structures of the epitopes were predicted and validated to assess the stereochemical quality of the predicted models. RAMPAGE analysis of the potent T-cell epitopes gave the percentage of residues in the favoured region. The T-cell epitopes generated from the membrane and cytadherence proteins that showed more than 75% of residues in the favoured region were then docked with HLA-DP, HLA-DQ and HLA-DR using ClusPro. Epitopes WIHGLILLF, VILLFLLLF and LLAWMLVLF showed binding energy values of -915.8 kcal/mol, -982.1 kcal/mol and -893 kcal/mol with HLA-DP, HLA-DQ and HLA-DR, respectively. These values support a significant interaction between HLA and the predicted T-cell epitopes. The proteins from which the most potent T-cell epitopes were derived were also docked with existing macrolide-class drugs, and a lower binding affinity with HLA was found, which further supports the therapeutic potential of the novel T-cell epitopes derived from the membrane-associated proteins of M. pneumoniae.
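The selection logic described above (keep models with more than 75% of residues in the favoured region, then compare docking energies) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' workflow: the helper names `passes_rampage` and `rank_by_energy` are hypothetical, and the per-epitope favoured-region percentages are not reproduced because they are not listed in this section. Only the three docking energies and the 75% cutoff come from the text.

```python
# Docking energies (kcal/mol) reported above, each paired with the HLA locus
# the epitope was docked against.
docking = {
    "WIHGLILLF": ("HLA-DP", -915.8),
    "VILLFLLLF": ("HLA-DQ", -982.1),
    "LLAWMLVLF": ("HLA-DR", -893.0),
}

# Cutoff stated in the text: more than 75% of residues in the favoured region.
FAVOURED_THRESHOLD = 75.0


def passes_rampage(favoured_pct: float, threshold: float = FAVOURED_THRESHOLD) -> bool:
    """Return True if a model exceeds the favoured-region cutoff (hypothetical helper)."""
    return favoured_pct > threshold


def rank_by_energy(results: dict) -> list:
    """Sort epitopes from most negative (strongest predicted binding) to least negative."""
    return sorted(results.items(), key=lambda item: item[1][1])


ranked = rank_by_energy(docking)

for epitope, (locus, energy) in ranked:
    print(f"{epitope} with {locus}: {energy} kcal/mol")
```

Sorting by energy places VILLFLLLF first, followed by WIHGLILLF and LLAWMLVLF. The comparison is across different HLA loci, so it is indicative only.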

MD simulation studies were performed to analyze the stability of the bound protein-peptide complexes. Higher energy was observed for the HLA-DP and HLA-DR protein-peptide complexes than for the native protein, which implies a stable interaction between the HLA molecule and the bound peptide.

Conclusion

In conclusion, the present study demonstrates the application of immunoinformatics to predict pathogen epitopes that bind HLA alleles as a potential strategy to hasten vaccine development. Using this approach, T-cell and B-cell epitopes in the membrane-associated and cytadherence proteins of M. pneumoniae were screened and identified as promising vaccine candidates. However, the T-cell and B-cell stimulation potential of the selected epitopes needs to be tested with extensive laboratory-based techniques, alongside this computational study, before they can be used effectively as vaccines against M. pneumoniae. Nonetheless, these findings not only provide novel and valuable epitope candidates but also pave the way for developing vaccines against mycoplasmal pneumonia and its associated clinical manifestations.