Introduction

Hendra virus (HeV) first emerged in 1994 in an outbreak of acute respiratory disease among horses and humans in Australia (Murray et al. 1995). The human patients were found to have been infected through close contact with diseased horses. The initial cases of Hendra infection were clinically diagnosed as severe respiratory infection (Selvey et al. 1995; Hanna et al. 2006; O’Sullivan et al. 1997; Playford et al. 2010). There is no specific treatment for human cases of Hendra virus infection. Flying foxes (pteropid bats) are considered the natural reservoir of HeV. Transmission to humans likely occurred through exposure of mucous membranes or non-intact skin to the nasal secretions, urine and blood of infected horses (Mire et al. 2015). HeV was primarily isolated from the infected uterine fluid and fetal tissue of the bat species Pteropus poliocephalus and Pteropus alecto, respectively (Halpin et al. 2000). Equivac HeV, a subunit vaccine based on the HeV glycoprotein, is the only vaccine approved and licensed by the Australian government, and it is used to prevent HeV infection in the horse population.

Peptide vaccines are considered an alternative to classical vaccines, addressing the possible side effects associated with vaccination using a heterogeneous multicomponent preparation. Peptide-based vaccines rely on chemical synthesis of selected epitopes that are specific and trigger immune responses. Immunoinformatics focuses mainly on the prediction of potential epitopes, which reduces the cost and time of laboratory analysis of potential vaccine candidates. Using immunoinformatics tools, immunologists can screen and analyze short sequences from full-length foreign proteins that can act as immunogenic epitopes and thus serve as vaccine candidates (Li et al. 2014; Kamthania and Sharma 2016). The aim of this study is to screen promiscuous T-cell epitopes from the HeV proteins, viz. matrix, glycoprotein, nucleocapsid, fusion, C protein, V protein, W protein and polymerase. These predicted promiscuous T-cell epitopes may be promising targets for epitope-based vaccine design against HeV.

Materials and Methods

Retrieval of Amino Acid Sequence

The amino acid sequences of the matrix, glycoprotein, nucleocapsid, fusion, C protein, V protein, W protein and polymerase were retrieved from the NCBI protein sequence database (http://www.ncbi.nlm.nih.gov/protein). A total of 22 nucleocapsid, 20 matrix, 22 fusion, 22 glycoprotein, 14 W protein, 17 V protein, 15 C protein and 15 polymerase protein sequences from different HeV strains available in the NCBI database were downloaded in FASTA format.
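As a rough illustration of how downloaded FASTA files might be read and tallied, the sketch below parses FASTA-formatted text into records. The function name `parse_fasta` and the two example records are hypothetical placeholders, not sequences from this study.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Placeholder input for illustration only (not real HeV sequences).
example_text = ">record_1 hypothetical protein\nMKVLLT\nAGH\n>record_2\nAAA\n"
for name, seq in parse_fasta(example_text):
    print(name, len(seq))
```

Counting the records returned for each protein file would give per-protein totals of the kind reported above.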

Antigenicity Prediction

The VaxiJen server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) (Doytchinova and Flower 2007) was used with default parameters to predict the antigenicity of the HeV candidate proteins. The VaxiJen threshold was set to 0.4 (the default for viruses). Viral proteins with a VaxiJen score above 0.4 were considered antigenic and were chosen for further study.

Promiscuous T-Cell Epitope Prediction

Promiscuous T-cell epitopes binding to HLA class II alleles were predicted with ProPred (Singh and Raghava 2001). ProPred is an online tool that uses matrix-based prediction of HLA class II binding sites in an antigenic protein sequence.
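The idea behind matrix-based prediction can be sketched as a sliding nine-residue window scored against a position-specific weight matrix for each allele. The sketch below is only an illustration under stated assumptions: the two matrices, the allele labels, the test sequence and the raw-score cutoff are all hypothetical, and ProPred's own matrices and percentage thresholds are not reproduced here.

```python
def score_nonamers(sequence, matrix, default=0.0):
    """Score every 9-residue window of `sequence`.

    `matrix` maps position (0..8) to a {residue: weight} dict; residues
    missing from the matrix contribute `default`.
    """
    results = []
    for start in range(len(sequence) - 8):
        window = sequence[start:start + 9]
        score = sum(matrix.get(pos, {}).get(res, default)
                    for pos, res in enumerate(window))
        results.append((start, window, score))
    return results


def promiscuous_binders(sequence, allele_matrices, cutoff, min_alleles=2):
    """Return {peptide: set of alleles} for peptides scoring >= cutoff
    with at least `min_alleles` different alleles."""
    hits = {}
    for allele, matrix in allele_matrices.items():
        for _, window, score in score_nonamers(sequence, matrix):
            if score >= cutoff:
                hits.setdefault(window, set()).add(allele)
    return {pep: alleles for pep, alleles in hits.items()
            if len(alleles) >= min_alleles}


# Toy matrices (hypothetical weights, not ProPred data).
toy_matrices = {
    "allele_X": {0: {"V": 2.0}, 8: {"S": 1.0}},
    "allele_Y": {0: {"V": 1.5}, 3: {"A": 1.5}},
}
print(promiscuous_binders("MVRRAGKYYSL", toy_matrices, cutoff=2.5))
```

A peptide is called promiscuous here simply because it passes the cutoff for more than one allele; the value of `min_alleles` is an illustrative choice.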

3D Structure Prediction of Promiscuous T-cell Epitopes

PEPstrMOD (Singh et al. 2015) was used to predict the tertiary structure of the selected epitopes, each nine residues long. The PEPstrMOD prediction strategy uses secondary structure information predicted by PSIPRED (Jones et al. 1999) and β-turn information predicted by BetaTurns (Kaur and Raghava 2004).

Modeling of HLA Class II Alleles

The sequences of the alleles, viz. DRB1*1304, DRB1*0804, DRB1*0405, DRB1*0806, DRB1*0402, DRB1*0701, DRB1*1301, DRB1*1104, DRB1*1102, DRB1*0410, DRB1*1128, DRB1*1101, DRB1*0301 and DRB1*0817, were downloaded from the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/allele.html) (Robinson et al. 2012). The BLASTP program (Altschul et al. 1990) was used to find templates for these allele sequences, and the respective PDB IDs (Table 1) were retrieved from the Protein Data Bank (http://www.rcsb.org/pdb). MODELLER 9.17 (Šali et al. 1995) was used for homology modeling of the HLA alleles. Five models were generated with MODELLER, and the best model was chosen based on the lowest Discrete Optimized Protein Energy (DOPE) score. The overall quality of the modeled alleles was assessed by Ramachandran plot analysis using PROCHECK (Laskowski et al. 1993), which evaluates the backbone geometry of each amino acid residue against stereochemical parameters.

Table 1 Template PDB ID for modeling of selected HLA class II alleles

Molecular Docking

AutoDock 4.2 (Morris et al. 1998) was used to dock the predicted T-cell epitopes to the HLA alleles. Docking was performed with a rigid protein and a flexible ligand using the Lamarckian Genetic Algorithm (LGA). The best conformation of each docked complex was chosen on the basis of minimum binding energy and the best fit of the epitope-HLA allele complex, with the highest number of hydrogen bonds formed. Python Molecular Viewer (Sanner et al. 1999) was used to visualize the docked complexes of the HLA alleles with the predicted T-cell epitopes.
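The selection rule described above can be sketched as a ranking over candidate conformations. This is a minimal sketch, assuming that the lowest binding energy takes priority and that the hydrogen bond count is used only to break ties; the text does not state how the two criteria were weighted. The `Pose` type and the example values are hypothetical and are not AutoDock output.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    name: str
    binding_energy: float  # kcal/mol; more negative means stronger binding
    h_bonds: int           # number of hydrogen bonds in the complex


def best_pose(poses):
    """Pick the pose with the lowest binding energy; among equal energies,
    prefer the one with more hydrogen bonds (assumed tie-break)."""
    if not poses:
        raise ValueError("no poses supplied")
    return min(poses, key=lambda p: (p.binding_energy, -p.h_bonds))


# Hypothetical conformations for illustration only.
example_poses = [
    Pose("pose_1", -1.2, 1),
    Pose("pose_2", -2.0, 1),
    Pose("pose_3", -2.0, 3),
]
print(best_pose(example_poses).name)
```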

Docked Complex Stability Validation by MD Simulation

NAMD was used for MD simulation of the selected docked complexes (Phillips et al. 2005). VMD was used to analyze and view the MD simulation results and to interface with NAMD (Humphrey et al. 1996). The protein structure file (PSF) was generated from the topology files and the initial PDB files of the HLA class II allele-epitope docked complex using the psfgen package of VMD. NAMD produced the trajectory (DCD) file. The simulation results were analyzed by calculating the root mean square deviation (RMSD) of the docked complex. The RMSD values, written to the rmsd.dat file, were further analyzed in Microsoft Office Excel.
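For reference, the RMSD of a frame relative to a reference structure is the square root of the mean squared displacement of its atoms. The sketch below computes it for a list of frames; it assumes that the frames have already been aligned to the reference and that atoms appear in the same order, and it does not read DCD files or call NAMD or VMD. The coordinates are hypothetical.

```python
import math


def rmsd(coords, ref):
    """RMSD (same units as the input) between two equally ordered
    lists of (x, y, z) atom coordinates."""
    if len(coords) != len(ref) or not coords:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
                  for (x, y, z), (rx, ry, rz) in zip(coords, ref))
    return math.sqrt(squared / len(coords))


def rmsd_series(frames, ref=None):
    """RMSD of every frame against `ref` (defaults to the first frame)."""
    ref = frames[0] if ref is None else ref
    return [rmsd(frame, ref) for frame in frames]


# Two hypothetical frames of a two-atom system, in angstroms.
frame_0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
series = rmsd_series([frame_0, frame_1])
print(series, max(series))
```

The maximum of such a series corresponds to the "highest peak" of an RMSD plot.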

Toxicity Prediction of the Selected T-Cell Epitopes

ToxinPred (http://crdd.osdd.net/raghava/toxinpred/) (Gupta et al. 2013) was used to predict the toxicity of the predicted T-cell epitopes. ToxinPred is an in silico method for classifying peptides as toxic or non-toxic. ToxinPred was run with default parameters, and only non-toxic T-cell epitopes were selected for further study.

HLA-Distribution Analysis

MHCPred (http://www.ddg-pharmfac.net/mhcpred/MHCPred/) was used to select high-affinity HLA class II binders for the selected epitopes, together with their IC50 (half maximal inhibitory concentration) values. Alleles with IC50 values between 0.01 and 500 nM were selected.
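The IC50 window can be applied as a simple filter over predicted epitope-allele pairs. The sketch below assumes that both bounds are inclusive, which the text does not specify; the peptide labels, allele labels and IC50 values are hypothetical placeholders, not MHCPred output.

```python
def high_affinity_binders(predictions, low=0.01, high=500.0):
    """Group alleles by epitope, keeping pairs with low <= IC50 <= high (nM).

    `predictions` is an iterable of (epitope, allele, ic50_nM) tuples.
    """
    selected = {}
    for epitope, allele, ic50 in predictions:
        if low <= ic50 <= high:
            selected.setdefault(epitope, []).append(allele)
    return selected


# Hypothetical predictions for illustration only.
example_predictions = [
    ("PEPTIDE_1", "allele_A", 120.0),
    ("PEPTIDE_1", "allele_B", 850.0),   # above the window, dropped
    ("PEPTIDE_2", "allele_A", 0.005),   # below the window, dropped
    ("PEPTIDE_2", "allele_C", 500.0),   # on the upper bound, kept
]
print(high_affinity_binders(example_predictions))
```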

Population Coverage Analysis

The IEDB (Immune Epitope Database and Analysis Resource) population coverage tool (http://tools.immuneepitope.org/tools/population/iedb_input) (Bui et al. 2006) was used to estimate the worldwide population coverage of the selected epitope and HLA class II allele pairs obtained from MHCPred (Table 2). The tool was run with default parameters. The IEDB population coverage tool includes allele frequency data for 3245 alleles across 115 countries, 21 ethnicities and 16 geographical areas. The prediction used the most recent data set from the Allele Frequency Net Database (Gonzalez-Galarza et al. 2010) available at IEDB.
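To give a feel for what a coverage figure means, the sketch below estimates the fraction of individuals carrying at least one epitope-binding allele at a single locus. It is a simplification, assuming Hardy-Weinberg proportions and one locus only; the IEDB tool performs a more complete calculation, and the allele labels and frequencies here are hypothetical.

```python
def single_locus_coverage(allele_freqs):
    """Fraction of a population carrying at least one of the given alleles.

    `allele_freqs` maps allele name -> population frequency (0..1).
    With summed frequency p, a diploid individual carries none of the
    alleles with probability (1 - p)**2 under Hardy-Weinberg proportions.
    """
    if any(freq < 0 for freq in allele_freqs.values()):
        raise ValueError("allele frequencies must be non-negative")
    total = sum(allele_freqs.values())
    if total > 1.0 + 1e-9:
        raise ValueError("allele frequencies at one locus cannot sum above 1")
    return 1.0 - (1.0 - total) ** 2


# Hypothetical frequencies for the alleles bound by one epitope.
example_freqs = {"allele_A": 0.10, "allele_B": 0.20}
print(f"{100 * single_locus_coverage(example_freqs):.1f}% coverage")
```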

Table 2 IC50 values of alleles with their corresponding predicted potential T-cell epitopes IRIFVPATN, MRNLLSQSL, VRRAGKYYS and VRLKCLLCG with an affinity of < 500 nM

Conservancy Analysis of Selected Epitopes

To determine the degree of conservation, each selected epitope was aligned against all of its source protein sequences retrieved from the NCBI database using the EBI Clustal Omega program (Sievers et al. 2011). The multiple sequence alignment (MSA) was visualized using Jalview (Waterhouse et al. 2009). Conservancy of the selected T-cell epitopes was analyzed again with the IEDB conservancy tool (Table 3) (Bui et al. 2007).
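At 100% identity, conservancy reduces to the percentage of source sequences that contain the epitope as an exact substring. The sketch below illustrates only that special case and is not the IEDB implementation, which also supports lower identity thresholds; the toy sequences are hypothetical.

```python
def conservancy(epitope, sequences):
    """Percentage of `sequences` containing `epitope` as an exact match."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(1 for seq in sequences if epitope in seq)
    return 100.0 * hits / len(sequences)


# Toy sequences for illustration only (not HeV proteins).
toy_sequences = [
    "AAVRRAGKYYSAA",  # exact match
    "VRRAGKYYS",      # exact match
    "AAVRRAGKYFSAA",  # one substitution, no exact match
    "CCC",            # unrelated
]
print(conservancy("VRRAGKYYS", toy_sequences))
```

A value of 100.0 would mean that every sequence in the set contains the epitope unchanged.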

Table 3 Conservancy analysis using the Conservation across antigen tool of IEDB, showing that all four chosen epitopes are 100% conserved across the respective protein sequences of all HeV strains retrieved from the NCBI database

Results and Discussion

Antigenicity Prediction

The amino acid sequences of the matrix, glycoprotein, nucleocapsid, fusion, C protein, V protein, W protein and polymerase proteins were screened with VaxiJen. All proteins were predicted to be antigenic except the nonstructural C protein, which was non-antigenic at the threshold of 0.4 (the default threshold for viral proteins) (Table 4). The fusion protein was predicted to be the most antigenic of all candidate proteins, with a score of 0.5529.

Table 4 VaxiJen Results of antigenicity

Promiscuous T-Cell Epitope Selection and Analysis

The antigenic HeV proteins were submitted to ProPred to select T-cell epitopes that bind HLA class II alleles. Epitopes with the highest scores and the largest number of HLA class II allele binders were selected at a threshold of 4% (Table 5).

Table 5 ProPred predicted T-cell epitope for HLA Class II with binding scores

Toxicity Prediction of the Peptide Epitopes

ToxinPred (Gupta et al. 2013) was used to predict the toxicity of the selected T-cell epitopes. ToxinPred is an in silico method based on a Support Vector Machine (SVM) that predicts peptide toxicity along with important physicochemical properties, viz. hydropathicity, hydrophilicity, hydrophobicity, charge and molecular weight. The selected epitopes were submitted to ToxinPred, and only non-toxic T-cell epitopes were retained for further study (Table 6).

Table 6 Toxicity prediction of the peptides by ToxinPred

Structure Modeling of T-cell Epitopes and Alleles

The 3D structures of the selected epitopes were predicted with PEPstrMOD. MODELLER 9.17 was used to generate homology models of the alleles DRB1*1304, DRB1*0804, DRB1*0405, DRB1*0806, DRB1*0402, DRB1*0701, DRB1*1301, DRB1*1104, DRB1*1102, DRB1*0410, DRB1*1128, DRB1*1101, DRB1*0301 and DRB1*0817, using the templates (PDB IDs) 1YMM, 2SEB, 2SEB, 2WBJ, 4MDI, 1AQD, 2WBJ, 2WBJ, 2WBJ, 2SEB, 1A6A, 1YMM, 1A6A and 1A6A, respectively (Table 1). PROCHECK was used to assess the quality of the selected models. The Ramachandran plots of the HLA allele models that formed the best HLA allele-epitope complexes in the docking study (DRB1*0701, DRB1*0301, DRB1*1304 and DRB1*0806; Table 7) are shown in Fig. 1 a–d.

Table 7 Binding energy determination by AutoDock

Fig. 1 Ramachandran plots of the protein models: a DRB1*0701, b DRB1*0301, c DRB1*1304, d DRB1*0806

Binding Energy Determination of Epitope & HLA Class II Allele

The interactions of the selected T-cell epitopes with their respective highest-scoring ProPred HLA class II allele binders were predicted using AutoDock 4.2 (Table 7). Among the docked complexes, the nucleocapsid protein peptide IRIFVPATN docked with the two alleles DRB1*0806 and DRB1*1304, and the nucleocapsid protein peptide MRNLLSQSL docked with the DRB1*0701 allele, formed stable complexes with binding energies of −1.88 kcal/mol, −2.56 kcal/mol and −2.76 kcal/mol, respectively. The matrix protein peptide VRRAGKYYS docked with the DRB1*0301 HLA allele showed a binding energy of −1.01 kcal/mol. The fusion protein peptide VRLKCLLCG docked with the DRB1*0806 HLA allele showed a binding energy of −1.51 kcal/mol. These five docked complexes are shown in Figs. 2, 3, 4, 5 and 6, respectively. The docked complexes were visualized with Python Molecular Viewer.

Fig. 2 Docked complex of nucleocapsid protein peptide IRIFVPATN with DRB1*0806 allele

Fig. 3 Docked complex of nucleocapsid protein peptide IRIFVPATN with DRB1*1304 allele

Fig. 4 Docked complex of nucleocapsid protein peptide MRNLLSQSL and DRB1*0701 allele. Formation of one H-bond with SER123 (OG) and peptide MET1 (N)

Fig. 5 Docked complex of matrix protein peptide VRRAGKYYS and DRB1*0301 allele. Formation of one H-bond with PHE13 (O) and peptide SER9 (N)

Fig. 6 Docked complex of fusion protein peptide VRLKCLLCG with DRB1*0806 allele

Docked Complex Stability Validation & RMSD Plot

The docked complexes of T-cell epitopes and HLA class II alleles with the lowest binding energies were subjected to MD simulation with NAMD. The RMSD plots of the docked complexes IRIFVPATN-DRB1*0806, IRIFVPATN-DRB1*1304, MRNLLSQSL-DRB1*0701, VRRAGKYYS-DRB1*0301 and VRLKCLLCG-DRB1*0806 showed highest peaks at 0.95 Å, 0.98 Å, 0.99 Å, 1.02 Å and 1.01 Å, respectively (Figs. 7 a–c, 8 and 9).

Fig. 7 Graphs displaying MD simulation of the nucleocapsid peptide complexes: a IRIFVPATN with DRB1*0806 allele, with RMSD highest peak at 0.95 Å; b IRIFVPATN with DRB1*1304 allele, with RMSD highest peak at 0.98 Å; c MRNLLSQSL with DRB1*0701 allele, with RMSD highest peak at 0.99 Å

Fig. 8 Graph displaying MD simulation of the matrix peptide VRRAGKYYS and DRB1*0301 allele complex, with RMSD highest peak at 1.02 Å

Fig. 9 Graph displaying MD simulation of the fusion peptide VRLKCLLCG and DRB1*0806 allele complex, with RMSD highest peak at 1.01 Å

Population Coverage Estimation of Predicted T-Cell Epitopes

MHCPred (Guan et al. 2003) was used to predict potential HLA allele binders for the selected T-cell epitopes IRIFVPATN, MRNLLSQSL, VRRAGKYYS and VRLKCLLCG, with a binding affinity (IC50) of ≤ 500 nM (Table 2). The IEDB tool predicted that the four epitopes IRIFVPATN, MRNLLSQSL, VRRAGKYYS and VRLKCLLCG have population coverage of 40.24%, 56.78%, 36.82% and 57.08%, respectively, for the total world population (Table 8). The highest coverage for the epitopes IRIFVPATN, VRRAGKYYS and VRLKCLLCG, at 59.74%, 53.24% and 66.78% respectively, was found in the population of Northeast Asia. The epitope MRNLLSQSL had its highest coverage, 70.28%, in the population of South Asia. These results suggest that the putative T-cell epitopes cover a large share of the population in several geographic regions.

Table 8 Estimated Population coverage of predicted T cell epitopes IRIFVPATN, MRNLLSQSL, VRRAGKYYS and VRLKCLLCG based on MHC-I and MHC-II data using IEDB

Epitope Conservation and Variability Analysis

The degree of conservation of an epitope across protein sequences indicates how well it has been conserved through evolution, and hence its applicability as an epitope-based vaccine candidate against different strains of the infecting organism. The MSA of the best predicted epitopes, IRIFVPATN and MRNLLSQSL from the nucleocapsid, VRRAGKYYS from the matrix and VRLKCLLCG from the fusion protein, showed that the epitopes were well conserved in all of their source protein sequences from the different HeV strains available in the NCBI database. The MSA results were visualized using Jalview (Fig. 10 a–d). The results were further verified with the IEDB epitope conservation analysis tool, which showed that the amino acid sequences of the predicted epitopes were 100% conserved among all NCBI protein sequences of the respective HeV source proteins (Table 3).

Fig. 10 Degree of conservancy from the MSA, showing that the epitopes a IRIFVPATN (nucleocapsid), b MRNLLSQSL (nucleocapsid), c VRRAGKYYS (matrix) and d VRLKCLLCG (fusion) were 100% conserved in all protein sequences obtained from different isolated HeV strains worldwide. The yellow rectangle marks the conserved nonamer epitopes

Conclusion

This study identified potential nonamer T-cell epitopes as vaccine candidates for Hendra virus. The T-cell epitope IRIFVPATN (nucleocapsid) showed considerable binding with the HLA class II alleles DRB1*0806 and DRB1*1304, MRNLLSQSL (nucleocapsid) with DRB1*0701, VRRAGKYYS (matrix) with DRB1*0301 and VRLKCLLCG (fusion) with DRB1*0806. The chosen epitopes showed 100% conservancy across all the respective protein sequences from different strains of HeV. These epitopes showed high binding affinity for HLA class II alleles, a tendency to form stable complexes with HLA class II alleles as supported by the MD simulation results, and significant worldwide population coverage. Together, these results make the epitopes potential candidates for epitope-based vaccine development against HeV infection. The reported epitopes may therefore be taken forward to in vivo trials to develop a vaccine against HeV infection.