Introduction

Ebolavirus (EBOV) is a member of the family Filoviridae [1] and is responsible for Ebola virus disease (EVD), which is characterized by uncontrolled viral replication and multi-organ failure [2]. The virus multiplies in various cell types (hepatocytes, macrophages, endothelial and epithelial cells) and rapidly spreads to the vital organs of the host [3]. Most cases of EVD result from person-to-person transmission [4]. Approximately 30,000 cases of Ebola have been reported since 1976, with North Kivu province being the site of the latest outbreak in 2018 [5].

EBOV is a non-segmented, single-stranded, negative-sense RNA virus with an unusual, variable-length, filamentous morphology. It comprises seven proteins, viz. nucleoprotein (NP), polymerase cofactors (VP35 and VP40), glycoprotein (GP), transcription activators (VP30 and VP24) and RNA-dependent RNA polymerase (L) [6]. As in other members of its family, EBOV RNA does not exist in naked form [7]. NP serves as a scaffold for assembly of the filovirus nucleocapsid (NC), which includes VP35, VP30, VP24 and L [8]. NP interacts with VP35 and VP30 [8], which in turn interact with the polymerase and help in assembling the viral replication complex [9]. The NC plays a role in viral RNA synthesis during the proliferation cycle [10,11,12]. Therefore, NP is essential for viral RNA synthesis and virus assembly [10], as ssRNA binding is likely dependent on oligomerization and proper orientation of NP [10, 11]. NP also protects the virus from host innate immune responses and provides resistance to host ribonucleases. When NP-specific CTLs were given to naive mice challenged with a lethal EBOV dose, they helped to induce protection against EBOV, indicating a role for cell-mediated immunity against NP [13]. In another recent study, T cell responses against seven EBOV proteins were analyzed in 30 individuals who survived EBOV infection, and most survivors (96%) responded to NP as compared to the other proteins [14]. Hence, NP is a critical protein and presents itself as an attractive target for vaccine design.

Vaccine development against EBOV is still in progress, and candidates in various trials include viral vector-based vaccines [15], protein-based vaccines [16] and subunit vaccines [17]. In recent years, there has been remarkable progress in peptide-based vaccines, which are fragments of protein antigen sequences assembled into a single molecule capable of inducing an immune response. Immunoinformatics tools have been used successfully to identify potent peptide vaccine candidates against influenza virus [18], hepatitis C [19], West Nile virus [20] and EBOV [21]. The immunogenic peptides obtained using this approach were validated in vitro for influenza [22] and in vivo for Brucella abortus [23] and EBOV [24].

In the present investigation, peptides containing multiple epitopes against EBOV NP were selected using different epitope prediction tools and examined for their conservation among the EBOV species and other Filoviridae members. These peptides were assessed for binding potential to diverse HLA molecules using different prediction tools, docking and population coverage analysis. Further, the immunogenic response of three potential peptides was validated in vitro by measuring IFN-γ secreted by peptide-stimulated peripheral blood mononuclear cells (PBMC) isolated from healthy blood samples.

Method

Conserved peptides identification

A total of 195 unique Ebola nucleoprotein sequences (739 amino acids), out of 2407 entries (1976–May 2018), were downloaded from the viprbrc and NCBI databases. These comprised 187 (Zaire), five (Sudan), two (Bundibugyo) and one (Taï Forest) sequences belonging to the Ebola species pathogenic to humans. The MUSCLE [25] and AVANA [26] tools were employed to identify peptide fragments showing at least 90% conservancy.
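The conservancy filter can be illustrated with a minimal sketch. It assumes that conservancy at an alignment column is the fraction of sequences carrying the most common residue; whether AVANA scores conservation in exactly this way is an assumption, and the function names, the min_len parameter and the toy sequences below are hypothetical.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, per alignment column, the fraction of sequences that carry
    the most common residue. Gap-dominated columns score 0.0."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("expected a non-empty list of equal-length aligned sequences")
    n = len(aligned)
    scores = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        scores.append(0.0 if residue == "-" else count / n)
    return scores


def conserved_fragments(aligned, threshold=0.9, min_len=9):
    """Return 0-based half-open (start, end) spans of consecutive columns
    whose conservation is >= threshold and whose length is >= min_len."""
    scores = column_conservation(aligned) + [0.0]  # sentinel closes a trailing run
    spans, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                spans.append((start, i))
            start = None
    return spans


# Toy alignment (made-up sequences): column 3 is only 80% conserved.
toy = ["MKTAYI"] * 8 + ["MKTGYI"] * 2
toy_spans = conserved_fragments(toy, threshold=0.9, min_len=2)
```

In this toy case the weakly conserved column splits the alignment into two conserved spans; with real data the input would be the MUSCLE alignment of the 195 sequences.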

Prediction of T and B cell epitopes

T cell epitopes were predicted based on a consensus approach [27] that includes three prediction tools (SYFPEITHI, NetCTL 1.2 and IEDB consensus) for CD8+ T cell epitopes (HLA class I) and three tools (MHC2Pred, Propred and IEDB consensus) for CD4+ T cell epitopes (HLA class II). The detailed information of each tool is mentioned in Table 1. The epitopes showing overlaps were further joined to obtain peptide fragments containing both CD4+ and CD8+ T cell epitopes.
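The consensus step can be sketched as a simple vote across tools. This is an illustration only: it assumes that each tool's output has already been reduced to a set of peptide strings, and the function name, the min_tools parameter and the toy predictions are hypothetical rather than taken from the study.

```python
def consensus_epitopes(predictions, min_tools=None):
    """Keep epitopes predicted by at least `min_tools` tools.

    predictions: dict mapping a tool name to an iterable of peptide strings.
    min_tools:   required number of agreeing tools (default: all tools).
    """
    if min_tools is None:
        min_tools = len(predictions)
    votes = {}
    for peptides in predictions.values():
        for peptide in set(peptides):  # one vote per tool per peptide
            votes[peptide] = votes.get(peptide, 0) + 1
    return sorted(p for p, v in votes.items() if v >= min_tools)


# Made-up tool outputs for illustration; the tool names come from the text,
# the peptide assignments do not.
toy_predictions = {
    "SYFPEITHI": {"YQVNNLEEI", "FLSFASLFL", "AAAAAAAAA"},
    "NetCTL 1.2": {"YQVNNLEEI", "FLSFASLFL"},
    "IEDB consensus": {"YQVNNLEEI", "FPQLSAIAL"},
}
common_to_all = consensus_epitopes(toy_predictions)
```

Lowering min_tools to 2 would turn the strict intersection into a majority vote, which trades specificity for sensitivity.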

Table 1 T cell epitope prediction tools

Linear, 10-amino-acid-long B cell epitopes were identified with ABCpred at the default threshold (0.51). This tool uses a recurrent neural network to predict B cell epitopes with 65.93% accuracy [33].

Screening of peptides for autoimmune and allergic response

Peptides having seven out of nine consecutive amino acids identical to the human proteome were eliminated using BLAST analysis. Allergenicity of the peptides was predicted using the online tool AlgPred, which screens the query protein sequence for IgE epitopes and uses the Motif Alignment & Search Tool [34].
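The seven-of-nine identity criterion can be made concrete with a naive ungapped window scan. This sketch is not BLAST and is far slower than BLAST on a real proteome; it only shows what the criterion means. The function names and the toy protein are hypothetical.

```python
def max_window_identity(peptide, protein, window=9):
    """Highest number of identical positions between any ungapped
    `window`-long stretch of `peptide` and any such stretch of `protein`."""
    best = 0
    for i in range(len(peptide) - window + 1):
        query = peptide[i:i + window]
        for j in range(len(protein) - window + 1):
            target = protein[j:j + window]
            matches = sum(a == b for a, b in zip(query, target))
            best = max(best, matches)
    return best


def is_self_like(peptide, proteome, window=9, min_identical=7):
    """True if any protein shares >= min_identical of `window` consecutive residues."""
    return any(
        max_window_identity(peptide, protein, window) >= min_identical
        for protein in proteome
    )


# Made-up protein (not a real human sequence) differing at one position
# from a 9-mer of the peptide YAPFARLLNLSGV.
toy_protein = "MMMAPFGRLLNLQQQ"
toy_hit = is_self_like("YAPFARLLNLSGV", [toy_protein])
```

A peptide flagged by such a scan would be discarded before further analysis, as was done for the two fragments eliminated in this study.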

Conservancy analysis amongst Ebola species and other Filoviridae members

The identified peptides were examined for conservancy in sequences of human-pathogenic Ebola species (Zaire, Sudan, Bundibugyo and Taï Forest), in 18 unique Marburgvirus nucleoprotein sequences out of a total of 79, and in the only available unique nucleoprotein sequence of Lloviuvirus, all obtained from the viprbrc and NCBI databases.

Molecular docking

HLA–peptide interaction analysis was done with CABS-dock, which allows for flexibility of the peptide and the receptor backbone [35]. High-resolution crystal structures of eighteen HLA class I and II alleles (nine each) bound to their respective native peptides were obtained from PDB. The native peptides (peptides already bound to the HLA) were removed from the crystal structures using Discovery Studio Visualizer (version 4.1). RMSD values obtained by docking the native peptides to their respective HLA structures served as the standard. Peptides showing RMSD > 5 or found to interact outside the binding groove were eliminated.
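For reference, the RMSD filter can be sketched as follows. The code computes a plain coordinate RMSD between two equally sized sets of matched atoms without any superposition step; how CABS-dock selects atoms and reference models internally is not described here, so that part is an assumption. The function names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def passes_cutoff(value, cutoff=5.0):
    """Keep a docked model only if its RMSD does not exceed the cutoff."""
    return value <= cutoff


# Made-up coordinates: the second set is the first shifted by 3 units along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
toy_rmsd = rmsd(model_a, model_b)
```

A rigid shift of every atom by the same distance gives an RMSD equal to that distance, which makes this toy case easy to check by hand.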

Population coverage analysis

The IEDB population coverage analysis tool, which is based on peptide–HLA data and HLA genotypic frequencies, estimates the fraction of a population expected to respond to a set of peptides and is therefore useful in developing a globally protective vaccine. The selected peptides and their HLA alleles obtained from the prediction tools were used as input for this tool. For this analysis, four continents (Africa, America, Asia and Europe) were chosen. Africa, America and Asia together comprised 13 different geographical regions and, therefore, the average population coverage for these regions was considered. The analysis was also carried out for the whole world.
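The idea behind a coverage estimate can be sketched under two simplifying assumptions that are stated here and not taken from the tool's documentation: Hardy-Weinberg proportions within a locus and independence between loci. The exact calculation performed by the IEDB tool may differ. The function names and allele frequencies are hypothetical.

```python
def locus_coverage(allele_freqs, covered_alleles):
    """Fraction of individuals carrying at least one covered allele at one locus,
    assuming Hardy-Weinberg proportions: 1 - (1 - F)^2 with F the summed frequency."""
    f = sum(freq for allele, freq in allele_freqs.items() if allele in covered_alleles)
    f = min(f, 1.0)  # guard against rounding in published frequency tables
    return 1.0 - (1.0 - f) ** 2


def combined_coverage(per_locus_coverages):
    """Coverage across loci, assuming the loci are independent."""
    uncovered = 1.0
    for coverage in per_locus_coverages:
        uncovered *= 1.0 - coverage
    return 1.0 - uncovered


# Made-up allele frequencies for a single locus in a single population.
toy_freqs = {"A*02:01": 0.3, "A*01:01": 0.2, "A*03:01": 0.5}
toy_locus = locus_coverage(toy_freqs, {"A*02:01", "A*01:01"})
toy_total = combined_coverage([toy_locus, 0.5])
```

With a summed frequency of 0.5 for the covered alleles, three quarters of individuals carry at least one of them, and adding a second locus with 50% coverage raises the combined estimate further.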

Statistical analysis

One-way ANOVA followed by Tukey’s multiple comparison test was carried out in GraphPad Prism to analyze the docking data.
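The F statistic underlying a one-way ANOVA can be reproduced in a few lines. The analysis in this study was run in GraphPad Prism; the sketch below only illustrates the calculation, does not compute a p value, and does not implement Tukey's post hoc test. The function name and the toy groups are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up RMSD-like values for three groups.
toy_f, toy_dfb, toy_dfw = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The F value would then be compared against the F distribution with the returned degrees of freedom to obtain significance.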

Measurement of IFN-γ secreted by peptide-stimulated peripheral blood mononuclear cells

P2, P3 and P5 were commercially synthesized by GL Biochem (Shanghai) Ltd. Healthy blood samples were obtained from Nitin Hospital, Patiala and Rajindra Hospital, Patiala (India) after informed consent from all volunteers. The study was approved by the institutional ethical committee. Peripheral blood mononuclear cells (PBMC) were isolated by the Ficoll density gradient method [22]. A restimulation assay was carried out to measure peptide-induced IFN-γ secretion, with certain modifications to a previous report [36]. In a 96-well cell culture plate, 2 × 10^5 cells were seeded per well in a total volume of 200 µL complete medium (RPMI-1640 supplemented with 10% fetal bovine serum, 100 µg/mL streptomycin, 100 I.U./mL penicillin and 10 mM HEPES) and stimulated with each peptide (50 µg/mL). Unstimulated cells served as the negative control, while cells stimulated with 10 µg/mL of concanavalin A (ConA, Sigma-Aldrich) served as the positive control. Restimulation with each peptide was done on the 3rd day. On the 5th day, IFN-γ secreted by unstimulated, peptide-stimulated and ConA-stimulated cells was measured by ELISA using a human IFN-γ mini ELISA development kit (Peprotech, USA). A microplate reader (Tecan, Austria) was used to measure absorbance at 405 nm with 630 nm as the reference wavelength. All experiments were carried out in triplicate. IFN-γ production was expressed as fold change, which is the ratio of the absorbance of peptide-stimulated cells to that of unstimulated cells.
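The fold-change readout can be written out explicitly. The sketch assumes that the triplicate absorbance values are averaged before the ratio is taken, which is a common convention but is not stated in the text; the function names and absorbance values are hypothetical.

```python
from statistics import mean


def fold_change(stimulated, unstimulated):
    """Ratio of mean absorbance of stimulated wells to mean absorbance of
    unstimulated wells (each argument is a list of replicate readings)."""
    baseline = mean(unstimulated)
    if baseline <= 0:
        raise ValueError("baseline absorbance must be positive")
    return mean(stimulated) / baseline


def count_responders(fold_changes, threshold=1.0):
    """Number of samples whose fold change reaches the threshold."""
    return sum(fc >= threshold for fc in fold_changes)


# Made-up triplicate absorbance readings for one sample.
toy_fc = fold_change([0.50, 0.52, 0.48], [0.25, 0.26, 0.24])
toy_responders = count_responders([toy_fc, 0.8, 1.0])
```

Averaging the replicates first keeps a single outlying well from dominating the ratio.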

Results

Conserved peptides containing T and B cell epitopes and having no autoimmune and allergic properties

Four overlapping fragments (C1–C4) with ≥ 90% conservancy in 195 Ebola nucleoprotein sequences were obtained after multiple sequence alignment via MUSCLE and conservancy analysis via AVANA (Online Resource 1). Next, epitopes commonly predicted in the identified fragments by six epitope prediction tools (three each for HLA class I and II) were considered. Initially, 105 and 79 HLA class I (CD8+ T cell) and II (CD4+ T cell) binding epitopes were obtained, respectively (Online Resource 2). Twelve peptide fragments containing multiple CD8+ and CD4+ T cell epitopes were obtained by merging overlapping epitopes (Online Resource 3).
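Merging overlapping epitopes into longer fragments amounts to merging intervals on the protein sequence. The sketch below assumes that each epitope is described by its 1-based inclusive start and end positions; the function name and the coordinates are hypothetical.

```python
def merge_overlapping(epitopes):
    """Merge (start, end) spans, 1-based inclusive, that share at least one residue."""
    merged = []
    for start, end in sorted(epitopes):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous fragment: extend it if this span reaches further.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


# Made-up epitope positions: two overlapping nonamers and one isolated nonamer.
toy_fragments = merge_overlapping([(40, 48), (10, 18), (15, 23)])
```

Each merged span corresponds to one peptide fragment that contains all of the epitopes folded into it.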

A total of 201 linear B cell epitopes were obtained after analyzing the four conserved fragments (C1–C4) via ABCpred (Online Resource 2). The predicted B cell epitopes were present in only eight of the identified fragments. These eight peptides were checked for autoimmune and allergic responses. Two peptide fragments (VGHMMVIFRLMRTNFLIKFLLIHQGMHMV and YAPFARLLNLSGV) exhibited similarity to intrinsic human proteins based on BLAST analysis and, hence, were eliminated. AlgPred predicted none of the peptides to be allergenic. Thus, six non-self and non-allergic peptide candidates possessing multiple B and T cell epitopes were selected (Table 2).

Table 2 Peptides representing the presence of different T and B cell epitopes

Conservation analysis of peptides amongst different Ebola virus species and related family members

The six identified fragments were investigated for their conservancy in different species of Ebola virus (Zaire, Sudan, Bundibugyo and Taï Forest) and in other members of Filoviridae (Marburgvirus and Lloviuvirus) to judge their potential to elicit cross-protective immunity. Interestingly, all selected peptides showed 100% conservancy amongst Zaire ebolavirus nucleoprotein sequences (Table 3). P3 was 100% conserved in all Ebola virus species and in Lloviuvirus, with a single amino acid variation in Marburgvirus. P5 was 100% conserved in three Ebola virus species (Zaire, Bundibugyo and Taï Forest), while one amino acid variation at the same position was observed in Sudan, Lloviu and Marburg viruses (Table 3). The remaining peptides showed a few variations in different Ebola virus species. In most cases, one variant sequence was observed, but P2 (Sudan ebolavirus) and P4 (Marburgvirus) had two variant sequences (Table 3). P1, P2 and P6 could not be located in the NP sequences of Lloviu and Marburg viruses.

Table 3 Peptide conservation in Ebola virus species and other Filoviridae members

Peptide–HLA interactions

Peptides are presented by HLA molecules to induce an immune response, and HLA polymorphism is well known [37, 38]. Therefore, it is desirable for potent peptide vaccine candidates to interact with a wide range of HLA molecules. During epitope prediction, all HLA alleles/supertypes available in the various tools (Table 4) were considered. All selected peptides were predicted to bind a large and diverse set of HLA alleles of the HLA-A, HLA-B, HLA-DP, HLA-DQ and HLA-DR types (Table 4), and complete details of the HLA restrictions of all peptides are provided in Online Resource 4. P2, P3 and P5 were predicted to bind the most HLA types (Table 4) as well as the maximum number of HLA alleles (Online Resource 4).

Table 4 Peptides containing multiple epitopes binding to diverse HLA alleles

Docking provides a wider perspective for understanding the actual binding interaction of peptides with HLA molecules. Eighteen HLA alleles belonging to different HLA categories were chosen for docking with CABS-dock. Crystal structures for the eighteen HLA alleles were obtained from PDB, and the native peptides, ranging from 8 to 11 residues in length for HLA class I and from 9 to 20 residues for HLA class II, were separated using Discovery Studio Visualizer 4.1. Nonamer CD8+ T cell epitopes that are part of the six selected peptides (Table 2) were docked with HLA class I molecules, as the binding groove of HLA class I is closed and can accommodate 8–10 residue peptides. HLA class II has an open groove and is capable of presenting 13–25 residue peptides; thus, the six peptides as such were docked with class II molecules [39]. RMSD values (CABS-dock) obtained by docking native peptides to their respective HLA molecules were used as the control.

The average RMSD of the CD8+ T cell epitopes (HLA class I) associated with each peptide is plotted in Fig. 1. Models with RMSD < 3 are considered high-quality predictions, while those with 3 ≤ RMSD ≤ 5.5 are considered moderate-quality predictions [40]. In some HLA–peptide interactions, such as P1 (B*1801 and DRB*0101), P4-DQ8 and P6-DQ8, the RMSD value was > 5, indicating a poor-quality prediction, and these values were therefore not plotted. In the majority of cases, the RMSD value was less than 3 for peptide–HLA (class I and II) complexes (Fig. 1). The mean and median RMSD values of each peptide were also less than 3, which supports a strong binding interaction of the predicted peptides with the eighteen HLA molecules (Table 5). The RMSD values of all peptides were not significantly different from those of the native peptides and, with few variations, fell within the range of the native peptides. The majority of the RMSD data displayed either positive skewness or a skewness value close to zero (as in a normal distribution), indicating stable interactions between the identified peptides and a greater number of HLA alleles. P2, P4 and P6 displayed negative skewness for HLA class I, indicating a highly stable interaction with only a few HLA class I alleles, while these peptides displayed positive skewness for HLA class II alleles. Based on mean binding energy, P5 displayed the best interaction with HLA class I molecules amongst the identified peptides, while P2 and P3 displayed better interaction with HLA class II molecules than the native peptides (Table 5).
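The summary statistics discussed above (mean, median and skewness of the RMSD values per peptide) can be computed as in the following sketch. It uses the moment-based population skewness g1 = m3 / m2^1.5; which skewness estimator was used for Table 5 is not stated, so this choice is an assumption, and the function names and values are hypothetical.

```python
from statistics import mean, median


def skewness(values):
    """Moment-based population skewness g1 = m3 / m2**1.5 (0.0 if no spread)."""
    if not values:
        raise ValueError("skewness of an empty list is undefined")
    n = len(values)
    mu = mean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5


def summarize(values):
    """Return the mean, median and skewness of a list of RMSD values."""
    return {"mean": mean(values), "median": median(values), "skewness": skewness(values)}


# Made-up RMSD values: most are low, one is high (right tail, positive skew).
toy_summary = summarize([1.0, 1.0, 1.0, 5.0])
```

A positive value indicates that most RMSD values sit at the low end with a few higher outliers, whereas a negative value indicates the opposite pattern.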

Fig. 1

RMSD of the native peptide (NP) and identified Ebola nucleoprotein peptides obtained by CABS-dock analysis. For HLA class I, RMSD of a native peptide, P1, P2 and P3 peptides and b native peptide, P4, P5 and P6 peptides. For HLA class II, RMSD of c native peptide, P1, P2 and P3 peptides and d native peptide, P4, P5 and P6 peptides. For a and b, the mean RMSD of the epitopes belonging to the respective peptide was considered, except for peptides consisting of a single CD8+ T cell-specific epitope (P1 and P6). For c and d, the RMSD value obtained after docking the identified peptides with various HLA alleles was considered. Native peptides are the peptides already present in the crystallographic structures of the HLA molecules. They were separated and docked with their respective HLA molecules. RMSD values > 5 represent poor-quality predictions and, hence, were not plotted

Table 5 Comparative analysis of RMSD value (CABS-Dock) of selected peptides with native peptides

In addition to docking and HLA coverage in the prediction tools, population coverage analysis was carried out, which provides the expected response to the peptides across HLA molecules in different geographic regions. Encouragingly, all peptides exhibited more than 95% coverage for American, Asian and European populations (Fig. 2), and the expected response was 90–100% when the analysis was done for the whole world. The average response for each peptide across the four continents (Africa, America, Asia and Europe), comprising 14 geographical areas in total, was P1 (94.7%), P2 (96%), P3 (95.4%), P4 (85%), P5 (95%) and P6 (94.3%).

Fig. 2

Population coverage of the identified peptides in four different continents and whole world. The mean population coverage for three different geographical continents (America, Africa and Asia) has been plotted

Mapping of peptide fragments

P2–P5 were found to be located in the core domain of NP protein. P1 was found to be near the N tail of the protein while P6 was found to be a part of C tail (Fig. 3).

Fig. 3

Schematic presentation of identified peptides in different regions of Ebola nucleoprotein

Peptide-induced IFN-γ secretion

Three peptides (P2, P3 and P5) performed better in the docking analysis and exhibited greater HLA allele coverage. Hence, these peptides were tested in vitro to validate their immunogenic potential. PBMC isolated from healthy blood samples were incubated alone (unstimulated cells), with peptides (peptide-stimulated cells) or with ConA (positive control). Nine out of ten samples showed enhanced IFN-γ secretion (fold change ≥ 1.0) for peptide-stimulated cells in the case of P2 and P3. P5 was less responsive, as only two out of ten samples clearly showed enhanced IFN-γ secretion compared to unstimulated cells (Fig. 4). ConA-treated cells showed higher IFN-γ production for all samples.

Fig. 4

IFN-γ secretion by peripheral blood mononuclear cells from 10 healthy blood samples, expressed as fold change for a P2, b P3 and c P5. Fold change is the ratio of the absorbance of peptide-stimulated cells to that of unstimulated cells. S1–S10 represent the 10 healthy blood samples. ConA: concanavalin A

Discussion

The use of peptides in vaccine formulation is a recent development, and many peptide-based vaccines are in different stages of clinical trials. More than ten anti-cancer peptide vaccine candidates have made it to phase III trials [41]. Phase II trials are being conducted for peptide vaccines against influenza [42] and HPV-induced cancer [43]. Relying solely on in vitro or in vivo analysis for immunogenic peptide identification is cumbersome and not feasible in all facilities around the world. Computational immunology offers the advantage of reducing the number of peptide candidates to be validated in vitro or in vivo. In the current study, six EBOV NP peptides containing multiple epitopes with the potential to interact with an array of HLA molecules were identified. Numerous computational studies have identified epitopes against different infectious organisms such as Leishmania [44], Mycobacterium tuberculosis [45] and influenza virus [46]. Immunoinformatically identified peptides have shown enhanced proliferation and IFN-γ production in peptide-induced peripheral blood mononuclear cells [22]. Also, in vivo validation in mice of computationally generated peptide vaccine candidates for Leishmania donovani [47], Moraxella catarrhalis [48] and Brucella abortus [23] showed promising results. Hence, computational identification of potential peptide vaccine candidates against different infections has picked up pace in recent years. EBOV represents a serious concern for human health and, thus, with the application of various computational tools, six EBOV NP peptides containing multiple T and B cell epitopes were designed. Owing to the presence of different epitopes, these peptides may be capable of generating both humoral and cell-mediated immunity.

One of the interesting aspects of this study is the use of a consensus approach in which six prediction tools (three each for CD8+ and CD4+ T cell epitopes) were employed. The advantage of a consensus approach is that multiple prediction algorithms, immunological factors and databases are considered in predicting the epitopes. In contrast to the present study, Sundar et al. used one prediction tool [49], while two prediction tools were used by Dutta et al. [50]. An approach similar to the current study was employed by Dikhit et al., where three different prediction algorithms were used to define CD8+ T cell epitopes but not CD4+ T cell epitopes [51]. In all these studies [49,50,51], the work was carried out only for identifying CD8+ T cell epitopes. The present study focuses on the identification of peptides containing both CD8+ and CD4+ T cell epitopes. Further, in contrast to previous studies [49,50,51], where protein sequences were taken from a single EBOV strain, the present study considered sequences from 1976 to May 2018 of all EBOV strains infecting humans. The conserved sequences obtained from 195 unique Ebola nucleoprotein sequences were used for predicting the epitopes.

As per the comparative analysis performed with the help of IEDB, none of the six peptide fragments were found to be reported exactly in any of the previous studies. YQVNNLEEI, FLSFASLFL and FPQLSAIAL are the partial fragments of P2, P3 and P5 peptides, respectively (Table 2), that have been reported to induce CD8+ T cell response in earlier studies [49, 52,53,54,55]. In another study, mice injected with EBOV NP vaccine responded by producing protective CTLs against VYQVNNLEEIC (consisting of a part of P2) [56]. A large peptide sequence, HILRSQGPFDAVLYYHMMKDEPVVFSTSDGKEYTYP (consisting of P6), was reported to induce CD8+ T cell response in human survivors [14].

Any component of a vaccine may contribute to adverse effects after vaccine administration [57]. There is a chance of autoimmune reactions after vaccination [58]. An association of polyarthritis and thrombocytopenia with hepatitis B vaccine administration has been reported [59]. Also, neural complications following anti-rabies and tetanus vaccinations have been described in previous studies [60]. Hence, BLAST analysis was applied in the current study to remove potential elicitors of autoimmune responses. VGHMMVIFRLMRTNFLIKFLLIHQGMHMV was eliminated owing to its similarity to human anaphase-promoting complex/cyclosome, and YAPFARLLNLSGV was eliminated as it showed similarity to the homeobox protein Hox-B9. Angioedema, bronchospasm, shock and a drop in blood pressure are some of the allergic responses to vaccines [57]. Measles–Mumps–Rubella (MMR) vaccination has been reported to induce anaphylactic reactions in some individuals, and its administration to egg-allergic children has been a subject of great debate [61, 62]. The current in silico analysis predicted that none of the peptides under consideration would induce allergic responses.

To validate the in silico selected peptides in vitro, three peptides were checked for IFN-γ secretion in peptide-stimulated PBMC. In previous studies as well, the extracellular release of IFN-γ has been measured by ELISA as an indicator of antigen-induced proliferation of T cells [22, 63, 64]. Interestingly, nine out of ten samples responded with enhanced IFN-γ production for two peptides (P2 and P3), thus supporting the immunogenic potential of these computationally identified peptides.

Conservation analysis offered a wider perspective for identifying peptide candidates that may provide immunity against existing and future viral strains as well as cross-protective immunity. According to previous studies, residues 36–351 (P2–P5 lie in this polypeptide) are highly conserved amongst all EBOV strains, Marburgvirus and Lloviuvirus [11], which belong to the same family (Filoviridae). Other Mononegavirales members such as respiratory syncytial virus [65], Nipah virus [66] and parainfluenza virus 5 [67] have shown structural similarity to EBOV. The identified peptides showed conservation and similarity with EBOV species and other filoviruses (Marburgvirus and Lloviuvirus) but not with other Mononegavirales members. Interestingly, all selected peptides were 100% conserved amongst Zaire ebolavirus strains. Further, P3 and P5 were 100% conserved in most human-infecting Ebola species and in the Filoviridae members considered, barring a single amino acid variation in a few species. These results support the idea of developing a cross-protective peptide vaccine.

One of the challenges in the development of a peptide-based vaccine is to identify peptide candidates that provide immunity to different populations across the world. The adaptive immune response is directly associated with peptides presented by HLA molecules, which are highly polymorphic. A total of 13,680 HLA class I and 5091 HLA class II alleles belonging to different populations of the world have been reported in the IPD-IMGT/HLA Database (release version 3.33, July 2018) [68, 69]. Peptide selection in the current study accounted for multiple epitopes, which provides the opportunity to include a large number of HLA alleles belonging to HLA-A, HLA-B, HLA-DP, HLA-DQ and HLA-DR. Further, docking studies have been carried out previously to ascertain the same in the case of influenza virus and human baculo virus [18, 70]. Docking of EBOV epitopes was previously carried out with only the HLA-A*0201 allele [51]. In the current study, nine high-resolution PDB structures each for HLA class I and class II, belonging to different HLA alleles, were considered. Statistical analysis indicated that the average RMSD value of the selected peptides was not significantly different from that of the native peptides, indicating similar binding potential. Moreover, the RMSD value was less than 3 in most cases, indicating highly favorable binding interactions with the selected HLA molecules. Population coverage analysis was employed as an additional computational tool to judge the expected immune response to the peptides in different geographical populations. Average population coverage lay in the range 90–100% for America, Asia, Europe and the whole world, which further supported the potential of the selected peptides as global vaccine candidates.

Further, these peptides were found to be parts of crucial nucleoprotein domains. P1–P5 are a part of the N-terminal residues 1–450, a polypeptide which in itself is sufficient for viral genome replication [71]. P2–P5 are a part of the N-terminal residues 36–351, which are needed for NP oligomerization and RNA binding [12]. Most of P1 forms an oligomerization arm of NP [11]. A part of P1 lies in the N-terminal residues 1–24, which enhance ssRNA binding as well as control NP intermolecular interactions [10, 12]. One residue of P3 (160th) is amongst the four important nucleoprotein residues (160, 171, 174 and 248) responsible for RNA encapsidation [72], and their deletion impairs EBOV replication [73]. Residues of P4 and P5 (264, 268 and 316) are involved in the formation of a highly conserved hydrophobic pocket significant for RNA formation [12].

Vaccine development against EBOV is in progress, and nearly eight Ebola vaccine candidates have made it to clinical trials [74]. A replication-competent vesicular stomatitis virus vector-based vaccine named rVSVΔG-ZEBOV-GP has shown encouraging results in randomized and non-randomized trials conducted in Guinea, West Africa [75] and proved to be effective with tolerable side effects [76]. Another vaccine, ChAd3-EBO, is a non-replicating chimpanzee adenovirus vector-based vaccine which has exhibited efficacy in non-human primates [77] and an increased response with a modified vaccinia Ankara (MVA) booster [78]. Although some EBOV vaccines are in clinical trials, there is still a long way to go before they reach the market. The peptides identified in this study offer the advantage of being small molecules with the potential to provide immunity against all EBOV strains and other filoviruses and, hence, might be considered for designing a globally protective vaccine.

Conclusion

Six non-self and non-allergic peptides having multiple T and B cell epitopes were obtained and found to be 100% conserved in Zaire EBOV species. These peptides were predicted to bind diverse HLA alleles and showed strong binding affinity with eighteen different HLA molecules. Also, the peptides exhibited strong population coverage among different geographical regions across the globe. Two out of three potential peptides tested for in vitro immune response showed enhanced IFN-γ production in peptide-stimulated PBMC. Thus, further validation of these peptides is proposed so that they can be incorporated into the design of a synthetic peptide vaccine against EBOV and related species.