1 Introduction

Pollen allergens are one of the major factors leading to severe type I respiratory allergies, which affect up to 30% of the human population [1]. Common ragweed (Ambrosia artemisiifolia L.) and its pollen cause serious economic losses and affect various spheres of life. Ambrosia artemisiifolia is the most important ragweed from an allergological point of view, as its pollen is extremely allergenic and very low concentrations can trigger allergic reactions (allergic rhinitis, asthma or contact dermatitis) in sensitive patients. Allergy to Ambrosia artemisiifolia pollen is a serious health issue with a strong negative impact on the quality of life of allergic patients [2, 3].

Ambrosia artemisiifolia belongs to the genus Ambrosia (Asteraceae). This genus includes about 40 species and is associated with severe and widespread allergies caused by the pollen of its representatives. Short ragweed (Ambrosia artemisiifolia) is native to Northern and Central America, but it has been introduced and subsequently naturalized in many countries, including a large part of Europe [4, 5]. In Europe, Ambrosia artemisiifolia is the predominant Ambrosia species. Short ragweed is mainly found in France, Germany, Hungary, Bulgaria, the Czech Republic, Poland, Northern Italy, and Eastern Austria [6, 7, 8, 9, 10].

It is important to underline that the pollen of Ambrosia artemisiifolia is produced in enormous amounts (a single plant may produce 100 million–3 billion pollen grains) and that even low exposure (5–10 pollen grains/m³ of air) is sufficient to initiate allergic reactions in sensitive individuals [11, 12]. To date, 11 allergens have been identified in Ambrosia artemisiifolia pollen. Among them, the pectate lyase Amb a 1 and the cysteine protease Amb a 11 are described as major allergens. Among ragweed pollen-allergic patients, they have sensitization rates of 90% and 66%, respectively [13, 14]. Amb a 2 (a pectate lyase), Amb a 3 and Amb a 7 (two plastocyanins), Amb a 4 (a homolog of the major mugwort allergen Art v 1), Amb a 5 (of unknown function), Amb a 6 (a nonspecific lipid transfer protein), Amb a 8 (a profilin), and Amb a 9 and Amb a 10 (calcium-binding proteins) are considered minor allergens. Sensitization rates of the minor allergens vary between 10 and 50% [15, 16]. There is significant cross-reactivity between common ragweed and other Ambrosia species, mugwort, marsh elder, and grass pollens. The molecular weights of ragweed allergens range from 5 to 38 kDa, and the allergens are grouped according to their biological differences [17].

Amb a 11 is a 28 kDa glycosylated protein belonging to the peptidase family C1, subfamily C1A. It is very similar to the mite allergens Der p 1 and Der f 1, which are among the most powerful known allergens. Recently, Amb a 11 was proposed as a novel major allergen with a high sensitization rate (66%) [18]. Twenty different isoforms and glycoforms of the major allergen Amb a 11 are described in the WHO/IUIS allergen nomenclature database [15].

Immunoinformatics is a branch of bioinformatics that has been successfully used in the field of vaccinology. The immunoinformatics approach is considered the most efficient method for the development of epitope-based peptide vaccines [19, 20]. In recent years, epitope-based vaccines have been preferred over traditional vaccines because of several advantages: they focus the immune response, enhance immunity while avoiding undesirable epitopes, are more convenient to produce, and reduce costs [21].

Because of climate change (warmer temperatures and increased carbon dioxide concentrations enable ragweed to produce more pollen), the number of individuals sensitive to ragweed pollen is increasing continuously in the United States and Europe [22, 23]. Therefore, the aim of this study was to design an efficient epitope-based peptide vaccine against the major allergen Amb a 11 using potential T-cell (CD8 + and CD4 +) and B-cell epitopes.

2 Materials and Methods

2.1 Retrieval of Amb a 11 and Identification of Conserved Regions Among Homologous Pollen Allergens of Amb a 11

The amino acid sequence and 3D structure of the Amb a 11 allergenic protein were retrieved from the UniProt database (https://www.uniprot.org/) [24] in FASTA and PDB formats. In the Structural Database of Allergenic Proteins (SDAP) [25], the FAO/WHO Allergenicity Rules, which are based on sequence homology from a full FASTA alignment, showed that the sequence “V5LU01” matches 22 allergen sequences with E-values < 0.01. The protein sequences of the selected pollen allergens were retrieved from Swiss-Prot in FASTA format. Conservancy can be defined as the fraction of protein sequences that contain the epitope at or above a specified level of identity. To identify the conserved regions of the Amb a 11 protein, its sequence was aligned with the retrieved sequences of the selected pollen allergens using the ClustalW program of the phylogenetic software MEGA-7 [26]. A phylogenetic tree of the retrieved sequences of Amb a 11 and the pollen allergens was constructed using the same software [26]. The evolutionary distances were calculated using the Maximum Composite Likelihood method within the software.
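The definition of conservancy given above can be illustrated with a minimal sketch. It is not the tool used in this study: the function names and the toy sequences are hypothetical, and the sketch assumes ungapped comparison of the epitope against every same-length window of each protein.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length strings match."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_window_identity(epitope, protein):
    """Highest identity between the epitope and any same-length window."""
    k = len(epitope)
    if len(protein) < k:
        return 0.0
    return max(identity(epitope, protein[i:i + k])
               for i in range(len(protein) - k + 1))


def conservancy(epitope, proteins, min_identity=1.0):
    """Fraction of proteins that contain the epitope at >= min_identity."""
    hits = sum(best_window_identity(epitope, p) >= min_identity
               for p in proteins)
    return hits / len(proteins)


# Toy sequences invented for illustration (not real allergen sequences).
toy_proteins = [
    "MKTSEQQLVDCAAGG",  # contains the 8-mer exactly
    "MKTSEQQLVECAAGG",  # one mismatch within the 8-mer (7/8 identity)
    "MKTAAAAAAAAAAGG",  # no similar window
]

print(conservancy("SEQQLVDC", toy_proteins, min_identity=1.0))
print(conservancy("SEQQLVDC", toy_proteins, min_identity=0.85))
```

Lowering the identity threshold from 100% to 85% lets the second toy sequence count as containing the epitope, which is why conservancy is always reported together with its identity level.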

2.2 Prediction of CD8 + T-cell Epitopes and Their MHC Binding HLA-Alleles

The MHC I prediction tool of the Immune Epitope Database (IEDB) (http://tools.iedb.org/mhc1) [27] was used to determine the ability of the selected sequence to bind to specific MHC class I molecules. Linear CD8 + T-cell epitopes were identified by a consensus approach using methods such as the artificial neural network (ANN) [28], which was used to calculate the IC50 values of peptide binding to MHC I molecules and to determine the binding HLA alleles, and the stabilized matrix method (SMM) [29], which was applied to predict peptide binding to MHC molecules, proteasomal cleavage of protein sequences, and peptide transport by the transporter associated with antigen processing (TAP). For both frequent and non-frequent alleles, the peptide length was set to 10 amino acids. On the basis of low IC50 values, 10 candidate peptides were chosen for further analysis.
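The selection logic described above (scan the sequence in overlapping 10-residue peptides, then keep those with the lowest predicted IC50) can be sketched as follows. This is only an illustration of the ranking step: the helper names, the toy sequence, the placeholder allele label "HLA-X" and all IC50 values are invented, and a real run would take the values from the prediction tool's output.

```python
def kmers(sequence, k=10):
    """Yield (1-based start position, peptide) for every overlapping k-mer."""
    for i in range(len(sequence) - k + 1):
        yield i + 1, sequence[i:i + k]


def top_binders(predictions, n=10):
    """Return the n (peptide, allele, ic50) rows with the lowest IC50 values."""
    return sorted(predictions, key=lambda row: row[2])[:n]


# Toy 14-residue sequence (not Amb a 11) gives five overlapping 10-mers.
toy_sequence = "ACDEFGHIKLMNPQ"
peptides = [pep for _, pep in kmers(toy_sequence)]

# Invented IC50 values, one per peptide, for a placeholder allele.
toy_ic50 = [850.0, 12.5, 4300.0, 97.0, 310.0]
toy_predictions = [(pep, "HLA-X", ic50)
                   for pep, ic50 in zip(peptides, toy_ic50)]

for peptide, allele, ic50 in top_binders(toy_predictions, n=2):
    print(peptide, allele, ic50)
```

A lower IC50 corresponds to stronger predicted binding, so sorting in ascending order puts the best candidates first.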

2.3 Prediction of CD4 + T-cell Epitopes and Their MHC Binding HLA-Alleles

The MHC II groove can bind peptides of different lengths, which makes the prediction more difficult and less accurate. The MHC II prediction tool provided by the Immune Epitope Database (IEDB) was used to predict 15-amino-acid-long CD4 + T-cell epitopes. MHC II binding prediction was performed using a consensus approach, which combines the NN-align, SMM-align [30], and Combinatorial library methods. All epitopes that bound to many alleles with a half-maximal inhibitory concentration (IC50) of 1000 or less were selected for further analysis.
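The filtering criterion above (an IC50 of 1000 or less, with preference for peptides binding many alleles) might be applied to a table of predictions roughly as in the sketch below. The function name, the `min_alleles` parameter and the toy rows are assumptions made for illustration; the text does not state how many alleles count as "many".

```python
from collections import defaultdict


def promiscuous_epitopes(predictions, ic50_cutoff=1000.0, min_alleles=2):
    """Group binders by peptide and keep peptides binding enough alleles.

    predictions: iterable of (peptide, allele, ic50) tuples.
    Returns a list of (peptide, sorted alleles), most alleles first.
    """
    alleles_by_peptide = defaultdict(set)
    for peptide, allele, ic50 in predictions:
        if ic50 <= ic50_cutoff:
            alleles_by_peptide[peptide].add(allele)
    selected = [(peptide, sorted(alleles))
                for peptide, alleles in alleles_by_peptide.items()
                if len(alleles) >= min_alleles]
    selected.sort(key=lambda item: (-len(item[1]), item[0]))
    return selected


# Invented rows; peptide and allele names are placeholders.
toy_predictions = [
    ("peptide_1", "allele_A", 120.0),
    ("peptide_1", "allele_B", 640.0),
    ("peptide_1", "allele_C", 1000.0),  # exactly at the cutoff, so kept
    ("peptide_2", "allele_A", 35.0),
    ("peptide_2", "allele_B", 2500.0),  # above the cutoff, so dropped
    ("peptide_3", "allele_B", 410.0),
    ("peptide_3", "allele_C", 980.0),
]

for peptide, alleles in promiscuous_epitopes(toy_predictions):
    print(peptide, len(alleles), alleles)
```

In this toy table, peptide_2 has the single strongest binding value but is excluded because it passes the cutoff for only one allele, which shows how the two criteria interact.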

2.4 Population Coverage Prediction for CD8 + and CD4 + T-cell Epitopes and Their Alleles

The population coverage of the identified T-cell epitopes was calculated using the epitope analysis tools of the IEDB analysis resource server (http://tools.iedb.org/population/) [31]. All finally selected CD8 + and CD4 + T-cell epitopes and their binding MHC I and MHC II alleles were assessed for the European and world populations.
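The idea behind a population coverage estimate can be conveyed with a deliberately simplified sketch. It is not the IEDB implementation: it assumes Hardy–Weinberg proportions within each HLA locus and independence between loci, and the locus names and allele frequencies below are invented.

```python
def locus_coverage(covered_allele_freqs):
    """Probability that a person carries at least one covered allele.

    Assumes Hardy-Weinberg proportions: each of the two chromosomes
    independently carries a covered allele with probability sum(freqs).
    """
    total = sum(covered_allele_freqs)
    if not 0.0 <= total <= 1.0:
        raise ValueError("allele frequencies at a locus must sum to [0, 1]")
    return 1.0 - (1.0 - total) ** 2


def population_coverage(loci):
    """Combine loci, assuming they are inherited independently.

    loci: dict mapping a locus name to the frequencies of the alleles
    predicted to bind the epitope set.
    """
    uncovered = 1.0
    for freqs in loci.values():
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Invented frequencies for two hypothetical loci.
toy_loci = {"locus_1": [0.25, 0.05], "locus_2": [0.10]}
print(round(population_coverage(toy_loci), 4))
```

Because each person carries two alleles per locus, coverage grows faster than the summed allele frequency: alleles with a combined frequency of 0.30 already cover about half of the toy population at that locus.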

2.5 Identification of B-cell Epitopes

The prediction of B-cell epitopes is a key step in epitope-based vaccine development, as B-cell epitopes elicit the antibodies that provide humoral immunity. Identification of B-cell epitopes depended on various physicochemical properties such as antigenicity, surface accessibility, flexibility, turns, exposed surface, polarity, hydrophilicity, and linear epitope predictions [32]. Potential B-cell epitopes were characterized with the B-cell epitope prediction tools (http://tools.iedb.org/bcell/) of the Immune Epitope Database using the following methods: Kolaskar and Tongaonkar antigenicity scale [33], Bepipred linear epitope prediction [34], Emini surface accessibility prediction [35], Karplus and Schulz flexibility prediction [36], Parker hydrophilicity prediction [37], and Chou and Fasman beta-turn prediction [38].
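Several of the propensity methods listed above share a common principle: a per-residue scale is averaged over a sliding window, and residues scoring above a threshold are flagged. The sketch below illustrates that principle only. It does not use any of the scales from this study; as a stand-in it uses the sign-flipped Kyte–Doolittle hydropathy scale as a rough hydrophilicity proxy, and the window of seven residues and the mean-score threshold are assumptions.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Used here only as a stand-in scale, not the Parker scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_profile(sequence, scale, window=7):
    """Return (1-based centre position, mean scale value) per window."""
    if window % 2 == 0 or window > len(sequence):
        raise ValueError("window must be odd and no longer than the sequence")
    half = window // 2
    profile = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        score = sum(scale[aa] for aa in segment) / window
        profile.append((start + half + 1, score))
    return profile


def above_threshold(profile, threshold=None):
    """Centre positions scoring above the threshold (default: profile mean)."""
    if threshold is None:
        threshold = sum(score for _, score in profile) / len(profile)
    return [pos for pos, score in profile if score > threshold]


# Flip the sign so that higher values mean more hydrophilic.
hydrophilicity = {aa: -value for aa, value in KYTE_DOOLITTLE.items()}

profile = window_profile("GKLVKFSEQQLVDC", hydrophilicity)
for position, score in profile:
    print(position, round(score, 3))
print("above mean:", above_threshold(profile))
```

Swapping in a different scale dictionary changes the property being profiled while the windowing and thresholding stay the same.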

2.6 In Silico Molecular Docking

The peptide structure prediction server PEP-FOLD 3 (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/) was used to predict the three-dimensional structures of the selected epitopes [39]. The crystal structures of HLA-A0201 (PDB ID: 4UQ3) and DQA1*03:01/DQB1*02:01 (PDB ID: 4D8P) were downloaded from the RCSB Protein Data Bank (http://www.rcsb.org/pdb/home/home.do) in PDB format [40, 41]. Because the crystal structures were protein–ligand complexes, Discovery Studio v16.1.0 [42] was used to separate the protein from the ligand. All water molecules and heteroatoms were then removed. In silico docking simulations were performed to estimate the binding affinities between the epitopes and the specific HLA proteins (MHC I and MHC II) and immunoglobulin E (IgE). The docking simulations were run on the automated online server ClusPro 2.0 (https://cluspro.bu.edu/home.php) [43].
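The removal of water molecules and heteroatoms described above was done in Discovery Studio; the same cleaning step can be approximated with a few lines of plain text processing, as in the hedged sketch below. It relies only on the PDB convention that polymer atoms are stored in ATOM records and waters and other hetero groups in HETATM records. The function name and the miniature PDB fragment (with made-up coordinates) are illustrative, and separating a bound peptide ligand stored in its own chain is not covered.

```python
def strip_waters_and_heteroatoms(pdb_text):
    """Keep polymer ATOM records (plus TER/END) and drop everything else."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record in ("ATOM", "TER", "END"):
            kept.append(line)
    return "\n".join(kept) + "\n"


# Miniature fragment with invented coordinates, for illustration only.
toy_pdb = "\n".join([
    "HEADER    TOY EXAMPLE",
    "ATOM      1  N   GLY A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  GLY A   1      11.639   6.071  -5.147  1.00  0.00           C",
    "HETATM    3  O   HOH A 101       2.000   3.000   4.000  1.00  0.00           O",
    "TER",
    "END",
])

cleaned = strip_waters_and_heteroatoms(toy_pdb)
print(cleaned)
```

Note that modified residues are also stored as HETATM records, so a filter this blunt would remove them too; a structure containing such residues would need a more selective rule.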

2.7 Experimental Validation

Ten subjects (7 females and 3 males, 30–45 years old) with proven allergy to ragweed pollen and 10 healthy age-matched controls (4 females and 6 males, 32–43 years old) were enrolled in this study. The selection of allergic patients was performed in collaboration with an experienced allergist based on a positive prick test to ragweed pollen (ALK-Abelló A/S Hørsholm, Denmark) and presence of IgE antibodies specific to ragweed pollen extract (ImmunoCAP, ImmunoDiagnostics Thermo Fisher Scientific, Uppsala, Sweden). Recruited healthy individuals had negative prick tests to ragweed pollen and no clinical histories of pollen allergies. None of the participants had received corticosteroids or allergen-specific immunotherapy during the last 2 months before taking blood samples.

Peripheral venous blood samples (3 mL) were collected from each participant into BD Vacutainer® K2EDTA tubes (Becton, Dickinson and Company, Oakville, ON, USA) and centrifuged at 1500 × g for 15 min. The sera were transferred to 2 mL cryotubes and stored at − 80 °C until further analyses. Then, red blood cells were lysed with 0.84% NH4Cl, and after washing with Dulbecco’s phosphate-buffered saline (D-PBS, Gibco®, Life Technologies™, Paisley, Scotland, UK), the remaining cells were resuspended at a concentration of 2.5 × 10⁶ cells/mL in RPMI-1640 medium, supplemented with a stabilized antibiotic antimycotic solution (penicillin, streptomycin, amphotericin B) and 10% fetal bovine serum (all from Sigma-Aldrich Chemie GmbH, Steinheim, Germany).

Cells were incubated at 37 °C in a humidified incubator containing 5% CO2 in the presence of 5 µg/mL of the identified T-cell (MHC class-I peptide GLMEPAFTYV and MHC class-II peptide LVCFSFSLVLILGLV) and B-cell (GKLVKFSEQQLVDC) epitopes in a volume of 200 μL in 96-well plates (TPP, Trasadingen, Switzerland). Peptides representing T- and B-cell epitopes were synthesized by Schafer-N (Copenhagen, Denmark).

After 72 h, the production of IL-2 in the culture supernatant was measured by LEGEND MAX™ Human IL-2 kit (BioLegend Inc., San Diego, CA, USA), according to the manufacturer’s instructions. The sensitivity of the assay was 4 pg/mL.

IgE antibodies specific to the identified epitopes were assayed by ELISA. Briefly, plates were coated overnight at 4 °C with 10 μg/mL of the synthetic peptides that represent the epitopes (50 μL/well). After blocking with D-PBS, 0.05% Tween 20 and 1% BSA, the plates were washed, and sera diluted 1:5 (100 μL/well) were applied onto the coated plates in triplicate and incubated overnight at 4 °C. IgE was detected with peroxidase-conjugated monoclonal anti-human IgE (clone GE-1, Sigma-Aldrich, St. Louis, MO, USA). The substrate was 2,2-azino-di-[2-ethylbenzthiazoline sulfonate] diammonium salt (ABTS tablets, Roche Diagnostics), and OD values were determined at 405 nm with a microplate reader (ELx800™, BioTek Instruments, Winooski, VT, USA).

All tests and analyses were performed in adherence to the ethical guidelines of the Declaration of Helsinki, and the study was approved by the local Ethics Committee at the Paisii Hilendarski University of Plovdiv. Written informed consent was obtained from all participants.

2.8 Statistics

All experiments related to the experimental validation were conducted in triplicate and results are presented as mean ± SD (n = 10). To compare non-parametric data for statistical significance, the Kruskal–Wallis test was applied using the StatView program (SAS Institute Inc., Cary, NC, USA). Values of p < 0.05 were considered significant (*p < 0.05, **p < 0.01, ***p < 0.001).
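For readers who wish to reproduce this kind of comparison without StatView, the sketch below shows how a mean ± SD summary and a Kruskal–Wallis test for two groups could be computed with the Python standard library. It is an illustration under stated assumptions: the readings are invented, the sample standard deviation is assumed, and the p-value uses the chi-square approximation with one degree of freedom, which for two groups reduces to a complementary error function.

```python
import math
import statistics


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two groups, with the standard tie correction."""
    values = list(a) + list(b)
    n = len(values)
    ranks = average_ranks(values)
    rank_term = 0.0
    for group in (ranks[:len(a)], ranks[len(a):]):
        rank_term += sum(group) ** 2 / len(group)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in tie_counts.values()) / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all values are identical")
    h = max(h / correction, 0.0)
    # Chi-square survival function for 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p


# Invented readings, for illustration only.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]

for name, group in (("A", group_a), ("B", group_b)):
    print(name, statistics.mean(group), "+/-", statistics.stdev(group))

h_stat, p_value = kruskal_wallis_two_groups(group_a, group_b)
print(round(h_stat, 3), round(p_value, 4))
```

With only two groups the Kruskal–Wallis test is closely related to the Mann–Whitney U test; for groups this small, an exact test would usually be preferred over the chi-square approximation.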

3 Results

3.1 Retrieval of Amb a 11 Protein Sequences and Identification of Conserved Domains

In this study, the amino acid sequence of Amb a 11 was retrieved from the UniProt database and then analyzed with the FAO/WHO Allergenicity Rules based on sequence homology in the SDAP allergen database. We identified 22 homologous allergens of the pollen allergen Amb a 11, as shown in Table 1. Multiple sequence alignment was performed to identify conserved sequences of varying lengths (Table 2). The conserved region analysis helps to identify the essential functional parts of the target protein sequences. A phylogenetic tree was constructed to show the evolutionary relationship between the pollen allergen Amb a 11 and the 22 identified allergens (Fig. 1).

Table 1 List of homologous allergens of Amb a 11
Table 2 Conserved sequences of homologous pollen allergens with their positions
Fig. 1 Phylogenetic tree based on amino acid sequences of the pollen allergen Amb a 11 and the identified 22 homologous allergens. The reconstruction has been performed by using ML analysis applying the GTR + I + G evolutionary model. The numbers above branches indicate the bootstrap support from 1000 replicates

3.2 Prediction of CD8 + T-cell Epitopes and Their MHC Binding HLA-Alleles

In this experiment, we used the MHC I binding prediction tool from the Immune Epitope Database (IEDB) to calculate the IC50 value and percentile rank of peptides binding to specific MHC I molecules. Peptides able to bind a larger number of alleles were considered the most suitable because of their potential to elicit a potent protective response. Ten CD8 + T-cell epitopes were selected as potential vaccine candidates according to their low percentile ranks and IC50 values for their respective alleles (Table 3).

Table 3 Most promising CD8 + T-cell epitopes with interacting MHC-I alleles, their positions, length of epitope, ANN IC50, SMM IC50, and percentile rank

3.3 Prediction of CD4 + T-cell Epitopes and Their MHC Binding HLA-Alleles

In this study, the amino acid sequence of the Amb a 11 allergen was analyzed using the IEDB MHC II binding prediction tool to determine the ability of the submitted sequence to bind to specific MHC II molecules. A consensus approach, which combines the NN-align, SMM-align, and Combinatorial library methods, was used to predict peptides that can interact with MHC class II alleles. The most promising CD4 + T-cell epitopes and their corresponding binding MHC II alleles are listed in Table 4.

Table 4 Most promising CD4 + T-cell epitopes with interacting MHC-II alleles, their positions, length of epitope, SMM IC50, and percentile rank

3.4 Analysis of Population Coverage

Population coverage analysis is a vital factor in the design of epitope-based vaccines. High population coverage is important because it means that many people can benefit from a single vaccine preparation. In this study, the IEDB population coverage tool (http://tools.iedb.org/population/) was applied to determine the population coverage of the predicted CD8 + and CD4 + T-cell epitopes. The analysis showed high coverage of the European and world populations (Table 5).

Table 5 Population coverage for predicted T-cell epitopes

3.5 B-cell Epitope Prediction

B-cell epitope identification is an essential step in the design of an epitope-based peptide vaccine because an ideal epitope-based vaccine contains both T- and B-cell epitopes. The amino acid sequence of the Amb a 11 allergenic protein was subjected to the Parker hydrophilicity, Chou and Fasman beta-turn, Karplus and Schulz flexibility, Emini surface accessibility, and Kolaskar and Tongaonkar antigenicity methods in IEDB to assess its potential for B-cell binding, its surface exposure, and its immunogenicity. From all predicted B-cell epitopes, only one epitope (“GKLVKFSEQQLVDC”) was chosen for further analysis. The predicted linear B-cell epitope and the accessibility, antigenicity, hydrophilicity, flexibility and beta-turn prediction scores for each residue are summarized in Table 6. This peptide contains the sequence “SEQQLVDC”, which is present in the conserved region of the pollen allergen Amb a 11 (Table 2).

Table 6 Surface accessibility (threshold = 1.000), hydrophilicity (threshold = 1.916), flexibility (threshold = 1.004), beta turn (threshold = 1.016) and antigenicity (threshold = 1.013) prediction score for each residue of B-cell epitope

3.6 Docking Study of B-Cell and T-Cell Epitopes

The web-based server PEP-FOLD 3 was used to generate 3D structures of the predicted T- and B-cell epitopes (“GLMEPAFTYV”, “LVCFSFSLVLILGLV” and “GKLVKFSEQQLVDC”) (Fig. 2). Five models were generated for each peptide, and the structure with the lowest energy was chosen for further analysis. The selected models were docked with ClusPro v2.0 to the MHC receptors and immunoglobulin E, respectively. The CD8 + T-cell peptide showed significant binding affinity to the MHC I molecule (Fig. 3B). The 3D structures of the best docked complexes are illustrated in Fig. 3.

Fig. 2 3D structures of the predicted T- and B-cell epitopes. A CD4 + T-cell epitope; B CD8 + T-cell epitope; C B-cell epitope

Fig. 3 A Peptide-allele docked complex of the predicted CD4 + T-cell epitope “LVCFSFSLVLILGLV” with respective frequent allele (energy weighted score − 725.0 kcal/mole); B Molecular docking of CD8 + T-cell epitope “GLMEPAFTYV” of Amb a 11 allergenic protein docked in HLA-A0201 (energy weighted score − 729.3 kcal/mole); C B-cell epitope “GKLVKFSEQQLVDC” is bound with immunoglobulin E “4poz” with lowest energy weighted score − 695.0 kcal/mole

3.7 Functional Validation of Predicted Epitopes

To validate the functional activity of the predicted epitopes, we tested whether T cells and antibodies from patients allergic to ragweed pollen recognize these epitopes. As can be seen in Fig. 4, a T-cell response was observed only for cells isolated from allergic patients and only against T-cell epitope 2 (LVCFSFSLVLILGLV), which is presented by MHC class II molecules. Stimulation of T cells with this peptide significantly increased the production of IL-2 in the group of patients with ragweed allergy compared to the healthy controls (Fig. 4). There was no response against the predicted T-cell epitope 1 (GLMEPAFTYV) or the B-cell epitope (GKLVKFSEQQLVDC).

Fig. 4 T-cell response to identified T- and B-cell epitopes in patients with ragweed pollen allergy (n = 10) and healthy individuals (n = 10). IL-2 production in the supernatants was assayed by ELISA after stimulation of the cells with the synthetic peptides for 72 h. T-cell epitope 1: GLMEPAFTYV; T-cell epitope 2: LVCFSFSLVLILGLV; B-cell epitope: GKLVKFSEQQLVDC. **p < 0.01 as determined by Kruskal–Wallis test. Results are presented as means ± SD (n = 10)

When we evaluated the sera collected from the same allergic and healthy subjects for presence of specific IgE antibodies, we detected in the allergic group IgE that specifically recognized the identified B-cell epitope (Fig. 5). Compared to the control group, levels of these antibodies were significantly higher.

Fig. 5 Serum levels of specific IgE antibodies to identified T- and B-cell epitopes in patients with ragweed pollen allergy (n = 10) and healthy individuals (n = 10). T-cell epitope 1: GLMEPAFTYV; T-cell epitope 2: LVCFSFSLVLILGLV; B-cell epitope: GKLVKFSEQQLVDC. ***p < 0.001 as determined by Kruskal–Wallis test

Our data showed that predicted T-cell epitope 2 (LVCFSFSLVLILGLV) and the B-cell epitope (GKLVKFSEQQLVDC) are functional. They can be recognized by T-cells and antibodies, respectively.

4 Discussion

Nowadays, the development of new vaccines relies largely on the design of peptide-based vaccines. Synthetic peptide-based vaccines are an alternative that can overcome the adverse effects of classical vaccines. Advances in immunoinformatics have made it a valuable tool for predicting the efficacy of peptide-based vaccines [44].

In the current study, potential T- and B-cell epitopes were evaluated as possible epitope-based peptide vaccines against the allergen Amb a 11. The results of the FAO/WHO Allergenicity Rules based on sequence homology showed that Amb a 11 shares homology with other major allergens belonging to the same protease family, such as Act d 1 from Actinidia deliciosa (37.0%), Ana c 2 from Ananas comosus (34.1%), and Der f 1/Der p 1 from Dermatophagoides farinae/Dermatophagoides pteronyssinus (26.0/24.4%). These results agree with those obtained by Bouley et al. [18]. Car p papain from Carica papaya (34.6%), Blo t 1 from Blomia tropicalis (34.6%), Eur m 1 from Euroglyphus maynei, and Gly m 1 from Glycine max were newly identified as homologs. Multiple sequence alignment with the other allergens was performed to identify the conserved sequences (Table 2). A phylogenetic tree was constructed to show the evolutionary relationship between the pollen allergen Amb a 11 and the 22 identified homologous allergens (Fig. 1). The phylogenetic analysis showed a close evolutionary relationship between the homologous allergens.

The binding between T-cell epitopes and MHC molecules is crucial for the cellular immune response. T-cell epitopes were selected for binding the highest number of alleles with the lowest IC50 values. The CD8 + T-cell peptide “GLMEPAFTYV” was found to interact with seven alleles (HLA-A*02:01, HLA-A*02:03, HLA-A*02:06, HLA-A*68:02, HLA-B*15:01, HLA-A*03:01, HLA-A*31:01), while the CD4 + T-cell peptide “LVCFSFSLVLILGLV” interacts with a large number of alleles, including HLA-DQA1*03:01/DQB1*02:01, HLA-DPA1*03:01/DPB1*04:02, HLA-DPA1*01/DPB1*04:01, HLA-DPA1*01:03/DPB1*02:01, and HLA-DPA1*02:01/DPB1*01:01.

B-cell epitope prediction is an essential step in the design of an epitope-based vaccine [45]. The prediction scores of Emini surface accessibility [35], Parker hydrophilicity [37], Karplus and Schulz flexibility [36], and Chou and Fasman beta turn [38] for each residue of the peptide “GKLVKFSEQQLVDC”, which is part of the amino acid sequence of Amb a 11, indicate that this peptide is the most probable B-cell epitope in this pollen allergen.

Population coverage is an important analysis step that reflects the suitability of an epitope for use in vaccine design [46]. The selected T-cell peptides were subjected to MHC I and MHC II based population coverage analysis for Europe and worldwide, as shown in Table 5. The CD4 + T-cell epitope “LVCFSFSLVLILGLV” was predicted to cover 90.56% of the population of Europe and 99.36% of the world population. The CD8 + T-cell epitope “GLMEPAFTYV” has a population coverage of 77.37% for Europe and 71.35% worldwide. It may therefore be concluded that these epitopes could cover a large proportion of the population when applied as a vaccine.

Our validation experiments indicate that T-cell epitope 2 (LVCFSFSLVLILGLV) could be used as a potential vaccine against the major allergen Amb a 11, because it can activate T cells, leading to the production of IgG antibodies against ragweed pollen (neutralizing antibodies), while at the same time it is not recognized by IgE antibodies. Thus, this peptide should not cause an allergic reaction.

In silico molecular docking was performed to explore the binding affinity between the selected peptides and their immune receptors (MHC-I and MHC-II molecules and IgE). HLA-A0201, HLA-DQA1*03:01/DQB1*02:01 and immunoglobulin E “4poz” were selected as the macromolecules for docking. Docking of the CD8 + T-cell epitope “GLMEPAFTYV” with HLA-A0201 showed a lowest binding energy of − 729.3 kcal/mole. The interaction between the CD4 + T-cell epitope “LVCFSFSLVLILGLV” and the MHC II allelic protein HLA-DQA1*03:01/DQB1*02:01, with a binding energy of − 725.0 kcal/mole, is shown in Fig. 3A. The B-cell epitope “GKLVKFSEQQLVDC” forms a stable complex with the IgE molecule (energy weighted score − 695.0) (Fig. 3C). The 3D structures of the best docked complexes are illustrated in Fig. 3.

5 Conclusion

At present, short ragweed (Ambrosia artemisiifolia L.) is of great concern in Europe because it is one of the most prominent invasive species. The development of allergy to short ragweed pollen is a major health problem. Therefore, in this study we designed an epitope-based peptide vaccine against Amb a 11 (a new major allergen belonging to the cysteine protease family) using an immunoinformatics approach. The proposed T- and B-cell epitopes (“GLMEPAFTYV”, “LVCFSFSLVLILGLV” and “GKLVKFSEQQLVDC”) showed promising predicted properties and can proceed to further experimental verification for vaccine design against the allergen Amb a 11. The MHC class II peptide (LVCFSFSLVLILGLV) is a promising candidate for a specific vaccine against the major allergen Amb a 11 because it can stimulate T cells from ragweed-allergic patients.