Parasitology Research, Volume 112, Issue 9, pp 3129–3135

Immunoproteomics approach for EPC1 antigenic epitope prediction of G1 and G6 strains of Echinococcus granulosus


  • Fazeleh Etebar
    • Department of Parasitology, Faculty of Veterinary Medicine, University of Tehran
  • Fatemeh Jalousian
    • Department of Parasitology, Faculty of Veterinary Medicine, University of Tehran
    • Department of Parasitology, Faculty of Veterinary Medicine, University of Tehran
  • Somayeh Kordafshari
    • Department of Parasitology, Faculty of Veterinary Medicine, University of Tehran
  • Ali Najafi
    • Molecular Biology Research Center, Baqiyatallah University of Medical Sciences, Tehran
Original Paper

DOI: 10.1007/s00436-013-3489-x

Cite this article as:
Etebar, F., Jalousian, F., Hosseini, S.H. et al. Parasitol Res (2013) 112: 3129. doi:10.1007/s00436-013-3489-x


Establishing the diagnosis of cystic echinococcosis (CE) is essential before control management can begin. An accurate diagnosis of CE currently remains difficult because it depends on tests built on sensitive and specific antigens. Recombinant antigens could considerably improve the sensitivity and specificity of CE serological assays. Recently, a highly antigenic protein named EPC1 was isolated from Echinococcus granulosus protoscoleces and characterized. The current study was designed to assess the sequences of EPC1 isolated from different intermediate hosts of E. granulosus. In addition, a highly antigenic linear B cell epitope was identified within the EPC1 antigen candidate. The EPC1 sequence contains coding and non-coding regions and was compared between the two predominant strains (G1 and G6) in Iran. No sequence polymorphism was found in the protein-coding regions, suggesting that these regions may be useful for expressing the protein as an antigen. The average antigenic activity for the whole protein is above 1.1, and a hydrophobicity below 0 indicates that it is hydrophilic. Structural analysis showed alpha-helical regions at amino acids 6–25, 35–44, 52–62, and 72–78. Nine B cell epitope residues were identified out of 67 total residues. The identity of the EPC1 sequence in the G1 and G6 genotypes supports the antigenic efficacy of EPC1 and suggests that the recombinant protein will be useful in serological assays in regions where the two strains are prevalent.


Cystic echinococcosis (CE), caused by Echinococcus granulosus, is known to be one of the most important parasitic infections in livestock. It is an ancient zoonotic disease and a potentially life-threatening infection (Amin Pour et al. 2011; Addy et al. 2012; Taha 2012). Various studies have characterized the morphological and molecular features of E. granulosus isolated from sheep, goat, cattle, camel, buffalo, and humans in different parts of Iran (Zhang et al. 1998; Hosseini and Eslami, 1998; Rajabloo et al. 2012; Amin Pour et al. 2011; Fasihi Harandi et al. 2002). The genotype G1 was found in sheep, goat, cattle, camel, and humans, whereas G6 was only found in camel, human, and goats (Rajabloo et al. 2012; Zhang et al. 1998; Mehrabani et al. 1999; Dalimi et al. 2002; Fasihi Harandi et al. 2002; Jamali et al. 2004; Rostami Nejad et al. 2008; Karimi and Dianatpour 2008; Amin Pour et al. 2011; Sharbatkhori et al. 2010). Hydatid disease is endemic in Iran and, accordingly, physicians need to be aware of the clinical features, diagnosis, and management of this disease (Eslami and Hosseini 1998; Rokni 2009; Umhang et al. 2013). Cystic echinococcosis is one of the few parasitic infections in which the primary diagnosis is made by a serological test. Beginning in 1967, studies by Capron et al. on the antigenic composition of hydatid fluid led to the description of antigen 5 and started a new era in the specific serologic diagnosis of hydatidosis (Zarzosa et al. 1999). Several serological tests have been employed in the diagnosis of E. granulosus, including precipitation, agglutination, and labeled antibody assays (Zhang et al. 2003). A suspicious lesion must be diagnosed using two tests: a qualitative test [immunoelectrophoresis (IEP)] and a quantitative test [enzyme-linked immunosorbent assay (ELISA) or hemagglutination]. The most sensitive technique used is the specific immunoglobulin G (IgG) ELISA test. The serologic tests are also useful for postoperative follow-up.

Despite the development of sensitive and specific methods, the immunodiagnosis of CE and echinococcosis remains a difficult task (Ortona et al. 2003; Siracusano and Bruschi 2006). The majority of the available screening tests can produce a high percentage of false-negative results (up to 25 %). False-positive results also occur with different assays and, when diagnosing human hydatidosis, can be caused by co-infection with other cestodes or helminths (Carmena et al. 2006). Since hydatid cyst fluid (HCF) contains various metabolites of both host and parasite origin, using HCF as an antigen reduces the specificity of the assay (Rahimi et al. 2011). It has been suggested that CE serology may be improved by using recombinant proteins. Recently, a highly antigenic protein named EPC1, encoded by the EPC1 gene, was isolated from the protoscolex (larval) stage and characterized (Li et al. 2004). Given the differences in the cytochrome oxidase subunit 1 gene and some other locus sequences among E. granulosus isolates, the performance of a diagnostic assay that uses this antigen might be affected. The current study was designed to assess the sequences of EPC1 isolated from different intermediate hosts of E. granulosus.

Bioinformatics analysis software was used to predict the structure and function of the Echinococcus protoscolex protein (EPC1). Predicting the antigenic regions of a protein supports a rational approach to expressing recombinant proteins that may elicit an appropriate antibody reaction. Previous studies have demonstrated a good correlation between predicted regions and previously determined antigenic regions (Welling et al. 1985).

The present study describes the identification of a highly antigenic linear B cell epitope within the EPC1 antigen candidate in different E. granulosus strains.

Materials and methods

Hydatid cysts were collected from the liver and lungs of 17 sheep, 1 cattle, and 4 camels slaughtered in a slaughterhouse in Iran. HCF was collected from the fertile cysts. Protoscoleces were washed three times with phosphate buffered saline (PBS) at pH 7.2 and centrifuged at 5,000×g for 5 min. The tubes were kept at −20 °C. Total genomic DNA was extracted from 50 μl of E. granulosus protoscoleces following the manufacturer's instructions (DNA extraction kit, MBST, Iran).

PCR amplification and sequence analysis

To determine the strains, PCR amplification of the mitochondrial cytochrome C oxidase subunit I (COI) gene was performed in 50 μl volumes containing 2 μl DNA sample and 48 μl reaction mixture. The reaction mixture contained 2 μmol of each primer (forward 5′-TTTTTggCCATCCTGAGGTTTAT-3′ and reverse 5′-TAACgACATAACATAATgAAAATg-3′), 1 unit of Taq DNA polymerase (5 U/μl, Fermentas), 3 mM MgCl2, 2 mM dNTP, 1× PCR buffer, and 38 μl double distilled water. The PCR conditions were as follows: an initial denaturing step (95 °C for 5 min), followed by 40 cycles of denaturation at 94 °C for 45 s, annealing at 56.6 °C for 45 s, and elongation at 72 °C for 45 s, and then a final extension at 72 °C for 10 min. For detection of the PCR amplicons, 8 μl of the PCR products was separated by 1.5 % agarose gel electrophoresis and stained with ethidium bromide. The PCR products were purified using a quick PCR product purification kit (MBST, Iran). Each PCR product was sequenced in both directions by Sanger's method at Kowsar Biotech Company, Iran. The sequence chromatograms were analyzed using Chromas software version 3.1 and compared with those registered in GenBank using the Basic Local Alignment Search Tool (BLAST).
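As a quick sanity check on the COI primers quoted above, their length, GC content, and a rough melting temperature can be computed directly from the sequences. The sketch below is illustrative only: the function names are hypothetical, the lowercase bases in the published primers are treated as ordinary bases, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse approximation intended for short oligonucleotides, not the method the authors used to choose the 56.6 °C annealing temperature.

```python
# Primer sequences as printed in the text (case-insensitive handling below).
FORWARD = "TTTTTggCCATCCTGAGGTTTAT"
REVERSE = "TAACgACATAACATAATgAAAATg"


def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (assumption)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, len(primer), "nt",
              f"GC={gc_content(primer):.1f}%",
              f"Tm~{wallace_tm(primer)} C")
```

More accurate melting temperatures would come from a nearest-neighbor model that accounts for salt and primer concentration; the simple rule is used here only because it can be verified by hand.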

After the strains were confirmed, PCR amplification of the EPC1 gene was performed in sheep and camel strains following the method described above, except that an annealing temperature of 42 °C was used for the ETF-ETR primers and 56 °C for the ETF-KR primers (Table 1).
Table 1

Primers used in this study

Primer name    Primer sequence

Choosing the correct open reading frame is especially challenging for proteomic data, because the EPC1 entry in GenBank (accession number AF481884.1) is a partial sequence that lacks clear start and stop signals. The open reading frame is the target sequence in EPC1 for comparing different strains. The EPC1 sequencing results in the present study contained both coding and non-coding sequences. The open reading frame (ORF) was found using Vector NTI (version 11) software. This tool identifies all the ORFs using the standard or alternative genetic codes.
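The ORF search described above can be illustrated with a minimal six-frame scan. This is a sketch under stated assumptions, not a reproduction of Vector NTI: it uses the standard genetic code's ATG start codon and the three standard stop codons, the function names are hypothetical, and the example input is a toy sequence rather than EPC1. For a partial sequence without clear start and stop signals, a stop-to-stop scan would be the more appropriate variant.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 2):
    """Scan all six reading frames for ATG...stop ORFs.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy = "CCATGAAATAGGG"  # hypothetical sequence, NOT EPC1
    for orf in find_orfs(toy):
        print(orf)
```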

Predicting antigenic propensity and solvent accessible regions

The antigenic activity of the EPC1 coding region was determined using an antigenic peptide site prediction tool (Fig. 5) based on the method of Kolaskar and Tongaonkar (1990). The raw amino acid sequence is entered in FASTA format, and the program predicts those segments within the calcium-binding protein EPC1 that are likely to be antigenic. The reported accuracy of the method is about 75 %. Hydrophobicity was assessed with algorithms based on a scale that describes the hydrophobic character of a protein. Regions with values below 0 are hydrophilic and are thus likely to be exposed on the surface of a folded protein, whereas regions with values above 0 are hydrophobic. The EPC1 protein sequence in FASTA format was analyzed to obtain plots that characterize its hydrophobic properties. Such plots are useful for predicting membrane-spanning domains, potential antigenic sites, and regions likely to be exposed on the protein's surface (Fig. 6).
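A hydrophobicity plot of the kind referred to here is typically a sliding-window average of per-residue scale values. The sketch below assumes the Kyte-Doolittle scale, which the source does not name; on that scale negative window averages indicate hydrophilic stretches, matching the convention used for Fig. 6. The window size, function names, and example peptide are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def mean_hydropathy(sequence: str) -> float:
    """Return the average hydropathy over the whole sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    toy = "MKDEELLKAFRVIDKDG"  # hypothetical peptide, NOT the EPC1 sequence
    for pos, value in enumerate(hydropathy_profile(toy), start=1):
        label = "hydrophilic" if value < 0 else "hydrophobic"
        print(pos, round(value, 2), label)
```

A whole-protein average below 0 from `mean_hydropathy` would correspond to the "hydrophilic in nature" reading reported for EPC1, provided the same scale and sign convention are used.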

EPC1 protein structure modeling

A predicted model of EPC1 was constructed using the SWISS-MODEL workspace (Arnold et al. 2006), a web-based protein structure homology-modeling server. The protein BLAST algorithm was run against the PDB database to search for homologous proteins. Subsets of proteins similar to EPC1 were found in the database and were prepared for homology modeling. The EPC1 model was built on a Cyprinus carpio template with 44 % sequence similarity and an E value of 5.9e−13 (Fig. 7). The UniProt database groups EPC1, at 50 % sequence identity, with proteins of Taenia multiceps, Taenia taeniaeformis, and Taenia crassiceps in the immunogenic protein cluster. Neither EPC1 nor these homologues have a known structure in the PDB.

B cell epitope prediction

Since most, if not all, antigenic sites are located within the surface-exposed regions of a protein, B cell epitopes are often predicted with bioinformatics tools. B cell epitopes were predicted using DiscoTope, a web-based tool for structure-based prediction of antibody epitopes (Anderson et al. 2006; Table 2). DiscoTope is a method for predicting discontinuous epitopes from 3D structures of proteins in PDB format. Swiss-PdbViewer was used for rendering and mapping the predicted epitopes onto the 3D structure (Guex and Peitsch 1997).
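The two structural quantities reported alongside the DiscoTope predictions, the contact number (Cα atoms within 10 Å of a residue's Cα atom) and the propensity score (window-averaged log-odds ratios summed over residues within 10 Å), can be sketched as follows. This is a simplified illustration, not the DiscoTope implementation: the log-odds values are placeholders, the residue itself is excluded from its own contact number but included in its propensity sum, averaging windows are truncated at the chain termini, and the final combination into a DiscoTope score is deliberately omitted. All of these choices are assumptions.

```python
import math


def contact_numbers(ca_coords, cutoff=10.0):
    """Count, for each residue, the other C-alpha atoms within `cutoff` angstroms."""
    return [
        sum(1 for j, b in enumerate(ca_coords)
            if j != i and math.dist(a, b) < cutoff)
        for i, a in enumerate(ca_coords)
    ]


def window_average(values, window=9):
    """Average each value with its sequence neighbors (window truncated at ends)."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        segment = values[max(0, i - half):i + half + 1]
        averaged.append(sum(segment) / len(segment))
    return averaged


def propensity_scores(ca_coords, log_odds, cutoff=10.0, window=9):
    """Sum window-averaged log-odds over all residues within `cutoff` angstroms."""
    smoothed = window_average(log_odds, window)
    return [
        sum(s for b, s in zip(ca_coords, smoothed) if math.dist(a, b) < cutoff)
        for a in ca_coords
    ]


if __name__ == "__main__":
    # Hypothetical C-alpha coordinates on a line, 6 angstroms apart.
    coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0), (18.0, 0.0, 0.0)]
    placeholder_log_odds = [1.0, 1.0, 1.0, 1.0]  # not real epitope statistics
    print(contact_numbers(coords))
    print(propensity_scores(coords, placeholder_log_odds))
```

In a real analysis the coordinates would be read from the homology model in PDB format, and residues with a low contact number and high propensity would be the surface-exposed epitope candidates.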
Table 2

B cell epitope prediction in the EPC1 structure using DiscoTope software. The residue contact number is the number of Cα atoms in the antigen within a distance of 10 Å of the residue's Cα atom. A low contact number correlates with localization of the residue close to the surface or in protruding regions of the antigen's structure. The propensity score represents the probability or tendency of that particular residue being part of an epitope. It is calculated by sequentially averaging epitope log-odds ratios within a window of nine residues; for any given residue, the sequentially averaged log-odds scores from all residues within 10 Å in the 3D structure of the antigen are then summed to give the propensity score. The DiscoTope score is calculated by combining the contact number with the propensity score. DiscoTope scores above the threshold value (−7.7) indicate positive predictions, and scores below the threshold value indicate negative predictions

Residue ID    Residue name    Contact number    Propensity score    DiscoTope score

Region 1    Coiled region

Region 2    Coiled region

Region 3    α-Helices region

Region 4    α-Helices region

Region 5    Coiled region


In the present study, COI-PCR was used to characterize E. granulosus DNA isolated from cysts recovered from the animal hosts. A fragment of approximately 440 bp was amplified from all DNA samples and then sequenced. The sequencing results were aligned with GenBank sequences. All the sheep, goat, and cattle isolates were most similar to the sheep strain, and all the camel isolates were most similar to the camel genotype.

The results of EPC1 PCR products

A partial sequence of EPC1 in the sheep strain was amplified using specific primers, and the PCR products, approximately 500 bp, were sequenced (Fig. 1) and submitted to GenBank under accession number JF964264. A 228-bp region of EPC1 in the sheep strain was amplified using the kF-kR primer pair and sequenced (Fig. 2). A region of approximately 313 bp of EPC1 in the camel strain was amplified using specific primers and a gradient thermal cycler program, and then sequenced (Fig. 3). The sequencing results were submitted to GenBank under accession number JN792187.
Fig. 1

Gel electrophoresis of the 500-bp PCR product of the EPC1 gene amplified with the ETF-ETR primer pair. From left to right: lanes 1 and 2, E. granulosus G1 strains; lane 3, 100-bp DNA marker; lanes 1–4, E. granulosus G1 strain samples (known G1 genotype based on COI sequence); lane 5, no template control; lane 6, negative control
Fig. 2

Gel electrophoresis of the 228-bp PCR product of the EPC1 gene amplified with the kF-kR primer pair. From left to right: 100-bp DNA marker; lane 1, negative control; lane 2, E. granulosus G1 strain sample showing the 228-bp segment of the EPC1 gene; lane 3, no template control
Fig. 3

Gel electrophoresis of the 313-bp PCR product of the EPC1 gene amplified with the ETF-kR primer pair. From left to right: 100-bp DNA marker; lanes 1, 2, and 3, weak bands; lane 4, E. granulosus G6 strain sample showing the 313-bp segment of the EPC1 gene; lane 5, no template control; lane 6, negative control

Comparison of the EPC1 sequencing results in sheep and camel strains showed no differences, especially in the coding sequence (Fig. 4). B cell epitopes were predicted by hydrophobicity, residue accessibility, and antigenic activity analysis. Results of the B cell epitope prediction are shown in Table 2 and Fig. 5. Overall, 9 out of 67 residues (amino acids) were identified as B cell epitope residues (Table 2). This analysis identified five regions of EPC1 as B cell epitopes (Fig. 6). Three of these regions are predicted to lie in coiled regions and two in α-helices (Fig. 7). Based on our predictive rankings, epitope sequences 28–30 and 65–67 were chosen because of their relative immunogenic potential (Fig. 5). These two sequences have a coil structure and are accessible on the surface of the predicted protein (Fig. 7).
Fig. 4

BLAST alignment map comparing the EPC1 gene sequences of two strains of E. granulosus (G1 and G6). The alignment shows no changes in the coding region between the G1 and G6 strains
Fig. 5

Antigenic determinant plot. The X-axis shows the sequence number and the Y-axis shows the DiscoTope score. The EPC1 sequence is 76 residues long and contains four antigenic sites, which are displayed in green
Fig. 6

Hydrophobicity plot of the EPC1 protein of E. granulosus. The sequence (68–76) lies above 0, indicating that it is hydrophobic in nature. The sequences (11–28, 42–57, and 59–68) lie below 0, indicating that they are hydrophilic in nature
Fig. 7

Structure of EPC1, with the predicted B cell epitopes shown in yellow. The EPC1 structure shows two epitope residues in the helical region and seven residues in the coil region
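The G1 versus G6 comparison summarized in Fig. 4 comes down to counting matching columns in a pairwise alignment. The sketch below shows that calculation on already-aligned sequences; the function names are hypothetical, gap columns are excluded from the denominator by assumption, and the example strings are toy sequences, not the deposited EPC1 records.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical positions over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(x == y for x, y in columns) / len(columns)


def variant_positions(aligned_a: str, aligned_b: str):
    """Return 1-based alignment columns where the two sequences differ."""
    return [i + 1
            for i, (x, y) in enumerate(zip(aligned_a.upper(), aligned_b.upper()))
            if x != y]


if __name__ == "__main__":
    g1_toy = "ATGGCTGAAC"  # hypothetical, NOT the G1 EPC1 sequence
    g6_toy = "ATGGCTGAAC"  # hypothetical, NOT the G6 EPC1 sequence
    print(percent_identity(g1_toy, g6_toy), variant_positions(g1_toy, g6_toy))
```

Identical coding regions give 100 % identity and an empty list of variant positions, which is the situation the alignment in Fig. 4 reports for the two strains.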


Previous studies have demonstrated the presence of two separate strains of E. granulosus in Iran, namely the sheep and camel strains. The results presented here agree with those studies and demonstrate that the sheep strain adapts to different hosts and is the predominant E. granulosus strain in Iran. Researchers continue to identify protein-coding genes with antigenic functions (Thirugnanam et al. 2012). Detection of some parasitic infections, particularly in asymptomatic individuals, is often hampered by the lack of standard diagnostic tools (Ahmad et al. 2013). The objective of this research is to find E. granulosus-specific antigens and to use this information to develop more specific antigens for E. granulosus diagnosis. The present study assessed the EPC1 gene as an antigen candidate for E. granulosus serodiagnostic assays. For the current molecular studies, the mitochondrial COI gene was used; based on the sequence data, sheep and camel isolates were identified as the G1 and G6 strains, respectively. The confirmed G1 and G6 strains were chosen for EPC1 sequence analysis. The EPC1 sequence contains coding and non-coding regions and was compared between two strains of E. granulosus (G1 and G6). No changes were detected in the coding region of the G1 and G6 strains at either the nucleotide or the amino acid level, and the EPC1 sequences were submitted to GenBank (accession numbers JF964264 and JN792187). The coding region was strongly conserved in both strains, whereas the non-coding regions were not. The nucleotide and amino acid sequences of the genes encoding the proteins were used for bioinformatic analysis (Thirugnanam et al. 2012).

The EPC1 nucleotide sequence contains coding and non-coding regions. No sequence polymorphism was found in the protein-coding regions, suggesting that these regions may be useful for expressing the protein as an antigen. The protein has been introduced as an ideal antigen and provides sequence-specific and surface structural epitopes (an epitope is the small site on an antigen to which a complementary antibody may specifically bind). The epitope recognized by an antibody may depend on the presence of a specific three-dimensional antigenic conformation. The aim of this investigation was to apply bioinformatics methods to study B cell epitopes and other structural properties of EPC1. EPC1 isolated from adult and larval stages showed no variation in amino acid or nucleotide sequences. In this study, we have determined structural information for EPC1, as there are no previous reports in the literature or in the Protein Data Bank. In a protein, antigenic sites lie in hydrophilic regions, and the accessibility and flexibility of these segments are high. These observations have led to rules that allow the position of B cell epitopes to be predicted from features of the sequence. For the prediction of the antigenic determinant sites of EPC1 in E. granulosus, comparison of the sequences from the two strains showed 100 % identity in the coding regions. The average antigenic activity for the whole protein is above 1.1 (a value of more than 1.0 is potentially antigenic). The hydrophobicity of EPC1 is below 0, indicating that it is hydrophilic in nature. Structural analysis showed three B cell epitope regions in coiled regions and two in α-helices (Fig. 7). Nine B cell epitope residues (amino acids) were identified out of 67 total residues, grouped into five B cell epitope regions (Table 2).


The identity of the EPC1 sequence in the G1 and G6 genotypes supports the antigenic efficacy of this protein in serological assays in regions where the two strains are prevalent. The EPC1 epitopes can be classified as conformational discontinuous epitopes, because their residues are distantly separated in the sequence. These findings introduce EPC1 as an important antigenic protein.


This study was supported by grant number 88000408 from the Iran National Science Foundation. We would like to thank Mrs. Leigh Schulte for her valuable corrections to the manuscript.

Conflict of interest

The authors declare that they have no conflicts of interest.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013