Immunoproteomics approach for EPC1 antigenic epitope prediction of G1 and G6 strains of Echinococcus granulosus
It is important to establish the diagnosis of cystic echinococcosis (CE) infection in order to begin control and management. At present, an accurate diagnosis of CE is difficult because it depends on an accurate test, which in turn requires sensitive and specific antigens. Using recombinant antigens could considerably improve the sensitivity and specificity of CE serology assays. Recently, a highly antigenic protein named EPC1 was isolated from Echinococcus granulosus protoscoleces and characterized. The current study was designed to assess the sequences of EPC1 isolated from different intermediate hosts of E. granulosus. In addition, a highly antigenic linear B cell epitope was identified within the EPC1 antigen candidate. The EPC1 sequence contains coding and non-coding regions and was compared between the two predominant strains (G1 and G6) in Iran. Sequence polymorphism was not found in the protein coding regions, suggesting that these regions may be useful for expressing the protein as an antigen. The average antigenic activity for the whole protein is above 1.1, and hydrophobicity below 0 indicates that it is hydrophilic. Structural analysis showed alpha helical regions in amino acids 6–25, 35–44, 52–62, and 72–78. Nine B cell epitope residues were identified out of 67 total residues. The identity of the EPC1 sequence in the G1 and G6 genotypes supports the antigenic efficacy of EPC1 and suggests that the recombinant protein will be useful in serological assays in regions where the two strains are prevalent.
Cystic echinococcosis (CE), caused by Echinococcus granulosus, is known to be one of the most important parasitic infections in livestock. It is an ancient zoonotic disease and a potentially life-threatening infection (Amin Pour et al. 2011; Addy et al. 2012; Taha 2012). Various studies have characterized the morphological and molecular features of E. granulosus isolated from sheep, goats, cattle, camels, buffalo, and humans in different parts of Iran (Zhang et al. 1998; Hosseini and Eslami 1998; Rajabloo et al. 2012; Amin Pour et al. 2011; Fasihi Harandi et al. 2002). The genotype G1 was found in sheep, goats, cattle, camels, and humans, whereas G6 was found only in camels, humans, and goats (Rajabloo et al. 2012; Zhang et al. 1998; Mehrabani et al. 1999; Dalimi et al. 2002; Fasihi Harandi et al. 2002; Jamali et al. 2004; Rostami Nejad et al. 2008; Karimi and Dianatpour 2008; Amin Pour et al. 2011; Sharbatkhori et al. 2010). Hydatid disease is endemic in Iran and, accordingly, physicians need to be aware of the clinical features, diagnosis, and management of this disease (Eslami and Hosseini 1998; Rokni 2009; Umhang et al. 2013). Cystic echinococcosis is one of the few parasitic infections in which the primary diagnosis is made by a serological test. Beginning in 1967, studies by Capron et al. on the antigenic composition of hydatid fluid led to the description of antigen 5 and started a new era in the specific serologic diagnosis of hydatidosis (Zarzosa et al. 1999). Several serological tests have been employed in the diagnosis of E. granulosus, including precipitation, agglutination, and labeled antibody assays (Zhang et al. 2003). A suspicious lesion must be diagnosed using two tests: a qualitative test [immunoelectrophoresis (IEP)] and a quantitative test [enzyme-linked immunosorbent assay (ELISA) or hemagglutination]. The most sensitive technique in use is the specific immunoglobulin G (IgG) ELISA test.
The serologic tests are also useful for postoperative follow-up.
Despite the development of sensitive and specific methods, the immunodiagnosis of CE remains a difficult task (Ortona et al. 2003; Siracusano and Bruschi 2006). The majority of the available screening tests can produce a high percentage of false-negative results (up to 25 %). False-positive results also occur with different assays and can be caused by co-infection with other cestodes or helminths when diagnosing human hydatidosis (Carmena et al. 2006). Since hydatid cyst fluid (HCF) contains various metabolites of both host and parasite origin, using HCF as an antigen reduces the specificity of the assay (Rahimi et al. 2011). It has been suggested that CE serology may be improved by using recombinant proteins. Recently, a highly antigenic protein named EPC1, encoded by the EPC1 gene, was isolated from the protoscolex (larval) stage and characterized (Li et al. 2004). Considering the differences in the cytochrome oxidase subunit 1 gene and some other locus sequences among different E. granulosus isolates, the performance of a diagnostic assay that uses this antigen might be affected. The current study was designed to assess the sequences of EPC1 isolated from different intermediate hosts of E. granulosus.
Bioinformatics analysis software was used to predict the structure and function of the Echinococcus protoscolex protein (EPC1). Predicting the antigenic regions of a protein supports a rational approach to expressing recombinant proteins that may elicit an appropriate antibody reaction. Previous studies have demonstrated a good correlation between predicted regions and previously determined antigenic regions (Welling et al. 1985).
In the present study, a highly antigenic linear B cell epitope was identified within the EPC1 antigen candidate and examined in different E. granulosus strains.
Materials and methods
Hydatid cysts were collected from the liver and lungs of 17 sheep, 1 head of cattle, and 4 camels slaughtered in a slaughterhouse in Iran. HCF was collected from the fertile cysts. Protoscoleces were washed three times with phosphate buffered saline (PBS) at pH 7.2 and centrifuged at 5,000×g for 5 min. The tubes were kept at −20 °C. Total genomic DNA was extracted from 50 μl of the E. granulosus protoscoleces following the manufacturer’s instructions (DNA extraction kit, MBST, Iran).
PCR amplification and sequence analysis
PCR amplification of the mitochondrial cytochrome c oxidase subunit I (COI) gene, used to determine the strains, was performed in 50 μl volumes containing 2 μl of DNA sample and 48 μl of reaction mixture. The reaction mixture contained 2 μmol of each primer (forward 5′-TTTTTggCCATCCTGAGGTTTAT-3′ and reverse 5′-TAACgACATAACATAATgAAAATg-3′), 1 unit of Taq DNA polymerase (5 U/μl, Fermentas), 3 mM MgCl2, 2 mM dNTP, 1X PCR buffer, and 38 μl of double distilled water. The PCR conditions were as follows: an initial denaturing step (95 °C for 5 min) followed by 40 cycles, each consisting of denaturation at 94 °C for 45 s, annealing at 56.6 °C for 45 s, and elongation at 72 °C for 45 s, and then a final extension at 72 °C for 10 min. For detection of the PCR amplicons, 8 μl of the PCR products was separated by 1.5 % agarose gel electrophoresis and stained with ethidium bromide. The PCR products were purified using a quick PCR product purification kit (MBST, Iran). Each PCR product was sequenced in both directions by Sanger’s method at Kowsar Biotech Company in Iran. The sequence chromatograms were analyzed using Chromas software version 3.1 and compared to those registered in GenBank using the Basic Local Alignment Search Tool (BLAST).
Primers used in this study
Choosing the correct open reading frame (ORF) is especially challenging for proteomic data, because the EPC1 entry in GenBank (accession number AF481884.1) is a partial sequence that lacks clear start and stop signals. The ORF is the target sequence in EPC1 for comparing different strains. The sequencing results for EPC1 in the present study contained both coding and non-coding sequences. The ORF was found using Vector NTI (version 11) software. This tool identifies all the ORFs using the standard or alternative genetic codes.
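The ORF search described above can be illustrated with a minimal sketch. This is not the Vector NTI implementation; it is a plain six-frame scan for ATG-to-stop ORFs under the standard genetic code, and the function names, the `min_codons` cutoff, and the example sequence are assumptions made here for illustration only. A partial sequence without a start or stop codon, such as the GenBank entry mentioned above, would need a more permissive search than this one.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(dna, min_codons=10):
    """Return (strand, frame, start, protein) for every ATG..stop ORF.

    `start` is the 0-based index of the ATG on the scanned strand.
    `min_codons` is a hypothetical length cutoff, not a value from the study.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and CODON_TABLE.get(codon) == "*":
                    protein = translate(seq[start:i])
                    if len(protein) >= min_codons:
                        orfs.append((strand, frame, start, protein))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Made-up toy sequence, not EPC1.
    for orf in find_orfs("CCATGAAATTTGGGTAACC", min_codons=1):
        print(orf)
```

Sorting the returned list by protein length is one simple way to pick a candidate coding region, although the longest ORF is not guaranteed to be the true one.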
Predicting antigenic propensity and solvent accessible regions
The antigenic activity of the EPC1 coding region was determined using an antigenic peptide site prediction tool (Fig. 5), which is based on the method of Kolaskar and Tongaonkar (1990). The raw amino acid sequence is entered into the program in FASTA format to predict the antigenic segments within the EPC1 calcium-binding protein. The reported accuracy of the method is about 75 %. The hydrophilicity algorithms used are based on a scale that delineates the hydrophobic character of a protein. Regions with values greater than 0 are hydrophilic and are thus likely to be exposed on the surface of a folded protein. Values under 0 are hydrophobic. The EPC1 protein sequence was analyzed in FASTA format to obtain plots that characterize its hydrophobic properties. Such plots can be useful in predicting membrane-spanning domains, potential antigenic sites, and regions likely to be exposed on the protein's surface (Fig. 6).
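A sliding-window hydrophilicity profile of the kind described above can be sketched as follows. The text does not name the scale that was used; the Hopp and Woods hydrophilicity values are assumed here because they follow the same sign convention (positive values are hydrophilic). The window length of 7 and the function names are likewise assumptions, and the peptide in the usage lines is made up rather than taken from EPC1.

```python
# Hopp-Woods hydrophilicity values (positive = hydrophilic).
# Assumption: the study does not state which scale was used.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein, window=7):
    """Average the scale over a sliding window centred on each residue.

    Returns one value per full window, so the result has
    len(protein) - window + 1 entries (empty if the protein is shorter).
    """
    scores = [HOPP_WOODS[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(protein, window=7):
    """Return 0-based centre positions of windows with a mean above 0."""
    half = window // 2
    profile = hydrophilicity_profile(protein, window)
    return [i + half for i, value in enumerate(profile) if value > 0]


if __name__ == "__main__":
    peptide = "MKDEERLLVVAAKKDS"  # hypothetical sequence for illustration
    print(hydrophilicity_profile(peptide))
    print(hydrophilic_windows(peptide))
```

Scales such as Kyte and Doolittle use the opposite sign convention (negative values are hydrophilic), so the threshold test has to be flipped if a hydropathy scale is substituted.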
EPC1 protein structure modeling
A predicted model of EPC1 was constructed using the SWISS-MODEL workspace (Arnold et al. 2006), a web-based protein structure homology-modeling server. The protein BLAST algorithm was run against the PDB database to search for homologous proteins. Subsets of proteins similar to EPC1 were found in the database and prepared for homology modeling. The predicted model of the EPC1 protein was built in SWISS-MODEL on a Cyprinus carpio template with 44 % sequence similarity and an E value of 5.9e−13 (Fig. 7). The UniProt database lists sequences from Taenia multiceps, Taenia taeniaeformis, and Taenia crassiceps, with 50 % sequence identity, in the immunogenic protein cluster. EPC1 and its homologues do not have any known structure in PDB.
B cell epitope prediction
B cell epitopes in the EPC1 structure were predicted using DiscoTope software. The residue contact number is the number of Cα atoms in the antigen within a distance of 10 Å of the residue’s Cα atom. A low contact number correlates with localization of the residue close to the surface or in protruding regions of the antigen's structure. The propensity score represents the tendency of a particular residue to be part of an epitope. It is calculated by sequentially averaging epitope log-odds ratios within a window of nine residues and then summing the scores based on proximity in the 3D structure of the antigen: for any given residue, the sequentially averaged log-odds scores from all residues within 10 Å are summed to give the propensity score. The DiscoTope score is calculated by combining the contact number with the propensity score. A DiscoTope score above the threshold value (−7.7) indicates a positive prediction, and a score below the threshold indicates a negative prediction.
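The scoring steps described above can be sketched in a few lines. This is an illustration of the idea, not the DiscoTope implementation: the per-residue log-odds values are treated as given inputs, the way the two terms are combined (propensity minus a weighted contact number, with a hypothetical weight `alpha`) is an assumption, and so are the choices to exclude the residue itself from its contact number and to shrink the averaging window at the chain ends. Only the 10 Å radius, the window of nine residues, and the −7.7 threshold come from the text.

```python
import math


def contact_numbers(ca_coords, radius=10.0):
    """Count the other C-alpha atoms within `radius` angstroms of each residue."""
    n = len(ca_coords)
    return [
        sum(
            1
            for j in range(n)
            if j != i and math.dist(ca_coords[i], ca_coords[j]) <= radius
        )
        for i in range(n)
    ]


def window_average(values, window=9):
    """Sequential average over a centred window, truncated at the chain ends."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


def propensity_scores(log_odds, ca_coords, radius=10.0, window=9):
    """Sum the window-averaged log-odds of all residues within `radius`."""
    smoothed = window_average(log_odds, window)
    n = len(ca_coords)
    return [
        sum(
            smoothed[j]
            for j in range(n)
            if math.dist(ca_coords[i], ca_coords[j]) <= radius
        )
        for i in range(n)
    ]


def combined_scores(log_odds, ca_coords, alpha=0.5):
    """Hypothetical combination: propensity minus alpha times contact number."""
    contacts = contact_numbers(ca_coords)
    propensity = propensity_scores(log_odds, ca_coords)
    return [p - alpha * c for p, c in zip(propensity, contacts)]


if __name__ == "__main__":
    # Three made-up residues on a line, 6 angstroms apart.
    coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
    scores = combined_scores([1.0, 1.0, 1.0], coords)
    print([score > -7.7 for score in scores])  # threshold quoted in the text
```

A buried residue accumulates a high contact number, which lowers its combined score, while an exposed residue surrounded by epitope-like neighbours scores higher.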
In the present study, COI-PCR was used to characterize E. granulosus DNA isolated from cysts recovered from animals. A fragment of approximately 440 bp was amplified from all of the DNA samples and then sequenced. The sequencing results were aligned with GenBank sequences. The results showed that all the sheep, goat, and cattle isolates were most similar to the sheep strain and all the camel isolates were most similar to the camel genotype.
The results of EPC1 PCR products
Previous studies have demonstrated the presence of two separate strains of E. granulosus in Iran, namely the sheep and camel strains. The results presented here agree with those studies and demonstrate that the sheep strain adapts to different hosts and is the predominant E. granulosus strain in Iran. Researchers continue to identify protein coding genes with antigenic functions (Thirugnanam et al. 2012). Detection of some parasitic infections, particularly in asymptomatic individuals, is often hampered by the lack of standard diagnostic tools (Ahmad et al. 2013). The objective of this research is to find E. granulosus-specific antigens and to use that information to develop more specific antigens for E. granulosus diagnosis. The present study assessed the EPC1 gene as an antigen candidate for E. granulosus serodiagnostic assays. The mitochondrial COI gene was used for the molecular studies; based on the sequence data, the sheep and camel isolates were identified as the G1 and G6 strains, respectively. The confirmed G1 and G6 strains were chosen for EPC1 sequence analysis. The EPC1 sequence contains coding and non-coding regions and was compared between the two strains of E. granulosus (G1 and G6). No changes were detected in the coding region between the G1 and G6 strains, and the EPC1 sequences were submitted to GenBank (accession numbers JF964264 and JN792187) at the nucleotide and amino acid levels. The 5′ non-coding region was strongly conserved in both strains, whereas the remaining non-coding regions were not. The nucleotide and amino acid sequences of the genes encoding the proteins were used for bioinformatic analysis (Thirugnanam et al. 2012).
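The strain comparison above amounts to checking two aligned sequences position by position. A minimal sketch is given below, assuming the two coding regions have already been aligned to the same length; the function names and the short example sequences are hypothetical and are not the deposited G1 and G6 sequences.

```python
def differing_positions(seq_a, seq_b):
    """Return 0-based positions where two equal-length aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions that carry the same character."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = len(differing_positions(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


if __name__ == "__main__":
    g1_like = "ATGGCTGAC"  # made-up fragments for illustration
    g6_like = "ATGGCTGAT"
    print(percent_identity(g1_like, g6_like))
    print(differing_positions(g1_like, g6_like))
```

An empty list from `differing_positions` corresponds to the case reported for the coding region, where no changes were detected between the two strains.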
The EPC1 nucleotide sequence contains coding and non-coding regions. Sequence polymorphism was not found in the protein coding regions, suggesting that these regions may be useful for expressing the protein as an antigen. The protein was introduced as an ideal antigen and provides sequence-specific and surface structural epitopes (an epitope is the small site on an antigen to which a complementary antibody may specifically bind). The epitope recognized by an antibody may depend on the presence of a specific three-dimensional antigenic conformation. The aim of this investigation was to apply bioinformatics methods to study B cell epitopes and other structural properties of EPC1. EPC1 isolated from adult and larval stages showed no variation in amino acid or nucleotide sequence. In this study, we determined structural information for EPC1, as there are no previous reports in the literature or in the Protein Data Bank. In a protein, antigenic sites lie in hydrophilic regions, and the accessibility and flexibility of these segments are high. This has led to rules that allow the position of B cell epitopes to be predicted from features of the sequence. For the prediction of the antigenic determinant sites of EPC1 in E. granulosus, comparison of the sequences from the two strains showed 100 % identity in the coding regions. The average antigenic activity for the whole protein is above 1.1 (a value above 1.0 is potentially antigenic). Hydrophobicity of EPC1 below 0 indicates that it is hydrophilic in nature. Structural analysis showed three B cell epitope regions in coils and two in α-helices (Fig. 7). Nine B cell epitope residues (amino acids) were identified out of 67 total residues, forming 5 B cell epitope regions (Table 2).
The identity of the EPC1 sequence in the G1 and G6 genotypes supports the antigenic efficacy of this protein in serological assays in regions where the two strains are prevalent. The EPC1 epitopes can be classified as conformational discontinuous epitopes, because the residues are distantly separated in the sequence. These findings suggest that EPC1 is an important antigenic protein.
This study was supported by grant number 88000408 from the Iran National Science Foundation. We would like to thank Mrs. Leigh Schulte for her valuable corrections to the manuscript.
Conflict of interest
The authors declare that they have no conflicts of interest.