Introduction

Amoebiasis, caused by infection with the protozoan Entamoeba histolytica, is one of the most important parasitic diseases in humans. Approximately 50 million people suffer from amoebic colitis (AC) and extra-intestinal abscesses, resulting in 100,000 deaths annually (Walsh 1986). E. histolytica infections have different clinical outcomes. Most infections remain asymptomatic, some progress to diarrhoea and dysentery, and only a few lead to extra-intestinal complications, such as liver abscess (Tanyuksel and Petri 2003) and cerebral abscess (Maldonado-Barrera et al. 2012).

The determinants of E. histolytica infection outcomes remain largely unknown, although host genetics and parasite genotype possibly play important roles. To investigate this relationship, a simple, sensitive and reliable method for strain identification is required. A few polymerase chain reaction (PCR)-based DNA typing methods have been reported over the years for E. histolytica; these methods use repetitive elements contained within both protein-coding genes and noncoding DNAs (Ali et al. 2005; Clark and Diamond 1993; Clark 2006; Zaki and Clark 2001). Genetic polymorphism in E. histolytica isolates is well established; in addition, several polymorphic markers, including the serine-rich E. histolytica protein (SREHP), which encodes an immunodominant surface antigen, have been investigated by several scholars (Ali et al. 2005; Ayeh-Kumi et al. 2001; Fu et al. 2010; Ghosh et al. 2000; Haghighi et al. 2002, 2003; Rivera et al. 2006; Zaki et al. 2002). In addition, other authors have recently investigated the relationship between parasite genotypes and the clinical outcome of infection using Bangladeshi clinical samples and a six-locus genotyping system based on tRNA-linked short tandem repeats (STRs) (Ali et al. 2005, 2007, 2008a, b; Haghighi et al. 2002; Tawari et al. 2008).

The use of a single polymorphic locus has not been sufficient to distinguish E. histolytica genotypes; more than one locus should therefore be used for strain typing. In the present study, genotyping systems based on the nucleotide sequences of both protein-coding genes and noncoding STR loci were combined to genotype a panel of clinical samples isolated from Chinese individuals who were either asymptomatic or had intestinal or extra-intestinal disease. The current study is the first to report detailed genome information on E. histolytica in China. The results suggest that the combined genotyping method provides more accurate and comprehensive information than previous methods. In addition, the genomes of the Chinese clinical isolates differ from one another, and the parasite genome may play a role in determining the outcome of E. histolytica infections.

Materials and methods

Clinical samples and cultivation

Five Chinese E. histolytica samples were included in the present study: three amoebic liver abscess (ALA) cases, one combined case and one asymptomatic case. The three ALA patients were labelled WU, SH and XLALA, the combined case was labelled SH2, and the asymptomatic patient was labelled XLAC. WU acquired ALA in Africa (Ding et al. 2010) and had two giant abscesses involving both liver lobes. SH was an AIDS patient with ALA who acquired amoebiasis in Shanghai, and SH2 was also an AIDS patient, with both ALA and AC. XLALA and XLAC were brothers from the Guangxi countryside; XLALA had ALA, whereas his brother was asymptomatic. Liver abscesses were routinely aspirated, and the aspirated pus was sent to the laboratory with the patients’ serum for diagnostic purposes. E. histolytica infections were defined by indirect fluorescent antibody test.

Isolation and culture conditions

The faecal sample containing Entamoeba cysts obtained from XLAC was suspended in water for 24 h to remove Blastocystis spp. and then cultured in modified Tanabe–Chiba medium (Yoshimura and Ebukuro 1988) consisting of an agar slant and an upper liquid medium in a screw cap test tube at 37 °C. The slant contained 1 % agar in Ringer’s solution supplemented with 0.1 % L-asparagine, whereas the liquid medium contained phosphate-buffered saline (pH 7.6) supplemented with one-eighth the volume of horse serum. A microspatula of rice powder was added before use. Grown trophozoites of the Entamoeba were cultured in Robinson’s medium at 37 °C (Robinson 1968). The trophozoites were treated with a cocktail of antibiotics and then cultured monoxenically with live Crithidia fasciculata in BI-S-33 medium supplemented with 15 % adult bovine serum at 37 °C (Diamond et al. 1978). Finally, the trophozoites of the strain were cultured axenically in BI-S-33 medium and then cloned through limiting dilution.

PCR analysis

Genomic DNA was extracted from the axenic cultures, abscess samples (ALA patients) or faecal samples using a QIAamp DNeasy Kit (Qiagen). The genomic DNA was subjected to PCR to amplify partial 18S ribosomal RNA genes, internal transcribed spacers (ITS) 1–2, the complete 18S and 5.8S ribosomal RNA genes (RD51 and RD31) and SREHP genes. The STR fragments were amplified using six E. histolytica-specific tRNA-linked STR primers (DA-H, AL-H, NK2-H, RR-H, SQ-H and STGAD-H) (Ali et al. 2005; Ghosh et al. 2000; Tachibana et al. 2007; Zaki et al. 2002). The primers and PCR conditions for E. histolytica were based on previously described procedures (Table 1). Briefly, PCR was performed in a 50-μL reaction mixture using TaKaRa Ex-taq® DNA Polymerase (Takara). A total of 35 PCR cycles were performed as follows: denaturation at 94 °C for 15 s, annealing for 30 s (temperatures are shown in Table 1) and polymerization at 72 °C for 30 s (3 min for the complete 18S and 5.8S ribosomal RNA genes). An initial 3-min denaturation step at 94 °C and a final 7-min polymerization step at 72 °C were also included. Nested PCR was used to amplify the SREHP gene of the XLALA and SH2 strains. Briefly, a first set of primers (EhS+5 and EhS+3), which amplifies a 650-bp fragment of the SREHP gene, was used for the initial PCR, followed by a second set of primers (SREHP-S and SREHP-AS) located within the fragment amplified by EhS+5 and EhS+3, yielding a 550-bp fragment.
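As a rough illustration of the cycling programme described above, the following minimal sketch adds up the hold times of each step. It assumes that ramp times between temperatures are negligible, which is a simplification that depends on the thermocycler; the function and variable names are illustrative only.

```python
# Hold times taken from the cycling conditions stated in the text.
# Assumption: ramping between temperatures is ignored.
INITIAL_DENATURATION_S = 3 * 60   # 94 °C, 3 min
FINAL_EXTENSION_S = 7 * 60        # 72 °C, 7 min
CYCLES = 35
DENATURATION_S = 15               # 94 °C per cycle
ANNEALING_S = 30                  # temperature depends on the primer pair


def program_seconds(extension_s: int) -> int:
    """Total hold time of the programme in seconds, excluding ramping."""
    per_cycle = DENATURATION_S + ANNEALING_S + extension_s
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


standard = program_seconds(30)           # 30 s extension for most targets
long_target = program_seconds(3 * 60)    # 3 min extension for complete 18S and 5.8S rRNA genes

print(f"standard programme: {standard / 60:.2f} min")
print(f"long-target programme: {long_target / 60:.2f} min")
```

Under these assumptions, the standard programme takes just under one hour of hold time, and the programme for the complete 18S and 5.8S ribosomal RNA genes takes well over two hours.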

Table 1 Primers used in this study

Sequencing and analysis of genes

The PCR products of 18S ribosomal RNA genes, ITS 1–2 and the complete 18S and 5.8S ribosomal RNA genes were subjected to direct sequencing after purification using a QIAquick PCR purification kit (Qiagen). Meanwhile, the PCR products of the SREHP genes and STR fragments were processed using a pMD®20-T Vector cloning kit for sequencing (TaKaRa). Four to six clones of each gene were sequenced using a BigDye Terminator v3.1 Cycle sequencing kit (Applied Biosystems). The reactions were run on an ABI Prism 3100 Genetic Analyser (Applied Biosystems). Sequence data were analysed using ClustalX Ver1.83.

Results and discussion

Comparison of the ribosomal RNA sequences

The XLAC strain was successfully axenized and cloned. A region of DNA containing the complete 18S and 5.8S ribosomal RNA genes was amplified via PCR and directly sequenced. For the ALA patients WU, SH and XLALA, DNA was extracted directly from the abscess samples, and regions containing the 18S ribosomal RNA genes and the ITS1, 5.8S and ITS2 region were amplified by PCR and directly sequenced. The nucleotide sequences indicated that all the isolates were E. histolytica. The sequences of the complete 18S and 5.8S ribosomal RNA genes from XLAC were identical to those of the HM1:IMSS strain. No differences were found among WU, SH and XLALA in the 18S ribosomal RNA genes or in the internal transcribed spacer 1 and 2 regions, and these sequences were identical to those of the HM1:IMSS and HK9 strains.

Comparison of SREHP gene

Sequence analysis of the SREHP genes confirmed the presence of seven sequence types in the five isolates (Fig. 1). The nucleotide sequences from XLAC, XLALA, WU, SH and SH2 were not found in the DNA databases and were therefore deposited in the DDBJ/EMBL/GenBank databases under accession numbers AB685788-AB685793, AB688711 and AB688712. The XLAC strain contained two sequence types: one was identical to phenotype XI (TM83 strain isolated from Thailand), whereas the other had an extra short repeat unit, SN4. The sequence types of the XLALA strain were identical to those of the XLAC strain. WU contained three sequence types. WU1 and WU2 differed by two nucleotide substitutions but showed no differences in amino acid sequence. Meanwhile, WU1 and WU3 had short repeat unit (SN1) substitutions and were close to the SHR 10 strain isolated from Iran. Two unique sequence units found in WU2, namely SN5 and SN6, had single nucleotide substitutions compared with SN1 and SN2, respectively. The SH strain had only one sequence type, which was close to a few isolates reported in Japan. The SH2 strain contained two sequence types: one was identical to that of the SH strain, and the other differed from those of the other strains (Fig. 1a). Lastly, no differences in amino acid sequences were found among SN1, SN2, SN5 and SN6 (Fig. 1b).
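The tally of seven sequence types in five isolates can be retraced with a small bookkeeping sketch. The labels WU1, WU2 and WU3 come from the text; all other labels (XL-a, XL-b, SH-a, SH2-b) are hypothetical placeholders introduced here for illustration and are not the names used in Fig. 1. The sharing pattern encodes only what is stated above: XLALA matches XLAC, and one of the two SH2 types matches SH.

```python
# SREHP sequence types per isolate, encoded from the relationships in the text.
# Placeholder labels (hypothetical): XL-a, XL-b, SH-a, SH2-b.
srehp_types = {
    "XLAC": {"XL-a", "XL-b"},        # two types, one carrying the extra SN4 unit
    "XLALA": {"XL-a", "XL-b"},       # described as identical to XLAC
    "WU": {"WU1", "WU2", "WU3"},     # three types
    "SH": {"SH-a"},                  # one type
    "SH2": {"SH-a", "SH2-b"},        # one shared with SH, one distinct
}

# Distinct sequence types across all isolates.
distinct_types = set().union(*srehp_types.values())

# Isolates in which more than one sequence type was detected.
mixed_isolates = sorted(name for name, types in srehp_types.items() if len(types) > 1)

print(len(distinct_types))
print(mixed_isolates)
```

Representing each isolate as a set of types keeps isolates with more than one sequence type explicit, which matters later when the outcome of infection is discussed.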

Fig. 1

A schematic representation of polymorphism in the repeat-containing region of the SREHP gene in the current study. The nucleotide sequence patterns are shown. Each unit, defined by its nucleotide and deduced amino acid sequence, is tentatively given a number. The nucleotide and deduced amino acid sequences of these units are also shown

STR polymorphisms in the nucleotide sequences

Previous research showed that isolates with identical PCR size-based STR types can display distinct nucleotide sequences. Because nucleotide sequence-based differentiation of STR types is essential for high-resolution typing of clinical isolates, we examined the tRNA-linked STRs of four Chinese E. histolytica samples via direct sequencing. Newly identified sequences have been deposited in the GenBank/EMBL/DDBJ databases under accession numbers AB685794-AB685812 and AB688704-AB688710.

A genotype was assigned by combining the STR sequence types obtained from six STR loci and the SREHP sequence types, and a total of four genotypes were identified (Table 2 and Fig. 2). Previously reported STRs (Ali et al. 2008a, b; Escueta-de Cadiz et al. 2010; Tawari et al. 2008) were named according to their nomenclature, whereas newly identified sequence types and genotypes were assigned with alphanumerical codes beginning with the letter “C” to indicate their Chinese origin.

Table 2 Genotypes of Chinese E. histolytica samples using STR markers
Fig. 2

A schematic representation of the short tandem repeat (STR) types at each locus, based on the nucleotide sequences of the isolates in the current study. tRNA genes and STRs are depicted as arrows and rectangles, respectively, whereas non-tRNA and non-STR regions are shown as lines. The schematic diagrams and sequence types follow Ali et al. (2008a, b), Escueta-de Cadiz et al. (2010) and Tawari et al. (2010)

The STRs amplified from the samples revealed five STR variations in the A-L locus, four each in the N-K2 and R-R loci, and three each in the D-A, STGA-D and S-Q loci (Fig. 2). The STRs of XLAC were identical to those of XLALA, and both contained two sequence types in the A-L locus. The A-L, N-K2 and R-R loci differed among SH, SH2, WU and XLAC. The STGA-D locus was identical among WU, SH2 and XLAC, although SH2 contained two sequence types. The D-A locus was identical between SH and SH2, and the S-Q locus was identical between WU and XLAC.
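The per-locus counts and the grouping of samples into genotypes can be sketched as follows. All type labels (d1, a1, and so on) are hypothetical placeholders and are not the names used in Table 2 or Fig. 2; the profiles encode only the relationships described in the text, under the assumptions that the second SH2 type lies in the STGA-D locus and that it differs from the SH type. Only the six STR loci are used here; the SREHP sequence types would be added as a further key in the same way.

```python
from collections import defaultdict

LOCI = ("D-A", "A-L", "N-K2", "R-R", "S-Q", "STGA-D")

# Hypothetical placeholder labels; identical labels mean identical sequence types.
str_types = {
    "XLAC": {"D-A": {"d1"}, "A-L": {"a1", "a2"}, "N-K2": {"n1"},
             "R-R": {"r1"}, "S-Q": {"s1"}, "STGA-D": {"g1"}},
    "WU":   {"D-A": {"d2"}, "A-L": {"a3"}, "N-K2": {"n2"},
             "R-R": {"r2"}, "S-Q": {"s1"}, "STGA-D": {"g1"}},
    "SH":   {"D-A": {"d3"}, "A-L": {"a4"}, "N-K2": {"n3"},
             "R-R": {"r3"}, "S-Q": {"s2"}, "STGA-D": {"g2"}},
    "SH2":  {"D-A": {"d3"}, "A-L": {"a5"}, "N-K2": {"n4"},
             "R-R": {"r4"}, "S-Q": {"s3"}, "STGA-D": {"g1", "g3"}},
}
str_types["XLALA"] = str_types["XLAC"]  # described as identical to XLAC


def genotype_key(profile):
    """Combine the sequence types of all loci into one hashable key."""
    return tuple(frozenset(profile[locus]) for locus in LOCI)


# Number of distinct sequence types observed at each locus.
variants_per_locus = {
    locus: len(set().union(*(p[locus] for p in str_types.values())))
    for locus in LOCI
}

# Samples sharing the same combined key belong to the same genotype.
groups = defaultdict(list)
for sample, profile in str_types.items():
    groups[genotype_key(profile)].append(sample)

genotypes = sorted(sorted(members) for members in groups.values())
print(variants_per_locus)
print(genotypes)
```

With these placeholder profiles, the sketch yields five variants in A-L, four each in N-K2 and R-R, three each in D-A, S-Q and STGA-D, and four genotypes, with XLAC and XLALA grouped together.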

The nucleotide sequences of the 18S rDNA and of the 5.8S rDNA with ITS 1 and ITS 2 from XLAC, XLALA, WU, SH and SH2 indicated that all the isolates were E. histolytica. These regions are often used in taxonomic studies of the genus Entamoeba (Santos et al. 2010), and no other Entamoeba species were detected in the present study.

Many SREHP genes of E. histolytica have been sequenced for genotyping (Ayeh-Kumi et al. 2001; Fu et al. 2010; Ghosh et al. 2000; Haghighi et al. 2002, 2003; Rivera et al. 2006; Zaki et al. 2002). Most of the E. histolytica sequence types reported to date are for the SREHP gene. The SREHP genes of E. histolytica have been categorized into 37 sequence types from 79 strains, based on the combination pattern of the units that make up the polymorphic regions. In the present study, XLAC contained two sequence types: one was identical to that of the TM83 strain isolated from Thailand, and the other had an extra short repeat unit. WU contained three sequence types close to that of the SHR 10 strain isolated from Iran. These results illustrate the diversity of the SREHP gene.

Previous studies compared the genotypes of E. histolytica identified in stool with those in the liver abscess pus of the same patients (Ali et al. 2008a, b). To detect genetic differences, Ali et al. (2005) used six highly polymorphic loci in E. histolytica. These loci may not be directly involved in virulence, but they may act as surrogate markers that are physically linked to loci with a direct effect on the infection outcome. Previous studies did not examine the same phenomenon when comparing invasive and intestinal E. histolytica. In the present study, however, the XLAC strain, isolated from an asymptomatic patient, had the same genotype as the strain from XLALA, an ALA patient. This result may be attributed to STR sequence types that are common to an individual group. In addition, the XLAC strain contained more than one sequence type, which may explain the different outcome of the infection.

SH2 was an AIDS patient with both ALA and AC. We extracted genomic DNA of E. histolytica from both the abscess and faecal samples and amplified the tRNA-linked STR loci from both DNA preparations. The sequencing results showed that all six loci were identical between the two samples, which suggests that a single E. histolytica strain led to different outcomes of infection. This may also be attributed to the SH2 strain containing more than one sequence type.

Many E. histolytica infections in China have been reported recently, but little genome information has been provided. In the current paper, we used a combined genotyping system based on the nucleotide sequences of both protein-coding genes and noncoding STR loci to genotype four Chinese E. histolytica samples. The present study is the first to report detailed genome information on E. histolytica in China. The findings may help in understanding the epidemiology and pathogenicity of amoebiasis. Two urban cases, one imported case and two rural cases were included in the current study. When the genotyping system was applied to these samples, the two brothers were found to have different infection outcomes despite carrying the same E. histolytica genotype, whereas in the SH2 case a single E. histolytica strain led to different outcomes of infection in one patient. The results suggest that the genome information of E. histolytica strains in China differs in some respects from that in foreign reports, and that the parasite genome may play a role in determining the outcome of E. histolytica infections.

Only a small number of isolates have been studied in China. Therefore, more E. histolytica strains from China and nearby areas need to be examined with accurate methods to collect sufficient information. The method described in the present study may provide the tools necessary to investigate the role of parasite genotypes in E. histolytica infection outcomes and to address other unanswered questions regarding the epidemiology of this parasite.