Background

Echinostomatid trematodes comprise a group of at least 60 species [1], some of which are of socioeconomic significance in animals. Hypoderaeum conoideum (Bloch, 1782) is an important member of the family. This echinostomatid was originally found in the intestines of birds and is known to infect chickens, ducks and geese in many countries around the world [2-4]. It has also been found to infect humans and cause echinostomiasis in Thailand [5,6]. Freshwater snails, Planorbis corneus, Indoplanorbis exustus, Lymnaea stagnalis, L. limosa, L. ovata and L. rubiginosa, act as first intermediate hosts and shed the cercariae; bivalves, fishes or tadpoles can act as second intermediate hosts [3,5].

The accurate identification of species and genetic variants of Hypoderaeum conoideum will be central to investigating its biology, epidemiology and ecology, and also has implications for the diagnosis of infections. Although morphological features are used to identify this and other trematodes, such characters are not always reliable [7]. Due to these constraints, various molecular methods have been established for specific identification [7]. For instance, PCR-based techniques using genetic markers in nuclear ribosomal (r) and mitochondrial (mt) DNA have been widely used [7]. The sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2 = ITS) of nuclear rDNA have been particularly useful for specific identification, based on consistent levels of sequence difference between species and little variation within individual species [7], while the mitochondrial gene cox1 has been used for studying genetic variation and relationships among different species [8-10]. As a basis for the development of molecular tools to study H. conoideum populations (irrespective of developmental stage), we have characterized the complete mt genome of this parasite, compared this genome with those of selected trematodes and undertaken a phylogenetic analysis of concatenated amino acid sequence data for 12 protein-coding genes to assess the genetic relationship of H. conoideum with these other trematodes.

Methods

Parasites and DNA isolation

H. conoideum adults were collected from the intestine of a naturally infected free-range duck in Hubei province, China, in accordance with the Animal Ethics Procedures and Guidelines of Huazhong Agricultural University. These worms were washed in physiological saline and identified morphologically according to existing descriptions [11]. A reference specimen was stained and mounted [12], and the remaining specimens were fixed in 70% (v/v) ethanol and stored at −20°C until use [8]. Total genomic DNA was extracted from one specimen using the E.Z.N.A.® Tissue DNA Kit. To confirm the identity of this specimen, the ITS-2 region was amplified and sequenced [13]; the sequence was identical to a reference sequence available for H. conoideum (GenBank accession no. KJ944311.1).

Amplification and sequencing of partial cox1, cox3, nad4, nad5 and rrnS

Initially, ten oligonucleotide primers (Table 1) were designed to regions of the mt genome of Fasciola hepatica [14], in order to amplify short fragments from the cox1, cox3, nad4, nad5 and small subunit ribosomal RNA (rrnS) genes. PCR (25 μl) was performed in 10 mM Tris–HCl (pH 8.4), 50 mM KCl, 4 mM MgCl2, 200 mM each of dNTP, 50 pmol of each primer, 2 U Taq polymerase (Takara) and 2.5 μl genomic DNA or H2O (no-DNA control) in a thermocycler (Biometra) under the following conditions: an initial denaturation at 94°C for 5 min, followed by 30 cycles of 94°C/1 min, 47–50°C/30 s (depending on primer pair) and 72°C/1 min, followed by a final extension at 72°C for 7 min. Amplicons were sent to Sangon Company (Shanghai, China) for sequencing, using the same forward and reverse primers (separately) as used in PCR.

Table 1 Sequences of primers used to amplify fragments from Hypoderaeum conoideum

Long-PCR amplification and sequencing

Ten additional primers (see Table 1) were then designed from the sequences obtained and used to amplify five regions (see Table 1) from genomic DNA (~40-80 ng) by long-PCR. PCRs (25 μl) were performed in a reaction buffer containing 2 mM MgCl2, 1× LA Taq Buffer II, 0.4 mM dNTP mixture, 0.8 μM of each primer, 2.5 U LA Taq polymerase (Takara) and 2.5 μl of genomic DNA or H2O (no-DNA control) for 35 cycles of 94°C/30 s (denaturation), 50°C/30 s (annealing) and 72°C for 1 min per kb (extension). Amplicons were cloned into the pGEM-T-Easy vector (Promega, USA) according to the manufacturer’s protocol; inserts were amplified by long-range PCR (employing vector primers M13 and M14) and then sequenced using a primer-walking strategy [15].

Sequence analyses

Sequences were assembled using the program ContigExpress (Invitrogen, Carlsbad, CA) and aligned against the mt genome sequences of other available trematodes (including F. hepatica) using the program Clustal X v.1.83 [16] to infer gene boundaries. Open reading frames (ORFs) were identified using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html), employing the flatworm mitochondrial genetic code. Translation initiation and termination codons were identified as described previously [14,17,18]. The secondary structures of the 22 tRNA genes were predicted using tRNAscan-SE and/or manual adjustment [9,19]. The two rRNA genes were identified by comparison with those from the mt genome of F. hepatica [14]. Amino acid sequences of the protein-coding genes were inferred using the flatworm mt code and aligned using the program MUSCLE [20], employing default settings.
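The conceptual translation step can be illustrated with a minimal Python sketch. It assumes that the "flatworm mitochondrial genetic code" corresponds to NCBI translation table 9 (echinoderm and flatworm mitochondrial code); the function name translate_flatworm_mt is hypothetical and is not part of any program used in this study.

```python
# Sketch only: translate an in-frame DNA sequence with NCBI translation
# table 9 (assumed here to be the flatworm mt code referred to in the text).
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order under table 9. Relative to the
# standard code: TGA = Trp, AAA = Asn, AGA and AGG = Ser.
TABLE9_AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), TABLE9_AAS)
}


def translate_flatworm_mt(dna: str) -> str:
    """Translate codon by codon and stop at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons containing ambiguous bases are reported as X.
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Toy sequence, not taken from the H. conoideum genome.
    print(translate_flatworm_mt("ATGAAATGAAGATAG"))
```

Note that this sketch translates every codon literally, so an initial GTG is reported as V; how initiation codons are represented is left to the user.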

Sliding window analysis of nucleotide variation

Sequence variability between H. conoideum and F. hepatica was assessed by sliding window analysis using the software DnaSP v.5 [21]. The sliding window analysis was implemented as described previously [22].
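For two aligned sequences, nucleotide diversity within a window reduces to the proportion of compared sites that differ. The following Python sketch illustrates this calculation; it is not the DnaSP implementation. The default window of 300 bp and step of 10 bp are taken from the legend of Figure 2, whereas the function name and the decision to skip alignment columns containing gaps are assumptions made for illustration.

```python
# Sketch only: per-window pairwise nucleotide diversity for two aligned
# sequences. Gap handling is an assumption and may differ from DnaSP.
def sliding_window_diversity(seq_a, seq_b, window=300, step=10):
    """Return a list of (window midpoint, pi) tuples."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    a, b = seq_a.upper(), seq_b.upper()
    results = []
    for start in range(0, len(a) - window + 1, step):
        compared = 0
        differing = 0
        for x, y in zip(a[start:start + window], b[start:start + window]):
            if x == "-" or y == "-":
                continue  # skip columns with a gap in either sequence
            compared += 1
            if x != y:
                differing += 1
        pi = differing / compared if compared else float("nan")
        results.append((start + window // 2, pi))
    return results


if __name__ == "__main__":
    # Toy alignment with a small window so that the output is easy to check.
    for midpoint, pi in sliding_window_diversity("AAAA", "AATA", window=2, step=1):
        print(midpoint, pi)
```

Plotting pi against the window midpoint gives a profile of the kind shown in Figure 2.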

Phylogenetic analysis

Amino acid sequences conceptually translated from individual genes of the mt genome of H. conoideum were concatenated and aligned with those from available mt genomes of trematodes, including Clonorchis sinensis (NC_012147) [14,23], Fasciola gigantica (NC_024025) [22], F. hepatica (NC_002546) [14], Opisthorchis felineus (NC_011127) [23], Schistosoma haematobium (NC_008074) [24], Schistosoma japonicum (AF215860) [14], Schistosoma mekongi (NC_002529) [18] and Schistosoma spindale (NC_008067) [24], and with that of the cestode Taenia solium (NC_004022.1) [25], which was used as the outgroup. The phylogenetic analysis was conducted using the neighbour-joining (NJ) method employing the Tamura-Nei model [20]. Confidence limits for the NJ tree were assessed using a bootstrap procedure with 1000 pseudo-replicates; all other settings were the default values in MEGA v.6.0 [20]. In addition, maximum parsimony (MP), Bayesian (MB) and maximum likelihood (ML) analyses were implemented as described previously by other workers [20,26,27].
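The concatenation and bootstrap steps can be sketched in Python as follows. This is an illustration of the general technique (joining per-gene alignments end to end, then resampling alignment columns with replacement), not the MEGA implementation; the function names and the dictionary-based data layout are hypothetical.

```python
# Sketch only: concatenate per-gene alignments and draw bootstrap
# pseudo-replicates by resampling columns with replacement.
import random


def concatenate(alignments):
    """Join alignments (dicts of taxon -> aligned sequence) end to end."""
    taxa = list(alignments[0])
    for aln in alignments:
        if set(aln) != set(taxa):
            raise ValueError("every gene alignment must contain the same taxa")
    return {taxon: "".join(aln[taxon] for aln in alignments) for taxon in taxa}


def bootstrap_replicate(alignment, rng):
    """Return one pseudo-replicate with the same taxa and alignment length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(seq[c] for c in columns)
        for taxon, seq in alignment.items()
    }


if __name__ == "__main__":
    # Toy amino acid alignments for two genes and three taxa.
    gene1 = {"taxonA": "MKV", "taxonB": "MRV", "taxonC": "MKI"}
    gene2 = {"taxonA": "LLF", "taxonB": "LMF", "taxonC": "LLY"}
    supermatrix = concatenate([gene1, gene2])
    rng = random.Random(1)  # fixed seed so that the run is repeatable
    replicates = [bootstrap_replicate(supermatrix, rng) for _ in range(1000)]
    print(len(replicates), len(replicates[0]["taxonA"]))
```

A tree would then be inferred from each pseudo-replicate, and nodal support reported as the percentage of replicate trees that recover a given clade.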

Results

Features of the mt genome of H. conoideum

The circular mt genome of H. conoideum (GenBank accession no. KM_111525) is 14,180 bp in size. It includes 22 tRNA genes, two rRNA genes (rrnS and rrnL), 12 protein-coding genes (cox1-3, nad1-6, nad4L, cytb and atp6) and a non-coding region, but lacks an atp8 gene. All genes are transcribed in the same direction (Figure 1), which is consistent with other trematodes, such as F. hepatica [14], O. felineus [22] and S. haematobium [24]. The arrangement of the protein-coding genes is cox3-cytb-nad4L-nad4-atp6-nad2-nad1-nad3-cox1-cox2-nad6-nad5, which is in accordance with that of F. hepatica [14], O. felineus [22], S. japonicum [14] and S. mekongi [18], but different from that of S. haematobium and S. spindale [24].

Figure 1

Organisation of genes in the mitochondrial genome of Hypoderaeum conoideum.

Overlapping nucleotides between the mt genes of H. conoideum ranged from 1 to 40 bp (Table 2), which is similar to other trematodes, such as F. hepatica [14] and O. felineus [22]. The mt genome of H. conoideum has 26 intergenic spacers, each ranging from 1 to 34 bp in length (Table 2). The nucleotide composition of the mt genome is 18.92% (A), 11.71% (C), 42.46% (T) and 26.91% (G). The A + T content of the protein-coding and rRNA genes ranged from 59.65% (rrnS) to 68.63% (nad3) (Table 3), and the overall A + T content of the mt genome is 61.4%.
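The overall A + T content follows directly from the base composition: 18.92% (A) + 42.46% (T) = 61.38%, which rounds to the 61.4% reported above. A minimal Python sketch of such a calculation is given below; the function name is hypothetical, and ignoring characters other than A, C, G and T is an assumption.

```python
# Sketch only: percentage base composition and A+T content of a sequence.
from collections import Counter


def base_composition(seq):
    """Return percentages of A, C, G and T, plus the combined A+T content."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    percentages = {base: 100.0 * counts[base] / total for base in "ACGT"}
    percentages["A+T"] = percentages["A"] + percentages["T"]
    return percentages


if __name__ == "__main__":
    # Toy sequence, not taken from the H. conoideum genome.
    print(base_composition("AATTGC"))
    # Check of the values reported in the text.
    print(round(18.92 + 42.46, 2))
```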

Table 2 The organization of the mitochondrial genome of Hypoderaeum conoideum
Table 3 Nucleotide contents of genes and the non-coding region within the mitochondrial genome of Hypoderaeum conoideum

Protein-coding genes

The H. conoideum mt genome has 12 protein-coding genes: nad5, cox1, nad4, cytb, nad1, cox3, nad2, cox2, atp6, nad6, nad3 and nad4L. For these genes, the initiation codon is ATG (seven of the 12 genes) or GTG (five genes) (Table 2), which is in agreement with other digeneans [14,28]. The termination codon is TAG (seven genes) or TAA (five genes). The most frequently used codon is TTT (Phe), with a frequency of 7.96%, followed by GTT (Val: 5.99%), TGT (Cys: 4.63%), TTG (Leu: 4.30%) and TTA (Leu: 4.00%) (Table 4). The least used codons are GCC (Ala: 0.34%), CAC (His: 0.32%) and CGC (Arg: 0.11%).
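Codon frequencies of the kind summarized in Table 4 can be obtained by counting in-frame triplets across all protein-coding genes. The Python sketch below illustrates this; the function name is hypothetical, and whether termination codons were included in the totals for Table 4 is not stated, so this sketch simply counts every complete triplet it is given.

```python
# Sketch only: percentage usage of each codon across in-frame coding sequences.
from collections import Counter


def codon_usage(coding_sequences):
    """Return {codon: percentage}, ordered from most to least frequent."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        # Count complete triplets only; a trailing partial codon is ignored.
        for i in range(0, len(cds) - 2, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: 100.0 * n / total for codon, n in counts.most_common()}


if __name__ == "__main__":
    # Toy coding sequences, not taken from the H. conoideum genome.
    usage = codon_usage(["ATGTTTTTTTAG", "GTGTTTGTTTAA"])
    for codon, percentage in usage.items():
        print(codon, round(percentage, 2))
```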

Table 4 Codon usage for 12 protein-coding genes in the mitochondrial genome of Hypoderaeum conoideum

Transfer RNA and ribosomal RNA genes, and non-coding regions

The H. conoideum mt genome encodes 22 tRNAs, all of which have a typical cloverleaf structure. The 22 tRNA genes range from 60 bp to 75 bp in length (Table 2). There are intergenic and overlapping nucleotides between adjacent tRNA genes (Table 2). The rrnS and rrnL genes are 751 bp and 979 bp in length, respectively (Table 2). The rrnS gene is located between tRNA-Cys and cox2, and rrnL is located between tRNA-Thr and tRNA-Cys, which is the same as in other trematodes. In contrast to some other trematodes that have two AT-rich regions, such as F. hepatica and F. gigantica [14,23], O. felineus [22] and S. haematobium [24], the mt genome of H. conoideum has only one AT-rich region (348 bp), which is located between tRNA-Glu and cox3 (Figure 1 and Table 2) and has an A + T content of 60.19% (Table 3).

A comparison of nucleotide variability between H. conoideum and F. hepatica

A sliding window analysis of the complete mt genomes of H. conoideum and F. hepatica provided estimates of nucleotide diversity Pi (π) for the 12 protein-coding genes (Figure 2). Sequence variability was highest within atp6 and lowest within nad5. Overall, the most conserved protein-coding genes were cox1, nad2 and nad5, and the least conserved were atp6 and nad3.

Figure 2

Sliding window analysis of complete mt genome sequences of Fasciola hepatica and Hypoderaeum conoideum. The black line indicates nucleotide diversity in a window of 300 bp (10 bp steps). Gene regions (grey) and boundaries are indicated.

Phylogenetic relationships

We used concatenated amino acid sequence data representing 12 mt protein-coding genes of H. conoideum, eight other digeneans (C. sinensis, F. gigantica, F. hepatica, O. felineus, S. haematobium, S. japonicum, S. mekongi and S. spindale) and one tapeworm (T. solium) for a selective analysis of genetic relationships (Figure 3). The tree reveals two large clades with strong support (100%): one contains four members representing two families (Fasciolidae and Opisthorchiidae) and H. conoideum; the other clade contains four members of the Schistosomatidae. In the present analysis, H. conoideum had a relatively close genetic relationship with F. hepatica and other members of the Fasciolidae, followed by Opisthorchiidae, and then the Schistosomatidae. There was no difference in tree topology using the ML, MB and MP methods of analysis (not shown).

Figure 3

Phylogenetic relationship of Hypoderaeum conoideum with selected trematodes, based on neighbor-joining analysis of concatenated amino acid sequence data representing 12 protein-coding genes, using Taenia solium as an outgroup. Nodal support values are indicated (%); the bar indicates the number of amino acid substitutions per site.

Discussion

The present characterization of the mt genome of H. conoideum provides a basis for addressing questions regarding the biology, epidemiology and population genetics of Hypoderaeum spp. It will also support taxonomic studies of Hypoderaeum spp. from other hosts (e.g., chickens, ducks, geese and humans) and assist in tracking life cycles by identifying larval stages in different intermediate hosts using molecular tools.

Assisted by sliding window analysis, PCR primers could be selectively designed to regions conserved among different trematode species and flanking variable regions in the mt genome that are informative (based on sequencing from a small number of individuals from particular populations). PCR-coupled single-strand conformation polymorphism (SSCP) analysis [29] could then be employed to screen large numbers of individuals representing different populations and, based on such an analysis, samples representing all detectable genetic variability could be selected for subsequent sequencing and analyses. Such an approach has been applied to study the genetic make-up of the blood fluke S. japonicum from seven provinces in China [30,31].

Now that the H. conoideum mt genome is available, it would be interesting to undertake a comprehensive study of this morphospecies from various host species from different countries by integrating morphological data with PCR-based genetic analyses of adult worms and larval stages (from intermediate hosts) to begin to understand the epidemiology and ecology of H. conoideum. In addition to conducting targeted mt genetic analyses, it would also be useful to include analyses of sequence variability in the two internal transcribed spacers (ITS-1 and ITS-2) and the 18S and 28S genes of nuclear ribosomal DNA, because these markers usually allow the specific identification of trematodes. Importantly, although H. conoideum is recognized as a species, it is possible that cryptic species exist within this taxon. This proposal could be tested using the mt markers defined here, together with ITS-1 and/or ITS-2.

Conclusions

Our analysis showed that H. conoideum is genetically more closely related to F. hepatica than to the other trematodes examined. The mt genome of H. conoideum should be useful as a resource for comparative mt genomic studies of trematodes and as a source of DNA markers for systematic, population genetic and epidemiological studies of H. conoideum and its congeners.