Background

Echinococcosis is a zoonotic disease caused by tapeworm parasites belonging to the Echinococcus genus. There are two main types of echinococcosis: Cystic Echinococcosis (CE) caused by Echinococcus granulosus sensu lato and Alveolar Echinococcosis (AE) caused by E. multilocularis. Additionally, polycystic echinococcosis (caused by E. vogeli and E. oligarthra) occurs predominantly in South America [1]. CE has a worldwide geographical distribution [2]. Echinococcosis disrupts the economies of many countries, affecting approximately 2–3 million people [3]. The estimated human burden of CE was below 1 million disability-adjusted life years (DALYs), but may increase above this figure. Annual losses caused by CE might reach 20 million US dollars [4]. The disease has a prevalence of about 1/100,000 in developed countries and can reach 200/100,000 in rural populations having close contact with domestic dogs [3].

Most species of Echinococcus inhabit domestic and wild mammals. The definitive hosts include both domesticated dogs and wild carnivore species (foxes, wolves and coyotes). Humans and livestock act as intermediate hosts. Livestock animals are the intermediate hosts for E. granulosus s.s., while wild small mammals serve for E. multilocularis. Humans acquire infection with CE by accidental ingestion of parasite eggs in contaminated food and water, or by direct interaction with the definitive hosts [5]. The eggs hatch in the small intestine, and the developing parasite larvae can then spread to any other organ; however, they preferentially settle in the liver, where the parasite forms hydatid cysts [6]. Molecular genotyping has shown that members of E. granulosus s.l. include E. granulosus s.s. (G1-G3), E. equinus (G4), E. ortleppi (G5), E. canadensis (G6/7, and G8–10) and E. felidis [2].

There are limited data about CE in Pakistan, whereas the incidence of the disease is high in neighboring countries such as Iran, India, and China, for which published data are available for both prevalence and genotyping. Limited research has been conducted on echinococcosis in the past decade in Pakistan [7]. Previous investigations of Pakistani isolates showed the occurrence of E. granulosus s.s. (G1-G3) in cattle, buffalo and sheep, while in humans, E. granulosus s.s. (G1-G3) and E. canadensis (G6/7) were detected based on cox1 gene sequences [8,9,10]. Interestingly, E. multilocularis was also reported in cattle from Pakistan [10, 11]. In the past, E. granulosus s.s. was reported in livestock (e.g., cattle), while E. granulosus s.s. and the former G6 genotype were reported in humans in Khyber Pakhtunkhwa (KPK) province of Pakistan by using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis without confirmation by gene sequencing [11].

Therefore, in the current study, hydatid cyst samples collected from human patients with echinococcosis and stored as formalin-fixed paraffin-embedded (FFPE) tissues were analyzed by sequencing the mitochondrial cytochrome c oxidase subunit 1 (cox1), cytochrome b (cytb), and NADH subunit 1 (nad1) genes, to investigate the possible genetic diversity in the hydatid cyst samples in Pakistan.

Methods

Geography of the study area

Punjab is one of the largest provinces by population, with fertile agricultural land and deserts in the southern part near the border with Rajasthan and near the Sulaiman Range. Punjab comprises parts of the Cholistan and Thal deserts. It has extreme weather, with foggy and wet winters. The average temperature increases from mid-February, with springtime weather continuing until mid-April, when the summer heat sets in. June and July are very hot months.

Collection of samples

The formalin-fixed, paraffin-embedded hydatid cyst samples were collected from the Pathology Departments of the contributing hospitals from 2012 to 2017. All patients were confirmed as being infected with echinococcosis (CE, n = 37; AE, n = 1) by histopathological investigation after surgery (detection of Periodic Acid-Schiff (PAS)-positive laminated layers, and/or protoscoleces and/or hooklets). Patient data, including age, sex, and epidemiological history, were recorded.

Molecular analysis

Genomic DNA extraction

Genomic DNA (gDNA) was isolated from individual formalin-fixed paraffin-embedded (FFPE) tissues. Sections of 10–15 μm thickness were taken from each cyst by using a standard microtome (Leica SM2000 R Sliding Microtome, Wetzlar, Germany) with disposable DNA-RNA free blades. Equipment was autoclaved or sanitized before use. Paraffin was removed by incubation in 1 ml of xylene for 10 min at 37 °C. The supernatant was discarded after centrifugation at 12000×g for 5 min. Samples were rehydrated in descending ethanol concentrations; excess ethanol was evaporated at room temperature. A genomic DNA isolation kit (TRANS Easy Pure FFPE Tissue Genomic DNA Kit, Code: EE191–01; Transgen biotech, Beijing, China) was used to extract total gDNA according to the manufacturer’s protocol, with a few modifications. Briefly, the tissue samples were digested at 56 °C overnight in lysis buffer (400 μl), and then the gDNA was extracted. Sterile distilled water (100 μl) was used to resuspend the pellet. The gDNA samples were stored at −20 °C until further use [12].

PCR amplification and sequencing

The mitochondrial genes (cox1, nad1, and cytb) were amplified from the isolated gDNA as previously described [12, 13]. Amplification of the cox1 (446 bp) and cytb genes (580 bp) was carried out using a thermocycler with the following PCR conditions: denaturation at 94 °C for 30 s, annealing at 54 °C for 30 s and extension at 72 °C for 60 s, for 35 cycles. Amplification of the nad1 gene (900 bp) was performed with the following PCR conditions: denaturation at 95 °C for 60 s, annealing at 50 °C for 50 s and extension at 72 °C for 70 s, for 30 cycles [14]. All PCR amplifications were performed with a negative control comprising sterile distilled water instead of the DNA template. The PCR products were visualized using a gel doc system after separation through a 1.5% agarose gel. The PCR products of all positive samples were subjected to sequence analysis.
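The two cycling programs described above can be written down as data, which makes it easy to compare them and to estimate run length. The following is a minimal sketch: the class and field names are illustrative assumptions, and the totals count only the stated hold times (no initial denaturation, final extension, or ramping, none of which are specified in the text).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """One three-step PCR program (hypothetical container, not from the paper)."""
    target: str
    denature_c: int   # denaturation temperature, °C
    denature_s: int   # denaturation hold, seconds
    anneal_c: int
    anneal_s: int
    extend_c: int
    extend_s: int
    cycles: int

    def seconds_per_cycle(self) -> int:
        # Sum of the three hold times within a single cycle.
        return self.denature_s + self.anneal_s + self.extend_s

    def total_minutes(self) -> float:
        # Hold times only; instrument ramping is not included.
        return self.seconds_per_cycle() * self.cycles / 60


# Values taken from the conditions stated in the text.
PROGRAMS = [
    CyclingProgram("cox1/cytb", 94, 30, 54, 30, 72, 60, 35),
    CyclingProgram("nad1", 95, 60, 50, 50, 72, 70, 30),
]

for p in PROGRAMS:
    print(f"{p.target}: {p.seconds_per_cycle()} s per cycle, "
          f"{p.total_minutes():.0f} min of hold time")
```

Under these assumptions the cox1/cytb program amounts to 120 s per cycle (70 min over 35 cycles) and the nad1 program to 180 s per cycle (90 min over 30 cycles).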

Phylogenetic analysis

Construction of the phylogenetic tree, multiple sequence alignment, and unidirectional DNA sequence analysis were performed using MEGA X [15]. A maximum composite likelihood (MCL) strategy was applied to construct the initial trees, using a heuristic search with the BioNJ algorithms and neighbor-joining approach. The topology with the superior log-likelihood value was selected [70]. The reference sequences that were used as outgroups in the phylogeny and in tree construction are shown in Table 1.

Table 1 Provenance and GenBank accessions for the reference sequences used in the phylogenetic analyses

Statistical analysis

Data were analyzed using Fisher’s exact test.
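For readers unfamiliar with the test, the sketch below shows how a two-sided Fisher's exact test on a 2×2 table can be computed from the hypergeometric distribution. It is a pure-Python illustration, not the software the authors used (which is not stated), and the counts in the example table are hypothetical, chosen only to demonstrate the calculation.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts for illustration only (not data from this study).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")  # 34/70, i.e. about 0.4857
```

An exact test of this kind is generally preferred over a chi-square approximation when some cell counts are small.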

Results

In the present study, 38 human hydatid cyst samples were collected from surgically confirmed patients with echinococcosis, from different areas of Punjab, Pakistan. The average age of the patients with CE was 32.73 years (range: 5 to 75 years). The demographic characteristics of infected cases are summarized in Table 2.

Table 2 Epidemiological and clinical information for 38 patients with echinococcosis

Among the 38 human echinococcosis samples analyzed, 22 (57.8%) were from males and 16 (42.2%) were from females; the difference was not statistically significant (χ2 = 1.89, df = 1, P > 0.05). A larger proportion (76.3%) of echinococcosis cases was reported from rural areas, where people have closer contact or association with dogs, than from urban areas (23.7%). The liver (50%) was the most affected organ, followed by the lungs (22.5%), and others (Table 2).

Genetic characterization of Echinococcus isolates

PCR amplification of the cox1 gene yielded a product of 446 bp, while cytb yielded a 580 bp fragment, and nad1 yielded a 900 bp product in all samples. The nucleotide sequences of all Pakistani samples (n = 38) were BLAST searched against reference sequences retrieved from GenBank. According to the BLAST analysis of the sequences of the cox1, cytb, and nad1 genes, E. granulosus s.s. (n = 35), E. canadensis (G6/G7) (n = 2), and E. multilocularis (n = 1) were detected. The findings showed that the majority of the patients (35/38) were infected with E. granulosus s.s. All sequences have been deposited in GenBank (accession nos. MK229294–MK229342).

E. granulosus s.s. and E. canadensis (G6/G7) were characterized by using the sequences of cox1 (446 bp), cytb (580 bp), and nad1 (900 bp). Each sample was characterized by using the sequence of at least one of these genes, while E. multilocularis was identified by using only cytb (580 bp).

Alignment results of the sequences

In the sequence comparison, the cox1 gene sequences showed a 100% match with E. granulosus s.s., except for isolates PUN-23 and PUN-91, which were identified as E. canadensis (G6/G7) (Fig. 1a).

Fig. 1

Comparison of the phylogenies of different genotypes within Echinococcus spp. based on cox1 (446 bp) (a), cytb (580 bp) (b), nad1 (900 bp) (c), and cytb (580 bp) only for E. multilocularis (d). Phylogenetic trees were constructed using neighbor-joining distance method analysis with a Tamura-Nei model [29]. The reliability of these trees was assessed using bootstrap analysis with 1000 replicates. Bootstrap support is shown at the nodes. In panel a, the clade of E. equinus (G4) and E. canadensis (G6-G7) clusters with E. granulosus (G1-G3) with a high bootstrap value of 95. In panel b, the cluster of G4 and G6-G7 also groups with G1-G3, with a bootstrap value of 92. In panel c, a bootstrap value of 95 occurs at the node joining E. granulosus with the small cluster of E. equinus, E. ortleppi and E. canadensis. In panel d, all E. multilocularis isolates cluster together with a bootstrap value of 76. The results are presented with the country of origin, the genotype, and GenBank accession number; the circles indicate the sequences of E. granulosus s.l. from the present study. The scale-bars indicate the number of substitutions per site

The cytb gene sequences matched with the selected reference gene sequences of E. granulosus s.s. However, only PUN-91 was identical with the E. canadensis (G6/G7) reference gene sequence (Fig. 1b).

For the nad1 gene, while the PUN-91 sequence matched with E. canadensis (G6/G7), all the other sequences were detected as E. granulosus s.s. after BLAST analysis (Fig. 1c).

Two samples were characterized as containing E. canadensis (G6/G7) based on cox1 (446 bp). They were 100% identical and showed 100% similarity with a pig G6 isolate (GenBank: JQ356716) from France and 99.8% similarity with a human G6 isolate (GenBank: AB893260) from Mongolia (Fig. 1a). However, only one of them was identified as being E. canadensis (G6/G7) using the nad1 and cytb sequences (Fig. 1b, c). The current study reveals the first report of genotype E. canadensis (G6/G7) in humans from the Punjab province in Pakistan (Table 3). The Pakistani isolate (PK-91) had 99.6% similarity with a G6 genotype (GenBank: MH300954) from Mauritania and with a human G6 genotype (GenBank: MH300938) from Kenya (Fig. 1c).

Table 3 Genotype assigned in relation to patient age, sex, and cyst localization

One cytb gene sequence (580 bp) was identified as E. multilocularis and it was 100% identical to isolates from China (GenBank: KY290787 and KY290785), 99.8% identical to isolates from foxes in Poland (GenBank: KY205676 and KY205667), and 99.6% identical to an isolate from a fox in France (GenBank: AB461396). Meanwhile, the Pakistani isolate was genetically distant from the Mongolian and Canadian strains, as shown in the phylogenetic tree (Fig. 1d).
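The identity percentages reported above come from pairwise comparison of aligned sequences. As a minimal sketch of how such a figure can be computed, the function below counts matching positions between two pre-aligned sequences. The function name and the choice to skip columns containing a gap (`-`) or an ambiguous base (`N`) are assumptions made for illustration; BLAST and MEGA X apply their own conventions, and the toy sequences are not data from this study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') or an ambiguous
    base ('N') are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example: one mismatch in 10 aligned positions.
print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))  # 90.0
```

For scale, a single mismatch across 446 compared positions corresponds to roughly 99.8% identity; the number of differing sites behind each reported percentage is not given in the text.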

The cytb gene from the Pakistan samples of E. granulosus (s.s.) showed 100% similarity with the other selected reference sequences. All other compared G6, G7, and G10 samples, including the samples from Pakistan, showed the same sequences (Fig. 2).

Fig. 2

Multiple sequence alignments of partial cytb gene sequences. Genotypes (G1, G3, G5 and G6), represented with PUN suffixes, were from this study, while reference sequences from GenBank of genotypes G1, G3, G5 and G6 are presented with different codes. The accession numbers range from MK229294 to MK229342

For the nad1 gene, PUN-131-Pakistan was conserved when compared with the selected genotypes, whereas point mutations and substitutions were found in some of the other compared sequences (Fig. 3). The PUN-116-Pakistan sample had point mutations reported from France (GenBank: KY766893) and Turkey, while other sequences of E. granulosus (s.s.) were conserved (Fig. 4).

Fig. 3

Multiple sequence alignments of partial cytb gene sequences (E. multilocularis). Genotypes (G1, G3, G5 and G6) represented with PUN suffix were from this study while reference sequences from GenBank of genotypes G1, G3, G5 and G6 are presented with different codes. The accession numbers range from MK229294 to MK229342

Fig. 4

Multiple sequence alignments of partial nad1 gene sequences. Genotypes (G1, G3, G5 and G6), represented with PUN suffixes, were from this study, while reference sequences from GenBank of genotypes G1, G3, G5 and G6 are presented with different codes. The accession numbers range from MK229294 to MK229342

Discussion

The two notable cestode-borne zoonoses are CE and AE. In the northern hemisphere, AE is widely distributed, while CE is widely distributed across the world and the disease burden in humans is highly variable in different endemic areas. AE and CE are still considered as neglected zoonoses in many areas of the world, although their prevalence is quite high in such areas, because of lack of awareness and disease management. The occurrence of CE is quite high around the world; however, the pathogenicity and fatality caused by AE is more prevalent in Asia [33]. CE is an endemic disease in Pakistan and causes serious economic losses in terms of human healthcare and livestock agriculture costs. In addition, there is a lack of knowledge about CE in Pakistan that affects its transmission dynamics [34,35,36]. Agriculture is the backbone of Pakistan and a large number of families are affiliated with this sector, including animal rearing and dairy farming for milk products. In small and domestic farms, standard principles are often not strictly followed; therefore, these populations are at high risk of acquiring Echinococcus spp. infection [7]. CE is considered a socially constructed disease because of various traditional practices found among different ethnic groups around the globe, such as keeping many dogs and a large amount of livestock, and the culture of rescuing stray dogs [37].

In the current investigation, a total of 35 hydatid cyst samples were characterized as resulting from E. granulosus s.s. A high rate of E. granulosus s.s. was detected, which is in line with the data reported previously in humans (88.5%) [38] and livestock [39]. Similarly, in China, the majority (60%) of CE positive cases in humans are caused by E. granulosus s.s. (formerly the G1 strain) [40], which also caused 40.62% of the infections reported in India [41]. However, there is little information on the genetic characterization of Echinococcus spp. in humans in Pakistan. Echinococcus granulosus s.s. has been reported in buffaloes in Sindh Province of Pakistan [8]. This species was detected in small and large ruminants, while the sheep strain (G1) was found in human samples (n = 2) using cox1 gene sequencing [9]. Echinococcus granulosus s.s. in cattle has been reported in Pakistan [10]. The high rate of E. granulosus s.s. reported in the current study might be because E. granulosus s.s. is the predominant species in Pakistan (so far) and in neighbouring countries [9, 39, 40]. Even globally, E. granulosus s.s. is the most predominant causative agent of CE [38]. It has a wide host range, which makes it more dominant in endemic localities even in cases where it occurs in sympatry with other E. granulosus s.l. In addition, it might reflect the fact that most patients with CE lived in rural areas, where people have a close association with dogs [41].

In the present study, two samples were characterized as being infected with E. canadensis (G6/G7). E. canadensis (G6/7) was thought to be less infective to humans [42]. It is now known to be the second most important causative agent of CE after E. granulosus s.s. [38]. Globally, E. canadensis (G6/7) has been reported in Kenya [42, 43]; Argentina [44]; China [40]; in different parts of Africa, Asia, and South America [12, 27, 45]; and in many countries in eastern and south-eastern Europe [27, 46, 47]. Meanwhile, the G6-G10 cluster was reported in the Northern Palearctic, Northern Africa, and the Middle East [27]. In Pakistan, because of the camel and pig populations, G6 transmission to human hosts is possible, especially resulting from camel slaughtering and cross-boundary migration of animals from Afghanistan. The characterization of E. canadensis (G6/G7) in humans in Pakistan suggests an interaction between the camel-dog and pig-dog cycles. In Pakistan, the pig population is abundant and poses a serious health threat to the human population. Often, wild pigs live near human settlements in Pakistan. Although the camel population is quite low in the Punjab Province of Pakistan, sharing a border with Afghanistan and Iran means that species transmission is possible because of illegal animal transport across the border. Camels and other livestock animals live together; therefore, there is a possibility of exposure to other genotypes through interaction with the common definitive hosts, especially dogs. We could not compare our samples with Afghanistan isolates because there are no E. granulosus sequences from Afghanistan deposited in GenBank.

In the present study, one sample was characterized as being infected with E. multilocularis. In North America, human cases of AE caused by E. multilocularis have been reported [48, 49]. AE is also prevalent in the northern hemisphere and even in the neighbouring country of Afghanistan [33]. In Pakistan, echinococcosis is still neglected [50, 51]. The current study is the first report of genotyping of E. multilocularis from humans in Punjab Province, Pakistan using sequence analysis. Previously, E. multilocularis was investigated in cattle from the KPK province of Pakistan using PCR-RFLP [11]. The present findings suggest that cystic echinococcosis is an important emerging health issue and that AE is circulating in rural areas of Pakistan.

Conclusions

In conclusion, the current findings indicate the presence of E. granulosus s.s., E. canadensis (G6/G7), and E. multilocularis in the Punjab province of Pakistan. Additionally, E. canadensis (G6/G7) in human isolates is reported for the first time in Pakistan. To aid the eradication of the disease, comprehensive surveillance should be initiated. Control measures developed based on surveillance results could help to slow down the spread of the disease. The probable occurrence of other E. granulosus s.l. species indicates that further epidemiological studies using more Echinococcus isolates from all intermediate hosts (e.g. humans and others), as well as definitive hosts, should be performed in different climatic regions of Pakistan.