Findings

Trypanosoma cruzi, the aetiological agent of Chagas disease, infects 6-8 million people in Latin America, while some 25 million more are at risk of acquiring the disease[1]. Parasite transmission to mammalian hosts, including humans, can occur through contact with the faeces of haematophagous triatomine bugs. However, non-vectorial routes are also recognized, including blood transfusion, organ transplantation, congenital transmission, and oral transmission via ingestion of food contaminated with infected triatomine faeces[2, 3].

T. cruzi (family Trypanosomatidae; Euglenozoa: Kinetoplastida) is most closely related to several widely dispersed species of bat trypanosomes[4]. Salivarian trypanosomes, including the medically important Trypanosoma brucei subspecies, represent a more divergent group[5]. The split between the T. cruzi-containing and T. brucei-containing trypanosome lineages is thought to have been concurrent with the separation of Africa from South America/Antarctica/Australasia 100 MYA[6], implying that T. cruzi and the other Schizotrypanum species evolved exclusively in South America. Others propose an alternative origin of T. cruzi from an ancestral bat trypanosome potentially capable of long-range dispersal[7]. Whilst the precise scenario for the arrival of ancestral Schizotrypanum lineages in South America is a matter for debate, the current continental distribution and genetic diversity of T. cruzi support an origin within South America. Parasite transmission is maintained via hundreds of mammal and triatomine species in different biomes throughout South and Central America, as well as the southern states of the USA[8].

Biochemical and molecular markers support the existence of six lineages, or Discrete Typing Units (DTUs), TcI-TcVI, agreed by international consensus[9]. Each DTU can be loosely associated with a particular ecological and/or geographical framework[10]. TcI is ubiquitous among arboreal sylvatic foci throughout the geographic distribution of T. cruzi and is the major agent of human Chagas disease in northern South America. Several molecular tools now identify substantial genetic diversity within TcI[11-14]. Importantly, these new approaches consistently reveal the presence of a genetically divergent and homogeneous TcI group (henceforth TcIDOM, previously TcIa/VENDOM) associated with human infections from Venezuela to northern Argentina, and largely absent from wild mammals and vectors sampled to date[14]. The origin of this clade is unclear, although recent work supports a sister group relationship with TcI circulating in North America (e.g.[12, 13]).

In this study we set out to evaluate the genetic diversity of TcI in North/Central America and to compare it with TcI diversity in South America, including TcIDOM. Our aim was to examine hypotheses describing the origin of the TcIDOM clade. We propose two possible scenarios: an emergence of TcIDOM in northern South America as a sister group of North American strains, followed by dispersal among domestic transmission cycles; or an origin in North America, prior to dispersal back into South American domestic cycles, possibly mediated by humans. To provide further insight into this question we undertook high-resolution nuclear and mitochondrial genotyping of multiple Central American strains (from areas of México and Guatemala) and included them in an analysis with other published data[11-13].

A panel of 72 TcI isolates and clones was assembled for analysis (Table 1)[11-16]. Of these, existing sequences and microsatellite data were available for 46 isolates[11, 12]. Isolates were classified into three populations: TcINORTH-CENT, TcISOUTH and TcIDOM. TcINORTH-CENT includes samples from the USA, México, Guatemala and Honduras; TcISOUTH corresponds to South America (Argentina, Bolivia, Colombia, Venezuela and Brazil); and TcIDOM comprises exclusively domestic isolates from Colombia and Venezuela, already known to correspond to a genotype with restricted genetic diversity: TcIa, as previously described by Herrera et al. (2007)[17], and VENDom, as described by Llewellyn et al. (2009)[13]. Isolates from additional DTUs (TcIII-TcIV) were included as outgroups in the mitochondrial analysis.

Table 1 Trypanosoma cruzi I samples included in this study

Isolates from México and Guatemala were characterized to DTU level via the amplification and sequencing of glucose-6-phosphate isomerase (GPI), as previously described by Lauthier et al. (2012)[18]. Subsequently, nine maxicircle gene fragments were amplified, sequenced and concatenated from the Mexican and Guatemalan strains according to Messenger et al. (2012), excluding ND4[12]. Phylogenetic analysis was also conducted as in Messenger et al. (2012)[12]. Nineteen nuclear microsatellite loci previously described by Llewellyn et al. (2009)[13] were selected based on their level of TcI intra-lineage resolution. Microsatellite loci were amplified across 21 unpublished biological stocks from México and Guatemala. Reaction conditions were as described previously[13]. Dendrograms based on multilocus allele profiles were also constructed according to Llewellyn et al. (2009)[13].
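To illustrate how a pair-wise distance can be derived from multilocus allele profiles before clustering, the sketch below computes a shared-allele distance (one minus the proportion of alleles shared across loci) between two diploid microsatellite genotypes. This is a minimal sketch under stated assumptions: the choice of distance measure, the function name `shared_allele_distance`, the genotype layout and the allele sizes are illustrative and are not taken from this report; the procedure actually used is the one described in Llewellyn et al. (2009)[13].

```python
from collections import Counter


def shared_allele_distance(genotype_a, genotype_b):
    """Return 1 - (shared alleles / alleles compared) for two diploid profiles.

    Each genotype is a list of (allele1, allele2) tuples, one tuple per locus,
    with loci in the same order in both profiles. A locus is skipped when
    either profile has a missing allele (None). Hypothetical data layout.
    """
    shared = 0
    compared = 0
    for locus_a, locus_b in zip(genotype_a, genotype_b):
        if None in locus_a or None in locus_b:
            continue  # missing data: do not count this locus
        # Multiset intersection counts each allele copy at most once.
        shared += sum((Counter(locus_a) & Counter(locus_b)).values())
        compared += 2  # two allele copies per diploid locus
    if compared == 0:
        raise ValueError("no loci with complete data in both genotypes")
    return 1 - shared / compared


# Illustrative allele sizes (base pairs), not data from this study.
isolate_1 = [(180, 182), (200, 200)]
isolate_2 = [(180, 180), (200, 204)]
distance = shared_allele_distance(isolate_1, isolate_2)
```

A matrix of such distances over all isolate pairs is the kind of input a neighbour-joining algorithm takes; handling of multi-allelic loci, as mentioned for the published method, is not covered by this sketch.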

Maxicircle nucleotide diversity (π) was calculated for each of TcINORTH-CENT, TcISOUTH and TcIDOM in DnaSP v5[19]. Nuclear allelic diversity was calculated for the same populations using allelic richness (Ar) in FSTAT[20]. The resulting values are shown in Figure 1.
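For readers unfamiliar with the statistic, nucleotide diversity (π) is the average proportion of sites that differ between two sequences drawn from a population. The following is a minimal sketch of that definition, not a reproduction of the DnaSP calculation: the function name `nucleotide_diversity`, the pair-wise exclusion of gapped or ambiguous sites, and the toy sequences are assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Sites where either sequence of a pair has a gap or ambiguity code are
    ignored for that pair (pair-wise deletion; an assumption of this sketch).
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    per_pair = []
    for seq_a, seq_b in combinations(aligned_seqs, 2):
        # Keep only sites with an unambiguous base in both sequences.
        sites = [
            (a, b)
            for a, b in zip(seq_a.upper(), seq_b.upper())
            if a in "ACGT" and b in "ACGT"
        ]
        if sites:
            differences = sum(a != b for a, b in sites)
            per_pair.append(differences / len(sites))
    if not per_pair:
        raise ValueError("no comparable sites between any pair")
    return sum(per_pair) / len(per_pair)


# Toy alignment, not data from this study.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_alignment)
```

In the toy alignment, two of the three pairs differ at one of four sites and the third pair is identical, so π is (0.25 + 0 + 0.25) / 3.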

Figure 1

Nucleotide diversity and allelic richness comparisons across North and South American Trypanosoma cruzi I populations. Left-hand data points (diamonds) indicate allelic richness ± standard error over loci. Right-hand data points (squares) indicate nucleotide diversity (π) ± standard error over pair-wise comparisons.

Nucleotide sequences per gene fragment are available from GenBank under the accession numbers MURF1 (fragment a): JX431060 - JX431084; MURF1 (fragment b): JX431085 - JX431109; ND1: JX431110 - JX431134; ND5 (fragment a): JX431135 - JX431159; ND5 (fragment b): JX431160 - JX431184; 9S rRNA: JX431185 - JX431209; 12S rRNA: JX431210 - JX431234; COII: JX431235 - JX431259 and CYT b: JX431260 - JX431284.

Across the 3,449 bp final concatenated alignment (including outgroups), a total of 374 variable sites were found. The mitochondrial phylogeny supported the presence of significant diversity among the isolates examined (Figure 2). TcIDOM strains formed a monophyletic clade (60% ML BS/0.98 BPP). The principal division in the phylogeny was between TcISOUTH and TcIDOM/TcINORTH-CENT (98% ML BS/0.98 BPP). However, this division is incomplete, such that a subset of South American strains also grouped with TcIDOM and TcINORTH-CENT. Thus, it is not possible to conclude that TcIDOM maxicircle sequences nest uniquely among those from TcINORTH-CENT strains. Nevertheless, a basal relationship of TcINORTH-CENT to TcIDOM is suggested at the level of nucleotide diversity by population (Figure 1), whereby TcIDOM < TcINORTH-CENT < TcISOUTH. Low standard errors about the mean in all three populations, but especially in TcIDOM and TcINORTH-CENT, suggest that sample size had little impact on the accuracy of estimation between populations.

Figure 2

Isolate grouping of 72 Trypanosoma cruzi I strains, as well as outgroups, based on nine concatenated maxicircle sequences. The Bayesian consensus topology is displayed. Bayesian posterior probability (BPP) analysis was performed using MrBayes v3.1. Five independent analyses were run using a random starting tree with three heated chains and one cold chain over 10 million generations, with sampling every 10 generations (25% burn-in). The first number on each node indicates the Maximum-Likelihood (ML) % bootstrap support for clade topologies, which was estimated following the generation of 1000 pseudo-replicate datasets. The second number (decimal value) indicates the Bayesian posterior probability for the cluster. Branch colours indicate isolate origin. Isolates that show clear incongruity between nuclear genotype and maxicircle genotype are marked. Outgroup branches were cropped for ease of visualization; full branch lengths are shown inset, top right.

Distance-based clustering using the microsatellite dataset indicated the presence of several well-defined clades (Figure 3). Importantly, in this case the monophyly of North-Central American isolates was corroborated, and TcIDOM clustered firmly within them (bootstrap 65%). By contrast, South American isolates fell into a divergent but diverse clade. Thus the nuclear data provide stronger support than the maxicircle phylogeny for divergence of TcIDOM from within TcINORTH-CENT. Sample size-corrected allelic richness estimates are consistent with the hierarchical patterns of clustering based on pair-wise genetic distances. As with the maxicircle dataset, there is a pronounced cline in diversity across the populations studied: Ar TcIDOM < Ar TcINORTH-CENT < Ar TcISOUTH (Figure 1).
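Sample size correction matters here because larger samples are expected to contain more alleles by chance alone. Allelic richness is commonly corrected by rarefaction, which gives the expected number of distinct alleles in a random subsample of fixed size; the sketch below implements that general formula for a single locus. It assumes a rarefaction approach of this kind and is not a reproduction of the FSTAT output; the function name `allelic_richness` and the allele sizes are hypothetical.

```python
from collections import Counter
from math import comb


def allelic_richness(allele_copies, subsample_size):
    """Expected number of distinct alleles in a random subsample (rarefaction).

    allele_copies: list of allele calls at one locus, one entry per gene copy
    (two entries per diploid isolate). subsample_size: number of gene copies
    to rarefy to, typically the size of the smallest population sample.
    """
    total = len(allele_copies)
    if not 0 < subsample_size <= total:
        raise ValueError("subsample_size must be between 1 and the sample size")
    counts = Counter(allele_copies)
    all_subsamples = comb(total, subsample_size)
    # For each allele: 1 - P(the allele is absent from the subsample).
    # math.comb returns 0 when the subsample cannot avoid the allele.
    return sum(
        1 - comb(total - count, subsample_size) / all_subsamples
        for count in counts.values()
    )


# Illustrative allele sizes at one locus, not data from this study.
locus_sample = [180, 180, 180, 182]
richness_at_2 = allelic_richness(locus_sample, 2)
```

With three copies of one allele and one copy of another, a subsample of two gene copies always contains the common allele and contains the rare one half of the time, giving an expected richness of 1.5. Averaging such values over loci yields a per-population estimate comparable across samples of different size.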

Figure 3

Isolate grouping of 72 Trypanosoma cruzi I strains based on nineteen nuclear microsatellite markers. A neighbour-joining clustering algorithm was implemented. Bootstrap values are included on important nodes. The first figure indicates % bootstrap support over 10,000 trees, the second the % stability over 1000 trees accounting for multi-allelic loci in the dataset. For further details see Llewellyn et al. (2009)[13]. Branch colours indicate isolate origin. The three principal populations, TcIDOM, TcISOUTH and TcINORTH-CENT, are shown on both map and tree. Red circles correspond to isolates from TcIDOM. Isolates that show clear incongruity between nuclear genotype and maxicircle genotype are marked with reference to Figure 2.

TcI dispersion into Central and North America

Using a 100 MYA biogeographic calibration point[6], molecular clock analyses point to the origin of T. cruzi (sensu stricto) 5-1 MYA[21-23] and a most recent common ancestor for TcI at 1.3-0.2 MYA[22]. The reduced genetic diversity among North-Central American isolates by comparison with their southern counterparts is strong evidence in support of the suggestion that TcI originated in South America[13, 24]. The emergence of TcI in the South occurred prior to either migration across the Isthmus of Panama alongside didelphid marsupials during the Great American Interchange[25], or perhaps northerly dispersal via volant mammals (e.g. bats).

Origin of TcIDOM

Recent findings based on nuclear and mitochondrial markers indicate a close resemblance between TcIDOM isolates from the northern region of South America and parasite populations from Central and North America[11-13]. Indeed, SL-IR genotyping suggests a distribution for TcIDOM that extends as far south as the Argentine Chaco, where multiple sequences have been identified from human and domestic vector sources[14]. Llewellyn et al. (2009) originally hypothesised that a distinct human/domestic clade could be maintained despite the presence of nearby infective sylvatic strains because of the low efficiency of parasite transmission by the vector[13]. In this case multiple feeds by domestic vector nymphs are required to infect individuals; as such, human-human transmission is far more common than reservoir host-human transmission. Originally this hypothesis was developed to explain the epidemiology of Chagas disease in Venezuela. However, TcIDOM is clearly widespread, and recent data suggest a date for its emergence of 23,000 ± 12,000 years ago[11]. This corresponds to the earliest human colonisation of the Americas[26]. Thus it seems that TcIDOM may be as ancient as humans in South America. Crucially, our data, which show that TcIDOM is nested among North and Central American strains, suggest that this widespread domestic T. cruzi genotype may actually have made first contact with humans in North-Central America.

The expansion of limited diversity genotypes into domestic transmission cycles is a familiar story in T. cruzi. This phenomenon seems to have occurred almost simultaneously with TcIDOM (<60,000 YA) in the Southern Cone region but involving DTUs TcV and TcVI[22]. Nonetheless, static human population densities sufficient to support a sustained domestic cycle are presumably vital. For TcIDOM, patterns of genetic diversity suggest early colonizing Amerindians may have been responsible for its southerly migration and dispersal from North/Central America. However, such early settler populations were probably small, dynamic, and inherently unsuitable to sustain transmission of such a genotype. Many questions, therefore, remain unanswered regarding its emergence. Insight could perhaps be drawn from a better understanding of the current distribution and diversity of TcIDOM (including samples from the Southern Cone), patterns of vector population migrations, and even from analysis of ancient DNA (e.g.[27]). We hope this report serves to galvanize efforts towards this understanding, especially among researchers in Central and North America, where many of the answers lie.