Background

A recent report by IUCN indicates that amphibians are the most endangered group of vertebrates worldwide, with 32 percent listed as threatened with extinction and 159 species already believed to be extinct [1]. Among the several factors often linked to the decline in amphibian populations is the chytrid fungus Batrachochytrium dendrobatidis (Bd), the causative agent of chytridiomycosis. Ever since the discovery of Bd in 1999 [2], numerous cases have been reported for the destruction associated with chytridiomycosis [3, 4]. The destructive path of chytridiomycosis fits all of the descriptors of a novel epidemic (i.e. sudden appearance, strong virulence, and rapid transmission) [5]. Investigations of museum specimens collected from as far back as 1902 in Japan [6] and 1938 in Africa [7] have detected Bd in skin cross sections. On the other hand, long-term population studies since the 1900's had no records of mass mortalities until the late 1970's when the first Bd epidemics were reported [8]. This leads to an interesting question about how pathogenesis suddenly evolved in this previously benign organism.

Many disease organisms existed in non-virulent/commensal forms until they acquired some novel traits allowing them to exploit new niches [9]. Among the several possible underlying mechanisms, horizontal gene transfer (HGT) is the most rapid way for an organism to gain novel phenotypes [10] and it has contributed to the origin and spread of pathogenesis in numerous bacteria [9, 11] and occasionally in eukaryotes [12, 13]. In this study, we show that homologs of two large families of known virulence effector genes were likely acquired by Bd. These families have been rapidly duplicated after their acquisition in Bd and evolved under strong positive selection. We hypothesize that the acquisition of these virulence effector homologs may partly explain the evolution of chytridiomycosis.

Results and Discussion

Phylogenomic analyses identified homologs of two large families of known virulence effectors in Bd, including serine peptidases and CRN proteins (Table 1; Additional file 1). Of these, members of the serine peptidase family have been implicated as virulence effectors in Bd because of their potential role in degrading host antimicrobial peptides [14]. Previous microarray data also indicated that many serine peptidase genes in Bd are highly expressed in the sporangia stage, which is associated with the infection of keratinized host tissue [14]. Our analyses show that at least 32 serine peptidase genes are present in the Bd genome (Additional file 1). Other identifiable homologs of Bd serine peptidases are only found in bacteria, where they are more broadly distributed.

Table 1 Structures and predicted subcellular localization of the CRN protein family in Bd JAM 81.

The CRN protein family consists of cytoplasmic virulence effectors only known in oomycete plant pathogens [15, 16]. These effectors are referred to as "crinkler" proteins because of their association with cell death and leaf crinkling, which parallels the effects of Bd on amphibian skin [17]. In oomycete pathogens, the N-terminal region of CRN proteins contains a highly conserved LFLAK domain that is characteristic of all CRN proteins. The C-terminal region is diverse in domain and motif structure, based on which 24 subfamilies were identified in the oomycete Phytophthora infestans [16]. The C-terminal region of CRN proteins also controls virulence and, when expressed in plant cells, may induce cell death [16]. Our analyses identified 84 genes encoding CRN protein homologs in Bd (Table 1). These genes are present in genomes of both Bd isolates (JAM81 and JEL423) and include eight of the 24 subfamilies of CRN proteins known in oomycetes (Table 2).

Table 2 Comparison of gene copy numbers of CRN protein subfamilies in Bd and oomycetes

Compared to oomycetes, the number of genes in each subfamily is often higher in Bd. In particular, the DXX-DHA subfamily of CRN proteins is highly duplicated in Bd and contains 30 and 44 gene copies from the two respective genomes, many of which show very little sequence variation. The homology of CRN proteins from Bd and oomycetes is most obvious in the C-terminal effector region, with a sequence identity of up to 46.5%. The N-terminal LFLAK domain characteristic of oomycete CRN proteins also is largely conserved in some identified CRN homologs in Bd (Additional file 2). As in oomycete pathogens [16, 18], only a fraction of CRN homologs identified in Bd are predicted to contain signal peptides (Table 1). However, almost all other identified CRN homologs in Bd have short N-terminal extensions of about 20 amino acids in positions corresponding to signal peptides (Table 1 and Additional file 2).

The observation that a gene is uniquely shared by distantly related organisms can be attributed to multiple possible scenarios, in particular differential loss and HGT [19, 20]. Although differential loss and HGT can always be invoked as alternative explanations to each other, each scenario becomes increasingly less likely if it requires considerably more corresponding (gene loss or HGT) events [19, 21]. In the cases of serine peptidases and CRN proteins, because chytrids and oomycetes belong to opisthokonts and heterokonts respectively, the gene loss scenario entails numerous independent loss events in related taxa. A more parsimonious and plausible scenario is the occurrence of HGT between Bd and oomycetes. To further investigate the number and the direction of potential HGT events involved, we performed phylogenetic analyses on serine peptidases and each of the eight CRN subfamilies shared between Bd and oomycetes. Our analyses indicate that all serine peptidase homologs in Bd form a weakly supported clade (Figure 1A). Given that serine peptidases are more widely distributed in bacteria, it is most likely that serine peptidases in Bd are derived from a single HGT event from bacteria, followed by gene duplication. Analyses of CRN subfamilies show that, in most cases, sequences from Bd form a distinct clade whereas those from oomycetes form another, suggesting independent HGT for each CRN subfamily. With paralogous sequence clades as outgroup, our analyses of the DN17 subfamily also show that Bd sequences were likely acquired from oomycetes, though the support is still modest (Figure 1B). Such low support is expected given the short length of sequences used in our phylogenetic analyses. Additional evidence consistent with the direction of HGT from oomycetes comes from their higher CRN gene diversity. The fewer CRN subfamilies in Bd (8 versus 24 in oomycetes) are in line with the transfer from oomycetes, as we would expect the recipient organism to receive a subset of the gene repertoire of the donor. Nevertheless, for both serine peptidases and the CRN family, we were unable to determine the timing of these HGT events because of the paucity of sequence data from other chytrids and oomycetes.

Figure 1
figure 1

Molecular phylogenies of serine peptidases (A) and the DN17 subfamily of CRN proteins (B). Numbers above major branches show bootstrap values from maximum likelihood and distance analyses, respectively. The DN17 tree is rooted at the two distinct sequence copies (groups) in oomycetes. Genes from Bd, oomycetes and bacteria are indicated in different colors. P = Phytophthora; A = Aphanomyces.

Because both the serine peptidase and CRN families are highly duplicated in Bd, we reasoned that they may confer important functions and, therefore, investigated the possibility that strong selection has been involved in their evolution. The estimate of d N / d S for the serine peptidase family in Bd is 0.46, while this value is 0.01 in bacteria. Similarly, d N / d S for each of seven CRN subfamilies (with more than three sequences) is significantly higher in Bd than in oomycetes (Table 3). These results indicate that both the serine peptidase and CRN families evolved faster in Bd than in oomycetes and bacteria. We further used three likelihood tests (M2a vs. M1a, M8 vs. M7 and M8 vs. M8a) [22] to identify the role of positive selection on the serine peptidase family and seven CRN subfamilies in Bd and in oomycetes or bacteria, respectively. A priori we stipulated that evidence for positive selection will be sufficient only if all three tests performed show statistical significance. In total, all eight families and subfamilies in Bd show strong evidence of positive selection (Table 3; Additional file 3). In contrast, we found that only four subfamilies of CRN proteins (DX8, DXX-DXV, DXX-DHA, and DN17) evolved under positive selection in oomycetes. For the subfamilies that were influenced by positive selection in both Bd and oomycetes, the improved branch-site model [23] was used to identify positive selection specifically on the diverging Bd lineages. The genes in Bd were used as foreground branches and those in oomycetes were used as background. Except for the subfamily DXX-DHA, likelihood tests for the other three CRN subfamilies show statistically significant support for positive selection in the Bd branch (Additional file 3). Furthermore, multiple positively selected amino acid sites were identified using the Bayes Empirical Bayes (BEB) estimation procedure [24], suggesting that positive selection acted specifically on certain amino acid sites of the serine peptidase family and the CRN subfamilies after their acquisition by Bd.

Table 3 Summary of evolutionary rate and positive selection analyses

Most genes of the serine peptidase family in Bd were assigned to two clades in the phylogeny with strong bootstrap support (Figure 1A). We estimated the coefficients of both type I (a site-specific shift in evolutionary rate) [25] and type II functional divergence (a cluster-specific shift of amino acid property) [26] between them. The results showed that only the type I coefficient of functional divergence is significantly greater than zero (θ I = 0.5744 ± 0.0589, LRT = 220.3270, p < 0.01). These results suggest that site-specific selective constraints have been altered for most of the members of this family, leading to group-specific functional evolution after their diversification. Furthermore, 26 amino acid residues were found to be critical for type I functional divergence with a posterior probability higher than 0.95 (Additional file 3), indicating that the evolutionary rates for these sites have shifted between these two clades. Some amino acid residues in the region of the Peptidase_S41 domain may also have contributed to the functional divergence, as there are 11 critical residues that are located in this region.

The acquisition of the serine peptidase and CRN families followed by rapid duplication and positive selection suggests that these acquired genes are likely important in the adaptation of Bd. Because both families are known virulence effectors, their presence in Bd provides a starting point for biochemical investigations of chytrid pathogenesis. If members of these two families are indeed related to Bd virulence, their acquisition may at least partly explain the emergence of chytridiomycosis. Peptidases are ubiquitous tools of bacterial parasites that enable host infection, and have also been identified as key factors in fungal pathogenesis. For example, Microsporum canis, a dermatophytic fungal parasite, infects keratinized tissue (e.g. skin and nails) in mammals. This fungus secretes dipeptidyl peptidases, which are homologous to serine peptidases of the S9 family in bacteria [27]. These and other proteases enable the fungus to obtain nutrients from the complex network of insoluble proteins that comprise keratin-rich tissues. As mentioned above, serine peptidases were identified as a class of proteins that are associated with virulence in Bd using microarray expression analyses [14]. The CRN genes were previously only known in oomycetes, which themselves also are important pathogens. Although most attention has been paid to oomycete plant pathogens, there is also a diverse group of oomycetes that infect animals [28]. Some of these pathogens are responsible for epidemics in aquatic animals such as crayfish [28]. The oomycete parasite Saproglenia may induce high mortality in salmon populations, both in aquaculture and in the wild [29], with effects similar to Bd [30]. Hence, the oomycete lineage is well adapted to parasitize vertebrate hosts. It is not surprising that oomycetes have an arsenal of genes associated with parasitism and that the acquisition of such genes may help convert a commensal chytrid to a parasitic pathogen.

Conclusions

Our data show that two large families of known virulence effector genes, CRN proteins and serine peptidases, were likely transferred to Bd from oomycete pathogens and bacteria, respectively. These acquired virulence effector homologs have been under rapid gene duplication and strong positive selection, suggesting that they may play an important role in the adaptation of Bd. Given that pathogenesis is the outcome of microbe-host interactions, the impact of any virulence effector is likely more severe when the host becomes stressed. We hypothesize that the acquisition of virulence effectors, in combination with habitat disruption and climate change, led to the Bd epidemics and the decline in amphibian populations. Further investigations are warranted to fully understand the biological functions of virulence effector homologs identified in our analyses.

Methods

Data source and genome screening for HGT candidates

Phylogenomic analyses used a customized database combining sequences from the NCBI non-redundant (nr) database, dbEST, the Taxonomically Broad EST Database (TBestDB) [31] and additional eukaryotic genomes. Annotated protein sequences of the heterokont Aureococcus anophagefferens, haptophyte Emiliania huxleyi, green algae Chlorella sp. and C. vulgaris, metazoans Daphnia pulex, Capitella sp. and Lottia gigantea were obtained from the Joint Genome Institute http://www.jgi.doe.gov. The annotated genome of the red alga Cyanidioschyzon merolae was downloaded from the Cyanidioschyzon merolae Genome Project http://merolae.biol.s.u-tokyo.ac.jp, and those of fungi Rhizopus oryzae, Allomyces macrogynus, and Spizellomyces punctatus were obtained from their sequencing projects at the Broad Institute of Harvard and MIT http://www.broadinstitute.org. Detailed sources for annotated protein sequences and ESTs of two Bd isolates and seven oomycetes used in our analyses are shown in Additional file 4. A comprehensive database was created using above data.

Genome screening for horizontally acquired genes was performed using AlienG [32]. AlienG is based on the observation that sequence similarity is correlated to sequence relatedness, where sequence similarity is measured by BLAST bit scores. A query sequence is considered to be a candidate of HGT-derived gene if it is significantly more similar to homologs from a potential donor (S d ) than to those from closely related taxa (S r ). Specifically, a bit score ratio S d /S r = 1.5 was used as threshold to identify candidates of HGT-derived genes in Bd. To eliminate potential artifacts, only candidates with expression evidence from ESTs or located in contigs containing more than 5 genes are retained for further analyses.

Identification of serine peptidase and CRN protein families in Bd and oomycetes

Phylogenomic analyses identified homologs of serine peptidases and CRN proteins among the HGT candidates. To retrieve other CRN and serine peptidase homologs in Bd, we first downloaded bacterial serine peptidases and annotated oomycete CRNs from both GenBank and oomycete genome projects [16, 18]; these downloaded sequences were then used as queries to search (BLASTP) against Bd JAM81 annotated protein sequences. All hits with E-values lower than 1 × 10-5 were retained. To classify the retrieved CRN homologs in Bd, we followed the criteria used for the oomycete P. infestans [16], which is based on the structures of C-terminal specific domains. As a result, eight subfamilies of CRN homologs were identified in Bd. To further estimate the copy numbers for homologs of serine peptidases and each CRN subfamily in Bd, we first constructed profiles for the Bd serine peptidase domain and the C-terminal specific domain regions of each CRN subfamily using HMMER [33]. The generated profiles were used to search the Bd genome, and sequences with E-values lower than 1 × 10-5 were retrieved and further aligned using conserved domain regions as references. Extremely divergent or partial sequences were removed from further analyses. The remaining genes were considered to be putative serine peptidases or CRN proteins. To exclude the possibility of contamination in Bd JAM81, the same process was also used to identify genes in the isolate Bd JEL423. Because the CRN protein family from four oomycetes (P. infestans, P. sojae, P. ramorum, and Pythium ultimum) have been well annotated [16, 18], we also used their C-terminal specific domain regions as references to identify CRN proteins in other oomycete datasets, including the P. capsici genome and ESTs for three other oomycetes (P. brassicae, P. parasitica, and Aphanomyces euteiches). Partial sequences were excluded and the remaining genes were regarded as CRN proteins. Secretion signals of serine peptidases and CRN proteins were identified using three tools (SignalP [34], TargetP [35], and iPSORT [36]) with default settings. The conserved N-terminal profiles were generated using WebLogo http://weblogo.berkeley.edu.

Phylogenetic analyses

Protein sequence alignment was carried out using MUSCLE [37] and clustalX [38], followed by visual inspection and manual refinement. Gaps and ambiguously aligned sites were removed manually. The most appropriate protein substitution matrix, rate heterogeneity and invariant sites were determined using ModelGenerator [39] for serine peptidases and each of the eight CRN subfamilies identified in Bd. Phylogenetic analyses were performed with a maximum likelihood method using PHYML [40] and a distance method using neighbor of PHYLIPNEW v.3.68 [41] in EMBOSS package [42]. Bootstrap support values were estimated using 100 pseudo-replicates. For distance analyses, maximum likelihood distances were calculated using TREE-PUZZLE v.5.2 [43] and PUZZLEBOOT v.1.03 (A. Roger and M. Holder, http://www.tree-puzzle.de. The models used in TREE-PUZZLE/BUZZLEBOOT to calculate the maximum likelihood distances were the same as those determined by ModelGenerator. All other parameters in the analyses used default settings.

Detection of positive selection

The d N /d S values for the serine peptidase family and seven CRN subfamilies (with more than three members in Bd) were estimated both in Bd and in oomycetes or bacteria, respectively. Genes with partial sequences were removed from the analyses. The program PAL2NAL [44] was used to convert protein sequence alignments into the corresponding codon-based DNA alignments. The program CODEML [22] was then used to calculate the d N /d S values.

Maximum likelihood analyses of selective constraints on the serine peptidase family and seven CRN subfamilies were performed using PAML [22] with their phylogenies and the codon alignments as input. For the serine peptidase family and each CRN subfamily, we firstly used the site-specific models [22, 45, 46] to identify positively selected codon sites in Bd and in oomycetes or bacteria, respectively. In this analysis, we compared models M1a vs. M2a, M7 vs. M8 and M8a vs. M8 to identify positive selection. To assess whether positive selection acted on the evolution of each group, the log likelihood values for each pair of nested models were compared using a likelihood ratio test (LRT). In the LRT, twice the log likelihood difference (2Δ/) between the two models was compared with the χ 2 statistics, with degrees of freedom (DF) equal to the difference of the number of parameters. The DFs were 3, 2, and 1 for comparisons M1a vs. M2a, M7 vs. M8 and M8a vs. M8, respectively. If the LRT was statistically significant and d N /d S estimate greater than 1, evidence for positive selection would be considered sufficient for the genes analyzed. Then, the Bayes Empirical Bayes method [24] was used to determine the amino acids most likely responsible for positive selection.

The improved branch-site models [23] were also used to search for positive selection for the serine peptidase family and each CRN subfamily. In these tests, we compared the null model A (model = 2, NSsites = 2, with ω fixed to 1) with the alternative model (model = 2, NSsites = 2). For each family or subfamily, the Bd branch on the phylogeny was used as foreground, while the oomycetes or bacteria branch used as background. When the LRT suggests positive selection under this model, the codon sites likely evolved under positive selection can be identified through posterior probability from Bayes Empirical Bays method [24].

Analysis of functional divergence

The program DIVERGE [47] was used to estimate the coefficients of both type I (θ I ) and type II (θ II ) functional divergence between serine peptidase clades A and B. Because subfamilies with fewer than four sequences cannot be analyzed using this method, three other serine peptidase proteins were excluded from this analysis. A likelihood ratio test was performed to test the null hypothesis θ = 0 using χ 2 distribution with 1 DF. A significant θ I indicates that site-specific altered selection constraints after the divergence of clades, while a significant θ II means site-specific shift of amino acid physiochemical property [25, 26].