Background

The fact and process of lateral gene transfer (LGT) have been integral to the study of infectious disease since Griffith [1]. Most investigations have, however, centered on the transfer of known virulence genes and genes involved in establishing infections (antibiotic resistance, toxins, capsule, etc.) between close relatives that are both pathogens. The explosion of gene sequence information and complete genomes in the last 5 years has reinforced and extended our view of virulence evolution. To quote a review [2]: "lateral transfers have effectively changed the ecological and pathogenic character of bacterial species." Analysis of gene sequence and complete genome information [3-11] has led to the realization that LGT is not a rare exception to classical Darwinian evolution, but may be the predominant mode of evolutionary change in prokaryotes [2, 12-15].

Given the common perception of archaea as extremophiles, one might expect opportunities for transfer between archaeal organisms and bacterial pathogens to be rare. However, many archaea are present in more "normal" environments. Methanogenic archaea are very common in vertebrate and invertebrate digestive systems. In fact, in dairy cattle, methanogenic archaea make up a substantial fraction of the microbiota [16], and E. coli O157 is also very common, being cultured from 75% of herds [17]. Furthermore, there are similarities among bacterial and archaeal phages, plasmids, and other vectors for mediating transfer (IS elements; transposable elements) [18]. Transfer of DNA in natural environments has been extensively described for a number of different organisms: Bushman (Table 5.5, [19]) lists more than 30 studies. Many different functions – proteases, metabolic genes, oxygen protection genes, secretion systems, transporter genes, iron acquisition systems – can, under the right circumstances, contribute to virulence in an emerging pathogen.

E. coli O157:H7 causes hemorrhagic colitis and hemolytic uremic syndrome. It is widely recognized as a worldwide public health danger. However, it has only been associated with human disease since 1982. There is a complete genome sequence available for E. coli K-12 [20], as well as much genetic information on many other well-studied non-pathogenic E. coli. There are two published complete genomes for E. coli O157:H7 enteropathogenic strains [8, 21], as well as seven E. coli genomes in progress [22] (GOLD database [23]). Virulence in enterohemorrhagic E. coli also depends on many different genes [8]. For the purposes of testing the contribution of archaea to virulence genes, E. coli O157:H7 therefore serves as an ideal model.

Testing the hypothesis

We can test this hypothesis by:

1) Identifying genes likely to have been transferred (directly or indirectly) to E. coli O157:H7 from archaea.

2) Investigating the distribution of similar genes in pathogens and non-pathogens and performing rigorous phylogenetic analyses on putative transfers.

Have any transfers been described between archaea and E. coli O157?

I have shown that a gene coding for a bifunctional catalase-peroxidase was likely transferred from archaea to a variety of pathogenic bacteria, including E. coli O157:H7 [9]. Although not yet directly implicated as a virulence factor in O157:H7, this enzyme has been implicated as a virulence factor in Mycobacterium tuberculosis [24, 26] and in Legionella pneumophila [25]. Furthermore, this E. coli O157:H7 catalase-peroxidase has been associated with enterohemorrhagic hemolysin in a variety of shiga-like toxin-producing (verotoxin-producing) E. coli [30, 31]. The presence of the catalase-peroxidase in many virulent strains, and its absence from avirulent strains, suggests a direct role in the virulence of enterohemorrhagic E. coli.

How can we identify other genes likely to have been transferred (directly or indirectly) to E. coli O157:H7 from archaea?

We can use the three complete E. coli genomes available [8, 20, 21] to identify a subset of genes present in one (or both) of the O157 strains but not in K-12. The genes in this subset, although most have not been directly identified as virulence genes, are more likely to be virulence-associated. This subset will be the focus of our search for genes likely to have been laterally transferred from archaea. Perna et al. [8] identified 1,387 O157-specific genes in the EDL933 O157 strain they sequenced; this number does not include the previously sequenced O157-specific plasmids [32], from which some of the preliminary work [9] described below was derived.

As a preliminary step in testing this hypothesis, I have searched the genome of E. coli O157:H7 strain EDL933 [8] for open reading frames (ORFs) meeting the following criteria:

1) present in O157 strain EDL933 but not in E. coli K-12

2) highly similar (having a BLASTP similarity bit score > 95) to ORFs found in at least two archaeal genomes

3) having few or no highly similar proteins (BLASTP score < 85) in other bacterial genomes
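The three criteria above can be sketched as a simple filter. This is only an illustrative sketch, not the search actually performed: the `Hit` record, the function name, the toy ORF identifiers, and the toy scores are all hypothetical. It also assumes one reading of criterion 3, namely that a bacterial hit counts as "highly similar" when its bit score is 85 or higher, and that "few" means at most two bacterial genomes (the text does not quantify it).

```python
from dataclasses import dataclass
from typing import Iterable

ARCHAEAL_MIN_SCORE = 95.0   # criterion 2: archaeal bit score must exceed this
BACTERIAL_CUTOFF = 85.0     # criterion 3 (assumed reading): score >= this is "highly similar"
MIN_ARCHAEAL_GENOMES = 2    # criterion 2: "at least two archaeal genomes"


@dataclass(frozen=True)
class Hit:
    """One BLASTP hit for a query ORF (hypothetical record layout)."""
    genome: str       # name of the genome containing the hit
    domain: str       # "archaea" or "bacteria"
    bit_score: float  # BLASTP bit score of the hit


def is_candidate(in_k12: bool, hits: Iterable[Hit],
                 max_bacterial_genomes: int = 2) -> bool:
    """Apply the three screening criteria to one EDL933 ORF.

    max_bacterial_genomes is an assumed stand-in for "few or no".
    """
    if in_k12:  # criterion 1: must be absent from E. coli K-12
        return False
    hits = list(hits)  # allow any iterable to be scanned twice
    archaeal = {h.genome for h in hits
                if h.domain == "archaea" and h.bit_score > ARCHAEAL_MIN_SCORE}
    bacterial = {h.genome for h in hits
                 if h.domain == "bacteria" and h.bit_score >= BACTERIAL_CUTOFF}
    return (len(archaeal) >= MIN_ARCHAEAL_GENOMES
            and len(bacterial) <= max_bacterial_genomes)


# Toy input: ORF id -> (present in K-12?, list of hits). All values are invented.
two_archaea = [Hit("archaeon_1", "archaea", 120.0),
               Hit("archaeon_2", "archaea", 101.0)]
toy = {
    "orf_a": (False, two_archaea + [Hit("bacterium_1", "bacteria", 40.0)]),
    "orf_b": (True, two_archaea),                          # fails criterion 1
    "orf_c": (False, [Hit("archaeon_1", "archaea", 130.0)]),  # fails criterion 2
    "orf_d": (False, two_archaea + [Hit(f"bacterium_{i}", "bacteria", 90.0)
                                    for i in range(3)]),   # fails criterion 3
}

candidates = [orf for orf, (in_k12, hits) in toy.items()
              if is_candidate(in_k12, hits)]
print(candidates)
```

Counting distinct genomes rather than individual hits keeps paralogs within a single genome from inflating either tally. Any ORF passing such a screen is still only a candidate until a phylogenetic analysis supports the transfer.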

This search was facilitated by the Clusters of Orthologous Groups (COG) database at NCBI [33, 34]. This preliminary, non-systematic search produced six ORFs worth considering as LGTs from archaea to pathogenic bacteria. Table 1 lists these ORFs with their locations and, where any information was available, a possible function.

Although the putative functions in Table 1 are not generally associated with virulence, many genes can, under the right circumstances, facilitate infection or contribute to pathogenicity. There are examples in the literature where both ABC transporters (in Streptococcus gordonii [35]) and helicase genes (in Legionella pneumophila [36]) have been directly implicated in virulence.

Table 1 Potential Archaeal to Bacterial Laterally Transferred ORFs in E. coli O157:H7 EDL933 Genome. Although ORFs Z5331 and Z0509 do not have archaeal genes as the most similar BLAST hit, they are included because in general they are absent from almost all bacterial genomes.

Further testing of this hypothesis will require rigorous phylogenetic analyses of each suspected transfer. The procedure used above, comparing similarity scores to identify potential lateral transfers, is commonly employed [6, 7, 37], but it is fraught with potential errors [15, 38-42, 34, 35] and must serve only as an initial screen. Ragan [15] recently wrote:

This study demonstrates the need for a systematic, comprehensive approach to the study of LGT based on first principles, i.e. rigorous inference and statistically based comparison of molecular phylogenetic trees. As more genomic sequences appear, a tree-based approach will become both more challenging and more rewarding.

Implications of the hypothesis

Although this hypothesis focuses on archaea and E. coli, the model of distant gene transfer as a major contributor of "new" virulence genes to pathogens or potential pathogens has broad applicability to a large number of pathogenic systems. Archaea represent both the most distant source and, in many ways, the most unlikely source for virulence genes. If E. coli can acquire virulence genes from archaea, then potentially any organism is a reservoir of virulence genes for pathogens [43].

The implications, should this hypothesis be supported, are myriad. Our understanding of potential sources of virulence genes would be expanded to include virtually all life on earth – at first glance a frightening prospect. At the same time, however, it would allow us to move from a descriptive, reactive view of infectious disease towards a predictive science of infectious disease. It would be dramatic evidence of what some microbiologists suspect: that lateral gene transfer is the predominant engine of variation in prokaryotes and the catalyst for the emergence of new bacterial pathogens.