Background

Genomes of the model bacterium Escherichia coli exhibit high plasticity, driven by gene gain and loss through pathoadaptive mutations, genetic rearrangement, and horizontal gene transfer [1, 2]. This genetic variability translates into remarkable phenotypic and pathotypic diversity: while some E. coli strains normally inhabit the mammalian colon, other pathotypes cause a wide range of intestinal and extraintestinal diseases, from mild intestinal disturbance to severe urinary tract infections and outbreaks of shigellosis-like dysentery or cholera-like watery diarrhea [1]. In this study, we focus on enterotoxigenic E. coli (ETEC), one of the world's deadliest infectious agents, which also represents a serious public health problem in Egypt's rural areas. Our aim is to integrate multiple bioinformatics tools to identify horizontally transferred, pathotype-specific signature genes as targets for specific, high-throughput molecular diagnostic tools and reverse vaccinology screens.

Methods and results

To estimate the extent of horizontal gene transfer in ETEC, we used a combination of bioinformatics tools, including GC content (GC%), comparative genometrics analysis [3], and web-based prediction of pathogenicity islands via IslandPath (http://www.pathogenomics.sfu.ca/islandpath) [4]. Because E. coli strains are typically polylysogenic [5], we used the ACLAME Prophinder tool (http://aclame.ulb.ac.be/Tools/Prophinder) [6] to predict complete or rudimentary prophages scattered within the ETEC genome. To determine ETEC pathotype-specific genes, or signature genes, we used comparative genomic tools available in the National Microbial Pathogen Data Resource (NMPDR) platform (http://www.nmpdr.org), including the Signature Genes Tool and the Homolog Spreadsheet Tool [7]. Based on bidirectional-best-hit signature analysis, we identified 128 genes that differentiate this pathotype from other E. coli strains. We also identified 94 genes that are characteristic of two closely related strains (24377A and 2348/69). Many of the ETEC-specific genes mapped to prophages, prophage-like elements, and other pathogenicity islands; however, some of these signature genes, e.g., ORFs 21–39 in strain 24377A, appear instead to have been lost from other E. coli strains, as they are conserved among other enterobacteria such as Shigella and Salmonella. Our ongoing studies are testing some of these ETEC-specific genes as targets for multiplex PCR amplification to develop a rapid diagnostic typing method. Future studies will analyze the surface association and antigenicity of these signature gene products as a first step in a reverse vaccinology strategy to develop novel ETEC vaccines.
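The bidirectional-best-hit signature analysis mentioned above can be illustrated with a minimal Python sketch. It does not reproduce the NMPDR Signature Genes Tool, whose implementation is not described here. It assumes that pairwise similarity hits between a target genome and each comparison genome are already available as (query, subject, score) tuples. The function names, gene identifiers, and scores are hypothetical and serve only as toy input.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical hit record: (query gene, subject gene, alignment score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Keep the highest-scoring subject gene for each query gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def signature_genes(
    target_genes: Iterable[str],
    comparisons: Iterable[Tuple[List[Hit], List[Hit]]],
) -> List[str]:
    """Target genes with no bidirectional best hit in any comparison genome."""
    shared: Set[str] = set()
    for target_vs_ref, ref_vs_target in comparisons:
        pairs = bidirectional_best_hits(target_vs_ref, ref_vs_target)
        shared.update(a for a, _ in pairs)
    return [gene for gene in target_genes if gene not in shared]


# Toy data (assumed): four target genes compared against two reference genomes.
target = ["t1", "t2", "t3", "t4"]
target_vs_ref1 = [("t1", "r1", 500.0), ("t1", "r2", 200.0),
                  ("t2", "r2", 300.0), ("t3", "r2", 350.0)]
ref1_vs_target = [("r1", "t1", 500.0), ("r2", "t3", 350.0),
                  ("r2", "t2", 300.0)]
target_vs_ref2 = [("t1", "s1", 480.0)]
ref2_vs_target = [("s1", "t1", 480.0)]

ref1_pairs = bidirectional_best_hits(target_vs_ref1, ref1_vs_target)
signatures = signature_genes(
    target,
    [(target_vs_ref1, ref1_vs_target), (target_vs_ref2, ref2_vs_target)],
)
print(signatures)
```

With this toy input, t2 and t4 are reported as signature genes: t4 has no hits at all, and t2 hits r2, but the best hit of r2 is t3, so the match is not reciprocal. A real analysis would likely also apply score or coverage thresholds before accepting a best hit; those are omitted here.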