Introduction

Vibrio species are marine bacteria that naturally inhabit aquatic environments worldwide and are commonly associated with marine organisms. Some Vibrio species are pathogenic bacteria capable of producing life-threatening infections in humans typically following consumption of contaminated food, including seafood. Although the specific factors that contribute to the pathogenicity of vibrios in humans are well studied, little is known about the bacterial factors involved in the association of the bacteria with environmental organisms.

Bacteria display a variety of mechanisms that enable them to specifically interact with target cells. Many bacteria produce hair-like surface structures, called pili or fimbriae, which are often important for survival [1–3]. These adhesins have been clustered into groups based on amino acid sequence similarities among their pilin subunits [4]. One group, the type IV pili, is known to be involved in adhesion, immune escape, microcolony formation, transformation, and phage transduction [4] and is commonly found in Gram-negative bacteria, including numerous pathogens [4, 5]. Type IV pili are known to assist many bacterial species in survival in various environments, ranging from attachment to a variety of surfaces for biofilm formation [6–9] to colonizing the host [10–17]. These pili begin as prepilins possessing a hydrophilic leader peptide and are processed by a unique peptidase that cleaves the leader sequence to form a mature pilin protein [18]. After processing, mature pilin subunits assemble together to form pili through interactions between the conserved N-termini in the pilin cores, leaving the variable C-terminal regions to interact with the environment [4]. Type IV pili are divided into two subclasses based on differences in amino acid sequence and length. Type IVa pili have both a shorter leader peptide and mature protein sequence, while type IVb pili have considerably longer leader sequences and overall length [4, 5, 18]. In addition to similarities in their amino acid composition, all type IV pili appear to have analogous architecture [4].

In the genomes of Gram-negative bacteria possessing type IV pili, type IVa pili biogenesis genes are scattered throughout the genome, but the genes or gene clusters are almost always flanked by the same genes, typically housekeeping genes. In addition, homologous gene sets for type IVa pili are found in virtually identical locations throughout more than 150 sequenced genomes. That these genes have not been found on any identifiable pathogenicity island suggests that these pili are ancient to many of the bacterial phyla possessing these genes [18]. In contrast, type IVb pili genes are fewer in number than type IVa genes and are typically found clustered within the genome. Moreover, the gene sequence order does not appear to be conserved amongst different organisms possessing the type IVb pili except for the universally conserved core proteins. In addition, when comparing N-terminal sequence homology, type IVa pilin subunits are more similar among themselves than to type IVb pilins or within the type IVb pili group. Furthermore, type IVa pili occur in bacteria with a broad host range, while type IVb pili have only been identified in colonizers of the human intestinal tract [4].

Vibrio species possess many type IV pili from both the type IVa and type IVb groups, but only a select few have been studied for their role in environmental and/or host survival. One thoroughly studied pilus from the type IVb group is the toxin co-regulated pilus (TCP) from Vibrio cholerae, which is known for its key role in virulence [19–21]. It is expressed by V. cholerae classical and El Tor biotypes from the O1 and O139 serogroups [22]. TCP is composed of TcpA subunits and appears as thick bundles under the electron microscope [4]. TcpA is processed by a TCP-specific signal peptidase, TcpJ, to form mature pilin subunits for assembly [22, 23]. The structure of TCP consists of the conserved N-terminal α-helices of TcpA buried in the core of the pilus, maximizing contact between subunits to provide overall strength. The structurally variable regions of the pilins interact to hold the core units together and coat the surface where interactions take place with the environment, i.e., the intestines [4]. In addition to its role in colonization, TCP is the receptor for the CTXΦ phage [24, 25].

An additional well-studied V. cholerae type IV pilus is the mannose-sensitive hemagglutinin (MSHA), which belongs to the type IVa group. When examining operon composition, MSHA in V. cholerae consists of two operons where one operon encodes five prepilin subunits, including the major pilus subunit MshA, and the other contains genes involved in assembly and secretion [26]. In V. cholerae, the PilD peptidase has been shown to process the MshA subunits for assembly of the mature pilus structure [27, 28]. The MSHA pilus hemagglutinates red blood cells [29, 30] and is a receptor for filamentous phage [31–33]. It has been studied extensively in V. cholerae to identify any involvement in host colonization [19, 21, 34]. In V. cholerae, only the El Tor biotypes produce functional MSHA pili [29, 30], and during human colonization studies, the protein was repressed [35]. Expression of the MSHA pilus was tightly regulated so that when TCP was expressed, the MSHA protein was repressed; therefore, the MSHA pilus is considered an anticolonization factor in human disease [36]. When the MSHA pilus was constitutively expressed during colonization, it resulted in immune system recognition [35]. Thus, the MSHA pilus does not appear to be a virulence factor for V. cholerae, suggesting that expression of the gene product is for utilization in the environment. Studies have shown that the MSHA pilus is used to adhere to zooplankton exoskeletons as a survival strategy in the aquatic environment [37, 38], presumably by forming biofilms. V. cholerae and Vibrio parahaemolyticus are known to use the MSHA pilus to form biofilms on various surfaces [6, 8, 38], including chitin [39], which provides some supporting evidence for the role of the MSHA pilus in environmental survival.

Another pilus found in Vibrio spp. is the type IVa PilA pilus, also known as the chitin-regulated pilus (ChiRP). The PilA operon is composed of five open reading frames that constitute a single operon, consistent with other type IVa pili [28]. A mature PilA pilus is composed of PilA subunits that were processed by the PilD peptidase [28], the same peptidase that processes the MshA pilin subunits [27, 28]. The PilD peptidase is the fourth open reading frame in the PilA operon [28]. The PilA type IVa pilus is an integral player in the V. cholerae chitin utilization program [39]. Expression of the PilA protein has been shown to be induced by chitin in both V. cholerae [39] and V. parahaemolyticus [6]. PilA is involved in biofilm formation [6, 10], adherence to human epithelial cells [10], and colonization of oysters [11]. It has been implicated as a virulence factor for V. vulnificus [10], although direct evidence of its role in virulence has not been clearly described in other human pathogenic vibrios.

Taken together, the studies of the type IVa pili MSHA and PilA in various Vibrio spp. suggest that these proteins might be utilized by vibrios for environmental survival by attaching to chitinous substrates such as zooplankton. In contrast, the type IVb pilus, TCP, from V. cholerae, is critical for host colonization and has not been implicated in environmental survival, pointing to the possibility of two very distinct roles for the different subclasses of type IV pili.

During our efforts to investigate the roles of MSHA and PilA in V. parahaemolyticus colonization of the Pacific oyster, Crassostrea gigas, we noted sequence heterogeneities in these genes. This led us to examine these genes in other human pathogenic Vibrio species, such as V. cholerae and V. vulnificus. Here, we present a comparative sequence analysis of the mshA and pilA pilin genes from several strains of V. cholerae, V. parahaemolyticus, and V. vulnificus. These sequence analyses suggest that a selective environmental pressure has been applied to these genes, resulting in the observed sequence heterogeneities for all three Vibrio species examined.

Materials and Methods

Bacterial Strains

Thirteen of the V. parahaemolyticus bacterial strains sequenced were kindly provided by Dr. Yi-Cheng Su, Oregon State University Seafood Laboratory, Astoria, OR, USA. Genomic DNA for five tdh/trh negative strains of V. parahaemolyticus was obtained from Dr. Narjol Gonzalez-Escalona, FDA, College Park, MD, USA. Genomic DNA for ten of the V. vulnificus strains sequenced was provided by Dr. Paul Gulig, University of Florida, Gainesville, FL, USA. Five of the V. vulnificus strains sequenced were provided by Dr. Kathy O’Reilly, Oregon State University, Corvallis, OR, USA. Bacterial strains were grown on Luria–Bertani agar supplemented with sodium chloride to a final concentration of 2%. All strains used in this study are listed in Table 1.

Table 1 Strains used in this study

Sequencing

Genomic DNA from V. parahaemolyticus and V. vulnificus strains was isolated using the Qiagen DNeasy blood and tissue kit, following the protocol for DNA isolation included in the kit. Primers for sequencing each gene were designed for the region approximately 100 base pairs (bp) upstream of the start codon and 100 bp downstream of the stop codon for the gene of interest (Table 2). Polymerase chain reaction (PCR) was conducted using Invitrogen Platinum HiFi Supermix, following the manufacturer's standard protocol for PCR. PCR samples were quantified using the NanoDrop Spectrophotometer ND-1000. Sanger sequencing reactions for V. parahaemolyticus and V. vulnificus PCR products were performed at the Center for Genomic Research and Bioinformatics (CGRB), Oregon State University, Corvallis, OR, USA.

Table 2 Primers used in this study

In Silico Analyses

The in silico sequence data for all the V. cholerae strains and additional V. parahaemolyticus and V. vulnificus strains were obtained from the Department of Energy Joint Genome Institute website: http://img.jgi.doe.gov/cgi-bin/pub/main.cgi. The V. parahaemolyticus and V. vulnificus sequenced DNA was translated into their predicted amino acid sequences using SeqTool and sequence alignments were created in ClustalW at the bioinformatics website for the CGRB: http://bioinfo.cgrb.oregonstate.edu/. Maximum likelihood phylogenetic trees were constructed using the MEGA 5 program: http://www.megasoftware.net/ using the Tamura–Nei model with nucleotide substitutions. Bootstrap values were calculated with 500 replicates. For the analysis of synonymous and nonsynonymous substitutions, calculations were made using the Synonymous Non-synonymous Analysis Program (SNAP): www.hiv.lanl.gov [40]. The program is based on the Nei and Gojobori [41] method for calculating synonymous and nonsynonymous rates of substitution with the incorporation of Ota and Nei [42] statistics. The package is described by Ganeshan et al. [43].
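The counting behind the Nei and Gojobori [41] approach can be illustrated with a minimal Python sketch. It is not the SNAP implementation and was not used in this study; the function names (syn_sites, codon_differences, ds_dn) are hypothetical, and it assumes two aligned, in-frame coding sequences, treats changes to stop codons as nonsynonymous when counting sites, and applies a Jukes-Cantor correction to the proportions of differences.

```python
from itertools import permutations, product
from math import log

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1 / 3  # one of the three possible changes at this position
    return s


def codon_differences(c1, c2):
    """Mean (synonymous, nonsynonymous) steps over all orders of the changes."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    if not diff:
        return 0.0, 0.0
    paths = []
    for order in permutations(diff):
        cur, sd, nd = c1, 0, 0
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODON_TABLE[nxt] == "*":
                break  # discard pathways that pass through a stop codon
            if CODON_TABLE[nxt] == CODON_TABLE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        else:
            paths.append((sd, nd))
    if not paths:
        return 0.0, 0.0
    return (sum(p[0] for p in paths) / len(paths),
            sum(p[1] for p in paths) / len(paths))


def jukes_cantor(p):
    """Convert a proportion of differences into substitutions per site."""
    if p >= 0.75:
        raise ValueError("proportion too large for the Jukes-Cantor correction")
    return -0.75 * log(1 - 4 * p / 3)


def ds_dn(seq1, seq2):
    """Return (dS, dN) for two aligned, in-frame coding sequences."""
    S = Sd = Nd = 0.0
    n_codons = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # skip codons containing gaps or ambiguous bases
        if "*" in (CODON_TABLE[c1], CODON_TABLE[c2]):
            continue  # skip stop codons
        S += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = codon_differences(c1, c2)
        Sd += sd
        Nd += nd
        n_codons += 1
    N = 3 * n_codons - S
    if S == 0 or N == 0:
        raise ValueError("no comparable synonymous or nonsynonymous sites")
    return jukes_cantor(Sd / S), jukes_cantor(Nd / N)
```

For example, ds_dn("ATGTTTGGAGGAGGA", "ATGTTCGGAGGAGGA") compares two toy sequences that differ by a single silent TTT to TTC change, so d N is zero and d S is positive. Published tools handle ambiguous codons, multiple sequences, and variance estimates, none of which are attempted here.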

Results

Sequence Alignments

Overall, the sequence alignments for the DNA encoding the mshA and pilA genes from different strains of V. cholerae, V. parahaemolyticus, and V. vulnificus showed considerable sequence heterogeneity (Supplemental Figs. 1 and 2). Although the immediate 5′ regions are highly conserved in both genes, most of the gene sequences varied depending on the strain. Interestingly, V. cholerae exhibited distinct groupings for both genes, separating most clinical isolates from environmental isolates. In contrast, V. parahaemolyticus and V. vulnificus strains did not appear to group based on isolate origin or any other phenotype. Sequence alignments of the predicted amino acid sequences of MSHA and PilA from V. cholerae, V. parahaemolyticus, and V. vulnificus are shown in Figs. 1 and 2. For V. parahaemolyticus and V. vulnificus, the predicted amino acid sequences for MSHA and PilA from both environmental and clinical isolates displayed notable sequence heterogeneity. With V. cholerae strains, most clinical isolates had conserved sequences for both MSHA and PilA. Most environmental isolates exhibited marked sequence heterogeneity, comparable to what was observed for the V. parahaemolyticus and V. vulnificus isolates.

Figure 1

Amino acid sequence alignment of MshA from Vibrio cholerae (a), Vibrio parahaemolyticus (b), and Vibrio vulnificus (c). The predicted amino acid sequence alignments of MshA for V. cholerae (a), V. parahaemolyticus (b), and V. vulnificus (c) were constructed using the ClustalW program. White indicates normal residues. Green are similar residues. Pink are identical residues. Black indicates globally conserved residues

Figure 2

Amino acid sequence alignment of PilA from Vibrio cholerae (a), Vibrio parahaemolyticus (b), and Vibrio vulnificus (c). The predicted amino acid sequences of PilA for V. cholerae (a), V. parahaemolyticus (b), and V. vulnificus (c) were aligned using the ClustalW program. White indicates normal residues. Green are similar residues. Pink are identical residues. Black indicates globally conserved residues

Phylogenetic Trees

Maximum likelihood (ML) phylogenetic trees were constructed from the mshA (Fig. 3) and pilA (Fig. 4) sequences for the V. cholerae, V. parahaemolyticus, and V. vulnificus isolates. Similar to the DNA and amino acid alignments, the mshA (Fig. 3a) and pilA (Fig. 4a) ML phylogenetic trees for V. cholerae clustered most clinical isolates into one branch, while environmental isolates exhibited various branching patterns. When ML phylogenetic trees were constructed for these two gene sequences from V. parahaemolyticus (Figs. 3b and 4b) and V. vulnificus (Figs. 3c and 4c), no discernable grouping patterns appeared for either species, unlike the V. cholerae phylogenetic trees.

Figure 3

Bootstrap maximum likelihood phylogenetic trees for mshA from Vibrio cholerae (a), Vibrio parahaemolyticus (b), and Vibrio vulnificus (c). The bootstrap maximum likelihood phylogenetic trees for mshA from V. cholerae (a), V. parahaemolyticus (b), and V. vulnificus (c) were constructed using the gene sequences for mshA in the Molecular Evolutionary Genetics Analysis (MEGA) 5 software. All bootstrap values are listed

Figure 4

Bootstrap maximum likelihood phylogenetic trees for pilA from Vibrio cholerae (a), Vibrio parahaemolyticus (b), and Vibrio vulnificus (c). The bootstrap maximum likelihood trees for pilA from V. cholerae (a), V. parahaemolyticus (b), and V. vulnificus (c) were constructed using the gene sequences for pilA in the Molecular Evolutionary Genetics Analysis (MEGA) 5 software. All bootstrap values are listed

Substitution Analyses

We analyzed mshA and pilA for the rates of synonymous (silent; d S) and nonsynonymous (structural; d N) changes for the V. cholerae, V. parahaemolyticus, and V. vulnificus isolates. For mshA from V. cholerae, the rate of synonymous changes (d S) was 0.759, while the rate of nonsynonymous changes (d N) was 0.471, with a d N/d S ratio of 0.621 (Table 3). The rate of synonymous changes for V. parahaemolyticus was 0.746 and that for V. vulnificus was 0.662. The rates of nonsynonymous changes for V. parahaemolyticus and V. vulnificus were 0.431 and 0.384, respectively. This resulted in a d N/d S of 0.577 for V. parahaemolyticus and 0.580 for V. vulnificus (Table 3). For pilA, the rate of synonymous changes was 1.109 for V. cholerae, 1.691 for V. parahaemolyticus, and 1.186 for V. vulnificus. The rate of nonsynonymous changes was 0.629 for V. cholerae, 0.642 for V. parahaemolyticus, and 0.503 for V. vulnificus. This resulted in a d N/d S of 0.567, 0.380, and 0.424 for V. cholerae, V. parahaemolyticus, and V. vulnificus, respectively (Table 3).

Table 3 Analysis of synonymous and nonsynonymous nucleotide substitutions for genes involved in type IV Pili function from V. cholerae, V. parahaemolyticus, and V. vulnificus

Region Analyses

To compare the diversity of mshA and pilA, we examined neighboring genes from their respective operons, mshC and pilB, as well as the type IV pilin peptidase, pilD. The rates of synonymous and nonsynonymous changes for mshC were 0.135 and 0.039 for V. cholerae, 0.229 and 0.017 for V. parahaemolyticus, and 0.042 and 0.015 for V. vulnificus. This resulted in a d N/d S ratio of 0.290 for V. cholerae, 0.072 for V. parahaemolyticus, and 0.356 for V. vulnificus (Table 3). For pilB, the rates of synonymous and nonsynonymous changes for V. cholerae, V. parahaemolyticus, and V. vulnificus were 0.176 and 0.008, 0.288 and 0.037, and 0.208 and 0.016, respectively. The d N/d S ratio for pilB was 0.047 for V. cholerae, 0.127 for V. parahaemolyticus, and 0.074 for V. vulnificus. For pilD, the synonymous and nonsynonymous rates calculated for V. cholerae were 0.122 and 0.005 with a d N/d S of 0.039. The V. parahaemolyticus strains used to calculate the synonymous and nonsynonymous rates of substitution for pilD had identical sequences; thus, the synonymous and nonsynonymous rates of substitution were zero, and the d N/d S ratio cannot be calculated. These rates are comparable with data from Chattopadhyay et al. [46], who calculated the rates of synonymous and nonsynonymous substitutions for pilD from V. vulnificus as 0.092 and 0.007 with a d N/d S ratio of 0.076.

TcpA and TcpJ

To compare the findings for mshA and pilA with another type IV pilin and its corresponding peptidase, we calculated the rates of synonymous and nonsynonymous substitutions for the toxin co-regulated pilus pilin subunit tcpA from V. cholerae and its processing leader peptidase tcpJ (Table 3). Only 13 V. cholerae strains out of the available 25 possess tcpA and tcpJ. The d S and d N for tcpA were 0.486 and 0.052 with a d N/d S ratio of 0.106. For tcpJ, the d S and d N were 0.003 and 0.000 with a d N/d S ratio of 0.000.

Discussion

The results from our sequence analyses of the mshA and pilA genes from several strains of three human pathogenic Vibrio species, V. cholerae, V. parahaemolyticus, and V. vulnificus, suggested that the various alleles observed were the result of selective pressure. When examining the V. cholerae predicted amino acid alignment (Fig. 1a) and phylogenetic tree (Fig. 3a) for the mshA gene, one distinct grouping emerged with highly conserved sequences for the MSHA pilin subunit. In fact, the isolates in this group, identifiable as one branch of the phylogenetic tree (Fig. 3a), were primarily from the O1 serogroup (13 out of 15) and clinical isolates (11 out of 15). This differs considerably from the remaining V. cholerae isolates examined, which were predominantly environmental, non-O1/O139 strains (9 out of 10) with no apparent grouping pattern in the phylogenetic tree (Fig. 3a). When comparing the predicted amino acid alignments and phylogenetic trees for the V. parahaemolyticus (Figs. 1b and 3b) and V. vulnificus (Figs. 1c and 3c) strains sequenced, no grouping could be established based on either isolation source or phenotype, in contrast to what was observed for V. cholerae.

The sequence data for the PilA pilin subunit from V. cholerae, V. parahaemolyticus, and V. vulnificus exhibited a trend similar to what was observed for the MSHA pilin subunit. For V. cholerae strains, a group of highly conserved PilA sequences emerged; these were primarily from the O1 serogroup (13 out of 14) and of clinical origin (11 out of 14). The remaining isolates were predominantly non-O1/O139 (10 out of 11) and from an environmental source (6 out of 11). They did not have any clear pattern to their alignment (Fig. 2a) or tree branching (Fig. 4a). Consistent with the mshA findings, no apparent grouping pattern was observed for either the amino acid alignment or branching on the phylogenetic tree for any of the V. parahaemolyticus (Figs. 2b and 4b) and V. vulnificus (Figs. 2c and 4c) pilA genes sequenced. Taken together, our hypothesis is that a selective pressure has caused the differences observed in these two type IVa pili in V. cholerae, V. parahaemolyticus, and V. vulnificus.

To test for selective pressure, the synonymous and nonsynonymous nucleotide substitution rates were calculated to determine a d N/d S ratio [41]. In protein-coding sequences, synonymous substitutions (d S) are structurally silent, while nonsynonymous substitutions (d N) result in a change to the amino acid sequence. When a d N/d S ratio is calculated, the value typically suggests whether the substitutions are largely neutral (d N/d S = 1), under negative selection (d N/d S < 1), or under positive selection (d N/d S > 1) [44]. Table 3 shows the calculations for d S, d N, and d N/d S for the mshA and pilA genes from the different Vibrio strains analyzed, and the data suggest that a selective pressure has been applied to these two genes for all three Vibrio species. To further analyze the selective pressure applied to the type IV pili examined, we compared mshA and pilA with a neighboring gene in their corresponding operon, mshC and pilB, respectively, to determine if a selective pressure has been applied strictly to the gene encoding the pilin subunit or to the entire operon. When comparing the d N/d S value for mshA with mshC and pilA with pilB for all three vibrios, the d N/d S values for the pilin subunits (mshA and pilA) are considerably larger than those of the neighboring genes in the operon (mshC and pilB) (Table 3). These results suggest that the neighboring genes (mshC and pilB) in both the MSHA and PilA operons are more conserved than their corresponding pilin subunits (mshA and pilA). Thus, it is possible that the pilin subunits are not under the same selective pressure as their neighboring genes.
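The interpretation rule described here can be written as a short Python sketch. The helper names (dn_ds_ratio, interpret) are hypothetical and are not part of SNAP or any other package; the numeric inputs are the rounded d N and d S values reported in the Results, so a recomputed ratio may differ from Table 3 in the third decimal place. The strict comparison against 1 follows the rule as stated in the text and ignores statistical uncertainty.

```python
from typing import Optional


def dn_ds_ratio(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def interpret(ratio: Optional[float]) -> str:
    """Apply the rule of thumb: <1 negative, =1 neutral, >1 positive."""
    if ratio is None:
        return "undefined"
    if ratio < 1:
        return "negative (purifying) selection"
    if ratio > 1:
        return "positive selection"
    return "neutral"


# (dN, dS) pairs as reported in the Results for a few gene/species pairs.
REPORTED = {
    ("mshA", "V. cholerae"): (0.471, 0.759),
    ("pilA", "V. cholerae"): (0.629, 1.109),
    ("mshC", "V. cholerae"): (0.039, 0.135),
    ("pilD", "V. parahaemolyticus"): (0.000, 0.000),  # identical sequences
}

for (gene, species), (dn, ds) in REPORTED.items():
    ratio = dn_ds_ratio(dn, ds)
    shown = "n/a" if ratio is None else f"{ratio:.3f}"
    print(f"{gene} ({species}): dN/dS = {shown} -> {interpret(ratio)}")
```

Returning None for a zero d S keeps the identical-sequence case explicit instead of raising a division error or reporting a misleading zero.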

The pilins encoded by both mshA and pilA are processed by the same type IV prepilin peptidase, PilD [27, 28, 45]. When examining the d N/d S value for pilD, it was evident that the pilD gene maintained a highly conserved sequence. We calculated the pilD d N/d S for V. cholerae (0.039) but were unable to calculate it for V. parahaemolyticus because the sequences were identical, giving a d S of 0.000 and a d N of 0.000 and therefore an undefined ratio of 0/0 (Table 3). Despite the inability to calculate the d N/d S for V. parahaemolyticus, the results for V. cholerae pilD (0.039) were congruent with what was found for V. vulnificus (0.076) by Chattopadhyay et al. [46]. This suggests that a strong purifying selection has maintained the highly conserved pilD sequence in contrast to the general observation for the mshA and pilA sequences. When examining the predicted amino acid sequences for both mshA (Fig. 1) and pilA (Fig. 2) for all three vibrios, it was clear that the N-termini remain highly conserved while the C-termini varied considerably. The N-terminal region is recognized by the PilD peptidase for processing the protein into a mature pilin subunit [4]. If the N-terminal region of the type IVa pili proteins MSHA and PilA varied, it is possible that PilD would no longer process these proteins into mature subunits, while variations in the C-termini should still result in a mature pilin subunit. Thus, it appears that PilD has maintained a highly conserved sequence unlike the MSHA and PilA proteins it processes.

To further understand the variations observed in the MSHA and PilA pilins, the V. cholerae mshA and pilA sequences were compared to the type IVb pilin TCP from V. cholerae. The tcpA gene encodes the major pilin subunit of TCP, which is processed by its own type IV pili peptidase TcpJ, encoded by tcpJ [23]. In contrast to tcpA, which exhibits some variability in its sequences with mostly synonymous substitutions (d S of 0.486) and few nonsynonymous substitutions (d N of 0.052), tcpJ has relatively few substitutions overall (d S of 0.003 and d N of 0.000). The d N/d S for tcpA is 0.106 and that for tcpJ is 0.000, suggesting that these genes are under strong negative selection to maintain their sequences and structures. When examining the V. cholerae phylogenetic trees constructed for the mshA and pilA genes, the strains that possess TCP are all from the O1 serogroup and on a single branch (Figs. 3a and 4a). Looking at the amino acid alignment data, it was evident that the V. cholerae isolates containing all three type IV pili were highly conserved (Figs. 1a and 2a). To break it down further, the d N/d S ratios for mshA and pilA from the V. cholerae strains possessing TCP were also calculated, and the d S and d N for both genes were 0.000, resulting in an undefined d N/d S ratio (Table 3). Therefore, V. cholerae strains possessing all three type IV pili appear to be under a strong purifying selection. Even though some O1 V. cholerae isolates in this conserved branch were from environmental or unknown sources (3 out of 13), the fact that they possess TCP implies they could cause cholera. Taken together, the evidence suggests a connection between host interactions and highly conserved type IV pili in V. cholerae.

A previous study by Chattopadhyay et al. [46] analyzed pilA from 55 V. vulnificus strains of various origins and also determined that pilA is highly divergent. A total of 25 unique alleles were identified from the 55 analyzed strains, and the authors did not determine any relationship between the various alleles and pathogenicity of V. vulnificus [46]. They concluded that the genetic diversity of pilA in V. vulnificus was higher than that of neighboring genes (pilBCD) and thus that pilA was under strong positive, diversifying selection [46]. This conclusion was made despite the fact that the d N/d S ratio calculated for pilA was <1. The usefulness of the d N/d S ratio to detect positive selection is reduced when comparing gene polymorphisms within a single population compared to divergent populations [47]. Our results are consistent with their findings and also demonstrate that MSHA and PilA from V. cholerae, V. parahaemolyticus, and V. vulnificus exhibit higher genetic diversity than other genes in their corresponding operons (mshC, pilB, and pilD).

Chattopadhyay et al. [46] suggested various ideas to explain their observation, including that the allelic variability in PilA for V. vulnificus could be the result of the oyster innate immune system. It was noted that since V. vulnificus commonly associates with shellfish in the environment and infections in humans are typically opportunistic, the selective pressure applied to this gene was probably not in response to an adaptive immune system [46]. Shellfish have an innate immune system that recognizes highly conserved motifs while lacking a well-developed adaptive immunity [48, 49]. Thus, the driving force behind the variations observed in the PilA protein could be the result of the innate immunity of shellfish, such as oysters, in part based on a previous study showing that PilA was involved in oyster colonization by V. vulnificus [11, 46]. Data from our laboratory also indicated that PilA and MSHA play a role in V. parahaemolyticus colonization of the Pacific oyster, C. gigas (Aagesen, A.M., and C.C. Häse, unpublished results), further supporting the idea that the shellfish immune system might be involved in applying pressure to these pili proteins, thus causing variability. Studies using different strains expressing the various alleles for MSHA and PilA from V. cholerae, V. parahaemolyticus, and V. vulnificus in shellfish interaction experiments are required to fully address this issue.

In addition to the shellfish immune system, other selective pressures in the environment could exist to cause the observed allelic diversity in MSHA and PilA, such as protozoan grazing, bacteriophages and DNA uptake [46]. Ideally, various alleles for MSHA and PilA from V. cholerae, V. parahaemolyticus, and V. vulnificus would need to be examined to better understand the role of bacteriophages as a selective pressure causing the variations observed for these proteins. However, future studies using various alleles for MSHA and PilA are required to support these hypotheses.

In summary, this study illustrates significant diversity of the MSHA and PilA pilin subunits from V. cholerae, V. parahaemolyticus, and V. vulnificus. For all three vibrios examined in this study, mshA and pilA had considerably higher d N/d S ratios than any of the other genes examined, suggesting these genes are under a possible positive selection while the other genes examined are not. Another interesting finding was that V. cholerae strains that possess TCP also maintain highly conserved MSHA and PilA sequences, suggesting a connection with the host. Even though a selective pressure appears to exist causing the allelic variations observed for mshA and pilA, the mechanism(s) driving this diversification have yet to be determined. Several suggestions can be made, yet evidence to support these ideas awaits further experimental analyses. In addition, our observations raise an important point about the use of these genes in detection methods for these important human pathogens. In particular, some PCR-based detection methods utilize certain pathogen-associated genes as targets, including type IV pili genes [50, 51]. Realizing that the Vibrio mshA and pilA genes can be extremely variable at the 3′ ends of the genes is important to consider when designing primers to target these genes. Therefore, it is possible that a PCR protocol designed to amplify mshA and pilA from various V. cholerae, V. parahaemolyticus, and V. vulnificus strains may not detect these genes simply due to the variations observed in this study. This is certainly something to consider when utilizing these genes in a PCR protocol.
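The primer design concern can be illustrated with a small Python sketch that scans a nucleotide alignment for stretches where every strain carries the same base, since such stretches are the safer places to position primers. The function name conserved_windows, the min_len threshold, and the three toy sequences are hypothetical and are not taken from the alignments in this study.

```python
def conserved_windows(aligned, min_len=18):
    """Return half-open (start, end) column ranges identical in all sequences.

    `aligned` is a list of equal-length aligned nucleotide strings; columns
    containing a gap character ("-") are treated as not conserved.
    """
    length = min(len(seq) for seq in aligned)
    windows = []
    start = None
    for col in range(length):
        column = {seq[col].upper() for seq in aligned}
        same = len(column) == 1 and "-" not in column
        if same and start is None:
            start = col  # open a new conserved run
        elif not same and start is not None:
            if col - start >= min_len:
                windows.append((start, col))
            start = None  # close the current run
    if start is not None and length - start >= min_len:
        windows.append((start, length))
    return windows


# Toy alignment (hypothetical): conserved 5' end, variable 3' end.
toy = [
    "ATGAAAGGTTTTACC",
    "ATGAAAGGTCTAACG",
    "ATGAAAGGTTTGACA",
]
print(conserved_windows(toy, min_len=6))
```

With the toy input, only the first nine columns form a run long enough to report, mirroring the pattern of a conserved 5′ region followed by a variable 3′ region. A real primer search would also need to consider melting temperature, degenerate bases, and strains not represented in the alignment.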