Introduction

Inhospitable environmental conditions prompt microbes to respond to stress by inducing the expression of stress response genes (Barak and Wilkinson 2005; Hecker and Volker 2001). In certain microbes such as Bacillus subtilis, a more elaborate response is induced under conditions of nutrient limitation: endospore formation (Aguilar et al. 2007; Errington 2003). Endospores are able to withstand environmental extremes and have the capacity to lie dormant for thousands, if not millions, of years (Vreeland et al. 2000). The process of endospore formation is time and energy intensive, involving the expression of more than 500 genes over a 6–8-h period (Britton et al. 2002; Eichenberger et al. 2004; Fujita and Losick 2002; Molle et al. 2003; Steil et al. 2003). Because this process becomes irreversible after approximately 2 h (Dworkin and Losick 2005; Parker et al. 1996), mechanisms exist that delay commitment to this process through cannibalism (Claverys and Havarstein 2007). The SdpI protein of Bacillus subtilis is involved in orchestrating one such delay (Ellermeier et al. 2006). It is a transmembrane protein involved in both signal transduction and immunity to the cannibalistic process (Ellermeier et al. 2006).

Under the conditions of nutrient limitation and high population density, the response regulator Spo0A is turned on in about half of the cells in the population (Chung et al. 1994; Fujita and Losick 2002; Gonzalez-Pastor et al. 2003). Spo0A-ON cells switch on transcription of two operons, skfA-H and sdpABC. The skfA-H operon contains genes for the production of a peptide-like antibiotic killing factor and an export pump that transports the killing factor out of the producing cells, thereby avoiding death of Spo0A-ON cells (Gonzalez-Pastor et al. 2003). The sdpABC operon contains three genes that produce and export the SdpC toxin. The toxin and the killing factor lyse Spo0A-OFF cells, and Spo0A-ON cells are able to delay or prevent commitment to endospore formation by feeding off of nutrients released from the dead cells (Ellermeier et al. 2006). They may also use the released DNA for natural transformation (Grossman 1995).

B. subtilis Spo0A-ON cells are immune to both the toxin and the killing factor they produce. The same operon that contains genes for the killing factor also contains genes for an export pump that removes it from the Spo0A-ON cells to avoid self-killing (Gonzalez-Pastor et al. 2003). However, the operon that contains the toxin SdpC does not confer immunity. SdpC is, in fact, an extracellular signaling protein. Through its interaction with SdpI, the transcription of an adjacent convergently transcribed immunity operon, sdpRI, is induced. Thus, SdpI is a transmembrane immunity and signal transduction protein, while SdpR is the autorepressor. In Spo0A-ON cells, external SdpC acts as a ligand to existing SdpI in cell membranes. It alters the conformation of SdpI, inducing sequestration of the autorepressor, internal SdpR. Thus, the sdpRI operon is derepressed so that more sdpI is transcribed and translated. This mechanism confers immunity against the SdpC toxin only when SdpC is present.

In Spo0A-OFF cells, the AbrB repressor prevents expression of the sdpRI operon, and the cells, unable to promote immunity, die in the presence of external SdpC (Ellermeier et al. 2006). It is thus likely that SdpI exhibits two distinct functions: immunity conferral and signal transduction; these two functions are associated with different parts of the protein. Localized mutagenesis of the first half of Bacillus subtilis SdpI hinders its immunity function, while substitutions in the second half of the protein compromise the signal transduction function (Ellermeier et al. 2006). Other forms of resistance to SdpC have been identified: σw-dependent operons yknWXYZ and yfhL confer immunity to SdpC. yknWXYZ encodes an ABC transporter and is speculated to export the SdpC toxin, while yfhL encodes a paralogue of SdpI (Butcher and Helmann 2006).

Here, we use established bioinformatic methodologies to show that the basic element of the SdpI family is a 6 transmembrane α-helical segment (TMS) protein. This basic structure probably derived from a primordial 3 TMS element by intragenic duplication. The resultant 6 TMS protein then underwent another duplication, followed or occasionally preceded by deletion and possible fusion events to give rise to homologous proteins of 3, 4, 5, 7, 8, and 12 putative TMS topologies. The driving force for generation of this unusual degree of topological diversity may have been the bifunctional nature of the B. subtilis SdpI where the first 3 TMS half of this protein serves one function (external binding of SdpC and immunity), while the second 3 TMS half serves another (internal binding of SdpR and signal transduction) (Ellermeier et al. 2006). In addition, we demonstrate the existence of homologues of SdpR and other transcriptional regulators within the operons that code for SdpI homologues.

Methods

Selection of Protein Sequences

A Basic Local Alignment Search Tool (BLAST) search (Altschul et al. 1990) was performed in October 2007 using the SdpI protein of Bacillus subtilis (gi no. 16080431) as the query sequence with two iterations and the default cutoff. More than 100 homologous proteins were retrieved from the National Center for Biotechnology Information (NCBI) database. Eighty-two proteins were retained for topological analysis after redundancies and proteins with greater than 90% identity were eliminated by a modified CD-Hit program (Li et al. 2001, 2002; Saier et al. 2009; Yen et al. 2009). The proteins were further reduced in number to 76 after translating the DNA in all six reading frames and seeking sequence similarities with full-length close homologues of the three translated codirectional reading frames.

The program BCM Search Launcher (Smith et al. 1996) was used to translate the DNA coding for the query protein in the six reading frames at both ends flanking the existing sequence. The amino acid sequences at both the N- and C-termini were examined in all three reading frames for potential fragments, premature truncations, and incorrect initiation codon assignments. This was done for all proteins of the 5 TMS topology and smaller, as well as the inverted 6 TMS protein, Afu2, to establish the legitimacy of their topological deviations from the standard majority of 6 TMSs. If translation of any one of the reading frames before or after the reported sequence revealed significant similarity to another member of the SdpI family, the sequence was reconstituted or excluded from further studies. If not, it was retained and analyzed. In these procedures, any sequence of 20 amino acyl residues (aas) or greater with 0, 1, or 2 stop codons was searched using the BLAST search tool against the NCBI database to gain evidence for or against the possibility that the assigned initiation or termination codon was incorrect. If the BLAST search yielded significant similarity of the segment with a corresponding position of an established member of the SdpI family, the extended portion of the query protein was added to the original protein, and a new BLAST search was performed. If the results brought up a close homologue or a match for this new full-length protein, this protein was excluded from our analyses as its abbreviated topology was most likely artificial. When such procedures did not yield significant hits, the topology of the smaller protein was assumed to be accurate and was retained for further study. The occurrence of multiple such homologues of a specific topology provided confirmation of this assumption.
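The frame-translation screen described above can be illustrated with a minimal sketch. It is not the BCM Search Launcher itself: the function names are hypothetical, the standard genetic code is assumed, and the handling of stop codons (keeping stretches of at least 20 residues that contain no more than two stops) is one possible reading of the stated criterion.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); assumed here.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, frame):
    """Translate one forward frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frames(dna):
    """Translate both strands in all three frames."""
    frames = {}
    rc = reverse_complement(dna)
    for f in range(3):
        frames[("+", f)] = translate(dna, f)
        frames[("-", f)] = translate(rc, f)
    return frames


def candidate_segments(protein, min_len=20, max_stops=2):
    """Return (start, segment) stretches with at most max_stops stop codons
    ('*') and at least min_len residues, as candidates for a BLAST search."""
    stops = [-1] + [i for i, aa in enumerate(protein) if aa == "*"] + [len(protein)]
    last = len(stops) - 1
    segments = []
    for k in range(last):
        end_idx = min(k + max_stops + 1, last)
        start, end = stops[k] + 1, stops[end_idx]
        if end - start >= min_len:
            segments.append((start, protein[start:end]))
        if end_idx == last:  # the stretch already reaches the end of the frame
            break
    return segments
```

Each returned segment would then be submitted to BLAST by hand or through a separate tool; that step is deliberately left out of the sketch.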

A second BLAST search was performed on May 21, 2009, using the SdpI protein of Bacillus cereus, Bce2 (gi no. 42784033), a close orthologue of the B. subtilis protein, as the query sequence with two iterations. This was done to update the family, where new members with unexpected topologies were sought. The BLAST search with a cutoff of e−4 for the first iteration and a cutoff of e−5 for the second iteration yielded 316 homologues. All 316 homologues were analyzed, and their topologies were mapped manually. Proteins with new topologies, or topologies with only one previous example, were then added to the previously existing family. Nine proteins were added to the original list. The DNA translating procedure which used the program BCM Search Launcher was also applied to the nine added proteins.

Phylogenetic, Hydropathy, and Sequence Analyses

Homologous sequences were multiply aligned using the ClustalX program (Thompson et al. 1997), and phylogenetic trees were visualized using the TreeView program (Zhai and Saier 2002; Zhai et al. 2002). Default parameters of ClustalX were used to align the sequences. Topological analyses of the individual proteins and the multiply aligned homologues were performed by the WHAT (Zhai and Saier 2001a) and AveHAS (Zhai and Saier 2001b) programs, respectively. For the latter program, the ClustalX alignment was used as input to calculate average hydrophobicity and average similarity as a function of alignment position. The window size used was 19 aas. Statistical sequence similarity comparisons between proteins, and between internal regions of these proteins, were conducted by the IC (Zhai and Saier 2002) and GAP (Devereux et al. 1984) programs. These programs randomly shuffle the desired amino acid sequences and compare these shuffled sequences with the original sequences. In effect, they correct for unusual protein compositions such as those that occur in integral membrane proteins. Default settings and 500 random shuffles have been shown to be satisfactory for obtaining statistically significant values (Yen et al. 2009). A value of 10 standard deviations (SD) for comparable regions of two proteins of at least 60 aas in length, corresponding to a probability of 10^−24 that the observed degree of sequence similarity arose by chance (Dayhoff et al. 1983; Saier et al. 2009; Yen et al. 2009), is considered sufficient to establish homology. These proteins were then analyzed topologically and phylogenetically. The term TMS refers throughout to putative transmembrane spanners based on hydropathy analyses, because none of the proteins in this family has been characterized topologically.
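The logic of a shuffle-based comparison score can be sketched as follows. This is an illustration only, not the IC or GAP implementation: the alignment scorer is a toy global aligner with arbitrary match, mismatch and gap values, only one of the two sequences is shuffled, and the conversion of a score in SD to a probability assumes a Gaussian distribution of shuffled scores. All function names are hypothetical.

```python
import math
import random
import statistics


def nw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Toy Needleman-Wunsch global alignment score (assumed scoring values)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def comparison_score_sd(a, b, shuffles=500, seed=0):
    """Score of the real alignment expressed in SD above the mean score
    obtained when b is randomly shuffled (composition is preserved)."""
    rng = random.Random(seed)
    real = nw_score(a, b)
    null = []
    for _ in range(shuffles):
        residues = list(b)
        rng.shuffle(residues)
        null.append(nw_score(a, "".join(residues)))
    # Note: stdev is zero for a homopolymer; such input is not handled here.
    return (real - statistics.mean(null)) / statistics.stdev(null)


def gaussian_tail(z):
    """One-sided probability of exceeding z SD under a normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))
```

Under the Gaussian assumption, `gaussian_tail(10)` is on the order of 10^−24, which agrees with the probability quoted for the 10 SD threshold.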

Motif Analyses

All of the SdpI proteins within our study were analyzed for motifs using the MEME program (Bailey and Elkan 1995). Default settings were used, except that the condition “any number of repetitions” was selected for the prediction of how single motifs were distributed among the sequences. The consensus sequences generated by the program guided the determination of the consensus sequences of the phylogenetic clusters through analysis of the ClustalX alignments of the individual clusters. The locations of the motifs were determined for individual proteins relative to the locations of the TMSs using the hydropathy plots generated by the WHAT program.

Determination of Protein Orientation Within the Cell Membrane

The orientations of the SdpI homologues in the cell membrane were estimated by the HMMTop (Tusnady and Simon 2001) and TMHMM (Krogh et al. 2001) programs. If and only if the two programs provided contradictory results were the proteins examined manually. The positively charged amino acyl residues (arginine and lysine) were counted in the first and last 20 aas of the primary sequence (unless otherwise specified; see Table S1 for exceptions), as well as in the loop regions between the TMSs. The inter-TMS loops were located using the TMHMM program and confirmed with the WHAT program (Zhai and Saier 2001a). The positive-inside rule was then applied to determine orientation of the proteins within the cell membrane (von Heijne and Gavel 1988). Table S1 lists the proteins analyzed manually and includes the regions of the primary sequences that were examined for positively charged amino acyl residues. The numbers of positively charged residues (Rs and Ks) that were counted in the above mentioned regions are also recorded in Table S1. The regions with the largest numbers of positively charged residues were assumed to be located inside the cell. This process estimated orientation in the cell membrane. For proteins Bcl2 and Cte1, the WHAT program was also used to determine the N- and C-terminal and loop regions because the TMHMM program did not recognize all of the putative TMSs.
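The manual application of the positive-inside rule described above can be summarized in a short sketch. The function names, the representation of TMS boundaries as 0-based half-open index pairs, and the truncation of the terminal tails to 20 residues are assumptions made for illustration; the TMS positions themselves would come from a predictor such as TMHMM or from a hydropathy plot.

```python
def count_positive(segment):
    """Count arginine (R) and lysine (K) residues in a segment."""
    return sum(1 for aa in segment if aa in "RK")


def predict_n_terminus(sequence, tms_spans, tail=20):
    """Apply the positive-inside rule.

    tms_spans: ordered list of (start, end) pairs, 0-based and half-open.
    Returns "in" or "out" for the N-terminus, or None if the counts tie.
    """
    first_start = tms_spans[0][0]
    last_end = tms_spans[-1][1]
    # N-terminal tail, inter-TMS loops, C-terminal tail, in sequence order.
    regions = [sequence[:first_start][:tail]]
    regions += [sequence[e1:s2] for (_, e1), (s2, _) in zip(tms_spans, tms_spans[1:])]
    regions.append(sequence[last_end:][-tail:])
    # Regions alternate sides of the membrane, starting with the N-terminal side.
    n_side = sum(count_positive(r) for r in regions[0::2])
    other_side = sum(count_positive(r) for r in regions[1::2])
    if n_side == other_side:
        return None
    return "in" if n_side > other_side else "out"
```

For example, a two-TMS sequence whose N-terminal tail and C-terminal tail carry more R and K residues than its single loop would be assigned an N-terminus inside the cell.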

Operon Analyses

Three representative proteins were chosen from every topological group of proteins (i.e., three from the proteins with 3 TMSs, three from the 4 TMS proteins, etc.). The encoding operons were examined using the Genome Browser feature of the National Microbial Pathogen Data Resource (NMPDR) database (McNeil et al. 2007). Some proteins were excluded from the analysis if their genomes were not yet represented in the NMPDR genome database or if a genome, although represented, was not yet populated with genes in the vicinity of the purported locations of the sdpI homologues included in the analysis. sdpI homologues were considered to be in the same operon with other cistrons if the distance between them was between −8 and 30 bp. Elements suspected of being in the same operon as the sdpI homologue were identified using BLAST searches, and the results are tabulated in Table S2. Within the BCM Search Launcher’s Gene Feature Search, the Prokaryotic Promoter Prediction by Neural Network (Smith et al. 1996) was used to find promoters with a score cutoff of 0.80 upstream of the alleged operons to verify their legitimacy.
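The distance criterion used to assign cistrons to an operon can be expressed as a small sketch. The `Gene` record, the function names, and the gene names in the test data are hypothetical; the requirement that neighbouring genes lie on the same strand, and the definition of distance as the number of bases between the end of one gene and the start of the next (negative when they overlap), are assumptions not stated explicitly in the text.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_distance(upstream, downstream):
    """Bases separating two genes; negative values indicate overlap."""
    return downstream.start - upstream.end - 1


def operon_members(genes, target, min_gap=-8, max_gap=30):
    """Names of genes chained to `target` by gaps within [min_gap, max_gap]."""
    ordered = sorted(genes, key=lambda g: g.start)
    idx = next(i for i, g in enumerate(ordered) if g.name == target)

    def linked(up, down):
        same_strand = up.strand == down.strand  # assumption, see lead-in
        return same_strand and min_gap <= intergenic_distance(up, down) <= max_gap

    lo = idx
    while lo > 0 and linked(ordered[lo - 1], ordered[lo]):
        lo -= 1
    hi = idx
    while hi < len(ordered) - 1 and linked(ordered[hi], ordered[hi + 1]):
        hi += 1
    return [g.name for g in ordered[lo:hi + 1]]
```

Promoter prediction upstream of the inferred operon, which the authors used as a further check, is outside the scope of this sketch.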

Results

Table 1 lists the proteins of the SdpI family analyzed in this study alphabetically within each phylogenetic cluster (Fig. 1). A multiple alignment of these proteins may be found on our Web site (http://biology.ucsd.edu/~msaier/supmat/SdpI) (Fig. S1).

Table 1 Proteins of the SdpI family included in this study, listed alphabetically according to cluster
Fig. 1 Phylogenetic tree of the SdpI family. Phylogenetic clusters are labeled 1–10. The tree is based on the ClustalX multiple alignment shown in Fig. S1 (http://www.biology.ucsd.edu/~msaier/supmat/SdpI/mat/FigureS1.html) and drawn with the TreeView program. Protein abbreviations are listed in Table 1

Classification of Organisms Possessing SdpI Family Members

Organisms represented are predominantly Firmicutes, with 52 of the 87 homologues derived from this bacterial phylum. Euryarchaeota and Actinobacteria were equally represented (11 homologues each). There were also representatives from γ-Proteobacteria (1), α-Proteobacteria (3), Bacteroidetes (3), Chlorobi (2), Chloroflexi (2), Acidobacteria (1), and Deinococcus (1). The proteins vary widely in size, with sequences as short as 137 aas (Hma1 from Haloarcula marismortui) and as long as 404 aas (Dge1 from Deinococcus geothermalis). The majority of the proteins are near 200 aas (170–230) in length and exhibit putative 6 TMS topologies. The SdpI family appears to be topologically heterogeneous; it includes four proteins predicted to have 3 TMSs, nine proteins with 4 TMSs, six proteins with 5 TMSs, fifty-eight proteins with 6 TMSs, four proteins with 7 TMSs, five proteins with 8 TMSs, and one protein with 12 TMSs.
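As a quick consistency check on the tallies given above, both the taxonomic and the topological counts can be summed and compared with the total of 87 homologues. The numbers are taken directly from the text; the variable names are illustrative only.

```python
# Homologue counts per taxonomic group, as reported in the text.
phylum_counts = {
    "Firmicutes": 52,
    "Euryarchaeota": 11,
    "Actinobacteria": 11,
    "γ-Proteobacteria": 1,
    "α-Proteobacteria": 3,
    "Bacteroidetes": 3,
    "Chlorobi": 2,
    "Chloroflexi": 2,
    "Acidobacteria": 1,
    "Deinococcus": 1,
}

# Protein counts per predicted number of TMSs, as reported in the text.
topology_counts = {3: 4, 4: 9, 5: 6, 6: 58, 7: 4, 8: 5, 12: 1}

total_by_phylum = sum(phylum_counts.values())
total_by_topology = sum(topology_counts.values())

# Percentages of the family that are 6 TMS proteins and that are from Firmicutes.
pct_six_tms = round(100 * topology_counts[6] / total_by_topology, 1)
pct_firmicutes = round(100 * phylum_counts["Firmicutes"] / total_by_phylum, 1)

print(total_by_phylum, total_by_topology, pct_six_tms, pct_firmicutes)
```

Both tallies sum to 87, with about two thirds of the proteins showing the 6 TMS topology.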

SdpI Homologues

Figure 1 shows the phylogenetic tree for the SdpI family proteins included in this study. These proteins cluster primarily in accordance with topology, and to a lesser degree with organism type. Cluster 1 is made up only of 4 TMS proteins, with the majority being from Firmicutes. The two exceptions are the most distant members of the cluster, Afu1 from Archaeoglobus fulgidus, a euryarchaeon, and Csp1 from Cellulophaga sp. MED134, a member of the Bacteroidetes. Cluster 2 is composed of eight proteins, a 4 TMS homologue from Staphylococcus aureus (a Firmicute), two 5 TMS proteins (both from Actinobacteria) and five 8 TMS homologues, of which four are from Firmicutes and one is from an actinobacterium. Cluster 3 contains all of the 3 TMS proteins, four corynebacterial (actinobacterial) orthologues.

Cluster 4 contains five proteins, Afu2 from Archaeoglobus fulgidus (6 TMSs), Dge1 from Deinococcus geothermalis (a 12 TMS homologue), and three 7 TMS homologues: Tko1 from Thermococcus kodakarensis, Ton1 from Thermococcus onnurineus, and Tsp3 from Thermococcus sp. AM4. The proteins in this cluster are all from Euryarchaeota except for Dge1. Surprisingly, they were found to have an inverted order of their two 3 TMS segments relative to the standard 6 TMS majority type. Accordingly, the first 3 TMSs in these proteins show a high degree of sequence similarity with the last 3 TMSs in the standard 6 TMS homologues, while the last 3 TMSs more closely resemble the first 3 TMSs in the standard 6 TMS homologues.

Cluster 5 contains three proteins of varying topologies. Aba1 from Acidobacteria bacterium (an acidobacterium) has 6 TMSs; Cte1 from Chlorobium tepidum (a member of the Chlorobi) has 5 TMSs, and Pae1 from Prosthecochloris aestuarii (also a member of the Chlorobi) has 4 TMSs. Cluster 6 is composed predominantly of 6 TMS proteins from Firmicutes, with the exception of the 4 TMS Hma1 homologue from Haloarcula marismortui, a member of the Euryarchaeota. Cluster 7 is composed of four proteins, all from Firmicutes; two are 6 TMS homologues, and two are 5 TMS homologues.

Cluster 8 is made up of only 6 TMS homologues derived exclusively from Firmicutes. Cluster 9 is also derived from Firmicutes and comprises 6 TMS proteins with just two exceptions: a 5 TMS protein from Bacillus clausii (Bcl2) and a 7 TMS homologue from Dorea longicatena (Dlo1). Cluster 10 contains only 6 TMS homologues, predominantly from Firmicutes, although five other phyla are represented (Table 1). It is interesting to note that most of the 6 TMS proteins cluster loosely together (clusters 8–10), while proteins of other topologies are phylogenetically more distant.

Search for Internal Repeats Within the 6 TMS Proteins

All of the 6 TMS proteins were analyzed for internal duplication of a 3 TMS segment and triplication of a 2 TMS segment, the two principal routes by which 6 TMS proteins have been shown to arise in other families (Kimball et al. 2003; Lee et al. 2007; Saier 2003). However, we could not demonstrate homology of repeat segments because both pathways gave comparable results below the threshold comparison score needed for proof of homology, 10 SD (Saier 1994; Saier et al. 2009).

Several lines of indirect evidence support the suggestion that a 3 TMS primordial precursor duplicated internally to give rise to the standard 6 TMS topologies, as discussed below.

Sequence and Topological Analyses

The archaeal SdpI proteins, Afu2 (6 TMSs), Tko1 (7 TMSs), Ton1 (7 TMSs), and Tsp3 (7 TMSs), proved to have inverted segments of 3 TMSs relative to the standard 6 TMS homologues; TMSs 1–3 of the standard 6 TMS proteins are homologous to TMSs 4–6 of the inverted proteins, and TMSs 4–6 of the standard 6 TMS proteins are homologous to TMSs 1–3 of the inverted proteins. All of the inverted 7 TMS proteins aligned throughout with each other and with TMSs 1–6 of the inverted 6 TMS protein, Afu2 (Fig. S2). The seventh peak of the inverted 7 TMS proteins showed statistically significant similarity to peak 10 of the Dge1 12 TMS protein (Fig. S3). It seems likely that the inverted 6 TMS protein arose from the inverted 7 TMS proteins with a one-TMS deletion event at the C-terminus (see “Discussion” and Figs. 8 and 9).

To demonstrate the inversion, a representative of the standard 6 TMS topology, Bce2 of Bacillus cereus, was chosen arbitrarily for comparison with Afu2, one of the inverted proteins. Figure 2 shows the hydropathy plots for Afu2 and Bce2 where this inversion may be visualized. With respect to the relative positions of hydrophobic peaks in their WHAT-generated hydrophobicity plots (Zhai and Saier 2001a), the first half of Afu2 resembles the second half of Bce2, and the first half of Bce2 resembles the second half of Afu2. Figure 3a shows the GAP analysis between TMSs 1–3 of Afu2 and TMSs 4–6 of Bce2, with a comparison score of 16.6 SD. Figure 3b shows the GAP analysis between TMSs 4–6 of Afu2 and TMSs 1–3 of Bce2, with a comparison score of 15.5 SD. These values are substantially in excess of what is required to establish homology (Saier 1994; Saier et al. 2009).

Fig. 2 a Hydropathy plot of the SdpI protein from Bacillus cereus (Bce2) with numbered peaks of hydropathy corresponding to putative TMSs. b Hydropathy plot of the SdpI homologue from Archaeoglobus fulgidus (Afu2) with numbered TMSs. The letters correspond to the homologous TMSs between the two proteins, demonstrating the inversion within Afu2 relative to the standard 6 TMS proteins, represented here by Bce2
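The kind of hydropathy plot used to count putative TMSs can be approximated with a sliding-window average. The sketch below is not the WHAT program: it assumes the Kyte-Doolittle hydropathy scale, uses the 19-residue window mentioned in the Methods, and applies a threshold of 1.6 as a rough cutoff for a candidate membrane-spanning peak. The function names and the threshold choice are assumptions.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of each full window, indexed by window start.
    Raises KeyError for residues outside the 20 standard amino acids."""
    values = [KD[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def count_peaks(profile, threshold=1.6):
    """Count contiguous runs of the profile at or above the threshold."""
    peaks = 0
    inside = False
    for value in profile:
        above = value >= threshold
        if above and not inside:
            peaks += 1
        inside = above
    return peaks
```

Comparing the positions of such peaks between two proteins, as was done for Bce2 and Afu2, still requires an alignment; the profile alone only suggests how many hydrophobic segments are present.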

Fig. 3 a GAP comparison of the first 3 TMS segment of Afu2 (residues 1 to 105) with the second 3 TMS segment of Bce2 (residues 111 to 212) using the GAP program. Quality: 102; gaps: 5; percentage similarity: 44.4; percentage identity: 33.3. The average comparison score was 16.6 SD. b GAP comparison of the second 3 TMS segment of Afu2 (residues 106 to 228) with the first 3 TMS segment of Bce2 (residues 1 to 110) using the GAP program. Quality: 87; length: 125; gaps: 3; percentage similarity: 38.9; percentage identity: 21.3. The average comparison score was 15.5 SD

Excluding the four archaeal proteins with inverted 3 TMS segments noted above, all of the 6 TMS proteins aligned with each other throughout their lengths. We then analyzed proteins with other topologies to determine the regions of homology with the standard 6 TMS homologues. In the corynebacterial proteins with 3 TMSs (cluster 3), the 3 TMSs correspond only to TMSs 4–6 in the 6 TMS proteins (Fig. S4). The 4 TMS proteins align with each other and correspond to TMSs 1–4 in the 6 TMS proteins. Figure 4 presents a GAP analysis of the 4 TMS Hma1 homologue with the 6 TMS Gka1 protein; it reveals the aforementioned alignment with a comparison score of 15.3 SD. Proteins with 4 TMSs are found predominantly in cluster 1, the three exceptions being Pae1 from Prosthecochloris aestuarii, found in cluster 5, Sau1 from Staphylococcus aureus, located in cluster 2, and Hma1 from Haloarcula marismortui, located in cluster 6. Hma1 is distantly related to all of the 6 TMS proteins. This in turn leads to the supposition that the 4 TMS topology arose at least twice from the 6 TMS proteins, once by truncation of a cluster 6 homologue, leading to the formation of Hma1, and once by truncation of a cluster 1 6 TMS homologue. Pae1 is associated with Cte1 from Chlorobium tepidum, a 5 TMS protein whose hydrophobic peaks 2–5 correspond to peaks 1–4 in Pae1 and any of the standard 6 TMS proteins. The first peak of Cte1 aligns with the third peak of Afu2 (Fig. S5), which corresponds to peak 6 of the standard 6 TMS proteins, suggesting that this unique 5 TMS topology arose from the inverted 7 TMS protein through a 2 TMS deletion event at the N-terminus. Pae1 and Cte1 are found in cluster 5 along with Aba1. Aba1 is the longest 6 TMS protein with 303 aas. Only the first 210 aas constitute the membrane-integrated portion of the protein. The remainder of the protein did not show homology with any region of any of the other proteins examined in this study. A BLAST search of this tail region yielded only hypothetical proteins, so no function of the region can be inferred.

Fig. 4 GAP alignment demonstrating the regions of homology between the 6 and 4 TMS topological types within the SdpI family. Gka1 (residues 1 to 145 of 214), a 6 TMS representative, is compared with Hma1 (residues 1 to 137), a 4 TMS representative. Quality: 106; gaps: 4; percentage similarity: 40.9; percentage identity: 29.5. The average comparison score was 15.3 SD

The 5 TMS proteins proved to have the most varied topologies. There are four unique 5 TMS topologies, each aligning slightly differently with the standard 6 TMS proteins. Cte1 (cluster 5) is the only protein within the SdpI family to have its TMSs 2–5 aligning with TMSs 1–4 in the standard 6 TMS proteins (Fig. S6). The first peak of Cte1 aligns with peak 6 of the standard 6 TMS homologues. Bcl2, with a differing 5 TMS topology, has peaks 1–5 aligning with peaks 2–6 of the standard 6 TMS proteins (Fig. S7). It is found within cluster 9, clustering mainly with 6 TMS proteins, suggesting that it evolved by deletion of a TMS from the N-terminus of a 6 TMS protein. The third variation in the 5 TMS topology is exemplified by two proteins: Sgo1 and Ssa2. These two proteins align with each other, and their peaks, numbered 1–5, correspond to peaks 1–5 of the standard 6 TMS proteins (Fig. S8). They appear in cluster 7 with 6 TMS proteins and seem to have arisen by deletion of a TMS from the C-terminus of a standard 6 TMS protein. The final 5 TMS topological variant type is illustrated by proteins Rsa1 and Cgl2. Peaks 1–4 in these two proteins align with peaks 1–4 of the standard 6 TMS proteins (Fig. S9). Their fifth peak corresponds best to the eighth peak of the 8 TMS proteins. Rsa1 and Cgl2 align with the 8 TMS proteins throughout their lengths, with their TMSs 1–5 aligning with TMSs 4–8 in the 8 TMS homologues. The two 5 TMS proteins align with each other throughout and align extremely well with the 8 TMS proteins, as revealed by a comparison score of 35.4 SD between proteins Rsa1 from Renibacterium salmoninarum, a 5 TMS protein, and Lsp1 from Lysinibacillus sphaericus, an 8 TMS homologue (Fig. S10).

The 8 TMS homologues, although aligning well with themselves, align only partially with the standard 6 TMS proteins. Peaks 4–7 of the 8 TMS proteins align with peaks 1–4 of the standard 6 TMS proteins (Fig. S11). The eighth peak of the 8 TMS homologues and the fifth peak of Rsa1 and Cgl2 are designated “A” and do not match any of the TMSs within other members of the family. The 8 TMS homologues align with the inverted 7 TMS proteins, with peaks 1–7 of the inverted 7 TMS proteins aligning with peaks 1–7 of the 8 TMS proteins (Fig. S12). Thus, the first three TMSs of the 8 TMS homologues correspond to peaks 4–6 of the standard 6 TMS proteins. The 8 TMS proteins and the two 5 TMS proteins (Rsa1 and Cgl2) are found in cluster 2 along with a 4 TMS protein, Sau1. It is possible that the 8 TMS proteins arose by deletion of three TMSs at the N-terminus of a 12 TMS Dge1-like protein and deletion of either one or two TMSs at its C-terminus. If two TMSs were deleted at the C-terminus of a Dge1-like protein, then a subsequent fusion event of one TMS must have occurred at the C-terminus of the 8 TMS homologues which corresponds to the eighth peak of the 8 TMS proteins. If, on the other hand, one TMS was deleted at the C-terminus of a 12 TMS protein, then the eleventh peak of the 12 TMS protein, corresponding to the eighth peak of the 8 TMS homologues, diverged in sequence so much that statistically significant similarity could not be found between the two. The 5 TMS topology most likely arose from an 8 TMS protein precursor by deletion of three TMSs at the N-terminus of the 8 TMS protein. It is possible that the 4 TMS topology may have arisen several times within the SdpI family, and that Sau1, as it clusters most closely with the 8 and 5 TMS homologues, may have arisen from a 5 TMS protein precursor by a deletion event at the C-terminus.

There are two variations of the 7 TMS topology. The first is an inverted topology as already discussed. The second is observed in Dlo1 with TMSs 1–6 aligning with TMSs 1–6 of the standard 6 TMS proteins (Fig. S13). The seventh peak of Dlo1 does not align with any other peak within the SdpI family and is designated “B.” This protein is found in cluster 9 with 6 TMS proteins and Bcl2 of 5 TMSs. This clustering leads to the possibility that Dlo1 originated from a 6 TMS protein by addition of a C-terminal TMS, but it may equally well have arisen from a larger precursor derived from a 12 TMS protein by deletion of 5 TMSs at the C-terminus.

An Internal Duplication Within Dge1

Dge1, a 12 TMS protein, was cut in half to test for an internal duplication. A GAP analysis of the first 6 TMSs against the second 6 TMSs yielded a comparison score that was insufficient to establish homology. However, when the two halves were compared to the 6 TMS proteins, statistically significant similarity was found between several 6 TMS proteins and both halves of Dge1, clearly implying, by the Superfamily Principle (Doolittle 1981; Saier 1994), that an intragenic duplication event of the basic 6 TMS element had led to the formation of the 12 TMS protein. The best comparison score was 19.3 SD, generated by the comparison of the first half of Dge1 with Bcl1, with TMSs 4–6 of Dge1 corresponding to TMSs 4–6 of Bcl1 (data not shown). The 6 TMS protein, Dha1, aligned with both halves of Dge1. Alignment with the first half of Dge1 gave a comparison score of 14.6 SD (Fig. 5a), while alignment of the second half of Dge1 with Dha1 gave a comparison score of 15.4 SD (Fig. 5b).

Fig. 5 a GAP comparison of Dge1 (top, first half, residues 10 to 186), a 12 TMS protein, with Dha1 (bottom, residues 60 to 217), a 6 TMS protein, obtained with the GAP program. Quality: 114; gaps: 10; percentage similarity: 42.3; percentage identity: 31.4. The average comparison score was 15.4 SD. b GAP comparison of Dha1 (residues 6 to 198), a 6 TMS protein, with Dge1 (bottom, second half, residues 209 to 402), a 12 TMS protein, using the GAP program. Quality: 94; gaps: 4; percentage similarity: 34.8; percentage identity: 26.7. The average comparison score was 14.6 SD

The duplication event that led to the appearance of the 12 TMS Dge1 was evidently followed by extensive sequence divergence within both halves of this protein. The middle region of Dge1, spanning approximately 6 TMSs in length (TMSs 4–9), is better conserved than the end regions spanning TMSs 1–3 and TMSs 10–12. This is evident in the alignment of the inverted 6 TMS protein, Afu2, with TMSs 4–9 in Dge1, yielding a comparison score of 20.9 SD (Fig. S14). The appearance of the hydropathy plot (WHAT program) for Dge1 also supports the conclusion of an internal duplication (Fig. 6). The evidence supports the proposal that the 6 TMS proteins represent the basic element for the SdpI family from which other family members evolved. These observations suggest that the standard 6 TMS protein may have duplicated internally to give 12 TMS proteins, that 12 TMS proteins may have led to the formation of the 8 TMS homologues, and that further deletions led to the inverted 7 and 6 TMS proteins.

Fig. 6 Hydropathy plot of the SdpI homologue from Deinococcus geothermalis (Dge1) with numbered TMSs. Letters correspond to the homologous TMSs within the protein that probably arose by intragenic duplication. The plot was generated using the WHAT program

Topological Comparisons

Figure 7 shows the average hydropathy plot (top) and average similarity plot (bottom) for the SdpI family of proteins excluding the four internally inverted proteins, Afu2, Tsp3, Ton1, and Tko1, and with the 12 TMS protein, Dge1, cut into two 6 TMS segments. The plots were generated from the multiple alignment shown in Fig. S15. Alignment of the proteins is shown according to their topologies (Fig. 7) as summarized in Fig. 8. Proteins of the 6 TMS topology, with the exception of the four inverted proteins, all align with TMSs 1–6 of all the others. The 4 TMS proteins align with each other as well as with TMSs 1–4 of the standard 6 TMS proteins. The 3 TMS proteins also align with each other and with TMSs 4–6 of the standard 6 TMS proteins. The four varying 5 TMS topologies partially align with each other; TMSs 2–5 of Cte1 align with TMSs 1–4 of the standard 6 TMS proteins, and the first TMS of Cte1 corresponds to the sixth TMS of the standard 6 TMS proteins. In Bcl2, TMSs 1–5 align with TMSs 2–6 of the standard 6 TMS proteins. TMSs 1–5 of Sgo1 and Ssa2 align with each other and with TMSs 1–5 of the standard 6 TMS proteins. Rsa1 and Cgl2 align with each other, and their TMSs 1–4 align with TMSs 1–4 of the standard 6 TMS proteins. TMSs 1–6 of Dlo1 (7 TMS topology) align with TMSs 1–6 of the standard 6 TMS proteins. TMSs 1–3 of the 8 TMS proteins align with TMSs 4–6 of the standard 6 TMS proteins, and TMSs 4–7 of the 8 TMS proteins align with TMSs 1–4 of the standard 6 TMS proteins. Finally, TMSs 1–6 and TMSs 7–12 of the 12 TMS protein, Dge1, align with TMSs 1–6 of the 6 TMS proteins as noted above.

Fig. 7

Average hydropathy (top) and similarity (bottom) plots for the SdpI family excluding the four inverted proteins Afu2, Tsp3, Ton1, and Tko1, and with the 12 TMS protein, Dge1, spliced into two 6 TMS long halves. These plots were generated using the AveHAS program based on the ClustalX multiple alignment shown in Fig. S2 on our Web site. Between the two plots are the designations of the TMSs, which are indicated either by a number (1–12) if conserved between the different groups, or by a letter (A or B) if not conserved among the groups of proteins. At right, the total numbers of putative TMSs of each topological type are presented together with representative examples. All TMSs in a single vertical column are homologous regardless of the number designations used except for TMSs indicated by letters. TMS A is not demonstrably homologous to TMS B. Note: the first peak, labeled Cte1, marks the region where the first TMS of Cte1 aligned; because Cte1 is the only representative within the SdpI family to have this region, it is poorly displayed in the AveHAS plot

Fig. 8

Topological types of proteins of the SdpI family analyzed in this work. The left column indicates the number of TMSs in each topological type of proteins analyzed as well as representative proteins. The center column shows the arrangement of the TMSs. The topological types are aligned by regions of homology, that is, TMSs found in the same column are demonstrably homologous to each other unless they are designated by letter. The number of each TMS is assigned by its corresponding TMS of homology within the standard 6 TMS proteins. The location of motif 1 is denoted by “*”. The location of motif 2 is denoted by “‡”. The right column lists the cluster numbers assigned in the phylogenetic tree (Fig. 1) in which proteins of the topological type of the same row are found. i denotes inside the cell; o denotes outside the cell

Motif Analyses

Proteins of the SdpI family have two well-conserved motifs that were recognized by the MEME program (Bailey and Elkan 1995). The best conserved motif, motif 1 (AL[YW]PXLP[ED]R[VI][PA][VI]H[WF][NG]ASGE[VP][DN][GR][YF][GM]SKF[EV][GL]) (alternative residues at a single position are in brackets; X = any residue), is found in most members of the family that include TMSs 1 and 2. On the basis of results obtained with the MEME and WHAT programs, motif 1 spans the hydrophilic region between the first and second TMSs of the standard 6 TMS proteins. Clusters 1, 2, 4, 5, 6, 7, 8, and 10 contain variants of motif 1. The absence of this motif in cluster 3 is logical because cluster 3 contains the 3 TMS proteins homologous to TMSs 4–6 of the standard 6 TMS proteins. Therefore, motif 1 would not be expected to appear in these proteins. Lsa1 from Lactobacillus salivarius and Bsu1 from Bacillus subtilis are the only members of cluster 9 to have motif 1.

The second best conserved motif, motif 2 ([IV]G[LI]L[FL]I[VG][LI]GNY[LM][PG]KX[KR]PN[YW]F[VI]GIRTPWTLS[SN][ED]EVW[RN]KT[HN]R[LF][GA]G[KR][LV][FW]V[IAV][GA]G), is well conserved in the majority of the members of the family. It spans the hydrophilic region between the fourth and fifth TMSs in the standard 6 TMS proteins. It was also identified in the expected locations of most of the other topological variants that include TMSs 4 and 5. Using the 3 TMS proteins as an example, motif 2 is found between the first and second TMSs as expected because these proteins align with TMSs 4–6 of the standard 6 TMS proteins. Figures 8 and S16 depict the locations of the recognized motif 2 variants in all of the proteins displaying this motif within the SdpI family. All members of clusters 3, 4, 8, 9, and 10 have this motif, but Lpl1 from Lactobacillus plantarum is the only protein in cluster 7 for which the MEME program recognized motif 2. Likewise, Cac2 from Corynebacterium accolens and Swo1 from Syntrophomonas wolfei were the only proteins in cluster 2 for which MEME identified this motif. It is possible that this motif deviates in sequence in some clusters. Such differences may have functional significance (see “Discussion”).
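The bracketed consensus notation used for motifs 1 and 2 maps directly onto a regular expression: bracketed alternatives become character classes and X becomes a wildcard. The sketch below shows that mapping. It is an assumption-laden illustration, not the MEME procedure: MEME scores position-specific matrices, whereas an exact consensus match like this one would miss the divergent motif variants discussed above. The function names and the test sequence are hypothetical.

```python
import re

# Consensus strings copied from the text (brackets = alternatives, X = any residue).
MOTIF_1 = "AL[YW]PXLP[ED]R[VI][PA][VI]H[WF][NG]ASGE[VP][DN][GR][YF][GM]SKF[EV][GL]"
MOTIF_2 = ("[IV]G[LI]L[FL]I[VG][LI]GNY[LM][PG]KX[KR]PN[YW]F[VI]GIRTPWTLS[SN][ED]"
           "EVW[RN]KT[HN]R[LF][GA]G[KR][LV][FW]V[IAV][GA]G")


def consensus_to_regex(consensus):
    """Compile a bracketed consensus into a regex; X matches any residue.

    Assumes X never appears inside a bracketed alternative, which holds
    for the two consensus strings above.
    """
    return re.compile(consensus.replace("X", "."))


def find_motif(consensus, sequence):
    """Return (start, matched_text) for each exact consensus match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence: an arbitrary instance of motif 1 with short flanks.
example = "MKK" + "ALYPALPERVPVHWNASGEVDGYGSKFEG" + "TTT"
hits = find_motif(MOTIF_1, example)
print(hits)
```

In practice, a position-weighted scan is the better choice for a family this divergent; the regex form is mainly useful as a quick check that a sequence carries a near-canonical copy of a motif.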

The majority of the proteins with the standard 6 TMS topology have one of three combinations of these two motifs. The 6 TMS proteins from clusters 8 and 10 contain both motifs. The four “inverted” proteins of 6 and 7 TMSs were also found to contain the same combination of motifs albeit in an inverted manner.

Of the 6 TMS proteins of clusters 5, 6, and 7, MEME recognized only motif 1 with the exception of Lpl1 of cluster 7, which displays both motifs. Finally, cluster 9 contains 6 TMS proteins in which only motif 2 was recognized by MEME except for the aforementioned proteins, Lsa1 and Bsu1.

All of the standard 6 TMS proteins align throughout their lengths and have high comparison scores with one another despite variations in the sequences displayed by these two motifs. The cluster differences for these two motifs are summarized in Table 2 (A and B) as are the sequence similarities between the consensus motifs 1 and 2 (Table 2C).

Table 2 Summary of the similarities and differences within and between the sequences of motifs 1 (A) and 2 (B) among clusters; (C) shows an alignment of the consensus motif 1 (M1) with the consensus motif 2 (M2)

Operon Analyses

The genomes encoding 17 SdpI homologues, those available in the NMPDR Genome Browser database, were examined. Analysis of the genomic environments of sdpI genes was limited by the data currently available for the genomes of the species examined. Of the 17 genomes encoding SdpI homologues, only 6 contained sdpI genes encoded within multicistronic operons. The remaining 11 sdpI genes could not be conclusively linked to other genes. All six operons included a transcriptional regulator gene upstream of the SdpI homologue gene. Both Bce2 from Bacillus cereus ATCC 10987 and Bsu1 from Bacillus subtilis subsp. subtilis str. 168 had ArsR family transcriptional regulators encoded 3 bp upstream of the sdpI genes. Lpl1 from Lactobacillus plantarum WCFS1 had an unidentified transcriptional regulator encoded immediately upstream of it. The gene for Afu2 in Archaeoglobus fulgidus DSM 4304 overlapped with a PadR family transcriptional regulator gene by 8 bp; the gene for Swo1 from Syntrophomonas wolfei subsp. wolfei str. Goettingen overlapped by 4 bp with the gene for a GntR family transcriptional regulator, and Tko1 from Thermococcus kodakarensis KOD1 had a ParR family transcriptional regulator encoded immediately upstream of it. Uniquely, Tko1 additionally had an unidentified 357-bp gene coding for a hypothetical membrane protein 3 bp downstream of the sdpI gene, one 723-bp gene coding for a hypothetical protein 7 bp upstream of the transcriptional regulator, and a 342-bp gene coding for a hypothetical protein 3 bp upstream of the 723-bp gene. The hypothetical proteins could not be identified using BLAST searches and were not homologous to Tko1 or to any of the transcriptional regulators. PadR and ParR were found to be homologous to one another, although PadR, ParR, and GntR did not show obvious homology with SdpR. ArsR was found to be homologous to SdpR. We tentatively suggest that these putative transcriptional regulators bind to the SdpI homologues on the cytoplasmic side of the membrane to effect autoregulation.

Promoter analyses (see Methods) predicted promoters ending 18 bp upstream of the gene encoding the Bce2 transcriptional regulator with a score of 0.98, 49 bp upstream of the Bsu1 regulatory gene with a score of 1.00, 7 bp upstream of the Lpl1 regulatory gene with a score of 0.84, 32 bp upstream of the Afu2 regulatory gene with a score of 0.94, 38 bp upstream of the Swo1 regulatory gene with a score of 0.99, and 33 bp upstream of Tko1’s transcriptional regulator with a score of 0.94; no promoters were found within 120 bp of the first hypothetical protein in the Tko1 operon.

Discussion

Proposed Pathway for the Evolution of Varying Topologies

Figure 9 diagrams the proposed pathways for the evolution of proteins of the SdpI family and shows their differing topologies. It is likely that the standard 6 TMS proteins represent the basic element of the SdpI family. Several other membrane protein families with members possessing 6 TMSs per polypeptide chain are known to have arisen through either internal triplication of a primordial 2 TMS element (CytC [Lee et al. 2007], MC [Kuan and Saier 1993], and ABC1 [Wang et al. 2009]), or by duplication of a primordial 3 TMS element (MIP [Pao et al. 1991], DsbD [Kimball et al. 2003], and ABC2 [Wang et al. 2009]). We postulate that in the SdpI family, the primary 6 TMS proteins may have arisen through intragenic duplication of a primordial 3 TMS-encoding DNA segment. Deletions within this basic element over evolutionary time led to the formation of the 4 TMS, the 3 TMS, and two of the 5 TMS variant proteins. A fusion event may have led to the appearance of the noninverted 7 TMS protein (Dlo1). The 12 TMS protein undoubtedly arose by intragenic duplication of the basic 6 TMS element followed by extensive sequence divergence of both halves, particularly in the first and last 3 TMS segments. Deletion events of a primordial 12 TMS protein led to the formation of the 8 TMS proteins. Deletions in the 8 TMS homologues led to the formation of the inverted 7 TMS proteins, Tko1, Tsp3, and Ton1, and one of the 5 TMS topological types, represented by Rsa1 and Cgl2. Deletions in the inverted 7 TMS proteins led to the formation of the inverted 6 TMS protein, Afu2, and the 5 TMS variant, Cte1. The 4 TMS proteins may have also arisen by deletion of one TMS from the N-termini of the 5 TMS proteins, Rsa1 and Cgl2. It is probable that the 4 TMS topology arose at least twice as some of the 4 TMS proteins cluster closer to the standard 6 TMS proteins, while other 4 TMS proteins cluster closer to 5 and 8 TMS proteins.

Fig. 9

Proposed pathway for the evolution of the proteins of differing topologies within the SdpI family. Black arrows indicate probable direction of evolution; striped arrows indicate possible evolutionary pathways when two different pathways are equally probable

Protein Orientation Within the Cell Membrane

All of the proteins of the SdpI family included in our study proved to be oriented within the cell membrane (Fig. 8) in such a way that motif 1, between TMSs 1 and 2 in the standard 6 TMS proteins, is always found to be externally localized, while motif 2, between TMSs 4 and 5 in these same proteins, is always located on the inside, facing the cytoplasm. The N-termini of the four 3 TMS homologues, all of the inverted 7 TMS proteins, Bcl2 (5 TMSs), and Cte1 (5 TMSs) were predicted to be localized to the external surface of the cell membrane, and the C-termini were predicted by both programs to be on the inside. Both the N- and C-termini of the 4 TMS proteins, the standard 6 TMS proteins, and the internally duplicated 12 TMS protein were predicted to be located on the inside. Both the N- and C-termini of the inverted 6 TMS and 8 TMS proteins appeared to localize to the outside. The N-termini of the standard 7 TMS homologue (Dlo1) and four of the 5 TMS variants (Rsa1/Cgl2 and Ssa2/Sgo1; see Fig. 8) were predicted to be localized to the inside of the cell, while the C-termini were predicted to be on the outside. On the basis of all of these predicted orientations, which were in surprising agreement with each other, motif 1 is always on the external surface of the membrane, while motif 2 always faces the cytoplasm. Because we postulate that motif 1 is responsible for neutralizing the extracellular SdpC toxin by forming an SdpI–SdpC complex in the membrane, motif 1 should be localized to the outer surface of the cellular membrane. By contrast, because motif 2 is predicted to be responsible for promoting expression of the sdpRI operon by sequestering the cytoplasmic autorepressor, SdpR, it would follow that this process occurs on the inside of the membrane. The predicted topologies therefore fully support the functional predictions.
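The terminal and loop orientations listed above follow a simple parity rule: each TMS crosses the membrane once, so the side of the membrane flips with every TMS. The sketch below encodes that rule and applies it to the cases given in the text. It is only a consistency check on the stated predictions, not a topology predictor, and the function names are hypothetical.

```python
def side_after(n_terminus, crossings):
    """Side of the membrane reached after a number of TMS crossings.

    n_terminus is "in" (cytoplasmic) or "out" (external).
    """
    if n_terminus not in ("in", "out"):
        raise ValueError("n_terminus must be 'in' or 'out'")
    opposite = "out" if n_terminus == "in" else "in"
    return n_terminus if crossings % 2 == 0 else opposite


def c_terminus(n_terminus, n_tms):
    """Location of the C-terminus of a protein with n_tms TMSs."""
    return side_after(n_terminus, n_tms)


def loop_side(n_terminus, k):
    """Location of the loop between TMS k and TMS k + 1."""
    return side_after(n_terminus, k)


# (description, N-terminus location, number of TMSs) as stated in the text.
cases = [
    ("3 TMS homologues", "out", 3),
    ("inverted 7 TMS proteins", "out", 7),
    ("standard 6 TMS proteins", "in", 6),
    ("12 TMS protein (Dge1)", "in", 12),
    ("inverted 6 TMS protein (Afu2)", "out", 6),
    ("8 TMS proteins", "out", 8),
    ("standard 7 TMS protein (Dlo1)", "in", 7),
]
for name, n_side, n_tms in cases:
    print(f"{name}: N-terminus {n_side}, C-terminus {c_terminus(n_side, n_tms)}")

# Standard 6 TMS proteins: motif 1 lies in loop 1-2, motif 2 in loop 4-5.
print("motif 1 loop:", loop_side("in", 1), "| motif 2 loop:", loop_side("in", 4))
```

The same rule accounts for the inverted proteins: with the N-terminus outside, the loop between TMSs 1 and 2 (motif 2) faces the cytoplasm and the loop between TMSs 4 and 5 (motif 1) faces the exterior.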

Conserved Motifs Confirm Homology of SdpI Family Members

Analyses of the motifs present in the proteins of the SdpI family confirmed homology of most family members despite variations in their topologies. Figure 8 illustrates the alignment of the proteins according to their topologies with the locations of the two conserved motifs denoted. Motif 1, when present, is always found between TMSs 1 and 2 of the standard 6 TMS proteins, while motif 2, when present, is always found between TMSs 4 and 5 in these same homologues. Thus, when these motifs are found in the other topologically variant proteins, they are always located in the region that would be expected to exhibit the motif in question within the standard 6 TMS proteins. These hydrophilic loops proved to be the best conserved regions of these proteins as revealed by the average similarity plots generated with the AveHAS program (Fig. 7).

Motif analyses of the four inverted proteins confirmed the proposed inversion. Motif 2, located in the hydrophilic region between TMSs 1 and 2 of the inverted proteins, is homologous to the hydrophilic region between TMSs 4 and 5 of the standard 6 TMS proteins. Further, motif 1, found in the region between TMSs 4 and 5 in the inverted proteins, is homologous to the hydrophilic region between the first and second TMSs of the standard 6 TMS proteins. This occurrence provides further evidence for the inverted topology of the former proteins with respect to the standard 6 TMS proteins proposed initially on the basis of primary sequence similarity alone.

The clustering of the single 4 TMS protein, Hma1 (cluster 6), with all of the 6 TMS proteins in cluster 6 can be rationalized on the basis of our motif analyses. Cluster 6 contains 6 TMS proteins which only exhibit motif 1, and Hma1 also contains only motif 1. This is expected as Hma1 is homologous to TMSs 1–4 of the standard 6 TMS proteins and lacks the hydrophilic region between TMSs 4 and 5. Possibly it arose independently of the other 4 TMS proteins of the SdpI family by deletion of the C-terminus of a 6 TMS homologue like those with which it clusters.

The same principle can be applied to explain the origins of the 4, 5, and 6 TMS proteins (Pae1, Cte1, and Aba1) within cluster 5. All three proteins contain only motif 1, and Pae1 is very closely related to Cte1, leading to the possibility that the 4 TMS protein, Pae1, originated from a 5 TMS protein like Cte1 by deletion of one TMS at the N-terminus.

It is likely that the original 6 TMS proteins contained the equivalent of primordial motifs 1 and 2. These 6 TMS proteins are highly similar and align with one another throughout their lengths. Consequently, there is no reason to support the idea that convergent evolution led to the appearance of the two motifs. More likely, some of the 6 TMS proteins lost one or the other motif and lost the corresponding function. Alternatively, they may have had the same motif diverge in sequence to an unrecognizable state while gaining a dissimilar function. Lpl1 of cluster 7 can serve as an example in support of this hypothesis. Both motifs were recognized by MEME in Lpl1, although this program recognized only motif 1 in the other proteins in this cluster.

The SdpI family is unusual in that it contains proteins of widely varying topologies. Such a situation has rarely been observed, the only other well-documented example being the heme handling protein (HHP) family (TC no. 9.B.14; Lee et al. 2007). We propose two possible explanations for this phenomenon. First, it is possible that the entirety of the protein is not necessary for function. Motif 1 between TMSs 1–2 or motif 2 between TMSs 4–5 may alone be adequate for one of the two subfunctions currently recognized for the SdpI protein of Bacillus subtilis. Second, the truncated versions of the 6 TMS proteins and the 6 TMS proteins containing only one recognizable motif may form heterodimers to ensure a complex possessing both of the conserved motifs.

The NCBI database was searched with motifs 1 and 2, but no significant matches were found outside of the SdpI family. The work of Ellermeier et al. (2006) provides a functional explanation for topological variation within members of the SdpI family. The first 3 TMSs of the B. subtilis 6 TMS SdpI protein are likely to be responsible for the SdpC immunity function, while the second 3 TMSs are responsible for SdpR sequestration. All of the topological variants within the family include at least one of the regions that is likely to be responsible for one of the functions. Proteins with 3, 4, and 5 TMSs may be unifunctional because they only contain the first three or second three TMSs of the 6 TMS proteins. Proteins with 6, 7, or 12 TMSs would be predicted to have both functions. Because both functions are needed to ensure regulated immunity to SdpC, it is reasonable to postulate that an organism could have two unifunctional proteins to compensate for not having a protein with both functions in a single polypeptide chain. Alternatively, an organism may have just one or the other function, e.g., unregulated immunity, or regulation of a dissimilar function. In the case of the 8 TMS proteins, only Swo1 displays both conserved motifs; the remainder display only motif 1. Thus, the 8 TMS proteins may have started out with both functions, but they have since diverged to become unifunctional. Alternatively, motif 2 may have diverged to provide for a distinct but related function (e.g., binding of a protein dissimilar to SdpR).

Two strains of Corynebacterium glutamicum and one of C. efficiens each have two SdpI homologues, a 3 TMS protein (e.g., Cgl1 in Table 1) and a 5 TMS protein (e.g., Cgl2 in Table 1; unpublished observations). The 3 TMS proteins are homologous to the second half (TMSs 4–6) of the standard 6 TMS proteins, the region that is believed to be responsible for promoting the expression of the sdpRI operon by sequestering the autorepressor, SdpR. The 5 TMS proteins are homologous to TMSs 1–4 of the standard 6 TMS proteins, the region in SdpI that is probably responsible for neutralizing the SdpC toxin by forming an SdpI–SdpC complex. By having two truncated proteins with complementary functions, possibly in complex with each other, regulated SdpC immunity could exist, involving two related but dissimilar proteins.

Evidence that the 6 TMS Topology Arose by Duplication of a 3 TMS Precursor

Several independent lines of evidence lead us to suggest that duplication of a primordial 3 TMS element, followed by substantial sequence divergence, gave rise to the major class of 6 TMS proteins. (1) The best-conserved motifs occur between TMSs 1 and 2 and TMSs 4 and 5, equivalent positions in the two halves of the proteins. (2) Assuming that these conserved motifs bind SdpC (the toxin) and SdpR (the regulator), respectively, then SdpC would bind to the external surface of the membrane while SdpR would bind to the cytoplasmic side, as expected, on the basis of the mutational analyses (Ellermeier et al. 2006; Saier 2003). (3) Comparison of the sequences of motif 1 with those of motif 2 revealed similarities, suggestive of homology, even though the observed similarity was not sufficient to establish common origin (Table 2C). (4) Binding of SdpC and SdpR to the first and second halves of the membrane, respectively, as suggested by Ellermeier et al. (2006), could be explained if the two sequence divergent halves of a 6 TMS SdpI protein arose from a 3 TMS protein binding precursor polypeptide. (5) The fact that several SdpI homologues exhibit an inverted topology relative to the standard 6 TMS proteins makes functional sense because the order of two 3 TMS halves in the polypeptide chain should be of no functional significance. (6) The same argument can be used to explain conservation within the 12 TMS homologue: the second 3 TMS element within the first 6 TMS half of the protein, and the first 3 TMS element within the second 6 TMS half, are better conserved than the first 3 TMS element in the first half and the second 3 TMS element in the second half. This would suggest that only the second and third 3 TMS elements of the four 3 TMS elements have retained function (Fig. 8). Loss of the nonfunctional regions (TMSs 1–3 and 10–12) would yield the inverted homologues.

Taken together, these observations suggest an origin of SdpI homologues comparable to those of the MIP (Pao et al. 1991), DsbD (Kimball et al. 2003), and ABC2 families (Wang et al. 2009), namely, intragenic duplication of a 3 TMS–encoding genetic element. Further work, including the generation of high-resolution three-dimensional structural data, is likely to provide confirmation or refutation of this proposal.

Operon Analyses

Operon analyses of a few SdpI homologues suggested that a significant fraction of SdpI homologues are encoded in operons downstream of transcriptional regulators. The transcriptional regulator, ArsR from Bacillus cereus E33L, is homologous to SdpR, SdpI’s autorepressor. The transcriptional regulators, ParR from Thermococcus kodakarensis KOD1, PadR from Archaeoglobus fulgidus DSM 4304, and GntR from Syntrophomonas wolfei subsp. wolfei str. Goettingen, could not be shown to be homologous to SdpR. Unique among the operons examined, Tko1 is found downstream of its transcriptional regulator, upstream of one hypothetical protein and flanked upstream of its transcriptional regulator by two hypothetical proteins in a single operon. It seems reasonable to suggest that these transcriptional regulators function in conjunction with the SdpI homologues to effect autoregulation in response to extracellular signals. The fact that several of these regulators are nonhomologous provides evolutionary rationalization for the sequence divergence of the companion SdpI homologues. Future identification of the hypothetical proteins will undoubtedly provide valuable clues as to the functions of Tko1 and the other SdpI homologues.

Alternative Potential Functions of SdpI Homologues

The basis for the presence of multiple SdpI paralogues encoded within single genomes is poorly understood from a functional standpoint. However, bacteria possess complex gene networks and process protein information to influence many cellular behavioral traits (Schultz et al. 2009). These gene circuits and functional molecules allow integration of complex signals impinging on a network of modules. The availability of multiple processed input signals, transmitted to a central integrator, allows fine-tuning of the decision-making process. This would be particularly advantageous when numerous developmental alternatives exist (Chagneau and Saier 2004; Schultz et al. 2009; Stephenson and Hoch 2002).

The need for functional integration of multiple input signals, sensed by cell surface sensors, is emphasized by the developmental complexity of Bacillus subtilis. In addition to vegetative growth, this organism can (1) sporulate and form fruiting bodies, (2) develop competence for DNA uptake, (3) become supermotile (swarming competent), and (4) form complex organized sessile communal lifestyles (biofilms) with different functions delegated to different cell populations (Branda et al. 2001; Chagneau and Saier 2004; Verhamme et al. 2009). Indeed, nonsporulating bacteria often retain several of these other possibilities.

The developmental fate of any one cell within a population is dependent on environmental conditions, internal signals, collective and individual sensing, and a nongenetic, stochastic (nondeterministic) process responding to quantitative chance events in a qualitative way (Ben-Jacob 2009; Losick and Desplan 2008). Stochastic decision making may be random, but it is likely to be influenced by dozens of input signals, each sensed by a different set of cytoplasmic and transmembrane sensors. This may explain the recurrence of multiple homologous receptors in a single organism possessing multiple programs. Such receptors may provide immunity to, or allow signal transmission in response to, many other extracellular signals in addition to SdpC toxin. Among the bacteria possessing SdpI homologues are several present in the human intestinal tract (e.g., Lactobacilli, Clostridia, and Bifidobacteria). The actions of these homologues may be to sense the presence of other bacterial cohabitants, including members of both the same and different species. Research on these proteins may be applicable to an understanding of their interactions and to their contributions to human health and disease. Targeting SdpI homologues could provide a new Achilles’ heel against multidrug-resistant superbugs. One example is the common hospital bacterium, Clostridium difficile, which possesses an SdpI homologue. Targeting the agents mediating cannibalism could allow the development of novel antibiotics that under stress conditions would reduce specific bacterial populations in vivo. The need for the development of novel antibiotics with unique targets is emphasized by the emergence of multidrug resistance in several disease-causing microbes.