The first autotransporter (AT) to be described was immunoglobulin A protease (IgAP) of Neisseria gonorrhoeae [1], which cleaves human antibodies to protect the bacteria from the host’s immune response. The name ‘autotransporter’ was coined to reflect the belief that the protein exported itself from the bacterium [2]. Interest in the AT family grew dramatically about 10 years later when it was realised that ATs constitute a large family of pathogenic proteins with a common secretion mechanism [3]. There has been much debate about the definition of an autotransporter, since the ‘classic’ ATs, now known as Type Va proteins, share much in common with the Type Vb proteins (also known as ‘two-partner’ proteins) [4]. The Type Va and Vb systems share several similarities [5–7], but in the Type Vb system the passenger and the membrane-associated protein are expressed separately. The group of Jacob-Dubuisson has solved the crystal structures of a Type Vb passenger fragment [8] and also the membrane component, which resembles the essential Omp85 protein of Gram-negative bacteria [9]. Here we have space to discuss only Type Va passengers, which are monomeric proteins presented at the cell surface or secreted into the medium. The C-terminal barrel domain is essential for export of the passenger. Two crystal structures of such barrel domains are known [10, 11], but there is no space to describe them here in detail. The trimeric (Type Vc) autotransporters are very important in bacterial colonization and have been reviewed separately [12].

Understanding of the AT secretion mechanism has developed rapidly in recent years and is summarised in Fig. 1. While much remains to be discovered about transport across the outer membrane, it is clear that there is considerable variation in the passenger domains themselves [13]. They share two main characteristics: a long right-handed β-helical structure, and a C-terminal region that caps the β-helix and emerges into the medium first [14]. This region forms a structure unique to the AT family and is known as the junction or ‘autochaperone’ due to its inferred role in folding the entire passenger. Even though hundreds of AT sequences are now known, there is very little structural information at the molecular level, with only four X-ray crystal structures of Type Va passenger proteins solved: pertactin [15], VacA [16], Hbp [17] and IgAP [18].

Fig. 1

Models of AT transport. ATs can cross the inner membrane either co-translationally or post-translationally. Once in the periplasm, chaperones deliver the C-terminal region of the AT to the outer membrane protein Omp85 and its associated proteins. The Omp85 complex is required for AT secretion, but its precise function is not understood [49–51]. The passenger N-terminus emerges last. Since no energy in the form of a pH gradient or ATP is required for secretion, it has been suggested that passage across the OM is driven by folding of the AT outside the cell, but folding-defective AT mutants are still secreted. a Two models are currently favoured, one in which the Omp85 complex assists folding of the barrel and acts as a channel, and the so-called ‘hairpin model’ in which the β barrel allows two stretches of extended polypeptide chain through. The pore of the barrel is very narrow, however, only 10–12 Å. Since small disulphide-linked loops on the passenger domain do not prevent secretion [39], the hairpin model is highly unlikely without some breathing motion to increase the pore size. b Cα trace of the β-barrel domain of NalP, showing the ‘linker’ α-helix (in red) joining the barrel to the passenger (not seen in this model [10]). β strands are shown in yellow, and loop regions in green. The outer face of the protein and the N-terminus of the α-helix are at the top

Known molecular models

Pertactin, the first AT structure solved by X-ray crystallography, is relatively small and consists largely of a simple β-helix (Fig. 2a). The β-helical motif was first discovered in the enzyme pectate lyase [19], and the folding of β-helical proteins has been the subject of considerable discussion [20, 21]. The VacA structure is of a naturally cleaved 55 kDa C-terminal fragment of the protein from Helicobacter pylori. VacA is a pore-forming toxin that promotes peptic ulceration and gastric cancer; membrane channel formation involves self-association of VacA into a structure with sixfold symmetry [16, 22]. Hbp and IgAP are serine proteases. IgAP cuts human antibodies at a conserved site, and the molecular model provides some clues as to how it binds its target proteins [18]. Provence and Curtiss described temperature-sensitive hemagglutinin (Tsh) in 1994 [23]. It is secreted by avian pathogenic Escherichia coli and was named after its apparent ability to agglutinate red blood cells. Some years later Hbp was discovered by Otto and colleagues as a growth factor involved in microbial synergy in severe peritonitis, which frequently occurs after surgery or other trauma, and places an enormous economic burden on health systems across the First World [24]. E. coli isolated from a patient was found to secrete Hbp, which assisted haem uptake by Bacteroides fragilis. Despite being almost identical to Tsh, Hbp was found to bind haem and to cut haemoglobin rather than cause haemagglutination [25]. Serine protease ATs may be divided into two broad sub-classes by sequence [13], of which Hbp/Tsh and plasmid-encoded toxin (Pet) [26–29] are among the best studied representative members. Pet is secreted by enteroaggregative E. coli and taken up by host cells [27], whereas the Hbp/Tsh sub-family appears to degrade host proteins such as mucin [26]. Although IgAP and Hbp/Tsh are both members of the same sub-class, they clearly have very different functions.

Fig. 2

Crystallographic models of AT passenger domains. The models are shown as Cα traces, with the α helices shown as red ribbons, the β-strands as yellow arrows and coil regions as green thread. a Pertactin (PDB entry 1DAB). The long β-helical motif is decorated with only short loops, and carries no other domains except the autochaperone at the C-terminus. This domain of each protein model is at the top of the figure, and the N-terminus at the bottom. b VacA p55 (PDB entry 2QV3). Additional α helices are found in the autochaperone region compared to other models. c Hbp (PDB entry 1WXR). The protease domain (domain-1) is coloured orange and domain-2 is coloured green. IgAP shows a very similar structure (PDB entry 3H09)

The structures of known models are shown in Fig. 2. All the known and predicted AT passenger structures consist of a β-helix carrying various loops which are presumed to confer functionality. The role of pertactin is to adhere to host cells, and it displays a three residue ‘RGD motif’ projecting from the β-helix which enables it to bind integrins. A fivefold repeat of the sequence GGXXP (where G is glycine, P is proline and X is any amino acid) is found just after the RGD motif, and may also be involved in binding to host cells [15]. Within the VacA p55 structure, disruptions of the regular β-helical pattern define different structural sub-domains whose sequences correspond to VacA sub-types with different target cell specificity [16]. Two long loops near the C terminus represent the receptor binding site for all VacA sub-types. Residues immediately N-terminal to the crystallised fragment have been shown to play a role in cell vacuolation and membrane depolarisation, but not cell binding [30, 31]. Compared to pertactin and VacA, the serine protease autotransporters are more highly decorated with surface structures, the largest of which is the protease domain, called domain-1.
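The two sequence features described for pertactin, an RGD tripeptide and a tandem repeat of GGXXP, are simple enough to search for with regular expressions. The sketch below is only an illustration: the toy sequence is invented for the example and is not the pertactin sequence, and the function names and the `min_repeats` threshold are assumptions rather than anything taken from the cited work.

```python
import re

# Toy sequence invented for illustration only; NOT the real pertactin sequence.
TOY_SEQUENCE = "MKTAYRGDLSGGAVPGGFGPGGAAPGGQSPGGLTPWQRS"

# The integrin-binding tripeptide Arg-Gly-Asp.
RGD = re.compile(r"RGD")
# One or more back-to-back GGXXP units: Gly, Gly, any two residues, Pro.
GGXXP_RUN = re.compile(r"(?:GG..P)+")


def find_rgd(sequence):
    """Return the 1-based start position of every RGD tripeptide."""
    return [match.start() + 1 for match in RGD.finditer(sequence)]


def find_ggxxp_runs(sequence, min_repeats=2):
    """Return (1-based start, repeat count) for each tandem GGXXP run.

    min_repeats is a hypothetical cut-off used to ignore isolated units.
    """
    runs = []
    for match in GGXXP_RUN.finditer(sequence):
        repeats = len(match.group(0)) // 5  # each unit is five residues long
        if repeats >= min_repeats:
            runs.append((match.start() + 1, repeats))
    return runs


rgd_sites = find_rgd(TOY_SEQUENCE)
ggxxp_runs = find_ggxxp_runs(TOY_SEQUENCE)
print(rgd_sites, ggxxp_runs)
```

A pattern match of this kind only flags candidate sites; whether a motif is exposed on a loop of the β-helix, as it is in pertactin, can be judged only from a structure.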

Protease domains

Domain-1 of IgAP is shown in Fig. 3. This domain belongs to a sub-family of the trypsin-like serine proteases but shows some characteristic differences [17]. The active site is broadly conserved, and the active site serine was recognised from the sequence before the structures were solved [32, 33]. Modelling other domain-1 structures shows that there are different residues at the bottom of the specificity pocket, which approach the substrate at the cutting site and presumably play a large role in determining substrate specificity [17], but the biological role of the protease activity is not yet known for many ATs. Domain-1 sequences show rather little sequence similarity overall. There are, however, very strongly conserved residues, and plotting these on models of Hbp or IgAP shows that they correspond to contacts between domain-1 and other parts of the protein. This is strong evidence that the architecture of the serine protease ATs is conserved and that the interdomain contacts are important. Domain-1 of Hbp has a high proportion of glycine residues (34 out of 256), which is a characteristic of highly conserved protein regions [34]. Glycine residues are often conserved at particular points within protein structures due to space limitations or the need for a tight turn. The remainder of the Hbp passenger has a proportion of glycine which is close to that generally found for proteins.
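The glycine content quoted above is a simple ratio, and the same calculation can be applied to any sequence region to compare it with the rest of a protein. The following is a minimal sketch; the helper name `residue_fraction` is an assumption introduced here, and only the counts 34 and 256 come from the text.

```python
from collections import Counter


def residue_fraction(sequence, residue="G"):
    """Return the fraction of positions in `sequence` occupied by `residue`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return Counter(sequence)[residue] / len(sequence)


# Counts quoted in the text for Hbp domain-1: 34 glycines in 256 residues.
HBP_DOMAIN1_GLYCINES = 34
HBP_DOMAIN1_LENGTH = 256
reported_fraction = HBP_DOMAIN1_GLYCINES / HBP_DOMAIN1_LENGTH

print(f"Hbp domain-1 glycine content: {reported_fraction:.1%}")  # 13.3%
```

Applying `residue_fraction` separately to domain-1 and to the remainder of a passenger sequence would reproduce the comparison made in the text, given the actual sequence.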

Fig. 3

Domain-1. Cα trace of IgAP domain-1. Domain-1 of Hbp and IgAP are closely related, but IgAP has three extra loops around the active site serine (Ser 288, coloured blue). These loops, residues 113–132, 144–162 and 213–231, are coloured red. Johnson and colleagues [18] speculate that the insertions help IgAP grip substrate antibodies. Like trypsin, IgAP domain-1 has a loop (residues 144–162) just prior to the aspartate of the catalytic triad. Hbp domain-1 lacks this loop entirely, making the Hbp active site much more open. Both Hbp and IgAP have a tyrosine residue at the bottom of the specificity pocket of the active site, a feature not seen in other serine protease families. IgAP cuts the peptide bond C-terminal to proline residues, but Hbp/Tsh seems to have no such specificity [26]

Surface loops

The surface loops of the serine protease ATs are illustrated in Fig. 4. These loops have very little sequence similarity, but show similar positions and interactions. One such region is the loop made by residues 80–89 in domain-1 of Hbp, which we call the ‘Y-loop’ since tyrosine 89 is perfectly conserved. This β-hairpin contacts head-to-head another surface loop, called domain-4, projecting from the β-helix (Fig. 4a, b). The tyrosine makes a hydrogen bond with a glutamate residue which helps pin the Y-loop down to the surface of domain-1 (Fig. 4c), and this interaction is preserved in IgAP (Tyr 135-Glu 176). The principal contact between domain-1 and the β-helix is nearly identical between Hbp and IgAP, and the two structures overlay closely. The conserved hydrogen bonds imply that domain-1 is rigidly fixed to the β-helical stem, and does not move independently. The peptide chain passes from domain-1 to the β-helix via an α-helix. Every serine protease AT apparently maintains the same overall fold in this region since tyrosine 67, aspartate 255 and lysine 304 of Hbp are very strongly conserved. The carboxyl group of the aspartic acid residue makes hydrogen bonds to both of the other residues, bridging domain-1 and the β-helix.
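Statements such as "tyrosine 89 is perfectly conserved" rest on inspecting columns of a multiple sequence alignment. The sketch below shows one simple way to list fully conserved columns. The three aligned fragments are invented for illustration and do not come from any real AT sequence, and `conserved_columns` is a hypothetical helper, not a function from an alignment package.

```python
def conserved_columns(alignment, gap="-"):
    """Return (1-based column, residue) for columns identical in every row.

    Columns that contain a gap character are never reported as conserved.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            conserved.append((index + 1, column[0]))
    return conserved


# Invented toy alignment; the rows are NOT real AT sequences.
TOY_ALIGNMENT = [
    "ATYLE-G",
    "SVYIEKG",
    "A-YMEQG",
]

conserved = conserved_columns(TOY_ALIGNMENT)
print(conserved)
```

Column numbers in an alignment are not residue numbers in any one protein, so a mapping back to, for example, Hbp numbering would be needed before plotting conserved positions on a structure.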

Fig. 4

Surface loops of serine protease ATs. a Hbp. Domain-1 is coloured orange, domain-2 green, domain-3 blue and domain-4 red. The Y-loop within domain-1 is separately coloured cyan. b The surface domains of IgAP, coloured as a. The conserved features of the Y-loop and domain-4 suggest that contact between them offers the protein some advantage. In particular it appears that without this contact, domain-4 might be able to move much more freely relative to the β-helical domain. c The Y-loop/domain-4 contact in Hbp. Glutamate 86 forms hydrogen bonds with the main-chain and side-chain of threonine 696. Protonation of this glutamate residue by an acid environment could weaken or break these bonds, giving domain-4 considerable flexibility

The hydrophobic core of the β-helix is on the whole conserved very patchily, and in Hbp is notable for its lack of any apparent order. IgAP shows more regular stacking of leucine and isoleucine residues in a fashion reminiscent of pectate lyase. There is no unusual distribution of any particular hydrophobic residue type, but the sequence patterns are readily predicted to form β-helices by computer programmes such as BETAWRAP [35]. The β-helix of Hbp shows many bulges or loops disrupting the regular hydrogen bonding pattern between adjacent loops, and sequence alignment indicates that some of these are preserved across the serine protease ATs. For example, a tyrosine–arginine pair forms a contact next to the Y-loop and domain-4, and all serine protease ATs apparently have a well-defined structure at this junction.

Domain-2

A readily identified feature of Hbp is a region called domain-2 that projects from the body of the protein. This domain is only 76 residues long in Hbp (residues 481–556) but forms a unique structure. Domain-2 may well be an attachment site of some sort, though its target and the mode of interaction are unknown. Intriguingly domain-2 shares core residues with the NIDO domain of human proteins of the basal membrane. The link between domain-2 and connective tissue is suggestive of binding to host cells, and Tsh is reported to have cell adhesive properties [33]. IgAP has a very similar domain with the same fold and hydrophobic core [18], but a 12 residue insertion alters the predicted binding site of Hbp domain-2 (Fig. 5) to bind substrate antibodies instead. Although absent from many serine protease ATs such as the known mucinase Pic [36, 37], domain-2 may prove to be diagnostic of protein function. In both Hbp and IgAP the main-chain leaves and returns to the β-helical stem at the same position, so that domain-2 appears to have a certain freedom of motion relative to the remainder of the protein, and to fold independently of the β-helix.

Fig. 5

Domain-2. IgAP domain-2. Hbp forms a closely related domain projecting from the body of the protein. The Cα trace of the model is coloured by secondary structure, except for the 12 residue insert compared to Hbp, shown in purple. The function of Hbp domain-2 is unknown, but in IgAP it is believed to bind antibodies

Domains 3 and 4

Immediately following domain-2 is a region roughly fifty residues long that shows much more variation than other parts of the protein. Even closely related ATs show very little sequence similarity here, so this region may be useful as a ‘fingerprint’ to identify clones. Part of this region in Pet has been named domain-2A and may be related to cell uptake by that protein (Ian Henderson, personal communication). Hbp carries a smaller separate loop, from histidine 608 to glutamate 644, and the equivalent region in IgAP (residues 710–743) has been named domain-3, though it is clear that this cannot fold stably and independently. The last few residues of domain-2A (PDWET) are completely conserved and the structure of Hbp shows why; the glutamine side-chain of Gln 640 is the centre of a network of hydrogen bonds involving aspartate 642 and lying over tryptophan 643 (Fig. 6). These residues provide a stable platform from which an external loop can safely be projected, without unduly disturbing the β-helix. Histidine 608 hydrogen bonds to glutamate 644, and arginine 664 and asparagine 665 form a well-conserved hydrogen bond. Thus although Pet, Hbp and IgAP have different functions and sequences in this region, there are still strongly conserved structural features, suggesting each protein has a domain-2A or domain-3 with a specific role, and which may be extended or retracted like a cat’s claw. Like domain-2, this region appears to mediate interactions with other molecules, either of the host organism or the parent bacterium, but to date no functional studies of this region have been published.

Fig. 6

The domain-4 anchor region of Hbp. The conserved residues proline 609 and tryptophan 643 make a strong apolar interaction which, combined with side-chain hydrogen bonds between the start and end regions of the loop, provide a stable structure which would stabilise the β-helix if the bulk of domain-4 broke contact with the Y-loop and moved independently

From glycine 682 to serine 714, the chain trace of Hbp leaves the β-helix to form an extended, twisted loop which contacts the Y-loop. This loop has been named domain-4 in IgAP [18], but like domain-3 has no hydrophobic core to maintain a stable independent structure. It lies between domain-2 and domain-3, against the β-helix. Aspartate 689 of Hbp forms a salt bridge with histidine 563, holding domain-4 onto the β-helix. Domain-4 is not conserved and of variable length. IgAP has a slightly shorter domain-4 than Hbp, but a considerably extended Y-loop, so that these features maintain contact in both structures (Fig. 4). Sequence alignment with the Pet family breaks down at this point. Although the sequences vary, there are notable structural similarities which suggest these loops, like domain-3, are seen in a retracted mode in the crystal structures but in solution are capable of independent movement. The very strong conservation of the loop structure itself and the position of its insertion, but not the sequence, implies that each loop has some functional role, unrelated to folding or stabilising the structure.

Autochaperone domain

The β-helix is capped at the C-terminal end by a conserved fold called the junction region or autochaperone (AC) domain, which plays an essential role in the secretion of AT proteins [38]. It is known that this region emerges from the cell first [14] and that stably folded domains may block exit from the cytoplasm to the cell surface [39]. The AC domain presumably folds outside the cell, and in vitro refolding studies of ATs suggest it refolds first, and then drives refolding of the β-helix [40]. The AC domain of VacA is rather different from the other three AC models, among which the only notably similar residues are internal. Even these are not completely conserved, and the hydrophobic core is extensively reworked between the known structures so that side-chains which overlap in space are found at different places in the sequence alignment. There are also apolar surface residues which have no apparent structural function and whose role may be to interact with surface loops of the barrel domain. Structural comparison of the AC domains of pertactin, Hbp and IgAP shows that they form a very similar β-sandwich, despite non-conserved loops and the lack of significant sequence conservation, but the VacA domain has a slightly different fold including α-helical regions. Sequence alignments of the AC domain from a variety of ATs confirm they fall into different groups, so that ATs such as AIDA and Ag43 from E. coli have six internal residues in common with ATs from Bordetella, Pseudomonas, Agrobacterium and Salmonella species among others. None of the solved structures fits the derived pattern of conserved residues, which corresponds to the final β hairpin of the passenger.

Conclusion

Proteins secreted by pathogenic bacteria play essential roles in colonization, infection and virulence. Enormous advances have been made recently in studies of ATs, but more of these proteins continue to be discovered at a frantic pace, and they remain poorly understood. Improving our understanding, and finding out how the export pathway really works, represents a considerable challenge [41]. While ATs are fascinating as secretion mechanisms, they are also of profound medical importance as essential factors in a wide variety of diseases that cause great financial cost in the developed world and great mortality elsewhere. The major secreted protein of Shigella is SepA, a serine protease AT of unknown function, but without which virulence is attenuated [42]. Pertactin is already an important component of the whooping cough vaccine, and another Bordetella pertussis AT called BrkA has also been shown to have similar protective activity in a mouse model [43]. In the case of Pet, mutation of the active site serine gives a stably folded mutant protein with no cytotoxic effects [32]. Such mutant proteins are obvious candidates for vaccines [44], and it has been shown that mice challenged with Hbp are strongly protected against severe peritonitis and associated abscess formation [45]. Autotransporter-based systems have proved useful for displaying unrelated chosen target proteins on the bacterial cell surface, and this technology shows promise in vaccine development [46, 47]. A similar approach was taken by the group of Ala'Aldeen to produce a vaccine against Actinobacillus pleuropneumoniae, a major pathogen of pigs [48]. The protein was antigenic, but was found not to protect pigs against disease, possibly because it had been purified in an unfolded form. It remains to be tested if the native protein, or fragments of it, is more effective. 
Structural biology has now identified domain-2, which has a function specific to each AT that possesses it, and may prove to be a useful vaccine component. Work is underway to study this domain further and to produce non-toxic fragments of AT proteins that confer immunity to widespread and difficult to treat diseases.