Background

Bacteria inhabit every environment on earth with a resilience that is central to their survival and, consequently, they continue to serve as a major source of human disease. A critical factor in the success of these organisms is the diversity entrenched within their cell walls, which serves as a major barrier to drug treatment. The mycobacterial cell wall is an incredibly complex structure, with multiple layers that collectively constitute a waxy, durable coat around the cell, which serves as the major permeability barrier to drug action [1–4]. Considering this, the cell wall and related components are attractive for the mining of new drug targets, and remain relatively unexploited for drug discovery in the case of certain bacterial pathogens [2, 5, 6]. Peptidoglycan (PG or the murein sacculus) is a rigid layer that constricts the cell membrane and the cell within, providing mechanical stability to counteract imbalances of cytoplasmic turgor pressure, and plays an important role in determining cell size and shape [7–10]. Mycobacteria possess a highly complex additional lipid-rich outer membrane, with different constituents anchored either directly to the cell membrane or to the PG [6, 11]. Arabinogalactan (AG), a structure unique to actinomycetes, is bound externally to an N-acetyl muramic acid (NAM) moiety of the PG [3, 12]. In mycobacteria, a certain proportion of the muramic acid is N-glycolylated [13] through the activity of NamH, a UDP-N-acetylmuramic acid hydroxylase [14]. This modification results in altered tumour necrosis factor α production [15, 16]; however, abrogation of NamH activity does not lead to decreased virulence in mice [16].

AG in turn serves as an anchor for further lipid-rich cell wall components, either by covalent attachment to the mycolic acid layer or through non-covalent interactions [trehalose dimycolate (TDM); phthiocerol dimycocerosate (PDIM); phenolic glycolipids (PGL)] [3, 11, 12]. PG consists of repeated alternating sugars, N-acetyl glucosamine (NAG) and NA/GM (muramic acid with or without the glycolyl modification), which are linked to a pentapeptide side chain [7–9, 17], Figure 1. The crosslinking of these subunits leads to a lattice-like structure around the cell.

Figure 1

PG units and chemical bonds associated with remodelling enzyme activities. The NAG-NA/GM sugar backbone is shown at the top and bottom of the figure in anti-parallel orientation. The NAM residues are designated as NA/GM to correspond to the N-glycolylation of muramic acid in mycobacteria. Enzymatic activities are indicated by arrows: Rpfs [yellow], PBPs [orange], endopeptidases [pink], L,D-transpeptidases [green] and amidases [blue], which are related to the corresponding colours in Table 1. Amino acid residues in the stem peptide are shown in black text. Pentapeptide stems are attached to the carbon at position 3 of the NAM ring. Transglycosylase activities of Rpfs and the Pon domain indicate their β-1,4-glycosidic bond substrate. Synthetic enzyme activities, that is, those that generate bonds cross-linking the pentapeptides on opposing stems, are shown on the left: Pon and Pbp proteins at positions 4,3 (D-Ala to meso-DAP) or Ldt proteins at positions 3,3 (meso-DAP to meso-DAP). The hydrolytic enzyme activities are shown to the right. These include the amidases, the RipA endopeptidases and the DD-CPase (DacB) acting on the pentapeptide stem (pre- or post-crosslinking).

The PG in bacterial cell walls is an incredibly dynamic structure that requires constant expansion and remodelling during growth to accommodate the insertion of new PG subunits, secretion apparatus, flagella etc. [9, 10]. During cell division, pre-septal PG synthesis and subsequent degradation of the septum are critical to daughter cell separation; consequently, these processes are carefully regulated [7]. In this regard, there is a diversity of enzymes involved in cross-linking, degradation and remodelling of PG, which are illustrated in Figure 1. A ubiquitous feature in bacteria is the genetic multiplicity associated with these functions, which presumably contributes to the ability of different organisms to adapt under varying environmental conditions [7, 9, 10]. In the case of Mycobacterium tuberculosis, the causative agent of tuberculosis, there is a dire need for new drugs with novel modes of action. The increased prevalence of drug resistant strains has raised concerns regarding the sustainability of the current treatment regimen. To address this, several aspects of mycobacterial metabolism are being assessed for potential new drug targets [18]. The genetic redundancy associated with PG biosynthesis, together with the reliance on robust bacterial growth to achieve significant drug target vulnerability, has hampered drug development initiatives that target the cell wall [19]. Because PG has been successfully used as an antibiotic target for other bacterial pathogens, as evidenced by the widespread use of β-lactam antibiotics among others, the biosynthesis and degradation of this macromolecule in mycobacteria merit further investigation.

In this study, we undertake a comprehensive analysis of the genomic repertoire of PG remodelling enzymes in various pathogenic and environmental mycobacteria to determine the level of genetic multiplicity/redundancy and degree of conservation. We focus on those enzymes involved in cross-linking and remodelling of the PG in the periplasmic compartment, including: resuscitation promoting factors (Rpfs), penicillin binding proteins (PBPs), transpeptidases, endopeptidases, and N-acetylmuramoyl-L-alanine amidases. Our data reveal extensive genetic multiplicity for the 19 strains analysed in this study, which allowed grouping of strains into three families based on their complement of PG remodelling enzymes, including the MTBC, other pathogenic mycobacteria and non-pathogenic environmental organisms.

Results and Discussion

The comparative genomics analysis for PG remodelling enzymes in mycobacterial species obtained from this study is summarised in Table 1. We analysed 19 distinct species/strains: Six of these belong to the MTBC, six are classified as other pathogenic bacteria [three of which belong to the Mycobacterium avium complex (MAC)] and six environmental species including Mycobacterium smegmatis. Mycobacterium leprae is listed separately due to its substantially reduced genome which emerges as an outlier in the analysis.

Table 1 Genetic complement for PG remodelling enzymes in 19 mycobacterial species

Resuscitation promoting factors (lytic transglycosylases)

Of all the enzymes identified in this study, the Rpf family is the most extensively studied. This group of enzymes is of particular interest due to demonstrated importance for reactivation from dormancy and essentiality for growth in Micrococcus luteus [22, 23]. Whilst Mi. luteus encodes a single, essential rpf gene, mycobacteria encode a multiplicity of rpf homologues and those present in M. tuberculosis, designated as rpfA-rpfE, encode closely related proteins all of which retain the Rpf domain [24–26], Figure 2. These have been the subject of intense study due to the potential role they may play in reactivation disease in individuals that harbour latent TB infection [25, 27–31]. In this regard, the five rpf genes present in M. tuberculosis are collectively dispensable for growth but are differentially required for reactivation from an in vitro model of non-culturability [32, 33]. Furthermore, the Rpfs are combinatorially required to establish TB infection and for reactivation from chronic infection in mice [32–35]. For additional information, the reader is referred to several extensive reviews on this topic [25, 27, 28, 36–38].

Figure 2

Alignment and domains of M. tuberculosis H37Rv PG remodelling enzymes. Domain architecture is based on output from InterProScan. All enzymes depicted are the M. tuberculosis H37Rv homologues. Amino acid sequences are grouped according to their common domains, as indicated by their colours: Rpf domains [yellow], PBPs [orange], endopeptidases [pink], L,D-transpeptidases [green] and amidases [blue]. PonA proteins are grouped with PBPs. PFAM domains are annotated as follows: PF06737 Transglycosylase-like domain, PF00905 PBP transpeptidase domain, PF00912 Transglycosylase domain, PF00768 D-alanyl-D-alanine Carboxypeptidase domain, PF02113 D-Ala-D-Ala carboxypeptidase 3 (S13) family domain, PF00877 NlpC/P60 family domain, PF03734 L,D-transpeptidase catalytic domain, PF01520 N-acetylmuramoyl-L-alanine amidase amidase_3 domain, PF01510 N-acetylmuramoyl-L-alanine amidase amidase_2 domain. N-terminal signal sequence or transmembrane domains are displayed as purple and pink, respectively. Additional domains annotated at PFAM are as follows (in grey): PonA2, PF03793, PASTA domain; PbpB, PF03717, PBP dimerization domain; PBP-lipo, PF05223, NTF2-like N-terminal transpeptidase; Ami2, PF01471, Peptidoglycan-binding like; RpfB, PF03990, Domain of unknown function DUF348; RpfB, PF07501, G5 domain. Rv3627c retains two tandem copies of the PF02113 D-Ala-D-Ala carboxypeptidase 3 (S13) family domain, one of which is contracted. Figure not to scale.

Rpfs are classified as lytic transglycosylases (LTs) based on sequence conservation and three-dimensional protein structure [29, 39–41]. LTs cleave the β-1,4-glycosidic bonds between the NAG-NA/GM sugar subunits, Figure 1, and their activity is required for insertion of new PG units and expansion of the glycan backbone [9]. In mycobacteria, RpfB contains a lysozyme-like, transglycosylase-like PFAM domain, and consequently this group of enzymes is predicted to cleave the glycan backbone of PG [39–41]. Direct evidence for this is lacking and, moreover, the mechanism through which Rpf-mediated cleavage of PG results in growth stimulation remains unknown. The repertoire of rpf genes is highly conserved in the MTBC; in contrast, other pathogenic mycobacteria lack rpfD, including M. leprae, Table 1. Based on the distribution of rpfC and rpfD, we categorize the 19 strains analysed in this study into the MTBC (which retains all five rpf homologues present in M. tuberculosis), other pathogenic mycobacteria (which lack rpfD) and environmental strains (which lack both rpfC and rpfD). This classification is supported by phylogenetic analysis, which confirms these clusters and the duplication/loss of genes, Additional file 1: Figure S1. Recently, it has been shown that the Rpfs can serve as potent antigens [42] and Rpf-directed host immune responses allow for detection of TB in latently infected individuals [43]. It is noteworthy that strains lacking different combinations of rpf genes confer significant protective efficacy when used as vaccine strains in mice [44]. Hence, any variation in rpf gene complement between pathogenic mycobacteria may have significant consequences for broadly protective effects of future Rpf-based vaccines.
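The rpfC/rpfD-based grouping described above amounts to a simple presence/absence rule, which can be sketched as follows. This is an illustrative sketch only: the function name `classify_by_rpf`, the group labels and the example gene sets are hypothetical and are not part of the analysis pipeline used in this study.

```python
# Illustrative sketch of the rpfC/rpfD presence/absence rule described in the text.
# The function name, labels and example gene sets are hypothetical.

def classify_by_rpf(genes):
    """Return the rpf complement pattern for a set of rpf gene names."""
    has_c = "rpfC" in genes
    has_d = "rpfD" in genes
    if has_c and has_d:
        return "MTBC pattern"              # all five homologues retained
    if has_c and not has_d:
        return "other pathogenic pattern"  # rpfD absent
    if not has_c and not has_d:
        return "environmental pattern"     # rpfC and rpfD both absent
    return "unclassified"                  # rpfD without rpfC is not described in the text


examples = {
    "all five homologues": {"rpfA", "rpfB", "rpfC", "rpfD", "rpfE"},
    "lacks rpfD": {"rpfA", "rpfB", "rpfC", "rpfE"},
    "lacks rpfC and rpfD": {"rpfA", "rpfB", "rpfE"},
}
for label, genes in examples.items():
    print(label, "->", classify_by_rpf(genes))
```

The rule captures only the gene complement pattern; assigning a strain to a group in practice also relies on the synteny and phylogenetic evidence discussed in the text.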

The environmental species retain three rpf genes [rpfA, rpfB (duplicated in Mycobacterium sp. JLS, Mycobacterium sp. KMS, Mycobacterium sp. MCS) and rpfE], Table 1 and Additional file 1: Figure S1. Although rpfC (Rv1884c in M. tuberculosis) homologues have been annotated as present in all mycobacteria [45], our analysis shows that the M. tuberculosis rpfC homologue is absent from environmental species. Artemis Comparison Tool (ACT) whole genome alignment reveals that the region encoding rpfC in M. tuberculosis is absent in M. smegmatis and all other environmental mycobacteria (data not shown). Thus, based on gene synteny, there is no direct rpfC homologue in these strains. However, there is a local duplication of rpfE in all the environmental strains (annotated as MSMEG_4643 in M. smegmatis), Table 1, Additional file 1: Figure S1. Consequently, we re-annotate MSMEG_4640 to rpfE2, as a homologue of MSMEG_4643, rather than a homologue of Rv1884c. As RpfE interacts with the Rpf Interacting Protein A (RipA) [46], there may be some functional consequence to the presence of multiple copies in M. smegmatis and other environmental bacteria.

The restriction of rpfC and rpfD homologues to pathogenic and MTBC strains, along with the duplication of rpfB in some environmental species, raises interesting questions regarding the nature of growth stimulation in these organisms. These differences suggest that the latter require fewer secreted Rpfs and are more reliant on the membrane bound RpfB homologue. This could be related to the fact that environmental organisms are required to grow in diverse niches of varying size and complexity making them more dependent on localised growth stimulatory activity through a membrane bound Rpf rather than paracrine signalling from diffusible Rpfs produced by neighbouring organisms. It is noteworthy that of all five homologues in M. tuberculosis, deletion of rpfB individually or in combination with rpfA results in colony forming defects and prolonged time to reactivation from chronic infection in mice [21, 34, 35].

The role of Rpfs in TB disease in humans remains enigmatic. It has been demonstrated that sputum from patients with active TB disease, before the initiation of treatment, is characterised by a population of dormant bacteria that require Rpfs for growth [47]. These data provide tantalizing preliminary evidence that Rpfs play an important role in determining bacterial population dynamics in TB infected patients and moreover are critical for disease transmission. Within the granulomatous environment, it may be preferable for the bacterial population as a whole to facilitate emergence of fitter clones which are able to exit from arrested growth. This could explain clonal emergence in clinical samples if few strains are able to expand sufficiently to cause tubercular lung disease.

Penicillin binding proteins

Penicillin binding proteins (PBPs) are a large family of evolutionarily related cell wall associated enzymes that bind β-lactam antibiotics [48, 49]. PBPs are classified according to their molecular weight as either high molecular mass (HMM) or low molecular mass (LMM) and are broken down into Class A, Class B and Class C [49]. In mycobacteria, Class A PBPs constitute bi-functional enzymes designated as ponA1 (PBP1, Rv0050 [50]) and ponA2 (PBP1A, Rv3682 [51]), Figure 2. They contain separate domains for transpeptidase and transglycosylase activities. Both of these genes are present in all mycobacteria and, as previously reported for M. smegmatis and other environmental strains, there is a duplication of ponA2, which was annotated as ponA3 [51], Table 1 and Additional file 1: Figure S2.

Class B PBP proteins PbpA (pbpA; Rv0016c [52]), PbpB (pbpB; Rv2163c [53]) and PBP-lipo (Rv2864c [49]) are predicted to contain only transpeptidase domains and possibly additional dimerisation domains, but lack transglycosylase activities, Figure 2. Both PbpA and PbpB (FtsI) are involved in progression to cell division in M. smegmatis, where gene deletion or depletion manifests in altered cell morphology and antibiotic resistance profiles [52]. In this family of PBPs, as with ponA2 in Class A, there is a distal duplication of PBP-lipo in the environmental strains, Table 1 and Additional file 1: Figure S3. No experimental data on this are currently available, but the lipophilic domain is speculated to allow for cell wall association.

D,D-carboxypeptidases (DD-CPases) are designated as Class C PBPs and are generally present in high abundance [54]. DD-CPases remove the D-Ala residue at position 5 of pentapeptides [8] and through this activity prevent cross linking of the stem peptide into 4 → 3 bridges, Figure 1. In mycobacteria, the dacB2-encoded DD-CPase is not affected by penicillin, although it does bind the antibiotic [55]. Inhibition of DacB through treatment with meropenem results in the accumulation of pentapeptides in M. tuberculosis [56]. In this context, DD-CPases have been implicated in regulating the amount of cross-linking that can occur within the PG sacculus [8]. Our analysis shows that M. tuberculosis H37Rv encodes three distinct DD-CPase homologues: dacB1 (Rv3330), dacB2 (Rv2911) and Rv3627c, Table 1, Figure 2 and Additional file 1: Figure S4. Rv3627c carries two PF02113 domains, one of which is contracted. In the environmental species there is a local duplication of the dacB2 (Rv2911) homologue, leading to consecutive numbering of the resulting duplicated genes, for example MSMEG_2432 and MSMEG_2433 in M. smegmatis. In addition, a distant DD-CPase homologue (annotated as MSMEG_1900 in M. smegmatis) was identified in the environmental strains, as well as in M. abscessus, but not in the other pathogenic mycobacteria and MTBC, Table 1. Two additional loci, Rv0907 and Rv1367c, were identified in M. tuberculosis by in silico analysis through their predicted β-lactamase domains and are grouped among Class C PBPs [49]. Analysis of these proteins revealed that they retain a β-lactamase binding domain (of the AmpH family) but further classification into the functional classes studied herein proved difficult. Consequently, we have not analysed these genes further.

Endopeptidases

Endopeptidases are enzymes that cleave within the stem peptides in PG. In this study, we focus on the NlpC/P60 class of endopeptidases, which cleave within the stem peptides between positions 2 and 3 as exemplified by RipA, Figure 1. RipA is an essential PG hydrolytic enzyme that synergistically interacts with RpfB and RpfE [46, 57] to form a complex that is able to degrade PG. The RipA-RpfB hydrolytic complex is negatively regulated by PonA2 [58], suggesting a dynamic interplay between PG hydrolases, one that would be significantly nuanced with the presence of multiple RipA and Rpf homologues. In this regard, our analysis reveals four endopeptidases in M. tuberculosis that display strong homology to ripA, Table 1, Figure 2, Additional file 1: Figure S5. With the exception of Mycobacterium abscessus and M. leprae, pathogenic mycobacteria retain all five of these genes (ripA and its four homologues). Environmental strains display enhanced expansion of endopeptidases, with the exception of the ripD homologue (Rv1566c). The functional consequence of this remains unknown, but it is noteworthy that these strains have also expanded their rpfE and rpfB gene repertoire, suggesting that the multiplicity in this case allows for a greater number of RipA-RpfB/E protein complexes, as well as for protein complexes with different subunit composition. Dysregulated expression of RipA leads to dramatic alterations in cellular morphology and growth [59], suggesting that careful regulation of this protein, both at the level of expression and post-translationally, is essential. Genetic expansion of RipA homologues along with two copies of RpfB and RpfE, both of which interact with RipA, implies a functional consequence of this expansion. In addition, strong regulation of these multiple copies would be required to prevent any detrimental effects on cell growth.

RipB displays strong sequence homology to RipA in M. tuberculosis (100% amino acid identity over 58% coverage) and similar domain organization [60], but lacks the N-terminal motif, Figure 2, that has been implicated in auto-inhibition by blocking the active site in the three-dimensional crystal structure [61]. More recently, high resolution crystal structures of RipB and the C-terminal module of RipA (designated as RipAc) revealed striking differences in the structure of these proteins, specifically in the N-terminal fragments that cross the active site [60]. Both RipB and RipAc are able to bind high molecular weight PG and retain the ability to cleave PG with variable substrate specificity, which is not regulated by the presence of the N-terminal domain [60]. This suggests that the N-terminus does not regulate PG degrading activity and, in this context, the physiological consequences of the reduced size of RipB and RipD, Figure 2, remain unknown. The high degree of conservation of RipB across all pathogenic mycobacteria including M. leprae, Table 1, Additional file 1: Figure S5, indicates that variable substrate specificity in PG hydrolases is essential for pathogenesis. The Mycobacterium marinum homologues of Rv1477 and Rv1478, iipA and iipB (MMAR_2284 and MMAR_2285 respectively), Table 1, Additional file 1: Figure S5, have been implicated in macrophage invasion, antibiotic susceptibility and cell division [62]. As with the other enzymes assessed in this study, environmental mycobacteria display greater genetic multiplicity for these homologues, Table 1.

Structural analysis of RipD reveals alterations in the catalytic domain, consistent with the inability of this protein to hydrolyse PG [63]. Nevertheless, the core domain of RipD is able to bind mycobacterial PG and this binding is negatively regulated by the C-terminal region [63]. However, RipD homologues in the environmental mycobacteria lack the 63 C-terminal amino acids, Table 1 (shown in parentheses), possibly allowing for stronger binding of this enzyme to PG.

Rv2190c encodes another NlpC/P60-type PG hydrolase in mycobacteria. Deletion of this gene in M. tuberculosis results in altered colony morphology, attenuated growth in vitro, defective PDIM production and reduced colonisation of mouse lungs in the murine model of TB infection [64]. Consistent with this, homologues of Rv2190c are found in all pathogenic mycobacteria, Table 1, with notable genetic expansion in some environmental species. In contrast, Rv0024 is absent from environmental species, suggesting that it could be required for intracellular growth or some other component of the pathogenic process, Table 1, Additional file 1: Figure S5.

L,D-transpeptidases

L,D-transpeptidases (Ldt) are a group of carbapenem sensitive enzymes in M. tuberculosis [56] that contribute to the formation of a 3 → 3 link between two adjacent mDAP residues (mDAP → mDAP bridges) in PG, distinct from the classic 4 → 3 link (D-Ala → mDAP), Figure 1. M. abscessus [65] and M. tuberculosis [66] exhibit increased ratios of the 3 → 3 cross-link in stationary axenic culture, indicating that mycobacteria are capable of modulating their PG at the level of transpeptidation in response to growth stage and the availability of nutrients. Both LdtMt1 and LdtMt2 (Rv0116c and Rv2518c respectively) were experimentally shown to affect M. tuberculosis H37Rv morphology, growth characteristics and antibiotic susceptibility in vivo [67]. The crystal structure of LdtMt2 places the extramembrane domain 80–100 Å from the membrane surface and indicates that this enzyme is able to remodel PG within this spatial region of the PG sacculus [68]. More recently, it has been demonstrated that the combinatorial loss of both LdtMt1 and LdtMt2 in M. tuberculosis resulted in morphological defects and altered virulence in the murine model of TB infection [69]. A notable variability of L,D-transpeptidase genes is found in mycobacteria, Table 1, Figure 2 and Additional file 1: Figure S6. Five homologues are present in all but one pathogenic strain, while multiple homologues are evident in most environmental strains. The exception is ldtMt3 (Rv1433), which is absent from the pathogen Mycobacterium ulcerans and from the environmental species Mycobacterium vanbaalenii, M. sp. MCS and M. sp. KMS, yet its presence in M. leprae suggests functional importance. As with RipA, M. gilvum shows the greatest expansion of the ldt genes. Biochemical characterisation of all five M. tuberculosis H37Rv homologues, LdtMt1 to LdtMt5, confirms PG cross-linking and/or β-lactam acylating enzyme activities in all of these enzymes [70]. This activity can be abolished by treatment with imipenem and cephalosporins, indicating that this group of enzymes holds great promise for TB drug development [70, 71]. Moreover, the functionality of all the Ldt homologues present in M. tuberculosis raises interesting questions with respect to the functional consequences of the expansion of this protein family in environmental strains, which may require greater flexibility in Ldt function.

Amidases

While endopeptidases and transpeptidases are responsible for cleavage within or between peptide stems, amidases act to remove the entire peptide stem from the glycan strands, cleaving between the NA/GM moiety and the L-Ala in the first position of the stem peptide, Figure 1. The amidases have been implicated in PG degradation, antibiotic resistance/tolerance and cell separation in Escherichia coli and other organisms, and can be organised into two main families containing either an amidase_2 or amidase_3 type domain [8, 9, 72]. The amidases of E. coli (which retains five amidases designated AmiA, AmiB, AmiC, AmiD and AmpD) have specific substrate requirements governed by the structural conformation of the NAM carbohydrate moiety. Knockout of these amidases results in chaining phenotypes, abnormal cell morphologies and/or increased susceptibility to certain antibiotics [72–74]. Amidases have also been implicated in spore formation, germination and cell communication in Bacillus subtilis [75, 76]. The role of amidases in mycobacterial growth, virulence and resuscitation from dormancy is unknown and any impact of these on mycobacterial morphology and antibiotic resistance remains to be demonstrated. Analysis of the amidase gene complement in mycobacteria reveals the presence of four homologues in M. tuberculosis, two containing the amidase_2 domain (ami3; Rv3811 and ami4; Rv3594) and two the amidase_3 domain (ami1; Rv3717 and ami2; Rv3915), Table 1, Figure 2 and Additional file 1: Figure S7. The crystal structure of Rv3717 from M. tuberculosis confirms that this enzyme is able to bind and cleave muramyl dipeptide [77]. The amidase family distinguishes itself from all other enzyme families by the absence of a homologue (ami4) from non-MTBC pathogens and its presence in the MTBC and environmental strains. M. leprae retains only the ami1 and ami2 genes, both containing the amidase_3 domain. This suggests that amidase_2 domain amidase activity is dispensable specifically in this species, but required for peptidoglycan remodelling in the other pathogenic mycobacteria.

Mycobacterium leprae

Very little is known about in vitro growth and division of M. leprae, as it can only be grown in animal models. From our analysis, it is apparent that M. leprae harbours notable genetic redundancy for PG remodelling enzymes (Table 1), in contrast to its minimal gene set for other areas of metabolism [78]. Considering that PG subunits or precursors cannot be scavenged from the host, it is expected that pathogenic bacteria would retain complete pathways for biosynthesis and remodelling of PG. However, the presence in M. leprae of multiple homologues within each class of PG remodelling enzyme assessed in this study suggests that some level of multiplicity is required to ensure substrate flexibility. Further work in this regard is difficult due to the limited tractability of M. leprae for in vitro manipulation.

Conclusions

Mycobacteria represent a wide range of species with a great variety of phenotypes. Exposure to stresses which they encounter at various stages of their life cycles demands the ability to adapt. Consistent with this, many mycobacteria encode a multiplicity of genes for numerous important pathways such as respiration and cofactor biosynthesis [79, 80], which allows for a more nuanced regulation of physiology. The analysis performed herein summarises the general distribution of PG remodelling genes in diverse strains and reveals an emerging trend towards gene multiplicity in environmental mycobacteria. There is great conservation within the MTBC and other pathogenic mycobacteria. Of all strains, M. gilvum displays the greatest degree of gene expansion, containing a total of 44 PG remodelling genes, Table 1. This organism has not been studied extensively but may represent a potential model system for understanding how the genetic multiplicity for PG remodelling enzymes contributes to bacterial physiology. As expected, M. leprae shows a reduction in the number of genes that encode the enzymes assessed in this study but still retains more than one representative of each functional class. This, together with the striking degree of conservation in some families of PG remodelling enzymes in pathogenic mycobacteria, suggests that PG biosynthesis, remodelling and possibly recycling are all potentially vulnerable pathways for drug development. The extracellular nature of these enzymes provides an added advantage for drug screening since small molecules need not enter the cell for biological activity. Entry of compounds into mycobacterial cells remains the major confounding factor in current drug development initiatives. Moreover, the lack of human counterparts would ensure a high degree of specificity.
In conclusion, the gene complements for PG remodelling revealed in this study most likely reflect the differential requirements of various mycobacteria for murein expansion/turnover during colonisation of and proliferation within host organisms or environmental niches.

Methods

The 19 mycobacterial strain sequences used in this study were all complete and either published [24, 78, 81–90] or directly submitted to GenBank [91] (Additional file 2: Table S1). The following sites were utilized for analysis of the genomes (Additional file 2: Table S2). The comparative genomic profile for the enzymes of interest was initiated by homology searches of known M. tuberculosis H37Rv genes at TubercuList [92], GenoList [93] or TBDB [94]. Where necessary for further analysis, direct BLAST analysis was performed at NCBI [95], utilising protein sequence for BLASTp or DNA sequence for BLASTn, particularly for the analysis of Mycobacterium sp. JLS, M. africanum and M. intracellulare, which are not or only partially annotated at TBDB. To confirm the absence of genes, protein sequence was used for tBLASTn analysis. Additional homologues that are absent from M. tuberculosis H37Rv were identified by advanced search at SmegmaList (Mycobrowser) [96]. Where information was required for sequence level analysis, the Sanger Artemis Comparison Tool (ACT) [97] was utilized on annotated sequences obtained from the Integrated Microbial Genomes (IMG) site at the DOE Joint Genome Institute [98]. Phylogeny was established from FASTA files from all genes in Table 1 at EMBL-EBI by ClustalO [99] alignment and ClustalW2 [100] analysis and visualized using FigTree V1.4 software (http://tree.bio.ed.ac.uk/software/figtree). Functional annotation of each of the M. tuberculosis proteins was identified at InterProScan [101], for PFAM domains [102], signal sequences (SignalP) [103] and membrane anchoring domains (TMHMM) [104].
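For readers wishing to reproduce the presence/absence calls with a locally installed BLAST+ rather than the web services listed above, the following is a minimal sketch of how hits might be tabulated. It assumes that search results were saved in the standard 12-column tabular format (`-outfmt 6`); the function names, the e-value cut-off of 1e-5 and the demonstration rows are illustrative assumptions and were not used in, or derived from, this study.

```python
import csv
import io

# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query_id: best-scoring hit} for hits with evalue <= max_evalue.

    max_evalue is an illustrative cut-off, not a value taken from the study.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        if row["evalue"] > max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


def presence_table(tabular_text, queries, max_evalue=1e-5):
    """Map each query gene to True if it has at least one accepted hit."""
    hits = best_hits(tabular_text, max_evalue)
    return {query: query in hits for query in queries}


# Made-up rows for demonstration only; subject identifiers and scores are hypothetical.
demo = (
    "Rv3717\tsubject_1\t75.0\t240\t60\t0\t1\t240\t5\t244\t1e-150\t520\n"
    "Rv3717\tsubject_2\t40.0\t100\t55\t3\t100\t200\t10\t110\t2e-12\t70.5\n"
    "Rv1884c\tsubject_3\t30.0\t50\t33\t1\t100\t150\t20\t70\t0.5\t25.0\n"
)
print(presence_table(demo, ["Rv3717", "Rv1884c"]))
```

A query with no accepted hit is only a candidate absence; as described above, absence was confirmed in this study by tBLASTn and, where needed, by whole genome alignment.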