All simulations reached thermal equilibrium early on (energy, temperature and pressure plots in Figs. S8–S10), as the simulations in water and hexane underwent preceding equilibration steps and there are very few degrees of freedom to equilibrate in the vacuum simulations. The radius of gyration (Fig. S11) was monitored as an indicator of the overall shape of the molecule at each time point. In water, the molecular shape converged after 6 ns for most of the replicate simulations (consistent with the hydrophobic chains folding to reduce the surface area exposed to water). Convergence was not seen in vacuum or in hexane, with hexane showing the most variation in structure. To ensure that a consistent set of equilibrated structures was considered, MA conformations from the last 4 ns of each simulation were used. The analyses were also compared with the results of the full simulations to account for the broader range of structures accessed during the 6 ns ‘equilibration’ period.
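As an illustration of the shape measure used above, the following minimal sketch computes a mass-weighted radius of gyration for a single frame and applies a simple stability check to an Rg time series. The function names, the equal-mass default and the tolerance-based convergence criterion are assumptions made for illustration; they are not the analysis protocol used to produce Fig. S11.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one frame.

    coords: list of (x, y, z) tuples.
    masses: optional list of atomic masses (equal weights assumed if omitted).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Centre of mass of the frame
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)

def first_stable_frame(rg_series, tol):
    """Earliest frame index after which Rg stays within tol of its final value.

    Hypothetical convergence heuristic, used here for illustration only.
    """
    last = rg_series[-1]
    start = len(rg_series)
    for i in range(len(rg_series) - 1, -1, -1):
        if abs(rg_series[i] - last) > tol:
            break
        start = i
    return start
```

A compactly folded chain gives a smaller Rg than an extended one, so a plateau in the Rg series is one indication that the overall molecular shape has stopped changing.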
Defined WUZ MA conformations
Each frame from the simulations was analyzed for the seven possible WUZ-folds according to their chain lengths and intramolecular distances, as defined in Table 3 and shown schematically in Fig. 2. The seven possible WUZ conformations for AMA are shown in Fig. 4. A W-conformation represents bending at each functional group with four parallel alkyl chains. The various Z-conformers fold at two of the functional groups while U-conformers only fold at one functional group.
Mycolic acid class in relation to WUZ folds
From the WUZ-distributions (Table 4), KMA stands out as having the highest percentage of WUZ-conformations in each of vacuum, water and hexane (48.0%, 27.7% and 29.4%, respectively). This large percentage of WUZ-conformers for KMA clearly distinguishes it from the other MAs, and this very different pattern of folding may correlate with specific biological functions for this MA class, as has already been suggested [42, 45]. The difference in the percentage of WUZ-conformations relative to previous work is presumably due to the extended timescales in the current work and an improvement in the description of the cyclopropane group in the force field that was used. Both AMA and MMA show fewer WUZ-conformations (11.5%, 15.0% and 10.4% for AMA and 12.8%, 6.5% and 6.9% for MMA in vacuum, water and hexane, respectively), with BMA, the backbone control molecule without mero-chain functional groups, showing the fewest (3.7%, 1.2% and 5.9% in vacuum, water and hexane, respectively). The reduced number of WUZ-conformations for BMA suggests that the mero-chain functional groups substantially influence how MAs fold, and that the functional groups may steer conformations in unique ways dependent on the chemical structure of each molecule. The different pattern of folding between solvents for AMA compared with both MMA and KMA implies a specific solvent effect in the oxygenated MAs: in water, the W-fold of AMA is not disrupted, whereas the proportion of WUZ-conformations for MMA and KMA falls relative to vacuum.
Solvent effect on WUZ folding
In terms of solvent trends, the most compactly folded W-conformation comprises a majority of the identified WUZ conformers simulated in either vacuum or water. In contrast, mostly more open U-conformations are found in hexane. These results, taken together, emphasize the role of inter-chain interactions, where only hexane can compete with these interactions to break up this structuring. This is further emphasized by the similarity of the WUZ-conformation distributions of BMA. This latter molecule does not have folding directed by functionality, and thus has minimal directed chain–chain interactions. For example, in hexane, both AMA and BMA display similarly low levels of WUZ-folding, except for a marginally higher percentage of eU conformers for BMA (3.8%) compared with AMA (2.7%). With folding at the acid head group in the eU conformer, this result suggests that the presence of the remaining functional group in BMA, namely the head group, facilitates folding at this position. For KMA, however, a significant proportion of sU conformer is present in hexane, and the same level of aZ conformer is retained as in the vacuum and water simulations. This structuring is consistent with an interaction between the meromycolate group and the keto functionality at the distal position, but without further folding to the W-fold as might be driven by stronger inter-chain interactions in water and vacuum. The meromycolate-keto interaction would be hydrogen-bonding in nature, and thus hard to disrupt in hexane. Although MMA, which also has an oxygenated group at the distal position, might also be expected to have hydrogen-bonding interactions, the methoxy group is less polar than the keto group and the methyl group may hinder hydrogen bonding in MMA. This suggests additional complexity in determining conformation preference.
In addition to potential meromycolate-keto interactions, the sU conformation requires a hairpin fold at the α-methyl trans-cyclopropane group, which has recently been described as facilitating folding in MAs. KMA is the only MA with an α-methyl trans-cyclopropane group modelled here.
In general, the percentages of WUZ-conformations found here are higher than previously found, most notably for KMA, which shows the highest total percentages in all environments simulated. This is attributable largely to the improvements in the force field used, and may also be affected by simulation length and hence sampling of the potential surface. It remains significant that MAs can fold into such structured conformations as are found in ordered compressed monolayers or the cell wall, even when in isolation without any restrictions or lateral packing effects. Nevertheless, WUZ-conformations, comprising hairpin bends at various or all functional groups with straight alkyl chains in parallel and close to each other, account for only a fraction of the conformations sampled in simulations of single MAs. WUZ-conformations, in being defined by a restricted number of 2D intramolecular distances, provide a limited description of MA folding. The applied distance cut-offs and the two-dimensionality of the definitions mean that molecules that resemble WUZ-conformations well can be excluded, and conformations that do not resemble the correct three-dimensional (3D) shape are sometimes included in WUZ-defined folds. Hence, to describe the conformational behavior of free MAs in solution more holistically, it is necessary to further develop a well-defined analysis strategy.
Exploring the wider scope of MA conformations
A more comprehensive picture of the spread of all the conformations was initially obtained by a principal component analysis (PCA) using the carbon backbone of average WUZ-structures combined from extracted frames to map out two principal coordinates for each MA. These principal component plots are shown in Figs. S12 and S13.
The plot for each MA is unique, as the axes are vectors that display the maximum amount of variance for each molecule. At the extremes of the x-axis are an extended straight conformation and an sU-like conformation, correlating this axis with the extension of the backbone of the MA. The y-axis separates those conformations with the chain terminating with “a” being extended from those terminating with “e” extended. Comparing the projection of all conformers from all 20 simulations, which covers a large area, with the projection of conformers from single simulations, which sample only a small portion of the conformational space, shows that the potential energy surface of each molecule was sampled well (Fig. S12). This provides confidence in the use of nanosecond timescales and multiple simulations with varied starting conformations to improve the sampling of the conformational space.
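The projection onto two principal coordinates can be sketched as follows. This is a dependency-free illustration that assumes each frame has already been superimposed on a common reference and flattened into a single list of backbone coordinates; it uses power iteration with deflation to obtain the two leading eigenvectors of the covariance matrix, which is a stand-in for whichever PCA routine was actually used. All function names are hypothetical.

```python
import math

def _leading_eigenpair(matrix, iters):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    d = len(matrix)
    vec = [1.0 + 0.1 * j for j in range(d)]  # arbitrary non-uniform start vector
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iters):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue (variance along this component)
    val = sum(vec[a] * matrix[a][b] * vec[b] for a in range(d) for b in range(d))
    return vec, val

def pca_2d(frames, iters=500):
    """Project pre-aligned, flattened frames onto their first two principal components.

    Returns (scores, components, variances); scores[i] is the (PC1, PC2) pair of frame i.
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centred = [[f[j] - mean[j] for j in range(d)] for f in frames]
    cov = [
        [sum(row[a] * row[b] for row in centred) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    components, variances = [], []
    for _ in range(2):
        vec, val = _leading_eigenpair(cov, iters)
        components.append(vec)
        variances.append(val)
        # Deflation: remove the component just found before searching for the next
        for a in range(d):
            for b in range(d):
                cov[a][b] -= val * vec[a] * vec[b]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in centred]
    return scores, components, variances
```

Because the components are computed separately for each molecule, the resulting axes are not directly comparable between MAs, which is why each plot is unique.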
A second set of PCA analyses comparing the full sampling trajectories (20 × 10 ns) with the truncated simulations (20 × last 4 ns) shows that, most markedly in water, MA conformers converged toward the latter half of the simulation time (consistent with other equilibration measures), indicating different sampling of conformational space between the starting and final conformations (Fig. S14). When the conformers from the last 4 ns of each simulation are projected on the principal component plot (Fig. S13), vacuum and water simulations have conformers grouped more specifically at folded conformations such as W and sZ. More extended conformers are not present in the last 4 ns in vacuum and water. In hexane, the conformer spread remains diffuse with extended conformers even in the last 4 ns. Plots with conformers from the last 4 ns allow the most populated conformations to be more clearly visible.
WUZ-defined conformations are positioned peripherally to the sampled conformations (Fig. S13). Extraction and averaging of conformers from defined parts of each principal component plot revealed new conformations that differ from the WUZ-defined conformations. The letters A, B, C and G in Fig. 5 mark examples of these newly characterized conformers. For each of the MAs except KMA, a new conformation representative of a large proportion of the conformers occurring in hexane was defined. This conformation, shown in Fig. 5 (A), is representative of those indicated by A (AMA), D (MMA) and J (BMA) in Fig. S13. This new conformation is unfolded with a slight bend along the length of the molecule and various kinks in the alkyl chain. In addition, new conformers that are compactly folded, representative of those surrounding the W-conformation (Fig. S13, B and C), are shown in Fig. 5, B and C. These structures are globular in shape with numerous bends, twists and kinks in the alkyl chains. The chains weave between each other in diverse patterns, making these conformations hard to define in two dimensions. Similar globular conformations were found located at E and F (MMA), H and I (KMA), and K and L (BMA). These new conformers tend to have a high proportion of gauche orientations in the alkyl chains.
In particular, KMA showed a different distribution in hexane for which a new conformer was obtained to represent one of the main conformational groupings. This conformation, shown in Figs. 5 and S13 as G, closely resembles an sU-conformation, with a hairpin bend at the cyclopropane group with additional kinks in the chains where the polar head group and mero-chain keto-group are in close proximity.
The principal component plots indicate that most conformations found for single MAs in vacuum or in solution do not have extended alkyl chains in the trans-orientation, as is suggested by defined WUZ-conformations, but rather constitute a wide range of conformers with bent alkyl chains with high gauche-content. When the alkyl chains of the MAs align closely and in parallel, the trans-orientation is promoted, as seen in the highly ordered W-conformations in which all four alkyl chains are parallel and straight. In MA monolayers and in the cell wall arrangement, the packing of MAs close to each other is likely to promote ordering and trans-orientation of the MA alkyl chains. Therefore, it is not likely that the globular conformations of MAs will be prevalent in these settings. However, at high surface areas in monolayer experiments, where molecules are not tightly packed, conformations with bent and twisted alkyl chains will be more predominant, especially if the molecules are not folded into the W-conformation at these surface areas. Free MAs occurring in mycobacterial biofilms, and in aqueous environments such as those used for serodiagnosis, are also likely to occur in more globular conformations.
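The gauche content referred to above can be quantified from the C–C–C–C dihedral angles along an alkyl chain. The sketch below is illustrative only: it assumes an ordered list of chain-carbon coordinates and classifies a dihedral as trans when its magnitude exceeds 120°, counting everything else as gauche. That threshold and the function names are assumptions, not the criteria used in this work.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four consecutive atoms."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def gauche_fraction(chain, trans_threshold=120.0):
    """Fraction of dihedrals along the chain that are not trans.

    chain: ordered list of (x, y, z) carbon positions (at least four).
    trans_threshold: |angle| above which a dihedral counts as trans (assumed value).
    """
    angles = [
        dihedral_deg(chain[i], chain[i + 1], chain[i + 2], chain[i + 3])
        for i in range(len(chain) - 3)
    ]
    gauche = sum(1 for a in angles if abs(a) < trans_threshold)
    return gauche / len(angles)
```

A planar zigzag (all-trans) chain returns 0.0, whereas the kinked, globular conformers described above would give values well above zero.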
FEL of mycolic acids
To further explore key conformations and folding behavior, and to provide a rigorous basis for key conformer selection, free energy landscapes (FELs) were used. This method assigns relative energies to conformers based on the frequency with which they occur within a simulation. As such, the FELs reflect a number of the points already extracted from the PCA, namely that there are clear differences in the folding behavior of all MAs, and that they are much more flexible in hexane and show the most defined conformations in water (exemplified by comparing Figs. S16–S27).
A key feature of the FEL is that minima are easily identified and visualized. Clustering can be achieved by applying energy cut-offs to extract conformers that are similar in energy (Fig. S15). This approach affords clearly defined groups corresponding to the most stable structures. Using this approach, the key cluster-averaged minima for AMA, MMA and KMA were extracted for each solvent, using 1, 2 and 3 kcal mol−1 cut-offs around the minima on the FEL (Tables S1–S3). The RMSD for the generated cluster structures in water (Tables S4–S6) did not vary appreciably with cut-off (with a couple of exceptions at the 3 kcal mol−1 cut-off, where a significant increase was seen), showing that a majority of molecules with similar conformations can be captured with a cut-off of 1–2 kcal mol−1. In each case, the proportion of the structures represented by the cluster groupings was higher than that of structures corresponding to WUZ representations. The structures identified indicated that a different range of structural variability was important, in line with the preliminary PCA. These differences were most clearly seen in the water simulations, which are also most relevant for the free MAs on which techniques such as serodiagnosis presumably rely. Full FELs for each MA under each solvent condition are presented in the Supporting Information (Figs. S16–S27).
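A frequency-based free energy estimate and an energy cut-off selection of the kind described above can be sketched as follows. The sketch assumes the common Boltzmann-inversion form ΔG = −k_BT ln(P/P_max) on a two-dimensional grid of principal component scores; the bin width, the default temperature of 300 K and the function names are assumptions for illustration. For simplicity it selects frames relative to the global minimum only, whereas the analysis in the text extracts clusters around each minimum.

```python
import math
from collections import Counter

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def free_energy_landscape(scores, bin_width, temperature=300.0):
    """Relative free energy (kcal/mol) of each occupied (PC1, PC2) bin.

    scores: list of (pc1, pc2) pairs, one per frame.
    Returns (bins, energy): the bin of each frame and a dict bin -> energy,
    with the most populated bin at zero.
    """
    bins = [(math.floor(x / bin_width), math.floor(y / bin_width)) for x, y in scores]
    counts = Counter(bins)
    top = max(counts.values())
    kt = KB_KCAL * temperature
    energy = {b: -kt * math.log(c / top) for b, c in counts.items()}
    return bins, energy

def frames_within_cutoff(bins, energy, cutoff):
    """Indices of frames whose bin lies within `cutoff` kcal/mol of the global minimum."""
    return [i for i, b in enumerate(bins) if energy[b] <= cutoff]
```

Applying successively larger cut-offs, such as the 1, 2 and 3 kcal mol−1 values used here, yields nested sets of frames, so the fraction of the trajectory captured can only grow with the cut-off.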
For KMA, which has the most distinct structuring under the WUZ analysis, three main clusters (a clean W, a knotted W and an open mixture) are identified over the full 10 ns of simulation (Fig. 6a), and only the two ‘W-like’ clusters are present in significant proportion in the last 4 ns. Here, these two W-like clusters represent ~19.7% and 36.6% of the simulation frames, respectively, at the 3 kcal mol−1 cut-off. In contrast, only 27.7% of structures are classified across all seven WUZ-conformations for the full 10 ns. The classification of WUZ-defined W structures at 19.2% highlights that FEL analysis enables a more global classification of accessible and closely related conformers with potentially similar structuring and stabilities.
AMA again demonstrates three main clusters initially, collapsing to two key clusters in the last 4 ns (Fig. 6b). A ‘W-like’ cluster has structural representations that are consistent with WUZ-defined W, and other, more knotted, clusters based on the W form. These are not distinguished from one another as they were for KMA. Under this definition, the W-like structures represent 68.4% of the available frames, an even higher proportion than in KMA. The second lowest-energy cluster represents sZ-related structures (Fig. 6b cluster 3, part-folded structures), contrasting with the aZ structures identified by WUZ-analysis as the second major fold for AMA in water. As for KMA, the open structures identified as a cluster in the early part of the simulation collapse by the later stages. AMA accounts for about half of the MA content in the M. tb cell wall and was shown in previous work to be the most flexible, with the highest percentage of WUZ-conformations. AMA also showed low immune activity and antigenicity experimentally [21, 23]. The flexibility of AMA inferred from WUZ-conformations in the three environments simulated may in fact be more complex, complicated by a range of knot-like forms that could be difficult to distinguish. The low barrier between different minima may contribute to poor antigenicity. The high percentage of AMA in the cell wall implies that it is key in determining cell wall fluidity and permeability properties, and as such, the impact of cell wall organization will be critical to assess in the future.
MMA shows a particularly interesting profile in water using FEL analysis (Fig. 6c). Here, a single minimum is identified under equilibrated conditions, and, consistent with the low percentage of WUZ structures identified, approximately 70% of the structures are globular-type structures at the 3 kcal mol−1 cut-off. MMA occurs naturally with either cis- or trans-cyclopropanation. Experimentally, MMA is the most antigenic, with trans-cyclopropanation showing higher antigenicity than cis-cyclopropanation. Here, the cis-cyclopropane-containing MMA was modelled as it represents the majority as found in M. tb. The cis-isomer is extremely immune active, eliciting a distinct inflammatory response, whereas the trans-isomer has largely lost this activity. The knotted structures of the MMA indicated here, and related to those identified for AMA as major contributors, may suggest that the oxygenation constitutes a critical feature of the antigenicity seen for methoxy-species. The role of stereochemistry in folding is likely to reveal further potential mechanisms for immune activity.
Comparison of WUZ and FEL-based classifications
To see how well the new minima identified by FEL analysis correlated with the WUZ analysis, the structures populating the FEL clusters for water simulations were extracted and correlated with their WUZ classifications for each of the three energy cut-offs (Table S7). Fewer than half of the structures described by the FEL clusters overlap with WUZ-defined structures, except for cluster 2 of KMA. For this cluster, there was a very high correlation with the W-definition, where just over 80% of this cluster could be defined in this way. This high degree of structuring was supported by the RMSD for this cluster, which was 3.7 and 4.0 Å at the 1 kcal mol−1 and 2 kcal mol−1 cut-offs, respectively, compared with a value of around 12 Å for the unclustered portion of the surface. This supports the recognition of KMA as having a propensity for structured folding.
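The cross-tabulation described above amounts to asking, for each FEL cluster, what percentage of its frames also carries a given WUZ label. A minimal sketch follows, assuming a per-frame list of WUZ labels in which unclassified frames are marked `None`; the data layout and function name are hypothetical and not taken from the analysis behind Table S7.

```python
from collections import Counter

def wuz_composition(cluster_frames, wuz_labels):
    """Percentage of a cluster's frames assigned to each WUZ conformation.

    cluster_frames: frame indices belonging to one FEL cluster.
    wuz_labels: per-frame WUZ label (e.g. "W", "aZ"), or None if unclassified.
    Percentages are relative to the whole cluster, so they need not sum to 100.
    """
    counts = Counter(
        wuz_labels[i] for i in cluster_frames if wuz_labels[i] is not None
    )
    n = len(cluster_frames)
    return {label: 100.0 * c / n for label, c in counts.items()}
```

For example, a cluster in which three of four frames are labelled W and one is unclassified would be reported as 75% W.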
For AMA, cluster 1 correlates with around 11–12% of classic W, with cluster 2 represented by around 40% aZ, whereas the clusters for MMA do not overlap significantly with WUZ-definitions, with the minor cluster 2 (representing <2% of the total structures) being the best defined in this way with 13.9–18.6% eZ. This comparison of the two approaches for defining structures highlights that WUZ conformations, although present, only describe a fraction of conformations for free MAs in vacuum and solvent. FEL-clusters have highlighted that other open, part-folded and, in particular, knotted globular conformations make up a majority of accessible MA conformations, and that these differ depending on the underlying functionality. It may be helpful to apply these latter approaches to cell-wall and membrane-based studies to capture a fuller picture of MA flexibility and conformational scope.