Introduction

Lantibiotics are ribosomally-synthesised peptide bacteriocins, produced by gram-positive bacteria for targeting other bacterial species during defence strategies, and undergo extensive post-translational modifications prior to forming the mature functional lantipeptide1,2. Lantibiotics, or lanthionine-containing antibiotics, are so named because they contain unusual amino acids, lanthionine (Lan) and methyllanthionine (MeLan), which are formed by the fusion of two alanines cross-linked by a thioether linkage1,3. Lantipeptides also contain several unsaturated amino acids, including dehydroalanine and dehydrobutyrine1.

The bactericidal activity of lantibiotics is attributed to the formation of stable pores in the target membrane, which disrupts cellular integrity or prevents cell wall biosynthesis4,5,6. Lantipeptides are highly sought-after antimicrobials in the food preservation and pharmaceutical industries owing to their low toxicity in mammalian systems, higher potency than antibiotics, few or no reports of lantibiotic resistance in bacteria, and potent activity against drug-resistant strains such as MRSA and VRE7,8,9,10,11. Drug resistance is a serious global concern at present, and the rising emergence of resistant strains demands the design of novel therapeutic strategies. Antibiotic-resistant strains often develop biofilms, which further aggravates the crisis of resistance, necessitating the prevention of biofilm formation. The potential of several lantibiotics, including nisin, nukacin ISK-1, and gallidermin, in hindering the formation of biofilms in staphylococcal strains such as MRSA is widely known12. Numerous studies have demonstrated the efficacy of lantibiotics against resistant strains including MRSA, VRE, and GISA13,14. Several authors emphasise the potential of lantibiotics in combating the emerging drug-resistant strains and support the view that they can serve as feasible alternatives to antibiotics in the future15. Efforts are being made to employ bioengineering strategies for the development of optimised lantipeptides and nano-engineering approaches for broadening the antibacterial spectrum of lantibiotics16,17. With the exception of cinnamycin, all the lantibiotics selected herein are lanthionine-containing peptide antibiotics that are able to depolarise the energised bacterial membrane and subsequently destabilise membrane integrity. Additionally, the 37 lantipeptides, barring cinnamycin, are capable of creating aqueous transmembrane pores17.
Although these 36 lantibiotics are functionally similar, their structures are diverse, especially with respect to post-translational modifications, presence of unusual amino acids including dehydrated and unsaturated amino acids with variable linkage patterns, and methyl lanthionine bridges that are crucial to structural stability and function10,18. The tertiary structures, structural conformation, important amino acid residues, conserved domains, and intra-molecular chemical bonds need to be understood in further detail for designing engineered lantipeptides with enhanced stability and bioactivity19.

In this study we constructed the structures of 37 lantibiotics from over 25 organisms, using molecular modelling approaches, and studied their structural and sequence diversity, in addition to analysing their structural dynamics using molecular dynamics simulations. The lantibiotic sequences selected in this study had reviewed, manually annotated information in UniProtKB, and the existence and function of the 37 lantipeptides were experimentally proven.

Results

Sequence-based information

The sequences retrieved from UniProtKB [Supplementary Table S1] belonged to five protein families (InterPro accession IDs: IPR007682, IPR006079, IPR029243, IPR027632, and IPR012519), containing five Pfam detailed signatures (Pfam accession IDs: PF04604, PF02052, PF14867, PF16934, and PF08130). Based on the composition of the conserved domains, the lantibiotics were found to belong to six super families, namely, lantibiotic type A, gallidermin, lantibiotic A, TOMM pelo, mersacidin, and antimicrobial 18. The physico-chemical properties, including the molecular weight, isoelectric point, aliphatic indices, sequence length distribution, extinction coefficients, hydropathy indices, antigenicity, and presence of disordered regions, were determined [Supplementary Figs. S1–S6, Supplementary Table S3].

Phylogenetic analysis

The multiple sequence alignment (MSA) revealed that the 37 lantibiotic sequences shared a reasonable degree of sequence similarity [Fig. 1]. The Neighbour-Joining phylogenetic tree demonstrated that the sequences belonged to three distinct evolutionarily-related clusters. The nisins (A, Z, and U) were clustered in the same group as epidermin, gallidermin, mutacins, subtilin, streptin, and pep5 [Fig. 2]. The duramycins and epilancins were grouped along with mersacidin, lacticin, actagardine, cinnamycin, ancovenin, and paenibacillin. The third group comprised the ruminococcins, mutacin 2, lichenicidins, salivaricin, streptococcin, nukacins, and cypemycin [Fig. 2]. This third group could be further divided into two subgroups, with salivaricin A, cypemycin, lacticin 3147 A1, and the lichenicidins in one subgroup, and lacticin 481, mutacin 2, the nukacins, streptococcins, and ruminococcins in the other.

Figure 1

MSA demonstrating the sequence conservedness among the 37 lantipeptides selected for this study. The sequence logo represents the most commonly occurring amino acid at a particular position, where the size of the lettering indicates the frequency of occurrence of a particular amino acid.

Figure 2

Phylogenetic tree of the 37 lantipeptides, constructed using the Neighbour-Joining algorithm. The 37 lantipeptides were grouped into three groups, which are demarcated by green, blue, and red colours.

Comparative modelling, validation, and analysis

The structures of the 37 lantipeptides constructed by homology modelling are shown in Fig. 3. The models were comparable to experimentally-derived protein structures of similar length, as indicated by the ProSA Z-score and the global quality Z-scores obtained from the Verify 3D server. The ProSA Z-scores of the lantibiotic homology models fell within the range of experimentally-derived X-ray and NMR structures of similar length [Supplementary Table S2]. Ramachandran plot analyses indicated the proper assignment of backbone torsion angles, with the torsion angles of the majority of residues being within the allowed regions of the Ramachandran plot [Supplementary Table S2]. Additionally, the different kinds of intermolecular bonds and interactions, including intermolecular hydrogen bonds, van der Waals interactions, disulphide bonds, salt bridges, π-π stacking interactions, and π-cation interactions, were determined for each of the 37 lantibiotic models generated herein and subsequently analysed [Supplementary SF1].

Figure 3

Structures of the 37 lantibiotics constructed by homology modelling in ribbon representation.

Pockets and disordered residues

Some of the lantibiotics, including lacticin 3147-A1, lacticin 3147-A2, and cypemycin, were found to contain disordered regions that were predicted to have a role in protein binding [Supplementary Table S3]. Additionally, the residues comprising the pockets in the lantibiotic structures were analysed and the details of the pockets and mouths have been tabulated in Table 1.

Table 1 Pockets and mouth information of the 37 lantipeptides. Pocket residues that are disordered have been highlighted in grey

Structural diversity of lantibiotics

The structural diversity of the 37 lantibiotics was reflected in the RMSD values, which in some cases were as high as 10 Å, as represented in Fig. 4. The structural RMSD values of gallidermin with lichenicidin VK21-A2, ancovenin, and cinnamycin were the lowest, being 0.753 Å, 0.837 Å, and 0.934 Å, respectively. The structures of subtilin and duramycin demonstrated the greatest structural diversity, with an RMSD value of 10.226 Å between the two structures. On average, the structural RMSD values were in the range of 4–5 Å. The average RMSDs of gallidermin, nukacin, and mutacin B-Ny266 with all the other lantibiotics were the lowest, being in the range of 3–3.5 Å. The relational RMSD data matrix [Supplementary SF2] of all the 37 lantibiotics was standardised prior to the Principal Component Analysis (PCA). The X and Y axes depicted principal component 1 (PC1) and principal component 2 (PC2), respectively, which represented 16.5% and 10.7% of the total variance [Fig. 5]. The variance explained by the principal components, the values of the principal components, and the values of the component loadings are provided in the supplementary information [Supplementary SF3]. Analysis of the PCA plot revealed that duramycin, duramycin B, duramycin C, lacticin-481, actagardine, and ancovenin had the maximum variation among all the 37 lantibiotics.
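The standardisation and PCA step can be illustrated with a minimal NumPy sketch. It is not the ClustVis implementation used in this study: it assumes unit-variance scaling of the columns and a plain SVD without imputation, and the function name `pca_from_matrix` and the toy distance matrix are hypothetical stand-ins for the real 37 × 37 RMSD matrix.

```python
import numpy as np

def pca_from_matrix(matrix, n_components=2):
    """PCA of a (samples x features) matrix via SVD after standardisation."""
    x = np.asarray(matrix, dtype=float)
    # Standardise each column to zero mean and unit variance.
    centred = x - x.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    z = centred / std
    # SVD of the standardised matrix; squared singular values are
    # proportional to the variance captured by each component.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Toy symmetric "RMSD-like" matrix for 5 hypothetical structures.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))
toy_rmsd = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

scores, explained = pca_from_matrix(toy_rmsd)
print(scores.shape, explained)
```

The two entries of `explained` play the role of the fractions of total variance reported for PC1 and PC2; the values obtained from the toy matrix bear no relation to the 16.5% and 10.7% reported here.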

Figure 4

Plot of the structural RMSD, demonstrating the range of structural divergence among the 37 lantibiotics. The colour key provides the range of the structural RMSD (in Å), ranging from a low structural RMSD (blue-green), medium (yellow-orange), to high structural RMSD (red).

Figure 5

Plot showing the PCA of the 37 lantipeptides with respect to their intra-RMSD values, where the X and Y axes depict principal component 1 (PC1) and principal component 2 (PC2).

The secondary structure composition of the lantipeptides also varied, with some lantipeptides, including mutacin-2, ruminococcin-A, lichenicidin VK21-A1, lichenicidin VK21-A2, lacticin-481, gallidermin, nukacin, epilancin-15X, epilancin, cinnamycin, duramycin, streptococcin A-FF2, streptococcin A-M49, and lanna-staho nukacin, having a higher helical content [Fig. 6]. On the other hand, mersacidin, salivaricin A, actagardine, nisin U, ruminococcin A1, ancovenin, pep-5, nisin Z, mutacin B-Ny266, lantibiotic 107891, epidermin, cypemycin, duramycin C, duramycin B, and subtilin had a higher content of turns and coils. Among the 37 lantipeptides, beta-strands were prominent in the structures of streptin, mersacidin, salivaricin A, duramycin C, lacticin 481, lichenicidin VK21-A1, lacticin 3147-A1, and mutacin 1140.

Figure 6

Graphical representation of the secondary structure content of the 37 lantipeptides.

MD simulation

The lantipeptides demonstrated structural consistency throughout the simulation, indicated by the RMSD and radius of gyration20 [Figs. 7 and 8]. The lantipeptides with a higher content of turns and coils, including ancovenin, duramycin B, actagardine, mutacin B-Ny266, and lantibiotic 107891, had the lowest radii of gyration among the 37 lantipeptides. Since the radius of gyration is a measure of structural compactness, it can be said that the structures of ancovenin, duramycin B, actagardine, mutacin B-Ny266, and lantibiotic 107891 were the most compact, while the structures of gallidermin, epilancin, lacticin 3147-A2, lacticin 481, mutacin 2, and lichenicidin VK21-A2 were the least compact among the 37 lantipeptides [Fig. 8 and Supplementary Fig. S7]. The RMSF of the peptide backbone was used to determine the most flexible region of the peptide backbone [Fig. 9]. It was noted that while the backbone RMSDs of most of the lantibiotics remained consistent throughout the simulation, the backbone RMSDs of lichenicidin VK21-A2, mutacin 2, lacticin 3147-A2, epilancin, gallidermin, and lichenicidin VK21-A1 were higher than the rest [Fig. 7 and Supplementary Fig. S8]. Analyses of cluster density, cluster size, and average cluster RMSD revealed that the representative structure from cluster 1 was the best conformation in each case. The representative structures were superimposed with the cluster members to compute the relation between the average RMSD and the global distance test (GDT_TS) [Supplementary Fig. S9].

Figure 7

Plot showing the backbone RMSD of the 37 lantibiotics over the simulation time.

Figure 8

Plot showing the radius of gyration (RoG) of the 37 lantibiotics throughout the simulation time.

Figure 9

RMSF plots demonstrating the per-residue fluctuations of the 37 lantipeptides, indicating the flexible regions.

Discussion

Lantibiotics are bactericidal peptides characterised by the presence of unusual amino acids, namely the thioether-containing polycyclic lanthionines and unsaturated amino acids1. They are produced by gram-positive bacteria for targeting other bacterial species by forming pores in the target membrane that disrupt cellular integrity or inhibit cell wall biosynthesis9. Lantibiotics are widely used in the food preservation and pharmaceutical industries7. In the present global scenario, the surge in drug-resistant strains demands the development of novel drugs and antimicrobials for combating the emerging drug resistance. Their high in vitro potency, combined with the variety of strategies they employ for effectively targeting bacterial cells, makes lantibiotics promising macromolecules for the generation of novel antibiotics in the future15,21,22. Lantibiotics inspire the construction of engineered antimicrobial peptides for combating specific bacterial diseases, which makes an understanding of lantibiotic structures necessary7,17. The objectives of this study were to construct, by homology modelling, the structures of 37 lantipeptides having reviewed and annotated sequence information in UniProtKB, and to evaluate the diversity, compactness, and stability of these structures.

Analysis of the MSA revealed that the lantibiotic sequences shared a high degree of conservedness, which was in marked contrast to the diversity of their structures. The structural diversity of the 37 lantipeptides was determined from the RMSD values. The correlation coefficient between the sequence diversity and structural diversity of the 37 lantipeptides was 0.189. This low value indicated that the structural diversity of the 37 lantibiotics is not significantly correlated with the diversity of the lantibiotic sequences. It further indicates that the sequence-structure relationship of the lantibiotics selected herein is flexible, which not only allows room for human tailoring but also suggests that the natural post-translational engineering is probably not an accident. Lacticin 3147-A1, lacticin 3147-A2, and cypemycin were found to contain disordered residues that are capable of binding proteins, and some of these residues were also found to form part of the pockets in the lantipeptide structures. Protein-protein interactions involving a disordered protein are generally mediated by a transition from disorder to order upon protein binding23. Since protein-protein interactions are often mediated by small flexible pockets at the protein-protein interface, these disordered residues might be responsible for lantibiotic-protein interactions, and could undergo similar structural transitions upon binding.
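One way such a correlation between sequence diversity and structural diversity could be computed is sketched below. The text does not state which coefficient was used, so Pearson's r over the unique pairs of two symmetric matrices is an assumption, and the function names and toy matrices are hypothetical.

```python
import numpy as np

def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    m = np.asarray(matrix, dtype=float)
    i, j = np.triu_indices(m.shape[0], k=1)
    return m[i, j]

def matrix_correlation(seq_dist, struct_rmsd):
    """Pearson correlation between two pairwise-distance matrices.

    Only the unique pairs (upper triangle) are used, so that each pair of
    peptides contributes once and the zero diagonal is ignored.
    """
    a = upper_triangle(seq_dist)
    b = upper_triangle(struct_rmsd)
    return float(np.corrcoef(a, b)[0, 1])

# Toy example with 3 hypothetical peptides (not data from this study).
toy_seq = np.array([[0.0, 0.2, 0.5],
                    [0.2, 0.0, 0.4],
                    [0.5, 0.4, 0.0]])
toy_rmsd = 10.0 * toy_seq  # perfectly proportional, so r is 1 up to rounding
r = matrix_correlation(toy_seq, toy_rmsd)
print(r)
```

Because entries of a distance matrix are not independent, a significance test on such a coefficient would normally need a permutation approach such as a Mantel test; the sketch only reproduces the coefficient itself.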

Methods

Lantibiotic sequences

The existence and biological functions of the 37 lantibiotics selected in this study have been established by experimental studies, and the sequences had reviewed and manually annotated information in UniProtKB/Swiss-Prot non-redundant sequence database24 [Supplementary Table S1].

Information from primary data

The domains, repeats, super families, and conserved patterns of the 37 lantibiotics were identified using InterPro Scan and the batch CD-search tool25,26. The transmembrane regions and the hydropathy indices of the lantibiotics were determined using the CLC Genomics Workbench v 8.5. The Kyte-Doolittle and the Eisenberg scales were used for determining the local hydropathy plots. Lantibiotic antigenicity was analysed by the semi-empirical method of Kolaskar and Tongaonkar. Information pertaining to the physico-chemical properties, such as molecular weight, isoelectric pH, aliphatic index, hydrophobicity, hydrophilicity, and amino acid composition, was also computed. The disordered regions were identified with the DISOPRED3 algorithm27.
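A local hydropathy plot of the kind described above can be sketched as a sliding-window average over the Kyte-Doolittle scale. This is an illustration, not the CLC Genomics Workbench procedure: the window length of 5, the function names, and the toy sequence are assumptions, and only the 20 standard residues are handled (modified residues such as lanthionine are outside the scale).

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=5):
    """Sliding-window mean hydropathy; one value per window position."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def gravy(sequence):
    """Grand average of hydropathy: mean scale value over all residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return sum(values) / len(values)

# Arbitrary toy peptide, not one of the 37 lantibiotic sequences.
toy_sequence = "ACDKLLVGST"
profile = hydropathy_profile(toy_sequence, window=5)
print(len(profile), gravy(toy_sequence))
```

Positive window averages mark locally hydrophobic stretches and negative ones hydrophilic stretches; the window length controls how much the profile is smoothed.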

Phylogenetic analyses

An MSA of the 37 lantibiotic sequences was generated using the MUSCLE algorithm. The phylogenetic tree was constructed using the Neighbour-Joining algorithm with 1000 bootstrap replicates. The CLC Genomics Workbench v 8.5 was used for the phylogenetic analyses.

Homology modelling, validation, and analysis

The complete structures of the 37 lantipeptides were constructed by homology modelling, using Modeller (v 9.11)28,29. A structure BLAST was performed against the Protein Data Bank (PDB) to identify templates for comparative modelling30,31. Template identification was also achieved by the threading-based fold recognition method employed by the PSIPRED server (http://bioinf.cs.ucl.ac.uk/psipred/)32. The backbone torsions of the validated models were assessed by analysing their Ramachandran plots, while the improper geometries and clashes were evaluated by checking their stereochemistry, using ProCheck33. The quality of the constructed models was additionally estimated by using different servers, including the ProSA II, Verify3D, and PSVS servers34,35,36. The intermolecular bonds and interactions of the 37 structures generated herein were determined using the RING-2.0 web server (http://protein.bio.unipd.it/ring/)37.

Identification of pockets and determination of structural diversity

The secondary structure composition of the lantipeptides was determined with STRIDE (http://webclu.bio.wzw.tum.de/cgi-bin/stride/stridecgi.py)38. The pockets were identified using CASTp (http://sts.bioe.uic.edu/castp/)39, with a probe radius of 1.4 Å. The structural diversity of the lantipeptides was analysed by calculating the RMSD values following structural superimposition of the 37 lantibiotic structures. Each lantipeptide structure was individually superimposed and the intra-RMSD value was computed using CLC Genomics Workbench v 8.5. In order to understand the structural correlation among the 37 lantipeptides with respect to their intra-RMSD values, a data matrix [Supplementary SF2] of all the 37 lantibiotics was prepared and standardised prior to the PCA. The PCA was performed with the ClustVis tool40, where vector scaling is applied to the rows and SVD with imputation is used to calculate the principal components of N = 37 data points.
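The superimposition-and-RMSD step can be illustrated with the Kabsch algorithm, a standard least-squares superposition method. This sketch is not the CLC Genomics Workbench routine used here: it assumes the two structures have already been reduced to the same number of equivalent atoms (for example, aligned Cα positions), and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(mobile, reference):
    """RMSD between two (N x 3) coordinate sets after optimal superposition."""
    p = np.asarray(mobile, dtype=float)
    q = np.asarray(reference, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("inputs must both be (N, 3) arrays of paired atoms")
    # Remove translation by centring both sets on their centroids.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Covariance matrix and its SVD give the optimal rotation.
    h = p_c.T @ q_c
    u, _, vt = np.linalg.svd(h)
    # Correct for a possible reflection so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p_c @ rotation.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q_c) ** 2, axis=1))))

# Toy check: a rotated and translated copy should superimpose almost exactly.
rng = np.random.default_rng(1)
toy_ref = rng.normal(size=(6, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
toy_moved = toy_ref @ rot_z.T + np.array([1.0, -2.0, 0.5])
print(kabsch_rmsd(toy_moved, toy_ref))  # approximately 0, up to rounding error
```

Filling a symmetric matrix with `kabsch_rmsd` for every pair of structures would give a pairwise matrix of the kind that was standardised before the PCA.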

Molecular dynamics simulation and trajectory analyses

The structural stability, compactness, backbone flexibility, and per-residue fluctuations were characterised by performing coarse-grained molecular dynamics (MD) simulations of the lantibiotic structures in explicit water. The simulations were performed by combining the four most widely used force fields, namely, Amber, Gromos, OPLS, and CHARMM, in the CABS simulation procedure, run on a high-performance computing server (http://biocomp.chem.uw.edu.pl/CABSflex/)41,42. The CABS protein representation was reduced to up to four pseudo-atoms per residue, and the sampling was realised by the Monte Carlo method43. The simulation length was optimised to obtain the best possible convergence within 10 ns. The trajectories were analysed with VMD and VEGA ZZ44. The mean-square-fluctuation ⟨(ΔR)²⟩ was calculated using the following equation:

$$\langle (\Delta R)_{i}^{2}\rangle = \frac{1}{N}\sum_{j=1}^{N}\left(x_{i}(j)-\langle x_{i}\rangle \right)^{2}$$

where ⟨ ⟩ denotes the average across the entire trajectory, x_i(j) represents the position of particle i in frame j, and N represents the total number of frames in the trajectory41,44.
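The mean-square-fluctuation expression above translates directly into a few lines of NumPy. The sketch assumes the trajectory is available as an array of shape (frames, particles, 3) whose frames have already been superimposed on a common reference; the function names and the toy trajectory are hypothetical.

```python
import numpy as np

def mean_square_fluctuation(trajectory):
    """Per-particle mean-square fluctuation over a trajectory.

    trajectory: array of shape (n_frames, n_particles, 3).
    Returns an array of length n_particles.
    """
    x = np.asarray(trajectory, dtype=float)
    mean_position = x.mean(axis=0)                      # <x_i>
    sq_dev = np.sum((x - mean_position) ** 2, axis=-1)  # |x_i(j) - <x_i>|^2
    return sq_dev.mean(axis=0)                          # average over frames j

def rmsf(trajectory):
    """Root-mean-square fluctuation: square root of the quantity above."""
    return np.sqrt(mean_square_fluctuation(trajectory))

# Toy trajectory: particle 0 is static, particle 1 oscillates along x.
toy_traj = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
])
print(rmsf(toy_traj))
```

Plotting the returned values against the residue index gives an RMSF profile in which peaks correspond to the most flexible regions of the backbone.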

The trajectories were clustered using the k-means clustering method in such a way that structurally closer models belonged to the same cluster. The best conformation of each lantibiotic was selected after screening the trajectories. Each cluster was superimposed for identifying the best conformation using the Theseus application. The RMSD and RoG of the lantipeptides were determined across the simulation time frame. The root mean square fluctuation (RMSF) was determined for estimating the per-residue fluctuations, and the most flexible regions were identified from the RMSF graphs. The stability of the system and the fluctuations across the trajectories were analysed with XMGRACE45.
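The radius of gyration used here as a compactness measure can be computed for a single frame as the mass-weighted root-mean-square distance of the particles from their centre of mass. The following sketch assumes equal masses unless weights are supplied; the function name and toy coordinates are hypothetical and do not reproduce the tools used in this study.

```python
import numpy as np

def radius_of_gyration(coords, masses=None):
    """Radius of gyration of one frame.

    coords: array of shape (n_particles, 3).
    masses: optional array of length n_particles (equal masses if omitted).
    """
    r = np.asarray(coords, dtype=float)
    m = np.ones(len(r)) if masses is None else np.asarray(masses, dtype=float)
    centre = np.average(r, axis=0, weights=m)
    sq_dist = np.sum((r - centre) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=m)))

# Toy frame: two particles one unit either side of the origin.
toy_frame = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
rog = radius_of_gyration(toy_frame)
print(rog)
```

Applying the function to every frame of a trajectory gives an RoG time series; lower and steadier values indicate a more compact and stable structure.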