1 Introduction

Distance determination by pulsed electron–electron double resonance (PELDOR) [or double electron–electron resonance (DEER)] spectroscopy has become a popular tool to determine exact distances in bio-macromolecules [19, 28, 29]. Since many proteins do not contain intrinsically electron paramagnetic resonance (EPR)-active (paramagnetic) centers, PELDOR relies on the introduction of spin labels, in most cases nitroxide groups. Most commonly, site-directed spin labeling (SDSL) of cysteine residues is used to introduce spin labels on the molecular surface of a protein [2]. The distances between these labels can then be determined and, if multiple sites are labeled, distance fingerprints can be produced, even for large membrane proteins [5, 11, 13, 16]. If conditions are known that drive the protein of interest into a different conformation, another fingerprint can be prepared under these conditions and the conformational changes can be investigated [13].

An intrinsic limitation of SDSL is due to the widely used methanethiosulfonate spin label (MTSSL [4]), which introduces five torsion angles and a >7 Å separation between the spin center and the Cα of the protein (Fig. 1). The five torsion angles lead to a very large number of possible positions for the unpaired electron. This causes uncertainty in the interpretation of experimental data in structural terms.

Fig. 1
figure 1

MTSSL. The position of the unpaired electron is indicated. The variable Chi-angles 1–5 are labeled

Different approaches have been used to deal with this problem. The uncertainty can be allowed for by adding ±14 Å to any distance restraints (e.g., [22]). This approach undermines the ability of PELDOR to identify structural changes that are well within the intrinsic error of measurement. A more sophisticated approach has been to use a priori knowledge, i.e., that the nitroxide has to be located somewhere inside a conical space around the attachment site (“tether-in-a-cone-model” [1, 14, 17, 18, 36]). This cone can be further reduced by using rotamer libraries of the free label derived from crystal structures [12, 14, 23] or molecular dynamics (MD) simulations [3, 27]. This method aims to predict in which area of the cone the label will most likely reside. The most time-intensive and rigorous method is a full-atom MD simulation of MTSSL attached to the protein (e.g., [3, 8, 9, 25, 26, 30, 31]).

We present here a computer program, mtsslWizard that allows the user to attach MTSSL labels to selected positions of a protein by pointing and clicking with a computer mouse. The program is designed as a plugin for the PyMOL (http://www.pymol.org) molecular graphics program (PyMOL plugins are often called wizards, hence our choice of name). PyMOL is widely used for the generation of figures of macromolecules in publications and its source code is available free of charge (http://sourceforge.net/projects/pymol/). Our program has a graphical user interface (GUI) that is fully integrated into PyMOL.

MtsslWizard is inspired by the “tether-in-a-cone” approach [17, 18, 36] in that the program searches for all MTSSL rotamers that do not clash with the protein structure (within a variable tolerance). The approach does not make assumptions about rotamer probability, nor does it estimate interaction energies. Despite this, in many cases, mtsslWizard predicts experimental data rather well, in a simple and rapid (seconds) manner.

2 Results and Discussion

2.1 Overview and Usage of MtsslWizard

MtsslWizard is launched via the PyMOL GUI and integrates seamlessly into it. Once a structure of interest has been loaded into PyMOL and mtsslWizard has been started, the residue to be labeled can be selected with a mouse click and the user can adjust the thoroughness of the search procedure, if needed (Fig. 2). In brief, the search algorithm of the program generates random values for each of the five Chi-angles and checks the corresponding conformation of MTSSL for clashes with the protein. A clash is here defined as a violation of a “vdW cutoff” between label and protein, and not as two atoms having exactly the same position in space (see below). Depending on the “thoroughness” setting, different amounts of generation/check cycles are performed for each of the five Chi-angles: 1,500 in “painstaking search”, 300 in “thorough search”, 90 in “normal search” or 10 in “quick search”. In contrast to a full-brute search, this method allows the search procedure to be shortened (if desired) without systematically missing conformations. Each generated conformation is also checked for clashes within the label itself. If internal clashes occur, the corresponding conformation is discarded.

Fig. 2
figure 2

a The mtsslWizard GUI is integrated into PyMOL (http://www.pymol.org). The mtsslWizard menu on the bottom right is marked with an asterisk. Depending on the selection in “Mode”, the appearance of the interface changes as shown in panels bd. b Menu for the “Search” mode, c menu for distance calculation, d “Copy and Move” menu allowing to transfer ensembles of conformations to symmetry related positions in multimeric proteins

Depending on the search parameters and the size of the protein, the calculation for one site typically takes on the order of 10–15 s on a standard laptop computer. For very large protein assemblies (>2,000 residues), calculation times on the order of 2–10 min may occur. Wherever the protein is a symmetric multimer, only one calculation per unique site is required; the distribution is easily transferred to other symmetry related positions by simply clicking in “Copy and Move” mode of the program.

Wherever the label is attached close to a hydrophobic surface pocket, conformations that place the label inside the pocket might be particularly stable. Such a fixed conformation would give a very different distance distribution than models assuming a flexible spin label. Identifying such a pocket would be key to correctly interpreting PELDOR data and avoiding “identifying” a conformational change where none occurred. As a possible solution to this problem, mtsslWizard sums the number of atoms that reside outside the vdW cutoff of a non-clashing conformation but are still within a 4.5 Å sphere around the same label conformation. We then use the number of these contacts as a surrogate for fit to the protein surface. The conformation with the highest number of such contacts is flagged as a “snuggly fit”, as are all those conformations that have at least 75% of this maximum value (see Spa15 example, below). The “snuggly fit” algorithm is only activated when the program runs with the thoroughness set to “painstaking” since otherwise the conformational space is not sampled well enough for this feature to work properly. We would like to stress that the “snuggly fit” algorithm merely flags up possibilities for the scientist to further investigate.

MtsslWizard was initially designed as a tool to assist in the selection of labeling sites for PELDOR experiments. The program gives the user an immediate idea if a planed labeling site is too occluded for the attachment of MTSSL and if the distance between two planed labeling positions will fall into the PELDOR window of 15–80 Å. Furthermore, we found that the program is very useful to interpret PELDOR data if two or more models for a protein structure are available. In this case, a comparison of the experimental distance distributions with the in silico data can often help to decide which of the available models shows the best agreement with the experimental data.

2.2 Accounting for Conformational Changes at the Labeling Site

Almost by definition, SDSL is performed on solvent-accessible surface residues; however, the local environment of the protein structure can change due to the introduction of the spin label. A good example for this is the spin-labeled Spa15 structure which shows some significant conformational changes compared to the wild-type structure [24, 32]. It is also important to remember that the protein structure, especially with respect to surface residues is not as static as it appears in a molecular graphics program. Surface residues are often not well defined in electron density or have multiple conformations and as yet there is no simple way to capture such uncertainty. If no experimental electron density map is available, crystallographers rely in such cases. Further, crystal packing can introduce conformational artifacts of individual surface residues. Nuclear magnetic resonance (NMR)-derived structures are a very important source of data and model structural variability explicitly by producing a structural ensemble. Any approach which implicitly assumes that a single structure is a perfect model for the solution state of the protein runs the risk of errors, since clashes in the static model may not exist in the solution structure. It is for this reasons that we have allowed the user control over two parameters in mtsslWizzard: “vdW cutoff” and “allowed clashes”, which allow conformations with apparent label/protein clashes. The “vdW cutoff” determines when a contact between label and protein is counted as a clash and is by default set to “3.4 Å” which is a typical distance (carbon–carbon) between residues in the hydrophobic core of a protein structure. This value can be changed by the user between values of 2.6 Å (length of a typical hydrogen bond, ignoring the hydrogen) and 3.4 Å. The mtsslWizard employs this cutoff value for all atom types. Although vdW radii differ between, e.g., oxygen (1.52 Å) and sulfur (1.80 Å), we found that the error of the distance prediction (~3.5 Å) of mtsslWizard and, e.g., MMM dwarfs this difference by far. The “allowed clashes” parameter is by default set to “0” but can be adjusted to values between “0” and “5”. As mentioned above, a clash is here defined as a violation of the above described “vdW cutoff” and does not mean that two atoms share exactly the same position in space. A conformation that seriously protrudes into the protein surface would have far more than five such clashes. We would envisage such parameters being varied for low-resolution structures, flexible regions, for NMR structures or homology models from SWISS-PROT or if it is known that a certain site can be labeled but no conformations can be found using the default parameters. Again we stress that the scientists have to exercise their judgment with respect to what degree they deviate from the defaults, and the program specifically cautions when the parameters are varied.

2.3 Distance Calculation

Once two or more sites of a protein have been labeled in silico, the mtsslWizard is switched into “Distance” mode and all distances between a pair of label ensembles are determined (N–N distances between nitroxide groups), displayed and written out to a file if desired (Fig. 2c). For a quick overview, a text-based histogram of all distances is calculated and displayed together with statistical information (arithmetic mean, median, the shortest, and the longest distance) and the Cβ–Cβ distance between the selected sites. The speed of the distance calculation depends on the number of conformations at the two selected sites but typically takes ~10 s.

2.4 Comparison of MtsslWizard Predictions with Experimental Data

2.4.1 Example 1: T4 Lysozyme and H3–H4 Histone

We tested mtsslWizard against a dataset of 52 published PELDOR and CW-EPR-derived distances of double spin-labeled T4 lysozyme (T4L) and (H3–H4)2 histone [7]. The program was run with the default settings except for the cases indicated in Tables 1, 2. In these cases the program could not find any conformations that do not clash with the protein using the default settings. Since the sites could be labeled experimentally the vdW cutoff parameter was adjusted in these cases. In the tables and wherever available, the weighted mean (<r>) and the most probable/peak distances r pk are listed. The reason for the usage of the mean distance is that the value for the mean distance is more stable in DEER analysis-derived distance distributions than the peak distance or the asymmetry of the distribution, especially in the case of broad distributions [20]. The residual plots in Fig. 3 show that on average, mtsslWizard gives a significantly better (<Δ r > = 1.2 Å, σ(Δ r ) = 3.0 Å) estimate of the PELDOR experimental distance than the Cβ–Cβ distance (<Δ r > = −6.4 Å, σ(Δ r ) = 3.5 Å) and performs on par with the MMM program [27] (<Δ r > = −0.9 Å, σ(Δ r ) = 3.3 Å). Here, <Δ r > is the mean value of the difference between in silico and experimentally derived distance and σ(Δ r ) is its standard deviation. Thus, mtsslWizard is “safe” to use, it is not worse than MMM, the current gold standard on average. Bowman et al. [7] used MD simulations to predict the PELDOR distance distributions of the histone distances in our test dataset. The data in Table 2 and Fig. 3 show that on average and within error, mtsslWizard and MMM perform very similar to this more sophisticated and time-consuming approach. The correlation plot for the Cβ–Cβ distances reveals (Fig. 3a) that the usage of Cβ–Cβ distances underestimates the experimental distance by about 6.5 Å. Adding this value to the Cβ–Cβ distances leaves a standard deviation of σ(Δ r ) = 3.5 Å that is not much worse than any of the simulation approaches. Thus, as a rule of thumb, 6.5 Å can be added to any Cβ–Cβ distance to get a good idea of which experimental distance is to be expected between a pair of MTSSL spin labels. However, the benefit of simulation programs is the calculation of distributions that can be compared to experimental data.

Table 1 EPR-derived distances of spin-labeled T4L (PDB entry 2LZM, [33]) compared to predictions from mtsslWizard and MMM in 298K and 175K mode
Table 2 PELDOR derived distances of spin-labeled H3-H4 histone [7] (PDB entry 1TZY, [34]) compared to distances predicted by mtsslWizard and MMM in 298 K and 175 K mode
Fig. 3
figure 3

PELDOR distances derived from the T4L and Histone datasets (compare Tables 1, 2) are plotted against different in silico predictions. The x-axis shows the experimental distance and the y-axis the difference of the experimental value to the prediction (Δ r ). The ideal y = 0 line is marked in red, the areas corresponding to different prediction errors are shaded with different colors (green: ≤3 Å, yellow: ≤5 Å, white: >5 Å). Each graph contains two plots, white circles show a comparison of the highest peaks of experiment and prediction (r(EPR)pk/r(Pred)pk) and black squares compare averages of experiment and predictions (r(EPR)avg/r(Pred)avg). The two vertical histograms (dark gray: EPRavg/Predavg; light gray: EPRpk/Predpk) on the right of each plot visualize the spread of the residuals for both plots. The average difference (<Δ r >) between prediction and experiment and the standard deviation of this value (σ(Δ r )) are given in the table to the right of each plot. The average difference ( r >) is also marked by blue horizontal lines in the histograms. The blue line in the residual plots (left) corresponds to the average difference for EPRavg/Predavg. The MD values were taken from [7]

2.5 Unrestricted Search versus Rotamer Libraries

For the comparison above, MMM was run in default mode (assuming a temperature of 175 K) and in 298 K mode. In 175 K mode we found that MMM often produces very sharp but spiky distance distributions, which can be difficult to interpret and that are usually not found in PELDOR experiments. At 298 K the MMM distributions are much more smooth and rather wide, which means that most of the MTSSL rotamers that MMM predicts for a certain site are assumed to be populated at this temperature (Supplementary Figure 1). In essence, this comes close to the mtsslWizard strategy of sampling the whole conformational space of the label at this site and calculating an average distance. Furthermore, Fig. 3 shows that for both mtsslWizard and MMM the predictions within our dataset are more reliable (smaller standard deviation) if the average of the calculated distribution is compared to the average of the experimental distribution as opposed to the peak values of experiment and prediction. This is especially obvious for MMM when run in 178 K mode where the weighting that MMM applies to the distinct rotamers becomes more apparent. This might on one hand be due to errors in the experimental distance distributions themselves (see above and [20]) or on the other hand this may indicate, that the MTSSL rotamer libraries need to be combined with a more sophisticated energy scoring function taking the protein environment, solvent and electrostatics explicitly into account. This would however come at great computational costs and only slight variations between experiment and simulation (ionic strength, pH, temperature, concentration of ethylene glycol,…) might render the outcome of such efforts futile. The influence of the protein environment on the conformation of the label becomes apparent in Fig. 4 which shows Chi-angles from PDB entries that contain MTSSL in comparison with the computationally derived rotamer library used in MMM [27]. Although the overall fit of the computational library to the experimentally observed conformations is good, there are experimental Chi-angles that are not contained in the libraries or are predicted to have a very low probability. For whatever reason—the local environment of the spin label seems to promote Chi-angles that are energetically unfavorable for the free label. We believe that such Chi-angles ought not to be excluded without good reason. Thus, mtsslWizard searches all possible Chi-angles restricted only by vdW clashes.

Fig. 4
figure 4

Comparison of MTSSL Chi-angles taken from available X-ray structures with a computationally produced library for the free spin label [27]. The datasets are represented by polar histograms. The experimentally found angles are color coded as follows: Chi1-blue, Chi2-red, Chi3-green, Chi4-yellow, Chi5-purple. The computed Chi-angles are overlaid as gray polar histograms and are taken from Polyhach et al. [27] Experimental Chi-angles where taken from the following PDB entries: 1zwn, 1zyt, 2cuu, 2igc, 2nth, 2ou8, 2ou9, 2q9d, 2q9e, 3g3v, 3g3w, 3g3x, 3ifx, 3m8b, 3m8d, 2w8h. In addition, the Chi-angles reported by Langen et al. [23] were used

2.5.1 Example 2: Potassium Ion Channel KcsA

In 2009, Endeward et al. [11] analyzed spin-labeled KcsA (KcsA R64C) by PELDOR spectroscopy and derived orientation of the label from PELDOR data. We used the published results as another test for mtsslWizard and tried to predict the experimental data. The program was run using the KcsA structure by Zhou et al. [35]. Figure 5 shows the resulting spin-label ensembles. Since the label is situated in a cone-shaped cavity, the spread of the allowed conformations is rather small. The 1–2 and 1–3 distances were then calculated with mtsslWizard and a distance histogram was prepared. The maximum of the distance histogram coincides exactly with the published PELDOR results of 22 and 31 Å for 1–2 and 1–3, respectively.

Fig. 5
figure 5

Example 2: the KcsA potassium transporter spin-labeled with mtsslWizard. Left: Found conformations of MTSSL are shown as green sticks. The O1 oxygens of MTSSL are shown as red spheres. Middle: distance histograms (green) produced for distances 1–2 and 1–3. The experimental PELDOR distance distributions [11] are shown as red line. Right: polar histogram plots of all Chi-angles from the MTSSL conformations shown in the left part of the figure

We also prepared polar histograms for the Chi-angles of the mtsslWizard ensembles to compare them to values found experimentally by Endeward et al. [11]. Interestingly, Chi1 and Chi2 of the predicted ensemble are close to the experimental values of −60 and 180° indicating that in this case the protein environment drives the label into this conformation (Fig. 5).

2.5.2 Example 3: Type III Secretion Chaperone Spa15

The type III secretion effector chaperone Spa15 is an interesting test case because a crystal structure of the spin-labeled protein shows the label to reside in a confined hydrophobic surface pocket. This led to unexpected PELDOR results and the MMM program [27] was reported as failing to model the observed data, even when the crystal structure of the spin-labeled chaperone was used for the modeling efforts [24].The mtsslWizard was run with the spin-labeled Spa15 structure and a broad spread of possible conformations was found with the most frequent distance at ~56 Å. This is more than 10 Å longer than the experimentally determined PELDOR distance of 45 Å, and similar to the discrepancy that occurs with the MMM approach (Fig. 6) [24].

Fig. 6
figure 6

Example 3: The Spa15 chaperone. a All MTSSL conformations predicted by mtsslWizard are shown as green sticks. Note that the program was run in “painstaking” mode to activate the “snuggly fit” feature and the vdW cutoff was lowered to 2.8 Å. The MTSSL conformation found in the X-ray structure (2×ga) is colored yellow). The molecular surface of the protein is colored according to atom type. Carbon: gray, oxygen: red, nitrogen: blue. b Same as (a), but for the subset of conformations (salmon) that were flagged as “snuggly fit” by mtsslWizard. The conformation observed in the crystal structure is thicker and highlighted in yellow. c The respective distance histograms after labeling both monomers of the Spa 15 dimer (for visualization purposes, both are individually scaled to values between 0 and 1). Experimental PELDOR data were digitized from [24] and is shown as red line. The inset shows a magnification of the overlapping part of the two histograms indicating that the “snuggly fitting” conformations are truly a subset of all conformations. d MtsslWizard results compared to the X-ray structure of spin-labeled (pos 118) T4L. The “snuggly fit” algorithm (red sticks) predicts the location of the spin label in a hydrophobic surface pocket, as found by crystallography. Conformations that were not flagged as “snuggly fit” by the program are colored green

This problem is an ideal test case for our above mentioned “snuggly fit” algorithm and in the case of Spa15 the subsets of snuggly fitting conformations are located in the surface pocket and match quite well with the location of the label observed by crystallography (Fig. 6). The distance histogram derived from the snuggly fitting conformations peaks at 46.5 Å, close to the experimentally observed distance of 45 Å (Fig. 6). Note that the program was run in “painstaking” mode to activate the “snuggly fit” feature and the vdW cutoff was lowered to 2.8 Å. Another example that demonstrates the usability of this approach is T4L spin labeled at position 118. A crystal structure of this spin-labeled mutant was solved and shows the spin label buried in a surface cavity [15]. We “spin labeled” this position with mtsslWizard and the 15 conformations (of 265 total found conformations) that were flagged as “snugglyFits” by the program are all contained within the surface pocket close to the position found by crystallography (Fig. 6d).

2.5.3 Example 4: The Escherichia coli Capsule Export Channel Wza; Limitations of Current Snuggly Fit Approach

Wza is a bacterial outer membrane protein that has been crystallized with and without attached spin label [10, 16]. An MTSSL spin label attached to position Q335C of the protein was found to have two conformations by crystallography, explaining a pronounced shoulder in the PELDOR data [16].

We used mtsslWizard to see if the PELDOR results including the double peak can be predicted. Distances derived from the ensemble of found conformations closely coincide with the experimental data of 28.6 and 54 Å (main peaks in [16], Fig. 7). The pronounced shoulder that was observed in the PELDOR data is however not observed in the mtsslWizard results. Figure 7 shows that both crystallographic conformations are contained in the predicted conformations, but their contribution to the histograms is overwhelmed by the other possible conformations and thus no double peak is simulated. The distance histograms of the “snuggly fitting” conformations do show multiple peaks, but deviate from the observed distances (Fig. 7). Although the simple scoring of “snuggly fit” does fail in this case, the approach does provide models for user inspection and possible further analysis.

Fig. 7
figure 7

Example 4: The octameric outer membrane channel Wza. a All conformations found by mtsslWizard are shown in translucent and surrounded by a green outline to indicate the size of the whole distribution. Those conformations flagged as “snuggly fits” are shown as red sticks. The two conformations from the X-ray structure are colored yellow. The molecular surface of the protein is colored according to atom type. Carbon: gray, oxygen: red, nitrogen: blue. b The corresponding histograms for the 1–2 and 1–3 distances of Wza. The histograms for all conformations are filled green; the histograms for the “snuggly fit” conformations are colored red. Experimental PELDOR data were taken from [16]. Note that both the red and green histograms are individually normalized, otherwise the “snuggly fit” histogram would not be visible on the chosen y-axis scale

In the Wza mutant Q335C the label is attached to the bottom of a relatively large hydrophilic crevice in the molecular surface (Fig. 6, compare molecular surface to Fig. 5). In comparison to the Spa15 and T4L examples above, this shows that our current “snuggly fit” approach may only be valid for hydrophobic pockets and less reliable for hydrophilic environments. Intuitively this seems reasonable as vdW interactions will sum but polar interactions have important geometric considerations or can be repulsive. Improvement of the “snuggly fit” algorithm can come from two sources, further increase in experimental data, which defines pockets that the label can occupy, or a more sophisticated scoring function. A more sophisticated scoring algorithm would differentiate between repulsive and attractive forces and treat polar atoms differently. The challenge for future development is to balance an increasingly sophisticated treatment of label–protein interactions against simplicity and speed.

2.6 Interpreting MtsslWizard Results

The T4L and histone examples show that the mean distance calculated by mtsslWizard predicts the experimental values quite well and we found that in many cases the shape of the distance histogram derived from mtsslWizard is also a reasonable match to the PELDOR derived distance distribution (Supplementary Figure 1). Since mtsslWizard is based on the “tether-in-a-cone” approach [1, 14, 17, 18] the logical implication is that indeed in many cases the spin label does sample a range of conformations.

Wherever the spin label is not disordered but adopts a fixed or restricted set of conformations due to interaction with the protein, only a fraction of the possible conformations that mtsslWizard finds will match the observed data. Additional parameters such as the surface complementarity score (“snuggly fit”) have been introduced to improve the predictions (example 3: Spa15). We also found that it is often helpful to interpret mtsslWizard results in the light of room temperature CW spectra because the latter contain experimental information about the flexibility of the spin label and would thus hint to whether a label is indeed trapped in a pocket or pointing into the solvent.

3 Conclusion

MtsslWizard is a simple to use and freely available program that allows users to generate plausible for MTSSL labels attached to a protein, to measure distances between labels and indeed to vary parameters to allow superficially implausible models. We have argued above that there are sound reasons (e.g. errors or uncertainties in experimental structures) not to assume each side chain of an amino acid is positioned correctly (especially on the protein surface).

As we would predict our approach produces similar results to MMM in most cases, any benefit in such cases would derive only from it being a PyMOL plugin, rather than MATLAB. Importantly, the integration of the program into a freely available, sophisticated and powerful molecular graphics program like PyMOL makes it easy for users to critically evaluate mtsslWizard results. We believe the advantage of our program occurs when PELDOR data does not match the simple model. Incorrectly interpreting disagreement between PELDOR data and a distance model in functional terms must be avoided. Determining crystal structures to resolve such anomalies is definitive but may not be possible in each case. Our approach allows the scientist to explore whether the measured distances are plausible from a range of models. As we show the program does a very good job for a number of test cases where there are unusual positions of the spin label, we stress the program is intended to provide a basis for further analysis and experimentation. By eliminating alternative explanations for discrepancy between model and data, scientists will be able to confidently identify conformational changes.

The source code of mtsslWizard is freely available at htp://www.pymolwiki.org/index.php/mtsslWizard.