Spectroscopically Orthogonal Labelling to Disentangle Site-Specific Nitroxide Label Distributions

Biomolecular applications of pulse dipolar electron paramagnetic resonance spectroscopy (PDS) are becoming increasingly valuable in structural biology. Site-directed spin labelling of proteins is routinely performed using nitroxides, with paramagnetic metal ions and other organic radicals gaining popularity as alternative spin centres. Spectroscopically orthogonal spin labelling using different types of labels potentially increases the information content available from a single sample. When analysing experimental distance distributions between two nitroxide spin labels, the site-specific rotamer information has been projected into the distance and is not readily available, and the contributions of individual labelling sites to the width of the distance distribution are not obvious from the PDS data. Here, we exploit the exquisite precision of labelling double-histidine (dHis) motifs with Cu(II) chelate complexes. The contribution of this label to the distance distribution widths in model protein GB1 has been shown to be negligible. By combining a dHis Cu(II) labelling site with cysteine-specific nitroxide labelling, we gather insights on the label rotamers at two distinct sites, comparing their contributions to distance distributions based on different in silico modelling approaches and structural models. From this study, it seems advisable to consider discrepancies between different in silico modelling approaches when selecting labelling sites for PDS studies.

Supplementary Information The online version contains supplementary material available at 10.1007/s00723-023-01611-1.


Mass Spectrometry
Acquired ESI mass spectra show only a single peak at the expected mass of the protein (unlabelled or labelled with the four nitroxides, respectively). The only exception is the 6C MTSL sample, where a lower-intensity peak belonging to the unreacted protein can also be detected (Fig. S1). For the GB1 I6H/N8H/K28C control construct, a smaller peak corresponding to the dimer (12487 Da) can be observed. However, no dimer peak was detected in any of the labelled samples, leading us to assume that dimer formation occurred in the time between removing the DTT and performing the mass spectrometry measurement. The expected and experimental masses for the different samples are reported in Table S1.

Continuous Wave EPR Spectroscopy
Individual CW EPR spectra and corresponding labelling efficiencies are given in Fig. S2 and Table S2.
Quantitative labelling was obtained for all but the MTSL labelling of the I6C/K28H/Q32H construct, where the labelling efficiency was found to be lower, consistent with what was observed from the ESI-MS spectra.

RIDME raw spectra
RIDME raw spectra for both the variable and constant time RIDME, their reference background traces and the superimposed traces after deconvolution are reported in Fig. S3.

Sensitivity data for vtRIDME and ctRIDME
We employed both ctRIDME and vtRIDME to compare the performance of the two pulse sequences and to gain confidence in the general applicability and robustness of the variable-time sequence. To this end, the modulation depth extracted from each RIDME trace during processing in DeerAnalysis was divided by the noise (RMSD), calculated from the phase-corrected imaginary part of the data (Table S3). This ratio was used to obtain the sensitivity per echo and the sensitivity per unit of time for each trace (Table S4), as reported previously [1]. As expected, higher sensitivity values were achieved with vtRIDME, especially when considering the ratio between the non-deconvoluted sequences.
Table S3 Noise estimates (RMSD), modulation depths (Δ) and the sensitivity obtained from their ratio (S) for the two RIDME pulse sequences (variable or constant time), with and without deconvolution, for the two GB1 constructs (I6C/K28H/Q32H and I6H/N8H/K28C) with the four nitroxide labels MTSL, MPSL, IPSL and IDSL

Table S4 Normalized sensitivity values (Sn), obtained by dividing the sensitivity (S, Table S3) by the square root of the total number of echoes per point (taken as the product of the number of scans (1), shots per point (2), number of τ averages (16) and phase cycle (32)), and sensitivity per unit of time (St), obtained by multiplying the Sn values by the square root of the averaging rate (100 Hz). The values were extracted for both the constant- and variable-time RIDME, with and without deconvolution, for both GB1 constructs I6C/K28H/Q32H and I6H/N8H/K28C with the four nitroxide labels MTSL, MPSL, IPSL and IDSL. Sratio represents the ratio between the sensitivity values for the constant- and variable-time RIDME, for the deconvoluted or non-deconvoluted data, respectively.
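
The sensitivity bookkeeping described in the captions of Tables S3 and S4 can be sketched in a few lines. This is only an illustrative Python sketch, not the processing actually used for the tables; the function names and the numerical inputs in the usage lines are hypothetical placeholders and are not values from Table S3.

```python
import math


def sensitivity(modulation_depth, noise_rmsd):
    """S: modulation depth divided by the noise estimate (RMSD)."""
    return modulation_depth / noise_rmsd


def normalized_sensitivity(s, scans, shots_per_point, n_averages, phase_cycle_steps):
    """Sn: S divided by the square root of the total number of echoes per point."""
    echoes_per_point = scans * shots_per_point * n_averages * phase_cycle_steps
    return s / math.sqrt(echoes_per_point)


def sensitivity_per_unit_time(s_n, averaging_rate_hz):
    """St: Sn multiplied by the square root of the averaging rate."""
    return s_n * math.sqrt(averaging_rate_hz)


# Hypothetical inputs for illustration only (not taken from Table S3).
s = sensitivity(modulation_depth=0.10, noise_rmsd=0.001)
s_n = normalized_sensitivity(s, scans=1, shots_per_point=2,
                             n_averages=16, phase_cycle_steps=32)
s_t = sensitivity_per_unit_time(s_n, averaging_rate_hz=100.0)
print(s, s_n, s_t)
```

Keeping the three quantities as separate functions mirrors the two-table layout: S depends only on the trace, whereas Sn and St also depend on the acquisition settings.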

Distance distributions extracted from vtRIDME and ctRIDME with and without deconvolution
As expected for systems with short distances and relatively narrow distance distributions, the two RIDME pulse sequences do not display significant discrepancies in shape, mean or width (Table S4), showing high consistency and robustness regardless of whether the deconvolution step has been applied. However, the distance distributions from vtRIDME without the deconvolution step (in green) are the most affected by artifacts at longer distances. The superimposed distance distributions for both GB1 constructs, each with the four distinct nitroxide labels, are reported in Fig. S5. All mean and width values of the experimental distance distributions were extracted with an in-house Matlab script. The values were calculated considering only the data between 1.7 and 4.5 nm, to suppress the influence of artifacts at longer or shorter distances that greatly affect these values (Table S5). Mean values for the different experimental setups on the same sample can be considered consistent and robust with respect to each other. On the other hand, vtRIDME suffers the most from artifacts, which is reflected in the consistently higher values of the extracted widths.
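
The effect of restricting the analysis to the 1.7 to 4.5 nm window can be illustrated with a short sketch. The in-house Matlab script is not reproduced here; the Python code below is a stand-in that assumes a uniformly spaced distance axis and takes the "width" to be the standard deviation of the distribution, which is an assumption rather than a definition given in the text. The function name and the synthetic test distribution are hypothetical.

```python
import numpy as np


def windowed_mean_width(r, p, r_min=1.7, r_max=4.5):
    """Mean and standard deviation of P(r) restricted to [r_min, r_max] (nm).

    Assumes a uniformly spaced distance axis r, so that normalized
    probabilities can be used directly as weights.
    """
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = (r >= r_min) & (r <= r_max)
    r_w, p_w = r[mask], p[mask]
    total = p_w.sum()
    if total <= 0:
        raise ValueError("distribution has no intensity inside the window")
    w = p_w / total                      # renormalize inside the window
    mean = float(np.sum(w * r_w))
    width = float(np.sqrt(np.sum(w * (r_w - mean) ** 2)))
    return mean, width


# Synthetic example: a main peak at 3.0 nm plus a small artifact at 5.5 nm.
r = np.linspace(1.5, 6.0, 451)
p = np.exp(-0.5 * ((r - 3.0) / 0.2) ** 2) + 0.3 * np.exp(-0.5 * ((r - 5.5) / 0.1) ** 2)
print(windowed_mean_width(r, p))              # window excludes the artifact
print(windowed_mean_width(r, p, 1.5, 6.0))    # full range includes it
```

In this synthetic case the long-distance artifact inflates both the mean and the width when the full range is used, which is the behaviour the restricted window is meant to suppress.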

Table S5 Mean and width values (reported in nm) extracted from the experimental distance distributions of the two RIDME pulse sequences, with and without deconvolution, for both GB1 constructs (I6C/K28H/Q32H and I6H/N8H/K28C) with the four spin labels MTSL, MPSL, IPSL and IDSL

However, in some cases restricting the range to between 1.7 and 4.5 nm is not enough to fully exclude artifacts, and some values are still affected by them, such as the width for 28C IDSL. Therefore, we monitored the full width at half maximum (FWHM) of every distance distribution peak, considering it a parameter less dependent on artifacts, to further demonstrate the high consistency and robustness of the two RIDME sequences (Table S6). Moreover, we extracted the mean and width values for every distance distribution from the lower error estimate obtained from DeerAnalysis (Table S7).
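
As an illustration of why the FWHM is less sensitive to artifacts than a moment-based width, a minimal Python sketch follows. It is not the procedure used for Table S6: it assumes the peak of interest is the global maximum of the distribution and locates the half-maximum crossings by linear interpolation, and the function name is hypothetical.

```python
import numpy as np


def fwhm(r, p):
    """FWHM of the highest peak of P(r), using linear interpolation
    between grid points at the two half-maximum crossings."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    i_max = int(np.argmax(p))
    if p[i_max] <= 0:
        raise ValueError("distribution has no positive maximum")
    half = p[i_max] / 2.0

    # Walk left from the maximum until the curve drops to half height.
    i = i_max
    while i > 0 and p[i] > half:
        i -= 1
    if p[i] > half:
        raise ValueError("no half-maximum crossing on the short-distance side")
    left = r[i] + (half - p[i]) * (r[i + 1] - r[i]) / (p[i + 1] - p[i])

    # Walk right from the maximum until the curve drops to half height.
    j = i_max
    while j < len(p) - 1 and p[j] > half:
        j += 1
    if p[j] > half:
        raise ValueError("no half-maximum crossing on the long-distance side")
    right = r[j - 1] + (p[j - 1] - half) * (r[j] - r[j - 1]) / (p[j - 1] - p[j])

    return float(right - left)


# Synthetic example: the small artifact at 5.5 nm does not change the FWHM
# of the main peak, whereas it would broaden a standard-deviation width.
r = np.linspace(1.5, 6.0, 451)
p = np.exp(-0.5 * ((r - 3.0) / 0.2) ** 2) + 0.3 * np.exp(-0.5 * ((r - 5.5) / 0.1) ** 2)
print(fwhm(r, p))
```

Because only the region around the main peak enters the calculation, features well separated from that peak leave the result unchanged.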

Comparison of the distance distributions between two nitroxide labels or between the copper(II) and the nitroxides
Data obtained previously on the double-cysteine GB1 construct (GB1 I6C/K28C) showed bimodality for the MTSL label and generally broader distributions for the MPSL and IPSL labels (Fig. S6). Distinguishing whether this behaviour depends on the nitroxide attached to the α-helix or the β-sheet became possible only after introducing the less conformationally flexible CuNTA chelator ligand at one of the two labelling sites in turn. The reduction in the distribution widths, brought about by the rigidity of this bipedal ligand, significantly improved the precision of the measured distances and reduced the ambiguity in interpreting the behaviour of the system, leading us to identify the helix site as the one responsible for the broadness and bimodality of the distance distributions.

Modelling with MMM and MtsslWizard
All modelled distance distributions obtained for the different structure prediction tools (AlphaFold2 (Fig. S7), OmegaFold (Fig. S8), ESMFold (Fig. S9)) and for the crystallographic structure (PDB: 4wh4 (Fig. S10)) are reported here and compared to the experimental distributions from the deconvoluted ctRIDME data set. All structural models behave in a similar manner, and the extensive discussion of the AlphaFold2 performance in the main manuscript can be transposed to the other prediction methods and to the crystallographic structure. Minor differences in the shapes of the distance distributions can be detected. Here we report a visual representation of the computed rotamers for the different labelling approaches and their respective conditions (ambient and cryogenic temperature for MMM, and Tight and Loose settings for MtsslWizard) for the nitroxides (Fig. S11) and for CuNTA (Tight and Loose settings for MtsslWizard, and as previously described for MMM [2]) (Fig. S12).

Correlation plots (Fig. S13 and Fig. S14) were obtained by extracting the mean and width of every distance distribution from the experimental and the in silico labelling data (Table S5). To quantify the discrepancies between the curves numerically, the differences between the experimental and in silico mean values were also extracted as absolute values (Δ mean) (Tables S8-S11). The correlation plots were built with the mean of the experimental distances on the x-axis and the mean of the in silico labelling on the y-axis. The widths of the experimental and simulated distributions were used as "error bars" for each data point. A first set of correlation plots investigates the dependence on the labelling approach (MtsslWizard Tight or Loose and MMM ambient or cryogenic temperature) of the four different nitroxide labels for the same structure prediction method. To compare the overall shapes of the experimental and in silico distance distributions, the rmsd values were calculated with in-house Matlab software (Tables S12-S15).

To better estimate which labelling approach most consistently predicts the experimental behaviour overall, we compared all the mean values of the distance distributions of both constructs with the four distinct nitroxide spin labels for a single in silico labelling approach against the same experimental data at once, extracting the global rmsd values (Table S16). This procedure was repeated for the two labelling approaches and their respective conditions (ambient and cryogenic temperatures for MMM, and Tight and Loose settings for MtsslWizard). All three structure prediction methods (AlphaFold2, OmegaFold and ESMFold) and the X-ray crystallographic structure were analysed. The same procedure was employed to analyse the distribution widths (Table S16).

Table S16 Global RMSD values, extracted by comparing all the mean values (for both GB1 constructs and the four nitroxide labels) for a single labelling approach with the experimental values. This was repeated for the two labelling approaches MMM and MtsslWizard with their respective conditions, cryogenic and ambient temperature for the former and Tight and Loose settings for the latter. All predicted structures (AlphaFold2, OmegaFold, ESMFold) and the X-ray crystallographic structure were considered. The same procedure was employed to analyse the distribution widths
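
The three comparison metrics used in this section (Δ mean, the shape rmsd between an experimental and a modelled distribution, and the global rmsd over a set of mean or width values) can be sketched as follows. This Python code is a stand-in for the in-house Matlab software, which is not reproduced here. The function names are hypothetical, and two choices are assumptions not stated in the text: the modelled distribution is interpolated onto the experimental distance axis, and both curves are normalized to unit sum before comparison.

```python
import numpy as np


def delta_mean(mean_exp, mean_sim):
    """Absolute difference between experimental and in silico mean distances."""
    return np.abs(np.asarray(mean_sim, dtype=float) - np.asarray(mean_exp, dtype=float))


def shape_rmsd(r_exp, p_exp, r_sim, p_sim):
    """RMSD between two distance distributions on the experimental axis.

    r_sim must be increasing; outside its range the modelled
    distribution is taken as zero.
    """
    r_exp = np.asarray(r_exp, dtype=float)
    p_exp = np.asarray(p_exp, dtype=float)
    p_sim_i = np.interp(r_exp, r_sim, p_sim, left=0.0, right=0.0)
    p_exp = p_exp / p_exp.sum()          # assumed normalization: unit sum
    p_sim_i = p_sim_i / p_sim_i.sum()
    return float(np.sqrt(np.mean((p_exp - p_sim_i) ** 2)))


def global_rmsd(values_exp, values_sim):
    """RMSD over paired values (e.g. the means of all constructs and labels)
    for one labelling approach against the experimental values."""
    d = np.asarray(values_sim, dtype=float) - np.asarray(values_exp, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))


# Hypothetical mean distances in nm, for illustration only.
exp_means = [2.0, 3.0]
sim_means = [2.1, 2.9]
print(delta_mean(exp_means, sim_means))
print(global_rmsd(exp_means, sim_means))
```

The same `global_rmsd` call applies unchanged to the distribution widths, matching the statement that the procedure was repeated for them.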

Fig. S6 Comparison of the distance distributions between two nitroxide spin labels on the double-cysteine mutant GB1 I6C/K28C (black), obtained with the 4-pulse DEER sequence, and the distributions between the CuNTA and the same nitroxides on the I6C/K28H/Q32H (blue) and I6H/N8H/K28C (red) GB1 constructs, respectively, obtained with the 5-pulse RIDME sequence. The shaded areas represent the confidence intervals (± 2σ)

Fig. S13 Correlation plots. Each point is plotted with the mean value of the experimental ctRIDME distance distribution on the x-axis and the respective in silico mean value on the y-axis. The error bars of each data point are derived from the experimental distance distribution width on the x-axis and the respective in silico width on the y-axis. The black line indicates the experimental trend. The different colors represent the different labelling approaches at different conditions: MMM at ambient temperature (orange), MMM at cryogenic temperature (green), MtsslWizard with Tight settings (red), MtsslWizard with Loose settings (blue). The different shapes correspond to the different nitroxide labels: circle MTSL, triangle MPSL, diamond IPSL and square IDSL. a) AlphaFold2, b) OmegaFold, c) ESMFold, d) X-ray (PDB: 4wh4)

Fig. S14 Correlation plots. Each point is plotted with the mean value of the experimental ctRIDME distance distribution on the x-axis and the respective in silico mean value on the y-axis. The error bars of each data point are derived from the experimental distance distribution width on the x-axis and the respective in silico width on the y-axis. The black line indicates the experimental trend. The different colors represent the different labelling approaches at different conditions: MMM at ambient temperature (orange), MMM at cryogenic temperature (green), MtsslWizard with Tight settings (red), MtsslWizard with Loose settings (blue). The different shapes correspond to the different structure prediction tools: circle AlphaFold2, triangle OmegaFold, diamond X-ray (PDB: 4wh4) and square ESMFold. a) 6C and 28C MTSL, b) 6C and 28C MPSL, c) 6C and 28C IPSL, d) 6C and 28C IDSL.

Root mean square deviation (RMSD) values

Table S7
Mean and width values (reported in nm) extracted from the lower error bound (LB) estimate of the experimental distance distributions of the two RIDME pulse sequences, with and without deconvolution, for both GB1 constructs (I6C/K28H/Q32H and I6H/N8H/K28C) with the four spin labels MTSL, MPSL, IPSL and IDSL

Table S12
Root mean square deviation (rmsd) values of the modelled distances for both MMM and MtsslWizard at both different conditions (ambient and cryogenic temperature and Tight and Loose settings), for both GB1 constructs I6C/K28H/Q32H and I6H/N8H/K28C, each respectively with the four nitroxides (MTSL, MPSL, IPSL and IDSL), for the AlphaFold2 structure

Table S13
Root mean square deviation (rmsd) values of the modelled distances for both MMM and MtsslWizard at both different conditions (ambient and cryogenic temperature and Tight and Loose settings), for both GB1 constructs I6C/K28H/Q32H and I6H/N8H/K28C, each respectively with the four nitroxides (MTSL, MPSL, IPSL and IDSL), for the OmegaFold structure

Table S14
Root mean square deviation (rmsd) values of the modelled distances for both MMM and MtsslWizard at both different conditions (ambient and cryogenic temperature and Tight and Loose settings), for both GB1 constructs I6C/K28H/Q32H and I6H/N8H/K28C, each respectively with the four nitroxides (MTSL, MPSL, IPSL and IDSL), for the ESMFold structure