Detector sensitivities of high-field and relaxometry data
In the previous section, we showed that it was possible to obtain detector sensitivities from optimized linear combinations of experimental sensitivities to internal motion of the protein. If the center of a detector sensitivity, \(\rho_{n}^{{{\text{solu}}{.}}} (z_{{\text{i}}} )\), is shifted to longer correlation times, that detector is able to better characterize slower motions with the corresponding detector responses. If we obtain more and narrower detectors at long correlation times, we can describe slow motions with better resolution. However, the sensitivities obtained for a set of detectors depend on the sensitivity of the experimental rate constants to internal motion, and this in turn depends on the correlation time of the molecular tumbling. Therefore, we begin with a theoretical investigation of the effect of molecular tumbling, where we optimize the sets of detector sensitivities for three different correlation times for tumbling. For each correlation time of molecular tumbling, we optimize detectors using either only high-field data or high-field data combined with HRR data. Experimental sensitivities are shown in Fig. 2a, for three correlation times for molecular tumbling \(\tau_{{\text{r}}}\) = 1 ns, 5 ns, and 25 ns.
For each value of \(\tau_{{\text{r}}}\), a set of detectors is optimized from the experimental sensitivities either without (Fig. 2b) or with (Fig. 2c) the relaxometry data. When optimizing detectors without relaxometry (high-field data set), we include longitudinal R1, transverse R2, and dipolar cross-relaxation rates σHC at fields of 400, 600, 800, and 950 MHz, yielding a total of 12 experiments. When including relaxometry, the 12 high-field experiments are considered together with an additional 20 R1 relaxometry experiments with fields in the range of 13.5–170 MHz. Note that the number of detectors is chosen to limit the error of the resulting detector responses: we previously demonstrated how to optimize detectors using singular value decomposition (Smith et al. 2019a), and demonstrated that if N detectors are used to analyze a data set, then the error depends on the inverse of the largest N singular values obtained during detector optimization. Therefore, we may define a threshold for the inverse singular values, choosing N so that all of the N inverse singular values remain below the threshold. In SI Sect. 3, we use the experimental data to determine a reasonable threshold for the inverse singular values (resulting in a value of 0.1). This threshold is applied when determining the number of detectors to use in Figs. 2b, c and 4. The value of the threshold is marked in Fig. 5.
For a small molecule (\(\tau_{{\text{r}}}\) = 1 ns), we see that detector sensitivities obtained with and without relaxometry are almost identical (Fig. 2b/c, left). The sensitivities of the relaxometry data are similar to those of R2, so that we have little to gain from relaxometry in terms of measuring slower motion. For a small protein with a correlation time for molecular tumbling \(\tau_{{\text{r}}}\) = 5 ns (middle), the sensitivities of relaxometry data are shifted to slightly longer correlation times in the low nanosecond range. This shift of detectors towards longer correlation times with relaxometry predicts the possibility to better characterize motions in the low nanosecond range. For larger proteins, for instance with \(\tau_{{\text{r}}}\) = 25 ns (right), detector sensitivities obtained with relaxometry extend to much longer correlation times, and we see that the relaxometry data (Fig. 2a, right) has sensitivities extending to significantly longer correlation times. On the other hand, we obtain an unusual shape for one of the detectors obtained with high-field data (Fig. 2b, right, bold blue line). Non-zero sensitivity of this detector both at short correlation times (\(\tau_{{\text{i}}}\) < 30 ps) and longer correlation times (local maximum at 13 ns) indicates that the experimental data cannot unambiguously separate motions at these timescales.
This problem arises because none of the experimental data is sufficiently different at these two ranges of correlation times to distinguish the motion: R2 is uniformly sensitive to fast motion due to attenuation of couplings, resulting in reduction of relaxation from molecular tumbling. This contribution is eventually canceled out for longer correlation times (normalized sensitivity reaches −0.5 at 25 ns, or approximately \(\tau_{{\text{r}}}\)), where the slower motion makes contributions to J(0). On the other hand, high-field sensitivities of the rate constants σHC and R1 are most sensitive in the range of 33 ps < \(\tau_{{\text{i}}}\) < 6 ns (i.e. having a sensitivity at least 0.5 times the corresponding maximum sensitivity). Comparing these ranges of correlation times, we see that there is not enough difference in the high-field R2, R1, and σHC sensitivities for the ranges \(\tau_{{\text{i}}}\) < 30 ps and 6 ns < \(\tau_{{\text{i}}}\) < 25 ns to produce an unambiguous detector sensitivity. Thus, the resulting detector is sensitive at short correlation times, but has an additional hump around 13 ns. When we only have high-field data, we cannot differentiate these two timescale regimes; responses in ρ1 in this case can be due to fast motion, slow motion, or a combination of both motions. However, if relaxometry data is available, we can directly sample the timescale window between ~ 6 and ~ 25 ns and therefore obtain separated detector windows (Fig. 2c, right). When fitting the spectral density function with a small number of correlation times (e.g. with model-free or other approaches) to the same set of experiments, the same ambiguity exists. Consider that one typically tries to find a fit that uses the minimum number of parameters. In many cases, restricting the number of parameters can lead to a unique best fit. Of course, this does not guarantee that the true distribution of motion is not more complex than determined by such a fit. 
If more parameters were allowed, one would find that the fit is not unique at all. Detectors, on the other hand, allow for an arbitrary number of motions present. Not knowing the number of motions will result in greater ambiguity in the characterization of dynamics, which becomes apparent when optimizing detector sensitivities. This ambiguity may not appear at all when fitting data using a given model with a minimum number of parameters.
Relaxation rates at low field enhance the resolution of motions in the low nanosecond range. Relaxometry experiments are not generally sensitive to motions significantly slower than can be observed via transverse relaxation at high field (Fig. 2a). For example, consider the behavior of the transverse relaxation rate constant sensitivities in Fig. 2a for \(\tau_{{\text{r}}}\) = 5 ns and 25 ns. The sensitivities of relaxometry and transverse relaxation rate constants extend to equally long correlation times for \(\tau_{{\text{r}}}\) = 5 ns, and R2 is sensitive to slightly longer correlation times than the relaxometry experiments for \(\tau_{{\text{r}}}\) = 25 ns (although using lower fringe fields could extend relaxometry sensitivity further). Therefore, the resulting detectors should, in principle, be able to identify similarly slow motion with or without relaxometry. For \(\tau_{{\text{r}}}\) = 5 ns, the first four detectors are nearly the same using only high-field data (Fig. 2b, middle) or relaxometry and high-field data (Fig. 2c, middle). With relaxometry data, we obtain two additional detectors, versus only one additional detector when using only high-field data. These are shown in more detail in Fig. 3a. At a glance, it appears that ρ6 obtained using relaxometry is sensitive to motions significantly slower than can be measured with ρ5 using high-field data only. Upon closer inspection, we find that the sensitivity of ρ5 obtained for the high-field data set can actually be split to yield ρ5 and ρ6 of the full data set, as illustrated in Fig. 3b. ρ6 of the full data set is not sensitive to motions slower than observed with ρ5 of the high-field data; instead, ρ5 and ρ6 of the full data set simply describe slow motions with higher resolution, allowing us to better determine the timescale of such motions. 
This additional resolution is critical in estimating correlation times: if, for a given residue, we only detect motion with the detector that is sensitive to the longest correlation times (e.g. ρ5 for the high-field data set), then we have very little information regarding its true correlation time. The reason is that the measured response may be due to motion near the center of the detector’s sensitivity, but it may also be due to significantly slower motion with higher amplitude (which is then partially masked by the molecular tumbling). Relaxometry data leads to enhanced detector resolution, which allows one to better estimate correlation times of slow motions.
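The ambiguity of a single detector response, and the way a second detector reduces it, can be illustrated with a toy calculation. The sketch below is only an illustration under loud assumptions: `toy_sensitivity` is a hypothetical bell-shaped function on a logarithmic correlation-time axis, not one of the optimized sensitivities of Figs. 2 to 4, and the amplitudes and the 20 ns correlation time are made-up numbers. The detector centers (4.4 ns, 2.4 ns, 12 ns) are borrowed from the experimental analysis purely as convenient values.

```python
import numpy as np

def toy_sensitivity(tau, center, width=0.5):
    """Hypothetical bell-shaped detector sensitivity on a log10(tau) axis.

    Equals 1 when tau == center; `width` is given in decades.
    This shape is an assumption made for illustration only.
    """
    return np.exp(-((np.log10(tau) - np.log10(center)) ** 2) / (2 * width**2))

# --- One detector (center 4.4 ns): two different motions, identical response ---
center_single = 4.4e-9
amp_near, tau_near = 0.10, 4.4e-9   # low-amplitude motion at the detector center
tau_far = 20e-9                      # much slower motion (assumed value)
# Amplitude the slower motion needs in order to produce the same response
amp_far = amp_near / toy_sensitivity(tau_far, center_single)

resp_near = amp_near * toy_sensitivity(tau_near, center_single)
resp_far = amp_far * toy_sensitivity(tau_far, center_single)

# --- Two detectors (centers 2.4 ns and 12 ns): the ratio depends only on tau ---
def response_ratio(tau, center_a=2.4e-9, center_b=12e-9):
    """Ratio of the responses of detector b to detector a (amplitude cancels)."""
    return toy_sensitivity(tau, center_b) / toy_sensitivity(tau, center_a)

print("single detector:", resp_near, resp_far, "amplitudes:", amp_near, amp_far)
print("two-detector ratios:", response_ratio(tau_near), response_ratio(tau_far))
```

With one detector, the two motions are indistinguishable by construction. With two detectors, the ratio of the responses differs between the two motions, which is the sense in which higher resolution constrains the correlation time.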
Increased resolution also offers an advantage when considering anisotropic molecular tumbling. When separating molecular tumbling from internal protein motion, the relative orientations of the residual NMR interaction tensors and the molecular tumbling tensor influence the contribution that the tumbling makes to the overall motion. This effect is important if molecular tumbling is not fully isotropic (i.e. the molecule is not spherical). If one can reasonably estimate the direction and shape of the residual tensor resulting from internal motion, one may correctly account for the influence of the anisotropy of tumbling when separating molecular tumbling and internal motion (for both model-free analysis and detector analysis, although the latter method has not been published). For side-chain dynamics in this study, however, it is not possible to estimate the residual tensor without knowing the populations of all rotamers, so that we are instead forced to assume isotropic tumbling, although we expect slightly anisotropic molecular tumbling (Tjandra et al. 1995). Deviations of molecular tumbling from full isotropy may result in artifactual contributions to the detector sensitive to the longest correlation times: ρ5 when only high-field data is used, or ρ6 when using high-field and relaxometry data (see Fig. 3). In this case, motion on the nanosecond timescale identified with ρ5 using relaxometry can be clearly separated from molecular tumbling, whereas ρ5 of the high-field-only data set may include both internal motions and the effect of deviation from isotropic molecular tumbling.
Relaxometry, in principle, provides us with the ability to resolve much longer correlation times, where the lowest field used here (13.5 MHz) should be able to provide information on motions approximately 30 times slower than the lowest-field high-field data (400 MHz). However, the experimental sensitivities and detector optimization indicate that we only achieve those gains if molecular tumbling is slow enough. That is, the slow motions observable with relaxometry are masked by the tumbling. So for \(\tau_{{\text{r}}}\) = 1 ns, no gains are made, but for \(\tau_{{\text{r}}}\) = 25 ns, we obtain clear benefits from the relaxometry data, potentially improving characterization of much slower motion, resolving ambiguity between faster and slower motions in the nanosecond regime, and helping separate true internal motion from deleterious effects due to anisotropic molecular tumbling.
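The masking of slow internal motion by tumbling can be made concrete with a short numerical sketch. It assumes isotropic tumbling that is statistically independent of the internal motion, so that the two correlation functions multiply and an internal motion with correlation time \(\tau_{{\text{i}}}\) decays with an effective rate \(1/\tau_{{\text{eff}}} = 1/\tau_{{\text{i}}} + 1/\tau_{{\text{r}}}\). The function name and the 10 ns example motion are illustrative choices, not values taken from the analysis above.

```python
def effective_correlation_time(tau_internal, tau_tumbling):
    """Effective correlation time of an internal motion under tumbling.

    Assumes isotropic tumbling independent of the internal motion, so that
    1/tau_eff = 1/tau_internal + 1/tau_tumbling. Times are in seconds.
    """
    return tau_internal * tau_tumbling / (tau_internal + tau_tumbling)

tau_internal = 10e-9  # hypothetical 10 ns internal motion
for tau_tumbling in (1e-9, 5e-9, 25e-9):  # the three tumbling times considered
    tau_eff = effective_correlation_time(tau_internal, tau_tumbling)
    print(f"tau_r = {tau_tumbling * 1e9:4.0f} ns -> tau_eff = {tau_eff * 1e9:.2f} ns")
```

Under these assumptions, the effective correlation time never exceeds \(\tau_{{\text{r}}}\): for \(\tau_{{\text{r}}}\) = 1 ns, a 10 ns motion is nearly indistinguishable from the tumbling itself, whereas for \(\tau_{{\text{r}}}\) = 25 ns it remains well separated.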
Experimental data analysis
The theoretical analysis demonstrates that, in principle, relaxometry should improve the resolution with which we can describe slow motion. We now treat an experimental data set to evaluate in practice how much benefit we can obtain from relaxometry. We analyze methyl dynamics of all isoleucine residues in ubiquitin with and without relaxometry. The data analyzed in this article is found in its Supplementary Information and originates from previous work (Cousin et al. 2018; Kaderavek et al. 2019); we use the ICARUS-corrected data here (Charlier et al. 2013; Bolik-Coulon et al. 2020). An additional fit of the relaxometry data only can be found in SI Fig. 4. For this sample, the molecular tumbling correlation time was previously determined to be 5.03 ns, from an analysis of backbone 15N relaxation data with the program ROTDIF (Berlin et al. 2013), so that modest gains are expected from the relaxometry data (Fig. 2). Note that the definition of the sensitivities for 13C relaxation in a methyl group is complicated by the fact that the correlation function of the H–C and D–C dipole tensors is different from that of the 13C chemical shift anisotropy; see SI section 2 for details.
We have optimized detector sensitivities to analyze methyl-relaxation data obtained using high-field and relaxometry data (Fig. 4a). We obtain reasonable confidence intervals using 6 detectors for the full relaxometry data set, where the detector sensitive to the longest correlation times has a center that falls at about 12 ns (7 detectors yield higher resolution, but with higher standard deviation). The choice of the number of detectors is discussed in SI section 3, and we will see below that the number of detectors is related to the size of the singular values obtained during the detector optimization process (Fig. 5). In parallel, we have optimized detector sensitivities using only the high-field data (Fig. 4b); to obtain a similar standard deviation on the resulting detectors, we use 5 instead of 6 detectors to analyze the high-field data set. Additional analysis of the high-field data set with 6 detectors can be found in SI Fig. 3.
We note two significant gains from the inclusion of relaxometry data. First, the center of the slowest detector moves from about 4.4 ns for the high-field data set to 12 ns for the full relaxometry data set, a result of the higher resolution provided by using relaxometry. Second, we have gained an additional data point, having 6 detectors instead of 5. If we analyze the high-field data with 6 detectors (SI Fig. 3), we find a significant increase in the standard deviations of the resulting detectors. These increases are especially pronounced for detectors sensitive to long correlation times, as expected, since relaxometry provides information on slower motions.
The detector approach sheds light on the model-free analysis of 13C relaxation rate constants in methyl groups (Cousin et al. 2018; Kaderavek et al. 2019). We have previously shown that the analysis of relaxometry in Ile44 with a model-free approach adapted to methyl groups provided an effective correlation time for slow motions of the methyl axis \(\tau_{{\text{s}}}\) = 1.3 ns, in agreement with the analyses shown in Fig. 4a. Non-zero detector responses for ρ4 (center ~ 680 ps) and ρ5 (center ~ 2.4 ns) of the full data set, and non-zero responses for ρ4 (center ~ 800 ps) and ρ5 (center ~ 4.4 ns) of the high-field data set both point towards a mean correlation time in between these ranges (although we note that the responses may also be a result of a distribution of correlation times around ~ 1 ns). The high-field data set alone identifies a slow motion in the low nanosecond range for Ile44. In contrast, we find that slower motions observed on Ile13 and Ile36 could not be precisely defined from an analysis of high-field relaxation only (Kaderavek et al. 2019). For these residues, we obtain a significant detector response for ρ5 of the high-field data, but no response on ρ4. The response on ρ5 alone is ambiguous at several levels: first, one cannot determine whether a relatively low amplitude motion towards the center of the sensitivity of ρ5 results in the observed detector responses, or if the response results from a higher amplitude motion at a much longer correlation time (or some combination of these cases). In addition, we previously noted that because ubiquitin is not perfectly spherical, and we cannot reliably estimate the residual tensors from internal motion, we may have some artifactual contribution from tumbling in the detector which is sensitive to the longest correlation times (ρ5 for the high-field data set, and ρ6 for high-field and relaxometry data). 
Indeed, this seems a likely scenario; when considering only the high-field data, there is some motion detected with \(\rho_{5}^{(\theta ,S)}\) for all residues. Thus, it would be difficult to assign with certainty the slightly higher values of \(\rho_{5}^{(\theta ,S)}\) for Ile13 and Ile36 to nanosecond motions from high-field relaxation rates only.
On the other hand, if we consider the full data set, including relaxometry, we obtain an additional detector response, so that we have two detectors in the nanosecond range: ρ5 (2.4 ns) and ρ6 (12 ns). We find a larger detector response for ρ5 than ρ6 for both Ile13 and Ile36, which suggests that the observed motion is in fact found between 2.4 and 12 ns. This is in good agreement with the previous model-free analysis (Cousin et al. 2018), where effective correlation times of 3.1 ns and 2.5 ns were found for Ile13 and Ile36 respectively. The detector approach thus confirms the presence of motion in the low nanosecond range in three isoleucine side chains, by providing higher timescale resolution than high-field data alone. In addition, this enhanced resolution, obtained using relaxometry data, allows us to clearly distinguish internal motion in the low nanosecond range for Ile13, Ile36, and Ile44 from possible artifactual contributions from tumbling (whereas low responses now in \(\rho_{6}^{(\theta ,S)}\) may capture some tumbling motion). The detector approach shows that, in two out of three cases, high-field relaxation rates do not contain sufficient resolution in the nanosecond regime to fully characterize the methyl dynamics and thus demonstrates that high-resolution relaxometry provides unique information.
Quantifying information content: singular value decomposition
Information content of a data set, in particular when considering slow motion, depends on whether we have experiments sensitive to long correlation times (as provided by relaxometry), and whether tumbling of the molecule in solution masks slow motion. We have previously shown that detector accuracy is determined by a linear combination of the inverses of the singular values obtained during detector optimization (using singular value decomposition; see Smith et al. 2019a). Then, the number of inverse singular values \(\left( {[\Sigma ]_{i,i}^{ - 1} } \right)\) that fall below a chosen threshold determines the number of separate detectors that can be obtained from a data set. The choice of the threshold depends on how much error can be tolerated (here the threshold is taken to be 0.1; see SI section 3, where we discuss requirements for the threshold).
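The counting rule described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the published optimization procedure of Smith et al. (2019a): the function name `count_detectors` is hypothetical, the input matrix is assumed to be already normalized in whatever way the threshold of 0.1 presumes (see SI section 3), and the toy matrix is a made-up diagonal matrix chosen so that its singular values are known in advance.

```python
import numpy as np

def count_detectors(sensitivity_matrix, threshold=0.1):
    """Count the inverse singular values that fall below `threshold`.

    `sensitivity_matrix` stands in for an (assumed, pre-normalized) matrix of
    experimental sensitivities; rows are experiments, columns are
    correlation-time bins. Returns (number of detectors, inverse singular values).
    """
    # Singular values are returned in descending order
    singular_values = np.linalg.svd(sensitivity_matrix, compute_uv=False)
    with np.errstate(divide="ignore"):  # a zero singular value gives inf
        inverse_singular_values = 1.0 / singular_values
    n_detectors = int(np.sum(inverse_singular_values < threshold))
    return n_detectors, inverse_singular_values

# Toy example: singular values 50, 20, 12, 5, 1 (hypothetical numbers)
toy_matrix = np.diag([50.0, 20.0, 12.0, 5.0, 1.0])
n_detectors, inverse_values = count_detectors(toy_matrix)
print(n_detectors, inverse_values)  # 3 inverse singular values lie below 0.1
```

In this toy case the inverse singular values are 0.02, 0.05, 0.083, 0.2, and 1.0, so three detectors would be retained at a threshold of 0.1.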
We plot the inverse singular values \([\Sigma ]_{i,i}^{ - 1}\) as a function of the tumbling correlation time for three data sets in Fig. 5. In Fig. 5a, we take only the set of high-field data used in Fig. 2a. Between \(\tau_{{\text{r}}}\) = 1 ns and 3 ns (\(10^{-9}\) and \(10^{-8.5}\) s), \([\Sigma ]_{4,4}^{ - 1}\) becomes marginally smaller and \([\Sigma ]_{5,5}^{ - 1}\) decreases significantly. After this point, little further improvement is observed: while the increase of the rotational correlation time unmasks slower motions, experiments in the high-field data set cannot characterize these slower motions better, mostly because slower nanosecond motions and very fast low picosecond motions are difficult to disentangle (see Fig. 2b for \(\tau_{{\text{r}}}\) = 25 ns).
In contrast, if we include the magnetic fields used in relaxometry experiments (down to 13.5 MHz or 0.3 T), we see decreases in the inverse singular values up to \(\tau_{{\text{r}}}\) ~ 100 ns (Fig. 5b). In this case, the increasing \(\tau_{{\text{r}}}\) unmasks more motion, and we have the experimental data to characterize that motion. We still eventually reach a plateau, since the relaxometry experiments are sensitive to motions only about 30 times slower than the high-field relaxation (R1 at 400 MHz vs. 13.5 MHz). Adding further relaxometry experiments at even lower fields causes the plateaus to occur at even longer correlation times, which we demonstrate by including fields as low as 400 kHz or 9 mT (Fig. 5c). Such low fields would, in principle, allow analysis with up to ~ 10 detectors, if the molecule tumbles sufficiently slowly. To our knowledge, quantitative, high-resolution NMR of 13C1H2H2 methyl groups in proteins has been demonstrated for molecular weights up to 360 kDa, but not yet beyond (Rennella et al. 2015).
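The field values and the factor of about 30 quoted above follow from simple arithmetic, sketched below. The sketch assumes that the quoted frequencies are 1H Larmor frequencies and uses a proton gyromagnetic ratio of γ/2π ≈ 42.577 MHz/T; the function name is illustrative.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla
GAMMA_H_OVER_2PI_MHZ_PER_T = 42.577

def proton_mhz_to_tesla(freq_mhz):
    """Convert a 1H Larmor frequency (MHz) to a magnetic field (T).

    Assumes the frequency refers to 1H, as is conventional for NMR fields.
    """
    return freq_mhz / GAMMA_H_OVER_2PI_MHZ_PER_T

field_13p5_mhz = proton_mhz_to_tesla(13.5)   # lowest field in the experimental set
field_400_khz = proton_mhz_to_tesla(0.4)     # lowest field in the extended set (400 kHz)
frequency_ratio = 400.0 / 13.5               # lowest high-field vs. lowest relaxometry field

print(f"13.5 MHz -> {field_13p5_mhz:.3f} T")
print(f"400 kHz  -> {field_400_khz * 1e3:.1f} mT")
print(f"400 MHz / 13.5 MHz = {frequency_ratio:.1f}")
```

The conversion gives roughly 0.32 T and 9.4 mT, consistent with the rounded values of 0.3 T and 9 mT, and a frequency ratio of about 29.6.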