Introduction

Insight into protein dynamics is crucial for understanding protein function (Palmer 2004). NMR relaxation methods provide ideal tools for studying motions faster than the overall tumbling correlation time of a protein, i.e. sub-\( \tau _c \) motion (\( \tau _c \) is ca. 4 ns for ubiquitin at room temperature). Such motion has been proposed to contribute mostly to the entropy of proteins (Schneider et al. 1992; Li et al. 1996; Prompers and Brüschweiler 2000; Lee and Wand 2001). Slow time scale motion (ca. 50 μs up to several ms) can be probed by relaxation dispersion measurements (Akke and Palmer 1996; Kay 1998; Kay et al. 1989; Palmer 2004). These experiments isolate the contribution of conformation-dependent modulation of isotropic chemical shifts to NMR line widths from alternative spin-spin or spin-lattice relaxation effects. Such relaxation dispersion experiments are sensitive to conformational changes of proteins in catalytic cycles (Eisenmesser et al. 2005; Henzler-Wildman et al. 2007; Kern et al. 2005; Kern and Zuiderweg 2003; Mulder et al. 2002; Stevens et al. 2001; Tollinger et al. 2001). Slow time scale motions have also been detected in several cross-correlated relaxation experiments (Dittmer and Bodenhausen 2004; Ferrage et al. 2006; Pelupessy et al. 2003).

Relaxation methods are not suitable for the study of dynamics between the sub-\( \tau _c \) and the relaxation dispersion time scale. This time scale window can thus far only be addressed by measurements of residual dipolar couplings (RDCs) as they are time-averaged from femtoseconds up to milliseconds (Blackledge 2005; Tolman and Ruan 2006). It has been proposed that this time window could be relevant for protein–protein recognition (Bertoncini et al. 2005; Bouvignies et al. 2005b). After the renaissance of RDCs in liquid-state protein NMR spectroscopy (Tjandra and Bax 1997; Tolman et al. 1995), the potential of RDCs to study protein dynamics has been recognized (Tolman et al. 1997) and several methods have been developed to extract dynamic information from RDCs: The RDC-based model-free approach relies on the measurement of NH RDCs for five linearly independent alignment tensor orientations in at least five different media (Meiler et al. 2001; Peti et al. 2002; Lakomek et al. 2005, 2006). Using a high-resolution structure to determine the alignment tensors, structural as well as dynamic information can be deduced. The Direct Interpretation of Dipolar Couplings (DIDC) approach (Tolman 2002) is conceptually similar but does not require a structural model of the protein. Both the structural and dynamical models as well as the underlying alignment tensors are obtained simultaneously. The DIDC approach minimizes the variation of RDC-based order parameters (see, e.g., Tolman 2002, Eq. 15). This is also true for a recent extension of this approach (Yao et al. 2008). In an alternative model-based approach several RDCs for protein G (Ulmer et al. 2003) were fitted using a three-dimensional Gaussian Axial Fluctuation (3D GAF) model (Bouvignies et al. 2005b). These studies strongly suggest that RDC-derived order parameters are on average smaller than the relaxation-derived order parameters, indicating the presence of supra-\( \tau _c \) motion. 
In a follow-up study, Blackledge and co-workers were able to determine the average protein backbone conformation and the nature and extent of motional disorder about this average structure ab initio from measured RDCs for protein G, using their dynamic-meccano approach and assuming fixed peptide plane geometry (Bouvignies et al. 2006, 2007).

Recently, high mobility in the sub-microsecond time scale was detected using heteronuclear dipolar couplings for the ubiquitin backbone in the microcrystalline state (Lorieau and McDermott 2006). Furthermore, several promising approaches have been undertaken to combine or compare RDC information with motional models derived from molecular dynamics simulations (Bouvignies et al. 2007; Markwick et al. 2007; Nederveen and Bonvin 2005; Showalter and Brüschweiler 2007a, b; Showalter et al. 2007). Loop dynamics comparable to or longer than \( \tau _c \) were very recently observed for the backbone of ubiquitin in a 1.2 μs explicit solvent MD simulation (Maragakis et al. 2008).

For the model-free approach (Meiler et al. 2001; Peti et al. 2002; Lakomek et al. 2005, 2006), it was suggested that structural noise introduced by the usage of a single high-resolution structure may contribute a systematic error (Bouvignies et al. 2005a; Clore 2004; Clore and Schwieters 2006; Zweckstetter and Bax 2002). Therefore, the major focus of the present work is to introduce a Self-Consistent RDC-based Model-free (SCRM) analysis to alleviate this model bias. This analysis is here applied to the protein ubiquitin.

Materials and methods

Experimental part

Alignment media preparation

Altogether, 36 NH RDC data sets from the backbone of wild-type human ubiquitin were available for the new SCRM analysis. Previous measurements (Peti et al. 2002; data sets D1–D5 in Lakomek et al. 2006) were replaced by measurements with an increased concentration of ubiquitin.

15N,13C-labeled human ubiquitin (wt) was expressed according to (Johnson et al. 1999). Thirteen new alignment conditions, A1–A13, were prepared as described in the following. In every case, 2.5 mg of ubiquitin was dissolved in a 50 mM Na phosphate buffer at pH 6.5. The final ubiquitin concentration varied between 0.75 and 0.9 mM, and 10–20% (v/v) D2O was added for field locking. The following briefly describes the individual new alignment conditions.

  • A1: A 7% positively charged gel sample was prepared according to (Cierpicki and Bushweller 2004). The positive charge was introduced by addition of (3-acrylamidopropyl)-trimethylammonium chloride (APTMAC, Sigma-Aldrich, Inc.) in a ratio of APTMAC:acrylamide = 1:3.

  • A2: A 7% positively charged gel was prepared as for A1 but with a ratio APTMAC:acrylamide = 1:1.

  • A3: A 5% negatively charged gel was prepared according to (Cierpicki and Bushweller 2004). The negative charge was introduced by addition of acrylic acid (Sigma-Aldrich, Inc.) in a ratio of acrylic acid:acrylamide = 1:1.

  • A4: The ubiquitin solution was added to dodecyl-penta(ethylene glycol) (C12E5) stock solution (15% w/v) in a ratio of 2:1 and vortexed. The solution became opalescent after addition of 1.5% (v/v) hexanol (Ruckert and Otting 2000).

  • A5: Ubiquitin was dissolved in a suspension of 25 mg/ml Pf-1 phage (ASLA Ltd., Riga, Latvia) in 50 mM Na phosphate buffer with 100 mM NaCl (Zweckstetter and Bax 2001).

  • A6: Same as A5 but with a Pf-1 phage concentration of 20 mg/ml.

  • A7: A 1,2-dimyristoyl-sn-glycero-3-phosphatidylcholine (DMPC)/1,2-dihexanoyl-sn-glycero-3-phosphatidylcholine (DHPC) = 3:1 mixture (Avanti Polar Lipids, Alabama) of 15% w/v was dissolved in buffer containing 50 mM NaCl and 50 mM Na phosphate buffer (pH = 6.5) with 0.02% sodium azide, and 10% D2O. The final ubiquitin concentration was 0.9 mM (Triba et al. 2005).

  • A8: DMPC, DHPC and SDS (sodium dodecyl sulfate, Serva, Heidelberg, Germany) were mixed in a ratio of 30:10:2 and dissolved in a 50 mM Na phosphate buffer with pH = 6.5, containing 15–20% D2O, until a total lipid concentration of 5% (w/v) was reached. The composition was vortexed at 0°C until the solution became clear. Ubiquitin was dispersed in this solution with a final concentration of 0.75 mM.

  • A9–A13: Bicelle media were prepared similarly to A7 and A8. Ingredients and total lipid concentration can be found in Table 1. The ubiquitin concentration was 0.75 mM.

    Table 1 Bicelle media preparation (See chapter Materials and methods, alignment media preparation)

To complement the data sets obtained from these alignment conditions (in order to better span the 5-dimensional RDC space), the following data sets were used: A14–A18: NH data sets E1 to E5 measured in a previous analysis (Lakomek et al. 2006).

A19–A36: NH RDC data sets published by Bax and coworkers (Ottiger et al. 1998) and Tolman and coworkers (Ruan and Tolman 2005; Briggman and Tolman 2003).

The collection of 36 NH RDC data sets will be referred to as D36M. A table containing all NH RDC data sets used for the analysis is provided in the Supporting Information (Table S5). All relevant alignment conditions used for the different data sets are listed in a second table in the Supporting Information (Table S6).

NMR Spectroscopy

NH RDC data for all new aligned media A1–A13 as well as the isotropic reference experiment were measured using 2D-IPAP-15N,1H-HSQC experiments (Ottiger and Bax 1998). All data were recorded at a sample temperature of 308 K. Measurements were performed on either a Bruker-Avance 700 MHz (Bruker AG, Karlsruhe, Germany), a Bruker-DRX 600 MHz or a Bruker-Avance 600 MHz spectrometer equipped with a TXI cryogenic probe head or a Bruker 800 MHz spectrometer equipped with a TCI cryogenic probe head. The time domain was either TD1 × TD2 = 1 k × 2 k or TD1 × TD2 = 2 k × 2 k with a spectral width of \( \Upomega 1 \times \Upomega 2 \) = 25 ppm × 15 ppm (except for data sets A5 and A6, where \( \Upomega 1 \times \Upomega 2 \) = 30 ppm × 15 ppm). The number of scans was 32 or higher.

For processing, data were zero-filled to TD1 × TD2 = 32 k × 4 k and processed with the NMRPipe software package (Delaglio et al. 1995). (J + D)- and J-coupling constants were extracted using NMRPipe. One-bond 15N, 1H RDCs were derived from the difference in coupling between aligned samples and the isotropic sample. The experimental error was conservatively estimated to be less than 0.3 Hz. Alignment tensors were calculated using the software PALES (Zweckstetter and Bax 2000).

SECONDA analysis

In order to quantify the similarity of structure and dynamics in the different alignment media (homogeneity of RDC data), a SECONDA analysis was applied (Hus and Brüschweiler 2002; Hus et al. 2003). The SECONDA method analyzes the covariance matrix constructed of all RDC data obtained under different alignment conditions. It performs a principal component analysis (PCA) of the RDC covariance matrix, which is equivalent to a singular value decomposition (SVD) of the RDC matrix. The singular values are sorted according to decreasing size. Structural and dynamic information is contained in the first five singular values, since dipolar couplings are a second rank symmetric tensor interaction and hence reside in a linear 5-dimensional space. Accordingly, only noise, systematic errors, and structural and dynamic heterogeneity may cause the 6th and higher singular values to differ from zero. The ratio of the 5th and 6th singular values (called SECONDA gap in the following) is a measure of the homogeneity of RDC data and the magnitude of noise. The larger the SECONDA gap, the more self-consistent are the RDC data in the different alignment media.
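The singular value step described above can be illustrated with a minimal NumPy sketch. It is not the SECONDA implementation: the function name `seconda_gap`, the assumption of a complete RDC matrix without missing couplings, and the normalization of each data set to unit norm are illustrative choices, and the input data are synthetic.

```python
import numpy as np

def seconda_gap(rdc_matrix):
    """Return (gap, singular_values) for an (n_media x n_residues) RDC matrix.

    The gap is the ratio of the 5th to the 6th singular value.
    """
    rdc_matrix = np.asarray(rdc_matrix, dtype=float)
    # Assumed normalization: scale every data set (row) to unit Euclidean norm
    norms = np.linalg.norm(rdc_matrix, axis=1, keepdims=True)
    normalized = rdc_matrix / norms
    # Singular values are returned in decreasing order
    singular_values = np.linalg.svd(normalized, compute_uv=False)
    return singular_values[4] / singular_values[5], singular_values

# Synthetic, homogeneous data: rank 5 by construction, plus small noise
rng = np.random.default_rng(0)
n_media, n_residues = 12, 60
basis = rng.normal(size=(5, n_residues))
weights = rng.normal(size=(n_media, 5))
clean = weights @ basis
noisy = clean + rng.normal(scale=0.01, size=clean.shape)

gap, singular_values = seconda_gap(noisy)
print(f"SECONDA-like gap (s5/s6): {gap:.1f}")
```

In this toy case the first five singular values carry the signal and the sixth reflects only the added noise, so increasing the noise level or mixing in heterogeneous rows lowers the gap.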

For D36M, the SECONDA gap was 5.66 after normalization. Starting from these 36 NH RDC data sets, a subset of 23 NH RDC data sets (D23M) was selected that increased homogeneity of the RDC data with still adequate sampling of alignment tensor orientations in the 5-dimensional tensor-space (as described in the Supporting Information). The data sets contained in the subset D23M are displayed in Table S5 in the Supporting Information. The SECONDA gap of subset D23M was 6.81 after normalization.

Self-Consistent RDC-based Model-free (SCRM) approach

The Self-Consistent RDC-based Model-free (SCRM) method was developed from the theory of the original RDC-based model-free approach. RDCs are measured in at least five linearly independent alignment conditions for each single inter-nuclear vector (e.g. the NH vector). The RDC-based model-free approach extracts dynamically averaged second order spherical harmonics \( \left\langle {Y_{2,M} (\theta ,\phi )} \right\rangle \) from these RDCs,

$$ \left\langle {D^{\exp } } \right\rangle = \sqrt {\frac{{4\pi }}{5}} D_{zz} \left\{ {\left\langle {Y_{2,0} (\theta ,\phi )} \right\rangle + \sqrt {\frac{3}{8}} R\left( {\left\langle {Y_{2,2} (\theta ,\phi )} \right\rangle + \left\langle {Y_{2, - 2} (\theta ,\phi )} \right\rangle } \right)} \right\}. $$
(1.1)

For each NH vector j, the five averaged second order spherical harmonics \( \left\langle {Y_{2,M} } \right.(\theta _j^{mol} ,\phi _j^{mol} \left. ) \right\rangle \) are derived by solving the F-matrix equation (1.2). This equation describes for each alignment condition the residual dipolar coupling equation (1.1) in the molecular frame. The F-Matrix relates the spherical harmonics in the molecular frame to the measured RDCs by a Wigner rotation from the molecular frame to the alignment frame (Lakomek et al. 2006; Meiler et al. 2001):

$$ \frac{{D_{ij}^{\exp } }}{{D_{i,zz} }} = \sum\limits_{M = - 2}^2 {F_{i,M} } \left\langle {Y_{2,M} } \right.(\theta _j^{mol} ,\phi _j^{mol} \left. ) \right\rangle $$
(1.2)

with

$$ \begin{gathered} F_{i,M} = \sqrt {\frac{{4\pi }}{5}} \left( {D_{M0}^2 (\alpha ^i ,\beta ^i ,\gamma ^i ) + \sqrt {\frac{3}{8}} R\left( {D_{M2}^2 (\alpha ^i ,\beta ^i ,\gamma ^i )} \right. + \left. {D_{M - 2}^2 (\alpha ^i ,\beta ^i ,\gamma ^i )} \right)} \right) \hfill \\ = \sqrt {\frac{{4\pi }}{5}} \left( {e^{ - iM\alpha ^i } d_{M0}^2 (\beta ^i ) + \sqrt {\frac{3}{8}} R(e^{ - iM\alpha ^i } d_{M2}^2 (\beta ^i )e^{ - 2i\gamma ^i } + e^{ - iM\alpha ^i } d_{M - 2}^2 (\beta ^i )e^{2i\gamma ^i } )} \right). \hfill \\ \end{gathered} $$
(1.3)
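The linear inversion behind Eq. (1.2) can be sketched in an equivalent Cartesian form, which avoids Wigner matrices. This is an illustrative reformulation, not the protocol used here: it assumes each medium is described by a 3 × 3 traceless symmetric alignment matrix A (with the dipolar prefactor absorbed), so that a coupling is D = Σ A_kl ⟨e_k e_l⟩ for the unit NH vector e. The five independent elements of the averaged traceless tensor T = ⟨e eᵀ⟩ − I/3 are linear combinations of the five ⟨Y_{2,M}⟩. All function names are hypothetical.

```python
import numpy as np

def design_row(A):
    """Coefficients of (Txx, Tyy, Txy, Txz, Tyz) for one alignment matrix A."""
    return np.array([A[0, 0] - A[2, 2], A[1, 1] - A[2, 2],
                     2 * A[0, 1], 2 * A[0, 2], 2 * A[1, 2]])

def solve_averaged_tensor(alignment_matrices, rdcs):
    """Least-squares estimate of T = <e e^T> - I/3 from RDCs in >= 5 media."""
    M = np.array([design_row(A) for A in alignment_matrices])
    t, *_ = np.linalg.lstsq(M, np.asarray(rdcs, dtype=float), rcond=None)
    txx, tyy, txy, txz, tyz = t
    return np.array([[txx, txy, txz],
                     [txy, tyy, tyz],
                     [txz, tyz, -txx - tyy]])

def random_alignment_matrix(rng):
    """Random traceless symmetric matrix standing in for a fitted tensor."""
    X = rng.normal(size=(3, 3))
    S = (X + X.T) / 2
    return S - np.trace(S) / 3 * np.eye(3)

rng = np.random.default_rng(1)
media = [random_alignment_matrix(rng) for _ in range(8)]
e = np.array([1.0, 2.0, 2.0]) / 3.0            # rigid unit vector
rdcs = np.array([e @ A @ e for A in media])    # noise-free synthetic RDCs
T = solve_averaged_tensor(media, rdcs)
print(np.round(T, 3))
```

For a rigid vector the recovered tensor equals e eᵀ − I/3; with internal motion its elements shrink, which is what the order parameters quantify.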

From the dynamically averaged spherical harmonics, RDC-based order parameters \( S_{rdc}^2 \) are determined which are sensitive to all motions faster than the millisecond time scale. This is in contrast to Lipari–Szabo order parameters \( S_{LS}^2 \) (Lipari and Szabo 1982) obtained from relaxation measurements which are only sensitive up to the overall tumbling correlation time \( \tau _c \) (ca. 4 ns for ubiquitin at room temperature):

$$ \begin{gathered} S_{rdc}^2 = \frac{{4\pi }}{5}\sum\limits_{M = - 2}^2 {\left. {\left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle } \right|_{ps}^{ms} \left. {\left\langle {Y_{2,M}^* \left( {\theta ,\phi } \right)} \right\rangle } \right|_{ps}^{ms} } \hfill \\ S_{LS}^2 = \frac{{4\pi }}{5}\sum\limits_{M = - 2}^2 {\left. {\left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle } \right|_{ps}^{\tau _c } \left. {\left\langle {Y_{2,M}^* \left( {\theta ,\phi } \right)} \right\rangle } \right|_{ps}^{\tau _c } } \hfill \\ \end{gathered} $$
(1.4)
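As a small numerical illustration of Eq. (1.4), the sum over spherical harmonics can be evaluated in Cartesian form, since by the addition theorem it equals (3/2) Σ_kl ⟨e_k e_l⟩² − 1/2 for unit vectors e. The sketch below assumes an explicit ensemble of vector orientations is available; which conformers enter the ensemble (up to \( \tau _c \) or up to milliseconds) determines whether the result corresponds to \( S_{LS}^2 \) or \( S_{rdc}^2 \). The function name is hypothetical.

```python
import numpy as np

def order_parameter(vectors):
    """S^2 = (3/2) * sum_kl <e_k e_l>^2 - 1/2 for an (n x 3) ensemble."""
    vectors = np.asarray(vectors, dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    second_moment = unit.T @ unit / len(unit)   # matrix of <e_k e_l>
    return 1.5 * np.sum(second_moment ** 2) - 0.5

rigid = [[0, 0, 1], [0, 0, 1]]                  # no motion
two_site = [[1, 0, 0], [0, 1, 0]]               # 90 degree two-site jump
isotropic = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # equal sampling of x, y, z

s2_rigid = order_parameter(rigid)
s2_two_site = order_parameter(two_site)
s2_isotropic = order_parameter(isotropic)
print(s2_rigid, s2_two_site, s2_isotropic)
```

The three cases span the expected range: no motion gives 1, the orthogonal two-site jump gives 0.25, and equal sampling of three orthogonal axes gives 0.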

In addition to RDC-based order parameters \( S_{rdc}^2 \), the dynamic average orientations of inter-nuclear vectors can also be determined. After performing a coordinate transformation that maximizes \( \left\langle {Y_{2,0} \left( {\theta ^\prime ,\phi ^\prime } \right)} \right\rangle \), the NH vector points along the z′ axis of the new primed reference frame:

$$ \begin{aligned} \max \left\langle {Y_{2,0} \left( {\theta ^\prime ,\phi ^\prime } \right)} \right\rangle & = \sum\limits_{M = - 2}^2 {D_{M,0} \left( {\phi _{av} ,\theta _{av} ,0} \right)} \left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle \\ & = \sqrt {\frac{{4\pi }}{5}} \sum\limits_{M = - 2}^2 {Y_{2,M}^* } \left( {\theta _{av} ,\phi _{av} } \right)\left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle . \\ \end{aligned} $$
(1.5)

The first and second Euler angles of the respective Wigner rotation \( D_{M,0} \left( {\phi _{av} ,\theta _{av} ,0} \right) \) correspond to the dynamic average orientation of the inter-nuclear NH vector \( \left( {\phi _{av} ,\theta _{av} } \right) \). Furthermore, the amplitude η and direction of anisotropy \( \phi _{rdc} \) (Lakomek et al. 2006; Peti et al. 2002) are determined.
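The maximization in Eq. (1.5) has a simple Cartesian counterpart that may help intuition: ⟨Y_{2,0}⟩ in a rotated frame is proportional to (3 z′ᵀ⟨e eᵀ⟩ z′ − 1)/2, so the axis z′ that maximizes it is the eigenvector of ⟨e eᵀ⟩ with the largest eigenvalue. The sketch below assumes an explicit ensemble of unit vectors rather than measured RDCs, and the function name is hypothetical. Because RDCs cannot distinguish a vector from its inverse, the sign of the axis is fixed here by convention using the ensemble mean.

```python
import numpy as np

def average_orientation(vectors):
    """Return (axis, theta_av, phi_av) maximizing <P2(cos theta')>."""
    vectors = np.asarray(vectors, dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    second_moment = unit.T @ unit / len(unit)
    eigvals, eigvecs = np.linalg.eigh(second_moment)  # ascending eigenvalues
    axis = eigvecs[:, -1]                             # largest eigenvalue
    if axis @ unit.mean(axis=0) < 0:                  # resolve sign ambiguity
        axis = -axis
    theta = np.arccos(np.clip(axis[2], -1.0, 1.0))
    phi = np.arctan2(axis[1], axis[0])
    return axis, theta, phi

# Four orientations tilted by 20 degrees around the z axis
t = np.deg2rad(20.0)
ensemble = [[np.sin(t), 0, np.cos(t)], [-np.sin(t), 0, np.cos(t)],
            [0, np.sin(t), np.cos(t)], [0, -np.sin(t), np.cos(t)]]
axis, theta_av, phi_av = average_orientation(ensemble)
print(np.round(axis, 3), np.degrees(theta_av))
```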

The goal of the Self-Consistent RDC-based Model-free (SCRM) approach was to remove a possible bias due to the protein structure used for the initial determination of the alignment tensors. To this aim, we proceeded as follows. The first step of the SCRM method was the application of the standard RDC-based model-free analysis as described in (Lakomek et al. 2006): First, alignment tensors were calculated from the measured RDCs using the PALES (Zweckstetter and Bax 2000) or DipoCoup software (Meiler et al. 2000) and from the X-ray structure 1ubi (Ramage et al. 1994) to which protons were added with MOLMOL (Koradi et al. 1996). Naturally, the dynamically averaged orientations \( \left( {\phi _{av} ,\theta _{av} } \right) \) calculated from the RDC-based model-free approach will exhibit deviations from the orientations in the X-ray structure. In an iterative fashion, the NH vector orientations used for the tensor calculation were replaced by the resulting \( \left( {\phi _{av} ,\theta _{av} } \right) \) after each model-free analysis cycle (Fig. 1).

Fig. 1

Sketch of the SCRM cycles: RDC-based model-free results are calculated using NH RDC data as well as inter-nuclear NH vectors for tensor calculation. Starting from a first structural model (e.g. the X-ray structure 1ubi with protons added with standard geometry), the inter-nuclear NH vector orientations are adjusted in each SCRM cycle towards the true dynamic average orientation. Consequently, the fit of the alignment tensor to the RDC data is improved in each step and, as a result, so is the fit of the model-free results to the RDCs.

An N–H bond length of 1.04 Å (Ottiger and Bax 1998) has been used to calculate the new proton coordinates. (Note that the SCRM results are independent of the bond length used.)

In each cycle i, a model-free analysis and an alignment tensor recalculation were conducted until convergence of the order parameters, \( \left| {S_{rdc,i}^2 - S_{rdc,i - 1}^2 } \right| = \frac{1}{n}\sum\nolimits_{j = 1}^n {\left| {S_{rdc,i}^2 (NH_j ) - S_{rdc,i - 1}^2 (NH_j )} \right|} \le 0.01 \), was achieved. Note that in every step of the SCRM approach the five time-averaged second order spherical harmonics are calculated for each residue in the same way as in the original RDC-based model-free approach. From those, the five RDC-based model-free parameters \( S_{rdc}^2 ,\phi _{av} ,\theta _{av} ,\eta ,\phi _{rdc}^\prime \) are derived. In the following discussion we will focus on the first three parameters, \( S_{rdc}^2 \) and \( \left( {\phi _{av} ,\theta _{av} } \right) \).
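The convergence test can be written compactly. The sketch below only illustrates the stopping criterion (mean absolute change of \( S_{rdc}^2 \) over all residues ≤ 0.01); the `update` callable is a hypothetical stand-in for one full cycle of tensor recalculation plus model-free analysis, and the toy update used in the example is not physically meaningful.

```python
import numpy as np

def mean_abs_change(s2_current, s2_previous):
    """Mean over residues of |S2_i - S2_(i-1)|."""
    diff = np.asarray(s2_current, dtype=float) - np.asarray(s2_previous, dtype=float)
    return float(np.mean(np.abs(diff)))

def run_until_converged(update, s2_start, tol=0.01, max_cycles=50):
    """Apply `update` repeatedly until the mean |delta S2| is <= tol."""
    s2_prev = np.asarray(s2_start, dtype=float)
    for cycle in range(1, max_cycles + 1):
        s2_new = np.asarray(update(s2_prev), dtype=float)
        if mean_abs_change(s2_new, s2_prev) <= tol:
            return s2_new, cycle
        s2_prev = s2_new
    raise RuntimeError("order parameters did not converge")

# Toy cycle: each iteration moves half-way towards fixed target values
target = np.array([0.8, 0.7, 0.9])
toy_update = lambda s2: s2 + 0.5 * (target - s2)
s2_final, n_cycles = run_until_converged(toy_update, [1.0, 1.0, 1.0])
print(n_cycles, np.round(s2_final, 4))
```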

The SCRM procedure was implemented in a semi-automated manner using a Mathematica 5.2 protocol, PALES, and several Python scripts. To assess the fit of the alignment tensor to the experimental RDC data, static Q-factors \( Q_{static} = \sqrt 2 R_{dip} \) as defined in (Bax and Grishaev 2005) and the Pearson correlation coefficients ρ were extracted from PALES after each cycle. Additional criteria and error measures were implemented for the SCRM analysis as explained below.

Back-calculated RDCs and dynamic Q-values assess the fitting quality of SCRM

RDCs were back-calculated from the time-averaged second order spherical harmonics \( \left\langle {Y_{2,M} (\theta ,\phi )} \right\rangle \) (Eq. 1.1). Consistency of the model-free approach on a per-residue basis (running index j) was assessed by computing the root mean square deviation (rmsd) between back-calculated RDCs \( D_{MF} \) and experimental RDCs \( D_{\exp } \) (RDC-rmsd):

$$ rmsd(rdc,j) = \sqrt {\frac{{\sum\nolimits_{i = 1}^K {(D_{i,j,MF} - D_{i,j,\exp } )^2 } }}{K}} $$
(1.6)

where K is the number of alignment media. We also introduced dynamic Q-values \( Q_{dyn} \) utilizing the back-calculated RDCs \( D_{MF} \):

$$ Q_{dyn,i} = \sqrt {\frac{{\sum\nolimits_{j = 1}^N {(D_{ij,MF} - D_{ij,\exp } )^2 } }}{{\sum\nolimits_{j = 1}^N {D_{ij,\exp }^2 } }}} $$
(1.7)

where N is the number of RDCs available in medium i. The dynamic Q-value \( Q_{dyn} \) measures the quality of fit of the model-free solution for the different alignment media. It is a straightforward extension of the well-known Q-value that measures the quality of fit of experimental RDCs to a single static structure.
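Both fit measures reduce to simple array operations. The following sketch assumes the back-calculated and experimental RDCs are stored as arrays of shape (K media, N residues), with missing experimental couplings marked as NaN; the array layout, the NaN convention and the function names are assumptions made for illustration.

```python
import numpy as np

def rdc_rmsd_per_residue(d_mf, d_exp):
    """Eq. (1.6): rmsd over the available media, one value per residue."""
    diff = np.asarray(d_mf, dtype=float) - np.asarray(d_exp, dtype=float)
    return np.sqrt(np.nanmean(diff ** 2, axis=0))

def q_dyn_per_medium(d_mf, d_exp):
    """Eq. (1.7): dynamic Q-value over the available residues, one per medium."""
    d_mf = np.asarray(d_mf, dtype=float)
    d_exp = np.asarray(d_exp, dtype=float)
    numerator = np.nansum((d_mf - d_exp) ** 2, axis=1)
    denominator = np.nansum(d_exp ** 2, axis=1)
    return np.sqrt(numerator / denominator)

# Two media, two residues (values in Hz, synthetic)
d_exp = np.array([[3.0, 4.0], [6.0, 8.0]])
d_mf = d_exp + np.array([[0.3, 0.4], [0.0, 0.0]])
rmsd = rdc_rmsd_per_residue(d_mf, d_exp)
q_dyn = q_dyn_per_medium(d_mf, d_exp)
print(np.round(rmsd, 4), np.round(q_dyn, 4))
```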

Error calculation for the SCRM analysis

To estimate the error of the RDC-based order parameters \( S_{rdc}^2 (j) \) for each residue j, the experimental error was modelled by adding Gaussian noise to the measured RDCs. The input RDCs with added noise \( D_{i,j,noise} \) were generated by drawing N = 1000 random samples \( D_{i,j,noise} = random [p (D_{i,j} )] \) from a Gaussian distribution \( p(D) \) with standard deviation \( \sigma _j \).

Two different standard deviations \( \sigma _j \) were considered to study the propagation of different sources of errors to the SCRM derived order parameters: first, in order to assess the impact of the experimental error from the RDC data alone \( \sigma _j^{\exp } \) = 0.3 Hz was used. Second, the residue-specific \( \sigma _j^{rmsd} = rmsd(rdc,j) \) was used in order to assess the combined effect of experimental error and additional systematic errors introduced by the model-free analysis. Examples for the latter are a possible correlation between alignment tensor fluctuations and internal dynamics, or the single tensor approximation. This analysis was repeated N = 1000 times. The error is evaluated as the standard deviation of the resulting N = 1000 \( S_{rdc}^2 (j) \):

$$ \Updelta S_{rdc}^2 (j)\, = \,\sqrt {\frac{1}{N}\sum\limits_{k = 1}^N {\left( {S_{rdc,k}^2 (j) - \left\langle {S_{rdc}^2 (j)} \right\rangle } \right)^2 } } . $$
(1.8)
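The noise propagation can be sketched generically as follows. The `estimator` argument is a hypothetical placeholder for whatever maps a set of RDCs to \( S_{rdc}^2 (j) \); in the example a trivial estimator (the mean) is used only because its expected error is known analytically. The 1/N normalization of Eq. (1.8) corresponds to NumPy's default population standard deviation.

```python
import numpy as np

def monte_carlo_error(rdcs, sigma, estimator, n_samples=1000, seed=0):
    """Standard deviation (1/N form) of estimator(rdcs + Gaussian noise)."""
    rng = np.random.default_rng(seed)
    rdcs = np.asarray(rdcs, dtype=float)
    values = np.array([
        estimator(rdcs + rng.normal(scale=sigma, size=rdcs.shape))
        for _ in range(n_samples)
    ])
    return values.std()   # ddof=0, matching the 1/N in Eq. (1.8)

# Toy check: the mean of 9 values with 0.3 Hz noise has an error of 0.3/3 Hz
toy_rdcs = np.linspace(-10.0, 10.0, 9)
error = monte_carlo_error(toy_rdcs, sigma=0.3, estimator=np.mean)
print(f"propagated error: {error:.3f} Hz")
```

Passing a per-residue value such as rmsd(rdc, j) as `sigma` instead of 0.3 Hz would correspond to the second error scenario.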

Selection of the set of RDCs for alignment tensor calculation

The RDC-based model-free analysis assumes that internal protein dynamics of the backbone NH vectors are not correlated with the alignment tensor modulations. This assumption allows working with a single average alignment tensor for each medium. Simulations indicated that this assumption is correct for secondary structure elements, at least for steric alignment (Louhivuori et al. 2006; Salvatella et al. 2008). Correlations between alignment tensor fluctuations and backbone NH vector dynamics have been observed only for more mobile loop regions in ubiquitin (Salvatella et al. 2008). Consequently, we excluded the most mobile residues from the alignment tensor calculation. However, reducing the number of residues from which the alignment tensor is determined may lead to an inhomogeneous sampling of the three principal axes and may amplify structural noise. Thus, a consensus set of RDCs had to be found, which provides a nearly complete sampling of orientations while still avoiding correlations between tensor modulations and internal protein dynamics.

To this end, we followed an approach similar to the one introduced by (Bouvignies et al. 2005b): In a first step, alignment tensors are calculated using the experimental data for all residues 2–72. (The highly flexible C-terminus of ubiquitin (residues 73–76) was always excluded.) Four iterations of the SCRM protocol were performed. Then, the 20% most mobile residues (\( S_{rdc,unscaled}^2 \le 0.95 \)) were excluded from the alignment tensor calculation and the SCRM analysis was re-started using the remaining set of residues for alignment tensor calculation. To ensure an adequate sampling of the three principal axes of the alignment tensor, an eigenvalue analysis of the matrix

$$ {\mathbf{C}} = {\mathbf{B}}^T {\mathbf{B}} $$
(1.9)

where \( {\mathbf{B}} = ({\mathbf{e}}_1 ,{\mathbf{e}}_2 , \ldots {\mathbf{e}}_N) \) is the 3 × N matrix containing the normalized NH vectors of the average structure, was performed and the diagonalized matrix D = (d(1),d(2),d(3)) was obtained that contains the three eigenvalues of C sorted according to magnitude (see Prompers and Brüschweiler 2002). It was ensured that the selected base of residues adequately samples the three tensor axes as described in the Supporting Information.
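A minimal sketch of such a sampling check is given below. It forms the 3 × 3 product of the vector matrix with its transpose, which has the same non-zero eigenvalues as the N × N product; the eigenvalues sum to N, and three similar values indicate that all spatial directions are sampled evenly. The acceptance threshold actually used is described in the Supporting Information and is not reproduced here; the function name is hypothetical.

```python
import numpy as np

def sampling_eigenvalues(nh_vectors):
    """Eigenvalues (descending) of the 3x3 orientation sampling matrix."""
    vectors = np.asarray(nh_vectors, dtype=float)
    B = (vectors / np.linalg.norm(vectors, axis=1, keepdims=True)).T  # 3 x N
    C = B @ B.T   # 3 x 3, same non-zero eigenvalues as the N x N product
    return np.sort(np.linalg.eigvalsh(C))[::-1]

# Even sampling: two vectors along each Cartesian axis
even = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [0, 3, 0], [0, 0, 1], [0, 0, -1]]
# Poor sampling: all vectors (anti)parallel to z
poor = [[0, 0, 1], [0, 0, -1], [0, 0, 2]]

d_even = sampling_eigenvalues(even)
d_poor = sampling_eigenvalues(poor)
print(d_even, d_poor)
```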

Determination of \( S_{overall} \)

Since experimental RDCs are scaled by internal motion and the alignment tensor is determined from experimental RDCs fitted to a rigid protein structure, the derived alignment tensor will be dynamically averaged and therefore reduced in size. Isotropic internal dynamics leaves the rhombicity and the orientation of the tensor with respect to the molecular frame unaffected but leads to a reduction of the overall magnitude (Lakomek et al. 2006; Meiler et al. 2001). Therefore, the true value of the principal tensor \( D_{i,zz} \) is not known and can only be estimated from the experimentally accessible dynamically averaged \( \tilde D_{i,zz} \). As has been shown previously (Meiler et al. 2001), \( \tilde D_{i,zz} \) is determined such that the average over the \( S_{rdc,unscaled}^2 \) equals 1 (when taking all residues for alignment tensor calculation), because the average dynamics is absorbed in the alignment tensor.

Therefore, the \( S_{rdc,unscaled}^2 \) provide only relative values for the dynamic amplitudes, but have to be scaled against the Lipari–Szabo order parameters \( S_{LS}^2 \) which contain absolute mobility information given a fixed distance between the amide proton and nitrogen and assuming a constant chemical shift anisotropy. The downscaling of \( S_{rdc,unscaled}^2 \) to \( S_{rdc}^2 \) is accompanied by an up-scaling of \( \tilde D_{i,zz} \) to \( D_{i,zz} \). As explained in detail in (Lakomek et al. 2006), the scaling factor \( S_{overall} \) therefore fulfills: \( \tilde D_{i,zz} = S_{overall} \cdot D_{i,zz} \) and \( S_{j,rdc} = S_{overall} \cdot S_{j,rdc,unscaled} \). Using this definition, Eq. (1.2) can be rewritten:

$$ \frac{{D_{ij}^{\exp } }}{{\tilde D_{i,zz} }} \cdot S_{overall} = \sum\limits_{M = - 2}^2 {F_{i,M} } \left\langle {Y_{2,M} } \right.(\theta _j^{mol} ,\phi _j^{mol} \left. ) \right\rangle $$
(1.10)

Subsequently, for clarity we will omit the indices i and j.

Solving the F-matrix equation yields \( \left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle _{\left( {scaled} \right)} = S_{overall} \cdot \left\langle {Y_{2,M} \left( {\theta ,\phi } \right)} \right\rangle _{\left( {unscaled} \right)} \) and finally \( S_{rdc}^2 \). It should be noted that \( S_{overall} \) is a mathematical parameter without a direct physical meaning. The only values that have physical meaning are the scaled order parameters \( S_{rdc}^2 \). Determining the scaling factor \( S_{overall} \) is non-trivial. Since RDC-based order parameters are sensitive up to the millisecond time scale while the Lipari–Szabo ones only up to the overall tumbling correlation time \( \tau _c \), the condition \( S_{rdc}^2 \le S_{LS}^2 \) or \( S_{overall}^2 \cdot S_{rdc,unscaled}^2 \le S_{LS}^2 \) must hold within experimental error. This relationship is used to estimate the overall scaling factor \( S_{overall} \) by requiring \( S_{overall}^2 \le S_{LS}^2 /S_{rdc,unscaled}^2 \) within the experimental error of \( S_{rdc}^2 \) and \( S_{LS}^2 \). It is further assumed that several residues do not show supra-\( \tau _c \) motion resulting in identical \( S_{rdc}^2 \) and \( S_{LS}^2 \) for those residues. Lipari–Szabo order parameters \( S_{LS}^2 \) measured at 308 K by Tjandra and co-workers were used (Chang and Tjandra 2005). Details can be found in the Supporting Information.
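The scaling condition can be illustrated with a deliberately simplified sketch. It takes the largest \( S_{overall}^2 \) that satisfies \( S_{overall}^2 \cdot S_{rdc,unscaled}^2 \le S_{LS}^2 \) for every residue and ignores experimental errors; the actual procedure, including the treatment of errors, is given in the Supporting Information and may differ. The input values are invented for illustration and the function name is hypothetical.

```python
import numpy as np

def estimate_s_overall(s2_ls, s2_rdc_unscaled):
    """Largest S_overall with S_overall^2 * S2_unscaled <= S2_LS for all residues.

    Simplifying assumption: experimental errors are ignored.
    """
    ratios = np.asarray(s2_ls, dtype=float) / np.asarray(s2_rdc_unscaled, dtype=float)
    return float(np.sqrt(ratios.min()))

# Invented example values for three residues
s2_ls = np.array([0.85, 0.80, 0.90])
s2_unscaled = np.array([1.00, 1.05, 0.95])

s_overall = estimate_s_overall(s2_ls, s2_unscaled)
s2_rdc = s_overall ** 2 * s2_unscaled   # scaled RDC-based order parameters
print(round(s_overall, 4), np.round(s2_rdc, 4))
```

In this toy case the second residue defines the scaling, so its scaled \( S_{rdc}^2 \) coincides with its \( S_{LS}^2 \), mirroring the assumption that some residues show no supra-\( \tau _c \) motion.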

Application to experimental data

The SCRM method has been applied to both experimental NH RDC data sets D36M and D23M using the X-ray structure 1ubi (Ramage et al. 1994) as starting structure (with protons added in standard positions with MOLMOL (Koradi et al. 1996) using a bond length of 1.04 Å). The influence of structural noise on the SCRM analysis was tested as described in the following paragraph.

Structural noise analysis

The influence of structural noise on the SCRM approach was tested for two different scenarios, A and B:

In scenario A, synthetic Gaussian noise was added to the NH vector orientations of the X-ray structure 1ubi (with hydrogen atoms added according to standard geometry). Using PALES (Zweckstetter and Bax 2000), each NH vector was tilted by an opening angle θ drawn from a Gaussian distribution, with a uniformly distributed polar angle \( \phi \), as described in (Zweckstetter and Bax 2002). For the standard deviation of the Gaussian distribution, values of (a) \( \sigma \) = 10°, (b) \( \sigma \) = 20° or (c) \( \sigma \) = 30° were chosen, subsequently referred to as structural noise of 10°, 20° or 30°, respectively. For each case (a)–(c), three different random noise structures were generated. These random noise structures were used as starting structures for the alignment tensor calculation in the SCRM analysis.
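The kind of perturbation used in scenario A can be mimicked with a short NumPy sketch. It is not the PALES routine: it assumes the tilt angle θ is drawn from a zero-mean normal distribution with standard deviation σ and that the direction of the tilt around the original vector is uniform in [0, 2π). The function name is hypothetical.

```python
import numpy as np

def perturb_vector(e, sigma_deg, rng):
    """Tilt unit vector e by a Gaussian angle about a uniformly random direction."""
    e = np.asarray(e, dtype=float)
    e = e / np.linalg.norm(e)
    theta = np.deg2rad(rng.normal(scale=sigma_deg))   # opening angle
    phi = rng.uniform(0.0, 2.0 * np.pi)               # direction of the tilt
    # Build two unit vectors perpendicular to e and to each other
    helper = np.array([1.0, 0.0, 0.0]) if abs(e[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(e, helper)
    u /= np.linalg.norm(u)
    v = np.cross(e, u)
    return np.cos(theta) * e + np.sin(theta) * (np.cos(phi) * u + np.sin(phi) * v)

rng = np.random.default_rng(42)
e0 = np.array([0.0, 0.0, 1.0])
noisy = np.array([perturb_vector(e0, 10.0, rng) for _ in range(2000)])
tilts = np.degrees(np.arccos(np.clip(noisy @ e0, -1.0, 1.0)))
rms_tilt = np.sqrt(np.mean(tilts ** 2))
print(f"rms tilt: {rms_tilt:.2f} degrees")
```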

For scenario B we used nine crystal structures of ubiquitin bound to its recognition proteins as input. These structures deviate from the free ubiquitin crystal structure 1ubi by backbone RMSD values between 0.3 and 0.6 Å. These structures are 1cmx (Johnston et al. 1999), 1uzx (Teo et al. 2004), 1xd3 (Misaghi et al. 2005), 1yiw (Bang et al. 2005), 2c7n (Penengo et al. 2006), 2d3g (two structures, (Hirano et al. 2006)), 2fif (two structures, (Lee et al. 2006)).

For both scenarios, the RDC-based order parameters \( S_{rdc}^2 \) are compared to those derived from the “noise-free” 1ubi X-ray structure to analyze the influence of structural noise.

Statistical analysis of \( S_{rdc}^2 \) and \( S_{LS}^2 \) distributions

We describe the spread of the \( S_{rdc}^2 \) and \( S_{LS}^2 \) distributions over all residues of ubiquitin in terms of percentiles. The 25th percentile P25 is the value below which 25% of the distribution lies. For the 75th percentile P75, 75% of the distribution has lower values. The interquartile range (IQR) is defined as the difference between P75 and P25. The IQR covers the central 50% of the distribution and is a direct measure of its spread.
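These quantities are straightforward to compute; a minimal sketch with NumPy follows. The input values are synthetic, and NumPy's default linear interpolation between data points is an implementation detail that may differ from the convention used for the reported numbers.

```python
import numpy as np

def percentile_spread(values):
    """Return (P25, P75, IQR) of a one-dimensional distribution."""
    p25, p75 = np.percentile(np.asarray(values, dtype=float), [25, 75])
    return p25, p75, p75 - p25

# Synthetic "order parameters" evenly spaced between 0 and 1
values = np.arange(0, 101) / 100.0
p25, p75, iqr = percentile_spread(values)
print(p25, p75, iqr)
```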

Results and discussion

SCRM on experimental NH RDC data (D23M and D36M)

The SCRM method was applied to both NH RDC experimental data sets D23M and D36M using the X-ray structure 1ubi as starting input structure for the first cycle of the SCRM method.

The static X-ray structure 1ubi (with hydrogen atoms added according to standard geometry) yields static Q-values of \( \left\langle {Q_{static} } \right\rangle \) = 0.178 for D23M and \( \left\langle {Q_{static} } \right\rangle \) = 0.193 for D36M averaged over all alignment conditions. The Pearson correlation coefficients between experimental RDCs and those back-calculated from the static X-ray structure are \( \left\langle \rho \right\rangle \) = 0.977 for D23M and \( \left\langle \rho \right\rangle \) = 0.972 for D36M respectively, on average over all conditions.

As described in the Materials and methods section, the SCRM method was designed to iteratively improve the accuracy of the alignment tensor determination and to adjust the average inter-nuclear vector orientations, and as a result, to further reduce the static Q-values \( \left\langle {Q_{static} } \right\rangle \) and increase the Pearson correlation coefficient \( \rho \). Indeed, after only 4 SCRM cycles, the static Q-values decreased to less than half of the original value, with \( \left\langle {Q_{static} } \right\rangle \) = 0.062 for both D23M and D36M (Fig. 2a, b). Simultaneously, the correlation coefficients \( \rho \) increased to \( \left\langle \rho \right\rangle \) = 0.997 on average (Fig. 2c, d). Convergence was attained within 4 cycles of SCRM, after which the inter-nuclear vector orientations \( \left( {\phi _{av} ,\theta _{av} } \right) \) were found to deviate by less than 0.5° between consecutive SCRM cycles (Figure S2a and S2b in the Supporting Information). Thus the iterative procedure rapidly improves the fit of the static structure to the RDCs as compared to the input X-ray structure.

Fig. 2

(a) and (b) show Q-values for back-calculated RDCs using the X-ray structure 1ubi for alignment tensor determination (dashed line) and after 4 SCRM cycles using the fitted dynamic average NH vector orientations (black line) both for (a) D23M and (b) D36M. The fit of inter-nuclear vector orientations and determined alignment tensor to the experimental data is improved significantly: starting from \( \left\langle Q \right\rangle \) = 0.178 for D23M and \( \left\langle Q \right\rangle \) = 0.193 for D36M on average, the Q-values decrease to \( \left\langle Q \right\rangle \) = 0.062 for both D23M and D36M after 4 SCRM-cycles. (c) and (d) same as (a) and (b) but for \( \left\langle \rho \right\rangle \) instead of Q-values. Starting from \( \left\langle \rho \right\rangle \) = 0.977 for D23M and \( \left\langle \rho \right\rangle \) = 0.972 for D36M on average, \( \left\langle \rho \right\rangle \) improves to \( \left\langle \rho \right\rangle \) = 0.997 after 4 SCRM cycles. (e) and (f): The inter-nuclear angles \( \kappa _j \) enclosed between the dynamic average NH vector orientations and the NH vectors of the starting X-ray structure 1ubi are shown. The average angular deviation is 6.97° for D23M and 6.87° for D36M. (g) and (h): Same as for e and f, but compared to the 1d3z NMR structures. The average deviation to the NMR structure is 4.84° for D23M and 4.52° for D36M. Thus, the agreement between the derived dynamic average NH vector orientations and the NMR structure is significantly better than for the 1ubi structure

In Fig. 2(e, f) the inter-nuclear angles \( \kappa _j \) enclosed between the dynamic average NH vector orientations and the NH vectors of the starting X-ray structure 1ubi are shown. For better comparison, the dynamic average NH vector orientations have been rotated to a best-fit superposition with the NH vectors of the 1ubi structure (with protons added according to standard geometry).
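Once the two sets of NH vectors are superimposed, \( \kappa _j \) is simply the angle enclosed between two unit vectors. A minimal sketch follows; the function names are hypothetical, the superposition step is assumed to have been done beforehand, and the conversion assumes a polar angle \( \theta \) and an azimuth \( \phi \) given in degrees.

```python
import math

def unit_vector(theta_deg, phi_deg):
    """Cartesian unit vector from polar angle theta and azimuth phi (degrees)."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

def kappa_deg(u, v):
    """Angle in degrees enclosed between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_kappa = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.degrees(math.acos(cos_kappa))

# Hypothetical orientations: same azimuth, polar angles differing by 7 degrees
nh_scrm = unit_vector(40.0, 10.0)
nh_xray = unit_vector(47.0, 10.0)
print(f"kappa = {kappa_deg(nh_scrm, nh_xray):.1f} deg")
```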

Most of the dynamic average inter-nuclear vector orientations obtained after 4 SCRM cycles differ from those of the 1ubi X-ray structure by less than 10° for \( \kappa _j \) (Fig. 2e, f). The average angular deviation is 6.97° for D23M and 6.87° for D36M. Deviations larger than 10° are observed for Lys6, Lys11, Ile13, Ser20, Lys33, Glu34, Arg42, Lys48, Leu50, Asp52, Arg54, Leu67, Val70 and Leu71 for D23M. The largest deviation is 20.3° for Arg54 (compare Fig. 2e, f). Most of these residues are highly mobile with \( S_{rdc}^2 (NH) \) < 0.7. Exceptions are Lys6 in the first β-strand, Lys33 and Glu34 in the α-helix, Leu43 in the third β-strand and Leu67 and Val70 in the fifth β-strand. Although \( S_{rdc}^2 (NH) \) > 0.7 for Lys33 and Glu34, they appear relatively mobile compared to the surrounding residues in the α-helix. Values for all residues are listed in the Supporting Information in Table S3 for D23M and Table S4 for D36M. The derived dynamic average NH orientations have also been compared to the first structure of the 1d3z NMR ensemble (Cornilescu et al. 1998) (Fig. 2g, h). The average deviation to the NMR structure is 4.84° for D23M and 4.52° for D36M. Thus, the agreement between the derived dynamic average NH vector orientations and the NMR structure is significantly better than for the 1ubi structure. Interestingly, most of those dynamic average NH vectors that showed the largest deviations to the 1ubi X-ray structure, for example Lys6, Leu67 and Val70, did not show large deviations compared to the NMR structure. Only Lys11 and Asp52 show large discrepancies to both the X-ray and the NMR structure. Both are highly dynamic. In the 1ubi structure Lys6, Lys48 and Arg54 appear to be affected by crystal packing. Indeed, the largest deviation between 1ubi and 1d3z is observed for Arg54 with \( \kappa \) = 22.4°; the deviations are also high for Lys48 and Lys6, with \( \kappa \) = 9.3° and 12.3°, respectively.
A tendency was observed that NH vectors involved in hydrogen bonds became more collinear with the electron-donating carbonyl groups upon application of SCRM. Considering only changes of the NH vector orientation greater than 3°, 15 out of 23 backbone amide groups became more parallel to the carbonyl group.

For comparison, the SCRM analysis was repeated using the 1d3z NMR structure as starting structure. As expected, the results are almost identical and corroborate the robustness of the SCRM approach. The resulting \( S_{rdc}^2 (NH) \) and \( \kappa _j \) are listed in the Supporting Information as well, in Table S3 for D23M and Table S4 for D36M (compare also Figure S2c and S2d).

In parallel with the improvement of the static Q-values \( \left\langle {Q_{static} } \right\rangle \) and correlation coefficients \( \rho \), the RDC-based order parameters \( S_{rdc}^2 \) also converged after 4 cycles of SCRM (Fig. 3a, b).

Fig. 3

(a) and (b): The average difference \( \Updelta \) of RDC-based \( S_{rdc}^2 (NH) \) order parameter between subsequent SCRM cycles for (a) D23M and (b) D36M is shown: \( \left| {S_{rdc,i}^2 - S_{rdc,i - 1}^2 } \right| = \frac{1}{n}\sum\nolimits_{j = 1}^n {\left| {S_{rdc,i}^2 (NH_j ) - S_{rdc,i - 1}^2 (NH_j )} \right|} \). RDC-based order parameters \( S_{rdc}^2 \) have converged after 4 cycles of SCRM with less than 0.01 difference \( \left| {S_{rdc,i}^2 - S_{rdc,i - 1}^2 } \right| \) between subsequent SCRM cycles. (c) and (d): Residue-specific RDC-rmsd values \( rmsd(rdc,j) \) are shown for (c) D23M and (d) D36M after 4 SCRM cycles. For D23M, the average RDC-rmsd is \( \left\langle {rmsd(rdc,j)} \right\rangle \) = 0.28 Hz and 0.52 Hz for D36M. (e) and (f): Dynamic Q-values \( Q_{dyn} \) for the different alignment conditions are back-calculated, for D23M the average dynamic Q-value is \( \left\langle {Q_{dyn} } \right\rangle \) = 0.027 and for D36M \( \left\langle {Q_{dyn} } \right\rangle \) = 0.037
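The convergence measure defined in the caption above, the mean absolute change of \( S_{rdc}^2 (NH_j) \) between subsequent cycles, can be sketched as follows. The function names and the order parameter values are hypothetical; the threshold of 0.01 is the one quoted in the caption.

```python
def mean_abs_change(s2_current, s2_previous):
    """Mean absolute per-residue change of S2_rdc between two SCRM cycles."""
    diffs = [abs(a - b) for a, b in zip(s2_current, s2_previous)]
    return sum(diffs) / len(diffs)

def has_converged(s2_current, s2_previous, tol=0.01):
    """True if the mean change between cycles is below the tolerance."""
    return mean_abs_change(s2_current, s2_previous) < tol

# Hypothetical S2_rdc values for three residues in cycles i-1 and i
cycle_prev = [0.79, 0.72, 0.60]
cycle_curr = [0.80, 0.70, 0.60]
print(mean_abs_change(cycle_curr, cycle_prev), has_converged(cycle_curr, cycle_prev))
```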

A more specific measure of the fit of the SCRM results to the experimental RDCs is provided by the residue-specific RDC-rmsd values, which can be back-calculated from the model-free derived dynamically averaged second order spherical harmonics (compare Fig. 3c, d). For D23M, the average RDC-rmsd was strongly reduced to \( \left\langle {rmsd(rdc,j)} \right\rangle \) = 0.28 Hz after four SCRM cycles, compared to \( \left\langle {rmsd(rdc,j)} \right\rangle \) = 0.52 Hz for D36M.
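A residue-specific RDC-rmsd can be sketched as below. This assumes that \( rmsd(rdc,j) \) is the root-mean-square deviation between experimental and back-calculated RDCs of residue j taken over all alignment conditions; the exact normalization used in the Material and methods section may differ, and the function name and values are hypothetical.

```python
import math

def residue_rmsd(d_exp_j, d_calc_j):
    """RMSD (Hz) between experimental and back-calculated RDCs of one residue,
    taken over all alignment conditions (assumed definition)."""
    n = len(d_exp_j)
    return math.sqrt(sum((e - c) ** 2 for e, c in zip(d_exp_j, d_calc_j)) / n)

# Hypothetical RDCs (Hz) of one NH group in four alignment conditions
d_exp_j = [4.0, -2.0, 7.5, 0.5]
d_calc_j = [4.3, -2.3, 7.2, 0.8]
print(f"rmsd(rdc, j) = {residue_rmsd(d_exp_j, d_calc_j):.2f} Hz")
```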

To estimate the remaining inhomogeneity in the data, we added Gaussian noise to the noise-free back-calculated RDCs until the SECONDA gap reached 6.8, the value found for D23M (see Supporting Information for details). That analysis yields an estimated inhomogeneity for the D23M dataset of 0.22 Hz.

Since removal of only 13 data sets is not expected to reduce the RDC-rmsd by almost a factor of 2, this result indicates that the resulting set D23M is more homogeneous, consistent with the SECONDA analysis. As mentioned above, SECONDA homogeneity is compatible neither with significant structural changes induced by the alignment media nor with significant correlation of the vector fluctuations and the alignment tensor. Thus, for D23M the error introduced by ignoring a possible correlation between internal protein dynamics and alignment tensor fluctuation is small. Furthermore, the use of a single dynamically averaged alignment tensor does not seem to introduce a considerable error. Significant deviations of \( rmsd(rdc,j) \) values from the average are observed mainly for loop regions, indicating a possible correlation between internal dynamics and alignment tensor fluctuations for these regions, in agreement with Salvatella et al. (2008). Possible complications as addressed by Louhivuori et al. (2007) are thus unlikely for the alignment conditions in the D23M subset.

For both D23M and D36M the resulting RDC-based order parameters \( S_{rdc}^2 \) are identical within the error, with very few exceptions (Gly35, Lys63 and Leu71; Fig. 4c). The correlation coefficient between the \( S_{rdc}^2 \) derived from the two data sets D23M and D36M is \( \rho \) = 0.945. The dynamic average vectors derived from D23M and from D36M agree very well, with an average enclosed inter-nuclear angle \( \kappa \) of 1.4° (compare Supporting Information Figure S2e). A higher deviation is observed for Gly35 and Asp52, which also show a higher discrepancy of the \( S_{rdc}^2 \).

Fig. 4

RDC-based \( S_{rdc}^2 (NH) \) order parameters (red and blue) scaled according to the method described in the supplement are compared to the Lipari–Szabo \( S_{LS}^2 (NH) \) (black) for (a) D23M and (b) D36M. Both error bars for \( \sigma _j^{\exp } \) = 0.3 Hz and \( \sigma _j^{rmsd} = rmsd(rdc,j) \) are indicated as horizontal lines. While for some residues \( S_{rdc}^2 \) and \( S_{LS}^2 \) have almost equal values, for others, mainly in loop regions but also in secondary structure elements, \( S_{rdc}^2 \) values are significantly lower. The average RDC-based order parameter is \( \left\langle {S_{rdc}^2 } \right\rangle \) = 0.72 \( \pm \) 0.02 for D23M and \( \left\langle {S_{rdc}^2 } \right\rangle \) = 0.72 \( \pm \) 0.02 for D36M compared to \( \left\langle {S_{LS}^2 } \right\rangle \) = 0.778 ± 0.003 for the Lipari–Szabo order parameter. (c) RDC-based order parameters \( S_{rdc}^2 (NH) \) derived from D23M (red) and D36M (blue) are compared. Both data sets D36M and D23M give \( S_{rdc}^2 (NH) \) that are identical within the error, with very few exceptions for Gly35, Lys63 and Leu71. The correlation coefficient is \( \rho \) = 0.945. (d) and (e): Comparison of (d) \( S_{rdc}^2 (NH) \) and (e) \( S_{LS}^2 (NH) \) order parameter distributions. The 25th percentile of the \( S_{rdc}^2 \) distribution is P25 = 0.68 and the 75th percentile is P75 = 0.80 for D23M, giving an interquartile range (P25 to P75) of IQR = 0.12. (Identical values are obtained for D36M.) In contrast, the distribution of Lipari–Szabo order parameters \( S_{LS}^2 \) is 2.4 times narrower with P25 = 0.78, P75 = 0.83 and an interquartile range of IQR = 0.05. For the RDC-based order parameter \( S_{rdc}^2 \) the IQR is more than double that for the Lipari–Szabo \( S_{LS}^2 \), showing that the RDC-based order parameters detect a much wider range of mobility.

A second measure of the fit of the SCRM results to the experimental data is provided by the dynamic Q-values \( Q_{dyn} \). These were obtained from the correlation of the experimental data with the RDCs back-calculated from the model-free derived dynamically averaged second order spherical harmonics \( \left\langle {Y_{2,M} (\theta _j^{mol} ,\phi _j^{mol} )} \right\rangle \) in the different alignment conditions (Fig. 3e, f). For D23M the average dynamic Q-value is \( \left\langle {Q_{dyn} } \right\rangle \) = 0.027 and for D36M \( \left\langle {Q_{dyn} } \right\rangle \) = 0.037, which indicates that the SCRM results agree very well with the experimental RDCs. Conceptually, the RDC-based model-free method resembles a residue-wise least-squares fit to the experimental RDCs. Thus the dynamic Q-values indicate the best-fit solution of a restraint-free minimization of the second order spherical harmonics \( \left\langle {Y_{2,M} (\theta _j^{mol} ,\phi _j^{mol} )} \right\rangle \) against the experimental RDCs. In terms of a possible RDC-based molecular dynamics ensemble refinement, a minimization of the NH vector orientation without additional force-field restraints yields a distribution of NH vector orientations that resembles the SCRM-derived results and order parameters.

Determination of \( S_{overall} \)

A rigorous statistical analysis shows that \( S_{overall} \) is smaller than 0.89 with a confidence level (see Supporting Information) of 95% using only the experimental error \( \sigma _j^{\exp } \) of 0.3 Hz. The confidence level drops to 67% using the total error \( \sigma _j^{rmsd} \). This result is consistent with an independent statistical analysis based on a hypothesis test (see Supporting Information). The \( S_{overall} \) found in this work deviates from our previous analysis (\( S_{overall} \) = 0.83) (Lakomek et al. 2006) because the most mobile NH amide groups have now been excluded from the alignment tensor calculation, resulting in a smaller downscaling of \( D_{i,zz} \) caused by isotropic internal dynamics. Consequently, \( S_{overall} \) is expected to be larger than the value of 0.83 obtained in the previous analysis (Lakomek et al. 2006). Note that the average \( \left\langle {S_{rdc}^2 } \right\rangle \) = 0.72 is the same in this and the previous analysis (vide infra).

Analysis of \( S_{rdc}^2 \) order parameter distribution shows supra-\( \tau _c \) motion

In Fig. 4(a, b) the derived NH RDC-based order parameters \( S_{rdc}^2 (NH) \) for D23M and D36M are compared to the Lipari–Szabo \( S_{LS}^2 (NH) \) order parameters. The derived \( S_{rdc}^2 (NH) \) order parameters are listed in the supporting information. While for some residues \( S_{rdc}^2 \) and \( S_{LS}^2 \) are very similar, for others, mainly in loop regions but also in secondary structure elements, significantly lower \( S_{rdc}^2 \) values are observed. The average RDC-based order parameter is \( \left\langle {S_{rdc}^2 } \right\rangle \) = 0.72 ± 0.02 for D23M and for D36M compared to \( \left\langle {S_{LS}^2 } \right\rangle \) = 0.778 ± 0.003 for the Lipari–Szabo order parameter. The order parameter \( \left\langle {S_{LS}^2 } \right\rangle \) is a measure for the remaining rigidity in the sub-\( \tau _c \) window. For the mobility in that window, \( 1 - \left\langle {S_{LS}^2 } \right\rangle \) is the appropriate measure. Similarly, the supra-\( \tau _c \) mobility is measured by

$$ 1 - \left\langle {S_{{\hbox{supra}}\hbox{-}\tau _c }^2 } \right\rangle = 1 - \frac{{\left\langle {S_{rdc}^2 } \right\rangle }}{{\left\langle {S_{LS}^2 } \right\rangle }}. $$

Accordingly, inclusion of the supra-\( \tau _c \) window increases the averaged amplitude of mobility observed in the sub-\( \tau _c \) window by:

$$ \frac{{\text{supra-}\tau _c \text{ mobility}}}{{\text{sub-}\tau _c \text{ mobility}}} = \frac{{1 - \left\langle {S_{\text{supra-}\tau _c }^2 } \right\rangle }}{{1 - \left\langle {S_{\text{sub-}\tau _c }^2 } \right\rangle }} = \frac{{1 - \frac{{\left\langle {S_{rdc}^2 } \right\rangle }}{{\left\langle {S_{LS}^2 } \right\rangle }}}}{{1 - \left\langle {S_{LS}^2 } \right\rangle }} = 34\%. $$
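The 34% follows directly from the two averages quoted above, \( \left\langle {S_{rdc}^2 } \right\rangle \) = 0.72 and \( \left\langle {S_{LS}^2 } \right\rangle \) = 0.778. A short numerical check (variable names are illustrative only):

```python
s2_rdc = 0.72   # average RDC-based order parameter (D23M and D36M)
s2_ls = 0.778   # average Lipari-Szabo order parameter

supra_mobility = 1.0 - s2_rdc / s2_ls   # mobility in the supra-tau_c window
sub_mobility = 1.0 - s2_ls              # mobility in the sub-tau_c window
ratio = supra_mobility / sub_mobility

print(f"supra: {supra_mobility:.3f}, sub: {sub_mobility:.3f}, ratio: {ratio:.0%}")
```

The supra-\( \tau _c \) mobility of about 0.075 is thus roughly one third of the sub-\( \tau _c \) mobility of 0.222.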

For D23M, N = 57 \( S_{rdc}^2 \) order parameters were derived, and N = 62 for D36M. N = 49 Lipari–Szabo order parameters \( S_{LS}^2 \) were available. The correlation coefficient between \( S_{rdc}^2 \) and \( S_{LS}^2 \) is \( \rho \) = 0.45 for D23M (and \( \rho \) = 0.41 for D36M).

For both data sets, the \( S_{rdc}^2 \) order parameters show a significantly broader spread than those derived from relaxation, \( S_{LS}^2 \). With an interquartile range of IQR = 0.12 for D23M (and 0.12 for D36M) compared to IQR = 0.05 for the Lipari–Szabo \( S_{LS}^2 \) order parameters, the distribution of \( S_{rdc}^2 \) order parameters is 2.4 times wider than that of \( S_{LS}^2 \) (Fig. 4d, e). In conclusion, the RDC-based order parameters sample additional motion beyond \( \tau _c \). The fact that all \( S_{rdc}^2 \) must be smaller than the \( S_{LS}^2 \), together with the much wider spread of the \( S_{rdc}^2 \) distribution, leads on average to lower RDC-based order parameters \( S_{rdc}^2 \).
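The width comparison can be reproduced from the reported percentiles. The sketch below also includes a generic percentile helper for use on a full list of order parameters; it uses linear interpolation between order statistics, which is an assumption, since the percentile convention used for the paper's values is not stated here.

```python
import math

def percentile(values, p):
    """p-th percentile by linear interpolation between order statistics."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

def iqr(values):
    """Interquartile range P75 - P25."""
    return percentile(values, 75) - percentile(values, 25)

# Percentiles reported in the text (D23M and Lipari-Szabo)
iqr_rdc = 0.80 - 0.68
iqr_ls = 0.83 - 0.78
width_ratio = iqr_rdc / iqr_ls
print(f"IQR(rdc) = {iqr_rdc:.2f}, IQR(LS) = {iqr_ls:.2f}, ratio = {width_ratio:.1f}")
```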

Supra-\( \tau _c \) motion is observed mainly in loop regions (residues 7–11, 20, 36–40, 46–47, 50–56, 60–65, 72–76), but also for several residues in secondary structure elements (residues 2–6, 12–17, 22–35, 41–45, 48–49, 57–59, 66–71). The average RDC-based order parameter for loop regions is \( \left\langle {S_{rdc,loop}^2 } \right\rangle \) = 0.66 ± 0.04 from D23M and \( \left\langle {S_{rdc,loop}^2 } \right\rangle \) = 0.65 ± 0.04 from D36M. These values are about 10% smaller than the Lipari–Szabo value of \( \left\langle {S_{LS,loop}^2 } \right\rangle \) = 0.72 ± 0.04 for the loop regions. For secondary structure elements the average RDC-based order parameter is \( \left\langle {S_{rdc,\sec }^2 } \right\rangle \) = 0.77 ± 0.01 (both for D23M and D36M), still about 5% smaller than \( \left\langle {S_{LS,\sec }^2 } \right\rangle \) = 0.81 ± 0.01. The presence of supra-\( \tau _c \) motion in secondary structure elements is emphasized by comparing the 25th and 75th percentiles, P25 = 0.72 and P75 = 0.82, of the RDC-based order parameters in secondary structure elements \( \left\langle {S_{rdc,\sec }^2 } \right\rangle \) derived from D23M (P25 = 0.75 and P75 = 0.83 in the case of D36M) to P25 = 0.79 and P75 = 0.85 for the Lipari–Szabo ones \( \left\langle {S_{LS,\sec }^2 } \right\rangle \). Table 2 lists all parameters describing the \( \left\langle {S_{rdc}^2 } \right\rangle \) distribution and the \( \left\langle {S_{LS}^2 } \right\rangle \) distribution.

Table 2 Statistics on the RDC-based order parameters \( S_{rdc}^2 (NH) \): (a) derived from D23M; (b) derived from D36M

Interestingly, an alternating pattern of \( S_{rdc}^2 (NH) \) order parameters was extracted for residues Lys48 to Leu50 in the 4th β-strand. The backbone of Lys48, whose side chain is known to be involved in the poly-ubiquitination process that leads to protein trafficking and degradation, appears very mobile with an order parameter of \( S_{rdc}^2 (NH) \) = 0.59 ± 0.03 for D23M (\( S_{rdc}^2 (NH) \) = 0.58 ± 0.07 for D36M) compared to \( S_{LS}^2 (NH) \) = 0.82 for the Lipari–Szabo value.

Other alternating patterns of \( S_{rdc}^2 (NH) \) in β-sheets like Gln41 to Phe45 that have been described before (Lakomek et al. 2005) are reproduced in this analysis for D23M, however with reduced amplitude. The same alternating pattern is observed weakly also for Lipari–Szabo order parameters \( S_{LS}^2 (NH) \) (Chang and Tjandra 2005). These findings are consistent with our earlier analyses (Lakomek et al. 2005), independent findings for protein G using the 3D-GAF analysis (Bouvignies et al. 2005b) and even earlier results by Palmer and co-workers for Ribonuclease H (Mandel et al. 1995; Mandel et al. 1996) and Fibronectin type III (Carr et al. 1997) using relaxation methods. A correlation between backbone mobility and side-chain orientation has recently also been extracted from ultra high-resolution X-ray structures (Davis et al. 2006).

Focus on supra-\( \tau _c \) motion

To distinguish supra-\( \tau _c \) motion from sub-\( \tau _c \) motion, the distribution of \( S_{rdc}^2 /S_{LS}^2 (NH) \) was analyzed along the amino acid sequence of ubiquitin (Table 3). For residues with solvent-exposed side chains, the backbone amide groups appear more mobile, while residues with side chains pointing to the hydrophobic core of the protein appear more rigid in the protein backbone, in agreement with Lakomek et al. (2005). The analysis was performed in the same way as presented there, using a very simple two-state model: all residues with a solvent accessibility of less than 11.5% were considered core residues, all others solvent-exposed. Solvent accessibility was calculated using MOLMOL (Koradi et al. 1996). The average \( S_{rdc}^2 /S_{LS}^2 (NH) \) value is 0.90 ± 0.02 for solvent-exposed residues and 0.93 ± 0.03 for core residues in the case of D23M (and 0.90 ± 0.02 and 0.93 ± 0.02 for D36M), which reveals a tendency of core residues to be more rigid. The 25th percentile is P25 = 0.81 for the class of solvent-exposed residues and P25 = 0.87 for the core residues for D23M (and P25 = 0.81 and P25 = 0.92 for D36M). This indicates a tendency for residues with solvent-exposed side chains to be more mobile in the protein backbone than those with side chains pointing towards the hydrophobic core (Lakomek et al. 2005).
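The two-state classification can be sketched as follows. Only the 11.5% threshold is taken from the text; the function names, the data layout and the example values are hypothetical, and the solvent accessibilities are assumed to have been computed beforehand (e.g. with MOLMOL).

```python
def classify(accessibility_percent, threshold=11.5):
    """Two-state model: 'core' below the accessibility threshold, else 'exposed'."""
    return "core" if accessibility_percent < threshold else "exposed"

def class_means(residues, threshold=11.5):
    """Mean S2_rdc / S2_LS per class.

    residues: iterable of (accessibility_percent, s2_rdc, s2_ls) tuples.
    """
    groups = {"core": [], "exposed": []}
    for accessibility, s2_rdc, s2_ls in residues:
        groups[classify(accessibility, threshold)].append(s2_rdc / s2_ls)
    return {name: sum(vals) / len(vals) for name, vals in groups.items() if vals}

# Hypothetical residues: (solvent accessibility in %, S2_rdc, S2_LS)
example = [(2.0, 0.80, 0.80), (5.0, 0.72, 0.80), (30.0, 0.64, 0.80), (55.0, 0.72, 0.80)]
means = class_means(example)
print(means)
```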

Table 3 Statistics on the RDC-based order parameters \( S_{rdc}^2 /S_{LS}^2 (NH) \): (a) derived from D23M; (b) derived from D36M which describe the supra-\( \tau _c \) contribution to mobility

The dependence of \( S_{rdc}^2 /S_{LS}^2 (NH) \) values on the number of hydrogen bonds on the corresponding peptide plane (including the amino group NH(i) and the preceding carbonyl group CO(i-1)) is analyzed in the same way as in (Lakomek et al. 2005). Peptide planes that are not involved in a hydrogen bond appear more mobile than those that are hydrogen-bonded: The average RDC-based order parameter is \( S_{rdc}^2 /S_{LS}^2 (NH) \) = 0.81 ± 0.05 for D23M (0.82 ± 0.05 for D36M) when the peptide plane is not involved in hydrogen bonds, compared to 0.92 ± 0.02 (D23M and D36M) when the peptide plane is involved in at least one hydrogen bond. For details, see Table 3.

Comparison to previous analyses

The correlation coefficient between the \( S_{rdc}^2 (NH) \) derived in this analysis and the previous one (Lakomek et al. 2006) is \( \rho \) = 0.80 for D23M and \( \rho\) = 0.82 for D36M. Both analyses yield an average \( \left\langle {S_{rdc}^2 } \right\rangle \) of 0.72, which underlines the presence of motion beyond the overall tumbling correlation time \( \tau _c \). These results also highlight that it is important to remove the possible bias introduced by the structure used for the tensor calculation. In the previous analysis (Lakomek et al. 2006) some outliers were present, for which \( S_{rdc}^2 (NH) \) order parameters were larger than the corresponding Lipari–Szabo ones \( S_{LS}^2 (NH) \). These were Leu8, Asp32, Gln49 and Ser57. In the SCRM analysis these residues show \( S_{rdc}^2 (NH) \) values lower than the corresponding \( S_{LS}^2 (NH) \) and are less conspicuous. These previous outliers are attributed to the influence of structural noise. As described in the next paragraph, the new SCRM method can efficiently avoid such outliers.

SCRM analysis is robust against the influence of structural noise

For synthetic structural noise added to the starting structure for the SCRM analysis (see Materials and methods, scenario A), the resulting \( S_{rdc}^2 (NH) \) order parameters (using D36M) after 4 SCRM cycles are in excellent agreement with those obtained using the noise-free structure 1ubi, both for 10° and 20° Gaussian noise, as seen in Fig. 5(a, b). Even for 30° structural noise the agreement is reasonably good (see Fig. 5c).

Fig. 5

SCRM-derived \( S_{rdc}^2 (NH) \) after addition of (a) 10°, (b) 20° and (c) 30° synthetic structural noise to the 1ubi X-ray structure used as starting structure for alignment tensor calculation. The agreement between the calculated \( S_{rdc}^2 (NH) \) and those derived using the noise-free structure (black) shows that SCRM is robust against the influence of structural noise. (d) and (e): \( S_{rdc}^2 (NH) \) order parameters are derived for (d) D23M and (e) D36M using the original RDC-based model-free analysis and nine different structures of ubiquitin bound in several complexes as starting structure for the analysis. The different structures have a backbone rmsd between 0.3 and 0.6 Å to the free X-ray structure 1ubi (Ramage et al. 1994) (black line) and serve as a test case for severe structural noise. (f) and (g): \( S_{rdc}^2 (NH) \) order parameters are derived for (f) D23M and (g) D36M using the new SCRM method and the same nine different structures of ubiquitin (free X-ray structure 1ubi: black line). While the original model-free approach (Lakomek et al. 2005) is affected by structural differences of the input structure used for tensor calculation, the new SCRM method alleviates the effect of structural noise.

The SCRM approach has been tested on nine different input structures that deviate considerably from the free ubiquitin crystal structure (see Materials and methods, case B). While the original model-free approach (Lakomek et al. 2006) is affected by structural differences of the input structures used for tensor calculation (Fig. 5d, e), the new SCRM method alleviates the effect of structural noise (Fig. 5f, g). After only 4 SCRM cycles the \( S_{rdc}^2 (NH) \) order parameters of the nine different test cases have converged and agree very well with those for the free form 1ubi (Fig. 5f, g). The standard deviation of \( S_{rdc}^2 (NH) \) order parameters is \( \sigma \) = 0.033 for the original RDC-based model-free approach applied to D23M (\( \sigma \) = 0.039 for D36M) and \( \sigma \) = 0.010 after 4 SCRM cycles applied to D23M (\( \sigma \) = 0.006 for D36M). This illustrates that the SCRM method is able to determine alignment tensors and inter-nuclear vector orientations accurately and almost independently of the quality of the starting structure (within a certain range). This is an important prerequisite for reliable quantification of macromolecular dynamics. We are currently exploring how strongly the initial structure can deviate from the final structure while still yielding a converged, correct structure. Even now, it would be possible to derive an initial structure using the DIDC approach (Tolman 2002) or a recent extension of this approach (Yao et al. 2008). In a second step the SCRM approach could then be applied.

Conclusions

A Self-Consistent RDC-based Model-free (SCRM) approach has been developed based on the RDC-based model-free approach as implemented by Lakomek et al. (2005, 2006). It differs from the previous approach in that it reduces the influence of the details of the structure used for calculating the alignment tensors from the RDCs. It therefore makes the model-free analysis robust against the influence of structural noise.

SCRM was applied to two collections of NH RDC data sets, D36M and D23M. For both collections, the new SCRM approach gives almost identical \( S_{rdc}^2 (NH) \) order parameters (correlation coefficient of 0.945). According to the SECONDA analysis, D23M is the more homogeneous collection of RDC data while still sampling the alignment tensor orientations in the 5-dimensional tensor space adequately.

For D23M we conclude that there are neither significant structural changes induced by the alignment media nor significant correlations of the vector fluctuations and the alignment tensors. A possible correlation between alignment tensor fluctuations and internal dynamics, which has been proposed as a possible source of error by Louhivuori et al. (2006, 2007), is therefore likely to be small. This finding is also in agreement with theoretical predictions (Salvatella et al. 2008).

We have shown in this work that the influence of structural noise on the alignment tensor determination, and ultimately on the resulting \( S_{rdc}^2 (NH) \) order parameters, can be alleviated by the SCRM method. We added synthetic structural noise to the 1ubi X-ray structure and also used different structures of ubiquitin from several complexes as input for the SCRM analysis. The resulting order parameters were found to agree within 0.01 irrespective of the starting structure.

Consistent with the previous analysis, the inclusion of the supra-\( \tau _c \) window increases the averaged amplitude of mobility observed in the sub-\( \tau _c \) window by about 34%. These additional motions on a time scale slower than the correlation time \( \tau _c \) occur mainly within loop regions but also within secondary structure elements.

Alternating mobility patterns on the \( S_{rdc}^2 (NH) \) order parameters in β-sheets were extracted, in agreement with the previous analysis. For the 4th β-strand from Lys48 to Leu50 an alternating pattern of \( S_{rdc}^2 (NH) \) was extracted with an average amplitude of the \( S_{rdc}^2 (NH) \) order parameter oscillations of ±0.08.

A pronounced difference between the sub- and supra-\( \tau _c \) amplitude for Lys48 was observed with \( S_{LS}^2 (NH) \) = 0.82 and \( S_{rdc}^2 (NH) \) = 0.59. Thus, there is more supra-\( \tau _c \) than sub-\( \tau _c \) motion (\( \frac{{S_{rdc}^2 (NH)}}{{S_{LS}^2 (NH)}} < S_{LS}^2 (NH) \)). Lys48 is involved in the poly-ubiquitylation process that leads to protein degradation. This finding motivated the investigation of the role of supra-\( \tau _c \) motion for protein–protein recognition (Lange and Lakomek et al. 2008). Since conformational sampling on the micro- to millisecond time scale has been found to be rate limiting in many catalytic processes (Kern and Zuiderweg 2003; Boehr et al. 2006; Dyson and Wright 2005; Eisenmesser et al. 2005; Kern et al. 2005), it would be intriguing if conformational sampling on the nano- to microsecond time scale, i.e. 1000 times faster than the time scale where most catalytic events occur, proved to be essential for protein–protein recognition dynamics. Very recently, Karplus and Kern and co-workers have shown that pico- to nanosecond time scale atomic fluctuations in hinge regions of adenylate kinase facilitate the large-scale, slower lid motions that produce a catalytically competent state (Henzler-Wildman et al. 2007).
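As a worked check of the Lys48 inequality, using the D23M value of \( S_{rdc}^2 (NH) \) quoted above:

$$ \frac{{S_{rdc}^2 (NH)}}{{S_{LS}^2 (NH)}} = \frac{{0.59}}{{0.82}} \approx 0.72 < 0.82 = S_{LS}^2 (NH). $$

The supra-\( \tau _c \) mobility of this residue, 1 - 0.72 = 0.28, therefore exceeds its sub-\( \tau _c \) mobility, 1 - 0.82 = 0.18.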

To summarize, RDCs can provide additional information about protein dynamics (supra-\( \tau _c \) motion), complementary to relaxation methods. Since promising techniques are emerging to reduce the experimental effort of collecting enough linearly independent RDC data sets (Ruan and Tolman 2005; Yao and Bax 2007), RDCs are expected to become a routine tool to complement the analysis of biomolecular dynamics.

Supporting Information

NH RDCs measured for the different alignment conditions A1–A36 and derived \( S_{rdc}^2 (NH) \) order parameters are available in the Supporting Information.

More detailed explanations of the SECONDA selection of data sets and of the selection of residues used for alignment tensor calculation, as well as a detailed description of the determination of \( S_{overall} \), can be found in the Supporting Information.