Introduction

Hydrogen-deuterium exchange-mass spectrometry (HDX-MS) is a useful technique for characterizing the conformational mobility of proteins and protein complexes under native conditions [1–3]. By measuring deuterium incorporation into protein amide groups, HDX-MS provides an alternative route to structural information about proteins that, because of their size, limited solubility, or intrinsic conformational mobility, may not be suitable for other methods of structural biology. Individual aspects of the HDX-MS workflow, including sample handling, peptide separation, mass analysis, and post-processing analysis, continue to be subjects of further development.

Hydrogen-deuterium exchange experiments provide information about the behavior of a complex system over time. As with time course observations in fields such as gene expression, metabolomics, and toxicology, practical considerations generally limit the number of time points. Analytical tools that have been developed for extended or periodic time course data are unsuited to data with only a small number of time points. Functional data analysis (FDA) has been usefully applied to short time series data in several fields [4–6]. The dynamics of the specific system inform the choice of model curves. In the case of HDX-MS, although a theoretical framework for understanding the time dependence of amide hydrogen exchange in globular proteins has been appreciated for some time [1, 7–9], this framework has not been used as a basis for functional analysis of HDX-MS data. In this work, we present the area between the curves, A bec , as a measure of hydrogen-deuterium exchange curve dissimilarity and recall an interpretation that relates the area between exchange curves to differences in the fractional exposure of amide hydrogens [7]. We apply the method to published HDX-MS data [10] and show that A bec is useful for statistical analysis of conformational mobility differences.

Methods

We are interested in using the area between exchange curves to characterize differences in the conformational mobility of a protein between states a (e.g., free or apo) and b (e.g., ligand-bound). In the HDX-MS method, samples of the protein in the two conditions are exposed to high deuterium content buffers for times t 1 to t n [1–3]. Samples are quenched and digested into peptides that are then chromatographically separated and analyzed by mass spectrometry. Data from n rep replicate experiments are usually reported as mean relative percent deuterium incorporated, D, with the associated standard deviations, and are often presented as plots of D versus log time. We used a plot digitizer utility [11] to extract deuterium uptake values from published HDX data [10] and calculated the mean remaining hydrogen fractions y a,i = (100-D a,i )/100 and y b,i = (100-D b,i )/100, where i is the index over the timepoints. As depicted in Figure 1, we used four methods to estimate A bec : two geometric methods and two parametric, curve-fitting methods. We used R to implement the calculations [12]. For additional details and statistical methods, see Supporting Information.

Figure 1

Exchange curves and methods for estimating A bec , the area between exchange curves. Remaining hydrogen fractions for apo (○) and ligand-bound [●, rosiglitazone (rosi)] states of peptide aa 445-452 of peroxisome proliferator-activated receptor gamma (PPARγ). HDX data are from [10]. (a) A bec values are estimated by the difference of the averaged hydrogen fraction scaled by the span in log time (blue rectangle) or by trapezoidal approximation (peach trapezoids). (b) Time-dependent HDX data can be fitted by weighted second order polynomials (apo, solid grey; rosi, solid black), or logistical functions (apo, dotted red; rosi, dotted blue)

  1.

    Averaged incorporation. As this method is commonly implemented, deuterium uptake values are averaged over all timepoints [13]. Here, we average the remaining hydrogen fractions and then scale the result by the span in log time:

    $$ {A}_{bec}^{averaged}= \log \left({t}_n/{t}_1\right)\left(\frac{1}{n}\right)\left(\sum_{i=1}^n{y}_{b, i}-\sum_{i=1}^n{y}_{a, i}\right) $$
  2.

    Trapezoidal approximation. The area under the exchange curve for state a can be approximated as:

    $$ {A}_{uc, a}^{trapezoid}=\frac{1}{2}\left({y}_{a,1} \log \frac{t_2}{t_1}+{y}_{a, n} \log \frac{t_n}{t_{n-1}}+\sum_{i=2}^{n-1}{y}_{a, i} \log \frac{t_{i+1}}{t_{i-1}}\right) $$

    The area between two exchange curves can be expressed as:

    $$ {A}_{bec}^{trapezoid}={A}_{uc, b}^{trapezoid}-{A}_{uc, a}^{trapezoid} $$
  3.

    Weighted second order polynomial fit. The area under the exchange curve for state a is the integral of the fitted curve:

    $$ {A}_{a, poly}=\int_{\log {t}_1}^{\log {t}_n}\left[{a}_{2, a}\left( \log t\right)^2+{a}_{1, a} \log t+{a}_{0, a}\right] d \log t $$

    The area between the curves is given by the difference:

    $$ {A}_{bec}^{poly2w}={A}_{b, poly}-{A}_{a, poly} $$
  4.

    Logistical function. The data are fitted by

    $$ {y}_a(t)={d}_a+\left({a}_a-{d}_a\right)/\left(1+ \exp \left({b}_{fix} \ln \left( t/{c}_a\right)\right)\right) $$

    The area between the curves is the difference between the areas under the fitted exchange curves:

    $$ {A}_{bec}^{logistical}={A}_{b, logi}-{A}_{a, logi} $$

    where A a,logi and A b,logi are evaluated numerically.
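The two geometric methods above can be sketched in a few lines of code. The authors implemented their calculations in R; the following is only an illustrative Python sketch, not their implementation. It assumes base-10 logarithms for log time, and the function names and all numerical values are hypothetical, chosen for illustration rather than taken from the published data.

```python
import math


def remaining_fraction(percent_d):
    """Convert percent deuterium incorporated (D) to remaining hydrogen fraction y."""
    return [(100.0 - d) / 100.0 for d in percent_d]


def area_averaged(times, y):
    """Mean of y scaled by the span in log time (base 10 is an assumption)."""
    span = math.log10(times[-1] / times[0])
    return span * sum(y) / len(y)


def area_trapezoid(times, y):
    """Trapezoidal area under y plotted against log10(time)."""
    x = [math.log10(t) for t in times]
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
        for i in range(len(x) - 1)
    )


def a_bec(times, d_a, d_b, area=area_trapezoid):
    """Area between exchange curves: area for state b minus area for state a."""
    y_a = remaining_fraction(d_a)
    y_b = remaining_fraction(d_b)
    return area(times, y_b) - area(times, y_a)


# Hypothetical example: five exchange times (seconds) and percent D values.
times = [10, 30, 100, 300, 1000]
d_apo = [30, 40, 50, 60, 70]      # state a, hypothetical
d_bound = [10, 15, 20, 30, 40]    # state b, hypothetical

print(a_bec(times, d_apo, d_bound, area=area_averaged))
print(a_bec(times, d_apo, d_bound, area=area_trapezoid))
```

Summing trapezoids interval by interval is algebraically the same as the expression given for the trapezoidal approximation, in which each interior point is weighted by the log of the ratio of its neighboring times. The two parametric methods are not sketched here because the weighting scheme and fitting details are given only in Supporting Information.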

Results and Discussion

In a published account [10], HDX-MS was used to investigate conformational mobility in the ligand binding domain (LBD) of peroxisome proliferator-activated receptor gamma (PPARγ) in the ligand-free state (apo) compared with agonist- or partial agonist-bound states. We extracted deuterium uptake values from deuterium incorporation versus log time plots for 31 peptide/charge states for the apo and rosiglitazone-bound states using a digitizer tool [11]. We used two geometric methods and two parametric, curve-fitting methods to estimate the area between the exchange curves (see Methods and Supporting Information). We note that for these data, with only five exchange times, the parametric methods may be less appropriate. Specifically, for two peptides from the rosi-bound state, curve fitting to the three-parameter logistical function failed to converge; A bec values were calculated for only 29 peptides by this method. A visual representation of the distributions of the A bec values calculated by the four methods shows similar patterns (Figure 2a). For the 29 peptides analyzed by all four methods, areas estimated by the two geometric methods were the most highly correlated (Pearson r = 0.995). Correlations of other pairs of methods ranged from 0.980 to 0.986. In addition, the numbers of peptides with substantially different exchange dynamics, as indicated by A bec values > 0.3, were similar by the four methods. The “averaged” method identified 16/31 substantially affected peptides, the “trapezoid” and “poly2w” methods each identified 15/31 substantially affected peptides, and the “logistical” method identified 14/29 substantially affected peptides. Values of A bec greater than 0.3 indicate at least a 2-fold reduction in the average exchange-competent fraction of amide hydrogens in the stabilized state (see Supporting Information). Thus, the area between exchange curves is relatively insensitive to the method of area estimation. In addition, the magnitude of A bec provides molecular insight.
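One way to see where a threshold of this kind could come from is to consider an idealized case; the authors' own derivation is in Supporting Information and may differ. The sketch below assumes that each state is described by a single-exponential exchange curve, that log time is base 10, and that the exchange rate constant is proportional to the exchange-competent fraction. Under those assumptions, the area between the curves integrated over all of log time equals the log of the ratio of the rate constants (a Frullani-type integral), so 10 raised to the power A bec gives the fold change. The function names and rate constants are hypothetical.

```python
import math


def area_between_single_exponentials(k_a, k_b, log_t_min=-8.0,
                                     log_t_max=8.0, steps=20000):
    """Trapezoidal integral of exp(-k_b*t) - exp(-k_a*t) over log10(t).

    Idealized case only: one rate constant per state, wide time range.
    """
    h = (log_t_max - log_t_min) / steps
    total = 0.0
    for i in range(steps + 1):
        t = 10.0 ** (log_t_min + i * h)
        f = math.exp(-k_b * t) - math.exp(-k_a * t)
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * f * h
    return total


def fold_change(a_bec):
    """Fold reduction implied by an area, under the assumptions stated above."""
    return 10.0 ** a_bec


# Hypothetical rate constants: state b exchanges 10 times more slowly than a.
area = area_between_single_exponentials(k_a=1.0, k_b=0.1)
print(area, math.log10(1.0 / 0.1))
print(fold_change(0.3))
```

Because a real experiment samples only a finite window of exchange times, the measured area can only fall short of this idealized value, which is consistent with reading the fold change implied by A bec as a lower bound.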

Figure 2

Characterization of four methods for estimating A bec values and their statistical significance. Distributions of A bec values (a) and the negative log of the P-value (b) are shown for 31 peptides comparing the exchange behavior of ligand-bound (rosiglitazone) and apo states of the LBD of PPARγ. The methods for area estimation are: averaged hydrogen fraction scaled by the span of log time (averaged), trapezoidal approximation (trapezoid), curve fitting to weighted second order polynomial (poly2w), curve fitting to a three parameter logistical function (logistical). A bec values from logistical functions were obtained for 29 out of 31 peptides. Symbol color indicates the approximate location of the peptide. Different charge states of a given peptide are represented by the same symbol

The area between exchange curves provides a useful measure of dissimilarity in dynamic behavior for statistical tests (see Supporting Information). The estimated statistical significance varies among the four methods. Distributions of the negative log of the P-values for the four methods show greater dispersion than the distributions of the A bec values (Figure 2b). Visually, the distributions of significance values for the geometric methods appear the most similar and the “poly2w” distribution is the most divergent. For the 29 peptides analyzed by all four methods, negative log P-values for the two geometric methods were the most highly correlated (Pearson r = 0.996). Correlation of “poly2w” negative log P-values with those estimated by the other methods ranged from 0.775 to 0.818. The geometric methods identified 21/31 peptides to be significantly affected by ligand binding (P < 0.05), the “poly2w” method identified 19/31 significantly affected peptides, and the “logistical” method identified 16/29 significantly affected peptides.

To illustrate molecular insights provided by A bec calculations, we used published HDX-MS data on the effects of binding a partial agonist, nTZDpa, compared with binding an agonist, rosi, on the conformational mobility of PPARγ LBD [10]. We used the trapezoid method for estimating the area between exchange curves because the number of timepoints in this study was relatively small, limiting the utility of parametric methods. A bec values for rosi-bound versus apo were significantly positive for 10 out of 16 LBD peptides, whereas A bec values for nTZDpa-bound versus apo were significantly positive for 14 out of 16 LBD peptides (Figure 3a). Specifically, the binding of rosi has a strong stabilizing effect on peptides 279-287 and 443-452, resulting in an approximately 10-fold decrease in the exposed fraction in the bound state. The effects of nTZDpa binding are more widespread. Peptide 279-287 shows a 20-fold decrease in exchange competency, whereas peptide 341-351 shows a 12-fold decrease, and peptides 331-341 and 364-370 each show a 6-fold decrease. As shown in the spatial representation (Figure 3b), the conformational effects of rosi binding were lesser in magnitude and affected the binding pocket with additional stabilization of helix 12 (aa 470-477), whereas the effects of nTZDpa binding were larger in magnitude and were confined to the binding pocket without affecting the conformational mobility of helix 12.

Figure 3

Effects of rosiglitazone (rosi, an agonist), and nTZDpa (TZD, a partial agonist) on hydrogen-deuterium exchange dynamics of PPARγ-LBD. HDX data are from [10]. (a) The area between exchange curves (A bec ) for 16 peptides. (b) A bec values overlaid onto three-dimensional structures of PPARγ-LBD showing the effects of Rosi binding (left, PDB: 2PRG) and nTZDpa binding (right, PDB: 2Q5S). Color key: A bec values; n.s., not significant

Conclusions

The time dependence of hydrogen-deuterium exchange of amide hydrogens in globular proteins is well understood to be described by a sum of exponential curves [1–3, 7]. In practice, individual exchange rate constants are rarely recovered from HDX-MS data because the problem is generally acknowledged to be ill-conditioned and under-determined, although many creative approaches toward extracting additional information have been developed [14, 15].

The area between exchange curves is a practical and sensitive measure for statistical tests of significance for comparing the conformational dynamics of a protein under two conditions. A bec values also provide molecular insight into changes in the exposed fractions between the two states. By integration over the log of exchange times, differences in reaction progress are converted into information about changes in the solvent-exposed/exchange competent fractions in the two states.