Introduction

Hydrogen/deuterium exchange (HDX) in the gas phase has been comprehensively studied [1–4] and utilized for protein, peptide, and biomolecule ion structure investigations [5–14]. In one pioneering study, D2O was introduced as the exchange reagent into a Fourier transform ion cyclotron resonance (FTICR) mass spectrometer cell. Changes in the isotopic envelope of the gas-phase ions and the number of incorporated deuteriums indicated the existence of coexisting, stable gas-phase structures [5]. Later, the same methodology was utilized to investigate gas-phase structural transformations of cytochrome c ions [6]. A comprehensive mechanistic investigation utilized different HDX reagents [1]. Quantum mechanics calculations revealed the importance of the reagent gas-phase basicity to the exchange process and resulted in the proposal of a relay mechanism for HDX with D2O reagent.

The gas-phase HDX behavior of select ion conformers was reported shortly thereafter in a study in which ion injection into a drift tube was used to favor elongated and compact cytochrome c ions [7]. Differences in HDX levels were attributed to the potential for sampling different conformations by the different instruments as well as the timescale of the ICR measurement, which could be sensitive to longer timescale protein ion fluctuations. That is, the long timescale of HDX in an FTICR cell may provide sufficient time for an ion to widely sample conformational space [15] and adopt different conformers affecting the HDX behavior of the ions [9]. In separate experiments, HDX measurements in an FTICR instrument were used to confirm a proposed structural type for [M + H]+ bradykinin ions [2]. In that work, molecular dynamics (MD) simulations were utilized to model the HDX behavior of an ensemble of structures of the salt-bridge conformer type [16]; the model was based upon the relay mechanism proposed for HDX with D2O reagent gas. Ion mobility spectrometry (IMS) and HDX measurements also examined rate constants and deuterium uptake values at different buffer gas temperatures for cytochrome c ions [3]. The experiments showed that although the reaction with D2O is slower at elevated temperature, maximum deuterium uptake values are higher than those at lower temperatures. The lower reaction rate at higher temperature was attributed to decreased formation of the D2O-protein complex, whereas increased HDX levels indicated greater accessibility to more distal sites due to molecular fluctuations. Indeed, MD simulations at 300 and 500 K, in which an exchange threshold distance between labile hydrogens and protonation sites was utilized, suggested increased accessibility at higher temperatures. That is, the increase in uptake value at elevated temperature was explained by higher structural flexibility [3].

The advent of nonergodic ion dissociation methods like electron capture dissociation (ECD) [17] and electron transfer dissociation (ETD) [18] provided the opportunity to investigate deuterium uptake by individual amino acid residues [8, 19–22] in the absence of isotopic scrambling [23, 24]. Recently, collision cross-section (CCS) measurements were coupled for the first time with gas-phase HDX and tandem mass spectrometry (MS/MS) to determine per-residue deuterium uptake values of select ion conformer types of a peptide [8]. A simple model based on heteroatom hydrogen site accessibility (distance) to the charge site and the original site of deuterium incorporation was introduced to consider in-silico structures as viable candidates to represent ion conformer type. Patterns in per-residue HDX levels served as a second criterion for structure elucidation in addition to CCS filtering. Uptake values using this model could not completely describe the HDX behavior of the peptide ions, in part because of saturation effects at higher partial pressures of D2O. To address this problem, in separate studies, the contribution of each residue to the HDX rate for select conformer types was measured; using the same distance model, theoretical contributions to rate better approximated measured contributions for in-silico structures intended to represent conformer type [20]. Because measuring the contributions to rate constants at the individual amino acid residue level is extremely time-intensive and difficult for larger ions, an effective collision model was proposed to describe deuterium uptake levels at different partial pressures of D2O [25, 26] for select ion conformers.

Ion structure dynamics can significantly affect the overall drift time distribution [27, 28] as well as the HDX behavior of peptides and proteins [2, 3]. Therefore, improvement to the accessibility model would account for such fluctuations. Here we utilize production MD to mimic structural fluctuations of each in-silico conformer in order to obtain a more realistic depiction of its reactivity with D2O. Using two cutoff distances, the accessibility of each exchange site is determined. Additionally, the relative exposed surface area of carbonyls and charge sites is implemented in a theoretical hydrogen accessibility scoring (HAS) method. Using HAS scoring coupled with a number of effective collisions (NEC) model, the deuterium uptake by individual residues is predicted for [M + 3H]3+ ions of the model peptide acetyl-PAAAAKAAAAKAAAAKAAAAK. Overall, multiple in-silico structures are required to explain the experimental HDX uptake patterns and the ETD spectra of the model peptide. Non-negative linear regression (NNLR) is applied to the theoretical uptake values to obtain population levels for proposed structures. In a subsequent report, comparisons of solution-phase (from CD spectroscopy and MD simulations) and gas-phase structures will be performed. Finally, pathways from solution to gas-phase structures will be investigated to gain insight into gas-phase ion conformer establishment as well as operative ESI mechanisms.

Experimental

Sample Preparation

The model peptide acetyl-PAAAAKAAAAKAAAAKAAAAK (>90% purity) was purchased (GenScript, Piscataway, NJ, USA) and used without further purification. Stock solutions (1 mg/mL) were prepared by dissolving 1 mg of the model peptide in Milli-Q water. ESI solutions were prepared by performing a 1:10 dilution of the stock solution with a 100 mM solution of ammonium acetate in water. ESI solutions were infused (0.5 μL·min⁻¹) through a pulled-tip capillary biased at 2200 V relative to the hybrid ion mobility spectrometry-mass spectrometry (IMS-MS) instrument entrance [8, 29].

IMS-MS Measurements

A detailed description of drift time measurements is presented in the first manuscript associated with this work [30]. A home-built, dual-gate IMS device coupled to a linear ion trap (LIT) mass spectrometer (LTQ Velos; ThermoScientific, San Jose, CA, USA) was employed [8, 29]. Delay times between IMS gates were scanned (100 μs time increments) to obtain drift time resolved mass spectra from which drift time distributions could be extracted (Supplementary Figure 1). A mass-to-charge (m/z) range of 400 to 1000 was used for the LIT. For each drift time window, mass spectra were recorded for 0.5 min.

Peptide Ion Dissociation by ETD

Mobility-selected conformations were subjected to ETD to determine deuterium incorporation at specific amino acid residues. The source filament was set at 90 °C and the reagent (fluoranthene) chamber pressure was increased to 20 × 10⁻⁵ Torr using N2 gas. Mobility-selected precursor peptide ions were isolated by m/z. The ion injection time was maintained at 200 ms (5 microscans) for these ETD measurements. Per-residue uptake values were obtained by subtracting the total uptake values of adjacent z-ions.

Per-Residue Deuterium Uptake Calculations

The deuterium content of each residue was calculated utilizing the z-ions. Deuterium uptake values for each z-ion were obtained by subtracting the average mass of the ion in the absence of the HDX reagent gas from that recorded for the ion with the addition of the reagent gas in the drift tube. The amount of deuterium incorporated within each residue was determined by subtracting the deuterium uptake value of the adjacent, lower m/z z-ion. ETD did not produce z1 and z2 ions; therefore, the values for these ions were calculated using the total deuterium uptake value for the peptide ions and the c20 and c19 ions, respectively. The uptake value for the proline residue (ideally obtained from the c1 ion, which would include the subsequent alanine backbone amide hydrogen) was calculated using the z20 ion and the total deuterium uptake value for the peptide ion.
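The subtraction scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names and all numerical values are hypothetical, and the complement rule (total uptake minus c-ion uptake) assumes that no deuterium is lost or scrambled during dissociation.

```python
def z_from_c(total_uptake, c_uptake):
    """Uptake of a missing z-ion, estimated from the precursor total and
    the complementary c-ion (e.g., z1 from c20 for a 21-residue peptide)."""
    return total_uptake - c_uptake


def per_residue_uptake(z_uptake):
    """Map z-ion index n to the uptake difference between z_n and z_(n-1).

    z_uptake: dict of {n: total deuterium uptake of the z_n fragment}.
    z1 contains a single residue, so its uptake is used directly.
    """
    per_residue = {}
    for n in sorted(z_uptake):
        if n == 1:
            per_residue[n] = z_uptake[n]
        elif n - 1 in z_uptake:
            per_residue[n] = z_uptake[n] - z_uptake[n - 1]
    return per_residue


# Hypothetical uptake values, for illustration only.
total = 5.0                   # total uptake of the precursor ion
z = {3: 0.9, 4: 1.4}          # z-ions observed directly
z[1] = z_from_c(total, 4.8)   # z1 from the c20 uptake
z[2] = z_from_c(total, 4.5)   # z2 from the c19 uptake
uptake = per_residue_uptake(z)
```

Here `uptake[n]` is the deuterium content attributed to the residue that distinguishes the z_n fragment from z_(n-1).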

Molecular Dynamics (MD) Simulations

The processes of in-silico structure generation and in-vacuo MD simulations have been discussed in detail in the prior related work [30]. Briefly, after calculating the undefined force field parameters using R.E.D. Server Development [31–36], the AMBER ff12SB force field was employed to generate the extended initial structures of [M + 3H]3+ peptide ions with charge arrangements of K(6)-K(11)-K(22) and K(6)-K(16)-K(22). Using the AMBER12 [37] MD package, the initial structures were energy minimized and subjected to two distinct cyclic simulated annealing (SA) runs (40-ps and 1200-ps SA algorithms) to produce a pool of annealed structures. In each cycle of the SA runs, the minimized structure was heated to 1000 K, equilibrated, and gradually cooled to a lower temperature of 10 K using varying temperature coupling time constants. This was followed by a rapid, minimization-like cooling step to sample the structure at 0 K. After 1000 heating-cooling cycles, each optimized structure was heated and equilibrated at 300 K and subjected to 5 ns of MD simulations at the same temperature with no non-bonded cutoffs for long-range interactions. The resulting production MD trajectories were clustered using a modified k-means algorithm to generate 50(±10) clusters as described in the previous related work [30]. The conformations most similar to the mathematically generated centroids of the clusters (nearest structures) were selected for trajectory-method (TM) [38, 39] CCS calculations using the Mobcal [40] software. The resulting CCS values were employed to generate the weighted-average CCS value (\( {\Omega}_{total} \)) representing the CCS of the complete trajectory.
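As a minimal sketch of the final averaging step, the snippet below computes a weighted-average CCS from per-cluster values. The weighting by cluster population (number of MD frames assigned to each cluster) is an assumption made here for illustration; the weighting actually used is described in the related work [30]. The function name and the numbers are hypothetical.

```python
def weighted_average_ccs(ccs_values, cluster_sizes):
    """Weighted-average CCS over cluster representatives.

    ccs_values:    CCS of the structure nearest each cluster centroid
    cluster_sizes: weight of each cluster (ASSUMED here to be the number
                   of trajectory frames assigned to that cluster)
    """
    total_weight = sum(cluster_sizes)
    weighted_sum = sum(c * w for c, w in zip(ccs_values, cluster_sizes))
    return weighted_sum / total_weight


# Hypothetical two-cluster trajectory: 3750 and 1250 frames.
omega_total = weighted_average_ccs([410.0, 420.0], [3750, 1250])
```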

Hydrogen Accessibility Scoring (HAS)

To assign a hypothetical per-residue deuterium uptake pattern to each in-silico structure type, an in-house script was employed. MD simulations suggest that peptide ions exhibit considerable flexibility during their transit time in the drift tube; therefore, an accurate model should account for ion structure fluctuation. The production MD described above is utilized to simulate these fluctuations. For each candidate structure, the HAS score (\( S_{total_j} \)) is calculated as the sum of the scores for each frame in the production MD according to Equation 1:

$$ S_{total_j} = \sum_{i=1}^{n} S_{frame_{j,i}}. $$
(1)

In Equation 1, \( S_{total_j} \) is the score value for the jth labile hydrogen on each candidate structure, whereas \( S_{frame_{j,i}} \) is the hydrogen score value for the jth labile hydrogen on the ith frame of the production MD, and n is the number of MD frames (5000).

Individual \( S_{frame_{j,i}} \) values are computed in a manner that is similar to that described previously. The scoring approach considers the HDX mechanism for D2O reagent gas in which a proton from a charge site transfers to the D2O molecule with concomitant deuteron transfer to a less basic site such as a backbone carbonyl oxygen (Supplementary Figure 2) on the peptide ion [1, 2]. Subsequently, this deuteron is transferred to the exchange site. In accordance with this mechanism, the scoring strategy is divided into two steps. In the first step, the relative propensity for deuterium incorporation on the carbonyls is determined. In the second step, the likelihood of deuteron transfer to adjacent exchange sites is considered. Each of these steps is described briefly below with regard to the generation of \( S_{frame_{j,i}} \).

Based on the HDX mechanism, it is essential that the carbonyl group be located within a suitable distance for an extended period of time [3]. Therefore, a threshold distance between a carbonyl oxygen and a charge site can be estimated for which exchange will occur (carbonyl-charge cutoff distance). Additionally, the charge site and carbonyl group are required to be on the surface of the ion. The reaction cross-section is increased in proportion to exposed surface area of the charge site and the carbonyl oxygen; that is, the greater the surface area exposure, the greater is the number of collisions with D2O molecules [2]. Therefore, each carbonyl group on a structure can be scored according to Equation 2 as:

$$ S_{carbonyl_k} = \sum_{l=1}^{m} Surface_{carbonyl_k} \times Surface_{charge_l}. $$
(2)

In Equation 2, m is the total number of charges, whereas \( S_{carbonyl_k} \) and \( Surface_{carbonyl_k} \) are the carbonyl score and the normalized surface area (scaled to the maximum possible surface area) for the kth carbonyl of a structure. The term \( Surface_{charge_l} \) is the scaled surface area of the lth protonation site. To apply the cutoff distance criterion, any carbonyl more distant than the threshold value is scored as zero. Several threshold distances were examined and a detailed discussion of distance optimization results is provided in the Results and Discussion section. Notably, if either the carbonyl or the charge site is buried in the peptide ion, the carbonyl score is zero.

For single ion structures, each labile hydrogen that is located within the threshold distance of an oxygen of a scored carbonyl group is scored according to Equation 3 as:

$$ S_{frame_{j,i}} = \sum_{k=1}^{q} S_{carbonyl_k}. $$
(3)

In Equation 3, q represents the number of carbonyls for a single structure. As mentioned above, a summation of each frame score over the entire MD trajectory yields the overall hydrogen score for an in-silico candidate structure (Equation 1).
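Equations 1 to 3 can be combined into a short scoring routine. The sketch below is not the in-house script; it is a minimal illustration under stated assumptions: each MD frame is supplied as a dictionary of coordinates (Å) and normalized surface areas (the key names are hypothetical), the carbonyl-charge cutoff is applied to each carbonyl-charge pair, and the default cutoffs are the 6.5 Å and 3.5 Å values reported in the Results and Discussion section. The extended cutoff mentioned there for lysine receptors is not implemented.

```python
import math


def carbonyl_scores(frame, cc_cutoff=6.5):
    """Equation 2: score each carbonyl by summing, over charge sites within
    the carbonyl-charge cutoff, the product of the two scaled surface areas."""
    scores = []
    for o_xyz, o_area in zip(frame["carbonyl_xyz"], frame["carbonyl_area"]):
        score = 0.0
        for q_xyz, q_area in zip(frame["charge_xyz"], frame["charge_area"]):
            if math.dist(o_xyz, q_xyz) <= cc_cutoff:
                score += o_area * q_area  # zero if either site is buried
        scores.append(score)
    return scores


def frame_scores(frame, ch_cutoff=3.5, cc_cutoff=6.5):
    """Equation 3: each labile hydrogen collects the scores of all carbonyl
    oxygens within the carbonyl-hydrogen cutoff."""
    c_scores = carbonyl_scores(frame, cc_cutoff)
    result = []
    for h_xyz in frame["hydrogen_xyz"]:
        result.append(sum(
            s for o_xyz, s in zip(frame["carbonyl_xyz"], c_scores)
            if math.dist(h_xyz, o_xyz) <= ch_cutoff
        ))
    return result


def total_scores(frames):
    """Equation 1: sum the per-frame hydrogen scores over the trajectory."""
    totals = [0.0] * len(frames[0]["hydrogen_xyz"])
    for frame in frames:
        for j, s in enumerate(frame_scores(frame)):
            totals[j] += s
    return totals


# Hypothetical one-charge, two-carbonyl, two-hydrogen frame.
example_frame = {
    "charge_xyz": [(0.0, 0.0, 0.0)], "charge_area": [0.5],
    "carbonyl_xyz": [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    "carbonyl_area": [0.4, 1.0],
    "hydrogen_xyz": [(4.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
}
example_totals = total_scores([example_frame, example_frame])
```

In this toy frame only the first carbonyl lies within 6.5 Å of the charge site, so only the hydrogen near that carbonyl receives a nonzero score.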

Number of Effective Collisions (NEC) Model

Under the experimental conditions for this study, depending on the partial pressure of D2O, peptide ions typically experience thousands of collisions with D2O gas. Measured rate constants indicate that only a very small portion of these collisions leads to reaction (here termed an effective collision). At lower partial pressures of D2O corresponding to a relatively small number of labile hydrogens that undergo exchange, on average, each reaction leads to deuterium uptake. At higher partial pressures where larger numbers of deuterium are incorporated, the likelihood of exchanging deuterium with deuterium is not negligible; that is, not all of the effective collisions result in deuterium uptake (saturation conditions). Therefore, at very low partial pressures of D2O gas, the NEC is directly proportional to deuterium uptake values; the NEC at higher partial pressures of D2O is not. Because the number of ion-D2O collisions increases linearly with D2O partial pressure and the ratio of the total number of collisions to the NEC is constant, the NEC increases proportionally with partial pressure of D2O according to Equation 4:

$$ n_{effective} = \frac{p_{high}}{p_{low}} \times n_{effective_{low}}. $$
(4)

In Equation 4, \( n_{effective} \) and \( n_{effective_{low}} \) are the NEC values at high and low partial pressures of D2O, respectively. The term \( \frac{p_{high}}{p_{low}} \) is the ratio of partial pressures of D2O at different leak valve settings resulting in saturation (\( p_{high} \)) and non-saturation (\( p_{low} \)) reaction conditions. Using Equation 4, the NEC can be calculated for different D2O partial pressures, even those under which saturation conditions may apply.

The HAS scores for particular hydrogens can be normalized for the candidate structures. A relative hydrogen score (\( {S}_{R_j} \)) value for each conformer type can be determined according to Equation 5 as:

$$ S_{R_j} = \frac{S_{total_j}}{\sum_{j=1}^{n} S_{total_j}}. $$
(5)

\( S_{R_j} \) reveals the contribution of each labile hydrogen to the number of HDX events occurring for the entire peptide. In Equation 5, n is the total number of labile hydrogens on the peptide ion. With this consideration, it is possible to generate Equation 6:

$$ n_{effective_j} = n_{effective} \times S_{R_j}, $$
(6)

that is, the NEC experienced by a labile hydrogen (\( n_{effective_j} \)) can be estimated. It is then necessary to convert the NEC to a deuterium uptake value. With \( n_{effective_j} \), the behavior of an uptake site with regard to a specific number of collisions is simulated for 1000 ions. In the simulation, an array of 1000 × 1 is first utilized to model a population of 1000 ions containing one exchange site. All the elements in the array are set to zero, and for each effective collision a random element is changed to 1. Therefore, the summation of all elements in the array is equivalent to the deuterium uptake for the 1000 ions. It is noteworthy that the uptake value determined as a function of NEC, using the various \( n_{effective_j} \) values, produces a trend comparable to a pseudo-first order kinetics plot, as may be expected (Supplementary Figure 3) [5].
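The NEC bookkeeping (Equations 4 to 6) and the single-site simulation described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the seeded random generator, and the example pressures are assumptions, and rounding the NEC to an integer number of collisions is a choice made here.

```python
import random


def nec_at_pressure(nec_low, p_high, p_low):
    """Equation 4: scale the NEC linearly with D2O partial pressure."""
    return nec_low * p_high / p_low


def relative_scores(total_scores):
    """Equation 5: normalize HAS totals so that they sum to one."""
    score_sum = sum(total_scores)
    return [s / score_sum for s in total_scores]


def simulate_site_uptake(n_effective_j, n_ions=1000, rng=None):
    """Monte Carlo uptake for one exchange site on n_ions ions.

    Every effective collision hits a randomly chosen ion and sets its site
    to 1 (deuterated); hitting an already deuterated site changes nothing,
    which is what produces saturation at high NEC.
    """
    rng = rng or random.Random()
    ions = [0] * n_ions
    for _ in range(round(n_effective_j)):
        ions[rng.randrange(n_ions)] = 1
    return sum(ions)


# Hypothetical example: NEC of 5000 (per 1000 ions) at a low pressure,
# scaled to a 4-fold higher pressure, for a site with S_R = 0.1 (Equation 6).
n_eff = nec_at_pressure(5000, p_high=0.04, p_low=0.01)
n_eff_j = n_eff * 0.1
uptake_per_1000 = simulate_site_uptake(n_eff_j, rng=random.Random(1))
```

Dividing `uptake_per_1000` by 1000 gives the average deuterium uptake of that site per ion.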

Results and Discussion

Peptide Ion Collision Cross-Sections

Electrospraying the model peptide acetyl-PAAAAKAAAAKAAAAKAAAAK produces a series of [M + 2H]2+, [M + 3H]3+, and [M + 4H]4+ ions as shown in Supplementary Figure 1. The [M + 4H]4+ ions appear as a small feature in the figure at m/z ~453, where three major conformer types are evident. Two conformer types exhibit similar intensities and have collision cross-sections of 492 and 506 Å², whereas the third, less-abundant and most-elongated conformer has a cross-section of 534 Å² (respective drift times are \( t_d \) = 7, 7.2, and 7.6 ms). In Supplementary Figure 1, the [M + 2H]2+ ions are depicted as a broad (unresolved) feature spanning a CCS range of 300 to 400 Å². As in the previous study, the [M + 3H]3+ ions were chosen for HDX experiments. These ions can be categorized into three conformer types. The most abundant (and most compact) conformer type for these ions exhibits a CCS value of 417 Å² (\( t_d \) = 7.9 ms). The second most prevalent ions comprise a more diffuse (fairly resolved) conformer type (Ω = 438 Å², \( t_d \) = 8.3 ms). The third feature for the [M + 3H]3+ ions, the most diffuse conformer type, appears as a broad, unresolved shoulder with a CCS of ~464 Å² (\( t_d \) = 8.7 ms). The two most abundant conformer types of these ions were selected for characterization by gas-phase HDX.

Peptide Ion Structure Studies Using IMS-HDX-MS/MS Coupled with MD Simulations

The MD approach used in the current study is based on the production of many random in-silico structure types and subsequent filtering of the structures using the experimental CCS values and the HDX behavior of the peptide ions. From the previous study [30], 4000 in-silico structures were produced using a simulated annealing approach, and the ion structure dynamics were characterized with MD simulations. CCS values were calculated for the candidate structures and these conformer types were filtered using experimental CCS values. In all, 63 and 261 conformers exhibited MD trajectories with matching CCS values to the compact and more diffuse conformer types of the [M + 3H]3+ ions, respectively. These candidate structures comprised an ensemble consisting of a number of significantly different structural types.

Previous experiments have shown that per-residue HDX values can provide information about the relative distances of heteroatom sites to charge sites and deuterium incorporation sites, the surface accessibility of residues, and the charge site configuration of the peptide ions [8, 20, 25, 26]. In this manuscript, HAS scoring coupled with a NEC model is implemented for each candidate structure to produce its theoretical per-residue uptake pattern. Subsequently, NNLR analysis is utilized to assign a population for these conformer types. The predicted results for deuterium uptake pattern and ETD mass spectra are compared with experimental data.

Carbonyl-Charge Site and Carbonyl-Hydrogen Cutoff Distance

Considering that the first and second steps of gas-phase HDX involve hydrogen bonding, it is important to consider optimal atomic interaction distances for the HAS model (see the Experimental section). Because the length of a hydrogen bond in the gas phase can range from ~2 to 4 Å [41, 42], no precise interaction distance can be elucidated for proposed reaction geometries. In the first step of the exchange process (original incorporation of a deuteron), a threshold distance of 4–8 Å can be estimated, while for the subsequent transfer to an exchange site, a distance range of 2–5 Å can be proposed to represent the decreased flexibility of the backbone groups involved (if the deuteron receptor is a lysine residue, it can be extended to 8 Å). The theoretical uptake behavior of peptide ions can be compared with experimental results to optimize threshold distances. Several different cutoff distances were applied to the model and distances <5 Å and/or <3 Å for carbonyl-charge and carbonyl-hydrogen interactions, respectively, lead to very low total uptake values, whereas the respective distances that are >7 Å and/or >4 Å yield very high deuterium uptake values. As an example, the experimental data show that residues A(14) and A(15) have relatively high amounts of incorporated deuterium, suggesting threshold distances that are ≥6 Å and ≥3.5 Å for the first and second steps, respectively. The best match between theoretical and experimental uptake values resulted from using threshold values of 6.5 Å and 3.5 Å for the respective interactions, and these values have been used for the comparisons discussed herein.

HDX Reagent Partial Pressure and Reagent Reactivity

For the purposes of this study, it is desirable to set the experimental partial pressure of D2O at a value that produces the maximum contrast between in-silico structures. Supplementary Figure 4 shows the deuterium uptake behavior for two exchange sites having significantly different hypothetical relative hydrogen scores (\( S_{R_j} \) = 0.1 and 0.05) as a function of \( n_{effective} \) for the peptide ion. As may be expected, at a NEC of 5 × 10³ for 1000 ions, the ion population with the more accessible hydrogen (higher \( S_{R_j} \)) incorporates ~2-fold more deuteriums than the population with the less accessible hydrogen. As the NEC increases, the deuterium uptake difference between these two ion populations shrinks, and at 2 × 10⁴ effective collisions, the uptake ratio is about 2:3 (low:high \( S_{R_j} \) values). Finally, at 1.2 × 10⁵ effective collisions, both ion populations incorporate a similar number of deuteriums. This situation is comparable with what may be expected for a more highly reactive reagent (e.g., ND3), for which accessible and semi-accessible sites exhibit complete uptake even at very low partial pressures of HDX reagent. That said, although the work here does not preclude the use of ND3, for the timescales of our measurements, a vanishingly small amount of ND3 would be required (partial pressures not manageable with our apparatus) to obtain a similar level of discrimination. In these studies, D2O is utilized as the HDX reagent because of its greater ability to discriminate between exchange sites. Overall, as the partial pressure of D2O increases, the deuterium uptake difference initially increases, reaches a maximum value, and then decreases.
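The trend described above can be checked against a closed-form expectation. The following is only a sketch: it assumes the single-site simulation described in the Experimental section, in which each of the \( n_{effective_j} \) effective collisions lands on one of N = 1000 ions chosen uniformly at random, and f denotes the expected fraction of ions deuterated at that site (a symbol introduced here for illustration). The probability that a given ion is never hit is \( (1 - 1/N)^{n_{effective_j}} \), so

$$ f = 1 - \left(1 - \frac{1}{N}\right)^{n_{effective_j}} \approx 1 - \exp\left(-\frac{n_{effective} \times S_{R_j}}{N}\right). $$

Under this assumption, \( n_{effective} \) = 5 × 10³ gives f ≈ 0.39 and 0.22 for \( S_{R_j} \) = 0.1 and 0.05 (a ratio of ~1.8); \( n_{effective} \) = 2 × 10⁴ gives f ≈ 0.86 and 0.63 (low:high ≈ 0.73); and \( n_{effective} \) = 1.2 × 10⁵ gives f ≈ 1.00 for both sites. These estimates are in line with the approximate ratios quoted in the text, and the exponential form is the pseudo-first order behavior noted for Supplementary Figure 3.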

It is instructive to note that the HDX uptake pattern can only serve as an ion structure filtering criterion if the contrast between separate in-silico structures is significantly greater than the experimental error. Notably, the experimental error in deuterium uptake values tends to be constant at different partial pressures of D2O; therefore, at lower partial pressures of D2O, most protein and peptide ion systems exhibit larger relative errors.

To determine the optimal pressure of D2O for HDX experiments, which accounts for both factors above, the relative standard deviations of deuterium uptake at different NEC values for all candidate structures were calculated. Figure 1 shows the results of this calculation for one of the hydrogens on the K(5) residue. As the NEC increases, the contrast in deuterium uptake of in-silico structures increases and then decreases. Here it is noted that for this hydrogen, a NEC value of 1 × 10⁴ is the last point of the linear range in the kinetics plot (e.g., Supplementary Figure 3). Based on a number of experiments [8, 20], a constant error of 0.1 (in deuterium uptake value) was assumed to calculate the relative experimental error values. Figure 1 also shows the value of the confidence interval (99%) for the experimental results as a function of NEC. The maximum difference between the two values occurs at a NEC value of ~20,000 collisions, corresponding to a partial pressure of ~0.03 Torr for D2O. This partial pressure value was chosen for the remaining studies.

Figure 1

Comparison of the relative experimental error and the standard deviation in modeled deuterium uptake exhibited by in-silico structures as a function of the NEC (generated using the HAS algorithm; see Experimental section). Red diamonds represent the expected relative experimental error (confidence level >99%) for an exchange site on the K(5) side chain. Blue circles show the coefficient of variation in the modeled deuterium uptake value of such an exchange site for 61 candidate structures of the compact [M + 3H]3+ peptide ion conformer

Experimental and Predicted Deuterium Uptake Values

All candidate structures with matching CCS values were subjected to HAS scoring and subsequently considered by the NEC model to estimate the deuterium uptake for each exchange site (see Experimental section). With these values for the 63 and 261 candidate structures for the more compact and more diffuse conformer types, respectively, non-negative linear regression was utilized to estimate the contribution of each structure to the overall per-residue HDX pattern. Briefly, the function (which is implemented in the R software suite [43]) determines the combination of theoretical results that best reconstructs the experimental result (least deviation). For more details regarding the implementation of non-negative linear regression, see the Supplementary Information section.
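The idea behind the regression step can be illustrated with a small non-negative least-squares solver. The sketch below is not the R routine used in this work; it substitutes a simple cyclic coordinate-descent scheme written in plain Python, and the function name, iteration count, and toy uptake patterns are hypothetical. Each candidate structure contributes one vector of predicted per-residue uptake values, and the solver seeks non-negative coefficients whose weighted sum best reproduces the experimental pattern.

```python
def nnls_coordinate_descent(columns, target, n_iter=500):
    """Minimize ||target - sum_s coeffs[s] * columns[s]||^2 with coeffs >= 0.

    columns: list of predicted per-residue uptake vectors, one per structure
    target:  experimental per-residue uptake vector
    """
    coeffs = [0.0] * len(columns)
    residual = list(target)  # target minus the current reconstruction
    for _ in range(n_iter):
        for s, col in enumerate(columns):
            norm = sum(v * v for v in col)
            if norm == 0.0:
                continue  # a structure with no predicted uptake cannot help
            # Exact one-dimensional least-squares update, clipped at zero.
            step = sum(v * r for v, r in zip(col, residual)) / norm
            new_value = max(0.0, coeffs[s] + step)
            delta = new_value - coeffs[s]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                coeffs[s] = new_value
    return coeffs


# Toy example: the "experimental" pattern is a 70:30 mix of the first two
# hypothetical structures, so the third should receive a zero coefficient.
predicted = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
experimental = [0.7, 0.3, 0.0]
populations = nnls_coordinate_descent(predicted, experimental)
```

Structures that receive a zero coefficient are effectively filtered out, which is how a regression of this kind can reduce a large candidate pool to a few nominal structures.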

Figure 2 compares the experimental and theoretical per-residue deuterium uptake values for the more compact conformer (Ω = 417 Å²) of the [M + 3H]3+ ions. As demonstrated in Figure 2, predicted values for all of the residues are in good agreement with experimental values with the exception of A(20). Of note is the fact that there was no c- or z-ion for this residue to directly calculate deuterium uptake. Rather, the deuterium incorporation value was calculated using a combination of c- and z-ions. The total root mean square deviation (RMSD) of the predicted results from the experimental results is ~3.3%.

Figure 2

Experimental per-residue deuterium content versus modeled deuterium uptake values for the compact [M + 3H]3+ peptide ions. The blue bars (left) show the experimental deuterium uptake by each residue, whereas the red bars (right) are the hypothetical values estimated by HAS scoring and the NEC model

The comparison of experimental and theoretical uptake values for the more diffuse conformer (Ω = 438 Å²) of the [M + 3H]3+ ions is shown in Figure 3. The predicted deuterium uptake values exhibit ~3.1% RMSD compared with the experimental results. As with the more compact conformer, the uptake value for A(20) was calculated using a series of c- and z-ions. The error associated with this calculation cannot be estimated but may explain the difference between the experimental and predicted uptake values.

Figure 3

Comparison of experimental and theoretical per-residue deuterium content patterns for the more diffuse [M + 3H]3+ peptide ions. Blue (left) and red (right) bars represent the experimental and theoretical per-residue deuterium uptake values, respectively. Hypothetical values (red bars) are estimated by HAS scoring and the NEC model

Although the HAS-NEC model can approximate the experimental results to within <4% error for both conformer types, the model is based on several assumptions and, like any other model, these are associated with some error. Additionally, the long experimental timescale (up to ~9 ms) in comparison to the production MD (5 ns) can lead to a lack of conformational space sampling (see the prior treatment of this topic in the associated manuscript for a more detailed discussion [30]), which can also be a source of error. Experimental per-residue deuterium uptake values exhibit an average error of <10%. Because experimental deuterium uptake values and theoretical deuterium uptake values are utilized in the NNLR function to produce a coefficient matrix (containing population values), the final model result is expected to have an average error of <11%. It is instructive to note that the theoretical per-residue deuterium uptake values for in-silico structures at 20,000 NEC produce on average >50% contrast between structures. Thus, the filtering with HDX results is statistically meaningful (confidence interval >99%).

ETD Spectral Construction Using HAS Scoring

Although the comparisons of deuterium uptake values for individual amino acid residues can be used to assess the quality of the fit, these data lack the isotopic distribution information afforded by the ETD mass spectrum. That is, the ETD mass spectra contain isotopic distribution patterns that can also be compared with theoretical patterns to evaluate the quality of the fit. Previous studies have shown that isotopic distribution data can be examined to consider the possibility for the existence of multiple gas-phase conformer types of similar mobility that could contribute to the overall HDX uptake values [20]. As an example, a large number of conformer types with similar CCS values but distinct HDX reactivity could dramatically affect the observed isotopic distribution. Supplementary Figure 5 provides an illustration of such an effect using three hypothetical isotopic distributions for a fragment ion. Under saturation conditions with complete exchange, the isotopic distribution would be narrow (similar to that for the ions in the absence of D2O, albeit shifted to higher m/z values). Ions of similar mobility and relatively similar HDX propensities would produce a broadened isotopic distribution, as shown in Supplementary Figure 5. In the case that ions of similar mobility exhibit very different HDX behavior, multiplet isotopic distributions may be observed (also depicted in Supplementary Figure 5).

Comparison of theoretically predicted and experimental ETD spectra can provide an estimate of the accuracy of the methodology. Indeed, it may be argued that the isotopic distribution should be used rather than the per-residue deuterium uptake values for more accurate conformer type selection utilizing the gas-phase HDX data. However, such a procedure is associated with some significant limitations and obstacles to full implementation. For example, a precise prediction of the isotopic envelope would require per-residue HDX kinetics data, which are experimentally laborious to obtain, and the modeling for this procedure is extremely time-intensive [20]. Here, the HAS scoring coupled with a slight variation of the NEC model (see Experimental section) presents an uptake value for each ion. Using a Monte Carlo approach, a two-dimensional array (1000 × j, where j is the number of exchange sites on a given fragment ion) is populated with exchange events. In this manner, the distribution of deuterium on the exchange sites for 1000 ions can be estimated, and subsequently the isotopic distribution for the best-fit composition of ion populations can be generated. Figure 4 compares the predicted and experimental isotopic distributions for c10 and c19 fragment ions generated from the compact and more diffuse [M + 3H]3+ precursor ions. Overall, the model generally captures the width of the isotopic distribution for both ions. Additionally, the model predicts trends in isotopologue abundance very well; the overall RMSD ranges from 4.5% to 7.5%. It is important to note that we have previously demonstrated that single conformer types do not adequately model the width of the isotopic distribution for ETD fragment ions [20]. In general, they produce distributions that are significantly narrower and differ in shape. The agreement in isotopic distributions shown in Figure 4 further confirms a degree of conformer heterogeneity for these peptide ions.
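The two-dimensional Monte Carlo array described above can be sketched as follows. This is an illustration under assumptions rather than the procedure used to generate Figure 4: the function names are hypothetical, the per-site NEC values are taken as given inputs, the output is the distribution of the number of incorporated deuteriums per ion (the natural isotopic envelope of the fragment is not convolved in), and mixing structures by their population values is an assumed way of forming the best-fit composition.

```python
import random
from collections import Counter


def deuterium_count_distribution(n_eff_sites, n_ions=1000, rng=None):
    """Fraction of ions carrying 0, 1, ..., j deuteriums on a fragment.

    n_eff_sites: effective collisions (per n_ions ions) for each of the
                 j exchange sites on the fragment ion
    """
    rng = rng or random.Random()
    n_sites = len(n_eff_sites)
    ions = [[0] * n_sites for _ in range(n_ions)]  # n_ions x j array
    for site, n_eff in enumerate(n_eff_sites):
        for _ in range(round(n_eff)):
            ions[rng.randrange(n_ions)][site] = 1  # exchange event
    counts = Counter(sum(row) for row in ions)
    return [counts.get(k, 0) / n_ions for k in range(n_sites + 1)]


def mix_distributions(distributions, populations):
    """Population-weighted average of per-structure distributions."""
    total = sum(populations)
    length = len(distributions[0])
    return [
        sum(p * d[k] for p, d in zip(populations, distributions)) / total
        for k in range(length)
    ]


# Hypothetical fragment with three exchange sites and two structures.
rng = random.Random(0)
dist_a = deuterium_count_distribution([2000, 500, 100], rng=rng)
dist_b = deuterium_count_distribution([300, 300, 3000], rng=rng)
mixed = mix_distributions([dist_a, dist_b], populations=[0.6, 0.4])
```

Mixing structures with different per-site reactivities tends to broaden the resulting distribution relative to any single structure, which is the effect discussed in the text.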

Figure 4

A depiction of the theoretical isotopic envelopes versus experimental results for two fragment ions originating from labeled peptide ions (gas-phase HDX). Blue bars (left) represent the predicted values, whereas the red bars (right) show the experimental intensity value of each isotope. The top panels show results for the compact conformer and the bottom panels represent the more diffuse structural type. Panels on the left and right show results for c10 and c19 ions, respectively

Gas-Phase Ion Structures

Using HDX as a second criterion for ion structure filtering in combination with NNLR analysis, the number of candidate structures was decreased to 6 and 7 species (nominal structures) for the compact and more diffuse conformer types, respectively. Figures 5 and 6 show the nominal structures estimated to be the most abundant for the compact and more diffuse conformers of the [M + 3H]3+ peptide ions, respectively. The lower abundance nominal structures are shown in Supplementary Figures 6 and 7 for the compact and more diffuse conformer types, respectively. For comparison, the annealed structures (optimized structures resulting from the SA procedure) are shown along with the gas-phase structures (nearest structures to the centroid of the gas-phase MD trajectory).

Figure 5

The most prevalent structures (nominal structures) matching the criteria for compact [M + 3H]3+ peptide ions. The panels labeled as “Annealed structure” show the conformers resulting from SA while the “Gas-phase structure” is obtained by finding the nearest structure to the centroid of the production MD. Proposed population values (p) and a structure number (#) are also assigned to each structure and labeled in the figure

Figure 6

The most abundant nominal structures representing the more diffuse [M + 3H]3+ peptide ion conformer. Annealed structures are shown in the left panels and the closest structures to the centroids of the production MD are depicted in the right panels. Relative population (P) and structure numbers (#) are also provided

The proposed most abundant gas-phase structures for the compact conformer type can be compared according to their degree of helicity; nominal structures accounting for ~57% of the population (#1 P = 39%, #3 P = 16%, #5 P = 2%) exhibit a high degree of helicity (>40%), whereas the remaining nominal structures (~43% of the population) show relatively random structures. Similar to prior predictions [26], the majority of the ion population (including structures #1, #3, #5, and #6) contains the charge site arrangement of K(5), K(16), and K(21). Results for the more diffuse conformer type exhibit a greater variety in nominal structures (Figure 6 and Supplementary Figure 7) but, in aggregate, less helicity is observed than for the more compact ions. Structures #2, #3, #5, and #6 have a charge site configuration comprising the K(5), K(11), and K(21) residues.

For the compact ions, structures #1, #3, and #5 display a high degree of similarity. Additionally, structures #1 and #5 have a similar appearance. The degree of structural similarity here may suggest a sampling of the same area of conformation space. As discussed in the first installment of this work [30], the production MD simulation time is not comparable to the drift time of the ions. Simulated annealing can be exploited to overcome barriers on the potential energy surface and therefore address the short time scale of production MD. These structures point to the same area of conformation space and thus may not be independent of one another at 300 K. Therefore, the relatively short timescale of the MD production runs (in comparison to the drift time) and the limited sampling of conformation space cause these apparent differences. To clarify, one can assume that two trajectories point to the same area of conformation space available to an ion for an extended period of time. Additionally, the NNLR fit is a linear combination of contributions, as illustrated in Equation 7:

$$ \left({V}_1+{V}_2\right) a=\left({V}_1\right) a+\left({V}_2\right) a. $$
(7)

In Equation 7, a is the coefficient matrix (containing the ion population); V1 and V2 are the HDX results for trajectory 1 and trajectory 2, respectively. Therefore, although V1 and V2 are presented to the function as separate entities, they may point to the same trajectory at different time points. These structures may not necessarily be different conformer types; rather, they sample different portions of the same trajectory.
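The consequence of this linearity can be illustrated with a small numerical sketch. All uptake patterns and population values below are hypothetical, and no regression is performed; the example only shows that when two candidate patterns are identical (i.e., two trajectories sampling the same region of conformation space), any split of their combined population reproduces the same predicted uptake, so only the sum of the two coefficients is determined by the fit.

```python
def predicted_uptake(patterns, populations):
    """Linear combination sum_i a_i * V_i of per-residue uptake patterns."""
    n_sites = len(patterns[0])
    return [sum(a * v[k] for a, v in zip(populations, patterns))
            for k in range(n_sites)]


# Hypothetical per-residue uptake patterns for three candidate trajectories.
v1 = [0.9, 0.1, 0.6, 0.3]
v2 = list(v1)               # second trajectory sampling the same region
v3 = [0.1, 0.8, 0.2, 0.7]

# Two different ways of dividing a 60% population between v1 and v2.
split_a = predicted_uptake([v1, v2, v3], [0.60, 0.00, 0.40])
split_b = predicted_uptake([v1, v2, v3], [0.25, 0.35, 0.40])

# Treating v1 and v2 as a single structure with the summed population.
merged = predicted_uptake([v1, v3], [0.60, 0.40])

# Largest difference between the predictions of the two splits.
max_diff = max(abs(x - y) for x, y in zip(split_a, split_b))
```

Because `split_a`, `split_b`, and `merged` agree to within floating-point rounding, a fit cannot distinguish between them; this is consistent with treating such similar structures as one conformer type.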

Ion Segment HDX Behavior Analysis for [M + 4H]4+ Ions

The study of quadruply charged ions provides the unique opportunity to assess the HAS-NEC model for another ion structure type (i.e., diffuse and less flexible ion conformers). [M + 4H]4+ peptide ions exhibit three partially-resolved conformer types with CCS values of Ω = 492, 506, and 534 Å2 as shown in Supplementary Figure 8. These ions produce very low intensities such that the ETD spectra produced upon drift selection for each conformer type are not of sufficient quality to investigate the HDX behavior of these ion conformers individually. Therefore, a wide drift time selection was applied encompassing all three conformer types in order to study the behavior of diffuse and less flexible ion conformers (see below).

Interestingly, for these more elongated [M + 4H]4+ peptide ions, lower uptake values (~4.5 deuteriums) are observed compared with the [M + 3H]3+ peptide ions (~10 deuteriums). This difference in uptake has been reported previously for peptide ions [8, 20]. These earlier studies show that lower levels of exchange may be attributed to differences in HDX kinetics and site accessibility. This lower level of deuterium incorporation limits the application of per-residue deuterium uptake studies for these ions. Considering a hypothetical 21-residue peptide with an uptake value of 4.5 deuteriums, the average deuterium contribution from each residue would be ~0.2 deuteriums. The experimental error in obtaining per-residue deuterium uptake information is typically ~0.1 deuteriums [8, 20]. It may thus be argued that for species exhibiting limited exchange and having considerable numbers of amino acid residues, the per-residue deuterium uptake value would not be meaningful. This would limit the applicability of the approach. Here, another strategy that divides the peptides into several segments has been utilized to study the HDX behavior of such ions. For the [M + 4H]4+ peptide ions, four different segments, namely acetyl-PAAAAK, AAAAK, AAAAK, and AAAAK, are used. HAS-NEC is then utilized to propose theoretical uptake values for each segment. Here it is noted that such an approach can address not only situations in which low levels of per-residue deuterium incorporation are encountered but also those in which incomplete fragmentation may be observed. Notably, both problems would be relevant in larger protein ion analyses, and so the study of the quadruply charged ions presents a proof-of-principle structure characterization.
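The arithmetic behind this argument can be sketched as follows. Two simplifications are assumed for illustration only: the total uptake is spread uniformly over all residues, and the ~0.1 deuterium uncertainty is applied unchanged to a segment-level value. Neither assumption is taken from the measurements themselves.

```python
TOTAL_UPTAKE = 4.5    # deuteriums, approximate value for [M + 4H]4+ ions
TYPICAL_ERROR = 0.1   # deuteriums, typical per-residue experimental error

# Segment names and lengths (in residues) for acetyl-PAAAAKAAAAKAAAAKAAAAK.
SEGMENTS = [("acetyl-PAAAAK", 6), ("AAAAK", 5), ("AAAAK", 5), ("AAAAK", 5)]

n_residues = sum(length for _, length in SEGMENTS)

# Average uptake per residue under the uniform-spread assumption.
per_residue = TOTAL_UPTAKE / n_residues

# Average uptake per segment under the same assumption.
per_segment = [(name, per_residue * length) for name, length in SEGMENTS]

# Ratio of the expected value to the assumed uncertainty.
residue_ratio = per_residue / TYPICAL_ERROR
segment_ratios = [uptake / TYPICAL_ERROR for _, uptake in per_segment]

print(f"per residue: {per_residue:.2f} D (value/error = {residue_ratio:.1f})")
for (name, uptake), ratio in zip(per_segment, segment_ratios):
    print(f"{name}: {uptake:.2f} D (value/error = {ratio:.1f})")
```

Under these assumptions, the average per-residue value is only about twice the typical error, whereas each segment value exceeds it by roughly an order of magnitude, which is the motivation for the segment-wise treatment.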

The candidate structures of [M + 4H]4+ ions were subjected to the HAS-NEC model. Supplementary Figure 9 compares the experimental and predicted uptake values for nominal structures generated from MD simulations [30]. Gas-phase structures representing the different conformer types of [M + 4H]4+ ions filtered by the HAS-NEC model are shown in Supplementary Figure 8. It is worthwhile to mention that for the quadruply charged ions, the elongation of the ion due to excessive charge density would not allow preservation of solution-like structures; therefore, these structures are discussed from a reference point of gas-phase structure stabilization. All three conformer types display a degree of helicity within the first peptide ion segment (acetyl-PAAAAK), where the charge located at K(6) may stabilize the helix [44]. Significant Coulomb repulsion and the presence of a positive charge at N-terminal locations of subsequent segments do not allow these portions to adopt a specific secondary structure. Overall, they are relatively rigid structures of varying degrees of elongation.

As with the [M + 3H]3+ ions, application of the HAS-NEC model can also capture the exchange behavior of the quadruply charged ions. Having established the types of conformers contributing to the HDX behavior, the total deuterium uptake pattern of conformer types within the drift selection can be expressed as:

$$ V={\alpha}_1{V}_1 + {\alpha}_2{V}_2 + {\alpha}_3{V}_3 $$

in which V represents a vector of the experimental deuterium uptake for the four segments. Vi represents the hypothetical deuterium uptake for the ith conformer (Supplementary Figure 8) from HAS-NEC. The coefficients αi are calculated using peak intensity for each conformer type comprising the mobility selection. Supplementary Figure 9 shows the comparison of the experimental and theoretical values from this treatment. The high degree of agreement between theory and experiment is promising for the segment-wise application of the HAS-NEC model to larger protein systems in which incomplete fragmentation patterns are encountered.
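A minimal sketch of this weighted combination is given below. The peak intensities and per-segment uptake values are hypothetical placeholders, and normalizing the intensities so that the coefficients sum to one is an assumption about how the αi values are obtained from the peak intensities.

```python
def normalise(intensities):
    """Convert peak intensities to fractional coefficients (assumed to sum to 1)."""
    total = sum(intensities)
    return [value / total for value in intensities]


def combined_uptake(alphas, conformer_uptakes):
    """Return V = sum_i alpha_i * V_i, evaluated segment by segment."""
    n_segments = len(conformer_uptakes[0])
    return [sum(a * v[s] for a, v in zip(alphas, conformer_uptakes))
            for s in range(n_segments)]


# Hypothetical peak intensities for the three conformer types.
peak_intensities = [2.0, 5.0, 3.0]

# Hypothetical model uptake values for the four segments
# (acetyl-PAAAAK, AAAAK, AAAAK, AAAAK) of each conformer type.
segment_uptakes = [
    [1.6, 1.2, 1.0, 0.9],
    [1.4, 1.1, 1.0, 0.8],
    [1.2, 1.0, 0.9, 0.7],
]

alphas = normalise(peak_intensities)
V = combined_uptake(alphas, segment_uptakes)
```

The resulting vector `V` has one entry per segment and can be compared directly with the experimental segment uptake values.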

Conclusions

In the first installment of this work, the simulated annealing approach was utilized to sample conformation space for the model peptide acetyl-PAAAAKAAAAKAAAAKAAAAK. Production MD at constant temperature was applied to the sampled structures to provide a more realistic conception of the in-silico structures at 300 K. Additionally, a novel procedure was proposed to calculate CCS values for the in-silico trajectories. More than 300 candidate structures with CCS values matching the experimentally measured CCS values for the more compact and more diffuse conformer types of [M + 3H]3+ peptide ions were extracted from a pool of 4000 structures. Here, the application of gas-phase HDX measurements is presented for further structure elucidation. A new algorithm based on the mechanism of gas-phase HDX with D2O reagent [1, 2] is proposed to yield a theoretical relative site-specific reactivity for in-silico structures (HAS scoring). A Monte Carlo approach is applied to model the HDX behavior of an exchange site as a function of the number of effective collisions (NEC). Subsequently, HAS-NEC is applied to all candidate structures (matching CCS) to produce a hypothetical deuterium uptake pattern for each structure. Non-negative linear regression is then exploited to solve for a best-fit reconstruction of the experimental results. In this manner, not only is the number of candidate structures reduced by a factor of 20, but also a relative population is assigned to the nominal structures. To assay the accuracy of the model, the isotopic envelopes of four fragment (c-) ions were generated and compared with experimental distributions.