Introduction

Native ion mobility-mass spectrometry (IM-MS) can provide a wealth of information about the size, shape, charge state, mass, stoichiometry, and/or topology of ionized, native-like biomolecules and biomolecular complexes, in addition to other properties [1–30]. IM spectrometry separates ions upon passage through a buffer gas (typically helium, argon, or nitrogen) under the influence of an electric field as a result of the size- and conformation-dependent drag force induced by repeated collisions between the ion and buffer gas. In native IM-MS, ion mobility values in the low-field limit are typically measured using drift tube, traveling wave ion guide, or, more recently, trapped ion mobility instrumentation before subsequent mass spectrometry analysis [10, 31]. Structural information is typically obtained by converting measured drift times or ion mobility values to collisional cross-sections (CCSs) and comparing these to other experimental data or CCSs computed for model structures. However, accurately calculating CCSs with explicit computation of the scattering process can be very challenging and time-consuming due to contributions of long-range polarization interactions, orbiting, multiple scattering, and diffuse scattering, among other interactions. Due to the computational difficulty of modeling CCSs in a physically explicit manner, ions with masses greater than ~10 kDa are commonly modeled using non-explicit methods (the “projection approximation”, PA [32–34]; the “projected superposition approximation”, PSA [35]; and the “local collision probability approximation”, LCPA [36]), or using hard-sphere collisions (the “exact” and “diffuse hard spheres scattering” methods, EHSS [33, 37] and DHSS [9, 38, 39]). Although physically explicit methods (the “trajectory method”, TM [33], and the “diffuse trajectory method”, DTM [9, 38, 39]) can be highly accurate, they can be prohibitively slow for large ions as currently implemented.

Generally, methods for approximating the CCS of an ion in the low-field limit balance computation time and the level of physical detail of the modeled collision events. Simple methods (such as the PA and PSA) for calculating CCSs use projections of spherical model atoms to find the orientationally-averaged area of the ion’s two-dimensional silhouette. More complex methods model the ion in three dimensions but do not explicitly model either collision gas polarization effects (EHSS, DHSS) or three-dimensional trajectories (LCPA). The most explicit currently available method for calculating CCSs, the DTM, accounts for energy transfer between the collision particles and the ion using a hybrid of explicit classical scattering trajectories and stochastic modeling of inelastic scattering. Features of these various computational methods are summarized in Table 1.

Table 1 Features Present (+) or Absent (−) in Various CCS Calculation Tools. Methods in Bold Type use Two-Dimensional Projections rather than Explicit Three-Dimensional Scattering for CCS Computations. Asterisks (*) Indicate that a Geometry-Dependent Correction Factor is used to Partially Account for 3-Dimensional Scattering

The TM evaluates an ion’s CCS by approximating the solution to the non-equilibrium Boltzmann Transport Equation [40, 41] as the momentum transfer integral \( {\varOmega}_{avg}^{(1,1)} \). A common choice of potential energy surface for the TM is the sum of the Lennard-Jones 6-12 (L-J 6-12) potentials of each of the atoms in the ion for the buffer gas of interest, and, at added computational expense, an ion-induced dipole (“\( C_4 \)”) term [33, 39]. In principle, CCSs from the TM account for long-range scattering, multiple scattering, the ion’s charge, and temperature effects. However, final computed CCSs for large biomolecules and assemblies using elastic collisions are typically about 5% higher than experimentally measured values [3]. This is believed to result in part from gas-phase compaction that occurs due to self-solvation of the biomolecular ion during the electrospray ionization process, in addition to any inherent errors in the approximation method, or in converting experimental drift times to CCSs [30].

Here, we introduce a new open-source program, Collidoscope, which approximates CCSs via the TM, using parallelized code to optimize performance. Collidoscope also contains an option to compute CCSs with N2 as the buffer gas, using optimized L-J 6-12 parameters that result in CCSs closely matching experimental values. The essential features of the computational strategy and optimization procedure used for programming Collidoscope are described below, and results from Collidoscope are compared with experimental and other computational data. The utility of Collidoscope for interpreting structures of very large native-like biomolecular ions (with masses approaching ~1 MDa) is also discussed.

Theory

Physical Model

Collidoscope calculates an ion’s CCS by approximating the temperature-dependent, orientationally-averaged momentum transfer term, \( {\varOmega}_{avg}(T) \), which arises in the solution to the Boltzmann Transport Equation. Collisions are simulated by integrating trajectories along the total ion-collision particle potential energy surface, approximated as the sum of L-J 6-12 potentials centered at each atom of the ion plus a “\( C_4 \)” term to account for ion-induced dipole effects. According to classical scattering theory, \( {\varOmega}_{avg}(T) \) can be approximated in the low-field limit as the binary collision integral

$$ {\varOmega}_{avg}^{\left(1,1\right)}(T)=\frac{{\displaystyle {\int}_0^{\infty }}{\displaystyle {\int}_0^{2\pi}}{\displaystyle {\int}_0^{\pi}}{\displaystyle {\int}_0^{2\pi}}{\displaystyle {\int}_0^{\infty }} b\left(1- \cos \left({\chi}_{b,\theta, \varphi, \gamma, g}\right)\right)\, db\, d\gamma\, \sin \varphi\, d\varphi\, d\theta\; {g}^5\, {e}^{\frac{-\mu {g}^2}{2{k}_B T}}\, dg}{4\pi {\displaystyle {\int}_0^{\infty }}{g}^5\, {e}^{\frac{-\mu {g}^2}{2{k}_B T}}\, dg} $$
(1)

where g is the initial speed of the collision particle relative to the ion; b is the impact parameter (defined as the initial distance of the approaching particle to the collision axis, see Figure 1); θ, φ, and γ are the angles that define the relative orientation of the ion and collision particle; \( k_B \) is the Boltzmann constant; T is the absolute temperature; χ is the scattering angle; and μ is the reduced mass of the system [33, 40, 41]. (Note that the \( g^5 \) term reflects laboratory-frame orientational averaging of the particle and ion velocities, with probabilities given by Maxwell-Boltzmann distributions, as well as momentum transfer.)

Figure 1

Graphical depiction of the simulation system for a single trajectory. A collision particle incident from the plane of origin with impact parameter b scatters off the potential energy surface at angle χ

In Equation 1, θ and φ define the direction of the collision axis, whereas γ describes the ion’s relative rotation about that axis. Since the γ and b parameters together describe a plane perpendicular to the collision axis, the integral over bdbdγ is equivalent to an integral over the differential area, dA. In addition, the double integral over sinφdφdθ can be written as an integral over differential solid angle, dω. Making the above substitutions results in the following equation:

$$ {\varOmega}_{avg}^{\left(1,1\right)}(T)=\frac{{\displaystyle {\int}_0^{\infty }}{\displaystyle \iint }{\displaystyle \iint}\left(1- \cos \left({\chi}_{x, y,\theta, \varphi, g}\right)\right)\, dA\, d\omega\; {g}^5\, {e}^{\frac{-\mu {g}^2}{2{k}_B T}}\, dg}{4\pi {\displaystyle {\int}_0^{\infty }}{g}^5\, {e}^{\frac{-\mu {g}^2}{2{k}_B T}}\, dg} $$
(2)

where dA is the differential area of a plane representing the origin of the collision particles before scattering (the “plane of origin”, see Figure 1).

In Equation 2, x and y describe the initial position of the collision particle in the plane of origin, whereas θ and φ describe the orientation of the plane of origin relative to the ion, and the denominator is a normalization factor. Collidoscope uses judicious choices of initial parameters to achieve uniform sampling of dω and dA as well as a Maxwell-Boltzmann-weighted sampling of g. These sampling choices, as well as the choice of potential surface and trajectory integration method, are described in greater detail below.
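
One way to write the discretized form of Equation 2 implied by these sampling choices is sketched here. The symbols \( N_v \) (number of sampled orientations), \( \Delta A \) (area of one grid cell in the plane of origin), and \( w_k \) (normalized weight of the k-th sampled speed) are notation introduced for illustration and may not match the program's internal variables:

$$ {\varOmega}_{avg}^{\left(1,1\right)}(T)\approx \sum_{k} w_k\left[\frac{1}{N_v}\sum_{v=1}^{N_v}\sum_{j}\left(1-\cos {\chi}_{j,v,k}\right)\Delta A\right],\qquad w_k=\frac{{\displaystyle {\int}_{g_k^{-}}^{g_k^{+}}}{g}^5{e}^{\frac{-\mu {g}^2}{2{k}_B T}} dg}{{\displaystyle {\int}_0^{\infty }}{g}^5{e}^{\frac{-\mu {g}^2}{2{k}_B T}} dg} $$

Here j runs over the grid points in one plane of origin, uniform sampling of the solid angle replaces \( \frac{1}{4\pi}\iint d\omega \) by the average over the \( N_v \) orientations, and \( g_k^{-} \) and \( g_k^{+} \) are the integration bounds associated with the k-th sampled speed, so that the weights sum to one.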

Sampling of Relative Orientation

Orientational averaging is achieved using a user-defined set of “vantage points” that determine a set of collision axes and corresponding planes of origin for the scattering simulations. To accurately approximate the integral over dω in Equation 2, these vantage points should ideally be sampled uniformly over the sphere of possible ion orientations with high density; otherwise, a poor estimate of \( {\varOmega}_{avg}(T) \) may be obtained. Vertices of regular or quasi-regular polyhedra are thus reasonable choices for the sets of vantage points.
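
As a minimal sketch of one such choice, the Python snippet below generates the 12 vertices of a regular icosahedron as unit vectors. The function name is hypothetical, and the icosahedron is only an example of a regular polyhedron; it is not claimed to be the default vantage-point set.

```python
import itertools
import math


def icosahedron_vantage_points():
    """Return the 12 vertices of a regular icosahedron as unit vectors.

    Illustrative only: the vertices are the cyclic permutations of
    (0, +/-1, +/-phi), where phi is the golden ratio.
    """
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    norm = math.sqrt(1.0 + phi * phi)
    points = []
    for a, b in itertools.product((-1.0, 1.0), repeat=2):
        points.append((0.0, a / norm, b * phi / norm))
        points.append((a / norm, b * phi / norm, 0.0))
        points.append((a * phi / norm, 0.0, b / norm))
    return points
```

Each returned vector can serve as a collision axis, with the plane of origin taken perpendicular to it. Denser quasi-regular sets can be substituted when finer orientational sampling is needed.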

Sampling of Impact Parameter

Within each plane of origin, trajectories are initiated from points arranged in a uniformly-spaced square grid. Thus, dA in Equation 2 is approximated as the square of the distance between neighboring grid points for each trajectory. Scattering trajectories are computed only for the grid points for which both plane-of-origin coordinates (x and y in Figure 1) are smaller than the furthest distance of an atom in the ion to the center of mass, plus an additional “grid buffer” distance. Although the distance from the collision axis for which collision particles scatter strongly depends on the size and shape of the ion, use of a sufficiently large grid buffer distance ensures that significantly scattered trajectories are well-sampled at minimal computational expense.
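
The grid-based area integral can be checked on a case with a known answer. The sketch below (not Collidoscope's code) sums (1 − cos χ)·dA over a square grid for a single hard sphere of radius R, for which specular scattering gives 1 − cos χ = 2(1 − b²/R²) inside the sphere and the exact momentum transfer cross-section is πR². The function names and the hard-sphere test case are assumptions introduced for illustration.

```python
import math


def one_minus_cos_chi_hard_sphere(b, radius):
    """1 - cos(chi) for specular scattering off a hard sphere (0 for a miss)."""
    if b >= radius:
        return 0.0
    return 2.0 * (1.0 - (b / radius) ** 2)


def grid_cross_section(radius, spacing, grid_buffer):
    """Sum (1 - cos chi) * dA over a uniformly spaced square grid.

    The grid spans the sphere radius plus a buffer distance on each side,
    and dA is the square of the grid spacing.
    """
    half_width = radius + grid_buffer
    n = math.ceil(2.0 * half_width / spacing)
    cell_area = spacing ** 2
    total = 0.0
    for ix in range(n):
        x = -half_width + (ix + 0.5) * spacing
        for iy in range(n):
            y = -half_width + (iy + 0.5) * spacing
            b = math.hypot(x, y)  # impact parameter of this grid point
            total += one_minus_cos_chi_hard_sphere(b, radius) * cell_area
    return total
```

For a sphere of radius 10 Å sampled with a 0.1 Å spacing, the grid sum should lie close to π × 100 Å², and coarsening the spacing shows how quickly the estimate degrades.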

For additional computational efficiency, Collidoscope performs a calculation to assess whether a trajectory within the region described above should be included in the computation of \( {\varOmega}_{avg}^{(1,1)} \). The potential energy of a collision particle at its closest approach to any atom in the ion in the absence of any scattering is calculated. Scattering trajectories are then computed only if this potential energy is larger than a specified cut-off value because such trajectories are likely to scatter significantly. Effectively, for each plane of origin, the trajectories included in the CCS calculation are initiated from a region that broadly resembles an expanded silhouette of the ion in that plane. This procedure for choosing trajectories to run results in more efficient computation. The computation time as a function of the potential energy cut-off is shown in Supplementary Figure S1 in the Supporting Information.
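
A possible form of this pre-screening step is sketched below. It makes several assumptions that are not specified above: the collision axis is taken as z, only the L-J 6-12 part of the potential is evaluated, the magnitude of the energy is compared with the cut-off, and all names and the atom tuple layout are hypothetical.

```python
import math


def lj_sum(point, atoms):
    """Sum of L-J 6-12 terms at `point`; atoms are (x, y, z, eps, r_min)."""
    total = 0.0
    for ax, ay, az, eps, r_min in atoms:
        r = math.dist(point, (ax, ay, az))
        if r == 0.0:
            return math.inf  # particle sits on an atom: certainly scatters
        s6 = (r_min / r) ** 6
        total += eps * (s6 * s6 - 2.0 * s6)
    return total


def should_run_trajectory(x, y, atoms, cutoff):
    """Decide whether the grid point (x, y) merits a full trajectory.

    The unscattered path is the straight line parallel to z through (x, y).
    Its closest approach to any atom is found, and the potential there is
    compared (by magnitude, an assumption) with the cut-off.
    """
    nearest = min(atoms, key=lambda a: math.hypot(x - a[0], y - a[1]))
    closest_point = (x, y, nearest[2])
    return abs(lj_sum(closest_point, atoms)) > cutoff
```

Grid points that fail this test are skipped, so the retained points trace out the expanded silhouette described above.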

Sampling of Relative Velocities

Initial collision speeds are sampled at equally spaced values, and the \( {\varOmega}_{avg}^{(1,1)} \) obtained for each collision particle kinetic energy is weighted by a probability factor to obtain \( {\varOmega}_{avg}^{(1,1)}(T) \) (see Equation 2). Each initial kinetic energy state’s contribution is weighted by a corresponding analytic integral of the probability density function from Equation 2, \( {g}^5 \exp \left(\frac{-\mu {g}^2}{2{k}_B T}\right) \), over a range of energies containing that initial kinetic energy state (see Figure 2). The number of initial kinetic energy states and the minimum and maximum initial kinetic energies used can be input by the user. For each sampled initial kinetic energy, the bounds of integration, which determine the weighting, are spaced equally in kinetic energy, with the exception that the lowest and highest bounds are 0 and infinity, respectively. This ensures that \( {\varOmega}_{avg}^{(1,1)}(T) \) is not affected by energy state sampling in the case that it is strictly independent of temperature. (Indeed, temperature dependence of CCSs appears to be very weak for ions with a fixed conformation containing at least a few dozen atoms over the range ~200–400 K; see Figure 2 and Supplementary Figure S2.) However, even the weak dependence of cross-section on temperature necessitates more densely sampled parameters for temperatures far from room temperature.
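
The analytic weights can be illustrated as follows. With the substitution x = μg²/(2k_BT), the density g⁵exp(−μg²/2k_BT)dg becomes proportional to x²e⁻ˣdx, whose normalized cumulative integral is 1 − e⁻ˣ(1 + x + x²/2). The sketch below assumes, for illustration only, that the interior integration bounds lie midway between neighboring sampled energies; the function names are hypothetical.

```python
import math


def speed_cdf(x):
    """Normalized integral of x^2 * exp(-x) from 0 to x, with x = E / (k_B T)."""
    if math.isinf(x):
        return 1.0
    return 1.0 - math.exp(-x) * (1.0 + x + 0.5 * x * x)


def energy_state_weights(reduced_energies):
    """Weights for sampled reduced kinetic energies (ascending, in units of k_B T).

    Bounds are midpoints between neighbors (an assumption), except that the
    lowest bound is 0 and the highest is infinity.
    """
    pairs = zip(reduced_energies, reduced_energies[1:])
    bounds = [0.0] + [0.5 * (lo + hi) for lo, hi in pairs] + [math.inf]
    return [speed_cdf(hi) - speed_cdf(lo) for lo, hi in zip(bounds, bounds[1:])]
```

Because the outermost bounds are 0 and infinity, the weights always sum to one, so a cross-section that is the same at every sampled energy is returned unchanged by the weighted average.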

Figure 2

Optimized default settings for initial collision speed sampling. The red curve represents the probability density function for the simulated trajectory’s initial conditions as a function of normalized collision speed, dots represent sampled collision speed values, and dashed vertical lines represent limits of integration used for probability weightings equal to the areas of the corresponding colored regions (see text). The black curve is the calculated CCSHe for LFN (the N-terminal domain of anthrax lethal factor protein) at 298 K, as a function of buffer gas speed

Model Potential Energy Surface

At present, modeling polarization interactions quantum mechanically would require high computational expense even for small ions. Instead, we follow the general approach of MOBCAL [33], the IMoS suite [9, 38, 39], the PSA [35], and the LCPA [36] by approximating the potential energy surface as the sum of the L-J 6-12 potentials attributable to each atom in the ion, plus a “\( C_4 \)” potential attributable to the polarizability of the collision particle and the localized charges on the ion:

$$ U(r)={\displaystyle \sum_{atom}}{\varepsilon}_i\left[{\left(\frac{r_{min, i}}{r_i}\right)}^{12}-2{\left(\frac{r_{min, i}}{r_i}\right)}^6\right]-\frac{\alpha {e}^2}{8\pi {\varepsilon}_0}\left({\left({\displaystyle \sum_{atom}}{z}_i\frac{r_{x, i}}{{r_i}^3}\right)}^2+{\left({\displaystyle \sum_{atom}}{z}_i\frac{r_{y, i}}{{r_i}^3}\right)}^2+{\left({\displaystyle \sum_{atom}}{z}_i\frac{r_{z, i}}{{r_i}^3}\right)}^2\right) $$
(3)

where \( {\varepsilon}_i \) and \( r_{min,i} \) are the L-J 6-12 parameters for well-depth and radius at minimum potential between atom i and the collision particle, respectively, \( {\varepsilon}_0 \) is the permittivity of free space, α is the effective isotropic polarizability volume of the collision particle, e is the elementary charge, \( z_i \) is the charge state of each atom, and \( \left(r_{x,i}, r_{y,i}, r_{z,i}\right) \) is the vector between atom i and the collision particle (with length \( r_i \)). Collisions are modeled elastically, and the positions of the atoms and charges in the ions are held fixed throughout the scattering computations.
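
A direct transcription of this model potential into Python might look like the sketch below. It is illustrative only: the atom tuple layout and the single `c4_prefactor` argument (standing for αe²/(8πε₀) in whatever consistent unit system is used) are assumptions, and the induced-dipole term is taken to be attractive, as expected physically for an ion-induced dipole interaction.

```python
import math


def potential_energy(pos, atoms, c4_prefactor):
    """L-J 6-12 sum plus ion-induced dipole term at the position `pos`.

    atoms: iterable of (x, y, z, eps, r_min, charge) tuples (hypothetical layout).
    c4_prefactor: alpha * e^2 / (8 * pi * eps0) in consistent units.
    """
    lj = 0.0
    field = [0.0, 0.0, 0.0]  # sums of z_i * r_vec / r^3 over atoms
    for ax, ay, az, eps, r_min, charge in atoms:
        rx, ry, rz = pos[0] - ax, pos[1] - ay, pos[2] - az
        r = math.sqrt(rx * rx + ry * ry + rz * rz)
        s6 = (r_min / r) ** 6
        lj += eps * (s6 * s6 - 2.0 * s6)
        field[0] += charge * rx / r ** 3
        field[1] += charge * ry / r ** 3
        field[2] += charge * rz / r ** 3
    induced = c4_prefactor * sum(component ** 2 for component in field)
    return lj - induced
```

For a single neutral atom the function reduces to the L-J 6-12 well, with its minimum of −ε at r = r_min. Because the field components are summed over atoms before squaring, partial cancellation between charges on opposite sides of the collision particle is captured.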

Ion-induced dipole interactions with He as the collision particle are modeled using its static electric dipolar polarizability [43]. N2 is also modeled as a quasi-spherical collision particle, using an effective polarizability that is equal to the arithmetic mean of its principal polarizabilities determined by ab initio computations [44]. Based on its moment of inertia, over the duration of a typical simulated trajectory (tens of ps), a molecule of N2 thermalized at room temperature would typically rotate about each axis a few dozen times. Thus, this quasi-spherical approximation assumes that the molecule is rotating rapidly enough that no significant rotational alignment effects occur due to collision particle polarization.

Integration of Trajectories

Integration of trajectories in time in Collidoscope is presently performed using Euler (i.e., first-order Runge-Kutta) integration. Therefore, in order to minimize calculation inaccuracy due to trajectory integration errors, multiple checks are performed to determine the numerical validity of a trajectory. The total energy of the system must be conserved to within a threshold over the course of a trajectory for it to be included in computing the CCS. To enforce this, the center-of-mass-frame kinetic and potential energy of the collision particle are calculated at the beginning and end of a trial trajectory. If the total energy changes by more than the allowed amount, the trajectory is recalculated with a shorter time step, and energy conservation is checked again. Using a smaller time step promotes stricter energy conservation, so recalculated trajectories converge toward exact energy conservation as the time step is reduced.

If a particle loses energy due to integration error, it can occasionally become trapped near the ion by the attractive portion of the potential energy surface. Therefore, any trajectories exceeding a maximum number of steps are recalculated with a shorter time step. If a trajectory either does not conserve energy or exceeds the maximum number of time steps for five consecutive attempts, the trajectory is considered to have failed and is omitted from the CCS calculation.
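
The retry logic described above can be sketched as follows, using a single L-J 6-12 center in two dimensions and reduced units. This is a simplified illustration under stated assumptions, not the program's implementation: the starting time step, energy tolerance, step limit, and all function names are hypothetical, and only the overall scheme (Euler steps, an energy check, a step-count check, and up to five attempts with a halved time step) follows the text.

```python
import math


def lj_energy_and_force(x, y, eps=1.0, r_min=1.0):
    """Energy and force components for one L-J 6-12 center at the origin."""
    r2 = x * x + y * y
    s6 = (r_min * r_min / r2) ** 3
    energy = eps * (s6 * s6 - 2.0 * s6)
    f_over_r2 = 12.0 * eps * (s6 * s6 - s6) / r2
    return energy, f_over_r2 * x, f_over_r2 * y


def euler_trajectory(b, speed, dt, mass=1.0, start=10.0, max_steps=1_000_000):
    """Euler-integrate one trajectory; return (vx, vy, relative energy error).

    Returns None if the particle has not left within max_steps.
    """
    x, y, vx, vy = -start, b, speed, 0.0
    r0_sq = x * x + y * y
    e_start = 0.5 * mass * speed ** 2 + lj_energy_and_force(x, y)[0]
    for _ in range(max_steps):
        _, fx, fy = lj_energy_and_force(x, y)
        x, y = x + vx * dt, y + vy * dt  # position update uses old velocity
        vx, vy = vx + fx / mass * dt, vy + fy / mass * dt
        if x * x + y * y > r0_sq:  # particle has left the interaction region
            e_end = 0.5 * mass * (vx * vx + vy * vy) + lj_energy_and_force(x, y)[0]
            return vx, vy, abs(e_end - e_start) / abs(e_start)
    return None


def robust_trajectory(b, speed, dt=1e-3, tol=0.01, max_attempts=5, **kwargs):
    """Retry with a halved time step until energy is conserved within tol.

    Returns (cos chi, accepted time step), or None if every attempt fails.
    """
    for _ in range(max_attempts):
        result = euler_trajectory(b, speed, dt, **kwargs)
        if result is not None and result[2] <= tol:
            vx, vy, _ = result
            return vx / math.hypot(vx, vy), dt
        dt *= 0.5
    return None
```

A `None` return corresponds to a failed trajectory that would be omitted from the CCS calculation. A higher-order integrator could replace the Euler update without changing the surrounding checks.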

Parallelization

Significantly reduced computation time is achieved in Collidoscope through parallelization of trajectory simulations. At present, many common CPUs have 12 parallel hardware threads on which calculations can be performed. Each thread can calculate independent trajectories, decreasing the wall time when using 12 threads to ~9% of the wall time needed when using one thread at a time. By default, Collidoscope automatically determines the number of hardware threads available on a computer and maximizes the number of trajectories that are run simultaneously. A version of Collidoscope that utilizes GPU parallelism will be released soon.
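
The pattern of detecting the available hardware threads and distributing independent trajectories over them can be sketched as below. This is a generic illustration rather than Collidoscope's code: `run_trajectory` is a hypothetical placeholder, and a thread pool is used only to keep the example self-contained (in CPython, CPU-bound work would need a process pool or compiled code to obtain a real speed-up).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_trajectory(params):
    """Hypothetical stand-in for one independent scattering trajectory."""
    impact_parameter, speed = params
    return impact_parameter * speed  # placeholder result


def run_all_trajectories(param_list, n_workers=None):
    """Run independent trajectories on a pool sized to the hardware threads."""
    if n_workers is None:
        n_workers = os.cpu_count() or 1  # cpu_count() may return None
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(run_trajectory, param_list))
```

Because each trajectory depends only on its own initial conditions, no communication between workers is needed until the results are summed.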

Ions Modeled

CCSs for molecules and ions ranging in mass from 12 Da to ~800 kDa with charge states between 0 and 73+ were calculated using Collidoscope and IMoS TM with He or N2 as the collision gas, and the MOBCAL TM, PA, and EHSS methods. Collidoscope and MOBCAL calculations were performed using the Linux OS, with Collidoscope parallelized over 12 threads, on an Intel X5650 CPU. IMoS computations were performed using the Windows OS, parallelized over seven threads on an Intel Core i7-4790 CPU. All computed values are reported using the notation CCSHe and CCSN2, by analogy with the notation suggested by Barran and co-workers [45]. Sources for the atomic coordinates used in the computations as well as the charge states used are described in further detail in Supplementary Table S1 in the Supporting Information. Charge for small, non-protein ions was equally distributed among all atoms for CCS computations. For protein ions, charge configurations were generated using the charge placement algorithm described below.

Charge Placement Algorithm for Protonated Protein Ions

Extending the charge placement algorithms of Williams [46, 47], Grandori [48], and Konermann [49], Collidoscope uses a Metropolis-Hastings-like [50, 51] charge placement algorithm for protonated protein ions based on the ion’s input atomic coordinates, a user-defined charge state, the calculated point-charge electrostatic repulsion energy of a given charge configuration, and the total intrinsic proton affinity of the ion. The N-terminal amine group as well as each residue is considered to be a possible charge site throughout the charge placement computation, with one possible charge per residue. Initially, the user-defined number of charges are placed in a random configuration on the defined residues, and a total apparent proton affinity, \( PA_{app} \), is calculated as the sum of the intrinsic proton affinities of all charged sites minus the electrostatic repulsion energy of the point-charge sites. This procedure is repeated 1000 times, and the charge configuration with the greatest \( PA_{app} \) is selected for Metropolis-Hastings optimization. One of the charges from this configuration is picked at random and moved to a new, random location not already occupied by a charge. The updated \( PA_{app} \) of the ion is calculated, and the new charge configuration is accepted if this value is greater than before the charge was moved. If the updated value of \( PA_{app} \) is instead lower, it is accepted with a probability \( \exp \left(-\frac{\varDelta P{A}_{app}}{k_B T}\right) \), where \( \varDelta P{A}_{app} \) is the magnitude of the decrease. This ensures that the algorithm can escape local energy minima to find a low-energy charge configuration. This procedure is repeated until the standard deviation of \( PA_{app} \) is less than \( 6\,k_B T \) for the last 25% of the iterations and the average value of \( PA_{app} \) increases by less than \( 0.1\,k_B T \) between the second-to-last and last 25% of the iterations. At this point, the optimization is considered to have converged, and the charge configuration with the highest \( PA_{app} \) among all the iterations is used for CCS calculations. Intrinsic proton affinities used in Collidoscope are identical to those used by Konermann and co-workers [49], and a relative permittivity of 2.5 is used in calculating electrostatic repulsion for all proteins other than GroEL, for which a value of 4 is used. For the proteins investigated here, the main effect of the charge placement algorithm is to spread the charges out among basic sites near the surface of the ion. (Examples of convergence of the charge placement for GroEL73+ are illustrated in Supplementary Figure S3.)
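
The core of such a charge placement scheme can be sketched in a few lines. The example below is a simplified illustration, not the program's implementation: it runs a fixed number of Metropolis steps instead of applying the convergence criteria above, works in arbitrary energy units with a single `coulomb` constant (which would absorb the relative permittivity), and uses hypothetical names and a hypothetical site layout. The acceptance test is written with the signed change in apparent proton affinity (new minus old), which is negative for a move that lowers it.

```python
import itertools
import math
import random


def apparent_pa(config, sites, coulomb):
    """Sum of intrinsic proton affinities minus point-charge repulsion.

    sites: list of (x, y, z, intrinsic_pa); config: indices of charged sites.
    """
    intrinsic = sum(sites[i][3] for i in config)
    repulsion = sum(coulomb / math.dist(sites[i][:3], sites[j][:3])
                    for i, j in itertools.combinations(config, 2))
    return intrinsic - repulsion


def place_charges(sites, n_charges, coulomb, kT=1.0,
                  n_starts=50, n_steps=500, seed=0):
    """Return (sorted best configuration, its apparent proton affinity)."""
    rng = random.Random(seed)
    indices = range(len(sites))
    # Start from the best of several random configurations.
    current = max((rng.sample(indices, n_charges) for _ in range(n_starts)),
                  key=lambda c: apparent_pa(c, sites, coulomb))
    current_pa = apparent_pa(current, sites, coulomb)
    best, best_pa = list(current), current_pa
    for _ in range(n_steps):
        free = [i for i in indices if i not in current]
        if not free:
            break  # every site is charged; nothing can move
        trial = list(current)
        trial[rng.randrange(n_charges)] = rng.choice(free)
        trial_pa = apparent_pa(trial, sites, coulomb)
        delta = trial_pa - current_pa
        # Always accept improvements; accept losses with Boltzmann probability.
        if delta >= 0.0 or rng.random() < math.exp(delta / kT):
            current, current_pa = trial, trial_pa
            if current_pa > best_pa:
                best, best_pa = list(current), current_pa
    return sorted(best), best_pa
```

For sites with equal intrinsic proton affinity, the search is expected to settle on the configuration that places the charges farthest apart, which mirrors the spreading of charges noted above.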

Results

Default Parameter Optimization

Extensive tests were performed to optimize default values for parameters that minimize computation time while preserving accuracy of the resulting CCSs. Particularly important to the optimization process were the grid buffer distance and the impact parameter spacing. Tests were performed on species spanning four orders of magnitude in mass (water, ondansetron, melittin, and LFN), and results were compared to determine if the parameter should be scaled as a function of the mass or charge state of the ion. Calculated CCSs tend asymptotically toward a “dense-sampling limit” as parameters are sampled more densely (impact parameter spacing, initial kinetic energies, time step, and vantage points) or as the computations are more extensive (grid buffer distance, initial collision particle-ion separation, and potential energy threshold). Each parameter was optimized such that calculations with the optimized value generally produced CCSs within 1% of the dense-sampling limit for that parameter. These optimization tests are described in further detail below and in the Supporting Information (Supplementary Figures S1 and S4–S13).

Impact parameter spacing in Collidoscope is optimized to scale with the number of atoms in the ion, such that the CCS of an ion with 100 atoms is computed with an impact parameter spacing equal to the van der Waals radius of a carbon atom (1.7 Å; see Supplementary Figure S8) [52]. The size of each square in the plane-of-origin grid is scaled by the number of atoms, so that the number of trajectories calculated scales approximately linearly with the number of atoms in the ion, for ions of similar density.

The grid buffer distance is optimized to be 15 Å. Smaller grid buffer distances result in significant deviations for the CCSs of ions with masses < ~1 kDa, and a grid buffer distance of 15 Å results in CCSs within 0.004% of their dense-sampling limit for all four ions (see Supplementary Figure S9). This default grid buffer distance value ensures that some trajectories traveling through regions of low-magnitude potential energy are included in the calculation, but extraneous computation time and memory use are avoided by omitting many trajectories that would deflect negligibly. Note that the grid buffer affects the CCS less for large ions, so it can be significantly reduced or excluded (to reduce computation time) for large enough ions.

The initial distance from the plane of origin to the center of mass, the cut-off energy used to fine-grain omission of trajectories, the number and geometry of vantage points, the range and number of energy states computed, and the integration time step were all optimized similarly, but these parameters were found to have less significant impact on computed CCSs than the parameters described above. Results used for optimization of these parameters, along with sample scattering trajectories simulated for C60+, melittin, and GroEL, are illustrated in Supplementary Figures S4, S5, and S10–S14.

Comparison of Collidoscope CCSs to Other Calculated and Experimental Values

CCSHe for species listed in Supplementary Table S1 calculated using Collidoscope are compared with those obtained using IMoS in Figure 3. These Collidoscope CCSHe agree with those obtained via IMoS’s TM to within 4% for all species larger than 1 kDa. Because Collidoscope and IMoS calculations were performed on different CPUs, their computation times are not directly comparable. However, Collidoscope calculations were completed with wall times typically ~2% of the equivalent TM calculations in MOBCAL, on the same CPU. This decrease in computation time is largely due to parallelism.

Figure 3

Comparison of CCS (red, He; blue, N2) obtained using Collidoscope and those obtained via IMoS’s TM approximation. The diagonal line represents perfect agreement between the two methods. For GroEL, MOBCAL PA and EHSS values are used instead of IMoS’s TM (see text)

The CCSHe and the CCSN2 for GroEL73+ were computed with Collidoscope, respectively, in 3.5 and 6 d (see Figure 4). (CCSHe were not computed for GroEL 14-mer ions using MOBCAL TM due to prohibitive computational expense, nor were they calculated using IMoS due to software restrictions on the size of molecule that can be used.) PA and EHSS CCSHe for this ion were computed with MOBCAL, and these values are 26.8% lower and 1.2% higher than the TM CCS computed with Collidoscope, respectively. These results are in agreement with previous observations that EHSS values tend to be very close to TM values for large, globular ions in low-field IMS, whereas PA values tend to be somewhat lower. No noticeable trend is observed in the relative differences between CCSHe calculated with either MOBCAL or IMoS and with Collidoscope as a function of CCS. It is therefore expected that Collidoscope can be used to accurately calculate TM CCSs of ions with masses between ~10 kDa and several MDa that may require prohibitive computation time or memory use with other methods.

Figure 4

Log–log plot of Collidoscope computation time versus molecular weight for small, dense ions and protonated proteins in He (red) and N2 (blue) buffer gas. Fitted lines illustrate different scaling law behavior for ions with different densities

A log-log plot of computation time for CCSHe and CCSN2 versus ion mass is shown in Figure 4. Interestingly, distinct scaling law behaviors are observed for small, dense ions and for larger biomolecular ions. In particular, computation time scales with ion mass, m, as \( m^{0.97} \) (resp., \( m^{0.92} \)) for low-mass ions and as \( m^{1.46} \) (resp., \( m^{1.36} \)) for larger biomolecular ions with He (resp., N2) collision gas. This result is attributed to the different densities of these two types of ions, which in Collidoscope requires calculation of more trajectories per unit mass for large biomolecular ions than for smaller, denser ions. Furthermore, computations with N2 buffer gas require roughly two to three times as much time as with He (see Figure 4).
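
As a worked illustration of what these fitted exponents imply (arithmetic only, assuming each power law holds over the mass range considered), a ten-fold increase in ion mass with He collision gas changes the computation time by a factor of approximately

$$ {10}^{0.97}\approx 9.3\kern0.5em \left(\mathrm{small},\ \mathrm{dense}\ \mathrm{ions}\right),\kern2em {10}^{1.46}\approx 29\kern0.5em \left(\mathrm{larger}\ \mathrm{biomolecular}\ \mathrm{ions}\right) $$

so the cost per unit mass stays nearly constant in the first regime but roughly triples per decade of mass in the second.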

A comparison of Collidoscope CCSHe and CCSN2 to experimental results [18, 21, 53] is shown in Figure 5. Collidoscope CCSHe are typically ~1%–3% higher than IMoS, and both values are typically higher than experimental CCSs. CCSHe calculated with Collidoscope are typically between 3% and 15% higher than experimental CCSs, with smaller relative errors for smaller ions (CCS < ~200 Å2) than for larger ions (Figure 5). By comparison, results from IMoS TM for this set of ions are typically between 1% and 13% higher than experimental CCSs.

Figure 5

Comparison of CCSs obtained using Collidoscope and those obtained from IMS experiments, using He (red) or N2 (blue) buffer gas. The diagonal line represents perfect agreement between calculated and experimental values

For Collidoscope CCSN2 calculations, optimized L-J 6-12 parameters for the quasi-spherical N2 collision particle were determined by extensive parametrization tests for r min and ε i values for both carbon and hydrogen, using experimental data for an assortment of aromatic hydrocarbon ions (see Supplementary Table S2). The r min and ε i parameters for carbon and hydrogen were optimized simultaneously, and default values were chosen that agree maximally with experimental results. We note that these optimized L-J 6-12 parameters are in some cases somewhat different from the parameters used by IMoS. This result indicates that the TM can be robust to variation of the L-J 6-12 parameters for large ions. CCSN2 obtained using these optimized parameters are typically 1% to 18% higher than experimental values for all ions studied, for both Collidoscope and IMoS (excluding GroEL, for which discrepant values have been reported in the literature [6, 29, 53]; see Figure 5).

Discussion

The overall approach used to optimize Collidoscope is to reduce computational time without sacrificing accuracy and precision, and to provide a flexible, open-source, and parallelizable code base in anticipation of future needs for researchers in the field of native IM-MS. For example, if a researcher needs to calculate a CCS for a very large ion in Collidoscope with greater precision with limited computational time, s/he may choose to change the impact parameter grid spacing or the number of energy states used. Because the computation time varies as the inverse square of the grid spacing and only linearly with the number of energy states, it is less computationally expensive to increase the number of energy states used rather than to decrease the grid spacing. However, care must be taken when increasing the number of energy states, in order to ensure that the dg integral (see Equation 2) is accurately estimated.
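
The trade-off can be made explicit with a short worked relation. Writing s for the grid spacing and \( N_E \) for the number of energy states (symbols introduced here for illustration), and assuming all other settings are held fixed, the stated scaling is

$$ t\propto \frac{N_E}{s^2}\kern1em \Rightarrow \kern1em \frac{t\left( s/2,{N}_E\right)}{t\left( s,{N}_E\right)}=4,\kern2em \frac{t\left( s,2{N}_E\right)}{t\left( s,{N}_E\right)}=2 $$

so halving the grid spacing quadruples the computation time, whereas doubling the number of energy states only doubles it.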

A major difference between Collidoscope and previous TM implementations is the type of sampling used for trajectory parameters. Instead of using Monte-Carlo sampling, Collidoscope samples all trajectory parameters uniformly and with sufficient density to obtain CCSs close to the dense-sampling limit. Computation times vary from a few seconds for small ions to a few days for ~MDa-sized complexes, such as the GroEL 14-mer. The above results indicate that Collidoscope computes TM CCSHe in close agreement with IMoS’s TM implementation.

Because it accounts for long-range polarization effects, multiple scattering, and collision gas temperature, Collidoscope inherently models the physics of the scattering process at a greater level of detail than the PA/PA* or EHSS methods. Computing TM results for large ions is useful for validating empirical “geometry corrections” of values obtained from computationally inexpensive, but less physically detailed, methods such as PA and EHSS. Notably, the EHSS CCSHe of GroEL computed with MOBCAL is very close to the TM CCSHe for GroEL73+ computed with Collidoscope, though both values are considerably higher than previously reported experimental results for this ion in N2 buffer gas [6, 29, 53]. The MOBCAL PA CCSHe for GroEL is higher than the experimental CCSN2, though for geometrical reasons the PA CCS generally provides a lower bound for the experimental low-field CCS of a dense, native-like ion. For each ion studied here, CCSN2 is strictly greater than CCSHe, due to both the larger size of the N2 molecule and its greater polarizability. These results all strongly suggest that the experimental GroEL CCSs previously reported pertain to conformations that are significantly more collapsed than the X-ray crystal structure. Therefore, calibration of IM-MS data using published experimental GroEL CCSHe or CCSN2 should be undertaken with careful consideration of the structures adopted under the experimental conditions used. Collidoscope can thus be a valuable tool for evaluating and developing calibration protocols for IM-MS data and for studying the conformational space of large ions.

Based on the comparison between Collidoscope CCSs and experimental values shown in Figure 5, Collidoscope predicts CCSs for protein geometries derived from X-ray crystallography data that are consistently between 3% and 15% (respectively, 1% and 18%) higher than experimental IM-MS values for native-like ions using He (respectively, N2) collision gas. This agrees with previous observations that many native-like biomolecular ions undergo some compaction during the electrospray ionization process due to removal of solvent and subsequent self-solvation [3]. A scaling factor of 0.91 for Collidoscope CCSHe and CCSN2 obtained with default parameters using X-ray crystallography-derived geometries of protein ions is recommended if the user desires to empirically “correct” for this compaction effect for native-like ions with masses between ~1 kDa and ~1 MDa. Based on the results reported here, after this correction, the method precision for the TM using both IMoS and Collidoscope is about 10% for protein ions.

For identical input structures, CCSHe computed using Collidoscope agree very closely with those computed using either the IMoS TM or MOBCAL EHSS (Figure 3). Although N2 buffer gas is not implemented in MOBCAL without modification, CCSN2 are in close agreement between IMoS and Collidoscope. In addition, ratios between the CCSN2 and CCSHe computed using Collidoscope agree very well with experimental ratios for these values (Figure 6). Both sets of ratios converge toward an asymptotic value (~1.15) similar to those previously calculated for organic macro-ions using a variety of computational methods [9]. Together, these results indicate that Collidoscope’s implementation of the TM for both He and N2 gas typically provides accurate estimates of CCSs for the input structures used. Differences between Collidoscope CCSs and experimental values (see Figure 5) for large protein ions are therefore attributed predominantly to discrepancies in the structures they represent (i.e., the experimental gas-phase structures studied are likely different from the crystal structures used here). A detailed analysis of possible structures adopted in these experiments is beyond the scope of this paper, but the ability to greatly reduce computational time for TM CCSs with Collidoscope (via CPU parallelism and other optimizations) will facilitate future work investigating the gas-phase conformational landscape for large ions.

Figure 6

(a) Plot of the ratio of CCSN2 and CCSHe versus ion mass (experimental values, blue; Collidoscope values, red). (b) Comparison of Collidoscope and experimental ratios (diagonal indicates perfect agreement)

Although we recommend using the extensively optimized default parameters presented here, customization of Collidoscope to suit the needs of the user is possible and relatively simple because of its object-oriented design. Lennard-Jones parameters, collision gas properties, and the trajectory integration algorithm are coded separately and may be modified as needed.
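To illustrate the kind of modular layout described above, the following Python sketch keeps collision gas properties and Lennard-Jones parameters in separate objects, so that one can be changed without touching the other. It is not Collidoscope's source code or interface: the class names, the `lj_potential` helper, and the Lennard-Jones numbers are hypothetical, and the gas polarizabilities are approximate literature values rather than the program's defaults. Only the 12-6 Lennard-Jones term is shown; the ion-induced dipole contribution is omitted for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CollisionGas:
    """Properties of the buffer gas (hypothetical container)."""
    name: str
    mass_da: float            # mass in Da
    polarizability_A3: float  # approximate polarizability in cubic angstroms


@dataclass(frozen=True)
class LJParams:
    """12-6 Lennard-Jones parameters for one atom type (hypothetical container)."""
    epsilon: float  # well depth, arbitrary energy unit
    sigma: float    # distance at which the potential crosses zero, angstroms


def lj_potential(r, p):
    """Standard 12-6 Lennard-Jones energy at separation r."""
    if r <= 0:
        raise ValueError("separation must be positive")
    x = (p.sigma / r) ** 6
    return 4.0 * p.epsilon * (x * x - x)


# Gas definitions live apart from the atom parameters.
HELIUM = CollisionGas("He", 4.0026, 0.205)
NITROGEN = CollisionGas("N2", 28.0134, 1.74)

# Placeholder atom parameters, NOT optimized values from any program.
DEFAULT_LJ = {
    "C": LJParams(epsilon=1.0, sigma=3.0),
    "H": LJParams(epsilon=0.5, sigma=2.5),
}

# Customizing one atom type leaves the defaults and the gas untouched.
custom_lj = dict(DEFAULT_LJ)
custom_lj["C"] = replace(DEFAULT_LJ["C"], sigma=3.2)
```

Using immutable dataclasses here is a design choice of the sketch: a modified parameter set is always a new object, so a calculation run with defaults cannot be silently altered by a later customization.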

Conclusions

Collidoscope is a computationally efficient tool that uses the Trajectory Method to calculate CCSs of ions with a wide range of masses, and it produces results in close agreement with measured low-field ion mobility values for both He and N2 collision gas. Computation time is significantly decreased relative to MOBCAL TM calculations due to parallelized computing and optimized sampling of trajectory parameters. A GPU-parallelized version of Collidoscope is currently under development and will further reduce computational time. Because TM CCSs for megadalton-sized ions can be calculated using Collidoscope in a few days, the program makes detailed IM-MS analysis of the conformational space of large ions tractable at a high level of sophistication. CCSs computed with Collidoscope can also be used in combination with modeled structures to inform calibrations of IM-MS data for very large ions, such as GroEL, for which conformations are not precisely known or can vary with instrumental conditions.

In the future, a more rigorous treatment of N2 will be implemented in Collidoscope to account for its permanent quadrupole moment and full dipolar polarizability tensor. Higher-order Runge-Kutta integration methods will also be made available as an option to further reduce trajectory integration error. Finally, other buffer gases, including argon, will be implemented. Collidoscope is available for public use upon request from the authors.