A Priori Intrinsic PTM Size Parameters for Predicting the Ion Mobilities of Modified Peptides
The rising profile of ion mobility spectrometry (IMS) in proteomics has driven efforts to predict peptide cross-sections. In the simplest approach, these are derived by adding the contributions of all amino acid residues and post-translational modifications (PTMs), defined by their intrinsic size parameters (ISPs). We show that the ISPs for PTMs can be calculated from the properties of constituent atoms, and introduce “impact scores” that govern the shift of cross-sections from the central mass-dependent trend for unmodified peptides. The ISPs and scores tabulated for 100 of the more common PTMs enable predicting the domains for modified peptides in the IMS/MS space, which would guide subproteome investigations.
Keywords: Ion mobility spectrometry; Peptides; Proteomics; Post-translational modifications
Ion mobility spectrometry, which sorts ions by their mobility in gases, is a growing alternative or complement to condensed-phase separations in conjunction with mass spectrometry. Broad commercialization of IMS/MS/(MS) platforms over the last decade has expanded the technology to new bioanalytical applications [1, 2, 3]. In one, IMS can increase the number of peptide and protein identifications in bottom-up proteomics by fractionating complex proteolytic digests on the precursor or fragment levels to reveal low-abundance peptides. Employing the measured mobility as a sequence descriptor can also improve the confidence of identifications. The proteomics frontier today is elucidating the complement of PTMs that critically affect protein function and activity [7, 8, 9]. However, detecting and quantifying modified peptides has been a challenge, partly because of their usually low abundance compared with unmodified analogs. Hence, various affinity methods have been engineered to enrich subproteomes comprising specific PTMs, particularly phosphorylation (p) [8, 9].
The key question is whether the IMS/MS domains for unmodified and some modified peptides differ, enabling similar enrichment in rapid gas-phase separations. The mobility measured in linear IMS [including the drift-tube (DT), traveling-wave (TW), DMA, trap, and transverse-modulation systems] is inversely proportional to the orientationally averaged cross-section Ω [1, 11]. For 1+ protonated ions (generated by matrix-assisted laser desorption ionization), monophosphorylated peptides statistically have Ω smaller by ~3% than the benchmark for unmodified species of the same mass (m). However, the domains for both groups distribute around the central Ω/m trends and overlap by about half [12, 13]. For polyphosphorylated peptides, the domain lies below the benchmark by >6% with hardly any overlap. Altogether, the precursor set probed for phosphorylation by MS/MS can be reduced (and analyses accelerated) by two to three times, and more for selected proteins. The pattern for 2+ peptides obtained from electrospray ionization (ESI) is close: regardless of the residue involved (S, T, or Y), the domain (in DT or TW IMS) is ~5% below the benchmark with ~50% overlap [14, 15]. The greatest shift (−10.5%) was found upon five phosphorylations. The domain for singly carboxyamidomethylated (cam) 2+ peptides lies near the benchmark with virtually complete overlap, whereas that for palmitoylated (pal) 2+ peptides lies above it with <50% overlap. No other PTMs have been explored to date.
The mean shift of the peptide Ω (∆Ω) upon addition of a given building block (residue or PTM) of mass ∆m was quantified via the intrinsic size parameter (ISP), defined as ∆Ω/∆m with normalization bringing the average for all amino acid residues (aa) to unity [15, 16, 17, 18, 19]. PTMs then have ISP < 1 for shifts of Ω below the unmodified peptide trend, ~1 for on-trend shifts, and > 1 for above-trend ones. Regression fitting of measured Ω for libraries of ~10^2 peptides per PTM yields ISPs of 0.64 ± 0.05 for p, 0.92 ± 0.04 for cam, and 1.26 ± 0.04 for pal. That the ISPs for p and pal are outside the ~0.9–1.2 range for aa [16, 17, 18, 19] shows the importance of specific PTMs for peptide mobilities, and is promising for IMS enrichment of modified proteomes and the use of ISPs for faster, more robust identification of all peptides.
The ISPs for aa were also computed from the masses and van der Waals radii of constituent atoms. The outcome matched the measured Ω well, in fact marginally better than with experimental ISPs (because of the errors inevitable when extracting them from finite data). Although first-principles molecular structure optimizations followed by mobility calculations (by trajectory propagation, scattering on electron density isosurfaces [22, 23], and ultimately their hybrid) are fundamentally superior to the ISP formalism, they are too costly for practical proteomics. Here we validate a priori ISP calculations for PTMs and tabulate them for most PTMs to broadly guide IMS/MS studies.
The impact score ΩIMP can be positive or negative, depending on the signs of its factors, (ISP − 1) and ∆m. Note that (ISP − 1) < 0 with ∆m > 0, or (ISP − 1) > 0 with ∆m < 0, gives a negative ΩIMP (e.g., for cit and the S=S bridge), whereas (ISP − 1) < 0 with ∆m < 0 yields a positive ΩIMP (e.g., for am). The ΩIMP value for multiple (same or different) PTMs is the sum of their individual ΩIMP values. The concept of impact score is much more crucial for PTMs, where ∆m values range over four orders of magnitude (from 1 Da for cit to >10 kDa for SUMOs), than for aa, which vary in mass only about threefold (from 57 Da for G to 186 Da for W).
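The sign rules and magnitudes discussed here can be checked with a short sketch. It assumes the impact score takes the product form (ISP − 1) × ∆m, which is inferred from the factors named in the text and from the values it quotes rather than taken from a stated equation. The function names, the monoisotopic mass shifts, and the use of the 20 and 60 thresholds from the table caption as hard cut-offs are illustrative assumptions.

```python
def impact_score(isp: float, delta_m: float) -> float:
    """Assumed product form of the impact score: (ISP - 1) * delta_m, delta_m in Da."""
    return (isp - 1.0) * delta_m


def expected_separation(score: float) -> str:
    """Classify by |score| using the 20 and 60 thresholds quoted in the table caption."""
    magnitude = abs(score)
    if magnitude > 60:
        return "strong"
    if magnitude >= 20:
        return "partial"
    return "minor"


# ISP values are those quoted in the text; the mass shifts (Da) are standard
# monoisotopic values supplied here as assumptions for illustration.
EXAMPLES = {
    "iodination": (0.08, 125.897),
    "phosphorylation": (0.64, 79.966),
    "methylation": (1.4, 14.016),
    "amidation": (-4.7, -0.984),
}

for name, (isp, delta_m) in EXAMPLES.items():
    score = impact_score(isp, delta_m)
    print(f"{name:16s} {score:8.1f}  {expected_separation(score)}")
```

Under these assumptions, amidation (both factors negative) comes out positive, while iodination and phosphorylation (low ISP, positive mass shift) come out negative, in line with the sign rules above.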
Table 1. The ISPs and cross-section impact scores for common PTMs (sources denoted in superscripts), calculated with r_i sets 1 and 2 and listed in order of increasing ISP. The PTMs with mean absolute ΩIMP over 60 (expected strong separation of IMS/MS domains) and in the 20–60 range (partial separation) are highlighted in green and yellow, respectively.
Three PTMs have negative ISPs of –(4.6–4.8): cit, am, and deamidation (the reversal of amidation). Despite these extreme ISPs, the minimum absolute ∆m = 1 Da means │ΩIMP│ = 6, which should cause no significant deviation from the benchmark. At the other end is the ISP for the S=S bridge, but │ΩIMP│ = 7 is similarly minor because of the small ∆m. The ISPs for other PTMs vary from 0.08 for iodine to 1.4 for methyl (Me). As anticipated, ISPs << 1 are found for PTMs rich in heavy atoms: first of all halogens, then O, S, and P. Iodination and bromination combine very low ISPs (~0.1) with substantial mass, leading to ΩIMP of −116 (I) and −70 (Br), which ought to make for (nearly) completely demarcated domains below the benchmarks. The next large group of PTMs (including sulfation, nitration, trifluoroacetylation, (pyro)phosphorylation, oxidation, and hydroxylation) have essentially identical ISPs of 0.5–0.6. Their ΩIMP accordingly scale with mass, amounting to –(6–15) for oxidation and hydroxylation (likely no significant domain delineation), –(20–46) for nitration, sulfation, phosphorylation, and trifluoroacetylation (partial delineation), and –(80–90) for pyrophosphorylation (full delineation). IMS/MS would then not broadly distinguish the nominally isobaric sulfation and nitration with near-equal ISPs, or pyrophosphorylation from double phosphorylation (an objective of some analyses). Higher but still low ISPs are found for, e.g., formylation (~0.7) and malonylation and cysteinylation (~0.8). Their ΩIMP values of –(8–26) are not promising for major domain delineation.
Again, the PTM mass often matters more than the magnitude of (ISP − 1). For example, the coincident Ω/m graphs for 1+ oligonucleotides and carbohydrates lie below that for unmodified peptides [28, 29]. Indeed, our calculated ISPs are 0.81–0.83 for ADP-ribose (a sugar derivative), and 0.85–0.89 for FAD (flavin adenine dinucleotide) and FMN (flavin mononucleotide). Despite these only moderately low ISPs, the large masses of ADP-ribose and FAD lead to ΩIMP of circa –(100–120), which should differentiate the peptides from ribosylated proteins and flavoproteins. The ΩIMP of FMN is only –(50–60), but should still cause notable domain separation. Succinylation and glutathionylation also have ISP ~ 0.87–0.89, which leads to a low │ΩIMP│ < 10 for the former but a substantial │ΩIMP│ ~30–40 for the latter. Most glycans [pentoses, hexoses, deoxyhexoses, sialic acid (SA), and hexosamines] have ISPs slightly below 1 (0.94–0.99) with │ΩIMP│ < 10 (except ~15–18 for SA), suggesting no real shift from the benchmark, as seen for cam, and thus no domain separation for glycopeptides. These observations also indicate that the ISP approach can predict the ordering of IMS/MS trend lines for different biomolecular classes such as sugars, peptides, and nucleotides.
Some PTMs with ISP close to 1 are acetylation (Ac, 0.93), biotinylation (0.96–0.97), and crotonylation (1.00–1.02): all with ΩIMP ~ −10–0. Protein PTMs also have ISP ~ 1, namely 1.01–1.02 (SUMO) and 1.04 (ubiquitin, ub): many different aa in typical proteins average to ~1.0. With the mass of ~9–12 kDa, the theoretical ΩIMP values of ~140–190 (SUMO) and ~310 (ub) are large but imprecise because of outsize impacts of small ISP variations. Although we include these and several other very large PTMs for completeness, we recognize that their size would place the adducts outside of the standard peptide mass range, where the baseline mass/mobility correlation loosens with the growing role of protein conformation.
A prominent PTM group is lipids. The Ω/m line for 1+ lipids lies above that for unmodified peptides [28, 29], and our ISPs for lipid PTMs (e.g., farnesylation, geranylation, myristoylation, pal, stearoylation, cholesterylation) are ~1.3. The associated ΩIMP values are, therefore, proportional to mass, and those for myristoylation and farnesylation (~60), geranylgeranylation and stearoylation (~80–90), and cholesterylation (~110–120) are expected, like pal, to produce strongly or fully delineated domains above the peptide benchmark. The ΩIMP of ~40 for geranylation could still be enough for substantial separation. The alkyl PTMs (mostly Me in methylation or methyl esterification, but also ethylation and butylation) have an even higher ISP of ~1.4. The ΩIMP of just 5–6 for the light Me would not afford material trend separation with monomethylation, but the small size of Me permits common di- and trimethylation of the same residue, with ΩIMP of ~10–11 and ~15–17. A serious task in proteomics is distinguishing (Me)3 from Ac, which is only 36 mDa lighter. With the swing of ~20 between the ΩIMP of (Me)3 and Ac, the domain separation between acetylated and trimethylated proteomes should be noticeable, although perhaps insufficient with single substitution, but substantial to strong with multiple substitutions.
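The (Me)3 versus Ac comparison can be reproduced as a small worked calculation. This is a sketch under assumptions: the impact score is taken as (ISP − 1) × ∆m, a form inferred from the values quoted in the text, the ISPs are the rounded figures given there (1.4 for Me, 0.93 for Ac), and the monoisotopic mass shifts are standard values supplied here for illustration.

```python
ME_MASS = 14.01565   # CH2 added per methylation, Da (assumed monoisotopic value)
AC_MASS = 42.01057   # C2H2O added per acetylation, Da (assumed monoisotopic value)
ISP_ME = 1.4         # rounded ISP for methyl quoted in the text
ISP_AC = 0.93        # ISP for acetyl quoted in the text


def impact_score(isp: float, delta_m: float) -> float:
    """Assumed product form of the impact score: (ISP - 1) * delta_m."""
    return (isp - 1.0) * delta_m


# Mass gap between trimethylation and acetylation, in mDa.
mass_gap_mda = (3 * ME_MASS - AC_MASS) * 1000.0

# Impact scores: three methyl groups add up; a single acetyl is slightly negative.
score_me3 = 3 * impact_score(ISP_ME, ME_MASS)
score_ac = impact_score(ISP_AC, AC_MASS)
swing = score_me3 - score_ac

print(f"mass gap: {mass_gap_mda:.1f} mDa")
print(f"impact score (Me)3: {score_me3:.1f}, Ac: {score_ac:.1f}, swing: {swing:.1f}")
```

The point of the sketch is that two modifications separated by tens of mDa in mass can still differ by roughly 20 units of impact score, because their (ISP − 1) factors have opposite signs.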
Many peptides are multiply modified with the same or different PTMs. Addition of identical PTMs scales ΩIMP by their number, which may substantially augment the domain differentiation, as observed for multiple phosphorylation [13, 14], which is ubiquitous in biology (e.g., for τ proteins relevant to Alzheimer’s) [32, 33]. The same should occur with multiple nitrations, sulfations, oxidations, methylations, etc. For example, ~0.3% of the H3 histone tails (characterized in middle-down proteomics) feature eight or nine Me. Their total computed ΩIMP is ~40–50, potentially inducing substantial domain separation that may help detect those rare proteoforms. Superposition of different PTMs may increase or decrease the effect, depending on the ISPs. For example, histones often feature Ac and/or p (ΩIMP < 0) and Me (ΩIMP > 0), which may cancel. For instance, the same H3 tails have forms including Ac2Me with ΩIMP ~ 0. The same may happen within a PTM: e.g., glypiation, which anchors proteins to cell membranes, consists of a glycan core (ISP < 1) and a lipid tail (ISP > 1), for an overall ISP of 1.04.
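The additivity described here lends itself to a brief sketch. It assumes the per-PTM impact score is (ISP − 1) × ∆m (inferred from values quoted in the text, not from a stated equation) and that scores of multiple PTMs simply add, as the text says. The lookup table, function names, and monoisotopic mass shifts are hypothetical choices for illustration; the ISPs are the rounded values given in the text.

```python
# PTM label -> (ISP quoted in the text, assumed monoisotopic mass shift in Da)
PTM_TABLE = {
    "Me": (1.4, 14.016),
    "Ac": (0.93, 42.011),
    "p": (0.64, 79.966),
}


def impact_score(isp: float, delta_m: float) -> float:
    """Assumed product form of the impact score: (ISP - 1) * delta_m."""
    return (isp - 1.0) * delta_m


def total_impact(ptm_counts: dict) -> float:
    """Sum the per-PTM scores, weighting each PTM by its count on the peptide."""
    return sum(
        count * impact_score(*PTM_TABLE[name])
        for name, count in ptm_counts.items()
    )


eight_me = total_impact({"Me": 8})           # heavily methylated H3 tail
nine_me = total_impact({"Me": 9})
ac2_me = total_impact({"Ac": 2, "Me": 1})    # opposing contributions
triple_p = total_impact({"p": 3})            # multiple phosphorylation

for label, value in [("8 Me", eight_me), ("9 Me", nine_me),
                     ("Ac2Me", ac2_me), ("3 p", triple_p)]:
    print(f"{label:6s} {value:7.1f}")
```

With these rounded inputs, the methyl-only cases accumulate to a large positive total, while the Ac2Me combination nearly cancels, which is the qualitative behavior described for the H3 tails.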
Besides raising the utility of IMS in proteomics, ISPs convey information about the global influence of an aa or PTM on the tertiary peptide structure. That is, in experiment, entities compacting the peptide by attracting and tightly packing the surrounding groups (through intramolecular solvation) would have low ISPs, whereas entities repelling those groups and thus loosening the 3-D structure would have high ISPs. The issue is that one must compare the measured ISPs to the theoretical values (Table 1) rather than to the average ISP ~ 1.0; otherwise a corollary of density would be mistaken for structural effects. Specifically, polar and aromatic residues that have small ISPs were believed to contract peptides by charge–dipole and dipole–dipole interactions (with other residues or the backbone) or aromatic ring stacking [15, 17, 18, 19], while residues with long aliphatic side chains were rationalized to possess freer conformations that expand the peptide geometries. A close match between the measured and calculated ISPs for both categories has proven those effects, while plausible and potentially pertinent for individual species, to be inoperative on average. Similarly, the small ISPs of cam and especially p were argued to be a manifestation of polar groups attracted to the charge sites or of intramolecular interactions that lead to structural compaction, and the large ISP for pal was thought to reflect its length and hydrophobicity. The present findings clarify that these PTMs do not affect the average tightness of peptide folding.
Although ISPs provide more accurate peptide mobilities than the mass alone, there are major limitations. The IMS resolution of sequence isomers and PTM localization isomers [15, 37] means that the positions of both aa and PTMs (not captured by current ISPs) matter. The influence of sequence could possibly be emulated to some extent by sequence-specific ISPs for pairs of adjacent residues. However, this escalates the number of ISPs by at least an order of magnitude, and eliciting their statistically significant values requires a much larger and more diverse experimental data set. Approaches considering the environment can also be devised for PTMs, although no direct equivalent exists as the neighboring PTMs on the backbone are often too far apart to interact. One enhancement may be to treat the same PTMs on different amino acids separately, although phosphorylations of S, T, and Y appear to have the same effect, as stated. Eventually, ISPs would be supplanted by sophisticated artificial neural networks, the machine learning algorithms that integrate numerous (frequently non-obvious) structural descriptors to predict chemical properties. Such models that incorporate the peptide sequence and size have succeeded for chromatographic retention [39, 40] and should work here, but require massive training sets.
Extending a priori calculations of intrinsic size parameters (ISPs) from residues to PTMs permits predicting the mobilities of modified peptides. The resulting ISPs match those for the PTMs measured so far (phosphorylation, carboxyamidomethylation, and palmitoylation). Along with the agreement for amino acid residues, this validates the approach and shows that the ISPs for both amino acids and PTMs reflect primarily their density rather than cooperative intramolecular interactions. The agreement of ISPs for PTMs formed from nucleotides, sugars, peptides, and lipids with the established arrangement of IMS/MS domains for these biomolecules shows the utility of the formalism for other chemical classes, supporting the idea that their mass/mobility correlations are also determined by density. The deviation from the central mass/mobility trend for peptides also scales with the PTM mass. We have aggregated the effects of ISP and mass into impact scores, calibrated their effect on IMS domain delineation with available experimental data, and tabulated the results for 100 common PTMs. About half should substantially shift the IMS/MS domains from that for unmodified peptides, and some others may do so with multiple modification. These estimates ought to help plan IMS/MS analyses of modified proteomes and improve the quality and speed of identifications. They should be utilized only statistically, though, as for any individual peptide the effect of specific geometry may outweigh that of the density of constituents.
The authors thank Professor David E. Clemmer (Indiana) and Dr. Pavel V. Shliaha (SDU, Odense) for insightful discussions of the use of ISPs to predict the IMS separation properties of peptides. This research was funded by NIH K-INBRE (P20 GM103418), NSF First (EPS-0903806), and NSF CAREER (CHE-1552640). A.S. also holds a faculty appointment at the Moscow Engineering Physics Institute (Russia).
- 1. Eiceman, G.A., Karpas, Z., Hill, H.H.: Ion mobility spectrometry. CRC Press, Boca Raton (2013)
- 5. Helm, D., Vissers, J.P.C., Hughes, C.J., Hahne, H., Ruprecht, B., Pachl, F., Grzyb, A., Richardson, K., Wildgoose, J., Maier, S.K., Marx, H., Wilhelm, M., Becher, I., Lemeer, S., Bantscheff, M., Langridge, J.I., Kuster, B.: Ion mobility tandem mass spectrometry enhances performance of bottom-up proteomics. Mol. Cell. Proteomics 13, 3709–3715 (2014)
- 29. May, J.C., Goodwin, C.R., Lareau, N.M., Leaptrot, K.L., Morris, C.B., Kurulugama, R.T., Mordehai, A., Klein, C., Barry, W., Darland, E., Overney, G., Imatani, K., Stafford, G.C., Fieldsted, J.C., McLean, J.A.: Conformational ordering of biomolecules in the gas phase: nitrogen collision cross sections measured on a prototype high resolution drift tube ion mobility–mass spectrometer. Anal. Chem. 86, 2107–2116 (2014)
- 30. Haag, F., Buck, F.: Identification and analysis of ADP-ribosylated proteins. Curr. Top. Microbiol. 384, 33–50 (2015)
- 35. Fraga, M.F., Ballestar, E., Villar-Garea, A., Boix-Chornet, M., Espada, J., Schotta, G., Bonaldi, T., Haydon, C., Ropero, S., Petrie, K., Iyer, N.G., Perez-Rosado, A., Calvo, E., Lopez, J.A., Cano, A., Calasanz, M.J., Colomer, D., Piris, M.A., Ahn, N., Imhof, A., Caldas, C., Jenuwein, T., Esteller, M.: Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4 is a common hallmark of human cancer. Nat. Genet. 37, 391–400 (2005)
- 40. Petritis, K., Kangas, L.J., Yan, B., Monroe, M.E., Strittmatter, E.F., Qian, W.J., Adkins, J.N., Moore, R.J., Xu, Y., Lipton, M.S., Camp, D.G., Smith, R.D.: Improved peptide elution time prediction for reversed-phase liquid chromatography-MS by incorporating peptide sequence information. Anal. Chem. 78, 5026–5039 (2006)