1 Introduction

High-throughput proteomics experiments generate large sets of tandem mass (MS/MS) spectra and require the use of a sequence search algorithm [1, 2], or a reference library [3–5], to match peptide sequences to the fragmentation patterns. Upon fragmentation in low energy collision-induced dissociation (CID), peptide ions typically form a, b, and y ions, and ions from losses of neutral molecules such as H2O and NH3. Other types of fragment ions have been reported, which originate from unusual neutral losses [6–8] or peptide sequence scrambling [9–11]. While several groups have studied peptide fragmentation based on trends from large sets of spectra [12–15], studies of single peptides [16–21], or computer modeling of fragmentation [22, 23], reliable prediction of major product ions and their intensities during CID is not yet possible. However, the accuracy of peptide identifications from tandem mass (MS/MS) spectra using search algorithms or reference libraries relies on the ability to provide accurate models or representative spectra for peptide fragmentation patterns [24–26].

The National Institute of Standards and Technology (NIST) peptide library of MS/MS spectra currently contains over 900,000 ion trap (IT) MS/MS spectra generated from the tryptic digests of samples from many organisms including humans, yeast, mice, and fruit flies [27]. Before being included in the library, spectra undergo quality control to ensure that the correct peptide sequence is assigned to a high quality spectrum [28]. A major factor used in assessing the quality of a spectrum is the fraction of intensity for the major product ions arising from known fragmentation pathways [29]. While evaluating spectra, many are found to contain one or more significant peaks that are not assigned to the customary y, b, a, and neutral loss product ions. During examination of unassigned peaks, it was noticed that many peaks had m/z values corresponding to yn-1 + 10 and yn-1 + 11, formed from peptides with n amino acid residues. Such processes have been observed with a few synthetic peptides and ascribed to yn-1 + CO-H2O and yn-1 + CO-NH3, respectively [30]. In this study, we expand the observations to thousands of peptide ion MS/MS spectra from the NIST library and use this vast statistical information to characterize this type of fragmentation in an effort to develop rules that can be used to predict the formation of yn-1 + 10 and yn-1 + 11 ions.
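The nominal +10 and +11 shifts follow from the elemental arithmetic of adding CO and losing H2O or NH3. The short sketch below works this out from standard monoisotopic masses; the helper names (`mass`, `mz_shift`) are illustrative and are not part of any software used in this study.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard values).
MONO = {"H": 1.00782503, "C": 12.0, "N": 14.0030740, "O": 15.9949146}


def mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


CO = mass({"C": 1, "O": 1})
H2O = mass({"H": 2, "O": 1})
NH3 = mass({"N": 1, "H": 3})

shift_y10 = CO - H2O  # mass added to the y ion for the "+10" product
shift_y11 = CO - NH3  # mass added to the y ion for the "+11" product


def mz_shift(mass_shift, charge):
    """Observed m/z displacement of a shifted y ion carrying `charge` protons."""
    return mass_shift / charge


print(round(shift_y10, 4), round(shift_y11, 4))
print(round(mz_shift(shift_y10, 2), 4))
```

The mass shifts come out at about 9.984 u and 10.968 u, so a doubly charged [yn-1 + 10]2+ ion appears roughly 4.99 m/z units above the corresponding doubly charged y ion.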

2 Methods

2.1 Sample Preparation and LC-MS/MS Analysis

Tryptic tripeptides were synthesized in mixtures as described previously [21] and analyzed by high performance liquid chromatography with electrospray ionization tandem mass spectrometry using an ion trap (LTQ; ThermoFisher Scientific, Waltham, MA, USA), or a quadrupole time-of-flight (QTOF, model 6530; Agilent, Santa Clara, CA, USA) instrument. Peptides with longer sequences were synthesized individually and the peptide MRFA was purchased from Sigma Aldrich (St. Louis, MO, USA). Tandem mass spectra for selected peptide ions were acquired at 20 different collision voltages in a triple quadrupole mass spectrometer (QQQ, Micromass Quattro Micro; Waters Corp., Milford, MA, USA), as described previously [21], and peak intensities were plotted as a function of collision voltage. Fragmentation was also examined by MS3 experiments in the IT mass spectrometer.

2.2 Database Analysis

The NIST peptide library of MS/MS spectra from IT instruments was used to investigate the distribution and characteristics of peptide spectra with y + 10 and y + 11 fragment ions. The libraries are available for download without charge at http://peptide.nist.gov/. Programs written in-house were used to perform the statistical survey of the MS2 spectra having major y + 10 or y + 11 ions, in peptides with n amino acids, at positions n-1, n-2, or n-3. Peaks were assigned as y + 10 or y + 11 only if the observed m/z was within ±0.4 mass units of the expected value and no other assignment could be made. Peaks that may have been 13C isotopic peaks of assigned fragments were excluded. Spectra containing y + 10 or y + 11 peaks were then analyzed to determine the intensity of the ion relative to the largest peak in the spectrum and its dependence on adjacent amino acids. Analyses also examined y + 6, y + 8, and y + 12 ions to assess the magnitude of random peaks in the vicinity of the peaks of interest.
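The assignment rule described above can be sketched as follows. This is not the in-house program; it is a minimal illustration that assumes a peak list of (m/z, intensity) pairs, a known m/z for the unshifted y ion, and a list of already assigned fragments given as (m/z, charge). The function name, the input layout, and the way 13C peaks are recognized (one 13C spacing above an assigned fragment, within the same tolerance) are all assumptions.

```python
C13_SPACING = 1.00335  # mass difference between 13C and 12C (u)
SHIFTS = {"y+10": 9.9843, "y+11": 10.9684}  # CO - H2O and CO - NH3 (u)


def assign_shifted_y(peaks, y_mz, charge, assigned, tol=0.4):
    """Return {label: intensity relative to the largest peak} for y+10 / y+11.

    peaks    : list of (mz, intensity) for the whole spectrum
    y_mz     : m/z of the unshifted y ion at the given charge
    charge   : charge of the product ion being looked for
    assigned : list of (mz, charge) for fragments that already have an assignment
    tol      : matching tolerance in m/z units (0.4, as stated in the text)
    """
    base = max(intensity for _, intensity in peaks)
    result = {}
    for label, shift in SHIFTS.items():
        expected = y_mz + shift / charge
        best = None
        for mz, intensity in peaks:
            if abs(mz - expected) > tol:
                continue
            # Skip peaks that another assignment, or its first 13C isotope
            # peak, could explain (assumed exclusion rule).
            explained = any(
                abs(mz - a_mz) <= tol
                or abs(mz - (a_mz + C13_SPACING / a_z)) <= tol
                for a_mz, a_z in assigned
            )
            if explained:
                continue
            if best is None or intensity > best:
                best = intensity
        if best is not None:
            result[label] = best / base
    return result
```

For singly charged product ions the two ±0.4 windows do not overlap. For doubly charged products they do, so a single low-resolution peak could match both labels and would need additional handling.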

3 Results

3.1 Observations on y + 10 and y + 11 Fragment Ions in the NIST Peptide Library

Following analysis of selected peptide MS/MS spectra in the NIST library, many peaks were identified that could be assigned as yn-1 + 10 or, less commonly, as yn-1 + 11 ions, formed from peptide ions with n amino acid residues. Spectra having peaks at yn-1 + 8 were also identified to serve as markers for random (noise) peaks. To better distinguish clear fragmentation products from spurious peaks, the fraction of peptide spectra having these peaks with intensity > 5% (of the largest peak in the spectrum) was determined. These studies showed that yn-1 + 10 and yn-1 + 11 ions were generated at levels clearly above the random yn-1 + 8 peaks. Several peptide MS/MS spectra contained peaks ascribable to yn-2 + 10 and yn-3 + 10 ions, but the number of such spectra was too small to provide an assessment of their contribution.

Examples of MS/MS spectra with yn-1 + 10 and yn-1 + 11 ions are shown in Figure 1 (additional spectra are included in the Supplementary Materials, Figures S1–S11). Although Figure 1 shows examples for doubly charged peptide precursor ions, the yn-1 + 10 and yn-1 + 11 peaks were also found to originate from peptide precursor ions in other charge states (see Figures S1–S11 for example spectra). The yn-1 + 10 and yn-1 + 11 ions are observed at the same charge state as that of the precursor ion or at one charge state lower. In the following discussion of peptide ions in the NIST library we present the results for charge states 2 and 3, which are the most abundant in the library (synthetic peptides discussed below were studied mainly in charge state +1). Most of the analyses in this work, however, were performed with the doubly charged peptide precursor ion forming singly charged yn-1 + 10 or yn-1 + 11 ions for the following reasons: (1) Doubly charged peptide ions represent the majority of the spectra in the NIST peptide library. (2) Singly charged [yn-1 + 10]+ ions (from doubly charged peptide ions) are at the high m/z range of the spectrum and are well separated from other product ions. (3) It is possible to clearly distinguish between the yn-1 + 10 and yn-1 + 11 singly charged ions in low resolution IT spectra.

Figure 1

MS/MS spectra of peptide ions showing product ion peaks corresponding to yn-1 + 10 and yn-1 + 11. The following doubly charged peptides are shown: (a) NYQLYK, (b) PHLVLDQLR, (c) PSFFQHR, and (d) HNLNPAR. The x-axis is m/z and the y-axis is relative intensity. Peaks labeled as p-17, p-18, p-34, or p-35 are losses of water and/or ammonia from the peptide precursor ion

3.2 Location of the Additional 10 Da

Experiments with synthetic peptides show that the yn-1 + 10 ions are formed during MS/MS and MS3. To determine the location of the addition in m/z, MS3 spectra were measured for the [y6 + 10]2+ ion from the triply charged albumin peptide LCVLHEK (Figure S12a) and the [y5 + 10]+ ions from the singly charged synthetic peptide PSFLYK (Figure S12b). These MS3 spectra show that all b and a ions present have an m/z increased by 10 Da but all y ions are lacking this additional 10 Da. This confirms the previous conclusion [30] that the N-terminal residue of the peptide is involved in the formation of the yn-1 + 10 ion.

3.3 Influence of Peptide Length

We find that the relative abundance of yn-1 + 10 ions decreases with increasing peptide length. This presumably is due to the increasing number of, and possibly more rapid, competing fragmentation processes associated with increasing length. Examples of this type of dependence are shown in Figure 2 for doubly charged peptide ions giving a [yn-1 + 10]+ ion with intensity > 5% of the most intense peak in the MS/MS spectrum. Since the extent of formation of intense yn-1 + 10 ions is also dependent on the peptide sequence (see next section), we present in Figure 2 results for specific cases of single amino acids (Figure 2a) at the N-terminal position (N, P, H) or the second position (S, T) and for pairs of residues at the N-terminus (Figure 2b). A similar trend was found for triply charged peptide ions (not shown), though the trend began for somewhat longer peptides. Therefore, in the following discussion on the influence of peptide sequence, we averaged the results for peptides within a range of length where most of the intense yn-1 + 10 ions are observed. For doubly charged peptide ions we used only those containing seven to 10 residues and for triply charged peptide ions we expanded the range to 15 residues.

Figure 2

Percent peptide ions with charge 2+ showing yn-1 + 10 ions with charge 1+, having intensity > 5%, versus peptide length. Peptides with six residues or less are not shown because the low numbers of such peptides in the database give low statistical significance. (a) For single amino acids: N, P, and H at position 1 (N-terminal) of the peptide, and S and T at position 2. (b) For amino acid pairs at the N-terminus

3.4 Influence of Peptide Sequence

Since the yn-1 + 10 ion intensities varied considerably between peptide ions, we examined the dependence of the frequency of observation and intensity of these ions on the peptide sequence. Figure 3a shows the fraction of doubly charged peptide ions containing seven to 10 residues exhibiting a [yn-1 + 10]+ product ion with peak intensity > 5% (of the highest peak in the MS/MS spectrum) as a function of the two N-terminal residues. The x-axis of these plots shows the N-terminal amino acid and the y-axis shows the adjacent amino acid. The value for each amino acid pair is proportional to the radius of the circle shown. It is clear from Figure 3a that peptides with N-terminal P and N, and to a lesser extent H, yield the largest [yn-1 + 10]+ ions. Among residues in the second position (y-axis), S and T are most frequently associated with intense [yn-1 + 10]+ ions. Other residues are less frequently associated with intense [yn-1 + 10]+ ions. Plots of the average intensities as a function of the first two N-terminal amino acid residues also exhibit the same trends as those shown in Figure 3. The numerical values used for Figure 3 are listed in a spreadsheet in the Supplementary Material. The spreadsheet also gives the average intensities of the yn-1 + 10 ions for all the pairs of terminal residues.
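The quantity plotted for each N-terminal residue pair can be illustrated with a short sketch. It assumes, hypothetically, that each spectrum has been reduced to a record of (peptide sequence, relative intensity of the [yn-1 + 10]+ peak, with 0 when the peak is absent) and that modified residues use the single-letter codes of the figure (B for oxidized methionine). The function name and record layout are illustrative only.

```python
from collections import defaultdict


def fraction_by_nterm_pair(records, threshold=0.05, min_len=7, max_len=10):
    """Fraction of peptides, per N-terminal residue pair, whose shifted y ion
    exceeds `threshold` (relative to the largest peak in the spectrum).

    records : iterable of (sequence, relative_intensity)
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [hits, total]
    for sequence, rel_intensity in records:
        if not (min_len <= len(sequence) <= max_len):
            continue  # keep only the length range used for averaging
        pair = sequence[:2]
        counts[pair][1] += 1
        if rel_intensity > threshold:
            counts[pair][0] += 1
    return {pair: hits / total for pair, (hits, total) in counts.items()}
```

The defaults reproduce the seven to 10 residue window and the 5% cutoff used for doubly charged peptide ions. For triply charged ions, `max_len=15` would correspond to the expanded range described above.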

Figure 3

Fraction of doubly charged peptide ions having (a) [yn-1 + 10]+, (b) [yn-1 + 11]+, or (c) [yn-1 + 10]2+ ions, and (d) triply charged peptide ions having [yn-1 + 10]2+ ions with intensities > 5% (of the highest peak in the spectrum) for N-terminal amino acid pairs. The x-axis shows the N-terminal amino acid and the y-axis is the adjacent amino acid. Oxidized methionine is represented as B, and C stands for carbamidomethylated cysteine. The fraction of the amino acid pairs is proportional to the radius of the circle shown

To assess the validity of the small circles in Figure 3a, we created similar plots for [yn-1 + 8]+ ions (figure not shown, data are in the Supplementary Material Spreadsheet). In Figure 3a, there are 129 points with values > 0, with an average value of 7.5 %, but for the [yn-1 + 8]+ ions there are only 27 points with values > 0 and their average value was only 0.8 %. Of all the yn-1 + 8 points, only 11 have values greater than 10 % of the corresponding values for the yn-1 + 10 points. This shows that the amount of noise in Figure 3a is small. Data were also collected for yn-1 + 6 and yn-1 + 12 peaks (not shown) to serve as additional background noise for comparison with the data in the other plots in Figure 3 and the comparison affirms the validity of those plots.
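A comparison of this kind can be sketched as below, assuming the per-pair fractions for the ion of interest and for the background ion are held in two dictionaries keyed by residue pair. The function name is hypothetical, and taking the averages over nonzero points only is an assumption, since the text does not state how the averages were formed.

```python
def compare_to_background(signal, background, ratio=0.10):
    """Summarize how a background map (e.g., y + 8) compares with a signal map
    (e.g., y + 10). Both maps are {residue_pair: fraction}.
    """
    def mean(values):
        return sum(values) / len(values) if values else 0.0

    signal_nonzero = [v for v in signal.values() if v > 0]
    background_nonzero = [v for v in background.values() if v > 0]
    # Background points larger than `ratio` times the matching signal point;
    # a missing signal point is treated as zero.
    exceeding = sum(
        1
        for pair, value in background.items()
        if value > 0 and value > ratio * signal.get(pair, 0.0)
    )
    return {
        "signal_points": len(signal_nonzero),
        "signal_mean": mean(signal_nonzero),
        "background_points": len(background_nonzero),
        "background_mean": mean(background_nonzero),
        "background_exceeding": exceeding,
    }
```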

The intensity of [yn-1 + 10]+ ions depends on the second residue as well. This is particularly so for doubly charged peptides with N-terminal P, where the highest values are predominantly associated with S and T in the second position (Figure 3a). N-terminal N generates significant [yn-1 + 10]+ ions with a number of different residues in the second position (most often with A, C, E, F, L, M, W, and Y) and, in contrast with P, it shows low intensities for S and T in the second position. N-terminal H generates significant [yn-1 + 10]+ ions with fewer amino acids in the second position (D, R, S, T) compared to N. When S or T residues are in the second position (Figure 3a), P is the N-terminal residue with the most intense [yn-1 + 10]+ ions, but F, I, and L also show high values. It should be pointed out with regard to N-terminal P that such peptides are less likely to be produced during tryptic digestion because of the specificity of trypsin. However, peptide ions with N-terminal P are frequently formed by in-source fragmentation and are expected to exhibit the same behavior as if produced by protonation of a peptide during ESI. We have demonstrated this similarity by synthesizing the peptide LLPHEFYAK and showing that the MS3 spectrum of its y9 ion, PHEFYAK, is identical to the MS/MS spectrum of the synthetic peptide PHEFYAK (data not shown).

Figure 3b shows the results for doubly charged peptide ions producing intense [yn-1 + 11]+ ions. The data for [yn-1 + 11]+ have been coarsely corrected for contribution from [yn-1 + 10]+ isotopic peaks by subtracting from the former peak the intensity of the latter. It is clear that N in the second position shows the largest values in this figure and indicates that the loss of NH3, necessary for formation of [yn-1 + 11]+ ions, takes place from this residue. Among the N-terminal residues facilitating this process, H appears to have the strongest effect. It is possible that this basic residue helps localize the second proton of the doubly charged peptide at the N-terminus and, thus, facilitates loss of NH3 from the adjacent asparagine. It is also possible that H at this location increases the likelihood that the yn-1 + 11 fragment ion is formed with a single rather than double charge (note that the figure is for singly charged product ions only). An example of a spectrum of a peptide with terminal HN showing a [yn-1 + 11]+ ion is in Figure 1d. While Figure 3b shows data for N in the second position, there were also spectra found to contain yn-1 + 11 ions when N is located in position 3 (at about the same level) and position 4 (at about 30% of that level) (sample spectra are shown in Figures S5–S7).

Plots similar to those in Figure 3a and b were created for several other combinations of peptide charge and yn-1 + 10 product ion charge. Of these, we show in Figure 3c the results for doubly charged peptides producing doubly charged yn-1 + 10 ions. Although both Figure 3a and c are for doubly charged peptide precursor ions, there is a profound difference between the sequence dependence for formation of singly (Figure 3a) versus doubly (Figure 3c) charged yn-1 + 10 ions. In both figures, the N-terminal residues most frequently associated with intense yn-1 + 10 ions are N and P. However, the second-position residues giving the highest intensities with N-terminal N and P differ between the two product ion charge states. The highest values in Figure 3c are for [yn-1 + 10]2+ ions formed from peptides with N-terminal PH and NK. Peptide sequences starting with PH give exceptionally high intensity [yn-1 + 10]2+ peaks (see Figure 1b for an example). The basic residues H and K attract the second proton and increase the likelihood that the yn-1 + 10 ion produced is doubly charged. It is not clear, however, why the effects of H and K are not the same with both P and N at the N-terminus.

For triply charged peptides forming doubly charged yn-1 + 10 ions (Figure 3d), the formation of [yn-1 + 10]2+ follows the same general trends as in Figure 3a (doubly charged peptides with singly charged yn-1 + 10 ions). The highest intensities of the corresponding yn-1 + 10 ions in both plots are associated with N-terminal N and P. The effects of S and T in the second position are greater in Figure 3d than in 3a and occur with many more N-terminal residues (A, B, F, H, I, K, L, M, N, N, S, V, W, and Y).

3.5 Results with Different Mass Spectrometers

While the results discussed above were obtained with ion trap (IT) mass spectrometers, synthetic peptides were analyzed with IT as well as with QQQ and QTOF mass spectrometers. In the latter two instruments the MS/MS spectra were recorded at increasing collision energies, which shows the progression of fragmentation and results in MS/MS spectra with different relative peak intensities. The spectra obtained with all three instruments show mostly the same peaks but the relative intensities vary with collision energy; in general a spectrum taken at one of the intermediate collision energies was found to be very similar to that taken with the IT instrument. For the following discussion of the intensities of the yn-1 + 10 ions, we consider the maximal intensities, i.e., those obtained at the collision energy that gave the highest value.
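Selecting the maximal intensity over a collision energy series can be sketched as follows. The data layout (a mapping from collision energy to a dictionary of relative intensities per ion label) and the function name are assumptions made for illustration.

```python
def maximal_intensity(energy_scan, ion):
    """Return (collision_energy, relative_intensity) at which `ion` is most intense.

    energy_scan : {collision_energy: {ion_label: relative_intensity}}
    An ion missing from a spectrum is treated as having zero intensity.
    """
    best_energy, best_intensity = None, 0.0
    for energy, spectrum in energy_scan.items():
        intensity = spectrum.get(ion, 0.0)
        if best_energy is None or intensity > best_intensity:
            best_energy, best_intensity = energy, intensity
    return best_energy, best_intensity
```

Applying the > 5% criterion to the returned intensity, rather than to a spectrum at a fixed energy, corresponds to the maximal-intensity convention used for the QQQ and QTOF data.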

Experiments with 120 synthesized mixtures of tryptic tripeptides (containing all possible combinations) using the QTOF show that singly charged tripeptide ions with N-terminal P or N produce [y2 + 10]+ fragment ions with the highest intensities. A relatively small number of singly charged peptide ions are in the NIST library, and these were found to follow the same trend. The percent of tripeptides forming y2 + 10 ions with intensities > 5% is 38% for N-terminal P, 42% for N-terminal N, and < 5% for several other residues. Selected tripeptides and longer synthetic peptides were also analyzed by the QQQ at 20 different collision energies and the intensities of fragment ion peaks were plotted as a function of collision energy. Such plots, like those for the synthetic peptide PHEFYAK (Figure 4), show that the yn-1 and yn-1 + 10 ions have similar energy dependences, indicating that they are formed in parallel. Figure 4 also shows that loss of water from the precursor ion is one of the early processes, i.e., the p-18 ion is formed at low collision energy, but it disappears rapidly with increasing energy and never builds up to an intensity > 2%. In peptides with S or T in the second position this loss of water occurs at lower energies and builds up to higher intensities [7] and is a possible contributor to the formation of the yn-1 + 10 fragment ions. The mechanism will be discussed in the next section.

Figure 4

Relative intensities of product ion peaks from a synthesized peptide as a function of collision voltage (in QQQ mass spectrometer). (a) [PHEFYAK + 2H]2+, (b) [PHEFYAK + H]+. To minimize overlaps, only the y ions are shown

A significant difference was found between the synthetic peptide results and those in the NIST library when the peptides contained methionine (M) at the N-terminus. Doubly charged tripeptides with y2 + 10 fragment ion intensity > 5% (at the collision energy where it is maximal) are most abundant among those with N-terminal M (60%). Collision energy dependence of fragment ion formation shows that the yn-1 + 10 fragment ion appears after rapid loss of 48 Da (CH3SH) from the N-terminal methionine in these peptide ions. Similar results were obtained for the peptide MRFA. Figure 5a shows the formation of various ions as a function of collision voltage in the MS/MS spectrum of [MRFA + 2H]2+ ion. Figure 5b shows MS3 results on the [p-48]2+ ion, where a decrease in this ion is accompanied by an increase in y3 + 10 ion (and other fragment ions not shown). This loss of 48 Da from M is observed only with doubly charged small peptides [6] and its contribution decreases with increasing peptide length. Therefore, no such effect was observed in the peptide library, where virtually all peptides have more than five amino acid residues.

Figure 5

MS2 and MS3 spectral development for the doubly charged peptide MRFA in the QQQ mass spectrometer. (a) MS2 results showing neutral loss of 48 Da (CH3SH) at low collision voltage followed by other fragmentations. (b) MS3 results on the [p-48]2+ ion showing decrease of this precursor along with formation of the y3 + 10 ion and subsequent formation of other ions (only selected ions are shown). The precursor ion (p) intensity is divided by 6 to bring its 100% starting value into the range of the plot

4 Discussion

Analysis of the NIST peptide MS/MS library shows that spectra with yn-1 + 10 and yn-1 + 11 ions are significant in number. For example, about 5% of all the doubly and triply charged peptide ions produce yn-1 + 10 ions with relative intensity > 5%, but the value approaches 30% if only peptides with N-terminal P or N are considered. The sample spectra shown in Figure 1 and in the Supplement show that the yn-1 + 10 ions can have high intensities, sometimes becoming the base peak in the spectrum. The 10 or 11 Da increase in the expected m/z for the y ion is ascribed to the m/z for [+ CO – H2O] or [+ CO – NH3], respectively, confirming previous observations [30]. The suggested mechanism involves dissociation of the N-terminal residue at the Cα–Ccarbonyl bond following loss of H2O or NH3 (Scheme 1). Loss of water from peptide ions has been shown to occur not only from the carboxyl terminus, to initiate formation of b ions, but also near the amino terminus and elsewhere along the chain [31, 32]. The mechanism for water loss near the amino terminus has been shown to involve an imidazolidinone ring as the most likely structure [33, 34]. This structure is used in Scheme 1 to indicate the general mechanism for formation of yn-1 + 10 ions. In most cases, peptides with a yn-1 + 10 peak also form yn-1; therefore, two pathways may occur concomitantly and, from results with the QQQ mass spectrometer, we show that they require similar collision energies.

Scheme 1

Proposed mechanism of yn-1 + 10 ion formation

The yn-1 + 10 ions may occur for many different N-terminal amino acid residues, but are found most frequently for peptides with N-terminal P, N, and H, or having S or T at the adjacent position. This finding may indicate differences in the mechanism of yn-1 + 10 formation for different amino acids and may involve the side chains. For example, the MS2 spectrum of the peptide PSFLYK in an ion trap mass spectrometer shows formation of the y5 + 10 and y5 ions with similar, low intensities (Figure 6a), but the MS3 spectrum of the peptide precursor ion with a loss of water shows much greater intensity for the y5 + 10 ion relative to that of the y5 ion (Figure 6b). This finding indicates that loss of water, most likely from the serine residue [7], leads to enhanced formation of the yn-1 + 10 ion and suggests that loss of water takes place from serine before chain scission (Scheme S1).

Figure 6

Ion trap MS2 and MS3 spectra of the synthesized peptide PSFLYK. (a) MS2 spectrum of singly charged PSFLYK showing formation of both y5 and y5 + 10 ions and (b) MS3 spectrum of the [MH – H2O] ion from PSFLYK, showing a much higher y5 + 10 compared with y5. The peaks are labeled with the assignments corresponding to those from the original peptide

Similarly, other amino acids may have their own specific effects on the mechanism of yn-1 + 10 formation. For example, histidine residues at the N-terminal or second position may have enhanced y + 10 formation due to the ability to localize a proton and/or to form a six-membered ring (Scheme S2). When asparagine is in the second position, the side chain may attack the Ccarbonyl and lose either H2O or NH3 forming the experimentally observed yn-1 + 10 and yn-1 + 11 ions, respectively (Scheme S3). These mechanisms and structures are speculative and require further studies for confirmation.

In conclusion, the identification and characterization of the yn-1 + 10 and yn-1 + 11 peptide ions in the NIST peptide MS/MS library show trends in the data that can be used to create fragmentation rules that may be applied in sequence search algorithms to improve peptide identification. The specificity of the amino acid sequence at the N-terminus for forming intense yn-1 + 10 and yn-1 + 11 peaks can be used in de novo sequencing to determine the position of residues when this cannot be directly derived from the spectrum. These rules may also be used to improve the confidence of peak assignments and inclusion of peptide MS/MS spectra in reference libraries.