Introduction

Protein sequence coverage obtained in a proteomic analysis depends on multiple variables, including peptide size, peptide hydrophobicity, aromatic amino acid content, charged side chains, and the ability to form stable secondary structures. Protein identification is also dependent on workflow parameters such as ion selectivity, limit of detection, limit of quantification, dynamic range, data density, repeatability, and reproducibility [1]. For regular proteomic workflows that use database search for protein identification, a small stretch of contiguous b- or y-ions is enough for matching theoretical and experimental spectra. However, studies focused on de novo sequence assignments for peptide/protein mapping need to explore and optimize strategies that can improve peptide fragmentation in terms of ion occurrence as well as intensity. Peptide properties such as basicity, the location and number of acidic and basic residues, and the activation mode influence peptide fragmentation and thereby determine the appearance of fragment ions [2–12].

The basic processes that govern peptide fragmentation in mass spectrometry are comprehensively explained by the mobile proton model proposed by Gaskell and Wysocki [13–17]. Sequence-specific cleavages have been identified by careful analysis of CID fragmentation data [7, 9, 11, 18–24]. Peptide fragmentation can be tuned by adding or altering charge localization through selective modification of peptide termini and, especially, internal residues such as K, R, S, T, and C [25, 26]. Although many of the chemical modifications have been known since 1970 [27], their utility was limited by the incompatibility of these reactions with MS-based workflows, the need for higher amounts of sample, interferences due to by-products, possible side reactions, and increased data analysis time. However, the development of improved chemical labeling procedures has enabled researchers to integrate chemical tags into traditional proteomic workflows [28]. Chemical tags have found three major applications in proteomics: (1) as affinity tags to reduce the problem of dynamic range; (2) as differential isotope labeled chemical tags in quantitative proteomics; and (3) as chemical labels to engineer peptide fragmentation [26, 29, 30].

Peptide fragmentation in a typical CID experiment results in the formation of b- and y-type ions. Representation of both b-type and y-type ions is favorable for confident protein sequence assignment. Moreover, in proteomic studies involving novel biological systems that lack a cognate database, de novo peptide annotation followed by homology search is the most frequently adopted methodology for protein annotation [31–34]. In such a scenario, chemical labels that can optimize peptide fragmentation would provide comprehensive sequence information [35–37]. Chemical labeling strategies have provided an alternate platform for improving the confidence of protein identifications. The major contribution of chemical tags to mainstream proteomic studies is in peptide quantitation, e.g., iTRAQ [26]. Another popular application of chemical tags is to direct peptide fragmentation towards a specific ion type (i.e., b-type or y-type ions). Based on their chemical nature, these tags can enhance or suppress the appearance of a specific type of fragment ion. For instance, sulphonation of peptides by 4-sulphophenyl isothiocyanate or 3-sulphopropionic acid NHS ester introduces a sulphonate group at the peptide amino terminus, the negative charge of which neutralizes the protonated b-ion; the net charge on the b-ion becomes zero and only the y-ion series is observed in the tandem mass spectra [38–40]. In contrast, derivatization with 2,4,6-trimethylpyrylium tetrafluoroborate [41], 2,4,6-trimethylpyridinium [41], or 4-amidinobenzoic acid [42], or acetylation, results in a dominant N-terminal ion series (i.e., b-type ions) [41, 43, 44].

However, b-type ions are under-represented in ion trap mass spectra [10, 45–52]. Among the many chemical labeling approaches, acetylation and guanidinylation are simple to perform and are known to affect peptide fragmentation. Acetylation is employed to study gas-phase peptide fragmentation and results in a mass shift of 42.01 Da, which helps in differentiating lysine from glutamine [44, 49]. It has been shown that acetylation of the N-terminus (1) allows peptide fragmentation at low energies [4, 8], (2) prevents gas-phase cyclization of b-ions [44, 49, 53], (3) increases b-ion relative intensity [43, 54], and (4) increases b-ion occurrence [43]. Guanidinylation was initially used as a tool to improve the stability of proteins [55]. Guanidinylation is specific to the epsilon amine of lysine, causes a mass shift of 42.02 Da, and results in increased precursor ion intensity [56]. Guanidinylation improves peptide fragmentation via charge-remote and charge-directed mechanisms, which in turn results in increased sequence coverage [36]. Although a combination of acetylation and guanidinylation has been utilized for quantitative labeling, the impact of dual labeling on peptide fragmentation efficiency and the influence of each modification have not been investigated systematically [57].
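
The two mass shifts quoted above, and the lysine/glutamine ambiguity that acetylation resolves, can be checked with a few lines of arithmetic. The following is a minimal sketch; the elemental compositions (net addition of C2H2O for acetylation, CH2N2 for guanidinylation) and the monoisotopic atomic masses are standard chemistry values assumed here, not taken from the text.

```python
# Monoisotopic atomic masses in Da (standard values, assumed here).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def formula_mass(composition):
    """Sum the monoisotopic masses for an elemental composition dict."""
    return sum(MONO[element] * count for element, count in composition.items())


# Net elemental additions for the two labels (assumed compositions).
ACETYL = {"C": 2, "H": 2, "O": 1}       # amine -> acetamide
GUANIDINYL = {"C": 1, "H": 2, "N": 2}   # lysine -> homoarginine

# Residue compositions (amino acid minus water).
LYS = {"C": 6, "H": 12, "N": 2, "O": 1}
GLN = {"C": 5, "H": 8, "N": 2, "O": 2}

acetyl_shift = formula_mass(ACETYL)
guanidinyl_shift = formula_mass(GUANIDINYL)
lys_gln_gap = formula_mass(LYS) - formula_mass(GLN)
acetyl_lys_gln_gap = lys_gln_gap + acetyl_shift

print(f"acetylation shift:      {acetyl_shift:.4f} Da")
print(f"guanidinylation shift:  {guanidinyl_shift:.4f} Da")
print(f"Lys - Gln (unlabeled):  {lys_gln_gap:.4f} Da")
print(f"acetyl-Lys - Gln:       {acetyl_lys_gln_gap:.4f} Da")
```

Unlabeled lysine and glutamine differ by only a few hundredths of a dalton, which is well inside a typical ion trap fragment tolerance, whereas an acetylated lysine is separated from glutamine by roughly 42 Da.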

A combination of acetylation and guanidinylation would play a role at two stages of mass spectrometry. Guanidinylation is expected to improve ionization due to increased basicity rendered by guanidine group. On the other hand, acetylation leads to an improved b-ion relative intensity in the tandem mass spectra, which is otherwise dominated by y-type ions. Thus, a combination of acetylation and guanidinylation could yield improved peptide fragmentation. To evaluate the specific statistical advantages of dual labeling method, we have tested this workflow on a lipase from Bacillus subtilis. This lipase is generally identified with high sequence coverage (>85%) and has a broad range of peptide products from 5 to 23 amino acids long, which could be confidently assigned. Simultaneously, we also labeled standard bovine six protein mix digest to evaluate the efficiency of dual labeling in a multi-protein scenario.

Materials and Methods

Trypsin Gold and chymotrypsin of mass spectrometry grade were obtained from Promega (Madison, WI, USA). α-Cyano-4-hydroxycinnamic acid (HCCA) was obtained from Sigma Aldrich (St. Louis, MO, USA). Six bovine protein tryptic digest equal molar mix (P/N PTD/00001/63) was obtained from Michrom Bioresources (Auburn, CA, USA). Acetic anhydride, triethylamine, methanol, and acetonitrile of analytical grade were sourced locally (SpectroChem, India).

Protein Purification

Bacillus subtilis lipase was cloned in pET21b, and purified upon expression in E. coli BL21 (DE3) as described earlier [58]. Purified protein was stored in 2 mM glycine-NaOH buffer (pH 10.0) at –20 °C. Protein purity was checked on SDS-PAGE. Protein quantitation was carried out by the modified Lowry method [59].

Proteolysis

Purified lipase (1 μg ~52 picomoles) was treated with 25 ng of Trypsin Gold and the total reaction volume was adjusted to 30 μL with 50 mM ammonium bicarbonate. The reaction mixture was incubated at 37 °C for 18 h. Samples were then vacuum dried and stored at –20 °C, till further use.
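
The quantities in the digestion step can be cross-checked with simple unit arithmetic. The sketch below uses only the numbers given above (1 μg, ~52 pmol, 25 ng trypsin); the function names are illustrative, and the molecular weight it prints is merely the value implied by those numbers, not a measured one.

```python
def implied_molecular_weight_da(mass_ug, amount_pmol):
    """Molecular weight (g/mol) implied by a mass in micrograms and an amount in picomoles."""
    return (mass_ug * 1e-6) / (amount_pmol * 1e-12)


def enzyme_to_substrate_ratio(enzyme_ng, substrate_ug):
    """Weight-to-weight enzyme:substrate ratio (both converted to nanograms)."""
    return enzyme_ng / (substrate_ug * 1000.0)


mw = implied_molecular_weight_da(mass_ug=1.0, amount_pmol=52.0)
ratio = enzyme_to_substrate_ratio(enzyme_ng=25.0, substrate_ug=1.0)

print(f"implied molecular weight: {mw / 1000:.1f} kDa")
print(f"trypsin:lipase (w/w):     1:{1 / ratio:.0f}")
```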

Acetylation

Acetylation was carried out according to the published protocol with slight modification [54]. Acetylation mix was prepared by adding 12 μL of acetic anhydride to 83 μL of methanol, and 5 μL of triethylamine was added to this mix. Two μL of the freshly prepared acetylation mix was added to trypsin digested peptides; the concentration of 17 fmol/μL bovine six protein mix was used. The reaction mixture was allowed to stand at room temperature for 10 min. Samples were then vacuum dried and stored at –20 °C till further use.

Guanidinylation

Guanidinylation and desalting of the peptide samples were carried out according to the published protocol [60]. A stock solution of O-methylisourea was prepared by dissolving 0.05 g in 51 μL of water. The guanidinylation reaction mixture was prepared by mixing a 5-μL aliquot of digested protein with 5.5 μL of 7 N NH4OH and 1.5 μL of O-methylisourea stock solution. After incubating the reaction mixture at 65 °C for 5–10 min, the reaction was terminated by adding 15 μL of 10% TFA (v/v). The acidified reaction mixture was partially dried in a speed-vac to a final volume of 10 μL. Guanidinylated peptides were desalted by using Zip tips (Millipore) packed with the C-18 matrix as per manufacturer’s recommendations.

Dual Labeling

In the case of dual labeling method, peptides were subjected to guanidinylation followed by acetylation (as described above), vacuum dried, and the dual labeled peptides stored at –20 °C, till further use.

Mass Spectrometry

nLC-ESI MS/MS

Unlabeled, acetylated, guanidinylated, and dual labeled trypsin digested peptides were resuspended in 10 μL of 5% ACN containing 0.1% formic acid. Peptides were fractionated on a nanoflow LC system (Easy nLCII; Proxeon Biosystems, Odense, Denmark) using a Bio Basic C-18 Pico Frit nanocapillary column (75 μm × 10 cm; New Objective, Woburn, MA, USA) with a 60 min linear gradient of 0%–100% B [5% ACN with 0.1% formic acid (solvent A) and 95% ACN with 0.1% formic acid (solvent B)] at a flow rate of 200 nL/min, and analyzed on an LTQ Orbitrap Velos (Thermo Scientific, San Jose, CA, USA); 1.7 kV was applied for ionization. Full scan MS spectra with a mass window of 300–2000 Da were acquired in FT mode after accumulation to a target value of 1 × 10^6. FT resolution was set to 60,000; the top 20 peptides with a charge state of two or more were isolated to a target value of 5000 and fragmented in the linear ion trap with a normalized collision energy of 35% in CID mode. Fragment ions were scanned in the low-pressure ion trap at a scan rate of 33,333 amu/s, and the minimum threshold of ion selection for MS/MS was set at 500 counts. Ion accumulation time was set at 500 ms for MS and 25 ms for MS/MS. An activation time of 10 ms and a q value of 0.25 were used [61].

Data Analysis

Sequest Search

Sequest HT search was performed using the Proteome Discoverer 1.4 platform (ver. 1.4.0.288; Thermo Scientific Inc.). Raw files of unlabeled and labeled peptides of lipase were analyzed against Bacillus subtilis entries from the NCBInr database, along with a database of common contaminants. Bovine six protein mix was analyzed against Bos taurus entries from the NCBInr database and also against the curated bovine database from Uniprot. For unlabeled peptides of lipase, methionine oxidation and deamidation of asparagine/glutamine were taken as variable modifications. Trypsin as the protease with up to two missed cleavages, charge states of +2 to +3, a mass range from 3500 to 5000 Da, and a peptide length of ≥4 amino acids were considered for spectral matching. A peptide mass tolerance of 10 ppm and a fragment mass tolerance of 0.6 Da were applied. For acetylated peptides of lipase, the search included N-terminal acetylation as a fixed modification, and acetylation of lysine/histidine/serine/threonine/tyrosine/cysteine and methionine oxidation as variable modifications. For guanidinylated peptides, guanidinylation of lysine was considered as a fixed modification, and deamidation of asparagine and glutamine and oxidation of methionine were considered as variable modifications. For dual labeled peptides, guanidinylation of lysine was taken as an additional fixed modification; the remaining search parameters were the same as those for acetylation. A target FDR value of 0.01 was achieved by including only high confidence peptide hits, with score versus charge state (Xcorr) as the filter.
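
To make the two tolerances concrete: a 10 ppm precursor tolerance scales with m/z, whereas the 0.6 Da fragment tolerance is an absolute window. The helper functions below are an illustrative sketch, not part of Sequest HT or Proteome Discoverer, and the example m/z value is arbitrary.

```python
def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a symmetric tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


PRECURSOR_TOLERANCE_PPM = 10.0   # from the search settings above
FRAGMENT_TOLERANCE_DA = 0.6      # from the search settings above

low, high = ppm_window(500.0, PRECURSOR_TOLERANCE_PPM)  # arbitrary example m/z
print(f"10 ppm at m/z 500 spans {low:.4f} to {high:.4f} ({high - low:.4f} wide)")
print(f"fragment window is {2 * FRAGMENT_TOLERANCE_DA:.1f} Da wide at any m/z")
```

At m/z 500 the precursor window is about 0.01 wide, two orders of magnitude tighter than the ion trap fragment window.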

Peaks 6.0 Search

nLC-ESI MS/MS files of acetylated, guanidinylated, dual labeled, and unlabeled trypsin digested lipase were analyzed in the PEAKS 6.0 software with the recommended data refining parameters (i.e., precursor ion tolerance of 10 ppm, fragment ion tolerance of 0.6 Da, and precursor correction with a peptide charge window of +1 to +4). A mass window of 200–5000 and a retention time window of 0.05–59.95 were used for data analysis. Spectral quality was set at 0.65 as recommended by the manufacturers; the –10logP score cut-off was set to run-specific scores in order to achieve a theoretical peptide FDR of zero. Database search for lipase was carried out against Bacillus subtilis entries from the NCBInr database with the same search settings used in Sequest HT. De novo analysis of the raw files was carried out with the same modification settings as those used in Sequest HT. The de novo sequences obtained were analyzed for differences in the accuracy of sequence assignments between labeled and unlabeled peptides.

Mascot Search

Unlabeled, acetylated, and dual labeled trypsin peptides were subjected to Mascot search using Proteome Discoverer ver. 1.4. Spectra were searched against B. subtilis in the Uniprot database along with contaminants; bovine six protein mix was analyzed against the curated bovine database from Uniprot. Search settings were the same as those used for the Sequest HT search. The data files (.DAT) generated from this search were used for peptide fragmentation analysis.

Fragmentation Analysis

Fragment ion occurrence and intensity variability statistics were assessed for unlabeled and labeled peptides using the Fragmentation Analyzer tool [62] ver. 1.5.14. Mascot identification files (.DAT) were generated from triplicate nLC ESI-MS/MS runs of unlabeled and labeled lipase digests, using high peptide confidence, a 95% significance threshold, and a peptide score ≥30.

For fragmentation analysis, peptides with a +2 charge state and at least 10 peptide spectrum matches per sequence across three nLC ESI-MS/MS replicates were selected. All possible modifications were considered for analysis. Peptide spectrum matches were assessed for changes in the normalized median intensity of b-/y-ions upon modification. The b-/y-ion relative intensity pattern for unique peptides at each amino acid position was generated as a function of the chemical label, and finally, the fragment ion occurrence pattern for b-/y-ions was plotted for labeled and unlabeled samples as a function of C-terminal arginine and C-terminal lysine.
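
The selection rule described above (doubly charged peptides with at least 10 peptide spectrum matches per sequence) can be sketched as follows. The record layout (`sequence`, `charge`, `replicate` keys) is hypothetical, and normalizing each spectrum to its most intense fragment is an assumption made for illustration; it is not necessarily how Fragmentation Analyzer normalizes intensities internally.

```python
from collections import defaultdict
from statistics import median


def select_peptides(psms, charge=2, min_psms=10):
    """Group PSMs by sequence, keeping only the given charge state and
    sequences with at least `min_psms` matches pooled over all replicates."""
    by_sequence = defaultdict(list)
    for psm in psms:
        if psm["charge"] == charge:
            by_sequence[psm["sequence"]].append(psm)
    return {seq: group for seq, group in by_sequence.items() if len(group) >= min_psms}


def normalized_median_intensity(fragment_intensities):
    """Scale fragment intensities to the most intense fragment (assumed
    normalization) and return the median of the scaled values."""
    top = max(fragment_intensities)
    return median(intensity / top for intensity in fragment_intensities)


# Toy input: one peptide passes, one has too few PSMs, one has the wrong charge.
psms = (
    [{"sequence": "ALPGTDPNQK", "charge": 2, "replicate": i % 3 + 1} for i in range(10)]
    + [{"sequence": "PEPTIDEK", "charge": 2, "replicate": 1} for _ in range(9)]
    + [{"sequence": "SAMPLER", "charge": 3, "replicate": 2} for _ in range(12)]
)
selected = select_peptides(psms)
print(sorted(selected))
print(normalized_median_intensity([50.0, 100.0, 25.0]))
```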

Raw Meat ver. 2.1 Analysis

Alterations in the retention time, ionization, and precursor ion intensity (MS2) of the peptides upon labeling were assessed by comparing the raw files of unlabeled and dual labeled samples of bovine six protein mix digest and lipase using Raw Meat ver. 2.1, developed by Vast Scientific in conjunction with the Thermo Fisher Scientific BRIMS Center, Cambridge, MA, USA.

Results and Discussion

Labeling Efficiency and Protein Identification

Unlabeled and labeled lipase digests were subjected to nLC ESI-MS/MS. Data analysis by Sequest HT and Peaks 6.0 search engine showed complete sequence coverage for unlabeled lipase, whereas for acetylated, guanidinylated, and dual labeled lipase, the sequence coverage was 97%, 100%, and 97%, respectively, suggesting that the labels individually or in combination did not affect the sequence coverage of the protein significantly. At this stage, we checked the labeling efficiency by considering all the modifications as variable and reanalyzed the data (Supplementary Tables S1–S7). Based on the data analysis and spectral information, labeling efficiency was found to be 100% for individual labels as well as in combination (Supplementary Figures S1–S3).

Normalized Fragment Ion Intensity Increased Upon Labeling

The impact of the chemical labels on the peptide fragmentation was assessed by comparing the normalized median fragment ion intensity for b- and y-type ions. Normalized median ion intensity was generated from triplicate runs for each chemical label. From the normalized ion intensity values, it is evident that both chemical labels (i.e., acetylation and guanidinylation) were efficient in increasing overall b- and y-ion intensity. However, increase in the fragment ion intensities is higher upon dual labeling in comparison with individual labels (Table 1), which could be understood as a combinatorial effect (Supplementary Figure S4).

Table 1 Mean and Median Values of Normalized Fragment Ion Intensity for Different Chemical Labels Tested

Dual Labeling Improved Peptide Fragmentation

Elevated normalized b- and y-ion relative intensity upon labeling is suggestive of altered fragmentation at the peptide level. Therefore, fragment ion intensity information was derived from all the peptide spectrum matches for unique peptides of lipase (Supplementary Figures S5–S11). Changes in the normalized intensity of b- and y-ions were plotted as a function of each amino acid. The impact of each chemical label on the fragmentation pattern of the peptide was studied.

  1.

    ALPGTDPNQK

    The fragment ion intensity box plot of the unlabeled peptide shows dominant y-ion peaks for proline at the third and seventh positions of the peptide (Figure 1a). It is well established that proline promotes fragmentation towards the N-terminal side of the peptide, resulting in a strong y-ion at that position [5, 18, 19]. Similarly, relatively higher intensity y-ion peaks were seen for glycine and glutamine, and higher intensity b-ion peaks for aspartate and leucine, which is characteristic of typical ion trap CID MS/MS.

    Figure 1

    A box plot of the fragment ions and corresponding relative intensities for (a) unlabeled (519 PSMs), (b) acetylated (92 PSMs), (c) guanidinylated (110 PSMs), and (d) dual labeled (127 PSMs) peptide “ALPGTDPNQK”; b- and y-fragment ions are represented in blue and red, respectively

    There was a reversal in the b- and y-ion intensities for aspartate and glutamine residues upon acetylation (Figure 1b). Guanidinylation resulted in stronger y-ion intensity for asparagine and aspartate residues, and higher b-ion intensity was observed at the threonine residue. Overall, higher intensity peaks could be seen for threonine, glycine, and aspartate upon guanidinylation (Figure 1c). Although the proline effect is still prominent for all the labeled peptides, each label did influence the peptide fragmentation pattern, whereas dual labeling showed a cumulative effect of acetylation and guanidinylation (Figure 1d).

  2.

    KVDIVAHSMGGANTLYYIK

    This peptide presents an interesting case where basic residues are present at the termini and in the middle of the peptide in the form of lysine and histidine, respectively. This allows us to monitor the influence of these residues on peptide fragmentation with and without labeling. All the amino acids were represented in the MS/MS spectra for the unlabeled peptide. Higher fragment ion intensities were observed for residues close to the histidine residue (Figure 2a). This could be attributed to the higher basicity of the histidine residue in comparison to lysine [23].

    The acetylated peptide had increased b-ion relative intensity, especially towards the C-terminus. Lowered peptide basicity upon di-acetylation of lysine at the peptide N-terminus seemingly suppressed the histidine effect while improving fragmentation at glycine and tyrosine residues (Figure 2b). Guanidinylation resulted in b- and y-ion representation prominently at the amino terminus of the peptide. The increase in peptide basicity due to guanidinylation of lysine to homoarginine resulted in a dominant y-ion representation (Figure 2c). Dual labeling resulted in an acetylated, guanidinylated lysine at the peptide amino terminus and guanidinylation at the carboxyl terminus, and gave an equitable fragment ion distribution across the entire peptide length (Figure 2d). Thus, a balance of peptide basicity upon dual labeling allowed optimized fragment ion occurrence in this peptide.

    Figure 2

    A box plot of the fragment ions and corresponding relative intensities for (a) unlabeled (192 PSMs), (b) acetylated (33 PSMs), (c) guanidinylated (11 PSMs), and (d) dual labeled (30 PSMs) peptide “KVDIVAHSMGGANTLYYIK”; b- and y-fragment ions are represented in blue and red, respectively
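
The b- and y-ion series discussed for the two peptides above can be enumerated from residue masses. The following is a minimal sketch for singly charged fragments of ALPGTDPNQK, unlabeled and dual labeled; it assumes standard monoisotopic residue masses, an acetyl group on the N-terminus, and a guanidinyl group on the C-terminal lysine. The function and variable names are illustrative and do not come from any of the software used in this study.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565
ACETYL = 42.010565       # N-terminal acetylation
GUANIDINYL = 42.021798   # lysine -> homoarginine


def fragment_ions(sequence, nterm_mod=0.0, residue_mods=None):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    residue_mods maps a 0-based residue position to an added mass.
    b_ions[0] is b1 and y_ions[0] is y1.
    """
    residue_mods = residue_mods or {}
    masses = [RESIDUE[aa] + residue_mods.get(i, 0.0) for i, aa in enumerate(sequence)]

    b_ions, running = [], nterm_mod
    for mass in masses[:-1]:                 # b-ions grow from the N-terminus
        running += mass
        b_ions.append(running + PROTON)

    y_ions, running = [], WATER
    for mass in reversed(masses[1:]):        # y-ions grow from the C-terminus
        running += mass
        y_ions.append(running + PROTON)
    return b_ions, y_ions


peptide = "ALPGTDPNQK"
b_plain, y_plain = fragment_ions(peptide)
b_dual, y_dual = fragment_ions(
    peptide, nterm_mod=ACETYL, residue_mods={len(peptide) - 1: GUANIDINYL}
)

for n, (b0, b1, y0, y1) in enumerate(zip(b_plain, b_dual, y_plain, y_dual), start=1):
    print(f"b{n}: {b0:9.4f} -> {b1:9.4f}   y{n}: {y0:9.4f} -> {y1:9.4f}")
```

In this sketch every b-ion of the dual labeled peptide carries the acetyl shift and every y-ion carries the guanidinyl shift, so the two series remain distinguishable by their label-specific offsets.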

Dual Labeling Increased the Accuracy of De Novo Sequence Annotation

Improvement in the accuracy of de novo amino acid sequence annotation upon labeling was compared for unlabeled and labeled peptides of lipase using the PEAKS 6.0 algorithm. The confidence of amino acid assignment for a peptide is judged with the help of two statistical parameters: TLC (total local confidence), the sum of the per-residue probabilities of correct amino acid assignment in the peptide, and ALC (average local confidence), the TLC averaged over the peptide length and expressed as a percentage (TLC/peptide length × 100).
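
The relationship between the two scores, ALC = TLC/peptide length × 100, can be written out directly. This is a minimal sketch that assumes the per-residue local confidences are available as probabilities between 0 and 1; the example values are invented for illustration and are not results from this study.

```python
def total_local_confidence(local_confidences):
    """TLC: sum of the per-residue local confidence values (each 0 to 1)."""
    return sum(local_confidences)


def average_local_confidence(local_confidences):
    """ALC in percent: TLC divided by the peptide length, times 100."""
    return 100.0 * total_local_confidence(local_confidences) / len(local_confidences)


# Hypothetical local confidences for a four-residue de novo tag.
example = [0.9, 0.8, 1.0, 0.7]
tlc = total_local_confidence(example)
alc = average_local_confidence(example)
print(f"TLC = {tlc:.2f}, ALC = {alc:.1f}%")
```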

Peptide assignments were generally found to be very close to the actual sequence for both unlabeled and dual labeled peptides. Although there were errors in the identification of the correct amino acid at a given position, the number of mass fits (N = G + G, Q = G + A, S + Y = C + F, K = Q, W = E + G, (M + 16) = F, etc.) was lower in dual labeled peptides (13) compared with unlabeled peptides (28) because of increased fragment ion occurrence. Dual labeling thus improved the accuracy of the de novo fragment ion annotation compared with unlabeled peptides (Table 2).
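
The mass fits listed above arise because the two sides of each pair are identical or nearly identical in mass. The sketch below checks each pair against the 0.6 Da fragment tolerance used in this study, using standard monoisotopic residue masses (assumed here); `M+16` denotes oxidized methionine, and the helper name is illustrative.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "C": 103.00919,
    "N": 114.04293, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M+16": 131.04049 + 15.99491, "F": 147.06841, "Y": 163.06333, "W": 186.07931,
}
FRAGMENT_TOLERANCE_DA = 0.6

# The ambiguous pairs named in the text, written as (left side, right side).
MASS_FITS = [
    (("N",), ("G", "G")),
    (("Q",), ("G", "A")),
    (("S", "Y"), ("C", "F")),
    (("K",), ("Q",)),
    (("W",), ("E", "G")),
    (("M+16",), ("F",)),
]


def mass_difference(left, right):
    """Signed mass difference (Da) between two groups of residues."""
    return sum(RESIDUE[r] for r in left) - sum(RESIDUE[r] for r in right)


differences = {}
for left, right in MASS_FITS:
    label = f"{' + '.join(left)} vs {' + '.join(right)}"
    differences[label] = mass_difference(left, right)
    within = abs(differences[label]) <= FRAGMENT_TOLERANCE_DA
    print(f"{label:22s} {differences[label]:+.5f} Da  ambiguous at 0.6 Da: {within}")
```

All six pairs fall far inside the ion trap fragment tolerance, so they can only be resolved when a fragment ion is observed between the two residues of a pair or when a label changes the mass of one side.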

Table 2 Comparison Table of the Peptide De Novo Sequence Annotation Between Unlabeled and Dual Labeled Peptides Using PEAKS 6.0 Software

Given the complexity involved in de novo sequencing, studies of this kind will help in the development of de novo sequencing algorithms. Even in large scale protein identification studies, it is prudent to manually evaluate the spectra to confirm the results (for crucial peptides or PTMs). Phenylalanine and oxidized methionine have the same nominal mass but differ in fragmentation [63], which helps in distinguishing these residues. Acetylation is useful in differentiating lysine and glutamine.

Fragment Ion Occurrence was Balanced upon Dual Labeling

Higher accuracy in de novo sequencing suggests that dual labeling improved the fragmentation of peptides. The effect of dual labeling on fragment ion occurrence was evaluated for peptides with C-terminal arginine and C-terminal lysine. It is hypothesized that acetylation improves b-ion representation in peptides with arginine at the carboxyl terminus, whereas for peptides with lysine at the carboxyl terminus, a dual impact is expected: (1) guanidinylation should improve the ionization efficiency, and (2) acetylation is expected to increase b-ion representation. In total, spectra with a balanced representation of b- and y-ions are expected.

Normalized fragment ion occurrence pattern of unlabeled and labeled peptides was compared to examine the impact of individual labels on the peptide fragmentation. In the case of unlabeled peptides, normalized y-ion occurrence percentage is higher compared with b-ions through the peptide length as seen in a typical ion trap CID, Figure 3a. Acetylation of the peptides resulted in substantial increase in b-ion occurrence, with a concomitant decrease in y-ion occurrence, especially after the first 7–8 ions. Thus tandem mass spectra of an acetylated peptide are expected to be dominated by b-ions with high-intensity y-ions, Figure 3b.
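
The occurrence percentage compared in this section can be thought of as the fraction of spectra in which a given fragment number is observed. The sketch below makes one possible normalization explicit, dividing by the number of peptide spectrum matches long enough to produce that fragment; this choice and the record layout are assumptions for illustration, not a description of how Fragmentation Analyzer computes its plots.

```python
from collections import Counter


def occurrence_percent(psms, ion_type):
    """Percentage of PSMs in which each fragment number of `ion_type` is observed.

    Each PSM is a dict with 'length' (peptide length) and, for 'b' and 'y',
    the set of fragment numbers detected in that spectrum (hypothetical layout).
    """
    observed, possible = Counter(), Counter()
    for psm in psms:
        for number in range(1, psm["length"]):   # fragments 1 .. length - 1
            possible[number] += 1
            if number in psm[ion_type]:
                observed[number] += 1
    return {n: 100.0 * observed[n] / possible[n] for n in sorted(possible)}


# Toy data: two spectra from peptides of length 4 and 3.
toy_psms = [
    {"length": 4, "b": {1, 2}, "y": {1, 2, 3}},
    {"length": 3, "b": {2}, "y": {1, 2}},
]
b_occurrence = occurrence_percent(toy_psms, "b")
y_occurrence = occurrence_percent(toy_psms, "y")
print("b:", b_occurrence)
print("y:", y_occurrence)
```

In the toy data the y-series is complete while the b-series has gaps, which mirrors the y-ion dominance described for unlabeled peptides.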

Figure 3

Comparison plot of variation in b- (blue) and y- (red) ion occurrence for trypsin digested peptides of lipase. Unlabeled peptides (20 unique peptides, 2686 PSMs) (a), acetylated peptides (23 unique peptides, 1802 PSMs) (b), guanidinylated peptides (25 unique peptides, 1993 PSMs) (c), and dual labeled peptides (30 unique peptides, 2709 PSMs) (d). Y-axis is the occurrence of ions (%) and the x-axis is fragment number. Dashed lines are neutral loss ions: y-NH3 (pink), y-H2O (yellow), b-H2O (purple), and b-NH3 (turquoise)

Guanidinylation increases the peptide basicity due to the conversion of lysine to homoarginine. While this does not affect the peptides with C-terminal arginine (except missed cleavage products), peptides with C-terminal lysine are expected to show fragmentation behavior similar to C-terminal arginine peptides. Guanidinylated peptides had higher y-ion occurrence percentage compared with b-ion up to 15 ions. Guanidinylation showed marginal improvement in b- and y-ion intensities. When coupled with higher y-ion occurrence percentage, strong y-ion appearance with few b-ions of high intensity can be expected in typical guanidinylated peptide spectra, Figure 3c. Dual labeling had a combinatorial effect where there is higher y-ion occurrence at the peptide N-terminal. However, b-ion occurrence percentage increased with the fragment length suggesting that acetylation tilted the balance of fragment ion appearance towards b-ion; b- and y-type ion representation was much more balanced for dual labeled peptides for the first 15 fragment ions, Figure 3d.

The fragment ion occurrence pattern for unlabeled and labeled peptides was further examined by comparing the ion distribution for peptides with lysine and arginine at the C-terminus. Unlabeled C-terminal arginine peptides showed the highest difference in b- and y-ion occurrence percentage, Figure 4a. A similar pattern is seen in guanidinylated peptides, Figure 4c, since guanidinylation does not affect peptides with arginine at the C-terminus. The acetylation-induced increase in b-ion occurrence is clearly evident, Figure 4b and d, and balances the b- and y-ion representation in the MS/MS spectra up to 12 fragment ions.

Figure 4

Comparison plot of variation in b- (blue) and y- (red) ion occurrence for peptides with C-terminal arginine (peptides with internal lysine and terminal peptide included). Unlabeled (eight unique peptides, 823 PSMs) (a), acetylated (11 unique peptides, 1257 PSMs) (b), guanidinylated (15 unique peptides, 1005 PSMs) (c), and dual labeled (17 unique peptides, 1774 PSMs) (d). Y-axis is the occurrence of ions (%) and the x-axis is fragment number. Dashed lines are neutral loss ions: y-NH3 (pink), y-H2O (yellow), b-H2O (purple), and b-NH3 (turquoise)

Similarly, for peptides with lysine at C-termini, y-ion occurrence was slightly higher for unlabeled peptides, Figure 5a; guanidinylation increased this difference, Figure 5c. On the other hand, acetylation skewed the balance of fragment ion occurrence towards b-ions, Figure 5b. Dual labeling resulted in equitable b- and y-ion occurrence without compromising the ion intensity, Figure 5d.

Figure 5

Comparison plot of variation in b- (blue) and y- (red) ion occurrence for peptides with C-terminal lysine (peptides with internal arginine are included). Unlabeled (12 unique peptides, 1862 PSMs) (a), acetylated (12 unique peptides, 545 PSMs) (b), guanidinylated (10 unique peptides, 988 PSMs) (c), and dual labeled (13 unique peptides, 935 PSMs) (d). Y-axis is the occurrence of ions (%) and the x-axis is fragment number. Dashed lines are neutral loss ions: y-NH3 (pink), y-H2O (yellow), b-H2O (purple), and b-NH3 (turquoise)

Overall, y-type ions were over-represented in the unlabeled peptides; the balance shifted to increased b-ion occurrence upon acetylation. Although guanidinylation had improved the balance of b- and y-ion occurrence, its impact on overall fragment ion intensity was less. It is the dual labeling that clearly improved the balance of b- and y-ion representation and also had a substantial increase in overall fragment ion intensities. This is also reflected in the increased accuracy of de novo amino acid annotation.

Dual Labeling of Six Protein Mix Improves the Accuracy of Protein Identification

It was clear from the de novo sequence analysis as well as the fragmentation analysis that dual labeling of lipase improved the efficiency of peptide fragmentation. Nonetheless, it is important to ascertain the universality of this procedure across different types of proteins. Bovine six protein mix is routinely used in mass spectrometry as a protein standard for evaluating the quality of nLC-ESI MS/MS procedures. Unlabeled and dual labeled six protein mix were subjected to nLC-ESI MS/MS followed by Sequest HT search. Under ideal conditions, a bovine six protein mix is expected to give six protein hits. However, six hitchhikers are routinely identified upon database search (ABRF 2013) [64]. We analyzed raw files of unlabeled and labeled six protein mix against the Uniprot and NCBInr databases to evaluate the accuracy and database dependency of protein identification (Supplementary Tables S8–S11). The sequence coverage obtained was slightly lower for the dual labeled sample compared with unlabeled proteins. This could be attributable to the peptide loss associated with multiple desalting and resuspension steps and also to changes in peptide solubility upon modification. More importantly, however, the number of unique peptides is critical for protein identification and, in fact, the total number of unique peptides in the digest increased from 48 to 53 upon dual labeling.

The number of protein identifications was dependent on the database against which the search was executed. The Uniprot database gave a higher number of protein identifications compared with the NCBInr database. Sequest HT search of unlabeled and dual labeled peptides of bovine six protein mix against bovine species in the Uniprot database gave 15 and 9 protein identifications, respectively, whereas 13 and 8 proteins were identified when the search was done using the NCBInr Bos taurus database (Table 3).

Table 3 Comparison of Proteins Identified from Data Analysis of Unlabeled and Dual Labeled Bovine Six Protein Mix Digest with NCBInr and Uniprot Databases

While all the component proteins of bovine six protein mix were identified for both unlabeled and dual labeled samples, the accuracy of protein identification improved upon dual labeling, as the number of proteins identified (9 and 8) is closer to the expected value (6), indicating a reduced number of false positives. Based on analysis of MS2 precursor ion intensities by Raw Meat ver. 2.1, a decrease in ion count and intensity was observed. Fragment ion intensities were higher for labeled peptides despite a marginal decrease in MS2 precursor ion intensities, as shown in Supplementary Figure S12. While higher MS2 precursor ion intensity certainly increases the chances of fragmentation, labeling clearly increases the relative ion intensity and occurrence, which are central to peptide/protein identification. The trade-off between MS2 precursor ion intensity and the increase in fragment ion relative intensity/occurrence is evident. By dual labeling of peptides, we were able to eliminate at least three to four common intrusive protein identifications. Optimization of peptide fragmentation by dual labeling has thus increased the accuracy of protein identification.

Fragment Ion Occurrence was Balanced upon Dual Labeling for Six Protein Mix

The reason for the improvement in protein inference accuracy was investigated further by checking the differences in fragment ion occurrence between dual labeled and unlabeled samples. As with lipase, the fragment ion profile of the bovine six protein mixture is largely dominated by b- and y-ions; the y-ion occurrence percentage was higher compared with b-ions, as is typical of ion trap CID (Figure 6a).

Figure 6

Comparison plot of variation in b- (blue) and y- (red) ion occurrence for peptides of bovine six protein mix digest: unlabeled (a) and dual labeled (b). Y-axis is the occurrence of ions (%) and the x-axis is fragment number. Dashed lines are neutral loss ions: y-NH3 (pink), y-H2O (yellow), b-H2O (purple), and b-NH3 (turquoise)

However, the equivalent occurrence of both b- and y-ions was much clearer in the dual labeled bovine six protein mix compared with lipase Figure 6b. Thus, dual labeling of bovine six protein mixture improved the fragment ion occurrence with an improved protein inference accuracy as the final outcome.

Earlier studies employing chemical labels for improving fragment ion occurrence/intensity largely focused on enhancing or suppressing the representation of a particular fragment ion; however, the study presented here focused on equivalent representation of both types of fragment ions in the MS/MS spectra. It would be interesting to investigate the effect of this dual modification method in other fragmentation modes like HCD, ETD etc.

Conclusions

The efficiency of acetylation, guanidinylation, and their combination as routine chemical tools in peptide fragmentation was investigated. Labeled and unlabeled peptide digests were subjected to nLC ESI-MS/MS followed by database search, de novo sequencing, and fragmentation analysis. From this analysis, it was clear that dual labeling was most efficient in improving the relative intensity and occurrence of the fragment ions in the spectra. This resulted in improved de novo sequence annotation accuracy and protein inference without compromising the sequence coverage. The high efficiency, safety, and ease of these protocols make dual labeling (acetylation and guanidinylation) of peptides an attractive validation tool for de novo annotation as well as database-dependent proteomic workflows.