1 Introduction

Structural mass spectrometry-based methodologies, many of which utilize chemical reagents for investigation, have become invaluable tools for evaluating protein structure and function [1]. Several strategies exist for this type of investigation, each yielding different structural information about the protein(s) under study. One such method is chemical crosslinking, which can reveal tertiary and quaternary information through both inter- and intramolecular covalent conjugation [2]. Hydrogen-deuterium exchange probes hydrogen bonding and solvent accessibility by monitoring the exchange of backbone amide hydrogen atoms, yielding secondary structural properties [3]. Another method, complementary to hydrogen-deuterium exchange, is protein footprinting. Here, chemical probes are used to label side chains, revealing side chain solvent accessibility [4].

Hydroxyl radical (·OH) based footprinting, first coupled with mass spectrometry by Chance and coworkers [5], is one of the most informative covalent labeling methods for a number of reasons. Hydroxyl radicals have properties similar to those of water and can freely oxidize solvent-exposed side chains. Additionally, their reactivity is well characterized, and researchers can capitalize on their low selectivity [6], increasing the amount of information obtained. Furthermore, there are multiple methods [1, 7, 8] available for generating ·OH, increasing the accessibility of this approach. Fast photochemical oxidation of proteins (FPOP), used for this work, generates ·OH through laser-induced photolysis of hydrogen peroxide [9]. This technique modifies proteins on a microsecond timescale [9], theoretically eliminating structural changes induced by labeling [10].

A consequence of the design features employed in FPOP experiments to eliminate radical-induced unfolding [9] is that oxidized species are present in lower abundance than their unoxidized counterparts. Therefore, the difficulty in detecting these species grows with increasing sample complexity. Investigating the structures of large, megadalton-sized molecular assemblies has often proven difficult. Although there are several methods for obtaining protein structures, the majority come from X-ray crystallography [11]. At the outset, it is often challenging to purify all of the protein components of megadalton complexes [12], a necessity in obtaining a structure. Even when this is accomplished, it can be equally difficult to crystallize these complexes, or the process may yield only crystals too small for analysis [11]. Although FPOP has the potential to begin filling this gap in information, it is first necessary to overcome the hurdle of identifying the relatively low-abundance oxidized species in a sea of higher-abundance peptides. A major obstacle is the use of data-dependent acquisition (DDA) for MS/MS analysis. In this method, precursor ions are selected for fragmentation based on their signal intensities. Often, if chromatographic separation is not sufficient, peptides with higher abundance are identified whereas lower-abundance peptides are not. A more effective chromatographic separation could therefore increase the number of peptide identifications.

Multidimensional protein identification technology (MudPIT) is a method used to overcome the inability of single-dimensional separations to resolve complex biological samples [13]. The use of a biphasic analytical column increases the peak capacity of the column and allows for online two-dimensional separations [13, 14]. The coupling of FPOP labeling with MudPIT could provide an increase in identifications of oxidized peptides in complex systems. The use of this method to identify oxidatively modified peptides has been previously reported [15]. However, that study focused mainly on comparing informatics methods rather than on using MudPIT to identify more oxidatively modified peptides in highly complex samples. Additionally, the researchers used a low complexity sample with a “mini-MudPIT” method consisting of only three salt steps. In this paper, we describe the combination of a full MudPIT method with FPOP on a highly complex sample, Saccharomyces cerevisiae yeast cell lysate. Our objective is to improve the detection of FPOP-labeled species and expand the application of FPOP to more complex systems.

2 Materials and Methods

All chemicals were obtained from Thermo Fisher Scientific (Waltham, MA, USA) unless otherwise noted.

2.1 Oxidative Labeling

Each 100 μL sample contained 10 mM phosphate buffered saline (PBS; Sigma Aldrich, St. Louis, MO, USA), 10 mM ʟ-glutamine, 7.5 mM hydrogen peroxide, and yeast cell lysate (a gift from Dr. Amber Mosley and Whitney Smith-Kinnaman, Department of Biochemistry, Indiana University School of Medicine, Indianapolis, IN) at a concentration of 0.18 mg/mL. The hydrogen peroxide was added just prior to infusion. FPOP was performed as previously described [9, 10]. A 248 nm KrF excimer laser (GAM Laser Inc., Orlando, FL, USA) was used to irradiate the sample solution at 135 mJ/pulse. The laser was focused through a 250 mm plano-convex lens (Thorlabs, Inc., Newton, NJ, USA) onto 150 μm i.d. fused silica tubing (Polymicro Technologies, Phoenix, AZ, USA) with the polyimide coating removed, giving a 2.5 mm irradiation window. The flow rate, 33 μL/min, was set to allow for a 20% exclusion fraction. A total of four FPOP samples and three controls (no irradiation) were prepared.
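The relationship between flow rate, irradiation window, and exclusion fraction can be illustrated with a short calculation. The sketch below is not part of the reported method: it assumes that the exclusion fraction is the unirradiated share of the solution passing the window between consecutive pulses, and it back-calculates the laser pulse frequency that the stated values would imply (the frequency itself is not reported in this section). Function names are illustrative.

```python
import math


def irradiated_volume_nl(inner_diameter_um: float, window_mm: float) -> float:
    """Volume of the capillary segment lit by one laser pulse, in nanolitres."""
    radius_mm = inner_diameter_um / 1000.0 / 2.0
    volume_mm3 = math.pi * radius_mm ** 2 * window_mm  # 1 mm^3 equals 1 uL
    return volume_mm3 * 1000.0  # uL -> nL


def implied_pulse_frequency_hz(flow_ul_min: float, inner_diameter_um: float,
                               window_mm: float, exclusion_fraction: float) -> float:
    """Pulse frequency implied by the flow rate and exclusion fraction.

    Assumption (not stated in the text): between two pulses the flow advances
    by one irradiated plug plus an unirradiated gap, and the gap makes up
    `exclusion_fraction` of that total volume.
    """
    plug_nl = irradiated_volume_nl(inner_diameter_um, window_mm)
    volume_per_pulse_nl = plug_nl / (1.0 - exclusion_fraction)
    flow_nl_per_s = flow_ul_min * 1000.0 / 60.0
    return flow_nl_per_s / volume_per_pulse_nl


# Values taken from the text: 150 um i.d. tubing, 2.5 mm window,
# 33 uL/min flow rate, 20% exclusion fraction.
plug = irradiated_volume_nl(150.0, 2.5)
freq = implied_pulse_frequency_hz(33.0, 150.0, 2.5, 0.20)
print(f"irradiated plug: {plug:.1f} nL, implied pulse frequency: {freq:.1f} Hz")
```

Under these assumptions the irradiated plug is about 44 nL and the implied pulse frequency is close to 10 Hz.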

2.2 Proteolysis

Post-FPOP, the yeast lysate samples were subjected to a two-step digestion process as previously described [16]. Each sample was acetone precipitated [17] and resuspended in 8 M urea, 150 mM Tris-HCl buffer (pH 8.5). Proteins were reduced with 10 mM tris(2-carboxyethyl)phosphine (TCEP) for 30 min at room temperature (RT). They were then alkylated with 20 mM iodoacetamide for 30 min at RT with a foil cover to protect the sample from light. The alkylation reaction was quenched with 10 mM dithiothreitol (DTT) for 15 min at RT. Lys-C was added at a 100:1 substrate-to-protease ratio and incubated overnight at 37°C. The samples were then diluted with 150 mM Tris buffer to bring the urea concentration to 2 M. Trypsin was added at a 50:1 substrate-to-protease ratio and incubated overnight at 37°C. Digestion was quenched with formic acid (Sigma Aldrich) at a final concentration of 5%.
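The dilution and protease amounts in this protocol follow from simple ratios, sketched below. Several inputs are assumptions rather than reported values: the 50 μL resuspension volume is a hypothetical placeholder, the substrate-to-protease ratios are taken to be weight-to-weight, and protein recovery after acetone precipitation is taken to be complete.

```python
def diluent_volume_ul(sample_volume_ul: float, start_molar: float,
                      target_molar: float) -> float:
    """Volume of urea-free buffer needed to dilute from start to target molarity."""
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and not exceed the start")
    return sample_volume_ul * (start_molar / target_molar - 1.0)


def protease_mass_ug(substrate_mass_ug: float, substrate_to_protease: float) -> float:
    """Protease mass for a given substrate:protease ratio (assumed w/w)."""
    return substrate_mass_ug / substrate_to_protease


# 100 uL of lysate at 0.18 mg/mL (from the text) corresponds to 18 ug of protein,
# assuming full recovery after precipitation.
protein_ug = 100.0 * 0.18

# Hypothetical resuspension volume; the text does not report it.
resuspension_ul = 50.0

print(diluent_volume_ul(resuspension_ul, 8.0, 2.0), "uL Tris buffer (8 M -> 2 M urea)")
print(protease_mass_ug(protein_ug, 100.0), "ug Lys-C at 100:1")
print(protease_mass_ug(protein_ug, 50.0), "ug trypsin at 50:1")
```

Going from 8 M to 2 M urea is a four-fold dilution, so three volumes of buffer are added per volume of sample regardless of the actual resuspension volume.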

2.3 LC-MS

Analysis was completed using an UltiMate 3000 RSLC and a Q Exactive mass spectrometer (Thermo Fisher Scientific). For each experiment, 1 μg of the digest was loaded onto a 2 cm Acclaim PepMap 100 C18 trap column (Thermo Fisher Scientific). MS1 spectra were acquired over an m/z range of 350–2000 at a resolving power of 70,000. The 25 most abundant ions were selected for MS2 at a resolving power of 17,500. Ions with a charge state of +1 or greater than +8 were rejected.
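The precursor selection rule described above (top 25 by abundance, charge states +1 and above +8 rejected) can be sketched as a simple filter-and-sort. This is only an illustration of the selection logic, not the instrument's acquisition software: the `Precursor` type, the function name, and the example ions are hypothetical, and dynamic exclusion is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Precursor:
    mz: float
    charge: int
    intensity: float


def select_precursors(survey: List[Precursor], top_n: int = 25,
                      min_charge: int = 2, max_charge: int = 8,
                      intensity_threshold: float = 0.0) -> List[Precursor]:
    """Return up to top_n eligible ions, most intense first."""
    eligible = [
        p for p in survey
        if min_charge <= p.charge <= max_charge
        and p.intensity >= intensity_threshold
    ]
    return sorted(eligible, key=lambda p: p.intensity, reverse=True)[:top_n]


# Hypothetical survey scan: the +1 and +9 ions are rejected even though
# they are the most intense.
ions = [
    Precursor(400.2, 1, 9e6),
    Precursor(512.3, 2, 5e6),
    Precursor(623.8, 3, 7e6),
    Precursor(350.1, 9, 8e6),
    Precursor(700.4, 2, 1e4),
]
picked = select_precursors(ions, top_n=2)
print([p.mz for p in picked])
```

Because selection is ranked by intensity, a low-abundance oxidized peptide that co-elutes with many abundant peptides may never reach the top of this list, which is the limitation that better chromatographic separation is meant to relieve.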

2.4 One-Dimensional LC-MS

Samples were loaded onto a 100 μm × 2 cm Acclaim PepMap100 C18 nano trap column (5 μm, 100 Å) (Thermo Scientific, Waltham, MA, USA) and washed for 10 min with loading buffer (LB; 2% acetonitrile, 0.1% formic acid) at a flow rate of 5 μL/min. The samples were separated on a 75 μm i.d. reverse phase (RP) analytical column packed in-house with a 30 cm bed of Magic 5 μm C18 particles (Michrom Bioresources Inc., Auburn, CA, USA) with a 67 min linear gradient to 40% acetonitrile, 0.1% formic acid at a flow rate of 300 nL/min. The total run time was 105 min including loading, washing, and equilibration. AGC targets were set to 3e6 for MS1 and 1e5 for data-dependent MS2 with an underfill ratio of 1.0%, giving an intensity threshold of 2.0e4.
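The link between the MS2 AGC target, the underfill ratio, and the resulting intensity threshold can be made explicit. The sketch below assumes the commonly described Q Exactive relationship, threshold = (AGC target × underfill ratio) / maximum injection time, and uses it to back-calculate the maximum injection time implied by the reported values. The injection time is not stated in this section, so the result is an inference under that assumption.

```python
def implied_max_injection_time_ms(agc_target: float, underfill_ratio: float,
                                  intensity_threshold: float) -> float:
    """Maximum injection time (ms) implied by the reported settings.

    Assumes: intensity_threshold = agc_target * underfill_ratio / max_injection_time,
    with the threshold expressed per second.
    """
    if intensity_threshold <= 0:
        raise ValueError("intensity threshold must be positive")
    minimum_ions = agc_target * underfill_ratio
    return minimum_ions / intensity_threshold * 1000.0


# Reported 1D-DDA settings: MS2 AGC target 1e5, underfill ratio 1.0%,
# intensity threshold 2.0e4.
max_it = implied_max_injection_time_ms(1e5, 0.01, 2.0e4)
print(f"implied maximum injection time: {max_it:.0f} ms")
```

With the reported values, the implied maximum injection time is about 50 ms.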

2.5 MudPIT LC-MS

Fully automated analysis was completed in a manner similar to that previously described [18, 19]. Each sample was loaded onto a trap column and washed for 10 min with LB at a flow rate of 5 μL/min. Samples were separated on a 75 μm i.d. RP analytical column packed in-house with a 26 cm bed of Magic 5 μm C18 particles (Michrom Bioresources Inc.) followed by a 4 cm bed of Luna strong cation exchange (SCX) resin (Phenomenex, Torrance, CA, USA). Peptide fractions were displaced from the SCX resin to the RP resin using the following salt pulses: (1) 0%, (2) 5%, (3) 10%, (4) 15%, (5) 20%, (6) 30%, (7) 40%, (8) 50%, (9) 60%, and (10) 80% of SCX buffer (SCXB; 500 mM ammonium acetate (Sigma Aldrich) in 5% acetonitrile and 0.1% formic acid) mixed with LB by the loading pump mixer. The 0% fraction was used to displace the sample from the trap column to the analytical column. Each subsequent salt pulse was generated by increasing the SCXB percentage to the next concentration with a loading pump gradient during the previous salt step. A 2.6 μL aliquot of salt (roughly 15× the SCX bed volume) was collected by coupling a 30 cm, 75 μm i.d. NanoViper line (Thermo Fisher Scientific) to the trap column with a stainless steel union, and was delivered when the switching valve position was changed. Each salt pulse was pushed over the analytical column by the gradient pump for 20 min at a flow rate of 300 nL/min. Sample fractions were separated with a 67 min linear gradient to 40% acetonitrile, 0.1% formic acid at a flow rate of 300 nL/min. The total run time for each fraction was 105 min including loading, washing, and equilibration time. AGC targets were set to 1e6 for MS1 and 5e4 for data-dependent MS2 with an underfill ratio of 1.0%, giving an intensity threshold of 1.0e4.
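The salt-step program can be tabulated to show the ammonium acetate concentration of each pulse and the total acquisition time per sample. This sketch assumes that the SCXB percentage translates linearly into ammonium acetate concentration (LB, as described, contains none) and that the ten fractions are run back to back with no additional overhead. Variable names are illustrative.

```python
SALT_STEPS_PERCENT = [0, 5, 10, 15, 20, 30, 40, 50, 60, 80]  # from the text
SCXB_AMMONIUM_ACETATE_MM = 500.0  # stock concentration in SCXB
RUN_TIME_PER_FRACTION_MIN = 105   # loading, washing, gradient, equilibration


def salt_schedule(steps_percent, stock_mm):
    """Return (step number, % SCXB, mM ammonium acetate) for each salt pulse."""
    return [
        (index, percent, stock_mm * percent / 100.0)
        for index, percent in enumerate(steps_percent, start=1)
    ]


schedule = salt_schedule(SALT_STEPS_PERCENT, SCXB_AMMONIUM_ACETATE_MM)
for step, percent, conc_mm in schedule:
    print(f"step {step:2d}: {percent:3d}% SCXB = {conc_mm:5.1f} mM ammonium acetate")

total_minutes = len(SALT_STEPS_PERCENT) * RUN_TIME_PER_FRACTION_MIN
print(f"total acquisition time per sample: {total_minutes} min "
      f"({total_minutes / 60:.1f} h)")
```

Ten fractions at 105 min each amount to 1050 min (17.5 h) of instrument time per sample, compared with 105 min for a single 1D run.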

2.6 Analysis of MS/MS Data

All data files were searched using Proteome Discoverer version 1.4 (Thermo Fisher Scientific) with Sequest HT and Mascot version 2.4 (Matrix Science Ltd., London, UK) against a Saccharomyces cerevisiae FASTA database (strain ATCC 204508/S288c, downloaded from UniProt in February 2014), and extracted ion chromatogram (EIC) areas for each peptide spectrum match (PSM) were calculated using a custom multi-level workflow. Peptides were ungrouped and filtered to a 1% false discovery rate (FDR). Only PSMs identified as selected or unambiguous were used for analysis. The data were exported to Excel and summarized using the PowerPivot add-in. The fractional oxidation per residue on a given sequence was determined according to Equation 1:

$$ \text{Fractional oxidation} = \frac{\sum \text{EIC area}_{\text{modified}}}{\sum \text{EIC area}} \tag{1} $$

where EIC area modified is the EIC area of a specific modified residue, and EIC area is the EIC area of all PSMs with a peptide sequence identical to that containing the modification.
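Equation 1 can be expressed as a short aggregation over PSMs. The sketch below is an illustration only and does not reproduce the Excel/PowerPivot workflow used in this work: the input layout (peptide sequence, modified residue or `None`, EIC area), the function name, and the example peptide and areas are hypothetical.

```python
from collections import defaultdict


def fractional_oxidation(psms):
    """Fractional oxidation per (peptide sequence, modified residue).

    psms: iterable of (sequence, modified_residue_or_None, eic_area).
    The numerator sums EIC areas of PSMs carrying a given modified residue;
    the denominator sums EIC areas of all PSMs with the same sequence,
    modified or not.
    """
    total_area = defaultdict(float)
    modified_area = defaultdict(float)
    for sequence, residue, area in psms:
        total_area[sequence] += area
        if residue is not None:
            modified_area[(sequence, residue)] += area
    return {
        key: area / total_area[key[0]]
        for key, area in modified_area.items()
        if total_area[key[0]] > 0
    }


# Hypothetical PSMs for one peptide: two unmodified, one oxidized at M3.
example_psms = [
    ("AGMLEK", None, 6.0e6),
    ("AGMLEK", None, 3.0e6),
    ("AGMLEK", "M3", 1.0e6),
]
result = fractional_oxidation(example_psms)
print(result)
```

In this example the oxidized form accounts for 1.0e6 of a total 1.0e7 in EIC area, giving a fractional oxidation of 0.1 for residue M3.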

3 Results and Discussion

3.1 Method Comparison

In order to make the most direct comparison between one-dimensional chromatography with data-dependent analysis (1D-DDA) and MudPIT, all processes and parameters were kept identical whenever possible. However, some parameters were altered for the analysis. First, we doubled the typical FPOP sample size to ensure there was an adequate amount to analyze each sample by both methods. In addition, the total analytical column length was kept at 30 cm and the gradient for each sample or step was constant over the entire experiment.

For comparison of the two methods, the sample loading procedure for MudPIT had to be altered. In MudPIT analysis, samples are often pressure loaded or directly injected onto the analytical column [13]. In this experiment, the samples were loaded onto a trap column via the autosampler. There were several advantages to loading the samples in this manner. First, sample washing was identical for both methods. Any hydrophilic peptides that may have been washed off of the trap column should be the same over both methods. Directly loading the sample onto a three phase analytical column could have created a bias between the two methods. Second, trap column loading allowed both method analyses to be completed continuously, whereas pressure loading would require the MudPIT analysis to be completed discontinuously. Other parameters that were altered were automatic gain control (AGC) targets, for both MS1 and MS2 (see section 2), and dynamic exclusion times. These parameters were optimized for each method to provide peak performance.

3.2 Increases in Identifications by MudPIT

In agreement with previously reported results [13, 14], using the MudPIT method to analyze the labeled yeast lysate samples gave a substantial increase in peptide spectrum matches (Figure 1). Comparing the two methods at the protein level, a 1.7-fold increase in protein group identifications (IDs), including 820 unique proteins, was observed with MudPIT (Figure 1a). At the peptide level, a 1.3-fold increase in IDs with MudPIT was observed. Comparing unique peptides, MudPIT gave a 1.7-fold increase in IDs, with almost 1700 more unique peptides observed than with 1D-DDA (Figure 1b). Although significant increases were observed with MudPIT on the protein and peptide levels, the true value of using the method in conjunction with protein footprinting is appreciated when looking at oxidatively modified peptides (Figure 1c). Here, a 2.7-fold increase in oxidized peptide IDs was observed with MudPIT. Even more significant, MudPIT gave a 4.6-fold increase in IDs of unique oxidized peptides over 1D-DDA. This demonstrates the efficacy of coupling MudPIT with FPOP. The higher sequence coverage of oxidatively modified peptides will provide a more complete description of the protein system.

Figure 1

Visual comparison of IDs between MudPIT (red) and 1D-DDA (blue) methods by proteins (a), peptides (b), and oxidatively modified peptides (c)

To further evaluate the increased IDs achieved with the MudPIT method, we compared the identification of oxidatively modified residues on pyruvate kinase 1 (PK1, PDB ID: 1A3W [20]) and phosphoglycerate kinase (PGK1, PDB ID: 3PGK [21]). These proteins were chosen as representatives because both had high coverage with each method (greater than 75%) and each had oxidatively modified peptides identified by the search workflow. For both of these proteins, modifications were only included if they were identified more than once in the samples (PSM ≥2) and if the quantifiable oxidation levels were greater than the mean standard error. Table 1 shows residues that were identified by each method. For PK1, 1D-DDA identified only six of the 14 oxidatively modified residues identified by MudPIT. For PGK1, the fourth most abundant protein found in S. cerevisiae [22], 1D-DDA identified only 16 of the 41 residues that MudPIT identified. Since PGK1 is very abundant in yeast lysate, it can be assumed that this protein is oxidized more frequently than lower-abundance proteins in the lysate. Consequently, the oxidized peptides from PGK1 may be relatively abundant. Despite that, 1D-DDA identified fewer than half as many oxidized residues as MudPIT.

Table 1 Identified Oxidatively Modified Residues

3.3 Properties of Peptides Identified by MudPIT

Wolters et al. [13] demonstrated that MudPIT has a high dynamic range with the ability to ID low-abundance peptides. To determine whether the IDs from MudPIT are of lower abundance than those from 1D-DDA, the intensities of the identified peptides were analyzed. Figure 2a compares the intensity of peptides identified from MudPIT and 1D-DDA. The average intensity of the peptides identified by both methods is similar. However, the minimum intensity of peptides identified by MudPIT is lower than for 1D-DDA. A histogram of the frequency of peptide identifications at varying intensities further demonstrates this (Figure 2b). Since the MudPIT method has more overall identifications, the histogram has been normalized to show the percent of total peptides. For both MudPIT and 1D-DDA, the largest number of identifications came from peptides with intensities in the 1.00E+06 bin (1E+06–9E+06), followed by the 1.00E+05 (1E+05–9E+05) and 1.00E+07 (1E+07–9E+07) bins. In the lowest intensity bin, 1.00E+04 (1E+04–9E+04), MudPIT facilitated detection of more than three times as many peptides: 90 (1%) and 25 (0.4%) for MudPIT and 1D-DDA, respectively. This increase in lower intensity identifications is even more significant for oxidized peptides, where 55 and 10 oxidatively modified peptides were identified by MudPIT and 1D-DDA, respectively.
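The normalized histogram described above bins peptides by order of magnitude of intensity and reports each bin as a percentage of all identifications, so that methods with different total numbers of IDs can be compared. A minimal sketch of that binning follows; the function name and the example intensities are hypothetical, and this is not the plotting procedure used for Figure 2b.

```python
import math
from collections import Counter


def decade_histogram(intensities):
    """Percent of identifications per order-of-magnitude intensity bin.

    An intensity of 2.5e4 falls in the bin labelled '1E+04', and so on.
    Non-positive intensities are ignored.
    """
    counts = Counter(math.floor(math.log10(x)) for x in intensities if x > 0)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        f"1E{decade:+03d}": 100.0 * n / total
        for decade, n in sorted(counts.items())
    }


# Hypothetical peptide intensities, for illustration only.
histogram = decade_histogram([2.5e4, 3.0e6, 4.0e6, 5.0e6])
print(histogram)
```

Expressing each bin as a percentage rather than a raw count is what allows the 1.00E+04 bins of the two methods to be compared directly.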

Figure 2

Distribution of the intensities of PSMs identified by MudPIT (red) and 1D-DDA (blue). (a) The spread of intensities is shown in a box-and-whisker plot, with the box lines marking the upper quartile, median, and lower quartile, and the whiskers marking the complete range. (b) The frequency distribution of intensities is displayed in a histogram

Comparing MudPIT identifications to yeast lysate protein abundance further demonstrates that the method can aid in identifying low-abundance proteins. As mentioned previously, PGK1 is highly abundant in S. cerevisiae, with an estimated abundance of 21,000 parts per million (ppm) [22]. The sequence coverage for this protein is 75%. The ATP-dependent transporter protein YER036C is also identified by MudPIT, with 25% sequence coverage. This protein has an abundance of 743 ppm, 29-fold lower than PGK1, indicating the dynamic range of the MudPIT method.

3.4 MudPIT as a Method for Megadalton Protein Complexes

A major obstacle in oxidative labeling experiments is obtaining residue-level oxidation data on large, macromolecular protein complexes. Given that the surface-area-to-volume ratio decreases as a particle increases in size, it stands to reason that the proportion of oxidized species present when analyzing a MDa-sized complex would also decrease, making modifications even more difficult to detect.

To demonstrate the power of using MudPIT analysis in oxidative footprinting experiments, residue level oxidation was calculated on a yeast 80S ribosome, which has a published structure (PDB ID 4V6I) [23]. Ribosomes are cellular organelles, consisting of both protein and RNA, involved in protein assembly. The protein component of the structure is assembled in two subunits, 40S and 60S, and contains a total of 70 known proteins. We identified 52 of the 70 proteins in the MudPIT samples, with sequence coverage values ranging from 5% to 80% (data not shown). A total of 86 residues were identified as oxidized and mapped to the crystal structure for a visual representation (Figure 3). Since RNase was not added to the sample at any time to remove the RNA, the structure is presented with the RNA present. The mapping of oxidized residues onto a surface representation of the crystal structure demonstrates that many solvent-accessible residues are oxidized.

Figure 3

Two perspectives of the structural location of MudPIT-determined FPOP oxidation levels mapped to a yeast 80S ribosomal crystal structure, 4V6I [23]. Oxidation levels are colored from blue (lowest) to red (highest)

To further investigate the correlation between residue oxidation and solvent accessibility, the extent of oxidation of residues identified by MudPIT was compared with solvent-accessible surface area (SASA) calculations. Since FPOP was performed on yeast lysate, where various proteins could be interacting, we had to consider certain variables prior to the comparison. While a binary interactome of yeast has been published [24], it is unlikely that every interaction with this complex has been documented. With this in mind, it seemed unlikely that a comparison of SASA to oxidation over the complete complex would yield a reliable assessment of the method. As a consequence, we chose to do this comparison on a single protein within the complex. The SASA was determined on an asymmetric unit of the 40S ribosomal subunit (PDB ID: 3IZB). A plot of the extent of oxidation against residue SASA demonstrates a good correlation between the two parameters (Figure 4). The data fit a linear model with an R² of 0.7. There is a possibility that protein–protein interactions are occurring that are not taken into account in the SASA calculations, which could explain why the R² value is not higher.
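The comparison of oxidation extent with SASA amounts to an ordinary least-squares line and its coefficient of determination. The sketch below shows how such an R² could be computed; it is not the authors' analysis, and the numbers in the example are made-up placeholders rather than values from this study.

```python
def linear_fit_r2(x, y):
    """Least-squares slope, intercept, and R^2 for y as a linear function of x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Placeholder values only: per-residue SASA (arbitrary units) and
# fractional oxidation. These are NOT data from this study.
sasa = [10.0, 40.0, 55.0, 80.0, 120.0]
oxidation = [0.01, 0.03, 0.02, 0.06, 0.07]
slope, intercept, r_squared = linear_fit_r2(sasa, oxidation)
print(f"slope={slope:.5f}, intercept={intercept:.4f}, R^2={r_squared:.2f}")
```

For a simple linear fit with an intercept, R² equals the squared Pearson correlation coefficient, which is the form used here. Residues buried by unmodeled protein–protein interactions would appear as points with high calculated SASA but low oxidation and would lower R².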

Figure 4

Extent of oxidation on MudPIT-identified residues versus the calculated SASA for ASC1, chain a of the 3IZB portion of the complete ribosomal structure, illustrating the linear relationship between them

4 Discussion

An advantage of using protein footprinting coupled with mass spectrometry for protein structural analysis is the ability to study large protein complexes. Analysis of these complexes with other structural tools, such as X-ray crystallography and NMR, is often hampered. Although MS is capable of analyzing complex protein systems, the nature of data-dependent acquisition limits the number of identifications achieved. Since DDA focuses on the most abundant peptides at a given time, it is often difficult to identify low-abundance peptides with one-dimensional chromatography. This poses a challenge for oxidative labeling, where it is advantageous to limit the levels of oxidation; thus, many oxidized peptides are of low abundance. Therefore, the ability to carry out oxidative labeling on large protein complexes hinges upon the capability to identify low-abundance peptides.

The application of two-dimensional MudPIT chromatography to an oxidatively modified yeast lysate sample increased the number of identified proteins and peptides over one-dimensional chromatography. Yeast lysate contains thousands of proteins and is representative of a complex system. The increase in identification is most significant for oxidatively modified peptides, where an almost 3-fold increase in identifications is observed (Figure 1c). The greater number of identifications for oxidized peptides provides more detailed information on the proteins being analyzed. When investigating individual proteins, the benefit of MudPIT is further revealed. For pyruvate kinase 1 and phosphoglycerate kinase, MudPIT identifies 2.3- and 2.6-fold more oxidatively modified residues than 1D-DDA, respectively (14 versus six and 41 versus 16 residues). There were, however, peptides identified by 1D-DDA that were not observed with the MudPIT method. To gain as complete a coverage as possible, it may be necessary to perform 1D-DDA and MudPIT in tandem.

Examining the intensity of peptides identified by MudPIT indicates this method is detecting lower abundant proteins. However, intensity alone does not account for the increased number of peptides identified by MudPIT. Another factor that may influence the number of IDs is ionization efficiency. Co-elution of peptides that compete for efficient ionization could lead to suppression of some peptides by higher abundant peptides. These suppressed peptides may be of lower abundance than their co-elution partners but are not low enough to be in the 1.00E + 04 intensity range. Two-dimensional chromatography could lead to better separation and reduction in co-elution and ionization suppression.

The identification of 52 of the 70 proteins in the ribosome complex demonstrates that coupling FPOP with MudPIT would be effective for studying large complexes in lysates. However, this approach could likely be improved by further enriching the protein complex with methods such as tandem affinity purification. Comparing extent of oxidation of residues to SASA calculations established a good correlation between the data. Since MudPIT analysis occurs over a longer time-scale than one-dimensional chromatography, there is an opportunity for spurious oxidation. Correlation of oxidative modification levels with solvent accessibility demonstrates that the sample is not adversely affected by the long MudPIT analysis.

The ability to obtain greater sequence coverage for oxidatively modified peptides increases the efficacy of FPOP for megadalton complexes. In order to obtain structural information on proteins using oxidative labeling, it is imperative to have good sequence coverage of the oxidatively modified peptides. The data presented here demonstrate that MudPIT can provide this increased sequence coverage.

5 Conclusions

A hallmark of scientific progress is the unceasing march of new technological frontiers and solutions. This holds true in the field of structural mass spectrometry. In order for the use and application of hydroxyl radical-mediated covalent labeling to continue to expand, we must look for new approaches in analysis. In this work, we have demonstrated the use of MudPIT in conjunction with FPOP as a means for increased detection of modified species, and expansion of protein footprinting for complex systems.