Introduction

Protein glycosylation involves the covalent attachment of glycans, typically to serine, threonine, or asparagine residues of a protein [1]. These modifications play key roles in regulating biological activities of proteins [2–4]. A common phenomenon associated with glycoproteins is glycan microheterogeneity (i.e., a range of related oligosaccharides can modify the same glycosylation site) leading to different glycosylated forms of the same protein. Mounting evidence suggests that alterations in glycosylation patterns are associated with different physiological and disease states [5–8]. Therefore, it is essential to establish the relationship between a glycosylation site and its corresponding set of attached glycans.

In the last two decades, a range of analytical strategies have been developed to characterize glycosylated species [9]. Mass spectrometry (MS) is one of the most powerful techniques for glycomic and glycoproteomic analysis [10, 11]. However, glycan microheterogeneity and the minute quantities of glycosylated species that are available impose serious challenges for analysis [9]. Additionally, glycosylated peptides often ionize poorly relative to nonglycosylated peptides [12]. In order to characterize a glycoproteome in detail, an MS workflow typically requires an efficient enrichment platform, a high-throughput MS analysis, and algorithms for targeted glycoproteomics [13, 14]. To this end, a number of MS workflows have been established for confident discovery and identification of complex mixtures of glycopeptides [13–19]. One of the most widely used enrichment platforms is lectin affinity chromatography (LAC) [20–22]. During LAC enrichment, glycopeptides are retained through binding interactions with the immobilized lectins, while nonglycosylated peptides and other unwanted species are washed off. Upon release, the enriched mixtures are often suitable for MS analysis.

Another challenge in MS analyses of glycosylated species arises from the extraordinary structural diversity associated with glycans. Glycan isomers vary in monosaccharide units, positions of linkage, and anomeric configurations. Fragmentation techniques such as collision induced dissociation (CID) [23–26] are used to generate diagnostic fragment ions that can be used to define a glycan structure. Recently, ion mobility spectrometry (IMS)-MS [27–31], which incorporates a gas-phase separation technique prior to MS, has emerged as a promising tool for characterizing glycosylated species [32–38]. The mobility of an ion through a buffer gas depends on the shape of the analyte ion; therefore, isomeric glycans that are indistinguishable in MS because of their identical masses can potentially be distinguished in IMS based on differences in their mobilities [33–36]. Because glycopeptides retain information about both the glycans and the associated glycosylation sites, it is important to characterize these species in addition to the glycans released from a given protein. However, this is a challenging task because of the microheterogeneity of glycopeptides, as well as the complexity associated with possible structural and positional isomers. IMS distinctions between isomeric glycopeptides have only been made recently [37, 38].

In the work presented here, we describe a relatively simple workflow involving hybrid techniques for the enrichment of glycopeptides and determination of their site heterogeneity and structural diversity using a model glycoprotein, chicken ovomucoid. Ovomucoid is a ~23 kDa, highly glycosylated protein, the attached glycans of which account for ~30% of its molecular weight. It has five N-glycosylation sites at Asn 34, 77, 93, 99, and 199 [39, 40], and is glycosylated with hybrid and high-mannose glycans [41–47]. A simplified LAC platform [48–50] involving immobilized wheat germ agglutinin (WGA), a lectin specific to N-acetylglucosamine (GlcNAc) [51–53], is used to enrich the glycopeptides. Typically, the association constant between a glycan and a lectin is in the range of 10²–10⁵ M⁻¹. This weak and readily reversible binding interaction allows LAC to be used without establishing a series of binding, washing, and eluting procedures such as those associated with traditional affinity chromatography, thereby simplifying the LAC platform [49, 50]. The enriched glycopeptides were profiled using a high-resolution Orbitrap mass spectrometer as well as IMS-MS. IMS-MS provides direct observation of a high degree of glycan site heterogeneity, enabling confident assignment of the glycopeptides. A total of 117 glycopeptides having 27 glycan forms are identified for this glycoprotein. The IMS profiles of a series of glycopeptides varying only in the attached glycans allow evaluation of the structural diversity of glycan moieties associated with each peptide. The inclusion of an IMS-CID-MS [54–57] platform in this workflow provides a means of distinguishing and determining isomeric glycans from this glycoprotein.

Experimental

Materials

Toyopearl AF-formyl-650M resin was from Tosoh Bioscience (King of Prussia, PA, USA). Wheat germ agglutinin (WGA) was purchased from Vector Laboratories (Burlingame, CA, USA). Stainless steel columns (2 × 250 mm) and the Gemini C18 column were obtained from Phenomenex (Torrance, CA, USA). Ovomucoid (type III-O, free of ovoinhibitor), trypsin (TPCK treated), PNGase F, chloroform, NaOH beads (97% purity), iodomethane, 2-mercaptoethanol, dithiothreitol, iodoacetamide, trizma base, and sodium cyanoborohydride (NaCNBH3) were all purchased from Sigma-Aldrich (St. Louis, MO, USA). Water and acetonitrile (ACN) were obtained from EMD Chemicals (Darmstadt, Germany). Formic acid was from Alfa Aesar (Ward Hill, MA, USA). Trifluoroacetic acid (TFA), dimethylformamide (DMF), urea, ammonium bicarbonate, sodium chloride, magnesium chloride, and calcium chloride were all from Mallinckrodt Pharmaceuticals (St. Louis, MO, USA). C18 Zip-tips were from Agilent Technologies (Santa Clara, CA, USA). C18 Sep-Pak cartridges were from Waters (Milford, MA, USA). Empty spin columns were purchased from Harvard Apparatus (Holliston, MA, USA).

Tryptic Digestion

A solution mixture containing 2 mg of ovomucoid, 100 μL of 8 M urea, and 5 μL of 20 mM dithiothreitol was incubated at 56°C for 1 h. After the solution was cooled down to room temperature, 5 μL of 84 mM iodoacetamide was added prior to incubation in the dark for 45 min. The reaction mixture was diluted with 290 μL of 100 mM ammonium bicarbonate followed by the addition of trypsin at an enzyme:protein ratio of 1:50 (w:w) and incubated at 37°C for 14 h. The digest was desalted using a C18 Sep-Pak cartridge and dried under vacuum.
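The volumes in the digestion protocol above imply a final urea concentration and a trypsin amount that are not stated explicitly. The following minimal sketch works through that arithmetic. It assumes that the liquid volumes are additive and that the volume contributed by the solid protein and by the trypsin is negligible; the function and variable names are illustrative only.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Liquid volumes (in microliters) taken from the protocol text.
volumes_ul = {
    "urea_8M": 100,
    "dithiothreitol_20mM": 5,
    "iodoacetamide_84mM": 5,
    "ammonium_bicarbonate_100mM": 290,
}

# Assumption: volumes are additive; solid protein and trypsin volume ignored.
total_ul = sum(volumes_ul.values())

# Urea concentration after dilution with ammonium bicarbonate.
urea_final_M = final_concentration(8.0, volumes_ul["urea_8M"], total_ul)

# Trypsin mass for a 1:50 (w:w) enzyme:protein ratio with 2 mg of protein.
protein_ug = 2000.0
trypsin_ug = protein_ug / 50

print(f"total volume: {total_ul} uL")
print(f"final urea:   {urea_final_M} M")
print(f"trypsin:      {trypsin_ug} ug")
```

Under these assumptions, the dilution step brings the urea from 8 M down to 2 M in a 400 μL digest, and 40 μg of trypsin is required.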

Preparation of Isolated Glycans

In a separate procedure, glycans were cleaved from intact ovomucoid using PNGase F. Briefly, 1 mg of the glycoprotein was dissolved in 50 mM phosphate buffer containing 0.1% 2-mercaptoethanol (pH = 7.5) at a concentration of 1 mg/mL. The solution was heated at 100°C for 10 min and allowed to cool to room temperature before the addition of 2 μL of 500 unit/mL PNGase F. The mixture was then incubated at 37°C for 14 h and the digest was briefly cleaned by removing the peptides using a C18 Sep-Pak cartridge. The cartridge was preconditioned with 5 mL of 85% ACN and 0.1% TFA solution followed by 5 mL of 5% ACN and 0.1% TFA solution. The digest was diluted with 1 mL of 5% ACN and 0.1% TFA solution and loaded three times onto the cartridge. The resulting unbound solution containing the glycans was collected and dried under vacuum. The glycans were permethylated using a spin-column method [58]. Briefly, an empty spin column was packed with NaOH beads (suspended in ACN) up to 1 cm from the top of the column and then preconditioned with DMF. The dried glycan mixture was reconstituted with 45 μL of iodomethane, 60 μL of DMF, and 2.4 μL of water and mixed briefly. The reaction mixture was loaded onto the column and incubated for 15 min before centrifugation. After a second addition of 45 μL of iodomethane, the mixture was reloaded onto the column and incubated for another 15 min. The column was then washed with two applications of 50 μL of ACN. The permethylated glycans were extracted with 400 μL of chloroform and dried under vacuum.

Preparation of the Lectin Affinity Chromatography Column

Approximately 1.2 mL of the Tosoh resin (aldehyde activated) slurry was spun down and washed with 10 mM phosphate buffered saline (PBS, pH = 7.5) before mixing with 3 mL of 10 mM PBS containing 25 mg WGA and 5 mg of NaCNBH3. To protect free amines of the carbohydrate binding sites of the lectin from reacting with aldehyde groups on the resin, GlcNAc was added at a final concentration of 5 mM. The reaction mixture was rotated in a cold room at 4°C for 48 h. The resin was spun down, and the supernatant was removed. The remaining aldehyde groups on the resin were quenched using 1 mL of buffer A (100 mM Tris, 150 mM NaCl, 2 mM MgCl2, and 2 mM CaCl2, pH = 7.5) containing 5 mg NaCNBH3, and rotated at room temperature for 3 h. The resin was spun down, and the supernatant was removed. The resin was packed into a 2 × 250 mm stainless steel column for subsequent use.

Enrichment of Glycopeptides Using the WGA Column

The glycoprotein digest was suspended in 20 μL of buffer A and loaded onto an AKTA Purifier HPLC system (GE Healthcare, Little Chalfont, UK). The glycopeptides were enriched using an isocratic 100% of buffer A at a flow rate of 0.2 mL/min. At 1.3 mL of the elution volume, an injection of 100 μL of 50 mM GlcNAc (in buffer A) was made to displace any remaining resin-bound glycopeptides. Fractions were collected every 0.25 mL and desalted using C18 Zip-tips before the Orbitrap MS analysis. The enriched glycopeptides were pooled and further fractionated using the Gemini C18 column on the AKTA Purifier system using a reverse-phase gradient, and each fraction was analyzed by IMS-MS.

High-Resolution MS Measurements

High-resolution and accurate mass (±8 ppm) measurements of the glycopeptides were achieved using a LTQ OrbitrapXL mass spectrometer (Thermo Scientific, San Jose, CA, USA) equipped with an Eksigent nanoLC system (AB SCIEX, Redwood City, CA, USA). The glycopeptides were resolved on a home-made emitter column packed with 5 micron Magic C18AQ resin (Michrom Bioresources/Bruker, Billerica, MA, USA) using a reversed-phase gradient before the MS measurement.

IMS-MS Measurements

The IMS-MS experiments were carried out on a Synapt G2S HDMS traveling wave ion mobility time-of-flight mass spectrometer (Waters) [59–62]. For the traveling wave IMS measurements, a wave height of 40 V and wave velocity of 650 m/s were applied to the mobility cell. The mobility cell was maintained at 3.5 mbar of N2 buffer gas pressure. The instrument was equipped with a nanoelectrospray (nanoESI) ionization source. The enriched glycopeptides in 49.9:49.9:0.2 (v:v:v) water:acetonitrile:formic acid solution were delivered into the source orifice via a syringe pump (KD Scientific, Holliston, MA, USA) at 20 μL/h. The ESI and cone voltages were held at 2.9 kV and 40 V, respectively. The temperature of the source was kept at 140°C. IMS data were processed using DriftScope 2.7 (Waters).

IMS-CID-MS Measurements

The IMS-CID-MS experiments for the released glycans were performed on a home-built instrument, and the details regarding the instrumentation and modes of operations have been reported elsewhere [54–57]. The instrument is comprised of an electrospray source, a drift tube, and a linear ion trap mass spectrometer. The drift tube is ~1 m long with an input buffer gas of helium at ~3 Torr and 300 K. To improve transfer of ions from the source to the IMS cell, and thereby increase sensitivity, we increased the pressure in the source region, which allowed a net flux of atmospheric gas into the IMS chamber [57]. The permethylated glycans were prepared in 49.9:49.9:0.2 (v:v:v) water:acetonitrile:formic acid and 2 mM sodium acetate solution at a final concentration of ~0.1 mg/mL. Ions were generated by electrospray, accumulated in an hourglass funnel [63], and pulsed into the drift tube. Ions were separated through collisions with the buffer gas under the influence of a uniform electric field (~23 V/cm). Mobility selection of the ions was achieved by applying a voltage to the selection gate at the exit of the drift tube at appropriate delay times from the initial ion pulse. The mobility-selected ions were introduced into an LTQ Velos instrument (Thermo Scientific, San Jose, CA, USA) for CID under a resonant rf excitation waveform applied for 10 ms with 35% normalized collision energy and an activation q of 0.25.

Data Analysis

Glycopeptide identification was assisted using the online software GlycoMod [64]. For a given protein sequence, protease, and set of possible glycans, GlycoMod examines an input MS mass list and determines which compounds potentially correspond to glycopeptides [64, 65]. Specifically, the data files from the Orbitrap MS measurement were deconvoluted using the Thermo Xcalibur Xtract software (output threshold at signal/noise = 10) into singly protonated precursor masses before being submitted to GlycoMod for mass match (±8 ppm). Ovomucoid sequence information was obtained from the UniProt database (ID P01005). The Gaussian fitting of the IMS distribution was performed using the Peak Analyzer tool in OriginPro 9.0 software (OriginLab Corp., Northampton, MA, USA). The center, width, and height of each Gaussian peak were optimized by altering the settings iteratively. For each fitting, the maximum number of iterations was set at 5 × 10⁶, and the tolerance was set at 1 × 10⁻¹⁵.
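The mass-matching step can be illustrated with a short sketch. This is not GlycoMod's implementation; it only shows the idea of comparing a singly protonated mass against peptide-plus-glycan candidates within a ppm tolerance. The monosaccharide residue masses are standard monoisotopic values. The peptide mass used in the example (approximately that of unmodified FPNATDK), the simulated measurement offset, and all function names are assumptions introduced for illustration.

```python
# Monoisotopic residue masses (Da) of the two monosaccharide classes.
HEX = 162.052824      # hexose residue, C6H10O5
HEXNAC = 203.079373   # N-acetylhexosamine residue, C8H13NO5
PROTON = 1.007276


def glycopeptide_mass(peptide_mass, n_hex, n_hexnac):
    """Neutral mass of a peptide carrying a glycan of composition HyNx."""
    return peptide_mass + n_hex * HEX + n_hexnac * HEXNAC


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_compositions(observed_mh, peptide_mass, tol_ppm=8.0):
    """Return (y, x) pairs of HyNx whose [M+H]+ matches within tol_ppm.

    The search range (y = 3-6, x = 2-8) mirrors the compositions
    reported for ovomucoid in the text.
    """
    hits = []
    for y in range(3, 7):
        for x in range(2, 9):
            theoretical_mh = glycopeptide_mass(peptide_mass, y, x) + PROTON
            if abs(ppm_error(observed_mh, theoretical_mh)) <= tol_ppm:
                hits.append((y, x))
    return hits


# Illustrative input: assumed peptide mass and a simulated measurement
# of the H3N5 glycoform that is 0.005 Da off the theoretical value.
peptide_mass = 791.3814
observed = glycopeptide_mass(peptide_mass, 3, 5) + PROTON + 0.005
print(match_compositions(observed, peptide_mass))
```

In practice, the candidate list would also have to span every peptide allowed by the cleavage rules and any variable modifications, which is part of what a dedicated tool handles.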

Results and Discussion

WGA Enrichment and Orbitrap MS Analysis of Ovomucoid Glycopeptides

Because purified trypsin often contains active co-purified chymotrypsin even after TPCK treatment, and because a relatively high percentage of chymotryptic peptides were identified in the tryptic digest, we interpreted the data allowing for both tryptic and chymotryptic cleavages. Ovomucoid is glycosylated with high-mannose type and hybrid type glycans, masses of which are in the range of 900–2600 Da [41–47]. Although ~30% of the mass of the glycoprotein is due to glycans, only a minority of the peptides observed in the MS spectrum are glycosylated, as glycopeptides generally have weaker ionization efficiencies compared with nonglycosylated peptides [20]. Furthermore, owing to the microheterogeneity at any given site, the relative abundance of a given glycopeptide may be very low. Consistent with this, the majority of the intense peaks in the MS spectrum obtained without enrichment are nonglycosylated peptides (Supplementary Figure S1). Glycopeptides are present at relatively low intensity. Several of the relatively intense glycopeptide peaks could be observed between m/z 1200 and 1400 (Supplementary Figure S1), whereas the rest of the glycopeptides are not well resolved from the background noise. In total, glycopeptides only account for ~0.5% of the overall peak intensity in Supplementary Figure S1 (calculated by summing up the intensities of identified glycopeptides and normalizing to the sum of all peak intensities in the spectrum). Thus, direct MS measurement of glycopeptides in the digest mixture is inefficient. In order to examine the glycoproteome in detail, it is desirable to minimize the interferences from the nonglycosylated peptides by removing them as much as possible.

As mentioned above, separation of glycopeptides from nonglycosylated peptides can be achieved through retention on a lectin column using an isocratic buffer condition, as glycopeptides experience significant binding interactions with the lectin whereas nonglycosylated peptides do not. The major UV peak in Figure 1a indicates the flow-through of the nonglycosylated peptides in the digest. Fraction F2, which was collected near the absorption maximum, is shown to contain predominantly nonglycosylated peptides (Figure 1b). The spectrum is similar to the one obtained for the entire digest (Supplementary Figure S1) in that the majority of the peaks in both spectra are nonglycosylated peptides. However, slight differences were noted. In particular, the glycopeptide peaks observed in Supplementary Figure S1 were essentially absent in Figure 1b, suggesting that they have been enriched in the later LAC fractions. The WGA-enriched glycopeptides begin to elute in fraction F4. It can be seen that in fraction F4 (Figure 1c), the majority of observed peptides are glycosylated. This demonstrates that the WGA column has effectively enriched the glycoproteome from the digest mixture. In the absence of strongly ionizing nonglycosylated peptides, the glycopeptide signals are significantly improved. In Figure 1c, the glycopeptides account for ~12% of the overall peak intensity in the spectrum, whereas the number is ~0.5% before enrichment. Thus, the WGA-based lectin affinity chromatography provides more than 20-fold enrichment for the glycopeptides. Without the enrichment, glycopeptide signals are extremely weak because of the presence of intense neighboring nonglycosylated peptide peaks as well as competitive ionization. In Figure 1c, the dominant ions in the spectrum are the glycopeptides containing the peptides FPN*ATDK and SIEFGTN*ISK modified by the glycans H3Nx (H = hexose, N = GlcNAc, x = 2–8). This is in agreement with prior work that the glycans H3Nx (x = 2–6) are the most abundant glycan series in ovomucoid [66]. Figure 1d shows the mass spectrum for species in fraction F7, which contains those glycopeptides present in the GlcNAc elution plug. Only a few glycopeptides were observed in this fraction. These glycopeptides were found at higher intensity in earlier fractions, indicating that at this point the majority of these glycopeptides have already eluted from the column in an isocratic fashion.
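The enrichment factor quoted above follows from the ratio of the glycopeptide intensity fractions after and before LAC. A minimal sketch of that calculation is given below; the function names are illustrative, and the percentages are the approximate values reported in the text.

```python
def glycopeptide_fraction(glyco_intensities, all_intensities):
    """Summed glycopeptide intensity normalized to total peak intensity."""
    return sum(glyco_intensities) / sum(all_intensities)


def fold_enrichment(percent_before, percent_after):
    """Ratio of glycopeptide intensity percentages after vs. before LAC."""
    return percent_after / percent_before


# Approximate percentages from the text: ~0.5% before, ~12% in fraction F4.
fold = fold_enrichment(0.5, 12.0)
print(f"approximate enrichment: {fold:.0f}-fold")
```

With the rounded values of ~0.5% and ~12%, the ratio is about 24, consistent with the statement of more than 20-fold enrichment.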

Figure 1

(a) UV absorption trace at 214 nm of the lectin affinity chromatography (LAC) enrichment of the ovomucoid digest using an isocratic buffer condition. Vertical lines immediately above the X axis indicate positions of the fractions collected (F1–F8). The Orbitrap MS analyses of species in fractions F2, F4, and F7 are displayed in panels (b), (c), and (d), respectively. Each spectrum was obtained by integrating the elution window corresponding to all peptides. Glycopeptides are labeled in color (pink: SIEFGTN*ISK; green: FPN*ATDK; the * symbol indicates the glycosylation site), and the associated glycan forms are shown in the parentheses above the m/z labels. Non-glycosylated peptides are indicated by black m/z labels. The charge state of each ion is labeled in superscript to the m/z label

IMS-MS of Ovomucoid Glycopeptides

The enriched glycopeptide fractions (F4–F8 from LAC) were combined and further fractionated offline using a C18 column (the C18 chromatography separation and fractionation are shown in Supplementary Figure S2). The IMS-MS distributions of F5 and F7 from the C18 fractions are shown in Figures 2 and 3, respectively. When ionizing a mixture of compounds using electrospray, the resulting MS spectrum can be very complicated because of multiple charge states of the analyte ions. IMS provides a means of separating the mixture of ions into families of charge states and, in favorable cases, can also delineate different isomeric forms based on differences in their shapes [67–69]. As can be seen from Figures 2a and 3a, ions have been separated into different bands in the IMS-MS spectra. The ovals in Figures 2a and 3a define the separated ion bands that have been plotted in Figures 2b–d and 3b–d, respectively. In both cases, the majority of the ions in the doubly, triply, and quadruply charged states correspond to glycopeptides. The majority of the peaks in the singly charged ion bands correspond to relatively short, nonglycosylated peptides, which were not plotted in the mass spectra here.

Figure 2

(a) IMS-MS (m/z versus drift bins) distribution of the ovomucoid glycopeptides from the C18 fraction f5. The white ovals designate the extracted ion bands shown in the corresponding panels below (b)–(d). Glycopeptides are labeled in color (green: FPN*ATDK; red: GAEVDCSRFPN*ATDKEGK; the * symbol indicates the glycosylation site), and the associated glycan forms are shown in the parentheses above the m/z labels. The charge state of each ion is labeled in superscript to the m/z label

Figure 3

(a) IMS-MS (m/z versus drift bins) distribution of the ovomucoid glycopeptides from the C18 fraction f7. The white ovals designate the extracted ion bands shown in the corresponding panels below (b)–(d). Glycopeptides are labeled in color (green: FPN*ATDK; pink: SIEFGTN*ISK; blue: TYGNKCNFCNAVVESN*GTLTLSHF; the * symbol indicates the glycosylation site), and the associated glycan forms are shown in the parentheses above the m/z labels. The charge state of each ion is labeled in superscript to the m/z label

It is important to note that even in the case of glycopeptides derived from a purified protein, glycopeptides cannot be unambiguously assigned based solely on their masses because of complications such as missed tryptic cleavages and variable endogenous and artifactual modifications [70]. In recent years, CID and electron transfer dissociation have been used to obtain information about the associated glycan and peptide sequences, respectively [24, 71]. Although these techniques are emerging as powerful tools for glycopeptide analysis, the large tandem MS datasets that are produced often require automated data processing tools as well as well-established scoring systems for confident assignment of glycopeptides [65, 72]. It appears that IMS-MS datasets can provide at least some direct insight into the degree of glycopeptide site heterogeneity. The glycopeptides shown in Figure 2 are primarily associated with glycosylation site 34. The doubly charged ion family in Figure 2b contains abundant glycopeptides FPN*ATDK as well as a version of this peptide containing two missed cleavages, GAEVDCSRFPN*ATDKEGK. This site is observed to be highly glycosylated by the glycans H3Nx (x = 2–8). The glycopeptides GAEVDCSRFPN*ATDKEGK also appear in the triply and quadruply charged states (Figure 2c and d). The glycopeptides shown in Figure 3 are primarily associated with glycosylation sites 77 and 199. The majority of the doubly charged ions in Figure 3b correspond to the glycopeptides SIEFGTN*ISK having glycosylation at residue 77. These glycopeptides appear in the triply charged state as well (Figure 3c). Similar to site 34, site 77 is also highly glycosylated by the glycans H3Nx (x = 2–8) as well as the glycans H4Nx (x = 2–8). The quadruply charged ions in Figure 3d correspond to the glycopeptides TYGNKCNFCNAVVESN*GTLTLSHF having glycosylation at site 199. These glycopeptides also appear in the triply charged state (Figure 3c). Site 199 appears to be highly glycosylated by the glycans H4Nx (x = 3–8).
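The relationship between the peptides named above can be illustrated with a small in silico digest. The sketch below applies the commonly used trypsin rule (cleavage after K or R, except before P) with a configurable number of missed cleavages, and locates the N-glycosylation sequon N-X-S/T (X not P). Both rules are general conventions rather than statements from this work, chymotryptic cleavages are not modeled, and the function names are illustrative.

```python
import re


def tryptic_peptides(sequence, max_missed=2):
    """Peptides from cleavage after K/R (not before P), allowing up to
    max_missed missed cleavages. Chymotryptic cleavage is not modeled."""
    cuts = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + [c for c in cuts if c < len(sequence)] + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + max_missed, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


def find_sequons(peptide):
    """0-based positions of Asn in N-X-S/T sequons, where X is not Pro."""
    return [m.start() for m in re.finditer(r"(?=N[^P][ST])", peptide)]


# Sequence stretch around site 34, taken from the peptide named in the text.
stretch = "GAEVDCSRFPNATDKEGK"
for pep in tryptic_peptides(stretch, max_missed=2):
    print(pep, find_sequons(pep))
```

For this stretch, the fully cleaved peptide FPNATDK and the form with two missed cleavages, GAEVDCSRFPNATDKEGK, both appear in the output and both carry the same sequon, which is why the two peptide series in Figure 2 report on the same site.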

Combining LAC enrichment and IMS-MS analysis, we observed a high degree of glycopeptide site heterogeneity. Together with accurate mass measurements, we were able to confidently assign the glycosylation patterns for this protein. A total of 117 glycopeptides from five different sites are identified in the ovomucoid digest (Supplementary Table S1), and a total of 27 glycan forms are found to modify this protein (Supplementary Table S2). These glycan forms are consistent with those found in the literature [41–47], and have the composition formula of HyNx (y = 3–6, x = 2–8), where the glycans H3Nx (x = 2–8) appear to be the most abundant glycans for this glycoprotein. Very limited site-specific information regarding sites 93 and 99 was obtained, as both sites are on the same tryptic peptide. Further proteolytic cleavages may be needed in order to assign their respective glycan forms. In this study, the number of identified glycopeptides in ovomucoid is much greater than the numbers reported based on LC-MS [66] and MS/MS techniques [73].

IMS Distributions of the Glycopeptides

The IMS distributions of a series of glycopeptides containing the same peptide GAEVDCSRFPN*ATDKEGK modified by the glycans H3Nx (x = 2, 5–7) are shown in Figure 4. In the triply charged state, the IMS profiles of these glycopeptides all display a single feature (Figure 4a–d). Interestingly, additional IMS features become resolved in the quadruply charged state (Figure 4f–h). As noted elsewhere [67, 74], additional IMS features may arise from different isomeric species or from new gas-phase conformations of the same species. We note that the glycopeptide containing the glycan H3N2, having only one possible structure, shows a single IMS feature in both the triply and quadruply charged states. Each of the remaining glycans H3Nx (x = 5–7) has been proposed to contain two isomeric forms [41, 42, 47]. For example, the glycopeptide containing the glycan H3N5 has two isomeric structures and, in this case, we observed two partially resolved peaks in the IMS distribution: a major peak is detected at 3.0 ms and a less intense shoulder at 3.2 ms (Figure 4f). Similar observations are made for the other isomeric glycopeptides (Figure 4g, h). Because these glycopeptides share the same peptide, and the only variance comes from the attached glycans, it appears that these additional IMS features present in the quadruply charged state are a result of the partial separation between the isomers (Figure 4f–h). Apparently, the increased Coulombic repulsion between charge sites causes the glycopeptides to open up, resulting in more open structural forms that are more readily resolved by IMS compared with the more compact structures observed for the lower charge state forms of glycopeptide isomers.

Figure 4

IMS distributions of the triply charged [panels (a)–(d)] and quadruply charged [panels (e)–(h)] glycopeptides containing the peptide GAEVDCSRFPN*ATDKEGK modified by various glycans H3Nx (x = 2, 5–7) shown in each panel. The glycan structures are drawn in cartoon representations: blue square-GlcNAc; green circle-mannose. The glycan structures are from Reference [41]

The IMS distribution of a glycopeptide can become very complicated when it is associated with multiple glycan isomers. Figure 5 shows the series of glycopeptides containing the same peptide TYGNKCNFCNAVVESN*GTLTLSHF with the glycans H4Nx (x = 6–8). These glycans may contain multiple isomers, and their IMS distributions show a broad range of features from 2.6 to 4.0 ms. Overall, the number of IMS features increases with the number of potential isomers. While interesting, it remains challenging to determine the exact gas-phase structures associated with individual IMS features. Gaussian functions are typically used to model the distribution of a single gas-phase structural population [76]. For the glycopeptide containing the glycan H4N8, at least four different Gaussian functions are needed to fully represent the overall distribution (Supplementary Figure S3). This suggests the presence of at least four structural isomers, which is a reasonable conjecture when theoretically five such isomers can exist (Figure 5c). The IMS profiles of these glycopeptides indicate a high degree of structural diversity of these species, and further experiments involving theoretical calculations may be needed in order to fully explain the gas-phase behaviors.
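The sum-of-Gaussians model used to describe an IMS distribution can be sketched as follows. This is only an evaluation of the model and of a residual sum of squares for candidate component sets; it does not reproduce the iterative optimization performed in OriginPro. The synthetic trace is hypothetical: the two centers borrow the 3.0 and 3.2 ms positions mentioned for the H3N5 glycopeptide, whereas the widths and heights are invented for illustration.

```python
import math


def gaussian(t, center, width, height):
    """Single Gaussian peak evaluated at drift time t."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)


def mixture(t, components):
    """Sum of Gaussian components, each given as (center, width, height)."""
    return sum(gaussian(t, c, w, h) for c, w, h in components)


def rss(times, intensities, components):
    """Residual sum of squares between a trace and a Gaussian mixture."""
    return sum((y - mixture(t, components)) ** 2
               for t, y in zip(times, intensities))


# Hypothetical trace: a major peak at 3.0 ms with a shoulder at 3.2 ms.
times = [2.6 + 0.02 * i for i in range(51)]
true_components = [(3.0, 0.06, 1.0), (3.2, 0.06, 0.4)]
trace = [mixture(t, true_components) for t in times]

# Compare a one-component and a two-component description of the trace.
rss_one = rss(times, trace, [(3.0, 0.06, 1.0)])
rss_two = rss(times, trace, true_components)
print(f"RSS with one Gaussian:  {rss_one:.4f}")
print(f"RSS with two Gaussians: {rss_two:.4f}")
```

The single-component description leaves the shoulder unexplained and therefore has a larger residual, which is the same reasoning used above to argue that at least four components are needed for the H4N8 glycopeptide. Note that the number of Gaussians required is a lower bound on the number of populations, not a direct count of isomers.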

Figure 5

IMS distributions of the quintuply charged glycopeptide series containing the peptide TYGNKCNFCNAVVESN*GTLTLSHF modified by various glycans H4Nx (x = 6–8) shown in each panel. Glycans are shown in cartoon representations: blue square-GlcNAc; green circle-mannose; yellow circle-galactose; the big parentheses are used to indicate the possible positions where the terminal galactose can be attached. The glycan structures are from References [41] and [75]

IMS-CID-MS Analysis of the Glycan ion [H3N5 + 2Na]2+

We have previously demonstrated the use of an extracted fragment ion drift time distribution (XFIDTD) approach to identify and resolve isomeric compounds in IMS-CID-MS analysis [54–56]. The basis of this approach is that molecules are fragmented by CID after they are separated in the IMS drift region. The intensity of a fragment unique to a single isomer is then extracted as a function of the precursor ion’s IMS distribution, and the obtained drift profile is representative of the specific precursor that produces the unique fragment. Here, we performed the IMS-CID-MS analysis in order to distinguish potential isomeric glycans from ovomucoid.

Figure 6a shows the IMS distribution of the precursor glycan [H3N5 + 2Na]2+. A broad distribution from ~13 to ~16 ms is observed with the major peak at ~14.4 ms, suggesting unresolved features underneath the IMS distribution. In the case of the glycan H3N5, there are two isomers we are interested in resolving. Fragmentation of these isomers can yield several distinct diagnostic ions. With respect to isomer I, loss of either GlcNAc-mannose disaccharide from the non-reducing end will result in a fragment at m/z = 1443.6 (Figure 6c). An additional loss of the reducing end GlcNAc will result in a fragment at m/z = 1166.6 (Figure 6b). With respect to isomer II, loss of the (GlcNAc)2-mannose trisaccharide moiety from the non-reducing end will result in a fragment at m/z = 1198.5 (Figure 6d) as well as its complementary fragment at m/z = 731.4 (Figure 6e). Importantly, each of these four diagnostic ions can only result from its respective isomer and cannot be generated from the other.

Figure 6

(a) IMS distribution of the precursor glycan ion [H3N5 + 2Na]2+ (m/z = 964.8). Two isomeric structures, isomer I and isomer II, are shown in cartoon representations as described in Figure 4. CID spectra of this precursor ion are taken across the entire drift time range. The extracted fragment ion drift time distributions (XFIDTD) of the fragments unique to isomer I (red traces) and isomer II (green traces) are shown in panels (b)–(e). Fragments at m/z = 1166.6 and 1443.6 are unique to isomer I. Fragments at m/z = 1198.5 and 731.4 are unique to isomer II. The fragments are singly sodiated. The dashed lines are drawn to guide the eye

The XFIDTD traces for both of the diagnostic ions for isomer I show nearly identical patterns (Figure 6b and c). In both cases, there is a major peak centered at 14.4 ms, a drift time that is identical to the major peak in the precursor IMS distribution. Both of these fragment ion profiles are narrower than the precursor ion profile, mainly because they do not display the feature at ~15 ms, which is evident in the precursor ion distribution. The XFIDTD traces for both of the diagnostic ions for isomer II show patterns that are similar to each other but distinct from those shown by the isomer I diagnostic ions (Figure 6d and e). Both of the XFIDTD traces show a major peak at 14.8 ms, which corresponds to the rightward shoulder region that was observed in the precursor IMS distribution, the region notably absent from the distributions of the isomer I diagnostic ions.
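The XFIDTD extraction described above can be sketched in a few lines. The sketch assumes that the data are available as a list of (drift time, peak list) pairs, one CID spectrum per mobility selection; this layout, the m/z tolerance, and all intensities are hypothetical and were chosen only to mimic the qualitative behavior reported for the four diagnostic ions.

```python
def xfidtd(spectra, fragment_mz, tol=0.5):
    """Extracted fragment ion drift time distribution.

    spectra: list of (drift_time_ms, [(mz, intensity), ...]) pairs.
    Returns a list of (drift_time_ms, summed intensity within tol of
    fragment_mz).
    """
    trace = []
    for drift_ms, peaks in spectra:
        intensity = sum(i for mz, i in peaks if abs(mz - fragment_mz) <= tol)
        trace.append((drift_ms, intensity))
    return trace


def apex(trace):
    """Drift time at which the extracted intensity is highest."""
    return max(trace, key=lambda point: point[1])[0]


# Hypothetical CID spectra at four mobility selections (invented values).
spectra = [
    (14.0, [(1443.6, 20), (1166.6, 15), (1198.5, 2), (731.4, 1)]),
    (14.4, [(1443.6, 100), (1166.6, 80), (1198.5, 10), (731.4, 8)]),
    (14.8, [(1443.6, 30), (1166.6, 25), (1198.5, 60), (731.4, 50)]),
    (15.2, [(1443.6, 5), (1166.6, 4), (1198.5, 20), (731.4, 15)]),
]

for mz in (1166.6, 1443.6, 1198.5, 731.4):
    print(mz, apex(xfidtd(spectra, mz)))
```

With these invented intensities, the two isomer I fragments share an apex at 14.4 ms and the two isomer II fragments share an apex at 14.8 ms, mirroring the pattern in Figure 6b–e.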

The IMS-CID-MS analysis of the isomeric glycans H3N5 indicates that isomer I has a slightly smaller collision cross section than isomer II. Because these isomers have such close mobilities, it may be difficult to resolve them using IMS alone, resulting in potential loss of glycomic features. Combining IMS with CID, it becomes possible to differentiate isomeric glycans having close mobilities and to reconstruct their IMS profiles, which can be particularly useful for identification as well as discovery of potential glycan isomers.

Summary and Conclusions

A workflow for reliable identification of glycopeptides in a complex mixture has been demonstrated. Using the WGA-based LAC platform, effective enrichment was achieved for the glycopeptides in the ovomucoid digest. Combining accurate mass measurement and IMS-MS analysis, a total of 117 glycopeptides having 27 different glycan forms were identified for the protein. The numbers may increase substantially when taking into account possible isomeric glycans. IMS-MS allows for separation of glycopeptides into different charge state families. Within each charge state, a distribution of glycan heterogeneity at each site can be clearly observed. A close examination of the IMS distributions from across different charge states reveals that higher charge states generally increase the number of resolvable structural features for the glycopeptides. It is also observed that the IMS features for a glycopeptide increase with the number of possible glycan isomers that are associated with the glycopeptide. Additionally, IMS-CID-MS has been demonstrated as a useful technique for characterizing isomeric glycan structures and reconstructing their individual IMS profiles. While this experiment was conducted on our home-built machine, a similar experimental design can be carried out in commercial instruments equipped with IMS cells, such as a Waters Synapt G2.

An interesting outcome of this workflow is that the glycosylation pattern at a particular site of a protein can be directly observed from the IMS-MS analysis, providing a complementary means of assessing glycopeptide heterogeneity as well as defining the structures. Such additional information may be useful in reducing the total workload for MS/MS analysis that is typically used to identify or discover glycopeptides in a complex mixture. Additionally, IMS-based platforms allow delineation of isomeric glycan/glycopeptide profiles, which often vary in different physiological and disease states [77]; thus, these approaches may assist efforts in discovering and validating biomarkers in a glycome or glycoproteome.