Introduction

Protein glycosylation is the covalent attachment of complex carbohydrates or oligosaccharides, collectively called glycans, to specific amino acid residues of the polypeptide backbone of proteins. Being one of the most common post-translational modifications (PTMs), glycosylation is estimated to occur in more than half of all eukaryotic proteins [1], with two major types being N-glycosylation and O-glycosylation [2]. It is involved in biological processes such as cell adhesion, signaling, inflammatory response, as well as a wide variety of pathologic states [2,3,4].

The ability to accurately characterize the structure of glycoproteins is of great importance for unraveling the diverse functions of glycosylation, especially for N-glycosylation, which can be highly heterogeneous. However, N-glycosylated proteins are often highly complex, presenting a multitude of diverse sugar–amino acid combinations, and their characterization with MS remains a great analytical challenge due to this heterogeneity [4, 5]. Structural characterization encompasses N-glycoprotein/glycopeptide identification, localization of attachment sites, and evaluation of glycosylation site micro-heterogeneity. Most previous N-glycoproteomic studies involved enzymatic removal of glycans followed by separate identification of the glycans and the amino acid sequence [6,7,8,9,10]. Although there are multiple well-established protocols, detaching glycans from their peptide backbones entails the loss of multiple layers of information (i.e., amino acid site-specific glycoform and micro-heterogeneity) [11, 12].

The ideal way to comprehensively characterize glycopeptides, especially in a large-scale glycoproteomics context, is to preserve the glycan side chain on the peptide and collect glycan and peptide fragmentation information simultaneously [13, 14]. However, because of their heterogeneous nature, fragmenting intact glycopeptides has always been difficult, and no single fragmentation technique has yet been able to generate a complete picture in a single MS/MS spectrum [6, 7, 14, 15]. Peaks resulting from glycosidic bond cleavages dominate spectra generated by collision-induced dissociation (CID), which reveal little about glycosylation sites and amino acid sequences, whereas the c/z-ion series in electron-transfer dissociation (ETD) experiments yield the glycosylation site and peptide identity with little information on glycan side chain composition [16, 17]. Higher-energy collisional dissociation (HCD) produces abundant diagnostic oxonium ions and partial glycopeptide information because the collision energy is more evenly distributed, enabling intact glycopeptide identification in some cases, most of which are studies on purified target proteins [16,17,18]. Although there have been multiple reports showing the capabilities of using CID or ETD alone to sequence sugar-modified peptides, the data were most often collected with only a few protein standards that were modified by less complex glycation or O-glycosylation. Such approaches are not applicable to large-scale characterization of much more branched and complex glycosylated peptides, such as N-glycopeptides, where glycan fragments dominate MS/MS spectra, and they are especially challenging to use with complex proteome samples [19,20,21,22,23,24]. An alternative approach for a more detailed characterization of glycopeptides combines MS/MS and MS3 experiments with CID or HCD. In this approach, the glycopeptide ion is selected and fragmented, resulting in a variety of fragment ions predominantly attributable to the cleavage of glycosidic linkages. The peptide ion carrying a single HexNAc is then subjected to a second ion isolation/fragmentation cycle, resulting in fragmentation of the peptide moiety [25,26,27,28]. However, this also requires prior knowledge of the targeted peptides to select for MS3, thus limiting its throughput [25,26,27,28]. Several other reports alternate collision energies and collect sequential MS/MS scans on the same precursor, based on the observation that lower energy preferentially fragments glycans whereas higher energy can fragment peptide backbones [29, 30]. However, acquiring or averaging sequential scans significantly slows down the instrument, and custom-made bioinformatics tools are required to process such data. As a consequence, the authors have demonstrated the utility only with a few protein standards and have not yet been able to apply it to real complex samples [29, 30]. Accurate mass matching has also been employed in studies involving glycoprotein standards; however, its applicability to complex mixtures is questionable because thousands of candidates can exist for one m/z value [31, 32]. Additional studies identify N-glycopeptides with HCD, using spectra acquired for intact glycopeptides as well as deglycosylated peptides to explore the heterogeneity, assisted by a custom software platform [33, 34]. Yet the successful application of this strategy still relies on extra enzymatic PNGase F treatment and requires dedicated data processing tools.

To overcome these limitations, researchers have built a multifaceted method, running sequential HCD (or CID) and ETD in the hope of assembling pieces of information from both fragmentation techniques [35,36,37]. Later on, a more targeted approach, the product-triggered method, took advantage of the abundant oxonium ions produced by HCD, triggering a subsequent ETD event only if certain oxonium ions were recognized in the prior HCD MS/MS scan [38, 39]. This method allowed more instrument time to be spent on glycopeptides by sparing non-glycosylated peptides from further MS/MS analysis. Despite providing complementary data, sequential HCD and ETD acquisitions consume more duty cycle and require complex post-acquisition data processing, since the two types of information need to be considered separately. Frese et al. combined ETD with HCD and developed a hybrid dissociation method, termed EThcD [40], in which supplemental energy is applied to all ions formed by ETD to generate more informative spectra [41, 42]. It has also shown great promise in glycopeptide studies [43, 44].
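The product-triggered decision described above can be sketched in a few lines. This is only an illustration of the logic, not instrument control code: the function name, the 0.02 Da tolerance, the minimum of two matching ions, and the 5% relative intensity threshold are all assumptions made for the example, and the oxonium m/z values are the ones quoted later in the Results.

```python
# Hypothetical sketch of a product-triggered decision (not vendor software).
# Oxonium / HexNAc signature ions as quoted in the Results of this paper.
OXONIUM_MZ = (138.05, 163.06, 168.06, 186.07, 204.08, 366.14)

def should_trigger_etd(hcd_peaks, tol_da=0.02, min_hits=2, min_rel_intensity=0.05):
    """Return True if an HCD spectrum contains enough oxonium ions.

    hcd_peaks: list of (mz, intensity) tuples.
    tol_da, min_hits, min_rel_intensity: illustrative thresholds (assumed).
    """
    if not hcd_peaks:
        return False
    base_peak = max(intensity for _, intensity in hcd_peaks)
    hits = 0
    for target in OXONIUM_MZ:
        # Count a target once if any sufficiently intense peak lies within tolerance.
        if any(abs(mz - target) <= tol_da and intensity >= min_rel_intensity * base_peak
               for mz, intensity in hcd_peaks):
            hits += 1
    return hits >= min_hits

# Toy spectra (invented peak lists, for illustration only).
glyco_like = [(204.08, 8.0e5), (366.14, 3.0e5), (712.40, 1.0e6)]
peptide_like = [(175.12, 4.0e5), (712.40, 1.0e6), (841.45, 2.0e5)]
```

With these toy inputs, `should_trigger_etd(glyco_like)` is true and `should_trigger_etd(peptide_like)` is false, so only the first precursor would receive the slower follow-up scan.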

In this study, we established a redefined EThcD approach for improved analysis of glycopeptides, mainly N-glycopeptides. Instead of using calibrated ETD reaction times with HCD as supplemental energy to improve ETD spectra populated mostly by c/z ions, as Parker et al. presented [43], our approach integrates HCD as a second dimension of fragmentation. Both HCD- and ETD-type fragments are collected in a single spectrum, which improves instrument speed by greatly shortening the ETD reaction time for each charge state [43] and yields complete product ion generation (glycan fragments plus c/z and b/y ions from the peptide backbone) and, therefore, intact glycopeptide characterization. Following method validation with fetuin, a slightly more complex human serum sample was studied, in which 331 unique N-glycopeptides were sequenced and 66 glycosylation sites mapped. Large-scale experiments explored the N-glycoproteome of rat carotids collected over the course of restenosis progression, with over 2000 glycopeptides identified. Detailed information on microheterogeneity was gained to guide further biological studies.

Experimental

Materials

Fetuin from fetal bovine serum, concanavalin A (ConA), wheat germ agglutinin (WGA), Ricinus communis agglutinin (RCA120), iodoacetamide (IAA), N-acetyl-D-glucosamine, D-lactose, methyl α-D-mannopyranoside, and manganese dichloride were obtained from Sigma-Aldrich (St. Louis, MO, USA). Tris base, urea, sodium chloride, human serum, and calcium chloride were obtained from Fisher Scientific (Pittsburgh, PA, USA). C18 OMIX tips were obtained from Agilent (Santa Clara, CA, USA). Dithiothreitol (DTT) and sequencing grade trypsin were supplied by Promega (Madison, WI, USA). Hydrophilic interaction chromatography material was obtained from PolyLC (Columbia, MD, USA).

Rat Carotid Artery Sample Collection

To induce restenosis in an experimental model, carotid artery balloon angioplasty was performed in male Sprague–Dawley rats (Charles River; 350–400 g) as previously described [45]. Briefly, rats were anesthetized with isoflurane (5% for inducing and 2.5% for maintaining anesthesia). A longitudinal incision was made in the neck and the carotid arteries were exposed. A 2-F balloon catheter (Edwards Lifesciences, Irvine, CA, USA) was inserted through an arteriotomy on the left external carotid artery and placed into the common carotid artery. To produce arterial injury, the balloon was inflated at a pressure of 2 atm and withdrawn to the carotid bifurcation, and this action was repeated three times. The external carotid artery was then permanently ligated, and blood flow was resumed. At d 3, d 7, and d 14 post-surgery, injured common carotid arteries as well as uninjured ones of ~1.5 cm in length were collected following systemic perfusion of the animals with cold PBS solution. Peri-adventitial tissues were carefully dissected and excised in cold PBS, and remaining blood in the lumen was cleaned. The arteries were then immediately snap-frozen in liquid nitrogen and stored in cryogenic vials until ready for downstream processing. For protein extraction, frozen arterial segments were pulverized with a micro-pestle over a liquid nitrogen bath.

Trypsin Digestion

Fetuin and human serum proteins were dissolved in 8 M urea, reduced (5 mM DTT, 1 h at room temperature), and alkylated (15 mM IAA, 30 min at room temperature in the dark). Samples were diluted with 50 mM Tris–HCl (pH 8) to lower the urea concentration to 0.9 M, trypsin was added at a 1:50 (w/w) ratio, and the mixture was incubated for 18 h at 37 °C. Digestion was quenched with 10% TFA to a final concentration of 0.3%. Rat carotids were processed with the filter-aided sample preparation (FASP) technique [46]. Details can be found in Supplemental Information.
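The dilution and quenching steps above follow from simple mass-balance arithmetic (C1·V1 = C2·V2). The sketch below works through it for a hypothetical 100 μL starting volume; that volume and the function names are assumptions for illustration, while the concentrations (8 M to 0.9 M urea, 10% TFA stock to 0.3% final) come from the text.

```python
def diluent_volume(c_initial, v_initial, c_final):
    """Volume of diluent to add so that c_initial * v_initial = c_final * v_total."""
    if not 0 < c_final <= c_initial:
        raise ValueError("final concentration must be positive and not exceed the initial one")
    v_total = c_initial * v_initial / c_final
    return v_total - v_initial

def stock_volume(c_stock, c_final, v_sample):
    """Volume x of stock to add to v_sample so that c_stock * x = c_final * (v_sample + x)."""
    if not 0 < c_final < c_stock:
        raise ValueError("final concentration must be positive and below the stock concentration")
    return c_final * v_sample / (c_stock - c_final)

# Hypothetical starting volume of 100 uL of sample in 8 M urea.
v_start_ul = 100.0
tris_to_add_ul = diluent_volume(8.0, v_start_ul, 0.9)      # about 789 uL of 50 mM Tris-HCl
v_digest_ul = v_start_ul + tris_to_add_ul                  # about 889 uL total
tfa_to_add_ul = stock_volume(10.0, 0.3, v_digest_ul)       # about 27.5 uL of 10% TFA
```

In other words, reaching 0.9 M urea requires roughly a ninefold dilution, and quenching to 0.3% TFA adds only about 3% to the digest volume.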

Lectin Enrichment

Human plasma glycopeptide enrichment was performed with a modified filter-aided protocol [47,48,49]. Tryptic plasma peptides were loaded onto Microcon filters YM-30 with 80 μL binding buffer (1 mM CaCl2, 1 mM MnCl2, 0.5 M NaCl in 20 mM Tris–HCl, pH 7.3). Ninety mg ConA, 90 mg WGA, and 71.5 mg RCA120 were mixed in 36 μL of 2× binding buffer and loaded onto the filter unit. After incubation for 1 h, the unbound peptides were eluted by centrifugation at 14,000 × g at 18 °C for 10 min. The captured peptides were washed four times with 200 μL binding buffer and then eluted two times with elution solution (300 mM N-acetyl-D-glucosamine, D-lactose, methyl α-D-mannopyranoside in 200 μL binding buffer). Final solution was acidified with 10% TFA and desalted using C18 OMIX tips. Tips were first wetted with ACN and equilibrated with water containing 0.1% TFA. Samples were applied onto the tip, washed with water containing 0.1% TFA, and then eluted with 50% and 75% aqueous ACN containing 0.1% TFA. Samples were dried down and loaded on LC-MS.

HILIC Enrichment

Rat carotid glycopeptides were enriched using HILIC beads (PolyHYDROXYETHYL A; PolyLC Inc., Columbia, MD, USA) following a previously reported procedure with minor modification [50]. HILIC beads were first activated with 200 μL of elution buffer (0.1% TFA, 99.9% H2O) for 30 min and then washed with binding buffer twice. About 140 μg of the tryptic peptide mixture was dissolved in 300 μL of binding buffer (0.1% TFA, 19.9% H2O, 80% ACN) and mixed with 7 mg activated ZIC-HILIC resin at a 1:50 peptide-to-material mass ratio in a microcentrifuge tube. The tube was shaken on a vortex mixer for 1 h and the supernatant was removed by centrifugation. The beads were washed six times with 70 μL binding buffer and glycopeptides were eluted five times with 70 μL elution buffer.

NanoLC-MS/MS Analysis

The desalted peptide mixtures were analyzed on the Orbitrap Fusion Lumos Tribrid Mass Spectrometer (Thermo Fisher Scientific, San Jose, CA, USA) coupled to a Dionex UPLC system. A binary solvent system composed of H2O containing 0.1% formic acid (A) and MeCN containing 0.1% formic acid (B) was used for all analyses. Peptides were loaded and separated on a 75 μm × 15 cm homemade column packed with 1.7 μm, 150 Å, BEH C18 material obtained from a Waters (Milford, MA) UPLC column (part no. 186004661). A gradient ramping from 3% to 30% solvent B in 30 min was used at a flow rate of 300 nL/min to separate the bovine fetuin digest. The same solvent composition with an extended 120 min gradient was used to separate enriched human serum glycopeptides.

The mass spectrometer was operated in data-dependent mode to automatically switch between MS and MS/MS acquisition. Survey full scan MS spectra (from m/z 300 to 1800) were acquired in the Orbitrap with resolution of 120,000 at m/z 200. The 10 precursors with the highest charge states were selected for MS/MS in an order of intensity.

Three consecutive scans with CID, ETD, and EThcD were acquired on the same precursor for bovine fetuin peptides. CID was carried out in the ion trap at a collision energy of 35%. ETD was performed with calibrated charge-dependent reaction times [51] (Supplementary Table S1) supplemented by 15% HCD activation, and ions were analyzed in the Orbitrap at a resolution of 60,000 FWHM (at m/z 200). EThcD was performed with user-defined charge-dependent reaction times (Supplementary Table S1) supplemented by 33% HCD activation, and ions were analyzed in the Orbitrap at a resolution of 60,000 FWHM (at m/z 200). MS/MS AGC targets were set to 1.0e4 for ion trap scans and 3.0e5 for Orbitrap scans. HCD-alone experiments with normal human serum were performed with a 33% normalized collision energy, while the other parameters remained unchanged from the EThcD experiments.

Lectin-enriched human serum glycopeptides and HILIC-enriched rat carotid glycopeptides were only analyzed in the EThcD mode with user-defined charge-dependent reaction times (Supplementary Table S1) supplemented by 33% HCD activation.

Data Analysis

All spectra were analyzed with Byonic (Protein Metrics, San Carlos, CA, USA) [52] with corresponding protein databases [i.e., bovine fetuin or UniProt Homo sapiens proteome (April 12, 2016; 16,764 entries) or Rattus norvegicus proteome (August 27, 2016; 36,991 entries)]. Trypsin was selected with a maximum of two missed cleavages allowed. Carbamidomethylation of cysteine residues (+57.0215 Da) was set as a static modification and oxidation of methionine (+15.9949 Da) as a dynamic modification.

Glycan searches were conducted on fetuin, all with default mammalian N- and O-glycan databases. Enriched serum glycopeptide spectra were searched against human serum glycan database or human glycan database, respectively. Glycopeptides from carotid arteries were searched with mammalian N-glycan database. Precursor ion tolerance of 10 ppm and a product ion mass tolerance of 0.01 Da were allowed. Results were filtered at 1% FDR. Fetuin data was filtered with Byonic score >300 while plasma and rat carotid artery data were filtered with the normalized |Log Prob| >2, and further validation was performed manually.
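The post-search filters described above (precursor mass error within 10 ppm, plus a Byonic score or |Log Prob| cutoff) can be expressed as a short sketch. The record layout, field names, and example values below are hypothetical and do not reflect Byonic's actual export format; the 1% FDR control is assumed to have been applied by the search engine beforehand.

```python
from dataclasses import dataclass

@dataclass
class GlycoPSM:
    """Hypothetical, simplified glycopeptide-spectrum match record."""
    glycopeptide: str
    observed_mz: float
    theoretical_mz: float
    byonic_score: float
    abs_log_prob: float

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def keep_psm(psm, max_ppm=10.0, min_score=None, min_abs_log_prob=None):
    """Apply the mass-accuracy filter plus whichever score cutoff is supplied."""
    if abs(ppm_error(psm.observed_mz, psm.theoretical_mz)) > max_ppm:
        return False
    if min_score is not None and psm.byonic_score <= min_score:
        return False
    if min_abs_log_prob is not None and psm.abs_log_prob <= min_abs_log_prob:
        return False
    return True

# Invented example records, for illustration only.
good = GlycoPSM("PEPTIDE-1", 1577.912, 1577.900, 450.0, 5.1)   # about 7.6 ppm
off_mass = GlycoPSM("PEPTIDE-2", 1577.920, 1577.900, 450.0, 5.1)  # about 12.7 ppm
low_score = GlycoPSM("PEPTIDE-3", 1577.905, 1577.900, 250.0, 1.4)
```

A fetuin-style filter would then be `keep_psm(psm, min_score=300)`, whereas the plasma and carotid data would use `keep_psm(psm, min_abs_log_prob=2)`.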

Results and Discussion

Traditional glycoproteomic studies usually opt to enzymatically release glycans from their attached peptides because intact glycopeptides, owing to their heterogeneous nature, are not amenable to a single fragmentation method [6, 14, 16]. However, certain information is lost during such a process, which creates a gap between glycomics, proteomics, and glycoproteomics. A universal MS method is in great demand and has driven multiple studies to characterize intact glycopeptides rather than breaking them apart. In recent years, sequential MS/MS acquisition on the same precursor with multiple fragmentation methods (i.e., CID/HCD/ETD) has become a popular approach, as the consensus is that CID/HCD primarily produces glycan products, whereas ETD cleaves peptide backbones [31, 35, 43, 53]. However, running sequential scans undoubtedly lowers instrument duty cycle and, therefore, sensitivity and sequence coverage, especially with ETD scans consuming milliseconds [51, 54]. In addition, it often requires tedious sample preparation and complex data processing since a complete picture is divided and distributed among several spectra [31, 55]. To address these limitations, this study employs a hybrid EThcD approach to characterize intact glycopeptides in a one-spectrum fashion with much improved instrument duty cycle.

Identification of Glycopeptides from Trypsin Digest of Bovine Fetuin

Analysis of a rather simple mixture of modified and native peptides was performed using an enzymatic digest of fetuin, which is highly glycosylated [56, 57]. To illustrate the difference between CID, ETD with supplemental HCD activation, and our EThcD approach, three sequential MS/MS scans were recorded on the same precursor with each of the three activation types (Figure 1). As annotated, most of the dominant peaks formed during CID fragmentation of glycopeptide KLCPDCPLLAPLNDSR resulted from the cleavage of glycosidic bonds without breaking amide bonds. Additionally, diagnostic oxonium ions in the low-mass region were lost during fragmentation because fragment ions with m/z values below roughly 30% of the precursor m/z are not stably trapped in a linear ion trap, a limitation commonly known as the “one-third rule.” Therefore, as stated above, CID of glycopeptides should preferentially be used for glycan sequencing but not for glycosylation site determination and amino acid sequencing. An ETD spectrum was also recorded (Figure 1a, inset), using calibrated ETD reaction time plus 15% HCD supplemental activation. However, barely any sequence information was provided, likely due to the inefficiency of ETD on doubly or triply charged ions. In sharp contrast to CID and ETD, our EThcD scheme generated a variety of product ions originating from cleavage of both amide bonds and glycosidic bonds (Figure 1b). The spectrum was manually deconvoluted with Xtract (Thermo Fisher Scientific, San Jose, CA, USA) to reveal more fragments. As manually annotated, CID and ETD together provided 22 fragments, with neither spectrum informative enough to complete the sequencing, whereas EThcD alone revealed over 30 unique fragments with much more balanced information for sequencing the glycopeptide. The sequence was also verified by Byonic search.
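The low-mass cutoff argument can be checked numerically. The sketch below assumes the precursor m/z of 1577.9 given in the Figure 1 caption and the oxonium and HexNAc signature ions listed later in the Results; the function name and the choice of showing both a one-third and a 30% cutoff are illustrative.

```python
def low_mass_cutoff(precursor_mz, fraction=1.0 / 3.0):
    """Approximate lowest fragment m/z retained in an ion trap CID scan."""
    return precursor_mz * fraction

precursor_mz = 1577.9  # 3+ fetuin glycopeptide precursor (Figure 1 caption)
diagnostic_ions = [138.05, 163.06, 168.06, 186.07, 204.08, 366.14]

cutoff_third = low_mass_cutoff(precursor_mz)          # about m/z 526
cutoff_30pct = low_mass_cutoff(precursor_mz, 0.30)    # about m/z 473

# Diagnostic ions that fall below the cutoff and are therefore not observed.
lost_third = [mz for mz in diagnostic_ions if mz < cutoff_third]
lost_30pct = [mz for mz in diagnostic_ions if mz < cutoff_30pct]
```

Under either cutoff every listed diagnostic ion lies below the trapping limit, which is consistent with their absence from the ion trap CID spectrum.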

Figure 1

MS/MS of 3+ charge state precursor ion at m/z 1577.9 of bovine fetuin triantennary N-glycopeptide KLCPDCPLLAPLNDSR (AA 126–141). Alternating between CID/ETD/EThcD resulted in different sets of ions. (a) CID and ETD spectra (inset). Asterisk (*) in the peptide sequence indicates carbamidomethylation. (b) EThcD spectrum. Starred peaks (*) in the spectra were deconvoluted and annotated in the inset

Using this strategy, various glycoforms of N- and O-glycopeptides were identified (Table 1) and assigned from their MS/MS spectra. All glycopeptide spectrum matches were filtered to ensure mass accuracy better than 10 ppm and scores above 300 [52]. The observed glycoforms were in good agreement with previous reports [31, 57,58,59,60,61]. Even though some of these prior studies reported a few additional glycoforms, it is worth noting that they required extra experiments, extensive fractionation, extended acquisition time, accurate mass matching, or considerably more material, whereas our current method has a greatly simplified workflow that requires only trypsin digestion. Compared with CID and ETD, our approach provided improved accuracy and sensitivity for intact glycopeptide sequencing and site localization.

Table 1 Glycopeptides and Corresponding Glycoforms Identified by EThcD for the N- and O-Linked Oligosaccharides from the Enzymatic Digestion of Bovine Fetuin

Identification of N-Glycopeptides Enriched from Trypsin Digestion of Normal Human Serum

We also applied our method to a large-scale N-glycopeptide identification with lectin enriched human serum, in order to further investigate its utility. Regular HCD and our EThcD were first evaluated in terms of their effectiveness in identifying glycopeptides. Data was searched against human plasma glycan database (57 glycan entities). Results from a duplicate comparison clearly suggested that EThcD outperformed HCD, by enabling more unique glycoform identifications as well as glycosylation sites (Figure 2a, b). Even when some of their identifications overlapped, EThcD provided better spectral quality, which is more informative and, therefore, allows more confident sequence determination (Figure 2c, Figure 3). Presented in Figure 3 are spectra of the same glycopeptide with different charge states and fragmentation methods. Although HCD permits successful sequencing (Figure 3a), it was quite obvious that EThcD outperforms HCD with detection of more complete ion series (Figure 3b) for 3+ charged precursors. Apart from several oxonium ions (m/z 163.06, 204.08, and 366.14) and signature HexNAc fragments (m/z 138.05, 168.06, and 186.07), there were only two ions with one or two HexNAc still attached in HCD, whereas EThcD spectrum recorded cleavage at every single glycosidic bond and entire glycan sidechain products. This was even more obvious with EThcD on 4+ charged precursor (Figure 3c). In general, EThcD peptide spectral matches (PSMs) were higher scored than HCD (Figure 2c). Overall, after filtering with several criteria (1% FDR, Log Prob >3 and mass error <10 ppm), EThcD enabled successful identifications of 184 unique glycoforms, 49 glycosylation sites from 29 proteins, and HCD identified 106 unique glycoforms, 34 glycosylation sites from 23 proteins (Supplementary Table S2).

Figure 2

Comparison between HCD and EThcD. (a), (b) Number of identified glycopeptide and glycosylation site comparison. (c) Distribution of the Byonic scores of identified glycopeptide peptide spectral matches

Figure 3

MS/MS of N-glycopeptide TVLTPATNHMGNVTFTIPANR (AA 74–94) from human complement C3. (a) HCD, and (b) EThcD spectra of 3+ charge state precursor at m/z 1163.20. (c) EThcD spectrum of 4+ charge state precursor at m/z 872.65
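As a quick consistency check, the 3+ and 4+ precursors in this caption should correspond to the same neutral glycopeptide mass. The sketch below converts each m/z to a neutral monoisotopic mass, assuming protonated ions and a proton mass of 1.007276 Da (a standard constant, not a value from this paper); the function name is illustrative.

```python
PROTON_MASS = 1.007276  # Da, standard value

def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion from its m/z and charge."""
    return (mz - PROTON_MASS) * charge

mass_from_3plus = neutral_mass(1163.20, 3)   # about 3486.58 Da
mass_from_4plus = neutral_mass(872.65, 4)    # about 3486.57 Da

# Agreement between the two charge states, in Da and in ppm.
delta_da = abs(mass_from_3plus - mass_from_4plus)
delta_ppm = delta_da / mass_from_4plus * 1e6
```

The two estimates agree to within about 0.01 Da (roughly 2 ppm), which is as close as m/z values reported to two decimals allow.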

As there are several glycan databases available in Byonic, we reasoned that it is worth comparing a couple of them in terms of any influence that different databases may have on identifications [52]. With identical search parameters, our data were searched against the human plasma N-glycan database (HPD), which includes 57 glycan entities, and the human N-glycan database (HD), which includes 182 glycan entities. HPD resulted in 223 unique glycoforms that met our filter criteria, whereas that number with HD was 288. Initial analysis of the results revealed 170 glycoforms identified with both databases, 53 unique to HPD and 118 unique to HD. However, further investigation revealed that 10 unique glycoforms from each side were the same spectra ambiguously assigned different glycan compositions, though the peptide backbone stayed the same. As shown in Figure 4c, d, HexNAc(5)Hex(6)NeuAc(2) was assigned by HPD and HexNAc(5)Hex(5)Fuc(1)NeuAc(2) was assigned by HD; the mass difference between the two glycans (~16 Da) was accounted for by methionine oxidation in the HD assignment. A closer look at the spectrum suggested that database searching against HD not only enabled assignment of more product ions but also, and most importantly, annotated y11 and z11 ions that contained oxidized methionine, increasing the confidence of the assignment. Furthermore, Jia et al. reported this particular site to be core fucosylated, which is in good agreement with the HD assignment [62]. This ambiguity arose because certain glycan entities were not included in HPD. Comparing Byonic score ratios indicated that all 10 ambiguous assignments tended to have better annotations with HD (Figure 4b); therefore, enriching the glycan database would generally increase search confidence. After manually removing the ambiguity, 180 glycoforms were shared between HPD and HD. Of the remaining 43 glycoforms unique to HPD, most (24 of 43) were due to sodium adducts not included in HD. On the other hand, 100 of the 108 glycoforms unique to HD were sequenced only with HD because those glycan entities were not included in HPD.
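The bookkeeping behind these counts can be verified with a few lines of arithmetic. The sketch below uses only the totals reported in the text; the function and key names are illustrative. Resolving the 10 ambiguous pairs moves 10 entries from each "unique" group into the shared group.

```python
def reconcile(total_a, total_b, shared_initial, ambiguous_pairs):
    """Recompute overlap counts after ambiguous pairs are merged into the shared set."""
    only_a = total_a - shared_initial
    only_b = total_b - shared_initial
    shared_final = shared_initial + ambiguous_pairs
    only_a_final = only_a - ambiguous_pairs
    only_b_final = only_b - ambiguous_pairs
    return {
        "only_a_initial": only_a,
        "only_b_initial": only_b,
        "shared_final": shared_final,
        "only_a_final": only_a_final,
        "only_b_final": only_b_final,
        "union_final": shared_final + only_a_final + only_b_final,
    }

# HPD = 223 glycoforms, HD = 288 glycoforms, 170 shared at first, 10 ambiguous pairs.
counts = reconcile(total_a=223, total_b=288, shared_initial=170, ambiguous_pairs=10)
```

The result reproduces the reported 53 and 118 initial unique glycoforms, the 180 shared, 43 HPD-only, and 108 HD-only glycoforms after validation, and a combined total of 331 unique N-glycopeptides, matching the number quoted in the Introduction.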

Figure 4

Evaluation of human glycan database and human plasma glycan database. (a) Unique glycopeptide identification between search results with two databases, before and after manual validation of ambiguous spectra. (b) Byonic ratio of ambiguous peptide spectral matches. The ratio is calculated as the score with human glycan database divided by plasma glycan database. (c) Assigned MS/MS of a 4+ charged precursor at m/z 1228.00 with human plasma glycan database, and (d) human glycan database
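The ambiguity between panels (c) and (d) arises because replacing one Hex with one Fuc lowers the glycan mass by exactly one oxygen, which is the same mass shift as methionine oxidation. The sketch below illustrates this with standard monoisotopic residue masses; these constants are textbook values rather than numbers from this paper, and the dictionary and function names are illustrative.

```python
# Standard monoisotopic residue masses in Da (assumed reference values).
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}
MET_OXIDATION = 15.99491  # Da, one oxygen atom

def glycan_mass(composition):
    """Sum of residue masses for a composition such as {'HexNAc': 5, 'Hex': 6}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())

hpd_glycan = {"HexNAc": 5, "Hex": 6, "NeuAc": 2}            # assigned with HPD
hd_glycan = {"HexNAc": 5, "Hex": 5, "Fuc": 1, "NeuAc": 2}   # assigned with HD

# HPD assignment: unmodified peptide; HD assignment: peptide with oxidized Met.
delta_glycan = glycan_mass(hpd_glycan) - glycan_mass(hd_glycan)
residual = delta_glycan - MET_OXIDATION
```

Because the residual is essentially zero, the two candidates have the same precursor mass at any mass tolerance, and only fragment ions that localize the oxidation (such as the y11 and z11 ions discussed in the text) can distinguish them.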

All N-glycopeptides from human plasma had their glycans on 66 different asparagine residues from 41 proteins (Table 2). Most of these glycosylation sites (~91%) were very well documented by UniProt and previous publications [62,63,64,65,66,67,68]. It also resulted in six novel N-glycosylation sites that required further validation. Moreover, one of the major advantages of sequencing intact glycopeptides rather than sequencing glycans and peptides separately is the capability of comprehensive characterization of glycoforms. Instead of only knowing what glycans are in the sample and what potential peptides they are attached to, we are able to unequivocally list multiple glycoforms with individual glycosite mapped.

Table 2 Glycosylation Site Identifications Using Human Plasma and Human Glycan Databases

Identification of N-Glycopeptides Enriched from Trypsin Digestion of Rat Carotid Arteries

Restenosis is the re-narrowing of a blood vessel following surgical interventions that aim to treat stenosis. While the underlying mechanisms have been studied from multiple perspectives [69, 70], the N-glycoproteomics has not been very well characterized, even though N-glycosylation is critical for correct protein folding, stability, and mediating cell attachment [71]. Extracellular matrix (ECM) proteins have long been implicated in the pathogenesis of restenosis, and the fact that more than 90% of ECM proteins are glycosylated makes it more relevant to study carotid artery glycoproteomics [71, 72]. We harvested rat carotid arteries along the restenosis progression at 0, 3, 7, 14 d time points and enriched N-glycopeptides with a HILIC method. Four biological replicates were analyzed and glycopeptides were only counted if they were identified in at least two replicates with |Log Prob| >2.
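The replicate criterion in the last sentence (count a glycopeptide only if it passes |Log Prob| > 2 in at least two of four biological replicates) can be sketched as follows. The input layout, identifiers, and scores are invented for illustration and do not correspond to any actual search output.

```python
from collections import Counter

def reproducible_glycopeptides(replicates, min_replicates=2, min_abs_log_prob=2.0):
    """Return glycopeptide IDs passing the score cutoff in enough replicates.

    replicates: list of dicts mapping glycopeptide ID -> Log Prob value
    (one dict per biological replicate; hypothetical layout).
    """
    passes = Counter()
    for replicate in replicates:
        for glycopeptide, log_prob in replicate.items():
            if abs(log_prob) > min_abs_log_prob:
                passes[glycopeptide] += 1
    return {gp for gp, n in passes.items() if n >= min_replicates}

# Toy data for one time point: four replicates with invented IDs and scores.
day7 = [
    {"GP_A": 4.2, "GP_B": 2.5, "GP_C": 1.1},
    {"GP_A": 3.8, "GP_C": 3.0},
    {"GP_A": 5.0, "GP_B": 1.9},
    {"GP_D": 6.3},
]
kept = reproducible_glycopeptides(day7)
```

Here only `GP_A` is retained: `GP_B` and `GP_C` each pass the score cutoff in a single replicate, and `GP_D` is seen in only one replicate.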

Combining results from all four time points resulted in the identification of a total of 2092 glycopeptides, 387 glycosylation sites, and 226 glycoproteins. By analyzing data from each time point, we observed temporal dynamics of glycosylation events along with the progression of the restenotic lesion, with a gradual increase in glycosylation in the early phase of the vascular response post-angioplasty (i.e., from d 0 to d 7), followed by a decrease in the late phase of restenosis progression (i.e., from d 7 to d 14) (Figure 5a, b). Indeed, it has been widely acknowledged that the acute and subacute window following angioplasty features the most active cellular events in the vessel wall, such as cell proliferation, migration, and apoptosis, and d 7 post-procedure is frequently used to mark the peak of the proliferative state of vascular smooth muscle cells [73]. Subsequently, vascular cells become relatively quiescent, and restenosis progression reaches a plateau during the late phase of vascular repair post-injury, which would explain the observed reduction of overall glycosylation during the transition from d 7 to d 14. As intact glycopeptides were analyzed, details about microheterogeneity were gained, with multiple glycoforms mapped to the same glycosylation site. While most peptides identified (80.0%) had fewer than five glycan compositions at a given site, a few (8.6%) carried more than 10 glycan compositions (Figure 5c). For example, 83 unique glycan masses were identified at lumican Asn 127 (Supplementary Table S3) across the four time points, suggesting highly diverse microheterogeneity. Specifically, 76 glycoforms were observed in healthy rats and, after a sharp drop to 53 at d 3, the number gradually went up to 59 at d 14, suggesting a varied extent of glycosylation and possible functional modulation. Lumican is an extracellular matrix protein that is extensively modified by keratan sulfate in cornea [74] but occurs predominantly as a glycoprotein with little or no sulfate in arteries [75,76,77]. Therefore, the large number of glycoforms could be derived from various lengths of lactosaminoglycan with or without fucose and NeuAc. Some of these glycopeptides, which co-elute with identical retention times, might also be derived from in-source fragmentation. The accumulation of lumican deposition has long been established as an indicator of cardiovascular diseases such as atherosclerosis, restenosis, and aneurysm [78, 79]. Detailed information on its primary structure with glycosylation would help elucidate associated functions. Another interesting protein was LRP1, the rat homolog of human prolow-density lipoprotein receptor-related protein 1. Twenty-two glycosites were identified, and they were either previously documented sites or sites identical to those in human LRP1 [80, 81]. LRP1 is a key suppressor in preventing atherosclerosis and restenosis [82, 83], and N-linked glycosylation is known to influence its proper folding and, more importantly, its resistance to γ-secretase [84, 85]. LRP1’s atheroprotective function heavily relies on its cleavage by γ-secretase, as its anti-inflammatory signaling is entirely mediated by its intracellular domain, which is cleaved and released from the membrane-bound form of intact LRP1. Previous studies have suggested that hyperglycosylation, particularly N-linked glycosylation, would render LRP1 resistant to γ-secretase [86]. We observed a hyperglycosylation trend, with the largest number of glycoforms identified at d 7, indicating that the LRP1 protein folding pattern as well as its resistance to γ-secretase could be altered as restenosis progressed. A further biological study is underway to determine how these differentially glycosylated proteins could affect diseased phenotypes of vascular cells, and ultimately to evaluate their potential as intervention targets for restenosis treatment.
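The microheterogeneity summary (number of distinct glycan compositions per glycosite, binned as fewer than five, five to ten, and more than ten) can be sketched as below. The tuple layout, protein and site labels, and glycan strings are hypothetical, and the middle bin boundaries are an assumption inferred from the two categories named in the text.

```python
from collections import defaultdict

def glycoforms_per_site(identifications):
    """Count distinct glycan compositions per (protein, site).

    identifications: iterable of (protein, site, glycan_composition) tuples.
    """
    glycans_by_site = defaultdict(set)
    for protein, site, glycan in identifications:
        glycans_by_site[(protein, site)].add(glycan)
    return {site: len(glycans) for site, glycans in glycans_by_site.items()}

def bin_sites(counts):
    """Bin glycosites by their number of glycoforms (<5, 5-10, >10)."""
    bins = {"<5": 0, "5-10": 0, ">10": 0}
    for n in counts.values():
        if n < 5:
            bins["<5"] += 1
        elif n <= 10:
            bins["5-10"] += 1
        else:
            bins[">10"] += 1
    return bins

# Toy identifications with invented labels; duplicates are counted once per site.
toy = [("ProtA", "N127", f"glycan_{i}") for i in range(12)]
toy += [("ProtB", "N88", "glycan_0"), ("ProtB", "N88", "glycan_0"), ("ProtB", "N88", "glycan_1")]
toy += [("ProtC", "N410", f"glycan_{i}") for i in range(6)]

site_counts = glycoforms_per_site(toy)
site_bins = bin_sites(site_counts)
```

Using sets per site ensures that the same glycan composition observed in several spectra or time points contributes only once to the glycoform count for that site.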

Figure 5

Glycosylation identifications with four biological replicates from uninjured and injured carotid arteries at 3, 7, and 14 d post-injury. (a) Unique glycopeptides and (b) unique glycosites and glycoproteins. Each column represents mean + standard deviation. (c) Pie chart of number of glycoforms mapped to each glycosite. Each segment represents how many glycosites have a particular number of glycoforms

Conclusions

Enabled by the recently introduced EThcD fragmentation technique and bioinformatics tools, we have moved one step forward towards large-scale characterization of intact glycopeptides. While ETD and CID/HCD each have their own merits, we developed a redefined EThcD approach that combines their unique features in pursuit of intact glycopeptide sequencing. This hybrid approach yields much more informative spectra because both amide bonds and glycosidic bonds are cleaved, providing information about peptide sequence and residue-specific glycosite attachment simultaneously. Duty cycle is improved because the EThcD approach takes less than half of the reaction time that traditional ETD uses and only one spectrum is required, instead of several consecutive CID/HCD and ETD scans acquired to collect sufficient information. This one-spectrum approach also simplifies data processing by avoiding the need to combine information from multiple different scans. Its feasibility was demonstrated with a simple fetuin standard, complex human plasma, and rat carotid artery samples. Identification of previously documented glycosylation sites supports its reliability, and identification of over a thousand N-glycopeptides in rat carotid artery during restenosis progression demonstrates its utility in large-scale glycoproteomics studies and provides novel targets for future investigation. The effects of using databases with different N-glycan compositions were also explored. As expected, a more comprehensive glycan database facilitates more accurate and reliable glycopeptide discovery. We believe that such a workflow can be coupled with the HCD product-dependent EThcD function to further improve sensitivity and specificity by performing EThcD only if diagnostic oxonium ions are detected in the prior HCD scan. The integration of these advanced workflows will accelerate the pace of in-depth glycopeptide mapping.