Introduction

N-linked glycosylation, one of the most common protein post-translational modifications (PTMs), plays important roles in many biological processes (such as protein folding and cell-cell interaction); its aberrant regulation has been widely linked to malignant diseases (such as cancers) [1,2,3,4,5,6,7,8,9].

With the successful development of efficient enrichment materials and high-resolution, high-end mass spectrometry (MS) with powerful MSn (such as Orbitrap), MS-based glycomics has become one of the state-of-the-art instrumental methods for high-throughput qualitative and quantitative analysis of N-glycans from complex biological systems [10,11,12]. Qualitative analysis of N-glycans (i.e., comprehensive characterization of topological structures including monosaccharide composition and sequence as well as the glycosidic linkage) can, in principle, be achieved with efficient dissociation in tandem MS, where sufficient product ions (especially structural diagnostic ones) are obtained; exploration of efficient dissociation methods and conditions as well as comprehensive investigation of fragmentation pathways and product ions are thus of paramount importance [13,14,15,16,17,18,19]. Sagi et al. observed D and A (0,2A and 2,4A) ions which were useful to identify branching as well as the positions of fucose and sialic acid in low-energy CID of human erythropoietin N-glycans on a Q-TOF tandem MS with negative ESI [20]; Vakhrushev et al. also observed 0,2An ions and used them to identify N-glycans containing a HexNAc at the reducing end and O-glycans (GlcNAc and GalNAc) when they analyzed O- and N-glycans in urine samples of patients suffering from congenital disorder using exactly the same instrument [21]. In CID of three kinds of synthetic glycopeptides on an LTQ-FTICR MS, Yu et al. found that fragmentation patterns of HexNAc-derived oxonium ions (m/z 168, 144, 138, and 126) can be used to distinguish GalNAc- and GlcNAc-O-glycans; m/z 168 and 138 were more prominent for GlcNAc-containing glycopeptides, whereas m/z 144 and 126 were more abundant for GalNAc-containing glycopeptides; high-mannose N-glycopeptides generated abundant Hex-derived oxonium ions [22]. Harvey et al. studied CID fragmentation pathways of high-mannose, hybrid, and complex N-glycans as well as glycans containing one N-acetylglucosamine in the core on a Q-TOF MS, where the glycans were released from well-characterized glycoproteins (ribonuclease B, chicken ovalbumin, and bovine fetuin) [23,24,25,26,27]; cross-ring cleavage ions and C ions are dominant in the negative mode, while B- and Y-type glycosidic fragments are dominant in the positive mode. Our recent development of the N-glycan database search engine GlySeeker [28] has made large-scale identification of N-glycans possible; 559 and 214 unique N-glycans were successfully identified from the OVCAR-3 ovarian cancer cells [28] and the human normal liver LO2 cells [29], respectively. GlySeeker automatically outputs the list as well as the isotopic envelope fingerprinting details of every matched product ion (MP), which also makes fragmentation pathway analysis of N-glycans convenient.

Here, we report our large-scale identification and fragmentation pathway analysis of N-glycans from mouse brain, which is a widely used model to study biological and pathological processes such as Alzheimer's disease and other diseases [30,31,32,33,34,35]. With three technical replicates of RPLC-ESI-MS/MS analysis each in both positive and negative modes, 240 unique N-glycans with comprehensive topological structural information (monosaccharide composition and sequence as well as glycosidic linkages) were identified with FDR ≤ 1% and NoBHs (number of best hits) = 1. We also report our systematic investigation of fragmentation pathways of N-glycans in both the positive and negative modes under higher energy collisional dissociation (HCD), which has been widely adopted for N-glycan fragmentation; the obtained statistical fragmentation patterns are useful for choosing appropriate product ions for efficient database search as well as accurate identification of N-glycans.

Experimental

Chemicals and Reagents

BCA assay kit, ammonium bicarbonate (ABC, Reagent Grade), and 20× phosphate-buffered saline (PBS, DEPC-treated) were purchased from Sangon Biotech (Shanghai, China). Methyl iodide was purchased from Micxy Reagent (Sichuan, China). PMSF was purchased from Life Science Products & Services (Shanghai, China). Sodium hydroxide, dithiothreitol (DTT), PNGase F (P7367, for proteomics), formic acid (FA, eluent additive for LC-MS, 56302), protease inhibitor cocktail (P8340), and all solvents (acetonitrile and methanol) were purchased from Sigma Aldrich (St. Louis, MO, USA). Ultrapure water was produced on site by Millipore Simplicity System (Billerica, MA). Six-week-old male and female C57BL/6 mice were obtained from Sino-British Sippr/BK Laboratory Animal Ltd. (Shanghai, China). Brains were carefully removed.

Preparation of N-Glycans from Mouse Brain Tissue

Protein isolation buffer (100 mM DTT, 10 mM Tris-HCl, pH 8.0) was freshly prepared. One mouse brain (ca. 400 mg) was cut into small pieces, washed with 1× PBS, and then lysed on a homogenizer (Kuo Wang Instrument Manufacturing Co., Ltd., Changzhou, China) with 2 mL isolation buffer supplemented with 20 μL protease inhibitor cocktail for 20 s at 10,000 r/min; after cooling down, the homogenization was repeated once. After sitting on ice for 15 min, the lysate was centrifuged for 10 min (4 °C; 16,000 g) and the supernatant protein mixture was collected. After the concentration was measured by BCA assay, the protein mixture was aliquoted into 1.5 mL centrifuge tubes and stored in a −80 °C freezer for future use.

For N-glycan preparation, 80 μL protein solution (ca. 650 μg) was first mixed with 500 μL ultrapure water, 150 μL ABC (100 mM), and 24 μL denaturing buffer (0.5% SDS and 40 mM DTT), and heated to 100 °C for 10 min. After cooling down, the denatured proteins were incubated with PNGase F (3 μL, 0.25 units/μL) at 37 °C for 24 h. N-glycans were enriched from the digestion solution with a PGC-SPE column (Thermo Scientific, San Jose, USA) according to the manufacturer’s protocol. After 20 rounds of binding, the column was washed five times, each with 100 μL ACN solution (99.5% ACN with 0.5% FA); N-glycans were finally eluted with 80% ACN (0.5% FA), dried on a SpeedVac (Thermo Scientific, San Jose, USA), and stored in a −80 °C freezer. The above steps were repeated five times to prepare a total of six N-glycan samples; three of them were permethylated before C18-RPLC-MS/MS analysis in the positive ESI mode, and three of them were resuspended in 30 μL H2O for direct PGC-LC-MS/MS analysis in the negative ESI mode.

Solid-Phase Permethylation of the Mouse Brain N-Glycome Using CH3I

The N-glycan mixture prepared above was permethylated using CH3I according to the reported procedure [36,37,38]. First, a NaOH microspin column was made by adding 100 mg NaOH powder into a 250-μL pipette tip fitted to a 1.5-mL centrifuge tube; the NaOH was pre-conditioned twice with 200 μL ACN. The N-glycan sample was dissolved in a mixture of 4 μL ultrapure water, 40 μL CH3I, and 100 μL DMSO, and then loaded onto the microspin column. The column was centrifuged at 1000 g for 2 min, and the obtained solution was re-loaded; this loading-centrifugation process was repeated four times. After the addition of 20 μL CH3I, the eluted sample was incubated for 20 min at RT and went through a final round of loading and centrifugation on the microspin column. The column was washed twice with ACN (100 μL each time), and the eluates were combined. The permethylated N-glycans in the eluates were extracted with 200 μL chloroform and repeatedly washed with 1500 μL ultrapure water [37, 38]; the chloroform solution was then dried on the SpeedVac. The obtained pellets of permethylated N-glycans were resuspended in 45 μL water for further analysis.

LC-MS/MS Analysis of Native and Permethylated N-Glycans from Mouse Brain

Permethylated and native N-glycans were analyzed on a Q Exactive Orbitrap mass spectrometer (Thermo Scientific, San Jose, CA, USA) coupled with a nano-ESI source and a Dionex Ultimate3000 RSLC nano HPLC system.

For permethylated N-glycans, the separation was carried out on a 75-cm-long analytical column (360 μm o.d. × 75 μm i.d.) packed with C18 particles (Agilent ZORBAX 300SB, 5 μm, 300 Å); buffer A was composed of 99.8% H2O and 0.2% FA, and buffer B was composed of 99.8% ACN and 0.2% FA. The flow rate of the mobile phase was 200 nL/min with a multi-step gradient: 1% B, 10 min; 1–25% B, 20 min; 25–75% B, 270 min; 75–95% B, 20 min; 95% B, 10 min; 95–1% B, 10 min. MS spectra were acquired as follows: m/z range 500–2500, mass resolution 35 k, automatic gain control (AGC) target 5e5, max ion injection time 200 ms; MS/MS spectra were acquired in the top-20 data-dependent mode with the following settings: mass resolution 17.5 k, AGC target 1e5, max ion injection time 200 ms, dynamic exclusion 50 s, HCD normalized collision energy 10%, isolation window 3 Th. ESI was operated in the positive mode under the following conditions: spray voltage 2.8 kV, capillary temperature 250 °C, and S-lens RF level 75 V. With the above RPLC-MS/MS settings, three technical replicate datasets (TR_I, TR_II, and TR_III) were acquired for the permethylated N-glycans.

For native N-glycans, three technical replicate LC-MS/MS datasets were acquired with the same settings except that the separation was carried out on a PGC column and ESI was operated in the negative mode. The PGC column was packed with HyperCarb particles (Thermo, 5 μm, 250 Å); buffer A was 20 mM ABC, and buffer B was 20 mM ABC with 80% ACN; the flow rate of the mobile phase was 300 nL/min with the following gradient: 10% B, 15 min; 10–40% B, 5 min; 40–75% B, 25 min; 75% B, 5 min.

Identification and Fragmentation Pathway Analysis of N-Glycans

Database search of the LC-MS/MS (HCD) datasets from the analysis of the permethylated and native N-glycans (three technical replicates each) was done with the N-glycan database search engine GlySeeker [28], which has been reported in detail elsewhere; only a brief description is given here. The theoretical human N-glycan database containing 79,611 N-glycans was built using the retrosynthetic strategy, and the corresponding decoy database was created by adding a random mass in the range of 1–30 Da to all the isotopic peaks of all the product ions. The search parameters of isotopic peak abundance cutoff (IPACO), isotopic peak m/z deviation (IPMD), and isotopic peak abundance deviation (IPAD) for the matched experimental precursor and product ions were 40%/15.0 ppm/50% and 20%/15.0 ppm/30%, respectively; in addition, the number of matched B/Y ions for every N-glycan spectrum match (GSM) had to be no less than 5. For each MS/MS spectrum, only the N-glycan(s) with the highest P score (i.e., best hits) were kept as GSM(s); “NoBH = 1” in this study means that only one putative structure/linkage is identified with the highest P score for a given precursor ion, and this is both database- and scoring-dependent. The P scores of GSMs from the forward and decoy searches were pooled and sorted from low to high, and a cutoff P score was then chosen for FDR control; fragmentation pathways were analyzed manually for each matched N-glycan using the output MP information.
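The 15.0 ppm IPMD tolerance among the search parameters above can be illustrated with a minimal Python sketch. It assumes that IPMD is the signed relative difference between experimental and theoretical m/z expressed in ppm; the function names `ppm_deviation` and `within_tolerance` are hypothetical and are not taken from GlySeeker.

```python
def ppm_deviation(experimental_mz: float, theoretical_mz: float) -> float:
    """Signed m/z deviation in parts per million (assumed definition of IPMD)."""
    return (experimental_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(experimental_mz: float,
                     theoretical_mz: float,
                     tol_ppm: float = 15.0) -> bool:
    """True if the absolute deviation does not exceed the tolerance."""
    return abs(ppm_deviation(experimental_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Illustrative values only, not taken from the datasets.
    print(round(ppm_deviation(1000.01, 1000.0), 3))
    print(within_tolerance(1000.01, 1000.0))
```

Under this assumed definition, a peak passes the m/z criterion only when its absolute deviation is at most 15.0 ppm; the abundance criteria (IPACO and IPAD) would be applied separately.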

Results and Discussion

After PNGase F digestion and PGC enrichment, permethylated and native N-glycans of mouse brain were analyzed with RPLC-ESI(+)-MS/MS and PGC-LC-ESI(−)-MS/MS, respectively, each with three technical replicates; the raw datasets were then searched by GlySeeker against both the forward and decoy databases, and N-glycan IDs were obtained with target-decoy searches and a number of best hits of 1.
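The target-decoy filtering used to obtain these IDs can be sketched as follows. This is a minimal illustration, not GlySeeker's implementation: it assumes that a higher P score indicates a better match and that FDR at a given cutoff is estimated as the number of decoy GSMs divided by the number of target GSMs at or above that cutoff. The function name `choose_cutoff` is hypothetical.

```python
from typing import Iterable, Optional


def choose_cutoff(target_scores: Iterable[float],
                  decoy_scores: Iterable[float],
                  max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest P-score cutoff whose estimated FDR is <= max_fdr.

    Assumption: FDR(cutoff) = (# decoy GSMs >= cutoff) / (# target GSMs >= cutoff).
    Returns None if no cutoff satisfies the threshold.
    """
    targets = list(target_scores)
    decoys = list(decoy_scores)
    best = None
    # Candidate cutoffs are the observed target scores, visited from high to low.
    for cutoff in sorted(set(targets), reverse=True):
        n_target = sum(1 for s in targets if s >= cutoff)
        n_decoy = sum(1 for s in decoys if s >= cutoff)
        if n_decoy / n_target <= max_fdr:
            best = cutoff  # keep the lowest passing cutoff seen so far
    return best
```

GSMs from the forward search with a P score at or above the returned cutoff would then be retained as N-glycan IDs.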

For permethylated N-glycans analyzed in the positive mode, 87, 94, and 117 N-glycans were identified from the three technical replicate datasets, and the detailed information (including spectrum index, retention time, experimental m/z, z, theoretical m/z, IPMD, molecular formula, monosaccharide composition, matched B/Y/C/Z ions, P score, and primary structure in one-line text format) for these N-glycan IDs is listed in Table S1, Table S2, and Table S3, respectively. There are 54 common N-glycans among the three technical replicates (Figure 1A), and the combined total number of unique N-glycan IDs is 153.

Figure 1

The Venn diagram of the permethylated (a) and native (b) N-glycans identified from the mouse brain in the positive and negative ESI modes each with three LC-MS/MS technical replicate datasets; in combination, there are 240 unique N-glycan IDs in total from the two modes

For native N-glycans analyzed in the negative mode, 41, 67, and 44 N-glycans were identified from the three technical replicate datasets, and the detailed information (including spectrum index, retention time, experimental m/z, z, theoretical m/z, IPMD, molecular formula, monosaccharide composition, matched B/Y/C/Z ions, P score, and primary structure in one-line text format) for these N-glycan IDs is listed in Table S4, Table S5, and Table S6, respectively. There are 19 common N-glycans among the three technical replicates (Figure 1B), and the combined total number of unique N-glycan IDs is 88.

There are 17 common N-glycans between the positive and negative modes, and there are 240 unique N-glycan IDs in total from the two modes. Although far fewer N-glycans are identified in the negative mode, it is still a good complement to the positive mode.
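The common and unique counts reported for the replicates and the two modes are set intersections and unions of the N-glycan ID lists. A minimal Python sketch is given below; the function names and the IDs `G1` to `G5` are hypothetical placeholders, not identifiers from the supplementary tables.

```python
from typing import Iterable, Set


def common_ids(*id_lists: Iterable[str]) -> Set[str]:
    """IDs present in every list (the central region of a Venn diagram)."""
    sets = [set(ids) for ids in id_lists]
    return set.intersection(*sets) if sets else set()


def unique_ids(*id_lists: Iterable[str]) -> Set[str]:
    """IDs present in at least one list (the combined total of unique IDs)."""
    return set().union(*(set(ids) for ids in id_lists))


if __name__ == "__main__":
    # Hypothetical replicate results for illustration only.
    tr_1 = ["G1", "G2", "G3"]
    tr_2 = ["G2", "G3", "G4"]
    tr_3 = ["G3", "G5"]
    print(sorted(common_ids(tr_1, tr_2, tr_3)))
    print(len(unique_ids(tr_1, tr_2, tr_3)))
```

The same two functions apply to the comparison between the positive and negative modes, provided the structures are written in one consistent one-line text format in both ID lists.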

For the permethylated N-glycans identified in the positive mode, the percentages of complex, hybrid, and high-mannose N-glycans are 70%, 24%, and 6%, respectively, which follow the same rank order as those in the theoretical human N-glycan database described above (92.34%, 7.61%, and 0.04%), but with the latter two types more favorably identified (Figure S1). For the N-glycans identified in the negative mode without any derivatization, the percentages of complex, hybrid, and high-mannose N-glycans are 40%, 48%, and 12%, respectively, where hybrid N-glycans are much more favorably identified.

In terms of composition (Figure S2), most of the N-glycans identified in the positive mode contain either fucose (73%) or fucose + sialic acid (11%); in the negative mode, N-glycans containing sialic acid (27%) or sialic acid+fucose (40%) are more favorably identified.

For each N-glycan ID, GlySeeker outputs the detailed information for all the matched product ions. After statistical analysis of the type and monosaccharide composition for all the glycan IDs, further statistical investigation was carried out for the fragmentation pathways and matched product ions, which can serve as the basic reference for choice of appropriate fragmentation pathways and types of product ions for efficient database search as well as accurate identification of N-glycans.

In the positive mode, 54 common N-glycans were identified among the three C18-RPLC-ESI(+)-MS/MS technical replicates, and the fragmentation pathways of these N-glycans were investigated systematically and statistically. Among the 54 N-glycans, the numbers of complex, hybrid, and high-mannose N-glycans are 32, 18, and 4, respectively. The percentages of experimentally observed matched B/Y/C/Z ions were calculated and compared at the glycan-type level (Figure 2A); A/X ions are rarely observed here. First, B ions have the highest percentages and C ions the lowest, while the percentages of Y and Z ions are about the same; second, the percentages of B/Y/C/Z ions do not vary significantly with the N-glycan type. The unusually high percentage of Y ions in high-mannose N-glycans arises because different antennae share many Y ions, which are counted repeatedly; for 01Y41Y41M(31M21M21M)61M(31M)61M, the YIII3-1+ and YI3-1+ ions are exactly the same, and the YI4-1+, YII4-1+, and YIII5-1+ ions share the same m/z of 1747.885 as well.

Figure 2

The occurrence ratios of B/Y/C/Z ions for the N-glycans identified from mouse brain in the positive (a) and negative (b) modes, where the ratio for each type of ion is calculated as the number of experimentally observed product ions divided by the corresponding theoretical number (# of glycosidic bonds minus one)

In the negative mode, 19 common N-glycans (7 complex, 4 hybrid, and 8 high-mannose) were identified among the three PGC-LC-ESI(−)-MS/MS technical replicates, and the percentages of experimentally observed matched B/Y/C/Z ions were calculated and compared (Figure 2B); X ions were rarely observed, and A ions are discussed separately. Compared with the protonated permethylated N-glycans dissociated in the positive mode, deprotonated native N-glycans have different fragmentation patterns. The percentage of observed C ions is much higher than that in the positive mode and is as high as that of the observed B ions; the percentage of observed Y ions is a little lower than that in the positive mode, whereas the percentage of observed Z ions is almost unchanged.
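The occurrence ratio defined in the caption of Figure 2 (observed ions of a given type divided by the corresponding theoretical number) can be sketched in Python as follows. The function name, the ion labels, and the input layout are hypothetical; the theoretical number is passed in as a parameter instead of being derived from the structure.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def occurrence_ratios(observed_ions: Iterable[Tuple[str, str]],
                      theoretical_count: int,
                      ion_types: str = "BYCZ") -> Dict[str, float]:
    """Ratio of observed to theoretical product ions for each ion type.

    observed_ions: (ion_type, label) pairs, e.g. ("B", "B2").
    theoretical_count: theoretical number of ions per type for this glycan.
    """
    if theoretical_count <= 0:
        raise ValueError("theoretical_count must be positive")
    counts = Counter(ion_type for ion_type, _label in observed_ions)
    return {t: counts.get(t, 0) / theoretical_count for t in ion_types}


if __name__ == "__main__":
    # Hypothetical matched ions for one glycan, for illustration only.
    matched = [("B", "B2"), ("B", "B3"), ("Y", "Y1"), ("Z", "Z1")]
    print(occurrence_ratios(matched, theoretical_count=4))
```

Because ions are counted per label, Y ions that carry different antenna labels but share the same m/z are counted once per label in this sketch, which mirrors the repeated counting noted for high-mannose N-glycans.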

While B/Y/C/Z ions from inter-monosaccharide glycosidic cleavages are important for sequence identification, A/X ions from intra-monosaccharide ring cleavages are essential for differentiation of linkage positions. The specific molecular formulas for the two types of six-membered monosaccharides with different combinatorial cleavage sites are summarized in Table 1 to facilitate automated interpretation by GlySeeker.

Table 1 Elemental Compositions of A and X Ions for Putative Intra-Ring Cleavages of Six-Membered Monosaccharides

We counted the occurrence ratios of A ions from the 19 common N-glycans among the three technical replicates in the negative mode (Figure 3).

Figure 3

The occurrence ratios (exp./theo) of A ions for the N-glycans identified from mouse brain in the negative mode

A ions with (0, 4)/(1, 3)/(2, 4) cross-ring cleavages (common composition of C2H4O2) have the highest occurrence ratio of 85 ± 4%, followed by A ions with the (0, 2) cross-ring cleavage (composition, C4H8O4) at 53 ± 5% and A ions with the (1, 5) cross-ring cleavage (composition, C5H10O4) at 37 ± 2%. A ions with (0, 3) and (1, 4) cross-ring cleavages share the same ratio of 28 ± 2%. A ions with (2, 5) and (3, 5) cleavages are rare, with low occurrence ratios of 5 ± 0% and 2 ± 2%, respectively.
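The elemental compositions quoted for these cross-ring cleavages can be converted to monoisotopic masses with a short Python sketch, which may help when checking mass differences in a spectrum. The function name `monoisotopic_mass` and the small element table are assumptions made for illustration; how each composition maps onto a specific A-ion m/z is defined by Table 1 and is not assumed here.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C2H4O2' (no brackets)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


if __name__ == "__main__":
    for composition in ("C2H4O2", "C4H8O4", "C5H10O4"):
        print(composition, round(monoisotopic_mass(composition), 4))
```

The three compositions evaluate to approximately 60.0211, 120.0423, and 134.0579 Da, respectively.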

The relative abundance of the observed matched B/Y/C/Z/A ions for the N-glycans identified in both the positive and negative modes was investigated and compared side by side; the observed matched Z ions tend to have the highest intensities in the positive mode, while the observed matched C ions tend to have the highest intensities in the negative mode; both Z and B ions are normally observed in the two modes. The relative abundance of the matched ions of the example common N-glycan 01Y41Y41M(31M(21Y31F)41Y31F)61M61M, identified in both the positive and negative modes, is shown in Figure 4A and B, respectively; the relative abundance for the other 4 common N-glycan IDs identified in the two modes is also compared (Figures S3, S4, S5, and S6).

Figure 4

The observed matched B/Y/C/Z/A ions from HCD of 01Y41Y41M(31M(21Y31F)41Y31F)61M61M identified from mouse brain in positive mode (a) and negative mode (b)

It should be noted that some spectra were likely generated from multiple isomers, which could lead to misidentification, and that the NoBH = 1 result does not guarantee that the correct structure(s) are identified. Linkage isomers in this study are putative isomers created in the theoretical database and are reported on the basis of target-decoy searches. The uniqueness of each structure can be further validated biologically with refined theoretical databases and analytically with cross-ring fragments from multi-stage tandem mass spectrometry.

Conclusion

Permethylated and native N-glycans of mouse brain were analyzed with C18-RPLC-ESI(+)-MS/MS (HCD) and PGC-LC-ESI(−)-MS/MS (HCD), respectively, each with three technical replicates; after database search with the N-glycan database search engine GlySeeker, a total of 240 N-glycans were identified with target-decoy searches and comprehensive putative topological structures (monosaccharide composition and sequence as well as glycosidic linkages). The actual FDR is likely to be much higher because of, among other factors, the adoption of the retrosynthetic N-glycan database. For HCD of the permethylated N-glycans in the positive mode, B ions are the most dominant product ions followed by Y and Z ions; for HCD of the native N-glycans in the negative mode, C and B ions are the most dominant product ions followed by Z ions; X ions were rarely observed in the two modes, and A ions were mostly observed in the negative mode with the native N-glycans. A ions with (0, 4)/(1, 3)/(2, 4) cross-ring cleavages are dominant, followed by (0, 2), (1, 5), (0, 3), and (1, 4); cleavages of (2, 5) and (3, 5) are rarely observed. Further adoption of reduction before permethylation would help remove the split peaks arising from anomeric isomers and improve RPLC separation. In this study, all the results, especially the common N-glycans, were corrected manually according to the standard deviation of their retention times. An improved algorithm will be considered in further studies to make the matching process more automatic and precise.