Introduction

Glycosylation is a posttranslational modification that plays an important role in many cellular processes such as protein conformation, folding, transport, targeting, and stability [1]. Cell growth, differentiation, cell–cell communication, cellular trafficking, immune response, microbial pathogenesis, and other biological processes are influenced by glycosylation [2, 3]. Protein glycosylation is often found to vary with disease progression [4–6]. It is therefore not surprising that a significant number of known and Food and Drug Administration-approved disease biomarkers are glycoproteins [7], including CA 15-3 [8] and c-erb-2 [9] for breast cancer, prostate-specific antigen [10] for prostate cancer, CA125 [11] for ovarian cancer, and carcinoembryonic antigen for ovarian, prostate, and colon cancers [12]. Glycoproteins represent an interesting subproteome that can be mined to improve the chances of discovering robust, specific, and selective putative biomarkers for diseases.

Biofluids such as serum/plasma, saliva, and urine are excellent starting points for such biomarker discovery studies. Human saliva is gaining popularity as a body fluid for diagnostic purposes. Its ready availability, ease of collection, and ease of storage make it a potentially ideal diagnostic medium. Whole saliva has contributions from the three primary salivary glands: the parotid (PA), submandibular (SM), and sublingual (SL) glands, in addition to other minor contributions (e.g., from numerous minor glands in the lip, cheek, tongue, and palate; microbes; epithelial cells; nasal and bronchial secretions; and serum products).

Many types of posttranslational modifications have been observed on salivary proteins [13, 14]. Some of the common modifications include glycosylation, phosphorylation, sulfation, acylation, deamidation, and proteolysis. Glycosylation is a posttranslational modification that has important functions in the oral cavity such as lubrication and protection of the oral cavity and teeth from oral pathogens, chemicals, and mechanical wear and tear. Some of the salivary glycoproteins involved in lubrication include proline-rich glycoproteins and mucins. Proline-rich glycoproteins are primarily secreted by the PA glands and are N-glycosylated. The oligosaccharides make up as much as 50% of their weight and are responsible for their lubricating properties [15]. Mucins are heavily O-glycosylated, which explains their viscous nature and their profound role in oral lubrication [16, 17]. The sugars also help mucins bind to surfaces in the mouth to protect them from chemicals, wear and tear, and microbes [18]. Other functions of salivary glycoproteins include binding to oral pathogens and eliminating them from the oral cavity; proline-rich glycoproteins, mucins, and salivary agglutinins have an especially important part in these functions. Proline-rich glycoproteins bind Fusobacterium nucleatum, which causes periodontal diseases, and may be responsible for its clearance from the mouth [19]. Mucins also bind and aggregate bacteria such as Helicobacter pylori and flush them from the oral cavity [20]. Salivary agglutinins efficiently adhere to and help rid the oral cavity of Streptococcus mutans [21]. In addition, agglutinins have shown an influenza A-neutralizing effect [22]. After secretion of saliva from the salivary glands into the mouth, glycoproteins may undergo deglycosylation. This might have interesting consequences for oral pathogens and the function of glycoproteins.
Many bacteria are known to deglycosylate glycoproteins such as mucins and utilize the released sugars for their growth [23]. Some Streptococcus species secrete neuraminidases that cleave the terminal sialic acid from mucins. This affects the ability of mucins to bind and aggregate bacteria such as Streptococcus sanguis [24]. The removal of sialic acid may also help bacteria bind to sugars on glycoproteins and colonize in the oral environment [14].

Mapping the salivary glycoproteome will be an important step toward using saliva for disease diagnosis. Whole saliva is far easier to collect compared to ductal saliva, and whole saliva will likely have more applications for disease biomarker studies compared to PA, SM, and SL saliva. However, mapping the N-glycoproteome of PA, SM, and SL saliva is important for a thorough understanding of biological processes occurring in the oral cavity and to realize the role of saliva in the overall health of human individuals.

Several methods for global glycoprotein profiling of biofluids have been explored to date, and many of the same techniques have also been applied to enrich for disease biomarkers. Some of the common methodologies tested include lectin affinity purification [25], hydrazide-based chemical derivatization for glycoprotein capture, and hydrophilic affinity capture methods. Lectins are a class of proteins found in plants, bacteria, fungi, and animals that bind to specific oligosaccharides on glycoproteins [26]. Whereas the use of a single lectin can help to enrich for a small group of glycoproteins, the use of multiple lectins increases the chances of isolating a broad spectrum of glycoproteins. This method, termed multilectin affinity chromatography, has been used to map glycoproteins in plasma and serum [27–29]. To enrich for glycoproteins for which lectins do not have a strong specificity, the use of serial lectin affinity chromatography (SLAC) has been reported [30]. Because there are few reported lectins with sufficient selectivity for O-linked glycoproteins, SLAC has been used to isolate and identify O-linked glycopeptides in human serum [30]. There, the broad-spectrum lectin concanavalin A was used to deplete N-linked glycoproteins and to enrich for O-linked glycoproteins. Jacalin is a lectin that has affinity for the GalNAc core of O-linked glycans but also for high mannose-type sugars found in N-linked glycoproteins. The subsequent application of the N-glycoprotein-depleted sample to a Jacalin column improved recovery and identification of O-linked glycoproteins [30]. Lectin affinity chromatography has also been employed to search for disease biomarkers in human plasma and serum [31, 32].

Phenyl boronic acid is considered a pseudolectin because it covalently attaches to cis-diol containing groups on the sugars of glycoproteins. This bond is fairly stable under alkaline conditions. Thus, boronic acids can be utilized to select for a broader group of glycoproteins including N-, O-, and C-linked glycoproteins. Lectins and boronic acid can be combined in a single experiment to enrich for glycoproteins [33]. Sparbier and coworkers used a combination of lectin and boronic acid magnetic beads to affinity select glycoproteins from human serum [34].

Hydrophilic affinity-based capture methods have been extended to isolate glycoproteins from complex mixtures. Because sugars are very hydrophilic, glycoproteins lend themselves well to being captured onto a hydrophilic stationary phase. Several hydrophilic materials have been tested in the past, including carbohydrate gel matrices such as sepharose and cellulose and hydrophilic interaction chromatography (HILIC) columns. Cellulose and sepharose matrices were successfully employed to isolate glycoproteins from human serum [35]. HILIC using a resin containing a polyhydroxyethyl aspartamide group has been used to separate sialylated glycopeptides from recombinant human interferon-γ [36]. Hagglund and coworkers used a resin with a covalently bound neutral, zwitterionic sulfobetaine functional group to perform zwitterion chromatography–HILIC (ZIC-HILIC). Glycopeptides were separated via ZIC-HILIC from nonglycopeptides in trypsin digest mixtures and were then partially deglycosylated to identify glycoproteins in human plasma [37, 38].

Larsen and coworkers published an elegant method to isolate sialic acid-containing glycoproteins in plasma and whole saliva. Their method employs titanium dioxide (TiO2), which has an affinity for acidic groups. TiO2 has commonly been used to purify and measure phosphoproteins [39]. However, it also binds to sialic acid-containing glycoproteins. After removing phospho-groups on phosphopeptides by alkaline phosphatase treatment, the peptide mixture was passed through the TiO2 column to isolate sialylated glycopeptides, and the glycopeptides were subsequently analyzed by mass spectrometry (MS). Using this method, sialylated proteins in plasma and whole saliva were identified [40].

Zhang and coworkers published a landmark paper describing the isolation of formerly N-linked glycopeptides from complex mixtures using hydrazide chemistry [41]. This method has gained popularity over the past few years. Sugars on glycoproteins are first oxidized and immobilized onto an agarose–hydrazide resin. The resin is then washed to remove nonspecifically bound proteins. The proteins bound to the resin are digested with trypsin, and tryptic peptides not covalently bound to the column are removed by extensive washing. The bound tryptic N-glycopeptides are then released using the enzyme PNGase F, and the formerly N-glycosylated peptides are analyzed by MS. The identification of the formerly N-glycosylated peptide is confirmed by the presence of an N–X–S/T motif on the peptide sequence (where X denotes any amino acid residue except proline) and the conversion of asparagines at the site for glycosylation to aspartic acid resulting in a mass increase of +1 Da. This method has since been widely applied toward the study of the N-glycoproteome of various human body fluids including plasma [42], serum [43], whole saliva [44], and cerebrospinal fluid [45]. In 2007, two independent groups published slightly modified versions of the original method described by Zhang et al. [41]. The similarity between the two methods was that the proteins initially were digested to tryptic peptides before being coupled onto the agarose–hydrazide resin. In the method published by Sun et al., excess sodium periodate, used to oxidize sugars on glycopeptides, was quenched with sodium sulfite. The method was used to identify unique proteins from microsomal fractions of the cisplatin-resistant ovarian cancer cell lines [46]. Alternately, Tian et al. removed remaining sodium periodate by desalting the oxidized glycopeptides using a C18 column [47]. 
This technique was successfully used to isolate N-linked glycopeptides from mouse plasma including some proteins that are found in low abundance in human plasma [48]. There are also differences in the wash step to remove nonspecifically bound peptides from the agarose–hydrazide resin, but the basic outline of the two procedures remains similar.

Our group previously reported, using the method of Zhang et al. [41], a global N-glycoproteome analysis of whole-saliva proteins using the hydrazide capture technique and MS, identifying 84 formerly N-glycosylated peptides from 45 glycoproteins [44]. Larsen et al. used the TiO2 enrichment strategy to isolate the whole salivary sialome and identified 97 sites of N-linked glycosylation [40]. In this paper, we extend the salivary glycoprotein catalogue using the modified hydrazide capture method of Sun et al. [46], i.e., the capture of tryptic N-glycopeptides and MS measurement of formerly N-linked glycopeptides. In addition, we compare for the first time the N-linked glycoproteins identified in PA, SM, and SL saliva.

Materials and Methods

Chemicals and Reagents

The chemicals were mostly purchased from Sigma (St. Louis, MO, USA), unless stated otherwise. Affigel Hz hydrazide gel, coupling buffer, and dithiothreitol (DTT) were obtained from Bio-Rad (Hercules, CA, USA). Tris-(2-carboxyethyl)-phosphine (TCEP) and trifluoroacetic acid (TFA) were obtained from Pierce (Rockford, IL, USA). Glycerol-free PNGase F was obtained from New England Biolabs (Ipswich, MA, USA). Sequencing-grade trypsin was procured from Promega (Madison, WI, USA).

Saliva Collection

Whole saliva was collected from healthy nonsmoking adults in the morning, at least 2 h after the last intake of food. The mouth was rinsed with water immediately before the collection. Whole saliva was collected and placed on ice. Protease cocktail inhibitor (1 μL/mL of whole saliva) was added to saliva immediately after collection to minimize protein degradation. Whole saliva was then centrifuged at 12,000 rpm at 4°C for 10 min. The supernate was collected and stored at −80°C. The pellet was saved for future analysis.

For collecting PA, SM, and SL saliva, 10 adult subjects of various ethnic and racial backgrounds, ranging in age from 22 to 30 years, were recruited. Saliva collection took place on a monthly basis and was performed between the hours of 9 and 11 a.m. Stimulated PA, SM, and SL saliva were collected by the repeated application of an aqueous citric acid solution (2%). PA saliva was obtained as the ductal secretion by using a cup-like device [49]. Separate SM and SL secretions were acquired by using a saliva collector, described by Wolff and coworkers, that was fitted with a sterile 100-μL-pipette tip [50]. Collection volumes over a 10-min period ranged from 500 to 2,000 μL for PA saliva, 50 to 100 μL for SL saliva, and 100 to 500 μL for SM saliva. PA, SM, and SL saliva from five subjects were pooled before carrying out the experiments.

Solution Isoelectric Focusing Fractionation

The procedure for solution isoelectric focusing (IEF) fractionation has been described previously [44]. Briefly, proteins in whole saliva were precipitated by mixing with four times the volume of cold ethanol and incubating at −20°C overnight. The mixture was centrifuged at 13,000 rpm for 15 min at 4°C. The pellet was resuspended in Zoom 2D protein solubilizer (Invitrogen, Carlsbad, CA, USA), protease inhibitor (Roche Diagnostic, Indianapolis, IN, USA), Tris base, DTT, and water and sonicated on ice. The pH of the solution was adjusted to pH 8.5–8.7 with 1 M Tris base and then incubated for 15 min at room temperature with shaking. Proteins were alkylated with 99% dimethylacrylamide (DMA) at room temperature for 30 min, as suggested by the protocols from Invitrogen. To quench excess DMA, DTT was added and incubated at room temperature for 5 min. After centrifuging the sample for 30 min at 13,400 rpm at 4°C, the supernate was collected. The protein concentration was determined by the noninterfering protein assay (Geno Technology, St. Louis, MO, USA). The concentration was measured to be approximately 1.5 mg/mL.

The solution of proteins in the Zoom-2D solubilizer (1.5 mg/mL, 400 μL) was diluted to a final concentration of 0.6 mg/mL in diluted buffer (Zoom IEF denaturant, Zoom focusing buffer pH 3–7, Zoom focusing buffer, pH 7–12, and 5 μL 2M DTT). Zoom IEF fractionation was performed in the standard format (pH 3.0 to 10). The diluted sample was loaded into each of the five chambers of the fractionator and subjected to solution-phase IEF. Five fractions (pI 3–4.6, 4.6–5.4, 5.4–6.2, 6.2–7.0, and 7.0–10.0) were obtained after the procedure. Proteins from each fraction were precipitated by mixing with 70% acetone, incubating at −20°C for 3–4 h, and centrifuging at 13,000 rpm for 30 min.
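The dilution step follows from conservation of protein mass (C1·V1 = C2·V2). The following is a minimal sketch of that arithmetic, assuming the entire 400 μL aliquot at 1.5 mg/mL is diluted; the function name is illustrative and is not part of the published protocol.

```python
def dilution_volumes(stock_conc, stock_vol, final_conc):
    """Return (final_volume, diluent_volume) from C1*V1 = C2*V2.

    Concentrations share one unit (here mg/mL) and volumes another (here uL).
    """
    if final_conc <= 0 or final_conc > stock_conc:
        raise ValueError("final concentration must be positive and not exceed the stock")
    final_vol = stock_conc * stock_vol / final_conc
    return final_vol, final_vol - stock_vol


# Values stated in the text: 400 uL at 1.5 mg/mL, diluted to 0.6 mg/mL.
final_vol, diluent_vol = dilution_volumes(1.5, 400.0, 0.6)
print(f"final volume: {final_vol:.0f} uL, buffer to add: {diluent_vol:.0f} uL")
```

Under that assumption, the 0.6 mg of protein ends up in roughly 1 mL of diluted sample before it is distributed across the fractionator chambers.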

Glycoprotein Enrichment

Proteins from saliva that were not previously pI fractionated were isolated by precipitation using ethanol. Proteins from Zoom IEF-separated fractions were obtained by acetone precipitation. The salivary proteins isolated by either method were resuspended in coupling buffer (100 mM sodium acetate, 150 mM sodium chloride, pH 5.5). Sodium periodate was added to a final concentration of 15 mM. The solution was incubated in the dark at room temperature for 1 h. Glycerol was added to a final concentration of 20 mM to quench any excess sodium periodate remaining in the solution. The mixture was incubated for 15 min with mixing at room temperature. To remove remaining sodium periodate, the solution was dialyzed using a 3.5-kDa dialysis cassette against 1× coupling buffer at 4°C overnight. The hydrazide resin was equilibrated by washing with 3 vol of water and 6 vol of coupling buffer. The proteins were added to the resin and coupled by incubating overnight at room temperature with shaking. The resin was then allowed to settle, and the supernate containing uncoupled nonglycoproteins was discarded. The resin was washed six times with urea buffer A containing 8 M urea, 200 mM Tris, 0.05% sodium dodecyl sulfate, and 5 mM ethylenediamine tetraacetic acid (pH 8.3). The proteins on the resin were reduced with a solution of 10 mM TCEP in urea buffer A. The reduced proteins were alkylated with 50 mM iodoacetamide in urea buffer. The resin was then washed six times with urea buffer B (1 M urea, 25 mM Tris, pH 8.3). The resin was resuspended in urea buffer B. Trypsin was added to the solution, and the proteins attached to the resin were digested at 37°C overnight with shaking. The nonglycopeptides released by trypsin digestion were removed by washing three times with 1.5 M NaCl, 80% acetonitrile (ACN)/0.1% TFA, methanol, and water and six times with 100 mM ammonium bicarbonate. The resin was resuspended in 100 mM ammonium bicarbonate.
The N-linked glycopeptides were released by adding PNGase F to the resin and incubating at 37°C overnight. The resin was washed twice with 80% ACN solution. The washes were combined, and the released glycopeptides were dried by vacuum centrifugation. The peptides were then resuspended in 0.1% formic acid (FA) and analyzed by one-dimensional (1D) liquid chromatography (LC)–MS/MS.

Glycopeptide Enrichment

The method used followed that described by Sun et al. [46]. Proteins from whole saliva, PA, SM, and SL saliva were precipitated using cold ethanol. The pellets were resuspended in 50 mM ammonium bicarbonate buffer (pH 7.8) with sonication. TCEP was added to a final concentration of 10 mM at room temperature to reduce disulfide bonds (TCEP is preferred over DTT in this case because the elevated temperatures required for disulfide reduction by DTT often cause protein precipitation from salivary fluids). The solution was incubated at room temperature for 30 min with shaking. Iodoacetamide (50 mM) was added to alkylate the reduced cysteines, and the mixture was incubated at room temperature for 30 min with shaking. Afterwards, DTT (25 mM) was added to the solution to quench any remaining unreacted iodoacetamide. The mixture was incubated at room temperature for 30 min. Trypsin was added (1:50 by weight) to digest the proteins in solution, and the solution was incubated at 37°C overnight. The tryptic peptide mixture was then acidified with FA to bring the pH below 4.0. The tryptic peptides were desalted using a C18 Sep-Pak reverse-phase (RP) column (Waters, Milford, MA, USA). Peptides were then dried and resuspended in coupling buffer (100 mM sodium acetate, 150 mM sodium chloride, pH 5.5). Sodium periodate was added to a final concentration of 10 mM to oxidize cis-diol groups on sugars. The mixture was incubated in the dark at room temperature for 1 h. Any remaining unreacted sodium periodate was quenched with sodium sulfite at a final concentration of 20 mM. The solution was incubated at room temperature for 10 min. The mixture was then added to agarose–hydrazide resin to couple the peptides to the resin, and the coupling reaction was allowed to proceed overnight at room temperature.
The agarose-hydrazide resin was transferred to a Handee minispin column (Pierce, Rockford, IL, USA) and washed with 1.5 M NaCl, 80% ACN, methanol, water, and freshly prepared 100 mM ammonium bicarbonate solution. The resin was incubated overnight at 37°C with PNGase F in 100 mM ammonium bicarbonate to release formerly N-glycosylated peptides. Afterwards, the resin was washed twice with 80% ACN solution. The washes were combined, and the released glycopeptides were dried in the vacuum centrifuge. The formerly N-glycosylated peptides were then resuspended in 1% FA and analyzed by 1D- and two-dimensional (2D) LC-MS/MS.

Liquid Chromatography–Tandem Mass Spectrometry

LC-MS/MS and 2D LC-MS/MS were performed on a QSTAR XL (QqTOF) mass spectrometer (Applied Biosystems, Foster City, CA, USA) equipped with a nanoelectrospray (Protana, Odense, Denmark) interface and an LC Packings/Dionex (Sunnyvale, CA, USA) nano-LC system.

For 1D LC-MS/MS, the nano-LC was equipped with a set of homemade precolumns (75 μm × 10 mm) and a column (75 μm × 150 mm) packed with Jupiter Proteo C12 resins (particle size 4 μm, Phenomenex, Torrance, CA, USA). The peptides were dried and redissolved in 0.1% FA solution. For each LC-MS/MS run, typically, 6 μL sample solution was loaded onto the precolumn. The precolumn was washed with the loading solvent containing 0.1% FA for 4 min before the sample was injected onto the LC column. The eluents used for the LC were water containing 0.1% FA (A) and 95% ACN/water containing 0.1% FA (B). The flow was 200 nL/min. The following analytical LC gradient was used for analyzing the formerly N-glycosylated peptides obtained by the glycopeptide enrichment method: 3–21% B for 36 min, 21–35% B for 14 min, and 36–80% B for 4 min and held at 80% B for 10 min. The column was equilibrated with 3% B for 16 min before the next run. The gradient used for analyzing formerly N-glycosylated peptides obtained by the glycoprotein enrichment method was 3–35% B for 72 min and 35–80% B for 18 min and maintained at 80% B for 9 min.
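To make the run lengths implied by these gradients explicit, the segments can be written down as data and summed. This is only an illustrative sketch: the segment tuples copy the percentages and durations exactly as stated above, linear ramps between the stated endpoints are an assumption, and the names are hypothetical.

```python
# Each segment: (start %B, end %B, duration in minutes), copied from the text.
GLYCOPEPTIDE_GRADIENT = [(3, 21, 36), (21, 35, 14), (36, 80, 4), (80, 80, 10)]
GLYCOPROTEIN_GRADIENT = [(3, 35, 72), (35, 80, 18), (80, 80, 9)]
EQUILIBRATION_MIN = 16  # re-equilibration at 3% B before the next run


def gradient_minutes(segments):
    """Total programmed gradient time in minutes."""
    return sum(duration for _, _, duration in segments)


def percent_b_at(segments, t_min):
    """Linearly interpolate %B at time t_min (assumes linear ramps)."""
    elapsed = 0.0
    for start, end, duration in segments:
        if t_min <= elapsed + duration:
            return start + (end - start) * (t_min - elapsed) / duration
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


print(gradient_minutes(GLYCOPEPTIDE_GRADIENT))                      # gradient only
print(gradient_minutes(GLYCOPEPTIDE_GRADIENT) + EQUILIBRATION_MIN)  # with equilibration
print(gradient_minutes(GLYCOPROTEIN_GRADIENT))
```

Summing the stated segments gives 64 min for the glycopeptide-method gradient (80 min including equilibration) and 99 min for the glycoprotein-method gradient.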

For the online 2D LC-MS/MS, 20 μL of sample solution was loaded onto a strong cation exchange (SCX) precolumn (Luna SCX resin, particle size 5 μm, 150 μm × 5 mm, Phenomenex) before transfer to a RP precolumn (Jupiter Proteo C12 resin, particle size 4 μm, 150 μm × 5 mm, Phenomenex) and RP analytical column (Jupiter Proteo C12 resin, particle size 4 μm, 75 μm × 150 mm, Phenomenex). Seven concentrations of ammonium acetate solutions (50, 100, 200, 400, 600, 1,000, and 2,000 mM) were injected (4 μL) onto the SCX precolumn for the step gradient elution of peptides.

The RP precolumn was used to preconcentrate and desalt each peptide fraction eluted from the SCX column prior to nano-RP LC separation. SCX fractions were loaded onto the RP precolumn with the following gradient: 3% B for 6 s, 6–24% B for 18 min, 24–36% B for 6 min, and 36–80% B for 2 min and maintained at 80% B for 8 min. The column was equilibrated with 3% B for 15 min prior to the next run. SCX fractions were separated on the analytical RP column (200 nL/min) with the following gradient: 3% B for 5 min, 3–6% B for 6 s, 6–24% B for 18 min, 24–36% B for 6 min, and 36–80% B for 2 min and held at 80% B for 8 min.

For online MS/MS analyses, a Proxeon nanobore stainless steel online emitter (inner diameter = 30 μm) was used for electrospraying with the voltage set to 1,900 V. Peptide product ion mass spectra were recorded during LC-MS/MS by information-dependent analysis on the QSTAR XL mass spectrometer. Argon was employed as the collision gas. Collision energies for maximum fragmentation were automatically calculated using empirical parameters based on the charge and mass-to-charge ratio of the precursor peptide.

Protein identification was accomplished using the Mascot database search engine (Matrix Science, London, UK). For searching the sequence databases, the following variable modifications were set: carbamidomethylation of cysteines, oxidation of methionines, enzyme-catalyzed conversion of asparagine to aspartic acid at the site of carbohydrate attachment, conversion of N-terminal glutamate and aspartate to pyro-Glu, and cyclization of N-terminal cysteine. DMA modification of cysteine was set as one of the parameters in experiments where Zoom IEF fractionation had been done prior to glycoprotein pulldown. Searches for peptides obtained by the glycoprotein enrichment method were performed against the Human IPI database. For the searches, one missed tryptic cleavage was tolerated, and the peptide and MS/MS mass tolerances were set to 0.3 Da. A Mascot score of greater than 20 was considered a significant match. All matching peptide MS/MS spectra were manually examined to verify the accuracy of the identification. Positive protein identification was based on standard Mascot criteria for statistical analysis of the LC-MS/MS data. For samples obtained by the glycopeptide enrichment method, searches were done against the Human IPI database (version 3.39) and its reverse decoy database. Only peptides above Mascot’s homology or identity threshold were considered. The validity of the formerly N-glycosylated peptides was confirmed by the presence of the consensus N–X–(S/T) motif, where X is any amino acid except proline, and by the conversion of asparagine to aspartic acid at the site of former N-glycosylation, resulting in a mass change of +1 Da. With MS/MS tolerances set to 0.3 Da, the 1-Da difference caused by deamidation of Asn to Asp is readily detected in measurements performed on the QTOF mass spectrometer.
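The two validation criteria lend themselves to a simple programmatic check. The sketch below is not the procedure used in this study (spectra were examined manually); it only illustrates the sequon test and the size of the Asn-to-Asp mass shift relative to the 0.3-Da tolerance. Function names are hypothetical, and a sequon that continues past the end of a tryptic peptide into the parent protein sequence would be missed by a peptide-only check like this one.

```python
import re

# N-X-S/T sequon with X != P. The lookahead keeps overlapping sequons separate.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Monoisotopic residue masses (Da) of asparagine and aspartic acid.
ASN, ASP = 114.04293, 115.02694
DEAMIDATION_SHIFT = ASP - ASN  # about +0.984 Da, the "+1 Da" quoted in the text
MSMS_TOLERANCE = 0.3           # Da, as set for the Mascot searches


def sequon_sites(peptide):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide)]


def is_consistent(peptide, deamidated_pos):
    """True if the reported Asn->Asp position lies within a sequon."""
    return deamidated_pos in sequon_sites(peptide)


print(sequon_sites("GDQLILNLNNISSDR"))
print(is_consistent("LNAENNATFYFK", 6))
print(DEAMIDATION_SHIFT > MSMS_TOLERANCE)
```

For the peptide GDQLILNLNNISSDR, only the asparagine at position 10 sits in a sequon, which matches the site marked with an asterisk in the Results.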

Results

Isolation of N-linked Glycoproteins

In our previous study on salivary N-linked glycoproteins [44], we used the agarose–hydrazide glycoprotein pulldown technique of Zhang et al. [41] to characterize human whole saliva (we will henceforth refer to this technique as the “Zhang method”). Proteins from whole saliva were isolated in two different ways. In the first method, proteins in whole saliva were precipitated with ethanol, and glycoprotein pulldown proceeded thereafter. The second method involved the preseparation of proteins into five different pI fractions of 3–4.6, 4.6–5.4, 5.4–6.2, 6.2–7, and 7–10, by solution IEF fractionation. The proteins in each pI fraction were then precipitated using acetone. The steps employed for glycoprotein pulldown in our previous study using the Zhang method were: (a) resuspending whole salivary proteins in the coupling buffer, (b) oxidizing saccharides on the glycoproteins with sodium periodate, (c) quenching excess sodium periodate with glycerol and removing any remaining sodium periodate by dialyzing against coupling buffer, (d) coupling glycoproteins to the agarose–hydrazide resin, (e) washing away nonglycosylated proteins, (f) digesting proteins attached to the hydrazide resin with trypsin, (g) washing away the nonglycopeptides, (h) releasing the formerly N-glycosylated peptides from the hydrazide resin with PNGase F, and (i) identifying the formerly N-glycosylated peptides by 1D LC-MS/MS (Fig. 1a). A total of eight repetitions were performed for the glycoprotein pulldown experiments in which saliva proteins were first precipitated with ethanol. Five repetitions of the IEF fractionation followed by glycoprotein pulldown were performed.

Fig. 1 Steps involved in the isolation of formerly N-glycosylated peptides using (a) the Zhang method and (b) the Sun method

In this current study, we applied a modified version of the glycoprotein pulldown method [46] to extend our catalogue of whole-saliva N-linked glycoproteins and developed a new catalogue of N-glycoproteins from segregated PA, SM, and SL fluids (we will refer to this modified technique as the “Sun method” in this paper). The new method entails proteolytic digestion of proteins prior to coupling to the agarose–hydrazide resin. The other difference lies in that after the oxidation of sugars, any unreacted sodium periodate is quenched with sodium sulfite. This obviates the need to remove sodium periodate by dialysis or using a desalting column as performed in the previous study. In this current study, the salivary proteins were precipitated using ethanol. Thus, the steps to isolate N-glycoproteins are: (a) resuspending salivary proteins in ammonium bicarbonate buffer, (b) digesting proteins with trypsin, (c) desalting the tryptic peptides using a C18 column, (d) oxidizing sugars on glycopeptides with sodium periodate, (e) quenching any remaining sodium periodate with sodium sulfite, (f) coupling glycopeptides onto the agarose–hydrazide column, (g) washing away nonglycopeptides, (h) releasing formerly N-linked glycopeptides with PNGase F, and (i) identifying formerly N-linked glycopeptides using 1D and 2D LC-MS/MS (Fig. 1b). A total of eight 1D LC-MS/MS runs were carried out. Some of the samples were combined, and four additional 2D LC-MS/MS experiments were performed. Data obtained from both 1D and 2D LC-MS/MS runs were combined. In both methods, the formerly N-glycosylated peptide identifications were validated by the presence of the consensus N–X–(S/T) motif and conversion of asparagines to aspartic acid at the site of N-glycosylation. We applied the Sun method to identify N-glycoproteins in whole saliva, PA, SM, and SL saliva.

Identification of N-Glycoproteins in Whole Saliva

In our previous study on N-linked glycoproteins in human whole saliva, we reported 84 formerly N-linked glycopeptides from 44 unique N-glycoproteins (Table 1) [44]. Using the newer Sun method, we identified 80 formerly N-glycosylated peptides from 46 unique N-linked glycoproteins (Table 1). However, comparing our previously identified N-glycoproteins using the Zhang method and new data set obtained by the Sun method, we found that 42 formerly N-glycosylated peptides from 28 N-glycoproteins were identified by both methods, 42 formerly N-glycosylated peptides from 16 N-glycoproteins were identified by the Zhang method uniquely, and 38 formerly N-glycopeptides from 18 N-glycoproteins were identified uniquely by the Sun method. Combining all of the N-glycoproteins identified by both methods, we have thus far identified 122 formerly N-glycosylated peptides from 62 unique N-glycoproteins (Fig. 2a,b). Figure 2c shows the LC-MS/MS spectrum of a formerly N-glycosylated peptide GDQLILNLNN*ISSDR (where the asterisk denotes the site of N-glycosylation) from Isoform 1 of long palate, lung, and nasal epithelium carcinoma-associated protein 1 identified in whole saliva using the Sun method, whereas this formerly N-glycopeptide was not detected in PA, SM, or SL saliva (vide infra).
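The per-method and combined totals follow directly from the shared and method-specific counts. As a quick consistency check (a sketch with illustrative names, using only the numbers stated above):

```python
def venn_totals(shared, only_zhang, only_sun):
    """Per-method and combined totals from a two-set overlap."""
    return {
        "zhang": shared + only_zhang,
        "sun": shared + only_sun,
        "combined": shared + only_zhang + only_sun,
    }


peptides = venn_totals(shared=42, only_zhang=42, only_sun=38)
proteins = venn_totals(shared=28, only_zhang=16, only_sun=18)
print(peptides)
print(proteins)
```

The sums reproduce the 84 and 80 peptides (44 and 46 proteins) reported for the Zhang and Sun methods, and the combined 122 peptides from 62 proteins.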

Fig. 2 Comparison of (a) formerly N-glycosylated peptides and (b) the N-glycoproteins they represent, identified in whole saliva using the Zhang method and the Sun method. (c) MS/MS mass spectrum of a doubly charged peptide, GDQLILNLNN*ISSDR (m/z 837; asterisk denotes the site of N-glycosylation), from isoform 1 of long palate, lung, and nasal epithelium carcinoma (PLUNC)-associated protein 1 measured from whole saliva

Table 1 Comparison of N-linked glycoproteins identified in whole saliva using the Zhang method and the Sun method

The seemingly poor overlap in protein identification between the two methods could be attributed to several factors. The method of collection of whole saliva was the same in both studies. However, the glycoprotein pulldown experiments in our previous paper were performed with saliva collected from only one subject [44]. In the current study, saliva from five individual subjects was pooled prior to carrying out the experiments. The identification of proteins found exclusively by the Sun method and not by the Zhang method could be explained by the fact that in the Zhang method, the precipitated proteins were resuspended in the coupling buffer. Some proteins may not be readily soluble in this buffer and may have been lost in the subsequent steps. Even for proteins that are solubilized in the coupling buffer, carbohydrates may not be well exposed, and oxidation and the ensuing agarose–hydrazide coupling may be inefficient. The identification of peptides exclusively by the Zhang method (and not by the Sun method) could be because of the hydrophilic nature of glycopeptides and their poor retention on the C18 columns. However, we observed that fewer peptides that were not formerly N-glycosylated were identified with the Sun method than with the Zhang method, i.e., nonspecific binding of nonglycosylated material to the resin was reduced (Table 1). For the previous analysis of whole saliva using the Zhang method, we measured 163 nonglycosylated peptides. In the current study using the Sun method, we detected only 14 nonglycosylated peptides in whole saliva. This appears to be a significant advantage of the N-glycopeptide capture method over the glycoprotein capture method.
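One way to put the difference in nonspecific binding on a common scale is the fraction of identified peptides that were formerly N-glycosylated. This fraction is our illustration only, not a metric reported in the study, and it ignores differences in the number of replicate runs between the two data sets; the function name is hypothetical.

```python
def capture_specificity(glyco_peptides, nonglyco_peptides):
    """Fraction of identified peptides that were formerly N-glycosylated."""
    return glyco_peptides / (glyco_peptides + nonglyco_peptides)


# Whole-saliva counts stated in the text.
zhang = capture_specificity(84, 163)
sun = capture_specificity(80, 14)
print(f"Zhang method: {zhang:.1%}, Sun method: {sun:.1%}")
```

On these counts, roughly one third of the peptides identified with the Zhang method were formerly N-glycosylated, compared with about 85% for the Sun method.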

Identification of N-Glycoproteins in Parotid, Submandibular, and Sublingual Fluids

To extend our catalogue of salivary N-glycoproteins, we examined the secretions of the three main salivary glands: PA, SM, and SL. Whole saliva is likely to include contributions from the salivary glands and from other sites in the oral cavity. Moreover, the collection and processing of saliva from the PA, SM, and SL glands differ slightly from those for whole saliva: secretion was not stimulated by the application of citric acid to the tongue, and centrifugation to remove food debris was not required. We applied the Sun method to identify N-glycoproteins from PA, SM, and SL fluids. The total numbers of formerly N-glycosylated peptides and the N-linked glycoproteins they represent identified from PA, SM, and SL fluids were 62/34 (peptides/proteins), 80/44, and 98/53, respectively (Tables 2, 3, and 4). A comparison of glycoproteins isolated from the three glands is shown in Fig. 3a,b. Only 42 formerly N-glycosylated peptides from 25 N-glycoproteins were found in all three glands. A substantial number of N-glycoproteins were unique to the secretions of individual glands. Figure 4a shows the MS/MS spectrum of a doubly charged formerly N-glycosylated peptide (LNAENN*ATFYFK, m/z 716.8) from splice isoform 1 of kininogen. This formerly N-glycosylated peptide was found only in the PA fluid and not in whole saliva, SM, or SL gland saliva. Figure 4b shows the MS/MS spectrum of a triply charged peptide (HYTN*SSQDVTVPCR, m/z 555.6) from the hypothetical protein DKFZp686C02220. This formerly N-glycosylated peptide was observed in SM saliva but not in whole saliva, PA, or SL gland fluids.
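As a consistency check on the precursor masses quoted above, the following minimal Python sketch computes the monoisotopic m/z expected for each example peptide. It rests on assumptions that are not stated in this section: that the glycan was removed enzymatically, so that the formerly glycosylated asparagine is converted to aspartic acid (+0.984 Da), and that cysteine carries a carbamidomethyl group (+57.021 Da). The function name and the handling of the N* notation are illustrative only.

```python
# Standard monoisotopic residue masses (Da) for the amino acids that occur
# in the three example peptides.
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "N": 114.04293, "P": 97.05276,
    "Q": 128.05858, "R": 156.10111, "S": 87.03203, "T": 101.04768,
    "V": 99.06841, "Y": 163.06333,
}
WATER = 18.01056
PROTON = 1.00728
DEAMIDATION = 0.98402        # Asn -> Asp at the former site (assumption)
CARBAMIDOMETHYL = 57.02146   # fixed modification on Cys (assumption)


def formerly_glycosylated_mz(peptide, charge):
    """Monoisotopic m/z of a peptide written with N* at the former site."""
    n_sites = peptide.count("N*")
    sequence = peptide.replace("*", "")
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass += n_sites * DEAMIDATION
    mass += sequence.count("C") * CARBAMIDOMETHYL
    return (mass + charge * PROTON) / charge


# Peptides and charge states as reported in the text and figure captions.
EXAMPLES = [
    ("GDQLILNLNN*ISSDR", 2),
    ("LNAENN*ATFYFK", 2),
    ("HYTN*SSQDVTVPCR", 3),
]

for peptide, charge in EXAMPLES:
    mz = formerly_glycosylated_mz(peptide, charge)
    print(f"{peptide} ({charge}+): m/z {mz:.2f}")
```

Under these assumptions, the calculated values (approximately 836.93, 716.84, and 555.58) agree with the reported m/z values of 837, 716.8, and 555.6.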

Fig. 3

Comparison of (a) formerly N-glycosylated peptides and (b) the N-glycoproteins they represent identified in PA, SM, and SL salivary fluids

Fig. 4

MS/MS mass spectra of (a) a doubly charged peptide, LNAENN*ATFYFK (m/z 716.8; asterisk denotes the site of N-glycosylation), from splice isoform 1 of kininogen from parotid fluid and (b) a triply charged peptide, HYTN*SSQDVTVPCR (m/z 555.6), from hypothetical protein DKFZp686C02220 from submandibular fluid
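The asparagine residues marked with an asterisk in these peptides can be checked against the canonical N-glycosylation sequon, N-X-S/T, where X is any residue except proline. The sketch below is illustrative only; the helper name is hypothetical, and it assumes a single N* marker per peptide.

```python
import re

# Canonical sequon: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N[^P][ST]")


def site_in_sequon(peptide):
    """True if the residue marked N* begins an N-X-S/T sequon (X != Pro)."""
    site = peptide.index("*") - 1           # position of the marked Asn
    sequence = peptide.replace("*", "")     # plain sequence without marker
    return SEQUON.match(sequence, site) is not None


for peptide in ("GDQLILNLNN*ISSDR", "LNAENN*ATFYFK", "HYTN*SSQDVTVPCR"):
    print(peptide, site_in_sequon(peptide))
```

All three example peptides satisfy the motif at the marked position (N-I-S, N-A-T, and N-S-S, respectively).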

Table 2 N-linked glycoproteins identified in parotid salivary fluid
Table 3 N-linked glycoproteins identified in submandibular salivary fluid
Table 4 N-linked glycoproteins identified in sublingual salivary fluid

Comparison of N-Glycoproteins in Whole Saliva and Parotid, Submandibular, and Sublingual Fluids

Whole saliva is a complex mixture. It has contributions not only from the three main salivary glands but also from other minor salivary glands located in the mouth. Nonsalivary secretions in whole saliva include gingival crevicular fluid, bronchial and nasal secretions, and blood derivatives that might enter the mouth through cuts or abrasions. Other components of whole saliva include microbes such as bacteria, fungi, and viruses; food and other extrinsic substances; and material from the lining of the mouth [51]. The glycoprotein profile of whole saliva is therefore expected to differ from that of PA, SM, and SL saliva. A total of 83 formerly N-glycosylated peptides from 46 N-glycoproteins were common to whole saliva and the PA, SM, and SL fluids combined. Thirty-nine formerly N-glycosylated peptides from 16 N-glycoproteins were unique to whole saliva, and 34 formerly N-glycosylated peptides from 15 N-glycoproteins were found in PA, SM, or SL fluid but not detected in whole saliva (Fig. 5a,b). Of the glycoproteins that were unique to whole saliva, three were not detected in our previous global proteome analysis of PA, SM, and SL fluids [52]: lumican precursor, complement component C9, and cystatin-related epididymal spermatogenic protein. Of these, complement component C9 has been detected in whole saliva in our previous studies (unpublished data), but the other two proteins were not detected previously in our global whole-saliva proteome analysis. Table 5 is a combined listing of all the salivary N-glycoproteins identified. The detection of N-glycoproteins that are unique to whole saliva is not surprising because whole saliva is a mixture of various components. Some glycoproteins detected in whole saliva may not originate from the salivary glands.
Additionally, some proteins originating from the salivary glands may have undergone posttranslational modifications after being secreted into the oral cavity, so that the corresponding formerly glycosylated peptides are observed only in whole saliva. However, the detection of N-glycoproteins unique to PA, SM, and SL saliva is somewhat surprising. One possible explanation is that the glycosylation was lost upon secretion into the mouth through the action of enzymes. Whole saliva contains a plethora of oral bacteria that secrete enzymes to deglycosylate glycoproteins, either to alter the properties of the protein or to utilize the sugars for their growth [14]. Glycoproteins from the salivary glands may also be sufficiently diluted in whole saliva that they fall below the detection limit of our glycoprotein pulldown method. It is also possible that this set of glycoproteins is secreted only upon stimulation of the glands and not otherwise.
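The totals implied by the two-way comparison of whole saliva with the combined gland fluids follow from simple addition of the Venn regions. A minimal sketch, using only the counts stated above (the function name is hypothetical):

```python
def venn_totals(shared, only_a, only_b):
    """Totals for two sets given the three regions of a two-set Venn diagram."""
    return {
        "total_a": shared + only_a,          # everything seen in set A
        "total_b": shared + only_b,          # everything seen in set B
        "union": shared + only_a + only_b,   # everything seen in either set
    }


# Arguments: shared, whole saliva only, gland fluids (PA, SM, or SL) only.
peptides = venn_totals(shared=83, only_a=39, only_b=34)
proteins = venn_totals(shared=46, only_a=16, only_b=15)

print("peptides:", peptides)
print("proteins:", proteins)
```

With these counts, whole saliva accounts for 122 formerly N-glycosylated peptides from 62 N-glycoproteins, the three gland fluids combined account for 117 peptides from 61 N-glycoproteins, and the union comprises 156 peptides from 77 N-glycoproteins.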

Fig. 5

Comparison of (a) formerly N-glycosylated peptides and (b) the N-glycoproteins they represent identified in whole saliva and PA, SM, and SL fluids combined

Table 5 Combined list of all N-Glycoproteins identified in human salivary fluids

Figure 6 shows a gene ontology (GO) analysis that categorizes the N-glycoproteins identified in saliva according to their cellular location, function, and biological process. The largest group of N-glycoproteins is extracellular (40%; Fig. 6a), and several were annotated as membrane proteins (14%). Liu and coworkers employed similar hydrazide chemistry to analyze the human plasma N-glycoproteome and observed a similar trend [42]; the majority of the plasma N-glycoproteins are extracellular or plasma membrane proteins. However, nuclear, cytoplasmic, or cytoskeletal proteins were not identified in the plasma N-glycoprotein study, whereas in our study of salivary proteins, we observed proteins from the nucleus, cytoplasm, and cytoskeleton. Figure 6b shows the GO distribution of salivary glycoproteins based on their functions. Fifty-one percent of the glycoproteins identified are involved in binding, a trend also observed in the analysis of plasma N-glycoproteins [42]. A large number of salivary N-glycoproteins were also annotated with catalytic activity or a role in enzyme regulation; the catalytic functions include peptidase, nuclease, hydrolase, and transferase activities. N-Glycoproteins detected in whole saliva are also involved in biological processes such as metabolism, transport, response to stimuli, response to stress, and signal transduction.

Fig. 6

Gene ontology analysis of the (a) cellular distribution and (b) cellular functional distribution of N-glycoproteins identified in salivary fluids

Discussion

The glycoprotein pulldown method using agarose–hydrazide is an efficient technique for identifying N-linked glycoproteins in complex mixtures. We employed this approach to study the whole salivary N-glycoproteome by both the Zhang method [41, 44] and the Sun method [46]. Using the two methods together allowed us to extend our list of whole-salivary N-glycoproteins beyond what might have been achieved with either method alone: several proteins were identified by the Zhang method that were not identified by the Sun method and vice versa.

In the current study, we attempted to probe deeper into the salivary proteome and extend the N-glycoprotein list by examining not only whole saliva but also PA, SM, and SL fluids. Combining the data sets obtained from whole saliva, PA, SM, and SL saliva, we have identified 148 sites of N-linked glycosylation to date. Larsen et al. reported 97 sites of N-linked glycosylation in whole saliva [40]. A comparison of the two studies showed an overlap of 62 sites of N-glycosylation. In a comparison of N-glycoproteins in whole saliva versus PA, SM, and SL saliva, a large number of N-glycoproteins were identified in PA, SM, and SL saliva that were not found in whole saliva and vice versa. In our study, 15 N-glycoproteins were detected in PA, SM, or SL saliva and were not found in whole saliva.
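The degree of overlap between the two site lists can be expressed as simple fractions. The sketch below uses only the three numbers given above; the function name and the choice of the Jaccard index as a summary measure are illustrative assumptions, not part of either study.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize the overlap of two lists from their sizes and shared count."""
    union = n_a + n_b - n_shared
    return {
        "only_a": n_a - n_shared,        # sites seen only in list A
        "only_b": n_b - n_shared,        # sites seen only in list B
        "union": union,                  # sites seen in either list
        "shared_frac_a": n_shared / n_a, # share of list A that is common
        "shared_frac_b": n_shared / n_b, # share of list B that is common
        "jaccard": n_shared / union,     # shared sites over the union
    }


# List A: present data set (148 sites); list B: Larsen et al. (97 sites).
summary = overlap_summary(n_a=148, n_b=97, n_shared=62)
for key, value in summary.items():
    print(key, round(value, 3))
```

This gives 86 sites seen only in the present data set, 35 seen only by Larsen et al., and 183 sites in the union; the shared sites make up about 42% of the present list and about 64% of the list of Larsen et al.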

In this study, we report the isolation of glycoproteins from the fluids of the SM and SL glands separately. The SM and SL glands are both located below the tongue. It was previously believed to be difficult to segregate the two fluids, as the glands often share a common salivary duct that empties their contents into the oral cavity [53]. However, other studies have shown that many individuals have separate salivary ducts for the two glands and that it is possible to collect secretions from SM and SL separately [54]. Many new devices have been fabricated that make the collection of segregated SM and SL saliva easier [50, 55]. Hu and coworkers found markers that clearly differentiate SM from SL saliva: cystatin C was a specific marker for SM saliva, and Muc5B and calgranulin B were specific markers for SL saliva [56]. During the collection of SM and SL saliva for our experiments, calgranulin B was employed as a marker to ensure purity of the segregated samples. Cystatin C and calgranulin B have no sites of N-linked glycosylation and were thus not detected by our method. Muc5B, however, has several known and putative sites of N-glycosylation. In our current study, we found formerly N-glycosylated Muc5B peptides in both SM and SL saliva, with a slightly larger number in SL: a total of ten formerly N-glycosylated peptides were found in SL and only eight in SM. Hu et al., by contrast, found no peptides from Muc5B in SM saliva in their shotgun protein identification experiments. Although Muc5B did not distinguish the two fluids in our data, we did find differences in the overall N-glycoprotein profile between SM and SL saliva (Fig. 3). As a result, we believe that collection of segregated SM and SL fluids can be accomplished and that their salivary proteomes can be differentiated.

Studies on salivary glycoproteins have potential implications for salivary biomarker discovery efforts, as saliva is gaining popularity as a diagnostic fluid for disease markers. Many attempts have been made in the past few years to use saliva to discover biomarkers for diseases localized in the oral cavity or in the head/neck region. Diseases for which saliva has been used as a medium for biomarker discovery include oral squamous cell carcinoma [57–65], head and neck squamous cell carcinoma [66–68], and Sjögren’s syndrome [69–77]. In addition, saliva has shown promise for use in the detection of many nonoral (i.e., nonproximal) systemic diseases. Elevated levels of c-erb-2 and CA 15-3 have been detected in the saliva of patients with breast cancer relative to control subjects [78–80]. Antibodies to the human immunodeficiency virus (HIV) have been found in the saliva of HIV-positive patients, and saliva-based HIV tests are now gaining popularity [81]. We believe our studies on N-glycoproteins in whole saliva will benefit future work on disease biomarker discovery.