1 Introduction

Bovine κ-caseinoglycomacropeptide (GMP) is a 64 amino acid peptide formed from κ-casein during the cheese manufacturing process [1]. Chymosin acts upon κ-casein by hydrolyzing the peptide bond between residues Phe105 and Met106 to produce two peptides, para-κ-casein and GMP. Para-κ-casein is the N-terminal fragment, contains no post-translational modifications (PTMs), and precipitates into the curd during the cheese-making process. GMP is the C-terminal fragment of κ-casein and unlike para-κ-casein is a highly modified peptide containing all the PTMs that are found in κ-casein, specifically phosphorylation and O-glycosylation [1]. Owing to its hydrophilic character, GMP remains in solution and is therefore available for recovery in cheese whey. Research into GMP has gained much interest over the last decade, primarily because of the peptide’s bioactive properties, including regulation of gastric secretions [1], promoting the growth of bifidobacteria [1, 2], decreasing platelet aggregation [3], and adhesion of oral micro-organisms to host cells [4, 5]. Many of these bioactivities have been either directly linked to or are greatly influenced by the modifications on GMP, especially glycosylation [5].

The GMP amino acid sequence is highly conserved, with two amino acid differences between the two predominant genetic variants, GMPa (Thr157, Asp169) and GMPb (Ile157, Ala169) [1, 6] (Figure 1). Although the variants differ in only two residues, previous studies have demonstrated that different κ-casein genetic variants greatly affect the coagulation properties of bovine milk [7, 8]. In addition, the sites of phosphorylation and O-glycosylation have been mapped in GMP through the use of enzymatic digests, Edman sequencing, and liquid chromatography. Phosphorylation in GMP appears to occur primarily at the Ser149 residue, although forms lacking phosphorylation or having two sites of phosphorylation have been previously identified [1, 5, 9]. Unlike the site of phosphorylation, sites of O-glycosylation in GMP vary greatly. As many as six potential sites of O-glycosylation in GMP (Thr131, Thr133, Thr135, Thr136, Ser141, and Thr142) have been reported [5]. The structures of the O-glycans occupying these sites in GMP have been determined to be predominantly core 1 type O-glycans and may contain up to two N-acetylneuraminic acid (NeuAc) residues bonded by either an α2–3 glycosidic bond to a galactose (Gal) residue or by an α2–6 glycosidic bond to an N-acetylgalactosamine (GalNAc) residue [1, 5, 9].

Figure 1 GMP sequence including potential O-glycosylation sites, potential phosphorylation sites, and the two predominant genetic variants GMPa (Thr157, Asp169) and GMPb (Ile157, Ala169)

Despite the importance of this compound to the industry, there is no rapid and comprehensive method for its characterization. The particular chemistry of GMP has made it a difficult peptide to study by the traditional methods used in proteomics for separation and detection, such as polyacrylamide gel electrophoresis (PAGE), ultraviolet-visible (UV-VIS) spectroscopy, or ion chromatography [1]. The amino acid sequence of GMP is devoid of aromatic residues, which limits the use of UV-VIS spectroscopy as a detector in chromatographic separations [1, 6]. Separation and detection of GMP by PAGE is also impractical: the GMP molecule bears a net negative charge except at very low pH values and fails to migrate correctly through the gel, leading to false molecular weight determinations [1, 10]. In addition, the net negative charge of GMP limits the use of cation exchange chromatography as a means of separation. Advances in both capillary electrophoresis and high-performance anion exchange chromatography have shown promising results for the analysis of GMP [5]. However, both of these methods require either chemical hydrolysis of the glycan followed by introduction of a chemical tag or chemical modification of the peptide in order to facilitate separation or detection of the glycans [5, 11, 12]. Affinity-based methods such as lectin affinity chromatography [13] or precipitation of GMP based on its affinity to chitosan [13, 14] have been used to separate the different glycoforms for identification and quantification. Although these methods can be successful, they are time-intensive and have potential problems with specificity with regard to the isolation and detection of the glycoforms.

In this work, a purified GMP sample from a batch production was studied by direct infusion ESI FT-ICR MS. Using both negative and positive modes in combination with different fragmentation techniques, all GMP phospho- and glycoforms were identified. The top-down methodology described in this study is considerably simpler than previous approaches, requiring little sample preparation other than desalting. The minimal sample preparation speeds the analysis and reduces potential bias. This method could potentially be used with other large peptides that have undergone extensive post-translational modification.

2 Experimental

2.1 Chemicals and Sample Set

Trifluoroacetic acid (TFA) and ammonium acetate were both purchased from Fisher Scientific (Pittsburgh, PA, USA). Acetonitrile (ACN) was purchased from Sigma Aldrich (St. Louis, MO, USA). Water was prepared in-house using a Barnstead E-Pure (ThermoScientific, Waltham, MA, USA) water treatment system and was purified to a final resistivity of 18 MΩ·cm. The GMP sample used in this study was kindly provided by Davisco Foods International, Inc. (Eden Prairie, MN, USA) from their GMP Biopure product line. The sample was 90% pure. The purification procedure can be found in patent US6800739B2.

2.2 Sample Preparation

One milligram of the GMP sample was desalted prior to the analysis by using Discovery DSC-8 solid phase extraction columns (SPE) containing 500 mg of extraction material (Sigma Aldrich, St. Louis, MO, USA). The extraction was performed according to the method supplied by Michrom Bioresources, Inc. for peptide desalting. The desalted GMP was dried using a centrifugal vacuum device and stored at –20°C. For the mass spectrometry analysis, the GMP sample was reconstituted in 1 mL of a solution of water:acetonitrile (1:1) spiked with 5 mM ammonium acetate.

2.3 Mass Spectrometry

Mass spectrometry experiments were performed on a Varian MS-920 hybrid triple quadrupole Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer equipped with an actively shielded 9.4 T superconducting magnet. Briefly, this instrument consists of a standard triple quadrupole mass spectrometer that is capable of passing ions to the FT-ICR portion of the instrument. Analyses were performed in both positive and negative mode. Samples, introduced by direct infusion at a flow rate of 10 μL/min, were ionized using a standard electrospray ionization (ESI) source. The ESI source conditions were optimized as follows: needle voltage –4500 V, shield voltage –600 V, drying gas temperature 300°C, drying gas pressure 10 psi, and nebulizing gas pressure 15 psi. The triple quadrupole was operated in ion guide mode with the ion transmission window set at m/z 1990 and a dwell time of 0.5 s. The voltages on the ion optics and ion accumulation times were adjusted as needed to maximize ion transmission to the ICR cell. Ions were excited for detection or removed from the ICR cell during isolation events by application of arbitrary waveforms. All transients were recorded at a digitizer rate of 2 MHz with 2048 K samples being taken to give a transient length of 1.049 s. Following acquisition, each transient was zero-filled a single time and apodized with a Blackman windowing function. Prior to data collection, the FT-ICR was externally calibrated using Agilent ESI Tune mix (Agilent, Santa Clara, CA, USA).
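The relation between the digitizer settings and the transient length quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original acquisition software; it assumes that "2048 K" means 2048 × 1024 samples, and the function names are illustrative only.

```python
# Acquisition settings quoted in the text.
# Assumption: "K" denotes 1024, so 2048 K = 2,097,152 samples.
SAMPLES = 2048 * 1024
DIGITIZER_RATE_HZ = 2_000_000  # 2 MHz


def transient_length_s(n_samples: int, rate_hz: float) -> float:
    """Duration of the recorded time-domain transient in seconds."""
    return n_samples / rate_hz


def points_after_zero_fill(n_samples: int, n_fills: int = 1) -> int:
    """Each zero-fill doubles the number of points passed to the FFT."""
    return n_samples * 2 ** n_fills


if __name__ == "__main__":
    print(f"transient length: {transient_length_s(SAMPLES, DIGITIZER_RATE_HZ):.3f} s")
    print(f"points after one zero-fill: {points_after_zero_fill(SAMPLES)}")
```

Under the stated assumption, the computed duration rounds to the 1.049 s reported for the transients.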

Tandem mass spectrometry experiments were performed using both infrared multiphoton dissociation (IRMPD) and electron capture dissociation (ECD) in the positive mode. Ions of interest were initially mass filtered in the quadrupole side of the instrument and further isolated within the ICR cell by application of an arbitrary waveform. A pulse of nitrogen gas was introduced during the IRMPD event to collisionally cool the product ions and keep them in the path of the IR beam. ECD experiments were performed following isolation by irradiating the ions with electrons generated with a heated rhenium filament. Different charge states were tested, and the period of irradiation and the energy of the electrons were optimized to fragment the ions efficiently.

2.4 Data Analysis

All acquired spectra were analyzed in the Varian Omega FTMS software. All spectra were charge deconvoluted and the average mass for each isotope cluster was determined with the Omega software. ECD fragmentation experiments were analyzed with MS-Seq of Protein Prospector [15]. Mass tolerances of 5 and 10 ppm were allowed for the parent ion and the fragments, respectively. No complete peptide modifications were included, but phosphorylation was allowed as a potential modification at the serine and threonine residues. Nonspecific cleavage was used to search against the bovine proteome obtained from UniProt. Instrument fragmentation was set as ESI-FT-ICR-ECD.
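To make the search tolerances concrete, the sketch below converts a ppm tolerance into an absolute mass window and tests whether an observed mass falls within it. It is an illustration only and does not reproduce the internal matching of Protein Prospector; the function names are hypothetical.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def mass_window_da(mass: float, tol_ppm: float) -> float:
    """Half-width of the tolerance window in Da for a given mass."""
    return mass * tol_ppm * 1e-6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float) -> bool:
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    # Window for a parent ion near 6786 Da at the 5 ppm parent tolerance.
    print(f"{mass_window_da(6786.3382, 5):.4f} Da")
```

For a parent mass near 6786 Da, a 5 ppm tolerance corresponds to a window of roughly ±0.034 Da.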

3 Results and Discussion

3.1 High-Resolution Cationic Mode Experiments

The positive mode ESI FT-ICR MS spectrum of the GMP sample showed several clusters of multiply charged ions (up to charge state +7) spanning the m/z range 900–3500 and varying greatly in intensity. Figure 2a shows charge states from +3 to +5 in the m/z range 1300–2300. While identification of the various forms of GMP is possible in the m/z domain, it is greatly simplified by using the charge deconvoluted data, as each form of GMP then occurs at one unique mass rather than at multiple m/z values.
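The principle behind charge deconvolution can be sketched as follows. This is a simplified illustration, not the algorithm implemented in the Omega software; it assumes that all charges arise from gain (positive mode) or loss (negative mode) of protons, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276467  # Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass from an m/z value.

    charge is signed: +n for [M+nH]n+ and -n for [M-nH]n-.
    """
    return abs(charge) * mz - charge * PROTON_MASS


def mz_from_mass(mass: float, charge: int) -> float:
    """m/z at which a neutral mass appears for a given signed charge."""
    return (mass + charge * PROTON_MASS) / abs(charge)


if __name__ == "__main__":
    # One neutral mass maps to several m/z values, one per charge state.
    for z in (3, 4, 5, 6, 7):
        print(z, round(mz_from_mass(6786.3382, z), 4))
```

Every charge state of a given form collapses to the same neutral mass, which is why the deconvoluted spectrum is easier to interpret than the raw m/z spectrum.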

Figure 2 GMP results in the positive mode: (a) mass spectrum in the 1300–2300 m/z range showing different GMP forms clustering by charge state; (b) zoom in on the +5 cluster; (c) deconvoluted spectrum; (d) theoretical isotopic pattern of GMPa-P

The initial analysis of the deconvoluted GMP spectrum in the positive mode (Figure 2c) showed several distinct forms. The most intense signal corresponds to the phosphorylated GMPa (GMPa-P), whose most abundant isotopic peak (A + 3) was detected at a mass of 6786.3382 Da. GMPa-P is the base peak of the spectrum and is roughly three times as abundant as the other genetic variant, phosphorylated GMPb (GMPb-P). Both GMPa-P and GMPb-P were identified by comparing their deconvoluted experimental masses with theoretical masses. The isotopic pattern of each form of GMP showed good agreement with the theoretical distribution (Figure 2d and Supplementary Table S2). Both GMPa and GMPb appear in the spectrum in two forms: mono- and bi-phosphorylated species (GMPa-2P and GMPb-2P in Figure 2c). Dephosphorylated species were not found. Additionally, adducts with sodium and potassium as well as dehydrated species were detected.

3.2 Genetic Variant Confirmation by ECD

The identity of the genetic variants A and B was confirmed by ECD tandem-MS. Tandem-MS experiments were performed on the quasimolecular ions at m/z 970.48 (7+) and m/z 965.92 (7+), corresponding to the GMPa-P and GMPb-P forms, respectively (Supplementary Figure S1a and S1b). The ECD fragmentation results were analyzed with MS-Tag from Protein Prospector [15] against the whole bovine proteome. MS-Tag matched 55 fragment ions from the tandem MS spectrum of the ion at m/z 970.48 (7+) to GMPa-P phosphorylated at residue S170. Similarly, MS-Tag matched 19 fragment ions from the ion at m/z 965.92 (7+) to GMPb-P. A table with fragment assignments is provided in the Supplementary Material (Supplementary Table S1).
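The spacing between the two precursor ions is consistent with the two amino acid substitutions that distinguish the variants. The sketch below estimates the expected shift from standard monoisotopic residue masses; it is an illustration rather than part of the original analysis, and it assumes that the two substitutions are the only difference between the forms.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "Thr": 101.04768,
    "Ile": 113.08406,
    "Asp": 115.02694,
    "Ala": 71.03711,
}

# Substitutions going from variant A to variant B: Thr -> Ile and Asp -> Ala.
SUBSTITUTIONS = [("Thr", "Ile"), ("Asp", "Ala")]


def variant_mass_shift(substitutions) -> float:
    """Mass change (Da) produced by a list of (old, new) residue substitutions."""
    return sum(RESIDUE_MASS[new] - RESIDUE_MASS[old] for old, new in substitutions)


def mz_shift(mass_shift: float, charge: int) -> float:
    """Shift on the m/z axis for ions carrying the same number of charges."""
    return mass_shift / abs(charge)


if __name__ == "__main__":
    delta = variant_mass_shift(SUBSTITUTIONS)
    print(f"mass shift A -> B: {delta:.4f} Da")
    print(f"m/z shift at 7+:   {mz_shift(delta, 7):.3f}")
```

The expected shift of about –32 Da, or about –4.56 on the m/z axis at the 7+ charge state, agrees with the observed spacing between m/z 970.48 and m/z 965.92.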

3.3 High-Resolution Anionic Mode Experiments

The negative mode ESI FT-ICR MS spectrum of the GMP sample (Figure 3a) showed several clusters of multiply charged ions spanning the m/z range 2000–3500 and varying greatly in intensity. The charge states of these ions were –2 and –3, with the majority of the ions being in the –3 charge state.

Figure 3 GMP results in the negative mode: (a) mass spectrum in the 2300–3900 m/z range showing different GMP forms with charge states –2/–3; (b) deconvoluted spectrum

The deconvoluted spectrum in the negative mode (Figure 3b) revealed a larger diversity of GMP forms compared with the positive mode. Again, both GMPa and GMPb are found phosphorylated (GMPa/b-P) and bi-phosphorylated (GMPa/b-2P), together with their corresponding oxidized and dehydrated forms. No adducts with sodium or potassium were found. The other forms of GMP detected in the sample were determined to be GMPa-P and GMPb-P modified with different glycan structures. The presence of sialylated glycans facilitates the detection of glycosylated GMP forms in the anionic mode. GMP is negatively charged even in acidic conditions (pI ≈ 4) [16]. As the sample is infused without chromatographic separation, non-glycosylated GMP competes with glycosylated GMP for charge during ESI in the cationic mode. Ion suppression of glycopeptides by non-glycosylated peptides is a common phenomenon in MS [17, 18]. In the anionic mode, on the other hand, the additional negative charge of the sialic acid groups seems to facilitate the glycopeptide ionization.

The identification of the glycosylated forms and the glycans on these species was made by determining the difference in mass between the species that were potentially glycosylated and GMPa-P/GMPb-P. The addition of each carbohydrate residue (each residue being the carbohydrate minus a water molecule to account for the bond to the peptide) to a peptide causes a distinct increase in the mass, which can be used as a diagnostic mass difference. As an example, the addition of a single N-acetylhexosamine (HexNAc) residue to GMPa-P will lead to a mass increase of 203.079 Da, whereas the addition of a disaccharide composed of HexNAc and hexose (Hex) would lead to a mass increase of 365.132 Da. It was determined that the glycosylated forms shown are modified with glycans consisting of HexNAc, Hex, and sialic acid (NeuAc) residues. Some forms of GMPa-P and GMPb-P were observed to contain one or two glycan moieties (Table 1).
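The diagnostic mass difference approach described above can be sketched as a small combinatorial search. This is a minimal illustration and not the procedure actually used in this work; the function name, the 0.05 Da tolerance, and the limit of four residues per type are assumptions chosen for the example.

```python
from itertools import product

# Standard monoisotopic residue masses (monosaccharide minus water), in Da.
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282, "NeuAc": 291.09542}


def find_compositions(delta_mass: float, tol_da: float = 0.05, max_count: int = 4):
    """Return all residue compositions whose summed mass matches delta_mass.

    tol_da and max_count are illustrative defaults, not values from the study.
    """
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty composition
        total = sum(n * RESIDUE_MASS[name] for n, name in zip(counts, names))
        if abs(total - delta_mass) <= tol_da:
            hits.append(dict(zip(names, counts)))
    return hits


if __name__ == "__main__":
    for delta in (203.079, 365.132):
        print(delta, find_compositions(delta))
```

As a usage example, the difference between the 7733.6904 Da glycoform and the 6786.3382 Da GMPa-P signal reported in this work (947.3522 Da) returns a single candidate under these assumptions: one HexNAc, one Hex, and two NeuAc residues.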

Table 1 Summary of GMP forms found

The smallest glycan that was detected is a monosaccharide of composition HexNAc, with the largest glycan being a tetrasaccharide of composition HexNAcHexNeuAc2. On the glycosylated forms bearing two glycans, a mix of the tri- and tetrasaccharide glycans was found. These findings are in agreement with what has previously been published on the glycosylation of GMP [1, 5, 9, 19, 20]. After identification of all forms of GMP that were expected based on the literature, several species in the spectra were left unidentified and did not match what has previously been reported. Based on the increase in mass (≈79.9 Da), these species were tentatively assigned as forms of GMP with two sites of phosphorylation. Doubly phosphorylated GMP has been previously reported, albeit at low abundance and only in non-glycosylated species [9]. In this study, double phosphorylation was observed for both the non-glycosylated GMP and the GMP bearing a single glycan, whereas the combination of two sites of phosphorylation and two or more sites of glycosylation was not observed. In a more recent study combining electrophoresis and mass spectrometry, Jensen et al. [20] identified six different κ-casein isoforms. Their results otherwise agree with ours; bi-phosphorylated glycosylated forms were not found in that study, although unidentified low-abundance κ-casein forms were present in the gel. Sulfation of a tyrosine residue would also result in a mass increase of 79.9 Da. However, GMP has no tyrosine residues in its sequence; therefore, sulfation of the peptide can be eliminated. Sulfation of a glycan structure can also be ruled out, as the unidentified species are seen for all forms of GMP, including the non-glycosylated form.

Although the specific biological role of the GMP modifications is not fully understood, a number of studies indicate that both phosphorylation and O-glycosylation are involved in the stability of the casein micelles [21, 22]. Briefly, the negative charge that these modifications add to κ-casein enhances the electrostatic repulsion of the micelles, preventing their aggregation. In turn, the stability of the casein micelles affects the precipitation of the curd during the cheese-making process. In this context, the presence of bi-phosphorylated glycosylated forms of GMP is expected to increase the charge of κ-casein and, hence, the micelle stability. Nevertheless, a rheological study would be needed to determine the importance of this new GMP form for the precipitation of the curd.

3.4 O-glycosylation Confirmation by IRMPD

Additional confirmation of the glycan composition was made by performing tandem mass spectrometry experiments on the 7733.6904 Da species that was assigned as a singly glycosylated GMPa-P. This glycoform was isolated as m/z 1547.7381, the 5+ charge state, in the positive mode and was fragmented by IRMPD [19] (Figure 4a). Although the corresponding ion in the negative mode is orders of magnitude more intense, its fragmentation yields only the dehydrated form (data not shown). IRMPD fragmentation of this glycosylated form of GMP in the positive mode was made possible by the accumulation of 50 spectra. The charge deconvoluted IRMPD spectrum of m/z 1548.1 is shown in Figure 4b.

Figure 4 IRMPD fragmentation of m/z 1547.7381 (5+): (a) raw spectrum; (b) deconvoluted spectrum

The results of the IRMPD fragmentation clearly support the mass-based glycan assignment of HexNAcHexNeuAc2. The isolated ion fragmented in a manner such that the glycan can be sequenced by looking at the differences in mass between the fragments. It can be seen that the glycan contained two NeuAc residues as there is a loss of 291 Da from the precursor ion and a second loss of 291 Da from the fragment at 7442.6 Da. The final diagnostic loss in the spectrum occurs between 7151.6 Da and 6786.4 Da, a difference in mass of 365 Da, which corresponds to the disaccharide HexNAcHex.
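The reading of the fragment ladder described above can be sketched as a neutral-loss lookup. This is a simplified illustration rather than the procedure used in this work; the 0.2 Da tolerance and the function name are assumptions, and the fragment masses are the rounded values quoted in the text.

```python
# Standard monoisotopic residue masses (Da) and the HexNAc+Hex disaccharide.
LOSS_MASS = {
    "NeuAc": 291.09542,
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "HexNAc+Hex": 365.13219,
}


def label_losses(masses, tol_da: float = 0.2):
    """Label the mass gap between consecutive fragments, heaviest first.

    Returns one label per gap, or None where no known loss fits within tol_da.
    """
    ordered = sorted(masses, reverse=True)
    labels = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        gap = heavier - lighter
        match = None
        for name, loss in LOSS_MASS.items():
            if abs(gap - loss) <= tol_da:
                match = name
                break
        labels.append(match)
    return labels


if __name__ == "__main__":
    # Precursor and fragment masses (Da) as quoted in the text.
    ladder = [7733.6904, 7442.6, 7151.6, 6786.4]
    print(label_losses(ladder))
```

Applied to the masses quoted in the text, the lookup assigns two successive NeuAc losses followed by the loss of the HexNAcHex disaccharide, in line with the composition HexNAcHexNeuAc2.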

4 Conclusion

A method for the comprehensive analysis of post-translational modifications in bovine caseinoglycomacropeptide using ESI FT-ICR MS has been presented. The method makes use of differences in mass between different species and a combinatorial approach based on diagnostic carbohydrate masses in order to identify the different glycoforms in terms of their glycan composition. Confirmation was obtained by tandem-MS. The results agree with previous studies and, in addition, reveal glycosylated, bi-phosphorylated forms of GMP that have not been described before.