A DNA sequence provides all of the information necessary for a cell to produce a recombinant protein with the same amino acid sequence as the native protein, but it does not dictate exactly how that protein will be glycosylated. Therefore, using a human DNA sequence does not guarantee that the molecule will be glycosylated as it is when synthesized by our bodies. The peptide sequence may determine where glycosylation is added to the protein, but the mix of glycosyltransferases within the cell in which the protein is expressed, and even the conditions under which those cells are cultured, will determine what oligosaccharide structures are added to the molecule [13]. Even though glycosylation depends upon the machinery (glycosyltransferases) present, the cells do not add identical oligosaccharide structures to each protein molecule. This means a glycoprotein is always present as many different glycoforms (the same protein molecule with different oligosaccharides).

It is difficult to know a priori what the “correct” glycosylation is for any therapeutic since:

  1. There is rarely information on the glycosylation of the human molecule, let alone how the glycosylation affects the molecule.

  2. The therapeutic is usually delivered from the bloodstream, which might not be where it is normally found in the human body. It is therefore often exposed to different receptors than the native protein.

  3. Human proteins are glycosylated in many different ways, so there are no rules for what is appropriate “human glycosylation”.

  4. The optimal glycosylation for any molecule is often a trade-off since, for example, the ideal glycosylation for efficacy (i.e. receptor binding) might not be ideal for a long half-life in the bloodstream.

Apart from choosing an expression system that tends to glycosylate in a certain fashion, or one that has been engineered to glycosylate in a desired manner, there is little that can be done to change the glycosylation of the molecule beyond limited in vitro modifications.

Although the first two recombinant proteins approved as therapeutics (insulin and human growth hormone) were not glycoproteins, about 40% of the approved therapeutics today are glycoproteins (excluding monoclonal antibodies, see Table 1). Approximately 70% of the approved therapeutic glycoproteins have been expressed in CHO cells (Table 1), which means there is now a great deal of information available on the glycosylation of recombinant proteins expressed in CHO cells. The glycosylation patterns of different commonly used cell lines are reviewed in Grabenhorst [4]. There are also many groups working on engineering the glycosylation in different expression systems [2]. The goal of these engineering efforts is to improve the consistency, reduce the heterogeneity and/or make it possible to produce specific glycoforms.

Table 1 The glycosylation of approved therapeutics

Glycosylation can impact the pharmacokinetics, pharmacodynamics and/or efficacy of a glycoprotein therapeutic. It is therefore necessary to analyze the glycosylation on glycoproteins being developed as therapeutics. However, it can still be difficult to determine how much glycosylation analysis should be performed at the different stages of development or which assays should be used for the analysis. These issues are addressed in this review.

This topic was last reviewed in 1990 by Michael Spellman [55] when there were very few approved recombinant glycoprotein therapeutics, let alone the prospect of biosimilars or follow-on biologics. The list of approved recombinant glycoprotein therapeutics is now much longer and there is much more known about the glycosylation of recombinant glycoproteins (see Table 1 for a summary) and some of the problems that are likely to arise while developing a glycoprotein therapeutic. There are now more and better analytical tools available for carbohydrate analysis. There may not be a consensus on the best specific methods but certain types of analysis have become the standard for assessing the consistency of glycosylation.

This review will cover N-linked glycosylation (where glycans are attached to asparagine residues in the peptide sequence, see Figs. 1 and 2) and mucin-type O-linked glycosylation (where GalNAc is attached to serine/threonine residues and then additional monosaccharides may be added to the GalNAc, see Fig. 3). Other types of glycosylation are reviewed by Vliegenthart [56] and Spiro [57].

Fig. 1 An imaginary complex N-linked oligosaccharide demonstrating modifications that can occur on these oligosaccharide structures. The symbols in the figure denote GlcNAc, galactose, mannose, fucose and sialic acid

Fig. 2 Oligomannose and hybrid oligosaccharide structures. The symbols in the figure denote GlcNAc, galactose and mannose; P is phosphate

Fig. 3 O-linked oligosaccharide core structures. These structures can be extended, much like N-linked oligosaccharides. The symbols in the figure denote GlcNAc, galactose and GalNAc

Due to the wealth of literature already available on monoclonal antibodies (reviewed in [58] and [59]), the focus of this review will be on glycoprotein therapeutics and not monoclonal antibodies. Nonetheless, the strategy and analytical methods discussed in this review do also apply to the characterization of the glycosylation of monoclonal antibodies.

1 An historical perspective

The first two glycoproteins approved as therapeutics were tissue plasminogen activator (t-PA) and erythropoietin. The glycosylation of t-PA was shown to affect its enzymatic activity [60–62] and the plasma clearance of the molecule [63–65]. Glycosylation was also shown to be necessary for the activity [66–68] and plasma clearance of erythropoietin (reviewed by Takeuchi [21]). These two molecules demonstrate how glycosylation can have a significant impact on plasma clearance and why it is not always easy to predict what impact the glycosylation will have on the plasma clearance of your molecule.

Alteplase (t-PA) has three glycosylation sites: one that is glycosylated with oligomannose structures and two that are glycosylated with complex oligosaccharides, although one of these is occupied only 50% of the time [10]. The plasma clearance of t-PA is mediated through receptors that bind to the protein and through the mannose receptor, which binds to the oligosaccharides [69–72]. t-PA binds to the mannose receptor with a higher affinity than other glycoproteins containing oligomannose structures, and there is evidence that the underlying protein structure enhances the binding to the mannose receptor [73].

Tenecteplase is an engineered version of t-PA in which the glycosylation site containing oligomannose structures has been removed and a new glycosylation site (which is glycosylated with complex structures) has been added to the molecule. Tenecteplase has a much longer plasma half-life than alteplase and is a good demonstration of how difficult it can be to understand the role that glycosylation plays in plasma clearance. Some of the increase in half-life seen with tenecteplase can be attributed to the removal of the oligomannose structures (which are cleared by mannose receptors) but not all, since a mutant with only the oligomannose site removed does not have a half-life as long as tenecteplase [74]. Reteplase, a nonglycosylated version of t-PA (produced in E. coli) in which a large part of the molecule has been deleted, has a similar half-life to tenecteplase [75].

Erythropoietin also has an interesting glycosylation story. Erythropoietin has three N-linked oligosaccharides and one O-linked oligosaccharide [22]. Erythropoietin tends to be glycosylated predominantly with triantennary and tetraantennary oligosaccharides, whether it is the natural urinary form or the recombinant form expressed in CHO [76] or BHK cells (reviewed by Takeuchi [21]). Polylactosamine (the repeating disaccharide [3Galβ1-4GlcNAcβ1]n that is attached to terminal galactose residues on complex structures) is also commonly seen on the molecule. Polylactosamine is more common on the molecule expressed in CHO or BHK cells than on the natural urinary form [21]. The glycosylation of erythropoietin has been shown to be necessary for its activity [66–68] as well as its secretion and function [77]. As has been reported for other glycoprotein growth factors [78], EPO binds better to its receptor when the molecule is desialylated; however, the loss of sialic acid results in a dramatic reduction in in vivo activity [79, 80]. Although the loss of in vivo activity in glycoprotein therapeutics is often attributed to the efficient clearance of undersialylated molecules by the hepatic asialoglycoprotein receptor [81], the role of carbohydrates in plasma clearance is most often more complicated than can be explained by the asialoglycoprotein receptor alone. In EPO, branching of the oligosaccharide chains seems to play a role, since a version of EPO with more biantennary structures (rather than the typical tetraantennary structures) was shown to clear much faster even though most of its galactose residues were sialylated and it had a three-fold higher in vitro activity [82].

There is a new version of EPO in which two additional N-linked oligosaccharide structures are present [83]. This molecule has been shown to be safe and has a much longer plasma half-life. There have been no reports of patients developing antibodies to this molecule, even though glycosylation sites have been added to the molecule [83].

When the first protein therapeutics were produced, the methods used for characterizing their glycosylation came from academic groups that had skillfully worked out ways to identify the oligosaccharide structures on glycoproteins or other glycans. The techniques utilized included FAB-MS, NMR, lectin blots and columns, multi-dimensional HPLC methods and GC-MS, among others. These techniques had been developed to enable scientists to determine the exact structure of oligosaccharides, including the glycan linkages between the monosaccharide units. Although these methods were tremendously important for determining the glycosylation of those first recombinant proteins, they did not transfer well to biotechnology laboratories. In biotechnology laboratories the emphasis was more on demonstrating that the glycosylation was consistent batch-to-batch (as opposed to identifying each oligosaccharide on a molecule), and on analyzing many samples at once to compare their glycosylation. Data were required quickly and techniques needed to be run by scientists with relatively little experience in carbohydrate chemistry. The first real assays developed for the unique needs of biotechnology came when Hardy and Townsend published their HPAEC methods for a one-dimensional separation of oligosaccharides (now referred to as oligosaccharide profiling) and for monosaccharide analysis using an HPLC ([84, 85] and reviewed in [86, 87]). The key to these methods was a detector that was sensitive to underivatized carbohydrates (which are very difficult to detect by UV) and a chromatography method that was able to separate oligosaccharides and monosaccharides far better than any previous method. Since then, many methods have been developed that work well in the biotechnology laboratory.

2 A strategy

Unfortunately, there is no single method that can be used to completely characterize the glycosylation on a molecule and different methods must be employed, depending upon the type of glycosylation on the molecule and the specific information that is required. Oligosaccharides can be analyzed in three different ways: while attached to the protein (peptide mapping and analysis of glycopeptides), after releasing intact from the protein (oligosaccharide profiling) or after being broken down into their constituent monosaccharide units (monosaccharide analysis).

The degree to which the glycosylation on the therapeutic is analyzed may vary during drug development. How thorough the characterization of the glycosylation needs to be, and how early in drug development it is started, should depend upon factors such as whether the glycosylation is known to have an impact on the efficacy or the pharmacokinetics, any special issues raised by the expression system used to produce the protein, or changes in the expression system that would affect the glycosylation. If specific types of glycosylation are necessary for optimal activity, then the glycosylation may need to be characterized early in development. Conversely, if your expression system might be adding glycosylation that is not ideal or could be a safety concern, then this will need to be addressed early in development. If there is any reason to believe that the glycosylation is not consistent, especially if changes in glycosylation are coincident with any changes in the pharmacokinetics or activity of your molecule, then more detailed analysis may become necessary to understand this relationship.

Unless there is reason to believe that very specific types of glycosylation (e.g. more highly branched complex structures) are required for an efficacious product, regulatory agencies are usually more concerned with demonstrating that the glycosylation is consistent (often referred to as having “consistent heterogeneity”) than with identifying all the different oligosaccharides present on the molecule. This is especially true if the role of glycosylation in the function of the glycoprotein is well understood and/or it has been expressed in an expression system where the glycosylation patterns have been well characterized. The heterogeneity in glycosylation profiles of many therapeutics does make it far more challenging to assess whether two different lots of a drug have comparable glycosylation.

It is always wise to monitor the glycosylation for consistency throughout product development, both to demonstrate that your process produces molecules with consistent glycosylation and for the reassurance that the molecules used in preclinical and clinical work have a similar glycosylation profile to the ultimate molecule headed for the market. For this reason, it is usually wise to begin analyzing glycosylation by the time the first clinical, or even preclinical, lots are produced.

Analytical methods used to analyze the glycosylation of therapeutic proteins can be separated into two categories: methods needed to demonstrate the consistency of glycosylation from lot to lot (likely to become release assays, which will need to be validated) and methods used for more in depth analysis of the glycosylation (used to characterize specific critical lots and/or to understand changes in glycosylation patterns, that won’t necessarily be validated).

The most common method used for demonstrating the lot-to-lot consistency of glycosylation on a protein therapeutic is oligosaccharide profiling. Profiling methods provide an overview of the heterogeneity of the molecule, can be validated and transferred to a QC setting for release testing and should be very sensitive to changes in glycosylation. They can be useful from early in development and are the most common type of analysis used for lot release testing to demonstrate the consistency of the glycosylation once a product has been approved. Furthermore, most of these methods can be used to separate oligosaccharide structures for identification or further characterization, when and if this becomes necessary (either by on-line methods or by isolating them and doing further analysis on the fractions). For all of these reasons, oligosaccharide profiling methods are used for most therapeutic glycoproteins starting prior to clinical trials and are usually included as a release test by the time the drug is marketed. Due to the important role that sialic acid has been shown to play in the half-life of protein therapeutics in the bloodstream, sialic acid is often quantified on therapeutics. The efficacy of a drug is also more likely to be affected by differences in sialylation than by other modifications in the glycosylation (see Figs. 1, 2, and 3 for examples of modifications). This makes sialic acid analysis the second most common type of glycosylation analysis for therapeutics. Like oligosaccharide profiling, it is usually in use prior to the start of clinical trials and often becomes a release test by the time the drug is marketed.

Another reason for analyzing the sialic acid on your molecule is to determine whether there is any N-glycolylneuraminic acid (NGNA or NeuGc) present. NeuGc is a sialic acid found in many animal cells (including CHO cells) that is not found in humans because we lack the enzyme required for its synthesis. Historically, this has created concern over the presence of this sialic acid on therapeutics but it has now been demonstrated that although humans cannot synthesize NeuGc, it is found in normal human tissues [88]. It would appear that humans pick up NeuGc from diet [88, 89] and it is incorporated into tissues. In fact, Byres et al. have shown that the presence of this NeuGc on human tissues creates high affinity receptors for a bacterial toxin [90]. It has also been demonstrated that many humans produce antibodies against oligosaccharide structures with terminal NeuGc [91].

Beyond these two analyses there are additional assays used for analyzing glycosylation. Determining whether further analysis is required and which additional methods are employed depends upon the role of glycosylation on your molecule and the expression system used. These different methods and the information obtained from them are described in the next section.

3 Oligosaccharide profiling

Oligosaccharide profiling (also referred to as mapping or fingerprinting) is the best method for monitoring the consistency of the glycosylation of your molecule. With this method, oligosaccharides (N-linked and/or O-linked) are released intact from the molecule and separated to create the profile. This profile can be a chromatogram from an HPLC/CE separation, a mass profile generated using mass spectrometry or the pattern of bands generated by gel electrophoresis of oligosaccharides. For a therapeutic, it is most important that the profile is sensitive to changes in glycosylation, especially any changes in glycosylation that are known to affect the safety or efficacy of the molecule. The method must also be robust and reproducible. It is helpful if the same method can be used for routine testing and further characterization/identification of the oligosaccharides (e.g. by isolating the peaks or by on-line mass spectrometric analysis).

N-linked oligosaccharides are most commonly removed enzymatically from glycoproteins using peptide:N-glycosidase F (PNGase F). PNGase F has been shown to release oligomannose, hybrid and complex oligosaccharides from glycoproteins. It will not release oligosaccharides from the asparagine unless both the carboxyl and amino termini are in peptide bonds [92, 93]. There is also one report in the literature [48] of a bisphosphorylated glycopeptide that was resistant to PNGase F until it was dephosphorylated. PNGase F will not release any oligosaccharide that contains a fucose linked α1-3 to the GlcNAc bound to the asparagine [94]. Plants and insect cells add fucose in this linkage. N-linked oligosaccharides can also be released by peptide:N-glycosidase A (PNGase A), which releases all classes of oligosaccharides [92, 93]. The different enzymes used to release glycans are reviewed in O’Neill [95].
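The release rules in the preceding paragraph can be written down as a small decision helper. The following Python sketch is illustrative only: the `GlycosylationSite` class, its field names and both functions are hypothetical, and the logic encodes nothing beyond the two PNGase F limitations and the PNGase A alternative stated in the text.

```python
from dataclasses import dataclass


@dataclass
class GlycosylationSite:
    """Hypothetical description of one N-glycosylation site."""
    # Fucose linked alpha1-3 to the Asn-bound GlcNAc (added by plants and insect cells).
    core_alpha1_3_fucose: bool
    # True when both the amino and carboxyl sides of the Asn are in peptide bonds.
    asn_in_peptide_bonds: bool = True


def pngase_f_blockers(site):
    """List the reasons, taken from the text, why PNGase F would not release the glycan."""
    reasons = []
    if site.core_alpha1_3_fucose:
        reasons.append("core alpha1-3 fucose")
    if not site.asn_in_peptide_bonds:
        reasons.append("Asn at a peptide terminus")
    return reasons


def suggest_release_enzyme(site):
    """Suggest an enzyme, or None when the text names no enzymatic option."""
    blockers = pngase_f_blockers(site)
    if not blockers:
        return "PNGase F"
    if blockers == ["core alpha1-3 fucose"]:
        # The text states that PNGase A releases all classes of oligosaccharides.
        return "PNGase A"
    return None
```

For a glycoprotein expressed in plant or insect cells, `suggest_release_enzyme(GlycosylationSite(core_alpha1_3_fucose=True))` would point to PNGase A rather than PNGase F.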

N-linked oligosaccharides can also be released chemically using hydrazinolysis [96, 97]. This method has not been commonly used since PNGase F has become commercially available, because of the safety issues created by the chemicals used in this procedure and the complexity of the procedure. N-acetyl and N-glycolyl groups are also lost when oligosaccharides are released by hydrazinolysis [98, 99].

For many laboratories, the fastest and easiest way of profiling oligosaccharides is by mass spectrometry, especially MALDI-TOF MS. This method has the advantage of providing more information on the types of oligosaccharide structures present because the masses of the oligosaccharides can be matched to possible oligosaccharide structures. There are software packages that can perform this analysis. However, all possible isobaric oligosaccharides (oligosaccharide structures having the same mass) must be carefully considered when matching possible oligosaccharide structures to masses. An understanding of glycosylation pathways, both in general and in different expression systems, is critical when eliminating impossible or unlikely glycan structures. In early drug development, generating oligosaccharide mass profiles is often the preferred method of oligosaccharide profiling. However, there has been resistance to validating these methods and transferring them into QC laboratories due to concerns over the accuracy of quantification by mass spectrometry (even though it has been demonstrated that it can be used quantitatively [100]), the complexity of the equipment, the volume of data generated and the challenges of data analysis. This has left oligosaccharide mass profiling as an incredibly valuable method in three situations: during early drug development, when oligosaccharides need to be identified; later in development, as an orthogonal method to HPLC or CE; and when critical lots need to be analyzed (for example, process qualification lots). Mass spectrometric methods have been reviewed recently by Wada [100] and Stadlmann [101].
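The point about isobaric structures can be illustrated with a minimal Python sketch that matches an observed mass to monosaccharide compositions. It is not one of the software packages mentioned above. The function names, the 0.02 Da tolerance and the upper limit of eight residues per type are assumptions chosen for illustration, and the calculation assumes neutral, unlabeled glycans with monoisotopic residue masses.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.0528,     # galactose, mannose, glucose (indistinguishable by mass)
    "HexNAc": 203.0794,  # GlcNAc, GalNAc
    "dHex": 146.0579,    # fucose
    "NeuAc": 291.0954,
    "NeuGc": 307.0903,
}
WATER = 18.0106  # added once for the free reducing-end glycan


def glycan_mass(composition):
    """Neutral monoisotopic mass of an unlabeled, released glycan."""
    return WATER + sum(RESIDUE_MASS[r] * n for r, n in composition.items())


def candidate_compositions(observed_mass, tol=0.02, max_count=8):
    """Enumerate every composition within tol Da of the observed neutral mass."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        comp = dict(zip(names, counts))
        if abs(glycan_mass(comp) - observed_mass) <= tol:
            # Keep only the residues that are actually present.
            hits.append({r: n for r, n in comp.items() if n})
    return hits


# A monosialylated biantennary composition and its isobaric alternative:
target = glycan_mass({"Hex": 5, "HexNAc": 4, "NeuAc": 1})
for comp in candidate_compositions(target):
    print(comp)
```

The search returns Hex5HexNAc4NeuAc1 and also Hex4HexNAc4dHex1NeuGc1, because a hexose plus NeuAc has the same mass as a fucose plus NeuGc. Mass alone cannot separate the two; knowledge of the expression system (for example, whether it adds NeuGc) or an orthogonal analysis is needed to choose between them.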

Although electrophoretic methods [102], mass spectrometry [100, 103] and CE [102, 104, 105] are all used for oligosaccharide profiling, HPLC methods have been the most commonly used for lot release. There are many HPLC separations for N-linked oligosaccharides described in the literature (reviewed in [106]); however, many of them are not particularly useful for N-linked oligosaccharide profiling of therapeutic glycoproteins. To be useful for the profiling of therapeutic glycoproteins, a method should:

  1. Generate a good separation in one dimension (earlier methods separated oligosaccharides based upon one characteristic on one column, then another on a second column).

  2. Produce shifts in the oligosaccharide pattern that are predictive of changes in the glycoform distribution (i.e. changes in size or charge can be inferred from a shift in retention times).

  3. Utilize a sensitive method for detection (important since oligosaccharides do not contain good chromophores).

  4. Be amenable to further characterization of the oligosaccharides, either by collecting the oligosaccharides for further analysis or by on-line analysis.

Although oligosaccharide profiling by HPLC may not immediately provide many specific details about the glycosylation of the molecule (methods utilizing mass spectrometry are better at this), it is a very sensitive test of lot-to-lot variation in the glycosylation. It is also a relatively simple analysis to perform, and a good HPLC profiling method can provide important information on consistency even if the identity of the oligosaccharides in each peak is not known. It is possible to develop platform methods that will work well for most glycoproteins without the need for extensive optimization for individual glycoproteins. HPLC profiling data can be collected to demonstrate the consistency of glycosylation early in drug development, and the oligosaccharide peaks in the profile can then be identified later in development. It is, however, important to choose an oligosaccharide profiling method that has been documented to provide sufficient detail and to detect relevant changes in glycosylation.
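Because consistency can be assessed without knowing the identity of each peak, a lot-to-lot comparison can be as simple as comparing relative peak areas. The Python sketch below is a hypothetical illustration: the peak labels, the function names and the 2 percentage-point threshold are assumptions, not acceptance criteria from this review.

```python
def relative_areas(peak_areas):
    """Convert integrated peak areas to percent of total profile area."""
    total = sum(peak_areas.values())
    return {peak: 100.0 * area / total for peak, area in peak_areas.items()}


def compare_lots(reference, test, max_abs_diff=2.0):
    """Return peaks whose relative area differs from the reference lot
    by more than max_abs_diff percentage points (hypothetical threshold)."""
    ref = relative_areas(reference)
    tst = relative_areas(test)
    flagged = {}
    for peak in sorted(set(ref) | set(tst)):
        # A peak missing from one lot is treated as 0% in that lot.
        diff = tst.get(peak, 0.0) - ref.get(peak, 0.0)
        if abs(diff) > max_abs_diff:
            flagged[peak] = round(diff, 2)
    return flagged


# Peaks are labeled only by elution order; their identities need not be known.
reference_lot = {"P1": 50.0, "P2": 30.0, "P3": 20.0}
test_lot = {"P1": 40.0, "P2": 30.0, "P3": 30.0}
print(compare_lots(reference_lot, test_lot))
```

In a separation that groups oligosaccharides by charge, a shift of area between peak groups of this kind would suggest a change in sialylation even before any peak has been identified.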

High pH anion-exchange chromatography (HPAEC) of underivatized oligosaccharides with pulsed-amperometric detection (PAD) was the first oligosaccharide profiling method ([85, 107] and reviewed in [86]) that addressed the needs of monitoring the glycosylation of therapeutic glycoproteins. In this method, oligosaccharides are separated using high pH (pH 12). The high pH converts hydroxyl groups to oxyanions and differences in the interaction of these oxyanions in the oligosaccharides with the anion exchange resin result in differences in the retention times. Although these methods provided very good separations of oligosaccharides they are difficult to work with, because the mobile phases contain a large amount of salt (sodium hydroxide and sodium acetate) which is difficult to remove from isolated oligosaccharides. Different oligosaccharides respond differently to this form of detection making quantification, or even relative quantification of different oligosaccharides impossible. Although recent advances in the technology have made this less of an issue, the gold electrode gets fouled over time, which affects the response and creates issues with day-to-day reproducibility.

Separations on amide or amino columns are now available that are as sensitive to changes in the oligosaccharide structure as the HPAEC methods, and these have become the method of choice for many laboratories. These separations use a combination of both normal-phase and anion-exchange separations [108–112]. Weak anion-exchange methods were historically used to separate oligosaccharides into charged groups, but they were unable to separate oligosaccharides within a charged group. These new methods, like HPAEC, are capable of separating the oligosaccharides into charged groups and then separating oligosaccharides based upon size/monosaccharide composition and/or linkages within charge groups. Siemiatkoski et al. also report an excellent separation of the neutral oligosaccharides on monoclonal antibodies using one of these columns without any organics in the mobile phases [113].

These separations on amide/amino columns are performed with fluorescently labeled oligosaccharides. The two most commonly used fluorescent labels are 2-AA (2-aminobenzoic acid) and 2-AB (2-aminobenzamide) [114], which are attached to the reducing end of the oligosaccharide using reductive amination. There is an alternative protocol for labeling with Fmoc-Cl (9-fluorenylmethyl chloroformate) where the label is attached to the reducing end of an intermediate of the oligosaccharide formed during release of the oligosaccharide by PNGase F [115]. The different types of labeling used for carbohydrates are reviewed in [114, 116].

RP-HPLC is another separation that has been used for oligosaccharide profiling [112, 117]. More recently, RP-HPLC using graphitized carbon columns has been used [118] because these columns retain carbohydrates (even underivatized ones) better than other reversed-phase resins. Both of these separations have the disadvantage of not being able to separate oligosaccharides into charged groups. Since negatively charged groups (sialic acid, phosphate and sulfate) are often critical to the function of glycoproteins, it is preferable to be able to easily determine changes in the amount of charged oligosaccharides on your molecule, which is more easily done with separations involving anion-exchange. However, RP-HPLC has been shown to work very well in LC-MS applications [119].

The methods used for profiling O-glycans are very similar to those used for N-glycans, except that there is no commercially available enzyme that will release all O-glycans from proteins. O-Glycanase will release certain O-glycans, but not all, from proteins [120]. O-glycans must therefore be released chemically, and care must be taken not to degrade the released oligosaccharide in the process. The most common technique is β-elimination [121]. This β-elimination has the disadvantage of producing reduced oligosaccharides, which precludes using reductive amination to label the released oligosaccharides with a chromophore/fluorophore (the most common method for labeling oligosaccharides).

There are several methods that leave the reducing end of the O-glycan intact. The most common method is hydrazinolysis. Hydrazinolysis is also used to release N-glycans but by altering the temperature of the reaction it is possible to favour the release of N-linked or O-linked oligosaccharides [96, 97]. The disadvantages for releasing O-glycans are the same as described for the removal of N-linked oligosaccharides (see above). However, using this release method has allowed for fluorescent labeling of the released O-links and separation by normal-phase chromatography [122, 123].

A method using ethylamine to remove O-glycans has been reported [124], although with a low recovery of O-glycans relative to the original β-elimination. An ammonium-based β-elimination [125] has recently become popular (and is in use in our labs). This method releases both N-linked and O-linked oligosaccharides. Both of these methods release glycans with the reducing end intact, allowing for labeling of the released oligosaccharides.

Due to the greater difficulty in removing O-links from the protein, especially without losing the reducing end, there are far fewer examples of oligosaccharide profiles of O-glycans. HPAEC-PAD has been used [126], as has the separation of neutral O-links on an amino column [127, 128]. There are also reports of separations on amide columns [111, 129] and of RP-HPLC separations [123, 128].

As with N-linked oligosaccharide profiling, it is also possible to perform LC-MS using some of these profiling methods [128, 130, 131].

When more specific details of the glycosylation of your molecule are necessary, an oligosaccharide profile alone will not provide sufficient information on the oligosaccharides present. A more extensive characterization might be necessary in order to select an expression system or cell culture conditions, or to optimize purification of the molecule. Certain lots of material also require a more detailed analysis of the glycosylation (e.g. reference lots, validation or process qualification lots and comparability studies). Adding additional testing of key lots and demonstrating that these lots are comparable, as shown by the oligosaccharide profiling method, serves to validate that the oligosaccharide profiling method(s) is sufficient for demonstrating consistent glycosylation and that the method is not missing any changes in glycosylation. The methods used for identification of the oligosaccharides are described later in this review (under Structural Characterization), but some of the HPLC methods have the advantage of working as LC-MS methods [132] and CE can also be coupled to MS (reviewed in [133, 134]).

Due to the complexity of oligosaccharides, information from two or more techniques may be required to confirm the identity of certain oligosaccharides. For instance, the retention time of an oligosaccharide on an oligosaccharide profile, or its mass, could be consistent with a fucosylated oligosaccharide structure, and monosaccharide compositional analysis could support the presence of fucose on the glycoprotein (or on the isolated oligosaccharide itself). Further information from techniques such as oligosaccharide sequencing using fucosidases, GC-MS analysis or MSn would still be required to determine the position and linkage of the fucose residue.

4 Monosaccharide composition

For the determination of the monosaccharide composition of glycoproteins or glycans, oligosaccharides are hydrolyzed into monosaccharides and the monosaccharides are then separated and quantified. Typically, fucose, galactose, mannose, GlcNAc and GalNAc are measured since they are the most common monosaccharides; however, there are methods that can be used to separate different mixes of monosaccharides. This discussion will focus on methods used to measure the monosaccharides present on a glycoprotein; similar methods can also be optimized for isolated oligosaccharides or polysaccharides. The downside of monosaccharide analysis is that much information is lost upon hydrolysis (size of oligosaccharides, branching) and therefore, it is always more informative to examine the intact oligosaccharides rather than measuring the hydrolyzed monosaccharides.

Two methods have been most commonly used for monosaccharide analysis: gas-liquid chromatography coupled with mass spectrometry and HPLC. Due to the chemistry involved in the GC methods and the more specialized equipment required for this analysis, the HPLC methods are much more commonly used. More information on the GC methods can be found in the following references [135–138] but this review will concentrate on the HPLC and CE methods.

All of the HPLC and CE methods require that oligosaccharides are first hydrolyzed into their constituent monosaccharide units using acid hydrolysis. Choosing conditions for the acid hydrolysis is complicated because no single set of conditions has been demonstrated to work for all glycans. Different monosaccharides are released at different rates by acid, and the rate of release can even be affected by their glycosidic linkage. Once released, monosaccharides are destroyed at different rates by the acid. In particular, GlcNAc and GalNAc residues are much more stable in acid and more difficult to hydrolyze than the other monosaccharides ([139] and reviewed in [140]).

For the best accuracy, the recommendation has been to hydrolyze at 100°C in 4 M HCl for 6 h for the amino sugars and in 2 M TFA for 4 h for the neutral monosaccharides [84, 139]. In practice, most laboratories settle for a compromise of 2 M TFA for 3–6 h at 100°C [84, 139, 141], understanding that the amino sugars will not be 100% released from the sample, but finding that two separate hydrolyses are not worth the effort given the limitations of this analysis. An alternative method using 4 N TFA for 2 h at 121°C for all monosaccharides has also been reported [142].

Once hydrolysis of the oligosaccharides is complete, the monosaccharides can be separated underivatized by HPAEC and detected by PAD, or they can be labeled and separated by HPLC or CE (reviewed by Anumula [114]).
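Whichever separation is used, the quantification step reduces to converting peak areas into molar amounts and then into molar ratios. The Python sketch below shows one way this might be done. It assumes each monosaccharide has a response factor (peak area per pmol) obtained from standards run alongside the sample, and it normalizes to three mannose residues, a convention often applied to complex N-linked oligosaccharides. The function name and all numbers are illustrative, not measured data.

```python
def molar_ratios(peak_areas, response_factors, reference="Man", reference_count=3.0):
    """Convert peak areas to molar ratios relative to a reference monosaccharide.

    peak_areas:       peak area per monosaccharide in the hydrolyzed sample
    response_factors: peak area per pmol, from standards (assumed available)
    reference:        monosaccharide whose amount is set to reference_count
    """
    # Area divided by the response factor gives the amount in pmol.
    amounts = {sugar: peak_areas[sugar] / response_factors[sugar] for sugar in peak_areas}
    scale = reference_count / amounts[reference]
    return {sugar: round(amount * scale, 2) for sugar, amount in amounts.items()}


# Illustrative values only.
areas = {"Fuc": 50.0, "GlcNAc": 160.0, "Gal": 100.0, "Man": 150.0}
factors = {"Fuc": 1.0, "GlcNAc": 0.8, "Gal": 1.0, "Man": 1.0}

ratios = molar_ratios(areas, factors)
print(ratios)
```

The sketch does not correct for incomplete release of the amino sugars or for destruction of released monosaccharides during hydrolysis, so under the compromise conditions described above the GlcNAc and GalNAc ratios would be expected to read low.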

5 Sialic acid analysis

Sialic acids are a family of negatively charged monosaccharides (reviewed in [143]) that are usually found at the termini of oligosaccharides. Typically, the addition of sialic acid to oligosaccharide chains prevents any further elongation of the oligosaccharide chains, although in some cells sialic acid polymers (with sialic acids linked α2-8 to each other) are formed [144]. The most common sialic acid is N-acetylneuraminic acid (abbreviated as NeuAc, Neu5Ac or NANA) but there are many possible modifications of this molecule. The modification that is found on many therapeutic glycoproteins and has received much attention is N-glycolylneuraminic acid (abbreviated as NeuGc, Neu5Gc or NGNA). As discussed earlier in this review, humans are unable to synthesize NeuGc.

Sialic acids are easier to remove from oligosaccharide chains, but they are also more acid labile. They are hydrolyzed using milder hydrolysis conditions than other monosaccharides. With the milder acid hydrolysis there is much less destruction of the released sialic acid than seen during the hydrolysis of neutral monosaccharides where there is significant destruction of the released monosaccharides. Quantitative release of sialic acids is also much easier to achieve than for some of the neutral monosaccharides that are difficult to hydrolyze. The most common methods for sialic acid analysis are reviewed in Manzi [145] and Zanetta [146].

6 Specialized methods

Oligosaccharides on some therapeutics will have a modification that will require development of a specialized assay. Two examples are galactose residues linked α1-3 to galactose residues (GalαGal) and mannose 6-phosphate residues. GalαGal is of interest because some expression systems will add these structures and humans do not add galactose in this linkage [147]. In addition, all humans have high concentrations of antibodies against this epitope in their serum [147]. Mannose 6-phosphate is added to lysosomal proteins to assist in targeting to the lysosome [148, 149] but can also be found on other proteins (deoxyribonuclease I, for example [23]). This residue may impact the clearance of molecules because there are receptors involved in the clearance of glycoproteins from the bloodstream that recognize mannose 6-phosphate residues (reviewed in [150]). There is also the possibility of monosaccharides that may, or may not, be measured in the monosaccharide analysis. An example of this would be monosaccharides commonly found in plants [151].

7 Site-specific analysis

Different glycosylation sites on the same glycoprotein can carry different types of oligosaccharides and differences in the occupancy of certain glycosylation sites can affect the efficacy of a molecule (t-PA is an example and this is reviewed in [152]). Consequently, there are times when the oligosaccharides at specific glycosylation sites need to be characterized. This is usually accomplished by LC-MS analysis of peptide maps where the glycosylation sites can often be isolated on different peptides [100, 153].
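Before a peptide map is run, it can help to check which peptides carry potential N-linked sites and whether any peptide carries more than one. The Python sketch below scans for the N-X-S/T consensus sequon (X being any residue except proline). It covers N-linked sites only, since O-linked sites have no comparable consensus sequence. The function names and peptide sequences are hypothetical, and a sequon marks only a potential site; occupancy still has to be measured.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead keeps matches from consuming residues, so overlapping
# sequons (for example "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


def peptides_with_multiple_sites(peptides):
    """Names of peptides on which two or more potential sites are not separated."""
    return [name for name, seq in peptides.items() if len(find_sequons(seq)) > 1]


# Hypothetical peptides from a digest, for illustration only.
digest = {
    "T1": "MNGTANPSNK",   # N2 is in a sequon; N6 is followed by proline
    "T2": "SNKSALNVTR",   # two potential sites on one peptide
    "T3": "AAGLQVR",      # no potential site
}

for name, seq in digest.items():
    print(name, find_sequons(seq))
print(peptides_with_multiple_sites(digest))
```

A peptide flagged by the second function is a case where a different protease, or a second digestion, might be needed to isolate each site on its own peptide.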

8 Identification of oligosaccharide structures/structural analysis/sequencing

Sometimes it is necessary to characterize the oligosaccharide structures present on the glycoprotein. This characterization can range from simply determining whether the oligosaccharides are oligomannose/complex and the relative size (the number of mannose residues or antennarity) to determining the glycosidic linkage of each monosaccharide on the oligosaccharide. We now understand that glycosidic linkages are carefully controlled by the glycosyltransferases that synthesize oligosaccharides and there is a great deal of information on the types of oligosaccharides found in the commonly used expression systems. It is therefore usually not necessary to determine the glycosidic linkages of the oligosaccharides, especially since they rarely have an impact on the efficacy of glycoproteins.

This information was historically obtained by GC-MS (gas chromatography mass spectrometry) or NMR (nuclear magnetic resonance) [123, 153, 154]. Oligosaccharides are now mostly sequenced by digesting with exoglycosidases or by mass spectrometry with fragmentation. Exoglycosidases are enzymes that release monosaccharides from an oligosaccharide and some will only release a specific monosaccharide or even a specific monosaccharide in one type of glycosidic linkage. The removal of monosaccharides can be monitored by shifts in the retention time of the oligosaccharide after digestion [155] or by changes in the mass by mass spectrometry [156–159]. Unfortunately, glycosidase sequencing is complicated because commercial sources are not available for all of the exoglycosidases necessary to completely sequence an oligosaccharide structure, nor for all of the oligosaccharide standards necessary to identify an oligosaccharide. Oligosaccharide standards are necessary to help interpret shifts in oligosaccharides after digestion with glycosidases.
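When digestion is monitored by mass spectrometry, the observed mass loss can be translated into a count of residues removed. The Python sketch below assumes monoisotopic masses of underivatized glycans and a hypothetical tolerance of 0.02 Da; the function name and the example masses (calculated for illustration, not measured) are likewise assumptions.

```python
HEX_RESIDUE = 162.0528   # monoisotopic residue mass of a hexose (Da)


def residues_removed(mass_before, mass_after, residue_mass, tolerance=0.02):
    """Number of residues released by a digest, or None if the shift does not fit.

    The mass loss is compared with whole multiples of the residue mass that
    the exoglycosidase is expected to release.
    """
    delta = mass_before - mass_after
    count = round(delta / residue_mass)
    if count < 0 or abs(delta - count * residue_mass) > tolerance:
        return None
    return count


# Illustrative calculated masses: a biantennary fucosylated glycan with two
# terminal galactose residues, before and after a galactosidase digest.
before, after = 1786.6501, 1462.5445
print(residues_removed(before, after, HEX_RESIDUE))
```

Galactose and mannose have the same residue mass, so the mass shift alone cannot tell them apart. The identification rests on the specificity of the enzyme used, which is one reason the limited availability of well-characterized exoglycosidases matters.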

9 Conclusions

Much was learned from the extensive characterization of the early glycoprotein therapeutics and as more and more therapeutics are developed we have added to this body of knowledge. Data has been accumulated on the glycosylation added to many different proteins by the most common expression systems [4, 160]. This information is very useful in predicting how a molecule will be glycosylated by a particular expression system, for choosing the best expression system for your molecule and for anticipating problems that may arise because of the chosen expression system.

Over time, oligosaccharide profiling has replaced monosaccharide composition analysis as the method of choice for characterizing therapeutic glycoproteins and especially for release testing and comparability/consistency testing. However, most therapeutic glycoproteins will require more exhaustive testing at times during their development and possibly for product release as well. The most recent advances in carbohydrate analysis have been the use of fluorescent probes for HPLC analysis and the much more common use of mass spectrometry, particularly MALDI-TOF MS. Hopefully, in the near future, LC-MS (and LC-MS-MS) methods for oligosaccharide analysis will become more widely available making it easier to unequivocally identify oligosaccharide structures.