Introduction

Glycosylation is one of the most common posttranslational modifications in eukaryotes, and a majority of cellular proteins are glycosylated through either N- or O-glycosylation or through the glycosylphosphatidylinositol anchor pathway, which connects the protein to two fatty acid chains. Other important glycoconjugates include glycosphingolipids, lipopolysaccharides, and peptidoglycans. The attachment of polysaccharides or glycans to biomolecules depends on the physiological status of the cells [1] and the protein sequence [2]. Glycosylation changes have been identified in various diseases, ranging from systemic genetic diseases, like congenital disorders of glycosylation (CDG) syndrome [3], to localized malignancy, like ovarian cancers [4]. There has been a long history of applying glycobiomarkers for disease diagnosis and prognosis. Because of their diverse structures and the information they carry, glycans provide a valuable source of biomarkers. In this “Omics” era, the concept of glycomics has evolved. In the past, research was restricted to glycosylation analysis of individual glycoproteins, and large-scale glycome screening at the tissue level was not possible because of technical difficulties. With the recent advances in analytical technologies, glycomics and glycoproteomics are gaining momentum in biomarker research.

While genomics and proteomics have been relatively well established and commonly applied to biomarker discovery [5, 6], biomarker research remains relatively unexplored from the glycomics perspective. This review gives an introduction to the characteristics of glycosylation and the historical applications of glycobiology to biomarker research. The concept of glycomics, its differences from glycoproteomics, its recent advancement and future prospects in biomarker research, and the technologies for quantitative profiling of tissue glycome are then reviewed. Given the broad spectrum of glycomic studies in various diseases, it would be impossible to provide an in-depth review of every aspect in this article. The focus of this article will be mainly on the applications of N-linked and O-linked glycans in biomarker discovery.

Overview of Biochemical and Functional Characteristics of Glycan

Compared with DNA molecules and polypeptides, glycans attached to glycoconjugates differ in three fundamental ways: (1) non-template-based synthesis, (2) branching primary structures, and (3) variable linkages between the basic units, i.e., monosaccharides. In humans, the major glycan basic units include mannose (Man), galactose (Gal), N-acetylgalactosamine (GalNAc), glucose (Glc), N-acetylglucosamine (GlcNAc), sialic acid (also called N-acetylneuraminic acid, Neu5Ac), and fucose (Fuc). In contrast to the production of DNA molecules or polypeptides, which requires the presence of complementary DNA strands or mRNAs, there is no template or blueprint molecule on which the production of oligosaccharides is based. The synthesis of a glycan depends on the activity of a set of glycosidases and glycosyltransferases in the rough endoplasmic reticulum and Golgi apparatus and on the target molecule. To date, it is still impossible to predict the structure(s) of the glycosylation side chain(s) based on the protein sequence. While the primary structures of DNA molecules and proteins are linear and relatively straightforward, the branching property allows glycans to have an exceedingly large number of isomeric primary structures even if only six basic units are concerned [7]. In contrast to DNA molecules and proteins, the bonding between two monosaccharide residues can have a variety of configurations and linkages. There are two different stereochemical configurations of glycosidic bonds: an alpha linkage and a beta linkage. The only difference between the alpha and beta linkages is the orientation of the linked carbon atoms. Furthermore, each carbon in a monosaccharide can participate in such bonding. This flexibility and complexity cannot be addressed by using conventional analytical approaches in genomics and proteomics.

Glycosylation alters the biochemical properties of a glycoprotein in a number of ways, including its isoelectric point [8], conformational stability [9], thermal and pH stabilities [10, 11], susceptibility to inorganic solvent and proteolysis [12], and lectin-binding behavior [13]. In general, glycosylation increases the thermal stability and reduces the susceptibility of a protein to proteolysis, as demonstrated by various deglycosylated glycoenzymes [14]. However, the removal of glycosylation does not always change the biochemical properties of different glycoproteins in the same way [14]. Therefore, functional changes brought about by glycosylation depend not only on the structure of a glycan, but also on the characteristics of the protein concerned. It is almost impossible to predict the functional changes brought about by glycans based solely on their structures. Moreover, a single glycoprotein molecule may have several glycosylation sites with various susceptibilities to glycosylation and may carry different glycan chains independently, leading to a considerable number of glycoforms with subtle differences in their properties. For example, serum haptoglobin (Hp) is a tetramer composed of two alpha subunits of 9.1 kDa and two beta subunits of 40 kDa. The carbohydrate content of Hp is found exclusively as ‘complex’ N-linked glycans on the beta subunit [15]. There are four N-linked glycosylation sites on the beta subunit. The glycans are either biantennary or triantennary, both terminating with sialic acid residue(s). Fucose is linked to the core GlcNAc residue at either the alpha-1,6 position or the alpha-1,3 position [16]. In our recent study, we observed a total of 18 glycoforms with slight differences in molecular weight (ranging from 35 to 44 kDa) and pI value (ranging from 4.6 to 5.8) [17].

Given such complicated relationships between protein biochemical properties and glycosylation, it is inevitable in glycoproteomics to isolate and study the structures of glycan motifs. This is especially important in biomarker research because glycosylation is usually tissue or disease specific [18, 19]. Such analysis allows identification of glycoproteins with tissue-specific glycosylation or tracing back the origin of disease-related differential glycoproteins.

N-linked and O-linked Glycosylations

Two major types of glycoprotein glycosylation are studied extensively in biomarker discovery: N-linked and O-linked glycosylation. This is because they are commonly associated with secretory glycoproteins found in the blood. Glycosylphosphatidylinositol (GPI)-anchored glycoproteins, on the other hand, are bound to the plasma membrane by two fatty acid chains and are less readily detected in body fluids. The structural and physiological characteristics of these two types of glycans are more diverse than those of GPI-anchored glycoproteins.

An N-linked glycan is covalently bonded to the amide group on an asparagine residue of a protein. The asparagine residue is located in a consensus Asn-X-Ser/Thr sequence in which X can be any amino acid except proline. In some rare cases, the serine or threonine residue can be replaced with a cysteine, giving an Asn-X-Cys glycosylation site, as found in epidermal growth factor receptor [20], a minor glycosylation site in human transferrin [21], and CD81 molecules [22]. N-glycans can be classified into three classes based on the composition and sequence of oligosaccharides: high mannose, complex, and hybrid. The high-mannose N-linked glycan is composed mainly of polymannosyl residues in all branches. The complex-type glycans have a characteristic N-acetyllactosamine Gal(β1–4)GlcNAc in every branch, while the hybrid-type glycan carries both high-mannose branches and complex branches. Regardless of class, all N-linked glycans share the common tri-mannosyl core Manα1–6(Manα1–3)Manβ1–4GlcNAcβ1–4GlcNAc. Each mannose residue of the tri-mannosyl core at the nonreducing termini can be extended in an independent manner by oligomannose (high-mannose type) or N-acetyllactosamine (complex type), or linked up with two more branches (antennae). A single N-glycan can therefore carry up to four branches (tetraantennary). The branching mannose residue in the core is susceptible to the addition of a bisecting GlcNAc through a β1–4 bond by the action of N-acetylglucosaminyltransferase-III, while the asparagine-linked GlcNAc may be fucosylated through an α1–6 bond by the action of α1–6-fucosyltransferase [23, 24].
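The consensus rule described above lends itself to a simple sequence scan. The following is a minimal sketch, not a validated prediction tool: the function name `find_sequons` and the example peptide are hypothetical, and a matching sequon only marks a potential site, since the presence of the motif does not guarantee that the asparagine is actually glycosylated.

```python
import re

# Lookahead patterns so that overlapping sequons (e.g. "NNTS") are all reported.
# N = Asn, [^P] = any residue except Pro, [ST] = Ser or Thr.
SEQUON = re.compile(r"(?=N[^P][ST])")
# Rare variant mentioned in the text: Cys in place of Ser/Thr (Asn-X-Cys).
SEQUON_WITH_CYS = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence, allow_cys=False):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs (X != Pro)."""
    pattern = SEQUON_WITH_CYS if allow_cys else SEQUON
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    peptide = "MNGTANPSQNACK"  # made-up sequence for illustration only
    print(find_sequons(peptide))                  # [2]
    print(find_sequons(peptide, allow_cys=True))  # [2, 10]
```

In the made-up peptide, Asn-2 lies in Asn-Gly-Thr and is reported, Asn-6 is followed by proline and is skipped, and Asn-10 is reported only when the Asn-X-Cys variant is allowed.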

O-linked glycans, on the other hand, have more complex and less defined structures than N-linked glycans. An O-linked glycan is linked to the hydroxyl side chain group on a serine or threonine residue in the protein. The two most striking differences from N-linked glycans are that O-linked glycans lack both a common tri-mannosyl core and a consensus amino acid sequence at the glycosylation site. In most O-linked glycoproteins, the glycans are of the mucin type, which carries the core sugar GalNAc, followed by a Gal, GalNAc, or GlcNAc residue. Unlike N-linked glycans, O-linked glycans are less branched and usually carry, at most, two antennae, branching at the core GalNAc by Gals.

In both types of glycosylation, the main monosaccharide residues encountered are usually hexoses and their N-acetylated derivatives, including GlcNAc and N-acetyllactosamine. Pentoses are rarely found in the human glycosylation system except xylose in O-linked glycans, and their presence is usually allergenic [25]. Sialic acid, or N-acetylneuraminic acid, is another important constituent of glycans. It is a nine-carbon acidic monosaccharide usually found at the termini of various glycans and on the plasma membrane of vertebrates and invertebrates. Sialic acid can link up with the terminal Gal in an α2–3 or α2–6 configuration, depending on the cell type, tissue type, developmental stage, and environmental factors [26]. An α2–8 linkage is also found in the terminal polysialic acid chain in N-linked and O-linked glycans and in gangliosides [27, 28]. The presence of sialic acid on the plasma membrane increases the hydrophilicity and negative charge of the cell and masks the Gal residue in the internal sequence, preventing it from binding to specific receptors [26]. It is also commonly expressed by microbial pathogens as a form of molecular mimicry, as sialic acid is closely associated with cell surface glycoproteins and the cell–cell recognition system [26, 29].

Historical Development and Clinical Application of Glycobiomarkers

Glycosylation changes have been identified in various diseases, ranging from systemic genetic diseases like CDG syndrome [3] to localized malignancy like ovarian cancers [4]. There has been a long history in applying glycobiomarkers for disease diagnosis and prognosis.

Glycans with specific sequences are recognized as antigens by the immune system. Besides the ABO blood group antigens, one of the best-known examples is the Lewis antigen (Le) present on the plasma membrane of red blood cells. Lewis antigens belong to a type of cell adhesion molecule expressed by leukocytes and some circulating cancer cells. They are recognized by selectins, a family of lectins expressed by vascular endothelial cells, and assist the adhesion of leukocytes and cancer cells to endothelia. Four types of Le antigen have been identified: Lea (Galβ1–3(Fucα1–4)GlcNAc), Leb (Fucα1–2Galβ1–3(Fucα1–4)GlcNAc), Lex (Galβ1–4(Fucα1–3)GlcNAc), and Ley (Fucα1–2Galβ1–4(Fucα1–3)GlcNAc). Lea, Lex, and their sialylated (Neu5Acα2–3Galβ1–4(Fucα1–3)GlcNAc) and sulphated (SO4-3-O-Galβ1–4(Fucα1–3)GlcNAc and Neu5Acα2–3Galβ1–4(Fucα1–3)(SO4-6-O)GlcNAc) variants are commonly associated with cancer cell adhesion. Previous studies have shown that inhibition of Lea and Lex antigen expression in cancer cells greatly reduces their adhesion to endothelial cells or selectin-expressing cells in vitro, and the expression of sialyl Lea and sialyl Lex is associated with tumor progression and metastasis [30]. Because of the connection between sialyl Lewis antigen and tumorigenicity, it is not surprising that the expression of sialyl Lea is a useful prognostic factor in cancers, for example, in colorectal carcinomas [31–33]. Helicobacter pylori also expresses Lex and Ley antigens on its cell surface as part of the lipopolysaccharide to mimic the cell-surface glycoconjugate molecules of human gastric endothelial cells [34]. These antigens are responsible for the adhesion of H. pylori, the infection and colonization of which cause chronic gastritis and gastric cancer.

The glycosylation of immunoglobulin (Ig) is also of great interest. Immunoglobulin is a well-known circulating glycoprotein found in serum. It is glycosylated at a conserved location in heavy chains and at a less conserved location in light chains. Immunoglobulins are produced and secreted specifically by Ig-secreting cells and play important roles in the immune system. Glycosylation plays a critical role in modulating the structure and function of Igs. Alterations in the glycan structures of Ig have been observed in a variety of autoimmune diseases, such as rheumatoid arthritis and systemic lupus erythematosus [35, 36], as well as immune-related diseases like IgA nephropathy [37]. Abnormal hypogalactosylated O-glycosylation on the Fc region in IgA molecules is known to cause mesangial IgA deposition in the kidney in IgA nephropathy, leading to glomerular damage [37].

The majority of serum glycoproteins are of hepatic origin. The close relationship between the liver and serum glycoproteins suggests that liver abnormalities associated with aberrant glycosylation can be reflected by changes in serum glycoprotein glycosylation patterns. The degrees of fucosylation of certain serum glycoproteins, such as haptoglobin [38], cholinesterase [39], and alpha-1 acid glycoprotein [40], were increased in liver cirrhosis. Microheterogeneity in the concanavalin A affinity of serum transferrin was observed in patients with alcoholic liver disease [41]. An altered glycosylation pattern of serum transferrin can be observed in patients with alcohol abuse. Carbohydrate-deficient transferrin is a well-established biomarker for detecting alcohol abuse [42–44].

Alterations in the glycosylation of glycoproteins and glycolipids are common in various cancers, and a considerable number of them play important roles in carcinogenesis, such as tumor progression, tumor cell differentiation, cell–cell interaction, and tumor cell adhesion and metastasis [45–48]. For example, downregulation of beta-1,3-N-acetylglucosaminyl-transferase-T2 expression was observed in invasive human bladder transitional cell carcinomas compared with their noninvasive counterparts, suggesting that downregulation of this glycosylation enzyme may be involved in cancer progression [45]. In colorectal cancers, mRNA expression of various glycosyltransferases is significantly altered [46]. Elevated mRNA expression of alpha1–6 fucosyltransferase in human hepatoma tissues was associated with the production of a tumor-specific fucosylated alpha-fetoprotein (AFP) glycoform [47]. The changes in the glycosylation machinery of cancer cells can be reflected in the blood circulation by tracing the changes in the glycosylation of the proteins released by the tumor [48].

The poor specificity of a tumor marker is often due to the fact that it is also produced by normal cells under other pathological conditions. As tumor cells have different glycosylation machinery, it is hypothesized that identification of tumor-specific glycoforms should improve the specificity of a tumor marker. This hypothesis has been supported by various studies showing that identification of tumor-specific glycoforms can improve the diagnostic value of serum AFP. AFP is an N-glycosylated serum glycoprotein carrying a biantennary complex-type glycan [49] and is a well-documented upregulated biomarker for hepatocellular carcinoma (HCC). However, it was also found to be elevated in chronic liver diseases, reducing its specificity in diagnosing HCC. An elevated level of a core alpha1–6-fucosylated AFP glycoform called AFP-L3, measured by lectin-affinity electrophoresis, can differentiate between chronic liver diseases and HCC, demonstrating the application of altered glycosylation in biomarker discovery [47, 50]. Apart from AFP-L3, a monosialylated glycoform of AFP termed msAFP was also found to be able to differentiate between early stage HCC patients with nondiagnostic AFP levels and liver cirrhosis patients with similar total serum AFP levels [51]. These studies have shown that, although total serum AFP alone was not sufficient to differentiate HCC from chronic liver diseases, the qualitative and quantitative information in the N-glycosylation pattern of serum AFP has significantly improved the performance of AFP as an HCC-specific biomarker. Besides AFP, it has been reported that serum levels of fucosylated glycoforms of alpha-1 antitrypsin and transferrin are significantly higher in patients with HCC than in patients with liver cirrhosis [52, 53, 54].

Haptoglobin is a serum glycoprotein carrying sialylated complex-type biantennary N-glycans. Changes in the haptoglobin glycosylation pattern have been studied as early as 1992 in ovarian cancer [36] and in canine diseases in the past decade [55, 56]. The haptoglobin level was found to be elevated in the serum of HCC patients [17], and the fucosylated glycoform of haptoglobin with altered sialylation was found to be associated with tumor progression, enhancing its value as an HCC biomarker [17]. Recently, aberrantly glycosylated haptoglobin has been shown to be a potential biomarker for other cancer types, including non-small cell lung cancer [57], prostate cancer [58], and pancreatic cancer [59].

Glycomics—An Uprising Approach for Biomarker Discovery

Glycome refers to a repertoire of glycans in a tissue/cell type. Glycomics is the study of glycome. Glycomics is becoming a hot research field in the recent years. There have been considerable advances in mass spectrometry (MS) technologies and oligosaccharide analysis technologies, including techniques in derivatization, fluorescent labeling, capillary electrophoresis (CE), and high-performance liquid chromatography (HPLC), making glycomics study possible. Recent advances of analytical technologies also allow more effective differentiation of isomeric and anomeric glycans, which can visualize the glycome with higher resolution. Similar to genomics and proteomics studies, glycomics studies rely on high-throughput screening to identify a panel of distinguishable glycomic features simultaneously. Glycomics study usually involves a large-scale systemic analysis of glycan pools, which usually contain several subtypes such as N-linked and O-linked glycans and glycans from glycolipid. Among these glycan subgroups, their structural properties, such as sequences and extent of branching and sialylation, differ from each other significantly as mentioned above. The analytical approaches for these subgroups hence are also different.

Moreover, as mentioned in the last section, glycans have been recognized as a valuable source of biomarkers for various diseases. Glycomic analysis allows rapid global comparison of the glycome within body fluids or tissues of interest, which would allow identification and application of a new type of biomarker for cancer diagnosis and for monitoring cancer development and treatment. With the establishment of powerful high-throughput technologies, the analysis of the glycome, a complex mixture with significant biological importance, has become a surmountable task. Unlike genomic or proteomic biomarkers, which directly or indirectly rely on transcriptional or translational information, glycomics allows biomarker research to focus solely on the posttranslational events within the cells.

It should be pointed out that there are several differences between glycoproteomics and glycomics. Glycoproteomics aims to enrich, quantify, separate, and identify low abundant glycoproteins with specific glycosylation. Therefore, glycoproteomics focuses on a subset of proteome characterized by the presence of glycosylation on proteins, and proteins remain the main subject of study [60]. In glycoproteomics study, a specific glycoform of a glycoprotein, such as monosialylated AFP, will be identified as a biomarker. Glycomics, on the other hand, focuses on structures and sequences of glycan motifs, and the conjugated molecules are not the main concern. The focus of this review is mainly on the applications of glycomics to biomarker discovery, while glycoproteomics is not covered.

Human Proteome Organization Human Disease Glycomics/Proteome Initiative

The Human Disease Glycomics/Proteome Initiative (HGPI) was launched by the Human Proteome Organization. The major aim of HGPI is to identify disease-related glycobiomarkers in biological fluids, mainly by undertaking functional glycomics and high-throughput MS approaches. To achieve this, HGPI has attempted to develop common data standards and a standard analytical method for N-glycans in various diseases. Ultimately, a common database platform will be developed. Since October 2004, several projects have been carried out by HGPI, including the development of a standard methodology for glycome analysis of glycans carried by human transferrin and IgG and the development of a diagnostic program for CDGs. The result of the method development was noteworthy: it made a comprehensive and quantitative comparison between the merits and drawbacks of different analytical approaches [61]. The ongoing CDG program, on the other hand, aims to complement traditional methods for classification of the syndrome through genetics and to discover new subtypes of the syndrome. This participation by the global proteomics community clearly illustrates the worldwide awareness of the importance of glycomics in biomarker research.

In August 2007, the National Cancer Institute (NCI), which is part of the National Institutes of Health in the USA, funded a new US$ 15.5-million 5-year initiative consisting of seven projects on glycomic biomarker research, with the aims of discovering novel cancer biomarkers and improving preexisting ones. All seven projects (Table 1) are focused on either serum glycomic biomarkers or anti-glycan autoantibodies. The majority of these projects focus on cancers with poor prognosis, including pancreatic cancer and breast cancer. This campaign has highlighted the importance of glycomics in biomarker discovery and the advent of glycomics in this postgenome era.

Table 1 The seven biomarker discovery projects funded by NCI by targeting glycans (information source: http://www.nih.gov/news/pr/aug2007/nci-21.htm)

Recent Applications of Glycomics to Biomarker Discovery

Owing to technological limitations in elucidating the complex information of various glycans, including sequences, structures, and quantities, early studies of glycosylation remained at the level of separation of protein glycoforms by isoelectric focusing or lectin-affinity electrophoresis and subsequent identification by immunoblotting [42–44, 50]. Recently, the development and maturation of more sophisticated analysis platforms like MS and bioinformatics have allowed more in-depth studies and high-throughput analyses of glycosylation. These have accelerated the utilization of glycosylation as a new source of biomarkers, as differential glycans can now be assessed more accurately and precisely at a larger scale. Aberrant glycosylation has gained much attention in biomarker research, especially in liver diseases and cancers.

Except for the use of lectin microarrays, in most cases, it is inevitable in glycomic study to deglycosylate the glycoconjugate of interest (glycoprotein, glycolipid, or lipopolysaccharide) and to retrieve the free glycan molecules for downstream analysis. The whole workflow can be roughly divided into three parts: deglycosylation, purification, and quantitative analysis (Fig. 1).

Fig. 1 Three parts of the workflow: deglycosylation, purification, and quantitative analysis

The number of glycomic and glycobiomarker studies has been gradually rising since the beginning of this century. A brief summary of the articles published in relation to glycomics is given in this review to provide an overview of this growing field. In 2004, the global pattern of desialylated N-linked glycans from whole serum proteins was successfully profiled by using a DNA sequencer as a CE instrument and subsequently used to detect liver cirrhosis [18, 62]. Four glycomic features were found to be significantly different in liver cirrhosis patients compared with controls, and all upregulated features were shown to be fucosylated and to carry a bisecting GlcNAc. The log ratio of two of these features gave a receiver operating characteristic (ROC) curve area of 0.87 in classifying mild fibrosis and compensated liver cirrhosis. In 2007, the same approach was applied to identify N-glycomic changes in HCC patients with hepatitis B virus-induced liver cirrhosis [63]. Two fucosylated glycomic features, one of them carrying a bisecting GlcNAc, were found to be associated with HCC. The log ratio of these two features gave an ROC curve area of 0.81 in diagnosing HCC, which is comparable with that of AFP.
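The log-ratio classifier and its ROC curve area can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the peak intensities are invented for illustration and are not data from the cited studies, the function names are hypothetical, and the area is computed with the rank-based (Mann-Whitney) definition, i.e. the probability that a randomly chosen patient scores higher than a randomly chosen control.

```python
import math


def log_ratio(feature_a, feature_b):
    """Natural log of the ratio of two glycan peak intensities."""
    return math.log(feature_a / feature_b)


def roc_auc(scores_pos, scores_neg):
    """ROC curve area as P(positive score > negative score); ties count as 0.5."""
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical (feature A, feature B) peak intensities per subject.
    patients = [(8.0, 2.0), (6.0, 3.0), (3.0, 3.0)]
    controls = [(2.0, 4.0), (3.0, 3.0), (1.0, 4.0)]
    patient_scores = [log_ratio(a, b) for a, b in patients]
    control_scores = [log_ratio(a, b) for a, b in controls]
    print(round(roc_auc(patient_scores, control_scores), 3))  # 0.944
```

Using a ratio of two peaks from the same profile means that the score does not depend on the absolute signal level of a given run, which is one plausible reason such ratios are attractive as diagnostic variables.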

Apart from CE, a matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS approach has been successfully applied to obtain quantitative profiles of N-linked glycans from whole serum proteins without the need for desialylation and has been used to identify potential N-glycomic biomarkers for liver fibrosis and liver cirrhosis [19]. N-glycans were released from whole serum proteins by PNGase F digestion and were purified by hydrophilic chromatography. Finally, they were profiled by linear MALDI-TOF MS. A total of 17 differential N-glycans that correlated with the degree of liver fibrosis were identified. Four of these N-glycans were selected by linear regression for the construction of a diagnostic model. The model gave an ROC curve area of 0.91 for detecting liver fibrosis and 0.911 for detecting liver cirrhosis in the pilot study. In a similar study [64], the released N-glycans were purified by HPLC on graphitized carbon columns and subsequently subjected to either desialylation or methyl-esterification before MALDI-TOF MS analysis. MALDI-TOF MS was also used in conjunction with a novel type of sepharose bead to profile total serum N-glycans [65]. The sepharose bead was functionalized with a hydrazide polymer to facilitate the immobilization of free glycans through a hydrazone bond, which allowed on-bead methyl-esterification of sialic acid. The technology was applied to study three types of diseases: CDGs, HCC, and prostate cancer. It was able to identify serum glycomic features that could differentiate type 1 and type 2 CDG from healthy normal controls and HCC from healthy normal controls. The method also allowed analysis of the total cellular glycan profiles of human prostate cancer cells and normal human prostate epithelial cells.

Apart from N-glycans, the total serum O-glycomic pattern was also profiled by MALDI-Fourier transform ion cyclotron resonance (FTICR) MS and applied to discover biomarkers for ovarian cancer [66]. A unique serum O-glycomic profile containing 16 cancer-specific signatures was obtained from patients with ovarian cancer. Infrared multiphoton dissociation was applied to glycan sequencing. The diagnostic features identified in the spectra were confirmed to be oligosaccharides and not peptides. Later, the same approach was applied to search for potential biomarkers for breast cancer in mouse and human [67]. Principal component analysis of the O-glycomic profiles successfully distinguished breast cancer samples from normal samples. A summary of the above studies is given in Table 2.

Table 2 Summary of the published glycomics studies on biomarker discovery

Technologies for Quantitative Profiling of Tissue Glycome

Quantitative profiling is essential for the identification of differential features in biomarker discovery. Among the technologies mentioned above, HPLC and CE are well known for their quantitative performance. Glycan detection in these systems is based on measurement of the fluorescent signal of the derivatized glycans. As long as there is neither detector saturation nor substrate bias during derivatization, the signal intensity should be proportional to the amount of analyte. CE has been demonstrated to be quantitative in analyzing native and derivatized glycosaminoglycans [68].

The quantitative aspect of glycan analysis in MALDI-TOF MS is affected by various parameters, including analyte derivatization, type of laser, spotting protocol, choice of matrix, and spectrum acquisition protocol. The structure of glycans, on the other hand, does not significantly affect the quantitative performance of MALDI-TOF MS [69], although it has been suspected that an extra antenna may promote postsource decay [61]. To obtain a quantitative profile of the glycome by MALDI-TOF MS, methyl-esterification of the sialic acid residues is usually required to prevent the loss of sialic acid during MS analysis. However, our recent study has shown that sialylated glycans could be quantified in the presence of 10 mM NaCl without the need for methyl-esterification [19]. In proteomic studies, it has been shown that the reproducibility of MALDI-TOF mass spectra is greatly affected by the spotting method, such as direct mixing, overlaying, and sandwiching [70]. Such parameters are likely to affect the quantitative performance of MALDI-TOF MS in glycan analysis in a similar way.

The reproducibility of the signal intensity of a glycan depends on (1) the quality of the glycan-matrix co-crystal on the sample spot and (2) the laser scanning protocol. The variation of signal intensities can be minimized by obtaining more mass spectra over different areas of the sample spot and averaging the resulting spectra. In a multi-institutional study, the intra-assay and inter-assay coefficients of variation of glycans released from human IgG were below 10% and 4.2% for major species and 34% and 13% for minor species [61]. In a separate study, the intra-assay and inter-assay coefficients of variation of three different standard glycans were below 9% and 18%, respectively [19]. This suggests that the quantitative performance of MALDI-TOF MS is comparable to that of chromatography. However, the multi-institutional study clearly illustrated that there were considerable variations between different centers [61]. More effort has to be made to standardize the protocols and instrumentation to obtain comparable mass spectra from different research centers. The major shortcoming of MALDI-TOF MS is its inability to provide concrete structural information on the glycans detected, although it is possible to predict the structures of glycans based on the molecular weights obtained from MS [19]. In the future, a similar approach could be carried out with a MALDI-TOF/TOF MS system to allow direct structure analysis by tandem MS. Besides the combined use of MALDI and TOF MS, MALDI has been combined with FTICR MS to obtain quantitative glycomic profiles [66, 67]. However, the reproducibility of MALDI-FTICR MS in quantitative glycomic profiling has not been systematically evaluated.
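The coefficients of variation quoted above can be computed from replicate intensities in a few lines. The sketch below is illustrative only: the cited studies do not spell out their exact formulas here, so it assumes one common convention (intra-assay CV as the mean of the per-run CVs, inter-assay CV as the CV of the run means), and the function names and replicate values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Percent CV: 100 * sample standard deviation / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def intra_and_inter_assay_cv(runs):
    """Summarize replicate intensities of one glycan peak.

    runs: list of runs, each a list of replicate intensities.
    Assumed convention: intra-assay CV is the mean of the per-run CVs,
    inter-assay CV is the CV of the per-run mean intensities.
    """
    intra = statistics.mean([coefficient_of_variation(run) for run in runs])
    inter = coefficient_of_variation([statistics.mean(run) for run in runs])
    return intra, inter


if __name__ == "__main__":
    # Made-up relative intensities: two runs, three spots each.
    runs = [[90.0, 100.0, 110.0], [95.0, 105.0, 115.0]]
    intra, inter = intra_and_inter_assay_cv(runs)
    print(f"intra-assay CV = {intra:.1f}%, inter-assay CV = {inter:.1f}%")
```

Averaging spectra over several areas of the sample spot, as described above, would enter this calculation as a lower spread within each run.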

Comparable to the gene expression microarray and the antibody microarray, the lectin microarray is a newly developed technology for glycomics and glycoproteomics studies. Lectins are highly specific carbohydrate-binding biomolecules that recognize glycans by their structural information. Different types of lectins recognize different oligosaccharide structures, terminal residues, and linkages [71]. For example, galectins are specific toward the Gal residue [72]; siglecs are a group of I-type lectins, which mediate glycan recognition via Ig-like domains [73]; and selectins are expressed by leukocytes and endothelial cells to recognize sialyl Lewis antigens [74]. Lectins have been widely used to study and isolate glycoproteins. Lens culinaris agglutinin lectin, which binds specifically to alpha-1,6 fucosylation at the proximal core GlcNAc, has been applied to isolate fucosylated AFP glycoforms associated with HCC [75]. Ricinus communis agglutinin lectin, which binds β-Gal, was applied to monitor the desialylation of glycoproteins in murine B16 melanoma cells [76] and to investigate the degree of galactosylation of glycoproteins in human astrocytoma [77]. Sambucus nigra agglutinin lectin, which recognizes alpha 2,6-linked sialic acid, was used to measure the degree of alpha 2,6-sialylation of glycoproteins in human colon cancer [78]. Binding of Helix pomatia agglutinin lectin, which recognizes GalNAc, to human cutaneous malignant melanoma was found to be associated with metastasis formation [79]. Peanut lectin, which binds Galβ1−3GalNAc, was used to measure levels of mucin-type O-glycans in human benign and malignant colorectal tissues [80].

In the lectin microarray technology, a panel of lectins is immobilized onto a solid support in a microarray format. This allows simultaneous detection of glycans based on their interactions with different lectins. The lectin microarray was first developed in 2004 [81]. It has been shown that a lectin microarray spotted with only nine lectins was already capable of giving distinct glycoprofiles for different glycoproteins [82]. In that experiment, the lectin microarray was exposed to fluorescence-tagged glycoproteins, followed by washing and scanning. The outcome was a pattern of lectins that interacted with the glycans of the glycoproteins. When a purified glycan or glycoprotein is analyzed with the lectin microarray, the structure of the glycan can then be interpreted from the interaction pattern. Moreover, lectin microarrays can be used to study glycan–lectin interactions in a high-throughput manner [83].
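The interpretation of an interaction pattern can be sketched in Python as follows. This is only an illustrative sketch: the lectin specificities are those described in the text, but the abbreviations used as keys, the signal values, the threshold, and the function name are assumptions introduced for the example, and real analyses would also need background correction and normalization.

```python
# Lectin specificities as described in the text; keys are shorthand labels.
LECTIN_SPECIFICITY = {
    "LCA": "alpha-1,6 fucosylation at the proximal core GlcNAc",  # Lens culinaris agglutinin
    "RCA": "beta-Gal",                                            # Ricinus communis agglutinin
    "SNA": "alpha-2,6-linked sialic acid",                        # Sambucus nigra agglutinin
    "HPA": "GalNAc",                                              # Helix pomatia agglutinin
    "PNA": "Gal-beta-1,3-GalNAc (mucin-type O-glycan)",           # peanut lectin
}


def interpret_pattern(signals, threshold):
    """List the structural features suggested by lectins whose signal
    exceeds the threshold, strongest signal first. Lectins without a
    known specificity in the table are ignored."""
    hits = [
        (name, value)
        for name, value in signals.items()
        if value > threshold and name in LECTIN_SPECIFICITY
    ]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [(name, LECTIN_SPECIFICITY[name]) for name, _ in hits]


# Hypothetical fluorescence signals for one purified glycoprotein.
example_signals = {"LCA": 5200.0, "RCA": 300.0, "SNA": 8100.0, "HPA": 150.0, "PNA": 90.0}

# The threshold of 1000 is an arbitrary illustrative cut-off.
features = interpret_pattern(example_signals, threshold=1000.0)
for lectin, feature in features:
    print(f"{lectin}: suggests {feature}")
```

In this invented example the pattern would suggest a glycan carrying alpha 2,6-linked sialic acid and core fucosylation. Because a lectin can bind more than one structure, such a reading is a hypothesis about the glycan rather than a structural assignment.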

Because the affinity of lectins for glycans is relatively low (Kd = 10⁻⁴ to 10⁻⁷ M) compared with antibody–antigen interactions (Kd = 10⁻⁸ to 10⁻¹² M), various techniques have been developed to avoid loss of analytes during washing and to probe such weak interactions, such as evanescent-field fluorescence detection [83] and the ratiometric approach [84]. An evanescent field is a weak electric field that extends only about one wavelength from the sensor surface. It allows real-time detection of the interaction between lectins and glycans, thus omitting the need for washing. The ratiometric approach, in contrast, uses two fluorescence colors for different samples, as in DNA microarrays, allowing ratiometric comparison between glycomes. This approach was also shown to be able to analyze complex mixtures, including the total glycome of whole-cell lysates [84]. The quantitative performance of the lectin microarray has been well evaluated [82]. The signal intensity was linear for glycoproteins in the range of 50 to 300 μg/ml. The coefficient of variation of the signal intensities was within 10% to 20% [83].
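The ratiometric comparison can be illustrated with a short Python sketch that computes a per-lectin log2 ratio between two fluorescence channels, in the way two-color DNA microarray data are often summarized. The function name, the `floor` parameter, and all signal values are assumptions made for this example and do not come from the cited work [84].

```python
import math


def log2_ratios(sample, reference, floor=1.0):
    """Return log2(sample / reference) for each lectin present in both
    channels. Spots with a signal at or below `floor` in either channel
    are skipped, since their ratios would be dominated by noise."""
    ratios = {}
    for lectin, s in sample.items():
        r = reference.get(lectin)
        if r is None or s <= floor or r <= floor:
            continue
        ratios[lectin] = math.log2(s / r)
    return ratios


# Hypothetical background-corrected signals from the two color channels.
sample_channel = {"SNA": 8000.0, "LCA": 2000.0, "RCA": 1000.0, "PNA": 0.5}
reference_channel = {"SNA": 2000.0, "LCA": 2000.0, "RCA": 4000.0, "PNA": 300.0}

ratios = log2_ratios(sample_channel, reference_channel)
for lectin, value in ratios.items():
    print(f"{lectin}: log2 ratio = {value:+.1f}")
```

A positive value indicates stronger binding of the sample glycome to that lectin than the reference, a negative value indicates weaker binding, and a value near zero indicates no difference. Using a ratio rather than absolute intensities cancels spot-to-spot differences in the amount of immobilized lectin.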

Although the lectin microarray also detects fluorescence signals associated with the glycans, it differs in several ways from the approaches based on HPLC or CE. First, in the lectin microarray technology, fluorescent tags can be attached to the protein portions of the glycoconjugates instead of the glycan molecules. Hence, the glycoproteins or glycoconjugates can be applied directly to the array without deglycosylation and desialylation. This simplifies the experimental process and allows more options for labeling and downstream analysis, as well as simultaneous analysis of O-linked, N-linked, and glycolipid glycans [82]. Second, the lectin microarray is a type of competition assay. Different glycoproteins having the same glycan structures will compete for the same group of lectins. It is not uncommon for different glycoproteins to be labeled with different numbers of fluorescent tags, leading to variations in signal intensity between experiments. Furthermore, a single lectin molecule can bind to glycan molecules with different structures, but with different affinities. When a mixture of glycoproteins or glycoconjugates is subjected to lectin microarray analysis, a highly complex pattern of competition and interaction among glycans and lectins will result.
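The effect of competition and unequal labeling can be made concrete with a simple equilibrium model. The Python sketch below assumes independent single-site binding at equilibrium, with each glycoprotein in excess over the immobilized lectin so that free and total concentrations are approximately equal. Under these assumptions the fraction of lectin sites occupied by species i is (c_i / Kd_i) / (1 + Σ_j c_j / Kd_j). The model, the function names, and all numerical values are illustrative assumptions and are not taken from the cited studies.

```python
def occupancy(concs, kds):
    """Fraction of lectin sites occupied by each competing glycoprotein
    under a single-site competitive equilibrium model."""
    terms = {name: concs[name] / kds[name] for name in concs}
    denom = 1.0 + sum(terms.values())
    return {name: term / denom for name, term in terms.items()}


def relative_signal(concs, kds, labels_per_molecule):
    """Fluorescence signal (arbitrary units), taken as proportional to
    occupancy times the number of fluorescent tags per molecule."""
    occ = occupancy(concs, kds)
    return sum(occ[name] * labels_per_molecule[name] for name in occ)


# Hypothetical mixture: two glycoproteins at 1 uM, both bound by one lectin.
# Kd values are chosen within the 1e-4 to 1e-7 M range quoted in the text.
concs = {"A": 1e-6, "B": 1e-6}
kds = {"A": 1e-6, "B": 1e-5}

alone = occupancy({"A": 1e-6}, {"A": 1e-6})
mixed = occupancy(concs, kds)
print(f"A alone: {alone['A']:.3f}, A in mixture: {mixed['A']:.3f}")

# Same binding, different labeling: signals differ although occupancy does not.
equal_tags = relative_signal(concs, kds, {"A": 2, "B": 2})
unequal_tags = relative_signal(concs, kds, {"A": 2, "B": 6})
print(f"signal with equal tags: {equal_tags:.3f}, unequal tags: {unequal_tags:.3f}")
```

Even in this two-component case, the occupancy of A drops when B is present, and the total signal shifts when B carries more tags. With many glycoproteins and many lectins, these effects combine into the complex pattern described above.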

Future Prospects

As glycosylation changes have been associated with a wide range of diseases, it can be foreseen that glycomics holds promise for biomarker discovery. Currently, HPLC and MS are the most popular analytical platforms for glycobiomarker research. Although elucidation of glycan structures is not strictly necessary for biomarker studies, it is still highly recommended in order to distinguish isomeric and anomeric compounds and to validate the biological significance of the results. In the past 10 years, the detection sensitivities of various MS technologies have been greatly improved, and new tandem MS technologies have been invented to elucidate glycan structures. MS-based approaches will be very important in glycomics in the near future and should provide a practical and reliable solution for high-throughput quantitative profiling of the tissue glycome. Multiple reaction monitoring (MRM) has recently been used to quantify a panel of targeted serum proteins [85]. Its application to quantitative glycan profiling is still under exploration. Once a panel of disease-associated glycans is discovered, a clinically practical assay based on MRM can be developed. Because of the isomeric and anomeric nature of oligosaccharides, the separation and characterization of these isomers remain a challenging task, especially if high-throughput screening is required. The lectin microarray may provide a practical solution to this, but it is still in its infancy. Overall, there is no doubt that glycomics is gaining momentum in biomarker research.