Introduction

The incidence of hepatocellular carcinoma (HCC) is rising, making it the fifth most common malignancy worldwide [1]. This increased incidence is attributed to viral hepatitis and carcinogenic toxins such as aflatoxin [2]. Alpha fetoprotein (AFP), one of the tumor markers used for the diagnosis of HCC, is limited in its diagnostic utility because 50% (35% at Johns Hopkins Hospital [3]) of patients with HCC do not show high levels of AFP [4–6]. The serum level of des-gamma-carboxy prothrombin (DCP) has also been shown to be useful for the diagnosis of HCC in conjunction with AFP [7]. Several studies have reported potential diagnostic markers for HCC that are associated with chromosomal alterations, mRNA expression alterations, protein-level alterations, and post-translational modifications. Lately, proteomic strategies have been extensively used for biomarker discovery because of their capacity for multiplexed, quantitative analysis. Quantitative proteomics using isotope-labeling techniques provides opportunities to analyze changes in protein levels in multiple samples in an unbiased fashion [8]. However, a limitation of these techniques is that, because of the complexity of the samples, the least abundant proteins are not detected.

O- and N-glycosylation of proteins have been associated with cancers including HCC. For example, an overall increase in fucosylation has been reported in HCC serum proteins [9]. Increased or altered glycosylation has also been reported for secreted alpha fetoprotein [10] and transferrin [11] in HCC. Glycomic profiling is a complementary strategy used to identify cancer-specific glycans, notably in ovarian cancer [12], HCC [13], and pancreatic cancer [14]. There are several reports on the application of lectin-based enrichment methods to study the glycoproteome in cancers [15, 16]. The lectin-reactive form of AFP is reported to be a more sensitive marker for HCC [17]. The profile of carbohydrate moieties of glycoproteins has also been used to differentiate normal from cancerous tissue in lung [16] and pancreatic cancers [18]. Because individual lectins have their own specificities, while the carbohydrates present in samples may vary dramatically, it is advantageous to use a combination of lectins for enrichment. Such a multiple lectin affinity approach has been used for breast cancer biomarker analysis [19]. A mixture of different lectins has also been used in a lectin affinity enrichment method to identify biomarkers in the bile secretome of cancer patients [20].

Stable isotopes can be incorporated into proteins and peptides by different in vitro methods such as ICAT [21], iTRAQ [22], and 18O labeling [23], or by in vivo labeling using heavy-isotope-labeled amino acids in cell culture (SILAC) [24]. Labeling of peptides in the presence of 18O-labeled water is a simple method that requires no chemical tagging and can be readily used to analyze proteins separated by SDS-PAGE [25]. The present study was designed to analyze a subproteome in HCC using lectin affinity enrichment and relative quantitation based on 18O labeling. Further, analysis of glycopeptides treated with PNGase F provides definitive evidence of N-linked glycosylation [20]. Western blotting was used to confirm the differential expression results obtained by quantitative mass spectrometry. The candidate biomarkers were screened by immunohistochemical labeling using tissue microarrays containing hepatocellular carcinoma and non-cancerous liver tissue. We validated fetuin, a glycoprotein involved in endocytosis, using immunohistochemical labeling. We also identified N-glycosylated peptides and their glycosylation sites from lectin-enriched proteins. These results demonstrate the advantages of applying lectin affinity enrichment before quantitative proteomic profiling for the identification of potential cancer biomarkers.

Materials and Methods

Tissue Samples

Liver tissue samples were obtained after obtaining appropriate Institutional Review Board approval. One pair of tumor tissue and patient-matched non-cancerous tissue was collected at the time of surgery from patients and snap-frozen in liquid nitrogen. Serum samples were collected from HCC patients and stored at −80°C until analysis. A total of 114 HCC samples and 79 non-cancer liver sections from two tissue microarrays were used in this study. Formalin fixed and paraffin embedded tissue microarrays (Creative Biolabs, Cat No. CBL-TMA-070 and CBL-TMA-076), which consisted of 55 tissue sections from hepatocellular carcinoma with I, II, and III stages and 20 non-cancer tissues were used for immunohistochemical labeling. Tissue microarrays obtained from Imgenex consisted of 59 tumors (13 metastatic tissues, 46 poor to well-differentiated HCC, Cat. No. IMH-318) and 59 non-cancer tissues (Cat. No. IMH-342). Frozen tissues were used for mass spectrometric analysis and validation by Western blotting.

Lectin Affinity Enrichment

Tissue lysates were prepared from fresh frozen tissues from the tumor and adjacent non-cancerous region from a patient diagnosed with HCC in the presence of cocktail protease inhibitor and 10 mM phosphate buffer pH 7.8 (wash buffer). Lysates (4 mg protein) were subsequently incubated with 100 μl of each of ConA sepharose (Amersham BioSciences), wheat germ agglutinin and jacalin agarose for 12 h at 4°C. The beads were then washed three times using wash buffer and the bound proteins eluted using a mixture of carbohydrates (100 mM each of N-acetylglucosamine, melibiose, and galactose). The eluate was dialyzed against wash buffer (three volumes) to remove free sugars, concentrated and resolved by SDS-PAGE and visualized using Colloidal Coomassie staining.

Serum samples were obtained from patients diagnosed with hepatocellular carcinoma. Pooled normal serum was obtained from Invitrogen. Serum samples were prepared by incubating 100 μl (7 mg) of serum with 0.5 ml of a mixture of lectins (ConA sepharose, jacalin, and wheat germ agglutinin) overnight at 4°C in Tris-buffered saline (0.05 M Tris–HCl, pH 7.1, 0.15 M NaCl). Beads were washed three times in Tris-buffered saline and boiled in Laemmli buffer for 10 min at 95°C to elute the proteins, which were then run on 10% SDS-PAGE for further analysis by Western blot.

18O Labeling and In-gel Trypsin Digestion

Liver tissue samples from non-cancer and HCC tissues were run in parallel lanes on an SDS-PAGE gel for comparative analysis. Two gels were used for analysis, one for the whole proteome and another for lectin-enriched proteins from HCC and normal tissues. Gel slices of identical size corresponding to individual protein bands from HCC and normal tissues were excised and used for in-gel trypsin digestion as explained earlier [11] using the Stratagene prolytica 18O labeling kit. In brief, colloidal Coomassie-stained gel bands were destained and completely dehydrated, and the gel slices were incubated with trypsin in the presence of heavy or normal water overnight. After completion of digestion, peptides were extracted from each sample, dried, and subjected to a post-digestion labeling step. Samples were incubated with immobilized trypsin (5 μl beads per sample) in the presence of heavy or normal water for 3 h. Finally, peptides derived from adjacent bands corresponding to normal and HCC tissue were acidified, mixed, and analyzed on a quadrupole time-of-flight (QToF) mass spectrometer.

PNGase F Treatment for Identification of N-glycosylation Sites

Lectin affinity enriched proteins from HCC tissue were separated on 1D SDS-PAGE and stained with colloidal Coomassie. Protein bands were destained, reduced, and alkylated. Gel pieces were then subjected to in-gel trypsin (Promega) digestion and extracted peptides were dried completely. For the confirmation of N-glycosylation sites, peptides were incubated with PNGase F in the presence of 18O-water at 37°C for 3 h. After incubation, peptides were analyzed by LC-MS/MS (liquid chromatography–tandem mass spectrometry).

LC-MS/MS Analysis and Quantitation

In a separate experiment, HCC and normal whole tissue lysates were resolved on an SDS-PAGE gel and 18O labeling was done during in-gel trypsin digestion. For the proteins common to both experiments, we were able to compare the changes in ratios to study the glycosylation status of proteins in normal and HCC tissue. Bands were destained using ammonium bicarbonate in 50% acetonitrile (ACN), dehydrated with acetonitrile, and dried completely. Digestion was carried out with sequencing grade trypsin (Promega, Southampton, UK), 20 μg/ml in 16O-water or 18O-water. Peptides were extracted three times using 16O-water or 18O-water and dried in a SpeedVac. Subsequently, post-digestion end labeling was carried out using prewashed immobilized trypsin (Stratagene; 2 μl/sample). Immobilized trypsin was added to the dried peptide samples, which were then reconstituted in 8 μl of 18O-water or 16O-water and 2 μl of acetonitrile and incubated for 4 h at room temperature. Samples were centrifuged at 10,000 ×g to remove the immobilized trypsin. The supernatant was acidified with 1 μl of formic acid and stored at −80°C until LC-MS/MS analysis.

For LC-MS/MS analysis, peptides from paired samples were mixed and analyzed using a Micromass quadrupole time-of-flight (QToF) mass spectrometer connected to a reversed-phase nano LC system (Agilent 1100). The trap column (150 μm × 3 cm, C18, 15 μm, 300 Å, YDAC, flow rate 1.5 μl/min) and the analytical column (75 μm × 10 cm, C18, 5 μm, 300 Å, YDAC, flow rate 200 nl/min) were run with a solvent system containing 0.4% acetic acid and 0.004% heptafluorobutyric acid. The peptides were eluted using a gradient of acetonitrile up to 45% containing 0.4% acetic acid and 0.004% heptafluorobutyric acid. LC-MS/MS analysis was performed in data-dependent analysis (DDA) mode, using a 1-s MS survey scan (m/z 300–1,800) followed by four MS/MS scans (m/z 50–1,800) of the most intense ions; selected precursor ions were excluded from subsequent MS/MS scans for 60 s. A total of 4.4 s was allowed for MS/MS spectrum acquisition.
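The data-dependent acquisition logic described above (a survey scan, MS/MS of the four most intense ions, and a 60-s exclusion of already selected precursors) can be sketched as follows. This is a simplified illustration and not the instrument's acquisition software: the function name `select_precursors`, the m/z matching tolerance `MZ_TOL`, and any peak list passed to it are assumptions introduced here.

```python
from typing import Dict, List, Tuple

TOP_N = 4             # MS/MS scans per survey scan (from the text)
EXCLUSION_S = 60.0    # dynamic exclusion window in seconds (from the text)
MZ_TOL = 0.5          # assumed m/z matching tolerance; not stated in the text


def select_precursors(
    peaks: List[Tuple[float, float]],
    now_s: float,
    excluded: Dict[float, float],
) -> List[float]:
    """Pick up to TOP_N most intense survey-scan peaks not under exclusion.

    peaks    -- (m/z, intensity) pairs from one MS survey scan
    now_s    -- acquisition time of the survey scan in seconds
    excluded -- maps previously selected m/z to the time it was selected;
                updated in place with the new selections
    """
    # Drop exclusion entries whose window has elapsed.
    for mz in [m for m, t in excluded.items() if now_s - t >= EXCLUSION_S]:
        del excluded[mz]

    chosen: List[float] = []
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if not 300.0 <= mz <= 1800.0:      # survey scan range from the text
            continue
        if any(abs(mz - ex) <= MZ_TOL for ex in excluded):
            continue
        chosen.append(mz)
        if len(chosen) == TOP_N:
            break

    # Start the exclusion clock for everything selected in this cycle.
    for mz in chosen:
        excluded[mz] = now_s
    return chosen
```

Calling the function once per survey scan with a shared `excluded` dictionary shows why dynamic exclusion helps with complex samples: once the most intense ions have been fragmented, lower-intensity precursors get a turn in later cycles.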

The identification and quantitation of the peptide was done after formatting the mass spectrometry data. LC-MS/MS data acquired using Masslynx (Micromass) were searched using Mascot (Matrixscience, Manchester, UK). While searching the database, 18O at C-terminal carboxyl group (+4 Da) and oxidation of methionine were allowed as variable modification. Relative abundance of proteins was quantitated using MSQuant software downloaded from http://msquant.sourceforge.net [26] and expressed as fold changes (±S.D.). Essentially, Mascot search results were parsed with LC-MS/MS instrument data file using MSQuant. The new_MSQ_quantitationModes.xml files were modified to read the mascot output files and identify the correct light and heavy isotopic MS peaks. The quantitation data was also verified by manual inspection of heavy- and light-peptide-derived MS and MS/MS spectra in MSQuant.
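As a rough illustration of how peptide-level ratios can be summarized into protein-level fold changes (±S.D.), the sketch below groups heavy/light peptide ratios by protein and reports the mean and sample standard deviation. It is not MSQuant's implementation; the function names, the use of an arithmetic mean, and the handling of single-peptide proteins are assumptions made here, and the default cutoff mirrors the threefold threshold applied in the Results.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, List, Tuple


def protein_fold_changes(
    peptide_ratios: Iterable[Tuple[str, float]],
) -> Dict[str, Tuple[float, float, int]]:
    """Group heavy/light peptide ratios by protein; return mean, SD, count."""
    grouped = defaultdict(list)
    for protein, ratio in peptide_ratios:
        grouped[protein].append(ratio)

    summary = {}
    for protein, ratios in grouped.items():
        # The sample SD is undefined for one peptide; report 0.0 in that case.
        sd = stdev(ratios) if len(ratios) > 1 else 0.0
        summary[protein] = (mean(ratios), sd, len(ratios))
    return summary


def overexpressed(
    summary: Dict[str, Tuple[float, float, int]], cutoff: float = 3.0
) -> List[str]:
    """Names of proteins whose mean fold change meets the cutoff."""
    return sorted(p for p, (fc, _sd, _n) in summary.items() if fc >= cutoff)
```

Reporting the peptide count alongside the mean is a deliberate choice: a fold change supported by a single peptide deserves more manual inspection of the spectra than one supported by several.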

Western Blot

Validation of the mass spectrometry results was carried out for a subset of proteins by Western blotting using specific antibodies. These proteins were selected based on their functional relevance and differential expression between cancerous and normal liver tissue. Eighty micrograms of protein was resolved by SDS-PAGE and transferred electrophoretically onto a nitrocellulose membrane. The membrane was blocked using 5% bovine serum albumin prepared in phosphate-buffered saline containing 0.1% Tween 20 (PBS-T) for 1 h at room temperature. The membrane was probed with specific primary antibody for 3 h, washed three times with PBS-T, and incubated with the HRP-conjugated secondary antibody for 1 h. After the incubation, the membrane was washed three times in PBS-T. The signal was visualized with an enhanced chemiluminescence solution and exposed to Hyperfilm. The following antibodies were used for Western blot analysis: fetuin (AF 1184, 1:1000, R&D Systems) and CSRP1 (SC-33331, 1:1000, Santa Cruz).

Immunohistochemical Labeling

Immunohistochemical labeling was performed on liver tissue sections and liver cancer tissue microarrays for a subset of proteins selected for their biological relevance. Tissue microarrays were obtained from Creative Biolabs, NY (tumor TMA, CBL-TMA-070 and normal TMA, CBL-TMA-076) and Imgenex, San Diego, CA (tumor TMA, IMH-318, lot CS3 and normal TMA, IMH-342, lot CSN3). The Envision kit (DAKO) was used according to the manufacturer's specifications. Briefly, the slides were first deparaffinized in xylene and rehydrated with ethanol. Antigen retrieval was done by heating the slides in 0.01 mol/L sodium citrate buffer for 20 min in a steamer. After peroxidase blocking, the sections were incubated with primary antibody anti-Fetuin A (dilution 1:400). After rinsing with wash buffer, the slides were incubated with the appropriate HRP-conjugated secondary antibody. The signal was developed using the chromogen supplied for peroxidase. Tissue sections were observed using a Nikon DS-Fi1 imaging system operated with the NIS-Elements F package. The immunostaining was assessed by an experienced liver pathologist, and intensity was scored as negative, weak (1+), moderate (2+), or strong (3+). The distribution of staining in cancer cells was scored as 3+ for maximum distribution, with scores of 2+ and 1+ given for progressively lower distribution. The following antibodies were used for immunohistochemical analysis: fetuin (1:100) and CSRP1 (1:100).

Results and Discussion

The strategy used for quantitative profiling of lectin-binding proteins from HCC liver tissue is shown schematically in Fig. 1. Recently, a mass-spectrometry-based relative quantitation approach has been used to quantitate peptides containing N-linked glycosylation and also to identify the sites of glycosylation [27]. Our approach of lectin affinity enrichment followed by 18O labeling was aimed at identifying proteins carrying either N- or O-linked modifications; here, quantitation is based on any peptide derived from lectin-bound proteins and not on glycopeptides alone. A mixture of lectins was used to enrich a broader group of glycoproteins in samples from normal individuals and HCC patients. 16O/18O labeling allows quantitative profiling of a pair of samples: peptides are labeled with 18O at their C-terminal carboxyl groups during trypsin digestion in the presence of heavy water, which allows relative quantitation by mass spectrometry. Interesting candidate proteins found in this study were validated using tissue microarrays. Peptides containing N-linked glycosylation sites were identified by enzymatic incorporation of 18O at the sites of glycosylation.

Fig. 1

Strategy used for the enrichment of glycosylated proteins for 16O/18O-labeling-based quantitative mass spectrometric analysis of liver proteins in HCC. Tissue lysates (4 mg) were prepared from tumor and non-tumor liver tissues from a diagnosed case of HCC and incubated with a mixture of lectins. The lectin-bound proteins were eluted using a mixture of sugars, and the eluted proteins derived from tumor and non-tumor samples were resolved by SDS-PAGE. In-gel trypsin digestion was carried out in the presence of 16O- or 18O-water for differential labeling of peptides derived from non-tumor and tumor samples. After mixing the labeled and unlabeled peptides, LC-MS/MS analysis was done using a quadrupole time-of-flight mass spectrometer

Lectin Affinity Chromatography for Enrichment of Glycoproteins

It has been established that aberrant glycosylation serves as a marker of tumor progression in cancers. Several reagents that block glycosylation have been shown to inhibit tumor metastasis. N- or O-glycosylation of membrane components plays an important role in altering tumor cell adhesion or motility [28]. In general, glycosylation is involved in cell adhesion, enhanced matrix-destructive properties, and cell motility in tumors.

Lectins are proteins that bind carbohydrates and can be used to enrich glycoproteins. Lectins can be distinguished based on their carbohydrate specificity, so different lectins can be used to isolate specific glycosylated proteins from a complex sample. Concanavalin A preferentially binds alpha-mannosyl groups, wheat germ agglutinin has a higher specificity for N-acetylglucosamine-containing glycoproteins, and jacalin interacts readily with mannose, glucose, and N-acetylneuraminic acid. Since together these lectins show affinity for a broad spectrum of glycoproteins, we used a mixture of the above lectins and eluted the bound glycoproteins using a corresponding mixture of sugars. In our study, lectin affinity purification captured several proteins that were not detected in unfractionated samples, including transforming growth factor, beta-induced, 68 kDa; apolipoprotein A-1; destrin; and transgelin (Table 1). A complete list of overexpressed proteins is given in Supplementary Table 1.

Table 1 A partial list of identified proteins previously reported to be overexpressed in HCC

18O Labeling and Mass Spectrometry

Differential labeling of peptides with 18O using trypsin has been shown to be a simple and efficient method for relative quantitation of proteins [20, 29]. We used the 18O-labeling method to quantitate proteins obtained from tissue lysates using lectin-bound beads. Peptides derived from HCC tissue and non-cancerous tissue were labeled with heavy and light oxygen, respectively. After mixing, the peptides were analyzed by LC-MS/MS on a quadrupole time-of-flight mass spectrometer. 18O/16O-labeled peptides appear as doublets in MS, and the ratio of peak intensities indicates the relative difference in the level of protein expression. Labeling at both oxygen atoms of the C-terminal carboxyl group was confirmed by visual inspection of the spectrum. MSQuant assigned the monoisotopic peaks of the 16O-labeled peptide and of the peptide labeled with two 18O atoms. Corrections to the fold changes were applied when mixed labeling was present [30]. Peptide ratios were grouped by protein and the average value was taken as the protein-level fold change. With a cutoff of threefold, we found 30 proteins to be overexpressed in HCC (Table 2).
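To make the mixed-labeling correction concrete, the sketch below separates the observed intensities at M, M+2, and M+4 into contributions from the unlabeled peptide, the peptide carrying one 18O, and the peptide carrying two 18O atoms, using the natural isotopic envelope of the unlabeled peptide. It is a generic deconvolution written for illustration; it is not taken from MSQuant or from reference [30], and the function and parameter names are assumptions.

```python
from typing import Tuple


def deconvolve_18o(
    i0: float, i2: float, i4: float,
    m0: float, m2: float, m4: float,
) -> Tuple[float, float, float]:
    """Return abundances of unlabeled, single-18O and double-18O peptide.

    i0, i2, i4 -- observed intensities at M, M+2 and M+4 of the doublet
    m0, m2, m4 -- theoretical relative isotopic abundances of the unlabeled
                  peptide at M, M+2 and M+4 (for example from its formula)
    """
    if m0 <= 0:
        raise ValueError("m0 must be positive")
    light = i0 / m0
    # Remove the unlabeled peptide's natural M+2 isotope from the M+2 signal.
    # Noise can push small components slightly below zero; clamp them.
    single = max((i2 - light * m2) / m0, 0.0)
    # Remove the M+4 isotope of the unlabeled form and the M+2 isotope of
    # the singly labeled form from the M+4 signal.
    double = max((i4 - light * m4 - single * m2) / m0, 0.0)
    return light, single, double


def heavy_to_light_ratio(
    i0: float, i2: float, i4: float,
    m0: float, m2: float, m4: float,
) -> float:
    """Heavy/light ratio counting both 18O-labeled species as heavy."""
    light, single, double = deconvolve_18o(i0, i2, i4, m0, m2, m4)
    if light <= 0:
        raise ValueError("no light signal; ratio undefined")
    return (single + double) / light
```

Because HCC peptides carry the heavy label in this study, the heavy/light ratio returned here corresponds to the HCC/non-cancerous fold change for that peptide.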

Table 2 Partial list of proteins overexpressed in HCC

Overexpressed Proteins in HCC

We used the 18O-labeling method for the relative quantitation of proteins from HCC and adjacent normal tissue. Both the unfractionated samples and the eluates of lectin-bound proteins from HCC and adjacent normal tissue were separated by 1D SDS-PAGE. Nearly 50% of the proteins identified by the lectin affinity method were also found in the whole proteome analysis, allowing comparison of the glycosylation status of these proteins in cancer. Both quantitation approaches revealed many known and novel upregulated proteins in cancer. Table 1 shows a partial list of overexpressed proteins detected by the lectin affinity method. Among them, haptoglobin [31], alpha-2-microglobulin [32], tenascin C [33], transferrin [34, 35], fibronectin [36], caveolin 1 [37, 38], heat shock 70 kDa protein 5 [39], and ceruloplasmin [40] have been reported to be upregulated in HCC. We also found several interesting proteins that are highly overexpressed and have not previously been reported in HCC. These include serpin peptidase inhibitor (clade A, member 3), alpha-2-HS-glycoprotein, calumenin, cysteine and glycine-rich protein 1, beta-galactoside-binding lectin, peroxiredoxin 2, and clusterin isoform 2. Similarly, many of the proteins enriched by lectin affinity have been shown to be associated with HCC, demonstrating the effectiveness of our approach. The lectin affinity method enriched several known glycosylated proteins. Galectin 1 is one such protein overexpressed (3.0-fold) in HCC, and it has been reported to be overexpressed at the mRNA level in HCC [41]. Alpha-2-glycoprotein 1, zinc and alpha 1B-glycoprotein were found to be 5.0- and 3.1-fold overexpressed in HCC, respectively. Many studies on cancer-associated glycosylation have reported upregulated proteins such as haptoglobin-related protein and hemopexin, which are known to undergo glycosylation in HCC [42, 43].
Clusterin isoform 2 (apolipoprotein J), a sulfated glycoprotein involved in apoptosis, cell death, and complement activation, was found to be fivefold upregulated in the present study. Clusterin is a known potential biomarker involved in HCC metastasis [44–46]. Similarly, a glycosylated form of haptoglobin (>15-fold in HCC) has been shown to be elevated in sera from an HCC patient [31], indicating that quantitative proteomics coupled to the lectin affinity method is a promising strategy for cancer biomarker studies.

Figure 2 (panels A, B, and C) shows representative MS and MS/MS spectra of fetuin, destrin, and SERPINA1, respectively, which were detected only with the lectin affinity enrichment method. Fetuin is a serum glycoprotein that has earlier been shown to contain both N-glycans and O-glycans [47, 48]. SERPINA1 has been shown to contain N-glycans, and mass-spectrometry-based quantitation has been used successfully to measure the extent of its glycosylation [49]. Figure 3 shows the MS and MS/MS spectra of peptides derived from proteins that were also found in the whole proteome analysis. MS spectra of peptides from hemopexin (panel A) and peptidylprolyl isomerase B (PPIB; panel B) show an increase in the glycosylated form when compared with MS spectra from the whole proteome analysis. Hemopexin has been found to contain many N-glycan and O-glycan sites [50]. Interestingly, PPIB is not known to be glycosylated; however, in the present study, we observed an elevated glycosylated form in tumor tissue. Similarly, transgelin and cofilin 1 (Fig. 4, panels A and B) indicate different levels of glycosylation in cancer and normal tissues. Although further studies are necessary to validate our approach, the results indicate that it is a promising method to distinguish cancer-specific glycosylation from normal glycosylation events. Not all proteins identified by the lectin affinity method represent cancer-associated hyperglycosylation. In fact, in this study, glycosylation of CSRP1 and peroxiredoxin 6 (Fig. 5, panels A and B) shows no change and a decreased level, respectively, in tumor tissue compared to normal. Tenascin C and HSP 27 kDa, known upregulated proteins associated with HCC, show an HCC-associated increase in both protein level and glycosylated form, as shown in Fig. 6 panels A and B, respectively.
In general, the increased fold changes seen with the lectin affinity enrichment method indicate that the glycosylated forms of transgelin, cofilin 1, tenascin C, and HSP 27 are upregulated compared to normal tissue, whereas the glycosylated form of CSRP1 is unaltered in tumor and the glycosylated form of peroxiredoxin 6 seems to be downregulated in tumor compared to normal. Complete lists of proteins identified, with their relative quantitation, from the two large-scale experiments involving whole lysates and lectin-affinity-enriched proteins from tumor and non-tumor tissue are given in Supplementary Tables 1 and 2, respectively. We further analyzed the overexpressed proteins identified by the lectin affinity method for functional annotations [51]. Among the 120 overexpressed proteins, 40 proteins (33%) localized to the extracellular region, 49 (40%) contain a predicted or assigned signal peptide, and 25 (20%) were plasma proteins. Our data indicate marked enrichment of glycoproteins (40%) and membrane-related proteins (20%), which would otherwise be underrepresented in the whole proteome analysis.

Fig. 2

Differential proteomic analysis by 18O labeling: proteins identified only in lectin affinity enrichment. Panel of MS and MS/MS spectra showing fold changes and peptide sequence identification from proteins overexpressed in HCC. Panels A, B, and C show the MS and MS/MS spectra of HTLNQIDEVK (fetuin), YALYDASFETK (destrin), and ADLSGITGAR (SERPINA1), respectively. The fold changes after lectin affinity enrichment are indicated in the MS spectra

Fig. 3

Quantitative analysis of protein changes by whole proteome and lectin enrichment analysis. Panels A and B show the MS and MS/MS spectra of GGYTLVSGYPK (hemopexin) and TVDNFVALATGEK (peptidylprolyl isomerase B), respectively. The fold changes at the whole proteome level and after lectin affinity enrichment are indicated in MS spectra

Fig. 4

Quantitative analysis of protein changes by whole proteome and lectin enrichment analysis. Panel A and B show the MS and MS/MS spectra of TLMALGSLAVTK (transgelin) and YALYDATYETK (cofilin 1), respectively. The fold changes at the whole proteome level and after lectin affinity enrichment are indicated in MS spectra

Fig. 5

Quantitative analysis of protein changes by whole proteome and lectin enrichment analysis. Panels A and B show the MS and MS/MS spectra of GLESTTLADK (CSRP 1) and LSILYPATTGR (peroxiredoxin 6), respectively. The fold changes at the whole proteome level and after lectin affinity enrichment are indicated in MS spectra

Fig. 6

Quantitative analysis of protein changes by whole proteome and lectin enrichment analysis. Panels A and B show the MS and MS/MS spectra of ETFTTGLDAPR (tenascin C) and QLSSGVSEIR (HSP 27-kDa), respectively. The fold changes at the whole proteome level and after lectin affinity enrichment are indicated in MS spectra

Novel HCC-associated Proteins Identified by Lectin Affinity Method

Fetuin (also called alpha-2-HS-glycoprotein) is a glycoprotein present in serum, which is synthesized by hepatocytes and forms dimers. Although its exact function is unknown, it has been postulated to play a role in tissue growth and brain development. It has been found to be a circulating inhibitor of vascular calcification [52]. Fetuin has been shown to be associated with head and neck cancers [53]. In our study, fetuin was overexpressed 3.8-fold in HCC. Alpha-2-glycoprotein 1, zinc-binding (ZAG) is a glycoprotein that contains 18.2% carbohydrate and binds zinc ions. It is a secreted protein that stimulates lipid degradation in adipocytes [54]. ZAG is expressed not only in normal liver and breast but has also been reported in breast cancer [55, 56], prostate cancer [57, 58], and bladder cancer [58]. In our study, alpha-2-glycoprotein 1, zinc was overexpressed 4.6-fold in HCC. Transgelin is an actin cross-linking protein found in fibroblasts and smooth muscle. Downregulation of transgelin expression in many cell lines may be a sensitive marker for the onset of transformation. It is reported to be overexpressed in breast cancer, cervical cancer, colorectal cancer, endometrial cancer, head and neck cancer, malignant lymphoma, ovarian cancer, pancreatic cancer, prostate cancer, skin cancer, stomach cancer, and urothelial cancer [59]. Transgelin is reported to be an inhibitor of ARA54-enhanced androgen receptor transactivation, which gives an insight into the suppressor role of transgelin in prostate cancer cell growth [60]. In our study, transgelin was overexpressed 6.0-fold in HCC after lectin affinity purification, compared with 2.5-fold in the whole liver proteome.

Destrin belongs to the actin-depolymerizing factor family. It is ubiquitously expressed and reported to be associated with colon cancer [61] and Alzheimer's disease [62]. Destrin was overexpressed 6.5-fold in HCC with lectin affinity purification. Leucine-rich alpha-2 glycoprotein 1 belongs to a leucine-rich family of proteins that are involved in protein–protein interactions and signal transduction. This protein consists of a single polypeptide chain with one galactosamine and four glucosamine oligosaccharides attached. Leucine-rich alpha-2 glycoprotein 1 is reported to be a marker involved in neutrophilic granulocyte differentiation [63]. Like transgelin, it has also been shown to be overexpressed in several cancers [59]. In our study, leucine-rich alpha-2 glycoprotein 1 was overexpressed 4.7-fold with lectin affinity purification. Heparan sulfate proteoglycan 2, or perlecan, is a major component of the basement membrane and has angiogenic and growth-promoting attributes. It acts as a co-receptor for basic fibroblast growth factor. Antisense targeting of perlecan is reported to suppress tumor growth and angiogenesis, making it a potential therapeutic target [64]; it has been shown to be overexpressed in prostate cancer [65]. In our study, perlecan was overexpressed 4.3-fold in HCC with lectin affinity enrichment. Hemopexin is a heme-binding plasma glycoprotein that is synthesized in the liver and forms about 1.4% of total serum proteins. Each molecule of hemopexin binds one heme and transports it to hepatocytes for salvage of the iron; it plays an important role in receptor-mediated cellular uptake of heme [66]. It has been reported that expression of hemopexin is low in fetal liver and increases in adult liver [67]. In our study, hemopexin was overexpressed threefold in HCC.

Validation of CSRP1 and Fetuin Western Blot and IHC

Validation of proteomic data is crucial for eliminating artifacts of sample heterogeneity. Western blotting and immunohistochemical labeling were used to confirm and validate the expression levels. Based on the availability of antibodies, we checked the expression levels of cysteine and glycine-rich protein 1 (CSRP1) and fetuin in HCC and adjacent normal tissue and found that these two proteins were overexpressed in HCC when compared to adjacent normal tissue, in agreement with our mass spectrometry data. Western blot results shown in Fig. 7a indicate upregulation of CSRP1 and fetuin in the whole proteome. However, consistent with the 18O-labeling analysis, after lectin enrichment the Western blot shows no change in CSRP1 and an increase in the level of fetuin in HCC. Fig. 7b shows the increased staining of CSRP1 and fetuin in tumor tissue compared to the non-tumor tissue used for proteomic analysis.

Fig. 7

Validation of fetuin and cysteine-rich protein by Western blotting and immunohistochemical labeling in tissues used for proteomic analysis. A The homogenates from the tissue samples were resolved by SDS-PAGE and subsequently electroblotted onto nitrocellulose membrane and probed with specific antibodies as indicated. B Shows immunohistochemical labeling for fetuin and CSRP1, respectively

Validation of Fetuin as an Overexpressed Protein in HCC Using Tissue Microarrays

We used a total of 114 tissue samples from liver cancer patients and 79 non-cancerous liver tissues for the validation of fetuin expression by immunohistochemical labeling. Tumor tissues were grouped into grade I (21 cases), grade II (47 cases), grade III (28 cases), grade IV (five cases), and 13 metastatic tissues. The TMAs were scored by an expert pathologist, with intensity and distribution each scored from 0 to 3 points. Although fetuin was detectable in both non-cancer and HCC sections, the overall intensity and distribution of the staining pattern indicate that fetuin is overexpressed in HCC compared with non-cancer tissues (Fig. 8). No significant difference in fetuin expression was found among the different tumor grades. TMAs obtained from a different commercial source contained 32 cases with matching non-neoplastic tissue; in 28 of these, the tumor showed an increased distribution of staining for fetuin compared with the matching non-cancer tissue (p < 0.05). In all, fetuin was overexpressed in 60/114 (52%) of HCC cases, with total scores ranging from 2 to 5, whereas 27/79 (34%) of non-HCC cases showed expression of fetuin, with total scores less than 2.

Fig. 8

Immunohistochemical labeling for alpha-2-HS-glycoprotein (fetuin) in tissue microarrays. Six representative sections from tissue microarrays (TMAs) containing 114 tumor and 79 normal tissues are shown

Identification of N-glycosylation Sites

Using PNGase F and 18O-water, we identified 34 N-glycosylation sites from 28 HCC-associated proteins. The peptides containing the N-glycosylation sites are presented in Table 3, which also shows the consensus sequence identified for each peptide, the Mascot score, and the expect value. In the table, *N in the N-X-T/S motif indicates cleavage of the glycan at asparagine, incorporation of an 18O atom, and conversion of Asn to Asp (18O, +3 Da). PNGase F cleaves the bond between the innermost GlcNAc residue and Asn, resulting in deamidation of Asn to Asp. Earlier, Kristiansen et al. successfully used PNGase F and 18O-water to identify N-glycosylated proteins in human bile [20]. The sites identified using PNGase F treatment contain the N-X-T/S motif. Comparison of these sites with a well-annotated human protein database (www.hprd.org) showed that the majority were known glycosylation sites, indicating that lectin affinity enrichment is indeed an efficient approach for enriching this subproteome. We also identified 14 novel N-glycosylation sites from 12 HCC-associated proteins; two of the novel N-linked glycosylation sites are annotated in the MS/MS spectra shown in Fig. 9.
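As a small illustration of the motif check described above, the following sketch scans a peptide sequence for Asn residues that sit in an N-X-S/T sequon. The exclusion of proline at the X position follows the usual definition of the sequon and is not stated in the text; the function name `find_sequons` is an assumption. A candidate site would additionally need the nominal +3 Da shift (Asn to 18O-labeled Asp) in the MS/MS data to be accepted.

```python
import re
from typing import List

# Lookahead so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(peptide: str) -> List[int]:
    """Return 1-based positions of Asn residues in an N-X-S/T motif (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide.upper())]
```

Applied to the two peptides annotated in Fig. 9, `find_sequons("LDGSTNFTR")` reports position 6 and `find_sequons("GLNVTLSSTGR")` reports position 3, matching the *N positions given in the figure legend.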

Fig. 9

MS/MS spectra of peptides from fibrinogen-like 2 and complement component 4A analyzed using QToF. Spectra A and B show the N-glycosylation sites of the peptides LDGST*NFTR from fibrinogen-like 2 and GL*NVTLSSTGR from complement component 4A, respectively (where *N indicates the N-glycosylation site modified by 18O). The spectra show the conversion of Asn to Asp after PNGase F treatment.

Table 3 N-glycosylated peptides identified from lectin affinity enriched proteins of HCC tissue

Levels of Transgelin and Destrin in Lectin-enriched Serum

To evaluate the expression of some of the interesting candidate proteins found in our study, we used lectin affinity enrichment of serum followed by Western blot analysis to see whether the tissue-level fold changes are reflected in serum from HCC patients. In this experiment, sera from eight HCC patients and pooled normal sera (Invitrogen) were subjected to lectin affinity enrichment followed by Western blot analysis. Figure 10 shows an average increase in the levels of transgelin and destrin in HCC serum compared with normal serum.

Fig. 10

Validation of candidate proteins by Western blotting in serum. Serum samples (7 mg) from HCC patients and a pooled normal sample were enriched for glycoproteins using a mixture of lectins as explained in Methods. Eluted proteins were resolved by SDS-PAGE, electroblotted onto a nitrocellulose membrane, and probed with antibodies against destrin and transgelin.

Conclusions

Quantitative profiling of proteins in HCC revealed many overexpressed proteins relevant to HCC. Many proteins that have previously been associated with HCC were detected in this study. Important among these were complement component H, tenascin C, ceruloplasmin, haptoglobin, and alpha-2 macroglobulin. More importantly, a number of novel HCC-associated proteins discovered in this study are candidates for validation in HCC using alternate platforms such as immunohistochemical labeling or ELISA. One of the novel candidates, fetuin, was validated as a potential biomarker using tissue microarrays containing 114 tumor and 79 normal liver tissue sections. Although many lectin-bound proteins were found to be overexpressed in HCC, whole protein levels were also high in both normal and tumor cells. We also compared the extent of fold changes in proteins before and after lectin affinity enrichment to evaluate the status of glycosylation in HCC. Many of the proteins identified in this study showed different fold changes before and after lectin affinity enrichment, indicating that this subset of proteins may represent hyperglycosylation associated with cancer. This approach is useful for identifying cancer-associated proteins that are otherwise undetectable in a complex mixture. For example, fetuin, SERPINA1, and destrin were identified only by the lectin affinity method because of the reduced sample complexity. Hemopexin, peptidyl prolyl isomerase B, cofilin 1, and HSP 27-kDa protein showed larger fold changes with the lectin affinity method, indicating glycosylation-specific changes in these proteins in HCC. Our study identified many N-linked glycosylation sites in HCC, including novel sites, which may be of interest for further study of glycoprotein markers. Nearly 30% of the lectin-bound proteins contain confirmed N-glycosylation sites, indicating that quantitative proteomics using the multiple lectin affinity method can be used for simple screening of biomarkers in cancer.
The majority of the confirmed N-glycosylation sites were found in proteins upregulated in HCC, such as haptoglobin, hemopexin, clusterin, and ceruloplasmin. While many potential biomarker studies in HCC are based on the identification and validation of single upregulated proteins in unfractionated serum or tissues, lectin affinity allows enrichment of subproteomes containing aberrantly glycosylated proteins and helps identify multiple markers in a single analysis. A significant number of the upregulated proteins (40%) identified in this study were also predicted to be N-glycosylated proteins [51]. In our study, PNGase F treatment in the presence of 18O-water was found to be useful for the identification of N-glycosylation sites, in addition to the presence of the consensus motif N-X-T/S. In conclusion, a quantitative proteomic approach can be successfully used for the analysis of subproteomes, such as glycosylated proteins, to specifically identify and quantitate N-glycosylated proteins and peptides.