Background

Formalin fixation and paraffin embedding (FFPE) is routinely used to preserve human tissue samples for pathological diagnostics. Comprehensive archives of FFPE specimens have been established in most large institutes of pathology, harboring tissue specimens from most human diseases, including comparatively rare medical conditions. As these specimens are carefully annotated regarding diagnosis, treatment and patient outcome, they are a unique research resource for proteomic analyses aiming to identify key proteins in disease progression and treatment response.

In recent years, considerable advances have been made in the proteomic analysis of FFPE specimens. Upon formalin fixation, extensive protein cross-linking occurs, largely involving the ε-amino group of lysine residues. These covalent cross-links are reversed by heating the sample in the presence of a strong detergent (e.g. sodium dodecyl sulfate, SDS) or a chaotropic salt [1–3]. FFPE samples that were > 5 years old have been successfully analyzed by liquid chromatography (LC)-tandem mass spectrometry (MS/MS) [4–7]. There have been considerable improvements with regard to proteome coverage [2, 8, 9] and the ability to analyze post-translational modifications such as glycosylation or phosphorylation [8].

Quantitative proteomic analysis of FFPE specimens has been performed using two-dimensional differential gel electrophoresis [10] or label-free chromatographic approaches [2, 5, 11]. Chemical isotope tagging of FFPE samples has been performed using iTRAQ labeling [12–15].

Reductive dimethylation of primary amines is a widely used strategy for relative quantification of peptides and proteins by LC-MS/MS [16]. Distinguishing features of dimethyl labeling are its robustness and cost-effectiveness, together with its support for binary and triplex comparisons [16]. It has recently been shown that quantification with dimethyl labeling is as accurate as metabolic labeling strategies [17], which have been referred to as the gold standard in quantitative proteomics [18].

In this study, we establish the use of dimethyl labeling for the quantitative proteomic investigation of FFPE specimens. Using corresponding cryopreserved tissue specimens as controls, we show that formalin fixation and paraffin embedding do not interfere with the formaldehyde-based dimethyl labeling. In an exemplary proof-of-concept application, we used dimethyl labeling to profile proteome changes in FFPE tissue specimens of clear cell renal cell carcinoma (ccRCC), comparing ccRCC to adjacent non-malignant renal tissue. We found elevated levels of glycolytic enzymes, which is in line with previous studies.

Our findings add robust and cost-effective dimethyl labeling to the toolbox for quantitative proteomic analysis of FFPE specimens and provide further proteomic insight into ccRCC pathology.

Results and discussion

Overview

The present study aims to evaluate isotopic dimethyl labeling for the quantitative proteomic analysis of FFPE tissue specimens. This stable isotope tagging method is then applied for quantitative proteome profiling of ccRCC tissue in comparison to adjacent non-malignant tissue.

Labeling efficiency

We initially profiled FFPE-derived protein samples for the occurrence of dimethylated primary amines (α-amines of peptide N-termini and ε-amines of lysine side chains). For FFPE samples and cryopreserved non-labeled control protein samples, we did not detect relevant levels of mono- or dimethylated peptide N-termini (Fig. 1a) or lysine side chains (Fig. 1b). Several reports indicate that FFPE protein samples are susceptible to a +12 Da addition at N-termini as well as at lysine, tryptophan, and tyrosine residues [19–21]. Using the non-dimethylated protein samples, we probed for the occurrence of these modifications. For both FFPE and cryopreserved samples, less than 5 % of the identified peptides showed these modifications (not shown).

Fig. 1

Modifications present at N-termini and lysine side chains before and after reductive dimethylation with either light (¹²COH₂) or heavy (¹³COD₂) formaldehyde. Data represent average values ± standard deviation, based on the analysis of three proteome samples from solid tumors. Identification numbers (average values ± standard deviation) are stated. a Modifications of peptide N-termini before dimethylation; b modifications of lysine side chains before dimethylation; c modifications of peptide N-termini after dimethylation; d modifications of lysine side chains after dimethylation

Next, we assessed whether FFPE-derived protein samples are amenable to dimethyl labeling at the peptide level after trypsinization. To discriminate between the chemically introduced label and unintentionally introduced modifications (e.g. from formalin traces in the FFPE samples), we used heavy formaldehyde (¹³COD₂). For both FFPE samples and cryopreserved control samples, dimethylation worked robustly: > 95 % of all identified peptides were present in the heavy dimethylated form with regard to their N-termini (Fig. 1c) as well as lysine side chains (Fig. 1d). We did not detect relevant levels of unlabeled, monomethylated or light-dimethylated peptides.
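
The labeling efficiency reported here is simply the share of identified peptides whose N-terminus (or lysine side chain) carries the heavy dimethyl label. A minimal sketch of that calculation follows; the state names (`heavy_dimethyl`, `unlabeled`, `monomethyl`) and the counts are hypothetical and are not data from this study.

```python
from collections import Counter


def labeling_efficiency(states, target="heavy_dimethyl"):
    """Return the fraction of peptides whose site carries the target label."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no peptides supplied")
    return counts[target] / total


# Hypothetical N-terminal modification states of 100 identified peptides.
n_term_states = (
    ["heavy_dimethyl"] * 97
    + ["unlabeled"] * 2
    + ["monomethyl"] * 1
)

efficiency = labeling_efficiency(n_term_states)
print(f"heavy-dimethylated N-termini: {efficiency:.1%}")
```

The same function can be applied separately to lysine side chains by passing the per-lysine states instead.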

Protein samples derived from both FFPE and cryopreserved control tissue specimens yield comparable peptide identification numbers, which are in the range of recent reports on proteomic analysis of FFPE tissues [6, 7, 11, 22]. Together, these results highlight that FFPE specimens are amenable to dimethyl labeling without interference from the initial formalin fixation.

Quantitation accuracy

To investigate whether dimethyl labeling yields accurate quantification of FFPE-derived protein samples, we halved FFPE samples after trypsinization, labeled the two aliquots with either light (¹²COH₂) or heavy (¹³COD₂) formaldehyde, and mixed the aliquots at a 1:1 ratio, followed by LC-MS/MS analysis (Fig. 2). The same procedure was conducted with cryopreserved control samples (Fig. 2). The analysis was performed in triplicate, consistently yielding between 564 and 647 protein identifications per sample. The Fc-values (log2 of the L:H ratio) display a narrow distribution with almost identical standard deviations, ranging from 0.34 to 0.41 for the cryopreserved samples and from 0.33 to 0.41 for the FFPE samples. A recent benchmarking study of dimethyl labeling showed a standard deviation of 0.34 for a 1:0.5 mix of an identical sample that was differentially labeled using dimethylation [17]. Our results are in very good agreement with this previously reported outcome. Recently, Wakabayashi et al. have also shown the applicability of reductive dimethylation to FFPE-extracted phosphopeptides [23].
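
For readers who want to reproduce this kind of check, the sketch below converts L:H ratios into Fc-values and summarizes their spread. It is an illustration only: the ratios are made up, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import math
import statistics


def fc_values(light_heavy_ratios):
    """Convert L:H abundance ratios to log2 fold-change (Fc) values."""
    return [math.log2(ratio) for ratio in light_heavy_ratios]


# Illustrative protein L:H ratios from a 1:1 mix (not data from this study).
ratios = [1.0, 1.25, 0.8, 1.1, 0.9, 1.5, 0.7]

fc = fc_values(ratios)
print(f"median Fc = {statistics.median(fc):.2f}")
print(f"SD of Fc  = {statistics.stdev(fc):.2f}")
```

For a correctly mixed 1:1 sample the Fc-values should center on zero, and the standard deviation then reflects the precision of the quantification.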

Fig. 2

Quantitative proteomic comparison of identical samples that were split, labeled by reductive dimethylation with either light (¹²COH₂) or heavy (¹³COD₂) formaldehyde, and mixed 1:1. In the box-and-whisker plots, boxes denote the 25th to 75th percentiles and whiskers the 5th to 95th percentiles. Data represent average values ± standard deviation, based on the analysis of three proteome samples from solid tumors. Protein and peptide identification numbers are 572 proteins/4528 unique peptides for cryo-sample 1; 637 proteins/4627 unique peptides for cryo-sample 2; 605 proteins/4763 unique peptides for cryo-sample 3; 647 proteins/4977 unique peptides for FFPE-sample 1; 564 proteins/4113 unique peptides for FFPE-sample 2; 601 proteins/4367 unique peptides for FFPE-sample 3

In summary, FFPE tissue specimens are amenable to protein extraction and subsequent relative quantitation by isotopic dimethyl labeling in LC-MS/MS analysis.

Application to quantitative proteome profiling of clear cell renal cell carcinoma

Clear cell renal cell carcinoma (ccRCC) is among the ten most common human malignancies and accounts for more than 90 % of renal neoplasms. As it is clinically occult in many patients, diagnosis often occurs only at a later stage of disease, and approximately one third of patients present with metastatic disease upon diagnosis. Response rates to treatment are generally low for metastatic ccRCC, leading to a median survival of less than one year [24, 25].

As a first application of isotopic dimethyl labeling for relative protein quantitation, we focused on ccRCC. Four cases of ccRCC were analyzed, in which tumor and adjacent normal tissue were represented in the same FFPE tissue blocks. In accordance with the workflow established in this study, we used “light” and “heavy” dimethylation with isotopic formaldehyde for quantitative proteomic comparison of the protein samples derived from the malignant and non-malignant areas of the FFPE specimens. Protein samples of each case were analyzed separately. To increase proteome coverage, we employed SCX prefractionation.

The Fc-values of each replicate experiment followed a near-normal distribution (Shapiro–Wilk test, Fig. 3a). Peptides with unlabeled N-termini or lysine side-chains constitute less than 3 % of all identified peptides (data not shown). As expected, the Fc-values are broadly distributed in each replicate (Fig. 3a), indicating substantial proteome differences between ccRCC tissue and adjacent, non-malignant tissue. We used the APEX method to calculate protein abundances [26, 27]. The resulting APEX scores displayed good correlation between the different replicates (Fig. 3b).

Fig. 3

Quantitative proteomic comparison of FFPE-derived ccRCC tumor tissue with FFPE-derived, adjacent non-malignant tissue. Stable isotope labeling was achieved by reductive dimethylation with either light (¹²COH₂) or heavy (¹³COD₂) formaldehyde. a Box-and-whisker plots of the four replicates; boxes denote the 25th to 75th percentiles and whiskers the 5th to 95th percentiles. b Correlation of APEX scores [26, 27], showing the Pearson correlation. Protein and peptide identification numbers are 1518 proteins/6619 unique peptides for replicate 1; 2490 proteins/13501 unique peptides for replicate 2; 1590 proteins/12811 peptides for replicate 3; 1352 proteins/9207 peptides for replicate 4

A total of 2938 non-redundant proteins were identified, of which 1307 were found in at least three replicates. As previously described, we employed the following criteria to distinguish significantly affected proteins: (A) identification in at least three replicate experiments, (B) protein abundance differences resulting in a p-value < 0.05 (2-tailed Student’s t-test with Benjamini-Hochberg correction for multiple testing at an FDR < 0.05), (C) protein abundance increased or decreased with an average Fc-value > 0.58 or < −0.58 (equivalent to an abundance change > 50 %). With these criteria, 112 proteins were found to be increased in ccRCC tissue (Additional file 1: Table S1), whereas 77 proteins were found to be decreased in ccRCC tissue in comparison to adjacent non-malignant tissue (Additional file 1: Table S2).
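
The three criteria can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the workflow actually used: the per-protein p-values are taken as given (the t-test itself is not computed), the Benjamini-Hochberg adjustment is applied only to proteins that pass criterion A, and the function names, protein names and numbers are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significantly_altered(proteins, alpha=0.05, fc_cutoff=0.58, min_reps=3):
    """Apply criteria A (replicates), B (adjusted p) and C (mean Fc)."""
    eligible = [p for p in proteins if len(p["fc"]) >= min_reps]  # criterion A
    adjusted = benjamini_hochberg([p["p"] for p in eligible])
    hits = []
    for protein, q in zip(eligible, adjusted):
        mean_fc = sum(protein["fc"]) / len(protein["fc"])
        if q < alpha and abs(mean_fc) > fc_cutoff:  # criteria B and C
            hits.append(protein["name"])
    return hits


# Hypothetical proteins: per-replicate Fc-values and an unadjusted p-value.
proteins = [
    {"name": "P1", "fc": [1.0, 1.2, 0.9, 1.1], "p": 0.001},
    {"name": "P2", "fc": [0.2, 0.3, 0.1], "p": 0.004},     # fails criterion C
    {"name": "P3", "fc": [-0.9, -1.1, -0.8], "p": 0.03},
    {"name": "P4", "fc": [2.0, 2.1], "p": 0.0001},         # fails criterion A
    {"name": "P5", "fc": [0.7, 0.9, -0.1, 0.6], "p": 0.2}, # fails criterion B
]

hits = significantly_altered(proteins)
print(hits)
```

The Fc cutoff of 0.58 corresponds to a ratio of 2^0.58, which is approximately 1.5, hence the "more than 50 %" change stated in criterion C.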

The online Search Tool for the Retrieval of Interacting Genes (STRING) was used to display connections between proteins with either elevated or decreased levels [28]. STRING visualized several functional clusters for proteins that were found to be elevated in ccRCC (Fig. 4), including ribosomal and proteasomal proteins as well as proteins involved in glycolysis and energy metabolism.

Fig. 4

STRING protein functional association network [28] of proteins that were found to be significantly upregulated in ccRCC compared to adjacent non-malignant tissue (p-value < 0.05, 2-tailed Student’s t-test, at least 50 % increased abundance). STRING was employed using the “high confidence” setting. Disconnected nodes are not shown. Connections are shown using the standard STRING coloring scheme as highlighted in the legend

Our finding of increased levels of proteins involved in glycolysis in ccRCC, compared to corresponding non-malignant kidney tissue, is in line with further reports, including proteomic, metabolomic, and functional genomic approaches, all of which point towards increased aerobic glycolysis (“Warburg effect”) in ccRCC [25, 29–31].

Although not highlighted by a STRING cluster, we noticed elevated levels of annexins A2 and A4 (also known as annexins II and IV). This is corroborated by further proteomic studies on ccRCC [25, 29, 31]. Tumor-promoting roles in ccRCC have been suggested for both annexins, based on in vitro findings indicating that each of them sustains tumor cell migration [32, 33].

Our proteomic analysis of ccRCC further suggests elevated levels of ribosomal and proteasomal proteins, putatively indicating a generally increased protein turnover in ccRCC as compared to non-malignant kidney tissue. Previous proteomic analyses of ccRCC have not highlighted a ribosomal/proteasomal fingerprint. There are, however, several reports that link proteasomal function to ccRCC. For example, exome sequencing has previously shown that components of the ubiquitin system are frequently mutated in ccRCC [34]. Inactivation of the nuclear deubiquitinating enzyme BAP1 is also a frequent event in ccRCC [35]. Moreover, increased serum levels of the 20S proteasome have been found in ccRCC patients [36]. We identified 28 proteasomal proteins. All of these were increased in ccRCC, with the exceptions of proteasome subunit β type-5 and type-7; 11 proteasomal proteins met our criteria for significantly increased abundance. Likewise, we identified 72 ribosomal or ribosome-associated proteins, excluding mitochondrial ribosomal proteins. All of these were increased in ccRCC, with the exceptions of ribosome-binding protein 1 and 60S ribosomal protein L34; 30 ribosomal proteins met our criteria for significantly increased abundance.

In summary, application of isotopic dimethyl tagging allowed for the quantitative profiling of ccRCC tissue. Our analysis corroborated previously established proteome motifs of ccRCC, i.e. aerobic glycolysis, as well as pointing to newly discovered proteome alterations, i.e. increased levels of both ribosomal and proteasomal proteins.

Our exemplary application focuses on a pairwise comparison of different samples. Dimethyl labeling is typically restricted to pairwise or triplex comparisons, and we consider this setting to be its typical application. This limitation can be overcome by using a differentially dimethylated standard sample, analogous to the Super-SILAC strategy [37]. By comparing multiple samples against this standard, a larger number of samples can be probed.
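
The arithmetic behind such a shared standard is straightforward: if every sample is quantified against the same standard, the log2 ratio between two samples is the difference of their Fc-values versus the standard, because log2(A/S) − log2(B/S) = log2(A/B). The following sketch illustrates this; the function name, sample names, protein identifiers and values are hypothetical, and this approach was not part of the present study.

```python
def compare_via_standard(fc_vs_standard, sample_a, sample_b):
    """Return log2(A/B) per protein from Fc-values against a shared standard.

    Only proteins quantified in both samples can be compared.
    """
    a = fc_vs_standard[sample_a]
    b = fc_vs_standard[sample_b]
    shared = a.keys() & b.keys()
    return {protein: a[protein] - b[protein] for protein in sorted(shared)}


# Hypothetical Fc-values (log2 of sample:standard) for two samples.
fc_vs_standard = {
    "sample_1": {"protein_1": 1.5, "protein_2": 1.0},
    "sample_2": {"protein_1": 0.5, "protein_3": 0.25},
}

print(compare_via_standard(fc_vs_standard, "sample_1", "sample_2"))
```

A practical consequence is that proteins missing from the standard, or not quantified in one of the two samples, drop out of the indirect comparison.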

Conclusion

Our results show that dimethyl labeling is applicable for the quantitative proteomic analysis of FFPE tissue specimens without interference from the formalin fixation process. Quantitation accuracy is comparable to cryopreserved tissue specimens. An initial application of dimethyl labeling to FFPE specimens portrayed differences in the proteome composition of ccRCC compared to adjacent non-malignant tissue. Dimethyl labeling with isotopic formaldehyde is a robust and cost-effective labeling strategy for quantitative proteomics. The present work adds dimethyl labeling to the toolbox for quantitative proteome analysis of FFPE specimens.

Methods

Samples

As proof-of-principle tissue specimens for labeling experiments, samples were derived from large solid tumors. From each tissue specimen, one piece was immediately fixed with formalin and embedded in paraffin, the other was immediately snap-frozen in liquid nitrogen and stored at −80 °C. As a clinical application, four FFPE tissue specimens of clear cell renal cell carcinoma (ccRCC) and adjacent non-malignant kidney tissue were chosen. Routine protocols of the Institute of Surgical Pathology were used for all proof-of-principle and ccRCC samples.

For all tissue specimens, diagnosis was confirmed by experienced pathologists. All tissue specimens were processed within 20 min after surgical removal. After processing, samples were immediately anonymized. No tumor showed macroscopic or microscopic signs of necrosis. Routine diagnostics was not affected. The study was approved by the Ethics Committee of the Medical University Freiburg (311/12_130523, “Feingewebliche, immunhistochemische und molekularpathologische Untersuchungen von benignem und malignem Gewebe urogenitaler Tumore sowie korrespondierender Metastasen aus Formalin-fixiertem, Paraffin-eingebettetem und Frischgewebe.”/“Histological, immunohistochemical, and molecular-pathological investigations of benign and malignant tissues of urogenital tumours and corresponding metastases from formalin-fixed, paraffin-embedded and fresh tissues”). Before study inclusion, all patient data were anonymized. Informed consent was obtained from all participants.

Sample preparation

Slices of 10 μm thickness were cut from FFPE specimens, deparaffinized with xylene, rehydrated in a decreasingly graded ethanol series and transferred into microreaction tubes. Cryopreserved specimens were carefully crushed with a scalpel. All tissue specimens were incubated in 100 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) pH 7.5, 4 % (w/v) sodium dodecyl sulfate (SDS), 50 mM dithiothreitol (DTT) for 1 h at 95 °C with gentle rotation. Typically, 150 μl buffer was used for approximately 10 FFPE slices of 10 μm thickness. Proteins were precipitated by addition of 9 volumes of acetone and 1 volume of methanol and incubated at −80 °C for 2 h. After washing with methanol, the proteins were resuspended in 100 mM NaOH aided by sonication at 4 °C, and the solution was brought to pH 8.0 with 200 mM HEPES free acid. Protein concentrations were determined using BCA (Pierce) and Bradford (Bio-Rad) assays. Typical protein yield after acetone precipitation was in the range of 1.0 mg from 10 FFPE slices of 10 μm thickness. Proteins (up to 500 μg) were trypsinized using sequencing grade trypsin (Worthington, 1:100, 18 h at 37 °C). Cysteine residues were reduced and alkylated. If applicable, primary amines were reductively dimethylated in solution (200 mM HEPES, pH 8.0) by addition of 40 mM formaldehyde (¹²COH₂, “light” (Sigma) or ¹³COD₂, “heavy” (Cambridge Isotopes)) and 40 mM sodium cyanoborohydride (Sigma) at pH 8.0 and 37 °C for 18 h. Excess reagents were quenched with 20 mM glycine (20 min, 22 °C). If applicable, equal amounts of heavy and light labeled samples were mixed. Before mass spectrometric analysis, all samples were desalted using self-packed C18 Stage-tips [38]. ccRCC proteome comparison samples were pre-fractionated using strong cation exchange (SCX) chromatography as described previously [39, 40].

LC-MS/MS analysis

Analysis was performed on an Orbitrap XL (Thermo Scientific) mass spectrometer that was coupled to an Ultimate3000 micro pump (Thermo Scientific). Buffer A was 0.5 % acetic acid, buffer B was 0.5 % acetic acid in 80 % acetonitrile (HPLC grade). Liquid phases were applied at a flow rate of 300 nl/min with an increasing gradient of organic solvent for peptide separation. Reprosil-Pur 120 ODS-3 (Dr. Maisch) was used to pack column tips of 75 μm inner diameter and 11 cm length. The MS was operated in data-dependent mode, and each MS scan was followed by a maximum of five MS/MS scans.

LC-MS/MS data analysis

LC-MS/MS data were obtained in raw format and converted to the mzXML [41] format using msconvert [42], with centroiding of MS1 and MS2 data and deisotoping of MS2 data. X! Tandem (version 2013.09.01) [43] was used for spectrum-to-sequence assignment. The proteome database consisted of reviewed canonical human UniProt sequences (without isoforms, 20,240 protein entries) downloaded from UniProt on November 26th, 2013, appended with an equal number of shuffled decoy entries derived from the original human protein sequences (DB toolkit, [44]). Two different searches were conducted for light and heavy labeled peptides. X! Tandem parameters included: precursor mass error of 10 ppm, fragment ion mass tolerance of 0.3 Da, tryptic cleavage specificity with up to three missed cleavages for probing labeling efficiency and up to one missed cleavage when applying the labeling technique to tumor samples. Residue modifications were cysteine carboxyamidomethylation (+57.02 Da) and lysine and N-terminal dimethylation (light formaldehyde +28.03 Da; heavy formaldehyde +34.06 Da); no variable modifications were used. X! Tandem results were further validated by PeptideProphet [45] at a confidence level of > 95 %. Peptides were assembled to proteins using ProteinProphet [46] with a false discovery rate (FDR) < 1.0 %. XPRESS [47] was used for relative peptide and protein quantification. Mass tolerance for quantification was 0.02 Da. XPRESS data were log2-transformed, yielding fold change (Fc)-values.

For the ccRCC replicate analyses, protein abundance was considered to be significantly altered if the following conditions were met: (A) the protein was identified in at least three replicate experiments, (B) protein abundance was significantly increased or decreased (p-value < 0.05, based on 2-tailed Student’s t test with Benjamini-Hochberg correction for multiple testing at an FDR < 0.05; the Perseus framework was used for statistical analysis [48]), (C) protein abundance increased or decreased with an average Fc-value > 0.58 or < −0.58.
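
The dimethyl mass shifts used as residue modifications in the searches (+28.03 Da light, +34.06 Da heavy) can be checked from monoisotopic atomic masses. The sketch below assumes that the reducing agent is unlabeled sodium cyanoborohydride, so that each methyl group receives its carbon and two hydrogens (or deuterons) from formaldehyde and one ordinary hydrogen from the hydride; the function name and the rounded mass constants are illustrative.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 6 decimals).
MASS = {"H": 1.007825, "D": 2.014102, "C12": 12.000000, "C13": 13.003355}


def dimethyl_shift(carbon, hydrogen):
    """Mass added to one primary amine by reductive dimethylation.

    Each of the two methyl groups replaces one amine hydrogen. The hydrogen
    donated by the (unlabeled) reducing agent offsets the hydrogen that is
    lost, so the net gain per methyl group is one carbon plus two hydrogens
    of the isotopes supplied by the formaldehyde.
    """
    return 2 * (MASS[carbon] + 2 * MASS[hydrogen])


light = dimethyl_shift("C12", "H")   # light formaldehyde
heavy = dimethyl_shift("C13", "D")   # heavy formaldehyde
print(f"light: +{light:.4f} Da, heavy: +{heavy:.4f} Da")
print(f"light/heavy spacing per labeled amine: {heavy - light:.4f} Da")
```

Because every peptide N-terminus and every lysine side chain is labeled, the spacing between a light and heavy peptide pair scales with the number of labeled amines in the peptide.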

Supporting data

The LC-MS/MS data underlying this study were uploaded to the PeptideAtlas database and can be retrieved at http://www.peptideatlas.org/PASS/PASS00702.