Metabolomics is a field of study focused on the quantitative and qualitative analysis of the products of metabolic pathways [1]. Its aim is to provide information about the composition of the metabolome and the changes within it under various conditions. Three approaches are used in metabolomics: metabolic fingerprinting (determination of the global metabolome), metabolic profiling (quantitative analysis of a selected set of compounds that share a similar function or properties, or belong to the same biochemical pathway), and targeted analysis (determination of particular compounds). Depending on the object of the research, human [2,3,4], plant [5,6,7], and microbial [8,9,10] metabolomics are distinguished [11]. Human metabolomics concentrates mostly on pathophysiological disorders, either evaluating potential biomarkers or explaining pathomechanisms. Different types of biological matrices are used for this purpose, mainly urine, serum, and tissue [12]. Many diseases, especially cancer, originate in tissue, so even slight qualitative and quantitative changes in the tissue metabolic profile can be observed at an early stage of disease development. Consequently, tissue seems to be an adequate matrix when the aim is to investigate the mechanism of a disease [12]. In cancer genomics and proteomics, formalin-fixed, paraffin-embedded (FFPE) tissues are commonly used samples, whereas they have rarely been described in metabolomics research [13,14,15,16,17]. Preparation of FFPE samples includes three steps: formalin fixation, dehydration, and paraffin embedding. Formalin fixation is performed with a neutral-buffered formalin solution (3.7–4.0% formaldehyde buffered with phosphate at pH 7) [18]. The mechanism of formalin fixation consists of two steps. In the first step, formaldehyde reacts with an amino group (for example, from a protein) to form a Schiff base, which, in the second step, reacts with another amino group to create a stable crosslink through a methylene bridge [18, 19]. The aim of formalin fixation is to preserve the tissue by inhibiting proteolytic enzymes as well as the growth of microorganisms [20, 21]. Consequently, fixed tissue should remain biologically stable throughout storage [21].

The next step of FFPE sample preparation is dehydration, which is followed by paraffin embedding. Dehydration consists of washing the specimen with ethanol and xylene. It is performed to remove water from the tissue, because paraffin, the most commonly used embedding medium, cannot penetrate a wet fixed matrix [21, 22]. The final step is paraffin embedding: melted paraffin is added to the dry tissue, which is then left to solidify at room temperature [18, 21]. The aim of embedding is to improve preservation and to facilitate subsequent sectioning of the sample.

The main advantage of FFPE samples is that preservation through formalin fixation and paraffin embedding stabilizes tissue morphology and allows the sample to be stored at ambient temperature. FFPE is therefore considered a more cost- and space-effective choice than fresh-frozen tissue, which must be kept under special conditions (e.g., at −80 °C or in liquid nitrogen) to maintain biological stability. However, compared with fresh-frozen tissue, FFPE samples require a more complex sample preparation protocol, including deparaffinization and purification steps. Furthermore, preservation by formalin fixation may cause the loss of some groups of compounds owing to the formation of cross-links between proteins and other biomolecules. This aspect is critical for the use of FFPE samples in metabolomics, especially in the untargeted approach, which aims to identify as many chemical classes of metabolites as possible in the analyzed biological matrix. Previously published research indicates that FFPE tissues can be successfully used in metabolomics; however, data on the optimization of the sample preparation step are lacking. FFPE samples have been analyzed with several analytical techniques, with the goal of evaluating the influence of the preservation procedure on the quality of the obtained metabolic profiles [13,14,15,16,17, 23, 24]. These studies showed that formalin fixation and paraffin embedding may cause the loss of specific groups of metabolites, such as amino acids, peptides, carbohydrates, steroids, or fatty acid esters. Nevertheless, it was possible to observe statistically significant differences between healthy and prostate cancer FFPE tissues based on the levels of detected compounds.

Although the application of FFPE samples has been reported in several studies, a detailed optimization of the sample preparation step has not been conducted. Because both fixation and deparaffinization may lead to the loss of metabolites, sample preparation seems to be crucial for the final results of the analysis. For this reason, the main aim of this study was to develop a reliable and effective sample preparation protocol for FFPE samples that could be applied in untargeted metabolomics. Previously published work on the development of sample preparation mainly covers DNA and RNA extraction from FFPE samples [25,26,27,28], whereas, to our knowledge, no metabolomics study has focused on comparing different sample preparation protocols. The sample preparation method described here consisted of four parts: deparaffinization, purification, extraction, and derivatization. In this study, the deparaffinization and purification methods were developed. Two deparaffinization methods were compared, one based on xylene and one on hot water. In addition, several conditions were evaluated: the type and volume of the deparaffinization solvent, the number of purification cycles, the type of solvent used in the purification step, the extraction solvent, and the thickness of the FFPE samples. All samples were analyzed by gas chromatography coupled with triple quadrupole mass spectrometry (GC–QqQ/MS). The use of GC–MS required a derivatization step to convert metabolites into volatile derivatives. The developed method was applied to the untargeted metabolomics analysis of FFPE tissue samples obtained from prostate cancer patients and of healthy prostatic tissue.



Pentadecanoic acid standard was purchased from Sigma-Aldrich, USA. The derivatizing agents, methoxyamine hydrochloride, pyridine, N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA), and chlorotrimethylsilane (TMCS), were obtained from Sigma-Aldrich, St. Louis, MO, USA. Heptane, ethanol and xylene were purchased from Chemsolve, Witko, Poland. Chloroform was purchased from POCH, Gliwice, Poland, and deionized water was obtained with Milli-RO and Milli-QPlus instrumentation from Millipore, Switzerland.

Sample Collection

The research was approved by the Independent Commission for Bioethics Research, Medical University of Gdańsk (NKBBN/448/2015 and NKBBN/432/2016). The FFPE samples were obtained, preserved, and stored in the Department of Pathology and Neuropathology, Medical University of Gdańsk, Poland, according to the routine protocol used in pathology laboratories. All methods used in the study were performed in accordance with the relevant guidelines and regulations. All participants were recruited at the Department of Pathology and Neuropathology, Medical University of Gdańsk, Poland, by a single physician, a specialist in pathology, and all patients consented to the use of the collected tissue for scientific study. In this study, 67 samples were collected. Samples were divided among the development steps as follows: 9 samples were used to choose the proper deparaffinization method, 24 samples for the purification of tissue samples, 25 samples for investigating the influence of FFPE thickness on the metabolic profile, and 9 samples for the application study. Samples were stored at ambient temperature in a dark place until the day of analysis.

Development of Sample Preparation

The sample preparation method was developed with regard to several conditions: the type and volume of the deparaffinizing solvent, the number of purification cycles, the type of solvent used for sample purification, the solvent used for extraction, and the thickness of the samples. Each tissue sample was weighed after the deparaffinization procedure, and this weight was used for normalization. First, two deparaffinizing agents, hot water and xylene, were compared in terms of deparaffinization efficiency. In addition, the thickness of the sample was tested. According to the literature, the thickness of paraffin-embedded tissue used in clinical settings varies between 7 and 40 µm, while in metabolomics the most common thickness is 40 µm [13, 14]. In the presented research, five sample thicknesses were tested: 10 µm, 15 µm, 20 µm, 30 µm, and 40 µm. Afterward, the purification step was optimized. Three purification methods were evaluated: no purification, one-step purification with ethanol, and two-step purification with ethanol followed by acetone. Moreover, the number of purification cycles was compared. The final protocol of the optimized FFPE sample preparation procedure is described in the following sections.

Sample Deparaffinization and Purification Procedure

A volume of 1 mL of xylene was added to each specimen, and the sample was vortex-mixed for 1 min. Samples were centrifuged for 5 min (16,657 × g, 25 °C). The xylene was then discarded, and the procedure was repeated three times. The next step was purification: xylene residue was removed with 0.7 mL of 96% EtOH. Samples were vortex-mixed and centrifuged, and this step was repeated three times. Thereafter, the tissues were slightly dried. After deparaffinization and purification, each deparaffinized prostate tissue was weighed.

Sample Extraction Procedure

Metabolite extraction from prostate tissues was performed as described by Yuan et al. [29], with modifications. Tissue samples were homogenized in 150 µL of a methanol:water mixture (1:1, v/v), after the addition of 5 µL of internal standard (IS; 1 mg mL⁻¹ n-pentadecanoic acid in methanol). The procedure included 1 min of homogenization and 1 min of cooling on ice, and was repeated three times. Afterward, samples were centrifuged (10 min, 16,657 × g, 4 °C), and 120 µL of supernatant was collected. The residue was then homogenized with 150 µL of a chloroform:methanol mixture (3:1, v/v), and the whole procedure was repeated. Finally, 120 µL of the organic fraction was collected and pooled with the aqueous phase. The pooled fraction was evaporated to dryness in a vacuum concentrator (Genevac Inc., Valley Cottage, NY, USA) for 2 h at 36 °C. After that, derivatization was performed.

Derivatization Procedure

A two-step derivatization procedure was applied to enhance the thermal stability and volatility of the analyzed compounds. The first step was methoximation, performed by adding 30 µL of methoxyamine dissolved in pyridine (15 mg mL⁻¹) to the dry residue. Samples were then kept for 16 h in a dark place at room temperature. The next step was silylation, which involved the addition of 30 µL of a mixture of BSTFA and TMCS (99:1, v/v). Samples were vortex-mixed for 5 min and kept for 60 min at 70 °C. Derivatized samples were diluted in 90 µL of heptane, vortex-mixed for 5 min, and introduced into the GC–MS system.

Preparation of Alkane Standard Mixture

To calculate the retention indices of the analyzed compounds, an alkane standard mixture at a concentration of 10 μg mL⁻¹ was prepared by dilution in heptane and introduced into the GC/MS system.


Analyses were performed with a gas chromatography–tandem mass spectrometry (GC/MS/MS) system from Shimadzu (Kyoto, Japan), 8030 series, composed of an auto-sampler (AOC-20S), an autoinjector (AOC-20i), and a gas chromatograph (GC 2010 plus) coupled with a triple quadrupole mass spectrometer. Post-analysis processing of the obtained data was conducted using Automated Mass Spectral Deconvolution and Identification System software (AMDIS) and Mass Profiler Professional B.02.01 software (Agilent Technologies, Waldbronn, Germany), as well as GC/MS Solution Software version 4.01 (Shimadzu, Kyoto, Japan). Statistical analysis was performed with MetaboAnalyst 4.0 software, Matlab (Mathworks, Natick, MA, USA), and Simca P+ 13.0.3 software (Umetrics, Umeå, Sweden).

GC/MS Analysis

Derivatized samples were introduced into the gas chromatograph equipped with a Zebron ZB-5MS capillary column (30 m × 0.25 mm i.d., 0.25 μm film thickness; Phenomenex, Torrance, CA, USA). Helium was used as the carrier gas in constant pressure mode (65.2 kPa) with a total flow of 10 mL/min. A volume of 1 µL of sample was injected in splitless mode. The temperature of the injection port was maintained at 320 °C. Compounds were separated using a temperature gradient program: the oven temperature was held at 70 °C for 1 min, then increased to 320 °C at a rate of 8 °C min⁻¹, and held at 320 °C for 10 min to elute any impurities with high boiling points. The temperature of the MS interface was set at 320 °C.

The analyses were performed in scan mode over a mass range of m/z 50–1000. An electron impact (EI) ion source with the temperature set at 220 °C was used. The time of each analysis was 43 min. All samples were analyzed in one sequence in randomized order. At the beginning of the sequence, the standard mixture of alkanes was analyzed.
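As a quick consistency check on the oven program, the duration of a single-ramp temperature program can be computed from the hold times and the ramp rate. The sketch below is only illustrative; the function name `oven_program_minutes` is hypothetical and not part of any instrument software. For the stated program it gives a little over 42 min, which is in line with the reported analysis time of 43 min.

```python
def oven_program_minutes(start_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total duration (min) of a GC oven program with one linear ramp.

    Hypothetical helper for illustration only: initial hold, one ramp
    from start_c to final_c, then a final hold.
    """
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Program described in the text: 70 °C for 1 min, 8 °C/min to 320 °C,
# then 10 min at 320 °C.
run_time = oven_program_minutes(70, 1, 8, 320, 10)
print(f"{run_time:.2f} min")  # 1 + 250/8 + 10 = 42.25 min
```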

Data Preprocessing

The obtained chromatograms were transferred from GC/MS Solution Software Postrun Analysis (Shimadzu, Kyoto, Japan) to AMDIS with the NIST spectral library to perform the deconvolution and identification steps. First, retention indices for all compounds present in each sample were calculated based on the retention times (tR) and retention indices (RI) of the alkane mixture: the retention time of each compound was normalized by the retention time and retention index of the nearest eluting alkane. Deconvolution of the raw data was applied to reduce noise signals and matrix interferences from the background. To deconvolute spectra over the whole m/z range, the following parameters were used: width = 15, adjacent peak subtraction = two, resolution = low, sensitivity = low, and shape requirements = low.
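The text does not give the exact formula used for the retention index. One common choice for temperature-programmed GC is the linear retention index, which interpolates between the two n-alkanes that bracket the compound. The sketch below assumes that formula; the function name, the input layout, and the example retention times are hypothetical, and the calculation performed inside AMDIS may differ in detail.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index by interpolation between bracketing n-alkanes.

    rt          retention time of the compound (min)
    alkane_rts  dict mapping alkane carbon number -> retention time (min);
                retention times are assumed to increase with carbon number.
    Assumption: linear interpolation, not necessarily what AMDIS applies.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting at or just before the compound.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    c_lo, c_hi = carbons[i], carbons[i + 1]
    t_lo, t_hi = times[i], times[i + 1]
    return 100 * (c_lo + (c_hi - c_lo) * (rt - t_lo) / (t_hi - t_lo))


# Made-up alkane ladder (C10 to C12), for illustration only.
ladder = {10: 8.0, 11: 10.0, 12: 12.5}
print(linear_retention_index(9.0, ladder))  # halfway between C10 and C11 -> 1050.0
```

By construction, an n-alkane with n carbons receives an index of 100 × n, so a compound eluting midway between C10 and C11 is assigned 1050.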

Statistical Analysis

The post-analysis processing included data alignment with respect to retention time and data filtering based on QA criteria and presence in the compared groups. Data processing was conducted using Mass Profiler Professional B.02.01 software (Agilent Technologies, Waldbronn, Germany). Only compounds present in at least 80% of all samples with a CV lower than 30% were used for further statistical calculations. The filtered data set was normalized using the intensity of the IS (n-pentadecanoic acid, Sigma-Aldrich). Univariate and multivariate statistical comparisons were conducted using Matlab 2016b software (Mathworks, Natick, MA, USA) to evaluate the compounds contributing most to group discrimination. Identification of the statistically significant metabolites was based on the RI, RT, and mass spectra of compounds detected in tissue samples, compared with hits in the NIST 11 and in-house spectral libraries.
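The filtering and normalization rules above can be sketched in a few lines of Python. This is not the Mass Profiler Professional implementation; it is a minimal illustration that assumes the CV is computed over the detected intensities of each compound across the samples passed in, with `None` marking a compound that was not detected in a sample. All function names and the example values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100 * stdev(values) / mean(values)


def filter_features(table, min_presence=0.8, max_cv=30.0):
    """Keep compounds detected in >= min_presence of samples with CV < max_cv.

    table: dict mapping compound name -> list of intensities, one per sample,
           with None where the compound was not detected (assumed layout).
    """
    kept = {}
    for name, values in table.items():
        detected = [v for v in values if v is not None]
        if len(detected) / len(values) < min_presence:
            continue  # present in too few samples
        if len(detected) < 2 or cv_percent(detected) >= max_cv:
            continue  # too variable (or CV cannot be computed)
        kept[name] = values
    return kept


def normalize_to_is(values, is_values):
    """Divide each intensity by the internal standard intensity of its sample."""
    return [None if v is None else v / s for v, s in zip(values, is_values)]


# Made-up intensities for three compounds in five samples.
table = {
    "a": [10, 11, 9, 10, 10],       # stable and always present -> kept
    "b": [10, None, None, 12, 11],  # present in 60% of samples -> dropped
    "c": [1, 10, 2, 20, 5],         # CV far above 30% -> dropped
}
print(list(filter_features(table)))
```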

Results and Discussion

Optimization of Deparaffinization Procedure

Type of the Deparaffinization

In the current study, the deparaffinization procedure of formalin-fixed, paraffin-embedded tissue samples was evaluated. Two methods of deparaffinization were tested: with the use of xylene and hot water.

The results showed that after deparaffinization with hot water, paraffin still covered the biological material. The main concern was the possibility of tissue destruction if too many hot water cycles were performed to achieve efficient deparaffinization. Consequently, this method proved difficult to control, and deparaffinization was neither efficient nor reliable. For this reason, the method based on hot water was not applied in further metabolic investigation. The second deparaffinization method tested was based on xylene. Deparaffinization with xylene was efficient: the paraffin was easily dissolved and removed, and no paraffin residues were observed on the surface of the tissue. The procedure was easy to control and did not destroy the biological matrix.

It is worth highlighting that both tested methods have previously been applied successfully in genomics [26, 28, 30,31,32]. For instance, Sengüven et al. reported the use of a xylene-free deparaffinization protocol [26]. Furthermore, Kalantari et al. [28] compared hot water and xylene as two deparaffinization methods for DNA extraction. Their results showed that both methods are similarly effective; however, the authors indicated that the hot water method seems to be shorter, safer, and easier. Nonetheless, in proteomics, the most commonly used deparaffinization method is based on xylene [27, 33, 34].

The results of our study showed that deparaffinization with hot water was not efficient and resulted in contamination of tissue samples: paraffin residues were still observed, including on the tissue surface. Deparaffinization with xylene, on the other hand, was found to be more efficient. Xylene dissolved the paraffin, which could then be removed from the sample.

The first application of paraffin-embedded tissue in metabolomics was reported by Kelly et al. [17] in 2011. Targeted LC–MS analysis was carried out to determine polar metabolites in FFPE sarcoma tissues. In their study, however, the deparaffinization step was performed with an 80% methanol solution, in which samples were incubated at 70 °C for 45 min. In light of our results, the use of high temperature, as in the hot water method, can cause protein denaturation. In addition, the combination of high temperature and methanol may accelerate extraction and cause a significant loss of some compounds. On the other hand, using high temperature with methanol as the solvent may combine two steps of the protocol, namely deparaffinization and purification, into one. On balance, it can be assumed that the xylene-based method is more efficient for metabolomics. Furthermore, this method was previously reported in metabolomics studies, for instance by Wojakowska et al. [13, 14]. It should be highlighted, however, that a comparison of several deparaffinization methods has not been previously reported. Wojakowska et al. [13] used an untargeted metabolomics approach to compare three types of mouse kidney tissue samples: fresh-frozen; formalin-fixed; and formalin-fixed, paraffin-embedded. Samples were analyzed by gas chromatography–mass spectrometry (GC–MS). For paraffin-embedded tissue, embedding was shown to affect extraction efficiency, and the profile of the fresh-frozen specimen was more complex than that of the paraffin-embedded tissue. Nevertheless, 75% of all identified metabolites were common to the three types of samples, so paraffin embedding can be considered an alternative method of tissue preparation.

Type of Solvent Used for Purification

To evaluate the influence of the purification procedure on the quality of the analyzed samples, two methods were tested. The first method included three cycles of purification with 0.7 mL of 96% ethanol, whereas the second method additionally included two cycles of purification with 1 mL of acetone. Twelve samples were analyzed, six for each group. Samples underwent deparaffinization, extraction, and derivatization as described in the “Sample Deparaffinization and Purification Procedure”, “Sample Extraction Procedure”, and “Derivatization Procedure” sections, respectively.

The obtained data were processed according to the protocol described in the “Data Preprocessing” section. The first step of statistical analysis was normalization: signals were normalized by the peak intensity of the internal standard (n-pentadecanoic acid) and the weight of each tissue sample. The data were then filtered according to quality assurance (QA) criteria (coefficient of variation, CV, lower than 30%) and presence in group (presence in at least 80% of samples from one group). After filtering, 29 compounds were chosen for further statistical analysis. Multivariate statistical methods were used to check whether the addition of acetone produces statistically significant differences between the profiles obtained with the two compared purification procedures. In the first step of multivariate analysis, principal component analysis (PCA) was applied to evaluate general trends in the data. PCA models were built using Simca P+ 13.0.3 software. The obtained PCA model is presented in Fig. 1.

Fig. 1 PCA model built on data obtained with GC–QqQ/MS. The calculated R² was 0.899. Charts correspond to two tested purification procedures. Protocol based on ethanol is marked by blue triangles and purification procedure enriched with acetone is presented by red triangles

As can be observed in Fig. 1, samples corresponding to the two purification procedures are not separated. It can be assumed that an additional purification step with acetone does not improve sample clean-up. In addition, the samples are spread across the plot, which can be explained by the high heterogeneity of biological samples. Afterwards, the data were analyzed with parametric or nonparametric statistical tests to verify the assumption about the influence of acetone on the complexity of the metabolic profile and to select statistically significant compounds. The first step was to check the normality of the data distribution with the Shapiro–Wilk test. For normally distributed variables with homogeneous variances, the standard t test was used. Where homoscedasticity was lacking, Welch’s t test was applied. The Mann–Whitney U test was applied for variables without a normal distribution. Multiple comparison correction of the calculated p values was performed with the Benjamini–Hochberg false discovery rate (FDR) procedure. Analyses were carried out with Matlab 2016b software. Results are presented in Table 1.
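The Benjamini–Hochberg adjustment used in the last step is simple enough to write out directly. The sketch below is a plain-Python illustration of the standard step-up procedure, not the Matlab code used in the study, and the p values in the example are made up. It shows why a raw p value below 0.05 may no longer be significant once the number of tested compounds is taken into account.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order.

    Each p value is multiplied by m / rank (rank 1 = smallest p value), and
    monotonicity is enforced by taking a running minimum from the largest
    p value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p values for four compounds.
raw = [0.01, 0.04, 0.03, 0.5]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")
```

In this example the compounds with raw p values of 0.04 and 0.03 both receive an adjusted value of about 0.053, so they would not pass a 0.05 threshold after correction.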

Table 1 Univariate statistical analysis performed on samples analyzed with GC–EI–QQQ/MS

As can be observed in Table 1, the calculated p value was below 0.05 for three compounds, namely butanoic acid, tetradecanoic acid, and octadecanoic acid. However, after p value correction, these compounds were no longer statistically significant. The results show that adding acetone to the sample purification procedure does not enhance the purity of the sample. In addition, comparison of the profiles obtained with the two tested methods shows that the addition of acetone does not influence their complexity. To our knowledge, this is the first time that the influence of acetone on the efficiency of FFPE sample purification has been tested for a metabolomics approach. It was evaluated by comparing the metabolic profiles of samples prepared with two different clean-up methods. The results showed that there is no need to extend sample purification by adding acetone.

Number of Purification Cycles

During method optimization, the number of purification cycles was taken into account. Three sets of samples were prepared:

(A) samples deparaffinized without purification step;

(B) deparaffinization followed by two cycles of purification process with the use of 1 mL of ethanol for each cycle;

(C) deparaffinization followed by three cycles of purification process with the use of 0.7 mL of ethanol for each cycle.

Further steps of the sample preparation procedure and data processing were performed as described above (“Sample Deparaffinization and Purification Procedure”, “Sample Extraction Procedure”, “Preparation of Alkane Standard Mixture”, and “Data Preprocessing” sections). Exemplary metabolic profiles obtained with the use of GC–EI–QQQ/MS are presented in Fig. 2.

Fig. 2 Chromatograms obtained from the GC–MS analysis of samples prepared with modified purification step to control the influence of the number of purification cycles on the sample purity. a Tissue samples without purification step, b two cycles using 1 mL of ethanol, and c purification composed of three cycles with the use of 0.7 mL of ethanol

As can be observed, without a purification step the paraffin is not fully removed. The sample is contaminated by paraffin residue, which is seen in the last part of the chromatogram as the regular peaks characteristic of alkanes (Fig. 2a). The chromatogram in Fig. 2b shows that two-step purification with a higher volume of ethanol per cycle significantly reduced the amount of paraffin impurities; however, paraffin is still present in the sample. The purest profile was obtained with three-step purification using 0.7 mL of ethanol per cycle (Fig. 2c). As a result, three-step purification was chosen for further analysis. Previously reported protocols applied two cycles of purification with ethanol. However, in theory, dividing a given volume of solvent into several smaller portions enhances the efficiency of sample purification. Although the procedure proposed here may extend the time of sample preparation, it yields a purer and more complex metabolic profile.
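The theoretical argument about dividing the solvent can be illustrated with a simple dilution model. The sketch below is not taken from the study: it assumes that the contaminant mixes completely with each ethanol portion and that a fixed small volume of liquid stays with the tissue after the supernatant is discarded. The retained volume of 0.05 mL is a hypothetical value chosen only for illustration, and the function name is likewise hypothetical.

```python
def residual_fraction(wash_volume_ml, cycles, retained_volume_ml):
    """Fraction of contaminant left after repeated washes (ideal dilution).

    Assumptions (not from the study): complete mixing in every cycle and a
    constant liquid volume retained by the tissue after decanting.
    """
    per_cycle = retained_volume_ml / (retained_volume_ml + wash_volume_ml)
    return per_cycle ** cycles


RETAINED_ML = 0.05  # hypothetical carry-over volume per cycle

two_washes = residual_fraction(1.0, 2, RETAINED_ML)    # 2 x 1 mL
three_washes = residual_fraction(0.7, 3, RETAINED_ML)  # 3 x 0.7 mL
same_total = residual_fraction(2.0 / 3, 3, RETAINED_ML)  # 3 portions of 2 mL total

print(f"2 x 1.0 mL : {two_washes:.5f}")
print(f"3 x 0.7 mL : {three_washes:.5f}")
print(f"3 x 0.67 mL: {same_total:.5f}")
```

Under these assumptions, three smaller washes leave less residue than two larger ones, even when the total solvent volume is the same. Real tissue will deviate from the ideal model, so the numbers indicate a trend rather than a prediction.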

The Influence of the Paraffin Specimen Thickness

In the presented study, different specimen thicknesses were tested: 10 µm, 15 µm, 20 µm, 30 µm, and 40 µm. Five replicates were prepared for each thickness. Sample preparation and data processing were performed as described in the Experimental section. The obtained data were filtered based on QA criteria and presence in the group: only compounds present in at least 80% of samples with a CV lower than 30% were used for further statistical calculations. As a result, 28 compounds were chosen for further statistics. The filtered data set was normalized using the abundance of the IS (n-pentadecanoic acid) and a calculated normalization factor, namely tissue weight divided by the thickness of the specimen. This type of normalization excluded the influence of weight on metabolic profile quality, so the observed differences are a consequence of the different sample thicknesses. Multivariate statistical calculations were applied to evaluate how the metabolic profiles differed depending on the thickness of the paraffin specimen. For this purpose, the analyzed samples were divided into five groups of five replicates each (Group 1: 10 µm, Group 2: 15 µm, Group 3: 20 µm, Group 4: 30 µm, and Group 5: 40 µm). The 28 selected analytical signals were used to build a PCA model. Results are presented in Fig. 3.

Fig. 3 PCA models built on data obtained with GC–QqQ/MS. The calculated R² was 0.994. Boxes correspond to five groups differing in thickness of paraffin-embedded tissue. Group 1 (10 µm) is marked by red boxes, Group 2 (15 µm) by dark blue boxes, Group 3 (20 µm) by brown color, Group 4 (30 µm) by yellow color, and Group 5 (40 µm) by light blue boxes

No clustering was observed in the presented model, which suggests that the thickness of the paraffin specimen has a negligible influence on the differentiation of the metabolic profiles. Statistical analysis included the ANOVA or Kruskal–Wallis test, depending on the normality of the data and the homogeneity of the variances. Normal distribution was tested with the Shapiro–Wilk test, while homogeneity of the variances was tested with the Levene test. Post hoc testing was carried out with the Fisher test or the Wilcoxon test after ANOVA or Kruskal–Wallis, respectively. Calculations were performed in RStudio [35] and MetaboAnalyst 3.6.

As a result, four compounds, namely: citric acid, glycine, inositol, and 5-oxo-proline, demonstrated statistically significant differences between five tested groups. Average abundances of chosen metabolites are shown in Fig. 4.

Fig. 4 Compounds differentiating the compared groups at a statistically significant level and box plots representing their average abundance. Each number corresponds to a group characterized by a different thickness of FFPE tissue (Group 1: 10 µm, Group 2: 15 µm, Group 3: 20 µm, Group 4: 30 µm, and Group 5: 40 µm)

Selected compounds are presented in Table 2. Post hoc test provides the information about observed differences between groups.

Table 2 Statistically significant compounds, selected by ANOVA or Kruskal–Wallis methods

Statistical analysis showed that the differences between the obtained profiles are not statistically significant for paraffin tissue samples 10 μm, 15 μm, and 20 μm thick. These results were similar for each compound taken into account. However, the statistical analysis identified four compounds whose levels differed depending on tissue thickness. The post hoc test performed for glycine shows that only the profiles of metabolites extracted from 40 μm-thick paraffin-embedded tissue differed in a statistically significant way from the other tested sample thicknesses. Results for the remaining three compounds, namely inositol, citric acid, and 5-oxo-proline, showed that profiles obtained from samples 30 μm and 40 μm thick differ from those obtained from samples 10 μm, 15 μm, and 20 μm thick. The post hoc tests indicated that changes are negligible among paraffin-embedded tissues with thicknesses between 10 and 20 μm. It can be assumed that the most reliable, complex, and repeatable profiles are obtained from paraffin-embedded tissues with a thickness of at least 30 μm. Kelly et al. [17] as well as Wojakowska et al. [13, 14] used paraffin-embedded tissue 40 μm thick. Nonetheless, knowledge about the optimization of tissue thickness is lacking, and the results presented here indicate that tissue thickness should be at least 30 μm.

Application of the Optimized Sample Preparation Procedure

The optimized sample preparation procedure was used for the analysis of 9 FFPE prostate tissue samples. Four samples were obtained during prostatectomy from patients with diagnosed prostate cancer, and five formed the control group, composed of tissue obtained during prostatectomy from non-cancer patients. Samples 30 μm thick were chosen for the analysis. Samples were prepared according to the optimized procedure described above, which consisted of three main steps: deparaffinization (described in the “Sample Deparaffinization and Purification Procedure” section), extraction (described in the “Sample Extraction Procedure” section), and derivatization (described in the “Derivatization Procedure” section). Selected chromatograms from the cancer and control groups are presented in Fig. 5.

Fig. 5 Example chromatograms obtained from GC–MS analysis of 30 μm-thick FFPE prostate tissue samples. a FFPE tissue from prostate cancer patient; b FFPE prostate tissue from non-cancer patient

In total, 157 compounds were identified, based on the calculated retention index, retention time, and comparison of the obtained mass spectrum with the library (an in-house library based on NIST). According to the Fiehn protocol [36], the usual number of metabolites identified by GC–MS in biological matrices varies between 120 (cell culture) and 200 (urine). Matsuo et al. [37] reported that GC–MS analysis can yield 500–1000 chromatographic peaks in a single run; however, only 100–200 metabolites (20%) can be identified. It has to be highlighted that the number of compounds identified by GC–MS strongly depends on the applied library. The number of 157 identified compounds is comparable with the results obtained by Wojakowska et al. [13].

The obtained data were processed as described in “Data Preprocessing” section. The processed and aligned data were then filtered by frequency of detection: only compounds present in at least 80% of samples were used for further statistical calculations. As a result, 24 compounds were selected for statistical analysis. This strict filtration was carried out to avoid the selection of false-positive features. Next, normalization was performed based on the intensity of the IS as well as the weight of the tissue. The normalized data were used to build a PCA model (Fig. 6).
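
The filtering and normalization steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (one list of peak intensities per compound, with `None` for a compound not detected in a sample), and the exact normalization formula (intensity divided by the product of IS intensity and tissue weight) are assumptions made for illustration, and the 80% threshold is applied here across all samples rather than per group.

```python
from typing import Dict, List, Optional

# Hypothetical layout: compound name -> one intensity per sample (None = not detected).
Table = Dict[str, List[Optional[float]]]


def filter_by_presence(table: Table, min_fraction: float = 0.8) -> Table:
    """Keep compounds detected in at least `min_fraction` of all samples."""
    kept: Table = {}
    for compound, values in table.items():
        detected = sum(v is not None for v in values)
        if detected / len(values) >= min_fraction:
            kept[compound] = values
    return kept


def normalize(values: List[Optional[float]],
              is_intensity: List[float],
              tissue_weight: List[float]) -> List[Optional[float]]:
    """Divide each intensity by the sample's IS intensity and tissue weight.

    The formula is an assumption; the text only states that both the IS
    intensity and the tissue weight were used.
    """
    return [
        None if v is None else v / (i * w)
        for v, i, w in zip(values, is_intensity, tissue_weight)
    ]


# Toy data for five samples (invented numbers, for illustration only).
raw: Table = {
    "compound_A": [100.0, 200.0, None, 400.0, 500.0],  # detected in 4/5 samples
    "compound_B": [10.0, None, None, 40.0, 50.0],      # detected in 3/5 samples
}
is_intensity = [10.0, 10.0, 10.0, 20.0, 25.0]
tissue_weight = [2.0, 4.0, 2.0, 2.0, 4.0]

filtered = filter_by_presence(raw)
normalized = {
    name: normalize(values, is_intensity, tissue_weight)
    for name, values in filtered.items()
}
```

In this toy table only `compound_A` passes the 80% threshold. With the nine samples analyzed in this study, the same threshold would require detection in at least eight samples.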

Fig. 6 PCA model for tissue samples obtained from prostate cancer patients (red triangles) and control group (blue triangles). The calculated R2 value was 0.977

Although PCA is an unsupervised method, some trends in data grouping can already be observed. The data were then subjected to univariate statistical analysis. In the first step, the normality of the data distribution and the equality of variances were tested with the Shapiro–Wilk and Levene tests, respectively. Based on the result, either the parametric Student's t test or the Mann–Whitney U test was applied. The calculated p values were corrected with the Benjamini–Hochberg false discovery rate (FDR) procedure. Analyses were performed with MATLAB and SIMCA-P+ software. The results are presented in Table 3.
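
The decision logic and the multiple-testing correction described above can be sketched as follows. The analyses in this study were run in MATLAB and SIMCA-P+, so this Python version is only an illustration. The 0.05 threshold for the assumption checks and the rule "use the t test only if both groups pass the normality check and the variances are equal" are assumptions, and the p values are invented. The Benjamini–Hochberg step follows the standard step-up adjustment.

```python
from typing import List

ALPHA = 0.05  # assumed significance threshold for all checks


def choose_test(shapiro_p_a: float, shapiro_p_b: float, levene_p: float,
                alpha: float = ALPHA) -> str:
    """Pick a two-group test from assumption-check p values (assumed rule)."""
    normal = shapiro_p_a > alpha and shapiro_p_b > alpha
    equal_variance = levene_p > alpha
    return "t-test" if normal and equal_variance else "mann-whitney-u"


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented raw p values for four compounds.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw_p)
significant = [p < ALPHA for p in adjusted]
```

A compound is reported as significant only if its adjusted p value stays below the threshold, which is the criterion applied to the compounds in Table 3.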

Table 3 Results of univariate statistical analysis of tissue samples obtained from prostate cancer patients and healthy volunteers

Table 3 shows ten compounds that differed between the two tested groups at a statistically significant level (p value after Benjamini–Hochberg correction below 0.05). The main goal of this application, as well as of the subsequent filtration and statistical analysis, was to assess whether the developed method allows the identification of significant compounds that are biologically plausible and have previously been described in prostate cancer tissues.

As can be observed, most of the statistically significant metabolites are fatty acids. Citric acid was also among the significant compounds. The level of fatty acids in the prostate cancer group was elevated relative to the level determined in control samples. The opposite was observed for citric acid: its concentration in prostate cancer samples was lower than in the control group. 2-Amino-2-methyl-1,3-propanediol and benzamide are assumed to be of microbial origin. Moreover, benzamide is considered a diet or drug metabolite. The Kyoto Encyclopedia of Genes and Genomes reports the participation of benzamide in two biochemical pathways: aminobenzoate degradation and microbial metabolism in diverse environments [].

FFPE tissues have also been used in previous untargeted metabolomic studies. Wojakowska et al. [13] used GC–MS to compare metabolic profiles obtained from formalin-fixed, fresh-frozen, and FFPE mouse kidney tissue specimens; 60 compounds were detected in FFPE samples. More recently, Cacciatore et al. [16] performed untargeted metabolomic analysis, based on ultraperformance liquid chromatography–mass spectrometry (UPLC–MS) and GC–MS, to compare fresh-frozen and FFPE specimens of both prostate cancer cell lines and prostate cancer tissues. They detected 252 metabolites in FFPE cell line specimens and identified 140 metabolites in FFPE prostate cancer tissues. The detected compounds belonged to various chemical classes, including energy metabolites, nucleotides, lipids, amino acids, carbohydrates, xenobiotics, and peptides, but the number of identified metabolites within each class was between 2 and 50 times smaller than in fresh-frozen specimens. The number of metabolites was similar to that obtained in the present study (157 compounds before filtration). However, the results of Cacciatore et al. [16] were obtained with two complementary techniques, whereas only one technique was applied in the present research. Moreover, the authors did not specify how many compounds were detected by each technique. According to their results, 48 metabolites were statistically significant, which constitutes ca. 34% of all identified metabolites. Kelly et al. [17] applied LC–MS/MS in selected reaction monitoring (SRM) mode to detect 249 compounds in FFPE tissues. Based on filtration, 119 compounds were selected for further investigation, and on average 106 metabolites were identified in the analyzed biological matrix. Thus, data preprocessing reduced the number of metabolites by around 50%.
In the present study, data preprocessing reduced the number of metabolites from 157 to 24 compounds (a reduction of ca. 85%). These 24 compounds underwent statistical analysis, and ten of them were found to differ significantly between the studied groups. Since the present research is a retrospective study, fresh-frozen samples from the same patients as the FFPE specimens could not be obtained. However, some previously published studies compared metabolic profiles obtained from FFPE tissues and fresh-frozen specimens. According to the literature, formalin fixation and paraffin embedding may cause the loss of about 25% of detected metabolites compared with fresh-frozen samples, especially among amino acids, peptides, and carbohydrates [13, 15, 16]. Fortunately, the goal of our study was to analyze the synthesis and metabolism of fatty compounds, which were found to be stable during the FFPE sample preparation procedure.
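
The percentages quoted in this comparison can be rechecked with simple arithmetic. The short Python sketch below uses only the counts given in the text; the helper names are hypothetical.

```python
def percent_reduction(before: int, after: int) -> float:
    """Share of features removed, in percent."""
    return 100.0 * (before - after) / before


def percent_share(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Present study: 157 identified compounds, 24 kept after filtration.
own_reduction = percent_reduction(157, 24)      # about 84.7, i.e. ca. 85%

# Kelly et al. [17]: 249 detected compounds, 119 kept after filtration.
kelly_reduction = percent_reduction(249, 119)   # about 52.2, i.e. around 50%

# Cacciatore et al. [16]: 48 significant out of 140 identified metabolites.
cacciatore_share = percent_share(48, 140)       # about 34.3, i.e. ca. 34%
```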


This study proposes a method, optimized with respect to several factors, for the direct extraction of metabolites from FFPE tissue samples. Although the protocol was developed using one type of sample, it is considered universal and may be applied to various types of tissue specimens as well as to different approaches in metabolomics (untargeted as well as targeted analysis). To our knowledge, this is the first time such a method has been fully optimized and applied to untargeted metabolomics in prostate cancer. Undoubtedly, paraffin-embedded tissue is an attractive alternative matrix for metabolomics. Application of the developed and optimized FFPE tissue sample preparation method to metabolic fingerprinting by GC–MS confirmed its potential for investigating CaP development. Ten metabolites were found to contribute the most to differentiating the cancerous and non-cancerous prostatic tissue groups. The tissue metabolites with altered levels in the cancer group are involved in biochemical pathways such as the TCA cycle and fatty acid metabolism. The results may provide new insight into the mechanism of prostate cancer development. However, it should be emphasized that these are preliminary results from a pilot study and therefore require validation on a larger set of samples.