Semi-automated non-target processing in GC × GC–MS metabolomics analysis: applicability for biomedical studies
- Cite this article as:
- Koek, M.M., van der Kloet, F.M., Kleemann, R. et al. Metabolomics (2011) 7: 1. doi:10.1007/s11306-010-0219-6
Due to the complexity of typical metabolomics samples and the many steps required to obtain quantitative data in GC × GC–MS (deconvolution, peak picking, peak merging and integration), the unbiased non-target quantification of GC × GC–MS data still poses a major challenge in metabolomics analysis. The feasibility of using commercially available software for non-target processing of GC × GC–MS data was assessed. For this purpose a set of mouse liver samples (24 study samples and five quality control (QC) samples prepared from the study samples) was measured with GC × GC–MS and GC–MS to study the development and progression of insulin resistance, a primary characteristic of type 2 diabetes. A total of 170 and 691 peaks were quantified in, respectively, the GC–MS and GC × GC–MS data for all study and QC samples. The quantitative results for the QC samples were compared to assess the quality of semi-automated GC × GC–MS processing relative to targeted GC–MS processing, which involved time-consuming manual correction of all wrongly integrated metabolites and was considered the gold standard. The relative standard deviations (RSDs) obtained with GC × GC–MS were somewhat higher than with GC–MS, due to less accurate processing. Still, the biological information in the study samples was preserved and the added value of GC × GC–MS was demonstrated; many additional candidate biomarkers were found with GC × GC–MS compared to GC–MS.
Keywords: Metabolomics; Comprehensive two-dimensional gas chromatography mass spectrometry; GC × GC–MS; Automated data processing; Diabetes; Insulin resistance
Metabolomics research involves the comprehensive non-target analysis of all, or at least as many as possible, metabolites in cells, tissue or body fluids. The complexity of the metabolome makes this a challenging task for analytical chemists. For example, samples of the simplest microorganisms already contain, by estimation, several hundreds of different metabolites. At present, the main analytical techniques used for the analysis of the metabolome are nuclear magnetic resonance spectroscopy (NMR) and hyphenated techniques, such as gas chromatography (GC) and liquid chromatography (LC) coupled to mass spectrometry (MS).
Gas chromatography coupled to mass spectrometry (GC–MS) is a highly suitable technique for metabolomics analysis due to the high separation power, reproducible retention times and sensitive selective mass detection. In previous papers a one-dimensional GC–MS method (Koek et al. 2006) and a comprehensive two-dimensional gas chromatography mass spectrometry method (GC × GC–MS; Koek et al. 2008) suitable for the analysis of a broad range of small polar metabolites were described using a derivatization with an oximation reagent followed by silylation. Several other GC–MS (Fiehn et al. 2000; Jonsson et al. 2004; Roessner et al. 2000; Strelkov et al. 2004; Villas-Boas et al. 2005) and GC × GC–MS (O’Hagan et al. 2007; Pierce et al. 2006b; Shellie et al. 2005) based methods for metabolomics have been reported.
The principle of GC × GC–MS is based on the coupling of two analytical columns with different selectivities coupled through a modulator. The so-called dual-stage cryogenic modulator equipped with four jets (two liquid-nitrogen cooled and two hot-gas jets) allows for the consecutive trapping, cryogenic focussing and release of small fractions from the first column effluent in narrow bands onto the second column. In this comprehensive setup, the entire sample is separated on both columns and no information of the first separation is lost during the second one. The resulting GC × GC–MS chromatogram consists of a large series of consecutive second dimension (2D) separations. To maintain the separation of the first column, each peak eluting from the first dimension should be sampled, i.e. modulated, minimally three to four times (Murphy et al. 1998).
GC × GC–MS offers several advantages over GC–MS, i.e. higher chromatographic separation power, a broader dynamic range and lower detection limits, and should be the preferred technique for metabolomics analysis. However, quantification of metabolomics samples using GC × GC–MS is still a major challenge. To get from raw total-ion chromatographic data to a list of sample components with their corresponding peak areas and mass spectra, many steps are required, including peak finding, deconvolution, integration and combining of the peaks from different modulations originating from one compound. The performance of each of these steps influences the final data quality and, consequently, the reliability of the biological information extracted from the data. In addition, all metabolites are of interest and need to be quantified. Several approaches have been published to process GC × GC–MS data after analysis to find metabolites that distinguish between samples (Mohler et al. 2008; Pierce et al. 2006a; Shellie et al. 2001; Sinha et al. 2004), but only a few papers on the quantification of all (or at least as many as possible) peaks have been published.
Hoggard and Synovec (2008) described a method for applying PARAFAC to GC × GC-TOF-MS data in an automated fashion that required no assumptions about analyte identities. They proposed that the method was applicable as a post-processing step providing deconvolution and quantification of all analytes in a sample. However, their method was very time-consuming, i.e. one chromatogram had to be divided into numerous subsections, and complete analysis required, by their estimation, tens of hours. Oh et al. (2008) developed a peak sorting method (MSsort) for GC × GC–MS data in Matlab. Raw data files were first processed using the ChromaTOF software (LECO, St. Joseph, MI, USA) to provide peak tables. Subsequently, MSsort was used to sort and combine peaks by utilizing first- (1rT) and second-dimension (2rT) retention times and the mass spectrum. However, no quantitative data were presented in either of the described papers. To our knowledge the only attempt at non-target quantification of metabolites in a real-life metabolomics study was published by Li et al. (2009). They quantified 692 peaks in 79 human-plasma samples to identify possible biomarkers for type-2 diabetes mellitus. Quantification was performed by exporting m/z 73 from the GC × GC–MS chromatograms, and alignment, peak merging and quantification were performed using their in-house developed software (GC × GC Workstation; Qiu et al. 2007). The repeatability of the quantification was tested using pooled plasma samples. The mean relative standard deviations (RSDs) in five consecutive injections of one plasma sample and five consecutive injections of five different plasma samples were 14 and 20%, respectively. It is not fully clear how many peaks were included in the mean RSD (only peaks that were quantified in all samples were included).
Besides, the use of a single mass trace (m/z) instead of the deconvoluted spectrum of a peak for quantification can result in errors in quantification of coeluting peaks and the assignment of the identity of a peak.
In this paper, the possibilities and limitations of the software with regard to non-target semi-automated processing of GC × GC–MS data were evaluated. This was done by measuring and processing a set of mouse-liver samples that were part of a larger study investigating the development of insulin resistance/type-2 diabetes mellitus (DM2) (Kleemann et al. 2010). DM2 is a multifactorial complex disease associated with metabolic deregulations. Despite major efforts, the pathophysiological mechanisms underlying the beginning and progression of the disease are still incompletely understood. Identification of changes in hepatic metabolite profiles can help to identify dysregulated metabolic pathways in DM2 and thus in the selection of (new) therapeutic regimens. Mice with a humanized lipoprotein metabolism, APOE*3Leiden transgenic (E3L) mice (Zadelaar et al. 2007) were used and mice were fed a high-fat diet known to induce insulin resistance/DM2. Livers were collected at different time points during 12 weeks of high-fat-diet feeding. The samples were measured with GC × GC–MS and with GC–MS to be able to compare the results from both methods. Results of semi-automated GC × GC–MS data processing were compared with a fully optimized, but labour-intensive, targeted GC–MS data processing method used in our lab, involving the inspection and, if required, manual correction of the integration of all quantified metabolites. In addition, time-resolved changes in metabolic profiles of the mouse livers were identified using principal-component analysis (PCA) and principal-component-discriminant analysis (PCDA).
2.1 Chemicals and materials
Pyridine (Baker analyzed) was purchased from Mallinckrodt Baker (Deventer, The Netherlands) and pyridine hydrochloride (analytical grade) was purchased from Sigma-Aldrich (Zwijndrecht, The Netherlands). A solution of 56 mg/ml ethoxyamine hydrochloride (>99%, Acros Organics, Geel, Belgium) in pyridine was used for oximation and N-methyl-N-trimethylsilyl trifluoroacetamide (MSTFA; Alltech, Breda, The Netherlands) was used for silylation.
Standards used as quality-control standards, leucine-d3, glutamic acid-d3, phenylalanine-d5, glucose-d7, alanine-d4 and cholic acid-d4, were purchased from Spectra Stable Isotopes (Columbia, USA). 4,4-Difluorobiphenyl, trifluoroanthracene and dicyclohexyl phthalate were purchased from Sigma-Aldrich. Three internal standard (IS) mixtures were prepared: IS mix 1 containing leucine-d3 (250 ng/μl), glutamic acid-d3 (250 ng/μl), phenylalanine-d5 (250 ng/μl) and glucose-d7 (250 ng/μl) in water; IS mix 2 containing alanine-d4 (250 ng/μl) and cholic acid-d4 (250 ng/μl) in pyridine; and IS mix 3 containing 4,4-difluorobiphenyl (250 ng/μl), trifluoroanthracene (250 ng/μl) and dicyclohexyl phthalate (250 ng/μl) in pyridine.
2.3 Mouse-liver samples
Animal experiments were approved by the Institutional Animal Care and Use Committee of The Netherlands Organization for Applied Scientific Research (TNO) and were in compliance with European-Community specifications regarding the use of laboratory animals. Male ApoE*3Leiden transgenic (E3L) mice were subjected to high-fat diet feeding essentially as specified in Kleemann et al. (2010). Briefly, E3L mice displaying a humanized lipoprotein metabolism and lipid profile and sensitive to high-fat diet treatment (Kleemann et al. 2007) were fed a high-fat diet containing 24% beef tallow (HF diet; Hope Farms, Woerden, The Netherlands) and were euthanized with CO/CO2 after zero weeks (n = 8), 6 weeks (n = 8) and 12 weeks (n = 8) of high-fat diet feeding.
Livers were collected at sacrifice and were snap-frozen immediately in liquid nitrogen, and stored at −80°C until use (no longer than 10 months).
2.4 Sample preparation
The liver samples were freeze-dried overnight and homogenized. 10-mg aliquots of the liver samples were weighed and placed inside a 2-ml Eppendorf tube. After addition of 10 μl of IS mix 1 and 500 μl of methanol/water 4:1 v/v, all samples were sonicated for 30 min and subsequently centrifuged for 10 min at 14086×g (10000 rpm). The supernatants were transferred to autosampler vials and subsequently dried under nitrogen flow. Then 10 μl IS mix 2 and 30 μl ethoxyamine hydrochloride solution were added and the samples were oximated for 90 min at 40°C on a tube roller mixer placed inside an oven. Subsequently, 10 μl of IS mix 3 and 100 μl of MSTFA were added and the samples were silylated for 50 min at 40°C on a tube roller mixer inside an oven. Finally, the samples were centrifuged for 20 min at 2081×g (3500 rpm) prior to injection.
2.5 Quality-control (QC) sample
A pooled sample of six different liver samples from different time points (two per time point) was used as QC sample. The samples were prepared according to the sample preparation described above; however, after extraction the supernatants of all samples were mixed and subsequently divided over ten separate autosampler vials. Furthermore, the amounts of liver sample and IS-mix were adjusted to obtain the same amount of biomass and internal standards in the QC samples compared to the study samples.
2.6 GC–MS analysis
The derivatized extracts were analyzed with an Agilent 6890 gas chromatograph coupled with an Agilent 5973 mass-selective detector (Agilent Technologies, Santa Clara, CA, USA). 1-μl aliquots of the extracts were injected into a DB5-MS capillary column (30 m × 250 μm I.D., 0.25 μm film thickness; J&W Scientific, Folsom, CA, USA) using PTV-injection (Gerstel CIS4 injector; Mülheim an der Ruhr, Germany) in the splitless mode. The temperature of the PTV was 70°C during injection and 0.6 min after injection the temperature was raised to 300°C at a rate of 2°C/s and held at 300°C for 20 min. The initial GC-oven temperature was 70°C; 5 min after injection the GC-oven temperature was increased at 5°C/min to 320°C and held for 5 min at 320°C. Helium was used as carrier gas, and the pressure was programmed such that the helium flow was kept constant at 1.7 ml/min. Detection was achieved using MS detection in electron ionisation and full-scan monitoring mode (m/z 15–800). The temperature of the ion source was set at 250°C and that of the quadrupole at 200°C.
2.7 GC × GC–MS analysis
The derivatized samples were analyzed with an Agilent 6890 gas chromatograph fitted with a dual-stage, four-jet (two liquid-nitrogen cooled and two hot-gas jets) cryogenic modulator and a secondary oven (LECO) and coupled to a time-of-flight mass spectrometer (Pegasus III, LECO). The configuration of the first (1D) and second dimension (2D) column and the method parameters were optimized, as described in Koek et al. (2008).
A 30 m × 0.25 mm I.D. × 0.25 μm forte BPX-50 column (SGE, Milton Keynes, UK) was used as the 1D column and a 2 m × 0.32 mm I.D. × 0.25 μm forte BPX5 column (SGE Europe) was used as the 2D column.
1-μl aliquots of the derivatized extracts were injected using PTV-injection (Gerstel CIS4) in the splitless mode. The temperature of the PTV was 70°C during injection and 0.6 min after injection the temperature was raised to 300°C at a rate of 2°C/s and held at 300°C for 20 min. The initial GC-oven temperature was 70°C, 3 min after injection the temperature was raised to 300°C with a rate of 5°C/min and held at 300°C for 10 min. The temperature offset of the secondary oven and modulator compared to the GC oven were set at +30 and +40°C, respectively. The modulation time was 6 s, with the hot-pulse time set at 1 s. Helium was used as carrier gas and the analyses were carried out in constant-pressure mode at 300 kPa. The MS transfer line was set at 325°C and the ion-source temperature was 280°C. The detector voltage was set at −1600 V and the data acquisition rate was 75 Hz.
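As a rough worked example, the method parameters above fix the size of each GC × GC–MS data set. The sketch below derives the run time, the number of modulation cycles and the number of spectra per cycle from the stated oven program, modulation time and acquisition rate. The function and variable names are illustrative, and the total spectrum count assumes that acquisition covers the whole oven program (no solvent delay), which the text does not state.

```python
# Worked arithmetic for the GC x GC-MS method parameters given in the text.
# Assumption (not stated in the text): spectra are acquired for the entire oven program.

def oven_program_minutes(initial_hold_min, start_c, end_c, ramp_c_per_min, final_hold_min):
    """Total duration of a single-ramp oven program in minutes."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# 70 C held for 3 min, 5 C/min to 300 C, held for 10 min
run_min = oven_program_minutes(3, 70, 300, 5, 10)

modulation_s = 6        # modulation time
acquisition_hz = 75     # data acquisition rate

modulations = run_min * 60 / modulation_s
spectra_per_modulation = modulation_s * acquisition_hz
total_spectra = modulations * spectra_per_modulation

# Narrowest first-dimension peak that is still sampled three times
min_1d_peak_width_s = 3 * modulation_s

print(f"run time: {run_min:.0f} min")
print(f"modulation cycles: {modulations:.0f}")
print(f"spectra per modulation: {spectra_per_modulation}")
print(f"total spectra: {total_spectra:.0f}")
print(f"minimum 1D peak width for 3 modulations: {min_1d_peak_width_s} s")
```

With these settings a first-dimension peak needs to be at least 18 s wide at the base to be modulated three times.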
2.8 Data processing GC–MS
The Chemstation software (Version E02.00.493, Agilent Technologies) was used for processing of the data. A target table was constructed using an in-house library containing the mass spectra and retention times of over 600 reference metabolites (authentic standards), over 100 annotated metabolites (spectral match with NIST library) and over 200 unknown metabolites commonly found in blood products. Furthermore, metabolites (known or unknown) specific for this study were added to the target table. A total of 175 targets were found in the QC samples (total of three) and quantified in all samples by reconstructing an ion chromatogram of a specific mass from the mass spectrum of the target. The quantification for all targets was manually checked by visual control and if necessary peak integration was corrected manually.
2.9 Optimization of GC × GC–MS data processing
ChromaTOF software V3.35 was used for data processing. During the optimization step the following parameters were varied separately in the processing method: first dimension peak width (1wB) (30, 60, 90 and 120 s), second dimension peak width (2wB) (0.1, 0.15, 0.2, 0.3 and 0.4 s), smoothing factor (auto, 3, 5, 7) and the match required to combine different 2D peaks originating from one entry (400–800). The different processing methods were evaluated by investigating the deconvoluted mass spectra, the integration and the combining of the 2D peaks of the IS. For all IS, except for cholic acid-d4, the naturally-occurring non-labelled form was also detected in the sample and partly coeluted with the labelled IS. These naturally-occurring compounds (except for glucose, which was present in extremely high concentration) were also evaluated to check the performance of the deconvolution. The IS and naturally-occurring metabolites were distributed over the entire chromatogram and eluted at 1rT between 356 and 2846 s and 2rT between 2.4 and 5 s. The 2wB and the match required to combine 2D peaks were the primary parameters determining the quality of the deconvolution (2wB) and the combining of the different 2D peaks from one entry (both parameters). Unfortunately, the software does not allow different 2wB values to be set for different 2rT, even though metabolites eluting at high 2rT, e.g. cholic acid-d4, were better quantified with broader peak widths than metabolites eluting at low 2rT. In our case study, the 2wB was best set somewhat narrower (0.15 s) than the actual peak width of the narrowest peaks of interest (0.2 s baseline).
2.10 Data processing for GC × GC–MS
A computer with the following specifications was used: Pentium [R] dual Intel processor CPU 3.4 GHz, 3 GB RAM, hard disk: Serial ATA, 7200 RPM, 16 MB cache, RAID 24/7 (Seagate Barracuda ES, 3.0 GB/s, 500 GB). All samples were processed with ChromaTOF V3.35 software with the following settings. Baseline tracking: default; baseline offset: 1.0; peak width: 0.15 s; segmented processing: peak find S/N 20, number of apexing masses 2; GC × GC parameters: match required to combine 500, peak width 90 s, mass threshold 0. Quantification for every individual entry was performed on their unique mass in the mass spectrum determined by the ChromaTOF software. The peaks from the constructed calibration table were quantified in all QC and study samples.
2.11 Construction of calibration table for GC × GC–MS
One of the QC samples from the middle of the sequence was processed with the method described above, except that the peak find S/N was set to 200. As many artefact peaks as possible were removed, for example all peaks related to solvents and reagents (eluting at low 2rT) and multiple entries from highly concentrated tailing metabolites (e.g. phosphate). All remaining entries were added to a calibration table. Targets from the 1D-GC–MS target table that were unambiguously identified in the 2D-GC × GC–MS data, i.e. the identity was confirmed by the injection of an authentic standard or the mass spectrum of the metabolite was considered unique, were renamed (total 107 targets) in the 2D calibration table. The maximum 1rT deviation in the calibration table was set to 13 s for every entry. The retention time deviation was set to 0.1 s, the minimum area threshold was 0, the match threshold was 550 and the S/N threshold was set to 5.
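The calibration-table thresholds can be read as a set of acceptance criteria for assigning a detected peak to a table entry. The following is a minimal sketch of such a check, not the ChromaTOF implementation, whose internals are not described here. The class and function names are hypothetical, the 0.1 s deviation is assumed to apply to the second-dimension retention time, and the spectral match is assumed to be supplied as a score on a 0–1000 scale.

```python
from dataclasses import dataclass

@dataclass
class CalibrationEntry:
    name: str
    rt1_s: float   # first-dimension retention time (s)
    rt2_s: float   # second-dimension retention time (s)

@dataclass
class Peak:
    rt1_s: float
    rt2_s: float
    match: float   # spectral match against the entry, assumed 0-1000 scale
    s_n: float     # signal-to-noise ratio
    area: float

def is_assignable(entry, peak,
                  max_rt1_dev_s=13.0,   # maximum 1rT deviation from the text
                  max_rt2_dev_s=0.1,    # assumed to be the 2rT deviation
                  min_match=550,
                  min_s_n=5.0,
                  min_area=0.0):
    """Return True if the peak satisfies every threshold for this entry."""
    return (abs(peak.rt1_s - entry.rt1_s) <= max_rt1_dev_s
            and abs(peak.rt2_s - entry.rt2_s) <= max_rt2_dev_s
            and peak.match >= min_match
            and peak.s_n >= min_s_n
            and peak.area >= min_area)

entry = CalibrationEntry("example entry", rt1_s=1000.0, rt2_s=2.50)
good = Peak(rt1_s=1006.0, rt2_s=2.55, match=600, s_n=50.0, area=1.0e5)
low_match = Peak(rt1_s=1006.0, rt2_s=2.55, match=480, s_n=50.0, area=1.0e5)
shifted = Peak(rt1_s=1020.0, rt2_s=2.55, match=900, s_n=50.0, area=1.0e5)

print(is_assignable(entry, good))       # all thresholds met
print(is_assignable(entry, low_match))  # rejected: match below 550
print(is_assignable(entry, shifted))    # rejected: 1rT deviates by 20 s
```

A peak rejected only because of a low match produces a blank value even though the compound is present, which is the situation the post-processing step has to repair.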
2.12 Post processing of GC × GC–MS
The quantitative data for all 1025 targets in the calibration table were exported to Excel. Compounds that were not found in more than one QC sample were removed (825 entries left). Subsequently, entries with more than four blank values in all samples were removed from the data set (691 entries left excl. internal standards). Of course, a blank value can be obtained when the concentration of the metabolite is below the limit of detection. However, in many occasions blank values were obtained even when the peak of interest was present in the sample (further referred to as a missing value), due to a low spectrum match. A low match was mostly caused by mistakes in the deconvolution either in the sample itself or in the sample used for the construction of the calibration table. However, the use of a selective mass from the mass spectrum for every metabolite (as defined in the reference table of the selected QC sample), still allows the quantification of wrongly deconvoluted peaks, although the reliability is lower. To fill the remaining missing values in the data set (total of 169 blank values), the chromatograms were reprocessed with a match threshold in the calibration table of 200 rather than 500. In this way the missing values for peaks that were unassigned due to a low match factor could be filled from the newly processed data. Of course, only correct assignments of these missing peak areas (as manually controlled via correct mass spectrum and retention time) were filled from the newly processed data. Then, all remaining peaks with missing values in the QC and/or study samples were checked and corrected manually by assigning the right peak in the chromatogram to the compound in the calibration table. The integration of the peaks and the combining of 2D peaks were not corrected as this was extremely time-consuming and therefore considered an unrealistic option.
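The two filtering steps at the start of this procedure can be sketched as follows. This is an illustration under stated assumptions rather than the spreadsheet procedure actually used: the data layout (a dictionary of peak areas per entry, with `None` for a blank value), the function name and the toy data are hypothetical, and the first rule is read as "keep an entry only if it was found in at least two QC samples".

```python
def filter_entries(table, qc_ids, min_qc_hits=2, max_blanks=4):
    """Apply the two filters to a peak table.

    table: dict mapping entry name -> dict mapping sample id -> area (None = blank)
    qc_ids: ids of the QC samples
    """
    kept = {}
    for entry, areas in table.items():
        # Filter 1: entry must be found in more than one QC sample
        qc_hits = sum(1 for s in qc_ids if areas.get(s) is not None)
        if qc_hits < min_qc_hits:
            continue
        # Filter 2: no more than four blank values over all samples
        blanks = sum(1 for a in areas.values() if a is None)
        if blanks > max_blanks:
            continue
        kept[entry] = areas
    return kept

# Toy data: two QC samples and six study samples
samples = ["QC1", "QC2", "S1", "S2", "S3", "S4", "S5", "S6"]
table = {
    "A": dict(zip(samples, [10, 11, 9, 12, 10, 11, 9, 10])),            # complete
    "B": dict(zip(samples, [10, None, 9, 12, 10, 11, 9, 10])),          # only one QC hit
    "C": dict(zip(samples, [10, 11, None, None, None, None, None, 8])), # five blanks
    "D": dict(zip(samples, [10, 11, None, 12, 10, 11, 9, 10])),         # one blank, kept
}

kept = filter_entries(table, qc_ids=["QC1", "QC2"])
print(sorted(kept))
```

Entries that survive with a few blanks, such as `D` here, are the ones that go on to reprocessing at a lower match threshold and manual assignment.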
3 Results and discussion
The present study was directed at performing and optimizing non-target data processing for GC × GC–MS. A set of 29 mouse liver samples was analyzed with both GC–MS and GC × GC–MS. The same set of samples was analyzed with both systems, and both systems used identical injectors, injection methods and gas chromatographs. Therefore, the variability in the RSDs of internal standards and QC samples was caused by later stages of the analytical process (i.e. GC temperature ramp, second-dimension GC separation versus no second-dimension GC separation, detection and processing). Ideally, the same detectors should be used to compare one-dimensional and two-dimensional processing. However, in our experience the Chemstation software allows a more precise quantification in one-dimensional processing than can be achieved with the ChromaTOF software, e.g. due to (i) the possibility to set integration parameters for individual peaks, (ii) the ability to use qualifier masses (define ratios between masses that should be fulfilled to assign a target) and (iii) the absence of an automated deconvolution process. Therefore, we used an Agilent GC-quadrupole MS system for this comparative study (and the majority of metabolomics studies in our lab), even though the time-of-flight mass spectrometer is more sensitive.
The fully optimized GC–MS processing method had a targeted approach (see Sect. 2). All 170 quantified metabolites in all samples were automatically integrated, the integration results visually inspected and wrongly integrated peaks were manually corrected. The non-targeted GC × GC–MS data processing method was semi-automated, i.e. the construction of the target table and assignment of missing values required manual interaction; however, the integration of peaks or mistakes in the combination of peaks from the same entry were not corrected to reduce the processing time. The data-processing times and results for GC–MS and GC × GC–MS were compared. Furthermore, the general data quality of the GC × GC–MS analyses was investigated. Finally, the results obtained with the liver samples using GC–MS and GC × GC–MS were analyzed using multivariate statistics (PCA/PCDA) in order to identify time-resolved metabolite patterns. These data may provide biomarkers for the development and progression of insulin resistance/DM2 and insight into the metabolic dysregulations underlying the disease process.
3.1 Comparison of time required for processing of GC–MS and GC × GC–MS data
Workflow for optimizing and carrying out GC × GC–MS data processing
Columns: Analyst time (h); Computer time (h)
1. Optimize processing method (peak width, smoothing, match required to combine 2D peaks from one entry)
2. Processing for construction of target table
3. Construction target table (removing artifacts from, i.e. solvent and reagents)
4. Find targets from GC–MS in GC × GC–MS target table
5. Processing of samples using constructed target table
6. Copy data to spreadsheet
7. Removing entries with too many blanks
8. Assigning peaks of remaining blank values
Total time required
3.2 Comparing data processing results of one-dimensional GC–MS with two-dimensional GC × GC–MS
3.2.1 Number of entries
Table 2 Number of entries in a GC–MS and GC × GC–MS chromatogram of a pooled mouse liver sample (columns: GC–MS; GC × GC–MS)
For the construction of the GC × GC–MS target table an S/N cut-off of 200 was chosen. This S/N is calculated for the unique mass (RIC) determined by the software, rather than for the total ion current as used in Table 2. Therefore, the number of entries at this cut-off value was higher than in Table 2, i.e. 1034 entries were found with an S/N >200. Note that oximation can produce two peaks for a single metabolite, so the actual number of metabolites detected is lower than the number of entries found.
3.2.2 RSDs of internal standards
Comparing the RSDs of normalized MS response for the internal standards for GC–MS and GC × GC–MS in all samples (QC and study samples)
Columns: RSD of MS response (%) for GC–MS and GC × GC–MS
3.2.3 RSDs in pooled QC samples
A set of pooled mouse-liver samples were used as QC samples. These samples were injected at the beginning and at the end of the sequence and between every six samples. In total five QC samples were measured over the course of the study. The RSDs of the MS response of target compounds that were found with both 1D and 2D GC–(×GC)–MS (total 107 targets) were compared. The RSDs for all compared metabolites are shown in Table S1 in the supplement.
For six metabolites, better RSDs (differences in RSD more than 10%) were obtained with the GC × GC–MS method. For 37 metabolites poorer RSDs (differences in RSD more than 10%) were obtained in the GC × GC–MS processing. When compared visually, the chromatographic performance of GC × GC–MS was comparable to, or even better than, that of GC–MS; this agrees with the comparable or even better RSDs for the manually corrected integration results of the internal standards (see above). The poorer RSD values for the not manually corrected peaks with GC × GC–MS were therefore caused by errors in the data processing. Seven of these compounds were overloaded (S-Table 1), which resulted in split peaks in the second dimension. Obviously these peaks will not be integrated correctly in an automated fashion, in either GC–MS or GC × GC–MS. In the GC–MS processing method overloaded peaks were manually integrated and therefore better RSDs were obtained.
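The comparison used here can be made concrete with a short sketch. It computes the RSD of replicate QC responses and sorts a metabolite into "better", "poorer" or "comparable" for GC × GC–MS relative to GC–MS. The function names and the replicate values are hypothetical, and "differences in RSD more than 10%" is interpreted as an absolute difference of more than 10 percentage points, which is an assumption.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)

def classify(rsd_gcms, rsd_gcxgc, threshold=10.0):
    """Classify GC x GC-MS precision relative to GC-MS.

    Assumption: the threshold is an absolute difference in percentage points.
    """
    diff = rsd_gcxgc - rsd_gcms
    if diff > threshold:
        return "poorer"
    if diff < -threshold:
        return "better"
    return "comparable"

# Hypothetical normalized responses of one metabolite in five QC injections
qc_gcms = [100, 110, 90, 105, 95]
qc_gcxgc = [100, 140, 70, 120, 80]

rsd_1d = rsd_percent(qc_gcms)
rsd_2d = rsd_percent(qc_gcxgc)
print(f"GC-MS RSD: {rsd_1d:.1f}%  GC x GC-MS RSD: {rsd_2d:.1f}%  ->  {classify(rsd_1d, rsd_2d)}")
```

With only five QC injections each RSD estimate is itself uncertain, so a fixed margin such as 10 points is a pragmatic screen rather than a statistical test.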
For most other peaks the higher RSDs resulted from errors in the combining of 2D peaks belonging to the same metabolite. For peaks to be combined, the match between the mass spectra of different modulation cycles should meet the required match factor as set in the software. Decreasing the match required to combine, however, would risk combining peaks that originate from different metabolites, especially because masses 73 ((CH3)3Si) and 147 ((CH3)3SiOSi(CH3)2) are dominant masses in the mass spectra of silylated compounds. Furthermore, in most cases the problems with combining peaks were due to deconvolution faults, and decreasing the match factor would not be an option in these cases. For nine compounds isomeric interference from a closely eluting peak was the cause of the combination error (S-Table 1). Due to the nature of the derivatization, two distinct compounds are formed for, for example, sugars and sugar-phosphates (cis- and trans-oxime forms). These two forms of one sugar elute close together in the first dimension and possess identical mass spectra. In these cases the chance of wrong assignment of the identity or mistakes in the combination of second-dimension peaks is high. Another obstacle that impaired the quantification of seven metabolites was the assignment of the unique mass in the mass spectrum by the ChromaTOF software. For these compounds the non-selective masses m/z 73 or m/z 147 were assigned as unique masses, while these masses are present in all mass spectra of silylated compounds. Due to interferences of (partly) coeluting compounds the integration of these metabolites was inaccurate. In principle, the masses used for quantification of these metabolites can be manually adjusted in the calibration table, and these mistakes can probably be avoided by selecting a more selective mass instead of m/z 73 or m/z 147. However, this requires extra time to check all automatically chosen quantification masses in the calibration table.
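To illustrate why a low match threshold is risky for silylated compounds, the sketch below scores spectral similarity as a cosine similarity scaled to 0–1000 and merges sub-peaks from consecutive modulation cycles when the score reaches the threshold. Both the scoring function and the merging rule are stand-ins chosen for illustration; the algorithm actually used by ChromaTOF is not described in this paper. The spectra are invented.

```python
import math

def match_factor(spec_a, spec_b):
    """Cosine similarity of two spectra (dict m/z -> intensity), scaled to 0-1000."""
    mzs = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(m, 0.0) * spec_b.get(m, 0.0) for m in mzs)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return 1000.0 * dot / (norm_a * norm_b)

def combine_subpeaks(subpeaks, min_match=500):
    """Merge sub-peaks from consecutive modulations with matching spectra.

    subpeaks: list of (modulation_index, spectrum, area)
    Returns the summed area of each merged group.
    """
    groups = []
    for mod, spec, area in sorted(subpeaks, key=lambda p: p[0]):
        if groups:
            last = groups[-1]
            consecutive = mod - last["last_mod"] == 1
            if consecutive and match_factor(spec, last["spectrum"]) >= min_match:
                last["area"] += area
                last["last_mod"] = mod
                last["spectrum"] = spec
                continue
        groups.append({"last_mod": mod, "spectrum": spec, "area": area})
    return [g["area"] for g in groups]

# Invented spectra: a and b are different compounds dominated by m/z 73 and 147
a = {73: 100, 147: 60, 217: 10}
b = {73: 100, 147: 60, 204: 10}
c = {73: 20, 86: 50, 174: 100}

print(round(match_factor(a, b)))  # very high despite different diagnostic ions
print(round(match_factor(a, c)))  # low: clearly different spectrum
print(combine_subpeaks([(1, a, 10.0), (2, a, 30.0), (3, a, 12.0), (4, c, 5.0)]))
```

In this toy case the two spectra that differ only in their diagnostic ions score well above any practical threshold, so a similarity measure of this kind cannot separate them without additional information such as the second-dimension retention time.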
In summary, 70 metabolites were quantified correctly and 37 metabolites were quantified less accurately with the semi-automated GC × GC–MS data processing method compared to the semi-automated GC–MS processing. For seven of the less accurately quantified peaks, the poorer quantification could be attributed not to the ChromaTOF software but to overloading effects. Thus, for 70% of the evaluated metabolites the semi-automated GC × GC–MS data processing method worked as well as the manually corrected GC–MS reference method.
3.2.4 Summary on GC × GC–MS data quality
The aim of the GC(×GC)–MS study was to investigate the influence of a high fat diet on the metabolite profiles in the liver. A total of 24 mouse liver samples, i.e. t = 0 weeks (n = 8), t = 6 weeks (n = 8) and t = 12 weeks (n = 8) after the start of the high fat diet, were measured with GC × GC–MS and GC–MS. Development of insulin resistance was monitored in individual mice over time by performing glucose tolerance tests and measuring specific biomarkers in plasma, and hyperinsulinemic-euglycemic clamp analysis to assess insulin resistance in a tissue-specific manner as described by Kleemann et al. (2010). After 6 weeks the mice had developed insulin resistance in the liver and after 12 weeks also in skeletal muscle and fat tissue (white adipose tissue).
Mahalanobis distances in the overlap data after PCA analysis of GC–MS and GC × GC–MS
Columns: Mahalanobis distance between groups for GC–MS and GC × GC–MS
Rows: t = 0 and t = 6 weeks; t = 0 and t = 12 weeks; t = 6 and t = 12 weeks
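For readers unfamiliar with the measure, the sketch below computes a Mahalanobis distance between two groups from their scores on two principal components, using the pooled within-group covariance. The exact definition and number of components used in the study are not given in this section, so this is an assumed, simplified variant, and the score values are toy data.

```python
import math

def mean_vec(points):
    """Mean of a list of (pc1, pc2) score pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance (sxx, sxy, syy) of two groups of 2D scores."""
    sxx = sxy = syy = 0.0
    for group in (group_a, group_b):
        mx, my = mean_vec(group)
        for x, y in group:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(group_a) + len(group_b) - 2
    return sxx / dof, sxy / dof, syy / dof

def mahalanobis_between_groups(group_a, group_b):
    """Distance between group means, scaled by the pooled covariance."""
    (ax, ay), (bx, by) = mean_vec(group_a), mean_vec(group_b)
    dx, dy = ax - bx, ay - by
    sxx, sxy, syy = pooled_covariance(group_a, group_b)
    det = sxx * syy - sxy * sxy
    # d^T S^-1 d written out for a 2 x 2 covariance matrix
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Toy PCA scores (PC1, PC2) for two time-point groups
t0 = [(0, 0), (2, 0), (0, 2), (2, 2)]
t6 = [(3, 4), (5, 4), (3, 6), (5, 6)]

distance = mahalanobis_between_groups(t0, t6)
print(f"Mahalanobis distance t0 vs t6: {distance:.2f}")
```

A larger distance means the group means lie further apart relative to the spread within the groups, which is how the separation of the time points in the two data sets can be compared.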
Top 20 metabolites with highest loading in LD1 in PCDA analysis
Column: GC × GC–MS
M0617 (unsaturated fatty acid methyl ester)
M0221 (amino-organic acid)
C20:1 fatty acid
M0600 (polyunsaturated fatty acid)
M0307 (deoxyglucose or isomer)
Finally, the relative responses of tyrosine, spermidine (Fig. 7, compounds E and F) and beta-alanine were lower in the groups of mice sacrificed at t = 6 weeks and t = 12 weeks, while the level of taurine was significantly increased. The levels of amino acids are known to fluctuate during the development of insulin resistance (Huffman et al. 2009; Wijekoon et al. 2004). Interestingly, taurine was suggested to have beneficial effects by its ability to reduce intracellular oxidative stress generation and glycooxidation (Anuradha 2009), while this is the only metabolite in the GC × GC–MS PCDA top 20 that was significantly elevated in the animals after 6 and 12 weeks of high-fat diet. Furthermore, it is believed that certain amino acids play an important role in the development of diabetes and that dietary treatment with amino acids could prevent diabetes and diabetic complications (Anuradha 2009).
In conclusion, the added value of GC × GC–MS compared to GC–MS is clearly illustrated in this pre-clinical study. Although the RSDs of compounds in the QC samples for GC × GC–MS were somewhat higher than in the GC–MS data, the biological information in the data was preserved. In addition, many more candidate biomarkers were detected that were significant in explaining the differences between the different sample groups in this study. Furthermore, the higher peak capacity resulted in cleaner mass spectra, facilitating the identification of possible biomarkers. Moreover, the position of the metabolite in the chromatogram (especially the 2rT) can also aid in the identification process (data not shown).
3.4 Concluding remarks
The feasibility of semi-automated non-target processing of GC × GC–MS data using commercially available software was assessed. A set of mouse liver samples was measured with GC–MS and GC × GC–MS and a total of 170 and 691 peaks, respectively, were quantified. The performance of the two methods was compared by evaluating the RSD values in the quality-control samples of metabolites present in both datasets. Although the chromatographic performance was comparable or even slightly better for GC × GC–MS, as demonstrated for the manually integrated labelled internal standards, somewhat poorer RSDs were obtained for the relative responses of peaks determined in a semi-automated manner, due to less accurate processing. Still, a reliable and repeatable quantification was obtained for approximately 70% of the peaks, even though the integrations of the peaks from the GC × GC–MS data, in contrast with the GC–MS data, were not manually corrected.
In addition, GC × GC–MS processing is time-consuming, the major bottleneck being the speed of the software tools and algorithms. However, application of the strategy described in this paper is feasible for small studies with a maximum of about 30–50 samples (possibly measured in duplicate). For the routine application of GC × GC–MS in metabolomics in larger studies, further improvement of data processing tools is required.
The mouse-liver samples were measured to study the development and progression of insulin resistance. The added value of GC × GC–MS was clearly illustrated, (i) over four times more peaks could be quantified, (ii) the biological information as acquired in GC–MS was preserved, (iii) several extra candidate biomarkers for the development of insulin resistance were found, and (iv) the superior peak capacity resulted in cleaner mass spectra, facilitating in principle the putative identification of metabolites.
This study was supported by the research program of TNO Personalized Health.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.