Main

Gastric adenocarcinoma (GC) is the fifth most common cancer worldwide and the third most deadly. Approximately one million people are diagnosed worldwide each year, and the mortality rate is 70% (Worldwide Cancer Research Fund, 2012; Cancer Research UK, 2014). GC is often diagnosed late because its non-specific symptoms, such as dyspepsia, resemble those of benign (BN) conditions such as gastritis. Nevertheless, cancers identified early have a moderate chance of cure: the 5-year survival rate is 71% for Stage IA tumours and 57% for Stage IB tumours (American Cancer Society, 2015). This highlights the importance of appropriate screening in higher-risk populations.

Metabolomics is the study of low-molecular-weight chemicals (<1500 Da) in a biological system. It is the most downstream of the ‘omics’ sciences (Genomics, Transcriptomics, Proteomics, etc.), and is thus considered closest to an organism’s phenotype (Dunn et al, 2011). Previous studies show that GC cells preferentially convert glucose into lactate even in the presence of sufficient oxygen (Warburg effect) (Hirayama et al, 2009; Cai et al, 2010; Hu et al, 2011; Aa et al, 2012). Citrate is one metabolite linked to apoptotic pathways in GC (Lu et al, 2011). Certain nucleic acid metabolites (hypoxanthine, uridine, guanosine) are elevated in GC, indicating active replication (Hirayama et al, 2009; Hu et al, 2011; Yu et al, 2011).

Identification of a distinct urinary metabolomic profile for GC could offer a non-invasive, cost-effective, efficient, and reasonably accurate diagnostic modality. The study described herein provides a preliminary investigation of the ability of hydrogen-1 nuclear magnetic resonance (1H-NMR) spectroscopy to discriminate between urine samples collected from GC, healthy (HE), and benign gastric disease (BN) patients.

Materials and methods

Patient selection

Midstream urine samples were collected from 43 GC, 40 BN, and 40 HE patients from January 2009 to December 2014 from three hospitals in Edmonton, Canada. GC samples were collected prior to chemoradiotherapy and surgery. All patients provided written informed consent. Ethics approval was obtained from the Health Research Ethics Board at the University of Alberta.

Inclusion criteria for cancer patients were: biopsy-confirmed diagnosis of GC, age ⩾18 years, and no metastases on their staging computed tomography scans. BN patients had to have gastrointestinal symptoms (such as haematemesis or epigastric discomfort) and endoscopic evidence, obtained within the 6 months before consent, that the symptoms were not due to a malignant cause. BN patients had the following conditions: gastritis, gastro-oesophageal reflux disease (GORD), portal hypertensive gastropathy, varices, ulcers, and polyps. HE controls had no declared history of cancer and no gastrointestinal symptoms. Groups were matched on age, gender, and BMI.

Exclusion criteria included: breastfeeding, pregnancy, significant cardiac disease with New York Heart Association ⩾Class II, systemic infection, prior cancer, and glomerular filtration rate <30 ml min−1.

Sample collection and NMR spectroscopy

Within 2 h of collection, 1 ml aliquots of urine mixed with 50 μl of 0.42% sodium azide preservative were prepared and biobanked at −80 °C. All one-dimensional (1D) 1H-NMR spectra were acquired at Canada’s National High Field Nuclear Magnetic Resonance Centre using a 600-MHz Varian Inova spectrometer (Agilent Inc., Palo Alto, CA, USA). Sample preparation and NMR analysis followed the standard protocols outlined in the Supplementary File.

Data modelling and statistical analysis

Following standard data cleaning protocols, 77 metabolite concentrations were reproducibly detected by the NMR platform. For each metabolite, pairwise comparisons of GC vs HE and BN vs HE were tested using the non-parametric Mann–Whitney U-test. Correction for multiple comparisons was performed using the Benjamini and Hochberg method (Benjamini, 1995).
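The univariate workflow described above can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the Mann–Whitney p-value uses the large-sample normal approximation without tie or continuity correction (an assumption; the study does not state which variant was used), and the example values are invented.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (U for group a, approximate p-value).
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_a, p

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        q[idx] = running_min
    return q

# Invented concentrations for one metabolite (not study data).
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice, one p-value per metabolite would be computed for each comparison (GC vs HE, BN vs HE), and the 77 p-values from each comparison would be adjusted together.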

Exploratory multivariate statistical analysis in the form of Partial Least Squares discriminant analysis (PLS-DA) and Orthogonal Partial Least Squares discriminant analysis (O-PLS-DA) was used to uncover any latent correlated structure in the data (Eriksson et al, 2013). Logistic regression optimised by LASSO regularisation (LASSO-LR) was then performed to derive a parsimonious discriminant GC vs HE biomarker model. Statistical analyses were performed using SIMCA (version 13, Umetrics, Umea, Sweden), the Matlab scripting language (MathWorks Inc., Natick, MA, USA), and STATA Version 13 (StataCorp LP, College Station, TX, USA).
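To illustrate how LASSO regularisation yields a parsimonious model, the sketch below fits an L1-penalised logistic regression by proximal gradient descent. This is an assumption made for illustration: the study does not state which solver or penalty weight was used, and the function names, the toy data, and the settings `lam`, `lr` and `n_iter` are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.1, lr=0.1, n_iter=2000):
    """Minimise mean log-loss + lam * sum(|w_j|); the intercept is unpenalised."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) for every sample.
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n
                  for j in range(d)]
        b -= lr * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data (not study data): feature 0 separates the classes,
# feature 1 is balanced noise that carries no class information.
X = [[1, 1], [1, -1], [2, 1], [2, -1],
     [-1, 1], [-1, -1], [-2, 1], [-2, -1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
weights, intercept = fit_lasso_logistic(X, y)
```

On this toy input the penalty drives the weight of the uninformative feature to exactly zero while keeping the informative one, which is the property that allows a small panel to be selected from many candidate metabolites.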

Results

Patient characteristics

Baseline patient and tumour characteristics are listed in Table 1. To compare univariate statistical results from the two arms of this study (GC vs HE and BN vs HE), a biplot of log median fold change for metabolites significant in either comparison was constructed (Figure 1). P-values, q-values, median concentrations, and median-fold differences for each pairwise comparison are reported in Supplementary Table S1. A detailed discussion of the PLS-DA and O-PLS-DA models is provided in Supplementary Files. These results reflect those of the univariate statistics. Of particular interest were nine metabolites that had high VIP scores in the GC vs HE O-PLS-DA model but low VIP scores in the BN vs HE O-PLS-DA model (Supplementary Table S2): sucrose, dimethylamine, 1-methylnicotinamide, 2-furoylglycine, N-acetylserotonin, trans-aconitate, alanine, formate, and serotonin.

Table 1 Baseline characteristics of the study subjects and tumour
Figure 1

Biplot of log2 median fold change for metabolites in GC vs HE and BN vs HE models. Blue circles represent metabolites significantly changed in both models; red squares, significantly changed in GC vs HE only; green triangles, significantly changed in BN vs HE only.
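The coordinates plotted in such a biplot can be sketched as follows. The sketch assumes that the fold change is the ratio of group medians (the text does not give the exact formula), that GC vs HE and BN vs HE form the two axes, and uses invented concentrations; the function name is hypothetical.

```python
import math
from statistics import median

def log2_median_fold_change(group, reference):
    """log2 of (median of group / median of reference group)."""
    return math.log2(median(group) / median(reference))

# Invented concentrations for one metabolite (arbitrary units, not study data).
gc = [8.0, 12.0, 16.0]
bn = [4.0, 6.0, 5.0]
he = [3.0, 4.0, 5.0]

gc_vs_he = log2_median_fold_change(gc, he)  # one biplot coordinate
bn_vs_he = log2_median_fold_change(bn, he)  # the other biplot coordinate
```

A metabolite lying near the diagonal changes by a similar amount in both comparisons, whereas one far from zero on only the GC vs HE coordinate changes mainly in GC.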

LASSO-LR produced an optimal GC vs HE model using just three metabolites: 2-hydroxyisobutyrate (2-HIB), 3-indoxylsulfate (3-IS), and alanine (A). This resulted in the following diagnostic regression model:

The corresponding ROC curve had an AUC of 0.95 (95% CI: 0.86–0.99) (Figure 2A). For a fixed specificity of 80%, the corresponding sensitivity for predicting GC was 95% (95% CI: 0.85–1.00). At this operating point, an individual with a predicted score P>0.3 would be classified as ‘GC’; otherwise, as ‘not GC’. Figure 2B shows a frequency histogram of the three disease classifications grouped by LASSO-LR model score. BN samples are split into two broad distributions: half of the BN patients were classified with GC, and the other half with HE.
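The decision rule can be sketched in code. The coefficients below are hypothetical placeholders chosen only so that the example runs; they are not the fitted values from the study, and their signs carry no biological meaning. The sketch also assumes a standard logistic link applied to untransformed concentrations, and the function names are illustrative.

```python
import math

# HYPOTHETICAL coefficients for illustration only (not the study's values).
COEF = {"intercept": -1.0, "2-HIB": 0.5, "3-IS": -0.2, "alanine": 0.3}
METABOLITES = ("2-HIB", "3-IS", "alanine")
CUTOFF = 0.3  # score threshold stated in the text

def predicted_score(conc):
    """Logistic model score P in (0, 1) from metabolite concentrations."""
    z = COEF["intercept"] + sum(COEF[m] * conc[m] for m in METABOLITES)
    return 1.0 / (1.0 + math.exp(-z))

def classify(p, cutoff=CUTOFF):
    """'GC' when the score exceeds the cutoff, otherwise 'not GC'."""
    return "GC" if p > cutoff else "not GC"

# Invented concentrations for two individuals (not study data).
low = predicted_score({"2-HIB": 0.0, "3-IS": 0.0, "alanine": 0.0})
high = predicted_score({"2-HIB": 2.0, "3-IS": 0.0, "alanine": 2.0})
```

With these placeholder values, `classify(low)` returns 'not GC' and `classify(high)` returns 'GC'; a score exactly at the cutoff is treated as 'not GC'.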

Figure 2

Three-metabolite logistic regression model. (A) Receiver Operating Characteristic (ROC) curve for GC vs HE comparison based on three-metabolite model. Area under the curve (AUC) is 0.95 (95% CI=0.86–0.99). For a fixed specificity of 80%, the sensitivity is 95% (95% CI=0.85–1.00). (B) Frequency histogram for logistic regression model scores. Yellow bars represent HE patients; red, BN patients; and black, GC patients. The number (frequency) of patients with each score is depicted by the height of the bars. Scores close to 1 indicate a high probability of GC; scores close to 0 indicate a high probability of HE. The cutoff boundary is a score of 0.3: above 0.3, classified as GC; otherwise, not GC.
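The summary statistics quoted for the ROC analysis can be computed from model scores as sketched below. The scores are invented, the function names are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen GC score exceeds a randomly chosen HE score), which may differ from the software routine used in the study.

```python
def auc(pos_scores, neg_scores):
    """AUC as P(positive score > negative score), counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sensitivity_specificity(pos_scores, neg_scores, cutoff):
    """Sensitivity and specificity when scores above the cutoff are positive."""
    true_pos = sum(1 for s in pos_scores if s > cutoff)
    true_neg = sum(1 for s in neg_scores if s <= cutoff)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)

# Invented model scores (not study data).
gc_scores = [0.9, 0.8, 0.6, 0.35, 0.2]
he_scores = [0.05, 0.1, 0.15, 0.25, 0.4]

area = auc(gc_scores, he_scores)
sens, spec = sensitivity_specificity(gc_scores, he_scores, cutoff=0.3)
```

Sweeping the cutoff from 0 to 1 and recording each sensitivity and specificity pair traces out the ROC curve; fixing the specificity then selects a single operating point on it.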

Discussion

GC is a highly morbid and fatal disease. Diagnosis of GC is often delayed. The present study used 1D 1H-NMR spectroscopy to characterise a urinary metabolic profile of GC that is distinct from HE and a subpopulation of BN patients.

Alanine, an endogenous amino acid, makes up 5–7% of skeletal muscle (Felig et al, 1978). During fasting, muscle protein is catabolised to release alanine for liver gluconeogenesis. Similar to previous studies (Hirayama et al, 2009; Chen et al, 2010), alanine concentration increased from HE to GC. Elevated alanine levels in GC patients’ urine compared with HE suggest that alanine may be a biomarker of muscle wasting but not necessarily a specific biomarker of the disease itself.

In rats with chemically induced gastric lesions (ulcers, erosions), treatment with 1-methylnicotinamide inhibited gastric acid secretion and increased mucosal blood flow and healing (Brzozowski et al, 2008). Diminished levels of 1-methylnicotinamide in both BN and GC groups suggest loss of this mucosal protective mechanism. Where mucosa is ulcerated or eroded, sucrose can penetrate more easily into the bloodstream and be excreted into the urine (Sutherland et al, 1994). Our study shows significant sucrose elevations in both BN and GC groups compared with HE, which may reflect the increased permeability of damaged mucosa in GC and BN patients.

Creatinine, a waste product of muscle metabolism, is excreted by the kidneys (Eisner et al, 2011). The amount of creatinine in urine is related to muscle mass (Swaminathan et al, 2000). Cachectic patients have lower total body skeletal muscle mass and therefore lower levels of urinary creatinine. This phenomenon was consistent with our results.

Citrate is an intermediate of the Krebs cycle. An in vitro experiment showed that citrate induced apoptosis in two GC cell lines in a dose-dependent manner (Lu et al, 2011). In our study, citrate was decreased in GC patients, suggesting an ability of GC to escape regular programmed cell death.

The distinction between BN and either GC or HE was less clear using the multiclass PLS model (Supplementary Figure S2). BN conditions that clustered more frequently with GC include ulcers, GORD, and gastritis. These observations fit with Correa’s hypothesis (Correa, 1988), which delineates a preneoplastic cascade from healthy mucosa to chronic atrophic gastritis and eventually to cancer. Patients with chronic gastritis are farther along the preneoplastic cascade than early gastritis patients, so their phenotypes and metabolomic signatures are more likely to resemble GC than HE.

This observational study has limitations. We enrolled a pragmatic sample size of roughly 40 patients in each group. A small sample size limits the power to detect a difference, and conversely, differences detected may be spurious. This experiment matched patients on three common confounders (age, sex, and BMI), but as it is an observational design, only known confounders can be controlled. Potential confounders that were not controlled in this experiment include medications, smoking, and Helicobacter pylori status.

This study shows clinical potential for metabolic profiling, although numerous steps are required to move this test into the clinic. Should the three-metabolite model be successfully validated, a point-of-care diagnostic, such as a simple dipstick or laboratory assay, could be developed. Alternatively, if the complete metabolite profile is required, the assay could be performed at a centralised laboratory, with samples collected and processed in the periphery.