Introduction

Previous research suggests that there is a relation between diet and survival after colorectal cancer diagnosis [1]. However, the underlying mechanisms are largely unknown. The identification of metabolites associated with diet in colorectal cancer patients could be a first step in unravelling the link between dietary exposures and colorectal cancer progression [2,3,4] and understanding the biological processes involved [5, 6]. Metabolomics measures a range of small molecules, many of which belong to a number of different biochemical pathways. Such metabolic profiles can provide a snapshot of the current metabolic state of the body, characteristic of a phenotype [7] and are, therefore, increasingly used to study the interface of diet, lifestyle and diseases [8,9,10].

Thus far, research has shown that dietary exposures can be associated with metabolite concentrations in blood. Predefined diet quality indices [11] as well as distinct dietary patterns such as veganism [12, 13] were reported to be associated with specific blood metabolites. Three diet quality indices were associated with metabolites, including mainly lipids and amino acids [11]. Another study reported that a vegan diet was associated with lower concentrations of glycerophospholipids, sphingolipids and amino acids compared to a diet containing meat and/or fish [12].

In terms of understanding the complex relationship between diet and metabolites, investigating dietary patterns and indices, instead of single nutrients or food groups, is of specific interest since nutrients and food groups may interact with each other [14]. Diet quality indices and dietary patterns are used to assess exposure to combinations of food groups. An example of a diet quality index, commonly used to investigate the relationship with health outcomes after cancer diagnosis [15,16,17], is the World Cancer Research Fund (WCRF) score that is based on the 2018 cancer prevention recommendations of the WCRF/American Institute for Cancer Research (AICR) [18]. The Dutch Healthy Diet index (DHD15-index) is used to assess adherence to the Dutch dietary guidelines of 2015 [19] and has also previously been associated with various health outcomes [20, 21]. In contrast to these predefined indices, dietary patterns are derived from the data and are also commonly evaluated against health outcomes after cancer diagnosis [22].

It is well established that disease can have a great impact on metabolism [23,24,25], yet, to the best of our knowledge, no research has been conducted investigating the associations between dietary exposures and metabolites in cancer patients. Metabolites associated with dietary exposures in colorectal cancer patients may give clues to potential underlying mechanisms for colorectal cancer progression, which could be studied in detail in the future. Therefore, the aim of this explorative study was to investigate whether diet, evaluated using diet quality indices and dietary patterns, is associated with plasma metabolites in colorectal cancer patients.

Methods

Study population

In total, 200 stage I-IV colorectal cancer patients with available plasma metabolite concentrations from the COLON study [26], a prospective cohort study among colorectal cancer patients in the Netherlands, were considered for the present study. The design and recruitment of the COLON study have been described earlier [26]. Participants were recruited from 11 hospitals in the Netherlands, shortly after colorectal cancer diagnosis. Females and males of all ages and of any stage of the disease were eligible. Non-Dutch speaking patients and patients with a history of colorectal cancer or (partial) bowel resection, chronic inflammatory bowel disease, hereditary colorectal cancer syndromes (e.g. Lynch syndrome, Familial Adenomatous Polyposis, Peutz-Jeghers syndrome), dementia or another mental condition causing an inability to fill out a questionnaire correctly were excluded. All participants provided written informed consent. The COLON study was approved by the Committee on Research involving Human Subjects (region Arnhem-Nijmegen), the Netherlands.

Participants with missing dietary intake data (n = 2) or with a missing cancer stage (n = 3) were excluded from the current study, resulting in a final study population of n = 195 stage I-IV colorectal cancer patients for analysis.

Data collection

Habitual dietary intake in the month prior to diagnosis was assessed using a 204-item validated, semi-quantitative food frequency questionnaire (FFQ) developed by the Division of Human Nutrition and Health of Wageningen University & Research, the Netherlands [27, 28]. The FFQ was used to calculate a priori defined diet quality indices and to construct a posteriori data-driven dietary patterns. Demographic and lifestyle characteristics such as sex, age, weight, height, and smoking habits were assessed using self-administered questionnaires. All questionnaires were filled out prior to tumor resection. Medical information, including cancer stage, tumor location, and treatment strategies, was collected using the Dutch ColoRectal Audit [29].

Non-fasted EDTA plasma samples were collected upon recruitment, which was intended to take place before the start of treatment, and stored at −80 °C using a standardized protocol to ensure identical sample handling across the 11 hospitals.

Diet quality indices

Two diet quality indices were included in the current study, namely the WCRF dietary score and the DHD15-index. Briefly, the WCRF dietary score is based on the 2018 WCRF/AICR recommendations for cancer prevention using the standard WCRF/AICR score developed by Shams-White et al. [18]. Since the current study focuses on dietary intake, the recommendations regarding weight, physical activity, supplement use, and breastfeeding were not included. The remaining recommendations were: (1) eat a diet rich in whole grains, vegetables, fruits, and beans, (2) limit consumption of ‘fast foods’ and other processed foods high in fat, starches or sugar, (3) limit consumption of red and processed meat, (4) limit consumption of sugar-sweetened drinks, and (5) reduce alcohol consumption. Quantitative criteria were used as cut-off points for all recommendations except recommendation (2), for which cut-offs were based on tertiles of the percentage of total energy intake from processed foods. Processed foods included French fries, crisps, pastry and biscuits, savory snacks, sugar and candy, sauces, pizza, pancake, sandwich fillings high in sugar or fat, refined grain products, and sweet dairy desserts. Processed meat included sausages, bacon, ribs, ham, cold cuts, and unknown types of meat. Sugary drinks included sugar-sweetened soft drinks, sugar-sweetened dairy drinks, and fruit juices. We did not include yoghurt and cheese, nuts, oils and fats, and diet soft drinks in the WCRF dietary score, since these foods or food groups are not part of the WCRF recommendations.

The score assigned for each recommendation of the WCRF dietary score was 1 when the recommendation was met (full adherence), a score of 0.5 was assigned to moderate adherence and a score of 0 was assigned to low adherence. The recommendation regarding a diet rich in whole grains, vegetables, fruit, and beans, included sub-recommendations for fiber intake and for fruit and vegetable consumption. As a result, the recommendation score was the sum of sub-recommendation scores of fiber intake and fruit and vegetables intake, meaning that plausible scores were 0, 0.25, 0.5, 0.75, and 1. The overall score of the WCRF dietary score ranged from 0 to 5.
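The scoring rule described above can be sketched in a few lines. This is a minimal illustration rather than the scoring code used in the study: the function name, argument names and the "full"/"moderate"/"low" labels are hypothetical, and the sketch assumes each participant has already been classified per recommendation using cut-offs that are not reproduced here.

```python
# Points per recommendation, as described in the text.
ADHERENCE_POINTS = {"full": 1.0, "moderate": 0.5, "low": 0.0}


def wcrf_dietary_score(fiber, fruit_veg, fast_foods, red_processed_meat,
                       sugary_drinks, alcohol):
    """Sum the five recommendation scores (range 0 to 5).

    Each argument is an adherence level: "full", "moderate" or "low".
    The plant-foods recommendation has two sub-recommendations (fiber;
    fruit and vegetables), each contributing half of its points, so that
    recommendation can take the values 0, 0.25, 0.5, 0.75 and 1.
    """
    plant_foods = 0.5 * ADHERENCE_POINTS[fiber] + 0.5 * ADHERENCE_POINTS[fruit_veg]
    other = sum(
        ADHERENCE_POINTS[level]
        for level in (fast_foods, red_processed_meat, sugary_drinks, alcohol)
    )
    return plant_foods + other
```

For example, a participant with full adherence for fiber, moderate adherence for fruit and vegetables, low adherence for fast foods, moderate adherence for red and processed meat, full adherence for sugary drinks and low adherence for alcohol would score 0.75 + 0 + 0.5 + 1 + 0 = 2.25.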

The DHD15-index [19] was developed on the basis of the 2015 Dutch dietary guidelines [30] and refers to 15 recommendations. In the current study, the recommendations regarding coffee consumption and sodium intake were excluded since the type of coffee and sodium intake were not assessed in the COLON study. The DHD15-index used in the current study included the intake of sugary drinks, liquid fats and oils, processed meat, red meat, nuts, dairy products, refined grain products, whole grain products, vegetables, alcohol, legumes, solid fat, fruit, fish, and tea. The scores for each individual recommendation ranged from 0 to 10 points, with a maximum total DHD15-index score of 130 points.

For both indices, a higher score represents a healthier diet, i.e. a better compliance with the recommendations of the corresponding diet quality index. Details on the used diet quality indices have been described before [18, 19, 31].

Empirical construction of dietary patterns

Total intake of food items (g/d) and total energy intake (kcal/d) were calculated based on frequency of intake, number of portions, portion size, and the type of products, as recorded in the FFQ. All food items were categorized into 33 food groups that were constructed according to the Dutch food composition table 2011 [32]. Final food groups are described in Supplementary Table S1. Total intake of food groups was recalculated to relative intake (g/d per 1000 kcal) using total energy intake to simplify comparison of participants.
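As a small worked illustration of the energy adjustment described above, relative intake is the absolute intake divided by total energy intake and scaled to 1000 kcal. The function name is hypothetical and the sketch is not taken from the study's own code.

```python
def relative_intake(grams_per_day, kcal_per_day):
    """Convert absolute intake (g/d) to relative intake (g/d per 1000 kcal)."""
    if kcal_per_day <= 0:
        raise ValueError("total energy intake must be positive")
    return 1000.0 * grams_per_day / kcal_per_day
```

A participant consuming 150 g/d of vegetables at a total energy intake of 2000 kcal/d would have a relative vegetable intake of 75 g/d per 1000 kcal.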

Principal component analysis was used to investigate data-driven dietary patterns among participants [33]. Food group data were log-transformed using the natural logarithm and Z-standardized before performing principal component analysis. As a result, the intake of all food groups has a mean of zero and a variance of one, which is important since the components depend strongly on the variance of each variable [34]. In case a certain food group was not consumed, i.e. 0 g/day per 1000 kcal, 0.001 was added to the food group sum to allow log transformation. The number of dietary patterns was decided based on the components with eigenvalues > 1.0, the scree plot and the interpretability of the components [35]. The retained components were orthogonally rotated for ease of interpretation and were labeled in accordance with the food groups they included. Food group loadings with an absolute value > 0.2, whether positive or negative, were considered when naming the respective dietary pattern. Participants’ scores were determined by multiplying the observed intake of each food group by the corresponding factor loading and summing across food groups [36].
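The preprocessing and scoring steps around the principal component analysis can be sketched as follows. The sketch leaves out the component extraction and rotation themselves and assumes the rotated loadings are already available. The function names, the use of the sample standard deviation and the choice to score on the standardized intakes are assumptions made for illustration, not details confirmed by the text.

```python
import math
import statistics

ZERO_OFFSET = 0.001  # added to zero intakes so that the logarithm is defined


def log_z_standardize(intakes):
    """Natural-log transform one food group's intakes, then Z-standardize."""
    logged = [math.log(x if x > 0 else x + ZERO_OFFSET) for x in intakes]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample SD: an assumption, not stated in the text
    return [(value - mean) / sd for value in logged]


def pattern_scores(standardized, loadings):
    """Score each participant as the loading-weighted sum over food groups.

    standardized: {food_group: [standardized intake per participant]}
    loadings:     {food_group: loading on one rotated component}
    """
    n_participants = len(next(iter(standardized.values())))
    return [
        sum(loadings[group] * standardized[group][i] for group in loadings)
        for i in range(n_participants)
    ]
```

After this transformation each food group has mean zero and unit variance, so no single food group dominates the components simply because it is consumed in larger quantities.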

Three dietary patterns were identified based on the available data, which were defined as a Western, Carnivore, and Prudent dietary pattern, see Fig. 1. The Western dietary pattern was characterized by a high intake of snacks, savory sauces and spreads, refined grains, pizza, high and medium-fat dairy, nuts and seeds, beer, and hard fats, and a low intake of whole grain products and potatoes. The Carnivore pattern was characterized by a high intake of red and processed meat, poultry, fish, eggs, and potatoes, and a low intake of soy and vegetarian products, and medium and high-fat dairy. Lastly, the Prudent pattern consisted of a high intake of vegetables, fruits, fish, nuts and seeds, low-fat dairy, tea, pastry and biscuits, and a low intake of beer.

Fig. 1

Overview of food group loadings of the Western, Carnivore, and Prudent dietary patterns. Green and red bars represent positive and negative loading strengths, respectively. A more positive loading illustrates higher consumption of a specific food group, while a more negative loading characterizes lower consumption of the food group. To improve readability, only food group loadings with an absolute value > 0.2, which were considered to contribute to the dietary pattern, are visualized.

Biomarker analysis

Plasma samples were analyzed in four analytical batches at the International Agency for Research on Cancer (IARC) in Lyon, France. In total, 147 metabolites were measured using the AbsoluteIDQ p180 kit (Biocrates, Innsbruck, Austria). The analytical method [37, 38] characterizes up to 188 metabolites from five compound classes. Amino acids and biogenic amines were quantified (calibration curves, individual isotope-labelled internal standards) by ultra-high performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS). Lipids, including glycerophospholipids and sphingolipids, acylcarnitines, as well as the sum of hexose sugars were semi-quantified (one-point calibration, single representative internal standard) by flow injection analysis-tandem mass spectrometry (FIA-MS/MS).

Metabolites with > 20% missing values (n = 13), including values below the limit of detection (LOD) and truly missing values, were removed from the dataset, while remaining missing values were imputed in line with previous studies [38,39,40]. Briefly, values below the LOD were imputed with half of the batch-specific LOD, and values below or above the quantitative range were replaced by the lower or upper limit of quantification, respectively. Subsequently, to normalize distributions, metabolite concentrations were log-transformed using the natural logarithm and Z-standardized to allow comparison of estimates among metabolites. In total, 134 metabolites were included in the current analysis (Supplementary Table S2), consisting of 12 acylcarnitines (Cx:y), 21 amino acids, 8 biogenic amines, 78 glycerophospholipids (10 lysophosphatidylcholines (lysoPC) and 68 phosphatidylcholines [PC diacyl (aa) and acyl-alkyl (ae)]), 14 sphingolipids (SM Cx:y) and the sum of hexoses. The abbreviation Cx:y is used to describe the total number of carbons and double bonds in the alkyl chains, respectively.
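The exclusion, imputation and transformation steps can be sketched for a single metabolite as below. This is a simplified illustration under stated assumptions: it treats one analytical batch only, represents values below the LOD as None, and leaves out the replacement of values outside the quantitative range. The function name and the use of the sample standard deviation are hypothetical choices.

```python
import math
import statistics


def preprocess_metabolite(values, lod, max_missing=0.20):
    """Return log-transformed, Z-standardized values, or None if excluded.

    values: concentrations of one metabolite in one batch; None marks a
            value below the limit of detection.
    lod:    batch-specific limit of detection.
    """
    n_missing = sum(value is None for value in values)
    if n_missing / len(values) > max_missing:
        return None  # more than 20% missing: metabolite is removed
    imputed = [lod / 2 if value is None else value for value in values]
    logged = [math.log(value) for value in imputed]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)
    return [(value - mean) / sd for value in logged]
```

Because every metabolite ends up on the same standardized scale, regression coefficients can be compared across metabolites that differ widely in absolute concentration.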

Statistical analysis

Clinical, demographic and lifestyle characteristics were described using descriptive analyses. Linear regression models were used to determine the associations between diet quality indices and dietary patterns as independent variables and concentrations of metabolites as dependent variables. Diet quality indices and dietary patterns were analyzed as continuous variables as well as analyzed in tertiles, for which the lowest tertile, corresponding to the lowest intake of the exposure, was used as the reference category. P-trend values were computed for tertiles using the medians of the corresponding tertiles.
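The construction of the trend variable can be illustrated as follows: each participant is assigned to a tertile of the dietary exposure, and the median exposure of that tertile is then entered as a continuous term in the regression model. The function name and the quantile method (the default of Python's `statistics.quantiles`) are assumptions. The text does not specify how the tertile cut points were computed.

```python
import statistics


def tertile_trend_variable(scores):
    """Assign each score to a tertile and return (tertiles, trend values).

    tertiles: 0 (lowest, reference category), 1 or 2 per participant.
    trend:    the median score of the participant's tertile, to be used
              as a continuous regressor when testing for trend.
    """
    cut1, cut2 = statistics.quantiles(scores, n=3)  # two tertile cut points
    tertiles = [0 if s <= cut1 else 1 if s <= cut2 else 2 for s in scores]
    medians = {
        t: statistics.median(s for s, g in zip(scores, tertiles) if g == t)
        for t in set(tertiles)
    }
    return tertiles, [medians[t] for t in tertiles]
```

The p-value of the coefficient for this trend variable in the adjusted linear model is what is referred to as the p-value for trend.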

All models were adjusted for age at diagnosis (continuous), sex, BMI (continuous), smoking status (current smoker/former smoker/never smoker), analytical batch (1–4), and cancer stage (stage I/stage II/stage III/stage IV). Covariates were selected for the final model on the basis of existing evidence, biological plausibility and whether the regression coefficient of interest changed by > 10% after adding the potential covariate.
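The change-in-estimate criterion can be written as a one-line check. The sketch assumes the change is expressed relative to the coefficient from the model without the candidate covariate, which the text does not state explicitly. The function name is hypothetical.

```python
def changes_estimate(beta_without, beta_with, threshold=0.10):
    """True if adding a covariate shifts the coefficient by more than 10%."""
    if beta_without == 0:
        raise ValueError("relative change is undefined for a zero coefficient")
    return abs(beta_with - beta_without) / abs(beta_without) > threshold
```

For instance, a coefficient moving from -0.20 to -0.17 after adding a covariate is a 15% change and would meet the criterion, whereas a move to -0.19 (5%) would not.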

Furthermore, to explore the consistency in the observed associations between dietary exposures and plasma metabolites, we evaluated the top-15 metabolites (based on the smallest p-value for trend over tertiles) associated with diet quality indices and dietary patterns using a heatmap.
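Selecting the top-15 metabolites per exposure and finding those shared between exposures, which is the information summarized in the heatmap, can be sketched as below. The function names and the dictionary layout are hypothetical, and drawing the heatmap itself is left out.

```python
from collections import defaultdict


def top_metabolites(p_trend, k=15):
    """Return the k metabolite names with the smallest p-value for trend."""
    return sorted(p_trend, key=p_trend.get)[:k]


def shared_metabolites(top_by_exposure):
    """Return {metabolite: sorted exposures} for metabolites that appear
    in the top list of at least two dietary exposures."""
    exposures = defaultdict(list)
    for exposure, metabolites in top_by_exposure.items():
        for metabolite in metabolites:
            exposures[metabolite].append(exposure)
    return {m: sorted(e) for m, e in exposures.items() if len(e) >= 2}
```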

Sensitivity analyses were conducted excluding patients from whom blood was collected during or after any type of treatment, i.e. (neo-) adjuvant chemotherapy and/or surgery (n = 19) and excluding stage IV patients (n = 7).

All statistical analyses were performed in R, version 3.4.0, and SAS, version 9.4. After correction for multiple testing using the false discovery rate (FDR) according to the Benjamini-Hochberg procedure [41, 42], a p value (pFDR) < 0.05 was considered statistically significant.
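For readers unfamiliar with the procedure, the Benjamini-Hochberg adjustment can be sketched in plain Python. This illustrates the standard step-up procedure and is not the R or SAS code used in the study. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw p-value is multiplied by the number of tests divided by its rank, so with 134 metabolites per exposure the smallest p-value is multiplied by 134 and the largest is left unchanged. A metabolite is then called statistically significant when its adjusted value is below 0.05.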

Results

Study population characteristics

Characteristics of the overall study population are summarized in Table 1. The mean (SD) age of the 195 colorectal cancer patients was 66 (9) years and almost 60% of the study population was male. Mean (SD) BMI was 25.6 (4.9) kg/m², around ten percent were current smokers, and participants had stage I (27%), stage II (33%), stage III (36%) or stage IV (4%) cancer. Distal colon cancer was diagnosed in 36% of the study population, and proximal colon cancer and rectal cancer in 29% and 35%, respectively. The study population had a mean (SD) total energy intake of 1856 (559) kcal/day. The mean (SD) WCRF dietary score and DHD15-index were 2.1 (0.7) and 73.8 (14.1), respectively. Diet quality indices and dietary patterns did not differ in a consistent direction across colorectal cancer stages (data not shown). Baseline characteristics of the participants in the lowest and highest tertile of each diet quality index and dietary pattern are shown in Supplementary Table S3.

Table 1 Baseline characteristics of the overall study population

Diet quality indices

Table 2 presents the top-15 metabolites for the analyses between diet quality indices and metabolite concentrations, ranked by the smallest ptrend across tertiles of the diet quality indices. Higher adherence to the WCRF dietary score was, after FDR adjustment, statistically significantly associated with lower concentrations of ten phosphatidylcholines over increasing tertiles. Each one-point increase in the WCRF dietary score was also statistically significantly associated with lower concentrations of four of these ten phosphatidylcholines (PC ae C36:3, PC ae C36:4, PC aa C36:3, and PC aa C38:3).

The DHD15-index was not statistically significantly associated with plasma metabolites after FDR adjustment when analyzed by tertiles of the DHD15-index and continuously (Table 2). An overview of all the results on the association between the diet quality indices and all 134 metabolites is available in Supplementary Table S2.

Table 2 Top-15 plasma metabolites associated with diet quality indices, ranked by ptrend

Dietary patterns

The top-15 metabolites resulting from analysis of associations between the Western, Carnivore, and Prudent pattern and plasma metabolites, ranked by the ptrend across tertiles of the dietary patterns, are shown in Table 3. No linear trend was observed over increasing tertiles of the Western pattern in relation to plasma metabolites. In contrast, every SD increase in consumption of the Western pattern was statistically significantly associated with 35 metabolites (Table 3 and Supplementary Table S2).

Table 3 Top-15 plasma metabolites associated with dietary patterns, ranked by ptrend

A linear trend was observed between increasing tertiles of the Carnivore pattern and higher concentrations of two phosphatidylcholines. Similarly, every SD increase in the intake of the Carnivore pattern was also statistically significantly associated with higher concentrations of PC aa C38:0 (pFDR: 0.001) and PC ae C38:6 (pFDR: 0.01).

The Prudent pattern was not statistically significantly associated with any metabolite when evaluating the linear trend, as well as when testing each SD increase in consumption of the Prudent pattern. Results of the analyses between the dietary patterns and all 134 metabolites are provided in Supplementary Table S2.

Overlap in the top-15 metabolites

Figure 2 illustrates the overlap in the observed top-15 metabolites (based on the smallest p-value for trend over tertiles) for each of the respective diet quality indices and dietary patterns. No overlap among acylcarnitines in the top-15 metabolites for each of the dietary exposures was observed. One amino acid was overlapping; the Carnivore and Prudent pattern both showed a positive association with plasma tryptophan. Sarcosine, a biogenic amine, was positively associated with both the DHD15-index and the Prudent pattern.

Fig. 2

Heatmap illustrating the observed top-15 metabolites (based on the smallest p-value for trend over tertiles) associated with the diet quality indices, i.e. the WCRF dietary score and DHD15-index, and the dietary patterns, i.e. the Western, Carnivore, and Prudent pattern. The color reflects the observed β values; a darker blue color corresponds to a more positive association, while a darker red color represents a more inverse association between the dietary exposure and the plasma metabolite. Statistically significant associations are indicated by a black box around the cell.

Several glycerophospholipids showed overlap between the investigated dietary exposures. An increasing adherence to the WCRF dietary score recommendations, as well as an increasing adherence to the DHD15-index recommendations were associated with decreasing concentrations of plasma phosphatidylcholine PC aa C32:1. Positive associations were observed between the DHD15-index and the Carnivore pattern and PC aa C38:6 concentrations. Inverse associations were observed for the WCRF dietary score, the DHD15-index and the Prudent pattern in relation to phosphatidylcholine PC aa C40:4.

Four phosphatidylcholines (PC ae C34:1, PC ae C34:2, PC ae C36:3, and PC ae C38:3) were among the top-15 metabolites associated with both the WCRF dietary score and the Western pattern. Concentrations of these phosphatidylcholines were lower with increasing adherence to the WCRF dietary recommendations, while higher concentrations were observed with a higher consumption of the Western pattern. In addition, the WCRF dietary score and DHD15-index were inversely associated with two phosphatidylcholines (PC ae C36:4 and PC ae C38:5), while the opposite was observed for the Carnivore pattern. A higher DHD15-index and a higher intake of the Carnivore and Prudent pattern were associated with higher concentrations of PC ae C40:6.

Two sphingolipids overlapped between the dietary exposures. Higher consumption of the Western and Prudent patterns was associated with higher concentrations of plasma SM (OH) C14:1. A higher DHD15-index score was associated with higher concentrations of SM (OH) C22:2 and, in line with this, higher intake of the Prudent pattern was associated with higher concentrations of SM (OH) C22:2.

Sensitivity analyses

Sensitivity analyses excluding patients from whom blood was collected during or after any type of treatment, i.e. (neo-) adjuvant chemotherapy and/or surgery (n = 19) and excluding stage IV patients (n = 7) showed similar beta coefficients between the dietary exposures and plasma metabolite concentrations compared to the main analysis (data not shown).

Discussion

The aim of the current study was to explore the associations between the diet, evaluated using diet quality indices and dietary patterns, and plasma metabolite concentrations in colorectal cancer patients. The WCRF dietary score and the Carnivore pattern were observed to be statistically significantly associated with several long-chain phosphatidylcholines. In addition, when exploring the overlap in the top-15 metabolites for the respective dietary exposures, several dietary exposures were associated with identical long-chain phosphatidylcholines, which strengthens the hypothesis that diet and plasma metabolite concentrations might be associated in colorectal cancer patients.

Better adherence to the WCRF dietary recommendations and the Dutch dietary guidelines, reflecting a healthier diet, was, in general, associated with lower concentrations of phosphatidylcholines in colorectal cancer patients in this study. In contrast, higher intakes of the Western pattern, which is generally regarded as an unhealthier diet [22, 43,44,45,46], were associated with higher concentrations of phosphatidylcholines. Similarly, a higher intake of the Carnivore pattern was positively associated with phosphatidylcholines in the current study, suggesting that a higher intake of a diet with red and processed meat, poultry, fish, and eggs is associated with higher levels of phosphatidylcholines. Interestingly, a study by Schmidt et al. reported that a vegan diet was characterized by lower concentrations of phosphatidylcholines and sphingolipids compared to a diet high in animal products [12]. A previous study also reported decreased lipid concentrations, including lysophosphatidylcholines and other glycerophospholipids, after a two-month intervention assigning healthy individuals to a Mediterranean diet, which is generally low in animal products, except for fish, compared to a control diet. The control diet was based on the American Heart Association guidelines [47], which recommend regular consumption of low-fat dairy products, fish, poultry, and lean meats. In line with our results and the study of Schmidt et al. [12], this may suggest that a higher intake of animal products is associated with higher phosphatidylcholine concentrations, also among those with colorectal cancer. Further studies are needed to elucidate whether phosphatidylcholine metabolism may play a role in the colorectal cancer continuum.

Diet quality indices and dietary patterns have been linked to colorectal cancer survival previously [22, 48, 49]. Associations between dietary exposures and circulating metabolites in colorectal cancer patients may provide important leads for future research regarding the underlying mechanisms between diet and colorectal cancer progression and survival. Once these underlying mechanisms are identified, there will be more solid scientific evidence on which to base nutritional recommendations for colorectal cancer survivors. However, since the current study is based on observational data only, it is not possible to clearly determine the causal relationships between dietary exposures and phosphatidylcholines in colorectal cancer patients. A previous study suggested that cancer cells display an elevated production of phosphatidylcholines, as part of enhanced lipogenesis in cancer cells [50], to further promote proliferation and evade apoptosis [51]. Given our findings that diet seems to be associated with phosphatidylcholines in colorectal cancer patients, and thus may theoretically support the hypothesized neoplastic growth, further studies of phosphatidylcholines in relation to colorectal cancer recurrence and survival might be of interest.

The main strength of the current study is that this is, to the best of our knowledge, the first study investigating the associations between dietary exposures and plasma metabolites in a diseased population, i.e. colorectal cancer patients, using different approaches. When exploring the top-15 metabolites associated with the investigated dietary exposures, several phosphatidylcholines were observed to overlap between our exposures. This may indicate that the reported associations in our study population between dietary exposures and plasma metabolites are robust findings.

One of the limitations is that non-fasted blood samples were used for the current study and, as a result, we cannot rule out the possibility that some observed associations might be related to recent occasional dietary intake [38]. Our study was also limited to the metabolites included in the kit, while other metabolites might also be associated with the various dietary exposures. Following the presented results, a lipid-focused approach is of interest when investigating the association between diet and metabolites in colorectal cancer patients in the future. Lipid species, such as phosphatidylcholines, possess different physicochemical properties [52], and the methods used in the current and previous studies [11, 12] do not allow in-depth interpretation of the individual fatty acid compositions. Our relatively small sample size did not allow comparison of different colorectal cancer stages [40] and subtypes, although associations between diet and metabolites could potentially be related to specific tumor characteristics [53]. Lastly, we were not able to analyze the potential associations between diet, metabolites and colorectal cancer recurrence or survival.

In summary, we reported that the WCRF dietary score and the Carnivore pattern are associated with plasma concentrations of phosphatidylcholines in colorectal cancer patients. Several phosphatidylcholines were also observed to overlap between the dietary exposures when comparing the top-15 metabolites. Our findings should be replicated in larger study populations to allow more in-depth analysis regarding colorectal cancer stages and subtypes to also explore the role of nutritional metabolites in the colorectal cancer continuum. Furthermore, future studies should investigate the association between nutritional metabolites and colorectal cancer recurrence and survival. These explorative analyses might provide additional information about the potential underlying mechanisms of dietary intake in colorectal cancer patients, and the potential relationship with recurrence and survival.