Background

Globally, colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer-related death [1]. CRC is more frequent in men than in women, and its burden is expected to increase by 60% to more than 2.2 million new cancer cases and 1.1 million cancer deaths by 2030 [2].

Chronic inflammation is one of the hallmark characteristics of cancer, and inflammatory cells can also release reactive oxygen species, which trigger mutations in cancer cells [3]. Due to the inflammatory roots of CRC [4], it might be a candidate for prevention by anti-inflammatory and anti-oxidative agents. A compelling body of evidence from experimental and clinical studies has demonstrated that serum bilirubin, a byproduct of hemoglobin breakdown, has substantial anti-inflammatory and anti-oxidative properties [5,6,7,8,9]. Blood levels of total bilirubin are usually less than 17.1 μmol/L and consist primarily of unconjugated bilirubin (UCB) [10], which is also normally present in the gut and can cross gut cell membranes [11]. In vitro, UCB is the most active anti-oxidant part of total bilirubin [11,12,13]. The liver selectively removes UCB from the blood, and UCB is conjugated by a uridine diphosphoglucuronyltransferase (UGT1A1), after which it is transported to the bowel via the bile, where it is unconjugated by bacteria and excreted in the stool or reabsorbed [5,6,7,8,9]. Men usually have higher total bilirubin levels than women due to lower estrogen levels [5, 14] and a higher red blood cell turn-over [15, 16].

As the heme pathway plays an important role against oxidative stress, UGT1A1 gene polymorphisms might be predictive of genetic pre-disposition to cancer [17]. Congenital underexpression of UGT1A1 causes mild chronic unconjugated hyperbilirubinemia, known as “Gilbert’s syndrome (GS),” and is associated with a polymorphism of the 5′ end of the UGT1A1 gene promoter. The frequency of Gilbert’s polymorphism is 30–45%; however, phenotypic hyperbilirubinemia is estimated to be 5–10% in Caucasians [18,19,20].

Few epidemiological studies have investigated the association between circulating bilirubin levels and CRC risk, and their findings have been inconsistent [17, 21,22,23,24,25,26]. Notably, these previous studies considered only total bilirubin, were of limited size, and, with one exception [22], were cross-sectional or retrospective in design.

In this study, we analyzed pre-diagnostic circulating levels of UCB in relation to CRC development in the European Prospective Investigation into Cancer and Nutrition (EPIC). Additionally, we applied a complementary Mendelian randomization (MR) approach to investigate a potential causal relationship between genetically raised bilirubin levels and CRC in large international genetics consortia. We decided a priori to perform analyses separately in men and women because of the well-established sex differences in blood levels of bilirubin [10] and suggestive evidence that bilirubin CRC associations may differ between men and women [17, 23].

Methods

Study population and collection of blood samples and data

EPIC is a multi-center prospective cohort of 521,330 participants (~ 70% women, 25–70 years), recruited between 1992 and 2000, predominantly from the general population in 23 centers of 10 European countries (Sweden, Denmark, Norway, Germany, France, Greece, Italy, Spain, the UK, and the Netherlands) [27]. Around 80% of the participants donated a blood sample at recruitment, and plasma/serum samples were collected according to standardized procedures [27, 28] and stored at the International Agency for Research on Cancer (IARC, Lyon, France, at − 196 °C in liquid nitrogen), except in Denmark (nitrogen vapor, − 150 °C) and Sweden (− 80 °C freezers). At recruitment, participants completed standardized lifestyle and personal history questionnaires, had their diet assessed covering the previous 12 months using validated country/center-specific dietary questionnaires, and had height and weight (self-reported in the Oxford center and Norway, measured elsewhere) assessed [28].

Cancer case ascertainment and selection

A detailed explanation of cancer case selection and ascertainment in EPIC has been published previously [29]. Briefly, incident cancer cases were identified through population cancer registries (Denmark, Italy except Naples, The Netherlands, Norway, Spain, Sweden, and UK; complete follow-up for cancer incidence ranging between December 2004 and 2008) or by active follow-up (France, Germany, Greece, and Naples; complete follow-up ranging between December 2006 and June 2010), consisting of a combination of methods including health insurance records, cancer and pathology registries, and active follow-up of study subjects and their next of kin. Cases were coded by anatomic location as colon and rectal cancer cases, identified according to the 10th revision of the International Classification of Diseases (ICD-10) and the second revision of the International Classification of Disease for Oncology (ICD-O-2). Proximal colon cancers included those within the cecum, appendix, ascending colon, hepatic flexure, transverse colon, and splenic flexure (C18.0-18.5). Distal colon cancers included those within the descending (C18.6) and sigmoid (C18.7) colon. Overlapping (C18.8) and unspecified (C18.9) lesions of the colon were grouped among all colon cancers only (C18.0-C18.9). Rectal cancers were defined as tumors occurring at the recto-sigmoid junction (C19) or rectum (C20). CRC is the combination of the colon and rectal cancer cases. Anal canal cancers (C21) were excluded.
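The sub-site grouping described above can be written as a small lookup. This is only an illustrative sketch: the function name and the returned labels are hypothetical and are not taken from the EPIC coding procedures; only the ICD-10 code ranges come from the text.

```python
def classify_crc_subsite(icd10_code: str) -> str:
    """Map an ICD-10 topography code to the sub-site groups described in the text.

    Hypothetical helper for illustration; labels are our own wording.
    """
    code = icd10_code.strip().upper()
    site = code.split(".")[0]  # e.g. "C18.2" -> "C18", "C20" -> "C20"

    proximal = {"C18.0", "C18.1", "C18.2", "C18.3", "C18.4", "C18.5"}
    distal = {"C18.6", "C18.7"}
    colon_only = {"C18.8", "C18.9"}  # counted among all colon cancers only

    if code in proximal:
        return "proximal colon"
    if code in distal:
        return "distal colon"
    if code in colon_only:
        return "colon (overlapping/unspecified)"
    if site in {"C19", "C20"}:
        return "rectum"
    if site == "C21":
        return "excluded (anal canal)"
    return "not colorectal"


for example in ("C18.2", "C18.7", "C18.9", "C20", "C21.1"):
    print(example, "->", classify_crc_subsite(example))
```

Colon cancer as a whole then corresponds to the first three labels, and CRC to those plus the rectum.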

Controls were selected by incidence density sampling from all cohort members alive and cancer-free at the time of matching to cases (1:1) by sex, age at blood collection, study center, time of day at blood collection, fasting status, menopausal status, and phase of menstrual cycle at blood collection.

A total of 1386 CRC cases (374 proximal colon, 412 distal colon, 80 overlapping proximal plus distal colon, and 520 rectal cancers) and 1386 controls were included in the current analyses.

Laboratory measurement of circulating bilirubin

Circulating UCB levels were measured in plasma samples following a well-established protocol [30, 31] using high-performance liquid chromatography (HPLC, Merck, Hitachi, LaChrom, Vienna, Austria), equipped with a Fortis C18 HPLC-column (4.6 × 150 mm, 3 μm), a Phenomenex SecurityGuard™ cartridge for C18 HPLC-columns (4 × 3 mm), and a photodiode array detector (PDA, Shimadzu). The isocratic mobile phase contained glacial acetic acid (6.01 g/L) and 0.1 M n-dioctylamine in HPLC grade methanol/water (96.5/3.5%). Before starting the procedure, all aliquots were centrifuged and 50 μL plasma/serum was mixed with 200 μL mobile phase. After a second centrifugation, 120 μL of the supernatant was injected into the HPLC at a flow rate of 1 mL/min.

Case-control pairs were analyzed in the same plate to minimize batch-to-batch fluctuation. Bilirubin (alpha) (purity ≥ 98%, Sigma Aldrich) acted as an external standard (3.3% IIIα, 92.8% IXα, and 3.9% XIIIα isomers, 450 nm). One reference plasma sample was assessed per analysis as internal standard. The coefficient of variation (CV) between each plate was 6%.

Genetic data

Genetic determinants for bilirubin levels

Genetic instruments for the MR analysis were identified as single-nucleotide polymorphisms (SNPs) associated with total bilirubin levels (P < 5 × 10⁻⁸) in the largest genome-wide association study (GWAS) conducted to date, which included 317,639 individuals of European ancestry from the UK Biobank study [32]. UK Biobank is a prospective cohort that recruited more than 500,000 men and women aged 40–69 years between 2006 and 2010 and collected anthropometric, health, and lifestyle data and biological samples [33]. Explained phenotypic variance for a single SNP was estimated as a function of effect size for the risk factor in standard deviation units and minor allele frequency [34]. The strength of the association between the genetic instrument and bilirubin levels is reflected in the F-statistic, which is inversely related to weak instrument bias; an F-statistic of at least 10 is conventionally considered the minimum needed to avoid bias of this nature [35]. The F-statistic was estimated as \( F=\left(\frac{n-k-1}{k}\right)\left(\frac{R^2}{1-R^2}\right) \), where \( R^2 \) is the proportion of phenotypic variance explained by the genetic instrument, \( n \) is the sample size, and \( k \) is the number of genetic variants [35]. A total of 115 SNPs were identified as genetic instruments for total bilirubin, explaining 20.0% of phenotypic variance in circulating total bilirubin levels with an F-statistic of 696.5.
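As a rough check of the instrument-strength formula, the calculation can be sketched as follows. The function names are hypothetical, and the per-SNP variance formula (2·β²·MAF·(1 − MAF)) is a commonly used additive-model approximation that we assume here; the text only states that variance explained is a function of effect size and minor allele frequency. Because the inputs are the rounded values quoted in the text, the result for the full instrument (roughly 690) only approximates the reported 696.5.

```python
def snp_variance_explained(beta_sd: float, maf: float) -> float:
    """Approximate variance explained by one SNP.

    Assumption: additive model, beta in SD units of the risk factor,
    R^2 ~= 2 * beta^2 * MAF * (1 - MAF).
    """
    return 2.0 * beta_sd ** 2 * maf * (1.0 - maf)


def f_statistic(r2: float, n: int, k: int) -> float:
    """F = ((n - k - 1) / k) * (R^2 / (1 - R^2)), as given in the text."""
    return ((n - k - 1) / k) * (r2 / (1.0 - r2))


N_GWAS = 317_639  # UK Biobank GWAS sample size quoted in the text

# Full instrument: 115 SNPs, R^2 = 20.0% (rounded)
print(round(f_statistic(0.200, N_GWAS, 115), 1))
# Instrument without the main UGT1A1 SNP: 114 SNPs, R^2 = 3.1% (rounded)
print(round(f_statistic(0.031, N_GWAS, 114), 1))
```

Both values are far above the conventional threshold of 10.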

The SNP with the largest contribution was rs6431625 in the UGT1A1 gene on chromosome 2. This SNP explained 16.9% of phenotypic variance and was in strong linkage disequilibrium (LD R2 = 0.74) with the UGT1A1*28 promoter TA repeat polymorphism (rs3064744) in European populations [36]. The other 114 SNPs explained 3.1% of phenotypic variance with an F-statistic of 89.1. All SNPs were independently associated with total bilirubin levels (LD R2 < 0.001), and SNPs with ambiguous strand codification (A/T or C/G) were replaced by SNPs in LD R2 > 0.8 in European populations using the proxysnps R package. As described in the GWAS where SNPs were identified, raw total bilirubin levels were adjusted for age, sex and their interaction, the top 40 principal components for population stratification, recruitment center, socioeconomic status, and potential technical confounders (blood draw time and its square and interactions with age and sex; urine sample time and its square and interactions with age and sex; sample dilution factor; fasting time, its square, and interactions with age and sex; and interactions of blood draw time and urine sample time with dilution factor) [32]. These adjusted residuals were inverse-normal-transformed and reflect the genetic association with bilirubin levels in standard deviation units (Supplementary Table 1, see Additional file 1). Total bilirubin is the sum of UCB (~ 80–85%) and conjugated bilirubin (~ 15–20%), and this ratio is constant under physiologic conditions.

Genome-wide data on CRC risk

Epidemiological and genetic data were derived from 51 studies (Supplementary Table 2, see Additional file 1) participating in the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) [37], the Colon Cancer Family Registry (CCFR) [38], and the Colorectal Transdisciplinary (CORECT) study [39]. Men and women with incident invasive colorectal adenocarcinoma (ICD-9, codes 153-154) were included as cases. All CRC cases were confirmed by medical records, pathology reports, or death certificates. A total of 52,775 cases and 45,940 matched controls were included in the analyses [40]. On average, 51% of the study participants were men and the mean age was ~ 60 years; in all studies, controls were matched to cases on age and sex. The UK Biobank CRC cases and controls were excluded from the genetic consortia data. This exclusion avoids sample overlap between the SNP discovery sample (UK Biobank) and the CRC case-control samples, which could otherwise allow weak instruments (i.e., genetic instruments not explaining much variation in circulating bilirubin) to bias the MR risk estimate towards the observational risk estimate [34].

Genotype information was available for all included studies. Details on genotyping, quality assurance, and imputation are described elsewhere [41]. In short, SNPs were excluded based on call rate (< 98% GECCO; < 95% CORECT), lack of Hardy-Weinberg equilibrium in controls (P < 1 × 10⁻⁴), or low minor allele frequency (≤ 1%). Analyses were restricted to individuals self-reported as of European descent and clustering with Utah residents with Northern/Western European ancestry from the CEU population in principal component analysis, including the HapMap II populations as reference. Summary statistics for genetic association with CRC risk were obtained for all studies included in the consortia and are shown in Supplementary Table 1.

Statistical analyses

Serological analyses

Our a priori decision to perform all statistical analyses separately in men and women was confirmed by a strong effect modification by sex with regard to CRC risk in EPIC (Pheterogeneity = 0.008). Conditional logistic regression models were used to estimate odds ratios (OR) and 95% confidence intervals (CI) for associations between log-transformed UCB levels (log-UCB), standardized per one standard deviation (1-SD) increments, and CRC risk. Two models were constructed: a crude model which was conditioned on the matching criteria and then a multivariable model adjusted for level of education (none/primary school, technical/professional, secondary school, university degree), BMI (continuous, kg/m2), height (continuous), smoking status (never, former, and current smoker), physical activity (inactive, moderately inactive, moderately active, and active), alcohol consumption (g/day), dietary intakes of fiber (g/day), red meat (g/day), processed meats (g/day), dairy products (g/day), and total energy intake (kcal/day), and in women ever use of hormone therapy (HT, yes/no). Based on prior knowledge about the causal structure, we adjusted for variables that allowed all backdoor paths to be blocked in the directed acyclic graph (DAG) shown in Supplementary Figure 3, while avoiding adjustment for variables affected by either the exposure or the outcome [42]. Missing values in any of the categorical covariates were treated as a separate category.
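For 1:1 matched sets, the conditional likelihood depends only on the case-minus-control difference in the exposure, so the crude model can be illustrated with a few lines of standard-library Python. This is a minimal single-covariate sketch, not the Stata procedure used for the analyses: the function names are hypothetical, covariate adjustment is omitted, and standardizing by the pooled SD of log-UCB is our assumption, since the text does not state which SD was used.

```python
import math
import statistics


def standardized_log_differences(case_values, control_values):
    """Case-minus-control differences in log exposure, per 1-SD of the pooled logs.

    Assumption: the SD is taken over cases and controls combined.
    """
    log_cases = [math.log(v) for v in case_values]
    log_controls = [math.log(v) for v in control_values]
    sd = statistics.stdev(log_cases + log_controls)
    return [(c - k) / sd for c, k in zip(log_cases, log_controls)]


def fit_matched_pairs(diffs, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of a one-covariate conditional logistic model (1:1 matching).

    Each pair contributes 1 / (1 + exp(-beta * d)) to the likelihood,
    where d is the case-minus-control difference.
    """
    def score_and_info(beta):
        probs = [1.0 / (1.0 + math.exp(-beta * d)) for d in diffs]
        score = sum(d * (1.0 - p) for d, p in zip(diffs, probs))
        info = sum(d * d * p * (1.0 - p) for d, p in zip(diffs, probs))
        return score, info

    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(beta)
        if info == 0.0:
            raise ValueError("no informative (discordant) pairs")
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_info(beta)
    return beta, 1.0 / math.sqrt(info)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and 95% confidence limits from a log-odds estimate."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Toy differences (hypothetical, not EPIC data)
beta_hat, se_hat = fit_matched_pairs([1.0, 1.0, 1.0, -1.0])
print(odds_ratio_with_ci(beta_hat, se_hat))
```

In the toy example, three of the four pairs have the higher exposure in the case, so the fitted odds ratio per unit difference is 3.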

We also investigated the potential non-linear dose-response associations between circulating levels of UCB and CRC risk. We used three-knot restricted cubic spline models at Harrell’s default percentiles (i.e., 10th, 50th, and 90th) in combination with a Wald-type test [43].
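With three knots, a restricted cubic spline adds a single non-linear term to the linear term for the exposure, and the test for non-linearity is a Wald test that the coefficient of this term is zero. The sketch below builds that term using Harrell's usual normalization; the function names are hypothetical, and the percentile rule used by `statistics.quantiles` may differ slightly from the one in the software used for the analyses.

```python
import statistics


def harrell_three_knots(values):
    """Knots at the 10th, 50th, and 90th percentiles of the exposure."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]


def rcs3_term(x, knots):
    """Non-linear basis term of a three-knot restricted cubic spline.

    The term is zero below the first knot and linear above the last knot.
    It is divided by (t3 - t1)^2 so that it is on the scale of x.
    """
    t1, t2, t3 = knots

    def pos_cube(u):
        return u ** 3 if u > 0 else 0.0

    raw = (
        pos_cube(x - t1)
        - pos_cube(x - t2) * (t3 - t1) / (t3 - t2)
        + pos_cube(x - t3) * (t2 - t1) / (t3 - t2)
    )
    return raw / (t3 - t1) ** 2


# Hypothetical exposure values, for illustration only
exposure = [float(v) for v in range(1, 102)]
knots = harrell_three_knots(exposure)
design_rows = [(x, rcs3_term(x, knots)) for x in exposure]  # (linear, non-linear)
print(knots, design_rows[60])
```

The two columns of `design_rows` would enter the regression model in place of the single linear exposure term.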

We tested for effect modification by categories of age (median), BMI (median), alcohol consumption (median), smoking status, menopausal status, use of HT, genotypes of the main UGT1A1 SNP (rs6431625), and follow-up time (categories) by adding in the multivariable model a multiplicative interaction term between log-UCB and each of the aforementioned variables at a time. These hypothesis-free analyses were meant to assess the consistency of associations across population subgroups. Additional heterogeneity analysis was performed by cancer sub-sites (colon vs. rectum and proximal vs. distal). For this, we fitted stratified conditional logistic regression models based on competing risks and calculated the OR and their 95% CI in the subgroups of interest [44].

Finally, to evaluate the robustness of the results and address potential sources of bias such as reverse causation and residual confounding, we performed a range of sensitivity analyses. To exclude individuals with hepatic impairment, we calculated BTR index (the molar ratio of branched-chain amino acids to tyrosine) [45] and Fischer’s ratio (the molar ratio of branched-chain amino acids to tyrosine and phenylalanine) [46], which are clinical indicators of liver dysfunction and metabolism. Last, the fully adjusted models in EPIC were repeated after excluding subjects with missing values in any covariate.

To validate the genetic instruments for total bilirubin, we regressed the measured bilirubin levels on the allele dose of the bilirubin-increasing allele of the main SNP (rs6431625) in the UGT1A1 gene in the EPIC sample with available GWAS data (N controls = 808).

Genetically predicted total bilirubin levels vs. CRC risk in GECCO, CCFR, and CORECT

We investigated the genetic instruments for total bilirubin levels in relation to CRC risk using a 2-sample MR in 52,775 cases and 45,940 control participants within GECCO, CCFR, and CORECT (28,207 cases/22,204 controls in men and 24,568 cases/23,736 controls in women). With this sample size, the power was 80% to detect an OR ≥ 1.065 for the sex-stratified analyses per one standard deviation increment of total bilirubin levels.

Each genetic variant provides an estimate of the effect of total bilirubin levels on cancer risk (Wald estimate: genetic effect on CRC risk/genetic effect on total bilirubin levels). Before performing the main MR analysis, we assessed the presence of outlier observations within the SNP Wald estimates using the MR pleiotropy residual sum and outlier (MR-PRESSO) test [47]. This method identifies heterogeneity between SNP effects (PGlobal) as evidence of horizontal pleiotropy, identifies outlier SNPs, and tests whether the presence of outliers is biasing the estimation of risk (PDistortion). Then, as the main MR approach used in this study, SNP Wald estimates were combined into a single causal estimate through a likelihood-based MR approach, which is considered the most accurate MR method to estimate effects when there is a continuous log-linear association between risk factor and disease risk [48]. The multiplicative random effects inverse-variance weighted MR estimator was also applied [49]. However, the presence of pleiotropic variants can lead to biased causal effect estimates. To overcome this potential issue, several MR sensitivity analyses for data with potentially invalid instruments were applied. Initially, to evaluate the extent to which directional pleiotropy (non-balanced horizontal pleiotropy) may affect the effect estimate, we used the intercept test within an MR-Egger weighted linear regression approach [50]. Furthermore, two additional approaches, namely the weighted median method [51] and the modal-based estimate approach [52], relying on the distribution of SNP effects, were applied. In the former, the causal effect estimate is weighted towards the median of the distribution of SNPs used in the genetic instrument, while in the latter, the effect estimate is reflected by the mode of the density distribution provided by SNP Wald estimates. Both methods are less sensitive to SNPs with biased effects.
Finally, to identify whether the strongest SNP (rs6431625) was driving the association estimates, we obtained MR estimates leaving out this SNP from the SNP set.
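The Wald estimate and the leave-one-out idea can be illustrated with a simplified estimator. The sketch below uses a fixed-effect inverse-variance weighted (IVW) combination of the per-SNP Wald ratios, which is simpler than the likelihood-based and multiplicative random-effects estimators used in this study; the function names are hypothetical and the summary statistics are invented for illustration.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP Wald estimate with a first-order standard error.

    Assumption: uncertainty in the SNP-exposure association is ignored.
    """
    ratio = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return ratio, se


def ivw_estimate(snps):
    """Fixed-effect inverse-variance weighted mean of the SNP Wald ratios.

    `snps` holds tuples (beta_exposure, beta_outcome, se_outcome).
    """
    ratios = [wald_ratio(bx, by, se_y) for bx, by, se_y in snps]
    weights = [1.0 / se ** 2 for _, se in ratios]
    estimate = sum(w * r for w, (r, _) in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


def leave_one_out(snps, index):
    """IVW estimate after dropping the SNP at position `index`."""
    return ivw_estimate(snps[:index] + snps[index + 1:])


# Invented summary statistics: one strong SNP followed by three weak ones
example_snps = [
    (0.40, 0.028, 0.010),
    (0.05, -0.006, 0.012),
    (0.03, -0.004, 0.015),
    (0.04, -0.005, 0.011),
]
est_all, se_all = ivw_estimate(example_snps)
est_rest, se_rest = leave_one_out(example_snps, 0)
print(math.exp(est_all), math.exp(est_rest))  # odds ratios per 1-SD exposure
```

In this invented example, the strong SNP dominates the pooled estimate, and removing it reverses the direction of the estimate, which is the kind of pattern the leave-one-out analysis is designed to reveal.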

Additionally, we investigated the between-sex heterogeneity of main causal effects by estimating the percentage of variance that is attributable to sex heterogeneity (I2 statistic), and the P value derived from Q statistic for heterogeneity (Pheterogeneity), assuming a fixed-effect model of 1 degree of freedom.
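The between-sex heterogeneity statistics can be reproduced approximately from published odds ratios and confidence intervals. The sketch below is an illustration with hypothetical function names: it recovers standard errors from the 95% confidence limits, computes Cochran's Q around the fixed-effect pooled estimate, and derives I² and a P value for 1 degree of freedom. Because the inputs are rounded, the output only approximates the reported values for rs6431625 (I2 = 64.0%; Pheterogeneity = 0.10).

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z=1.959964):
    """Log odds ratio and its standard error recovered from a 95% CI."""
    return math.log(odds_ratio), (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def cochran_q_and_i2(estimates):
    """Cochran's Q and I^2 (%) for a list of (estimate, standard error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def chi2_pvalue_1df(q):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# MR estimates for rs6431625 as reported in the Results (rounded values)
men = log_or_and_se(1.07, 1.02, 1.12)
women = log_or_and_se(1.01, 0.96, 1.06)
q_stat, i2 = cochran_q_and_i2([men, women])
print(round(q_stat, 2), round(i2, 1), round(chi2_pvalue_1df(q_stat), 3))
```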

Scatter plots were used to depict the genetic association on total bilirubin levels and CRC risk. All statistical analyses and plots were performed using Stata SE14 (Stata Corporation, College Station, TX, USA) and R (MRPRESSO, TwoSampleMR, and ggplot2; The R project). The significance testing was based on two-sided P values of less than 0.05.

Results

Baseline characteristics of the EPIC participants are shown in Table 1. Mean follow-up time from blood collection to cancer diagnosis was 4.3 years (± 2.5 SD). Among men, cases compared to controls had higher UCB concentrations, were heavier (higher weight and BMI), and consumed more alcohol. Among women, cases compared to controls had lower UCB concentrations, were heavier (higher weight) and taller, and consumed fewer dairy products.

Table 1 Baseline characteristics of colorectal cancer cases and their matched controls by sex in the EPIC nested case-control study

In men, and less so in women, homozygous or heterozygous genotypes tended to be more frequent, relative to the wild-type genotype, in CRC cases than in controls.

Serological analyses: association between circulating bilirubin levels and CRC risk

In the EPIC cohort, among men, we observed a positive association between pre-diagnostic UCB levels and CRC risk in both crude and multivariable adjusted models (multivariable OR = 1.19, 95% CI = 1.04–1.36; P = 0.01; per 1-SD increment in log-UCB). In contrast, we observed an inverse association between UCB and CRC risk in women in both crude and multivariable adjusted models (multivariable OR = 0.86, 95% CI = 0.76–0.97; P = 0.02; per 1-SD log-UCB increment) (Table 2). These associations followed a linear trend in men (Pnonlinearity = 0.7) and in women (Pnonlinearity = 0.1), although in women the OR changed little between 4 and 15 μmol/L of UCB (Supplementary Figure 1, see Additional file 1).

Table 2 Odds ratio and 95% confidence interval for the association between bilirubin levels and CRC risk

Effect modification and sensitivity analyses

The association of UCB levels with CRC risk in EPIC women differed by age (Table 3). We observed an inverse association between UCB levels and CRC risk in older women (> 58.5 years) (multivariable OR = 0.73, 95% CI = 0.61–0.87; P = 0.001; per 1-SD increment in log-UCB), but not in younger women (OR = 1.01, 95% CI = 0.85–1.19; P > 0.9) (Pheterogeneity = 0.008). Serum levels of UCB were lower in older women compared with younger women. No effect modification by age at blood collection was observed in men (Pheterogeneity = 0.3).

Table 3 The association between unconjugated bilirubin (UCB) levels and colorectal cancer risk across strata of potential effect modifiers in the EPIC study

In contrast, in men (Pheterogeneity = 0.02), but not in women (Pheterogeneity = 0.14), effect modification by the rs6431625 genotype was observed (Table 3). In men with homozygous genotype of the bilirubin-increasing effect allele (CC) in rs6431625, higher levels of measured UCB were positively associated with CRC risk (OR = 2.01, 95% CI = 1.26–3.20; P = 0.003; per 1-SD increment in log-UCB), while no associations were observed in those with heterozygous (TC) or wild-type (TT) genotypes. Homozygote UGT1A1 bilirubin-increasing allele carriers (rs6431625) had higher serum UCB levels compared to heterozygotes or wild-type in the EPIC population with GWAS data (R2 = 0.20; P < 0.001, N controls = 808) (Supplementary Figure 2, see Additional file 1).

No differences in the association between UCB levels and CRC risk in men and women were observed across categories of BMI, alcohol consumption, smoking status, menopausal status, use of HT, and follow-up time in years (Table 3). Estimated associations between UCB levels and CRC risk in men and women were also robust to sensitivity analyses (Supplementary Table 3, see Additional file 1).

There was no heterogeneity in associations by anatomical sub-sites (colon vs. rectum, or proximal colon vs. distal colon) (all Pheterogeneity ≥ 0.1) (Table 4).

Table 4 The association between unconjugated bilirubin (UCB) concentrations and colorectal cancer risk by anatomical sub-sites in the EPIC study

Genetically predicted bilirubin levels and CRC risk in GECCO, CCFR, and CORECT

In light of the heterogeneous results across the main UGT1A1 SNP (rs6431625) genotype categories in serological analyses, we applied an MR approach to this SNP separately from the other genetic instruments. In the MR analysis of rs6431625, higher levels of genetically predicted bilirubin were positively associated with CRC risk in men (OR = 1.07, 95% CI = 1.02–1.12; P = 0.006; per 1-SD of total bilirubin), but not in women (OR = 1.01, 95% CI = 0.96–1.06; P = 0.73) (Table 2) (I2 = 64.0%; Pheterogeneity = 0.10).

In the MR analyses of the other 114 genetic instruments, no outlier SNPs were identified by MR-PRESSO analyses, with some heterogeneity among the instruments (PGlobal < 0.04). The likelihood-based MR risk estimates, which excluded rs6431625, showed some evidence that higher levels of bilirubin were inversely associated with CRC risk in men (OR = 0.89, 95% CI = 0.80–1.00; P = 0.05), while in women, no association was observed (OR = 1.00, 95% CI = 0.89–1.11; P = 0.96) (Table 2) (I2 = 39.0%; Pheterogeneity = 0.20). Scatter plots depicting the genetic association of the 115 SNPs with total bilirubin levels and with CRC risk, together with MR risk estimates for the genetic instrument comprising the 114 SNPs, are shown in Fig. 1.

Fig. 1 Scatter plots depicting the genetic association between total bilirubin levels and colorectal cancer risk. Per allele association of total bilirubin SNPs with inverse-normal-transformed bilirubin levels (x axis) and risk for colorectal cancer (y axis; logarithmic scale) in men (a) and in women (b), together with the likelihood-based MR estimate for the genetic instrument comprising the 114 SNPs (dashed-blue line) and its 95% CI (dotted-blue lines)

In MR sensitivity analyses of the 114 SNP instrument, the MR-Egger test did not detect directional pleiotropy in the intercept analysis for total bilirubin levels in men or women (Pintercept ≥ 0.45). The additional inverse-variance weighting, weighted median, and modal-based estimates provided similar results compared to the likelihood-based MR risk estimates (Supplementary Table 4, see Additional file 1).

Discussion

We investigated the relation between pre-diagnostic levels of circulating UCB, the main component of total bilirubin, and CRC risk in the EPIC study, and then complemented these analyses with an MR approach using data from large-scale genetic consortia of CRC. In the serological analysis, higher circulating levels of UCB were positively associated with CRC risk in men and inversely associated in women. The complementary MR analysis supported a positive association between total bilirubin levels, genetically predicted by a UGT1A1 SNP (rs6431625), and CRC risk in men, but not in women. We further found that bilirubin levels predicted by instrumental variables excluding the UGT1A1 SNP were suggestive of an inverse association with CRC in men, which is in line with our initial hypothesis, but not in women.

These directionally different associations of bilirubin-raising genetic instruments with CRC in men suggest that the UGT1A1 SNP either has horizontal pleiotropic effects through pathways other than elevated blood levels of bilirubin or indicates an elevated bilirubin distribution among individuals with GS as compared to the general population. Both scenarios are biologically plausible.

Potential pleiotropic effects of the UGT1A1 SNP include a reduced capacity of the UGT1A1 enzyme in the liver or gut to metabolize xenobiotics and toxic substances (e.g., heterocyclic aromatic amines in well-done red meat) [24]. Furthermore, the influence of sex hormones on UGT1A1 activity [14], and differences in UGT1A1 expression between men and women, leading to differential bilirubin conjugation and circulating levels [53], might partly explain the sex differences in CRC risk found in this study. There is suggestive evidence for sex differences in the association between UGT1A1 variants and CRC risk [23]. In as yet unpublished work using pancreatic cancer as a control outcome, we observed similar sex differences in associations between bilirubin, predicted by the same UGT1A1 SNP, and cancer risk (suggestive positive association in men and null association in women) using data of genetic consortia on pancreatic cancer (Supplementary Table 5, see Additional file 1).

In a second scenario, the findings in men could indicate that bilirubin, an anti-oxidant in vitro [30, 54,55,56], could trigger pro-oxidative processes at high-normal levels in the gut, similar to what has been described for ascorbic acid [57]. Both serological and MR analyses indicated that increased CRC risk was confined to men with a genetic pre-disposition to high bilirubin levels (in our study: bilirubin effect allele (CC) in rs6431625). It is estimated that 11–16% of Caucasians carry a homozygous bilirubin-increasing risk allele [58], and if one in ten individuals have a physiologic trait that affects their risk of cancer, this would have significant implications for future cancer prevention. Nevertheless, follow-up studies are needed to fully clarify the role of bilirubin in CRC development; for example, by conducting a multivariable MR [59], where bilirubin is jointly instrumented with potential other phenotype(s) that could be associated with UGT1A1 variants.

The few studies to date that have investigated the association between circulating bilirubin levels and CRC risk have reported inconsistent results [17, 19, 22]. In an exploratory retrospective case-control study (174 cases), lower total bilirubin levels were associated with higher risk of CRC in men and in women [17]. In a prospective investigation in the National Health and Nutrition Examination Survey (NHANES I), a null association between total bilirubin levels and incidence of CRC was reported (110 cases men and women combined) [22], whereas a prior cross-sectional analysis in the NHANES III reported an inverse association between total bilirubin levels and CRC (83 cases, men and women combined) [19]. These inconsistencies are most likely attributable to differences in study design and/or limited sample sizes. The current analysis goes beyond previous studies in that we used a prospective design with pre-diagnostic blood samples and a large number of incident cases that provided sufficient power for sex stratification.

To our knowledge, no other studies to date have investigated potential causal association between circulating bilirubin and CRC risk using an MR approach. However, variants in the UGT1A1 gene have been previously examined in relation to CRC. Consistent with our findings, a positive association between the UGT1A1*28 allele (homo-/heterozygous for higher bilirubin) and CRC risk in men (OR = 1.97, 95% CI = 1.22–3.19; P = 0.005), but not in women (P = 0.26) was reported in a Macedonian retrospective case-control study [23]. However, another retrospective case-control study [60], which combined men and women, found no significant association between UGT1A1*28 and CRC risk (OR = 1.10, 95% CI = 0.84–1.50). In contrast, Jiraskova et al. [17] reported an inverse association between the UGT1A1*28 polymorphism and CRC risk in men (OR = 0.75, 95% CI = 0.58–0.96) and also a non-significant inverse association in women (OR = 0.88, 95% CI = 0.66–1.18), which however may have limited generalizability due to a highly selected study sample. Our approach goes beyond these studies in terms of sample size, comprehensive SNP analyses and linking for the first time circulating bilirubin to a cancer outcome using an MR approach.

In subgroup analyses of our EPIC study, we found a stronger inverse association between UCB and CRC risk in older women (> 58.5 years) compared to younger women. This effect modification by age was not observed in men. Similar age patterns have been observed in previous studies of indicators of metabolic health in men and women [61, 62]. However, a more likely explanation for this finding in women is bias due to differential selection over time of women less susceptible to CRC [63].

The main strengths of our study were the prospective design with long follow-up time between blood sampling and CRC diagnosis, and the large sample size, which allowed stratification by sex and anatomical sub-sites of CRC, with access to biomarkers and lifestyle factors for better control of potential confounding. In addition, we applied an MR approach to address potential confounding, including residual confounding, and reverse causation in our serological analysis.

Our study was limited, first, by the lack of liver enzyme data at baseline in the EPIC study to infer hepatic pathology, which would impact bilirubin synthesis. To address this issue, we used Fischer’s ratio and the BTR index to exclude subjects potentially having liver abnormalities; we can therefore be reasonably confident that participants who had higher UCB did not suffer from liver disease. Second, storage of samples for prolonged periods of time could have contributed to a degradation of UCB. As with traditional epidemiological analysis, selection bias can also adversely affect MR studies [63]. Given that attrition rates in the genetic consortia were reported as low [38, 39] and that the GWAS on bilirubin was not conditioned on another trait [32], selection bias is unlikely to explain our findings [64]. A major assumption in our MR was that the genetic instruments affect CRC risk only through bilirubin levels. Potential pleiotropic effects of our UGT1A1 SNP (rs6431625) cannot be excluded, and pathways other than mild hyperbilirubinemia associated with lower UGT1A1 activity could therefore also play a role in CRC development [24]. Nevertheless, it is also biologically plausible that our observed associations reflect the effect of an elevated distribution of circulating bilirubin. This is supported by our serological finding that the positive association between serum levels of bilirubin and CRC risk was confined to men with a genetic pre-disposition to high bilirubin levels (in our study: bilirubin effect allele (CC) in rs6431625). A look-up of this SNP in the PhenoScanner database indicated an association with self-reported liver or biliary/pancreas problems, which likely hints at undiagnosed GS.

We also assessed potential horizontal pleiotropy of the other genetic instruments without the UGT1A1 SNP [65]. The corresponding MR analysis after strictly removing all SNPs (including those associated with yet unknown phenotypes) that might have violated the exclusion restriction (horizontal pleiotropy) and the independence assumption (no confounders) [59, 66] resulted in very similar associations, despite our conservative unsupervised approach (Supplementary Table 4). These excluded SNPs were genome-wide associated with educational attainment, BMI, mean corpuscular volume of red blood cells, and others (Supplementary Table 6). We also employed a set of sensitivity MR methods (e.g., the conservative MR-Egger approach) [50], known to be robust to different types of pleiotropy, and there was no indication of horizontal pleiotropy in our MR analysis. Lastly, weak instruments in a two-sample MR study can bias estimates towards the null [51], which we deem unlikely in our study given the F-statistics of the full instrument including our UGT1A1 SNP (F = 696.5) and of our other instruments (F = 89.1).

Conclusions

In conclusion, we observed that higher circulating bilirubin levels were positively associated with CRC risk in men. Both serological and MR analyses suggested that increased CRC risk was confined to men with a genetic pre-disposition to high bilirubin levels. In women, the inverse relationship between circulating bilirubin and CRC risk observed in the serological analysis was not supported by the MR approach. Additional insight into the relationship between circulating bilirubin and CRC is needed before conclusions can be drawn about a potential causal role of bilirubin in CRC development.