Background

As communities across the globe experience a rising influx of cannabis products of many types, a confluence of events suggests that this is a suitable opportunity to re-investigate the extent, impact and implications of cannabis-related carcinogenesis.

It has been known for several years that cannabis is linked with testicular cancer rates and indeed all four studies to have investigated the issue have made positive findings [1,2,3,4], with a relative risk of 2.59-fold (95%C.I. 1.60–4.19) [5]. Beyond a simple disease linkage this datum is highly impactful for our understanding of disease mechanisms for two reasons both of which are deserving of close attention. It is well described in the testicular cancer literature that the pathogenesis of testicular cancer begins in utero and is activated by the hormonal surge of puberty so that the preclinical phase of the disease takes place over several decades [6,7,8]. Patients who smoke cannabis and later contract testicular cancer, whose mean age of incidence is around 34 years, have obviously greatly contracted the preclinical disease course. That is to say that cannabis has aggressively accelerated malignant oncogenic processes from several decades to just a few years. Further the testis houses the male germ cell epithelium so that mutation there necessarily implies heritable mutagenesis potentially transmissible to following generations. This combination of powerful carcinogenesis and transgenerational transmissibility is a most concerning confluence. Similarly several pediatric cancers, including acute myeloid leukaemia (AML), have also been linked with parental cannabis use again demonstrating transgenerational transmissibility of oncogenesis [9,10,11,12,13,14,15].

It was recently reported in a geospatial and causal inference study that cannabis is a major driver of the significantly rising US total pediatric cancer rates which have risen 49% 1970–2017 [16]. This is important because what is implied is transgenerational transmission of oncogenesis, exactly as suggested above. Furthermore five major chromosomal anomalies and five major cancers were recently linked with cannabis exposure across USA [17].

Moreover cannabis-related oncogenesis is part of a larger overall story of cannabis-related genotoxicity. Warnings are found on the registered product information and prescribing information for both Epidiolex and Sativex indicating that genotoxicity is an activity of cannabinoids which is widely recognized and accepted by regulators, marketers, distributors and many scientists [18, 19]. It is well established that genotoxicity can be expected to be manifested primarily in increased rates of congenital malformations and cancer incidence [20]. Several cardiac malformations were described by the American Heart Association and American Academy of Pediatrics in a major review in 2007 [21]. However it was recently shown, again in a geospatial and causal inference study, that another common congenital heart defect, atrial septal defect secundum type is also being driven sharply upwards by increased cannabis exposure, which is not occurring uniformly across USA [22]. Description of a new cannabis-related congenital anomaly necessarily implies that our understanding of cannabis teratogenesis is as yet incomplete and indeed we have more to learn in this field. Many congenital anomalies were recently described as being more common in the highest quintile of cannabis using US states [23].

Patterns of cannabinoid consumption are changing rapidly. Cannabis legalization has resulted in not only more children and adults exposed to cannabis [24, 25] but also more people using it more intensely so that the number of people smoking daily or near daily has doubled in USA [26]. And it is well established that the concentration of most cannabinoids has risen dramatically in recent decades [27,28,29]. Hence more people are smoking stronger cannabis with greater intensity than previously creating a triple convergence of cannabinoid exposure especially in habitual smokers. High concentration “dabs”, highly concentrated oils and waxes and solid cannabinoid “shatter” are widely available in many parts of USA. This very new pattern clearly heralds a new era in cannabis epidemiology so that it is only appropriate that we well understand our recent history and epidemiology in this area. Indeed leading authorities have called for a complete revision of cannabis epidemiology in this new high dose – high intensity – high use paradigm [30]. Of note one widely quoted paper with a null finding on the cannabis cancer link actually omitted high dose cannabis smokers from its analysis by protocol likely amputating the most intriguing and important analytical signal [31].

One of the pillars of the epidemiological link between tobacco and lung cancer is the high odds ratio for smokers, who experience a nine-fold elevation in lung cancer risk [32]. The E-Value is a measure, on the relative risk scale, of the strength of association which an unmeasured confounder would need to have with both the exposure of interest and the outcome of concern in order to explain away the observed association. It can be calculated from the risk ratio or from the output of many common regression models. E-Values have both a point estimate and a lower bound corresponding to the 95% confidence limit closest to the null, here termed the minimum E-Value [33,34,35]. The applicable lower E-Value for tobacco-lung cancer is 9.0. Our analysis makes extensive use of E-Values on linear regression equations and rate ratio count data for multiple outcomes [35], as was recently recommended by leading public health authorities [36]. We also considered that it would be useful to explore the formal techniques of causal inference and geotemporospatial regression for selected cancers as appropriate.
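The E-Value calculation can be illustrated with a short sketch. It assumes the published closed-form expression for a risk ratio, E = RR + sqrt(RR × (RR − 1)), applied to the point estimate and to the confidence limit nearest the null. The function names are hypothetical and the sketch is written in Python, so it illustrates the arithmetic only and is not the code used for the analysis.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a single risk ratio: RR + sqrt(RR * (RR - 1))."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective associations are inverted so the formula is applied to RR >= 1.
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


def min_e_value(rr: float, ci_low: float, ci_high: float) -> float:
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval includes 1, since no unmeasured
    confounding is then needed to move the interval to the null.
    """
    if ci_low <= 1 <= ci_high:
        return 1.0
    limit = ci_low if rr > 1 else ci_high
    return e_value(limit)


if __name__ == "__main__":
    # Rate ratio and 95% interval quoted in this paper for melanoma and THC.
    print(round(e_value(2.166), 2), round(min_e_value(2.166, 2.153, 2.180), 2))
```

With the melanoma rate ratio reported for THC (2.166, 95%C.I. 2.153, 2.180) the lower-bound calculation returns approximately 3.73, in line with the tabulated minimum E-Value.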

Cannabis is not a pure substance but a mixture of many substances. Prior to combustion it has over 400 unique chemicals in it collectively known as cannabinoids [37, 38]. Cannabis contains most of the major carcinogens of tobacco including benzopyrene, anthracyclines and aromatic polycyclic hydrocarbons [31, 37, 38]. THC is a major cannabinoid but cannabidiol is a well described minor constituent. Although cannabidiol currently enjoys a relatively harmless reputation in the popular press due to its relative lack of psychoactivity it has been known for several decades to be damaging to chromosomes, the bases of DNA, mitochondrial metabolism and energy generation and the epigenome [39]. Given that it is so widely available we were especially concerned to ascertain if this supposedly “safe” reputation was borne out by the observed epidemiological trends.

Companion papers examine these relationships as continuous variables [40], in detail in prostate and ovarian cancer [41], and examine the epidemiology of congenital teratogenesis from a space-time and causal inference perspective [17, 42, 43]. The present paper addresses these issues with variables categorized by quintiles of exposure, which allows the calculation of key epidemiological metrics including rate ratios (R.R.), attributable fractions in the exposed (AFE) and population attributable risks (PAR, also known as attributable fractions in the population, AFP). Calculation of such proportions across different substances allows the oncogenicity of the known carcinogens tobacco and alcohol to be directly compared with that of the cannabinoids, which is the principal subject of the present enquiry.

Methods

Data

Age-adjusted cancer rates by state, year and cancer type were sourced using the SEER*Stat software from the National Program of Cancer Registries (NPCR) and Surveillance, Epidemiology and End Results (SEER) Incidence US Cancer Statistics Public Use Database, 2019 submission, covering years 2001–2017, from the Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, and the National Cancer Institute (NCI) [44]. This study was focussed on 28 of the most common cancers (listed below). One category, called All Cancer in this report, related to the rate of all non-skin cancers. Drug exposure data for USA by state and year was taken from the National Survey on Drug Use and Health (NSDUH) Restricted-Use Data Analysis System (RDAS) of the Substance Abuse and Mental Health Data Archive (SAMHDA) held by the Substance Abuse and Mental Health Services Administration (SAMHSA) 2003–2017 [45]. Thus the overlap period between the cancer and drug exposure datasets was 2003–2017, which therefore became the period of analysis. The parameters taken from this dataset were last month cigarettes, last year alcohol use disorder (AUD), last month cannabis, last year non-medical use of opioid analgesics (Analgesics) and last year cocaine. Quintiles of substance exposure were calculated annually and were numbered from one, the lowest quintile, to five, the highest exposure quintile. There were no unexposed groups. Median household income, ethnicity and population by state and year data was sourced directly from the US Census Bureau via the tidycensus package [46] in R and linear interpolation was used to complete missing years. The ethnic categories studied were Caucasian-American, African-American, Hispanic-American, Asian-American, American Indian / Alaska Native (AIAN) and Native Hawaiian / Pacific Islander (NHPI).
National cannabinoid concentration data across USA was taken from reports published by the US Drug Enforcement Administration (DEA) for the five cannabinoids Δ9-tetrahydrocannabinol (THC), cannabigerol (CBG), cannabichromene (CBC), cannabinol (CBN), and cannabidiol (CBD) [27,28,29]. National cannabinoid levels were multiplied by state level cannabis use to provide an estimate of state level exposure. Cannabinoid exposure quintiles were calculated over the whole period considered together. Age-adjusted case numbers were derived by multiplying the age-adjusted cancer rate in each state and year by the population of that state and dividing by 100,000.
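The derived quantities described in this section can be sketched as follows. This is a minimal Python illustration and not the R code used for the study: the function names and all numeric inputs are hypothetical, and the percentile rule used to cut quintiles (the "inclusive" method of Python's statistics module) is an assumption, since the exact rule is not stated here.

```python
from statistics import quantiles


def exposure_index(national_concentration, state_use_rate):
    """State-level cannabinoid exposure: national level x state cannabis use."""
    return national_concentration * state_use_rate


def quintile_labels(values):
    """Label each value 1 (lowest quintile) to 5 (highest quintile)."""
    # Four cut points at the 20th, 40th, 60th and 80th percentiles.
    cuts = quantiles(values, n=5, method="inclusive")
    return [1 + sum(v > c for c in cuts) for v in values]


def age_adjusted_cases(rate_per_100k, population):
    """Expected case count from an age-adjusted rate per 100,000 persons."""
    return rate_per_100k * population / 100_000


if __name__ == "__main__":
    # Hypothetical state-level monthly cannabis use rates (proportions).
    use_rates = [0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13]
    national_thc_pct = 12.0  # hypothetical national THC concentration (%)
    exposures = [exposure_index(national_thc_pct, u) for u in use_rates]
    print(quintile_labels(exposures))
    print(age_adjusted_cases(50.0, 2_000_000))
```

Because the exposure index is a simple product, ranking states within a single year by exposure is equivalent to ranking them by cannabis use; the national concentration matters only when quintiles are formed across the whole period.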

Statistical analysis

Data was processed in RStudio version 1.3.1093 (2009–2020) based upon R version 4.0.3 (2020-10-10). The Shapiro-Wilk test was used to guide log transformation of covariates where appropriate. Data manipulation was performed using the “dplyr” package in the “tidyverse” [47]. Maps and graphs were drawn in base R, “ggplot2” from the tidyverse [47, 49] and “sf” (simple features) [48]. Some colour palettes employed the viridis and plasma palettes taken from the package “viridis” [50] and several palettes were originally designed for this project. Bivariate maps were drawn using the two-way colour matrices of the “colorplaner” package [51]. Maps and graphs are all original and have not been published elsewhere. Rate ratios, attributable fractions in the exposed and population attributable risks (also known as attributable fractions in the population) were calculated using “epiR” version 2.0.11 developed by Professor Mark Stevenson and colleagues [52]. The anova test in base R was used for model comparison.
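The three epidemiological metrics can be illustrated from a simple two-group comparison. The sketch below is written in Python under stated assumptions and does not call “epiR”: the function name and the counts are hypothetical, the highest quintile is treated as exposed and the lowest as the reference group, and PAR is computed as the attributable fraction in the population formed by pooling the two groups.

```python
def quintile_contrast(cases_high, pop_high, cases_low, pop_low):
    """Rate ratio, AFE and PAR for a highest v lowest quintile comparison."""
    risk_high = cases_high / pop_high
    risk_low = cases_low / pop_low
    # Risk in the pooled population of the two quintiles.
    risk_all = (cases_high + cases_low) / (pop_high + pop_low)

    rate_ratio = risk_high / risk_low
    # Fraction of cases among the exposed attributable to the exposure.
    afe = (rate_ratio - 1) / rate_ratio
    # Fraction of cases in the pooled population attributable to the exposure.
    par = (risk_all - risk_low) / risk_all
    return {"RR": rate_ratio, "AFE": afe, "PAR": par}


if __name__ == "__main__":
    # Hypothetical counts: 200 v 100 cases per 100,000 persons.
    result = quintile_contrast(200, 100_000, 100, 100_000)
    print({name: round(value, 4) for name, value in result.items()})
```

The AFE depends only on the rate ratio, so a rate ratio of 2.166 implies an AFE of (2.166 − 1)/2.166, or about 53.8%, whereas the PAR also depends on how much of the pooled population falls in the exposed group.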

Regression models

Bivariate linear trends were computed with linear regression from R-Base.

Simultaneous multiple model analysis

Simultaneous multiple model analysis was conducted in the tidyverse package “purrr” [47] using tidy and glance from package “broom” [53] using established nest-map-unnest workflows. This methodology allows a long dataset covering many cancers to be analyzed in a single run.
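The nest-map-unnest pattern amounts to grouping the long dataset by cancer type, fitting the same model within each group and stacking the per-group summaries into one table. The following Python sketch shows that pattern with a hand-written least-squares line in place of the R workflow; the function names and the example rows are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_cancer(rows):
    """rows: iterable of (cancer, exposure, rate) records in long format."""
    # "Nest": collect the observations belonging to each cancer type.
    nested = defaultdict(lambda: ([], []))
    for cancer, exposure, rate in rows:
        nested[cancer][0].append(exposure)
        nested[cancer][1].append(rate)
    # "Map" the model over the groups and "unnest" into one flat table.
    table = []
    for cancer, (xs, ys) in sorted(nested.items()):
        slope, intercept = fit_line(xs, ys)
        table.append({"cancer": cancer, "slope": slope, "intercept": intercept})
    return table


if __name__ == "__main__":
    rows = [
        ("lung", 1.0, 3.0), ("lung", 2.0, 5.0), ("lung", 3.0, 7.0),
        ("melanoma", 1.0, 9.0), ("melanoma", 2.0, 8.0), ("melanoma", 3.0, 7.0),
    ]
    for row in fit_by_cancer(rows):
        print(row)
```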

Causal inference

E-values were computed using the R-package “EValue” [54] from count data [33,34,35]. Minimum E-Values above 1.25 are said to suggest causal relationships [33].

P < 0.05 was considered significant throughout.

Data availability

Data, including R-code, ipw weights and spatial weights have been made available through the Mendeley Data repository online and can be freely accessed at https://doi.org/10.17632/dt4jbz7vk4.1

Ethics

Ethical approval for this study from the University of Western Australia Human Research Ethics Committee was granted on 7th January 2020 with approval number RA/4/20/7724.

Results

The cancers upon which we chose to focus our attention were chosen because they were relatively common or because they involved tissues which had been implicated in the literature with cannabinoid activities. For this reason cancers of the male and female reproductive tract were well represented amongst the cancers chosen for the present study. The list in alphabetical order comprises: acute lymphoid leukaemia (ALL), acute myeloid leukaemia (AML), bladder, brain, breast, cervix, chronic lymphoid leukaemia (CLL), chronic myeloid leukaemia (CML), colorectum, oesophagus, Hodgkins lymphoma, Kaposi sarcoma, kidney, liver, lung, melanoma, multiple myeloma, Non-Hodgkins lymphoma, oropharynx, ovary, pancreas, penis, prostate, stomach, testis, thyroid and vulva and vagina combined. Based on 2017 data the 27 cancers chosen comprised 1,339,737 of the 1,670,227 cancers reported to state cancer registries in that year, or 80.21% of all non-melanoma non-skin cancers reported. In addition total non-skin cancer was also included in this list, making 28 cancer types in all.

A total of 19,877 age-adjusted cancer rates were retrieved from the SEER*Stat State NPCR database. The total age-adjusted number of cancers reviewed across the 28 cancer types was 51,623,922 and the total aggregated population across the period 2003–2017 was 124,896,418,350.

Other papers in this series analyze these data as continuous covariates [40] and present detailed analyses [41].

Bivariate categorical analysis

Figure 1 reports graphically a quintile analysis for all cancers for tobacco exposure. The progression by quintile is clearly demonstrated for lung cancer in the first panel and is also evident in different ways for the other tumours displayed.

Fig. 1 Relationship of selected cancer incidence to tobacco exposure rates by tobacco quintiles

Figures 2, 3 and 4 perform a similar function for all cancers by AUD, THC and cannabidiol exposure quintiles respectively.

Fig. 2 Relationship of selected cancer incidence to AUD exposure rates by AUD quintiles

Fig. 3 Relationship of selected cancer incidence to THC exposure rates by THC quintiles

Fig. 4 Relationship of selected cancer incidence to cannabidiol exposure rates by cannabidiol quintiles

Figure 5 is a series of boxplots comparing cancer incidence in the highest and lowest tobacco exposure quintiles by cancer type. It is ordered by the ratio of the highest to the lowest quintiles. Again lung and vulvovaginal cancers feature at the top of the list.

Fig. 5 Comparison of lowest and highest quintiles of tobacco exposure on various cancer rates

Figure 6 repeats this exercise for AUD exposure quintiles.

Fig. 6 Comparison of lowest and highest quintiles of AUD exposure on various cancer rates

Figure 7 does this for cannabidiol exposure quintiles.

Fig. 7 Comparison of lowest and highest quintiles of cannabidiol exposure on various cancer rates

Table 1 presents the quantitative data emerging from these graphs for the comparison of the highest and lowest tobacco exposure quintiles, using the age-adjusted rates and the state populations to calculate the expected numbers of cases. This procedure inherently corrects for the differing age structures, and therefore cancer predispositions, of the various state populations. For the highest tobacco-using states, aggregated over the whole 2003–2017 period, the Table lists the predicted numbers of cases and the numbers without cancer. It performs similar calculations for the lowest quintile states and presents the rate ratios (RR), the attributable fraction in the exposed (AFE), the population attributable risk (PAR), the applicable P-Value and the point estimates and minimum E-Values. In R, P < 2.2 × 10^-320 is the lower limit to which calculations go, so P < 2.2 × 10^-320 has been inserted in some cells to indicate such vanishingly low significance levels. One notes that 12 cancers in this Table have elevated E-Values. In particular lung, cervix, oropharynx, colorectal, female genital, esophagus, penis, all cancer, CML, kidney and bladder cancer are included on this list, all of which are known to be associated with tobacco smoking [55].

Table 1 Numbers, calculated rates, extreme values, significance and E-Values for tobacco

Table 2 performs a similar function comparing the highest and lowest THC exposure quintiles, with THC quintiles calculated over the whole exposure period in aggregate. Eleven cancers in this Table have elevated E-Values. Melanoma was the most highly significant in this series, with a rate ratio of 2.16 (95%C.I. 2.15, 2.18), attributable fraction in the exposed of 53.83% (53.54, 54.11%), population attributable risk of 36.13% (35.87, 36.40%), Chi Squ. = 63,311.55, P << 2.2 × 10^-320 and minimum E-Value of 3.73.

Table 2 Numbers, calculated rates, extreme values, significance and E-Values for THC

Table 3 performs a similar function for the upper and lower quintiles of cannabidiol exposure, with cannabidiol quintiles calculated over the whole exposure period considered together. Fifteen cancers in this Table have elevated E-Values. Prostate cancer is most strongly represented, with a rate ratio of 1.397 (95%C.I. 1.392, 1.402), attributable fraction in the exposed of 28.40% (28.14, 28.66%) and population attributable risk of 15.34% (15.17, 15.51%). Its Chi Squ. value was 32,606.52 at one degree of freedom, which corresponds to a P-Value << 2.2 × 10^-320. The minimum applicable E-Value was 2.13.

Table 3 Numbers, calculated rates, extreme values, significance and E-Values for cannabidiol

Figure 8 sets out the relevant rate ratios (which act like odds ratios for cohort studies) and their tight confidence intervals for cannabidiol exposure.

Fig. 8 Rate ratios of highest v lowest cannabidiol exposure quintiles calculated from age adjusted rates

Figure 9 sets out the attributable fractions in the exposed and their confidence intervals for cannabidiol exposure. They are noted to decline from almost 20%.

Fig. 9 Attributable fractions in the exposed of highest v lowest cannabidiol exposure quintiles calculated from age adjusted rates

Figure 10 sets out the population attributable risks for the highest and lowest quintiles of cannabidiol exposure.

Fig. 10 Population attributable risks of highest v lowest cannabidiol exposure quintiles calculated from age adjusted rates

Figure 11 illustrates graphically the applicable P-values for cancers where the risk posed from cannabidiol exposure was elevated and again compares the highest and lowest quintiles. The horizontal line indicates significance on this log scale. The graph may therefore be interpreted as showing illustratively those tumours with elevated P-values for the interquintile comparison.

Fig. 11 Log P-values ratios of highest v lowest cannabidiol exposure quintiles calculated from age adjusted rates

Figure 12 illustrates the applicable E-Values for these tumours. The horizontal line represents the threshold value of 1.25, which is described in the literature to be indicative of causality [33].

Fig. 12 Log E-values ratios of highest v lowest cannabidiol exposure quintiles calculated from age adjusted rates

Summary of bivariate calculations

Finally we turn again to some concluding calculations on the bivariate summary data presented earlier.

Table 4 shows the SEER*Stat derived total case numbers by cancer type for 2017, the final year of the present study. It also shows the attributable fraction in the exposed (AFE) and population attributable risk (PAR) for tobacco, THC and cannabidiol. All the data in the table is complete. The AFEs and PARs are taken from the comparisons listed in Tables 1, 2 and 3.

Table 4 Calculated attributable fraction in the exposed and population attributable risk and case numbers 2017
Table 5 AFE and PAR calculations by cancer type
Table 6 Summary Statistics

Table 5 shows this data again but includes only those tumours with positive AFEs. It also includes in the last row the applicable totals for the three substances under both AFE and PAR conditions. Clearly the PAR fraction is highly dependent on the penetration of the use of each substance into the community, a factor which is changing rapidly across the USA in relation to cannabinoids. In this respect it is obvious that the PAR for cannabinoids, to which access was until recently relatively restricted, is not properly comparable with that for tobacco and alcohol. That is to say, one cannot properly compare the PAR for licit and illicit substances without careful consideration of the impact of their differing legal statuses on their penetration into the community. It should be noted that the methodology adopted is extremely conservative since the attributable fraction of tobacco for lung cancer in reality is known to be 1.00 [32, 33]. However in the circumstances such an approach is equitable across all substances identified. The number of cases for total cancer has not been included in calculating the column totals, which as shown are 36,450 for tobacco PAR numbers and 48,510 for cannabidiol AFE numbers.

In any event, for clarity and even-handedness, the numbers derived from both metrics are presented finally in Table 6. Irrespective of the metric used one notes at once that the numbers of tumours which might be attributable to each substance under these conditions are significant. As mentioned these are clearly highly conservative estimates.

Discussion

Main results

When the highest and lowest exposure quintiles were compared 12, 11 and 15 cancers were noted to be elevated in the highest quintiles for tobacco, THC and cannabidiol exposure respectively. Based on 2017 numbers of total non-skin cancer cases (1,670,227) these positively associated cancers translate into an extra 93,860, 91,677 and 48,510 cases for the three substances on an AFE basis, representing 5.62, 5.49 and 2.90% of the total cancer case burden. Based on PAR rates these exposures indicate excess case burdens of 36,450, 55,780 and 14,819 cases, or 2.18, 3.34 and 0.89% respectively. Since cannabis access has until recently been relatively restricted it may be reasonable to compare the PAR rates for legal substances with the AFE rates of the restricted substances THC and cannabidiol, making the cannabinoids important community carcinogens alongside tobacco and alcohol at the population health level.
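The percentages quoted in this paragraph follow directly from the case numbers and the 2017 total, as the following short Python check illustrates. The variable and function names are hypothetical; the figures are those given in the text.

```python
TOTAL_CASES_2017 = 1_670_227  # total non-skin cancer cases, 2017

# Excess case numbers quoted in the text for each substance.
AFE_CASES = {"tobacco": 93_860, "THC": 91_677, "cannabidiol": 48_510}
PAR_CASES = {"tobacco": 36_450, "THC": 55_780, "cannabidiol": 14_819}


def share_pct(cases, total=TOTAL_CASES_2017):
    """Excess cases as a percentage of the total case burden."""
    return round(100 * cases / total, 2)


if __name__ == "__main__":
    for label, table in (("AFE", AFE_CASES), ("PAR", PAR_CASES)):
        for substance, cases in table.items():
            print(f"{label} {substance}: {share_pct(cases):.2f}%")
```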

Comparing the highest and lowest quintiles of THC exposure melanoma, thyroid, liver, AML, ALL, pancreas, myeloma, CML, breast, oropharynx and stomach cancer demonstrated elevated minimum E-Values from 3.72 to 1.08. Rate ratios for these tumours declined from 2.166 (95%C.I. 2.153, 2.180) to 1.016 (1.006, 1.026); AFE declined from 53.8% (53.5, 54.1%) to 1.60% (0.6 to 2.57%); and PAR declined from 36.1% (35.9, 36.4%) to 0.78% (0.30, 0.13%).

Comparing highest and lowest quintiles of cannabidiol exposure prostate, melanoma, Kaposi sarcoma, ovarian, bladder, colorectal, stomach, Hodgkins, esophagus, Non-Hodgkins lymphoma, All cancer, brain, lung, CLL and breast cancer demonstrated elevated minimum E-Values from 2.13 to 1.19. Rate ratios for these tumours declined from 1.397 (95%C.I. 1.392, 1.402) to 1.031 (1.028, 1.035); AFE declined from 28.40% (28.14, 28.66%) to 3.05% (2.74 to 3.37%); and PAR declined from 15.3% (15.1, 15.5%) to 1.42% (1.27, 1.57%).

These general relationships were confirmed with categorical analysis when highest and lowest exposure quintiles were compared. AML, breast, CML, liver, oropharynx, pancreas and thyroid cancers were significantly related to THC exposure when studied as both continuous and categorical variables [40]. All cancers, bladder, brain, breast, colorectal, esophagus, Hodgkins, lung, melanoma, ovary, prostate and stomach cancer were significantly related to cannabidiol exposure when studied both as continuous and categorical variables [40].

Interpretation

These data suggest that 23 cancers are epidemiologically associated with either THC or cannabidiol with minimum E-values in the same range as those for tobacco. These 23 cancers are: prostate, melanoma, Kaposi sarcoma, ovarian, bladder, colorectal, stomach, Hodgkins, esophagus, Non-Hodgkins lymphoma, All cancer, brain, lung, CLL, breast, thyroid, liver, AML, ALL, pancreas, myeloma, CML, oropharynx.

Based on the numbers of cancers implicated (11 and 15) THC and cannabidiol are as important community carcinogens as tobacco. Based on the case numbers involved THC and cannabidiol are confirmed to be important population health carcinogenic agents, particularly if one accepts that it is reasonable to compare the PAR rates for the legal substances with the AFE rates for the restricted substances, so that the PAR case numbers for tobacco of 36,450 are compared with the AFE numbers for THC and cannabidiol of 91,677 and 48,510. Further, since the E-values for the cannabinoids upon categorical analysis are in the same range as those for tobacco, the epidemiological strength of evidence for a causal relationship is substantially equivalent for the two groups of substances. As noted earlier in the continuous analysis study [40], the evidence for causality is actually stronger for cannabidiol and cannabichromene than for tobacco in that paradigm.

Mechanisms

The subject of cannabinoids and cancer is too large to be reviewed in detail here. This and related subjects have been described in several other publications to which the interested reader is referred [56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72]. Our intention here is merely to make some observations which are of particular interest and illustrate how all these seemingly disparate observations may present a coherent conceptual framework of cannabinoid-related carcinogenesis.

This section first considers the very large field of epigenomics, an area which is increasingly being implicated in the pathogenesis of many cancers and also in cannabinoid pathophysiology, and then considers some specific cancers which arise from the above epidemiological analyses. It is intended that this section be read in parallel with the mechanistic sections of the first and third papers in this series.

Overview of epigenetics

Since the genomic code is the same in each cell the fact that each cell is different implies that the way its complement of genes is used must be different. That is to say control of the available genes is central to cell specification and function. Indeed cell lineage determination is mainly determined by its epigenomic state. The epigenome also carries data on historical exposure to past major events recording neural, immune and metabolic memories [73,74,75,76,77,78,79,80]. Some of the major ways in which epigenomic information is encoded include DNA methylation, post-translational modifications of the tails of the histones around which DNA is wrapped, macro- and micro- RNA’s, position within the cell nucleus in relation to the nuclear membrane, proximity to transcriptional factories also called topologically active domains and whether the gene is subject to major silencing apparatus such as being heavily coated in the repressive machinery as occurs in heterochromatin and the inactivated X-chromosome which becomes the juxta-membrane Barr body. These and other layers of epigenomic machinery do not operate in isolation but are closely coordinated [79, 81, 82].

Epigenomic states including 3-D nuclear spatial organization are heritable across three to four generations [81, 83]. Many organs have been shown to be affected including brain, immunity, obesity, kidney, prostate, ovary and testis [65, 66, 71, 81, 84,85,86,87,88,89,90,91,92]. A variety of phenomena have been shown to be epigenetically inherited including stress, obesity, starvation, the fungicide vinclozolin, trauma, chemicals, tobacco, alcohol, opioids, cocaine, and cannabis [64,65,66, 71, 81, 84, 85, 93, 94].

DNA methylation

DNA Methylation is a primary mode of control of gene availability. The commonest pattern of aging is that genes become progressively methylated in their promoter regions and demethylated in the gene bodies. This has the overall effect of shutting down gene expression or changing the splice sites or isoforms of transcribed genes. This progressive decline in gene expression clearly fits well with the obvious steady decline in functions as organisms age. It has long been understood that the pattern of DNA methylation at the CpG islands of certain key marker genes can be used to determine an epigenetic age [95,96,97].

In a recent tour de force study from Harvard Aging lab, UCLA and other centres it was shown that reversal of this age-related promoter DNA hypermethylation could actually return the post-mitotic neural cells of the mouse retina to their newborn state and reverse their epigenetic age [98]. This was done by the intraocular delivery of Oct4, Sox2 and Klf4 (OSK), three of the four Yamanaka stem cell inductive factors. Myc was not used as it was not required and has been linked with cancer development. This epigenetic age reversion allowed the retinal ganglion cells to recover after a crush injury and to regenerate their axons, which were able to grow into the optic chiasm. The acceleration in epigenomic age induced by optic nerve crush injury was reversed by OSK administration, an effect which was dependent on the ten-eleven translocation (Tet) methylcytosine dioxygenases 1 and 2, which are known to initiate the DNA demethylation process [98]. Accelerated aging of human neurons induced by the chemotherapeutic drug vincristine was similarly reversed by OSK treatment. Murine retinal ganglion cells were also able to regrow and recover after the intraocular hypertension of glaucoma, including with restoration of impaired sight, a recovery which does not naturally occur. The investigators were also able to reverse the aging of advanced mouse retinae, restore the transcriptome to a youthful state and improve sight [98]. Epigenomic gene analysis showed that the most affected genes were special targets of Polycomb Repressive Complex 2 (PRC2) and its histone methyltransferase product, trimethylated lysine 27 of histone 3 (H3K27me3). This bioinformatic approach demonstrates that DNA methylation is not only a hallmark and biomarker of aging but also a key cause of the multi-level changes which are known to accompany the aging process.

Cannabis has also been shown to greatly perturb the cellular DNA methylation profile and patterns of both hyper- and hypo- DNA methylation are described with hypomethylation being predominant [64, 65, 71, 84, 85, 93]. Such findings suggest that cannabis exposure may also directly and causally impact the epigenomic aging machinery as has been demonstrated clinically in longitudinal studies [99].

Since aging is the leading risk factor for most adult cancers this would in turn imply a powerful effect widespread across the genome which predisposes towards malignant transformation.

Histone reduction and modifications

DNA inside cells does not usually occur as long threads but is coiled twice around two sets of four histone proteins which together form a histone octamer with a frequency of around 147 base pairs to form a unit known as a nucleosome. The four histones involved are H2A, H2B, H3 and H4 and two copies each comprise each octamer. This arrangement allows tight packing of DNA and also control over its availability for transcription. Post-translational modifications on the tails of these histones, particularly H3 and H4, control the spacing of the nucleosomes and thus the accessibility of the genes to the transcription machinery.

It was shown by Mon long ago that cannabinoids including THC and cannabinol reduce the synthesis of histones H1, H2A, H2B, H3 and H4 including their acetylated derivatives which make genes available for transcription [100].

If fewer histones are available for nucleosome casing of DNA it follows that DNA must be less constrained and must necessarily adopt a more open configuration in which it is more accessible to the transcription machinery. This is known to constitute a pro-oncogenic state, as stem cell, cell survival and anti-apoptotic genes usually get the upper hand in such situations, creating a survival advantage and apoptosis resistance and conferring enhanced clonal replicative capacity. As described below in the discussion on non-Hodgkin's lymphoma this has been well demonstrated directly for H1 and several of its isoforms.

Proteins

As catalogued [101] cannabinoids inhibit the synthesis of many proteins. Two of the most important are histones and tubulin which have been discussed above.

Bioenergetic Epigenomics

Mitochondria are small subcellular organelles within the cytoplasm of almost all human cells which are known as the “cell's powerhouse” as they generate most of the cell's energy by oxidative phosphorylation. They also perform several other functions, including roles in cell replication and cell death by apoptosis, antioxidant defence by glutathione maintenance, protection of DNA, and assistance with pH and calcium balance and with electrochemical integrity [102].

Mitochondria also carry a full complement of the cannabinoid signalling system. Hence CB1Rs occur in their outer membrane, and the intermembrane space and inner mitochondrial membrane actually carry all the machinery necessary to receive and transduce downstream cannabinoid signals [103,104,105,106,107,108,109,110]. It is important to appreciate that as bioactive lipids cannabinoid molecules can pass through lipid-rich biomembranes readily and transmit signals to intracellular sites [105, 108, 111, 112]. In general the action of cannabinoids on mitochondria is inhibitory [105, 108, 111, 112].

Since many reactions involving DNA are energy dependent, the continued healthy supply of energy as ATP to the nucleus has major implications for the maintenance of genomic integrity [102].

Mitochondria are involved in epigenomic pathways both directly, through the supply of small chemical moieties for post-translational modifications such as activated phosphate, acetate, methyl, succinate, fumarate, palmitoyl, myristoyl and nitrosyl groups, and also via coordinated cross-talk and communication channels with the nucleus [113]. Since the mitochondrial DNA codes for many of the mitochondrial proteins, and some are also encoded in the nuclear DNA, expression of the two genomes clearly needs to be coordinated. This is achieved via at least three molecular shuttles involving malate-aspartate, nicotinamide adenine mononucleotide and glyceraldehyde-3-phosphate [113]. For these reasons close relationships between cellular metabolic state and epigenomic systems are well documented and increasingly appreciated as being of importance [74, 78, 113, 114].

Interactions with specific pathways

Interactions between cannabinoids and many morphogenic pathways have been described. Most of these have been previously implicated in cancer development and malignant transformation. They are discussed further in a companion manuscript [41].

Cannabinoids have been shown to interact with sonic hedgehog [20], fibroblast growth factor (FGF) [115, 116], including transactivation of the FGF1R by CB1R [117], bone morphogenetic proteins [118,119,120], retinoic acid signalling [121,122,123], notch signalling [124,125,126,127,128] (which is heavily involved in colorectal cancer), Wnt signalling [129,130,131,132,133,134] and the hippo pathway [64].

Generalizability

Our results are likely to be widely generalizable for several reasons. The results presented are internally very consistent, both with each other and with much known evidence external to this study. The concordance of the results for tobacco with those in many other sources is strongly confirmatory both for the tobacco analyses and for the cannabinoid analyses, which employ similar methodology [55, 135,136,137,138,139]. The cancer data used are derived from census samples from all US states. The drug exposure data are taken from a well authenticated and widely studied nationally representative survey which has been operating on an annual basis for several decades. The bivariate analysis is at once conceptually simple yet very powerful, particularly when paired with E-Value calculations. One of the major result outputs from the present study was E-Values, which are a major pillar of causal inference. It was very noteworthy that the E-Values seen for the cannabinoids were of the same order as those for tobacco. We note that the large US dataset represents an ideal context within which to address the present concerns. In that the present results demonstrate causal relationships we are confident that they could be widely reproduced, and note that in nations where cannabis use is more widespread we would expect the findings to be more dramatic if the extant data sources are of sufficient quality and currency to properly document the link.
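For readers unfamiliar with E-Values, the calculation can be illustrated with a minimal sketch. It assumes the standard formula for a risk ratio, E = RR + sqrt(RR x (RR - 1)), with protective estimates inverted first. The function names are hypothetical and this is not the code used in the present study. The worked figures use the testicular cancer relative risk quoted in the Background (2.59, 95% C.I. 1.60 to 4.19) purely as an illustration.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio point estimate (hypothetical helper name)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates (RR < 1) are inverted so the formula applies.
    if rr < 1.0:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(lower: float, upper: float) -> float:
    """Minimum E-value: the formula applied to the confidence limit nearest the null."""
    if lower <= 1.0 <= upper:
        return 1.0  # the interval includes the null, so no confounding is needed
    limit = lower if lower > 1.0 else upper
    return e_value(limit)


# Illustration with the relative risk quoted in the Background section.
point = e_value(2.59)               # about 4.62
minimum = e_value_for_ci(1.60, 4.19)  # about 2.58
print(round(point, 2), round(minimum, 2))
```

On this reading an unmeasured confounder would need to be associated with both exposure and outcome by a risk ratio of about 4.6 to fully explain away the point estimate, and by about 2.6 to move the confidence interval to include the null.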

Strengths and limitations

This study has a number of strengths.

A large national cancer census dataset was used. Age-adjusted rates derived from CDC, SEER and NCI were accessed and employed. The drug dataset was taken from a large, well validated, nationally representative dataset. The bivariate statistics were straightforward yet, when harnessing the power of E-values, they were powerful and enabled us to assess causality directly. These studies were internally and externally consistent with known data both on tobacco-related cancer and on cannabis-related cancer and aetiopathogenesis. Panelled graphs enabled the simultaneous display of results for direct comparison across many different cancer types.

Individual level participant data was not available to this study, in common with most epidemiological studies. State-level cannabinoid exposure was estimated as described, as state-level data itself was not directly available to the present investigators. Another issue of considerable interest is the possible role of synthetic cannabinoids as genotoxins. In the absence of spatiotemporal data on this issue we are unable to comment on this increasingly important matter. However several lines of evidence suggest that they are likely to be implicated. Several recent studies implicate many cannabinoids in genotoxic activities [16, 17, 22, 23, 39, 93, 140,141,142,143]. Long ago the genotoxic action was found to reside in the polycyclic olivetol nucleus of the cannabinoids, with little modulation by the various side chains [144]. Several other studies also implicate synthetic cannabinoids in genotoxicity [145,146,147,148,149,150,151]. Overall therefore we feel that this is a fertile and important area for further laboratory-based investigation and epidemiological surveillance.

Furthermore this was also an ecological study. It may therefore be seen as potentially being susceptible to the shortcomings typical of ecological studies, including the ecological fallacy and selection and information biases. Within the present paper we have carefully addressed such issues with the use of inverse probability weighting in all mixed effects, robust and panel regression models, which transforms the analytical paradigm from merely an observational study into a pseudorandomized one from which it is entirely appropriate to draw causal inferences. We have also employed E-values widely in many Tables. Therefore these principal tools of quantitative causal analysis have been widely deployed in the present analyses. The issue of causality is further addressed by the detailed pathophysiological mechanisms which have been described above and by mention of other countries where many of the same findings have been made. We therefore feel that we have taken all reasonable steps to minimize observational and ecological shortcomings for prostatic and ovarian cancers and in doing so have demonstrated in a pathfinding way the manner in which such analyses may be extended to other tumours and indeed to other disorders.
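The principle of inverse probability weighting can be sketched in a deliberately simplified form. The example below assumes a binary exposure and a propensity score (the probability of exposure given covariates) that has already been estimated; the regression models in the present analyses use continuous exposures and are more elaborate, so this is an illustration of the idea only. All function names and data are hypothetical.

```python
def ipw_weights(exposed, propensity):
    """Weight 1/p for exposed units and 1/(1 - p) for unexposed units."""
    weights = []
    for a, p in zip(exposed, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if a else 1.0 / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ipw_mean_difference(outcome, exposed, propensity):
    """Difference in weighted mean outcome, exposed minus unexposed."""
    w = ipw_weights(exposed, propensity)
    y1 = [y for y, a in zip(outcome, exposed) if a]
    w1 = [wi for wi, a in zip(w, exposed) if a]
    y0 = [y for y, a in zip(outcome, exposed) if not a]
    w0 = [wi for wi, a in zip(w, exposed) if not a]
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)


# Hypothetical toy data: four units, two exposed, all with propensity 0.5.
diff = ipw_mean_difference([1.0, 3.0, 2.0, 4.0], [1, 1, 0, 0], [0.5] * 4)
print(diff)
```

Units that were unlikely to receive the exposure they actually received are weighted up, so that the weighted exposed and unexposed groups resemble each other on the measured covariates. With a constant propensity, as in the toy data, the weighted difference reduces to the simple difference in means.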

Conclusion

In conclusion this overview of 28 selected cancers showed strong bivariate evidence that THC and cannabidiol were associated with multiple cancers. All cancer incidence was associated with cannabidiol exposure. Breast cancer, the commonest cancer, was associated with tobacco, THC and cannabidiol exposure. Eleven cancers were associated with THC and 15 with cannabidiol, and together these two cannabinoids alone accounted for 23/28 cancers. The strength of association as measured by the minimum E-Values was equivalent to that from tobacco. The results for tobacco were closely concordant with multiple reports and CDC data, an important finding which not only confirms the analysis in relation to tobacco but also confirms the methodology employed for the cannabinoid analyses. The finding that THC AFEs declined from 53.8% (53.5, 54.1%) and cannabidiol AFEs declined from 28.40% (28.14, 28.66%) is very concerning indeed as more people across the globe are exposed to cannabinoids and as cannabinoids increasingly make their way into the food chain of USA, Canada, Europe and Australia amongst many other nations. This is particularly so given the well documented pseudo-exponential relationship of the cannabis genotoxic dose response curve documented both in the laboratory and epidemiologically [41]. The evidence presented strongly implies that the generally benign view with which cannabis and cannabinoids are considered is not supported by the weight of extant epidemiological evidence relating to genotoxicity and carcinogenicity, which in fact is most concerning indeed. The present data is further supported by results presented in the continuous data analyses and more detailed multivariable adjusted causal models in companion and related papers [16, 17, 22, 23, 40, 41, 62, 93, 142, 143, 152,153,154,155].
The clear implication from the present work and its accompanying reports [40, 41] is that community penetration of cannabinoids should be carefully restricted, not only as a matter of public health and safety, including importantly the integrity of the food chain, but also as a non-negotiable investment in the genomic health and onco-protection of multiple coming generations, in a manner precisely analogous to that of all other seriously genotoxic agents. Particular concerns relate to the movement of increasing sections of the community into higher dose ranges of cumulative cannabinoid exposure in the context of exponentiation of genotoxic dose-responses, which has now been convincingly demonstrated both in the laboratory and in epidemiological studies of human populations.