Background

Aside from testicular carcinogenesis, the relationship of cannabis use to cancer incidence is controversial, with both positive [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19] and negative [4, 20, 21] case–control studies being well described [17, 22,23,24,25]. Cannabis use has been associated with cancer of the head and neck [1,2,3], lung [4,5,6], larynx [7], prostate [7], testes [8,9,10,11], cervix [7], brain [12] and urothelial tracts [13,14,15]. Some investigators have described evidence of positive dose–response relationships [1, 3, 4, 6]. Several paediatric cancers have been found to be elevated following in utero exposure, including childhood neuroblastoma [16], rhabdomyosarcoma [17] and leukaemia, particularly non-lymphoblastic leukaemia [18], which provides clinical evidence of inheritable mutagenicity and carcinogenicity [26, 27].

Cannabis use is increasing internationally [28] and is associated with rising cannabinoid concentrations [29,30,31], increasing intensity of near daily use [32] and prolonged storage of cannabinoids in adipose reservoirs and gonads [33,34,35,36,37,38]. Laboratory studies have long indicated that cannabinoid genotoxicity is more significant at higher dose exposures and follows a pseudo-exponential genotoxic dose–response curve [39,40,41,42,43,44]. Together these observations imply a pressing need to apply novel and innovative epidemiological methodologies to the investigation of this important issue at the level of population health.

The case of testicular carcinogenesis is at once interesting, important and instructive. Since it is well established that the testicular germ cell niche begins its oncogenic transformation during in utero life [45,46,47] and the mean age of testicular carcinoma incidence is 34 years [48] this implies a protracted period of subclinical transformation of the germ cell epithelium. Since cannabis exposure in adolescent and young adult life is known to increase the incidence of testicular cancer [8,9,10,11, 49, 50] by an average of 2.59-fold (95%C.I. 1.60–4.19) in meta-analysis [50] it follows that cannabis exposure dramatically accelerates testicular carcinogenesis by 2.4-fold from 34 to around 12 years [51]. The molecular pathogenesis of testicular carcinogenic transformation involves key oncogenic steps including whole genome reduplication, loss of arms of dozens of chromosomes, widespread genome demethylation and functional or structural reduplication events on chromosome 12. Since cannabis accelerates this pathway so markedly it follows that it must have diverse major genotoxic effects in the human testicular germ cell niche. Since the testis houses the male germ cell epithelium the possibility remains open that such major genotoxic damage may be passed on to subsequent generations through the male germ line.

Indeed the recent demonstration that cannabis is causally linked with several birth defects including trisomies of chromosomes 21, 18, and 13 and deletion 22q11.2 [52], along with paediatric acute lymphoid leukaemia [53] (which also involves damage to chromosome 12), directly implicates cannabis exposure in damage to 528 MB of the human genome, representing 17.6% of its 3,000 MB total length [52]. One recent study has also provided epidemiological evidence that cannabis exposure is causally linked to breast, thyroid, pancreatic and liver cancer along with acute myeloid leukaemia in USA [52]. The link with paediatric acute non-lymphoid leukaemia has previously been demonstrated by earlier researchers [18, 54]. Since acute lymphoid leukaemia is the commonest cancer of childhood, it follows that cannabis may also be an important cause and driver of rising rates of paediatric cancer, and this has also recently been demonstrated [55].

Some widely quoted earlier negative case–control studies suffered major methodological limitations such as the deletion of individuals who accumulated a high lifetime cannabis exposure from the analysis [21] which, given what has been learned since that time, amounts to a virtual amputation of the signal of interest.

Cannabidiol is of particular interest and concern as it is widely promoted in the culture for a myriad of medical complaints, since it is not hallucinogenic and is said not to be psychoactive. However it is widely recommended for the relief of anxiety and it is likely that it is acting at the cannabinoid type 1 receptor (CB1R), where it has been shown to bind after high dose exposure [56,57,58,59,60]. Moreover it was shown long ago that many cannabinoids including Δ9-tetrahydrocannabinol, cannabidiol and cannabinol are genotoxic [38], and indeed the genotoxic moiety was demonstrated to be the polycyclic central ring shared by all cannabinoids, known as olivetol [36]. Cannabidiol and many other cannabinoids inhibit mitochondrial function directly through the mitochondrial cannabinoid signalling system, by reactive oxygen species generation and through uncoupling protein 2 [61,62,63,64,65,66,67]. Mitochondria are a key regulator of epigenetic function via both indirect regulatory pathways and substrate supply [68]. With the major popular focus on cannabidiol relating to its alleged lack of psychoactive potency, the far-reaching implications of its known genotoxicity and epigenotoxicity have been essentially overlooked.

The broad issue then of the relationship between cannabis and cancer incidence must be regarded as an open question. A formal detailed epidemiological exploration of this large issue is necessarily complex so the matter has been broken into a series of three papers to aid with presentation and to assist reader understanding. The present paper considers cancer and drug exposures as continuous variables. It is followed by a second paper which examines these covariates as categorical variables which allows data dichotomization in various ways and the calculation of key parameters of interest such as attributable fraction in the exposed and population attributable risk and thus allows the derivation of national case numbers affected [69]. Finally two cannabidiol-related cancers are considered in detail as a demonstration of the manner in which advanced statistical methods can be deployed to investigate these questions [70]. Prostate and ovarian cancer were chosen for these examples as their relationship with cannabidiol was amongst the strongest and their role in the reproductive tract may portend transgenerational impacts and these are the subject of the third paper in this series [70]. It is important that all three papers be read collectively to appreciate the depth and the inter-relatedness and therefore the power of the evidence implicating cannabinoid genotoxicity with an important role in epidemiological cancerogenicity.

Methods

Data

Age-adjusted cancer rates by state, year and cancer type were taken from the Surveillance, Epidemiology and End Results (SEER) database from the Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, and the National Cancer Institute (NCI), and from the National Program of Cancer Registries (NPCR) and SEER Incidence US Cancer Statistics Public Use Database 2019 submission covering years 2001–2017, using the SEER*Stat software [71]. The focus of this study was 28 of the most common cancers (as listed below). This includes the category all non-skin cancer (called All Cancer in this report). This was joined with drug use cross-tabulation data across USA by state and year from the National Survey on Drug Use and Health (NSDUH) Restricted-Use Data Analysis System (RDAS) of the Substance Abuse and Mental Health Data Archive (SAMHDA) held by the Substance Abuse and Mental Health Services Administration (SAMHSA) for 2003–2017 [72]. Thus the overlap period between the cancer and drug exposure datasets was 2003–2017, which therefore became the period of analysis. The variables of interest were last month cigarettes, last year alcohol use disorder (AUD), last month cannabis, last year non-medical use of opioid analgesics (Analgesics) and last year cocaine. Quintiles of substance exposure were calculated for each year, numbered from one, the lowest exposure quintile, to five, the highest exposure quintile. Data on median household income, ethnicity and population by state and year was sourced directly from the US Census Bureau via the tidycensus package [73] in R, including linear interpolation for missing years. The ethnicities of interest were Caucasian-American, African-American, Hispanic-American, Asian-American, American Indian / Alaska Native (AIAN) and Native Hawaiian / Pacific Islander (NHPI).
Data on cannabinoid concentration across USA was taken from reports published by the US Drug Enforcement Administration (DEA) for the five cannabinoids Δ9-tetrahydrocannabinol (THC), cannabigerol (CBG), cannabichromene (CBC), cannabinol (CBN), and cannabidiol (CBD) [29,30,31]. These concentrations were multiplied by state level cannabis use to provide an estimate of state level exposure. Quintiles of cannabinoid exposure were calculated on the whole period considered in aggregate. These are used particularly in Part 2. Age-adjusted case numbers were derived by multiplying the age-adjusted cancer rate in each state and year by the population of that state and dividing it by 10,000.
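
The exposure, quintile and case-number calculations described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function names and the rank-based quintile rule are assumptions made here for illustration, and the divisor of 10,000 simply follows the text.

```python
from typing import List


def cannabinoid_exposure(cannabis_use_rate: float, concentration: float) -> float:
    """State-level exposure estimate: cannabis use rate times cannabinoid concentration."""
    return cannabis_use_rate * concentration


def assign_quintiles(values: List[float]) -> List[int]:
    """Assign quintiles 1 (lowest exposure) to 5 (highest) by rank.

    The rank-based rule is an assumption; the source does not state how
    ties or uneven group sizes were handled.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quintiles = [0] * n
    for rank, idx in enumerate(order):
        quintiles[idx] = rank * 5 // n + 1
    return quintiles


def age_adjusted_cases(rate: float, population: float, per: float = 10_000) -> float:
    """Case numbers from an age-adjusted rate and a population.

    The default divisor of 10,000 is taken from the text; change `per`
    if the rates are expressed on another base.
    """
    return rate * population / per
```

Because the quintile helper takes a plain list, it can be called once per year for the substance exposures or once on the pooled 2003–2017 values for the cannabinoid exposures, matching the two schemes described in the text.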

Statistical analysis

Data was processed in R-Studio version 1.3.1093 (2009–2020) based upon R version 4.0.3 (2020-10-10). Covariates were log transformed guided by the Shapiro-Wilk test. Data was manipulated using the “dplyr” package in the “tidyverse” [74]. Graphs were drawn in ggplot2 from tidyverse [74, 75] and maps and graphs were drawn in R-Base, ggplot2 and “sf” (simple features) [76]. Some graphs employed the viridis and plasma palettes taken from the package “viridis” [77] and several palettes were originally designed for this project. Bivariate maps were drawn using colorplaner two way colour matrices [78]. All maps and graphs are original and have not been previously published.

Regression models

Bivariate linear trends were computed with linear regression from R-Base.

Simultaneous multiple model analysis

This was conducted in the tidyverse package “purrr” [74] using tidy and glance from package “broom” [79] using established nest-map-unnest workflows. In this way a whole long dataset providing data on many cancers could be analyzed in a single analysis run at one time.
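
The nest-map-unnest idea can be sketched outside R as follows. This is a minimal pure-Python analogue, not the authors' purrr and broom code: the long dataset is grouped by cancer (nest), one ordinary least squares line is fitted per group (map), and the per-cancer summaries are returned as one flat list (unnest). The function names and the row layout `(cancer, exposure, rate)` are assumptions for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fit_line(x: List[float], y: List[float]) -> Dict[str, float]:
    """Ordinary least squares fit of y on x, returning slope statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / (n - 2))      # residual standard deviation
    std_error = sigma / math.sqrt(sxx)    # standard error of the slope
    t_stat = slope / std_error if std_error > 0 else math.copysign(math.inf, slope)
    return {"slope": slope, "std_error": std_error, "t": t_stat, "sigma": sigma}


def fit_by_cancer(rows: Iterable[Tuple[str, float, float]]) -> List[Dict[str, object]]:
    """Group long-format rows by cancer, fit one model each, flatten the results."""
    nested: Dict[str, Tuple[List[float], List[float]]] = defaultdict(lambda: ([], []))
    for cancer, exposure, rate in rows:           # nest
        nested[cancer][0].append(exposure)
        nested[cancer][1].append(rate)
    results = []
    for cancer, (x, y) in nested.items():         # map
        results.append({"cancer": cancer, **fit_line(x, y)})
    # unnest: one flat row per cancer, ordered by descending slope
    return sorted(results, key=lambda r: r["slope"], reverse=True)
```

The sketch stops at the t statistic; converting it to a P-value needs a Student's t distribution function, which the Python standard library does not provide.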

Causal inference

E-values quantify the degree of association that some unknown extraneous confounder variable would need to have with both the exposure of concern and the outcome of interest in order to explain away an apparently causal effect. They therefore provide a quantitative estimate of the degree to which the model is formally complete or remains subject to extraneous explanations from unidentified confounding covariates. They are a foundational pillar for formal quantitative causal inferential methods. E-value estimates above nine are said to be high [80] and a threshold of 1.25 is typically quoted as being required of potentially causal effects [81]. E-values were computed using the R-package “EValue” [82] from regression equations using the parameter estimate, its standard error and the standard deviation of the model [81, 83, 84].
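
As an illustration of how an E-value can be obtained from a regression coefficient, the sketch below applies the published risk-ratio formula E = RR + sqrt(RR × (RR − 1)). To convert a linear regression estimate it uses the standardized-effect approximation RR ≈ exp(0.91 × d), with confidence limits exp(0.91 × d ± 1.78 × se_d), as described by VanderWeele and Ding. It is a hedged Python sketch: the function names are hypothetical, and it is an assumption that this reproduces what the “EValue” package computed for the authors.

```python
import math
from typing import Tuple


def e_value(rr: float) -> float:
    """E-value for a risk ratio; ratios below 1 are inverted first."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


def e_values_from_ols(estimate: float, std_error: float, sd: float) -> Tuple[float, float]:
    """Return (point E-value, minimum E-value) for a linear regression slope.

    estimate  : regression coefficient
    std_error : its standard error
    sd        : standard deviation used to standardize the effect
    The constants 0.91 and 1.78 come from the approximation named above.
    """
    d = estimate / sd
    se_d = std_error / sd
    rr = math.exp(0.91 * d)
    lower = math.exp(0.91 * d - 1.78 * se_d)
    upper = math.exp(0.91 * d + 1.78 * se_d)
    point = e_value(rr)
    if lower <= 1 <= upper:
        minimum = 1.0              # interval includes the null
    elif rr > 1:
        minimum = e_value(lower)   # confidence limit closest to the null
    else:
        minimum = e_value(upper)
    return point, minimum
```

The second returned value corresponds to what the text calls the minimum E-value, that is, the E-value of the confidence limit nearest the null.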

P < 0.05 was considered significant throughout.

Data availability

Data, including R-code, inverse probability weights and spatial weights, have been made freely available through the Mendeley Data repository online and can be accessed at http://dx.doi.org/10.17632/dt4jbz7vk4.1

Ethics

Ethical approval for this study was granted by the University of Western Australia Human Research Ethics Committee on 7th January 2020 (approval number RA/4/20/7724).

Results

The cancers upon which we focused our attention were chosen because they were relatively common or because they involved tissues which had been implicated in the literature with cannabinoid activities. For this reason cancers of the male and female reproductive tract were well represented amongst the cancers chosen for study. The list in alphabetical order includes: acute lymphoid leukaemia (ALL), acute myeloid leukaemia (AML), bladder, brain, breast, cervix, chronic lymphoid leukaemia (CLL), chronic myeloid leukaemia (CML), colorectum, oesophagus, Hodgkin’s lymphoma, Kaposi sarcoma, kidney, liver, lung, melanoma, multiple myeloma, Non-Hodgkin’s lymphoma, oropharynx, ovary, pancreas, penis, prostate, stomach, testis, thyroid and vulva and vagina combined. Based on 2017 data the 27 cancers chosen comprised 1,339,737 of the 1,670,227 cancers reported to state cancer registries in that year, or 80.21% of all non-melanoma non-skin cancers reported. In addition total non-skin cancer was also included in this list, making 28 cancer types in all.

19,877 age-adjusted cancer rates were retrieved from the SEER*Stat State NPCR database. The total age-adjusted number of cancers reviewed across the 28 cancer types was 51,623,922 and the total aggregated population across the period 2003–2017 was 124,896,418,350.

Other papers in this series consider categorical [69] and detailed analyses [70] respectively.

Bivariate continuous analysis

Figure 1 shows the time trend for the age-adjusted incidence rate for more common cancers (panel A), less common cancers (panel B) and rare cancers (panel C) derived from the CDC SEER*Stat database.

Fig. 1 Time trends of A common, B intermediate frequency and C rarer cancers in USA 2003–2017

The NSDUH survey reports a national response rate of 74.1% [85]. Figure 2 shows the time trend for five substances of interest. One notes that cannabis alone shows a strong upward trend whilst the rate of the other substances is falling or in the case of cocaine, variable and at a low level.

Fig. 2 Trends in various substance use rates at state level across USA 2003–2017

Fig. 3 shows the rate for the state based estimates of cannabinoid exposure calculated as described above.

Fig. 3 State level cannabinoid exposure estimates across USA 2003–2017

Fig. 4 shows a progression of the incidence rates of 28 cancers of interest, including all cancers, against tobacco exposure. The panels of the graph are ordered by the slope of the cancer:tobacco regression line. The first 9 cancers are seen to be rising in association with increasing tobacco exposure. The fastest rising cancer is lung cancer, a relationship which is of course well known. Recovery of this established finding supports the technical utility of the technique and indicates that its extension to other substances would also be of interest and worth. One notes that the top line of the graph also includes cervical cancer, all cancer and vulvovaginal cancer. Bladder, oropharyngeal and esophageal cancer, which are well established as tobacco-related tumours, appear in the second line of the graph.

Fig. 4 Incidence of 28 cancer types by tobacco exposure across USA

Fig. 5 presents the relationship of the various tumour incidences to AUD exposure. Esophageal and all cancer are noted to demonstrate positive relationships which are confirmed in the published literature [86].

Fig. 5 Incidence of 28 cancer types by Alcohol Use Disorder incidence across USA

Fig. 6 presents the relationship of the various tumours of interest to THC exposure. 14 tumours are noted to demonstrate a positive relationship as shown in the top two lines.

Fig. 6 Incidence of 28 cancer types by estimates of State level Δ9-Tetrahydrocannabinol exposure across USA

Fig. 7 presents the relationship of the various tumours to estimated cannabidiol exposure. 13 cancers are noted to demonstrate a positive relationship, the two most strongly related being prostate and ovarian cancer.

Fig. 7 Incidence of 28 cancer types by estimates of State level Cannabidiol exposure across USA

Using the techniques for multiple model simultaneous analysis in the R packages purrr and broom it is possible to analyze the slopes of these regression lines for the multiple cancers simultaneously by substance type. The full results of this analysis are shown in Supplementary Table 1 (Excel Sheetname “ST1 Subs Slopes All Canc's”), which lists the slope of the regression line (as the Student’s t statistic) and the P-value for the significance of the relationship, together with various model parameters and their applicable E-Values, for all 28 tumours. The table is ordered in terms of descending minimum E-Value. The most significant of these results are shown in Table 1. Table 1 is ordered both by minimum E-value and by substance. One notes that thyroid and liver cancer are the top two associations for cannabis use (P-values 2.35 × 10⁻²⁴ and 4.82 × 10⁻²⁰ and minimum E-Values 2.74 × 10⁴ and 7.96 × 10³ respectively), with breast, bladder and pancreatic cancers and AML also featuring significantly.

Table 1 Significant Linear Regression Models by Substance

Supplementary Table 2 (Excel Sheetname “ST2 Cannbd Slopes All Cancs”) performs a similar function for cannabinoid exposure as Supplementary Table 1. Again the most significant results from this table are extracted as Table 2. 13 of the 44 cancers listed in this Table demonstrate a significant relationship to cannabidiol exposure. The counts for the other cannabinoids are THC = 9, cannabinol = 9, cannabigerol = 7 and cannabichromene = 6. The most tightly related cancers to cannabidiol exposure are prostate, bladder, ovary and all cancers, which have P-values ranging from 6.87 × 10⁻²⁰ to 2.23 × 10⁻⁴¹ and minimum E-Values ranging from 1.43 × 10¹¹ to 2.34 × 10¹⁸.

Table 2 Significant Linear Regression Models by Cannabinoid

Table 3 presents the slopes of the regression lines as the Student’s-t value for each of the substances for the cancers listed in descending order of cannabis slope (as the t-statistic).

Table 3 Linear Regression Line Slopes as Student’s t Value by Substance Ordered by Slope of Cannabis Regression Line

Table 4 performs a similar function for the cannabinoids listed in descending order of the cannabidiol slope (as the t-statistic).

Table 4 Linear Regression Line Slopes by Cannabinoid Ordered by Slope of Cannabidiol Regression Line

As noted above, Tables 1, 2, 3 and 4 present data for all cancers and all rates. Table 5 takes the logs of the cannabis exposure rate and the cannabinoid exposure rates (as indicated by the Shapiro-Wilk test) and regresses them against the cancer rates for each tumour (using the broom-purrr workflow sequence on the dataset in long format). The table concentrates on those tumours with positive and significant regression line slopes. The results are intriguing. Only four tumours, namely ALL, CML, myeloma and testicular cancer, do not appear in this table, which is remarkable in itself. If one considers this Table in the light of Figs. 1, 2 and 3 one notes that those cancers with falling incidence correlate significantly with those substances whose use is falling. For this reason it is apparent from the Table that cigarettes, AUD and cannabidiol are grouped together in one cluster whilst all the other cannabinoids, whose exposure is rising, group together in another cluster. The tumours which correlate most tightly with cannabidiol exposure are prostate, ovary, bladder, colorectal and total cancers. Cannabidiol therefore is highly associated with the commonest tumours, namely all non-skin cancers, breast, lung and prostate cancer. Interestingly breast cancer correlates with cocaine, cannabis and all the cannabinoids. The substance most associated with cancer types in this table is tobacco (14 tumour types) followed by cannabidiol (12 tumour types) followed by AUD (9 tumour types).

Table 5 Summary of Significant Regression Line Positive Slopes by Cancer, Substance and P-Value. Calculations Utilize Logarithm of Rates of Cannabis and Cannabinoid Exposure

Table 6 extracts the results from Supplementary Table 1 for cigarette exposure. Cancers are again listed in descending order of the minimum E-Value. One notes that the list is headed by lung, cervical, colorectal, All cancer and vulvovaginal cancers which seems correct. Fourteen cancers are noted to be significantly related and all have minimum E-Values > 1.70.

Table 6 Summary of Tobacco Regression Line Slopes by Cancer and E-Value

Table 7 performs a similar function for last year AUD exposure. Nine cancers are significantly related on this Table and also demonstrate elevated minimum E-Values.

Table 7 Summary of Alcohol Use Disorder Regression Line Slopes by Cancer and E-Value

Table 8 performs a similar role for cannabis exposure. Here six tumours are significantly related with P-values less than 6.0 × 10⁻⁵ and minimum E-Values greater than 19.0. The cancers of interest are, in order, thyroid, liver, breast, bladder, pancreas and AML.

Table 8 Summary of Cannabis Regression Lines Slopes by Cancer and E-Value

Table 9 performs a similar function for THC exposure. Positive findings in this table occur for nine tumours which are in order thyroid, liver, pancreas, AML, breast, oropharynx, chronic myeloid leukaemia (CML), testis and kidney. Eight cancers have minimum E-Values > 1.30. If one performs this exercise with the logarithm of THC exposure myeloma, melanoma and ALL also become significant.

Table 9 Summary of Δ9-Tetrahydrocannabinol Regression Lines Slopes by Cancer and E-Value

Table 10 performs a similar function for cannabidiol. Here twelve cancers are implicated including, in order, prostate, bladder, ovary, All Cancers, colorectal, Hodgkin’s, brain, lung, Non-Hodgkin’s lymphoma, esophagus, breast and stomach cancers. In this series of tumours the lowest minimum E-Value is 30.11.

Table 10 Summary of Cannabidiol Regression Lines Slopes by Cancer and E-Value

To facilitate conceptual comparison of this mass of data Fig. 8 presents graphically the minimum applicable E-Values for these cancers by substance exposure for those cancers where a finite minimum E-Value is reported. Tumours are ordered by descending E-Value. One notes the log scale on the ordinate axis, which ranges up to 10²⁰. The scale is held constant across all substances to facilitate direct comparison between substances both in this graph and on the following graph. The largest minimum E-Values for tobacco, AUD, cannabis, analgesics, and cocaine are 1.76 × 10⁹, 4.67 × 10⁸, 2.74 × 10⁴, 4.76 × 10⁴ and 1.29 × 10¹¹ respectively (see also Supplementary Tables 3 and 4, Excel Sheetnames “ST3 Analgesic Slopes” and “ST4 Cocaine Slopes”).

Fig. 8 Comparative Minimum E-values regression models tumour incidence against various substances

Fig. 9 presents similar data for the minimum E-Values by cannabinoid exposure. The scale is held constant for consistency with the preceding graph, using a log scale with a maximum of 10²⁰. The most striking feature of this graph is that the minimum E-Values for cannabigerol, cannabichromene and cannabidiol dominate the graph, and are also much higher than those shown on the preceding graph which included tobacco and AUD exposure. The most dramatic minimum E-Values of all of those considered thus far relate to cannabidiol. The largest minimum E-Values for THC, cannabigerol, cannabinol, cannabichromene and cannabidiol are 4.72, 5.05 × 10⁹, 1.91 × 10⁷, 2.74 × 10¹⁷ and 2.34 × 10¹⁸ respectively (see also Supplementary Tables 5, 6 and 7; Excel Sheetnames “ST5 Cannabinol Slopes”, “ST6 CBC Slopes” and “ST7 Cannabigerol Slopes”).

Fig. 9 Comparative Minimum E-values regression models tumour incidence against estimates of various cannabinoid exposures

Fig. 10 summarizes these E-Value graphs by illustrating, as a bar graph, the cumulative exponents of the E-Values for each substance. This is a simple way of integrating the area under the E-Value curve apparent for each substance. In reality any summary statistic could have been chosen for comparison (e.g. median, interquartile range, range etc.) but it was felt that use of the sum had the major advantage of integrating the area underneath the E-value curve and therefore most closely quantifying the key parameter of interest. From this graph it is clear that, for the cancers selected, the areas under the curve for cannabidiol and cannabichromene (103 and 58) are considerably larger than those for tobacco and AUD (34 and 32). Other values for the graph are shown in Table 11.
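
The cumulative exponent summary can be illustrated with a short Python sketch. The source does not state how each exponent was extracted, so taking the floor of the base-10 logarithm of each finite minimum E-Value is an assumption made here for illustration, as is the function name.

```python
import math
from typing import Iterable


def cumulative_exponent(min_e_values: Iterable[float]) -> int:
    """Sum the base-10 exponents of the finite minimum E-values.

    Each exponent is floor(log10(E)), so 2.74e4 contributes 4 and
    4.72 contributes 0. Non-finite values and values below 1 are skipped.
    """
    total = 0
    for value in min_e_values:
        if not math.isfinite(value) or value < 1:
            continue
        total += math.floor(math.log10(value))
    return total
```

Under this rule a substance whose largest minimum E-Value is below 10, such as the 4.72 reported for THC, contributes a summed exponent of 0.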

Fig. 10 Comparative cumulative sum of the regression model minimum E-value exponents by substance

Table 11 Cumulative E-Values Exponents of Regression Lines by Substance

Discussion

Main results

The main result of this survey and overview is that strong continuous bivariate relationships are noted between the incidence of many cancers and cannabinoids including cannabidiol, to an extent comparable to and indeed exceeding that seen with tobacco and alcohol. Whilst positive regression slopes were seen for 9 and 13 cancers for tobacco and AUD exposure, the applicable numbers for cannabis, THC, cannabidiol, cannabigerol, cannabinol and cannabichromene exposure were 15, 14, 13, 13, 15 and 15 cancers respectively (Tables 3 and 4). Elevated minimum E-Values occurred for tobacco and alcohol exposure for 14 and 9 cancers and for 6, 9, 12, 6, 9 and 7 cancers in association with cannabis, THC, cannabidiol, cannabichromene, cannabinol and cannabigerol exposure (Tables 6, 7, 8, 9 and 10 and Supplementary Tables 3, 4, 5, 6 and 7). Compared to tobacco and AUD exposure, which have largest minimum E-Values of 1.76 × 10⁹ and 4.67 × 10⁸, the largest minimum E-Values for exposure to THC, cannabigerol, cannabinol, cannabichromene and cannabidiol were 4.72, 5.05 × 10⁹, 1.90 × 10⁷, 2.74 × 10¹⁷ and 2.34 × 10¹⁸ respectively (Figs. 9 and 10 and above cited Tables). The summed exponents of the minimum E-Values for cannabidiol and cannabichromene were 103 and 58 compared to 34 and 32 for tobacco and AUD. The summed exponents for cannabigerol, cannabinol, cannabis and THC were 31, 25, 13 and 0 respectively. These results are in close concordance with the results reported in an accompanying report for the categorical analysis [69].

Causality in the bivariate results was implied by the high E-Values documented.

Hence the present findings argue strongly for the significance of cannabis and cannabinoids as serious bona fide carcinogens in the US environment. These findings are strengthened by results in accompanying reports [69, 70] showing that the reported bivariate changes are robust to adjustment, fulfil quantitative epidemiological criteria for causality, and for prostate and ovarian cancer demonstrate a supra-linear sigmoidal dose–response relationship with carcinogenic outcomes, so that rising doses of cannabinoid exposure generate disproportionate cancerogenic outcomes.

Interpretation

Some of these findings are particularly noteworthy. All cancers as a group were noted to rise with both tobacco exposure and with cannabidiol exposure. It is concerning that another major carcinogen appears to have been identified, which at present is being consumed virtually without restriction in many parts of USA, Canada and elsewhere.

It is also concerning that, at least as judged by the area under the E-Value curves, cannabidiol and cannabichromene (cumulative minimal E-Value exponents of 103 and 58) were shown to be more powerful environmental carcinogens than tobacco and alcohol (34 and 32).

From the findings with AML (present report and [87]) and other pediatric cannabis-related tumours [12, 18,19,20, 54, 88,89,90] real concerns exist that widespread cannabinoid exposure may lead to a multigenerational epidemic of cancer. This is supported by recent US history with the rate of all childhood cancer rising 49% and the rate of acute lymphoid leukaemia, the commonest cancer of childhood, rising 94% in the period 1975 to 2018 [48]. This view is closely concordant with a recent report describing cannabis exposure as a primary driver of USA pediatric cancers [55] and of the commonest cancer of childhood acute lymphoid leukaemia [53]. From the very clear findings with testicular cancer it would appear that the usual course of oncogenesis in some stem cell niches may be greatly accelerated [51].

Given the rising level of cannabinoid exposure in the US community, its entry into the food chain seems inevitable. Indeed in some states such as Kentucky and Mississippi this already appears to be occurring [91]. One of the pressing needs in the field therefore is for the development of reliable biomarkers possibly derived from epigenomic or glycomic metrics so that cannabinoid exposure can be quantified formally and analyzed as a continuous variable as has been previously suggested [92]. This would greatly improve epidemiology and surveillance in this field, reduce the numbers required and increase the geospatial precision with which temporal trends can be mapped and surveilled.

Higher precision geotemporospatial mapping of cancer trends is also indicated where such data is available.

Causal assignment

E-values have been used extensively in the present report. In the literature E-Values greater than 1.25 are said to be linked with causality [81]. It is worth noting that the minimum E-Value for the association between tobacco smoke and lung cancer is 9. This places the greatly elevated E-Values highlighted in this report in a proper context. The methodology employed here has also been validated en passant in that many tobacco-related cancers including lung, colorectum, all cancer, vulva and vagina, penis, bladder, oropharynx and esophagus, were correctly identified as such by the methodology adopted.

The findings relating to causal analysis in the present report are further strengthened by the accompanying categorical analysis and the detailed presentation of inverse probability weighted regression models and geospatiotemporal modelling in accompanying reports [69, 70].

Specific cancers

Breast

Breast cancer is the commonest cancer. It was noted to be linked with cannabis, THC, cannabidiol, cannabinol, cannabichromene and cannabigerol. It would seem to be a major public health concern that the commonest cancer is linked with an increasingly common environmental exposure.

Bladder and prostate

It is interesting that bladder and prostate cancer are linked with cannabis and cannabidiol exposure, as bladder cancer has previously been linked with tobacco smoking. In the case of tobacco the causative action is believed to be the prolonged time tobacco-derived carcinogens spend in contact with the transitional epithelium of the bladder [86]. It is known that many of the carcinogens of tobacco are also found in cannabis smoke. Accumulation of urinary CBD and THC metabolites over ten days of cannabis consumption has been documented [93]. It may be that the association of bladder cancer with cannabidiol documented above rests on a similar mechanistic basis. Presumably similar actions are in play in relation to cannabis and cannabidiol urocarcinogenesis.

Testis

Testicular cancer has been linked with cannabis use by all four prior studies to have examined this issue [8,9,10,11]. It was seen in the present work to be linked with THC exposure but not cannabidiol exposure. The involvement of THC with testicular cancer is a cause for concern for two reasons. First, the testis contains a germinal epithelium, so genotoxic changes there could well be passed on to subsequent generations. Secondly, the extensive literature on the pathogenesis of testicular cancer states repeatedly and emphatically that testicular cancer is thought to arise from changes which occur in utero and which then become manifest due to the hormonal surge of puberty [18, 54]. Since what is being witnessed at present is that cannabis use is being reflected relatively quickly in higher rates of testicular cancer, this necessarily involves a profound telescoping and contraction of the usual pathogenic pathways of testicular cancer from several decades to several years. This implies that, at least in the testis, THC must be acting as a powerful carcinogen indeed. One notes indeed that since 1975 the age-adjusted rate of testicular cancer across USA has doubled [48]. It may also be a hint that relatively abrupt mechanisms such as those mentioned above in relation to myeloid malignancies may also be operating in this germ cell context [94].

Ovary

As was shown, ovarian cancer was also linked with cannabidiol exposure. This is also concerning, as the ovary contains the female germinal epithelium. These findings imply that both male and female germinal epithelia are subject to cannabis-induced genotoxicity and / or epigenotoxicity and carcinogenesis. The prospect of offspring who have been subject to mutagenic and potentially teratogenic actions in both parental gonads is very concerning indeed, particularly as it is well established that epigenomic changes are heritable for multiple generations [95, 96].

Liver

Liver cancer was one of the two cancers most affected by cannabis, THC, cannabinol, cannabigerol and cannabichromene exposure. This is a provocative finding as cannabis has previously been linked with exacerbating liver inflammation and inducing cirrhosis, especially in patients with other risk factors for hepatic disease [97]. This is consistent with the frequently pro-inflammatory action of cannabinoids binding at cannabinoid type 1 receptors (CB1R). This finding implies that THC and its related cannabinoids are linked not only with hepatic proinflammatory processes but also with persistent chemical hepatitis to the point of neoplasia.

The endocannabinoid anandamide, along with its natural receptor the cannabinoid type 1 receptor (CB1R), is known to be normally involved in hepatic lipogenesis, insulin resistance and glucose intolerance. Both are strongly upregulated during normal liver regeneration following partial hepatectomy or major liver injury and confer on the liver a remarkable degree of regenerative capacity [98, 99]. Anandamide (AEA) stimulates CB1R synthesis which further stimulates AEA release in an autoinductive loop typical of tumour promoting growth factors [99]. CB1R also stimulates multiple oncogenic pathways including the key master transcription factor Forkhead Box M1 (FOXM1) [99]. FOXM1 stimulates indoleamine 2,3 dioxygenase (IDO2) which stimulates T-reg cells which are immunosuppressive and induce tumour tolerance. CB1R also interacts directly with the pro-oncogenic Growth Factor Receptor Bound Protein 2 (GRB2) [100,101,102] and stimulates its interactome which signals activation to many oncogenic nuclear genes including RAS [99]. CB1R and IDO2 also stimulate angiogenesis and the ingrowth of new vessels to the developing tumour [99].

Whilst it is noted that cannabinoids have both tumour stimulatory and tumour inhibitory actions, it is also pointed out that the tumour stimulatory actions occur at nanomolar concentrations close to the dissociation constants of cannabinoids whilst the tumour suppressive actions occur at much higher micromolar concentrations [98, 99].

Cannabidiol alone was also found to induce liver hypertrophy even at the low concentration of 17 μM in a recent study [103].

Since the liver is a major metabolic organ and controls the central metabolic milieu, and since its inflammatory state is a key regulator of many metabolic pathways both in the liver and systemically in immune and other cells, this implies that hepatic inflammation is linked with a dysmetabolic state systemically throughout the organism. This dysmetabolic and systemic proinflammatory state is itself known to be linked with pro-aging processes including oncogenesis [104,105,106,107,108]. Moreover the oxidative action of cannabidiol and related cannabinoids on DNA bases is greatly increased in oxidizing environments such as cellular inflammation [43]. Inflammation is known to increase the activity of retrotransposon repeats, which are endogenous to the human genome, and so makes the “jumping genes” jump [109, 110]. This increases genomic instability and has been linked with tumour invasiveness, growth rate and metastasis [109,110,111]. Some of the genomic material spills into the cytoplasm where it stimulates innate immunity directly via the cyclic GMP-AMP synthase and stimulator of interferon genes (cGAS-STING) pathway [111,112,113,114,115]. These processes thus set up positive feedback loops as inflammation causes increased mutation and genomic destabilization which stimulates further inflammation [99, 116]. This positive feedback loop between inflammation and genomic instability may be a key driver of the many case series reporting a link between early and high dose cannabis use and the development of aggressive highly metastatic tumours in patients of younger ages [117,118,119,120]. Complex interplays have been demonstrated between metabolic state, immunophenotype, immune cell differentiation, epigenetic state and tumourigenesis [121].

Non-Hodgkin’s lymphoma

Histone 1 (H1)

It was shown as long ago as 1981 that THC and cannabinol inhibit H1, H2a and H2b histone synthesis by 50% after acute administration to cultured cells [122]. Importantly these investigators also showed that the acetylated forms of these histones, which in general are permissive for gene transcription, were similarly reduced to 50–60%.

Whilst most of the histones are core proteins at the centre of the histone octamer, H1 is a linker protein which sits like a clasp or clamp near the entry and exit of the DNA strand to hold the whole assembly together [123, 124]. H1 undergoes many post-translational modifications including phosphorylation, acetylation, methylation, citrullination, ubiquitylation, formylation, denitration, ADP-ribosylation, crotonylation, and lysine 2-hydroxyisobutyrylation, many of which have functional significance [124].

H1 also interacts powerfully with the genome repressive machinery, particularly Polycomb Repressive Complex 2, to recruit genome repression and to help form heterochromatin which is transcriptionally inactive [125]. Hence H1 is a key determinant of gene silencing [126]. Indeed its knockdown has been shown to lead to hyperactivation of B-cells in the germinal centres (GC) of lymph nodes, where loss of H1 makes genes more available for transcription, increases the activation of stem cell genes, provides fitness and self-renewal advantages to GC B-cells, and thereby launches aggressive B-cell neoplasias. Indeed H1 mutations have been identified in over 80% of B-cell lymphomas [127] and found to be mainly loss-of-function mutations. Data indicated that H1 is an absolute requirement to sequester genes in transcriptionally inactive compartments (the B-nuclear compartment). B-cells are thought to be particularly sensitive to this action as hypermutation is a normal part of their repertoire following activation, being the mechanism by which they produce antigenic variation in their B-cell receptors.

That is to say that cannabinoid exposure partly phenocopies genetic allelic ablation of H1. Together with the other genetic, epigenetic, chromosomal and metabolic effects of cannabinoids outlined above, these effects may explain the presence of the signal for Non-Hodgkin’s lymphoma in the present epidemiological analysis.

AML and ALL

AML has previously been linked with parental cannabis consumption [18, 54]. In the present study AML was found to be elevated with cannabis exposure and AML and CML were noted to be elevated by THC exposure. ALL is mainly a pediatric cancer and it has been linked to inherited genotoxicity [26, 27]. It was recently shown epidemiologically to be causally linked to environmental cannabis exposure [53]. The increase of the ALL incidence rate by cannabis and THC necessarily implies transgenerational teratogenesis, mutagenesis and oncogenesis. This issue was further heightened by a recent report noting that cannabis consumption is a major driver of the 50% rise in total pediatric cancer in the USA since 1970 [55]. This is a grave concern indeed as it indicates not only heritable mutagenesis but heritable carcinogenesis. The number of generations for which such inheritance can continue has not been defined at the time of writing. It is believed however that epigenetic changes can be inherited for three to four generations [95, 96] which translates to about the next one hundred years. Moreover it was recently shown that myeloid malignancies can be suddenly oncogenically transformed by relatively abrupt clonal sweeps due to specific genotoxic stressors where a minor clone collects extra mutations which suddenly sweep it into clonal dominance and drive overall tumourigenesis and malignant behaviour [94].

It was shown recently that myeloid malignancies tend to collect epimutations which often affect epigenomic signalling genes [94]. Clones with the most advantageous collection of mutations out-perform others and can become dominant within the tumour, a phenomenon which can happen either spontaneously or as a result of treatment-imposed tumour stress. Moreover this can happen relatively abruptly in what has been referred to as “clonal sweeps” across the tumour [94].

Mechanisms

The cellular and molecular mechanisms underlying the epidemiological relationships outlined in the above analyses are explored further in the second and third papers in the present series.

Generalization

We feel that our results are widely generalizable for several reasons. As noted above they are internally very consistent both with each other and with much known evidence external to this study. The cancer data used are derived from census samples from all US states. The drug exposure data is taken from a well authenticated and widely studied nationally representative survey which has been operating for several decades. The bivariate analysis is at once conceptually simple yet very powerful especially when paired with E-Value calculations. For prostate and ovarian cancer bivariate results were verified by further causal regression and space–time modelling which confirmed the bivariate results and demonstrated overall robustness to multivariable adjustment. One of the major result outputs from the present study was E-Values which are a major pillar of causal inference. We are of the view that the large US dataset represents an ideal context within which to address the present concerns. In that the present results demonstrate causal relationships we are confident that they could be widely reproduced with the sole caveat that in nations where cannabis use is more widespread we would expect the findings to be more dramatic provided that the data collection systems are sufficiently accurate.
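As an illustration of the E-Value calculations referred to above, the following is a minimal sketch in Python, not the code used in the present study. It assumes the standard closed-form E-value for a risk ratio, E = RR + sqrt(RR × (RR − 1)), applied to the point estimate and to the confidence limit closest to the null. The function names are hypothetical, and the example inputs are the meta-analytic estimate quoted in the Background (2.59, 95%C.I. 1.60–4.19), treated here as a risk ratio purely for illustration.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio (hypothetical helper name).

    Ratios below 1 are inverted first so that the same formula applies
    to protective and harmful associations.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci(rr: float, lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval includes the null value of 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    limit = lower if rr > 1.0 else upper
    return e_value(limit)


if __name__ == "__main__":
    # Illustrative inputs only: the estimate quoted in the Background,
    # assumed here to behave as a risk ratio.
    print("E-value (estimate):", round(e_value(2.59), 2))
    print("E-value (lower limit):", round(e_value_ci(2.59, 1.60, 4.19), 2))
```

The E-value is read as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the exposure and the outcome in order to fully explain away the observed association.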

Strengths and limitations

It is important that this study be read in parallel with the other two papers in this series [69, 70]. This study has several strengths. We used a large national cancer census dataset. Age adjusted rates derived from CDC, SEER and NCI were employed. The drug dataset was from a large well validated nationally representative dataset. The bivariate statistics were straightforward yet, when combined with E-values, were powerful enough to address causality directly. These studies were internally and externally consistent with known data both on tobacco-related cancer and on cannabis-related cancer. Panelled graphs were used to allow the simultaneous display of results for direct comparison across many cancers. Together with other papers in the present series [69, 70] the present report strongly indicts population level cannabinoid exposure in cancer aetiopathogenesis.

In common with most epidemiological studies this study did not have available to it individual level participant data. State-level cannabinoid exposure had to be estimated as described, as state level data itself was not directly available to the present investigators. This study is an epidemiological study and thus is not able to prove with formal experimental rigor the causal nature of the relationships indicated from these studies at the level of population health. However these results do indicate the need for detailed mechanistic studies in many cell lines and tumour models. Another issue of considerable interest is the possible role of synthetic cannabinoids as genotoxins. In the absence of spatiotemporal data on this issue we are unable to comment on this increasingly important matter. However several lines of evidence suggest that they are likely to be implicated. Several recent studies implicate many cannabinoids in genotoxic activities [37, 38, 43, 51,52,53, 55, 87, 91, 128,129,130]. Long ago the genotoxic action was found to reside in the polycyclic olivetol nucleus of the cannabinoids with little modulation by the various side chains [36]. Several other studies also implicate synthetic cannabinoids in genotoxicity [131,132,133,134,135,136,137]. Overall therefore we feel that this is a fertile and important area for further laboratory based investigation and epidemiological surveillance.

Furthermore this was also an ecological study. It is therefore potentially susceptible to the shortcomings typical of ecological studies including the ecological fallacy and selection and information biases. Within the present paper we began to address these issues with the use of E-values in all Tables. This issue is further addressed by the detailed pathophysiological mechanisms which have been described above, by mention of other countries where many of the same findings have been made, and with the use of inverse probability weighting in multiple regression models and further extensive application of E-values in Parts 2 and 3 of the present series of papers.

Conclusion

In conclusion this overview of 28 selected cancers showed strong bivariate evidence that cannabis and several cannabinoids were associated with multiple cancers. All cancer incidence was associated with cannabidiol exposure. Breast cancer, the commonest cancer, was associated with tobacco, cannabis, THC, cannabidiol, cannabinol, cannabichromene and cannabigerol exposure. The pediatric cancer AML was linked with THC exposure. This is also presumptive evidence of transgenerational transmission of oncogenesis. Testicular cancer, previously linked with cannabis exposure, was found to be linked with THC exposure. THC greatly accelerates the course of testicular carcinogenesis by several decades. The area under the cumulative exponential E-Value curve for tobacco, AUD, cannabis, THC, cannabidiol, cannabichromene, cannabinol and cannabigerol was 34, 32, 13, 0, 103, 58, 25 and 31 respectively, indicating that of the substances studied cannabidiol appears to be most strongly implicated in environmental carcinogenesis. The clear implication from this work and its accompanying reports [69, 70], including the suggested extensions, is that community penetration of cannabinoids should be carefully restricted. This is not only a matter of public health and safety, including importantly the integrity of the food chain, but also a non-negotiable investment in the genomic health and onco-protection of multiple coming generations, in a manner precisely analogous to that applied to all other seriously genotoxic agents. Particular concerns relate to the movement of increasing sections of the community into higher dose ranges of cumulative cannabinoid exposure in the context of exponentiation of genotoxic dose-responses, which has now been convincingly demonstrated both in the laboratory and in epidemiological studies of human populations.