Cancer in patients admitted to hospital with diabetes mellitus aged 30 years and over: record linkage studies
The aim of this study was to determine the risk of cancer in people admitted to hospital for diabetes mellitus when aged 30 or older.
This study involved the analysis of two statistical datasets of linked hospital and mortality data, in an area in southern England, between 1963 and 1998 (the Oxford Record Linkage Study, ORLS1) and between 1999 and 2008 (ORLS2). Rates of cancer in the diabetes cohorts were compared with rates of cancer in reference cohorts and expressed as rate ratios.
The rate ratio for all cancer in people admitted to hospital with diabetes was 1.01 (95% CI 0.95–1.06, based on 15,898 people with diabetes) for the years 1963–1998; and 1.09 (1.00–1.19, based on 7,771 people with diabetes) in the years 1999–2008. In both datasets, there were significantly high rate ratios for cancers of the liver (ORLS1 and ORLS2, respectively, 2.0 [95% CI 1.4–2.9]; 2.5 [95% CI 1.3–4.3]), pancreas (2.2 [95% CI 1.8–2.7]; 3.5 [95% CI 2.5–4.8]) and uterus (1.5 [95% CI 1.0–2.2]; 2.6 [95% CI 1.4–4.5]). There were significantly low rate ratios for cancer of the prostate (0.6 [95% CI 0.5–0.7]; 0.7 [95% CI 0.5–0.9]) and non-melanoma skin cancer (0.6 [95% CI 0.5–0.8]; 0.8 [95% CI 0.6–0.96]).
Diabetes mellitus was associated with an elevated risk of some site-specific cancers and a reduction of risk of others. Considering the risk in diabetes of all cancers combined, the elevation of risk, if any, is likely to be small and numerically less important than other known complications of diabetes.
Keywords: Cancer, Diabetes mellitus, Record linkage, Type 2
Abbreviations
HES: Hospital episode statistics
ICD: International Classification of Disease
NHS: National Health Service
OPCS: Office of Population, Censuses and Surveys
ORLS: Oxford Record Linkage Study
Interest in cancer risk in people with diabetes mellitus has focused, in particular, on an increased risk of pancreatic cancer. Meta-analyses of studies of pancreatic cancer in type 2 diabetes have shown an association [1, 2], as has a recent record linkage study from Sweden. It is unclear whether diabetes is a risk factor for pancreatic cancer or a consequence of it. Meta-analyses, of varying quality, have been published on diabetes and cancers of the colorectum, breast, prostate, endometrium, bladder and non-Hodgkin’s lymphoma. Elevated risks of colorectal, liver [10, 11] and kidney cancers have been reported, and there is evidence of an inverse association with prostate cancer [3, 6]. An important outstanding prognostic question, for people with diabetes mellitus, is whether diabetes is associated with an elevated risk of cancer, overall, as well as with an increased risk of some specific cancers and a reduced risk of others. There is also current interest in cancer risk in diabetes because therapeutic agents used in its treatment, including metformin, insulin and sulfonylureas, have been implicated in altering cancer risk [12, 13, 14, 15].
We have used the long-standing Oxford Record Linkage Study (ORLS) (1963–1998) and the Oxford subset of the more recent English national linked Hospital episode statistics (HES) (1999–2008) to investigate associations between diabetes mellitus in people aged 30 years and over and cancer overall, and between diabetes mellitus and individual cancers. These datasets provide the means to study large numbers of different cancers in the same population, using the same methodology, over a long period of time.
Population and data
The ORLS [16, 17] includes brief statistical abstracts of records of all hospital admissions (including day cases) in National Health Service (NHS) hospitals, and all deaths regardless of where they occurred, in defined populations within the former Oxford NHS region. The ORLS covered the years 1963 to 1998, and is referred to here as ORLS1. The hospital data were collected routinely in the NHS as the region’s hospital statistics system and were similar to English national HES. The data for each individual were linked together, as they accrued, and are now anonymised and archived. A second dataset, referred to as ORLS2, has been linked and built as the Oxford regional subset of the English national HES system (version v08a), and runs from 1999 to March 2008. The data items available for linkage changed between 1998 and 1999 and the two datasets are not themselves linked together. The death data derive from death certificates. The population covered by the ORLS gradually expanded over time from an initial population of part of Oxfordshire (approximately 300,000 resident population) to all four counties of the former Oxford NHS region (resident population 2.5 million).
A cohort of people with an admission, or record of day care, for diabetes when aged 30 or older was constructed for those with diabetes mellitus recorded as the main reason for hospital care, by identifying the first admission, or episode of day case care, for the condition in an NHS hospital during 1963–1998 (ORLS1) or 1999–2008 (ORLS2). The International Classification of Disease (ICD) codes used for diabetes were 260 in ICD-7, 250 in ICD-8 and ICD-9 (www.icd9data.com/2007/Volume1/240-279/250-259/250/default.htm), and E10-E14 in ICD-10 (www.who.int/classifications/icd/en/). A comparison cohort was constructed by identifying the first admission for each individual with various medical and surgical conditions (listed in table footnotes) but excluding those with a record of diabetes. This is based on a ‘reference’ group of conditions that has been used in other studies of associations between non-malignant diseases and cancer [17, 18, 19, 20, 21, 22]. We followed the standard epidemiological practice, when hospital controls are used, of selecting a diverse range of conditions, rather than relying on a narrow range (in case the latter are themselves atypical in their risk of subsequent disease). People were included in the diabetes or reference cohort if they did not have an admission for cancer either before or at the same time as the admission for diabetes mellitus or the reference condition. We then searched the database for any subsequent NHS hospital care for, or death from, cancer in these cohorts. We considered that rates of cancer in the reference cohort would approximate those in the general population of the region while allowing for migration in and out of it (data on migration of individuals were not available). Type of diabetes was not reliably recorded in routine hospital statistics; we therefore used an age cut-off of ‘30 or older’ at first recorded admission to reduce the proportion of the cohort that will have had type 1 diabetes. 
We chose 30 as our age cut-off as the only method available to us to minimise the inclusion of people with type 1 diabetes; it should be noted, nonetheless, that about 10% of adult diabetics have type 1 diabetes and our study population is likely to be similarly composed.
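The eligibility rules described above can be sketched in code. This is a minimal illustration rather than the study's own SAS programs: the function names, the argument layout and the handling of decimal points in codes are assumptions made for the example. Only the ICD rubrics (260 in ICD-7, 250 in ICD-8 and ICD-9, E10-E14 in ICD-10), the age cut-off of 30 and the exclusion of prior or concurrent cancer come from the text.

```python
import re

# Three-character diabetes rubrics by ICD revision, as listed in the text.
DIABETES_3CHAR = {7: {"260"}, 8: {"250"}, 9: {"250"}}


def is_diabetes_code(code: str, revision: int) -> bool:
    """Return True if `code` falls under the diabetes rubric of the given ICD revision."""
    # Assumption: codes may arrive with or without a decimal point (e.g. "250.0").
    code = code.strip().upper().replace(".", "")
    if revision == 10:
        # E10-E14, optionally followed by a fourth character (e.g. E11.9).
        return re.fullmatch(r"E1[0-4]\d?", code) is not None
    return code[:3] in DIABETES_3CHAR.get(revision, set())


def eligible_for_diabetes_cohort(code: str, revision: int,
                                 age_at_admission: int,
                                 prior_or_concurrent_cancer: bool) -> bool:
    """Hypothetical eligibility check for a first admission with diabetes as main reason."""
    return (
        is_diabetes_code(code, revision)
        and age_at_admission >= 30
        and not prior_or_concurrent_cancer
    )
```

For example, `eligible_for_diabetes_cohort("E11.9", 10, 62, False)` is true, whereas the same record with an age of 25 at first admission, or with an earlier cancer admission, is excluded.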
Ethical approval for analysis of the record linkage study data was obtained from the Central and South Bristol Multi-Centre Research Ethics Committee (04/Q2006/176).
We calculated rates of each cancer based on person-years at risk. For each broad age group of people with diabetes, and for each individual cancer site, we took ‘date of entry’ into each cohort as the date of first admission for diabetes, or reference condition, and ‘date of exit’ as the date of first record of the cancer, death, or the end of the data file (31 December 1998 for ORLS1, 31 March 2008 for ORLS2), whichever was the earliest. In comparing the diabetes cohort with the reference cohort, we first calculated rates for each cancer in the diabetes and reference cohorts combined, as the standard cohort, stratified by age (in 5-year age groups), sex, calendar year of first recorded admission, and district of residence. We then applied the stratum-specific rates in the combined diabetes and reference cohorts to the number of people in each stratum in the diabetes cohort, separately, and then to those in the reference cohort, and calculated stratum-specific standardised rates in each cohort. We calculated the ratio of the standardised rate of occurrence of the cancer in the diabetes cohort relative to that in the reference cohort. The confidence interval for the rate ratio and χ² statistics for its significance were calculated as described elsewhere, and further details of our methods have also been described elsewhere. Analyses were run using a suite of programs developed in-house using SAS 9 software (SAS Institute, Cary, NC, USA).
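The calculation described above can be sketched as follows. This is one reading of the method, written in Python rather than the in-house SAS programs, and it makes several assumptions: person-years (not counts of people) are used as the stratum denominators, the standardised rate in each cohort is taken as observed divided by expected events, and the `Stratum` layout and all function names are hypothetical. Confidence intervals and χ² statistics are not reproduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


def exit_date(cancer: Optional[date], death: Optional[date], end_of_file: date) -> date:
    """Earliest of first cancer record, death, or end of the data file."""
    return min(d for d in (cancer, death, end_of_file) if d is not None)


def person_years(entry: date, exit_: date) -> float:
    """Time at risk between date of entry and date of exit, in years."""
    return (exit_ - entry).days / 365.25


@dataclass
class Stratum:
    """One age/sex/calendar-year/district stratum (hypothetical layout)."""
    cases_diab: int
    py_diab: float
    cases_ref: int
    py_ref: float


def standardised_rate_ratio(strata: Iterable[Stratum]) -> float:
    """Ratio of (observed/expected) in the diabetes cohort to that in the reference cohort."""
    obs_d = exp_d = obs_r = exp_r = 0.0
    for s in strata:
        total_py = s.py_diab + s.py_ref
        if total_py == 0:
            continue  # empty stratum contributes nothing
        # Stratum-specific rate in the combined (standard) cohort.
        pooled_rate = (s.cases_diab + s.cases_ref) / total_py
        obs_d += s.cases_diab
        exp_d += pooled_rate * s.py_diab
        obs_r += s.cases_ref
        exp_r += pooled_rate * s.py_ref
    return (obs_d / exp_d) / (obs_r / exp_r)
```

With invented strata in which the diabetes cohort has twice the reference rate in every stratum, for instance `Stratum(4, 1000.0, 20, 10000.0)` and `Stratum(6, 2000.0, 30, 20000.0)`, the function returns a rate ratio of 2.0, as expected.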
In comparing the diabetes and reference cohorts, the precision of the rate ratio depends on the number of people with each subsequent cancer within each cohort. The size of the diabetes cohort is fixed by the number in the database with the condition. In the reference cohort, we included all the people in the database with the comparison conditions in each age group. We did this to maximise the numbers in each stratum in the reference cohort in order to maximise the precision of the rate ratios.
We studied the data with and without cases of cancer that occurred within 1 year of first recorded admission for diabetes mellitus. We did so with the aim of distinguishing longer term associations from those that might be explained by short-term ascertainment bias. In fact, we generally found no important difference between the rate ratios with or without inclusion of first year cases. Accordingly, we report the main results with all cases included.
Table 1 Age distribution of study population of people admitted to hospital for diabetes mellitus
(Column headings: Age group (years); Diabetes mellitus cohort; Number (% men).)
Table 2 Occurrence of cancer in people admitted for diabetes mellitus at 30 years of age or older
(Column headings: Cancer (ICD code); ORLS1, ORLS2 (n; n); Adjusted rate ratio (95% CI). Row labels: All cancers (140–208); Oral cavity, pharynx (140–141, 143–146, 148–149); Breast (women) (174); Kidney (189.0, 189.1); Malignant melanoma (172); Other skin cancer (173); Malignant brain (191); Non-Hodgkin’s (200, 202); Multiple myeloma (203); Lymphoid leukaemia (204); Myeloid leukaemia (205); Benign brain (225).)
Rate ratios and their 95% confidence intervals are shown in Table 2. There were significantly high rate ratios for cancers of the liver (2.0 in ORLS1, 2.5 in ORLS2), pancreas (respectively, 2.2, 3.5) and uterus (1.5, 2.6). There was a significantly high risk of cancer of the cervix (2.1) in ORLS1, but not in ORLS2. There was a significantly high risk of stomach cancer (2.1) in ORLS2, but not in ORLS1. There were significantly low rate ratios for cancers of the bladder (0.7) in ORLS1, prostate (0.6, 0.7, respectively, in ORLS1 and ORLS2), and non-melanoma skin cancer (0.6, 0.8, respectively).
Table 3 Occurrence of selected cancers in people admitted for diabetes mellitus at 30 years of age or older
(Column heading: Adjusted rate ratio (95% CI). Row labels: Pancreas – all; All <5 years; Liver – all; All <5 years; Uterus – all; All <5 years.)
Men and women
In ORLS2, most of the cases of liver cancer occurred in men. The rate ratio was significant in men (2.94 [95% CI 1.35–5.76]), but not in women (1.75 [95% CI 0.46–4.76]). These rate ratios were not significantly different from each other, as each rate ratio falls within the confidence interval of the other. There were no other notable differences between men and women in either ORLS1 or ORLS2.
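The comparison of the rate ratios in men and women can also be approximated more formally. The sketch below is not the method used in the study: it assumes that each published confidence interval is roughly symmetric on the log scale, recovers a standard error from the interval width, and applies a two-sided z test to the difference in log rate ratios. The function names are hypothetical.

```python
import math


def log_se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate standard error of log(RR) from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_rate_ratios(rr1: float, ci1: tuple, rr2: float, ci2: tuple):
    """Two-sided z test for a difference between two independent rate ratios."""
    se1 = log_se_from_ci(*ci1)
    se2 = log_se_from_ci(*ci2)
    z = (math.log(rr1) - math.log(rr2)) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value under the normal approximation
    return z, p


# Liver cancer in ORLS2: men 2.94 (1.35-5.76), women 1.75 (0.46-4.76).
z, p = compare_rate_ratios(2.94, (1.35, 5.76), 1.75, (0.46, 4.76))
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under these assumptions the p value is well above 0.05, which agrees with the informal reading of the confidence intervals.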
The rate ratios for cancer overall in our study, at 1.01 and 1.09, are similar to that in a recent study in Scotland, which reported an unadjusted rate ratio of 1.01 (95% CI 0.92–1.11) and a rate ratio adjusted for deprivation of 0.99 (95% CI 0.90–1.09). They are also similar to those in record linkage studies from Denmark (rate ratio 1.1) and Sweden (men 1.0, women 1.1, data on both sexes combined not given). The population of the Oxford region is comparatively healthy, as shown by health indicators such as standardised mortality ratios and life expectancy, and is reasonably homogeneous ethnically. More than 90% of the total population covered by ORLS1 was born in the UK. The population covered by ORLS2 was becoming more diverse, ethnically, but data on ethnicity or place of birth were not available in the dataset. A limitation of the datasets is that they do not include data on potential confounding variables such as social class, smoking, exercise or diet. However, the level of consistency seen across the studies noted above [10, 25, 26] makes it unlikely that our main finding, and theirs (no significant elevation of cancer risk, overall, in people aged 30 and over), is much influenced by unmeasured confounding.
The ORLS does not include data about therapy. Therapeutic agents have been implicated in altering the risk of cancer in diabetic patients, although evidence is inconsistent [12, 13, 14, 15, 28]. Metformin, insulin [13, 14, 15] and sulfonylureas have all been studied in an attempt to determine whether they alter the risk profile of cancer in people with type 2 diabetes. Our study is observational, as are the studies from Scotland, Denmark and Sweden. The best designs for studies intended to identify risks associated with individual treatments are randomised controlled trials. The contribution of our study is to show cancer risks in a large hospitalised population that will have been treated with the regimes that were typical in the times covered by ORLS1 and ORLS2, and to demonstrate that those risks are small.
Methodological issues: datasets, population, and multiple comparisons
Strengths of the Oxford datasets include their size and their potential for long follow-up, together with the opportunity to analyse the full range of individual cancers within a single, large population, and with the same study design. The datasets also have limitations. The cohorts are based on prevalent cases (the first recorded hospital admission or episode of day case care for each person with each condition) rather than being cohorts with follow-up from the date of first diagnosis. The datasets are limited to people who were admitted to hospital, or who received day case specialist care. This would not capture all people with diabetes, although it should identify the great majority with subsequent cancer. Data are not recorded on patients who move out of the area covered by data collection or who are treated in hospitals outside the area (mainly affecting ORLS1). Accordingly, absolute risks of cancer cannot be calculated, but relative risks can be, as shown by us in comparing the diabetes cohort with a reference cohort. In studies of cohorts of people with non-malignant disease, followed for subsequent cancer risk, and compared with the reference cohort, the reference cohort is known to give cancer rates that are highly plausible. Using the reference cohort as a comparator, we have reported on combinations of non-malignant disease and subsequent cancer that show no elevation or diminution of risks when none are expected [18, 19, 20]; high cancer rates when these are expected (e.g. high rates of colon cancer in people with inflammatory bowel disease, high rates of liver and pancreatic cancer in people with chronic non-malignant liver disease or pancreatitis); and low rates when expected (e.g. low rates of some cancers in people with Parkinson’s disease).
Of people with the conditions covered by the reference cohort, 1.5% were excluded from the reference cohort because they had diabetes, i.e. there was a prevalence rate of 1.5% in those otherwise eligible to be in the reference cohort. Data from general practice show prevalence rates of diagnosed diabetes in the general population in the former Oxford NHS region. For example, the rates are 2.9% in the Oxfordshire Primary Care Trust area, 2.8% in West Berkshire and 3.0% in Milton Keynes. Our 1.5% prevalence rate is an underestimate as it will not include people admitted to hospital outside of the timeframe of our study or those who have not been admitted to hospital at all with diabetes as the main reason for their admission. Our prevalence of hospitalised diabetes covers a time period from 1963 and, as diabetes prevalence is increasing year on year, there will also be some real differences in the prevalence of diabetes in our ‘historical’ study and in contemporary estimates. Furthermore, as we miss people with diabetes who are not hospitalised in the period covered by our study, it follows that the reference cohort is itself not a ‘pure’ cohort in respect of excluding people with diabetes. It includes people, unidentifiable by us, who have diabetes. Thus the comparisons shown are those between a cohort of people, all of whom have diabetes, and a cohort of people in which maybe 2% of people have non-hospitalised and perhaps undiagnosed diabetes. The two datasets are not linked to each other because of changes in the data items available for linkage between 1998 and 1999. Consequently, it is likely that some people have been recorded as having a ‘first admission’ for diabetes in each of the time periods studied.
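The likely size of the dilution caused by unidentified diabetes in the reference cohort can be illustrated with a little arithmetic. The sketch below is not part of the study's analysis. It assumes, purely for illustration, that unidentified people with diabetes in the reference cohort carry the same relative risk as the diabetes cohort, so that an underlying rate ratio r is observed as r / (1 + f(r − 1)) when a fraction f of the reference cohort has diabetes. The function name and the example inputs are hypothetical.

```python
def observed_rate_ratio(true_rr: float, contaminated_fraction: float) -> float:
    """Rate ratio expected when a fraction of the reference cohort has the exposure.

    Assumption: exposed members of the reference cohort have the same
    relative risk (`true_rr`) as the exposed cohort itself.
    """
    # Reference rate = baseline * ((1 - f) + f * r) = baseline * (1 + f * (r - 1)).
    return true_rr / (1.0 + contaminated_fraction * (true_rr - 1.0))


# Illustrative inputs: an underlying rate ratio of 2.2 and 2% contamination.
print(round(observed_rate_ratio(2.2, 0.02), 2))
```

With these illustrative inputs the observed rate ratio falls only slightly below the underlying value, which suggests that contamination of about 2% would attenuate the comparisons very little.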
We used an age of 30 years or older as a proxy for type 2 diabetes. Although this group will mostly consist of people with type 2 diabetes, there will be some people with type 1 diabetes in the cohort as well, given that around 10% of adults with diabetes in the UK have type 1 diabetes.
The definition of diabetes changed during the period covered by this study (1963–2008). This may have influenced the results in the two different periods studied (1963–1998 and 1999–2008). In 2000, the WHO introduced new guidelines that included the recommendation of a lower cut-off point for fasting plasma glucose for the diagnosis of diabetes. Lower admission thresholds for diabetes may have increased the number of people admitted to hospital with diabetes mellitus in the period of the ORLS2 dataset, compared with ORLS1.
We studied a large number of associations between diseases. The effect of making multiple comparisons needs to be considered. For this reason, we have given exact p values, as well as confidence intervals, so that the reader can judge the degree of significance of the association between diabetes and each subsequent cancer. It is possible that some of the associations that are significant at a level of p < 0.05 or even p < 0.01 may result from making multiple comparisons and the play of chance. This may particularly be so where there is no prior hypothesis to support the finding. On the other hand, in a study with the number of comparisons that we have made, findings where the significance level is <0.001 are very unlikely to be attributable to chance alone. With multiple comparisons, the play of chance can also work the other way round: rates that may truly be high can, by chance, be found to be lower than anticipated.
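The reasoning about significance levels can be made concrete with a short calculation. The sketch below is illustrative only: the figure of 60 comparisons is a hypothetical round number rather than a count taken from the study, the comparisons are treated as independent, and the function names are invented for the example.

```python
def expected_false_positives(n_comparisons: int, alpha: float) -> float:
    """Expected number of 'significant' results if every null hypothesis were true."""
    return n_comparisons * alpha


def prob_at_least_one(n_comparisons: int, alpha: float) -> float:
    """Chance of at least one false positive, assuming independent comparisons."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def bonferroni_threshold(n_comparisons: int, alpha: float = 0.05) -> float:
    """Per-comparison threshold that keeps the family-wise error rate at `alpha`."""
    return alpha / n_comparisons


N = 60  # hypothetical number of disease-cancer comparisons
for level in (0.05, 0.01, 0.001):
    print(level, expected_false_positives(N, level), prob_at_least_one(N, level))
print(bonferroni_threshold(N))
```

Under these assumptions, several chance findings would be expected at p < 0.05, whereas a finding at p < 0.001 lies close to the Bonferroni-corrected threshold and is much harder to attribute to chance.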
Some of the literature on the risk of specific cancers in people with diabetes mellitus is conflicting. Our findings for cancers of the pancreas, liver, uterus and prostate are similar to those reported in previous studies [1, 2, 3, 6, 7, 10, 11, 26]. The association between diabetes and elevated risks of pancreatic cancer is well established. A meta-analysis of 20 studies, performed in 1995, found a significant association between diabetes and pancreatic cancer with a pooled relative risk of 2.1. A further meta-analysis has reported on pooled risks of similar magnitude. Elevated risks of pancreatic cancer may be confounded by obesity. However, two case–control studies found an elevated risk ratio for pancreatic cancer in people with diabetes that did not materially change after adjustment for BMI [30, 31]. Although much of the risk of cancer of the pancreas in our data appears to be confined to the short term, there is some evidence in other studies of a longer term risk of pancreatic cancer [2, 3]. In the ORLS2 dataset, there was an elevated risk of liver cancer 5 or more years after diabetes admission and a longer term risk of liver cancer has been reported by others. Further studies of long term associations between type 2 diabetes and cancers of the pancreas and liver need to be done.
The association, within both datasets, with cancer of the uterus has been reported in a recent meta-analysis. Any association with cancer of the uterus may be a consequence of confounding with shared risk factors, such as obesity, that may independently contribute to the development of both diabetes and endometrial cancer. For example, a recent study found that the risk of endometrial cancer in women with diabetes was increased if they were also obese or did not undertake physical exercise. The associations with cancer of the cervix (ORLS1) and stomach (ORLS2) may be real (although an alternative explanation is that, though significant, they could be due to the play of chance) but they are unlikely to be a consequence of long-standing diabetes. A further possibility is that these associations, too, may be a consequence of confounding with shared risk factors. For example, chronic inflammation in response to infection has been implicated in the aetiology of both gastric (Helicobacter pylori) and cervical (human papilloma virus) cancers; and low grade inflammation has been implicated in the risk of type 2 diabetes. Rates of cervical cancer declined in the period and population covered by this study, as a result of screening and perhaps other factors. Our analysis, comparing the diabetes and reference cohorts, included adjustment by matching on year of admission in single years. Any effect of the decline in cervical cancer would have been the same in the diabetes and reference cohorts.
We found inverse associations between type 2 diabetes and both prostate cancer and non-melanoma skin cancer in both datasets. Previous studies have suggested a deficit of prostate cancer in people with diabetes [3, 6]. Although few other studies have reported on skin cancer, the Danish record linkage study found an inverse association in women (standardised incidence ratio [SIR] 0.9 [95% CI 0.8–0.9]), although not in men (1.0 [95% CI 0.9–1.1]). A study investigating non-melanoma skin cancer in type 2 diabetes concluded that the use of insulin in type 2 diabetes may provide a protective effect against this form of skin cancer. It has also been suggested that insufficient levels of vitamin D may increase the risk of diabetes. Because the production of vitamin D requires sunlight exposure, it is plausible that people who have relatively low exposure to sunlight may have a greater risk of diabetes as well as a reduced risk of skin cancer. It is also possible that people with diabetes may spend less time in the sun than others; and an interpretation of the relationship between diabetes and skin cancer could only be meaningfully made with access to information about sun exposure.
Some previous meta-analyses have found elevated risks of colorectal, breast and bladder cancers and an association with non-Hodgkin’s lymphoma. We did not find significantly elevated risks of these cancers in the present study. Our findings for colorectal cancers, although not significant, were consistent with an elevated risk of up to 60% for colon cancer and up to 74% for rectal cancer, in ORLS2 (as judged by the upper confidence limit); these are consistent with the rate ratios found in the meta-analysis. It is also possible that our rate, lower than anticipated from the literature, may be a ‘chance’ low (see discussion of multiple comparisons above). A lack of association between diabetes and breast cancer has been reported before. It can be problematic to compare risks of different individual cancers using data from different studies in different regions. Record linkage studies seem to provide some of the most reproducible findings on individual cancer sites [3, 10, 11, 26, 32]. One of the reasons for this may be that many different cancer sites can be studied within a single population using the same methodology.
This study adds to the evidence that type 2 diabetes is associated with elevated risks of cancers of the pancreas, liver and uterus, and to that of a possible inverse association with prostate cancer and non-melanoma skin cancer. Considering the risk in diabetes of all cancers combined, the elevation of risk, if any, is likely to be small and numerically less important than other known complications of diabetes.
Over many years, the linked datafiles were built by L. Gill, G. Bettley and M. Griffith, Unit of Health-Care Epidemiology, University of Oxford. The Unit of Health-Care Epidemiology is funded by the English National Institute for Health Research to analyse the linked data.
Duality of interest
The authors declare that there is no duality of interest associated with this manuscript.
- 23.Department of Health (2007) About diabetes. Available from http://tna.europarchive.org/20080817162314/http://www.dh.gov.uk/en/Healthcare/NationalServiceFrameworks/Diabetes/DH_074762, accessed 15 October 2010
- 24.Breslow NE, Day NE (1987) Statistical methods in cancer research, Volume II. The design and analysis of cohort studies. IARC scientific publication No. 82. International Agency for Research on Cancer, Lyon, pp 103–115
- 27.Association of Public Health Observatories—download data tables. Available from www.apho.org.uk/resource/view.aspx?RID=71961, accessed 8 November 2010
- 29.QOF Database (2007) Available from www.gpcontract.co.uk/sha.php?orgcode=Q38&year=7, accessed 15 October 2010