Background

In 2018, colorectal cancer was the third most commonly diagnosed cancer and caused the second largest number of cancer deaths in high-income countries.1,2 It is a heterogeneous disease with varied presentations and large differences in prognosis. Considering the cancer stage alone, 1-year net survival in the United States ranges from 96% for localised cancer to 55% for metastatic cancer.3

Clinical prediction models combine multiple prognostic factors to estimate individualised risks of outcomes for each patient.4,5 These risk predictions have many uses. In colorectal cancer research, prediction models have been used to examine prognosis in clinical trials,6 to control for confounding in observational studies,7 and to assess the added prognostic value of biomarkers.8 In clinical practice, they may be used to inform treatment decisions and to communicate prognosis to patients, in line with the aims of personalised medicine and shared decision-making.9

In the absence of high-quality prediction models, clinicians’ predictions of cancer survival may be inaccurate, non-transparent, and difficult to explain to patients.10,11,12 Existing models for colorectal cancer mortality have focused on selected populations recruited to clinical trials (such as stage III and IV groups),6,13,14 risks after surgery or chemotherapy,15,16 or long-term survival using primary care data.17 A recent systematic review18 did not identify any models to predict mortality for all colorectal cancer patients using contemporary national hospital data.

In this study, our objective was to develop and validate a prediction model for death from colorectal cancer within 3, 6 and 12 months after diagnosis. To do this, we analysed national electronic hospital records linked to official mortality data from England and Wales.

Methods

Study populations

The National Bowel Cancer Audit collects data on adults (aged 18 years or over) newly diagnosed with colorectal cancer (International Classification of Diseases 10th Revision (ICD-10)19 codes: C18-20) in England and Wales.20 These data are entered into electronic record systems by hospital staff and later combined into a pooled national dataset by the National Health Service (NHS). We analysed data for patients whose date of diagnosis was from January 2015 to December 2016.

We defined one population to develop the prediction model and two separate populations to test the performance of this model. The eligible population used for model development included patients who were diagnosed in England in 2015 (n = 28,505 patients). The first test population included patients who were diagnosed in England in 2016 (n = 28,216 patients). The second test population included patients who were diagnosed in Wales in 2015 or 2016 (n = 3861 patients).

Outcome

The outcome was death from colorectal cancer as identified from official death records provided by the Office for National Statistics.21 We defined death from colorectal cancer using relevant ICD-10 codes recorded as the ‘underlying cause of death’ (see Supplement S1). The underlying cause is ‘the disease or injury, which initiated the train of morbid events leading directly to death’.22

Time to death was defined as the number of days between the date of diagnosis (as recorded in the National Bowel Cancer Audit dataset) and the date of death from colorectal cancer (as recorded in Office for National Statistics mortality data). The date of diagnosis was ‘the date when cancer was confirmed or diagnosis agreed’, which is typically the date of the pathology report that confirmed cancer. Patients who died from other causes were censored on the date of death. Patients alive as of 1 January 2018 were censored on that date, providing at least 365 days of follow-up for all patients.
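As an illustration only, the follow-up time and event indicator described above could be derived as in the sketch below. The function name, argument names and the handling of a death recorded exactly on the censoring date are assumptions made for the example and are not taken from the study code.

```python
from datetime import date

# End of follow-up stated in the text; patients alive on this date are censored.
ADMIN_CENSOR_DATE = date(2018, 1, 1)


def time_and_event(diagnosis, death=None, died_of_crc=False):
    """Return (days of follow-up, event indicator) for one patient.

    event = 1 for death from colorectal cancer; event = 0 for censoring
    (death from another cause, or alive at the end of follow-up).
    """
    # Assumption: a death recorded on the censoring date itself is counted.
    if death is not None and death <= ADMIN_CENSOR_DATE:
        return (death - diagnosis).days, 1 if died_of_crc else 0
    return (ADMIN_CENSOR_DATE - diagnosis).days, 0
```

Under these assumptions, a patient diagnosed on 31 December 2016 and still alive at the end of follow-up contributes 366 days of censored follow-up, consistent with the minimum of 365 days stated above.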

Records from the National Bowel Cancer Audit and Office for National Statistics datasets were combined using deterministic linkage based on each patient’s unique NHS number, date of birth, gender and postcode. From the 60,582 eligible patients (in both development and test sets), the final sample size was 57,705 patients (27,480 in the development population; 26,411 in the England test population; and 3814 in the Wales test population). Supplement S2 provides the sample flow chart. Distributions of variables were similar for the linked and unlinked patients (Supplement S3).

Predictor variables

We used ten variables from the National Bowel Cancer Audit dataset as predictor variables: age, gender, socioeconomic status, source of referral, performance status, tumour site, the T, N and M (tumour, node, metastasis) stages at diagnosis (counted as three separate variables) and treatment intent. All variables were recorded in electronic data systems around the time of the first meeting between clinicians to discuss patients’ treatment after diagnosis. We selected these predictors a priori to include variables recorded around the time of diagnosis that had relatively complete data (≥80% of values nonmissing).

Patient age was coded as a continuous variable defined as the number of complete years between the dates of birth and diagnosis. Gender was male/female. Socioeconomic status was defined as the national rank of a patient’s area of residence according to the Index of Multiple Deprivation;23 the mean population size of these areas was 1500.23 To aid interpretation, these ranks were linearly rescaled to have a median of zero and lower and upper quartiles of −1 and +1, respectively.24
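The exact transformation used for the deprivation ranks is not given here. One linear rescaling that yields a median of zero and quartiles of −1 and +1 is sketched below, on the assumption that the quartiles are roughly symmetric about the median (as they are for approximately uniform ranks). The function name and the choice of quantile method are illustrative.

```python
from statistics import quantiles


def rescale_ranks(ranks):
    """Centre ranks on the median and divide by half the interquartile range.

    If the quartiles are symmetric about the median, the lower and upper
    quartiles map to -1 and +1 exactly; otherwise only approximately.
    """
    q1, med, q3 = quantiles(ranks, n=4, method="inclusive")
    half_iqr = (q3 - q1) / 2
    return [(r - med) / half_iqr for r in ranks]
```

With this scaling, a one-unit difference corresponds to the distance between the median and either quartile of deprivation.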

The source of referral for investigation of suspected cancer had five categories: emergency hospital admission, urgent care/emergency department visit, primary care, national screening programme and ‘other’ (e.g. a separate outpatient clinic). Performance status was defined by five categories of the Eastern Cooperative Oncology Group score (ranging from ‘fully active’ to ‘completely disabled’).25 Tumour site was one of nine ICD-10 codes indexed under C18-20. T, N and M stages of the cancer were defined by the TNM Classification of Malignant Tumours 5th Edition.26 The treatment intent had three categories: curative, non-curative and no active cancer treatment.

All ten predictor variables were defined using the National Bowel Cancer Audit dataset. The original (incomplete) data were used to calculate descriptive statistics for each variable. To account for missing values of predictors, we used multiple imputation with chained equations to generate 40 complete datasets (see Supplement S4 for details). All analysis of associations between the outcome and predictors was done using these 40 imputed datasets. We pooled model estimates and performance measures across the datasets to produce the final results.27
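The pooling rules are not spelled out in this section. A conventional choice for combining estimates such as log hazard ratios across imputed datasets is Rubin's rules, and the sketch below assumes that approach. The function name and inputs are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimates (e.g. log hazard ratios), one per dataset.
    variances: squared standard errors, one per dataset.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance between imputations
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The total variance exceeds the average within-imputation variance whenever estimates differ between imputed datasets, which reflects the extra uncertainty due to missing values.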

Statistical analysis

We used Cox proportional hazards regression28 to estimate associations between predictor variables and the hazard of colorectal cancer death. Deaths from other causes were treated as censoring events. All predictors entered the regression model simultaneously. We fitted linear associations with the outcome for age and socioeconomic status, as nonlinear transformations fitted by a multivariable fractional polynomial algorithm29,30,31 were well approximated by linear relationships.
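For readers less familiar with Cox models, the standard way to turn a fitted model into a predicted risk at a fixed time is 1 − S0(t)^exp(lp), where S0(t) is the baseline survival at time t and lp is the patient's linear predictor (the sum of log hazard ratios for their characteristics). The sketch below is generic; the baseline survival value used in the note after the code is hypothetical and is not an estimate from this study.

```python
import math


def predicted_risk(baseline_survival, linear_predictor):
    """Risk of the event by time t under a Cox proportional hazards model.

    baseline_survival: S0(t), survival at time t for the reference patient.
    linear_predictor: sum of coefficient * covariate value (log hazard scale).
    """
    return 1.0 - baseline_survival ** math.exp(linear_predictor)
```

For example, with a hypothetical baseline survival of 0.97 at 365 days, `predicted_risk(0.97, 0.0)` gives the risk for the reference patient, and `predicted_risk(0.97, math.log(3.85))` gives the risk for an otherwise identical patient whose hazard is 3.85 times as high.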

We assessed model performance at 90, 180 and 365 days after diagnosis. Overall model performance was measured using Brier scores.32 These scores were calculated from the mean squared differences between predicted probabilities of colorectal cancer death within a given time period and the observed death status. We scaled these scores to range from 0 to 100% (0% for a non-informative model and 100% for a perfect model).33
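A minimal sketch of a scaled Brier score is given below. It assumes that every patient's outcome status at the chosen time point is known, so it ignores the censoring adjustments that a full survival analysis would need. The function name is illustrative.

```python
def scaled_brier(predicted, observed):
    """Scaled Brier score in percent (0 = non-informative, 100 = perfect).

    predicted: predicted probabilities of the event by the time point.
    observed: 1 if the event occurred by the time point, else 0.
    Assumes at least one event and one non-event are present.
    """
    n = len(observed)
    brier = sum((p - y) ** 2 for p, y in zip(predicted, observed)) / n
    prevalence = sum(observed) / n
    # Brier score obtained by giving every patient the overall risk.
    brier_ref = prevalence * (1 - prevalence)
    return 100.0 * (1.0 - brier / brier_ref)
```

The reference score corresponds to predicting the overall event proportion for every patient, so the scaled score expresses how much of that baseline error the model removes.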

To assess discrimination, we calculated the c-index.34 This indicates the proportion of all pairs of patients whose survival times could be ordered such that the patient with the lower predicted risk of colorectal cancer death survived longer.24 C-indices equal one for perfect models and 0.5 for random predictions. To assess model calibration, we plotted the predicted risks of colorectal cancer death against the actual observed risks, using the loess smoother to estimate the calibration curve.24
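The pairwise definition of the c-index can be made concrete with the sketch below, which follows Harrell's formulation: a pair of patients is usable when the patient with the shorter time had an observed event, and tied predictions count as half. This is an illustrative quadratic-time version with hypothetical names, not the implementation used in the study.

```python
def c_index(times, events, risks):
    """Harrell's concordance index for right-censored data.

    times: follow-up time for each patient.
    events: 1 if the event was observed, 0 if censored.
    risks: predicted risk (higher means earlier event expected).
    Assumes at least one usable pair exists.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: patient i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / usable
```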

We assessed the internal validity of the model using 10-fold cross-validation and calculated mean values of the performance measures across the ten folds. We tested the performance of the model in two other populations: patients diagnosed in England in 2016 and in Wales in either 2015 or 2016.
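The fold structure of 10-fold cross-validation can be sketched as follows. The random shuffling, the seed and the function name are assumptions for the example; model fitting and performance calculation within each fold are omitted.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Each patient appears in exactly one held-out fold; the model would be
    fitted on the training indices and evaluated on the held-out indices.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        yield train, held_out
```

Performance measures computed on each held-out fold would then be averaged across the ten folds, as described above.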

Sensitivity analyses

Three sensitivity analyses tested the specification of the model and its performance, as detailed in Supplement S5. These added interaction terms between key predictors, added a comorbidity score and the number of unplanned admissions in the past year as predictor variables, and assessed whether censoring of surviving patients at 365 days affected the associations estimated.

Data preparation was done in Stata (v15) and all statistical analysis was done in R (v3.5).

Results

In the population used to develop the prediction model, the percentages of patients who died from colorectal cancer were 7.4% (within 90 days), 11.7% (180 days) and 17.9% (365 days). These percentages were similar in the England test population but slightly greater in the Wales test population (Table 1). The Wales population had greater percentages of patients who were referred for diagnostic investigations after an emergency admission (29.0% vs. 13.0% in the development population) and who had metastases (25.6% vs. 22.1%). Most patients in each population were treated with curative intent (73.3–74.1%) (Table 2).

Table 1 Descriptive statistics for the outcome variable and follow-up time.
Table 2 Descriptive statistics for predictor variables.

Missing values were most common for the performance status of the patient (16.8%) and the T and N-stages of the cancer (19.2% and 17.0%; Table 2). Data fields were complete across all variables for 61.5% of patients. These patients were more likely to be treated with curative intent (76.6% vs. 67.5%) and to survive until the end of follow-up (70.6% vs. 61.5%) than patients who had at least one predictor variable with a missing value (Supplement S6).

After multiple imputation of missing values, risks of colorectal cancer death were greatest for patients who had metastatic disease, had a treatment plan with non-curative intent or no active cancer treatment, or had an unfavourable performance status (Table 3). The risk of cancer death within 365 days was more than 50% for three patient groups: patients in the two worst performance status categories (50.3% and 58.3%) and patients with a non-curative treatment intent (51.9%).

Table 3 Univariable and multivariable associations between the outcome and predictor variables in the development population, estimated using Cox regression.

In the multivariable model including all predictor variables, the greatest relative difference in the hazard of colorectal cancer death was between the T4 and T1 stages (hazard ratio (HR) = 4.67; 95% confidence interval (CI): 3.59–6.09). Compared to patients with a curative treatment intent, the hazard of colorectal cancer death was 3.85 times as high for patients whose treatment plan was non-curative (HR = 3.85, 95% CI: 3.60–4.11) or did not include active cancer treatment (HR = 3.85, 95% CI: 3.52–4.21). Outcomes were similar between the non-curative and no active cancer treatment groups (HR = 1.00, 95% CI: 0.92–1.09) (Table 3).

Predicted probabilities of colorectal cancer death varied widely within treatment intent categories. In the England test population, the 10th and 90th percentiles of predicted risks within 365 days were 1.7% and 12.9% for patients treated with curative intent, 23.8% and 88.8% for patients with a non-curative intent and 16.3% and 89.6% for patients with no active cancer treatment planned.

Model performance

The probabilities of colorectal cancer death predicted by the model were well calibrated with the observed proportions of patients that died, in both England and Wales test populations (Fig. 1).

Fig. 1: Calibration plots for predicted probabilities of colorectal cancer death within 90, 180 and 365 days after diagnosis, in the England and Wales test populations.

Note: The coloured lines represent the smoothed relationships between the observed and predicted risks of colorectal cancer death. The black dotted 45° line represents the ideal relationship showing perfect calibration.

The model typically predicted very low risks of colorectal cancer death for patients who did not experience this outcome (Fig. 2). The predicted risks were generally much greater for patients who did die from colorectal cancer, particularly for the 365-day outcome period. As a result, the predicted probabilities of colorectal cancer death were well separated between patients who did and did not have this outcome (Fig. 2). This was reflected in large values of the c-index, ranging from 0.873 to 0.890 and 0.856 to 0.873 in the England and Wales test populations, respectively (Table 4).

Fig. 2: Boxplots comparing predicted probabilities of colorectal cancer death by outcome status within 90, 180 and 365 days after diagnosis, in the England and Wales test populations.

Note: Boxes are drawn from the lower to upper quartile of predicted probabilities with a white horizontal line at the median value. Annotated values and black dots correspond to mean values. Whiskers are drawn to the most extreme predicted probabilities that are no more than 1.5 times the interquartile range from the box.

Table 4 Overall model performance and discrimination in the development and test populations.

The overall performance of the model as measured by the scaled Brier score was best for the 365-day period, followed by the 180-day and then the 90-day periods (Table 4). For the 365-day period in the England test population, the Brier score was 40.0% better than the score that would have been obtained by using the overall risk of colorectal cancer death as the predicted probability for every patient. This indicates a large improvement in predictive ability when using the model (versus no model).

Sensitivity analyses

In the sensitivity analyses, interaction terms between patient age, M-stage and treatment intent did not improve model performance (maximum absolute difference in c-index or Brier score vs. main analysis = 0.001). Results were also similar when each patient’s history of comorbidities and unplanned hospitalisations were added as predictors (maximum absolute difference = 0.002). When patients who were alive 365 days after diagnosis were censored at this timepoint, predictor effects were similar to those in the main analysis (range of relative differences in HRs: 0.97–1.08).

Discussion

The model developed was valid for predicting death from colorectal cancer within 3, 6, and 12 months after diagnosis in England and Wales. The model discriminated very well between patients who did and did not die from colorectal cancer, such that the former group typically had much higher predicted probabilities of death. These predictions were well calibrated with observed outcomes. The T-stage of the tumour had the largest adjusted association with the risk of death, followed by the treatment intent and performance status of the patient.

No single variable alone had a high positive predictive value for colorectal cancer death. For example, just over half of patients (51%) who did not have a curative treatment intent died within 365 days. Predicted risks of death varied widely across patients who did not have a curative intent. This wide variation also existed for patients who did have a curative treatment plan.

Strengths and limitations

We used large, national datasets to develop a new model and examine its temporal and geographic validity in whole populations from two countries. The data used for predictor variables were entered as part of routine care processes and therefore represent information available to clinicians in practice around the time of decision-making. We used cause of death information from official death records to distinguish colorectal cancer deaths from other deaths, and we were able to measure these outcomes for at least 1 year after diagnosis for all patients. Although the patients in the test sets were similar to those in the development set, the differences in the type of referrals and TNM stages between England and Wales provided a reasonable test of external validity.

The model would likely be improved if further information about the cancer was available, such as the sites of any metastases or possibly molecular data, as well as additional characteristics of patients (such as frailty) and their cancer care. This may help to predict greater probabilities of colorectal cancer death for patients who experienced this outcome. Some uncertainty in prognosis may reflect the biological development of cancer and the possibility of treatment-related complications.

Detailed assessment of patients’ overall morbidity, particularly for older patients, could be used to contextualise predictions of cancer mortality in terms of overall life expectancy. However, the overall risk of dying from causes other than colorectal cancer within 1 year after diagnosis was only 4%, so other causes of mortality in this period may be less relevant to treatment decisions for most patients.

Differences in data collection or population characteristics may limit the generalisability of the model to other countries. Estimates of 1-year survival for colorectal cancer can differ markedly between high-income countries, such as 78% in England and 84% in Sweden in 2010–2012.35 The model may need to be recalibrated when used elsewhere if the survival differences are unexplained by differences in the distributions of predictors. However, despite survival in Wales being somewhat worse than in England in the current study, model calibration remained acceptable. Most predictors used have standard international definitions. We rescaled the measure of socioeconomic status so that it might approximate similarly rescaled measures in other settings.

In order to avoid the possibility that any racial biases in access to treatment are reinforced by the prediction model, we did not consider patient ethnicity as a predictor.36 This is in line with most clinical prediction models.37 Prognostic factors such as lymphovascular invasion, surgical margin status and definitive treatment were not included in the model as they are typically unknown around the time of diagnosis and were not relevant to all patients (some of whom do not receive surgery).

Missing data will have biased results if data were ‘missing not at random’, which multiple imputation cannot address. The extent of this bias cannot be ascertained from observed data, but each predictor had less than 20% of values missing, thus reducing the potential bias. National Bowel Cancer Audit records could not be linked to Office for National Statistics death records for 4.4% of eligible patients; distributions of predictor variables were similar between the linked and unlinked groups of patients but some bias due to linkage problems cannot be ruled out.

The 5th edition of the TNM system used in the analysis has been superseded by the 8th edition in the U.K., which will affect the N-stage of some (but relatively few) patients.

Relation to existing literature

A previous study17 used primary care records and cancer registry data to develop a prediction model for longer-term survival (1, 5 and 10-year) of colorectal cancer patients in England. This model did not include several variables that are routinely recorded in clinical team meetings shortly after diagnosis such as the referral source, performance status, separate TNM stages and treatment intent. The c-index of 0.873 attained by our model for predicting 365-day cancer mortality in England is much greater than that reported for one-year mortality (from all causes) in the previous study (0.795 for men and 0.807 for women17). This indicates a large increase in performance (closer to the perfect c-index of 1), especially as c-indices are relatively insensitive to improvements in model fit.38

A systematic review18 reported several prediction models developed for mortality in subgroups of colorectal cancer patients, such as patients with stage III6 or metastatic13,14 cancer, or for posttreatment mortality.15,16 None of these models were developed to predict mortality for all colorectal cancer patients using contemporary national hospital data. A previous study7 by our group used linked National Bowel Cancer Audit and Office for National Statistics death records to develop a risk-adjustment model for 90-day postoperative mortality. This model used similar predictors to the model presented here and showed good discrimination (c-index = 0.799) and calibration; the c-index may have been lower in this surgical cohort partly due to the population being more homogeneous.

Implications for research and practice

The predictor information used in the model is recorded electronically as part of routine practice in England and Wales, typically during clinical team meetings where patient care is planned. Patients’ risks of death within 3, 6 and 12 months could be automatically calculated in these meetings without additional data entry. Supplement S7 gives the formula for calculating predicted probabilities of colorectal cancer death within 90, 180 and 365 days after diagnosis.

The external validity of the model should be tested further before being used outside of England and Wales, possibly in combination with well-established methods for updating prediction models when used in new settings.39 Ideally, the effects of the model on decision-making and patient outcomes would also be evaluated in future research (though such impact studies are rare40).

The model’s predictions could be used to provide accurate prognostic information to patients, so that they can make informed decisions together with clinicians. The risk predictions may also help to prioritise patients for specialist palliative care services,41,42 given the wide range of predicted risks for patients without a curative treatment intent. The predictions also varied widely for those with a curative intent, which may help to inform the intensity of related treatment. Finally, the model could also be relevant to various clinical, epidemiological and biomarker studies.