How much do tumor stage and treatment explain socioeconomic inequalities in breast cancer survival? Applying causal mediation analysis to population-based data
Substantial socioeconomic inequalities in breast cancer survival persist in England, possibly due to more advanced cancer at diagnosis and differential access to treatment. We aim to disentangle the contributions of differential stage at diagnosis and differential treatment to the socioeconomic inequalities in cancer survival. Information on 36,793 women diagnosed with breast cancer during 2000–2007 was routinely collected by an English population-based cancer registry. Deprivation was determined for each patient according to her area of residence at the time of diagnosis. A parametric implementation of the mediation formula using Monte Carlo simulation was used to estimate the proportion of the effect of deprivation on survival mediated by stage and by treatment. One-third (35 % [23–48 %]) of the higher mortality experienced by the most deprived patients at 6 months after diagnosis, and about one-tenth (14 % [−3 to 31 %]) at 5 years, was mediated by adverse stage distribution. We initially found no evidence of mediation via differential surgical treatment. However, sensitivity analyses testing some of our study limitations showed, in particular, that up to 30 % of the higher mortality in the most deprived patients could be mediated by differential surgical treatment. This study illustrates the importance of using causal inference methods with routine medical data and the need for testing key assumptions through sensitivity analyses. Our results suggest that, although efforts towards earlier diagnosis are important, they would reduce the cancer survival inequalities by only a third. Because of data limitations, the role of differential surgical treatment may have been under-estimated.
Keywords: Breast cancer; Survival; Socioeconomic inequalities; Causal mediation; Population-based data; Tumour stage; Surgical treatment
Substantial socioeconomic inequalities in cancer survival have been observed in England for decades [1, 2, 3], meaning that many cancer deaths could be avoided. For breast cancer, besides lower screening uptake and differential access to treatment, more advanced stage at diagnosis and severe comorbidity are regularly proposed as the most plausible explanatory factors of these inequalities [5, 6]. However, both factors seem to explain only part of these inequalities, at least for breast and colorectal cancers [7, 8].
Population-based data are crucial to understand the mechanisms affecting all patients and to help define policies. Quantifying the proportion of the effect of deprivation on survival that is attributable to differential stage at diagnosis and treatment is important for better resource allocation to address the gap between the rich and the poor. Methodological issues, however, are inherent to observational data. Most of the previous results were based on conventional analytic approaches (e.g. by describing the deprivation gap after adjusting for or stratifying by stage). However, if stage and treatment are on the causal pathway from deprivation to cancer survival, or if there is an interaction between deprivation and the mediator(s), these conventional approaches may lead to flaws in interpretation [9, 10, 11, 12]. Using methods from the causal inference literature, we aim to disentangle the contributions of differential stage at diagnosis and differential treatment to the socioeconomic inequalities in cancer survival. To this end, we use population-based and routinely collected data for all patients diagnosed with breast cancer within a defined area.
Materials and methods
We included in the analyses all women (aged 15–99 at diagnosis) diagnosed with malignant, invasive breast cancer during 2000–2007, followed up until 31 December 2007, and registered by the Northern and Yorkshire Cancer Registry Information Service (NYCRIS), a population-based cancer registry covering 12 % of the English population. Ascertainment of the vital status was considered to be complete for all patients.
Each patient was allocated a socio-economic deprivation score according to her area (Lower Super Output Area) of residence at the time of diagnosis, using the English Indices of Multiple Deprivation (IMD) 2001 (income domain). These scores were categorised according to the quintiles of their national distribution.
Each patient was allocated one of the four broad tumor TNM stages using a restrictive approach.
Information on surgical treatment was retrieved from a routinely collected national hospital dataset (Hospital Episode Statistics or HES). We retained surgical treatment within 1 month before and 6 months after the cancer diagnosis. The treatment (OPCS-4) codes were categorized based on recommendations made by the Site-Specific Clinical Reference Group (SSCRG) for breast cancer (Appendix 1). These categories were then dichotomized into ‘major treatment’ (axillary dissection or other axillary nodal procedures, breast conserving surgery, mastectomy, and plastic surgery) and ‘minor or no surgery’ (other surgical procedures and none).
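The treatment definition above can be illustrated with a minimal Python sketch. The category labels, the function name and the conversion of "1 month" and "6 months" into 30 and 183 days are assumptions made for illustration; the registry's actual coding rules (Appendix 1) are not reproduced here.

```python
from datetime import date, timedelta

# Category labels are illustrative; the real grouping of OPCS-4 codes follows
# the SSCRG recommendations, which are not reproduced here.
MAJOR = {
    "axillary dissection",
    "other axillary nodal procedure",
    "breast conserving surgery",
    "mastectomy",
    "plastic surgery",
}

def classify_treatment(diagnosis, procedures,
                       before=timedelta(days=30), after=timedelta(days=183)):
    """Return 'major' if any major procedure falls inside the window around
    diagnosis, otherwise 'minor or none'.

    procedures: iterable of (procedure_date, category) pairs.
    The 30-day and 183-day window lengths are assumed approximations of
    '1 month before' and '6 months after'.
    """
    in_window = [category for when, category in procedures
                 if diagnosis - before <= when <= diagnosis + after]
    return "major" if any(c in MAJOR for c in in_window) else "minor or none"

dx = date(2004, 3, 1)
print(classify_treatment(dx, [(date(2004, 4, 15), "mastectomy")]))
print(classify_treatment(dx, [(date(2004, 12, 1), "mastectomy")]))
```

A procedure recorded outside the window is ignored, so a late mastectomy is classified as 'minor or none' under this rule.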
We examined what proportions of the deprivation gap in survival were explained separately by tumor stage and treatment. Because of our data structure (in particular, the existence of important mediator-outcome confounders affected by exposure, the likely presence of many interactions and the fact that our outcome is binary) we focused on the decomposition of the total causal effect (TCE) into what have recently been termed randomized interventional analogues of natural direct and indirect effects, henceforth RIANDE and RIANIE [21, 22, 23].
The RIANDE and RIANIE can be estimated with an extension of Robins’ g-computation formula implemented using Monte Carlo simulation in the Stata command gformula. We chose this method because its flexible modelling allows interactions and other non-linearities. Although flexible in terms of parametric modelling assumptions, this method relies on the assumption of no unaccounted confounding of the exposure–mediator, mediator–outcome or exposure–outcome relationships.
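The logic of the Monte Carlo g-computation can be sketched in Python. This is not the gformula implementation: the data are simulated, the exposure and mediator are reduced to binary variables, age is the only confounder, and the regression coefficients are hypothetical values standing in for fitted models. Confounders affected by the exposure, which the extended formula handles, are left out.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def mean_risk(rng, age, a_outcome, a_mediator):
    """Monte Carlo estimate of P(death) when deprivation is set to a_outcome
    and stage is drawn from its modelled distribution under a_mediator."""
    # Hypothetical mediator model: P(advanced stage | deprivation, age).
    p_stage = expit(-1.5 + 0.6 * a_mediator + 0.02 * (age - 60.0))
    advanced = (rng.random(age.size) < p_stage).astype(float)
    # Hypothetical outcome model with an exposure-mediator interaction.
    p_death = expit(-3.0 + 0.4 * a_outcome + 1.8 * advanced
                    + 0.3 * a_outcome * advanced + 0.03 * (age - 60.0))
    return p_death.mean()

rng = np.random.default_rng(2024)
age = rng.normal(60.0, 12.0, size=200_000)  # simulated baseline confounder

p00 = mean_risk(rng, age, a_outcome=0, a_mediator=0)
p10 = mean_risk(rng, age, a_outcome=1, a_mediator=0)
p11 = mean_risk(rng, age, a_outcome=1, a_mediator=1)

riande = logit(p10) - logit(p00)  # effect not transmitted through stage
rianie = logit(p11) - logit(p10)  # effect transmitted through stage
tce = logit(p11) - logit(p00)     # total effect on the log-odds scale
print(f"RIANDE={riande:.3f}  RIANIE={rianie:.3f}  TCE={tce:.3f}")
```

Drawing the mediator from its distribution under one exposure level while setting the exposure in the outcome model to another is what separates the two pathways. In practice, the coefficients would come from models fitted to the registry data, and confidence intervals from bootstrapping.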
We conducted three analyses to investigate the mediating roles of stage and treatment (Appendix 3, Appendix 4). We first estimated the proportion of the effects of deprivation on survival that was mediated by differences in stage at diagnosis, i.e. we computed the ratio between the effect of deprivation on log odds of death that was mediated by stage (the RIANIE) and the total effect of deprivation on log odds of death (the total causal effect, TCE, which is the sum of the RIANDE—the effect not mediated by the mediator stage—and the RIANIE). In the second analysis, we estimated the proportion of the effect of deprivation on log odds of death that was mediated by differences in treatment. Stage at diagnosis was here considered to be a confounder of the relationship between treatment and survival, and was allowed to be affected by deprivation. Such a confounder is dealt with using an extension of the g-computation formula [24, 25]. In the third analysis, we estimated the proportion of the effect of deprivation on treatment that is mediated by differential stage.
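The quantities used in the first analysis can be written compactly. The notation below is introduced here for illustration and is one common convention, not necessarily the exact one used in the appendices: $A$ is deprivation (1 = most deprived, 0 = least deprived), $C$ the baseline confounders, $Y(a, m)$ the potential outcome (death) under exposure $a$ and mediator value $m$, and $G_{a' \mid C}$ a random draw from the distribution of the mediator under exposure $a'$ given $C$.

```latex
\begin{align*}
p_{a,a'} &= \mathrm{E}\!\left[ Y\!\left(a,\, G_{a' \mid C}\right) \right] \\
\mathrm{RIANDE} &= \operatorname{logit} p_{1,0} - \operatorname{logit} p_{0,0} \\
\mathrm{RIANIE} &= \operatorname{logit} p_{1,1} - \operatorname{logit} p_{1,0} \\
\mathrm{TCE} &= \mathrm{RIANDE} + \mathrm{RIANIE}
             = \operatorname{logit} p_{1,1} - \operatorname{logit} p_{0,0} \\
\mathrm{PM} &= \frac{\mathrm{RIANIE}}{\mathrm{TCE}}
\end{align*}
```

Because the decomposition is additive on the log-odds scale, the proportion mediated (PM) can fall outside the interval from 0 to 1 when the direct and indirect effects have opposite signs, which is why a confidence interval for PM may include negative values.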
Because the deprivation gap in survival varies by time since diagnosis, the binary survival outcome (dead vs. alive) was stratified according to time since diagnosis: at 6 months, 1 year given (conditioning on) 6-month survival, 3 years given 1-year survival, and 5 years given 3-year survival. The analyses were performed separately on each of these four binary survival outcomes, in order to disentangle early from late mediating effects of stage and treatment on deprivation gap in survival.
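The four conditional outcomes can be derived from follow-up time and vital status along the following lines. This is a sketch: the function name is hypothetical, and the handling of patients censored before a landmark (set to missing here) is an assumption, since the text does not describe it.

```python
import numpy as np

def conditional_outcomes(time_years, died, landmarks=(0.5, 1.0, 3.0, 5.0)):
    """Binary outcome per landmark: 1 = died by the landmark, 0 = alive at it,
    nan = not in the risk set (died or censored before the previous landmark)
    or censored before this landmark (status unknown, an assumed rule)."""
    t = np.asarray(time_years, dtype=float)
    d = np.asarray(died, dtype=bool)
    at_risk = np.ones(t.shape, dtype=bool)  # everyone is at risk at diagnosis
    outcomes = []
    for end in landmarks:
        dead_by_end = d & (t <= end)
        alive_at_end = ~dead_by_end & (t >= end)
        y = np.full(t.shape, np.nan)
        y[at_risk & dead_by_end] = 1.0
        y[at_risk & alive_at_end] = 0.0
        outcomes.append(y)
        at_risk = at_risk & alive_at_end  # condition on surviving this landmark
    return outcomes

# Follow-up in years and death indicator for six illustrative patients.
t = [0.2, 0.7, 2.0, 4.0, 6.0, 0.3]
d = [1, 1, 1, 1, 0, 0]
y6m, y1, y3, y5 = conditional_outcomes(t, d)
print(y6m, y1, y3, y5, sep="\n")
```

Each outcome is then analysed only among patients with a non-missing value, which is what conditioning on survival to the previous landmark means in practice.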
We used multinomial regressions to model stage at diagnosis (four categories) and logistic regression for treatment and survival status. Age at diagnosis was modelled using restricted cubic splines.
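For readers unfamiliar with restricted cubic splines, the basis can be sketched as below, using the standard truncated-power construction that is constrained to be linear beyond the outer knots. The knot positions are hypothetical; the number and placement of knots used in the analyses are not stated in the text.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis: one linear column plus len(knots) - 2
    cubic columns, each constrained to be linear beyond the last knot."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)

    def pos3(u):
        # Truncated cubic: (u)_+^3
        return np.clip(u, 0.0, None) ** 3

    denom = t[-1] - t[-2]
    cols = [x]
    for j in range(len(t) - 2):
        col = (pos3(x - t[j])
               - pos3(x - t[-2]) * (t[-1] - t[j]) / denom
               + pos3(x - t[-1]) * (t[-2] - t[j]) / denom)
        cols.append(col)
    return np.column_stack(cols)

age = np.arange(15.0, 100.0)
age_knots = (35.0, 50.0, 62.0, 74.0, 85.0)  # hypothetical knot positions
X_age = rcs_basis(age, age_knots)
print(X_age.shape)
```

The columns of `X_age` would enter the stage, treatment and survival models in place of a single linear age term.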
Single stochastic imputation within the g-computation procedures was used to handle missing stage (8 %). All variables in the models (including vital status), exact length of follow-up times and detailed treatment categories were included in the imputation model.
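A single stochastic imputation step can be sketched as one random draw per patient from predicted stage probabilities. In this sketch the probabilities are supplied directly as a hypothetical array; in the analysis they would come from the imputation model described above, and the function name is an assumption.

```python
import numpy as np

def impute_stage_once(stage, probs, rng):
    """Replace each missing stage (nan) by a single random draw from that
    patient's predicted probabilities for stages 1 to 4."""
    stage = np.asarray(stage, dtype=float).copy()
    probs = np.asarray(probs, dtype=float)
    missing = np.isnan(stage)
    # Inverse-CDF sampling: count how many cumulative cut-points u has passed.
    cum = np.cumsum(probs[missing], axis=1)[:, :-1]
    u = rng.random(int(missing.sum()))[:, None]
    stage[missing] = (u >= cum).sum(axis=1) + 1
    return stage.astype(int)

rng = np.random.default_rng(1)
stage = np.array([1, np.nan, 3, np.nan, 4])
probs = np.tile([0.4, 0.3, 0.2, 0.1], (5, 1))  # hypothetical model predictions
completed = impute_stage_once(stage, probs, rng)
print(completed)
```

Unlike multiple imputation, only one draw is made per simulated dataset; the uncertainty due to the missing values is then carried through the Monte Carlo and resampling steps of the overall procedure.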
Table. Characteristics of women diagnosed with breast cancer, Yorkshire and North East (England), 2000–2007. Rows: number of patients; % alive at end of follow-up; mean age at diagnosis; stage at diagnosis (%); % receiving major treatment.
Survival from breast cancer
Stage-specific survival estimates were consistently lower in the more deprived patients. A large deprivation gap existed in short-term survival (at 1 year after diagnosis) for the most advanced stage (IV), and in long-term survival (at 5 years) for the less advanced stages (II–III). Among patients with missing stage information, the more deprived experienced worse survival.
Total effect of deprivation on cancer survival status
Role of stage on the socio-economic differences in cancer survival status
The effect of socioeconomic status on survival mediated through stage (the RIANIE, Fig. 3b) was only apparent when comparing the most deprived with the least deprived. This indirect effect through stage decreased as time since diagnosis increased (OR for 6-month mortality: 1.43, CI 1.27, 1.62; OR for 5-year conditional mortality: 1.08, CI 1.00, 1.16) (Fig. 3b). On the log odds scale, stage accounted for only about one-third of the total effect of deprivation at 6 months and 1 year (proportion mediated [PM]: 35 %, CI 23, 48 %; 30 %, CI 5, 54 %, respectively), a proportion which decreased to just over a tenth at 3 and 5 years since diagnosis (PM: 12 %, CI 4, 21 %; 14 %, CI −3, 31 %, respectively) (Appendix 6).
We also treated both age and stage as mediators (in place of just stage). We assumed here linear associations between the logarithm of age and treatment or mortality. The overall pattern hardly changed although adding age tended to slightly increase the long-term PM. This might reflect the long-term effect of age on all-cause mortality (Appendix 6).
Role of treatment on the socio-economic differences in cancer survival status
We found no evidence of a total effect of deprivation on treatment, or of an effect of deprivation on treatment mediated by stage (Appendix 8). Although treatment patterns differed between socioeconomic groups, the main mediation analysis found no evidence of an effect of deprivation on mortality mediated through differential treatment (Fig. 3c, Appendix 7).
Our results suggest that, for the most deprived patients only, earlier breast cancer diagnosis would reduce the deficit in short-term survival by up to a third and in longer-term survival by up to a tenth. The available crude information on treatment seems to show that differential surgical treatment between deprivation groups played a minor role in socioeconomic inequalities in breast cancer survival.
For the younger (15–69) patients diagnosed at stage I or II, the finding that more deprived patients received more treatment contradicts the a priori hypothesis held by some oncologists: more deprived patients may have more comorbidity, and thus undergo less aggressive diagnostic investigation and treatment. The prevalence of both obesity and tobacco smoking varies widely between deprivation groups in the general population [27, 28], but we did not have reliable information about the comorbidity of the cancer patients. However, the surgical differences observed between the socioeconomic groups may reflect that, within a given stage, more deprived patients were diagnosed with more advanced disease. To investigate this hypothesis, we will need more detailed information on tumor stage and diagnostic investigation. In addition, more affluent patients may have received treatment within private facilities, information not available to us.
In the absence of an individual measure of socioeconomic status for population-based studies in England, we used an ecological measure of deprivation. Because LSOAs (the geographical level of the deprivation measure) are relatively small (1500 inhabitants on average) and have been made as socially homogeneous as possible, the ecological bias is probably small. An ecological measure reflects both the individual and contextual dimensions of deprivation. We are not able to disentangle these two dimensions, and this affects how hypothesized interventions can be conceptualized. The English healthcare system is strongly territorialised, and any envisaged intervention should primarily target these territories, in which individual-level actions (via the general practices) are also possible. Such interventions correspond to our conceptual framework, i.e. we asked: what would be the outcome of women in the deprived group had they lived in the same areas as women in the most affluent group, with similar background risk factors and access to primary and secondary healthcare for their cancer diagnosis and treatment?
We identified three main plausible reasons that could bias our results: misclassification of the stage at diagnosis, misclassification of the treatment and unmeasured confounders between the mediator(s) and the outcome(s).
Misclassification of stage at diagnosis
More deprived cancer patients may be more likely to be managed by non-specialized centres and low-workload surgeons. The evaluation of the spread of their cancers (i.e. staging) may not be thorough enough (Fig. 1) and, as a result, they might more often be under-staged and receive non-optimal treatment. We tested this hypothesis by assuming that different proportions of the most deprived patients were under-staged. We randomly up-staged 10, 30 and 50 % of the most deprived patients by one level (stage I to II, etc.) ten times and reran the analyses to estimate the PM distributions. The proportion of survival inequalities mediated by stage hardly changed when 10 % of the most deprived patients were assumed to be under-staged, but increased substantially with 30 and 50 % under-staged, particularly for conditional survival at 1 year and beyond (data on request). For example, more than half of the lower conditional 1-year survival among the most deprived patients would be mediated by stage if more than 30 % of them were under-staged (vs. 30 % mediated if stage was not misclassified). Changing our main conclusion about the role of stage in survival inequalities would require that more than 30 % of the most deprived patients were systematically under-staged, compared with none in the most affluent group, a rather extreme assumption that is not supported by the literature.
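The up-staging step of this sensitivity analysis can be sketched as follows. The data are simulated and the function name is hypothetical. As an assumed rule, the chosen fraction is sampled from all most deprived patients and those already at stage IV are left unchanged, since the text does not say how they were handled.

```python
import numpy as np

def upstage(stage, most_deprived, fraction, rng, max_stage=4):
    """Move a random fraction of the most deprived patients up by one stage.
    Patients already at max_stage stay there (an assumed rule)."""
    new_stage = np.asarray(stage).copy()
    deprived_idx = np.flatnonzero(np.asarray(most_deprived, dtype=bool))
    n_pick = int(round(fraction * deprived_idx.size))
    picked = rng.choice(deprived_idx, size=n_pick, replace=False)
    new_stage[picked] = np.minimum(new_stage[picked] + 1, max_stage)
    return new_stage

rng = np.random.default_rng(7)
stage = rng.integers(1, 5, size=1000)    # simulated stages 1 to 4
most_deprived = rng.random(1000) < 0.2   # simulated deprivation flag

# Ten randomly up-staged datasets for each assumed under-staging proportion.
scenarios = {f: [upstage(stage, most_deprived, f, rng) for _ in range(10)]
             for f in (0.10, 0.30, 0.50)}
```

The mediation analysis would then be rerun on each of the ten datasets per scenario to obtain a distribution of PM estimates.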
Misclassification of treatment
Our analyses crudely dichotomized treatment into ‘major’ and ‘minor or no’ surgery categories. When ‘major’ surgery was further categorized into ‘breast conserving surgery’ and ‘mastectomy’, the results remained unchanged (Appendix 7). Nevertheless, the crude treatment information may explain why the mediating effect of treatment on deprivation gap in survival remained moderate and was not affected in the sensitivity analysis on the misclassification of tumor stage.
We assumed that, conditional on deprivation, age, stage and year at diagnosis, and government regions, there were no further (unmeasured) common causes of treatment and survival status. However, in addition to staging thoroughness, comorbidity could be an important confounder for treatment and mortality, which we did not account for due to the lack of reliable individual information. Ignoring the confounding effect of comorbidity would potentially lead to over-estimation of the beneficial effect of major surgery on mortality: patients with high levels of comorbidity experience high mortality, and may have a lower rate of major surgery. Since we found little evidence that treatment mediates the effect of deprivation on mortality in the original dataset, inclusion of comorbidity would not change this overall interpretation, but only if stage and treatment were not misclassified. If reliable information on comorbidity becomes available, we could potentially treat it as a mediator between deprivation and mortality, and estimate how much it contributes to the deprivation gap in survival.
Our results are based on population-based data, i.e. on virtually all patients diagnosed with a breast cancer in a given region, including those who were diagnosed with advanced stage and those who were not optimally managed. Since our main focus is to better understand the causal relationships between deprivation and breast cancer survival, and to divide it into path-specific components, applying methods from the growing literature on causal mediation is highly appropriate.
To our knowledge, very few studies have attempted to disentangle the effects of deprivation on breast cancer survival. Two studies used data from an earlier period (late 1990s) in the same region as our study [31, 32]. A complete-case analysis found adverse stage distribution and less surgical treatment (even after adjustment for stage) among more deprived patients. No stage-specific results were provided on treatment. Lower overall 5-year survival was associated with deprivation after adjustment for age and stage, but underlying pathways could not be deduced from these results. A second analysis using latent class modelling clearly identified two groups of patients according to their prognosis: more advanced stage seemed to play a role in the deprivation gap in 5-year survival in only one group. The conclusions were weakened by the fact that overall survival was analysed, while mortality from causes other than breast cancer varies greatly by deprivation within 5 years since diagnosis. Our study is also based on overall mortality. Not adjusting for competing risks of death will dilute the mediating effect of stage. However, this effect would be minimal for short-term survival, as mortality from causes not related to breast cancer does not play a significant role in short-term survival status, especially at 6 months after diagnosis. Using conditional survival also reduced this bias.
In contrast with our results, a study in another English region found that, in 2006–2010, adverse stage distribution explained half of the deficit in 5-year breast cancer relative survival observed among the most deprived patients, but all of it in other deprivation groups. However, stage-standardisation, used in order to eliminate differences in stage distribution by deprivation, cannot fully identify the effect of deprivation mediated by tumor stage in such observational data.
Applying another causal inference approach, Valeri et al. found that the contribution of stage to the disparities in survival from colorectal cancer between Blacks and Whites in the US was similar to our results for the socio-economic disparities in breast cancer survival in England. They concluded, however, that the mediation effect of stage represented a “substantial reduction”, whereas we describe it as a small reduction, which reflects differences in the study context. In contrast with the US (at least until recently), the healthcare system in England is universal, with free access to diagnosis and treatment. In theory, most disparities in cancer survival should therefore be due to patient and tumour factors, more specifically tumour stage at diagnosis and comorbidity, and not to healthcare system factors. Contrary to this belief, our results add to the growing evidence that one of the strongest prognostic factors, stage, plays a relatively small role in the socio-economic inequalities in cancer survival. Comorbidity (or health performance status) is likely to contribute to inequalities, but accounting for it will further reduce the estimated contribution of stage. This means that, in the context of a supposedly equitable healthcare system, a large proportion of these inequalities remain unexplained; inequalities within the healthcare system are likely to play a key role.
Despite data limitations, we were able to estimate the proportions of the deprivation gap in cancer survival mediated via tumor stage and treatment separately. This informs us about their respective roles and, ultimately, about what may be done to most effectively reduce the deprivation gap in cancer survival. In particular, efforts towards earlier diagnosis would reduce the cancer survival inequalities by only a third. Our conclusions may, however, be altered by unmeasured confounders such as comorbidity, staging thoroughness and detailed treatment information, the quality and completeness of which are improving dramatically in the population-based cancer registry data in England. The changes in results after sensitivity analyses demonstrate the vital importance of using reliable and correctly classified surgical treatment data in similar studies.
- 11. Pearl J. Direct and indirect effects. In: 17th Conference on Uncertainty in Artificial Intelligence. San Francisco, CA: Morgan Kaufmann; 2001. pp. 411–420.
- 13. Office for National Statistics. Cancer statistics: registrations of cancer diagnosed in 2007, England. In: Series MB1 No. 38. Newport: Office for National Statistics; 2010. pp. 1–80.
- 14. Department of the Environment Transport and the Regions. Measuring multiple deprivation at the small area level: the indices of deprivation 2000. London: DETR; 2000.
- 16. Health & Social Care Information Centre. OPCS-4 Classification. 2014. http://systems.hscic.gov.uk/data/clinicalcoding/codingstandards/opcs4. Accessed 1 Sept 2014.
- 17. National Cancer Intelligence Network. Site Specific Clinical Reference Groups (SSCRG) for breast cancer. 2013. http://www.ncin.org.uk/cancer_type_and_topic_specific_work/cancer_type_specific_work/breast_cancer/. Accessed 1 Sept 2014.
- 19. StataCorp. STATA statistical software. 13th ed. College Station, TX: Stata Corporation; 2013.
- 20. Grzebyk M, Urmès I, Hédelin G. Net survival estimation with stns. Stata J. 2014;14:87–102.
- 22. Didelez V, Dawid P, Geneletti S. Direct and indirect effects of sequential treatments. In: Dechter R, Richardson TS, editors. Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence. Arlington, VA: AUAI Press; 2006. pp. 138–146.
- 25. Daniel RM, Stavola BLD, Cousens SN. gformula: estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. Stata J. 2011;11:479–517.
- 26. Royston P, Sauerbrei W. Multivariable modeling with cubic regression splines: a principled approach. Stata J. 2007;7:45–70.
- 27. Sutton R. Adult anthropometric measures, overweight and obesity. In: Craig R, Mindell J, editors. Health Survey for England 2011. Health and Social Care Information Centre; 2012. p. 37.
- 28. Office for National Statistics. Smoking and drinking among adults, 2008. In: General Lifestyle Survey 2008. Office for National Statistics; 2010. p. 74.
- 30. Lawrence G. Further analysis of ICBP treatment data (version 1.2). 2013. p. 2 (unpublished report).
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.