Axillary dissection versus axillary observation for low risk, clinically node-negative invasive breast cancer: a systematic review and meta-analysis

Purpose 1. To systematically analyse studies comparing survival outcomes between axillary lymph-node dissection (ALND) and axillary observation (Obs) in women with low-risk, clinically node-negative breast cancer. 2. To consider results in the context of current axillary surgery de-escalation trials and studies. Methods Nine eligible studies were identified: 6 RCTs and 3 non-randomized studies (4236 women in total). Outcomes assessed were overall survival (OS) and disease-free survival (DFS). The logged (ln) hazard ratio (HR) was calculated and used as the statistic of interest. Data were grouped by follow-up. Results Meta-analyses found no significant difference in OS at 5, 10 and 25-years follow-up (5-year ln HR = 0.08, 95% CI −0.09, 0.25; 10-year ln HR = 0.33, 95% CI −0.07, 0.72; 25-year ln HR = 0.00, 95% CI −0.18, 0.19). ALND improved DFS at 5-years follow-up (ln HR = 0.16, 95% CI 0.03, 0.29), but this was not demonstrated at 10 and 25-years follow-up (10-year ln HR = 0.07, 95% CI −0.09, 0.23; 25-year ln HR = −0.03, 95% CI −0.21, 0.16). Studies supporting ALND for DFS at 5-years follow-up had greater relative chemotherapy use in the ALND cohort. Conclusion ALND does not cause a significant improvement in OS in women with clinically node-negative breast cancer. ALND may improve DFS in the short term by tailoring a proportion of patients towards chemotherapy. Our evidence suggests that when the administration of systemic therapy is balanced between the two arms, axillary de-escalation studies will likely find no difference in OS or DFS. Supplementary Information The online version contains supplementary material available at 10.1007/s12282-021-01273-6.


Introduction
The lack of a demonstrable survival benefit in landmark trials such as NSABP B-04 [1] and ACOSOG Z0011 [2] has been pivotal in reducing the extent of surgery in breast cancer. Currently, the SOUND [3] trial aims to determine whether sentinel lymph node biopsy (SLNB) has a therapeutic role over observation alone in low-risk breast cancer with normal preoperative axillary imaging. However, studies pre-dating the SLNB era, which compared axillary lymph node dissection (ALND) to observation (Obs) in clinically node-negative women with invasive breast cancer, can shed light on the direction that results from axillary de-escalation trials are likely to take. A previous review by Sanghani et al. [4] conducted a three-way comparison between Obs, ALND and axillary radiotherapy and reported no differences in survival. The review was limited to a follow-up of 5 years, and only two (of four) studies compared ALND to Obs. In this review, studies pre-dating SLNB and comparing ALND to Obs only are comprehensively assessed and a meta-analysis is conducted to examine the difference in long-term outcomes between ALND and Obs in women with low-risk, clinically node-negative breast cancer. Furthermore, the relevance of results to current practice and future research on axillary de-escalation is considered.

Study design
The study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidance.

Search process and study selection
A systematic literature search was conducted using PubMed/MEDLINE and Cochrane databases. Search terms included: breast cancer, node negative, axillary dissection, clearance and radical mastectomy. The search and the title and abstract screening were conducted by one author (M.S.). Full article screening was conducted independently by two authors (M.S. and M.A.), and disagreements in study selection were resolved through discussion. Results of each stage are illustrated in Fig. 1. Reference lists of screened articles were also reviewed. Date of the last search: 15th April 2021.
Fig. 1 (final stages of study selection): studies included in qualitative synthesis (n = 9); studies included in quantitative synthesis (meta-analysis) (n = 9).

Inclusion and exclusion criteria
All studies comparing long-term outcomes between ALND and Obs of the axilla in women with clinically node-negative invasive breast cancer were eligible. Studies must have reported at least one measure of long-term outcome that could be extracted from text or figures. Measures were prespecified as overall survival (OS) and disease-free survival (DFS) due to relative consistency in reporting. Studies must have been published in the English language. Studies not fulfilling the inclusion criteria were excluded. No restrictions were placed on study design or year of publication.

Bias assessment
Bias assessment was carried out by two authors (M.S. and M.A.) independently, then reviewed jointly for discrepancies and re-assessment. Bias was assessed in accordance with the Cochrane Handbook [5]. For randomised controlled trials (RCTs), the 'revised Cochrane risk of bias tool for randomized trials' (RoB 2) [6] was used. For non-randomised studies, the 'Risk Of Bias In Non-randomized Studies - of Interventions' (ROBINS-I) tool [7] was used.
Overall assessment of bias is presented in Fig. 2.

Data collection and statistical analysis
Study data were extracted by one author (M.S.) and reassessed by a second (M.A.). Where available, the following variables (presented in Table 1) were collated: total sample size, control/intervention sample size, mean/median age, follow-up period, OS and DFS at each follow-up interval and corresponding hazard ratios (HR), axillary recurrence, the proportion of ALND group with involved lymph nodes, proportion of cohort with T1 staged disease, proportion with oestrogen receptor (ER) positive disease and proportion of cohort receiving chemotherapy, radiotherapy and endocrine therapy. Where OS and DFS were available from survival curves only, the PlotDigitizer [8] software was used to extract data. Data were analysed by authors R.B. and M.S. An odds ratio (OR), as seen in other reviews, was not deemed suitable as DFS/OS (cumulative survival data) cannot be reliably converted into an OR and no studies reported crude survival data. A further issue was that few studies reported hazard ratios or standard errors for OS/DFS at each interval of follow-up. Instead, an approach suggested by Moodie et al. [9], for meta-analyses of survival data where HR and SE are not reported, was considered appropriate and viable after discussion with statistician colleagues. Data processing was carried out by author R.B. using Fortran90 [10]-[12]. Code is presented in supplementary materials and utilised data for OS, DFS and number at risk. A million simulations were run to calculate the HR and SE for each study. From this, values for logged (ln) HR, ln SE and 95% confidence intervals (CI) for each study were acquired and used as the statistics of interest for meta-analyses; results are presented in Table 2. Meta-analyses were grouped according to follow-up interval (5, 10 and 25-years) and were conducted using RevMan 5.4.1 [13]. A random-effects model was used when heterogeneity was present and a fixed-effects model when it was absent [14].
Heterogeneity was evaluated using the Chi² test [15]. Egger's funnel plot was used to assess for publication bias.
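The pooling step described above can be illustrated with a minimal sketch of generic inverse-variance meta-analysis on the ln HR scale. This is not the supplementary Fortran90 code or the RevMan implementation; the function names, the DerSimonian-Laird estimator for the random-effects variant and the two study values in the usage lines are assumptions made for illustration only.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of ln HR recovered from a symmetric CI on the log scale."""
    return (upper - lower) / (2 * z)


def pool_ln_hr(ln_hrs, ses, random_effects=False):
    """Inverse-variance pooled ln HR with a 95% CI.

    With random_effects=True, a DerSimonian-Laird between-study variance
    (tau^2) is added to each study's variance before weighting
    (an assumed estimator, chosen for illustration).
    """
    weights = [1 / se**2 for se in ses]
    fixed = sum(w * y for w, y in zip(weights, ln_hrs)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, ln_hrs))
    df = len(ln_hrs) - 1
    tau2 = 0.0
    if random_effects and df > 0:
        c = sum(weights) - sum(w**2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1 / (se**2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(weights, ln_hrs)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"ln_hr": pooled, "se": se_pooled, "ci": ci, "q": q, "tau2": tau2}


# Hypothetical study-level inputs (NOT taken from Table 2)
example_ln_hrs = [0.10, 0.30]
example_ses = [0.10, 0.10]
print(pool_ln_hr(example_ln_hrs, example_ses))
print(pool_ln_hr(example_ln_hrs, example_ses, random_effects=True))
```

In this sketch the random-effects variant leaves the point estimate unchanged when studies have equal precision but widens the confidence interval, which is the behaviour expected when heterogeneity is present.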

Selected studies
A total of 2824 studies were identified. The final review found nine suitable studies: six RCTs, one non-randomised controlled trial, one retrospective study and one cohort study (Table 1).

Study characteristics
Study characteristics are presented in Table 1. 2617 patients were assigned to ALND and 1619 to Obs. OS was defined as the interval from randomization to the last point of follow-up or death from all causes in all studies. DFS was defined as the interval from randomization to death, first recurrence of disease in the breast, axilla or elsewhere, or last follow-up in all studies.

Risk of bias within and across studies
Bias assessment results are presented in Fig. 2 with individual comments in Table 1. Of the six RCTs, four included power analyses [16,19]-[21], but none achieved sufficient numbers. There were concerns over randomization techniques used by three RCTs. Blinding was not feasible in any study. Details on withdrawn participants were given in all studies and did not impact results. Concerns over censorship and missing outcome data were present in one study [16]. One study ended patient recruitment early [16] and one extended the recruitment window [20]. All studies had some concerns over selective reporting of result statistics.
Of the three non-randomized studies, two included small sample sizes [17,22]. One cohort study [18] lacked an appropriate method to control for confounding and the study design allowed for selection bias. Concerns over bias in the measurement of outcomes were present in all three studies.
Funnel plots were symmetrical and suggested no publication bias.

Results of individual studies
The RCT by Agresti et al. [18] compared ALND to Obs in women aged
The RCT by Avril et al. [16] compared ALND to Obs in post-menopausal women aged > 50. A significant difference in OS (HR = 3.07, 90% CI 1.40-6.70, p = 1) and DFS (HR = 2.26, 90% CI 1.32-3.86, p = not reported) was found at 5-years follow-up. Our statistical analyses, using the study's data, emulated these findings but did not demonstrate significance in either OS or DFS (Figs. 3a and 4a).
The non-randomised trial by Shin et al. [17] compared ALND to Obs in women aged 24-90. No significant
At 10-years, three studies remained. There was significant heterogeneity between studies (I² = 79%, χ² = 9.63, p = 0.008; Fig. 3b), thus a random-effects analysis was used. Meta-analysis showed no significant difference in OS (ln HR = 0.33, 95% CI −0.07, 0.72). The study by Fentiman et al., which strongly opposed observation, used comparatively low-dose adjunct therapy, which may have contributed to heterogeneity. Removal of this study results in a ln HR that is in keeping with 5- and 25-years follow-up and does not show heterogeneity (ln HR = 0.11, 95% CI −0.07, 0.28; I² = 0%, χ² = 0.03, p = 0.86).
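The heterogeneity statistics quoted above can be reproduced from the reported χ² values alone. The sketch below assumes that the degrees of freedom equal the number of studies minus one (two for the three 10-year studies, one after removing Fentiman et al.) and uses the standard definition I² = (χ² − df)/χ², truncated at zero; the function names are illustrative.

```python
import math


def i_squared(q, df):
    """I^2 (%): share of total variability attributed to heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def chi2_p_value(q, df):
    """Upper-tail probability of a chi-squared statistic for integer df >= 1."""
    y = q / 2
    if df % 2 == 0:
        # Even df: closed-form Poisson-type sum
        terms = sum(y**k / math.factorial(k) for k in range(df // 2))
        return math.exp(-y) * terms
    # Odd df: complementary error function plus a finite correction sum
    tail = sum(
        y ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


# Reported chi-squared values; df inferred from the number of studies
for label, q, df in [("three studies", 9.63, 2), ("Fentiman excluded", 0.03, 1)]:
    print(label, round(i_squared(q, df)), round(chi2_p_value(q, df), 3))
```

Under these assumptions the calculation agrees with the reported I² of 79% (p = 0.008) for the three-study analysis and I² of 0% (p = 0.86) after exclusion.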
It appears that the data increasingly favour equivalence as the follow-up interval lengthens.
As with OS, DFS data from later follow-up intervals favour equivalence more so than short-term data.

Discussion
Meta-analyses demonstrate that the lack of survival advantage from ALND presented in individual studies was not due to a lack of power and was not limited by the age of the cohort.
No significant difference in OS at 5, 10 and 25-years follow-up was identified between ALND and Obs. DFS was significantly greater in the ALND cohort at 5-years follow-up, but became non-significant at 10- and 25-years follow-up and shifted towards equivalence.
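Because the pooled estimates are reported on the natural-log scale, it can help to back-transform them to hazard ratios. The sketch below uses the pooled ln HR values and 95% CIs quoted in the abstract; the dictionary layout and function name are illustrative, and significance is judged only by whether the interval on the log scale excludes zero. The direction of the ratio (which arm is the reference) is taken as given by the original analysis and is not restated here.

```python
import math

# Pooled ln HR and 95% CI as quoted in the abstract: (estimate, lower, upper)
POOLED = {
    ("OS", 5): (0.08, -0.09, 0.25),
    ("OS", 10): (0.33, -0.07, 0.72),
    ("OS", 25): (0.00, -0.18, 0.19),
    ("DFS", 5): (0.16, 0.03, 0.29),
    ("DFS", 10): (0.07, -0.09, 0.23),
    ("DFS", 25): (-0.03, -0.21, 0.16),
}


def back_transform(ln_hr, lower, upper):
    """Convert a ln HR and its CI to the HR scale and flag significance."""
    return {
        "hr": math.exp(ln_hr),
        "ci": (math.exp(lower), math.exp(upper)),
        # Significant when the log-scale interval excludes 0 (HR = 1)
        "significant": lower > 0 or upper < 0,
    }


for (outcome, years), values in POOLED.items():
    result = back_transform(*values)
    print(outcome, years, round(result["hr"], 2), result["significant"])
```

On these figures only the 5-year DFS interval excludes zero, corresponding to an HR of roughly 1.17 (95% CI about 1.03 to 1.34).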
Our findings are supported by a previous study by Sanghani et al. [4] which compared axillary radiotherapy, dissection and observation. The authors found no improvement in OS when comparing dissection to radiotherapy (1 study) and observation (2) or when comparing radiotherapy to observation (1). Conversely, the study found no significant difference in DFS at 5-years, but was limited to a small study selection, comparisons between multiple interventions/protocols and an inappropriate statistical technique [9].
Additionally, we report that the non-significant difference in survival is sustained over longer follow-up, confirming that treatment after recurrence benefits survival in the Obs cohort and that early recurrence in the Obs cohort does not lead to a greater reduction in long-term survival. Moreover, studies analysed in Sanghani et al.'s review were limited to an older patient cohort; our findings suggest that ALND does not improve survival irrespective of age.
treatment, specifically chemotherapy, may explain this result as it has been shown to improve survival [24]. We posit that ALND identifies the presence of axillary metastases and tailors the sub-group towards adjuvant treatment that more effectively reduces the rate of recurrence. In the Obs cohort, patients who have a relapse of disease undergo ALND and are treated with a similar protocol and thus overall survival is not impacted in the long-term.

Disease-free survival
When examining studies individually, four [16,17,19,21] showed variation in adjuvant therapy between ALND and Obs groups and two [1,18] did not disclose adjuvant therapy use. Three studies utilised adjuvant chemotherapy in the treatment protocol; all reported greater usage in the ALND cohort (Obs vs. ALND: Shin: 3% vs 29%; Avril: 2% vs 8%; Agresti: 35.5% vs 51.5%). Two studies (Shin et al. and Avril et al.) reported greater DFS in ALND over Obs, with a large relative difference in chemotherapy usage between the two cohorts. However, ln HR was not significant in either. The Agresti study had the smallest relative difference in chemotherapy, and its DFS is marginally in favour of Obs, though not significant.
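The comparison of chemotherapy use above rests on relative rather than absolute differences. A short sketch of that arithmetic, using the percentages reported in the text, is given below; expressing the relative difference as the ratio of ALND to Obs usage is an assumption about how it was judged, and the function names are illustrative.

```python
# Adjuvant chemotherapy use (%) as reported in the text: (Obs, ALND)
CHEMO_USE = {
    "Shin": (3.0, 29.0),
    "Avril": (2.0, 8.0),
    "Agresti": (35.5, 51.5),
}


def relative_use(obs, alnd):
    """Ratio of chemotherapy use in the ALND arm to use in the Obs arm."""
    return alnd / obs


def absolute_difference(obs, alnd):
    """Difference in chemotherapy use in percentage points (ALND minus Obs)."""
    return alnd - obs


for study, (obs, alnd) in CHEMO_USE.items():
    print(study, round(relative_use(obs, alnd), 2), absolute_difference(obs, alnd))

smallest_relative = min(CHEMO_USE, key=lambda s: relative_use(*CHEMO_USE[s]))
smallest_absolute = min(CHEMO_USE, key=lambda s: absolute_difference(*CHEMO_USE[s]))
print(smallest_relative, smallest_absolute)
```

On the ratio scale Agresti has the smallest imbalance (about 1.45-fold, against 4-fold for Avril and nearly 10-fold for Shin), whereas in percentage points the smallest gap belongs to Avril, so the choice of scale matters for this argument.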
These findings support the notion that chemotherapy may be responsible for improving early DFS by eliminating metastatic disease that is conducive to early recurrence, and that ALND is able to select for disease that is responsive to chemotherapy and which would otherwise recur in a 5-year interval.
This may also be true for SLNB. Studies such as the Z0011 trial [2] and more recently a study by Shigematsu et al. [25] had participants undergo SLNB before assignment to Obs or ALND. Both reported no difference in the proportion of participants receiving adjuvant chemotherapy and, as expected, no difference in 5-year DFS.

Axillary recurrence
Although ALND appears to select for a disease that benefits from chemotherapy, it cannot identify, with a high degree of sensitivity, the small cohort of women who have metastases that lead to recurrences.
Previous analyses [26] have highlighted increased axillary recurrence in the Obs cohort and its lack of association with the number of patients with histologically involved nodes in the ALND cohort [4]. Although not formally assessed, data presented within this study (Table 2) lean towards supporting this. Our results suggest axillary recurrence in the Obs cohort does not correlate with the proportion of patients in the ALND group with histologically node-positive disease, nor with a significant difference in DFS beyond short-term follow-up.
This finding supports a hypothesis initially proposed by Veronesi et al. [27] that postulated the presence of indolent metastases that are unlikely to lead to disease recurrence and more aggressive metastases that do lead to recurrence.
Genotypic differences in metastases may explain the presence of a small cohort of women who have aggressive microscopic, metastatic foci that lead to occult recurrence rather than remaining indolent. In this group, ALND or SLNB may be necessary to reduce disease progression. But, as the recurrence cohort is small, this will expose many women to unnecessary surgery and a greater risk of comorbidities.
Genomic assays present an alternative and non-invasive solution that can identify loci which confer a recurrence risk in initially clinically node-negative women. The efficacy of genomic assays for this purpose should be investigated in future studies.

In the context of modern trials
Ongoing trials are examining the extent to which ALND should be omitted; our review of pre-SLNB studies can provide some insight into their expected results.
The SENOMAC trial [28] is randomising patients with T1-3 primary breast tumours and up to two axillary macrometastases to either SLNB only or SLNB with dissection. Parallels with our study can be drawn; for instance, the authors argue that previous trials omitted key groups such as those who underwent mastectomy and therefore ALND cannot be confidently ruled out. All three studies analysed in this review that involved mastectomy (Shin et al., Fentiman et al. and Bedwani et al.) supported ALND at 5- and 10-years follow-up, suggesting patients undergoing mastectomy (who likely have higher grade disease) benefit from dissection. It is important to note that the (neo)adjuvant therapy regimen in these three studies, which we argue is critical in equating survival between cohorts, differs from the regimens of the modern era. Similarly, SLNB may guide patients who do not undergo ALND towards more specific and more intense therapy than if the axillary status remained unknown. Therefore no difference in survival is an equally plausible result from the SENOMAC study.
The SERC trial [29] is examining the value of ALND in patients with breast cancer with higher-risk features. Currently reported, underpowered results suggest that chemotherapy use significantly reduces the presence of disease in sentinel nodes, more so when given prior to dissection. Considering our results suggest few histologically node-positive patients undergo disease recurrence, reducing the burden further with chemotherapy prior to ALND/SLNB implies that even SLNB may not be needed when neoadjuvant chemotherapy is administered in women with breast cancer of select characteristics.
Our findings can be extrapolated to suggest that trials such as TAXIS [30] and Alliance A011202 [31], which are examining the effect of radiotherapy (RT) on the undissected and dissected axilla, will not identify improved survival outcomes. It is plausible that loco-regional recurrence may be reduced by RT in these studies. Therefore, the short-term improvement in DFS caused by ALND may be minimised as the undissected group will also be receiving targeted axillary (radio)therapy, which prior studies have established is equivalent to dissection in the low-risk cohort [32]. This is supported by the fact that our result of significant improvement in DFS at 5-years follow-up differs from similar reviews [33,34] that included axillary RT with no surgical intervention and found no significant difference in DFS over a similar interval.
In the context of the SOUND [3] and INSEMA [35] trials, which are comparing SLNB to Obs in low-risk breast cancer, our findings suggest that Obs is unlikely to be inferior to SLNB, especially considering the eligible cohort is of a lower risk than patients recruited in studies analysed by this review. It is unlikely these studies skewed results towards favouring dissection, as meta-analysis excluding these sub-optimal studies does not alter results dramatically. When all other parameters are controlled, it is unlikely the primary breast surgery type exacerbates any effect ALND has on OS or DFS.

Qualitative analysis of variables
No study reported a significant difference in the proportion of patients with positive oestrogen/progesterone receptors between study arms. Two studies reported greater endocrine therapy use in the Obs cohort (Shin: 84% vs 70%; Avril: 91% vs 66%) and showed increased DFS in ALND (non-significant ln HR). One study reported greater endocrine therapy usage in the ALND cohort (Agresti: 41% vs 56%) and favoured Obs for DFS (non-significant). Endocrine therapy is unlikely to be truly detrimental to DFS as other reports suggest otherwise [36]. Chemotherapy preceded endocrine therapy in the Agresti study and was used only in patients with poor characteristics; it is likely chemotherapy in the Obs cohort positively influenced DFS rather than endocrine therapy correlating with DFS decline in the ALND cohort. Radiotherapy usage was similar between ALND and Obs cohorts and is thus unlikely to influence DFS at 5-years follow-up.
Perioperative therapy changed between 1996 and 2014. Increasing efficacy of non-surgical therapeutics may shift recent studies towards equivalence and historic studies towards ALND. This was not demonstrated by our results, which instead suggested an association with unbalanced chemotherapy assignment between arms. Recurrence rates may instead have reduced over time as perioperative treatment protocols shifted. Due to the lack of completeness of reported data, this could not be assessed.

Limitations and future direction
Our study was limited by the lack of declared values for HR (of OS and DFS) and standard error, and variation in follow-up periods. This was mitigated through appropriate statistical techniques.
Of the studies assessed, the findings of a single study by Avril et al., which suggested ALND improved both OS and DFS, were attenuated when re-analysed using our statistical approach. This is not concerning, however, as the authors reported that a HR < 1.6 would support equivalence and the lower confidence limits of the declared data for OS and DFS are below this value (1.40 and 1.32, respectively). Our statistical analyses mirror these findings, showing that Avril et al.'s data favour ALND for OS and DFS at 5-years follow-up but the results are not significant. Of further note, Avril et al. also reported a trend towards equivalence from uncensored data in all parameters except DFS, which supports our analysis.
A single study recorded survival at 25-years follow-up; more data are required to confirm equivalent survival over extended follow-up.
Future studies should assess the risk of comorbidities from SLNB when compared to axillary Obs only, and the value of genomic assessment compared to SLNB in stratifying patients into those who would benefit from further axillary therapy and those who would not.
Qualitative analysis was conducted on pre-specified data. Comprehensive and quantitative subgroup analysis was not possible due to limited access to raw data; such an analysis may yield information on the variation of treatment effect in subgroups and should be attempted in future studies.
Non-randomised studies were included in this study with the aim of comprehensively analysing all data available and generating representative findings. Unfortunately, this increased the risk of confounding bias. The extent was minimised by strict non-randomised study bias assessment by two authors. Moreover, the undue effect from non-randomised studies is unlikely to be substantial, as these studies were weighted less, and analysis demonstrated equivalent findings upon their removal.

Conclusion
The results of this study indicate long-term equivalence in OS between ALND and Obs in all women with early-stage low-risk breast cancer; however, some improvement in DFS is seen in the ALND cohort in the short-term. It is unlikely that a difference in OS or DFS will be identified in axillary de-escalation studies of clinically node-negative breast cancer when the administration of systemic therapy is balanced between the two arms. However, the value of ALND, and possibly SLNB, may be in its ability to tailor a proportion of patients towards chemotherapy and thus DFS improvement, but this does not translate to OS benefit as relapses are treated by further interventions.
Funding The author(s) received no financial support for the research, authorship, and/or publication of this article.

Availability of data and material
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Code availability
The code generated in the current study is available within the supplementary information files.

Conflict of interest The authors declare no conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.