Introduction

The usefulness of preoperative magnetic resonance imaging (MRI) of the breast, its influence on mastectomy rates, and its potential in reducing reoperation rates are still the focus of heated debates [1,2,3,4,5,6,7] and of large studies [8,9,10,11,12,13,14,15,16,17,18]. Meanwhile, in the last 20 years, the routine implementation of breast MRI increased in almost all clinical settings [19,20,21]. Notably, while MRI screening programs had been established for narrowly defined very high-risk populations [22], the expansion to women with dense breasts [23, 24] has been progressively advocated from 2019 onwards after the results of the DENSE [25, 26] and ECOG-ACRIN EA1141 [27] trials. As a result, an increasing number of breast cancer patients have already undergone breast MRI before diagnosis and before surgical planning: even though these MRI examinations were not performed with preoperative intent, their results ultimately go on to impact treatment.

This was also observed in the Multicenter International Prospective Analysis (MIPA) study, which aimed to compare the mastectomy and reoperation rates between patients who did and did not undergo preoperative breast MRI according to usual practice in 27 centres worldwide [28, 29]. MRI indications other than the preoperative one ultimately accounted for more than 20% of patients who underwent MRI; these patients were excluded from the main analysis, which focused on the preoperative indication [29]. In this 20% cohort, two subgroups could be identified: (i) women with higher-than-average breast cancer risk who had MRI as a screening examination; (ii) women who underwent MRI as a diagnostic examination, mainly for problem-solving purposes.

As the MIPA study enrolled patients from 2013 to 2018, a further increase of the percentage of patients who come to surgical planning with screening or diagnostic MRI beyond the aforementioned 20% is easily foreseeable or—most likely—already occurring, but the effects of MRI in these subgroups on the choice between breast-conserving surgery (BCS) and mastectomy remain to be ascertained.

Therefore, this report from the MIPA study addresses mastectomy and reoperation rates in women who underwent breast MRI for screening and diagnostic purposes, using multivariable analysis to investigate the role of MRI referral/nonreferral and other covariates in driving surgical outcomes.

Materials and methods

Details on the design and methods of the MIPA study have been previously reported [28, 29] and are summarised here. The MIPA study observationally enrolled women aged 18–80 years with newly diagnosed breast cancer at core-needle or vacuum-assisted biopsy (CNB/VAB), without indications for neoadjuvant therapy and amenable to upfront surgery. Following routine practice of each centre, the diagnostic pathway included a variable combination of the following: conventional imaging, i.e. mammography and/or breast ultrasonography; stereotactic, ultrasound- or MRI-guided CNB/VAB; bilateral contrast-enhanced MRI; and, where needed, further CNB/VAB sampling of additional lesions detected by preoperative MRI. Data from all these steps were recorded alongside data from surgical planning stages (multidisciplinary team meetings or direct interview with clinicians) and from surgical pathology.

Patient subgroups

Purpose and timing of MRI referral were recorded and classified as “screening”, “diagnostic”, or “preoperative”, specific purposes behind MRI referral in the first two subgroups being detailed in Table 1. Notably, screening and diagnostic MRI had to be performed before CNB/VAB. This work extends the per-protocol and patient-based analysis of surgical endpoints (detailed below), already performed for patients who did not undergo MRI (noMRI subgroup) and patients who underwent MRI with a preoperative purpose (P-MRI subgroup) [29], to patients from the screening (S-MRI) and diagnostic (D-MRI) subgroups.

Table 1 Referral purposes among the S-MRI and D-MRI subgroups

Evaluation of surgical planning and outcomes was recorded at four different timepoints: (1) surgical planning according to findings from conventional imaging only; (2) surgical planning according to findings from conventional imaging and MRI; (3) actual surgery performed; (4) surgical outcome and immediate/short-term reoperation for close or positive margins. While data from all four timepoints were available for the P-MRI subgroup, timepoint 1 was not available for the S-MRI and D-MRI subgroups, as MRI was embedded in the diagnostic process and no surgical treatment was planned before the patient underwent MRI. Therefore, while analyses on timepoints 3 and 4 were conducted as planned, timepoints 1 and 2 were collapsed into a single timepoint, i.e. “surgical planning after imaging”, considering surgical planning after MRI for the S-MRI, D-MRI, and P-MRI subgroups and surgical planning after conventional imaging for the noMRI subgroup.

Endpoints

The two primary surgical endpoints are first-line mastectomy and immediate/short-term reoperation for close or positive margins. The two secondary surgical endpoints are first-line bilateral mastectomy and the overall mastectomy rate, the latter obtained by adding first-line mastectomies to conversions from BCS to mastectomy after reoperation.

Data analysis

The Supplementary Material details univariate comparisons of demographic, imaging, CNB/VAB, and surgical pathology characteristics among the noMRI, S-MRI, D-MRI, and P-MRI subgroups. Figure 1 presents univariate comparisons of surgical planning and outcomes. Considering the observational and non-randomised design of the MIPA study and size imbalances between the four subgroups, non-parametric statistics were used, namely the χ2 and Fisher’s tests for categorical variables and the Kruskal-Wallis test for continuous variables. To account for multiple testing, the Bonferroni correction was applied for the 24 overall comparisons—resulting in a p < 0.002 threshold for statistical significance—whereas p values presented in pairwise post hoc testing were automatically adjusted with the Bonferroni-Holm correction (adjusted p < 0.05 significance threshold).
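The two multiple-testing corrections described above can be illustrated with a minimal sketch. The threshold calculation uses the 24 comparisons and the conventional 0.05 level stated in the text; the six raw p values are hypothetical placeholders (one per pairwise comparison among four subgroups), not study data, and the function names are illustrative. The analyses themselves were run in SPSS, whose internal implementation may differ in detail.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def holm_adjust(p_values: list[float]) -> list[float]:
    """Return Bonferroni-Holm adjusted p values, in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never falls below a previous one
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Threshold for the 24 overall comparisons (0.05 / 24, i.e. about 0.002)
threshold = bonferroni_threshold(0.05, 24)
print(round(threshold, 4))  # 0.0021

# Hypothetical raw p values for the six pairwise subgroup comparisons
raw = [0.0004, 0.0300, 0.0100, 0.2000, 0.0020, 0.0450]
print([round(p, 4) for p in holm_adjust(raw)])
```

Adjusted p values obtained in this way are compared directly against 0.05, which is why the pairwise results are reported with an adjusted p < 0.05 threshold rather than with the stricter 0.002 cut-off.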

Fig. 1 Stage-by-stage analysis of surgical endpoints in the four subgroups. Red lines indicate comparisons of rates between subgroups that were statistically significant at post hoc testing after adjustment with the Bonferroni-Holm correction (adjusted p values are shown). S-MRI, screening magnetic resonance imaging; D-MRI, diagnostic magnetic resonance imaging; P-MRI, preoperative magnetic resonance imaging

Multivariable binary logistic regression analysis was then performed to estimate the relative effect of covariates on the four outcomes of interest by calculating adjusted odds ratios (ORs) for such predictors with their 95% confidence intervals (CIs). Covariate selection from the clinically reasoned pool of demographic, imaging, and pathology variables was performed for each of the four binary logistic regressions using stepwise multivariable linear regression (forward selection with p < 0.1 as the threshold for variable inclusion) and is detailed in the Supplementary Material.

All analyses were performed with SPSS v.26.0 (IBM SPSS Inc.).

Results

Study population

Of the 7245 patients enrolled between June 2013 and November 2018, 1417 were excluded due to unretrievable or missing data. Thus, 5828 patients entered this analysis with 2763 (47.4%) in the noMRI subgroup. Among the 3065 (52.6%) patients who underwent MRI, 2441 (79.7%) were in the P-MRI subgroup, 510 (16.6%) in the D-MRI subgroup, and 114 (3.7%) in the S-MRI subgroup.

As detailed in Tables E1, E2, E3, and E4 (Supplementary Material), differences between MRI subgroups in demographic, imaging, and pathology characteristics became evident when examining the S-MRI and D-MRI subgroups. Patients of the S-MRI subgroup had different characteristics in almost all analysed indicators, being younger than those of the P-MRI subgroup, having a far higher proportion of familial or personal genetically proven increased risk of breast cancer and the highest rates of small and single-focus cancers. The D-MRI subgroup exhibited an intermediate profile either between the noMRI and the P-MRI subgroups (e.g. in terms of lesion size and focality at conventional imaging and final pathology) or between the S-MRI and the P-MRI subgroups (e.g. in terms of patients’ age, hormonal status, lesion size and focality at MRI, rate of invasive lobular component at CNB/VAB and surgical pathology).

Table 2 Multivariable binary logistic regression model of variables associated with first-line mastectomy
Table 3 Multivariable binary logistic regression model of variables associated with bilateral first-line mastectomy
Table 4 Multivariable binary logistic regression model of variables associated with reoperation for close or positive margins

Surgical endpoints

After imaging, the planned mastectomy rate was lowest in the noMRI subgroup (14.4%, 398/2763 patients, p < 0.001) and highest in the S-MRI (32.5%, 37/114 patients) and P-MRI subgroups (32.4%, 791/2441 patients). The D-MRI subgroup had an intermediate rate of 22.0% (112/510 patients, adjusted p values < 0.001 for pairwise comparisons).

First-line mastectomy

These trends were confirmed when analysing the rates of actually performed first-line mastectomy. Except for the S-MRI subgroup, actually performed mastectomy rates differed only slightly from initial planning, due to patient preference-based conversions of planned BCS to mastectomy and vice versa. The S-MRI subgroup had the highest mastectomy rate (36.0%, 41/114 patients), followed by the P-MRI subgroup (33.6%, 820/2441 patients), the D-MRI subgroup (21.8%, 111/510 patients), and the noMRI subgroup (15.6%, 432/2763 patients).

The findings from univariate analysis were confirmed by multivariable linear (Table E5) and binary logistic (Table 2) regressions, where S-MRI had the highest significant OR for first-line mastectomy (2.5, p < 0.001), whereas D-MRI had the lowest and non-significant OR (1.2, p = 0.300). Multicentric cancer diagnosis at ultrasonography (OR 5.8, p < 0.001) or mammography (OR 2.4, p = 0.001) was the imaging feature most strongly associated with mastectomy, while among CNB/VAB features the presence of invasive lobular carcinoma carried a 1.7 OR for mastectomy (p < 0.001).
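To show why the adjusted ORs differ from what the raw rates alone would suggest, the following minimal sketch computes crude (unadjusted) odds ratios for first-line mastectomy from the counts reported above, with a Wald-type 95% CI on the log-odds scale. This is not the multivariable model of Table 2: it ignores all covariates, and the function name and the use of the Wald interval are assumptions made for illustration only.

```python
import math


def crude_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Crude odds ratio of group A versus group B with a Wald 95% CI.

    Illustrative helper only; it performs no covariate adjustment.
    """
    a, b = events_a, total_a - events_a  # group A: events, non-events
    c, d = events_b, total_b - events_b  # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# First-line mastectomy counts reported in the text, noMRI as reference
s_mri = crude_odds_ratio(41, 114, 432, 2763)   # S-MRI vs noMRI
d_mri = crude_odds_ratio(111, 510, 432, 2763)  # D-MRI vs noMRI

for label, (or_, lo, hi) in [("S-MRI", s_mri), ("D-MRI", d_mri)]:
    print(f"{label}: crude OR {or_:.1f} (95% CI {lo:.1f} to {hi:.1f})")
```

The crude values are somewhat higher than the adjusted ORs of 2.5 and 1.2, as would be expected if part of the raw difference is explained by covariates such as multicentric presentation or invasive lobular histology rather than by the MRI indication itself.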

Table 5 Multivariable binary logistic regression model of variables associated with overall mastectomy

The S-MRI subgroup had the highest rate of bilateral mastectomy among patients undergoing first-line mastectomy (34.1%, 14/41 patients, adjusted p values < 0.001 for all comparisons), followed by P-MRI (10.6%, 87/820 patients) and D-MRI (8.1%, 9/111 patients), whereas only 12/432 first-line mastectomies (2.8%) in the noMRI subgroup were bilateral. Multivariable regression (Table E6 and Table 3) mirrored the clinical bias towards bilateral mastectomy in patients undergoing screening MRI (OR 29.2, p < 0.001) or patients with multicentric cancer diagnosis at ultrasonography (OR 7.5, p < 0.001).

Reoperation

The noMRI subgroup had the highest reoperation rate for close or positive margins (11.7%, 323/2763 patients), followed by the S-MRI subgroup (10.5%, 12/114 patients), the P-MRI subgroup (8.5%, 207/2441 patients), and the D-MRI subgroup with the lowest reoperation rate (8.2%, 42/510 patients). While a significant difference (adjusted p value < 0.001) was found between the reoperation rates of the noMRI and P-MRI subgroups, no significant difference was observed between different MRI subgroups (adjusted p values = 1.000).

The S-MRI subgroup had the highest rate of BCS that underwent reoperation with conversion to mastectomy instead of wider local excision (4/11, 36.4%), followed by the P-MRI subgroup (63/190, 33.2%), the D-MRI subgroup (12/40, 30.0%), and the noMRI subgroup (66/316, 20.9%). A significant overall difference (p = 0.016) was observed, the pairwise comparison between the P-MRI and the noMRI subgroup showing the only significant difference (adjusted p value = 0.013, adjusted p = 1.000 for all other comparisons).

Multivariable modelling (Table E7 and Table 4) highlighted how all indications for MRI had a protective effect against reoperation (OR 0.7 with p values ≤ 0.029 for D-MRI and P-MRI, OR 0.8 with p = 0.452 for S-MRI), a role shared with many drivers of first-line mastectomy such as multicentric cancer presentation (OR 0.4, p = 0.029) and lesion size ≥ 20 mm (OR 0.8, p = 0.029) at ultrasonography. All factors associated with a higher likelihood of reoperation came from pathology, with the presence of pure DCIS carrying the highest OR (2.5, p < 0.001) for reoperation, followed by the presence of DCIS associated with invasive cancer (OR 1.8, p < 0.001).

Overall mastectomy rate

Differences outlined in previous comparisons were more evident in overall mastectomy rates: this rate rose to 39.5% (45/114 patients) in the S-MRI subgroup, closely followed by the P-MRI subgroup (36.2%, 883/2441 patients, pairwise comparison adjusted p = 1.000). Compared to these two subgroups, as was the case for first-line surgery, the noMRI subgroup had half (or less) of these rates, with an 18.0% overall mastectomy rate (498/2763 patients, adjusted p values < 0.001 for the comparisons with S-MRI and P-MRI). The 24.1% overall mastectomy rate in the D-MRI subgroup (123/510 patients) significantly differed from all subgroups while being marginally closer to the noMRI subgroup (adjusted p values of 0.009, 0.001, and 0.010 for the comparisons with the S-MRI, P-MRI, and noMRI subgroups, respectively). The multivariable model (Table E8 and Table 5) showed how variables strongly associated with reoperation protected against overall mastectomy, such as pure DCIS (OR 0.6, p < 0.001) or DCIS associated with invasive cancer (OR 0.8, p = 0.005) at surgical pathology and single-focus cancer presentation at mammography (OR 0.7, p = 0.005). Conversely, strong association with overall mastectomy was observed for surgical pathology findings that implied a reoperation strategy based on mastectomy, such as multifocal or multicentric cancer (OR 3.9, p < 0.001) and lesion size ≥ 20 mm (OR 1.8, p < 0.001). Clinical variables driving upfront mastectomy were also associated with overall mastectomy, such as multicentric cancer at conventional imaging (OR 2.5 with p = 0.002 for mammography, OR 4.2 with p < 0.001 for ultrasonography), screening MRI referral (OR 2.4, p < 0.001), high familial risk (OR 2.2, p = 0.003), and pure DCIS or invasive lobular carcinoma diagnosis at CNB/VAB (OR 1.7 with p < 0.001 and OR 1.5 with p < 0.001, respectively).
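The overall mastectomy rate is defined as first-line mastectomies plus conversions from BCS to mastectomy at reoperation, so the reported figures can be cross-checked against the counts given in the preceding subsections. The short sketch below does this; the data structure and function name are illustrative, while all counts are taken from the text.

```python
# Counts reported in the Results: subgroup size, first-line mastectomies,
# and BCS cases converted to mastectomy at reoperation
subgroups = {
    "noMRI": {"n": 2763, "first_line": 432, "converted": 66},
    "S-MRI": {"n": 114, "first_line": 41, "converted": 4},
    "D-MRI": {"n": 510, "first_line": 111, "converted": 12},
    "P-MRI": {"n": 2441, "first_line": 820, "converted": 63},
}


def overall_mastectomy(counts):
    """Return (overall mastectomy count, rate in %) for one subgroup."""
    overall = counts["first_line"] + counts["converted"]
    return overall, round(100 * overall / counts["n"], 1)


overall_rates = {name: overall_mastectomy(c) for name, c in subgroups.items()}

for name, (count, rate) in overall_rates.items():
    print(f"{name}: {count}/{subgroups[name]['n']} = {rate}%")
# noMRI: 498/2763 = 18.0%
# S-MRI: 45/114 = 39.5%
# D-MRI: 123/510 = 24.1%
# P-MRI: 883/2441 = 36.2%
```

The recomputed counts and percentages match those reported for all four subgroups.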

Discussion

The rationale of this subgroup analysis of the MIPA study was based on the observation that 21% of women in the MRI subgroup had already undergone MRI for screening or diagnostic purposes when they reached treatment planning. In these patients, the MRI result is used in surgical decision-making, bypassing the debate about the appropriateness of its preoperative use [5, 7]. Besides concerns about overdiagnosis and overtreatment, this represents a relevant issue caused by the ever-widening application of MRI from problem-solving [30] to screening [23].

As expected, the characteristics of the S-MRI subgroup translated into threefold higher bilateral mastectomy rates compared to the D-MRI and P-MRI subgroups, both at surgical planning after imaging and at the evaluation of actually performed first-line surgery, also acknowledging trends towards contralateral prophylactic mastectomy [31, 32] and the influence of patient preferences on 9.8% of all mastectomies (compared to “only” 3.7% in the P-MRI subgroup). Of note, while the reoperation rate of 10.5% in the S-MRI subgroup was higher than that in the D-MRI and P-MRI subgroups (8.2% and 8.5%, respectively) and was close to that in the noMRI subgroup (11.7%), reoperations were performed in 92% of cases on previously attempted BCS.

In the D-MRI subgroup, MRI referral was due to equivocal findings at conventional imaging in 83% of cases (MRI as problem solving), with a patient profile in between the noMRI and the P-MRI subgroups. In the latter, the composite selection bias towards MRI [6, 18] defined a subgroup of younger patients with complex cases, larger lesions, and higher rates of multifocal or even multicentric disease. Conversely, this phenomenon was less conspicuous in the D-MRI subgroup, likely because cancers exhibiting equivocal findings at conventional imaging come from the whole spectrum of breast cancer stages, as recently shown by a population-based study on problem-solving MRI [33]. Univariate analysis highlighted how the intermediate profile of the D-MRI subgroup extended to surgical endpoints, with the lowest rates of first-line and overall mastectomy among MRI subgroups, and the lowest reoperation rate in the whole study (8.2%). These findings were confirmed by multivariable analysis, where membership in the D-MRI subgroup carried the lowest OR for first-line mastectomy, first-line bilateral mastectomy, and overall mastectomy. Notably, D-MRI carried a 1.0 OR of overall mastectomy (95% CI 0.8–1.3) compared to the noMRI subgroup.

The comparison of the D-MRI and P-MRI subgroups provided other insights: P-MRI examinations were driven by the sole purposes of ipsilateral staging and contralateral screening in CNB/VAB-proven cancer, whereas the spectrum of indications in the D-MRI subgroup included the heterogeneous “equivocal findings” at conventional imaging and a substantial diversity in the remaining 17% of cases. Notably, the MIPA study did not include patients without breast cancer. Thus, the D-MRI subgroup was relatively small because only 10–13% of patients undergoing problem-solving MRI are reported to be ultimately diagnosed with breast cancer [33, 34]. Still, the imaging and pathology profiles of cancers in the D-MRI subgroup in our study seem to be less polarised towards large and multifocal or multicentric tumours compared to cancers from the P-MRI subgroup, as already observed in a population-based study [33].

Moreover, multivariable regression analysis indicated that first-line mastectomy, reoperation, and overall mastectomy were driven by patient-specific imaging and pathology features rather than by the indication for the MRI examination. These findings hint that overtreatment concerns might be less pronounced than what could be surmised from the experience of MRI screening in high-risk populations (which naturally carry a strong multi-layered propensity towards surgery [35]) or from the experience with preoperative MRI, solely performed on CNB/VAB-proven cancers. Indeed, the D-MRI subgroup has a far more “neutral” purpose behind MRI referral, while the S-MRI and P-MRI subgroups are affected by referral biases that, albeit different, ultimately drive surgical planning towards mastectomy. These biases are supported by patient-based characteristics in the S-MRI subgroup and by imaging-based and biopsy-based characteristics in the P-MRI subgroup, in comparison to the noMRI and D-MRI subgroups. For example, while the use of MRI in the S-MRI subgroup could still be considered a “diagnostic” one, as in the D-MRI subgroup, data from Tables E1, E2, and E3 highlight how factors such as age and familial or personal genetically proven increased risk of breast cancer characterise this subgroup and constitute an a priori referral bias towards mastectomy. Conversely, the P-MRI subgroup carries an a posteriori referral bias, supported by several characteristics that only emerge when the diagnostic pathway has already begun, i.e. imaging-based and biopsy-based features such as larger maximal lesion diameters and higher rates of multifocal or multicentric presentation at conventional imaging, and the presence of lobular component at CNB/VAB. Evidence from follow-up and secondary analyses of breast MRI screening trials outside the high-risk setting will therefore be crucial to better define these issues, also acknowledging that the panorama of contrast-enhanced breast imaging saw an extension towards contrast-enhanced mammography [36], which has already received a conditional recommendation from the European Commission Initiative on Breast Cancer to substitute MRI in the preoperative setting [37], even before the completion of specific randomised trials [38].

Limitations of this work and of the MIPA study itself chiefly reside in its non-randomised and observational design and in the impossibility of conducting a thorough evaluation of factors that affect surgical decision-making, such as individual surgeon experience and choice, access to advanced reconstruction techniques, or patient-specific and institutional factors. Furthermore, considering the 2013–2018 enrolment timeframe, the potential intervening effect of three factors that emerged in the last decade must be acknowledged: first, the technical and clinical improvements of breast MRI; second, the expanded role of MRI compared to the guidelines issued at the beginning of the 2010s [19]; third, the widespread adoption of digital breast tomosynthesis. These factors could have mitigated the imbalance between subgroups.

In conclusion, this subgroup analysis of the MIPA study confirmed that, in all patients, surgical planning was driven by many factors other than the reason for MRI referral, including demographic, conventional imaging, and pathologic features. Patients with MRI performed before CNB/VAB for screening or diagnostic purposes had different characteristics and surgical outcomes compared to both the noMRI subgroup and the P-MRI subgroup. Patients from the D-MRI subgroup had the lowest overall mastectomy rate (24.1%) among MRI subgroups and the lowest absolute reoperation rate (8.2%) together with the P-MRI subgroup (8.5%). This analysis offers an insight into how the initial indication for MRI shapes its subsequent influence on the surgical treatment of breast cancer.