Introduction

International guidelines recommend breast-conserving surgery (BCS), whole breast irradiation (WBI) and systemic therapy including endocrine therapy, HER2-targeted- and chemotherapy as the appropriate treatment for early-stage breast cancer. This multimodal approach has been shown to be equivalent to mastectomy in numerous randomized trials [1, 2]. Breast conserving surgery also seems to be associated with an improved quality of life (QoL) [3,4,5,6,7,8].

Scientific and structural advances including diagnostic imaging, high quality pathological testing, less invasive and morbid resection and effective systemic treatment have achieved very favorable oncologic results for most patients with early-stage disease. Women with localized breast cancer have a 99% chance of being alive after 5 years [9].

Several attempts have been put forward to de-escalate the treatment of early-stage breast cancer. Omission of adjuvant whole breast irradiation was studied in multiple randomized trials [10,11,12,13,14,15,16,17]. Published meta-analyses established that forgoing WBI does not impact overall survival in selected patients but is associated with a significantly higher rate of local recurrence [18,19,20]. Partial breast Irradiation (PBI) was suggested as a technique that limits radiotherapy to the tissue adjacent to the tumor bed which is the most frequent site of local recurrence. Studies of PBI investigated non-inferiority of local control, comparable or better cosmetic results and lower toxicities when compared to WBI. Additionally, PBI was designed to be delivered for shorter treatment durations, being more convenient for patients and less expensive for health care providers.

This analysis will review and analyze the current data on adverse events comparing WBI to PBI, their side effects and cosmesis. The results on survival as well as local control rates were published separately [21, 22].

Material and methods

On March 1st, 2023 we performed a literature review according to the published PRISMA guideline [23]. We searched the PubMed library with the search terms (“radiation therapy” or “radiotherapy” or “irradiation”) AND ("breast cancer" or "carcinoma of the breast") AND ("partial" or "targeted") AND (“randomized” OR “randomized” OR “randomly”). In addition, we screened the major meetings (e.g. ASTRO-, ESTRO-, ESMO-, ASCO annual meetings) for published abstracts. The search was supplemented by hand searches. We included randomized controlled trials including patients suffering from early-stage invasive breast cancer comparing PBI to WBI with a language restriction to English. Trials had to be published after the year 2009 in order allow for comparable modern techniques.

Selection of endpoints

All available toxicity and QoL analyses were retrieved from the published literature and filtered for matching scales and follow-up time. In order to include the highest possible number of trials we chose the endpoint and scoring used in the majority of trials. In the toxicity and QoL analyses, the longest available follow-up time was used to insure adequate capture of adverse events. In the evaluation of cosmetic events, we stratified by follow-up time. In order to compare different PBI techniques and external beam radiation therapy (EBRT) fractionation schedules, we performed an analysis using all available data that included non-standard protocol-specific endpoints. We hierarchically used the grade 2 + toxicities, or when unavailable, grade 3 + and subsequently grade 1 + toxicities.

Toxicities

Acute and late toxicities were generally reported on the EORTC/RTOG or LENT-SOMA scales which are 6 and 5 point Likert-scales (0–4/5) in increments of 1 with the main difference of death from toxicity (5) only scorable in the EORTC/RTOG scoring system. We also included other interval scales like the four point used in the IMPORT LOW-trial reporting dichotomized when we were investigating relative effect sizes in the study arms.

The cosmetic results were obtained at the latest available time point according to a four point scale (excellent/good/fair/poor) in most trials [24,25,26,27,28,29,30,31,32,33,34]. The cosmetic assessment by the physicians from the IMPORT LOW trial was based on the photographic assessment in a three-point scale (no change [none], mild, or marked change) and reported for the mild and marked change at the five year timepoint [35]. Cosmetic results evaluated by the patients were scored in 4-point scale. The item “Breast appearance changed” was dichotomized in the 4-point scale (not at all, a little, quite a bit, or very much) and used from the IMPORT LOW trial at the five years follow-up timepoint. The DBCG PBI trial assessed patient rated cosmesis in the item “Patient satisfaction, compared with contralateral breast” on a 4-point scale [36]. When the objective assessment included both physicians and nurses, we used the evaluation of the trained nurses as the might provide a more unbiased assessment [26, 34].

Quality of life analysis

Three trials used the EORTC QLQ-C30 questionnaire for general quality of life and the QLQ-BR23 questionnaire for breast- cancer specific quality of life. Both scales are subdivided into functional and symptomatic scales.

The IMPORT LOW trials also assessed QoL with the QLQ-BR23 questionnaire and protocol specific questions. However, the published analysis was restricted to the dichotomized values for moderate or marked responses. The NSABP B-39 trial reported several patient assessments including QoL measured by the BCTOS scale. Due to these differences, we were unable to include both trials in the assessment. The predefined threshold for minimal clinically meaningful difference was set at values of 5 or above [37].

Endpoints

We extracted the provided hazard ratios, odds ratios or event numbers from the identified trials to estimate the effect sizes comparing WBI to PBI in the endpoints of acute adverse events including any acute toxicities, acute skin toxicity, pneumonitis and breast pain. Late adverse events included any late toxicities, late skin toxicity, late subcutaneous/fibrosis/induration, telangiectasia, breast pain, chest wall pain, breast edema, fat necrosis and pulmonary toxicity. Bone toxicity according to RTOG scale was pooled with chest wall pain from the Common Terminology Criteria for Adverse Events (CTCAE) Version 3.0 [26, 33].

Subgroups

In order to compare different techniques, we grouped the trials according to PBI technique into external beam radiation (QD = a maximum of once daily treatment and BID = twice daily therapy), intraoperative radiotherapy using electrons or photons and any interstitial or applicator-based brachytherapy. If a trial included more than one technique, the trials were reported as the combination of the two techniques, except when separate data were available. The respective PBI techniques in Figs. 2, 4, 5, 6 and 7 are: mixed (green), EBRT BID (light blue), darker blue (EBRT QD), red (IORT), orange (BT).

Statistics

The comparison of acute and late adverse events and cosmetic results was obtained using odds ratios. QoL subcategories from the EORTC QLQ-C30 and EORTC QLQ-BR23 scales were compared with weighted mean differences (WMD). We used the inverse variance heterogeneity model (ivhet) to pool effect sizes, as this model uses a more conservative estimate of the confidence limits, has less observed variances and favors larger trials compared to the commonly used random effects model [38]. Zero event correction was applied where appropriate [39]. Statistical significance limit was set at p-values lower than 0.05. Significant values are marked in bold letters for better visibility. Heterogeneity within the meta-analysis was obtained with Cochran’s Q-test with the corresponding p values. The I2 statistics were also described, with values above 25% identified as considerable heterogeneity, triggering a subgroup analysis by technique [40]. Funnel plots were created to assess publication bias. For statistical analysis, Microsoft Excel add-in MetaXL 5.3 was used (EpiGear International, Sunrise Beach, Australia). Plots were created using Microsoft Excel for Microsoft Office 365 Pro Plus (Redmond, Washington, U.S.).

Results

The literature analysis (Appendix Fig. 8) identified 51 publications reporting 16 different randomized trials. These included an overall number of 19,085 patients. Median follow-up for the primary endpoints was 8.6 years. Appendix Tables 1 and 2 show an overview of the included trials with important patient characteristics, treatment details and toxicity endpoints.

Acute toxicity

The comparison in acute toxicities between PBI and WBI is described in Fig. 1. PBI was associated with a significant decrease in acute adverse events equal or higher than grade I (n = 678; OR = 0.12; 95% CI 0.09–0.18; p < 0.001) and a reduction in acute skin toxicity, specifically grade II + radiodermatitis (n = 7348; OR = 0.16; 95% CI 0.07–0.41; p < 0.001). Grade II + acute skin toxicity occurred in 5.5% (95% CI 2.1–9.5%) of patients treated with PBI compared to 29.5% (95% CI 18.1–41.6%) with WBI. There were no statistically significant differences regarding grade II + acute pneumonitis (OR = 0.26; 95% CI 0.06–1.06; p = 0.060), or grade II + breast pain (OR = 0.92; 95% CI 0.65–1.30; p = 0.632) between the two modalities. In addition to the shown endpoints, PBI decreased acute breast edema, as analyzed in the RAPID trial (n = 2135; OR = 0.68; 95% CI 0.49–0.95; p = 0.023). Acute fatigue was not statistically different between the two groups (n = 2135; OR = 0.90; 95% CI 0.71–1.16; p = 0.423).

Fig. 1
figure 1

Comparison of acute toxicities sorted by grade using pooled rates and odds ratio with the respective 95% confidence intervals between partial breast radiotherapy and whole breast radiotherapy

The analysis of relative acute skin toxicity by subgroup is shown in Fig. 2. A relative reduction of acute toxicity was shown for all PBI methods, except for IORT (p = 0.104).

Fig. 2
figure 2

Comparison of acute skin toxicity using odds ratios with the respective 95% confidence intervals between partial breast radiotherapy and whole breast radiotherapy according to technique and fractionation schedule. The red and orange lines indicate PBI with IORT and BT, respectively

Late toxicity

The combined analysis of all assessed late toxicity endpoints separated by grading is presented in Fig. 3. Any late toxicity, late skin toxicity, late subcutaneous fibrosis/induration, late telangiectasia, late breast pain, fat necrosis and lung toxicity were not different between PBI and WBI as shown in Fig. 3. PBI resulted in more late chest wall pain in all analyzed grades.

Fig. 3
figure 3

Comparison of late toxicities sorted by grade using pooled rates and odds ratio with the respective 95% confidence intervals between partial breast radiotherapy and whole breast radiotherapy

When the PBI technique was analyzed separately (Appendix Fig. 9), there was no difference in any late adverse events between the trial groups (OR = 2.05’; 95% CI 0.45–9.39; p = 0.357). Numerically, EBRT BID had higher rates of late adverse events (OR = 2.05; 95% CI 0.39–24.13; p = 0.285), but without reaching the threshold of statistical significance. There was also no difference in late skin adverse events like fibrosis, by either radiation modality or technique (Appendix Fig. 10). In the sub-analysis of late subcutaneous tissue toxicity by different radiation techniques, the overall comparison likewise did not detect a significant difference (OR = 2.00; 95% CI 0.89–4.51; p = 0.094) (Appendix Fig. 11) Patients treated with brachytherapy suffered more subcutaneous tissue toxicity (OR = 1.66; 95% CI 1.03–2.67; p = 0.037).

Cosmesis

Unfavorable cosmetic outcome (fair or poor on the 4-point scale) rated by medical professionals (physicians or nurses) was not different between the treatments after 1, 3, 5, 10 years and maximal follow-up, but reached statistical significance at the 20th year time point. However, this was based on data from a single trial (Fig. 4). The analysis by technique using all available maximal follow-up time intervals showed a significantly worse cosmesis for EBRT BID/BT (n = 604; rate PBI: 25.9%; rate WBI: 30.9%; OR = 2.14; 95% CI 1.41–3.24; p < 0.001), while once a day EBRT resulted in less cosmetic deterioration (n = 2071; rate PBI: 7.7%; rate WBI: 14.7%; OR = 0.60; 95% CI 0.45–0.79; p < 0.001). There was a trend towards worse cosmesis using the point estimates for EBRT BID, while numerically BT and IORT were associated with improved cosmesis.

Fig. 4
figure 4

Comparison of unfavorable cosmesis (fair/poor) rated by medical professionals by different time points and technique using odds ratios with the respective 95% confidence intervals between partial breast radiotherapy and whole breast radiotherapy

When using the patients’ assessment of breast cosmesis we obtained similar results. Overall, unfavorable cosmesis was not different between PBI and WBI at the timepoints 1y, 3y, 5y, 10y and maximal follow-up. Both EBRT BID/BT (n = 675; OR = 1.63; 95% CI 1.14–2.34; p = 0.007) and EBRT BID (n = 3215; OR = 2.08; 95% CI 1.22–3.54; p = 0.007) resulted in significantly worse cosmetic results. Patients receiving IORT reported better results (n = 68; OR = 0.24; 95% CI 0.06–0.95; p = 0.042) (Fig. 5).

Fig. 5
figure 5

Comparison of unfavorable cosmesis (fair/poor) rated by patients by different time points and technique using odds ratios with the respective 95% confidence intervals between partial breast radiotherapy and whole breast radiotherapy

QoL assessment

Quality of life was scored by the EORTC QLQ-C30 and QLQ-BR23 questionnaires divided into functional and symptom scales. The pooled QLQ-C30 functional items (global, physical function, role function, emotional function, cognitive function and social function) did not show any differences between PBI and WBI (Appendix Fig. 12). Notably, all comparisons had significant heterogeneity with superior QoL in all items in the assessment of the Florence trial.

The analysis of the QLQ-C30 symptom scales (Fig. 6) did not show a significant improvement in any subscale when analyzing all PBI techniques together. Numerically, all point estimates were superior in the PBI arm. Clinically meaningful differences are reported for the PBI arm in the Florence trial for fatigue, pain and appetite loss.

Fig. 6
figure 6

Comparison of QLQ-C30 scores between partial breast irradiation and whole breast irradiation in different subdomains using weighted mean differences. Lower symptom scores represent better QoL. FA = fatigue, NV = nausea and vomiting, PA = pain, DY = dyspnea, SL = insomnia, AP = appetite loss, CO = constipation, DI = diarrhea, FI = financial difficulties

The effect of PBI versus WBI on breast-specific quality of life was compared using the EORTC QLQ-BR23. The functional scales on body image, sexual functioning, sexual enjoyment and future perspective were not significantly different between PBI and WBI (Appendix Fig. 13). CMDs were present in the PBI arm of the Florence trial in the items body image, sexual enjoyment, and future perspective.

The analysis of the EORTC QLQ-BR23 symptom domains showed that patients receiving PBI reported significantly fewer systemic side effects, breast and arm symptoms (BRST: WMD = − 3.4; 95% CI − 6.5 to 0.3; p = 0.031) (BRBS: WMD = − 6.6; 95% CI − 11.4 to 1.9; p = 0.007); (BRAS: WMD = − 5.9; 95% CI − 8.5 to 3.3; p < 0.001) (Fig. 7). Notably, the pooled differences did not exceed the threshold for CMD.

Fig. 7
figure 7

Comparison of QLQ-BR23 scores between partial breast irradiation and whole breast irradiation in different subdomains using weighted mean differences. Lower symptom scores represent better QoL. BRST = systemic therapy side effects, BRBS = breast symptoms, BRAS = arm symptoms, BRHL = hair loss

Discussion

This meta-analysis compares the side effects of partial and whole breast radiotherapy. While PBI seems to be associated with less acute toxicity and better breast-specific QoL, the effects on late and cosmetic events are similar to whole breast radiation. However, when analyzing the pooled estimates, it is important to consider that the radiation fractionation schedule, influenced the late adverse events as well as the cosmetic breast appearance.

The reduction in treated breast volume led to a noticeable relative decrease in acute skin toxicity by 83%. With an estimation of grade 2+ acute skin toxicity of around 29.5% in WBI and 5.5% in PBI, this difference should be clinically meaningful and might be considered by the treating physicians. However, PBI did not result in a reduction in grade 3 skin toxicity which occurred in less than 4% of patients.

Acute side effects appear to be heavily influenced by the treated volume while PBI reported a lower incidence of acute skin reactions of well as strong tendencies in pneumonitis and any acute side effects. The lack of statistical significance in some of the acute toxicity endpoints might be explained by the conservative comparative model used in this investigation. When the alternatives of fixed effect or random effect models were used, the results showed significant differences between the two treatment modalities. The improvement in acute toxicities was similar in all used radiation techniques and was consistent with the reported prospective and retrospective data as well as the published systematic reviews [41,42,43,44,45,46,47,48]. Therefore, different PBI procedures did not seem to have a relevant impact on acute adverse events as all point estimates favored PBI.

The assessment of late adverse events and cosmetic outcomes showed no overall significant differences. Unfavorable cosmetic results were detected after PBI in about 17% by both medical professionals and patients whereas WBI resulted in impaired cosmesis in 15% and 14% respectively. However, significant heterogeneity in the comparison suggested an association with the radiation technique and the fractionation schedules used. Partial breast treatment with once daily EBRT, BT and IORT might be associated with improved cosmesis. The analyses showed a consistent harmful effect of twice-daily external beam radiotherapy in any late adverse outcome measures and cosmesis as previously hypothesized [49]. Five trials used twice daily radiation schedules [24, 26, 30, 50, 51]. A proposed explanation for this observation is an insufficient normal tissue recovery time between fractions, which was initially anticipated to be less than 6 h. Other authors suggested that an inhomogeneous dose distribution with excessive hotspots could contribute to this finding. However, the published trial protocols did not allow non-standard radiation dose maxima in the target volume. Moreover, other techniques of BT and KV-based IORT also apply inhomogeneous doses and reported favorable cosmesis.

Younger age, larger breast size/surgical deficit, lymph node positivity and higher levels of anxiety/depression have been reported as adverse outcome predictors [52], while previous breast infection/surgical complication, seroma, higher age, smoking status, larger breast volume, greater volume of breast excised, central or inner tumor location, application of a tumor bed boost and use of a conventionally fractionated treatment schedule for whole breast irradiation [53,54,55,56,57] are predictors of poor cosmetic results.

Patients receiving a radiation boost of the tumor bed have higher incidence of tissue fibrosis. The use of a boost doubled the cumulative incidence of moderate or severe fibrosis from 15 to 30% after 20 years of follow-up [55] and it was likely a contributing factor for induration when PBI and WBI were compared. The criteria and frequency of the treatment in the included trials were not standardized, therefore any imbalance in these risk factors could contribute to the obtained results, even though unlikely, due to the randomized distribution of patients to the treatment arms.

Numerous publications have investigated the effects of different radiation schedules on the volume of breast fibrosis and cosmesis [58, 59]. Peterson and colleagues analysed predictors of poor cosmesis in the RAPID trial [53]. Their multivariable model did not demonstrate a significant impact of high dose treatment volume on adverse cosmesis. A detailed analysis from the DBCG-PBI trial that used 40 Gy in 15 fractions in both treatment arms demonstrated that the volume of breast treated with 40 Gy (V40Gy) was closely linked to a diagnosis of breast induration. This observation would support the use of PBI in larger breast sizes. Recently, the DBCG group showed that this correlation is true for women that are older than 65 years (Thomsen et al. ESTRO 2023, Vienna).

A substudy of the TARGIT-A trial focusing on adverse events with the use of IORT, reported better cosmetic results and less acute and late side effects. However, this was a single center outcome analysis within the trial, limiting a widespread applicability of their results. Further, the differences in the experimental arm between patients receiving TARGIT alone and TARGIT with additional whole breast radiation raise concerns regarding the long-term effects of the combined treatment approach [60].

Biologically effective doses (BED) used in PBI arms of the included trials differed significantly. Assuming an alpha/beta ratio of 4 the PBI techniques delivered the following dose ranges: EBRT QD (66.8–75.6 Gy), EBRT BID (62.9–75.6 Gy), IORT (120–131.2 Gy) and BT (62.9–83.7 Gy). The observed changes in the cosmetic outcome between the two EBRT techniques albeit almost no differences in BEDs as well as no relevant differences in the IORT arms despite much higher BEDs given suggest that other factors are mainly contributing to these findings.

The detectable impact of the addition of whole breast radiotherapy compared to endocrine therapy alone on QoL appears to be small. In the PRIME I trial only “breast symptoms” were more pronounced in the radiotherapy arm and resolved after 3 years [11]. This was confirmed by another prospective assessment of QoL during radiotherapy compared to no WBI [61]. Age, socioeconomic status of the patient, administration of chemotherapy or endocrine therapy, BMI and higher baseline anxiety scores are well known factors associated with poor QoL [7, 62,63,64,65,66]. Randomization should help decrease bias related to group allocation. With a threshold of 5 points for clinically meaningful difference, we detected improved QoL only in the Florence study, using EBRT with IMRT and treatment in QD schedules. Other trials and the pooled results did not reach a statistically significant threshold. The analysis was limited by the number of trials and patients included to detect smaller differences. Moreover, pooling of the results might be influenced by the different follow-up times of the trials, as a phenomenon called the “response shift” could influence the QoL scores explained by an adaption of the individual’s QoL assessment [67, 68]. However, other studies have already demonstrated that even a numerical small change in QoL scores could result in a clinically meaningful difference for the individual patient [69].

Our results are consistent with a prospective evaluation of different cohorts receiving IORT, PBI (EBRT QD) and various WBI regimes showed small differences in QoL. Breast symptoms were better after IORT and EBRT compared to WBI, in addition to decreased fatigue, global health status and role functioning over time. These differences were limited to a 2 years-period.

The comparison to other published meta-analyses is difficult because of different data pooling methods, outcome measures and selected endpoints [47, 48, 70, 71]. The scientific evidence for partial breast radiotherapy is also supplemented by numerous prospective single arm trials which also endorsed the same approach as in the trials included in our paper. BID EBRT with total doses of 38 Gy or above seemed to provide excellent/good cosmesis in less than 90% and a substantial rate of induration or fibrosis [59, 72,73,74,75]. When the total dose was reduced to 34–38 Gy, the treatment appeared to be better tolerated [42, 76]. It is possible that a reduction in single fraction and overall dose in the BID EBRT treatment might reduce breast induration and mitigate the observed higher toxicity rates. In the small trial from India included in this analysis, this approach showed early favorable results [51].

The other analyzed PBI techniques, the trials using once daily EBRT trials reported excellent/good cosmesis above 88% [41, 44, 45, 76, 77]. Brachytherapy using “Mammosite” also described good cosmetic results (excellent/good above 90%) [78, 79], whereas results with interstitial brachytherapy caused more diverse results (excellent/good: 68–94%) [43, 46, 80,81,82]. In low-risk cohorts receiving KV-based IORT, the reports are stating satisfactory cosmetic results (excellent/good: 89–97%) [83,84,85,86].

The interpretation of randomized trials comparing PBI and WBI is often difficult due to several factors changed in the PBI arms. Investigators changed not only the treated breast volume, but also the fractionation schedule, number of daily treatments and radiation technique which interferes with the genuine study query. Only two trials used the same technique and fractionation schedule and randomized only to the treated breast volume [35, 36]. Both of them reported favourable point estimates for all evaluated toxicities, patient-reported outcomes, cosmetic results and QoL analyses.

Data from multiple randomized trials suggest that the difference in oncologic endpoints between partial- and whole breast radiation therapy is very limited [21, 22, 48, 71]. This observation strengthens the necessity of an analysis of adverse events as well as quality of life. Comparative research suggests that patients’ priorities when weighing side effects and QoL compared to oncologic cure are similarly important [87]. In addition to equal recurrence and survival outcomes, Shah and colleagues demonstrated that multiple PBI regimes are cost effective, both per cost-effectiveness ratio analysis and cost per quality adjusted life year compared to hypofractionated WBI [88, 89].

Limitations

A limitation of our study is the use of published data rather than individual patient data which would be generally preferable. However, meta-analyses of aggregated patient data have been also shown to provide valuable conclusions [90]. Pooling of different toxicity scales can introduce bias in the analysis [91]. Yet, a good correlation between the LENT-SOMA and the RTOG/EORTC toxicity scales has been reported [92]. The strategy of using the last available time point during follow-up reduced the number of patients, and ensured the detection of possible toxicities. Further, the prevalence of breast hardness, pain, oversensitivity, edema, and skin changes is reduced over follow-up whereas breast shrinkage increased [52]. The use of conventionally fractionated instead of hypofractionated radiation therapy (HFX) in the standard arm of most trials introduces a bias towards PBI, especially regarding skin toxicity. The pooled analysis of the UK START trials, a Cochrane meta-analysis as well as other randomized trials demonstrated reduction in the adverse events as well as improved QoL and cosmetic results with hypofractionated WBI [52, 93,94,95,96,97].

A noticeable strength of our analysis is the use of multiple toxicity endpoints, separated by grading and different follow-up intervals as well as the differentiation by PBI technique. A follow-up period with a median of 8.6 years should be adequate to capture the majority of adverse events.

Conclusion

A reduction of the breast volume treated by adjuvant radiotherapy reduces acute skin toxicity and improves breast symptom-related quality of life. Twice-daily fractionation leads to higher fibrosis and worse cosmesis.