Introduction

Black women suffer 41% higher breast cancer mortality compared to White women [1]. Differences in tumor biology at diagnosis (either due to differential risk factors or differences in detection) may contribute to this underlying disparity [2,3,4,5,6,7,8]. While research and treatment advances have significantly lowered breast cancer mortality rates over the years, declines in mortality among Black women continue to lag behind [1]. Therefore, there is a vital need for novel targets for therapeutic response in diverse breast cancer patients. Survivin is a protein in the inhibitor of apoptosis protein family encoded by the BIRC5 gene, and its mechanisms of action include inhibition of apoptosis, dysregulation of mitosis, cell cycle progression, carcinogenesis, and DNA repair [9].

Survivin is a marker of poor prognosis [10, 11] and is commonly associated with enhanced proliferative index [12], reduced levels of apoptosis [13], resistance to chemotherapy [14,15,16], and increased rate of tumor recurrence [17] across multiple tumor types, including breast cancer. Survivin/BIRC5 is already included as a proliferation marker in two clinically utilized RNA-based prognostic assays in breast cancer, the Oncotype DX assay [18] and the Prosigna assay [19]. Prior studies have shown that high survivin expression is associated with estrogen receptor (ER)-negative [20, 21], high-grade, and lymph node-positive breast tumors [22, 23]. However, most studies investigating survivin have been conducted in smaller cohorts that were predominantly White or that did not report race [10, 11, 20,21,22,23,24], and little is known about survivin in tumors from young and Black breast cancer patients, who are more frequently diagnosed with advanced disease, higher grade, and aggressive molecular subtypes [25, 26]. Currently, there are various methods of targeting survivin therapeutically, including small molecule inhibitors that block the function of survivin, interference with survivin gene expression, or survivin-based immunotherapy [27], making this a promising candidate for addressing disparities in outcomes.

Given that survivin/BIRC5 may be an attractive target for aggressive and resistant malignancies that lack effective therapies, we evaluated RNA expression of BIRC5 according to clinical and demographic variables in a large and diverse study population, the Carolina Breast Cancer Study (CBCS; N = 2174 cases: 1113 Black and 1061 non-Black; 1137 < 50 and 1037 ≥ 50 years of age) and compared these findings to those in the Cancer Genome Atlas (TCGA; N = 1095 cases: 183 Black and 816 non-Black; 295 < 50 and 798 ≥ 50 years of age). We hypothesized that in a diverse patient population, BIRC5 would be associated with aggressive disease and recurrence, suggesting potential value in targeted therapy.

Methods

Study population

The Carolina Breast Cancer Study (CBCS) [28] is a population-based study that utilized rapid case ascertainment with the North Carolina Central Cancer Registry to identify women aged 20–74 years across 44 counties diagnosed with first primary breast cancer. Recruitment occurred in three phases: 1993–1996 (Phase 1), 1996–2001 (Phase 2), and 2008–2013 (Phase 3). Black women and younger women (< 50 years of age) were oversampled using randomized recruitment [28], such that the final study population is 50% Black and 50% < 50 years old at diagnosis. Out of 4806 invasive breast cancer cases enrolled across all phases, 2174 bulk tumor samples were profiled by Nanostring (Phase 1: N = 259; Phase 2: N = 491; Phase 3: N = 1424). Exclusions included samples with depleted tissue (n = 1188, predominantly from CBCS1/2) or samples with low-quality or insufficient RNA (n = 241). This study was approved by the University of North Carolina at Chapel Hill (UNC-CH) School of Medicine Institutional Review Board in accordance with the revised U.S. Common Rule, and participants provided written informed consent.

Demographic and clinical characteristics

Health history and demographic variables were collected by a nurse during in-home interviews. Race was self-reported and categorized as Black and non-Black; > 94.7% of non-Black participants self-reported as White (n = 1005), while < 5.3% self-identified as either multiracial (n = 9, 0.85%), Hispanic (n = 15, 1.41%), American Indian/Eskimo (n = 8, 0.75%), Asian or Pacific Islander (n = 23, 2.17%) or Arab (n ≤ 5, < 1%). Importantly, we interpret race herein under a cells-to-society framework [29, 30] that defines race as a social construct, representing the culmination of biological, social (individual and community-level), and environmental exposures that differ by self-reported race. Tumor size, AJCC stage, estrogen receptor (ER), progesterone receptor (PR), and HER2 receptor status were abstracted from medical records and pathology reports.

Recurrence data were available for CBCS Phase 3 (2008–2013; n = 1424). Recurrence-free survival (RFS) was defined as the time from the date of diagnosis to the first local, regional, or distant breast cancer recurrence, with recurrences verified through medical record review. Recurrence data are complete through October 2019, with a 5-year follow-up completed for all study participants. Among 1424 eligible women, 50 participants were stage IV at diagnosis and were excluded from the recurrence analysis. Among 1374 patients with Stage I-III breast cancer, 159 recurrences were identified within 5 years.

Gene expression data

Normalization, molecular subtyping, and BIRC5

RNA was isolated from bulk tumor tissue using the Qiagen FFPE RNeasy isolation kit (Germantown, MD), assayed using Nanostring nCounter technology (Seattle, Washington), and normalized using Remove Unwanted Variation (RUV) as previously described [31,32,33]. PAM50 molecular subtyping was performed using a research version of the predictor to classify tumors as Luminal A, Luminal B, HER2-Enriched, Basal-like, or Normal-like, and to generate proliferation and risk of recurrence scores (ROR-PT) incorporating tumor size, proliferation and subtype [31, 34].

BIRC5 was considered as both a continuous and categorical variable. For continuous measures of BIRC5, log2-transformed gene expression was utilized in all analyses. Standardized clinical cutpoints do not exist for survivin/BIRC5, and while it is a target of both OncotypeDX [18] and Prosigna [19] multi-gene assays, single gene levels are not established. Therefore, for use as a categorical variable, BIRC5 expression was dichotomized into BIRC5-low and BIRC5-high expression categories using the upper limit of the third expression quartile as a cut point (Log2 3rd quartile cutpoint: CBCS = 7.6; TCGA = 9.4). Differences in the expression of BIRC5 between CBCS and TCGA are likely a result of the different mRNA platforms used in each study (i.e., NanoString in CBCS, RNAseq in TCGA). All tumors were treatment naïve at the time of collection and prior to NanoString assay assessing BIRC5 mRNA expression.
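
As an illustration of the dichotomization described above, the following minimal sketch assigns BIRC5-high and BIRC5-low labels using the third quartile of log2 expression as the cut point. The expression values, the function names, the quantile interpolation method, and the choice of a strict inequality at the cut point are all assumptions made for illustration; they are not taken from the study data or analysis code.

```python
import statistics


def third_quartile_cutpoint(log2_expr):
    """Return the 75th percentile of the log2 expression values."""
    # statistics.quantiles with n=4 returns the three quartile boundaries;
    # index 2 is the boundary between the third and fourth quartiles.
    # The "inclusive" interpolation method is an assumption of this sketch.
    return statistics.quantiles(log2_expr, n=4, method="inclusive")[2]


def dichotomize(log2_expr):
    """Label each tumor BIRC5-high if it lies above the cut point."""
    cut = third_quartile_cutpoint(log2_expr)
    # Using a strict ">" at the cut point is an assumption of this sketch.
    labels = ["BIRC5-high" if x > cut else "BIRC5-low" for x in log2_expr]
    return cut, labels


# Hypothetical log2 expression values (not study data).
example = [5.0, 6.0, 6.5, 7.0, 7.2, 7.6, 8.1, 9.0]
cut, labels = dichotomize(example)
print(cut, labels)
```

Because the cut point is computed within each dataset, platform-specific differences in absolute expression (such as those between NanoString and RNAseq) shift the cut point rather than the proportion of tumors classified as BIRC5-high.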

Statistical analysis

Continuous BIRC5 expression levels were compared across race and clinical tumor characteristics using Welch’s two-sample t-tests. Generalized linear models (glm) with binomial distribution and the identity link function were used to calculate relative frequency differences (RFDs) and 95% confidence intervals (CIs) as the measure of association between BIRC5 expression categories and covariates of interest in CBCS. RFDs are defined as the percentage difference between index and referent groups, namely, the relative frequency of BIRC5-high tumors across demographic and clinical variables. With smaller sample sizes, RFDs could not be computed for TCGA because several models failed to converge. Thus, to measure the strength of association between BIRC5-high and covariates of interest in both CBCS and TCGA, multivariable logistic regression was used to calculate odds ratios (ORs) and 95% CIs. Multivariable models were adjusted for age and race according to the CBCS randomized recruitment design in reduced models, and additionally adjusted for ER status and tumor stage in full models. In models comparing age or race, age comparisons were only adjusted for race, and race comparisons were only adjusted for age. Similarly, in models additionally adjusting for ER status and stage, ER comparisons were only adjusted for stage, and stage comparisons were only adjusted for ER status. Multivariable analyses relied on complete case analysis as rates of missingness were < 1.3% for all covariates. Normal-like tumors were excluded from analyses because this subtype arises from insufficient tumor cellularity [31].
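
To make the two measures of association concrete, the sketch below computes an unadjusted RFD and an unadjusted OR from a single 2 × 2 table. For one binary covariate with no adjustment, a binomial model with the identity link reduces to the difference in proportions, and a logistic model reduces to the cross-product odds ratio. The adjusted models described above require model fitting and are not reproduced here. The counts, function names, and the use of Wald-type 95% CIs are assumptions for illustration only.

```python
import math


def relative_frequency_difference(high_index, n_index, high_ref, n_ref, z=1.96):
    """Difference in proportions of BIRC5-high (index minus referent), Wald CI."""
    p1 = high_index / n_index
    p0 = high_ref / n_ref
    rfd = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n_index + p0 * (1 - p0) / n_ref)
    return rfd, rfd - z * se, rfd + z * se


def odds_ratio(high_index, n_index, high_ref, n_ref, z=1.96):
    """Cross-product odds ratio of BIRC5-high with a Wald CI on the log scale."""
    a, b = high_index, n_index - high_index  # index group: high, low
    c, d = high_ref, n_ref - high_ref        # referent group: high, low
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts (not study data): 30 of 100 BIRC5-high in the index
# group versus 20 of 100 in the referent group.
rfd, rfd_lo, rfd_hi = relative_frequency_difference(30, 100, 20, 100)
or_est, or_lo, or_hi = odds_ratio(30, 100, 20, 100)
print(f"RFD = {100 * rfd:.1f}% ({100 * rfd_lo:.1f}, {100 * rfd_hi:.1f})")
print(f"OR = {or_est:.2f} ({or_lo:.2f}, {or_hi:.2f})")
```

The RFD is an absolute measure on the percentage scale, whereas the OR is a relative measure, so the two can differ in apparent magnitude even when computed from the same table.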

Kaplan–Meier curves and log-rank tests were used to compare time to recurrence across BIRC5 categories in stage I-III cases (n = 1374). Recurrence analyses were stratified according to clinical breast cancer subtypes (i.e., ER-positive/Her2-negative and TNBC) and were also performed across all tumor subtypes combined. Hazard ratios (HR) and 95% CI were calculated using crude and multivariable Cox proportional hazard models adjusted for patient age and tumor stage. The Wald p-value was used to assess the assumption of proportionality. While there was evidence of non-proportional hazards, point estimates did not differ substantially between models. All statistical analyses were performed in R version 4.0.3.
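
For readers less familiar with the Kaplan–Meier method, the following minimal sketch shows the product-limit calculation that underlies a recurrence-free survival curve. It is written in Python for illustration only (the analyses above were performed in R), and the follow-up times, event indicators, and function name are hypothetical. Log-rank tests and Cox models are not reproduced here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of recurrence-free survival.

    times  : follow-up time for each participant
    events : 1 if a recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0  # recurrences observed at time t
        leaving = 0      # participants leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            recurrences += data[i][1]
            leaving += 1
            i += 1
        if recurrences > 0:
            # Participants censored at t are still counted as at risk at t.
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in years (not study data).
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored participants contribute to the risk set up to their last follow-up time but never reduce the survival estimate, which is why the curve steps down only at observed recurrence times.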

Data availability

RNA sequencing and clinical data from the TCGA breast cancer dataset, including 1095 primary tumors, were used to compare and validate BIRC5 relationships identified in CBCS. These data are publicly available under dbGaP accession phs000178.v1.p1, with additional data available at https://gdc.cancer.gov/about-data/publications/PanCan-CellOfOrigin [35]. CBCS data are available upon request (https://unclineberger.org/cbcs).

Results

BIRC5 expression, patient and tumor characteristics

The distribution of clinical and demographic characteristics in CBCS reflects its population-based sampling schema, with higher proportions of Black participants, higher proportions of participants < 50 years of age, and higher proportions of ER-negative, Basal-like, and Stage I cases compared to TCGA (Table 1). However, in both TCGA and CBCS, BIRC5-high tumors were more common among Luminal B (LumB), Her2-enriched, Basal-like, ROR-PT-high and ER-negative tumor subtypes, as well as higher-stage tumors, and were more frequent among cases from Black women and younger women (< 50 years of age). BIRC5 is one of the genes used in the PAM50 subtype predictor, so we also performed a sensitivity analysis excluding BIRC5 from the algorithm and found that the distribution of BIRC5-high tumors remained very similar across PAM50 subtypes (Additional file 1: Table S1). Figure 1 shows that in univariate analyses among both CBCS and TCGA, continuous BIRC5 expression differs by race, even after stratification by tumor stage (I, II, III/IV; Fig. 1A) and ER status (positive or negative; Fig. 1B).

Table 1 Characteristics of the study population
Fig. 1

BIRC5 Expression by Stage and Estrogen Receptor Status in Black and non-Black Patients in CBCS and TCGA. Boxplots displaying continuous log2 BIRC5 mRNA expression among Black and non-Black breast cancer patients in CBCS (upper panels) and TCGA (lower panels) stratified by (A) tumor stage and (B) estrogen receptor status. Welch’s two-sample t-test p values are listed within each plot. ER: estrogen receptor

We next evaluated associations between categories of BIRC5 expression (i.e., tumors classified as BIRC5-high vs. BIRC5-low, defined as the upper quartile of RNA expression vs. all other quartiles) across the full CBCS and TCGA study populations. In both CBCS and TCGA, similar associations were observed for age at diagnosis, race, ER/PR/HER2 status, PAM50 subtype, tumor stage, and tumor size (Fig. 2, Table 2). To characterize these associations, we estimated relative frequency differences, defined as the difference between the proportions of participants with BIRC5-high tumors in each index group compared to the referent category. In the CBCS, BIRC5-high tumors were 12.1% more frequent among younger participants (< 50 years of age) and 11.7% more frequent among tumors from Black participants. In addition, BIRC5-high displayed strong relationships with aggressive tumor characteristics, with higher frequency among hormone receptor (HR)-negative tumors (BIRC5-high RFD for ER-negative: 27.3%, PR-negative: 21.1%) and aggressive PAM50 subtypes (BIRC5-high RFD for LumB: 33.0%, HER2-Enriched: 28.4%, and Basal-like: 49.8%). After additional adjustment for tumor characteristics (e.g., ER status and tumor stage), BIRC5-high remained significantly associated with young age, Black race, ER status, and tumor size (Fig. 2, left panel; Table 2). We also observed that Stage II tumors had a higher frequency of BIRC5-high (compared to Stage I), although similar associations with Stage III/IV tumors were attenuated after additional adjustment. We performed a sensitivity analysis excluding non-Black participants who did not self-report White race [N = 56 (2.6%)], and the magnitude of the associations in Table 2 was unchanged.

Fig. 2

Association between BIRC5-high, patient and tumor characteristics in CBCS and TCGA. Forest plot displaying relative frequency differences and 95% confidence intervals (left panel) and odds ratios (center and right panels) for patient age, race, estrogen receptor/progesterone receptor/Her2 receptor status, PAM50 subtype, stage and tumor size across BIRC5 expression categories in CBCS (left and center panels, red circles) and TCGA (right panel, blue circles). Reduced models were adjusted for age and race according to the CBCS randomized sampling design (solid circles), and additionally adjusted for tumor stage and estrogen receptor status in full models (open circles). RFD: relative frequency difference; OR: odds ratio; 95% CI: 95% confidence interval; ER: estrogen receptor; PR: progesterone receptor; Ref.: Referent; BIRC5 referent group = BIRC5-low for all models. Dashed line represents the null value for each model

Table 2 Associations between BIRC5-high, clinical, and demographic variables in the Carolina Breast Cancer Study and The Cancer Genome Atlas

We also present odds ratios from multivariable logistic regression models, which converge better with the smaller cell sizes present in TCGA. Figure 2 displays odds ratios for the association between BIRC5-high, patient, and tumor characteristics in CBCS (center panel) and TCGA (right panel); these mirrored the relationships observed with RFDs in CBCS. In both studies, younger participants and Black participants had higher odds of BIRC5-high, as did tumors with HR-negative status, aggressive PAM50 subtypes, advanced stage, and larger size. These relationships, including associations with young age and Black race, remained significant after additional adjustment for tumor characteristics, but associations with Stage III/IV were attenuated. No relationship was observed between BIRC5 and clinical Her2 status in either CBCS or TCGA, which may be due to the low proportion of Her2-positive cases in each dataset (Her2-positive cases: CBCS n = 329; TCGA n = 164). Thus, BIRC5 is highly correlated with aggressive tumor features. Further, age and race seem to be important factors contributing to BIRC5 levels, even after adjusting for tumor characteristics.

Prognostic utility of BIRC5

We hypothesized that BIRC5 is associated with early (5-year) recurrence. The CBCS Phase 3 identified 159 recurrences during the first 5 years of follow-up. Among all tumors, BIRC5-high tumors had higher recurrence in univariate models, but not in multivariable models [Crude HR (95% CI): 1.68 (1.20, 2.37); Adjusted HR (95% CI): 1.41 (0.99, 2.0)] (Fig. 3A). However, in stratified analyses, BIRC5-high was significantly associated with recurrence only among ER-positive/Her2-negative tumors [Crude HR (95% CI): 2.73 (1.61, 4.63); Adjusted HR (95% CI): 1.94 (1.11, 3.36)] (Fig. 3B). No significant associations with recurrence were observed among TNBC cases [Crude HR (95% CI): 0.7 (0.39, 1.24); Adjusted HR (95% CI): 0.68 (0.38, 1.22)] (Fig. 3C). In a sensitivity analysis, we additionally adjusted recurrence models for race and found that BIRC5-high remained significantly associated with recurrence only among ER-positive/Her2-negative tumors, although hazard ratios were slightly attenuated [Adjusted HR (95% CI), All tumors: 1.34 (0.94, 1.91); ER-positive/Her2-negative: 1.91 (1.10, 3.32); TNBC: 0.67 (0.37, 1.21)]. However, adjustment for race, herein considered a social construct, is difficult to interpret due to the differential distribution of multiple biological, treatment, and health care access factors. We also performed sensitivity analyses restricting to participants who were treated with chemotherapy (ER-positive/Her2-negative: 54.5%; TNBC: 94.1%) or restricting to those who initiated endocrine therapy (ER-positive/Her2-negative: 90.3%), and the magnitude of the HRs was not substantially altered.

Fig. 3

Five-year recurrence-free survival (RFS) by BIRC5 expression status in CBCS. Kaplan–Meier survival analysis illustrating 5-year RFS (A) among all CBCS Phase 3 cases, (B) among ER-positive/Her2-negative tumors only and (C) among triple-negative tumors only. Cox proportional hazard ratios and 95% confidence intervals adjusted for patient age and tumor stage are displayed within each plot for BIRC5-high relative to BIRC5-low tumors. All analyses were restricted to stage I-III tumors. Tick marks represent censored individuals. Shaded regions represent 95% confidence intervals for each group. ER: estrogen receptor; TNBC: triple-negative breast cancer; HR: hazard ratio; 95% CI: 95% confidence interval. Referent group = BIRC5-low for CoxPH models.

Discussion

In this analysis, BIRC5/survivin was investigated as a biomarker in two large studies representing 3269 patients with breast cancer, including TCGA and the CBCS, a large and diverse population-based study enriched for Black and younger patients. In both studies, BIRC5 was associated with high-risk populations, including participants with aggressive tumor subtypes, advanced stage and larger tumors. Young women and Black women also had higher frequencies of BIRC5-high tumors. These differences persisted after adjustment for ER status and tumor stage, suggesting that BIRC5 associations are not driven exclusively by subtype and stage and may reflect additional biological, genetic or environmental influences. Higher BIRC5 was also prognostic for early recurrence among ER-positive participants in the CBCS, which is important given that the disparities in recurrence between Black and White women are greatest among ER-positive breast cancer [25, 36,37,38,39,40].

The results of our study align with prior work demonstrating an association between high survivin expression and aggressive breast tumor features including hormone receptor negativity, higher stage, larger size, and non-Luminal A subtype [20,21,22,23, 40], all of which remained significantly associated, independent of estrogen receptor status. BIRC5/survivin expression has also previously been reported as an independent marker of poor prognosis in breast cancer [24, 41]; however, the findings of the current analysis extend those prior investigations to a large and diverse patient population. The prognostic relationship with poorer RFS persists in this study. Given our finding of higher BIRC5/survivin and a previous study showing increased survivin phosphorylation in tumors from Black women [42], the burden of BIRC5-high may be particularly important for Black patients.

Our findings also shed light on previously reported BIRC5 associations with breast cancer clinical outcomes [24, 41, 43, 44], which have seldom been stratified by clinical subtype. While BIRC5-high was more prevalent among TNBC tumors, BIRC5 had the strongest prognostic value among ER-positive/Her2-negative disease. This was also seen in the METABRIC cohort presented by Oparina et al. [40], where BIRC5 was only prognostic in the ER-positive group and not the ER-negative group. In contrast, Zhang et al. [43] showed that survivin predicted survival in 136 TNBC patients. These inconsistent findings across studies highlight that variables mediating BIRC5/survivin responses remain poorly understood. One hypothesis is that in TNBC, a truly distinct disease with its own set of hallmark mutations [35], levels of genomic instability, and underlying tumor immune microenvironment, BIRC5/survivin has a distinct relationship with survival. Elucidating mediating events will be essential to understanding the therapeutic prospects of anti-survivin therapies. Based on our current results, BIRC5-targeted therapies may be valuable, especially for patients with ER-positive tumors, the subtype with the largest Black-White outcomes disparity [25, 36,37,38,39,40].

There is high feasibility of translating anti-survivin therapy to breast cancer, as survivin has already been pursued as a cancer therapeutic target by various strategies [27, 45,46,47], and is already measured in the clinic on the validated prognostic assays, Prosigna [19] and Oncotype DX [18]. A strength of our study was the ability to exclude BIRC5 from the PAM50 algorithm (the research version of the Prosigna assay) to independently assess BIRC5/survivin as a high-risk biomarker in breast cancer and its relationship with tumor subtype. Another strength was the use of a large, diverse population-based cohort that represents the natural distribution of breast cancer in the population, and for which RNA expression profiling was optimized for FFPE specimens. However, our analysis also had limitations. While we observed differences in BIRC5 expression by race, we are unable to evaluate the differential effects of BIRC5 in the context of the social construct of race. Our targeted approach also does not allow for the investigation of survivin splice variants, which have been suggested to differ in function and in prognostic significance [48, 49]. Further studies investigating the role of different survivin splice variants in diverse populations may be necessary for therapeutic stratification. Another limitation was the low number of HER2-positive tumors in our dataset, which did not allow for assessment of BIRC5-associated recurrence among HER2-positive cases. Future studies should also consider longer follow-up times and detailed chemotherapy data to further disentangle the relationship between race, age, tumor subtype and survivin. Our results fill a research gap in understanding the potential role of survivin in breast cancer disparities, and possibly provide future insight into treatment strategies for a cohort of women with unmet clinical needs. Further studies are needed to help close this gap, which constitutes the largest disparity among cancer-specific diseases.