Patient demographics
Demographics of the 30 patients are described in Supplementary Table 2. All patients were premenopausal with ER-positive/HER2-negative tumours. Of these, 88% were progesterone receptor (PgR)-positive and 8 patients had node-positive disease (range 1–2 positive nodes).
Changes in RS, ROR and EP scores during the menstrual cycle
Figure 1a shows the individual changes in the prognostic scores between W1 (low oestrogen and progesterone) and W2 (high oestrogen ± progesterone) for each test. Mean [± standard error of the mean (SEM)] scores were not significantly different between W1 and W2 for RS (26.7 ± 3.5 vs. 26.9 ± 3.9; Wilcoxon p = 0.96), ROR (34.2 ± 3.7 vs. 38.0 ± 3.6; p = 0.27), EP (6.57 ± 0.58 vs. 6.82 ± 0.59; p = 0.57) or EPclin (3.50 ± 0.19 vs. 3.57 ± 0.20; p = 0.57) (Fig. 1a). The individual signature scores in W1 and W2 were strongly correlated, with ROR showing the largest variation (RS: r = 0.93; ROR: r = 0.72; EP: r = 0.85; EPclin: r = 0.82; Supplementary Fig. 1a). The mean (± SEM) absolute difference in scores between W1 and W2, irrespective of direction of change, was 5.2 ± 1.1 for RS, 9.2 ± 2.0 for ROR, 1.18 ± 0.25 for EP and 0.33 ± 0.07 for EPclin.
The change in the corresponding estimates of % risk of recurrence generated from each score is shown in Fig. 1b; again, there was no significant difference between W1 and W2 for RS (mean ± SEM, 17.7 ± 2.5% vs. 17.9 ± 2.7%; p = 0.88), ROR (8.9 ± 1.3% vs. 9.8 ± 1.3%; p = 0.32), EP (15.6 ± 1.9% vs. 16.8 ± 2.3%; p = 0.59) or EPclin (15.1 ± 2.6% vs. 16.4 ± 2.8%; p = 0.55). There was a high degree of correlation between the W1 and W2 % risk estimates for all signatures (RS: r = 0.93; ROR: r = 0.76; EP: r = 0.85; EPclin: r = 0.83) (Supplementary Fig. 1b). The mean (± SEM) absolute difference in % risk estimates between W1 and W2, irrespective of direction of change, was 3.6 ± 0.77% for RS, 2.2 ± 0.47% for ROR, 4.3 ± 0.92% for EP and 4.4 ± 0.93% for EPclin.
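As an illustration of the paired summaries reported above, the following minimal Python sketch computes the mean ± SEM for each window and the mean absolute difference irrespective of direction of change. The function names and the five pairs of values are hypothetical placeholders, not study data, and the sketch is not the analysis code used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample."""
    return mean(values), stdev(values) / sqrt(len(values))


def mean_abs_difference(w1, w2):
    """Mean and SEM of |W2 - W1| across paired samples, ignoring direction."""
    if len(w1) != len(w2):
        raise ValueError("paired samples must have equal length")
    return mean_sem([abs(b - a) for a, b in zip(w1, w2)])


# Hypothetical paired scores for five tumours (illustrative only, not study data).
w1_scores = [10.0, 20.0, 30.0, 40.0, 50.0]
w2_scores = [12.0, 18.0, 33.0, 37.0, 54.0]

w1_mean, w1_sem = mean_sem(w1_scores)
w2_mean, w2_sem = mean_sem(w2_scores)
abs_mean, abs_sem = mean_abs_difference(w1_scores, w2_scores)

print(f"W1 {w1_mean:.1f} +/- {w1_sem:.1f} vs. W2 {w2_mean:.1f} +/- {w2_sem:.1f}")
print(f"mean absolute difference {abs_mean:.2f} +/- {abs_sem:.2f}")
```

Taking the absolute value before averaging is what allows increases and decreases between windows to contribute equally, rather than cancelling as they do in the difference of the means.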
Variation of scores measured in the same window vs. different windows
Measurements of the four signature scores in the same window, one menstrual cycle apart, from eight patients showed no significant changes (Fig. 2). The variation of the scores when they were measured in W1 and W2 compared to those measured in the same window was significantly higher for RS (F test; p = 0.0003) and EP/EPclin (p = 0.029 and 0.019, respectively), but not for ROR (p > 0.05) (Fig. 2a). Variation of the corresponding estimates of % risk of disease recurrence showed the same pattern with significant differences for RS (p = 0.0008) and EP/EPclin (p = 0.0064 and 0.0071, respectively), but again not for ROR (p > 0.05) (Fig. 2b).
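The variance comparison above rests on an F statistic, the ratio of two sample variances. The sketch below assumes that the test compared the variance of the paired differences measured across windows with that of the paired differences measured within the same window; the exact procedure is not stated here, so this is an assumption. The function name and all values are hypothetical, not study data.

```python
from statistics import variance


def variance_ratio(between_window_diffs, same_window_diffs):
    """Return (F, df1, df2), where F = var(between) / var(same).

    Assumption: each input holds paired score differences for one design
    (W1 vs. W2, or same window one cycle apart).
    """
    f_stat = variance(between_window_diffs) / variance(same_window_diffs)
    return f_stat, len(between_window_diffs) - 1, len(same_window_diffs) - 1


# Hypothetical paired differences (illustrative only, not study data).
between = [-6.0, 4.0, -2.0, 8.0, -4.0]
same = [-1.0, 1.0, 0.0, 2.0, -2.0]

f_stat, df1, df2 = variance_ratio(between, same)
print(f"F = {f_stat:.1f} on ({df1}, {df2}) degrees of freedom")
```

A p-value would then be read from the F distribution with the returned degrees of freedom, for example using a statistics package; that step is omitted here to keep the sketch within the standard library.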
Changes in risk categories and intrinsic subtype classifications
Tumour samples were classified into their corresponding risk groups using the published cut-points for each signature [3,4,5]. For RS, ROR, EP and EPclin, 5 (23%), 6 (27%), 3 (14%) and 3 (14%), respectively, of the 22 tumours were assigned to a different risk category in W2 compared to W1 (Fig. 1a and Table 1). The kappa statistic (κ) measuring the agreement between the risk groups in the two windows was 0.66 (95% CI 0.40–0.92) for RS, 0.56 (95% CI 0.27–0.85) for ROR, 0.67 (95% CI 0.34–1.00) for EP and 0.73 (95% CI 0.45–1.00) for EPclin. When measurements were made in the same window, 0, 4 (50%), 3 (37%) and 1 (12%) of the 8 tumours were assigned to a different risk category for RS, ROR, EP and EPclin, respectively. If the reduced cut-points for RS from the TAILORx study (intermediate group 11–25) [9] were used, 6 (27.3%) tumours were classified differently in W2 compared to W1 (κ = 0.54, 95% CI 0.27–0.80) and 4 (50%) tumours were classified differently when measured in the same window.
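For readers unfamiliar with the kappa statistic, the following minimal sketch shows how agreement between two sets of risk categories can be corrected for chance. It assumes unweighted Cohen's kappa; whether a weighted variant was used is not stated here. The function name and the ten pairs of labels are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two paired lists of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of pairs assigned the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk groups for ten tumours (illustrative only, not study data).
w1_groups = ["low", "low", "low", "int", "int", "int", "high", "high", "high", "high"]
w2_groups = ["low", "low", "int", "int", "int", "high", "high", "high", "high", "high"]

kappa = cohen_kappa(w1_groups, w2_groups)
n_changed = sum(a != b for a, b in zip(w1_groups, w2_groups))
print(f"{n_changed} of {len(w1_groups)} reclassified; kappa = {kappa:.2f}")
```

In this hypothetical example 80% of the pairs agree, yet kappa is lower than 0.8 because part of that agreement would be expected by chance given the marginal distribution of risk groups.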
Table 1 Concordance of risk categorisation for paired measurements of (a) RS, (b) ROR, (c) EP and (d) EPclin scores performed in W1 (low oestrogen and progesterone) and W2 (high oestrogen ± progesterone) of the menstrual cycle
The ROR test also provides intrinsic subtype information: 17 (77.3%) tumours were classified as Luminal A, 3 (13.6%) as Luminal B and one each as HER2-enriched (4.5%) and basal-like (4.5%) in W1. Three tumours (13.6%) had a different subtype classification in W2 compared to W1 (Luminal B to Luminal A, HER2-enriched to Luminal B and Luminal A to Luminal B). Two (25%) tumours had a different subtype assigned (both Luminal A to Luminal B) when measured in the same window.
Changes in gene signature component modules and individual genes
Of the individual modules of the RS, the mean ER module score was significantly higher in the window with high oestrogen (W2) (+ 16.6%; p = 0.046), whilst the mean invasion module score trended lower in W2 than W1 (− 10.9%; p = 0.098) with more than a twofold reduction in W2 in some patients (Fig. 3). The change in ER module score was driven by a significant increase in PGR expression between the two windows (+ 81.4%; p = 0.0029) with no change apparent in the other three genes (ESR1, BCL2 and SCUBE2) in the module (Supplementary Fig. 2a). There was a trend for a higher RS proliferation module score (mean + 7.3%; p = 0.13) in W2, even though the score was thresholded in 13 cases in W1 and 10 cases in W2 (Fig. 3). All five of the individual PAGs that make up the RS proliferation module showed an increase in their mean expression in W2 compared to W1 (9.6–44.6%; p = 0.065–0.21) (Supplementary Fig. 2b), but in no case was this statistically significant. Both genes in the RS invasion module (MMP11 and CTSL2) showed lower expression in W2, but this did not reach significance for either of them (Supplementary Fig. 2c). There was no significant change in the HER2 module scores, which were thresholded in 21/22 cases, between the windows (mean + 1.7%; p = 0.25) (Fig. 3).
The ROR proliferation score showed a non-significant trend to be higher in W2 compared to W1 (23.9%, p = 0.092; Supplementary Fig. 3) and there was a very strong correlation between the change in the ROR proliferation score and the change in ROR score between W1 and W2 (r = 0.86, p < 0.0001). Other than PGR (see above), no individual gene in any of the signatures showed a significant change between W1 and W2.
Correlation of RS, ROR and EPclin signature scores
RS, ROR and EPclin scores showed a stronger correlation with each other in W1 (RS vs. ROR: r = 0.69, p = 0.0004; ROR vs. EPclin: r = 0.81, p < 0.0001; RS vs. EPclin: r = 0.75, p < 0.0001) than in W2 (RS vs. ROR: r = 0.52, p = 0.014; ROR vs. EPclin: r = 0.65, p = 0.001; RS vs. EPclin: r = 0.70, p = 0.0003) (Supplementary Fig. 4). In both windows, RS and ROR showed the weakest correlation.
Changes in estimated risk between W1 and W2 with RS did not correlate significantly with the change in estimated risk with each of the other 3 signatures (range r = 0.32–0.41; p = 0.06–0.15). However, the change in estimated risk found in each of the other signatures did correlate significantly between each of those signatures (range r = 0.73–0.98; p ≤ 0.001), such that in most cases tumours showing an increase or decrease in risk with one test also showed an increase or decrease, respectively, with the other tests (Fig. 4).
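The comparison of changes across tests can be illustrated with a short sketch that correlates the W2 minus W1 change in estimated risk from one signature with that from another. It assumes Pearson's correlation coefficient; the type of coefficient is not stated here. The function name and the values are hypothetical, not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must be paired and contain at least two values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical W2 - W1 changes in % risk for five tumours under two tests
# (illustrative only, not study data).
change_test_a = [1.0, -2.0, 3.0, 0.0, -2.0]
change_test_b = [2.0, -1.0, 2.0, 1.0, -4.0]

r = pearson_r(change_test_a, change_test_b)
print(f"r = {r:.2f}")
```

A high positive r for the changes indicates that tumours moving up or down in risk with one test tend to move in the same direction with the other.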