Diagnostic and predictive accuracy of anti-mullerian hormone for ovarian function after chemotherapy in premenopausal women with early breast cancer

Purpose Accurate diagnosis and prediction of loss of ovarian function after chemotherapy for premenopausal women with early breast cancer (eBC) is important for future fertility and clinical decisions regarding the need for subsequent adjuvant ovarian suppression. We have investigated the value of anti-mullerian hormone (AMH) as a serum biomarker for this. Methods AMH was measured in serial blood samples from 206 premenopausal women aged 40–45 years with eBC, before and at intervals after chemotherapy. The diagnostic accuracy of AMH for loss of ovarian function at 30 months after chemotherapy and the predictive value for that of AMH measurement at 6 months were analysed. Results Undetectable AMH showed a high diagnostic accuracy for absent ovarian function at 30 months with AUROC 0.89 (95% CI 0.84–0.94, P < 0.0001). PPV of undetectable AMH at 6 months for a menopausal estradiol level at 30 months was 0.77. In multivariate analysis age, pre-treatment AMH and FSH, and taxane treatment were significant predictors, and combined with AMH at 6 months, gave AUROC of 0.90 (95% CI 0.86–0.94), with PPV 0.79 for loss of ovarian function at 30 months. Validation by random forest models with 30% data retained gave similar results. Conclusions AMH is a reliable diagnostic test for lack of ovarian function after chemotherapy in women aged 40–45 with eBC. Early analysis of AMH after chemotherapy allows identification of women who will not recover ovarian function with good accuracy. These analyses will help inform treatment decisions regarding adjuvant endocrine therapy in women who were premenopausal before starting chemotherapy.


Introduction
The treatment of early breast cancer (eBC) frequently includes multi-agent chemotherapy; adjuvant endocrine therapy is also widely used for hormone receptor-positive (HR+) tumours to suppress the effect of remaining estrogen production and reduce the risk of relapse [1]. Extensive research has demonstrated the superiority of aromatase inhibitors (AIs) over tamoxifen as adjuvant endocrine therapy for postmenopausal eBC [2]. While AIs alone are ineffective in premenopausal women [3,4], when co-administered with a gonadotropin-releasing hormone agonist (GnRHa), they achieve therapeutically adequate suppression of serum estrogen levels [5]. The superiority of combining an AI with ovarian suppression as adjuvant endocrine therapy in premenopausal women, as compared to tamoxifen-based treatment, has recently been demonstrated in the SOFT and TEXT trials [6] and in the HOBOE study [7].
Many women with eBC become amenorrhoeic after chemotherapy, the proportion increasing with age [8,9]. As some show variable recovery, which may take 2 years or occasionally longer [10,11], the diagnosis of a permanent menopausal state is often difficult. However, many women will have permanent loss of ovarian function during or shortly after chemotherapy, and accurate early identification of these women might allow optimization and simplification of the choice of adjuvant endocrine therapy [12].
The measurement of anti-Müllerian hormone (AMH) has become established as the most reliable biomarker of the number of small growing follicles in the ovary, which indirectly reflects the number of primordial follicles, i.e. the ovarian reserve [13]. AMH levels decline to undetectable at the time of the menopause [14,15]. A substantial body of evidence has demonstrated that AMH levels fall in women during chemotherapy, with variable recovery depending on the treatment regimen [16][17][18][19][20], pre-treatment AMH levels and younger age [21][22][23][24][25], and possibly BMI [26]. Post-chemotherapy AMH measurement is also predictive of ovarian function recovery [27][28][29]: if a woman with eBC has a very low or undetectable AMH level after chemotherapy, there is high confidence that she is indeed permanently menopausal [28]. Assessment shortly after completion of chemotherapy would aid clinical management; measurement of AMH shortly after completion of chemotherapy showed good prediction of women who would have ovarian failure at 24 months after diagnosis [27].
In this study, we investigated whether AMH measurement is a reliable method of identifying whether or not there is residual ovarian function following completion of chemotherapy in women aged 40 and over with eBC. This would potentially allow avoidance of unnecessary administration of GnRHa treatment as adjuvant endocrine therapy, with significant benefits in cost savings and in convenience to the patient.

Patients
This study was conducted within a cohort of consecutive patients with eBC diagnosed between 40 and 45 years of age who underwent (neo)adjuvant chemotherapy between January 2008 and December 2016 at the Henri Becquerel Cancer Center (Rouen, France). Of a total of 494 patients of appropriate age during that period, only patients with available stored blood samples before and at 6, 18 and 30 months after chemotherapy were included, and hormone assays were then performed. Chemotherapy was based on epirubicin, cyclophosphamide +/− a taxane (docetaxel in the great majority). Adjuvant endocrine therapy consisted of tamoxifen exclusively, with no exposure to AIs or GnRH agonists.
All patients gave written informed consent allowing the conservation and study of their biological samples. The present study was approved by the Institutional Scientific and Ethics Committees of Henri Becquerel Centre (registering order N°1917B).

Statistical methods
Data are presented as median and 95% confidence interval (CI). Changes in hormone concentrations over time were analysed by repeated-measures ANOVA with Bonferroni correction for multiple comparisons. Receiver operating characteristic (ROC) curve analyses were performed, reporting area under the curve (AUROC). Univariate analysis investigated simple relationships between detectable and undetectable AMH as a binary category and later ovarian function as a binary category defined by a threshold level.
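As an illustration of the ROC analysis described above, the AUROC for a two-group comparison can be computed as the Mann-Whitney probability that a randomly chosen value from one group exceeds a randomly chosen value from the other, with ties counted as one half. The sketch below is not the software used in this study; the function name `auroc` and the estradiol values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def auroc(higher_group: Sequence[float], lower_group: Sequence[float]) -> float:
    """AUROC as P(higher > lower) + 0.5 * P(higher == lower) over all pairs."""
    if not higher_group or not lower_group:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for h in higher_group:
        for l in lower_group:
            if h > l:
                wins += 1.0   # pair ordered as expected
            elif h == l:
                wins += 0.5   # tie counts as half
    return wins / (len(higher_group) * len(lower_group))


# Hypothetical estradiol values (pmol/L), NOT study data.
estradiol_amh_detectable = [310.0, 75.0, 640.0, 130.0, 48.0]
estradiol_amh_undetectable = [52.0, 38.0, 98.0, 30.0, 160.0]

# 19 of the 25 possible pairs are ordered as expected, giving 0.76.
print(auroc(estradiol_amh_detectable, estradiol_amh_undetectable))
```

An AUROC of 0.5 corresponds to no discrimination between the groups and 1.0 to complete separation.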
Multivariate analysis was also performed to assess the predictive performance of baseline and treatment characteristics (endocrine and non-endocrine) and post-chemotherapy endocrine factors in terms of later ovarian function. Multivariate analysis was performed in three stages. First, individual variables were assessed for prediction of undetectable AMH at 30 months post-treatment. Second, suitable candidate variables from the first stage were used in multivariate linear regression models (PRISM version 9, GraphPad Software LLC, San Diego USA) to provide estimates of AUROC, PPV and NPV. Third, to guard against the multivariate linear regression models over- or underfitting the data (i.e. supplying estimates that are unlikely to generalize to new data instances), a full machine learning workflow was performed using scikit-learn version 0.24 within Python version 3.9.2. The workflow stages were: shuffling and splitting data into 70% train and 30% validation subsets; fivefold cross-validated grid search of 420 options for optimal hyperparameters for the random forest algorithm applied to the test data; cross-validated application of the optimal model on the training data; application of the model to the validation data subset that mimics new data instances. A linear regression model was considered validated in terms of clinical utility if (a) the cross-validated test performance of the random forest model for the test data was close to the validation performance (i.e. the model is neither over- nor underfitting the data), and (b) the validation AUROC was similar to the estimate found by linear regression.
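The data partitioning in this workflow can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the cohort size of 200, the seed, and the helper names `shuffle_split` and `k_folds` are hypothetical, the split is not stratified, and the model fitting itself is omitted (in scikit-learn, the corresponding tools are `train_test_split`, `GridSearchCV` and `RandomForestClassifier`; the hyperparameter grid of 420 options is not specified in the text).

```python
import random
from typing import List, Tuple


def shuffle_split(n_samples: int, validation_fraction: float = 0.30,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices, then hold back a validation subset."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_validation = round(n_samples * validation_fraction)
    # First n_validation shuffled indices are retained for validation.
    return indices[n_validation:], indices[:n_validation]


def k_folds(indices: List[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Partition indices into k folds and return (fit, held_out) pairs."""
    folds = [indices[i::k] for i in range(k)]
    pairs = []
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((fit, held_out))
    return pairs


# Hypothetical cohort of 200 samples: 140 for training, 60 for validation.
train_idx, validation_idx = shuffle_split(200)

# Fivefold cross-validation uses only the training subset; the validation
# subset is never seen during hyperparameter search.
cv_pairs = k_folds(train_idx, k=5)
print(len(train_idx), len(validation_idx), len(cv_pairs))
```

Keeping the validation subset out of the grid search is what allows it to mimic new data instances in the final stage.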

Results
Samples from a total of 206 women were analysed, with complete sample sets in 197 women. Most (76%) patients had HR+ disease and received tamoxifen; 48 patients had HR− tumours and did not receive adjuvant endocrine treatment. Chemotherapy regimens were based on 6 cycles of cyclophosphamide with an anthracycline, and the addition of a taxane in 84%; 22% received anti-HER2 targeted therapy. Six patients received 8 cycles of chemotherapy in the context of inflammatory breast carcinoma. Among the 173 patients exposed to a taxane, all but 3 received docetaxel. Patient characteristics, tumour and treatment details are described in Table 1.
The value of AMH as an index and as a predictor of absent ovarian function after recovery from chemotherapy was explored. At 30 months, women with undetectable AMH at that time (n = 119) had median estradiol of 50 pmol/L (IQR 34-68), whereas it was 313 pmol/L (IQR …). For prediction of later ovarian activity, women with undetectable AMH at 6 months (n = 137) had median estradiol levels at 30 months of 56 pmol/L (IQR 40-104), vs 258 pmol/L (IQR 69-780) (P < 0.0001) in women with detectable AMH at 6 months (n = 62) (Fig. 2a). AUROC for estradiol at 30 months by undetectable AMH at 6 months was 0.75 (95% CI 0.67-0.82, P < 0.0001; Fig. 2c), with sensitivity 19.7% and specificity 95.1% at an estradiol concentration of 34.4 pmol/L, at which the likelihood ratio peaked at 4.0. The positive predictive value of undetectable AMH at 6 months for a menopausal estradiol level (< 110 pmol/L [30]) at 30 months was 0.77. Supporting this, AMH at 6 months for prediction of undetectable AMH at 30 months was explored. AUROC was 0.76 (95% CI 0.68-0.83, P < 0.0001), with PPV of undetectable AMH at 6 months for undetectable AMH at 30 months of 0.78.
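The likelihood ratio quoted for the estradiol cut-off can be checked from the reported sensitivity and specificity, assuming it is the positive likelihood ratio (the text does not state which form was used):

```latex
\[
\mathrm{LR}^{+}
  = \frac{\text{sensitivity}}{1 - \text{specificity}}
  = \frac{0.197}{1 - 0.951}
  = \frac{0.197}{0.049}
  \approx 4.0
\]
```

Under that assumption the arithmetic agrees with the reported peak value of 4.0.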
As both FSH and estradiol may be impacted by tamoxifen treatment, data were additionally analysed separately in the 48 women not taking tamoxifen. AMH was undetectable in 29 (60%) of these women at 6 months, and also in 29 (60%) women at 30 months. At 30 months, median estradiol concentrations of women grouped by detectable vs undetectable AMH levels at both 30 and 6 months (Fig. 2d) were similar to those groups in the whole cohort of women (Fig. 2a).
FSH is an established diagnostic test for premature ovarian insufficiency (POI), thus analyses were performed in women not taking tamoxifen for AMH as a predictor of FSH > 25 IU/L. At 30 months, median FSH in women with undetectable AMH at that time point was 87.8 IU/L (IQR 67.7-126.9) vs 12.4 IU/L (8.6-25.2) (P < 0.0001) in those with detectable AMH. Analysis by AMH at 6 months gave comparable results (Fig. 2g), with median FSH at 30 months of 69.4 IU/L (42.5-108.8) vs 12.2 IU/L (8.6-23.9). The diagnostic value was assessed by ROC analysis for AMH at 30 months, showing AUROC 0.98 (0.96-1.00), and for prediction by AMH at 6 months, AUROC was 0.86 (0.72-0.99) (both P < 0.0001; Fig. 2h and i) with peak likelihood ratio of 7.9 at FSH 27.7 IU/L. An undetectable AMH at 6 months had a PPV for FSH > 25 IU/L at 30 months of 0.93, indicating a very high predictive value for long-term POI after chemotherapy.

Multivariate analyses
The variables age, pre-treatment AMH and FSH, and taxane treatment were found to be significant predictors of AMH at 30 months; BMI and pre-treatment estradiol were not (Table 2). The significant predictors were then combined with AMH at 6 months for prediction of AMH at 30 months (Table 2). This gave AUROC of 0.90 (95% CI 0.86-0.94), with PPV 0.79 and NPV 0.79 (Fig. 3). Using estradiol at 30 months of < 110 pmol/L as the outcome, the same variables gave AUROC of 0.82 (0.76-0.90), PPV 0.68 and NPV 0.76 (Fig. 3). Two additional analyses were performed to assess prediction if pre-treatment hormone data were not available, and from pre-treatment variables (including taxane treatment) only. In the absence of pre-treatment hormone data, age, taxane treatment and AMH at 6 months gave AUROC 0.71 (0.63-0.79) with PPV 0.78 and NPV 0.77. Conversely, age, taxane treatment and pre-treatment hormone variables gave AUROC 0.88 (0.83-0.92), PPV 0.77 and NPV 0.79.
Fig. 2 … results (a, d, g), results of diagnostic testing at 30 months (b, e, h) and predictive testing of 30 months by data at 6 months (c, f, i). a Estradiol levels at 30 months by AMH at 6 and 30 months, divided into AMH undetectable (−) vs AMH detectable (+), with ROC curves for diagnostic analysis by AMH at 30 months (b), and prediction by AMH at 6 months (c). d In women not treated with tamoxifen: Estradiol levels at 30 months by AMH at 6 and 30 months, divided into AMH undetectable (−) vs AMH detectable (+), with ROC curves for diagnostic analysis by AMH at 30 months (e), and predictive analysis by AMH at 6 months (f). g In women not treated with tamoxifen: FSH levels at 30 months by AMH at 6 and 30 months, divided into AMH undetectable (−) vs AMH detectable (+), with ROC curves for diagnostic analysis by AMH at 30 months (h), and predictive analysis by AMH at 6 months (i).
The linear regression models were validated by random forest models with data retained for validation purposes, with random forest AUROC within the 95% CI for the AUROC reported for the logistic regression model (Table 2). Without pre-treatment hormone data, the random forest AUROC was significantly higher at 0.85 compared to 0.71, indicating that the linear regression model is underfitting the data. For the other analyses, the cross-validated test accuracy of each optimal random forest model was within 4.6 percentage points of the validation accuracy.
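The comparison between the random forest AUROC and the regression confidence interval can be expressed as a simple check. The sketch below is illustrative only: the function name `compare_auroc` and the wording of the returned labels are hypothetical, and the only figures used are those reported above for the analysis without pre-treatment hormone data.

```python
from typing import Tuple


def compare_auroc(rf_auroc: float, regression_ci: Tuple[float, float]) -> str:
    """Locate a random forest AUROC relative to the regression model's 95% CI."""
    low, high = regression_ci
    if rf_auroc > high:
        # Random forest does better than the regression CI allows for.
        return "above CI: regression may be underfitting"
    if rf_auroc < low:
        return "below CI: regression estimate may be optimistic"
    return "within CI: estimates agree"


# Reported values without pre-treatment hormone data:
# random forest AUROC 0.85; regression AUROC 0.71 (95% CI 0.63-0.79).
print(compare_auroc(0.85, (0.63, 0.79)))
```

With these values the random forest AUROC lies above the interval, which matches the interpretation of underfitting given in the text.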

Discussion
Assessment of ovarian function after chemotherapy is critical for women with breast cancer where decisions about appropriate endocrine treatment are required [12]. Moreover, many women also want to know whether a later pregnancy might be possible. There is increasing evidence for the value of AIs in women who are premenopausal at the time of diagnosis [6,7], but if there is ovarian activity after chemotherapy, concomitant ovarian suppression with a GnRH agonist is necessary to ensure adequate suppression of estradiol levels. There is, however, uncertainty as to the degree of suppression of estradiol levels that is required and the accuracy of immunoassays at these low concentrations [31], indicating a need for improved biomarkers of ovarian function.
In these analyses we have explored the potential accuracy of AMH as a biomarker of ovarian activity after chemotherapy for eBC, as a diagnostic test at 30 months after completion of chemotherapy (thus allowing for any recovery) and a predictive test at 6 months after chemotherapy. Undetectable AMH at 30 months accurately distinguished women with low estradiol levels, indicating that AMH is a clinically useful index of ovarian function in this context. The best estradiol concentration cut-off distinguishing women with and without detectable AMH levels was 34 pmol/L, similar to the upper limit in postmenopausal women using mass spectrometry [32]. While accurate diagnosis of absent ovarian function after allowing for potential recovery is of value, it would be of yet greater clinical value to be able to predict post-treatment menopausal status at the end of chemotherapy. At 6 months after chemotherapy, thus at a clinically relevant time point to decide on whether ovarian suppression might be necessary [6,33], AMH levels were undetectable in 70% of the population. This had clear value in predicting later ovarian function, by estradiol levels or AMH at 30 months. Thus, women aged over 40 treated for eBC with anthracycline- and taxane-based chemotherapy regimens who have an undetectable AMH level at 6 months, using a highly sensitive assay, are very likely to show permanent loss of ovarian function, and ovarian suppression may not be required. This supports a previous analysis of a smaller group of women with eBC (n = 32), where undetectable AMH at the end of chemotherapy accurately predicted lack of recovery of ovarian function in women aged over 40, but not younger women [27].
However, some women did show a degree of recovery of ovarian function, mostly within 18 months of chemotherapy. This late recovery has been demonstrated previously [10,11], and while more likely in younger women, the present analysis documents its prevalence in women aged 40-45 years at approximately 11% of the population studied. While the recovery in AMH levels was small, estradiol levels in some women were high, reflecting the effect of tamoxifen treatment inducing multifollicular ovarian activity.
While cut-off levels of estradiol for diagnosis of menopausal status are debated [31], there is consensus that the biochemical diagnosis of menopause or POI should be based on FSH levels, with high levels reflecting a lack of estrogen and inhibin-mediated feedback on the hypothalamus and anterior pituitary gland. A value of 25 IU/L is widely recommended for both POI and natural menopause [34][35][36], although others suggest a higher value. As tamoxifen, through estrogen receptor antagonism, raises FSH levels, this can only be used in women not taking any endocrine therapy. In that group of women, our study showed that undetectable AMH levels at both 6 and 30 months were associated with similar discrimination of estradiol levels as in the whole study population, and analysis of diagnostic accuracy showed slightly greater precision for both diagnosis at 30 months and prediction at 6 months of both 30-month estradiol and AMH than in the wider group. PPV of undetectable AMH at 6 months for elevated FSH consistent with a diagnosis of POI at 30 months was a remarkable 0.93.
While a single assay of AMH at 6 months provides good prediction of later ovarian function and has the clinical benefit of simplicity, we also explored whether additional endocrine, patient and treatment factors could improve this prediction. We and others have shown that pre-treatment AMH is predictive [21][22][23][24][25], as is age, with BMI also contributing in some studies [26]. The addition of a taxane to cyclophosphamide-based regimens also increases ovarian toxicity [17,20]. In multivariate analysis, pre-treatment AMH and taxane treatment were the most important predictors: the limited value of age is likely to reflect the narrow age range in this specific study population. Including all identified factors resulted in PPV 0.79 for prediction of undetectable AMH at 30 months: random forest analysis gave a similar value of 0.82. Very similar results were obtained using estradiol at 30 months as the outcome variable. Analysis without pre-treatment hormone data gave similar results (though with an improvement in PPV to 0.92 by random forest), and by pre-treatment variables only (thus including pre-treatment AMH and taxane treatment) gave PPV of 0.77, with again better prediction by random forest analysis with PPV 0.84. Thus, using this approach with partial data retention for validation to prevent over-fitting allows accurate prediction of long-term ovarian function from either a single post-chemotherapy AMH test alone, or supplemented by knowledge of pre-treatment AMH and taxane treatment, or indeed with similar accuracy from pre-treatment AMH and taxane treatment alone. Therefore, this has validity and utility in a range of clinical scenarios, depending on which variables are known.

Conclusion
These data demonstrate that in women aged 40-45 treated for eBC and after time to allow any recovery of ovarian function, an undetectable AMH level, using this assay platform, is a reliable diagnostic test for lack of ovarian function. Furthermore, early analysis of AMH after completion of chemotherapy allows identification of women who will not recover ovarian function with good accuracy. The combination of pre-treatment AMH measurement with knowledge of whether treatment will include a taxane in anthracycline/cyclophosphamide-based chemotherapy also provides good prediction of long-term ovarian function. These analyses will help inform treatment decisions regarding adjuvant endocrine therapy and the need for adding ovarian suppression to an AI in women who were premenopausal before starting chemotherapy.