At a glance commentary, Gleicher N et al

Background

Prediction of IVF outcomes in patients at different ages has been a longstanding goal in reproductive medicine. Here we demonstrate that, based on embryo numbers produced (retroactive prediction paradigm), FSH and AMH levels (both prospective prediction paradigms), different models are predictive of good, intermediate and poor IVF prognoses at all ages.

Translational significance

This is the first study to demonstrate prospective age-specific IVF outcome predictions based on FSH and AMH levels, and a retrospective prediction model based on embryo numbers.

In an unexpected translational finding, AMH was found associated with contradictory IVF outcomes at "best" (unexpectedly high pregnancy and delivery rates) and excessively high peripheral serum levels (spiking spontaneous miscarriages).

If confirmed, these observations suggest the potential clinical use of AMH as a fertility-enhancing pharmacological agent at "best" levels and as a potential abortifacient at excessively high levels.

Background

How to establish outcome prognoses for infertile women entering in vitro fertilization (IVF) cycles is not well defined [1]. If accomplishable at all ages, the ability to predict prognoses would, therefore, be clinically very valuable. Better definitions of patient populations at IVF centers would also improve internal as well as external quality controls. In the US such external controls are mandated by an act of Congress [2], and currently not satisfactory [3]. Finally, treatments offer different levels of efficacy in good-, intermediate- and poor-prognosis patients [4]. Better definition of “disease” severity, therefore, should improve individualization of IVF treatments and, thereby, improve outcomes.

Prognostication of IVF outcomes has been a longstanding goal [5]. A variety of models have been published [6–11], with female age as a key component [6, 7, 9], since clinical pregnancy and live birth rates clearly decline with advancing female age [12]. te Velde et al. [10] were, therefore, correct in noting that, when building prediction models for IVF, changes in outcomes with advancing female age have to be considered.

Age is, however, not the only important predictor of IVF outcomes. Functional ovarian reserve (FOR), a term reflecting the growing follicle pool, and, therefore, oocyte and embryo numbers, is also closely associated with IVF outcomes. Abnormally low FOR (LFOR) is defined by abnormally increased age-specific follicle stimulating hormone (FSH) [13] and/or decreased age-specific anti-Müllerian hormone (AMH) [14], both reflecting declining egg and embryo numbers and, therefore, deteriorating pregnancy and live birth chances [15].

In women with premature ovarian aging (POA), also called occult primary ovarian insufficiency (oPOI) [16], normal statistical associations between age and FOR are disturbed. POA/oPOI patients prematurely demonstrate LFOR. They represent approximately 10 % of females, independent of race and ethnicity, and at IVF centers can exceed half of all patients [17]. In POA patients FOR-based rather than age-based prediction models in IVF may, therefore, be preferable.

We here present three different models, involving age and FOR, which, based on clinical pregnancy as well as live birth chances, allow classification of patients into good-, intermediate- and poor-prognosis IVF categories. The study, in addition, yielded unexpected results, which suggest previously unrecognized physiologic effects of AMH on IVF outcomes.

Methods

Patient populations

This study involves three partially overlapping patient cohorts: Cohort I, 1247 consecutive fresh IVF cycles during 2009–2013, including egg donor cycles but excluding elective single embryo transfer (eSET) and mild stimulation cycles, was used to investigate associations of good quality embryo numbers (between 1 and 15) with clinical pregnancy and live birth rates at different ages (<36, 36–38, 39–40, 41–42 and ≥43 years). Patients <36 are presented as a single age category because ages <30, 31–32, 33–34 and 35–36 produced basically identical outcomes (see Additional file 1: Appendix Figure S1).

Cohort II, 1514 consecutive fresh autologous non-donor IVF cycles, excluding eSET and mild stimulation cycles, was stratified for age and used to establish associations of highest FSH levels (2.5–40.0 mIU/mL) with clinical pregnancy and live birth rates.

Cohort III, 632 fresh autologous non-donor cycles between 2011 and 2014, excluding eSET and mild stimulation cycles, was used to assess associations of lowest AMH levels (≤0.5–10.0 ng/mL) with clinical pregnancy and live birth rates, stratified for age. Only AMH measurements by the Beckman Generation 2 AMH assay were included, as neither manufacturers of earlier AMH assays nor our own statisticians were able to generate conversion tables.

All patient data, representing consecutive IVF cycles, were extracted from our center’s anonymized electronic research database unless meeting the exclusion criteria noted above. Table 1 summarizes patient and IVF cycle characteristics for all three patient cohorts.

Table 1 Patient characteristics of patient Cohort I, II and III

FSH and AMH values

FSH values were tested in house by commercial assay. Though commercial AMH assays are similar at mid-range (variations between assays are usually only seen at very low and very high levels), the values reported here should not automatically be applied to other AMH assays, since earlier-generation AMH assays differ from the assay used in this study. We later demonstrate that mid-range values, indeed, matter most in the AMH model reported here.

Blood draws occurred at initial presentation, with IVF cycle starts on average initiating 8 weeks later.

IVF cycle protocols

Cycle stimulation protocols at our center are limited, while choice of gonadotropin manufacturer is left up to patients and their insurance coverage. Oocyte donors receive a long agonist protocol (150–300 IU of gonadotropins daily), usually given as human menopausal gonadotropin (hMG). Since most of our center’s patients present with LFOR, a majority receive short microdose agonist protocols, with FSH (300–450 IU) and hMG (150 IU). Patients with normal FOR, if under age 38, receive similar stimulation to egg donors. Patients with LFOR are pretreated with dehydroepiandrosterone (DHEA) to raise testosterone levels to above 28 ng/dL (1 nmol/L) before IVF cycle start [18], and also receive CoQ10 supplementation [19].

Up to age 38, our center transfers in fresh cycles only 1–2 embryos; between ages 38–42, 3 embryos and above age 42, 3 to maximally 5 embryos.

Embryo assessment

After assessment and grading, our center routinely transfers embryos on day 3 (cleavage stage) [20]. Only 4–8-cell embryos of at least grade 3 are transferred or cryopreserved and, therefore, considered good quality.

Statistics

FOR parameters and categorical age were used to model the probability of clinical pregnancy, live birth or pregnancy loss using logistic regression. For models with AMH, a quadratic term (AMH²) was also included and was a statistically significant predictor of all outcomes. A P value of <0.05 was considered statistically significant. All statistical analyses were performed by the center’s senior statistician (S.K.D.), using SAS version 9.4 software.
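The role of the quadratic AMH term can be illustrated with a minimal sketch. The coefficients below are hypothetical placeholders chosen only to produce a curve that peaks at mid-range AMH; they are not the fitted values from this study, and the function names are likewise assumptions made for illustration.

```python
import math

def predicted_probability(amh, age_offset=0.0, b0=-1.2, b_amh=0.45, b_amh2=-0.04):
    """Logistic model with linear and quadratic AMH terms.

    All coefficients are HYPOTHETICAL illustration values, not study estimates.
    age_offset stands in for the coefficient of a categorical age group.
    """
    logit = b0 + age_offset + b_amh * amh + b_amh2 * amh ** 2
    return 1.0 / (1.0 + math.exp(-logit))

def amh_at_peak(b_amh, b_amh2):
    """AMH value at which the quadratic logit (and hence the probability) peaks."""
    if b_amh2 >= 0:
        raise ValueError("a peak requires a negative quadratic coefficient")
    return -b_amh / (2.0 * b_amh2)

if __name__ == "__main__":
    for amh in (0.5, 5.5, 10.0):
        print(amh, round(predicted_probability(amh), 3))
    print("peak at AMH =", amh_at_peak(0.45, -0.04))
```

A negative coefficient on AMH² is what allows a logistic model to describe outcomes that first rise and then fall with increasing AMH; with a linear term alone, the predicted probability could only rise or fall monotonically.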

Ethics, consent and permissions

Patients whose data are preserved in our center’s anonymized electronic database sign, at presentation, an informed consent that allows use of their medical records for research, as long as their anonymity is preserved and their medical records remain confidential. Both conditions are met when data are extracted from the electronic database. Such projects are, therefore, approved by the center’s IRB (IRB of The Center for Human Reproduction, Chairman, Neil Rosenberg, MD) as expedited applications. The study presented here was, thus, approved under IRB application number ER0330215/01.

Results and discussion

Effects of embryo numbers

Table 2 summarizes cycle characteristics for Cohort I. As expected, good quality embryos, pregnancy and live birth rates declined with advancing age, while miscarriage rates increased. In age-specific categories, miscarriages can be defined in all figures as the differences between age-specific clinical pregnancy and live birth rates.

Table 2 Cycle outcome characteristics for Cohorts I, II and III in age categories

How embryo numbers affected clinical pregnancy and live birth rates is shown in Fig. 1: Good-, intermediate- and poor-outcomes within each age group were defined at visually obvious break points in pregnancy and live birth rates. In all figures, fields were colored in yellow for poor prognosis, in blue for good prognosis and left uncolored for intermediate-prognosis.

Fig. 1

Age-specific model of pregnancies and live births based on good quality embryos produced per cycle. a, b reflect clinical pregnancy rates; c, d reflect live birth rates; in (a) and (c), blue background denotes good-prognosis, white intermediate- and yellow poor-prognosis

As Fig. 1a, c demonstrate, at youngest age (<36 years) pregnancy and delivery rates were excellent almost independent of good quality embryo numbers. Even poor prognosis patients (defined by only 1–3 embryos) still achieved clinical pregnancy rates of 34–38 % and live birth rates of 29–32 %. Both rates steadily increased with increasing embryo production to a maximum of 62 and 53 %, respectively.

Because our center only rarely performs elective single embryo transfer (eSET) [21], and up to age 38 practically never transfers more than 2 embryos, this age category at most received 2-embryo transfers (2ETs). Yet, pregnancy and live birth rates increased almost linearly (Fig. 1b, d) with increasing embryo production.

The pregnancy loss (miscarriage) rate, defined as the difference between clinical pregnancy and live birth rates relative to the clinical pregnancy rate, however, remained similar whether a woman produced 1 or 15 embryos: under age 36, for example, pregnancy loss occurred in 14.7 % of pregnancies in women with 1 embryo and in 14.5 % of pregnancies in women with 15 embryos.
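The loss rates quoted above can be reproduced from the rounded clinical pregnancy and live birth rates reported for women under age 36 (34 and 29 % with 1 embryo; 62 and 53 % with 15 embryos). The following is a minimal sketch; the function name is an assumption introduced for illustration.

```python
def pregnancy_loss_rate(clinical_pregnancy_pct, live_birth_pct):
    """Pregnancy loss as a percentage of clinical pregnancies.

    Both inputs are rates in percent of cycles; the result is the share of
    clinical pregnancies that did not end in a live birth.
    """
    if clinical_pregnancy_pct <= 0:
        raise ValueError("clinical pregnancy rate must be positive")
    lost = clinical_pregnancy_pct - live_birth_pct
    return 100.0 * lost / clinical_pregnancy_pct

if __name__ == "__main__":
    print(round(pregnancy_loss_rate(34, 29), 1))  # 1 embryo, age <36
    print(round(pregnancy_loss_rate(62, 53), 1))  # 15 embryos, age <36
```

Expressing the loss relative to clinical pregnancies, rather than as a simple difference in percentage points, is what makes the rate comparable across embryo numbers with very different pregnancy rates.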

Figure 1a and c also demonstrate that, despite uniformly good clinical pregnancy and live birth rates <36 years, separation of good-prognosis (≥51 % clinical pregnancy and ≥44 % live birth), intermediate-prognosis (respectively 40–50 and 34–44 %) and poor-prognosis patients (respectively ≤39 and ≤33 %) was still possible based on obvious break points in cycle outcomes.
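As a minimal sketch, the clinical pregnancy cut-offs quoted above for women under age 36 can be written as a simple classification rule. The function name and the assumption that rates are supplied as whole percentages (as reported in the figures) are ours; the cut-offs apply only to this age category and this center's data.

```python
def prognosis_under_36(clinical_pregnancy_pct):
    """Classify prognosis for age <36 from the clinical pregnancy rate (whole %).

    Cut-offs taken from the text: good >=51 %, intermediate 40-50 %, poor <=39 %.
    """
    if clinical_pregnancy_pct >= 51:
        return "good"
    if clinical_pregnancy_pct >= 40:
        return "intermediate"
    return "poor"

if __name__ == "__main__":
    for rate in (34, 45, 62):
        print(rate, prognosis_under_36(rate))
```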

From age 36, outcomes started declining, while pregnancy losses increased, at age 36–38 reaching 29.2 % for women who produced 1, and 28.6 % in those with 15 embryos. This traditional embryo quality parameter, thus, remained stable (Fig. 1a, c), and almost linear improvements of pregnancy and live birth rates with increasing embryo production were maintained into older ages (Fig. 1b, d). Indeed, improvements within age categories between 1 and 15 embryos grew with advancing age: Under age 36, clinical pregnancy chances increased by 82.4 % (from 34 to 62 %) but by 104.2 % (from 24 to 49 %) in age category 36–38. Concomitantly, live birth rates improved by 89.3 % (from 29 to 53 %) under age 36 and by 105.9 % (from 17 to 35 %) at ages 36–38. By age ≥43, clinical pregnancy rate for 1 embryo was 6 %, and for 15 embryos 17 %, a 183 % increase, while live births increased from 3 to 8 %, a 166.7 % increase (Fig. 1).
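The relative improvements between 1 and 15 embryos are simple percentage increases over the 1-embryo rate. A minimal sketch, using the rounded rates quoted above (the function name is an assumption for illustration):

```python
def relative_increase_pct(rate_low, rate_high):
    """Percentage increase from rate_low to rate_high, relative to rate_low."""
    if rate_low <= 0:
        raise ValueError("baseline rate must be positive")
    return 100.0 * (rate_high - rate_low) / rate_low

if __name__ == "__main__":
    # (label, rate with 1 embryo, rate with 15 embryos), all in percent
    pairs = [
        ("clinical pregnancy, <36", 34, 62),
        ("clinical pregnancy, 36-38", 24, 49),
        ("live birth, 36-38", 17, 35),
        ("clinical pregnancy, >=43", 6, 17),
        ("live birth, >=43", 3, 8),
    ]
    for label, low, high in pairs:
        print(label, round(relative_increase_pct(low, high), 1))
```

Applied to the rounded live birth rates quoted for women under age 36 (29 and 53 %), the same formula yields approximately 82.8 %, so the quoted 89.3 % may derive from unrounded values.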

As Fig. 1a and c demonstrate, with persistently decreasing clinical pregnancy and live birth rates, women ≥43 years only with 7 or more embryos reached 10 % clinical pregnancy rates or higher, and even with up to 15 embryos remained in single digit range for live births. No woman in that age group, therefore, could be considered a good prognosis patient.

Increasingly poor embryo quality with advancing female age was also reflected in increasing pregnancy loss, which at age ≥43 reached 50.0 % in women with 1 embryo and 52.9 % in women with 15 embryos. The embryo quality parameter of pregnancy loss, therefore, remained similar within age categories, even at the most advanced age; yet, clinical pregnancy and live birth rates within age categories improved with growing numbers of embryos produced, and did so more markedly as women grew older.

Effects of FSH levels

Table 2 describes cycle numbers, peak FSH levels and clinical pregnancy as well as live birth rates at different female ages.

Figure 2 summarizes probabilities of clinical pregnancies (Fig. 2a, b) and live births (Fig. 2c, d) at FSH levels between 2.5 and 40.0 mIU/mL. Both declined with increasing FSH levels at all ages. Moreover, within each FSH category, both outcomes also declined with advancing age.

Fig. 2

Age-specific model of pregnancies and live births based on FSH levels (in mIU/mL). a, b reflect clinical pregnancy rates; c, d reflect live birth rates; in (a) and (c) blue background denotes good-prognosis, white background intermediate- and yellow background poor-prognosis

For women <36 years, only FSH up to 7.5 mIU/mL denoted good prognosis (pregnancy 36–43 %; live birth 30–36 %). FSH levels mattered at all ages, with lower FSH levels offering better outcomes even within the good-prognosis category and within the normal FSH range. Pregnancy rates after age 40, and live birth rates as early as age 36, failed to reach good prognosis even at the lowest FSH, suggesting that, at least in adversely selected patients, what constitutes normal FSH levels may have to be reconsidered.

FSH changed in its clinical relevance with advancing female age. For example, under age 36, an FSH of 22.5–25.0 mIU/mL resulted in clinical pregnancy in 19 %, whereas at ages 41–42 the same rate required an FSH of 2.5 mIU/mL (Fig. 2a). Similarly, under age 36, an FSH of 32.5 mIU/mL allowed live births in 11 %, but at ages 41–42 this live birth rate required an FSH of 2.5 mIU/mL (Fig. 2c).

Figure 2 also demonstrates that, at ages ≥43 years, treatment futility, defined by the American Society for Reproductive Medicine (ASRM) as a live birth rate of ca. 1 % [22], was reached at FSH 22.5 mIU/mL. Yet, up to age 42, futility was avoided even at FSH levels up to 40.0 mIU/mL.

Effects of AMH levels

Table 2 also introduces Cohort III, which was used to assess associations of a patient’s lowest AMH (between ≤0.5 and 10.0 ng/mL) with pregnancy (Fig. 3a, b) and live birth rates (Fig. 3c, d). In contrast to the embryo and FSH models, pregnancy and live birth chances in association with AMH followed a bell-shaped curve, with best outcomes at midrange.

Fig. 3

Age-specific model of pregnancies and deliveries based on AMH levels (in ng/mL). a, b reflect clinical pregnancy rates; c, d reflect live birth rates; in (a) and (c) blue background denotes good-prognosis, white intermediate- and yellow poor-prognosis patients

The very high pregnancy and live birth rates at “best” AMH levels were unexpected: Under age 36, AMH values between 3.5 ng/mL and 8.5 ng/mL offered best pregnancy chances (49–55 %, good-prognosis patients); 1.0–3.0, and 9.0–10.0 ng/mL offered intermediate-prognoses (pregnancy 33–46 %), and only AMH of ≤0.5 ng/mL denoted poor prognosis (with still respectable pregnancy rate of 29 %, Fig. 3a).
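Because the AMH model is not monotonic, its prognosis categories are better expressed as bands than as a single threshold. The sketch below encodes the clinical pregnancy bands quoted above for women under age 36. It assumes AMH values on the 0.5 ng/mL grid used in the figure, returns None for values falling between reported bands, and applies only to the assay and center described here; the names are introduced for illustration.

```python
# (lower bound, upper bound, category) in ng/mL, age <36, clinical pregnancy
AMH_BANDS_UNDER_36 = [
    (0.0, 0.5, "poor"),
    (1.0, 3.0, "intermediate"),
    (3.5, 8.5, "good"),
    (9.0, 10.0, "intermediate"),
]

def amh_prognosis_under_36(amh_ng_ml):
    """Return the prognosis band for an AMH value, or None if outside all bands."""
    for low, high, category in AMH_BANDS_UNDER_36:
        if low <= amh_ng_ml <= high:
            return category
    return None

if __name__ == "__main__":
    for amh in (0.5, 2.0, 5.5, 9.5):
        print(amh, amh_prognosis_under_36(amh))
```

Note that the intermediate category appears twice, once below and once above the good-prognosis range, which is the tabular counterpart of the bell-shaped curve.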

Live births behaved similarly (Fig. 3c): best live birth rates (43–47 %) were obtained at AMH 3.5–7.0 ng/mL; intermediate rates (32–41 %) at AMH of 1.5–3.0 and 7.5–9.0 ng/mL. Even poor prognosis at AMH of ≤1.0 ng/mL was still associated with 25–29 % live births.

Clinical pregnancy and live birth declined only mildly up to age 42, and an unexpectedly high 18 % pregnancy rate was still achieved ≥43. Live births reached a respectable 7 % (oldest conception at age 47). Pregnancies in single digits occurred only with AMH <1.5 ng/mL. With AMH ≥2.0 ng/mL, clinical pregnancy rates were between 10–18 %, though declined at very high AMH (Fig. 3a) after reaching peak pregnancy rates at AMH 5.5–6.5 ng/mL. This patient group, however, also experienced the highest pregnancy loss rate of any model.

Pregnancy loss at all ages remained similar for low and “best” AMH levels but significantly increased at highest AMH levels: under age 36, only 13.8 % miscarried at AMH 0.5 ng/mL and only 13.0 % at the “best” level of 5.5 ng/mL; but at AMH of 10.0 ng/mL, rates spiked to 42.9 %. The same occurred ≥43, where pregnancy loss was 57.1 % with lowest AMH, 61.1 % at “best” AMH levels and spiked to 81.8 % at highest AMH, contradicting the view that AMH linearly reflects not only oocyte quantity but also quality [23].

Statistical comments

Because all three statistical models utilized here are highly correlated in their representation of FOR in association with patient age, construction of combined statistical models was not feasible. In univariate models, FSH and AMH, independently, were not predictive of miscarriage. Embryo numbers, however, did reach significance in a univariate model (P = 0.008), though, as expected, significance was lost with adjustments for age, as embryo numbers in themselves are age-dependent.

General discussion

This study was initiated to determine whether, even in relatively adversely selected infertile patient populations, good-, intermediate- and poor-prognosis categories can be defined at different ages based on clinical pregnancy and live birth rates. To the best of our knowledge, such an age-based association study has never before been performed, and certainly not in poor-prognosis IVF patients. The relatively poor prognoses of the populations studied here are best demonstrated by their elevated FSH and abnormally low AMH levels (Table 1).

To be able to classify patients prospectively would be clinically useful for patients and physicians alike. To classify patient populations retrospectively, would allow for their better definition and, therefore, hypothetically for better outcome comparisons between IVF centers. To allow for such comparisons, national outcome reporting in the US is legislatively mandated by Congress [2], though the current system has recently been described as inadequate, and even misleading [3].

A final reason for this study was the recent recognition that some treatment effects vary between good-, intermediate- and poor-prognosis patients. Indeed, especially in poor prognosis patients, some widely utilized treatments may be outright harmful [4].

Definition of patient prognosis

Since prospective definition of prognosis of IVF patients has been a longstanding goal, various models have been proposed [6–11]. None so far have, however, proven clinically effective [23, 24].

By assessing the impact of FOR on IVF outcome in three distinctively different models, this study, therefore, approached the issue differently: In a retrospective model, based on number of embryos produced in a given IVF cycle; and in two prospective models, utilizing FOR’s two most widely used laboratory surrogates, FSH and AMH.

This multifocal evaluation of FOR proved successful since, based on breakpoints in clinical pregnancy and live birth rates, it allowed in each age category for differentiation between good-, intermediate- and poor-prognosis patients.

Defining parameters for individual prognosis categories, as expected [1], changed with advancing female age, and increasing embryo numbers were required to maintain designations. With clinical pregnancy as final outcome, women ≥43 years no longer demonstrated what could be defined as good prognoses. With live births as final outcome, all women in that age category, indeed, demonstrated poor prognosis.

Utilizing peak FSH levels as FOR surrogate, similar associations became apparent (Fig. 2): pregnancy and live birth rates declined with increasing FSH and advancing age. Again, prognoses could be defined based on rather obvious cut offs in pregnancy and live birth rates. This model, however, already at young ages revealed a surprisingly narrow range of good-prognosis: in women <36 years, only FSH ≤ 7.5 mIU/mL, and at ages 36–37, only FSH ≤ 2.5 mIU/mL qualified as good-prognosis with reference pregnancy, while with end point live births, only age <36 qualified. Even intermediate-prognosis became rare after age 40, and required FSH levels <5.0 mIU/mL, while only poor-prognosis patients were left ≥43 years.

These data confirm the importance of utilizing age-specific FSH levels in assessing infertile women [13].

More surprising observations were made in the AMH model: In contrast to the embryo and FSH models, it demonstrated a typical bell-shaped polynomial pattern. Worst IVF outcomes were observed at AMH extremes; “best” AMH was slightly above mid-point (Fig. 3). In pregnancy rates, this pattern carried over into the oldest patient group (Fig. 3a), though based on live births, no good prognosis patients were found ≥43 years (Fig. 3c).

Here reported outcomes are, of course, not automatically applicable to other IVF programs. They were the consequence of very specific practice patterns [18, 25]. Even assuming identical patient populations (in itself also a highly unlikely proposition), different clinical protocols at other centers will result in different pregnancy and live birth rates. To construct universally applicable models, this study will have to be repeated on a multicenter or even national basis, and further validated against results from IVF centers with varying patient populations and treatment protocols.

Different AMH assays utilized by IVF centers may also offer mildly varying results [26], though mid-range AMH, demonstrated in this study to be the most important AMH range, shows the least discrepancy between AMH assays currently in use.

Relevance of treatment protocols

Reliable prognostication of patients is of potential clinical importance: Treatments, which recently entered routine IVF, have shown varying effectiveness in different patient categories. For example, the concept of embryo selection in all of its applications appears beneficial only in good-prognosis patients. With intermediate-prognosis, embryo selection appears ineffective, while with poor prognosis it outright decreases pregnancy and live birth chances [4]. At the other extreme, treatments reported effective in adversely selected patients [18, 19, 25], may be ineffective in intermediate and good prognosis patients.

IVF protocols, therefore, have to evolve toward individualization of care, and a reproducible classification of patients, as here presented, would greatly contribute to standardization of individualized treatment options.

Previously unknown AMH associations with IVF outcomes

Likely the clinically most consequential and translationally most important findings of this study relate to AMH levels: While in embryo and FSH models relationships were almost linear, clinical pregnancy and live birth chances in relation to AMH levels followed a bell-shaped curve, with maximal clinical pregnancy and live birth chances at midrange AMH, rather than highest or lowest levels.

Even into the oldest age categories, this model demonstrated unexpectedly high pregnancy rates at “best” AMH levels. Live birth rates behaved similarly, and were remarkably high up to age 42. Beyond age 42, however, miscarriage rates were extremely high even at “best” AMH levels. At “best” AMH levels, women ≥43, for example, reached an almost incredible 18 % clinical pregnancy rate but only a 7 % live birth rate, representing a 61.1 % clinical miscarriage rate. Though a 7 % live birth rate in this age category is still remarkable, the spike in observed pregnancy loss is even more stunning.

Though in the embryo and FSH models pregnancy loss rates remained the same within all age categories, independent of embryo numbers and FSH levels, in the AMH model miscarriage rates at ages ≥43 remained similar only at low (57.1 %) and “best” AMH levels (61.1 %); at highest AMH levels, they spiked to an incredible 81.8 %.

Combined, these AMH associations suggest positive effects on clinical pregnancy and live birth rates up to “best” AMH levels but, because of increasing miscarriage rates, unfavorable effects on live birth rates at highest AMH levels. A currently widely held opinion is that AMH linearly reflects oocyte quantity and quality, both declining with advancing female age [27, 28]. The AMH observations presented here, however, now suggest otherwise.

Moreover, since miscarriage rates remained the same in all age categories, whether patients produced 1 or 15 embryos, improved outcomes with increasing embryo production (though identical embryo transfer numbers), likely, were independent of embryo quality, as defined by an embryo’s chromosomal integrity. Here presented data, therefore, suggest the existence of yet another embryo quality factor, which is independent of the embryo’s chromosomal status.

The concept of selecting out euploid embryos prior to embryo transfer is the basic principle behind preimplantation genetic screening (PGS) [29]. Here reported findings, therefore, may at least partially explain why, contrary to most predictions, the PGS procedure has so far failed to improve IVF outcomes [30, 31].

The next question to be answered is what drives this previously unknown embryo quality factor, which apparently increases in relative importance with advancing female age? Here presented data suggest that it must be associated with increasing oocyte/embryo production in IVF cycles; yet, since improvements with increasing embryo numbers almost doubled between youngest and oldest age categories, the efficacy of this additional “embryo quality factor” must increase with advancing female age. This observation suggests that AMH may, indeed, be this second, previously unknown “embryo quality factor.”

AMH is, of course, strongly associated with oocyte/embryo production in IVF [27, 28, 32]. At “best” AMH levels, our third model demonstrated extraordinarily high clinical pregnancy rates into even the oldest age categories. Women at ages 41–42 years and above 43 achieved almost unheard of clinical pregnancy rates of 29–30, and 17–18 %, respectively. Neither embryo nor FSH models, however, demonstrated such extraordinary clinical outcomes at advanced age categories.

These extraordinary IVF cycle outcomes, therefore, appear associated with “best” AMH levels, which in this study were defined at ranges of 3.5–8.5 ng/mL in the youngest, and between 4.5 and 7.5 ng/mL in even the oldest age categories.

Yet, in the oldest patients this apparently beneficial AMH-associated effect on clinical pregnancy rates was mostly lost to high miscarriage rates. Though live birth rates still remained relatively high until age 42, above age 43, at “best” AMH, they reached only 6–7 %. These rates, though, were still clearly higher than at very low or very high AMH levels (2–5 %).

Conclusions

Combined, these observations suggest a “dosage-dependent” effect of AMH on clinical IVF outcomes: At “best” levels, AMH improves embryo implantation at all ages, leading to peak clinical pregnancy rates. Whether this observation represents an AMH effect on oocytes, embryos or the endometrium remains to be determined. At excessively high levels, however, AMH appears to increase the risk of pregnancy loss to a significant degree. Miscarriages at highest AMH spiked at all ages to approximately 60 % but reached the incredible rate of 81.8 % in the oldest age category above 43 years.

At “best” levels, AMH, thus, meets the previously noted requirements for the “embryo quality factor” newly described here, which is quantitatively associated with increasing embryo yields but also increases in efficacy with advancing age. This study, indeed, suggests that AMH, as facilitator and inhibitor, demonstrates increasing utility with advancing female age.

If confirmed by further investigations, the effects of AMH on IVF outcomes reported here suggest, especially in older women, a potential therapeutic role for appropriately dosed AMH in improving clinical outcomes in IVF. Our data, however, also raise the specter of AMH, at higher therapeutic levels, functioning as an abortifacient.

Somewhat surprisingly, a pharmacological AMH product for human use is currently not available anywhere in the world. This is all the more surprising since, at least in animal models, AMH has been shown to have clinical effects [33].