Introduction

Urinary gonadotropins include follicle-stimulating hormone (FSH), luteinizing hormone (LH), and human chorionic gonadotropin (hCG), which has LH-like activity [1]. The advantages of recombinant over urinary gonadotropins in terms of purity, consistency, and immunogenicity are well established [1, 2]. Accordingly, recombinant human FSH (r-hFSH) is used around the world for ovulation induction (OI) in women with oligo- or anovulatory infertility.

The majority of women with oligo- or anovulatory infertility related to hypothalamic-pituitary dysfunction have polycystic ovary syndrome (PCOS) [3]. Japanese patients with anovulatory infertility are similar to those defined by the World Health Organization (WHO) as having Group II anovulatory infertility. Both populations are defined by dysfunctional ovulation and a positive response to a progestin challenge, and both include a subset of patients with PCOS.

The ovaries of patients with PCOS are very sensitive to gonadotropin stimulation, and the development of a single dominant follicle is particularly difficult to achieve [4]. It has been suggested that these complications are related to elevated LH and androgen levels, which are commonly seen in patients with PCOS [5]. The development of multiple dominant follicles is associated with risks of ovarian hyperstimulation syndrome (OHSS) and multiple pregnancies [6–9]. Various approaches to minimize the risks of OHSS and multiple pregnancies have been proposed, including the use of fixed-dose, step-up, and step-down gonadotropin protocols.

An earlier Phase II dose-finding study using a low-dose step-up protocol for OI showed that a starting dose of 75 IU r-hFSH (follitropin alfa), with 37.5 IU increments, was associated with a favorable efficacy and safety profile in Japanese women with amenorrhea I or anovulatory infertility [10]. In this Phase III study, the efficacy and safety of follitropin alfa were compared with those of purified urine-derived FSH (urofollitropin), using a low-dose step-up protocol for follicle development and OI in Japanese women with anovulatory infertility who were menstruating or had progestin-positive amenorrhea, including PCOS.

Materials and methods

Study design

This was a Phase III, multicenter, single-blind, parallel-group, comparative, noninferiority study of r-hFSH (follitropin alfa; GONALEF®/GONAL-f®; Merck Serono S.A., Geneva, Switzerland, an affiliate of Merck KGaA, Darmstadt, Germany) versus purified urofollitropin (Fertinorm® P/Fertinorm® HP; Merck Serono S.A., Geneva). The study was conducted by Merck Serono Co., Ltd, Japan (an affiliate of Merck KGaA, Darmstadt, Germany) between February 2007 and December 2007 at 21 medical institutions in Japan (protocol number 26648; ClinicalTrials.gov identifier NCT00467480).

The institutional review boards of all participating investigational centers approved the study for their respective centers. Written informed consent was obtained from each participant. The study was conducted in accordance with good clinical practice (GCP) and the Declaration of Helsinki.

Patients

Japanese national diagnostic criteria were used to identify patients with anovulatory infertility caused by hypothalamic or pituitary dysfunction (with or without PCOS). Patients diagnosed with either ‘amenorrhea I’ [in which menstrual bleeding was induced by administration of progesterone (P4)] or anovulatory cycles (oligo- or polymenorrhea) were included.

Women aged 20–39 years with a body mass index (BMI) of 17.0–28.0 kg/m² who were menstruating without apparent ovulation or were amenorrheic (with a positive progestin challenge test), and had failed to ovulate or achieve pregnancy despite two or more cycles of anti-estrogen therapy (including clomiphene citrate and cyclofenil) were eligible for enrollment. Key exclusion criteria comprised cardiac, pulmonary, hepatic, renal, or cardiovascular dysfunction; malignancy; genitourinary bleeding of unknown cause; and baseline serum FSH levels of 20 mIU/mL or more, P4 of 5 ng/mL or more, or prolactin of 30 ng/mL or more (by chemiluminescence immunoassay).

A baseline diagnosis of PCOS was made when the following three conditions were met: menstrual abnormality; elevated LH with normal FSH levels at baseline (LH/FSH ≥ 1.0); and cystic change in multiple follicles demonstrated on ultrasound examination.

Treatment schedule

Patients were randomized (1:1) at the central registration unit to receive either follitropin alfa or urofollitropin (both as freeze-dried formulations for reconstitution) by subcutaneous injection. All patients were blinded to their treatment allocation.

Follitropin alfa or urofollitropin was administered using a low-dose step-up treatment regimen. FSH stimulation was initiated 2–5 days after the start of spontaneous or P4-induced menstrual bleeding. The 75 IU starting dose of FSH was administered for at least 7 consecutive days. The dose of FSH was increased by 37.5 IU if the mean diameter of the dominant follicle was less than 11 mm on treatment day 8. Two further increments of 37.5 IU were permitted if the mean diameter of the dominant follicle was still less than 11 mm on days 15 and 22 of stimulation. FSH was administered for a maximum of 28 days.
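The step-up rule described above can be sketched in code. This is a minimal illustration only, not the study's dosing software: the function name `daily_doses` and its input format are hypothetical, and the sketch assumes that an increase takes effect on the review day itself. It ignores treatment holidays and early termination for hCG administration.

```python
STARTING_DOSE_IU = 75.0
INCREMENT_IU = 37.5
REVIEW_DAYS = (8, 15, 22)      # days on which a dose increase may be decided
MAX_DAYS = 28                  # maximum duration of FSH stimulation
FOLLICLE_THRESHOLD_MM = 11.0   # increase only if the dominant follicle is smaller


def daily_doses(follicle_mm_at_review: dict[int, float]) -> list[float]:
    """Return the scheduled FSH dose (IU) for treatment days 1..28.

    follicle_mm_at_review maps a review day (8, 15 or 22) to the mean
    diameter (mm) of the dominant follicle measured on that day.
    Assumption: an increase applies from the review day onward.
    """
    dose = STARTING_DOSE_IU
    schedule = []
    for day in range(1, MAX_DAYS + 1):
        if day in REVIEW_DAYS:
            diameter = follicle_mm_at_review.get(day)
            if diameter is not None and diameter < FOLLICLE_THRESHOLD_MM:
                dose += INCREMENT_IU
        schedule.append(dose)
    return schedule


# A slow responder: follicle below 11 mm at all three reviews.
slow = daily_doses({8: 9.0, 15: 10.0, 22: 10.5})
print(slow[0], slow[7], slow[14], slow[21])
```

With three increments the daily dose in this sketch tops out at 187.5 IU (75 + 3 × 37.5).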

Each patient was permitted to miss one of the seven equal daily doses of FSH per week. This dose-free day was termed a treatment holiday. Treatment holidays were permitted because self-injection of gonadotropins for OI was not approved in Japan. Therefore, the women had to attend a medical center each day to receive FSH injections and their center was likely to be closed on some days during the stimulation period. A maximum of four treatment holidays was permitted during a stimulation period, with no more than two consecutive dose-free days.
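The two explicit limits on treatment holidays (at most four per stimulation period, at most two consecutive) can be expressed as a simple check. The function name and the boolean-list input are hypothetical, chosen for illustration. The one-missed-dose-per-week allowance is not checked here, because the text does not define where each week starts.

```python
def holidays_within_limits(dosed: list[bool]) -> bool:
    """Check a stimulation period against the two explicit holiday limits.

    dosed[i] is True if FSH was given on stimulation day i + 1 and
    False if that day was a treatment holiday.
    """
    total_holidays = dosed.count(False)

    longest_run = 0   # longest stretch of consecutive dose-free days
    current_run = 0
    for given in dosed:
        current_run = 0 if given else current_run + 1
        longest_run = max(longest_run, current_run)

    return total_holidays <= 4 and longest_run <= 2


# Ten days of stimulation with a two-day gap over a clinic closure.
print(holidays_within_limits([True] * 5 + [False, False] + [True] * 3))
```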

Regular and frequent serum estradiol (E2) measurements and transvaginal ultrasound (TVUS) scans were performed. Ovulation was triggered using a single dose of 5000 IU hCG administered intramuscularly when the dominant follicle reached a mean diameter of 18 mm. However, on day 28 of stimulation, hCG administration was permitted if the mean diameter of the dominant follicle was at least 16 mm. To minimize the risk of OHSS or a multiple pregnancy, hCG was withheld if the patient met the hCG cancellation criterion (defined as four or more follicles with a mean diameter of at least 16 mm observed on TVUS). Aspiration of follicles to avoid cycle cancellation was not permitted.
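The hCG rules can be summarized as a small decision function. This is a sketch under stated assumptions: it reads the cancellation criterion as "withhold hCG when four or more follicles of at least 16 mm are present", which is what the definition of the criterion implies. The function name and the return labels are hypothetical.

```python
def hcg_decision(stimulation_day: int, follicle_diameters_mm: list[float]) -> str:
    """Return 'withhold', 'trigger', 'continue' or 'stop' for one TVUS scan.

    follicle_diameters_mm holds the mean diameter (mm) of each follicle seen.
    """
    # Cancellation criterion: four or more follicles of at least 16 mm.
    large_follicles = sum(1 for d in follicle_diameters_mm if d >= 16.0)
    if large_follicles >= 4:
        return "withhold"

    dominant = max(follicle_diameters_mm, default=0.0)
    if dominant >= 18.0:
        return "trigger"
    # On the last permitted day a 16 mm dominant follicle is sufficient.
    if stimulation_day == 28 and dominant >= 16.0:
        return "trigger"

    # Otherwise keep stimulating until day 28, then stop without hCG.
    return "continue" if stimulation_day < 28 else "stop"


print(hcg_decision(12, [18.5, 12.0]))
print(hcg_decision(12, [18.0, 17.0, 16.5, 16.0]))
```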

Mid-luteal serum P4 levels were measured 6 (±1) and/or 9 (±1) days following hCG administration. All patients who received hCG underwent a urinary beta hCG pregnancy test 28–31 days after cessation of FSH stimulation. Those with a positive urinary pregnancy test underwent TVUS examination on days 35–42 for confirmation of clinical pregnancy.

If a pregnancy was confirmed before the scheduled pregnancy test on day 28–31, progesterone for luteal phase support was permitted at the investigator’s discretion.

Blood samples were collected (at baseline, before starting treatment, before hCG administration, and on day 28–31 post-treatment) and analyzed centrally for hematology and biochemistry variables. To detect anti-FSH and anti-Chinese hamster ovary (CHO) antibodies, blood samples were collected prior to treatment and on day 28–31 post-treatment. Anti-FSH and anti-CHO antibodies were assessed using radioimmunoprecipitation and enzyme-linked immunosorbent assays, respectively (tests conducted by Merck Serono, Bari, Italy, an affiliate of Merck, Darmstadt, Germany).

Statistical analyses

The primary efficacy endpoint was the proportion of patients who ovulated. Ovulation was assumed if a mid-luteal serum P4 level of 5 ng/mL or more was detected or if clinical pregnancy occurred. Secondary efficacy endpoints were as follows: proportion of patients who developed a dominant follicle with a mean diameter of 18 mm or more; duration of stimulation (days) with follitropin alfa or urofollitropin that was required to achieve a dominant follicle with a mean diameter of 18 mm or more; mean total dose of follitropin alfa or urofollitropin administered to patients who achieved a dominant follicle with a mean diameter of 18 mm or more; proportion of patients who met the hCG cancellation criterion; proportion of patients with a dominant follicle of mean diameter 18 mm or more and without the concurrent presence of other follicles of 14 mm or more in diameter (single follicle maturation rate); biochemical pregnancy rate; and clinical pregnancy rate.

Adverse events (AEs) were categorized by System Organ Class and Preferred Term and were coded using the Medical Dictionary for Regulatory Activities (MedDRA) terminology, version 10.0. A serious AE (SAE) was defined as an event that was life-threatening, resulted in death or a persistent or significant disability, required or prolonged hospitalization, or was a congenital anomaly or birth defect, or any other medically important condition. An adverse drug reaction (ADR) was an AE for which the causal relationship to the study drug could not be excluded. Thus, ADRs included all events that were considered to be probably, possibly, or unlikely related to treatment. Treatment-emergent AEs (TEAEs) were defined as AEs that appeared after administration of the first dose of study drug.

Efficacy evaluations were performed in the full-analysis set (FAS) and the per-protocol set (PPS). The FAS comprised patients who received at least one dose of follitropin alfa or urofollitropin, had no major discrepancies in inclusion or exclusion criteria, and did not violate GCP guidelines. The PPS included patients who were administered at least 80% of the doses of follitropin alfa or urofollitropin (including treatment holidays) and had no serious protocol deviations. Safety evaluations were conducted in the safety population, which comprised patients who received at least one dose of follitropin alfa or urofollitropin.

Noninferiority of the primary endpoint was assessed by analyzing the difference in ovulation rates between patients receiving follitropin alfa and those receiving urofollitropin. The null hypothesis was that the ovulation rate among follitropin alfa-treated patients was lower than that among urofollitropin-treated patients by a clinically significant absolute difference (delta) of at least 15%. A 97.5% one-sided lower confidence bound was calculated for the difference in ovulation rates between follitropin alfa and urofollitropin, using an asymptotic method (which allows normal approximation of data) on the proportions. The ovulation rate of follitropin alfa would be declared noninferior to that of urofollitropin if the lower limit of this confidence bound was greater than −15% in both the FAS and the PPS populations.
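As an illustration of this decision rule, the sketch below computes a one-sided lower confidence bound for a difference in proportions with an unadjusted normal (Wald) approximation. The exact asymptotic method used in the study is not specified beyond that description, so the Wald form is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound_difference(x_test: int, n_test: int,
                           x_ref: int, n_ref: int,
                           one_sided_level: float = 0.975) -> float:
    """One-sided lower confidence bound for p_test - p_ref (Wald approximation)."""
    p_test = x_test / n_test
    p_ref = x_ref / n_ref
    standard_error = sqrt(p_test * (1 - p_test) / n_test
                          + p_ref * (1 - p_ref) / n_ref)
    z = NormalDist().inv_cdf(one_sided_level)  # about 1.96 for 97.5%
    return (p_test - p_ref) - z * standard_error


def is_noninferior(x_test: int, n_test: int, x_ref: int, n_ref: int,
                   margin: float = 0.15) -> bool:
    """Noninferior if the lower bound lies above -margin."""
    return lower_bound_difference(x_test, n_test, x_ref, n_ref) > -margin


# FAS ovulation counts as reported in the Results section:
# follitropin alfa 102/129, urofollitropin 109/132.
bound = lower_bound_difference(102, 129, 109, 132)
print(f"lower bound: {bound:.3f}", is_noninferior(102, 129, 109, 132))
```

With the FAS counts reported in the Results, this unadjusted sketch gives a lower bound of roughly −13%, which lies above the −15% margin. It is a plausibility check, not a reproduction of the study's analysis.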

A total sample size of 240 patients was required. When the delta level was set at 15%, a target count of 108 patients per group was calculated to secure a 90% power for the test. As the expected study dropout rate was 10%, 120 patients were required for each treatment group. Analysis of covariance was performed on patient factors and the primary efficacy endpoint using Cochran–Mantel–Haenszel statistics and logistic regression, respectively.
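The step from 108 to 120 patients per group is the usual inflation for expected dropout. A minimal sketch follows; it reads the target of 108 as a per-group figure (the reading consistent with 120 per group and 240 in total), and the function name is hypothetical. The assumed ovulation rate behind the figure of 108 is not stated, so that part of the calculation is not reproduced.

```python
from fractions import Fraction
from math import ceil


def enrolled_per_group(evaluable_per_group: int, dropout_percent: int) -> int:
    """Patients to enroll per group so that, after the expected dropout,
    the required number of evaluable patients remains."""
    # Exact rational arithmetic avoids floating-point rounding surprises.
    return ceil(Fraction(evaluable_per_group * 100, 100 - dropout_percent))


per_group = enrolled_per_group(108, 10)
print(per_group, 2 * per_group)  # patients per group, total for two groups
```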

Results

Patient disposition

Patient disposition throughout the study is depicted in Fig. 1. A total of 265 patients were randomized to one of two treatment groups (follitropin alfa, n = 129; urofollitropin, n = 136). Of these, 261 patients (follitropin alfa, n = 129; urofollitropin, n = 132) received at least one dose of study medication and comprised both the FAS and the safety population. A total of 251 patients (follitropin alfa, n = 125; urofollitropin, n = 126) were included in the PPS. Thirteen patients discontinued treatment (follitropin alfa, n = 6; urofollitropin, n = 7); only one patient (receiving urofollitropin) dropped out because of a TEAE (mumps infection). The numbers of patients who received hCG, underwent P4 assessment, and underwent pregnancy testing were similar in both treatment groups.

Fig. 1 Patient disposition flow chart. AE adverse event, FAS full-analysis set, PPS per-protocol set, r-hFSH recombinant human follicle-stimulating hormone, u-hFSH urinary human FSH

Baseline demographics and disease characteristics

The mean (SD) age of patients enrolled was 31.6 (3.8) years and their mean (SD) BMI was 21.2 (3.0) kg/m². Baseline demographics and disease characteristics were well matched in both treatment groups (Table 1). Approximately one quarter of the patients in both treatment groups fulfilled the predefined diagnostic criteria for PCOS (Table 1).

Table 1 Baseline demographic and disease characteristics of the full-analysis set

Exposure

The mean (SD) total dose of follitropin alfa administered was 1042 (645) IU with a duration of exposure (excluding treatment holidays) of 11.7 (5.4) days (n = 129). The mean (SD) total dose of urofollitropin administered was 882 (479) IU with a duration of exposure (excluding treatment holidays) of 10.4 (4.2) days (n = 132). Most patients in both groups received daily doses of 75 and 112.5 IU; only two patients in the follitropin alfa group received 187.5 IU.

Treatment holidays (drug-free days) were taken by more than two-thirds of patients in each treatment group (follitropin alfa, 87/129; urofollitropin, 90/132). The majority of patients in both groups had only one or two drug-free days (follitropin alfa, 74/129; urofollitropin, 85/132).

Primary endpoint

The primary efficacy endpoint was the proportion of patients who ovulated, as defined by a mid-luteal serum P4 level of 5 ng/mL or more or by assumed ovulation due to clinical pregnancy. No patient with a P4 level of less than 5 ng/mL met the criterion of assumed ovulation due to clinical pregnancy. According to the criterion of a P4 level of 5 ng/mL or more, ovulation occurred in 79.1% (102/129) and 82.6% (109/132) of patients in the FAS who received follitropin alfa or urofollitropin, respectively, and in 79.2% (99/125) and 82.5% (104/126) of those in the PPS (Table 2). As the 97.5% one-sided lower limit of the difference in ovulation rates exceeded the predefined delta level (−15%) in the FAS and PPS, follitropin alfa was considered noninferior to urofollitropin in inducing ovulation.

Table 2 Ovulation rates (progesterone ≥ 5 ng/mL or clinical pregnancy; primary endpoint) among patients receiving follitropin alfa or urofollitropin in the FAS and PPS

An adjusted subgroup analysis of patients with and without a diagnosis of PCOS was also carried out. As the 97.5% one-sided lower limit of the difference in ovulation rates was −12.9% in both the FAS and the PPS, the predefined noninferiority criteria for the primary endpoint were achieved in both populations. Thus, PCOS status did not impact the primary efficacy outcome.

Secondary endpoints

Biochemical pregnancy rates were 17.8% (23/129) and 15.2% (20/132) in the follitropin alfa and urofollitropin groups, respectively (Fig. 2). A total of 41/261 (15.7%) patients achieved a clinical pregnancy. Clinical pregnancy rates were 17.1% (22/129) and 14.4% (19/132) in the follitropin alfa and urofollitropin groups, respectively (Fig. 2). No significant differences between the two treatment groups were observed in any of the secondary efficacy endpoints (Table 3).

Fig. 2 Pregnancy outcomes for patients in the full-analysis set. P values, χ² test

Table 3 Outcomes of other secondary endpoints for the full-analysis set

Safety data

A similar proportion of patients in each group reported TEAEs. In the follitropin alfa group, 69/129 (53.5%) patients reported a total of 120 TEAEs, whereas 66/132 (50.0%) patients in the urofollitropin group reported 127 TEAEs.

The TEAEs reported most commonly (occurring in ≥5% of patients in either treatment group) are shown in Table 4. OHSS occurred in 15/261 (5.7%) patients: 10/129 (7.8%) patients receiving follitropin alfa and 5/132 (3.8%) patients receiving urofollitropin. The majority of these cases (9/10 for patients receiving follitropin alfa and 5/5 for patients receiving urofollitropin) were of moderate severity. Two SAEs were reported (one in each treatment group). Both cases were OHSS (with pregnancy) that required hospitalization, and both resolved without sequelae.

Table 4 TEAEs reported most commonly (reported by ≥5% of patients in either treatment group) in the safety population

The proportion of patients experiencing at least one treatment-emergent ADR was similar in the two treatment groups: 27.1% (35/129) and 21.2% (28/132) in the follitropin alfa and urofollitropin groups, respectively. Local tolerability was similar in both groups; injection-site AEs were reported by 1.6% (2/129) of patients who received follitropin alfa and 2.3% (3/132) of patients who received urofollitropin. Antibodies for FSH and CHO were not detected in either treatment group (either pre- or post-treatment). No clinically meaningful change or trend was observed in mean laboratory values.

The incidence of multiple pregnancies was comparable in each treatment group: 3/23 (13.0%) in the follitropin alfa group and 5/20 (25.0%) in the urofollitropin group. No high-order multiple pregnancies (three or more fetal heartbeats detected on ultrasound scan) were observed in the patients receiving follitropin alfa, whereas two (of 20) women in the urofollitropin group had a triplet pregnancy.

Discussion

In this Phase III, multicenter, single-blind, parallel-group study, we compared the efficacy and safety of a low-dose step-up regimen of follitropin alfa and urofollitropin for follicle development and OI in Japanese women with anti-estrogen-ineffective anovulatory infertility, including PCOS. A mid-luteal P4 level of 5 ng/mL is considered to be adequate evidence of ovulation by the Japanese coordinating committee members of the Japanese r-hFSH Ovulation Induction Study Group.

No significant difference in the primary efficacy endpoint (rate of ovulation) was observed between follitropin alfa and urofollitropin in the FAS or PPS populations. In subgroup analyses of the primary endpoint, PCOS status did not affect the overall outcome. Furthermore, no significant differences were demonstrated in any other secondary efficacy endpoints, including the proportion of patients with a dominant follicle of 18 mm in diameter and biochemical or clinical pregnancy rates. A slightly higher pregnancy rate was observed in the follitropin alfa group despite somewhat lower rates of achievement of dominant and single follicle development compared with the urofollitropin group.

AEs, including OHSS and multiple pregnancies, occurred at similar frequencies in each treatment group and injections of both gonadotropins were well tolerated. Immunogenicity was similar in both treatment groups, with few injection-site AEs and undetectable antibodies for FSH and CHO. Overall, no significant differences were observed in the efficacy of follitropin alfa and urofollitropin, and the two products had similar safety and tolerability profiles.

Low-dose step-up gonadotropin protocols aim to attain and maintain a serum FSH level that is equal to or slightly above an individual’s FSH threshold and, thus, induce the development of a single dominant follicle [5, 6, 11, 12]. A starting dose of FSH of 75 IU per day is used in conventional step-up treatment regimens, with incremental dose increases of 75 IU every 5–7 days. Classic low-dose step-up regimens begin with a 75 IU daily dose of FSH (maintained for 7–14 days); this is then augmented with small incremental dose rises, usually of 37.5 IU (each maintained for at least 7 days) [4]. Lower starting and incremental doses have also been tested [13, 14]. In each case, the patient is closely monitored using serial E2 levels and ultrasound measurements. Low-dose step-up regimens are preferred to conventional regimens for both recombinant and urinary FSH [15] and are likely to be implemented with increasing frequency in Japan [16].

Until recently, self-injection of gonadotropins for OI was not approved in Japan. At the time of the present study, patients had to attend a medical center to receive daily treatment. Treatment holidays were taken by more than two-thirds of patients in each group due to clinic closure. As such, care must be taken when comparing the treatment duration or total dose of FSH used in this study with other published studies of recombinant versus urine-derived gonadotropins (in which consecutive daily treatment was received and the inclusion of treatment holidays was rare). Due to the pharmacokinetic profile of follitropin alfa, patients who have treatment holidays may require a longer duration of treatment and larger total doses of FSH.

In conclusion, no significant difference in the primary efficacy endpoint (rate of ovulation) was observed between follitropin alfa and purified urofollitropin in Japanese women with anti-estrogen-ineffective anovulatory infertility, including PCOS, when using a low-dose step-up regimen with a starting dose of 75 IU per day. Follitropin alfa administered using this treatment regimen appears to be effective and well tolerated for OI in this patient population. The use of treatment holidays in this study prevents the comparison of data with previous trials, which utilized consecutive daily doses.