Key Summary Points

Why carry out this study?

In Japanese women, menstrual problems are the main cause of impairment at work and dysmenorrhea is one of the most common menstrual symptoms. Therefore, reducing the disease burden of dysmenorrhea is essential, especially in Japan.

The efficacy of progestins alone and low-dose estrogen–progestins (LEPs) in treating dysmenorrhea has not been sufficiently studied, so this study was performed as a systematic review, direct meta-analysis, and indirect network meta-analysis to evaluate the difference in efficacy between LEPs and progestins available for the treatment of dysmenorrhea in Japanese women.

What is learned from this study?

In the direct meta-analysis, we found significant differences between all drugs and placebo in both types of dysmenorrhea and both outcomes except ultra-low-dose norethisterone/ethinylestradiol with cyclic regimen in primary dysmenorrhea, and in the indirect network meta-analysis, which included eight randomized controlled trials, we found a significant difference in visual analogue scale between dienogest and norethisterone/ethinylestradiol with cyclic regimen in secondary dysmenorrhea but no other differences between drugs.

We confirmed that LEPs and dienogest are effective for primary and secondary dysmenorrhea and suggest that continuous regimens may be more effective than cyclic regimens in improving outcomes.

Introduction

Dysmenorrhea is common in women of reproductive age and is characterized by symptoms including lower abdominal pain, bloating, nausea, vomiting, headache, and dizziness during the menstrual period that have negative effects on quality of life (QoL) and productivity [1, 2]. In Japanese women, menstrual problems are the main cause of impairment at work [3] and dysmenorrhea is one of the most common menstrual symptoms [4]. Female labor force participation has been rising recently in Japan but remains lower than in other developed countries [5]; reducing the disease burden of dysmenorrhea is therefore especially important in Japan.

Two types of dysmenorrhea have been defined. Primary dysmenorrhea, also referred to as functional dysmenorrhea, is menstrual pain that is not associated with causative diseases and frequently occurs 2 or 3 years after first menstruation. Secondary dysmenorrhea, also referred to as organic dysmenorrhea, is mainly associated with a disease of the reproductive organs, e.g., adenomyosis, endometriosis, and uterine fibromatosis [6,7,8]. Endometriosis is known to be the underlying cause of secondary dysmenorrhea in approximately 70% of women and causes dysmenorrhea that lasts longer than primary dysmenorrhea.

Several studies have reported the efficacy and safety of dysmenorrhea treatments, including analgesics, hormonal therapies, and Chinese herbal medicines [9,10,11,12,13,14,15]. In particular, oral hormonal therapies, i.e., combined oral contraceptives and progestin-only pills, decrease hypercontraction of the uterus by suppressing prostaglandin production or act directly on the uterine lining. In Japan, low-dose estrogen–progestins (LEPs), administered in either a cyclic or extended (including flexible) regimen, and progestins are approved for the treatment of dysmenorrhea, and the medication is covered by public health insurance.

In clinical practice, these treatments are prescribed selectively because they have different adverse effects [16]. When administered alone, progestins are associated with a high incidence of irregular uterine bleeding. On the other hand, although LEPs can control bleeding, they can cause headaches and pelvic pain (because of the hormone-free interval) and, less frequently, thrombosis.

The efficacy of LEPs administered in either a cyclic or extended regimen has been reported in comparative clinical trials and systematic reviews [13, 17,18,19,20,21]. However, the efficacy of progestin alone versus LEPs in treating dysmenorrhea has not been sufficiently studied. Therefore, the aim of this study was to perform a systematic review, direct meta-analysis, and indirect network meta-analysis to evaluate the difference in efficacy between LEPs and progestins available for the treatment of dysmenorrhea in Japanese women.

Methods

We registered the protocol for this systematic review and meta-analyses with the International Prospective Register of Systematic Reviews (PROSPERO: CRD42021283446) [22] and report the study here according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [23].

Search Criteria

Eligible studies had to meet the following criteria: (1) the study was a randomized controlled trial (RCT) or systematic review of RCTs; (2) the study was performed in Japan to ensure that the study population was Japanese women with dysmenorrhea; (3) the studied drugs were oral LEPs or progestins available in Japan at the time of the literature search for the present analysis, and the study compared them with each other or with placebo; (4) the study evaluated the total dysmenorrhea score (defined as the sum of two subscores: severity of dysmenorrhea and analgesic use) or pain evaluated by a visual analogue scale (VAS); (5) the publication was written in English or Japanese; and (6) the study was published on or before 31 August 2021.

Literature Search

To identify potentially relevant studies, the two reviewers (NS and AS) independently conducted a systematic search in MEDLINE (via the PubMed interface), the Cochrane Library database, and ICHUSHI Web (a database for articles written in Japanese) on 31 August 2021. Additional studies were identified by hand searches of the reference lists of published studies. The detailed search terms and retrieval records are shown in Tables S1 and S2 in the supplementary material. We also searched clinical trial registration systems and package inserts, reports, and product information submitted by manufacturers to the Pharmaceuticals and Medical Devices Agency in Japan to confirm if there were any other reports of the products that had not been published.

The two reviewers independently evaluated whether the studies met the inclusion criteria. Possible discrepancies between the two reviewers were to be resolved through discussions with a third reviewer (AI); however, no discrepancies occurred.

Study Outcomes

The main study outcome was the total dysmenorrhea score. The two reviewers extracted the following information from each eligible article: study title; author names; publication year; number of cases; number of controls; age; drug studied; dosage form; administration regimen (cyclic, extended, or continuous); type of dysmenorrhea (primary, secondary, or both); follow-up period; amount of analgesic, if used; and variables in seven relevant domains (see below) to assess the risk of bias. Outcome data, i.e., the mean change in total dysmenorrhea score and VAS from baseline to the assessment time point, were also extracted. If standard deviations (SDs) were not specified for the means, we extracted measures that enabled us to estimate SD (e.g., standard errors of the mean [SEMs] or confidence intervals [CIs]). If SDs were specified only at baseline and the assessment time point, we derived them for mean changes by assuming additivity of variance. Because the assessment time points differed between studies and outcomes likely changed over the duration of treatment, we also extracted outcomes at the 12th week, which was the earliest time point available across the eligible studies. If numerical data were not specified in the text or tables, we obtained them from the figures.
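The SD derivations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the CI conversion assumes a normal-approximation 95% interval (mean ± 1.96 × SEM), and "additivity of variance" is read here as treating the baseline and follow-up measurements as uncorrelated.

```python
import math


def sd_from_sem(sem: float, n: int) -> float:
    """Recover the SD from the standard error of the mean: SD = SEM * sqrt(n)."""
    return sem * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Recover the SD from a CI of the mean.

    Assumption: the interval was built as mean +/- z * SEM (z = 1.96 for 95%).
    """
    sem = (upper - lower) / (2.0 * z)
    return sd_from_sem(sem, n)


def sd_of_change(sd_baseline: float, sd_followup: float) -> float:
    """SD of the change from baseline under additivity of variance.

    Assumption: Var(change) = Var(baseline) + Var(follow-up), i.e. the two
    measurements are treated as uncorrelated.
    """
    return math.sqrt(sd_baseline ** 2 + sd_followup ** 2)
```

Because repeated measurements in the same patient are usually positively correlated, this reading of additivity tends to give a conservative (larger) SD for the change.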

The data collection in this study is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.

Quality Assessment

The two reviewers independently assessed the quality of the included studies with the Cochrane risk of bias tool [24]. Each of the seven domains of the tool was categorized as being at low (+), unclear (?), or high (−) risk of bias according to the recommendations outlined in the Cochrane Handbook for Systematic Reviews of Interventions (version 6.1) [24]. Disagreements were resolved through discussion with a third reviewer.

Statistical Analysis

Data were abstracted and analyzed with R (version 4.1.0, R Foundation for Statistical Computing, Vienna, Austria). For the direct meta-analysis, we used the “metafor” package (version 3.0.2), and for the indirect Bayesian network meta-analysis, the “gemtc” package (version 1.0.2) and JAGS (version 4.3.0, MRC Biostatistics Unit, Cambridge, UK). We calculated mean differences (MDs) for mean values and reported them with 95% CIs or 95% credible intervals (CrIs); statistical significance was defined as a P value of less than 0.05. All statistical tests were two-sided. Publication bias was assessed with funnel plots, but no statistical test of funnel plot asymmetry was used because a sufficient number of studies is required to detect true asymmetry [25].
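As an illustration of the effect measure, the following sketch computes an MD between two arms with a 95% CI. It is not the analysis code used in the study: the function name and inputs are hypothetical, and it assumes a normal approximation with unpooled arm variances.

```python
import math


def mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl, z=1.96):
    """Mean difference (treatment minus control) with a normal-approximation CI.

    Assumption: the sampling variance of the MD is sd_trt^2/n_trt + sd_ctl^2/n_ctl.
    Returns (md, se, (ci_lower, ci_upper)).
    """
    md = mean_trt - mean_ctl
    se = math.sqrt(sd_trt ** 2 / n_trt + sd_ctl ** 2 / n_ctl)
    return md, se, (md - z * se, md + z * se)
```

With mean changes from baseline as inputs, a negative MD indicates a larger reduction in the score with the active drug than with the comparator.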

We first conducted a traditional pairwise meta-analysis for every treatment (i.e., direct comparisons) with the DerSimonian and Laird random-effects model (“metafor” package for R). We assessed statistical heterogeneity with the I² statistic, which describes the proportion of the variation that is related to heterogeneity rather than chance, and the Q test; an I² greater than 50% or a P value less than 0.05 was considered to indicate substantial heterogeneity.
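The DerSimonian and Laird approach and the heterogeneity statistics can be sketched in a few lines. This is a simplified stand-in for what a meta-analysis package does, not the “metafor” implementation: the function name is hypothetical, the inputs are per-study MDs and their sampling variances, and I² is computed with the common formula (Q − df)/Q.

```python
import math


def dersimonian_laird(effects, variances, z=1.96):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    effects: per-study mean differences; variances: their sampling variances.
    Returns a dict with the pooled estimate, its SE and CI, Q, tau^2 and I^2 (%).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - z * se, pooled + z * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

For example, two hypothetical studies with MDs of −2.0 and −1.0 and sampling variances of 0.25 each give Q = 2.0, tau² = 0.25, and I² = 50%, which sits exactly at the threshold used above for substantial heterogeneity.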

We also conducted a Bayesian hierarchical network meta-analysis that used non-informative priors and Markov chain Monte Carlo (MCMC) simulation (“gemtc” package, which calls JAGS from R for MCMC sampling). We selected the fixed-effect or random-effects model according to the heterogeneity. We used four parallel chains and ran 20,000 simulations to obtain model parameters after 5000 burn-in samples for each chain. To check convergence, we used the Gelman–Rubin diagnostic and trace plots. The surface under the cumulative ranking curve (SUCRA) was calculated to obtain the hierarchy of each treatment. In addition to the comparisons between the drugs, we conducted a network meta-analysis to compare the types of administration regimen.
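To make the ranking step concrete, the sketch below derives rank probabilities from posterior draws and converts them to SUCRA values. It is an illustration only and does not reproduce the “gemtc” output: the function names, treatment labels, and draws are hypothetical, and a lower (more negative) effect is assumed to be better, as with a reduction in pain score.

```python
def rank_probabilities(draws, lower_is_better=True):
    """Estimate P(treatment has rank r) from posterior draws.

    draws: dict mapping treatment name -> list of sampled effects, with all
    lists aligned by MCMC iteration. Rank index 0 is the best rank.
    """
    names = list(draws)
    n_iter = len(draws[names[0]])
    counts = {t: [0] * len(names) for t in names}
    for i in range(n_iter):
        ordered = sorted(names, key=lambda t: draws[t][i],
                         reverse=not lower_is_better)
        for rank, t in enumerate(ordered):
            counts[t][rank] += 1
    return {t: [c / n_iter for c in counts[t]] for t in names}


def sucra(rank_probs):
    """SUCRA: mean of the cumulative rank probabilities over the first a-1 ranks."""
    a = len(rank_probs)
    if a < 2:
        raise ValueError("SUCRA needs at least two treatments")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)


# Hypothetical draws of effects versus a common reference (not study data).
example_draws = {
    "drug_A": [-3.0, -3.0, -1.0, -3.0],
    "drug_B": [-2.0, -2.0, -2.0, -2.0],
    "placebo": [0.0, 0.0, 0.0, 0.0],
}
example_sucra = {t: sucra(p) for t, p in rank_probabilities(example_draws).items()}
```

A SUCRA of 1 means a treatment is always ranked best and 0 means it is always ranked worst, so values such as 0.96 indicate a treatment that is ranked first in nearly all posterior draws.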

To synthesize most of the arms, in the analysis we used the values at the main or later assessment time point in each study. A sensitivity analysis was performed to evaluate the effect of the length of drug administration by conducting the network meta-analysis with the values at the 12-week assessment time point. If data were not available at week 12, the arms were excluded from the sensitivity analysis.

Results

Search Results

A flow diagram of the literature selection process is shown in Fig. 1. The initial search identified 314 articles and we selected 32 articles for further full-text review. Of these 32 articles, 17 articles, comprising 10 articles on RCTs [9,10,11,12,13,14,15, 26,27,28] and seven systematic reviews, met the inclusion and exclusion criteria (Table 1). With regard to the type of dysmenorrhea studied, two of the 10 studies included only participants with primary dysmenorrhea [11, 14]; one included only participants with secondary dysmenorrhea [10]; and the other seven included participants with primary or secondary dysmenorrhea (overall dysmenorrhea), although one of these studies did not report outcomes for primary and secondary dysmenorrhea separately [15].

Fig. 1

Flow diagram of the literature selection process. RCT randomized controlled trial, SLR systematic literature review

Table 1 All eligible studies included in the meta-analyses

The results of the quality assessment of the 10 studies according to the Cochrane risk of bias tool are shown in Fig. S1 in the supplementary material. Downgrading of quality because of an unclear risk of bias was based on an insufficient or incomplete description of the random sequence generation, blinding, or allocation concealment or missing data management.

The extracted endpoints of the 10 studies included in the analyses are summarized in Table S3 in the supplementary material. The assessment time points were as follows: week 12, DNG and DRSP/EE-cyclic [14], DNG [15], and LNG/EE-cyclic and LNG/EE-extended [26]; week 16, NET/EE LD-cyclic [9, 11], NET/EE ULD-cyclic [12], and DRSP/EE-cyclic [10]; and week 24, DRSP/EE-extended and DRSP/EE-cyclic [13]. The primary endpoint was evaluated at weeks 12–16 in all trials except the study of the DRSP/EE-extended regimen. For DRSP/EE-extended, both the total dysmenorrhea score and VAS clearly decreased from baseline and were maintained with little fluctuation from week 12 to week 24. Therefore, the dysmenorrhea severity scores at week 24, a secondary endpoint for DRSP/EE-extended, could be used in the data synthesis.

Meta-Analyses

Direct Meta-Analysis

The results of the direct comparisons with placebo (i.e., not including the study that compared DRSP/EE-cyclic with DRSP/EE-extended) are shown in Fig. 2. In both primary and secondary dysmenorrhea, almost all drugs showed greater improvements than placebo in the total dysmenorrhea score and VAS. In primary dysmenorrhea, the improvements in the total dysmenorrhea score compared with placebo were higher with LNG/EE-extended than with the other drugs (−2.00 [95% CI −2.49 to −1.51]; Fig. 2a) and the improvements in VAS compared with placebo were higher with DNG than with the other drugs (−25.45 [95% CI −34.22 to −16.68]; Fig. 2b). In secondary dysmenorrhea, the improvements compared with placebo were higher with DNG than with the other drugs in both the total dysmenorrhea score (−2.70 [95% CI −3.64 to −1.76]) and VAS (−46.10 [95% CI −62.43 to −29.77]; Fig. 2c, d).

Fig. 2

Forest plot of direct meta-analysis of drug treatments for dysmenorrhea assessed by total dysmenorrhea score (a primary dysmenorrhea; c secondary dysmenorrhea) and visual analogue scale (b primary dysmenorrhea; d secondary dysmenorrhea). DNG dienogest with continuous regimen, DRSP/EE-cyclic drospirenone/ethinylestradiol betadex with cyclic regimen, DRSP/EE-extended drospirenone/ethinylestradiol betadex with extended regimen, LNG/EE-cyclic levonorgestrel/ethinylestradiol with cyclic regimen, LNG/EE-extended levonorgestrel/ethinylestradiol with extended regimen, NET/EE LD-cyclic norethisterone/ethinylestradiol with cyclic regimen, NET/EE ULD-cyclic ultra-low-dose norethisterone/ethinylestradiol with cyclic regimen, 95% CI 95% confidence interval, RE model random effects model

The VAS did not show significant heterogeneity among the studies in primary dysmenorrhea (I² = 0.0%, p = 0.4289) or secondary dysmenorrhea (I² = 47.54%, p = 0.1063). However, the total dysmenorrhea score did show significant heterogeneity in both types of dysmenorrhea (primary, I² = 65.51%, p = 0.0050; secondary, I² = 78.38%, p = 0.0001).

Indirect Network Meta-Analysis

The results of the indirect comparisons are shown in Table 2, the network diagrams in Fig. 3, and the relative efficacies against placebo in Fig. S2 in the supplementary material. In primary dysmenorrhea, no statistical difference was found between drugs in the improvement of the total dysmenorrhea score and VAS (Table 2a, b). In secondary dysmenorrhea, the same result was found in the improvement of the total dysmenorrhea score (Table 2c), but DNG showed more improvement of the VAS than NET/EE LD-cyclic (−25.84 [95% CrI −44.46 to −7.15]; Table 2d). DNG and LNG/EE-extended were ranked highest in the total dysmenorrhea score for both primary and secondary dysmenorrhea (SUCRA values; primary dysmenorrhea: DNG, 0.71, and LNG/EE-extended, 0.78; secondary dysmenorrhea: DNG, 0.75, and LNG/EE-extended, 0.71) and DNG was ranked highest in VAS for both primary and secondary dysmenorrhea (SUCRA values; 0.96 and 0.95, respectively). All SUCRA values are shown in Table S4 in the supplementary material.

Table 2 Comparison of the relative effect between arms

Fig. 3

Network diagram of indirect comparisons. Each node represents an intervention arm. The lines represent the direct comparisons in the studies. a Total dysmenorrhea score in primary dysmenorrhea; b visual analogue scale in primary dysmenorrhea; c total dysmenorrhea score in secondary dysmenorrhea; d visual analogue scale in secondary dysmenorrhea. DNG dienogest with continuous regimen, DRSP/EE-cyclic drospirenone/ethinylestradiol betadex with cyclic regimen, DRSP/EE-extended drospirenone/ethinylestradiol betadex with extended regimen, LNG/EE-cyclic levonorgestrel/ethinylestradiol with cyclic regimen, LNG/EE-extended levonorgestrel/ethinylestradiol with extended regimen, NET/EE LD-cyclic norethisterone/ethinylestradiol with cyclic regimen, NET/EE ULD-cyclic ultra-low-dose norethisterone/ethinylestradiol with cyclic regimen

We performed a sensitivity analysis to confirm the impact of the difference in assessment time points between studies by using outcomes at week 12 (Table 3). Data were available only for NET/EE LD-cyclic, NET/EE ULD-cyclic, DRSP/EE-cyclic, LNG/EE-cyclic, LNG/EE-extended, and DNG. The results were similar to those shown in Table 2, so we concluded that the difference in assessment time points did not have a significant impact on the results.

Table 3 Relative effect between arms at week 12

The results of the comparison between administration regimens are shown in Table 4, and the relative efficacies against placebo are shown in Fig. S3 in the supplementary material. The continuous and extended groups tended to have better efficacy in the total dysmenorrhea score (Table 4a, c), and the continuous groups showed a significant difference compared with the cyclic group, as assessed by VAS (Table 4b, d). Continuous and extended groups were ranked highest in the total dysmenorrhea score for both primary and secondary dysmenorrhea (SUCRA values; primary dysmenorrhea: continuous group, 0.87 and extended group, 0.74; secondary dysmenorrhea: continuous group, 0.86 and extended group, 0.77) and the continuous group was ranked highest in VAS for both primary and secondary dysmenorrhea (SUCRA values; 0.98 for both types of dysmenorrhea). All SUCRA values are shown in Table S5 in the supplementary material.

Table 4 Relative effect between administration regimens

Discussion

In the present study, we performed a systematic review, direct meta-analysis, and indirect network meta-analysis to evaluate the difference in efficacy between drugs for dysmenorrhea approved in Japan. This is the first such study to include the progestin DNG 1 mg/day in a comparison of efficacy with LEPs, combined oral contraceptives commonly used overseas. In this study, the efficacy of the drugs was evaluated with two indexes for dysmenorrhea: the total dysmenorrhea score, which grades pain according to limited ability to work and need for analgesics [11], and the VAS, which captures the current degree of pain.

In the direct meta-analysis, we found significant differences between all drugs and placebo in both types of dysmenorrhea and both outcomes except NET/EE ULD-cyclic in primary dysmenorrhea. In the indirect network meta-analysis, which included eight RCTs, we found a significant difference in VAS between DNG and NET/EE LD-cyclic in secondary dysmenorrhea but no other differences between drugs. Endometriosis is one of the causes of secondary dysmenorrhea. In a network meta-analysis in 2127 patients that compared drugs for endometriosis-related pain, Samy et al. reported that the probability ranking p-score of DNG was the highest among five interventions (DNG 2 mg/day, oral contraceptives, elagolix 150 mg, elagolix 250 mg, placebo) for improvement in pelvic pain as measured by a VAS in the third month after initiation of the drugs [29].

We suggest two possible reasons for the significant difference in improvement of VAS between NET/EE LD-cyclic and DNG in secondary dysmenorrhea. First, NET/EE LD-cyclic involves withdrawal bleeding and pelvic pain (due to the hormone-free interval), whereas DNG causes amenorrhea by suppressing ovulation and thus prevents pain [30]. Second, in endometriosis, a typical cause of secondary dysmenorrhea, DNG is expected to have anti-inflammatory and antiproliferative effects on the endometrium because of its high progestin activity [30,31,32]. Although DNG was shown to be effective in improving VAS in endometriosis, the dose of DNG included in this study was half of the dose generally used for the treatment of endometriosis, so the estrogen-suppressing effect was relatively weak [14]. The pain suppression by DNG in secondary dysmenorrhea shown in our study is thought to be due to both maintenance of amenorrhea and a direct effect on organic diseases, as mentioned above.

We also evaluated the differences between administration regimens. Extended and cyclic regimens of LEP have been investigated in many systematic reviews and clinical trials [17,18,19,20,21, 33], and international guidelines recommend the extended regimen. However, VAS was not significantly different between the extended and cyclic groups in this study. A meta-analysis by Damm et al. showed that LEP extended regimens reduced the duration of pain by 4 days compared with LEP cyclic regimens [19], although the difference in efficacy in reducing the severity of dysmenorrhea was unclear. Our study likely underestimated efficacy in the extended group because the included studies assessed the efficacy of DRSP/EE-extended at the end of a cycle, i.e., the time of painful withdrawal bleeding. To precisely evaluate the difference in efficacy in reducing the severity of dysmenorrhea between drugs with different administration regimens, appropriate evaluation methods should be used.

Zorbas et al. considered that, compared with a LEP cyclic regimen, a LEP extended regimen relieves pain by achieving amenorrhea [20]. Consequently, the progestin continuous group without a hormone-free interval could be expected to be more effective than the cyclic group. Therefore, in the present study, we also compared administration regimens by dividing the studies into continuous (progestin), extended (LEP-extended), and cyclic (LEP-cyclic) groups. As a result, we found slightly better improvement of the total dysmenorrhea score in the continuous and extended groups, and the continuous group showed greater improvement in VAS than the cyclic group did, regardless of the type of dysmenorrhea. One reason for the higher efficacy in the continuous (progestin) group than in the cyclic (LEP-cyclic) group is considered to be the amenorrhea that occurs with continuous progestin administration. On the other hand, differences in the presence or absence of estrogen and in progestin activity should also be considered. The degree to which the difference in regimen between the two drug classes contributed to efficacy is unknown.

We observed a difference in heterogeneity between comparisons evaluating the total dysmenorrhea score and those evaluating the VAS in both dysmenorrhea types and considered the measurement methodology of each score as a major cause of the difference [34]. The total dysmenorrhea score is a verbal rating scale that assesses the impact on daily life and the frequency of analgesic intake, and the degree of impact on daily life depends on health literacy [35] and social position and tends to show high variance among responders. In contrast, the VAS assesses the intensity of one’s own pain. As regards clinical heterogeneity, we concluded that it has limited impact on our conclusions because we performed separate analyses of different disease types, i.e., primary and secondary dysmenorrhea, and found similar clinical characteristics according to the distribution of basic demographics and baseline total dysmenorrhea scores that could cause clinical heterogeneity among the eligible studies.

We included only RCTs in our analyses to guarantee the quality of the studies; nevertheless, our study has some limitations. First, the assessment time point of each score was not always the same. In the present analysis, we mainly used the values reported in each trial, but the duration of a treatment course differed even within the LEP drugs with an extended regimen (LNG/EE-extended, 84-day cycle; DRSP/EE-extended, 124-day cycle). In addition, the pain-suppressing effect in a LEP extended regimen has been reported to appear soon after the initiation of treatment, whereas in a LEP cyclic regimen, it increases with continued administration and thereby decreases the frequency of analgesic use over time [26, 36]. Consequently, the efficacy of the LEP cyclic regimen was likely underestimated. Second, all of the eligible studies were performed by pharmaceutical companies to confirm the efficacy of their compounds. The risk of bias in each study was evaluated (Fig. S1 in the supplementary material), and no serious bias was identified in any of the studies. Although we did not investigate other elements of study design that are not evaluated by the risk of bias assessment, we considered their impact to be limited because all eligible studies were sponsored by companies. Third, publication bias could not be statistically evaluated because of the limited number of eligible studies, although we did find asymmetry in the funnel plots (Fig. S4 in the supplementary material). Fourth, the 95% CrIs were wide for the total dysmenorrhea score, which is considered to be due to the insufficient number of studies and patients in our analysis. If more studies are conducted in the future, analyses will be able to evaluate whether the drugs with higher mean differences in our study actually are more effective than others. Fifth, our meta-analyses synthesized only efficacy and did not consider the safety of each drug. Thus, additional investigations are required to evaluate the overall usefulness of the drugs in real-world clinical settings.

Conclusion

In this study, we conducted a systematic review, direct meta-analysis, and indirect network meta-analysis of oral hormonal therapies for the treatment of dysmenorrhea and found that both LEPs and progestins are effective in treating dysmenorrhea. In addition, progestin continuous regimens are suggested to be more effective than LEP cyclic regimens in relieving pain. LEP-extended and progestin continuous regimens could be superior to LEP-cyclic regimens in the improvement of total dysmenorrhea score, although no significant differences between regimens were shown in our analyses with a limited number of studies.