Introduction

Previous studies have reported that while premenopausal women have a lower risk of developing cardiovascular disease (CVD) than men, postmenopausal women have a higher risk1,2. It has been suggested that metabolic changes due to estrogen depletion after menopause lead to an increased CVD risk in postmenopausal women3,4,5. Experimental studies have identified several protective mechanisms of estrogen against CVD, which include increasing angiogenesis and vasodilation, and reducing fibrosis and oxidative stress6. Menopausal hormone therapy (MHT) was suggested to contribute to the reduction of CVD risk based on the hypothesis of cardioprotection by estrogen7,8. Many randomized controlled trials (RCTs) and observational studies have investigated the association between MHT and CVD risk; however, the results have been inconsistent. Early observational studies reported beneficial effects of MHT on CVD, whereas large RCTs, such as the Women’s Health Initiative (WHI) and the Heart and Estrogen/Progestin Replacement Study (HERS), did not7,8,9,10,11. However, major limitations of these RCTs are that the women were older, had initiated MHT late after menopause, and either had CVD risk factors at baseline or had a history of a CVD event10,11,12. Since the publication of the WHI report, several studies have reevaluated the risk profile of MHT13,14,15. Nevertheless, controversy remains regarding the CVD-related risks and benefits of MHT. Therefore, emphasis has been placed on the necessity of additional studies that evaluate the following factors: dose of estrogen, route of administration, timing after menopause, duration of use, other hormone effects, pre-existing pathology, and age3.

A Cochrane review conducted a meta-analysis of the association of MHT with specific CVD outcomes in RCTs. It included subgroup analyses based on the timing of MHT after menopause (timing hypothesis), but did not consider other confounding factors16. Another meta-analysis of RCTs examining the timing hypothesis for MHT and CVD risk supported the importance of the timing of initiation of MHT, and concluded that MHT may have beneficial effects on mortality and CVD events in younger menopausal women17. These previous meta-analyses of RCTs were limited to the timing hypothesis, and did not include other confounding factors that could affect the association between MHT and CVD. The most recent study was a systematic review of individual studies on the effects of timing, route of administration, duration, and dose of MHT on CVD. However, that review was based mainly on the findings of observational studies, and did not conduct pooled analyses of results owing to the diversity of the studies18. In addition, a previous meta-analysis of RCTs reported that inconsistent findings between the study designs may be due to differences in the characteristics of the study populations, methodologic limitations of observational studies, and lower event rates and shorter durations of treatment in RCTs19.

Therefore, it is necessary to examine the results of RCTs and observational studies in a consistent manner by conducting various subgroup analyses and by comparing the characteristics of the included study populations. The aim of this study was to assess the association between MHT and CVD outcomes, and to compare the results of RCTs and observational studies, through a systematic review and separate meta-analyses of each study design.

Methods

Search strategy and selection criteria

A literature search was conducted according to each study design (RCTs and observational studies) using the following search terms: (“cardiovascular diseases” OR “cerebrovascular disorder” OR “all-cause death” OR “cardiovascular death” OR “death” OR “mortality”) AND (“hormone replacement therapy” OR “hormone therapy” OR “menopausal hormone therapy” OR “postmenopause”). Detailed search terms can be found in Supplementary File 1. The PubMed and EMBASE databases were searched to identify relevant articles published between January 2000 and December 2019. In the Cochrane review16, studies published before 2000 were assessed as having a higher risk of bias than those published after 2000. Thus, the current study included only studies published from 2000 onward.

The study selection criteria for the RCTs were based on the Cochrane review by Boardman et al.16. The same criteria were applied to the observational studies. After removing duplicates, exclusion criteria were applied separately to the remaining RCTs and observational studies. Only original articles of human studies published in English were included. Studies were excluded if they did not report relevant exposures or outcomes, included an ineligible population (premenopausal women or cancer survivors) or a duplicated study population, or lacked data eligible for the meta-analysis. RCTs with a follow-up duration of less than 6 months and studies with an ineligible design (cross-sectional design) were also excluded. Reference lists of the relevant studies were manually screened to identify additional articles for our meta-analysis.

Data extraction and quality assessment

Two authors (JHC and MJJ) participated in study selection and data extraction. Two other authors (JEK and JYC) checked and reviewed the data in two steps. We extracted the data as follows: (1) characteristics of the included studies and populations, including author, year of publication, study design, follow-up duration, sample size, ethnicity, age at baseline, and underlying diseases; (2) exposure, including initiation of MHT after menopause, regimen type [estrogen only (E only) or combined estrogen-progesterone (combined EP)], route of administration, duration, and recency of MHT; (3) outcomes, including all-cause death, cardiovascular death, stroke, venous thromboembolism (VTE), pulmonary embolism (PE), myocardial infarction (MI), coronary heart disease (CHD), angina, and revascularization; and (4) the effect estimates of the association between MHT and outcomes, such as the hazard ratio, relative risk, or odds ratio with its 95% confidence interval (CI), the numbers of women exposed and not exposed to MHT, and the number of events in each group. Multivariable-adjusted estimates were primarily extracted to reduce any confounding effects. If a study did not report the required estimates, combined estimates were calculated from the original estimates or from the numbers of exposed/non-exposed women and events, which were extracted as reported for the meta-analysis. Supplementary Tables S1–S4 provide details about the RCTs and observational studies included in the meta-analysis.
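As an illustration of how an effect estimate can be derived when a study reports only counts, the following minimal sketch computes a relative risk and its 95% CI from the numbers of exposed/non-exposed women and events. The function name, the normal approximation on the log scale, and the example counts are assumptions made for illustration; the text does not specify the formula that was actually used.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Return (RR, lower, upper) using a normal approximation on the log scale.

    Illustrative helper only; the counts passed in below are hypothetical.
    """
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 events among 1000 MHT users, 20 among 1000 non-users.
rr, lower, upper = relative_risk(30, 1000, 20, 1000)
print(f"RR = {rr:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

Because the interval is symmetric on the log scale, the product of the two limits equals the square of the point estimate, which is a quick consistency check when estimates are transcribed from published tables.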

When more than one report was available from the same trial or study, a representative report was selected, prioritizing the following selection criteria: (1) longest follow-up duration, (2) largest number of outcomes, or (3) largest number of participants. Detailed information on the selected representative RCTs and observational studies is presented in Supplementary Tables S5 and S6, respectively.

Quality assessment was conducted using the Jadad scale for RCTs, and the Newcastle–Ottawa Scale (NOS) for observational studies20,21. The Jadad scale consists of three domains: randomization (0–2 points), blinding (0–2 points), and an account of all patients (0–1 point). We classified the quality of RCTs as good (4–5 points), fair (3 points), or poor (0–2 points). The NOS is based on three domains: selection (0–4 stars), comparability (0–2 stars), and outcome for cohort and nested case–control studies or exposure for case–control studies (0–3 stars). The NOS, a star system, was converted to the Agency for Healthcare Research and Quality standards. The thresholds for assessing quality were as follows: (1) good: selection (3 or 4 stars) AND comparability (1 or 2 stars) AND outcome/exposure (2 or 3 stars); (2) fair: selection (2 stars) AND comparability (1 or 2 stars) AND outcome/exposure (2 or 3 stars); and (3) poor: selection (0 or 1 star) OR comparability (0 star) OR outcome/exposure (0 or 1 star).
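The classification thresholds above can be written as two small functions. This is only a sketch of the stated rules; the function and parameter names are hypothetical and are not taken from the study.

```python
def jadad_quality(randomization, blinding, withdrawals):
    """Classify an RCT from its Jadad domain scores (0-2, 0-2, 0-1)."""
    if not (0 <= randomization <= 2 and 0 <= blinding <= 2 and 0 <= withdrawals <= 1):
        raise ValueError("Jadad domain score out of range")
    total = randomization + blinding + withdrawals
    if total >= 4:
        return "good"
    if total == 3:
        return "fair"
    return "poor"


def nos_quality(selection, comparability, outcome_or_exposure):
    """Classify an observational study from its NOS stars (0-4, 0-2, 0-3)."""
    if not (0 <= selection <= 4 and 0 <= comparability <= 2
            and 0 <= outcome_or_exposure <= 3):
        raise ValueError("NOS star count out of range")
    # Poor: any single domain falls below its minimum.
    if selection <= 1 or comparability == 0 or outcome_or_exposure <= 1:
        return "poor"
    # Remaining studies have comparability >= 1 and outcome/exposure >= 2,
    # so the selection domain alone separates good (3-4) from fair (2).
    return "good" if selection >= 3 else "fair"


print(jadad_quality(2, 2, 1))  # a fully reported, double-blinded trial
print(nos_quality(2, 1, 2))    # a study meeting only the minimum thresholds
```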

Statistical analysis

The association between MHT and each CVD outcome was evaluated using summary estimates (SE) and corresponding 95% CIs. Heterogeneity among the included studies was assessed by the I² index and the Q statistic. We employed a fixed-effects model if I² was < 30% and the P-value of the Q statistic was > 0.05. Otherwise, a random-effects model was used. In both RCTs and observational studies, we conducted subgroup analyses by regimen type (E only and combined EP), duration of MHT (< 5 and ≥ 5 years), timing of initiation of MHT (early: age < 60 years or initiation within 10 years since menopause; late: age ≥ 60 years or initiation after 10 years since menopause), and underlying diseases (with or without). Additional subgroup analyses of observational studies were conducted by route of administration (oral and non-oral), study design (cohort, nested case–control, and case–control), recency of MHT (past and current), and study quality (good/fair and poor). We defined the timing of initiation of MHT based on the criteria of the timing hypothesis presented in the previous RCT meta-analyses conducted by Boardman et al.16 and Nudy et al.17. Subgroup analyses were performed when the number of studies was adequate. We evaluated the publication bias of included studies using the symmetry of funnel plots and Egger’s test. Statistical analysis was performed using the “meta” package in R version 3.4.1 (R Foundation for Statistical Computing).
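The model-selection rule described above can be illustrated with a minimal inverse-variance sketch. It is not the code used in the study, which relied on the R “meta” package. The DerSimonian–Laird estimator of between-study variance, the recovery of standard errors from reported 95% CIs, and all names and input values below are assumptions made for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2
    # Start from the closed form for df = 2 or df = 1, then apply the
    # recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q


def pool(studies, z=1.96):
    """Pool (ratio, ci_lower, ci_upper) tuples on the log scale."""
    if len(studies) < 2:
        raise ValueError("at least two studies are required")
    y = [math.log(r) for r, lo, hi in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for r, lo, hi in studies]
    w = [1 / s ** 2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    p_q = chi2_sf(q, df)
    if i2 < 30 and p_q > 0.05:
        model, est, var = "fixed", fixed, 1 / sw
    else:
        # DerSimonian-Laird between-study variance (an assumed choice).
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - df) / c)
        wr = [1 / (s ** 2 + tau2) for s in se]
        est = sum(wi * yi for wi, yi in zip(wr, y)) / sum(wr)
        model, var = "random", 1 / sum(wr)
    half = z * math.sqrt(var)
    return {
        "model": model,
        "estimate": math.exp(est),
        "ci": (math.exp(est - half), math.exp(est + half)),
        "i2": i2,
        "q": q,
        "p_q": p_q,
    }


# Hypothetical study-level estimates, not values from the included studies.
print(pool([(1.10, 0.90, 1.34), (1.25, 1.00, 1.56), (0.95, 0.70, 1.29)]))
```

Note that "SE" in this article denotes the summary estimate, whereas `se` in the sketch is the standard error of each log estimate.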

Results

Eligible studies and characteristics

A total of 26 RCT studies (20 trials) and 47 observational studies were included in the final meta-analysis (Figs. 1, 2). We compared the characteristics of the included studies (Table 1). Most of the RCTs and observational studies were conducted in Europe or North America, and only a few studies were conducted in Asia. Eighteen RCTs (69.2%) were published before 2006, the year in which the WHI findings were being actively reevaluated. Twenty-nine of the observational studies (61.7%) were published after 2006. Study populations included in the RCTs were older than those included in the observational studies (median age, 63.6 vs. 60.6 years, respectively), and had more underlying diseases at baseline; subjects in the observational studies were relatively healthy. The route of MHT administration was oral in most RCTs, whereas some non-oral routes, such as transdermal or vaginal, were used in the observational studies. The median follow-up duration of the RCTs was shorter than that of the observational studies (3.4 vs. 6.8 years). The observational studies included 30 cohort studies, 5 nested case–control studies, and 13 case–control studies.

Figure 1 PRISMA flow diagram for study selection of the randomized controlled trials. From: Moher et al.48.

Figure 2 PRISMA flow diagram for study selection of the observational studies. From: Moher et al.48.

Table 1 Overview of the characteristics of the included studies.

Quality assessment

Most of the RCTs were classified as good or fair quality studies according to the Jadad scale. Among the 20 trials, 15 were good quality and 5 were fair quality studies (Supplementary Table S7). The RCTs were assessed as fair quality studies for the following reasons: (1) incomplete blinding that affected the results or (2) allocation based on laboratory tests that could have increased selection bias.

Results of the quality assessment of cohort, nested case–control, and case–control studies by the NOS can be found in Supplementary Tables S8–S10. Among the 30 cohort studies, 25 and 5 studies were assessed as good/fair and poor quality studies, respectively. All 5 nested case–control studies were assessed as good quality. Studies with a cohort design were assessed as fair quality studies for the following reasons: (1) nurses or teachers included in the study population, increasing the risk of selection bias, or (2) MHT ascertained using a self-reported questionnaire. Studies with a cohort design were assessed as poor quality studies for the following reasons: (1) no control for confounding factors, such as age, disease history, and other lifestyle factors; (2) outcomes ascertained using a self-reported questionnaire; or (3) follow-up duration insufficient or follow-up rate not reported. Among the 13 case–control studies, 7 studies were assessed as good quality, and 6 studies were assessed as poor quality. Studies with a case–control design were assessed as poor quality studies for the following reasons: (1) no control for confounding factors, such as age, disease history, and other lifestyle factors; (2) MHT ascertained via interview and interviewer not blinded to case/control status; or (3) response rates differed between cases and controls.

Meta-analysis of RCTs and observational studies

All-cause death and cardiovascular death

MHT was not associated with all-cause death (SE: 1.00, 95% CI: 0.96–1.04 in RCTs; SE: 0.90, 95% CI: 0.79–1.02 in observational studies) or cardiovascular death (SE: 0.96, 95% CI: 0.83–1.12 in RCTs; SE: 0.81, 95% CI: 0.61–1.07 in observational studies) in the pooled analysis of either RCTs or observational studies (Table 2). Subgroup analyses of RCTs did not identify any association between MHT and death (Table 3). In subgroup analyses of the observational studies, a decreased risk of all-cause death was observed among E only users (SE: 0.85, 95% CI: 0.77–0.95) and early users after menopause (SE: 0.68, 95% CI: 0.51–0.92; Table 4).

Table 2 Meta-analysis of randomized controlled trials and observational studies for menopausal hormonal therapy (MHT) and cardiovascular disease (CVD) outcomes.
Table 3 Subgroup analyses for menopausal hormonal therapy (MHT) and cardiovascular disease (CVD) outcomes in randomized controlled trials.
Table 4 Subgroup analyses for MHT and CVD outcomes in observational studies.

Stroke

MHT was associated with an increased risk of stroke in the pooled analysis of RCTs (SE: 1.14, 95% CI: 1.04–1.25), although this was not observed in the pooled analysis of observational studies (SE: 0.98, 95% CI: 0.85–1.13; Table 2). In the subgroup analyses of RCTs, an increased risk of stroke was observed in combined EP users (SE: 1.14, 95% CI: 1.01–1.29), users with an MHT duration ≥ 5 years (SE: 1.13, 95% CI: 1.03–1.25), late users after menopause (SE: 1.17, 95% CI: 1.01–1.37), and women with an underlying disease at baseline (SE: 1.14, 95% CI: 1.04–1.26; Table 3). In the subgroup analyses of the observational studies, an increased risk of stroke was observed in women administered oral MHT (SE: 1.24, 95% CI: 1.11–1.39), whereas a decreased risk of stroke was observed in women administered non-oral MHT (SE: 0.86, 95% CI: 0.77–0.96). There was no difference in risk by duration of MHT (SE: 1.11, 95% CI: 1.04–1.18 for < 5 years duration; SE: 1.22, 95% CI: 1.16–1.29 for ≥ 5 years duration; Table 4).

Venous thromboembolism

MHT was associated with an increased risk of VTE in the pooled results of both RCTs (SE: 1.70, 95% CI: 1.33–2.16) and observational studies (SE: 1.32, 95% CI: 1.13–1.54; Table 2). An increased risk of VTE was observed in combined EP users in both RCTs (SE: 2.28, 95% CI: 1.64–3.18; Table 3) and observational studies (SE: 2.21, 95% CI: 1.51–3.22; Table 4). In the RCTs, this increased risk was also observed in late users after menopause (SE: 1.79, 95% CI: 1.39–2.29) and in women with an underlying disease (SE: 1.67, 95% CI: 1.29–2.17; Table 3). It was not possible to evaluate the effects of underlying diseases on risk estimates in the observational studies, because women included in the observational studies were relatively healthy. Unlike in the RCTs, an increased risk of VTE was observed in the observational studies among early users after menopause (SE: 1.55, 95% CI: 1.26–1.92) and among women administered oral MHT (SE: 1.41, 95% CI: 1.19–1.67; Table 4). There was no difference in risk by duration of MHT (SE: 1.93, 95% CI: 1.10–3.36 for < 5 years duration; SE: 1.65, 95% CI: 1.26–2.15 for ≥ 5 years duration) in the RCTs, although an increased risk was observed with MHT use for < 5 years in the observational studies (SE: 1.23, 95% CI: 1.02–1.47; Tables 3, 4). In the observational studies, an increased risk of VTE was observed regardless of study quality (SE: 1.28, 95% CI: 1.08–1.51 in good and fair quality studies; SE: 1.60, 95% CI: 1.15–2.22 in poor quality studies; Table 4).

Pulmonary embolism

MHT was associated with an increased risk of PE in the pooled results of both RCTs (SE: 1.26, 95% CI: 1.06–1.50) and observational studies (SE: 1.44, 95% CI: 1.17–1.76; Table 2). In the subgroup analyses of RCTs, an increased risk of PE was observed in users with an MHT duration ≥ 5 years (SE: 1.25, 95% CI: 1.05–1.48), late users (SE: 1.88, 95% CI: 1.28–2.78), and women with an underlying disease at baseline (SE: 1.24, 95% CI: 1.05–1.48; Table 3).

Myocardial infarction and other outcomes

MHT was not associated with MI in the pooled results of RCTs (SE: 1.04, 95% CI: 0.94–1.14), whereas a decreased risk of MI was observed in the pooled results of observational studies (SE: 0.79, 95% CI: 0.75–0.84; Table 2). Subgroup analyses of RCTs did not reveal any association between MHT and MI (Table 3), whereas those of observational studies revealed a decreased risk in users with an MHT duration ≥ 5 years (SE: 0.51, 95% CI: 0.34–0.76) and in users of a non-oral route of MHT administration (SE: 0.75, 95% CI: 0.60–0.93). A decreased risk of MI was observed regardless of regimen type, timing of initiation, underlying diseases, study design, and quality of observational studies (Table 4).

The pooled results from both RCTs and observational studies did not reveal any association between MHT and CHD (SE: 1.02, 95% CI: 0.94–1.10 in RCTs; SE: 0.91, 95% CI: 0.72–1.15 in observational studies; Table 2). In the pooled results of the RCTs, there was also no association between MHT and revascularization (SE: 0.96, 95% CI: 0.87–1.06), or angina (SE: 0.95, 95% CI: 0.84–1.08; Table 2).

The forest plots for all analyses can be found in Supplementary Figures S1–S4.

Publication bias

There was no evidence of publication bias for all-cause death, cardiovascular death, stroke, VTE, PE, MI, CHD, angina, or revascularization in the RCTs or the observational studies (Egger’s P-value > 0.05). The funnel plots and Egger’s P-values calculated for the assessment of publication bias are included in Supplementary Figures S1 and S2.
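For reference, Egger’s test regresses each study’s standardized effect (log estimate divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero indicates funnel plot asymmetry. The sketch below is a plain least-squares illustration under that definition, with hypothetical names and input values; it is not the routine used in the study and it stops at the t statistic rather than computing a P-value.

```python
import math


def egger_regression(log_effects, standard_errors):
    """Return (intercept, intercept_se, t_statistic) for Egger's regression."""
    n = len(log_effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / s for s in standard_errors]                         # precision
    y = [e / s for e, s in zip(log_effects, standard_errors)]      # standardized effect
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = rss / (n - 2)
    intercept_se = math.sqrt(residual_variance * (1 / n + x_mean ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, intercept_se, t_stat


# Hypothetical log estimates and standard errors for five studies.
effects = [0.10, 0.25, 0.05, 0.40, 0.18]
errors = [0.08, 0.15, 0.10, 0.30, 0.12]
intercept, intercept_se, t_stat = egger_regression(effects, errors)
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f} on {len(effects) - 2} df")
```

The t statistic would be compared with a t distribution on n − 2 degrees of freedom to obtain the P-value reported for each outcome.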

Discussion

Summary of findings

RCTs had a shorter follow-up duration than did observational studies, and the study populations in the RCTs were older, initiated MHT late after menopause, and had more underlying diseases than those in the observational studies. RCTs and observational studies both showed that MHT was associated with an increased risk of VTE and PE, although only the RCTs revealed an increased risk of stroke among those administered MHT. A decreased risk of MI by MHT was identified in the observational studies, but the RCTs did not show this association. Although still unexplained in the current literature, differential clinical effects according to regimen type, timing of initiation, underlying disease, and route of administration were identified in subgroup analyses.

Comparison of findings with previous systematic reviews and a meta-analysis

Our meta-analysis of RCTs was based on the Cochrane review16 published in 2015. Among the included RCTs, 13 trials overlapped with those included in the Cochrane review12,22,23,24,25,26,27,28,29,30,31,32,33. Four other trials (ESPRIT, HERS, WHI I, and WHI II)13,14,34,35,36 were included according to our inclusion criteria. Three trials (EMS, KEEPS, and PHASE)37,38,39 were newly identified in this study. Two other trials40,41 with a higher risk of bias than other studies, and one trial27 assessing recurrent VTE as the outcome, were excluded.

Consistent with the Cochrane review16, our pooled results from the RCTs showed an increased risk of stroke, VTE, and PE among MHT users. However, the effect sizes in the current study were smaller than those in the Cochrane review. We prioritized multivariable-adjusted estimates for the meta-analysis, which may have attenuated the effects of confounding factors. Nudy et al. conducted another RCT-based meta-analysis to assess the timing hypothesis17. They reported that younger MHT users had a decreased risk of all-cause death and cardiac events (a composite of cardiac mortality and non-fatal MI), whereas the risk of a composite of stroke, transient ischemic attack (TIA), and systemic embolism increased as age increased. Unlike the Cochrane review16 and our meta-analysis, they combined stroke, TIA, and systemic embolism into a single outcome. Thus, their results cannot be directly compared with those from this study.

Two previous RCT-based meta-analyses16,17 did not assess the effects of confounding factors owing to insufficient information, but a more recent systematic review18 reported the effects of the timing of initiation, route of administration, duration, and dose on CVD risk. That review reported that a low dose of oral MHT and transdermal MHT may have beneficial effects on CVD, including stroke and VTE. However, it did not report a synthesis of results, and most of the results were derived from observational studies. It covered 33 studies, comprising 6 RCTs and 27 observational studies; thus, it is difficult to compare the results between RCTs and observational studies. As another limitation, its authors reported that most of the included studies had a low or moderate evidence level based on quality assessment. We conducted subgroup analysis according to the quality of the observational studies. Although we investigated the effect of MHT on CVD through a meta-analysis in a manner similar to older, more conventional studies, our study is comparable to the most recent review.

Comparison of findings between RCTs and observational studies

Our pooled analyses of RCTs and observational studies identified consistent findings with respect to thrombotic events, and inconsistent findings regarding stroke and MI. However, differential associations were observed in the subgroup analyses. In our meta-analysis, timing of initiation and underlying diseases at baseline were likely to affect CVD outcomes. In most cases, late users and women who had an underlying disease at baseline had an increased risk of CVD outcomes, whereas early users and relatively healthy women had a decreased risk in both RCTs and observational studies. We found that the route of MHT administration was a possible factor for differential associations with CVD outcomes. Oral MHT was related to an increased risk of thrombotic events and stroke, whereas transdermal and vaginal MHT were comparatively safer than oral MHT in a review of observational studies18; however, this information was not available for RCTs16,17. Our subgroup analyses of observational studies according to route of administration supported findings from the most recent review18,42. The recent Nurses’ Health Study and the WHI Observational Study have also suggested the safety of low-dose vaginal estrogen with respect to the risk of CVD and cancer43,44.

Strengths and limitations

In the current study, we compared the characteristics of RCTs and observational studies, and identified possible reasons for inconsistent findings through various subgroup analyses. However, it is necessary to be cautious when interpreting our findings owing to some limitations. First, most of the study subjects were Europeans or North Americans. Thus, it was difficult to identify ethnic differences, for example, in the prevalence of CVD and age at natural menopause45,46. Second, some observational studies defined MHT through a self-reported questionnaire, and the included RCTs used different treatment regimens. Our subgroup analysis only considered E only and combined EP regimen types, and therefore, we were unable to assess the effect of more detailed regimens. Although the heterogeneity of the observational studies was higher than that of the RCTs, it was slightly attenuated in the subgroup analyses. Third, methodologic limitations for the control of confounding effects may remain. However, because we prioritized multivariable-adjusted estimates of the associations, some of the confounders may have been controlled in this study. Although atrial fibrillation (AF) is known to be strongly associated with thrombotic events, most of the included studies in our meta-analysis did not take this into account47. Therefore, the potential strong association between AF and thrombotic events may have contributed to the consistent findings of increased risk of thrombotic events in both RCTs and observational studies. Further studies are necessary to evaluate well-known risk factors, such as AF, to identify the association between MHT and risk of thrombotic events. Finally, the largest WHI trials may have contributed to the RCT findings. When we performed sensitivity analyses after excluding the WHI trials, the results did not change except for stroke, which suggested that the increased risk of stroke in the RCTs may be overestimated, or may reflect the characteristics of women who received MHT. Nevertheless, to the best of our knowledge, our study is the first meta-analysis of both RCTs and observational studies that takes into account as many factors as possible, unlike previous meta-analyses. The meta-analysis of observational studies can also be comparable to that of RCTs, as the observational studies often have longer follow-ups and are conducted in a more real-world setting. Although the included observational studies were rated ‘good’ or ‘fair’ in the quality assessment, healthy user bias may remain, and therefore, the results should be interpreted with caution.

Conclusion

Our findings support the idea that the risks and benefits of MHT are likely to depend on the characteristics of the women who are treated. MHT is still not recommended for the prevention of chronic diseases; however, it may have beneficial effects with respect to CVD and mortality in postmenopausal women with severe menopausal symptoms, after sufficient consideration of underlying diseases and timing of treatment initiation. Moreover, for women at high risk of VTE and stroke, non-oral rather than oral types of MHT may be suggested for menopausal symptoms. Further studies to investigate the influence of ethnicity or specific MHT types are required.