Introduction

Menopause marks a major event for women and occurs twelve months after the last menstrual period [1]. Importantly, it is preceded by the perimenopause, which extends from the early through the late menopausal transition to early postmenopause and lasts up to four years [2]. During perimenopause, women experience hormonal fluctuations associated with vasomotor symptoms (VMS) such as hot flushes, sweating, sleep disturbances, irritability, anxiety, and depression [1, 3, 4]. According to the STRAW criteria, VMS are “likely” and “most likely” during perimenopausal stages −1 and +1a respectively [1, 2, 5, 6, 7, 8]. This contrasts with the postmenopausal period, when low endocrine activity is associated with genitourinary syndrome of menopause (GSM), characterised by vaginal dryness, atrophic vaginitis, vulvovaginal pain, pruritus, pain during sexual intercourse and urinary problems [1, 5, 6, 7, 8, 9, 10].

Currently there are almost 168 million women in China aged 45 to 59 years. Considering the mean natural age of menopause (49 years) and the four-year median duration of perimenopause, this translates into a substantial number of women eligible for healthcare services to alleviate menopausal symptoms [1, 4, 11, 12]. Few studies have investigated the burden of symptoms on perimenopausal women in China, and their results are contradictory. For example, Sun et al. reported that symptoms were more severe in postmenopausal women, whereas Zhang et al. reported that VMS were more prevalent in perimenopausal women [7, 13].

This is coupled with a lack of evidence on the impact of symptoms on health-related quality of life (HRQoL), particularly in perimenopausal women. Measuring HRQoL is key to informing cost utility models (CUM), which are playing an increasingly important role in decision making in China [14]. The current pharmacoeconomic guideline recommends CUM as the preferred method to evaluate the impact of competing technologies [14]. The key requirement for a CUM is HRQoL, specifically health state utility values (HSUV) measured by the preferred, validated and widely used five-level EQ5D (EQ5D5L) questionnaire [14]. The EQ5D5L is an expansion of the EQ5D3L, and a Chinese version with a value set for China is available [15].

Currently only one study reports HSUV in China; it was performed a decade ago and is unlikely to represent the current situation [16]. Liu et al. examined the relationship between menopause and HSUV in premenopausal versus postmenopausal Chinese women in rural Fangshan, Beijing, China [16]. The cross-sectional study measured utility in a cohort of 1351 women, of whom 656 were premenopausal, 133 were menopausal and 562 were postmenopausal [16]. Notably, HSUV in the perimenopausal group is not reported. The absence of HRQoL data for menopausal women more broadly outside of China is highlighted in a review by Valentzis et al. (2017), which reported that out of five CUM [17, 18, 19, 20, 21], three [18, 19, 21] used the same HSUV data from a prior study by Zethraeus [22, 23]. That study measured HRQoL in 104 women almost three decades ago and showed that Menopausal Hormone Therapy improved HRQoL associated with mild symptoms (0.18 to 0.26) and severe symptoms (0.42 to 0.50) using time trade-off and rating scale respectively [23]. The fourth CUM by Ylikangas (2007) used data from a clinical trial in Finland using the 15D instrument in a sample of 210 and 58 respondents at 6 and 9 years respectively [20, 22, 24]. The fifth CUM by Salpeter (2009) used utility multipliers of 1.0 for older cohorts and 1.07–1.21 for younger cohorts. The multipliers were derived from a range of sources and the methods used to synthesise the data are not described [17]. This lack of recent health state utility data contributes to uncertainty in CUM.

In the absence of published HSUV, a health economist developing a CUM is faced with the option of using suboptimal secondary evidence, making informed model assumptions, or performing primary research [25]. In the case of the latter, timelines and resources may render the face-to-face collection of EQ5D5L data impractical, especially in a large country like China. Research using digital technology offers a promising alternative to time-consuming, resource-intensive paper-based research [26]. The role of smartphones in the collection of research data is constantly increasing and a range of studies have shown the advantages of using smartphones for data collection [27, 28, 29, 30, 31]. Importantly, results are comparable whether collected by paper-based version or smartphone, with higher response rates for smartphone versus paper-based versions [32].

In view of the above, the objective of this research was to measure symptoms and health state utility values in a cross-sectional cohort of menopausal-aged women in the general population of China, with the purpose of parameterising a cost utility model and informing healthcare services in China. We hypothesised that women in the perimenopausal age group experiencing symptoms would experience the greatest negative impact on HRQoL.

Method

The study was approved by Griffith Ethics (GU Ref No: 2020/389). A commercial license was purchased from EuroQoL and CAHE (Centre for Applied Health Economics, Griffith University) was specified as third-party user. Screening questions included age, gender, uterine status (intact or absent), date of last menses, regularity of menses, symptoms and HRQoL. The screening questions and EQ5D5L were scripted and coded into the survey platform and piloted internally (n = 20) to test the survey on the digital platform, check logic and ensure all questions were comprehensible with no errors in survey flow. The screening questions and EQ5D5L were programmed in personal digital assistant (PDA) format for smartphone and underwent two rounds of validation via EuroQoL. A contract research organisation administered the survey via panel data. No formal recruitment was undertaken; the panels were activated and around 30,000 females in the 45+ age group were invited to the survey link via SMS or WeChat. All participants were provided with a Participant Information Sheet approved by the Griffith ethics committee, which described the study and informed participants that their information would be used for research and publication and that their responses would remain anonymous. All participants providing informed consent proceeded to the screening questions. Chinese females aged 45 years and over were eligible to participate. All panel respondents were asked the screening questions first and, if eligible, went on to answer the EQ5D5L. A general population cohort accruing the first 2000 females over 45 years was used. Respondents received 1.4 USD each via the digital wallets of WeChat or Alipay. Three stages of screening occurred: completing the questions, passing the screening questions and meeting the quality control criteria. We took multiple steps to ensure the highest quality of data.
Initially we checked cookies, recorded and blocked repeat internet protocol addresses, and checked digital fingerprints (we blocked any participant who had completed the survey in the preceding 48 h). We added duplicate questions as checks; for example, menstrual regularity was asked as both the first and the last question, and respondents with inconsistent answers were disqualified. We also added “trick questions” (5 + 2=, which is a fruit: apple, pear, banana, fish); respondents with incorrect answers were excluded. Finally, we checked the consistency of the answers, for example checking that age corresponded with birth date. Data validation and cleaning were undertaken. No personal information (name, address) was collected and respondents remained anonymous. The research team accessed the back-end of the survey platform and downloaded the completed survey results via Excel. The data were stored on a secured Chinese server and disposed of at the end of the study (publication of results). EQ5D5L raw scores were converted to utility tariffs using Chinese value sets reported by Luo et al. [15].
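The response-level checks described above can be illustrated with a short sketch. It is not the survey platform's actual logic: the field names (`ip`, `regularity_first`, `regularity_last`, `sum_answer`, `birth_date`, `age`) are hypothetical, only the arithmetic trick question is modelled, and the requirement that stated age exactly match the age derived from birth date is an assumption.

```python
from datetime import date


def age_on(birth_date: date, on_date: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (on_date.month, on_date.day) >= (birth_date.month, birth_date.day)
    return on_date.year - birth_date.year - (0 if had_birthday else 1)


def passes_quality_checks(response: dict, survey_date: date) -> bool:
    """Apply three illustrative checks to one response (hypothetical field names)."""
    # Duplicate question: regularity is asked first and last; answers must agree.
    if response["regularity_first"] != response["regularity_last"]:
        return False
    # Trick question: "5 + 2 =" must be answered correctly.
    if response["sum_answer"] != 7:
        return False
    # Consistency: stated age must match the age implied by the birth date
    # (exact match is an assumption made for this sketch).
    return age_on(response["birth_date"], survey_date) == response["age"]


def drop_repeat_ips(responses: list) -> list:
    """Keep only the first response seen from each IP address."""
    seen = set()
    kept = []
    for response in responses:
        if response["ip"] in seen:
            continue
        seen.add(response["ip"])
        kept.append(response)
    return kept


# Hypothetical example response, for illustration only.
example = {
    "ip": "203.0.113.5",
    "regularity_first": "irregular",
    "regularity_last": "irregular",
    "sum_answer": 7,
    "birth_date": date(1970, 6, 15),
    "age": 50,
}
example_date = date(2020, 12, 1)
print(passes_quality_checks(example, example_date))
```

Cookie and digital fingerprint checks depend on the survey platform and are not modelled here.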

Respondents were classified in accordance with the STRAW criteria and the four-year median duration of perimenopause reported by Delamater et al. as follows: respondents with uterus, regular menstruation and reporting last menstrual period up to the study date (December 2020) were classified as premenopausal (reproductive stage); respondents reporting irregular menstruation at the time of the study and up to four years prior to the study were classified as perimenopausal; and women with uterus reporting the date of last menstrual period ≥ four years prior to the study were classified as postmenopausal (see appendix) [2, 3]. Sample size calculations were performed using the G*Power 3.1 statistical software assuming a significance level of 0.05 and a two-tailed test [33]. For dichotomous outcome measures, for example the presence of symptoms, the cohort size of 2000 had more than 98% power to detect a small effect size of 0.1 between three groups of women (e.g. pre-, peri-, and postmenopausal) using a chi-square test [34]. For continuous outcome measures, for example utility, the cohort size of 450 in a subgroup analysis (n1 = 70, n2 = 380) had more than 95% power to detect a medium effect size of 0.5 between two groups of women (e.g. with and without symptoms) using an independent-sample t-test [34]. Statistical analyses were performed using IBM SPSS version 27 (IBM, Armonk, NY, USA). Descriptive results were expressed as counts and percentages for categorical variables and as mean and standard deviation for continuous variables. If the chi-square test was significant, further multiple comparisons using z-tests were performed to test differences in proportions between each menopausal status and the perimenopausal group, with adjustment by the Bonferroni method. An independent-sample t-test was used to compare the utility scores between women with and without symptoms, separately for the pre-, peri-, and postmenopausal groups.
An ANOVA was performed to compare the utility scores between women in three different groups (e.g., between the pre-, peri-, and postmenopausal groups). If ANOVA was found significant, further multiple comparisons were performed to test differences among the groups with adjustment by the Tukey method. Results were considered statistically significant when p < 0.05. All methods were carried out in accordance with appropriate guidelines and regulations and are reported in accordance with the STROBE recommendations for reporting of cross-sectional studies [35].
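The pairwise comparison of proportions with Bonferroni adjustment described in this section can be sketched as follows. This is a minimal illustration using a pooled two-proportion z-test built on the Python standard library; it is not the SPSS procedure used in the study, and the counts below are hypothetical rather than study data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def bonferroni(p_values: list) -> list:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts of women reporting a symptom (not study data):
# reference group versus two comparison groups.
reference = (180, 200)
comparisons = {"group A": (150, 200), "group B": (170, 200)}

raw = {name: two_proportion_z_test(*reference, *counts)
       for name, counts in comparisons.items()}
adjusted = bonferroni([p for _, p in raw.values()])

for (name, (z, p)), p_adj in zip(raw.items(), adjusted):
    print(f"reference vs {name}: z = {z:.2f}, p = {p:.4f}, adjusted p = {p_adj:.4f}")
```

Each comparison is made against a single reference group, mirroring the design in which the pre- and postmenopausal groups are each compared with the perimenopausal group, so the number of comparisons used for the adjustment is two.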

Results

Data were collected between 01 October 2020 and 25 December 2020. A total of 3001 respondents tapped on the digital survey, of whom 466 were excluded due to incomplete screening questions. Of the 2535 (84%) who completed the screening questions, 431 (14%) were excluded because they failed the cookie check or digital fingerprint check, or were blocked due to a repeat internet protocol address. Of the remaining 2104 (70%), 104 (3%) were excluded as they failed the trick questions or provided inconsistent answers to the duplicate questions. A final cohort of 2000 (67%) completed all screening and EQ5D5L questions to the required quality and were included in the analysis, as shown in Fig. 1.
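The participant flow reported above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the function name and the stage labels are illustrative, and percentages are expressed relative to the 3001 respondents who started the survey.

```python
def participant_flow(started: int, exclusions: list) -> list:
    """Return (reason, excluded, remaining, percent_remaining) for each stage."""
    rows = []
    remaining = started
    for reason, excluded in exclusions:
        remaining -= excluded
        rows.append((reason, excluded, remaining, round(100 * remaining / started)))
    return rows


flow = participant_flow(
    3001,
    [
        ("incomplete screening questions", 466),
        ("failed cookie, fingerprint or repeat IP checks", 431),
        ("failed trick or duplicate questions", 104),
    ],
)

for reason, excluded, remaining, percent in flow:
    print(f"excluded {excluded} ({reason}); remaining {remaining} ({percent}%)")
```

The remaining counts after each stage are 2535, 2104 and 2000, matching the 84%, 70% and 67% reported.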

Fig. 1 Flow diagram of included cohort

The mean age of the cohort was 49 years (range 45–73), with a mean age of 47 years for premenopausal, 49 years for perimenopausal and 53 years for postmenopausal women. There was a significant difference in age across the groups, in keeping with the phases of menopause; therefore it was not appropriate to adjust for age in the analysis of HSUV among the three groups. Respondents were from the East (32%), South-Central (25%), North (26%), Southwest (7%), Northeast (8%) and Northwest (2%) regions. Most respondents had an intact uterus (89%). Out of a total cohort of 2000 women, 732 (37%) were classified as premenopausal, 798 (40%) as perimenopausal and 470 (23%) as postmenopausal. There were statistically significant differences between the groups for uterine status, EQ5D utility and VAS. Overall, the majority (61%) of women reported being mildly to extremely affected by anxiety/depression and a high proportion (52%) were mildly to extremely affected by pain and discomfort. Fewer women reported an effect on mobility (19%), usual activities (18%) and self-care (10%). Comparing the groups, the proportion reporting an effect on EQ5D5L domains was highest in the peri versus pre and post groups for mobility (25% vs. 9% and 22%), self-care (14% vs. 3% and 14%), pain/discomfort (60% vs. 40% and 57%) and anxiety/depression (71% vs. 50% and 59%). The exception was usual activities, where the post group had the highest proportion (24% vs. 8% for pre and 23% for peri). The characteristics of the cohort are shown in Table 1.

Table 1 Characteristics of a general population cohort of menopausal-aged women in China

Our first key finding was that symptomatic perimenopausal women with intact uterus had significantly lower HRQoL (0.864) than the premenopausal (0.919, p < 0.05) and postmenopausal (0.877, p < 0.05) women using the ANOVA and Tukey tests. There was also a significant difference between symptomatic versus asymptomatic women within each group as shown in Table 2.

Table 2 Health state utility values for pre, peri and post groups with uterus according to symptoms and no symptoms

The second key finding was that a higher proportion of perimenopausal women reported symptoms (91%) compared to premenopausal (77%, difference = 14%, 95% CI = 10–17%) and postmenopausal (81%, difference = 10%, 95% CI = 2–10%) women and this was statistically significant based on z-tests with adjustment by the Bonferroni method. Significantly more perimenopausal women reported nine of the eleven symptoms compared to the premenopausal group (sleep problems, irritability, exhaustion, anxiety, hot flushes, depression, heart discomfort, loss of interest in sexual activity, vaginal dryness). The other two symptoms, although higher, were not statistically significant (joint pain, bladder problems). Significantly more perimenopausal women experienced five of eleven symptoms compared to the postmenopausal group (sleep problems, irritability, anxiety, hot flushes, depression) as shown in Table 3.

Table 3 General population cohort of women with uterus and menopausal symptoms

Comparing perimenopausal versus premenopausal women (with uterus), HRQoL was significantly lower for three symptoms having no or little impact on daily life (sleep problems, exhaustion, anxiety); six symptoms having moderate impact on daily life (sleep problems, irritability, exhaustion, hot flushes, heart discomfort, loss of interest in sexual activity) and four symptoms having large impact on daily life (sleep problems, irritability, anxiety, and depression). Comparing perimenopausal versus postmenopausal groups (with uterus), HRQoL was significantly lower for one symptom having large impact on daily life (loss of interest in sexual activity) as shown in Table 4.

Table 4 Health state utility values according to symptoms with no/little, moderate and large impact on daily life (respondents with uterus)

Discussion

Our study provides insight into a quota sample of menopausal-aged women drawn from the general population in China. In keeping with our hypothesis, perimenopausal women experienced significantly more symptoms and had significantly lower HRQoL compared to premenopausal and postmenopausal women. The HRQoL of women in our study was notably lower (0.893) than that of the general population (0.962) for 40–49-year-old females in urban China [36]. The mean utility of premenopausal women (0.929) was more closely aligned to the general population and declined in the postmenopausal group (0.881) [versus 0.954 (0.933; 0.975) for 50–59-year-olds] [36].

To the best of our knowledge, the only other study measuring HSUV in menopausal women in China is by Liu et al., which used the EQ5D3L; however, it compares the HSUV of premenopausal (0.810) versus postmenopausal (0.800) women and does not report the HSUV for perimenopausal women [16]. Our findings are in agreement with a review of the impact of menopausal transition on HRQoL more broadly [37]. Matthews et al. reported that, based on twelve cross-sectional studies, perimenopause is associated with more somatic symptoms; however, none of the studies used the EQ5D5L nor were they conducted in China [37]. Other studies use a range of generic and disease-specific instruments, as described by Zöllner and Matthews, and are therefore not directly comparable to our study [37, 38]. Zöllner’s review of HRQoL instruments concluded that, of the eight instruments reviewed, none captured all relevant aspects of HRQoL and treatment. In Taiwan, a cohort of 734 premenopausal women was followed up for 2 years and HRQoL was assessed with SF36 and HADS, with no effect observed [39]. Hess enrolled 732 women of menopausal age in a GP practice in the USA, measured HRQoL with the RAND-36 and found that physical health was poorer in late peri- and early post- versus premenopausal women. The mental health component was lowest in late peri- versus early post-, late post- and premenopausal women [40]. A recent study evaluated the influence of education on perimenopausal symptoms and HRQoL using the disease-specific World Health Organisation Quality of Life BREF Questionnaire. Higher education corresponded to better HRQoL in 1632 treatment-naïve perimenopausal women attending an outpatient clinic in Hangzhou, Zhejiang Province, China [1]. Adding further complexity is the timing of the studies over different phases of menopause, and comparison between different time periods during menopause [37]. Finally, other studies use different drug formulations [39, 40, 41, 42, 43, 44].
One study evaluated the prevalence of screening-detected depression and the association of depression with HRQoL in community-dwelling postmenopausal women living in three Asian countries including China [45]. In 336 women in the Chengdu or Kunming metropolitan areas (mean age 59 years), HRQoL was measured using EQ5D; however, the health state utility values are not reported [45]. Our findings differ from the study by Zhang, which showed that menopausal symptoms were more severe in postmenopausal women [13]. Zhang et al. highlight that this may reflect selection bias because the study group comprised women who “traveled from all over China to one, specialized center”, so this is a cohort more affected by symptoms. Furthermore, neither a generic nor a recognised disease-specific instrument was used to measure HRQoL [13].

Our research findings should be interpreted in the context of the strengths and weaknesses of our study. In keeping with other studies collecting patient-reported outcomes using electronic methods, one of the key strengths of our study was the timely data collection in a large cohort with low resource demands [46]. Another advantage of the smartphone format was that the flow and layout of the questionnaire required each question to be answered before respondents could proceed to the next; as a result, there are no missing values for respondents who reached the end of the EQ5D5L questions (i.e., those with complete surveys). Response rates were high (2000/3001 = 67%) compared to other studies, likely due to digital accessibility [40]. Screening questions and EQ5D5L were in local Chinese language. We used the EQ5D5L, which is more sensitive to changes in HRQoL than the EQ5D3L used by Liu [16]. Importantly, results of administering the paper-based version of EQ5D are comparable to those of the electronic version [32]. The cohort is a quota-based representative sample of the general population and no randomisation occurred, although randomisation is the basis for the statistical tests. The study is descriptive and, due to the cross-sectional design, cannot evaluate causality. Many of the symptoms measured may occur independently of menopause and, due to the nature of the study, there was no way to determine whether that was the case in our sample. Our study did not measure detailed socio-demographic characteristics such as chronic diseases, or lifestyle factors like physical activity, smoking and alcohol consumption, which could confound results. It is likely that biological and social confounders are contributing factors to the HRQoL measured in our study. We cannot correlate our findings with clinical hormone levels; therefore we cannot confirm the link between menopause, symptoms, and HRQoL.
A recent study reported that hypomnesia was the third most common symptom, experienced by 71% of postmenopausal and 66% of perimenopausal women; however, hypomnesia was not measured in our study [13]. The respondents were restricted to women with digital literacy and access to a smartphone and are therefore likely to have better education and higher socioeconomic status. Since our data were collected during the COVID pandemic, the status of our participants and their responses may have been affected by the outbreak. Although our findings have high internal validity, we caution against generalising them to the broader population. Despite the limitations, the findings are useful and provide insight into the current status of menopausal-aged women in China. Future research would benefit from studies measuring the causal relationships between menopause, symptoms and HRQoL.

Conclusion

The perimenopausal phase of menopause is associated with significantly more symptoms and significantly lower HRQoL compared to premenopausal and postmenopausal phases.