Background

By the age of 35 years, the quality and quantity of ovarian follicles decline [1], and the consequent hormonal and symptomatic changes lead to cessation of menses [2]. During menopause, the fluctuating levels of sex hormones, including luteinizing hormone, follicle-stimulating hormone, estrogen, and progesterone [3], can cause osteoporosis and menopausal symptoms, such as hot flushes, depression, nocturnal sweating, uterine bleeding, vaginal dryness, insomnia, and loss of sexual function [4–6]. It is estimated that there will be about 1.2 billion menopausal women worldwide by 2030 [7]. Menopause occurs between 44.6 and 52 years of age, varying among different races and countries [8]. In the United States, about 6,000 women reach menopause every day, which is more than 2 million per year [7]. The average age of menopause in the United Kingdom and United States is 52 and 51 years, respectively [9, 10]. In China, women experience natural menopause at around 50 years of age, and women in the southeast of China reach menopause at an average age of 48.9 years [11, 12]; thus, the 0.28 billion women who will be over the age of 50 years by 2030 would have reached menopause [13].

Hormone replacement therapy (HRT) has been used for more than 60 years to relieve menopausal symptoms. However, HRT is associated with many adverse effects [14], e.g., increased risks of breast cancer, coronary artery disease, endometrial cancer, venous thromboembolism, and stroke [15].

Chinese medicines (CM) are also used in treating menopausal symptoms [16–21]. Some Chinese herbal formulas (CHFs) are indicated for treating gynecological disorders, including menopausal symptoms [16, 17]. However, few studies on the biological actions of the CHFs have been conducted [24–26]. As a typical example of a CHF, Erxian decoction (EXD) is commonly used to treat menopause-related symptoms [17, 22–34]. It consists of six herbs: Herba Epimedium Brevicornum (HE; Xian-ling-pi), Rhizoma Curculiginis Orchioides (RC; Xian-mao), Radix Morindae Officinalis (RMO; Ba-ji-tian), Radix Angelicae Sinensis (RAS; Dang-gui), Cortex Phellodendri Chinensis (CPC; Huang-bo), and Rhizoma Anemarrhenae Asphodeloides (RA; Zhi-mu) [35].

During the past two decades, drug discovery has been dominated by the “one drug, one disease” paradigm. However, many drugs exert therapeutic effects via restoration of multiple disease-related targets rather than a single one [36, 37]. Network pharmacology, which is based on systems biology, polypharmacology, and molecular network analysis, provides a possible strategy to elucidate the action mechanism of multi-ingredient medicines in a holistic view [38–40]. Molecular networks are constructed from the interactions of target proteins and genes to predict their function and facilitate drug discovery, and they provide pharmacological information in a holistic manner [40, 41]. Enrichment analysis is an analytical method to assess functional associations between a set of genes or proteins of interest and a database of known gene or protein sets [42, 43]. It can identify the significant pathways and their enriched gene/protein sets, and elucidate multiple significant pharmacological mechanisms [42, 44].

The numerous chemical constituents of EXD and their biological actions have not been fully characterized. This study aims to identify the bioactive compounds and actions of EXD by a network pharmacological analysis.

Methods

The constituent compounds of EXD were identified from two phytochemical databases, the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database and TCM Database@Taiwan, as well as the published EXD literature [26–30, 35, 45, 46]. The druggability of the identified compounds in EXD was analyzed using Lipinski’s rule (LR) and the oral bioavailability (OB) and drug-likeness (DL) indices provided by the TCMSP database. OB is the degree to which a drug or other substance becomes available to the target tissue after oral administration. DL evaluates the potential of a compound to be bioactive in comparison with well-developed drugs. The significant pathways and gene-associated diseases for the identified compounds were determined by enrichment analysis of the compound–protein interactions (JEPETTO (US): http://apps.cytoscape.org/apps/jepetto) [43] and enrichment analysis of the compound–gene interactions (DAVID 6.7 (US): http://david.abcc.ncifcrf.gov/home.jsp) [47], respectively. The workflow of the network pharmacology study of EXD is summarized in Figure 1.

Figure 1 The workflow of the network pharmacological study of EXD.

Identification of potential bioactive constituents in EXD

All phytochemicals from the six constituent herbs of EXD were identified from the TCM Database@Taiwan (http://tcm.cmu.edu.tw/), the TCMSP database (http://sm.nwsuaf.edu.cn/lsp/tcmsp.php), and the previous EXD literature [26–30, 35, 45, 46].

Druggability analysis by LR, OB and DL properties

Lipinski’s rule (LR) [48] was used to identify druggable compounds according to the following criteria: molecular weight (MW) of not more than 500 Da (MW ≤500), no more than five hydrogen bond donors (H-bond donors ≤5), no more than 10 hydrogen bond acceptors (H-bond acceptors ≤10), and an octanol–water partition coefficient, LogP, of not more than 5 (LogP ≤5). A compound that violates two or more of the above conditions is less likely to be an orally active drug [49].
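The four LR criteria above can be sketched as a simple check on precomputed molecular properties. This is a minimal illustration, not the screening code used in the study: the `Compound` class, the `max_violations` parameter (set to 1, i.e. a compound fails once it breaks two or more criteria), and the example property values are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Precomputed properties of one compound (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in Da
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors
    logp: float         # octanol-water partition coefficient


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four LR criteria the compound breaks."""
    checks = [
        c.mol_weight <= 500,
        c.h_donors <= 5,
        c.h_acceptors <= 10,
        c.logp <= 5,
    ]
    return sum(not ok for ok in checks)


def passes_lipinski(c: Compound, max_violations: int = 1) -> bool:
    """Assumed reading of LR: tolerate at most `max_violations` broken criteria."""
    return lipinski_violations(c) <= max_violations


# Hypothetical property values, for illustration only.
small = Compound("example_small", 336.4, 0, 4, 3.1)
large = Compound("example_large", 921.1, 12, 19, -1.2)

for cpd in (small, large):
    print(cpd.name, lipinski_violations(cpd), passes_lipinski(cpd))
```

Setting `max_violations=0` gives the stricter reading in which all four criteria must hold; the text does not state which reading was applied when compounds "passed LR".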

The phytochemical information of the compounds, with their OB and DL properties, was explored using the TCMSP database, which embeds OBioavail 1.1 software for OB [50] and Tanimoto similarity software for DL [51]. The DL calculations in the TCMSP database were based on the following formula [51]:

$$F(A,B) = A \times \frac{B}{{A^{2} + B^{2} - A \times B}}$$

where A is related to the molecular property of the target compound and B refers to the average molecular properties of all drugs from the Drugbank database (http://www.drugbank.ca/). A more detailed calculation of the DL index can be found in Tao et al. [51] and Wang et al. [52]. The thresholds used were OB ≥30% and DL index ≥0.18, as recommended by the TCMSP database. The thresholds were selected to efficiently identify bioactive compounds from the large pool of chemical compounds based on the following criteria: (1) the model obtained could be reasonably explained by previous pharmacological data and (2) the compound met the recommended mean DL index of 0.18 (the mean of DL index of 6,511 molecules from Drugbank database (2011) is 0.18) [51, 52].
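As a rough sketch of how the formula and the two thresholds could be applied, the code below treats A and B as numeric descriptor vectors, so that A × B becomes a dot product and A² and B² become squared norms. That vector reading, the function names, and the toy inputs are assumptions for illustration; the actual descriptors used by TCMSP are described in Tao et al. [51] and are not reproduced here.

```python
from typing import Sequence


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Tanimoto coefficient F(A, B) = A.B / (|A|^2 + |B|^2 - A.B)."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a)
    norm_b = sum(y * y for y in b)
    denom = norm_a + norm_b - dot
    if denom == 0:
        raise ValueError("both descriptor vectors are zero")
    return dot / denom


def is_candidate(ob_percent: float, dl_index: float,
                 ob_min: float = 30.0, dl_min: float = 0.18) -> bool:
    """Apply the OB >= 30% and DL >= 0.18 thresholds stated in the text."""
    return ob_percent >= ob_min and dl_index >= dl_min


# Toy descriptor vectors, for illustration only.
print(tanimoto([1.0, 2.0], [2.0, 1.0]))
print(is_candidate(ob_percent=36.9, dl_index=0.78))
```

Identical vectors give a coefficient of 1 and orthogonal vectors give 0, which is why a higher DL index is read as greater similarity to the average DrugBank drug.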

Identification of associated proteins and genes

The integrative efficacy of the identified constituents in EXD was determined by analyzing the chemical–protein and chemical–gene interactions obtained from the Search Tool for Interactions of Chemicals and Proteins (STITCH) database and the Comparative Toxicogenomics Database (CTD), respectively. The STITCH 4.0 database (http://stitch.embl.de/) can be used to study potential interactions between 300,000 chemicals and 2.6 million proteins curated from 1,133 organisms [53]. In this database, the approximate probability of a predicted chemical–protein association is given by the confidence score, with a higher score indicating a stronger interaction (low confidence score ~0.2; medium confidence score ~0.5; high confidence score ~0.75; highest confidence score ~0.95, provided by the STITCH 4.0 database). The CTD (http://ctd.mdibl.org/) is a publicly available research resource that includes more than 116,000 interactions between 9,300 chemicals and 13,300 genes [54]. Both databases were searched independently by two researchers to minimize any bias.

To identify the associated significant pathways, proteins with a chemical–protein interaction confidence score ≥0.5 were selected for enrichment analysis with the KEGG database using JEPETTO, a Java-based Cytoscape 3.01 plugin [43]. To study the gene-associated diseases, the genes were first ranked by their frequency of occurrence in the chemical–gene interactions, and the genes with gene frequency ≥1.67 were then chosen for enrichment analysis with the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources 6.7 (http://david.abcc.ncifcrf.gov/).
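The two selection steps described above can be sketched as follows. The tuple layouts, function names, and toy records are hypothetical; the real STITCH and CTD exports have their own column formats, which are not assumed here. Only the cutoffs (confidence score ≥0.5 and gene frequency ≥1.67) come from the text.

```python
from collections import Counter


def proteins_above_confidence(interactions, min_score=0.5):
    """Keep proteins with at least one interaction scoring >= min_score.

    `interactions` is an iterable of (compound, protein, confidence) tuples.
    """
    return {protein for _, protein, score in interactions if score >= min_score}


def gene_frequencies(interactions):
    """Count how often each gene occurs in (compound, gene) pairs."""
    return Counter(gene for _, gene in interactions)


def genes_at_or_above(freqs, cutoff):
    """Return genes with frequency >= cutoff, most frequent first."""
    kept = (g for g, n in freqs.items() if n >= cutoff)
    return sorted(kept, key=lambda g: (-freqs[g], g))


# Toy records, for illustration only.
chem_protein = [
    ("cpd_a", "P1", 0.9),
    ("cpd_a", "P2", 0.4),
    ("cpd_b", "P2", 0.5),
    ("cpd_c", "P3", 0.2),
]
chem_gene = [
    ("cpd_a", "G1"), ("cpd_b", "G1"), ("cpd_c", "G1"),
    ("cpd_a", "G2"),
    ("cpd_b", "G3"), ("cpd_c", "G3"),
]

selected_proteins = proteins_above_confidence(chem_protein)
freqs = gene_frequencies(chem_gene)
selected_genes = genes_at_or_above(freqs, cutoff=1.67)
print(sorted(selected_proteins), selected_genes)
```

Because gene frequencies are whole numbers, any cutoff between 1 and 2 keeps exactly the genes that occur at least twice.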

Results

Compounds in EXD

Eight hundred and ninety-five phytochemicals were collected from the six herbs in EXD. From the TCM Database@Taiwan, 203 compounds were identified, comprising 29 in HE, 44 in RC, 38 in RMO, 56 in RAS, seven in CPC, and 29 in RA. From the TCMSP database, 646 compounds were identified, comprising 130 in HE, 78 in RC, 174 in RMO, 125 in RAS, 58 in CPC, and 81 in RA. A further 46 phytochemicals were collected from previous studies in the literature [26–30, 35, 45, 46], comprising 15 in HE, one in RC, five in RMO, five in RAS, 14 in CPC, five in RA, and one in EXD (specific herbs unknown). Finally, a total of 721 phytochemicals were identified in EXD after removing overlapping/duplicate compounds from the databases and the literature (Additional file 1).

Identifying druggable compounds by LR, OB, and DL predictions

Of the 150 compounds from HE, 75 (50%) passed LR, 23 (15.3%) had OB ≥30% and DL index ≥0.18, and only 17 (11.3%) satisfied all criteria. Of the 104 compounds from RC, 29 (27.9%) passed LR, seven (6.7%) had OB ≥30% and DL index ≥0.18, and only four (3.8%) satisfied all criteria. Of the 189 compounds from RMO, 125 (66.1%) passed LR, 20 (10.6%) had OB ≥30% and DL index ≥0.18, and only 12 (6.3%) satisfied all criteria. Of the 173 compounds from RAS, 131 (75.7%) passed LR, five (2.9%) had OB ≥30% and DL index ≥0.18, and only three (1.7%) satisfied all criteria. Of the 63 compounds from CPC, 43 (68.3%) passed LR, 28 (44.4%) had OB ≥30% and DL index ≥0.18, and only 19 (30.2%) satisfied all criteria. Of the 81 compounds from RA, 45 (55.6%) passed LR, 15 (18.5%) had OB ≥30% and DL index ≥0.18, and only 11 (13.6%) satisfied all criteria (Table 1). Anemarsaponin BII, reported in the literature for EXD (specific herbs unknown), did not pass LR on the basis of its physicochemical properties. Overall, 66 compounds passed LR and had OB ≥30% and DL index ≥0.18. A total of 63 compounds were obtained after removing the duplicate compounds (Table 2).

Table 1 Compounds in EXD satisfying LR, OB ≥30% and DL ≥0.18
Table 2 The 63 bioactive compounds from the HE, RC, RMO, RAS, CPC, and RA herbs and their corresponding molecular properties, OB and DL (20 of the 63 bioactive compounds are related to the 34 significant pathways or the 12 gene-associated disease classes linked to menopause)

Revealing the significant pathways and gene-associated diseases

Overall, 155 of the 721 compounds from EXD were found to have 2,656 chemical–protein interactions. After removing the overlapping/duplicate information, 1,963 associated proteins were obtained (Additional file 2), of which 1,824 had a confidence score of at least 0.5. Enrichment analysis of these 1,824 associated proteins yielded XD-scores and q values for the pathways. The XD-score is relative to the average distance to all pathways and represents a deviation from the average distance [43]. A larger positive XD-score indicates a stronger association between the input proteins and the molecular interaction network of a pathway. The q value indicates the significance of the overlap (Fisher’s exact test) between the input information and the pathways. The enrichment analysis (graph-based statistic) of the XD-scores and q values gave an XD-score threshold of 0.67 in this study; accordingly, 34 pathways were significantly associated with the input set of proteins (Table 3).

Table 3 The 34 significant pathways found by JEPETTO (Cytoscape plugin) with KEGG database

In total, 210 of the 721 compounds from EXD were found to have 14,893 compound–gene interactions with 8,536 associated genes in the CTD (Additional file 3). The 8,536 genes were then ranked according to their frequency of occurrence. The number of genes fell abruptly at low frequencies of occurrence (gene frequency ≤8; Figure 2) and then stabilized for gene frequencies between 10 and 19. The number of genes with gene frequencies ≥20 was quite small. Genes with gene frequencies below the average of 1.74 were removed to reduce the number of redundant genes. The remaining 2,183 genes were used to conduct the gene enrichment analysis on the DAVID platform. “GENETIC_ASSOCIATION_DB_DISEASE_CLASS” was selected as the annotation category to search for the significant diseases associated with the input genes, and the associations were statistically verified by Fisher’s exact test using the DAVID platform [47]. P ≤ 0.01 indicated significant association or enrichment with the related items. After removing nonspecific diseases, 12 classes of diseases were found to be highly associated with the input genes (Tables 4 and 5). Most of these diseases were related to menopause, such as aging, reproduction, cancer, cardiovascular diseases, and neurological diseases [55–58].
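To make the enrichment test concrete, the sketch below computes a plain one-sided Fisher’s exact test (the upper tail of the hypergeometric distribution) for the overlap between an input gene list and one annotation category. It is an illustration under stated assumptions, not a reimplementation of DAVID: the function name and the toy counts are hypothetical, and the statistic and background set that DAVID itself applies may differ from this plain version.

```python
from math import comb


def fisher_enrichment_p(overlap: int, input_size: int,
                        category_size: int, background_size: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of category genes found
    when `input_size` genes are drawn at random from `background_size` genes,
    of which `category_size` belong to the category.
    """
    upper = min(input_size, category_size)
    total = comb(background_size, input_size)
    tail = sum(
        comb(category_size, k)
        * comb(background_size - category_size, input_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts, for illustration only: 5 input genes, all 5 of which fall in a
# 5-gene category, against a background of 10 genes.
p_value = fisher_enrichment_p(overlap=5, input_size=5,
                              category_size=5, background_size=10)
print(p_value, p_value <= 0.01)
```

A category is called enriched when this tail probability falls at or below the chosen cutoff, which is P ≤ 0.01 in the text.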

Figure 2 Gene frequency of the associated genes of 210 compounds.

Table 4 Chemical–protein interactions and related significant signaling pathways
Table 5 The 12 disease classes highly associated with input genes

Identifying 20 bioactive compounds related to menopause following the druggability prediction

Eighteen of the 155 compounds that had chemical–protein interactions satisfied Lipinski’s rule with OB ≥30% and DL index ≥0.18. Thirteen of the 210 compounds that had compound–gene interactions satisfied Lipinski’s rule with OB ≥30% and DL index ≥0.18. Eleven compounds had both chemical–gene and chemical–protein interactions and satisfied the drug-likeness prediction. In total, 20 compounds related to the 34 significant pathways or the 12 gene-associated disease classes linked to menopause were identified (Table 3).
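The counts above are consistent with simple inclusion–exclusion, assuming (this is an interpretation, not stated in the text) that the 20 compounds are the union of the druggable compounds with chemical–protein interactions, denoted here by P, and those with compound–gene interactions, denoted by G:

$$|P \cup G| = |P| + |G| - |P \cap G| = 18 + 13 - 11 = 20$$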

Discussion

The actions of bioactive compounds in EXD were investigated by combining a drug prediction method with an enrichment analysis using information from bioinformatics databases at the gene and protein levels. For example, candidate compounds such as berberine, palmatine, and jatrorrhizine, which we identified using our drug prediction method, have been shown to exhibit extensive pharmacological activities [59, 60]. From the enrichment analysis based on the available information for compound–protein and compound–gene interactions of EXD, we identified the most significantly related pathways and gene-associated diseases, including pathways related to the endocrine system [35], VEGF [61], lipid metabolism [62], and inflammation [34]. Their pharmacological associations with EXD were in line with previous publications [34, 35, 61, 62].

Several pathways involving the endocrine system were also identified, such as steroid hormone biosynthesis, the GnRH signaling pathway, and the adipocytokine signaling pathway, consistent with our group’s previous finding that EXD promotes estradiol biosynthesis in an animal study [35]. For the steroid hormone biosynthesis pathway, the EXD compound quercetin promoted the expression of aromatase (CYP19A1), the enzyme responsible for estrogen biosynthesis [63]. This compound also met the druggability criteria. Other important overlapping proteins were HSD11B1, SULT2B1, CYP1A1, COMT, and CYP1B1 (Figure 3).

Figure 3 Chemical–protein interactions related to steroid hormone biosynthesis pathways. Grey represents genes in the target set, green relates to the steroid hormone biosynthesis pathway, and blue (labeled) is the overlap between the related pathway and the input protein set.

For the VEGF signaling pathway, the VEGFA protein was involved in the antiangiogenic ability of EXD in our previous study [61]. The anti-cancer effect of emodin, an EXD compound that interacts with VEGFA, has been reported [64]. Other interacting proteins of significance were PTK2, HRAS, MAPK1, AKT1, SRC, MAPK3, KRAS, MAPK14, PTGS2, PIK3CB, CASP9, PPP3CB, PPP3CA, NOS3, PLA2G4A, PIK3CA, PLA2G2A, PLA2G1B, NFAT5, NFATC3, PLA2G10, PLA2G5, and PPP3CC (Figure 4). The steroid hormone biosynthesis and VEGF signaling pathways were selected for further analysis in the present study (Table 5).

Figure 4 Chemical–protein interactions related to the VEGF signaling pathway. Grey represents genes in the target set, green relates to the VEGF pathway, and blue (labeled) is the overlap between the related pathway and the input protein set. Orange is the expansion of their pathways.

For lipid metabolism, EXD-associated pathways related to linoleic acid metabolism, fatty acid metabolism, unsaturated fatty acid biosynthesis, glycerophospholipid metabolism, arachidonic acid metabolism, and PPAR signaling were identified [65–69]. In addition, our previous study found that EXD could improve the lipid profile in cardiovascular disease [62].

While a previous study showed EXD to have anti-inflammatory activity [34], the present study suggested that the pathways involved include the Toll-like receptor signaling pathway, NOD-like receptor signaling pathway, and Fc epsilon RI signaling pathway [70–72]. These findings were consistent with previous studies on the antimetastatic activity of EXD in a human ovarian cancer model [73] and its antiangiogenic properties [61].

Compound–compound interactions were not considered in this study because the available databases could only provide limited information for the six individual herbs. The databases did not cover new compounds formed by chemical reactions during the decoction of EXD’s ingredients; these will be investigated by liquid chromatography coupled with mass spectrometry in a further study. The ranking of the compound–gene and compound–protein interaction information was based on published evidence, but the quality of this evidence still needs extensive assessment. This study exemplified how to screen and identify bioactive compounds in CHFs.

Conclusions

Twenty compounds were identified by network pharmacology as potential effective ingredients of EXD for menopause with acceptable oral bioavailability and druggability.