Background

In the United States, ovarian cancer is the fifth leading cause of cancer-related death among women [1]; it accounts for 2.5% of all cancers but 5% of all cancer deaths in women [2]. In 2018, there were about 22,000 new cases of ovarian cancer and 14,000 deaths [2]. The high death rate (5-year survival rate below 50%) is mainly due to late diagnosis and the aggressiveness of high-grade serous carcinoma [2, 3]. Platinum-based chemotherapy after surgical debulking is the standard treatment for ovarian cancer [4]. However, the recurrence rate is high, and recurrent tumors are often platinum resistant [4,5,6] through complicated mechanisms [7]. Although a few targeted therapies, e.g., VEGF, PARP, and EGFR inhibitors, are being evaluated in clinical trials [4], some of them have not been very successful [4]. Therefore, novel targeted therapies and synergistic drug combinations are needed for ovarian cancer.

On the other hand, comprehensive multi-omics data of ovarian cancer patients have been profiled and analyzed [1, 8]. A set of genetic biomarkers, e.g., TP53, NOTCH, and FOXM1, have been identified via association analyses [1]. Also, a few dysfunctional signaling pathways, e.g., MYC, TP53, and PI3K/RAS, were identified in ovarian cancer by mapping multi-omics data (differentially expressed genes, mutations, copy number variation, and methylation data) to curated signaling pathways [8]. However, the functional consequences of these biomarkers and the cross-talk among these complicated signaling pathways in ovarian cancer remain unclear. It is still a challenge to discover effective drugs and synergistic drug combinations [9,10,11,12] for ovarian cancer based on this valuable knowledge and multi-omics data.

In this study, we aim to systematically investigate potentially activated core signaling pathways in ovarian cancer sub-groups by uncovering the up-stream signaling pathways of activated transcription factors (TFs), and to identify all available FDA approved drugs targeting these up-stream signaling pathways and TFs. Combinations of these drugs have the potential to synergize with standard platinum chemotherapy by disrupting multiple up-stream signaling pathways and their cross-talk. This study provides a useful reference resource for repositioning effective drugs and drug combinations for ovarian cancer. The rest of the paper is organized as follows. The details of the datasets and methods are provided in the Methods section. The analysis results are presented in the Results section, followed by a discussion and conclusions.

Methods

Gene expression data of ovarian cancer and ovarian normal tissue

We downloaded the gene expression data (RNA-seq, RSEM expected_count, DESeq2 standardized) of 427 ovarian cancer samples from The Cancer Genome Atlas (TCGA) [1] and of 88 normal ovarian samples from the Genotype-Tissue Expression (GTEx) project [13] via the Xena server [14].

KEGG signaling pathways and regulatory network

To obtain KEGG signaling pathways, the “Pathview” R package [15] was employed to download KGMLs of signaling pathways. Then the “KEGGgraph” R package was used to extract nodes and edges of KEGG signaling pathways from KGMLs [16]. In total, 282 signaling pathways were collected from seven categories: metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases, and drug development. The TF-Target regulatory network was downloaded from the supplemental material of reference [17], which was derived from the TF binding site predictions for all target genes from TRANSFAC (v7.4) [18]. In summary, the TF-target regulatory network consists of 230 TFs, 12,733 target genes, and 79,100 TF-Target interactions.

Drug combination screening data in NCI ALMANAC

This dataset includes screening results of pairwise combinations of 104 FDA-approved anticancer drugs on the NCI-60 cancer cell lines (59 cancer cell lines with detailed genomics profiles) [19]. Specifically, approximately 5232 pairwise drug combinations were evaluated in each cancer cell line. Each drug combination was tested at either 9 or 15 dose points, for a total of 2,809,671 dose-specific combinations. The detailed definition of the synergistic drug combination score is given in reference [19].

Selection of up-regulated genes for each sample

In this study, the GTEx normal ovarian tissue samples were used as normal controls against the ovarian cancer tumor samples from TCGA. Selecting genes by a simple fold change and a t-test p-value <= 0.05 resulted in too many up-regulated genes. The Maximum Likelihood Estimate (MLE) method (see Fig. 1, red probability density function (PDF) curve) also generated too many up-regulated genes. Thus, we employed a Markov chain Monte Carlo (MCMC) model to simulate the expression distribution of a given gene based on the normal tissues. Let x and D denote the expression of a given gene and its expression data in the normal tissues, respectively.

Fig. 1 Gene expression distribution of gene “CENPH”

$$ p\left(x|D\right)=\frac{p\left(x,D\right)}{p(D)}=\frac{\int_{\theta \in \Theta}p\left(x,D|\theta \right)p\left(\theta \right) d\theta}{p(D)}=\frac{\int_{\theta \in \Theta}p\left(x|\theta \right)p\left(D|\theta \right)p\left(\theta \right) d\theta}{p(D)} $$
(1)
$$ \theta =\left(\mu, {\sigma}^2\right),\kern0.5em \Theta =\left(-\infty, +\infty \right)\times \left(0,+\infty \right),\kern0.5em x\sim N\left(\mu, {\sigma}^2\right). $$

We use the conjugate priors for μ and σ², namely the Normal distribution and the Inverse Gamma distribution: μ ~ N(w0, v0), σ² ~ IG(a0, b0). Fully uninformative priors correspond to w0 = 0, v0 = +∞, a0 = 0, b0 = 0. Since eq. (1) is hard to calculate analytically, we used the MCMC method to simulate the distribution. The Python package “PyMC3” [20] was employed to conduct the analysis. In practice, we set w0 = 0, v0 = 10⁴, a0 = 10⁻³, b0 = 10⁻³ to approximate the uninformative priors. The MCMC model is better than MLE (see the green PDF curve in Fig. 1), but it still selects too many up-regulated genes. To further reduce the number of up-regulated genes, we empirically simulated the PDF of the random variable y = 2x and used the PDF of y to calculate the p-value of a given gene's expression in the ovarian cancer samples. Specifically, for each tumor sample we selected up-regulated genes with fold change >= 2 and p-value <= 0.05 (calculated based on the PDF of y). We take the gene “CENPH” as an example to illustrate this analysis. The PDF generated by the MCMC model is more robust than that generated by MLE (see Fig. 1). The yellow point is the threshold, and the area under the blue curve to the right of the yellow point is about 0.05 (the p-value).
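The sampling step can be illustrated with a minimal sketch. It is not the PyMC3 model used in the study: it swaps in a hand-written Gibbs sampler (standard library only) for the same Normal model with Normal and Inverse Gamma priors, and it reads y = 2x literally as twice the simulated normal expression. The function names, the number of draws, and the example expression values are all hypothetical.

```python
import random
import statistics


def posterior_predictive_samples(normal_expr, n_draws=4000, burn_in=1000,
                                 w0=0.0, v0=1e4, a0=1e-3, b0=1e-3, seed=0):
    """Gibbs sampler for x ~ N(mu, sigma2), mu ~ N(w0, v0), sigma2 ~ IG(a0, b0).

    Returns draws of x from the posterior predictive p(x | D).
    """
    rng = random.Random(seed)
    n = len(normal_expr)
    total = sum(normal_expr)
    sigma2 = statistics.variance(normal_expr)  # starting value for the chain
    draws = []
    for it in range(burn_in + n_draws):
        # mu | sigma2, D is Normal (conjugate update).
        post_var = 1.0 / (1.0 / v0 + n / sigma2)
        post_mean = post_var * (w0 / v0 + total / sigma2)
        mu = rng.gauss(post_mean, post_var ** 0.5)
        # sigma2 | mu, D is Inverse Gamma(shape, rate); sample it as 1 / Gamma.
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((d - mu) ** 2 for d in normal_expr)
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            draws.append(rng.gauss(mu, sigma2 ** 0.5))
    return draws


def upper_tail_p_value(samples, observed):
    """Empirical P(sample >= observed)."""
    return sum(s >= observed for s in samples) / len(samples)


# Hypothetical expression values of one gene in normal tissue.
normal = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7]
x_draws = posterior_predictive_samples(normal)
y_draws = [2.0 * x for x in x_draws]  # assumption: y = 2x taken literally
p_value = upper_tail_p_value(y_draws, observed=11.5)  # hypothetical tumor value
is_up_regulated = p_value <= 0.05
```

Testing against the distribution of y rather than x builds the fold-change requirement into the p-value, which is why this step selects fewer genes than the plain MCMC model.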

Identification of activated TFs for individual ovarian cancer patients

Fisher’s exact test (based on the hypergeometric distribution) was used to identify activated TFs by comparing the proportion of up-regulated genes among the targets of a TF with the proportion of up-regulated genes among all genes tested. A p-value threshold of 0.05 was used to select the activated TFs.
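As a rough illustration of this test, the sketch below computes the upper tail of the hypergeometric distribution for a single TF. Using the one-sided (over-representation) tail is an assumption, since the text does not state the alternative hypothesis; the function name and the example counts are hypothetical.

```python
from math import comb


def tf_enrichment_p_value(n_genes, n_up, n_targets, n_up_targets):
    """P(X >= n_up_targets) when n_targets genes are drawn without replacement
    from n_genes genes, of which n_up are up-regulated."""
    max_overlap = min(n_up, n_targets)
    denom = comb(n_genes, n_targets)
    tail = sum(
        comb(n_up, k) * comb(n_genes - n_up, n_targets - k)
        for k in range(n_up_targets, max_overlap + 1)
    )
    return tail / denom


# Hypothetical counts: 12,000 genes tested, 600 up-regulated in the sample,
# a TF with 300 targets of which 30 are up-regulated.
p = tf_enrichment_p_value(n_genes=12000, n_up=600, n_targets=300, n_up_targets=30)
tf_is_activated = p <= 0.05
```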

Sub-grouping analysis using activated TFs

We clustered the 427 ovarian cancer samples using the identified activated TFs. The TF p-values were binarized to 0/1 using 0.05 as the threshold. Because the resulting data are categorical, we used the k-modes method [21] for the sub-grouping analysis.
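A minimal sketch of the binarization and of k-modes clustering follows. It is a simplified, standard-library version of the algorithm (random initial modes, Hamming distance, column-wise majority update), not the implementation used in the study. Treating p <= 0.05 as "activated" is an assumption about the boundary case, and all names and example values are hypothetical.

```python
import random
from collections import Counter


def binarize(p_values, threshold=0.05):
    """1 = TF activated (p <= threshold), 0 = not activated."""
    return [1 if p <= threshold else 0 for p in p_values]


def hamming(a, b):
    """Number of positions where two categorical vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, n_iter=20, seed=0):
    """Cluster categorical rows; returns (labels, modes)."""
    rng = random.Random(seed)
    modes = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assign every sample to its nearest mode.
        labels = [min(range(k), key=lambda c: hamming(row, modes[c]))
                  for row in rows]
        # Update every mode to the column-wise most frequent category.
        new_modes = []
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if not members:
                new_modes.append(modes[c])  # keep the mode of an empty cluster
                continue
            new_modes.append([Counter(col).most_common(1)[0][0]
                              for col in zip(*members)])
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


# Hypothetical p-values of 4 TFs in 5 samples.
p_matrix = [
    [0.01, 0.03, 0.40, 0.90],
    [0.02, 0.04, 0.60, 0.70],
    [0.50, 0.80, 0.01, 0.02],
    [0.70, 0.60, 0.03, 0.04],
    [0.01, 0.02, 0.55, 0.65],
]
samples = [binarize(row) for row in p_matrix]
labels, modes = k_modes(samples, k=2)
```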

Uncovering up-stream signaling of activated TFs

All 282 KEGG signaling pathways were investigated. Using the Python package NetworkX, we extracted all up-stream signaling cascades running from the starting genes of individual signaling pathways to the activated TFs. We then scored each signaling cascade by the average probability of its genes (obtained from the MCMC analysis). To control the size of the up-stream signaling network, only the top 3 signaling cascades were kept.
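The extraction and scoring of cascades can be sketched as follows. To stay self-contained, the sketch replaces NetworkX with a small depth-first search over an adjacency dictionary. The path-length cap `max_len`, the per-gene scores, and the toy pathway are hypothetical, and the scores stand in for the per-gene probabilities from the MCMC analysis.

```python
def simple_paths(edges, source, target, max_len=10):
    """Yield every loop-free path from source to target (depth-first).

    edges maps a gene to the list of its direct downstream genes.
    Paths holding more than max_len genes are not extended further.
    """
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) > max_len:
            continue
        for nxt in edges.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def top_cascades(edges, start_genes, tf, gene_score, top_n=3):
    """Score each cascade by the mean score of its genes; keep the best top_n."""
    scored = []
    for start in start_genes:
        for path in simple_paths(edges, start, tf):
            score = sum(gene_score[g] for g in path) / len(path)
            scored.append((score, path))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


# Toy pathway and hypothetical per-gene scores.
edges = {"EGFR": ["PI3K", "RAS"], "PI3K": ["AKT"], "RAS": ["MYC"], "AKT": ["MYC"]}
gene_score = {"EGFR": 0.9, "PI3K": 0.5, "AKT": 0.4, "RAS": 0.8, "MYC": 0.9}
best = top_cascades(edges, start_genes=["EGFR"], tf="MYC", gene_score=gene_score)
```

Averaging rather than summing the gene scores avoids favoring long cascades simply because they contain more genes.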

Target importance scoring

The impact analysis (IA) evaluates both the topology and dynamics of a signaling pathway by considering the gene expression changes, the direction and type of signaling interaction, and the position and role of every gene in a pathway. A perturbation factor for each gene, PF(gi), is calculated using the impact analysis method [22], as follows:

$$ PF\left({g}_i\right)=\Delta E\left({g}_i\right)+\sum \limits_{j=1}^n{\beta}_{ij}\frac{PF\left({g}_j\right)}{N_{ds}\left({g}_j\right)}, $$

The term ΔE(gi) represents the signed, normalized, measured expression change of gene gi. The second term is the sum of the perturbation factors of the direct upstream genes gj of the target gene gi, each normalized by the number of downstream genes of gj, Nds(gj). The value of βij quantifies the strength of the interaction between genes gj and gi. We use the probability density of gene expression instead of the gene expression change, which should be more accurate because the standard deviation differs between genes.
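A small worked sketch of the perturbation factor recursion is given below. It assumes the pathway graph is acyclic, so that each PF can be computed once all of its upstream PFs are known; pathways with feedback loops would need an iterative or linear-system solution instead. The function name, the gene names, the ΔE values, and the β values are hypothetical.

```python
from functools import lru_cache


def perturbation_factors(delta_e, edges):
    """Compute PF(g) = dE(g) + sum_j beta_ij * PF(g_j) / N_ds(g_j).

    delta_e: dict gene -> expression-change term dE(g)
    edges:   dict (upstream, downstream) -> beta, assumed to form a DAG
    """
    upstream = {}      # gene -> list of (direct upstream gene, beta)
    n_downstream = {}  # gene -> number of direct downstream genes
    for (src, dst), beta in edges.items():
        upstream.setdefault(dst, []).append((src, beta))
        n_downstream[src] = n_downstream.get(src, 0) + 1

    @lru_cache(maxsize=None)
    def pf(gene):
        total = delta_e.get(gene, 0.0)
        for src, beta in upstream.get(gene, []):
            total += beta * pf(src) / n_downstream[src]
        return total

    genes = set(delta_e) | set(upstream) | set(n_downstream)
    return {g: pf(g) for g in genes}


# Toy example: A activates B and C (beta = 1), B inhibits C (beta = -1).
delta_e = {"A": 1.0, "B": 0.5, "C": 0.0}
pf_edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): -1.0}
pf_values = perturbation_factors(delta_e, pf_edges)
```

In this toy example A has two downstream genes, so only half of its perturbation is passed to each, while the inhibitory edge from B to C enters with a negative sign.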

Results

Ovarian cancer samples were clustered into 3 groups based on activated TFs

Using the k-modes method, the 427 ovarian cancer samples were classified into 3 sub-groups (with 100, 172, and 155 samples, respectively) based on the activated TFs. Each sub-group has a center sample, which we use to characterize the sub-group. In other words, the activated TFs in the center sample were used as the activated TFs of the sub-group.

For visualization, principal component analysis (PCA) was employed to reduce the 230 TFs to 2 dimensions (see Fig. 2). In one sub-group (Group 1), 14 TFs were activated: ELK1, FOXF2, NRF1, ETS2, NF.muE1, ADD1, TBP, SP1, GABP, E4F1, TELO2, MYC, YY1, NFE2L2A. Interestingly, these 14 TFs are also activated in the other two groups. Group 2 and Group 3 have 26 and 25 activated TFs, respectively. The additional TFs for Group 2 are: AR, ETS1, GABPB1, GFI1, HMG, LHX3, NKX6.2, PAX3, PDX1, PITX2, REST(NRSF), S8. The additional TFs for Group 3 are: ARNT_MAX, ETS1, FOXN1, FOXO4, GABPB1, LHX3, NFATC2, NFIL3_ATF2, NKX6.2, PAX3, SREBF1.

Fig. 2 Ovarian cancer samples are clustered into 3 groups based on the activated TFs

Up-stream signaling of activated TFs and related FDA approved drugs

The up-stream signaling networks of the activated TFs are shown in Figs. 3, 4 and 5. As can be seen, multiple important signaling pathways are uncovered, e.g., MYC, WNT, PDGFRA (RTK), PI3K, AKT, TP53, and MTOR. This result is consistent with the discoveries in the aforementioned references. There are 43 genes common to the 3 sub-groups. We calculated and ranked the perturbation factors of these 43 common genes. The top 5 genes are TBP, MMP9, MYC, MAPK1, and MTOR, which might play important roles in ovarian cancer. Interestingly, MMP9, MYC, MAPK1, and MTOR all belong to the “Proteoglycans in cancer” pathway. Thus RTK-PI3K-AKT-MTOR can be an important signaling cascade for ovarian cancer. In addition, MTOR activates TP53 through the cellular senescence pathway, while TP53 inhibits MTOR through IGF1/MTOR. Since TP53 is the most frequently altered gene in ovarian cancer, the signaling loop between TP53 and MTOR might be a potential target of novel synergistic drug combinations. Moreover, drug combinations targeting multiple up-stream signaling pathways and TFs could also act synergistically to disrupt the activated signaling of ovarian cancer sub-groups.

Fig. 3 FDA approved drugs targeting on up-stream signaling of activated TFs in Group 1. The color of green, blue, yellow, and red represents signaling starting genes, signaling transduction genes, TFs, and drugs respectively

Fig. 4 FDA approved drugs targeting on up-stream signaling of activated TFs in Group 2. The color of green, blue, yellow, and red represents signaling starting genes, signaling transduction genes, TFs, and drugs respectively

Fig. 5 FDA approved drugs targeting on up-stream signaling of activated TFs in Group 3. The color of green, blue, yellow, and red represents signaling starting genes, signaling transduction genes, TFs, and drugs respectively

To investigate drugs that can potentially perturb these up-stream signaling networks, we mapped the FDA approved drugs onto the signaling networks (see Figs. 3, 4 and 5). The target information was obtained from DrugBank (version 5.0.11) [23]. In total, 66 drugs (red nodes in Figs. 3, 4 and 5) acting on different targets were selected. A literature search showed that 44 of these drugs had been reported in the treatment of ovarian cancer (see Table 1). In addition to these single drugs, we investigated effective combinations of drugs in our list that were validated in the drug combination screening on the NCI-60 ovarian cancer cell lines (synergy is defined as a score higher than 8) (see Table 2). Moreover, we found that the top 10 drug targets of the synergistic drug combinations are EGFR, TUBB1, TUBA4A, TUBB, TOP2B, MTOR, TUBB3, CYP19A1, ESR1 and BCL2. TUBB1, TUBA4A, TUBB, TUBB3 and TOP2B are related to cell proliferation. CYP19A1 and ESR1 are related to estrogen. BCL2 is a member of the Bcl-2 family of regulator proteins that regulate cell death. EGFR and MTOR are in the PI3K-AKT pathway, and EGFR is one of the upstream regulators of MTOR signaling. The combination of MTOR inhibitors with EGFR, RTK, or PI3K signaling inhibitors might act synergistically to inhibit ovarian cancer development.

Table 1 FDA approved drugs targeting on upstream signaling of activated transcription factors (TFs)
Table 2 Validated synergistic drug combinations in NCI-60

Moreover, we investigated the differences in activated core signaling pathways among these 3 sub-groups. The unique TFs appearing in the core signaling pathways are PDX1, REST, and AR for Group 2, and FOXO4, SREBF1, and NFATC2 for Group 3. Among the upstream signaling genes, PML, LEF1, MAPK12, FGF22, JUP, AKT3, MAPK10, MAP2K1, and TCF7 are unique to Group 1. CLTA, AR, RPS6KB2, AP2S1, AKT1S1, FOXA2, PDPK1, HTT, MAP2K6, TNFRSF1A, TNF, GNB1, MAFA, TRAF2, REST, HHEX, EFNA4, MAP3K5, PDX1 and AP2M1 are unique to Group 2. FOXO4, PLCB4, NFATC2, IRS4, KRAS, PRKCI, PTPRF, ICOS, EGLN3, NOTUM, PPP3R1, SREBF1, EPHA2, EGFR, EGF and PRKCZ are unique to Group 3. For Group 1, MAPK10, AKT3, and FGF22 are in the MAPK and RAS signaling pathways, and PML, JUP, LEF1, and TCF7 are in signaling cascades linking to MYC. For Group 2, the signaling cascade from TNF to p38 is upstream of p53. For Group 3, many genes appear in the RAP1 signaling pathway. Among the drugs listed in Table 1, for example, Celecoxib, Chloroquine, Etanercept, Infliximab and Thalidomide target genes unique to Group 2, and Cetuximab, Dasatinib, Erlotinib, Gefitinib, Lidocaine, Necitumumab, Osimertinib, Panitumumab, Sucralfate, Tamoxifen, Vandetanib, and Vitamin C target genes unique to Group 3. This signaling diversity and heterogeneity may offer therapeutic targets for drug combination discovery.

Discussion

Ovarian cancer is the fifth leading cause of cancer-related death among women, and the 5-year survival rate is below 50%. Though a set of biomarkers and signaling pathways have been identified as associated with ovarian cancer, the functional consequences of these biomarkers and signaling pathways remain unclear. Moreover, there is a lack of effective targeted therapies for ovarian cancer, especially for platinum resistant ovarian cancer. In this study, we analyzed the gene expression data of ovarian cancer samples and normal ovarian tissues via network analysis. We aimed to systematically explore the activated signaling pathways of individual ovarian cancer patients and sub-groups, and to identify potential targets and drugs that are able to disrupt the core signaling pathways. The study still has several limitations. First, in addition to gene expression, mutation, methylation, and copy number variation data should be integrated in the network analysis to uncover the TFs and up-stream signaling. Second, the cross-talk among these up-stream signaling pathways, which might be responsible for drug resistance, was not investigated. In the future, we will also investigate the signaling network and TFs of platinum resistant ovarian cancer samples, and apply network-based drug repositioning approaches [66, 67] to reposition drugs [68, 69] and drug combinations [70] for ovarian cancer treatment.

Conclusions

The purpose of this study is to systematically uncover potentially activated core signaling pathways in ovarian cancer using integrative network analysis. We identified about 37 activated TFs from three sub-groups of ovarian cancer, as well as a set of up-stream signaling pathways linking to these TFs, e.g., the WNT, TP53, MYC, AKT, RAS, mTOR, and PDGFRA signaling pathways. In addition, 66 FDA approved drugs targeting the uncovered core signaling pathways were identified. Forty-four of these drugs had been reported in ovarian cancer related studies. Combinations of these drugs could potentially act synergistically to disrupt the cross-talk of multiple activated signaling pathways and TFs for better ovarian cancer therapy. These uncovered signaling networks, TFs and drugs can be used as reference resources to support biomedical studies in ovarian cancer.