Comprehensive analyses of mitophagy-related genes and mitophagy-related lncRNAs for patients with ovarian cancer

Zheng, Jianfeng; Jiang, Shan; Lin, Xuefen; Wang, Huihui; Liu, Li; Cai, Xintong; Sun, Yang

doi:10.1186/s12905-023-02864-5

Research
Open access
Published: 13 January 2024

Volume 24, article number 37, (2024)

BMC Women's Health

Abstract

Background

Both mitophagy and long non-coding RNAs (lncRNAs) play crucial roles in ovarian cancer (OC). We sought to explore the characteristics of mitophagy-related genes (MRGs) and mitophagy-related lncRNAs (MRLs) to facilitate the treatment and prognosis of OC.

Methods

The processed data were extracted from public databases (TCGA, GTEx, GEO and GeneCards). Highly synergistic lncRNA modules and MRLs were identified using weighted gene co-expression network analysis. Using LASSO Cox regression analysis, the MRL-model was first established based on TCGA and then validated with four external GEO datasets. The independent prognostic value of the MRL-model was evaluated by multivariate Cox regression analysis. Functional pathways, somatic mutations, immunity features, and anti-tumor therapy related to the MRL-model were evaluated using multiple algorithms, including GSEA, ssGSEA, GSVA, maftools, CIBERSORT, xCELL, MCPcounter, ESTIMATE, TIDE, and pRRophetic.

Results

We found 52 differentially expressed MRGs and 22 prognostic MRGs in OC. Enrichment analysis revealed that the MRGs were involved in mitophagy. Nine prognostic MRLs were identified, and an optimal combination of eight MRLs was screened to establish the MRL-model. The MRL-model stratified patients into high- and low-risk groups and remained a prognostic factor (P < 0.05) with independent value (P < 0.05) in TCGA and GEO. OC patients in the high-risk group also had unfavorable survival when clinicopathological parameters were taken into account. A nomogram was plotted to make the prediction results more intuitive and readable. The two risk groups differed in enriched functional pathways (such as the Wnt signaling pathway) and immunity features. In addition, patients in the low-risk group may be more sensitive to immunotherapy (P = 0.01). Several chemotherapeutic drugs (Paclitaxel, Veliparib, Rucaparib, Axitinib, Linsitinib, Saracatinib, Motesanib, Ponatinib, and Imatinib, among others) showed differing sensitivity between the two risk groups. The established ceRNA network indicated the underlying mechanisms of MRLs.

Conclusions

Our study revealed the roles of MRLs and the MRL-model in the expression, prognosis, chemotherapy, immunotherapy, and molecular mechanisms of OC. The MRL-model stratified OC patients by risk, prognosis, and treatment sensitivity, which may help improve clinical outcomes for OC patients.

Background

Ovarian cancer (OC) is the most lethal cancer of the female reproductive system, owing to the lack of effective early-stage screening and to chemotherapy resistance as the tumor progresses [1, 2]. The preferred treatment for OC is surgery combined with paclitaxel and platinum chemotherapy, which prolongs the survival of OC patients [2]. Nevertheless, the survival rate of patients with advanced-stage OC remains low, posing a serious threat to women’s lives [1]. Therefore, predicting individual prognosis for OC is important for both patients and gynecologic oncologists.

Cells can selectively remove incomplete or damaged mitochondria through autophagy, a process called mitophagy [3]. Mitophagy maintains the integrity of mitochondrial function, which may help delay aging and treat disease [3, 4]. In recent years, mitophagy has been found to contribute to OC progression [5, 6]. The regulatory mechanism of mitophagy in OC progression may involve tumor-associated macrophages [7] and cell stemness [8]. Mitophagy is also involved in the anticancer activity of drugs in OC, such as platinum [4, 9,10,11,12,13,14], EGFR tyrosine kinase inhibitors [15], Janus kinases 1/2 inhibitor [16], pardaxin [17], nanomedicine [5], and epoxycytochalasin H [18]. Despite studies investigating the role and mechanism of mitophagy in OC, translating mitophagy into clinical applications remains challenging because of the lack of targetable biomarker combinations.

Long non-coding RNAs (lncRNAs) are RNA transcripts longer than 200 nucleotides that have no protein-coding potential [19], and the number of lncRNAs significantly exceeds that of protein-coding genes [19]. Although the functions of lncRNAs in tumorigenesis have been confirmed [19] and our earlier study demonstrated that lncRNAs can regulate autophagy in OC [20], little is known about their role in regulating mitochondrial function, and the mechanism by which lncRNAs regulate mitophagy remains unexplored.

Because of the small size of the ovaries and their hidden location in the female pelvic cavity, early diagnosis of OC is extremely challenging [1]. Currently, the most commonly used tumor markers for OC screening in clinical practice are Carbohydrate Antigen 125 (CA125) [21] and Human Epididymis Protein 4 (HE4) [22]. Given that benign diseases can also elevate these serum biomarkers, the diagnostic specificity and sensitivity of serum CA125 or HE4 alone are not high [23]. Existing studies have attempted to establish prognostic models for patients with OC based on clinicopathologic characteristics. For instance, the Risk of Ovarian Malignancy Algorithm (ROMA) model incorporated both serum CA125 and HE4. Nevertheless, the model did not fully address the challenge of detecting high-risk OC [23].

A growing number of studies show that gene expression profiles can be used to identify important prognostic genes in various types of cancer and to build prognosis-related molecular models [24, 25]. Driven by high-throughput technologies and data sharing, cancer research has entered the era of big data, with large-scale multi-omics data accumulated in The Cancer Genome Atlas (TCGA) [26] and Gene Expression Omnibus (GEO) databases [27]. Bioinformatics is an emerging interdisciplinary field for analyzing biological information [28] that relies on computational tools (mainly R packages) [29]. Applying bioinformatics to the big data in TCGA and GEO allows us to evaluate the predictive value of mitophagy-related lncRNA (MRL) combinations for OC patients.

R packages can be used for data mining and statistical analysis [30]. Herein, we mainly used R packages to carry out comprehensive analyses of mitophagy-related genes (MRGs) and MRLs for patients with OC. Using weighted gene co-expression network analysis (WGCNA) and least absolute shrinkage and selection operator (LASSO) Cox regression analysis, we comprehensively analyzed the landscape of MRGs and MRLs. A reliable MRL-model to predict overall survival (OS) and guide therapeutic strategies was constructed. Our data showed that the MRL-model was associated with immunity characteristics, tumor mutational burden (TMB), immunotherapy, and chemotherapeutic drug sensitivity.

Methods

Data collection

The processed data were extracted from UCSC-Xena (https://xenabrowser.net/datapages/) [31]. Ensembl gene IDs were converted into gene symbols based on gene annotation information in GENCODE [32]. Low-expression mRNAs and lncRNAs were filtered out. Collectively, 417 OC samples with expression profiles and prognostic information from TCGA were included. In addition, 88 normal ovarian tissues from GTEx were obtained for identification of differentially expressed genes. We also retrieved four OC datasets that had lncRNA expression profiles and prognostic information from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) [27], comprising 268 OC cases. We selected datasets from the GPL570 Affymetrix Human Genome U133 Plus 2.0 Array to annotate as many lncRNAs as possible. MRGs were screened from GeneCards (https://www.genecards.org) [33] based on their relevance score. Furthermore, somatic mutation data in Mutation Annotation Format (MAF) were analyzed using the “maftools” package (Version 2.16.0) [34].

Differentially expressed genes screening

Differentially expressed MRGs and lncRNAs were screened using the “limma” package (Version 3.10.3) [36], which fits linear models and applies an empirical Bayes method [35] to shrink the estimated variances toward a common value. The Benjamini-Hochberg procedure was used for multiple-testing correction to control the false discovery rate (FDR) [37]. The threshold for differentially expressed genes was set as adjusted P < 0.05 and |logFC| > 0.5.
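The screening rule can be illustrated with a minimal Python sketch. It assumes per-gene P values and logFC values have already been produced by a tool such as limma; it reimplements only the Benjamini-Hochberg adjustment and the threshold filter, not the moderated statistics. The function names and toy gene labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, alpha=0.05, lfc_cutoff=0.5):
    """Keep genes with adjusted P < alpha and |logFC| > lfc_cutoff."""
    adjusted = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log_fc, adjusted)
        if q < alpha and abs(lfc) > lfc_cutoff
    ]


# Toy input (hypothetical values, not taken from the study).
genes = ["geneA", "geneB", "geneC", "geneD"]
log_fc = [1.2, 0.8, 0.3, -0.9]
p_values = [0.01, 0.04, 0.03, 0.005]
selected = select_degs(genes, log_fc, p_values)
```

In this toy input, geneC passes the adjusted P threshold but is dropped because its |logFC| is below 0.5, which shows that both conditions must hold.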

Prognostic genes screening

The “survminer” package (Version 0.4.3) was used to determine the optimal cut-point based on gene expression, survival time and survival status. Prognostic genes were screened based on Kaplan-Meier (K-M) curves and the log-rank test.
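As a rough illustration of this step, the Python sketch below implements a two-group log-rank test and a simple scan over candidate expression cut-points. It is a simplification under stated assumptions: survminer relies on maximally selected rank statistics, whereas this sketch just picks the cut-point with the largest log-rank chi-square, so the resulting P value is optimistic. All function names and toy data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value) with 1 degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def best_cutpoint(expression, times, events, min_group=2):
    """Scan cut-points and return (cut, chi2, p) with the largest chi2, or None."""
    best = None
    for cut in sorted(set(expression))[:-1]:
        high = [i for i, x in enumerate(expression) if x > cut]
        low = [i for i, x in enumerate(expression) if x <= cut]
        if len(high) < min_group or len(low) < min_group:
            continue
        chi2, p = logrank_test(
            [times[i] for i in high], [events[i] for i in high],
            [times[i] for i in low], [events[i] for i in low],
        )
        if best is None or chi2 > best[1]:
            best = (cut, chi2, p)
    return best


# Toy data (hypothetical): high expression goes with short survival.
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
times = [10, 11, 12, 13, 14, 1, 2, 3, 4, 5]
events = [1] * 10
result = best_cutpoint(expression, times, events)
```

Because the cut-point is chosen to maximize separation, a dedicated method that accounts for this selection is preferable for reporting P values.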

MRLs screening based on WGCNA

We used the “WGCNA” package [38] (Version 1.61) to analyze the expression matrix of lncRNAs and identify highly synergistic lncRNA modules. Firstly, for a series of candidate power values, we calculated the squared correlation coefficient between log(k) and log(p(k)), where k is the connectivity, together with the average connectivity under each power value. The first power value whose squared correlation coefficient exceeded 0.85 was selected. Secondly, based on dynamic tree cutting and clustering methods, we aggregated highly correlated lncRNAs into modules (correlation coefficient > 0.8). Finally, the correlation between modules and the prognostic MRGs was calculated, and the lncRNA modules associated with multiple MRGs were identified. We defined the modules with the strongest positive or negative correlation with multiple MRGs as the key modules, and the lncRNAs in these modules as MRLs.
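The power-selection rule can be sketched in Python as follows. The sketch assumes the scale-free fit values have already been computed for each candidate power (for example by WGCNA), and it assumes an unsigned network in which the adjacency is the absolute correlation raised to the chosen power; the text does not state the network type. Function names and fit values are hypothetical.

```python
def pick_soft_power(fit_by_power, r2_cutoff=0.85):
    """Return the smallest power whose scale-free fit R^2 reaches the cutoff."""
    for power in sorted(fit_by_power):
        if fit_by_power[power] >= r2_cutoff:
            return power
    return None  # no candidate power reached the cutoff


def soft_adjacency(correlation, power):
    """Unsigned soft-threshold adjacency (assumed network type): |cor| ** power."""
    return abs(correlation) ** power


# Hypothetical fit values for candidate powers 1 to 5.
fit_by_power = {1: 0.20, 2: 0.50, 3: 0.86, 4: 0.83, 5: 0.90}
chosen_power = pick_soft_power(fit_by_power)
```

Taking the first power that reaches the cutoff, rather than the best-fitting one, keeps the average connectivity as high as possible while still approximating a scale-free topology.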

Establishment of the MRL-model

After obtaining prognostic MRLs, we applied LASSO Cox regression analysis, a high-dimensional regression method implemented in the “glmnet” R package (Version 2.0–18), to screen the combination of prognostic MRLs. LASSO applies a penalty that shrinks the regression coefficients, here tuned with 20-fold cross-validation, thus addressing multicollinearity [39]. The regression coefficient and expression level of each MRL were used to calculate the risk score and construct the MRL-model as follows:

$$Risk\ Score=\sum {\beta}_{lncRNA}\times {Exp}_{lncRNA}$$

Herein, β_lncRNA was the LASSO regression coefficient of the MRL, and Exp_lncRNA represented the expression value of MRL. The highly correlated MRLs were excluded to prevent the MRL-model from overfitting.
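A minimal Python sketch of the risk score formula is given below. The lncRNA names, coefficients, and expression values are hypothetical placeholders, not the fitted model from this study.

```python
def risk_score(expression, coefficients):
    """Sum of coefficient times expression over the lncRNAs in the model."""
    return sum(beta * expression[name] for name, beta in coefficients.items())


# Hypothetical LASSO coefficients for two placeholder lncRNAs.
coefficients = {"lncA": 0.5, "lncB": -0.2}

# Hypothetical expression values for one patient; lncC is not in the model
# and is therefore ignored.
patient_expression = {"lncA": 2.0, "lncB": 1.0, "lncC": 9.0}

score = risk_score(patient_expression, coefficients)
```

Only lncRNAs with a non-zero coefficient contribute, so the same coefficient table can be applied unchanged to any dataset that measures those lncRNAs.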

Validation of the MRL-model

We included four external datasets that had lncRNA expression profiles and prognostic information to validate the model: GSE19829 (28 OC samples), GSE26193 (107 OC samples), GSE30161 (58 OC samples), and GSE63885 (75 OC samples). The batch effects of the four external datasets were removed with the “sva” R package (Version 3.48.0) [40]. The β_lncRNA values were first generated from the TCGA training dataset, and the risk scores of the GEO validation datasets were calculated with the formula described above. Patients in the TCGA training and GEO validation datasets were divided into a high-risk group (risk score above the threshold) or a low-risk group (risk score below the threshold), with the median risk score as the threshold. K-M curves were used to evaluate the survival outcomes of the risk groups in the TCGA training and GEO validation datasets, thus validating the effectiveness of the model in predicting prognosis.
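The median split can be sketched in Python as follows. The patient identifiers and scores are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    threshold = median(scores.values())
    # Assumption: a score exactly equal to the median is labeled 'low'.
    groups = {
        patient: ("high" if value > threshold else "low")
        for patient, value in scores.items()
    }
    return threshold, groups


# Hypothetical risk scores for four patients.
scores = {"p1": 0.1, "p2": 0.4, "p3": 0.9, "p4": 1.5}
threshold, groups = split_by_median(scores)
```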

Establishment of the nomogram based on MRL-model

We conducted univariate Cox regression analysis to assess the prognostic value of the MRL-model and clinicopathological parameters. Multivariate Cox regression analysis was then performed to evaluate and validate their independent prognostic value in the TCGA training and GEO validation datasets. Subsequently, the “rms” package (Version 6.7.0) was used to establish the nomogram based on the MRL-model and clinicopathological parameters [41]. The nomogram was validated by discrimination and calibration, using bootstrap resampling (B = 1000) to correct for optimism, to describe the relationship between the actual and the predicted OS probability and thus evaluate the consistency of the MRL-model. The closer the calibration curve is to the 45° diagonal, the better the prediction ability.

Quantitative real-time PCR

A total of 30 OC and 10 normal tissues were collected after approval by the Ethics Committee of Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital. The samples were pathologically confirmed as OC or normal ovarian tissues. Quantitative real-time PCR analysis (SuperReal PreMix Plus from Tiangen Biotech, Beijing, China) was carried out after total RNA extraction (TRNzol Universal Reagent from Tiangen Biotech, Beijing, China) and reverse transcription (FastKing gDNA Dispelling RT SuperMix from Tiangen Biotech, Beijing, China). The lncRNA sequences were obtained from LNCipedia (https://lncipedia.org/) [42]. The primers for the lncRNAs were designed and provided by Sangon Biotech (Shanghai, China).

Analysis of functional pathways

The protein-protein interaction (PPI) network was established using STRING (https://string-db.org/) [43] and Cytoscape (Version 3.4.0) [44]. Gene set enrichment analysis (GSEA) was performed for the high-risk group versus the low-risk group using “GSEA” (Version 4.3.2) [45]. The background gene sets were the pathway sets in the Molecular Signatures Database (MSigDB) [46].

Analysis of immunity features

The carcinogenesis of OC is strongly correlated with the immune microenvironment [47]. Using single-sample gene set enrichment analysis (ssGSEA), we calculated enrichment scores for 28 immune cell types with the gene set variation analysis package (GSVA, Version 1.48.3) to indicate the relative abundance of each tumor microenvironment-infiltrating cell type [48]. In addition, three algorithms, CIBERSORT (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts, Version 0.1.0) [47], xCELL (Version 1.1.0) [49], and MCPcounter (Microenvironment Cell Populations-counter, Version 1.2.0) [50], were used to characterize the cellular composition of complex tissues according to the corresponding literature. Further, we estimated immune and stromal scores using the ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data) algorithm (Version 1.1.7) to indicate the presence of stromal and immune cells [51].

Analysis of therapy

We predicted potential responses to immune checkpoint blockade (ICB) using the Tumor Immune Dysfunction and Exclusion (TIDE) tool (http://tide.dfci.harvard.edu/) [52]. By comparing the gene expression profiles of OC with an immunotherapy dataset, we assessed the difference in immunotherapy response between the two risk groups using submap, and the P value was Bonferroni corrected [53]. Drug response data for chemotherapy drugs were extracted from the Genomics of Drug Sensitivity in Cancer (GDSC) database (https://www.cancerrxgene.org/) [54], and we used the “pRRophetic” package (Version 0.5) [55] to build a ridge regression model from cell line expression profiles and OC gene expression profiles to estimate the IC50 levels of drugs.

Construction of ceRNA network

The Pearson correlation coefficient between mRNAs and lncRNAs was calculated (correlation coefficient > 0.2), and the FDR value (FDR < 0.05) was obtained with Benjamini-Hochberg correction. The local software miranda (Version 3.3a) [56] was used to screen the lncRNA-mRNA pairs (Score ≥ 140 and Energy ≤ −20). We used miRWalk3.0 (http://mirwalk.umm.uni-heidelberg.de/search_genes/) [57] to obtain experimentally verified miRNA-mRNA pairs. Further, lncRNAs and mRNAs that were regulated by the same miRNA and positively co-expressed were screened to establish the ceRNA (competing endogenous RNA) network. We used Cytoscape software (Version 3.4.0) for network graph construction [44]. The degree centrality of the network nodes was analysed using the CytoNCA plug-in (Version 2.1.6) [58].
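The final filtering step can be sketched in Python as follows. The sketch assumes the prediction steps have already produced two lists of pairs, lncRNA-miRNA and miRNA-mRNA, and that a lncRNA and an mRNA form a ceRNA triplet when they share a miRNA and their expression is positively correlated above 0.2. The FDR filter is omitted for brevity, and all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / (sd_x * sd_y)


def cerna_triplets(lnc_mirna_pairs, mirna_mrna_pairs, expression, min_cor=0.2):
    """Return (lncRNA, miRNA, mRNA) triplets sharing a miRNA with cor > min_cor."""
    triplets = []
    for lnc, mirna in lnc_mirna_pairs:
        for other_mirna, mrna in mirna_mrna_pairs:
            if mirna != other_mirna:
                continue
            if pearson(expression[lnc], expression[mrna]) > min_cor:
                triplets.append((lnc, mirna, mrna))
    return triplets


# Hypothetical toy inputs.
expression = {
    "lnc1": [1, 2, 3, 4],
    "geneA": [2, 4, 5, 9],   # positively correlated with lnc1
    "geneB": [9, 5, 4, 1],   # negatively correlated with lnc1
}
lnc_mirna_pairs = [("lnc1", "miR-x")]
mirna_mrna_pairs = [("miR-x", "geneA"), ("miR-x", "geneB"), ("miR-y", "geneA")]
network = cerna_triplets(lnc_mirna_pairs, mirna_mrna_pairs, expression)
```

Requiring positive co-expression reflects the ceRNA hypothesis, in which a lncRNA that sequesters a miRNA relieves repression of mRNAs targeted by the same miRNA.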

Statistical analysis

Statistical analysis and graph visualization were performed using the R programming language [59, 60] or GraphPad Prism. The software, packages and their versions used for statistical analysis are listed in Supplementary Table S1. Genes with prognostic value were identified based on the hazard ratio (HR) and 95% confidence interval (CI). K-M curves and the log-rank test were applied to compare survival outcomes between two subgroups. Univariate and multivariate Cox analyses were conducted to determine independent prognostic value. The Wilcoxon test was used to compare immune characteristics or drug sensitivity between two groups. A two-tailed P value lower than 0.05 was considered statistically significant.

Results

The research flowchart was plotted to summarize the main design of our study (Fig. 1).

Differentially expressed and prognostic genes screening

Compared with normal tissues, 52 MRGs were differentially expressed in OC (adjusted P < 0.05 and |logFC| > 0.5) (Fig. 2A). Through prognostic analysis, we found that 22 of the 52 MRGs were significantly correlated with prognosis (Fig. 2B). Among the 22 prognostic MRGs, four were correlated with a favorable prognosis (HR < 1): E2F1, MAPK8, MTX1, and UBE2L3. In contrast, the remaining 18 MRGs were associated with a poor prognosis (HR > 1): BCL2L1, BECN1, CSNK2A1, CSNK2A2, FOXO3, GABARAPL1, MAP1LC3A, MFN2, NBR1, PINK1, RAB7A, SNCA, TBC1D15, TBK1, TFE3, TIGAR, USP30, and VPS13D. Box plots showed the expression differences of these 22 prognostic MRGs between OC and normal tissues (Fig. 2C).

To further examine the relationship between the 22 prognostic MRGs and clinicopathological parameters, box plots for each MRG were drawn across clinical groups. We found that TBC1D15 (P < 0.05), UBE2L3 (P < 0.05), VPS13D (P < 0.05), TFE3 (P < 0.01), NBR1 (P < 0.01), MFN2 (P < 0.01), PINK1 (P < 0.05), USP30 (P < 0.05), and CSNK2A1 (P < 0.01) were associated with the stage of OC (Supplementary Fig. S1A). Most of the MRGs did not differ significantly across the other clinical factors, except SNCA (P < 0.05) and E2F1 (P < 0.05) for Grade (Supplementary Fig. S1B), CSNK2A2 (P = 0.02) for Age (Supplementary Fig. S1C), and TIGAR (P < 0.05) and MTX1 (P < 0.05) for Macroscopic disease (Supplementary Fig. S1D).

In summary, the differentially expressed and prognostic MRGs may serve as markers to distinguish cancer from non-cancer tissues and to distinguish clinical stages. These MRGs are expected to be involved in OC progression and deserve further study.

The interactions among MRGs

Gene Ontology (GO) enrichment analysis revealed that the MRGs were enriched in mitophagy, mitochondrion disassembly, organelle disassembly, macroautophagy, cellular component disassembly, and regulation of mitochondrion organization, among other terms (Fig. 3A), suggesting that these MRGs were indeed involved in mitophagy and supporting their biological relevance for wet-lab experiments. Interestingly, the correlations among the expression levels of the 22 prognostic MRGs were mostly positive, and CSNK2A2 and NBR1 (P < 0.05, Cor = 0.85) were the most positively correlated gene pair (Fig. 3B), which further hinted at their similarity in biological function. To further explore the interactions of these 22 MRGs, PPI analysis was performed (Fig. 3C). Ranking by degree in the PPI network showed that BECN1, GABARAPL1, PINK1, SNCA, MAP1LC3A, MFN2, NBR1, FOXO3, RAB7A, and BCL2L1 were the top-ranked hub genes (Supplementary Fig. S2), indicating that these MRGs play a more prominent role in mitophagy in OC. Genetic mutations in the majority of MRGs were not detected in OC samples, except for TP53, HUWE1, and VPS13C (Supplementary Fig. S3); hence most MRGs are wild-type in OC.