Abstract
Cancer is a complex genetic disease that arises from the accumulation of mutations during somatic cell evolution. With the advent of high-throughput technology, large amounts of omics data have been generated, and identifying cancer-related driver genes from these data remains a challenge. Early on, researchers developed many frequency-based driver gene identification methods, but these methods cannot reliably identify driver genes with low mutation rates. Researchers later developed network-based methods that fuse multi-omics data, but these methods rarely consider the connections among features. In this paper, after analyzing a large number of methods for integrating multi-omics data, a hierarchical weak consensus model that fuses multiple features according to the connections among them is proposed. By analyzing the connection between the protein-protein interaction (PPI) network and the co-mutation hypergraph network, this paper first proposes a new topological feature, called the co-mutation clustering coefficient (CMCC). A hierarchical weak consensus model is then used to integrate CMCC with mRNA and miRNA differential expression scores, yielding a new driver gene identification method, HWC. HWC is compared with seven current state-of-the-art methods on three types of cancer. The comparison results show that HWC achieves the best identification performance in terms of statistical evaluation indices, functional consistency, and the partial area under the ROC curve.
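The abstract names the partial area under the ROC curve as one of the evaluation criteria. As a rough illustration only, the sketch below shows how such a score could be computed for a ranked gene list. It assumes a benchmark set of known driver genes and a false positive rate cutoff `max_fpr`; the function name `partial_auc`, the cutoff values, and the toy gene list are hypothetical and not taken from the paper. Ties in the ranking are ignored, and the area is left unnormalized.

```python
def partial_auc(ranked_genes, known_drivers, max_fpr=0.1):
    """Unnormalized area under the ROC curve for FPR in [0, max_fpr].

    ranked_genes:  genes ordered from most to least likely driver (assumed input).
    known_drivers: benchmark set treated as positives (assumed input).
    """
    positives = sum(1 for g in ranked_genes if g in known_drivers)
    negatives = len(ranked_genes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative gene")

    tp = 0      # benchmark genes seen so far
    fp = 0      # non-benchmark genes seen so far
    area = 0.0
    for gene in ranked_genes:
        if gene in known_drivers:
            tp += 1          # vertical ROC step: adds no area
            continue
        prev_fpr = fp / negatives
        fp += 1
        fpr = fp / negatives
        # Horizontal ROC step, clipped at the FPR cutoff.
        width = min(fpr, max_fpr) - prev_fpr
        if width <= 0:
            break            # cutoff already reached
        area += width * (tp / positives)
    return area


# Toy ranking: two benchmark genes among six candidates (illustrative only).
ranked = ["TP53", "G1", "PIK3CA", "G2", "G3", "G4"]
drivers = {"TP53", "PIK3CA"}
print(partial_auc(ranked, drivers, max_fpr=1.0))  # full AUC: 0.875
print(partial_auc(ranked, drivers, max_fpr=0.5))
```

Restricting the area to a low false positive rate range emphasizes the top of the ranking, which is where candidate driver genes would typically be selected for follow-up. Library implementations may rescale the result; scikit-learn's `roc_auc_score`, for example, returns a standardized partial AUC when `max_fpr` is set, so its values are not directly comparable with this unnormalized sketch.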
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig9_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig10_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13755-024-00279-6/MediaObjects/13755_2024_279_Fig11_HTML.png)
Data availability
The source code can be obtained at https://github.com/Mrhuhappy/HWC.git.
Funding
This research is supported by the National Natural Science Foundation of China (No. 61972185, No. 62141207, No. 62302107, No. 62366007), the Guangxi Natural Science Foundation (No. 2022GXNSFAA035625), the Natural Science Foundation of Yunnan Province of China (No. 2019FA024), the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (No. 20-A-01-03, 19-A-03-01), the Guangxi Normal University Science Research Project (Natural Science) (No. 2021JC008), the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, and the Innovation Project of Guangxi Graduate Education (YCSW2023180).
Ethics declarations
Conflict of interest
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
About this article
Cite this article
Li, G., Hu, Z., Luo, X. et al. Identification of cancer driver genes based on hierarchical weak consensus model. Health Inf Sci Syst 12, 21 (2024). https://doi.org/10.1007/s13755-024-00279-6
DOI: https://doi.org/10.1007/s13755-024-00279-6