Abstract
Biomarkers, also known as biological markers, are substances such as transcripts, deoxyribonucleic acid (DNA), genes, proteins, and metabolites that indicate whether a biological activity is normal or abnormal. Markers play an essential role in the diagnosis and prognosis of diseases such as cancer, diabetes, and Alzheimer’s. In recent years, an enormous amount of omics data, including genomics, proteomics, transcriptomics, metabolomics, and interactomics data, has become available in healthcare, helping researchers find the markers or signatures needed for disease diagnosis and prognosis and identify the best potential course of therapy. Furthermore, integrative omics data, often known as multi-omics data, are also proliferating in biomarker analysis. Consequently, various computational methods in healthcare engineering, including machine learning (ML) and deep learning (DL), have emerged to identify markers from complex multi-omics data. This study examines the current state of the art and the computational methods, including feature selection strategies, ML and DL approaches, and accessible tools, used to uncover markers in single-omics and multi-omics data. The underlying challenges, recurring problems, limitations of computational techniques, and future directions in biomarker research are also discussed.
Funding
The authors have no funding to report.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Ethical Approval
The authors declare that this article complies with ethical standards.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dhillon, A., Singh, A. & Bhalla, V.K. A Systematic Review on Biomarker Identification for Cancer Diagnosis and Prognosis in Multi-omics: From Computational Needs to Machine Learning and Deep Learning. Arch Computat Methods Eng 30, 917–949 (2023). https://doi.org/10.1007/s11831-022-09821-9
DOI: https://doi.org/10.1007/s11831-022-09821-9