A Systematic Review on Biomarker Identification for Cancer Diagnosis and Prognosis in Multi-omics: From Computational Needs to Machine Learning and Deep Learning

  • Survey article
Archives of Computational Methods in Engineering

Abstract

Biomarkers, also known as biological markers, are substances such as transcripts, deoxyribonucleic acid (DNA), genes, proteins, and metabolites that indicate whether a biological activity is normal or abnormal. Markers play an essential role in the diagnosis and prognosis of diseases such as cancer, diabetes, and Alzheimer’s. In recent years, an enormous amount of omics data, including genomics, proteomics, transcriptomics, metabolomics, and interactomics data, has become available in healthcare, helping researchers find the markers or signatures needed for disease diagnosis and prognosis and identify the best potential course of therapy. Furthermore, integrative omics data, often known as multi-omics data, are also proliferating in biomarker analysis. Consequently, various computational methods in healthcare engineering, including machine learning (ML) and deep learning (DL), have emerged to identify markers from complex multi-omics data. This study examines the current state of the art and computational methods, including feature selection strategies, ML and DL approaches, and accessible tools, for uncovering markers in single- and multi-omics data. The underlying challenges, recurring problems, limitations of computational techniques, and future directions in biomarker research are also discussed.


    Article  Google Scholar 

  146. Mohammed A, Biegert G, Adamec J, Helikar T (2018) CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data. Oncotarget 9(2):2565–2573. https://doi.org/10.18632/oncotarget.23511

    Article  Google Scholar 

  147. Chong J, Soufan O, Li C et al (2018) MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Res 46(W1):W486–W494. https://doi.org/10.1093/nar/gky310

    Article  Google Scholar 

  148. Zeng D, Ye Z, Yu G et al (2020) IOBR: multi-omics immuno-oncology biological research to decode tumor microenvironment and signatures. bioRxiv. https://doi.org/10.1101/2020.12.14.422647

    Article  Google Scholar 

  149. Liu CJ, Hu FF, Xia MX et al (2018) GSCALite: a web server for gene set cancer analysis. Bioinformatics 34(21):3771–3772. https://doi.org/10.1093/bioinformatics/bty411

    Article  Google Scholar 

  150. Dong H, Wang Q, Zhang G et al (2020) OSdlbcl: an online consensus survival analysis web server based on gene expression profiles of diffuse large B-cell lymphoma. Cancer Med 9(5):1790–1797. https://doi.org/10.1002/cam4.2829

    Article  Google Scholar 

  151. Gill S, Xu M, Ottaviani C et al (2022) AI for next generation computing: emerging trends and future directions. Internet Things 19:100514. https://doi.org/10.1016/j.iot.2022.100514

    Article  Google Scholar 

Download references

Funding

The authors have no funding to report.

Author information

Corresponding author

Correspondence to Arwinder Dhillon.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethical Approval

The author declares that this article complies with ethical standards.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dhillon, A., Singh, A. & Bhalla, V.K. A Systematic Review on Biomarker Identification for Cancer Diagnosis and Prognosis in Multi-omics: From Computational Needs to Machine Learning and Deep Learning. Arch Computat Methods Eng 30, 917–949 (2023). https://doi.org/10.1007/s11831-022-09821-9

  • DOI: https://doi.org/10.1007/s11831-022-09821-9
