A Review on Machine Learning Aided Multi-omics Data Integration Techniques for Healthcare

Bansal, Hina; Luthra, Hiya; Raghuram, Shree R.

doi:10.1007/978-3-031-38325-0_10

Part of the book series: Studies in Big Data (SBD, volume 132)

Abstract

To understand the mechanisms of biological processes in the human body, it is necessary to look at their various regulatory aspects, such as DNA methylation and post-translational modifications (PTMs) of histones. These characteristics are all susceptible to disease-induced alterations in cell signalling and phenotypes. Because many illnesses result from complex processes, a multi-omics approach is needed: each of these traits and their interactions must be examined to gain insight into the causes of disease. Investigating multi-omics data is therefore a crucial aspect of molecular-level healthcare research and has yielded cutting-edge discoveries. High-throughput technologies are becoming more widely available, which has increased the amount of omics data being produced. These omics data include genomics, epigenomics, transcriptomics and proteomics, which represent distinct but complementary biological layers. By making it possible to thoroughly examine biological systems and the molecular underpinnings of disease development, these data have changed healthcare research. There is a strong trend toward incorporating multi-omics analysis into healthcare research to explain the intricate interactions across molecular levels, even though integrating multi-omics data and translating it into relevant functional insights remains a significant barrier. Multi-omics data can help improve prevention, early detection and prediction, monitor history, interpret patterns and design personalised treatment. Various machine learning algorithms, grouped under supervised and unsupervised learning techniques, have been used to integrate data across omics levels. This multi-omics analysis has applications in deciphering the causes of many diseases, such as cancer, and has thus been a step towards personalised medicine, tailoring the right medication to the right person.
Hence, considerable attention is given to establishing machine learning algorithms for the automatic integration of multi-omics data. With these data, machine learning algorithms can be employed to produce diagnostic and classification biomarkers, offering fresh information. However, most of the biomarkers identified so far consider only one omics layer at a time and do not make proper use of a multi-omics research strategy, which can adequately capture the complexity of biological systems. Multi-omics data integration strategies must incorporate the complementary knowledge that each omics layer contributes. As a result, it is advisable to support the development of novel machine learning methods. This chapter outlines the roadmap for multi-omics integration with machine learning, various integration methods, challenges, and future aspects.

Author information

Authors and Affiliations

Centre for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, 201303, Uttar Pradesh, India
Hina Bansal, Hiya Luthra & Shree R. Raghuram

Corresponding author

Correspondence to Hina Bansal.

Editor information

Editors and Affiliations

División Multidisciplinaria de Ciudad Universitaria, Universidad Autónoma de Ciudad Juárez, Chihuahua, Mexico
Gilberto Rivera
Tecnológico Nacional de México/Instituto Tecnológico de Ciudad Madero, Ciudad Madero, Tamaulipas, Mexico
Laura Cruz-Reyes
School of Engineering, Universidad de Cádiz, Cadiz, Spain
Bernabé Dorronsoro
Universidad Tecnológica de La Habana “José Antonio Echeverría”, La Habana, Cuba
Alejandro Rosete

About this chapter

Cite this chapter

Bansal, H., Luthra, H., Raghuram, S.R. (2023). A Review on Machine Learning Aided Multi-omics Data Integration Techniques for Healthcare. In: Rivera, G., Cruz-Reyes, L., Dorronsoro, B., Rosete, A. (eds) Data Analytics and Computational Intelligence: Novel Models, Algorithms and Applications. Studies in Big Data, vol 132. Springer, Cham. https://doi.org/10.1007/978-3-031-38325-0_10

DOI: https://doi.org/10.1007/978-3-031-38325-0_10
Published: 13 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38324-3
Online ISBN: 978-3-031-38325-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
