Abstract
Cancer is a complex disease characterized by molecular heterogeneity and by the involvement of several cellular mechanisms throughout its evolution and pathogenesis. Despite great efforts to untangle these mechanisms, cancer pathophysiology remains far from clear. So far, panels of biomarkers have been reported from high-throughput data generated on different platforms. These biomarkers mostly focus on a single type of molecule, such as transcripts or proteins, mainly because of the apparent heterogeneity of the output data produced by techniques specific to each molecular type. Hence, there is a major need to understand how these molecules interact and complement each other in order to explain the deregulated processes involved. The breadth of available large-scale data, together with the lack of in-depth analysis of publicly available data, has raised concerns and created opportunities for new strategies to analyze “Big data” more comprehensively. Here, a new protocol for integrative analysis based on a systems biology approach is described. The approach relies on groups of datasets from published studies, compared within the groups originally described and organized in a designated format that allows integration and cross-comparison among different studies and platforms. It follows an unbiased, hypothesis-free methodology that facilitates the identification of commonalities among different dataset sources and, ultimately, the mapping and characterization of specific molecular pathways using significantly deregulated molecules. This in turn will generate novel insights into the mechanisms deregulated in complex diseases such as cancer.
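The abstract does not specify the designated format or the significance thresholds used in the protocol, so the following is only an illustrative sketch of the general idea of cross-comparing studies and platforms. It assumes each comparison has already been reduced to one record per molecule with a shared identifier, a log2 fold change, and an adjusted p-value. The `Record` type, the `consistent_molecules` function, the study and gene labels, and the default cutoffs are all hypothetical and are not taken from the chapter.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """One molecule from one case-vs-control comparison (hypothetical layout)."""
    study: str      # label of the published study
    platform: str   # e.g. "transcriptomics" or "proteomics"
    molecule: str   # identifier shared across platforms, e.g. a gene symbol
    log2_fc: float  # log2 fold change, case vs. control
    adj_p: float    # multiple-testing adjusted p-value


def consistent_molecules(records, alpha=0.05, min_studies=2):
    """Return molecules that are significantly deregulated in the same
    direction in at least `min_studies` distinct studies.

    `alpha` and `min_studies` are assumed cutoffs chosen for illustration.
    """
    directions = defaultdict(dict)  # molecule -> {study: +1 (up) or -1 (down)}
    for r in records:
        if r.adj_p < alpha and r.log2_fc != 0:
            directions[r.molecule][r.study] = 1 if r.log2_fc > 0 else -1

    result = {}
    for molecule, by_study in directions.items():
        signs = set(by_study.values())
        # Keep only molecules seen in enough studies with one agreed direction.
        if len(by_study) >= min_studies and len(signs) == 1:
            result[molecule] = "up" if signs == {1} else "down"
    return result


# Made-up example values, not data from any study.
records = [
    Record("study_A", "transcriptomics", "GENE1", 1.8, 0.001),
    Record("study_B", "proteomics", "GENE1", 0.9, 0.020),
    Record("study_A", "transcriptomics", "GENE2", -1.2, 0.010),
    Record("study_B", "proteomics", "GENE2", 0.7, 0.030),
    Record("study_A", "transcriptomics", "GENE3", -2.0, 0.004),
    Record("study_C", "proteomics", "GENE3", -1.1, 0.300),
]
shared = consistent_molecules(records)
print(shared)
```

In this toy input only `GENE1` is retained: `GENE2` changes in opposite directions in the two studies, and `GENE3` is significant in a single study. A list of retained molecules like this would be the kind of input passed on to pathway mapping.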
Acknowledgments
KCG is supported by a CONACYT Mexico scholarship (No. 2019-000021-01EXTF-00542). HH is supported by a grant from Highlands & Islands Enterprise.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Cervantes-Gracia, K., Chahwan, R., Husi, H. (2021). Integrative Analysis of Incongruous Cancer Genomics and Proteomics Datasets. In: Cecconi, D. (eds) Proteomics Data Analysis. Methods in Molecular Biology, vol 2361. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1641-3_17
DOI: https://doi.org/10.1007/978-1-0716-1641-3_17
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1640-6
Online ISBN: 978-1-0716-1641-3
eBook Packages: Springer Protocols