Abstract
The main purpose of pathway or gene set analysis methods is to provide mechanistic insight into the large amount of data produced in high-throughput studies. These tools were developed for gene expression analyses, but they have been rapidly adopted by other high-throughput techniques, becoming one of the foremost tools of omics research.
Currently, a wide range of methods and databases is available, and the appropriate choice depends on the biological question and the data at hand. Here we use two published RNA-seq datasets to demonstrate multiple analyses of gene sets, networks, and pathways using freely available and frequently updated software. We conclude the chapter with a survival pathway analysis of a multi-omics dataset. Throughout this overview of different methods, we focus on visualization, which is a fundamental but challenging step in this computational field.
Acknowledgments
This work was supported by the “My First AIRC Grant” to Enrica Calura (MFAG 2019, Grant N. 23522) and the IG grant to Chiara Romualdi (Grant N. IG 21837), both provided by the Italian Association for Cancer Research (AIRC), and by the European Molecular Biology Organization (EMBO) Short-Term Fellowship [8517] to Paolo Martini.
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Calura, E., Martini, P. (2021). Summarizing RNA-Seq Data or Differentially Expressed Genes Using Gene Set, Network, or Pathway Analysis. In: Picardi, E. (eds) RNA Bioinformatics. Methods in Molecular Biology, vol 2284. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1307-8_9
DOI: https://doi.org/10.1007/978-1-0716-1307-8_9
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1306-1
Online ISBN: 978-1-0716-1307-8
eBook Packages: Springer Protocols