Abstract
RNA sequencing (RNA-seq) has become a routine method for transcriptomic profiling. We developed a user-friendly web app called iDEP (integrated differential expression and pathway analysis) to help biologists interpret read counts or other types of expression matrices derived from read mapping. With iDEP, users can easily conduct exploratory data analysis, identify differentially expressed genes, and perform pathway analysis. Due to its intuitive user interface and massive annotation database, iDEP is being widely adopted for interactive analysis of RNA-seq data. Using a public dataset on the effect of heat shock on mice with and without functional Hsf1, we demonstrate how users can prepare data files and conduct in-depth analysis. We also discuss the importance of critical interpretation of results (avoiding p-hacking and rationalization) and of validating significant pathways by using different methods and independent annotation databases.
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Ge, X. (2021). iDEP Web Application for RNA-Seq Data Analysis. In: Picardi, E. (eds) RNA Bioinformatics. Methods in Molecular Biology, vol 2284. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1307-8_22
DOI: https://doi.org/10.1007/978-1-0716-1307-8_22
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1306-1
Online ISBN: 978-1-0716-1307-8
eBook Packages: Springer Protocols