Differential Expression Analysis of RNA-Seq Data and Co-expression Networks

Javed, Sana

doi:10.1007/978-3-030-69951-2_2

Part of the book series: Computational Biology ((COBO,volume 31))

Abstract

RNA-seq has become the most common and powerful platform for studying transcriptomes. A major goal of RNA-seq analysis is to identify genes and molecular pathways that are differentially expressed between two conditions. Such differences in expression profiles may reflect underlying biological changes and so point to targets for further investigation. Traditional statistical methods for differential expression analysis are generally restricted to individual genes and give no information about the interactions among genes that contribute to a given biological system. This limitation led researchers to develop new computational methods for identifying such gene interactions. The most common approach to studying interactions within gene sets is gene network inference. Co-expression networks are correlation-based networks commonly used to identify sets of genes significantly involved in the occurrence of a particular biological process. This chapter describes a basic procedure for RNA-seq analysis, briefly describes the techniques used in the analysis, and illustrates them on a real data set. In addition, it presents a basic pipeline showing how to construct a co-expression network and detect modules from RNA-seq data.
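To make the idea of a correlation-based network concrete, the following is a minimal sketch and not the chapter's own pipeline. It assumes a small toy expression matrix (genes in rows, samples in columns), links two genes when the absolute Pearson correlation of their profiles reaches a hypothetical cut-off of 0.8, and treats connected components of the resulting graph as modules. The function names, the cut-off, and the synthetic data are all illustrative assumptions; real analyses typically use normalised counts and a more refined module-detection step.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expr, threshold=0.8):
    """Label each gene with a module index.

    expr is a list of gene profiles (one list of sample values per gene).
    Two genes are linked when |Pearson r| >= threshold (an assumed cut-off);
    modules are the connected components of that graph.
    """
    n = len(expr)
    # Hard-thresholded adjacency matrix, without self-loops.
    adj = [
        [i != j and abs(pearson(expr[i], expr[j])) >= threshold for j in range(n)]
        for i in range(n)
    ]
    labels = [-1] * n
    module = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # gene already assigned to a module
        labels[start] = module
        stack = [start]
        while stack:  # depth-first walk over linked genes
            gene = stack.pop()
            for neighbour in range(n):
                if adj[gene][neighbour] and labels[neighbour] == -1:
                    labels[neighbour] = module
                    stack.append(neighbour)
        module += 1
    return labels


# Toy data: two groups of three genes, each group following its own pattern
# across 12 samples, plus a little Gaussian noise (fixed seed).
rng = random.Random(0)
pattern_a = [math.sin(2 * math.pi * t / 12) for t in range(12)]
pattern_b = [math.cos(2 * math.pi * t / 12) for t in range(12)]
expr = [[v + rng.gauss(0, 0.05) for v in pattern_a] for _ in range(3)]
expr += [[v + rng.gauss(0, 0.05) for v in pattern_b] for _ in range(3)]

labels = coexpression_modules(expr)
print(labels)
```

Genes that share a pattern end up in the same module, while the two uncorrelated patterns are kept apart. The hard cut-off and connected components are used here only because they are short to write down; they stand in for whatever thresholding and clustering scheme a full analysis would use.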



Author information

Authors and Affiliations

Department of Mathematics, COMSATS University Islamabad, Lahore Campus, Pakistan
Sana Javed

Corresponding author

Correspondence to Sana Javed.

Editor information

Editors and Affiliations

Center for Artificial Intelligence, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia
Tuan D. Pham
Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong
Hong Yan
College of Sciences and Human Studies, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia
Muhammad W. Ashraf
Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
Folke Sjöberg

About this chapter

Cite this chapter

Javed, S. (2021). Differential Expression Analysis of RNA-Seq Data and Co-expression Networks. In: Pham, T.D., Yan, H., Ashraf, M.W., Sjöberg, F. (eds) Advances in Artificial Intelligence, Computation, and Data Science. Computational Biology, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-030-69951-2_2

DOI: https://doi.org/10.1007/978-3-030-69951-2_2
Published: 13 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69950-5
Online ISBN: 978-3-030-69951-2
eBook Packages: Computer Science, Computer Science (R0)
