Differential Expression Analysis of RNA-Seq Data and Co-expression Networks

  • Chapter
Advances in Artificial Intelligence, Computation, and Data Science

Part of the book series: Computational Biology (COBO, volume 31)

Abstract

RNA-seq has become the most common and powerful platform for studying transcriptomes. A major goal of RNA-seq analysis is to identify genes and molecular pathways that are differentially expressed between two conditions. Such differences in expression profiles may be linked to changes in the underlying biology and can point to targets for further investigation. Traditional statistical methods for differential expression analysis are generally restricted to individual genes and provide no information about the interactions among genes that contribute to a given biological system. This limitation has led researchers to develop new computational methods for identifying such gene interactions. The most common approach to studying interactions within gene sets is gene network inference. Co-expression gene networks are correlation-based networks commonly used to identify sets of genes significantly involved in a particular biological process. This chapter describes a basic procedure for RNA-seq analysis, briefly describes the techniques used in the analysis, and illustrates them on a real data set. In addition, a basic pipeline is presented to show how to construct a co-expression network and detect modules from RNA-seq data.
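
The idea of a correlation-based co-expression network can be illustrated with a minimal sketch. This is not the chapter's pipeline: it assumes a hard correlation threshold and treats connected components as modules, which is a deliberate simplification. The function names, the toy expression values, and the threshold of 0.9 are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules.

    An edge joins two genes when |correlation| >= threshold (assumed
    cut-off); a module is a connected component of the resulting graph.
    """
    genes = list(expr)
    neighbours = {g: set() for g in genes}
    for g, h in combinations(genes, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        # Depth-first search collects every gene reachable from g.
        stack, component = [g], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Hypothetical normalised expression values for four genes in four samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [5.0, 1.0, 4.0, 2.0],
    "geneD": [4.9, 1.2, 4.1, 1.8],
}
modules = coexpression_modules(toy)
print(modules)
```

In this toy data, geneA and geneB rise together across samples, as do geneC and geneD, so the sketch groups them into two modules. Weighted approaches replace the hard threshold with a continuous edge weight and use clustering rather than connected components, which this sketch does not attempt.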

Author information

Corresponding author

Correspondence to Sana Javed.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Javed, S. (2021). Differential Expression Analysis of RNA-Seq Data and Co-expression Networks. In: Pham, T.D., Yan, H., Ashraf, M.W., Sjöberg, F. (eds) Advances in Artificial Intelligence, Computation, and Data Science. Computational Biology, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-030-69951-2_2

  • DOI: https://doi.org/10.1007/978-3-030-69951-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69950-5

  • Online ISBN: 978-3-030-69951-2

  • eBook Packages: Computer Science, Computer Science (R0)
