Using RNentropy to Detect Significant Variation in Gene Expression Across Multiple RNA-Seq or Single-Cell RNA-Seq Samples

  • Protocol
RNA Bioinformatics

Part of the book series: Methods in Molecular Biology (MIMB, volume 2284)

Abstract

RNA-Seq has become the de facto standard technique for characterizing and quantifying transcriptomes, and a large number of methods and tools have been proposed to model and detect differential gene expression by comparing transcript abundances across samples. However, state-of-the-art methods for this task are usually designed for pairwise comparisons; that is, they can identify significant variation of expression only between two conditions or samples. We describe the use of RNentropy, a methodology based on information theory that was devised to overcome this limitation. RNentropy can detect significant variation of gene expression in RNA-Seq data across any number of samples and conditions, and it can be applied downstream of any analysis pipeline that quantifies gene expression from raw sequencing data. RNentropy takes as input gene (or transcript) expression values, defined with any measure suitable for comparing transcript levels across samples and conditions. The output consists of the genes (or transcripts) exhibiting significant variation of expression across the conditions studied, together with the samples in which they are over- or underexpressed. RNentropy is implemented as an R package and is freely available from the CRAN repository. We provide a detailed guide to the functions and parameters of the package, with usage examples that demonstrate the software's capabilities and show how it can be applied to the analysis of single-cell RNA sequencing data.

Corresponding author

Correspondence to Giulio Pavesi.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Zambelli, F., Pavesi, G. (2021). Using RNentropy to Detect Significant Variation in Gene Expression Across Multiple RNA-Seq or Single-Cell RNA-Seq Samples. In: Picardi, E. (ed) RNA Bioinformatics. Methods in Molecular Biology, vol 2284. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1307-8_6

  • DOI: https://doi.org/10.1007/978-1-0716-1307-8_6

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1306-1

  • Online ISBN: 978-1-0716-1307-8

  • eBook Packages: Springer Protocols
