
Detecting differential expression from RNA-seq data with expression measurement uncertainty

Frontiers of Computer Science

Abstract

High-throughput RNA sequencing (RNA-seq) has emerged as a powerful technology for expression profiling. Most methods proposed for detecting differentially expressed (DE) genes from RNA-seq data are based on statistics that compare normalized read counts between conditions. However, few methods incorporate expression measurement uncertainty into DE detection. Moreover, most methods can detect only DE genes, and few are available for detecting DE isoforms. In this paper, a Bayesian framework (BDSeq) is proposed to detect DE genes and isoforms while accounting for expression measurement uncertainty. This uncertainty provides useful information that can help improve the performance of DE detection. Three real RNA-seq data sets are used to evaluate BDSeq, and the results show that including expression measurement uncertainty improves accuracy in detecting DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline to facilitate its use.



Author information

Corresponding author

Correspondence to Xuejun Liu.

Additional information

Li Zhang received his BS in computer science from Changsha University of Science & Technology, China, in 2007, and his MS in computer applications from Nanjing University of Aeronautics and Astronautics (NUAA), China, in 2010. He is now a PhD student at the Department of Computer Science and Engineering, NUAA. His research interests include probabilistic modeling and gene expression analysis.

Songcan Chen received his BS in mathematics from Hangzhou University (now merged into Zhejiang University), China, in 1983, his MS in computer applications from Shanghai Jiaotong University, China, in 1985, and his PhD in communication and information systems from Nanjing University of Aeronautics and Astronautics (NUAA), China, in 1997. Since 1998, he has been a full-time professor with the Department of Computer Science and Engineering, NUAA. He has authored or co-authored over 200 peer-reviewed scientific papers. His current research interests include pattern recognition, machine learning, and neural computing.

Xuejun Liu is a professor in the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics (NUAA), China. She received her BS and MS in 1999 and 2002, respectively, from NUAA, and PhD in 2006 from the University of Manchester, UK, all in computer science. Her research interests include probabilistic modeling and gene expression analysis.

About this article

Cite this article

Zhang, L., Chen, S. & Liu, X. Detecting differential expression from RNA-seq data with expression measurement uncertainty. Front. Comput. Sci. 9, 652–663 (2015). https://doi.org/10.1007/s11704-015-4308-6

  • DOI: https://doi.org/10.1007/s11704-015-4308-6
