Frontiers of Computer Science, Volume 9, Issue 4, pp 652–663

Detecting differential expression from RNA-seq data with expression measurement uncertainty

Research Article

Abstract

High-throughput RNA sequencing (RNA-seq) has emerged as a revolutionary and powerful technology for expression profiling. Most methods proposed for detecting differentially expressed (DE) genes from RNA-seq data are based on statistics that compare normalized read counts between conditions. However, few methods incorporate expression measurement uncertainty into DE detection. Moreover, most methods can detect only DE genes, and few are available for detecting DE isoforms. In this paper, a Bayesian framework (BDSeq) is proposed to detect DE genes and isoforms while accounting for expression measurement uncertainty. This uncertainty provides useful information that can help improve the performance of DE detection. Three real RNA-seq data sets are used to evaluate BDSeq, and the results show that including expression measurement uncertainty improves accuracy in detecting DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline for ease of use.

Keywords

RNA-seq, Bayesian method, differentially expressed genes/isoforms, expression measurement uncertainty, analysis pipeline


Copyright information

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
