A Hierarchical Bayesian Model for Estimating and Inferring Differential Isoform Expression for Multi-sample RNA-Seq Data

Statistics in Biosciences

Abstract

RNA-Seq has drastically changed how transcriptomes are studied by providing more precise estimates of gene expression, including isoform-specific expression. Most of the available methods for RNA-Seq data analyze one sample at a time. In this paper we present a Poisson-Gamma hierarchical model for multi-sample RNA-Seq data analysis that simultaneously estimates isoform-specific expression and identifies differentially expressed isoforms. Our model borrows information across all samples when estimating expression levels, which can improve the estimates drastically, particularly for low-abundance isoforms. Furthermore, the hierarchical model accounts for overdispersion in the data and can incorporate sample-specific covariates, which facilitates isoform-specific differential expression analysis. Simulation studies demonstrated that, compared with single-sample analysis, this Bayesian multi-sample approach can yield more precise estimates of isoform-specific expression and higher power to detect differential expression, especially for isoforms of low abundance. We further illustrated our methods using RNA-Seq data from 10 Yoruban and 10 Caucasian individuals.
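To make the idea of borrowing information and accounting for overdispersion concrete, the following is a minimal sketch of a single-level conjugate Gamma-Poisson model. It is not the authors' model, which is hierarchical and includes sample-specific covariates that the abstract does not spell out. The function names, the hyperparameters `alpha` and `beta` (Gamma shape and rate), the `exposure` term and the toy counts are all assumptions made for illustration.

```python
# Simplified illustration only: a conjugate Gamma-Poisson model, assuming
#   count ~ Poisson(exposure * theta),  theta ~ Gamma(shape=alpha, rate=beta).
# alpha, beta, exposure and the counts below are hypothetical values.

def posterior_mean(count, exposure, alpha, beta):
    """Posterior mean of theta; the posterior is Gamma(alpha + count, beta + exposure)."""
    return (alpha + count) / (beta + exposure)


def marginal_mean_var(exposure, alpha, beta):
    """Mean and variance of a count after integrating theta out.

    The variance exceeds the mean, which is the overdispersion that a
    plain Poisson model cannot represent.
    """
    mean = exposure * alpha / beta
    var = mean + exposure ** 2 * alpha / beta ** 2
    return mean, var


if __name__ == "__main__":
    exposure, alpha, beta = 2.0, 2.0, 1.0
    for count in [0, 1, 2, 40]:
        naive = count / exposure  # per-sample estimate with no shared prior
        shrunk = posterior_mean(count, exposure, alpha, beta)
        print(f"count={count:3d}  naive={naive:6.2f}  shrunk={shrunk:6.2f}")
    print("marginal mean, variance:", marginal_mean_var(exposure, alpha, beta))
```

In this sketch a count of zero no longer produces an estimate of exactly zero, because the shared Gamma prior pulls it toward the prior mean. This is the sense in which low-count estimates can be stabilized by information shared across samples.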

Author information

Corresponding author

Correspondence to Hongzhe Li.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary Tables and Figure (PDF 140 kB)

About this article

Cite this article

Vardhanabhuti, S., Li, M. & Li, H. A Hierarchical Bayesian Model for Estimating and Inferring Differential Isoform Expression for Multi-sample RNA-Seq Data. Stat Biosci 5, 119–137 (2013). https://doi.org/10.1007/s12561-011-9052-3

  • DOI: https://doi.org/10.1007/s12561-011-9052-3
