Empirical Bayes Analysis of RNA-seq Data for Detection of Gene Expression Heterosis

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

An important type of heterosis, known as hybrid vigor, refers to the enhancements in the phenotype of hybrid progeny relative to their inbred parents. Although hybrid vigor is extensively utilized in agriculture, its molecular basis is still largely unknown. In an effort to understand phenotypic heterosis at the molecular level, researchers are measuring transcript abundance levels of thousands of genes in parental inbred lines and their hybrid offspring using RNA sequencing (RNA-seq) technology. The resulting data allow researchers to search for evidence of gene expression heterosis as one potential molecular mechanism underlying heterosis of agriculturally important traits. The null hypotheses of greatest interest in testing for gene expression heterosis are composite null hypotheses, which are difficult to test with standard statistical approaches for RNA-seq analysis. To address this difficulty, we develop a hierarchical negative binomial model and draw inferences using a computationally tractable empirical Bayes approach. We demonstrate improvements over alternative methods via a simulation study based on a maize experiment and then analyze that maize experiment with our newly proposed methodology.
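
To make the composite null hypotheses concrete, one common formalization (assumed here, not stated in the abstract) says that a gene shows high-parent heterosis when its mean expression in the hybrid exceeds the larger of the two parental means, and low-parent heterosis when it falls below the smaller. The null hypothesis for high-parent heterosis, that the hybrid mean is at most the maximum of the parental means, then covers a whole region of parameter values rather than a single point, which is what makes it composite. The sketch below is not the authors' hierarchical negative binomial model. It only illustrates how, given posterior draws of the three means from some already fitted model, the probabilities of the two heterosis events could be estimated by counting draws. The function name and its inputs are hypothetical.

```python
from typing import Sequence, Tuple


def heterosis_probabilities(
    parent1: Sequence[float],
    parent2: Sequence[float],
    hybrid: Sequence[float],
) -> Tuple[float, float]:
    """Estimate P(high-parent heterosis) and P(low-parent heterosis).

    Each argument holds posterior draws of one genotype's mean expression
    for a single gene; draw i of each sequence must come from the same
    joint posterior sample. (Hypothetical helper for illustration only.)
    """
    n = len(hybrid)
    if n == 0 or len(parent1) != n or len(parent2) != n:
        raise ValueError("all three sequences must be non-empty and equal in length")

    # Count draws in which the hybrid mean lies outside the parental range.
    high = sum(h > max(a, b) for a, b, h in zip(parent1, parent2, hybrid))
    low = sum(h < min(a, b) for a, b, h in zip(parent1, parent2, hybrid))
    return high / n, low / n


# Toy draws, made up for illustration (not data from the maize experiment).
p_high, p_low = heterosis_probabilities(
    parent1=[1.0, 2.0, 3.0, 4.0],
    parent2=[2.0, 2.0, 2.0, 2.0],
    hybrid=[3.0, 1.0, 5.0, 3.0],
)
print(p_high, p_low)  # 0.5 0.25
```

Because the comparison is against the maximum or minimum of two parameters, the event is evaluated jointly on each draw rather than by two separate pairwise tests.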

Supplementary materials accompanying this paper appear on-line.

Author information

Corresponding author

Correspondence to Jarad Niemi.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 55 KB)

Cite this article

Niemi, J., Mittman, E., Landau, W. et al. Empirical Bayes Analysis of RNA-seq Data for Detection of Gene Expression Heterosis. JABES 20, 614–628 (2015). https://doi.org/10.1007/s13253-015-0230-5

  • DOI: https://doi.org/10.1007/s13253-015-0230-5
