Empirical Bayes estimation of gene-specific effects in micro-array research

  • Original Paper
Functional & Integrative Genomics

Abstract

Micro-array technology allows investigators to measure the expression levels of thousands of genes simultaneously. However, investigators then face the challenge of simultaneously estimating gene expression differences for thousands of genes from very small samples. Traditional estimators of differences between treatment means (ordinary least squares, or OLS, estimators) are not the best estimators when interest lies in estimating gene expression differences for an ensemble of genes. When gene expression differences are regarded as exchangeable samples from a common population, estimators are available that yield a much smaller average mean-square error across the population of gene expression difference estimates. We simulated the application of one such estimator, an empirical Bayes (EB) estimator of random effects in a hierarchical linear (normal-normal) model. In the simulations, the mean-square error of the EB estimator was as low as 0.05 times that of the OLS estimator (i.e., the difference between treatment means). We also applied the analysis to an example dataset to demonstrate the shrinkage of EB estimators and the associated reduction in mean-square error, i.e., increase in precision. The method described here is implemented in software available at http://www.soph.uab.edu/ssg.asp?id=1087.
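The shrinkage idea summarized in the abstract can be illustrated with a small simulation. The sketch below is not the authors' software or their exact estimator; it is a minimal method-of-moments version of a normal-normal empirical Bayes estimator, written under several assumptions that do not come from the article: a common error variance shared by all genes, equal group sizes, and arbitrary illustrative parameter values. The function names (`simulate_differences`, `eb_shrink`, `mse`) are hypothetical.

```python
import random
import statistics


def simulate_differences(n_genes, n_per_group, mu, tau, sigma, rng):
    """Simulate a two-group experiment for many genes.

    Returns the true differences, their OLS estimates (difference between
    group means), and a pooled estimate of the error variance.
    Assumption: every gene shares the same error variance sigma**2.
    """
    truth, ols, within_vars = [], [], []
    for _ in range(n_genes):
        delta = rng.gauss(mu, tau)  # gene-specific true difference
        control = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        treated = [rng.gauss(delta, sigma) for _ in range(n_per_group)]
        truth.append(delta)
        ols.append(statistics.fmean(treated) - statistics.fmean(control))
        within_vars.append(
            (statistics.variance(control) + statistics.variance(treated)) / 2
        )
    return truth, ols, statistics.fmean(within_vars)


def eb_shrink(ols, sampling_var):
    """Shrink OLS estimates toward their overall mean.

    The variance of the OLS estimates across genes estimates
    tau**2 + sampling_var, so tau**2 is obtained by subtraction
    (truncated at zero).
    """
    mu_hat = statistics.fmean(ols)
    tau2_hat = max(0.0, statistics.variance(ols) - sampling_var)
    weight = tau2_hat / (tau2_hat + sampling_var)  # 1 = no shrinkage, 0 = full
    return [mu_hat + weight * (d - mu_hat) for d in ols], weight


def mse(estimates, truth):
    """Average squared error across genes."""
    return statistics.fmean([(e - t) ** 2 for e, t in zip(estimates, truth)])


# Illustrative values only (assumptions, not taken from the article).
rng = random.Random(2005)
n_per_group = 3
truth, ols, sigma2_hat = simulate_differences(2000, n_per_group, 0.0, 0.5, 1.0, rng)
sampling_var = 2 * sigma2_hat / n_per_group  # variance of a difference of two means
eb, weight = eb_shrink(ols, sampling_var)

print("shrinkage weight:", round(weight, 3))
print("MSE ratio (EB / OLS):", round(mse(eb, truth) / mse(ols, truth), 3))
```

With small samples per group and modest variation among true effects, the sampling variance dominates, the weight is small, and the EB estimates are pulled strongly toward the overall mean. That is the setting in which the gain over OLS is largest; the size of the gain in this sketch depends entirely on the chosen parameter values.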

Acknowledgements

We wish to acknowledge Dr Eva Gropp for her contributions to earlier versions of this manuscript and Dr Alfred Bartolucci for reading two versions of this manuscript and providing many helpful comments. This research was supported in part by NSF grants 0217651 and 0090286, NIH grants T32AR007450, P01AG11915, and R01AG18922, an intramural award from the University of Alabama Health Services Foundation, and the Frederick Gardner Cottrell Foundation. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture.

Author information

Corresponding author

Correspondence to David B. Allison.

About this article

Cite this article

Edwards, J.W., Page, G.P., Gadbury, G. et al. Empirical Bayes estimation of gene-specific effects in micro-array research. Funct Integr Genomics 5, 32–39 (2005). https://doi.org/10.1007/s10142-004-0123-0

  • DOI: https://doi.org/10.1007/s10142-004-0123-0
