Estimating the Number of Genes That Are Differentially Expressed in Both of Two Independent Experiments

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

A common procedure for estimating the number of genes that are differentially expressed (DE) in two experiments involves two steps. In the first step, data from the two experiments are analyzed separately to produce a list of genes declared to be DE in each experiment. Usually, each list is produced using a method that attempts to control the false discovery rate (FDR) in each experiment at some desired level α. In the second step, the number of genes common to both lists is used as an estimate of the number of genes DE in both experiments. A problem with this approach is that the resulting estimates can vary greatly with α, and the value of α that produces the best estimate for any given pair of experiments is difficult to predict. We propose a method that uses the p-values from both experiments simultaneously to produce a single estimate, one that does not depend on an FDR level α, of the number of genes that are DE in both experiments. We use two simulation studies (one involving independent, normally distributed data and one involving microarray data) to compare the performance of our proposed method, the commonly used method, and another method proposed in the literature to test for consistency of replicate experiments. The results of the simulation studies demonstrate the advantages of our approach. We conclude the article by estimating the number of genes that are DE in both of two experiments involving gene expression in maize leaves.
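The commonly used two-step procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code and not the proposed method. It assumes the Benjamini-Hochberg step-up rule as the FDR-controlling method (the abstract does not name one), and the helper names `bh_reject`, `common_count`, and `simulate_pvalues`, as well as the toy p-value model, are hypothetical.

```python
import random


def bh_reject(pvals, alpha):
    """Return the set of indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls at or below alpha * rank / m
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return set(order[:k])


def common_count(p1, p2, alpha):
    """Two-step estimate: size of the overlap of the two per-experiment DE lists."""
    return len(bh_reject(p1, alpha) & bh_reject(p2, alpha))


def simulate_pvalues(m, n_both, n_only1, n_only2, seed=0):
    """Toy p-values (assumption): Uniform(0, 1) for nulls, Beta(0.2, 1) for DE genes."""
    rng = random.Random(seed)
    p1, p2 = [], []
    for g in range(m):
        # Genes are laid out as: DE in both, DE only in 1, DE only in 2, null in both.
        de1 = g < n_both + n_only1
        de2 = g < n_both or n_both + n_only1 <= g < n_both + n_only1 + n_only2
        p1.append(rng.betavariate(0.2, 1.0) if de1 else rng.random())
        p2.append(rng.betavariate(0.2, 1.0) if de2 else rng.random())
    return p1, p2


if __name__ == "__main__":
    # Hypothetical setup: 5000 genes, 500 of which are truly DE in both experiments.
    p1, p2 = simulate_pvalues(m=5000, n_both=500, n_only1=500, n_only2=500)
    for alpha in (0.01, 0.05, 0.10, 0.20):
        print(alpha, common_count(p1, p2, alpha))
```

Running the loop over several values of α on the same pair of p-value vectors shows how the overlap count changes with the chosen FDR level, which is the dependence the abstract identifies as a weakness of this approach.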

Author information

Corresponding author

Correspondence to Megan Orr.

About this article

Cite this article

Orr, M., Liu, P. & Nettleton, D. Estimating the Number of Genes That Are Differentially Expressed in Both of Two Independent Experiments. JABES 17, 583–600 (2012). https://doi.org/10.1007/s13253-012-0108-8

  • DOI: https://doi.org/10.1007/s13253-012-0108-8
