Estimating the number of true null hypotheses from a histogram of p values

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

In an earlier article, an intuitively appealing method for estimating the number of true null hypotheses in a multiple test situation was proposed. That article presented an iterative algorithm that relies on a histogram of observed p values to obtain the estimator. We characterize the limit of that iterative algorithm and show that the estimator can be computed directly without iteration. We compare the performance of the histogram-based estimator with other procedures for estimating the number of true null hypotheses from a collection of observed p values and find that the histogram-based estimator performs well in settings similar to those encountered in microarray data analysis. We demonstrate the approach using p values from a large microarray experiment aimed at uncovering molecular mechanisms of barley resistance to a fungal pathogen.
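The abstract says the estimator can be computed directly from the histogram but does not give the formula. The sketch below shows one way such a direct computation might look, under a stated assumption: the estimate is the number of bins times the mean count of the tail bins, starting at the first bin whose count does not exceed the mean count of itself and all bins to its right. That rule, the function name `histogram_m0_estimate`, and the default `n_bins=20` are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def histogram_m0_estimate(p_values, n_bins=20):
    """Estimate the number of true nulls from a p-value histogram.

    Assumed rule (not stated in the abstract): locate the first bin whose
    count is no larger than the average count of that bin and all bins to
    its right, then scale that tail average by the number of bins.
    """
    p = np.asarray(p_values, dtype=float)
    counts, _ = np.histogram(p, bins=n_bins, range=(0.0, 1.0))

    # tail_mean[i] is the mean count over bins i, i+1, ..., n_bins-1.
    tail_sums = np.cumsum(counts[::-1])[::-1]
    tail_mean = tail_sums / np.arange(n_bins, 0, -1)

    # The last bin always satisfies the condition (its count equals its own
    # tail mean), so argmax always finds a True entry.
    first = int(np.argmax(counts <= tail_mean))
    return float(n_bins * tail_mean[first])


# Deterministic toy data: 1000 evenly spread "null" p values plus
# 200 very small p values standing in for true alternatives.
null_p = (np.arange(1000) + 0.5) / 1000
alt_p = np.full(200, 0.001)
estimate = histogram_m0_estimate(np.concatenate([null_p, alt_p]))
```

In this toy input the excess of small p values inflates only the leftmost bin, so the tail average is taken over the flat part of the histogram, which is the intuition behind reading the number of true nulls off the histogram.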

Correspondence to Dan Nettleton.

Cite this article

Nettleton, D., Hwang, J.T.G., Caldo, R.A. et al. Estimating the number of true null hypotheses from a histogram of p values. JABES 11, 337–356 (2006). https://doi.org/10.1198/108571106X129135

  • DOI: https://doi.org/10.1198/108571106X129135
