Estimating the number of true null hypotheses from a histogram of p values

Nettleton, Dan; Hwang, J. T. Gene; Caldo, Rico A.; Wise, Roger P.

doi:10.1198/108571106X129135

Published: September 2006

Journal of Agricultural, Biological, and Environmental Statistics

Volume 11, pages 337–356 (2006)

Dan Nettleton¹, J. T. Gene Hwang², Rico A. Caldo³ & Roger P. Wise³,⁴

Abstract

In an earlier article, an intuitively appealing method for estimating the number of true null hypotheses in a multiple test situation was proposed. That article presented an iterative algorithm that relies on a histogram of observed p values to obtain the estimator. We characterize the limit of that iterative algorithm and show that the estimator can be computed directly without iteration. We compare the performance of the histogram-based estimator with other procedures for estimating the number of true null hypotheses from a collection of observed p values and find that the histogram-based estimator performs well in settings similar to those encountered in microarray data analysis. We demonstrate the approach using p values from a large microarray experiment aimed at uncovering molecular mechanisms of barley resistance to a fungal pathogen.
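The abstract does not spell out the direct computation, so the following is only a minimal sketch under stated assumptions. It assumes the non-iterative rule is: bin the p values into B equal-width bins on [0, 1], find the first bin whose count does not exceed the average count over that bin and all bins to its right, and multiply that average by B. The function name `histogram_m0`, the default of 20 bins, and the rule itself are assumptions for illustration, not details confirmed by the text above.

```python
def histogram_m0(p_values, n_bins=20):
    """Sketch of a histogram-based estimate of the number of true nulls.

    Assumed rule (not stated in the abstract): locate the first bin whose
    count is no larger than the mean count of that bin and all bins to its
    right, then return n_bins times that mean.
    """
    # Count p values in n_bins equal-width bins on [0, 1].
    counts = [0] * n_bins
    for p in p_values:
        # A p value of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1

    remaining = sum(counts)  # total count in bins i, i+1, ..., n_bins-1
    for i, count in enumerate(counts):
        tail_mean = remaining / (n_bins - i)
        if count <= tail_mean:
            # Bins from here rightward are treated as a flat null "floor".
            return n_bins * tail_mean
        remaining -= count
    # Never reached: the last bin always equals its own tail mean.


# Toy data: 1000 evenly spread "null" p values plus 200 very small ones.
nulls = [(k + 0.5) / 1000 for k in range(1000)]
alternatives = [0.001] * 200
estimate = histogram_m0(nulls + alternatives)
```

On this toy input the leftmost bin holds the excess of small p values, and the remaining bins are flat, so the sketch attributes the flat level to the true nulls.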

Author information

Authors and Affiliations

Department of Statistics, Iowa State University, Ames, IA 50011-1210
Dan Nettleton (Associate Professor)
Departments of Mathematics and Statistics, Cornell University, Ithaca, NY 14853-4201
J. T. Gene Hwang (Professor)
Department of Plant Pathology and Center for Plant Responses to Environmental Stresses, Iowa State University, Ames, IA 50011-1020
Rico A. Caldo (Postdoctoral Research Associate) & Roger P. Wise (Professor and Research Plant Geneticist)
USDA-ARS-Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011-1020
Roger P. Wise (Professor and Research Plant Geneticist)

Corresponding author

Correspondence to Dan Nettleton.

About this article

Cite this article

Nettleton, D., Hwang, J.T.G., Caldo, R.A. et al. Estimating the number of true null hypotheses from a histogram of p values. JABES 11, 337–356 (2006). https://doi.org/10.1198/108571106X129135

Received: 15 August 2005
Revised: 15 March 2006
Issue Date: September 2006
DOI: https://doi.org/10.1198/108571106X129135
