Abstract
Multiple hypothesis testing is widely used to evaluate scientific studies involving statistical tests. However, for many of these tests, p values are not available and are thus often approximated using Monte Carlo tests such as permutation tests or bootstrap tests. This article presents a simple algorithm based on Thompson Sampling to test multiple hypotheses. It works with arbitrary multiple testing procedures, in particular with step-up and step-down procedures. Its main feature is that it allocates Monte Carlo effort sequentially, generating more Monte Carlo samples for tests whose decisions are so far less certain. A simulation study demonstrates that, at low computational effort, the new approach yields higher power and more reproducible results than previously suggested methods.
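The allocation idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' QuickMMCTest implementation. The function names `benjamini_hochberg` and `allocate_and_test` and the callback `sample_exceed` are hypothetical. The Beta posterior for each unknown p value, the uncertainty weight min(r, 1 - r) based on the posterior rejection frequency r, and the choice of the Benjamini-Hochberg step-up procedure are all assumptions made for illustration.

```python
import numpy as np


def benjamini_hochberg(p, alpha=0.1):
    """Boolean rejection mask from the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject


def allocate_and_test(sample_exceed, m, rounds=20, batch=200, draws=50,
                      alpha=0.1, seed=0):
    """Sequentially allocate Monte Carlo samples to the least certain tests.

    sample_exceed(i, n, rng) is a hypothetical user-supplied callback that
    draws n Monte Carlo samples for hypothesis i and returns how many were
    at least as extreme as the observed test statistic.
    """
    rng = np.random.default_rng(seed)
    exceed = np.zeros(m)  # exceedances observed so far, per hypothesis
    total = np.zeros(m)   # Monte Carlo samples drawn so far, per hypothesis
    for _ in range(rounds):
        # Assumed model: Beta posterior for each p value under a uniform prior.
        p_draws = rng.beta(1 + exceed, 1 + total - exceed, size=(draws, m))
        # Posterior frequency with which each hypothesis is rejected.
        rej_freq = np.mean([benjamini_hochberg(p, alpha) for p in p_draws],
                           axis=0)
        # Decisions with frequency near 0 or 1 are certain; near 0.5 are not.
        weights = np.minimum(rej_freq, 1 - rej_freq) + 1e-12
        counts = rng.multinomial(batch, weights / weights.sum())
        for i in np.nonzero(counts)[0]:
            exceed[i] += sample_exceed(i, counts[i], rng)
            total[i] += counts[i]
    # Standard Monte Carlo p value estimate with a pseudo-count.
    p_hat = (exceed + 1) / (total + 1)
    return benjamini_hochberg(p_hat, alpha), total


if __name__ == "__main__":
    # Toy setting: exceedances are simulated from made-up true p values.
    true_p = np.array([0.0001, 0.001, 0.01, 0.05, 0.3, 0.6, 0.9])

    def sample_exceed(i, n, rng):
        return rng.binomial(n, true_p[i])

    reject, total = allocate_and_test(sample_exceed, m=len(true_p))
    print("rejected:", reject)
    print("samples per hypothesis:", total.astype(int))
```

In this toy setting the per-hypothesis sample counts show where the effort went; hypotheses whose rejection decision stays ambiguous across posterior draws tend to receive more samples than those that are clearly rejected or clearly retained.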
Acknowledgments
We would like to thank the two referees for their constructive comments on the manuscript. The second author was supported by the EPSRC.
About this article
Cite this article
Gandy, A., Hahn, G. QuickMMCTest: quick multiple Monte Carlo testing. Stat Comput 27, 823–832 (2017). https://doi.org/10.1007/s11222-016-9656-z
DOI: https://doi.org/10.1007/s11222-016-9656-z