Abstract
Multiple hypothesis tests decide on more than one hypothesis based on the same data set. Despite their considerable relevance for business research, we find that multiple testing methods are not systematically applied in our research area. This finding is as surprising as it is crucial to the significance of research findings. The number of false positive findings can inflate with the number of single tests employed. The focus of the current multiple testing literature on topics related to medicine and psychology might deter business researchers from employing the tests. Hence, we provide guidance to researchers by exposing the importance of multiple testing. Based on an application-oriented systematization, Monte Carlo simulations, and a decision theoretic evaluation scheme, we highlight method recommendations for different definitions of conservatism regarding errors. As implicit results, we find that the dependency structure in the data does not influence the method outcomes significantly. In addition, intuitive single-step, margin-based approaches perform similarly to sophisticated tests for typical multiple testing problems in a conservative setting. More liberal methods show a stable relation for all simulation designs.
Appendices
Appendix 1
See Appendix Table 4.
Appendix 2
See Appendix Fig. 8.
Appendix 3
See Appendix Fig. 9.
Appendix 4: Synopsis of the simulation test design per setting
1. Generate a positive definite m × m covariance matrix.
   - Fix the proportion of true null hypotheses \( \frac{m_{0}}{m} = 0.1 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save the mean type I and type II error counts for each multiple testing method.
   - Fix the proportion of true null hypotheses \( \frac{m_{0}}{m} = 0.5 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save the mean type I and type II error counts for each multiple testing method.
   - Fix the proportion of true null hypotheses \( \frac{m_{0}}{m} = 0.9 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save the mean type I and type II error counts for each multiple testing method.
2. Repeat step 1 100 times.
3. Calculate the decision theoretic results.
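The synopsis above can be illustrated with a minimal Python sketch of steps 1 and 2. It is not the authors' implementation, and several details are assumptions because the synopsis does not state them: the covariance matrix is built as A·Aᵀ + I, false null hypotheses get a mean shift of `effect` standard deviations, each hypothesis is tested with a two-sided one-sample test under a normal approximation, and only three methods (Bonferroni, Holm, Benjamini-Hochberg) are compared at `alpha = 0.05`. All function and parameter names are hypothetical. Step 3 is left out because the synopsis does not give the loss function of the decision theoretic evaluation.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def random_covariance(m, rng):
    """Step 1: A*A^T + I is symmetric positive definite (assumed construction)."""
    a = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(m)]
    return [[sum(a[i][k] * a[j][k] for k in range(m)) + (1.0 if i == j else 0.0)
             for j in range(m)] for i in range(m)]


def cholesky(cov):
    """Lower-triangular factor L with L*L^T = cov, used to draw correlated normals."""
    m = len(cov)
    low = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (math.sqrt(cov[i][i] - s) if i == j
                         else (cov[i][j] - s) / low[j][j])
    return low


def p_values(data):
    """Two-sided p-values for H0: mean = 0 per column (normal approximation, assumed)."""
    n = len(data)
    norm = NormalDist()
    return [2.0 * (1.0 - norm.cdf(abs(fmean(col) / (stdev(col) / math.sqrt(n)))))
            for col in zip(*data)]


def bonferroni(p, alpha):
    """Single-step: reject every hypothesis with p <= alpha / m."""
    return [pj <= alpha / len(p) for pj in p]


def holm(p, alpha):
    """Step-down: compare sorted p-values with alpha / (m - rank), stop at first failure."""
    m = len(p)
    reject = [False] * m
    for rank, j in enumerate(sorted(range(m), key=lambda idx: p[idx])):
        if p[j] > alpha / (m - rank):
            break
        reject[j] = True
    return reject


def benjamini_hochberg(p, alpha):
    """Step-up: reject the k smallest p-values, k = max{i : p_(i) <= i * alpha / m}."""
    m = len(p)
    order = sorted(range(m), key=lambda idx: p[idx])
    k = max((r + 1 for r, j in enumerate(order) if p[j] <= (r + 1) * alpha / m),
            default=0)
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


METHODS = {"Bonferroni": bonferroni, "Holm": holm, "BH": benjamini_hochberg}


def simulate_share(cov, null_share, rng, n_obs=100, n_datasets=100,
                   alpha=0.05, effect=0.3):
    """Mean type I and type II error counts per method for one proportion m0/m."""
    m = len(cov)
    low = cholesky(cov)
    is_null = [j < round(null_share * m) for j in range(m)]
    # Expectation vector: 0 under a true null, `effect` standard deviations otherwise.
    mu = [0.0 if h0 else effect * math.sqrt(cov[j][j]) for j, h0 in enumerate(is_null)]
    totals = {name: [0, 0] for name in METHODS}
    for _ in range(n_datasets):
        data = []
        for _ in range(n_obs):
            z = [rng.gauss(0.0, 1.0) for _ in range(m)]
            data.append([mu[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                         for i in range(m)])
        p = p_values(data)
        for name, method in METHODS.items():
            reject = method(p, alpha)
            totals[name][0] += sum(r and h0 for r, h0 in zip(reject, is_null))
            totals[name][1] += sum(not r and not h0 for r, h0 in zip(reject, is_null))
    return {name: (t1 / n_datasets, t2 / n_datasets)
            for name, (t1, t2) in totals.items()}


def run_pass(m, rng, shares=(0.1, 0.5, 0.9), **kwargs):
    """One pass of step 1: a fresh covariance matrix, then all three proportions."""
    cov = random_covariance(m, rng)
    return {share: simulate_share(cov, share, rng, **kwargs) for share in shares}
```

Step 2 would then amount to calling `run_pass` 100 times with one shared random generator, for example `rng = random.Random(0)` followed by `[run_pass(10, rng) for _ in range(100)]`. Pure Python keeps the sketch dependency-free but slow at the full design size, so smaller values of `n_obs` and `n_datasets` are more practical for a first trial.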
Cite this article
Bartenschlager, C.C., Brunner, J.O. Reaching for the stars: attention to multiple testing problems and method recommendations using simulation for business research. J Bus Econ 89, 447–479 (2019). https://doi.org/10.1007/s11573-018-0919-3
DOI: https://doi.org/10.1007/s11573-018-0919-3