Reaching for the stars: attention to multiple testing problems and method recommendations using simulation for business research

Bartenschlager, Christina C.; Brunner, Jens O.

doi:10.1007/s11573-018-0919-3

Original Paper
Published: 20 October 2018

Volume 89, pages 447–479, (2019)
Journal of Business Economics

Christina C. Bartenschlager¹ &
Jens O. Brunner¹

Abstract

Multiple hypothesis tests decide on more than one hypothesis based on the same data set. Despite their significant relevance for business research, we find that multiple testing methods are not systematically applied in our research area. This finding is as surprising as it is crucial to the significance of research findings: false positive findings can inflate with the number of single tests employed. The focus of the current multiple testing literature on topics related to medicine and psychology might deter business researchers from employing the tests. Hence, we provide guidance to researchers by exposing the importance of multiple testing. Based on an application-oriented systematization, Monte Carlo simulations, and a decision theoretic evaluation scheme, we highlight method recommendations for different conservatism definitions regarding errors. As implicit results, we find that the dependency structure in the data does not influence the method outcomes significantly. In addition, intuitive single-step margin-based approaches perform similarly to sophisticated tests for typical multiple testing problems in a conservative setting. More liberal methods show a stable relation for all simulation designs.
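
The inflation of false positives with the number of single tests can be illustrated with a standard textbook calculation. The sketch below is not taken from the article; it assumes m independent tests, each carried out at level alpha, with every null hypothesis true. Under these assumptions the probability of at least one false positive is 1 - (1 - alpha)^m. The function name `familywise_error_rate` is a hypothetical label chosen for illustration.

```python
def familywise_error_rate(m, alpha=0.05):
    """Probability of at least one false positive among m independent
    tests at level alpha, assuming every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


# Print the error probability for a growing number of single tests.
for m in (1, 5, 10, 20, 50):
    print(f"m = {m:2d}: P(at least one false positive) = "
          f"{familywise_error_rate(m):.3f}")
```

With ten unadjusted tests at the 5% level the probability of at least one false positive is already about 40% under these assumptions, which is the kind of inflation that multiple testing methods are designed to control.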

Author information

Authors and Affiliations

Health Care Operations/Health Information Management, Faculty of Business and Economics, University of Augsburg, University Center of Health Sciences at Klinikum Augsburg (UNIKA-T), Universitätsstraße 16, 86159, Augsburg, Germany
Christina C. Bartenschlager & Jens O. Brunner

Corresponding author

Correspondence to Christina C. Bartenschlager.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 18 kb)

Appendices

Appendix 1

See Appendix Table 4.

Table 4 Study on the application of regression models and (number of) multiple tests in 2015 Management Science articles

Appendix 2

See Appendix Fig. 8.

Appendix 3

See Appendix Fig. 9.

Appendix 4: Synopsis of the simulation test design per setting

1. Generate a positive definite m × m covariance matrix.
   - Fix the proportion of true null hypotheses \( \frac{m_0}{m} = 0.1 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save mean type I and II error counts for each multiple testing method.
   - Fix the proportion of true null hypotheses \( \frac{m_0}{m} = 0.5 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save mean type I and II error counts for each multiple testing method.
   - Fix the proportion of true null hypotheses \( \frac{m_0}{m} = 0.9 \) and define the expectation vector.
     → Simulate 100 normally distributed 100 × m data sets.
     → Save mean type I and II error counts for each multiple testing method.
2. Repeat step 1 100 times.
3. Calculate decision theoretic results.
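
Steps 1 and 2 of the synopsis can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that the synopsis does not specify: the covariance matrix is built as A Aᵀ + I, false null hypotheses receive a common mean shift `effect`, p-values come from a two-sided normal approximation of the one-sample test of a zero mean, and Bonferroni, Holm, and Benjamini-Hochberg stand in for the compared methods. All function and parameter names are hypothetical. Step 3, the decision theoretic evaluation, is not reproduced because its scheme is not described here.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def random_covariance(m, rng):
    """Positive definite m x m matrix, built as A A^T + I (assumed construction)."""
    a = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(m)]
    return [[sum(a[i][k] * a[j][k] for k in range(m)) + (1.0 if i == j else 0.0)
             for j in range(m)] for i in range(m)]


def cholesky(s):
    """Lower-triangular factor low with low * low^T = s."""
    m = len(s)
    low = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def p_values(data):
    """Two-sided p-values for H0: mean = 0 per column (normal approximation)."""
    n = len(data)
    out = []
    for col in zip(*data):
        z = mean(col) / (stdev(col) / math.sqrt(n))
        out.append(2.0 * (1.0 - NormalDist().cdf(abs(z))))
    return out


def bonferroni(p, alpha):
    return [pi <= alpha / len(p) for pi in p]


def holm(p, alpha):
    """Step-down procedure: stop at the first p-value above its threshold."""
    m = len(p)
    reject = [False] * m
    for rank, i in enumerate(sorted(range(m), key=lambda i: p[i])):
        if p[i] > alpha / (m - rank):
            break
        reject[i] = True
    return reject


def benjamini_hochberg(p, alpha):
    """Step-up procedure: reject the k smallest p-values, k the largest passing rank."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    k = max((r + 1 for r, i in enumerate(order) if p[i] <= alpha * (r + 1) / m),
            default=0)
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


METHODS = {"bonferroni": bonferroni, "holm": holm, "bh": benjamini_hochberg}


def simulate_setting(m=10, n=100, n_datasets=100, n_repeats=100,
                     effect=0.5, alpha=0.05, seed=0):
    """Return {(proportion of true nulls, method): (mean type I, mean type II)}."""
    rng = random.Random(seed)
    proportions = (0.1, 0.5, 0.9)
    totals = {(pi0, name): [0.0, 0.0] for pi0 in proportions for name in METHODS}
    for _ in range(n_repeats):                       # step 2: repeat step 1
        low = cholesky(random_covariance(m, rng))    # step 1: new covariance matrix
        for pi0 in proportions:
            m0 = round(pi0 * m)                      # number of true nulls
            mu = [0.0] * m0 + [effect] * (m - m0)    # expectation vector
            for _ in range(n_datasets):
                data = []
                for _ in range(n):                   # one correlated normal row
                    z = [rng.gauss(0, 1) for _ in range(m)]
                    data.append([mu[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                                 for i in range(m)])
                p = p_values(data)
                for name, method in METHODS.items():
                    rej = method(p, alpha)
                    totals[(pi0, name)][0] += sum(rej[:m0])                # type I
                    totals[(pi0, name)][1] += sum(not r for r in rej[m0:])  # type II
    runs = n_repeats * n_datasets
    return {key: (t1 / runs, t2 / runs) for key, (t1, t2) in totals.items()}


# Small demonstration run; the defaults follow the synopsis but are slow in pure Python.
for key, errors in simulate_setting(m=10, n=30, n_datasets=5, n_repeats=2).items():
    print(key, errors)
```

The defaults mirror the counts in the synopsis (100 data sets of size 100 × m, repeated 100 times). A vectorized numerical library would be the natural choice for runs of that size; the standard-library version is used here only to keep the sketch self-contained.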

About this article

Cite this article

Bartenschlager, C.C., Brunner, J.O. Reaching for the stars: attention to multiple testing problems and method recommendations using simulation for business research. J Bus Econ 89, 447–479 (2019). https://doi.org/10.1007/s11573-018-0919-3

Published: 20 October 2018
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s11573-018-0919-3
