A 20-Year Review of Outcome Reporting Bias in Moderated Multiple Regression

O’Boyle, Ernest; Banks, George C.; Carter, Kameron; Walter, Sheryl; Yuan, Zhenyu

doi:10.1007/s10869-018-9539-8

Original Paper
Published: 20 April 2018

Volume 34, pages 19–37 (2019)

Journal of Business and Psychology

Abstract

Moderated multiple regression (MMR) remains the most popular method of testing interactions in management and applied psychology. Recent discussions of MMR have centered on its small effect sizes and the tendency for MMR tests to be statistically underpowered (e.g., Murphy & Russell, Organizational Research Methods, 2016). Although many MMR tests are likely plagued by type II errors, they may also be particularly prone to outcome reporting bias (ORB) resulting in elevated false positives (type I errors). We tested the state of MMR through a 20-year review of six leading journals. Based on 1218 MMR tests nested within 343 studies, we found that despite low statistical power, most MMR tests (54%) were reported as statistically significant. Further, although sample size has remained relatively unchanged (r = −.002), statistically significant MMR tests have risen from 41% (1995–1999) to 49% (2000–2004), to 60% (2005–2009), and to 69% (2010–2014). This could indicate greater methodological and theoretical precision but leaves open the possibility of ORB. In our review, we found evidence that both increased rigor and theoretical precision play an important role in MMR effect size magnitudes, but also found evidence for ORB. Specifically, (a) smaller sample sizes are associated with larger effect sizes, (b) there is a substantial frequency spike in p values just below the .05 threshold, and (c) recalculated p values less than .05 always converged with authors’ conclusions of statistical significance but recalculated p values between .05 and .10 only converged with authors’ conclusions about half (54%) of the time. The findings of this research provide important implications for future application of MMR.
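
The convergence check in point (c) can be pictured with a small sketch: recompute a two-tailed p value from a reported t statistic and its residual degrees of freedom, then compare the result with the significance claim made in the article. This is an illustration only and is not taken from the paper. The function names, the Simpson's-rule integration of the t density, and the three example tests are all assumptions made for the sketch.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p value for a t statistic, via Simpson's rule on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_mass)

def convergence(tests, alpha=0.05):
    """Flag whether each recalculated p value agrees with the reported conclusion."""
    rows = []
    for t_stat, df, reported_sig in tests:
        p = two_tailed_p(t_stat, df)
        rows.append((round(p, 4), reported_sig, (p < alpha) == reported_sig))
    return rows

# Hypothetical tests: (t statistic, residual df, authors claimed significance)
example = [(2.60, 150, True), (1.80, 150, True), (0.90, 150, False)]
for row in convergence(example):
    print(row)
```

In this made-up input, the second test is the kind of case the abstract highlights: its recalculated p value falls between .05 and .10 while the reported conclusion is "significant", so it is flagged as not converging.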

Notes

We did conduct a series of multilevel analyses, and the results provided to the editor and reviewers are virtually identical to the meta-regression results presented below. Further, we tested the models with different weighting schemes (e.g., unweighted, weighted by sample size, weighted by inverse standard error of the semipartial correlation), various effect sizes (e.g., semipartial correlation, shrunken semipartial correlation, f², shrunken f²) calculated in different ways (e.g., based on t statistics using Cohen and Cohen’s (1983) formulas, change in R² alone), with and without outliers, and with a variety of subsamples in the data (e.g., randomly selected effect sizes from a study, averaged effect sizes). Our intention was not to “hack” the data, but to ensure that our results were robust. Across the more than 30 different analyses, our results are remarkably consistent not just in the overall conclusions, but also in the specific effect size directions and magnitudes for the focal variables. The full set of analyses is available from the first author.
We thank an anonymous reviewer for raising this concern and recommending the unweighted approach for the reported analyses.
Again, we are thankful to an anonymous reviewer for this suggestion.
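
As a rough illustration of how the effect sizes named in the first note relate to a reported t statistic, the sketch below uses the standard identities for a single-degree-of-freedom term: F = t², ΔR² = t²(1 − R²)/df, and f² = ΔR²/(1 − R²) = t²/df, where R² is from the full model and df is its residual degrees of freedom. It is not claimed to reproduce the authors' exact computations, and the shrunken variants are not covered. The function name, argument names, and example values are assumptions.

```python
import math

def mmr_effect_sizes(t_stat, df_resid, r2_full):
    """Effect sizes for a one-df interaction (product) term.

    t_stat   : t statistic of the product term in the full model
    df_resid : residual degrees of freedom of the full model
    r2_full  : R-squared of the full model, product term included
    """
    f2 = t_stat ** 2 / df_resid        # Cohen's f^2 = delta R^2 / (1 - R^2)
    delta_r2 = f2 * (1.0 - r2_full)    # change in R^2 = squared semipartial r
    sr = math.copysign(math.sqrt(delta_r2), t_stat)  # signed semipartial r
    return {"f2": f2, "delta_r2": delta_r2, "sr": sr}

# Hypothetical reported values, chosen only to show the arithmetic
print(mmr_effect_sizes(t_stat=2.0, df_resid=100, r2_full=0.20))
```

Because f² depends only on t and the residual degrees of freedom, it can be recovered even when an article omits the full-model R²; the semipartial correlation cannot.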

References

Aguinis, H., & Gottfredson, R. K. (2010). Best-practice recommendations for estimating interaction effects using moderated multiple regression. Journal of Organizational Behavior, 31, 776–786. https://doi.org/10.1002/job.686.
Aguinis, H., & Stone-Romero, E. F. (1997). Methodological artifacts in moderated multiple regression and their effects on statistical power. Journal of Applied Psychology, 82, 192–206. https://doi.org/10.1037//0021-9010.82.1.192.
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks, CA: Sage.
Antonakis, J. (2017). On doing better science: From thrill of discovery to policy implications. The Leadership Quarterly, 28(1), 5–21.
Banks, G. C., Kepes, S., & McDaniel, M. A. (2015). Publication bias: Understanding the myths concerning threats to the advancement of science. In C. E. Lance & R. J. Vandenberg (Eds.), More statistical and methodological myths and urban legends (pp. 36–64). New York, NY: Routledge.
Banks, G. C., & McDaniel, M. A. (2011). The kryptonite of evidence-based I-O psychology. Industrial and Organizational Psychology: Perspectives on Science and Practice, 4, 40–44. https://doi.org/10.1111/j.1754-9434.2010.01292.x.
Banks, G. C., O’Boyle, E. H., Pollack, J. M., White, C. D., Batchelor, J. H., Whelpley, C. E., et al. (2016). Questions about questionable research practices in the field of management: A guest commentary. Journal of Management, 42, 5–20. https://doi.org/10.1177/0149206315619011.
Banks, G. C., Rogelberg, S. G., Woznyj, H. M., Landis, R. S., & Rupp, D. E. (2016). Evidence on questionable research practices: The good, the bad, and the ugly. Journal of Business and Psychology, 31, 323–338. https://doi.org/10.1007/s10869-016-9456-7.
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182. https://doi.org/10.1037/0022-3514.51.6.1173.
Bennett, R. J., & Robinson, S. L. (2000). Development of a measure of workplace deviance. Journal of Applied Psychology, 85, 349–360. https://doi.org/10.1037/0021-9010.85.3.349.
Bergh, D. D., Sharp, B. M., & Li, M. (2017). Tests for identifying “red flags” in empirical findings: Demonstration and recommendations for authors, reviewers, and editors. Academy of Management Learning and Education, 16, 110–124. https://doi.org/10.5465/amle.2015.0406.
Biemann, T. (2013). What if we were Texas sharpshooters? Predictor reporting bias in regression analysis. Organizational Research Methods, 16, 335–363. https://doi.org/10.1177/1094428113485135.
Bobko, P. (1986). A solution to some dilemmas when testing hypotheses about ordinal interactions. Journal of Applied Psychology, 71, 323–326. https://doi.org/10.1037/0021-9010.71.2.323.
Bosco, F. A., Aguinis, H., Field, J. G., Pierce, C. A., & Dalton, D. R. (2016). HARKing’s threat to organizational research: Evidence from primary and meta-analytic sources. Personnel Psychology, 69, 709–750. https://doi.org/10.1111/peps.12111.
Bosco, F. A., Aguinis, H., Singh, K., Field, J. G., & Pierce, C. A. (2015). Correlational effect size benchmarks. Journal of Applied Psychology, 100, 431–449. https://doi.org/10.1037/a0038047.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cortina, J. M. (1993). Interaction, nonlinearity, and multicollinearity: Implications for multiple regression. Journal of Management, 19, 915–922. https://doi.org/10.1016/0149-2063(93)90035-L.
Cortina, J. M., Green, J. P., Keeler, K. R., & Vandenberg, R. J. (in press). Degrees of freedom in SEM: Are we testing the models that we claim to test? Organizational Research Methods. 1094428116676345.
Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses recently proposed. Psychological Bulletin, 102, 414–417. https://doi.org/10.1037/0033-2909.102.3.414.
de Winter, J. C., & Dodou, D. (2015). A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too). PeerJ, 3, e733. https://doi.org/10.7717/peerj.733.
Editors. (1909). The reporting of unsuccessful cases. The Boston Medical and Surgical Journal, 161, 263–264. https://doi.org/10.1056/NEJM190908191610809.
Edwards, J. R., & Berry, J. W. (2010). The presence of something or the absence of nothing: Increasing theoretical precision in management research. Organizational Research Methods, 13, 668–689. https://doi.org/10.1177/1094428110380467.
Emerson, G. B., Warme, W. J., Wolf, F. M., Heckman, J. D., Brand, R. A., & Leopold, S. S. (2010). Testing for the presence of positive-outcome bias in peer review: A randomized controlled trial. Archives of Internal Medicine, 170, 1934–1939. https://doi.org/10.1001/archinternmed.2010.406.
Evans, M. G. (1985). A Monte Carlo study of the effects of correlated method variance in moderated multiple regression analysis. Organizational Behavior and Human Decision Processes, 36, 305–323. https://doi.org/10.1016/0749-5978(85)90002-0.
Fanelli, D. (2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90, 891–904. https://doi.org/10.1007/s11192-011-0494-7.
Finkel, E. J., Eastwick, P. W., & Reis, H. T. (2015). Best research practices in psychology: Illustrating epistemological and pragmatic considerations with the case of relationship science. Journal of Personality and Social Psychology, 108, 275–297. https://doi.org/10.1037/pspi0000007.
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345, 1502–1505. https://doi.org/10.1126/science.1255484.
Gerber, A. S., & Malhotra, N. (2008a). Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science, 3, 313–326. https://doi.org/10.1561/100.00008024.
Gerber, A. S., & Malhotra, N. (2008b). Publication bias in empirical sociological research: Do arbitrary significance levels distort published results? Sociological Methods & Research, 37, 3–30. https://doi.org/10.1177/0049124108318973.
Grand, J. A., Rogelberg, S. G., Banks, G. C., Landis, R. S., & Tonidandel, S. (in press). From outcome to process focus: Fostering a more robust psychological science through registered reports and results-blind reviewing. Perspectives on Psychological Science.
Greco, L. M., O’Boyle, E. H., Cockburn, B. S., & Yuan, Z. (in press). A reliability generalization examination of organizational behavior constructs. Journal of Management Studies.
Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1–20. https://doi.org/10.1037/h0076157.
Hardwicke, T. E., Mathur, M., MacDonald, K., Nilsonne, G., Banks, G. C., Kidwell, M. C., ... Tessler, M. H. (2018). Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition.
Hartgerink, C. H., van Aert, R. C., Nuijten, M. B., Wicherts, J. M., & Van Assen, M. A. (2016). Distributions of p-values smaller than .05 in psychology: What is going on? PeerJ, 4, e1935. https://doi.org/10.7717/peerj.1935.
Hollenbeck, J. R., & Wright, P. M. (2016). Harking, sharking, and tharking: Making the case for post hoc analysis of scientific data. Journal of Management, 43, 5–18. https://doi.org/10.1177/0149206316679487.
Ioannidis, J. P. A. (2008). Why most discovered true associations are inflated. Epidemiology, 19, 640–648. https://doi.org/10.1097/EDE.0b013e31818131e7.
Jaccard, J., Wan, C. K., & Turrisi, R. (1990). The detection and interpretation of interaction effects between continuous variables in multiple regression. Multivariate Behavioral Research, 25, 467–478. https://doi.org/10.1207/s15327906mbr2504_4.
James, L. R., & Brett, J. M. (1984). Mediators, moderators, and tests for mediation. Journal of Applied Psychology, 69, 307–321. https://doi.org/10.1037/0021-9010.69.2.307.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524–532. https://doi.org/10.1177/0956797611430953.
Journal Citation Reports® (2014). Social Science Edition. (Thomson Reuters, 2015). http://jcr.incites.thomsonreuters.com.
Kepes, S., Banks, G. C., McDaniel, M. A., & Whetzel, D. L. (2012). Publication bias in the organizational sciences. Organizational Research Methods, 15, 624–662. https://doi.org/10.1177/1094428112452760.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217. https://doi.org/10.1207/s15327957pspr0203_4.
Krawczyk, M. (2015). The search for significance: A few peculiarities in the distribution of P values in experimental psychology literature. PLoS One, 10(6), e0127872. https://doi.org/10.1371/journal.pone.0127872.
Kühberger, A., Fritz, A., & Scherndl, T. (2014). Publication bias in psychology: A diagnosis based on the correlation between effect size and sample size. PLoS One, 9(9), e105825. https://doi.org/10.1371/journal.pone.0105825.
LeBreton, J. M. (2016). Editorial. Organizational Research Methods, 19, 3–7. https://doi.org/10.1177/1094428115622097.
LeBreton, J. M., Tonidandel, S., & Krasikova, D. V. (2013). Residualized relative importance analysis: A technique for the comprehensive decomposition of variance in higher order regression models. Organizational Research Methods, 16, 449–473. https://doi.org/10.1177/1094428113481065.
Leggett, N. C., Thomas, N. A., Loetscher, T., & Nicholls, M. E. (2013). The life of p: “Just significant” results are on the rise. The Quarterly Journal of Experimental Psychology, 66, 2303–2309. https://doi.org/10.1080/17470218.2013.863371.
Masicampo, E. J., & Lalande, D. R. (2012). A peculiar prevalence of p values just below .05. The Quarterly Journal of Experimental Psychology, 65, 2271–2279. https://doi.org/10.1080/17470218.2012.711335.
Matthes, J., Marquart, F., Naderer, B., Arendt, F., Schmuck, D., & Adam, K. (2015). Questionable research practices in experimental communication research: A systematic analysis from 1980 to 2013. Communication Methods and Measures, 9(4), 193–207. https://doi.org/10.1080/19312458.2015.1096334.
Murphy, K. R., & Russell, C. J. (2016). Mend it or end it: Redirecting the search for interactions in the organizational sciences. Organizational Research Methods. 1094428115625322.
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., et al. (2015). Promoting an open research culture: The TOP guidelines for journals. Science, 348, 1422–1425. https://doi.org/10.1126/science.aab2374.
Nosek, B. A., & Bar-Anan, Y. (2012). Scientific utopia: I. Opening scientific communication. Psychological Inquiry, 23(3), 217–243. https://doi.org/10.1080/1047840X.2012.692215.
Nuijten, M. B., Hartgerink, C. H., van Assen, M. A., Epskamp, S., & Wicherts, J. M. (2016). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 48, 1205–1226. https://doi.org/10.3758/s13428-015-0664-2.
O’Boyle, E. H., Banks, G. C., & Gonzalez-Mulé, E. (2017). The chrysalis effect: How ugly initial results metamorphosize into beautiful articles. Journal of Management, 43, 376–399. doi: 0149206314527133.
Orlitzky, M. (2012). How can significance tests be deinstitutionalized? Organizational Research Methods, 15, 199–228. https://doi.org/10.1177/1094428111428356.
Porter, T. M. (1992). Quantification and the accounting ideal in science. Social Studies of Science, 22, 633–652. https://doi.org/10.1177/030631292022004004.
Robinson, S. L., & Bennett, R. J. (1995). A typology of deviant workplace behaviors: A multidimensional scaling study. Academy of Management Journal, 38, 555–572. https://doi.org/10.2307/256693.
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86, 638–641. https://doi.org/10.1037/0033-2909.86.3.638.
Russell, C. J., & Bobko, P. (1992). Moderated regression analysis and Likert scales: Too coarse for comfort. Journal of Applied Psychology, 77, 336–342. https://doi.org/10.1037//0021-9010.77.3.336.
Scandura, T. A., & Williams, E. A. (2000). Research methodology in management: Current practices, trends, and implications for future research. Academy of Management Journal, 43, 1248–1264. https://doi.org/10.2307/1556348.
Schmidt, F. L., & Hunter, J. E. (2015). Methods of meta-analysis: Correcting error and bias in research findings (3rd ed.). Thousand Oaks, CA: Sage.
Schwab, A., & Starbuck, W. H. (in press). A call for openness in research reporting: How to turn covert practices into helpful tools. Academy of Management Learning and Education, 16, 125–141. https://doi.org/10.5465/amle.2016.0039.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366. https://doi.org/10.1177/0956797611417632.
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2015). Better P-curves: Making P-curve analysis more robust to errors, fraud, and ambitious P-hacking, a reply to Ulrich and Miller (2015). Journal of Experimental Psychology: General, 144, 1146–1152. https://doi.org/10.1037/xge0000104.
Song, F., Parekh, S., Hooper, L., Loke, Y. K., Ryder, J., Sutton, A. J., et al. (2010). Dissemination and publication of research findings: An updated review of related biases. Health Technology Assessment, 14, 1–220. https://doi.org/10.3310/hta14080.
Spector, P. E., & Fox, S. (2005). The stressor-emotion model of counterproductive work behavior. In S. Fox & P. E. Spector (Eds.), Counterproductive work behavior: Investigations of actors and targets (pp. 151–174). Washington, DC: American Psychological Association.
Starbuck, W. H. (in press). 60th anniversary essay: How journals could improve research practices in social science. Administrative Science Quarterly, 61, 165–183. https://doi.org/10.1177/0001839216629644.
Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance—Or vice versa. Journal of the American Statistical Association, 54, 30–34. https://doi.org/10.1080/01621459.1959.10501497.
Tonidandel, S., & LeBreton, J. M. (2011). Relative importance analysis: A useful supplement to regression analysis. Journal of Business and Psychology, 26, 1–9. https://doi.org/10.1007/s10869-010-9204-3.
Tsang, E. W., & Kwan, K. M. (1999). Replication and theory development in organizational science: A critical realist perspective. Academy of Management Review, 24, 759–780. https://doi.org/10.5465/AMR.1999.2553252.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the Metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03.
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7, 632–638. https://doi.org/10.1177/1745691612463078.
Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS One, 6(11), e26828. https://doi.org/10.1371/journal.pone.0026828.

Author information

Authors and Affiliations

Kelley School of Business, Indiana University, Bloomington, IN, USA
Ernest O’Boyle & Sheryl Walter
Department of Management in the Belk College of Business at UNC Charlotte, Charlotte, NC, USA
George C. Banks
Tippie College of Business, University of Iowa, Iowa City, IA, USA
Kameron Carter & Zhenyu Yuan

Corresponding author

Correspondence to Ernest O’Boyle.

About this article

Cite this article

O’Boyle, E., Banks, G.C., Carter, K. et al. A 20-Year Review of Outcome Reporting Bias in Moderated Multiple Regression. J Bus Psychol 34, 19–37 (2019). https://doi.org/10.1007/s10869-018-9539-8

Published: 20 April 2018
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s10869-018-9539-8
