Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.
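The calculation behind such a Bayes factor can be sketched in a few lines. The block below is a minimal illustration, not the Web-based program mentioned above. It assumes the one-sample JZS form of the Bayes factor, with a unit-scale Cauchy prior on effect size written as a mixture over a variance parameter g. The function name `jzs_bf01`, the midpoint integration rule, and the `steps` parameter are illustrative choices, not taken from the article.

```python
import math


def jzs_bf01(t, n, steps=20000):
    """Approximate the JZS Bayes factor B01 for a one-sample t test.

    Assumes a unit-scale Cauchy prior on effect size (an assumption of
    this sketch). Values above 1 favor the null hypothesis; values
    below 1 favor the alternative.
    """
    if n < 2:
        raise ValueError("need at least two observations")
    nu = n - 1  # degrees of freedom
    power = -(nu + 1) / 2.0

    # Marginal likelihood under the null, up to a constant shared with
    # the alternative.
    null_term = (1.0 + t * t / nu) ** power

    # Marginal likelihood under the alternative: integrate over g in
    # (0, inf). The substitution g = u / (1 - u) maps this to u in (0, 1),
    # where the integrand stays bounded, so a midpoint rule suffices.
    total = 0.0
    h = 1.0 / steps
    for i in range(steps):
        u = (i + 0.5) * h
        g = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2
        scale = 1.0 + n * g
        likelihood = scale ** -0.5 * (1.0 + t * t / (scale * nu)) ** power
        # Inverse-gamma(1/2, 1/2) density on g.
        prior = (2.0 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1.0 / (2.0 * g))
        total += likelihood * prior * jacobian * h

    return null_term / total


if __name__ == "__main__":
    for t_value in (0.0, 1.0, 2.0, 3.0):
        print(t_value, jzs_bf01(t_value, n=20))
```

Under these assumptions a t value of 0 yields a Bayes factor above 1, that is, evidence for the null, and the Bayes factor falls as |t| grows. This is the two-directional behavior that conventional significance testing lacks.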
This research was supported by NSF Grant SES-0720229 and NIMH Grant R01-MH071418.
Rouder, J.N., Speckman, P.L., Sun, D. et al. Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review 16, 225–237 (2009). https://doi.org/10.3758/PBR.16.2.225
- Akaike Information Criterion
- Marginal Likelihood
- Posterior Odds
- Subliminal Priming
- Prior Standard Deviation