Bayesian t tests for accepting and rejecting the null hypothesis

Abstract

Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.
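As a rough illustration of what such a calculation can look like, the sketch below computes one widely used form of a Bayes factor for the one-sample t test, often called the JZS Bayes factor. The abstract above does not state a formula, so the specific form is an assumption: a unit-scale Cauchy prior on effect size, written as an integral over a variance parameter g, for a one-sample design with N observations. The function name `jzs_bayes_factor_01` and the integration settings are hypothetical choices made for this example. It is a minimal sketch that uses only the standard library, not the Web-based program mentioned above.

```python
import math


def jzs_bayes_factor_01(t, n):
    """Approximate B01 (evidence for the null over the alternative).

    Assumes a one-sample t test with n observations and a unit-scale
    Cauchy prior on effect size (an assumption of this sketch).
    """
    if n < 2:
        raise ValueError("need at least two observations")
    nu = n - 1  # degrees of freedom for the one-sample case

    # Null term, up to a constant that is shared with the alternative.
    null_term = (1 + t * t / nu) ** (-(nu + 1) / 2)

    def integrand(x):
        # Integrate over x = log(g) so the range (0, inf) becomes the
        # whole real line and both tails decay quickly.
        g = math.exp(x)
        scale = 1 + n * g
        likelihood = scale ** -0.5 * (1 + t * t / (scale * nu)) ** (-(nu + 1) / 2)
        # Inverse chi-square(1) density on g; mixing over it yields a Cauchy prior.
        prior = (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g))
        return likelihood * prior * g  # Jacobian: dg = g dx

    # Plain trapezoid rule on a wide, fixed grid (hypothetical settings).
    lo, hi, steps = -10.0, 40.0, 5000
    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return null_term / (total * h)


if __name__ == "__main__":
    for t_value in (0.0, 1.0, 2.0, 3.0, 5.0):
        print(t_value, jzs_bayes_factor_01(t_value, 20))
```

Values above 1 favor the null hypothesis and values below 1 favor the alternative, which is how a single quantity can express preference in either direction. For a fixed sample size, the value shrinks as the absolute t statistic grows.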

Author information

Corresponding author

Correspondence to Jeffrey N. Rouder.

Additional information

This research was supported by NSF Grant SES-0720229 and NIMH Grant R01-MH071418.

About this article

Cite this article

Rouder, J.N., Speckman, P.L., Sun, D. et al. Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review 16, 225–237 (2009). https://doi.org/10.3758/PBR.16.2.225

Keywords

  • Akaike Information Criterion
  • Marginal Likelihood
  • Posterior Odds
  • Subliminal Priming
  • Prior Standard Deviation