Psychonomic Bulletin & Review, Volume 16, Issue 2, pp 225–237

Bayesian t tests for accepting and rejecting the null hypothesis

  • Jeffrey N. Rouder
  • Paul L. Speckman
  • Dongchu Sun
  • Richard D. Morey
  • Geoffrey Iverson
Theoretical and Review Articles

Abstract

Progress in science often comes from discovering invariances in relationships among variables; these invariances often correspond to null hypotheses. As is commonly known, it is not possible to state evidence for the null hypothesis in conventional significance testing. Here we highlight a Bayes factor alternative to the conventional t test that will allow researchers to express preference for either the null hypothesis or the alternative. The Bayes factor has a natural and straightforward interpretation, is based on reasonable assumptions, and has better properties than other methods of inference that have been advocated in the psychological literature. To facilitate use of the Bayes factor, we provide an easy-to-use, Web-based program that performs the necessary calculations.

Copyright information

© Psychonomic Society, Inc. 2009

Authors and Affiliations

  • Jeffrey N. Rouder (1)
  • Paul L. Speckman (1)
  • Dongchu Sun (1)
  • Richard D. Morey (1)
  • Geoffrey Iverson (2)

  1. Department of Psychological Sciences, University of Missouri, Columbia
  2. University of California, Irvine
