Psychonomic Bulletin & Review, Volume 12, Issue 4, pp 573–604

An introduction to Bayesian hierarchical models with an application in the theory of signal detection

Theoretical And Review Articles


Although many nonlinear models of cognition have been proposed in the past 50 years, there has been little consideration of corresponding statistical techniques for their analysis. In analyses with nonlinear models, unmodeled variability from the selection of items or participants may lead to asymptotically biased estimation. This asymptotic bias, in turn, renders inference problematic. We show, for example, that a signal detection analysis of recognition memory data leads to asymptotic underestimation of sensitivity. To eliminate asymptotic bias, we advocate hierarchical models in which participant variability, item variability, and measurement error are modeled simultaneously. By accounting for multiple sources of variability, hierarchical models yield consistent and accurate estimates of participant and item effects in recognition memory. This article is written in tutorial format; we provide an introduction to Bayesian statistics, hierarchical modeling, and Markov chain Monte Carlo computational techniques.
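The underestimation of sensitivity described above can be illustrated with a minimal numerical sketch. It is not the authors' analysis. It assumes an equal-variance signal detection model in which each item adds a normally distributed effect (standard deviation `item_sd`) on the probit scale to both hit and false-alarm probabilities, and it computes what d′ would be estimated from rates aggregated over many items. The function names, the parameter values, and the use of evenly spaced quantiles as stand-ins for items are all assumptions made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is the probit link's inverse


def aggregated_rate(mean_probit, item_sd, n_items=20000):
    """Average response probability over items with N(0, item_sd^2) probit effects."""
    total = 0.0
    for k in range(n_items):
        # Hypothetical "items": evenly spaced quantiles of the item-effect
        # distribution, so the result is deterministic (no random sampling).
        effect = item_sd * STD_NORMAL.inv_cdf((k + 0.5) / n_items)
        total += STD_NORMAL.cdf(mean_probit + effect)
    return total / n_items


def aggregated_dprime(true_dprime, criterion, item_sd):
    """d' computed from hit and false-alarm rates pooled across items."""
    hit = aggregated_rate(true_dprime / 2 - criterion, item_sd)
    false_alarm = aggregated_rate(-true_dprime / 2 - criterion, item_sd)
    return STD_NORMAL.inv_cdf(hit) - STD_NORMAL.inv_cdf(false_alarm)


# Assumed values: true d' = 2, unbiased criterion, item effects with SD 1.
no_item_variability = aggregated_dprime(2.0, 0.0, 0.0)
with_item_variability = aggregated_dprime(2.0, 0.0, 1.0)
print(no_item_variability, with_item_variability)
```

Under these assumptions the pooled estimate shrinks toward d′/√(1 + σ²), where σ is the item standard deviation, so with σ = 1 a true d′ of 2 is recovered as roughly 1.41 no matter how many items are pooled. Collecting more data does not remove the shortfall, which is why it is an asymptotic bias rather than sampling noise.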





Copyright information

© Psychonomic Society, Inc. 2005

Authors and Affiliations

  1. Department of Psychological Sciences, University of Missouri, Columbia
  2. American University, Washington, D.C.
