Assessing individual differences in categorical data

Smith, Jared B.; Batchelder, William H.

doi:10.3758/PBR.15.4.713

Theoretical and Review Articles
Published: August 2008

Volume 15, pages 713–731, (2008)
Psychonomic Bulletin & Review

Abstract

In cognitive modeling, data are often categorical observations taken over participants and items. Usually, subsets of these observations are pooled and analyzed by a cognitive model that assumes the category counts come from a multinomial distribution with the same model parameters underlying all observations. It is well known that if there are individual differences in participants and/or items, a model analysis of the pooled data may be quite misleading, and in such cases it may be appropriate to augment the cognitive model with parametric random effects assumptions. On the other hand, if random effects that are not needed are incorporated into a cognitive model, the resulting model may be more flexible than the multinomial model that assumes no heterogeneity, and this may lead to overfitting. This article presents Monte Carlo statistical tests for directly detecting individual participant and/or item heterogeneity that depend only on the data structure itself. These tests are based on the fact that heterogeneity in participants and/or items results in overdispersion of certain category count statistics. It is argued that the methods developed in the article should be applied to any set of participant × item categorical data prior to cognitive model-based analyses.
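
The overdispersion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure; it is one simple Monte Carlo permutation test built on the same principle, under assumptions made here for illustration: responses are binary (two categories), the data form a complete participant × item array, and the test statistic is the variance of the per-participant counts. The function names `row_sum_variance`, `participant_heterogeneity_test`, and `simulate` are hypothetical. Shuffling each item column across participants keeps every item's category totals fixed while removing any stable participant differences, so an observed variance far above the shuffled values suggests participant heterogeneity.

```python
import random
from statistics import pvariance


def row_sum_variance(data):
    """Variance of the per-participant counts of category 1 (rows = participants)."""
    return pvariance([sum(row) for row in data])


def participant_heterogeneity_test(data, n_sim=2000, seed=0):
    """Monte Carlo permutation test for participant heterogeneity (illustrative).

    data: list of rows (participants), each a list of 0/1 responses per item.
    Returns (observed statistic, Monte Carlo p-value).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    observed = row_sum_variance(data)
    # Work column by column so each item's totals are preserved exactly.
    columns = [[row[j] for row in data] for j in range(n_items)]
    exceed = 0
    for _ in range(n_sim):
        shuffled = []
        for col in columns:
            col = col[:]          # copy so the original column is untouched
            rng.shuffle(col)      # reassign this item's responses to participants
            shuffled.append(col)
        sim = [list(r) for r in zip(*shuffled)]  # back to participant rows
        if row_sum_variance(sim) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_sim + 1)


def simulate(probs, n_items, rng):
    """Simulate 0/1 data: participant i answers each item with probability probs[i]."""
    return [[1 if rng.random() < p else 0 for _ in range(n_items)] for p in probs]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical example: 30 participants, 40 items.
    heterogeneous = simulate([0.1 + 0.8 * i / 29 for i in range(30)], 40, rng)
    homogeneous = simulate([0.5] * 30, 40, rng)
    print("heterogeneous:", participant_heterogeneity_test(heterogeneous, n_sim=500))
    print("homogeneous:  ", participant_heterogeneity_test(homogeneous, n_sim=500))
```

Swapping the roles of rows and columns gives the analogous check for item heterogeneity. The statistic and the shuffling scheme are design choices of this sketch; the article itself should be consulted for the specific statistics and null distributions the authors use.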

Author information

Authors and Affiliations

Department of Cognitive Sciences, University of California, Irvine, CA 92697-5100
Jared B. Smith & William H. Batchelder

Corresponding author

Correspondence to William H. Batchelder.

Additional information

Work on this article was supported by two grants from the National Science Foundation: SES-0136115 to A. K. Romney and W.H.B. (Co-PIs) and SES-0616657 to X. Hu and W.H.B. (Co-PIs). In addition, we acknowledge the support from the Department of Cognitive Sciences and the Institute for Mathematical Behavioral Sciences for summer fellowship assistance to J.B.S.

About this article

Cite this article

Smith, J.B., Batchelder, W.H. Assessing individual differences in categorical data. Psychonomic Bulletin & Review 15, 713–731 (2008). https://doi.org/10.3758/PBR.15.4.713

Received: 14 June 2007
Accepted: 26 December 2007
Issue Date: August 2008
DOI: https://doi.org/10.3758/PBR.15.4.713
