Statistical inference for multiple choice tests

Psychometrika

Abstract

Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate the potential advantages and flexibility of the exponential family model and of the BCM technique.
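As a point of orientation only, the classical beta-binomial baseline that the abstract says is being generalized can be sketched in a few lines. This is not the authors' exponential family model or their BCM procedure. The function names, the prior parameters `a` and `b`, and the `-2 log L + 2k` form of AIC are assumptions chosen for illustration. Under a Beta(a, b) prior on the true score and a binomial observed score on n items, the posterior mean of the true score is linear in the observed score, which is the textbook analogue of a regression of true score upon observed score.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(observed score = x) on an n-item test under a Beta(a, b) prior."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(a + x, b + n - x) - log_beta(a, b))


def posterior_mean_true_score(x, n, a, b):
    """Posterior mean of the true score (proportion correct); linear in x."""
    return (a + x) / (a + b + n)


def beta_binomial_loglik(counts, n, a, b):
    """Log-likelihood, where counts[x] is the number of examinees scoring x."""
    return sum(c * log(beta_binomial_pmf(x, n, a, b))
               for x, c in enumerate(counts) if c > 0)


def aic(log_likelihood, n_params):
    """AIC in its usual form: -2 log L + 2 * (number of fitted parameters)."""
    return 2 * n_params - 2 * log_likelihood
```

Given a vector of score frequencies, `aic(beta_binomial_loglik(counts, n, a, b), 2)` could then be compared with the AIC of a richer model fitted to the same data, with the smaller value preferred.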

References

  • Akaike, H. (1978). A Bayesian analysis of the minimum AIC procedure. Annals of the Institute of Statistical Mathematics, 30(A), 9–14.

  • Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Applied Statistics, 27, 162–167.

  • Anderson, D. A., & Aitkin, M. (1985). Marginal maximum likelihood estimation of item parameters: Application of an algorithm. Journal of the Royal Statistical Society, Series B, 26, 203–210.

  • Atilgan, T. (1983). Parameter parsimony, model selection, and smooth density estimation. Unpublished doctoral dissertation, University of Wisconsin-Madison.

  • Atilgan, T., & Leonard, T. (1988). On the application of AIC to bivariate density estimation, non-parametric regression, and discrimination. In H. Bozdogan & A. K. Gupta (Eds.), Multivariate statistical modeling and data analysis (pp. 1–16). Dordrecht, Holland: Reidel.

  • Bell, S. S. (1990). Empirical Bayes alternatives to the beta-binomial model. Unpublished doctoral dissertation, Columbia University.

  • Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.

  • Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–454.

  • Carter, M. C., & Williford, W. O. (1975). Estimation in a modified binomial distribution. Applied Statistics, 24, 319–328.

  • Consul, P. C. (1974). A simple urn model dependent upon predetermined strategy. Sankhya, Series B, 36, 391–399.

  • Consul, P. C. (1975). On a characterization of Lagrangian Poisson and quasi-binomial distributions. Communications in Statistics, 4, 555–563.

  • Dalal, S. R., & Hall, W. J. (1983). Approximating priors by mixtures of natural conjugate priors. Journal of the Royal Statistical Society, Series B, 45, 278–286.

  • Duncan, G. T. (1974). An empirical Bayes approach to scoring multiple-choice tests in the misinformation model. Journal of the American Statistical Association, 69, 50–57.

  • Gelfand, A. E., & Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.

  • Geweke, J. (1988). Antithetic acceleration of Monte-Carlo integration in Bayesian inference. Journal of Econometrics, 38, 73–89.

  • Geweke, J. (1989). Exact predictive densities for linear models with ARCH disturbances. Journal of Econometrics, 40, 63–86.

  • Hsu, J. S. J. (1990). Bayesian inference and marginalization. Unpublished doctoral dissertation, University of Wisconsin-Madison.

  • Keats, J. A. (1964). Some generalizations of a theoretical distribution of mental test scores. Psychometrika, 29, 215–231.

  • Lehmann, E. L. (1983). Theory of point estimation. New York: John Wiley & Sons.

  • Leonard, T. (1972). Bayesian methods for binomial data. Biometrika, 59, 581–589.

  • Leonard, T. (1973). A Bayesian method for histograms. Biometrika, 60, 297–308.

  • Leonard, T. (1982). Comment on the paper by Lejeune and Faulkenberry. Journal of the American Statistical Association, 77, 657–658.

  • Leonard, T. (1984). Some data-analytic modifications to Bayes-Stein estimation. Annals of the Institute of Statistical Mathematics, 36, 11–21.

  • Leonard, T., Hsu, J. S. J., & Tsui, K. (1989). Bayesian marginal inference. Journal of the American Statistical Association, 84, 1051–1058.

  • Leonard, T., & Novick, M. R. (1986). Bayesian full rank marginalization for two-way contingency tables. Journal of Educational Statistics, 11, 33–56.

  • Lord, F. M. (1965). A strong true-score theory, with applications. Psychometrika, 30, 239–270.

  • Lord, F. M. (1969). Estimating true-score distributions in psychological testing: An empirical Bayes estimation problem. Psychometrika, 34, 259–299.

  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores (with contributions by Allen Birnbaum). Reading, MA: Addison-Wesley.

  • Lord, F. M., & Stocking, M. L. (1976). An interval estimate for making statistical inference about true scores. Psychometrika, 41, 79–87.

  • McCullagh, P., & Nelder, J. A. (1985). Generalized linear models. New York: Chapman and Hall.

  • Mislevy, R. J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.

  • Morrison, D. G., & Brockway, G. (1979). A modified beta-binomial model with applications to multiple choice and taste tests. Psychometrika, 44, 427–442.

  • Prentice, R. L., & Barlow, W. E. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44, 1033–1048.

  • Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York: John Wiley & Sons.

  • Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

  • Takane, Y., Bozdogan, H., & Shibayama, T. (1987). Ideal point discriminant analysis. Psychometrika, 52, 371–392.

  • Wilcox, R. R. (1981a). A review of the beta-binomial model and its extensions. Journal of Educational Statistics, 6, 3–32.

  • Wilcox, R. R. (1981b). A cautionary note on estimating the reliability of a mastery test with the beta-binomial model. Applied Psychological Measurement, 5, 531–537.

  • Young, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika, 64, 309–318.

Additional information

The authors wish to thank Ella Mae Matsumura for her data set and helpful comments, Frank Baker for his advice on item response theory, Hirotugu Akaike and Taskin Atilgan for helpful discussions regarding AIC, Graham Wood for his advice concerning the class of all binomial mixture models, Yiu Ming Chiu for providing useful references and information on tetrachoric models, and the Editor and two referees for suggesting several references and alternative approaches.

Cite this article

Hsu, J.S.J., Leonard, T. & Tsui, KW. Statistical inference for multiple choice tests. Psychometrika 56, 327–348 (1991). https://doi.org/10.1007/BF02294466

  • DOI: https://doi.org/10.1007/BF02294466
