Volume 56, Issue 2, pp 327–348

Statistical inference for multiple choice tests

  • John S. J. Hsu
  • Tom Leonard
  • Kam-Wah Tsui


Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar, or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate the potential advantages and flexibility of the exponential family model and the BCM technique.
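As a rough illustration of the baseline that the abstract says the exponential family generalizes, the following is a minimal sketch of the standard beta-binomial model. It is not the authors' exponential family model or their BCM procedure. It assumes that the true score is the proportion of items an examinee would answer correctly, that this proportion has a Beta(alpha, beta) prior, and that the observed score on n items is binomial. The function names, the parameter values, and the sample scores are hypothetical. The AIC is written in the conventional form, -2 log L + 2k, and the paper may use a different scaling.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(x, n, alpha, beta):
    """Log-probability of observed score x on n items under the beta-binomial model."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return log_choose + log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta)


def beta_binomial_pmf(x, n, alpha, beta):
    """Probability of observed score x on n items."""
    return exp(beta_binomial_logpmf(x, n, alpha, beta))


def posterior_mean_true_score(x, n, alpha, beta):
    """Posterior mean of the true proportion given observed score x.

    The posterior is Beta(alpha + x, beta + n - x), so its mean is
    linear in x: a regression of true score upon observed score.
    """
    return (alpha + x) / (alpha + beta + n)


def log_likelihood(scores, n, alpha, beta):
    """Marginal log-likelihood of independent observed scores."""
    return sum(beta_binomial_logpmf(x, n, alpha, beta) for x in scores)


def aic(log_lik, num_params):
    """Conventional AIC: -2 log L + 2k (smaller is better)."""
    return -2.0 * log_lik + 2.0 * num_params


if __name__ == "__main__":
    n_items = 20                      # hypothetical test length
    scores = [12, 15, 9, 18, 14]      # hypothetical observed scores
    alpha, beta = 4.0, 2.0            # hypothetical prior parameters, not fitted
    ll = log_likelihood(scores, n_items, alpha, beta)
    print("AIC:", aic(ll, num_params=2))
    for x in scores:
        print(x, posterior_mean_true_score(x, n_items, alpha, beta))
```

In this sketch the prior parameters are fixed by hand. In practice they would be estimated, for example by maximizing the marginal log-likelihood, before comparing AIC values across candidate models.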

Key words

multiple choice test; exponential family; likelihood; Akaike's information criterion; generalized linear model; Bayesian marginalization; importance sampling; regression of true score upon observed score; classical test theory model




  1. Akaike, H. (1978). A Bayesian analysis of the minimum AIC procedure. Annals of the Institute of Statistical Mathematics, 30(A), 9–14.
  2. Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Applied Statistics, 27, 162–167.
  3. Anderson, D. A., & Aitkin, M. (1985). Marginal maximum likelihood estimation of item parameters: Application of an algorithm. Journal of the Royal Statistical Society, Series B, 26, 203–210.
  4. Atilgan, T. (1983). Parameter parsimony, model selection, and smooth density estimation. Unpublished doctoral dissertation, University of Wisconsin-Madison.
  5. Atilgan, T., & Leonard, T. (1988). On the application of AIC to bivariate density estimation, non-parametric regression, and discrimination. In H. Bozdogan & A. K. Gupta (Eds.), Multivariate statistical modeling and data analysis (pp. 1–16). Dordrecht, Holland: Reidel.
  6. Bell, S. S. (1990). Empirical Bayes alternatives to the beta-binomial model. Unpublished doctoral dissertation, Columbia University.
  7. Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
  8. Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–454.
  9. Carter, M. C., & Williford, W. O. (1975). Estimation in a modified binomial distribution. Applied Statistics, 24, 319–328.
  10. Consul, P. C. (1974). A simple urn model dependent upon predetermined strategy. Sankhya, Series B, 36, 391–399.
  11. Consul, P. C. (1975). On a characterization of Lagrangian Poisson and quasi-binomial distributions. Communications in Statistics, 4, 555–563.
  12. Dalal, S. R., & Hall, W. J. (1983). Approximating priors by mixtures of natural conjugate priors. Journal of the Royal Statistical Society, Series B, 45, 278–286.
  13. Duncan, G. T. (1974). An empirical Bayes approach to scoring multiple-choice tests in the misinformation model. Journal of the American Statistical Association, 69, 50–57.
  14. Gelfand, A. E., & Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
  15. Geweke, J. (1988). Antithetic acceleration of Monte-Carlo integration in Bayesian inference. Journal of Econometrics, 38, 73–89.
  16. Geweke, J. (1989). Exact predictive density for linear models with arch distributions. Journal of Econometrics, 40, 63–86.
  17. Hsu, J. S. J. (1990). Bayesian inference and marginalization. Unpublished doctoral dissertation, University of Wisconsin-Madison.
  18. Keats, J. A. (1964). Some generalizations of a theoretical distribution of mental test scores. Psychometrika, 29, 215–231.
  19. Lehmann, E. L. (1983). Theory of point estimation. New York: John Wiley & Sons.
  20. Leonard, T. (1972). Bayesian methods for binomial data. Biometrika, 59, 581–589.
  21. Leonard, T. (1973). A Bayesian method for histograms. Biometrika, 60, 297–308.
  22. Leonard, T. (1982). Comment on the paper by Lejeune and Faulkenberry. Journal of the American Statistical Association, 77, 657–658.
  23. Leonard, T. (1984). Some data-analytic modifications to Bayes-Stein estimation. Annals of the Institute of Statistical Mathematics, 36, 11–21.
  24. Leonard, T., Hsu, J. S. J., & Tsui, K. (1989). Bayesian marginal inference. Journal of the American Statistical Association, 84, 1051–1058.
  25. Leonard, T., & Novick, J. B. (1986). Bayesian full rank marginalization for two-way contingency tables. Journal of Educational Statistics, 11, 33–56.
  26. Lord, F. M. (1965). A strong true-score theory, with applications. Psychometrika, 30, 239–270.
  27. Lord, F. M. (1969). Estimating true-score distributions in psychological testing: An empirical Bayes estimation problem. Psychometrika, 34, 259–299.
  28. Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores (with contributions by Allen Birnbaum). Reading, MA: Addison-Wesley.
  29. Lord, F. M., & Stocking, M. L. (1976). An interval estimate for making statistical inference about true scores. Psychometrika, 41, 79–87.
  30. McCullagh, P., & Nelder, J. A. (1985). Generalized linear models. New York: Chapman and Hall.
  31. Mislevy, R. J. (1986). Bayes modal estimation in item response. Psychometrika, 51, 177–195.
  32. Morrison, D. G., & Brockway, G. (1979). A modified beta-binomial model with applications to multiple choice and taste tests. Psychometrika, 44, 427–442.
  33. Prentice, R. L., & Barlow, W. E. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44, 1033–1048.
  34. Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York: John Wiley and Sons.
  35. Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
  36. Takane, Y., Bozdogan, H., & Shibayama, T. (1987). Ideal point discriminant analysis. Psychometrika, 52, 371–392.
  37. Wilcox, R. R. (1981a). A review of the beta-binomial model and its extensions. Journal of Educational Statistics, 6, 3–32.
  38. Wilcox, R. R. (1981b). A cautionary note on estimating the reliability of a mastery test with the beta-binomial model. Applied Psychological Measurement, 5, 531–537.
  39. Young, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika, 64, 309–318.

Copyright information

© The Psychometric Society 1991

Authors and Affiliations

  • John S. J. Hsu (1)
  • Tom Leonard (2)
  • Kam-Wah Tsui (2)
  1. Department of Statistics and Applied Probability, University of California-Santa Barbara, Santa Barbara
  2. Department of Statistics, The University of Wisconsin, Madison
