Volume 81, Issue 4, pp 921–939

Latent Variable Selection for Multidimensional Item Response Theory Models via \(L_{1}\) Regularization

  • Jianan Sun
  • Yunxiao Chen
  • Jingchen Liu
  • Zhiliang Ying
  • Tao Xin


We develop a latent variable selection method for multidimensional item response theory models. The proposed method identifies the latent traits probed by the items of a multidimensional test. Its basic strategy is to add an \(L_{1}\) penalty term to the log-likelihood. The computation is carried out by the expectation–maximization algorithm combined with the coordinate descent algorithm. Simulation studies show that the resulting estimator is effective at correctly identifying the latent structures. The method is applied to a real dataset involving the Eysenck Personality Questionnaire.
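The way an \(L_{1}\) penalty combined with coordinate descent sets some coefficients exactly to zero can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the penalized item response log-likelihood with an ordinary least-squares lasso, leaves out the expectation–maximization step entirely, and uses simulated data. The function names, the penalty value `lam`, and the simulated coefficients are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding: shrink z toward zero by lam, returning 0 if |z| <= lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Illustrative stand-in for an L1-penalized fit; NOT the MIRT M-step.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # (1/n) * x_j' x_j for each column
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of x_j with the residual that excludes coordinate j
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            # Keep the residual in sync with the updated coefficient
            r += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b


# Simulated example (hypothetical data): only coefficients 0 and 3 are nonzero.
rng = np.random.default_rng(0)
n_obs = 500
X_sim = rng.standard_normal((n_obs, 5))
b_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
y_sim = X_sim @ b_true + 0.5 * rng.standard_normal(n_obs)

b_hat = lasso_coordinate_descent(X_sim, y_sim, lam=0.2)
print(np.round(b_hat, 3))
```

In this toy setting the zero pattern of `b_hat` plays the role that the estimated item-by-trait structure plays in the method described above. In practice the penalty level would be chosen by a criterion such as BIC, which is listed among the keywords.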


latent variable selection; multidimensional item response theory model; \(L_{1}\) regularization; expectation–maximization; BIC



This research was funded by the Fundamental Research Funds for the Central Universities (No. BLX2014-31), NSF grant SES-1323977, NSF grant IIS-1633360, Army Research Office grant W911NF-15-1-0159, NIH grant R01GM047845, and the National Natural Science Foundation of China (31371047; 11171029). We would also like to thank Dr. Paul Barrett for letting us use the EPQ-R data.



Copyright information

© The Psychometric Society 2016

Authors and Affiliations

  • Jianan Sun (1)
  • Yunxiao Chen (2)
  • Jingchen Liu (3)
  • Zhiliang Ying (3)
  • Tao Xin (4)

  1. Beijing Forestry University, Beijing, China
  2. Emory University, Atlanta, USA
  3. Columbia University, New York, USA
  4. Beijing Normal University, Beijing, China
