Latent Variable Selection for Multidimensional Item Response Theory Models via \(L_{1}\) Regularization

Abstract

We develop a latent variable selection method for multidimensional item response theory models. The proposed method identifies the latent traits probed by the items of a multidimensional test. Its basic strategy is to add an \(L_{1}\) penalty term to the log-likelihood. The computation is carried out by the expectation–maximization algorithm combined with the coordinate descent algorithm. Simulation studies show that the resulting estimator is effective at correctly identifying the latent structure. The method is applied to a real dataset involving the Eysenck Personality Questionnaire.

References

  1. Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement, 13, 113–127.

  2. Ackerman, T. A. (1994). Using multidimensional item response theory to understand what items and tests are measuring. Applied Measurement in Education, 7, 255–278.

  3. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.

  4. Ansley, T. N., & Forsyth, R. A. (1985). An examination of the characteristics of unidimensional IRT parameter estimates derived from two-dimensional data. Applied Psychological Measurement, 9, 37–48.

  5. Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541–561.

  6. Bock, D. R., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.

  7. Bock, D. R., Gibbons, R., Schilling, S., Muraki, E., Wilson, D., & Wood, R. (2003). Testfact 4.0. In Computer software and manual. Lincolnwood, IL: Scientific Software International.

  8. Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395–414.

  9. Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis–Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57.

  10. Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39, 1–38.

  11. Donoho, D. L., & Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90, 1200–1224.

  12. Embretson, S. E. (1984). A general latent trait model for response processes. Psychometrika, 49, 175–186.

  13. Embretson, S. E., & Reise, S. P. (2000). Psychometric methods: Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

  14. Eysenck, S., & Barrett, P. (2013). Re-introduction to cross-cultural studies of the EPQ. Personality and Individual Differences, 54(4), 485–489.

  15. Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267–269.

  16. Friedman, J., Hastie, T., Hofling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1, 302–332.

  17. Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1.

  18. Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202.

  19. Kang, T. (2006). Model selection methods for unidimensional and multidimensional IRT models. Madison, WI: University of Wisconsin.

  20. Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

  21. Mallows, C. L. (1973). Some comments on Cp. Technometrics, 15, 661–675.

  22. Maydeu-Olivares, A., & Liu, Y. (2015). Item diagnostics in multivariate discrete data. Psychological Methods, 20, 276–292.

  23. McDonald, R. P. (1967). Nonlinear factor analysis. Psychometric Monographs, No. 15. Richmond, VA: Psychometric Corporation.

  24. McDonald, R. P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6, 379–396.

  25. McKinley, R. L. (1989). Confirmatory analysis of test structure using multidimensional item response theory. Technical Report No. RR-89-31. Princeton, NJ: Educational Testing Service.

  26. McKinley, R. L., & Reckase, M. D. (1982). The use of the general Rasch model with multidimensional item response data. Technical Report No. ONR-82-1. Iowa City, IA: American College Testing Program.

  27. Reckase, M. D. (1972). Development and application of a multivariate logistic latent trait model. Unpublished Doctoral Dissertation, Syracuse University, Syracuse, NY.

  28. Reckase, M. D. (1997). The past and future of multidimensional item response theory. Applied Psychological Measurement, 21, 25–36.

  29. Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer.

  30. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.

  31. Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 583–639.

  32. Svetina, D., & Levy, R. (2012). An overview of software for conducting dimensionality assessment in multidimensional models. Applied Psychological Measurement, 36, 659–669.

  33. Sympson, J. B. (1978). A model for testing with multidimensional items. In D. J. Weiss (Ed.), Proceedings of the 1977 computerized adaptive testing conference (pp. 82–98).

  34. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58, 267–288.

  35. Way, W. D., Ansley, T. N., & Forsyth, R. A. (1988). The comparative effects of compensatory and noncompensatory two-dimensional data on unidimensional IRT estimates. Applied Psychological Measurement, 12, 239–252.

Acknowledgments

This research was funded by Fundamental Research Funds for the Central Universities (No. BLX2014-31), NSF grant SES-1323977, NSF grant IIS-1633360, Army Research Office grant W911NF-15-1-0159, NIH grant R01GM047845, National Natural Science Foundation of China (31371047; 11171029). We also would like to thank Dr. Paul Barrett for letting us use the EPQ-R data.

Author information

Corresponding author

Correspondence to Jingchen Liu.

Appendix

The cyclical coordinate descent algorithm for solving the optimization (12) is as follows. For each item j, there is one difficulty parameter \(b_j\) and K discrimination parameters \(\mathbf {a}_j=(a_{j1},\ldots ,a_{jK}).\) The algorithm updates each of the \(K+1\) variables iteratively according to the following rules. The difficulty parameter carries no \(L_1\) penalty and is updated by

$$\begin{aligned} {\hat{b}_{j}}=b_{j}^{*}-\frac{\partial _{b_{j}}\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}{\partial ^2 _{b_{j}}\hat{Q}_{j}({\mathbf {a} _{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}, \end{aligned}$$

where \(\partial \hat{Q}_{j}\) denotes the derivative of \(\hat{Q}_{j}({\mathbf {a}_{j}} ,\,b_{j}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})\) with respect to \(b_{j}\) or \(a_{jk},\) as labeled by the subscript, and \(\partial ^{2}{\hat{Q}_{j}}\) is the corresponding second derivative. During this update, the discrimination vector \(\mathbf {a}_j\) takes its most up-to-date value. The update employs a local quadratic approximation of \(\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})\) as a function of \(b_j\) with all the other variables fixed. Each discrimination parameter \(a_{jk}\) carries an \(L_1\) penalty and is updated by

$$\begin{aligned} \hat{a}_{jk}=-\frac{S\left( -\partial ^{2}_{a_{jk}}\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})\times a_{jk}^{*}+\partial _{a_{jk}}\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)}),\,\eta \right) }{\partial ^{2}_{a_{jk}}\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}. \end{aligned}$$

The function S is the soft threshold operator (Donoho & Johnstone, 1995):

$$\begin{aligned} S(\delta ,\,\eta )={\text {sign}}(\delta ){(|\delta |-\eta )_{+}}=\left\{ \begin{array}{ll} \delta -\eta , &{} \text {if}\,\delta >0\,\text {and}\,\eta<|\delta |, \\ \delta +\eta , &{} \text {if}\,\delta<0\,\text {and}\,\eta <|\delta |, \\ 0, &{} \text {if}\,\eta \ge |\delta |. \end{array} \right. . \end{aligned}$$
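
The operator S can be written in a few lines of code. The following is a minimal Python sketch of the definition above; the function name `soft_threshold` is a hypothetical choice for illustration and does not come from the article.

```python
def soft_threshold(delta: float, eta: float) -> float:
    """Soft threshold S(delta, eta) = sign(delta) * max(|delta| - eta, 0).

    Hypothetical helper name; eta is assumed to be non-negative.
    """
    shrunk = abs(delta) - eta
    if shrunk <= 0.0:
        # Third case of the definition: eta >= |delta| gives exactly zero.
        return 0.0
    # First two cases: move delta toward zero by eta, keeping its sign.
    return shrunk if delta > 0 else -shrunk


if __name__ == "__main__":
    for delta in (3.0, -3.0, 0.5):
        print(delta, soft_threshold(delta, 1.0))
```

Inputs whose magnitude does not exceed \(\eta\) map to exactly zero, which is how the \(L_1\) penalty can set a discrimination parameter to zero.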

To obtain the above updating rule, we approximate a generic univariate function f(x) by a quadratic function

$$\begin{aligned} f(x) \approx f\left( x_0\right) + f^{\prime }\left( x_0\right) \left( x-x_0\right) + \frac{f^{\prime \prime }(x_0)}{2} \left( x-x_0\right) ^2, \end{aligned}$$

where \(f^{\prime \prime }(x_0)\) is negative. Furthermore, the \(L_1\)-penalized maximization with the approximated function

$$\begin{aligned} \sup _x \left\{ f\left( x_0\right) + f^{\prime }\left( x_0\right) \left( x-x_0\right) + \frac{f^{\prime \prime }(x_0)}{2} \left( x-x_0\right) ^2 - \eta |x|\right\} , \end{aligned}$$

is solved at

$$\begin{aligned} -\frac{S(- f^{\prime \prime }(x_0) x_0 + f^{\prime }(x_0),\,\eta )}{f^{\prime \prime }(x_0)}. \end{aligned}$$
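
The closed-form solution can be checked numerically. The sketch below, in Python, evaluates the formula for a generic expansion point and compares it with a brute-force grid search over the penalized quadratic. All function names, the grid bounds, and the example values are assumptions made for illustration; they are not taken from the article.

```python
def soft_threshold(delta: float, eta: float) -> float:
    """S(delta, eta) = sign(delta) * max(|delta| - eta, 0)."""
    shrunk = abs(delta) - eta
    if shrunk <= 0.0:
        return 0.0
    return shrunk if delta > 0 else -shrunk


def penalized_quadratic_step(x0: float, d1: float, d2: float, eta: float) -> float:
    """Maximizer of d1*(x - x0) + d2/2*(x - x0)**2 - eta*|x|.

    d1 and d2 stand for f'(x0) and f''(x0); d2 must be negative so that
    the quadratic approximation is concave.
    """
    if d2 >= 0.0:
        raise ValueError("the second derivative must be negative")
    return -soft_threshold(-d2 * x0 + d1, eta) / d2


def grid_maximizer(x0, d1, d2, eta, lo=-5.0, hi=5.0, n=20001):
    """Brute-force check: best point of the same objective on a uniform grid."""
    def objective(x):
        return d1 * (x - x0) + 0.5 * d2 * (x - x0) ** 2 - eta * abs(x)

    grid = (lo + (hi - lo) * i / (n - 1) for i in range(n))
    return max(grid, key=objective)


if __name__ == "__main__":
    x0, d1, d2 = 1.0, 0.5, -2.0  # illustrative values only
    for eta in (0.0, 0.3, 3.0):
        closed = penalized_quadratic_step(x0, d1, d2, eta)
        brute = grid_maximizer(x0, d1, d2, eta)
        print(f"eta={eta}: closed form {closed:.4f}, grid search {brute:.4f}")
```

With \(\eta = 0\) the formula reduces to the Newton step \(x_0 - f^{\prime }(x_0)/f^{\prime \prime }(x_0)\), which has the same form as the update for the unpenalized difficulty parameter.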

Cite this article

Sun, J., Chen, Y., Liu, J. et al. Latent Variable Selection for Multidimensional Item Response Theory Models via \(L_{1}\) Regularization. Psychometrika 81, 921–939 (2016). https://doi.org/10.1007/s11336-016-9529-6

Keywords

  • latent variable selection
  • multidimensional item response theory model
  • \(L_{1}\) regularization
  • expectation–maximization
  • BIC