Latent Variable Selection for Multidimensional Item Response Theory Models via $L_{1}$ Regularization

Sun, Jianan; Chen, Yunxiao; Liu, Jingchen; Ying, Zhiliang; Xin, Tao

doi:10.1007/s11336-016-9529-6

Published: 03 October 2016

Psychometrika, Volume 81, pages 921–939 (2016)

Abstract

We develop a latent variable selection method for multidimensional item response theory models. The proposed method identifies latent traits probed by items of a multidimensional test. Its basic strategy is to impose an $L_{1}$ penalty term on the log-likelihood. The computation is carried out by the expectation–maximization algorithm combined with the coordinate descent algorithm. Simulation studies show that the resulting estimator is effective at correctly identifying the latent structures. The method is applied to a real dataset involving the Eysenck Personality Questionnaire.

References

Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement, 13, 113–127.
Ackerman, T. A. (1994). Using multidimensional item response theory to understand what items and tests are measuring. Applied Measurement in Education, 7, 255–278.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Ansley, T. N., & Forsyth, R. A. (1985). An examination of the characteristics of unidimensional IRT parameter estimates derived from two-dimensional data. Applied Psychological Measurement, 9, 37–48.
Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541–561.
Bock, D. R., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
Bock, D. R., Gibbons, R., Schilling, S., Muraki, E., Wilson, D., & Wood, R. (2003). Testfact 4.0. In Computer software and manual. Lincolnwood, IL: Scientific Software International.
Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395–414.
Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis–Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39, 1–38.
Donoho, D. L., & Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90, 1200–1224.
Embretson, S. E. (1984). A general latent trait model for response processes. Psychometrika, 49, 175–186.
Embretson, S. E., & Reise, S. P. (2000). Psychometric methods: Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Eysenck, S., & Barrett, P. (2013). Re-introduction to cross-cultural studies of the EPQ. Personality and Individual Differences, 54(4), 485–489.
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267–269.
Friedman, J., Hastie, T., Höfling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1, 302–332.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202.
Kang, T. (2006). Model selection methods for unidimensional and multidimensional IRT models. Madison, WI: University of Wisconsin.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Mallows, C. L. (1973). Some comments on $C_p$. Technometrics, 15, 661–675.
Maydeu-Olivares, A., & Liu, Y. (2015). Item diagnostics in multivariate discrete data. Psychological Methods, 20, 276–292.
McDonald, R. P. (1967). Nonlinear factor analysis. Psychometric Monographs, No. 15. Richmond, VA: Psychometric Corporation.
McDonald, R. P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6, 379–396.
McKinley, R. L. (1989). Confirmatory analysis of test structure using multidimensional item response theory. Technical Report No. RR-89-31. Princeton, NJ: Educational Testing Service.
McKinley, R. L., & Reckase, M. D. (1982). The use of the general Rasch model with multidimensional item response data. Technical Report No. ONR-82-1. Iowa City, IA: American College Testing Program.
Reckase, M. D. (1972). Development and application of a multivariate logistic latent trait model. Unpublished Doctoral Dissertation, Syracuse University, Syracuse, NY.
Reckase, M. D. (1997). The past and future of multidimensional item response theory. Applied Psychological Measurement, 21, 25–36.
Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461–464.
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 583–639.
Svetina, D., & Levy, R. (2012). An overview of software for conducting dimensionality assessment in multidimensional models. Applied Psychological Measurement, 36, 659–669.
Sympson, J. B. (1978). A model for testing with multidimensional items. In D. J. Weiss (Ed.), Proceedings of the 1977 computerized adaptive testing conference (pp. 82–98).
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58, 267–288.
Way, W. D., Ansley, T. N., & Forsyth, R. A. (1988). The comparative effects of compensatory and noncompensatory two-dimensional data on unidimensional IRT estimates. Applied Psychological Measurement, 12, 239–252.

Acknowledgments

This research was funded by the Fundamental Research Funds for the Central Universities (No. BLX2014-31), NSF grant SES-1323977, NSF grant IIS-1633360, Army Research Office grant W911NF-15-1-0159, NIH grant R01GM047845, and the National Natural Science Foundation of China (31371047; 11171029). We would also like to thank Dr. Paul Barrett for letting us use the EPQ-R data.

Author information

Authors and Affiliations

Beijing Forestry University, Beijing, China
Jianan Sun
Emory University, Atlanta, USA
Yunxiao Chen
Columbia University, New York, USA
Jingchen Liu & Zhiliang Ying
Beijing Normal University, Beijing, China
Tao Xin

Corresponding author

Correspondence to Jingchen Liu.

Appendix

The cyclical coordinate descent algorithm for solving the optimization problem (12) is as follows. For each item j, there is one difficulty parameter $b_j$ and K discrimination parameters $\mathbf {a}_j=(a_{j1},\ldots ,a_{jK}).$ The algorithm updates each of the $K+1$ variables iteratively according to the following updating rule. The difficulty parameter carries no $L_1$ penalty and is updated by

$$\begin{aligned} {\hat{b}_{j}}=b_{j}^{*}-\frac{\partial _{b_{j}}\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}{\partial ^2 _{b_{j}}\hat{Q}_{j}({\mathbf {a} _{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}, \end{aligned}$$

where $\partial \hat{Q}_{j}$ denotes the derivative of $\hat{Q}_{j}({\mathbf {a}_{j}} ,\,b_{j}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})$ with respect to $b_{j}$ or $a_{jk},$ as labeled by the subscript, $\partial ^{2}{\hat{Q}_{j}}$ is the corresponding second derivative, and $b_{j}^{*}$ is the current value of $b_{j}.$ During this update, the discrimination vector $\mathbf {a}_j$ takes its most up-to-date value. The update employs a local quadratic approximation of $\hat{Q}_{j}({\mathbf {a}_{j}},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})$ as a function of $b_j$ with all the other variables fixed. Each discrimination parameter $a_{jk}$ carries an $L_1$ penalty and is updated by

$$\begin{aligned} \hat{a}_{jk}=-\frac{S\left( -\partial ^{2}_{a_{jk}}\hat{Q}_{j}(\mathbf {a}_{j},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})\times a_{jk}^{*}+\partial _{a_{jk}}\hat{Q}_{j}(\mathbf {a}_{j},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)}),\,\eta \right) }{\partial ^{2}_{a_{jk}}\hat{Q}_{j}(\mathbf {a}_{j},\,b_{j}^{*}|\mathbf {a}_{j}^{(t)},\,b_{j}^{(t)})}. \end{aligned}$$

The function S is the soft threshold operator (Donoho & Johnstone, 1995):

$$\begin{aligned} S(\delta ,\,\eta )={\text {sign}}(\delta )(|\delta |-\eta )_{+}=\left\{ \begin{array}{ll} \delta -\eta , & \text {if } \delta >0 \text { and } \eta <|\delta |, \\ \delta +\eta , & \text {if } \delta <0 \text { and } \eta <|\delta |, \\ 0, & \text {if } \eta \ge |\delta |. \end{array} \right. \end{aligned}$$
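The two updating rules and the soft threshold operator can be written as three small functions. The following is a minimal Python sketch, not the authors' implementation: the function names and the arguments `d1` and `d2` (the first and second derivatives of $\hat{Q}_{j}$ with respect to the coordinate being updated, evaluated at its current value) are hypothetical, and $\eta \ge 0$ and `d2 < 0` are assumed.

```python
def soft_threshold(delta: float, eta: float) -> float:
    """S(delta, eta) = sign(delta) * max(|delta| - eta, 0), for eta >= 0."""
    if delta > eta:
        return delta - eta
    if delta < -eta:
        return delta + eta
    return 0.0


def update_difficulty(b_star: float, d1: float, d2: float) -> float:
    """Unpenalized Newton step for b_j, starting from its current value b_star.

    d1 and d2 are assumed to be the first and second derivatives of Q_j
    with respect to b_j at b_star.
    """
    return b_star - d1 / d2


def update_discrimination(a_star: float, d1: float, d2: float, eta: float) -> float:
    """L1-penalized step for a_jk, starting from its current value a_star.

    d1 and d2 are assumed to be the first and second derivatives of Q_j
    with respect to a_jk at a_star, with d2 < 0.
    """
    return -soft_threshold(-d2 * a_star + d1, eta) / d2


# Hypothetical numbers, only to show the thresholding effect.
print(soft_threshold(3.0, 1.0))                      # 2.0
print(update_discrimination(0.2, 0.5, -2.0, 1.0))    # 0.0, the loading is dropped
```

With $\eta = 0$ the discrimination update reduces to the same Newton step used for $b_j$, which is one way to sanity-check an implementation.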

To obtain the above updating rule, we approximate a generic univariate function $f(x)$ around a point $x_0$ by the quadratic function

$$\begin{aligned} f(x) \approx f\left( x_0\right) + f^{\prime }\left( x_0\right) \left( x-x_0\right) + \frac{f^{\prime \prime }(x_0)}{2} \left( x-x_0\right) ^2, \end{aligned}$$

where $f^{\prime \prime }(x_0)$ is negative. Furthermore, the $L_1$-penalized maximization with the approximated function

$$\begin{aligned} \sup _x \left\{ f\left( x_0\right) + f^{\prime }\left( x_0\right) \left( x-x_0\right) + \frac{f^{\prime \prime }(x_0)}{2} \left( x-x_0\right) ^2 - \eta |x|\right\} , \end{aligned}$$

is solved at

$$\begin{aligned} -\frac{S(- f^{\prime \prime }(x_0) x_0 + f^{\prime }(x_0),\,\eta )}{f^{\prime \prime }(x_0)}. \end{aligned}$$
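To illustrate how the updates cycle over the $K+1$ parameters of one item, here is a hedged Python sketch. It assumes, purely for illustration, that $\hat{Q}_{j}$ is a weighted logistic log-likelihood over a fixed set of latent-trait nodes, with response probability $\text{sigmoid}(\mathbf{a}_j^{\top}\boldsymbol{\theta}+b_j)$. The form of the objective, the names `nodes`, `n`, `r`, `derivatives`, and `coordinate_descent_item`, and the toy input are all assumptions and are not taken from the article.

```python
import math


def soft_threshold(delta, eta):
    """S(delta, eta) = sign(delta) * max(|delta| - eta, 0), for eta >= 0."""
    if delta > eta:
        return delta - eta
    if delta < -eta:
        return delta + eta
    return 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def derivatives(a, b, nodes, n, r, k=None):
    """First and second derivatives of the ASSUMED objective
        Q_j(a, b) = sum_g [r_g * log(p_g) + (n_g - r_g) * log(1 - p_g)],
        p_g = sigmoid(a . theta_g + b),
    with respect to b (k is None) or a_k (k an index)."""
    d1 = d2 = 0.0
    for theta, n_g, r_g in zip(nodes, n, r):
        p = sigmoid(sum(a_k * t_k for a_k, t_k in zip(a, theta)) + b)
        x = 1.0 if k is None else theta[k]
        d1 += (r_g - n_g * p) * x
        d2 -= n_g * p * (1.0 - p) * x * x
    return d1, d2


def coordinate_descent_item(nodes, n, r, eta, sweeps=200, tol=1e-10):
    """Cycle through b_j and a_j1..a_jK until the largest change is below tol."""
    a, b = [0.0] * len(nodes[0]), 0.0
    for _ in range(sweeps):
        # Difficulty: plain Newton step, no penalty.
        d1, d2 = derivatives(a, b, nodes, n, r)
        b_new = b - d1 / d2
        change = abs(b_new - b)
        b = b_new
        # Discriminations: soft-thresholded step, using the newest a and b.
        for k in range(len(a)):
            d1, d2 = derivatives(a, b, nodes, n, r, k)
            a_new = -soft_threshold(-d2 * a[k] + d1, eta) / d2
            change = max(change, abs(a_new - a[k]))
            a[k] = a_new
        if change < tol:
            break
    return a, b


# Toy input (hypothetical): a 5 x 5 grid of latent-trait nodes with unit
# weights, and expected correct counts generated from a_j = (1, 0), b_j = -0.5.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
nodes = [(t1, t2) for t1 in grid for t2 in grid]
n = [1.0] * len(nodes)
r = [sigmoid(1.0 * t1 - 0.5) for t1, _ in nodes]

a_hat, b_hat = coordinate_descent_item(nodes, n, r, eta=1.0)
print(a_hat, b_hat)
```

In this toy input the second discrimination parameter is thresholded to exactly zero, while the first is shrunk below its generating value of 1. The sketch takes undamped Newton steps, so a practical implementation would likely need step-size safeguards when the second derivative is close to zero.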

About this article

Cite this article

Sun, J., Chen, Y., Liu, J. et al. Latent Variable Selection for Multidimensional Item Response Theory Models via $L_{1}$ Regularization. Psychometrika 81, 921–939 (2016). https://doi.org/10.1007/s11336-016-9529-6

Received: 08 June 2014
Revised: 22 March 2016
Published: 03 October 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s11336-016-9529-6
