A Mixed Stochastic Approximation EM (MSAEM) Algorithm for the Estimation of the Four-Parameter Normal Ogive Model

  • Theory and Methods
Psychometrika

Abstract

In recent years, the four-parameter model (4PM) has received increasing attention in item response theory. The purpose of this article is to provide more efficient and more reliable computational tools for fitting the 4PM. In particular, this article focuses on the four-parameter normal ogive (4PNO) model and develops efficient stochastic approximation expectation maximization (SAEM) algorithms to compute the marginalized maximum a posteriori estimator. First, a data augmentation scheme is used for the 4PNO model so that the complete-data model belongs to an exponential family, and a basic SAEM algorithm is then developed for the 4PNO model. Second, to overcome the drawback of the SAEM algorithm, we develop an improved SAEM algorithm for the 4PNO model, called the mixed SAEM (MSAEM). Results from simulation studies demonstrate that (1) the MSAEM provides estimates that are more accurate than or comparable to those of the other estimation methods while being computationally more efficient, and (2) the MSAEM is more robust to the choice of initial values and of priors for the item parameters, which is a valuable property in practical use. Finally, a real data set is analyzed to show the good performance of the proposed methods.

References

  • Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In Selected papers of Hirotugu Akaike (pp. 199–213). Springer.

  • Allassonnière, S., Kuhn, E., Trouvé, A., et al. (2010). Construction of Bayesian deformable models via a stochastic approximation algorithm: A convergence study. Bernoulli, 16(3), 641–678.

  • Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques. Boca Raton: CRC Press.

  • Barton, M. A., & Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item-response model. ETS Research Report Series, 1981(1), i–8.

  • Battauz, M. (2020). Regularized estimation of the four-parameter logistic model. Psych, 2(4), 269–278.

  • Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66(4), 541–561.

  • Berger, J. O. (1990). Robust Bayesian analysis: Sensitivity to the prior. Journal of Statistical Planning and Inference, 25(3), 303–328.

  • Camilli, G., & Fox, J.-P. (2015). An aggregate IRT procedure for exploratory factor analysis. Journal of Educational and Behavioral Statistics, 40(4), 377–401.

  • Camilli, G., & Geis, E. (2019). Stochastic approximation EM for large-scale exploratory IRT factor analysis. Statistics in Medicine, 38(21), 3997–4012.

  • Celeux, G., Hurn, M., & Robert, C. P. (2000). Computational and inferential difficulties with mixture posterior distributions. Journal of the American Statistical Association, 95(451), 957–970.

  • Culpepper, S. A. (2016). Revisiting the 4-parameter item response model: Bayesian estimation and application. Psychometrika, 81(4), 1142–1163.

  • Culpepper, S. A. (2017). The prevalence and implications of slipping on low-stakes, large-scale assessments. Journal of Educational and Behavioral Statistics, 42(6), 706–725.

  • Delyon, B., Lavielle, M., Moulines, E., et al. (1999). Convergence of a stochastic approximation version of the EM algorithm. The Annals of Statistics, 27(1), 94–128.

  • DeMars, C. E. (2012). A comparison of limited-information and full-information methods in Mplus for estimating item response theory parameters for nonnormal populations. Structural Equation Modeling: A Multidisciplinary Journal, 19(4), 610–632.

  • Feuerstahler, L. M., & Waller, N. G. (2014). Estimation of the 4-parameter model with marginal maximum likelihood. Multivariate Behavioral Research, 49(3), 285.

  • Fox, J.-P. (2003). Stochastic EM for estimating the parameters of a multilevel IRT model. British Journal of Mathematical and Statistical Psychology, 56(1), 65–81.

  • Galarza, C. E., Lachos, V. H., & Bandyopadhyay, D. (2017). Quantile regression in linear mixed models: A stochastic approximation EM approach. Statistics and its Interface, 10(3), 471.

  • Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. Boca Raton: CRC Press.

  • Gu, M. G., & Zhu, H.-T. (2001). Maximum likelihood estimation for spatial models by Markov chain Monte Carlo stochastic approximation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 339–355.

  • Guo, S., & Zheng, C. (2019). The Bayesian expectation–maximization–maximization for the 3PLM. Frontiers in Psychology, 10, 1175.

  • Jank, W. (2006). Implementing and diagnosing the stochastic approximation EM algorithm. Journal of Computational and Graphical Statistics, 15(4), 803–829.

  • Kern, J. L., & Culpepper, S. A. (2020). A restricted four-parameter IRT model: The dyad four-parameter normal ogive (Dyad-4PNO) model. Psychometrika, 85(3), 575–599.

  • Kuhn, E., & Lavielle, M. (2004). Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM: Probability and Statistics, 8, 115–131.

  • Lavielle, M., & Mbogning, C. (2014). An improved SAEM algorithm for maximum likelihood estimation in mixtures of non linear mixed effects models. Statistics and Computing, 24(5), 693–707.

  • Liao, W.-W., Ho, R.-G., Yen, Y.-C., & Cheng, H.-C. (2012). The four-parameter logistic item response theory model as a robust method of estimating ability despite aberrant responses. Social Behavior and Personality: An International Journal, 40(10), 1679–1694.

  • Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63(3), 509–525.

  • McKinley, R. L., & Mills, C. N. (1985). A comparison of several goodness-of-fit statistics. Applied Psychological Measurement, 9(1), 49–57.

  • Meng, X.-L., & Schilling, S. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association, 91(435), 1254–1267.

  • Meng, X., Xu, G., Zhang, J., & Tao, J. (2020). Marginalized maximum a posteriori estimation for the four-parameter logistic model under a mixture modelling framework. British Journal of Mathematical and Statistical Psychology, 73, 51–82.

  • Orlando, M., & Thissen, D. (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24(1), 50–64.

  • Orlando, M., & Thissen, D. (2003). Further investigation of the performance of S-X²: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27(4), 289–298.

  • Patsula, L. (1995). A comparison of item parameter estimates and ICCs produced with TESTGRAF and BILOG under different test lengths and sample sizes. University of Ottawa (Canada).

  • Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8(2), 164–184.

  • Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22, 400–407.

  • Rulison, K. L., & Loken, E. (2009). I’ve fallen and I can’t get up: Can high-ability students recover from early mistakes in CAT? Applied Psychological Measurement, 33(2), 83–101.

  • Svetina, D., Valdivia, A., Underhill, S., Dai, S., & Wang, X. (2017). Parameter recovery in multidimensional item response theory models under complexity and nonnormality. Applied Psychological Measurement, 41(7), 530–544.

  • Tang, K. L., Way, W. D., & Carey, P. A. (1993). The effect of small calibration sample sizes on TOEFL IRT-based equating. ETS Research Report Series, 1993(2), 1–38.

  • Tao, J., Shi, N.-Z., & Chang, H.-H. (2012). Item-weighted likelihood method for ability estimation in tests composed of both dichotomous and polytomous items. Journal of Educational and Behavioral Statistics, 37(2), 298–315.

  • Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47(2), 175–186.

  • von Davier, M. (2009). Is there need for the 3PL model? Guess what? Measurement: Interdisciplinary Research and Perspectives, 7(2), 110–114.

  • Waller, N. G., & Feuerstahler, L. (2017). Bayesian modal estimation of the four-parameter item response model in real, realistic, and idealized data sets. Multivariate Behavioral Research, 52(3), 350–370.

  • Wang, C., Su, S., & Weiss, D. J. (2018). Robustness of parameter estimation to assumptions of normality in the multidimensional graded response model. Multivariate Behavioral Research, 53(3), 403–418.

  • Wei, G. C., & Tanner, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms. Journal of the American Statistical Association, 85(411), 699–704.

  • Wollack, J. A., Bolt, D. M., Cohen, A. S., & Lee, Y.-S. (2002). Recovery of item parameters in the nominal response model: A comparison of marginal maximum likelihood estimation and Markov Chain Monte Carlo estimation. Applied Psychological Measurement, 26(3), 339–352.

  • Yen, W. M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5(2), 245–262.

  • Yen, W. M. (1987). A comparison of the efficiency and accuracy of BILOG and LOGIST. Psychometrika, 52(2), 275–291.

  • Yoes, M. (1995). An updated comparison of micro-computer based item parameter estimation procedures used with the 3-parameter IRT model. St. Paul, MN: Assessment Systems Corporation.

  • Zhang, J., Du, H., Zhang, Z., & Tao, J. (2020). Gibbs-slice sampling algorithm for estimating the four-parameter logistic model. Frontiers in Psychology, 11, 2121.

  • Zhang, S., Chen, Y., & Liu, Y. (2020). An improved stochastic EM algorithm for large-scale full-information item factor analysis. British Journal of Mathematical and Statistical Psychology, 73(1), 44–71.

  • Zhang, X., Wang, C., Weiss, D. J., & Tao, J. (2020c). Bayesian inference for IRT models with non-normal latent trait distributions. Multivariate Behavioral Research, 1–21.

Acknowledgements

The authors are greatly indebted to the editor, an associate editor, and three anonymous reviewers for their valuable comments and suggestions. Meng is partially supported by the National Natural Science Foundation of China (11501094, 11571069), and Xu is partially supported by the National Science Foundation (SES-1846747) and the Institute of Education Sciences (R305D200015).

Author information

Corresponding author

Correspondence to Xiangbin Meng.

Appendix: The Simulation Result on the Recovery Accuracy of the 4PL Model

In this simulation, the 4PL is the data-generating model,

$$\begin{aligned} P_{j}(\theta _i)=P(U_{ij}=1|\theta _i,\pmb {\xi }_j)=c_j+(d_j-c_j)\frac{\exp [D(a_j\theta _i+b_j)]}{1+\exp [D(a_j\theta _i+b_j)]}, \end{aligned}$$

where \(D=1.702\) is the scale constant. For comparability with the results shown in Tables 2, 3 and 4, the testing conditions and the true values of \(a_j, b_j, c_j\) and \(d_j\) are the same as those in “Study 2”. The MMAP estimate of the 4PL model is computed separately by the EM algorithm implemented in the R package “mirt” and by the EM algorithm proposed by Meng et al. (2020).
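The response function above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `p_4pl` and `simulate_responses` and any item values passed to them are hypothetical, and only the formula and \(D=1.702\) are taken from the text.

```python
import math
import random

D = 1.702  # scale constant stated in the text


def p_4pl(theta, a, b, c, d, scale=D):
    """P(U = 1 | theta) under the 4PL in slope-intercept form:
    c + (d - c) * logistic(D * (a * theta + b))."""
    z = scale * (a * theta + b)
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        logistic = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        logistic = ez / (1.0 + ez)
    return c + (d - c) * logistic


def simulate_responses(thetas, items, rng):
    """Simulate a persons-by-items 0/1 matrix.

    `items` is a list of (a, b, c, d) tuples (hypothetical layout);
    `rng` is a random.Random instance.
    """
    return [
        [1 if rng.random() < p_4pl(theta, *item) else 0 for item in items]
        for theta in thetas
    ]
```

For instance, with the illustrative values `a=1.0, b=0.0, c=0.2, d=0.9`, `p_4pl(0.0, 1.0, 0.0, 0.2, 0.9)` is approximately 0.55, the midpoint between the lower asymptote \(c_j\) and the upper asymptote \(d_j\).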

The prior for \((a_j,b_j)\) is \((\ln a_j, b_j)'\sim N_2 (\mu _{0}, \Sigma _{0})\). This differs from the prior used for the 4PNO model because both EM algorithms above are implemented under a lognormal prior for \(a_j\). The prior for \((c_j,d_j)\) is the same as for the 4PNO model, namely the bivariate Beta distribution given in Eq. 20.
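To make the lognormal prior concrete, the sketch below draws \((a_j, b_j)\) by sampling \((\ln a_j, b_j)'\) from a bivariate normal and exponentiating the first component. The function name `sample_ab_prior` and any hyperparameter values supplied for \(\mu_0\) and \(\Sigma_0\) are assumptions for illustration; the priors actually used are those listed in Table 8.

```python
import math
import random


def sample_ab_prior(mu0, sigma0, rng):
    """Draw (a_j, b_j) such that (ln a_j, b_j)' ~ N_2(mu0, sigma0).

    mu0 is (mean of ln a, mean of b); sigma0 is a 2x2 covariance matrix
    given as ((s11, s12), (s12, s22)). Both are hypothetical inputs.
    """
    (s11, s12), (_, s22) = sigma0
    # Cholesky factor of the 2x2 covariance matrix.
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    log_a = mu0[0] + l11 * z1
    b = mu0[1] + l21 * z1 + l22 * z2
    # Exponentiating guarantees a positive discrimination parameter.
    return math.exp(log_a), b
```

Because \(a_j\) is obtained by exponentiation, every draw is strictly positive, which is the constraint the truncated normal prior enforces in the 4PNO case.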

Following the design of “Study 2”, the MMAP estimate of the 4PL model is computed under four different priors (see Table 8). Note that the variance of \(\ln a_j\) is set to 1 so that the prior information is close to that of the truncated normal distribution in Eq. 19 with variance 2. As in “Study 2”, 200 replications were generated, and parameter recovery is assessed by computing the three criteria (ARMSE, ACor and AIRF) across the 200 replications. The results are given in Tables 9 and 10.
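The exact definitions of the recovery criteria are given in the main text and are not reproduced in this appendix. As a rough sketch only, the code below assumes that ARMSE is the mean over items of the per-item RMSE across replications, and that ACor is the mean over replications of the Pearson correlation between estimated and true values; AIRF is omitted because its definition cannot be inferred from this section. All function names are hypothetical.

```python
import math


def rmse_per_item(estimates, truth):
    """Per-item RMSE across replications.

    `estimates` is a list of replications, each a list of per-item estimates.
    """
    n_rep = len(estimates)
    return [
        math.sqrt(sum((rep[j] - truth[j]) ** 2 for rep in estimates) / n_rep)
        for j in range(len(truth))
    ]


def average_rmse(estimates, truth):
    """Assumed ARMSE: mean of the per-item RMSEs."""
    values = rmse_per_item(estimates, truth)
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def average_correlation(estimates, truth):
    """Assumed ACor: mean over replications of cor(estimates, truth)."""
    return sum(pearson(rep, truth) for rep in estimates) / len(estimates)
```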

Cite this article

Meng, X., Xu, G. A Mixed Stochastic Approximation EM (MSAEM) Algorithm for the Estimation of the Four-Parameter Normal Ogive Model. Psychometrika 88, 1407–1442 (2023). https://doi.org/10.1007/s11336-022-09870-w
