Revisiting the 4-Parameter Item Response Model: Bayesian Estimation and Application

Abstract

There has been renewed interest in Barton and Lord’s (1981) four-parameter item response model. This paper presents a Bayesian formulation of the four-parameter normal ogive (4PNO) model that extends Béguin and Glas (2001). Monte Carlo evidence is presented concerning the accuracy of parameter recovery. The simulation results support the use of less informative uniform priors for the lower and upper asymptotes, which is an advantage over prior research. Monte Carlo results also provide some support for using the deviance information criterion and the \(\chi^{2}\) index to choose among models with two, three, and four parameters. The 4PNO is applied to 7491 adolescents’ responses to a bullying scale collected under the 2005–2006 Health Behavior in School-Aged Children study. The results support the value of the 4PNO for estimating lower and upper asymptotes in large-scale surveys.
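
To make the role of the two asymptotes concrete, the following is a minimal sketch of a four-parameter normal ogive response function in its usual textbook form: a lower asymptote plus the gap between the asymptotes scaled by a standard normal CDF. The function and parameter names (`prob_4pno`, `a`, `b`, `lower`, `upper`) and the slope-intercept parameterization `a * theta - b` are illustrative assumptions, not notation taken from the paper, and the sketch does not reproduce the paper's estimation procedure.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_4pno(theta: float, a: float, b: float,
              lower: float, upper: float) -> float:
    """Probability of endorsing an item under an assumed 4PNO form.

    theta: latent trait of the respondent
    a, b:  item slope and intercept (hypothetical parameterization)
    lower: lower asymptote (response floor)
    upper: upper asymptote (response ceiling)
    """
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("require 0 <= lower < upper <= 1")
    # The probability is bounded between the two asymptotes.
    return lower + (upper - lower) * normal_cdf(a * theta - b)
```

Setting `lower = 0` and `upper = 1` reduces the sketch to a two-parameter normal ogive, and fixing only `upper = 1` gives the three-parameter case, which is why the abstract can compare models with two, three, and four parameters on the same data.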

References

  1. Albert, J. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17(3), 251–269.

  2. Barton, M. A., & Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item-response model (Tech. Rep. No. 80-20). Educational Testing Service.

  3. Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66(4), 541–561.

  4. Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434–455.

  5. Chang, H.-H., & Ying, Z. (2008). To weight or not to weight? Balancing influence of initial items in adaptive testing. Psychometrika, 73(3), 441–450.

  6. Culpepper, S. A. (2015). An improved correction for range restricted correlations under extreme, monotonic quadratic nonlinearity and heteroscedasticity. Psychometrika. doi:10.1007/s11336-015-9466-9.

  7. Feuerstahler, L. M., & Waller, N. G. (2014). Estimation of the 4-parameter model with marginal maximum likelihood. Multivariate Behavioral Research, 49(3), 285–285.

  8. Fox, J.-P. (2010). Bayesian item response modeling. New York: Springer.

  9. Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–472.

  10. Green, B. F. (2011). A comment on early student blunders on computer-based adaptive tests. Applied Psychological Measurement, 35(2), 165–174.

  11. Holmes, D. (1990). The robustness of the usual correction for restriction in range due to explicit selection. Psychometrika, 55, 19–32.

  12. Iannotti, R. (2005). Health behavior in school-aged children HBSC, 2005–2006. Ann Arbor, MI: Inter-university Consortium for Political and Social Research. doi:10.3886/ICPSR28241.v1.

  13. Liao, W., Ho, R., Yen, Y., & Cheng, H. (2012). The four-parameter logistic item response theory model as a robust method of estimating ability despite aberrant responses. Social Behavior and Personality: An International Journal, 40, 1679–1694.

  14. Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63(3), 509–525.

  15. Magis, D. (2013). A note on the item information function of the four-parameter logistic model. Applied Psychological Measurement, 37(4), 304–315.

  16. Mendoza, J. (1993). Fisher transformations for correlations corrected for selection and missing data. Psychometrika, 58(4), 601–615.

  17. Ogasawara, H. (2012). Asymptotic expansions for the ability estimator in item response theory. Computational Statistics, 27(4), 661–683.

  18. Patz, R. J., & Junker, B. W. (1999a). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24(4), 342–366.

  19. Patz, R. J., & Junker, B. W. (1999b). A straightforward approach to Markov Chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24(2), 146–178.

  20. Pham-Gia, T., & Turkkan, N. (1998). Distribution of the linear combination of two general Beta variables and applications. Communications in Statistics Theory and Methods, 27(7), 1851–1869.

  21. Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8(2), 164–184.

  22. Rulison, K. L., & Loken, E. (2009). I’ve fallen and I can’t get up: Can high-ability students recover from early mistakes in CAT? Applied Psychological Measurement, 33(2), 83–101.

  23. Sahu, S. K. (2002). Bayesian estimation and model choice in item response models. Journal of Statistical Computation and Simulation, 72(3), 217–232.

  24. San Martín, E., González, J., & Tuerlinckx, F. (2014). On the unidentifiability of the fixed-effects 3PL model. Psychometrika, 80, 1–18.

  25. Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298–321.

  26. Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4), 583–639.

  27. Waller, N. G., & Reise, S. P. (2010). Measuring psychopathology with nonstandard item response theory models: Fitting the four-parameter model to the Minnesota Multiphasic Personality Inventory. In S. Embretson (Ed.), Measuring psychological constructs: Advances in model based approaches. Washington, DC: American Psychological Association.

  28. Zheng, Y., & Chang, H.-H. (2014). On-the-fly assembled multistage adaptive testing. Applied Psychological Measurement, 39, 104–118.

Acknowledgments

The manuscript benefited from the comments of Alberto Maydeu-Olivares, Niels Waller, and three anonymous reviewers. Any remaining errors belong to the author.

Author information

Corresponding author

Correspondence to Steven Andrew Culpepper.

About this article

Cite this article

Culpepper, S.A. Revisiting the 4-Parameter Item Response Model: Bayesian Estimation and Application. Psychometrika 81, 1142–1163 (2016). https://doi.org/10.1007/s11336-015-9477-6

Keywords

  • 4-parameter item response model
  • Bayesian
  • Gibbs sampling
  • large-scale assessment
  • bullying
  • psychopathology