The UMP Exact Test and the Confidence Interval for Person Parameters in IRT Models

Liu, Xiang; Han, Zhuangzhuang; Johnson, Matthew S.

doi:10.1007/s11336-017-9580-y

Published: 23 August 2017

Psychometrika, Volume 83, pages 182–202 (2018)

Abstract

In educational and psychological measurement, when short test forms are used, the asymptotic normality of the maximum likelihood estimator of the person parameter of item response models does not hold. As a result, hypothesis tests and confidence intervals for the person parameter that rely on the normal distribution are likely to be problematic. Inferences based on the exact distribution, on the other hand, do not suffer from this limitation. However, the computation required by the exact distribution approach is often prohibitively expensive. In this paper, we propose a general framework for constructing hypothesis tests and confidence intervals for IRT models within the exponential family based on the exact distribution. In addition, an efficient branch-and-bound algorithm for calculating the exact p value is introduced. The Type I error rate and statistical power of the proposed exact test, as well as the coverage rate and length of the associated confidence interval, are examined through a simulation. We also demonstrate its practical use by analyzing three real data sets.

Author information

Authors and Affiliations

Department of Human Development, Teachers College of Columbia University, 525 West 120th Street, New York, NY, 10027-6696, USA
Xiang Liu, Zhuangzhuang Han & Matthew S. Johnson

Corresponding author

Correspondence to Xiang Liu.

Appendices

Appendix A

Under the 2PL model, the probability that a subject answers the $j$th item correctly is

$$\begin{aligned} P_j(X_j=1 | a_j,b_j,\theta ) = \frac{\exp [a_j(\theta -b_j)]}{1+\exp [a_j(\theta -b_j)]}, \end{aligned}$$

(13)

where $a_j$ is the item discrimination parameter, $b_j$ is the item difficulty parameter, and $\theta $ is the ability parameter for the subject. It follows that the likelihood of $\theta $ given a response pattern $\varvec{X} = \varvec{x}$ is

$$\begin{aligned} L\left( \theta |\varvec{x},\varvec{a},\varvec{b}\right)&= \prod _{j=1}^{J} P_j\left( X_j=1 | a_j,b_j,\theta \right) ^{x_j} P_j\left( X_j=0 | a_j,b_j,\theta \right) ^{1-x_j} \end{aligned}$$

(14)

$$\begin{aligned}&= \prod _{j=1}^{J} \left\{ \frac{\exp \left[ a_j\left( \theta -b_j\right) \right] }{1+\exp \left[ a_j\left( \theta -b_j\right) \right] }\right\} ^{x_j} \left\{ \frac{1}{1+\exp \left[ a_j\left( \theta -b_j\right) \right] }\right\} ^{1-x_j} \end{aligned}$$

(15)

$$\begin{aligned}&= \prod _{j=1}^{J} \frac{\left\{ \exp \left[ a_j\left( \theta -b_j\right) \right] \right\} ^{x_j}}{1+\exp \left[ a_j\left( \theta -b_j\right) \right] } \left\{ \frac{1}{1+\exp \left[ a_j\left( \theta -b_j\right) \right] }\right\} ^{x_j-x_j} \end{aligned}$$

(16)

$$\begin{aligned}&= \prod _{j=1}^{J} \frac{\left\{ \exp \left[ a_j\left( \theta -b_j\right) \right] \right\} ^{x_j}}{1+\exp \left[ a_j\left( \theta -b_j\right) \right] } \end{aligned}$$

(17)

$$\begin{aligned}&= \frac{\exp \left[ \sum _{j=1}^{J}x_j a_j\left( \theta -b_j\right) \right] }{\prod _{j=1}^{J} \left\{ 1+\exp \left[ a_j\left( \theta -b_j\right) \right] \right\} } \end{aligned}$$

(18)

$$\begin{aligned}&= \frac{\exp \left[ \theta \sum _{j=1}^{J}a_j x_j\right] }{\exp \left[ \sum _{j=1}^{J}a_j x_j b_j\right] \prod _{j=1}^{J} \left\{ 1+\exp \left[ a_j\left( \theta -b_j\right) \right] \right\} } \end{aligned}$$

(19)

$$\begin{aligned}&= \exp \left[ \theta \sum _{j=1}^{J}a_j x_j\right] \left\{ \exp \left[ \sum _{j=1}^{J}a_j x_j b_j\right] \right\} ^{-1} \left\{ \prod _{j=1}^{J} \left\{ 1+\exp \left[ a_j\left( \theta -b_j\right) \right] \right\} \right\} ^{-1}. \end{aligned}$$

(20)
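The chain from Eq. (14) to Eq. (20) can be checked numerically. The following is a minimal sketch, not code from the paper: the function names, the item parameters `a` and `b`, the response pattern `x`, and the value of `theta` are all hypothetical and chosen only for illustration. It evaluates the likelihood once as the item-by-item product of Eq. (14) and once in the factored form of Eq. (20).

```python
import math

def p_correct(theta, a_j, b_j):
    """2PL probability of a correct response, Eq. (13)."""
    z = a_j * (theta - b_j)
    return math.exp(z) / (1.0 + math.exp(z))

def likelihood_product(theta, x, a, b):
    """Likelihood as the product over items, Eq. (14)."""
    out = 1.0
    for x_j, a_j, b_j in zip(x, a, b):
        p = p_correct(theta, a_j, b_j)
        out *= p if x_j == 1 else (1.0 - p)
    return out

def likelihood_factored(theta, x, a, b):
    """Likelihood in the factored form of Eq. (20)."""
    # First factor: exp[theta * sum_j a_j x_j]
    t_stat = sum(a_j * x_j for a_j, x_j in zip(a, x))
    # Second factor: {exp[sum_j a_j x_j b_j]}^{-1}
    h = math.exp(-sum(a_j * x_j * b_j for a_j, x_j, b_j in zip(a, x, b)))
    # Third factor: {prod_j (1 + exp[a_j (theta - b_j)])}^{-1}
    g = 1.0
    for a_j, b_j in zip(a, b):
        g /= 1.0 + math.exp(a_j * (theta - b_j))
    return math.exp(theta * t_stat) * h * g

# Hypothetical item parameters and response pattern (not from the paper).
a = [0.8, 1.2, 1.5, 1.0]
b = [-1.0, 0.0, 0.5, 1.2]
x = [1, 1, 0, 1]
theta = 0.3

lhs = likelihood_product(theta, x, a, b)
rhs = likelihood_factored(theta, x, a, b)
print(lhs, rhs)  # the two values should agree up to floating-point error
```

Agreement of the two values for arbitrary inputs is what the algebra in Eqs. (14) to (20) asserts; a mismatch would point to a sign or index slip in one of the steps.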

Equation (20) has the exponential family form $L(\theta | \varvec{x}) = \exp [\eta (\theta )T(\varvec{x})]h(\varvec{x})g(\theta )$, with natural parameter $\eta (\theta ) = \theta $ and sufficient statistic $T(\varvec{x}) = \sum _{j=1}^{J}a_j x_j$, where

$$\begin{aligned} \exp \left[ \eta \left( \theta \right) T\left( \varvec{x}\right) \right]&= \exp \left( \theta \sum _{j=1}^{J}a_j x_j\right) , \end{aligned}$$

(21)

$$\begin{aligned} h\left( \varvec{x}\right)&= \left\{ \exp \left[ \sum _{j=1}^{J}a_j x_j b_j\right] \right\} ^{-1}, \end{aligned}$$

(22)

and

$$\begin{aligned} g(\theta ) = \prod _{j=1}^{J}\left\{ 1+\exp [a_j(\theta -b_j)]\right\} ^{-1}. \end{aligned}$$

(23)
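Because the likelihood depends on the data only through the weighted score $\sum _{j}a_j x_j$, an exact one-sided p value can be obtained from the distribution of that score at the null value of $\theta $. The sketch below is an illustration under stated assumptions, not the authors' method: it uses brute-force enumeration of all $2^J$ response patterns rather than the branch-and-bound algorithm of the paper, it uses the standard upper-tail definition of the p value (the probability at `theta0` of a weighted score at least as large as the observed one), and the function names and item parameters are hypothetical.

```python
import math
from itertools import product

def pattern_prob(theta, x, a, b):
    """Probability of response pattern x under the 2PL model."""
    out = 1.0
    for x_j, a_j, b_j in zip(x, a, b):
        p = 1.0 / (1.0 + math.exp(-a_j * (theta - b_j)))
        out *= p if x_j == 1 else 1.0 - p
    return out

def weighted_score(x, a):
    """Sufficient statistic: sum_j a_j x_j."""
    return sum(a_j * x_j for a_j, x_j in zip(a, x))

def exact_p_value(x_obs, theta0, a, b, tol=1e-12):
    """Upper-tail exact p value by enumerating all 2^J patterns.

    Assumed definition: P_{theta0}(T(X) >= T(x_obs)). The tolerance
    guards against floating-point ties between weighted scores.
    """
    t_obs = weighted_score(x_obs, a)
    p_value = 0.0
    for x in product((0, 1), repeat=len(a)):
        if weighted_score(x, a) >= t_obs - tol:
            p_value += pattern_prob(theta0, x, a, b)
    return p_value

# Hypothetical three-item test with equal discriminations.
a = [1.0, 1.0, 1.0]
b = [0.0, 0.0, 0.0]
p_all_correct = exact_p_value([1, 1, 1], 0.0, a, b)
p_two_correct = exact_p_value([1, 1, 0], 0.0, a, b)
print(p_all_correct, p_two_correct)
```

Enumeration costs $2^J$ pattern evaluations, which is workable only for short tests; that cost is the reason the abstract describes the exact approach as often prohibitively expensive without a more efficient algorithm.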

Appendix B

In the food security data example, we are interested in testing the one-sided hypothesis $H_0{:}\; \theta \le 1.93$ against $H_1{:}\; \theta > 1.93$. Using the exact test approach, $H_0$ is rejected at the $\alpha =0.05$ level for the following response patterns:

Table 2. The 18 patterns that are rejected under the exact test.

Table 3. The 10 patterns that are rejected under the asymptotic approach.
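A rejection list of this kind can be produced by computing the exact p value of every possible response pattern and keeping those at or below $\alpha $. The sketch below is only an illustration: the null value 1.93 and $\alpha = 0.05$ come from the text above, but the item parameters `a` and `b` are hypothetical (the food security item parameters are not reproduced here), the function name is invented, and brute-force enumeration stands in for the paper's branch-and-bound algorithm. Its output therefore does not reproduce Table 2.

```python
import math
from itertools import product

def rejected_patterns(theta0, a, b, alpha=0.05, tol=1e-12):
    """Return (pattern, p_value) pairs with upper-tail exact p value <= alpha."""
    # Enumerate every pattern with its weighted score and null probability.
    table = []
    for x in product((0, 1), repeat=len(a)):
        prob = 1.0
        for x_j, a_j, b_j in zip(x, a, b):
            p = 1.0 / (1.0 + math.exp(-a_j * (theta0 - b_j)))
            prob *= p if x_j == 1 else 1.0 - p
        score = sum(a_j * x_j for a_j, x_j in zip(a, x))
        table.append((score, prob, x))

    # A pattern is rejected when the null probability of a score at least
    # as large as its own does not exceed alpha.
    rejected = []
    for score_obs, _, x_obs in table:
        p_value = sum(prob for score, prob, _ in table if score >= score_obs - tol)
        if p_value <= alpha:
            rejected.append((x_obs, p_value))
    return rejected

# Hypothetical item parameters (not the food security estimates).
a = [1.0, 1.2, 0.8, 1.5, 1.1]
b = [1.5, 2.0, 2.5, 3.0, 3.5]

rej = rejected_patterns(1.93, a, b)
for pattern, p_value in sorted(rej, key=lambda item: item[1]):
    print(pattern, round(p_value, 4))
```

Because the weighted score takes finitely many values, the attainable p values are discrete, so the actual size of such a test is generally below the nominal $\alpha $.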

About this article

Cite this article

Liu, X., Han, Z. & Johnson, M.S. The UMP Exact Test and the Confidence Interval for Person Parameters in IRT Models. Psychometrika 83, 182–202 (2018). https://doi.org/10.1007/s11336-017-9580-y

Received: 06 May 2016
Revised: 30 March 2017
Published: 23 August 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s11336-017-9580-y
