Optimal Bayesian Adaptive Design for Test-Item Calibration

Psychometrika

Abstract

An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers’ ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.


References

  • Abdelbasit, K.M., & Plackett, R.L. (1983). Experimental design for binary data. Journal of the American Statistical Association, 78, 90–98.

  • Atchadé, Y.F., & Rosenthal, J.S. (2005). On adaptive Markov chain Monte Carlo algorithms. Bernoulli, 11, 815–828.

  • Berger, M.P.F. (1991). On the efficiency of IRT models when applied to different sampling designs. Applied Psychological Measurement, 15, 293–306.

  • Berger, M.P.F. (1992). Sequential sampling designs for the two-parameter item response theory model. Psychometrika, 57, 521–538.

  • Berger, M.P.F. (1994). D-optimal sequential sampling designs for item response theory models. Journal of Educational Statistics, 19, 43–56.

  • Berger, M.P.F., King, C.Y.J., & Wong, W.K. (2000). Minimax D-optimal designs for item response theory models. Psychometrika, 65, 377–390.

  • Berger, M.P.F., & van der Linden, W.J. (1991). Optimality of sampling design in item response theory models. In M. Wilson (Ed.), Objective measurement: theory into practice (pp. 274–288). Norwood: Ablex.

  • Berger, M.P.F., & Wong, W.K. (2009). Introduction to optimal designs for social and biomedical research. Chichester: Wiley.

  • Cai, L. (2010). Metropolis–Hastings Robbins–Monro algorithm for confirmatory factor analysis. Journal of Educational and Behavioral Statistics, 35, 307–335.

  • Chaloner, K., & Larntz, K. (1989). Optimal Bayesian design applied to logistic regression experiments. Journal of Statistical Planning and Inference, 21, 191–208.

  • Chang, Y.-C.I., & Lu, H.-Y. (2010). Online calibration via variable length computerized adaptive testing. Psychometrika, 75, 140–157.

  • Fedorov, V.V. (1972). Theory of optimal experiments. New York: Academic Press.

  • Fox, J.-P. (2010). Bayesian item response modeling. New York: Springer.

  • Gilks, W.R., Richardson, S., & Spiegelhalter, D.J. (1996). Introducing Markov chain Monte Carlo. In W.R. Gilks, S. Richardson, & D.J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 1–19). London: Chapman & Hall.

  • Johnson, V.E., & Albert, J.H. (1999). Ordinal data modeling. New York: Springer.

  • Jones, D.H., & Jin, Z. (1994). Optimal sequential designs for on-line item estimation. Psychometrika, 59, 59–75.

  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Erlbaum.

  • Makransky, G., & Glas, C.A.W. (2010). An automatic online calibration design in adaptive testing. Journal of Applied Testing Technology, 11, 1. Retrieved from http://www.testpublishers.org/mc/page.do?sitePageId=112031&orgId=atpu.

  • Mislevy, R.J., & Chang, H.-H. (2000). Does adaptive testing violate local independence? Psychometrika, 65, 149–156.

  • Patz, R.J., & Junker, B.W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178.

  • Patz, R.J., & Junker, B.W. (1999b). Applications and extensions of MCMC in IRT: multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.

  • Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22, 400–407.

  • Rosenthal, J.S. (2007). AMCMC: an R interface for adaptive MCMC. Computational Statistics & Data Analysis, 51, 5467–5470.

  • Silverman, B.W. (1986). Density estimation for statistics and data analysis. London: Chapman & Hall.

  • Silvey, S.D. (1980). Optimal design. London: Chapman & Hall.

  • Stefanski, L.A., & Carroll, R.J. (1985). Covariate measurement error in logistic regression. The Annals of Statistics, 13, 1335–1351.

  • Stocking, M.L. (1990). Specifying optimum examinees for item parameter estimation in item response theory. Psychometrika, 55, 461–475.

  • van der Linden, W.J. (1988). Optimizing incomplete sampling designs for item response model parameters (Research Report No. 88-5). Enschede, The Netherlands: University of Twente.

  • van der Linden, W.J. (1994). Optimal design in item response theory: applications to test assembly and item calibration. In G.H. Fischer & D. Laming (Eds.), Contributions to mathematical psychology, psychometrics, and methodology (pp. 305–318). New York: Springer.

  • van der Linden, W.J. (1999). Empirical initialization of the trait estimator in adaptive testing. Applied Psychological Measurement, 23, 21–29. [Erratum, 23, 248].

  • van der Linden, W.J. (2005). Linear models for optimal test design. New York: Springer.

  • van der Linden, W.J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33, 5–20.

  • van der Linden, W.J. (2010). Sequencing an adaptive test battery. In W.J. van der Linden & C.A.W. Glas (Eds.), Elements of adaptive testing (pp. 103–119). New York: Springer.

  • van der Linden, W.J., & Pashley, P.J. (2010). Item selection and ability estimation in adaptive testing. In W.J. van der Linden & C.A.W. Glas (Eds.), Elements of adaptive testing (pp. 3–30). New York: Springer.

  • Wingersky, M., & Lord, F.M. (1984). An investigation of methods for reducing sampling error in certain IRT procedures. Applied Psychological Measurement, 8, 347–364.

  • Wynn, H.P. (1970). The sequential generation of D-optimum experimental designs. The Annals of Mathematical Statistics, 41, 1655–1664.

Author information

Correspondence to Wim J. van der Linden.

Appendices

Appendix A. Implementations of the MCMC Algorithm

The MCMC algorithms used for the calibration design are variations of the usual MH within Gibbs algorithm with blocks of item and examinee parameters and symmetric proposal densities for the 3PL model. The general structure of the algorithm has been documented extensively (e.g., Fox 2010, Chapter 3; Johnson & Albert, 1999, Section 2.5; Patz & Junker, 1999a, 1999b). The following two versions of the algorithm are used.

A.1 Posterior of Ability Parameter

The first version produces the draws from the posterior distribution \(g(\theta_{j}\mid \mathbf{u}_{i_{k-1}})\) in (2) of the ability parameter θ for test taker j given the answers to items l=1,…,k−1 in the adaptive test. The posterior distributions of these items' parameters, \(g(\boldsymbol {\eta }_{i_{l}})\), are assumed to be available in the system in the form of vectors of random draws \(\boldsymbol {\eta }_{i_{l}}^{(t)}= (\boldsymbol {\eta }_{i_{l}}^{(1)},\ldots,\boldsymbol {\eta }_{i_{l}}^{(T)})\).

The version can be summarized as iterations r=1,…,R each consisting of the following two steps:

  1. The rth draw from the posterior distribution of θ for test taker j is obtained by

     (a) Drawing a candidate value \(\theta _{j}^{(c)}\) for \(\theta _{j}^{(r)}\) from the proposal density \(q(\theta _{j}\mid \theta _{j}^{(r-1)})\);

     (b) Accepting \(\theta _{j}^{(r)}=\theta _{j}^{(c)}\) with probability

     $$ \min \Biggl\{ \frac{g(\theta _{j}^{(c)})\prod_{l=1}^{k-1}p(\theta _{j}^{ ( c ) };\boldsymbol {\eta }_{i_{l}}^{(r-1)})^{u_{i_{l}}}[1-p(\theta _{j}^{(c)};\boldsymbol {\eta }_{i_{l}}^{(r-1)})]^{1-u_{i_{l}}}}{g(\theta _{j}^{(r-1)})\prod_{l=1}^{k-1}p(\theta _{j}^{ ( r-1 ) };\boldsymbol {\eta }_{i_{l}}^{(r-1)})^{u_{i_{l}}}[1-p(\theta _{j}^{(r-1)}; \boldsymbol {\eta }_{i_{l}}^{(r-1)})]^{1-u_{i_{l}}}},1 \Biggr\} $$
     (A.1)

     Otherwise, \(\theta _{j}^{(r)}=\theta _{j}^{(r-1)}\).

  2. The rth draws from the posterior distributions of the operational item parameters \(\boldsymbol{\eta}_{i_{l}}\), l=1,…,k−1, are randomly sampled from the vectors \(\boldsymbol {\eta }_{i_{l}}^{(t)}\) present in the system.

Upon stationarity, the draws from the posterior distribution of θ for test taker j in the first steps are collected in a vector \(\boldsymbol {\theta }_{j}^{(s)}= (\theta _{j}^{(1)},\ldots,\theta _{j}^{(S)})\).
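The two steps above can be sketched in Python. This is a minimal illustration rather than the operational implementation: it assumes a standard normal prior g(θ), a normal random-walk proposal, and a logistic 3PL response function without a scaling constant; all function and variable names are hypothetical.

```python
import numpy as np

def three_pl(theta, a, b, c):
    # 3PL response probability; logistic form without a scaling constant assumed
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def sample_theta_posterior(u, eta_draws, R=1000, step=0.5, seed=None):
    """Sketch of the A.1 sampler: MH draws from the posterior of theta.

    u         : 0/1 responses to the k-1 operational items
    eta_draws : per item, an array of stored posterior draws of (a, b, c)
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u, dtype=float)

    def log_post(theta, etas):
        p = np.array([three_pl(theta, a, b, c) for (a, b, c) in etas])
        # standard normal prior g(theta) assumed
        return -0.5 * theta**2 + np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))

    theta, chain = 0.0, np.empty(R)
    for r in range(R):
        # Step 2: resample the operational item parameters from the stored vectors
        etas = [d[rng.integers(len(d))] for d in eta_draws]
        # Step 1: symmetric random-walk proposal and MH acceptance as in (A.1)
        cand = theta + step * rng.standard_normal()
        if np.log(rng.uniform()) < log_post(cand, etas) - log_post(theta, etas):
            theta = cand
        chain[r] = theta
    return chain
```

In an operational run the initial part of the chain would be discarded as burn-in and the remaining draws stored as the vector \(\boldsymbol{\theta}_{j}^{(s)}\).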

A.2 Update of Posterior of Field-Test Parameters

The second version updates the posterior distributions of the parameters of field-test item f after each batch \(b_{f}=1,2,\ldots\) of test takers. The current posterior distributions of the ability parameters \(\theta_{j}\) for the test takers \(j\in b_{f}\) are available in the form of estimates of their densities \(g(\theta_{j})\) derived from the vectors with the draws \(\boldsymbol {\theta }_{j}^{(s)}=(\theta _{j}^{(1)},\ldots,\theta _{j}^{(S)})\). Likewise, the current posterior distributions of the field-test parameters \(\boldsymbol{\eta}_{f}\), f=1,…,F, are available in the form of estimates of their densities \(g^{(b-1)}(\boldsymbol {\eta }_{f}\mid \mathbf{u}_{f_{j}})\) derived from the vectors with random draws \(\boldsymbol {\eta }_{f}^{(b-1,t)}=(\boldsymbol {\eta }_{f}^{(b-1,1)},\ldots, \boldsymbol {\eta }_{f}^{(b-1,T)})\). See the main text for the derivation of these density estimates.

The second version can be summarized as iterations r=1,…,R each consisting of the following two steps:

  1. For each \(j\in b\), the rth draw from the posterior distribution of \(\theta_{j}\) is obtained by

     (a) Drawing a candidate value \(\theta _{j}^{(c)}\) for \(\theta _{j}^{(r)}\) from the proposal density \(q(\theta _{j}\mid \theta _{j}^{(r-1)})\);

     (b) Accepting \(\theta _{j}^{(r)}=\theta _{j}^{(c)}\) with probability

     $$ \min \Biggl\{ \frac{g(\theta _{j}^{(c)})\prod_{f=1}^{F}p(\theta_{j}^{ ( c ) }; \boldsymbol {\eta }_{f}^{(r-1)})^{u_{f_{j}}}[1-p(\theta_{j}^{(c)}; \boldsymbol {\eta }_{f}^{(r-1)})]^{1-u_{f_{j}}}}{g(\theta_{j}^{(r-1)}) \prod_{f=1}^{F}p(\theta _{j}^{ ( r-1 ) };\boldsymbol {\eta }_{f}^{(r-1)})^{u_{f_{j}}}[1-p(\theta _{j}^{(r-1)}; \boldsymbol {\eta}_{f}^{(r-1)})]^{1-u_{f_{j}}}},1 \Biggr\} $$
     (A.2)

     Otherwise, \(\theta _{j}^{(r)}=\theta _{j}^{(r-1)}\).

  2. For each of the field-test items f administered to a test taker \(j\in b_{f}\), the rth draw from the posterior distribution of \(\boldsymbol{\eta}_{f}\) is obtained by

     (a) Drawing a candidate value \(\boldsymbol {\eta }_{f}^{(c)}\) for \(\boldsymbol {\eta}_{f}^{(r)}\) from a proposal density \(q(\boldsymbol {\eta }_{f}\mid \boldsymbol {\eta }_{f}^{(r-1)})\);

     (b) Accepting \(\boldsymbol {\eta }_{f}^{(r)}=\boldsymbol {\eta }_{f}^{(c)}\) with probability

     $$ \min \Biggl\{ \frac{g^{(b-1)}(\boldsymbol {\eta }_{f}^{(c)}) \prod_{j=1}^{n_{b_{f}}} \{ p(\theta _{j}^{(r)};\boldsymbol {\eta }_{f}^{(c)})^{u_{f_{j}}} [1-p(\theta _{j}^{(r)};\boldsymbol {\eta }_{f}^{(c)})]^{1-u_{f_{j}}} \} }{g^{(b-1)}(\boldsymbol {\eta }_{f}^{(r-1)}) \prod_{j=1}^{n_{b_{f}}} \{ p(\theta _{j}^{ (r ) }; \boldsymbol {\eta }_{f}^{(r-1)})^{u_{f_{j}}}[1-p(\theta _{j}^{(r)}; \boldsymbol {\eta }_{f}^{(r-1)})]^{1-u_{f_{j}}} \} },1 \Biggr\} $$
     (A.3)

     Otherwise, \(\boldsymbol {\eta }_{f}^{(r)}=\boldsymbol {\eta }_{f}^{(r-1)}\).

Upon stationarity, the draws \(\boldsymbol {\eta }_{f}^{(b,t)}=(\boldsymbol {\eta }_{f}^{(b,1)},\ldots, \boldsymbol {\eta }_{f}^{(b,T)})\) from the posterior update of \(\boldsymbol{\eta}_{f}\) for batch b during the second steps are saved as updates of the vectors \(\boldsymbol {\eta }_{f}^{(b-1,t)}\).
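The item-parameter update can likewise be sketched in Python. This sketch departs from the appendix in two labeled ways: for brevity, the ability update in step 1 is replaced by resampling from the stored ability draws, and the previous posterior \(g^{(b-1)}\) is assumed to be supplied as a log density on the transformed scale (ln a, b, logit c), which keeps the random-walk proposal symmetric on the scale it is applied to and avoids a Jacobian correction. All names are hypothetical.

```python
import numpy as np

def three_pl(theta, a, b, c):
    # 3PL response probability; logistic form without a scaling constant assumed
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def update_item_posterior(u, theta_draws, eta0, log_prior_t, R=1000, step=0.1, seed=None):
    """Sketch of step 2 of the A.2 sampler: MH draws for eta_f = (a_f, b_f, c_f).

    u           : 0/1 responses of the n_{b_f} test takers to field-test item f
    theta_draws : (n, S) array of stored posterior draws of their abilities
    eta0        : starting value (a, b, c)
    log_prior_t : log density of the previous posterior g^{(b-1)}, expressed on
                  the transformed scale (ln a, b, logit c)
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u, dtype=float)
    n, S = theta_draws.shape

    def back(z):
        # map (ln a, b, logit c) back to the original parameters
        return np.exp(z[0]), z[1], 1.0 / (1.0 + np.exp(-z[2]))

    def log_target(z, thetas):
        a, b, c = back(z)
        p = three_pl(thetas, a, b, c)
        return log_prior_t(z) + np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))

    a0, b0, c0 = eta0
    z = np.array([np.log(a0), b0, np.log(c0 / (1 - c0))])
    draws = np.empty((R, 3))
    for r in range(R):
        # step 1 analogue: resample one stored ability draw per test taker
        # (a simplification of the MH update with acceptance ratio (A.2))
        thetas = theta_draws[np.arange(n), rng.integers(S, size=n)]
        # step 2: symmetric random-walk proposal on the transformed scale,
        # MH acceptance analogous to (A.3)
        cand = z + step * rng.standard_normal(3)
        if np.log(rng.uniform()) < log_target(cand, thetas) - log_target(z, thetas):
            z = cand
        draws[r] = back(z)
    return draws
```

The back-transformation inside `back` guarantees a > 0 and 0 < c < 1 for every saved draw, mirroring the remark at the end of Appendix B.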

Appendix B. Information Matrices

B.1 Observed Information Matrix

For the 3PL model, the observed information matrix \(J_{u_{f_{j}}}(\boldsymbol {\eta }_{f};\theta _{j})\) in (17) has entries

$$\begin{aligned} J_{u_{f}}(a_{f},a_{f};\theta ) =& -(1-p_{f}) (p_{f}-c_{f}) \bigl(u_{f}c_{f}-p_{f}^{2} \bigr) \biggl[ \frac{\theta -b_{f}}{(1-c_{f})p_{f}} \biggr] ^{2}; \end{aligned}$$
(B.1)
$$\begin{aligned} J_{u_{f}}(b_{f},b_{f};\theta ) =& -(1-p_{f}) (p_{f}-c_{f}) \bigl(u_{f}c_{f}-p_{f}^{2} \bigr) \biggl[ \frac{a_{f}}{(1-c_{f})p_{f}} \biggr] ^{2}; \end{aligned}$$
(B.2)
$$\begin{aligned} J_{u_{f}}(c_{f},c_{f};\theta ) =& \frac{u_{f}-2u_{f}p_{f}+p_{f}^{2}}{(1-c_{f})^{2}p_{f}^{2}}; \end{aligned}$$
(B.3)
$$\begin{aligned} J_{u_{f}}(a_{f},b_{f};\theta ) =& \frac{(p_{f}-c_{f})}{(1-c_{f})p_{f}^{2}} \biggl[ p_{f}(u_{f}-p_{f})+a_{f} \bigl(u_{f}c_{f}-p_{f}^{2}\bigr) \frac{(\theta -b_{f})(1-p_{f}) }{1-c_{f}} \biggr] ; \end{aligned}$$
(B.4)
$$\begin{aligned} J_{u_{f}}(a_{f},c_{f};\theta ) =& \frac{u_{f}(\theta -b_{f})(1-p_{f})(p_{f}-c_{f})}{(1-c_{f})^{2}p_{f}^{2}}; \end{aligned}$$
(B.5)
$$\begin{aligned} J_{u_{f}}(b_{f},c_{f};\theta ) =& \frac{-u_{f}a_{f}(1-p_{f})(p_{f}-c_{f})}{(1-c_{f})^{2}p_{f}^{2}}, \end{aligned}$$
(B.6)

where \(p_{f}\) is the response probability for field-test item f in (1).

B.2 Expected Information Matrix

The expected information matrix \(I_{U_{f}}(\boldsymbol {\eta }_{f};\theta _{j})\) in (9) is readily available in the literature (e.g., Lord 1980, Section 12.1). In the notation of this paper, it is obtained by taking the expectation of (B.1)–(B.6) over the response distribution:

$$\begin{aligned} I_{U_{f}}(a_{f},a_{f};\theta ) =& \frac{(1-p_{f})(p_{f}-c_{f})^{2}(\theta-b_{f})^{2}}{p_{f}(1-c_{f})^{2}}; \end{aligned}$$
(B.7)
$$\begin{aligned} I_{U_{f}}(b_{f},b_{f};\theta ) =& \frac{a_{f}^{2}(1-p_{f})(p_{f}-c_{f})^{2}}{p_{f}(1-c_{f})^{2}}; \end{aligned}$$
(B.8)
$$\begin{aligned} I_{U_{f}}(c_{f},c_{f};\theta ) =& \frac{(1-p_{f})}{p_{f}(1-c_{f})^{2}}; \end{aligned}$$
(B.9)
$$\begin{aligned} I_{U_{f}}(a_{f},b_{f};\theta ) =& \frac{-a_{f}(1-p_{f})(p_{f}-c_{f})^{2}(\theta -b_{f})}{p_{f}(1-c_{f})^{2}}; \end{aligned}$$
(B.10)
$$\begin{aligned} I_{U_{f}}(a_{f},c_{f};\theta ) =& \frac{(1-p_{f})(p_{f}-c_{f})(\theta -b_{f})}{p_{f}(1-c_{f})^{2}}; \end{aligned}$$
(B.11)
$$\begin{aligned} I_{U_{f}}(b_{f},c_{f};\theta ) =& \frac{-a_{f}(1-p_{f})(p_{f}-c_{f})}{p_{f}(1-c_{f})^{2}}. \end{aligned}$$
(B.12)
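A direct transcription of (B.1)–(B.6) and (B.7)–(B.12) makes the relation between the two matrices easy to verify numerically: averaging the observed matrix over the response distribution, \(p_{f}J_{u_{f}=1}+(1-p_{f})J_{u_{f}=0}\), reproduces the expected matrix entry by entry. A sketch with hypothetical function names (the identity is algebraic in \(u_{f}\), so it holds for any \(p_{f}\) strictly between \(c_{f}\) and 1):

```python
import numpy as np

def observed_info(u, theta, a, b, c, p):
    """Entries (B.1)-(B.6) of the observed information matrix, transcribed directly."""
    Jaa = -(1 - p) * (p - c) * (u * c - p**2) * ((theta - b) / ((1 - c) * p)) ** 2
    Jbb = -(1 - p) * (p - c) * (u * c - p**2) * (a / ((1 - c) * p)) ** 2
    Jcc = (u - 2 * u * p + p**2) / ((1 - c) ** 2 * p**2)
    Jab = (p - c) / ((1 - c) * p**2) * (
        p * (u - p) + a * (u * c - p**2) * (theta - b) * (1 - p) / (1 - c))
    Jac = u * (theta - b) * (1 - p) * (p - c) / ((1 - c) ** 2 * p**2)
    Jbc = -u * a * (1 - p) * (p - c) / ((1 - c) ** 2 * p**2)
    return np.array([[Jaa, Jab, Jac], [Jab, Jbb, Jbc], [Jac, Jbc, Jcc]])

def expected_info(theta, a, b, c, p):
    """Entries (B.7)-(B.12) of the expected information matrix (symmetric 3x3)."""
    w = (1 - p) / (p * (1 - c) ** 2)
    return np.array([
        [w * (p - c) ** 2 * (theta - b) ** 2,
         -w * a * (p - c) ** 2 * (theta - b),
         w * (p - c) * (theta - b)],
        [-w * a * (p - c) ** 2 * (theta - b),
         w * a**2 * (p - c) ** 2,
         -w * a * (p - c)],
        [w * (p - c) * (theta - b),
         -w * a * (p - c),
         w],
    ])
```

In the optimality criteria, these 3×3 matrices would then be combined over test takers, e.g., the determinant of their sum for D-optimality.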

B.3 Observed Information Matrix for Transformed Parameters

For a multivariate normal proposal distribution based on the transformations \(a_{f}^{\ast }=\ln a_{f}\) and \(c_{f}^{\ast }=\operatorname {logit}c_{f}\) in (5), the entries of the observed information matrix in (B.1)–(B.6) take the following form:

$$\begin{aligned} J_{u_{f}}^{\ast }(a_{f},a_{f};\theta ) =& \frac{-a_{f}(\theta -b_{f})(u_{f}-p_{f})(p_{f}-c_{f})}{p_{f}(1-c_{f})} \\ &{}-a_{f}^{2}(1-p_{f}) (p_{f}-c_{f}) \bigl(u_{f}c_{f}-p_{f}^{2}\bigr) \biggl[ \frac{\theta-b_{f}}{p_{f}(1-c_{f})} \biggr] ^{2}; \end{aligned}$$
(B.13)
$$\begin{aligned} J_{u_{f}}^{\ast }(b_{f},b_{f};\theta) =& -(1-p_{f}) (p_{f}-c_{f}) \bigl(u_{f}c_{f}-p_{f}^{2}\bigr) \biggl[ \frac{a_{f}}{p_{f}(1-c_{f})} \biggr] ^{2}; \end{aligned}$$
(B.14)
$$\begin{aligned} J_{u_{f}}^{\ast }(c_{f},c_{f};\theta ) =& \frac{-(c_{f}-3c_{f}^{2}+2c_{f}^{3})(u_{f}-p_{f})}{p_{f}(1-c_{f})} +\frac{c_{f}^{2}(u_{f}-2u_{f}p_{f}+p_{f}^{2})}{p_{f}^{2}}; \end{aligned}$$
(B.15)
$$\begin{aligned} J_{u_{f}}^{\ast }(a_{f},b_{f};\theta ) =& \frac{a_{f}(p_{f}-c_{f})}{(1-c_{f})p_{f}^{2}} \\ &{}\times \biggl[ p_{f}(u_{f}-p_{f})+a_{f} \bigl(u_{f}c_{f}-p_{f}^{2}\bigr) \frac{(\theta -b_{f})(1-p_{f})}{1-c_{f}} \biggr] ; \end{aligned}$$
(B.16)
$$\begin{aligned} J_{u_{f}}^{\ast }(a_{f},c_{f};\theta ) =& \frac{u_{f}a_{f}c_{f}(\theta -b_{f})(1-p_{f})(p_{f}-c_{f})}{(1-c_{f})p_{f}^{2}}; \end{aligned}$$
(B.17)
$$\begin{aligned} J_{u_{f}}^{\ast }(b_{f},c_{f};\theta ) =& \frac{-u_{f}a_{f}c_{f}(1-p_{f})(p_{f}-c_{f})}{(1-c_{f})p_{f}^{2}}. \end{aligned}$$
(B.18)

Observe that, before entering the acceptance criterion in (A.3), the draws of \(a_{f}^{\ast }\) and \(c_{f}^{\ast }\) from the proposal distribution with this version of the covariance matrix have to be transformed back to their original scales as \(a_{f}=\exp (a_{f}^{\ast })\) and \(c_{f}=[1+\exp (-c_{f}^{\ast})]^{-1}\).
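The transformed entries are consistent with the chain rule for \(\partial a_{f}/\partial a_{f}^{\ast}=a_{f}\) and \(\partial c_{f}/\partial c_{f}^{\ast}=c_{f}(1-c_{f})\): each off-diagonal entry in (B.16)–(B.18) equals the corresponding entry of (B.4)–(B.6) multiplied by these derivatives (the diagonal entries pick up an additional first-derivative term). A quick numerical check of this relation, with hypothetical helper names:

```python
import numpy as np

def offdiag_J(u, theta, a, b, c, p):
    """Off-diagonal entries (B.4)-(B.6) of the untransformed observed information."""
    Jab = (p - c) / ((1 - c) * p**2) * (
        p * (u - p) + a * (u * c - p**2) * (theta - b) * (1 - p) / (1 - c))
    Jac = u * (theta - b) * (1 - p) * (p - c) / ((1 - c) ** 2 * p**2)
    Jbc = -u * a * (1 - p) * (p - c) / ((1 - c) ** 2 * p**2)
    return Jab, Jac, Jbc

def offdiag_J_star(u, theta, a, b, c, p):
    """Off-diagonal entries (B.16)-(B.18) for the transformed parameters."""
    Jab = a * (p - c) / ((1 - c) * p**2) * (
        p * (u - p) + a * (u * c - p**2) * (theta - b) * (1 - p) / (1 - c))
    Jac = u * a * c * (theta - b) * (1 - p) * (p - c) / ((1 - c) * p**2)
    Jbc = -u * a * c * (1 - p) * (p - c) / ((1 - c) * p**2)
    return Jab, Jac, Jbc
```

For any admissible values, `offdiag_J_star` returns `a * Jab`, `a * c * (1 - c) * Jac`, and `c * (1 - c) * Jbc`, as the chain rule requires.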

Cite this article

van der Linden, W.J., Ren, H. Optimal Bayesian Adaptive Design for Test-Item Calibration. Psychometrika 80, 263–288 (2015). https://doi.org/10.1007/s11336-013-9391-8