Using Response Times and Response Accuracy to Measure Fluency Within Cognitive Diagnosis Models

  • Theory and Methods

Abstract

The recent “Every Student Succeeds Act” encourages schools to use innovative assessments to provide feedback about students’ mastery of grade-level content standards. Mastery of a skill requires the ability to complete the task not only with accuracy but also with fluency. This paper offers new insight into using both response times and response accuracy to measure fluency within the cognitive diagnosis model (CDM) framework. Defining fluency as the highest level of a categorical latent attribute, we propose a polytomous response accuracy model and two forms of response time models to infer fluency jointly, and we develop a Bayesian estimation approach to calibrate the newly proposed models. These models were applied to data collected from a spatial rotation test. The results demonstrate that, compared with a traditional CDM that uses response accuracy only, the proposed joint models reveal more information about test takers’ spatial skills. A set of simulation studies was conducted to evaluate the accuracy of the estimation algorithm and to illustrate various degrees of model complexity.


Download references

Acknowledgments

This study was funded by the 2019 National Academy of Education/Spencer Postdoctoral Fellowship Program.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shiyu Wang.


Appendices

Appendix

Appendix I: Gibbs Algorithm for Parameter Updating

The proposed MCMC algorithm is used to sample from the posterior distribution of the model parameters. To do this, we first assigned initial values to all model parameters as follows:

  • The initial population membership probabilities \({{\varvec{\pi }}}^{[0]}\) were randomly generated from \(\text{ Dirichlet }(1,1,\ldots ,1)\).

  • The initial \({\varvec{\alpha }}^{[0]}\) were randomly sampled from the discrete uniform distribution over all possible patterns.

  • The initial \(\phi _i^{[0]}\) was randomly generated from \(\text{ Uniform }(0,1)\).

  • For the DINA model item parameters, we randomly generated \(g_j^{[0]}, s_{1j}^{[0]}, s_{2j}^{[0]}\) for each item from \(\text{ Uniform }(.1,.3)\).

  • For the lognormal response time model parameters, we sampled \(a_j^{[0]} \sim \text{ Uniform }(2,4),\) and \(\gamma _j^{[0]}\sim N(3.45, .5^2).\)

  • For response time model (2), the initial variance \({\sigma ^2}^{[0]}_\tau \) was generated from \(\text{ Uniform }(1,1.5)\). For each test taker i, \(\tau ^{[0]}_{i}\) was generated from \(N(0,{\sigma ^2}^{[0]}_\tau )\).

  • For response time model (3), the initial covariance matrix \({{\varvec{\Sigma }}}_{\tau _0\tau _1}^{[0]}\) was set to be the \(2\times 2\) identity matrix. For each test taker i, \((\tau _{0i}^{[0]}, \tau _{1i}^{[0]})\) was randomly sampled from \(MVN(\mathbf {0},{{\varvec{\Sigma }}}_{\tau _0\tau _1}^{[0]}).\)
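The initialization above can be sketched in Python/NumPy. This is a minimal sketch under stated assumptions: the sizes \(N\), \(J\), \(K\) and the choice of three levels per attribute are illustrative placeholders, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (placeholders, not the paper's settings):
N, J, K = 1000, 20, 3        # test takers, items, attributes
C = 3 ** K                   # attribute patterns, assuming 3 levels per attribute

# pi^[0] ~ Dirichlet(1, ..., 1): flat prior over pattern membership
pi0 = rng.dirichlet(np.ones(C))

# alpha^[0]: discrete uniform over all C patterns
alpha0 = rng.integers(0, C, size=N)

# phi_i^[0] ~ Uniform(0, 1)
phi0 = rng.uniform(0.0, 1.0, size=N)

# DINA item parameters g, s1, s2 ~ Uniform(.1, .3)
g0 = rng.uniform(0.1, 0.3, size=J)
s1_0 = rng.uniform(0.1, 0.3, size=J)
s2_0 = rng.uniform(0.1, 0.3, size=J)

# Lognormal RT item parameters: a ~ Uniform(2, 4), gamma ~ N(3.45, 0.5^2)
a0 = rng.uniform(2.0, 4.0, size=J)
gamma0 = rng.normal(3.45, 0.5, size=J)

# RT model (2): sigma_tau^2 ~ Uniform(1, 1.5), tau_i ~ N(0, sigma_tau^2)
sigma2_tau0 = rng.uniform(1.0, 1.5)
tau0 = rng.normal(0.0, np.sqrt(sigma2_tau0), size=N)

# RT model (3): (tau_0i, tau_1i) ~ MVN(0, I_2)
Sigma0 = np.eye(2)
tau_pair0 = rng.multivariate_normal(np.zeros(2), Sigma0, size=N)
```

Dispersed starting values of this kind, drawn independently per chain, are what allows multiple-chain convergence diagnostics to be meaningful.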

The following procedures were then used to update the parameters in the rth iteration of the MCMC chain.

  1. For \(i = 1,\ldots , N\), sample \({\varvec{\alpha }}_{i}^{[r+1]}\) from the multinomial distribution with probabilities \(\tilde{\pi }_{ic}\) given in Table 1.

  2. For response time model (2), for \(i = 1,\ldots , N\), update \(\tau _i^{[r+1]}\) using a Gibbs step based on the conditional distribution specified in Table 1.

  3. For response time model (3), for \(i=1,\ldots ,N\), update \((\tau _{0i}^{[r+1]},\tau _{1i}^{[r+1]})\) based on the conditional distribution specified in Table 1.

  4. Given a specified response time model, for \(i = 1,\ldots , N\), obtain \(\phi _i^{[r+1]}\) based on its conditional distribution in Table 1.

  5. For response time model (2), using the conditional distribution in Table 1 and \(\tau ^{[r+1]}_i\), update \({\sigma ^2}^{[r+1]}_{\tau }\) from the inverse gamma distribution.

  6. For response time model (3), based on the conditional distribution in Table 1 and \(({{\varvec{\tau }}}_0,{{\varvec{\tau }}}_{1})^{[r+1]}\), obtain \({{\varvec{\Sigma }}}^{[r+1]}_{\tau _0\tau _1}\) from the inverse Wishart distribution.

  7. Based on \({\varvec{\alpha }}^{[r+1]}\), update \({{\varvec{\pi }}}^{[r+1]}\) according to the Dirichlet distribution in Table 1.

  8. For \(j = 1,\ldots , J\), sample \(s_{2j}^{[r+1]}\) from the truncated beta distribution in Table 1, based on \({\varvec{\alpha }}^{[r+1]}, \mathbf{Y}, s_{1j}^{[r]}\), and \(g_j^{[r]}\); then sample \(s_{1j}^{[r+1]}\) based on \({\varvec{\alpha }}^{[r+1]}, \mathbf{Y}, g_j^{[r]}\), and \(s_{2j}^{[r+1]}\). Finally, update \(g_j^{[r+1]}\) using the truncated beta distribution based on \({\varvec{\alpha }}^{[r+1]}, \mathbf{Y}, s_{1j}^{[r+1]}\), and \(s_{2j}^{[r+1]}\).

  9. Given the specified response time model, for \(j = 1,\ldots , J\), sample \(a_j^{[r+1]}\) from the inverse gamma distribution given in Table 1, based on \(\varvec{L}, {{\varvec{\tau }}}^{[r+1]} (\text {or } ({{\varvec{\tau }}}_0,{{\varvec{\tau }}}_1)^{[r+1]}), {\varvec{\alpha }}^{[r+1]}, {{\varvec{\phi }}}^{[r]}\), and \(\gamma _j^{[r]}\); then sample \(\gamma _j^{[r+1]}\) from the normal distribution in Table 1, based on \(\varvec{L}, {{\varvec{\tau }}}^{[r+1]} (\text {or } ({{\varvec{\tau }}}_0,{{\varvec{\tau }}}_1)^{[r+1]}), {\varvec{\alpha }}^{[r+1]}, {{\varvec{\phi }}}^{[r]}\), and \(a_j^{[r+1]}\).
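Steps (5) and (7) are standard conjugate Gibbs updates, so they can be written out concretely. The sketch below (Python/NumPy) assumes generic hyperparameters; the paper's exact full conditionals are given in its Table 1, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def update_pi(alpha, C, prior=1.0):
    """Step (7): Dirichlet update of pattern probabilities given memberships.

    With a Dirichlet(prior, ..., prior) prior, the full conditional is
    Dirichlet(prior + n_1, ..., prior + n_C), where n_c counts the test
    takers currently assigned to pattern c.
    """
    counts = np.bincount(alpha, minlength=C)
    return rng.dirichlet(prior + counts)

def update_sigma2_tau(tau, a0=1.0, b0=1.0):
    """Step (5): inverse gamma update of the speed variance in RT model (2).

    Assumes a conjugate IG(a0, b0) prior (a0, b0 are placeholder
    hyperparameters). A draw from IG(shape, scale) is obtained as
    1 / Gamma(shape, 1/scale).
    """
    shape = a0 + tau.size / 2.0
    scale = b0 + 0.5 * np.sum(tau ** 2)
    return 1.0 / rng.gamma(shape, 1.0 / scale)
```

The remaining steps follow the same pattern: each block of parameters is drawn from its full conditional in turn, holding all other blocks at their current values.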

Appendix II: Supplementary Simulation Results

We report the results for the four models when \(N=500, J=40\) and \(N=1000, J=20\) in this appendix. Tables 9 and 10 document the latent attribute profile classification results and the recovery of \(\phi _i\).

Table 9 Classification results of \(\phi \) generation condition 1
Table 10 Classification results of \(\phi \) generation condition 2

For the recovery of the latent speed parameters, the correlations between the estimated and true \(\tau _{i}\) for models 1 and 2 were always above 0.95, and performance was comparable across all settings. For models 3 and 4, the correlations between the estimated and true \(\tau _{0i}\) were similar to those for models 1 and 2, and the correlations between the estimated and true \(\tau _{1i}\) ranged from 0.552 to 0.933. Model 4 performed better than model 3 across the different settings.

For models 1 and 2, the RMSEs of the estimated \(\sigma _{\tau }\) were below 0.05, and the relative bias was always below 0.08. Both models performed better under the more idealized \(\phi \) condition and were comparable under the two item conditions. For models 3 and 4, the absolute bias of \(\sigma _{00}\) ranged from 0.017 to 0.051 for model 3 and from 0.012 to 0.038 for model 4. The absolute bias of \(\sigma _{01}\) (resp. \(\sigma _{11}\)) ranged from 0.013 to 0.062 (resp. 0.015 to 0.017) for model 3 and from 0.035 to 0.046 (resp. 0.012 to 0.016) for model 4. The estimation of \(\sigma _{00}\) and \(\sigma _{11}\) improved as N increased, whereas the estimation of \(\sigma _{01}\) improved as J increased. The estimation of the response time model parameters \(\gamma _j\) and \(a_j\) was consistently good across all simulation conditions. The correlation between the estimated and true \(a_j\) was greater than 0.98 in all conditions, and the correlation between the estimated and true \(\gamma _j\) was always greater than 0.95. For both parameters, 99% of the relative absolute deviations were smaller than 0.2 in all conditions.
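The recovery summaries reported above (RMSE, relative bias, and the correlation between estimated and true parameters) can be computed with a small generic helper; this is a sketch of the metrics themselves, not the authors' evaluation code.

```python
import numpy as np

def recovery_metrics(est, true):
    """RMSE, relative bias, and correlation between estimates and truth."""
    est = np.asarray(est, dtype=float)
    true = np.asarray(true, dtype=float)
    rmse = np.sqrt(np.mean((est - true) ** 2))
    rel_bias = np.mean(est - true) / np.mean(true)   # assumes nonzero true mean
    corr = np.corrcoef(est, true)[0, 1]
    return rmse, rel_bias, corr
```

For example, estimates shifted by a constant 0.1 from true values (1, 2, 3, 4) give an RMSE of 0.1, a relative bias of 0.04, and a correlation of 1.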

The RMSEs of the response model item parameters ranged from 0.021 to 0.070 for models 1 and 2 and from 0.021 to 0.042 for models 3 and 4. Under the same \(\phi _i\) generation condition, the estimation of all item parameters was better under the more idealized true item parameter condition, that is, \((g, s_1, s_2)=(0.1, 0.2, 0.1)\). Under the same true item parameter generation condition, the estimation of item parameters was better under the more idealized \(\phi _i\) generation condition. All four models performed better with the larger number of items J despite the smaller number of test takers N.

Appendix III: Model Fit of Four Simplified Fluency Models (\(\phi _i=\phi \))

See Table 11.

Table 11 DIC of four simplified fluency models

About this article

Cite this article

Wang, S., Chen, Y. Using Response Times and Response Accuracy to Measure Fluency Within Cognitive Diagnosis Models. Psychometrika 85, 600–629 (2020). https://doi.org/10.1007/s11336-020-09717-2
