Abstract
Computer-based interactive items have become prevalent in recent educational assessments. In such items, the detailed human–computer interaction process, known as the response process, is recorded in a log file. The recorded response processes provide great opportunities for understanding individuals’ problem-solving processes. However, these data are difficult to analyze because they are high-dimensional sequences in a nonstandard format. This paper aims to extract useful information from response processes. In particular, we consider an exploratory analysis that extracts latent variables from process data through a multidimensional scaling framework. A dissimilarity measure is described to quantify the discrepancy between two response processes. The proposed method is applied to both simulated data and real process data from 14 PSTRE items in PIAAC 2012. A prediction procedure is used to examine the information contained in the extracted latent variables. We find that the extracted latent variables preserve a substantial amount of the information in the processes and have reasonable interpretability. We also show empirically that, in terms of out-of-sample prediction of many variables, process data contain more information than classic binary item responses.
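The core idea of the multidimensional scaling step can be illustrated with a classical (Torgerson) MDS sketch: given a matrix of pairwise dissimilarities between response processes, it recovers latent features whose Euclidean distances approximate those dissimilarities. This is a generic illustration, not the authors’ Procedure 1 (which uses a different optimization); the toy matrix `D` is made up for demonstration.

```python
import numpy as np

def classical_mds(D, k):
    """Classical (Torgerson) multidimensional scaling.

    D: (n, n) symmetric matrix of pairwise dissimilarities.
    k: number of latent dimensions to extract.
    Returns an (n, k) array of latent features whose pairwise
    Euclidean distances approximate the entries of D.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # keep the k largest eigenvalues
    w_k = np.clip(w[idx], 0.0, None)      # guard against small negatives
    return V[:, idx] * np.sqrt(w_k)

# toy dissimilarity matrix for three hypothetical response processes
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
X = classical_mds(D, k=2)  # latent features, one row per process
```

The extracted rows of `X` can then serve as covariates in downstream analyses, as the prediction procedure in the paper does with its own features.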
Liu and Ying’s research is supported by National Science Foundation Grants SES-1826540 and IIS-1633360. He’s research is supported by National Science Foundation Grant IIS-1633353. The authors would like to thank Educational Testing Service for providing the data, and Hok Kan Ling for cleaning it.
Appendix
To compare the prediction performance of features extracted from different dissimilarity measures, we compute the Levenshtein distance matrix of the action sequences for each item and extract features using Procedure 1, with the number of features K chosen by fivefold cross-validation. With these newly extracted features, we repeat the score prediction experiment with multiple items (Section 4.5.2). All other settings are the same as before. Figure 12 compares the prediction performance with the results in the main text. Although the \(\text {OSR}^2\) for the Levenshtein distance features is lower than that for the features extracted previously, it is still higher than that of the baseline model, and the general trend of \(\text {OSR}^2\) as the number of items increases is similar.
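For readers unfamiliar with the Levenshtein distance used in this comparison, below is a minimal Python sketch of the standard dynamic-programming computation applied to action sequences; the sequences `s1` and `s2` are hypothetical examples, not drawn from the PIAAC data.

```python
def levenshtein(a, b):
    """Edit distance between two action sequences (lists of action labels).

    Counts the minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b.
    """
    m, n = len(a), len(b)
    # dp[j] holds the distance between a[:i] and b[:j] for the current row i
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + cost)    # substitution (or match)
            prev = cur
    return dp[n]

# two hypothetical action sequences from a PSTRE-style item
s1 = ["start", "click_A", "click_B", "submit"]
s2 = ["start", "click_B", "submit"]
levenshtein(s1, s2)  # 1: deleting "click_A" turns s1 into s2
```

Computing this distance for every pair of respondents on an item yields the symmetric dissimilarity matrix to which the feature extraction procedure is then applied.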
Cite this article
Tang, X., Wang, Z., He, Q. et al. Latent Feature Extraction for Process Data via Multidimensional Scaling. Psychometrika 85, 378–397 (2020). https://doi.org/10.1007/s11336-020-09708-3