Abstract
In human-to-human Collaborative Problem Solving (CPS) tests, students’ problem-solving processes reflect the interdependency among partners. This high interdependency makes CPS very sensitive to group composition. For example, the group outcome may be driven by a highly competent group member and thus fail to reflect each individual’s performance, especially that of a low-ability member. How to assess individual performance effectively has therefore become a challenging issue in educational measurement. This research aims to construct a measurement model that estimates an individual’s collaborative problem-solving ability while correcting for the impact of partners’ abilities. First, 175 dyads of eighth graders were divided into six cooperative groups with different combinations of problem-solving (PS) ability levels (i.e., high-high, high-medium, high-low, medium-medium, medium-low, and low-low). They then completed a test of three CPS tasks, and the dyads’ log data were recorded. We applied Multidimensional Item Response Theory (MIRT) measurement models to estimate individual CPS ability and proposed a mean correction method to correct the impact of group composition on individual ability estimates. Results show that (1) the multidimensional IRT model fits the data better than the multidimensional IRT model with a testlet effect, and (2) the mean correction method significantly reduced the impact of group composition on the obtained individual abilities. This study not only increased the validity of individual CPS ability measurement but also provides useful guidelines for enhancing individuals’ CPS ability and promoting an individualized learning environment in educational settings.
Data availability
The datasets analyzed during the current study are available from the corresponding author upon reasonable request.
References
Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1–23. https://doi.org/10.1177/0146621697211001
Adams, R. J., Vista, A., Scoular, C., Awwal, N., Griffin, P., & Care, E. (2015). Automatic coding procedures for collaborative problem solving. In E. Care, P. Griffin, & M. Wilson (Eds.), Assessment and teaching of 21st century skills: Methods and approaches (pp. 115–132). Springer.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705
Andrews, T. J., & Forsyth, C. M. (2020). Exploring social and cognitive dimensions of collaborative problem solving in an open online simulation-based task. Computers in Human Behavior, 104, 105759. https://doi.org/10.1016/j.chb.2018.10.025
Andrews, J. J., Kerr, D., Mislevy, R. J., von Davier, A., Hao, J., & Liu, L. (2017). Modeling collaborative interaction patterns in a simulation-based task. Journal of Educational Measurement, 54(1), 54–69. https://doi.org/10.1111/jedm.12132
Baker, F. B., & Kim, S. (2004). Item response theory, parameter estimation techniques (2nd ed.). Marcel Dekker, Inc.
Birnbaum, A. (1968). Some latent trait models and their use in inferring a student’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–424). Addison-Wesley.
Cannon-Bowers, J. A., Tannenbaum, S. I., Salas, E., & Volpe, C. E. (1995). Defining competencies and establishing team training requirements. In R. A. Guzzo & E. Salas (Eds.), Team effectiveness and decision making in organizations (pp. 333–380). Wiley.
Care, E., Griffin, P., Scoular, C., Awwal, N., & Zoanetti, N. (2015). Collaborative problem solving tasks. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 85–104). Springer.
Care, E., Scoular, C., & Griffin, P. (2016). Assessment of collaborative problem solving in education environments. Applied Measurement in Education, 29(4), 250–264. https://doi.org/10.1080/08957347.2016.1209204
Christensen, P. R. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. https://doi.org/10.18637/jss.v048.i06
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104
Dechurch, L. A., & Mesmer-Magnus, J. R. (2010). The cognitive underpinnings of effective teamwork: A meta-analysis. Journal of Applied Psychology, 95(1), 32–53. https://doi.org/10.1037/a0017328
DeMars, C. E. (2006). Application of the bi-factor multidimensional item response theory model to testlet-based tests. Journal of Educational Measurement, 43(2), 145–168. https://doi.org/10.2307/20461818
DeMars, C. E., & Jacovidis, J. (2016). Multilevel Item Response Theory (IRT): When is local independence violated? Paper presented at the annual meeting of the National Council on Measurement in Education, Washington, DC.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum.
Gao, Q., Zhang, S., Cai, Z., Liu, K., Hui, N., & Tong, M. (2022). Understanding student teachers’ collaborative problem solving competency: Insights from process data and multidimensional item response theory. Thinking Skills and Creativity, 45, 101097. https://doi.org/10.1016/j.tsc.2022.101097
Graesser, A., Kuo, B. C., & Liao, C. H. (2017). Complex problem solving in assessments of collaborative problem solving. Journal of Intelligence, 5(10), 1–14. https://doi.org/10.3390/jintelligence5020010
Graesser, A. C., Fiore, S. M., Greiff, S., Andrews, T. J., Foltz, P. W., & Hesse, F. W. (2018). Advancing the science of collaborative problem solving. Psychological Science in the Public Interest, 19(2), 59–92. https://doi.org/10.1177/1529100618808244
Griffin, P., Care, E., & McGaw, B. (2012). The changing role of education and schools. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 1–16). Springer.
Griffin, P., Care, E., & Harding, S. (2015). Task characteristics and calibration. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 133–178). Springer.
Herborn, K., Stadler, M., Mustafić, M., & Greiff, S. (2020). The assessment of collaborative problem solving in PISA 2015: Can computer agents replace humans? Computers in Human Behavior, 104, 105624. https://doi.org/10.1016/j.chb.2018.07.035
Hesse, F., Care, E., Buder, J., Sassenberg, K., & Griffin, P. (2015). A framework for teachable collaborative problem solving skills. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 37–56). Springer.
Hao, J., Liu, L., von Davier, A. A., & Kyllonen, P. C. (2017). Initial steps towards a standardized assessment for collaborative problem solving (CPS): Practical challenges and strategies. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 135–156). Springer International Publishing.
Kyllonen, P. C., Zhu, M., & von Davier, A. A. (2017). Introduction: Innovative assessment of collaboration. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 1–18). Springer International Publishing.
Laughlin, P. R., & Branch, L. G. (1972). Individual versus tetradic performance on a complementary task as a function of initial ability level. Organizational Behavior and Human Performance, 8(2), 201–216. https://doi.org/10.1016/0030-5073(72)90046-3
Li, M., Liu, H., & Yuan, J. (2022). The application of computational psychometrics in the assessment of key competencies: A case of collaborative problem solving. Educational Research, 43(3), 127–137.
Liu, L., Hao, J., von Davier, A. A., Kyllonen, P., & Zapata-Rivera, J. D. (2015). A tough nut to crack: measuring collaborative problem solving. In Y. Rosen, S. Ferrara, & M. Mosharraf (Eds.), Handbook of research on technology tools for real-world skill development (pp. 344–359). IGI Global.
Marais, I., & Andrich, D. (2008). Formalizing dimension and response violations of local independence in the unidimensional Rasch model. Journal of Applied Measurement, 9(3), 200–215.
Marks, M. A., Mathieu, J. E., & Zaccaro, S. J. (2001). A temporally based framework and taxonomy of team processes. The Academy of Management Review, 26(3), 356. https://doi.org/10.2307/259182
Moreland, R. L., & Levine, J. M. (1992). The composition of small groups. Advances in Group Processes, 9, 237–280.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–177. https://doi.org/10.1002/j.2333-8504.1992.tb01436.x
Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide, eighth edition. Muthén & Muthén.
OECD. (2004). Problem solving for tomorrow’s world: First measures of cross-curricular competencies from PISA 2003. OECD Publishing.
OECD. (2017a). PISA 2015 assessment and analytical framework: Science, reading, mathematic, financial literacy and collaborative problem solving (revised). OECD Publishing.
OECD. (2017b). PISA 2015 technical report. OECD Publishing.
Partnership for 21st Century Skills. (2019). Framework for 21st century learning definitions. Retrieved from http://www.battelleforkids.org/networks/p21/frameworks-resources. Accessed 10-08-2021
Ramalingam, D., & Adams, R. J. (2015). How can the use of data from computer-delivered assessments improve the measurement of twenty-first century skills? In E. Care, P. Griffin, & M. Wilson (Eds.), Assessment and teaching of 21st century skills: Research and applications (pp. 225–238). Springer.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Sage.
Reckase, M. D. (2009). Multidimensional item response theory. Springer.
Rosen, Y., & Rimor, R. (2009). Using collaborative database to enhance students’ knowledge construction. Interdisciplinary Journal of E-Learning and Learning Objects, 5, 187–195.
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. https://doi.org/10.2307/2335942
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464. https://doi.org/10.1214/aos/1176344136
Stewart, A. E., Amon, M. J., Duran, N. D., & D’Mello, S. K. (2020). Beyond team makeup: Diversity in teams predicts valued outcomes in computer-mediated collaborations. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13.
Sun, C., Shute, V. J., Stewart, A., Yonehiro, J., Duran, N., & D’Mello, S. (2020). Towards a generalized competency model of collaborative problem solving. Computers & Education, 143, 103672. https://doi.org/10.1016/j.compedu.2019.103672
Swiecki, Z., Ruis, A. R., Farrell, C., & Shaffer, D. W. (2020). Assessing individual contributions to collaborative problem solving: A network analysis approach. Computers in Human Behavior, 104, 105876. https://doi.org/10.1016/j.chb.2019.01.009
Tannenbaum, S. I., Beard, R. L., & Salas, E. (1992). Team building and its influence on team effectiveness: An examination of conceptual and empirical developments. Advances in Psychology, 82, 117–153. https://doi.org/10.1016/S0166-4115(08)62601-1
Vista, A., Awwal, N., & Care, E. (2016). Sequential actions as markers of behavioural and cognitive processes: Extracting empirical pathways from data streams of complex tasks. Computers & Education, 92–93, 15–36. https://doi.org/10.1016/j.compedu.2015.10.009
Vista, A., Care, E., & Awwal, N. (2017). Visualising and examining sequential actions as behavioural paths that can be interpreted as markers of complex behaviours. Computers in Human Behavior, 76, 656–671. https://doi.org/10.1016/j.chb.2017.01.027
von Davier, A. A. (2017). Computational psychometrics in support of collaborative educational assessments. Journal of Educational Measurement, 54(1), 3–11. https://doi.org/10.1111/jedm.12129
von Davier, A. A., & Halpin, P. F. (2013). Collaborative problem solving and the assessment of cognitive skills: Psychometric considerations (ETS Research Report RR-13-41). ETS. https://doi.org/10.1002/j.2333-8504.2013.tb02348.x
Wang, W. C., & Wilson, M. (2005). Exploring local item dependence using a random-effects facet model. Applied Psychological Measurement, 29(4), 296–318. https://doi.org/10.1177/0146621605276281
Webb, N. M. (1995). Group collaboration in assessment: Multiple objectives, processes and outcomes. Educational Evaluation and Policy Analysis, 17(2), 239–261. https://doi.org/10.2307/1164563
Webb, N., Nemer, K. M., Chizhik, A., & Sugrue, B. (1998). Equity issues in collaborative group assessment: Group composition and performance. American Educational Research Journal, 35(4), 607–651. https://doi.org/10.3102/00028312035004607
Wilczenski, F. L., Bontrager, T., Ventrone, P., & Correia, M. (2001). Observing collaborative problem-solving processes and outcomes. Psychology in the Schools, 38, 269–281. https://doi.org/10.1002/pits.1017
Wilson, M., Gochyyev, P., & Scalise, K. (2017). Modeling data from collaborative assessments: Learning in digital interactive social networks. Journal of Educational Measurement, 54(1), 85–102. https://doi.org/10.1111/jedm.12134
Wise, S. L. (2019). An information-based approach to identifying rapid-guessing thresholds. Applied Measurement in Education, 32(4), 325–336. https://doi.org/10.1080/08957347.2019.1660350
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8(2), 125–145. https://doi.org/10.1177/014662168400800201
Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187–213. https://doi.org/10.1111/j.1745-3984.1993.tb00423.x
Yuan, J. L., Xiao, Y., & Liu, H. Y. (2019). Assessment of collaborative problem solving based on process stream data: A new paradigm for extracting indicators and modeling dyad data. Frontiers in Psychology, 10, 369. https://doi.org/10.3389/fpsyg.2019.00369
Yuan, J. L. (2018). A study of measuring collaborative problem solving based on behavioral process performance (doctoral dissertation). Beijing Normal University, Beijing.
Zhang, S., Gao, Q., Sun, M., Cai, Z., Li, H., Tang, Y., et al. (2022). Understanding student teachers’ collaborative problem solving: Insights from an epistemic network analysis (ENA). Computers & Education, 183, 104485. https://doi.org/10.1016/j.compedu.2022.104485
Zhang, Z., Wilson, M., Alom, M., Awwal, N., & Griffin, P. (2018, April). Adopting a process perspective on collaborative problem solving. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.
Zhu, M., Andrews-Todd, J., & Zhang, M. (2020). Application of network analysis in understanding collaborative problem solving processes and skills. In H. Jiao & R. W. Lissitz (Eds.), Innovative Psychometric Modeling and Methods. Information Age Publisher. https://www.researchgate.net/publication/344887044_Application_of_Network_Analysis_in_Understanding_Collaborative_Problem_Solving_Processes_and_Skills/link/5f96d49492851c14bce7a903/download. Accessed 10-08-2021
Zoanetti, N. (2010). Interactive computer based assessment tasks: how problem-solving process data can inform instruction. Australasian Journal of Educational Technology, 26(5), 585–606. https://doi.org/10.14742/ajet.1053
Funding
This work was supported by the Key Project for the 14th Five-Year Plan of Beijing Education Sciences in 2022 (CDAA22033).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Appendix
Three human-to-human interaction tasks (Clown, Plant Growth, and Olive Oil) were employed in this study. The following describes the coding scheme for the log data in detail, using the “Olive Oil” task as an example.
1.1 The Olive Oil task
The “Olive Oil” task is a well-defined algorithmic problem that two students complete collaboratively. As Appendix Fig. 4 shows, the left side is student A’s interface and the right side is student B’s. The two students have jars of different volumes: student A’s holds 3L and student B’s holds 5L. The task goal is to fill student B’s jar with exactly 4L of olive oil. A transfer pipe can be used to transfer olive oil from student A to student B, and unused olive oil can be emptied into a bucket. Students A and B must type text in the chat windows to communicate and collaborate on the task. The ideal path includes eight steps: (1) student A fills the 3L jar; (2) student A transfers the 3L of olive oil to student B; (3) student A fills the 3L jar again; (4) student A transfers 2L of olive oil to student B and keeps 1L; (5) student B pours out all the olive oil in the 5L jar; (6) student A transfers the remaining 1L to student B; (7) student A fills the 3L jar again; (8) student A transfers the 3L to student B, so the latter has 4L in total. Appendix Fig. 4 shows the instruction and the problem space on two screen tabs in the Olive Oil task.
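The eight-step ideal path described above can be checked mechanically. The following sketch (not part of the article's materials) simulates the sequence of jar states and confirms that student B ends with 4L:

```python
# Simulate the eight-step ideal path of the Olive Oil task,
# tracking (A's 3L jar, B's 5L jar) after each step.
def ideal_path():
    a, b = 0, 0
    states = []
    a = 3; states.append((a, b))            # 1. A fills the 3L jar
    b += a; a = 0; states.append((a, b))    # 2. A transfers 3L to B (B = 3)
    a = 3; states.append((a, b))            # 3. A fills the 3L jar again
    move = min(a, 5 - b)                    # 4. A transfers 2L, keeps 1L
    b += move; a -= move; states.append((a, b))
    b = 0; states.append((a, b))            # 5. B pours out all 5L
    b += a; a = 0; states.append((a, b))    # 6. A transfers the remaining 1L
    a = 3; states.append((a, b))            # 7. A fills the 3L jar again
    b += a; a = 0; states.append((a, b))    # 8. A transfers 3L; B holds 4L
    return states

print(ideal_path()[-1])  # (0, 4): the goal state, 4L in B's jar
```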
1.2 Log data
Appendix Table 13 shows the original log data recorded by the computer as a group of students completed the “Olive Oil” task. It includes five variables: ID is the sequence number of the group’s behaviors and conversations during the task; GroupID is the group number; log_type is the type of the student’s current event; log_content is the content of the student’s current event; and role is the student’s mission role. In addition, Appendix Table 14 describes the data types in log_content in detail.
1.3 Coding
The operation behaviors and conversations in the CPS process were coded using the following methods:
1.3.1 Behavior coding
Behavior coding involves two steps. The first is to delete meaningless behaviors and keep meaningful ones. Meaningful behaviors reflect the progress of the CPS mission, for example, filling, transferring, and pouring oil in the “Olive Oil” task. Meaningless behaviors provide no information about task progress, such as clicking and dragging the mouse, moving the jar, and turning the transfer pipe on and off.
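This filtering step can be sketched as a simple set-membership check. The event names below are illustrative, not the study's actual log vocabulary:

```python
# Keep only events whose action reflects task progress;
# the two vocabularies below are hypothetical examples.
MEANINGFUL = {"fill", "trans", "to"}                      # fill / transfer / pour oil
MEANINGLESS = {"click", "drag", "move_jar", "pipe_on", "pipe_off"}

def keep_meaningful(events):
    """Return only the events considered meaningful for task progress."""
    return [e for e in events if e["action"] in MEANINGFUL]

log = [{"action": "click"}, {"action": "fill"},
       {"action": "pipe_on"}, {"action": "trans"}]
print(keep_meaningful(log))  # [{'action': 'fill'}, {'action': 'trans'}]
```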
The second step is to code the meaningful behavior data to describe students’ meaningful behaviors in the CPS process. To achieve the task goal, the two students in a group need to collaborate, clearly understand the problem-solving strategy, and design a logical problem-solving process. To represent a student’s behavioral status in the CPS process, the coding needs to capture the student’s operation as well as the cumulative state resulting from all previous steps. We therefore used the expression formula “A/B 3L/5L_fill/to/trans: 3L = X;5L = Y” to code students’ operation behaviors and the oil volumes in the jars. “A/B” stands for the student’s role; “3L/5L” indicates the jar being operated (student A uses the 3L jar and student B uses the 5L jar); “fill” means adding oil, “to” means pouring oil out, and “trans” means transferring oil; “3L = X” is the oil volume in student A’s jar at this moment, and “5L = Y” is the oil volume in student B’s jar at this moment. For example, “A 3L_fill:3L = 3;5L = 0” means that student A has just added 3L of oil, so his/her jar holds 3L while student B’s jar holds 0L.
1.3.2 Language coding
Language coding also has two steps. First, language coding indicators were determined based on students’ performance, covering four dimensions: sharing ideas, negotiating ideas, regulating problem solving, and maintaining communication (Liu et al., 2015; Hao et al., 2017). The 33 coding indicators of Hao et al. (2017) were expanded to 38 to distinguish students’ language characteristics in the CPS process (Appendix Table 14). Four of the five added indicators concern sharing ideas; they distinguish the content being shared (resources, mission progress status), proactivity, and the roles of questioning and responding. The fifth added indicator concerns maintaining communication: students expressing negative thoughts about giving up. The second step is manual coding. It started with training all coders with a coding manual. Then, 10 sets of data were chosen from each mission, and language content from different time frames was double-coded; inconsistent codes were discussed and finalized. Lastly, 20% of the data were chosen and double-coded to calculate the consistency coefficient. Across the three missions, the Kappa coefficient reached 0.98 at the CPS skills level and 0.84–0.88 at the student performance (subcategories) level, indicating sufficient consistency (Cohen, 1960). Once the coding consistency coefficient reached 0.80, the remaining data were single-coded.
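Cohen's kappa, the agreement statistic cited here, corrects observed agreement for chance agreement. A minimal implementation is shown below; the two coders' label sequences are invented for illustration:

```python
from collections import Counter

def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders' nominal labels (Cohen, 1960)."""
    n = len(coder1)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    c1, c2 = Counter(coder1), Counter(coder2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

c1 = ["share", "share", "negotiate", "regulate", "maintain", "share"]
c2 = ["share", "share", "negotiate", "maintain", "maintain", "share"]
print(round(cohens_kappa(c1, c2), 3))  # 0.75
```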
1.4 Forming structured log data
Through behavior coding and language coding, structured CPS log data were established. Appendix Table 15 presents an example of the structured log data in this task. “Eventtype” is the type of event, in which “action” stands for behaviors and “chat” stands for language. “Event” represents the structured log datum. For example, the fourth event in the group is “C11”, meaning that student A’s language type at this moment is “sharing information related to mission resources with a teammate”. The sixth event is “A 3L_fill:3L = 3;5L = 0”, meaning that at this moment student A is filling the 3L jar, so he/she holds 3L of oil while student B holds 0L.
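The merging of coded behavior and language events into one time-ordered stream can be sketched as follows. The field names mirror the appendix tables ("Eventtype", "Event"); the timestamps and the "C12" code are illustrative assumptions:

```python
# Merge coded behavior ("action") and language ("chat") events into one
# structured, time-ordered log, as in Appendix Table 15.
def build_structured_log(events):
    """events: iterable of (timestamp, eventtype, coded_event) triples."""
    ordered = sorted(events, key=lambda e: e[0])
    return [{"Eventtype": t, "Event": code} for _, t, code in ordered]

raw = [
    (4, "chat", "C11"),                        # A shares mission-resource info
    (6, "action", "A 3L_fill:3L = 3;5L = 0"),  # A fills the 3L jar
    (5, "chat", "C12"),                        # hypothetical chat code
]
log = build_structured_log(raw)
print([e["Event"] for e in log])  # ['C11', 'C12', 'A 3L_fill:3L = 3;5L = 0']
```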
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, M., Liu, H., Cai, M. et al. Estimation of individuals’ collaborative problem solving ability in computer-based assessment. Educ Inf Technol 29, 483–515 (2024). https://doi.org/10.1007/s10639-023-12271-w