Estimation of individuals’ collaborative problem solving ability in computer-based assessment

Education and Information Technologies

Abstract

In a human-to-human Collaborative Problem Solving (CPS) test, students’ problem-solving process reflects the interdependency among partners. This high interdependency makes CPS very sensitive to group composition: the group outcome may be driven by a highly competent member and thus fail to reflect each individual’s performance, especially that of a low-ability member. As a result, how to effectively assess individual performance has become a challenging issue in educational measurement. This research constructs a measurement model to estimate an individual’s collaborative problem-solving ability and to correct for the impact of partners’ abilities. First, 175 dyads of eighth graders were divided into six types of cooperative groups with different combinations of problem-solving (PS) ability levels (i.e., high-high, high-medium, high-low, medium-medium, medium-low, and low-low). Each dyad then completed a test of three CPS tasks, and its log data were recorded. We applied Multidimensional Item Response Theory (MIRT) measurement models to estimate individuals’ CPS abilities and proposed a mean correction method to correct the impact of group composition on individual ability. Results show that (1) the multidimensional IRT model fits the data better than the multidimensional IRT model with a testlet effect, and (2) the mean correction method significantly reduced the impact of group composition on the obtained individual ability estimates. This study not only increased the validity of measuring individuals’ CPS ability but also provides useful guidelines for enhancing individuals’ CPS ability in educational settings and promoting an individualized learning environment.

Data availability

The datasets analyzed during the current study are available from the corresponding author upon reasonable request.

References

  • Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1–23. https://doi.org/10.1177/0146621697211001

  • Adams, R. J., Vista, A., Scoular, C., Awwal, N., Griffin, P., & Care, E. (2015). Automatic coding procedures for collaborative problem solving. In E. Care, P. Griffin, & M. Wilson (Eds.), Assessment and teaching of 21st century skills: Methods and approaches (pp. 115–132). Springer.

  • Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705

  • Andrews, T. J., & Forsyth, C. M. (2020). Exploring social and cognitive dimensions of collaborative problem solving in an open online simulation-based task. Computers in Human Behavior, 104, 105759. https://doi.org/10.1016/j.chb.2018.10.025

  • Andrews, J. J., Kerr, D., Mislevy, R. J., von Davier, A., Hao, J., & Liu, L. (2017). Modeling collaborative interaction patterns in a simulation-based task. Journal of Educational Measurement, 54(1), 54–69. https://doi.org/10.1111/jedm.12132

  • Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd ed.). Marcel Dekker.

  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–424). Addison-Wesley.

  • Cannon-Bowers, J. A., Tannenbaum, S. I., Salas, E., & Volpe, C. E. (1995). Defining competencies and establishing team training requirements. In R. A. Guzzo & E. Salas (Eds.), Team effectiveness and decision making in organizations (pp. 333–380). Wiley.

  • Care, E., Griffin, P., Scoular, C., Awwal, N., & Zoanetti, N. (2015). Collaborative problem solving tasks. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 85–104). Springer.

  • Care, E., Scoular, C., & Griffin, P. (2016). Assessment of collaborative problem solving in education environments. Applied Measurement in Education, 29(4), 250–264. https://doi.org/10.1080/08957347.2016.1209204

  • Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. https://doi.org/10.18637/jss.v048.i06

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104

  • DeChurch, L. A., & Mesmer-Magnus, J. R. (2010). The cognitive underpinnings of effective teamwork: A meta-analysis. Journal of Applied Psychology, 95(1), 32–53. https://doi.org/10.1037/a0017328

  • DeMars, C. E. (2006). Application of the bi-factor multidimensional item response theory model to testlet-based tests. Journal of Educational Measurement, 43(2), 145–168. https://doi.org/10.2307/20461818

  • DeMars, C. E., & Jacovidis, J. (2016). Multilevel item response theory (IRT): When is local independence violated? Poster presented at the annual meeting of the National Council on Measurement in Education, Washington, DC.

  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum.

  • Gao, Q., Zhang, S., Cai, Z., Liu, K., Hui, N., & Tong, M. (2022). Understanding student teachers’ collaborative problem solving competency: Insights from process data and multidimensional item response theory. Thinking Skills and Creativity, 45, 101097. https://doi.org/10.1016/j.tsc.2022.101097

  • Graesser, A., Kuo, B. C., & Liao, C. H. (2017). Complex problem solving in assessments of collaborative problem solving. Journal of Intelligence, 5(10), 1–14. https://doi.org/10.3390/jintelligence5020010

  • Graesser, A. C., Fiore, S. M., Greiff, S., Andrews, T. J., Foltz, P. W., & Hesse, F. W. (2018). Advancing the science of collaborative problem solving. Psychological Science in the Public Interest, 19(2), 59–92. https://doi.org/10.1177/1529100618808244

  • Griffin, P., Care, E., & McGaw, B. (2012). The changing role of education and schools. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 1–16). Springer.

  • Griffin, P., Care, E., & Harding, S. (2015). Task characteristics and calibration. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 133–178). Springer.

  • Herborn, K., Stadler, M., Mustafić, M., & Greiff, S. (2020). The assessment of collaborative problem solving in PISA 2015: Can computer agents replace humans? Computers in Human Behavior, 104, 105624. https://doi.org/10.1016/j.chb.2018.07.035

  • Hesse, F., Care, E., Buder, J., Sassenberg, K., & Griffin, P. (2015). A framework for teachable collaborative problem solving skills. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century skills: Methods and approach (pp. 37–56). Springer.

  • Hao, J., Liu, L., von Davier, A. A., & Kyllonen, P. C. (2017). Initial steps towards a standardized assessment for collaborative problem solving (CPS): Practical challenges and strategies. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 135–156). Springer International Publishing.

  • Kyllonen, P. C., Zhu, M., & von Davier, A. A. (2017). Introduction: Innovative assessment of collaboration. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 1–18). Springer International Publishing.

  • Laughlin, P. R., & Branch, L. G. (1972). Individual versus tetradic performance on a complementary task as a function of initial ability level. Organizational Behavior and Human Performance, 8(2), 201–216. https://doi.org/10.1016/0030-5073(72)90046-3

  • Li, M., Liu, H., & Yuan, J. (2022). The application of computational psychometrics in the assessment of key competencies: A case of collaborative problem solving. Educational Research, 43(3), 127–137.

  • Liu, L., Hao, J., von Davier, A. A., Kyllonen, P., & Zapata-Rivera, J. D. (2015). A tough nut to crack: Measuring collaborative problem solving. In Y. Rosen, S. Ferrara, & M. Mosharraf (Eds.), Handbook of research on technology tools for real-world skill development (pp. 344–359). IGI Global.

  • Marais, I., & Andrich, D. (2008). Formalizing dimension and response violations of local independence in the unidimensional Rasch model. Journal of Applied Measurement, 9(3), 200–215.

  • Marks, M. A., Mathieu, J. E., & Zaccaro, S. J. (2001). A temporally based framework and taxonomy of team processes. The Academy of Management Review, 26(3), 356–376. https://doi.org/10.2307/259182

  • Moreland, R. L., & Levine, J. M. (1992). The composition of small groups. Advances in Group Processes, 9, 237–280.

  • Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–177. https://doi.org/10.1002/j.2333-8504.1992.tb01436.x

  • Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide (8th ed.). Muthén & Muthén.

  • OECD. (2004). Problem solving for tomorrow’s world: First measures of cross-curricular competencies from PISA 2003. OECD Publishing.

  • OECD. (2017a). PISA 2015 assessment and analytical framework: Science, reading, mathematic, financial literacy and collaborative problem solving (revised). OECD Publishing.

  • OECD. (2017b). PISA 2015 technical report. OECD Publishing.

  • Partnership for 21st Century Skills. (2019). Framework for 21st century learning definitions. Retrieved from http://www.battelleforkids.org/networks/p21/frameworks-resources. Accessed 10-08-2021

  • Ramalingam, D., & Adams, R. J. (2015). How can the use of data from computer-delivered assessments improve the measurement of twenty-first century skills? In E. Care, P. Griffin, & M. Wilson (Eds.), Assessment and teaching of 21st century skills: Research and applications (pp. 225–238). Springer.

  • Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Sage.

  • Reckase, M. D. (2009). Multidimensional item response theory. Springer.

  • Rosen, Y., & Rimor, R. (2009). Using collaborative database to enhance students’ knowledge construction. Interdisciplinary Journal of E-Learning and Learning Objects, 5, 187–195.

  • Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. https://doi.org/10.2307/2335942

  • Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464. https://doi.org/10.1214/aos/1176344136

  • Stewart, A. E., Amon, M. J., Duran, N. D., & D’Mello, S. K. (2020). Beyond team makeup: Diversity in teams predicts valued outcomes in computer-mediated collaborations. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13.

  • Sun, C., Shute, V. J., Stewart, A., Yonehiro, J., Duran, N., & D’Mello, S. (2020). Towards a generalized competency model of collaborative problem solving. Computers & Education, 143, 103672. https://doi.org/10.1016/j.compedu.2019.103672

  • Swiecki, Z., Ruis, A. R., Farrell, C., & Shaffer, D. W. (2020). Assessing individual contributions to collaborative problem solving: A network analysis approach. Computers in Human Behavior, 104, 105876. https://doi.org/10.1016/j.chb.2019.01.009

  • Tannenbaum, S. I., Beard, R. L., & Salas, E. (1992). Team building and its influence on team effectiveness: An examination of conceptual and empirical developments. Advances in Psychology, 82, 117–153. https://doi.org/10.1016/S0166-4115(08)62601-1

  • Vista, A., Awwal, N., & Care, E. (2016). Sequential actions as markers of behavioural and cognitive processes: Extracting empirical pathways from data streams of complex tasks. Computers & Education, 92–93, 15–36. https://doi.org/10.1016/j.compedu.2015.10.009

  • Vista, A., Care, E., & Awwal, N. (2017). Visualising and examining sequential actions as behavioural paths that can be interpreted as markers of complex behaviours. Computers in Human Behavior, 76, 656–671. https://doi.org/10.1016/j.chb.2017.01.027

  • Von Davier, A. A. (2017). Computational psychometrics in support of collaborative educational assessments. Journal of Educational Measurement, 54(1), 3–11. https://doi.org/10.1111/jedm.12129

  • Von Davier, A. A., & Halpin, P. F. (2013). Collaborative problem solving and the assessment of cognitive skills: Psychometric considerations (ETS Research Report RR-13-41). https://doi.org/10.1002/j.2333-8504.2013.tb02348.x

  • Wang, W. C., & Wilson, M. (2005). Exploring local item dependence using a random-effects facet model. Applied Psychological Measurement, 29(4), 296–318. https://doi.org/10.1177/0146621605276281

  • Webb, N. M. (1995). Group collaboration in assessment: Multiple objectives, processes and outcomes. Educational Evaluation and Policy Analysis, 17(2), 239–261. https://doi.org/10.2307/1164563

  • Webb, N., Nemer, K. M., Chizhik, A., & Sugrue, B. (1998). Equity issues in collaborative group assessment: Group composition and performance. American Educational Research Journal, 35(4), 607–651. https://doi.org/10.3102/00028312035004607

  • Wilczenski, F. L., Bontrager, T., Ventrone, P., & Correia, M. (2001). Observing collaborative problem-solving processes and outcomes. Psychology in the Schools, 38, 269–281. https://doi.org/10.1002/pits.1017

  • Wilson, M., Gochyyev, P., & Scalise, K. (2017). Modeling data from collaborative assessments: Learning in digital interactive social networks. Journal of Educational Measurement, 54(1), 85–102. https://doi.org/10.1111/jedm.12134

  • Wise, S. L. (2019). An information-based approach to identifying rapid-guessing thresholds. Applied Measurement in Education, 32(4), 325–336. https://doi.org/10.1080/08957347.2019.1660350

  • Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8(2), 125–145. https://doi.org/10.1177/014662168400800201

  • Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187–213. https://doi.org/10.1111/j.1745-3984.1993.tb00423.x

  • Yuan, J. L., Xiao, Y., & Liu, H. Y. (2019). Assessment of collaborative problem solving based on process stream data: A new paradigm for extracting indicators and modeling dyad data. Frontiers in Psychology, 10, 369. https://doi.org/10.3389/fpsyg.2019.00369

  • Yuan, J. L. (2018). A study of measuring collaborative problem solving based on behavioral process performance (Doctoral dissertation). Beijing Normal University, Beijing.

  • Zhang, S., Gao, Q., Sun, M., Cai, Z., Li, H., Tang, Y., et al. (2022). Understanding student teachers’ collaborative problem solving: Insights from an epistemic network analysis (ENA). Computers & Education, 183, 104485. https://doi.org/10.1016/j.compedu.2022.104485

  • Zhang, Z., Wilson, M., Alom, M., Awwal, N., & Griffin, P. (2018, April). Adopting a process perspective on collaborative problem solving. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.

  • Zhu, M., Andrews-Todd, J., & Zhang, M. (2020). Application of network analysis in understanding collaborative problem solving processes and skills. In H. Jiao & R. W. Lissitz (Eds.), Innovative Psychometric Modeling and Methods. Information Age Publisher. https://www.researchgate.net/publication/344887044_Application_of_Network_Analysis_in_Understanding_Collaborative_Problem_Solving_Processes_and_Skills/link/5f96d49492851c14bce7a903/download. Accessed 10-08-2021

  • Zoanetti, N. (2010). Interactive computer based assessment tasks: How problem-solving process data can inform instruction. Australasian Journal of Educational Technology, 26(5), 585–606. https://doi.org/10.14742/ajet.1053

Funding

This work was supported by the key project for the 14th Five-Year Plan of Beijing Education Sciences in 2022 (CDAA22033).

Author information

Corresponding author

Correspondence to Hongyun Liu.

Ethics declarations

Competing interests

The authors have no competing interests that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Three human-to-human interaction tasks (Clown, Plant Growth, and Olive Oil) were employed in this study. The following is a detailed description of the coding scheme for the log data, using the “Olive Oil” task as an example.

1.1 The Olive Oil task

The “Olive Oil” task is a well-defined algorithmic problem that two students complete collaboratively. As Appendix Fig. 4 shows, the left side is student A’s interface and the right side is student B’s. The two students have jars of different volumes: student A’s holds 3L and student B’s holds 5L. The goal is to end with exactly 4L of olive oil in student B’s jar. The transfer pipe can be used to move olive oil from student A to student B, and unused olive oil can be poured into the bucket. Students A and B communicate and collaborate by typing text in the chat windows. The ideal path consists of eight steps:

1. Student A fills the 3L jar with olive oil.
2. Student A transfers the 3L of olive oil to student B.
3. Student A fills the 3L jar again.
4. Student A transfers 2L of olive oil to student B, whose jar is now full, and keeps 1L.
5. Student B empties the 5L jar into the bucket.
6. Student A transfers the remaining 1L of olive oil to student B.
7. Student A fills the 3L jar again.
8. Student A transfers the 3L of olive oil to student B, so the latter has 4L in total.

Appendix Fig. 4 shows the instructions and the problem space on the two screen tabs of the Olive Oil task.

Fig. 4 The screenshots of student A (at the top) and student B (at the bottom) in the Olive Oil task
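
The eight-step path can also be checked mechanically. The following minimal Python sketch (ours, not part of the published task materials) replays the ideal path and verifies the jar volumes at the end.

# A minimal sketch (ours, not from the article): replay the eight-step
# ideal path of the Olive Oil task and check the jar volumes.
CAP_A, CAP_B = 3, 5  # jar capacities for student A and student B

def fill_a(a, b):
    """Student A fills the 3L jar from the oil source."""
    return CAP_A, b

def trans_a_to_b(a, b):
    """Student A pours into B's jar until A is empty or B is full."""
    amount = min(a, CAP_B - b)
    return a - amount, b + amount

def empty_b(a, b):
    """Student B pours the 5L jar out into the bucket."""
    return a, 0

a, b = 0, 0
steps = [fill_a, trans_a_to_b, fill_a, trans_a_to_b,
         empty_b, trans_a_to_b, fill_a, trans_a_to_b]
for i, step in enumerate(steps, 1):
    a, b = step(a, b)
    print(f"step {i}: {step.__name__:13s} A={a}L B={b}L")
assert b == 4  # the task goal: 4L in student B's jar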

1.2 Log data

Appendix Table 13 shows the original log data recorded by the computer while a group of students completed the “Olive Oil” task. It includes five variables: ID is the sequence number of the group’s behaviors and conversations while completing the task; GroupID is the group number; log_type is the type of the student’s current event; log_content is the content of the student’s current event; and role is the student’s mission role. Table 14 gives a detailed introduction to the data types in log_content. A sketch of reading these fields follows Table 13.

Table 13 Sample log data for the Olive Oil task
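
As a reading sketch, the Python below assumes a CSV export whose header names match the five variables; the actual file format and column names are not specified in the article, so they are assumptions.

import csv
from dataclasses import dataclass

@dataclass
class LogEvent:
    id: int           # sequence number of the event within the group
    group_id: str     # group number
    log_type: str     # type of the current event
    log_content: str  # content of the current event
    role: str         # the student's mission role

def read_log(path):
    # Assumed: a CSV export with header names matching the five variables.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            yield LogEvent(int(row["ID"]), row["GroupID"],
                           row["log_type"], row["log_content"], row["role"])

# e.g.: for event in read_log("olive_oil_log.csv"): print(event)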

1.3 Coding

The operation behaviors and conversations in the CPS process are coded using the following methods:

1.3.1 Behavior coding

There are two steps in behavior coding. The first is to delete the meaningless behaviors and keep the meaningful ones. Meaningful behaviors reflect the progress of the CPS mission, for example, filling oil, transferring oil, and pouring oil in the “Olive Oil” task. Meaningless behaviors provide no task-progress information, such as clicking and dragging the mouse, moving the jar, and turning the transfer pipe on and off.

The second step is to code the meaningful behavior data so that it describes students’ meaningful behaviors in the CPS process. To achieve the task goal, the two students in a group must collaboratively reach a clear understanding of the problem-solving strategy and design a logical problem-solving process. To represent the students’ behavioral status in the CPS process, the coding must capture the student’s operation as well as the cumulative status resulting from all previous steps. We therefore use the expression “A/B 3L/5L_fill/to/trans:3L = X;5L = Y” to code students’ operation behaviors and the oil volumes in the jars. “A/B” stands for the student’s role; “3L/5L” is the jar being operated (student A uses the 3L jar and student B the 5L jar); “fill” means adding oil, “to” means pouring oil out, and “trans” means passing oil through the pipe; “3L = X” is the current oil volume in student A’s jar, and “5L = Y” is the current oil volume in student B’s jar. For example, “A 3L_fill:3L = 3;5L = 0” means student A has just added 3L of oil, so his/her jar holds 3L while student B’s jar holds 0L. A parsing sketch follows.
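
To make the notation concrete, here is a small parsing sketch; the exact whitespace and the separator between the two volume fields vary slightly between the article’s examples, so the pattern is deliberately permissive.

import re

# Accept either "," or ";" between the volume fields, and optional
# spaces around "=", since the article's examples vary slightly.
BEHAVIOR = re.compile(
    r"(?P<role>[AB])\s+(?P<jar>3L|5L)_(?P<op>fill|to|trans)"
    r":3L\s*=\s*(?P<vol_a>\d+)\s*[,;]\s*5L\s*=\s*(?P<vol_b>\d+)"
)

def parse_behavior(code):
    m = BEHAVIOR.match(code)
    if m is None:
        return None  # not a behavior code (e.g., a chat event)
    return {"role": m["role"], "jar": m["jar"], "op": m["op"],
            "vol_a": int(m["vol_a"]), "vol_b": int(m["vol_b"])}

print(parse_behavior("A 3L_fill:3L = 3;5L = 0"))
# {'role': 'A', 'jar': '3L', 'op': 'fill', 'vol_a': 3, 'vol_b': 0}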

1.3.2 Language coding

Language coding has two steps. First, language coding indicators are determined based on students’ performance, covering four dimensions: sharing ideas, negotiating ideas, regulating problem solving, and maintaining communication (Liu et al., 2015; Hao et al., 2017). The 33 coding indicators of Hao et al. (2017) were expanded to 38 to distinguish students’ language characteristics in the CPS process (Table 14). Of the five added indicators, four concern sharing ideas: they distinguish the content being shared (resources, mission progress status), proactivity, and the roles of questioning and responding. The fifth concerns maintaining communication, namely students voicing the negative thought of giving up. The second step is manual coding. It started with training all coders with a coding manual. Then 10 sets of data were chosen from each mission, and language content from different time frames was double-coded; inconsistent codes were discussed until a final code was agreed. Lastly, 20% of the data were double-coded and the consistency coefficient was calculated. Across the three missions, the Kappa coefficient reached 0.98 at the CPS skills level and 0.84–0.88 at the student performance (subcategory) level, a sufficient level of consistency (Cohen, 1960). Once the coding consistency reached 0.80, the rest of the data were single-coded. A small computation sketch of Cohen’s kappa follows.
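
For reference, Cohen’s (1960) kappa for two coders assigning one label per utterance can be computed as in this minimal sketch; the label values below are invented for illustration (“C11” follows the rubric’s style).

from collections import Counter

def cohens_kappa(coder1, coder2):
    # Observed agreement minus chance agreement, per Cohen (1960).
    n = len(coder1)
    p_obs = sum(a == b for a, b in zip(coder1, coder2)) / n
    c1, c2 = Counter(coder1), Counter(coder2)
    p_exp = sum(c1[label] * c2[label] for label in set(c1) | set(c2)) / n**2
    return (p_obs - p_exp) / (1 - p_exp)

# Invented double-coded labels for illustration only.
coder1 = ["C11", "C11", "N02", "R05", "M01", "C11"]
coder2 = ["C11", "C12", "N02", "R05", "M01", "C11"]
print(round(cohens_kappa(coder1, coder2), 3))  # 0.778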

Table 14 Coding rubric of language in log data

1.4 Forming structured log data

Through behavior coding and language coding, the structured CPS log data are established. Appendix Table 15 presents an example of the structured log data for this task. “Eventtype” is the type of event, in which “action” stands for behaviors and “chat” stands for language; “Event” holds the structured code. For example, the fourth event of the group is “C11”, which means that student A’s utterance at this moment is of the type “sharing information related to mission resources with a teammate”. The sixth event is “A 3L_fill:3L = 3;5L = 0”, which means that at this moment student A fills the 3L jar, so he/she has 3L of oil while student B has 0L. A small tallying sketch follows Table 15.

Table 15 Sample structured log data for the Olive Oil task
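
As one illustration of how the structured log can be summarized, the sketch below tallies event types and chat codes per group; the events are invented, with field names following the description of Table 15. Such frequencies are the kind of quantity that could feed subsequent scoring.

from collections import Counter

# Invented events; field names follow the description of Table 15.
events = [
    {"eventtype": "chat",   "event": "C11"},
    {"eventtype": "action", "event": "A 3L_fill:3L = 3;5L = 0"},
    {"eventtype": "chat",   "event": "C11"},
    {"eventtype": "action", "event": "A 3L_trans:3L = 0;5L = 3"},
]

by_type = Counter(e["eventtype"] for e in events)
chat_codes = Counter(e["event"] for e in events if e["eventtype"] == "chat")
print(by_type)     # Counter({'chat': 2, 'action': 2})
print(chat_codes)  # Counter({'C11': 2})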

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, M., Liu, H., Cai, M. et al. Estimation of individuals’ collaborative problem solving ability in computer-based assessment. Educ Inf Technol 29, 483–515 (2024). https://doi.org/10.1007/s10639-023-12271-w
