Predicting Group Performance Using Process Data in a Collaborative Assessment


Technology-based assessments that involve collaboration among students offer many sources of process data, although it remains unclear which aspects of these data are most meaningful for making inferences about students’ collaborative skills. Recent research has focused mainly on theory-based rubrics for qualitative coding of process data (e.g., text from chat dialogues, click-stream data), but many reliability and validity issues arise in the application of such rubrics. In this research, we take a more data-driven approach to the problem. Data were collected from 122 dyads who interacted over online chat to complete a twelfth-grade mathematics assessment. We focus on features of chat and click-stream data that can be extracted automatically, including the extent to which chat dialogue contained content from assessment materials; chat-based cues of affective tone and mirroring; and temporal synchronization in task-related activities. Using a block-wise linear regression, we show that process features of chat and click-stream data accounted for 30.5% of the variation in group performance, after controlling for group members’ math proficiency and the total number of words in the chat dialogue. The full model explained 61% of the variation in group performance. Implications for the design and scoring of collaborative assessments are discussed.


  • Adams, R., Vista, A., Scoular, C., Awwal, N., Griffin, P., & Care, E. (2015). Automatic coding procedures for collaborative problem solving. In P. Griffin & E. Care (Eds.), Assessment and teaching of 21st century Skills (pp. 115–132). Dordrecht: Springer.

  • Attali, Y., & Burstein, J. (2004). Automated essay scoring with e-rater® V.2.0. ETS research report series, 2004(2), i–21.

  • Bergner, Y. (2018). CPSX: A tool for online collaborative problem-solving in open edX (Research memorandum no. RM-18-03). Technical report. Princeton, NJ: Educational Testing Service.

  • Bollen, K. A., & Jackman, R. W. (1985). Regression diagnostics: An expository treatment of outliers and influential cases. Sociological Methods & Research, 13(4), 510–542.

  • Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical report. Citeseer.

  • Carlson, J. E., & von Davier, M. (2013). Item response theory (ETS research report series no. RR-13-28). Princeton, NJ: Educational Testing Service.

  • Chen, J., Wang, M., Kirschner, P. A., & Tsai, C.-C. (2018). The role of collaboration, computer use, learning environments, and supporting strategies in CSCL: A meta-analysis. Review of Educational Research.

  • Cleveland, W. S. (1993). Visualizing data. Murray Hill, NJ: AT&T Bell Laboratories; Summit, NJ: Hobart Press.

  • Crossley, S., Liu, R., & McNamara, D. (2017). Predicting math performance using natural language processing tools. In Proceedings of the seventh international learning analytics & knowledge conference (pp. 339–347).

  • Davis, J. H. (1973). Group decision and social interaction: A theory of social decision schemes. Psychological Review, 80(3), 97–125.

  • Fiore, S. M., Graesser, A., Greiff, S., Griffin, P., Gong, B., Kyllonen, P., et al. (2017). Collaborative problem solving: Considerations for the national assessment of educational progress. Alexandria: National Center for Education Statistics.

  • Griffin, P., & Care, E. (2015). Assessment and teaching of 21st century skills: Methods and approach. New York, NY: Springer.

  • Griffin, P., McGaw, B., & Care, E. (2012). Assessment and teaching of 21st century skills. New York: Springer.

  • Halpin, P. F., & Bergner, Y. (2018). Psychometric models of small group collaborations. Psychometrika.

  • Hao, J., Chen, L., Flor, M., Liu, L., & von Davier, A. A. (2017). CPS-rater: Automated sequential annotation for conversations in collaborative problem-solving activities. ETS research report series, 2017(1), 1–9.

  • Hlavac, M. (2013). stargazer: LaTeX code and ASCII text for well-formatted regression and summary statistics tables. Accessed 11 Nov 2018.

  • Ilgen, D. R., Hollenbeck, J. R., Johnson, M., & Jundt, D. (2005). Teams in organizations: From input-process-output models to IMOI models. Annual Review of Psychology, 56(1), 517–543.

  • Jenkins, J. R., Fuchs, L. S., Van Den Broek, P., Espin, C., & Deno, S. L. (2003). Sources of individual differences in reading comprehension and reading fluency. Journal of Educational Psychology, 95(4), 719.

  • Johnson, D. W., & Johnson, R. T. (2009). An educational psychology success story: Social interdependence theory and cooperative learning. Educational Researcher, 38(5), 365–379.

  • Kerr, N. L., & Tindale, R. S. (2004). Group performance and decision making. Annual Review of Psychology, 55(1), 623–655.

  • Kortemeyer, G. (2006). An analysis of asynchronous online homework discussions in introductory physics courses. American Journal of Physics, 74(6), 526.

  • Kozlowski, S. W. J. (2015). Advancing research on team process dynamics. Organizational Psychology Review, 5(4), 270–299.

  • Kozlowski, S. W. J., & Ilgen, D. R. (2006). Enhancing the effectiveness of work groups and teams. Psychological Science in the Public Interest, Supplement, 73, 77–124.

  • Larson, J. R. (2010). In search of synergy in small group performance. New York, NY: Taylor & Francis Group.

  • Lee, Y.-H., & Jia, Y. (2014). Using response time to investigate students’ test-taking behaviors in a NAEP computer-based study. Large-Scale Assessments in Education, 2(1), 8.

  • Liu, L., Hao, J., von Davier, A. A., Kyllonen, P., & Zapata-Rivera, D. (2015). A tough nut to crack: Measuring collaborative problem solving. In Y. Rosen, S. Ferrara, & M. Mosharraf (Eds.), Handbook of research on computational tools for real-world skill development (pp. 344–359). Hershey, PA: IGI-Global.

  • Marks, M., Mathieu, J. E., & Zaccaro, S. J. (2001). A conceptual framework and taxonomy of team processes. Academy of Management Journal, 26(3), 356–376.

  • Mathieu, J. E., Tannenbaum, S. I., Donsbach, J. S., & Alliger, G. M. (2014). A review and integration of team composition models: Moving toward a dynamic and temporal framework. Journal of Management.

  • OECD. (2017). PISA 2015 results (volume V): Collaborative problem solving. Paris: PISA, OECD Publishing.

  • Ogan, A., Finkelstein, S., Walker, E., Carlson, R., & Cassell, J. (2012). Rudeness and rapport: Insults and learning gains in peer tutoring. In Lecture notes in computer science (Vol. 7315, pp. 11–21).

  • Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1969). The measurement of meaning. Chicago: Aldine Publishing Company.

  • Pentland, A. (2010). To signal is human: Real-time data mining unmasks the power of imitation, kith and charisma in our face-to-face social networks. American Scientist, 98(3), 204–211.

  • R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

  • Thurstone, L. L. (1937). Ability, motivation, and speed. Psychometrika, 2(4), 249–254.

  • van der Linden, W. J. (2009). Conceptual issues in response-time modeling. Journal of Educational Measurement, 46(3), 247–272.

  • Webb, N. M. (1995). Group collaboration in assessment: Multiple objectives, processes, and outcomes. Educational Evaluation and Policy Analysis, 17(2), 239–261.

  • Wise, A. F., & Cui, Y. (2018). Unpacking the relationship between discussion forum participation and learning in MOOCs. In Proceedings of the 8th international conference on learning analytics and knowledge—LAK ’18 (pp. 330–339).

Author information

Corresponding author

Correspondence to Kaushik Mohan.

Appendix 1

This appendix derives the group performance measure \(\varDelta\) presented in Sect. 2.2.1 from the SC-IRT model developed by Halpin and Bergner (2018). The SC-IRT model describes the probability of both members of a dyad providing a correct response to a collaboratively solved item, denoted \(R_i\), for \(i = 1, \dots , I\). The group probability is modeled as a function of the probability of each partner solving the item when working individually, \(P_{ij}\), for \(j = 1, 2\). The model additionally includes a decision parameter, w, that describes how the individual probabilities combine to produce the group result. The model can be written:

$$\begin{aligned} R_i&= w(P_{i1} + P_{i2} - 2 P_{i1}\,P_{i2}) + P_{i1}\,P_{i2}. \end{aligned}$$
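As a rough illustration of how the decision parameter behaves, the following Python sketch evaluates the right-hand side of the model for a single item. The function name `group_prob` and the example probabilities are assumptions made for this illustration; they do not come from Halpin and Bergner (2018). Algebraically, w = 0 reduces the expression to \(P_{i1}\,P_{i2}\), w = 1 to \(P_{i1} + P_{i2} - P_{i1}\,P_{i2}\), and w = 0.5 to the average of the two individual probabilities.

```python
def group_prob(w, p1, p2):
    """Right-hand side of the model for one item.

    w  : decision parameter
    p1 : probability that partner 1 solves the item individually
    p2 : probability that partner 2 solves the item individually
    (Function and argument names are hypothetical, chosen for this sketch.)
    """
    return w * (p1 + p2 - 2.0 * p1 * p2) + p1 * p2


# Illustrative values only; not taken from the study data.
p1, p2 = 0.6, 0.5
r_product = group_prob(0.0, p1, p2)   # equals p1 * p2
r_average = group_prob(0.5, p1, p2)   # equals (p1 + p2) / 2
r_union = group_prob(1.0, p1, p2)     # equals p1 + p2 - p1 * p2
```

For any fixed pair of individual probabilities, the expression is linear in w, which is what makes it possible to solve for w in the derivation that follows.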

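To obtain the scoring rule used in the present analysis, we first sum both sides of the equation over the I items to obtain the model-implied total scores, \({\widehat{Y}} = \sum _i R_i\), \(\widehat{X}_j = \sum _i P_{ij}\), and \(\widehat{X}_{12} = \sum _i P_{i1}\,P_{i2}\), and then rearrange to solve for w: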

$$\begin{aligned} \sum _i R_i&= \sum _i \left( w (P_{i1} + P_{i2} - 2 P_{i1} \, P_{i2}) + P_{i1}\,P_{i2} \right) \\ {\widehat{Y}}&= w( \widehat{X}_1 + \widehat{X}_2 - 2 \widehat{X}_{12}) + \widehat{X}_{12} \\ \Rightarrow \quad w&= \frac{{\widehat{Y}} - \widehat{X}_{12}}{\widehat{X}_1 + \widehat{X}_2 - 2 \widehat{X}_{12}} \end{aligned}$$

Eq. 1 of the paper is obtained by replacing the model-implied total scores with the observed total scores on the mean-equated individual and collaborative test forms. Finally, we divide the numerator and denominator by I to convert the total scores to proportions correct; this leaves the value of the ratio unchanged. For further details on the theoretical motivation and properties of the SC-IRT model, please see Halpin and Bergner (2018).
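The rearrangement can be checked numerically. The Python sketch below builds model-implied totals from made-up item probabilities and a known w, then recovers w from the ratio derived above, once with totals and once with proportions. All names (`decision_weight`, `p1`, `p2`, `w_true`) and all numbers are hypothetical, and the sketch takes the cross term as an input, because this appendix does not spell out how its observed counterpart is computed.

```python
def decision_weight(y, x1, x2, x12):
    """Solve y = w * (x1 + x2 - 2 * x12) + x12 for w.

    Inputs may be total scores or proportions correct, as long as all
    four are on the same scale. Names are hypothetical.
    """
    denom = x1 + x2 - 2.0 * x12
    if denom == 0:
        raise ValueError("denominator is zero; w is not identified")
    return (y - x12) / denom


# Made-up individual item probabilities for a four-item form.
p1 = [0.9, 0.6, 0.4, 0.7]
p2 = [0.8, 0.5, 0.3, 0.2]
w_true = 0.35

# Model-implied item probabilities and totals (sums over items).
r = [w_true * (a + b - 2.0 * a * b) + a * b for a, b in zip(p1, p2)]
y_hat = sum(r)
x1_hat = sum(p1)
x2_hat = sum(p2)
x12_hat = sum(a * b for a, b in zip(p1, p2))

# Recover w from totals, then from proportions (totals divided by I).
w_from_totals = decision_weight(y_hat, x1_hat, x2_hat, x12_hat)
n_items = len(p1)
w_from_props = decision_weight(
    y_hat / n_items, x1_hat / n_items, x2_hat / n_items, x12_hat / n_items
)
```

Both calls return the same value up to floating-point error, because dividing the numerator and denominator by the number of items cancels in the ratio.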

Appendix 2

See Table 5.

Table 5 Descriptive statistics of study variables

Cite this article

Mohan, K., Bergner, Y. & Halpin, P. Predicting Group Performance Using Process Data in a Collaborative Assessment. Tech Know Learn 25, 367–388 (2020).

  • Process data
  • Group performance
  • Collaborative assessments
  • Item response theory