Abstract
Two common questions in CSCL research about analyses of temporal data, such as event sequences, are: (i) What variables are related to event attributes? and (ii) What is the process (or what are the processes) that generated the events? The first question is best answered with statistical methods, the second with stochastic or deterministic process modeling methods. This chapter provides an overview of statistical and stochastic methods of direct relevance to CSCL research. Many of the statistical analyses are integrated into statistical discourse analysis. From the stochastic modeling repertoire, the basic hidden Markov model and its recent extensions are introduced, ending with dynamic Bayesian models as the current best integration. Looking into the near future, we identify opportunities for a closer alignment of qualitative with quantitative methods for temporal analysis, afforded by developments such as the automation of quantitative methods and advances in computational modeling.
Further Readings
Abrahamson, D., Blikstein, P., & Wilensky, U. (2007). Classroom model, model classroom: Computer-supported methodology for investigating collaborative-learning pedagogy. In C. Chinn, G. Erkens, & S. Puntambekar (Eds.), Proceedings of the eighth International Conference on Computer Supported Collaborative Learning (CSCL) (Vol. 8, Part 1, pp. 49–58). International Society of the Learning Sciences. A powerful demonstration of how (deterministic) computational modeling can interact with empirical (classroom) research. Using the agent-based modeling tool NetLogo, the authors analyze the mechanisms that lead to the emergence of stratified learning zones in a prototypical collaborative classroom activity. It is also important because it highlights the tension between solving problems collaboratively and learning from collaboration.
Bergner, Y., Walker, E., & Ogan, A. (2017). Dynamic Bayesian Network models for peer tutoring interactions. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 249–268). Springer. This chapter provides a nice illustration of the use of modern HMM approaches to analyzing (peer) tutorial dialogue. While an important area of collaborative learning, research on tutor–tutee dialogue is only partially reflected in the CSCL literature, with this chapter providing a welcome connection between CSCL, AI in Education, and assessment research. It includes an application in the context of an empirical study.
Chiu, M. M. (2008). Flowing toward correct contributions during groups’ mathematics problem solving: A statistical discourse analysis. Journal of the Learning Sciences, 17(3), 415–463. This empirical study applied statistical discourse analysis to test whether (a) groups that created more correct, new ideas (micro-creativity) were more likely to solve a problem and (b) students’ recent actions (microtime context of evaluations, questions, justifications, politeness, and status differences) increased subsequent micro-creativity.
Chiu, M. M., & Lehmann-Willenbrock, N. (2016). Statistical discourse analysis: Modeling sequences of individual behaviors during group interactions across time. Group Dynamics: Theory, Research, and Practice, 20(3), 242–258. This article showcases statistical discourse analysis, a method that integrates most of the above methods (parallel chats, trees, group/individual differences, pivotal events, time periods, multiple target events, indirect effects, later group outcomes) and addresses related issues (e.g., missing data, inter-rater reliability, and false positives).
Reimann, P. (2009). Time is precious: Variable- and event-centred approaches to process analysis in CSCL research. International Journal of Computer-Supported Collaborative Learning, 4, 239–257. This methodological paper provides an overview of qualitative, quantitative, and computational methods for analyzing temporal data in CSCL. It argues that there is a rather fundamental difference between explaining collaboration over time in terms of variables and explaining it in terms of events. Implications for doing temporal analysis are discussed.
NAPLeS Video
Chiu, M. M. (2018). How to statistically model processes? Statistical discourse analysis. Network of Academic Programs in the Learning Sciences (NAPLeS) webinar. http://isls-naples.psy.lmu.de/intro/all-webinars/chiu/index.html
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Chiu, M.M., Reimann, P. (2021). Statistical and Stochastic Analysis of Sequence Data. In: Cress, U., Rosé, C., Wise, A.F., Oshima, J. (eds) International Handbook of Computer-Supported Collaborative Learning. Computer-Supported Collaborative Learning Series, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-65291-3_29
DOI: https://doi.org/10.1007/978-3-030-65291-3_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65290-6
Online ISBN: 978-3-030-65291-3
eBook Packages: Education (R0)