Data-Driven Analyses of Electronic Text Books

  • Ahcène Boubekki
  • Ulf Kröhne
  • Frank Goldhammer
  • Waltraud Schreiber
  • Ulf Brefeld
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9580)


We present data-driven log file analyses of an electronic history text book, the mBook, to support teachers in preparing lessons for their students. We represent user sessions as contextualised Markov processes and propose a probabilistic clustering based on expectation maximisation to detect groups of similar (i) sessions and (ii) users. We compare our approach to standard K-means clustering and report findings that may have a direct impact on preparing and revising lessons.
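To make the clustering idea concrete, the following is a minimal sketch of expectation maximisation for a mixture of plain first-order Markov chains. It is not the contextualised model of the paper. It assumes that each session has already been encoded as a non-empty sequence of integer page-category ids, and the function name `fit_markov_mixture`, its parameters, and the smoothing constant `alpha` are all hypothetical choices made for illustration.

```python
import math
import random


def fit_markov_mixture(sessions, n_states, n_clusters, n_iter=50, alpha=0.1, seed=0):
    """EM for a mixture of first-order Markov chains (illustrative helper).

    sessions: non-empty lists of integer state ids in range(n_states).
    Returns (weights, initials, transitions, responsibilities).
    """
    rng = random.Random(seed)
    # Start from random soft assignments; each row sums to one.
    resp = []
    for _ in sessions:
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        resp.append([value / total for value in row])

    for _ in range(n_iter):
        # M-step: responsibility-weighted counts, smoothed by alpha (assumed prior).
        mass = [sum(r[k] for r in resp) for k in range(n_clusters)]
        norm = len(sessions) + alpha * n_clusters
        weights = [(m + alpha) / norm for m in mass]
        initials = [[alpha] * n_states for _ in range(n_clusters)]
        transitions = [[[alpha] * n_states for _ in range(n_states)]
                       for _ in range(n_clusters)]
        for session, r in zip(sessions, resp):
            for k in range(n_clusters):
                initials[k][session[0]] += r[k]
                for a, b in zip(session, session[1:]):
                    transitions[k][a][b] += r[k]
        for k in range(n_clusters):
            total = sum(initials[k])
            initials[k] = [v / total for v in initials[k]]
            for a in range(n_states):
                total = sum(transitions[k][a])
                transitions[k][a] = [v / total for v in transitions[k][a]]

        # E-step: posterior cluster probabilities, computed in log space.
        resp = []
        for session in sessions:
            log_post = []
            for k in range(n_clusters):
                lp = math.log(weights[k]) + math.log(initials[k][session[0]])
                for a, b in zip(session, session[1:]):
                    lp += math.log(transitions[k][a][b])
                log_post.append(lp)
            top = max(log_post)
            unnorm = [math.exp(lp - top) for lp in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])

    return weights, initials, transitions, resp
```

Each row of the returned responsibilities is a soft assignment of one session to the clusters; taking the index of the largest entry gives a hard assignment comparable to what K-means would produce. Like any EM procedure, the result depends on the random initialisation, so in practice several seeds would be tried.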



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ahcène Boubekki (1)
  • Ulf Kröhne (1)
  • Frank Goldhammer (1)
  • Waltraud Schreiber (2)
  • Ulf Brefeld (1)

  1. Leuphana University of Lüneburg, Lüneburg, Germany
  2. Faculty of History and Social Science, KU Eichstätt, Eichstätt, Germany
