What is wrong with deep knowledge tracing? Attention-based knowledge tracing


Abstract

Tracking student knowledge states scientifically and effectively is a fundamental task in personalized education. Many neural network-based models, such as deep knowledge tracing (DKT), have achieved remarkable results on this task: DKT requires no handcrafted knowledge and can capture complex representations of student knowledge. However, DKT suffers from a severe problem: its output fluctuates wildly. In this paper, we use a finite state automaton (FSA), a mathematical model of computation whose state evolution in response to external input is observable, to interpret the waviness of DKT. With the support of the FSA, we discover that DKT cannot handle long input sequences, which leads to unstable predictions. Accordingly, we introduce two novel attention-based models that solve this problem by directly capturing the relationships among the items of the input sequence. Extensive experiments on five well-known datasets show that our two proposed models achieve state-of-the-art performance compared with existing knowledge tracing approaches.
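To make the two technical ideas in the abstract concrete, here are two short sketches. Neither reproduces the paper's actual constructions; every name, hyperparameter, and encoding choice below is an assumption made for illustration.

First, why an FSA is a natural interpretive tool: its entire state trajectory is observable, one state per input symbol, so unstable state behavior can be read off step by step. A toy two-state "mastery" FSA:

```python
# Toy deterministic FSA (hypothetical example, not the paper's construction).
# Its full state trajectory is observable, which is what makes an FSA a
# useful lens for inspecting how a model's state reacts to each input.
def run_fsa(transitions, start, inputs):
    """Return the complete state trajectory of a deterministic FSA."""
    state, trajectory = start, [start]
    for symbol in inputs:
        state = transitions[(state, symbol)]
        trajectory.append(state)
    return trajectory

# States: "U" = unmastered, "M" = mastered; inputs: 1 = correct, 0 = wrong.
transitions = {("U", 1): "M", ("U", 0): "U", ("M", 1): "M", ("M", 0): "U"}
print(run_fsa(transitions, "U", [1, 1, 0, 1]))  # ['U', 'M', 'M', 'U', 'M']
```

Second, the attention mechanism the proposed models rely on. Unlike DKT, which compresses the whole interaction history into a single recurrent hidden state that must survive arbitrarily many updates, a self-attention layer scores every pair of positions directly, so each prediction can reach any past interaction in one step. A minimal single-head PyTorch sketch (again, illustrative only, not the authors' architecture):

```python
# Minimal self-attention knowledge tracing sketch (not the authors' code;
# the class name, dimensions, and input encoding are assumptions).
import torch
import torch.nn as nn


class AttentionKT(nn.Module):
    def __init__(self, num_skills: int, d_model: int = 64):
        super().__init__()
        self.num_skills = num_skills
        # One embedding per (skill, correctness) pair, as in DKT's encoding.
        self.interaction_emb = nn.Embedding(2 * num_skills, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.out = nn.Linear(d_model, num_skills)

    def forward(self, skills: torch.Tensor, correct: torch.Tensor) -> torch.Tensor:
        # skills, correct: (batch, seq_len) integer tensors.
        x = self.interaction_emb(skills + self.num_skills * correct)
        seq_len = skills.size(1)
        # Causal mask: position t attends only to interactions at steps <= t,
        # so future answers never leak into the prediction.
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=skills.device),
            diagonal=1,
        )
        h, _ = self.attn(x, x, x, attn_mask=causal)
        # Per-step probability of answering each skill correctly next.
        return torch.sigmoid(self.out(h))


model = AttentionKT(num_skills=100)
skills = torch.randint(0, 100, (8, 50))   # 8 students, 50 interactions each
correct = torch.randint(0, 2, (8, 50))    # 1 = answered correctly
probs = model(skills, correct)            # shape: (8, 50, 100)
```

Because every query-key pair is scored directly, the path between any two time steps has length one, which is consistent with the abstract's claim that attention avoids the long-sequence instability of recurrent DKT.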

Acknowledgments

This work was supported by the Guangdong Province Educational Science Planning Project (Research on the Promotion of Higher Vocational SPOC Blended Teaching Based on Learner Profile, No. 2019GXJK272), the National Natural Science Foundation of China under Grant No. 62077015, and the Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Jinhua, Zhejiang, China.

Author information

Corresponding authors

Correspondence to Zetao Zheng or Jia Zhu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Zheng, Z., Zhu, J. et al. What is wrong with deep knowledge tracing? Attention-based knowledge tracing. Appl Intell 53, 2850–2861 (2023). https://doi.org/10.1007/s10489-022-03621-1
