What is wrong with deep knowledge tracing? Attention-based knowledge tracing


Abstract

Tracking student knowledge states scientifically and effectively is a fundamental task in personalized education. Many neural network-based models, such as deep knowledge tracing (DKT), have achieved remarkable results on this task: DKT requires no handcrafted knowledge and can capture complex representations of student knowledge. However, DKT suffers from a severe problem: its output fluctuates wildly. In this paper, we use a finite state automaton (FSA), a mathematical model of computation whose state evolution in response to external input is observable, to interpret the waviness of DKT. With the support of the FSA, we discover that DKT cannot handle long input sequences, which leads to unstable predictions. Accordingly, we introduce two novel attention-based models that solve this problem by directly capturing the relationships among the items of the input sequence. Extensive experiments on five well-known datasets show that our two proposed models achieve state-of-the-art performance compared with existing knowledge tracing approaches.
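To make the two technical ideas in the abstract concrete, here are two short sketches. Neither reproduces the paper's actual constructions; every name, hyperparameter, and encoding choice below is an assumption made for illustration.

First, why an FSA is a natural interpretive tool: its entire state trajectory is observable, one state per input symbol, so unstable state behavior can be read off step by step. A toy two-state "mastery" FSA:

```python
# Toy deterministic FSA (hypothetical example, not the paper's construction).
# Its full state trajectory is observable, which is what makes an FSA a
# useful lens for inspecting how a model's state reacts to each input.
def run_fsa(transitions, start, inputs):
    """Return the complete state trajectory of a deterministic FSA."""
    state, trajectory = start, [start]
    for symbol in inputs:
        state = transitions[(state, symbol)]
        trajectory.append(state)
    return trajectory

# States: "U" = unmastered, "M" = mastered; inputs: 1 = correct, 0 = wrong.
transitions = {("U", 1): "M", ("U", 0): "U", ("M", 1): "M", ("M", 0): "U"}
print(run_fsa(transitions, "U", [1, 1, 0, 1]))  # ['U', 'M', 'M', 'U', 'M']
```

Second, the attention mechanism the proposed models rely on. Unlike DKT, which compresses the whole interaction history into a single recurrent hidden state that must survive arbitrarily many updates, a self-attention layer scores every pair of positions directly, so each prediction can reach any past interaction in one step. A minimal single-head PyTorch sketch (again, illustrative only, not the authors' architecture):

```python
# Minimal self-attention knowledge tracing sketch (not the authors' code;
# the class name, dimensions, and input encoding are assumptions).
import torch
import torch.nn as nn


class AttentionKT(nn.Module):
    def __init__(self, num_skills: int, d_model: int = 64):
        super().__init__()
        self.num_skills = num_skills
        # One embedding per (skill, correctness) pair, as in DKT's encoding.
        self.interaction_emb = nn.Embedding(2 * num_skills, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.out = nn.Linear(d_model, num_skills)

    def forward(self, skills: torch.Tensor, correct: torch.Tensor) -> torch.Tensor:
        # skills, correct: (batch, seq_len) integer tensors.
        x = self.interaction_emb(skills + self.num_skills * correct)
        seq_len = skills.size(1)
        # Causal mask: position t attends only to interactions at steps <= t,
        # so future answers never leak into the prediction.
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=skills.device),
            diagonal=1,
        )
        h, _ = self.attn(x, x, x, attn_mask=causal)
        # Per-step probability of answering each skill correctly next.
        return torch.sigmoid(self.out(h))


model = AttentionKT(num_skills=100)
skills = torch.randint(0, 100, (8, 50))   # 8 students, 50 interactions each
correct = torch.randint(0, 2, (8, 50))    # 1 = answered correctly
probs = model(skills, correct)            # shape: (8, 50, 100)
```

Because every query-key pair is scored directly, the path between any two time steps has length one, which is consistent with the abstract's claim that attention avoids the long-sequence instability of recurrent DKT.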

Acknowledgments

This work was supported by the Guangdong Province Educational Science Planning Project (Research on the Promotion of Higher Vocational SPOC Blended Teaching Based on Learner Profile, No. 2019GXJK272), the National Natural Science Foundation of China under Grant No. 62077015, and the Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Jinhua, Zhejiang, China.

Author information

Corresponding authors

Correspondence to Zetao Zheng or Jia Zhu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Zheng, Z., Zhu, J. et al. What is wrong with deep knowledge tracing? Attention-based knowledge tracing. Appl Intell 53, 2850–2861 (2023). https://doi.org/10.1007/s10489-022-03621-1
