1 Introduction

Knowledge tracing is an essential and classical problem in intelligent education systems. By tracing the knowledge transition process, we can recommend specific educational items to a student based on his or her weak knowledge concepts. Existing methods approach knowledge tracing from both educational psychology and data mining perspectives, such as Item Response Theory (IRT) [7], Bayesian Knowledge Tracing (BKT) [1], the Performance Factors Analysis (PFA) framework [9] and Deep Knowledge Tracing (DKT) [10]. These models have proved effective but still have limitations: they do not systematically consider the impact of different attributes of the exercises themselves on the knowledge tracing problem. Exercise Enhanced Knowledge Tracing (EKT) [5] is the first method to take the exercise text and an attention mechanism into consideration. However, EKT extracts text features by feeding the exercise text directly into a neural network, which fails to capture the hierarchical features of an exercise (Fig. 1).

Fig. 1. The illustration of hierarchical features of exercise

Fig. 2. Exercise hierarchical feature enhanced framework

2 Exercise Hierarchical Feature Enhanced Framework

Framework Overview. The knowledge tracing task can be summarized as follows: in an online educational system, suppose we have M students and E exercises in total. Given any learner's exercise record \( X =\{(q_{1},r_{1}),(q_{2},r_{2}),\ldots ,(q_{m},r_{m})\}\), predict the learner's performance on \(q_{t+1}\). Here \((q_{t},r_{t})\) denotes that the learner practices question \(q_{t}\) and obtains result \(r_{t}\) at step t. The entire structure of the framework is shown in Fig. 2. In order to dig deeper into the information in the exercise text, we first utilize BERT [2] to generate an embedding vector \(v_{b}\). We then feed it into three subsystems to generate the knowledge distribution \(v_{t} \in R^{K}\), the semantic feature \(s_{t}\) and the question difficulty \(d_{t}\) separately. Let \(\varphi (s_{t})\) be the one-hot encoding of the semantic cluster to which the question at step t belongs. Finally, we concatenate \(v_{t}\), \(\varphi (s_{t})\), \(d_{t}\), and \(r_{t}\) into \(x_{t}\) and feed \(x_{t}\) into a sequence model.
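As a minimal sketch of how the per-step input \(x_t\) could be assembled (the tensor names, dimensions, and PyTorch usage below are our own assumptions, not the authors' released code):

```python
import torch

# Assumed sizes: K knowledge concepts, C semantic clusters.
K, C = 50, 912

def build_step_input(v_t, s_cluster_id, d_t, r_t):
    """Concatenate knowledge distribution, semantic-cluster one-hot,
    difficulty, and response into the step input x_t."""
    phi_s = torch.zeros(C)
    phi_s[s_cluster_id] = 1.0                    # one-hot phi(s_t)
    d = torch.tensor([float(d_t)])               # scalar difficulty
    r = torch.tensor([float(r_t)])               # correct (1) / incorrect (0)
    return torch.cat([v_t, phi_s, d, r], dim=0)  # x_t in R^{K + C + 2}

x_t = build_step_input(torch.rand(K), s_cluster_id=17, d_t=0.42, r_t=1)
print(x_t.shape)  # torch.Size([964]) with these assumed sizes
```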

Subsystems Introduction. Two text classification systems, named KDES and DFES, are designed to predict the knowledge distribution and the difficulty of an exercise, respectively. The semantic feature extractor system (SFES) can be treated as an unsupervised clustering problem. The input of all three systems is the BERT encoding of the exercise text. In KDES and DFES, the knowledge concepts labeled by teachers and the correct rate of a question [4] serve as ground truth, and both are predicted with TextCNN [8]. In the KDES system, we use the softmax output of the trained classifier to represent the knowledge distribution of an exercise. In the DFES system, we predict difficulty with a neural network in order to alleviate the cold-start problem. In the SFES system, we cluster the inputs with hierarchical clustering based on the cosine distance between semantic vectors [6].
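A minimal sketch of the SFES step, assuming a Hugging Face BERT checkpoint and scipy's hierarchical clustering; the model name, [CLS] pooling, and distance threshold are illustrative assumptions, not the authors' exact setup:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from scipy.cluster.hierarchy import linkage, fcluster

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Encode exercise texts and use the [CLS] vector as the sentence embedding."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch).last_hidden_state
    return out[:, 0, :].numpy()

texts = ["Solve x^2 - 4 = 0.", "Find the roots of x^2 = 9.", "Compute 3 + 5."]
vectors = embed(texts)

# Agglomerative clustering with average linkage on cosine distance.
Z = linkage(vectors, method="average", metric="cosine")
labels = fcluster(Z, t=0.3, criterion="distance")  # cut at a cosine-distance threshold
print(labels)
```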

$$\begin{aligned} h_{t},c_{t} = \mathrm{LSTM}(x_{t},h_{t-1},c_{t-1};\theta ) \end{aligned}$$
(1)
$$\begin{aligned} y_{t}= \sigma (W_{yh} \cdot h_{t}+b_{y}) \end{aligned}$$
(2)
$$\begin{aligned} loss = -\sum _{t}\big (r_{t+1}\log (y_{t}^{T}\cdot {\varphi (s_{t+1})})+(1-r_{t+1}) \log (1-y_{t}^{T}\cdot {\varphi (s_{t+1})})\big ) \end{aligned}$$
(3)

Modeling Process. In the propagation stage, as shown in Eq. 1, we combine \(x_{t}\) with the learner's previous hidden state \(h_{t-1}\) and use an RNN to obtain the current hidden state \(h_{t}\). Here we use an LSTM as the RNN variant, since it better preserves long-term dependencies in the exercise sequence [3]. Finally, we use \(h_{t}\) to predict \(y_{t}\), which contains the student's estimated mastery of each semantic feature. The dimension of \(y_{t}\) equals the total number of semantic clusters produced by the SFES system. \(\theta , W_{yh}\), and \(b_{y}\) in the equations are model parameters. The goal of training is to minimize the negative log-likelihood of the observed sequence of student response logs (Eq. 3).
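A minimal PyTorch sketch of the sequence model in Eqs. (1)-(3); layer sizes, batching, and the alignment of next-step targets are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class EHFKTSequenceModel(nn.Module):
    """Sketch of the sequence model: LSTM over x_t, then a sigmoid output layer."""
    def __init__(self, input_dim, hidden_dim, num_clusters):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_clusters)   # W_yh, b_y

    def forward(self, x):                  # x: (batch, T, input_dim)
        h, _ = self.lstm(x)                # Eq. (1)
        return torch.sigmoid(self.out(h))  # Eq. (2): y_t for every step

def sequence_loss(y, next_cluster_onehot, next_response):
    """Eq. (3): pick y_t^T . phi(s_{t+1}) and score it against r_{t+1}.
    The caller aligns y at step t with the cluster/response of step t+1."""
    p = (y * next_cluster_onehot).sum(dim=-1).clamp(1e-6, 1 - 1e-6)
    return -(next_response * p.log()
             + (1 - next_response) * (1 - p).log()).sum()
```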

3 Experiment

3.1 Experimental Setting

Since there is no open dataset that provides exercising records together with text information, we derive an experimental dataset containing 132,179 students and 91,449,914 answer records from a large real-world online education system, aixuexi.com.

Table 1. The result of clustering

The baselines of the experiments are as follows: BKT, which is based on Bayesian inference; DKT, which uses recurrent neural networks to model student learning; EKTA, which incorporates exercise text features and an attention mechanism into the recurrent neural network; EHFKT_K/S/D, simplified versions of EHFKT that contain only the KDES/SFES/DFES system, where the input is the concatenation of the problem encoding and the output of the respective system; and EHFKT_T, which contains all subsystems but diagnoses the transition of knowledge mastery, while EHFKT diagnoses the transition of the mastery of semantic features.

3.2 Experimental Results

Hierarchical Clustering Result. The SFES system uses BERT and hierarchical clustering to obtain semantic features of questions. Figure 3 visualizes the clustering results of 11,410 questions; the y-axis corresponds to the classification threshold and the x-axis corresponds to each exercise. Table 1 reports the clustering result when the number of clusters \(\lambda _{s}\) is 912.
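For reference, cutting the dendrogram so that exactly \(\lambda_s\) clusters remain can be done as sketched below; the random vectors are a toy stand-in for the real BERT embeddings of the 11,410 questions:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

vectors = np.random.rand(2000, 768)               # placeholder for exercise embeddings
Z = linkage(vectors, method="average", metric="cosine")

lambda_s = 912                                    # target number of clusters (Table 1)
cluster_ids = fcluster(Z, t=lambda_s, criterion="maxclust")
print(len(set(cluster_ids)))                      # roughly lambda_s clusters
```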

Fig. 3. Hierarchical clustering result

Fig. 4. AUC of EHFKT series

EHFKT Result. In this part, we divide the dataset into a training set with 105,744 learners' logs and a test set with 26,435 learners' logs. Figure 4 shows the evolution of AUC during the training process, and Table 2 shows the overall comparison results on this task. The results indicate that EHFKT outperforms the baseline models, from which we draw several conclusions. First, in the knowledge tracing task, adding hierarchical features yields a better representation of questions. Second, tracing the mastery of semantic clusters predicts students' performance more precisely, because exercises in the same cluster share similar knowledge distribution, difficulty, and semantics. Finally, the result also demonstrates the instability of tracing knowledge mastery alone, since the difficulty of an exercise is then left unmodeled.

Table 2. Evaluation metrics of different deep learning methods
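For completeness, the AUC metric used in Table 2 and Fig. 4 can be computed as follows; the labels and scores shown here are hypothetical placeholders, with real values coming from the held-out responses and the predicted probabilities \(y_{t}^{T}\cdot \varphi (s_{t+1})\):

```python
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 1]                     # actual next responses (toy values)
y_score = [0.81, 0.35, 0.62, 0.74, 0.28, 0.55]  # predicted probabilities (toy values)
print("AUC:", roc_auc_score(y_true, y_score))
```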

4 Conclusions

In this article, we propose a novel knowledge tracing framework that extracts the knowledge distribution, semantic features and difficulty from exercises. Besides, we introduce the diagnosis of the semantic features of questions into knowledge tracing, which leads to more accurate performance prediction. Although the meaning of these semantic clusters is not directly interpretable, in the future we will try to extract the shared meaning of the exercises in the same cluster with text summarization techniques, so as to make the data-driven clustering results more understandable to humans.