Abstract
The nature of education has been transformed by technological advances and online learning platforms, giving educational institutions more options than ever to thrive in a complex and competitive environment. However, institutions still face challenges such as academic underachievement, graduation delays, and student dropouts. By harnessing student data from institutional databases and online platforms, it becomes possible to predict the academic performance of individual students at an early stage. In this study, we applied knowledge graphs (KGs), clustering, and machine learning (ML) techniques to data on students in the College of Information Technology at UAEU. To construct the knowledge graphs and visualize students’ performance at various checkpoints, we employed Neo4j, a high-performance NoSQL graph database. The findings demonstrate that combining clustered knowledge graphs with machine learning reduces prediction errors, improves classification accuracy, and effectively identifies students at risk of course failure. Additionally, the visualization methods facilitate communication and decision-making within educational institutions. The combination of KGs and ML enables course instructors to rank students and provide personalized learning interventions based on individual performance and capabilities, so that remedial actions for at-risk students can be tailored to their unique profiles.
Availability of data and materials
The dataset used in this study, “EduRisk”, is available from the corresponding author upon reasonable request through bi-dac.com/download.
Abbreviations
- Acc: Accuracy
- AdaBoost: Adaptive Boosting
- AI: Artificial Intelligence
- CP: Checkpoints
- DL: Deep Learning
- DS: Dataset Features
- EF: Embedded Features
- GCN: Graph Convolution Neural Network
- HF: Historical Features
- KG: Knowledge Graph
- KGE: Knowledge Graph Embeddings
- LGBM: Light Gradient Boosting Machine
- LR: Linear Regression
- MAE: Mean Absolute Error
- ML: Machine Learning
- MSE: Mean Squared Error
- NoSQL: Not Only SQL
- PCA: Principal Component Analysis
- RF: Random Forest
- SVM: Support Vector Machine
- TG: Total Grade
- TPR: True Positive Rate
- HW: Homework assignment
- LGB: Light Gradient Boosting
Funding
Not applicable.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Albreiki, B., Habuza, T., Palakkal, N. et al. Clustering-based knowledge graphs and entity-relation representation improves the detection of at risk students. Educ Inf Technol 29, 6791–6820 (2024). https://doi.org/10.1007/s10639-023-11938-8
DOI: https://doi.org/10.1007/s10639-023-11938-8