Abstract
Understanding, modeling, and predicting student performance in higher education is challenging because it requires diagnostic models that are both accurate and robust. Numerous studies have developed intelligent classifiers to anticipate student achievement, but they have largely overlooked the key factors that lead to that achievement. Identifying these factors is essential: it lets program leaders recognize the strengths and weaknesses of their academic programs and take corrective interventions to improve student achievement. To this end, this paper contributes a hybrid approach to factor analysis that combines several data science techniques. Student performance is predicted by combining baseline models, cross-validation, and factor analysis. We evaluate the approach empirically on four datasets. The experimental results show considerable improvements over single baseline models and demonstrate that the approach is practical for pinpointing multiple factors that affect student performance. The results indicate that the hybrid algorithm combining cross-validation and factor analysis predicts students' academic performance considerably more accurately than the baselines. The model may be extended to other programs to predict student performance.
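The combination of a baseline model, cross-validation, and factor identification described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the data is synthetic, the factor names (`attendance`, `study`, `noise`) are hypothetical, a nearest-centroid classifier stands in for the baseline models, and ranking factors by their correlation with the outcome is a simplified stand-in for the paper's factor analysis. Only the Python standard library is used.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def rank_factors(X, y):
    """Column indices sorted by |correlation with the outcome|, strongest first."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def k_fold(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y, cols):
    """Stand-in baseline model: per-class mean of the chosen columns."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [mean([r[j] for r in rows]) for j in cols]
    return cents

def predict(cents, x, cols):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(label):
        return sum((x[j] - m) ** 2 for j, m in zip(cols, cents[label]))
    return min(cents, key=dist)

def cv_accuracy(X, y, k=5, top_m=None):
    """Mean held-out accuracy over k folds.

    If top_m is set, factors are ranked on the training part of each fold
    and only the top_m strongest are used, so the test fold never
    influences which factors are kept.
    """
    accs = []
    for fold in k_fold(len(X), k):
        held = set(fold)
        Xtr = [x for i, x in enumerate(X) if i not in held]
        ytr = [t for i, t in enumerate(y) if i not in held]
        cols = rank_factors(Xtr, ytr)[:top_m] if top_m else list(range(len(X[0])))
        cents = fit_centroids(Xtr, ytr, cols)
        correct = sum(predict(cents, X[i], cols) == y[i] for i in fold)
        accs.append(correct / len(fold))
    return mean(accs)

# Synthetic students (assumption): two informative factors and one pure-noise factor.
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    attendance, study, noise = rng.uniform(0, 1), rng.uniform(0, 1), rng.uniform(0, 1)
    X.append([attendance, study, noise])
    y.append(1 if attendance + study > 1.0 else 0)

print("factor ranking (column indices):", rank_factors(X, y))
print("baseline, all factors:", cv_accuracy(X, y, k=5))
print("hybrid, top 2 factors:", cv_accuracy(X, y, k=5, top_m=2))
```

Ranking the factors inside each training fold, rather than once on the full dataset, keeps the held-out accuracy an honest estimate. The printed ranking is the part a program leader would inspect to see which factors are most strongly associated with the outcome.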
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Malik, S., Malik, S. (2024). Hybrid Data Science Approaches to Predict the Academic Performance of Students. In: Shetty, N.R., Prasad, N.H., Nagaraj, H.C. (eds) Advances in Communication and Applications. ERCICA 2023. Lecture Notes in Electrical Engineering, vol 1105. Springer, Singapore. https://doi.org/10.1007/978-981-99-7633-1_39
DOI: https://doi.org/10.1007/978-981-99-7633-1_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7632-4
Online ISBN: 978-981-99-7633-1
eBook Packages: Engineering, Engineering (R0)