Hybrid Data Science Approaches to Predict the Academic Performance of Students

  • Conference paper
Advances in Communication and Applications (ERCICA 2023)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1105)

Abstract

Understanding, modeling, and predicting student performance in higher education is challenging because it requires accurate and robust diagnostic models. Numerous studies have developed intelligent classifiers to anticipate student achievement, but they have overlooked the key factors that lead to that performance. Identifying these factors is essential: it lets program leaders recognize the strengths and weaknesses of their academic programs and take the corrective interventions needed to improve student achievement. To this end, this paper contributes a hybrid factor-analysis approach that combines several data science techniques. Student performance is predicted by combining baseline models, cross-validation, and factor analysis. We empirically evaluate the full approach on four datasets. The experimental results show considerable improvements over single baseline models and demonstrate that the proposed approach is practical for pinpointing multiple factors that affect student performance. The results indicate that the hybrid algorithm combining cross-validation and factor analysis predicts students' academic performance considerably more accurately than the single baseline models. The model may be extended to other programs to predict student performance.
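To make the cross-validation part of the abstract concrete, the following is a minimal sketch of k-fold cross-validation used to compare two baseline classifiers. It is not the authors' implementation: the synthetic student records, the two features (attendance and prior grade), the majority-class and nearest-centroid baselines, and every function name are assumptions chosen for illustration, and the factor-analysis step of the hybrid approach is left out entirely.

```python
import random
from collections import Counter


def make_synthetic_students(n=200, seed=0):
    """Hypothetical data: [attendance, prior_grade] -> pass (1) or fail (0)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Passing students are drawn around a higher mean; purely illustrative.
        center = 0.7 if label == 1 else 0.4
        features = [rng.gauss(center, 0.12), rng.gauss(center, 0.12)]
        rows.append((features, label))
    return rows


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_fit(train):
    """Baseline 1: always predict the most frequent training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def centroid_fit(train):
    """Baseline 2: predict the class whose feature mean is closest."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def cross_val_accuracy(data, fit, k=5, seed=0):
    """Mean held-out accuracy of the model returned by fit() over k folds."""
    scores = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        test = [data[i] for i in held_out]
        predict = fit(train)
        correct = sum(predict(x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


data = make_synthetic_students()
baseline_acc = cross_val_accuracy(data, majority_fit)
centroid_acc = cross_val_accuracy(data, centroid_fit)
print(f"majority baseline: {baseline_acc:.3f}")
print(f"nearest centroid:  {centroid_acc:.3f}")
```

Each model is scored only on records it did not see during fitting, so the two accuracies are directly comparable. A hybrid pipeline of the kind the abstract describes would presumably add a factor-analysis step fitted on the training folds before the classifier, which this sketch does not attempt.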

Author information

Corresponding author

Correspondence to Saleem Malik.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Malik, S., Malik, S. (2024). Hybrid Data Science Approaches to Predict the Academic Performance of Students. In: Shetty, N.R., Prasad, N.H., Nagaraj, H.C. (eds) Advances in Communication and Applications. ERCICA 2023. Lecture Notes in Electrical Engineering, vol 1105. Springer, Singapore. https://doi.org/10.1007/978-981-99-7633-1_39

  • DOI: https://doi.org/10.1007/978-981-99-7633-1_39

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7632-4

  • Online ISBN: 978-981-99-7633-1

  • eBook Packages: Engineering, Engineering (R0)
