
Enhanced Harris Hawks optimization as a feature selection for the prediction of student performance

  • Regular Paper
  • Published in: Computing

Abstract

Predicting student performance for educational organizations such as universities, community colleges, schools, and training centers can enhance the overall results of these organizations. Big data can be extracted from the internal systems of these organizations, such as exam records, statistics about virtual courses, and e-learning systems; finding meaningful knowledge in the extracted data, however, is a challenging task. In this paper, we propose a modified version of the Harris Hawks Optimization (HHO) algorithm that controls population diversity to overcome the early convergence problem and prevent trapping in a local optimum. The proposed approach is employed as a feature selection algorithm to discover the most valuable features for the student performance prediction problem. A dynamic controller manages population diversity by observing the performance of HHO, using the k-nearest neighbors (kNN) algorithm as a clustering approach. Once all solutions belong to a single cluster, an injection process redistributes the solutions over the search space. A set of machine learning classifiers, including kNN, a layered recurrent neural network (LRNN), Naïve Bayes, and an artificial neural network, is used to evaluate the overall prediction system. A real dataset obtained from the UCI Machine Learning Repository is adopted in this paper. The obtained results show the importance of predicting students' performance at an early stage to avoid student failure and improve the overall performance of the educational organization. Moreover, the reported results show that combining the enhanced HHO with the LRNN outperforms the other classifiers, with an accuracy of 92%, since the LRNN is a deep learning algorithm able to learn from both previous and current input values.
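The diversity controller and injection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `diversity_collapsed` and `inject`, the Hamming-distance radius used to decide that the population has collapsed into one cluster, and the worst-solutions-replaced policy are all assumptions standing in for the paper's kNN-based clustering procedure.

```python
import random

def diversity_collapsed(population, radius=0.1):
    """Hypothetical stand-in for the kNN clustering check: treat the
    population as a single cluster when every binary solution lies within
    `radius` (fraction of differing bits) of the first solution."""
    best = population[0]
    n = len(best)
    for sol in population[1:]:
        hamming = sum(a != b for a, b in zip(best, sol)) / n
        if hamming > radius:
            return False
    return True

def inject(population, n_new, n_features, rng):
    """Injection process: redistribute solutions over the search space by
    replacing the last `n_new` solutions with random feature subsets."""
    for i in range(len(population) - n_new, len(population)):
        population[i] = [rng.randint(0, 1) for _ in range(n_features)]
    return population

rng = random.Random(42)
n_features = 10
# A collapsed population: every hawk encodes the same feature subset.
pop = [[1, 0] * 5 for _ in range(6)]
assert diversity_collapsed(pop)

# The controller detects the collapse and triggers an injection,
# after which the population is spread over the search space again.
pop = inject(pop, n_new=3, n_features=n_features, rng=rng)
```

In a full binary-HHO feature selection loop, this check would run once per generation, with each solution's fitness taken from a classifier (e.g. kNN accuracy on the selected features).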



Acknowledgements

The authors would like to acknowledge Taif University Researchers Supporting Project Number (TURSP-2020/125), Taif University, Taif, Saudi Arabia.

Author information

Correspondence to Hamza Turabieh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Turabieh, H., Azwari, S.A., Rokaya, M. et al. Enhanced Harris Hawks optimization as a feature selection for the prediction of student performance. Computing 103, 1417–1438 (2021). https://doi.org/10.1007/s00607-020-00894-7
