Abstract
Student dropout is considered an important indicator of social mobility and of the social contribution that universities make. In economic terms, there is evidence that students attribute their decision to withdraw from their academic programs to their economic situation. Dropout creates significant wage gaps between people who complete their tertiary studies and those who do not, and it leads to a shortage of the skilled human capital that contributes higher productivity to a country's economic development. Accordingly, the objective of this study is to present a decision tree-based classification (CBAD) with optimized parameters to predict student dropout at Colombian universities. The study analyses 10,486 student cases from three private universities with similar characteristics. Applying this technique with optimized parameters achieved a precision of 88.14%.
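The abstract does not give the algorithm's details, the input features, or the parameters that were tuned, so the following is only a minimal sketch of the general idea: a small decision tree classifier whose depth is chosen by a search over candidate values, scored by precision on held-out data. The data are synthetic, and the feature names (`gpa`, `financial_stress`), the Gini splitting criterion, and the depth grid are all assumptions made for illustration, not the study's setup.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, max_depth):
    """Grow a tree; a leaf is the majority class, a node is (f, t, left, right)."""
    majority = int(sum(y) * 2 >= len(y))
    split = best_split(X, y) if max_depth > 0 else None
    if split is None:
        return majority
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], max_depth - 1),
            build_tree([X[i] for i in ri], [y[i] for i in ri], max_depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return its class (1 = dropout)."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def precision(y_true, y_pred):
    """TP / (TP + FP); defined as 0.0 when nothing is predicted positive."""
    tp = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(y_true, y_pred) if a == 0 and b == 1)
    return tp / (tp + fp) if tp + fp else 0.0

# Synthetic students (hypothetical features, NOT the study's data):
# dropout is more likely when financial stress is high and GPA is low.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    gpa = round(rng.uniform(2.0, 5.0), 1)
    financial_stress = rng.randint(0, 10)
    p_drop = 0.8 if (financial_stress >= 6 and gpa < 3.5) else 0.1
    X.append([gpa, financial_stress])
    y.append(int(rng.random() < p_drop))

X_train, y_train = X[:300], y[:300]
X_val, y_val = X[300:], y[300:]

# "Optimized parameters" here means a plain search over max_depth.
depths = (1, 2, 3, 4, 5)
results = []
for d in depths:
    tree = build_tree(X_train, y_train, d)
    preds = [predict(tree, row) for row in X_val]
    results.append((d, precision(y_val, preds)))

best_depth, best_precision = max(results, key=lambda r: r[1])
print(f"best max_depth={best_depth}, validation precision={best_precision:.2%}")
```

Precision counts only how many predicted dropouts were real dropouts, so a fuller evaluation would usually report recall alongside it. A library implementation with cross-validation would be the practical choice on real data.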
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Silva, J. et al. (2020). Prediction of Academic Dropout in University Students Using Data Mining: Engineering Case. In: Gunjan, V., Senatore, S., Kumar, A., Gao, XZ., Merugu, S. (eds) Advances in Cybernetics, Cognition, and Machine Learning for Communication Technologies. Lecture Notes in Electrical Engineering, vol 643. Springer, Singapore. https://doi.org/10.1007/978-981-15-3125-5_49
DOI: https://doi.org/10.1007/978-981-15-3125-5_49
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3124-8
Online ISBN: 978-981-15-3125-5
eBook Packages: Intelligent Technologies and Robotics (R0)