Abstract
Distance learning is now a key component of higher education. Given the high dropout rates and the substantial investment in distance learning, it is of utmost importance to identify the data most relevant to student success and failure. In this article we mine enrollment profiles, educational background and student data from the Open University System and Distance Learning of the National Autonomous University of Mexico to determine the key factors that drive success and failure, and build a predictive model using a Naive Bayes classifier. We find that the number of subjects passed and the average grade in the first semester are among the most informative predictors of student success.
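To make the approach concrete, the following is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing. It is not the authors' implementation: the feature names (binned number of subjects passed and binned average grade in the first semester), the class labels and the toy records are all hypothetical and serve only to illustrate how such predictors could feed the classifier.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # feature_values[feature_index] -> set of values seen for that feature
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values, alpha


def predict_proba(model, row):
    """Return the posterior probability of each class for one record."""
    class_counts, value_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        # log prior plus the sum of smoothed log likelihoods per feature
        score = math.log(count / total)
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + alpha
            denominator = count + alpha * len(feature_values[i])
            score += math.log(numerator / denominator)
        log_scores[label] = score
    # Normalize in a numerically stable way
    max_score = max(log_scores.values())
    exp_scores = {k: math.exp(v - max_score) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


# Hypothetical toy records: (subjects_passed_bin, average_grade_bin)
toy_rows = [
    ("high", "high"), ("high", "high"), ("high", "mid"), ("high", "mid"),
    ("low", "low"), ("low", "mid"), ("low", "low"), ("low", "low"),
]
toy_labels = ["success"] * 4 + ["dropout"] * 4

toy_model = train_naive_bayes(toy_rows, toy_labels)
print(predict_proba(toy_model, ("high", "high")))
```

Each feature contributes independently to the class score, which is the "naive" assumption; in practice the continuous variables would need to be binned or modeled with a suitable distribution before being used this way.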
Notes
1. Each instance refers to a student, with his or her entrance profile, educational background and enrollment data.
2. The graph was generated with the ROC analysis web calculator of the Johns Hopkins University School of Medicine. URL: http://bit.ly/1eWFAnC.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Salinas, J.G.M., Stephens, C.R. (2015). Applying Data Mining Techniques to Identify Success Factors in Students Enrolled in Distance Learning: A Case Study. In: Pichardo Lagunas, O., Herrera Alcántara, O., Arroyo Figueroa, G. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2015. Lecture Notes in Computer Science, vol 9414. Springer, Cham. https://doi.org/10.1007/978-3-319-27101-9_15
DOI: https://doi.org/10.1007/978-3-319-27101-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27100-2
Online ISBN: 978-3-319-27101-9
eBook Packages: Computer Science, Computer Science (R0)