Detection of Desertion Patterns in University Students Using Data Mining Techniques: A Case Study

  • Dayana Vila
  • Saúl Cisneros
  • Pedro Granda
  • Cosme Ortega
  • Miguel Posso-Yépez
  • Iván García-Santillán
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 895)


Abstract

Student desertion is a phenomenon that affects higher education and academic quality standards. Several causes can lead to it, and academic factors are among the potential ones. The main objective of this research is to detect dropout patterns at the “Técnica del Norte” University (Ecuador), based on historical personal and academic data, using predictive classification techniques from data mining. The KDD (Knowledge Discovery in Databases) process was used to determine desertion patterns with two approaches: (i) Bayesian classifiers and (ii) decision trees, both implemented in Weka. The classifiers' performance was quantitatively evaluated using the confusion matrix and quality metrics. The results showed that the technique based on decision trees performed slightly better than the Bayesian approach on the processed data.
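To make the evaluation step concrete, the following is a minimal sketch of how quality metrics can be derived from a binary confusion matrix, treating "dropout" as the positive class. The function name `binary_metrics`, the choice of metrics, and the counts are illustrative assumptions; this summary does not state which metrics or values the authors reported.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive common quality metrics from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts with "dropout" assumed as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts, NOT taken from the paper.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the same dictionary for each classifier's confusion matrix would allow a side-by-side comparison of the two approaches. On imbalanced data, precision, recall, and kappa are usually more informative than accuracy alone.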


Keywords

Student desertion, Pattern discovery, Data mining, KDD, Weka



Acknowledgments

The authors thank the IT department of the “Técnica del Norte” University for granting access to the raw data of the academic database.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Dayana Vila (1)
  • Saúl Cisneros (1)
  • Pedro Granda (1)
  • Cosme Ortega (1)
  • Miguel Posso-Yépez (2)
  • Iván García-Santillán (1)

  1. Department of Software Engineering, Faculty of Applied Sciences, Universidad Técnica del Norte, Ibarra, Ecuador
  2. Faculty of Education, Science and Technology, Universidad Técnica del Norte, Ibarra, Ecuador
