Predicting Chronic Absenteeism Using Educational Data Mining Methods

  • Şebnem Özdemir
  • Fatma Çınar
  • C. Coşkun Küçüközmen
  • Kutlu Merih
Conference paper
Part of the Springer Proceedings in Complexity book series (SPCOM)


The rate of chronic absenteeism is an important indicator for assessing the validity of current educational practices and conditions. Every student who exhibits this behavior risks failing to progress to a higher level of education, dropping out of school, or both. Students in this risk group represent not only an educational problem but also a potential, multifaceted problem with respect to participation in the economy, the development of a skilled labor force, and integration into society. In the literature on Turkey, this problem has been framed using statistical methods, and it merits analysis in greater depth. The main objective of this study is therefore to employ educational data mining methods to predict cases of chronic absenteeism at the high school level. The data, compiled from 2,495 students in different districts of Istanbul, were prepared for data mining operations following the CRISP-EDM steps. The analysis was conducted in the R language with R packages because of their flexibility and strength. The results showed that the random forest algorithm produced the more successful predictive model, while the C4.5 algorithm described the problem more accurately in terms of decision rules.
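
To give a sense of how C4.5 arrives at readable decision rules, the following minimal sketch computes the gain ratio, the criterion C4.5 uses to choose the attribute for each split. It is an illustration only: the study itself was carried out in R, and the attribute names (`prior_absence`, `works_part_time`, `chronic`) and the eight toy records are hypothetical, not taken from the study's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """C4.5 splitting criterion: information gain divided by split information."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Expected entropy of the target after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (not the study's data).
students = [
    {"prior_absence": "high", "works_part_time": "yes", "chronic": "yes"},
    {"prior_absence": "high", "works_part_time": "no", "chronic": "yes"},
    {"prior_absence": "high", "works_part_time": "yes", "chronic": "yes"},
    {"prior_absence": "high", "works_part_time": "no", "chronic": "yes"},
    {"prior_absence": "low", "works_part_time": "yes", "chronic": "no"},
    {"prior_absence": "low", "works_part_time": "no", "chronic": "no"},
    {"prior_absence": "low", "works_part_time": "yes", "chronic": "no"},
    {"prior_absence": "low", "works_part_time": "no", "chronic": "no"},
]

# Score each candidate attribute and pick the one with the highest gain ratio.
scores = {a: gain_ratio(students, a, "chronic")
          for a in ("prior_absence", "works_part_time")}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy set the attribute with the highest gain ratio becomes the root of the tree, and each of its values yields a rule of the form "if prior_absence is high, then chronic". A random forest, by contrast, averages many such trees grown on resampled data, which tends to improve prediction but does not yield a single rule set.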


Educational data mining; CRISP-EDM (cross-industry standard process for educational data mining); Chronic absenteeism; Machine learning



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Şebnem Özdemir (1)
  • Fatma Çınar (2)
  • C. Coşkun Küçüközmen (3)
  • Kutlu Merih (4)
  1. Department of Management Information Systems, Beykent University, Istanbul, Turkey
  2. Capital Markets Board of Turkey, Istanbul, Turkey
  3. Izmir University of Economics, Faculty of Business Izmir, Istanbul, Turkey
  4. Department of Quantitative Methods, Istanbul University (retired) + DATALAB (Society of Data Science), Istanbul, Turkey
