Using Academic Analytics to Predict Dropout Risk in E-Learning Courses

Part of the Annals of Information Systems book series (AOIS, volume 18)


Information technology is reshaping higher education globally, and analytics can help provide insights into complex issues in higher education, such as student recruitment, enrollment, retention, student learning, and graduation. Student retention, in particular, is a major issue in higher education, since it has an impact on students, institutions, and society. With the rapid growth in online enrollment, coupled with a higher dropout rate, more students are at risk of dropping out of online courses. Early identification of students who are at risk of dropping out is imperative for preventing student dropout. This study develops a model to predict real-time dropout risk for each student while an online course is being taught. The model combines variables from the Student Information System (SIS) and the Course Management System (CMS). The SIS data consists of ten independent variables, which provide a baseline risk score for each student at the beginning of the course. The CMS data consists of seven independent variables, which provide a dynamic risk score as the course progresses. Furthermore, the study evaluates various data mining techniques for their predictive accuracy and performance in building the predictive model and risk scores. Based on the predictive model, the study presents a recommender system framework that generates alerts and recommendations for students, instructors, and staff to facilitate early and effective intervention. The study results show that the boosted C5.0 decision tree model achieves 90.97% overall predictive accuracy in predicting student dropout in online courses.
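
The two-stage scoring and alerting flow described above can be pictured with a minimal sketch. It assumes each student already has a baseline risk score (from SIS data) and a dynamic risk score (from CMS data), both scaled to the range 0 to 1. The weighted-average blend, the cut-off values, and every name in the code are hypothetical choices made for illustration; the abstract does not say how the study combines the scores or sets alert levels.

```python
from dataclasses import dataclass

# All names, weights, and thresholds below are illustrative assumptions,
# not variables or values taken from the study.


@dataclass
class Student:
    student_id: str
    baseline_risk: float  # SIS-based score, fixed at the start of the course (0..1)
    dynamic_risk: float   # CMS-based score, refreshed as the course progresses (0..1)


def combined_risk(student: Student, cms_weight: float = 0.5) -> float:
    """Blend the two scores; a simple weighted average is assumed here."""
    return (1.0 - cms_weight) * student.baseline_risk + cms_weight * student.dynamic_risk


def alert_level(risk: float, medium: float = 0.4, high: float = 0.7) -> str:
    """Map a risk score to an alert tier using hypothetical cut-offs."""
    if risk >= high:
        return "high"
    if risk >= medium:
        return "medium"
    return "low"


def weekly_alerts(students: list[Student]) -> dict[str, str]:
    """Return the current alert tier for each student."""
    return {s.student_id: alert_level(combined_risk(s)) for s in students}


roster = [
    Student("s1", baseline_risk=0.2, dynamic_risk=0.1),
    Student("s2", baseline_risk=0.5, dynamic_risk=0.95),
    Student("s3", baseline_risk=0.3, dynamic_risk=0.6),
]
alerts = weekly_alerts(roster)
```

In a setup like this, the baseline score would stay fixed for the whole course, while the dynamic score and the resulting alerts would be recomputed each time new CMS activity data arrives.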


Retention, Dropout, Course Management System, Student Information System, Data mining, Predictive model, Prescriptive analytics, Recommender system



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Information & Computing Science, University of Wisconsin – Green Bay, Green Bay, USA
  2. Penn State Behrend, Sam and Irene Black School of Business, Pennsylvania State University, Erie, USA
  3. College of Business and Information Systems, Dakota State University, Madison, USA
