Predicting Student Success: A Naïve Bayesian Application to Community College Data
Abstract
This research develops and implements a continuous Naïve Bayesian classifier for GEAR courses at Rio Salado Community College. A previous implementation of a discrete version predicted less well (70% accuracy) and had deployment issues. The continuous model achieves higher prediction accuracy, over 90%, for both at-risk and successful students, while being easier to interpret and implement. Predictive results across eleven courses, together with cumulative gain charts, show the potential improvements in students' academic success that could be made by focusing on high-risk students. Researchers at other colleges might find this empirical application relevant when implementing early alert systems.
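As a rough illustration of the two techniques named above, the following sketch fits a continuous Naïve Bayesian classifier and computes a cumulative gains curve. It is not the authors' implementation: the abstract does not say which density is used for the continuous features, so Gaussian class-conditional densities are assumed here. The feature names, the toy data, and all function names are hypothetical.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature mean/variance for each class."""
    model = {}
    n = len(y)
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # Small floor keeps the variance strictly positive (an assumption).
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / n, stats)
    return model

def predict_proba(model, x):
    """Posterior probability of each class for one feature vector."""
    log_post = {}
    for label, (prior, stats) in model.items():
        lp = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            # Log of the Gaussian density; features are treated as independent.
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        log_post[label] = lp
    top = max(log_post.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def cumulative_gains(scores, y, positive=1):
    """Share of all positives captured in the top-k scored rows, for k = 1..n."""
    order = sorted(range(len(y)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(1 for t in y if t == positive)
    gains, hits = [], 0
    for i in order:
        hits += y[i] == positive
        gains.append(hits / total_pos)
    return gains

# Hypothetical toy data: [logins per week, share of assignments submitted]; 1 = at risk.
X = [[1, 0.2], [2, 0.3], [1, 0.1], [6, 0.9], [7, 0.8], [5, 1.0]]
y = [1, 1, 1, 0, 0, 0]

model = fit_gaussian_nb(X, y)
scores = [predict_proba(model, x)[1] for x in X]
gains = cumulative_gains(scores, y)
```

Each entry of `gains` is the fraction of at-risk students that would be reached by contacting only the students with the highest predicted risk up to that rank, which is the quantity a cumulative gains chart plots.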
Keywords
At risk students, Cumulative gains, Naïve Bayesian, Predictive model
Copyright information
© Springer Science+Business Media B.V. 2017