Predictive Models for Undergraduate Student Retention Using Machine Learning Algorithms

Conference paper


This paper presents results on undergraduate student retention, using machine learning and wavelet decomposition algorithms to classify student data. We also made improvements to classification algorithms supported by the Weka software toolkit, such as decision trees, Support Vector Machines (SVM), and neural networks. The experiments revealed that the main factors influencing student retention in Historically Black Colleges and Universities (HBCU) are the cumulative grade point average (GPA) and the total credit hours (TCH) taken. The target functions derived from the bare minimum decision tree and SVM algorithms were further revised to create a two-layer neural network and a regression model to predict retention. These new models improved the classification accuracy. Furthermore, applying wavelet decomposition yielded better results.
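The abstract does not state which wavelet or which input sequence was used. As a minimal sketch only, assuming a single-level Haar transform applied to a hypothetical sequence of per-semester GPA values, a wavelet decomposition splits a sequence into coarse averages (approximation) and local differences (detail), which could then serve as classifier features. The function names and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar wavelet transform.

    Takes an even-length sequence and returns (approximation, detail),
    two lists that are each half as long as the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Scaled pairwise sums capture the coarse trend.
    approx = [(signal[i] + signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    # Scaled pairwise differences capture local change.
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original sequence from one level of Haar coefficients."""
    scale = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / scale)
        out.append((a - d) / scale)
    return out


# Hypothetical per-semester GPA values for one student (illustrative only).
gpa_by_semester = [3.2, 3.0, 2.6, 2.8]
approx, detail = haar_step(gpa_by_semester)
reconstructed = haar_inverse(approx, detail)
```

Because this form of the transform is orthonormal, `haar_inverse` recovers the input up to floating-point error, so no information is lost when the coefficients replace the raw values as features.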


Decision tree, Machine learning, Neural network, Signal processing, Student retention, Support vector machines



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Bowie State University, Bowie, USA
  2. Department of Computer Science, Bowie State University, Bowie, USA
