The College Completion Puzzle: A Hidden Markov Model Approach
Higher education in America is characterized by widespread access to college but low rates of completion, especially among undergraduates at less selective institutions. We analyze longitudinal transcript data to examine processes leading to graduation, using Hidden Markov modeling. We identify several latent states that are associated with patterns of course taking, and show that a trained Hidden Markov model can predict graduation or nongraduation based on only a few semesters of transcript data. We compare this approach to more conventional methods and conclude that certain college-specific processes, associated with graduation, should be analyzed in addition to socio-economic factors. The results from the Hidden Markov trajectories indicate that both graduating and nongraduating students take the more difficult mathematical and technical courses at an equal rate. However, undergraduates who complete their bachelor’s degree within 6 years are more likely to alternate between semesters with a heavy course load and less course-intense semesters. The course-taking patterns found among college students also indicate that nongraduates withdraw from coursework more often than average, yet when graduates withdraw, they tend to do so in exactly those semesters of the college career in which more difficult courses are taken. These findings, as well as the sequence methodology itself, emphasize the importance of careful course selection and counseling early in a student’s college career.
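To make the prediction step concrete, the following is a minimal sketch of one common way to classify a short sequence with Hidden Markov models: score the sequence under two models with the forward algorithm and pick the model with the higher likelihood. It is not necessarily the procedure used in the study. The three observation symbols, the two-state models, and every probability below are hypothetical values chosen for illustration; in practice the parameters would be estimated from transcript data.

```python
import math

# Hypothetical observation coding for one semester (an assumption, not from the study):
LIGHT, HEAVY, WITHDRAW = 0, 1, 2


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the start distribution and the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight
            # each state by how well it explains the current observation.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Illustrative (made-up) model in which states tend to alternate.
GRADUATE_MODEL = {
    "start": [0.5, 0.5],
    "trans": [[0.3, 0.7], [0.7, 0.3]],
    "emit": [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2]],
}

# Illustrative (made-up) model with persistent states and more withdrawals.
NONGRADUATE_MODEL = {
    "start": [0.5, 0.5],
    "trans": [[0.8, 0.2], [0.2, 0.8]],
    "emit": [[0.6, 0.1, 0.3], [0.1, 0.5, 0.4]],
}


def classify(obs):
    """Label a semester sequence by the model that assigns it higher likelihood."""
    ll_grad = forward_log_likelihood(obs, **GRADUATE_MODEL)
    ll_non = forward_log_likelihood(obs, **NONGRADUATE_MODEL)
    return "graduate" if ll_grad > ll_non else "nongraduate"


if __name__ == "__main__":
    print(classify([HEAVY, LIGHT, HEAVY, LIGHT]))
    print(classify([WITHDRAW, WITHDRAW, WITHDRAW, WITHDRAW]))
```

With these made-up parameters, a sequence that alternates between heavy and light semesters scores higher under the first model, while a sequence of repeated withdrawals scores higher under the second. Comparing likelihoods in log space keeps the computation stable for longer sequences.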
Keywords: College completion, Course-taking, Academic momentum, Quantitative methodology, Longitudinal analysis
We thank the National Science Foundation (Grant DRL 1243785) and the Bill & Melinda Gates Foundation (Grant OPP 1012951) for their support for this study. We also thank Andrew Rosenberg (Queens College, CUNY) for his extensive technical support and his feedback on programming Hidden Markov models.