Selection into Online Community College Courses and Their Effects on Persistence


Online courses at the college level are growing in popularity, and nearly all community colleges offer online courses (Allen and Seaman 2015). What is the effect of the expanded availability of online curricula on persistence in the field and toward a degree? We use a model of self-selection to estimate the effect of taking an online course, using region and time variation in Internet service as a source of identifying variation. Our method, as opposed to standard experimental methods, allows us to consider the effect among students who actually choose to take such courses. For the average person, taking an online course has a negative effect on the probability of taking another course in the same field and on the probability of earning a degree. The negative effect on graduation for students who choose to take an online course is stronger than the negative effect for the average student. Community colleges must balance these results against the attractive features of online courses, and institutions may want to consider actively targeting online courses toward those most likely to do well in them.
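
The contrast between the effect for the average person and the effect for students who choose online courses can be written as two estimands. The following is only a sketch in the notation of the footnotes, where \(Y_{iO}\) and \(Y_{iF}\) are student \(i\)'s outcomes under online and face-to-face instruction and \(V_{iO} - V_{iF}\) is read here as the net value of choosing the online version. The indicator \(D_{i}\) is introduced for illustration and is not notation taken from the paper.

\[
D_{i} = \mathbf{1}\left[ V_{iO} - V_{iF} > 0 \right], \qquad
ATE = E\left[ Y_{iO} - Y_{iF} \right], \qquad
ATT = E\left[ Y_{iO} - Y_{iF} \mid D_{i} = 1 \right]
\]

The two quantities differ whenever the gain \(Y_{iO} - Y_{iF}\) is correlated with the choice \(D_{i}\), which is why an estimate for the average student need not describe the students who actually enroll online.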




  1.

  2. We replicated these results using distance as an instrumental variable. The exact estimates do not match: although the data sets are the same, we did not limit our sample to students who intend to transfer to a 4-year college and earn a bachelor’s degree. In our sample, we found marginally larger effects on in-course grades and marginally smaller effects on course completion.

  3. Coates et al. (2004) also include knowing someone who took an online class and the use of supplemental Internet-provided material as excluded variables. The literature on these variables is thinner, but it is possible that they, in particular the use of supplementary materials, indicate a dedication to studies that could relate directly to course performance.

  4. The argument presented can easily be shown to hold if linearity is relaxed.

  5. This is a simplification, especially given that travel costs are likely to differ between online and face-to-face courses. However, the implications of the model as used in this paper are the same if this assumption is not made. Additionally, a potential correlation between \(\alpha_{i3}\) and \(Y_{iO} - Y_{iF}\) offers another way of explaining why students who learn most effectively in online courses may not be the students who choose them.

  6. We additionally attempted to estimate the effect of online courses on grades in a follow-up course, but this required severely limiting the sample (to those who took a valid treatment course and also a valid follow-up course in the same department), such that the excluded variable was no longer significant in the first stage.

  7. Specifically, we use the estimator described in Equation 6.6 of Wooldridge (2010). We replace the time dummies in that specification with quintile dummies for class size, since in our context differences in class size should explain differences in variance across classes.

  8. In the language of the model from the previous section, \(Z_{i}\) enters into \(V_{iO} - V_{iF}\) but not \(Y_{iO} - Y_{iF}\).

  9. The choice of \(m = n^{0.9}\) satisfies the properties that \(m \to \infty\) and \(m/n \to 0\) as \(n \to \infty\), which are necessary to ensure that the \(m\)-out-of-\(n\) parameter distribution is non-degenerate. This choice of \(m\) does not use data to adjust subsample size for non-smoothness in the underlying distribution, as in adaptive \(m\)-setting procedures like those proposed in Bickel and Sakov (2008) or Chakraborty et al. (2013). However, since the parameters of interest are means based on regression predictions, they are likely to have smooth underlying distributions, so a simple relationship between \(m\) and \(n\) is used to avoid the computational difficulties of the above adaptive rules.

  10. FCC data are reported at the census tract level and do not distinguish between 1, 2, and 3 providers. ZIP codes were connected to census tracts using a ZIP code/census tract crosswalk offered by the U.S. Department of Housing and Urban Development (2014). When a ZIP code lies in two census tracts, the proportion of the population in each tract is used to construct a weighted average. To allow for these averages, tracts with 1, 2, or 3 providers are assumed to have 2 providers. Results are robust to the use of 1 or 3 providers instead.

  11. This variable uses the latitude and longitude of the centroid of each ZIP code. Distances between points are calculated using the VINCENTY package in Stata (Nichols 2007), which accounts for the ellipsoidal shape of the Earth. Results are nearly identical if we instead use driving time in minutes, as calculated using Google Maps between 1:00 p.m. and 2:00 p.m. on Wednesday, June 17, 2015.

  12. Only online or face-to-face courses were allowed. Hybrid courses and courses using non-online forms of distance learning were dropped.

  13. The online and face-to-face versions are said to be the “same course” if they have the same course title and number and are in the same department in the same quarter.

  14. Results are robust to the sample being limited only to those who are not missing a high school GPA.

  15. These degrees are mainly associate’s degrees. Only 0.3% of these were bachelor’s degrees.

  16. We provide an alternate analysis using distance to campus as an excluded variable. This analysis facilitates a more direct comparison to other studies. The similarity of the results obtained with distance to campus and with the number of Internet providers supports the use of either.

  17. The survey asks students about their planned length of attendance. The possible responses are “One quarter,” “Two quarters,” “One year,” “Up to two years, no degree planned,” “Long enough to complete a degree,” and “I don’t know.” We used a response of “Long enough to complete a degree” as the dependent variable.

  18. To see this, begin with Eq. (11). Set \(ATT = 0\) and replace the sample probability of success after taking an online course with the “true” probability omitting those who would never take a face-to-face course, which is \(\left( {1 - S} \right)\) times the sample probability plus \(pS\). Solve this equation with the original Eq. (11) to get the above formula.

  19. We use the term “department type” rather than “department” because the formal names of these departments vary among different colleges.
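
The subsample rule in footnote 9 can be illustrated with a minimal sketch, which is not the estimation code used in the paper. It resamples \(m = \lfloor n^{0.9} \rfloor\) observations with replacement and takes the mean of each resample. The function names, the made-up input values, and the rescaling by \(\sqrt{m/n}\) (which assumes a root-\(n\)-consistent statistic) are all assumptions made for illustration.

```python
import math
import random
import statistics


def subsample_size(n, exponent=0.9):
    """Return m = floor(n ** exponent), the resample size for n observations."""
    return max(1, math.floor(n ** exponent))


def m_out_of_n_means(values, n_reps=500, exponent=0.9, seed=0):
    """Means of n_reps resamples of size m, drawn with replacement."""
    rng = random.Random(seed)
    m = subsample_size(len(values), exponent)
    return [statistics.fmean(rng.choices(values, k=m)) for _ in range(n_reps)]


def bootstrap_se(values, n_reps=500, exponent=0.9, seed=0):
    """Standard error of the full-sample mean implied by the m-out-of-n draws.

    Assumes a root-n-consistent statistic, so the spread of the size-m
    draws is shrunk by sqrt(m / n) to match the full sample size.
    """
    n = len(values)
    m = subsample_size(n, exponent)
    draws = m_out_of_n_means(values, n_reps, exponent, seed)
    return statistics.stdev(draws) * math.sqrt(m / n)


# Hypothetical regression predictions for 100 students (made-up numbers).
predictions = [i / 100 for i in range(100)]
se = bootstrap_se(predictions)
```

Because \(0 < 0.9 < 1\), the resample size grows with \(n\) while \(m/n\) shrinks toward zero, which is the pair of properties the footnote requires.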
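
The ZIP-level provider measure in footnote 10 can be sketched in the same spirit. The code below treats the undifferentiated "1, 2, or 3 providers" band as 2 providers and averages tract counts using population shares from a crosswalk. The `"1-3"` encoding, the function names, and the identifiers `Z1`, `T1`, and so on are hypothetical and do not reflect the actual FCC or HUD file layouts.

```python
def providers_for_tract(reported, low_band_value=2.0):
    """Convert a tract's reported provider entry to a number.

    `reported` is either a numeric count or the string "1-3", a
    hypothetical code for the band that the data do not break apart.
    """
    if reported == "1-3":
        return low_band_value
    return float(reported)


def zip_level_providers(crosswalk_rows, tract_providers, low_band_value=2.0):
    """Population-weighted average provider count for each ZIP code.

    crosswalk_rows: iterable of (zip_code, tract, population_share).
    tract_providers: dict mapping tract -> reported provider entry.
    """
    totals, weights = {}, {}
    for zip_code, tract, share in crosswalk_rows:
        count = providers_for_tract(tract_providers[tract], low_band_value)
        totals[zip_code] = totals.get(zip_code, 0.0) + share * count
        weights[zip_code] = weights.get(zip_code, 0.0) + share
    # Divide by the summed shares in case they do not add up to exactly 1.
    return {z: totals[z] / weights[z] for z in totals if weights[z] > 0}


# Made-up example: ZIP "Z1" spans two tracts, ZIP "Z2" sits in one.
crosswalk = [("Z1", "T1", 0.75), ("Z1", "T2", 0.25), ("Z2", "T3", 1.0)]
reported = {"T1": "1-3", "T2": 4, "T3": 6}
zip_providers = zip_level_providers(crosswalk, reported)
```

Passing `low_band_value=1.0` or `low_band_value=3.0` reruns the construction under the alternative assumptions that the footnote mentions as robustness checks.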


  1. Allcott, H. (2015). Site selection bias in program evaluation. Quarterly Journal of Economics, 130(3), 1117–1165.


  2. Allen, I. E., & Seaman, J. (2013). Changing course: Ten years of tracking online education in the United States. Babson Park, MA: Babson Survey Research Group.


  3. Allen, I. E., & Seaman, J. (2015). Tracking online education in the United States. Babson Park, MA: Babson Survey Research Group.


  4. Allison, P. D. (1999). Comparing logit and probit coefficients across groups. Sociological Methods and Research, 28(2), 186–208.


  5. Alstadsæter, A. (2011). Measuring the consumption value of higher education. CESifo Economic Studies, 57(3), 458.


  6. Bickel, P. J., & Sakov, A. (2008). On the choice of M in the M out of N bootstrap and confidence bounds for extrema. Statistica Sinica, 18, 967–985.


  7. Bowen, W. G. (2015). Higher education in the digital age. Princeton, NJ: Princeton University Press.


  8. Chakraborty, B., Laber, E. B., & Zhao, Y. (2013). Inference for optimal dynamic treatment regimes using an adaptive M-out-of-N bootstrap scheme. Biometrics, 69(3), 714–723.


  9. Coates, D., Humphreys, B. R., Kane, J., & Vachris, M. A. (2004). ‘No significant distance’ between face-to-face and online instruction: evidence from principles of economics. Economics of Education Review, 23(5), 533–546.


  10. Deming, D. J., & Katz, L. F. (2015). Can online learning bend the higher education cost curve? NBER Working Paper No. 20890.

  11. Federal Communications Commission. (2014). Local telephone competition and broadband deployment. Accessed 06/28/14.

  12. Figlio, D., Rush, M., & Yin, L. (2013). Is it live or is it internet? Experimental estimates of the effects of online instruction on student learning. Journal of Labor Economics, 31(4), 763–784. doi:10.1086/669930.


  13. Goodwin, M. A. L. (2011). The open course library: Using open educational resources to improve community college access. PhD Dissertation, Washington State University, Pullman, WA.

  14. Gratton-Lavoie, C., & Stanley, D. (2009). Teaching and learning principles of microeconomics online: An empirical assessment. Journal of Economic Education, 40(1), 3–25. doi:10.3200/JECE.40.1.003-025.


  15. Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica, 47(1), 153–161.


  16. Heckman, J. J., & Vytlacil, E. J. (2007). Econometric evaluation of social programs, part I: Causal models, structural models and econometric policy evaluation. Handbook of Econometrics, 6(Part B), 4779–4874. doi:10.1016/S1573-4412(07)06070-9.


  17. Heckman, J. J., Layne-Farrar, A., & Todd, P. (1996). Human capital pricing equations with an application to estimating the effect of schooling quality on earnings. The Review of Economics and Statistics, 78(4), 562–610.


  18. Huntington-Klein, N. (2015). Consumption value and the demand for college education. Working paper.

  19. Joyce, T. J., Crockett, S., Jaeger, D. A., Altindag, O., O’Connell, S. D., & Remler, D. K. (2015a). Do students know best? Choice, classroom time, and academic performance. NBER Working Paper No. 21656.

  20. Joyce, T. J., Jaeger, D. A., Crockett, S., Altindag, O., & O’Connell, S. D. (2015b). Does classroom time matter? Economics of Education Review, 46, 64–77.


  21. Krieg, J. M., & Henson, S. E. (2015). The educational impact of online learning: How do university students perform in subsequent courses? Working paper.

  22. Lee, L.-F. (1978). Unionism and wage rates: A simultaneous equations model with qualitative and limited dependent variables. International Economic Review, 19(2), 415–433.


  23. Means, B., Toyama, Y., Murphy, R., Bakia, M., & Jones, K. (2009). Evaluation of evidence-based practices in online learning: A meta-analysis and review of online learning studies. Washington, DC: U.S. Department of Education.


  24. National Center for Education Statistics. (2015). The condition of education 2015. Washington, DC.

  25. Nichols, A. (2007). VINCENTY: Stata module to calculate distances on the earth’s surface. IDEAS.

  26. Roblyer, M. D. (1999). Is choice important in distance learning? A study of student motives for taking internet-based courses at the high school and community college level. Journal of Research on Computing in Education, 32(1), 157–171.


  27. Rouse, C. E. (1995). Democratization or diversion? The effect of community colleges on educational attainment. Journal of Business and Economic Statistics, 13(2), 217–224.


  28. Russell, T. (2015). No significant difference. Retrieved from

  29. U.S. Department of Housing and Urban Development. (2014). HUD USPS ZIP Code Crosswalk Files. Accessed 06/28/14.

  30. Vigdor, J. L., Ladd, H. F., & Martinez, E. (2014). Scaling the digital divide: Home computer technology and student achievement. Economic Inquiry, 52(3), 1103–1119.


  31. Washington State Board for Community and Technical Colleges. (2008). Strategic Technology Plan for Washington State Community and Technical Colleges. Olympia, WA: Washington State Board for Community and Technical Colleges.


  32. Willis, R. J., & Rosen, S. (1979). Education and self-selection. Journal of Political Economy, 87(5), S7–S36.


  33. Wiswall, M., & Zafar, B. (2015). Determinants of college major choice: Identification using an information experiment. Review of Economic Studies, 82(2), 791–824.


  34. Wooldridge, J. M. (2010). Correlated random effects models with unbalanced panels. Working paper.

  35. Xu, D., & Jaggars, S. S. (2011). Online and hybrid course enrollment and performance in Washington State community and technical colleges. CCRC working paper no. 31. Community College Research Center, Columbia University.

  36. Xu, D., & Jaggars, S. S. (2013). The impact of online learning on students’ course outcomes: Evidence from a large community and technical college system. Economics of Education Review, 37, 149–162.




We thank the State of Washington’s Education Research and Data Center for access to data. This paper is part of the Postsecondary Education and Labor Market Program at the Center for the Analysis of Longitudinal Data in Education Research (CALDER) at AIR. This research was supported by the CALDER postsecondary initiative, funded through grants provided by the Bill and Melinda Gates Foundation and an anonymous foundation to the American Institutes for Research.

Author information



Corresponding author

Correspondence to Nick Huntington-Klein.

Appendix: Main Model Results, Using Distance as an Excluded Variable


See Tables 6 and 7.

Table 6 Predictors of online course-taking, retention, and graduation, with distance as excluded variable
Table 7 Average effects of online courses on retention and graduation, with distance as excluded variable


About this article


Cite this article

Huntington-Klein, N., Cowan, J. & Goldhaber, D. Selection into Online Community College Courses and Their Effects on Persistence. Res High Educ 58, 244–269 (2017).



  • Community college
  • Online education
  • Distance learning
  • Quasi experiment