Research in Higher Education, Volume 43, Issue 3, pp 259–293

The Use and Interpretation of Logistic Regression in Higher Education Journals: 1988–1999

  • Chao-Ying Joanne Peng
  • Tak-Shing Harry So
  • Frances K. Stage
  • Edward P. St. John


This article examines the use and interpretation of logistic regression in three leading higher education research journals from 1988 to 1999. The journals were selected because of their emphasis on research, relevance to higher education issues, broad coverage of research topics, and reputable editorial policies. The term “logistic regression” encompasses logit modeling, probit modeling, and tobit modeling and the significance tests of their estimates. A total of 52 articles were identified as using logistic regression. Our review uncovered an increasingly sophisticated use of logistic regression for a wide range of topics. At the same time, there continues to be confusion over terminology. The sample sizes used did not always achieve a desired level of stability in the parameters estimated. Discussion of results in terms of delta-Ps and marginal probabilities was not always cautionary, according to definitions. The review is concluded with recommendations for journal editors and researchers in formulating appropriate editorial policies and practice for applying the versatile logistic regression technique and in communicating its results with readers of higher education research.

Keywords: logistic regression, logit, probit, tobit, delta-P, marginal probability, odds ratio, higher education research, multivariate statistics
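The abstract's caution about delta-Ps and marginal probabilities can be illustrated with a minimal sketch. It assumes the commonly used definitions, which the abstract itself does not spell out: the odds ratio is exp(β), delta-P is the change in predicted probability for a one-unit increase in a predictor starting from a baseline probability, and the marginal probability is the slope β·p·(1 − p) at a given probability p. The coefficient value and the function names are hypothetical, chosen only for illustration.

```python
import math


def logistic(z):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(beta)


def delta_p(beta, p0):
    """Change in probability for a one-unit increase, starting from baseline p0."""
    baseline_logit = math.log(p0 / (1.0 - p0))
    return logistic(baseline_logit + beta) - p0


def marginal_probability(beta, p):
    """Instantaneous slope of the probability curve at probability p."""
    return beta * p * (1.0 - p)


if __name__ == "__main__":
    beta = 0.5  # hypothetical logit coefficient, not taken from the article
    print(f"odds ratio = {odds_ratio(beta):.3f}")
    for p0 in (0.1, 0.5, 0.9):
        print(
            f"baseline p = {p0:.1f}: "
            f"delta-P = {delta_p(beta, p0):.4f}, "
            f"marginal probability = {marginal_probability(beta, p0):.4f}"
        )
```

Under these assumptions the odds ratio is the same at every baseline, whereas delta-P and the marginal probability both depend on the baseline probability at which they are evaluated. That dependence is one reason such quantities call for a stated reference point when results are reported.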





Copyright information

© Human Sciences Press, Inc. 2002

Authors and Affiliations

  • Chao-Ying Joanne Peng (1)
  • Tak-Shing Harry So (2)
  • Frances K. Stage (3)
  • Edward P. St. John (2)

  1. Department of Counseling and Educational Psychology, School of Education, Indiana University, Bloomington
  2. Indiana University–Bloomington, USA
  3. New York University, USA
