Abstract
Ordinal outcomes are common in the social, behavioral, and health sciences, but there is no commonly accepted approach to analyzing them. Researchers make a number of different, seemingly arbitrary recoding decisions that imply different levels of measurement and theoretical assumptions. As a result, a wide array of models are used to analyze ordinal outcomes, including the linear regression model, binary response model, ordered models, and count models. In this tutorial, we present a diverse set of ordered models (most of which are under-utilized in applied research) and argue that researchers should approach the analysis of ordinal outcomes more systematically by taking into consideration both theoretical and empirical concerns and by prioritizing ordered models, given the flexibility they provide. Additionally, we consider the challenges that ordinal independent variables pose for analysts, which often go unnoticed in the literature, and offer simple ways to decide how to include ordinal independent variables in ordered regression models so that the choice is easier to justify on conceptual and empirical grounds. We illustrate several ordered regression models with an empirical example, general self-rated health, and conclude with recommendations for building a sounder approach to ordinal data analysis.
Notes
There are constrained and unconstrained partial models, which yields four versions of each approach.
For a discussion of the constrained partial model, see Fullerton and Xu (2016, p. 65).
Espinosa and Hennig (2019) developed an ordinal model that allows one to impose a monotonicity constraint on ordinal independent variables via constrained MLE (p. 872). The models we present in this study do not impose this inequality constraint. Ordinal patterns may emerge for groups of binary independent variables, but they are not imposed in the models.
The “poor” category contained only 1.53% of the overall sample. The LR test for combining the “fair” and “poor” categories was not significant (χ² = 9.798, p = 0.367), providing an empirical justification for this approach.
The “fairly likely” category contained only 4.36% of the sample, and the “very likely” category contained only 2.59% of the sample. The LR test for combining these categories was not significant (χ² = 6.688, p = 0.571).
We used the gologit2 command (Williams, 2006) in Stata to estimate the cumulative models. Although the focus in this example is on cumulative models, we also estimated stage and adjacent models for comparison purposes. The results are substantively similar to the cumulative results. We used the gencrm command (Bauldry et al., 2018) and the standard mlogit command in Stata to estimate the stage and adjacent models, respectively. See the online appendix for the results and details regarding estimation using Stata.
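As a rough illustration of what a cumulative logit model computes, the sketch below converts cutpoint-specific linear predictors into category probabilities. It is not the authors' Stata code. It assumes the parameterization P(y > m) = logistic(α_m + xβ_m), and the names `logistic` and `category_probabilities`, along with all numeric values, are hypothetical. The parallel model is the special case in which every cutpoint shares the same coefficients.

```python
import math


def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(alphas, betas, x):
    """Category probabilities for a cumulative logit model (hypothetical helper).

    alphas: M-1 cutpoint-specific intercepts.
    betas:  M-1 coefficient lists, one per cutpoint; identical lists
            give the parallel (proportional odds) model.
    x:      covariate values for one observation.
    Assumes P(y > m) = logistic(alpha_m + x . beta_m).
    """
    # Probability of exceeding each of the M-1 cutpoints.
    exceed = [
        logistic(a + sum(b_k * x_k for b_k, x_k in zip(b, x)))
        for a, b in zip(alphas, betas)
    ]
    # P(y > 0) = 1 and P(y > M) = 0 bracket the exceedance probabilities.
    bounds = [1.0] + exceed + [0.0]
    # P(y = m) = P(y > m - 1) - P(y > m).
    probs = [bounds[m] - bounds[m + 1] for m in range(len(bounds) - 1)]
    # Non-parallel coefficients can imply negative probabilities.
    if min(probs) < 0:
        raise ValueError("parameters imply a negative category probability")
    return probs


# Illustrative values only: three categories, one covariate, parallel slopes.
print(category_probabilities([1.0, -1.0], [[0.5], [0.5]], [0.0]))
```

The check for negative probabilities reflects a known risk of relaxing the parallel assumption: with cutpoint-specific coefficients, the implied exceedance probabilities need not be ordered for every covariate value.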
References
Agresti, A. (2010). Analysis of ordinal categorical data (2nd ed.). Wiley
Amemiya, T. (1981). Qualitative response models: a survey. Journal of Economic Literature, 19, 1483–1536
Bauldry, S., Xu, J., & Fullerton, A. S. (2018). Gencrm: a new command for generalized continuation-ratio models. Stata Journal, 18, 924–936
Bourdieu, P., Chamboredon, J., & Passeron, J. (1991). The craft of sociology: Epistemological preliminaries. Berlin, Germany: Walter de Gruyter
Brant, R. (1990). Assessing proportionality in the proportional odds model for ordinal logistic regression. Biometrics, 46, 1171–1178
Buis, M. L. (2011). The consequences of unobserved heterogeneity in a sequential logit model. Research in Social Stratification and Mobility, 29, 247–262
Bürkner, P.-C., & Vuorre, M. (2019). Ordinal regression models in psychology: a tutorial. Advances in Methods and Practices in Psychological Science, 2, 77–101
Cameron, S. V., & Heckman, J. J. (1998). Life cycle schooling and dynamic selection bias: Models and evidence for five cohorts of American males. Journal of Political Economy, 106, 262–333
Cheng, S., & Long, J. S. (2007). Testing for IIA in the multinomial logit model. Sociological Methods and Research, 35, 583–600
Clogg, C. C., & Shihadeh, E. S. (1994). Statistical models for ordinal variables. Sage
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society Series B, 34, 187–220
Espinosa, J., & Hennig, C. (2019). A constrained regression model for an ordinal response with ordinal predictors. Statistics and Computing, 29, 869–890
Fienberg, S. E. (1980). The analysis of cross-classified categorical data (2nd ed.). MIT Press
Fullerton, A. S. (2009). A conceptual framework for ordered logistic regression models. Sociological Methods and Research, 38, 306–347
Fullerton, A. S., & Dixon, J. C. (2009). Racialization, asymmetry, and the context of welfare attitudes in the American states. Journal of Political and Military Sociology, 37, 95–120
Fullerton, A. S., & Xu, J. (2012). The proportional odds with partial proportionality constraints model for ordinal response variables. Social Science Research, 41, 182–198
Fullerton, A. S., & Xu, J. (2016). Ordered regression models: Parallel, partial, and non-parallel alternatives. Chapman & Hall/CRC Press
Fullerton, A. S., & Xu, J. (2018). Constrained and unconstrained partial adjacent category logit models for ordinal response variables. Sociological Methods and Research, 47, 169–206
Goodman, L. A. (1983). The analysis of dependence in cross-classifications having ordered categories, using log-linear models for frequencies and log-linear models for odds. Biometrics, 39, 149–160
Greene, W. H., & Hensher, D. A. (2010). Modeling ordered choices: A primer. Cambridge, UK: Cambridge University Press.
Hedeker, D. R., Mermelstein, R. J., & Weeks, K. A. (1999). The thresholds of change model: An application to analyzing stages of change data. Annals of Behavioral Medicine, 21, 61–70
Lieberson, S. (1985). Making it count. University of California Press
Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 1–55
Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage
Long, J. S., & Freese, J. (2014). Regression models for categorical dependent variables using Stata (3rd ed.). College Station, TX: Stata Press
Maddala, G. S. (1983). Limited-dependent and qualitative variables in econometrics. Cambridge University Press
Mare, R. D. (2011). Introduction to symposium on unmeasured heterogeneity in school transition models. Research in Social Stratification and Mobility, 29, 239–245
McCullagh, P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society Series B, 42, 109–142
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). Chapman & Hall
McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in econometrics (pp. 105–142). Academic Press
McKelvey, R. D., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4, 103–120
Nadler, J. T., Weston, R., & Voyles, E. C. (2015). Stuck in the middle: The use and interpretation of mid-points in items on questionnaires. Journal of General Psychology, 142, 71–89
Peterson, B., & Harrell, F. E., Jr. (1990). Partial proportional odds models for ordinal response variables. Applied Statistics, 39, 205–217
Tucker, G., Adams, R., & Wilson, D. (2013). Observed agreement problems between sub-scales and summary components of the SF-36 Version 2 - An alternative scoring method can correct the problem. PLoS ONE, 8, e61191
Tucker, G., Adams, R., & Wilson, D. (2014). Results from several population studies show that recommended scoring methods of the SF-36 and the SF-12 may lead to incorrect conclusions and subsequent health decisions. Quality of Life Research, 23, 2195–2203
Tutz, G. (1991). Sequential models in categorical regression. Computational Statistics & Data Analysis, 11, 275–295
Williams, R. (2006). Generalized ordered logit/partial proportional odds models for ordinal dependent variables. Stata Journal, 6, 58–82
Williams, R. (2009). Using heterogeneous choice models to compare logit and probit coefficients across groups. Sociological Methods and Research, 37, 531–559
Williams, R. (2016). Understanding and interpreting generalized ordered logit models. Journal of Mathematical Sociology, 40, 7–20
Xie, Y. (2011). Values and limitations of statistical models. Research in Social Stratification and Mobility, 29, 343–349
Xu, J., Bauldry, S., & Fullerton, A. S. (2019). Bayesian approaches to assessing the parallel lines assumption in cumulative ordered logit models. Sociological Methods and Research (In Press). https://doi.org/10.1177/0049124119882461
Ethics declarations
Ethical Approval
The OSU Institutional Review Board granted the study exempt status.
Informed Consent
Not applicable.
Conflict of Interest
The authors declare no competing interests.
Cite this article
Fullerton, A.S., Anderson, K.F. Ordered Regression Models: a Tutorial. Prev Sci 24, 431–443 (2023). https://doi.org/10.1007/s11121-021-01302-y
DOI: https://doi.org/10.1007/s11121-021-01302-y