Best Practices for Binary and Ordinal Data Analyses


The measurement of many human traits, states, and disorders begins with a set of items on a questionnaire. The response format for these items is often binary (e.g., yes/no) or ordered (e.g., high, medium, or low). During data analysis, these items are frequently summed or used to estimate factor scores. In clinical applications, such scores are often non-normally distributed in the general population because many respondents are unaffected and therefore asymptomatic. As a result, these measures frequently violate the statistical assumptions required for subsequent analyses. To reduce the influence of non-normality in these quasi-continuous assessments, variables are frequently recoded into binary (affected–unaffected) or ordinal (mild–moderate–severe) diagnoses. Ordinal data therefore present challenges at multiple levels of analysis. Categorizing continuous variables into ordered categories typically results in a loss of statistical power, which gives the data analyst an incentive to assume that the data are normally distributed even when they are not. Despite the conventional wisdom that, for example, variables with more than 10 ordered categories may be regarded as continuous and analyzed as such, we show via simulation studies that this is not generally the case. In particular, using Pearson product-moment correlations instead of maximum likelihood estimates of polychoric correlations biases the estimated correlations towards zero. This bias is especially severe when a plurality of the observations fall into a single observed category, such as a score of zero. By contrast, estimating the ordinal correlation by maximum likelihood yields no estimation bias, although standard errors are (appropriately) larger. We also illustrate how odds ratios depend critically on the proportion or prevalence of affected individuals in the population, and are therefore sub-optimal for studies that need to compare association metrics.
Finally, we extend these analyses to the classical twin model and demonstrate that treating binary data as continuous will underestimate genetic and common environmental variance components and overestimate unique environmental (residual) variance. These biases increase as prevalence declines. While modeling ordinal data appropriately may be more computationally intensive and time consuming, failing to do so will likely yield biased correlations and, in turn, biased parameter estimates in any models fitted to them.
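The attenuation of the Pearson correlation can be illustrated with a small simulation. The following is a minimal sketch in Python, not the authors' simulation code: it assumes a latent bivariate normal liability with correlation 0.6, applies the same threshold to both variables, and computes the Pearson correlation on the resulting 0/1 scores. The function names, sample size, seed, and prevalences are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_liabilities(rho, n, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return xs, ys


def dichotomize(values, prevalence):
    """Score 1 ("affected") when the liability exceeds the threshold
    that leaves `prevalence` of a standard normal above it."""
    threshold = NormalDist().inv_cdf(1 - prevalence)
    return [1 if v > threshold else 0 for v in values]


RHO = 0.6                     # assumed latent correlation (illustrative)
rng = random.Random(2021)     # fixed seed so the run is repeatable
x, y = simulate_liabilities(RHO, 50_000, rng)

r_latent = pearson(x, y)      # correlation of the continuous liabilities
r_by_prevalence = {}
for prev in (0.5, 0.2, 0.05):
    r_by_prevalence[prev] = pearson(dichotomize(x, prev), dichotomize(y, prev))

print(f"latent liabilities: r = {r_latent:.3f}")
for prev, r in r_by_prevalence.items():
    print(f"binary, prevalence {prev:.2f}: Pearson r = {r:.3f}")
```

For a median split, the expected Pearson (phi) correlation of the binary scores is (2/π)·arcsin(ρ), about 0.41 when ρ = 0.6, and it shrinks further as the affected category becomes rarer. A maximum likelihood tetrachoric estimator targets the latent ρ instead; it is not implemented in this sketch.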
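The direction of the bias in the twin model can be sketched in a similar way. The example below is an illustration under stated assumptions, not the authors' analysis: it simulates twin liabilities with assumed variance components a² = 0.5, c² = 0.2, and e² = 0.3, dichotomizes them, and uses Falconer's moment formulas (a² = 2(rMZ − rDZ), c² = 2rDZ − rMZ, e² = 1 − rMZ) as a simple stand-in for fitting a full maximum likelihood twin model. All function names, parameter values, and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pairs(n, var_a, var_c, var_e, shared_a, rng):
    """Simulate liabilities for n twin pairs under an ACE model.
    shared_a is the additive genetic correlation: 1.0 for MZ, 0.5 for DZ."""
    twin1, twin2 = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)                      # common environment, shared
        a_common = rng.gauss(0, 1)               # shared part of A
        a1 = math.sqrt(shared_a) * a_common + math.sqrt(1 - shared_a) * rng.gauss(0, 1)
        a2 = math.sqrt(shared_a) * a_common + math.sqrt(1 - shared_a) * rng.gauss(0, 1)
        twin1.append(math.sqrt(var_a) * a1 + math.sqrt(var_c) * c
                     + math.sqrt(var_e) * rng.gauss(0, 1))
        twin2.append(math.sqrt(var_a) * a2 + math.sqrt(var_c) * c
                     + math.sqrt(var_e) * rng.gauss(0, 1))
    return twin1, twin2


def falconer(r_mz, r_dz):
    """Falconer's moment estimates (a stand-in for an ML twin model)."""
    return {"a2": 2 * (r_mz - r_dz), "c2": 2 * r_dz - r_mz, "e2": 1 - r_mz}


def dichotomize(values, prevalence):
    """Score 1 when the liability exceeds the threshold for `prevalence`."""
    threshold = NormalDist().inv_cdf(1 - prevalence)
    return [1 if v > threshold else 0 for v in values]


rng = random.Random(51)       # fixed seed so the run is repeatable
N_PAIRS = 40_000              # pairs per zygosity group (illustrative)
mz = simulate_pairs(N_PAIRS, 0.5, 0.2, 0.3, 1.0, rng)
dz = simulate_pairs(N_PAIRS, 0.5, 0.2, 0.3, 0.5, rng)

# Estimates from the continuous liabilities (normally unobserved).
est_continuous = falconer(pearson(*mz), pearson(*dz))

# Estimates when the 0/1 diagnoses are treated as continuous scores.
est_binary = {}
for prev in (0.5, 0.1):
    r_mz = pearson(dichotomize(mz[0], prev), dichotomize(mz[1], prev))
    r_dz = pearson(dichotomize(dz[0], prev), dichotomize(dz[1], prev))
    est_binary[prev] = falconer(r_mz, r_dz)

print("continuous liability:", {k: round(v, 2) for k, v in est_continuous.items()})
for prev, est in est_binary.items():
    print(f"binary, prevalence {prev}:", {k: round(v, 2) for k, v in est.items()})
```

Under these assumed values, the estimates from the continuous liabilities land close to the generating values, while the estimates from the binary scores put a² below 0.5, c² below 0.2, and e² above 0.3, with e² larger at the lower prevalence. Moment estimators are used here only to keep the sketch short; they are not a substitute for a threshold (liability) model fitted by maximum likelihood.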

Author information



Corresponding author

Correspondence to Brad Verhulst.

Ethics declarations


Funding

This study was supported by NIDA grants R01-DA018673 and R01-DA049867.

Conflict of interest

Brad Verhulst and Michael C. Neale declare that they have no conflicts of interest related to the publication of this article.

Ethical Approval

This article does not contain any studies with human participants or animal subjects performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors would like to express their deepest gratitude to an anonymous reviewer and to Professor Conor Dolan for their invaluable comments as reviewers of this manuscript. Not only did they provide outstanding critiques that undoubtedly improved the overall quality of the manuscript, but Professor Dolan also provided an initial draft of the R code for the fourth simulation study.

Cite this article

Verhulst, B., Neale, M.C. Best Practices for Binary and Ordinal Data Analyses. Behav Genet 51, 204–214 (2021).

Keywords

  • Ordinal data
  • Pearson product-moment correlation
  • Polychoric correlation
  • Point biserial correlation
  • Tetrachoric correlation
  • Odds ratio
  • Prevalence