A nonnormal look at polychoric correlations: modeling the change in correlations before and after discretization

Abstract

Two algorithms are developed for connecting correlations before and after ordinalization under a wide spectrum of nonnormal underlying bivariate distributions. They extend the iteratively found normal-based results via power polynomials. The algorithms compute the polychoric correlation when the ordinal correlation is specified, and vice versa, along with the distributional properties of the latent continuous variables, which are subsequently ordinalized through thresholds dictated by the marginal proportions. The method is broadly applicable in simulation and random number generation settings where modeling the relationship between these correlation types is of interest.
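The normal-based starting point that the abstract refers to can be illustrated with a small Monte Carlo sketch. This is not the authors' algorithm and it does not include the power polynomial extension to nonnormal latent variables: it assumes bivariate standard normal latent variables, and all function names and parameters below are hypothetical, chosen only for illustration. Thresholds come from the marginal proportions, the ordinal correlation is estimated by simulation, and the reverse direction is recovered by bisection on the latent correlation.

```python
import bisect
import math
import random
from statistics import NormalDist


def thresholds_from_proportions(probs):
    """Standard normal cut points implied by ordinal category proportions."""
    cum, cuts = 0.0, []
    for p in probs[:-1]:  # the last category needs no upper cut point
        cum += p
        cuts.append(NormalDist().inv_cdf(cum))
    return cuts


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ordinal_correlation(rho, probs1, probs2, n=100_000, seed=0):
    """Estimate the correlation after ordinalization (assumes normal latents)."""
    rng = random.Random(seed)
    cuts1 = thresholds_from_proportions(probs1)
    cuts2 = thresholds_from_proportions(probs2)
    o1, o2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        # Category index = number of cut points at or below the latent value.
        o1.append(bisect.bisect_right(cuts1, z1))
        o2.append(bisect.bisect_right(cuts2, z2))
    return pearson(o1, o2)


def latent_correlation(target, probs1, probs2, n=20_000, seed=0, iters=20):
    """Bisection search for the latent correlation matching a target ordinal one."""
    lo, hi = -0.999, 0.999
    for _ in range(iters):
        mid = (lo + hi) / 2
        # Reusing the seed keeps the simulated draws comparable across steps.
        if ordinal_correlation(mid, probs1, probs2, n, seed) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    p1, p2 = [0.3, 0.4, 0.3], [0.5, 0.5]
    print(ordinal_correlation(0.5, p1, p2))
    print(latent_correlation(0.3, p1, p2))
```

For two variables dichotomized at the median, the estimate from `ordinal_correlation(0.5, [0.5, 0.5], [0.5, 0.5])` should land near 1/3, the known value (2/π)·arcsin(0.5) for the bivariate normal case, which shows how discretization attenuates the correlation. Handling nonnormal latent distributions would require replacing both the normal draws and the normal quantile thresholds.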

Author information

Corresponding author

Correspondence to Hakan Demirtas.

About this article

Cite this article

Demirtas, H., Ahmadian, R., Atis, S. et al. A nonnormal look at polychoric correlations: modeling the change in correlations before and after discretization. Comput Stat 31, 1385–1401 (2016). https://doi.org/10.1007/s00180-016-0653-7

Keywords

  • Random number generation
  • Simulation
  • Nonnormality
  • Threshold concept