Psychometrika, Volume 80, Issue 4, pp 920–937

Generating Correlated, Non-normally Distributed Data Using a Non-linear Structural Model

  • Max Auerswald
  • Morten Moshagen
Article

Abstract

An approach to generate non-normality in multivariate data based on a structural model with normally distributed latent variables is presented. The key idea is to create non-normality in the manifest variables by applying non-linear linking functions to the latent part, the error part, or both. The algorithm corrects the covariance matrix for the applied function by approximating the deviance using an approximated normal variable. We show that the root mean square error (RMSE) for the covariance matrix converges to zero as sample size increases and closely approximates the RMSE as obtained when generating normally distributed variables. Our algorithm creates non-normality affecting every moment, is computationally undemanding, easy to apply, and particularly useful for simulation studies in structural equation modeling.
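
The key idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single-factor model, a hypothetical link function g(z) = z², and made-up loadings and error variances. With only one latent variable, rescaling the link output to mean 0 and variance 1 is enough to keep the model-implied covariance matrix intact, so the covariance correction described in the abstract is not implemented here. All function names are illustrative.

```python
import math
import random


def standardized_square(z):
    """Hypothetical non-linear link g(z) = z**2, rescaled to mean 0, variance 1.

    For z ~ N(0, 1): E[z**2] = 1 and Var[z**2] = 2.
    """
    return (z * z - 1.0) / math.sqrt(2.0)


def generate(n, loadings, error_vars, link=standardized_square, seed=0):
    """Draw n observations x_j = loading_j * link(eta) + e_j with normal eta, e_j."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = link(rng.gauss(0.0, 1.0))  # non-normal latent part
        rows.append([
            lam * latent + rng.gauss(0.0, math.sqrt(theta))  # normal error part
            for lam, theta in zip(loadings, error_vars)
        ])
    return rows


def implied_cov(loadings, error_vars):
    """Model-implied covariance: loading_j * loading_k, plus error variance on the diagonal."""
    p = len(loadings)
    return [
        [loadings[j] * loadings[k] + (error_vars[j] if j == k else 0.0)
         for k in range(p)]
        for j in range(p)
    ]


def sample_cov(rows):
    """Unbiased sample covariance matrix of a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
         for k in range(p)]
        for j in range(p)
    ]


def skewness(values):
    """Sample skewness (third standardized central moment)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    loadings = [0.8, 0.7, 0.6]        # assumed values, for illustration only
    error_vars = [0.36, 0.51, 0.64]   # chosen so each manifest variance is 1
    data = generate(20000, loadings, error_vars)
    print("implied:", implied_cov(loadings, error_vars))
    print("sample: ", sample_cov(data))
    print("skewness of x1:", skewness([row[0] for row in data]))
```

In this sketch the sample covariance matrix should approach the model-implied one as n grows, while the manifest variables stay clearly skewed. Applying a link to several correlated latent variables would generally change their correlations, which is presumably the situation the paper's correction addresses.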

Keywords

Non-normal multivariate data; Structural equation modeling; Simulation

Notes

Acknowledgments

This research was supported in part by a grant from the Baden-Württemberg foundation to the second author.

Supplementary material

Supplementary material 1: 11336_2015_9468_MOESM1_ESM.zip (9 KB)

Copyright information

© The Psychometric Society 2015

Authors and Affiliations

  1. Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
  2. Institute of Psychology, University of Kassel, Kassel, Germany