Applying Multivariate Discrete Distributions to Genetically Informative Count Data
We present a novel method for biometric analysis of twin data when the phenotypes are integer-valued counts, which often show an L-shaped distribution. Monte Carlo simulation is used to compare five likelihood-based modeling approaches: our multivariate discrete method when its distributional assumptions are correct, the same method when they are incorrect, and three other methods in common use (the Normal, Lognormal, and Ordinal models). With data simulated from a skewed discrete distribution, recovery of twin correlations and of the proportions of additive genetic and common environment variance was generally poor for the Normal, Lognormal, and Ordinal models, but good for the two discrete models. Sex-separate applications to substance-use data from twins in the Minnesota Twin Family Study (MTFS) showed superior performance of the two discrete models. The new methods are implemented in R and OpenMx and are freely available.
Keywords: Count variables; Twin study; Biometric variance components; Multivariate discrete distributions; Substance use; Lagrangian probability distributions
Acknowledgments
The authors were supported by U.S. Public Health Service grant DA026119. William G. Iacono and Matt McGue provided the MTFS dataset, which was supported by U.S. Public Health Service Grants DA05147, AA009367, and DA013240. The first author gives his special thanks to Matt McGue, Niels G. Waller, and Hermine H. Maes for their comments on drafts of the paper.
Compliance with Ethical Standards
Conflict of Interest
Robert M. Kirkpatrick and Michael C. Neale declare that they have no conflict of interest.
Human and animal rights and informed consent
The MTFS was reviewed and approved by the Institutional Review Board at the University of Minnesota. Written informed assent or consent was obtained from all participants, with parents providing written consent for their minor children.