Abstract
Multivariate twin and family studies are among the most important tools for assessing the inheritance of diseases and for studying the interrelationship between their genetic and environmental components. The multivariate analysis of twin and family data is generally based on structural equation modelling or linear mixed models, which essentially decompose the sources of covariation as originally suggested by Fisher. In this paper, we propose a flexible and unified statistical modelling framework for analysing multivariate Gaussian and non-Gaussian twin and family data. Non-normality is taken into account by explicitly modelling the mean and variance relationship, while the covariance structure is modelled by a linear covariance model, with the option of modelling the dispersion components as functions of known covariates in a regression fashion. The marginal specification of our models allows us to extend classic models and biometric indices, such as the bivariate heritability and the genetic, environmental and phenotypic correlations, to non-Gaussian data. We illustrate the proposed models through simulation studies and six data analyses, and provide a computational implementation in R through the package mglm4twin.
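To make the idea of a linear covariance model concrete, the following minimal sketch builds the covariance matrix of a twin pair as a linear combination of known matrices. It assumes the standard univariate ACE decomposition (additive genetic, common environment, unique environment), which the abstract does not spell out, and it is written in plain Python for illustration rather than with the mglm4twin package. The function names and the dispersion values are hypothetical.

```python
# Illustrative sketch only: the textbook univariate ACE twin model.
# Nothing here is taken from the paper's code or from mglm4twin.

def ace_covariance(tau_a, tau_c, tau_e, zygosity):
    """Return the 2x2 covariance matrix of one twin pair.

    The matrix is linear in the dispersion components:
        Omega = tau_a * A + tau_c * C + tau_e * E
    where A, C and E are known matrices.
    """
    # Expected additive genetic sharing: 1 for MZ pairs, 1/2 for DZ pairs.
    r = {"MZ": 1.0, "DZ": 0.5}[zygosity]
    A = [[1.0, r], [r, 1.0]]        # additive genetic structure
    C = [[1.0, 1.0], [1.0, 1.0]]    # common (shared) environment
    E = [[1.0, 0.0], [0.0, 1.0]]    # unique environment
    return [
        [tau_a * A[i][j] + tau_c * C[i][j] + tau_e * E[i][j] for j in range(2)]
        for i in range(2)
    ]


def heritability(tau_a, tau_c, tau_e):
    """Share of the total variance attributed to additive genetic effects."""
    return tau_a / (tau_a + tau_c + tau_e)


# Hypothetical dispersion components, chosen only for the example.
omega_mz = ace_covariance(0.5, 0.25, 0.25, "MZ")
omega_dz = ace_covariance(0.5, 0.25, 0.25, "DZ")
h2 = heritability(0.5, 0.25, 0.25)
```

Under these assumed values, the MZ and DZ pairs share the same variances and differ only in their within-pair covariance, which is the contrast that lets twin data separate the genetic component from the environmental ones.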
Data Availability
Data are available as supplementary material.
Funding
Not applicable.
Ethics declarations
Conflict of interest
Wagner Hugo Bonat and Jacob V. B. Hjelmborg declare that they have no conflict of interest.
Ethical Approval
Not applicable.
Human and Animal Rights statement and Informed consent
Not applicable.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Code Availability
R code is available as supplementary material.
Additional information
Edited by: Stacey S. Cherny.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Bonat, W.H., Hjelmborg, J.V.B. Multivariate Generalized Linear Models for Twin and Family Data. Behav Genet 52, 123–140 (2022). https://doi.org/10.1007/s10519-021-10095-3
DOI: https://doi.org/10.1007/s10519-021-10095-3