Analyzing twin resemblance in multisymptom data: Genetic applications of a latent class model for symptoms of conduct disorder in juvenile boys
A model based on the latent class model is developed for the effects of genes and environment on multivariate categorical data in twins. The model captures many essential features of dimensional and categorical conceptions of complex behavioral phenotypes and can include, as special cases, a variety of major locus models including those that allow for etiological heterogeneity, differential sensitivity of latent classes to measured covariates, and genotype × environment interaction (G×E). Many features of the model are illustrated by an application to ratings on eight items relating to conduct disorder selected from the Rutter Parent Questionnaire (RPQ). Mothers rated their 8- to 16-year-old male twin offspring [174 monozygotic (MZ) and 164 dizygotic (DZ) pairs]. The impact of age on the frequency of reported symptoms was relatively slight. Preliminary latent class analysis suggests that four classes are required to explain the reported behavioral profiles of the individual twins. A more detailed analysis of the pairwise response profiles reveals a significant association between twins in latent class membership, and this association is greater in MZ than in DZ twins, suggesting that genetic factors play a significant role in class membership. Further analysis shows that the frequencies of MZ pairs discordant for membership of some latent classes are close to zero, while the frequencies for others are clearly nonzero. One possible explanation of this finding is that the items reflect underlying etiological heterogeneity, with some response profiles reflecting genetic categories and others revealing a latent environmental risk factor. We explore two “four-class” models for etiological heterogeneity which make different assumptions about the way in which genes and environment interact to produce complex disease phenotypes. The first model allows for genetic heterogeneity that is expressed only in individuals exposed to a high-risk (“predisposing”) environment.
The second model allows the environment to differentiate two forms of the disorder in individuals of high genetic risk. The first model fits better than the second, but neither fits as well as the general model for four latent classes associated in twins. The results suggest that a single-locus/two-allele model cannot fit the data on these eight items even when we allow for etiological heterogeneity. The pattern of endorsement probabilities associated with each of the four classes precludes a simple “unidimensional” model for the latent process underlying variation in symptom profile in this population. The extension of the approach to larger pedigrees and to linkage analysis is briefly considered.
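As an illustration of the general structure described above, the following is a minimal sketch of how a latent class model assigns a probability to a pair of twin response profiles. It assumes binary items that are conditionally independent given class membership, and a table of joint class-membership probabilities for the two twins. The function names, the two-class/three-item setup, and all numerical values are hypothetical choices made for brevity; they are not the four-class, eight-item estimates from the RPQ data.

```python
from itertools import product


def profile_prob(profile, endorse):
    """P(profile | class), assuming items are independent given the class.

    profile: sequence of 0/1 item responses
    endorse: endorsement probability of each item within the class
    """
    p = 1.0
    for y, theta in zip(profile, endorse):
        p *= theta if y else 1.0 - theta
    return p


def pair_prob(profile1, profile2, joint, endorse_by_class):
    """P(profile1, profile2) for a twin pair.

    joint[c][d] is the probability that twin 1 is in class c and twin 2
    is in class d; twin resemblance enters only through this table.
    """
    total = 0.0
    for c, row in enumerate(joint):
        for d, weight in enumerate(row):
            total += (
                weight
                * profile_prob(profile1, endorse_by_class[c])
                * profile_prob(profile2, endorse_by_class[d])
            )
    return total


# Hypothetical illustration values (NOT estimates from the study):
# two latent classes, three binary items.
endorse_by_class = [
    [0.1, 0.2, 0.1],  # low-symptom class
    [0.8, 0.7, 0.9],  # high-symptom class
]
# Joint class membership: fewer discordant pairs assumed for MZ than DZ.
joint_mz = [[0.70, 0.02], [0.02, 0.26]]
joint_dz = [[0.62, 0.10], [0.10, 0.18]]

all_profiles = list(product([0, 1], repeat=3))
total_mz = sum(
    pair_prob(a, b, joint_mz, endorse_by_class)
    for a in all_profiles
    for b in all_profiles
)
both_high_mz = pair_prob((1, 1, 1), (1, 1, 1), joint_mz, endorse_by_class)
both_high_dz = pair_prob((1, 1, 1), (1, 1, 1), joint_dz, endorse_by_class)
```

In this sketch, a greater association in MZ than in DZ pairs corresponds to less probability mass in the off-diagonal cells of the joint membership table, and an MZ discordance frequency close to zero corresponds to an off-diagonal cell close to zero.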