Abstract
The study of the genetic properties of a disease requires collecting information on the subjects in a set of pedigrees. The main focus of such a study is the detection of susceptibility genes. However, even with large pedigrees, the heterogeneity of phenotypes in complex diseases such as schizophrenia, bipolar disorder and autism makes susceptibility genes difficult to detect. This is mainly due to genetic heterogeneity: many genetic phenomena are involved in the disease. To reduce this heterogeneity, we propose sub-typing the disease and partitioning the population into more homogeneous sub-groups. We developed a probabilistic model based on Latent Class Analysis (LCA) that takes into account the familial dependence inside a pedigree, even for large pedigrees. It also accommodates individuals with missing and partially missing measurements. Model parameters are estimated with an EM algorithm, and the E-step computations inside a pedigree are carried out with a pedigree peeling algorithm. When more than one model is fitted, we use model selection strategies such as cross-validation and/or BIC to choose a suitable model among the set of candidates. Moreover, we present a simulation based on a genetic disease class model and show that our model leads to better individual classification than a model that assumes independence among subjects. We also apply our model to a schizophrenia-bipolar pedigree data set from Eastern Quebec.
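As a rough illustration of the ingredients named in the abstract, the following is a minimal sketch of EM estimation for a latent class model with binary symptoms, together with a BIC score. It implements only the baseline that assumes independence among subjects, not the pedigree-aware model or the peeling algorithm of the paper. The function names (`e_step`, `fit_lca`, `bic`), the restriction to binary items, the coding of missing measurements as `None`, and the random initialisation are all assumptions made for this example.

```python
import math
import random


def e_step(data, weights, probs):
    """Posterior class probabilities per subject and the observed-data log-likelihood."""
    post, loglik = [], 0.0
    for row in data:
        joint = []
        for w, p in zip(weights, probs):
            lik = w
            for x, pj in zip(row, p):
                if x is not None:  # a missing item contributes a factor of 1
                    lik *= pj if x == 1 else 1.0 - pj
            joint.append(lik)
        total = sum(joint)
        loglik += math.log(total)
        post.append([v / total for v in joint])
    return post, loglik


def fit_lca(data, n_classes, n_iter=200, seed=0):
    """EM for a latent class model with binary items and independent subjects.

    data: list of rows, each entry 0, 1, or None (missing measurement).
    Returns (class_weights, item_probs, posteriors, log_likelihood).
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start for P(item = 1 | class), kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        post, _ = e_step(data, weights, probs)
        # M step: weighted proportions, using only the observed entries of each item.
        for c in range(n_classes):
            weights[c] = sum(p[c] for p in post) / len(data)
            for j in range(n_items):
                seen = [(p[c], row[j]) for p, row in zip(post, data)
                        if row[j] is not None]
                den = sum(w for w, _ in seen)
                num = sum(w * x for w, x in seen)
                ratio = num / den if den > 0 else 0.5
                probs[c][j] = min(max(ratio, 1e-6), 1.0 - 1e-6)
    post, loglik = e_step(data, weights, probs)
    return weights, probs, post, loglik


def bic(loglik, n_classes, n_items, n_subjects):
    """BIC = -2 log L + (number of free parameters) * log(n); lower is better."""
    n_params = (n_classes - 1) + n_classes * n_items
    return -2.0 * loglik + n_params * math.log(n_subjects)
```

Under these assumptions, candidate models could be compared by calling `fit_lca` for several values of `n_classes` and keeping the one with the smallest `bic`. In the pedigree-aware model described above, the per-subject product in `e_step` would instead be replaced by a computation over the whole pedigree.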
Cite this article
Tayeb, A., Labbe, A., Bureau, A. et al. Solving genetic heterogeneity in extended families by identifying sub-types of complex diseases. Comput Stat 26, 539–560 (2011). https://doi.org/10.1007/s00180-010-0224-2
DOI: https://doi.org/10.1007/s00180-010-0224-2