Behavior Genetics, Volume 43, Issue 3, pp 254–266

The Use of Imputed Sibling Genotypes in Sibship-Based Association Analysis: On Modeling Alternatives, Power and Model Misspecification

  • Camelia C. Minică
  • Conor V. Dolan
  • Jouke-Jan Hottenga
  • Gonneke Willemsen
  • Jacqueline M. Vink
  • Dorret I. Boomsma
Original Research

Abstract

When phenotypic but no genotypic data are available for relatives of participants in genetic association studies, previous research has shown that including family-based imputed genotypes can boost statistical power. Here, using simulations, we compared the performance of two statistical approaches suitable for modeling imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, the size of the phenotypic correlations among siblings, imputation accuracy, and the minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we investigated whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with a continuous phenotype (height) and a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approaches are equally powerful, and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that including imputed sibling genotypes in association analysis does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes. That is, if the power benefit is small, so that the change in the distribution of the test statistic under the alternative is relatively small, the probability of obtaining a smaller test statistic is greater. As genetic effects are typically hypothesized to be small, in practice the decision on whether to use family-based imputation as a means to increase power should be informed by prior power calculations and by consideration of the background correlation.
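The distinction between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single individual with a normally distributed phenotype and ignores the familial covariance matrix that the sibship models in the article require. The function names and the parameters `b0` (intercept), `b1` (per-allele effect) and `sd` (residual standard deviation) are hypothetical and chosen for illustration only.

```python
import math


def dosage(probs):
    """Expected allele count given (P(g=0), P(g=1), P(g=2))."""
    return sum(g * p for g, p in enumerate(probs))


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def loglik_dosage(y, probs, b0, b1, sd):
    """Dosage approach (sketch): the mean of the conditional genotype
    distribution stands in for the unobserved genotype."""
    return math.log(normal_pdf(y, b0 + b1 * dosage(probs), sd))


def loglik_mixture(y, probs, b0, b1, sd):
    """Mixture approach (sketch): the phenotype density under each
    possible genotype is weighted by its imputation probability."""
    return math.log(
        sum(p * normal_pdf(y, b0 + b1 * g, sd) for g, p in enumerate(probs))
    )


if __name__ == "__main__":
    # Illustrative values only; not taken from the article.
    probs = (0.25, 0.50, 0.25)
    print(dosage(probs))
    print(loglik_dosage(1.2, probs, 0.5, 0.3, 1.0))
    print(loglik_mixture(1.2, probs, 0.5, 0.3, 1.0))
```

In this simplified setting the two log-likelihoods coincide when the genotype is known with certainty or when `b1` is zero, and they differ only slightly when `b1` is small relative to `sd`.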

Keywords

  • Family-based imputation
  • Mixture model
  • Dosage model
  • Robustness

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Camelia C. Minică (1)
  • Conor V. Dolan (1, 2)
  • Jouke-Jan Hottenga (1)
  • Gonneke Willemsen (1)
  • Jacqueline M. Vink (1)
  • Dorret I. Boomsma (1)

  1. Department of Biological Psychology, VU University Amsterdam, Amsterdam, The Netherlands
  2. Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
