Abstract
The Hardy-Weinberg principle, one of the most important principles in population genetics, was originally developed to study allele frequency changes in a population over generations. It is now also widely used in studies of human diseases to detect inbreeding, population stratification, and genotyping errors. The most popular approaches for assessing deviation from Hardy-Weinberg proportions in data are the asymptotic Pearson’s chi-squared goodness-of-fit test and the exact test. Pearson’s chi-squared goodness-of-fit test is simple and straightforward, but it is very sensitive to small sample sizes and low allele frequencies. The exact test of Hardy-Weinberg proportions is preferable in these situations. It can be performed through complete enumeration of heterozygote genotypes or by a Markov chain Monte Carlo procedure. In this chapter, we describe the Hardy-Weinberg principle, the commonly used tests of Hardy-Weinberg proportions, and their applications. Using numerical examples, we demonstrate step by step how the chi-squared test and the exact test of Hardy-Weinberg proportions can be performed with SAS, R, and PLINK, software programs that are widely used in genetic association studies. We also discuss approaches for testing Hardy-Weinberg proportions in case–control study designs that are better than the traditional approach of testing Hardy-Weinberg proportions in controls only. Finally, we note that deviation from Hardy-Weinberg proportions in affected individuals can provide evidence for an association between genetic variants and diseases.
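As a minimal sketch of the two tests named in the abstract, the Python code below computes Pearson's chi-squared statistic and an exact P value by complete enumeration of heterozygote counts for a single biallelic autosomal marker. It is an illustration under stated assumptions, not the SAS, R, or PLINK procedures demonstrated in the chapter: the function names `hw_chisq` and `hw_exact` and the genotype counts in the usage lines are hypothetical. The exact test uses the standard conditional distribution of the heterozygote count given the allele counts, and sums the probabilities of all outcomes no more likely than the observed one.

```python
from math import erfc, exp, lgamma, log, sqrt


def hw_chisq(n_hom1, n_het, n_hom2):
    """Pearson chi-squared test of Hardy-Weinberg proportions (1 df).

    Returns (statistic, asymptotic P value). Assumes both alleles are
    observed; a monomorphic marker gives zero expected counts.
    """
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # estimated frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom1, n_het, n_hom2)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-squared with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2.0))


def hw_exact(n_hom1, n_het, n_hom2):
    """Exact test P value by enumerating all possible heterozygote counts."""
    n = n_hom1 + n_het + n_hom2
    n_a = 2 * n_hom1 + n_het  # copies of the first allele
    n_b = 2 * n - n_a         # copies of the second allele

    def log_prob(het):
        # log P(het | n, n_a) =
        #   log[ n! / (hom1! het! hom2!) * 2^het * n_a! n_b! / (2n)! ]
        hom1 = (n_a - het) // 2
        hom2 = (n_b - het) // 2
        return (lgamma(n + 1) - lgamma(hom1 + 1) - lgamma(het + 1)
                - lgamma(hom2 + 1) + het * log(2.0)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    log_obs = log_prob(n_het)
    p_value = 0.0
    # The heterozygote count must have the same parity as n_a.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        lp = log_prob(het)
        if lp <= log_obs + 1e-9:  # small tolerance for floating-point ties
            p_value += exp(lp)
    return min(p_value, 1.0)


if __name__ == "__main__":
    counts = (30, 40, 30)  # hypothetical genotype counts (AA, Aa, aa)
    print("chi-squared statistic, P value:", hw_chisq(*counts))
    print("exact P value:", hw_exact(*counts))
```

Working on the log scale with `lgamma` avoids overflow in the factorials for large samples. Complete enumeration is feasible here because a biallelic marker has at most about n/2 possible heterozygote counts; with multiple alleles the number of possible genotype tables grows quickly, which is where Monte Carlo approaches become attractive.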
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Wang, J., Shete, S. (2017). Testing Departure from Hardy-Weinberg Proportions. In: Elston, R. (eds) Statistical Human Genetics. Methods in Molecular Biology, vol 1666. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7274-6_6
DOI: https://doi.org/10.1007/978-1-4939-7274-6_6
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7273-9
Online ISBN: 978-1-4939-7274-6
eBook Packages: Springer Protocols