Human Genetics

Volume 121, Issue 3–4, pp 357–367

An entropy-based genome-wide transmission/disequilibrium test

Original Investigation

Abstract

The availability of a large collection of single nucleotide polymorphisms (SNPs) and of efficient genotyping methods enables linkage and association studies of complex diseases to be extended from small genomic regions to the whole genome. Establishing global significance for linkage or association requires very small P-values. The original transmission/disequilibrium test (TDT) statistic compares linear functions of the numbers of transmitted and nontransmitted alleles or haplotypes. In this report, we introduce a novel TDT statistic that uses Shannon entropy as a nonlinear transformation of the frequencies of the transmitted and nontransmitted alleles (or haplotypes). The transformation amplifies the difference between the numbers of transmitted and nontransmitted alleles or haplotypes, with the aim of increasing statistical power when the number of marker loci is large. The null distribution of the entropy-based TDT statistic and its type I error rates in both homogeneous and admixed populations are validated in a series of simulation studies. By analytical methods, we show that the power of the entropy-based TDT statistic is higher than that of the original TDT, and that this difference increases with the number of marker loci. Finally, the new entropy-based TDT statistic is applied to two real data sets to test the association of the RET gene with Hirschsprung disease and of the Fcγ receptor genes with systemic lupus erythematosus. The results show that the entropy-based TDT statistic can reach P-values small enough to establish genome-wide significance for linkage or association.
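The contrast drawn in the abstract between a linear comparison of transmission counts and an entropy-transformed comparison of transmission frequencies can be illustrated with a minimal Python sketch. The first function is the standard biallelic McNemar-type TDT of Spielman et al. (1993). The entropy functions only show the transformation -p log p applied to transmitted and nontransmitted frequencies; they do not reproduce the paper's test statistic, whose variance normalization is not given in the abstract. All function names and the counts used below are hypothetical.

```python
import math

def classic_tdt(b: int, c: int) -> float:
    """Standard biallelic TDT chi-square (1 df).

    b: heterozygous parents transmitting the allele of interest
    c: heterozygous parents NOT transmitting it
    """
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)

def chi2_1df_pvalue(x: float) -> float:
    """Upper-tail P-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def entropy_term(p: float) -> float:
    """Shannon entropy contribution -p*log(p), taken as 0 when p == 0."""
    return 0.0 if p <= 0.0 else -p * math.log(p)

def entropy_contrast(transmitted, nontransmitted):
    """Per-haplotype difference of entropy terms (illustration only).

    Assumption: counts are converted to frequencies within each group.
    A real test would scale these differences by their estimated
    covariance, which this sketch deliberately omits.
    """
    n_t, n_nt = sum(transmitted), sum(nontransmitted)
    return [
        entropy_term(t / n_t) - entropy_term(u / n_nt)
        for t, u in zip(transmitted, nontransmitted)
    ]

# Hypothetical counts for a biallelic marker
stat = classic_tdt(60, 40)          # 4.0
pval = chi2_1df_pvalue(stat)
diffs = entropy_contrast([60, 40], [40, 60])
```

Because -p log p is nonlinear in p, the entropy differences do not scale proportionally with the raw count difference, which is the property the abstract appeals to.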

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Human Genetics Center, The University of Texas Health Science Center at Houston, Houston, USA
  2. Division of Cardiology, Emory University School of Medicine, Atlanta, USA
