Mutual information and linkage disequilibrium based SNP association study by grouping case-control

  • Research Article
Genes & Genomics

Abstract

Searching for susceptibility single-nucleotide polymorphisms (SNPs) underlying genetic diseases is difficult for two main reasons: findings are hard to confirm, and the interactions between potential susceptibility SNPs are unclear. Many association studies report results with significance levels but do not demonstrate the stability of those results, so their performance in practice remains uncertain. In this paper, we develop a novel method based on mutual information theory and linkage disequilibrium that works by grouping case-control samples. Mutual information (MI) is used to test multiple SNPs in combination with disease status; the SNPs contributing the maximum MI are selected as potential susceptibility SNPs. Linkage disequilibrium (LD) analysis is then used to extend the MI-detected result so that both direct and indirect factors are included in the final result. Case-control grouping generates a number of data groups by random sampling from the target samples. Each group is assumed to have almost the same number of individuals (cases and controls), and overlap among groups is allowed. We apply the method to each data group and then compare and intersect the per-group results to obtain the final result. Grouping is repeated until the final result reaches a stable state and a high significance level. The experimental results indicate that our method successfully detects susceptibility SNPs in both simulated and real data sets.
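
The MI selection and grouping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`mutual_information`, `best_snp_set`, `grouped_selection`), the exhaustive search over k-SNP combinations, the balanced case/control sampling, and the fixed number of groups are all assumptions made for illustration. The sketch omits the LD extension step, and a fixed `n_groups` stands in for the abstract's "repeat until stable" stopping rule.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(genotypes, status):
    """Empirical mutual information (bits) between genotype tuples and disease status."""
    n = len(status)
    joint = Counter(zip(genotypes, status))
    px = Counter(genotypes)
    py = Counter(status)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def best_snp_set(samples, status, k):
    """Return the k-SNP index tuple whose joint genotype has maximal MI with status.

    Assumption: exhaustive search, which is only feasible for small SNP panels.
    """
    n_snps = len(samples[0])

    def score(combo):
        joint = [tuple(row[i] for i in combo) for row in samples]
        return mutual_information(joint, status)

    return max(combinations(range(n_snps), k), key=score)


def grouped_selection(samples, status, k, n_groups, group_size, seed=0):
    """Intersect the MI-selected SNP sets over random, possibly overlapping groups.

    Assumption: each group draws group_size // 2 cases (status 1) and
    group_size // 2 controls (status 0) without replacement within the group;
    different groups may share individuals.
    """
    rng = random.Random(seed)
    cases = [i for i, s in enumerate(status) if s == 1]
    controls = [i for i, s in enumerate(status) if s == 0]
    half = group_size // 2
    selected = None
    for _ in range(n_groups):
        idx = rng.sample(cases, half) + rng.sample(controls, half)
        combo = best_snp_set([samples[i] for i in idx],
                             [status[i] for i in idx], k)
        selected = set(combo) if selected is None else selected & set(combo)
    return selected
```

Here `samples` is a list of per-individual genotype rows (for example, values 0, 1, 2 per SNP) and `status` is a parallel list of 0/1 labels. A SNP index survives only if every group selects it, which is one simple way to read the abstract's "comparisons and intersections" between per-group results.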

Author information

Corresponding author

Correspondence to Xiguo Yuan.

About this article

Cite this article

Yuan, X., Zhang, J. & Wang, Y. Mutual information and linkage disequilibrium based SNP association study by grouping case-control. Genes Genom 33, 65–73 (2011). https://doi.org/10.1007/s13258-010-0094-6

  • DOI: https://doi.org/10.1007/s13258-010-0094-6
