Hidden Markov Models for Controlling False Discovery Rate in Genome-Wide Association Analysis

Part of the Methods in Molecular Biology book series (MIMB, volume 802)


Genome-wide association studies (GWAS) have shown notable success in identifying genetic variants that confer susceptibility to common and complex diseases. To date, the analytical methods of published GWAS have largely been limited to single-SNP (single nucleotide polymorphism) or SNP–SNP pair analysis, coupled with multiplicity control using the Bonferroni procedure to control the family-wise error rate (FWER). However, since SNPs in typical GWAS are in linkage disequilibrium, simple Bonferroni correction is usually overly conservative and therefore leads to a loss of efficiency. In addition, controlling FWER may be too stringent for GWAS, where the number of SNPs to be tested is enormous. It is more desirable to control the false discovery rate (FDR). We introduce here a hidden Markov model (HMM)-based pooled local index of significance (PLIS) testing procedure for GWAS. It captures SNP dependency with an HMM and, based on that model, provides precise FDR control for identifying susceptibility loci.

Key words

Genome-wide association; SNP; Hidden Markov model; False discovery rate; EM algorithm; Multiple tests


Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Computer Science, New Jersey Institute of Technology, Newark, USA
