An Evaluation of the MiDCoP Method for Imputing Allele Frequency in Genome Wide Association Studies
A genome-wide association study requires genotyping the DNA sequences of a large sample of individuals with and without the disease of interest. Current genotyping technologies cover only a limited portion of each individual's DNA sequence, so a large fraction of Single Nucleotide Polymorphisms (SNPs) are not genotyped. Existing imputation methods rely on individual-level data and are often time consuming and costly. A recently developed method, the Minimum Deviation of Conditional Probability (MiDCoP), aims to impute the allele frequencies of the missing SNPs from the allele frequencies of neighboring SNPs, without using individual-level SNP information. This article evaluates the performance of the MiDCoP approach through association analysis based on the imputed allele frequencies, using the GAIN Schizophrenia data. The results indicate that the choice of reference sets has a strong impact on performance. Imputation accuracy improves when the case and control data sets are each imputed with a separate, better-matched reference set.
Keywords: Association Tests, Conditional Probability, Imputation, Minimum Deviation, Multilocus Information Measure, Single Nucleotide Polymorphisms
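The abstract refers to association analysis carried out on allele frequencies rather than on individual genotypes. The sketch below illustrates one common way such a test could be set up: a standard allelic Pearson chi-square test on a 2x2 table of allele counts reconstructed from case and control frequencies. This is not the MiDCoP method, and the abstract does not state which association test the authors used; the function name `allelic_chi_square`, its parameters, and the example numbers are hypothetical.

```python
import math


def allelic_chi_square(p_case, p_control, n_case, n_control):
    """Allelic chi-square test (1 degree of freedom) from allele frequencies.

    Assumption: p_case and p_control are frequencies of the same allele in
    cases and controls; n_case and n_control are numbers of individuals.
    """
    # Each diploid individual contributes two alleles to the count table.
    a = 2 * n_case * p_case            # tested allele, cases
    b = 2 * n_case * (1 - p_case)      # other allele, cases
    c = 2 * n_control * p_control      # tested allele, controls
    d = 2 * n_control * (1 - p_control)  # other allele, controls

    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    total = row1 + row2

    # A degenerate table (for example a monomorphic SNP) carries no signal.
    if min(row1, row2, col1, col2) == 0:
        return 0.0, 1.0

    # Pearson chi-square statistic for a 2x2 table.
    chi2 = total * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

    # Survival function of chi-square with 1 df: P(|Z| > sqrt(chi2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical example: imputed frequency 0.30 in cases, 0.25 in controls.
chi2, p_value = allelic_chi_square(0.30, 0.25, n_case=1000, n_control=1000)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2e}")
```

Because the test needs only frequencies and sample sizes, it can be applied directly to imputed allele frequencies, which is why a frequency-based test is a natural companion to a summary-level imputation method.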