Journal of Genetics, Volume 89, Issue 1, pp 55–64

Evaluating variations of genotype calling: a potential source of spurious associations in genome-wide association studies

  • Huixiao Hong
  • Zhenqiang Su
  • Weigong Ge
  • Leming Shi
  • Roger Perkins
  • Hong Fang
  • Donna Mendrick
  • Weida Tong
Research Article

DOI: 10.1007/s12041-010-0011-4

Cite this article as:
Hong, H., Su, Z., Ge, W. et al. J Genet (2010) 89: 55. doi:10.1007/s12041-010-0011-4

Abstract

Genome-wide association studies (GWAS) examine the entire human genome with the goal of identifying genetic variants (usually single nucleotide polymorphisms (SNPs)) that are associated with phenotypic traits such as disease status and drug response. The discordance of significantly associated SNPs for the same disease identified from different GWAS indicates that false associations exist in such results. Possible sources of spurious associations such as sample size and population stratification have been investigated and discussed intensively; in addition, an accurate and reproducible genotype calling algorithm is required for concordant GWAS results from different studies. However, variations in the genotype calls of an algorithm, and their effects on the significantly associated SNPs identified in downstream association analyses, have not been systematically investigated. In this paper, variations in genotype calling with the Bayesian Robust Linear Model with Mahalanobis distance classifier (BRLMM) algorithm, and the resulting influence on the lists of significantly associated SNPs, were evaluated by changing algorithmic parameters, using the raw data of 270 HapMap samples analysed with the Affymetrix Human Mapping 500K Array Set (Affy500K). Two parameters were modified: the Dynamic Model (DM) call confidence threshold (threshold) and the number of randomly selected SNPs (size). Comparative analysis of the calling results and the corresponding lists of significantly associated SNPs identified through association analysis revealed that both the threshold and the size affected the called genotypes and the lists of significantly associated SNPs. The effect of the threshold was much larger than the effect of the size. Moreover, the heterozygous calls had lower consistency than the homozygous calls.
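
As an illustration of the kind of comparison described in the abstract, the following minimal sketch computes the missing call rate and the concordance between two sets of genotype calls for the same SNP/sample pairs, split into heterozygous and homozygous calls. It is not the authors' pipeline: the function name `compare_calls`, the integer genotype coding (0 = AA, 1 = AB, 2 = BB, -1 = no call), and the choice to classify each pair by its call in the first set are all assumptions made for this example.

```python
from collections import Counter

NO_CALL = -1  # assumed code for a missing call
HET = 1       # assumed code for a heterozygous call (AB); 0 = AA, 2 = BB


def compare_calls(calls_a, calls_b):
    """Compare two genotype call sets made for the same SNP/sample pairs."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call sets must have the same length")
    if len(calls_a) == 0:
        raise ValueError("call sets must not be empty")

    counts = Counter()
    for a, b in zip(calls_a, calls_b):
        # Pairs with a missing call in either set are excluded from concordance.
        if a == NO_CALL or b == NO_CALL:
            continue
        # Classify the pair by its call in set A (an assumption of this sketch).
        kind = "het" if a == HET else "hom"
        counts[kind + "_total"] += 1
        if a == b:
            counts[kind + "_same"] += 1

    def ratio(same, total):
        # None signals that no pairs were available for this ratio.
        return same / total if total else None

    n = len(calls_a)
    all_same = counts["het_same"] + counts["hom_same"]
    all_total = counts["het_total"] + counts["hom_total"]
    return {
        "missing_rate_a": sum(1 for c in calls_a if c == NO_CALL) / n,
        "missing_rate_b": sum(1 for c in calls_b if c == NO_CALL) / n,
        "overall_concordance": ratio(all_same, all_total),
        "het_concordance": ratio(counts["het_same"], counts["het_total"]),
        "hom_concordance": ratio(counts["hom_same"], counts["hom_total"]),
    }


# Toy data only: calls for six SNP/sample pairs under two parameter settings.
setting_1 = [0, 1, 2, 1, NO_CALL, 0]
setting_2 = [0, 1, 2, 0, 1, NO_CALL]
result = compare_calls(setting_1, setting_2)
print(result)
```

In this toy input, three of the four pairs called in both sets agree, and the single disagreement falls on a heterozygous call, so the heterozygous concordance (0.5) is lower than the homozygous concordance (1.0). The numbers are invented for illustration and are not results from the study.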

Keywords

genotype calling; genome-wide association studies; missing call rate; calling algorithm; spurious association

Copyright information

© Indian Academy of Sciences 2010

Authors and Affiliations

  • Huixiao Hong (1)
  • Zhenqiang Su (1)
  • Weigong Ge (1)
  • Leming Shi (1)
  • Roger Perkins (1)
  • Hong Fang (2)
  • Donna Mendrick (1)
  • Weida Tong (1)
  1. Division of Systems Toxicology, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, USA
  2. Z-Tech Corp, ICF International Company at National Center for Toxicological Research, US Food and Drug Administration, Jefferson, USA