SNP characteristics and validation success in genome wide association studies

  • Original Investigation
  • Human Genetics

Abstract

Genome wide association studies (GWASs) have identified tens of thousands of single nucleotide polymorphisms (SNPs) associated with human diseases and characteristics. A significant fraction of GWAS findings can be false positives. The gold standard for identifying true positives is independent validation. The goal of this study was to identify SNP features associated with validation success. Summary statistics from the Catalog of Published GWASs were used in the analysis. Because our goal was an analysis of reproducibility, we focused on diseases/phenotypes targeted by at least 10 GWASs. GWASs were arranged in discovery-validation pairs based on the time of publication, with the discovery GWAS published before the validation GWAS. We used four definitions of validation success that differ in stringency. Associations of SNP features with validation success were consistent across the definitions. The strongest predictor of SNP validation was the level of statistical significance in the discovery GWAS. The magnitude of the effect size was associated with validation success in a non-linear manner. SNPs with risk allele frequencies in the range 30–70% showed a higher validation success rate than rarer or more common SNPs. Missense, 5′ UTR, and stop-gained SNPs, as well as SNPs located in transcription factor binding sites, had a higher validation success rate than intergenic, intronic, and synonymous SNPs. There was a positive association between validation success and the level of evolutionary conservation of the sites. In addition, validation success was higher when the discovery and validation GWASs targeted the same ethnicity. All predictors of validation success remained significant in a multivariate logistic regression model, indicating their independent contributions. To conclude, we identified SNP features predicting validation success of GWAS hits. These features can be used to select SNPs for validation and downstream functional studies.
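The pairing step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the article states that code availability is not applicable). The `Gwas` record, its field names, and the functions `discovery_validation_pairs` and `validation_rate` are hypothetical. The abstract does not spell out the four definitions of validation success, so the single rule used here (the same SNP is reported by the later GWAS with a p-value below a chosen threshold) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Gwas:
    """One published GWAS; all field names are hypothetical."""
    study_id: str
    phenotype: str
    published: date
    hits: dict  # SNP id -> reported p-value


def discovery_validation_pairs(studies):
    """Yield (discovery, validation) pairs within each phenotype.

    The earlier-published study is treated as the discovery GWAS.
    Studies sharing a publication date are skipped because their
    order is ambiguous.
    """
    ordered = sorted(studies, key=lambda s: s.published)
    for earlier, later in combinations(ordered, 2):
        same_phenotype = earlier.phenotype == later.phenotype
        if same_phenotype and earlier.published < later.published:
            yield earlier, later


def validation_rate(studies, p_threshold=0.05):
    """Fraction of discovery SNPs re-reported below p_threshold.

    Assumed definition of validation success, for illustration only.
    A SNP absent from the later study counts as not validated.
    """
    tested = validated = 0
    for discovery, validation in discovery_validation_pairs(studies):
        for snp in discovery.hits:
            tested += 1
            p_value = validation.hits.get(snp)
            if p_value is not None and p_value < p_threshold:
                validated += 1
    return validated / tested if tested else float("nan")
```

With made-up records, two GWASs of one phenotype published in different years form a single pair, and a GWAS of a different phenotype forms none. A stricter definition of success could be modeled by lowering `p_threshold`.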

Availability of data and materials

This project used data from A Catalog of Published Genome-Wide Association Studies (https://www.genome.gov/catalog-of-published-genomewide-association-studies), the UCSC Human Genome Browser (https://genome.ucsc.edu), the Ensembl Regulatory Build (http://useast.ensembl.org/info/genome/funcgen/regulatory_build.html), and ENCODE (https://www.encodeproject.org/), all of which are in the public domain.

Code availability

Not applicable.

Funding

Partial financial support was received from National Institutes of Health Grants U19CA203654, U19CA203654S1, R01CA231141, and P01 CA206980-01A1, and from Cancer Prevention and Research Institute of Texas Grant RR170048. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Corresponding author

Correspondence to Olga Y. Gorlova.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethics approval

Not applicable: the study used aggregate statistics from datasets in the public domain.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (TXT 97026 KB)

Supplementary file2 (XLSX 11 KB)

About this article

Cite this article

Gorlova, O.Y., Xiao, X., Tsavachidis, S. et al. SNP characteristics and validation success in genome wide association studies. Hum Genet 141, 229–238 (2022). https://doi.org/10.1007/s00439-021-02407-8

  • DOI: https://doi.org/10.1007/s00439-021-02407-8
