Genome-Wide Association Studies—Data Generation, Storage, Interpretation, and Bioinformatics

Journal of Cardiovascular Translational Research

Abstract

Genome-wide association studies (GWAS) have had great success in identifying common genetic determinants of disease. One of the challenges posed by GWAS is the analysis of the large amount of data they generate. This review aims to provide non-geneticists with an overview of the steps involved in analyzing GWAS data, with an emphasis on popular, readily available bioinformatics tools. GWAS data generation, analysis, and interpretation are covered.

Author information

Corresponding author

Correspondence to Guillaume Pare.

About this article

Cite this article

Pare, G. Genome-Wide Association Studies—Data Generation, Storage, Interpretation, and Bioinformatics. J. of Cardiovasc. Trans. Res. 3, 183–188 (2010). https://doi.org/10.1007/s12265-010-9181-y

  • DOI: https://doi.org/10.1007/s12265-010-9181-y
