Human Genetics, Volume 130, Issue 5, pp 607–621

The efficacy of detecting variants with small effects on the Affymetrix 6.0 platform using pooled DNA

  • Charleston W. K. Chiang
  • Zofia K. Z. Gajdos
  • Joshua M. Korn
  • Johannah L. Butler
  • Rachel Hackett
  • Candace Guiducci
  • Thutrang T. Nguyen
  • Rainford Wilks
  • Terrence Forrester
  • Katherine D. Henderson
  • Loic Le Marchand
  • Brian E. Henderson
  • Christopher A. Haiman
  • Richard S. Cooper
  • Helen N. Lyon
  • Xiaofeng Zhu
  • Colin A. McKenzie
  • Mark R. Palmert
  • Joel N. Hirschhorn
Original Investigation

Abstract

Genome-wide genotyping of a cohort using pools rather than individual samples has long been proposed as a cost-saving alternative for performing genome-wide association (GWA) studies. However, successful disease gene mapping using pooled genotyping has thus far been limited to detecting common variants with large effect sizes, which tend not to exist for many complex common diseases or traits. Therefore, for DNA pooling to be a viable strategy for conducting GWA studies, it is important to determine whether commonly used genome-wide SNP array platforms such as the Affymetrix 6.0 array can reliably detect common variants of small effect sizes using pooled DNA. Taking obesity and age at menarche as examples of human complex traits, we assessed the feasibility of genome-wide genotyping of pooled DNA as a single-stage design for phenotype association. By individually genotyping the top associations identified by pooling, we obtained a 14- to 16-fold enrichment of SNPs nominally associated with the phenotype, but we likely missed the top true associations. In addition, we assessed whether genotyping pooled DNA can serve as an inexpensive screen as the second stage of a multi-stage design with a large number of samples by comparing the most cost-effective 3-stage designs with 80% power to detect common variants with genotypic relative risk of 1.1, with and without pooling. Given the current state of the specific technology we employed and the associated genotyping costs, we showed through simulation that a design involving pooling would be 1.07 times more expensive than a design without pooling. Thus, while a significant amount of information exists within the data from pooled DNA, our analysis does not support genotyping pooled DNA as a means to efficiently identify common variants contributing small effects to phenotypes of interest. 
While our conclusions were based on the specific technology and study design we employed, the approach presented here will be useful for evaluating the utility of other or future genome-wide genotyping platforms in pooled DNA studies.
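To give a sense of the scale implied by 80% power at a genotypic relative risk of 1.1, the following is a minimal sketch of an approximate power calculation for a single-stage allelic case-control test on individually genotyped samples. It is not the simulation used in the study: the multiplicative risk model, the normal approximation, the rare-disease approximation for controls, and the example values of allele frequency and significance threshold are all assumptions made for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative model.

    p   : risk-allele frequency in controls (assumed close to the population)
    grr : genotypic relative risk per copy of the risk allele
    """
    return p * grr / (1 - p + p * grr)


def allelic_test_power(n_cases, n_controls, p, grr, alpha):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    nd = NormalDist()
    p_case = case_allele_freq(p, grr)
    # Each individual contributes two alleles.
    se = (p_case * (1 - p_case) / (2 * n_cases)
          + p * (1 - p) / (2 * n_controls)) ** 0.5
    ncp = (p_case - p) / se            # expected z statistic
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


def min_n_per_group(p, grr, alpha, target_power=0.8, step=100):
    """Smallest balanced sample size (in steps of `step`) reaching target power."""
    n = step
    while allelic_test_power(n, n, p, grr, alpha) < target_power:
        n += step
    return n


if __name__ == "__main__":
    # Hypothetical inputs: allele frequency 0.3, GRR 1.1, alpha 5e-8.
    n = min_n_per_group(p=0.3, grr=1.1, alpha=5e-8)
    print(f"cases (and controls) needed: {n}")
    print(f"power at that size: {allelic_test_power(n, n, 0.3, 1.1, 5e-8):.3f}")
```

Pooling adds allele-frequency measurement error on top of the sampling error modeled here, so a pooled design would be expected to need more samples than this sketch suggests for the same power.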

Abbreviations

  • AF: Allele frequency
  • GRR: Genotypic relative risk
  • GWA: Genome-wide association
  • LD: Linkage disequilibrium
  • MEC: Multi-Ethnic Cohort
  • QC: Quality control

Supplementary material

439_2011_974_MOESM1_ESM.doc: Supplementary material (DOC, 28 kb)

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Charleston W. K. Chiang (1, 2, 3)
  • Zofia K. Z. Gajdos (1, 2, 3)
  • Joshua M. Korn (2, 4, 5)
  • Johannah L. Butler (3)
  • Rachel Hackett (2)
  • Candace Guiducci (2)
  • Thutrang T. Nguyen (3)
  • Rainford Wilks (6)
  • Terrence Forrester (7)
  • Katherine D. Henderson (8)
  • Loic Le Marchand (9)
  • Brian E. Henderson (10)
  • Christopher A. Haiman (10)
  • Richard S. Cooper (11)
  • Helen N. Lyon (2, 3)
  • Xiaofeng Zhu (12)
  • Colin A. McKenzie (7)
  • Mark R. Palmert (13, 14)
  • Joel N. Hirschhorn (1, 2, 3, 15)

  1. Department of Genetics, Harvard Medical School, Boston, USA
  2. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, USA
  3. Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, USA
  4. Center for Human Genetic Research, Massachusetts General Hospital, Boston, USA
  5. Department of Molecular Biology, Massachusetts General Hospital, Boston, USA
  6. Epidemiology Research Unit, Tropical Medicine Research Institute, University of the West Indies, Kingston, Jamaica
  7. Tropical Metabolism Research Unit, Tropical Medicine Research Institute, University of the West Indies, Kingston, Jamaica
  8. Division of Cancer Etiology, Department of Population Sciences, City of Hope National Medical Center, Duarte, USA
  9. Epidemiology Program, Cancer Research Center of Hawaii, University of Hawaii, Honolulu, USA
  10. Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, USA
  11. Department of Preventive Medicine and Epidemiology, Stritch School of Medicine, Loyola University Chicago, Maywood, USA
  12. Department of Biostatistics and Epidemiology, Case Western Reserve University, Cleveland, USA
  13. Division of Endocrinology, The Hospital for Sick Children, Toronto, Canada
  14. Department of Pediatrics, University of Toronto, Toronto, Canada
  15. Children’s Hospital Boston, Boston, USA
