Does haplotype diversity predict power for association mapping of disease susceptibility?

  • Original Investigation
Human Genetics

Abstract

Many recent studies have established that haplotype diversity in a small region may not be greatly diminished when the number of markers is reduced to a smaller set of “haplotype-tagging” single-nucleotide polymorphisms (SNPs) that identify the most common haplotypes. These studies are motivated by the assumption that retention of haplotype diversity assures retention of power for mapping disease susceptibility by allelic association. Using two bodies of real data, three proposed measures of diversity, and regression-based methods for association mapping, we found no scenario for which this assumption was tenable. We compared the chi-square for composite likelihood and the maximum chi-square for single SNPs in diplotypes, excluding the marker designated as causal. All haplotype-tagging methods conserve haplotype diversity by selecting common SNPs. When the causal marker has a range of allele frequencies as in real data, chi-square decreases faster than under random selection as the haplotype-tagging set diminishes. Selecting SNPs by maximizing haplotype diversity is inefficient when their frequency is much different from the unknown frequency of the causal variant. Loss of power is minimized when the difference between minor allele frequencies of the causal SNP and a closely associated marker SNP is small, which is unlikely in ignorance of the frequency of the causal SNP unless dense markers are used. Therefore retention of haplotype diversity in simulations that do not mirror genomic allele frequencies has no relevance to power for association mapping. TagSNPs that are assigned to bins instead of haplotype blocks also lose power compared with random SNPs. This evidence favours a multi-stage design in which both models and density change adaptively.
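The dependence on allele-frequency matching described above can be illustrated with a standard population-genetics bound; this is a minimal sketch and not the regression or composite-likelihood analysis used in the article. For two biallelic SNPs whose minor alleles are positively associated, the squared correlation r² cannot exceed p_lo(1 − p_hi) / [p_hi(1 − p_lo)], where p_lo and p_hi are the smaller and larger minor allele frequencies. Under the common approximation that the association chi-square at a marker is scaled by r² relative to the causal site, this bound caps the signal a marker can recover. The function name `max_r2` and the example frequencies are illustrative assumptions.

```python
def max_r2(p_causal: float, p_marker: float) -> float:
    """Upper bound on r^2 between two biallelic SNPs.

    Assumes both arguments are minor allele frequencies (0 < p <= 0.5)
    and that the two minor alleles are positively associated (D > 0).
    This is a textbook bound, not the method of the article.
    """
    for p in (p_causal, p_marker):
        if not 0.0 < p <= 0.5:
            raise ValueError("minor allele frequency must be in (0, 0.5]")
    lo, hi = sorted((p_causal, p_marker))
    # D_max = lo * (1 - hi); dividing D_max^2 by the product of the
    # allelic variances lo(1-lo) * hi(1-hi) gives the ratio below.
    return (lo * (1.0 - hi)) / (hi * (1.0 - lo))


if __name__ == "__main__":
    # Hypothetical frequencies: a rare causal variant against markers
    # of increasing minor allele frequency.
    causal_maf = 0.05
    for marker_maf in (0.05, 0.10, 0.20, 0.40):
        bound = max_r2(causal_maf, marker_maf)
        print(f"causal MAF {causal_maf:.2f}, marker MAF {marker_maf:.2f}: "
              f"max r^2 = {bound:.3f}")
```

The bound equals 1 only when the two frequencies coincide and falls quickly as they diverge, which is consistent with the observation that common tagging SNPs recover little signal from a causal variant of much lower frequency.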

Acknowledgements

We are grateful to Alec Jeffreys and Mark Daly for making their data publicly available. We thank Daniel Stram and Kui Zhang for the tagSNPs and HapBlock programs and suggestions in using them. This work was supported by a grant from the Medical Research Council.

Corresponding author

Correspondence to Newton E. Morton.

Cite this article

Zhang, W., Collins, A. & Morton, N.E. Does haplotype diversity predict power for association mapping of disease susceptibility? Hum Genet 115, 157–164 (2004). https://doi.org/10.1007/s00439-004-1122-x
