Abstract
Genome-wide association studies using high-throughput technology are already being conducted, despite significant hurdles that remain to be overcome (Nat Rev Genet 6:95–108, 2005; Nat Rev Genet 6:109–118, 2005). Methods for detecting haplotype association signals in genome-wide haplotype datasets are as yet very limited. Much methodological research has been devoted to linkage disequilibrium (LD) fine mapping, where the focus is the identification of the disease locus rather than the detection of a disease signal. Applications of these approaches to genome-wide scanning are limited by their strong model assumptions about the sharing process, which lead to computational complexity. We describe a new algorithm for the initial identification of disease susceptibility loci in genome-wide haplotype association studies. Excess sharing of ancestral haplotypes, which indicates the presence of a disease locus, is detected with a simple, easy-to-interpret, χ²-based statistic. The method allows genome-wide scanning for qualitative traits within reasonable computational timeframes and can serve as a first-pass analysis before likelihood-based methods are applied, providing candidate regions and inferred susceptibility haplotypes. Our method makes no assumptions regarding the population history or the pattern of background LD. Statistical significance is evaluated with permutation tests. The method is illustrated on simulated and real data, where it is applied to a simple disease (cystic fibrosis) and a complex disease (multiple sclerosis). The statistic has low type I error and greater power to map disease loci than conventional single-marker tests for low to moderate levels of LD.
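The general idea of pairing a χ²-type statistic with a permutation test can be sketched as follows. This is a minimal illustration only, not the sharing statistic described above: it assumes an ordinary Pearson χ² on a 2 × k table of haplotype counts in cases versus controls at a single window, and it shuffles case/control labels to obtain an empirical p-value. The function names `chi2_stat` and `permutation_pvalue`, the toy haplotype strings, and the default of 999 permutations are all hypothetical choices made for this sketch.

```python
import random
from collections import Counter


def chi2_stat(haps, labels):
    """Pearson chi-square for a 2 x k table of haplotype counts.

    haps   : list of haplotype strings, one per chromosome (illustrative input)
    labels : list of 0/1 flags, 1 = case chromosome, 0 = control chromosome
    Assumes at least one case and one control are present.
    """
    n = len(haps)
    n_case = sum(labels)
    n_ctrl = n - n_case
    total_counts = Counter(haps)
    case_counts = Counter(h for h, y in zip(haps, labels) if y == 1)

    stat = 0.0
    for hap, tot in total_counts.items():
        obs_case = case_counts.get(hap, 0)
        obs_ctrl = tot - obs_case
        # Expected counts under no association between haplotype and status
        exp_case = tot * n_case / n
        exp_ctrl = tot * n_ctrl / n
        stat += (obs_case - exp_case) ** 2 / exp_case
        stat += (obs_ctrl - exp_ctrl) ** 2 / exp_ctrl
    return stat


def permutation_pvalue(haps, labels, n_perm=999, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = chi2_stat(haps, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi2_stat(haps, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: one haplotype is over-represented among case chromosomes
    haps = ["A-C-G"] * 30 + ["A-T-G"] * 10 + ["A-C-G"] * 10 + ["A-T-G"] * 30
    labels = [1] * 40 + [0] * 40
    stat, p = permutation_pvalue(haps, labels)
    print(f"chi2 = {stat:.2f}, permutation p = {p:.3f}")
```

Because the null distribution is built from the data by permutation, a sketch like this needs no asymptotic χ² approximation, which is one reason permutation tests suit sparse haplotype tables. In a genome-wide scan the permutations would also have to account for testing many windows, which this single-window example does not attempt.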
References
Abecasis GR, Cherny SS, Cookson WO, Cardon LR (2002) Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30:97–101
Altshuler D, Brooks LD, Chakravarti A, Collins FS, Daly MJ, Donnelly P (2005) A haplotype map of the human genome. Nature 437:1299–1320
Becker T, Knapp M (2004) A powerful strategy to account for multiple testing in the context of haplotype analysis. Am J Hum Genet 75(4):561–570
Besag J, Clifford P (1991) Sequential Monte Carlo p-values. Biometrika 78:301–304
Brown MA, Jones KA, Nicolai H, Bonjardim M, Black D, McFarlane R, de Jong P, Quirk JP, Lehrach H, Solomon E (1995) Physical mapping, cloning, and identification of genes within a 500-kb region containing BRCA1. Proc Natl Acad Sci USA 92:4362–4366
Cheng R, Ma JZ, Elston RC, Li MD (2005) Fine mapping functional sites or regions from case–control data using haplotypes of multiple linked SNPs. Ann Hum Genet 69:102–112
Clark AG, Nielsen R, Signorovitch J, Matise TC, Glanowski S, Heil J, Winn-Deen ES, Holden AL, Lai E (2003) Linkage disequilibrium and inference of ancestral recombination in 538 single-nucleotide polymorphism clusters across the human genome. Am J Hum Genet 73:285–300
Coraddu F, Sawcer S, Feakes R, Chataway J, Broadley S, Jones HB, Clayton D, Gray J, Smith S, Taylor C, Goodfellow PN, Compston A (1998) HLA typing in the United Kingdom multiple sclerosis genome screen. Neurogenetics 2:24–33
Durrant C, Zondervan KT, Cardon LR, Hunt S, Deloukas P, Morris AP (2004) Linkage disequilibrium mapping via cladistic analysis of single-nucleotide polymorphism haplotypes. Am J Hum Genet 75:35–43
Feder JN, Gnirke A, Thomas W, Tsuchihashi Z, Ruddy DA, Basava A, Dormishian F, Domingo R Jr, Ellis MC, Fullan A, Hinton LM, Jones NL, Kimmel BE, Kronmal GS, Lauer P, Lee VK, Loeb DB, Mapa FA, McClelland E, Meyer NC, Mintier GA, Moeller N, Moore T, Morikang E, Wolff RK, et al. (1996) A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat Genet 13:399–408
Hampe J, Wienker T, Schreiber S, Nurnberg P (1998) POPSIM: a general population simulation program. Bioinformatics 14:458–464
Hirschhorn JN, Daly MJ (2005) Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 6:95–108
Houwen RH, Baharloo S, Blankenship K, Raeymaekers P, Juyn J, Sandkuijl LA, Freimer NB (1994) Genome screening by searching for shared segments: mapping a gene for benign recurrent intrahepatic cholestasis. Nat Genet 8:380–386
Kerem B, Rommens JM, Buchanan JA, Markiewicz D, Cox TK, Chakravarti A, Buchwald M, Tsui LC (1989) Identification of the cystic fibrosis gene: genetic analysis. Science 245:1073–1080
Klein RJ, Zeiss C, Chew EY, Tsai JY, Sackler RS, Haynes C, Henning AK, SanGiovanni JP, Mane SM, Mayne ST, Bracken MB, Ferris FL, Ott J, Barnstable C, Hoh J (2005) Complement factor H polymorphism in age-related macular degeneration. Science 308:385–389
Knight MA, McKinlay Gardner RJ, Bahlo M, Matsuura T, Dixon JA, Forrest SM, Storey E (2005) Dominantly inherited ataxia and dysphonia with dentate calcification: spinocerebellar ataxia type 20. Brain 127:1172–1181
Lander E, Kruglyak L (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet 11:241–247
Lin S, Chakravarti A, Cutler DJ (2004a) Exhaustive allelic transmission disequilibrium tests as a new approach to genome-wide association studies. Nat Genet 36:1181–1188
Lin S, Chakravarti A, Cutler DJ (2004b) Haplotype and missing data inference in nuclear families. Genome Res 14:1624–1632
Liu JS, Sabatti C, Teng J, Keats BJ, Risch N (2001) Bayesian analysis of haplotypes for linkage disequilibrium mapping. Genome Res 11:1716–1724
McPeek MS, Strahs A (1999) Assessment of linkage disequilibrium by the decay of haplotype sharing, with application to fine-scale genetic mapping. Am J Hum Genet 65:858–875
Meuwissen TH, Goddard ME (2000) Fine mapping of quantitative trait loci using linkage disequilibria with closely linked marker loci. Genetics 155:421–430
Morris AP, Whittaker JC, Balding DJ (2000) Bayesian fine-scale mapping of disease loci, by hidden Markov models. Am J Hum Genet 67:155–169
Ophoff RA, Escamilla MA, Service SK, Spesny M, Meshi DB, Poon W, Molina J, Fournier E, Gallegos A, Mathews C, Neylan T, Batki SL, Roche E, Ramirez M, Silva S, De Mille MC, Dong P, Leon PE, Reus VI, Sandkuijl LA, Freimer NB (2002) Genomewide linkage disequilibrium mapping of severe bipolar disorder in a population isolate. Am J Hum Genet 71:565–574
Reeve JP, Rannala B (2002) DMLE+: Bayesian linkage disequilibrium gene mapping. Bioinformatics 18:894–895
Risch NJ (2000) Searching for genetic determinants in the new millennium. Nature 405:847–856
Rubio JP, Bahlo M, Butzkueven H, van Der Mei IA, Sale MM, Dickinson JL, Groom P, Johnson LJ, Simmons RD, Tait B, Varney M, Taylor B, Dwyer T, Williamson R, Gough NM, Kilpatrick TJ, Speed TP, Foote SJ (2002) Genetic dissection of the human leukocyte antigen region by use of haplotypes of Tasmanians with multiple sclerosis. Am J Hum Genet 70:1125–1137
Rubio JP, Bahlo M, Tubridy N, Stankovich J, Burfoot R, Butzkueven H, Chapman C, Johnson L, Marriott M, Mraz G, Tait B, Wilkinson C, Taylor B, Speed TP, Foote SJ, Kilpatrick TJ (2004) Extended haplotype analysis in the HLA complex reveals an increased frequency of the HFE-C282Y mutation in individuals with multiple sclerosis. Hum Genet 114:573–580
Schaid DJ, McDonnell SK, Wang L, Cunningham JM, Thibodeau SN (2002) Caution on pedigree haplotype inference with software that assumes linkage equilibrium. Am J Hum Genet 71:992–995
Sham PC, Curtis D (1995) Monte Carlo tests for associations between disease and alleles at highly polymorphic loci. Ann Hum Genet 59(Pt 1):97–105
Stankovich J, Bahlo M, Rubio JP, Wilkinson CR, Thomson R, Banks A, Ring M, Foote SJ, Speed TP (2005) Identifying nineteenth century genealogical links from genotypes. Hum Genet 117:188–199
Stephens M, Smith NJ, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68:978–989
Stewart GJ, Teutsch SM, Castle M, Heard RN, Bennetts BH (1997) HLA-DR, -DQA1 and -DQB1 associations in Australian multiple sclerosis patients. Eur J Immunogenet 24:81–92
Terwilliger JD, Zollner S, Laan M, Paabo S (1998) Mapping genes through the use of linkage disequilibrium generated by genetic drift: ‘drift mapping’ in small populations with no demographic expansion. Hum Hered 48:138–154
Thomas DC, Haile RW, Duggan D (2005) Recent developments in genomewide association scans: a workshop summary and review. Am J Hum Genet 77:337–345
Toivonen HT, Onkamo P, Vasko K, Ollikainen V, Sevon P, Mannila H, Herr M, Kere J (2000) Data mining applied to linkage disequilibrium mapping. Am J Hum Genet 67:133–145
Wang WY, Barratt BJ, Clayton DG, Todd JA (2005) Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 6:109–118
Weeks DE, Conley YP, Ferrell RE, Mah TS, Gorin MB (2002) A tale of two genotypes: consistency between two high-throughput genotyping centers. Genome Res 12:430–435
Zollner S, von Haeseler A (2000) A coalescent approach to study linkage disequilibrium between single-nucleotide polymorphisms. Am J Hum Genet 66:615–628
Acknowledgements
We would like to thank the Tasmanian MS sufferers and their family members for participating in our study. We would also like to thank Prof. Trevor Kilpatrick, Dr. Helmut Butzkueven and Dr. Bruce Taylor, who were the consulting neurologists, and to acknowledge Ingileif Hallgrimsdottir, Department of Statistics, UC Berkeley, for useful discussions on the development of the algorithm. Dr. Melanie Bahlo is supported by the Australian National Health and Medical Research Council (NHMRC). Prof. Simon Foote, Prof. Terry Speed and Dr. Justin Rubio are fellows of the NHMRC. The MS study was funded by the NHMRC, MS Australia and the Co-operative Research Centre (CRC) for the Discovery of Genes for common Human Diseases.
Cite this article
Bahlo, M., Stankovich, J., Speed, T.P. et al. Detecting genome wide haplotype sharing using SNP or microsatellite haplotype data. Hum Genet 119, 38–50 (2006). https://doi.org/10.1007/s00439-005-0114-9
DOI: https://doi.org/10.1007/s00439-005-0114-9