Abstract
Exome sequencing identifies thousands of DNA variants, a proportion of which are involved in disease. Genotypes derived from exome sequences provide particularly high-resolution coverage, enabling study of the linkage disequilibrium structure of individual genes. The extent and strength of linkage disequilibrium reflect the combined influences of mutation, recombination, selection and population history. By constructing linkage disequilibrium maps of individual genes, we show that genes containing OMIM-listed disease variants are significantly under-represented amongst genes with complete or very strong linkage disequilibrium (P = 0.0004). In contrast, genes with disease variants are significantly over-represented amongst genes with levels of linkage disequilibrium close to the average for genes not known to contain disease variants (P = 0.0038). Functional clustering reveals significant enrichment of essential biological functions (e.g. phosphorylation, cell division, cellular transport and metabolic processes) amongst genes with particularly strong linkage disequilibrium. Strong linkage disequilibrium, corresponding to reduced haplotype diversity, may reflect selection in utero against deleterious mutations that have a profound impact on the function of essential genes. Genes with very weak linkage disequilibrium show enrichment of functions requiring greater allelic diversity (e.g. sensory perception and immune response). This category is not enriched for genes containing disease variation. In contrast, there is significant enrichment of genes containing disease variants amongst genes with more average levels of linkage disequilibrium. Mutations in these genes may be less likely to cause in utero lethality and may therefore be subject to less intense selection.
Cite this article
Gibson, J., Tapper, W., Ennis, S. et al. Exome-based linkage disequilibrium maps of individual genes: functional clustering and relationship to disease. Hum Genet 132, 233–243 (2013). https://doi.org/10.1007/s00439-012-1243-6
DOI: https://doi.org/10.1007/s00439-012-1243-6