Abstract
Contemporary sequencing studies often ignore the diploid nature of the human genome because they do not routinely separate or 'phase' maternally and paternally derived sequence information. However, many findings — both from recent studies and in the more established medical genetics literature — indicate that relationships between human DNA sequence and phenotype, including disease, can be more fully understood with phase information. Thus, the existing technological impediments to obtaining phase information must be overcome if human genomics is to reach its full potential.
Acknowledgements
This work was supported, in part, by the following research grants: U19 AG023122-01, R01 MH078151-01A1, N01 MH22005, U01 DA024417-01, P50 MH081755-01 and UL1 RR025774, as well as the Price Foundation and Scripps Genomic Medicine. This work is the authors' sole responsibility and does not necessarily represent funding agencies' views.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
About this article
Cite this article
Tewhey, R., Bansal, V., Torkamani, A. et al. The importance of phase information for human genomics. Nat Rev Genet 12, 215–223 (2011). https://doi.org/10.1038/nrg2950
DOI: https://doi.org/10.1038/nrg2950