Abstract
There is uncertainty about the true nature of predicted single-nucleotide polymorphisms (SNPs) in segmental duplications (duplicons) and whether these markers genuinely exist at increased density as indicated in public databases. We explored these issues by genotyping 157 predicted SNPs in duplicons and control regions in normal diploid genomes and fully homozygous complete hydatidiform moles. Our data identified many true SNPs in duplicon regions and few paralogous sequence variants. Twenty-eight percent of the polymorphic duplicon sequences we tested involved multisite variation, a new type of polymorphism representing the sum of the signals from many individual duplicon copies that vary in sequence content due to duplication, deletion or gene conversion. Multisite variations can masquerade as normal SNPs when genotyped. Given that duplicons comprise at least 5% of the genome and many are yet to be annotated in the genome draft, effective strategies to identify multisite variation must be established and deployed.
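One way to see why a multisite variation can pass for an ordinary SNP is to compare genotype calls from the fully homozygous complete hydatidiform moles (CHMs) with calls from normal diploid genomes. The sketch below is an illustration only, not the authors' analysis pipeline: the decision rule, the function name `classify_marker` and the 'AA'/'AB'/'BB' call encoding are all assumptions made for this example. It rests on one observation: a fully homozygous genome cannot carry two alleles at a single locus, so a two-signal call in a CHM must reflect sequence differences between duplicon copies.

```python
from typing import Sequence

HET = "AB"  # hypothetical label for a two-signal (heterozygous-looking) call


def classify_marker(chm_calls: Sequence[str], diploid_calls: Sequence[str]) -> str:
    """Assign a predicted SNP to a rough category (illustrative rule only).

    chm_calls:     genotype calls in complete hydatidiform moles
    diploid_calls: genotype calls in normal diploid genomes
    Calls use the assumed encoding 'AA', 'AB' or 'BB'.
    """
    if not chm_calls:
        raise ValueError("at least one CHM genotype call is required")

    all_calls = list(chm_calls) + list(diploid_calls)

    # No two-signal call in any homozygous genome: consistent with a
    # single-locus marker, either invariant or a true SNP.
    if HET not in chm_calls:
        if len(set(all_calls)) == 1:
            return "monomorphic"
        return "SNP"

    # Every sample shows both signals: consistent with a fixed difference
    # between duplicon copies (paralogous sequence variant).
    if all(call == HET for call in all_calls):
        return "PSV"

    # Two signals in some homozygous genomes, but the pattern varies between
    # samples: consistent with copies that differ among individuals
    # (multisite variation).
    return "MSV"


print(classify_marker(["AA", "BB", "AA"], ["AA", "AB", "BB"]))
print(classify_marker(["AB", "AB"], ["AB", "AB"]))
print(classify_marker(["AB", "AA", "AB"], ["AB", "AB", "BB"]))
```

Under this assumed rule, the third marker would look like a normal SNP if only the diploid samples were genotyped, because they show all three genotype classes; only the two-signal calls in the CHMs reveal that more than one duplicon copy contributes to the signal.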
Acknowledgements
We thank R.J. Fisher and M. Seckl for CHM DNA samples and R.A. Clark, S. Sawyer and C. Lagerberg for technical assistance. Funding was provided by Pfizer Corporation and Stiftelsen för Kompetens- och Kunskapsutveckling (to D.F. and A.J.B.) and by the US National Institutes of Health (to E.E.E.).
Ethics declarations
Competing interests
A.J.B. declares share interests in Dynametrix Ltd.
Supplementary information
Supplementary Fig. 1
Average raw MLPA signal strength correlates with target sequence copy number. (PDF 198 kb)
Supplementary Fig. 2
MLPA and DASH correlation. (PDF 208 kb)
About this article
Cite this article
Fredman, D., White, S., Potter, S. et al. Complex SNP-related sequence variation in segmental genome duplications. Nat Genet 36, 861–866 (2004). https://doi.org/10.1038/ng1401
DOI: https://doi.org/10.1038/ng1401
- Springer Nature America, Inc.