Chromosome Research, Volume 23, Issue 3, pp 421–426

Completing the human genome: the progress and challenge of satellite DNA assembly

Review

Abstract

Genomic studies rely on accurate chromosome assemblies to explore sequence-based models of cell biology, evolution and biomedical disease. However, even the extensively studied human genome has not yet reached a complete, ‘telomere-to-telomere’, chromosome assembly. The largest assembly gaps remain in centromeric regions and acrocentric short arms, sites known to contain megabase-sized arrays of tandem repeats, or satellite DNAs. This review aims to briefly address the progress and challenges of generating correct assemblies of satellite DNA arrays. Although the focus is placed on the human genome, many concepts presented here are applicable to other genomes.

Keywords

Satellite DNA; assembly; centromere; pericentromeric heterochromatin; acrocentric; repeats

Abbreviations

HVGM: Human Variation Genome Map

ONT: Oxford Nanopore Technology

PacBio: Pacific Biosciences

rDNA: Ribosomal DNA

WGS: Whole genome shotgun

Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  1. Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, USA
