Abstract
Genomic studies rely on accurate chromosome assemblies to explore sequence-based models of cell biology, evolution and biomedical disease. However, even the extensively studied human genome has not yet reached a complete, ‘telomere-to-telomere’, chromosome assembly. The largest assembly gaps remain in centromeric regions and acrocentric short arms, sites known to contain megabase-sized arrays of tandem repeats, or satellite DNAs. This review aims to briefly address the progress and challenges of generating correct assemblies of satellite DNA arrays. Although the focus is placed on the human genome, many concepts presented here are applicable to other genomes.
Abbreviations
- HVGM: Human Variation Genome Map
- ONT: Oxford Nanopore Technology
- PacBio: Pacific Biosciences
- rDNA: Ribosomal DNA
- WGS: Whole genome shotgun
Additional information
Responsible Editors: Maria Assunta Biscotti, Pat Heslop-Harrison and Ettore Olmo
Cite this article
Miga, K.H. Completing the human genome: the progress and challenge of satellite DNA assembly. Chromosome Res 23, 421–426 (2015). https://doi.org/10.1007/s10577-015-9488-2
DOI: https://doi.org/10.1007/s10577-015-9488-2