Chromosome Research, Volume 24, Issue 3, pp 421–436

Clusters of alpha satellite on human chromosome 21 are dispersed far onto the short arm and lack ancient layers

  • William Ziccardi
  • Chongjian Zhao
  • Valery Shepelev
  • Lev Uralsky
  • Ivan Alexandrov
  • Tatyana Andreeva
  • Evgeny Rogaev
  • Christopher Bun
  • Emily Miller
  • Catherine Putonti
  • Jeffrey Doering
Original Article


Human alpha satellite (AS) sequence domains that currently function as centromeres are typically flanked by layers of evolutionarily older AS that presumably represent the remnants of earlier primate centromeres. Studies on several human chromosomes reveal that these older AS arrays are arranged in an age gradient, with the oldest arrays farthest from the functional centromere and arrays closer to the centromere being progressively younger. The organization of AS on human chromosome 21 (HC21) has not been well characterized. We used newly available HC21 sequence data and an HC21p YAC map to determine the size, organization, and location of the AS arrays, and compared them to AS arrays found on other chromosomes. We find that the majority of the HC21 AS sequences are present on the p-arm of the chromosome and are organized into at least five distinct, isolated clusters, which are distributed over a larger distance from the functional centromere than is typically seen for AS on other chromosomes. Using both phylogenetic and L1 element age estimations, we find that all of the HC21 AS clusters outside the functional centromere are of a similar, relatively recent evolutionary origin. HC21 contains none of the ancient AS layers associated with early primate evolution that are present on other chromosomes, possibly because the p-arm of HC21 and the other acrocentric chromosomes underwent substantial reorganization about 20 million years ago.


alpha satellite; chromosome 21; chromosome evolution; centromere; acrocentric chromosome; chromosome mapping



Alpha satellite


Human chromosome X


Human chromosome 21


Human chromosome 21 short arm


Higher order repeat


Segmental duplications



Parts of this work were supported by Loyola University Chicago. Other portions were supported by the Government of the Russian Federation (№ 14.B25.31.0033) and by the Russian Science Foundation (№ 14-50-00029). EIR was supported in part by R01 AG029360. IAA was supported by Research Center of Mental Health, Russian Academy of Medical Sciences. We thank Michael Shaffer for assistance with phylogenetic analyses.

Compliance with ethical standards

The experiments in this article comply with the current laws of the countries in which they were performed. This article does not contain any studies with human or animal subjects performed by any of the authors.

Conflict of interest

The authors declare that they have no conflicts of interest.

Supplementary material

10577_2016_9530_MOESM1_ESM.pdf (236 kb)
Supplementary Figure 1 PERCON Maps of Mp3 and Mq1 contigs. Short schematic maps indicating only the monomeric structure of the HOR units and HOR domains, and the complete PERCON maps of each contig are shown. In the Mp3 23-mer HOR, the individual monomers are designated A through W and in the Mq1 13-mer HOR A through J, and various deletions and duplications are indicated. These groups were identified by constructing minimum evolution trees of all the HOR monomers in Mega5 as described in Materials and Methods. In the complete PERCON maps, the indicators shown in the map for each monomer (Materials and Methods) are as follows: monomer type, coordinate of the first nucleotide, coordinate of the last nucleotide, length, direct or reverse (C) strand, identity (%) to the overall AS consensus monomer ALPHA-ALL, alignment score and relative alignment score (rs) to the same consensus monomer, and the result of the 171/172 test, where 171 means a deletion in position 21 of the 172 bp monomer, 172 means no deletion and u means unresolved. In the Mp3 map, the regions peppered with 172 bp monomers are marked brown (stands for yellow-striped) and the regions devoid of 172 bp monomers are marked yellow, thus designating the respective AS layers. The Mp3 HOR cluster is located on the border of the yellow-striped and yellow domains, and each HOR has a yellow-striped and a yellow part, suggesting amplification of a border region. In the Mq1 contig, the D1 and D2 monomers (SF2) are marked azure and the R1 and R2 monomers (SF5) are marked blue. The array of SF2 HORs with one atypical monomer, which is classed variably as Um (unclassed, marked grey) or R1 (SF5), can be seen in each HOR. The number of monomers in each HOR unit and its number from the start of the contig are indicated on the left. Note that the alphabetical letter codes for HORs should be read from bottom to top in each HOR, as AS runs on the reverse strand in both contigs. Rare SF4+ monomers with rs<0.62 are marked grey. It can be seen that these are mostly monomers with abnormal length, which probably explains their low rs score. Also, a single solitary 172 bp monomer in a large yellow region is indicated in red. (PDF 235 kb)
10577_2016_9530_MOESM2_ESM.png (2.1 mb)
Supplementary Figure 2 An example of a mixing test where Mp3 AS monomers mix with yellow and yellow-striped branches of a model tree which contains HCX AS monomers. A minimum evolution phylogenetic tree of all monomers extracted from the pericentromeric region of HCX, aligned and color-coded as described previously (Shepelev et al. 2009), was constructed using Mega5 with default parameters. The branches on the tree go in the following order: dark blue - R1 (SF5), light blue - R2 (SF5), yellow, yellow-striped (marked brown), green (olive-green layer), olive (olive-green layer), red and grey. Mp3 monomers were extracted and aligned in the same manner and color coded as follows. Monomers of the left (42305-54455 bp) and right (129163-185622 bp) flanks deemed yellow-striped and yellow by PERCON analysis (see Suppl. Figure 1) were marked lilac and azure, respectively. Only one representative complete HOR unit (23 monomers, 73849-77762 bp) was used and its monomers were marked by red triangles. It can be seen that yellow and yellow-striped Mp3 monomers mix well with the respective branches of the tree, and the HOR contains both yellow and yellow-striped monomers. (PNG 2.07 mb)
10577_2016_9530_MOESM3_ESM.png (267 kb)
Supplementary Table 1 Size Estimates of α21-II Clusters. The size for each of the AS clusters as estimated from HC21p hybrid cell lines (Zhao 1999), cosmids, BACs, and YACs that span the clusters. The sizes estimated in the YAC digests correspond to the largest bands seen in the hybridization, as it is assumed those bands represent an intact α21-II cluster and not any other sequence. Dashes indicate that the size of the cluster could not be estimated using the given method. The BAC and YAC clones used in this work are not from the same individual (Lyle et al. 2007; Wang et al. 1999), which may account for the marked differences in size estimates. (PNG 267 kb)
10577_2016_9530_MOESM4_ESM.xls (36 kb)
Supplementary Table 2 BLAST Comparisons of AS-Containing Plasmid Clones to HC21 BAC/Cosmid Clones. Plasmid clones containing AS sequences are in the “Marker” column, and the various BAC/cosmid clones are in subsequent columns. The results are given as length of match (in bp), percentage identity match, and number of insertions/deletions in the alignment (shown in parentheses) and are the best results as scored by the E-values of all search results. Results showing a 90 % or greater match by identity are shown in bold. The results of the WAV-17 test (Materials and Methods) and the SF assignment established for each clone are also shown. The WAV-17 test results for the plasmid clones are shown as “negative” or “positive” and the numbers of perfect full-length (101 bp) matches scored in three portions of AS, each extracted from 50 million raw WAV-17 reads, are indicated. For the BAC/cosmid clones, the number of matches in only one such portion is indicated. The SF assignments into SF2, SF4+ and SF5 were established by the PERCON program, and the differentiation of SF4+ into yellow (y) and yellow-striped (ys) was achieved by rs<0.62 and 171/172 analyses, as described in Materials and Methods and Table 1. FP565424 contains an SF2 13-mer HOR which is composed of the same types of monomers as the major HC21 D21Z1 11-mer HOR. The monomers of the same types are ~94 % identical, and the 13-mer has extra copies of monomers A and I (see Suppl. Figure 1 for detailed maps). * CU638690 has an ~340 bp internal duplication, with positions 517-861 being 99 % identical to positions 170-516. With this duplication removed, pTRA-1 is 99 % identical to CU638690. (XLS 36.5 kb)
10577_2016_9530_MOESM5_ESM.xlsx (13 kb)
Supplementary Table 3 Repetitive Sequences in HC21p BAC/Cosmid Clones. The table lists all major repetitive sequences detected by RepeatMasker. The type of repetitive sequence that comprises each cluster is identified as well as its start and stop points in the sequence and total cluster length (in bp). Only L1 inserts embedded in or directly adjacent to AS clusters are listed. The SatIII cluster in AF254982 is Group I while that in CU638690 is Group 2 (Bandyopadhyay et al. 2001). (XLSX 12.9 kb)
10577_2016_9530_MOESM6_ESM.xls (28 kb)
Supplementary Table 4 HC21 AS Reference models. HC21 reference models listed in the hg38 assembly, with SF assignments, results of the WAV-17 test and identities to known HC21 loci or clones, shown in the same way as described in Suppl. Table 2. (XLS 27.5 kb)
10577_2016_9530_MOESM7_ESM.xlsx (17 kb)
Supplementary Table 5 L1 Insertions in HC21p BAC/Cosmid Clones. Insertions are listed by L1 family type, start and stop positions in the clone or contig, total size of the insertion, and position of the insertion relative to AS sequences in the clones or contigs (embedded in an AS cluster, directly adjacent to an AS cluster, or free from association with any AS sequence). Those L1 insertions with an origin older than L1PA3 were defined as ancient, while modern L1s were defined as great ape-specific L1 insertions with recent evolutionary origins. The only embedded L1 of ancient origin (highlighted) is in the older (yellow-striped) part of the Mp3 contig. Full length inserts are denoted by a * in the Full Length column. (XLSX 16.7 kb)
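The ancient/modern split described in the legend to Supplementary Table 5 can be illustrated with a minimal Python sketch. It assumes RepeatMasker-style subfamily names (L1HS, L1PA2, L1PA3, ..., L1PB, L1M), the usual convention that a higher L1PA number means an older subfamily, and that L1PA3 itself counts as modern (a strict reading of "older than L1PA3"). The function name `classify_l1` is hypothetical and is not taken from the paper.

```python
import re


def classify_l1(family: str) -> str:
    """Label an L1 subfamily name as 'modern' or 'ancient'.

    Assumption: L1HS, L1PA2 and L1PA3 are 'modern'; every other L1
    subfamily (L1PA4 and higher, L1PB, L1M...) is 'ancient'.
    """
    name = family.strip().upper()
    if not name.startswith("L1"):
        raise ValueError(f"not an L1 subfamily name: {family!r}")
    if name == "L1HS":
        return "modern"
    match = re.fullmatch(r"L1PA(\d+)", name)
    if match and int(match.group(1)) <= 3:
        return "modern"
    # Everything else is treated as older than L1PA3.
    return "ancient"


if __name__ == "__main__":
    for fam in ["L1HS", "L1PA2", "L1PA3", "L1PA4", "L1PB1", "L1MA3"]:
        print(fam, classify_l1(fam))
```

Whether a boundary subfamily such as L1PA3 belongs on the modern or ancient side is a choice made here for illustration; the table itself is the authoritative record of how each insert was classed.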
10577_2016_9530_MOESM8_ESM.xls (50 kb)
Supplementary Table 6 rs Statistics for AS Dead Layers. Rs statistics for specific regions of HCX, whole HCX, HC8 and HC17, and the HC21 long clones and contigs used in this paper were determined as described in Materials and Methods. The specific regions of HCX were mapped in Shepelev et al. (2009) as shown in Figure 5. Their coordinates in the hg38 assembly were as follows: Xp yellow plus yellow-striped layers (old AS) chrX:58445813-58520563, Xp olive-green (the youngest of the ancient layers) chrX:58332483-58444212, Xp red plus grey (yet older ancient layers) chrX:58061517-58332386, Xq yellow plus yellow-striped (old) chrX:62549232-62590077; chrX:62644083-62696584, Xq red (ancient) chrX:62700017-62821704. For whole chromosome analysis the following regions were used: Xp chrX:58061517-58555578, Xq chrX:62462545-62821704, 8p chr8:43572234-43983741, 8q chr8:45927268-46544749, 17p chr17:22232104-22763679 and 17q chr17:26639872-27198582. Only SF4+ monomers with length 90 bp or longer, as identified by PERCON, were analyzed and distributed into rs classes with incremental values. The table shows the number of monomers in each rs class and summary statistics at the bottom. (XLS 50.5 kb)
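The filtering and binning step described in the legend to Supplementary Table 6 can be sketched in Python as follows. The 90 bp length filter, the SF4+ restriction and the rs<0.62 cutoff come from the legends; the class width of 0.05 and the `(sf, length, rs)` tuple layout are assumptions made for illustration only, and the sketch does not reproduce the PERCON output format.

```python
from collections import Counter

MIN_LENGTH_BP = 90     # from the legend: monomers of 90 bp or longer
LOW_RS_CUTOFF = 0.62   # from the legends: rs<0.62 marks low-scoring monomers
CLASS_WIDTH = 0.05     # hypothetical class width, not stated in the legend


def rs_class_counts(monomers):
    """Count SF4+ monomers of at least MIN_LENGTH_BP per rs class.

    `monomers` is an iterable of (sf, length_bp, rs) tuples (assumed layout).
    Keys of the result are the lower bounds of each rs class.
    """
    counts = Counter()
    for sf, length_bp, rs in monomers:
        if sf != "SF4+" or length_bp < MIN_LENGTH_BP:
            continue
        lower_bound = round(int(rs / CLASS_WIDTH) * CLASS_WIDTH, 2)
        counts[lower_bound] += 1
    return counts


def low_rs_fraction(monomers):
    """Fraction of the retained monomers with rs below LOW_RS_CUTOFF."""
    kept = [rs for sf, length_bp, rs in monomers
            if sf == "SF4+" and length_bp >= MIN_LENGTH_BP]
    if not kept:
        return 0.0
    return sum(1 for rs in kept if rs < LOW_RS_CUTOFF) / len(kept)


if __name__ == "__main__":
    # Made-up example records, not data from the paper.
    example = [("SF4+", 171, 0.58), ("SF4+", 172, 0.71),
               ("SF4+", 80, 0.50), ("SF2", 171, 0.90)]
    print(rs_class_counts(example))
    print(low_rs_fraction(example))
```

The example records are invented; the real per-class counts are those given in the table.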
10577_2016_9530_MOESM9_ESM.xlsx (20 kb)
Supplementary Table 7 rs Analysis of WAV17. The data in this table were used in Figure 4 and were generated as described in the legend to that figure and in the text. The results for whole pericentromeric regions of chromosomes 8, 17 and X were obtained as described in the legend to Suppl. Figure 6 and processed as 100 bp reads. “Real ancient” is the real number of ancient monomers taken from the maps of chromosome X as in Shepelev et al. (2009); it is calculated as the difference between a given version and the shortest truncated version (old). The overall ratio of real ancient monomers to monomers with rs<0.62, established for the whole HCX, was used to calculate predicted proportions (% predicted ancient) of ancient monomers in HC21 (WAV-17), human genomic DNA and chromosomes 8 and 17. Controls show that predicted values can be both higher and lower than real ones depending on the composition of the ancient layers. Because the calculation of “predicted ancient” is based on a crude approximate ratio, the resulting percentages may slightly exceed 100 % (see chromosomes 8 and 17). This reflects the fact that the old domains on these chromosomes are rather small and the ancient domains are rather large and are composed mostly of the older ancient layers with a higher proportion of rs<0.62 monomers (Shepelev et al. 2009). (XLSX 20.2 kb)
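The "predicted ancient" arithmetic described in the legend to Supplementary Table 7 can be sketched as below. This is one reading of the legend, not the authors' code: the reference ratio is taken as real ancient monomers divided by rs<0.62 monomers on HCX, and the percentage is assumed to be expressed relative to all analyzed monomers in the sample. The function name and all numbers are hypothetical.

```python
def predicted_ancient_percent(low_rs_sample, total_sample,
                              real_ancient_ref, low_rs_ref):
    """Estimate the percentage of ancient monomers in a sample.

    ratio = real ancient monomers / monomers with rs<0.62, both counted
    on the reference (whole HCX in the legend). The sample's low-rs count
    is scaled by this ratio and expressed as a percentage of all analyzed
    monomers in the sample (an assumption about the denominator).
    """
    if low_rs_ref <= 0 or total_sample <= 0:
        raise ValueError("reference low-rs count and sample total must be positive")
    ratio = real_ancient_ref / low_rs_ref
    return 100.0 * low_rs_sample * ratio / total_sample


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(predicted_ancient_percent(10, 200, 300, 100))   # small ancient share
    print(predicted_ancient_percent(40, 100, 300, 100))   # can exceed 100 %
```

Because a single ratio is applied to every sample, a sample richer in low-rs monomers than the reference can yield a value above 100 %, which matches the caveat in the legend for chromosomes 8 and 17.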
10577_2016_9530_MOESM10_ESM.xls (46 kb)
Supplementary Table 8 rs Analysis of whole chromosome assemblies of human acrocentric chromosomes. The left part of this table is identical to the left part of Suppl. Table 6 and shows the analysis of control sequences. The right part shows the rs analysis of the whole hg38 assemblies of human acrocentric chromosomes 13, 14, 15, 21 and 22 processed as described in Materials and Methods and in Suppl. Table 6. (XLS 46.5 kb)


  1. Alexandrov I, Kazakov A, Tumeneva I, Shepelev V, Yurov Y (2001) Alpha-satellite DNA of primates: old and new families. Chromosoma 110:253–266
  2. Bozovsky MR, Shukai SA, Cummings MR, Doering JL (2004) Organization of the regions flanking the centromere of human chromosome 21. [abstract 1567]. Available from
  3. Brun M-E, Ruault M, Ventura M, Roizes G, De Sario A (2003) Juxtacentromeric region of human chromosome 21: a boundary between centromeric heterochromatin and euchromatic chromosome arms. Gene 312:41–50
  4. Bun C, Ziccardi W, Doering JL, Putonti C (2012) MiIP: the monomer identification and isolation program. Evol Bioinformatics Online 8:293–300
  5. Cardone MF, Ballarati L, Ventura M, Rocchi M, Marozzi A, Ginelli E, Meneveri R (2004) Evolution of beta satellite DNA sequences: evidence for duplication-mediated repeat amplification and spreading. Mol Biol Evol 21:1792–1799
  6. Carnahan SL, Palamidis-Bourtsos E, Musich PR, Doering JL (1993) Characterization of an evolutionarily old human alphoid DNA. Gene 123:219–225
  7. Choo KH (1990) Role of acrocentric cen-pter satellite DNA in Robertsonian translocation and chromosomal non-disjunction. Mol Biol Med 7:437–449
  8. Choo KH, Vissel B, Brown R, Filby RG, Earle E (1988) Homologous alpha satellite sequences on human acrocentric chromosomes with selectivity for chromosomes 13, 14 and 21: implications for recombination between nonhomologues and Robertsonian translocations. Nucleic Acids Res 16:1273–1284
  9. Doering JD, Jelachich ML, Hanlon KM (1982) Identification and genomic organization of human tRNALys genes. FEBS Lett 146:1620–1624
  10. Gouy M, Guindon S, Gascuel O (2010) SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol 27:221–224
  11. Hayden KE, Strome ED, Merrett SL, Lee HR, Rudd MK, Willard HF (2013) Sequences associated with centromere competency in the human genome. Mol Cell Biol 33:763–772
  12. Ikeno M, Masumoto H, Okazaki T (1994) Distribution of CENP-B boxes reflected in CREST centromeric antigenic sites on long range α-satellite DNA arrays of human chromosome 21. Hum Mol Genet 3:1245–1257
  13. Jarmuz M, Glotzbach CD, Bailey KA, Bandyopadhyay R, Shaffer LG (2007) The evolution of satellite III DNA subfamilies among primates. Am J Hum Genet 80:495–501
  14. Jordan GE, Piel WH (2008) PhyloWidget: web-based visualizations for the tree of life. Bioinformatics 24:1641–1642
  15. Kazakov AE, Shepelev VA, Tumeneva IG, Alexandrov AA, Yurov YB, Alexandrov IA (2003) Interspersed repeats are found predominantly in the “old” alpha satellite families. Genomics 82:619–627
  16. Lyle R, Prandini P, Osoegawa K et al (2007) Islands of euchromatin-like sequence and expressed polymorphic sequences within the short arm of human chromosome 21. Genome Res 17:1690–1696
  17. Mashkova TD, Tyumeneva IG, Zinov’eva OL, Romanova LY, Jabs E, Alexandrov IA (1996) Centromeric alpha-satellite DNA at euchromatin/heterochromatin boundary of human chromosome 21. Mol Biol 30:617–625
  18. Miga KH, Newton Y, Jain M, Altemose N, Willard HF, Kent WJ (2014) Centromere reference models for human chromosomes X and Y satellite arrays. Genome Res 24:697–707
  19. Miller DA (1977) Evolution of primate chromosomes. Science 198:1116–1124
  20. Perrière G, Gouy M (1996) WWW-query: an on-line retrieval system for biological sequence banks. Biochimie 78:364–369
  21. Rosandić M, Paar V, Basar I, Gluncić M, Pavin N, Pilas I (2006) CENP-B box and pJalpha sequence distribution in human alpha satellite higher-order repeats (HOR). Chromosom Res 14:735–753
  22. Rudd MK, Willard HF (2004) Analysis of the centromeric regions of the human genome assembly. Trends Genet 20:529–533
  23. Rudd MK, Wray GA, Willard HF (2006) The evolutionary dynamics of alpha-satellite. Genome Res 16:88–96
  24. Schueler MG, Higgins AW, Rudd MK, Gustashaw K, Willard HF (2001) Genomic and genetic definition of a functional human centromere. Science 294:109–115
  25. Schueler MG, Dunn JM, Bird CP et al (2005) Progressive proximal expansion of the primate X chromosome centromere. Proc Natl Acad Sci U S A 102:10563–10568
  26. Shepelev VA, Alexandrov AA, Yurov YB, Alexandrov IA (2009) The evolutionary origin of man can be traced in the layers of defunct ancestral alpha satellites flanking the active centromeres of human chromosomes. PLoS Genet 5:e1000641
  27. Shepelev VA, Uralsky LI, Alexandrov AA, Yurov YB, Rogaev EI, Alexandrov IA (2015) Annotation of suprachromosomal families reveals uncommon types of alpha satellite organization in pericentromeric regions of hg38 human genome assembly. Genom Data 5:139–146
  28. Smit AF, Tóth G, Riggs AD, Jurka J (1995) Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. J Mol Biol 246:401–417
  29. Smith G (1976) Evolution of repeated DNA sequences by unequal crossover. Science 191:528–535
  30. Stanyon R, Rocchi M, Capozzi O et al (2008) Primate chromosome evolution: ancestral karyotypes, marker order and neocentromeres. Chromosom Res 16:17–39
  31. Trowell HE, Nagy A, Vissel B, Choo KH (1993) Long-range analyses of the centromeric regions of human chromosomes 13, 14 and 21: identification of a narrow domain containing two key centromeric DNA elements. Hum Mol Genet 2:1639–1649
  32. Vissel B, Choo KH (1991) Four distinct alpha satellite subfamilies shared by human chromosomes 13, 14 and 21. Nucleic Acids Res 19:271–277
  33. Wang SY, Cruts M, Del-Favero J et al (1999) A high-resolution physical map of human chromosome 21p using yeast artificial chromosomes. Genome Res 9:1059–1073
  34. Willard HF, Waye JS (1987) Hierarchical order in chromosome-specific human alpha satellite DNA. Trends Genet 3:192–198

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • William Ziccardi (1)
  • Chongjian Zhao (1)
  • Valery Shepelev (2, 3, 4)
  • Lev Uralsky (2, 4)
  • Ivan Alexandrov (5)
  • Tatyana Andreeva (3)
  • Evgeny Rogaev (3, 4, 6, 7)
  • Christopher Bun (8)
  • Emily Miller (1)
  • Catherine Putonti (1, 8, 9)
  • Jeffrey Doering (1, corresponding author)
  1. Department of Biology, Loyola University Chicago, Chicago, USA
  2. Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
  3. Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  4. Center for Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  5. Research Center of Mental Health, Russian Academy of Medical Sciences, Moscow, Russia
  6. Department of Psychiatry, Brudnick Neuropsychiatric Research Institute, University of Massachusetts Medical School, Worcester, USA
  7. Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  8. Department of Computer Science, Loyola University Chicago, Chicago, USA
  9. Bioinformatics Program, Loyola University Chicago, Chicago, USA
