Abstract
Chromatin assembled with the histone H3 variant CENP-A is the heritable epigenetic determinant of human centromere identity. Using genome-wide mapping and reference models for 23 human centromeres, CENP-A binding sites are identified within the megabase-long, repetitive α-satellite DNAs at each centromere. CENP-A is shown in early G1 to be assembled into nucleosomes within each centromere and onto 11,390 transcriptionally active sites on the chromosome arms. DNA replication is demonstrated to remove ectopically loaded, non-centromeric CENP-A. In contrast, tethering of centromeric CENP-A to the sites of DNA replication through the constitutive centromere associated network (CCAN) is shown to enable precise reloading of centromere-bound CENP-A onto the same DNA sequences as in its initial prereplication loading. Thus, DNA replication acts as an error correction mechanism for maintaining centromere identity through its removal of non-centromeric CENP-A coupled with CCAN-mediated retention and precise reloading of centromeric CENP-A.
Main
Correct chromosome segregation relies on a unique chromatin domain known as the centromere. Human centromeres are located on megabase-long1 chromosomal regions and are comprised of tandemly repeated arrays of an approximately 171 base pair (bp) element, termed α-satellite DNA2,3,4. CENP-A is a histone H3 variant5,6 that replaces histone H3 in chromatin assembled onto about 3% of α-satellite DNA repeats7,8, and is flanked by pericentric heterochromatin containing H3K9me2/3 (ref. 9). Nevertheless, α-satellite DNA sequences are neither sufficient nor essential for centromere identity2,10, as demonstrated by several measures including identification of multiple examples of acquisition of a new centromere (referred to as a neocentromere) at a new location coupled with inactivation of the original centromere11.
This has led to a consensus view that mammalian centromeres are defined by an epigenetic mark2. Use of gene replacement in human cells and fission yeast has identified the mark to be CENP-A-containing chromatin12, which maintains and propagates centromere function indefinitely by recruiting CENP-C and the 16-subunit constitutive centromere associated network (CCAN)13,14,15,16. We8 and others17 have shown that the overwhelming majority of human CENP-A chromatin particles are octameric nucleosomes containing two molecules of CENP-A at all cell cycle points, with heterotypic CENP-A/histone H3-containing nucleosomes comprising at most 2% of CENP-A-containing chromatin8.
During DNA replication, initially bound CENP-A is quantitatively redistributed to each daughter centromere18, while incorporation of new CENP-A into chromatin occurs only after exit from mitosis18,19, when its loading chaperone HJURP (ref. 20,21) is active22. Temporal separation of centromeric DNA replication from new CENP-A chromatin assembly raises the important question of how the centromeric epigenetic mark is maintained across the cell cycle, when it would be expected to be dislodged by DNA replication and diluted at each centromere as no new CENP-A is assembled until the next G1 (ref. 18). Moreover, endogenous CENP-A comprises only about 0.1% of the total histone H3 variants. Recognizing that a proportion of CENP-A is assembled at the centromeres with the remainder loaded onto sites on the chromosome arms7,8,23, long-term maintenance of centromere identity and function requires limiting accumulation of non-centromeric CENP-A. Indeed, artificially increasing CENP-A expression in human cells increases ectopic deposition at non-centromeric sites, accompanied by chromosome segregation aberrations23,24,25,26.
Using CENP-A chromatin immunoprecipitation (ChIP) and mapping onto centromere reference models for the centromere of each human chromosome, we now establish that DNA synthesis acts as an error correction mechanism to maintain epigenetically defined centromere identity by mediating precise reassembly of centromere-bound CENP-A chromatin, while removing ectopically loaded CENP-A found within transcriptionally active chromatin outside the centromeres.
Results
CENP-A binding at 23 human centromere reference models
We produced HeLa cells either (1) expressing CENP-ALAP, a CENP-A variant carboxy-terminally fused to a localization (enhanced yellow fluorescent protein, EYFP) and affinity (His) purification tag27 at one endogenous CENP-A allele (Supplementary Fig. 1a and Fig. 1a), or (2) stably expressing an elevated level (4.5 times the level of CENP-A in parental cells) of CENP-ATAP, a CENP-A fusion with carboxy-terminal tandem affinity purification (S protein and protein A) tags separated by a tobacco etch virus protease cleavage site (Supplementary Fig. 1b and Fig. 1a). Both CENP-A variants localize to centromeres (Supplementary Fig. 1c,d), support long-term centromere identity, and mediate high-fidelity chromosome segregation in the absence of wild-type CENP-A (refs. 7,8). Importantly, while the HeLa cells we use have acquired an aneuploid genome, they are chromosomally stable, with a high-fidelity centromere function that has maintained the same karyotype over almost two decades28.
To immunopurify CENP-A-containing chromatin at G1 or G2, chromatin was isolated from synchronized cells (Fig. 1a and Supplementary Fig. 1e). Nuclease digestion was used to produce mononucleosomes from chromatin isolated at G1 and G2, yielding the expected 147 bp of protected DNA length for nucleosomes assembled with histone H3 (Fig. 1a,c, upper panel). In parallel, chromatin was also isolated from randomly cycling cells stably expressing TAP-tagged histone H3.1 (H3.1TAP—Supplementary Fig. 1b and Fig. 1a)13. CENP-ALAP-, CENP-ATAP- or H3.1TAP-containing chromatin was then affinity purified and eluted under mild conditions with PreScission or tobacco etch virus protease cleavage (Fig. 1a).
α-satellite DNA sequences were enriched 30–35-fold in DNA isolated from CENP-ATAP or CENP-A+/LAP cells (Fig. 1b), the expected enrichment since α-satellite DNA comprises about 3% of the genome8,29. While microcapillary electrophoresis of bulk input chromatin produced the expected 147 bp of protected DNA length for nucleosomes assembled with histone H3 (Fig. 1c, upper panel), isolated CENP-ALAP chromatin expressed at endogenous CENP-A levels produced DNA lengths centred on 133 bp, before and after DNA replication (Fig. 1c, lower panel), as previously reported for octameric CENP-A-containing nucleosomes with DNA unwinding at entry and exit8,30.
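The expected enrichment quoted above follows from simple arithmetic: if the purified DNA consisted entirely of α-satellite sequence, the fold enrichment over input would approach the inverse of the α-satellite fraction of the genome. The following is a minimal sketch of that calculation; the function name is hypothetical, and the idealized assumption of a perfectly specific pull-down is ours, not a description of the authors' quantification.

```python
def max_fold_enrichment(genome_fraction: float) -> float:
    """Upper bound on fold enrichment over input for a sequence class that
    occupies `genome_fraction` of the genome, assuming (idealized) that the
    purified DNA consists entirely of that class."""
    if not 0 < genome_fraction <= 1:
        raise ValueError("genome_fraction must be in (0, 1]")
    return 1.0 / genome_fraction


# alpha-satellite DNA is stated to be about 3% of the genome
print(round(max_fold_enrichment(0.03), 1))  # 33.3
```

A value of about 33 sits within the observed 30–35-fold range.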
Affinity-purified CENP-ALAP-, CENP-ATAP- and H3.1TAP-bound DNAs were sequenced and mapped (Fig. 1a,d and Supplementary Table 1) onto the centromere reference model of the human X chromosome31 and centromere models (incorporated into the HuRef genome hg38) for each human autosome that include the observed variation in α-satellite higher-order repeat (HOR) array sequences32,33. Sequences bound by CENP-A were identified (Fig. 1d,e and Supplementary Fig. 2) using algorithm-based scripts (SICER and MACS, refs. 34,35). CENP-ALAP expressed at endogenous levels mapped with high reproducibility across the centromeric regions of all 23 reference centromeres (see Fig. 1e for chromosome 18 and Supplementary Fig. 2 for the other 22). Sequences bound were largely unaffected by increasing CENP-A levels 4.5-fold in CENP-ATAP cells (Figs. 1e and 2a,b).
Centromeric α-satellite arrays varied widely in CENP-A binding, from 10.5-fold enrichment for array D3Z1 in cen3 to 213-fold for array GJ211930.1 in cen10, with most enriched 20–40-fold relative to input DNAs (Supplementary Table 2). Of the 17 centromeres that contain more than one α-satellite array, 6 had only one array that actively bound CENP-A (Supplementary Table 2). The other 11 (Supplementary Table 2) showed enriched CENP-A binding in two or more arrays, consistent with CENP-A binding to a different array in each homologue, as previously shown for cen17 in two diploid human cell lines36. The increased levels of CENP-A in CENP-ATAP cells did not increase the number of centromeric binding peaks (Fig. 1d,e), but did elevate CENP-A occupancy at some divergent monomeric α-satellite repeats (Supplementary Fig. 1f) or within HORs (Supplementary Fig. 1g), with both examples occurring in regions with few CENP-B boxes.
CENP-A nucleosomes are retained at centromeric loading sites after DNA replication
Despite the known redistribution of initially centromere-bound CENP-A onto each of the new daughter centromeres without addition of new CENP-A (ref. 18), comparison of the sequences bound by CENP-A in G1 with those bound in G2 revealed that for all 23 centromeres, at both normal (CENP-A+/LAP) and elevated (CENP-ATAP) levels, CENP-A was bound to indistinguishable α-satellite sequences before and after DNA replication (Fig. 1e and Supplementary Fig. 2). Indeed, almost all (87%) of the α-satellite binding peaks algorithmically identified for CENP-ATAP during G1 remained at G2 (Supplementary Fig. 1h, top). A similarly high proportion (89%) of the CENP-A peaks found in G1 remained at G2 in CENP-ALAP cells with CENP-A expressed at endogenous levels (Supplementary Fig. 1h, bottom). After filtering out multimapping reads, 96 single-copy, centromeric CENP-A binding sites were identified within the HORs of the 23 reference centromeres (Fig. 2). Remarkably, examination of these sites before and after DNA replication in CENP-ATAP cells revealed quantitative retention of CENP-A in G2 at almost all (93 of 96) of these unique centromeric sites, with the remaining three peaks only slightly diminished (Fig. 2 and Supplementary Fig. 1i).
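The peak retention percentages above can be thought of as the fraction of G1 peaks that overlap at least one G2 peak. The sketch below illustrates one way to compute such a fraction from peak intervals; the overlap criterion (any shared base), the function names and the toy coordinates are illustrative assumptions, not the authors' pipeline.

```python
from typing import List, Tuple

# A peak is (sequence name, start, end) with a half-open interval [start, end).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks lie on the same sequence and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def retained_fraction(g1_peaks: List[Peak], g2_peaks: List[Peak]) -> float:
    """Fraction of G1 peaks that overlap any G2 peak."""
    if not g1_peaks:
        raise ValueError("no G1 peaks supplied")
    kept = sum(any(overlaps(p, q) for q in g2_peaks) for p in g1_peaks)
    return kept / len(g1_peaks)


# Toy coordinates, made up for illustration only.
g1 = [("cen18", 100, 300), ("cen18", 900, 1100),
      ("chr4", 5000, 5200), ("chr4", 9000, 9200)]
g2 = [("cen18", 150, 320), ("cen18", 950, 1000)]
print(retained_fraction(g1, g2))  # 0.5
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the quadratic `any(...)` scan used here for brevity.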
Ectopic CENP-A assembled onto chromosome arms in early G1 is removed by G2
In addition to the striking enrichment at centromeric α-satellites, genome-wide mapping of CENP-A-bound DNAs revealed preferential and highly reproducible incorporation into unique-sequence, non-α-satellite sites on the arms of all 23 chromosomes (Figs. 3a,b and 4). At endogenous CENP-A levels, 11,390 ectopic sites were identified, 620 of which were loaded ≥5-fold over background (Fig. 3d; Supplementary Fig. 3a). Sites enriched for bound CENP-A were essentially identical in DNAs from randomly cycling cells or G1 cells (Fig. 3a,b). While a 4.5-fold increase in CENP-A levels in CENP-ATAP cells did not increase the number of centromeric binding peaks (Fig. 1d) or the number of unique single-copy sites within centromeric HORs (Supplementary Fig. 1i, bottom), it drove an increased number of sites of CENP-A incorporation on the arms, producing 40,279 non-centromeric sites (Fig. 3d), 12,550 of which were loaded ≥5-fold over background (Supplementary Fig. 3a).
Remarkably, for all 23 human chromosomes and for CENP-A accumulated to endogenous (CENP-ALAP) or increased (CENP-ATAP) expression levels, passage from G1 to G2 almost eliminated enrichment of CENP-A binding to specific sites on the chromosome arms, while leaving α-satellite-bound sequences unaffected (Figs. 1d, 3 and 4). Loss by G2 of CENP-A binding at specific arm sites was highly reproducible (see experimental replicas in Fig. 3b). Scoring peak binding sites with thresholds of ≥5-, 10- or 100-fold of CENP-A binding over background, at least 90% of sites bound on chromosome arms in G1 in CENP-ATAP cells were removed by early G2 (Supplementary Fig. 3a), and all of those still identified in G2 were substantially reduced in peak height.
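Scoring removal at several fold-over-background thresholds can be sketched as follows. The criterion used here (a site called at a threshold in G1 counts as removed if its G2 enrichment falls below that same threshold) is an assumption made for illustration, as are the function name and the toy values.

```python
from typing import Dict, Iterable, Optional


def fraction_removed(
    g1_fold: Dict[str, float],
    g2_fold: Dict[str, float],
    thresholds: Iterable[float] = (5, 10, 100),
) -> Dict[float, Optional[float]]:
    """For each threshold, the fraction of sites called in G1 (fold >= threshold)
    whose G2 fold enrichment is below that threshold. Sites missing from
    `g2_fold` are treated as having no enrichment. Returns None for a
    threshold at which no G1 site is called."""
    result: Dict[float, Optional[float]] = {}
    for t in thresholds:
        called = [site for site, fold in g1_fold.items() if fold >= t]
        if not called:
            result[t] = None
            continue
        gone = sum(1 for site in called if g2_fold.get(site, 0.0) < t)
        result[t] = gone / len(called)
    return result


# Toy fold-enrichment values, made up for illustration only.
g1_sites = {"a": 6.0, "b": 12.0, "c": 150.0, "d": 8.0}
g2_sites = {"a": 1.0, "b": 2.0, "c": 120.0, "d": 1.5}
print(fraction_removed(g1_sites, g2_sites))  # {5: 0.75, 10: 0.5, 100: 0.0}
```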
Ectopic CENP-A removal after DNA replication was confirmed using CENP-A ChIP following synchronization in G1 or mitosis (Supplementary Fig. 3b) in a second, nearly diploid human cell line (DLD1) in which the two CENP-A alleles were modified to encode a degron-tagged, auxin-inducible degradable CENP-AAID or a doxycycline-inducible CENP-AWT (ref. 37). Levels of CENP-A loaded at each of four ectopic sites before (G1) and after DNA replication revealed that almost all (85–90%) of ectopic CENP-A loaded in G1 was removed by mitotic entry (Fig. 3e).
Neocentromeres are not at sites of ectopic CENP-A loading
Recognizing that ectopic loading of CENP-A on chromosomes could be one component of neocentromere formation, we tested if the positions of known human neocentromeres38 are sites of preferential ectopic CENP-A loading. Despite cytogenetic positioning of many reported neocentromeres11, only two have been precisely mapped39. The first (PDNC4) spans 300 kilobases (kb) on chromosome 4 (ref. 40) (87.278 to 87.578 megabases (Mb) in hg38). In CENP-ATAP cells, only four sites of elevated CENP-A were present in G1 within the genomic region of this neocentromere, only two of which had CENP-A loading more than fivefold over the background. Importantly, all sites were removed in G2-derived chromatin (Fig. 5a). An additional neocentromere (IMS13q) maps to a 100 kb region on chromosome 13 (ref. 41) (97.047 to 97.147 Mb in hg38 assembly39). CENP-A binding to this region in CENP-ATAP cells in G1 did not differ in density of peaks or peak heights from many similarly loaded sites scattered across the long arm of chromosome 13 (Supplementary Fig. 3c). Passage from G1 to G2 of CENP-ATAP cells stripped almost all CENP-A bound that corresponded to the region of the IMS13q neocentromere. While we cannot exclude the possibility that other cell types have different epigenetic landscapes that affect the sites to which CENP-A binds at non-centromeric regions, our examination of these two best defined neocentromeres offers no support for neocentromere formation arising at the site of an inherent hotspot of ectopic CENP-A loading.
CENP-A is ectopically loaded at early G1 into open/active chromatin
The sites on the chromosome arms into which CENP-A was assembled in G1 in CENP-ATAP cells were enriched twofold (compared with levels expected by chance) at promoters or enhancers of expressed genes, with a 2.5-fold enrichment at sites bound by the transcriptional repressor CTCF (Supplementary Fig. 3e), trends similar to previous reports for cells with increased CENP-A23. More than 80% of CENP-ATAP binding sites on chromosome arms with peak heights ≥5-fold over background (Fig. 5b–d) overlapped with DNase I hypersensitive, accessible chromatin sites identified by the ENCODE project that are functionally related to transcriptional activity. Similarly, CENP-A expressed at endogenous levels was enriched threefold at DNase I hypersensitive sites (Fig. 5e) and promoters (Supplementary Fig. 3g).
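Enrichment "compared with levels expected by chance" can be approximated by dividing the observed fraction of sites that overlap a feature by the fraction of the genome that the feature covers. The sketch below shows this ratio; the uniform-placement null model, the function name and the example numbers are assumptions for illustration and may differ from the statistical procedure actually used.

```python
def fold_over_chance(
    n_overlapping: int, n_sites: int, feature_bp: int, genome_bp: int
) -> float:
    """Observed overlap fraction divided by the fraction expected if sites
    were placed uniformly at random across the genome (a simplifying null)."""
    if n_sites <= 0 or genome_bp <= 0 or feature_bp <= 0:
        raise ValueError("counts and lengths must be positive")
    observed = n_overlapping / n_sites
    expected = feature_bp / genome_bp
    return observed / expected


# Hypothetical numbers: 400 of 1,000 sites overlap a feature class that
# covers 20% of a 3 Gb genome.
print(round(fold_over_chance(400, 1000, 600_000_000, 3_000_000_000), 2))  # 2.0
```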
Most (80%) of non-centromeric CENP-ATAP binding peaks overlapped with H3K4me1 and H3K4me2 sites found in active and primed enhancers and at transcription factor binding sites identified by ENCODE, with a similar trend for CENP-ALAP (Fig. 5d,e). Ectopic CENP-ATAP or CENP-ALAP (Supplementary Fig. 4a,b) peaks also showed a significant overlap with other marks of active transcription, including H2A.Z, H3K4me3, H3K27ac, H3K36me3 and H3K9ac. Conversely, neither CENP-ATAP nor CENP-ALAP was enriched at H3K27me3 sites tightly associated with inactive gene promoters and facultatively repressed genes (Supplementary Figs. 3d,f and 4a,b). CENP-A binding peaks showed a mild (30–40%) overlap with histone modifications H4K20me1 and H3K79me2 (active transcription marks) or the H3K9me3 mark of transcription repression (Supplementary Fig. 4a,b). Overall, most (65% and 93%, respectively) of the ectopic CENP-A sites in cells with endogenous or elevated CENP-A were associated with at least one active transcription mark (Supplementary Fig. 4a,b), consistent with ectopic CENP-A preferentially binding to open, active chromatin.
Ectopic CENP-A in G1 in either CENP-ATAP or CENP-ALAP cells bound to ‘high-occupancy target’ regions defined by highly expressed regions of the genome (and that show binding of unrelated transcription factors without underlying sequence specificity42) was almost quantitatively removed from cells that had progressed through DNA replication (Supplementary Fig. 3h,i), demonstrating that enrichment of CENP-A in such highly expressed regions cannot be a consequence of non-specific binding to ‘hyper-ChIPable’ regions43.
Analysis of published CENP-A ChIP sequencing (ChIP-seq) datasets for HT1080 cells44, a human epithelial fibrosarcoma cell line expressing Flag-tagged CENP-A at about three times the level of parental CENP-A, revealed similar trends: ectopic sites were enriched at DNase I hypersensitive sites and at transcription activation marks (H2A.Z and H3K4me1/2) but were not enriched at the transcription repression mark H3K27me3 (Supplementary Fig. 4c). Similarly, CENP-A ChIP-seq datasets from the HuRef human lymphoblastoid cell line45 also revealed that the majority (51%) of ectopic CENP-A accumulated to its endogenous level was found at marks associated with transcription activation (including H2A.Z, H3K36me3 and H4K20me1). CENP-A was not enriched at the transcription repression mark H3K27me3 (Supplementary Fig. 4d), although in all three cell lines analysed ectopic CENP-A was enriched at H3K9me3 sites associated with heterochromatin of constitutively repressed genes (Supplementary Fig. 4a–d).
Ectopic, but not centromeric, CENP-A is removed by replication fork progression
We next tested whether removal by G2 of CENP-A assembled into nucleosomes at unique sites on the chromosome arms is mediated by the direct action of the DNA replication machinery. CENP-ATAP was affinity purified from mid-S-phase cells and CENP-A-bound DNAs were sequenced and mapped (Fig. 6a and Supplementary Fig. 5a). In parallel, newly synthesized DNA in synchronized cells was labelled by addition of bromodeoxyuridine (BrdU) for 1 h at early (S0–S1), mid- (S3–S4) and late S phase (S6–S7) (Fig. 6a and Supplementary Fig. 5a). Genomic DNA from each time was sonicated (Supplementary Fig. 5b) and immunoprecipitated with a BrdU antibody (Fig. 6a). Eluted DNA was then sequenced and mapped to the genome (an approach known as Repli-seq46), yielding regions of early-, mid- and late-replicating chromatin (an example from a region of an arm of chromosome 20 is shown in Fig. 6b). Early replication timing was validated (Supplementary Fig. 5c) for two genes (MRGPRE and MMP15) previously reported to be early replicating (ENCODE Repli-seq46). Similarly, a gene (HBE1) and a centromeric region (Sat2) previously reported to be late replicating (ENCODE Repli-seq46) were confirmed to be replicated late (Supplementary Fig. 5d).
CENP-A chromatin immunoprecipitated from early- and mid-S-phase cells yielded levels of α-satellite DNA enrichment (Supplementary Fig. 5e) similar to those achieved at G1 phase (Fig. 1b). Furthermore, nucleosomal CENP-A chromatin produced by micrococcal nuclease digestion protected 133 bp of DNA at early and mid-S phase (Supplementary Fig. 5f), just as it did in G1 and G2 (Fig. 1c; see also ref. 8), with no evidence for a structural change from hemisomes to nucleosomes and back to hemisomes during S phase as previously claimed47. Mapping of CENP-A binding sites on chromosome arms, combined with Repli-seq, revealed that almost all (91%) ectopic G1 CENP-A binding was found in early- or mid-S-replicating regions (Fig. 6b,c). While alphoid DNA sequences have been reported to replicate in mid-to-late48 S phase, in our cells α-satellite-containing DNAs in all 23 centromeres were found almost exclusively to be late replicating (Fig. 6d).
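Combining peak positions with Repli-seq amounts to assigning each ectopic peak to the S-phase fraction in which its locus replicates and then tallying the proportions. The sketch below assumes, for illustration only, that each peak carries one Repli-seq signal value per fraction and is assigned to the fraction with the highest signal; the names and toy values are hypothetical.

```python
from collections import Counter
from typing import Dict

FRACTIONS = ("early", "mid", "late")


def assign_timing(signal: Dict[str, float]) -> str:
    """Return the S-phase fraction with the highest Repli-seq signal."""
    return max(FRACTIONS, key=lambda fraction: signal.get(fraction, 0.0))


def timing_distribution(peaks: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Proportion of peaks assigned to each S-phase fraction."""
    if not peaks:
        raise ValueError("no peaks supplied")
    counts = Counter(assign_timing(sig) for sig in peaks.values())
    return {fraction: counts[fraction] / len(peaks) for fraction in FRACTIONS}


# Toy Repli-seq signals per peak, made up for illustration only.
toy_peaks = {
    "peak1": {"early": 9.0, "mid": 2.0, "late": 0.5},
    "peak2": {"early": 7.5, "mid": 3.0, "late": 1.0},
    "peak3": {"early": 1.0, "mid": 6.0, "late": 2.0},
    "peak4": {"early": 0.2, "mid": 1.5, "late": 8.0},
}
print(timing_distribution(toy_peaks))  # {'early': 0.5, 'mid': 0.25, 'late': 0.25}
```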
Remarkably, throughout S phase, centromere-bound CENP-A found in G1 was completely retained across each reference centromere with the same sequence binding preferences (Fig. 6e and Supplementary Fig. 5g). Retention of CENP-A binding during DNA replication was also observed at the unique-sequence-binding sites within the HORs of each centromere: all 96 CENP-ATAP G1 peaks at single-copy variants within α-satellite HORs remained bound by CENP-A (Fig. 6f and Supplementary Fig. 5h). In contrast, early-replicating ectopic CENP-A-binding sites were nearly quantitatively removed during or quickly after their replication and were no longer visible in mid-S phase (Fig. 6g,h). Similarly, ectopic CENP-A-binding sites found in mid-S-replicating regions remained at mid-S but were removed quickly after that and were absent by late S/G2 (Fig. 6i,j). For the 10% of ectopic CENP-A G1 peaks in late-S-replicating regions (Fig. 6c), almost all (85%) were removed by G2 (Fig. 6k,l), while late-replicating centromeric CENP-A peaks were retained, including the single-copy variants within the α-satellite HORs (Figs. 6d–f and 2 and Supplementary Fig. 5g,h). Thus, ectopic, but not centromeric, CENP-A-binding sites are removed as DNA replication progresses.
CENP-C/CCAN remain centromeric CENP-A associated during DNA replication
To comprehensively determine the components that associate with CENP-A chromatin during replication in late S, we used mass spectrometry following affinity purification of CENP-A nucleosomes (Supplementary Fig. 5i, left). A structural link that normally bridges multiple centromeric CENP-A nucleosomes and nucleates the full kinetochore assembly before mitotic entry is the 16-subunit constitutive centromere associated network (CCAN)49,50,51,52. This complex is anchored to CENP-A primarily through CENP-C50,53,54 and sustained by CENP-B binding to CENP-B box sequences within α-satellite DNAs55. Remarkably, mass spectrometry identified that all 16 CCAN components13,15 remained associated with mononucleosomal CENP-A chromatin that was affinity purified from late S/G2 (Table 1). Stable association with CENP-A was also seen for HJURP, multiple chromatin remodelling factors and nuclear chaperones, histones, centromere and kinetochore components, and other DNA replication proteins (Supplementary Fig. 5j–n). The continuing interaction during DNA replication of CCAN proteins with CENP-A, which is maintained even on mononucleosomes, provides strong experimental support that the CCAN complex tethers CCAN-bound centromeric CENP-A at or near the centromeric DNA replication forks, thereby enabling its efficient reincorporation after replication fork passage.
To test this further, the composition of CENP-A-containing nucleosomal complexes from G1 to late S/G2 was determined following affinity purification (via the TAP tag) of chromatin-bound CENP-ATAP from a predominantly mononucleosome pool (Supplementary Fig. 5i, right). We initially focused on the chromatin assembly factor 1 (CAF1) complex, which is required for de novo chromatin assembly following DNA replication56. Its p48 subunit (also known as CAF1 subunit c, RbAp48 or RBBP4) binds histone H4 (ref. 57), is a binding partner in a CENP-A prenucleosomal complex with HJURP and nucleophosmin (NPM1)21 and maintains the deacetylated state of histones in the central core of centromeres after deposition58. Remarkably, CAF1 p48 co-immunopurified with CENP-A from G1 to late S/G2 (Fig. 7a). In striking contrast, the two other CAF1 subunits (CAF1 p150 and CAF1 p60), which are essential for de novo chromatin assembly in vitro59, were much more strongly associated with CENP-A nucleosomes in late S/G2 than in mid-S (Fig. 7a). Additionally, MCM2, a core subunit of the DNA replicative helicase MCM2-7 complex that recycles old histones as the replication fork advances60, was robustly copurified with CENP-A only in late-S-phase-derived chromatin, with no association detected in mid-S (Fig. 7a).
CENP-C is essential for the maintenance of centromeric CENP-A during DNA replication
The stable association only in late S phase (when all centromeric, but only a small minority of ectopically loaded CENP-A, is replicated) of CENP-A with MCM2 and the CAF1 subunits necessary for chromatin reassembly suggested that CENP-C and its CCAN complex tethered centromeric CENP-A near the replication forks and stabilized CENP-A binding to MCM2 and CAF1, thereby enabling CENP-A reassembly onto the daughter centromeres after DNA replication. We tested this possibility by rapidly depleting CENP-C just after S-phase entry in a human cell line (CENP-CAE/AE) in which both CENP-C alleles were genome-engineered to produce CENP-C fused to both an auxin-inducible degron and EYFP55. Thymidine was used to synchronize these CENP-CAE/AE cells at the G1/S boundary (Fig. 7b,c). CENP-C degradation was induced just after S-phase entry to test CENP-C’s role specifically during centromeric DNA replication in late S phase (Fig. 7b), but without affecting the deposition of new CENP-A that occurs in early G1.
Auxin addition 2 h after release from thymidine block resulted in polyubiquitination and degradation of almost all CENP-CAE within 15 min, as was evident by the loss of fluorescence of the EYFP in CENP-CAE (Fig. 7b–d and Supplementary Video 1). DNA replication was then allowed to continue without CENP-C and the CCAN complex it nucleates53,55. At the end of DNA replication, chromatin-bound CENP-A was immunoprecipitated and the enrichment of α-satellite-containing DNA was determined. In randomly cycling cells this resulted in a 30-fold enrichment of alphoid DNA (Fig. 7e). At the end of DNA replication and distribution of CENP-A to the two daughter centromeres in CENP-C-containing cells, alphoid DNA enrichment was reduced by half, as expected from doubling of centromeric DNA without addition of new CENP-A. However, degradation of CENP-C early in DNA replication led to loss by the end of S phase of most (73%) of CENP-A initially bound to α-satellite DNA (Fig. 7e).
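The expectation of a halved enrichment follows from dilution: the same amount of CENP-A is spread over twice as much centromeric DNA. The sketch below illustrates that expectation and one way to express loss relative to the replicated control; the function names are hypothetical, the depleted-sample value is a made-up illustrative number rather than a measurement from this study, and the loss definition is an assumption.

```python
def expected_after_replication(pre_replication_fold: float) -> float:
    """Expected fold enrichment once DNA has doubled with no new CENP-A added."""
    return pre_replication_fold / 2.0


def fraction_lost(control_fold: float, depleted_fold: float) -> float:
    """Loss in the depleted sample relative to the replicated control
    (assumed definition, for illustration)."""
    if control_fold <= 0:
        raise ValueError("control_fold must be positive")
    return 1.0 - depleted_fold / control_fold


control = expected_after_replication(30.0)  # 30-fold before replication
print(control)  # 15.0

hypothetical_depleted = 4.05  # illustrative value only, not a measurement
print(round(fraction_lost(control, hypothetical_depleted), 2))  # 0.73
```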
CENP-C-dependent retention of centromeric CENP-A late in S phase was confirmed by examination of two specific α-satellite variants found within the HORs of the centromeres of chromosomes 8 (Fig. 7f) and 15 (Fig. 7g). Each of these satellite variants is represented only once in the human genome and each shows precise retention in G2 of CENP-A bound in G1 (Fig. 2). In both variants, α-satellite DNA was enriched 50–60-fold following CENP-A ChIP from randomly cycling cells, which was reduced to half as much after DNA replication. Following CENP-C degradation in early S phase, however, CENP-A was not retained at either site during DNA replication (Fig. 7f,g). Taken together, these results demonstrate that depletion of CENP-C (and CCAN bound to CENP-A53,55) before centromere DNA replication results in loss of centromeric CENP-A by the end of S phase.
Discussion
Using reference models for 23 human centromeres, we have identified that during DNA replication CENP-A nucleosomes initially assembled onto centromeric α-satellite repeats are reassembled onto the same spectrum of α-satellite repeat sequences of each daughter centromere as are bound before DNA replication. Additionally, genome-wide mapping of sites of CENP-A assembly has identified that when CENP-A is expressed at endogenous levels the selectivity of the histone chaperone HJURP’s loading in early G1 of new CENP-A at or near existing sites of centromeric CENP-A-containing chromatin is insufficient to prevent its loading onto more than 11,000 sites along the chromosome arms (Fig. 3d). We also show that the number of ectopic sites increases as CENP-A expression levels increase, as has been reported in multiple human cancers39,61,62. Sites of ectopic CENP-A are replicated in early and mid-S (Fig. 6c) and are nearly quantitatively removed as DNA replication progresses (Fig. 6g–l).
Taken together, our evidence demonstrates that DNA replication functions not only to duplicate centromeric DNA but also as an error correction mechanism to maintain epigenetically defined centromere position and identity by coupling centromeric CENP-A retention with its removal from assembly sites on the chromosome arms (Fig. 7h). Indeed, our data reveal that CENP-A loaded onto unique, single-copy sites within α-satellite DNAs of the 23 reference centromeres is precisely maintained at these sites during and after DNA replication, offering direct support that (at least for each of these single-copy sites) the replication machine reloads CENP-A onto the exact same centromeric DNA site (Figs. 2 and 6f).
DNA replication produces a very different situation for CENP-A initially assembled into nucleosomes on the chromosome arms. Sites of ectopically loaded CENP-A are nearly quantitatively stripped during DNA replication (Figs. 3, 4 and 6g–l), thereby precluding premitotic acquisition of CENP-A-dependent centromere function at non-centromeric sites and reinforcing centromere position and identity (Fig. 7h). Without such correction, ectopically loaded sites would be maintained cell cycle after cell cycle, potentially recruiting CENP-C and assembly of the CCAN complex13,14,15,16. Arm-associated CENP-A/CCAN would present a major problem for faithful assembly and function of a single centromere/kinetochore per chromosome, both by acquisition of partial centromere function and by competition with the authentic centromeres for the pool of available CCAN components. Indeed, high levels of CENP-A expression (1) lead to recruitment of detectable levels of 3 of 16 CCAN components (CENP-C, CENP-N and Mis18) assembled onto the arms23,24,63, (2) are associated with ongoing chromosome segregation errors25 and (3) have been reported in several cancers, where it has been proposed to be associated with increased invasiveness and poor prognosis26,61,62.
As to the mechanism for retention during DNA replication of centromeric but not ectopically loaded CENP-A, our mass spectrometry analysis identifies a strong association of HJURP with CENP-A mononucleosomes in late S phase, comparable to the association identified in G1 (Supplementary Fig. 5l), supporting a probable role for HJURP in CENP-A retention, perhaps through interaction with the MCM2-7 complex, as has previously been suggested21. This agrees with evidence that HJURP can associate with MCM2 in a histone-independent manner60, which points to a possible co-chaperone relationship for CENP-A. Moreover, degradation of HJURP in early S reduces centromeric CENP-A retention through S phase64.
Most importantly, our evidence demonstrates that the local reassembly of CENP-A within centromeric domains requires the continuing centromeric CENP-A association with CCAN complexes (Fig. 7), which act to tether disassembled CENP-A/H4 near the sites of centromere DNA replication. This local CENP-C/CCAN-dependent retention of CENP-A, coupled with the actions of the MCM2 replicative helicase, HJURP and CAF1, enables CENP-A’s precise reassembly into chromatin within each daughter centromere, thereby maintaining epigenetically defined centromere identity.
Methods
Cell lines
Adherent HeLa cells stably expressing CENP-ATAP or H3.1TAP by retrovirus infection13 or endogenously tagged CENP-A+/LAP by infection of a rAAV harbouring a LAP-targeting construct containing homology arms for CENP-A27 were adapted to suspension growth by selecting surviving cells and were maintained in DMEM (Gibco) containing 10% fetal bovine serum (Omega Scientific), 100 U ml−1 penicillin, 100 U ml−1 streptomycin and 2 mM L-glutamine at 37 °C in a 5% CO2 atmosphere with 21% oxygen. DLD1 cells with auxin-degradable CENP-AAID and doxycycline-inducible CENP-AWT (ref. 37) and DLD1 CENP-CAE/AE cells (ref. 55) were maintained in the same conditions. Cells were maintained and split every 4–5 d according to ATCC recommendations.
Cell synchronization
Cells were synchronized as previously described8. Briefly, suspension HeLa cells were treated with 2 mM thymidine in complete medium for 19 h, pelleted and washed twice in PBS, and released in complete medium containing 24 μM deoxycytidine for 9 h followed by addition of thymidine to a final concentration of 2 mM for 16 h, after which cells were released again into complete medium containing 24 μM deoxycytidine. For G2, cells were collected 7 h after release from the second thymidine block. For G1, thymidine was added for a third time, 7 h after the release, and cells were collected 11 h after this (a total of 18 h after the release from the second thymidine block). Adherent DLD1 cells with auxin-degradable CENP-AAID and doxycycline-inducible CENP-AWT (ref. 37) were synchronized at G1 using 1.5 μM CDK4/6 inhibitor PD-0332991 (also known as Palbociclib) for 30 h or at mitosis using a single 2 mM thymidine block for 20 h and release into 0.1 mg ml−1 nocodazole. Adherent DLD1 CENP-CAE/AE (ref. 55) were synchronized at G1/S with a single thymidine block, followed by washing twice in PBS, and releasing in complete medium containing 24 μM deoxycytidine. Two hours after release, CENP-C rapid degradation was induced by addition of IAA at a final concentration of 500 μM. Nocodazole (Sigma) was added at 0.1 mg ml−1 to prevent cells from going into the next cell cycle. Cells were collected 8 h after release.
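The HeLa synchronization schedule above can be laid out as a cumulative timeline. The sketch below simply sums the stated step durations; the step labels and the function name are ours, and washes and handling time are ignored.

```python
from itertools import accumulate
from typing import List, Tuple

# (step description, duration in hours), taken from the protocol text above.
HELA_STEPS: List[Tuple[str, int]] = [
    ("first thymidine block (2 mM)", 19),
    ("release with 24 uM deoxycytidine", 9),
    ("second thymidine block (2 mM)", 16),
    ("second release: G2 collection, or third thymidine addition", 7),
    ("third thymidine block: G1 collection", 11),
]


def elapsed_hours(steps: List[Tuple[str, int]]) -> List[int]:
    """Cumulative hours from the start of the protocol at the end of each step."""
    return list(accumulate(hours for _, hours in steps))


for (label, _), total in zip(HELA_STEPS, elapsed_hours(HELA_STEPS)):
    print(f"{total:>3} h  end of {label}")
# G2 cells are collected at 51 h and G1 cells at 62 h from the start,
# that is, 7 h and 18 h after release from the second thymidine block.
```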
Chromatin extraction and affinity purification
Chromatin was extracted from 1 × 10⁹ nuclei of HeLa cells as previously described8. TAP- or LAP-tagged chromatin was purified in two steps. In the first step, native TAP-tagged chromatin was immunoprecipitated by incubating the bulk soluble mononucleosome pool with rabbit IgG (Sigma-Aldrich) coupled to Dynabeads M-270 epoxy (Thermo Fisher Scientific, 14301). Alternatively, CENP-ALAP chromatin was immunoprecipitated using mouse anti-GFP antibody (clones 19C8 and 19F7, Monoclonal Antibody Core Facility at Memorial Sloan-Kettering Cancer Center, New York, USA) coupled to Dynabeads M-270 epoxy. Chromatin extracts were incubated with antibody-bound beads for 16 h at 4 °C. Bound complexes were washed once in buffer A (20 mM HEPES at pH 7.7, 20 mM KCl, 0.4 mM EDTA and 0.4 mM DTT), once in buffer A with 300 mM KCl and finally twice in buffer A with 300 mM KCl, 1 mM DTT and 0.1% Tween 20. In the second step, TAP–chromatin complexes were incubated for 16 h in final wash buffer with 50 μl recombinant tobacco etch virus protease, resulting in cleavage of the TAP tag and elution of the chromatin complexes from the beads. Alternatively, CENP-ALAP chromatin was eluted from the beads by cleaving the LAP tag using PreScission protease (4 h, 4 °C). CENP-A-containing chromatin was immunoprecipitated from DLD1 cells with auxin-degradable CENP-AAID and doxycycline-inducible CENP-AWT (ref. 37) using Abcam ab13939 CENP-A antibody coupled to Dynabeads M-270 epoxy.
DNA extraction
Following elution of the chromatin from the beads, proteinase K (100 μg ml−1) was added and samples were incubated for 2 h at 55 °C. DNA was purified from proteinase K-treated samples using a DNA purification kit following the manufacturer’s instructions (Promega, Madison, WI, USA) and was subsequently analysed either by running a 2% low-melting agarose (APEX) gel or with an Agilent 2100 Bioanalyzer using the DNA 1000 kit. The Bioanalyzer determines the quantity of DNA on the basis of fluorescence intensity.
Quantitative real-time PCR
Quantitative real-time PCR was performed using SYBR Green mix (Bio-Rad) with a CFX384 Bio-Rad Real Time System. Primer sequences used in this study were the following: MRGPRE (forward) 5′-CTGCGCGGATCTCATCTTCC-3′ and (reverse) 5′-GGCCCACGATGTAGCAGAA-3′; MMP15 (forward) 5′-GTGCTCGACGAAGAGACCAAG-3′ and (reverse) 5′-TTTCACTCGTACCCCGAACTG-3′; HBE1 (forward) 5′-ATGGTGCATTTTACTGCTGAGG-3′ and (reverse) 5′-GGGAGACGACAGGTTTCCAAA-3′; Sat2 (forward) 5′-TCGCATAGAATCGAATGGAA-3′ and (reverse) 5′-GCATTCGAGTCCGTGGA-3′; α-satellite DNA (from chromosomes 1, 3, 5, 10, 12 and 16) (forward) 5′-CTAGACAGAAGAATTCTCAG-3′ and (reverse) 5′-CTGAAATCTCCACTTGC-3′ (ref. 41); α-satellite DNA variant from chromosome 8 (forward) 5′-TGAATGCGAGAGAGAAGTAA-3′ and (reverse) 5′-TCAAATATATCCAAATATCCA-3′; α-satellite DNA variant from chromosome 15 (forward) 5′-GTTGCACATTCCGGTTCATACA-3′ and (reverse) 5′-TTTCACCGTAGGCCTCAAAGGGCTCCAACT-3′; ectopic site 1 in chromosome 5 (forward) 5′-CCCTCCTGCCTGAAGATTTGAT-3′ and (reverse) 5′-AAAGCTTGGTGAGGGCAGTT-3′; ectopic site 2 in chromosome 14 (forward) 5′-GCTGTGTACTCCCGAACTCC-3′ and (reverse) 5′-GATCCTGTCCAGCTGCCAG-3′; ectopic site 3 in chromosome 1 (forward) 5′-TCAGTTTGCACCATCCCCTG-3′ and (reverse) 5′-GCTCTGACTCATGCTCCTACTG-3′; ectopic site 4 in chromosome 9 (forward) 5′-AGTGCCCTCTGAACGCTAAC-3′ and (reverse) 5′-ATTCCTCCCTGAGCTCCCAT-3′. Melting curve analysis was used to confirm primer specificity. To ensure linearity of the standard curve, reaction efficiencies over the appropriate dynamic range were calculated. Using the dCt method, we calculated fold enrichment of α-satellite DNA after immunopurification of CENP-ATAP chromatin, compared with its level in the bulk input chromatin. For the Repli-seq experiment, we used the dCt method to calculate fold enrichment of replicated DNA after immunopurification of BrdU-labelled DNA compared with its level in the bulk input DNA. 
Reported values are the means of two independent biological replicates with technical duplicates that were averaged for each experiment. Error bars represent s.e.m. To determine CENP-A levels at ectopic sites at G1 and mitosis, we used the dCt method to calculate fold enrichment of CENP-A-containing DNA compared with its level in the bulk input DNA. CENP-A levels at mitosis were normalized to levels at G1. Reported values are the means of two (G1) or three (mitosis) independent biological replicates with technical duplicates that were averaged for each experiment. Error bars represent s.e.m.
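The dCt calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes equal template amounts in the immunoprecipitated and input reactions and a perfect amplification efficiency of 2 per cycle, and the function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fold_enrichment(ct_ip, ct_input, efficiency=2.0):
    """Fold enrichment of a target in IP DNA relative to input DNA.

    Assumption: both reactions start from the same template amount and
    amplify with the same efficiency (2.0 = doubling every cycle).
    """
    d_ct = ct_input - ct_ip  # positive when the IP sample amplifies earlier
    return efficiency ** d_ct


def summarize(replicates):
    """Mean and s.e.m. of fold enrichment across biological replicates.

    Each replicate is a pair (ip_cts, input_cts) of technical duplicates,
    which are averaged before the dCt is computed.
    """
    folds = [fold_enrichment(mean(ip), mean(inp)) for ip, inp in replicates]
    sem = stdev(folds) / sqrt(len(folds)) if len(folds) > 1 else 0.0
    return mean(folds), sem


# Hypothetical Ct values for two biological replicates (technical duplicates)
example = [([22.0, 22.2], [27.1, 27.1]), ([21.8, 22.0], [26.9, 27.1])]
average_fold, sem = summarize(example)
```

If the measured reaction efficiency differs from 2, it can be passed through the `efficiency` argument of `fold_enrichment`.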
Immunoblotting
For immunoblot analysis, protein samples were separated by SDS–PAGE, transferred onto polyvinylidene difluoride membranes (Millipore) and then probed with the following antibodies: rabbit anti-CENP-A (Cell Signaling, 2186 s, 1:1,000), rabbit anti-CENP-B (Millipore, 07-735, 1:200), mouse anti-α-tubulin (Abcam, DM1A, 1:5,000), rabbit anti-CAF1 p150 (Santa Cruz, sc-10772, 1:500), rabbit anti-CAF1 p60 (Bethyl Laboratories, A301-085A, 1:1,000), rabbit anti-CAF1 p48 (Bethyl Laboratories, A301-206A, 1:1,000) and rabbit anti-MCM2 (Abcam, Ab4461, 1:1,000). Following incubation with horseradish peroxidase-labelled antibody (GE Healthcare, NA931V or NA934V), horseradish peroxidase was detected using enhanced chemiluminescence substrate (Thermo Scientific, 34080 or 34096).
Immunofluorescence and live-cell imaging
A total of 1 × 10⁶ suspension cells were centrifuged and resuspended in PBS, and 10⁵ cells were immobilized on a glass slide by cytospin centrifugation for 3 min at 800 rpm. Cells were then fixed using ice-cold methanol at −20 °C for 10 min, followed by washing with cold PBS, and then incubated in Triton Block (0.2 M glycine, 2.5% fetal bovine serum, 0.1% Triton X-100, PBS) for 1 h. The following primary antibodies were used: mouse anti-GFP (Roche, 11814460001, 1:500), rabbit anti-CENP-B (Abcam 25734, 1:1,000) and human anti-centromere antibodies (ACA, Antibodies Inc, 15-234-0001, 1:500). The following secondary antibodies (Jackson Laboratories) were used for 45 min: donkey anti-human TR (1:300) and anti-mouse fluorescein isothiocyanate (FITC) (1:250). TAP fusion proteins were visualized by incubation with FITC-rabbit IgG (Jackson Laboratories, 1:200). Cells were then washed with 0.1% Triton X-100 in PBS, counterstained with 4′,6-diamidino-2-phenylindole and mounted with mounting medium (Molecular Probes, P36934). Immunofluorescent images were acquired on a Deltavision Core system at ×60–100 magnification. 0.2 μm Z-stack deconvolved projections were generated using the softWoRx program. For live-cell imaging, cells were plated on high-optical-quality plastic slides (ibidi) and imaged using a CQ1 confocal quantitative image cytometer (Yokogawa) following addition of IAA. DNA was labelled with SiR-DNA (Cytoskeleton). IAA was added at the microscope stage.
Flow cytometry
Flow cytometry was used to determine the DNA content of the cells. A total of 1 × 10⁶ cells were collected, washed in PBS and fixed in 70% ethanol. Cells were then washed and DNA was stained by incubating cells for 30 min with 1% fetal bovine serum, 10 μg ml−1 propidium iodide and 0.25 mg ml−1 RNase A in PBS followed by FACS analysis for DNA content using a BD LSR II flow cytometer (BD Biosciences).
ChIP-seq library generation and sequencing
ChIP libraries were prepared following Illumina protocols with minor modifications. To reduce biases induced by PCR amplification of a repetitive region, libraries were prepared from 80–100 ng of input or ChIP DNA. The DNA was end-repaired and A-tailed and Illumina TruSeq adaptors were ligated. Libraries were run on a 2% agarose gel. Since the chromatin was digested to mononucleosomes, following adaptor ligation the library size was 250–280 bp. The libraries were size selected for 200–375 bp. The libraries were then PCR-amplified using only five or six PCR cycles since the starting DNA amount was high. Resulting libraries were sequenced using 100 bp, paired-end sequencing on a HiSeq 2000 instrument according to the manufacturer’s instructions with some modifications (Illumina). Sequence reads are summarized in Supplementary Table 1.
Initial sequence processing and alignment
Illumina paired-end reads were merged to determine CENP-A- or H3-containing target fragments of varying length using PEAR software65, with standard parameters (P = 0.01, minimum overlap 10 bases, minimum assembly length 50 bp). Merged paired reads were mapped (BWA-MEM, standard parameters66,67) to the hg38 assembly (including alternative assemblies), which contains human α-satellite sequence models in each centromeric region (ref. 31; BioProject PRJNA193213). The highly repeated sequences in the centromere reference models preclude distinguishing between centromeric and pericentromeric sequences, the order of repeats in the models is arbitrarily assigned, and portions of the centromeres of the acrocentric chromosomes 13, 14, 21 and 22, as well as portions of centromeres of chromosomes 1, 5 and 19, contain nearly identical arrays that cannot be distinguished. Alignments to repeats, or multiple regions in the reference genome with the same mapping score, were assigned randomly. That is, across an entire reference model a given read may have equal probability of mapping across most of the repeat copies; however, the final assignment is random and so the alignments are distributed across the array. Mapping was performed under the same protocol for all ChIP and input samples. Reads were determined to contain α-satellite if they overlapped sites (BEDTools: intersect68) in the genome previously annotated as α-satellite (UCSC Table Browser69 was used to obtain a bed file of all sites annotated as ALR/α-satellite). Additionally, merged sequences were defined as containing α-satellite if they contained an exact match to at least two 18-base oligonucleotides specific to a previously published whole-genome sequencing read database of α-satellite, representing 2.6% of sequences from the HuRef genome29,33. Comparisons between the BWA mapping and 18-base oligonucleotide exact matching based strategies were highly concordant. 
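The exact-match strategy described above (a merged read is called α-satellite if it contains at least two of the 18-base oligonucleotides) can be illustrated with a short sketch. The oligonucleotide set here is built from a hypothetical toy sequence rather than the published HuRef-derived database, and checking both read orientations is an assumption not stated in the text.

```python
def kmers(seq, k=18):
    """Set of all k-base substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def contains_alpha_satellite(read, satellite_kmers, k=18, min_hits=2):
    """True if the read has >= min_hits distinct exact k-mer matches.

    Assumption: the read is tested in both orientations and the better
    orientation is used.
    """
    forward_hits = len(kmers(read, k) & satellite_kmers)
    reverse_hits = len(kmers(revcomp(read), k) & satellite_kmers)
    return max(forward_hits, reverse_hits) >= min_hits


# Hypothetical 30-base stand-in for an alpha-satellite reference sequence
toy_reference = "AATCTGCAAGTGGATATTTGGACCGCTTTG"
toy_kmers = kmers(toy_reference)
```

A set lookup keeps each read's test linear in read length, which matters when the oligonucleotide database is large.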
Total α-satellite DNA content in the hg38 assembly was estimated using the UCSC RepeatMasker annotation69. To identify reads that aligned uniquely to low-frequency repeat variants, or base changes in a repeat unit that are only observed once within the hg38 reference model, we used mapping scores (MAPQ = 20, corresponding to a 0.99 probability that a read is mapped correctly). A summary of reads obtained is shown in Supplementary Table 1.
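The relationship between the mapping score and the 0.99 probability quoted above follows from the Phred scaling of MAPQ, which can be checked with a few lines of code. The helper names are hypothetical, and treating the cutoff as "at least 20" is an assumption.

```python
def mapq_to_p_correct(mapq):
    """Convert a Phred-scaled mapping quality to P(alignment is correct).

    MAPQ = -10 * log10(P(wrong)), so P(correct) = 1 - 10 ** (-MAPQ / 10).
    """
    return 1.0 - 10.0 ** (-mapq / 10.0)


def is_confident(mapq, threshold=20):
    """Assumed filter: keep alignments whose MAPQ reaches the threshold."""
    return mapq >= threshold
```

For example, `mapq_to_p_correct(20)` evaluates to approximately 0.99, whereas reads assigned randomly among identical repeat copies receive a MAPQ of 0 and would not pass `is_confident`.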
ChIP-seq peak calling
Enrichment peaks for ChIP experiments were determined using the SICER algorithm (v1.0.3)35 using relevant input reads as background, with stringent parameters previously optimized for human CENP-A23: threshold for redundancy allowed for chip reads, 1; threshold for redundancy allowed for control reads, 1; window size, 200 bp; fragment size, 150 bp; shift in window length, 150; effective genome size as a fraction of the reference genome of hg38, 0.74; gap size, 400 bp; e-value for identification of candidate islands that exhibit clustering, 1,000; false discovery rate controlling significance, 0.00001. All sequencing samples were normalized to their matched input control, that is, each G1 sample was normalized to the G1 input and the G2 sequencing samples to the G2 input. In parallel, MACS peak calling was performed (macs14)34, and wiggle tracks were created to represent the read depth of each dataset independently. Finally, we performed a final, rigorous evaluation of ectopic CENP-A peaks, or peaks predicted outside centromeric regions, using k-mer enrichment (previously described29). Each ectopic peak was reformatted into 50 bp sliding windows (in both orientations, with a slide of 1 bp). The normalized frequency of each 50-base oligonucleotide candidate window was evaluated in each ChIP-seq dataset relative to a normalized observed frequency in the corresponding background dataset. Scores were determined as the log-transformed normalized value of the ratio between ChIP-seq and background, and those with a score greater than or equal to 2 were included in our study as a high-confidence enrichment set.
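The k-mer enrichment step above can be sketched as follows. This is an illustrative reading of the description, not the published implementation: normalizing by the total k-mer count of each dataset, the pseudocount and the base-2 logarithm are all assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def windows(peak_seq, k=50):
    """All k-base windows at a 1-bp slide, in both orientations."""
    forward = [peak_seq[i:i + k] for i in range(len(peak_seq) - k + 1)]
    return forward + [revcomp(w) for w in forward]


def count_kmers(reads, k=50):
    """Occurrences of every k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def enrichment_score(window, chip, background, pseudocount=1.0, base=2.0):
    """Log ratio of normalized ChIP to normalized background frequency.

    Assumptions: frequencies are normalized by each dataset's total k-mer
    count, and a pseudocount avoids division by zero.
    """
    chip_freq = (chip[window] + pseudocount) / sum(chip.values())
    bg_freq = (background[window] + pseudocount) / sum(background.values())
    return math.log(chip_freq / bg_freq, base)


def is_high_confidence(peak_seq, chip, background, k=50, threshold=2.0):
    """True if any window of the peak scores at or above the threshold."""
    return any(
        enrichment_score(w, chip, background) >= threshold
        for w in windows(peak_seq, k)
    )
```

Here `chip` and `background` are `Counter` objects such as those returned by `count_kmers`; on real data the totals would be computed once rather than on every call.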
Analysis of CENP-A peak overlap with functional annotation
Ectopic CENP-A peak calls, that is, those that did not overlap with centromeric α-satellite DNA, for HeLa, HT1080b (ref. 44) and HuRef (ref. 45) cells, were evaluated for enrichment with functional annotation if they were supported between replicate ChIP-seq experiments and overlapped at least one enriched 50-base oligonucleotide with a log-transformed normalized ratio of ≥2, or with a minimum standard ratio of fivefold. Resulting high-confidence ectopic peak calls were intersected (BEDTools: intersect68) with select functional datasets in the genome (UCSC Table Browser69). Peaks were assigned to genes if 90% (–f 0.9) of the SICER peak intersected with GRCh38 RefSeq genes (including introns and exons). Peaks were assigned to promoters if they intersected 1,000 bp upstream and downstream of genes (with minimum overlap of 50% of the SICER peak). To evaluate the role of expression, gene annotation was catalogued further based on intersection with ENCODE HeLa expression data with RefSeq gene annotations (where 22,211 RefSeq genes (40.5% of total) demonstrated at least 10 average reads/gene; and highly expressed RefSeq genes (10,033, or 18.3% of total) are defined as ≥100 average reads/gene; UCSC Table Browser69). To investigate peak overlap with sites of CTCF enrichment, we intersected peaks with two ENCODE replicate datasets (GEO GSM749729 and GSM749739), with minimum overlap of 20 bp. To study the overlap with sites of open chromatin, peaks were intersected with DNase I hypersensitive ENCODE datasets (GEO GSM736564 and GSM736510), with minimum overlap of 100 bases. Enrichment at H3K27me3 sites was determined by intersecting CENP-A binding data with H3K27me3 ChIP-seq (GEO GSM945208), with minimum overlap of 100 bases. Results were evaluated relative to a simulated peak dataset to test if observed peak counts were higher than expected by chance. 
Simulations were repeated 100 times to provide basic summary statistics: average, s.d., maximum/minimum, relative enrichment value and empirical P-value. To study the features of CENP-A non-centromeric preferential sites, CENP-A peaks were intersected with Broad Peaks of the following publicly available ENCODE HeLa S3 datasets with minimum overlap of 100 bases: H3K4me1 (GEO GSM798322), H3K4me2 (GEO GSM733734), H3K4me3 (GEO GSM733682), H2A.z (GEO GSM1003483), H3K9me3 (GEO GSM1003480), H3K27ac (GEO GSM733684), H3K27me3 (GEO GSM733696), H3K36me3 (GEO GSM733711), H3K79me2 (GEO GSM733669), H3K9ac (GEO GSM733756) and H4K20me1 (GEO GSM733689). CENP-A SICER peaks were also intersected with ENCODE high-occupancy target regions in the human genome assembly42 (GEO GSE54296).
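The comparison of observed overlaps against simulated peak sets can be sketched as below. The text does not specify how simulated peaks were generated, so uniform random placement of length-matched intervals on a single linear coordinate system is an assumption, as are the add-one empirical P-value and all function names.

```python
import random
from statistics import mean, stdev


def overlaps(peak, features, min_overlap=100):
    """True if the peak shares >= min_overlap bases with any feature."""
    start, end = peak
    return any(min(end, f_end) - max(start, f_start) >= min_overlap
               for f_start, f_end in features)


def count_overlapping(peaks, features, min_overlap=100):
    """Number of peaks that overlap at least one feature."""
    return sum(overlaps(p, features, min_overlap) for p in peaks)


def simulate(peaks, features, genome_length, n_sim=100, min_overlap=100, seed=0):
    """Compare the observed overlap count with length-matched random peaks."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = count_overlapping(peaks, features, min_overlap)
    sims = []
    for _ in range(n_sim):
        shuffled = []
        for start, end in peaks:
            length = end - start
            new_start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((new_start, new_start + length))
        sims.append(count_overlapping(shuffled, features, min_overlap))
    expected = mean(sims)
    return {
        "observed": observed,
        "mean": expected,
        "sd": stdev(sims),
        "min": min(sims),
        "max": max(sims),
        "relative_enrichment": observed / expected if expected else float("inf"),
        # Add-one correction keeps the empirical P-value above zero
        "empirical_p": (sum(s >= observed for s in sims) + 1) / (n_sim + 1),
    }
```

With 100 simulations the smallest attainable empirical P-value under this convention is 1/101, so the estimate is bounded by the number of simulations rather than by the strength of the enrichment.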
Repli-seq experiments
BrdU-labelled DNA across S phase was prepared as previously described46 with some modifications. Briefly, cells were synchronized using double thymidine block and release8. Following release from double thymidine block, cells were labelled with BrdU (Sigma, B5002) for 1 h by adding BrdU to the culture medium to a final concentration of 50 µM. BrdU was added immediately after release for labelling at early S (S0–S1), 3 h after release (S3) for mid-S (S3–S4) labelling or 6 h after release (S6) for labelling at late S (S6–S7). Following labelling with BrdU, genomic DNA was extracted, sonicated and heat denatured as previously described46. BrdU-labelled DNA was immunoprecipitated using an anti-BrdU antibody (BD Biosciences, 555627) coupled to magnetic Dynabeads M-270 epoxy (Thermo Fisher Scientific, 14301). Eluted single-stranded DNA was made into double-stranded DNA using random-prime extension (Thermo Fisher Scientific, Random Primers DNA labelling kit, 18187-013). Following cleanup of the double-stranded DNA (QIAgen QiaQuick PCR purification kit, 28104), the DNA was validated by performing quantitative real-time PCR using primers for MRGPRE, MMP15, HBE1 and Sat2 and comparison with the ENCODE HeLa S3 replication timing profile (GEO GSM923449). Libraries were then prepared as described above, and sequenced using the Illumina instrument according to the manufacturer’s instructions, with the exception that, following adaptor ligation, Repli-seq libraries were size selected between 250 and 500 bp.
Mass spectrometry identification of proteins associating with CENP-ATAP chromatin
CENP-ATAP was immunoprecipitated from the chromatin fraction of randomly cycling cells or late-S-synchronized cells as described above. Following bead washes, beads were snap frozen in liquid nitrogen. Samples were prepared for mass spectrometry and run on an LTQ Orbitrap Velos mass spectrometer (Thermo Scientific) as previously described70. Analysis was performed as previously described and is detailed in the Reporting Summary.
Statistics and reproducibility
For all experiments shown, n is indicated in the figure legends. Values represent the mean. Error bars, if shown, are s.e.m. (as indicated in the figure legends). For Fig. 6b,e,f,h,j,l, the experiment was repeated independently twice with similar results. For Fig. 6c,d,i,k,m, results of two biologically independent experiments are shown. Statistics source data for all graphical representations are available in Supplementary Table 4.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
ChIP-seq and Repli-seq datasets generated during the current study were deposited at GEO under primary accession number GSE111381. Figures 1, 2, 3, 4, 5 and 6 and Supplementary Figs. 1, 2, 3 and 5 have associated raw mapping data. Mass spectrometry data were deposited at the ProteomeXchange database under accession number PXD013385. Mass spectrometry data are found in Table 1 and Supplementary Fig. 5.
Previously published ChIP-seq data that were reanalysed in this study include HT1080b (ref. 44) CENP-AFLAG ChIP-seq (GSM2808187 and corresponding input dataset GSM2808190) and HuRef (ref. 45) CENP-A native ChIP-seq (GSM1494428, GSM1494429, GSM1494430, GSM1494431, and corresponding input datasets GSM1494424, GSM1494425, GSM1494426, GSM1494427). HeLa CENP-ATAP, HeLa CENP-ALAP, HT1080b CENP-AFLAG (ref. 44) and HuRef (ref. 45) endogenous CENP-A ChIP-seq datasets were intersected with the following publicly available ENCODE HeLa S3 datasets: DNase I (GSM736564 and GSM736510), H3K4me1 (GSM798322), H3K4me2 (GSM733734), H3K4me3 (GSM733682), H2A.z (GSM1003483), H3K9me3 (GSM1003480), H3K27ac (GSM733684), H3K27me3 (GSM945208), H3K36me3 (GSM733711), H3K79me2 (GSM733669), H3K9ac (GSM733756) and H4K20me1 (GSM733689). CENP-ATAP and CENP-ALAP SICER peaks were also intersected with ENCODE CTCF datasets (GSM749729 and GSM749739), and high-occupancy target region datasets (GSE54296).
Source data for Figs. 1, 3, 5, 6 and 7 and Supplementary Figs. 3, 4 and 5 have been provided as Supplementary Table 4. All other data supporting the findings of this study are available from the corresponding authors on reasonable request.
References
Wevrick, R. & Willard, H. F. Long-range organization of tandem arrays of alpha satellite DNA at the centromeres of human chromosomes: high-frequency array-length polymorphism and meiotic stability. Proc. Natl Acad. Sci. USA 86, 9394–9398 (1989).
Cleveland, D. W., Mao, Y. & Sullivan, K. F. Centromeres and kinetochores: from epigenetics to mitotic checkpoint signaling. Cell 112, 407–421 (2003).
Willard, H. F. Chromosome-specific organization of human alpha satellite DNA. Am. J. Hum. Genet. 37, 524–532 (1985).
Manuelidis, L. & Wu, J. C. Homology between human and simian repeated DNA. Nature 276, 92–94 (1978).
Earnshaw, W. C. & Rothfield, N. Identification of a family of human centromere proteins using autoimmune sera from patients with scleroderma. Chromosoma 91, 313–321 (1985).
Palmer, D. K., O’Day, K., Wener, M. H., Andrews, B. S. & Margolis, R. L. A 17-kD centromere protein (CENP-A) copurifies with nucleosome core particles and with histones. J. Cell Biol. 104, 805–815 (1987).
Bodor, D. L. et al. The quantitative architecture of centromeric chromatin. eLife 3, e02137 (2014).
Nechemia-Arbely, Y. et al. Human centromeric CENP-A chromatin is a homotypic, octameric nucleosome at all cell cycle points. J. Cell Biol. 216, 607–621 (2017).
Sullivan, B. A. & Karpen, G. H. Centromeric chromatin exhibits a histone modification pattern that is distinct from both euchromatin and heterochromatin. Nat. Struct. Mol. Biol. 11, 1076–1083 (2004).
Stimpson, K. M. & Sullivan, B. A. Epigenomics of centromere assembly and function. Curr. Opin. Cell Biol. 22, 772–780 (2010).
Marshall, O. J., Chueh, A. C., Wong, L. H. & Choo, K. H. Neocentromeres: new insights into centromere structure, disease development, and karyotype evolution. Am. J. Hum. Genet. 82, 261–282 (2008).
Fachinetti, D. et al. A two-step mechanism for epigenetic specification of centromere identity and function. Nat. Cell Biol. 15, 1056–1066 (2013).
Foltz, D. R. et al. The human CENP-A centromeric nucleosome-associated complex. Nat. Cell Biol. 8, 458–469 (2006).
Okada, M. et al. The CENP-H-I complex is required for the efficient incorporation of newly synthesized CENP-A into centromeres. Nat. Cell Biol. 8, 446–457 (2006).
Hori, T. et al. CCAN makes multiple contacts with centromeric DNA to provide distinct pathways to the outer kinetochore. Cell 135, 1039–1052 (2008).
Hori, T., Shang, W. H., Takeuchi, K. & Fukagawa, T. The CCAN recruits CENP-A to the centromere and forms the structural core for kinetochore assembly. J. Cell Biol. 200, 45–60 (2013).
Padeganeh, A. et al. Octameric CENP-A nucleosomes are present at human centromeres throughout the cell cycle. Curr. Biol. 23, 764–769 (2013).
Jansen, L. E., Black, B. E., Foltz, D. R. & Cleveland, D. W. Propagation of centromeric chromatin requires exit from mitosis. J. Cell Biol. 176, 795–805 (2007).
Nechemia-Arbely, Y., Fachinetti, D. & Cleveland, D. W. Replicating centromeric chromatin: spatial and temporal control of CENP-A assembly. Exp. Cell Res. 318, 1353–1360 (2012).
Foltz, D. R. et al. Centromere-specific assembly of CENP-A nucleosomes is mediated by HJURP. Cell 137, 472–484 (2009).
Dunleavy, E. M. et al. HJURP is a cell-cycle-dependent maintenance and deposition factor of CENP-A at centromeres. Cell 137, 485–497 (2009).
Silva, M. C. et al. Cdk activity couples epigenetic centromere inheritance to cell cycle progression. Dev. Cell 22, 52–63 (2012).
Lacoste, N. et al. Mislocalization of the centromeric histone variant CenH3/CENP-A in human cells depends on the chaperone DAXX. Mol. Cell 53, 631–644 (2014).
Van Hooser, A. A. et al. Specification of kinetochore-forming chromatin by the histone H3 variant CENP-A. J. Cell Sci. 114, 3529–3542 (2001).
Shrestha, R. L. et al. Mislocalization of centromeric histone H3 variant CENP-A contributes to chromosomal instability (CIN) in human cells. Oncotarget 8, 46781–46800 (2017).
Filipescu, D. et al. Essential role for centromeric factors following p53 loss and oncogenic transformation. Genes Dev. 31, 463–480 (2017).
Mata, J. F., Lopes, T., Gardner, R. & Jansen, L. E. A rapid FACS-based strategy to isolate human gene knockin and knockout clones. PLoS ONE 7, e32646 (2012).
Landry, J. J. et al. The genomic and transcriptomic landscape of a HeLa cell line. G3 3, 1213–1224 (2013).
Hayden, K. E. et al. Sequences associated with centromere competency in the human genome. Mol. Cell. Biol. 33, 763–772 (2013).
Conde e Silva, N. et al. CENP-A-containing nucleosomes: easier disassembly versus exclusive centromeric localization. J. Mol. Biol. 370, 555–573 (2007).
Miga, K. H. et al. Centromere reference models for human chromosomes X and Y satellite arrays. Genome Res. 24, 697–707 (2014).
Schneider, V. A. et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 27, 849–864 (2017).
Levy, S. et al. The diploid genome sequence of an individual human. PLoS Biol. 5, e254 (2007).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Zang, C. et al. A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952–1958 (2009).
Maloney, K. A. et al. Functional epialleles at an endogenous human centromere. Proc. Natl Acad. Sci. USA 109, 13704–13709 (2012).
Ly, P. et al. Selective Y centromere inactivation triggers chromosome shattering in micronuclei and repair by non-homologous end joining. Nat. Cell Biol. 19, 68–75 (2017).
Amor, D. J. & Choo, K. H. Neocentromeres: role in human disease, evolution, and centromere study. Am. J. Hum. Genet. 71, 695–714 (2002).
Hasson, D. et al. The octamer is the major form of CENP-A nucleosomes at human centromeres. Nat. Struct. Mol. Biol. 20, 687–695 (2013).
Amor, D. J. et al. Human centromere repositioning “in progress”. Proc. Natl Acad. Sci. USA 101, 6542–6547 (2004).
Alonso, A. et al. Co-localization of CENP-C and CENP-H to discontinuous domains of CENP-A chromatin at human neocentromeres. Genome Biol. 8, R148 (2007).
Li, H. et al. Functional annotation of HOT regions in the human genome: implications for human disease and cancer. Sci. Rep. 5, 11633 (2015).
Teytelman, L., Thurtle, D. M., Rine, J. & van Oudenaarden, A. Highly expressed loci are vulnerable to misleading ChIP localization of multiple unrelated proteins. Proc. Natl Acad. Sci. USA 110, 18602–18607 (2013).
Thakur, J. & Henikoff, S. Unexpected conformational variations of the human centromeric chromatin complex. Genes Dev. 32, 20–25 (2018).
Henikoff, J. G., Thakur, J., Kasinathan, S. & Henikoff, S. A unique chromatin complex occupies young alpha-satellite arrays of human centromeres. Sci. Adv. 1, e1400234 (2015).
Hansen, R. S. et al. Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. Proc. Natl Acad. Sci. USA 107, 139–144 (2010).
Bui, M. et al. Cell-cycle-dependent structural transitions in the human CENP-A nucleosome in vivo. Cell 150, 317–326 (2012).
Erliandri, I. et al. Replication of alpha-satellite DNA arrays in endogenous human centromeric regions and in human artificial chromosome. Nucl. Acids Res. 42, 11502–11516 (2014).
Guse, A., Carroll, C. W., Moree, B., Fuller, C. J. & Straight, A. F. In vitro centromere and kinetochore assembly on defined chromatin templates. Nature 477, 354–358 (2011).
Carroll, C. W., Milks, K. J. & Straight, A. F. Dual recognition of CENP-A nucleosomes is required for centromere assembly. J. Cell Biol. 189, 1143–1155 (2010).
Petrovic, A. et al. Structure of the MIS12 complex and molecular basis of its interaction with CENP-C at human kinetochores. Cell 167, 1028–1040 (2016).
Rago, F., Gascoigne, K. E. & Cheeseman, I. M. Distinct organization and regulation of the outer kinetochore KMN network downstream of CENP-C and CENP-T. Curr. Biol. 25, 671–677 (2015).
Klare, K. et al. CENP-C is a blueprint for constitutive centromere-associated network assembly within human kinetochores. J. Cell Biol. 210, 11–22 (2015).
Weir, J. R. et al. Insights from biochemical reconstitution into the architecture of human kinetochores. Nature 537, 249–253 (2016).
Hoffmann, S. et al. CENP-A is dispensable for mitotic centromere function after initial centromere/kinetochore assembly. Cell Rep. 17, 2394–2404 (2016).
Smith, S. & Stillman, B. Stepwise assembly of chromatin during DNA replication in vitro. EMBO J. 10, 971–980 (1991).
Verreault, A., Kaufman, P. D., Kobayashi, R. & Stillman, B. Nucleosome assembly by a complex of CAF-1 and acetylated histones H3/H4. Cell 87, 95–104 (1996).
Hayashi, T. et al. Mis16 and Mis18 are required for CENP-A loading and histone deacetylation at centromeres. Cell 118, 715–729 (2004).
Kaufman, P. D., Kobayashi, R., Kessler, N. & Stillman, B. The p150 and p60 subunits of chromatin assembly factor I: a molecular link between newly synthesized histones and DNA replication. Cell 81, 1105–1114 (1995).
Huang, H. et al. A unique binding mode enables MCM2 to chaperone histones H3-H4 at replication forks. Nat. Struct. Mol. Biol. 22, 618–626 (2015).
Zhang, W. et al. Centromere and kinetochore gene misexpression predicts cancer patient survival and response to radiotherapy and chemotherapy. Nat. Commun. 7, 12619 (2016).
Sun, X. et al. Elevated expression of the centromere protein-A (CENP-A)-encoding gene as a prognostic and predictive biomarker in human cancers. Int. J. Cancer 139, 899–907 (2016).
Gascoigne, K. E. et al. Induced ectopic kinetochore assembly bypasses the requirement for CENP-A nucleosomes. Cell 145, 410–422 (2011).
Zasadzinska, E. et al. Inheritance of CENP-A nucleosomes during DNA replication requires HJURP. Dev. Cell 47, 348–362.e7 (2018).
Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2014).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucl. Acids Res. 32, D493–D496 (2004).
Wang, Z., Wu, C., Aslanian, A., Yates, J. R. III & Hunter, T. Defective RNA polymerase III is negatively regulated by the SUMO-Ubiquitin-Cdc48 pathway. eLife 7, e35447 (2018).
Acknowledgements
We thank A. Desai, P. Ly and C. Eissler for critical discussion, L.E.T. Jansen (Gulbenkian Institute, Portugal) for reagents and D.-H. Kim for productive discussions and technical help. This work was supported by grants (R01 GM-074150 and R35 GM-122476) from the NIH to D.W.C., who receives salary support from Ludwig Cancer Research.
Author information
Contributions
Y.N.-A. and D.W.C. conceived and designed experiments and wrote the manuscript. Y.N.-A. performed experiments. K.H.M. analysed the sequencing data. M.A.M. and O.S. analysed data and performed experiments. D.F. suggested experiments and provided key experimental input. A.Y.L. and B.R. prepared sequencing libraries and provided resources. A.A. and J.R.Y. performed mass spectrometry experiments and provided resources.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
Supplementary Figure 1 Identification of peaks enriched for CENP-A binding.
(a) Scheme showing experimental design for tagging an endogenous CENP-A locus to produce CENP-A+/LAP HeLa cells. These cells were then adapted to suspension growth. (b) Scheme showing the experimental design for obtaining increased levels of CENP-ATAP expression. CENP-ATAP is expressed in these cells at 4.5-fold the level of CENP-A in the parental HeLa cells1. (c, d) Localization of endogenously tagged CENP-ALAP (c) and CENP-ATAP (d) determined with indirect immunofluorescence using anti-GFP antibody (c) or rabbit IgG (d). Scale bar, 5 μm. The experiment was repeated independently three times with similar results for both (c) and (d). (e) FACS analysis of DNA content showing the synchronization efficiency of CENP-A+/LAP and CENP-ATAP HeLa cell lines. The experiment was repeated independently five times with similar results. (f, g) Examples of centromeric regions of chromosome 7 (f) and 5 (g) showing increased occupancy of overexpressed CENP-ATAP (compare CENP-ATAP with CENP-ALAP). The experiment was repeated independently twice with similar results. (h) Overlap between G1 and G2 CENP-ATAP binding peaks at α-satellite sequences. (i) Top, overlap between G1 and G2 CENP-A single-mapping binding sites at α-satellite HOR sequences. Bottom, peak overlap between G1 CENP-ATAP (increased expression) and CENP-ALAP (endogenous level) single-mapping binding sites at α-satellite HOR sequences.
Supplementary Figure 2 CENP-A ChIP-seq identifies CENP-A binding at reference centromeres of 23 human chromosomes.
CENP-ALAP bound DNAs at G1 and G2 were sequenced, with two replicates per condition, and mapped to the centromeric reference models in the hg38 assembly (refs. 2,3). Shown are the raw mapping data (coloured) for every human centromere (except for the centromere of chromosome 19, which shares almost all of its α-satellite arrays with those of chromosomes 1 and 5) and CENP-A binding called as SICER peaks (black lines, underneath) for one replicate for each time point. The experiment was repeated twice independently with similar results. Centromere reference location, red. CENP-B box, orange.
Supplementary Figure 3 Ectopic deposition of CENP-A into open and active chromatin at G1 does not function as a seeding hotspot for neocentromere formation.
(a) Number of non-α-satellite CENP-A SICER binding sites called at G1 or G2 at different fold thresholds (above background). (b) Human DLD1 cells with auxin-degradable CENP-AAID and a doxycycline-inducible CENP-AWT (ref. 4) were synchronized at G1 using the CDK4/6 inhibitor PD-0332991 (also known as Palbociclib) or at mitosis using nocodazole, following addition of doxycycline. The experiment was repeated independently three times with similar results. (c) Read mapping data of CENP-ATAP ChIP-seq at G1 (red) and G2 (blue), at the chromosomal location of a known patient-derived neocentromere (ref. 5) found in chromosome 13. The experiment was repeated twice independently with similar results. A third human neocentromere, found in line MS4221, has been mapped to a 400 kb region at position 86.5 to 86.9 Mb on chromosome 8 in hg19 (refs. 5,6) (corresponding to 85.78–85.88 Mb in hg38). However, a gap and segmental duplications that appear in this region precluded precise analysis of CENP-A mapping at this neocentromere. (d-g) Fold enrichment of CENP-ATAP chromatin in randomly cycling cells or at G1 (d, e) and CENP-ALAP chromatin (f, g) at G1 at different genomic locations. SICER peaks ≥ 5-fold supported between two replicates were analysed for their enrichment level at different genomic locations, compared to the level of enrichment at these sites by chance. (h, i) Number of CENP-ATAP (h) and CENP-ALAP (i) SICER peaks ≥ 5-fold that overlap with ‘HOT’ regions in the human genome in G1 and G2 synchronized cells. Data shown are from two biologically independent experiments. Source data for d-i can be found in Supplementary Table 4.
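The replicate-support and enrichment-over-chance analysis described in (d-g) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The peak tuple layout, the function names, the 1 bp overlap criterion for "supported between two replicates" and the way the chance expectation is passed in as a precomputed fraction are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical peak record: (chromosome, start, end, fold over background).
Peak = Tuple[str, int, int, float]


def overlaps(a: Sequence, b: Sequence) -> bool:
    """True if two half-open intervals on the same chromosome share >= 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def replicate_supported(rep1: List[Peak], rep2: List[Peak],
                        min_fold: float = 5.0) -> List[Peak]:
    """Keep replicate-1 peaks at or above min_fold that overlap a
    replicate-2 peak also at or above min_fold (assumed criterion)."""
    strong2 = [p for p in rep2 if p[3] >= min_fold]
    return [p for p in rep1
            if p[3] >= min_fold and any(overlaps(p, q) for q in strong2)]


def fold_enrichment(peaks: List[Peak], features: List[Tuple[str, int, int]],
                    expected_fraction: float) -> float:
    """Observed fraction of peaks overlapping a feature class, divided by
    the fraction expected by chance (supplied by the caller, for example
    from shuffled peak positions)."""
    if not peaks:
        return 0.0
    hits = sum(1 for p in peaks if any(overlaps(p, f) for f in features))
    return (hits / len(peaks)) / expected_fraction


# Toy data, invented for illustration only.
rep1 = [("chr1", 100, 200, 6.0), ("chr1", 500, 600, 4.0), ("chr2", 100, 200, 8.0)]
rep2 = [("chr1", 150, 250, 5.5), ("chr1", 500, 600, 9.0), ("chr2", 300, 400, 7.0)]
supported = replicate_supported(rep1, rep2)
promoters = [("chr1", 180, 220)]
enrichment = fold_enrichment(supported, promoters, expected_fraction=0.25)
```

In practice the chance expectation would come from a background model such as randomly repositioned peaks, and interval overlaps on genome-scale data would normally be computed with a dedicated interval tool rather than the quadratic scan used here.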
Supplementary Figure 4 Non-centromeric CENP-A binding peaks overlap with active transcription marks.
(a,b) The chromatin features of CENP-ATAP (a) and CENP-ALAP (b) non-centromeric preferential sites were analysed by intersecting SICER peaks ≥ 5-fold supported between two replicates with publicly available ENCODE datasets for histone modification profiles in HeLa S3 that represent modifications typically associated with transcription activation or repression. The experiment was performed once, except for DNase I and H3K27me3, for which two ENCODE datasets are available and the experiment was therefore repeated twice independently with similar results. Statistics source data for Supplementary Fig. 4a, b can be found in Supplementary Table 4. (The sum of ectopic CENP-ATAP sites at active or repression marks is more than 100%, the result of overlap between H3K9me3 and active transcription marks.) (c,d) The chromatin features of sites of preferential, non-centromeric CENP-A binding were analysed for histone modification profiles associated with transcription activation or repression in HeLa S3 cells by intersecting SICER peaks ≥ 5-fold found in previously published CENP-A ChIP-seq datasets in HT1080 (c; ref. 7) and HuRef (d; ref. 8) cell lines with publicly available ENCODE datasets for histone modification profiles in HeLa S3. For HT1080b (c), the experiment was performed once, except for DNase I and H3K27me3, for which two ENCODE datasets are available and the experiment was therefore repeated twice independently with similar results. For HuRef (d), the experiment was performed four times, except for DNase I and H3K27me3, for which two ENCODE datasets are available and the experiment was therefore repeated independently eight times with similar results. Source data for a-d can be found in Supplementary Table 4.
Supplementary Figure 5 Centromeres are late replicating with CENP-A remaining tethered locally by continued binding to the CCAN complex.
(a) FACS analysis of DNA content showing the synchronization efficiency of CENP-ATAP HeLa cell line across S phase. The experiment was repeated independently twice with similar results. (b) Genomic DNA of cells labelled for 1 hour with BrdU was sonicated prior to the BrdU immunoprecipitation and fragments of 200–800 bp were obtained. The experiment was repeated independently twice with similar results. Unprocessed images of DNA gels can be found in Supplementary Fig. 6. (c) Quantitative real-time PCR for MRGPRE and MMP15 genes, previously reported to replicate early (ref. 9 and ENCODE Repli-seq). (d) Quantitative real-time PCR for HBE1 and Sat2 (ref. 10) genes, previously reported to replicate late (ref. 11 and ENCODE Repli-seq). (e) Quantitative real-time PCR for α-satellite DNA. Data shown in c-e are from two biologically independent experiments. Source data for c-e can be found in Supplementary Table 4. (f) MNase digestion profile showing the nucleosomal DNA length distributions of bulk input mononucleosomes (upper panel) and purified CENP-ATAP following native ChIP at early S and mid-S phase. The experiment was repeated twice independently with similar results. (g) CENP-A ChIP-seq raw mapping data spanning the whole of cen18 at G1, mid-S phase and G2, and BrdU repli-seq at early S (S1), mid-S (S4) and late S/G2 (S7). SICER peaks are denoted as black lines underneath the raw mapping data. The experiment was repeated twice independently with similar results. Centromere reference location, red. CENP-B boxes, orange. Scale bar, 2 Mb. (h) Degree of overlap between CENP-A G1 and mid-S at α-satellite HORs single copy variants. (i) Ethidium bromide-stained DNA agarose gel showing MNase digestion profile of bulk chromatin used for mass spectrometry identification of proteins associating with CENP-ATAP chromatin (left panel) and for CENP-ATAP co-immunoprecipitation experiment (right panel). Mass spectrometry was performed once and co-IP was performed twice with similar results. Unprocessed images of DNA gels can be found in Supplementary Fig. 6. (j-n) CENP-ATAP immunopurification followed by mass spectrometry identifies association with CENP-A chromatin of DNA replication related proteins (j,k), chromatin remodelling factors and nuclear chaperones (l), histones (m) and centromere and kinetochore proteins (n).
Supplementary Figure 6
Unprocessed film scans of all immunoblots and DNA gels with corresponding protein and DNA size markers.
Supplementary information
Supplementary Information
Supplementary Figs. 1–6 and their legends, and legends for Supplementary Video 1 and Supplementary Tables 1–4
Supplementary Table 1
Read statistics for ChIP-seq and Repli-seq experiments.
Supplementary Table 2
Endogenous CENP-A sequence mapping onto α-satellite DNAs in human centromere reference models for each autosome and the X chromosome.
Supplementary Table 3
Antibodies used in the study.
Supplementary Table 4
Statistical source data for graphical representations.
Supplementary Video 1
Rapid CENP-CAE/AE depletion following IAA treatment.
Cite this article
Nechemia-Arbely, Y., Miga, K.H., Shoshani, O. et al. DNA replication acts as an error correction mechanism to maintain centromere identity by restricting CENP-A to centromeres. Nat Cell Biol 21, 743–754 (2019). https://doi.org/10.1038/s41556-019-0331-4
DOI: https://doi.org/10.1038/s41556-019-0331-4
- Springer Nature Limited