Centromere identity: a challenge to be faced
- Cite this article as:
- Mehta, G.D., Agarwal, M.P. & Ghosh, S.K. Mol Genet Genomics (2010) 284: 75. doi:10.1007/s00438-010-0553-4
The centromere is a genetic locus, required for faithful chromosome segregation, where spindle fibers attach to the chromosome through the kinetochore. Loss of the centromere or formation of multiple centromeres on a single chromosome leads to chromosome missegregation or chromosome breakage, respectively, both of which are detrimental to the fitness and survival of a cell. Therefore, understanding how the centromere locus is determined on the chromosome and how such a locus is perpetuated in subsequent generations (known as centromere identity) is fundamental to combating conditions such as aneuploidy, spontaneous abortion, developmental defects, cell lethality and cancer. Recent studies have proposed different models to explain centromere identity; however, the exact mechanism remains elusive. Most eukaryotic centromeres appear to be determined epigenetically rather than by DNA sequence. The epigenetic marks that are instrumental in determining centromere identity are the histone H3 variant CENP-A and specialized posttranslational modifications of the core histones. Here we review recent studies on the factors responsible for generating unique centromeric chromatin and on how it is perpetuated during cell division, and present the current models. We further focus on the probable mechanism of de novo centromere formation, using the neocentromere as an example. As a parallel, this review also covers how extrachromosomal chromatin is marked to serve as a partitioning locus by deposition of the CENP-A homolog in budding yeast.
CENP-A: Centromeric protein A
CCAN: Constitutive centromere-associated network
CHD1: Chromodomain helicase DNA-binding protein 1
HJURP: Holliday junction recognition protein
KNL: Kinetochore null phenotype
NURF: Nucleosome remodeling factor
NASP: Nuclear autoantigenic sperm protein
The faithful transmission of a genome from one generation to another depends on the mechanism of cell division, in which each pair of replicated chromosomes, known as sister chromatids, is separated and equally distributed to mother and daughter cells. The centromere is a specialized locus of the eukaryotic chromosome that attaches, via a kinetochore protein complex, to spindle fibers emanating from both poles and helps move the chromosome during cell division.
Although centromeric DNA sequences are not conserved among species, centromeric chromatin has some conserved characteristics. For example, the nucleosomes of all the centromeric chromatin studied to date contain the histone variant CENP-A (centromere protein A) instead of canonical histone H3 (Henikoff and Dalal 2005; Dalal 2009; Buscaino et al. 2010). Moreover, the posttranslational modifications present on the N-terminal tails of core histones in centromeric chromatin are also distinctive (Sullivan and Karpen 2004). Importantly, centromeric chromatin in most eukaryotes is flanked on both sides by heterochromatin. CENP-A containing chromatin is believed to assemble kinetochore proteins, whereas the surrounding heterochromatin is believed to maintain sister chromatid cohesion (Gartenberg 2009; Torras-Llort et al. 2009). Thus, CENP-A containing nucleosomes, specific histone modifications, or both can act at the centromeric chromatin as an epigenetic mark that specifies the position of a centromere and its perpetuation from one generation to the next. If CENP-A deposition is such a mark, it remains poorly understood how CENP-A is targeted to the centromeric chromatin. Experimental evidence has shown that there is a temporal difference between CENP-A synthesis and its loading onto chromatin (Torras-Llort et al. 2009). In this review, we summarize the contribution of several factors to depositing CENP-A specifically at the centromere and, where relevant, also cover the literature on neocentromeres. We discuss the prevailing models describing how CENP-A can be deposited in the context of the cell cycle. Additionally, we discuss other epigenetic factors, such as histone modifications and DNA methylation, as probable marks of centromere identity.
In general, centromere research on different model organisms has given very diverse and sometimes contradictory results, leaving the exact mechanism of centromere identity elusive and making it necessary to review this topic at regular intervals. Currently the key issues of centromere research are (1) what are the molecular mechanisms that specify centromere identity at a single site, (2) when is this identity imposed in the cell cycle and (3) how is this identity passed on from mother to daughter DNA strands? In this review, our goal is to summarize current work from several research groups in order to provide an updated status of these issues. Towards the end, for comparison, we also examine the pathways leading to the deposition of CENP-A at chromosomal and extrachromosomal chromatin.
A special flavor of centromeric chromatin
The centromere is the fastest evolving region on the chromosome. Its evolution involves variation in its DNA sequences and kinetochore proteins, as well as large-scale recombination and rearrangements between phylogenetically related species (Wang et al. 2009). The sequences of centromeric chromatin are diverse among different species. Despite this diversity, all centromeres display a unique feature: the canonical histone H3 is replaced by a variant H3, CENP-A. The simplest centromere is the ‘point’ centromere of the budding yeast Saccharomyces cerevisiae, which spans only 125 bp and contains three conserved DNA sequence elements (CDEs). Centromere identity is essentially determined by CDEIII, which contains a 25 bp palindromic DNA sequence that is recognized by the sequence-specific DNA-binding factor CBF3. CBF3 binding is crucial for the assembly of a nucleosome containing Cse4, the yeast homolog of CENP-A (Meluh et al. 1998), which forms over CDEII, an AT rich (>90%) domain. Thus the budding yeast centromeres are well defined by DNA sequence alone, with the entire 125 bp of DNA wrapped around a single Cse4 containing nucleosome (Fig. 1; Henikoff and Dalal 2005; Furuyama and Biggins 2007). On the other hand, the ‘regional’ centromeres of the fission yeast Schizosaccharomyces pombe are 35–110 kb in size and contain three regions: a 4–7 kb nonrepetitive central core (cnt) surrounded by centromere-specific inner repeats (imr) which are, in turn, surrounded by tandem arrays of outer repeats (otr) (Fig. 1; Wood et al. 2002). The central core and imr repeats make up the central domain, which harbors CENP-A containing nucleosomes and is involved in kinetochore formation, whereas the otr repeats, which are embedded in a heterochromatic region, are involved in the cohesion of sister chromatids during cell division. The cnt and imr sequences differ among the three chromosomes of S. pombe, whereas otr sequences are common to all (Ishii 2009).
However, cnt1 and cnt3 regions also contain a 3.3 kb ‘tm’ element which has 99% sequence similarity, whereas cnt2 has a ~1.5 kb element which is 48% identical to ‘tm’ (Pidoux and Allshire 2004).
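The sequence-based definition of the budding yeast point centromere described above lends itself to a small illustration. The following is a minimal sketch, not a published method: it computes the AT fraction of a DNA string and compares it with the >90% AT content quoted for CDEII. The function names and the toy sequence are hypothetical and chosen only for illustration; the toy sequence is not a real CDEII element.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    at_count = sum(1 for base in seq if base in "AT")
    return at_count / len(seq)


def is_at_rich(seq: str, threshold: float = 0.90) -> bool:
    """True if the AT fraction exceeds the threshold (0.90, as quoted for CDEII)."""
    return at_fraction(seq) > threshold


# Hypothetical toy sequence: 38 A/T bases followed by 2 G/C bases (40 bp in total).
toy_sequence = "AT" * 19 + "GC"

if __name__ == "__main__":
    print(f"AT fraction: {at_fraction(toy_sequence):.2f}")
    print(f"AT rich by the 90% criterion: {is_at_rich(toy_sequence)}")
```

AT richness alone does not identify a centromere; in budding yeast the CDEIII element and CBF3 binding are also required, as noted above.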
In spite of the enormous variation in centromeric sequences, CENP-A is present at all natural centromeres as well as at variant functional centromeres such as neocentromeres and de novo artificial centromeres (Carroll and Straight 2006). Overexpression of CENP-A leads to its deposition at ectopic locations on the chromosome and to kinetochore formation at noncentromeric sites, which results in chromosomal breakage and genome instability (Heun et al. 2006; Van Hooser et al. 2001). Conversely, the loss of CENP-A results in the failure of centromere formation and kinetochore assembly, causing mitotic arrest and embryonic lethality (Howman et al. 2000; Oegema et al. 2001; Régnier et al. 2005; Blower et al. 2006). These observations indicate that CENP-A is the structural and functional foundation for kinetochore formation and that it is situated at or close to the top of the pathway responsible for kinetochore formation. How does CENP-A containing chromatin differ from H3 containing chromatin? In vitro studies suggest that nucleosomes containing CENP-A are structurally more rigid than conventional H3 containing nucleosomes (Black et al. 2007a). This structural rigidity can be recapitulated by substituting 22 amino acids specific to human CENP-A, located within the 40 amino acid long loop1 and α2 helix, into the histone fold of histone H3. The domain containing these 22 amino acids is known as the CENP-A centromere targeting domain (CATD), as it is sufficient to target chimeric histone H3 to the centromere and compete with CENP-A for assembly (Black et al. 2004, 2007a, b). Deuterium exchange experiments have shown that the CATD is locked inside the nucleosome and forms a rigid interface with histone H4, suggesting that the function of CENP-A derives from the structural changes induced by its CATD (Black et al. 2004, 2007a).
The amino acid sequence of the N-terminal tail of CENP-A shows little similarity to that of canonical histone H3, and this tail is very diverse among different species. For example, in budding yeast and Drosophila, it is 120 and 130 amino acids long, respectively, whereas in humans and fission yeast, it is only 45 and 20 amino acids long, respectively (Henikoff and Dalal 2005). Additionally, micrococcal nuclease digestion of CENP-A containing centromeric chromatin gives a smeared pattern instead of the ladder pattern obtained with noncentromeric chromatin (Takahashi et al. 1992). This digestion pattern indicates either a lack of regularity in nucleosome positioning on centromeric chromatin compared with other loci, or the presence of multiple populations of highly phased nucleosomes. These differentially phased nucleosome populations could be temporally or spatially distinct and thus yield a similar smeared pattern. A recent study (Furuyama and Henikoff 2009) has pointed out that CENP-A containing nucleosomes generate a positive supercoil in DNA instead of the canonical negative supercoil of histone H3 containing nucleosomes, and it has been suggested that the incompatibility generated by opposite topologies might define the location of the centromere on a chromosome and promote its maintenance in subsequent generations.
The core histones of the centromeric chromatin are also posttranslationally modified in a distinctive way. Immunofluorescence experiments on stretched chromatin in humans and Drosophila revealed that centromeric H3 is not di- or tri-methylated at Lys 9 (H3K9Me2/H3K9Me3), a modification pattern usually present in heterochromatin. Instead, centromeric H3 is dimethylated at Lys 4 (H3K4Me2), a modification generally associated with open but transcriptionally non-active euchromatin (Sullivan and Karpen 2004). Contradictory results were obtained for maize A and B chromosomes, where the centromeric chromatin lacks H3K4Me2 and possesses H3K9Me2 (Fig. 1; Jin et al. 2008), which points to considerable diversity in the epigenetic marks of different eukaryotes. In humans and Drosophila, H3 and H4 at the centromeric chromatin are also hypoacetylated, as in heterochromatin. However, the histone modification characteristic of actively transcribed chromatin, i.e., trimethylation at Lys 4 of H3, is not present in the centromeric chromatin. Thus, the histone code of centromeric chromatin in humans and Drosophila appears to be intermediate between that of transcriptionally active euchromatin and that of inactive heterochromatin (Sullivan and Karpen 2004). Another distinct modification of inner centromeric chromatin is the phosphorylation of histone H3 at threonine 3 (H3T3) (Dai et al. 2006), which is important for the recruitment of the chromosome passenger complex (CPC, involved in chromosome segregation and cytokinesis by ensuring the correct assembly of the spindle midzone) to the centromere (Rosasco-Nitcher et al. 2008). Moreover, some active genes have been found in the centromeric chromatin of rice (Yan et al. 2006) and in the chromatin of neocentromeres (Wong et al. 2006), so it can be concluded that the centromeric chromatin is not a transcriptionally inactive (heterochromatic) region.
In mammalian centromeres, a differential pattern of histone code between inner and outer centromeric chromatin has been observed, in which inner centromeric chromatin contains H3K4Me2 whereas outer centromeric chromatin contains CENP-A containing nucleosomes (Fig. 2a). These patterns supposedly allow a centromere to perform different functions: inner centromeric chromatin is involved in sister chromatid cohesion by recruiting cohesin and the chromosome passenger complex, whereas outer centromeric chromatin is involved in kinetochore formation by recruiting CENP-A (Sullivan and Karpen 2004). A recent study on human functional neocentromeres has demonstrated a significant decrease in canonical H3K4Me2 marking, suggesting that this mark may not play a higher order structural role in neocentromere specification and function (Alonso et al. 2010). However, whether this conclusion holds true for endogenous centromeres is debatable, since the neocentromeric chromatids studied separate prematurely, supposedly due to an observed overall scarcity of heterochromatin and a consequent lack of well-established cohesion at the neocentromere.
As with the core histone proteins, posttranslational modification of CENP-A could be one of the avenues for regulating the activity of this protein. The only modification of CENP-A known to date is phosphorylation of serine 7 in the N-terminal tail in human cells, which is responsible for the localization of INCENP, Aurora B and PP1gamma1 to the spindle midzone at anaphase. Its absence results in a delay in cytokinesis. From a protein fusion experiment using an H3–CATD chimera, it appears that the CENP-A N-terminal tail is dispensable for CENP-A assembly (Black et al. 2004). However, the significance of the N-terminal tail and its associated posttranslational modification in downstream kinetochore assembly and chromosome segregation is poorly understood (Chen et al. 2000; Zeitlin et al. 2001; Van Hooser et al. 2001). Thus, deposition of the variant histone and specific modification of the N-terminal tails of core histones together generate a specialized chromatin structure that can be conceived of as a ‘mark’ to nucleate kinetochore assembly at that locus (the centromere). Once marked, such a chromatin state can be utilized epigenetically for kinetochore formation in subsequent generations (Allshire and Karpen 2008; Vagnarelli et al. 2008).
In some specialized cases, other factors also appear to influence centromeric chromatin. Zhang et al. (2008) have suggested a role for DNA methylation in the epigenetic demarcation of centromeric chromatin in Arabidopsis thaliana and maize. They reported that the 178 bp repeats associated with CENP-A containing chromatin in Arabidopsis are hypomethylated compared with the same repeats located in the flanking pericentromeric regions. The same kind of methylation pattern was also observed in maize. Hypomethylation of the DNA in centromeric chromatin can be correlated with the reduced level of H3K9Me2 in Arabidopsis. They also found a differential distribution of methylation sites, i.e., CG and CNG sites (where cytosine can be methylated), in the 178 bp repeats in Arabidopsis, which may provide a foundation for the differential methylation of these repeats. Other histone variants also appear to influence the formation of centromeric chromatin. H2A.Z, a histone H2A variant, is present in the centromeric chromatin of human and mouse cells. H2A.Z is primarily associated with nucleosomes containing H3K4Me2 (Fig. 2a; Greaves et al. 2007). A small fraction of H3K9Me3 nucleosomes at the centromeric chromatin also contains H2A.Z (Fig. 2a; Greaves et al. 2007). From a 3D reconstruction experiment it has been proposed that H2A.Z/H3K4Me2 and H2A.Z/H3K9Me3 occupy spatially distinct regions. Surprisingly, H2A.Z was found to be a part of the CENP-A containing nucleosome in HeLa cells (Foltz et al. 2006) but not in mouse cells (Greaves et al. 2007). These results suggest that H2A.Z may reside close to the CENP-A nucleosome but may not be a part of it. Thus, histone modifications as well as histone variants participate in the epigenetic marking of centromeric chromatin.
The usual octameric structure does not always hold for CENP-A containing nucleosomes, as reported for budding yeast and Drosophila centromeric chromatin. In vitro experiments on budding yeast centromeric nucleosomes by Wu and co-workers have shown that nucleosomes containing Cse4 possess an unusual histone composition in which H2A and H2B are replaced by one molecule of Scm3, a non-histone protein. Scm3 is able to displace H2A and H2B from pre-assembled canonical octamers containing Cse4 in vitro. Thus, the budding yeast centromeric nucleosomes are believed to be hexameric, containing two copies each of Cse4, Scm3 and histone H4. However, apart from diminished occupancy of histones H2A and H2B at the centromeres, there is no in vivo evidence that Scm3 is a part of Cse4 containing nucleosomes (Fig. 2b, Mizuguchi et al. 2007). Contrary to this observation, a recent report (Camahort et al. 2009) has found that Cse4 is associated with H2A, H2B and H4 but not with H3 or the non-histone protein Scm3. The authors have also shown that the lethality of an scm3 deletion can be rescued by overexpressing Cse4, indicating that SCM3 is a nonessential gene under these conditions. Their observations argue for an octameric structure of Cse4 containing nucleosomes. This also correlates with the finding that Scm3 in the fission yeast S. pombe is not a part of the nucleosome but is responsible for the centromeric localization of Cnp1 (the CENP-A homolog in fission yeast, Williams et al. 2009). Scm3 in yeast has been found to be a distant relative of HJURP (Holliday junction recognition protein, Sanchez-Pulido et al. 2009), which acts as a chaperone for the deposition of CENP-A at the centromere in humans (Dunleavy et al. 2009; Foltz et al. 2009). This suggests that Scm3 functions as a loader of Cse4 rather than as a part of the Cse4 containing nucleosome.
Other tentative results were obtained from the analysis of Drosophila centromeric nucleosomes by cross-linking experiments followed by electron microscopy and atomic force microscopy. Nucleosomes containing Cid, the CENP-A homolog in Drosophila, isolated from interphase nuclei were found to be composed of heterotypic tetramers (containing Cid, H2A, H2B and H4) rather than canonical octamers, and were termed “hemisomes” or “half nucleosomes” (Fig. 2b) (Dalal et al. 2007). In support of these observations, it has also been suggested that the right-handed wrapping of DNA present at the centromeric chromatin is inconsistent with octameric nucleosomes due to steric hindrance, whereas tetrameric nucleosomes can be bound by either right- or left-handed DNA, with a steric preference for right-handed DNA (Furuyama and Henikoff 2009). Indeed, H3–H4 tetrameric archaeal nucleosomes also wrap DNA in a right-handed manner in vivo (Musgrave et al. 1991), although such a nucleosome, in the absence of H2A–H2B, is capable of switching between left-handed and right-handed configurations in vitro. Given that DNA wraps around noncentromeric nucleosomes in the left-handed direction, the change in the direction of DNA wrapping from left-handed to right-handed might have significant biological relevance for maintaining a functional centromere.
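To keep the competing structural models apart, it can help to tabulate their proposed compositions. The sketch below is only a bookkeeping aid under stated assumptions: the dictionary keys and function names are hypothetical, the canonical octamer is included for reference, and two copies of each protein are assumed for the octameric Cse4 model, since the text above specifies only which proteins associate with Cse4.

```python
# Proposed subunit composition (copies per particle) for each model discussed above.
NUCLEOSOME_MODELS = {
    # Canonical nucleosome, included for reference.
    "canonical_octamer": {"H2A": 2, "H2B": 2, "H3": 2, "H4": 2},
    # Budding yeast hexamer model (Mizuguchi et al. 2007).
    "cse4_hexamer": {"Cse4": 2, "Scm3": 2, "H4": 2},
    # Budding yeast octamer model (Camahort et al. 2009); two copies each is an assumption.
    "cse4_octamer": {"Cse4": 2, "H2A": 2, "H2B": 2, "H4": 2},
    # Drosophila hemisome model (Dalal et al. 2007).
    "cid_hemisome": {"Cid": 1, "H2A": 1, "H2B": 1, "H4": 1},
}


def subunit_count(model: str) -> int:
    """Total number of protein subunits in the named model."""
    return sum(NUCLEOSOME_MODELS[model].values())


def models_lacking(protein: str) -> list:
    """Sorted names of the models that do not contain the given protein."""
    return sorted(
        name for name, parts in NUCLEOSOME_MODELS.items() if protein not in parts
    )


if __name__ == "__main__":
    for name in NUCLEOSOME_MODELS:
        print(name, subunit_count(name))
    print("Models without H2A:", models_lacking("H2A"))
```

Laid out this way, the models differ in two respects: the total number of subunits and whether H2A and H2B are retained.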
Evidence for the epigenetic inheritance of centromeres
So far we have seen that centromeric DNA sequences are not the ultimate determinants of centromere identity. Epigenetic factors, in the form of histone H3 variant deposition, histone modifications and other unknown factors, must be involved in the propagation and maintenance of centromere identity. Experimental evidence argues strongly in favor of the epigenetic nature of centromere inheritance.
Dicentric chromosomes possess two centromeric sequences due to duplications or more complex chromosomal rearrangements. In such cases, it has been found that both centromeric sequences on human autosomes are able to bind the protein CENP-B, but one of the two fails to bind kinetochore markers such as CENP-C (Earnshaw et al. 1989) and thus cannot form a functional kinetochore. In the case of a dicentric X chromosome, one of the centromeric sequences is unable to bind even CENP-B (Earnshaw and Migeon 1985). This indicates that the absence of an epigenetic mark on one of the duplicated sequences makes it an inactive centromere and allows the dicentric chromosome to propagate stably.
Further evidence for an epigenetic effect on centromere identity came from the analysis of human neocentromeres. The DNA sequences of neocentromeres are entirely different from those of normal centromeric chromatin. Neocentromeres in humans do not contain the α-satellite sequences that are bound by the DNA-binding protein CENP-B. Nevertheless, neocentromeres are able to acquire the entire kinetochore assembly and can be stably maintained through mitosis and meiosis (Tyler-Smith et al. 1999; Saffery et al. 2000). This result suggests that neither DNA sequence nor CENP-B is essential or sufficient for centromere identity, and that some epigenetic mechanism underlies it. We will revisit neocentromeres in a later part of this review.
The primary DNA sequence is not, however, always completely dispensable for centromere formation. Analysis of de novo centromere formation has revealed a relative preference for one type of DNA sequence over another. Introduction of long stretches of cloned DNA into human cells has shown that de novo centromere assembly occurs only on type I α-satellite DNA, which contains CENP-B binding sites (Masumoto et al. 2004). It was concluded that CENP-B and α-satellite DNA are responsible for the establishment of the centromere in humans but not for the maintenance of a functional centromere (Okada et al. 2007).
So far, there is no concrete evidence that an epigenetic mechanism operates at the point centromeres of budding yeast (Mythreye and Bloom 2003). We can therefore conclude that, unlike in budding yeast, where DNA sequences determine the site of centromere formation, in regional centromeres some DNA sequences together with other factors specify a special chromatin structure, which regulates the formation and maintenance of the centromere through an epigenetic mechanism.
Cell cycle-dependent centromeric deposition of CENP-A: being in the right place at the right time
CENP-A localizes exclusively at the centromeres. How CENP-A is targeted to the centromere is not yet completely understood, but research in that direction has given many exciting clues about the probable mechanism of CENP-A localization. CENP-A is able to bind to any location on the genome, as shown by transient expression experiments in different organisms (Heun et al. 2006). However, when CENP-A binds to noncentromeric DNA, CENP-A nucleosomes seem very unstable and are rapidly degraded by proteolysis (Bernad et al. 2009). In budding yeast, it has been demonstrated that Cse4 targeted to the centromere is protected from proteolysis (Collins et al. 2004). Incorporation of canonical histone H3 into chromatin is coupled with DNA replication. In contrast, incorporation of CENP-A into chromatin is independent of DNA replication in most species, including human cells (Fig. 2c; Shelby et al. 2000). An elegantly designed experiment in human cells, using covalent fluorescent pulse-chase labeling of SNAP-tagged CENP-A, has revealed the timing of centromere localization of newly synthesized CENP-A during the cell cycle (Jansen et al. 2007). Although CENP-A synthesis peaks in G2, its deposition occurs from late telophase to early G1 phase. This argues that passage through mitosis is an essential event for CENP-A loading (Jansen et al. 2007). Furthermore, by following CENP-A in segregating chromosomes that are not attached to microtubules, Jansen et al. have disproved an earlier model which suggested that the tension generated by the biorientation of sister kinetochores pulls on the centromeric chromatin, leading to incorporation of CENP-A containing nucleosomes (Mellone and Allshire 2003). In Drosophila embryos, in which G1 and G2 phases are not observed, deposition of Cid (the Drosophila CENP-A homolog) takes place at anaphase (Schuh et al. 2007).
Budding yeast cells appear to be an exception to this rule of replication-independent deposition of CENP-A. In this system, all existing Cse4 (the budding yeast CENP-A homolog) is evicted from centromeres and replaced by newly synthesized Cse4 during S phase (Pearson et al. 2004). In the case of S. pombe, deposition of Cnp1 (the fission yeast CENP-A homolog) takes place in two phases, during S and in late G2 phase (Takayama et al. 2008). In Arabidopsis, loading of CENP-A is reported to occur mainly in G2 phase (Lermontova et al. 2006). Thus, the timing of CENP-A deposition in the cell cycle differs greatly among organisms; it even varies between two yeast species, S. cerevisiae and S. pombe (Fig. 2c).
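The organism-specific timings described above can be collected into a simple lookup table. This is a minimal sketch for orientation only: the dictionary name, the phase labels and the helper function are hypothetical conveniences, while the entries restate the timings and citations given in the text.

```python
# Reported cell cycle window(s) of CENP-A (or homolog) deposition, as summarized above.
DEPOSITION_TIMING = {
    "human": ("late telophase", "early G1"),   # Jansen et al. 2007
    "Drosophila embryo": ("anaphase",),        # Schuh et al. 2007
    "S. cerevisiae": ("S",),                   # Pearson et al. 2004
    "S. pombe": ("S", "late G2"),              # Takayama et al. 2008
    "Arabidopsis": ("G2",),                    # Lermontova et al. 2006
}


def organisms_loading_in(phase: str) -> list:
    """Sorted organisms with a reported deposition window in the given phase.

    A window such as "late G2" counts as a match for the phase "G2".
    """
    return sorted(
        organism
        for organism, windows in DEPOSITION_TIMING.items()
        if any(phase in window.split() for window in windows)
    )


if __name__ == "__main__":
    for phase in ("S", "G2", "G1", "anaphase"):
        print(phase, organisms_loading_in(phase))
```

Querying for S phase returns only the two yeasts, which makes the contrast with the replication-independent loading seen in the other organisms explicit.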
The loading of CENP-A late in mitosis raises some intriguing issues. First, most kinetochore functions are executed on a chromatin fiber containing a low concentration of CENP-A nucleosomes, before these are fully replenished. Second, CENP-A loading occurs after chromosome segregation, suggesting that signaling events during chromosome segregation may trigger CENP-A deposition. Third, CENP-A assembly can take place during mitosis, when chromatin is more compact and inaccessible (Allshire and Karpen 2008).
The regulation of CENP-A loading at the centromere is mediated, at least partially, by the action of Mis18α/β and M18BP1 in humans and by KNL2 in C. elegans, because depletion of these proteins results in CENP-A mislocalization (Fujita et al. 2007; Maddox et al. 2007). KNL2 has a Myb-like DNA-binding motif, which suggests that it interacts with centromeric DNA. Mis18α/β and KNL2 were found to be present at the centromere in the same time window as CENP-A deposition, i.e., late telophase to early G1 phase. Isolated KNL2-associated chromatin was found to be enriched in CENP-A, which indicates a close association of the two proteins on centromeric chromatin (Maddox et al. 2007). The mislocalization of CENP-A due to depletion of Mis18α is rescued by the histone deacetylase inhibitor trichostatin (Fujita et al. 2007). These results suggest a ‘priming’ event whereby the CENP-A containing nucleosome, an adjacent H3 containing nucleosome or another centromere protein requires acetylation to license the centromere for cell cycle-dependent targeting of newly synthesized CENP-A.
How is CENP-A targeted to the centromere?
In humans, the core centromere contains CENP-A chromatin-associated proteins throughout the cell cycle. These are collectively known as the constitutive centromere-associated network (CCAN), which so far includes, in addition to centromeric nucleosomes, 15 proteins: CENP-C, CENP-H, CENP-I, CENP-K through CENP-U, and CENP-W (Cheeseman and Desai 2008; Hori et al. 2008). These proteins are subdivided into three categories known as the CENP-A nucleosome associated complex (CENP-ANAC; Foltz et al. 2006), the CENP-H-I complex (Okada et al. 2006), and the interphase centromere complex (Izuta et al. 2006). Some members of the CCAN, including CENP-H, CENP-I, CENP-K, CENP-M and CENP-N, can directly affect CENP-A levels at the centromere (Okada et al. 2006; Carroll et al. 2009). The CENP-T/CENP-W complex is a DNA-binding complex that directly associates with centromeric nucleosomes containing canonical histone H3 in close proximity to the CENP-A containing nucleosome (Hori et al. 2008). In a recent study, Carroll and co-workers have reported that among the eight members of the CENP-ANAC complex, only CENP-N directly binds CENP-A nucleosomes in vitro (Carroll et al. 2009), by recognizing a structural feature specific to the CENP-A nucleosome. CENP-N, bound by CENP-L, leads the formation of the rest of the centromere complex. As with other proteins of the CCAN, downregulation of CENP-N also affects CENP-A levels at the centromere. It has also been reported that the C-terminal domain of CENP-C, which contains two regions named Mif2p homology domains II and III, is conserved from yeast to humans. Mif2p homology domain II was found to target the centromere and contact alpha satellite DNA, whereas Mif2p homology domain III showed multiple activities, such as the ability to form higher order structures like homo-dimers and homo-oligomers and to mediate interaction with CENP-A and histone H3.
Thus, it seems that the C-terminal domain of CENP-C plays a crucial role in the structuring of kinetochore at the centromeric chromatin (Trazzi et al. 2009), whereas the N-terminal domain of CENP-C is reported to promote kinetochore assembly by ensuring proper targeting of the Mis12/MIND complex and CENP-K (Milks et al. 2009). The fact that all these proteins of CCAN are associated with CENP-A nucleosomes (either directly or indirectly) but absent in soluble fractions of CENP-A (Dunleavy et al. 2009; Foltz et al. 2009) indicates that they do not act as assembly factors for CENP-A loading. Instead, they serve as a platform for specific CENP-A loading factors to target CENP-A to the centromeres or, alternatively, their complex formation onto centromeric chromatin is necessary for stabilizing CENP-A nucleosomes. Importantly, it has been found that all of these proteins are themselves dependent on CENP-A for their localization to the centromere (Foltz et al. 2006; Liu et al. 2006; Carroll et al. 2009). It can be concluded that one or more members of the CCAN may form the molecular basis for an epigenetic feedback loop to control the propagation of active centromeres.
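The figure of 15 CCAN proteins quoted above is written with a letter range ("CENP-K through CENP-U"), which is easy to miscount. As a small arithmetic check, the sketch below expands the range and counts the members; the function name is hypothetical and the code makes no claim beyond the list given in the text.

```python
import string


def expand_ccan_members() -> list:
    """Expand 'CENP-C, -H, -I, -K through -U, and -W' into explicit protein names."""
    alphabet = string.ascii_uppercase
    # Letters K through U inclusive.
    k_to_u = list(alphabet[alphabet.index("K"): alphabet.index("U") + 1])
    letters = ["C", "H", "I"] + k_to_u + ["W"]
    return [f"CENP-{letter}" for letter in letters]


if __name__ == "__main__":
    members = expand_ccan_members()
    print(len(members), "CCAN proteins:", ", ".join(members))
```

The expansion gives three individually named proteins, eleven in the K to U range, and CENP-W, consistent with the total of 15.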
In a search for specific CENP-A assembly factors, Yanagida and co-workers have found that proteins named hMis18α, hMis18β and M18BP1 are essential for the recruitment of CENP-A to the centromere. Their levels increase during telophase–G1 phase, the time when CENP-A is loaded, and their loss results in CENP-A mislocalization (Fujita et al. 2007). Additionally, using fission yeast they have shown that proteins such as Mis16 (similar to the RbAp46/48 histone chaperone proteins in humans) and Mis18 recruit CENP-A (Cnp1) by maintaining the hypoacetylated state of the centromeric chromatin (Hayashi et al. 2004). Desai and colleagues have shown the importance of Mis12 in kinetochore assembly and chromosome segregation in human and chicken cells (Kline et al. 2006). In Mis12 depleted cells, the levels of the checkpoint protein BubR1 and of CENP-E, CENP-A and CENP-H were reduced at the centromere. Mis12 depleted cells showed a delay in mitosis with misaligned chromosomes and defects in chromosome biorientation. This suggests an altered kinetochore structure in these depleted cells, which was further demonstrated by reduced localization of Ndc80 (HEC1) at the outer plate of the kinetochore. Recently, Yoda and co-workers (Perpelescu et al. 2009) have shown a role for the remodeling and spacing factor (RSF) complex in CENP-A assembly at the centromere using HeLa cells. The RSF complex interacts with CENP-A in mid G1 phase. Depletion of Rsf-1, one of the proteins of the RSF complex, induced loss of CENP-A from the centromere, and the purified RSF complex has been shown to reconstitute CENP-A nucleosomes in vitro. This result identifies the RSF complex as a new factor that actively participates in CENP-A deposition at centromeric chromatin (Fig. 4a).
Other studies in humans have found that Holliday Junction-Recognizing Protein (HJURP) is responsible for CENP-A loading at the centromere (Dunleavy et al. 2009; Foltz et al. 2009). The HJURP level rises during the same time window as CENP-A deposition, i.e., late telophase to G1 phase. HJURP associates with non-nucleosomal CENP-A, suggesting a role in the delivery of CENP-A, and its downregulation leads to a decrease in the CENP-A level at the centromere and to chromosome segregation errors. Thus, HJURP is a key chaperone for newly formed CENP-A that facilitates the loading and incorporation of CENP-A at centromeres through its particular cell cycle dynamics (Fig. 4a). This is further confirmed by reports showing that CENP-A associates with histone H4, nucleophosmin1 and HJURP before assembling into centromeric chromatin (Foltz et al. 2009; Shuaib et al. 2010). A recent in vitro study has demonstrated that HJURP facilitates CENP-A/H4 tetramer deposition at centromeric chromatin. HJURP expressed in a bacterial system was not able to bind to the H3/H4 tetramer, but it did bind specifically to the CENP-A/H4 tetramer (Foltz et al. 2009; Shuaib et al. 2010).
Using multiphoton absorption technology and DNA cleavage at unique sites by the I-SceI endonuclease, Zeitlin et al. have recently demonstrated in human and mouse models that double-stranded DNA breaks recruit CENP-A. Three other components, CENP-N, CENP-T and CENP-U, were also found to be associated with CENP-A at the cleavage sites. The centromere-targeting domain of CENP-A was found to be necessary and sufficient for recruitment to the double-strand breaks. They also found a correlation between the CENP-A expression level and the number of surviving cells after DNA damage, and thus proposed a role for CENP-A in DNA repair (Zeitlin et al. 2009). These results further corroborate the report by Kato et al. (2007), which demonstrated that HJURP can relocalize to nuclear foci carrying DNA lesions upon induction of DNA damage and can bind to Holliday junctions in vitro.
Very recently, Strunnikov and co-workers have reported a role for condensin in CENP-A deposition in humans. Condensin depletion results in a loss of CENP-A from the centromere. They have also proposed that dysfunction of the protein kinase Aurora B, probably due to misdeposition of CENP-A, is the key defect in condensin-depleted cells and can lead to chromosome missegregation (Samoshkin et al. 2009).
It has also been reported in fission yeast that heterochromatin, RNAi and centromeric outer repeats are required for the establishment of CENP-A chromatin at the centromeres (Folco et al. 2008; Kagansky et al. 2009). RNA interference, triggered by the breakdown of transcripts derived from the centromeric outer repeats, promotes the formation of heterochromatin flanking the central core of the centromere region, which is required for the establishment of Cnp1 (fission yeast CENP-A homolog)-containing chromatin at the central core (Folco et al. 2008). This study further showed that, once assembled, CENP-ACnp1 chromatin can be propagated by epigenetic means in the absence of heterochromatin or the RNAi machinery. Heterochromatin is defined by a distinct RNAi-stimulated posttranslational modification of histone H3 (H3K9Me2) by the H3K9 methyltransferase Clr4 (Suv39, homolog in Drosophila), followed by the recruitment of heterochromatin protein 1 (HP1), which is related to the chromodomain protein Swi6. Consistently, the key components required for the establishment of CENP-ACnp1 chromatin on a naïve template have been identified. They are Clr4; the ribonuclease Dicer, which cleaves heterochromatic double-stranded RNA into small interfering RNA (siRNA); Chp1, a component of the RNAi effector complex (RNA-induced initiation of transcriptional gene silencing; RITS); and Swi6 (Folco et al. 2008). Furthermore, Allshire and co-workers have recently demonstrated that H3K9 methylation-dependent heterochromatin formation is the key event in forming a functional centromere (Kagansky et al. 2009). They observed that when Clr4 (H3K9 methyltransferase) is artificially tethered at euchromatic loci, it induces heterochromatin assembly with or without RNAi. This synthetic heterochromatin completely substitutes for the outer repeats, which are responsible for generating small interfering RNAs that direct Clr4 to form heterochromatin on homologous loci in fission yeast.
However, all these studies describe the role of RNAi in centromere function through the formation of heterochromatin. Whether RNAi per se is involved in CENP-A deposition requires further investigation.
Role of transcription in CENP-A assembly
A number of observations suggest that transcription has a role in CENP-A assembly and hence defines a chromatin structure for centromere formation. In humans, it has been shown that artificial targeting of a transcription-activating or -deactivating complex can influence kinetochore assembly (Nakano et al. 2008). In fission yeast, it has been demonstrated that transcription across the centromere is important for depositing Cnp1, the fission yeast CENP-A homolog, through an RNAi-mediated pathway (Folco et al. 2008). This is consistent with the finding that depletion of the RNAi machinery can reduce the frequency of centromere formation at an ectopic location (Ishii et al. 2008). It can be envisaged that RNAi-mediated heterochromatin might provide a milieu amenable to kinetochore formation.
A debate remains whether transcription per se is required or the product of transcription (RNA transcript) is involved in CENP-A assembly. In support of the first case, it has been shown that transcription can assemble, disassemble and reposition nucleosomes (Williams and Tyler 2007). Histone H3 dimethylation or trimethylation on lysine 4 (H3K4me2 or H3K4me3) is associated with the 5′ regions of active genes which can attract chromatin factors such as NURF or Chd1 (possibly responsible for altering gene expression by modifying chromatin structure), which are involved in repositioning nucleosomes (Reinberg and Sims 2006). Similar modification (H3K4me2) is found to be present within the interspersed H3 nucleosomes that are present in human, fly and plant centromeric chromatin. As a corollary, in S. pombe, it has been shown that the central domain of S. pombe centromere contains H3K4me2, and Hrp1, a Chd1 homolog, is required for normal CENP-A chromatin levels (Walfridsson et al. 2005). These argue for the importance of transcription over the primary sequence of RNA. This notion was further fueled by the co-purification of the CENP-A nucleosome with a complex that facilitates chromatin transcription (FACT, Izuta et al. 2006).
FACT is believed to facilitate efficient transcription through chromatin by dissociating H2A–H2B dimers from nucleosomes, thus aiding nucleosome disassembly ahead of, and reassembly behind, the advancing RNA polymerase II (Rocha and Verreault 2008). Thus, FACT might be involved in transcription-based replacement of H3 with CENP-A (Fig. 4b, upper panel), a phenomenon analogous to the transcription-coupled replacement of H3 with H3.3, which is mediated by the HIRA complex (Ray-Gallet et al. 2004). Very recently, it has been demonstrated that FACT localizes to centromeres in a manner dependent on the CENP-H-containing complex, a component of CCAN (Okada et al. 2009). A conditional mutant cell line for one of the subunits of FACT (i.e., SSRP1, structure-specific recognition protein 1) has shown decreased CENP-A deposition at the centromere. The chromatin remodeling factor Chd1 binds to SSRP1, and the localization of Chd1 at the centromere was decreased in SSRP1-depleted cells. Knockdown of Chd1 by RNAi decreased the localization of CENP-A at the centromere. Together, these results indicate a role for the CENP-H-containing complex in the deposition of CENP-A into centromeric chromatin in cooperation with FACT and Chd1 (Okada et al. 2009), highlighting the importance of transcription in CENP-A deposition. However, it cannot be ruled out that the interaction of FACT with CENP-A is a mere consequence of the interaction of FACT with the DNA replication machinery and has nothing to do with transcription (Rocha and Verreault 2008).
In support of the RNA transcript as a factor involved in CENP-A deposition at the centromere, it has been suggested that the 5′ end of the newly formed RNA may recruit factors that are required for CENP-A deposition (Fig. 4b, lower panel). It is also possible that nascent transcripts hybridize with their centromeric templates to form R-loops, hybrids of RNA and DNA (Aguilera 2005), which can elicit a DNA damage response and lead to repair-coupled chromatin remodeling and deposition of CENP-A. RNAs that are complementary to the centromeric DNA and remain associated with kinetochore proteins have been identified in maize (Topp et al. 2004), but the exact role of these transcripts is not clear at present. Recently, Dawe and colleagues have demonstrated that the DNA binding of CENP-C is stabilized by single-stranded RNA. The RNA transcript produced from the centromeric DNA alters the DNA-binding characteristics of CENP-C to target it to the inner kinetochore (Du et al. 2010). Therefore, it is very important to determine whether the process of transcription or the RNA transcripts themselves play a direct role in establishing or maintaining CENP-A chromatin. Comparing the timing of CENP-A deposition with that of transcription and transcript production during the cell cycle will be an important step in correlating these events. It is possible that the deposition of H3 or the formation of nucleosomal gaps in centromeric chromatin during S phase could trigger transcription of centromeric DNA, and that restoration of full levels of CENP-A nucleosomes could terminate further CENP-A assembly by switching off transcription.
How might CENP-A be transported into the nucleus and be incorporated into the chromatin?
The exact mechanism for the import and assembly of CENP-A onto the centromeric chromatin is poorly understood. However, chaperones like Sim3 (human NASP related protein) and RbAp46/48 associate with CENP-A and have been shown to be required for CENP-A deposition in fission yeast and humans, respectively (Hayashi et al. 2004; Dunleavy et al. 2007). In vitro assembly of Drosophila CENP-A nucleosome along with H4 and RbAp46/48 suggests that centromeric chromatin somehow favors this complex over CAF1 and HIRA complex. As a result CENP-A is deposited in place of H3 (Furuyama et al. 2006). Specific posttranslational modifications of CENP-A might also play a part in CENP-A nuclear import and deposition of this protein. The factors that mediate CENP-A assembly into chromatin are unknown. However, the FACT remodeling complex associates with CENP-A, and RbAp46/48 can mediate CENP-A nucleosome assembly in vitro and probably in vivo by associating with the MIS18 complex, which is recruited to centromeres in telophase and through G1 (Fujita et al. 2007).
Therefore, synthesis of histones and their transport to the nucleus is important for proper assembly of the CENP-A nucleosome. Any error in histone gene transcription, translation, modification or import could affect the ability to assemble an intact CENP-A chromatin, which would result in the loss of CENP-A from centromeres and hence centromere identity (Allshire and Karpen 2008).
Neocentromere: a clue for unraveling the mechanism of centromere identity
The stability of dicentric chromosomes as described above is due to the functional inactivation of one centromere (Agudo et al. 2000) or due to centromere cooperation when both centromeres act coordinately for kinetochore formation, microtubule attachment and anaphase segregation (Sullivan and Willard 1998). This study explains that the presence of centromeric DNA on a chromosome is not sufficient for centromere function. This notion is supported by the fact that simple presence of centromeric DNA sequences on a chromosome does not automatically lead to centromere formation in humans and flies. In fact, in these systems, chromosomes that lack centromeric DNA sequences were able to form kinetochore by forming neocentromeres, which were mitotically and meiotically stable (Du Sart et al. 1997; Williams et al. 1998). Around 93 human neocentromeres that involve 20 different chromosomes have been identified so far, indicating that many genomic regions are amenable to centromere formation or that any sequence can be activated as a centromere under certain conditions (Marshall et al. 2008). Immunofluorescence based experiments have shown that 22 centromere/kinetochore/centromere-region proteins are present at normal human centromeres and at neocentromeres (Saffery et al. 2000), which indicates that neocentromeres recruit the same set of proteins as that of centromeres and mediate inheritance through the same mechanisms as normal centromeres. However, in one case CENP-B association with neocentromere was not detected suggesting both primary DNA sequence and CENP-B binding are dispensable for centromere identity (Depinet et al. 1997). Importantly, the DNA sequences of neocentromeres were found to be similar to their parental homologous loci and they did not acquire any α-satellite DNA. Furthermore, the DNA sequence of these neocentromeres did not show significant homology to each other, indicating that different sequences can acquire neocentromeric function. 
One interesting observation was that the sequences of all neocentromeres were significantly (A + T) rich (>60%) and enriched for retroviral elements, long terminal repeats and/or short tandem repeats. Thus, it is possible that (A + T) rich DNA or repetitive arrays can more easily achieve the conformations that are required for centromere function and kinetochore assembly (Murphy and Karpen 1998). In Candida albicans, when the centromeric DNA sequence was removed from chromosome V by homologous recombination using URA3 as a selection marker, neocentromere formation occurred in most of the transformants at different locations along chromosome V. The authors also checked the mitotic stability of the chromosomes that had acquired neocentromeres by growing the cells on plates containing 5-FOA, which is toxic to Ura+ cells. They found that the neocentromere-containing chromosomes are mitotically stable and demonstrated silencing of the URA3 gene caused by the formation of the neocentromere at this gene locus (Ketel et al. 2009; Marshall and Choo 2009). Recently, Choo and co-workers have reviewed the role of neocentromeres in centromere structure, karyotype evolution and disease development (Marshall et al. 2008).
A comparison of the sites of neocentromere formation in humans and Drosophila has revealed that human neocentromeres occur at sites that are significantly distant from the endogenous centromeres, whereas the opposite has been observed in Drosophila. This suggests that activation of neocentromeres by the spreading of centromere proteins in cis might be the mechanism for neocentromere formation in Drosophila, but that it occurs in trans in humans (Maggert and Karpen 2000, 2001). A study of the methylation pattern of human neocentromeric DNA at 10q25 has shown an overall increase in methylation at the neocentromeric chromatin in comparison with the pre-neocentromeric chromatin. However, at the boundary of the neocentromeric chromatin and at the active genes, hypomethylation is present as pockets in the overall hypermethylated chromatin of the neocentromere. Thus, it is possible that the hypomethylation at the boundary regions demarcates the position of neocentromere formation by recruiting some proteins (Wong et al. 2006). These results point to the importance of DNA methylation in the activation of the 10q25 neocentromere in humans. Inhibition of DNA methylation increased neocentromere instability along with the decrease in methylation, reconfirming the importance of DNA methylation at neocentromeres (Wong et al. 2006). Very recently, LINE retrotransposon RNA has been found to be an essential structural and functional epigenetic component of core neocentromeric chromatin. The FL-L1b retrotransposon resides centrally in the CENP-A binding clusters. The authors demonstrated direct incorporation of the FL-L1b RNA transcripts into the CENP-A-associated chromatin. RNAi-mediated knockdown of the FL-L1b RNA transcript reduces CENP-A binding and impairs the mitotic function of the 10q25 neocentromere (Chueh et al. 2009).
Marking non-chromosomal locus by CENP-A homolog in budding yeast
Concluding remarks and future directions
In the last decade, many factors have been identified which are required for determining centromere identity. CENP-A deposition is one of the hallmarks in centromere identity, and several mechanisms have been proposed to explain how these factors might deposit CENP-A in a spatio-temporal fashion. A generalized theme of CENP-A assembly has emerged studying different model organisms. Nevertheless, the exact role of these factors in the pathway remains unclear. In vitro reconstitution of CENP-A nucleosomes from the purified factors added in different combinations might shed light in identifying individual roles. Undoubtedly, a major challenge for the future would be to find molecular machinery that earmarks a particular locus on a chromosome as a centromere at a particular window of the cell cycle and to know how temporal partitioning between CENP-A synthesis and deposition is achieved. To understand this state, centromeric chromatin needs to be monitored carefully throughout the cell cycle. To provide a handle for this study, a clear structural distinction between centromeric and noncentromeric chromatin has been obtained through studies on S. pombe and C. albicans. Following nuclease digestion, centromeric chromatin showed a smeared pattern suggesting either an irregular phasing of nucleosomes or the presence of multiple populations of highly phased nucleosomes whereas noncentromeric chromatin exhibited canonical ladder pattern suggesting a regular phasing of the nucleosomes (Polizzi and Clarke 1991; Takahashi et al. 1992; Baum et al. 2006). Furthermore, it would be intriguing to address whether CENP-A assembly pathway varies with developmental stages or cell types.
Since histone modification leads to a dynamic nature of the chromatin, it was envisaged that some modifications would be specific to the centromeric chromatin, which turned out to be true (Sullivan and Karpen 2004). Blocks of H3 modified by methylation at lys4, interspersed with CENP-A chromatin, have been found in both humans and flies. However, how this modification promotes CENP-A deposition at the nearby chromatin needs to be clarified. Whether CENP-A is also subject to modifications similar to those of the other core histones remains to be tested. If it is modified, it would be interesting to know how it is modified and whether this modification is required to restrict CENP-A to the centromere. Are such modifications involved in regulating CENP-A deposition, kinetochore assembly or other aspects of centromere/kinetochore function?
Further research is required to investigate the relationship between centromeric chromatin, transcription and non-coding RNAs. Although centromeres are present at transcriptionally inactive regions, recent studies have shown that neocentromeres contain some active genes (Saffery et al. 2003), and non-coding RNAs homologous to centromeric DNAs have been detected in mammals and plants (May et al. 2005). This suggests an involvement of transcription in centromere function. It is important to distinguish whether the process or the product of transcription is more important in defining centromeric chromatin. The choice of one or both may depend on cell type and species.
Importantly, many studies have shown a link between loss of centromere identity and cancer (Tomonaga et al. 2003; Liao et al. 2007). For example, overexpression of CENP-A has been observed in colorectal cancer (Tomonaga et al. 2003). Hieter and colleagues (Yuen et al. 2005) have reviewed the association of different kinetochore proteins with different kinds of cancers. All these studies corroborate the importance of proper deposition of CENP-A at the right time and at the right place, since the entire kinetochore structure is built up in a hierarchical fashion on the CENP-A nucleosome.
Finally, it is most surprising that centromeres evolve so rapidly despite their highly conserved cellular function and the stringent requirements of their unit identity. Not only are centromere sequence and length diverse, but kinetochore proteins also seem to evolve very frequently, and identifiable orthologs show no significant sequence similarity. This high degree of structural diversity reflects a contribution towards speciation. It is also an interesting challenge to know how the mechanism of CENP-A deposition evolves accordingly over time. Elucidating the mechanism of CENP-A deposition from a simple 2 micron circle to a complex human chromosome will allow a comparative study from an evolutionary perspective. The next decade should allow a more vivid understanding of the molecular mechanisms: how a particular state of chromatin becomes amenable to forming a centromere/partitioning locus, how that state perpetuates across generations, and how these two processes might be evolutionarily adjusted to rapidly evolving centromeres. This will enable us not only to understand neocentromere formation but also to find out when these mechanisms can fail and threaten human health.
We acknowledge Sujata Hajra for a critical reading of the manuscript. We regret not being able to refer to the work of everyone in the field. We are grateful to the reviewers for their insightful critique that helped improve the article’s style and content. G.D.M. and M.P.A. are supported by CSIR fellowships (20-6/2009(i)EU-IV/329667, EU-IV/2008/JUNE/327214, respectively). SKG laboratory is supported by start-up grant from the Indian Institute of Technology, Bombay, India.