Introduction

The proper inheritance of genomic information in eukaryotes requires both well-coordinated DNA replication in S phase and separation of duplicated chromosomes into daughter cells in mitosis [1]. Prior to S phase, the pre-replication complex (pre-RC), a multi-protein complex that dictates when and where DNA replication will initiate, is assembled [2–6]. Studies in Saccharomyces cerevisiae revealed conserved replication initiation sites (origins) that comprise a highly conserved autonomously replicating sequence (ARS) [7]. Identification of proteins bound to this sequence led to the discovery of a six-subunit complex that serves as the initiator to select replication initiation sites, and was therefore named the origin recognition complex (ORC) [8]. The assembly of the pre-RC starts with ORC recognizing the replication elements and recruiting two factors, Cdc6 and Cdt1. These proteins function together to load the minichromosome maintenance proteins (MCM) onto chromatin [2–6]. This process takes place as early as the end of mitosis of the previous cell cycle [9]. In yeast, at the onset of S phase, Dbf4-dependent kinase (DDK) phosphorylation of MCMs and cyclin-dependent kinase (CDK) phosphorylation of Sld2 and Sld3 lead to the recruitment of Dpb11, the GINS complex, MCM10, Cdc45, and DNA polymerase to initiation sites to form the pre-initiation complex (pre-IC), which in turn activates the MCM helicase [1–4, 10, 11]. In higher eukaryotes, a similar cascade has been identified, with RecQ4 and TopBP1 being orthologs of Sld2 and Dpb11, respectively [1, 11]. In order to maintain the genome content, replication must occur “once and only once” during each cell cycle and re-replication must be strictly prevented. This “replication licensing” is enforced by multiple mechanisms at the levels of mRNA transcription, protein localization and protein stability, the presence of pre-RC inhibitors, and the alteration of local chromatin architecture [3, 4, 6, 12–14].

Since the initial identification of ORC in Saccharomyces cerevisiae in 1992 [8], tremendous progress has been made in the past two decades in dissecting how the assembly of pre-RC and pre-IC regulates the initiation event of DNA replication. The ordered assembly has been found to be highly conserved in all the examined model organisms, including budding and fission yeast, Drosophila, Xenopus, and mammalian cells. However, as the complexity of the organisms increases dramatically, significant differences between organisms become apparent. For example, the ortholog for Sld3 in higher eukaryotes is missing. It is also clear that additional mechanisms control the initiation and completion of replication in higher eukaryotes. For instance, Geminin, the pre-RC inhibitor, can only be found in metazoans [15, 16]. In recent years, many new factors including several proteins and RNAs have been unveiled to play important roles in pre-RC/pre-IC assembly and licensing. In this review, we survey these emerging players, categorizing them as the pre-RC/pre-IC accessory proteins to denote the proteins with which they associate. It is noteworthy that many of them are only present in higher eukaryotes.

ORC accessory factors

The primary role of ORC is sequence identification and origin binding. During pre-RC assembly, ORC binding to origins serves as the landing pad for the sequential loading of Cdc6, Cdt1, and MCM2-7 to these sites [4]. However, even though some specific DNA binding sites have been revealed in different eukaryotic organisms, no consensus sequence has been identified other than the ACS (ARS consensus sequence) in budding yeast. In addition, though ORC is conserved in eukaryotes, the mechanism that recruits ORC to chromatin remains to be clearly elucidated in metazoans [4]. The discovery of ORC-associating factors may help fill in these missing pieces.

ORCA/LRWD1

ORCA/LRWD1 (ORC associated/leucine-rich repeats and WD repeat domain containing 1) was identified from the mass spectrometric analysis of ORC-interacting proteins [17]. It co-localizes with ORC at heterochromatic sites and shows similar cell cycle dynamics to that of ORC in human cells. Tethering ORCA to an artificially generated chromatin region efficiently recruits ORC to that chromatin locus [17]. Further, depletion of ORCA in human primary cells and embryonic stem cells results in the loss of ORC association to chromatin, the reduction of MCM binding to chromatin, and the subsequent accumulation of cells in G1 phase [17]. These data suggest that ORCA is required for the stable association of ORC to chromatin.

ORCA protein levels fluctuate throughout the cell cycle, peaking in G1 phase [18]. In addition to ORC, ORCA also associates with other cell cycle regulated proteins: Cdt1 and Geminin, and these interactions are cell cycle-dependent: ORCA associates with ORC core complex throughout the cell cycle, with Cdt1 in mitosis and G1, and with Geminin in post-G1 phases [18]. Single molecule analyses demonstrate that one molecule of ORCA binds to one ORC, one Cdt1 and/or two Geminin [18]. Overexpression of Geminin in human cells results in the loss of interaction between ORCA and Cdt1, suggesting that increased levels of Geminin in post-G1 cells titrate Cdt1 away from ORCA [18]. Taken together, these data suggest that ORCA modulates the stoichiometric assembly of the pre-RC components, and may serve as another licensing factor, in addition to its role in facilitating pre-RC assembly on chromatin.

Structural and functional analysis reveals that the five WD repeats of ORCA are essential for its interaction with ORC, Cdt1, Geminin, and its chromatin association [17–20]. The WD domain forms a circular β-propeller structure, with each repeat unit as a blade in the form of a β sheet [21–24]. Therefore, the intact WD domain is important for these functional associations, as deletion of any repeat abolishes the interaction [18, 20]. The N-terminus of Orc2 interacts directly with the WD domain of ORCA [18]. Upon Orc2 depletion, ORCA also undergoes rapid destruction, which can be rescued by the addition of the proteasome inhibitor MG132, indicating that Orc2 protects ORCA from ubiquitin-mediated degradation. This is supported by the fact that K48 polyubiquitin chains can be formed on the WD domain of ORCA, and Orc2 only interacts with the non-ubiquitinated form of ORCA [25].

Notably, ORCA also associates with heterochromatic regions and has been shown to bind to repressive histone marks [19, 20, 26]. ORCA associates with H3K9me3, H4K20me3, and H3K27me3 peptides and is recruited to pericentric heterochromatin through its association with H3K9me3 [20]. Moreover, ChIP-seq using the BAC ORCA-GFP cell line displays an enriched signal on satellite repeats [26], and depletion of ORCA in MEF cells results in the up-regulation of major satellite repeat transcripts, suggesting its requirement for heterochromatin silencing [20]. Therefore, ORCA may function as a key molecule that links pre-RC assembly to higher order chromatin structure [27].

Interestingly, ORCA and ORC localize to centromeres in human mammary epithelial cells (MCF7) and other telomerase-positive cell lines throughout the cell cycle [17, 28]. A proteomic screen of mitotic chromosome-associated proteins also identified ORCA (alias CENP-33) as a novel centromere protein [29]. These observations indicate the possible involvement of ORCA in DNA recombination, since centromeres are highly recombinogenic regions [30]. ORCA was found to localize to telomeres in interphase U2OS (osteosarcoma) cells [17], which utilize the DNA recombination-mediated ALT (alternative lengthening of telomeres) mechanism to maintain their telomeres [31]. ORCA was also identified in a large-scale proteomic analysis of ATM/ATR substrates as one of the proteins that is phosphorylated in response to DNA damage, corroborating a role for ORCA in DNA recombination/repair [32]. Moreover, the mouse ortholog of ORCA is highly expressed in the testis, an organ with high recombinogenic activity, and has been reported to be involved in spermatogenesis [33]. Further investigation of the functional significance of ORCA binding to centromeres and telomeres will be critical to address whether ORCA has a direct role in heterochromatin replication or heterochromatin organization, and whether ORCA plays any role in DNA repair/recombination.

HBO1

HBO1 (human acetylase binding to Orc1) was originally identified in a yeast two-hybrid screen of a HeLa cell cDNA library using Orc1 as the bait. Histone acetyltransferase activity was found in the HBO1-containing complex, suggesting that ORC-mediated chromatin acetylation influences DNA replication [34]. In human cells, HBO1 knockdown leads to a defect in MCM chromatin loading, with no effect on chromatin loading of ORC and Cdc6, suggesting that HBO1 acts at a step after ORC and Cdc6 binding to chromatin but upstream of MCM loading. In Xenopus egg extracts, immunodepletion of HBO1 also impairs chromatin binding of MCM and inhibits DNA replication, but this can be restored upon the addition of recombinant Cdt1 [35].

HBO1 associates with origins in G1 phase, directly interacts with Cdt1, and enhances Cdt1-dependent re-replication [36]. It has been suggested that HBO1 acts as the co-activator of Cdt1 and thereby facilitates replication initiation [36]. Further, HBO1-mediated histone H4 acetylation at origins is required for MCM loading, and Geminin inhibits HBO1 acetylase activity in a Cdt1-dependent manner [37]. This is consistent with a recent report that Cdt1-HBO1 complex promotes MCM loading through acetylation-mediated enhancement of chromatin accessibility in G1 phase. The MCM loading is inhibited by Cdt1-Geminin-HDAC11 via deacetylation in S phase, providing yet another mechanism for replication licensing [38]. Interestingly, Cdt1-HBO1 interaction is well regulated: in response to stress, JNK1 phosphorylates Cdt1 on threonine 29, which results in the dissociation of HBO1 from replication origins and consequently results in the inhibition of replication initiation [39]. Taken together, HBO1 is a key molecule that organizes chromatin to facilitate pre-RC assembly and replication initiation.

14-3-3

14-3-3 proteins exhibit specific phospho-serine/phospho-threonine binding activities, and thus are involved in various cellular pathways, including cell growth, apoptosis, cytokinesis, and tumor suppression [40, 41]. In mammalian cells, CBP (cruciform-binding protein) belongs to the 14-3-3 family. ChIP experiments reveal that CBP associates with monkey replication origins ors8 and ors12, which bear inverted repeats and form the cruciform structure. This origin association takes place in a cell cycle-dependent manner, with maximal activity at the G1/S boundary [42–44]. Addition of CBP antibodies impairs CBP-cruciform DNA complex formation and inhibits DNA replication in vitro [42, 43]. In Saccharomyces cerevisiae, Bmh1 and Bmh2, the 14-3-3 homologs in budding yeast, also have cruciform DNA binding activities and bind to replication origin ARS307 in vivo [45, 46]. Recently, the role of 14-3-3 in replication initiation has been elucidated. Bmh2 interacts with Orc2 and MCM2, and binds to ARS, peaking in G1 phase [47]. Utilizing a Bmh2 temperature-sensitive (bmh2-ts) mutant strain, it has been demonstrated that Bmh2 is required for the loading and maintenance of MCM on chromatin during G1 phase. Bmh2 has been suggested to be an essential component for pre-RC formation, DNA replication initiation, and normal cell cycle progression [47]. 14-3-3 binding to cruciform DNA indicates that additional factors may be needed in eukaryotes for replication activation at specific DNA structures.

This site-specific association of ORC accessory factors is also evident at other DNA contexts. In Drosophila, the Myb (Myeloblastosis)-containing complex (Myb p85, Caf1 p55, p40, p120, and p130) binds in a site-specific manner to the chorion gene cluster, ACE3 and ori-β. The Myb complex interacts with ORC, and is essential for chorion gene amplification [48]. Myb may function in converting specific inactive replication origins to active ones, possibly by facilitating acetylation at origins [49, 50]. In mammalian cells, high mobility group (HMG) proteins have the AT-hook motif that binds to the minor groove of the AT-rich regions of double-stranded DNA [51]. A member of the HMG proteins, HMGA1a, associates with ORC in vitro and in vivo [52]. Targeting HMGA1a to specific DNA sites recruits ORC and creates functional replication origins, and the abundance of recruited ORC correlates with the local density of HMGA1a. It is therefore proposed that high local concentration of HMGA1a may function as a potent, dominant replication origin [52]. These examples indicate that it will be critical to determine if a defined set of protein complexes associates with specific DNA elements that in turn regulates the replication initiation at site-specific replication origins.

Cdt1-Geminin associated proteins

Cdt1 is required for loading MCM onto origins; it accumulates within the nucleus and associates with chromatin at high expression levels only in G1. In metazoans, another cell cycle-regulated protein, Geminin, accumulates during S-G2-M phases and is then mitotically degraded by APC/C (anaphase promoting complex/cyclosome). Geminin is an inhibitor of Cdt1 and does not allow the assembly of the pre-RC outside of G1 phase. Therefore, the dynamic oscillating pattern ensures that the pre-RC is assembled only in G1 phase and is strictly inhibited outside G1 phase, serving as one of the replication licensing mechanisms that is critical for the maintenance of genome stability [4].

HOX

HOX proteins control cell fate determination and patterning specification processes during development. They function as transcription factors that bind to DNA via their homeodomain [53–57]. Through a yeast two-hybrid screen using a cDNA library from 8.5 days post-coitum mouse embryos, Geminin was found to associate with HOX proteins, the HOX regulatory DNA elements, as well as the HOX-repressing polycomb complex [58]. Geminin inhibits the transcriptional activator function of HOX. HOX binding to Geminin impairs the association of Cdt1 with Geminin [58]. Interestingly, HOXD13, HOXD11 and HOXA13 bind to human replication origins in vivo, and HOXD13 directly interacts with Cdc6 via its homeodomain. This association promotes pre-RC assembly at origins and eventually stimulates DNA replication [59]. Binding of Geminin to HOXD13 blocks its pre-RC promoting function; however, exogenous HOXD13 expression overrides the Geminin-induced G1 accumulation [59]. A recent study has resolved the solution structure of the homeodomain of HOXC9 (HOXC9-HD) in complex with the Geminin homeodomain binding region (Gem-HBR). Interestingly, the C-terminal Ser184 residue of Geminin can be phosphorylated by Casein kinase II (CK2), resulting in its higher binding affinity and inhibitory effect toward HOX [60]. Taken together, these data establish the HOX interactions with pre-RC and Geminin as an additional mechanism for controlling pre-RC assembly and replication initiation.

Idas

Discovered as a Geminin homolog, Idas is highly similar to Geminin in the central coiled-coil region. The name “Idas” is taken from that of Gemini’s cousin in ancient Greek mythology. In human cells, Idas localizes in the nucleus and shows decreased protein levels in anaphase [61]. Idas forms a complex with Geminin, but not Cdt1, and this direct binding prevents Geminin from binding to Cdt1 and translocates Geminin from the cytoplasm to the nucleus [61]. Depletion of Idas leads to the accumulation of cells in S phase and inefficient progression to mitosis and G1, whereas over-expression of Idas leads to accumulation of multi-nucleated cells [61]. In contrast, over-expression of Geminin causes S phase accumulation [62], while depletion of Geminin leads to multipolar spindle defects [63]. It remains to be determined whether Idas has a role in replication initiation. A recent study has also reported specific high expression of Idas in the cortical hem and choroid plexus of the developing mouse telencephalon, indicating its involvement in developmental control as a putative modulator of the balance between proliferation and differentiation [61].

Geminin has dual functions: replication licensing inhibition and developmental control [64–66]. Both HOX proteins and Idas affect normal cell cycle proliferation and are also involved in the developmental regulation of differentiation. Notably, they both compete with Cdt1 for Geminin binding, suggesting that these proteins play crucial roles in regulating the pre-RC or in balancing Geminin’s dual functions along with Cdt1.

MCM-related proteins

MCM (minichromosome maintenance) proteins were first isolated in yeast mutants that were defective in the maintenance of circular minichromosomes in S. cerevisiae [67]. MCM2-7 is generally believed to serve as the DNA helicase during replication. It forms a hexameric ring structure and all of the six subunits belong to the AAA+ family (ATPases associated with a variety of cellular activities) [68]. Recently, a number of MCM related proteins were identified based on either the sequence homology or the association with MCM2-7.

MCM8

MCM8 was categorized as a new member of the MCM family because it contains a conserved MCM helicase ATP binding domain, similar to that observed in the MCM2-7 proteins [69, 70]. In human cells, MCM8 binding to chromatin is cell cycle-regulated [71]. It interacts with Orc2 and Cdc6 [71], and co-localizes with Cdc6 at the c-myc initiation site [72]. Depletion of MCM8 affects the normal G1/S transition and leads to loading defects of Cdc6 and MCM onto chromatin [71]. In Xenopus, MCM8 has been shown to function as a DNA helicase during replication elongation, but not for replication initiation [73]. MCM8 is not required for MCM2-7 chromatin loading, but instead binds to chromatin at the onset of DNA replication, after replication licensing [73]. MCM8 co-localizes with replication foci; with its DNA helicase and DNA-dependent ATPase activities, MCM8 regulates the chromatin assembly of DNA polymerase α and RPA34 [73]. In Drosophila S2 cells, MCM8 depletion diminishes PCNA binding by 30–50%, also indicating the involvement of MCM8 in DNA synthesis [74].

Different cell assay systems and species might explain the differences regarding the roles of MCM8 in replication initiation and/or elongation. In Xenopus, Cdc6 displays reduced chromatin association after replication licensing, and is reloaded onto chromatin at the onset of S phase [75]. In HeLa cells, MCM8 and Cdc6 co-localize at the c-myc replication initiation zone during G1 and this association continues even after completion of DNA replication [72]. These data indicate that the loss of Cdc6 association to chromatin in MCM8-depleted cells could either suggest a role of MCM8 in loading Cdc6 onto chromatin to facilitate pre-RC assembly, and/or an independent role in post-G1 phase cells. Moreover, during a small time window at the G1/S boundary, the chromatin bound levels of MCM8 drop significantly, thus manifesting two discontinuous functions of MCM8 and possibly the dual role in both DNA replication initiation and elongation [72].

MCM9

MCM9 was identified as another MCM family member by bioinformatic approaches [76, 77]. In Xenopus, the binding of MCM9 onto chromatin is dependent on ORC. Further, MCM9 facilitates the loading of MCM2-7 onto chromatin [78]. In the absence of MCM9, pre-RC assembly is hampered and DNA replication halts. Mechanistic analysis shows that MCM9 forms a complex with Cdt1, limiting the amount of Geminin that can associate with Cdt1 on chromatin during replication licensing. This enables the active Cdt1 to load MCM2-7 complex onto chromatin by modulating the ratio of Cdt1 to Geminin [78, 79]. However, MCM9 is not required for pre-RC formation or DNA replication in mice, as no change in the chromatin-bound MCM2/4/7 or Cdt1 was observed upon MCM9 disruption [80]. It is possible that these organisms have evolved species-specific and tissue-specific requirements.

Recently, using MCM8 and MCM9 knockout chicken DT40 cell lines as well as MCM8−/− and MCM9−/− mice, two groups demonstrated that MCM8 and MCM9 form a complex distinct from the MCM2-7 complex. Further, MCM8 and MCM9 co-regulate each other’s stability. Importantly, they both play essential roles in homologous recombination-mediated double strand break repair during replication fork maintenance [81, 82].

MCM10

MCM10 was identified in genetic screens for mini-chromosome maintenance defects and DNA replication defects [67, 83–85]. It does not exhibit sequence homology to MCM2-7, but is conserved in most eukaryotes [86], and it also has the ability to bind both single- and double-stranded DNA [87–92]. In Saccharomyces cerevisiae and Xenopus laevis, MCM10 can self-assemble into a homo-complex, which requires its zinc finger motif [89, 93]; and in human cells, MCM10 forms a hexameric ring structure [88]. MCM10 interacts with MCM2-7, which is essential for DNA replication initiation in all eukaryotes examined, including Saccharomyces cerevisiae [85, 94, 95], Schizosaccharomyces pombe [96–98], Xenopus [99], Drosophila [100], and human cells [99, 101]. Chromatin-loaded MCM2-7 (the inactive form) enables the efficient recruitment of MCM10 [102]. MCM10 also interacts with DNA polymerase α [87, 89–91, 99, 103–106], recruits DNA polymerase α to replication origins/forks, and stabilizes its catalytic subunit [103–105], functioning together with Ctf4/And-1 (see the Ctf4/And-1 section below) [99, 107]. Taken together, MCM10 may serve as the coordinator for MCM2-7 helicase and DNA polymerase, facilitating their roles during replication initiation and elongation.

Even though the role of MCM10 in DNA replication has been appreciated for a while, significant new investigations into its interaction with the Cdc45-MCM2-7-GINS (CMG) complex and the functional significance of this interaction have prompted us to include MCM10 in this review. MCM10 binds to origins in a TopBP1-dependent manner [108] and associates with replication initiation sites before Cdc45, promoting Cdc45 chromatin association [109–111], as well as after Cdc45 along with DNA polymerase α [103, 112]. This pre- and post-Cdc45 recruitment of MCM10 is intriguing, indicating a complicated and as yet uncovered function with CMG.

Recently, several groups have reported that the origin association of MCM10 is independent of CMG assembly, but is essential for DNA unwinding [102, 113, 114]. In Saccharomyces cerevisiae, the stable CMG complex remains at origins upon the auxin-induced MCM10 degradation (mcm10-1-aid) [113]. Further, Cdc45-MCM2-7-GINS interactions are not impaired upon temperature-induced MCM10 degradation (mcm10-1td) [102], indicating CMG origin assembly is independent of MCM10. However, RPA loading is defective in the absence of MCM10 in both studies, suggesting that MCM10 facilitates an origin DNA unwinding reaction [102, 113]. Parallel observations were also made in Schizosaccharomyces pombe using combined promoter shut-off and auxin-induced degradation system (off-aid) [114]. Moreover, point mutations within the zinc finger motif in fission yeast MCM10 (MCM10ZA) revealed that this motif plays an important role in unwinding the origin, though it is not essential for MCM10 origin association [114].

MCM-BP

MCM-BP (MCM-binding protein) was first identified from mass spectrometric analyses of TAP (tandem affinity purification) tagged MCM6 and MCM7 immunoprecipitations [115]. Interestingly, MCM-BP interacts with MCM3-7 but not MCM2, indicating that MCM-BP and MCM2 form different complexes with MCM3-7 [115–118]. Further, MCM-BP can interact with the core complex MCM4/6/7 in vitro, but does not inhibit the helicase activity [115].

In Arabidopsis thaliana, the MCM-BP ortholog, ETG1 (E2F target gene 1), is required for DNA replication [119]. ETG1-deficient plants display activation of the DNA replication stress checkpoint, which is crucial to their survival [119]. In Schizosaccharomyces pombe, overexpression of Mcb1 (MCM-BP homolog) disrupts MCM2 association with the MCM3-7 complex and causes MCMs to dissociate from the chromatin, leading to DNA replication inhibition, DNA damage and checkpoint activation [117, 118]. In Xenopus egg extracts, MCM-BP accumulates in the nucleus in late S phase. Immunodepletion of MCM-BP inhibits replication-coupled MCM2-7 dissociation from chromatin, whereas addition of excess recombinant MCM-BP disassembles the MCM2-7 complex [116]. Observations similar to those in these species were also made in human cells. MCM-BP binds to chromatin in a cell cycle-dependent manner: associating with chromatin in M-G1-S and dissociating from chromatin in late G2-early M. The MCM-BP dynamic patterning occurs slightly later than that of MCM2-7 [115]. MCM-BP siRNA knockdown results in the reduced dissociation of MCM from chromatin, consistent with its role as the MCM2-7 unloader from chromatin [116]. Prolonged shRNA knockdown results in G2 checkpoint activation, and raised whole cell levels as well as soluble levels of MCM proteins [120]. Interestingly, yeast two-hybrid assay and immunoprecipitation analysis reveal that MCM-BP also interacts with Dbf4. However, MCM-BP is not a DDK substrate; rather, it inhibits phosphorylation of MCM by DDK, adding another layer of complexity in replication regulation [121].

Moreover, MCM-BP is also involved in other cellular events. In Arabidopsis thaliana, ETG1 is required for the establishment of sister chromatid cohesion [122]; in fission yeast, MCM-BP is essential for meiosis [118]; and in human cells, MCM-BP shRNA knockdown leads to abnormal nuclear morphology and centrosomal amplification [120]. Further investigations are needed to distinguish whether these observations indicate its separate functions in DNA replication and mitosis or an indirect effect due to aberrant DNA replication.

Factors influencing pre-IC

The establishment of pre-IC sets the stage for the initiation of DNA replication during S phase of cell cycle [4]. In addition to Cdc45, MCM10, GINS and other related proteins, several novel components have been identified that influence the pre-IC formation.

DUE-B

DUE-B (DUE-binding protein) was identified in a yeast one-hybrid screen to identify proteins that bind to c-myc DUE (DNA-unwinding element)/ACS region [123]. Structural analysis reveals that DUE-B interacts with DNA via its C-terminus and forms a homodimer via the N-terminus, resembling the structure of tRNA-editing enzymes [124]. The N-terminus of DUE-B exhibits both D-aminoacyl-tRNA deacylase and ATPase activity [124]. In HeLa cells, RNAi knockdown of DUE-B results in the delayed S phase entry and increased cell death [123]. In Xenopus egg extracts, immunodepletion of DUE-B inhibits replication, which can be restored by the addition of recombinant DUE-B (expressed in HeLa cells) [123]. Further, DUE-B forms a complex with TopBP1 and Cdc45. Knockdown of DUE-B in HeLa cells abolishes the chromatin binding of Cdc45, and immunodepletion of DUE-B in Xenopus egg extracts abolishes the loading of both Cdc45 and TopBP1 to chromatin [125]. DUE-B can be phosphorylated, and this affects Cdc45 (not TopBP1) association with phosphorylated DUE-B at high salt concentration [125]. It is therefore hypothesized that TopBP1-DUE-B-Cdc45 complex associates with the pre-RC. CDK phosphorylation then triggers Cdc45-mediated activation of pre-IC [125].

GEMC1

GEMC1 (Geminin coiled-coil containing protein 1) was identified during the search for open reading frames (ORFs) containing degenerate signature motifs present in known replication factors [126]. In Xenopus egg extracts, GEMC1 interacts with Cdc45 and TopBP1, and its chromatin binding is independent of the MCM2-7 complex, suggesting its involvement subsequent to the formation of pre-RC. GEMC1 depletion prevents chromatin loading of Cdc45, whereas TopBP1 depletion prevents chromatin loading of GEMC1, indicating a sequential assembly in which TopBP1-dependent loading of GEMC1 to chromatin promotes the subsequent loading of Cdc45 to chromatin [126]. Moreover, GEMC1 interacts with and is phosphorylated by Cyclin E/Cdk2. Constitutively phosphorylated GEMC1 stimulates efficient DNA replication and enhances the loading of Cdc45 to chromatin [126]. These data together suggest that TopBP1- and Cdk2-dependent binding of GEMC1 to chromatin promotes the loading of Cdc45 onto chromatin to facilitate DNA replication [126, 127].

Treslin/Ticrr

Treslin was recently identified as a TopBP1 interacting protein in Xenopus egg extracts. Phosphorylation of Treslin by Cdk2 is required for Treslin-TopBP1 complex formation [128]. Immunodepletion of Treslin does not affect TopBP1 binding to chromatin and vice versa. Treslin and TopBP1 co-operate in the Cdk2-mediated loading of Cdc45 to chromatin [128]. Treslin phosphorylation mediated by Cdk2 has been suggested to promote the formation of a TopBP1-Treslin complex, which in turn would enable Cdc45 loading and the subsequent action of replicative helicase activity [128]. In human cells, a similar Cdk2-mediated Treslin-TopBP1 association is demonstrated [129], and Treslin knockdown is found to affect S phase progression due to accumulation of DNA damage [128]. Treslin was concurrently identified from a screen in zebrafish for G2/M checkpoint regulators, and hence named as Ticrr (TopBP1-interacting, checkpoint, and replication regulator) [130]. Ticrr, also a chromatin associated protein that binds TopBP1, has been shown to be required for the proper assembly of the pre-IC, but not the pre-RC [130].

DUE-B, GEMC1, and Treslin/Ticrr all associate with TopBP1 as well as Cdc45, and phosphorylation of each of these three proteins is required for the proper activation of Cdc45 on chromatin. The formation of pre-IC requires CMG and other replication factors. In yeast, these factors belong to, but are not limited to, the Sld2-Dpb11-Sld3 complex. In higher eukaryotes, RecQ4 is generally considered the ortholog of Sld2 [131, 132], and TopBP1 that of Dpb11 [133–135]. However, the counterpart for Sld3 is missing. Given that CDK-mediated regulation of DUE-B, GEMC1, and Treslin/Ticrr interactions with TopBP1 and Cdc45 is similar to that of Sld3–Dpb11, it is tempting to propose that one or more of them may serve as the functional orthologs. Specifically, Treslin/Ticrr interacts with the BRCT repeats of TopBP1, resembling the interaction of Sld3 with Dpb11 [128–130], and conserved CDK sites between Sld3 and Treslin/Ticrr were identified; it is therefore likely that Treslin/Ticrr is functionally equivalent to Sld3 [136–139]. Nevertheless, a recent study found that neither Treslin/Ticrr nor the above-mentioned DUE-B or GEMC1 could complement the loss of Sld3 (they fail to rescue the growth defect of sld3 mutants) in Schizosaccharomyces pombe, indicating much more complexity within the higher eukaryotes [140].

Interestingly, existing components in the pre-IC also exhibit functions coupling both replication initiation and elongation. In human cells, RecQ4 is loaded onto chromatin in late G1 phase after pre-RC formation, following which RecQ1 and additional RecQ4 are recruited to origins at the onset of DNA replication [141]. During DNA synthesis, RNAi knockdown of RecQ1 impairs replication fork progression [141], whereas RecQ4 displays DNA unwinding activity and functions along with CMG and MCM10 [142–144]. Our quest towards identifying novel components that regulate replication initiation and elongation continues.

Replisome Progression Complex (RPC) coupled factors

FACT

FACT (facilitates chromatin transcription) was first purified from HeLa cell nuclear extracts as the factor required for transcription elongation [145, 146]. It interacts with nucleosomes and histone H2A/H2B dimers, and is composed of Spt16/Cdc68 and SSRP1 (structure-specific recognition protein-1) [147]. Similar findings were also reported for yeast FACT Spt16-Pob3-Nhp6 [148, 149] and Xenopus FACT DUF (DNA unwinding factor, DUF140-DUF87) [150]. In yeast, FACT interacts with DNA polymerase and RPA [151–153], whereas in Xenopus egg extracts, FACT unwinds DNA and is crucial for DNA replication [150].

Interestingly, FACT interacts with MCM2-7 and promotes its DNA unwinding activity [154]. ChIP assays reveal that FACT and MCM2-7 co-localize at replication origins in human cells, and disruption of FACT-MCM interaction by introducing mutant MCM4120-250 results in reduced levels of FACT at origins and delayed replication initiation [154]. These data demonstrate that the FACT-MCM interaction is essential for the initiation stage of DNA replication. Further, FACT-MCM interaction exhibits cell cycle-regulated dynamics, with interaction levels peaking at the onset of DNA replication and during S phase [155], and a study of SSRP1 conditional knockout chicken DT40 cells indicates that FACT maintains the normal elongation rate/fork progression of DNA synthesis [156]. This suggests that FACT-MCM interaction functions not only in the initiation phase but also in the elongation stage of DNA replication. Similarly, in Xenopus egg extracts, FACT displays two phases of chromatin binding: one before origin licensing and the other during replication [157].

Ctf4/And-1

The Ctf4 gene was first described in two genetic screens for chromosome transmission fidelity (CTF) and chromosome loss (CHL) in Saccharomyces cerevisiae [158], and was found to be essential for sister chromatid cohesion [159]. Later, it was found that Ctf4 depletion in the orc2-1 and orc5-1 mutant backgrounds leads to synthetic lethality, indicating its involvement in DNA replication [160]. Recent studies in several model organisms have provided more insight into the mechanism by which Ctf4 facilitates DNA replication.

In Saccharomyces cerevisiae, Ctf4 binds to chromatin predominantly in S phase, and this chromatin binding requires its interaction with MCM10. Deletion of Ctf4 results in the destabilization of MCM10 and DNA polymerase α, and defects in S phase [107]. Further, Ctf4 interacts directly with GINS and DNA polymerase α [161], and the interaction between CMG complex and DNA polymerase α is abolished in a ctf4Δ strain [162]. In Drosophila, Ctf4 interacts with Psf1, Psf2, MCM2, and DNA polymerase α in yeast two-hybrid analyses. In vivo RNAi knockdown of Ctf4 demonstrates its requirement for normal S phase progression during homeostasis as well as replication stress induced by replication fork pausing [163]. In Xenopus, immunodepletion of Ctf4 from egg extracts abolishes DNA replication. Time course analysis indicates that MCM10 loading onto chromatin precedes that of Ctf4 and DNA polymerase α, and MCM10-Ctf4 interaction is required for the loading of Ctf4 and DNA polymerase α [99]. In mammalian cells, Ctf4 also interacts with MCM10 and DNA polymerase, and is required for the stabilization of DNA polymerase and replication [99, 164]. RNAi knockdown of Ctf4 in HeLa cells results in the failure of the CMG complex assembly on chromatin [144], and a slower DNA replication rate [164]. Therefore, Ctf4 is an essential factor in DNA replication and facilitates replisome progression [161].

The initial identification of FACT and Ctf4 indicated their functions in regulating chromatin contexts, not DNA synthesis; however, recent data have clearly demonstrated the close crosstalk between DNA replication and its chromatin context in S phase. It is noteworthy that DNA replication components act in concert with genome stability surveillance factors. For instance, the checkpoint mediator proteins Mrc1-Tof1-Csm3, as RPC components, are involved in replisome progression. In Saccharomyces cerevisiae, Mrc1 is required for normal replication fork movements, whereas Tof1 is necessary for replication forks to pause at protein-DNA barriers [165]. In Schizosaccharomyces pombe, Mrc1 regulates early-firing origins, independent of its checkpoint function [166]. Another example comes from Xenopus Dna2 [167, 168], which co-localizes with RPA during replication and interacts with Ctf4/And-1 and MCM10. On the other hand, Dna2 also associates with DSB repair and checkpoint proteins Nbs1 and ATM [167]. These concerted interactions at the replication fork ensure genome stability during DNA synthesis.

Non-coding RNAs

Y RNAs

Y RNAs are conserved small stem-loop RNAs, first identified as the cytoplasmic RNA components of Ro ribonucleoprotein particles (Ro RNPs) in higher eukaryotes. In human cells, there are four types of Y RNAs: Y1, Y3, Y4, and Y5 [169–174]. In a human cell-free system (late G1 phase nuclei), both reconstitution and individual degradation of Y RNAs demonstrate that they are required for chromosomal DNA replication, and can substitute for each other [175]. However, Y1 RNA binding to Ro60 is not necessary for DNA replication [175], nor is the Y RNA-containing ribonucleoprotein complex [176]. To investigate how Y RNAs are involved in DNA replication, DNA combing/fiber fluorescence and nascent-strand analysis were used to study the effect of Y RNA degradation (Y3) and the supplementation of non-targeted Y RNA (Y1), which demonstrated that Y RNAs are required for the initiation rather than the elongation of DNA replication [177]. Interestingly, only vertebrate Y RNAs are able to reconstitute chromosomal DNA replication in vitro, which led to the discovery of a conserved motif in the double-stranded stem of vertebrate Y RNAs that is sufficient for their function [178]. A recent study using fluorescence labeling showed that Y RNAs associate with unreplicated euchromatin in late G1 phase, then dissociate from replicated DNA. Pull-down experiments revealed that they interact with ORC and other pre-RC components. Therefore, a “catch and release” mechanism was proposed for Y RNAs, in which they serve as another licensing factor for replication initiation [179]. In addition, evidence from Xenopus and zebrafish embryos suggests that the midblastula transition (MBT) defines the onset of Y RNA-dependent DNA replication. Y RNA is not required for DNA replication before MBT, but is necessary for replication and associates with chromatin in an ORC-dependent fashion after MBT [180].

G-quadruplex RNA

Epstein-Barr virus (EBV) is an oncogenic herpesvirus that can infect human cells and stay in the host as an episome, which replicates once per cell cycle. EBV utilizes the EBV-encoded protein EBNA1 (Epstein–Barr nuclear antigen 1) to recruit host ORC to its origin, oriP, in order to replicate its DNA [181–186]. The N-terminal domain of EBNA1 has two linking regions, 1 and 2 (LR1 and LR2), that interact with ORC, and deletion of LR1 and LR2 from EBNA1 eliminates its ORC binding ability [187]. Sequence analysis of the LR1 and LR2 regions shows that they both have arginine- and glycine-rich motifs (RG-rich motifs) [187]. Substitution of arginines or glycines with alanines in the LR2 region abolishes ORC association, indicating that the RG-rich motif plays a pivotal role in ORC-EBNA1 association [187]. Interestingly, RNase A but not DNase I disrupts ORC-EBNA1 interaction, suggesting that the recruitment of ORC to EBNA1 is RNA dependent [187]. EBNA1 displays RNA binding ability [188, 189], and has an RG-rich motif that is known for its RNA binding activity [190]. Taken together, these findings indicate an RNA-dependent recruitment of ORC to EBNA1 by the N-terminal LR1 and LR2 regions of EBNA1.

Further analyses reveal that this RNA dependency is from the G-quadruplex RNA [187, 191]. A G-quartet is formed by four guanines in a square arrangement via hydrogen bonds, and a G-quadruplex is formed by two or more stacked G-quartets [192]. The involvement of G-quadruplex RNA in ORC-EBNA1 association is demonstrated by introducing a G-quadruplex-interacting compound, BRACO-19, which abolishes EBNA1 recruitment of ORC and inhibits EBNA1-dependent replication of oriP [191].

The emergence of non-coding RNAs from these studies implies the existence of additional factors assisting ORC for origin targeting and replication licensing. Intriguingly, upon RNase A treatment, a fraction of ORC is released from chromatin, indicating that the ORC association with chromatin can be partially stabilized by RNA [187]. Therefore, it is highly possible that some structured RNAs mediate ORC recruitment to certain origins. These findings reinforce the important role of non-coding RNAs in the regulation of replication initiation.

Concluding remarks

Recently, several novel members have joined the family of pre-RC (Figure 1) and pre-IC (Figure 2) factors, mostly discovered through proteomic screens/analyses and bioinformatic predictions; at the same time, many classic factors, including HBO1, 14-3-3, HOX, MCM10, FACT, Ctf4, and Y RNAs, have been credited with additional roles in replication.

Figure 1

Emerging players in the assembly of pre-replicative complex (pre-RC). The classical model of pre-RC assembly involves the ordered loading of ORC, Cdc6, Cdt1 and MCM2-7 onto replication origins. The Cdt1 inhibitor, Geminin, associates with Cdt1 outside G1 phase of the cell cycle and prevents the re-assembly of pre-RC. Several ORC interacting proteins (ORCA/LRWD1, 14-3-3) and non-coding RNAs (G-quadruplex RNA, Y RNA) have been found to be involved in regulating ORC binding to origins. MCM8 interacts with Cdc6 and is required for Cdc6 chromatin loading. HBO1 and MCM9 facilitate Cdt1 activity by antagonizing Geminin and modulating Cdt1-Geminin stoichiometry, respectively. In addition, HOX and Idas interact with Geminin and may control Geminin’s dual functions in proliferation-differentiation determination.

Figure 2

Emerging players in the assembly of pre-initiation complex (pre-IC) and replisome progression complex (RPC). Other than classical factors in pre-IC and RPC (Cdc45, MCM2-7, GINS, RecQ4, TopBP1, and DNA polymerase), a number of novel factors have recently been identified. DUE-B, GEMC1, and Treslin/Ticrr all interact with TopBP1 and Cdc45, and phosphorylation of each of these three proteins (possibly by S phase CDK) is required for the TopBP1-dependent activation of Cdc45 on chromatin. MCM10, Ctf4/And-1, and FACT connect MCM2-7 helicase activity with DNA polymerase activity and facilitate replisome progression. In addition, MCM-BP regulates replication-coupled MCM2-7 helicase dissociation from chromatin, whereas MCM8 regulates the chromatin assembly of DNA polymerase. (For simplicity, only one MCM2-7 complex and its related factors are shown here.)

It is notable that different players exhibit different levels of conservation across organisms. For some proteins, like Ctf4, their functions are faithfully reproduced in all model organisms investigated. For some novel players, such as ORCA/LRWD1, whether orthologs exist or have conserved functions in other eukaryotic species is still an open question. For other factors, like MCM9, disparate observations have been made in different systems, which could either reflect technical differences or indicate real evolutionary divergence.

It is also noteworthy that many of these factors are involved in other non-replication events, as is evident from ORCA/LRWD1 in heterochromatin silencing [20], 14-3-3 in various signaling pathways [40, 41], HOX in developmental control [53–57], FACT in transcription elongation [145, 146], Ctf4 in sister chromatid cohesion [159], and Treslin/Ticrr in G2/M checkpoint [130]. It is possible that these factors may have replication-independent functions, or may couple replication to additional processes to form a regulatory network.

What we know is probably only the tip of the iceberg. For instance, the complete in vitro reconstitution of DNA replication initiation machinery or sub-complexes using purified proteins has so far been unsuccessful, indicating that there are still important players missing from the orchestra. Investigations in other less-studied species may lead us to some valuable clues. In Tetrahymena thermophila, ORC has a 26T RNA component that is involved in ribosomal DNA origin recognition via complementary base pairing [193]. It will be interesting to find out whether a similar functional player exists in other model organisms. On the other hand, with the increasing genome size and complexity of the protein network from lower to higher eukaryotes, how timely control and dynamic regulation of replication are achieved also needs further elucidation. As reported in a recent study using a novel hybrid CHO cell/Xenopus egg extract system, a pre-restriction point inhibitor exists for licensing. Other than ruling out pre-RC components, Geminin, HBO1 or CDK, the nature of this inhibitor was not characterized [194]. Therefore, exploring new model systems and utilizing innovative techniques, along with mechanistic studies on existing factors, will help us probe for the missing links and mechanisms, toward a better understanding of how eukaryotes initiate DNA replication.