Gene organization and evolutionary history

The first origin recognition complex (ORC) proteins to be identified were purified from cell extracts of budding yeast (Saccharomyces cerevisiae) as a heterohexameric complex that specifically binds to origins of DNA replication [1], and the subunits were named Orc1 through Orc6 in descending order of apparent molecular mass, as judged by SDS-PAGE (Figure 1). Shortly thereafter, the corresponding genes were cloned [2-7]. The genes are dispersed among six chromosomes (ORC1 chromosome 13, ORC2 chromosome 2, ORC3 chromosome 12, ORC4 chromosome 16, ORC5 chromosome 14, ORC6 chromosome 8), and their sizes mirror the sizes of the proteins they encode, ranging from 1,308 bp to 2,745 bp. All are intronless, as is the case for the vast majority of budding yeast open reading frames [8]. Subsequently, orthologs of ORC1-ORC5 were identified in organisms as diverse as Drosophila melanogaster [9], Arabidopsis thaliana [10] and Homo sapiens [11], strongly suggesting that these genes exist in all eukaryotes. ORC6 genes have also been assigned in numerous metazoan species (Figure 2), and although the encoded proteins are relatively well conserved between metazoans and fission yeast (Schizosaccharomyces pombe), there is insufficient identity to conclude definitively that they are homologous to budding yeast Orc6, which is also considerably larger than Orc6 in these other species [11]. As with S. cerevisiae, the genes in other species are spread among multiple chromosomes. Apart from Orc6, the size of the individual protein subunits does not vary much between species, although the genes themselves are considerably longer in higher eukaryotes (for example, they range from 8,746 bp for ORC6 to 87,405 bp for ORC4 in H. sapiens), as would be expected from the presence of intronic sequence.

Figure 1

Comparison of domains for Orc1-5 and Cdc6 from S. cerevisiae. Orc1, Orc4, Orc5, and Cdc6 each contain an AAA+ domain as part of a larger ORC/Cdc6 domain (orange) [75]. Orc2 and Orc3 are predicted to share this domain structure [19], but have a greater degree of sequence divergence. Motifs within the AAA+ domain include Walker A (WA), Walker B (WB), Sensor-1 (S1) and Sensor-2 (S2). The carboxy-terminal region of ORC/Cdc6 is predicted to contain a winged-helix domain (WH), involved in DNA binding. Orc1 contains an additional BAH (bromo-adjacent homology) domain (pink), which interacts with the Sir1 protein and is involved in epigenetic silencing. Orc1 and Orc2 have regions of disorder (yellow); a DNA-binding AT-hook motif (here PRKRGRPRK) is identified in S. cerevisiae Orc2, and several of these have also been identified in disordered regions in S. pombe Orc4. The number of amino acids for each protein is indicated at the right.

Figure 2

Homology between Orc6 in representative species D. melanogaster (Dm), H. sapiens (Hs), A. thaliana (At), S. pombe (Sp), and S. cerevisiae (Sc). Orc6 contains a unique conserved domain, identified by homology with the Orc6 protein fold superfamily (pfam 05460) [76]. This domain is interrupted by a large disordered region [77] in S. cerevisiae. Orc6 has no recognizable homology to Orc1-5 or AAA+ domains. The carboxy-terminal region of Orc6 in D. melanogaster has been shown to interact with a coiled-coil region of the septin protein Pnut, possibly mediated by coiled-coil motifs predicted in Orc6 [78]. The number of amino acids for each protein is indicated at the right.

Along with ORC subunit orthologs, additional Orc1-like proteins are widespread in eukaryotic species. The most notable of these is Cdc6, a replication factor that aids in loading the Mcm2-7 DNA helicase onto replication origins (Figure 3). In budding yeast, Cdc6 has strong similarity with a 270-amino-acid stretch of Orc1 [6], and phylogenetic analysis of a wide array of species suggests that the ORC1 and CDC6 genes may be paralogs [12]. As shown by a neighbor-joining tree based on AAA+ protein domains (discussed below), Orc1 is more closely related to Cdc6 than to other ORC subunits (Figure 4). In addition to Cdc6, which is well conserved among eukaryotes, some species-specific Orc1-like proteins have also been identified. These include budding yeast Sir3, a protein that mediates heterochromatin formation [6]. In Arabidopsis, paralogous ORC1 genes, termed ORC1a and ORC1b, have been found, and it appears that ORC1a is preferentially expressed in endoreplicating cells, whereas ORC1b expression is limited to proliferating cells [10].

Figure 3

ORC and its interactions with other pre-RC proteins at origins of DNA replication. Orc1-Orc5 are required for origin recognition and binding in S. cerevisiae, whereas Orc6 is dispensable in this regard [44]. In contrast, Orc6 is essential for ORC DNA binding in D. melanogaster [28]. Studies with both S. cerevisiae and human cells have indicated that Cdc6 interacts with ORC through the Orc1 subunit (indicated by a double arrow) [31, 79, 80]. This association increases the specificity of the ORC-origin interaction [20]. Further studies with S. cerevisiae suggest that hydrolysis of Cdc6-bound ATP promotes the association of Cdt1 with origins through an interaction with Orc6 (indicated by a double arrow) [25, 31], and this in turn promotes the loading of Mcm2-7 helicase onto chromatin.

Figure 4

Neighbor-joining tree for ORC and Cdc6 proteins. Orc1-5 and Cdc6 sequences were retrieved from the NCBI protein database for H. sapiens (Hs), X. laevis (Xl), D. melanogaster (Dm), S. cerevisiae (Sc), and S. pombe (Sp). The protein corresponding to Cdc6 is named Cdc18 in S. pombe. AAA+ domain regions were extracted from Orc1-5 and Cdc6 sequences using the Walker A and Walker B motifs identified in [19]. The multiple sequence alignment program Muscle [81] was used to align the sequences, and any regions in the multiple sequence alignment containing gaps were deleted. The resulting ungapped alignment was used to construct a phylogenetic tree using the BioNJ algorithm [82]. One hundred resampled alignments were used to generate bootstrap values, with values greater than 70% indicated. The Orc1-5 and Cdc6 sequences are conserved across all five eukaryotic organisms, from yeast to human. Within each group, Orc1 seems to be the most highly conserved and Orc3 the most divergent. Interestingly, Orc1 is most closely related to Cdc6 within the ORC-Cdc6 family. Orc6 was not aligned, as it does not share the AAA+ domain with the other members. Scale bar represents changes per site.
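Two of the steps described in this legend, deletion of gap-containing alignment columns and resampling of columns for bootstrap analysis, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the toy sequences, and the random seed are hypothetical and are not taken from the original analysis, and the alignment (Muscle) and tree-building (BioNJ) steps themselves are not reproduced here.

```python
import random


def drop_gapped_columns(aligned):
    """Return the alignment with every column that contains a gap removed.

    `aligned` maps sequence names to aligned strings of equal length,
    with '-' marking a gap.
    """
    names = list(aligned)
    rows = [aligned[name] for name in names]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    keep = [i for i in range(len(rows[0])) if all(row[i] != "-" for row in rows)]
    return {name: "".join(aligned[name][i] for i in keep) for name in names}


def bootstrap_replicates(ungapped, n_replicates=100, seed=0):
    """Yield resampled alignments built by drawing columns with replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    names = list(ungapped)
    length = len(ungapped[names[0]])
    for _ in range(n_replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(ungapped[name][i] for i in columns) for name in names}


# Hypothetical toy alignment; these are not real ORC or Cdc6 sequences.
toy = {"seqA": "MK-LGKT", "seqB": "MKALG-T", "seqC": "MRALGKT"}

ungapped = drop_gapped_columns(toy)
print(ungapped)  # {'seqA': 'MKLGT', 'seqB': 'MKLGT', 'seqC': 'MRLGT'}

replicates = list(bootstrap_replicates(ungapped, n_replicates=100))
print(len(replicates), "resampled alignments")
```

In a full analysis, a tree would be built from each resampled alignment, and the bootstrap value for a branch would be the percentage of those trees in which the branch appears.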

ORC-like proteins are not just confined to the eukaryotes. Genes with homology to ORC1 and CDC6 have been found in most species of archaea, which typically have 1 to 9 copies, although as many as 17 have been found in the case of Haloarcula marismortui (reviewed in [13]). Studies of archaeal ORC proteins have yielded important results, because they not only bind to defined origin sequences but are amenable to crystallization, which has provided important structural information about ORC-DNA interactions [14, 15]. Curiously, genome analysis of several Methanococcus species has uncovered no evidence of ORC-like sequences. Given the apparent functional conservation of ORC proteins between eukaryotes and archaea, it will be interesting to determine whether ORC orthologs have simply been overlooked as a result of lower sequence conservation, or whether these species have developed another means of initiating DNA replication at origin sequences.

Evidence that proteins with ORC-like functions are actually common to all domains of life is provided by investigations of the bacterial DnaA protein. DnaA, like ORC, acts as an initiator of DNA replication and, whereas DnaA and the archaeal Orc1/Cdc6 proteins share little sequence identity, structural studies have shown that they do have a high degree of similarity in some of their functional domains [16]. Moreover, a recent study of Drosophila ORC structure suggests that DnaA and ORC wrap DNA in a similar manner [17].

Characteristic structural features

Orc1-5 as well as Cdc6 have conserved AAA+ folds, including Walker A and Walker B ATP-binding domains, characteristic of ATP-dependent clamp-loading proteins, which allow ring-shaped protein complexes to encircle duplex DNA (see Figure 1). Sensor-1 and Sensor-2 motifs are also found within the AAA+ fold and are believed to detect whether ADP or ATP is bound and to contribute to ATPase activity [18]. These domains are located centrally, in the case of Orc1 and Orc2, and towards the amino termini in Cdc6, Orc3, Orc4, and Orc5. Near the carboxyl termini of these proteins a winged-helix domain is present that mediates DNA binding [14, 15, 17]. Somewhat surprisingly, structural studies of archaeal Orc1 suggest that the AAA+ domain also contributes to its association with origin sequences [14, 15]. Interestingly, Cdc6 has been shown to act like an additional ORC subunit, associating with the complex in the G1 phase of the cell cycle and inducing a conformational change that increases its sequence specificity for DNA binding [19, 20]. When Cdc6 is bound to ORC, a ring-like structure is predicted with structural similarities to the Mcm2-7 helicase complex that ORC-Cdc6 loads onto DNA in an ATP-dependent manner [19, 21].

As mentioned above, sequence similarity has been identified for Orc1 and Sir3, with a particularly high degree of conservation between their amino-terminal 214 amino acids (50% identical, 63% similar), which includes a BAH (bromo-adjacent homology) protein-protein interaction domain [6, 22]. Sir3 is required for transcriptional silencing of telomeres and mating-type loci, functions that are also ORC-dependent [3, 5, 23], as discussed below. Although formally a member of ORC, Orc6 contains none of the aforementioned structural features, and shows no evidence of a common evolutionary origin with Orc1-5. It is nevertheless considered an ORC protein as its association with the other five subunits is required to promote the initiation of DNA replication. Relative to other ORC subunits, Orc6 is poorly conserved between budding yeast and metazoan eukaryotes [11] (see Figure 2). Nevertheless, a number of important domains specific to Orc6 have been identified in S. cerevisiae, including an amino-terminal 'RXL' docking sequence (amino acids 177-183) which mediates an interaction with the S-phase cyclin Clb5 [24], and a carboxy-terminal region (the last 62 amino acids) which associates with the other ORC subunits. Both ends of Orc6 (amino-terminal 185 amino acids, carboxy-terminal 165 amino acids) interact with Cdt1, another replication factor required to load Mcm2-7 onto DNA [25]. In both human and Drosophila cells, Orc6 plays a role in cytokinesis, and studies with the latter organism have identified a carboxy-terminal domain that interacts with the septin Pnut, a component of the septin ring that forms in cell division, as well as an amino-terminal domain that is important for DNA binding [26-29]. Interestingly, structural modeling of Drosophila Orc6 revealed that the amino terminus, but not the carboxyl terminus, is homologous to the human transcription factor TFIIB, raising the possibility that proteins involved in replication and transcription may have coevolved [27].

Localization and function

Detection of ORC by immunofluorescence and live-cell imaging of fluorescently tagged subunits in budding yeast has demonstrated that it localizes to punctate subnuclear foci throughout the cell cycle [30, 31]. Moreover, chromatin immunoprecipitation (ChIP) of ORC-bound genomic DNA that was subsequently labeled and hybridized to high-density, tiled, whole-genome S. cerevisiae oligonucleotide arrays revealed 400 ORC-enriched regions, which included 70 of the 96 replication origins that had been experimentally verified previously [32]. These findings are consistent with a role for ORC as a scaffold for the sequential association of a number of additional replication factors in the G1 phase of the cell cycle, including Cdc6, Cdt1, and Mcm2-7, which collectively form the pre-replicative complex (pre-RC), required for the initiation of DNA replication (reviewed in [23]).

Binding sites for budding yeast ORC have been identified at the HML (hidden MAT left) and HMR (hidden MAT right) silent cassettes, used for mating-type switching through gene conversion of the MAT allele, and at telomeric loci, whereas the majority of Drosophila ORC appears to be associated with heterochromatin, consistent with the role of this complex in mediating gene silencing [23, 33]. The amino terminus of S. cerevisiae Orc1 interacts with the heterochromatin factor Sir1, and truncation mutants lacking this region are defective in silencing but not DNA replication [6, 34], indicating that these two functions of the protein are separable. The role of the Orc1 amino terminus in mediating transcriptional repression seems to be conserved among eukaryotes, as it has also been found to interact with heterochromatin protein 1 (HP1) in Xenopus and Drosophila [33], which, in a fashion similar to Sir1, helps to propagate silenced chromatin.

It appears that all six ORC subunits remain chromatin-associated throughout the cell cycle in S. cerevisiae [35], but this differs from observations in metazoan cells where, in a number of cases, Orc1 appears to be absent from ORC at certain points in the cell cycle. For example, in human HeLa cells, Orc1 dissociates from chromatin during S phase, and then reassociates at the end of mitosis (reviewed in [36]). Immunofluorescent detection of Orc2 in one study indicated that it is found on chromatin throughout the cell cycle in Drosophila embryos [33]; however, a similar analysis with Drosophila neuroblasts and recently reported live-cell imaging of Orc2-green fluorescent protein (GFP) in embryos argue that this protein is actually excluded from chromosomes from prophase until anaphase [37, 38]. Fluorescence loss in photobleaching analysis in hamster cells suggests that the interaction of ORC subunits with chromatin may be less static than previously thought, revealing a highly dynamic interaction for both Orc1 and Orc4 with chromatin throughout the cell cycle [39].

In metazoan cells, ORC localization clearly extends beyond origin sequences (reviewed in [40]). Studies with Drosophila and human cells have revealed that Orc6 also localizes to the cleavage furrow in dividing cells, and a role for this protein in cytokinesis has been confirmed in both organisms through depletion by RNA interference [26, 27]. In addition, human Orc6 was shown to localize to kinetochores and reticular-like structures around the cell periphery during mitosis, and it is required for the proper progression of this cell-cycle stage [26], whereas human Orc2 also localizes to the centrosome throughout the cell cycle and its depletion results in mitotic defects and multiple centrosomes [41]. Recently, a similar role in controlling centrosome copy number was reported for human Orc1 [42].

Mechanism of action

The mechanism by which ORC promotes DNA replication, through loading and maintenance of the Mcm2-7 helicase at origin sequences, has been most closely examined in S. cerevisiae. ATP binding by the Orc1 subunit promotes association with DNA [43]. Cdc6 is then thought to bind ATP and associate with ORC, causing a conformational change that increases the specificity for the conserved origin sequences found in budding yeast. These origin regions are often referred to as autonomously replicating sequences (ARSs), and include an 11-bp ARS consensus sequence (ACS), as well as one or more B elements [20, 21, 23]. Cross-linking analysis has shown interactions between Orc1, Orc2, Orc4, and Orc5 proteins and origin DNA [44].

Given the lack of such conserved origin sequences in other eukaryotes, it is not surprising that other means of promoting ORC association with DNA have been discovered. Some of these are related to the relatively high AT content that is a common feature of replication origins among diverse species. For example, in the fission yeast S. pombe, a domain of Orc4 binds to AT-rich DNA [45], and another 'AT-hook' protein, HMGA1a, has recently been shown to target ORC to replication origins in human cells [46]. HMGA1a, which is known to interact in a highly specific manner with the minor groove of stretches of AT, was shown to interact with Orc1, Orc2, Orc4 and Orc6. Interestingly, an AT-hook motif is also present in S. cerevisiae Orc2, although its functional significance has not been determined (see Figure 1). It is clear, however, that AT content is not the only origin determinant, as numerous studies with both S. pombe and Drosophila have shown differences in ORC binding between stretches of DNA that have the same proportion of AT [23]. A study of human Orc1 revealed that the BAH domain of this subunit promotes association of ORC with chromatin [47]. Human and Drosophila investigations have pointed to transcription factors, including c-Myc, E2F, and the Myb complex, as likely ORC-targeting factors [48-51], whereas a ribosomal RNA fragment that associates with Tetrahymena ORC has been found to direct the complex to complementary rDNA sequence in the genome of this organism [52]. Furthermore, whereas Orc6 is dispensable for origin binding in S. cerevisiae [44], it is absolutely required for this function in Drosophila [28, 53].
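The observation that stretches of DNA with the same proportion of AT can differ in other respects can be made concrete with a short Python sketch. The two sequences and both function names below are hypothetical and chosen purely for illustration; this is not the method used in the cited binding studies, and the longest uninterrupted AT run is only one of many features, beyond overall composition, that can distinguish two sequences.

```python
def at_fraction(seq):
    """Fraction of bases in `seq` that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


def longest_at_run(seq):
    """Length of the longest uninterrupted run of A/T bases in `seq`."""
    longest = current = 0
    for base in seq.upper():
        current = current + 1 if base in "AT" else 0
        longest = max(longest, current)
    return longest


# Hypothetical 12-bp sequences with identical base composition (8 A/T, 4 G/C).
clustered = "AAAATTTTGCGC"  # A/T bases grouped into a single stretch
dispersed = "ATGCATGCATAT"  # A/T bases interrupted by G/C

for name, seq in (("clustered", clustered), ("dispersed", dispersed)):
    print(name, round(at_fraction(seq), 3), longest_at_run(seq))
```

Both sequences have the same AT fraction, yet the longest AT stretch is 8 bp in the first and 4 bp in the second, so a measure based on composition alone would not distinguish them.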

Rather than merely acting as a landing pad for pre-replicative complex (pre-RC) assembly, S. cerevisiae ORC appears to play an active role in loading additional pre-RC components. Following ORC-Cdc6 binding, Orc6 interacts with Cdt1 to promote Mcm2-7 association with origin DNA [25, 31]. The hydrolysis of Cdc6-bound ATP is then thought to load the initial Mcm2-7 complexes more tightly onto the DNA, and additional Mcm2-7 binding occurs following the hydrolysis of ORC-bound ATP [21]. Interestingly, even though Orc4 does not bind ATP itself, a predicted arginine finger in this subunit is required for Orc1 ATP hydrolysis [54, 55]. Once Mcm2-7 is loaded, the continued presence of Orc6, Cdc6, and most probably other pre-RC components is required to maintain the helicase complex at origins until the initiation of DNA replication [25, 31, 56].

Although it is not known whether the mechanism determined for the promotion of DNA replication by the ORC in budding yeast operates in precisely the same fashion in other organisms, the sequential association of the ORC, Cdc6, Cdt1, and Mcm2-7 at origins appears to be conserved in other eukaryotes, including S. pombe and Xenopus (reviewed in [23]). Furthermore, several reports have demonstrated interactions between archaeal ORC-Cdc6 and MCM proteins [57-59].

Frontiers

Now that roles for ORC proteins have been established at other points in the cell cycle than simply the G1/S boundary, it is of primary interest to determine the way in which the proper progression of cell-cycle stages might be coordinated by the complex as a whole or by its individual subunits. For example, studies of human Orc6 have shown that it associates with the kinetochore during the G2/prophase transition [60], and in both human and Drosophila cells it localizes to the cleavage furrow just before cytokinesis [26, 27]. Similarly, a mitotic function has been uncovered for Orc2 in promoting sister-chromatid cohesion in budding yeast after it is no longer required for DNA replication [61]. Thus, it is possible that a redistribution of ORC subunits after their role in DNA replication is complete helps to ensure the proper order of cell-cycle events.

Another avenue of ORC research that is presently yielding intriguing results is the elucidation of roles for these proteins in development [62]. Studies with Drosophila Orc3 have shown that it localizes to larval neuromuscular junctions, and that its mutation leads to impaired neuronal cell proliferation and to learning defects, as judged by a reduction in olfactory memory [63, 64]. Similarly, Orc2-5 have been detected at high levels in mouse brain, and knockdown of Orc3 and Orc5 by short interfering RNAs (siRNAs) impeded dendritic growth [65]. Furthermore, siRNA knockdown of Orc1 was recently shown to inhibit the proliferation of rat smooth muscle cells [66].

In recent years, numerous ORC-associated proteins have shown promise as biomarkers for early cancer detection (reviewed in [67]), and alterations in the expression levels of a number of them have been implicated as contributing to human lung carcinomas and mouse mammary adenocarcinomas [68-70]. The extent to which mutations in ORC subunits and/or perturbations of their normal levels may contribute to carcinogenesis is an important unresolved question. Some initial indications have been obtained through the observation that genomic instability, in the form of DNA re-replication, can occur as a result of mutations in combinations of pre-RC components, including Orc2 and Orc6, in budding yeast [71, 72]. Given the finding that ORC plays an active enzymatic role in loading Mcm2-7 onto DNA in S. cerevisiae, it will be very important to determine if the complex acts in the same way in higher eukaryotes, including humans. Interestingly, Drosophila Orc2 interacts with the tumor suppressor protein retinoblastoma 1 (Rb1) and siRNA-mediated reduction in Orc6 levels sensitizes human colon cancer cells to treatment with chemotherapeutic agents, pointing to possible links between ORC subunits and cancer development [73, 74].

Further investigation into both normal and dysregulated ORC function should yield important insights into the way cells coordinate the distinct stages required for their duplication, how they are organized into specific tissue types, and how carcinogenesis occurs.