Introduction

Replication of the genome occurs during S phase of the cell cycle in a highly regulated process that ensures the reliability of DNA duplication. Each genomic region is replicated at a distinct time during S phase through the activation of an origin of replication (Hand 1978). The time each region is replicated is a function of its distance from an active origin and the time the origin was activated. Adjacent origins are usually activated at the same time, giving rise to large chromosomal regions that are replicated at approximately the same time, called replication time zones. The time of replication (herein called ToR) of a region seems to reflect higher-order genomic organization, since it correlates well with basic chromosomal features such as the regional GC content, Giemsa banding, and gene density. Moreover, detailed analysis of the ToR of individual genes has revealed a striking correlation between transcription and early replication (Braunstein et al. 1982; Gilbert 1986; Goldman et al. 1984; Hansen et al. 1993; Schmidt and Migeon 1990). Expressed genes, such as constitutively transcribed housekeeping genes, replicate at early stages of S phase, whereas repressed tissue-specific genes may replicate in most tissues at late stages and become early replicating only in the expressing tissue (Holmquist 1987; Selig et al. 1992).
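
The dependence of ToR on origin distance and origin firing time can be illustrated with a minimal sketch. It assumes a constant fork speed in both directions (the 2 kb/min default is only illustrative) and uses hypothetical origin positions and firing times; the function name is ours and does not come from any of the cited studies.

```python
def time_of_replication(position_kb, origins, fork_rate_kb_per_min=2.0):
    """Earliest time (minutes into S phase) at which a fork reaches position_kb.

    origins: list of (origin_position_kb, firing_time_min) tuples (hypothetical).
    Assumes every fork travels at the same constant rate in both directions.
    """
    return min(
        firing_time + abs(position_kb - origin_pos) / fork_rate_kb_per_min
        for origin_pos, firing_time in origins
    )


# Hypothetical example: an early origin at 0 kb and a late origin at 400 kb.
origins = [(0.0, 30.0), (400.0, 240.0)]
for pos in (0, 100, 300, 400):
    print(pos, time_of_replication(pos, origins))
```

In this toy example the early fork reaches the late origin (at 230 min) before that origin would have fired (at 240 min), so the whole interval is replicated by a single fork.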

Recently, the development of various formats and applications of microarray technology enabled the systematic mapping of genome-wide replication timing and the investigation of its connection to transcription and to developmental changes. The new approach has broadened the scope of previous anecdotal findings and yielded new insights regarding the organization of DNA replication and its consequences. In the current review, we concentrate on genomic studies performed in mammals and the insights obtained from them.

Methods

The common method for measuring ToR involves isolation of 5-bromo-2′-deoxyuridine (BrdU) labeled DNA from cells at different stages of S phase (Fig. 1a). BrdU is incorporated into the DNA during the replication process, and therefore, the addition of BrdU at a specific time during S phase will specifically label the regions that are undergoing replication at that time. This method was used extensively for measuring replication timing of individual genes (Hansen et al. 1993; Azuara et al. 2003) and has recently been applied for the genome-wide determination of ToR in yeast, Drosophila, mouse, and human genomes (Jeon et al. 2005; MacAlpine et al. 2004; Schubeler et al. 2002; White et al. 2004; Farkash-Amar et al. 2008; Hiratani et al. 2008; Karnani et al. 2007; Schwaiger et al. 2009; Watanabe et al. 2002; see also Table 1). In order to successfully label cells at only a specific time slot in S, the cell culture needs to be tightly synchronized. This can be done by either whole culture synchronization or isolation of cells at a certain stage of the cell cycle [usually by fluorescence-activated cell sorting (FACS)]. Both methods have drawbacks. On one hand, whole culture synchronization methods frequently perturb the cell cycle (Cooper and Shedden 2003) and thus may change the replication program. On the other hand, FACS-based isolation procedures produce limited amounts of cells with only two to four non-overlapping fractions along the S phase, thus limiting the temporal resolution of the results. Recently, a new synchronization method in which cells are grown on a special device (baby-machine) and efficiently collected at the beginning of the cell cycle (Helmstetter et al. 2003; Thornton et al. 2002) was applied to this field. Use of this method achieved high temporal resolution with no perturbation of the cell cycle (Farkash-Amar et al. 2008).

Fig. 1

Two major technologies for measuring genomic replication patterns. a Isolation of newly replicated DNA at various time points during S phase. The newly replicated DNA is labeled by BrdU and isolated by immunoprecipitation or density fractionation. b ToR is measured by the changes in DNA content. In an unsynchronized culture, the DNA content of a region is proportional to its ToR—early replicating regions are present in two copies in most S phase cells, whereas late-replicating regions are present in only one copy in most cells. DNA is harvested from S phase cells and from G1 phase cells and the DNA content of each region is compared

Table 1 Genome-wide ToR determination for various organisms

A second method for measuring ToR is based on the increase in DNA content of a region upon replication and on the fact that, in an unsynchronized culture, the fraction of cells in which a locus has been replicated is a function of its ToR (Fig. 1b). This principle can be used to measure the ToR of individual loci by performing fluorescence in situ hybridization (FISH) and counting the number of alleles in an asynchronous culture. Finding a large proportion of cells containing doublets (replicated locus) reflects early replication of the region, whereas detecting a large proportion of singlets reflects late replication (Selig et al. 1992). The same principle can be used for genome-wide mapping of the ToR (Woodfine et al. 2005; Woodfine et al. 2004). Here, the difference in the DNA content between G1 cells and S phase cells is measured for every locus using DNA microarrays (Fig. 1b).
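
The copy-number principle can be made concrete with a small sketch. It rests on a simplifying assumption that is ours rather than the cited studies': S phase cells are spread uniformly across S phase, so a locus replicating at fraction t of S (0 = start, 1 = end) has already been duplicated in a fraction 1 − t of the S phase cells. Under that assumption the expected S/G1 signal ratio is 2 − t, and the fraction of FISH doublets is 1 − t. The function names are hypothetical.

```python
def expected_s_to_g1_ratio(tor_fraction):
    """Expected S/G1 DNA-content ratio for a locus replicating at tor_fraction of S.

    Assumes S-phase cells are uniformly distributed across S phase.
    """
    replicated_fraction = 1.0 - tor_fraction  # cells in which the locus is doubled
    return 1.0 + replicated_fraction


def tor_from_ratio(ratio):
    """Invert the model above; clamp to [0, 1] to absorb measurement noise."""
    return min(1.0, max(0.0, 2.0 - ratio))


# Early, middle, and late replicating loci (illustrative values only).
for t in (0.0, 0.25, 0.5, 1.0):
    r = expected_s_to_g1_ratio(t)
    print(f"ToR={t:.2f}  S/G1 ratio={r:.2f}  recovered ToR={tor_from_ratio(r):.2f}")
```

Real arrays measure this ratio with noise and a compressed dynamic range, so the inversion should be read as an illustration of the principle rather than as a calibration procedure.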

Genomic studies using both experimental approaches resulted in similar findings (Farkash-Amar et al. 2008; Hiratani et al. 2008 and data not shown). Nevertheless, there is a profound difference between the two approaches regarding regions in which there is a difference in the ToR between two allelic variants (asynchronous replication). Such differences can only be revealed by the first method, since the second method, which measures the DNA content of the regions, will result in the average ToR of the two alleles rather than the actual difference between them. Even with the BrdU-based method, asynchronous replication can be revealed only by the collection of four or more fractions along the S phase (Farkash-Amar et al. 2008; Jeon et al. 2005; Karnani et al. 2007), since otherwise, it is impossible to discriminate between the ToR of the two alleles.
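
The averaging effect can be sketched with a toy model. It assumes, purely for illustration, that each allele replicates at a single sharp time (expressed as a fraction of S phase) and that S phase is split into four equal BrdU fractions; the locus values and function names are hypothetical.

```python
def replicated_fraction_profile(allele_tors, n_fractions):
    """Share of a locus's newly replicated DNA found in each S-phase fraction.

    allele_tors: ToR of each allele as a fraction of S phase (0 to 1).
    """
    counts = [0] * n_fractions
    for tor in allele_tors:
        index = min(int(tor * n_fractions), n_fractions - 1)
        counts[index] += 1
    return [c / len(allele_tors) for c in counts]


def copy_number_tor(allele_tors):
    """ToR reported by a DNA-content measurement: the mean over the alleles."""
    return sum(allele_tors) / len(allele_tors)


asynchronous_locus = (0.1, 0.9)  # one allele early, one late
synchronous_locus = (0.5, 0.5)   # both alleles in middle S

for name, locus in (("asynchronous", asynchronous_locus),
                    ("synchronous", synchronous_locus)):
    print(name, copy_number_tor(locus), replicated_fraction_profile(locus, 4))
```

Both loci yield the same copy-number ToR (middle S), whereas the four-fraction BrdU profile separates them: the asynchronous locus is split between the first and last fractions, and the synchronous locus appears in a single middle fraction.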

Genomic studies

Using the above methods, the genome-wide ToR was determined for multiple organisms, including budding and fission yeast, Drosophila, mouse, and human (see Table 1). Surprisingly, in yeast, the ToR was not correlated with transcriptional activity or DNA GC content (Raghuraman et al. 2001; Yabuki et al. 2002), while such correlations do exist in higher eukaryotes including Drosophila (MacAlpine et al. 2004; Schubeler et al. 2002; Schwaiger et al. 2009) and mammals (Woodfine et al. 2004; Farkash-Amar et al. 2008; Hiratani et al. 2008; Jeon et al. 2005; White et al. 2004). The availability of genomic ToR maps has advanced our understanding of many aspects of the replication program in higher eukaryotes. Here, we will limit our discussion to the contribution of the genome-wide approach to our understanding of the replication program in mammals.

Replication structure

More than 30 years ago, with the recognition that cells labeled with BrdU give rise to chromosomes with distinct banding patterns, it became clear that DNA replication occurs in an ordered manner, with large chromosomal domains replicating at the same time (Latt 1975). Since the size of a single replicon (a region replicated by replication forks originating from a single origin) is usually small (in the range of 100–200 kb) and the replication zones are huge (can be seen with cytogenetics resolution) (Holmquist 1987), the existence of such large replication zones (regions with a similar ToR) suggests that, in each such zone, multiple origins of replication are coordinately activated. New genome-wide ToR data confirm these early observations and offer a much more detailed picture of the genomic organization of DNA replication.

Two recent studies (Farkash-Amar et al. 2008; Hiratani et al. 2008) have mapped the ToR of the entire mouse genome. Analysis of those maps has revealed that the size of the replication zones is on the order of 1 Mb [in embryonic stem (ES) cells and lymphoblasts, the median size is approximately 850 kb and in neuronal progenitor cells, 1.4 Mb]. For the first time, the question of how the transition between early and late replication zones occurs can be addressed. A priori, there are three possibilities: (1) Replication fork barriers similar to those found near the yeast rDNA genes (Brewer and Fangman 1988) can halt the early replication fork for a few hours until a late replication fork reaches this region and then they will converge; (2) no such border elements exist, and therefore, the early fork will propagate until its convergence a few hours later with the late fork; (3) additional origins may exist in between the early and the late replication zones, which are activated at intermediate times (Fig. 2). Analysis of the ToR maps reveals that the first scenario, which is characterized by abrupt transition from early to late replication, is very rare. Actually, we found only six cases in the entire mouse genome in which the ToR map strongly supports this interpretation (unpublished results). On the other hand, almost all transitions are characterized by a gradual change in the ToR, suggesting that the replication of these regions occurs by the propagation of the early fork through the entire region. This interpretation is further supported by the fact that the replication fork rate deduced from those maps (1–3 kb/min) fits previous estimates (Edenberg and Huberman 1975; Housman and Huberman 1975; Yurov 1980) and by the demonstration of uniform fork direction in several such regions (Farkash-Amar et al. 2008). These large regions are quite abundant (cover approximately 10% of the mouse genome) and were recently termed originless regions (Hiratani et al. 2008).
The third scenario in which additional origins exist in the transition regions implies that there is high-order organization of the replication zones, and there is a gradual activation of origins in adjacent zones. Although there are not much data to support this idea, bioinformatic analysis of the ToR maps revealed that gradual activation of origins does exist, since frequently adjacent replication zones have similar ToR (Fig. 3 and unpublished results).
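
The distinction between the second and third scenarios can be sketched as a simple rate check: the fork rate implied by a transition region is its length divided by the ToR difference across it, and a region is compatible with a single fork only if that rate does not exceed plausible fork speeds. The 3 kb/min ceiling below is taken from the 1–3 kb/min range quoted above; the region lengths and times are hypothetical, and the function names are ours.

```python
MAX_FORK_RATE_KB_PER_MIN = 3.0  # upper end of the 1-3 kb/min range quoted in the text


def implied_fork_rate(length_kb, early_tor_min, late_tor_min):
    """Fork rate (kb/min) needed for one fork to cross the transition region."""
    elapsed = late_tor_min - early_tor_min
    if elapsed <= 0:
        raise ValueError("late_tor_min must be greater than early_tor_min")
    return length_kb / elapsed


def single_fork_plausible(length_kb, early_tor_min, late_tor_min,
                          max_rate=MAX_FORK_RATE_KB_PER_MIN):
    """True if a single fork moving at or below max_rate could span the region."""
    return implied_fork_rate(length_kb, early_tor_min, late_tor_min) <= max_rate


# Hypothetical transition regions: (length in kb, early ToR, late ToR in minutes).
for region in ((600, 60, 360), (1500, 60, 360)):
    print(region, implied_fork_rate(*region), single_fork_plausible(*region))
```

In this sketch the 600 kb region implies 2 kb/min and is compatible with an originless region, whereas the 1500 kb region would require 5 kb/min, which points to additional origins firing at intermediate times.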

Fig. 2

Three possible models for the transition between replication zones. a At the margin of the early zone, the early replication fork pauses and waits for the convergence with the late replication fork. b The early replication fork continues to replicate the DNA in the transition zone until it converges with a late fork a few hours later. c Origins of replication are organized along the transition zones and fire in a gradual manner. Origins of replication are marked by circles, the arrows depict the direction of the replication fork and the stop signal designates a hypothetical element that causes the pausing of the replication fork. The colors depict the ToR of the region

Fig. 3

Genomic organization of replication. ToR map of a 12 Mb region on mouse chromosome 19 is shown (data from Farkash-Amar et al. 2008). Transition regions between replication zones are marked by dashed gray lines. The transition regions at positions 34 and 36 Mb represent probable originless regions since these regions can be replicated by a single replication fork with a replication rate of 2.6 and 1 kb/min, respectively. On the other hand, the other two transition regions (located at 30 and 38 Mb) are too long to be replicated by a single origin and therefore probably represent cases in which consecutive origins are activated gradually

The genomic organization of replication time zones connected by large originless regions is usually similar on both alleles, and thus, most of the genome replicates synchronously. Nevertheless, there are genomic loci in which the ToR of the two alleles is different. This type of asynchronous replication was first discovered on the female X chromosome, and later on, it was expanded to include many other cases of monoallelic expression such as parental imprinted genes, immunoglobulin regions, and olfactory receptors (Goldmit and Bergman 2004). As has been discussed above, this issue can be addressed on a genomic level only by the methods that isolate newly replicated DNA at multiple time points along S phase. Compilation of the available data has revealed that 10–20% of the human and the mouse genomes replicate asynchronously (Farkash-Amar et al. 2008; Karnani et al. 2007). These asynchronous regions not only include many of the monoallelic expressed genes but also many more genes, and the significance of their asynchronous replication awaits further studies.

Does replication occur continuously throughout S phase? Early works reported a decline in replication during middle S phase, which was interpreted as a cessation in DNA replication at middle S (Holmquist et al. 1982). Genomic data demonstrate clearly that replication does occur in middle S (Farkash-Amar et al. 2008; Karnani et al. 2007; Schwaiger et al. 2009; Woodfine et al. 2004). Analysis of the ToR maps reveals that there are origins that are actually activated during middle S although with reduced frequency (Farkash-Amar et al. 2008; Schwaiger et al. 2009). Moreover, analyses of the correlation between ToR and multiple genomic features (such as GC content and gene density; see below) indicate that the strong correlation observed for all these features goes beyond the early and late-replicating regions and extends to regions from all stages of S phase (Farkash-Amar et al. 2008). These results suggest that the common view of dividing the genome into early and late-replicating regions does not capture the full picture and that ToR is actually a continuous feature.

Genetic and epigenetic characteristics of the replication zones

ToR and the global structure of a genomic region were observed to be associated as a corollary of the finding that late-replicating regions tend to colocalize with dark Giemsa bands (G bands), whereas early regions are associated with R bands (Drouin et al. 1994; Latt 1975; Hand 1978). Similarly, it was noticed that the early replicating regions are rich in genes (Cohen et al. 1998). However, only the development of genome-wide assays for ToR allows for systematically studying the association between ToR and regional chromosomal properties. Indeed, recent studies revealed a striking correlation between ToR and multiple genetic features including (1) regional GC content, (2) gene density, and (3) Alu and LINE density (Jeon et al. 2005; Schubeler et al. 2002; White et al. 2004; Farkash-Amar et al. 2008; Hiratani et al. 2008; Woodfine et al. 2004). ToR is also associated with epigenetic characteristics of the domain, which is probably the underlying mechanism of the association with the Giemsa bands. Moreover, by fractionating the genome into open and closed chromatin, Gilbert et al. (2004) were able to demonstrate that early replication regions are associated with open chromatin. The differences between open and closed chromatin probably lie in the different chromatin modifications associated with each. Indeed, recent genomic studies (Hiratani et al. 2008; Thurman et al. 2007) found an association between early replication and open chromatin markers. The association between late replication and closed chromatin markers is still not clear. Immunostaining experiments found an association between histone H4 deacetylation (Belyaev et al. 1996) and histone H3 Lys9 methylation (Cowell et al. 2002) and late replication in various animal cells. Genomic analysis of the ENCODE regions found an association between H3K27Me3 and late replication in HeLa cells (Thurman et al. 2007).
These results are further supported by unpublished results from our lab that found a similar correlation in human blood cells. On the other hand, extensive analysis of the association of the ToR and chromatin markers in mouse ES cells did not find any correlation with closed chromatin markers (Hiratani et al. 2008). It is not yet clear whether these differences reflect genuine differences between ES and somatic cells. This important field requires further study.

Time of replication and transcription

Detailed analysis of the replication timing of individual genes has revealed a striking correlation between transcription and early ToR (Braunstein et al. 1982; Gilbert 1986; Goldman et al. 1984; Hansen et al. 1993; Schmidt and Migeon 1990). Expressed genes, such as constitutively transcribed housekeeping genes, replicate at early stages of the S phase, whereas repressed genes replicate at late stages. Moreover, replication timing was also shown to be subject to developmental regulation. Thus, many tissue-specific genes replicate late in most cell types but become early replicating in the expressing tissue (Holmquist 1987; Selig et al. 1992). The new genomic ToR maps provide a broader picture of the association between the ToR and transcription. Several such works on Drosophila (Schubeler et al. 2002; MacAlpine et al. 2004; Schwaiger et al. 2009), mouse (Farkash-Amar et al. 2008; Hiratani et al. 2008), and human (Jeon et al. 2005; White et al. 2004; Woodfine et al. 2004) have revealed a significant correlation between early replication and transcriptional activity. Nevertheless, the genomic view has revealed that there are also many cases of transcription from late-replicating regions and many cases of silenced genes that reside in early replicating regions. It is worth noting that many aspects of gene transcription are correlated to early replication, including transcript levels, promoter RNA polymerase II occupancy, transcription breadth (number of tissues in which the gene is expressed), and transcriptional potential. The latter refers to genes that can potentially be induced in a tissue, e.g., under stress conditions. These genes tend to replicate early even in conditions in which they are not transcribed (Farkash-Amar et al. 2008).

The three-way association between ToR, transcription, and chromatin structure can be explained by two opposing models. Closed chromatin structure may be the cause for late replication and gene repression, through a mechanism by which closed chromatin restricts the access of replication factors and thus delays the firing time of the origin. On the other hand, closed chromatin can be a consequence of late replication, since the reestablishment of the chromatin structure after its disruption by the replication fork may be different at early and late S (window of opportunity model). The latter model suggests that replication timing may serve as a memory mark for the chromatin state of a region during S phase when the chromatin must be reloaded (Gilbert 2002; Gondor and Ohlsson 2009; Goren et al. 2008). Although there is no definitive proof for either one of these models, several lines of evidence support the latter. First, a set of elegant experiments in which plasmids were injected into cells at early and late S phase demonstrated that chromatin that was established in early S supported transcription, whereas chromatin established in late S supported a repressed state (Lande-Diner et al. 2009; Zhang et al. 2002). Second, the ToR conservation between human and mouse goes beyond the conservation of GC content or transcription (Farkash-Amar et al. 2008 and data not shown), suggesting that the ToR is an important genomic feature that is regulated independently from transcription. Third, the finding that ToR is correlated with transcriptional potential rather than actual transcription suggests that ToR is not a mere consequence of gene expression. Fourth, the finding that many early replicating genes are not expressed and, on the other hand, many late replicating genes are expressed suggests that the association between transcription and ToR is not straightforward.
The simplest explanation of these findings is that replication timing is only one of several contributors (including for example transcription factor binding and DNA methylation) to gene expression, and therefore, the correlation with expression is imperfect. Finally, when restricting attention to the long transition originless regions, the correlation between ToR and transcription is still observed at a significant level (Farkash-Amar et al. 2008). Since these regions contain no origin, the chromatin structure in these regions should not affect their replication timing. Therefore, the fact that transcription is correlated with ToR in these regions as well suggests a role for replication timing in defining the chromatin state.

Dynamics

As mentioned before, the ToR of certain genomic regions is different in different tissues; however, the extent of this phenomenon is not clear. The new genomic data are ideal for addressing such questions. The first work that tried to address this question compared the ToR of the entire human chromosome 22 between fibroblast and lymphoblastoid cells (White et al. 2004). This work revealed that, overall, the ToR in these two tissues is similar with only 1% of this chromosome showing differences. Recently, the ToR of the entire mouse genome was compared between ES cells and neuronal progenitor cells (NPCs) (Hiratani et al. 2008). This study found that 20% of the genome showed substantial changes in replication timing upon neural differentiation of the ES cells. Most of the changes were either near domain boundaries (termed “boundary shift”) or caused by merging of two replication zones (termed “consolidation”). The size of the replication zones that changed their ToR was small and quite uniform (400–800 kb), which raises the hypothesis that the building blocks of the replication zones are those smaller units. Interestingly, the size of those units is similar to the amount of DNA replicated in single foci (Ma et al. 1998).

As mentioned above, ToR is strongly correlated to many static features of the genome (like GC content and gene density), and therefore, the dynamic nature of the ToR is somewhat surprising. Nevertheless, tissue-type differences highlight the fact that ToR is not a simple reflection of such static sequence features, and it must be regulated separately in a dynamic way. A good example of the possible inconsistencies between ToR and GC content was observed in ES cells, in which the correlation is much weaker than in somatic cells (Hiratani et al. 2008). Indeed, the regions that change their ToR during the differentiation to NPCs have an unusual sequence composition: late-to-early and early-to-late domains were found to be GC rich/gene poor and GC poor/gene rich, respectively.

Analysis of the association between ToR and transcription in the dynamic replication zones revealed that the general correlation between the two genomic features is preserved. Most of the genes that advanced their ToR (late to early) were upregulated, whereas genes that delayed their ToR (early to late) showed a weak tendency to be downregulated. However, many exceptions also exist. Interestingly, CpG island containing genes seem to be expressed even when located in late-replication zones (Hiratani et al. 2008), further supporting the idea that ToR is only one of several factors that promote transcription.

The profound differences between the replication zones in ES and somatic cells (smaller and less correlated to GC content) raised the hypothesis that the changes observed in the ToR between ES cells and NPCs may be characteristic to this differentiation process only. Preliminary results from our lab suggest that this is not the case, since we have observed changes, to a similar extent, in the ToR between two differentiated tissues. Further analysis of this type of data is needed in order to reach broader conclusions.

Genome stability

An interesting aspect of the genomic organization is its relationship with genome stability. A recent bioinformatics analysis of the ToR map of the human ENCODE regions revealed that late-replicating regions have higher mutation rates than early replicating regions. This conclusion is based on an increase in the divergence of the human genome from the chimpanzee genome and in single nucleotide polymorphism (SNP) frequency in late-replicating regions. This association cannot be explained by variations in other genomic features linked to ToR like GC content and gene density. Moreover, finding this association for all classes of mutations suggests the existence of a general mechanism involving replication time-dependent DNA damage (Stamatoyannopoulos et al. 2009).

Replication of the large originless regions is performed, under normal conditions, by a single replication fork. Therefore, it is plausible that these regions would be more susceptible to DNA damage. Indeed, Watanabe et al. (2002) found increased SNP density and higher frequency of disease-related genes in such transition regions on human chromosomes 11q and 21q. Nevertheless, analysis of the ToR maps of the entire mouse genome revealed no differences in SNP density and in DNA rearrangement frequency between originless regions and regular replication time zones (Farkash-Amar et al. 2008).

Conclusion and future directions

The recent introduction of genome-wide and high throughput techniques into the field of DNA replication has greatly improved our understanding of the organization and potential function of the replication program in mammals. Specifically, the new methods revealed the following: (1) The replication zones are organized genome-wide, with transitions between them occurring through large originless regions. (2) There are many exceptions to the general correlation between ToR, genetic, epigenetic, and transcriptional features. The more complex picture that emerged from the genomic studies highlights the fact that the association between these genomic features is subtle, and no single one is a simple predictor of the others. (3) ToR is likely to play a role in gene regulation and not vice versa. (4) ToR is a dynamic feature with approximately 20% of the genome showing changes in the replication program between tissues. (5) The ToR of a region affects the mutation frequency therein.

Even though the recent advances in the ToR field are impressive, much remains to be done. First, very little is known about the location of individual origins of replication. Recently, genomic approaches were used for mapping origins of replication in the ENCODE regions in human HeLa cells (Cadoret et al. 2008) and in a small portion (0.4%) of the mouse genome in mouse ES cells (Lucas et al. 2007). Obtaining reliable data about origin locations in larger regions of the genome and in multiple cell types is crucial for a better description of the genomic organization of replication. Second, almost nothing is known about the regulation of the ToR. It is quite clear that there is a cellular mechanism to dynamically control the ToR; however, nothing is known yet about how it is achieved in mammals. A better description of the origins of replication and of their time of activation is a prerequisite for elucidating such control mechanisms. Third, a high-resolution full-genome human ToR map does not yet exist. Such data are very important for better evaluation of the association between ToR, chromatin structure, and mutation rates. Finally, obtaining ToR data for human and other primates will allow comparative genomic analyses, which may reveal control elements participating in ToR regulation.

Genomic tools have tremendous descriptive power, and as such, they have significantly advanced the field of replication. However, the big challenge now is to gain mechanistic insights into the regulation of the replication program. In yeast, the new unbiased approaches have already contributed to such mechanistic questions (Alvino et al. 2007; Knott et al. 2009; McCune et al. 2008; Raveendranathan et al. 2006). Only the future will tell if similar advances will be gained in the studies of the mammalian replication program.