Background

The timing of DNA replication during S-phase of the cell cycle plays an important role in genome integrity, the mutational spectrum, and a wide range of human diseases [1]. Despite many recent advances in our ability to measure the time of replication (Trep) across entire genomes [2–7], our understanding of what regulates this timing remains far from complete [1, 8–11]. The time at which origins of replication (ORIs) fire is thought to be determined in M-phase [12] or G1 [13, 14], at which point factors such as Cdc45 and Sld3 bind to ORIs that will fire early in the following S-phase [15, 16]. These and several other proteins critical for replication initiation are present at copy-numbers lower than the number of ORIs [17–19], and their over-expression advances Trep for many late-firing ORIs in both budding and fission yeast [12, 17–20], suggesting that their re-use may be a key step in regulating ORI firing time. However, what determines the relative affinities of different ORIs for these limiting factors - and hence their temporal order of initiation - is largely unknown [19].

Among the strongest correlates (and potential determinants) of Trep in metazoans are transcriptional activity and chromatin state. Although transcriptionally active euchromatin has been known to replicate earlier than repressive heterochromatin for over 50 years [11, 21], the reason - and even the direction of causation - has remained elusive. The two major models [8, 11], not mutually exclusive, are that 1) the euchromatic chromatin structure is more permissive both to transcription and to DNA replication initiation, or 2) Trep itself affects chromatin structure and transcription as a result of changes in the nuclear milieu during S-phase. The former is most directly supported by experiments altering ORI firing time via manipulation of histone modifications [8–10, 18, 22–24], whereas the latter is supported by differences in chromatin and transcription of DNA templates injected into cells during either early or late S-phase [8, 9, 25, 26].

Measuring Trep genome-wide in the budding yeast Saccharomyces cerevisiae (Sc), Raghuraman et al. [2] reported a surprising lack of association between transcription and Trep (with the exception of the eight histone genes, which are highly transcribed in S phase and are replicated early). However, this analysis only involved clusters of co-expressed genes, and did not actually compare the highest- versus lowest-expressed genes. Nevertheless, it has been widely interpreted in the literature as indicating the absence of any association, and many authors have speculated as to why budding yeast lacks this relationship [5, 8–11]. Similarly, the fission yeast Schizosaccharomyces pombe (Sp) is thought to lack any association between transcription and replication timing [8], though again no systematic comparison has been reported.

Results

Because DNA replication is confined to a specific period during the cell cycle, I reasoned that the relationship between Trep and transcription may depend on when in the cell cycle transcription is occurring. The transcription of most genes does not vary greatly throughout the cell cycle, so these genes cannot be used to determine phase-dependent effects. However, several hundred genes have been identified in both Sc and Sp for which transcription does vary consistently during the cell cycle [27, 28]. I compared the expression levels of these cell cycle-regulated genes measured in synchronized cells [27, 28] with the Trep for each gene to determine if any relationship exists. For both Sc and Sp expression levels measured in G2 phase, higher expression was associated with earlier Trep (Figure 1A). However, at other points in the cell cycle the relationship was quite different; most notably in M/G1 (Sc) or G1 (Sp), the relationship reversed, such that highly expressed genes were replicated late (Figure 1A).

Figure 1

The transcription/Trep association varies by cell-cycle stage. (A) Comparing mean Trep of the top decile (10%) of most-induced versus most-repressed cell cycle-regulated genes reveals that genes highly expressed in G2 replicate early in both Sc and Sp, whereas those highly expressed in M/G1 (Sc) or G1 (Sp) replicate late (error bars represent the standard error). (B) The correlation between Trep and expression levels of known cell cycle-regulated genes was calculated separately for expression levels from each time point of cell cycle-synchronized time courses [27, 28]. An oscillation of the correlation coefficient (Pearson’s r) was observed for both budding yeast (all |r| > 0.107 are significant at P < 0.0025) and fission yeast (all |r| > 0.177 are significant at P < 0.0025). The approximate cell-cycle phase of each time point is shown [27, 28]. Similar oscillations are observed for other methods of synchronization as well (Additional file 1: Figures S1 and S2). (C) A moving average of Trep is shown for all cell cycle-regulated genes, arranged in order of their time of maximal expression, beginning immediately following mitosis. A similar pattern is observed for both yeast species, with the latest Trep for genes with maximal transcript levels in G1, and the earliest Trep for genes with maximal transcript levels in G2.

To more systematically visualize these patterns, I calculated the correlation between the expression levels of all cell cycle-regulated genes measured in synchronized cultures [27, 28] with their Trep, separately for each expression data time-point (see Materials and methods). Plotting these correlation coefficients as a function of the time at which the expression data were sampled, I found a striking relationship: both the strength and direction of the correlation oscillate as a function of cell-cycle stage (Figure 1B). In these plots, positive r values represent time-points at which up-regulated genes tend to be replicated late in S phase; negative r values indicate times when up-regulated genes are replicated early. Consistent with the results in Figure 1A, in both species of yeast, genes highly expressed in G2 phase are replicated early, while those expressed in late M/G1 are replicated late. The oscillation is observed regardless of the method used to achieve cell-cycle synchronization (Additional file 1: Figures S1 and S2).
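The per-time-point correlation described above can be illustrated with a minimal sketch. This is not the code used for the analyses; the function names, the dictionary-based data layout, and the three-gene toy data set are hypothetical and serve only to show the shape of the calculation: one Pearson's r per expression time-point, computed across genes against a single Trep value per gene.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_by_timepoint(expression, trep):
    """Return one Pearson's r per expression time-point.

    expression: dict mapping gene -> list of relative mRNA levels,
                one value per time-point (assumed layout).
    trep:       dict mapping gene -> replication time in minutes.
    """
    genes = sorted(set(expression) & set(trep))
    n_points = len(expression[genes[0]])
    times = [trep[g] for g in genes]
    return [
        pearson_r([expression[g][i] for g in genes], times)
        for i in range(n_points)
    ]

# Hypothetical toy data: two time-points with opposite relationships.
toy_expression = {"g1": [1.0, -1.0], "g2": [0.0, 0.0], "g3": [-1.0, 1.0]}
toy_trep = {"g1": 20.0, "g2": 30.0, "g3": 40.0}
r_values = correlation_by_timepoint(toy_expression, toy_trep)
```

In the toy data the first time-point gives a negative r (up-regulated genes replicate early) and the second a positive r (up-regulated genes replicate late), matching the sign convention used in Figure 1B.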

To further characterize this relationship, I plotted a moving average of Trep for the cell cycle-regulated genes in each species, ordered by their time of maximal expression. If expression in certain cell-cycle phases correlates with early or late replication, this should be reflected by troughs or peaks in such a plot. Again in both species a similar trend emerged: Trep reaches a maximum for genes expressed in G1, and a minimum for those expressed in G2 (Figure 1C; Additional file 1: Figure S3), consistent with the correlation analysis (Figure 1B). The strong conservation of this pattern was surprising, considering how much the regulation of DNA replication has diverged in the hundreds of millions of years separating these two yeast lineages [29].
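A moving average of this kind can be sketched as follows. The sketch assumes a simple sliding window that does not wrap around the end of the cell cycle; whether the original analysis used a circular window is not stated, and the function name and toy values are hypothetical.

```python
def moving_average_trep(peak_times, trep, window):
    """Moving average of Trep over genes ordered by time of maximal expression.

    peak_times: dict mapping gene -> time (or rank) of maximal expression.
    trep:       dict mapping gene -> replication time in minutes.
    window:     number of consecutive genes averaged together.
    Assumption: a plain, non-circular sliding window.
    """
    ordered_genes = sorted(peak_times, key=peak_times.get)
    ordered_trep = [trep[g] for g in ordered_genes]
    return [
        sum(ordered_trep[i:i + window]) / window
        for i in range(len(ordered_trep) - window + 1)
    ]

# Hypothetical toy data: four genes, window of two.
toy_peaks = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.3}
toy_trep = {"a": 10.0, "b": 30.0, "c": 40.0, "d": 20.0}
smoothed = moving_average_trep(toy_peaks, toy_trep, window=2)
```

Peaks and troughs in the returned list correspond to cell-cycle phases whose maximally expressed genes replicate late or early, respectively.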

Although the strongest association between high mRNA levels and early replication was observed for G2-phase expression levels, it is important to note that this does not imply these genes are maximally transcribed in G2. Rather, one would expect maximal transcription to occur in the time leading up to the maximal transcript level, that is, in S phase. Indeed, plotting mRNA levels for G2-upregulated genes (those with early Trep in Figure 1C), it is clear that their transcript levels show the greatest increase - likely reflecting active transcription - in S phase (Additional file 1: Figure S4A). Likewise, genes with late Trep show the opposite pattern: maximal decrease in mRNA levels during S phase (Additional file 1: Figure S4B).

The oscillating relationships shown in Figure 1 do not establish whether Trep is more directly associated with transcription in S phase or in M phase. For example, if M-phase repression led to early Trep, S-phase induction could be associated with early Trep simply as an indirect consequence, because genes repressed in M phase are typically induced in S phase (Additional file 1: Figure S4A). To disentangle the effects of S and M phases, I examined the Trep of genes that are expressed at similar levels throughout the cell cycle. If M-phase repression leads to early Trep, then genes repressed throughout the cell cycle would be expected to have early Trep, as a result of their repression in M phase (in this scenario, their S-phase expression levels are not relevant). However, if the association is instead due to S-phase induction, genes with constitutive high expression would have earlier Trep because of their active transcription in S phase (in which case M-phase expression levels would be irrelevant). This analysis showed a clear trend: highly expressed genes replicate 5.9 minutes earlier in Sc and 3.0 minutes earlier in Sp (Figure 2). Therefore, the results shown in Figure 1 can be entirely, and most parsimoniously, explained by the association of Trep with S-phase transcription; the M-phase relationship is likely to be an indirect side effect of this. This result also suggests a more general association between transcription and Trep in yeast that extends beyond cell cycle-regulated genes.

Figure 2

Asynchronous gene expression associates with Trep in budding and fission yeast. Comparison of the 100 highest-expressed genes with the 100 lowest-expressed shows that highly expressed genes are replicated earlier in both budding yeast and fission yeast. Error bars represent the standard error.

To further investigate the connection between S-phase transcription and Trep, I tested whether the relationship differed for genes replicated during early versus late S phase. In this analysis I separated all cell cycle-regulated genes into 10 bins (that is, deciles) by their Trep, and plotted the median G2-phase transcript level (the time-point most closely reflecting S-phase transcription; Additional file 1: Figure S4A) for each. Across all 10 Trep bins, I observed the expected relationship: decreasing expression of bins with increasing Trep (Figure 3A). However, closer examination revealed that, for both yeast species, this pattern was almost entirely driven by late-replicating genes. In other words, there was no correlation between expression levels and Trep for genes in the first five bins (replicated in early S phase), while in late S phase the relationship was quite strong (Figure 3A). Consistent with this, applying the correlation analysis from Figure 1B to just early- or late-replicating genes revealed that the oscillation is entirely driven by replication in late S phase; genes with early Trep showed no oscillation, and only a weak correlation at nearly all time-points (Figure 3B). These results parallel the finding in mouse that genes replicated in the second half of S phase show the strongest association between transcript levels and Trep [7].
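The decile analysis can be sketched in a few lines. This is an illustrative reconstruction rather than the original code: the function name, the way genes are split into equal-sized bins, and the toy data are all assumptions.

```python
from statistics import median

def median_expression_by_trep_bin(trep, expression, n_bins=10):
    """Median expression level within equal-sized bins of genes ranked by Trep.

    trep:       dict mapping gene -> replication time in minutes.
    expression: dict mapping gene -> transcript level at one time-point
                (for example, the G2-phase time-point).
    Assumption: at least n_bins genes, so that no bin is empty.
    """
    genes = sorted(set(trep) & set(expression), key=lambda g: trep[g])
    n = len(genes)
    bins = [genes[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]
    return [median(expression[g] for g in b) for b in bins]

# Hypothetical toy data: 20 genes whose expression falls as Trep rises.
toy_trep = {f"g{i}": float(i) for i in range(20)}
toy_expression = {f"g{i}": float(-i) for i in range(20)}
bin_medians = median_expression_by_trep_bin(toy_trep, toy_expression)
```

Comparing the medians of the first five bins with those of the last five is one simple way to see whether the trend is carried mainly by late-replicating genes.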

Figure 3

Factors affecting the strength of the transcription/Trep association. (A) Median G2-phase transcript levels (representing S-phase transcription; Additional file 1: Figure S4A) are shown for all cell cycle-regulated genes separated into 10 equally sized bins (deciles) by their Trep. For both yeast species, no correlation is observed for the first five bins, whereas a strong relationship is present for later Trep. (B) Consistent with the decile analysis, no oscillation is observed in the correlation between expression level and Trep for early Trep genes, while a strong oscillation is observed for late Trep genes. (C) Only weak oscillation is observed in the correlation between expression level and Trep for ORI-distal genes (>5 kb from the nearest ORI in budding yeast, or 10 kb in fission yeast), while a strong oscillation is observed for ORI-proximal genes.

Another factor that may influence the relationship between S-phase transcription and replication timing is a gene’s distance from the nearest ORI. Under the model where chromatin affects both transcription and Trep, the strongest association would be expected for genes near ORIs, whereas if instead Trep affects a gene’s level of S-phase transcription, the relationship should be independent of distance to the nearest ORI [3]. Separating genes into two classes, ORI-proximal or ORI-distal, the ORI-proximal class showed far stronger oscillations (Figure 3C; ORI distance cutoffs, chosen to result in approximately equal-sized lists, were 5 kb from the nearest ORI in Sc and 10 kb in Sp, due to the higher density of known ORIs in Sc; results from equal distance cutoffs are shown in Additional file 1: Figure S5). Because ORI-proximal genes tend to be replicated earlier than ORI-distal genes, this result could not be an indirect effect of the stronger association for late Trep genes, as it acts in the opposite direction. This result suggests that the relationship is unlikely to be caused by an effect of Trep on S-phase transcription, which is one of the two major classes of models that have been proposed to explain the transcription/Trep association [8, 25, 26].

To test whether the relationship between S-phase transcription and replication timing is conserved outside of fungi, I applied the same correlation analysis to cell-cycle gene expression and Trep data from human HeLa cells [6, 30]. Analyzing all known HeLa cell-cycle regulated genes [30], I found no significant relationship of any kind (Figure 4A). However, applying each of the two filters identified from yeast - late Trep and ORI proximity (within 10 kb) - resulted in clear and significant oscillations, of a magnitude similar to that observed for both yeast species (Figure 4B,C). As observed for yeast, the minimum correlation (indicating early Trep of up-regulated genes) occurred in G2, and the maximum in late M/G1. The fact that the same oscillating relationship exists in human, and that its strength is influenced by the same two factors, suggests that it is likely to be caused by a mechanism conserved between fungi and metazoans.

Figure 4

Transcription and Trep in human. (A) No oscillation is observed when comparing the Trep versus expression levels of all cell cycle-regulated genes in HeLa cells (all |r| > 0.063 are significant at P < 0.05; the four time-points that exceed this are within the range expected by chance, given that 47 time-points were analyzed). (B) Significant oscillation is observed when comparing Trep versus expression levels of cell cycle-regulated genes with late Trep (red line; the final 50% of S phase; all |r| > 0.195 are significant), but not early Trep (blue line). (C) Significant oscillation is observed when comparing Trep versus expression levels of cell cycle-regulated genes within 10 kb of an ORI (blue line; all |r| > 0.197 are significant), but not further than 10 kb from an ORI (red line).
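The statement that four nominally significant time-points out of 47 are within the range expected by chance can be checked with a rough calculation. The sketch below assumes the 47 tests are independent, which is a simplification (adjacent time-points in a time course are likely correlated); the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial probability of observing at least k successes in n trials."""
    return 1.0 - sum(
        comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k)
    )

n_timepoints = 47   # time-points analyzed (Figure 4A)
alpha = 0.05        # per-test significance threshold

# Expected number of time-points passing the threshold under the null,
# and the chance of seeing four or more, assuming independent tests.
expected_false_positives = n_timepoints * alpha
p_four_or_more = prob_at_least(4, n_timepoints, alpha)
```

Under these assumptions about 2.35 time-points are expected to pass the threshold by chance alone, so observing four is not unusual.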

To put into perspective the strength of the relationship between Trep and cell cycle-regulated gene expression in human, I compared it to the well-established association between Trep and average (asynchronous) gene expression. The latter provides a useful benchmark because it is regarded as a strong relationship that has been observed in numerous studies across diverse metazoans [5, 6, 8, 9]. To facilitate a direct comparison with the results in Figure 4, I used the same Trep data [6] for the same genes, but replaced the cell cycle-synchronized gene expression data [30] with high-coverage RNA-seq data from asynchronous HeLa cells [31]. The correlation between asynchronous expression and Trep was r = -0.16 for late Trep genes (the genes represented by the red line in Figure 4B) and r = -0.15 for ORI-proximal genes (represented by the blue line in Figure 4C). In both cases, the asynchronous data explained less than a third of the variance in Trep that is explained by S-phase transcription (see Materials and methods). Differing quality of the two gene expression data sets [30, 31] could contribute to this difference; however, because RNA-seq is of far higher precision than spotted cDNA microarrays [32], any difference would likely underestimate the strength of the cell-cycle oscillations (Figure 4). These results suggest that the relationship between Trep and S-phase transcription in human is substantially stronger than the well-established association with asynchronous expression.

Discussion

These results suggest that 1) S-phase transcription is associated with DNA replication timing in budding yeast, fission yeast, and human; 2) the association is strongest for genomic regions near ORIs, excluding the causal model in which Trep affects transcription [8, 9, 25, 26]; 3) it is also strongest for regions replicated in late S phase, implying that early-firing ORIs are not affected by this relationship; and 4) this association explains at least three times more of the variability in Trep than the well-known association with (asynchronous) gene expression in human.

Although the replication of these patterns across three species (and across multiple data sets within species; Additional file 1: Figures S1 and S2) lends confidence to their robustness, several caveats should be considered. First, gene expression was represented by transcript abundances, which is a function of both transcription and mRNA decay; therefore, the correlations reported here may underestimate the relationship between transcription and Trep. This prediction can be tested once rates of transcription have been measured throughout the cell cycle. Second, data quality is critical in any analysis; poor-quality data can reduce, or entirely mask, a real relationship. However, in most analyses reported here this is not a major concern, because it could only make the current results conservative (one exception to this is the ORI-proximal versus distal analyses (Figures 3C and 4C): if Trep was measured more accurately near ORIs, this would lead to stronger ORI-proximal correlations, although additional analysis suggests this is not the case (see Materials and methods)). Third, correlation does not imply causation. Although the evidence does not support a model where Trep affects transcription (Figures 3C and 4C), I cannot determine whether transcription itself is affecting Trep, or whether unobserved (latent) factors may be involved. With this caveat in mind, I believe there is still sufficient evidence to propose a testable model to account for these data.

A plausible mechanism explaining these observations draws from the finding that the firing of ORIs in late S phase is governed by recruitment of limiting replication initiation factors [12, 17–20]. These factors are sequestered by early-firing ORIs from G1 until early S phase, and are reused at late-firing ORIs after their release from early-firing ORIs. I propose that the level of S-phase transcription near a late-firing ORI reflects local chromatin accessibility and/or subnuclear positioning, and in turn the ability of ORIs to recruit these limiting factors during S phase (Figure 5). This model accounts for the relationship of Trep with S-phase transcription (and the differing relationships in other phases); for the relationship being strongest near late-firing ORIs; and for the inferred direction of causality (that is, Trep not being causal).

Figure 5

A model to explain these observations. Components: ORC and MCM2-7 are protein complexes comprising the pre-replicative complex. Blue cylinders represent nucleosomes, with dark blue indicating closed/repressive chromatin and light blue indicating open/accessible chromatin. Red proteins are limiting replication initiation factors (such as Cdc45 and Sld3). Txn = transcription. Sequence of events: in G1 (not depicted), the limiting replication initiation factors (red circles) associate with the earliest-firing ORIs (top row). When S phase begins, these early ORIs fire and release the factors, which are then free to associate with other ORIs (though note that Cdc45 is a component of the replication fork, so can only be recycled after fork termination). The relative affinities of the remaining ORIs for these factors - and thus their relative firing times - are determined by the chromatin state near the ORI during S-phase. ORIs near genes highly transcribed in S phase (middle row) have an accessible chromatin structure and thus high affinity, so will tend to fire earlier than those with little nearby S-phase transcription and thus less accessible chromatin (bottom row). Although not shown here, subnuclear positioning could help determine ORI accessibility, either by influencing chromatin structure or through other mechanisms. Figure adapted from [19].

The proposed mechanism likely acts in concert with other factors determining Trep, and thus is not inconsistent with evidence supporting these other factors. For example, although the determination of early versus late-firing ORIs is completed during M/G1 [12–14], S-phase transcription may still influence firing time specifically at late-firing ORIs (Figure 5).

Conclusions

Future work integrating these results with other (non-mutually-exclusive) mechanisms affecting Trep - for example, Forkhead transcription factors [33] and subnuclear positioning [8, 29, 34, 35] - may lead to a unified framework for understanding the causes, and consequences, of the temporal program of DNA replication across eukaryotes.

Materials and methods

Data sources

Genome-wide Trep values were downloaded for all three species [2, 4, 6], and mapped onto genes by linear interpolation to the gene’s midpoint. Asynchronous yeast expression levels (used in Figure 2) were taken from [36, 37], using the poly-A data for Sc and the median of wild-type replicates for Sp. Asynchronous HeLa RNA-seq data were from the ENCODE project [31]. Identities of cell cycle-regulated genes, their expression levels, and the cell-cycle phase of each expression time-point were acquired from [27, 28, 30]. All cell-cycle expression data were measured as mRNA levels relative to asynchronous levels of each gene, as opposed to absolute mRNA abundances that can be measured by RNA-seq; therefore, these expression levels represent the relative induction or repression of each gene throughout the cell cycle. The order of maximum expression levels was obtained from [38] for Sp and [27] for Sc. ORI locations were downloaded from ORIdb [39] for both yeasts (using only 'confirmed' or 'likely' ORIs), and from [6] for human (see below).
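Mapping a Trep profile onto genes by linear interpolation to the gene midpoint can be sketched as below. The function name, argument layout, and the choice to clamp to the nearest probe value when a midpoint falls outside the profiled range are assumptions made for illustration; the text does not specify how such edge cases were handled.

```python
from bisect import bisect_left

def interpolate_trep(positions, values, gene_start, gene_end):
    """Linearly interpolate a Trep profile at a gene's midpoint.

    positions: sorted chromosomal coordinates of the Trep measurements.
    values:    Trep (minutes) measured at each coordinate.
    Assumption: midpoints outside the profiled range take the nearest
    end value rather than being extrapolated.
    """
    midpoint = (gene_start + gene_end) / 2
    if midpoint <= positions[0]:
        return values[0]
    if midpoint >= positions[-1]:
        return values[-1]
    j = bisect_left(positions, midpoint)   # first probe at or right of midpoint
    x0, x1 = positions[j - 1], positions[j]
    y0, y1 = values[j - 1], values[j]
    return y0 + (y1 - y0) * (midpoint - x0) / (x1 - x0)

# Hypothetical profile: three probes on one chromosome.
probe_positions = [0, 1000, 2000]
probe_trep = [10.0, 20.0, 40.0]
gene_trep = interpolate_trep(probe_positions, probe_trep, 1200, 1800)
```

Here the gene midpoint (1500) lies halfway between the probes at 1000 and 2000, so the interpolated Trep is halfway between 20 and 40 minutes.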

Yeast data analysis

All correlations were Pearson’s (significance cutoffs given in each figure legend). Trep moving averages (Figure 1C) were calculated for windows of 100 genes for Sc and 60 genes for Sp (due to the smaller number of cell cycle-regulated genes in Sp). For Figures 1A and 3A, the G2 expression data were represented by the 42 minute time-point for Sc and 135 minute time-point for Sp; for Figure 1A, Sc M/G1 was represented by the 70 minute time-point, and Sp G1 was represented by the 225 minute time-point. For Figure 3B, the early/late S-phase cutoff was chosen at halfway through S phase of each Trep data set (39.6 minutes after release from hydroxyurea arrest in Sp, and 26.8 minutes after release from cdc7 arrest in Sc). The cutoff for ORI-proximal versus ORI-distal (5 kb from each gene’s 5' end in Sc and 10 kb in Sp) was chosen in each yeast to result in gene lists of approximately equal size.

P-values in Figure 2 were calculated with a two-tailed Student’s t-test. Because the Sc expression levels were calculated as a ratio of mRNA/genomic DNA from asynchronous cells [37], they represent the number of mRNAs per DNA copy, and thus account for the fact that genes with early Trep spend a greater portion of the cell cycle with two copies. Although the Sp expression data [36] do not account for this, correcting for the effect by subtracting a fraction of each expression level proportional to the time each gene spends with two copies had only a minimal effect.

All code and data are available at [40].

Human data analysis

Human ORIs were defined as Orc1 binding sites [6] located within 1 Mb of early-replicating peaks in the HeLa Trep profile, which indicate active ORIs (this window size was necessitated by the low resolution of the Trep profile) [6]. The early/late Trep cutoff was the first 50% of S phase and the ORI-proximal/distal cutoff was 10 kb from each gene’s 5' end. Due to the higher number of expression data points per cell cycle in human (approximately 15 in human versus approximately 9 for both yeasts), a two-point moving average was used for plotting human correlation coefficients.

To compare asynchronous expression versus S-phase transcription in HeLa cells, I compared high-coverage RNA-seq data from HeLa cells [31] with Trep [6] for the same genes analyzed in Figure 4B,C. The fraction of variance in Trep explained by the expression data is simply the r² value from the Pearson’s correlation. Comparing these values for the asynchronous data with the strongest G2-phase (used to represent S-phase transcription, as described above) correlations, among the late-replicating genes (represented by the red line in Figure 4B) 2.7% of the variance in Trep was explained by the asynchronous data, versus 8.1% for S-phase transcription. Likewise for ORI-proximal genes (represented by the blue line in Figure 4C), the asynchronous data explained 2.3% of the variance in Trep, versus 7.6% for S-phase transcription.
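The arithmetic behind this comparison can be laid out explicitly. The sketch below uses only the values reported in the text; the variable and function names are hypothetical, and because the reported r values are rounded to two decimal places, squaring them reproduces the reported percentages only approximately.

```python
def variance_explained(r):
    """Fraction of variance explained by a Pearson correlation (r squared)."""
    return r ** 2

# Asynchronous-expression correlations reported in the Results (rounded).
async_r = {"late Trep": -0.16, "ORI-proximal": -0.15}
async_r2_percent = {k: 100 * variance_explained(r) for k, r in async_r.items()}

# Reported percentages of variance explained: (asynchronous, S-phase).
reported_percent = {"late Trep": (2.7, 8.1), "ORI-proximal": (2.3, 7.6)}

# Fold difference between S-phase transcription and asynchronous expression.
fold_difference = {
    k: s_phase / asynchronous
    for k, (asynchronous, s_phase) in reported_percent.items()
}
```

From the rounded values, S-phase transcription explains roughly three times as much of the variance in Trep as asynchronous expression for both gene sets (about 3.0-fold for late Trep genes and about 3.3-fold for ORI-proximal genes).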

To determine whether Trep is measured with greater accuracy near ORIs, I compared the Trep data used in Figure 4 [6] with an independent Trep data set from HeLa cells [41]. Restricting the analysis to the cell cycle-regulated genes analyzed in Figure 4C, I found that ORI-distal genes actually showed better agreement between Trep data sets than did ORI-proximal genes (r = 0.59 and 0.46, respectively). This implies that, if anything, Trep is measured less accurately in ORI-proximal regions, which would lead to an underestimate of the strength of the oscillating correlation (blue line in Figure 4C).