Abstract
Although nucleosome remodeling is essential to transcriptional regulation in eukaryotes, little is known about its genome-wide behavior. Since a number of nucleosome positioning maps in vivo have been recently determined, we examined if their comparisons might be used for obtaining a genome-wide profile of nucleosome remodeling. Using seven yeast maps, the local variability of nucleosomes, measured by the entropy, was significantly higher in a set of reported unstable nucleosomes. The binding sites of four transcription factors, known as the remodeling factors, were distinctively high both in entropy and linker ratio, whereas those of Yhp1, their potential inhibitor, showed the lowest values in both of them. Taken together, our map shows the general information of nucleosome dynamics reasonably well. The “nucleosome dynamics” map provides the new significant correlation with the degree of expression variety instead of their intensity. Furthermore, the associations with gene function and histone modification were also discussed here.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
The genome of eukaryotes takes the chromatin structure, which plays important roles in many cellular processes, such as transcription and replication. The unit of the chromatin structure is the nucleosome, which is composed of a histone octamer, and its positioning on DNA seems to be optimized for facilitating these cellular processes. For example, a number of researchers reported that the upstream region of transcription start sites (TSSs) of genes is statistically free of nucleosomes (Jiang and Pugh 2009a, b). This kind of studies are accelerated thanks to a series of recent releases of genome-wide nucleosome positioning maps in human (Ozsolak et al. 2007; Schones et al. 2008), fly (Mavrich et al. 2008a, b), nematode (Johnson et al. 2006; Valouev et al. 2008), Medaka fish (Sasaki et al. 2009), and budding yeast (Kaplan et al. 2009; Lee et al. 2007; Mavrich et al. 2008a, b; Whitehouse et al. 2007; Yuan et al. 2005).
The nucleosome positioning is not uniform temporally, either. This phenomenon is known as nucleosome remodeling, where ATP-dependent chromatin remodeling factors can change the nucleosome organization by inducing DNA superhelical torsion (Chandy et al. 2006; Havas et al. 2000). In yeast, several transcription factors, such as Abf1, Reb1, and Rap1, known as General Regulatory Factors (GRFs; Chasman et al. 1990), also perturb nucleosome positioning in the vicinity of their binding sites for activating neighboring regulatory sites (Fourel et al. 2002; Hartley and Madhani 2009). In the promoters of ribosomal protein, Rap1 is required for recruitment of Esa1 catalytic subunit of NuA4 H4 lysine acetyltransferase (Reid et al. 2000). Additionally, it is known that induction of chromatin remodeling factors is controlled by a variety of histone modifications. For example, a typical remodeling complex, Swi/Snf, requires histone acetylation by the Spt-Ada-Gcn5-acetyltransferase (SAGA) complex to bind to DNA and then displaces the acetylated histone (Chandy et al. 2006; Varga-Weisz 2001). Another remodeling complex, Mi-2, also contains histone deacetylases (Whitehouse et al. 1999). Therefore, it is of great interest to compare the changes between histone modification patterns and associated nucleosome dynamics.
To examine the chromatin remodeling globally, it is necessary to compare the change of nucleosome positions upon some stimuli. So far, while the positional changes of nucleosomes have been analyzed in several loci (Almer et al. 1986; Barbaric et al. 1992; Kent et al. 2001; Moreira and Holmberg 1998; Weiss and Simpson 1997), only a little is known about their genome-wide behavior. As a pioneering study, Shivaswamy et al. showed the depletion of nucleosomes in promoters induced by heat shock (Shivaswamy et al. 2008). However, their results are based on the comparison between only two conditions. Much more data are necessary to get the general genome-wide landscape. Jiang and Pugh developed a compiled reference map of nucleosome positions in Saccharomyces cerevisiae, using multiple maps independent of our study (Jiang and Pugh 2009a, b). However, detailed analyses of its “dynamic” positions have not been published yet.
In this study, we tested a simple hypothesis that the local positional variance of nucleosomes in a set of heterogeneous maps can be an indicator of local nucleosome remodeling. Seven yeast maps were used, and we show that our results are quite promising, that is, consistent with a number of previous observations, such as promoters, replication origins, and transcription factor binding sites (TFBSs) (Field et al. 2008; Hartley and Madhani 2009; Shivaswamy et al. 2008), and are useful for deriving novel but reasonable correlations between the change of nucleosome positioning and several features, such as gene function, expression variety, and histone modifications.
Results
Re-definition of nucleosome positions
We combined seven different maps from four publications, which are based on two different experimental techniques: tiling arrays and next-generation sequencers (Supplementary Table 1). Since it is desirable to define all nucleosome positions based on a unified method, we modified a hidden Markov Model (HMM) that was first introduced by Yuan et al. with the signal gradient instead of its intensity (Yuan et al. 2005; Supplementary Fig. 1 and Materials and methods). In each map, more than 55,000 locations of nucleosomes, which cover 70–85% of the entire genome, were re-assigned (Fig. 1a and Table 1). The total count of detected nucleosomes in this study was about 3,000–4,000 higher than the original values in (Mavrich et al. 2008a, b; Whitehouse et al. 2007) and was closer to that in Lee et al. (2007). However, for more than 80% of the originally reported positions, newly assigned nucleosomes were positioned within a 30-bp range of the inter-center distance (Supplementary Fig. 2). Thus, we regarded that our re-definition of nucleosome positions is reasonable.
Consistency with known stable/unstable nucleosomes
By comparing the relative positions of re-assigned nucleosomes with the above procedure, we can identify local regions where nucleosome positions are stable or unstable among the seven maps. As a measure of the positional variability of nucleosomes, we adopted the average of “entropy” value over a 100-bp window (see Materials and methods). This definition of entropy is based on Shannon's entropy, which is a standard measure for estimating the uncertainty of a given signal in Information Theory (Schneider 2010). While the entropy is high in the region where nucleosome positioning is variable between datasets (black rectangle in Fig. 1a), the entropy is low where nucleosomes are occupied or depleted in all datasets (white or gray rectangles in Fig. 1a).
To verify our “nucleosome dynamics” map, we collected experimentally verified examples of 21 stable and 13 unstable nucleosomes on five promoters from literature (Almer et al. 1986; Barbaric et al. 1992; Kent et al. 2001; Moreira and Holmberg 1998; Weiss and Simpson 1997) (Supplementary Table 2). The unstable nucleosomes show significantly higher entropy values than the stable ones (P = 1.6e-3 from Wilcoxon test; Fig. 1b).
It is also important to know how much the variability of nucleosome locations is explained by the difference of experimental techniques. Thus, the mutual correlation of assigned positions in highly dynamic regions between different maps was examined using a cluster analysis (Supplementary Fig. 3). The data produced by the same authors tend to be clustered together. To confirm if this intra-laboratory correlation does significant harm or not, we further checked the effect of removing one or two of the data from the same Kaplan et al.'s laboratory. The Pearson's correlation coefficient (PCC) was calculated between the entropy values of each case and the original one along the whole genome. In all cases, high correlation was observed [PCC ≥0.936 (P < 2.2e-16)], indicating that the differences of nucleosome locations were not always due to those of experimental techniques or authors. In the following sections, we will further validate the reliability of our map by comparing our observations with a variety of previous reports.
Nucleosomes around TSSs and replication origins
The averaged positions of nucleosomes around TSSs and replication origins have been studied in various organisms (Field et al. 2008; Johnson et al. 2006; Lee et al. 2007; Mavrich et al. 2008a,b; Ozsolak et al. 2007; Sasaki et al. 2009; Schones et al. 2008; Valouev et al. 2008; Whitehouse et al. 2007). Then, we took similar statistics from our map. In Saccharomyces Genome Database (SGD), the data of mRNAs/TSSs are classified into three groups depending on their reliability: verified, uncharacterized, and dubious ones (Supplementary Table 3). Since the lower entropy may indicate the stable open or closed nucleosome regions (gray or white rectangles in Fig. 1a, respectively), we added the linker ratio (the ratio of linker regions in a 100 bp window), which indicates the tendency of nucleosome depletion, to distinguish them in this analysis. In Fig. 2, the entropy values of these three groups as well as autonomous replication sequences (ARSs) are shown in black lines while the linker ratio values are shown as a heat map. Around the “verified” TSSs, a clear peak for both values was observed around the −100 bp region and a less significant negative peak just downstream of the TSS. This result can be interpreted that a highly unstable nucleosome exists around −100 bp from the TSS while another stable nucleosome is positioned just downstream of the TSS. Similar tendencies were also observed in “uncharacterized”, but not in “dubious”. Additionally, one unstable nucleosome was also observed at around +100 bp from ARSs. It is noteworthy that the less conspicuous the peaks around TSS become, the more questionable the data becomes. This makes sense and supports the plausibility of our data.
Nucleosomes around the binding sites of 100 transcription factors
Since nucleosome remodeling has been reported to occur around the binding sites of several transcription factors (Shivaswamy et al. 2008), we next examined the positional variability of nucleosomes around conserved binding sites of 100 transcription factors (Harbison et al. 2004; MacIsaac et al. 2006; Supplementary Fig. 4). Interestingly, the correlation between TFBSs and the entropy/linker ratio was quite variable with transcription factors (Fig. 3a; typical examples are shown in Fig. 3b). For example, Abf1 and Reb1 show the highest linker ratios, implicating that the chromatin structure in their binding sites is relatively open. On the other hand, Yhp1 shows the lowest in both entropy and linker ratio, suggesting that its binding sites exist in closed and stable chromatin. As discussed later, many of those extremely positioned transcription factors are known or the relationship with chromatin remodeling is proposed (Fourel et al. 2002; Hartley and Madhani 2009), which strongly suggests that our “nucleosome dynamics” map is reasonably reliable. Furthermore, the binding sites of 83 out of 100 transcription factors are located in regions with significantly higher entropy than expected at random (Bonferroni-adjusted P < 0.01; Supplementary Table 4). These results imply that most conserved binding sites are subject to chromatin remodeling.
Genome-wide distribution of “dynamic” regions
From the above results, we concluded that the entropy/linker ratio values of the combined map are reliable indicators of (at least) the averaged dynamic status of chromatin. Hereafter, we report some of the new results derived from our data.
The first one is the distribution of “dynamic” regions along the whole genome (Fig. 4a). The histogram of the entropy value is shown in Supplementary Fig. 5a. The local regions with higher entropy values, which are interpreted as highly dynamic chromatin regions, are distributed relatively uniformly in each chromosome. This distribution is slightly negatively correlated with the GC content (PCC = −0.243), but shows no clear correlation with the transcriptome map by Nagalakshmi et al. (PCC = −0.039; Nagalakshmi et al. 2008).
Difference between the inside and the outside of genes
Next, the difference of “nucleosome dynamics” between intragenic regions (i.e., exons plus introns) and intergenic regions was examined. The entropy in intergenic regions was significantly higher than in intragenic regions (P < 1.0e-15 from the Wilcoxon test; Fig. 4b). Within the intragenic regions, the entropy in introns was significantly higher than in exons (P < 1.0e-15), but was lower than in the intergenic regions (P < 1.0e-15). Like entropy, the linker ratio in exons was significantly lower than in introns and in intergenic regions (P < 1.0e-15 in both regions; Fig. 4c). These results suggest that nucleosomes are stably and densely located within exonic regions.
Correlation with gene expression
Since chromatin remodeling is involved in transcriptional regulation, it is of interest to study the correlation between our data and systematic gene expression data. First, we plotted 5,089 genes based on the entropy and the linker ratio, averaged over their upstream 400-bp region (Fig. 2). Roughly speaking, these two values correlate with each other, but there exists a group of exceptional genes that have larger linker ratios than expected. Second, we classified these 5,089 genes into four clusters by Ward's method and tried to characterize each of them (clusters 1 to 4 in Fig. 5a and Supplementary Fig. 6). Each cluster contained 1,142, 788, 1,289, and 1,870 genes, respectively (Supplementary Table 5). Since the entropy and the linker ratio in cluster 1 were the highest, this cluster was interpreted as having “dynamic and open” promoters. On the other hand, cluster 2 showed the lowest entropy and linker ratio, characterized as “stable and closed”. As noted above, cluster 3 is exceptional because it has lower entropy but higher linker ratio. This cluster of genes is expected to have “stable and open” promoters.
Next, using 173 DNA microarrays for which time-scale expression changes under 16 different stresses have been measured (Gasch et al. 2000), we calculated the standard deviation of log2(ratio) for each gene as a measure of expression variety and examined this value among the clusters (Fig. 5b). Cluster 1, the “dynamic and open” cluster, shows the largest expression variety (Bonferroni-adjusted P < 3.0e-3 in all comparisons from Wilcoxon test). On the other hand, cluster 3, the “stable and open” cluster, shows the smallest expression variety (Bonferroni-adjusted P < 1.5e-12 in all comparisons). For a confirmation, we did an opposite analysis: we classified the genes into 20 clusters based on the degree of their expression variety. As shown in Fig. 5c, clusters with larger variety tend to have higher entropy values while clusters with smaller variety show lower values, an observation which supports our claim that “dynamic” promoters often regulate genes with larger expression variety. We also tested the correlation with the gene expression intensity (interpreted as equivalent with the mRNA amount) by comparing the counts of cDNA tags mapped on the intragenic regions among the four clusters (Fig. 5d; Nagalakshmi et al. 2008). Although the expression level of cluster 1 was slightly higher than cluster 2 (Bonferroni-adjusted P = 8.1e-4), there were no clear differences. Thus, these results support the idea that chromatin remodeling is linked to the degree of expression change but not to its expression level.
Relationship with gene function
For each of the above-mentioned four clusters, we checked whether there are any over-represented Gene Ontology (GO) terms (Table 2). In cluster 1, terms such as “plasma membrane” and “cell wall” were over-represented. Cluster 2 showed the enrichment of terms such as “ribosome”, “structural molecule activity”, and “microtubule organizing center”. Cluster 3 was enriched with terms such as “DNA metabolic process”, “chromosome”, and “nucleus”. To confirm the stability of this GO analysis, we clustered the genes into three classes in a different way. [First, divide the data into two, using the average entropy value as a border; then, the genes with lower entropy values were further classified by the average linker ratio value (Supplementary Fig. 7)]. With these three clusters, we confirmed that the result of their GO analysis gave the same tendency (Supplementary Table 6). Although they are just a statistical tendency, these results may imply that the initial changes of gene expression in response to various outside signals tend to be regulated by chromatin remodeling (See Discussion).
Comparison between two histone modification maps and ours
To clarify the correlation between the variability of nucleosome positioning and the histone modification pattern, we used two datasets: ChIP-on-chip data (Pokholok et al. 2005) and tiling array data (Liu et al. 2005).
From the comparison with the ChIP–chip data, we found significant differences in entropy for five out of eight modifications (Fig. 6). In monomethylation of H3 lysine 4 (H3K4me1) and trimethylation of H3 lysine 36 (H3K36me3), the entropies in hyper-modified sites were significantly higher than those in hypo-modified sites (P < 1.3e-7 from Wilcoxon test). On the other hand, hyper-acetylation of H3 lysine 14 (H3K14ac) and H4 (H4ac) as well as trimethylation of H3 lysine 4 (H3K4me3) showed significantly lower entropies than their hypo-modifications (P < 2.6e-6). For the linker ratio, small differences were observed in H3K36me3, H3K4me3, and H3K14ac (P < 5.0e-3; Supplementary Fig. 8).
Similarly, in the comparison with the tiling array data, hyper-H3K4me1 showed significantly higher entropies while hyper-H3K4me3 and H3K14ac showed significantly lower entropies (P < 1.0e-04 from Wilcoxon test; Supplementary Fig. 9). Note that trimethylation of H3 lysine 79 (H3K79me3), H3K36me3, and H4ac were not included in this tiling array data. The linker ratio was not significantly different between hyper- and hypo-modifications in all of the five modifications (P > 0.01). Thus, at least the consensus of these two results (positive correlation of H3K4me1 and negative correlation of H3K4me3 and H3K14ac) seems to be reliable.
Histone modifications in promoters
To examine if there are any tight links between any of the four clusters of genes and the status of histone modification in each promoter is also interesting. Cluster 3 was clearly enriched with H3K4me3 in the upstream region (Fig. 7). The levels of H3K9ac and H3K14ac in stable chromatin clusters (clusters 2 and 3) were modestly higher, too. On the other hand, H3K4me1 and H3K4me2 did not show a large difference among the four clusters. These results suggest that H3K4me3 modification is required to maintain stably open chromatin in the upstream regions.
Discussion
In this study, we tested our hypothesis that a compendium of heterogeneous nucleosome maps should give us the local information of nucleosome instability, in other words, nucleosome dynamics. There are two kinds of raw data: the tag counts of next-generation sequencers and the hybridization signals in tiling arrays. They differ both in signal intensities and in resolutions. To make objective alignments of nucleosome positions, it is desirable to use a common method to assign these positions. Therefore, we modified the HMM method by Yuan et al. so that signal gradients, instead of intensity, are used as inputs (Yuan et al. 2005). Another modification was to add two self-loops to nucleosome states to make the model flexible enough to deal with noisy data.
The comparative analysis of the entropy value between known stable nucleosomes and known unstable nucleosomes showed the accuracy of our “nucleosome dynamics” map (Fig. 1b). Although there was a statistically significant difference between the two groups, the distinction was not perfect. It can be interpreted that our map may not be accurate enough to assess the situation of individual nucleosome accurately, but its statistical analysis is meaningful. Then, we analyzed the averaged distribution of the two indicator values (the entropy and the linker ratio) around TSS with three groups of mRNA data and ARSs (Fig. 2). There was a sharp peak of open and dynamic chromatin at around the −100 bp position, and this open region extends to around the −400 bp position from the TSS. A similar peak was also observed within ARSs. On the contrary, there was a weak tendency of dense and stable chromatin just downstream of TSS. Notably, these observations are consistent with our current knowledge of chromatin remodeling in promoters (Shivaswamy et al. 2008) and in replication origins (Field et al. 2008). The observed tendency becomes weaker as the reliability of the data becomes less.
Furthermore, our data indicate that most of the transcription factor binding sites are dynamic and open, more or less. Impressively, GRFs showed especially large entropy/linker ratio values (Fig. 3a). This is consistent with previous studies proposing that GRFs may have chromatin remodeling activity (Fourel et al. 2002; Hartley and Madhani 2009). Our observation that Abf1 and Reb1 show similar values, but are a little apart from Rap1, is also consistent with a previous report (Kaplan et al. 2009; Yarragudi et al. 2004). Mcm1, which is also located near GRFs in our plot, is implicated to be involved in chromatin remodeling (Chang et al. 2003). The fact that Mcm1 regulates the expression of a number of replication initiation factors may be related to our observation that there is a peak of nucleosome dynamics around ARSs (Fig. 2). On the contrary, Yhp1, which can bind to Mcm1 and the sites adjacent to Mcm1-binding sites, was observed in the stable and closed chromatin regions. Pramila et al. previously showed that Yhp1 protein represses cell-cycle-regulated genes with Mcm1 (Pramila et al. 2002). Thus, Yhp1 may restrict the activity of chromatin remodeling of Mcm1. Moreover, the two transcription activators, Ime1 and Fhl1, are also associated with the regulators of chromatin structure. The former factor activates meiosis-specific genes by recruiting with RSC chromatin remodeling complex (Inai et al. 2007); the latter is involved in the activation of ribosomal protein genes with Rap1 and is proposed to mediate the recruitment of Esa1 (Schawalder et al. 2004; Wade et al. 2004).
These results suggest that our map is reliable enough to predict the degree of chromatin remodeling around the binding sites of individual transcription factor. In addition, significant differences of the entropy and the linker ratio between inside and outside of genes (Fig. 4b, c) are consistent with the fact that TFBSs are enriched in intergenic regions, which contain promoters and enhancers. Recent reports suggesting that nucleosome positioning within exon regions may function as an exon–intron marker also support our results (Schwartz et al. 2009; Tilgner et al. 2009).
The correlation between nucleosome positioning and gene expression has often been reported. For example, Lee et al. showed that average nucleosome occupancy in promoters with high expression intensity is significantly lower than that with low intensity using a single nucleosome map (Lee et al. 2007). Our results give a little different interpretation from an additional dimension: based on the entropy value that is derived from the comparison of multiple maps, we further classified the “open” chromatin status, which is characterized by higher linker ratio values, into “dynamic” (higher entropy) and “stable” (lower entropy) statuses (Fig. 5a). As an example, we mapped the four clusters onto the promoter of genes involved in a typical stress-responsive pathway, cAMP-protein kinase A (Aguilera et al. 2007). There is a clear tendency that upstream genes have the “dynamic and open” promoters while downstream ones have “stable” promoters except a few exceptions, such as Msn4 (Supplementary Fig. 10). Additionally, while genes involved in nucleotide metabolism, promoters of which are enriched in cluster 3, are reported to show little changes in expression under various winemaking conditions too (Varela et al. 2005), promoters of upstream genes on another stress-responsible pathway, protein kinase C pathway, such as Rho1 and Mid2, are enriched in cluster 1. Taken together, we conclude that our clustering results seem to reflect the difference between genes that are responsive to various external stimuli and ones that are constantly expressed. Furthermore, the difference between the “open” and “closed” statuses was not significantly correlated with the expression intensity. It is an important observation that is only obtainable using multiple maps.
Although the interpretation of various histone modification data is complicated, the results obtained from two independent datasets (Liu et al. 2005; Pokholok et al. 2005) were basically consistent, further supporting our “nucleosome dynamics” map. Moreover, we observed that the contribution of each histone modification to nucleosome dynamics is different. H3K4me3, H3K14ac, and H4K9ac, which are known to be rich in the 5′ end of active genes (Barski et al. 2007; Heintzman et al. 2007; Li et al. 2007; Liu et al. 2005; Pokholok et al. 2005; Sung and Amasino 2006; Wang et al. 2008), were enriched in stable chromatin promoters in our study. Furthermore, H3K4me3 showed a clear difference in the upstream regions of genes in cluster 3 (Fig. 7). Recently, Vermeulen et al. showed that H3K4me3, but not the other two acetylations, is specifically associated with TFIID complex (Vermeulen et al. 2007). They also showed that H3K9ac and H3K14ac have a potential to enhance TFIID interaction with H3K4me3. Their associations with remodeling factors also have been presented in several studies (Kuo et al. 1998; Wysocka et al. 2006; Zhang et al. 1998). On the other hand, the H3K4me1-rich regions showed significantly higher entropy values (Fig. 6). Since this modification is reported to be associated with enhancer activity (Heintzman et al. 2007), it is possible that the modification modulates the enhancer activity with the change of nucleosome positioning.
Conclusion
From the above results, we conclude that the integration of various maps can show general features of nucleosome dynamics. Using our “nucleosome dynamics” map, a number of novel observations are made. We hope that some of them will be experimentally verified in the future. For example, the order of nucleosome dynamics was: “intergenic regions” > “introns” > “exons”. The degree of nucleosome instability correlates well with their degree of expression variety but not their intensity. The genes whose TSS region is highly dynamic and open tend to encode proteins that can sense extracellular conditions while the genes whose TSS regions are stably open tend to encode nuclear proteins. In addition to the GRFs, there are additional transcription factors whose binding sites exist at dynamically open regions: Leu3, Ime1, Rds1, and Fhl1, two of which have been suggested to be associated with chromatin remodeling.
It is evident that our methodology is quite intuitive, and there is much room for further improvement. Nevertheless, this methodology is promising in clarifying various aspects of epigenetic effects in cellular processes such as gene expression.
Materials and methods
Dataset
The in vivo nucleosome maps were obtained from four articles (Supplementary Table 1; Kaplan et al. 2009; Lee et al. 2007; Mavrich et al. 2008a, b; Whitehouse et al. 2007). The genome sequences, the gene coordinates, and the GO Slim data of S. cerevisiae were downloaded from the SGD (http://www.yeastgenome.org/). Yeast cDNA tags generated by random primers (GSE11209) were obtained from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/; Nagalakshmi et al. 2008). The microarray data of time-scale expression changes were obtained from Stanford Genomic Resources (http://www-genome.stanford.edu/yeast_stress/; Gasch et al. 2000). The in vivo binding sites of yeast transcription factors, with the criteria of p value cutoff 0.005 and no conservation, were downloaded from the website of Fraenkel's laboratory (http://fraenkel.mit.edu/; Harbison et al. 2004; MacIsaac et al. 2006). The ChIP-on-chip data for histone acetylations and methylations were obtained from the website of Young's laboratory (http://inside.wi.mit.edu/young/pub/download.html; Pokholok et al. 2005). The tiling array dataset of histone modifications was downloaded from the supporting information of (Liu et al. 2005). The data of stable and unstable nucleosomes was collected from five articles (Almer et al. 1986; Barbaric et al. 1992; Kent et al. 2001; Moreira and Holmberg 1998; Weiss and Simpson 1997). In each article, nucleosomes, which were depleted or largely moved in another condition, were defined as unstable nucleosomes (Supplementary Table 2).
Assignment of nucleosome locations
Since the downloaded nucleosome maps were heterogeneous in both their experimental techniques and the algorithms for nucleosome position assignment (Supplementary Table 1), we modified the HMM algorithm by Yuan et al. to be applicable to both the next-generation sequencer and the tiling array data with various resolutions (Yuan et al. 2005):
Since only 36-bp from 5′ end is read for each mononucleosomal DNA fragment by Illumina sequencer, we need to estimate the position of corresponding nucleosome centers from the data. Basically, we followed the way of Jiang and Pugh (2009a, b) and added 73 bp, which is the half length of the nucleosomal DNA, to each position of the 5′ end as the estimated center. In the GS20 sequencer data, covering more than 100 bp of the nucleosomal DNA, the midpoint of tags is used as the nucleosome center. For each nucleosome center i, the counts of corresponding tags c(i) were recorded. Then, at each discrete position I j (I j = 5 + 10j, j = 0, 1, 2, ...) in the genome, the nucleosome signal s(I j ) is calculated with the following formula:
where w(i, Ij) is the Gaussian distribution with mean Ij and standard deviation 20 as introduced in (Albert et al. 2007). The nucleosome signals were further normalized to the Z-scores. In the tiling array data, we used the log2(intensity) as the nucleosome signal.
Next, for each discrete position L (L = 10, 20, 30, ...), the minimum range that encompasses it is defined as follows:
Note that the above formulae are applicable to both of the data. It is possible that m = n−1 where the tiling array is sparse.
Then, we define the gradient (more accurately, the sine) g(L) as:
g(L) takes the values from −1 to +1. The value “10” is a scaling coefficient. In the linker region, the gradient signals are almost zero, whereas the signals are positive/negative on the 5′/3′ side of the nucleosome region, respectively (Supplementary Fig. 1a).
Our HMM contains three types of hidden states, each of which outputs the gradient signal value: one linker node (L), seven nucleosome nodes on the 5′ side (5N1-5N7), and seven nucleosome nodes on the 3′ side (3N1-3N7; Supplementary Fig. 1b). In addition to one self-connecting loop on the L node for allowing various lengths of linker regions, two self-loops were added to the 5N7 and 3N7 nodes for the detection of wider peaks. The emission probability function in each node is represented by a Gaussian distribution N(μ,ρ), where μ and ρ are parameters that take the same value in all nodes for each state (Supplementary Fig. 1c). All model parameters were estimated from a sliding window of 100 consecutive positions by Baum-Welch algorithm using the gradient signal in chromosome 3 as input (therefore, the parameter values are different for each original map). The averaged parameters from all windows were used for the estimation of hidden states by Viterbi's algorithm.
Since several nucleosome signals were obtained from short DNA sequences, we removed abnormal regions, where the length of the linker region is more than 300 bp (Supplementary Fig. 11).
For control studies, the nucleosome positions were shuffled randomly. This shuffling was iterated ten times. The distance between each nucleosome reported in the original article and the nearest nucleosome detected by our method was used for evaluation.
Similarity of nucleosome maps in highly dynamic regions
In the highly dynamic regions, where the entropy is more than 0.8, the output of Viterbi algorithm was converted to binary code (1 = nucleosome state and 0 = linker state). Phi coefficient was calculated with each pair of nucleosome maps.
Two indicators for representing chromatin status
To evaluate the level of nucleosome positional variety, we introduced average Shannon entropy, calculated as follows:
where P N,i and P L,i (=1−P L,i ) are the ratio of nucleosome states and linker states at relative position i, respectively (Schneider 2010). The entropy value goes toward zero if either state becomes dominant within the window (P N,i → 1 or P L,i → 1) while it takes the maximum value if the two states are equally observed (P N,i = P L,i = 0.5). The entropy is calculated through the whole genome using moving windows. In this study, we used 100 bp as the window size W, with a sliding interval of 10 bp.
Additionally, we used the linker ratio, which is defined as the count of linker states within the window divided by the window size (=100 bp) and the number of the nucleosome maps (=7).
Histone modification data analysis
In Pokholok et al.'s data, which covers the whole genome, we used the average of the entropy and the linker ratio within the probe center ±250 bp for characterizing the chromatin state. In Liu et al.'s tiling array data, which covers the entire chromosome 3 and parts of the other chromosomes at 20-bp resolution, we used the values at the corresponding sites as they were. For the analysis of hyper- and hypo-modified sites, the 250 highest and the 250 lowest probes were used.
Transcription factor binding sites
Among the data of 119 transcription factors downloaded from Fraenkel's website, we removed those of 19 factors because their binding sites were not observed in the “normal” regions more than five times. The entropy and the linker ratio at each position within ±50 bp from the binding sites were averaged over all binding sites for each transcription factor. As a control, we randomly picked up 100 sites and calculated the values in the same manner. This sampling was repeated 100 times. The significance of the distribution against the control for each transcription factor was calculated based on the two-dimensional Gaussian distribution.
Mapping of cDNA tags
About 15 million cDNA tags were mapped onto the yeast genome using the Maq software with the option “-n 3 -e 100” (http://maq.sourceforge.net/). Failed tags were further mapped by BLAT with the option “-trimHardA-minIdentity=85 -mask=lower” (Kent 2002). Then, the transcriptome map was constructed by counting the mapped tags within a sliding window of 100 bp with a 10-bp interval. Additionally, the count of tags mapped on the intragenic region of each gene was used as its expression intensity.
Gene clustering and gene ontology analysis
Using the average of the entropy and the linker ratio on the upstream 400 bp region from the TSS of “verified mRNAs” and “uncharacterized mRNAs”, we categorized these mRNAs (genes) into four clusters using Ward's method. In this clustering, we ignored 683 genes, whose promoters were overlapped with the above-mentioned abnormal regions. In each cluster, p values for 84 GO terms were calculated by the hyper-geometric test and were adjusted using Benjamini-Hochberg's method (false discovery rate <0.05).
References
Aguilera J, Randez-Gil F, Prieto JA (2007) Cold response in Saccharomyces cerevisiae: new functions for old mechanisms. FEMS Microbiol Rev 31:327–341
Albert I, Mavrich TN, Tomsho LP, Qi J, Zanton SJ, Schuster SC, Pugh BF (2007) Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature 446:572–576
Almer A, Rudolph H, Hinnen A, Horz W (1986) Removal of positioned nucleosomes from the yeast PHO5 promoter upon PHO5 induction releases additional upstream activating DNA elements. EMBO J 5:2689–2696
Barbaric S, Fascher KD, Horz W (1992) Activation of the weakly regulated PHO8 promoter in S. cerevisiae: chromatin transition and binding sites for the positive regulatory protein PHO4. Nucleic Acids Res 20:1031–1038
Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K (2007) High-resolution profiling of histone methylations in the human genome. Cell 129:823–837
Chandy M, Gutierrez JL, Prochasson P, Workman JL (2006) SWI/SNF displaces SAGA-acetylated nucleosomes. Eukaryot Cell 5:1738–1747
Chang VK, Fitch MJ, Donato JJ, Christensen TW, Merchant AM, Tye BK (2003) Mcm1 binds replication origins. J Biol Chem 278:6093–6100
Chasman DI, Lue NF, Buchman AR, LaPointe JW, Lorch Y, Kornberg RD (1990) A yeast protein that influences the chromatin structure of UASG and functions as a powerful auxiliary gene activator. Genes Dev 4:503–514
Field Y, Kaplan N, Fondufe-Mittendorf Y, Moore IK, Sharon E, Lubling Y, Widom J, Segal E (2008) Distinct modes of regulation by chromatin encoded through nucleosome positioning signals. PLoS Comput Biol 4:e1000216
Fourel G, Miyake T, Defossez PA, Li R, Gilson E (2002) General regulatory factors (GRFs) as genome partitioners. J Biol Chem 277:41736–41743
Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11:4241–4257
Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, Yoo J, Jennings EG, Zeitlinger J, Pokholok DK, Kellis M, Rolfe PA, Takusagawa KT, Lander ES, Gifford DK, Fraenkel E, Young RA (2004) Transcriptional regulatory code of a eukaryotic genome. Nature 431:99–104
Hartley PD, Madhani HD (2009) Mechanisms that specify promoter nucleosome location and identity. Cell 137:445–458
Havas K, Flaus A, Phelan M, Kingston R, Wade PA, Lilley DM, Owen-Hughes T (2000) Generation of superhelical torsion by ATP-dependent chromatin remodeling activities. Cell 103:1133–1142
Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39:311–318
Inai T, Yukawa M, Tsuchiya E (2007) Interplay between chromatin and trans-acting factors on the IME2 promoter upon induction of the gene at the onset of meiosis. Mol Cell Biol 27:1254–1263
Jiang C, Pugh BF (2009a) A compiled and systematic reference map of nucleosome positions across the Saccharomyces cerevisiae genome. Genome Biol 10:R109
Jiang C, Pugh BF (2009b) Nucleosome positioning and gene regulation: advances through genomics. Nat Rev Genet 10:161–172
Johnson SM, Tan FJ, McCullough HL, Riordan DP, Fire AZ (2006) Flexibility and constraint in the nucleosome core landscape of Caenorhabditis elegans chromatin. Genome Res 16:1505–1516
Kaplan N, Moore IK, Fondufe-Mittendorf Y, Gossett AJ, Tillo D, Field Y, LeProust EM, Hughes TR, Lieb JD, Widom J, Segal E (2009) The DNA-encoded nucleosome organization of a eukaryotic genome. Nature 458:362–366
Kent WJ (2002) BLAT—the BLAST-like alignment tool. Genome Res 12:656–664
Kent NA, Karabetsou N, Politis PK, Mellor J (2001) In vivo chromatin remodeling by yeast ISWI homologs Isw1p and Isw2p. Genes Dev 15:619–626
Kuo MH, Zhou J, Jambeck P, Churchill ME, Allis CD (1998) Histone acetyltransferase activity of yeast Gcn5p is required for the activation of target genes in vivo. Genes Dev 12:627–639
Lee W, Tillo D, Bray N, Morse RH, Davis RW, Hughes TR, Nislow C (2007) A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet 39:1235–1244
Li B, Carey M, Workman JL (2007) The role of chromatin during transcription. Cell 128:707–719
Liu CL, Kaplan T, Kim M, Buratowski S, Schreiber SL, Friedman N, Rando OJ (2005) Single-nucleosome mapping of histone modifications in S. cerevisiae. PLoS Biol 3:e328
MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E (2006) An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics 7:113
Mavrich TN, Ioshikhes IP, Venters BJ, Jiang C, Tomsho LP, Qi J, Schuster SC, Albert I, Pugh BF (2008a) A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast genome. Genome Res 18:1073–1083
Mavrich TN, Jiang C, Ioshikhes IP, Li X, Venters BJ, Zanton SJ, Tomsho LP, Qi J, Glaser RL, Schuster SC, Gilmour DS, Albert I, Pugh BF (2008b) Nucleosome organization in the Drosophila genome. Nature 453:358–362
Moreira JM, Holmberg S (1998) Nucleosome structure of the yeast CHA1 promoter: analysis of activation-dependent chromatin remodeling of an RNA-polymerase-II-transcribed gene in TBP and RNA pol II mutants defective in vivo in response to acidic activators. EMBO J 17:6028–6038
Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320:1344–1349
Ozsolak F, Song JS, Liu XS, Fisher DE (2007) High-throughput mapping of the chromatin structure of human promoters. Nat Biotechnol 25:244–248
Pokholok DK, Harbison CT, Levine S, Cole M, Hannett NM, Lee TI, Bell GW, Walker K, Rolfe PA, Herbolsheimer E, Zeitlinger J, Lewitter F, Gifford DK, Young RA (2005) Genome-wide map of nucleosome acetylation and methylation in yeast. Cell 122:517–527
Pramila T, Miles S, GuhaThakurta D, Jemiolo D, Breeden LL (2002) Conserved homeodomain proteins interact with MADS box protein Mcm1 to restrict ECB-dependent transcription to the M/G1 phase of the cell cycle. Genes Dev 16:3034–3045
Reid JL, Iyer VR, Brown PO, Struhl K (2000) Coordinate regulation of yeast ribosomal protein genes is associated with targeted recruitment of Esa1 histone acetylase. Mol Cell 6:1297–1307
Sasaki S, Mello CC, Shimada A, Nakatani Y, Hashimoto S, Ogawa M, Matsushima K, Gu SG, Kasahara M, Ahsan B, Sasaki A, Saito T, Suzuki Y, Sugano S, Kohara Y, Takeda H, Fire A, Morishita S (2009) Chromatin-associated periodicity in genetic variation downstream of transcriptional start sites. Science 323:401–404
Schawalder SB, Kabani M, Howald I, Choudhury U, Werner M, Shore D (2004) Growth-regulated recruitment of the essential yeast ribosomal protein gene activator Ifh1. Nature 432:1058–1061
Schneider TD (2010) Information Theory Primer. National Institutes of Health. Available via http://www-lmmb.ncifcrf.gov/∼toms/paper/primer/. Accessed 18 Jan 2010
Schones DE, Cui K, Cuddapah S, Roh TY, Barski A, Wang Z, Wei G, Zhao K (2008) Dynamic regulation of nucleosome positioning in the human genome. Cell 132:887–898
Schwartz S, Meshorer E, Ast G (2009) Chromatin organization marks exon–intron structure. Nat Struct Mol Biol 16:990–995
Shivaswamy S, Bhinge A, Zhao Y, Jones S, Hirst M, Iyer VR (2008) Dynamic remodeling of individual nucleosomes across a eukaryotic genome in response to transcriptional perturbation. PLoS Biol 6:e65
Sung S, Amasino RM (2006) Molecular genetic studies of the memory of winter. J Exp Bot 57:3369–3377
Tilgner H, Nikolaou C, Althammer S, Sammeth M, Beato M, Valcarcel J, Guigo R (2009) Nucleosome positioning as a determinant of exon recognition. Nat Struct Mol Biol 16:996–1001
Valouev A, Ichikawa J, Tonthat T, Stuart J, Ranade S, Peckham H, Zeng K, Malek JA, Costa G, McKernan K, Sidow A, Fire A, Johnson SM (2008) A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning. Genome Res 18:1051–1063
Varela C, Cardenas J, Melo F, Agosin E (2005) Quantitative analysis of wine yeast gene expression profiles under winemaking conditions. Yeast 22:369–383
Varga-Weisz P (2001) ATP-dependent chromatin remodeling factors: nucleosome shufflers with many missions. Oncogene 20:3076–3085
Vermeulen M, Mulder KW, Denissov S, Pijnappel WW, van Schaik FM, Varier RA, Baltissen MP, Stunnenberg HG, Mann M, Timmers HT (2007) Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell 131:58–69
Wade JT, Hall DB, Struhl K (2004) The transcription factor Ifh1 is a key regulator of yeast ribosomal protein genes. Nature 432:1054–1058
Wang Z, Zang C, Rosenfeld JA, Schones DE, Barski A, Cuddapah S, Cui K, Roh TY, Peng W, Zhang MQ, Zhao K (2008) Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet 40:897–903
Weiss K, Simpson RT (1997) Cell type-specific chromatin organization of the region that governs directionality of yeast mating type switching. EMBO J 16:4352–4360
Whitehouse I, Flaus A, Cairns BR, White MF, Workman JL, Owen-Hughes T (1999) Nucleosome mobilization catalysed by the yeast SWI/SNF complex. Nature 400:784–787
Whitehouse I, Rando OJ, Delrow J, Tsukiyama T (2007) Chromatin remodelling at promoters suppresses antisense transcription. Nature 450:1031–1035
Wysocka J, Swigut T, Xiao H, Milne TA, Kwon SY, Landry J, Kauer M, Tackett AJ, Chait BT, Badenhorst P, Wu C, Allis CD (2006) A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling. Nature 442:86–90
Yarragudi A, Miyake T, Li R, Morse RH (2004) Comparison of ABF1 and RAP1 in chromatin opening and transactivator potentiation in the budding yeast Saccharomyces cerevisiae. Mol Cell Biol 24:9152–9164
Yuan GC, Liu YJ, Dion MF, Slack MD, Wu LF, Altschuler SJ, Rando OJ (2005) Genome-scale identification of nucleosome positions in S. cerevisiae. Science 309:626–630
Zhang W, Bone JR, Edmondson DG, Turner BM, Roth SY (1998) Essential and redundant functions of histone acetylation revealed by mutation of target lysines and loss of the Gcn5p acetyltransferase. EMBO J 17:3155–3167
Acknowledgements
We are grateful to Riu Yamashita, Alexis Vandenbon, and all other members of the Nakai–Kinoshita Laboratory for their valuable advice and discussions. We also thank Takashi Ito for valuable comments. Computational time was provided by the Super Computer System, Human Genome Center, Institute of Medical Science, University of Tokyo. This work was supported in part by Global COE Program (Center of Education and Research for Advanced Genome-Based Medicine), MEXT, Japan.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by P. Hieter
Electronic supplementary material
Below is the link to the electronic supplementary material.
Fig. S1
Detection method of nucleosome locations. a The gradients of signals. b The diagram of HMM. c The emission probability of each hidden state. A linker state is represented in pink. The 5′-end and the 3′-end nucleosome states are represented in green and blue, respectively. (GIF 17 kb)
Fig. S2
Difference of our detected nucleosomes with ones reported in each original article. Red lines correspond to the relationship between the positions of nucleosomes re-defined by us, and blue lines show the relationship between the positions of nucleosomes randomly positioned and those originally reported (GIF 10 kb)
Fig. S3
Similarity of nucleosome maps used in this study (GIF 3 kb)
Fig. S4
Distribution of chromatin dynamics around the binding sites of 100 transcription factors. The solid and dashed lines represent the entropies and the linker ratios, respectively (GIF 74 kb)
Fig. S5
Histogram of a the entropies and b the linker ratios (GIF 4 kb)
Fig. S6
Gene clustering using the chromatin state in the promoter by Ward's method (GIF 5 kb)
Fig. S7
Gene clustering by averages of entropy and linker ratio (GIF 10 kb)
Fig. S8
Difference of the linker ratios between hyper- and hypo-modified sites with the dataset of Pokholok et al. The asterisk mark indicates significant difference (P < 0.01 from Wilcoxon test; GIF 8 kb)
Fig. S9
Differences of the entropies and the linker ratios between hyper- and hypo-modified sites with the dataset of Liu et al. The asterisk mark indicates significant difference (P < 0.01 from Wilcoxon test; GIF 9 kb)
Fig. S10
Mapping clusters of chromatin states onto cAMP-PKA pathway. Gray, red, green, and blue ovals indicate genes in cluster 1–4, respectively (GIF 21 kb)
Fig. S11
Histogram of linker DNA region width (GIF 7 kb)
Table S1
Summary of nucleosome maps used in this study (DOC 28 kb)
Table S2
Collection of stable and unstable nucleosomes reported from five literatures (DOC 30 kb)
Table S3
Category of genomic features used in this study (DOC 28 kb)
Table S4
Statistical significance of chromatin dynamics in the binding site of each transcription factor (DOC 34 kb)
Table S5
Genes in each cluster (DOC 1599 kb)
Table 6
Over-represented GO terms in three clusters by another clustering method (DOC 44 kb)
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Tanaka, Y., Yoshimura, I. & Nakai, K. Positional variations among heterogeneous nucleosome maps give dynamical information on chromatin. Chromosoma 119, 391–404 (2010). https://doi.org/10.1007/s00412-010-0264-y
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00412-010-0264-y