Abstract
With the exception of lamina-associated domains, the radial organization of chromatin in mammalian cells remains largely unexplored. Here we describe genomic loci positioning by sequencing (GPSeq), a genome-wide method for inferring distances to the nuclear lamina all along the nuclear radius. GPSeq relies on gradual restriction digestion of chromatin from the nuclear lamina toward the nucleus center, followed by sequencing of the generated cut sites. Using GPSeq, we mapped the radial organization of the human genome at 100-kb resolution, which revealed radial patterns of genomic and epigenomic features and gene expression, as well as A and B subcompartments. By combining radial information with chromosome contact frequencies measured by Hi-C, we substantially improved the accuracy of whole-genome structure modeling. Finally, we charted the radial topography of DNA double-strand breaks, germline variants and cancer mutations and found that they have distinctive radial arrangements in A and B subcompartments. We conclude that GPSeq can reveal fundamental aspects of genome architecture.
Data availability
Source data for Figures, Extended Data Figures, Supplementary Figures, Supplementary Tables and Supplementary Notes are available at https://github.com/ggirelli/GPSeq-source-data. The following GPSeq data have been deposited in the GEO repository under accession number GSE135882:
1. Raw and pre-processed GPSeq sequencing data
2. Bead coordinates for chromflock-generated whole-genome structures
3. Genome-wide GPSeq scores at chromosome-wide, 1-Mb and 100-kb resolution
4. GPSeq scores in genomic windows centered on the midpoint of the DNA FISH probes shown in Supplementary Fig. 1a, at 1-Mb and 100-kb resolution
Previously published data sets used in the analyses, for which accession numbers are available, are described in Supplementary Table 5. For SNPs, tumor SNVs and gene fusions, we used the following data sets:
1. Chronic lymphocytic leukemia, lung cancer, prostate cancer and melanoma SNVs were obtained from the supplementary tables of the corresponding papers described in ref. 48
2. SNPs from the 1000 Genomes Project Phase 3 were downloaded from https://www.internationalgenome.org/
3. TCGA gene fusions were downloaded from https://www.tumorfusions.org/
Code availability
The following code was used and is available at the indicated links:
1. pygpseq: https://github.com/ggirelli/pygpseq/releases/tag/v3.3.4
2. pygpseq-scripts: https://github.com/ggirelli/pygpseq-scripts/releases/tag/v0.0.1
3. iFISH-singleLocus-analysis: https://github.com/ggirelli/iFISH-singleLocus-analysis/releases/tag/v1.0
4. gpseq-seq-gg: https://github.com/ggirelli/gpseq-seq-gg/releases/tag/v2.0.3
5. bed-fix-chrom-rearrangement: https://github.com/ggirelli/bed-fix-chrom-rearrangement/releases/tag/v0.0.1
6. gpseqc: https://github.com/ggirelli/gpseqc/releases/tag/v2.3.6.post1
7. gpseqc-snakemake: https://github.com/ggirelli/gpseqc-snakemake/releases/tag/v1.0
8. bioTrackBinner: https://github.com/ggirelli/bioTrackBinner/releases/tag/v0.0.1
9. ggkaryo2: https://github.com/ggirelli/ggkaryo2/releases/tag/v0.0.3
10. chromflock: https://github.com/elgw/chromflock/releases/tag/0.1
References
Sleeman, J. E. & Trinkle-Mulcahy, L. Nuclear bodies: new insights into assembly/dynamics and disease relevance. Curr. Opin. Cell Biol. 28, 76–83 (2014).
Croft, J. A. et al. Differences in the localization and morphology of chromosomes in the human nucleus. J. Cell Biol. 145, 1119–1131 (1999).
Bridger, J. M., Boyle, S., Kill, I. R. & Bickmore, W. A. Re-modelling of nuclear architecture in quiescent and senescent human fibroblasts. Curr. Biol. 10, 149–152 (2000).
Cremer, M. et al. Non-random radial higher-order chromatin arrangements in nuclei of diploid human cells. Chromosome Res. 9, 541–567 (2001).
Boyle, S. et al. The spatial organization of human chromosomes within the nuclei of normal and emerin-mutant cells. Hum. Mol. Genet. 10, 211–219 (2001).
Mayer, R. et al. Common themes and cell type specific variations of higher order chromatin arrangements in the mouse. BMC Cell Biol. 6, 44 (2005).
Bolzer, A. et al. Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes. PLoS Biol. 3, e157 (2005).
Sun, H. B., Shen, J. & Yokota, H. Size-dependent positioning of human chromosomes in interphase nuclei. Biophys. J. 79, 184–190 (2000).
Tanabe, H. et al. Evolutionary conservation of chromosome territory arrangements in cell nuclei from higher primates. Proc. Natl Acad. Sci. USA 99, 4424–4429 (2002).
van Steensel, B. & Belmont, A. S. Lamina-associated domains: links with chromosome architecture, heterochromatin, and gene repression. Cell 169, 780–791 (2017).
Guelen, L. et al. Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature 453, 948–951 (2008).
Peric-Hupkes, D. et al. Molecular maps of the reorganization of genome–nuclear lamina interactions during differentiation. Mol. Cell 38, 603–613 (2010).
Kind, J. et al. Genome-wide maps of nuclear lamina interactions in single human cells. Cell 163, 134–147 (2015).
Alcobia, I., Dilão, R. & Parreira, L. Spatial associations of centromeres in the nuclei of hematopoietic cells: evidence for cell-type-specific organizational patterns. Blood 95, 1608–1615 (2000).
Alcobia, I., Quina, A. S., Neves, H., Clode, N. & Parreira, L. The spatial organization of centromeric heterochromatin during normal human lymphopoiesis: evidence for ontogenically determined spatial patterns. Exp. Cell Res. 290, 358–369 (2003).
Molenaar, C. et al. Visualizing telomere dynamics in living mammalian cells using PNA probes. EMBO J. 22, 6631–6641 (2003).
Weierich, C. et al. Three-dimensional arrangements of centromeres and telomeres in nuclei of human and murine lymphocytes. Chromosome Res. 11, 485–502 (2003).
Tjong, H. et al. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization. Proc. Natl Acad. Sci. USA 113, E1663–E1672 (2016).
Németh, A. & Längst, G. Genome organization in and around the nucleolus. Trends Genet. 27, 149–156 (2011).
Quinodoz, S. A. et al. Higher-order inter-chromosomal hubs shape 3D genome organization in the nucleus. Cell 174, 744–757 (2018).
Federico, C. et al. Gene-rich and gene-poor chromosomal regions have different locations in the interphase nuclei of cold-blooded vertebrates. Chromosoma 115, 123–128 (2006).
Grasser, F. et al. Replication-timing-correlated spatial chromatin arrangements in cancer and in primate interphase nuclei. J. Cell Sci. 121, 1876–1886 (2008).
Hepperger, C., Mannes, A., Merz, J., Peters, J. & Dietzel, S. Three-dimensional positioning of genes in mouse cell nuclei. Chromosoma 117, 535–551 (2008).
Kreth, G., Finsterle, J., von Hase, J., Cremer, M. & Cremer, C. Radial arrangement of chromosome territories in human cell nuclei: a computer model approach based on gene density indicates a probabilistic global positioning code. Biophys. J. 86, 2803–2812 (2004).
Andrulis, E. D., Neiman, A. M., Zappulla, D. C. & Sternglanz, R. Perinuclear localization of chromatin facilitates transcriptional silencing. Nature 394, 592–595 (1998).
Sadoni, N. et al. Nuclear organization of mammalian genomes. Polar chromosome territories build up functionally distinct higher order compartments. J. Cell Biol. 146, 1211–1226 (1999).
Kosak, S. T. et al. Subnuclear compartmentalization of immunoglobulin loci during lymphocyte development. Science 296, 158–162 (2002).
Kosak, S. T. et al. Coordinate gene regulation during hematopoiesis is related to genomic organization. PLoS Biol. 5, e309 (2007).
Finlan, L. E. et al. Recruitment to the nuclear periphery can alter expression of genes in human cells. PLoS Genet. 4, e1000039 (2008).
Reddy, K. L., Zullo, J. M., Bertolino, E. & Singh, H. Transcriptional repression mediated by repositioning of genes to the nuclear lamina. Nature 452, 243–247 (2008).
Takizawa, T., Meaburn, K. J. & Misteli, T. The meaning of gene positioning. Cell 135, 9–13 (2008).
Therizols, P. et al. Chromatin decondensation is sufficient to alter nuclear organization in embryonic stem cells. Science 346, 1238–1242 (2014).
Shachar, S. & Misteli, T. Causes and consequences of nuclear gene positioning. J. Cell Sci. 130, 1501–1508 (2017).
Cook, P. R. & Marenduzzo, D. Transcription-driven genome organization: a model for chromosome structure and the regulation of gene expression tested through simulations. Nucleic Acids Res. 46, 9895–9906 (2018).
Ganai, N., Sengupta, S. & Menon, G. I. Chromosome positioning from activity-based segregation. Nucleic Acids Res. 42, 4145–4159 (2014).
Küpper, K. et al. Radial chromatin positioning is shaped by local gene density, not by gene expression. Chromosoma 116, 285–306 (2007).
Lieberman-Aiden, E. et al. Comprehensive mapping of long range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
Gelali, E. et al. iFISH is a publically available resource enabling versatile DNA FISH to study genome architecture. Nat. Commun. 10, 1636 (2019).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Bracken, A. P., Dietrich, N., Pasini, D., Hansen, K. H. & Helin, K. Genome-wide mapping of Polycomb target genes unravels their roles in cell fate transitions. Genes Dev. 20, 1123–1136 (2006).
Schermelleh, L., Solovei, I., Zink, D. & Cremer, T. Two-color fluorescence labeling of early and mid-to-late replicating chromatin in living cells. Chromosome Res. 9, 77–80 (2001).
Hua, N. et al. Producing genome structure populations with the dynamic and automated PGS software. Nat. Protoc. 13, 915–926 (2018).
Hsu, T. C. A possible function of constitutive heterochromatin: the bodyguard hypothesis. Genetics 79, 137–150 (1975).
Stamatoyannopoulos, J. A. et al. Human mutation rate associated with DNA replication timing. Nat. Genet. 41, 393–395 (2009).
Liu, L., De, S. & Michor, F. DNA replication timing and higher-order nuclear organization determine single-nucleotide substitution patterns in cancer genomes. Nat. Commun. 4, 1502 (2013).
Morganella, S. et al. The topography of mutational processes in breast cancer genomes. Nat. Commun. 7, 11383 (2016).
Schuster-Böckler, B. & Lehner, B. Chromatin organization is a major influence on regional mutation rates in human cancer cells. Nature 488, 504–507 (2012).
Chiarle, R. et al. Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells. Cell 147, 107–119 (2011).
Clarke, L. et al. The International Genome Sample Resource (IGSR): a worldwide collection of genome variation incorporating the 1000 Genomes Project data. Nucleic Acids Res. 45, D854–D859 (2017).
Hu, X. et al. TumorFusions: an integrative resource for cancer-associated transcript fusions. Nucleic Acids Res. 46, D1144–D1149 (2018).
Mertens, F., Johansson, B., Fioretos, T. & Mitelman, F. The emerging complexity of gene fusions in cancer. Nat. Rev. Cancer 15, 371–381 (2015).
Gothe, H. J. et al. Spatial chromosome folding and active transcription drive DNA fragility and formation of oncogenic MLL translocations. Mol. Cell 75, 267–283.e12 (2019).
Yan, W. X. et al. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks. Nat. Commun. 8, 15058 (2017).
Lensing, S. V. et al. DSBCapture: in situ capture and sequencing of DNA breaks. Nat. Methods 13, 855–857 (2016).
Chen, Y. et al. Mapping 3D genome organization relative to nuclear compartments using TSA-Seq as a cytological ruler. J. Cell Biol. 217, 4025–4048 (2018).
Beagrie, R. A. et al. Complex multi-enhancer contacts captured by genome architecture mapping. Nature 543, 519–524 (2017).
Nagano, T. et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature 502, 59–64 (2013).
Tan, L., Xing, D., Chang, C.-H., Li, H. & Xie, X. S. Three-dimensional genome structures of single diploid human cells. Science 361, 924–928 (2018).
Chen, X. et al. ATAC-see reveals the accessible genome by transposase-mediated imaging and sequencing. Nat. Methods 13, 1013–1020 (2016).
Gonzalez-Perez, A., Sabarinathan, R. & Lopez-Bigas, N. Local determinants of the mutational landscape of the human genome. Cell 177, 101–114 (2019).
Koren, A. et al. Differential relationship of DNA replication timing to different forms of human mutation and variation. Am. J. Hum. Genet. 91, 1033–1040 (2012).
Köster, J. & Rahmann, S. Snakemake-a scalable bioinformatics workflow engine. Bioinformatics 34, 3600 (2018).
Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
Kalhor, R., Tjong, H., Jayathilaka, N., Alber, F. & Chen, L. Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nat. Biotechnol. 30, 90–98 (2011).
Pettersen, E. F. et al. UCSF Chimera-a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
McFarland, C. D. A modified ziggurat algorithm for generating exponentially- and normally-distributed pseudorandom numbers. J. Stat. Comput. Simul. 86, 1281–1294 (2016).
Acknowledgements
We thank A. van Oudenaarden (Hubrecht Institute) for initial discussions on GPSeq data analysis and I. Solovei (UMC Munich), M.A. Marti-Renom (CRG Barcelona), S.L. Klemm (Stanford) and B. Bouwman (Bienko-Crosetto lab) for critically reading the manuscript and providing ideas. We thank L. Xu and R. Mirzazadeh (Bienko-Crosetto lab) for helping with FISH probe production. We acknowledge H. Blom at the Advanced Light Microscopy facility at the Science for Life Laboratory (SciLifeLab) for acquiring and processing STED images and for providing computing resources. We acknowledge the van Steensel laboratory for providing HAP1 lamin DamID data generated in the frame of the 4D Nucleome project. This work was supported by a postdoctoral scholarship from the Swedish Society for Medical Research to E.W.; by funding from the Swedish Research Council (2018-02950), the Swedish Cancer Research Foundation (CAN 2018/728), the Ragnar Söderberg Foundation (Fellows in Medicine 2016) and the Strategic Research Programme in Cancer (StratCan) at the Karolinska Institutet to N.C.; and by funding from the Science for Life Laboratory, the Karolinska Institutet KID Funding Program, the Swedish Research Council (621-2014-5503), the Human Frontier Science Program (CDA-00033/2016-C), the Ragnar Söderberg Foundation (Fellows in Medicine 2016) and the European Research Council under the European Union’s Horizon 2020 research and innovation program (StG-2016_GENOMIS_715727) to M.B.
Author information
Contributions
Conceptualization: J.C., T.K., G.G., F.A., B.S., E.W., A.v.O., N.C. and M.B.; data curation: G.G. and F.A.; formal analysis: G.G., F.A., E.W., B.S. and J.C.; funding acquisition: M.B. and N.C.; investigation: T.K., J.C., S.K. and M.B.; methodology: J.C., T.K., G.G., N.C. and M.B.; project administration: M.B. and N.C.; resources: SciLifeLab, H.B., L.X. and R.M.; software: G.G. and E.W.; supervision: M.B. and N.C.; validation: J.C., T.K., A.M., S.K., E.G., L.X., R.M., G.G., F.A., E.W., M.B. and N.C.; visualization: G.G., F.A., J.C., M.B. and N.C.; writing: M.B., N.C., G.G., J.C., T.K., F.A. and S.K.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Monitoring gradual gDNA restriction by YFISH.
(a) Gradual gDNA digestion with HindIII revealed by wide-field epifluorescence microscopy. Green: HindIII cut sites. Blue: DNA stained with Hoechst 33342. Scale bars: 20 µm (field-of-view) and 10 µm (insets). Times indicate the duration of incubation with HindIII. Mid optical sections are shown. The same dynamic range was used for each digestion time. The experiment was repeated twice with similar results. (b) Normalized YFISH fluorescence intensity at various distances from the nuclear lamina, for each of the times shown in (a). The YFISH signal was normalized over the fluorescence intensity of DNA stained with Hoechst 33342. Each dot represents the median intensity in one of 200 radial layers. n, number of cells analyzed. (c) Calculation of YFISH signal inter-cellular variability. Top: each nucleus is divided into m concentric layers of equal thickness and the mean fluorescence intensity per layer is calculated. Bottom: for each restriction time, the peak, inflection point, and contrast are calculated from the distribution of the mean fluorescence intensity in all the nuclei. (d–f) Distributions of the peak position (d), inflection point position (e), and peak contrast (f) at various digestion times, for the samples represented by the images in (a). n, number of nuclei analyzed as described in (c). (g) Calculation of YFISH signal intra-cellular variability. Top: 200 radii (as exemplified by the dotted lines) are randomly drawn inside each 3D segmented nucleus and the YFISH intensity profile (green) is evaluated at 100 points (as exemplified by the dotted lines) evenly spaced along each radius. Bottom: the standard deviation (s.d.) of the positions of the peak and inflection point and of the peak contrast are calculated from all the YFISH signal profiles from the same nucleus. (h–j) Distributions of the standard deviation (s.d.) of the peak position (h), inflection point position (i), and peak contrast (j) at various digestion times, for the samples represented by the images in (a). n, number of nuclei analyzed as described in (g). In all the violin plots in the figure, each box spans from the 25th to the 75th percentile and the whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. Dots: outliers (data falling outside whiskers). All the source data for this figure are from HAP1 cells.
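The peak and inflection point of a radial intensity profile, as used in panels (c) and (g), can be illustrated with a minimal Python sketch. It is not the pygpseq implementation. It assumes that the profile is a list of mean intensities per concentric layer, with index 0 at the nuclear lamina, that the peak is the layer with the highest intensity, and that the inflection point is the steepest intensity drop on the interior side of the peak. The function name and these definitions are illustrative assumptions. Contrast is omitted because the legend does not define it.

```python
def radial_profile_features(profile):
    """Return (peak_position, inflection_position) for a radial profile.

    `profile` holds the mean intensity of each concentric layer, ordered
    from the lamina (index 0) toward the nuclear center. Positions are
    expressed as a normalized distance from the lamina in [0, 1].
    The definitions used here are assumptions made for illustration.
    """
    n = len(profile)
    # Normalized position of each layer center.
    positions = [(i + 0.5) / n for i in range(n)]

    # Peak: layer with the highest mean intensity.
    peak_idx = max(range(n), key=lambda i: profile[i])

    # First differences between consecutive layers.
    slopes = [profile[i + 1] - profile[i] for i in range(n - 1)]

    # Inflection (assumed): steepest decline at or after the peak,
    # reported at the midpoint between the two layers involved.
    if peak_idx < n - 1:
        infl_idx = min(range(peak_idx, n - 1), key=lambda i: slopes[i])
        inflection = (positions[infl_idx] + positions[infl_idx + 1]) / 2
    else:
        inflection = None  # the peak sits in the innermost layer

    return positions[peak_idx], inflection


# Toy profile with 10 layers (hypothetical values).
toy_profile = [1, 4, 9, 7, 2, 1, 1, 1, 1, 1]
peak, inflection = radial_profile_features(toy_profile)
```

Applying the same function to every nucleus (inter-cellular variability) or to every radius drawn within one nucleus (intra-cellular variability) would give the distributions summarized in the violin plots.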
Extended Data Fig. 2 Quantification of gradual gDNA restriction and GPSeq reproducibility.
(a) Distribution of the position of the peak in the YFISH fluorescence intensity radial profile (see Extended Data Fig. 1c) at different restriction times, in two HindIII experiments (Exp.1 and 2). (b) Same as in (a), but for the position of the inflection point. (c) Distribution of the absolute residuals of the linear regression fitting between the log2 GPSeq score (1 Mb resolution, overlapping windows with 100 kb step size) in two HindIII experiments (Exp.1 and 2). The regression layers were generated by dividing the linear regression line into 10 bins of equal size. (d) Same as in (c) but correlating the GPSeq score at 100 kb resolution. All box plots in (c, d) span from the 25th to the 75th percentile and whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. Dots: data falling outside whiskers. (e) Gradual gDNA digestion with MboI revealed by wide-field epifluorescence microscopy. Green: MboI cut sites. Blue: DNA stained with Hoechst 33342. Scale bars: 20 µm (field-of-view) and 10 µm (insets). Times indicate the duration of incubation with MboI. Mid optical sections are shown. The same dynamic range was used for all the digestion times. The experiment was repeated twice with similar results. (f, g) Same as in (a, b), but for MboI experiments (Exp.3 and 4). (h) Correlation between the GPSeq score in four GPSeq experiments at chromosome resolution (that is, using genomic windows of the size of each chromosome). (i) Same as in (h) but at 1 Mb resolution (overlapping windows, 100 kb step size). (j) Same as in (h) but at 100 kb resolution (non-overlapping windows). In all the violin plots in the figure, the median is shown as a black line and the violins extend from the min to the max value. Sample size information for (a-d), (f, g) and (i, j) is available in Supplementary Table 11. All the source data for this figure are from HAP1 cells.
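The two windowing schemes named in this legend (1 Mb overlapping windows with a 100 kb step, and 100 kb non-overlapping windows) can be sketched as follows. The helper name, the half-open coordinate convention and the choice to drop windows that would extend past the chromosome end are assumptions made for illustration and may differ from the authors' pipeline.

```python
def genomic_windows(chrom_length, size, step):
    """Return half-open (start, end) windows that fit within the chromosome.

    Windows that would run past `chrom_length` are dropped; this is an
    assumption of the sketch, not a documented property of GPSeq.
    """
    windows = []
    start = 0
    while start + size <= chrom_length:
        windows.append((start, start + size))
        start += step
    return windows


MB = 1_000_000
KB = 1_000

# Toy chromosome of 1.25 Mb (hypothetical length).
overlapping = genomic_windows(1_250_000, size=1 * MB, step=100 * KB)
non_overlapping = genomic_windows(1_250_000, size=100 * KB, step=100 * KB)
```

With a step smaller than the window size, neighbouring windows share most of their sequence, so scores at 1 Mb resolution are smoothed relative to the non-overlapping 100 kb scores.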
Extended Data Fig. 3 Predictors of chromatin radiality.
(a) Correlation between the log2 GPSeq score and the mean number of transcription start sites (TSS, one TSS per gene) at 1 Mb resolution (overlapping genomic windows, 100 kb step size). Each dot represents one out of 26,330 genomic windows analyzed. (b) Correlation between the log2 GPSeq score and the average RNA-seq reads count at 1 Mb resolution (overlapping genomic windows, 100 kb step size). Each dot represents one out of 26,330 genomic windows analyzed. (c) Correlation between the log2 GPSeq score (1 Mb resolution, overlapping genomic windows with 100 kb step size) and chromosome size in base-pairs (bp). Each dot represents a single 1 Mb genomic window. (d) Correlation between the log2 GPSeq score (chromosome resolution) and the median GC-content per Mb per chromosome. Each dot represents one chromosome. (e) Correlation between the log2 GPSeq score (1 Mb resolution, overlapping genomic windows with 100 kb step size) and the median GC-content per Mb per chromosome. Each dot represents a single 1 Mb window. n = 25,026 genomic windows (points) were analyzed. (f) Same as in (e) but at 100 kb resolution (non-overlapping windows). n = 25,342 genomic windows (points) were analyzed. (g) Predicted over observed chromosome-wide GPSeq score. The prediction is based on a multivariable model including both chromosome size and GC-content as described in the Methods. PE, prediction error. Dotted red line: bisector. Each dot represents one chromosome. (h) Same as in (g) but using 1 Mb overlapping genomic windows with 100 kb step and using GC-content, chromosome size, gene expression and gene density to model the GPSeq score. n = 26,293 genomic windows (points) were analyzed. In all the plots in the figure, PCC and SCC are the Pearson’s and Spearman’s correlation coefficient, respectively. Dashed red lines: linear regressions. All the source data for this figure are from HAP1 cells.
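The Pearson (PCC) and Spearman (SCC) coefficients reported in these panels can be computed as in the following dependency-free Python sketch. The function names are illustrative, and the authors may have used a statistics package instead. Spearman's coefficient is calculated here as Pearson's coefficient on average ranks, which handles ties.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A monotonic but non-linear relationship gives an SCC of 1 and a PCC below 1, which is why both coefficients are useful when a feature such as GC-content does not scale linearly with the log2 GPSeq score.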
Extended Data Fig. 4 Radial distribution of chromatin marks and features as well as gene expression.
(a–e) Mean normalized signal of various chromatin features in ten concentric nuclear layers, divided by A/B subcompartments. Gene density was calculated as the mean number of transcription start sites (TSS, one TSS per gene) per 100 kb, and gene expression was calculated as the average RNA-seq reads count per 100 kb (Supplementary Methods). The dashed grey lines show the radial distribution of the features without dividing by subcompartment. (f) Distribution of the log2 GPSeq scores of all the genes and of each gene set pathway. P-values: Wilcoxon test, two-sided. n, number of genes. Box plots span from the 25th to the 75th percentile and whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. All the source data for this figure are from HAP1 cells, except for DNA methylation data, which are from K562 cells.
Extended Data Fig. 5 Radial progression of DNA replication.
(a) Correlation between the log2 GPSeq score and the Repli-seq signal after wavelet transformation, at 1 Mb resolution (overlapping genomic windows, 100 kb step size). Each dot represents a single 1 Mb genomic window out of 26,330 genomic windows (dots) analyzed. The dots are colored based on the cell cycle sub-phase (G1, S1-4, G2). The density distribution on top of each scatterplot corresponds to the density of the log2 GPSeq score of the 5% bins with the highest Repli-seq signal in the indicated sub-phase. (b) Distribution of the Repli-seq signal by A/B subcompartment type in ten concentric nuclear layers. In all the boxplots, each box spans from the 25th to the 75th percentile and whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. Dots: outliers (data falling outside whiskers). (c) Repli-seq signal in 100 kb genomic windows (dots) radially arranged based on their GPSeq score, separately for each sub-phase and A/B subcompartment. Only the 5% bins with the highest Repli-seq signal in the indicated sub-phase are reported. Solid black lines indicate the mean in each sector. Dashed circles: nuclear lamina. Grey circles separate ten concentric nuclear layers. Sample size information is available in Supplementary Fig. 6f (b) and in Supplementary Table 11 (c). GPSeq source data for this figure are from HAP1 cells, while the Repli-seq data are from K562 cells.
Extended Data Fig. 6 Analysis of chromflock structures generated using both GPSeq and Hi-C data (HG structures).
(a) Distribution of the average distance from the modeled nuclear surface of 1 Mb beads in 10,000 HG structures per chromosome. chr9:22 and chr22:9 are the derivative chromosomes of the t(9;22)(q34;q11.2) translocation. (b) Correlation between the average chromosome distance from the modeled nuclear surface in HG structures and chromosome size in base-pairs (bp). Each dot corresponds to one chromosome. (c) Distance matrix heatmap. The upper triangle shows the inter-bead 3D distances in HG structures. The bottom triangle shows the KR-normalized Hi-C contact frequency matrix, with each element raised to the power of –0.25. The reported correlation coefficients are for 1 Mb resolution, while the plot shows averaged values over 10 Mb genomic windows for simplicity. (d) Correlation between the distance from the modeled nuclear surface position of 1 Mb beads in HG structures, and the log2 GPSeq score of the corresponding windows. n = 2,627 genomic windows (points) were analyzed.
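The transformation in panel (c), in which each element of the KR-normalized Hi-C matrix is raised to the power of –0.25, can be sketched as below. The transformed value decreases as contact frequency increases, so it can be compared with inter-bead 3D distances. The function name and the handling of zero contacts (mapped to infinity) are assumptions of this sketch and are not taken from chromflock.

```python
def contacts_to_distance_proxy(contacts, exponent=-0.25):
    """Raise each contact frequency to `exponent` (default -0.25).

    `contacts` is a square matrix given as a list of lists. Entries with
    zero contacts have no finite transform and are returned as infinity
    (an assumption made for this illustration).
    """
    proxy = []
    for row in contacts:
        proxy.append(
            [c ** exponent if c > 0 else float("inf") for c in row]
        )
    return proxy


# Toy 2x2 matrix (hypothetical values).
toy_contacts = [[16.0, 1.0], [1.0, 0.0]]
toy_proxy = contacts_to_distance_proxy(toy_contacts)
```

Because the exponent is negative, a 16-fold increase in contact frequency halves the transformed value.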
Extended Data Fig. 7 Analysis of chromflock structures generated using GPSeq and Hi-C intra-chromosomal contacts only (H(intra)G).
(a) Distribution of the average distance from the modeled nuclear surface of 1 Mb beads in 10,000 H(intra)G structures. chr9:22 and chr22:9 are the derivative chromosomes of the t(9;22)(q34;q11.2) translocation. (b) Correlation between the average chromosome distance from the modeled nuclear surface in H(intra)G structures and chromosome size in base-pairs (bp). Each dot corresponds to one chromosome. (c) Distance matrix heatmap. The upper triangle shows the inter-bead 3D distances in H(intra)G structures. The bottom triangle shows the KR-normalized Hi-C contact frequency matrix, with each element raised to the power of –0.25. The reported correlation coefficients are for 1 Mb resolution, while the plot shows averaged values over 10 Mb genomic windows for simplicity. (d) Correlation between the average inter-bead 3D distance in H(intra)G structures and the KR-normalized Hi-C contact frequency. Each dot represents a pair of 10 Mb non-overlapping genomic windows, each obtained by averaging 1 Mb non-overlapping bins. n = 47,531 genomic window pairs (points) were analyzed. Density contours are shown as concentric curves. (e) Correlation between the distance from the modeled nuclear surface position of 1 Mb beads in H(intra)G structures and the log2 GPSeq score of the corresponding windows. n = 2,627 genomic windows (points) were analyzed. (f) Correlation between the radial position in H(intra)G structures and the median 3D distance to the nuclear lamina measured by DNA FISH. Each dot represents one of the FISH probes (n = 68) shown in Supplementary Fig. 1a. In all the violin plots in the figure, each box spans from the 25th to the 75th percentile, whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. Dots: outliers (data falling outside whiskers). In all the figure, PCC and SCC are the Pearson’s and Spearman’s correlation coefficient, respectively. Dashed red lines: linear regressions.
Extended Data Fig. 8 Radial organization of A/B compartments and subcompartments in chromflock structures.
(a) Examples of A/B arrangement in chromflock structures (1 Mb resolution) built using both GPSeq and Hi-C (HG) or only Hi-C (H) data. In all the structures, each bead represents a single 1 Mb genomic window. Elements connecting the beads are shown in yellow. The modeled nuclear surface is shown in grey. (b) Distribution of the difference in the median distance from the modeled nuclear surface of 1 Mb A-compartment beads vs. B-compartment beads per structure (n = 10,000) per chromosome (either for the HG or the H structures). Grey shades are used to visually distinguish different chromosomes. Sample size information is available in Source Data. (c) Examples of subcompartment arrangement in three out of 1,000 HG structures at 100 kb resolution. In all the structures, each bead represents a single 100 kb genomic window. The modeled nuclear surface is shown in grey. (d) Distribution of the distance to the modeled nuclear surface of the 100 kb beads belonging to different A/B subcompartments in 1,000 HG structures. n, number of beads belonging to each A/B subcompartment pooled from all the 1,000 structures. In all the violin plots in the figure, each box spans from the 25th to the 75th percentile, whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range.
Extended Data Fig. 9 Polarity and orientation of A1 and B3 subcompartments in 100 kb-resolution chromflock structures.
(a) Examples of possible arrangements of two subcompartments (red and blue) and their corresponding polarity score, p (see Supplementary Methods for how p is calculated). (b) Same as in (a), but for the orientation score, o. (c) Distributions of polarity scores in structures built using GPSeq and Hi-C data (HG), separately for each chromosome. (d) Same as in (c), but for orientation scores. (e, f) Same as in (c, d), respectively, but for structures built using only Hi-C data (H). Each boxplot in (c-f) corresponds to n = 1,000 structures. chr9:22 and chr22:9 are the derivative chromosomes of the t(9;22)(q34;q11.2) translocation. Box plots span from the 25th to the 75th percentile and whiskers extend from –1.5 × IQR to +1.5 × IQR from the closest quartile, where IQR is the inter-quartile range.
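The whisker convention used throughout these legends (whiskers extending 1.5×IQR below the 25th percentile and above the 75th percentile, with points beyond them shown as outliers) can be made concrete with a short Python sketch. The quartile interpolation method is an assumption, since plotting libraries differ in how they compute percentiles. Many libraries also draw each whisker only as far as the most extreme data point inside the limit.

```python
import statistics


def boxplot_fences(data):
    """Return (lower_limit, upper_limit, outliers) for a box plot.

    Limits are Q1 - 1.5*IQR and Q3 + 1.5*IQR. Quartiles use the
    'inclusive' interpolation of statistics.quantiles (an assumption;
    other tools may interpolate differently).
    """
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower or v > upper]
    return lower, upper, outliers


# Toy data with one extreme value (hypothetical).
lower, upper, outliers = boxplot_fences([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```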
Extended Data Fig. 10 Relationship between chromosome mingling, cancer-associated gene fusions and DSBs.
(a) Distribution of the inter-chromosome mingling frequency of the 10% most frequently mingling beads in 100 kb-resolution chromflock structures, separately for beads overlapping (Fusions) or not (Controls) with cancer-associated gene fusions annotated in TCGA. Structures were generated using Hi-C data only (that is, without GPSeq integration). P-value: Wilcoxon test, two-sided. n, number of beads analyzed. (b) Average number of Hi-C trans-chromosomal contacts per 1 Mb genomic window in ten concentric layers defined based on the GPSeq score. (c) Distribution of the normalized number of trans-chromosomal Hi-C contacts (trans/all) per 1 Mb genomic window in the same layers as in (b). P-values: Wilcoxon test, two-sided. n, number of genomic windows analyzed. (d) Distributions of the total BLISS read count per 100 kb genomic windows, separately for windows overlapping (Fusions) or not (Controls) with cancer-associated gene fusions annotated in TCGA. P-value: Wilcoxon test, two-sided. n, number of genomic windows analyzed. (e) Radial distribution of DSBs in genic vs. intergenic genomic regions in ten concentric nuclear layers defined based on the GPSeq score. (f) Radial profile of γH2A.X along the nuclear radius. The intensity of γH2A.X immunofluorescence was normalized by the intensity of DNA staining using Hoechst 33342 using the same approach as for quantifying YFISH signal radial profiles (Supplementary Methods). Each point represents the median γH2A.X signal intensity in one of 200 radial layers. n, number of cells analyzed. The red line is a polynomial fit to the points. In all the violin plots and boxplots in the figure, boxes extend from the 25th to the 75th percentile, the midline represents the median, and whiskers extend from –1.5×IQR to +1.5×IQR from the closest quartile, where IQR is the inter-quartile range. Dots: outliers (data falling outside whiskers).
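Grouping genomic windows into ten concentric layers based on their GPSeq score, as in panels (b), (c) and (e), can be sketched as follows. The sketch assumes equal-width bins over the observed score range. The binning actually used is described in the paper's Methods and may differ (for example, equal-sized groups). Function names are illustrative.

```python
def assign_layers(scores, n_layers=10):
    """Assign each score to one of `n_layers` equal-width bins.

    Layer 0 holds the lowest scores and layer n_layers - 1 the highest.
    Equal-width binning is an assumption of this sketch.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0] * len(scores)
    width = (hi - lo) / n_layers
    # The maximum score would fall in bin n_layers, so clamp it.
    return [min(int((s - lo) / width), n_layers - 1) for s in scores]


def mean_per_layer(scores, values, n_layers=10):
    """Average `values` (e.g. contact counts) within each score layer."""
    sums = [0.0] * n_layers
    counts = [0] * n_layers
    for layer, value in zip(assign_layers(scores, n_layers), values):
        sums[layer] += value
        counts[layer] += 1
    # Empty layers are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy scores and per-window values (hypothetical).
toy_scores = [0.0, 0.5, 1.0, 9.99, 10.0]
toy_values = [2, 4, 6, 8, 10]
layer_means = mean_per_layer(toy_scores, toy_values)
```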
Supplementary information
Supplementary Information
Supplementary Figs. 1–11, methods, tables, notes and references.
Supplementary Table 1
List of oligos used to make YFISH and GPSeq adapters.
Supplementary Table 2
Summary of sequencing experiments.
Supplementary Table 3
Genomic coordinates of the 68 DNA FISH probes used for GPSeq validation and shown in Supplementary Fig. 1.
Supplementary Table 8
Correlation between TFBS density and GPSeq score at 1-Mb resolution, with 100-kb step. n = 26,630 genomic windows were compared.
Supplementary Table 10
List of masked manually curated telomeric and pericentromeric regions.
Supplementary Table 11
Specification of sample size (n) and P values for Main Figures, Extended Data Figures and Supplementary Figures for which the values are too numerous to be included in the corresponding legend.
Supplementary Table 12
Specification of sample size (n) and P values for the figures in Supplementary Note 1 for which the values are too numerous to be included in the corresponding legend.
Supplementary Video 1
Rendering of gradual gDNA digestion showing which parts of the genome are cut first (and, therefore, are more peripheral) and which are digested later. The GPSeq score appears along each chromosome ideogram as bars of increasing height. The height of the bars follows the time of enzyme diffusion, as shown in the cartoon on the left. The video is provided as a separate .mp4 file. In the chromosome ideograms, the color of the cytobands is based on the intensity of the Giemsa staining; pericentromeric regions are colored in red; and acrocentric regions and variable heterochromatic regions are colored in cyan.
Supplementary Video 2
3D rendering of one out of 10,000 whole-genome structures generated by chromflock by integrating GPSeq and Hi-C information. Each bead represents a 1-Mb genomic window (nonoverlapping). Chromosomes are shown with distinct colors. Elements connecting the beads are shown in yellow. The modeled nuclear surface is shown in gray.
Supplementary Video 3
3D rendering of one out of 10,000 whole-genome structures generated by chromflock by integrating GPSeq and Hi-C information. Each bead represents a 1-Mb genomic window (nonoverlapping). Chromosomes are shown with distinct colors. Elements connecting the beads are shown in yellow. The modeled nuclear surface is shown in gray.
Supplementary Video 4
3D rendering of one out of 10,000 whole-genome structures generated by chromflock by integrating GPSeq and Hi-C information. Each bead represents a 1-Mb genomic window (nonoverlapping). Chromosomes are shown with distinct colors. Elements connecting the beads are shown in yellow. The modeled nuclear surface is shown in gray.
Supplementary Video 5
3D rendering of one out of 10,000 whole-genome structures generated by chromflock by integrating GPSeq and Hi-C information. Each bead represents a 1-Mb genomic window (nonoverlapping). Chromosomes are shown with distinct colors. Elements connecting the beads are shown in yellow. The modeled nuclear surface is shown in gray.
About this article
Cite this article
Girelli, G., Custodio, J., Kallas, T. et al. GPSeq reveals the radial organization of chromatin in the cell nucleus. Nat Biotechnol 38, 1184–1193 (2020). https://doi.org/10.1038/s41587-020-0519-y
DOI: https://doi.org/10.1038/s41587-020-0519-y
- Springer Nature America, Inc.