Chrom3D: three-dimensional genome modeling from Hi-C and nuclear lamin-genome contacts
- 9k Downloads
Current three-dimensional (3D) genome modeling platforms are limited by their inability to account for radial placement of loci in the nucleus. We present Chrom3D, a user-friendly whole-genome 3D computational modeling framework that simulates positions of topologically-associated domains (TADs) relative to each other and to the nuclear periphery. Chrom3D integrates chromosome conformation capture (Hi-C) and lamin-associated domain (LAD) datasets to generate structure ensembles that recapitulate radial distributions of TADs detected in single cells. Chrom3D reveals unexpected spatial features of LAD regulation in cells from patients with a laminopathy-causing lamin mutation. Chrom3D is freely available on github.
KeywordsThree-dimensional (3D) genome Hi-C Lamin-associated domain (LAD) Laminopathy Modeling Nuclear lamin Topologically-associated domain (TAD)
fluorescence in situ hybridization
topologically associated domain
Advances in molecular and computational techniques have enhanced our understanding of the three-dimensional (3D) organization of eukaryotic genomes . Current interpretation of chromosome-chromosome contacts determined from genome-wide chromosome conformation capture (Hi-C) data pictures a hierarchically organized genome with fundamental ~1 Mb units termed topologically associated domains (TADs) [2, 3, 4]. In mammals, the genomic linear position of TADs and TAD boundaries are overall conserved between cell types [2, 3]. However, TADs can differ in their internal chromatin folding patterns, chromatin states, and transcriptional activity , and contacts between TADs can be altered during cell differentiation . While these observations suggest an orchestrated genome topology [6, 7], processes modulating transcriptional activity of TADs remain largely unknown.
One way for the cell to regulate chromatin activity in TADs would be to place them in distinct nuclear compartments, such as the nuclear interior which is conducive of transcriptional activity or the nuclear periphery (NP) which provides a more repressive environment. At the NP, chromatin interacts with the nuclear lamina, a meshwork of A- and B-type nuclear lamins , through lamin-associated domains (LADs) . While lamin B1 (abbreviated as LMNB1 here) is restricted to the NP, lamins A and C, splice variants of the LMNA gene (abbreviated as LMNA), also exist in the nuclear interior  where they seem to play a role in gene regulation and differentiation  presumably by interacting with chromatin [6, 7]. Thus, a dynamic association of TADs with the NP would constitute a mode of regulation of transcriptional activity within TADs . However, TAD positioning in the 3D nucleus space has not been examined because there are currently no means of assessing spatial mammalian genome conformation using chromatin anchor-point information. This limits our understanding of principles of genome dynamics.
Chromatin connections with intranuclear structures such as the nuclear lamina  contribute to spatial genome organization and regulation of gene expression. In yeast, attachment of centromeres to the spindle pole body and tethering of telomeres to the NP [12, 13, 14] provide constraints on chromosome movement which have proven useful to generate 3D genome structures [15, 16]. These observations suggest that integrating positional constraints from various genomic datasets, such as LAD information from chromatin immunoprecipitation sequencing (ChIP-seq) of nuclear lamins, in addition to Hi-C, would provide more realistic structures of the mammalian genome.
A strategy to study genome conformation is to computationally model 3D structures of chromatin and analyze the properties of these structures. 3D genome modeling approaches have been applied at various scales and resolutions [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33]. One approach to modeling genomes from Hi-C data is to reconstruct a consensus 3D structure, using multidimensional scaling [17, 20, 21, 34] or Bayesian inference methods such as Bayesian 3D constructor for Hi-C data (BACH) and derivatives thereof . Other methods recapitulate structural variations in genome conformation across cells in a population by simulating ensembles of structures [18, 19, 24, 28, 31, 35] or by data deconvolution [22, 24, 25, 31, 36]. A commonly used framework that models ensembles of structures is the Integrative Modeling Platform (IMP) [24, 31, 36, 37] (https://3dgenomes.github.io/TADbit). However, IMP has not been designed for genome modeling and requires advanced programming skills. Another constrained optimization approach (BACH-MIX) designed for local genome modeling, relies on Bayesian inference of 3D chromosome arrangements to assess variations in genome structures in a cell population . BACH-MIX, however, is not designed to incorporate positional constraints for loci in the nucleus. There is therefore no user-friendly framework that models the 3D genome over a wide range of scales and that incorporates chromosome positional constraints.
We introduce Chrom3D, a genome 3D modeling platform designed to integrate positional constraints based on association of loci with intranuclear anchors. The combination of Hi-C and LAD information enables genome-wide radial positioning of TADs in ensembles of 3D structures. We also show that Chrom3D provides new opportunities to investigate mechanisms of spatial gene regulation in diseases susceptible to affect spatial chromatin organization.
A 3D genome modeling framework integrating chromosomal interactions and radial position information
Chrom3D is based on Monte Carlo (MC) optimization with the goal of minimizing a loss-score function. The optimization process starts from random self-avoiding chromosome structures. Using the constraints described above imposed by TAD-TAD and TAD-LMNA interactions, iteration invokes one of five predefined local bead moves, affecting one or multiple beads while preserving bead chain connectivity (Fig. 1a; Additional file 1: Figure S3). This is in contrast to previous MC-based genome modeling where each bead is moved independently [18, 24, 31]. We model the genome at TAD (and sub-TAD) resolution from 13,878 beads, each spanning ~230 kb. In the simulations, TADs constrained by LADs are pushed toward the NP while Hi-C-constrained interacting TAD pairs are attracted to each other. The resulting Euclidean distances are assessed through a loss-score optimized until convergence is reached (Fig. 1b). The result of one simulation is a 3D modeled structure of the entire human genome where the concept of chromosome territories is respected (Fig. 1c; see also below for analysis of modeled chromosome territories). Of note, the lamin constraint is neutral with respect to the detection of chromosome territories in the modeled structures (Additional file 1: Figure S4a).
Comparison of Chrom3D with IMP
We next compared Chrom3D with IMP, a popular framework for ensemble 3D genome modeling. To this end, we customized IMP to include LAD information as radial positional constraints. Simulation time is slightly faster with Chrom3D (Additional file 1: Figure S6a). Importantly, IMP tends to draw LMNA-containing TADs (beads) to the NP by stretching distances between consecutive beads, thereby violating chain continuity, especially for TADs associated with LMNA (Additional file 1: Figure S6b, c). We attributed this to IMP’s permutation strategy which involves randomly repositioning single beads, whereas by design Chrom3D always connects beads (Additional file 1: Figure S6b, red line). Moreover, using many beads, IMP generates large and intermingled chromosomes (Additional file 1: Figure S6d, e) that are less ellipsoidal and with greater variation in asphericity, beyond the 1–2 μm radius of chromosome territories estimated from microscopy studies  (Additional file 1: Figure S6f, g). This is likely due to initialization in an unconnected configuration. Thus, despite IMP’s suitability for 3D genome modeling of coarse-grained systems, Chrom3D more favorably models 3D genome structures with bead sizes at TAD and sub-TAD resolution from the constraints imposed in our system.
Chrom3D is able to model local chromatin conformation
We assessed whether Chrom3D was scalable to restriction fragment-level size by modeling the ENCODE ENm008 region containing the α-globin locus, whose 3D conformation has been inferred from 5C data . Clustering of 1000 Chrom3D-simulated conformations with no lamin constraint (see “Methods”) reveals greater structural variability in erythroleukemia K562 cells where the α-globin gene is expressed, than in lymphoblastoid GM12878 cells where it is repressed (Additional file 1: Figure S7a). Chrom3D structures also show compactness of the locus consistent with the previous structure and fluorescence in situ hybridization (FISH) data and with expression of the gene in these cell types  (Additional file 1: Figure S7b–d). We conclude that Chrom3D is also suited for structural chromatin modeling at the gene locus level.
TADs associated with LMNA are more centrally placed than those associated with LMNB1
Previous FISH analysis simultaneously probing 25 LMNB1 LADs in single HT1080 cells show that only 32% of LADs can be simultaneously detected at the NP (defined there as < 0.7 μm, or 8 pixels, from the nucleus edge) in a given nucleus . This proportion is remarkably similar to that of TADs associated with LMNB1 localized within 0.7 μm of the nucleus edge across our 400 Chrom3D structures (30.5%; Fig. 3a, inset). It is also higher than that of LMNA-associated TADs modeled at the periphery (25%; P < 2.2 × 10–16; Mann–Whitney U test). This again indicates that not all LADs can be assumed to be found at the NP in individual nuclei in a cell population. This may be because some regions only transiently contact nuclear lamins at the NP and are therefore mainly detected in the nuclear interior.
Assessment of chromatin stability at the nuclear periphery
Our previous results suggest that Chrom3D can recapitulate structures of genome conformation at the single-cell level. To further assess this contention, we examined the consistency of assignment of TADs at the NP across structures, with the rationale that this would reflect stably positioned TADs in this compartment across cells in a population. Chromosomal heatmaps of radial placement of TADs in structures modeled using LMNA or LMNB1 constraints reveal TADs with constitutive placement at the NP (<1 μm from the nucleus edge) or in the nucleus center and TADs with intermediate placement (Fig. 3b). As expected, the most stable peripheral TADs are located on the largest and most gene-poor chromosomes, while smaller gene-rich chromosomes harbor TADs more centrally placed (Fig. 3b). We find, however, no correlation between TAD size and peripheral stability of TADs across structures, indicating that the attribution of TADs from large chromosomes to the NP is not merely caused by TAD size (r = 0.16; Additional file 1: Figure S8). There is also concordance of radial positioning of TADs based on LMNB1 or LMNA constraints (Fig. 3b, c), consistent with the bulk of LMNA being enriched in the peripheral lamina. Moreover, subtelomeric regions appear overall more centrally placed than pericentromeric regions that are more stably ascribed to the NP (Fig. 3b), corroborating previous FISH data [42, 43]. The patterns and consistency of radial assignment of TADs across our ensemble of structures indicate that Chrom3D can capture principles of chromatin organization in single cells.
Features of LADs estimated from the structures concur with lamin-genome contact patterns observed in single cells
We next compared our three radial placement categories (center, intermediate, periphery; see Fig. 2a) with single-cell NP-genome contact frequencies. These were defined by association of chromatin with the nuclear lamina observed in a previous single-cell LMNB1 DamID study in the near-haploid KBM7 cell line, the only cell type for which to our knowledge LADs have been mapped at the single-cell level . Despite the difference in ploidy between HeLa and KBM7 cells, our structure ensembles reveal features of genome organization inferred from the single-cell observations. Indeed, TAD sequences assigned to the periphery in our structures show the highest peripheral contact frequency calculated from the single-cell LMNB1 DamID data and, conversely, the central category shows the lowest peripheral contact frequency in the single-cell dataset (Fig. 3d). Repeating this comparison excluding the lamin constraint in our modeling shows strongly reduced assignment of the regions to the periphery (Fig. 3d; P = 4.34 × 10–5 to P < 2.2 × 10–6; Mann–Whitney U tests). Moreover, focusing on LADs, our structures predict that larger LADs are more stably assigned at the periphery than smaller LADs (Fig. 3e; P = 1.23 × 10–6; Mann–Whitney U tests), again in agreement with the single-cell observations . Our predictions of lower gene density and expression level in peripheral TADs (see Fig. 2b, c) are also supported by the single-cell data. We conclude that our ensemble of structures reflects the radial localization of LADs observed in single cells.
FISH validates LADs modeled in the nuclear periphery and nuclear interior
Chrom3D reveals laminopathy-specific LADs and differential gene regulation in the nucleus interior
Mutations in LMNA cause laminopathies which affect specific tissues  through still largely unknown mechanisms. The roles of LMNA on chromatin organization and mobility [46, 47] suggest that laminopathies may involve altered interactions of LMNA with chromatin in distinct nuclear compartments. This, however, has not been examined due to a lack of suitable tools. To gain 3D insight on chromatin changes that might be associated with LMNA mutations, we used Chrom3D to model the radial distribution of LADs associated with wt or mutated LMNA.
Using Chrom3D, we generated 100 structures for each of the control and patient fibroblasts by integrating control and FPLD2 LMNA LAD datasets with TAD information from published Hi-C data for IMR90 human fibroblasts . We find that strikingly, LADs specific to FPLD2 patients (LADs “gained” in FPLD2 fibroblasts) are more centrally located than all LADs in these cells (Fig. 6d, bars 2, 6; P < 2.2 × 10–16, Mann–Whitney U test); these domains are also centrally placed in control cells (Fig. 6d). In contrast, LADs unique to control fibroblasts (“lost” in patient cells) are found at the NP, to the same extent as all LADs in these cells (Fig. 6d; bars 3, 5; P = 0.35). Figure 6e shows examples of peripherally placed “lost” LADs in a modeled control nucleus and centrally placed “gained” LADs in an FPLD2 nucleus. We nevertheless note that a gain or loss of LADs is associated with partial recruitment of these regions towards, or away from, the NP, respectively (Fig. 6d, bars 1, 2; P < 2.2 × 10–16 and bars 3, 4; P < 2.2 × 10–16). These findings imply that LADs gained in patient cells are mainly restricted to the nuclear interior and are unexpectedly not fully repositioned to the NP. In addition, a gain or loss of LADs correlates with overall downregulation or upregulation, respectively, of gene expression within them (Fig. 6f; P < 2.2 × 10–16, Mann–Whitney U test), providing functional significance to the differential LMNA associations identified in FPLD2 patient cells. This is in a context of similar range of expression levels of all genes and of genes within LADs, both in patient and control cells (Fig. 6g).
Providing additional biological meaning to the modeled structures, Gene Ontology enrichment analysis highlights distinct functions of genes found in control-specific LADs (signaling and metabolic processes) and FPLD2-specific LADs (developmental processes; Additional file 1: Table S2). Interestingly, relevant for the metabolic phenotype of FPLD2 patients , the gained and lost LADs in patient fibroblasts contain genes implicated in white and brown adipocyte differentiation and metabolism (e.g. PRDM16, a master regulator of adipose tissue browning, RARRES2, LGR4, BCATENIN, PLCB1, PTGS2, FABP4, RSPO3, and EIF2AK3). Predictions emerging from this modeling therefore suggest that adipogenic and metabolic defects in FPLD2 patients with the LMNA p.R482W substitution might be associated with a deregulation of LADs in the nuclear interior and not exclusively at the nuclear envelope. This could speculatively involve differential binding of lamin A/C to promoters . Our 3D genome modeling framework paves the way to more targeted investigations of disease mechanisms affecting genome architecture.
We present Chrom3D, a software for 3D genome modeling based on the inclusion of positional input constraints from chromosomal interactions (Hi-C data) and nuclear lamin-chromatin associations (lamin ChIP-seq data). Several key features of our modeled structures are shown: inclusion of radial positional constraints, scalability from a single locus to the whole human genome, and predictive value of radial placement of TADs. Chrom3D is versatile in that other positional constraints can be integrated. Finally, we show an application of Chrom3D to the study of disease mechanisms using FPLD2 patient-specific positional constraints imposed by a LMNA mutant displaying alterations in its association with the genome. Incorporation of radial positional constraints provides new insight into the placement of genomic regions in the 3D mammalian nucleus space with respect to the NP, which has not been possible from current genome modeling platforms. Our ensembles of structures reveal information on the cell-to-cell variation in genome structures likely to exist in a cell population. They recapitulate the peripheral positions of TADs in single cells and notably ascribe a subset of LMNA-associated TADs, and thereby LADs, in the nuclear interior without prior knowledge of such localization. Inference of an intranuclear localization of LMNA LADs concurs with the nuclear distribution of A-type lamins  and with their association with euchromatin, including active genes [6, 7, 53], which is enriched in the nuclear interior. Our structures therefore have predictive capacity. We exploit this property to infer the internal positioning of LADs associated with a pathological LMNA(R388P) mutation, after superimposition of these LADs onto structures. This concurs with the only information currently available on this lamin mutant, namely LAD data determined by ChIP-seq of an epitope-tagged version of this mutant, and its localization throughout the nucleoplasm visualized by immunostaining. Our findings illustrate the benefit of optimization-based 3D modeling to unveil the interplay between factors determining 3D genome structure.
The predictive capacity of our structures has important implications in understanding the relationship between genome structure and disease [54, 55]. Chrom3D structures enable a gain of spatial insight into disease-causing mechanisms, e.g. laminopathies as illustrated here. The structures reveal how alterations in LMNA-chromatin associations specific to FPLD2 patients with the LMNA(R482W) substitution predictively occur centrally in the nucleus and not necessarily at the NP as one might have expected. This opens the door to better targeted molecular investigations of the disease. Our modeling approach should not only be applicable to other laminopathies, but also potentially to diseases linked to dysfunction or mis-regulation in other nuclear components.
Challenges remain, however, before 3D genome modeling can be routinely applied in disease contexts. First, the genome must be modeled at appropriate spatial resolution to infer significant associations between genome structure and disease mechanisms; this may be required, for instance, to place selected genes and other genomic elements into correct regulatory neighborhoods. We have modeled the genome at TAD and sub-TAD resolution, providing high resolution structures of the diploid human genome. We do not imply that TADs exist as structural units at the single-cell level, but TADs reflect statistically enriched topological domains that prove to be relevant units for modeling. Second, the size and complexity of the human genome necessitate some level of coarse graining for any 3D modeling exercise. Thus, a tradeoff between resolution and throughput of structures is inevitable: we have focused here on an elevated number of beads in our structures and a smaller ensemble of structures. Nevertheless, we show how critical insights into cell-to-cell variability of genome structures can be gained from radial positioning constraints.
Chrom3D is a genome 3D modeling platform integrating Hi-C data together with positional constraints from the association of loci with intranuclear anchors such as nuclear lamins. While pairwise domain interactions are important to enforce contacts between distal genomic regions, radial positioning provides key information on the spatial organization of genomic domains. Incorporation of radial positioning constraints in 3D genome structures enables the study of spatial gene regulation in disease, for example so-called nuclear envelopathies, caused by mutations in nuclear envelope proteins. Extending positional information to other chromatin anchor points in the nucleus should expectedly enhance applications of 3D genome structures to the study of disease mechanisms.
HeLa cells (American Type Culture Collection; CCL-2) were cultured in MEM medium containing Glutamax (Gibco), 1% non-essential amino acids and 10% fetal calf serum. Cells were transfected using XtremeGENE 9 (Roche) using a 3:1 ratio (μL:μg) of X-tremeGENE 9 DNA Transfection Reagent and DNA. Primary skin fibroblast cultures were established from healthy volunteers aged 20 years and 33 years (CTL-1, CTL-3) and from four patients with familial partial lipodystrophy of Dunnigan type (FPLD2) due to a LMNA p.R482W heterozygous mutation (female, age 43 years (“FPLD-p1” patient), female, age 37 years (FPLD-p2), female, age 14 years (FPLD-p3), male, age 43 years (FPLD-p4) . These studies were approved by the Institutional Review Board of Hôpital Saint Antoine (Paris, France). Normal skin fibroblasts were also purchased from Lonza (“CTL-2”). Fibroblasts were cultured in DMEM/F12/10% fetal calf serum, 10 ng/mL epidermal growth factor, 24 ng/mL basic fibroblast growth factor, and 1% Penicillin-Streptomycin. Cultures were at passage 5–7 when used.
pCMV-Flag-preLA-WT and pCMV-Flag-preLA-L647R vectors were generated by LMNA amplification of pSVK3-Flag-preLA-WT and pSVK3-preLA-L647R , with the 5′ CCGGATCCTATGGAGACCCCGTCCCAGCGG-3′ and 5′ GCGAATTCTTACATGATGCTGCAGTTCTG-3′ primers and insertion of the PCR product into pCMV-Flag at BamH1 and EcoRI sites. pCMV-Flag-preLA-R388P was constructed from pCMV-Flag-preLA-WT using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies). pCMV-Flag-preLA-wt was amplified by PCR using 5′-GAGGAGAGGCTACCACTGTCCCCCAGC-3′ and 5′-GCTGGGGGACAGTGGTAGCCTCTCCTC-3′ primers, products digested by DpnI and XL10-Gold® ultracompetent cells were transformed. pEGFP-preLA-R388P was constructed by LMNA amplification of pCMV-Flag-preLA-R388P with the 5′ GCCCTAGGTGAGGCCAAGAAGCAACTT 3′ and 5′ GCCCATGGACTGGTCCTCATTGGACTTGT 3′ primers and insertion of the PCR product into pEGFP-preLA-wt at EcoNI and PflMI sites.
Cells grown on coverslips were fixed 24 h after transfection with 3% paraformaldehyde, permeabilized in PBS/0.5% Triton X-100, and incubated in PBS/0.1% Triton X-100/2% BSA for 25 min. Cells were incubated for 30 min each with primary and secondary antibodies in PBS/0.1% Triton X-100/1% BSA. Antibodies were anti-Flag (1:200; Sigma), anti-lamin A/C  (1:400), and anti-rabbit Alex Fluor® 594 (1:200; Jackson ImmunoResearch). DNA was stained with Hoechst 33258. Coverslips were mounted with Mowiol and examined on a LSM 700 confocal microscope (Zeiss) at the Imaging Facility of the Functional and Adaptive Biology Unit of University Paris Diderot/CNRS.
Proteins were separated by SDS-PAGE and transferred onto nitrocellulose. Membranes were incubated with anti-lamin A/C  or anti-GAPDH antibodies (1:15,000; Sigma) and with horseradish peroxidase-conjugated antibodies (1:20,000; Promega). Signals were detected by enhanced chemiluminescence.
Total RNA was isolated from control and FPLD2 fibroblasts using the Ambion TRIzol® Reagent RNA extraction kit (Life Technologies) . Libraries were sequenced on an Illumina HiSeq2500. RNA-sequencing (RNA-seq) reads were processed using Tuxedo . TopHat  was used to align reads to hg19 applying the Bowtie 2  preset “very sensitive.” Gene ontology analysis was done with topGO in R .
ChIP of LMNA and LAD identification
ChIP of Flag-LMNA proteins in HeLa cells was done using anti-Flag antibodies (20 μg/107 cells) as described . ChIP of LMNA from fibroblasts was done using anti-lamin A/C antibodies . Illumina libraries were sequenced on a HiSeq2500. DNA was also used as template for qPCR (Additional file 1: Table S3), with 95 °C for 3 min and 40 cycles of 95 °C for 30 s, 60 °C for 30 s, and 72 °C for 30 s. Sequence reads were aligned to hg19 genome using Bowtie2 with default parameters and option -best enabled. LADs were called using Enriched Domain Detector  using a 1-kb bin size and default parameters. Browser files were generated from the ratio of ChIP/input for each 1-kb bin with input normalized to ratio of [total ChIP reads/total input reads]. Scripts were written in Perl  or R .
TAD and bead definition
Genomic positions of TADs were based on contact domains identified from Hi-C data  accessed under GEO GSE63525. Overlapping TADs were merged into single domains and regions not covered by a TAD were assigned a bead of size proportional to the corresponding genomic region. Bead sizes were scaled so that total bead volume constituted 15% of the volume of a 10-μm diameter modeled nuclei , using a previous scaling function .
Assigning lamin information to TADs
TADs that overlap, fully or partially, with a called peak from lamin ChIP-seq data (i.e. a LAD)  were designated as LMNA-associated or LMNB1-associated TADs and were constrained towards the NP. This resulted in 1718 LMNA-associated TADs (see Additional file 1: Figure S1 and S2, blue segments) and 2770 LMNB1-associated TADs.
Inference of significant interactions from Hi-C data
Interactions between beads were defined from high-resolution Hi-C data for HeLa cells  accessed under dbGap number phs000640. To infer statistically significant interactions, we adapted the ChiaSig method designed for ChlA-PET  to Hi-C. To this end, we estimated the dependency between linear genomic distance and contact frequencies using 1-Mb bins. Refinement of genomic distance–contact frequency relationship was not necessary because most pairwise combinations of bins reflect background looping information. To estimate background distribution for inter-chromosomal interactions, we used the average number of inter-chromosomal interactions between all pairs of bins between chromosomes. ChiaSig calculates a P value based on the probability of observing a given number of contacts conditional on the total number of contacts for both regions involved, as well as the total number of contacts, using a non-central hypergeometric distribution. This adjusts for the propensity of different regions to be involved in contacts, including technical bias (GC-content, accessibility). Intra-chromosomal interactions were selected with FDR 0.01% . For inter-chromosomal interactions, we also required that interactions be significant in HeLa cells and in > 4 of the seven cell lines analyzed previously (GM12878, HMEC, HUVEC, IMR90, K562, KBM7, NHEK) .
This resulted in 3824 significant interactions (3657 intra-chromosomal and 167 inter-chromosomal) for HeLa (see Additional file 1: Figure S2, red segments). For IMR90, we obtained 2349 significant interactions (1558 intra-chromosomal and 791 inter-chromosomal). These interactions were associated with TADs by mapping the mid-point of each Hi-C bin to the corresponding TAD. This resulted in 2586 beads for HeLa cells and 1744 beads for IMR90 (each × 2 to account for a diploid genome) with at least one interaction.
Peripheral, central, and intermediate assignment of TADs in the modeled structures
the NP if placed in the shell in > 67% of 400 structures;
an “intermediate” location if placed in the shell in 33–67% of the 400 structures;
the nucleus center if placed in the shell in < 33% of the 400 structures.
Chromatin modeling framework
where the sum runs over all bead positions where a constraint has been defined and d ij is the target Euclidean distance of the given pair of beads i and j. Beads to be associated with the NP are optimized according to the distance from a “dummy bead” assigned in the nucleus center (the origo). The dummy bead has a radius of 0 and is in all instances (except for loss-score calculations) not considered as part of the modeled structure. Each constraint can be weighted by a factor k ij , to allow for selected constraints to be prioritized in the MC optimization. Target distance of all bead pairs constrained by a Hi-C interaction between them was set to the sum of the radii of the two beads, effectively minimizing the distance between them without bead overlap. For beads constrained by LADs, target distance (from the nucleus center) was set to the difference between the nucleus radius and the bead radius, allowing lamin-constrained beads to move to the nuclear “wall.” All non-lamin beads were pulled towards the center by minimizing their distance to the nucleus center (“dummy bead”). At the start of each simulation, we initialize the modeling based on self-avoiding random walk structures sampled such that none of the chromosomes clash or overlap. We perturb the structure using the “moves” and minimize the loss-score using simulated annealing. A move was accepted based on resulting Euclidian distances between interacting TADs or between TADs and the NP, according to the Metropolis–Hastings algorithm with simulated annealing. Moves causing a clash between beads were discarded.
Chrom3D software and documentation can be freely accessed at: https://github.com/CollasLab/Chrom3D.
The version of the source code used in the manuscript is available at: https://doi.org/ 10.5281/zenodo.168212
Comparisons of Chrom3D with IMP
We developed a modeling procedure based on IMP using the same set of constraints and number of beads as for Chrom3D. We initialized TAD beads as particles and added “ExcludedVolumeRestraint” to disable bead–bead clashes. For each consecutive pair of beads on each chromosome, we added harmonic springs with a spring distance equal to the two radii of the beads. Interactions between non-consecutive beads (based on Hi-C) were modeled using harmonic springs with a distance corresponding to the sum of the radii of the bead pair, similarly to Chrom3D. Beads with lamin constraints were pushed to the NP using the “HarmonicLowerBound” and a “dummy bead” placed in the nucleus center. Spring distance from the lamin-bead and the dummy bead was set to nucleus radius (5 μm) minus bead radius. To run MC optimization, we used “MonteCarloWithLocalOptimization” with fie local steps and a total of 500 iterations, which was sufficient to reach convergence. Ten independent simulations were done for comparison with Chrom3D structures.
To compare chromosome territories, we used the radius of gyration, calculated as the root mean square distance of the beads on each chromosome from their common center of mass (using bead volume to represent the mass). To estimate individual chromosome deviations from a spherical shape, we used the asphericity measure based on the eigenvalues of the gyration tensor .
Modeling of the α-globin gene locus
The ENm008 ENCODE region containing the α-globin locus was modeled using Chrom3D based on published 5C chromosome conformation capture data . For each restriction fragment, we created beads (n = 70) of diameter corresponding to the genomic length of the fragment multiplied by 0.005 . Preprocessing and distance conversion rules for 5C data were as described . In contrast to the input file used for earlier 3D reconstruction , we did not include distance constraints between neighboring beads since our modeling framework represents each chromosome as a chain of connected beads. Thus, distance constraints included non-interaction constraints (two beads should not get closer than a given distance) and interaction constraints (two beads should not get further from each other than a given distance). For all bead pairs with zero contacts detected in the 5C contact matrix, we used non-interaction constraints. Thresholds for non-interaction and interaction distances were cell type-specific (K562 and GM12878 cells) . We ran 1000 simulations for each cell line using 40,000 iterations and a cooling rate of 0.000125, excluding whole chromosome “Translation” and “Rotation” moves. Final structures were aligned using Procrustes analysis (procrustes method in the vegan R package; default parameters with no scaling) and clustered using agglomerative hierarchical clustering (agnes method in the cluster R package; metric = “manhattan”). We extracted the cluster containing highest proportion of simulated structures for each dataset and plotted these on top of each other with high transparency so that the most common positions for each bead are the most visible in the plot. Coloring scheme was approximated to the scheme used in Table 1 in . Bead sizes (in base-pairs) were: min 2; median 5151; mean 7043; max 29,050; bead radii in nm: min 0.01, median 25.76, mean, 35.21, and max 145.20. For distance calculations for comparison with FISH probes, we used beads 14 and 58 to calculate their distances across all structures for each cell line.
FISH probe design
The model nucleus was divided into two compartments at r half_v = 3.97 μm distance from nuclear center (considering a 5 μm radius), each compartment being of equal volume. This provided two regions for bead placement: beads with centers located < r half_v from the nucleus center were classified as central, and peripheral otherwise. Proportions of each bead placed in peripheral or central region across 400 structures were used to identify the most stable beads in the periphery or center. Beads were further filtered to select beads associated with LMNA  (GEO GSE57149; track GSM1376181). We also designed probes to beads that were neither stable in the periphery nor center (“intermediate” area). Additionally, due to the variable copy number of genomic segments in HeLa cells, probes were designed to avoid areas with high copy number variations. Positions of FISH probes are shown in Additional file 1: Figure S4a and FISH probe information is shown in Additional file 1: Table S1.
FISH procedure and signal detection
Cells were incubated in hypotonic buffer (0.25% KCl, 0.5% tri-sodium citrate) for 10 min and fixed in ice-cold methanol:acetic acid (3:1). Cells were dropped on slides. BAC FISH probes (BacPac Resource Center) (Additional file 1: Table S1) were labeled using a Nick Translation Kit and Biotin-16-dUTP (Roche). Per slide, a 200–300 ng labeled probe was mixed with 8 μg human Cot-1 DNA and 30 μg salmon sperm DNA (Invitrogen) and precipitated. A DNA pellet was dissolved in 11 μL hybridization mix (50% deionized formamide (Ambion), 2× SSC, 1% Tween 20, 10% dextran sulphate) at 42 °C for 20 min and pre-annealed for 1 h at 37 °C. Slides were RNase-treated and washed twice in 2× SSC, dehydrated in 70%, 90%, and 100% ethanol, and air-dried. Slides were denatured for 1 min 20 s in 70 °C 70% deionized formamide/2× SSC, pH 7.5, dehydrated in ice-cold 70%, 90%, and 100% ethanol, and air-dried. Probes were denatured for 5 min at 70 °C and pre-annealed for 15 min at 37 °C. Ten microliters of probe were applied onto coverslips (22 × 22 mm) which were then mounted on a slide. Slides were hybridized overnight at 37 °C. Slides were washed in 2× SSC (45 °C 2 min then 3× 5 min) and in 0.1× SSC (60 °C for 4× 4 min). Slides were blocked in 5% skim milk in 4× SSC for 15 min at 37 °C and incubated at 37 °C for 30–60 min with Avidin Alexa Fluor 488 conjugate (Invitrogen) (1.7 μg/mL in blocking buffer). Slides were washed in 4× SSC/0.1% Tween 20 for 3× 5 min and incubated with Biotinylated Anti-Avidin D conjugate (goat; 1.0 μg/mL in blocking buffer) (Vector) for 30 min at 37 °C. Slides were washed and incubated with Avidin Alexa Fluor 488 conjugate as above. Slides were mounted with 0.2 μg/mL DAPI in Dako Fluorescent Mounting Medium.
A total of 484 FISH images were analyzed using FISHfinder  to detect probes and calculate their position relative to the nucleus edge (n = 1105 FISH signals). Images were taken in DeltaVision image stack format (.dv). Significance of FISH signal localization in central, intermediate, and peripheral regions was tested by Mann–Whitney–Wilcoxon tests.
Browser views of ChIP-seq data are shown using Integrated Genomics Viewer . Genes are from Illumina iGenomes gene annotation with UCSC source for hg19.
We thank Kristin Vekterud for her technical assistance and Dr. Ragnhild Eskeland for her advice on FISH.
This work was supported by INSERM, CNRS, Sorbonne Universités, UPMC Université Paris 6, Université Paris 7, the Research Council of Norway, The Norwegian Cancer Society, EU Scientia Fellowship FP7-PEOPLE-2013-COFUND No. 609020 (to NB), and the University of Oslo.
Availability of data and materials
Chrom3D software and documentation can be accessed under an MIT license at https://github.com/CollasLab/Chrom3D. The version of the Chrom3D source code used in the manuscript has been assigned doi.org/ 10.5281/zenodo.168212. ChIP-seq data for LMNA and LMNB1 from HeLa cells  were downloaded from NCBI GEO GSE57149 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57149). We note that in this dataset, two tracks have been inadvertently swapped; correct tracks are for LMNA_sonication: GSM1376181, and for LMNA_digestion: GSM1376179. ChIP-seq data for fibroblasts and HeLa cells expressing LMNA wt, R388P, and L647R are available in NCBI GEO GSE81671 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE81671). RNA-seq for HeLa cells were downloaded from NCBI GEO GSE33480 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33480). RNA-seq data for control and FPLD fibroblasts are available in NCBI GEO GSE81671 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE81671). HeLa Hi-C data were accessed under dbGap number phs000640 after permission. Hi-C data for IMR90 fibroblasts  were from NCBI GEO GSE63525 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525). IMP code used in the Chrom3D-IMP comparison is available at http://folk.uio.no/jonaspau/imp-lamin.
JP designed the study and software. JP and MS performed scripting and analyses and wrote the manuscript. AS did LAD analysis. ARO, AB, ALS, ED, and NB did wet-lab experiments. CV and BB supervised lab work. PC designed and supervised the study and wrote the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Use of fibroblasts was approved by the Institutional Review Board of Hôpital Saint Antoine (Paris, France). The genome sequence used here was from HeLa cells. Henrietta Lacks, and the HeLa cell line established from her tumor cells without her knowledge or consent in 1951, have made significant contributions to scientific progress and advances in human health. We are grateful to Henrietta Lacks, now deceased, and to her family members for their contributions to biomedical research. This study was approved by the NIH HeLa Genome Data Access Working Group.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.