Introduction

Relapsed pediatric B-cell acute lymphoblastic leukemia (B-ALL) remains one of the leading causes of cancer mortality in children1,2. While five-year survival rates for newly diagnosed pediatric B-ALL have improved and now approach 90%, up to 20% of patients will suffer relapse and face a dismal prognosis3. Although recent therapeutic approaches to treating relapsed B-ALL seem encouraging, leukemia sub-clones continue to emerge through the selective pressures of therapy4. Therefore, investigating the mechanisms responsible for cancer recurrence and therapy resistance is crucial for preventing and treating B-ALL relapse.

Previous studies have focused on the discovery of relapse-enriched somatic mutations and copy number changes, which proved to be fruitful in the identification of mutations involved in resistance to one or more drugs used in treatment5,6,7,8. Our recent work characterized the evolution of the epigenetic landscape of B-ALL from diagnosis to relapse. While changes in the epigenome varied widely across patients, we were able to identify shared relapse-specific superenhancers along with concordant gene expression changes. We also found convergent evolution of histone modifications on pathways known to be associated with resistance, such as p53, JAK/STAT, and Ras, thus highlighting the importance of chromatin changes in disease progression9. Although studies of the linear genome have been enormously productive in yielding fundamental insights into B-ALL biology and therapeutic options, recent studies suggest that an additional critical level of control of the epigenome is connected to the 3D structure of chromosomes.

Thus far, no one has explored B-ALL progression through the lens of 3D chromatin organization. The advent of chromosome conformation capture (3C) techniques, such as Hi-C, revealed that the folding of large complex chromosomes is nonrandom and involves a hierarchy of biologically relevant structures including chromosome territories, A and B compartments, topologically associating domains (TADs), and enhancer-promoter chromatin loops10,11,12,13,14. These organizational layers have been shown to be tightly linked to epigenetic and transcriptional regulation. Additionally, a growing number of cancers, including gliomas, non-Hodgkin lymphoma, and T-cell ALL, have been shown to harbor disruptions in various layers of chromosome topology15,16,17.

In B-cell lymphomas, a recurrent histone H1 (HIST1H1B-E) linker protein mutation was shown to cause large-scale chromatin decompaction coupled with a gain of H3K36me2 and/or loss of H3K27me3, which created a permissive landscape for the expression of stem cell genes normally silenced during development18. Loss of H1c and H1e in mice enhanced fitness and self-renewal properties of GC B-cells, causing aggressive lymphomas. Similarly, recurrent alterations in TAD boundaries and TADs in T-ALL were shown to be linked to changes in gene expression and to affect downstream oncogenic effectors including NOTCH1 signaling17. In B-ALL, we showed that the relapse-specific NSD2 E1099K mutation drives relapse by disrupting 3D chromatin organization at the compartment level19. More recently, recurrent and subtype-specific alterations in A/B compartments, TAD activity, and chromatin loops were identified in AML20. Additionally, inhibition of DNA methylation with a hypomethylating agent restored 3D genome organization and gene regulation in AML cells. We now aim to capitalize on chromatin conformation capture techniques to discover how each layer of 3D chromatin organization evolves over the course of B-ALL progression and contributes to therapy resistance and relapse.

To uncover 3D chromatin architecture-related mechanisms underlying disease progression, we performed Hi-C, ATAC-seq, and RNA-seq on 12 matched primary pediatric leukemia specimens at diagnosis and relapse. Mapping of structural variations (SVs) using Hi-C data revealed previously unidentified stable, diagnosis-specific, and relapse-specific SVs as well as relapse-specific fusion transcripts (e.g. MBTD1::WDR89), indicating a possible role in clonal evolution. Moreover, Hi-C analysis revealed genome-wide chromatin remodeling, specifically in terms of A/B compartments, TAD interactivity, and chromatin loops. We showed that 12.17% of compartment changes were shared by at least 25% of the patients, including at the JAK2 locus. Integration with ATAC-seq and RNA-seq datasets revealed strong correlation with both gene expression and chromatin accessibility. Importantly, integration of looping data with ATAC-seq and HiChIP for H3K27ac demonstrated that rewiring events can occur at stable chromatin accessibility sites. Additionally, we identified recurrent A/B compartment and TAD interactivity changes across the patient cohort, supporting a crucial role in the clonal evolution of B-ALL. Recurrent TAD interactivity changes were identified in genes previously found dysregulated in cancer, such as PCDH9, as well as pathways previously implicated in drug resistance, such as Wnt-beta catenin signaling, highlighting the activation of these vital pathways through 3D chromatin architecture alterations. Lastly, we discovered novel genes dysregulated by recurrent 3D changes, such as MN1 and ATXN1, which were found to play a role in drug response, validating the functional significance of our findings. These results significantly extend our knowledge of shifts in the genetic and epigenetic landscape that drive therapy resistance in childhood B-ALL and identify novel therapeutic targets.

Results

Mapping structural variations in pediatric ALL using Hi-C

To broadly assess global chromatin architecture throughout the progression of pediatric B-ALL, we performed Hi-C, RNA-seq, and ATAC-seq on sorted blasts (CD45dim, CD19+) from 12 matched primary bone marrow B-ALL samples at diagnosis and at relapse from patients enrolled in Children’s Oncology Group (COG) protocols or the St. Jude Children’s Research Hospital Total XV protocol21 (Supplementary Fig. 1a; Supplementary Table 1). Recurrent cytogenetic abnormalities in pediatric B-ALL serve to identify subgroups important for risk stratification of therapy22,23; therefore, we first examined the Hi-C data for structural variants. Patient Hi-C data was processed by our in-house HiC-bench platform24. Alignment rates showed a high percentage of usable long-range read-pairs across our patient cohort (~100 million; Supplementary Fig. 1b, c).

Visual inspection of entire patient Hi-C contact matrices revealed inter-chromosomal contacts (Fig. 1a). Closer inspection of the Hi-C contact matrices of patient PATIKJ demonstrated a canonical ALL rearrangement, the BCR-ABL1 translocation between chromosomes 9 and 22, at both diagnosis and relapse (Fig. 1b). Reconstruction of Hi-C contact matrices revealed the presence of inter-chromosomal contacts across the breakpoint and the formation of a new topologically associating domain, referred to as a neoTAD25 (Fig. 1c). Fusion transcript analysis26 with RNA-seq data confirmed the presence of the BCR::ABL1 fusion transcript in this patient (Supplementary Data 1). This example of a well-known B-ALL translocation, which is screened for at diagnosis by conventional methods, provided evidence that our Hi-C-based approach to detecting chromosomal abnormalities was successful.

Fig. 1: Mapping of structural variations at Diagnosis and Relapse using Hi-C.

a Whole-genome Hi-C contact matrix for patient PATIKJ. b Zoomed-in Hi-C contact matrix demonstrating the BCR-ABL1 translocation between chromosome 9 and chromosome 22 in patient PATIKJ at diagnosis and at relapse (left and right respectively). c Reconstructed Hi-C contact matrix presenting an example of a shared BCR-ABL1 translocation in patient PATIKJ between chromosomes 9 and 22 at diagnosis and relapse (left and right respectively). Green dashed triangle highlights a neoTAD. d Translocations and their log-odds score identified in diagnosis and relapse per patient. Those also found by cytogenetics or fusion transcripts are marked.

To further our investigation, we utilized Hi-C-Breakfinder27 and HiNT28 to call inter-chromosomal translocations from each Hi-C dataset in our cohort. We overlapped the translocations predicted by each method with the cytogenetics provided for each patient, and found that Hi-C-Breakfinder performed better and revealed previously identified as well as novel translocations (Supplementary Fig. 2a, b; Supplementary Data 2). Additionally, EagleC was used to call all SVs and was found to produce approximately 50% overlapping translocation calls with Breakfinder. While 50 stable translocations were called, we report for the first time diagnosis-specific (5) and relapse-specific (21) translocations, providing further evidence for clonal evolution as a mechanism for drug resistance (Fig. 1d). To further investigate these 26 dynamic translocations, we visually inspected each of them using the Juicebox web application29 and confirmed one of the diagnosis-specific and 11 of the relapse-specific translocations (Supplementary Data 3). Interestingly, two of the relapse-specific translocations had faint signal at diagnosis upon visual inspection but were below the level of detection for Breakfinder, supporting clonal selection of a minor subclone present at diagnosis. For example, while a translocation was called between chromosomes 16 and 22 in SJETV043 at relapse but not at diagnosis, upon visual inspection we were able to detect faint signal at the breakpoint at diagnosis (Fig. 2a). Another example in the same patient revealed a faint signal at diagnosis at the breakpoint of the translocation between chromosomes 1 and 12 called at relapse (Supplementary Fig. 2c).
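The grouping of calls into stable, diagnosis-specific, and relapse-specific translocations can be illustrated with a minimal sketch. This is not the matching procedure used by Hi-C-Breakfinder, HiNT, or EagleC. The `Translocation` record, the 1 Mb breakpoint tolerance, and the coordinates are all assumptions invented for illustration.

```python
from typing import NamedTuple


class Translocation(NamedTuple):
    """One inter-chromosomal call (hypothetical record, not a tool's output format)."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def same_event(x, y, tol):
    """Treat two calls as one event if they join the same chromosomes
    and both breakpoints lie within `tol` base pairs of each other."""
    return (
        x.chrom_a == y.chrom_a
        and x.chrom_b == y.chrom_b
        and abs(x.pos_a - y.pos_a) <= tol
        and abs(x.pos_b - y.pos_b) <= tol
    )


def classify_calls(diagnosis, relapse, tol=1_000_000):
    """Split calls into stable, diagnosis-specific, and relapse-specific.
    The default tolerance is an assumed value, not one taken from the study."""
    stable, diagnosis_specific = [], []
    matched = set()  # indices of relapse calls that have a diagnosis partner
    for d in diagnosis:
        hits = [i for i, r in enumerate(relapse) if same_event(d, r, tol)]
        if hits:
            stable.append(d)
            matched.update(hits)
        else:
            diagnosis_specific.append(d)
    relapse_specific = [r for i, r in enumerate(relapse) if i not in matched]
    return {
        "stable": stable,
        "diagnosis_specific": diagnosis_specific,
        "relapse_specific": relapse_specific,
    }


# Toy coordinates, invented for illustration only.
dx_calls = [
    Translocation("chr9", 1_000_000, "chr22", 2_000_000),
    Translocation("chr5", 5_000_000, "chr9", 7_000_000),
]
rel_calls = [
    Translocation("chr9", 1_050_000, "chr22", 2_020_000),
    Translocation("chr14", 3_000_000, "chr17", 4_000_000),
]
result = classify_calls(dx_calls, rel_calls)
print({group: len(calls) for group, calls in result.items()})
```

A call absent from one time point in such a comparison may still be present below the caller's detection limit, which is why the dynamic calls in the text were additionally inspected visually.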

Fig. 2: Mapping of structural variations using Hi-C reveals Diagnosis and Relapse-specific translocations.

a Hi-C contact matrix demonstrating relapse-specific translocation between chromosomes 16 and 22 in patient SJETV043 with faint signal visually detected at diagnosis (arrow). b Reconstructed Hi-C contact matrix presenting a diagnosis-specific translocation in patient PATJJX between chromosomes 5 and 9 at diagnosis and relapse (left and right respectively). c Reconstructed Hi-C contact matrix presenting a relapse-specific translocation in patient SJETV043 between chromosomes 14 and 17 at diagnosis and relapse (left and right respectively). Green dashed triangle highlights a NeoTAD. d Reconstructed Hi-C contact matrix demonstrating a relapse-specific NeoTAD (dashed green triangle). New contacts overlap region of ATAC peaks (purple tracks) and increased transcription at relapse (red tracks) overlapping the E2F4 locus (dashed blue triangle). e Bar plot showing upregulation of E2F4 transcripts at relapse in PASFIF.

While the presence of inter-chromosomal contacts across breakpoints was evidenced by the reconstruction of Hi-C contact matrices25, we wanted to determine whether these contacts resulted in any fusions at the RNA level. Overall, the fusion transcript analysis using STAR-Fusion called a variable number of fusion transcripts across patients (Supplementary Fig. 2d; Supplementary Data 1). Of the 50 translocations shared from diagnosis to relapse, nine were found to have fusion transcripts. The one diagnosis-specific and three of the 11 relapse-specific translocations also had fusion transcripts detected, none of which were shared across patients. An example of a diagnosis-specific translocation in patient PATJJX showed the presence of a fusion between chromosomes 5 and 9 at diagnosis only (Fig. 2b). Fusion transcript analysis revealed the existence of a previously unreported fusion transcript in the diagnosis sample, ZCCHC7::PDGFRB (Supplementary Data 1 and 3). PDGFRB alterations are commonly found in Ph-like B-ALL cases, which has clinical implications as these patients can be responsive to tyrosine kinase inhibition30,31. Lastly, an example of a relapse-specific translocation in patient SJETV043 showed the presence of inter-chromosomal contacts between chromosomes 14 and 17 only in the relapse sample (Fig. 2c). Fusion transcript analysis confirmed the presence of the known ETV6::RUNX1 at both diagnosis and relapse, and demonstrated the existence of a previously unreported fusion transcript, MBTD1::WDR89, in the relapse sample (Supplementary Data 1 and 3). MBTD1 is found as a recurrent translocation partner with ZMYND11 in acute myeloid leukemia and has been demonstrated to play a role in transcriptional activation of oncogenes such as MYC32.

For the remaining relapse-specific, visually validated translocations that did not have fusion transcripts detectable by STAR-Fusion, we sought to determine if there were examples of dysregulated gene expression surrounding breakpoints. One example was found for the relapse-specific translocation between chromosomes 5 and 16 in patient PASFIF (Fig. 2d; Supplementary Fig. 2e). Upon reconstruction around the breakpoint, new contacts were observed across the two chromosomes in the relapse sample compared to the diagnosis sample. Interestingly, several genes within the neoTAD show differential gene expression. Two ATAC peaks near a transcriptionally active region on chromosome 5 are brought into contact with genes near the breakpoint on chromosome 16 (Fig. 2d), suggestive of potential promoter/enhancer hijacking, as Xu et al. previously demonstrated20. One of these genes, E2F transcription factor 4 (E2F4), which is upregulated at relapse (Fig. 2e), has been implicated in acute myeloid leukemia (AML). AML patients with high expression of E2F4 had significantly lower overall survival, and in in vitro models its overexpression led to increased proliferation and differentiation blockage33. These results extend the landscape of cytogenetic abnormalities in pediatric B-ALL and could aid the identification of subgroups with clinical implications for risk stratification and therapeutic strategy.

B-ALL relapse is associated with recurrent changes in A/B compartments

Next, we sought to determine changes in the various layers of 3D chromatin architecture throughout the progression of pediatric B-ALL. We first assessed compartmentalization of the genome between B-ALL samples at diagnosis and relapse. A/B compartment status was analyzed in individual patient pairs using Cscore34. Principal Component Analysis (PCA) of Cscore compartment scores revealed two distinct groups of patient samples in which the matched diagnosis and relapse pairs cluster together and the patient with a KMT2A rearrangement, SJETV043, clusters separately (Fig. 3a). Overall, the number of A and B compartment calls was comparable between diagnosis and relapse samples (Fig. 3b). We next assessed A/B compartment switching in individual diagnosis-relapse patient pairs. We observed 4.88% compartment switching between diagnosis and relapse samples across the 12 patient pairs (Fig. 3c; Supplementary Fig. 3a). To confirm that 3D changes were not driven by copy number variations (CNVs), we called CNVs using EagleC. Fewer than 1.3% of compartment bins that underwent a switch were found to overlap with a CNV (Supplementary Data 4). B-ALL is a heterogeneous disease classified by various chromosomal abnormalities, such as hyperdiploidy, hypodiploidy, translocations, and intra-chromosomal amplification. Given this heterogeneity, B-ALL relapse is expected to involve various mechanisms of clonal selection. While the small size of our cohort is limiting, we were particularly interested in identifying recurrent compartment switches across the patient cohort and/or switches that were concordant with gene expression and chromatin accessibility, as they could be indicative of a role in clonal evolution. Overall, we identified 12.17% recurrent switches shared by at least 3 out of 12 patients (25% of patients). Specifically, 5.73% (420/7335) of A to B switches and 5.99% (483/8059) of B to A switches were shared by at least 3 patient samples (Fig. 3d).
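As an illustration of how per-patient compartment switches and recurrent switches could be tallied, the following minimal sketch works on per-bin compartment scores. It assumes that a positive score denotes compartment A and any other value compartment B, and that all samples share the same bins. The patient labels (P1 to P4) and scores are invented toy values, not study data, and the sketch is not the Cscore tool itself.

```python
from collections import Counter


def call_compartment(score):
    # Assumption: positive score = A compartment, otherwise B (zero counts as B here).
    return "A" if score > 0 else "B"


def find_switches(dx_scores, rel_scores):
    """Return {bin_index: direction} for bins whose compartment call changes."""
    switches = {}
    for bin_index, (dx, rel) in enumerate(zip(dx_scores, rel_scores)):
        before, after = call_compartment(dx), call_compartment(rel)
        if before != after:
            switches[bin_index] = f"{before}->{after}"
    return switches


def recurrent_switches(per_patient, min_patients=3):
    """Keep (bin, direction) pairs observed in at least `min_patients` patients."""
    counts = Counter()
    for switches in per_patient.values():
        for key in switches.items():  # key is a (bin_index, direction) tuple
            counts[key] += 1
    return {key: n for key, n in counts.items() if n >= min_patients}


# Toy scores for four hypothetical patients over five genomic bins.
scores = {
    "P1": ([0.5, -0.3, 0.2, -0.4, 0.1], [0.4, 0.2, 0.3, -0.5, -0.2]),
    "P2": ([0.6, -0.2, 0.1, -0.3, 0.2], [0.5, 0.3, 0.2, -0.2, 0.3]),
    "P3": ([0.4, -0.1, -0.2, -0.6, 0.3], [0.3, 0.1, -0.1, -0.4, -0.1]),
    "P4": ([0.7, 0.2, 0.3, -0.2, 0.4], [0.6, 0.3, 0.2, -0.1, 0.5]),
}
per_patient = {pid: find_switches(dx, rel) for pid, (dx, rel) in scores.items()}
fraction_switched = {pid: len(sw) / 5 for pid, sw in per_patient.items()}
recurrent = recurrent_switches(per_patient, min_patients=3)
print(fraction_switched)
print(recurrent)
```

In this toy input only the B to A switch at bin 1 reaches the three-patient threshold, mirroring the "shared by at least 3 patients" criterion described in the text.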

Fig. 3: Compartments undergo discrete changes upon B-ALL relapse.

a PCA of A/B compartment calls with Cscore. b Violin plot showing A and B compartment calls for diagnosis and relapse samples. Significant differences were calculated with a paired one-sided t test comparing number of A or B eigenvector-1 bins between diagnosis and relapse. c Bar plot showing number of compartment switches for each patient pair upon relapse. Bar plot shows mean, bars represent standard deviation, and each point represents a patient (n = 12). d Heatmap representation of compartment switches shared by at least 3 patients upon relapse. e Correlation boxplots between gene expression and compartment changes (left) and chromatin accessibility and compartment changes (right) from relapse to diagnosis of 12 individual patient pairs. Significant differences were calculated by unpaired one-sided t test comparing genes or peaks from compartment changes AB or BA to genes or peaks from stable compartments (*****p < .00001). Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a gene or peak. f IGV tracks demonstrating concordant B to A compartment switches and increased gene expression in 3 patient pairs at the JAK2 gene locus.

We then integrated the Hi-C data with RNA-seq and ATAC-seq data to investigate whether compartment switches are associated with changes in gene expression and chromatin accessibility (Supplementary Fig. 3b–e). We performed differential expression analysis of all expressed genes (FPKM > 1) and differential chromatin accessibility analysis of all ATAC-seq peaks (abs(log2(fold change)) > 1.0, false-discovery rate (FDR) < 0.01) within compartment switches at the individual patient pair level and found that B to A compartment switches upon relapse correlated with an increase in gene expression, whereas A to B compartment switches upon relapse were associated with a decrease in gene expression relative to expression changes within stable compartments (Fig. 3e, left panel). We witnessed a similar trend with chromatin accessibility changes (Fig. 3e, right panel). Interestingly, we found a recurrent B to A compartment switch at the JAK2 locus in 3 of the 12 patient pairs. This B to A compartment switch was also concordant with increased gene expression (Fig. 3f). The JAK2 gene is a member of the non-receptor tyrosine kinase family and mediates intracellular signaling upon activation of cytokine receptors35. Several studies have identified gain of function mutations and translocations affecting JAK2 in pediatric B-ALL36,37.

B-ALL relapse drives recurrent changes in TAD interactivity

Following compartment analysis, we investigated TAD interactivity changes (i.e. changes in intra-TAD chromatin interactions) between B-ALL diagnosis and relapse patient pairs. We first called TADs at diagnosis and relapse and identified ~3000 TADs in each patient (Supplementary Fig. 4a). Comparison of TAD interactivity between B-ALL samples at diagnosis and relapse identified a range of significantly differential TADs across patients, with 8 out of 12 patients having more gains in TAD activity at relapse (Fig. 4a, b; FDR < 0.1 and abs(log2(fold change)) > 0.25; Supplementary Fig. 4b). Fewer than 2.4% of differential TADs overlapped with CNVs (Supplementary Data 5). Next, to investigate whether TAD interactivity changes were associated with changes in gene expression and chromatin accessibility, we performed differential expression analysis of all expressed genes (FPKM > 1) and differential chromatin accessibility analysis of all ATAC-seq peaks (abs(log2(fold change)) > 1.0, FDR < 0.01) within differentially active TADs at the individual patient pair level. Increased TAD interactivity at relapse was significantly associated with positive fold changes in gene expression, whereas decreased TAD interactivity at relapse was associated with negative fold changes in gene expression relative to expression changes within stable TADs (Fig. 4c, left). We saw a similar trend with chromatin accessibility changes (Fig. 4c, right).
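To make the notion of a TAD interactivity change concrete, the sketch below computes a mean intra-TAD contact value from a small contact matrix and the log2 fold change between relapse and diagnosis. It is an assumption-laden illustration: the exact intra-TAD statistic and the significance test used by HiC-bench are not reproduced here, the FDR is supplied as an input rather than computed, and the matrices are invented. The default cutoffs follow the values quoted in the text of this section.

```python
import math


def intra_tad_mean(matrix, start, end):
    """Mean contact value between distinct bins inside the TAD spanning bins [start, end)."""
    values = [matrix[i][j] for i in range(start, end) for j in range(i + 1, end)]
    return sum(values) / len(values)


def tad_log2_fold_change(dx_matrix, rel_matrix, start, end):
    """log2(relapse / diagnosis) of the mean intra-TAD contact value.
    No pseudocount is added, so both means must be positive."""
    dx_mean = intra_tad_mean(dx_matrix, start, end)
    rel_mean = intra_tad_mean(rel_matrix, start, end)
    return math.log2(rel_mean / dx_mean)


def classify_tad(log2fc, fdr, lfc_cut=0.25, fdr_cut=0.1):
    """Label a TAD as gain, loss, or stable given a fold change and a precomputed FDR."""
    if fdr < fdr_cut and abs(log2fc) > lfc_cut:
        return "gain" if log2fc > 0 else "loss"
    return "stable"


# Toy 4x4 normalized contact matrices; the TAD covers bins 0..2.
dx_matrix = [
    [0, 2, 2, 1],
    [2, 0, 2, 1],
    [2, 2, 0, 1],
    [1, 1, 1, 0],
]
rel_matrix = [
    [0, 4, 4, 1],
    [4, 0, 4, 1],
    [4, 4, 0, 1],
    [1, 1, 1, 0],
]
lfc = tad_log2_fold_change(dx_matrix, rel_matrix, start=0, end=3)
label = classify_tad(lfc, fdr=0.05)  # the FDR value here is a made-up input
print(lfc, label)
```

Passing the cutoffs as parameters keeps the sketch usable with whichever thresholds apply to a given analysis.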

Fig. 4: B-ALL patients share concordant changes in TAD activity.

a Bar plot presenting number of intra-TAD activity changes per patient pair upon relapse (abs(L2FC) > 0.25, FDR < 0.01). Bar plot shows means, bars represent standard deviation, and each point represents a patient (n = 12). b Volcano plot presenting increased, decreased, and stable TADs between diagnosis and relapse samples (red, blue, and gray respectively). c Correlation boxplots showing gene expression and chromatin accessibility changes within differential TADs (paired t test, p < 0.05) from relapse to diagnosis of 12 individual patient pairs. Significant differences were calculated by unpaired one-sided t test comparing gene expression or ATAC peaks within TADs with interactivity changes, losses, or gains, to gene expression or ATAC peaks within stable TADs(*****p < .00001). Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a gene or peak. d Hi-C contact matrices presenting a TAD decreased in activity at relapse in 4 of 12 patient pairs at the PCDH9 locus (Diagnosis, Relapse, and L2FC (Relapse/Diagnosis) from top to bottom). e Box plots demonstrating decreased PCDH9 mean TAD activity in cpm (top left), decreased gene expression in tpm (top right), decreased PCDH9 compartment Cscore (bottom left), and decreased chromatin accessibility in cpm (bottom right) in the 4 patient pairs at diagnosis and relapse. Significant differences were calculated with a paired one-sided t test comparing mean TAD activity, TPM, Cscore, and cpm between diagnosis and relapse. Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a patient with pairs connected by lines. f IGV genome browser tracks demonstrating A to B compartment switch, decreased chromatin accessibility, and decreased gene expression in the 4 patients with decreased TAD interactivity at the PCDH9 gene locus.

Of note, patient SJPHALL005 had more than 800 changes in TAD interactivity when applying the same thresholds, with the majority being losses in activity (Supplementary Fig. 4c). We observed a similar trend with patient SJPHALL005 when integrating the TAD interactivity data with either gene expression or chromatin accessibility data (Supplementary Fig. 4d). This patient also showed the greatest amount of compartment changes at 10.85%, with the majority being A to B (Supplementary Fig. 3a). Interestingly, we saw increased CEBPA expression and decreased PAX5 expression at relapse (Supplementary Fig. 4e), suggesting a more myeloid-like profile upon relapse. These observations are consistent with patient SJPHALL005 having a therapy-related or lineage-switch secondary malignancy, as previously described38. Furthermore, we also observed myeloid lineage architectural changes, including a gain of TAD activity, a switch to A compartment, and increased expression of DACH1 at relapse (Supplementary Fig. 4f, g). DACH1 has been shown to regulate cell cycle progression in myeloid cells39.

As with the compartment analysis, we next aimed to identify TAD interactivity changes that were shared amongst our patient cohort possibly indicating major modes of tumor escape from therapy. Notably, we found a recurrent decrease in TAD interactivity at the PCDH9 locus in 4 of the 12 patient pairs (Fig. 4d). All 4 patients demonstrated decreased mean TAD interactivity, gene expression, Cscore, and accessibility (Fig. 4e). Closer examination of the patient with the most dramatic change in TAD interactivity, patient PASPNY, revealed concordant changes in compartments, chromatin accessibility, and gene expression (Fig. 4f). The PCDH9 (Protocadherin 9) gene encodes a transmembrane protein involved in cell adhesion and calcium ion binding and has been identified as one of the top downregulated genes in pediatric B-ALL40,41,42,43,44.

Lastly, we performed pathway enrichment analysis45,46,47 to identify pathways associated with changes in 3D architecture. Pathway enrichment analysis on genes that were concordant in expression with 3D changes, such as A/B compartment switches or TAD interactivity changes, revealed enrichment of previously identified key pathways including mTORC1, Wnt-beta catenin, and p53 signaling (Supplementary Fig. 4h)9,48. These data demonstrate that the activation of these important pathways goes beyond genetic alterations and validate these pathways as novel targets to prevent and treat relapse.

Chromatin loop data reveals enriched factors at relapse

We next wanted to investigate the rewiring of chromatin loops and the factors responsible for mediating such changes during selection from therapy. We performed a looping analysis with FitHiC249,50. The number of loops called varied with sequencing depth across the patient cohort (Fig. 5a). We performed differential looping analysis to identify loops that increased, decreased, or remained the same upon relapse (Fig. 5b) and visually confirmed them using Aggregate Peak Analysis (APA) (Supplementary Fig. 5a). We observed a strong correlation of loop changes with both gene expression changes and chromatin accessibility changes (Fig. 5c). Additionally, we identified a positive association between looping changes and both compartment and TAD interactivity changes (Supplementary Fig. 5b).

Fig. 5: Chromatin loop data reveals enriched factors at relapse.

a Bar plot presenting number of loops called with Fit-Hi-C2 at diagnosis and relapse (q < 0.01). b Bar plot presenting number of loops increased, decreased or stable upon relapse (blue, red, gray respectively). c Fit-Hi-C loop correlation with gene expression and chromatin accessibility (left and right respectively). Significant differences were calculated by unpaired one-sided t test comparing genes or peaks from loops that increased or decreased to genes/peaks from stable loops (*****p < 0.00001) from relapse to diagnosis of 12 individual patient pairs. Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a gene or peak. d Enrichment analysis with LOLA of decreased and increased ATAC-seq peaks (left and right) concordant with Fit-Hi-C loops and gene expression.

To identify key regulators of transcription attributed to loop changes, we performed motif enrichment analysis with Locus Overlap Analysis (LOLA)51 with ATAC-seq peaks overlapping concordant chromatin loops. We identified several chromatin binding factors enriched in regions with decreased chromatin accessibility and decreased chromatin looping including RXR and EZH2 (Fig. 5d, left panel). We also identified several enriched factors in regions with increased chromatin accessibility and increased chromatin looping including FOS and NF-E2 (Fig. 5d, right panel).

H3K27ac HiChIP data reveals exclusive looping at stable chromatin accessibility sites

We were also interested in the behavior of enhancer-anchored loops upon B-ALL relapse based on our previous work demonstrating the importance of enhancers and superenhancers in relapse9. We performed H3K27ac HiChIP with one patient pair, PASFIF, to evaluate H3K27ac-anchored loops. Looping analysis was performed with FitHiC249,50. To examine the relationship between alterations in the H3K27ac chromatin loops and chromatin accessibility, we categorized the H3K27ac chromatin loops into those that were exclusively increased (exclusive to relapse), exclusively decreased (exclusive to diagnosis), dynamically increased (increased), dynamically decreased (decreased), or stable and assessed chromatin accessibility changes. Interestingly, integration of H3K27ac HiChIP with ATAC-seq data revealed significantly increased and decreased exclusive loops at regions with stable chromatin accessibility sites (Fig. 6a). These data suggest that significant 3D rewiring events can occur at stable chromatin accessibility sites that would otherwise be entirely ignored if analysis were focused only on sites with changes in chromatin accessibility using ATAC-seq.
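The five loop categories described above can be sketched as a simple decision rule followed by a cross-tabulation against ATAC-seq peak status. This is only an illustration under stated assumptions: a loop not called in a sample is represented by `None`, the log2 fold-change cutoff for "dynamic" loops is a hypothetical value (the text does not state one), and the loop counts and ATAC labels are invented.

```python
import math
from collections import Counter


def categorize_loop(dx_count, rel_count, lfc_cut=1.0):
    """Assign one of the five loop categories.
    dx_count / rel_count are normalized contact counts, or None when the loop
    was not called in that sample. lfc_cut is an assumed, hypothetical cutoff."""
    if dx_count is None and rel_count is None:
        raise ValueError("loop must be called in at least one sample")
    if dx_count is None:
        return "exclusively increased"  # present only at relapse
    if rel_count is None:
        return "exclusively decreased"  # present only at diagnosis
    log2fc = math.log2(rel_count / dx_count)
    if log2fc > lfc_cut:
        return "dynamically increased"
    if log2fc < -lfc_cut:
        return "dynamically decreased"
    return "stable"


# Toy loops: (diagnosis count, relapse count, ATAC status of the anchor peak).
loops = [
    (None, 12.0, "stable"),
    (8.0, None, "stable"),
    (4.0, 16.0, "increased"),
    (16.0, 4.0, "decreased"),
    (10.0, 11.0, "stable"),
    (None, 9.0, "stable"),
]

# Count loops per (loop category, ATAC status) combination.
table = Counter((categorize_loop(dx, rel), atac) for dx, rel, atac in loops)
for (loop_category, atac_status), n in sorted(table.items()):
    print(loop_category, atac_status, n)
```

Cells of the table that pair an exclusive loop category with a stable ATAC status correspond to the rewiring events at unchanged accessibility sites highlighted in the text.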

Fig. 6: PASFIF H3K27ac HiChIP data reveals exclusive loops at stable ATAC-seq peaks.

a ATAC-seq and Fit-Hi-C loop association bar plot. b Enrichment analysis with LOLA of increased, decreased, or stable ATAC-seq peaks with exclusively decreased, dynamic decreased, stable, exclusively increased, or dynamic increased loops. c IGV snapshot of the Fms Related Receptor Tyrosine Kinase 1 (FLT1) gene locus presenting ATAC-seq, RNA-seq, and H3K27ac Hi-ChIP and Hi-C virtual 4C tracks at the TSS as a viewpoint. Gray boxes highlight regions of differential contact frequency.

To expand on this finding, we performed enrichment analysis using LOLA51, restricted to stable ATAC-seq peaks overlapping concordant and exclusive H3K27ac-anchored loops. We identified several interesting chromatin binding factors enriched in regions with stable chromatin accessibility and decreased enhancer-promoter looping including NCOR1, CREBBP, and MEIS (Fig. 6b, left panel). We also identified several factors enriched in regions with stable chromatin accessibility and increased enhancer-promoter looping including PLU1, THAP1, and PHF8 (Fig. 6b, right panel).

Lastly, we present an example of stable chromatin accessibility and exclusively increased H3K27ac chromatin looping at the vascular endothelial growth factor receptor 1 (FLT1) gene locus52 in patient PASFIF (Fig. 6c). Genome browser tracks show stable chromatin accessibility and increased gene expression. H3K27ac HiChIP and Hi-C virtual 4C tracks show differential contact frequencies between diagnosis and relapse samples. FLT1 activation in ALL has been shown to increase survival of blasts by influencing their localization into the bone marrow and exit back into circulation52.

Functional validation of candidate genes suggests a role in drug resistance

In an effort to discover potential gene targets involved in drug resistance or relapse that are downstream of 3D chromatin changes, we identified genes that demonstrated a 3D change from diagnosis to relapse (A/B compartment, TAD interactivity, or chromatin loop), that were concordant with gene expression, and that were shared by at least 3 patients, indicating a possible major role in clonal evolution. This resulted in a list of 361 upregulated and 387 downregulated genes. To validate the expression changes, we first overlapped our 3D gene-based lists with a list of the most significantly differentially expressed genes from a meta-analysis of a large cohort of three microarray datasets generated from pediatric B-ALL diagnosis/relapse pairs53. Thirty-two percent (116 of 387) of the 3D-mediated downregulated and 17% (61 of 361) of the upregulated genes were also differentially regulated in this much larger cohort of samples (Fig. 7a; Supplementary Table 2), indicating a prominent role of 3D reorganization in clonal evolution. Furthermore, using published data from a genome-wide CRISPR screen performed by Oshima et al.54 with the B-ALL cell line REH and various chemotherapies used in pediatric B-ALL treatment, we found that 42% (153 of 361) of the upregulated and 24% (93 of 387) of the downregulated genes were within the top 10% of guides negatively (for up) or positively (for down) selected from at least one of the drugs tested, suggesting a cell-intrinsic role in drug resistance (Supplementary Data 6).
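The overlap percentages reported for gene lists of this kind follow from a set intersection. The sketch below is a minimal illustration, assuming each list is a collection of unique gene symbols. The placeholder symbols (G1, G2, ...) are invented and do not correspond to genes from the study.

```python
def overlap_summary(gene_list, reference):
    """Return (shared count, list size, percent of gene_list found in reference)."""
    genes, ref = set(gene_list), set(reference)
    shared = genes & ref
    percent = 100.0 * len(shared) / len(genes)
    return len(shared), len(genes), percent


# Placeholder gene symbols, invented for illustration only.
upregulated_3d = ["G1", "G2", "G3", "G4", "G5"]
meta_analysis_hits = ["G2", "G4", "G9"]

n_shared, n_total, pct = overlap_summary(upregulated_3d, meta_analysis_hits)
print(f"{n_shared} of {n_total} genes ({pct:.0f}%) overlap")
```

The percentage is expressed relative to the 3D-derived list, which matches how the counts are phrased in the text ("X of 361", "Y of 387").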

Fig. 7: Functional validation of MN1 and ATXN1 suggests a role in drug resistance.

a Venn diagrams presenting overlap of 3D gene-based lists with a list of the most significantly differential genes from a meta-analysis of three microarray datasets. b Example shown for PASYFN of an A to B compartment switch shared by 4 patient pairs at the MN1 gene locus including TAD activity, V4C signal, A/B compartments, RNA-seq, and ATAC-seq tracks. c Box plot demonstrating a decrease in MN1 gene expression in tpm (left) and decrease in Cscore (right) in the 4 patient pairs with the A to B compartment switch. Significant differences were calculated using paired t test (*p < 0.05). Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a patient with pairs connected by lines. d Linear regression model demonstrating positive correlation between compartment and gene expression changes at the MN1 locus for patients with differential expression absL2FC > 1.0 (n = 7). e Hi-C contact matrices presenting a TAD increased at relapse in 5 of 12 patient pairs at the ATXN1 locus (Diagnosis, Relapse, and L2FC (Relapse/Diagnosis) from top to bottom). f Box plots demonstrating increased ATXN1 gene expression (top) and increased intra-TAD activity (bottom) in the 5 patient pairs upon relapse. Significant differences were calculated using paired t test (*p < 0.05). Boxplots show medians (horizontal line in each box), interquartile ranges (boxes), 1.5 interquartile (whiskers) and each point represents a patient with pairs connected by lines. g Cell viability (left) and apoptosis assays (right) demonstrating increased sensitivity to treatment with Prednisolone in SEM cells overexpressing MN1 compared to empty vector control (EV). h Cell viability (left) and apoptosis assays (right) demonstrating increased resistance to treatment with Prednisolone in 697 cells overexpressing ATXN1 compared to EV. 
Cell viability graphs (g, h) show one experiment representative of three independent experiments, each with technical triplicates. Dots represent the mean of technical triplicates for that experiment, with standard deviation shown. The curve represents the nonlinear fit of transformed X. The null hypothesis that one curve fits all datasets was rejected with p < 0.001 (g, h). Apoptosis graphs represent the mean and standard deviation of 4 (g) and 3 (h) individual experiments. Statistical significance is based on an unpaired t test with Welch correction, with the p value indicated.

Interestingly, at the MN1 locus we identified a recurrent A to B compartment switch in 4 of the 12 patient pairs that was concordant with decreased gene expression, chromatin accessibility, and virtual 4C signal (Fig. 7b–d). A linear regression model revealed a positive relationship between changes in compartment Cscore and gene expression (R2 = 0.7, p < 0.019, Fig. 7d). MN1 encodes a transcriptional co-regulator involved in the co-activation of various transcription factors55,56. Notably, we also found a recurrent increase in TAD interactivity with a concomitant increase in gene expression at the ATXN1 locus in 5 of the 12 patient pairs (Fig. 7e, f). Closer examination of the patient with the most dramatic change in TAD interactivity, PASPNY, revealed concordant changes in TAD interactivity, compartments, gene expression, chromatin accessibility, and virtual 4C signal (Supplementary Fig. 6a). This was further validated by increased H3K27ac peak changes in many patients from our previous study9. The Ataxin-1 (ATXN1) gene encodes a chromatin-binding factor that represses Notch signaling57,58.

We validated a functional role for both MN1 and ATXN1 in B-ALL cell lines. The B-ALL cell line SEM was engineered to overexpress MN1 or EV as control, and overexpression was verified via western blot (Supplementary Fig. 6b). We observed a significant increase in sensitivity to prednisolone upon overexpression of MN1 (Fig. 7g), but no difference was observed with thiopurines, suggesting that loss of MN1 would provide an advantage to cells undergoing steroid treatment. The B-ALL cell line 697 was engineered to overexpress ATXN1 or EV as control, and overexpression was verified via western blot (Supplementary Fig. 6c). Overexpression of ATXN1 resulted in increased resistance to prednisolone compared to EV (Fig. 7h), suggesting that upregulation of ATXN1 leads to drug resistance. There was no difference in IC50s for other chemotherapies, including thiopurines or doxorubicin (Supplementary Fig. 6d). Note that when ATXN1 or MN1 expression was modified in other cell lines (REH, RS4;11, SEM, 697), we did not observe changes in chemosensitivity, indicating that cell context plays an important role, a finding consistent with the fact that these changes were shared in only a subset of patients.

Discussion

Relapsed pediatric B-ALL remains one of the leading causes of cancer mortality in children. Previous studies have characterized the genetic, epigenetic, and transcriptional landscapes of B-ALL. However, no one has investigated B-ALL progression through the lens of 3D chromatin organization. In this work, we investigate the dynamics of 3D chromatin architecture in B-ALL progression by assessing 12 matched primary pediatric leukemia specimens at diagnosis and relapse. We demonstrate that 3D genome organization provides an additional critical level of control in B-ALL progression.

The current clinical standard for the detection of genetic abnormalities in B-ALL is fluorescence in-situ hybridization (FISH). Despite its sensitivity, FISH can only detect known genetic aberrations for which specific probes are provided. Hi-C, in contrast, is a powerful tool that gives insight into previously unknown structural variants. We sought to leverage Hi-C to investigate the stability of structural variants between diagnosis and relapsed B-ALL. Robust SV analysis with Hi-C data from diagnosis and relapse samples confirmed translocations reported by conventional cytogenetics and also revealed previously unidentified stable, diagnosis-specific, and relapse-specific translocations. These data suggest that Hi-C data can be used as a computationally less rigorous alternative to WGS methods to discover cytogenetic abnormalities in ALL and other disease models. We confirmed the presence of fusion transcripts such as BCR::ABL1 and ETV6::RUNX1 and identified novel translocation-driven fusion transcripts, offering possible new avenues for therapeutic intervention, especially for those that appear to be enriched at relapse.

In general, such translocations are thought to be an early event in tumorigenesis and are maintained at diagnosis and relapse. We show a striking number of additional translocations indicating a fundamental role in tumorigenesis. The loss of translocations at relapse indicates a dispensable role in clonal evolution while relapse-specific translocations may provide additional fitness to the leukemic clone. Interestingly, both PDGFRB and ZCCHC7 have been previously observed in rearrangements with different loci in B-ALL. The detection of ZCCHC7::PDGFRB at diagnosis would support the use of a tyrosine kinase inhibitor, although in this case the clone appeared to be extinguished with chemotherapy. MBTD1, malignant brain tumor domain containing 1, has been identified as a part of a chimeric protein with ZMYND11 (zinc finger MYND-type containing 11) that leads to active chromatin states at stemness related genes in AML59. Herein, we show a novel fusion transcript with MBTD1 and WDR89, a member of the WD40-repeat proteins whose function is largely unknown but may play a role in immunotherapy response60,61. While the functional relevance is beyond the scope of the present study, the fact that this novel transcript was relapse-specific suggests a role in disease progression. Performing Hi-C on a larger cohort of samples could enable the rapid benchmarking of cytogenetic abnormalities associated with B-ALL progression and would allow for an unbiased view of the genome to identify unknown fusions that would otherwise be missed using current methods. Lastly, while our studies show a number of diagnosis or relapse specific translocations, all pairs had shared translocations indicating that the origin of the relapsed clone was from a common ancestral precursor including case SJPHALL005 where the initial diagnostic clone was lymphoid and the patient relapsed with AML. 
Thus, the subsequent relapse likely originated from a common ancestral stem-like clone, a feature known to be associated with Ph+ ALL62. Furthermore, there were several relapse-specific translocations that were visually detected (but not called by Breakfinder) at diagnosis, supporting the idea of clonal selection of a minor subclone at diagnosis5,7.

In addition to structural changes, we demonstrate significant A/B compartment, TAD interactivity, and looping changes between diagnosis and relapse that strongly correlated with expression and accessibility. These findings support a prominent role of 3D architecture in modulating transcription. We saw that the magnitude of these architectural changes varied among pairs. This is consistent with our previous findings showing variable degrees of epigenetic changes from diagnosis to relapse, highlighting the plasticity of the chromatin landscape under the selective pressures of therapy9. Analysis of 3D-regulated genes revealed previously discovered pathways associated with drivers of clonal evolution. For example, activation of JAK/STAT signaling has been implicated in leukemia progression and we observed a recurrent B to A compartment switch at the JAK2 locus. Previous studies have shown mutations as well as translocations lead to activated JAK2 in pediatric B-ALL, which could be targeted through JAK inhibitors53,54. Our Hi-C analysis demonstrates an additional mechanism by which blasts may increase JAK2 thus extending the use of JAK inhibitors at relapse.

Importantly our work identified additional novel pathways involved in drug resistance specifically to prednisolone, a cardinal feature of ALL at relapse. A meta-analysis of gene expression in B-ALL diagnosis and relapse pairs confirms that MN1 expression is downregulated while ATXN1 is upregulated at relapse consistent with our findings53. MN1 is a transcriptional co-activator and indirectly activates retinoic acid receptor and vitamin D receptor (VDR) target genes63. Many studies have confirmed an association between polymorphisms of VDR with osteonecrosis, a side effect of glucocorticoids in ALL therapy64,65. Furthermore, retinoids have been found to promote glucocorticoid-induced apoptosis of various T-cell lines. Thus, downregulation of MN1 may impair apoptosis induced by retinoid signaling under steroid treatment66. Conversely, overexpression of ATXN1 resulted in increased resistance to prednisolone. ATXN1, thought to function in extracellular matrix remodeling, has been shown to regulate Notch and Capicua (CIC) controlled developmental processes67. We were unable to replicate phenotypic changes in multiple cell lines; however, we have previously demonstrated cell context specific changes in drug response, which can be explained in part due to 3D chromatin differences56.

By integrating looping changes with gene expression and chromatin accessibility, we were able to identify potential regulators of these changes. We found enrichment of sites for chromatin binding factors shown to be key players in cancer. One such example is EZH2, which is part of the polycomb repressive complex (PRC2) that is crucial for proliferation and differentiation and plays a complex role in cancer68. We previously demonstrated DNA hypermethylation at PRC2 target genes at relapse9, which would be consistent with decreased expression, accessibility, and 3D looping interactions where there are EZH2 motifs. Studies related to EZH2 and NSD2 mutations in leukemia and myeloma have also revealed a complex interplay between these two epigenetic modifiers69,70 and perhaps understanding the 3D chromatin changes associated with EZH2 could provide more insight for the best method to target epigenetic modifiers. The importance of epigenetic modifiers is strengthened further by the enrichment of H3K27me3 and the role this mark has been demonstrated to play in repressor hijacking events20. The highest-ranked factor for decreased looping and expression was retinoid X receptor (RXR), which is a nuclear receptor that interacts with retinoic acid receptor alpha and recruits corepressors to chromatin to regulate target genes, including genes involved in differentiation and apoptosis71. Targeting these receptors in IKZF1 mutated BCR-ABL1 ALL has also been shown to reverse stem cell like phenotype72. The highest-ranking factor motif for activated regions was nuclear factor erythroid-2 (NF-E2). Overexpression in mouse models has been shown to lead to increased stem and progenitor cells and spontaneous development of AML73.

Interestingly, the H3K27ac HiChIP with patient PASFIF demonstrated a significant number of de novo loops appearing and disappearing at regions of stable chromatin upon relapse. This finding indicates that sites of stable chromatin accessibility provide invaluable information on a significant number of 3D rewiring events that would otherwise be entirely ignored if analyses were directed exclusively to changes in chromatin accessibility (e.g., ATAC-seq). For example, CREBBP motifs were enriched in regions of decreased looping at relapse. CREBBP loss of function mutations have previously been observed at relapse in a significant number of patients and are associated with resistance to glucocorticoids74. Likewise, PHF8 motifs were enriched in regions of increased looping and PHF8 expression is associated with ALL progression through activation of the MEK/ERK pathway75.

Overall, this study demonstrates 3D genome organization as a critical factor in reshaping the transcriptional output of leukemia cells under the selective pressures of therapy. These processes appear to drive pathways shown to be involved in drug resistance but also previously undiscovered genes/pathways that may play a role in clonal evolution. Our findings extend indications for inhibition of kinase and cytokine receptor signaling in B-ALL using available agents and highlight the potential of novel approaches to modulate MN1, ATXN1 and FLT1 at relapse. The identification of common downstream targets among patients indicates the convergence of drug resistance mechanisms and has important clinical implications for possible future therapeutic interventions.

Methods

Cell culture, drug preparation, viral preparation, immunoblotting, and phenotypic assays were performed according to methods published previously5,6,7.

Cells and reagents

The B-lineage leukemia cell lines Reh (ATCC), RS4;11 (ATCC), RCH-ACV (DSMZ), and SEM (kindly gifted by Jun Yang, St. Jude Children’s Hospital) were grown in RPMI1640 medium. All media were supplemented with 10% FBS, 1% penicillin/streptomycin under 5% CO2 at 37 °C. No cell lines were used beyond passage 20. Each leukemia line was validated by short tandem repeat analysis through IDEXX cellcheck9+ service. Cell lines were routinely monitored for mycoplasma contamination by PCR using ATCC Universal Mycoplasma Detection Kit (20-1012 K).

Experimental Procedures

Primary B-ALL patient samples

Cryopreserved paired diagnosis/relapse primary patient bone marrow samples from individual patients were obtained from the Children’s Oncology Group (COG) ALL biorepository or St. Jude Children’s Research Hospital21. All subjects provided consent for banking and future research use of these specimens in accordance with the regulations of the institutional review boards of all participating institutions. Samples were thawed and stained for CD45 (PerCP-Cy™5.5 Mouse Anti-Human CD45 clone HI30, BD Bioscience Cat# 564105) and CD19 (APC Mouse Anti-Human CD19 Clone HIB19, BD Bioscience Cat# 555415). Leukemic blasts (CD45dimCD19+) were sorted using an SY3200 cell sorter (Sony Technologies) in the NYU Cytometry and Cell Sorting Laboratory and then immediately processed for downstream steps.

RNA-seq

Following FACS, 50,000-200,000 leukemic cells were immediately resuspended in Buffer RLT plus BME and frozen on dry ice. RNA was extracted using the QIAGEN RNeasy Micro Kit and quality was verified by an Agilent Bioanalyzer 2100 (PICO chip). RNA-seq libraries were generated by the NYU Genome Technology Center using the Low input Clontech SMART-Seq kit and sequenced on the Illumina NovaSeq 6000.

ATAC-seq

Following FACS, 50,000 leukemic cells were immediately processed for ATAC-seq by the NYU Genome Technology Center. ATAC libraries were generated based on the protocol by Buenrostro et al.76. Briefly, cells were resuspended in cold lysis buffer (10 mM Tris-Cl, 10 mM NaCl, 3 mM MgCl2, 0.1% (v/v) Igepal CA-630, pH 7.4) and centrifuged for 1 min at 500 × g. Nuclei were tagmented using Nextera (Illumina) Tagmentation DNA buffer and enzyme. PCR amplification was performed as described by Buenrostro et al.76. Libraries were sequenced on the Illumina NovaSeq 6000.

Hi-C

Following FACS, leukemic cells were immediately fixed and crosslinked with 2% formaldehyde at room temperature for 10 min. Glycine was added to stop the reaction. Crosslinked cells were washed with PBS and then frozen on dry ice. Crosslinked cell pellets were stored at −80 °C until ready for processing. Hi-C libraries were generated using the Arima-HiC kit (A410110) with Swift Biosciences® Accel-NGS® 2S Plus DNA Library Kit (Cat # 21024) and Indexing kit (Cat #) according to the manufacturer’s protocol. Libraries were amplified and quantified using KAPA Library Amplification kit (Cat # KK2620). Final libraries were sequenced at the NYU Genome Technology Center on the Illumina NovaSeq 6000.

H3K27ac HiChIP

HiChIP libraries were generated using the Arima-HiC+ kit (A410232) with Swift Biosciences® Accel-NGS® 2S Plus DNA Library Kit (Cat # 21024) and Indexing kit (Cat # 26148) according to the manufacturer’s protocol with minor modifications. After in-situ proximity ligation, samples were sonicated on a Covaris LE 220 at 4 °C, 300 W, 15% duty factor, 200 cycles per burst, for 300 s. Samples were then precleared with protein G beads for an hour and subsequently incubated with 5 μg of H3K27ac antibody (ab4729, Lot:GR3231988) at 4 °C overnight. After IP, libraries were amplified and quantified using KAPA Library Amplification kit (Cat # KK2620). Final libraries were sequenced at the NYU Genome Technology Center on the Illumina NovaSeq 6000.

Drug preparation

Stock solutions of Doxorubicin (Dox) (Sigma-Aldrich, St. Louis, MO) were prepared in double-distilled water, Prednisolone (Pred) (Pharmacia, St. Paul, MN) in 0.9% NaCl (Saline), 6-thioguanine (6-TG) and 6-mercaptopurine (6-MP) in 0.1 M NaOH. Drugs were diluted and added to cell culture media at indicated concentrations. Vehicle toxicity was tested at highest drug dose used to ensure no impact on viability.

Generation of overexpression cell lines

The open reading frames of MN1 or ATXN1 were PCR amplified from the pcDNA3.1 plasmids purchased from Genscript (OHu18815D and OHu26181D, respectively) and cloned into pLenti-R1R2-DEST (kindly gifted by Kenneth Scott, Baylor College of Medicine) using Invitrogen’s Gateway(R) Technology. HEK293T cells were then transfected with these constructs (empty vector or gene of interest) along with packaging plasmids coding for VSV-G, Gag-pol, and Rev using Lipofectamine 2000 (Invitrogen, Carlsbad, CA). Viral supernatant was used to spin infect B-ALL cell lines with 8 µg/ml Polybrene (Millipore). After 72 h, infected cell lines were selected with 0.75–3 µg/ml of puromycin.

Cytotoxicity assays

Cells were plated at a density of 60,000–80,000 cells/well and exposed to chemotherapy agents commonly used in the treatment of pediatric B-ALL for 4 days. Cell viability was measured using the CellTiter-Glo® Luminescent Cell Viability Assay (Promega) according to the manufacturer’s protocol. The luminescence was read using a PerkinElmer EnSpire multimode plate reader. Luminescent readings were normalized to the untreated control for each cell line to account for any differences in baseline growth kinetics or survival. Cytotoxicity percentages were plotted using nonlinear regression with a four-parameter variable slope model on log-transformed concentrations, with the bottom constrained to zero, in GraphPad Prism 7.2 (GraphPad Prism Software Inc., La Jolla, CA). For apoptosis, following treatment, cells were stained with Annexin V and DAPI and analyzed via flow cytometry using the BD LSR II cytometer within the NYU Cytometry and Cell Sorting core. The percentage of Annexin V+ cells, taken as the apoptotic fraction, was determined using FlowJo Software. The percentage of apoptotic cells was compared between empty vector control and MN1 or ATXN1 overexpressing cells using an unpaired t test. Each experiment was plated in triplicate and repeated at least three times.
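The normalization and curve model described above can be illustrated with a minimal sketch. It assumes that cytotoxicity is computed as 100 × (1 − treated/untreated) and that the variable-slope curve takes the standard four-parameter logistic form on log10 concentrations. The function names and the exact parameterization are illustrative assumptions, not the Prism implementation used in the study.

```python
def percent_cytotoxicity(treated_luminescence, untreated_luminescence):
    """Cytotoxicity (%) relative to the untreated control of the same cell line.

    Assumption: cytotoxicity = 100 * (1 - treated / untreated).
    """
    return 100.0 * (1.0 - treated_luminescence / untreated_luminescence)


def four_param_logistic(log_conc, top, log_ic50, hill_slope, bottom=0.0):
    """Variable-slope dose-response curve on log10-transformed concentrations.

    The bottom plateau defaults to zero, mirroring the constraint described
    in the text. At log_conc == log_ic50 the response is halfway between
    bottom and top.
    """
    exponent = (log_ic50 - log_conc) * hill_slope
    return bottom + (top - bottom) / (1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(percent_cytotoxicity(2500.0, 10000.0))
    print(four_param_logistic(log_conc=0.0, top=100.0, log_ic50=0.0, hill_slope=1.0))
```

In practice the parameters (top, log IC50, Hill slope) are estimated by nonlinear least squares; the sketch only evaluates the model for given parameter values.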

Computational analysis

RNA-seq analysis

Patient RNA-seq fastq files were processed in single replicates using the routes “rna-star” and “rna-star-groups-dge” from the Slide-n-Seq (sns) pipeline: https://igordot.github.io/sns/. Processing steps included alignment of paired-end reads to the human reference genome (hg19) using the STAR aligner with default parameters77. Counts were obtained using featureCounts78. Bigwig tracks were obtained for visualization on individual samples using deeptools (v3.1.0)79. Downstream analysis including normalization and differential expression analysis was performed using DESeq280. Genes were categorized as differentially expressed if abs(L2FC) > 0.58 and p value < 0.05. Pathway analysis was performed using enrichR46.
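As a small illustration of the differential expression criterion, the sketch below reads the threshold as an absolute log2 fold change above 0.58 (roughly a 1.5-fold change) combined with a p value below 0.05. The function name and the example values are hypothetical; the actual statistics were produced by DESeq2.

```python
def is_differentially_expressed(log2_fold_change, p_value,
                                l2fc_cutoff=0.58, p_cutoff=0.05):
    """Return True when a gene passes both the effect-size and p-value cutoffs."""
    return abs(log2_fold_change) > l2fc_cutoff and p_value < p_cutoff


if __name__ == "__main__":
    # Hypothetical (log2FC, p value) pairs, for illustration only.
    for l2fc, p in [(1.2, 0.001), (-0.9, 0.03), (0.3, 0.001), (2.0, 0.2)]:
        print(l2fc, p, is_differentially_expressed(l2fc, p))
```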

ATAC-seq analysis

Patient ATAC-seq fastq files were processed in two replicates using the route “atac” from the Slide-n-Seq (sns) pipeline: https://igordot.github.io/sns/. Processing steps included aligning paired-end reads to the human reference genome (hg19) with Bowtie2 (v2.3.4.1)81. Reads with a mapping quality <30 were removed. Duplicated reads were removed using Sambamba (v0.6.8)82. Remaining reads were analyzed by applying the peak-calling algorithm MACS2 (v2.1.1)83. Bigwig tracks were obtained for visualization on individual samples using deeptools (v3.1.0)79. Differential ATAC-seq peak analysis was performed using DiffBind84. Nearest genes were annotated using ChIPseeker85. Enrichment analysis of genomic region sets was performed using the Bioconductor package LOLA (Locus overlap analysis or enrichment of genomic ranges; R package version 1.24.0) with RStudio (v3.6.1) with the hg19 LOLA core database51. LOLA core is curated from many sources including TF binding sites from Encode and epigenome databases from Cistrome.

Hi-C and H3K27ac HiChIP analysis

Raw Hi-C sequencing data was processed with the hic-bench platform24. Cell line Hi-C data was processed as single replicates. Data was aligned against the human reference genome (GRCh37/hg19) with bowtie2 (version 2.3.1)81. The reads used for downstream analyses were filtered by the GenomicTools tools-hic filter command in the hic-bench platform using default parameters. The GenomicTools tools-hic filter command discards multi-mapped reads (“multihit”), read-pairs with only one mappable read (“single sided”), duplicated read-pairs (“ds.duplicate”), read-pairs with a low mapping quality of MAPQ < 20, read-pairs resulting from self-ligated fragments (together called “ds.filtered”), and short range interactions resulting from read-pairs aligning within 25 kb (“ds.too.short”). Downstream analysis was performed with the accepted intra-chromosomal read-pairs (“ds.accepted intra”). The number of accepted intra-chromosomal read-pairs varied between ~40 and ~140 million for all samples (Chapter 1; Fig. 2). Hi-C interaction matrices were generated for each chromosome separately using the hic-bench platform at 40 kb resolution. Filtered read counts were normalized by iterative correction and eigenvector decomposition (ICE)86. To account for variances in read counts of more distant loci, distance normalization for each chromosome matrix was performed.

Translocation calling from patient Hi-C data

Hi-C breakfinder27 and HiNT28 were used to call previously identified as well as novel intra-chromosomal and inter-chromosomal translocations from the matched B-ALL diagnosis/relapse patient Hi-C data. Hi-C breakfinder required 3 input files: a bam file, an inter-chromosomal expectation file, and an intra-chromosomal expectation file. The bam file was generated by hic-bench and the two expectation files for hg19 were provided on the Dixon lab GitHub page: https://github.com/dixonlab/hic_breakfinder. Hi-C breakfinder generated lists of structural variant predictions at different resolutions including 1 Mb, 100 kb, and 10 kb. For the purposes of comparing translocations between Hi-C breakfinder and HiNT, the 100 kb resolution was used. Translocations were called at 100 kb resolution using HiNT through the “translocations” step within the hic-bench platform24. EagleC87 (version 0.1.9) was used to call copy number inferences (deletions and duplications) at three resolutions: 5 kb, 10 kb, and 50 kb. High-confidence calls were determined by setting the probability cutoff to 0.95.

Fusion transcript analysis

Fusion transcript analysis was performed using STAR-fusion88: https://github.com/STAR-Fusion/STAR-Fusion/wiki. STAR-fusion required paired-end RNA-seq fastq files and a reference genome (hg19) in order to identify candidate fusion transcripts.

Hi-C contact matrix visualization

To visualize Hi-C contact matrices, ICE normalized Hi-C contact matrices for the corresponding chromosome were loaded and normalized by the total number of intra-chromosomal interactions for Diagnosis and Relapse samples. The log2FC Hi-C contact matrix was produced by dividing the Relapse Hi-C matrix element-wise by the Diagnosis Hi-C matrix and applying the log2 function to the resulting ratios.
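The construction of the log2FC contact matrix can be sketched as follows. The sketch scales each matrix by its total contact count and then takes the element-wise log2 ratio of Relapse over Diagnosis. The small pseudocount that guards against division by zero and log of zero is an assumption added for illustration; the source does not state how empty bins were handled.

```python
import math


def scale_by_total(matrix):
    """Divide every entry by the sum of all entries in the matrix."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def log2fc_matrix(diagnosis, relapse, pseudocount=1e-9):
    """Element-wise log2(Relapse / Diagnosis) after total-count scaling.

    pseudocount is a hypothetical safeguard for empty bins.
    """
    dx = scale_by_total(diagnosis)
    rel = scale_by_total(relapse)
    return [
        [math.log2((r + pseudocount) / (d + pseudocount))
         for d, r in zip(dx_row, rel_row)]
        for dx_row, rel_row in zip(dx, rel)
    ]


if __name__ == "__main__":
    # Toy 2 x 2 matrices, for illustration only.
    diagnosis = [[1.0, 1.0], [1.0, 1.0]]
    relapse = [[2.0, 1.0], [1.0, 0.0]]
    for row in log2fc_matrix(diagnosis, relapse):
        print(row)
```

Because both matrices are scaled by their own totals first, a relapse matrix that is simply a deeper-sequenced copy of the diagnosis matrix yields a log2FC of zero everywhere.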

To visualize translocations, Hi-C contact matrices were generated in the .hic format using the “tracks” step within the hic-bench platform24. Hi-C contact matrices were visualized using Juicebox29.

Hi-C contact matrix reconstruction

Hi-C contact matrices were reconstructed to demonstrate inter-chromosomal translocations at 20 kb resolution using NeoLoopFinder25. NeoLoopFinder required .cool files, which were generated using the “tracks” step within the hic-bench platform24. NeoLoopFinder also required files containing a list of structural variations that were generated at 10 kb resolution using Hi-C breakfinder27 as described above.

A/B compartments analysis

A/B compartments analysis was performed using the “compartments” and “compartments-stats” steps within the hic-bench platform24. The Cscore tool algorithm was used to assign active (A) and inactive (B) compartments34. For a bin to be considered a switch from A to B or from B to A, the compartment score sign had to flip in sample 2 (Relapse) when compared with the reference sample (sample 1 or Diagnosis). The absolute difference between the compartment scores had to be higher than the cutoff (default: 1.2). The difference was computed as a relative delta:

$${\rm{delta}}=\frac{Y-X}{\left|Y\right|}$$

X = compartment score of bin in sample 1 (reference)

Y = compartment score of bin in sample 2
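The switch criterion can be sketched in a few lines. The sketch assumes that positive compartment scores denote the A compartment and negative scores the B compartment, and that the cutoff is applied to the absolute value of the relative delta defined above. The function name and the example scores are hypothetical.

```python
def compartment_switch(x, y, cutoff=1.2):
    """Classify one bin as 'A->B', 'B->A', or 'stable'.

    x: compartment score in sample 1 (reference, Diagnosis)
    y: compartment score in sample 2 (Relapse)
    Assumption: score > 0 means compartment A, score < 0 means compartment B.
    """
    # A switch requires the sign of the score to flip between samples.
    if x * y >= 0:
        return "stable"
    delta = (y - x) / abs(y)
    if abs(delta) <= cutoff:
        return "stable"
    return "A->B" if x > 0 else "B->A"


if __name__ == "__main__":
    # Hypothetical (Diagnosis, Relapse) score pairs, for illustration only.
    for x, y in [(0.5, -0.4), (-0.3, 0.6), (0.2, 0.5), (0.05, -0.5)]:
        print(x, y, compartment_switch(x, y))
```

Note that when the sign flips, the absolute delta equals 1 + |X|/|Y|, so the default cutoff of 1.2 effectively requires the reference score to be at least 20% of the magnitude of the score in sample 2.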

TAD interactivity analysis

TAD interactivity was assessed using the “domains” and “domains-diff” steps within the hic-bench platform24. The “domains” step uses the hic-ratio algorithm for TAD calling developed within hic-bench by previous Tsirigos lab member Haris Lazaris, in which the average of the normalized interaction scores is calculated for all interactions taking place within a particular TAD. The “domains-diff” step assesses TAD interactivity alterations and was developed by previous lab members Sofia Nomikou and Andreas Kloetgen17,24. To identify TADs with differential interactivity between Relapse and Diagnosis, we used the TADs identified in the Diagnosis sample as a reference to identify common TADs. Once mean TAD interactivity values were obtained from the Hi-C data, a two-sided Wilcoxon rank-sum test (non-parametric) was performed to determine the p-value for each TAD. These p-values were then corrected for multiple testing across the total number of TADs. Lastly, the log2 fold change (log2FC) of the intra-TAD activity value was calculated between the samples. TAD interactivity alterations were categorized as significant if abs(L2FC) > 0.25 and FDR < 0.01.
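A minimal sketch of the final classification step follows. It takes the normalized interaction scores within one common TAD for each sample, computes the mean interactivity and its log2 fold change (Relapse over Diagnosis), and applies the two cutoffs. The FDR is passed in as an input because the rank-sum test and multiple-testing correction are not reimplemented here; the function names and category labels are illustrative assumptions.

```python
import math


def mean_interactivity(scores):
    """Average of the normalized interaction scores within one TAD."""
    return sum(scores) / len(scores)


def classify_tad(diagnosis_scores, relapse_scores, fdr,
                 l2fc_cutoff=0.25, fdr_cutoff=0.01):
    """Return (category, log2FC) for one TAD shared by both samples.

    Assumes both mean interactivity values are positive.
    """
    l2fc = math.log2(mean_interactivity(relapse_scores)
                     / mean_interactivity(diagnosis_scores))
    if abs(l2fc) > l2fc_cutoff and fdr < fdr_cutoff:
        return ("increased" if l2fc > 0 else "decreased"), l2fc
    return "stable", l2fc


if __name__ == "__main__":
    # Hypothetical scores and FDR, for illustration only.
    print(classify_tad([1.0, 1.0, 1.0], [2.0, 2.0, 2.0], fdr=0.001))
```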

Looping analysis

Looping analysis was performed using the “loops” and “loops-diff” steps within the hic-bench platform24. Chromatin loops from Hi-C and H3K27ac HiChIP data were called using FitHiC249,50. Loops were classified as diagnosis-specific, relapse-specific, or common. The common loops were also classified as increased, decreased, or stable using a log2 fold change cutoff (abs(L2FC) > 1). A loop was considered sample-specific if it was significant in sample ‘x’ at a q-value cutoff of ‘qcut1’ (default: 0.01) and was not significant in sample ‘y’ even at a more relaxed cutoff of ‘qcut2’ (default: 0.1). This avoids considering a loop sample-specific when its significance was relatively close in both samples but fell just below the cutoff (significant) in one sample and slightly above the cutoff (not significant) in the other. These cases were classified as common loops.
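The two-cutoff logic for loop classification can be sketched as below. The strictness of the inequalities, the "not called" label for loops that are significant in neither sample, and the convention that the log2 fold change is Relapse over Diagnosis are assumptions made for illustration; the hic-bench implementation may differ in these details.

```python
def classify_loop(q_diagnosis, q_relapse, l2fc,
                  qcut1=0.01, qcut2=0.1, l2fc_cutoff=1.0):
    """Classify one loop from its q-values in both samples.

    l2fc is assumed to be log2(Relapse / Diagnosis) of the loop strength.
    """
    significant_dx = q_diagnosis < qcut1
    significant_rel = q_relapse < qcut1
    if not significant_dx and not significant_rel:
        return "not called"
    # Sample-specific: significant in one sample and not even
    # marginally significant (relaxed cutoff) in the other.
    if significant_dx and q_relapse >= qcut2:
        return "diagnosis-specific"
    if significant_rel and q_diagnosis >= qcut2:
        return "relapse-specific"
    # Everything else is a common loop, split by fold change.
    if l2fc > l2fc_cutoff:
        return "common-increased"
    if l2fc < -l2fc_cutoff:
        return "common-decreased"
    return "common-stable"


if __name__ == "__main__":
    # Hypothetical q-values and fold changes, for illustration only.
    print(classify_loop(0.001, 0.5, -2.0))
    print(classify_loop(0.001, 0.05, -0.3))
```

The second call shows the borderline case described in the text: the loop misses the strict cutoff at relapse but passes the relaxed one, so it is treated as common rather than diagnosis-specific.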

Integration of patient Hi-C data with other datasets

To show the correlation between the Hi-C data and the various sequencing datasets, the peaks obtained from ATAC-seq and the genes obtained from RNA-seq were mapped to the AB, BA, and stable regions or to the increased, decreased, or stable TADs using the “bedtools intersect” command89,90. For peaks or genes assigned to a compartment region or TAD activity region, we calculated the peak intensity fold change or gene expression fold change between diagnosis and relapse and showed the correlation with boxplots or bar plots. Statistical significance was assessed using a paired two-sample t test.
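The mapping step can be illustrated with a plain-Python stand-in for the interval overlap that “bedtools intersect” performs. The sketch assumes half-open, BED-style coordinates and assigns each feature (gene or peak) the labels of every region it overlaps by at least one base. The function names, feature names, and coordinates are hypothetical, and the sketch is not a substitute for bedtools on genome-scale data.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True when two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def assign_labels(features, regions):
    """Map each feature name to the labels of the regions it overlaps.

    features: iterable of (chrom, start, end, name)
    regions:  iterable of (chrom, start, end, label), e.g. label 'AB', 'BA', 'stable'
    """
    labels = {}
    for chrom, start, end, name in features:
        labels[name] = [
            label
            for r_chrom, r_start, r_end, label in regions
            if r_chrom == chrom and overlaps(start, end, r_start, r_end)
        ]
    return labels


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    regions = [("chr1", 0, 1000, "AB"), ("chr1", 1000, 2000, "stable")]
    features = [("chr1", 900, 1100, "geneA"), ("chr1", 1500, 1600, "geneB")]
    print(assign_labels(features, regions))
```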

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.