Hybrid sequencing resolves two germline ultra-complex chromosomal rearrangements consisting of 137 breakpoint junctions in a single carrier

Chromoanagenesis is a genomic event responsible for the formation of complex structural chromosomal rearrangements (CCRs). Germline chromoanagenesis is rare and the majority of reported cases are associated with an affected phenotype. Here, we report a healthy female carrying two de novo CCRs involving chromosomes 4, 19, 21 and X and chromosomes 7 and 11, respectively, with a total of 137 breakpoint junctions (BPJs). We characterized the CCRs using a hybrid-sequencing approach, combining short-read sequencing, nanopore sequencing, and optical mapping. The results were validated using multiple cytogenetic methods, including fluorescence in situ hybridization, spectral karyotyping, and Sanger sequencing. We identified 137 BPJs, which to our knowledge is the highest number of reported breakpoint junctions in germline chromoanagenesis. We also performed a statistical assessment of the positioning of the breakpoints, revealing a significant enrichment of BPJ-affecting genes (96 intragenic BPJs, 26 genes, p < 0.0001), indicating that the CCRs formed during active transcription of these genes. In addition, we find that the DNA fragments are unevenly and non-randomly distributed across the derivative chromosomes indicating a multistep process of scattering and re-joining of DNA fragments. In summary, we report a new maximum number of BPJs (137) in germline chromoanagenesis. We also show that a hybrid sequencing approach is necessary for the correct characterization of complex CCRs. Through in-depth statistical assessment, it was found that the CCRs most likely was formed through an event resembling chromoplexy—a catastrophic event caused by erroneous transcription factor binding. Electronic supplementary material The online version of this article (10.1007/s00439-020-02242-3) contains supplementary material, which is available to authorized users.


Background
Complex chromosomal rearrangements (CCRs) are structural variants (SVs) consisting of multiple adjacent breakpoint junctions (BPJs). The vast majority of CCRs are reported in cancers, and most of the current knowledge on CCRs originate from such studies (Collins et al. 2017). However, a growing number of CCRs are reported also in the germline (Collins et al. 2019), and such CCRs are known to cause a variety of disorders, including intellectual disability and dysmorphism . CCRs arise through a multitude of events, including breakage-fusionbridge (BFB) cycles (McClintock 1941), chromoanasynthesis (Liu et al. 2011), chromothripsis (Zhang et al. 2013), and chromoplexy (Baca et al. 2013).
BFB cycles may be initiated through telomeric dysfunction. Chromatids carrying dysfunctional telomeres may undergo rearrangements and fuse with other chromatids. During anaphase, these fused chromatids will be torn apart, resulting in rearranged chromatids that are lacking the telomere. Since the resulting chromatids have lost the telomere, they may fuse and be torn apart the next cell cycle, allowing for continuous cycles of breakage, fusion and bridging (McClintock 1941).
CCRs formed through BFB cycles are, therefore, mainly terminal rearrangements, they include duplications, deletions as well as copy number neutral fragments (Zakov and Bafna 2015). CCRs formed through BFB cycles may also be recognized based on the characteristic, non-random orientation of the DNA fragments involved, and various algorithms are described for such purposes (Kinsella and Bafna 2012).
In contrast to BFB cycles, chromothripsis is a single event of localized scattering of one or a few (< 4) chromosomes; once scattered, the fragments are rapidly joined in a seemingly random fashion (Pellestor 2019). Chromothripsis may involve deletions and copy number neutral fragments (Korbel and Campbell 2013) and may be initiated through numerous events, including the formation of micronuclei, viral insertion, or radiation (Koltsova et al. 2019). Chromothripsis has been observed in cancer (Cortés-Ciriano et al. 2020) as well as in germline (Macera et al. 2015).
Chromoanasynthesis is a catastrophic event, occurring during DNA replication and may be initiated by endogenous factors, including DNA secondary structures, as well as exogenous factors, including radiation, causing the DNA polymerase to replicate the DNA in an aberrant way (Liu et al. 2011). Chromoanasynthesis is characterized by nonclustered breakpoints, copy number states including both deletions and duplications, and templated insertions in the BPJs (Korbel and Campbell 2013;Zepeda-Mendoza and Morton 2019).
Chromoplexy is a recently discovered mechanism of CCR formation (Baca et al. 2013) which usually involves several chromosomes (> 2), and is generally a copy number neutral event (Pellestor 2019). Chromoplexy is believed to occur due to aberrant binding of transcription factors (Haffner et al. 2010). As such, the BPJs will cluster within genes, and may involve multiple co-transcribed genes. Chromoplexy has so far only been observed in cancers (Zhang et al. 2013), and with only a few cases reported.
In aggregate, these events involve different parts of the genetic repair machinery, and occur due to catastrophic errors in the most fundamental activities of the cell, including cell division, DNA replication and transcription.
As such, there is a great value in characterizing CCRs, the molecular characterization of CCRs may provide insights on genetic repair mechanisms and double-stranded breakage (Koltsova et al. 2019), as well as provide details on the structure and function of the genome in general (Pellestor 2019).
However, the correct characterization of CCRs is a difficult task, and commonly involves a large number of experiments, as well as time-consuming and manual analyses. Today, whole-genome sequencing (WGS) is the main method of choice for solving CCRs, and a variety of WGS methods have been applied to solve CCRs, including shortread sequencing , nanopore sequencing (Stancu et al. 2017), linked short-read WGS (Ott et al. 2018), and optical mapping (Chan et al. 2018). Given the novelty of these technologies, and the rarity of the CCR events, the choice of WGS method can be a great challenge. It is clear that each technology comes with a variety of advantages and disadvantages ) and that the data may be analyzed through a diversity of bioinformatic pipelines (Stancu et al. 2017).
Here, we present a female with no reported health issues except for fertility problems carrying two distinct germline de novo chromosomal rearrangements involving a total of 6 chromosomes and 137 breakpoints, which is, to our knowledge, the largest number of reported BPJs in a germline CCR so far. The rearrangements [(t(X;21;19;4) and t(7;11)] were characterized using Illumina short-read WGS, Oxford nanopore WGS, 10X Genomics Chromium WGS, as well as Bionano optical mapping. We have compared the results of each method, and we find (1) that the methods are complementary, and (2) that a hybrid-sequencing approach is necessary for the complete characterization of these CCRs. Lastly, we provide a detailed description of the two CCRs, showing that they were formed as two separate events and through different cellular mechanisms: while t(7;11) was most likely formed through a replicative mechanism, t(X;4;19;21) was most likely formed through chromoplexy.

Whole-genome sequencing analyses
WGS analysis including short-read paired end (PE) WGS, linked-read WGS, nanopore WGS and Bionano optical mapping revealed that the rearrangement was ultra-complex and identified a total of 137 BPJs. The two rearrangements were confirmed to be completely separated and the t(7;11) translocation was far less complex than the t(X;21;19;4) rearrangement, involving 5 and 132 breakpoints, respectively (Figs. 2, 3). The breakpoints were confined to single-chromosome arms on all chromosomes involved and chromosome 4 and chromosome 21 were the most shattered chromosomes with 92 and 35 breakpoints, respectively (Figs. 2, 3). No 1 3 Fig. 1 Chromosome analysis reveals complex translocations involving a total of six chromosomes. The karyotype of the patient was set as 46,X,t(X;21;19;4)(q26;q21;q13;q21)t(7;11)(p13;p15) after initial chromosome analysis (upper figure). Follow-up analysis with spectral karyotyping (SKY) was performed to visualize the chromosomal segments that had been rearranged, especially the chromosome 19 and 21 segments that were not distinguishable from the karyotype. White arrows indicate the different segments (lower figure) 1 3 additional chromosomes except those previously detected by karyotyping were involved in the CCRs.
Genomic segments involved in the rearrangement, apart from the end segments (i.e., chromosomal segments including the telomere), ranged from 10 s of megabases to less than 100 bp (Supplementary Table 1). Considering the high number of fragments, the rearrangement was remarkably balanced, involving 51 deletions totaling 100 Kbp (Supplementary Table 1). Three of the five breakpoints in the t(7;11) rearrangement were balanced and two breakpoints had deletions of one and two nucleotides, respectively. In the t(X;21;19;4) rearrangement, 84/132 (64%) breakpoints harbored deletions (median size = 80 bp, min = 1 bp, max = 26 Kbp, Supplementary Table 1). No duplications were detected. A total of 23 protein-coding genes were disrupted by one (n = 9) or several (n = 14) breakpoints (Table 1). Two of those genes were associated with human disease (ANO3 associated with Dystonia 24, OMIM 610110, and ABCG2 associated with Gout susceptibility, OMIM 138900) (Table 1). Except for RASGEF1B, all the affected proteincoding genes had a pLI score (Lek et al. 2016) lower than what is expected of genes causing Mendelian disease (0.9) ( Table 1).
A total of 23 fusions of genes (disrupted transcripts positioned next to each other due to the rearrangement) were formed (Supplementary Table 2, Supplementary Table 1). However, these fused genes lack promoter regions, or are fused in incompatible orientations, and are therefore unlikely to be transcribed into a stable RNA.
Next, we analyzed topologically associated domains (TADs) using a publicly available Hi-C dataset (Lajoie et al. 2015), produced from cerebellar astrocytes ( Table 2). In total, 28 TADs are affected in this dataset involving all rearranged chromosomes in our patient, 5 by breakpoints from the t(7;11) rearrangement, and 23 by breakpoints from the t(X;21;19;4) rearrangement ( Table 2). The breakpoints are unevenly spread across the TADs, and the majority of the breakpoints (156) cluster within the 5 most affected TADs  Table 1 (tad7, tad92, tad90, tad89, tad83). Notably, 127 BPJ were found to bridge across TADs (Supplementary Table 1). Four breakpoints were not located in any TAD and were, therefore, excluded from this analysis.
The affected TADs contain a total of 113 genes, of which 21 are known MIM morbid genes (Supplementary Table 1). Eleven of those follow an autosomal-dominant inheritance pattern and span a wide range of phenotypes, including three genes linked to various dental issues (AMTN, ENAM, DSPP) (Nakayama et al. 2015;Dong et al. 2000;Crosby et al. 1995) and APP known to cause Alzheimer disease when duplicated (Adler et al. 1991). Eight genes have a pLI score > 0.9, including one MIM morbid gene (SLC4A4) linked to autosomal recessive disease (Table 2, Supplementary Table 1). The genes located within affected TADs were further analyzed with a PANTHER GO biological process statistical overrepresentation test. This test revealed a total of 32 significantly enriched biological processes (P < 0.05, false discovery rate < 0.05) (Supplementary Table 1).
Analyzing the BPJs on the nucleotide level revealed distinct mutational signatures for the two separate complex rearrangements. The t(7;11) rearrangement showed rare single-nucleotide variants (SNVs), microhomology (4 bp) and a templated insertion in the junctions (Supplementary Fig. 1, Supplementary Table 1), consistent with replicative errors such as fork-stalling and template-switching (FoSTeS)/microhomology-mediated break-induced replication (MMBIR) (Weckselblatt and Rudd 2015) that are typical features of chromoanasynthesis. In contrast, the t(X;21;19;4) rearrangement consistently showed blunt ends, little to no microhomology, non-templated insertions and small deletions in the junctions (Table 2, Supplementary Fig. 1), consistent with non-homologous end-joining (NHEJ) (Weckselblatt and Rudd 2015). Genomic segments as small as 33 bp have previously been shown to sometimes be processed by the repair machinery during reassembly of shattered chromosomes (Slamova et al. 2018), so the large insertions were first manually checked and then checked using BLAT, which generated no specific matches. Next, we used RepeatMasker (Smit et al. 2013) to analyze all insertions larger than 50 bp, revealing that these insertions contain large amounts of simple repeats (29% vs 1.5% in GRCh38 (Smit et al. 2013)). Interestingly, one insertion was found to be a chimera of an SVA and srpRNA (Supplementary Dataset 1).
Out of the 137 breakpoints, 51 (37%) were completely balanced and 45 (33%) had losses of less than 100 nucleotides (Table 3, Supplementary Table 1). Among the total 137 BPJs, 44 (32%) showed microhomology in the junctions, but no more than 4 nucleotides ( Parental origin of both de novo rearrangements could not be determined as no parental samples were available for analysis.

Statistical assessment of the derivative chromosome structure and breakpoint junction characteristics of the t(X;21;19;4) rearrangement
The t(7;11) rearrangement carries features typical to chromoanasynthesis including dispersed interchromosomal translocations, as well as traces of replicative repair mechanisms (Zepeda-Mendoza and Morton 2019). In contrast, an in-depth statistical analysis was required to determine the mechanism of formation of the t(X;21;19;4) CCR. As noted previously, the t(X;21;19;4) CCR BPJ carries signatures consistent with NHEJ, which is found in a diversity of mechanisms of CCR formation, including chromothripsis, chromoplexy (Pellestor 2019), and BFB cycles (Marotta et al. 2013).
Analyzing the distribution of DNA fragments across the derivative chromosomes, we find that the fragments are spread across the derivative chromosomes in a non-random fashion. In particular, we note that derivative chromosome 4 contain 48% (n = 45, p = 10 −6 ) of the fragments originating from chromosome 4, and 56% (n = 20, p = 10 −4 ) of the fragments originating from chromosome 21 (Table 4). Conversely the derivative chromosome 21 is depleted in fragments originating from chromosome 4 (n = 13, p = 0.007), and chromosome 21 (n = 4, p = 0.034). In addition, there is an absence of fragment exchange between some of the These results indicate that the t(X;21;19;4) CCR was formed either through a non-random process, causing skewed reassembly of the chromosomes, or that the t(X;21;19;4) CCR was formed through a progressive multistep process, allowing the fragments to be reused in a directed, non-random fashion.
Analyzing the orientation of the fragments, we found no statistically significant patterns, and inverted and noninverted fragment fusions appear equally likely (Table 5). Such patterns are consistent with a number of mechanisms, including chromoplexy and chromothripsis (Korbel and Campbell 2013), but is inconsistent with BFB cycles, a mechanism known to produce CCRs enriched in head-tohead and tail-to-tail fusions (Kinsella and Bafna 2012).
To investigate the expression pattern of these affected genes, a GTex multigene query (https ://gtexp ortal .org/) was performed on the 15 protein-coding genes affected by at least 3 breakpoints, revealing that all of those genes are co-expressed in testis, but no other tissue ( Supplementary  Fig. 2).
Optical mapping detected the lowest amount of the t(X;21;19;4) BPJs (n = 19, 14.4%), and produced the second largest number of calls (12,057). Analyzing the rearrangement manually in Bionano Access, we do find a larger number of calls corresponding to the BPJs of t(X;21;19;4) ( Fig. 4a-d) and comparing all interchromosomal SV Bionano t(X;21;19;4) calls (n = 33) to all confirmed interchromosomal SV BPJs (n = 56), it was found that Bionano produces a significant number of similar calls (Fig. 4c, d). However, only 22 of these calls are located within 100 Kbp of the verified breakpoint positions, indicating low resolution.
Producing pairwise combinations of each technology, the detection rate is slightly increased. The highest detection rates were obtained by combining Illumina PE with either Nanopore WGS or 10 × Chromium WGS (Fig. 5, Table 6), allowing the automated detection of 130 BPJs (99%). Notably, there is only a small gain in adding a third-sequencing technology (Fig. 5). Inspecting the BPJs not detected by the various pipelines, it was found that the Illumina PE pipeline fails to detect BPJs in highly repetitive regions, as well as BPJs carrying large non-templated insertions (Supplementary Table 1). In contrast, nanopore WGS perform well in repetitive regions, but is limited by the high error rate and relatively low span-coverage. The linked short-read WGS is affected by various sequencing biases, such as GC content, and the sequencing coverage appears noisier than the  standard Illumina PE, causing dropouts at the BPJ regions. Optical mapping appears ill-suited for these many and short fragments, and the usefulness of this technology is limited due to its low resolution.

Hybrid de novo assembly of t(X;21;19;4)
We attempted to solve the CCR using a hybrid de novo assembly approach, potentially allowing automated characterization of the CCR. These hybrid assemblies were based on the Wtdgb2 (Ruan and Li 2020) assembly of the nanopore WGS data ( Supplementary Fig. 3A), Supernova (Weisenfeld et al. 2017) assembly of the linked short-read WGS data ( Supplementary Fig. 3B), and optical maps. Notably, only a few of the BPJs were assembled (Table 7), resulting in low detection rate compared to the mapping assembly approach ( Table 6). The highest detection rate is obtained by performing a nanopore-Bionano hybrid assembly. However, with only 71 detected BPJs, the detection rate is clearly lower than most of the mapping-based approaches (Table 6). Interestingly, Quickmerge removes the majority of the called BPJs (Table 7); however, it also removes the majority of calls present in the Supernova assembly. Quickmerge is, therefore, better suited for constructing reference genomes, as it removes noise and ambiguities at the cost of true variation.
Despite the low detection rate of these de novo assembly approaches, the resulting contigs were useful for validating the hypothesized derivative chromosomes. Contigs covering multiple aberrant fragments were especially useful, as such contigs allow for validation of the results obtained through mapping assembly (Supplementary Fig. 4).
Quality-wise, the best performance was achieved by combining the nanopore Wtdgb2 assembly with optical maps, yielding a relatively low number of calls (3941) indicating a tolerable amount of misassemblies while maintaining high contiguity and a relatively large assembly size (Supplementary Fig. 3C).
Even higher contiguity is obtained by merging the Supernova-WTDB2 Quickmerge results with optical mapping; however, the large number of calls indicate a large number of misassemblies ( Supplementary Fig. 3D).

Discussion
Herein, we utilized four high-throughput genomic technologies to fully resolve the structure of two CCRs that had arisen independently as de novo events in a healthy woman and found Illumina PE WGS and nanopore WGS to be the most promising hybrid-sequencing approach for solving CCRs consisting of a large number of small aberrant fragments. These two technologies offer high detection rate, and  Through a hybrid-sequencing approach and long read de novo assembly, the structure of the two CCRs was fully characterized. Subsequent breakpoint junction analysis enabled the most plausible mechanisms of formation to be determined. The t(7;11) rearrangement was determined to most likely have been formed through a replicative error mechanism, largely due to the templated insertions. Instead, an in-depth statistical analysis was performed to determine the mechanism of formation of t(X;21;19;4). Through these analyses, it was found that the CCR most likely was formed through a progressive multistep process, that the fragments were reinserted in random orientation, and that the breakpoints are widely spread across the genome but cluster within genes. All these signatures are consistent with  (Yi and Ju 2018), and as presented here the t(X;21;19;4) CCR, which is highly enriched in intragenic breakpoints would be consistent with this idea. Our second-best hypothesis would be formation through chromothripsis, however, this hypothesis is likely to be rejected as chromothripsis is defined as one single localized cataclysmic event of breakage, followed by random reintegration of fragments (Korbel and Campbell 2013), which would be the opposite of the formation of t (X;21;19;4). The number of BPJs (132) in t(X;21;19;4) exceed the typical numbers reported in both chromoplexy and germline chromothripsis in the literature. We must note, however, that chromoplexy is a recently described mechanism of CCR formation, and that larger, more complex chromoplexy rearrangements are likely to be reported as they are detected.
In the t(X;21;19;4) CCR, insertions were found in 39% of the BPJs (Table 3) and 24 BPJs harbor repetitive nontemplated insertions larger than 50 bp (Supplementary Table 1). In addition, most of the breakpoints (66%) contain deletions (Table 3). These findings provide clues on which cellular repair pathways underly the formation of the CCR. Alternative-NHEJ is a likely candidate, as it is known to produce large insertions and deletions at the BPJs (Seol et al. 2018) and has previously been described in chromoplexy (Zepeda-Mendoza and Morton 2019). In particular, polymerase theta-mediated end-joining (PTMEJ) is known to produce large insertions and these insertions are produced through fill-in synthesis that stabilizes the open ends (Black et al. 2016) and provides the homologous sequences used to join the open ends (Mateos-Gomez et al. 2017). Such DNA synthesis could explain the large random insertions observed at the BPJs of t(X;21;19;4) CCR. Furthermore, polymerase theta is known to include templated insertions at the BPJ (Schimmel et al. 2019) which can be seen at numerous BPJs, including BPJ 4-68 → 21-17 and 21-32 → 4-74 (Supplementary Figure S1).
Analyzing the co-expression pattern of the affected protein-coding genes, it was found that the affected genes are co-expressed in one tissue only, the testis ( Figure S2). As t(X;21;19;4) is a germline rearrangement, formed through chromoplexy, these findings could imply that the rearrangement was formed during spermatogenesis. However, since we lack paternal DNA samples, this hypothesis cannot be tested. In particular, we would be interested in confirming whether the t(X;21;19;4) CCR is of paternal or maternal origin, as well as to check whether the CCR was formed before or after meiosis.
Single cell Hi-C studies of the various developmental stages of spermatozoa and ova could be utilized to overcome these issues. Through such analyses, 3D genomic maps could be constructed (https ://githu b.com/lh3/hicki t) (Nagano et al. 2017) and these maps could be used to assess the likelihood of a suggested mechanism of formation (Berthelot et al. 2015).
Comparing the four technologies used (Bionano optical mapping, Illumina PE WGS, linked-read WGS, and nanopore sequencing), we would recommend Illumina PE WGS as a first experiment for solving a CCR like the t(X;21;19;4) rearrangement presented here as it is the most cost-efficient approach, offering high detection rate, and a large selection of bioinformatic tools, at a relatively small cost. Follow-up experiments should be chosen based on the biological question, as well as the structure of the CCR.  Except for fertility problems, the proband is unaffected which is surprising given the high number of breakpoints. The reason, however, for the lack of a clinical phenotype is that the CCR largely consists of intragenic breakpoints disrupting no early-onset dominant disease genes (Table 1). In addition, the majority of disrupted genes are unlikely to be haploinsufficient (pLI < 0.9). The fact that 27 TADs containing a total of 113 genes were potentially affected by the CCRs (Table 2, Supplementary Table 1) is remarkable and highlights the importance of continued research into the clinical significance of TAD disruptions. We speculate that the TADs are perturbed so that the change of expression is not consistent with the gene-related disease. The majority of affected TADs and MIM morbid genes are found in the more complex t (X;21;19;4), which is likely to be formed through chromoplexy. As chromoplexy involves regions that are co-expressed and close in the 3D-space of the nucleus (Yi and Ju 2018), the 3D structure of the probands genome may, therefore, be less perturbed than what is shown in these one-dimensional TAD analyses (Table 2, Supplementary  Table 1) which could also explain why the patient do not display a phenotype consistent with these disease genes. Alternatively, one could speculate that the rearrangement may cause late-onset disease, explaining why the proband currently do not display any disease phenotype, an example being APP (Supplementary Table 1), which is involved in Alzheimer disease (Adler et al. 1991), a well-known lateonset disease. Whether to test a healthy individual for lateonset disease or not is an ethical dilemma, however, both Hi-C and RNA-seq data would be useful in understanding the functional effect of such a complex rearrangement.
Despite fertility problems, the proband naturally conceived and gave birth to a healthy child carrying only the t(7;11) CCR. The fertility problems observed may be explained by a high number of gametes carrying unbalanced combinations of the two CCRs. Of the 64 gametes possible through alternate or adjacent segregation I ( Supplementary  Fig. 5), only 4 (6.25%) would carry a balanced set of chromosomes compared to 2/4 (50%) in carriers of reciprocalbalanced translocations (Morel et al. 2004).

Conclusion
In conclusion, we present two de novo complex chromosomal rearrangements involving 6 chromosomes and 137 BPJs in a healthy female. The amount of BPJs are approximately twice as many as have been reported to this day in germline chromothripsis (Collins et al. 2017), and almost sixfold the amount of BPJs that have been reported in a healthy individual (De Pagter et al. 2015). Analysis of the BPJs suggests that a combination of repair machineries have operated in the same cell, and that the distinct rearrangements have occurred through different mechanisms. In this way, we illustrate that present day commercial genomic technologies are suitable for fully characterizing such a rearrangement, however, it is clear that a multi-omics approach is necessary for understanding its full structure and complexity.
The large amount of work and effort put into this single case illustrate how a single a patient can provide significant clues and insights to molecular mechanisms underlying these events; however, we also illustrate that personalized care may require costly and truly personalized analyses.

Clinical synopsis
The proband is a female who was first referred for genetic investigation at age 3 years because of short stature. The clinical investigation was dropped when the patient started catching up in height and she now reports as an adult as being of normal height compared to other women in her family, and otherwise healthy. She has a history of fertility problems likely due to the chromosomal rearrangement but has a healthy child conceived naturally who only inherited one translocation, the t(7;11).

Cytogenetic analysis
Metaphase slides were prepared from peripheral blood cultures according to standardized protocols. Chromosome analysis was performed according to routine procedures with the GTG-banding technique and an approximate resolution of 550 bands per haploid genome was obtained.

Linked-read whole-genome sequencing
Genomic DNA derived from whole blood from the proband was also sequenced using the 10 × Genomics Chromium WGS protocol and data were analyzed and processed as described previously . Data were analyzed using 10 × Genomics default pipelines Long Ranger V2.1.2 (https ://suppo rt.10xge nomic s.com/ genom e-exome /softw are/downl oads/lates t).

Nanopore whole-genome sequencing
Nanopore WGS was performed at National Genomics Infrastructure (NGI) Uppsala. The sequencing library was prepared using the LSK-109 ligation kit and sequencing was performed on the PromethION platform using the R9.4 flow cell. Bases were called using the Guppy base caller (https ://nanop orete ch.com) producing 2729,339 reads, with an average length of 15 Kbp, resulting in roughly 13 × coverage.
The resulting nanopore WGS data were aligned to Hg19 using Minimap2 (Li 2018), and variants were called using Sniffles (Sedlazeck et al. 2018), setting the minimum read support parameter to three reads. SV calls shorter than 2 Kbp were considered Indels (insertions/deletions) and were, therefore, removed.

Statistical tests
Statistical tests were performed to describe the t(X;21;19;4) CCR, and to reject implausible mechanisms of formation. Binomial tests were performed to assess the randomness of the distribution of fragments among the derivative chromosomes, and to test for biases regarding the orientation of the aberrant fragments. The binomial tests were performed using the social statistics online binomial test calculator https ://www.socsc istat istic s.com/tests /binom ial/.
The tests for biased fragment orientation were run for each derivative chromosome, and each possible orientation (Head-Head, Head-Tail, Tail-Head, and Tail-Tail).
The number of trials were set to the number of fragments within the derivative chromosome being tested, and the probability of the outcome was set to 1/4 (i.e., the number of possible fragment orientations), and the number of observed occasions were set to the orientations as observed.
The randomness of the distribution of fragments was assessed by performing binomial tests for each derivative chromosome. Here, the number of trials were set to the number of fragments involved in t (X;21;19;4), the probability of the outcome was set to 1/4 (i.e., the number of chromosomes involved in t (X;21;19;4)). The number of observed occasions was set to the observed number of fragments within the derivative chromosome being analyzed.
A Monte Carlo test was performed to assess the enrichment of BPJs within genes.
We utilized the RefSeq gene annotation (assembly GCF_000001405.25) to count the number of intragenic t(X;21;19;4) BPJs. A BPJ was considered intragenic if any of its two breakpoints were located within a gene. Next, an equal number of random BPJs were simulated across chromosome 4, 19, 21 and X and the number of intragenic simulated BPJs were counted. The simulated BPJs were selected according to a random uniform distribution, such that all bases across chromosome 4, 19, 21 and X were equally likely to be affected by simulated rearrangements.
This procedure was repeated for 1000 iterations, and a p value was defined as the fraction of simulated CCRs carrying more intragenic BPJs than the observed number of intragenic BPJs in t(X;21;19;4).

Breakpoint PCR and Sanger sequencing
Breakpoint PCR was performed on some BPJs that were too complex for WGS data analysis only. Primers were designed flanking the junctions approximately 500 bp away from the estimated breakpoint with M13 sequences attached. Breakpoint PCR was performed by standard methods using Phusion High-Fidelity DNA Polymerase (ThermoFisher Scientific, Waltham, MA, USA). Sanger sequencing of amplicons was performed according to standard protocols with M13 primers. Sequences were aligned using the BLAT tool (Kent 2002) and visualized in CodonCode Aligner (CodonCode Corp., Dedham, MA, USA). sequencing data. As mentioned previously, the Bionano optical maps were assembled using Bionano Solve. The nanopore WGS data were assembled using Wtdgb2 (Ruan and Li 2020), and the 10X Genomics Chromium linked reads were assembled using the Supernova assembler (Weisenfeld et al. 2017). The assemblies were aligned to Hg19 using Mini-map2 (Li 2018), and SVs were called using Assemblatron (Eisfeldt et al. 2020).
A variety of hybrid assemblies were produced: the Wtdgb2 and Supernova assemblies were merged using the Quickmerge tool (Solares et al. 2018) and all previously mentioned assemblies were merged with Bionano optical maps using the Bionano Solve hybrid-scaffolding tool. The resulting hybrid scaffolds were aligned to Hg19 using Mini-map2, and SVs were called using Assemblatron.

Solving the complex chromosomal rearrangements
The BPJs were found by combining the filtered FindSV ) output with the filtered Sniffles (Sedlazeck et al. 2018) output using SVDB (Eisfeldt et al. 2017). Next, the BPJs were manually inspected through IGV (Thorvaldsdóttir et al. 2013), this inspection was necessary to resolve inaccuracies due to repeats or microhomology, as well as to find BPJs not called by any of the pipelines.
The filtered list of BPJs were manually compared the Optical mapping data, using Bionano Access, as well as the linked reads, using IGV and Loupe (https ://suppo rt.10xge nomic s.com). The manual comparison was also necessary due to the low resolution of these technologies, as well as the large amount of false-positive calls produced by the Bionano and Longranger pipelines. BPJs were considered true if they were supported by at least two of the WGS methods (either through calling or manual inspection), or by breakpoint PCR.
Aberrant DNA fragments were defined based on the quality controlled and filtered data. Except for the terminal fragments, each DNA fragment was associated with two BPJs, one at the head and one at the tail, forming chains of fragments, fused together through the BPJs. Lastly, the derivative chromosomes were defined using a custom script. This script "walks" through these chains, starting at a userdefined fragment (such as the first fragment of chromosome 4), and continues to the next fragment through the BPJs (or calls) associated with each fragment. This "walking" is continued until the script finds the terminal fragment in a chain and at that point the script ends and returns the path traveled through the chain of aberrant fragments. This path is a representation of the derivative chromosome, detailing the order and orientation of the fragments involved.
This chain was compared to the various sequencing data, de novo assemblies, as well as the cytogenetic results and was refined until the path was consistent with all signatures in each of the datasets.
A call was considered to represent a BPJ if both of its reported breakpoint positions were within 100 Kbp of the confirmed breakpoint. An SV call being close to multiple confirmed BPJs were considered to represent only one BPJ (the closest one). The distance of 100 Kbp was selected based on the size of the aberrant fragments, and to account for the lower resolution of Optical mapping.

Hi-C data and TAD analysis
Hi-C data produced from cerebellar astrocytes were downloaded via the ENCODE experiment ENCSR011GNI website (https ://www.encod eproj ect.org/exper iment s/ENCSR 011GN I/). The analyses were based on the publicly available bed file of TAD regions (Lajoie et al. 2015). TAD regions were coupled with breakpoints using TABIX (Li 2011): in these analyses, the TAD bed file was used as a database that was queried using the coordinates of the BPs. Next, the TADs were searched for protein-coding genes, querying the RefSeq gene annotation (assembly GCF_000001405.25) gff files with the positions of the affected TADs.
Funding Open access funding provided by Karolinska Institute. AL was supported by the SciLifeLab national sequencing projects grant, the Swedish Research Council , Stockholm City Council and the Swedish Brain Foundation. MP was supported by grants from Karolinska Institutet funding for doctoral education (KID).

Data availability
The datasets generated and/or analyzed during the current study are not publicly available due the European GDPR law but are available from the corresponding author on reasonable request.

Compliance with ethical standards
Conflict of interest The authors declare no conflict of interest.

Consent to participate
The local ethical board in Stockholm, Sweden approved the study (ethics permit number KS 2012/222-31/3) and written informed consent was obtained.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.