Fine mapping chromatin contacts in capture Hi-C data
Abstract
Background
Hi-C and capture Hi-C (CHiC) are used to map physical contacts between chromatin regions in cell nuclei using high-throughput sequencing. Analysis typically proceeds by considering the evidence for contacts between each possible pair of fragments independently of other pairs. This can produce long runs of fragments which all appear to make contact with the same baited fragment of interest.
Results
We hypothesised that these long runs could result from a smaller subset of direct contacts and propose a new method, based on a Bayesian sparse variable selection approach, which attempts to fine map these direct contacts. Our model is conceptually novel, exploiting the spatial pattern of counts in CHiC data. Although we use only the CHiC count data in fitting the model, we show that the fragments prioritised display biological properties that would be expected of true contacts: for bait fragments corresponding to gene promoters, we identify contact fragments with active chromatin and contacts that correspond to edges found in previously defined enhancer-target networks; conversely, for intergenic bait fragments, we identify contact fragments corresponding to promoters for genes expressed in that cell type. We show that long runs of apparently co-contacting fragments can typically be explained using a subset of direct contacts consisting of <10% of the number in the full run, suggesting that greater resolution can be extracted from existing datasets.
Conclusions
Our results appear largely complementary to those from a per-fragment analytical approach, suggesting that they provide an additional level of interpretation that may be used to increase resolution for mapping direct contacts in CHiC experiments.
Keywords
Capture Hi-C, Chromatin conformation, Bayesian statistics, Variable selection
Background
The three-dimensional structure of the genome influences gene expression at varying levels of scale [1]. Multi-megabase compartments of active and inactive chromatin, as well as topologically associated domains (TADs) spanning hundreds of kilobases, can be readily identified by mapping physical interactions using genome-wide chromatin conformation capture techniques (Hi-C) [2, 3]. However, as Hi-C quantifies interactions between all possible pairs of regions in the genome (e.g. HindIII fragments) via massively parallel sequencing, it is inefficient at characterizing individual enhancer-promoter interactions in great depth. To explore such regulatory interactions in detail, the more recently developed Capture Hi-C (CHiC) method targets sequencing efforts toward interactions between predefined regions of interest (“baits”, e.g. HindIII fragments overlapping gene promoters) on one end, and all other regions (“prey”) on the other [4, 5].
CHiC has enabled identification of contacts made by promoters in primary human cells [5, 6]. The contact maps thus generated show a tendency for multiple contiguous fragments to be linked with the same promoter [7, 8], but it is not clear whether enhancers overlapping all these fragments or only a subset of them are directly relevant to the promoter’s regulation. Conversely, the same enhancer region can appear to interact with promoters of multiple genes [9, 10], while it remains unclear whether this reflects co-regulation of these genes [11]. Either phenomenon could also be caused by a lack of resolution in these maps, which is typically constrained by the restriction enzyme used (e.g. HindIII produces fragments of median length 4 kb). Given that typical enhancers and promoters are considerably shorter than a single HindIII fragment [12], we hypothesised that collateral contacts may be identified along with the direct enhancer-promoter contacts they neighbour. Such collateral contacts might result from a bait traversing the regions around its primary target via Brownian motion, potentially during the formation of loops [13], or from inaccuracies in the cross-linking of proximal regions during the CHiC procedure [14, 15].
Here, we propose a statistical model in which, for any given bait, the expected CHiC signal at each prey is expressed as a sum of contributions from a sparse set of fragments directly contacting that bait. This decomposition model allows us to view the CHiC signal at each prey in the context of the signals in its local environment. We fit the model through reversible jump Markov Chain Monte Carlo (RJMCMC) to identify primary contacts in published CHiC data from two cell types, non-activated and activated CD4^{+} T cells [8].
Results
A spatial model for CHiC data
where β_{bq} captures the strength of the interaction at q and d(p,q) is the absolute linear distance between the midpoints of fragments p and q, with parameter ω (assumed fixed and known) controlling the rate of decay.
The exponential form in (2) was chosen by examining model fits with this and other possible forms of decay functions to a subset of baits. The value of ω was fixed at 10^{−4.7}, chosen from a range of values tried because it produced the best fit to our data (full details in Additional file 1).
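To give a feel for the scale implied by this value of ω, the following minimal sketch evaluates the exponential decay factor exp(−ω·d) at a few distances, together with the distance at which a contribution halves. It assumes that d is measured in base pairs, which the text does not state explicitly; the function names are illustrative and are not part of Peaky.

```python
import math

# Decay rate reported in the text; ASSUMPTION: distances are in base pairs.
OMEGA = 10 ** -4.7


def decay_factor(distance_bp, omega=OMEGA):
    """Fraction of a direct contact's strength felt at a given linear distance."""
    return math.exp(-omega * abs(distance_bp))


def half_distance(omega=OMEGA):
    """Distance at which the decay factor equals 0.5, i.e. ln(2) / omega."""
    return math.log(2) / omega


if __name__ == "__main__":
    for d in (0, 10_000, 50_000, 200_000):
        print(f"d = {d:>7} bp  decay = {decay_factor(d):.3f}")
    print(f"half-distance = {half_distance():.0f} bp")
```

Under the base-pair assumption, the half-distance is roughly 35 kb, i.e. several HindIII fragments on either side of a direct contact would be expected to carry an appreciable share of its signal.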
where the sum is taken over all fragments q in some neighbourhood of the bait b, and γ_{bq} is a latent indicator variable, taking the value 1 if there is a direct contact between fragments b and q (i.e. β_{bq}≠0) and 0 otherwise.
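Read together, the definitions above describe the expected signal at prey p for bait b as the sum, over fragments q, of γ_{bq} β_{bq} exp(−ω d(p,q)). The sketch below illustrates that reading in Python. It is not the Peaky implementation: the function name and the example contacts are hypothetical, and distances are assumed to be in base pairs.

```python
import math

# Decay rate from the text; ASSUMPTION: distances are in base pairs.
OMEGA = 10 ** -4.7


def expected_signal(prey_mid, contacts, omega=OMEGA):
    """Expected signal at one prey given a sparse set of direct contacts.

    contacts: list of (midpoint_bp, beta) pairs, i.e. the fragments q with
    gamma_bq = 1 and their strengths beta_bq. Fragments with gamma_bq = 0
    contribute nothing and are simply left out of the list.
    """
    return sum(beta * math.exp(-omega * abs(prey_mid - mid))
               for mid, beta in contacts)


# Hypothetical example: two direct contacts among ~4 kb prey fragments.
contacts = [(100_000, 3.0), (140_000, 2.0)]
prey_midpoints = range(80_000, 164_000, 4_000)
profile = {p: expected_signal(p, contacts) for p in prey_midpoints}
```

In this toy profile every prey between and around the two contacts has a raised expected signal even though only two fragments make direct contact, which is the collateral signal the model is designed to explain away.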
We provide functions to implement this model in the R package Peaky, available from http://github.com/cqgd/pky.
Inference of bait-prey interactions
We fit the model using an RJMCMC sampler, R2BGLiMS [16]. For each bait b, the distribution of sampled coefficients β_{b1},…,β_{bF_{b}} reflects the posterior distribution of contact strengths between the bait b and its neighbouring preys. Given our prior assumption that contacts lead to increased rather than decreased counts, we decided not to use the common marginal posterior probability of inclusion, defined as the proportion of samples in which β_{bp}≠0. Instead, we defined an analogous statistic, the marginal posterior probability of a contact (MPPC) between bait b and prey p, as the proportion of sampled models in which β_{bp}>0, and use this as the primary statistic for inference.
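The distinction between the two statistics can be made concrete with a short sketch. It assumes the sampler output for one bait-prey pair is available as a plain list of sampled β_{bp} values, with 0 recorded whenever the prey is excluded from the sampled model; the sample values are invented for illustration and the function names are not taken from R2BGLiMS or Peaky.

```python
def mppi(samples):
    """Marginal posterior probability of inclusion: share of samples with beta != 0."""
    return sum(1 for b in samples if b != 0) / len(samples)


def mppc(samples):
    """Marginal posterior probability of a contact: share of samples with beta > 0."""
    return sum(1 for b in samples if b > 0) / len(samples)


# Hypothetical posterior samples of beta_bp for one bait-prey pair
# (0.0 means the prey was excluded from that sampled model).
samples = [0.0, 1.2, 0.0, 0.8, -0.3, 1.5, 0.0, 0.9, 0.0, -0.1]
```

Here the prey is included in six of ten sampled models but has a positive coefficient in only four, so the MPPC is lower than the inclusion probability; samples with negative coefficients do not count as evidence for a contact.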
Reproducibility of MPPC calls on replicate data from macrophages
Application to CHiC data from activated and non-activated CD4^{+} T cells
We applied the above model to four parallel data sets generated from CD4^{+} T cells: two from non-activated cells (“non”) cultured for four hours in buffer and two from activated cells (“act”) cultured for four hours with anti-CD3/CD28 beads, all previously analysed with CHiCAGO [8]. We chose to use two cell types so that we could check that results were representative rather than specific to a single dataset (which might indicate overfitting), and chose these cell types specifically because external datasets were available for biological validation. Each pair consisted of a promoter capture set, with 22,032 bait fragments representing the promoters of 28,007 unique annotated genes (16,116 baits representing 17,731 protein coding genes), and a validation capture set, with 945 bait fragments that were preys contacting baits in the promoter dataset according to analysis using the standard CHiCAGO pipeline [6] in CD4^{+} T cells, megakaryocytes or erythroblasts [8]. Note that the validation set is used here to provide a complementary dataset whose true contacts would be expected to have different biological characteristics from those of the promoter capture set, owing to the opposite fragment being captured. The validation capture array was designed before peaky was conceived, on the basis of CHiCAGO scores only. Thus, because baits were not selected into the validation data on the basis of MPPC, they cannot be used to validate the MPPC itself. We preprocessed raw counts from each dataset separately to generate NB residuals. QQ plots showed that our assumptions of central normality and a long right tail were met (Additional file 2: Figure S1).
Number and % of baits for which the correlation (ρ) of MPPC between two parallel runs exceeded 0.75
Experiment        Total baits   n (ρ>0.75)   % (ρ>0.75)
Promoter, non     13078         12528        95.8
Promoter, act     13319         12785        96.0
Validation, non   648           622          96.0
Validation, act   706           688          97.5
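The summary in this table can be reproduced in outline as follows. This is a sketch under stated assumptions: the text does not say which correlation coefficient was used, so Pearson correlation is assumed here, and the input structure (a mapping from bait to the MPPC vectors of two runs) and the example values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (ASSUMED choice of rho)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def summarise(runs_by_bait, threshold=0.75):
    """Count and percentage of baits whose two runs correlate above the threshold.

    runs_by_bait maps a bait id to (mppc_run1, mppc_run2), two lists giving
    the MPPC of the same prey fragments in each run.
    """
    rhos = {bait: pearson(r1, r2) for bait, (r1, r2) in runs_by_bait.items()}
    n_pass = sum(1 for r in rhos.values() if r > threshold)
    return n_pass, 100.0 * n_pass / len(rhos)


# Hypothetical MPPC vectors for two baits, four preys each.
example = {
    "bait_A": ([0.9, 0.1, 0.0, 0.5], [0.8, 0.2, 0.0, 0.6]),  # runs agree
    "bait_B": ([0.9, 0.1, 0.0, 0.5], [0.1, 0.6, 0.3, 0.2]),  # runs disagree
}
n_pass, pct_pass = summarise(example)
```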
MPPC provides additional information for distinguishing biologically plausible contacts
ΔBIC from the intercept-only model for four measures of biological plausibility of contacts
Model                                        Non-activated   Activated
a: Promoter: match to Cao et al.
MPPC                                         −338.0          −245.8
CHiCAGO                                      −332.2          −331.4
MPPC + CHiCAGO                               ^{a}−411.6      ^{a}−358.8
b: Promoter: link to active chromatin
MPPC                                         −1134.2         −812.3
CHiCAGO                                      −659.5          −560.1
MPPC + CHiCAGO                               ^{a}−1231.1     ^{a}−924.0
c: Validation: overlap baited promoter
MPPC                                         −404.8          −347.9
CHiCAGO                                      −430.0          −419.2
MPPC + CHiCAGO                               ^{a}−541.7      ^{a}−499.0
d: Validation: expression at linked promoter
MPPC                                         −1571.0         ^{a}−1329.2
CHiCAGO                                      −871.9          −430.2
MPPC + CHiCAGO                               ^{a}−1640.3     −1318.7
MPPC can be used to prioritise direct contacts amongst long runs
This suggested that long stretches could result from direct contacts at a small subset, and that our joint model could distinguish these, ranking some as more probable direct contacts than others. As the true sets of direct contacts are unknown, we again used external data to assess whether this prioritisation corresponded to fragments with more biologically supportable characteristics. We found that, within these stretches, the MPPC remained a significant predictor of whether fragments corresponded to biologically plausible features across all run lengths (Additional file 2: Table S2).
The MPPC is a continuous measure from 0 to 1, rather than a yes/no discriminator. We evaluated its utility according to its correspondence with characteristics expected in direct contacts.
Discussion
Our results support our hypothesis that long runs of prey fragments with high counts in CHiC data can result from a smaller number of direct contacts together with collateral signal at their neighbours. This suggests that efforts to jointly model the pattern of counts across multiple fragments have potential to distinguish those direct contacts. Joint modelling to improve resolution is already used to fine map genetic causal variants in genome-wide association studies (GWAS). It is accepted in GWAS that the p value corresponding to a test of association between a single genetic variant and some phenotype should be interpreted in the context of the p values of its neighbours, either by highlighting the variant in the region with the smallest p value, or by fitting a variable selection model to find a sparse subset of variants which could explain the association signals across the region. The primary difference between our CHiC model and the class of GWAS fine mapping models that also fit the association statistics directly (e.g. PAINTOR, [18]) is that the decay of GWAS association signals across genetic variants has been established to relate to the linkage disequilibrium or correlation between those variants within the population, while our model assumes an exponential decay specified by a single parameter ω. We chose ω by considering a range of values and choosing the one that produced residuals without obvious autocorrelation. Fixing ω in this way meant we could parallelise our analysis, considering each bait independently, but different values of ω would produce different results. Future work will explore whether it is computationally feasible to specify ω within a hierarchical framework that considers multiple baits simultaneously.
In addition, we intend to investigate whether these ideas – using information from sets of proximal locations in a joint model to make inference about each individual location – could be adapted to other techniques used to call DNA contacts, such as next generation Capture-C [19], ChIA-PET [20] and HiChIP [21], although different decay functions might be required.
In addition to jointly modelling the signal across multiple fragments, our proposed model contrasts with previous efforts to analyse CHiC data by producing a Bayesian measure of confidence in the location of a direct contact: the MPPC. Both the MPPC and the CHiCAGO score decay with distance from bait, emphasising that short range contacts predominate, at least within the set of contacts detectable through CHiC. There is, though, a notable difference between the rates of decay (Additional file 2: Figure S3). This reflects the deliberate choice of the CHiCAGO authors to weight p values such that more distant interactions were less likely to be called significant. We chose not to adopt any distance-dependent prior as our intention was to fine map contacts already called by a method that incorporates this distance penalty, such as CHiCAGO, and we did not wish to doubly penalise distant contacts. However, it is possible that adopting such a prior would lead to improved inference were our joint model to be applied alone. We also note that while we have broadly followed the CHiCAGO preprocessing approach here so that our results can be considered a fine mapping layer on top of a standard CHiCAGO analysis, other preprocessing would be possible. For instance, for bait fragments near a TAD boundary, the counts are likely to have a different distribution (fewer counts) towards the boundary than towards the centre of the TAD. While our approach is agnostic to the location of TAD boundaries, other methods such as PeakC [22] explicitly account for asymmetrically distributed counts around the bait and could be used as alternative models to generate standardised residuals, followed by peaky fine mapping.
We argue that our joint analysis of neighbouring prey fragments adds a further useful dimension to the analysis of CHiC data, with a CHiCAGO score reflecting the (distance-from-bait-adjusted) evidence for there being any contacts in a neighbourhood, and the MPPC reflecting the expected number and the uncertainty in the precise location of direct contacts. Other advantages of adopting a Bayesian framework include the ability to extend the model to include not just bait-prey distance, but other prior information on the likelihood of direct contacts. This would enable, for example, information from previous experiments in related cell types to inform future analyses.
Conclusions
We have proposed a new model for calling direct contacts from CHiC data that, in contrast to existing fragment-by-fragment analysis methods, exploits information from each prey’s neighbouring fragments. Our joint model identifies prey fragments with biological characteristics that would be expected at sites of direct contact, such as an active chromatin state when they contact promoters. We have shown this information is largely complementary to that produced by the per-fragment method, CHiCAGO. Combining inference across these two approaches is more stringent – a prey fragment needs to simultaneously have a higher count than expected and a supporting pattern among neighbouring fragments – and leads to improved resolution of direct contacts in CHiC datasets.
Methods
Preprocessing of read count data
We first preprocess the read count data using similar methods to standard CHiC analysis to produce residuals which have a standard normal distribution in the absence of interactions. The raw data for a CHiC experiment take the form of a sparse matrix of counts for pairs of baits and preys. In practice, most entries in this matrix are zero, and analysis focuses on modelling the counts at preys that are within some linear genomic distance of each bait. Statistical inference of contacts is based on a two-step approach. First, counts are modelled to adjust for systematic effects such as distance between bait and prey, and capture efficiency, using either negative binomial (NB) regression [9] or a convolution of NB and Poisson regression to model biological and technical noise separately [6]. Second, a decision is taken to call contacts based on comparing observed counts to those expected under this empirical model estimated under a null hypothesis of no true interactions, either using raw p values [9] or p values weighted to allow for the complication that we expect to find more interactions among fragments proximal to the bait, but test many more long-distance pairs. We wished to use the first part of this procedure to account for the systematic effects in the data, and generate standardised residuals (that is, residuals with unit variance) for input into our proposed joint model.
The CHiC data from CD4^{+} cells used in this study have previously been processed by the HiCUP pipeline [23] and CHiCAGO [6] as described in [8]. We noted that the technical noise component had a generally small contribution compared to the biological noise (Additional file 2: Figure S8). We therefore applied NB regression alone to the raw counts using standard software to generate these standardised residuals, which we call NB residuals in the text below. This allowed us to add additional covariates to the regression which we found provided small improvements to the model fit. Besides the distance between an interaction’s bait and prey fragments, and whether both fragments in a putative contact were baited, we used the length of both fragments, as well as trans-chromosomal bait activity, as covariates. Assuming that trans-chromosomal contacts are equally rare across baits [24], the latter is a proxy for enrichment and capture biases. To account for the difference in the number of possible trans-chromosomal interaction sites between baits on different chromosomes, trans-chromosomal bait activity is defined for each bait as the residual from the regression of the sum of its trans-chromosomal counts against its chromosome number. We used the R package GAMLSS to fit zero-truncated NB models to counts for each pair of bait (b) and prey (p) within ten distance bins (Additional file 2: Table S3). Assuming most bait-prey fragment pairs do not make direct contacts, the null model can be parametrized using the full dataset. We used normalized, randomized quantile residuals [25] as the input for our joint model. A comparison between the predicted fits from CHiCAGO and our NB models applied to the same data showed a good correspondence (Additional file 2: Figure S9).
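The final step, turning a count into a normalized randomized quantile residual under a zero-truncated NB null, can be sketched as below. This is a standalone Python illustration of the general technique from [25], not the GAMLSS code used in the analysis. It assumes the NB is parametrized by its mean (mu) and a dispersion parameter (size), and that fitted values of both are already available for each bait-prey pair; the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu and dispersion parameter size."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)


def ztnb_cdf(y, mu, size):
    """CDF of the zero-truncated NB: P(Y <= y | Y >= 1)."""
    if y < 1:
        return 0.0
    p0 = nb_pmf(0, mu, size)
    upper = sum(nb_pmf(k, mu, size) for k in range(1, y + 1))
    return upper / (1.0 - p0)


def randomized_quantile_residual(y, mu, size, rng=random):
    """Draw u uniformly between F(y - 1) and F(y), then map it to a normal quantile."""
    lo, hi = ztnb_cdf(y - 1, mu, size), ztnb_cdf(y, mu, size)
    u = rng.uniform(lo, hi)
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf away from 0 and 1
    return NormalDist().inv_cdf(u)
```

Because the count distribution is discrete, the uniform draw between F(y−1) and F(y) is what makes the residuals continuous; if the null model is correct they are standard normal, so counts far above expectation show up as large positive residuals.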
Priors on model parameters
Rather than fixing σ_{β}, which controls the magnitude of interaction strengths supported by the model, and therefore can have an important impact on the efficiency of the algorithm, we use a flexible hyperprior allowing adaptation to the data. Specifically, we placed a weakly informative Uniform(0.01,2) hyperprior on σ_{β}. The median, σ_{β}=1, corresponds to 95% support for interaction strengths up to a plausible 1.96. However, this hyperprior equally supports much smaller values of σ_{β}, as well as values up to the maximum of 2, corresponding to support for interaction strengths as large as 8, marginally larger than any individual NB residual we observed (Additional file 2: Figure S1).
Conditional on θ_{b}, each γ_{bq} is then i.i.d. Bernoulli(θ_{b}). This prior has two attractive properties. First, the marginal prior odds that a particular fragment interacts with b is 1/F_{b}, and therefore decreases with the total number of fragments considered. Meanwhile, the prior odds for there being no interactions is a constant 0.5 for every bait. This setup provides an intrinsic multiplicity correction for the number of fragments considered for each bait, and allows fair comparison of inference across baits, due to the common prior on the null model [26]. Note too that this corresponds to a very small prior odds of interaction for each individual fragment, since F_{b} is usually on the order of 3000, and thereby encourages the exploration of sparse models.
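These two properties can be checked numerically. The hyperprior on θ_{b} is not shown in the text as given here, so the sketch below assumes θ_{b} ~ Beta(1, F_{b}), which is one choice that reproduces the stated marginal prior odds of 1/F_{b}; both this assumption and the function names are illustrative only.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def marginal_inclusion_odds(a, b):
    """Prior odds that a single gamma_bq = 1 when theta_b ~ Beta(a, b)."""
    p = a / (a + b)  # E[theta_b]
    return p / (1 - p)


def prob_no_contacts(n_fragments, a, b):
    """Prior probability that all gamma_bq = 0, i.e. E[(1 - theta_b)^F_b]."""
    return math.exp(log_beta(a, b + n_fragments) - log_beta(a, b))


if __name__ == "__main__":
    # ASSUMED hyperprior: theta_b ~ Beta(1, F_b).
    for F_b in (100, 3000):
        print(F_b, marginal_inclusion_odds(1, F_b), prob_no_contacts(F_b, 1, F_b))
```

Under this assumed hyperprior the per-fragment odds shrink as 1/F_{b}, while the null model of no contacts receives prior probability 0.5 whatever the value of F_{b}, which appears to be the constant 0.5 referred to above.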
Model fitting via reversible jump MCMC
Assessing relationship of CHiCAGO scores and MPPC to outcome measures
CHiCAGO scores are non-negative real numbers, and are typically asinh transformed for presentation or downstream inference, to prevent overleverage of points in the extreme right of the distribution [8]. In contrast, MPPC lies between 0 and 1, although it rarely reaches 1 in practice. We found MPPC were generally positively correlated with CHiCAGO scores, with the relationship closest to linear when sqrt(MPPC) was compared to asinh(CHiCAGO) (Additional file 2: Figure S4). We therefore use a square root transform in the following analyses to perform a fair comparison with CHiCAGO scores.
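The two transforms are simple to state in code. The sketch below is only an illustration of why asinh tames the right tail of the CHiCAGO score: it is close to linear near zero and grows like log(2x) for large x. The function names are hypothetical.

```python
import math


def transform_mppc(mppc):
    """Square root transform applied to MPPC (values in [0, 1])."""
    return math.sqrt(mppc)


def transform_chicago(score):
    """Inverse hyperbolic sine transform applied to non-negative CHiCAGO scores."""
    return math.asinh(score)


if __name__ == "__main__":
    for s in (0.0, 1.0, 5.0, 50.0, 500.0):
        print(f"CHiCAGO {s:>6}: asinh = {transform_chicago(s):.3f}")
    for m in (0.01, 0.25, 1.0):
        print(f"MPPC {m:>5}: sqrt = {transform_mppc(m):.3f}")
```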

Promoter: match to [17]. For validation with external promoter-enhancer networks, we used the positions given in http://yiplab.cse.cuhk.edu.hk/jeme/encoderoadmap_lasso/encoderoadmap_lasso.34.csv (accessed 2017/09/11). We used GenomicRanges to identify bait-prey fragment pairs which overlapped the paired coordinates given in this file, and set a binary outcome to 1 if such an overlap was found and 0 otherwise. Analyses of this measure were restricted to prey fragments within 200 kb of the bait, because 95% of these reported links were within that range.

Promoter: link to active chromatin. These cells had previously been assayed by ChIP-seq, and a 15-state ChromHMM model fitted. Eight of these states showed characteristics of “active chromatin” and we combined these into a binary measure for active or inactive chromatin [7]. We used these results to quantify the overlap, for each prey fragment, with regions of active chromatin. For the most part (∼90%), a fragment showed complete overlap or lack of overlap with active chromatin regions, in which case the outcome measure was set to 1 or 0 respectively. To allow logistic regression of this mainly binary outcome, the observations with fractional overlap were set to missing for analysis.
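The coding of this outcome can be sketched as follows. The example assumes half-open (start, end) coordinates on a single chromosome and non-overlapping active regions; the function names and coordinates are hypothetical and do not reflect the GenomicRanges-based code used for the analysis.

```python
def overlap_fraction(frag_start, frag_end, active_regions):
    """Fraction of the fragment [frag_start, frag_end) covered by active chromatin.

    active_regions: non-overlapping (start, end) half-open intervals on the
    same chromosome as the fragment.
    """
    covered = sum(max(0, min(frag_end, end) - max(frag_start, start))
                  for start, end in active_regions)
    return covered / (frag_end - frag_start)


def binary_outcome(fraction):
    """1 for complete overlap, 0 for none, None (missing) for fractional overlap."""
    if fraction == 1.0:
        return 1
    if fraction == 0.0:
        return 0
    return None


# Hypothetical fragment and active-chromatin regions.
full = binary_outcome(overlap_fraction(100, 200, [(50, 250)]))
none = binary_outcome(overlap_fraction(100, 200, [(300, 400)]))
partial = binary_outcome(overlap_fraction(100, 200, [(150, 400)]))
```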

Validation: overlap baited promoter. For a measure of promoter overlap, we used the binary indicator of whether a prey fragment in the validation experiment had also been baited in the promoter experiment.

Validation: expression at linked promoter. Given evidence that recruitment of prey fragments is associated with increased expression of the baited gene [7], we expected that, amongst prey that did correspond to a baited promoter in the promoter capture experiment, the level of expression of the target gene should be higher when there was a direct contact. RNA-seq has previously been used to quantify transcription in these cells, and we used the expression of the target gene (log_{2}(count + 1)) as an outcome measure in linear regression. Analyses of this measure were restricted to bait-prey pairs where the prey corresponded to a gene promoter.
Because each prey fragment is represented multiple times (with different baits), we assessed the relationship between asinh-transformed CHiCAGO scores and sqrt-transformed MPPC with each outcome measure using robust clustered linear or logistic regression implemented in the R library rms (https://cran.r-project.org/web/packages/rms/), clustering on the prey fragment.
Notes
Acknowledgments
We thank Frank Dudbridge and Mikhail Spivakov for helpful discussions throughout the development of our method.
Funding
This work was funded by the MRC (MC_UU_00002/4, MC_UU_00002/9) and the Wellcome Trust (WT107881).
Availability of data and materials
Data used are available as described in the primary publications and as Additional Files (see below). CHiC data, inferred ChromHMM states and RNA-seq quantification are available as described in [7]. We downloaded summaries of the enhancer-promoter networks [17] from http://yiplab.cse.cuhk.edu.hk/jeme/encoderoadmap_lasso/encoderoadmap_lasso.34.csv. We provide functions to implement this model in the R package Peaky, available from http://github.com/cqgd/pky. Code used to run peaky on these data, and to generate the tables and figures in this paper, is available at https://github.com/chr1swallace/eijsboutsetal. All raw data used are available in Additional file 3.
Authors’ contributions
CE developed the statistical method, implemented it, applied it to the CHiC datasets, wrote the paper, authored the software. OB annotated gene TSS and CHiC HindIII fragments, critically evaluated the enrichment analysis, wrote the paper. PN developed the statistical method, wrote the paper. CW devised the study, developed the statistical method, performed statistical enrichment analyses, wrote the paper. All authors read and approved the final manuscript.
Ethics approval and consent to participate
No new data were generated in this study. All data come from a previous study [7] for which all samples and information were collected with written and signed informed consent. The study was approved by the local Peterborough and Fenland research ethics committee for the project entitled: ‘An investigation into genes and mechanisms based on genotype-phenotype correlations in type 1 diabetes and related diseases using peripheral blood mononuclear cells from volunteers that are part of the Cambridge BioResource project’ (05/Q0106/20). Experimental methods comply with the Helsinki Declaration.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Gierman HJ, Indemans MH, Koster J, Goetze S, Seppen J, Geerts D, van Driel R, Versteeg R. Domain-wide regulation of gene expression in the human genome. Genome Res. 2007; 17(9):000–000.
2. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012; 485(7398):376–80.
3. Van Berkum NL, Lieberman-Aiden E, Williams L, Imakaev M, Gnirke A, Mirny LA, Dekker J, Lander ES. Hi-C: a method to study the three-dimensional architecture of genomes. JoVE (J Visualized Exp). 2010; 39:1869.
4. Jäger R, Migliorini G, Henrion M, Kandaswamy R, Speedy HE, Heindl A, Whiffin N, Carnicer MJ, Broome L, Dryden N, et al. Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. Nat Commun. 2015; 6:6178.
5. Mifsud B, Tavares-Cadete F, Young AN, Sugar R, Schoenfelder S, Ferreira L, Wingett SW, Andrews S, Grey W, Ewels PA, Herman B, Happe S, Higgs A, LeProust E, Follows GA, Fraser P, Luscombe NM, Osborne CS. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet. 2015; 47(6):598–606. https://doi.org/10.1038/ng.3286.
6. Cairns J, Freire-Pritchett P, Wingett SW, Várnai C, Dimond A, Plagnol V, Zerbino D, Schoenfelder S, Javierre BM, Osborne C, et al. CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data. Genome Biol. 2016; 17(1):127.
7. Burren OS, Rubio García A, Javierre BM, Rainbow DB, Cairns J, Cooper NJ, Lambourne JJ, Schofield E, Castro Dopico X, Ferreira RC, Coulson R, Burden F, Rowlston SP, Downes K, Wingett SW, Frontini M, Ouwehand WH, Fraser P, Spivakov M, Todd JA, Wicker LS, Cutler AJ, Wallace C. Chromosome contacts in activated T cells identify autoimmune disease candidate genes. Genome Biol. 2017; 18(1):165. https://doi.org/10.1186/s13059-017-1285-0.
8. Javierre BM, Burren OS, Wilder SP, Kreuzhuber R, Hill SM, Sewitz S, Cairns J, Wingett SW, Várnai C, Thiecke MJ, Burden F, Farrow S, Cutler AJ, Rehnström K, Downes K, Grassi L, Kostadima M, Freire-Pritchett P, Wang F, BLUEPRINT Consortium, Stunnenberg HG, Todd JA, Zerbino DR, Stegle O, Ouwehand WH, Frontini M, Wallace C, Spivakov M, Fraser P. Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters. Cell. 2016; 167(5):1369–1384.e19. https://doi.org/10.1016/j.cell.2016.09.037.
9. Dryden NH, Broome LR, Dudbridge F, Johnson N, Orr N, Schoenfelder S, Nagano T, Andrews S, Wingett S, Kozarewa I, Assiotis I, Fenwick K, Maguire SL, Campbell J, Natrajan R, Lambros M, Perrakis E, Ashworth A, Fraser P, Fletcher O. Unbiased analysis of potential targets of breast cancer susceptibility loci by Capture Hi-C. Genome Res. 2014; 24(11):1854–68. https://doi.org/10.1101/gr.175034.114.
10. Martin P, McGovern A, Orozco G, Duffus K, Yarwood A, Schoenfelder S, Cooper NJ, Barton A, Wallace C, Fraser P, Worthington J, Eyre S. Capture Hi-C reveals novel candidate genes and complex long-range interactions with related autoimmune risk loci. Nat Commun. 2015; 6:10069. https://doi.org/10.1038/ncomms10069.
11. Novo CL, Javierre BM, Cairns J, Segonds-Pichon A, Wingett SW, Freire-Pritchett P, Furlan-Magaril M, Schoenfelder S, Fraser P, Rugg-Gunn PJ. Long-range enhancer interactions are prevalent in mouse embryonic stem cells and are reorganized upon pluripotent state transition. Cell Rep. 2018; 22(10):2615–27.
12. Malin J, Aniba MR, Hannenhalli S. Enhancer networks revealed by correlated DNase hypersensitivity states of enhancers. Nucleic Acids Res. 2013; 41:374.
13. Schwarzer W, Abdennur N, Goloborodko A, Pekowska A, Fudenberg G, Loe-Mie Y, Fonseca NA, Huber W, Haering C, Mirny L, et al. Two independent modes of chromosome organization are revealed by cohesin removal. bioRxiv. 2016; 551:094185.
14. Belmont AS. Large-scale chromatin organization: the good, the surprising, and the still perplexing. Curr Opin Cell Biol. 2014; 26:69–78.
15. Williamson I, Berlivet S, Eskeland R, Boyle S, Illingworth RS, Paquette D, Dostie J, Bickmore WA. Spatial genome organization: contrasting views from chromosome conformation capture and fluorescence in situ hybridization. Genes Dev. 2014; 28(24):2778–91.
16. Newcombe PJ, Ali HR, Blows FM, Provenzano E, Pharoah PD, Caldas C, Richardson S. Weibull regression with Bayesian variable selection to identify prognostic tumour markers of breast cancer survival. Stat Methods Med Res. 2014; 26:414–36. https://doi.org/10.1177/0962280214548748.
17. Cao Q, Anyansi C, Hu X, Xu L, Xiong L, Tang W, Mok MTS, Cheng C, Fan X, Gerstein M, Cheng ASL, Yip KY. Reconstruction of enhancer-target networks in 935 samples of human primary cells, tissues and cell lines. Nat Genet. 2017. https://doi.org/10.1038/ng.3950.
18. Kichaev G, Yang WY, Lindstrom S, Hormozdiari F, Eskin E, Price AL, Kraft P, Pasaniuc B. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014; 10(10):1004722. https://doi.org/10.1371/journal.pgen.1004722.
19. Davies JO, Telenius JM, McGowan SJ, Roberts NA, Taylor S, Higgs DR, Hughes JR. Multiplexed analysis of chromosome conformation at vastly improved sensitivity. Nat Methods. 2015; 13(1):74.
20. Li G, Cai L, Chang H, Hong P, Zhou Q, Kulakova EV, Kolchanov NA, Ruan Y. Chromatin interaction analysis with paired-end tag (ChIA-PET) sequencing technology and application. BMC Genom. 2014; 15(12):11.
21. Mumbach MR, Rubin AJ, Flynn RA, Dai C, Khavari PA, Greenleaf WJ, Chang HY. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nature Methods. 2016; 13(11):919.
22. Geeven G, Teunissen H, de Laat W, de Wit E. peakC: a flexible, non-parametric peak calling package for 4C and Capture-C data. Nucleic Acids Res. 2018; 46(15):91. https://doi.org/10.1093/nar/gky443.
23. Wingett S, Ewels P, Furlan-Magaril M, Nagano T, Schoenfelder S, Fraser P, Andrews S. HiCUP: pipeline for mapping and processing Hi-C data. F1000Research. 2015; 4:1310.
24. Johanson TM, Coughlan HD, Lun AT, Bediaga NG, Naselli G, Garnham AL, Harrison LC, Smyth GK, Allan RS. No kissing in the nucleus: Unbiased analysis reveals no evidence of trans chromosomal regulation of mammalian immune development. bioRxiv. 2017. https://doi.org/10.1101/212985. https://www.biorxiv.org/content/early/2017/11/02/212985.full.pdf.
25. Dunn PK, Smyth GK. Randomized Quantile Residuals. J Comput Graph Stat. 1996; 5(3):236–44. https://doi.org/10.1080/10618600.1996.10474708.
26. Wilson MA, Iversen ES, Clyde MA, Schmidler SC, Schildkraut JM. Bayesian model search and multilevel inference for SNP association studies. Ann Appl Stat. 2010; 4(3):1342–64. https://doi.org/10.1214/09-AOAS322. http://arxiv.org/abs/0908.1144.
27. Green PJ. Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination. Biometrika. 1995; 82(4):711. https://doi.org/10.2307/2337340.
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.