Abstract
Hi-C technology captures genomic segments that lie in close spatial proximity in three-dimensional space, enabling the study of chromosome structures at unprecedentedly high throughput and resolution. However, the experimental steps of Hi-C often introduce systematic biases from different sources into the resulting data (i.e., reads or read counts). Several bias reduction methods have been proposed recently. Although both systematic biases and spatial distance are known to be key factors determining the number of observed chromatin interactions, the existing bias reduction methods do not include spatial distance explicitly in their computational models for estimating the interactions. In this work, we propose an improved Poisson regression model that takes spatial distance into consideration, together with an efficient gradient-descent-based algorithm, GDNorm, for reducing biases in Hi-C data. GDNorm has been tested on both simulated and real Hi-C data, and its performance has been compared with that of state-of-the-art bias reduction methods. The experimental results show that the improved Poisson model provides more accurate normalized contact frequencies (measured in read counts) between interacting genomic segments, and thus a more accurate chromosome structure prediction when combined with a chromosome structure determination method such as ChromSDE. Moreover, when assessed on recently published data from human lymphoblastoid and mouse embryonic stem cell lines, GDNorm achieves the highest reproducibility between the biological replicates of the cell lines. The normalized contact frequencies obtained by GDNorm are well correlated with the spatial distances measured by fluorescence in situ hybridization (FISH) experiments. In addition to accurate bias reduction, GDNorm is the most time-efficient of the compared methods on the real data. GDNorm is implemented in C++ and available at http://www.cs.ucr.edu/~yyang027/gdnorm.htm
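The abstract does not give the exact form of the model, so the following is only a minimal sketch of the general idea: a Poisson regression with a log link, fitted by plain gradient descent, in which log spatial distance appears as an explicit covariate next to a bias covariate. The function names, the single bias covariate, the coefficient values, and the noise-free toy data are all assumptions made for illustration; this is not the GDNorm implementation.

```python
import math

# Hypothetical "true" coefficients for the toy data (assumed, not from the paper):
# intercept, one systematic-bias covariate, and log spatial distance.
TRUE_BETA = [1.0, 0.5, -1.0]


def make_toy_data(beta):
    """Build a small grid of covariates and noise-free expected counts."""
    rows, counts = [], []
    for bias in (-1.0, -0.5, 0.0, 0.5, 1.0):
        for log_dist in (0.0, 0.5, 1.0, 1.5, 2.0):
            x = [1.0, bias, log_dist]
            rows.append(x)
            # Expected count under the log-linear model: mu = exp(beta . x)
            counts.append(math.exp(sum(b * v for b, v in zip(beta, x))))
    return rows, counts


def fit_poisson_gd(rows, counts, step=0.1, iters=5000):
    """Fit log(mu_i) = beta . x_i by gradient descent on the mean
    Poisson negative log-likelihood, whose gradient is mean((mu_i - y_i) * x_i)."""
    n, p = len(rows), len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for x, y in zip(rows, counts):
            mu = math.exp(sum(b * v for b, v in zip(beta, x)))
            for k in range(p):
                grad[k] += (mu - y) * x[k] / n
        beta = [b - step * g for b, g in zip(beta, grad)]
    return beta


rows, counts = make_toy_data(TRUE_BETA)
fitted = fit_poisson_gd(rows, counts)
print([round(b, 3) for b in fitted])
```

Because the toy counts are exact model expectations rather than sampled reads, the fit should recover the assumed coefficients closely; with real read counts the estimates would carry sampling noise, and the fixed step size would likely need tuning.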
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, EW., Jiang, T. (2014). GDNorm: An Improved Poisson Regression Model for Reducing Biases in Hi-C Data. In: Brown, D., Morgenstern, B. (eds) Algorithms in Bioinformatics. WABI 2014. Lecture Notes in Computer Science, vol 8701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44753-6_20
DOI: https://doi.org/10.1007/978-3-662-44753-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44752-9
Online ISBN: 978-3-662-44753-6
eBook Packages: Computer Science, Computer Science (R0)