Computational tools for Hi-C data analysis

Abstract

Background

In eukaryotic genomes, chromatin is not randomly distributed in the cell nucleus but is instead organized into higher-order structures. Emerging evidence indicates that these higher-order chromatin structures play important roles in regulating genome functions such as transcription and DNA replication. With the advancement of 3C (chromosome conformation capture) based technologies, Hi-C has been widely used to investigate genome-wide long-range chromatin interactions during cellular differentiation and oncogenesis. Since the first publication of the Hi-C assay in 2009, many bioinformatic tools have been implemented for processing Hi-C data, from mapping raw reads to normalizing contact matrices and high-level interpretation, either providing a whole workflow pipeline or focusing on a particular step.

Results

This article reviews the general Hi-C data processing workflow and the currently popular Hi-C data processing tools. We highlight how these tools are used for a full interpretation of Hi-C results.

Conclusions

The Hi-C assay is a powerful tool for investigating higher-order chromatin structure. Continued development of novel methods for Hi-C data analysis will be necessary to better understand the regulatory function of genome organization.

Acknowledgments

This work is supported by the National Basic Research Program of China (Nos. 2016YFA0100703 and 2015CB964800) and the National Natural Science Foundation of China (No. 31271354).

Author information

Corresponding author

Correspondence to Gang Wei.

About this article

Cite this article

Han, Z., Wei, G. Computational tools for Hi-C data analysis. Quant Biol 5, 215–225 (2017). https://doi.org/10.1007/s40484-017-0113-6

  • DOI: https://doi.org/10.1007/s40484-017-0113-6

Keywords

  • 3D genome structure
  • Hi-C data processing tool
  • chromatin interactions