Abstract
Processing, storing, and visualizing high-resolution Hi-C data has required the development of efficient data formats. Sparse matrix formats, which store only nonzero values, have become the norm. "Zoomable" matrix formats have also become popular; they store multiple resolutions in a single file for interactive visualization. This chapter discusses the latest matrix file formats, such as .hic and .mcool, and other intermediate formats, including SAM/BAM and randomly accessible contact lists.
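The two ideas in the abstract, sparse storage and multiple resolutions in one container, can be illustrated with a toy sketch. It is not the on-disk layout of .hic or .mcool; the function names `to_sparse` and `coarsen`, the 4×4 matrix, and the `zoom_levels` dictionary are hypothetical and chosen only for illustration. The sketch assumes a symmetric contact matrix, so only the upper triangle is kept.

```python
from collections import defaultdict


def to_sparse(dense):
    """Return (row_bin, col_bin, count) triples for nonzero upper-triangle cells."""
    n = len(dense)
    return [
        (i, j, dense[i][j])
        for i in range(n)
        for j in range(i, n)  # symmetric matrix: keep the upper triangle only
        if dense[i][j] != 0
    ]


def coarsen(pixels, factor):
    """Merge every `factor` adjacent bins to build a lower-resolution pixel list."""
    merged = defaultdict(int)
    for i, j, count in pixels:
        merged[(i // factor, j // factor)] += count
    return sorted((i, j, count) for (i, j), count in merged.items())


# Toy symmetric contact matrix with 4 bins (illustrative values only).
dense = [
    [5, 2, 0, 0],
    [2, 4, 1, 0],
    [0, 1, 6, 3],
    [0, 0, 3, 7],
]

pixels = to_sparse(dense)

# A "zoomable" container keeps one pixel list per resolution,
# keyed here by how many base bins are merged into one.
zoom_levels = {1: pixels, 2: coarsen(pixels, 2)}

for factor, level in zoom_levels.items():
    print(factor, level)
```

Of the 16 cells in the dense matrix, only 7 pixels need to be stored at the base resolution and 3 at the coarser one. Real formats add indexing, compression, and normalization vectors on top of this basic idea.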
Copyright information
© 2022 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Lee, S. (2022). Hi-C Data Formats. In: Bicciato, S., Ferrari, F. (eds) Hi-C Data Analysis. Methods in Molecular Biology, vol 2301. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1390-0_6
DOI: https://doi.org/10.1007/978-1-0716-1390-0_6
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-1389-4
Online ISBN: 978-1-0716-1390-0
eBook Packages: Springer Protocols