Hi-C Data Formats

Hi-C Data Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2301))

Abstract

Processing, storing, and visualizing high-resolution Hi-C data required the development of efficient data formats. A sparse matrix format that saves only nonzero values has become the norm. A “zoomable” matrix style has also become popular, storing multiple resolutions in a single file for interactive visualization. This chapter discusses the latest matrix file formats, such as .hic and .mcool, and other intermediate formats, including SAM/BAM and random-access contact lists.
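The two ideas named in the abstract, sparse storage and multiple resolutions, can be illustrated with a minimal sketch. It assumes a toy symmetric contact matrix and uses hypothetical helper names (`to_sparse`, `coarsen`); it does not reproduce the actual on-disk layout of .hic or .mcool files, only the general principle of keeping nonzero entries and deriving coarser levels from them.

```python
from collections import Counter


def to_sparse(dense):
    """Keep only nonzero upper-triangle entries as (row_bin, col_bin, count)."""
    n = len(dense)
    return [
        (i, j, dense[i][j])
        for i in range(n)
        for j in range(i, n)  # symmetric matrix: the upper triangle is enough
        if dense[i][j] != 0
    ]


def coarsen(pixels, factor):
    """Merge `factor` adjacent bins into one to build a lower-resolution level."""
    merged = Counter()
    for i, j, count in pixels:
        merged[(i // factor, j // factor)] += count
    return sorted((i, j, c) for (i, j), c in merged.items())


# Toy symmetric 4-bin contact matrix (illustrative values, not real data).
dense = [
    [5, 2, 0, 0],
    [2, 7, 1, 0],
    [0, 1, 4, 3],
    [0, 0, 3, 6],
]

base = to_sparse(dense)

# A "zoomable" container, sketched here as a dict keyed by zoom factor.
levels = {1: base, 2: coarsen(base, 2)}

for factor, pixels in levels.items():
    print(factor, pixels)
```

In this sketch each contact is counted once (upper triangle only), so the total count is the same at every level. Real multi-resolution formats also store bin tables, indexes, and normalization vectors, which are omitted here.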

Author information

Correspondence to Soohyun Lee.

Copyright information

© 2022 Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Lee, S. (2022). Hi-C Data Formats. In: Bicciato, S., Ferrari, F. (eds) Hi-C Data Analysis. Methods in Molecular Biology, vol 2301. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1390-0_6

  • DOI: https://doi.org/10.1007/978-1-0716-1390-0_6

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1389-4

  • Online ISBN: 978-1-0716-1390-0

  • eBook Packages: Springer Protocols
