ChIP-seq Data Processing for PcG Proteins and Associated Histone Modifications

  • Ozren Bogdanović
  • Simon J. van Heeringen
Part of the Methods in Molecular Biology book series (MIMB, volume 1480)


Chromatin Immunoprecipitation followed by massively parallel DNA sequencing (ChIP-sequencing) has emerged as an essential technique to study the genome-wide location of DNA- or chromatin-associated proteins, such as the Polycomb group (PcG) proteins. After being generated by the sequencer, raw ChIP-seq sequence reads need to be processed by a data analysis pipeline. Here we describe the computational steps required to process PcG ChIP-seq data, including alignment, peak calling, and downstream analysis.

Key words

Polycomb, ChIP-seq, ChIP-sequencing, PRC1, PRC2, H3K27me3



O.B. is supported by an Australian Research Council Discovery Early Career Researcher Award—DECRA (DE140101962); S.J.v.H. is supported by the Netherlands Organization for Scientific Research (NWO-ALW grant 863.12.002).



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, Australia
  2. Radboud University, Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
