Systematic Discovery of Chromatin-Bound Protein Complexes from ChIP-seq Datasets

Part of the Methods in Molecular Biology book series (MIMB, volume 1507)


Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is an invaluable assay for identifying the genomic binding sites of transcription factors. However, transcription factors rarely bind chromatin alone; instead, they often bind together with cofactors, forming protein complexes. Here, we describe a computational method that integrates multiple ChIP-seq and RNA-seq datasets to discover protein complexes and determine their role as activators or repressors. The chapter outlines the pipeline in detail, from discovering and predicting binding partners in ChIP-seq data to inferring their role in regulating gene expression. This work aims to generate hypotheses about gene regulation via binding partners and to decipher the combinatorial nature of DNA-binding proteins.

Key words

Combinatorial transcription factor binding; Protein complexes; ENCODE datasets; Protein-protein interactions; ChIP-seq; RNA-seq



E.G. is supported by start-up funds provided by the City University of New York and the Hospital for Special Surgery. O.E. is supported by NSF CAREER, LLS SCOR, Hirschl Trust Award, Starr Cancer Consortium I6-A618, and NIH 1R01CA194547.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Biological Sciences Department, New York City College of Technology, City University of New York, New York, USA
  2. Arthritis and Tissue Degeneration Program and the David Z. Rosensweig Genomics Research Center, Hospital for Special Surgery, New York, USA
  3. HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine and Department of Physiology and Biophysics, Weill Cornell Medical College, New York, USA
