ChEA2: Gene-Set Libraries from ChIP-X Experiments to Decode the Transcription Regulome

  • Yan Kou
  • Edward Y. Chen
  • Neil R. Clark
  • Qiaonan Duan
  • Christopher M. Tan
  • Avi Ma’ayan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8127)


ChIP-seq experiments provide a plethora of data regarding transcription regulation in mammalian cells. Integrating ChIP-seq studies into a computable resource is potentially useful for further knowledge extraction from such data. We continually collect and expand a database in which we convert results from ChIP-seq experiments into gene-set libraries. The manually curated portion of this database currently contains 200 transcription factors from 221 publications, for a total of 458,471 transcription-factor/target interactions. In addition, we automatically compiled data from the ENCODE project, which includes 920 experiments applied to 44 cell lines profiling 160 transcription factors, for a total of ~1.4 million transcription-factor/target-gene interactions. Moreover, we processed data from the NIH Epigenomics Roadmap project for 27 different types of histone marks in 64 different human cell lines. Altogether, the data were processed into three simple gene-set libraries in which each set label is either a mammalian transcription factor or a histone modification mark in a particular cell line, organism and experiment. Such gene-set libraries are useful for elucidating, through gene-set enrichment analyses, the experimentally determined transcriptional networks that regulate lists of genes of interest. Furthermore, from these three gene-set libraries, we constructed regulatory networks of transcription factors and histone modifications to identify groups of regulators that work together. For example, we found that the Polycomb Repressive Complex 2 (PRC2) participates in three distinct clusters, each interacting with a different set of transcription factors. Notably, the combined dataset is provided as a web-based application in which users can perform enrichment analyses or download the data in various formats. The open source ChEA2 web-based software and datasets are available freely online at
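To make the idea of a gene-set library concrete, the following is a minimal Python sketch, not the ChEA2 implementation. It groups regulator/target pairs into labeled gene sets, serializes them in a GMT-style tab-separated layout, and ranks the sets against a query gene list with a one-sided hypergeometric (Fisher exact) test. The abstract does not state which enrichment statistic ChEA2 uses, so the test is an assumption, as are the function names, the label format (`TF_A_cellX_human`), the placeholder gene symbols, and the background size.

```python
from collections import defaultdict
from math import comb


def build_library(interactions):
    """Group (set_label, target_gene) pairs into {label: set_of_genes}."""
    library = defaultdict(set)
    for label, gene in interactions:
        library[label].add(gene.upper())  # normalize symbol case
    return dict(library)


def to_gmt(library):
    """Serialize as GMT-style lines: label, empty description, then genes."""
    return "\n".join(
        "\t".join([label, ""] + sorted(genes))
        for label, genes in sorted(library.items())
    )


def hypergeom_pvalue(overlap, set_size, query_size, background_size):
    """P(X >= overlap) when query_size genes are drawn from the background."""
    total = comb(background_size, query_size)
    upper = min(set_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(query_genes, library, background_size):
    """Rank gene sets by overlap p-value (assumes query lies in background)."""
    query = {g.upper() for g in query_genes}
    results = []
    for label, genes in library.items():
        overlap = len(query & genes)
        p = hypergeom_pvalue(overlap, len(genes), len(query), background_size)
        results.append((label, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy, made-up interactions; labels and gene symbols are placeholders.
interactions = [
    ("TF_A_cellX_human", "GENE1"),
    ("TF_A_cellX_human", "GENE2"),
    ("TF_A_cellX_human", "GENE3"),
    ("TF_B_cellY_mouse", "GENE4"),
    ("TF_B_cellY_mouse", "GENE5"),
]
library = build_library(interactions)
gmt_text = to_gmt(library)
results = enrich(["gene1", "gene2"], library, background_size=10)
for label, overlap, p in results:
    print(f"{label}\toverlap={overlap}\tp={p:.4f}")
```

In practice the p-values would also need a multiple-testing correction across all sets in a library, and the background size should reflect the genes that could have been detected in the underlying experiments.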


Keywords: ChIP-seq, ChIP-chip, Microarrays, Systems Biology, ENCODE, Enrichment Analysis, Transcriptional Networks, Data Integration, Data Visualization, JavaScript D3



Copyright information

© IFIP International Federation for Information Processing 2013

Authors and Affiliations

  • Yan Kou (1)
  • Edward Y. Chen (1)
  • Neil R. Clark (1)
  • Qiaonan Duan (1)
  • Christopher M. Tan (1)
  • Avi Ma’ayan (1)

  1. Department of Pharmacology and Systems Therapeutics, Systems Biology Center New York (SBCNY), Icahn School of Medicine at Mount Sinai, New York, USA
