ChEA2: Gene-Set Libraries from ChIP-X Experiments to Decode the Transcription Regulome
ChIP-seq experiments provide a wealth of data on transcription regulation in mammalian cells. Integrating ChIP-seq studies into a computable resource makes it possible to extract further knowledge from such data. We continually collect and expand a database in which we convert results from ChIP-seq experiments into gene-set libraries. The manually curated portion of this database currently contains 200 transcription factors from 221 publications, for a total of 458,471 transcription-factor/target interactions. In addition, we automatically compiled data from the ENCODE project, which includes 920 experiments applied to 44 cell lines profiling 160 transcription factors, for a total of ~1.4 million transcription-factor/target-gene interactions. Moreover, we processed data from the NIH Epigenomics Roadmap project for 27 different types of histone marks in 64 different human cell lines. Altogether, the data were processed into three simple gene-set libraries in which each set label is either a mammalian transcription factor or a histone modification mark in a particular cell line, organism and experiment. Such gene-set libraries are useful for elucidating, through gene-set enrichment analyses, the experimentally determined transcriptional networks that regulate lists of genes of interest. Furthermore, from these three gene-set libraries, we constructed regulatory networks of transcription factors and histone modifications to identify groups of regulators that work together. For example, we found that the Polycomb Repressive Complex 2 (PRC2) participates in three distinct clusters, each interacting with a different set of transcription factors. The combined dataset is delivered as a web-based application in which users can perform enrichment analyses or download the data in various formats. The open source ChEA2 web-based software and datasets are freely available online at http://amp.pharm.mssm.edu/ChEA2.
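To make the idea of enrichment analysis against a gene-set library concrete, the following is a minimal sketch, not the ChEA2 implementation. It assumes a library stored as tab-separated lines (set label first, target genes after), and it scores overlap with a one-sided hypergeometric test; the abstract does not state which file format or statistic ChEA2 uses. The function names `parse_library_line`, `hypergeom_tail` and `enrich`, the labels `TF_A` and `TF_B`, the gene symbols and the background size are all hypothetical.

```python
from math import comb


def parse_library_line(line):
    """Split one tab-separated line into (set label, set of target genes).

    Assumed layout: label, then one gene symbol per field.
    """
    fields = line.rstrip("\n").split("\t")
    return fields[0], {gene.upper() for gene in fields[1:] if gene}


def hypergeom_tail(k, set_size, query_size, background_size):
    """Return P(X >= k) for the overlap X between a query of `query_size`
    genes and a gene set of `set_size` genes, both drawn from a background
    of `background_size` genes."""
    upper = min(set_size, query_size)
    favourable = sum(
        comb(set_size, i) * comb(background_size - set_size, query_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(background_size, query_size)


def enrich(query_genes, library, background_size):
    """Rank the gene sets in `library` by enrichment for `query_genes`.

    Returns (label, overlap size, p-value) tuples, smallest p-value first.
    Sets with no overlap are skipped. No multiple-testing correction is
    applied in this sketch.
    """
    query = {gene.upper() for gene in query_genes}
    results = []
    for label, targets in library.items():
        overlap = query & targets
        if not overlap:
            continue
        p_value = hypergeom_tail(
            len(overlap), len(targets), len(query), background_size
        )
        results.append((label, len(overlap), p_value))
    return sorted(results, key=lambda row: row[2])


# Toy library with made-up labels and gene symbols (not real ChEA2 data).
toy_lines = ["TF_A\tG1\tG2\tG3\tG4", "TF_B\tG5\tG6"]
toy_library = dict(parse_library_line(line) for line in toy_lines)

# Hypothetical query list and background of 20 genes.
results = enrich(["g1", "g2", "g3"], toy_library, background_size=20)
for label, overlap_size, p_value in results:
    print(f"{label}\toverlap={overlap_size}\tp={p_value:.4g}")
```

In practice the background size would be the number of genes that could have appeared in the query, and the p-values would need correction for the many sets tested across a library.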