Computational Analysis of ChIP-seq Data

  • Hongkai Ji
Part of the Methods in Molecular Biology book series (MIMB, volume 674)


Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) is a new technology to map protein–DNA interactions in a genome. The genome-wide transcription factor binding site and chromatin modification data produced by ChIP-seq provide invaluable information for studying gene regulation. This chapter reviews basic characteristics of ChIP-seq data and introduces a computational procedure to identify protein–DNA interactions from ChIP-seq experiments.

Key words

Transcription factor binding site; high-throughput sequencing; peak detection; false discovery rate



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Department of Biostatistics, The Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
