DNA–Protein Interaction Analysis (ChIP-Seq)

  • Geetu Tuteja


ChIP-Seq, which combines chromatin immunoprecipitation (ChIP) with high-throughput sequencing, is a powerful technology for identifying protein–DNA interactions genome-wide. Interpreting ChIP-Seq data has proven to be a complicated computational task, and multiple methods have been developed to address it. This chapter begins by describing the protocol for ChIP-Seq library preparation and proper experimental design, without which computational tools cannot accurately capture in vivo interactions. A section on raw data pre-processing and data visualization follows, using Illumina Genome Analyzer output files as examples, and the general approaches taken by peak-calling tools are then described. GLITR, a peak-calling tool that uses a large set of control data to accurately identify bound regions in ChIP-Seq data, is explained in detail. Finally, an approach for functional interpretation of ChIP-Seq peaks is discussed.


Keywords: Library Preparation; UCSC Genome Browser; Sequence Read; Illumina Genome Analyzer; Illumina Pipeline
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Developmental Biology, Stanford University, Stanford, USA
