Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a high-throughput, antibody-based method for studying genome-wide protein–DNA binding interactions. Compared with older ChIP-chip experiments, ChIP-seq allows scientists to obtain more accurate data with genome-wide coverage, using less starting material and less time. Herein we describe a step-by-step guideline for analyzing ChIP-seq data, including data preprocessing, nonlinear normalization to enable comparison between different samples and experiments, a statistics-based method for identifying differential binding sites using mixture modeling and local false discovery rates (fdrs), and binding pattern characterization. In addition, we provide a sample analysis of ChIP-seq data using the steps in the guideline.
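To make the mixture-modeling and local fdr step concrete, the following is a minimal sketch and not the chapter's implementation. The abstract does not specify the mixture family, so a two-component normal mixture is assumed here as a stand-in, fitted by EM to normalized difference values (for example, log ratios of binned read counts between two samples). The function names `normal_pdf`, `fit_two_normals`, and `local_fdr` are hypothetical. The local fdr of a value is taken as the posterior probability that it belongs to the "no change" component.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(xs, n_iter=200):
    """EM for a two-component normal mixture (assumed stand-in model).

    Component 0 plays the role of 'no change', component 1 of 'differential'.
    Returns (pi0, (mu0, sd0), (mu1, sd1)).
    """
    xs_sorted = sorted(xs)
    n = len(xs)
    # Crude initialisation: null near the median, alternative in the upper tail.
    mu0, mu1 = xs_sorted[n // 2], xs_sorted[int(0.95 * n)]
    sd0 = sd1 = max(1e-3, (xs_sorted[-1] - xs_sorted[0]) / 6.0)
    pi0 = 0.8
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to the null.
        w0 = []
        for x in xs:
            a = pi0 * normal_pdf(x, mu0, sd0)
            b = (1.0 - pi0) * normal_pdf(x, mu1, sd1)
            w0.append(a / (a + b) if a + b > 0 else 0.5)
        # M-step: weighted updates of the proportion, means and SDs.
        s0 = max(sum(w0), 1e-12)
        s1 = max(n - s0, 1e-12)
        pi0 = s0 / n
        mu0 = sum(w * x for w, x in zip(w0, xs)) / s0
        mu1 = sum((1 - w) * x for w, x in zip(w0, xs)) / s1
        var0 = sum(w * (x - mu0) ** 2 for w, x in zip(w0, xs)) / s0
        var1 = sum((1 - w) * (x - mu1) ** 2 for w, x in zip(w0, xs)) / s1
        sd0 = max(1e-3, math.sqrt(var0))
        sd1 = max(1e-3, math.sqrt(var1))
    return pi0, (mu0, sd0), (mu1, sd1)


def local_fdr(x, pi0, null, alt):
    """Local fdr at x: posterior probability of the null component."""
    a = pi0 * normal_pdf(x, *null)
    b = (1.0 - pi0) * normal_pdf(x, *alt)
    return a / (a + b) if a + b > 0 else 1.0


# Simulated input (not real ChIP-seq data): 90% unchanged, 10% shifted upward.
random.seed(0)
values = [random.gauss(0.0, 1.0) for _ in range(900)]
values += [random.gauss(4.0, 1.0) for _ in range(100)]

pi0, null, alt = fit_two_normals(values)
# Call a site differential when its local fdr falls below a chosen cutoff.
called = [v for v in values if local_fdr(v, pi0, null, alt) < 0.1]
```

The 0.1 cutoff is an arbitrary illustration. In practice the cutoff, the mixture family, and the number of components would be chosen for the data at hand, for example with a model selection criterion.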
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Taslim, C., Huang, K., Huang, T., Lin, S. (2012). Analyzing ChIP-seq Data: Preprocessing, Normalization, Differential Identification, and Binding Pattern Characterization. In: Wang, J., Tan, A., Tian, T. (eds) Next Generation Microarray Bioinformatics. Methods in Molecular Biology, vol 802. Humana Press. https://doi.org/10.1007/978-1-61779-400-1_18
DOI: https://doi.org/10.1007/978-1-61779-400-1_18
Publisher Name: Humana Press
Print ISBN: 978-1-61779-399-8
Online ISBN: 978-1-61779-400-1
eBook Packages: Springer Protocols