Statistical Analysis of ChIP-seq Data with MOSAiCS

Sun, Guannan; Chung, Dongjun; Liang, Kun; Keleş, Sündüz

doi:10.1007/978-1-62703-514-9_12

Part of the book series: Methods in Molecular Biology (MIMB, volume 1038)

Abstract

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is invaluable for identifying genome-wide binding of transcription factors and mapping of epigenomic profiles. We present a statistical protocol for analyzing ChIP-seq data. We describe guidelines for data preprocessing and quality control and provide detailed examples of identifying ChIP-enriched regions using the Bioconductor package “mosaics.”

Acknowledgments

This work is supported by National Institutes of Health Grants (HG0067161, HG003747) to S.K. We thank Audrey Gasch and Jeff Lewis (yeast TFx), John Svaren and Rajini Srinivasan (Sox10 in rat), and Qiang Chang and Emily Cunningham (human ChIP-seq) for the datasets and useful discussions regarding the analysis.

Author information

Authors and Affiliations

Department of Statistics, University of Wisconsin, Madison, WI, USA
Guannan Sun, Dongjun Chung, Kun Liang & Sündüz Keleş
Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
Kun Liang
Department of Biostatistics and Biomedical Informatics, University of Wisconsin, Madison, WI, USA
Sündüz Keleş

Editor information

Editors and Affiliations

Faculty of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel
Noam Shomron

Appendix: R Script for the Analysis of Yeast TFx ChIP-seq Datasets

library(mosaics)

library(hexbin)

# construct bin-level files for each replicate ChIP sample

constructBins(infile = "TFx_EtOH_IP_1_2mis_bowtie_uni.txt", outfileLoc = "/bin_cap0",