Generative Models for Quantification of DNA Modifications

Part of the Methods in Molecular Biology book series (MIMB, volume 1807)

Abstract

Several chemical modifications of cytosine are important for the regulation, and ultimately the functional expression, of the genome. To date, no single experiment can capture these modifications separately, so integrative experimental designs are needed to fully characterize cytosine methylation and its chemical modifications. This chapter describes Lux, a generative probabilistic model for the integrative analysis of cytosine methylation and its oxidized variants. Lux simultaneously analyzes partially orthogonal bisulfite sequencing data sets, integrating across experimental designs composed of multiple parallel destructive genomic measurements to estimate the proportions of the different cytosine modifications in a single sample. Lux also accounts for the variation introduced by imperfect experimental steps; this variation can be quantified with appropriate spike-in controls, allowing Lux to deconvolve the measurements and accurately recover the underlying signal.
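The deconvolution idea can be illustrated with a minimal sketch for two of the assays named in the key words, BS-seq and oxBS-seq. This is not Lux's implementation: Lux is a hierarchical Bayesian model, whereas the sketch below swaps in a plain grid-search maximum-likelihood estimate over three species (C, 5mC, 5hmC). The function names and the parameters `bs_eff`, `bs_star`, and `ox_eff` are hypothetical, chosen here for illustration; in practice such efficiencies would be estimated from spike-in controls rather than fixed by hand.

```python
import math

def readout_probs(theta, bs_eff=1.0, bs_star=0.0, ox_eff=1.0):
    """Probability that a read shows 'C' in BS-seq and in oxBS-seq.

    theta   : (p_C, p_5mC, p_5hmC), proportions summing to 1
    bs_eff  : assumed probability that C (or 5fC) is converted and read as T
    bs_star : assumed probability that 5mC or 5hmC is wrongly read as T
    ox_eff  : assumed probability that 5hmC is oxidized to 5fC in oxBS-seq
    """
    p_c, p_mc, p_hmc = theta
    stay_c_unprotected = 1.0 - bs_eff   # C or 5fC escaping conversion
    stay_c_protected = 1.0 - bs_star    # 5mC or 5hmC correctly read as C
    # BS-seq: 5mC and 5hmC are both protected from conversion.
    p_bs = p_c * stay_c_unprotected + (p_mc + p_hmc) * stay_c_protected
    # oxBS-seq: oxidized 5hmC behaves like an unprotected base.
    hmc_ox = ox_eff * stay_c_unprotected + (1.0 - ox_eff) * stay_c_protected
    p_oxbs = (p_c * stay_c_unprotected + p_mc * stay_c_protected
              + p_hmc * hmc_ox)
    return p_bs, p_oxbs

def log_binom(k, n, p):
    """Binomial log-likelihood up to a constant that does not depend on p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def grid_estimate(bs_counts, oxbs_counts, steps=100, **eff):
    """Grid-search ML estimate of theta from (C reads, total reads) pairs."""
    best_theta, best_ll = None, -math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            p_mc, p_hmc = i / steps, j / steps
            theta = (1.0 - p_mc - p_hmc, p_mc, p_hmc)
            p_bs, p_oxbs = readout_probs(theta, **eff)
            ll = (log_binom(bs_counts[0], bs_counts[1], p_bs)
                  + log_binom(oxbs_counts[0], oxbs_counts[1], p_oxbs))
            if ll > best_ll:
                best_theta, best_ll = theta, ll
    return best_theta

# Hypothetical counts: 50 of 100 reads show C in BS-seq, 30 of 100 in oxBS-seq.
estimate = grid_estimate((50, 100), (30, 100))
print(estimate)
```

Passing non-ideal efficiencies, for example `grid_estimate((50, 100), (30, 100), bs_eff=0.99, ox_eff=0.95)`, shifts the estimate because the same counts are then explained partly by imperfect conversion and oxidation. A point estimate like this carries no uncertainty, which is one reason a full probabilistic treatment is preferable for low-coverage sites.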

Key words

  • DNA methylation
  • Bayesian analysis
  • Hierarchical generative modeling
  • 5-methylcytosine oxidation
  • Bisulfite sequencing
  • BS-seq/oxBS-seq/TAB-seq/fCAB-seq/CAB-seq/redBS-seq/MAB-seq

Author information

Corresponding author

Correspondence to Harri Lähdesmäki.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Äijö, T., Bonneau, R., Lähdesmäki, H. (2018). Generative Models for Quantification of DNA Modifications. In: Mamitsuka, H. (eds) Data Mining for Systems Biology. Methods in Molecular Biology, vol 1807. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8561-6_4

  • DOI: https://doi.org/10.1007/978-1-4939-8561-6_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8560-9

  • Online ISBN: 978-1-4939-8561-6

  • eBook Packages: Springer Protocols