Model-Based Whole-Genome Analysis of DNA Methylation Fidelity
We consider the problem of understanding how DNA methylation fidelity, i.e. the preservation of methylated sites in the genome, varies across the genome and across different cell types. Our approach uses a stochastic model of DNA methylation across generations and trains it on data obtained through next-generation sequencing. By training the model locally, i.e. learning its parameters from observations in a specific genomic region, we can compare DNA methylation fidelity between regions genome-wide. In this paper, we focus on the computational challenges of scaling parameter estimation to the whole-genome level and present two methods to achieve this goal, one using a moment-based approximation and one using simulation. We extensively tested our methods on synthetic data and on a first batch of experimental data.
Keywords: DNA methylation, Epigenomics, Branching processes, Parameter estimation, Next generation sequencing