Statistical Methods in Molecular Biology

Volume 620 of the series Methods in Molecular Biology pp 243-265


Introduction to Epigenomics and Epigenome-Wide Analysis

  • Melissa J. Fazzari, affiliated with the Division of Biostatistics, Department of Epidemiology and Population Health, and the Department of Genetics, Albert Einstein College of Medicine
  • John M. Greally, affiliated with the Department of Genetics and the Department of Medicine, Albert Einstein College of Medicine



Epigenetics is the study of heritable changes other than those encoded in the DNA sequence. Cytosine methylation of DNA at CpG dinucleotides is the best-studied epigenetic phenomenon, although epigenetic changes also encompass mechanisms other than DNA methylation, such as covalent histone modifications, micro-RNA interactions, and chromatin remodeling complexes. Methylation changes, both global and gene-specific, have been observed to be associated with disease, particularly cancer.

This chapter begins with a general overview of epigenomics and then focuses on understanding and analyzing genome-wide cytosine methylation data. Many microarray-based techniques are available to measure cytosine methylation across the genome, as well as gold-standard techniques based on sequencing bisulfite-converted DNA, which are used to measure methylation in a smaller, more targeted set of loci. We provide an overview of many of the current technologies: their advantages, limitations, and recent improvements. Regardless of which technology is used, the goal is to produce a set of methylation measurements that are highly consistent with the true methylation levels of the corresponding set of CpG dinucleotides.

Identifying all loci with aberrant methylation or hypomethylation in disease, or in natural processes such as aging, requires the comparison of methylation levels across many samples. Such studies may aim to develop methylation-based diagnostic tools, potentially for use in early disease detection based on a set of sentinel loci. In addition, the identification of loci with potentially reversible methylation events may result in new therapeutic options. Given the vast number of measurable sites, prioritization of candidate loci is an important and complex issue that rests on a foundation of appropriate statistical testing and summarization. Coupled with statistical estimates of importance, the genomic context of each measured locus may offer important information about the mechanisms by which epigenetic changes impact disease and allow further refinement of candidate loci. We conclude this chapter by identifying issues in building methylation-based models for prediction and potential directions for further statistical research in epigenetics.

Key words

Epigenomics, methylation, statistical epigenetics, CpG islands, quantile–quantile plots, prioritization