Methods for CpG Methylation Array Profiling Via Bisulfite Conversion

  • Fatjon Leti (corresponding author)
  • Lorida Llaci
  • Ivana Malenica
  • Johanna K. DiStefano
Part of the Methods in Molecular Biology book series (MIMB, volume 1706)


DNA methylation is a key factor in epigenetic regulation. It contributes to the pathogenesis of many diseases, including various forms of cancer, and to epigenetic events such as X inactivation, cellular differentiation and proliferation, and embryonic development. The most conserved epigenetic modification in plants, animals, and fungi is 5-methylcytosine (5mC), which has been well characterized across a diverse range of species. Many technologies have been developed to measure changes in methylation with respect to biological processes. The most common method, long considered a gold standard for identifying regions of methylation, is bisulfite conversion. In this technique, DNA is treated with bisulfite, which converts cytosine residues to uracil but leaves methylated cytosine residues, such as 5-methylcytosine, unaffected. Following bisulfite conversion, therefore, the only cytosine residues remaining in the DNA are those that were methylated. Subsequent sequencing can then distinguish, at the single-nucleotide level, between unmethylated cytosines, which appear as thymines in the amplified sequence of the sense strand, and 5-methylcytosines, which appear as cytosines in that sequence. In this chapter, we describe an array-based protocol for identifying methylated DNA regions. We discuss protocols for DNA quantification, bisulfite conversion, library preparation, and chip assembly, and present an overview of current methods for the analysis of methylation data.
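The conversion logic described above can be illustrated with a minimal in silico sketch. It is not part of the chapter's array protocol. It assumes complete conversion, no sequencing errors, and reads from the sense strand only, and the function names `bisulfite_convert` and `call_methylation` are hypothetical, chosen for illustration.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sense-strand sequence expected after bisulfite treatment
    and amplification, assuming 100% conversion efficiency (an assumption)."""
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            # Unmethylated C -> U, which is read as T after amplification.
            converted.append("T")
        else:
            # Methylated C (and every A, G, T) is left unchanged.
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted):
    """Compare a reference with its converted read and return a dict mapping
    each reference cytosine index to True (methylated) or False (unmethylated)."""
    if len(reference) != len(converted):
        raise ValueError("reference and converted read must have equal length")
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted.upper())):
        if ref != "C":
            continue
        if obs == "C":
            calls[i] = True    # C retained: protected by methylation
        elif obs == "T":
            calls[i] = False   # C read as T: unmethylated
        # Any other observed base is ambiguous and is skipped.
    return calls


# Toy example: cytosines at indices 1, 4, and 5; indices 1 and 5 are methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, {1, 5})
print(read)                               # ACGTTCGA
print(call_methylation(reference, read))  # {1: True, 4: False, 5: True}
```

In real data, conversion is incomplete and each site is covered by many molecules, so methylation is estimated as a proportion per site rather than as a yes/no call.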

Key words

DNA Methylation, CpG, Epigenetics, Bisulfite conversion



We thank Diego Portillo Santos for editing and generating the figures.


Copyright information

© Springer Science+Business Media, LLC 2018

Authors and Affiliations

  • Fatjon Leti (1, corresponding author)
  • Lorida Llaci (2)
  • Ivana Malenica (3)
  • Johanna K. DiStefano (4)

  1. Center for Genes, Environment and Health, Department of Biomedical Research, National Jewish Health, Denver, USA
  2. Neurogenomics Division, Translational Genomics Research Institute, Phoenix, USA
  3. Division of Biostatistics, University of California, Berkeley, Berkeley, USA
  4. Translational Genomics Research Institute, Phoenix, USA
