Background

Epigenetic mechanisms, such as DNA methylation, have been suggested as possible causal pathways linking environmental exposure to disease. Many of these studies depend on the epigenome-wide analysis of prospectively collected samples, in the context of large human cohorts. As epigenome-wide technologies are becoming available, the use of such cohort studies will provide large amounts of information in the coming years. Due to the general lack of biospecimen collection in observational human studies, many of these cohorts rely on the use of dried blood spots (DBS) obtained soon after birth as the main source of biological information [1].

The use of filter paper for blood collection and analysis was implemented as early as the 1960s by Guthrie et al. using dried-blood samples for newborn phenylketonuria detection [2]. “Guthrie cards” are widely used in many types of tests, including chemical, serological, and genetic applications [3]. More recently, Flinders Technology Associates chemically treated filter papers (FTA cards) were specifically developed for DNA/RNA analyses [4]. These chemically treated cards allow long-term storage of DNA at room temperature and are impregnated with denaturants that guard against oxidation, nuclease and ultraviolet damage, and both bacterial and fungal degradation.

Neonatal DBS are routinely collected in many countries and represent a cost-effective tool to store precious biological specimens for subsequent studies. However, reliably profiling the DNA methylome in DBS has proven to be technically challenging, particularly because such techniques require stringent bisulfite preprocessing that can degrade DNA [5]. Other limitations of their use include the variable degradation of DNA due to storage and extraction, the usually small amounts of DNA that can be obtained (typical blood spots are between 6 and 10 mm in diameter), and the identification of technical artifacts potentially associated with long-term storage [6].

Recently, there has been increasing interest in the use of DBS in DNA methylome analyses, using Methylated DNA Immunoprecipitation combined with sequencing (MeDIP-seq) [7], Methyl-CpG Binding Domain (MBD) protein-enrichment combined with sequencing (MBD-seq) [8], and Infinium (Illumina) bead arrays [7, 9–11]. The latest version of Illumina’s bead array, the Infinium HumanMethylation450 (HM450) Beadchip, is cost-effective, requires DNA amounts as low as 300 ng, enables the detection and quantitation of DNA methylation levels at 486,685 CpG sites across the genome and represents one of the most comprehensive microarray methods to date for investigating the methylome [12]. Three reports have addressed the utility of HM450 on DBS [7, 10, 11]. The first report validated the use of two different methylomic platforms, HM450 and MeDIP-seq, on DBS DNA and showed a high correlation between them [7]. The second one generated good quality methylome-wide data from DBS, as compared to their matched frozen buffy coat [10]. The third one used DBS-based HM450 analyses to study the epigenetic effects of gestational age, as recently demonstrated by our group [11]. However, none of these studies analyzed methods for optimized DNA extraction and quality verification from DBS, both of which represent major upstream steps in the pipeline for DBS-based research, including epigenetics.

In this report, we tested and developed a range of DNA extraction methods from neonatal FTA cards, individually or in combinations. We incorporated or modified protocol steps that were crucial to increase the DNA yield and quality from DBS, and additionally tested their efficiency on Guthrie cards. Moreover, we suggest an optimal protocol for both pyrosequencing- and HM450-based methylation studies. This work could prove useful in meeting the increased demand for research on prenatal origins of human diseases and for newborn screening programs.

Results

Optimization of Phases I and II in the DNA extraction protocols

Limited quantity and quality are important drawbacks in the use of DNA obtained from DBS, particularly for epigenome-wide studies. Initially, we ruled out the possibility of using whole bisulfitome amplification (WBA) after confirming that it introduces biases, mostly in the middle range of DNA methylation levels (Additional file 1), in agreement with the recently reported finding by Bundo et al. [13]. Then, to systematically optimize DNA extraction from DBS, we divided the different steps of this process into two phases (Figure 1). Critical steps in Phase I included blood extraction off the filter papers, cell lysis and protease digestion (Figure 1A, left panel). Phase II included DNA precipitation, purification and elution (Figure 1A, right panel).

Figure 1

Phases and classification of protocols used to extract DNA from DBS. Two sequential phases, each encompassing three steps, are outlined (A) and were optimized in the different protocols or method combinations (B) used to extract DNA from DBS. A spin basket is shown next to Phase I.A and consists of a tube with an embedded perforated basket used to separate blood solutions from the filter papers from which they were extracted. A silica-gel column with a funnel-shape design is shown next to Phase II.E and often used to elute small volumes (5–30 μl), as manufactured by Macherey-Nagel and supplied with the extra-small (XS) versions of NucleoSpin kits (B).

We have previously tested several genomic DNA extraction methods on DBS, including resin-based, lysis-based and magnetic bead-based methods [5]. Lysis- and bead-based methods were the best, but the latter is not suitable for beadchip methylation profiling, so it was not considered in this study [9]. Among lysis-based methods, several commercially available kits, including QIAamp DNA Micro Kit, GenSolve and NucleoSpin, have been shown to be efficient for DBS DNA extraction [10, 14, 15]. Therefore, we selected these three kits to optimize the two phases of DNA extraction. This optimization involved the combination of the different kits and modifications in several steps of the two protocol phases (Figure 1B) (described in detail in Additional file 2). A combination of GenSolve reagents in Phase I and QIAamp reagents in Phase II (referred to as the GQ method) was set as a reference protocol to which other tested methods were compared (Tables 1, 2, 3, 4, 5 and 6). In all pairwise comparisons, two DBS punches from the same DBS were used, and assessment of quantity and quality was initially done using Nanodrop (with 260/280 and 260/230 spectrophotometric ratios as a measure of quality) (Tables 1, 2, 3, 4 and 5). These DBS were obtained from the National Children’s Study (NCS), USA, and were FTA-type, which preserves DNA well relative to other types of neonatal cards (Methods), hence allowing comparisons across a wide range of DNA extraction protocols.

Table 1 Combinations of GenSolve and Qiagen protocols for DNA extraction from DBS: GQ, QQ and Qq methods
Table 2 Combinations of GenSolve, Qiagen and NucleoSpin protocols for DNA extraction from DBS: GQ versus GN and Gn methods
Table 3 Combinations of GenSolve, Qiagen and NucleoSpin protocols for DNA extraction from DBS: GQ versus GN-XS and Gn-XS methods
Table 4 Combinations of GenSolve, Qiagen and NucleoSpin protocols for DNA extraction from DBS: GQ versus NN and Nn methods
Table 5 Combinations of GenSolve, Qiagen and NucleoSpin protocols for DNA extraction from DBS: GQ versus NN-XS methods
Table 6 Cross-comparisons of DNA quantity and quality parameters among the different tested DNA extraction protocols

DNA yield and quality were consistently better for the reference GQ protocol when compared to the Qiagen protocol (QQ) (Table 1, p < 0.05). Although DNA yield was drastically increased when ethanol was used in the Qiagen precipitation step (Qq, p < 0.001), DNA quality was still suboptimal compared to GQ, as assessed by Nanodrop (Table 1). In contrast, the combination of GenSolve and NucleoSpin (GN and Gn) increased the DNA yield while preserving DNA quality, regardless of the use of ethanol in the precipitation step (Table 2, p < 0.05). A similar improvement was observed when using the NucleoSpin kit in both phases of DNA extraction (NN and Nn) (Table 4, p < 0.001). The extra-small (XS) versions of NucleoSpin, with column designs specific for low elution volumes, did not consistently improve DNA quantity or quality. This held whether or not they were combined with other kits (GN-XS, Gn-XS and NN-XS; Tables 3, 5 and 6), whether the DNA precipitation buffer was changed to ethanol, and whether the washing volume and frequency were increased (Table 3, p > 0.05).

Cross-comparisons across the different tested DNA extraction protocols

DNA quality parameters assessed earlier were based on DNA 260/280 and 260/230 spectrophotometric ratios. Two other important quality parameters are DNA detectability by PCR and DNA integrity and size range, which can be assessed by gel electrophoresis and bioanalyzer analyses. DNA from all tested protocols exhibited detectable PCR bands of a housekeeping gene, GAPDH (Table 6 and data not shown), indicating that the DNA is amplifiable for specific short regions. DNA isolated by the GQ method exhibited a smear-like profile by gel electrophoresis, with peak intensity often greater than 1 kilobase pair (Kbp) (Figure 2 and Table 6). Bioanalyzer smear analyses confirmed the DNA average size peak to be greater than 1 Kbp, with an average size across samples ranging from 4.9 to 9.7 Kbp (Table 6 and data not shown). Compared to GQ, the other tested protocols often showed more DNA degradation, except for protocol Qq, which usually exhibited similar DNA smear profiles (Figure 2 and Table 6).

Figure 2

DNA integrity and size range as assessed by agarose gel electrophoresis. (A) DNA size markers used to estimate size ranges are shown in addition to genomic DNA that was isolated from white blood cells (WBC) and used as a positive control. (B) Representative DBS from each of the tested protocols are shown, except for protocol QQ in which DNA amounts were insufficient to be analyzed by gel electrophoresis. Eight different gel sections are shown and are derived from either the same gel or different gels. In each section, two punches from the same NCS spot were run on the same gel, with the first punch, labeled ‘a’, representing protocol GQ and the second punch, labeled ‘b’, representing another unique protocol from the tested set. The two blue lines, representing the 100 and 1000 base pair (bp) size ranges, were set according to the molecular size marker used in each section. The 1000 base pair limit is a minimum size range with useful applications in many genetic and epigenetic studies, including Illumina’s HM450 Beadchip array. The results of other DBS analyzed by gel electrophoresis or bioanalyzer are summarized in Table 6.

Overall comparison of the tested DNA quantity and quality parameters across the protocols shows that the two best protocols, which at least match GQ in most of the tested parameters, are Qq and NN (or Nn) (Table 6). Relative to GQ, the only drawback of Qq is its 260/280 DNA ratios (Table 6). Qq 260/280 ratios were always out of range, indicating that the high Qq DNA quantity measurements recorded spectrophotometrically by Nanodrop may not be accurate. However, quantification with Qubit, a fluorescence-based method, confirmed the yield in Qq to be 2.7× higher than in GQ (data not shown). The only drawback of NN (or Nn) relative to GQ was DNA integrity, with the DNA size ranges in NN or Nn being lower than in GQ (Table 6).

In conclusion, protocol GQ seems to be the most robust among the tested methods across all tested DNA quantity and quality parameters. Qq may be more suitable for applications that require larger DNA quantities from DBS and large fragment sizes, but for which optimal 260/280 ratios are not a necessity. On the other hand, protocols NN or Nn may be better suited for applications that require larger DNA quantities from DBS than GQ provides, with optimal 260/280 and better 260/230 ratios, but for which large DNA fragment sizes are not a requirement.

Of note, when GQ was tested on DBS (Guthrie cards, Whatman 903) from the Tasmanian Infant Health Survey (TIHS), Australia, that were more than 20 years old, an average of 66 ± 15 ng (n = 3) of DNA could be extracted per two punches, each being 1 mm in diameter, with a mean 260/280 ratio of 1.66 ± 0.02; these DNA quantities are equivalent to 42.0 ng/mm² for TIHS, compared to 12.7 ng/mm² for NCS samples.
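The per-area figure for TIHS can be reproduced from the numbers given above, assuming circular punches and that the reported 66 ng is the total from both punches. The following is a minimal sketch of that arithmetic; the function name is illustrative only and is not part of the original analysis.

```python
import math

def yield_per_area(total_ng, n_punches, punch_diameter_mm):
    """Return DNA yield in ng per mm^2 of punched card area.

    Assumes each punch is a circle of the given diameter.
    """
    area_per_punch = math.pi * (punch_diameter_mm / 2) ** 2
    return total_ng / (n_punches * area_per_punch)

# TIHS: 66 ng from two punches, each 1 mm in diameter (values from the text).
tihs = yield_per_area(total_ng=66, n_punches=2, punch_diameter_mm=1.0)
print(f"TIHS: {tihs:.1f} ng/mm^2")
```

The same function could be applied to the NCS punches once their diameter and total yield are taken from the corresponding tables.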

Performance of DNA extracted from DBS using methylome-wide analysis

Methylation probe call index

In order to assess the performance of DNA extracted from DBS in HM450 methylome-wide analyses, we used the DNA extracted by GQ, being the most robust protocol. DNA from two sample pairs, each representing two punches (serving as technical replicates) from the same DBS, were analyzed by HM450. In addition, DBS pairs were compared to reference DNA from neonatal blood or cell lines (Table 7). In all tested samples, whether originating from neonatal blood, DBS or cell lines, more than 99% of the 485,577 HM450 individual probes were detected, using the commonly accepted quality control detection p-value of 0.01, hence, indicating high quality data. The average beta-values were similar between the technical replicates NCS 37a and 37b (approximately 0.47 for either) and between the technical replicates NCS 38a and 38b (approximately 0.42 for either) (Table 7).
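The probe call index described above is simply the fraction of probes whose detection p-value passes the threshold. A minimal sketch follows, using hypothetical toy p-values rather than the study data; treating a probe as detected when its p-value is at or below 0.01 is an assumption based on the filtering rule stated in the Methods.

```python
def call_rate(detection_pvalues, threshold=0.01):
    """Fraction of probes with a detection p-value at or below the threshold."""
    detected = sum(1 for p in detection_pvalues if p <= threshold)
    return detected / len(detection_pvalues)

# Hypothetical toy data: 1,000 probes, 4 of which fail detection.
toy_pvalues = [0.0001] * 996 + [0.2] * 4
rate = call_rate(toy_pvalues)
print(f"call rate: {rate:.1%}")
```

A sample would meet the criterion used here when this value exceeds 0.99.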

Table 7 Methylation quality control probe evaluation

Sample-dependent and -independent HM450 internal quality control probes

To assess sample and array quality, the HM450 array includes 850 quality control (QC) probes. Fifteen QC probes are sample-independent and 835 QC probes are sample-dependent [12]. DNA from DBS, neonatal blood or cell lines passed the described HM450 QC (Additional file 3 and Figure 3). Background probes, wherever included, produced minimal signal (maximum limit is 1000 units, as recommended by Illumina Inc.), and the intended positive signals from the experimental QC probes were above background, for all of the three tested DNA sources (Additional file 3 and exemplified in Figure 3 using non-polymorphic probes, which are indicative of overall performance). In addition, performance of DBS samples was similar to that of subsets taken from reference neonatal blood and cell line samples (Additional file 4 and Figure 3). Bisulfite conversion efficiency for both type I and type II probes was high for all tested samples (Additional file 4) and was confirmed by PCR using primers that are specific either to bisulfite-converted or to non-modified GAPDH DNA regions (data not shown).

Figure 3

HM450 QC plot using Non-polymorphic probes which assess overall performance. In the green channel, background signals are shown in red and pink while positive signals in opaque and fluorescent green. In the red channel, background signals are shown in opaque and fluorescent green while positive signals in red and pink. One non-polymorphic control has been designed for each of the four nucleotides A, T, C, and G. Four DBS DNA samples are shown between four neonatal blood and four cell line DNA samples, in each of the two plots. The DBS samples represent two NCS spots, 37 and 38, each consisting of two tested punches labeled ‘a’ or ‘b’.

Differential methylation and clustering analyses using HM450 data

Differential methylation of HM450 beta-values produced two major clusters (Figure 4). Cell line DNA samples formed one cluster while DBS and neonatal blood DNA formed another; this is expected because the cell lines used are of hepatic tissue origin while both DBS and neonatal blood samples are of blood tissue origin. Within the cluster of blood biospecimens, all four neonatal blood samples formed one sub-cluster, which was segregated away from the DBS sub-cluster. The two punches NCS 37a and 37b, representing the technical replicates from the same spot NCS 37, clustered together and away from the other two technical replicates, NCS 38a and 38b, which also clustered together (Figure 4). This is consistent with the higher correlations (p < 0.001, Steiger Z test) observed between technical duplicates punched from the same DBS (r² range = 0.963-0.990) than between punches from different DBS (r² range = 0.949-0.951; Additional file 4A). When the analysis was limited to the top 1% of probes that showed the highest variance in M-values (transformed betas) across the four tested DBS punches, NCS_37a, NCS_37b, NCS_38a and NCS_38b, the correlation between replicates (r² range = 0.850-0.972; p < 0.001, Pearson) became significantly higher and more predictive of replication (p < 0.001, Steiger Z test) than the correlation between non-replicates (r² range = 0.365-0.371; p < 0.001, Pearson) (Additional file 4B). Similar observations were made using Spearman correlations. Moreover, the frequency distributions of delta M-values (δM) between samples were centred at zero only between technical replicates (Additional file 4B). Hence, we can conclude that HM450 analyses using DBS DNA, extracted using the GQ protocol, are reproducible.
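The restriction to the most variable probes can be illustrated with a small simulation. The sketch below uses synthetic M-value-like numbers, not the study data: two hypothetical spots that differ from each other, each sampled twice with small technical noise. All function names, noise levels and the probe count are assumptions chosen for illustration, and the Steiger Z test is not implemented here.

```python
import math
import random

def variance(values):
    """Population variance of a short list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def pearson_r2(x, y):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def top_variable_probes(samples, fraction=0.01):
    """Indices of the probes with the highest variance across samples."""
    n_probes = len(samples[0])
    ranked = sorted(range(n_probes),
                    key=lambda i: variance([s[i] for s in samples]),
                    reverse=True)
    return ranked[:max(1, int(n_probes * fraction))]

# Synthetic data: spot 38 differs biologically from spot 37; each punch adds
# a small amount of technical noise to its spot's underlying values.
random.seed(0)
n_probes = 5000
spot37 = [random.gauss(0, 2) for _ in range(n_probes)]
spot38 = [v + random.gauss(0, 1) for v in spot37]

def punch(spot):
    return [v + random.gauss(0, 0.2) for v in spot]

a37, b37, a38, b38 = punch(spot37), punch(spot37), punch(spot38), punch(spot38)

top = top_variable_probes([a37, b37, a38, b38], fraction=0.01)
subset = lambda sample: [sample[i] for i in top]

r2_replicates = pearson_r2(subset(a37), subset(b37))
r2_non_replicates = pearson_r2(subset(a37), subset(a38))
print(f"replicates r2 = {r2_replicates:.3f}, non-replicates r2 = {r2_non_replicates:.3f}")
```

In this toy setting, selecting high-variance probes enriches for positions where the two spots differ, which lowers the between-spot correlation while leaving the within-spot correlation high, in the same direction as the pattern reported above.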

Figure 4

Differential methylation and unsupervised clustering analysis of HM450 data from neonatal blood, DBS and cell line DNA. Neonatal blood and cell line DNA samples are used as positive controls of good DNA quality for reference comparisons with DNA extracted from DBS. Neonatal blood and DBS are from different individuals. Four DBS DNA samples are shown between four different neonatal blood and four different cell line DNA samples. The DBS samples represent two NCS spots, 37 and 38, each consisting of two tested punches labeled ‘a’ or ‘b’. HM450 beta-values were clustered using Euclidean distance as the dissimilarity index. As shown in the color key, the red and blue signals represent relatively hypomethylated and hypermethylated regions, respectively.

Performance of DNA extracted from DBS using sequence-specific methylation analysis

The performance of DNA extracted from DBS by the GQ method was then tested using sequence-specific methylation analyses. For this purpose, we analyzed the methylation levels of several CpG sites in Line1 and AluYb8 sequences, both of which are proxy markers of global methylation, being transposable elements interspersed across the genome [16]. The technical replicates, NCS 37a and 37b, showed similar Line1 and AluYb8 methylation levels at each tested CpG site, and similar data were observed with the pair NCS 38a and 38b (Figure 5A and B). These results show that inter-replicate variation in methylation levels is minimal across several CpG sites in two different loci, Line1 and AluYb8. Moreover, the observed difference in methylation levels between the pair NCS 37a and 37b and the pair NCS 38a and 38b was consistent at every single CpG tested and across both Line1 and AluYb8 loci (p < 0.1 for CpG6 in Line1 and CpG3 in AluYb8 and p < 0.05 for all other CpGs; Mann–Whitney test) (Figure 5A and B). These results confirm that DNA extracted from DBS using GQ is suitable to detect small methylation differences in a consistent manner and with low inter-replicate variation.

Figure 5

Methylation analyses of Line1 and AluYb8 loci using bisulfite pyrosequencing. The methylation levels of six and four CpG sites were analyzed for Line1 (A) and AluYb8 (B), respectively, and are expressed as percent of the total number of CpGs analyzed for each individual CpG site. The DBS samples represent two NCS spots, 37 and 38, each consisting of two tested punches labeled ‘a’ or ‘b’.

Discussion

DBS have become an increasingly important tool for diagnostic purposes and for epigenetic, genetic and epidemiological research. We have previously tested a range of commercially available DNA extraction kits for purifying genomic DNA from fresh and dried blood for downstream PCR and DNA methylation applications [5]. We found that genomic DNA extraction, using the ChargeSwitch Forensic DNA Purification kit (Invitrogen), with subsequent bisulfite modification, using the MethylEasy kit (Human Genetic Signatures), was best in yielding bisulfite-converted DNA of sufficient quantity and quality for downstream candidate-gene DNA methylation analyses, such as SEQUENOM MassArray EpiTYPER analysis [5]. However, DNA extraction with ChargeSwitch was recently shown to be unsuitable for beadchip methylation profiling, leading to up to 16% loss of detectable probes in Infinium HumanMethylation27 (Illumina Inc.) array analysis [9]. In relation to the limited amounts of DNA extracted from DBS, a recent report also pointed to the biases introduced by whole bisulfitome amplification and the need for careful data interpretation [13], as we have also observed in this study.

In this work, we have systematically compared different DNA extraction methods from DBS by dissecting different phases of extraction and optimizing several steps within each phase, using commercial and in-house extraction protocols. For these purposes, we used a homogeneous set of DBS samples, spotted on the same day and stored in a similar manner, to provide a common platform for cross-protocol comparisons. Moreover, this study emphasizes DNA extraction protocols that have particular utility in a recent technology for studying methylome-wide methylation, Infinium HM450, and in sequence-specific methylation analysis by pyrosequencing. The use of DBS for diagnostic and research purposes is not new, but there is a lack of quality standards for optimizing DNA extraction. This study suggests different DNA extraction protocols, each having specific advantages tailored for specific applications. Protocol GQ does not extract the highest DNA yield, but provides DNA in quantities and qualities sufficient for HM450 methylome-wide and sequence-specific methylation analyses. With GQ, the 260/280 ratios are consistently optimal and the extracted DNA is less fragmented relative to other protocols. Protocol Qq, on the other hand, produces twice as much DNA as GQ and with similar DNA integrity. However, 260/280 ratios in Qq are unreliable and cannot be used for sample selection, particularly for expensive downstream applications. As for protocols NN or Nn, they extract 1.7-fold more DNA than GQ and show optimal ranges for both 260/280 and 260/230 ratios. However, these protocols lead to more DNA fragmentation relative to GQ. This may represent a limitation for bead array-based assays and other DNA methylation assays where DNA integrity is a requirement. It should be noted, though, that 260/280 and 260/230 ratios should be treated with caution; for example, different contaminants can compensate for each other’s deviations, resulting in misleadingly optimal 260/280 ratios.

Because the type of purification column was identical in GQ and Qq, but different from those used in all the other methods, column nature (Phase II) could be the reason for the better DNA integrity in the two protocols. This is further supported by our results showing that in GQ, changing Phase II (includes column type) while maintaining Phase I, as in GN/n or GN/n-XS, compromises DNA integrity (Table 6). Changing the DNA precipitation buffer to ethanol reduces the need in some protocols to vigorously vortex to dissolve resultant precipitates but does not seem to enhance DNA integrity (Figure 2; compare GN versus Gn, GN-XS versus Gn-XS, and NN versus Nn). However, ethanol, as a precipitating buffer, was essential in some protocols to increase the DNA yield, as in Qq versus QQ (Tables 1 and 6).

Other studies have compared DNA extraction protocols from DBS, but irrespective of epigenetic applications [17–19]. It is difficult to compare DNA extraction protocols across different studies for many reasons, such as differences in the structures of the filter papers on which blood was soaked, the storage conditions and durations, and the DNA quantification methods used. However, these studies used QIAamp DNA Mini Kit (Qiagen) as a reference method, which is similar to the QQ protocol in this work, hence allowing comparisons between our and their optimized methods. Sjoholm et al. reported that QIAamp DNA Mini Kit performed the best relative to EZNA (Omega Bio-Tek), Chelex 100 (Amersham Biosciences) and alkaline lysis (GenomiPhi DNA Amplification Kit, Amersham Biosciences) [19]. On the other hand, the in-house developed DNA extraction methods reported by Hue et al. and Hollegaard et al. yielded, by Nanodrop quantification, 3.3- and 2.5-fold more DNA, respectively, than QIAamp DNA Mini Kit [18]. In comparison, the GQ protocol in our study yielded, also by Nanodrop quantification, on average 4.6-fold more DNA than matched QQ samples; in addition, protocols Qq, NN and Nn yielded at least 1.7-fold more DNA than matched GQ samples. Moreover, the in-house protocol by Hue et al. produced a low-purity 260/280 average ratio (1.50) [18], while GQ, NN and Nn ratios were optimal in every tested sample. These findings support the good performance of our optimized methods relative to many other in-house and commercial DNA extraction protocols from DBS. Interestingly, one study reported a recent method suitable for performing scalable DNA extractions simultaneously from many DBS, but with less emphasis on DNA quality and yield comparisons across different methods [20]. The scale of our tested methods can be increased by implementing the QIAcube technology (Qiagen), and, with scalable designs, laser cutting of DBS punches would eliminate cross-contamination, as has been recently reported [21].

Conclusion

This study arises from an international effort across several cohorts and working groups aiming to fulfill the need to systematize quality standards in DNA extraction and to increase the DNA yield using DBS, particularly with the advent of high-throughput epigenomic technologies that require high quality and quantity of DNA. Given the emerging appreciation of DBS collected at birth as a valuable resource for epigenetic analyses prior to phenotypic onset, our optimized methods for DNA extraction with application in methylation analyses have great potential for diagnostic and research purposes.

Methods

Sample overview

DBS that were used to perform the comparisons across the DNA extraction protocols were obtained from NCS, USA, and had been spotted on Flinders Technology Associates (FTA) mini cards on the same day and dried in air-sealed containers for approximately two years at room temperature. Guthrie cards (Whatman 903) from TIHS, Australia, that were more than 20 years old were also used to test the efficiency of the robust protocol GQ. Both NCS and TIHS samples were heel-prick samples collected without added anticoagulants. Permissions from the ethical committees of the International Agency for Research on Cancer (IARC), as well as of both NCS and TIHS, were obtained. TIHS is one of the founder cohorts of the International Childhood Cancer Cohort Consortium (I4C) [1].

DNA extraction protocols

Combinations of different commercially available DNA extraction kits were used, including QIAamp DNA Micro Kit (Qiagen 56304), GenSolve (Gen Vault, GVR110), NucleoSpin (gDNA clean-up, Macherey-Nagel 740230), and the extra-small (XS) version of NucleoSpin (gDNA clean-up XS, Macherey-Nagel 740904). Reported quantifications were done using Nanodrop unless otherwise indicated, in which case the Qubit™ dsDNA High Sensitivity Assay (Invitrogen Q32851) was used. Detailed protocols are included in Additional Methods (Additional file 2).

Gel electrophoresis and bioanalyzer analysis

Samples were run on a 0.8% agarose gel (Eurobio GEPAGA07-65) in 1 × Tris Acetate-EDTA buffer and stained with GelRed. 300 ng of DBS DNA were utilized per sample in electrophoresis analyses of DNA integrity and size range. The following DNA size markers were used: 80–10,000 bp ladder (Thermo SM0403), 500–5,000 bp ladder (Takara 3411A), and 100–1,000 bp ladder (Thermo SM0243). As for bioanalyzer analysis, 500 pg of DNA per sample was loaded on the chip and analyzed on Agilent 2100 Bioanalyzer, as per manufacturer’s instructions (Agilent Technology, High Sensitivity DNA Kit, 5067–4626).

Bisulfite conversion and PCR

DBS DNA (300 ng) samples were bisulfite converted using EZ DNA Methylation Kit (Zymo Research D5001) according to manufacturer’s instructions. To assess the efficiency of bisulfite conversion, DNA was amplified using PCR primers that were specific either to bisulfite-converted or to non-modified DNA, and spanning the region of the housekeeping gene, GAPDH. The primer pairs spanning bisulfite-converted GAPDH regions are termed GAPDH-bc and consisted of the following forward and reverse primers, respectively: 5’-GTATTTGTTGATGGGTTAAGG-3’ and 5’-ATAAAAACAAATCCCCTACCC-3’. The primer pairs spanning non-modified GAPDH regions are termed GAPDH-nm and consisted of the following forward and reverse primers, respectively: 5’-CTCTTGCTACTCTGCTCTGG-3’ and 5’-GCTAAGTTTAGCCTGCCTGG-3’. Efficient conversion is observed when PCR bands are detected with GAPDH-bc but not with GAPDH-nm for a given sample. The PCR conditions used were: 95°C 15 min, [95°C 30 s, 57°C 30 s, 72°C 30 s] × 50 cycles, 72°C 10 min, and pause 4°C.
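The logic of the two primer sets can be checked from their base composition. After bisulfite conversion and PCR, unmethylated cytosines are read as thymines, so a primer pair designed against a fully converted strand and avoiding CpG sites would be expected to contain no C in the forward primer and no G in the reverse primer. The sketch below applies this simple check to the sequences listed above; the design assumption (converted top strand, no CpGs within the primers) and the function name are ours and are not stated in the original protocol.

```python
# Primer sequences as given in the text (forward, reverse), written 5' to 3'.
PRIMERS = {
    "GAPDH-bc": ("GTATTTGTTGATGGGTTAAGG", "ATAAAAACAAATCCCCTACCC"),
    "GAPDH-nm": ("CTCTTGCTACTCTGCTCTGG", "GCTAAGTTTAGCCTGCCTGG"),
}

def looks_bisulfite_specific(forward, reverse):
    """True if the forward primer lacks C and the reverse primer lacks G.

    This is the base composition expected for primers matching a fully
    converted strand, assuming the primers avoid CpG sites.
    """
    return "C" not in forward and "G" not in reverse

for name, (fwd, rev) in PRIMERS.items():
    print(name, looks_bisulfite_specific(fwd, rev))
```

Under these assumptions, only the GAPDH-bc pair has the composition of a converted-DNA primer set, which is consistent with its intended use for detecting converted template.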

Pyrosequencing

DNA methylation analysis by pyrosequencing of bisulfite-converted DNA was performed as described [22]. Briefly, the region of interest was amplified with forward and reverse primers, one of which is biotinylated (btn), and then the methylation levels of the amplified region were analyzed using a sequencing primer. The forward, reverse and sequencing primers used for Line1 were the following, respectively, and adopted from Daskalos et al. [16]: 5’-btn-TAGGGAGTGTTAGATAGTGG-3’, 5’-AACTCCCTAACCCCTTAC-3’ and 5’-CAAATAAAACAATACCTC-3’. The forward, reverse and sequencing primers used for AluYb8 were the following, respectively: 5’-AGATTATTTTGGTTAATAAG-3’, 5’-btn-AACTACRAACTACAATAAC-3’ and 5’-GTTTGTAGTTTTAGTTATT-3’, as previously described [23].

Illumina Infinium HM450 array and data processing

Infinium HM450 arrays were processed according to manufacturer’s instructions. GenomeStudio was used to analyze quality controls. The raw colour channels were corrected using the internal control probes and converted, without background subtraction and normalization, into absolute methylation levels (beta-values). Data were then imported into R (3.0.0), using the minfi package version 1.2.0 (http://www.bioconductor.org). Subset-quantile Within Array Normalisation (SWAN) was performed to correct for technical discrepancies between Type I and Type II probes [24]. Probes with detection p-values above 0.01 were considered as background noise and omitted from further analysis. Sex chromosome-specific probes were eliminated to minimize gender-specific variation of the X versus Y chromosomes [25]. Logarithmic transformation of the beta-values into M-values was done, as previously described [26]. Statistical tests were performed using M-values. Clustering plots were generated using the lumi R package and based on the coefficient of variation, which is calculated as the standard deviation divided by the mean across samples [27]. Differentially methylated CpGs were identified using an F-test in minfi.
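Two of the transformations mentioned above can be written out explicitly. The M-value is the base-2 logit of the beta-value, M = log2(beta / (1 - beta)), and the coefficient of variation is the standard deviation divided by the mean. The sketch below is a plain Python illustration and not the R code used in the study; the clamping offset `eps` and the use of the sample (n - 1) standard deviation are assumptions made here.

```python
import math
from statistics import mean, stdev

def beta_to_m(beta, eps=1e-6):
    """Convert a beta-value (0..1) to an M-value, M = log2(beta / (1 - beta)).

    beta is clamped to [eps, 1 - eps] to avoid division by zero or log of
    zero; this offset is an illustrative assumption.
    """
    beta = min(max(beta, eps), 1 - eps)
    return math.log2(beta / (1 - beta))

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample standard deviation assumed)."""
    return stdev(values) / mean(values)

# A half-methylated site maps to M = 0; higher beta gives positive M.
for beta in (0.2, 0.5, 0.8):
    print(f"beta = {beta:.1f} -> M = {beta_to_m(beta):+.2f}")

# Hypothetical beta-values for one probe across three samples.
print(f"CV = {coefficient_of_variation([0.1, 0.2, 0.3]):.2f}")
```

Because the M-value scale is unbounded and symmetric around zero, it is the scale on which the statistical tests described above were performed.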