Purification of nanogram-range immunoprecipitated DNA in ChIP-seq application
Chromatin immunoprecipitation-sequencing (ChIP-seq) is a widely used epigenetic approach for investigating genome-wide protein-DNA interactions in cells and tissues. The approach is relatively well established, but several key steps still require improvement. As part of the procedure, immunoprecipitated DNA must undergo purification and library preparation for subsequent high-throughput sequencing. Current ChIP protocols typically yield nanogram quantities of immunoprecipitated DNA, depending mainly on the target of interest and the starting amount of chromatin input. However, little information exists on the performance of reagents used to purify such minute amounts of immunoprecipitated DNA in ChIP elution buffer, or on their effects on ChIP-seq data. Here, we compared DNA recovery, library preparation efficiency, and ChIP-seq results obtained with several commercial DNA purification reagents applied to 1 ng of ChIP DNA, and also investigated the impact of the conditions under which ChIP DNA is stored.
We compared the DNA recovery of ten commercial DNA purification reagents and phenol/chloroform extraction from 1 to 50 ng of immunoprecipitated DNA in ChIP elution buffer. The recovery yield differed significantly among reagents with 1 ng of DNA but was similar at higher DNA amounts. We also observed that purified DNA in the low nanogram range is prone to loss during storage, depending on the type of polypropylene tube used. Immunoprecipitated DNA equivalent to 1 ng of purified DNA was subjected to DNA purification and library preparation to evaluate the performance of four of the better-performing purification reagents in ChIP-seq applications. Quantification of the library DNA indicated that the selected purification kits have a negligible impact on the efficiency of library preparation. The resulting ChIP-seq data were comparable with the dataset generated by the ENCODE consortium and were highly correlated across the different purification reagents.
This study provides comparative data on commercial DNA purification reagents applied to nanogram-range immunoprecipitated ChIP DNA and evidence for the importance of storage conditions of low nanogram-range purified DNA. We verified consistent high performance of a subset of the tested reagents. These results will facilitate the improvement of ChIP-seq methodology for low-input applications.
Keywords: Nanogram DNA, ChIP-seq, DNA purification, DNA storage
H3K27me3: Histone H3 lysine 27 trimethylation
H3K4me3: Histone H3 lysine 4 trimethylation
NGS: Next-generation sequencing
Epigenetic dysregulation is deeply involved in the etiology of various human diseases. Chromatin immunoprecipitation in combination with next-generation sequencing (ChIP-seq) is a highly informative epigenetic approach because it reveals genome-wide distribution profiles of histone marks, transcription factors, and chromatin-associated proteins. The methodology is well established in cultured cells, and the vast majority of ChIP-seq data are generated from cell lines. Current ChIP protocols typically require 5–10 million cells per ChIP [2, 3], which limits the use of ChIP-seq technology in primary cells, rare cell populations, and small clinical samples such as needle biopsies. Furthermore, although quality control standards for ChIP-seq studies have been established, several key steps still require further optimization, particularly in small and patient-derived samples, where a high degree of consistency and efficiency must be achieved before the technique is introduced into clinical medicine.
The workflow of a typical ChIP-seq experiment consists of multiple steps, including sample collection, chromatin input preparation, immunoprecipitation, purification of immunoprecipitated DNA, library preparation, next-generation sequencing, mapping of sequencing reads, and data analysis. In our efforts to improve the technology, we have realized that purification of immunoprecipitated DNA and storage of the purified DNA, two steps that have received little attention, are critical for successful library preparation and overall ChIP-seq quality. The yield of immunoprecipitated ChIP DNA depends on several factors, including the target of interest, the starting amount of chromatin input, and the antibody used. Typically, a ChIP experiment is designed to generate immunoprecipitated DNA in the nanogram range. However, it is often difficult to obtain more than 1 ng of purified ChIP DNA for some targets, such as transcription factors, chromatin-associated proteins, and histone marks with small genomic footprints, and in experiments performed with small numbers of cells or patient-derived clinical samples. The PCR amplification-based library preparation method is well accepted in ChIP-seq applications. There are some reports that less than 1 ng of DNA can be used for ChIP-seq library preparation. However, more DNA is typically better, and at least 1–10 ng of purified DNA is recommended for consistency and data quality. Currently, there is no report on how purification methods and reagents affect the recovery of nanogram-range immunoprecipitated DNA, or on how DNA purified with different reagents affects library preparation and ChIP-seq data quality.
In this study, we sought to improve the experimental steps for purification of immunoprecipitated DNA and library preparation for ChIP-seq applications. The purification yield for nanogram-range immunoprecipitated DNA was tested with ten ready-to-use, commercially available DNA purification reagents and phenol/chloroform extraction. We also examined the potential interference of purification reagents with downstream applications. Logistically, library preparation is usually performed a day or more after purification, and we observed that storage conditions are important for the preservation of low nanogram-range purified ChIP DNA. Finally, we selected the four reagents that performed better in our hands and tested how they affect library preparation and ChIP-seq data quality using 1 ng of immunoprecipitated DNA generated from H3K4me3 or H3K27me3 ChIPs. Our results indicate that the selected purification kits have a minimal effect on the efficiency of library preparation and on the resulting ChIP-seq data.
DNA recovery is significantly different amongst purification reagents
Storage conditions affect the preservation of low amounts of purified ChIP DNA
Highly correlated ChIP-seq data are generated from different purification reagents
The mapping and global enrichment results from the ChIP-seq experiments are shown in Additional files 5 and 6 [3, 7]. Mapping rates, library complexity, and peak numbers were similar among the libraries prepared with different purification reagents (Additional files 7 and 8). Peak profiles visualized in the Integrative Genomics Viewer (IGV; Broad Institute, Cambridge, MA) [8, 9] were also highly similar among the datasets, including the results generated using our standard protocol (St) (Fig. 3b). As expected [10, 11], narrow H3K4me3 peaks are primarily found at active promoters (see genes in the middle), whereas broad H3K27me3 peaks are distributed over PRC2-repressed genes such as MYT1 and ZBTB46. The H3K4me3 peak profiles also closely matched the corresponding data generated by the ENCODE consortium in HeLa cells. Although similarities could also be found between our H3K27me3 data and the corresponding ENCODE results, the data from the University of Washington were considerably less clear, likely due to suboptimal enrichment.
To formally analyze the correspondence among the data generated with the different purification reagents, we performed Pearson correlation analysis between the H3K4me3 and H3K27me3 ChIP-seq datasets generated with our standard protocol and each of the datasets obtained using the tested DNA purification kits (Fig. 3c). We observed uniformly high correlation for both marks, with Pearson correlation coefficients ranging between 0.990 and 0.999 (p < 0.001). Our experimental datasets were also highly correlated with the corresponding ENCODE data (Additional file 9). These results indicate a high degree of similarity among ChIP-seq datasets obtained using different DNA purification reagents.
To our knowledge, this is the first report to systematically test the efficacy of purification and library preparation of nanogram-range immunoprecipitated DNA in ChIP-seq applications. These aspects of ChIP-seq experimentation have received relatively little attention, although they may affect the success rate and reliability of these demanding experiments. We showed that DNA purification reagents have variable impacts on the recovery of nanogram-range immunoprecipitated DNA in ChIP elution buffer. We also observed that the storage conditions of purified DNA are important. However, DNA purification reagents have a minimal impact on ChIP-seq data if a sufficient amount of DNA is available for library preparation. It is noteworthy that current library preparation technology supports consistent and robust library preparation from over 1 ng of purified DNA. Several groups have reported the development of ChIP-seq protocols for low cell numbers, which are expected to generate immunoprecipitated DNA in the nanogram to picogram range [2, 13, 14]. Further optimization of these and other key steps may help achieve the consistency and efficiency of ChIP-seq experimentation required for its introduction into clinical applications.
Our results have revealed significant differences among DNA purification kits in their ability to recover various low amounts of DNA. Our study was not meant to be comprehensive, as many other kits were not included. However, we found that four of the eleven tested reagents were capable of handling low nanogram amounts of DNA in ChIP elution buffer and had no noticeable negative impact on library preparation and ChIP-seq data quality (Figs. 1 and 3). Well-performing reagents included both more traditional, silica-membrane-based column purification kits and solid-phase reversible immobilization (SPRI)-based reagents, which utilize paramagnetic beads and can be easily automated. DNA fragments less than 30 bp in size are preferentially lost during column-based purification, whereas SPRI bead-based purification results in loss of DNA less than 100 bp in size (Fig. 1c). Consistent with this, 1% of sequenced fragments from the Agencourt AMPure XP kit were smaller than 100 bp, whereas nearly 5–10% of sequenced fragments from column-based purification reagents were smaller than 100 bp (Additional file 10). We tested whether purification reagents utilizing silica columns or SPRI introduce any bias in ChIP-seq data. The correlation analysis of ChIP-seq data indicated no detectable differences between reagents based on these two principles of operation (Fig. 3; Additional file 9). We also compared the distribution of ChIP-seq peaks from small fragments in column-based purification reagents with the distribution of ChIP-seq peaks from the Agencourt AMPure XP kit and did not observe any noticeable differences (data not shown). These results indicate that the differential loss of small fragments does not introduce bias in ChIP-seq applications. However, it may be important in other applications involving DNA fragments smaller than 100 bp.
The lengthy and complicated nature of standard ChIP-seq protocols makes sample storage almost unavoidable, for example before library preparation. Therefore, we also tested the effect of key storage conditions on ChIP DNA loss. In these experiments we confirmed that, depending on the DNA concentration (0.1–1 ng/μL in 15 μL volume), the duration and temperature of storage, and the tubes used, 7% to >50% of the ChIP DNA could be lost during storage (Fig. 2). In our experiments, the type of polypropylene storage tube had the greatest impact. The loss of ChIP DNA was greater at 0.1 ng/μL than at 1 ng/μL, and greater at 4 °C than at −20 °C. Most loss occurred during the first 3 days of storage. These observations are consistent with previous reports of preferential loss of short DNA fragments stored at low concentrations, owing to adsorption to the tube wall and denaturation [15, 16, 17]. Thus, low amounts of purified ChIP DNA should be stored in low-binding tubes, at −20 °C, and at the highest possible concentration for the shortest possible time. If storage is unavoidable, it is advisable to re-quantify the DNA before library preparation.
We compared the performance of ten commercial DNA purification reagents and phenol/chloroform extraction on low nanogram quantities of ChIP DNA. Four of the well-performing reagents were selected for investigating the impact on library preparation and ChIP-seq data quality. The selected purification reagents had minimal impact on library preparation and generated highly correlated ChIP-seq data. We also showed that considering storage conditions such as the type of tubes used, DNA concentration, temperature, and duration is critical for maximizing the preservation of low amounts of purified ChIP DNA. Our results will aid efforts directed at the optimization of ChIP-seq methodology for low-input applications including the analysis of small and non-renewable patient samples.
Cell culture and reagents
HeLa cells were purchased from ATCC. Cells were grown in Advanced DMEM (Dulbecco’s Modified Eagle Medium) containing 10% calf bovine serum at 37 °C and 5% CO2 with saturating humidity.
HeLa cells were cross-linked with 1% formaldehyde for 10 min, followed by quenching with 125 mM glycine for 5 min at room temperature. Fixed cells were washed twice with TBS, resuspended in cell lysis buffer (10 mM Tris-HCl, pH 7.5, 10 mM NaCl, 0.5% NP-40), and incubated on ice for 10 min. The lysates were washed with MNase digestion buffer (20 mM Tris-HCl, pH 7.5, 15 mM NaCl, 60 mM KCl, 1 mM CaCl2) and incubated for 20 min at 37 °C in the presence of MNase (2000 gel units/4 × 10^6 cells, New England Biolabs, Ipswich, MA). After adding the same volume of sonication buffer (100 mM Tris-HCl, pH 8.1, 20 mM EDTA, 200 mM NaCl, 2% Triton X-100, 0.2% sodium deoxycholate), the lysate was sonicated for 15 cycles (30 s on, 30 s off) using a Diagenode Bioruptor and centrifuged at 15,000 rpm for 10 min. The cleared supernatant equivalent to 4 × 10^6 cells was incubated with 2 μg of antibody at 4 °C on a rocker overnight. The anti-H3K27me3 antibody (Cat. #9733, Lot #8) was purchased from Cell Signaling Technology (Danvers, MA) and the purified anti-H3K4me3 antibody was generated in-house (EDL Lot 1). After adding 30 μL of prewashed protein G-magnetic beads, the reaction was further incubated for 3 h. The beads were extensively washed with ChIP buffer, high salt buffer, LiCl buffer, and TE buffer. All washes were carried out for 5 min at 4 °C on a rotating wheel. Bound chromatin was eluted in 100 μL ChIP elution buffer (10 mM Tris-HCl, pH 8.0, 10 mM EDTA, 150 mM NaCl, 5 mM DTT, 1% SDS) and reverse-crosslinked at 65 °C overnight. After treatment with RNase A and proteinase K, DNA was purified with the Qiagen MinElute PCR Purification Kit (Cat. # 28006, Valencia, CA) and quantified using the Qubit dsDNA High Sensitivity assay (Invitrogen, Q32851). To check the size of input chromatin, purified input DNA was analyzed on a Fragment Analyzer (Advanced Analytical Technologies; AATI; Ankeny, IA) using the High Sensitivity NGS Fragment Analysis Kit (Cat. # DNF-486).
Analysis of DNA recovery in ChIP elution buffer using different purification reagents
Chromatin input was prepared from HeLa cells following the ChIP protocol described above and was reverse-crosslinked at 65 °C overnight. After treatment with RNase A and proteinase K, DNA was purified using the MinElute PCR Purification Kit, quantified using the Qubit dsDNA High Sensitivity assay, and adjusted to 1 ng/μL with TE buffer. DNA samples were prepared at final amounts of 1 ng, 5 ng, 10 ng, and 50 ng in 100 μL ChIP elution buffer and were purified with 11 different purification reagents according to the manufacturers' instructions, except for the elution volume, which was fixed at 16 μL. Similarly, DNA was purified from de-crosslinked chromatin estimated to contain 1 ng, 5 ng, 10 ng, and 50 ng of DNA after treatment with RNase A and proteinase K. The following reagents were used in the experiment: ChIP DNA Clean & Concentrator™ (Cat. # D5205) from Zymo Research (Zy) (Irvine, CA); Wizard® SV Gel and PCR Clean-Up System (Cat. # A9281) from Promega (Pr) (Fitchburg, WI); GeneJET PCR Purification Kit (Cat. # K0701) from Thermo Fisher Scientific (Th) (Waltham, MA); PureLink® PCR Purification Kit (Cat. # K310001) from Invitrogen (In) (Carlsbad, CA); Monarch® PCR & DNA Cleanup Kit (Cat. # T1030S) from New England Biolabs (Ne) (Ipswich, MA); Chromatin IP DNA Purification Kit (Cat. # 58002) from Active Motif (Am) (Carlsbad, CA); QIAquick PCR Purification Kit (Cat. # 28106) from Qiagen (Qp) (Valencia, CA); MinElute PCR Purification Kit (Cat. # 28006) from Qiagen (Qm); Agencourt AMPure XP (Cat. # A63881) from Beckman (Ba) (Indianapolis, IN); RNAClean™ XP (Cat. # A63987) from Beckman (Br); and phenol/chloroform extraction (PC) (Additional file 11). The sample-to-beads ratios tested for Ba and Br were 1:1.25, 1:1.50, 1:1.75, and 1:2. Each purification reagent was tested in triplicate DNA samples derived from 3 independent experiments. The recovery rate was calculated by dividing the amount of DNA recovered after purification by the starting amount and was expressed as a percentage.
The size distribution of DNA purified from de-crosslinked chromatin was analyzed on the AATI Fragment Analyzer using the High Sensitivity NGS Fragment Analysis Kit.
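The recovery-rate calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis code: the function names and the Qubit readings are hypothetical, and it assumes that the full 16 μL elution volume is recovered and that the Qubit concentration is reported in ng/μL.

```python
from statistics import mean, stdev


def recovered_ng(conc_ng_per_ul: float, elution_ul: float = 16.0) -> float:
    """Total DNA in the eluate, assuming the whole elution volume is recovered."""
    return conc_ng_per_ul * elution_ul


def recovery_percent(recovered: float, starting: float) -> float:
    """Recovered DNA divided by starting DNA, expressed as a percentage."""
    if starting <= 0:
        raise ValueError("starting amount must be positive")
    return 100.0 * recovered / starting


# Hypothetical triplicate Qubit readings (ng/uL) for a 1 ng input.
readings = [0.050, 0.045, 0.055]
rates = [recovery_percent(recovered_ng(c), starting=1.0) for c in readings]
print(f"recovery: {mean(rates):.1f} +/- {stdev(rates):.1f} %")
```

Reporting the mean and standard deviation over replicates is one reasonable summary; the source states only that each reagent was tested in triplicate.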
PCR analysis of final eluent from different purification reagents
To check for potential interference of the purification reagents with downstream applications, a qPCR assay was performed. Nine microliters of the final DNA eluent from each purification reagent was combined with 1 μL of a 166 bp Drosophila probe DNA fragment, and the resulting mixture was used as the template to amplify the Drosophila-specific probe DNA in a 20 μL real-time PCR reaction. TE buffer was used as the control, and the Ct value from TE buffer was set as 100%. The experiment was repeated 3 times using the final eluents from de-crosslinked chromatin estimated to contain 1 ng of DNA. The following primer sequences were used for Drosophila probe preparation and real-time PCR: Drosophila probe-F: 5′-GCTGACGGGAACAATGGTC-3′, Drosophila probe-R: 5′-TGGCGACGACGTAACAACAT-3′.
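The text does not specify how Ct values were converted to percentages of the TE control. One common convention, shown here purely as an assumption, is to express the signal as efficiency^(Ct_TE − Ct_sample) × 100, with perfect doubling per cycle (efficiency = 2). The function name and Ct values below are hypothetical.

```python
def relative_amplification_percent(
    ct_sample: float, ct_te: float, efficiency: float = 2.0
) -> float:
    """Amplification signal relative to the TE control (defined as 100%).

    Assumes the template amount scales as efficiency ** (-Ct); this is an
    illustrative convention, not necessarily the one used in the study.
    """
    return 100.0 * efficiency ** (ct_te - ct_sample)


# Hypothetical Ct values: a one-cycle delay corresponds to half the signal.
print(relative_amplification_percent(ct_sample=23.0, ct_te=22.0))
```

Under this convention, an eluent that does not inhibit PCR gives a value close to 100%, and a delayed Ct gives a lower value.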
Analysis of storage conditions for purified ChIP DNA
ChIP DNA was prepared from HeLa cells using the H3K4me3 antibody and the protocol described above. Purified ChIP DNA was adjusted to 0.1 or 1 ng/μL with TE buffer. Aliquots of 15 μL were dispensed into 1.5 mL polypropylene tubes of the kind used in a typical molecular biology laboratory and stored at 4 °C or −20 °C. The following tubes were tested: Axygen® 1.7 mL MaxyClear Snaplock Microcentrifuge Tube (Cat. # MCT-175-C) (Corning, New York), Eppendorf DNA LoBind Snap Cap PCR Tube (Cat. # 022431021) (Hauppauge, NY), Fisherbrand™ Siliconized Low-Retention Microcentrifuge Tube (Cat. # 02–681-331) (Waltham, MA), and Fisherbrand™ Premium Microcentrifuge Tube (Cat. # 05–408-129). DNA was quantified using the Qubit dsDNA High Sensitivity assay on days 0, 1, 2, 3, and 7 of storage. At each time point, the DNA amount in 5 individual tubes was measured and the tubes were then discarded. The experiment was repeated in triplicate with independently prepared H3K4me3 ChIP DNA samples. DNA amounts detected in solution were compared to the starting DNA amount.
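The comparison to the starting DNA amount can be illustrated as follows. This is a sketch under stated assumptions: the readings are invented, the function name is hypothetical, and the day-0 mean of the five tubes is taken as the starting amount.

```python
from statistics import mean
from typing import Dict, List


def percent_remaining(readings_by_day: Dict[int, List[float]]) -> Dict[int, float]:
    """Mean DNA per time point as a percentage of the day-0 mean."""
    baseline = mean(readings_by_day[0])
    if baseline <= 0:
        raise ValueError("day-0 mean must be positive")
    return {
        day: 100.0 * mean(values) / baseline
        for day, values in sorted(readings_by_day.items())
    }


# Hypothetical Qubit readings (ng/uL), five tubes per time point.
readings = {
    0: [1.00, 0.98, 1.02, 1.01, 0.99],
    1: [0.90, 0.88, 0.92, 0.91, 0.89],
    3: [0.70, 0.72, 0.68, 0.71, 0.69],
    7: [0.66, 0.65, 0.67, 0.64, 0.68],
}
for day, pct in percent_remaining(readings).items():
    print(f"day {day}: {pct:.1f}% remaining")
```

Because each tube is discarded after measurement, the time points are independent samples rather than repeated measurements of the same tube.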
Library preparation of DNA purified by selected purification reagents and sequencing
ChIP was performed in HeLa cells as described above for the H3K4me3 and H3K27me3 marks. After RNase A and proteinase K treatments, immunoprecipitated DNA in the ChIP elution buffer was evenly divided into aliquot A and aliquot B. Aliquot A was purified with the MinElute PCR Purification Kit, which is the standard protocol in the Epigenomics Development Lab (EDL). The enrichment was analyzed by real-time PCR targeting positive and negative control genomic loci, and DNA was quantified using the Qubit dsDNA High Sensitivity assay. The DNA concentration in aliquot B was back-calculated from aliquot A. For aliquot B, the immunoprecipitate equivalent to 1 ng of purified ChIP DNA was diluted to 100 μL with ChIP elution buffer. DNA was purified using the selected purification reagents according to the manufacturers' instructions. For the MinElute PCR Purification Kit, DNA was eluted into MaxyClear and Premium tubes. For the other reagents, DNA was eluted into MaxyClear tubes. ChIP-seq libraries were prepared from the DNA purified from aliquot B by the various reagents using the ThruPLEX® DNA-seq Kit V2 (Rubicon Genomics, Ann Arbor, MI) according to the manufacturer’s instructions. For comparison, libraries were also prepared from 1 ng of purified input and ChIP DNA from aliquot A (EDL standard protocol). Following the repair and adaptor ligation steps, the adaptor-ligated DNA was amplified for 12 cycles by PCR in a 50 μL reaction volume. To analyze the size and quantity of the library DNA, 2 μL of the PCR reaction was analyzed on the AATI Fragment Analyzer using the High Sensitivity NGS Fragment Analysis Kit. The remaining PCR reaction was further purified for sequencing. A total of 11 ChIP-seq libraries were sequenced to 51 base pairs from both ends in the same lane using the Illumina HiSeq 4000 instrument at the Mayo Clinic Medical Genome Facility Sequencing Core.
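The back-calculation for aliquot B amounts to simple proportional arithmetic, sketched below. The volumes and Qubit reading are hypothetical (the source does not state them), and the sketch assumes that aliquots A and B contain equal amounts of DNA and that the yield of aliquot A after MinElute purification defines the "purified DNA equivalent".

```python
def volume_for_target_ng(
    total_ng_in_aliquot: float, aliquot_volume_ul: float, target_ng: float = 1.0
) -> float:
    """Volume of unpurified immunoprecipitate equivalent to `target_ng` of purified DNA."""
    if total_ng_in_aliquot <= 0 or aliquot_volume_ul <= 0:
        raise ValueError("amount and volume must be positive")
    equivalent_conc = total_ng_in_aliquot / aliquot_volume_ul  # ng per uL
    volume = target_ng / equivalent_conc
    if volume > aliquot_volume_ul:
        raise ValueError("aliquot does not contain enough DNA for the target amount")
    return volume


# Hypothetical numbers: aliquot A gave 0.5 ng/uL in a 10 uL eluate (5 ng total);
# aliquot B is 50 uL of unpurified immunoprecipitate with the same DNA content.
total_ng = 0.5 * 10.0
take_ul = volume_for_target_ng(total_ng, aliquot_volume_ul=50.0)
buffer_ul = 100.0 - take_ul  # top up to 100 uL with ChIP elution buffer
print(f"take {take_ul:.1f} uL of aliquot B and add {buffer_ul:.1f} uL elution buffer")
```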
Real-time PCR analysis
Real-time PCR analysis was performed using SYBR Green universal PCR mixes (Bio-Rad). The following primer sequences were used in the experiments: H3K4me3-positive control locus: hGAPDH-F: 5′-CCCACTCCTCCACCTTTGAC-3′, hGAPDH-R: 5′-CCCAGCCACATACCAGGAAA-3′. H3K27me3-positive control locus: hMYT1-F: 5′-CCTGCCGTGTGCTGTTTTT-3′, hMYT1-R: 5′-CACAACATGTCCCCTGGAATC-3′. H3K4me3- and H3K27me3-negative control locus: hCh19-intergenic-F: 5′-AGCTTGTCTTTCCCAAGTTTACTC-3′, hCh19-intergenic-R: 5′-TAGCTGTCGCACTTCAGAGGA-3′.
Mapping and data analysis
Raw sequencing reads were processed and analyzed using the HiChIP pipeline to obtain visualization files and a list of peaks. Briefly, paired-end reads were mapped to the human reference genome (release hg19/GRCh37) by BWA with default settings, and only uniquely mapped reads were used for further analysis. Peaks were called using the MACS2 algorithm for H3K4me3 and SICER for H3K27me3 at FDR ≤ 1%. Fragment size was calculated from properly mapped read pairs. H3K4me3 and H3K27me3 ChIP-seq datasets (Broad Institute and University of Washington) generated by the ENCODE consortium in HeLa cells were downloaded from the Gene Expression Omnibus. Correlation analysis was performed with our in-house scripts. All datasets were randomly downsampled to 25 million reads. In brief, the whole genome was divided into 5 kb bins for H3K4me3 and 100 kb bins for H3K27me3, and the number of mapped reads in each bin was counted. The log-transformed counts, log2(count + 1), were used for pairwise correlation analysis with the Pearson coefficient; the pseudo-count of 1 avoids taking the logarithm of zero. The FRiP (fraction of reads in peaks) score for each sample was calculated following a previously described method. The library complexity was calculated with the Preseq package.
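The binned correlation analysis described above can be sketched as follows. The authors' in-house scripts are not available in the text, so this is an independent illustration in plain Python: the function names, the representation of a read as a (chromosome, position) pair, and the toy data are all assumptions. Real data would use 5 kb or 100 kb bins over hg19 chromosome sizes.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, int]  # (chromosome, mapped position); hypothetical input format
Bin = Tuple[str, int]   # (chromosome, bin index)


def genome_bins(chrom_sizes: Dict[str, int], bin_size: int) -> List[Bin]:
    """List every fixed-width bin in the genome, including bins without reads."""
    return [
        (chrom, i)
        for chrom, size in chrom_sizes.items()
        for i in range((size + bin_size - 1) // bin_size)
    ]


def bin_counts(reads: Iterable[Read], bin_size: int) -> Counter:
    """Count mapped reads per bin."""
    return Counter((chrom, pos // bin_size) for chrom, pos in reads)


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def binned_log2_correlation(
    reads_a: Iterable[Read],
    reads_b: Iterable[Read],
    chrom_sizes: Dict[str, int],
    bin_size: int,
) -> float:
    """Pearson correlation of log2(count + 1) over all genomic bins."""
    counts_a = bin_counts(reads_a, bin_size)
    counts_b = bin_counts(reads_b, bin_size)
    bins = genome_bins(chrom_sizes, bin_size)
    # Counter returns 0 for bins with no reads; +1 is the pseudo-count.
    x = [math.log2(counts_a[b] + 1) for b in bins]
    y = [math.log2(counts_b[b] + 1) for b in bins]
    return pearson(x, y)


# Toy example on a hypothetical 20 kb chromosome with 5 kb bins.
chrom_sizes = {"chrT": 20_000}
reads_a = [("chrT", 100), ("chrT", 200), ("chrT", 6_000)]
reads_b = [("chrT", 150), ("chrT", 250), ("chrT", 7_500), ("chrT", 16_000)]
r = binned_log2_correlation(reads_a, reads_b, chrom_sizes, bin_size=5_000)
print(f"Pearson r = {r:.3f}")
```

Enumerating all bins from the chromosome sizes, rather than only bins that contain reads, keeps empty regions in the comparison, which matches the description of dividing the whole genome into bins.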
Sequencing was performed by Mayo Clinic Medical Genomics Facility Sequencing Core. We thank Dr. Huihuang Yan (Mayo Clinic) and Krutika Gaonkar (Mayo Clinic) for valuable suggestions of data analysis.
This work was supported, in part, by National Institutes of Health grant P01DK068055 (A.B., G.F., T.O., J.H.L) and the Mayo Clinic Center for Individualized Medicine Epigenomics Program. The funding bodies played no role in the design of the study or collection, analysis and interpretation of data or writing of the manuscript.
Availability of data and materials
The ChIP-seq data reported in this manuscript are available in the National Center for Biotechnology Information Gene Expression Omnibus under accession number GSE103396.
JZ and JHL conceived and designed the experiments. JZ conducted most of the experiments. JHL supervised the work. ZY, JZ, and JHL performed data analysis. SL, CC, ZZ, KR, TO, JHL contributed to optimize ChIP-seq protocol. AB, GF, TO provided key resources and reagents. JZ and JHL wrote the manuscript. All authors read and edited the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
HeLa cells (ATCC® CCL-2™) were purchased from ATCC.
Consent for publication
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.