Objective

Legume plants regulate the number of nodules they form through a long-distance signal transduction pathway that involves many genes [1]. Earlier work in our lab identified a spontaneous mutant of Medicago truncatula cultivar Jemalong that shows increased nodulation [2]. Molecular genetic evidence indicated that the phenotype of the like-sunn-supernodulator (lss) mutant is due to an unknown lesion that causes cis-silencing of the SUNN gene, which has a wild-type sequence in lss mutants. Expression of SUNN in the shoots is critical to regulation of nodulation in the roots [3]. The lesion has been mapped to an 810 kilobase region of chromosome 4 that includes the SUNN gene, but the nature of the lesion has not been determined [2]. Altered methylation of the promoter was suspected, but bisulfite sequencing of the SUNN promoter at the time of publication revealed no significant methylation differences between A17 wild type and lss plants [2]. Using genome capture combined with bisulfite sequencing, Satgé et al. [4] identified 474 regions that were differentially methylated during nodule development in M. truncatula and over 400 genes downregulated in plants with a mutant copy of the DEMETER demethylase gene. Because the lss lesion behaves like a paramutation, including reversion events [2], we decided to expand our bisulfite sequencing beyond the SUNN promoter and compare the methylomes of the A17 wild type and the lss mutant.

Data description

Whole genome bisulfite sequencing

The data consist of sequencing results from two bisulfite libraries made from young leaves of individual 6-week-old greenhouse-grown Medicago truncatula A17 wild type and lss mutant plants (one plant of each genotype). DNA was extracted using the DNeasy Plant Mini Kit (Qiagen). Bisulfite treatment of A17 and lss DNA was performed using the EZ DNA Methylation-Gold Kit (Zymo Research) according to the manufacturer's instructions. Whole genome bisulfite sequencing (WGBS) libraries were prepared using the Illumina TruSeq DNA Methylation library preparation kit. Library quality was assessed using an Agilent Bioanalyzer, and libraries were quantified using a Q-PCR kit from KAPA Biosystems following the manufacturer's instructions. Sequencing was performed on an Illumina HiSeq 2500 with the HiSeq Cluster Kit v4 (1 × 125 bp single-read; see Data set 1 and Data set 2 for unaligned sequencing reads). Raw sequencing FASTQ files were trimmed to remove adapters, low-quality bases, and short reads using Trim Galore! v0.4.2 (Phred quality score cutoff 5, minimum length 20 bp). Reads were aligned to the M. truncatula 4.0 genome using Bismark v0.16.3 [5] and Bowtie2 v2.2.9 with -N 1 (one seed mismatch allowed) for increased sensitivity. Duplicated sequences were removed with Picard MarkDuplicates v2.8.0, and alignments with MAPQ scores of 0 were removed with Samtools v1.3.1. Sequencing and alignment results are summarized in Data file 1 referenced in Table 1.
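The read-processing steps above can be sketched as a list of command lines. This is a minimal, non-authoritative sketch that only builds the commands and does not run them. It is not taken from the authors' scripts. The assumptions are: the seed-mismatch setting is read as Bismark's -N 1 option; the coordinate sort before duplicate marking and REMOVE_DUPLICATES=true are plausible choices not stated in the text; and the function name, file names, genome folder, and Picard jar path are all hypothetical.

```python
import shlex


def build_wgbs_commands(fastq, genome_dir, picard_jar="picard.jar"):
    """Return one argument list per processing step; nothing is executed."""
    # Hypothetical naming scheme: strip a FASTQ extension to get a sample stem.
    stem = fastq.rsplit("/", 1)[-1]
    for ext in (".fastq.gz", ".fq.gz", ".fastq", ".fq"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break

    trimmed = f"{stem}_trimmed.fq.gz"             # assumed Trim Galore output name
    aligned = f"{stem}_trimmed_bismark_bt2.bam"   # assumed Bismark output name
    sorted_bam = f"{stem}.sorted.bam"             # hypothetical
    dedup_bam = f"{stem}.dedup.bam"               # hypothetical
    final_bam = f"{stem}.dedup.mapq1.bam"         # hypothetical

    return [
        # Adapter/quality trimming: Phred cutoff 5, discard reads under 20 bp.
        ["trim_galore", "--quality", "5", "--length", "20", fastq],
        # Bisulfite-aware alignment with Bowtie2, one seed mismatch allowed.
        ["bismark", "--bowtie2", "-N", "1", genome_dir, trimmed],
        # Assumption: coordinate-sort so duplicates can be marked.
        ["samtools", "sort", "-o", sorted_bam, aligned],
        # Assumption: duplicates are dropped rather than only flagged.
        ["java", "-jar", picard_jar, "MarkDuplicates",
         f"I={sorted_bam}", f"O={dedup_bam}", f"M={stem}.dup_metrics.txt",
         "REMOVE_DUPLICATES=true"],
        # Keep alignments with MAPQ >= 1, i.e. drop MAPQ 0.
        ["samtools", "view", "-b", "-q", "1", "-o", final_bam, dedup_bam],
    ]


commands = build_wgbs_commands("A17.fastq.gz", "Mt4.0_genome")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each step as an argument list makes the stated parameters (quality cutoff, minimum length, seed mismatches, MAPQ filter) easy to inspect before anything is run.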

Table 1 Overview of data files/data sets

Methylation analysis

Methylation calling was performed using MethylDackel v0.2.1. Statistically significant differentially methylated cytosines (DMCs) were identified using the Bioconductor R package DSS v2.14.0 [6] (p.threshold 1e−5) (Data file 2 referenced in Table 1). Differentially methylated regions (DMRs) were also identified with DSS (pct.sig 0.5, minimum number of DMCs 10, 50 bp minimum length, and DMRs merged if within 100 bp) (Data file 3 referenced in Table 1). There were 307 DMRs in the CG context and 772 DMRs in the CHG context. The number of DMCs in the CG and CHG contexts, the number of DMRs identified by comparison of A17 and lss in the CHG and CG contexts, and the distributions of methylation across exons and 1-kb flanking sequences for A17 and lss are shown graphically in Data file 4 referenced in Table 1. Scripts used to analyze data and produce figures can be found at https://bitbucket.org/nfreese/medicagobseq. Bedtools v2.26.0 was used to associate each DMR with the overlapping or closest gene (against the M. truncatula MedtrA17_4.0 annotation) (Data file 3 referenced in Table 1) [7]. Methylation data were visualized using the Integrated Genome Browser in the area around the receptor protein kinase SUNN (Medtr4g070970) gene (Data file 5 referenced in Table 1) [8]. There was a single significant CG DMR within the first exon of Medtr4g070970. No DMRs were identified upstream or downstream of the gene in either the CG or CHG context. The CG DMR within Medtr4g070970 showed lower methylation in the A17 sample than in the lss sample.
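The step that associates each DMR with an overlapping or closest gene can be illustrated with a short sketch. This is a simplified stand-in for the idea, not Bedtools' implementation and not the authors' script. The interval names and coordinates are invented for illustration, and the Interval type and function names are hypothetical. Intervals are treated as 0-based, half-open (BED-style).

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def distance(a: Interval, b: Interval) -> Optional[int]:
    """Distance between two intervals; 0 if they overlap, None if on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    if a.start < b.end and b.start < a.end:
        return 0
    # Adjacent (book-ended) intervals get distance 1, so 0 is reserved for overlaps.
    if b.start >= a.end:
        return b.start - a.end + 1
    return a.start - b.end + 1


def closest_gene(dmr: Interval, genes: List[Interval]) -> Optional[Tuple[Interval, int]]:
    """Return (gene, distance) for the nearest gene on the same chromosome, or None."""
    best = None
    for gene in genes:
        d = distance(dmr, gene)
        if d is None:
            continue
        if best is None or d < best[1]:
            best = (gene, d)
    return best


# Invented example data (not real M. truncatula coordinates).
genes = [
    Interval("chr4", 100, 500, "geneA"),
    Interval("chr4", 900, 1500, "geneB"),
    Interval("chr1", 100, 200, "geneC"),
]
dmr_overlapping = Interval("chr4", 450, 520, "dmr1")
dmr_intergenic = Interval("chr4", 700, 760, "dmr2")
dmr_no_gene = Interval("chr2", 10, 80, "dmr3")

for dmr in (dmr_overlapping, dmr_intergenic, dmr_no_gene):
    print(dmr.name, closest_gene(dmr, genes))
```

A linear scan is enough for illustration; for genome-scale annotations, sorted inputs or an interval index would be the usual choice.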

Limitations

The data sets were generated without biological replicates (one plant per genotype), so any comparison of the A17 and lss data is limited by the small sample size.