Genome-wide repeat dynamics reflect phylogenetic distance in closely related allotetraploid Nicotiana (Solanaceae)
Nicotiana sect. Repandae is a group of four allotetraploid species originating from a single allopolyploidisation event approximately 5 million years ago. Previous phylogenetic analyses support the hypothesis of N. nudicaulis as sister to the other three species. This is concordant with changes in genome size, separating those with genome downsizing (N. nudicaulis) from those with genome upsizing (N. repanda, N. nesophila, N. stocktonii). However, a recent analysis reflecting genome dynamics of different transposable element families reconstructed greater similarity between N. nudicaulis and the Revillagigedo Island taxa (N. nesophila and N. stocktonii), thereby placing N. repanda as sister to the rest of the group. This could reflect a different phylogenetic hypothesis or the unique evolutionary history of these particular elements. Here we re-examine relationships in this group and investigate genome-wide patterns in repetitive DNA, utilising high-throughput sequencing and a genome skimming approach. Repetitive DNA clusters provide support for N. nudicaulis as sister to the rest of the section, with N. repanda sister to the two Revillagigedo Island species. Clade-specific patterns in the occurrence and abundance of particular repeats confirm the original (N. nudicaulis (N. repanda (N. nesophila + N. stocktonii))) hypothesis. Furthermore, overall repeat dynamics in the island species N. nesophila and N. stocktonii confirm their similarity to N. repanda and the distinctive patterns between these three species and N. nudicaulis. Together these results suggest that broad-scale repeat dynamics do in fact reflect evolutionary history and could be predicted based on phylogenetic distance.
Keywords: Chromoviruses; Graph-based clustering; High-throughput sequencing; Phylogenetics; Repetitive DNA; Ty-3 Gypsy
Materials and methods
Plant material, sequencing and data acquisition
Nicotiana nesophila accession 974750097 (from the Botanical and Experimental Garden, Radboud University Nijmegen, the Netherlands) and N. stocktonii accession TW126 (from the United States Department of Agriculture, North Carolina State University, NC) were used in this study. Genomic DNA was extracted from approximately 100 mg of fresh leaf tissue using a modified CTAB protocol as described in Wang et al. (2013). Libraries were prepared using the TruSeq DNA PCR-Free Library Preparation Kit (version B) and sequenced on a MiSeq at the Genome Centre, Queen Mary University of London, with V2 chemistry (300 cycles, 151-bp paired-end reads). New read data (for N. nesophila and N. stocktonii) have been deposited in the NCBI Sequence Read Archive (SRA) under accession SRP082467. Data for the other species, generated by Renny-Byfield et al. (2011, 2012, 2013), were downloaded from the SRA: the two remaining species of section Repandae (N. repanda [GenBank: SRR453021] and N. nudicaulis [GenBank: SRR452996]) and the extant relatives of the parental lineages of section Repandae (N. sylvestris [GenBank: SRR343066] and N. obtusifolia [GenBank: SRR452993]).
Graph-based clustering using RepeatExplorer
Sequence reads were screened using a threshold of quality score 20 over 95% of the read length. Newly generated reads for N. nesophila and N. stocktonii (151 bp) were then trimmed to 95 bp in order to match the read length of previous datasets (Renny-Byfield et al. 2013) that were downloaded from the SRA. Each set of reads was down-sampled to represent 1% of each genome (i.e. coverage of 0.01×) based on flow cytometry 1C values (www.data.kew.org/cvalues): 277,472 reads for N. sylvestris, 159,052 reads for N. obtusifolia, 366,000 for N. nudicaulis, 560,000 for N. repanda, 514,526 for N. stocktonii and 518,106 for N. nesophila. Samples were prefixed with a four-letter code unique to that species and combined to produce a dataset of 2,395,156 reads as input to RepeatExplorer for graph-based clustering (www.repeatexplorer.org; Novák et al. 2013) using the default settings of 90% similarity over 55% of the read length. For full details of the clustering process, see Novák et al. (2010, 2013). The N. tabacum plastid genome [GenBank: Z00044.2] was used as a custom database to identify plastid clusters for removal.
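The read preparation described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used: the function names are hypothetical, the quality filter assumes that "quality score 20 over 95% of the read length" means at least 95% of bases have a Phred score of 20 or more, and the genome sizes are back-calculated from the read counts given in the text rather than taken from the Kew C-values database.

```python
READ_LENGTH = 95      # bp, after trimming (from the text)
COVERAGE = 0.01       # 1% of each genome (from the text)

def passes_quality(phred_scores, min_q=20, min_percent=95):
    """Keep a read if at least min_percent of its bases have Phred >= min_q.
    Assumed interpretation of the threshold described in the text."""
    if not phred_scores:
        return False
    good = sum(1 for q in phred_scores if q >= min_q)
    # Integer comparison avoids floating-point edge cases at exactly 95%.
    return good * 100 >= min_percent * len(phred_scores)

def trim(seq, length=READ_LENGTH):
    """Truncate a read to a fixed length (151 bp -> 95 bp in the text)."""
    return seq[:length]

def reads_for_coverage(genome_size_bp, read_length=READ_LENGTH, coverage=COVERAGE):
    """Number of reads needed to sample `coverage` x of a genome."""
    return round(genome_size_bp * coverage / read_length)

# Read counts per species as reported in the text.
READ_COUNTS = {
    "N. sylvestris": 277_472,
    "N. obtusifolia": 159_052,
    "N. nudicaulis": 366_000,
    "N. repanda": 560_000,
    "N. stocktonii": 514_526,
    "N. nesophila": 518_106,
}

# Combined dataset size passed to clustering.
total_reads = sum(READ_COUNTS.values())

# Genome sizes implied by the read counts (reads * 95 bp / 0.01),
# computed with integers: 95 / 0.01 = 9500.
implied_genome_bp = {sp: n * READ_LENGTH * 100 for sp, n in READ_COUNTS.items()}
```

Because every species is sampled at the same fraction of its genome, the number of reads in a cluster is proportional to the absolute amount of that repeat in the genome, which is what makes cluster sizes directly comparable across species.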
A phylogenetic tree was reconstructed from the 1000 most abundant clusters (repeats) using continuous character parsimony in TNT, as described fully in Dodsworth et al. (2015a, b), with the exception that abundances were cube-root-transformed prior to inference. The transformation brings the cluster abundances into the range 0–65, which is required for input to TNT.
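As a small worked illustration of the transformation, the sketch below applies a cube root to a set of cluster abundances and checks that the results fall within 0–65. The read counts are hypothetical values chosen for illustration; the only figure taken from the text is the upper bound of 65, which implies that raw abundances of up to 65³ = 274,625 reads per cluster can be accommodated.

```python
TNT_MAX = 65  # upper bound of the continuous-character range stated in the text

def cube_root_transform(abundances):
    """Cube-root-transform non-negative cluster abundances (read counts)."""
    return [a ** (1.0 / 3.0) for a in abundances]

# Largest raw abundance that still maps into the 0-65 range.
max_raw_abundance = TNT_MAX ** 3

# Hypothetical read counts for one species across five clusters.
raw_counts = [274_625, 27_000, 1_000, 8, 0]
transformed = cube_root_transform(raw_counts)

# Small tolerance for floating-point rounding in the cube root.
in_range = all(0 <= t <= TNT_MAX + 1e-9 for t in transformed)
```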
Deviation from expectation in polyploids
In a neo-allotetraploid, we assume that inheritance is additive (i.e. the allotetraploid genome is the sum of the parental genomes). Thus, the expectation is that repeats in the allotetraploid genome are inherited faithfully, in the same abundance, from the parental taxa. For Nicotiana sect. Repandae, the closest extant relatives of the progenitor species are N. sylvestris (maternal) and N. obtusifolia (paternal). The expected cluster size in each of the section Repandae polyploids is therefore the sum of the cluster sizes in N. sylvestris and N. obtusifolia. The cumulative deviation from expectation was calculated and plotted on a log scale using R version 3.0.0 (R Core Team 2013), from smallest (left) to largest cluster size (right), in a similar manner to Renny-Byfield et al. (2013).
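The additive expectation and the cumulative deviation can be illustrated with a short Python sketch. This is not the R script used for the published figure: the function name and the read counts are hypothetical, clusters are assumed to be ordered by their expected size, and the log-scale plotting step is omitted.

```python
from itertools import accumulate

def cumulative_deviation(maternal, paternal, polyploid):
    """Cumulative (observed - expected) cluster size, smallest cluster first.

    Each argument is a list of read counts per cluster, in the same cluster
    order. The expected polyploid size is the sum of the two parental sizes.
    Ordering by expected size is an assumption made for this sketch.
    """
    expected = [m + p for m, p in zip(maternal, paternal)]
    order = sorted(range(len(expected)), key=lambda i: expected[i])
    deviations = [polyploid[i] - expected[i] for i in order]
    return list(accumulate(deviations))

# Hypothetical read counts for three clusters.
sylvestris = [100, 10, 40]    # maternal relative
obtusifolia = [50, 5, 0]      # paternal relative
polyploid = [120, 30, 40]     # observed in an allotetraploid

curve = cumulative_deviation(sylvestris, obtusifolia, polyploid)
```

In this toy case the smallest cluster is over-represented relative to expectation (+15), the middle cluster matches it, and the largest is under-represented (−30), so the curve ends below zero, the pattern expected for net loss of repeats.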
Genomic in situ hybridisation
For cytological investigations, actively growing root meristems of Nicotiana sect. Repandae species were pre-treated with a 0.002 M solution of 8-hydroxyquinoline for 2.5 h at room temperature and 2.5 h at 4 °C, fixed in ethanol-to-acetic acid (3:1) and stored at −20 °C until use. Genomic in situ hybridisation was performed using genomic DNAs of the previously identified closest extant relatives of the parental diploid species (N. sylvestris and N. obtusifolia; both 2n = 24) as probes (Clarkson et al. 2005; Lim et al. 2007; Renny-Byfield et al. 2013). Genomic DNAs of the diploid species were isolated using the CTAB method (Jang and Weiss-Schneeweiss 2015), sheared at 98 °C for 5 min and labelled using either a digoxigenin or a biotin nick translation kit (Roche). Hybridisation and detection of the probes followed the method of Jang and Weiss-Schneeweiss (2015). Preparations were analysed with an AxioImager M2 epifluorescent microscope (Carl Zeiss, Vienna, Austria); images were acquired with a CCD camera and processed using AxioVision version 4.8 (Carl Zeiss, Vienna, Austria) with only those functions that apply equally to the whole image. At least 30 well-spread metaphases and prometaphases were analysed for each individual. Cut-out karyotypes are given in Online Resource 1.
Results and discussion
Phylogenetic relationships in section Repandae
Patterns of overall repeat divergence reflect evolutionary history
Phylogenetic analysis of the abundance of different repeats in the genomes confirms relationships in Nicotiana sect. Repandae as (N. nudicaulis (N. repanda (N. stocktonii + N. nesophila))). This result agrees with much previous work using standard phylogenetic markers, summarised in Renny-Byfield et al. (2013). Additionally, the patterns of repeat dynamics in section Repandae species clearly reflect the phylogenetic distance of these species from one another. This is an extension of the phylogenetic signal in the abundance of repeat types (e.g. Piednoël et al. 2012, 2013; Novák et al. 2014; Macas et al. 2015) and shows that the overall patterns of element accumulation/deletion/stasis are broadly similar in closely related taxa. For example, the profiles for N. nesophila and N. stocktonii, the most recently diverged taxa, are almost identical, and together they most closely resemble N. repanda, to which they are sister (Fig. 4). These three taxa are very divergent from N. nudicaulis in repeat profile as well as genome size: N. nudicaulis, which is sister to the other three species, has a smaller and less restructured genome. Previous SSAP patterns (Parisod et al. 2012) therefore reflected the novel activity of a very limited number of retroelement families, the dynamics of which are not directly indicative of shared evolutionary history.
We thank Petr Novak for an R script used to produce Fig. 4a, Simon Renny-Byfield for previous Nicotiana sequence data, and Sandra Knapp, Elizabeth McCarthy and Yoong Lim for photographs. We also thank Ales Kovarik and two anonymous reviewers for constructive comments on a previous version of the manuscript.
This work was supported by a NERC studentship to SD and a grant from the Botanical Research Fund to SD.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
The authors comply with all rules of the journal following the COPE guidelines; all authors have contributed to and approved the final manuscript.
- R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
- Renny-Byfield S, Chester M, Kovarik A et al (2011) Next generation sequencing reveals genome downsizing in allotetraploid Nicotiana tabacum, predominantly through the elimination of paternally derived repetitive DNAs. Molec Biol Evol 28:2843–2854. doi:10.1093/molbev/msr112
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.