MuTAnT: a family of Mutator-like transposable elements targeting TA microsatellites in Medicago truncatula

Transposable elements (TEs) are mobile DNA segments that are abundant and dynamic in plant genomes. Because their mobility can be deleterious to the host, a variety of mechanisms have evolved to limit that negative impact, one of them being a preference for a specific target insertion site. Here, we describe a family of Mutator-like DNA transposons in Medicago truncatula targeting TA microsatellites. We identified 218 copies of MuTAnTs and an element carrying a complete ORF encoding a mudrA-like transposase. Most insertion sites are flanked by a variable number of TA tandem repeats, indicating that MuTAnTs specifically target TA microsatellites. Other TE families flanked by TA repeats (e.g. TAFT elements in maize) were described previously; however, we identified the first putative autonomous element sharing that characteristic with a related group of short non-autonomous transposons. Electronic supplementary material: the online version of this article (doi:10.1007/s10709-015-9842-5) contains supplementary material, which is available to authorized users.


Introduction
Transposable elements (TEs) are mobile DNA segments present in most organisms. In higher plants, their content varies from 10 % in Arabidopsis (Arabidopsis Genome Initiative 2000) to more than 80 % in maize (Schnable et al. 2009). With respect to the transposition mechanism, TEs are divided into two classes: class I elements (retrotransposons) transpose via an RNA intermediate, while class II elements (DNA transposons) change their location by a cut-and-paste mechanism characteristic of TEs carrying terminal inverted repeats (TIRs) or by a rolling-circle mechanism typical of Helitrons (Wicker et al. 2007). A family of DNA transposons usually consists of one or a few autonomous elements capable of inducing their own transposition and a larger number of copies with internal deletions and rearrangements. These copies, referred to as non-autonomous, have lost the ability to transpose independently but can be mobilized by a related autonomous element (Wessler 2006).
The canonical Mutator element was discovered in maize stocks showing a high forward mutation rate (Robertson 1978). Since then, many Mutator-like elements (MULEs) have been identified in plants (Holligan et al. 2006), fungi (Chalvet et al. 2003), protozoans (Pritham et al. 2005; Lopes et al. 2009), and metazoans (Marquez and Pritham 2010). Autonomous MuDR-like elements carry two open reading frames, mudrA and mudrB; the former codes for a transposase, while the function of the latter is not well defined. There is also a group of autonomous Mutator-like elements, e.g. Jittery, carrying only a mudrA-like ORF (Xu et al. 2004).
Tandemly repeated motifs of 2-6 nt are commonly referred to as microsatellites. Microsatellites vary in length, structure, frequency of individual motifs and genomic distribution (Schulman et al. 2005). In plants, (TA)n repeats are more abundant than other dinucleotide motifs (Wang et al. 1994). Microsatellite regions are considered hypervariable, as the number of tandem repeats can change following DNA polymerase slippage during DNA replication. In plants, tandem repeats were shown to be preferentially associated with gene-rich regions (Morgante et al. 2002). In Medicago truncatula, microsatellites were found near genes, in 5′ and 3′ untranslated regions (UTRs) and in introns (Mun et al. 2006).
Here, we report on MuTAnTs, a novel family of MULEs present in M. truncatula and targeting (TA)n microsatellite repeats. We identified 218 copies of MuTAnTs and characterized a putative autonomous element carrying a complete ORF encoding a mudrA-like transposase.

Plant material
Molecular analyses were performed on the reference line A17 'Jemalong', 2HA (an A17 derivative) and on 21 wild accessions of M. truncatula provided by INRA, Montpellier, France. Apart from M. truncatula, eight other Fabaceae species, i.e. Lupinus angustifolius L., L. luteus L., Pisum sativum L., Phaseolus vulgaris L., Trifolium pratense L., T. repens L., and Vicia faba L. were included in the analyses (Additional file 1). Each accession was represented by a single plant. Seeds were germinated according to the Medicago Handbook (Garcia et al. 2006) and plants were grown in pots in the greenhouse. Genomic DNA was isolated from fresh tissue collected from ca. 8-week-old plants with the Plant DNeasy Mini Kit (Qiagen) following the manufacturer's protocol.

Mining for MuTAnT copies in M. truncatula
The family of non-autonomous elements flanked by TA repeats was identified following manual inspection of TE sequences reported by REPET (Flutre et al. 2011) for M. truncatula genome version 3.5.2 (Young et al. 2011) downloaded from medicago.org. Individual copies of MuTAnTs were mined with TARGET (Han et al. 2009) at www.iplantcollaborative.org using a MuTAnT sequence reported by REPET as a query. The related autonomous element was identified with TIRfinder (Gambin et al. 2013) using the following parameters: tirMask: GGGGTTTGCTAGAACA, tsdMask: N, tirSeqMismatches: 1, tsdSeqMismatches: 1, tirMaskMismatches: 3, tsdMaskMismatches: 0, and the amino acid sequence of the maize mudrA transposase (GenBank acc. no. AAA21566) as a query with a tblastn threshold of 1e-2. MuTAnT structure was analysed with mfold (Zuker 2003) and Dotlet (Junier and Pagni 2000). Sequence logos of TIRs were obtained with WebLogo (Crooks et al. 2004).
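The idea of a TIR mask search can be illustrated with a minimal Python sketch. It is not TIRfinder's implementation and does not reproduce its parameter semantics: it only looks for a window matching the 16-nt mask quoted above (written here without the internal space, which is assumed to be an extraction artifact) and, downstream, a window matching its reverse complement, each within a chosen number of mismatches. The function names, the `max_len` cutoff and the toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def mismatches(a, b):
    """Count differing positions between two equal-length strings; N matches anything."""
    return sum(1 for x, y in zip(a, b) if x != y and "N" not in (x, y))


def find_tir_pairs(genome, mask, max_mask_mismatches=3, max_len=5000):
    """Return (start, end) spans that begin with the mask and end with its
    reverse complement, each within max_mask_mismatches (hypothetical helper)."""
    genome = genome.upper()
    mask = mask.upper()
    k = len(mask)
    rc_mask = reverse_complement(mask)
    windows = range(len(genome) - k + 1)
    # Candidate 5' termini: windows resembling the mask itself.
    left = [i for i in windows if mismatches(genome[i:i + k], mask) <= max_mask_mismatches]
    # Candidate 3' termini: windows resembling the reverse complement of the mask.
    right = [i for i in windows if mismatches(genome[i:i + k], rc_mask) <= max_mask_mismatches]
    # Pair non-overlapping termini that lie within the allowed element length.
    return [(l, r + k) for l in left for r in right if l + k <= r and r + k - l <= max_len]


# Toy example (made-up sequence): TA flanks, two inverted termini, short internal region.
MASK = "GGGGTTTGCTAGAACA"
toy = "TATATA" + MASK + "ACGTACGTAC" + reverse_complement(MASK) + "TATATA"
print(find_tir_pairs(toy, MASK))
```

The brute-force scan is quadratic in the number of candidate termini and is only meant to make the logic explicit; a genome-scale search would need indexing.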

MuTAnT diversity and evolutionary dynamics
Sequences were processed with BioEdit (Hall 1999). Phylogenetic analyses, including calculation of pairwise distances under the Tajima-Nei model, were performed with MEGA 5.2 (Tamura et al. 2011); frequencies were calculated with MS Excel. The possibility of past transposition events was demonstrated through the identification of sequences related to empty sites (RESites), which are paralogous sequences lacking the TE insertion. The basic strategy involved comparing an occupied locus with the related empty sequence, which reveals the TSD and the gap corresponding to the TE insertion (Le et al. 2000).
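The comparison of an occupied locus with a related empty site can be sketched in Python as follows. This is a simplified illustration, not the procedure of Le et al. (2000): it assumes that the two sequences cover the same interval and that their flanks are identical, so the shared prefix and the shared suffix can be found by direct character comparison rather than alignment. The function name and the toy sequences are hypothetical.

```python
def compare_to_empty_site(occupied, empty):
    """Infer the TSD and the inserted element from an occupied locus and its
    empty counterpart, assuming identical flanks (hypothetical helper)."""
    occupied, empty = occupied.upper(), empty.upper()
    extra = len(occupied) - len(empty)  # element length plus one TSD copy
    if extra <= 0:
        return None  # nothing is inserted relative to the empty site

    # Longest shared prefix: left flank plus the first TSD copy.
    p = 0
    while p < len(empty) and occupied[p] == empty[p]:
        p += 1
    # Longest shared suffix: second TSD copy plus the right flank.
    s = 0
    while s < len(empty) and occupied[-1 - s] == empty[-1 - s]:
        s += 1

    # The part of the empty site covered by both prefix and suffix is the TSD.
    tsd_len = p + s - len(empty)
    if tsd_len < 0:
        return None  # flanks do not match; a real alignment would be needed
    tsd_len = min(tsd_len, extra)

    return {
        "tsd": empty[p - tsd_len:p],
        "element": occupied[p:p + extra - tsd_len],
    }


# Toy example with made-up sequences and a 9-nt duplication.
left, tsd, right = "GATTACAGG", "ACGTTGCAA", "CCTGAGTCA"
element = "GGGGTTTGCTCCCTTTAGCAAACCCC"
result = compare_to_empty_site(left + tsd + element + tsd + right, left + tsd + right)
print(result["tsd"], len(result["element"]))
```

When the flanks are themselves repetitive, as in a (TA)n microsatellite, the shared prefix and suffix extend by chance and the inferred duplication is only an upper bound on the true TSD length.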

PCR assay
The PCR assay was used to investigate the distribution of AutoMuTAnT copies within Fabaceae, as well as to reveal the genomic distribution of MuTAnTs among M. truncatula ecotypes. Primers were designed using Primer3 (Koressaar and Remm 2007; Untergasser et al. 2012). PCR reactions were set up as follows: ca. 10 ng of genomic DNA, 0.5 mM dNTP, 0.4 µM of forward primer, 0.4 µM of reverse primer, 5 % of DMSO, 1× buffer for AccuTaq LA DNA Polymerase, 0.2 U of JumpStart AccuTaq LA DNA Polymerase Mix (Sigma Aldrich) in the total volume of 20 µl (Additional file 1). The following PCR conditions were applied: 96°C/30 s, 30 × (94°C/15 s, 62°C/30 s, 68°C/2 min), and 68°C/30 min. Amplified fragments were separated in 1.5 % agarose gel in 1× TBE buffer and detected by ethidium bromide staining. Target bands were extracted from the gel with Wizard SV Gel and PCR Clean-Up System (Promega) as described by the manufacturer. Purified fragments were ligated into pGEM-T vector and cloned into E. coli strain DH10B according to the standard cloning procedure provided by Promega. Positive clones verified by PCR assay were Sanger-sequenced at Genomed SA, Warsaw, Poland.

Identification and characterization of MuTAnTs
Upon visual inspection of REPET output for M. truncatula, we identified a 292 bp-long element flanked by TA repeats. It comprised almost identical 144 bp-long TIRs forming a foldback sequence, which showed a strong propensity to form a hairpin-like secondary structure (Fig. 1). We identified 218 related elements ranging from ca. 200 bp to over 1.6 kb and grouped them into a family named MuTAnT (Mutator-like (TA)n Targeting). A significant fraction (90 %) of these elements ranged in size from 200 to 300 bp, and the mean TIR length was 47 bp (Additional file 2). A related 4873 bp-long putative autonomous element, dubbed AutoMuTAnT (position 26,442,402-26,447,275 on chromosome 1), carried a single ORF composed of five exons, predicted to encode an 833 aa protein (Fig. 2) similar to Jittery transposase (e-value: 6e-79, GenBank acc. no. AAF66982, Xu et al. 2004). The insertion was flanked by long TA stretches comprising 59 and 27 perfect repeats on the 5′ and 3′ ends, respectively. The sequence of AutoMuTAnT and majority rule consensus sequences of MuTAnT1 and MuTAnT2 subfamilies are provided in Additional file 3.
AutoMuTAnTs were present only within the Trifolieae tribe of the Fabaceae family, including white clover and all Medicago spp. but M. laciniata (Fig. 2a). Two wild ecotypes, L310 and L530, possibly carried other, likely truncated, copies of AutoMuTAnT, not present in the reference genome of A17 cv. 'Jemalong' (Fig. 2b).

Insertion site preference
MuTAnTs were evenly distributed across the chromosomes of M. truncatula, with one insertion per 1.44 Mb on average. A substantial fraction of insertions was present in gene-rich regions: 36 of the 218 identified copies occurred in introns or in 5′ and 3′ UTRs, while an additional 65 insertions were located less than 1 kb away from genes. All 218 MuTAnT insertions occurred in AT-rich regions, predominantly inside (TA)n microsatellites varying in length and reaching up to 41 repeats. An average microsatellite flanking a MuTAnT insertion in the A17 reference genome consisted of 11.7 (±9.5 SD) TA repeats. Only 14 of the 218 copies were not inserted into perfect (TA)n microsatellites. For these, the surrounding sequences indicated the presence of 9 nt-long target site duplications (Fig. 6). Additional insertion sites PCR-amplified from M. truncatula ecotypes carried on average 15 TA repeats on each TE flank. In contrast, sequencing of a subset of corresponding empty insertion sites and RESite analysis indicated that empty target sites consisted of nine TA repeats on average (Table 1).
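Counting the perfect TA repeats that directly abut an insertion, as summarized above, can be sketched in Python. This is an illustrative assumption about how such counts might be obtained, not the procedure used in the study; the function name and the flank sequences are made up, and the repeat is accepted in either phase (TA or AT) at the element boundary.

```python
import re
from statistics import mean, stdev


def flanking_ta_repeats(flank, side):
    """Count perfect TA (or AT-phased) dinucleotide units abutting the element.

    side="left":  flank lies upstream of the TE, repeats are counted at its 3' end.
    side="right": flank lies downstream of the TE, repeats are counted at its 5' end.
    """
    flank = flank.upper()
    unit = r"(?:(?:TA)+|(?:AT)+)"
    if side == "left":
        match = re.search(unit + r"$", flank)
    elif side == "right":
        match = re.match(unit, flank)
    else:
        raise ValueError("side must be 'left' or 'right'")
    return len(match.group(0)) // 2 if match else 0


# Made-up flank pairs (upstream, downstream) for three hypothetical insertions.
flanks = [
    ("GGCTATATATA", "TATATAGCC"),
    ("ACGTATATATATATA", "TATAGG"),
    ("TTGCATTATA", "TATATATATATATACG"),
]
counts = [
    flanking_ta_repeats(up, "left") + flanking_ta_repeats(down, "right")
    for up, down in flanks
]
print(counts, round(mean(counts), 1), round(stdev(counts), 1))
```

Imperfect microsatellites, for example those interrupted by a single substitution, would be truncated by this exact-match rule, so the counts are conservative.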

Discussion
We identified and characterized a novel family of MULEs named MuTAnT showing a strong preference for insertion into (TA)n microsatellites. We found MuTAnTs to be composed of long TIRs built up from modules forming a foldback structure, characteristic of previously reported families, such as Jittery in maize, flanked by 181 bp-long TIRs (Xu et al. 2004), or FARE1 in Arabidopsis, a group of Foldback elements carrying long palindromic repeats on both ends (Windsor and Waddell 2000), currently classified as MULEs (Feschotte and Pritham 2007). The assignment of MuTAnTs to MULEs is further supported by the similarity between the mudrA-like protein sequence of AutoMuTAnT and the Jittery transposase. The presence of AutoMuTAnT in white clover and in all but one of the analyzed Medicago species indicates that MuTAnTs are likely to predate the origin of the Medicago genus, as they were present in the most recent common ancestor of Trifolium and Medicago, which lived at least 16 million years ago (Lavin et al. 2005).
To reveal the evolutionary history of the MuTAnT family, we calculated frequencies of pairwise distances between elements, as proposed previously for DINE-1 elements in Drosophila and ATons in yellow fever mosquito (Yang et al. 2012). The distribution indicated two bursts of transpositional activity giving rise to two subfamilies. We also analyzed the distribution of MuTAnT copies among wild ecotypes of M. truncatula by a PCR assay. Insertion polymorphisms of ten sites and the unique presence of MuTAnT copies in three of those sites in A17 are indicative of recent transposition events similar to those reported previously for other plant species and TE families (Naito et al. 2006; Benjak et al. 2009; Grzebelus et al. 2009, 2011). Precise determination of TSDs for most copies was impeded by the repetitive nature of the flanking sequences. In addition, a high proportion of defective copies was revealed, as 25 % of all identified MuTAnTs lacked a significant portion of one or both TIRs. Even when both TIRs were present, they varied in terms of deletions of one or more of the four distal nucleotides of the TIRs (GGGG/CCCC). Nevertheless, we showed that MuTAnTs generated 9 nt-long TSDs, which is typical for MULEs.
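A frequency distribution of pairwise distances can be illustrated with a short Python sketch. Note the substitution: the study computed Tajima-Nei distances in MEGA, whereas this sketch uses the simpler uncorrected p-distance as a stand-in, so absolute values would differ. The function names, the bin width and the toy alignment are hypothetical; the sequences are assumed to be already aligned to equal length, with "-" marking gaps.

```python
import math
from collections import Counter
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring positions gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_frequencies(aligned, bin_width=0.05):
    """Map the lower edge of each distance bin to the fraction of pairs in it."""
    dists = [p_distance(a, b) for a, b in combinations(aligned, 2)]
    # A small epsilon guards against floating-point error at bin edges.
    bins = Counter(math.floor(d / bin_width + 1e-9) for d in dists)
    return {round(b * bin_width, 10): n / len(dists) for b, n in sorted(bins.items())}


# Toy alignment (made-up sequences): two near-identical copies and one divergent copy.
s1, s2, s3 = "ACGTACGT", "ACGTACGA", "TGCAACGT"
print(distance_frequencies([s1, s2, s3], bin_width=0.25))
```

In such a histogram, each burst of transposition is expected to appear as a separate peak, because copies derived from the same burst have accumulated a similar number of substitutions relative to one another.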
MuTAnT insertion sites are frequently located in proximity to genes, mostly less than 3 kb downstream or upstream from the adjacent coding region, with the number of insertions decreasing with the distance from genes. A similar tendency for insertion into gene-rich regions was reported early with the discovery of MITEs Wessler 1992, 1994) and was supported by subsequent studies (Yang et al. 2001;Sampath et al. 2013). As MuTAnTs are short, non-autonomous and relatively numerous in M. truncatula, they resemble MITEs both in terms of their structure and mode of operation. Notably, all MuTAnT insertions present within transcribed regions were located in UTRs or introns.
A hallmark of MuTAnT activity is their propensity to insert into (TA)n microsatellites. This affinity provides a direct barrier against insertions into coding regions. The apparent preference of MuTAnTs for insertion into TA repeats is possibly their survival strategy. Weak selection pressure imposed on microsatellite sites may favor TE families adapted to target microsatellites and use them as 'safe havens.' On the other hand, insertions proximal to coding regions can still introduce more subtle regulatory changes in the expression of adjacent genes. It is interesting to compare MuTAnTs to AhMITEs, a family of short MULEs inserting into AT-rich but non-microsatellite regions of the peanut genome (Shirasawa et al. 2012). Notably, we observed a similar behavior also in a minor group of MuTAnTs. Thus, it is possible that both families represent successive stages of the evolutionary process of exploiting microsatellites as target sites, with AhMITEs being a transitional form. TAFT and Micron elements identified in maize and rice, respectively, also show a preference for insertion into (TA)n microsatellites, which suggests that the strategy may be more widespread, as it evolved independently in several unrelated families of DNA transposons.
Education fund for statutory activity of the University of Agriculture in Krakow.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.