New Insights on the Evolution of Genome Content: Population Dynamics of Transposable Elements in Flies and Humans
Understanding the abundance, diversity, and distribution of TEs in genomes is crucial to understand genome structure, function, and evolution. Advances in whole-genome sequencing techniques, as well as in bioinformatics tools, have increased our ability to detect and analyze the transposable element content in genomes. In addition to reference genomes, we now have access to population datasets in which multiple individuals within a species are sequenced. In this chapter, we highlight the recent advances in the study of TE population dynamics focusing on fruit flies and humans, which represent two extremes in terms of TE abundance, diversity, and activity. We review the most recent methodological approaches applied to the study of TE dynamics as well as the new knowledge on host factors involved in the regulation of TE activity. In addition to transposition rates, we also focus on TE deletion rates and on the selective forces that affect the dynamics of TEs in genomes.
Key words: Long-read sequencing, Transposition rates, Self-regulation, Effective population size, Adaptation, Horizontal transfer
1 Transposable Elements Are Abundant and Active Genome Denizens
Transposable elements (TEs) are short DNA sequences, typically from a few hundred bp to ~10 kb long, which have the ability to move around in the genome by generating new copies of themselves. In addition to active autonomous elements, genomes also contain nonautonomous elements that can be mobilized by the enzymatic machinery of active TEs from the same family. Additionally, genomes contain TEs that can no longer be mobilized because of mutations accumulated in their sequences. TEs are an ancient, extremely diverse, and exceptionally active component of genomes. TEs have been found in virtually all organisms studied so far including bacteria, archaea, fungi, protists, plants, and animals [2, 3, 4, 5]. The main TE groups, class I and class II, are present in all kingdoms, revealing their persistence over evolutionary time. These two classes of TEs differ in their transposition intermediates: while class I TEs transpose through RNA intermediates, class II TEs transpose directly as DNA. TEs within each class are further classified into (1) different orders, based on their insertion mechanism, structure, and encoded proteins; (2) different superfamilies, based on their replication strategy and on presence and size of target site duplications; and (3) different families, based on sequence conservation [2, 3]. Piegu et al. criticized the current classification system, which accounts for sequence homology, structural features, and target site duplications, because it does not always take into account the evolutionary origins of the TEs [1, 2, 3]. As a consequence, phylogenetically unrelated classes or subclasses of TEs are grouped together. Piegu et al. also suggested that a more inclusive classification that includes prokaryotic and eukaryotic TE classes should be considered.
Recently, Arkhipova proposed a TE classification system based on the replicative, integrative, and structural components of TEs, which integrates different aspects of all the existing classification systems.
Overall, recent advances in sequencing technologies and in TE detection methods showed that, as expected, the TE content is higher than previously estimated. These new data also provided further evidence for the impact of TEs in genome function and genome structure. Thus, it is still indisputable that a thorough understanding of TE population dynamics is essential for the understanding of the eukaryotic genome structure, function, and evolution.
2 Drosophila and Humans: Two Extremes in TE Diversity and Population Dynamics
Much of the detailed information on TE evolution still comes from two species with the best-studied genomes: fruit flies (D. melanogaster) and humans. Fortunately, these two genomes represent two extremes in terms of TE diversity and population dynamics and thus give a reasonably diverse picture of the TE evolution and dynamics. For the rest of this chapter, we focus primarily on these two genomes and will highlight the similarities and differences observed between them.
As mentioned above, the human reference genome has millions of TE copies, with 66–69% of the genome derived mostly from TE sequences. Two human retrotransposable element (class I) families, LINE1 (L1, long interspersed nuclear element 1) and Alu, account for 60% of all interspersed repeat sequences. The vast majority of the TEs in the human genome are fixed, and most families are inactive. However, some elements of the main family of human endogenous retroviruses (HERV-K) and some LINE1 elements show autonomous transposition. Meanwhile, Alu elements and the hybrid SVA elements, which are formed by SINEs (short interspersed nuclear elements), VNTRs (variable number tandem repeats), and Alus, show nonautonomous activity [64, 65, 66].
Table 1. Summary of recent TE population dynamic studies

Study aim: Overview of new discoveries about TEs in 75 basidiomycete fungi genomes
Main results: TE content varies among species displaying different lifestyles from 0.1% to 45.2%. The correlation between TE content and genome size is not strong. TEs seem essential for chromosomal architecture. A large battery of mechanisms to avoid transposition is present
Relevance for TE dynamics: The result of most TE activity is likely neutral as they often insert in intergenic regions. However, TEs play an important role in the evolution of plant pathogens and probably in symbiotic species

Study aim: Characterization of TE content in the only selfing hermaphroditic vertebrate: the mangrove killifish Kryptolebias marmoratus
Main results: TE content is 27%. There is a great diversity of families with a pronounced abundance of Helitrons compared to its closest phylogenetic relatives. TE sequence divergence is also higher in K. marmoratus compared to close species
Relevance for TE dynamics: Against expectations, the number and composition of TEs in these selfing organisms is comparable to that of many other fish with outcrossing mating systems. The high Helitron content is one of the factors that could explain the high genetic diversity observed in this selfing killifish

Study aim: Testing whether genome size equilibrium observed in 10 mammal and 24 bird species is due to covariation between DNA gain by transposition and DNA loss by deletion
Main results: DNA gain varies by more than sixfold across mammals and 30-fold across birds. DNA loss varies by twofold in mammals and threefold in birds. Neither DNA gain nor loss can solely explain variation in genome size. DNA loss exceeded gain in all but two lineages. Midsize deletions (31 bp to 10 kb) play a larger role than microdeletions (1–30 bp) in DNA loss
Relevance for TE dynamics: Genome size equilibrium is maintained through DNA loss counteracting DNA gains through TE expansions. DNA loss has probably been driven by large deletions (>10 kb). Genome expansion via transposition could promote genome contraction through TE-mediated deletions

Study aim: Understanding the differences in abundance and diversity of L1 elements across vertebrates
Main results: Vertebrate L1s differ in the length of the 5′ UTR, 3′ UTR, and intergenic regions. They also differ in base composition with mammals and lizards showing a stronger A bias on the positive strand than frog and fish
Relevance for TE dynamics: Mammals show very little 5′ UTR homology due to the frequent acquisition of novel nonhomologous 5′ UTRs during evolution. This seems not to occur in other groups of vertebrates since the relative conservation of the 5′ UTR and ORF1 suggests that the hosts do not repress transposition in a sequence-specific way

Study aim: Understanding the role of TEs in D. melanogaster genome evolution, by estimating their insertion and deletion rates
Main results: 24 TE superfamilies are active in mutation accumulation lines. TE activity is background dependent. There is an association between activity of some TE families and chromatin state, as well as a weak correlation between insertion activity and GC content, and a negative correlation between deletion activity and exon content
Relevance for TE dynamics: Insertion rate is higher than deletion rate, which helps explain the relative stability of TE numbers and genome size in Drosophila in the face of previously reported deletion bias. Heterochromatin may play a bigger role than recombination in shaping TE accumulation

Study aim: Characterization and description of TEs in the coffee berry borer Hypothenemus hampei genome
Main results: 8.3% of the genome consists of TEs (880 TE sequences): 49.24% of the TEs are MITEs. Several new families are described: Hypo, belonging to the Gypsy superfamily; Hamp, a new non-LTR family; and rosa, a new DNA TE family
Relevance for TE dynamics: Low TE content, compared with other insects, could be related to the reproductive characteristics and the population size of this species. Males have a chromosome set not transmitted to the next generation, as in asexual populations. The colonization of America probably produced a founder effect

Study aim: To develop a comprehensive assessment of transposition activity at the A. thaliana species level
Main results: The analysis includes 211 samples collected all over the world. 165 of the 326 families annotated in A. thaliana showed recent transposition activity at the species level. TE composition and activity are strongly affected both by environmental and genetic factors
Relevance for TE dynamics: TEs have pervasive effects on the expression and methylation status of nearby genes, which are likely deleterious and could help explain why bursts of transposition were not detected. The self-fertilizing mating system of A. thaliana should also lead to accelerated elimination of deleterious TE insertions. TEs are also involved in the generation of large-effect alleles at adaptive trait loci

Study aim: Characterization of TE presence/absence in 216 A. thaliana accessions with respect to the reference genome
Main results: TE deletions were biased toward pericentromeric regions, while TE insertions had a more uniform distribution over chromosomes. TE variants are associated with changes in nearby gene expression and in local and distal methylation patterns
Relevance for TE dynamics: TEs are a significant source of genetic variation. Most TEs are present at low frequencies. TEs likely play a role in facilitating epigenomic and transcriptional differences between A. thaliana accessions

Study aim: To understand the role of TEs in genome evolution of the sweet potato Ipomoea batatas
Main results: 1405 TEs described based on transcriptomic data. 417 TEs are expressed in one or more tissues and 107 in the seven tissues analyzed
Relevance for TE dynamics: TE activity is tissue- and background-specific. Although several TEs are expressed in all the tissues and strains analyzed, some of them are active only in one specific strain and/or tissue. Authors suggest that TEs may play a role in environmental adaptation
3 Methodology Used to Study TE Population Dynamics
Table 2. Summary of recent mathematical models and computer simulations applied to the study of TE dynamics

Model: The model quantifies the transposition activity over time based on the distribution of transposition events in the phylogenetic tree and the tree topology
TEs analyzed: Fot subfamilies from Fusarium oxysporum
Main results: The four subfamilies analyzed are still active with two of them showing clear changes in their transposition dynamics. The results obtained showed that regulation of transposition by the number of copies is not strong enough to maintain stable transposition-deletion equilibrium

Model: Considering the genome as an ecosystem, the model analyzes the interaction between nonautonomous and autonomous TEs as a predator-prey relationship in individual cells
TEs analyzed: L1 and Alus from Homo sapiens
Main results: The model predicts oscillations in the number of TEs on a time scale much longer than the cell replication time. Thus, the genome stores the predator-prey state during successive generations

Model: The model, based on the Fisher geometric model, analyzes TE dynamics under changing environments in clonal organisms
TEs analyzed: Autonomous and nonautonomous TEs in an asexual population
Main results: The model predicts that when nonautonomous TE copies are present, the transposition activity is lost and thus the stability of the host-TE system is compromised. Changes in the environment may induce bursts of transposition activity associated with faster adaptation. However, it is unlikely that the transposition activity is maintained in the long term

Model: The model, based on the Fisher geometrical model, analyzes TE dynamics in sexual diploid organisms under environmental changes
TEs analyzed: TEs in sexual diploid populations
Main results: The model suggests that the presence of inactive copies of TEs is necessary for the transposition-selection equilibrium of active copies and that the mutagenic role of TEs is crucial when host populations face rapid environmental changes

Model: The model, based on the selfish DNA theory, analyzes the invasion dynamics of active TEs during the first stages of an experimental evolution experiment
TEs analyzed: Mos1 and peach, mariner family from Drosophila melanogaster
Main results: The model predicts lower invasion frequencies than the ones observed experimentally. A substantial rate of replicative transposition during the initial invasion of the element was inferred from the discrepancy between observed and theoretical copy numbers

Model: The model analyzes the impact of intermediate selfing rates on TE dynamics and the influence of the mating system on the evolutionary properties of TEs
TEs analyzed: Active TEs in a diploid hermaphrodite population
Main results: The model predicts that the efficiency of TEs as genomic parasites decreases with the selfing rate, although rare TE invasions can still occur even in populations with 90% selfers. The model predicts TE extinction if populations change from sexual to asexual reproduction, although empirical data do not strongly support this result

Model: The model studies the evolutionary behavior of TE copy number and the molecular evolution of their DNA sequences
TEs analyzed: TEs in sexual diploid populations
Main results: The model predicts that weak selection allows high copy numbers of TEs, most of them inactive copies, while strong selection reduces the number of TEs but increases the proportion of active copies. Regarding TE sequences, the model shows that the phylogeny of these sequences allows distinguishing active copies from inactive and less active copies

Model: The model analyzes the propagation of LTR TEs by taking into account the TE position in the chromosome, the degradation level of the TEs, and the duplication rate that varies with the degradation level
TEs analyzed: roo, Gypsy, and DM412, LTR TEs from Drosophila melanogaster
Main results: The simulation estimates several parameters affecting the propagation of TEs and identifies the initial copy from which three LTR families have spread in the euchromatic part of the 3L chromosome
Traditionally, mathematical models considered the relationship between the host and a homogeneous group of active TEs. However, the TE content of any genome is a mix of autonomous and nonautonomous insertions. Xue and Goldenfeld proposed a mathematical model that treats the relationship between nonautonomous and autonomous TEs as a predator-prey dynamic. Unlike previous models that also use the analogy to ecological models, the Xue and Goldenfeld model takes into account the molecular-level interactions between transposable elements and the small copy number of the active transposons. The model predicts oscillations in the number of TEs on a time scale much longer than the cell replication time, suggesting that the genome stores the predator-prey state during successive generations.
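To give a feel for why a predator-prey formulation produces oscillating copy numbers, the following minimal sketch integrates the classical deterministic Lotka-Volterra equations. It is not the Xue and Goldenfeld model, which is stochastic and built on molecular-level interactions. The mapping used here (autonomous copies as "prey", nonautonomous copies as "predators" that depend on the prey's machinery), the function name, and all parameter values are assumptions chosen only for illustration.

```python
def simulate_predator_prey(prey0, pred0, growth, predation,
                           conversion, decay, dt, steps):
    """Integrate the classical Lotka-Volterra equations with fixed-step RK4.

    Illustrative mapping (an assumption, not taken from the chapter):
      prey -> autonomous TE copies, pred -> nonautonomous TE copies.
    Returns a list of (prey, pred) tuples of length steps + 1.
    """
    def deriv(x, y):
        dx = growth * x - predation * x * y      # prey replicate, are exploited
        dy = conversion * x * y - decay * y      # predators need prey, else decay
        return dx, dy

    x, y = prey0, pred0
    trajectory = [(x, y)]
    for _ in range(steps):
        k1 = deriv(x, y)
        k2 = deriv(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = deriv(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = deriv(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        trajectory.append((x, y))
    return trajectory


# Hypothetical parameters; the equilibrium is prey = decay / conversion = 25
# and pred = growth / predation = 10, and the trajectory cycles around it.
traj = simulate_predator_prey(prey0=40.0, pred0=5.0, growth=1.0,
                              predation=0.1, conversion=0.02, decay=0.5,
                              dt=0.01, steps=5000)
```

In this deterministic version the two copy numbers cycle indefinitely around the equilibrium; the time units are arbitrary and carry no biological calibration.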
TE dynamics have also been analyzed in variable environments [80, 81] (Table 2). Gogolesky et al. proposed a stochastic computational model to analyze the dynamics of active TEs in genomes of sexual diploid organisms under environmental stress. They based their model on the Fisher geometrical model of fitness landscapes. Overall, the authors conclude that the presence of inactive copies of TEs is necessary for the transposition-selection equilibrium of autonomous copies and that the mutator capacity of TEs might be important when host populations face rapid environmental changes.
Other recently developed methods analyzed the influence of the mating system on TE dynamics, different modes of selection, or applied branching models for studying the propagation of particular TE classes [82, 83, 84] (Table 2).
In addition to mathematical modeling and simulations, multiple computational tools have been developed in the last 5 years to analyze TEs in sequenced genomes. While some of these tools aim to assess the global abundance and diversity of TEs in the genome, such as dnaPipeTE, or to annotate TEs in assembled genomes, such as REPET, most of them focus on discovering and/or genotyping individual copies of TEs in the genome using next-generation sequencing (NGS) data [11, 64, 85, 86, 87, 88, 89, 90]. The diversity of methods available makes it difficult to choose the most appropriate one for the analysis of a given genome. To try to overcome this limitation, Nelson et al. developed an integrated pipeline named McClintock that incorporates six complementary TE detection methods. McClintock generates standardized output for the different TE detection methods, which facilitates both the comparison of the results obtained with the different pipelines and their installation and use. This and other studies that compared the performance of several tools arrived at the same conclusion: several computational tools should be combined to increase the accuracy of TE analysis [64, 86, 91].
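One simple way to combine several tools is to keep only the insertion calls that are supported by more than one method. The sketch below illustrates that idea under stated assumptions: the input layout (a dictionary mapping a tool name to a list of `(chromosome, position, family)` tuples), the tool names, the 50 bp window, and the function names are all hypothetical and do not reproduce the output format of McClintock or of any specific tool.

```python
from collections import defaultdict


def _emit(cluster, chrom, family, min_tools, out):
    """Append a consensus call if the cluster is supported by enough tools."""
    tools = sorted({tool for _, tool in cluster})
    if len(tools) >= min_tools:
        positions = sorted(pos for pos, _ in cluster)
        median = positions[len(positions) // 2]
        out.append((chrom, median, family, tools))


def consensus_insertions(calls_by_tool, window=50, min_tools=2):
    """Cluster nearby calls of the same TE family and keep well-supported ones.

    calls_by_tool: {tool_name: [(chrom, pos, family), ...]} (assumed layout).
    Calls are chained into one cluster while consecutive positions lie
    within `window` bp of each other (single-linkage clustering).
    """
    grouped = defaultdict(list)
    for tool, calls in calls_by_tool.items():
        for chrom, pos, family in calls:
            grouped[(chrom, family)].append((pos, tool))

    consensus = []
    for (chrom, family), hits in sorted(grouped.items()):
        hits.sort()
        cluster = [hits[0]]
        for pos, tool in hits[1:]:
            if pos - cluster[-1][0] <= window:
                cluster.append((pos, tool))
            else:
                _emit(cluster, chrom, family, min_tools, consensus)
                cluster = [(pos, tool)]
        _emit(cluster, chrom, family, min_tools, consensus)
    return consensus


# Hypothetical calls from three hypothetical tools.
calls = {
    "tool_a": [("2L", 1000, "roo"), ("2L", 5000, "gypsy")],
    "tool_b": [("2L", 1020, "roo"), ("3R", 700, "roo")],
    "tool_c": [("2L", 990, "roo"), ("2L", 5200, "gypsy")],
}
supported = consensus_insertions(calls)
```

Requiring agreement trades sensitivity for precision: raising `min_tools` removes tool-specific false positives but also discards true insertions that only one method can detect.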
The availability of third-generation sequencing (3GS) techniques should help improve the detection and genotyping of TE insertions. Although 3GS was developed before 2010, it is only in the last few years that this technique has started to be used [14, 93]. Chakraborty et al. reported the assembly of a D. melanogaster genome from a Zimbabwe strain using long-read single molecule real-time sequencing with 147X coverage. Among several novel structural variants described, they identified 37% additional TE insertions in the 2L chromosome compared with a previous study that used 70X coverage of short reads [14, 94]. 3GS technologies have also been applied to the sequencing of human genomes, although a detailed analysis of TE content based on long-read data has not been performed yet [95, 96, 97].
Recently, Disdero and Filée introduced LoRTE, the first tool that uses long-read sequences to identify TE insertions in the D. melanogaster genome. The authors argue that available software based on short reads fails to correctly identify TEs that are present in highly repetitive regions of the genome, while long-read technologies should allow us to identify all TEs in a given genome. LoRTE, developed in Python, verifies the presence and/or absence of previously annotated TEs and can also detect new insertions not previously annotated in the reference genome. LoRTE is able to work with low-coverage sequences (<10X), providing efficient and accurate TE annotation in a cost-effective manner.
4 Rates of Transposition
4.1 Empirical Estimates of the Rates of Transposition in Drosophila and Humans
Transposition rates in D. melanogaster have been traditionally estimated empirically by in situ hybridization and by using PCR approaches. The activation of TEs following intra- and interspecific hybridization has been studied in different Drosophila species [99, 100, 101]. For example, Vela et al. estimated transposition rates in D. buzzatii-D. koepferae interspecific hybrid flies by in situ hybridization. They found that hybrids showed at least one order of magnitude higher transposition rates than parental lines for at least three TE families. Robillard et al. estimated transposition rates by qPCR in an experimental evolution study in which a TE insertion was introduced in a strain lacking insertions from that particular family. In the first generations after the introduction of the TE insertion, the transposition rate was 0.33–0.45 per copy per generation, while in the following generations, transposition rates dropped by at least one order of magnitude. These values correspond to the first steps of the invasion of a genome by a TE, which proceeds faster than the rate of transposition measured in natural populations.
In the first edition of this chapter, we anticipated that NGS would allow studying transposition rates in a deeper and more accurate way. Indeed, recent studies have taken advantage of NGS data to estimate transposition rates in D. melanogaster. Rahman et al. used NGS data to estimate the transposition rate in the reference strain by comparing two available genomes that were sequenced ~15 years apart. The average transposition rate for TEs belonging to different families was 7 × 10⁻⁵, which is of the same order of magnitude as the previously reported rates (~10⁻⁴–10⁻⁵). Furthermore, they confirmed the prediction of increased transposition rate in inbred lines: they estimated a higher average number of TE insertions in lab strains inbred for more generations compared with strains inbred for a smaller number of generations. Adrion et al. estimated spontaneous insertion and deletion rates in D. melanogaster mutation accumulation lines. The authors identified 24 active superfamilies and estimated genome-wide insertion rates to be higher than deletion rates: 2.11 × 10⁻⁹ vs. 1.37 × 10⁻¹⁰ per site per generation, respectively. Superfamily-specific rates of insertion varied from 0 to 5.13 × 10⁻³ insertions per copy per generation and were within the range of previously estimated rates (Table 1).
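The rates above are expressed in two different units, per copy per generation and per site per generation, so it can help to see how each is computed from raw counts. The following sketch uses generic textbook definitions; the function names and the example counts are hypothetical, and the exact denominators used in the cited studies may differ. Only the two genome-wide rates passed to `fold_difference` are taken from the text.

```python
def per_copy_rate(new_insertions, copy_number, generations):
    """New insertions per existing copy of a family per generation."""
    if copy_number <= 0 or generations <= 0:
        raise ValueError("copy_number and generations must be positive")
    return new_insertions / (copy_number * generations)


def per_site_rate(events, n_sites, generations):
    """Events (insertions or deletions) per genomic site per generation."""
    if n_sites <= 0 or generations <= 0:
        raise ValueError("n_sites and generations must be positive")
    return events / (n_sites * generations)


def fold_difference(rate_a, rate_b):
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


# Hypothetical family: 7 new insertions, 100 resident copies, 1000 generations.
example_per_copy = per_copy_rate(7, 100, 1000)

# Genome-wide insertion vs. deletion rates quoted in the text
# (per site per generation): insertions are roughly 15 times more frequent.
insertion_vs_deletion = fold_difference(2.11e-9, 1.37e-10)
```

Because a per-copy rate scales with the number of resident elements while a per-site rate scales with genome size, the two units cannot be compared directly without knowing both quantities.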
In humans, previous studies estimated the transposition rate as 1 in 95 to 1 in 250 births for L1, 1 in 20 births for Alu insertions, and 1 in 916 births for SVA retrotransposons [104, 105, 106, 107]. Although several recent studies have estimated transposition rates in humans using NGS data, they all focused on somatic transposition in the brain or in tumor samples [47, 48, 90].
4.2 Transposition Control Mechanisms
Understanding the mechanisms controlling the transposition of TEs is central to our understanding of TE dynamics. Many different mechanisms of TE regulation have been described [43, 108, 109]. In this section, we will highlight recent advances in both TE self-regulation and regulation by host factors.
4.2.1 TE Self-Regulation
Self-regulation of transposition was first described in prokaryotes and soon after in TEs involved in hybrid dysgenesis in Drosophila. Recent studies have cast some doubt on one of the self-regulation mechanisms described: transposase overproduction inhibition. The transposase overproduction inhibition mechanism regulates the transposition of the IS630-Tc1-mariner, piggyBac, and hobo-Ac-Tam (hAT) superfamilies [111, 112]. However, several studies reported contradictory results suggesting that transposase inhibition by overproduction does not always happen. Bire et al. suggested that some works failed to detect transposase inhibition because cellular cofactors are necessary to execute this regulation system, and as such it can only be detected in in vivo experiments. However, Woodard et al. showed that aggregation of transposase proteins produces filamentous structures (rodlets) in the nucleus in a host-independent manner. The authors further showed that a decline in transposition occurs after transposase concentrations are high enough for filamentous structures to be visible. Thus, it is still not clear why some in vitro experiments failed to detect transposase overproduction inhibition.
4.2.2 Regulation by Host Factors
Small RNAs, such as small-interfering RNAs (siRNAs) and piwi-interacting RNAs (piRNAs), are well known to play an essential role in silencing TEs and preventing transposition. Several recent reviews highlight the monumental progress in this field [115, 116, 117, 118, 119]. In addition to posttranscriptional regulation of TEs, small RNAs are involved in transcriptional regulation as well. In mouse, piRNAs are required for de novo methylation and silencing of TEs. In Drosophila, Piwi proteins repress transcription and correlate with an increase in repressive chromatin marks at loci targeted by piRNAs.
While the role of siRNAs and piRNAs has been established for several years, a role of microRNAs (miRs) in suppressing the mobility of retrotransposons was only recently described. The authors showed that miR-128 binds to L1 RNA and represses its integration in humans.
New studies have also provided evidence for the role in TE repression of proteins previously known for their roles in other cellular processes such as interferon-stimulated proteins, the tumor suppressor p53, and the longevity regulating protein SIRT6. Several interferon-stimulated genes, such as the Moloney leukemia virus 10 (MOV10), the zinc-finger antiviral protein (ZAP), and the 3′ repair exonuclease 1 (TREX1), which are associated with the response to viruses, have recently been implicated in the inhibition of L1 activity [66, 123]. Recently, it has also been shown that the p53 transcription factor, which is involved in stress response networks and acts to restrict oncogenesis, also restricts retrotransposon activity in zebrafish, flies, and humans. The authors showed that p53 interacts with components of the piwi-interacting RNA pathway to suppress retrotransposition. Finally, the longevity regulating protein SIRT6 is also involved in retrotransposon repression by coordinating their packaging into transcriptionally repressive heterochromatin. SIRT6 binds to the 5′ UTR region of retrotransposons and mono-ADP ribosylates the Krüppel-associated protein 1 (KAP1), facilitating the interaction of KAP1 with the heterochromatin protein 1α (HP1α) and leading to chromatin compaction.
5 Rate of Fixation and Frequency Distribution
5.1 Natural Selection Against TE Insertions
Natural selection and stochastic processes influence both the rate of fixation and the frequency distribution of TEs in populations. The efficiency of selection depends on the effective population size, which largely differs between Drosophila and humans: >10⁸ and ~10⁴, respectively [126, 127]. Thus, while in Drosophila the high efficiency of selection should lead to the removal of slightly deleterious TE insertions, in humans, these insertions may accumulate in the genome. Indeed, most of the TE sequences in the human genome are remnants of ancient insertions.
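The dependence of selection efficiency on effective population size can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation, P = (1 − e^(−2s)) / (1 − e^(−4Nₑs)). The sketch below is an illustration under stated assumptions: a diploid population with census size equal to Nₑ, additive selection, and a hypothetical selection coefficient of s = −10⁻⁵ for a slightly deleterious insertion. The function name is also hypothetical; only the two population sizes come from the text.

```python
import math


def fixation_probability(s, ne):
    """Kimura's fixation probability for a new mutation with coefficient s.

    Assumes a diploid population whose census size equals ne and additive
    selection. For s == 0 the neutral expectation 1 / (2 * ne) is returned.
    """
    if s == 0:
        return 1.0 / (2.0 * ne)
    exponent = -4.0 * ne * s
    if exponent > 700.0:
        # exp(exponent) would overflow a float; the probability of fixing
        # such a strongly selected-against insertion is effectively zero.
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


s = -1e-5                                   # hypothetical, slightly deleterious
p_human_like = fixation_probability(s, 1e4)   # Ne ~ 10^4
p_fly_like = fixation_probability(s, 1e8)     # Ne ~ 10^8 (lower bound in text)
neutral_human_like = fixation_probability(0, 1e4)
```

With these illustrative values, the insertion fixes in the smaller population at roughly four fifths of the neutral rate, so it behaves almost neutrally, whereas in the larger population its fixation probability is effectively zero.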
A review by Barrón et al. explored the latest insights on the nature of selection acting against the deleterious effects of TEs in D. melanogaster populations. More recently, Kofler et al. compared TE dynamics in D. melanogaster and D. simulans populations to shed light on the long-term evolution of TEs. They confirmed that most of the TEs are present at low frequencies in D. melanogaster and showed that the same pattern is present in D. simulans. Based on computer simulations showing that 50% of the TE families have temporally heterogeneous transposition rates, and on the differences in TE composition between populations of the same species, the authors suggested that TE activity has recently increased in the two species. They proposed that the demographic history of both species, with a recent colonization of different environments, could be the cause of the high TE activity detected.
In humans, a recent study took advantage of the 1000 Genomes Project data, which report 16,192 polymorphic TEs, to perform the most complete TE dynamics analysis to date. Most of the polymorphic TEs were found to be present at very low frequencies: >93% of TEs showed <5% allele frequency in 26 human populations. These results confirm that overall polymorphic TE insertions are deleterious in humans, as was previously suggested with smaller family-specific datasets.
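A summary such as "the share of insertions below 5% allele frequency" is straightforward to compute once each polymorphic insertion has a carrier count. The sketch below shows the calculation on made-up numbers; the function names, the counts, and the sample size are hypothetical and are not taken from the 1000 Genomes Project data.

```python
def allele_frequencies(carrier_counts, n_chromosomes):
    """Frequency of each insertion: chromosomes carrying it / chromosomes sampled."""
    if n_chromosomes <= 0:
        raise ValueError("n_chromosomes must be positive")
    return [count / n_chromosomes for count in carrier_counts]


def fraction_below(frequencies, threshold=0.05):
    """Share of insertions whose allele frequency is below the threshold."""
    if not frequencies:
        return 0.0
    rare = sum(1 for freq in frequencies if freq < threshold)
    return rare / len(frequencies)


# Hypothetical carrier counts for seven insertions in 1000 sampled chromosomes.
counts = [1, 2, 3, 10, 40, 250, 900]
freqs = allele_frequencies(counts, n_chromosomes=1000)
rare_share = fraction_below(freqs)   # 5 of the 7 insertions are below 5%
```

An excess of rare variants of this kind is what purifying selection against insertions is expected to produce, although recent bursts of transposition can generate a similar skew.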
5.2 TE-Induced Adaptations
Several recent reviews have compiled results that showcase the adaptive role of TEs [19, 24, 50, 59, 128]. We would like to highlight the recent discovery of a TE in a fish-like marine chordate that encodes RAG-like proteins with endonuclease-transposase activity. This discovery provides evidence that supports the TE origin hypothesis for the adaptive immune system in jawed vertebrates. Two other recent publications provide experimental evidence for a role of TEs as providers of functional transcription factor binding sites (TFBS) involved in immune response and in cell pluripotency [50, 132]. A recent study linked ERV elements in humans with the interferon response pathway. The authors showed that ERVs carrying enhancers have been co-opted to activate different genes involved in the inflammatory response activated by interferon. This example shows how the exaptation of one family of TEs could shape a transcriptional network to activate different genes with one trigger system. Sundaram et al. reported mouse-specific TEs that contain multiple transcription factor binding sites for pluripotency transcription factors. The majority of the TEs, including an in silico reconstructed ancestral TE, were experimentally shown to exhibit enhancer activity in mouse embryonic stem cells. This latter result suggests that ancestral TEs already had transcriptional regulatory sites.
In Drosophila, the adaptive role of several TEs has also been identified. Most of the TEs characterized so far are involved in stress response: viral infection and xenobiotics (Doc1420 [60, 61]), oxidative stress (FBti0018880), xenobiotic stress (Accord [62, 63] and FBti0019627), cold stress (FBti0019985), and heavy metal stress (FBti0019170), while the FBti0019386 insertion was associated with faster developmental time. Some of these adaptive insertions have been shown to affect gene expression through different molecular mechanisms, such as affecting the polyadenylation site choice and adding TFBS, while others have been associated with gene duplication [60, 62].
6 Rate of Loss
A recent study estimated genome-wide and superfamily-specific TE deletion rates in D. melanogaster inbred lines. The authors found that most of the deletions involved retrotransposon elements, suggesting that the deletions were due to ectopic recombination instead of excision. Deletion rates were smaller than insertion rates estimated in the same inbred lines.
In vertebrates, lineage-specific differences in TE deletion rates have been reported. A possible explanation for this observation is that the success of some families results in a competition for the genome resources leading to the elimination of other TE families.
In addition to TE deletion rates, DNA loss rates should also be considered. In the human lineage, estimates of DNA loss are smaller than estimates of DNA gain, 650 Mb vs. 815 Mb, while in D. melanogaster, the rate of DNA loss is higher than the rate of DNA gain [135, 136, 137].
7 Horizontal Transfer of TE Insertions
In addition to parent-to-offspring transmission, TEs can also be horizontally transferred [138, 139, 140, 141]. By combining simulation and analytical approaches, Groth and Blumenstiel suggested that exposure rate to new TE families through horizontal transfer can be an important determinant of TE genomic content when the effects of drift in a population are weak. Thus, larger populations are expected to carry a higher TE content if population exposure rate is proportional to population size. So far, most of the evidence for TE horizontal transfer comes from closely related and geographically close species. There are several examples of horizontal transfer of TEs in Drosophila species, while so far horizontal transfer of TEs has not been described in humans.
Recent years have seen an increase in the number of reference genome sequences available as well as of population genome datasets. The availability of all these genome sequences and the development of new bioinformatics tools have allowed us to update our previous estimates of genomic TE content, which have increased both in humans and in D. melanogaster. These data have also allowed us to gather more evidence for the functional impact, both detrimental and beneficial, of TE insertions. Thus, it is still indisputable that understanding TE population dynamics is essential to understand genome structure, genome function, and genome evolution.
New methods developed to analyze the dynamics of TEs in populations have shed light on the interplay between autonomous and nonautonomous TE copies, TE invasion dynamics, and how the mating system influences the dynamics of TEs in genomes. We have also considerably advanced our knowledge of the host factors that regulate TE activity as well as of the genome features that influence TE dynamics (Fig. 3). Finally, differences in effective population sizes that affect the efficiency of selection against new TE insertions and differences in the rates of TE loss between humans and D. melanogaster can still be considered two important factors that contribute to the different abundance, diversity, and activity of TEs in these two species.
How can differences in the rate of DNA loss affect the evolutionary dynamics of TEs?
Why is host regulation of transposition relevant for TE dynamics?
Which is the most important factor explaining the differences in TE content, diversity, and activity between humans and Drosophila?
Have next-generation sequencing (NGS) technologies allowed us to identify all the TEs in a given genome?
How does the interaction between active and inactive copies of TEs affect TE dynamics?
We thank the reviewers for providing constructive comments on a previous version of this manuscript. This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (H2020-ERC-2014-CoG-647900) and from the Spanish Ministry of Economy and Competitiveness/FEDER (BFU2014-57779-P).
- 25. Lippman Z, Gendrel AV, Black M, Vaughn MW, Dedhia N, McCombie WR, Lavine K, Mittal V, May B, Kasschau KD, Carrington JC, Doerge RW, Colot V, Martienssen R (2004) Role of transposable elements in heterochromatin and epigenetic control. Nature 430(6998):471–476
- 36. Huda A, Bushel PR (2013) Widespread exonization of transposable elements in human coding sequences is associated with epigenetic regulation of transcription. Transcr Open Access 1(1)
- 49. Payer LM, Steranka JP, Yang WR, Kryatova M, Medabalimi S, Ardeljan D, Liu C, Boeke JD, Avramopoulos D, Burns KH (2017) Structural variants caused by Alu insertions are associated with risks for many human diseases. Proc Natl Acad Sci U S A 114(20):E3984–E3992
- 58. Schrader L, Kim JW, Ence D, Zimin A, Klein A, Wyschetzki K, Weichselgartner T, Kemena C, Stokl J, Schultner E, Wurm Y, Smith CD, Yandell M, Heinze J, Gadau J, Oettler J (2014) Transposable element islands facilitate adaptation to novel environments in an invasive species. Nat Commun 5:5495
- 62. Schmidt JM, Good RT, Appleton B, Sherrard J, Raymant GC, Bogwitz MR, Martin J, Daborn PJ, Goddard ME, Batterham P, Robin C (2010) Copy number variation and transposable elements feature in recent, ongoing adaptation at the Cyp6g1 locus. PLoS Genet 6(6):e1000998
- 70. Kaminker JS, Bergman CM, Kronmiller B, Carlson J, Svirskas R, Patel S, Frise E, Wheeler DA, Lewis SE, Rubin GM, Ashburner M, Celniker SE (2002) The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective. Genome Biol 3(12):RESEARCH0084
- 81. Gogolesky K, Startek A, Gambin A, Le Rouzic A (2016) Modelling the proliferation of transposable elements in populations under environmental stress. arXiv:1611.04812
- 85. Goubert C, Modolo L, Vieira C, ValienteMoro C, Mavingui P, Boulesteix M (2015) De novo assembly and annotation of the Asian tiger mosquito (Aedes albopictus) repeatome with dnaPipeTE from raw genomic reads and comparative analysis with the yellow fever mosquito (Aedes aegypti). Genome Biol Evol 7(4):1192–1205
- 95. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH, Konkel MK, Malhotra A, Stutz AM, Shi X, Casale FP, Chen J, Hormozdiari F, Dayama G, Chen K, Malig M, Chaisson MJP, Walter K, Meiers S, Kashin S, Garrison E, Auton A, Lam HYK, Mu XJ, Alkan C, Antaki D, Bae T, Cerveira E, Chines P, Chong Z, Clarke L, Dal E, Ding L, Emery S, Fan X, Gujral M, Kahveci F, Kidd JM, Kong Y, Lameijer EW, McCarthy S, Flicek P, Gibbs RA, Marth G, Mason CE, Menelaou A, Muzny DM, Nelson BJ, Noor A, Parrish NF, Pendleton M, Quitadamo A, Raeder B, Schadt EE, Romanovitch M, Schlattl A, Sebra R, Shabalin AA, Untergasser A, Walker JA, Wang M, Yu F, Zhang C, Zhang J, Zheng-Bradley X, Zhou W, Zichner T, Sebat J, Batzer MA, McCarroll SA, 1000 Genomes Project Consortium, Mills RE, Gerstein MB, Bashir A, Stegle O, Devine SE, Lee C, Eichler EE, Korbel JO (2015) An integrated map of structural variation in 2,504 human genomes. Nature 526(7571):75–81
- 96. Huddleston J, Chaisson MJP, Steinberg KM, Warren W, Hoekzema K, Gordon D, Graves-Lindsay TA, Munson KM, Kronenberg ZN, Vives L, Peluso P, Boitano M, Chin CS, Korlach J, Wilson RK, Eichler EE (2017) Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res 27(5):677–685
- 97. Pendleton M, Sebra R, Pang AW, Ummat A, Franzen O, Rausch T, Stutz AM, Stedman W, Anantharaman T, Hastie A, Dai H, Fritz MH, Cao H, Cohain A, Deikus G, Durrett RE, Blanchard SC, Altman R, Chin CS, Guo Y, Paxinos EE, Korbel JO, Darnell RB, McCombie WR, Kwok PY, Mason CE, Schadt EE, Bashir A (2015) Assembly and diploid architecture of an individual human genome via single-molecule technologies. Nat Methods 12(8):780–786
- 101. Romero-Soriano V, Modolo L, Lopez-Maestre H, Mugat B, Pessia E, Chambeyron S, Vieira C, Garcia Guerreiro MP (2017) Transposable element misregulation is linked to the divergence between parental piRNA pathways in Drosophila hybrids. Genome Biol Evol 9(6):1450–1470
- 132. Sundaram V, Choudhary MN, Pehrsson E, Xing X, Fiore C, Pandey M, Maricque B, Udawatta M, Ngo D, Chen Y, Paguntalan A, Ray T, Hughes A, Cohen BA, Wang T (2017) Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus. Nat Commun 8:14550
- 147. Sebaihia M, Peck MW, Minton NP, Thomson NR, Holden MT, Mitchell WJ, Carter AT, Bentley SD, Mason DR, Crossman L, Paul CJ, Ivens A, Wells-Bennik MH, Davis IJ, Cerdeno-Tarraga AM, Churcher C, Quail MA, Chillingworth T, Feltwell T, Fraser A, Goodhead I, Hance Z, Jagels K, Larke N, Maddison M, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Sanders M, Simmonds M, White B, Whithead S, Parkhill J (2007) Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes. Genome Res 17(7):1082–1092
- 148. Rhee JS, Choi BS, Kim J, Kim BM, Lee YM, Kim IC, Kanamori A, Choi IY, Schartl M, Lee JS (2017) Diversity, distribution, and significance of transposable elements in the genome of the only selfing hermaphroditic vertebrate Kryptolebias marmoratus. Sci Rep 7:40121
- 150. Hernandez-Hernandez EM, Fernandez-Medina RD, Navarro-Escalante L, Nunez J, Benavides-Machado P, Carareto CMA (2017) Genome-wide analysis of transposable elements in the coffee berry borer Hypothenemus hampei (Coleoptera: Curculionidae): description of novel families. Mol Genet Genomics 292(3):565–583
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.