Abstract
Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, the annotation process now represents a bottleneck to genome analysis, especially when dealing with transposable elements (TEs). Combined approaches, using both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. At each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools to use and the good practices to follow.
Acknowledgments
This work was supported in part by grants from the Agence Nationale de la Recherche (Holocentrism project, to HQ [grant number ANR-07-BLAN-0057]) and the Centre National de la Recherche Scientifique (Groupement de Recherche 2157 “Elements Transposables”). TF was supported by a PhD studentship from the Institut National de la Recherche Agronomique. EP was supported by a postdoctoral fellowship from the Agence Nationale de la Recherche.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Permal, E., Flutre, T., Quesneville, H. (2012). Roadmap for Annotating Transposable Elements in Eukaryote Genomes. In: Bigot, Y. (eds) Mobile Genetic Elements. Methods in Molecular Biology, vol 859. Humana Press. https://doi.org/10.1007/978-1-61779-603-6_3
DOI: https://doi.org/10.1007/978-1-61779-603-6_3
Publisher Name: Humana Press
Print ISBN: 978-1-61779-602-9
Online ISBN: 978-1-61779-603-6
eBook Packages: Springer Protocols