Roadmap for Annotating Transposable Elements in Eukaryote Genomes

  • Protocol
Mobile Genetic Elements

Part of the book series: Methods in Molecular Biology (MIMB, volume 859)

Abstract

Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, annotation is now a bottleneck in genome analysis, especially for transposable elements (TEs). Combined approaches, which use both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. For each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools and good practices to use.

Acknowledgments

This work was supported in part by grants from the Agence Nationale de la Recherche (Holocentrism project, to HQ [grant number ANR-07-BLAN-0057]) and the Centre National de la Recherche Scientifique, Groupement de Recherche 2157 “Elements Transposables.” TF was supported by a PhD studentship from the Institut National de la Recherche Agronomique. EP was supported by a postdoctoral fellowship from the Agence Nationale de la Recherche.

Author information

Corresponding author

Correspondence to Hadi Quesneville.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Permal, E., Flutre, T., Quesneville, H. (2012). Roadmap for Annotating Transposable Elements in Eukaryote Genomes. In: Bigot, Y. (ed.) Mobile Genetic Elements. Methods in Molecular Biology, vol 859. Humana Press. https://doi.org/10.1007/978-1-61779-603-6_3

  • DOI: https://doi.org/10.1007/978-1-61779-603-6_3

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-602-9

  • Online ISBN: 978-1-61779-603-6

  • eBook Packages: Springer Protocols
