Overview of Repeat Annotation and De Novo Repeat Identification

  • Protocol
Plant Transposable Elements

Part of the book series: Methods in Molecular Biology (MIMB, volume 1057)

Abstract

The availability of large amounts of genomic sequence has provided unique opportunities for understanding the composition and dynamics of transposable elements (TEs) in plants. As the cost of sequencing declines, the genomic sequences of most crop plants will become available within the next few years. Thus, the annotation of genomic sequences, rather than sequence availability, will become the “bottleneck” for genome study. Because TEs are the largest component of most plant genomes, automating TE identification and classification is essential for future genome annotation as well as for the characterization of TEs. This chapter reviews the functions and mechanisms of different repeat-finding tools, with a focus on de novo repeat identification programs. It also covers the further processing of results from de novo identification programs and the construction of repeat libraries for downstream genome analyses.

Acknowledgments

I thank Dr. Frank Dennis (Michigan State Univ.) for critical reading of the manuscript. This work was supported by NSF grant IOS-1126998.

Copyright information

© 2013 Springer Science+Business Media, New York

About this protocol

Cite this protocol

Jiang, N. (2013). Overview of Repeat Annotation and De Novo Repeat Identification. In: Peterson, T. (eds) Plant Transposable Elements. Methods in Molecular Biology, vol 1057. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-568-2_20

  • DOI: https://doi.org/10.1007/978-1-62703-568-2_20

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-567-5

  • Online ISBN: 978-1-62703-568-2

  • eBook Packages: Springer Protocols
