Boosting the Detection of Transposable Elements Using Machine Learning

  • Tiago Loureiro
  • Rui Camacho
  • Jorge Vieira
  • Nuno A. Fonseca
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 222)


Transposable Elements (TEs) are DNA sequences that can move, or transpose, within a genome. As agents of mutation, TEs play an important role both in diseases caused by genome alterations and in species evolution. Several tools have been developed to discover and annotate TEs, but no single tool achieves good results on all types of TEs. In this paper we evaluate the performance of several TE detection and annotation tools and investigate whether Machine Learning techniques can be used to improve their overall detection accuracy. The results of an in silico evaluation indicate that the performance of these tools can be improved by using machine learning classifiers.


Keywords: Transposable Elements, Machine Learning, Genomics





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Tiago Loureiro (1, corresponding author)
  • Rui Camacho (2)
  • Jorge Vieira (3)
  • Nuno A. Fonseca (4, 5)
  1. DEI & Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
  2. DEI & Faculdade de Engenharia & LIAAD-INESCTEC, Universidade do Porto, Porto, Portugal
  3. IBMC - Instituto de Biologia Molecular e Celular & Universidade do Porto, Porto, Portugal
  4. EMBL Outstation, European Bioinformatics Institute (EBI), Hinxton, UK
  5. CRACS-INESCTEC, Porto, Portugal
