Computational Methods for the Analysis of Primate Mobile Elements

  • Richard Cordaux
  • Shurjo K. Sen
  • Miriam K. Konkel
  • Mark A. Batzer
Part of the Methods in Molecular Biology book series (MIMB, volume 628)


Transposable elements (TEs), defined as discrete pieces of DNA that can move from one site to another within genomes, are significant components of eukaryotic genomes, including those of primates. Comparative genome-wide analyses have revealed the considerable structural and functional impact of TE families on primate genomes. Insights into this impact have come in part from the development of computational methods that allow detailed and reliable identification, annotation, and evolutionary analysis of the many TE families that populate primate genomes. Here, we present an overview of these computational methods and describe efficient data-mining strategies for building a comprehensive picture of TE biology in newly available genome sequences.

Key words

Computational methods, Transposable element, Insertion, Identification, Classification, Consensus sequence, Subfamily, Phylogenetic reconstruction, Transpositional activity, Primate, Genome evolution



Our research is supported by National Science Foundation BCS-0218338 (MAB) and EPS-0346411 (MAB), National Institutes of Health RO1 GM59290 (MAB) and PO1 AG022064 (MAB), and the State of Louisiana Board of Regents Support Fund (MAB). RC is supported by a Young Investigator ATIP award from the Centre National de la Recherche Scientifique (CNRS).



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Richard Cordaux (1)
  • Shurjo K. Sen (2)
  • Miriam K. Konkel (2)
  • Mark A. Batzer (2), corresponding author

  1. Laboratoire Ecologie, Evolution et Symbiose, CNRS UMR 6556, Université de Poitiers, Poitiers, France
  2. Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-Scale Systems, Louisiana State University, Baton Rouge, USA
