Identification of Transcription Factor Binding Sites Derived from Transposable Element Sequences Using ChIP-seq

  • Protocol
Computational Biology of Transcription Factor Binding

Part of the book series: Methods in Molecular Biology ((MIMB,volume 674))

Abstract

Transposable elements (TEs) form a substantial fraction of the non-coding DNA of many eukaryotic genomes. There are numerous examples of TEs being exapted for regulatory function by the host, many of which were identified through their high conservation. However, given that TEs are often the youngest part of a genome and typically exhibit a high turnover, conservation-based methods will fail to identify lineage- or species-specific exaptations. ChIP-seq has become a very popular and effective method for identifying in vivo DNA–protein interactions, such as those seen at transcription factor binding sites (TFBS), and has been used to show that there are a large number of TE-derived TFBS. Many of these TE-derived TFBS show poor conservation and would go unnoticed using conservation screens. Here, we describe a simple pipeline method for using data generated through ChIP-seq to identify TE-derived TFBS.
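
The pipeline itself is not reproduced in this preview, so the following is only a minimal sketch of the core idea the abstract describes: flagging ChIP-seq peaks that overlap annotated TE sequences. It assumes peaks and TE annotations are already available as BED-style, 0-based half-open intervals (for example, TE coordinates exported from a repeat annotation track). The function name `te_derived_peaks` and the toy coordinates are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict


def te_derived_peaks(peaks, te_annotations):
    """Return (peak, te_name) pairs where a peak overlaps a TE annotation.

    Assumption: both inputs use 0-based, half-open coordinates (BED style).
    peaks:          iterable of (chrom, start, end)
    te_annotations: iterable of (chrom, start, end, te_name)
    """
    # Group TE intervals by chromosome so each peak is only compared
    # against annotations on its own chromosome.
    tes_by_chrom = defaultdict(list)
    for chrom, start, end, name in te_annotations:
        tes_by_chrom[chrom].append((start, end, name))

    hits = []
    for chrom, p_start, p_end in peaks:
        for t_start, t_end, name in tes_by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each one starts
            # before the other ends.
            if p_start < t_end and t_start < p_end:
                hits.append(((chrom, p_start, p_end), name))
    return hits


# Toy data (hypothetical coordinates, for illustration only).
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
tes = [("chr1", 150, 400, "MIR"), ("chr2", 80, 120, "L2")]

result = te_derived_peaks(peaks, tes)
print(result)
```

The linear scan per chromosome keeps the sketch short and correct even if TE annotations overlap one another; for genome-scale data an interval index or a dedicated interval-intersection tool would likely be preferable.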

Author information

Corresponding author

Correspondence to I. King Jordan.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Conley, A.B., Jordan, I.K. (2010). Identification of Transcription Factor Binding Sites Derived from Transposable Element Sequences Using ChIP-seq. In: Ladunga, I. (eds) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_14

  • DOI: https://doi.org/10.1007/978-1-60761-854-6_14

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-853-9

  • Online ISBN: 978-1-60761-854-6

  • eBook Packages: Springer Protocols
