Abstract
Transposable elements (TEs) make up a substantial fraction of the non-coding DNA of many eukaryotic genomes. There are numerous examples of TEs that have been exapted by the host for regulatory function, many of which were identified through their high sequence conservation. However, because TEs are often the youngest part of a genome and typically exhibit high turnover, conservation-based methods will fail to identify lineage- or species-specific exaptations. ChIP-seq has become a widely used and effective method for identifying in vivo DNA–protein interactions, such as those at transcription factor binding sites (TFBS), and has been used to show that a large number of TFBS are TE-derived. Many of these TE-derived TFBS are poorly conserved and would go unnoticed in conservation screens. Here, we describe a simple pipeline that uses ChIP-seq data to identify TE-derived TFBS.
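The core step of such a pipeline can be pictured as an interval overlap between called ChIP-seq peaks and annotated TEs. The sketch below is only an illustration under stated assumptions, not the chapter's actual procedure: it assumes peaks and TE annotations (for example, from RepeatMasker) are already available as BED-style half-open intervals, and it uses the peak midpoint as a stand-in for the binding site. The function name `te_derived_peaks` and the sample coordinates are hypothetical.

```python
from collections import defaultdict


def te_derived_peaks(peaks, tes):
    """Return (peak, te_name) pairs for peaks whose midpoint lies inside a TE.

    peaks: iterable of (chrom, start, end) tuples, half-open coordinates.
    tes:   iterable of (chrom, start, end, name) tuples, half-open coordinates.
    Assumption: the peak midpoint approximates the binding site.
    """
    # Group TE intervals by chromosome so each peak is only compared
    # against TEs on its own chromosome.
    tes_by_chrom = defaultdict(list)
    for chrom, start, end, name in tes:
        tes_by_chrom[chrom].append((start, end, name))

    hits = []
    for chrom, start, end in peaks:
        midpoint = (start + end) // 2
        # Linear scan: fine for a sketch, too slow for genome-scale data.
        for te_start, te_end, te_name in tes_by_chrom[chrom]:
            if te_start <= midpoint < te_end:
                hits.append(((chrom, start, end), te_name))
    return hits


# Hypothetical example data (coordinates are made up for illustration).
example_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
example_tes = [
    ("chr1", 120, 180, "MIR"),
    ("chr1", 590, 700, "AluSx"),
    ("chr2", 0, 1000, "L2"),
]
example_hits = te_derived_peaks(example_peaks, example_tes)
```

With real data, the linear scan would normally be replaced by an interval index or a dedicated interval-overlap tool, and the overlap criterion (midpoint versus any overlap) is a choice that should follow the peak caller's output.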
Copyright information
© 2010 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Conley, A.B., Jordan, I.K. (2010). Identification of Transcription Factor Binding Sites Derived from Transposable Element Sequences Using ChIP-seq. In: Ladunga, I. (ed.) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_14
DOI: https://doi.org/10.1007/978-1-60761-854-6_14
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-853-9
Online ISBN: 978-1-60761-854-6
eBook Packages: Springer Protocols