An Overview of the Computational Analyses and Discovery of Transcription Factor Binding Sites

Computational Biology of Transcription Factor Binding

Part of the book series: Methods in Molecular Biology ((MIMB,volume 674))

Abstract

Here we provide a pragmatic, high-level overview of the computational approaches and tools for the discovery of transcription factor binding sites. Unraveling transcription regulatory networks and their malfunctions, such as cancer, has become feasible due to recent stellar progress in experimental techniques and computational analyses. While predictions of isolated sites still pose notorious challenges, cis-regulatory modules (clusters) of binding sites can now be identified with high accuracy. Further support comes from conserved DNA segments, co-regulation, transposable elements, nucleosomes, and three-dimensional chromosomal structures. We introduce computational tools for the analysis and interpretation of chromatin immunoprecipitation, next-generation sequencing, SELEX, and protein-binding microarray results. Because immunoprecipitation produces overly large DNA segments and well over half of the sequencing reads constitute background noise, methods are presented for background correction, sequence read mapping, peak calling, false discovery rate estimation, and co-localization analyses. To discover short binding site motifs within extensive immunoprecipitation segments, we recommend algorithms and software based on expectation maximization and Gibbs sampling. Data integration using several databases further improves performance. Binding sites can be visualized in genomic and chromatin context using genome browsers. Binding site information, integrated with co-expression in large compendia of gene expression experiments, allows us to reveal complex transcriptional regulatory networks.
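As a rough illustration of the false discovery rate estimation step mentioned above, the following minimal sketch applies the Benjamini-Hochberg step-up procedure to a list of per-peak p-values. The abstract does not say which FDR method the chapter uses, so this choice is an assumption. The function name `benjamini_hochberg`, the example p-values, and the 0.05 cutoff are also hypothetical. The sketch is not taken from any particular peak caller.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for five candidate peaks (illustrative only).
peak_p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
q = benjamini_hochberg(peak_p_values)

# Keep peaks whose q-value is at or below an assumed 5% FDR cutoff.
significant = [i for i, qv in enumerate(q) if qv <= 0.05]
print(significant)
```

Adjusting from the largest p-value downward keeps the q-values non-decreasing in the p-values, so a peak is never reported with a smaller q-value than a peak that has a smaller p-value.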

Acknowledgments

The author is grateful to Yang Liu for the critical reading of the manuscript, and the NSF Grant EPS-0701892 for funding.

Author information

Corresponding author

Correspondence to Istvan Ladunga.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Ladunga, I. (2010). An Overview of the Computational Analyses and Discovery of Transcription Factor Binding Sites. In: Ladunga, I. (eds) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_1

  • DOI: https://doi.org/10.1007/978-1-60761-854-6_1

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-853-9

  • Online ISBN: 978-1-60761-854-6

  • eBook Packages: Springer Protocols
