Abstract
Predicting transcription factor binding sites (TFBS) from sequence is one of the most challenging problems in computational biology. (Semi-)automated, computer-assisted prediction methods are needed to find TFBS across an entire genome, which is a first step in reconstructing the mechanisms that control gene activity. Bioinformatics journals continue to publish diverse TFBS prediction methods every month. To help practitioners decide which method to use for a particular TFBS prediction task, we provide a platform for assessing the quality and applicability of the available methods. Assessment tools allow researchers to determine how methods can be expected to perform on specific organisms or on specific transcription factor families. This chapter introduces the TFBS detection problem and reviews current strategies for evaluating algorithm effectiveness. It then introduces and discusses a novel and robust assessment tool, the Motif Tool Assessment Platform (MTAP).
Copyright information
© 2010 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Quest, D., Ali, H. (2010). The Motif Tool Assessment Platform (MTAP) for Sequence-Based Transcription Factor Binding Site Prediction Tools. In: Ladunga, I. (eds) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_8
DOI: https://doi.org/10.1007/978-1-60761-854-6_8
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-853-9
Online ISBN: 978-1-60761-854-6
eBook Packages: Springer Protocols