Abstract
Identifying transcription factor binding sites computationally is a hard problem because existing predictors produce many false predictions. Combining the outputs of existing predictors with classification methods such as Support Vector Machines (SVMs) can improve overall prediction quality. However, the conventional negative examples (that is, examples of non-binding sites) in this type of problem are highly unreliable. In this study, we used different types of negative examples. One class was taken from regions far from the promoters, where binding sites occur very rarely, and another was produced by randomization. We thus observed the effect of using different negative examples when predicting transcription factor binding sites in mouse. We also devised a novel cross-validation technique for this type of biological problem.
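The abstract does not say how the randomized negative examples were generated. As a rough illustration only, the sketch below assumes each sequence position is represented by a vector of scores from the base predictors, and builds randomized vectors by shuffling each predictor's column independently. The function name `randomized_negatives`, the column-wise shuffle, and the score values are all hypothetical and are not taken from the paper.

```python
import random


def randomized_negatives(vectors, seed=0):
    """Return a copy of `vectors` with each column shuffled independently.

    Assumption (not from the paper): rows are sequence positions and
    columns are scores from individual base predictors. Shuffling each
    column on its own keeps every predictor's score distribution but
    breaks the agreement between predictors at any given position.
    """
    rng = random.Random(seed)
    # Transpose rows into columns so each predictor can be shuffled alone.
    columns = [list(col) for col in zip(*vectors)]
    for col in columns:
        rng.shuffle(col)
    # Transpose back into one vector per (synthetic) position.
    return [list(row) for row in zip(*columns)]


# Hypothetical scores: 4 positions, 3 base predictors.
scores = [
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1],
    [0.5, 0.4, 0.6],
    [0.3, 0.9, 0.2],
]
negatives = randomized_negatives(scores, seed=42)
```

Vectors built this way could be labeled as negatives and passed, together with known binding-site vectors, to a classifier such as an SVM. Whether that matches the procedure used in the study cannot be determined from the abstract alone.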
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rezwan, F., Sun, Y., Davey, N., Adams, R., Rust, A.G., Robinson, M. (2011). Effect of Using Varying Negative Examples in Transcription Factor Binding Site Predictions. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2011. Lecture Notes in Computer Science, vol 6623. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20389-3_1
DOI: https://doi.org/10.1007/978-3-642-20389-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20388-6
Online ISBN: 978-3-642-20389-3
eBook Packages: Computer Science, Computer Science (R0)