Using Varying Negative Examples to Improve Computational Predictions of Transcription Factor Binding Sites

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 311)

Abstract

The identification of transcription factor binding sites (TFBSs) is a non-trivial problem because existing computational predictors produce many false predictions. Although combining these predictions with a meta-classifier, such as a Support Vector Machine (SVM), has been shown to improve the overall results, the improvement is smaller than expected. The reason is that the predictors are not reliable for the negative examples taken from non-binding sites in the promoter region. Using negative examples from different sources when training an SVM is therefore one possible solution to this problem. In this study, we used different types of negative examples when training the classifier. These negative examples were taken from regions far away from the promoter regions, produced by randomisation, or taken from the intronic regions of genes. By training with these negative examples, we observed their effect on the prediction of TFBSs in yeast. We also used a cross-validation method modified for this type of problem. We observed a substantial improvement in classifier performance, and the resulting classifier could constitute a model for predicting TFBSs. The major contribution of the analysis is therefore that, for the yeast genome, the position of binding sites can be predicted with high confidence using our technique, and the predictions are of much higher quality than those of the original prediction algorithms.
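
As a rough illustration of one of the negative-example sources named in the abstract (randomisation), the sketch below builds a small labelled training set in Python. The abstract does not say how the randomisation was carried out, so shuffling the bases of each known binding site is an assumption made here, not the authors' procedure. The function names `shuffled_negatives` and `build_training_set` and the example sequences are hypothetical.

```python
import random


def shuffled_negatives(positives, n_per_positive=1, seed=0):
    """Create negative examples by shuffling each positive sequence.

    Shuffling keeps the base composition but scrambles the base order.
    ASSUMPTION: this is one plausible randomisation scheme, not
    necessarily the one used in the paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    negatives = []
    for seq in positives:
        for _ in range(n_per_positive):
            bases = list(seq)
            rng.shuffle(bases)
            negatives.append("".join(bases))
    return negatives


def build_training_set(positives, negatives):
    """Label binding sites 1 and non-binding examples 0."""
    return [(s, 1) for s in positives] + [(s, 0) for s in negatives]


# Made-up sequences, for illustration only.
positives = ["TGACTCA", "CACGTGA"]
negatives = shuffled_negatives(positives, n_per_positive=2)
training_set = build_training_set(positives, negatives)
```

A shuffled sequence can occasionally equal the original, so a fuller version would probably filter such cases out. The labelled pairs would then be turned into feature vectors for whichever classifier is being trained.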

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rezwan, F., Sun, Y., Davey, N., Adams, R., Rust, A.G., Robinson, M. (2012). Using Varying Negative Examples to Improve Computational Predictions of Transcription Factor Binding Sites. In: Jayne, C., Yue, S., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2012. Communications in Computer and Information Science, vol 311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32909-8_24

  • DOI: https://doi.org/10.1007/978-3-642-32909-8_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32908-1

  • Online ISBN: 978-3-642-32909-8

  • eBook Packages: Computer Science, Computer Science (R0)
