Prediction of DNA-Binding Propensity of Proteins by the Ball-Histogram Method

  • Andrea Szabóová
  • Ondřej Kuželka
  • Sergio Morales E.
  • Filip Železný
  • Jakub Tolar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6674)


We contribute a novel ball-histogram approach to predicting the DNA-binding propensity of proteins. Unlike state-of-the-art methods, which construct an ad-hoc set of features describing the charged patches of the proteins, the ball-histogram technique enables a systematic Monte-Carlo exploration of the spatial distribution of charged amino acids, capturing joint probabilities of specified amino acids occurring at certain distances from one another. This exploration yields a model for predicting DNA-binding propensity. We validate our method in prediction experiments, achieving favorable accuracies. Moreover, our method provides interpretable features involving spatial distributions of selected amino acids.
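The idea of sampling balls and recording which amino acids fall inside them can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ball_histogram`, the input format (a list of residue-type and coordinate pairs), the choice of residue positions as ball centres, and the parameters `radius` and `n_samples` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import random
from collections import Counter
from math import dist


def ball_histogram(residues, types, radius, n_samples=1000, seed=0):
    """Monte-Carlo estimate of the joint distribution of residue counts in a ball.

    residues  -- list of (residue_type, (x, y, z)) pairs (hypothetical format)
    types     -- tuple of residue types to track, e.g. ("ARG", "LYS")
    radius    -- ball radius, in the same units as the coordinates
    n_samples -- number of balls to sample
    """
    rng = random.Random(seed)
    hist = Counter()
    for _ in range(n_samples):
        # Assumption: ball centres are drawn uniformly from residue positions.
        _, centre = rng.choice(residues)
        # Count, for each tracked type, how many residues lie inside the ball.
        counts = tuple(
            sum(1 for t, pos in residues if t == ty and dist(pos, centre) <= radius)
            for ty in types
        )
        hist[counts] += 1
    # Normalise the counts to estimated joint probabilities.
    return {key: n / n_samples for key, n in hist.items()}


# Toy example: two nearby charged residues and one distant residue.
toy = [("ARG", (0.0, 0.0, 0.0)), ("LYS", (1.0, 0.0, 0.0)), ("GLU", (50.0, 0.0, 0.0))]
histogram = ball_histogram(toy, types=("ARG", "LYS"), radius=2.0)
```

Each key of the returned dictionary is a tuple of counts, one per tracked residue type, and each value is the fraction of sampled balls with that count pattern. Vectors of this kind could then serve as features for a classifier.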





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Andrea Szabóová (1)
  • Ondřej Kuželka (1)
  • Sergio Morales E. (2)
  • Filip Železný (1)
  • Jakub Tolar (3)
  1. Czech Technical University, Prague, Czech Republic
  2. Instituto Tecnológico de Costa Rica (ITCR), Costa Rica
  3. University of Minnesota, Minneapolis, USA
