Predicting Protein Function and Protein-Ligand Interaction with the 3D Neighborhood Kernel

  • Conference paper
Discovery Science (DS 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9356)

Abstract

Kernels for structured data have gained a lot of attention in a world with an ever-increasing amount of complex data generated in domains such as biology, chemistry, and engineering. However, while many applications involve spatial aspects, only a few kernel methods to date have been designed to take 3D information into account. We introduce a novel kernel called the 3D Neighborhood Kernel. As a first step, we focus on 3D structures of proteins and ligands, in which the atoms are represented as points in 3D space. By comparing the Euclidean distances between selected sets of atoms, the kernel can select spatial features that are important for determining the functions of proteins or their interactions with other molecules. We evaluate the kernel on a number of benchmark datasets and show that its performance is competitive with state-of-the-art methods. Although we apply this kernel to proteins and ligands, it is applicable to any kind of 3D data in which objects follow a common schema, such as RNA, cars, or standardized equipment.
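
The abstract does not define the kernel itself, so the following is only an illustrative sketch of the general idea of comparing Euclidean distances between atoms represented as 3D points. It is not the 3D Neighborhood Kernel. It assumes a much simpler stand-in: each structure is summarized by a histogram of its pairwise atom distances, and two structures are compared by the dot product of their histograms. The function names, the bin width, and the number of bins are hypothetical choices made for this example.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of 3D points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def distance_histogram(points, bin_width=1.0, n_bins=10):
    """Count pairwise distances per bin (assumed binning, not from the paper).

    Distances at or beyond bin_width * n_bins are ignored.
    """
    hist = [0] * n_bins
    for d in pairwise_distances(points):
        idx = int(d // bin_width)
        if idx < n_bins:
            hist[idx] += 1
    return hist


def histogram_kernel(points_a, points_b, bin_width=1.0, n_bins=10):
    """Dot product of two distance histograms.

    Because it is an inner product of explicit feature vectors,
    this stand-in is a valid (positive semidefinite) kernel.
    """
    ha = distance_histogram(points_a, bin_width, n_bins)
    hb = distance_histogram(points_b, bin_width, n_bins)
    return sum(x * y for x, y in zip(ha, hb))


# Toy "structures": atoms as (x, y, z) coordinates.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
structure_b = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)]  # translated copy
similarity = histogram_kernel(structure_a, structure_b)
```

Because only inter-atomic distances enter the feature vector, this sketch gives the same value for a structure and a translated copy of it, which is one reason distance-based features are a natural fit for 3D data.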

Acknowledgements

The authors would like to thank students Davy De Mits and Sunil Aryal for conducting preliminary experiments, Dr. Kurt De Grave and Dr. Fabrizio Costa for assistance with running NSPDK, and Jérôme Renaux for proofreading. This research was supported by ERC-StG 240186 MiGraNT and IWT-SBO Nemoa.

Author information

Corresponding author

Correspondence to Leander Schietgat.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Schietgat, L., Fannes, T., Ramon, J. (2015). Predicting Protein Function and Protein-Ligand Interaction with the 3D Neighborhood Kernel. In: Japkowicz, N., Matwin, S. (eds) Discovery Science. DS 2015. Lecture Notes in Computer Science, vol 9356. Springer, Cham. https://doi.org/10.1007/978-3-319-24282-8_19

  • DOI: https://doi.org/10.1007/978-3-319-24282-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24281-1

  • Online ISBN: 978-3-319-24282-8

  • eBook Packages: Computer Science, Computer Science (R0)
