Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis

  • Jaume Gibert
  • Ernest Valveny
  • Horst Bunke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6419)

Abstract

Structure-activity relationship analysis aims to discover the chemical activity of molecular compounds from their structure. In this article we use a particular graph representation of molecules and propose a new graph embedding procedure for structure-activity relationship analysis. The embedding essentially arranges a molecule as a vector whose components are the frequencies of the atoms that appear in it and the frequencies of the covalent bonds between them. Results on two benchmark databases show that the proposed technique is effective in terms of recognition accuracy, without incurring high computational costs in the transformation.
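
The abstract describes the embedding only in outline, so the following is a minimal sketch of the idea rather than the authors' procedure. It assumes that a molecule is given as a list of element symbols plus a list of bonds between atom indices, that the vocabulary of atom types is fixed in advance, and that each bond is counted once regardless of its order. The function name `embed_molecule` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def embed_molecule(atoms, bonds, vocabulary):
    """Map a molecule to a vector of atom and bond frequencies.

    atoms: list of element symbols, one per atom (e.g. ["C", "O", "H", "H"]).
    bonds: list of (i, j) index pairs into `atoms`, one per covalent bond.
    vocabulary: atom types that define the vector layout (an assumption;
        atoms and bonds involving symbols outside it are ignored).
    """
    vocab = sorted(vocabulary)

    # Frequency of each atom type in the molecule.
    atom_counts = Counter(atoms)

    # Frequency of each unordered pair of atom types joined by a bond.
    # Bond order (single, double, ...) is not distinguished in this sketch.
    bond_counts = Counter(
        tuple(sorted((atoms[i], atoms[j]))) for i, j in bonds
    )

    # All unordered pairs of vocabulary entries, including same-type pairs.
    pairs = list(combinations_with_replacement(vocab, 2))

    return [atom_counts[a] for a in vocab] + [bond_counts[p] for p in pairs]


# Formaldehyde: one carbon bonded to one oxygen and two hydrogens.
vector = embed_molecule(
    atoms=["C", "O", "H", "H"],
    bonds=[(0, 1), (0, 2), (0, 3)],
    vocabulary=["C", "H", "O"],
)
print(vector)  # [1, 2, 1, 0, 2, 1, 0, 0, 0]
```

With the vocabulary sorted as C, H, O, the first three components are the atom counts and the remaining six are the bond counts for the pairs C-C, C-H, C-O, H-H, H-O and O-O. Every molecule maps to a vector of the same length, so standard vector-based classifiers can be applied directly.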

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jaume Gibert (1)
  • Ernest Valveny (1)
  • Horst Bunke (2)
  1. Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
  2. Institute for Computer Science and Applied Mathematics, University of Bern, Bern, Switzerland
