Machine Learning, Volume 75, Issue 1, pp 3–35

Graph kernels based on tree patterns for molecules


DOI: 10.1007/s10994-008-5086-2

Cite this article as:
Mahé, P. & Vert, JP. Mach Learn (2009) 75: 3. doi:10.1007/s10994-008-5086-2


Motivated by chemical applications, we revisit and extend a family of positive definite kernels for graphs based on the detection of common subtrees, initially proposed by Ramon and Gärtner (Proceedings of the first international workshop on mining graphs, trees and sequences, pp. 65–74, 2003). We propose new kernels with a parameter that controls the complexity of the subtrees used as features to represent the graphs. This parameter allows smooth interpolation between classical graph kernels based on the count of common walks, on the one hand, and kernels that emphasize the detection of large common subtrees, on the other. We also propose two modular extensions to this formulation. The first extension increases the number of subtrees that define the feature space, and the second removes noisy features from the graph representations. We experimentally validate these new kernels with support vector machines on problems of toxicity and anti-cancer activity prediction for small molecules.
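
As a point of reference for the walk-count end of this family, the following is a minimal sketch of a fixed-length walk kernel on labeled graphs, computed on the direct product graph. It is not the tree-pattern kernel proposed in the article. The function name `walk_kernel`, the (labels, adjacency) graph encoding, and the toy heavy-atom graphs are assumptions made for illustration only.

```python
from itertools import product


def walk_kernel(g1, g2, n_steps):
    """Count pairs of walks with n_steps edges whose vertex labels match.

    Each graph is a (labels, adjacency) pair (an illustrative encoding):
    labels maps vertex -> label, adjacency maps vertex -> list of neighbours.
    """
    labels1, adj1 = g1
    labels2, adj2 = g2
    # Vertices of the direct product graph: vertex pairs with equal labels.
    # counts[(u, v)] = number of matching walk pairs of the current length
    # that start at u in g1 and at v in g2 (length 0: the pair itself).
    counts = {
        (u, v): 1
        for u, v in product(labels1, labels2)
        if labels1[u] == labels2[v]
    }
    for _ in range(n_steps):
        extended = {}
        for (u, v) in counts:
            total = 0
            # Extend by one edge in both graphs simultaneously; pairs with
            # mismatched labels are absent from counts and contribute 0.
            for u2 in adj1[u]:
                for v2 in adj2[v]:
                    total += counts.get((u2, v2), 0)
            extended[(u, v)] = total
        counts = extended
    return sum(counts.values())


# Toy heavy-atom graphs (hypothetical examples): C-C-O and C-O.
ethanol = ({0: "C", 1: "C", 2: "O"}, {0: [1], 1: [0, 2], 2: [1]})
methanol = ({0: "C", 1: "O"}, {0: [1], 1: [0]})

k0 = walk_kernel(ethanol, methanol, 0)  # matching single atoms
k1 = walk_kernel(ethanol, methanol, 1)  # matching one-edge walks
```

Evaluating such a function on every pair of molecules yields a Gram matrix, which is the form of input that a support vector machine with a precomputed kernel expects.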


Graph kernels, Support vector machines, Chemoinformatics

Copyright information

© The Author(s) 2008

Authors and Affiliations

  1. Centre for Computational Biology, Ecole des Mines de Paris, ParisTech, Fontainebleau, France
  2. Institut Curie, Paris, France
  3. INSERM, U900, Paris, France