A Generative Multiset Kernel for Structured Data

  • Davide Bacciu
  • Alessio Micheli
  • Alessandro Sperduti
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7552)


The paper introduces a novel approach for defining efficient generative kernels for structured data, based on the concept of multisets and Jaccard similarity. The multiset feature space makes it possible to enrich the adaptive kernel with syntactic information on structure matching. The proposed approach is validated using an input-driven hidden Markov model for trees as the generative model, but it is general enough to be straightforwardly applicable to any probabilistic latent variable model. The experimental evaluation shows that the proposed Jaccard kernel achieves superior classification performance with respect to the Fisher kernel, while consistently reducing the computational requirements.
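As a rough illustration of the idea of Jaccard similarity over multisets, the following minimal sketch compares two trees through the multisets of hidden states assigned to their nodes. The abstract does not give the kernel's exact definition, so this assumes the common generalized form (sum of element-wise minimum counts divided by sum of element-wise maximum counts); the function name `multiset_jaccard` and the state labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def multiset_jaccard(a: Counter, b: Counter) -> float:
    """Generalized Jaccard similarity between two multisets.

    Assumed form: sum of min counts over sum of max counts.
    Two empty multisets are treated as identical (similarity 1.0).
    """
    keys = set(a) | set(b)
    # Counter returns 0 for missing keys, so absent elements count as 0.
    intersection = sum(min(a[k], b[k]) for k in keys)
    union = sum(max(a[k], b[k]) for k in keys)
    return intersection / union if union else 1.0


# Hypothetical hidden-state labels, one per node, for two trees.
# In a generative setting these would come from a trained latent variable model.
tree_x_states = [0, 0, 1, 2, 2, 2]
tree_y_states = [0, 1, 1, 2, 2]

x = Counter(tree_x_states)
y = Counter(tree_y_states)
similarity = multiset_jaccard(x, y)
print(similarity)
```

Under this assumed form, the similarity depends only on state counts, which suggests why such a kernel could be cheaper to evaluate than one based on model gradients.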


Feature Space, Hidden Markov Model, Hidden State, Jaccard Similarity, State Transition Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Davide Bacciu (1)
  • Alessio Micheli (1)
  • Alessandro Sperduti (2)
  1. Dipartimento di Informatica, Università di Pisa, Italy
  2. Dipartimento di Matematica, Università di Padova, Italy
