Structural, Syntactic, and Statistical Pattern Recognition

Volume 6218 of the series Lecture Notes in Computer Science, pp. 463-472

Information Theoretical Kernels for Generative Embeddings Based on Hidden Markov Models

  • André F. T. Martins, affiliated with Instituto de Telecomunicações, Instituto Superior Técnico
  • Manuele Bicego, affiliated with Computer Science Department, University of Verona; Istituto Italiano di Tecnologia (IIT)
  • Vittorio Murino, affiliated with Computer Science Department, University of Verona; Istituto Italiano di Tecnologia (IIT)
  • Pedro M. Q. Aguiar, affiliated with Instituto de Sistemas e Robótica, Instituto Superior Técnico
  • Mário A. T. Figueiredo, affiliated with Instituto de Telecomunicações, Instituto Superior Técnico

Abstract

Many approaches to learning classifiers for structured objects (e.g., shapes) use generative models in a Bayesian framework. However, state-of-the-art classifiers for vectorial data (e.g., support vector machines) are learned discriminatively. A generative embedding is a mapping from the object space into a fixed-dimensional feature space, induced by a generative model that is usually learned from data. The fixed dimensionality of these feature spaces permits the use of state-of-the-art discriminative machines based on vectorial representations, thus bringing together the best of the discriminative and generative paradigms.

Using a generative embedding involves two steps: (i) defining and learning the generative model used to build the embedding; (ii) discriminatively learning a (possibly kernel-based) classifier on the adopted feature space. The literature on generative embeddings focuses essentially on step (i), usually adopting some standard off-the-shelf tool (e.g., an SVM with a linear or RBF kernel) for step (ii). In this paper, we follow a different route, combining several hidden Markov model-based generative embeddings (including the classical Fisher score) with the recently proposed non-extensive information theoretic kernels. We test this methodology on a 2D shape recognition task, showing that the proposed method is competitive with the state of the art.
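
The two steps above can be illustrated with a minimal sketch. It is not the method evaluated in the chapter: the embedding used here (time-averaged state posteriors of a discrete HMM) and the Jensen-Shannon kernel are assumptions chosen for brevity. The Jensen-Shannon kernel is used as a simple stand-in for the non-extensive information theoretic kernels named in the abstract. The HMM parameters `PI`, `A`, `B` and all function names are hypothetical; in practice the model would be learned from data.

```python
import math

# Hypothetical 2-state, 2-symbol HMM (assumed already learned in step (i)).
PI = [0.6, 0.4]                    # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]       # state transition probabilities
B = [[0.9, 0.1], [0.2, 0.8]]       # emission probabilities B[state][symbol]

def state_posteriors(obs, pi, A, B):
    """Scaled forward-backward pass; returns gamma[t][i] = P(state i at t | obs)."""
    n, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            a = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            a = [B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
                 for j in range(n)]
        c = sum(a)                 # scaling factor avoids numerical underflow
        alpha.append([x / c for x in a])
        scale.append(c)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(n)]
        s = sum(g)
        gamma.append([x / s for x in g])
    return gamma

def embed(obs, pi, A, B):
    """Map a variable-length sequence to a fixed-length probability vector
    (expected fraction of time spent in each hidden state)."""
    gamma = state_posteriors(obs, pi, A, B)
    return [sum(g[i] for g in gamma) / len(gamma) for i in range(len(pi))]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def js_kernel(p, q):
    """Jensen-Shannon kernel: ln 2 minus the Jensen-Shannon divergence."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return math.log(2) - (entropy(m) - (entropy(p) + entropy(q)) / 2)

def gram_matrix(seqs, pi, A, B):
    """Kernel (Gram) matrix over the embedded sequences, for use in step (ii)."""
    feats = [embed(s, pi, A, B) for s in seqs]
    return [[js_kernel(p, q) for q in feats] for p in feats]

if __name__ == "__main__":
    sequences = [[0, 0, 1, 0], [1, 1, 0, 1, 1, 1]]   # different lengths
    for row in gram_matrix(sequences, PI, A, B):
        print(row)
```

Both sequences map to vectors of the same length even though their lengths differ, which is what allows a vector-based discriminative learner to be applied. The resulting Gram matrix could then be passed to any kernel classifier that accepts a precomputed kernel, such as an SVM.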