Information Theoretical Kernels for Generative Embeddings Based on Hidden Markov Models

Martins, André F. T.; Bicego, Manuele; Murino, Vittorio; Aguiar, Pedro M. Q.; Figueiredo, Mário A. T.

doi:10.1007/978-3-642-14980-1_45

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6218)

Abstract

Many approaches to learning classifiers for structured objects (e.g., shapes) use generative models in a Bayesian framework. However, state-of-the-art classifiers for vectorial data (e.g., support vector machines) are learned discriminatively. A generative embedding is a mapping from the object space into a fixed-dimensional feature space, induced by a generative model that is usually learned from data. The fixed dimensionality of these feature spaces permits the use of state-of-the-art discriminative machines based on vectorial representations, thus bringing together the best of the discriminative and generative paradigms.

Using a generative embedding involves two steps: (i) defining and learning the generative model used to build the embedding; (ii) discriminatively learning a (possibly kernel-based) classifier on the adopted feature space. The literature on generative embeddings is essentially focused on step (i), usually adopting some standard off-the-shelf tool (e.g., an SVM with a linear or RBF kernel) for step (ii). In this paper, we follow a different route, by combining several generative embeddings based on hidden Markov models (including the classical Fisher score) with the recently proposed non-extensive information theoretic kernels. We test this methodology on a 2D shape recognition task, showing that the proposed method is competitive with the state of the art.
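The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the HMM parameters `pi`, `A`, and `B` are fixed by hand rather than learned, the embedding is the average state posterior (one simple HMM-based embedding that happens to be a probability vector, not necessarily one of those studied in the paper), and the Jensen-Shannon kernel, ln 2 minus the Jensen-Shannon divergence, stands in for the non-extensive information theoretic kernels. All function names are hypothetical.

```python
import numpy as np

def state_occupancy_embedding(obs, pi, A, B):
    """Map a discrete observation sequence to its average state posterior.

    pi: (K,) initial state probabilities; A: (K, K) transition matrix;
    B: (K, M) emission probabilities. Uses scaled forward-backward.
    The HMM is assumed given here; in practice it would be learned.
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)

    # Forward pass, normalised at each step to avoid underflow.
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, reusing the forward scaling factors.
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

    gamma = alpha * beta                        # state posterior at each step
    gamma /= gamma.sum(axis=1, keepdims=True)   # guard against rounding drift
    return gamma.mean(axis=0)                   # length-K vector summing to 1

def shannon_entropy(p):
    """Shannon entropy in nats, with 0 * log 0 taken as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def jensen_shannon_kernel(p, q):
    """ln 2 minus the Jensen-Shannon divergence of two probability vectors."""
    js = shannon_entropy(0.5 * (p + q)) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))
    return np.log(2.0) - js

# Toy 2-state, 3-symbol HMM (illustrative values only).
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])

# Sequences of different lengths all map to vectors of the same dimension.
seqs = [[0, 0, 1, 0], [2, 2, 1, 2, 2], [0, 2, 0, 2]]
embeddings = [state_occupancy_embedding(s, pi, A, B) for s in seqs]
gram = np.array([[jensen_shannon_kernel(p, q) for q in embeddings] for p in embeddings])
```

The resulting `gram` matrix could then be handed to any kernel classifier that accepts a precomputed Gram matrix, which corresponds to step (ii).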

Author information

Authors and Affiliations

Computer Science Department, University of Verona, Verona, Italy
Manuele Bicego & Vittorio Murino
Istituto Italiano di Tecnologia (IIT), Genova, Italy
Manuele Bicego & Vittorio Murino
Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal
André F. T. Martins & Mário A. T. Figueiredo
Instituto de Sistemas e Robótica, Instituto Superior Técnico, Lisboa, Portugal
Pedro M. Q. Aguiar

Editor information

Editors and Affiliations

Computer Vision and Pattern Recognition Group, Computer Science, University of York, Heslington, YO10-5DD, York, United Kingdom
Edwin R. Hancock
Department of Computer Science, University of York, YO10 5DD, UK
Richard C. Wilson
Centre for Vision, Speech and Signal Proc (CVSSP), University of Surrey, Guildford, GU2 7XH, Surrey, United Kingdom
Terry Windeatt
Electrical and Electronics Engineering Department, Middle East Technical University, 06531, Ankara, Turkey
Ilkay Ulusoy
Department of Computer Science and Artificial Intelligence, University of Alicante, P.O.B. 99, E-03080, Alicante, Spain
Francisco Escolano

About this paper

Cite this paper

Martins, A.F.T., Bicego, M., Murino, V., Aguiar, P.M.Q., Figueiredo, M.A.T. (2010). Information Theoretical Kernels for Generative Embeddings Based on Hidden Markov Models. In: Hancock, E.R., Wilson, R.C., Windeatt, T., Ulusoy, I., Escolano, F. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2010. Lecture Notes in Computer Science, vol 6218. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14980-1_45

DOI: https://doi.org/10.1007/978-3-642-14980-1_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14979-5
Online ISBN: 978-3-642-14980-1
eBook Packages: Computer Science, Computer Science (R0)
