Similarity-Based Clustering of Sequences Using Hidden Markov Models

  • Manuele Bicego
  • Vittorio Murino
  • Mário A. T. Figueiredo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2734)


Hidden Markov models constitute a widely employed tool for sequential data modelling; nevertheless, their use in the clustering context has been poorly investigated. In this paper a novel scheme for HMM-based sequential data clustering is proposed, inspired on the similarity-based paradigm recently introduced in the supervised learning context. With this approach, a new representation space is built, in which each object is described by the vector of its similarities with respect to a predeterminate set of other objects. These similarities are determined using hidden Markov models. Clustering is then performed in such a space. By way of this, the difficult problem of clustering of sequences is thus transposed to a more manageable format, the clustering of points (vectors of features). Experimental evaluation on synthetic and real data shows that the proposed approach largely outperforms standard HMM clustering schemes.


Hide Markov Model Similarity Space Initial State Probability Rival Penalize Competitive Learning Pervised Learning 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Jain, A., Dubes, R.: Algorithms for clustering data. Prentice Hall (1988)Google Scholar
  2. 2.
    Rabiner, L.: A tutorial on Hidden Markov Models and selected applications in speech recognition. Proc. of IEEE 77 (1989) 257–286CrossRefGoogle Scholar
  3. 3.
    Jain, A., Zongker, D.: Representation and recognition of handwritten digits using deformable templates. IEEE Trans. Pattern Analysis and Machine Intelligence 19 (1997) 1386–1391CrossRefGoogle Scholar
  4. 4.
    Graepel, T., Herbrich, R., Bollmann-Sdorra, P., Obermayer, K.: Classification on pairwise proximity data. In M. Kearns, S. Solla D.C., ed.: Advances in Neural Information Processing. Volume 11., MIT Press (1999)Google Scholar
  5. 5.
    Jacobs, D., Weinshall, D.: Classification with nonmetric distances: Image retrieval and class representation. IEEE Trans. Pattern Analysis and Machine Intelligence 22 (2000) 583–600CrossRefGoogle Scholar
  6. 6.
    Pekalska, E., Duin, R.: Automatic pattern recognition by similarity representations. Electronics Letters 37 (2001) 159–160CrossRefGoogle Scholar
  7. 7.
    Pekalska, E., Paclik, P., Duin, R.: A generalized kernel approach to dissimilaritybased classification. Journal of Machine Learning Research 2 (2002) 175–211zbMATHCrossRefMathSciNetGoogle Scholar
  8. 8.
    Pekalska, E., Duin, R.: Dissimilarity representations allow for building good classifiers. Pattern Recognition Letters 23 (2002) 943–956zbMATHCrossRefGoogle Scholar
  9. 9.
    Smyth, P.: Clustering sequences with hidden Markov models. In Mozer, M., Jordan, M., Petsche, T., eds.: Advances in Neural Information Processing. Volume 9., MIT Press (1997)Google Scholar
  10. 10.
    Panuccio, A., Bicego, M., Murino, V.: A Hidden Markov Model-based approach to sequential data clustering. In Caelli, T., Amin, A., Duin, R., Kamel, M., de Ridder, D., eds.: Structural, Syntactic and Statistical Pattern Recognition. LNCS 2396, Springer (2002) 734–742CrossRefGoogle Scholar
  11. 11.
    Rabiner, L., Lee, C., Juang, B., Wilpon, J.: HMM clustering for connected word recognition. In: Proc. of IEEE ICASSP. (1989) 405–408Google Scholar
  12. 12.
    Lee, K.: Context-dependent phonetic hidden Markov models for speakerindependent continuous speech recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 38 (1990) 599–609CrossRefGoogle Scholar
  13. 13.
    Kosaka, T., Matsunaga, S., Kuraoka, M.: Speaker-independent phone modeling based on speaker-dependent hmm’s composition and clustering. In: Int. Proc. on Acoustics, Speech, and Signal Processing. Volume 1. (1995) 441–444Google Scholar
  14. 14.
    Bahlmann, C., Burkhardt, H.: Measuring hmm similarity with the bayes probability of error and its application to online handwriting recognition. In: Proc. Int. Conf. Document Analysis and Recognition. (2001) 406–411Google Scholar
  15. 15.
    Cadez, I., Gaffney, S., Smyth, P.: A general probabilistic framework for clustering individuals. In: Proc. of ACM SIGKDD 2000. (2000)Google Scholar
  16. 16.
    Law, M., Kwok, J.: Rival penalized competitive learning for model-based sequence. In: Proc. Int. Conf. Pattern Recognition. Volume 2. (2000) 195–198Google Scholar
  17. 17.
    Xu, L., Krzyzak, A., Oja, E.: Rival penalized competitive learning for clustering analysis, RBF nets, and curve detection. IEEE Trans. on Neural Networks 4 (1993) 636–648CrossRefGoogle Scholar
  18. 18.
    Li, C.: A Bayesian Approach to Temporal Data Clustering using Hidden Markov Model Methodology. PhD thesis, Vanderbilt University (2000)Google Scholar
  19. 19.
    Li, C., Biswas, G.: Clustering sequence data using hidden Markov model representation. In: Proc. of SPIE’99 Conf. on Data Mining and Knowledge Discovery: Theory, Tools, and Technology. (1999) 14–21Google Scholar
  20. 20.
    Li, C., Biswas, G.: A bayesian approach to temporal data clustering using hidden Markov models. In: Proc. Int. Conf. on Machine Learning. (2000) 543–550Google Scholar
  21. 21.
    Li, C., Biswas, G.: Applying the Hidden Markov Model methodology for unsupervised learning of temporal data. Int. Journal of Knowledge-based Intelligent Engineering Systems 6 (2002) 152–160Google Scholar
  22. 22.
    Li, C., Biswas, G., Dale, M., Dale, P.: Matryoshka: A HMM based temporal data clustering methodology for modeling system dynamics. Intelligent Data Analysis Journal in press (2002)Google Scholar
  23. 23.
    Schwarz, G.: Estimating the dimension of a model. The Annals of Statistics 6 (1978) 461–464zbMATHMathSciNetCrossRefGoogle Scholar
  24. 24.
    Stolcke, A., Omohundro, S.: Hidden Markov Model induction by Bayesian model merging. In Hanson, S., Cowan, J., Giles, C., eds.: Advances in Neural Information Processing Systems. Volume 5., Morgan Kaufmann, San Mateo, CA (1993) 11–18Google Scholar
  25. 25.
    Cheeseman, P., Stutz, J.: Bayesian classification (autoclass): Theory and results. In: Advances in Knowledge discovery and data mining. (1996) 153–180Google Scholar
  26. 26.
    Vapnik, V.: Statistical Learning Theory. John Wiley, New York (1998)zbMATHGoogle Scholar
  27. 27.
    Dubnov, S., El-Yaniv, R., Gdalyahu, Y., Schneidman, E., Tishby, N., Yona, G.: A new nonparametric pairwise clustering algorithm based on iterative estimation of distance profiles. Machine Learning 47 (2002) 35–61zbMATHCrossRefGoogle Scholar
  28. 28.
    Theodoridis, S., Koutroumbas, K.: Pattern Recognition. Academic Press (1999)Google Scholar
  29. 29.
    Bicego, M., Murino, V.: Investigating Hidden Markov Models’ capabilities in 2D shape classification. Submitted for publication (2002)Google Scholar
  30. 30.
    Sebastian, T., Klein, P., Kimia, B.: Recognition of shapes by editing Shock Graphs. In: Proc. Int Conf. Computer Vision. (2001) 755–762Google Scholar
  31. 31.
    Bicego, M., Murino, V., Figueiredo, M.: Similarity-based classification of sequences using hidden Markov models (2002) Submitted for publication.Google Scholar
  32. 32.
    Law, M., A.K. Jain, Figueiredo, M.: Feature selection in mixture-based clustering. In: Neural Information Processing Systems — NIPS’2002, Vancouver (2002)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Manuele Bicego
    • 1
  • Vittorio Murino
    • 1
  • Mário A. T. Figueiredo
    • 2
  1. 1.Dipartimento di InformaticaUniversità di VeronaVeronaItaly
  2. 2.Instituto de TelecomunicaçõesInstituto Superior TécnicoLisboaPortugal

Personalised recommendations