Temporal Pattern Generation Using Hidden Markov Model Based Unsupervised Classification

  • Cen Li
  • Gautam Biswas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1642)


This paper describes a clustering methodology for temporal data using a hidden Markov model (HMM) representation. The proposed method improves upon existing HMM-based clustering methods in two ways: (i) it enables HMMs to dynamically change their model structure to obtain a better-fitting model of the data during the clustering process, and (ii) it provides an objective criterion function to automatically select the clustering partition. The algorithm is presented in terms of four nested levels of search: (i) the search for the number of clusters in a partition, (ii) the search for the structure of a fixed-size partition, (iii) the search for the HMM structure of each cluster, and (iv) the search for the parameter values of each HMM. Preliminary experiments with artificially generated data demonstrate the effectiveness of the proposed methodology.
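The four nested levels of search can be pictured as the control-flow skeleton below. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function names, the "stop once the score stops improving" rules, the round-robin initial partition, and the penalized-likelihood score are all hypothetical. To keep the example runnable with the standard library only, the demo plugs in a toy Bernoulli stand-in (`toy_fit`, `toy_score`, `toy_assign`) where a real HMM trainer and criterion function would go.

```python
import math

def search_hmm_structure(cluster, fit_params, score, max_states):
    """Levels 3 and 4: try growing model sizes and keep the best-scoring one."""
    best_model, best_score = None, float("-inf")
    for n_states in range(1, max_states + 1):
        model = fit_params(cluster, n_states)   # level 4: parameter search
        s = score(model, cluster)
        if s <= best_score:                     # assumed rule: stop when no gain
            break
        best_model, best_score = model, s
    return best_model, best_score

def search_partition_structure(data, k, assign, fit_params, score,
                               max_states, max_iter=20):
    """Level 2: alternate between refitting cluster models and reassigning."""
    labels = [i % k for i in range(len(data))]  # arbitrary initial partition
    models, total = [], float("-inf")
    for _ in range(max_iter):
        clusters = [[x for x, lab in zip(data, labels) if lab == c]
                    for c in range(k)]
        if any(not c for c in clusters):        # reject degenerate partitions
            return labels, [], float("-inf")
        fitted = [search_hmm_structure(c, fit_params, score, max_states)
                  for c in clusters]
        models = [m for m, _ in fitted]
        total = sum(s for _, s in fitted)
        new_labels = [assign(x, models) for x in data]
        if new_labels == labels:                # partition is stable
            break
        labels = new_labels
    return labels, models, total

def search_num_clusters(data, assign, fit_params, score, max_k, max_states):
    """Level 1: grow the number of clusters while the partition score improves."""
    best, best_score = ([], []), float("-inf")
    for k in range(1, max_k + 1):
        labels, models, total = search_partition_structure(
            data, k, assign, fit_params, score, max_states)
        if total <= best_score:
            break
        best, best_score = (labels, models), total
    return best[0], best[1], best_score

# --- Toy stand-in (NOT an HMM): one smoothed Bernoulli rate per cluster. ---
def toy_loglik(p, seq):
    return sum(math.log(p) if x else math.log(1.0 - p) for x in seq)

def toy_fit(cluster, n_states):
    ones = sum(sum(seq) for seq in cluster)
    n = sum(len(seq) for seq in cluster)
    return (n_states, (ones + 1) / (n + 2))     # n_states only affects the penalty

def toy_score(model, cluster):
    n_states, p = model
    n = sum(len(seq) for seq in cluster)
    loglik = sum(toy_loglik(p, seq) for seq in cluster)
    return loglik - 0.5 * n_states * math.log(n)  # assumed BIC-style penalty

def toy_assign(seq, models):
    return max(range(len(models)), key=lambda c: toy_loglik(models[c][1], seq))

data = [
    [0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0],  # mostly zeros
    [1, 1, 0, 1, 1, 1, 1, 1], [1, 1, 1, 1, 0, 1, 1, 1],  # mostly ones
    [0, 1, 0, 0, 0, 0, 0, 0], [1, 0, 1, 1, 1, 1, 1, 1],
]
labels, models, best_score = search_num_clusters(
    data, toy_assign, toy_fit, toy_score, max_k=4, max_states=3)
```

On this toy data the outer search settles on two clusters, separating the mostly-zero sequences from the mostly-one sequences. Passing the fitting, scoring, and assignment steps as callables keeps the nesting visible and lets a real HMM implementation replace the stand-in without changing the search loops.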


Hidden Markov Model, Bayesian Information Criterion, Marginal Likelihood, Finite State Automaton
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Cen Li (1)
  • Gautam Biswas (1)
  1. Department of Computer Science, Vanderbilt University, Nashville, USA
