
Orthogonal Mixture of Hidden Markov Models

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12457)

Abstract

Mixtures of Hidden Markov Models (MHMM) are widely used for clustering of sequential data, by letting each cluster correspond to a Hidden Markov Model (HMM). Expectation Maximization (EM) is the standard approach for learning the parameters of an MHMM. However, due to the non-convexity of the objective function, EM can converge to poor local optima. To tackle this problem, we propose a novel method, the Orthogonal Mixture of Hidden Markov Models (oMHMM), which aims to direct the search away from candidate solutions that include very similar HMMs, since those do not fully exploit the power of the mixture model. The directed search is achieved by including a penalty in the objective function that favors higher orthogonality between the transition matrices of the HMMs. Experimental results on both simulated and real-world datasets show that the oMHMM consistently finds equally good or better local optima than the standard EM for an MHMM; for some datasets, the clustering performance is significantly improved by our novel oMHMM (up to 55 percentage points w.r.t. the v-measure). Moreover, the oMHMM may also decrease the computational cost substantially, reducing the number of iterations down to a fifth of those required by MHMM using standard EM.
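The abstract does not spell out the exact form of the penalty; a minimal sketch of the underlying idea, assuming that similarity between two HMMs' transition matrices is measured by their Frobenius inner product (which is zero when the matrices are fully orthogonal), might look like:

```python
import numpy as np

def frobenius_inner_product(A, B):
    """Frobenius inner product <A, B>_F = sum_ij A_ij * B_ij.

    Two matrices are orthogonal in this sense when the value is 0.
    """
    return float(np.sum(A * B))

def orthogonality_penalty(transition_matrices):
    """Sum of pairwise Frobenius inner products over all HMM pairs.

    Adding this term (suitably weighted) to the EM objective penalizes
    mixtures whose components have near-identical transition dynamics,
    steering the search toward more orthogonal (dissimilar) HMMs.
    """
    penalty = 0.0
    K = len(transition_matrices)
    for i in range(K):
        for j in range(i + 1, K):
            penalty += frobenius_inner_product(transition_matrices[i],
                                               transition_matrices[j])
    return penalty

# Two HMMs with very similar "sticky" dynamics incur a high penalty ...
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.8, 0.2], [0.2, 0.8]])
# ... while an HMM with opposite (near-orthogonal) dynamics incurs a low one.
C = np.array([[0.1, 0.9], [0.9, 0.1]])
print(orthogonality_penalty([A, B]))  # large: similar dynamics
print(orthogonality_penalty([A, C]))  # small: dissimilar dynamics
```

This is only an illustration of the penalty's effect, not the paper's implementation; the function names and the two-state example matrices are made up for the sketch.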


Notes

  1. Spectral learning of MHMM [30] improves the time complexity but not the clustering performance. Sparse MHMM [24], which requires data from a set of entities connected in a graph with known topology, can be used together with the oMHMM.

  2. Available from the NCBI Sequence Read Archive (SRA) under accession number SRP074289; for pre-processing of the data, see [18].

References

  1. Aghabozorgi, S., Seyed Shirkhorshidi, A., Ying Wah, T.: Time-series clustering - a decade review. Inf. Syst. 53, 16–38 (2015)

  2. Agrawal, A., Verschueren, R., Diamond, S., Boyd, S.: A rewriting system for convex optimization problems. J. Control Decis. 5(1), 42–60 (2018)

  3. Altosaar, J., Ranganath, R., Blei, D.: Proximity variational inference. In: AISTATS (2017)

  4. Bache, K., Lichman, M.: UCI machine learning repository (2013)

  5. Baum, L., Petrie, T.: Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Stat. 37(6), 1554–1563 (1966)

  6. Bishop, C.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)

  7. Bishop, C.: Model-based machine learning. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 371 (2012)

  8. Blei, D., Kucukelbir, A., McAuliffe, J.: Variational inference: a review for statisticians. J. Am. Stat. Assoc. 112(518), 859–877 (2017)

  9. Chamroukhi, F., Nguyen, H.: Model-based clustering and classification of functional data. Wiley Interdiscip. Rev. Data Min. Knowl. Disc. 9(4), e1298 (2019)

  10. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B (Methodol.) 39(1), 1–22 (1977)

  11. Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)

  12. Dias, J., Vermunt, J., Ramos, S.: Mixture hidden Markov models in finance research. In: Advances in Data Analysis, Data Handling and Business Intelligence, pp. 451–459 (2009)

  13. Esmaili, N., Piccardi, M., Kruger, B., Girosi, F.: Correction: analysis of healthcare service utilization after transport-related injuries by a mixture of hidden Markov models. PLoS One 14(4), e0206274 (2019)

  14. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (2013)

  15. Jebara, T., Song, Y., Thadani, K.: Spectral clustering and embedding with hidden Markov models. In: ECML 2007. LNCS, vol. 4701, pp. 164–175 (2007)

  16. Alon, J., Sclaroff, S., Kollios, G., Pavlovic, V.: Discovering clusters in motion time-series data. In: CVPR (2003)

  17. Kulesza, A., Taskar, B.: Determinantal point processes for machine learning. Found. Trends Mach. Learn. 5(2–3), 123–286 (2012)

  18. Leung, M., et al.: Single-cell DNA sequencing reveals a late-dissemination model in metastatic colorectal cancer. Genome Res. 27(8), 1287–1299 (2017)

  19. Ma, Q., Zheng, J., Li, S., Cottrell, G.: Learning representations for time series clustering. Adv. Neural Inf. Process. Syst. 32, 3781–3791 (2019)

  20. Qiao, M., Bian, W., Xu, R.Y.D., Tao, D.: Diversified hidden Markov models for sequential labeling. IEEE Trans. Knowl. Data Eng. 27(11), 2947–2960 (2015)

  21. McGibbon, R., Ramsundar, B., Sultan, M., Kiss, G., Pande, V.: Understanding protein dynamics with L1-regularized reversible hidden Markov models. In: Proceedings of the 31st International Conference on Machine Learning, vol. 32, no. 2, pp. 1197–1205 (2014)

  22. Montanez, G., Amizadeh, S., Laptev, N.: Inertial hidden Markov models: modeling change in multivariate time series. In: AAAI Conference on Artificial Intelligence (2015)

  23. Oates, T., Firoiu, L., Cohen, P.: Clustering time series with hidden Markov models and dynamic time warping. In: IJCAI-99 Workshop on Neural, Symbolic and Reinforcement Learning Methods for Sequence Learning, pp. 17–21 (1999)

  24. Pernes, D., Cardoso, J.S.: SpaMHMM: sparse mixture of hidden Markov models for graph connected entities. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–10 (2019)

  25. Rand, W.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846–850 (1971)

  26. Rosenberg, A., Hirschberg, J.: V-measure: a conditional entropy-based external cluster evaluation measure. In: EMNLP-CoNLL (2007)

  27. Safinianaini, N., Boström, H., Kaldo, V.: Gated hidden Markov models for early prediction of outcome of internet-based cognitive behavioral therapy. In: Riaño, D., Wilk, S., ten Teije, A. (eds.) AIME 2019. LNCS (LNAI), vol. 11526, pp. 160–169. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21642-9_22

  28. Safinianaini, N., De Souza, C., Lagergren, J.: CopyMix: mixture model based single-cell clustering and copy number profiling using variational inference. bioRxiv (2020). https://doi.org/10.1101/2020.01.29.926022

  29. Smyth, P.: Clustering sequences with hidden Markov models. In: Advances in Neural Information Processing Systems (1997)

  30. Subakan, C., Traa, J., Smaragdis, P.: Spectral learning of mixture of hidden Markov models. Adv. Neural Inf. Process. Syst. 27, 2249–2257 (2014)

  31. Tao, L., Elhamifar, E., Khudanpur, S., Hager, G., Vidal, R.: Sparse hidden Markov models for surgical gesture classification and skill evaluation. In: Information Processing in Computer-Assisted Interventions (IPCAI), pp. 167–177 (2012)

  32. Wang, Q., Schuurmans, D.: Improved estimation for unsupervised part-of-speech tagging. In: Proceedings of International Conference on Natural Language Processing and Knowledge Engineering, pp. 219–224 (2005)

  33. Xing, Z., Pei, J., Keogh, E.: A brief survey on sequence classification. ACM SIGKDD Explor. Newsl. 12(1), 40–48 (2010)

  34. Qi, Y., Paisley, J., Carin, L.: Music analysis using hidden Markov mixture models. IEEE Trans. Signal Process. 55(11), 5209–5224 (2007)


Acknowledgments

We thank Johan Fylling, Mohammadreza Mohaghegh Neyshabouri, and Diogo Pernes for their great help during the preparation of this paper.

Author information

Corresponding author

Correspondence to Negar Safinianaini.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Safinianaini, N., de Souza, C.P.E., Boström, H., Lagergren, J. (2021). Orthogonal Mixture of Hidden Markov Models. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science(), vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_29


  • DOI: https://doi.org/10.1007/978-3-030-67658-2_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67657-5

  • Online ISBN: 978-3-030-67658-2

  • eBook Packages: Computer Science; Computer Science (R0)
