Abstract
This paper presents a constructive algorithm that learns a hierarchical tree of hidden Markov models directly from data in an unsupervised manner. The method is applied to health insurance transaction data so that profiles with similar local temporal behaviour are grouped together. With the judicious incorporation of limited prior information, profiles can be separated into sub-behavioural groups, which provides a technique for large-scale automatic labelling of data. In the health insurance application, limited information about the medical functions used in a medical procedure makes it possible to label some individual medical transactions according to whether or not they relate to a particular medical condition. This automatic labelling adds value to the collected transactional database for possible further applications, e.g. public health studies.
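The abstract does not spell out the construction, so the following is only a minimal sketch of the underlying idea of grouping sequences by how well candidate hidden Markov models explain them. It is not the authors' algorithm: it shows a single flat assignment step rather than a hierarchical tree, and the two models (`steady`, `alternating`), their parameters, and the toy binary sequences are all hypothetical, chosen purely for illustration.

```python
import math

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | model) for a discrete HMM."""
    n = len(start)
    # Joint probability of each state and the first observation.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    total = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        total += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return total

def assign(seq, models):
    """Return the name of the model under which seq is most likely."""
    return max(models, key=lambda name: log_likelihood(seq, *models[name]))

# Hypothetical two-state, two-symbol models (assumed, not from the paper).
start = [0.5, 0.5]
emit = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "steady": (start, [[0.9, 0.1], [0.1, 0.9]], emit),       # states persist
    "alternating": (start, [[0.1, 0.9], [0.9, 0.1]], emit),  # states flip
}

profiles = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
}
groups = {name: assign(seq, models) for name, seq in profiles.items()}
print(groups)  # {'a': 'steady', 'b': 'alternating'}
```

In an unsupervised setting the model parameters would themselves be re-estimated from the sequences assigned to each group, and a tree could be grown by repeating the split within each group; both steps are omitted here.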
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tsoi, A.C., Zhang, S., Hagenbuchner, M. (2006). Hierarchical Hidden Markov Models: An Application to Health Insurance Data. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_19
DOI: https://doi.org/10.1007/11677437_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science, Computer Science (R0)