Abstract
Exploratory data analysis using data mining techniques is becoming more popular for investigating subtle relationships in health data for which direct data collection trials would not be possible. In such cases, health data mining that involves clustering large, complex data sets is often limited by a lack of key indicative variables. When a conventional clustering technique is then applied, the results may be too imprecise, or the resulting clusters may not correspond to expectations. This paper suggests an approach that offers a greater range of choice for generating potential clusters of interest, from which a better outcome might in turn be obtained by aggregating the results. An example use case, the characterization of health services utilization according to socio-demographic background, is discussed, and the blended clustering approach being taken for it is described.
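The abstract does not specify how the candidate clusterings are aggregated. As a rough illustration only, the sketch below assumes a co-association scheme: each candidate clustering votes on whether two records belong together, and records linked by a majority of votes are merged into one consensus group. The function names, the 0.5 threshold, and the toy label vectors are all hypothetical and are not taken from the paper.

```python
def co_association(labelings):
    """Fraction of candidate clusterings in which each pair of records shares a cluster."""
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same records")
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


def consensus_labels(co, threshold=0.5):
    """Merge records connected by co-association >= threshold (connected components)."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already placed in an earlier group
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical candidate clusterings of six records, e.g. produced by
# different techniques or settings. Label values are arbitrary per run.
candidates = [
    [0, 0, 0, 1, 1, 1],  # two clusters
    [0, 0, 1, 1, 2, 2],  # three clusters
    [1, 1, 1, 0, 0, 0],  # same split as the first run, labels swapped
]

co = co_association(candidates)
consensus = consensus_labels(co)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

Because the aggregation works on pairwise agreement rather than on raw label values, the candidate runs may use different numbers of clusters and arbitrary label names. A stricter threshold yields smaller, more conservative consensus groups.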
Copyright information
© 2010 IFIP
About this paper
Cite this paper
Mehar, A.M., Maeder, A., Matawie, K., Ginige, A. (2010). Blended Clustering for Health Data Mining. In: Takeda, H. (eds) E-Health. E-Health 2010. IFIP Advances in Information and Communication Technology, vol 335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15515-4_14
DOI: https://doi.org/10.1007/978-3-642-15515-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15514-7
Online ISBN: 978-3-642-15515-4
eBook Packages: Computer Science (R0)