Length of Stay-Based Clustering Methods for Patient Grouping

  • Elia El-Darzi
  • Revlin Abbi
  • Christos Vasilakis
  • Florin Gorunescu
  • Marina Gorunescu
  • Peter Millard
Part of the Studies in Computational Intelligence book series (SCI, volume 189)


Length of stay (LOS) is often used as a proxy measure of a patient's resource consumption because resource consumption is difficult to measure directly, whereas LOS is easy to calculate. Grouping patient spells according to their LOS has proved to be a challenge in health care applications due to the inherent variability in the LOS distribution. Sound methods for LOS-based patient grouping should lead to better planning of bed allocation and of patient admission and discharge. Grouping patient spells according to their LOS in a computationally efficient manner is still a research issue that has not been fully addressed. For instance, LOS intervals used to group patient spells (e.g. 0-3 days, 4-9 days, 10-21 days, etc.) have previously been defined by non-algorithmic approaches: clinical judgement, visual inspection of the LOS distribution, or the perceived casemix. The aim of this paper is to present a novel methodology for grouping patients according to their length of stay, based on fitting Gaussian mixture models to LOS observations. This method was developed as part of an innovative prediction tool that helps identify groups of patients exhibiting similar resource consumption levels, as approximated by patient LOS. As part of evaluating the approach, we also compare it to two alternative clustering approaches, K-means and the two-step algorithm. Computational results on a skewed LOS dataset show that this method is superior to the alternative clustering approaches in its ability to extract clinically meaningful patient groups.
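The idea of fitting a Gaussian mixture model to LOS observations can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain EM fit with a fixed number of components, a quantile-based initialisation, and synthetic LOS values invented purely for illustration. The names `normal_pdf`, `fit_gmm_1d` and `assign_group` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, n_iter=500, tol=1e-9, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch).

    Returns (weights, means, variances) with components sorted by mean.
    """
    n = len(values)
    ordered = sorted(values)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(values) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in values) / n, min_var)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp, ll = [], 0.0
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            resp.append([d / total for d in dens])
            ll += math.log(total)
        # M-step: re-estimate weights, means and variances from the posteriors.
        for j in range(k):
            nk = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nk / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nk
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nk,
                min_var)
        # Stop when the log-likelihood no longer improves noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = sorted(range(k), key=lambda j: means[j])
    return ([weights[j] for j in order],
            [means[j] for j in order],
            [variances[j] for j in order])


def assign_group(x, weights, means, variances):
    """Index of the component with the highest posterior probability for x."""
    scores = [w * normal_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


# Synthetic LOS values in days (made up): many short stays, fewer long stays.
rng = random.Random(42)
los = ([abs(rng.gauss(3, 1)) for _ in range(300)]
       + [abs(rng.gauss(30, 5)) for _ in range(100)])

weights, means, variances = fit_gmm_1d(los, k=2)
print("weights:", weights)
print("means:", means)
print("group for a 2-day spell:", assign_group(2.0, weights, means, variances))
```

Each fitted component plays the role of one LOS group, and a spell is assigned to the component with the highest posterior probability. In practice the number of components would itself need to be chosen rather than fixed in advance as it is here.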


Keywords: length of stay, patient grouping, Gaussian mixture model, clustering





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Elia El-Darzi (1)
  • Revlin Abbi (1)
  • Christos Vasilakis (2)
  • Florin Gorunescu (3)
  • Marina Gorunescu (4)
  • Peter Millard (5)

  1. University of Westminster, London, UK
  2. University College London, London, UK
  3. University of Medicine and Pharmacy of Craiova, Romania
  4. University of Craiova, Romania
  5. St Georges University of London, UK
