Progress in Artificial Intelligence, Volume 7, Issue 1, pp 41–53

Impact of time series discretization on intensive care burn unit survival classification

  • Isidoro J. Casanova
  • Manuel Campos
  • Jose M. Juarez
  • Antonio Fernandez-Fernandez-Arroyo
  • Jose A. Lorente
Regular Paper

Abstract

In the preprocessing step of a knowledge discovery process, the discretization method selected can have a remarkable impact on the performance and accuracy of classification algorithms. In this article, we analyze and compare expert discretization and automatic discretization algorithms. In particular, we study their impact on predicting the survival of patients in intensive care burn units. We assess the quality of the different discretization algorithms by analyzing the number of intervals generated, the number of patterns produced, and the classification performance in a specific clinical problem. Our results show that many of the automatic algorithms underperform expert discretization and that it is necessary to take into account the correlation among continuous features to obtain the best accuracy.

Keywords

Discretization, Burn unit, Sequential patterns, Survival classification

Notes

Acknowledgements

This work was partially funded by the Spanish Ministry of Economy and Competitiveness under project TIN2013-45491-R, the European Fund for Regional Development (EFRD), and Instituto de Salud Carlos III (Ref: FIS PI 12/2898).

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Computer Science Faculty, University of Murcia, Murcia, Spain
  2. University Hospital of Getafe, Getafe, Spain
  3. European University of Madrid, Madrid, Spain
  4. CIBER Enfermedades Respiratorias, Madrid, Spain
