
LR-SDiscr: An Efficient Algorithm for Supervised Discretization

  • Habiba Drias
  • Hadjer Moulai
  • Nourelhouda Rehkab
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10751)

Abstract

Discretization is the process of transforming continuous attributes into discrete ones. It is increasingly important, as continuous data are common in domains such as health and industry. This paper describes a new supervised discretization method, LR-SDiscr (Left to Right Supervised Discretization), based on an LR (Left to Right) scanning technique. Using both merging and partitioning operations, LR-SDiscr discretizes the data in a single pass, which reduces the complexity of the process and ensures scalability. Because the algorithm accepts any discretization measure as input, various measures can be tested and compared. Preliminary results of experiments designed for classification purposes are encouraging.
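
To make the idea of a single left-to-right pass with a pluggable measure concrete, here is a minimal illustrative sketch. It is not the LR-SDiscr algorithm itself, whose details are not given in this abstract. The function name `lr_discretize`, the entropy measure, the `threshold` parameter, and the rule "merge while the measure of the merged interval stays at or below the threshold, otherwise cut" are all assumptions made for illustration.

```python
from collections import Counter
from math import log2
from typing import Callable, Hashable, List, Sequence


def entropy(counts: Counter) -> float:
    """Class entropy of an interval, given its class counts (one possible measure)."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)


def lr_discretize(
    values: Sequence[float],
    labels: Sequence[Hashable],
    measure: Callable[[Counter], float] = entropy,  # hypothetical plug-in point
    threshold: float = 0.5,                          # hypothetical tuning knob
) -> List[float]:
    """Return cut points found in one left-to-right scan of the sorted values."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts: List[float] = []
    current: Counter = Counter()  # class counts of the interval being built
    prev_value = None
    i, n = 0, len(pairs)
    while i < n:
        # Group all examples sharing the same attribute value into one block.
        value = pairs[i][0]
        block: Counter = Counter()
        while i < n and pairs[i][0] == value:
            block[pairs[i][1]] += 1
            i += 1
        if not current:
            current = block                      # first block opens an interval
        elif measure(current + block) <= threshold:
            current = current + block            # merging operation
        else:
            cuts.append((prev_value + value) / 2)  # partitioning operation
            current = block
        prev_value = value
    return cuts


# Two well-separated classes yield a single cut between them.
print(lr_discretize([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
```

Any function that maps class counts to a score can be passed as `measure`, which mirrors the abstract's point that different discretization measures can be swapped in and compared. After the initial sort, the scan visits each example once.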

Keywords

  • Data mining
  • Data pre-processing
  • Supervised classification
  • Supervised discretization
  • Division and merging framework
  • Scanner

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Habiba Drias (1)
  • Hadjer Moulai (1)
  • Nourelhouda Rehkab (1)
  1. LRIA Laboratory, Department of Computer Science, USTHB, Algiers, Algeria
