An Experimental Study on Decision Tree Classifier Using Discrete and Continuous Data

  • Monalisa Jena
  • Satchidananda Dehuri
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1040)

Abstract

Classification is one of the fundamental tasks in pattern recognition, data mining, and big data analysis. Its goal is to assign class labels to novel instances whose labels are unknown when the model is built. Decision tree algorithms such as ID3, C4.5, and their variants have been widely studied for this task, because a decision tree classifier is simple to understand and its performance is comparable with that of many promising classifiers. In this work, we develop a two-phase decision tree method for classifying both continuous and discrete data. In the first phase, the method examines whether the database is continuous-valued or discrete-valued; if it is continuous-valued, the database is discretized. In the second phase, the classifier is built and then used to classify an unknown instance. To measure the performance of the two phases, we experimented on a few datasets from the University of California, Irvine (UCI) Machine Learning Repository and on one artificially created dataset. The experimental evidence shows that this two-phase method of constructing a decision tree is effective at classifying unknown instances in both the continuous and discrete cases.
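
The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a discretization method, so equal-width binning is assumed here as a stand-in, and the test for "continuous" (any float value in a column) is also an assumption. The function names (`fit_discretizer`, `build_tree`, `classify`) and the toy data are hypothetical. Phase two is sketched as an ID3-style tree that splits on the attribute with the highest information gain.

```python
import math
from collections import Counter

def fit_discretizer(rows, n_bins=3):
    """Phase one: detect continuous columns and learn equal-width bins.

    Assumption: a column is continuous if it contains any float value.
    Equal-width binning is a stand-in, not the method used in the paper.
    """
    params = {}
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        if any(isinstance(v, float) for v in col):
            lo, hi = min(col), max(col)
            width = (hi - lo) / n_bins or 1.0  # avoid zero width
            params[j] = (lo, width)

    def transform(row):
        out = list(row)
        for j, (lo, width) in params.items():
            # Clamp so unseen values outside the training range still map to a bin.
            out[j] = max(0, min(n_bins - 1, int((row[j] - lo) / width)))
        return tuple(out)

    return transform

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Phase two: ID3-style induction on discrete attributes."""
    if len(set(labels)) == 1:
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:
        return majority

    def gain(j):
        remainder = 0.0
        for v in set(r[j] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[j] == v]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(attrs, key=gain)
    node = {"attr": best, "default": majority, "children": {}}
    rest = [a for a in attrs if a != best]
    for v in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        node["children"][v] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest)
    return node

def classify(tree, row):
    # Walk down the tree; fall back to the node's majority class
    # when the instance has an attribute value not seen in training.
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical toy data: one continuous and one discrete attribute.
train = [(1.0, "sunny"), (1.5, "rainy"), (2.0, "sunny"),
         (8.0, "sunny"), (8.5, "rainy"), (9.0, "rainy")]
labels = ["low", "low", "low", "high", "high", "high"]

transform = fit_discretizer(train)
tree = build_tree([transform(r) for r in train], labels, [0, 1])
print(classify(tree, transform((1.2, "rainy"))))
print(classify(tree, transform((8.8, "sunny"))))
```

On this toy data the continuous attribute separates the classes perfectly after binning, so the tree splits on it at the root and the two unseen instances are labelled "low" and "high" respectively. The same bin boundaries learned in phase one are reused when an unknown instance is classified, which keeps training and prediction consistent.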

Keywords

Decision tree, Classification, Discretization, Data mining

Notes

Acknowledgements

Thanks to Mr. Sagar Muduli, MCA student, Dept. of I & CT, F. M. University, Balasore, Odisha, for his notable contribution in this work.

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of I & CT, Fakir Mohan University, Balasore, India
