
Extension of ICF Classifiers to Real World Data Sets

  • Kazuya Haraguchi
  • Hiroshi Nagamochi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4570)

Abstract

The classification problem asks us to construct a classifier from a given data set, where the classifier is required to capture the hidden oracle of the data space. Recently, we introduced ICF, a new class of classifiers based on iteratively composed features over {0,1,∗}-valued data sets. We proposed an algorithm ALG-ICF∗ to construct an ICF classifier and showed its high performance. In this paper, we extend ICF so that it can also process real-world data sets consisting of numerical and/or categorical attributes. For this purpose, we incorporate a discretization scheme into ALG-ICF∗ as its preprocessor, by which an input real-world data set is transformed into a {0,1,∗}-valued one. Based on experimental studies of conventional discretization schemes, we propose a new discretization scheme, integrated construction (IC). Our computational experiments reveal that ALG-ICF∗ equipped with IC outperforms the decision tree constructor C4.5 in many cases.
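
To make the preprocessing step concrete, the following Python sketch shows one plausible way a record with numerical and categorical attributes can be mapped to a {0,1,∗}-valued vector, with ∗ marking missing values. The cut points, function names, and one-hot treatment of categorical attributes are illustrative assumptions on our part; the paper's integrated construction (IC) scheme chooses its binarizations by its own criteria.

    MISSING = "*"

    def discretize_numerical(value, cut_points):
        # One binary feature per cut point: 1 if the value exceeds the cut, else 0.
        if value is None:
            return [MISSING] * len(cut_points)
        return [1 if value > c else 0 for c in cut_points]

    def discretize_categorical(value, categories):
        # One-hot encoding over the known categories.
        if value is None:
            return [MISSING] * len(categories)
        return [1 if value == c else 0 for c in categories]

    # Example: one numerical attribute (age) and one categorical attribute
    # (color) whose value is missing in this record.
    record = {"age": 37, "color": None}
    features = (discretize_numerical(record["age"], cut_points=[30, 50])
                + discretize_categorical(record["color"], ["red", "green", "blue"]))
    print(features)  # [1, 0, '*', '*', '*']

Under an encoding of this kind, missing values propagate as ∗ rather than forcing an imputation choice, which is consistent with the {0,1,∗}-valued setting in which ICF operates.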

Keywords

classification · discretization · iteratively composed features · machine learning


References

  1. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: From data mining to knowledge discovery in databases. AI Magazine 17(3), 37–54 (1996)
  2. Weiss, S.M., Kulikowski, C.A.: Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems. Morgan Kaufmann, San Francisco (1991)
  3. Haraguchi, K., Ibaraki, T.: Construction of classifiers by iterative compositions of features with partial knowledge. IEICE Trans. Fund. Elec. Comm. and Comp. Sci. E89-A(5), 1284–1291 (2006)
  4. Bohanec, M., Zupan, B.: A function-decomposition method for development of hierarchical multi-attribute decision models. Dec. Supp. Sys. 36(3), 215–233 (2004)
  5. Boros, E., Gurvich, V., Hammer, P.L., Ibaraki, T., Kogan, A.: Decomposability of partially defined Boolean functions. Disc. Appl. Math. 62, 51–75 (1995)
  6. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
  7. Chow, C.K.: On optimum recognition error and reject tradeoff. IEEE Trans. Inf. Th. 16(1), 41–46 (1970)
  8. Domingos, P., Pazzani, M.J.: Beyond independence: conditions for the optimality of the simple Bayesian classifier. In: Saitta, L. (ed.) Proc. 13th Int'l Conf. Mach. Learn., pp. 105–112 (1996)
  9. Elomaa, T., Rousu, J.: Fast minimum training error discretization. In: Sammut, C., Hoffmann, A. (eds.) Proc. 19th Int'l Conf. Mach. Learn., pp. 131–138 (2002)
  10. Mii, S.: Feature determination algorithms in the analysis of data. Master's thesis, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University (2001)
  11. Hettich, S., Blake, C.L., Merz, C.J.: UCI Repository of Machine Learning Databases. University of California, Department of Information and Computer Science (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Kazuya Haraguchi (1)
  • Hiroshi Nagamochi (1)

  1. Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Japan
