An ICA-Based Multivariate Discretization Algorithm

  • Ye Kang
  • Shanshan Wang
  • Xiaoyan Liu
  • Hokyin Lai
  • Huaiqing Wang
  • Baiqi Miao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4092)

Abstract

Discretization is an important preprocessing technique in data mining. Univariate Discretization, the most commonly used approach, discretizes one attribute of a dataset at a time without considering its interactions with the other attributes. Because the target class attribute is determined by several attributes jointly rather than by any single one, the result of Univariate Discretization is not optimal. This paper proposes a new Multivariate Discretization algorithm. It uses ICA (Independent Component Analysis) to transform the original attributes into a space of independent attributes and then applies Univariate Discretization to each attribute in the new space. Data mining tasks can then be carried out on the discretized dataset with independent attributes. Numerical experiments show that the method improves discretization performance, especially on non-Gaussian datasets, and that it is competitive with a PCA-based multivariate method.
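The two-stage idea in the abstract (transform to independent components, then discretize each component on its own) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which ICA algorithm or which univariate discretizer is used, so a hand-written symmetric FastICA with a tanh nonlinearity and unsupervised equal-frequency binning are both assumptions here, and all function names (whiten, sym_decorrelate, fast_ica, equal_frequency_bins) are hypothetical.

```python
import numpy as np


def whiten(X):
    """Center X (n_samples, n_features) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ (eigvecs / np.sqrt(eigvals))


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, whose rows are orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W


def fast_ica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data Z (assumed variant)."""
    n, d = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        Y = Z @ W.T                                   # current component estimates
        g = np.tanh(Y)
        g_prime_mean = (1.0 - g ** 2).mean(axis=0)
        # Fixed-point update w <- E[z g(w.z)] - E[g'(w.z)] w for every row of W
        W_new = sym_decorrelate(g.T @ Z / n - g_prime_mean[:, None] * W)
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return Z @ W.T


def equal_frequency_bins(column, n_bins):
    """Univariate stand-in discretizer: integer codes 0..n_bins-1 from quantile cut points."""
    edges = np.quantile(column, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    return np.searchsorted(edges, column, side="right")


# Toy non-Gaussian data: two uniform sources mixed linearly (synthetic, for illustration only)
rng = np.random.default_rng(0)
sources = rng.uniform(-1.0, 1.0, size=(1000, 2))
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
X = sources @ mixing.T

components = fast_ica(whiten(X))                      # step 1: independent attribute space
codes = np.column_stack(                              # step 2: discretize each component separately
    [equal_frequency_bins(components[:, j], 4) for j in range(components.shape[1])]
)
```

Here codes holds one column of interval labels per independent component. A supervised univariate discretizer could replace equal_frequency_bins without changing the overall structure.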

Keywords

Data mining; Multivariate Discretization; Independent Component Analysis; Nongaussian

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ye Kang (1, 2)
  • Shanshan Wang (1, 2)
  • Xiaoyan Liu (1)
  • Hokyin Lai (1)
  • Huaiqing Wang (1)
  • Baiqi Miao (2)
  1. Department of Information Systems, City University of Hong Kong
  2. Management School, University of Science and Technology of China, HeFei, AnHui Province
