Space Decomposition in Data Mining: A Clustering Approach
- Cite this paper as:
- Rokach L., Maimon O., Lavi I. (2003) Space Decomposition in Data Mining: A Clustering Approach. In: Zhong N., Raś Z.W., Tsumoto S., Suzuki E. (eds) Foundations of Intelligent Systems. ISMIS 2003. Lecture Notes in Computer Science, vol 2871. Springer, Berlin, Heidelberg
Data mining algorithms search for interesting patterns in large amounts of data while keeping complexity manageable and accuracy high. Decomposition methods are used to improve both criteria. Unlike most decomposition methods, which partition the dataset via sampling, this paper presents an accuracy-oriented method that partitions the instance space into mutually exclusive subsets using the K-means clustering algorithm. The basic divide-and-induce method is applied to several datasets with different classifiers, and its error rate is compared to that of the basic learning algorithm. An analysis of the results shows that the proposed method is well suited to datasets with numeric input attributes and that its performance is influenced by the dataset's size and homogeneity. Finally, a homogeneity threshold is developed that can be used to decide whether or not to decompose the dataset.
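The divide-and-induce idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes numeric input attributes, Euclidean distance, a deterministic centroid initialization, and a trivial majority-class base learner standing in for a real classifier. All function names (`kmeans`, `majority_learner`, `divide_and_induce`) are hypothetical. The homogeneity threshold is not shown, since the abstract does not define it.

```python
from collections import Counter

def sq_dist(p, q):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; returns k centroids.

    Assumption: centroids start at k evenly spaced instances so the
    sketch is deterministic. Real implementations usually randomize.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def majority_learner(X, y):
    """Stand-in base learner: always predicts the most frequent label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def divide_and_induce(X, y, k, learner=majority_learner):
    """Partition the instance space with K-means, then train one model per subset."""
    centroids = kmeans(X, k)
    fallback = learner(X, y)  # used only if a cluster ends up empty
    models = []
    for c in range(k):
        idx = [i for i, x in enumerate(X) if nearest(x, centroids) == c]
        if idx:
            models.append(learner([X[i] for i in idx], [y[i] for i in idx]))
        else:
            models.append(fallback)

    def predict(x):
        # Route the instance to its cluster, then ask that cluster's model.
        return models[nearest(x, centroids)](x)

    return predict

# Toy data: two well-separated groups with different dominant labels.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = ["a", "a", "a", "b", "c", "c", "c", "d"]
predict = divide_and_induce(X, y, k=2)
print(predict((0.2, 0.3)), predict((10.4, 10.9)))
```

Because the subsets are mutually exclusive, each new instance is handled by exactly one model. Any classifier with the same `learner(X, y) -> predict` shape could replace `majority_learner` in this sketch.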