Subspace clustering through attribute clustering

  • Research Article
Frontiers of Electrical and Electronic Engineering in China

Abstract

Many recently proposed subspace clustering methods suffer from two severe problems. First, their running time typically grows exponentially with the dimensionality of the data or of the subspaces in which clusters lie. Second, their clustering results are often sensitive to input parameters. This paper proposes a fast subspace clustering algorithm based on attribute clustering to overcome these limitations. The algorithm first filters out redundant attributes by computing the Gini coefficient of each attribute. To evaluate the correlation between every pair of non-redundant attributes, it then constructs a relation matrix whose entries are given by a relation function of the two-dimensional united Gini coefficients. Applying an overlapping clustering algorithm to the relation matrix yields the candidate set of all interesting subspaces. Finally, all subspace clusters are derived by clustering within the interesting subspaces. Experiments on both synthetic and real datasets show that the new algorithm not only achieves significant gains in runtime and in the quality of the subspace clusters found, but is also insensitive to input parameters.
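
The first two steps named in the abstract (attribute filtering and the relation matrix) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the Gini coefficient is taken to be the histogram-based index 1 − Σp², an attribute is treated as redundant when its index is close to the value a uniform distribution would give, and the relation function compares the joint concentration Σp² of two attributes with the product of their marginal concentrations (the two are equal when the binned attributes are independent). The function names, the `bins` parameter, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def gini_1d(x, bins=10):
    """Histogram-based Gini index 1 - sum(p^2) of one attribute (assumed definition)."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_2d(x, y, bins=10):
    """The same index computed on the joint 2-D histogram of two attributes."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def non_redundant_attributes(data, bins=10, threshold=0.95):
    """Keep attributes whose index is clearly below the uniform maximum 1 - 1/bins.

    A near-uniform attribute carries no cluster structure and is dropped.
    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    max_gini = 1.0 - 1.0 / bins
    return [j for j in range(data.shape[1])
            if gini_1d(data[:, j], bins) < threshold * max_gini]

def relation_matrix(data, attrs, bins=10):
    """Pairwise relation scores between the retained attributes (assumed form).

    The score is the joint concentration minus the product of the marginal
    concentrations, which is zero for independent attributes.
    """
    n = len(attrs)
    rel = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            x, y = data[:, attrs[a]], data[:, attrs[b]]
            joint = 1.0 - gini_2d(x, y, bins)
            independent = (1.0 - gini_1d(x, bins)) * (1.0 - gini_1d(y, bins))
            rel[a, b] = rel[b, a] = joint - independent
    return rel
```

In this sketch, attributes that share cluster structure receive a clearly positive score, while independently distributed attributes score near zero. An overlapping clustering of the resulting matrix, the third step in the abstract, would then group attributes into candidate subspaces; that step is not shown here.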

Author information

Corresponding author

Correspondence to Kun Niu.

Additional information

Translated from Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报), 2007, 30(3): 1–5

About this article

Cite this article

Niu, K., Zhang, S. & Chen, J. Subspace clustering through attribute clustering. Front. Electr. Electron. Eng. China 3, 44–48 (2008). https://doi.org/10.1007/s11460-008-0010-x

  • DOI: https://doi.org/10.1007/s11460-008-0010-x
