Deflation-based power iteration clustering

The, Anh Pham; Thang, Nguyen Duc; Vinh, La The; Lee, Young-Koo; Lee, Sungyoung

doi:10.1007/s10489-012-0418-0

Published: 03 February 2013

Volume 39, pages 367–385, (2013)

Applied Intelligence

Anh Pham The¹, Nguyen Duc Thang², La The Vinh¹, Young-Koo Lee¹ & Sungyoung Lee¹

670 Accesses
15 Citations

Abstract

Spectral clustering (SC) is currently one of the most popular clustering techniques because of its advantages over conventional approaches such as K-means and hierarchical clustering. However, SC requires computing eigenvectors, which makes it time consuming. To overcome this limitation, Lin and Cohen proposed the power iteration clustering (PIC) technique (Lin and Cohen in Proceedings of the 27th International Conference on Machine Learning, pp. 655–662, 2010), a simple and fast version of SC. Instead of finding the eigenvectors, PIC computes, in linear time, a single pseudo-eigenvector that is a linear combination of the eigenvectors. However, in certain critical situations, a single pseudo-eigenvector is not enough for clustering because of the inter-class collision problem. In this paper, we propose a novel method based on the deflation technique to compute multiple orthogonal pseudo-eigenvectors (orthogonality is used to avoid redundancy). Our method is more accurate than PIC but has the same computational complexity. Experiments on synthetic and real datasets demonstrate the improvement of our approach.
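The idea of computing several mutually orthogonal pseudo-eigenvectors can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the deflation scheme, so the code below assumes a simple stand-in, namely projecting out previously found vectors at every step of a truncated power iteration. The function names (`row_normalize`, `pseudo_eigenvector`), the L2 normalization, the stopping tolerance, and the toy affinity matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def row_normalize(A):
    """Return W = D^{-1} A, the row-normalized affinity matrix."""
    return A / A.sum(axis=1, keepdims=True)

def pseudo_eigenvector(W, v0, prev=(), max_iter=100, tol=1e-5):
    """Truncated power iteration that stays orthogonal to the vectors in `prev`.

    Assumption: orthogonal projection is used here as a simple substitute
    for the deflation technique named in the abstract.
    """
    def project(x):
        # Remove the components along earlier (orthonormal) vectors, then rescale.
        for p in prev:
            x = x - (x @ p) * p
        return x / np.linalg.norm(x)

    v = project(v0)
    delta_old = np.zeros_like(v)
    for _ in range(max_iter):
        u = project(W @ v)
        delta = np.abs(u - v)
        v = u
        # Stop early once the change between iterates has stabilized.
        if np.max(np.abs(delta - delta_old)) < tol:
            break
        delta_old = delta
    return v

# Toy affinity matrix (hypothetical data): two blocks of three points each,
# strong similarity inside a block and weak similarity across blocks.
rng = np.random.default_rng(0)
A = np.full((6, 6), 0.01)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
W = row_normalize(A)

vectors = []
for _ in range(2):
    vectors.append(pseudo_eigenvector(W, rng.random(6), prev=vectors))

# One row per data point, one column per pseudo-eigenvector.
embedding = np.column_stack(vectors)
```

The rows of `embedding` could then be clustered with K-means, as is done with the single pseudo-eigenvector in PIC. Because each new vector is kept orthogonal to the earlier ones, it cannot simply repeat the information they already carry.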


References

Cai D MNIST dataset. URL http://www.cad.zju.edu.cn/home/dengcai/Data/MNIST/10kTrain.mat
Cai D TDT2 dataset. URL http://www.cad.zju.edu.cn/home/dengcai/Data/TDT2/TDT2.mat
Chen X, Cai D (2011) Large scale spectral clustering with landmark-based representation. In: Proceedings of the twenty-fifth AAAI conference on artificial intelligence, San Francisco, California, pp 313–318
Chen L, Mao X, Wei P, Xue Y, Ishizuka M (2012) Mandarin emotion recognition combining acoustic and emotional point information. Appl Intell 37(4):602–612
Drineas P, Mahoney MW (2005) On the Nyström method for approximating a gram matrix for improved kernel-based learning. J Mach Learn Res 6:2153–2175
Durrett R (2010) Some features of the spread of epidemics and information on a random graph. Proc Natl Acad Sci 107(10):4491–4498
Erdős P, Rényi A (1960) On the evolution of random graphs. Magy Tud Akad Mat Kut Intéz Közl 5:17–61
Fowlkes C, Belongie S, Chung F, Malik J (2004) Spectral grouping using the Nyström method. IEEE Trans Pattern Anal Mach Intell 26(2):214–225
He X, Cai D, Niyogi P (2005) Laplacian score for feature selection. Adv Neural Inf Process Syst 18:507–514
He J, Tong H, Carbonell J (2010) Rare category characterization. In: Proceedings of the 10th IEEE international conference on data mining, Sydney, Australia, pp 226–235
Hu X, Zhang X, Lu C, Park EK, Zhou X (2009) Exploiting Wikipedia as external knowledge for document clustering. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, France, pp 389–396
Jain AK (2010) Data clustering: 50 years beyond k-means. Pattern Recognit Lett 31(8):651–666
Keshet J, Bengio S (2009) Automatic speech and speaker recognition: large margin and kernel methods. Wiley Online Library
Lehoucq RB, Sorensen DC (1996) Deflation techniques for an implicitly restarted Arnoldi iteration. SIAM J Matrix Anal Appl 17(4):789–821
Lewis DD Reuters-21578 dataset. URL http://www.daviddlewis.com/resources/testcollections/reuters21578/
Lin F, Cohen WW (2010) Power iteration clustering. In: Proceedings of the 27th international conference on machine learning, Haifa, Israel, pp 655–662
Lin F, Cohen WW (2010) A very fast method for clustering big text datasets. In: Proceedings of the 19th European conference on artificial intelligence, Lisbon, Portugal, pp 303–308
Luxburg UV (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Mackey L (2008) Deflation methods for sparse PCA. Adv Neural Inf Process Syst 21:1017–1024
Mavroeidis D (2010) Accelerating spectral clustering with partial supervision. Data Min Knowl Discov 21(2):241–258
Mishra N, Schreiber R, Stanton I, Tarjan RE (2007) Clustering social networks. In: Proceedings of the 5th international conference on algorithms and models for the web-graph, San Diego, CA, pp 56–67
Ng AY, Jordan MI, Weiss Y (2001) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 14:849–856
Pavan M, Pelillo M (2007) Dominant sets and pairwise clustering. IEEE Trans Pattern Anal Mach Intell 29(1):167–172
Peña JM, Lozano JA, Larrañaga P (1999) An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recognit Lett 20(10):1027–1040
Peng W, Li T (2011) On the equivalence between nonnegative tensor factorization and tensorial probabilistic latent semantic analysis. Appl Intell 35(2):285–295
Rennie J 20 newsgroups. URL http://qwone.com/~jason/20Newsgroups/
Saha S, Bandyopadhyay S (2011) Automatic MR brain image segmentation using a multiseed based multiobjective clustering approach. Appl Intell 35(3):411–427
Shang F, Jiao LC, Shi J, Wang F, Gong M (2012) Fast affinity propagation clustering: a multilevel approach. Pattern Recognit 45(1):474–486
Sheffield Face database. URL http://www.sheffield.ac.uk/eee/research/iel/research/face
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Smola A, Schölkopf B Datasets for benchmarks and applications. URL http://www.kernel-machines.org/data/
Taşdemir K (2012) Vector quantization based approximate spectral clustering of large datasets. Pattern Recognit 45(8):3034–3044
Tung F, Wong A, Clausi DA (2010) Enabling scalable spectral clustering for image segmentation. Pattern Recognit 43(12):4069–4076
Wu S, Chow TWS (2004) Clustering of the self-organizing map using a clustering validity index based on inter-cluster and intra-cluster density. Pattern Recognit 37(2):175–188
Wu M, Schölkopf B (2006) A local learning approach for clustering. Adv Neural Inf Process Syst 19:1529–1536
Yan D, Huang L, Jordan MI (2009) Fast approximate spectral clustering. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, Paris, France, pp 907–916
Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. Adv Neural Inf Process Syst 17:1601–1608
Zhang K, Kwok JT (2009) Density-weighted Nyström method for computing large kernel eigensystems. Neural Comput 21(1):121–146
Zhang K, Tsang IW, Kwok JT (2008) Improved Nyström low-rank approximation and error analysis. In: Proceedings of the 25th international conference on machine learning, Helsinki, Finland, pp 1232–1239

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2010-0013689).

Author information

Authors and Affiliations

Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do, 446-701, South Korea
Anh Pham The, La The Vinh, Young-Koo Lee & Sungyoung Lee
Department of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
Nguyen Duc Thang

Corresponding author

Correspondence to Young-Koo Lee.

About this article

Cite this article

The, A.P., Thang, N.D., Vinh, L.T. et al. Deflation-based power iteration clustering. Appl Intell 39, 367–385 (2013). https://doi.org/10.1007/s10489-012-0418-0

Published: 03 February 2013
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10489-012-0418-0
