Soft Competitive Learning for Large Data Sets
Soft competitive learning is an advanced k-means-like clustering approach that overcomes some severe drawbacks of k-means, such as dependence on initialization and convergence to local minima. It achieves lower distortion error than k-means and has shown very good performance in clustering complex data sets using various metrics or kernels. While very effective, it does not scale to large data sets, a problem that is even more severe for kernels because of the dense prototype model. In this paper, we propose a novel soft competitive learning algorithm using core-sets, which significantly accelerates the original method in practice and yields natural sparsity. It deals effectively with very large data sets of up to several million points. Our method also provides an alternative fast kernelization of soft competitive learning. In contrast to many other clustering methods, the obtained model is based on only a few prototypes and is naturally sparse. It is the first naturally sparse kernelized soft competitive learning approach. Numerical experiments on synthetic and benchmark data sets show the efficiency of the proposed method.
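The abstract does not spell out the update rule. As a rough illustration of what "soft" competition means in contrast to the hard winner-takes-all assignment of k-means, the following is a minimal sketch of a rank-based batch update in the style of neural gas, one common soft competitive learning scheme. It is not the core-set algorithm proposed in the paper; the function name `soft_competitive_learning`, the annealing schedule, and all parameter defaults are assumptions made for illustration.

```python
import numpy as np

def soft_competitive_learning(X, n_prototypes=2, n_epochs=50, seed=0):
    """Rank-based batch soft competitive learning (neural-gas style sketch).

    Hypothetical helper for illustration; not the paper's core-set method.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialize prototypes on randomly chosen data points.
    W = X[rng.choice(n, size=n_prototypes, replace=False)].astype(float)

    lam_start = n_prototypes / 2.0  # assumed initial neighborhood range
    lam_end = 0.01                  # assumed final range (nearly hard assignment)
    for t in range(n_epochs):
        # Anneal the neighborhood range exponentially from lam_start to lam_end.
        lam = lam_start * (lam_end / lam_start) ** (t / max(n_epochs - 1, 1))
        # Squared Euclidean distances, shape (n_points, n_prototypes).
        d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        # Rank of each prototype per point: 0 = closest, 1 = second closest, ...
        ranks = d.argsort(axis=1).argsort(axis=1)
        # Soft assignment: every prototype is updated, weighted by its rank.
        h = np.exp(-ranks / lam)
        denom = h.sum(axis=0)
        ok = denom > 1e-12  # skip prototypes whose total weight underflowed
        W[ok] = (h.T @ X)[ok] / denom[ok, None]
    return W
```

Because every prototype receives a rank-dependent share of every point early on, prototypes that start in the same region are still pulled apart, which is the intuition behind the reduced initialization dependence mentioned above. As the range shrinks, the update approaches the k-means centroid step. Note that this dense version computes all point-to-prototype distances in every epoch, which is the scaling cost that the core-set approach is meant to avoid.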