Soft Competitive Learning for Large Data Sets

  • Conference paper
New Trends in Databases and Information Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 185)

Abstract

Soft competitive learning is an advanced k-means-like clustering approach that overcomes some severe drawbacks of k-means, such as dependence on initialization and convergence to local minima. It achieves a lower distortion error than k-means and has shown very good performance in clustering complex data sets using various metrics or kernels. Although very effective, it does not scale to large data sets, and the problem is even more severe with kernels because of the dense prototype model. In this paper, we propose a novel soft competitive learning algorithm based on core-sets that significantly accelerates the original method in practice and yields natural sparsity. It deals effectively with very large data sets of up to several million points. Our method also provides an alternative, fast kernelization of soft competitive learning. In contrast to many other clustering methods, the obtained model is based on only a few prototypes and is naturally sparse. It is the first naturally sparse kernelized soft competitive learning approach. Numerical experiments on synthetic and benchmark data sets show the efficiency of the proposed method.
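
As an illustration only (the core-set algorithm proposed in the paper is not reproduced on this page), the following minimal Python sketch shows rank-based soft competitive learning in the neural gas style, which is assumed here to be the kind of k-means-like method the abstract refers to. The function names, the annealing schedules, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))


def soft_competitive_learning(points, k, epochs=20, seed=0):
    """Online rank-based soft competitive learning (illustrative sketch).

    Unlike k-means, every prototype is updated for every sample, with a
    weight that decays exponentially in the prototype's distance rank.
    """
    rng = random.Random(seed)
    # Assumption: initialise the prototypes at k distinct data points.
    prototypes = [list(p) for p in rng.sample(points, k)]
    total_steps = epochs * len(points)
    lam_start, lam_end = k / 2.0, 0.01  # neighbourhood range (assumed schedule)
    eps_start, eps_end = 0.5, 0.005     # learning rate (assumed schedule)
    step = 0
    for _ in range(epochs):
        order = points[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            # Exponential annealing from the start value to the end value.
            lam = lam_start * (lam_end / lam_start) ** frac
            eps = eps_start * (eps_end / eps_start) ** frac
            # Rank prototypes by distance to x; rank 0 is the winner.
            ranked = sorted(prototypes, key=lambda w: sq_dist(x, w))
            for rank, w in enumerate(ranked):
                h = math.exp(-rank / lam)  # soft assignment weight
                for d in range(len(w)):
                    w[d] += eps * h * (x[d] - w[d])
            step += 1
    return prototypes


def distortion(points, prototypes):
    """Mean squared distance from each point to its closest prototype."""
    return sum(min(sq_dist(x, w) for w in prototypes) for x in points) / len(points)
```

Calling `soft_competitive_learning(points, k)` on a list of coordinate tuples returns `k` prototypes, and `distortion(points, prototypes)` evaluates the quantization error the abstract compares against k-means. Because the neighbourhood range shrinks towards zero, the update gradually approaches a winner-takes-all rule, which is one common way to reduce sensitivity to initialization.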

Author information

Corresponding author

Correspondence to Frank-Michael Schleif.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schleif, FM., Zhu, X., Hammer, B. (2013). Soft Competitive Learning for Large Data Sets. In: Pechenizkiy, M., Wojciechowski, M. (eds) New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32518-2_14

  • DOI: https://doi.org/10.1007/978-3-642-32518-2_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32517-5

  • Online ISBN: 978-3-642-32518-2

  • eBook Packages: Engineering, Engineering (R0)