K-polytopes: a superproblem of k-means

Abstract

It has already been proven that, under certain circumstances, dictionary learning for sparse representations is equivalent to conventional k-means clustering. With additional modifications to the sparse representations, the notion of centroids can be generalized to higher orders. In a related algorithm called k-flats, q-dimensional flats serve as alternative central prototypes. In the formulation proposed in this paper, central prototypes are instead simplexes or, more generally, polytopes. Using higher-dimensional, nonconvex prototypes may alleviate the curse of dimensionality while also making it possible to model nonlinearly distributed datasets successfully. The proposed framework can further be applied flexibly in supervised settings through one-class learning, and in other nonlinear frameworks through kernels.
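
To make the idea of a simplex-shaped prototype concrete, the following is a minimal sketch of the assignment step only, and it is not the authors' algorithm. It assumes each prototype is the convex hull of a small set of vertices (so it covers only the convex case), and it measures the distance from a point to that hull with plain projected gradient descent over the convex weights. The function names, the iteration count and the step-size rule are illustrative choices, not taken from the paper.

```python
import numpy as np

def project_to_probability_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based)."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def distance_to_simplex(x, V, n_iter=500):
    """Distance from x to the convex hull of the rows of V.

    Minimizes 0.5 * ||V.T @ w - x||^2 over convex weights w by projected
    gradient descent (an assumed solver, chosen here for brevity).
    """
    m = V.shape[0]
    w = np.full(m, 1.0 / m)                   # start at the barycenter
    # Step 1/L, where L = sigma_max(V)^2 is the gradient's Lipschitz constant.
    step = 1.0 / max(np.linalg.norm(V, 2) ** 2, 1e-12)
    for _ in range(n_iter):
        grad = V @ (V.T @ w - x)
        w = project_to_probability_simplex(w - step * grad)
    return np.linalg.norm(V.T @ w - x), w

def assign_to_prototypes(X, prototypes):
    """Label each row of X with the index of its nearest prototype."""
    d = np.array([[distance_to_simplex(x, V)[0] for V in prototypes] for x in X])
    return d.argmin(axis=1), d

# Hypothetical toy data: one segment-shaped prototype and one single-vertex prototype.
segment = np.array([[0.0, 0.0], [2.0, 0.0]])
point = np.array([[10.0, 10.0]])
X = np.array([[1.0, 1.0], [9.0, 9.0]])
labels, dists = assign_to_prototypes(X, [segment, point])
```

A prototype with a single vertex reduces to an ordinary centroid, so with one vertex per prototype this assignment step coincides with the k-means assignment step. How the vertices themselves are updated is not covered by this sketch.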

Acknowledgements

The authors are grateful to Prof. Dr. Turker Ince for fruitful discussions and for his constructive comments, which greatly improved the manuscript.

Author information

Corresponding author

Correspondence to Mehmet Turkan.

Cite this article

Oktar, Y., Turkan, M. K-polytopes: a superproblem of k-means. SIViP 13, 1207–1214 (2019). https://doi.org/10.1007/s11760-019-01469-6

Keywords

  • Sparse representations
  • Block sparsity
  • Simplexes
  • Polytopes
  • Clustering
  • Machine learning