k-Proximal plane clustering

Abstract

Whereas k-means assigns data points to cluster center points, k-plane clustering (kPC) assigns them to cluster center planes. However, kPC considers only within-cluster data points. In this paper, we propose a novel plane-based clustering method, called k-proximal plane clustering (kPPC). In kPPC, each center plane is not only close to the data points of its own cluster but also far from the data points of the other clusters, and the planes are obtained by solving several eigenvalue problems. The objective function of kPPC thus incorporates both between-cluster and within-cluster information. In addition, kPPC is extended to the nonlinear case via the kernel trick. A deterministic strategy that uses a Laplace graph to initialize the data points is also established for kPPC. Experiments on several artificial and benchmark datasets show that kPPC performs much better than both kPC and k-means.
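To make the contrast between the two plane updates concrete, the following is a minimal sketch, not the formulation used in the paper. The kPC update is the standard least-squares plane fit (eigenvector of the smallest eigenvalue of the centered scatter matrix). The proximal update is a simplified stand-in: it assumes the plane passes through the cluster mean and subtracts a weighted scatter of the out-of-cluster points, so that the plane stays close to its own points and far from the rest. The function names `fit_plane_kpc`, `fit_plane_proximal`, `assign_to_planes` and the trade-off parameter `c` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_plane_kpc(X):
    """Least-squares plane w.x = gamma with ||w|| = 1 for one cluster (kPC-style)."""
    mean = X.mean(axis=0)
    centered = X - mean
    # eigh returns eigenvalues in ascending order, so column 0 belongs to
    # the smallest eigenvalue of the centered scatter matrix.
    _, vecs = np.linalg.eigh(centered.T @ centered)
    w = vecs[:, 0]
    return w, float(mean @ w)


def fit_plane_proximal(X_in, X_out, c=1.0):
    """Simplified proximal plane (assumption: plane passes through the mean of X_in).

    Minimizes the squared distances of X_in to the plane minus c times the
    squared distances of X_out to the plane, subject to ||w|| = 1.
    """
    mean = X_in.mean(axis=0)
    S_in = (X_in - mean).T @ (X_in - mean)
    S_out = (X_out - mean).T @ (X_out - mean)
    # The difference is symmetric, so a standard symmetric eigensolver applies.
    _, vecs = np.linalg.eigh(S_in - c * S_out)
    w = vecs[:, 0]
    return w, float(mean @ w)


def assign_to_planes(X, planes):
    """Label each point with the index of its nearest plane."""
    dist = np.stack([np.abs(X @ w - gamma) for w, gamma in planes], axis=1)
    return dist.argmin(axis=1)


# Two clusters lying on perpendicular lines through the origin.
t = np.linspace(-1.0, 1.0, 21)
A = np.column_stack([t, np.zeros_like(t)])  # points on the x-axis
B = np.column_stack([np.zeros_like(t), t])  # points on the y-axis

planes = [fit_plane_proximal(A, B), fit_plane_proximal(B, A)]
labels = assign_to_planes(np.vstack([A, B]), planes)
```

In a full clustering loop, the plane update and the assignment step would alternate until the labels stop changing. The choice of `c` controls how strongly the other clusters push a plane away, and the sketch does not address initialization or the kernel extension.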

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Nos. 11201426, 11371365, and 11501310), the Zhejiang Provincial Natural Science Foundation of China (Nos. LY15F030013, LQ14G010004, and LY16A010020), the National Statistical Science Research Project of China (No. 2013LZ13), and the Natural Science Foundation of Inner Mongolia Autonomous Region of China (No. 2015BS0606).

Author information

Corresponding author

Correspondence to Yuan-Hai Shao.

About this article

Cite this article

Liu, LM., Guo, YR., Wang, Z. et al. k-Proximal plane clustering. Int. J. Mach. Learn. & Cyber. 8, 1537–1554 (2017). https://doi.org/10.1007/s13042-016-0526-y

Keywords

  • Clustering
  • k-means
  • k-Plane clustering
  • eigenvalue problem
  • Laplace graph