Neural Computing and Applications, Volume 26, Issue 1, pp 199–211

Local k-proximal plane clustering

  • Zhi-Min Yang
  • Yan-Ru Guo
  • Chun-Na Li
  • Yuan-Hai Shao
Original Article

Abstract

k-Plane clustering (kPC) and k-proximal plane clustering (kPPC) assign data points to cluster center planes, rather than to cluster center points as in k-means. However, the cluster center planes constructed by kPC and kPPC extend infinitely, which can degrade clustering performance. In this paper, we propose local k-proximal plane clustering (LkPPC), which brings k-means into kPPC so that the data points are forced to gather around prototypes, thereby localizing the representation of each cluster center plane. The contributions of LkPPC are as follows: (1) LkPPC introduces a localized representation of each cluster center plane, avoiding the confusion caused by infinite extension. (2) Unlike kPPC, LkPPC constructs cluster center planes such that the data points of a cluster are close to both their own center plane and prototype, while being far from the other clusters to some extent; this formulation leads to eigenvalue problems. (3) Instead of selecting the initial data points randomly, a Laplace graph strategy is established for initialization. (4) Experimental results on several artificial and benchmark datasets demonstrate the effectiveness of LkPPC.
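The following minimal sketch illustrates the general idea of localizing plane clustering with prototypes, as described above. It is not the authors' LkPPC formulation: it keeps only the within-cluster terms (distance to the cluster's plane and prototype) and drops the between-cluster term, so each plane update reduces to an ordinary rather than a generalized eigenvalue problem, and the Laplace-graph initialization is replaced by a random one. All names (`fit_local_planes`, `lam`) are hypothetical.

```python
# Minimal illustrative sketch of localized plane clustering (not the paper's LkPPC).
import numpy as np

def fit_local_planes(X, k, lam=1.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, k, size=n)          # random init (the paper uses a Laplace-graph strategy)
    for _ in range(n_iter):
        W, b, C = [], [], []
        for j in range(k):
            Xj = X[labels == j]
            if len(Xj) == 0:                     # keep empty clusters alive with a random point
                Xj = X[rng.integers(0, n, size=1)]
            cj = Xj.mean(axis=0)                 # prototype = cluster centroid
            S = (Xj - cj).T @ (Xj - cj)          # within-cluster scatter matrix
            evals, evecs = np.linalg.eigh(S)
            wj = evecs[:, 0]                     # plane normal = eigenvector of smallest eigenvalue
            W.append(wj); b.append(-wj @ cj); C.append(cj)
        W, b, C = np.array(W), np.array(b), np.array(C)
        # reassign: distance to plane plus lam times distance to prototype
        dist_plane = np.abs(X @ W.T + b)
        dist_proto = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = np.argmin(dist_plane + lam * dist_proto, axis=1)
    return labels, W, b, C
```

Here `lam` trades off proximity to the plane against proximity to the prototype; setting it to zero recovers a plain k-plane assignment, while a large value behaves like k-means.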

Keywords

k-Plane clustering · k-Proximal plane clustering · k-Means · Eigenvalue problem · Laplace graph


Copyright information

© The Natural Computing Applications Forum 2014

Authors and Affiliations

  • Zhi-Min Yang (1)
  • Yan-Ru Guo (2)
  • Chun-Na Li (1)
  • Yuan-Hai Shao (1)
  1. Zhijiang College, Zhejiang University of Technology, Hangzhou, People’s Republic of China
  2. College of Science, Zhejiang University of Technology, Hangzhou, People’s Republic of China
