Neural Computing and Applications, Volume 22, Issue 7–8, pp 1309–1319

Proximity multi-sphere support vector clustering

  • Trung Le
  • Dat Tran
  • Phuoc Nguyen
  • Wanli Ma
  • Dharmendra Sharma
ICONIP 2011

Abstract

Support vector data description (SVDD) constructs an optimal hypersphere in feature space as a description of a data set. When mapped back to input space, this hypersphere becomes a set of contours, and support vector clustering (SVC) uses these contours as cluster boundaries to detect clusters in the data set. However, real-world data sets may contain several distinct distributions, so a single hypersphere cannot always provide the best description. As a result, the set of contours in input space does not always detect all clusters in the data set. Another issue is that SVC does not always preserve the notion of proximity, which is crucial for cluster analysis: its cluster labelling method can assign two data points that are close to each other to different clusters. To overcome these drawbacks, we propose Proximity Multi-sphere Support Vector Clustering, which employs a set of hyperspheres to better describe data sets with distinct distributions and a proximity graph to preserve the notion of proximity. Experimental results on different data sets are presented to evaluate the proposed clustering technique and to compare it with SVC and other clustering techniques.
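
For orientation, the single-hypersphere description and the SVC labelling rule mentioned above can be written out as follows. This is a sketch of the standard formulation from the SVDD and SVC literature, not of the multi-sphere method proposed here. The symbols are assumptions that do not appear in the abstract: the feature map $\phi$, kernel $K$, centre $c$, radius $R$, slack variables $\xi_i$, trade-off constant $C$, and coefficients $\beta_j$.

```latex
% Single-sphere SVDD: smallest sphere enclosing the mapped data, with slack.
\begin{align}
  \min_{R,\,c,\,\xi}\quad & R^2 + C \sum_{i=1}^{n} \xi_i \\
  \text{s.t.}\quad & \lVert \phi(x_i) - c \rVert^2 \le R^2 + \xi_i,
                     \qquad \xi_i \ge 0, \quad i = 1, \dots, n .
\end{align}

% With c = \sum_j \beta_j \phi(x_j), the feature-space distance of a point x
% from the centre can be evaluated through the kernel alone:
\begin{equation}
  R^2(x) = K(x, x) - 2 \sum_{j} \beta_j K(x_j, x)
           + \sum_{i,j} \beta_i \beta_j K(x_i, x_j) .
\end{equation}

% Contours in input space, and the SVC labelling rule: two points are
% adjacent if the whole segment between them stays inside the sphere;
% clusters are the connected components of the resulting graph.
\begin{gather}
  \{\, x : R(x) = R \,\} \\
  x_i \sim x_j \iff R\bigl(x_i + \lambda (x_j - x_i)\bigr) \le R
  \quad \text{for all } \lambda \in [0, 1] .
\end{gather}
```

Under this rule, two nearby points can end up in different components when the segment between them leaves the sphere, which is the proximity issue the abstract describes.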

Keywords

Clustering; Support vector data description; Multi-sphere support vector data description; Support vector clustering; Multi-sphere support vector clustering; Proximity graph

Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Trung Le (1)
  • Dat Tran (1)
  • Phuoc Nguyen (1)
  • Wanli Ma (1)
  • Dharmendra Sharma (1)

  1. Faculty of Information Sciences and Engineering, University of Canberra, Canberra, Australia
