
Rough Core Vector Clustering

  • CMB Seshikanth Varma
  • S. Asharaf
  • M. Narasimha Murty
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4815)

Abstract

Support Vector Clustering has attracted attention from researchers in exploratory data analysis because of its firm theoretical foundation in statistical learning theory. However, the hard partitioning of the data set that Support Vector Clustering produces may not be acceptable in real-world scenarios. Rough Support Vector Clustering extends Support Vector Clustering to obtain a soft partitioning of the data set, but the Quadratic Programming problem it involves makes it computationally expensive on large datasets. In this paper, we propose the Rough Core Vector Clustering algorithm, a computationally efficient realization of Rough Support Vector Clustering. The Rough Support Vector Clustering problem is formulated as an approximate Minimum Enclosing Ball problem and solved with an approximate Minimum Enclosing Ball finding algorithm. Experiments on several large multi-class datasets, including Forest Cover Type and other multi-class datasets taken from the LIBSVM page, show that the proposed strategy is efficient and finds meaningful soft cluster abstractions that generalize better than the SVM classifier.
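To make the notion of an approximate Minimum Enclosing Ball concrete, here is a minimal sketch of one well-known approximation scheme: repeatedly move the current center toward the farthest point with a shrinking step size. It is an illustration only, not the algorithm of this paper: it assumes plain Euclidean points in input space rather than a kernel-induced feature space, and the function name `approx_meb` and the parameter `eps` are hypothetical.

```python
import math


def approx_meb(points, eps=0.1):
    """Return (center, radius) of an approximate minimum enclosing ball.

    Illustrative sketch: works on Euclidean points given as tuples or lists,
    not in a kernel feature space as the paper's formulation would require.
    """
    center = list(points[0])  # start from an arbitrary data point
    n_iter = math.ceil(1.0 / eps ** 2)
    for i in range(1, n_iter + 1):
        # Find the point currently farthest from the center.
        far = max(points, key=lambda p: math.dist(p, center))
        # Move the center toward it with step size 1 / (i + 1).
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, far)]
    # The radius is chosen so that the ball encloses every point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    center, radius = approx_meb(square, eps=0.05)
    print(center, radius)
```

The number of iterations depends only on `eps`, not on the number of points, which is what makes schemes of this kind attractive for large datasets; each iteration still needs one pass over the data to find the farthest point.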

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • CMB Seshikanth Varma (1)
  • S. Asharaf (1)
  • M. Narasimha Murty (1)
  1. Computer Science and Automation, Indian Institute of Science, Bangalore-560012
