Skip to main content
Log in

Large data sets classification using convex–concave hull and support vector machine

  • Methodologies and Application
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

Normal support vector machine (SVM) is not suitable for classification of large data sets because of high training complexity. Convex hull can simplify the SVM training. However, the classification accuracy becomes lower when there exist inseparable points. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCH-SVM). After grid processing, the convex hull is used to find extreme points. Then, we use Jarvis march method to determine the concave (non-convex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are applied for SVM training. The proposed CCH-SVM classifier has distinctive advantages on dealing with large data sets. We apply the proposed method on several benchmark problems. Experimental results demonstrate that our approach has good classification accuracy while the training is significantly faster than other SVM classifiers. Compared with the other convex hull SVM methods, the classification accuracy is higher.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12

Similar content being viewed by others

References

  • Bennett KP, Bredensteiner EJ (2000a) Geometry in learning. In: Gorini C (eds), Geometry at work. Mathematical Association of America, pp 132–145

  • Bennett K.P., Bredensteiner E.J. (2000b) Duality and geometry in SVM classifiers. 17th International Conference on Machine Learning, San Francisco

  • Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications. Springer, Berlin

  • Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619

    Article  Google Scholar 

  • Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm

  • Collobert R, Bengio S (2001) SVMTorch: support vector machines for large regression problems. J Mach Learn Res 1:143–160

    MathSciNet  Google Scholar 

  • Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. 23rd international conference on machine learning, Pittsburgh, pp 201–208

  • Crisp DJ, Burges CJC (2000) A geometric interpretation of υ-SVM classifiers. NIPS 12:244–250

    Google Scholar 

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

  • Eddy W (1977) A new convex hull algorithm for planar sets. ACM Trans Math Softw 3(4):398–403

    Article  MATH  Google Scholar 

  • Franc V, Hlavac V (2003) An iterative algorithm learning the maximal margin classifier. Pattern Recogn Lett 36:1985–1996

    Article  MATH  Google Scholar 

  • Gilbert EG (1966) An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM J Control Optim 4(1):61–79

    Article  MATH  Google Scholar 

  • Graham RL (1972) An efficient algorithm for dutennining the convex hull of a finite pianar set. Inf Process Lett 1:132–133

    Article  MATH  Google Scholar 

  • Guo G, Zhang J-S (2007) Reducing examples to accelerate support vector regression. Pattern Recognit Lett 28:2173–2183

    Article  Google Scholar 

  • Ho TK, Kleinberg EM (1996) Checkerboard dataset. http://www.cs.wisc.edu/

  • Hsu C-W, Chang C-C, Lin C-J (2010) A practical guide to support vector classification. Bioinform Biol Insights 1(1):1–16

    MathSciNet  Google Scholar 

  • Jarvis RA (1973) On the identification of the convex hull of a finite set of points in the plane. Inf Process Lett 2:18–21

    Article  MATH  Google Scholar 

  • Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, Berlin

  • Kallay M (1984) The complexity of incremental convex hull algorithms. Inf Process Lett 19(4):197–212

    Article  MathSciNet  MATH  Google Scholar 

  • Keerthi SS, Gilbert EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46:351–360

    Article  MATH  Google Scholar 

  • Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Trans Neural Netw 11(12):124–137

    Google Scholar 

  • Li Y (2011) Selecting training points for one-class support vector machines. Pattern Recognit Lett 32:1517–1522

    Article  Google Scholar 

  • Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17(3):671–682

    Article  Google Scholar 

  • Mitchell BF, Dem’yanov VF, Malozemov VN (1971) Finding the point of a polyhedron closest to the origin. Vestinik Leningrad Gos Univ 13:38–45

    Google Scholar 

  • Moreira A, Santos MY (2007) Concave hull: a K-nearest neighbours approach for the computation of the region occupied by a set of points. GRAPP (GM/R), pp 61–68

  • Pizzuti C, Talia D (2003) P-auto class: scalable parallel clustering for mining large data sets. IEEE Trans Knowl Data Eng 15(3):629–641

    Article  Google Scholar 

  • Platt J. (1998) Fast training of support vector machine using sequential minimal optimization, advances in kernel methods: support vector machine. MIT Press, Cambridge

  • Preparata FP, Hong SJ (1977) convex hulls of finite sets of points in two and three dimensions. Commun ACM 20(2):87–93

    Article  MathSciNet  MATH  Google Scholar 

  • Schlesinger MI, Kalmykov VG, Suchorukov AA, Sravnitelnyj (1981) Comparative analysis of algorithms synthesising linear decision rule for analysis of complex hypotheses. Automatika 1:3–9

  • Tsang IW, Kwok JT, Cheung PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6:363–392

    MathSciNet  MATH  Google Scholar 

  • Vapnik V (1995) The nature of statistical learning theory. Springer, New York

  • Xia C, Hsu W, Lee ML, Ooi BC (2006) BORDER: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303

    Article  Google Scholar 

  • Yu W, Li X (2008) On-line fuzzy modeling via clustering and support vector machines. Inf Sci 178:4264–4279

    Article  MATH  Google Scholar 

  • Yu H, Yang J, Han J (2003) Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the 9th ACM SIGKDD 2003 Washington, DC

  • Yuille AL, Rangarajan A (2003) The concave-convex procedure. Neural Comput Appl 15(4):915–936

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wen Yu.

Additional information

Communicated by V. Loia.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chau, A.L., Li, X. & Yu, W. Large data sets classification using convex–concave hull and support vector machine. Soft Comput 17, 793–804 (2013). https://doi.org/10.1007/s00500-012-0954-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-012-0954-x

Keywords

Navigation