Large data sets classification using convex–concave hull and support vector machine
 Asdrúbal López Chau,
 Xiaoou Li,
 Wen Yu
Abstract
Standard support vector machines (SVMs) are not suitable for classifying large data sets because of their high training complexity. Using the convex hull of the data can simplify SVM training; however, classification accuracy drops when the classes contain inseparable points. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCHSVM). After grid processing, the convex hull is used to find extreme points. The Jarvis march method is then used to determine the concave (nonconvex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are used for SVM training. The proposed CCHSVM classifier has distinct advantages in dealing with large data sets. We apply the proposed method to several benchmark problems. Experimental results demonstrate that our approach achieves good classification accuracy while training significantly faster than other SVM classifiers, and higher classification accuracy than other convex hull SVM methods.
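As a rough illustration of the hull-finding step mentioned in the abstract, the sketch below implements the classic Jarvis march (gift wrapping) for the convex hull of planar points. It is not the authors' CCHSVM procedure: the grid processing and the concave-hull extension are not reproduced here, and the names `cross`, `dist2`, and `jarvis_march` are hypothetical helpers chosen for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive if o -> a -> b turns left."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def jarvis_march(points: List[Point]) -> List[Point]:
    """Return convex hull vertices in counter-clockwise order (illustrative sketch)."""
    pts = sorted(set(points))          # remove duplicates, sort by x then y
    if len(pts) < 3:
        return pts
    start = pts[0]                     # leftmost point is always on the hull
    hull: List[Point] = []
    current = start
    while True:
        hull.append(current)
        # Pick any point other than the current one as the initial candidate.
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Switch to p if it lies to the right of current -> candidate,
            # or if it is collinear with the candidate but farther away.
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:           # wrapped all the way around
            break
    return hull


if __name__ == "__main__":
    sample = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
    print(jarvis_march(sample))
```

Only the returned hull vertices would be passed on to a classifier in a hull-based scheme, which is where the reduction in training set size comes from; how the paper selects and extends those vertices for inseparable points is described in the full article, not here.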
Within this Article
 Introduction
 Convex–concave hull
 Experiments
 Conclusions
 References
 Title
 Large data sets classification using convex–concave hull and support vector machine
 Journal

Soft Computing
Volume 17, Issue 5, pp 793–804
 Cover Date
 2013-05-01
 DOI
 10.1007/s00500-012-0954-x
 Print ISSN
 1432-7643
 Online ISSN
 1433-7479
 Publisher
 Springer-Verlag
 Authors

 Asdrúbal López Chau ^{(1)}
 Xiaoou Li ^{(1)}
 Wen Yu ^{(2)}
 Author Affiliations

 1. Departamento de Computación, CINVESTAV-IPN, Mexico City, 07360, Mexico
 2. Departamento de Control Automático, CINVESTAV-IPN, Mexico City, 07360, Mexico