Large data sets classification using convex–concave hull and support vector machine
 Asdrúbal López Chau,
 Xiaoou Li,
 Wen Yu
 … show all 3 hide
Rent the article at a discount
Rent now* Final gross prices may vary according to local VAT.
Get AccessAbstract
Normal support vector machine (SVM) is not suitable for classification of large data sets because of high training complexity. Convex hull can simplify the SVM training. However, the classification accuracy becomes lower when there exist inseparable points. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCHSVM). After grid processing, the convex hull is used to find extreme points. Then, we use Jarvis march method to determine the concave (nonconvex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are applied for SVM training. The proposed CCHSVM classifier has distinctive advantages on dealing with large data sets. We apply the proposed method on several benchmark problems. Experimental results demonstrate that our approach has good classification accuracy while the training is significantly faster than other SVM classifiers. Compared with the other convex hull SVM methods, the classification accuracy is higher.
 Bennett KP, Bredensteiner EJ (2000a) Geometry in learning. In: Gorini C (eds), Geometry at work. Mathematical Association of America, pp 132–145
 Bennett K.P., Bredensteiner E.J. (2000b) Duality and geometry in SVM classifiers. 17th International Conference on Machine Learning, San Francisco
 Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications. Springer, Berlin
 Cervantes, J, Li, X, Yu, W, Li, K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71: pp. 611619 CrossRef
 Chang CC, Lin CJ (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm
 Collobert, R, Bengio, S (2001) SVMTorch: support vector machines for large regression problems. J Mach Learn Res 1: pp. 143160
 Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. 23rd international conference on machine learning, Pittsburgh, pp 201–208
 Crisp, DJ, Burges, CJC (2000) A geometric interpretation of υSVM classifiers. NIPS 12: pp. 244250
 Cristianini N, ShaweTaylor J (2000) An introduction to support vector machines and other kernelbased learning methods. Cambridge University Press, Cambridge
 Eddy, W (1977) A new convex hull algorithm for planar sets. ACM Trans Math Softw 3: pp. 398403 CrossRef
 Franc, V, Hlavac, V (2003) An iterative algorithm learning the maximal margin classifier. Pattern Recogn 36: pp. 19851996 CrossRef
 Gilbert, EG (1966) An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM J Control Optim 4: pp. 6179 CrossRef
 Graham, RL (1972) An efficient algorithm for dutennining the convex hull of a finite pianar set. Inf Process Lett 1: pp. 132133 CrossRef
 Guo, G, Zhang, JS (2007) Reducing examples to accelerate support vector regression. Pattern Recognit Lett 28: pp. 21732183 CrossRef
 Ho TK, Kleinberg EM (1996) Checkerboard dataset. http://www.cs.wisc.edu/
 Hsu, CW, Chang, CC, Lin, CJ (2010) A practical guide to support vector classification. Bioinformatics 1: pp. 116
 Jarvis, RA (1973) On the identification of the convex hull of a finite set of points in the plane. Inf Process Lett 2: pp. 1821 CrossRef
 Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, Berlin
 Kallay, M (1984) The complexity of incremental convex hull algorithms. Inf Process Lett 19: pp. 197212 CrossRef
 Keerthi, SS, Gilbert, EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46: pp. 351360 CrossRef
 Keerthi, SS, Shevade, SK, Bhattacharyya, C, Murthy, KRK (2001) A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Trans Neural Netw 11: pp. 124137
 Li, Y (2011) Selecting training points for oneclass support vector machines. Pattern Recognit Lett 32: pp. 15171522 CrossRef
 Mavroforakis, ME, Theodoridis, S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17: pp. 671682 CrossRef
 Mitchell, BF, Dem’yanov, VF, Malozemov, VN (1971) Finding the point of a polyhedron closest to the origin. Vestinik Leningrad Gos Univ 13: pp. 3845
 Moreira A, Santos MY (2007) Concave hull: a Knearest neighbours approach for the computation of the region occupied by a set of points. GRAPP (GM/R), pp 61–68
 Pizzuti, C, Talia, D (2003) Pauto class: scalable parallel clustering for mining large data sets. IEEE Trans Knowl Data Eng 15: pp. 629641 CrossRef
 Platt J. (1998) Fast training of support vector machine using sequential minimal optimization, advances in kernel methods: support vector machine. MIT Press, Cambridge
 Preparata, FP, Hong, SJ (1977) Convex hulls of finite sets of points in two and three dimensions. Commun ACM 20: pp. 8793 CrossRef
 Schlesinger MI, Kalmykov VG, Suchorukov AA, Sravnitelnyj (1981) Comparative analysis of algorithms synthesising linear decision rule for analysis of complex hypotheses. Automatika 1:3–9
 Tsang, IW, Kwok, JT, Cheung, PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6: pp. 363392
 Vapnik V (1995) The nature of statistical learning theory. Springer, New York
 Xia, C, Hsu, W, Lee, ML, Ooi, BC (2006) BORDER: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18: pp. 289303 CrossRef
 Yu, W, Li, X (2008) Online fuzzy modeling via clustering and support vector machines. Inf Sci 178: pp. 42644279 CrossRef
 Yu H, Yang J, Han J (2003) Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the 9th ACM SIGKDD 2003 Washington, DC
 Yuille, AL, Rangarajan, A (2003) The concaveconvex procedure. Neural Comput 15: pp. 915936 CrossRef
 Title
 Large data sets classification using convex–concave hull and support vector machine
 Journal

Soft Computing
Volume 17, Issue 5 , pp 793804
 Cover Date
 20130501
 DOI
 10.1007/s005000120954x
 Print ISSN
 14327643
 Online ISSN
 14337479
 Publisher
 SpringerVerlag
 Additional Links
 Topics
 Industry Sectors
 Authors

 Asdrúbal López Chau ^{(1)}
 Xiaoou Li ^{(1)}
 Wen Yu ^{(2)}
 Author Affiliations

 1. Departamento de Coputacion, CINVESTAVIPN, Mexico City, 07360, Mexico
 2. Departamento de Control Automatico, CINVESTAVIPN, Mexico City, 07360, Mexico