Abstract
A significant recent advance in machine learning has been the use of sparsity as a guiding principle for classification. Traditionally, sparsity has been used to exploit a property of high-dimensional data: vectors of the same class lie on or near the same low-dimensional subspace of the ambient high-dimensional space. This idea underlies algorithms such as Basis Pursuit and the Sparse Representation Classifier. In this paper we use sparsity to exploit a different property of data, namely that data points belonging to the same class form a cluster. Classification is performed by determining which cluster's vectors can best convexly approximate the given test vector: if the vectors of cluster 'A' best approximate or realise the test vector, then label 'A' is assigned as its class. The problem of finding the best approximation is framed as an \(\ell _{1}\)-norm minimization problem with convex constraints. The optimization framework of the proposed algorithm is convex, making the classification algorithm tractable. The proposed algorithm is evaluated by comparing its accuracy with that of other popular machine learning algorithms on a diverse collection of real datasets; on average, it provides a 10 % improvement in accuracy over certain standard machine learning algorithms.
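The abstract does not spell out the exact optimization, so the following is only a plausible sketch of the idea it describes: approximate the test vector by a convex combination of each class's training vectors, measure the fit in the \(\ell_1\) norm, and assign the label of the class with the smallest residual. The function names (`class_residual`, `classify`) and the toy clusters are illustrative, not from the paper; the \(\ell_1\) fit is cast as a linear program in the standard way.

```python
import numpy as np
from scipy.optimize import linprog


def class_residual(X, y):
    """l1 error of the best convex approximation of y by the columns of X:
    minimise ||y - X c||_1  subject to  c >= 0, sum(c) = 1.
    Cast as a linear program with auxiliary variables t >= |y - X c|."""
    d, L = X.shape
    cost = np.concatenate([np.zeros(L), np.ones(d)])   # minimise sum(t)
    # |y - X c| <= t  <=>  X c - t <= y  and  -X c - t <= -y
    A_ub = np.block([[X, -np.eye(d)], [-X, -np.eye(d)]])
    b_ub = np.concatenate([y, -y])
    # convex-combination constraint: sum(c) = 1 (and c >= 0 via bounds)
    A_eq = np.concatenate([np.ones(L), np.zeros(d)]).reshape(1, -1)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (L + d))
    return res.fun   # optimal l1 reconstruction error


def classify(class_mats, y):
    """Assign y the label of the class whose training vectors give the
    smallest convex-approximation residual."""
    return min(class_mats, key=lambda label: class_residual(class_mats[label], y))


# Toy example: two well-separated clusters in the plane.
X_A = np.array([[0.0, 0.2, 0.1],
                [0.0, 0.1, 0.2]])   # cluster near the origin
X_B = X_A + 5.0                     # cluster near (5, 5)
y = np.array([0.1, 0.15])
print(classify({'A': X_A, 'B': X_B}, y))
```

Because the test point lies inside the convex hull of cluster A, its residual for class A is essentially zero, while class B's convex combinations cannot get closer than the gap between the clusters, so the point is labelled 'A'.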
Notes
1. L is the number of training vectors per class.
References
Zhang, S., et al.: Automatic image annotation using group sparsity. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2010)
Dong, W., et al.: Sparsity-based image denoising via dictionary learning and structural clustering. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2011)
Ramirez, I., Sprechmann, P., Sapiro, G.: Classification and clustering via dictionary learning with structured incoherence and shared features. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2010)
Wright, J., et al.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
Hoyer, P.O.: Non-negative sparse coding. In: Proceedings of 2002 12th IEEE Workshop on Neural Networks for Signal Processing. IEEE (2002)
Elhamifar, E., Vidal, R.: Clustering disjoint subspaces via sparse representation. In: 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP). IEEE (2010)
Lichman, M.: UCI Machine Learning Repository. School of Information and Computer Sciences, University of California, Irvine. http://archive.ics.uci.edu/ml
HP-labs: Isolated Handwritten Tamil Character Dataset Developed by HP India Along with IISc (2006). http://lipitk.sourceforge.net/datasets/tamilchardata.htm. Accessed 30 Sept 2010
Tagme dataset. http://events.csa.iisc.ernet.in/opendays2014/events/MLEvent/index.php. Accessed 26 Mar 2014
Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control. LNCIS, vol. 371, pp. 95–110. Springer, Heidelberg (2008)
Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)
Cover, T.M., Hart, P.E.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theor. 13(1), 21–27 (1967)
Hartigan, J.A., Wong, M.A.: Algorithm AS 136: a k-means clustering algorithm. Appl. Stat. 28, 100–108 (1979)
Osuna, E., Freund, R., Girosi, F.: Training support vector machines: an application to face detection. In: Proceedings of 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE (1997)
Gui, J., et al.: Representative vector machines: a unified framework for classical classifiers. (2015)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. Ser. B (Methodol.) 58, 267–288 (1996)
Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1998)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Babu, S. (2016). A Novel Sparsity Based Classification Framework to Exploit Clusters in Data. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol. 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_19
Print ISBN: 978-3-319-41560-4
Online ISBN: 978-3-319-41561-1