
A Novel Sparsity Based Classification Framework to Exploit Clusters in Data

  • Conference paper
  • In: Advances in Data Mining. Applications and Theoretical Aspects (ICDM 2016)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9728)

Abstract

A major recent advance in machine learning has been the use of sparsity as a guiding principle for classification. Traditionally, sparsity has been used to exploit a property of high-dimensional vectors: vectors of the same class lie on or near the same low-dimensional subspace of an ambient high-dimensional space, as in algorithms such as Basis Pursuit and the Sparse Representation Classifier. In this paper we use sparsity to exploit a different property of data: data points belonging to the same class form a cluster. Classification is performed by determining which cluster's vectors best convexly approximate the given test vector; if the vectors of cluster 'A' best approximate or realise the test vector, then label 'A' is assigned as its class. The problem of finding the best approximation is framed as an \(\ell _{1}\) norm minimization problem with convex constraints. The optimization framework of the proposed algorithm is convex, making the classification algorithm tractable. The proposed algorithm is evaluated by comparing its accuracy with that of other popular machine learning algorithms on a diverse collection of real datasets; on average, it provides a 10% improvement in accuracy over certain standard machine learning algorithms.
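The abstract does not give the exact optimization problem, but one plausible reading of "which cluster's vectors can best convexly approximate the test vector" is the following sketch: for each class, find the convex combination of that class's training vectors that minimizes the \(\ell_1\) residual to the test vector (a linear program), then assign the label of the class with the smallest residual. The function names, the use of `scipy.optimize.linprog`, and the choice of an \(\ell_1\) residual objective are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from scipy.optimize import linprog

def class_residual(A, y):
    """Best convex approximation of y by the columns of A, in the l1 sense.

    Solves  min ||A x - y||_1  s.t.  x >= 0, sum(x) = 1
    as a linear program, with auxiliary variables t bounding |A x - y|.
    (Illustrative sketch, not the paper's exact formulation.)
    """
    d, L = A.shape
    # Variables: [x (L entries), t (d entries)]; objective: minimise sum(t).
    c = np.concatenate([np.zeros(L), np.ones(d)])
    # Encode  A x - y <= t  and  -(A x - y) <= t  as A_ub z <= b_ub.
    A_ub = np.block([[A, -np.eye(d)], [-A, -np.eye(d)]])
    b_ub = np.concatenate([y, -y])
    # Convexity constraint: the coefficients x sum to one.
    A_eq = np.concatenate([np.ones(L), np.zeros(d)]).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (L + d))
    return res.fun  # achieved l1 residual

def classify(train, y):
    """train: dict mapping class label -> (d x L) matrix of training vectors."""
    return min(train, key=lambda label: class_residual(train[label], y))
```

For example, with two classes whose training vectors cluster near (1, 0) and (0, 1) respectively, a test vector near (1, 0) is assigned the first label, since its convex hull yields the smaller residual.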


Notes

  1. L is the number of training vectors per class.


Author information

Correspondence to Sudarshan Babu.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Babu, S. (2016). A Novel Sparsity Based Classification Framework to Exploit Clusters in Data. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_19

  • DOI: https://doi.org/10.1007/978-3-319-41561-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41560-4

  • Online ISBN: 978-3-319-41561-1

  • eBook Packages: Computer Science (R0)
