Smart Sampling and Optimal Dimensionality Reduction of Big Data Using Compressed Sensing

  • Anastasios Maronidis
  • Elisavet Chatzilari
  • Spiros Nikolopoulos
  • Ioannis Kompatsiaris
Part of the Studies in Big Data book series (SBD, volume 18)


Handling big data poses a huge challenge to the computer science community. Some of the most appealing research domains, such as machine learning, computational biology and social networks, are now overwhelmed with large-scale databases that require computationally demanding manipulation. Several techniques have been proposed for dealing with big data processing challenges, including computationally efficient implementations such as parallel and distributed architectures, but most approaches benefit from a dimensionality reduction and smart sampling step applied to the data. In this context, through a series of groundbreaking works, Compressed Sensing (CS) has emerged as a powerful mathematical framework providing a suite of conditions and methods that allow for almost lossless and efficient data compression. The most surprising outcome of CS is the proof that random projections are a close to optimal choice for transforming high-dimensional data into a low-dimensional space in a way that allows their almost perfect reconstruction. This compression power, along with its simplicity of use, renders CS an appealing method for optimal dimensionality reduction of big data. Although CS is renowned for its capability of providing succinct representations of the data, in this chapter we investigate its potential as a dimensionality reduction technique in the domain of image annotation. More specifically, our aim is to first present the challenges stemming from the nature of big data problems, then explain the basic principles, advantages and disadvantages of CS, and finally identify potential ways of exploiting this theory in the domain of large-scale image annotation. Towards this end, a novel Hierarchical Compressed Sensing (HCS) method is proposed. The new method dramatically decreases the computational complexity while displaying robustness equal to that of the typical CS method. In addition, the connection between the sparsity level of the original dataset and the effectiveness of HCS is established through a series of artificial experiments. Finally, the proposed method is compared with the state-of-the-art dimensionality reduction technique of Principal Component Analysis. The performance results are encouraging, indicating a promising potential of the new method in large-scale image annotation.
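The core CS claim summarized above — that a random linear projection compresses a sparse signal while still permitting near-perfect recovery — can be illustrated with a minimal sketch. This is not the chapter's HCS method; it is a generic demonstration using a Gaussian measurement matrix and Orthogonal Matching Pursuit (a greedy recovery algorithm cited in the CS literature), with all dimensions and the sparsity level chosen here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, k = 1000, 200, 10   # ambient dimension, number of measurements, sparsity

# A k-sparse signal in the ambient space.
x = np.zeros(n)
support_true = rng.choice(n, size=k, replace=False)
x[support_true] = rng.standard_normal(k)

# Random Gaussian measurement matrix: the "random projection" that CS
# theory identifies as a near-optimal dimensionality reduction.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x               # compressed, m-dimensional representation

def omp(Phi, y, k):
    """Greedy sparse recovery via Orthogonal Matching Pursuit."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        support.append(j)
        A = Phi[:, support]
        # Re-fit the coefficients on the selected support by least squares.
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

x_hat = omp(Phi, y, k)
rel_error = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"compression ratio: {n / m:.0f}x, relative reconstruction error: {rel_error:.2e}")
```

With these ratios (five-fold compression of a 1%-sparse signal), recovery is essentially exact with high probability, which is the behaviour the abstract refers to as "almost perfect reconstruction" from random projections.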


Smart sampling · Optimal dimensionality reduction · Compressed sensing · Sparse representation · Scalable image annotation



This work was supported by the European Commission Seventh Framework Programme under Grant Agreement Number FP7-601138 PERICLES.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Anastasios Maronidis (1)
  • Elisavet Chatzilari (1)
  • Spiros Nikolopoulos (1)
  • Ioannis Kompatsiaris (1)
  1. Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
