International Journal of Computer Vision, Volume 96, Issue 3, pp 384–399

Compressed Histogram of Gradients: A Low-Bitrate Descriptor

  • Vijay Chandrasekhar
  • Gabriel Takacs
  • David M. Chen
  • Sam S. Tsai
  • Yuriy Reznik
  • Radek Grzeszczuk
  • Bernd Girod
Article

Abstract

Establishing visual correspondences is an essential component of many computer vision problems and is often done with local feature descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile visual search applications. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate compared to state-of-the-art descriptors. The framework has low complexity and provides a significant speed-up in the matching stage. We show how to efficiently compute distances between descriptors in the compressed domain, eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, BRIEF, MPEG-7 image signatures, and other low bit-rate descriptors, and show that our proposed CHoG descriptor significantly outperforms existing schemes over a wide range of bitrates. We implement the descriptor in a mobile image retrieval system and, for a database of 1 million CD, DVD, and book covers, achieve 96% retrieval accuracy using only 4 KB of data per query image.
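The idea of computing distances in the compressed domain can be illustrated with a small sketch. The abstract does not give the construction, so everything here is an assumption made for illustration: the toy codebook `CODEBOOK`, the choice of symmetric Kullback-Leibler divergence as the histogram distance, and the function names are all hypothetical. Each descriptor is represented only by the codebook indices of its quantized gradient histograms, and a precomputed table of codeword-to-codeword distances replaces decoding.

```python
import math

# Hypothetical toy codebook: each entry is a normalized 3-bin gradient histogram.
# A real codebook would be larger and learned or constructed from data.
CODEBOOK = [
    (0.8, 0.1, 0.1),
    (0.1, 0.8, 0.1),
    (0.1, 0.1, 0.8),
    (1 / 3, 1 / 3, 1 / 3),
]


def symmetric_kl(p, q):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), for strictly positive histograms."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


# Precompute every codeword-to-codeword distance once, offline.
DIST_TABLE = [[symmetric_kl(p, q) for q in CODEBOOK] for p in CODEBOOK]


def compressed_distance(a, b):
    """Distance between two descriptors given as equal-length lists of codebook indices.

    No histogram is reconstructed: each cell contributes one table lookup.
    """
    if len(a) != len(b):
        raise ValueError("descriptors must have the same number of cells")
    return sum(DIST_TABLE[i][j] for i, j in zip(a, b))
```

With this layout, matching two descriptors costs one table lookup and one addition per spatial cell, which is one plausible way a compressed-domain scheme could avoid decoding at query time.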

Keywords

CHoG, Feature descriptor, Mobile visual search, Content-based image retrieval, Histogram-of-gradients, Low bitrate

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Vijay Chandrasekhar (1)
  • Gabriel Takacs (1)
  • David M. Chen (1)
  • Sam S. Tsai (1)
  • Yuriy Reznik (1)
  • Radek Grzeszczuk (1)
  • Bernd Girod (1)

  1. Stanford University, Stanford, USA
