Compressed Histogram of Gradients: A Low-Bitrate Descriptor

International Journal of Computer Vision

Abstract

Establishing visual correspondences is an essential component of many computer vision problems and is often done with local feature descriptors. Transmission and storage of these descriptors are of critical importance in mobile visual search applications. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate compared to state-of-the-art descriptors. The framework has low complexity and offers a significant speed-up in the matching stage. We show how to efficiently compute distances between descriptors in the compressed domain, eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, BRIEF, MPEG-7 image signatures and other low bit-rate descriptors, and show that our proposed CHoG descriptor significantly outperforms existing schemes over a wide range of bitrates. We implement the descriptor in a mobile image retrieval system and, for a database of 1 million CD, DVD and book covers, achieve 96% retrieval accuracy using only 4 KB of data per query image.
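
To make the idea of matching in the compressed domain concrete, the following is a minimal sketch, not the authors' implementation. It assumes each descriptor is a list of per-cell gradient histograms, that each histogram is quantized to the index of its nearest entry in a small codebook, and that distances are measured with a symmetric Kullback-Leibler divergence. The codebook values, the choice of divergence, and every function name are illustrative assumptions. Because all codeword-to-codeword distances can be tabulated once, two encoded descriptors can be compared by table lookups alone, without reconstructing the histograms.

```python
import math

# Hypothetical codebook of 3-bin gradient histograms (each sums to 1).
# The values are invented for illustration and are not taken from the paper.
CODEBOOK = [
    (0.8, 0.1, 0.1),
    (0.1, 0.8, 0.1),
    (0.1, 0.1, 0.8),
    (1 / 3, 1 / 3, 1 / 3),
]


def sym_kl(p, q):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), for strictly positive histograms."""
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def quantize(hist):
    """Return the index of the codeword closest to hist under sym_kl."""
    return min(range(len(CODEBOOK)), key=lambda i: sym_kl(hist, CODEBOOK[i]))


def encode(descriptor):
    """Encode a descriptor (list of cell histograms) as a list of codeword indices."""
    return [quantize(cell) for cell in descriptor]


# Codeword-to-codeword distances, computed once and reused for every comparison.
DIST_TABLE = [[sym_kl(p, q) for q in CODEBOOK] for p in CODEBOOK]


def compressed_distance(codes_a, codes_b):
    """Distance between two encoded descriptors using only table lookups."""
    return sum(DIST_TABLE[i][j] for i, j in zip(codes_a, codes_b))


# Example: two descriptors with two spatial cells each (assumed, strictly positive bins).
desc_a = [(0.7, 0.2, 0.1), (0.3, 0.3, 0.4)]
desc_b = [(0.75, 0.15, 0.1), (0.1, 0.2, 0.7)]
codes_a = encode(desc_a)
codes_b = encode(desc_b)
print(codes_a, codes_b, compressed_distance(codes_a, codes_b))
```

In this sketch the matching cost per descriptor pair is one lookup per cell, independent of the number of histogram bins, which is one plausible source of the matching speed-up the abstract describes. The histograms must have strictly positive bins here, since the divergence is undefined when a bin is zero.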

References

  • Amazon (2007). SnapTell. http://www.snaptell.com.

  • Banerjee, A., Merugu, S., Dhillon, I., & Ghosh, J. (2004). Clustering with Bregman divergences. Journal of Machine Learning Research, 234–245.

  • Bay, H., Tuytelaars, T., & Gool, L. V. (2006). SURF: speeded up robust features. In Proc. of European conference on computer vision (ECCV), Graz, Austria.

  • Bay, H., Ess, A., Tuytelaars, T., & Gool, L. V. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3), 346–359. http://dx.doi.org/10.1016/j.cviu.2007.09.014.

  • Brasnett, P., & Bober, M. (2007). Robust visual identifier using the trace transform. In Proc. of IET visual information engineering conference (VIE), London, UK.

  • Calonder, M., Lepetit, V., & Fua, P. (2010). BRIEF: binary robust independent elementary features. In Proc. of European conference on computer vision (ECCV), Crete, Greece.

  • Chandrasekhar, V., Takacs, G., Chen, D. M., Tsai, S. S., & Girod, B. (2009a). Transform coding of feature descriptors. In Proc. of visual communications and image processing conference (VCIP), San Jose, California.

  • Chandrasekhar, V., Takacs, G., Chen, D. M., Tsai, S. S., Grzeszczuk, R., & Girod, B. (2009b). CHoG: compressed histogram of gradients—a low bit rate feature descriptor. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), Miami, Florida.

  • Chandrasekhar, V., Chen, D. M., Lin, A., Takacs, G., Tsai, S. S., Cheung, N. M., Reznik, Y., Grzeszczuk, R., & Girod, B. (2010a). Comparison of local feature descriptors for mobile visual search. In Proc. of IEEE international conference on image processing (ICIP), Hong Kong.

  • Chandrasekhar, V., Makar, M., Takacs, G., Chen, D., Tsai, S. S., Cheung, N. M., Grzeszczuk, R., Reznik, Y., & Girod, B. (2010b). Survey of SIFT compression schemes. In Proc. of international mobile multimedia workshop (IMMW), IEEE international conference on pattern recognition (ICPR), Istanbul, Turkey.

  • Chandrasekhar, V., Reznik, Y., Takacs, G., Chen, D. M., Tsai, S. S., Grzeszczuk, R., & Girod, B. (2010c). Study of quantization schemes for low bitrate CHoG descriptors. In Proc. of IEEE international workshop on mobile vision (IWMV), San Francisco, California.

  • Chou, P. A., Lookabaugh, T., & Gray, R. M. (1989). Entropy-constrained vector quantization. IEEE Transactions on Acoustics, Speech and Signal Processing, 37(1).

  • Conway, J. H., & Sloane, N. J. A. (1982). Fast quantizing and decoding algorithms for lattice quantizers and codes. IEEE Transactions on Information Theory, IT-28(2), 227–232.

  • Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. Wiley series in telecommunications and signal processing. New York: Wiley-Interscience.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), San Diego, CA.

  • Erol, B., Antúnez, E., & Hull, J. (2008). Hotpaper: multimedia interaction with paper using mobile phones. In Proc. of the 16th ACM multimedia conference, New York, NY, USA.

  • Freeman, W. T., & Roth, M. (1994). Orientation histograms for hand gesture recognition. In Proc. of international workshop on automatic face and gesture recognition (pp. 296–301).

  • Gagie, T. (2006). Compressing probability distributions. Information Processing Letters, 97(4), 133–137. http://dx.doi.org/10.1016/j.ipl.2005.10.006.

  • Girod, B., Chandrasekhar, V., Chen, D. M., Cheung, N. M., Grzeszczuk, R., Reznik, Y., Takacs, G., Tsai, S. S., & Vedantham, R. (2010). Mobile visual search. IEEE Signal Processing Magazine, Special Issue on Mobile Media Search, under review.

  • Google (2009). Google Goggles. http://www.google.com/mobile/goggles/.

  • Graham, J., & Hull, J. J. (2008). iCandy: a tangible user interface for iTunes. In Proc. of CHI ’08: extended abstracts on human factors in computing systems, Florence, Italy.

  • Hua, G., Brown, M., & Winder, S. (2007). Discriminant embedding for local image descriptors. In Proc. of international conference on computer vision (ICCV), Rio de Janeiro, Brazil.

  • Hull, J. J., Erol, B., Graham, J., Ke, Q., Kishi, H., Moraleda, J., & Olst, D. G. V. (2007). Paper-based augmented reality. In Proc. of the 17th international conference on artificial reality and telexistence (ICAT), Washington, DC, USA.

  • Jegou, H., Douze, M., & Schmid, C. (2008). Hamming embedding and weak geometric consistency for large scale image search. In Proc. of European conference on computer vision (ECCV), Berlin, Heidelberg.

  • Jegou, H., Douze, M., & Schmid, C. (2010). Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence, accepted.

  • Johnson, M. (2010). Generalized descriptor compression for storage and matching. In Proc. of British machine vision conference (BMVC).

  • Ke, Y., & Sukthankar, R. (2004). PCA-SIFT: a more distinctive representation for local image descriptors. In Proc. of conference on computer vision and pattern recognition (CVPR) (Vol. 02, pp. 506–513). Washington: IEEE Computer Society.

  • Kooaba (2007). Kooaba. http://www.kooaba.com.

  • Kullback, S. (1987). The Kullback-Leibler distance. The American Statistician, 41, 340–341.

  • Lowe, D. (1999). Object recognition from local scale-invariant features. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), Los Alamitos, CA.

  • Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  • Makar, M., Chang, C., Chen, D. M., Tsai, S. S., & Girod, B. (2009). Compression of image patches for local feature extraction. In Proc. of IEEE international conference on acoustics, speech and signal processing (ICASSP), Taipei, Taiwan.

  • Mikolajczyk, K., & Schmid, C. (2005). Performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630. http://dx.doi.org/10.1109/TPAMI.2005.188.

  • Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., & Gool, L. V. (2005). A comparison of affine region detectors. International Journal of Computer Vision, 65(1–2), 43–72. http://dx.doi.org/10.1007/s11263-005-3848-x.

  • Nistér, D., & Stewénius, H. (2006). Scalable recognition with a vocabulary tree. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), New York, USA.

  • Nokia (2006). Nokia point and find. http://www.pointandfind.nokia.com.

  • Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2008). Lost in quantization—improving particular object retrieval in large scale image databases. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), Anchorage, Alaska.

  • Rebollo-Monedero, D. (2007). Quantization and transforms for distributed source coding. PhD thesis, Department of Electrical Engineering, Stanford University.

  • Reznik, Y., Chandrasekhar, V., Takacs, G., Chen, D. M., Tsai, S. S., Grzeszczuk, R., & Girod, B. (2010). Fast quantization and matching of histogram-based image features. In Proc. of SPIE workshop on applications of digital image processing (ADIP), San Diego, California.

  • Rubner, Y., Tomasi, C., & Guibas, L. J. (2000). The Earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 40(2), 99–121. http://dx.doi.org/10.1023/A:1026543900054.

  • Shakhnarovich, G., & Darrell, T. (2005). Learning task-specific similarity. Thesis.

  • Shao, H., Svoboda, T., & Gool, L. V. (2003). ZuBuD: Zürich buildings database for image based recognition (Tech. Rep. 260). ETH Zürich.

  • Sommerville, D. M. Y. (1958). An introduction to the geometry of n dimensions. New York: Dover.

  • Takacs, G., Chandrasekhar, V., Gelfand, N., Xiong, Y., Chen, W., Bismpigiannis, T., Grzeszczuk, R., Pulli, K., & Girod, B. (2008). Outdoors augmented reality on mobile phone using loxel-based visual feature organization. In Proc. of ACM international conference on multimedia information retrieval (ACM MIR), Vancouver, Canada.

  • Tola, E., Lepetit, V., & Fua, P. (2008). A fast local descriptor for dense matching. In Proc. of IEEE conference on computer vision and pattern recognition (pp. 1–8). doi:10.1109/CVPR.2008.4587673.

  • Torralba, A., Fergus, R., & Weiss, Y. (2008). Small codes and large image databases for recognition. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), Anchorage, Alaska.

  • Tsai, S. S., Chen, D. M., Chandrasekhar, V., Takacs, G., Cheung, N. M., Vedantham, R., Grzeszczuk, R., & Girod, B. (2010). Mobile product recognition. In Proc. of ACM multimedia (ACM MM), Florence, Italy.

  • Weiss, Y., Torralba, A., & Fergus, R. (2008). Spectral hashing. In Proc. of neural information processing systems (NIPS), Vancouver, BC, Canada.

  • Winder, S., & Brown, M. (2007). Learning local image descriptors. In Proc. of IEEE conference on computer vision and pattern recognition (CVPR), Minneapolis, Minnesota (pp. 1–8). doi:10.1109/CVPR.2007.382971.

  • Winder, S., Hua, G., & Brown, M. (2009). Picking the best daisy. In Proc. of computer vision and pattern recognition (CVPR), Miami, Florida.

  • Yeo, C., Ahammad, P., & Ramchandran, K. (2008). Rate-efficient visual correspondences using random projections. In Proc. of IEEE international conference on image processing (ICIP), San Diego, California.

Author information

Correspondence to Vijay Chandrasekhar.

Additional information

This work was first presented as an oral presentation at Computer Vision and Pattern Recognition (CVPR), 2009. Since then, the authors have studied feature compression in more detail in Chandrasekhar et al. (2009a, 2010a, 2010b, 2010c). A default implementation of CHoG is available at http://www.stanford.edu/vijayc/.

About this article

Cite this article

Chandrasekhar, V., Takacs, G., Chen, D.M. et al. Compressed Histogram of Gradients: A Low-Bitrate Descriptor. Int J Comput Vis 96, 384–399 (2012). https://doi.org/10.1007/s11263-011-0453-z

  • DOI: https://doi.org/10.1007/s11263-011-0453-z
