Localizing Objects with Smart Dictionaries

Fulkerson, Brian; Vedaldi, Andrea; Soatto, Stefano

doi:10.1007/978-3-540-88682-2_15

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5302)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We present an approach to determine the category and location of objects in images. It performs very fast categorization of each pixel in an image, a brute-force approach made feasible by three key developments: First, our method reduces the size of a large generic dictionary (on the order of ten thousand words) to the low hundreds while increasing classification performance compared to k-means. This is achieved by creating a discriminative dictionary tailored to the task by following the information bottleneck principle. Second, we perform feature-based categorization efficiently on a dense grid by extending the concept of integral images to the computation of local histograms. Third, we compute SIFT descriptors densely in linear time. We compare our method to the state of the art and find that it excels in accuracy and simplicity, performing better while assuming less.
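The second point, extending integral images to local histograms, can be illustrated with a minimal sketch. It assumes each pixel has already been assigned a visual-word index, and it is not the authors' implementation: the function names `integral_histogram` and `window_histogram` and the toy label grid are hypothetical, chosen only for illustration. The idea is to keep one running count per word at every grid corner, so the histogram of any axis-aligned window follows from four table lookups per word.

```python
def integral_histogram(labels, num_words):
    """Build a table where table[y][x][k] counts word k in rows < y and columns < x."""
    height = len(labels)
    width = len(labels[0]) if height else 0
    # One extra row and column of zeros so that window queries need no edge cases.
    table = [[[0] * num_words for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = table[y + 1][x + 1]
            up, left, diag = table[y][x + 1], table[y + 1][x], table[y][x]
            for k in range(num_words):
                # Standard integral-image recurrence, applied to each word channel.
                cell[k] = up[k] + left[k] - diag[k]
            cell[labels[y][x]] += 1
    return table


def window_histogram(table, top, left, bottom, right):
    """Word histogram over rows top..bottom-1 and columns left..right-1."""
    a = table[bottom][right]
    b = table[top][right]
    c = table[bottom][left]
    d = table[top][left]
    # Four lookups per word, independent of the window size.
    return [a[k] - b[k] - c[k] + d[k] for k in range(len(a))]


# Toy 3x4 grid of visual-word indices (illustrative data only).
labels = [
    [0, 1, 1, 2],
    [0, 0, 1, 2],
    [2, 1, 0, 0],
]
table = integral_histogram(labels, num_words=3)
full = window_histogram(table, 0, 0, 3, 4)   # whole grid
patch = window_histogram(table, 1, 1, 3, 3)  # 2x2 window in the lower middle
print(full, patch)
```

Building the table costs time proportional to the number of pixels times the dictionary size, and each window query costs time proportional to the dictionary size alone. This is one reason a dictionary in the low hundreds, rather than ten thousand words, makes dense per-pixel classification more practical.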

Author information

Authors and Affiliations

Department of Computer Science, University of California, Los Angeles, CA 90095, USA
Brian Fulkerson, Andrea Vedaldi & Stefano Soatto

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, Urbana, IL 61801, USA
David Forsyth
Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this paper

Cite this paper

Fulkerson, B., Vedaldi, A., Soatto, S. (2008). Localizing Objects with Smart Dictionaries. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88682-2_15

DOI: https://doi.org/10.1007/978-3-540-88682-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88681-5
Online ISBN: 978-3-540-88682-2
eBook Packages: Computer Science, Computer Science (R0)
