Hyperfeatures – Multilevel Local Coding for Visual Recognition

Agarwal, Ankur; Triggs, Bill

doi:10.1007/11744023_3

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3951)

Included in the following conference series: European Conference on Computer Vision

Abstract

Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and have good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics at scales larger than their local input patches. We present a new multilevel visual representation, ‘hyperfeatures’, that is designed to remedy this. The starting point is the familiar notion that to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments – a process that can be formalized as comparison (e.g. vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. This process converts local collections of image descriptor vectors into somewhat less local histogram vectors – higher-level but spatially coarser descriptors. We observe that as the output is again a local descriptor vector, the process can be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We formulate the hyperfeatures model and study its performance under several different image coding methods including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
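
The code-then-aggregate loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single-scale grid of descriptors, hard vector quantization with a tiny k-means codebook learned on the input itself (a real system would presumably learn codebooks on training images), and square aggregation windows. The function names `kmeans_codebook`, `quantize`, `local_histograms` and `hyperfeatures`, and all parameter values, are hypothetical.

```python
import numpy as np

def kmeans_codebook(vectors, k, iters=10, seed=0):
    """Tiny Lloyd's k-means; returns a (k, d) array of codebook centres."""
    rng = np.random.default_rng(seed)
    centres = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = quantize(vectors, centres)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # an empty cluster keeps its previous centre
                centres[j] = members.mean(axis=0)
    return centres

def quantize(vectors, centres):
    """Hard vector quantization: index of the nearest centre for each vector."""
    d2 = ((vectors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def local_histograms(labels, k, window=3, stride=2):
    """Normalized histogram of code labels over each window x window patch."""
    h, w = labels.shape
    onehot = np.eye(k)[labels]  # (h, w, k) codebook membership vectors
    rows = range(0, h - window + 1, stride)
    cols = range(0, w - window + 1, stride)
    out = np.empty((len(rows), len(cols), k))
    for a, i in enumerate(rows):
        for b, j in enumerate(cols):
            out[a, b] = onehot[i:i + window, j:j + window].sum(axis=(0, 1))
    return out / (window * window)

def hyperfeatures(descriptors, k=8, levels=2, window=3, stride=2):
    """Iterate code-and-aggregate; return one global histogram per level."""
    feats = descriptors  # (height, width, dim) grid of local descriptors
    global_hists = []
    for _ in range(levels):
        h, w, d = feats.shape
        flat = feats.reshape(-1, d)
        centres = kmeans_codebook(flat, k)
        labels = quantize(flat, centres).reshape(h, w)
        global_hists.append(np.bincount(labels.ravel(), minlength=k) / labels.size)
        # The local histograms are again descriptor vectors, so they
        # become the input of the next, spatially coarser level.
        feats = local_histograms(labels, k, window, stride)
    return global_hists

# Random stand-in for a 17 x 17 grid of 16-dimensional patch descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(17, 17, 16))
hists = hyperfeatures(descriptors)
```

Each entry of `hists` is a global histogram over the codebook of one level; concatenating them would give a single multilevel image signature that a classifier could consume. Replacing `quantize` with soft assignments (for example Gaussian mixture posteriors) would change only the membership vectors, not the overall loop.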

Author information

Authors and Affiliations

GRAVIR-INRIA-CNRS, 655 Avenue de l’Europe, Montbonnot, 38330, France
Ankur Agarwal & Bill Triggs

Editor information

Editors and Affiliations

University of Ljubljana, Ljubljana, Slovenia
Aleš Leonardis
Institute for Computer Graphics and Vision, TU Graz, Inffeldgasse 16, 8010, Graz, Austria
Horst Bischof
Vision-based Measurement Group, Inst. of El. Measurement and Meas. Sign. Proc. Graz, University of Technology, Austria
Axel Pinz

About this paper

Cite this paper

Agarwal, A., Triggs, B. (2006). Hyperfeatures – Multilevel Local Coding for Visual Recognition. In: Leonardis, A., Bischof, H., Pinz, A. (eds) Computer Vision – ECCV 2006. ECCV 2006. Lecture Notes in Computer Science, vol 3951. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11744023_3

DOI: https://doi.org/10.1007/11744023_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33832-1
Online ISBN: 978-3-540-33833-8
eBook Packages: Computer Science, Computer Science (R0)
