Object Class Recognition and Localization Using Sparse Features with Limited Receptive Fields

International Journal of Computer Vision

Abstract

We investigate the role of sparsity and localized features in a biologically-inspired model of visual object classification. As in the model of Serre, Wolf, and Poggio, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways. Sparsity is increased by constraining the number of feature inputs, lateral inhibition, and feature selection. We also demonstrate the value of retaining some position and scale information above the intermediate feature level. Our final model is competitive with current computer vision algorithms on several standard datasets, including the Caltech 101 object categories and the UIUC car localization task. The results further the case for biologically-motivated approaches to object classification.
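
The first two stages named in the abstract (Gabor filtering, then max pooling over local neighborhoods) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single scale, and every function name and parameter value below (kernel size, wavelength, pooling window, step) is an assumption chosen for illustration. The later template-matching stages are omitted.

```python
import numpy as np


def gabor_kernel(size=11, wavelength=5.6, sigma=4.5, gamma=0.3, theta=0.0):
    """Zero-mean, unit-norm Gabor kernel (all parameter values are illustrative)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame so the carrier wave varies along xr.
    xr = xs * np.cos(theta) + ys * np.sin(theta)
    yr = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    kernel -= kernel.mean()
    return kernel / np.linalg.norm(kernel)


def filter_valid(image, kernel):
    """Absolute filter response at every position where the kernel fits."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.abs(np.einsum("ijkl,kl->ij", windows, kernel))


def max_pool(response, pool=4, step=2):
    """Local max over pool x pool neighborhoods, subsampled by `step`."""
    windows = np.lib.stride_tricks.sliding_window_view(response, (pool, pool))
    return windows[::step, ::step].max(axis=(2, 3))


def filter_and_pool(image, n_orientations=4):
    """Apply Gabor filters at several orientations, then max-pool each map."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    filtered = np.stack(
        [filter_valid(image, gabor_kernel(theta=t)) for t in thetas]
    )
    pooled = np.stack([max_pool(r) for r in filtered])
    return filtered, pooled


# Toy input: vertical stripes whose period matches the assumed wavelength.
cols = np.arange(32)
image = np.tile(np.cos(2.0 * np.pi * cols / 5.6), (32, 1))
filtered, pooled = filter_and_pool(image)
strongest = int(np.argmax(pooled.max(axis=(1, 2))))  # index of best orientation
```

On this toy input the pooled map for the first orientation dominates, because its carrier wave varies along the same axis as the stripes. Max pooling trades some positional precision for tolerance to small shifts, which is the sense in which invariance is "built up" in the abstract. A fuller model would repeat the filtering across an image pyramid to cover scale as well.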

References

  • Agarwal, S., Awan, A., & Roth, D. (2004). Learning to detect objects in images via a sparse, part-based representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(11), 1475–1490.

  • Berg, A. C., Berg, T. L., & Malik, J. (2005). Shape matching and object recognition using low distortion correspondence. In CVPR, June 2005.

  • Bouchard, G., & Triggs, B. (2005). Hierarchical part-based visual object categorization. In CVPR, June 2005.

  • Csurka, G., Dance, C., Willamowski, J., Fan, L., & Bray, C. (2004). Visual categorization with bags of keypoints. In ECCV international workshop on statistical learning in computer vision, Prague, 2004.

  • DiCarlo, J., & Cox, D. (2007). Untangling invariant object recognition. Trends in Cognitive Sciences, 11, 333–341.

  • Epshtein, B., & Ullman, S. (2005). Feature hierarchies for object classification. In ICCV, Beijing.

  • Fei-Fei, L., Fergus, R., & Perona, P. (2004). Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In CVPR workshop on generative-model based vision.

  • Fergus, R., Perona, P., & Zisserman, A. (2003). Object class recognition by unsupervised scale-invariant learning. In CVPR.

  • Figueiredo, M. (2003). Adaptive sparseness for supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), 1150–1159.

  • Franc, V., & Hlavac, V. (2004). Statistical pattern recognition toolbox for Matlab, version 2.04.

  • Fritz, M., Leibe, B., Caputo, B., & Schiele, B. (2005). Integrating representative and discriminative models for object category detection. In ICCV (pp. 1363–1370), Beijing, China, October 2005.

  • Fukushima, K. (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202.

  • Grauman, K., & Darrell, T. (2006). Pyramid match kernels: discriminative classification with sets of image features (Technical Report MIT-CSAIL-TR-2006-020), March 2006.

  • Holub, A., Welling, M., & Perona, P. (2005). Exploiting unlabeled data for hybrid object classification. In NIPS workshop on inter-class transfer, Whistler, BC, December 2005.

  • Hubel, D., & Wiesel, T. (1959). Receptive fields of single neurones in the cat’s striate cortex. Journal of Physiology, 148, 574–591.

  • Knoblich, U., Bouvrie, J., & Poggio, T. (2007). Biophysical models of neural computation: max and tuning circuits (Technical Report CBCL paper), April 2007.

  • Krishnapuram, B., Carin, L., Figueiredo, M., & Hartemink, A. (2005). Sparse multinomial logistic regression: fast algorithms and generalization bounds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 957–968.

  • Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In CVPR, June 2006.

  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

  • Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In ECCV workshop on statistical learning in computer vision (pp. 17–32), Prague, Czech Republic, May 2004.

  • Logothetis, N., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.

  • Mladenic, D., Brank, J., Grobelnik, M., & Milic-Frayling, N. (2004). Feature selection using linear classifier weights: interaction with classification models. In The 27th annual international ACM SIGIR conference (SIGIR 2004) (pp. 234–241), Sheffield, UK, July 2004.

  • Moosmann, F., Triggs, B., & Jurie, F. (2006). Randomized clustering forests for building fast and discriminative visual vocabularies. In Neural information processing systems (NIPS), November 2006.

  • Mutch, J., & Lowe, D. G. (2006). Multiclass object recognition with sparse, localized features. In CVPR (pp. 11–18), New York, June 2006.

  • Olshausen, B., & Field, D. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.

  • Opelt, A., Pinz, A., Fussenegger, M., & Auer, P. (2006). Generic object recognition with boosting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3).

  • Poggio, T., & Edelman, S. (1990). A network that learns to recognize three-dimensional objects. Nature, 343, 263–266.

  • Potter, M. (1975). Meaning in visual search. Science, 187, 965–966.

  • Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1019–1025.

  • Rolls, E. T., & Deco, G. (2001). The computational neuroscience of vision. Oxford: Oxford University Press.

  • Serre, T., Kouh, M., Cadieu, C., Knoblich, U., Kreiman, G., & Poggio, T. (2005). A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex (Technical Report CBCL Paper #259/AI Memo #2005-036). Massachusetts Institute of Technology, Cambridge, MA, October 2005.

  • Serre, T., Wolf, L., & Poggio, T. (2005). Object recognition with features inspired by visual cortex. In CVPR, San Diego, June 2005.

  • Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

  • Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 682–687.

  • Zhang, H., Berg, A., Maire, M., & Malik, J. (2006). SVM-KNN: discriminative nearest neighbor classification for visual category recognition. In CVPR, June 2006.

Author information

Corresponding author

Correspondence to Jim Mutch.

Additional information

This paper updates and extends an earlier presentation (Mutch and Lowe 2006) of this research in CVPR 2006.

J. Mutch’s research described in this paper was carried out at the University of British Columbia.

About this article

Cite this article

Mutch, J., Lowe, D.G. Object Class Recognition and Localization Using Sparse Features with Limited Receptive Fields. Int J Comput Vis 80, 45–57 (2008). https://doi.org/10.1007/s11263-007-0118-0

  • DOI: https://doi.org/10.1007/s11263-007-0118-0
