International Journal of Computer Vision, Volume 80, Issue 1, pp 45–57

Object Class Recognition and Localization Using Sparse Features with Limited Receptive Fields

Article

Abstract

We investigate the role of sparsity and localized features in a biologically-inspired model of visual object classification. As in the model of Serre, Wolf, and Poggio, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways. Sparsity is increased by constraining the number of feature inputs, lateral inhibition, and feature selection. We also demonstrate the value of retaining some position and scale information above the intermediate feature level. Our final model is competitive with current computer vision algorithms on several standard datasets, including the Caltech 101 object categories and the UIUC car localization task. The results further the case for biologically-motivated approaches to object classification.
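The first stages described above (Gabor filtering, max pooling over local neighbourhoods, then template matching against a stored prototype) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the filter size, the Gabor parameters, the pooling window, and the Gaussian tuning width are all assumptions chosen for illustration, and the sketch works at a single scale rather than over the full range of scales used in the model.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(size=11, wavelength=5.6, sigma=4.5, gamma=0.3, theta=0.0):
    """Zero-mean, unit-norm Gabor filter. All parameter values are illustrative."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = xs * np.cos(theta) + ys * np.sin(theta)
    y_r = -xs * np.sin(theta) + ys * np.cos(theta)
    g = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
    g *= np.cos(2 * np.pi * x_r / wavelength)
    g -= g.mean()                      # remove the DC component
    return g / np.linalg.norm(g)       # scale to unit norm


def filter_responses(image, kernels):
    """Absolute filter response at every valid position, one map per kernel."""
    k = kernels.shape[-1]
    windows = sliding_window_view(image, (k, k))       # (H', W', k, k)
    return np.abs(np.einsum("ijkl,okl->oij", windows, kernels))


def max_pool(maps, pool=4):
    """Non-overlapping local max pooling over pool x pool neighbourhoods."""
    n, h, w = maps.shape
    h2, w2 = h // pool, w // pool
    cropped = maps[:, :h2 * pool, :w2 * pool]
    return cropped.reshape(n, h2, pool, w2, pool).max(axis=(2, 4))


def template_response(pooled, prototype, sigma=1.0):
    """Gaussian-tuned match between a stored prototype patch and every position."""
    n, p, _ = prototype.shape
    windows = sliding_window_view(pooled, (n, p, p))[0]  # (H'', W'', n, p, p)
    dist2 = ((windows - prototype) ** 2).sum(axis=(2, 3, 4))
    return np.exp(-dist2 / (2 * sigma ** 2))


# Usage: four orientations on a random 32 x 32 test image (hypothetical input).
thetas = np.arange(4) * np.pi / 4
kernels = np.stack([gabor_kernel(theta=t) for t in thetas])
image = np.random.default_rng(0).random((32, 32))

simple = filter_responses(image, kernels)    # shape (4, 22, 22)
pooled = max_pool(simple)                    # shape (4, 5, 5)
prototype = pooled[:, 1:4, 1:4].copy()       # patch sampled from the pooled maps
match = template_response(pooled, prototype)
feature = match.max()                        # global max gives position invariance
```

Taking the maximum of the match map discards position entirely; retaining some position and scale information, as the abstract advocates, would correspond to restricting that maximum to a limited neighbourhood around where the prototype was sampled.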

Keywords

Object class recognition, Ventral visual pathway, Sparsity, Localized features

References

  1. Agarwal, S., Awan, A., & Roth, D. (2004). Learning to detect objects in images via a sparse, part-based representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(11), 1475–1490.
  2. Berg, A. C., Berg, T. L., & Malik, J. (2005). Shape matching and object recognition using low distortion correspondence. In CVPR, June 2005.
  3. Bouchard, G., & Triggs, B. (2005). Hierarchical part-based visual object categorization. In CVPR, June 2005.
  4. Csurka, G., Dance, C., Willamowski, J., Fan, L., & Bray, C. (2005). Visual categorization with bags of keypoints. In ECCV international workshop on statistical learning in computer vision, Prague, 2004.
  5. DiCarlo, J., & Cox, D. (2007). Untangling invariant object recognition. Trends in Cognitive Sciences, 11, 333–341.
  6. Epshtein, B., & Ullman, S. (2005). Feature hierarchies for object classification. In ICCV, Beijing.
  7. Fei-Fei, L., Fergus, R., & Perona, P. (2004). Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In CVPR workshop on generative-model based vision.
  8. Fergus, R., Perona, P., & Zisserman, A. (2003). Object class recognition by unsupervised scale-invariant learning. In CVPR.
  9. Figueiredo, M. (2003). Adaptive sparseness for supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), 1150–1159.
  10. Franc, V., & Hlavac, V. (2004). Statistical pattern recognition toolbox for Matlab, version 2.04.
  11. Fritz, M., Leibe, B., Caputo, B., & Schiele, B. (2005). Integrating representative and discriminative models for object category detection. In ICCV (pp. 1363–1370), Beijing, China, October 2005.
  12. Fukushima, K. (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193–202.
  13. Grauman, K., & Darrell, T. (2006). Pyramid match kernels: discriminative classification with sets of image features (Technical Report MIT-CSAIL-TR-2006-020), March 2006.
  14. Holub, A., Welling, M., & Perona, P. (2005). Exploiting unlabeled data for hybrid object classification. In NIPS workshop on inter-class transfer, Whistler, BC, December 2005.
  15. Hubel, D., & Wiesel, T. (1959). Receptive fields of single neurones in the cat’s striate cortex. Journal of Physiology, 148, 574–591.
  16. Knoblich, U., Bouvrie, J., & Poggio, T. (2007). Biophysical models of neural computation: max and tuning circuits (Technical Report CBCL paper), April 2007.
  17. Krishnapuram, B., Carin, L., Figueiredo, M., & Hartemink, A. (2005). Sparse multinomial logistic regression: fast algorithms and generalization bounds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 957–968.
  18. Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In CVPR, June 2006.
  19. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
  20. Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In ECCV workshop on statistical learning in computer vision (pp. 17–32), Prague, Czech Republic, May 2004.
  21. Logothetis, N., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.
  22. Mladenic, D., Brank, J., Grobelnik, M., & Milic-Frayling, N. (2004). Feature selection using linear classifier weights: interaction with classification models. In The 27th annual international ACM SIGIR conference (SIGIR 2004) (pp. 234–241), Sheffield, UK, July 2004.
  23. Moosmann, F., Triggs, B., & Jurie, F. (2006). Randomized clustering forests for building fast and discriminative visual vocabularies. In Neural information processing systems (NIPS), November 2006.
  24. Mutch, J., & Lowe, D. G. (2006). Multiclass object recognition with sparse, localized features. In CVPR (pp. 11–18), New York, June 2006.
  25. Olshausen, B., & Field, D. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.
  26. Opelt, A., Pinz, A., Fussenegger, M., & Auer, P. (2006). Generic object recognition with boosting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3).
  27. Poggio, T., & Edelman, S. (1990). A network that learns to recognize three-dimensional objects. Nature, 343, 263–266.
  28. Potter, M. (1975). Meaning in visual search. Science, 187, 965–966.
  29. Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1019–1025.
  30. Rolls, E. T., & Deco, G. (2001). The computational neuroscience of vision. Oxford: Oxford University Press.
  31. Serre, T., Kouh, M., Cadieu, C., Knoblich, U., Kreiman, G., & Poggio, T. (2005). A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex (Technical Report CBCL Paper #259/AI Memo #2005-036). Massachusetts Institute of Technology, Cambridge, MA, October 2005.
  32. Serre, T., Wolf, L., & Poggio, T. (2005). Object recognition with features inspired by visual cortex. In CVPR, San Diego, June 2005.
  33. Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.
  34. Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 682–687.
  35. Zhang, H., Berg, A., Maire, M., & Malik, J. (2006). SVM-KNN: discriminative nearest neighbor classification for visual category recognition. In CVPR, June 2006.

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, USA
  2. Department of Computer Science, University of British Columbia, Vancouver, Canada
