Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition
In this paper we investigate a new method of learning part-based models for visual object recognition from training data that provides information only about class membership (and not object location or configuration). This method learns both a model of local part appearance and a model of the spatial relations between those parts. In contrast, other work using such a weakly supervised learning paradigm has not considered the problem of simultaneously learning appearance and spatial models. Some of these methods use a “bag” model where only part appearance is considered, whereas other methods learn spatial models but only given the output of a particular feature detector. Previous techniques for learning both part appearance and spatial relations have instead used a highly supervised learning process that provides substantial information about object part location. We show that our weakly supervised technique produces better results than these previous highly supervised methods. Moreover, we investigate the degree to which both richer spatial models and richer appearance models are helpful in improving recognition performance. Our results show that while both spatial and appearance information can be useful, the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset.
Keywords: Spatial Model, Maximal Clique, Appearance Model, Reference Node, Visual Object Recognition