In recent years the problem of object recognition has received considerable attention from both the machine learning and computer vision communities. The key challenge of this problem is to be able to recognize any member of a category of objects in spite of wide variations in visual appearance due to variations in the form and colour of the object, occlusions, geometrical transformations (such as scaling and rotation), changes in illumination, and potentially non-rigid deformations of the object itself. In this paper we focus on the detection of objects within images by combining information from a large number of small regions, or ‘patches’, of the image. Since detailed hand-segmentation and labelling of images is very labour-intensive, we make use of ‘weakly labelled’ data in which the training images are labelled only according to the presence or absence of each category of object. A major challenge presented by this problem is that the foreground object is accompanied by widely varying background clutter, and the system must learn to distinguish the foreground from the background without the aid of labelled data. We first show that patches which are highly relevant for the object discrimination problem can be selected automatically from a large dictionary of candidate patches during learning, and that this leads to improved classification compared to direct use of the full dictionary. We then explore alternative techniques which are able to provide labels for the individual patches, as well as for the image as a whole, so that each patch is identified as belonging to one of the object categories or to the background class. This provides a rough indication of the location of the object or objects within the image. Again, these individual patch labels must be learned on the basis only of overall image class labels.
We develop two such approaches, one discriminative and one generative, and compare their performance both in terms of patch labelling and image labelling. Our results show that good classification performance can be obtained on challenging data sets using only weak training labels, and they also highlight some of the relative merits of discriminative and generative approaches.
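As an illustration of how patch-level labels can be tied to an image-level label, the sketch below uses a noisy-OR combination: an image is taken to contain the object if at least one patch does. This is an assumption made for illustration only; the abstract does not state which combination rule the paper uses, and the function names `image_probability` and `weak_label_loss` are hypothetical.

```python
import math


def image_probability(patch_probs):
    """Noisy-OR: probability that at least one patch shows the object.

    patch_probs holds per-patch foreground probabilities in [0, 1].
    (Illustrative assumption, not taken from the paper.)
    """
    prob_no_patch = 1.0
    for p in patch_probs:
        prob_no_patch *= (1.0 - p)
    return 1.0 - prob_no_patch


def weak_label_loss(patch_probs, image_label):
    """Negative log-likelihood of a binary image label (1 = object present)."""
    eps = 1e-12  # keeps log() away from zero
    p_img = min(max(image_probability(patch_probs), eps), 1.0 - eps)
    if image_label == 1:
        return -math.log(p_img)
    return -math.log(1.0 - p_img)
```

Under this assumed rule, only the image label enters the loss, yet minimising it pushes individual patch probabilities up or down, which is the sense in which patch labels are learned from weak labels.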
- Object Recognition
- Training Image
- Interest Point
- Object Category
- Foreground Object
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bishop, C.M., Ulusoy, I. (2005). Object Recognition via Local Patch Labelling. In: Winkler, J., Niranjan, M., Lawrence, N. (eds) Deterministic and Statistical Methods in Machine Learning. DSMML 2004. Lecture Notes in Computer Science, vol 3635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11559887_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29073-5
Online ISBN: 978-3-540-31728-9