Abstract
Robots need to ground their external vocabulary and internal symbols in observations of the world. Recent work has approached this problem by combining open-ended category learning with interaction with other agents acting as teachers. This paper explores a complementary path, in which robots also resort to semantic searches in digital collections of text and images, or more generally on the Internet, to ground vocabulary about objects. Drawing on a distinction between broad and narrow (or general and specific) categories, different methods are applied to each: global shape contexts represent broad categories, and SIFT local features represent narrow categories. An unsupervised image clustering and ranking method is proposed that, starting from a set of images automatically fetched from the web for a given category name, selects a subset of images suitable for building a model of the category. In the case of broad categories, image segmentation and object extraction improve the chances of finding suitable training objects. We demonstrate that the proposed approach improves the quality of the training object collections.
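The subset-selection idea summarized above can be illustrated with a minimal sketch. The abstract does not specify the clustering or ranking procedure, so everything here is an assumption made for illustration: each fetched image is taken to be already reduced to a fixed-length feature vector, clustering is done with plain k-means, the largest cluster is treated as the one most likely to depict the category, and its members are ranked by distance to the cluster centroid. The function names (`kmeans`, `select_training_subset`) and the parameters `k` and `keep` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def _dist(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _assign(features: List[Vector], centroids: List[List[float]]) -> List[int]:
    """Label each vector with the index of its nearest centroid."""
    k = len(centroids)
    return [min(range(k), key=lambda j: _dist(f, centroids[j])) for f in features]


def kmeans(features: List[Vector], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        # Next seed: the vector farthest from all seeds chosen so far.
        nxt = max(features, key=lambda f: min(_dist(f, c) for c in centroids))
        centroids.append(list(nxt))
    for _ in range(iterations):
        labels = _assign(features, centroids)
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = _mean(members)
    return _assign(features, centroids), centroids


def select_training_subset(features: List[Vector], k: int = 2,
                           keep: Optional[int] = None) -> List[int]:
    """Return indices of images in the largest cluster, nearest to centroid first."""
    if not features:
        return []
    labels, centroids = kmeans(features, k)
    best = max(range(k), key=labels.count)  # illustrative heuristic: largest cluster
    ranked = sorted(
        (i for i, lab in enumerate(labels) if lab == best),
        key=lambda i: _dist(features[i], centroids[best]),
    )
    return ranked if keep is None else ranked[:keep]
```

In this sketch, calling `select_training_subset(features, k=2, keep=10)` would return the indices of up to ten images from the dominant cluster, ordered from most to least central. A real pipeline would substitute descriptors such as the shape contexts or SIFT features mentioned above for the toy vectors.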
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pereira, R., Seabra Lopes, L., Silva, A. (2009). Semantic Image Search and Subset Selection for Classifier Training in Object Recognition. In: Lopes, L.S., Lau, N., Mariano, P., Rocha, L.M. (eds) Progress in Artificial Intelligence. EPIA 2009. Lecture Notes in Computer Science, vol 5816. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04686-5_28
DOI: https://doi.org/10.1007/978-3-642-04686-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04685-8
Online ISBN: 978-3-642-04686-5
eBook Packages: Computer Science, Computer Science (R0)