Dataset Issues in Object Recognition

Ponce, J.; Berg, T. L.; Everingham, M.; Forsyth, D. A.; Hebert, M.; Lazebnik, S.; Marszalek, M.; Schmid, C.; Russell, B. C.; Torralba, A.; Williams, C. K. I.; Zhang, J.; Zisserman, A.

doi:10.1007/11957959_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

Appropriate datasets are required at all stages of object recognition research, including learning visual models of object and scene categories, detecting and localizing instances of these models in images, and evaluating the performance of recognition algorithms. Current datasets are lacking in several respects, and this paper discusses some of the lessons learned from existing efforts, as well as innovative ways to obtain very large and diverse annotated datasets. It also suggests a few criteria for gathering future datasets.

Author information

Authors and Affiliations

University of Illinois at Urbana-Champaign, USA
J. Ponce, D. A. Forsyth & S. Lazebnik
Ecole Normale Supérieure, Paris, France
J. Ponce
University of California at Berkeley, USA
T. L. Berg
Oxford University, UK
M. Everingham & A. Zisserman
Carnegie Mellon University, Pittsburgh, USA
M. Hebert
INRIA Rhône-Alpes, Grenoble, France
M. Marszalek, C. Schmid & J. Zhang
MIT, Cambridge, USA
B. C. Russell & A. Torralba
University of Edinburgh, Edinburgh, UK
C. K. I. Williams

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Ponce, J. et al. (2006). Dataset Issues in Object Recognition. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_2

DOI: https://doi.org/10.1007/11957959_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)
