Visual Recognition with Humans in the Loop

  • Steve Branson
  • Catherine Wah
  • Florian Schroff
  • Boris Babenko
  • Peter Welinder
  • Pietro Perona
  • Serge Belongie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6314)


We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
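The interaction loop described above can be illustrated with a small sketch. The abstract does not state how questions are chosen or how noisy answers are modeled, so the code below makes its own assumptions: the vision algorithm supplies a class posterior used as a prior, each yes/no question has a hypothetical per-class probability of a "yes" answer (which absorbs user error), and the next question is the one with the lowest expected posterior entropy. All names and numbers here (`vision_prior`, `questions`, `red_head`, `long_tail`) are invented for illustration and are not taken from the paper.

```python
import math

def normalize(p):
    """Scale a list of non-negative weights so it sums to 1."""
    total = sum(p)
    return [x / total for x in p]

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def update(prior, likelihood_yes, answer):
    """Bayes update of the class posterior after a yes/no answer.

    likelihood_yes[c] is the assumed P(user answers yes | class c);
    values away from 0 and 1 model imperfect user responses.
    """
    like = [l if answer else 1.0 - l for l in likelihood_yes]
    return normalize([p * l for p, l in zip(prior, like)])

def expected_entropy(prior, likelihood_yes):
    """Posterior entropy averaged over the two possible answers."""
    p_yes = sum(p * l for p, l in zip(prior, likelihood_yes))
    h = 0.0
    if p_yes > 0:
        h += p_yes * entropy(update(prior, likelihood_yes, True))
    if p_yes < 1:
        h += (1 - p_yes) * entropy(update(prior, likelihood_yes, False))
    return h

def pick_question(prior, questions, asked):
    """Choose the unasked question with the lowest expected entropy."""
    candidates = [q for q in questions if q not in asked]
    return min(candidates, key=lambda q: expected_entropy(prior, questions[q]))

# Hypothetical class posterior from a vision classifier (4 classes).
vision_prior = [0.4, 0.4, 0.1, 0.1]

# Hypothetical questions: P(yes | class) for each of the 4 classes.
questions = {
    "red_head": [0.9, 0.1, 0.9, 0.1],   # separates the two likeliest classes
    "long_tail": [0.9, 0.9, 0.1, 0.1],  # does not separate them
}

first = pick_question(vision_prior, questions, asked=set())
posterior = update(vision_prior, questions[first], answer=True)
print(first, [round(p, 3) for p in posterior])
```

In this toy setup the vision prior already concentrates mass on two classes, so the question that separates those two is preferred over one that only confirms what the classifier already suggests. That is the sense in which computer vision can reduce the number of questions a user has to answer.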


Keywords: Computer Vision, Bird Species, Object Recognition, Visual Recognition, User Response
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Supplementary material

978-3-642-15561-1_32_MOESM1_ESM.pdf: Electronic Supplementary Material (3.1 MB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Steve Branson (1)
  • Catherine Wah (1)
  • Florian Schroff (1)
  • Boris Babenko (1)
  • Peter Welinder (2)
  • Pietro Perona (2)
  • Serge Belongie (1)

  1. University of California, San Diego
  2. California Institute of Technology
