Diagnosing Error in Object Detectors

  • Derek Hoiem
  • Yodsawalai Chodpathumwan
  • Qieyun Dai
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7574)


This paper shows how to analyze the influences of object characteristics on detection performance and the frequency and impact of different types of false positives. In particular, we examine effects of occlusion, size, aspect ratio, visibility of parts, viewpoint, localization error, and confusion with semantically similar objects, other labeled objects, and background. We analyze two classes of detectors: the Vedaldi et al. multiple kernel learning detector and different versions of the Felzenszwalb et al. detector. Our study shows that sensitivity to size, localization error, and confusion with similar objects are the most impactful forms of error. Our analysis also reveals that many different kinds of improvement are necessary to achieve large gains, making more detailed analysis essential for the progress of recognition research. By making our software and annotations available, we make it effortless for future researchers to perform similar analysis.
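The false-positive categories named in the abstract (localization error, confusion with similar objects, confusion with other labeled objects, and background) can be illustrated with a minimal sketch. It is not the authors' released software. The function names, the 0.1 overlap threshold, and the hand-specified `similar` mapping are assumptions made for illustration, and the caller is assumed to have already decided that the detection is a false positive.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def classify_false_positive(
    det_box: Box,
    det_class: str,
    ground_truth: List[Tuple[Box, str]],
    similar: Dict[str, Set[str]],
    min_overlap: float = 0.1,  # assumed threshold, not taken from the text above
) -> str:
    """Assign a detection that is already known to be a false positive
    to one of four error types, checked in priority order."""
    best = {"same": 0.0, "similar": 0.0, "other": 0.0}
    for gt_box, gt_class in ground_truth:
        if gt_class == det_class:
            key = "same"
        elif gt_class in similar.get(det_class, set()):
            key = "similar"
        else:
            key = "other"
        best[key] = max(best[key], iou(det_box, gt_box))

    if best["same"] >= min_overlap:
        return "localization"  # right class, poorly placed or duplicate box
    if best["similar"] >= min_overlap:
        return "similar"       # fired on a semantically similar category
    if best["other"] >= min_overlap:
        return "other"         # fired on some other labeled object
    return "background"        # no meaningful overlap with any labeled object


if __name__ == "__main__":
    gt = [((0, 0, 10, 10), "cat"), ((20, 20, 30, 30), "dog")]
    sim = {"cat": {"dog"}}
    print(classify_false_positive((5, 0, 15, 10), "cat", gt, sim))
    print(classify_false_positive((20, 20, 30, 30), "cat", gt, sim))
```

Counting the resulting labels over the top-ranked false positives of a detector gives a rough picture of which error type dominates for each category.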


Keywords (machine-generated, not supplied by the authors): Localization Error; Object Detector; Average Precision; Object Category; Similar Object




  1. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
  2. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC 2007) (2007) Results,
  3. Vedaldi, A., Gulshan, V., Varma, M., Zisserman, A.: Multiple kernels for object detection. In: ICCV (2009)
  4. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: CVPR (2008)
  5. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. PAMI (2009)
  6. Wu, B., Nevatia, R.: Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In: ICCV (2005)
  7. Wang, X., Han, T.X., Yan, S.: An HOG-LBP human detector with partial occlusion handling. In: ICCV (2009)
  8. Yang, Y., Hallman, S., Ramanan, D., Fowlkes, C.: Layered object detection for multi-class segmentation. In: CVPR (2010)
  9. Vedaldi, A., Zisserman, A.: Structured output regression for detection with partial occlusion. In: NIPS (2009)
  10. Kushal, A., Schmid, C., Ponce, J.: Flexible object models for category-level 3D object recognition. In: CVPR (2007)
  11. Hoiem, D., Rother, C., Winn, J.: 3D LayoutCRF for multi-view object class recognition and segmentation. In: CVPR (2007)
  12. Sun, M., Su, H., Savarese, S., Fei-Fei, L.: A multi-view probabilistic model for 3D object classes. In: CVPR (2009)
  13. Schmid, C., Mohr, R., Bauckhage, C.: Evaluation of interest point detectors. IJCV 37, 151–172 (2000)
  14. Gil, A., Mozos, O.M., Ballesta, M., Reinoso, O.: A comparative evaluation of interest point detectors and local descriptors for visual SLAM. Machine Vision and Applications 21(6), 905–920 (2009)
  15. Divvala, S., Hoiem, D., Hays, J., Efros, A., Hebert, M.: An empirical study of context in object detection. In: CVPR (2009)
  16. Rabinovich, A., Belongie, S.: Scenes vs. objects: a comparative study of two approaches to context based recognition. In: Intl. Wkshp. on Visual Scene Understanding, ViSU (2009)
  17. Pinto, N., Cox, D.D., DiCarlo, J.J.: Why is real-world visual object recognition hard? PLoS Computational Biology 4, e27 (2008)
  18. Torralba, A., Efros, A.: Unbiased look at dataset bias. In: CVPR (2011)
  19. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. IJCV 88, 303–338 (2010)
  20. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: A benchmark. In: CVPR (2009)
  21. Sim, T., Baker, S., Bsat, M.: The CMU Pose, Illumination, and Expression (PIE) database of human faces. Technical Report CMU-RI-TR-01-02, Carnegie Mellon, Robotics Institute (2001)
  22. Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min, J., Worek, W.: Overview of the face recognition grand challenge, pp. 947–954 (2005)
  23. Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Discriminatively trained deformable part models, release 4,
  24. Park, D., Ramanan, D., Fowlkes, C.: Multiresolution Models for Object Detection. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 241–254. Springer, Heidelberg (2010)
  25. Parkhi, O., Vedaldi, A., Jawahar, C.V., Zisserman, A.: The truth about cats and dogs. In: ICCV (2011)
  26. Belhumeur, P.N., Chen, D., Feiner, S.K., Jacobs, D.W., Kress, W.J., Ling, H., Lopez, I., Ramamoorthi, R., Sheorey, S., White, S., Zhang, L.: Searching the World’s Herbaria: A System for Visual Identification of Plant Species. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part IV. LNCS, vol. 5305, pp. 116–129. Springer, Heidelberg (2008)
  27. Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-UCSD Birds 200. Technical Report CNS-TR-2010-001, California Institute of Technology (2010)
  28. Khosla, A., Yao, B., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. In: CVPR (2011)
  29. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
  30. Schneiderman, H., Kanade, T.: A statistical model for 3-d object detection applied to faces and cars. In: CVPR (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Derek Hoiem (1)
  • Yodsawalai Chodpathumwan (1)
  • Qieyun Dai (1)

  1. Department of Computer Science, University of Illinois at Urbana-Champaign, USA
