Object Detection and Localization Using Local and Global Features

  • Kevin Murphy
  • Antonio Torralba
  • Daniel Eaton
  • William Freeman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)


Traditional approaches to object detection look only at local pieces of the image, whether inside a sliding window or in the regions around detected interest points. However, such local pieces can be ambiguous, especially when the object of interest is small or imaging conditions are otherwise unfavorable. This ambiguity can be reduced by using global features of the image, which we call the “gist” of the scene, as an additional source of evidence. We show that combining local and global features yields significantly improved detection rates. In addition, since the gist is much cheaper to compute than most local detectors, we can potentially gain a large increase in speed as well.
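
The idea of treating a global image descriptor as a second source of evidence can be illustrated with a small sketch. The abstract does not say how the gist is computed or how the two sources are fused, so everything below is an assumption made for illustration: `gist_descriptor` is a hypothetical, very crude stand-in (mean gradient magnitude on a coarse grid), and `combine` fuses two probability estimates under the assumption that they are conditionally independent and the class prior is uniform.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def gist_descriptor(image: Image, grid: int = 2) -> List[float]:
    """Hypothetical global descriptor: mean gradient magnitude per grid cell.

    This is NOT the paper's gist; it only illustrates a cheap, whole-image feature.
    """
    height, width = len(image), len(image[0])
    sums = [0.0] * (grid * grid)
    counts = [0] * (grid * grid)
    for y in range(height - 1):
        for x in range(width - 1):
            # Forward differences approximate the image gradient.
            dx = image[y][x + 1] - image[y][x]
            dy = image[y + 1][x] - image[y][x]
            # Map the pixel to its cell in the grid x grid layout.
            cell = (y * grid // height) * grid + (x * grid // width)
            sums[cell] += (dx * dx + dy * dy) ** 0.5
            counts[cell] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def combine(local_prob: float, global_prob: float) -> float:
    """Fuse a local detector probability with a global (scene-based) probability.

    Assumes the two estimates are independent given the label and that
    "object present" and "object absent" are equally likely a priori.
    """
    present = local_prob * global_prob
    absent = (1.0 - local_prob) * (1.0 - global_prob)
    total = present + absent
    # If the two sources flatly contradict each other, fall back to "unknown".
    return present / total if total > 0.0 else 0.5
```

In this sketch, an ambiguous local score of 0.5 is pulled toward whatever the global evidence suggests, while a scene-level probability of 0.5 leaves the local score unchanged. The `global_prob` value would come from some classifier trained on the global descriptor, which is not shown here.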


Object Detection; Global Feature; Object Class; Local Detector; Interest Point Detector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kevin Murphy (1)
  • Antonio Torralba (2)
  • Daniel Eaton (1)
  • William Freeman (2)
  1. Department of Computer Science, University of British Columbia
  2. Computer Science and AI Lab, MIT
