Fast PRISM: Branch and Bound Hough Transform for Object Class Detection

Lehmann, Alain; Leibe, Bastian; Van Gool, Luc

doi:10.1007/s11263-010-0342-x

Published: 28 April 2010

International Journal of Computer Vision, Volume 94, pages 175–197 (2011)

Alain Lehmann¹, Bastian Leibe² & Luc Van Gool¹,³

Abstract

This paper addresses the task of efficient object class detection by means of the Hough transform. This approach has been made popular by the Implicit Shape Model (ISM) and has been adopted many times. Although ISM exhibits robust detection performance, its probabilistic formulation is unsatisfactory. The PRincipled Implicit Shape Model (PRISM) overcomes these problems by interpreting Hough voting as a dual implementation of linear sliding-window detection. It thereby gives a sound justification to the voting procedure and imposes minimal constraints. We demonstrate PRISM’s flexibility by two complementary implementations: a generatively trained Gaussian Mixture Model as well as a discriminatively trained histogram approach. Both systems achieve state-of-the-art performance. Detections are found by gradient-based or branch and bound search, respectively. The latter greatly benefits from PRISM’s feature-centric view. It thereby avoids the unfavourable memory trade-off and any on-line pre-processing of the original Efficient Subwindow Search (ESS). Moreover, our approach takes account of the features’ scale value while ESS does not. Finally, we show how to avoid soft-matching and spatial pyramid descriptors during detection without losing their positive effect. This makes algorithms simpler and faster. Both are possible if the object model is properly regularised and we discuss a modification of SVMs which allows for doing so.
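The duality mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the 1-D layout, the function names, and the `weights` table keyed by (codeword, offset) are all assumptions made for illustration. The window-centric loop scores every candidate position by summing the linear weights of the features it contains; the feature-centric loop lets each feature cast those same weights as votes into an accumulator. Both produce the same score for every position.

```python
from collections import defaultdict

# Toy 1-D setting (an assumption for illustration, not the paper's setup).
# A feature is (position, codeword_id). weights[(codeword, offset)] is a
# hypothetical linear object model: the contribution of a feature with that
# codeword found `offset` cells to the right of the window origin.

def sliding_window_scores(features, weights, n_positions, window):
    """Window-centric: for each origin, sum the weights of features inside."""
    scores = [0.0] * n_positions
    for origin in range(n_positions):
        for pos, codeword in features:
            offset = pos - origin
            if 0 <= offset < window:
                scores[origin] += weights.get((codeword, offset), 0.0)
    return scores

def hough_vote_scores(features, weights, n_positions, window):
    """Feature-centric: each feature casts its weights as votes for origins."""
    votes_by_codeword = defaultdict(list)
    for (codeword, offset), w in weights.items():
        if 0 <= offset < window:
            votes_by_codeword[codeword].append((offset, w))
    accumulator = [0.0] * n_positions
    for pos, codeword in features:
        for offset, w in votes_by_codeword[codeword]:
            origin = pos - offset
            if 0 <= origin < n_positions:
                accumulator[origin] += w
    return accumulator

# Hypothetical data: two codewords and a small hand-made weight table.
features = [(2, 0), (3, 1), (5, 0), (6, 1)]
weights = {(0, 0): 1.0, (1, 1): 2.0, (0, 2): -0.5, (1, 3): 0.25}

window_scores = sliding_window_scores(features, weights, n_positions=8, window=4)
vote_scores = hough_vote_scores(features, weights, n_positions=8, window=4)
print(window_scores == vote_scores)
```

The feature-centric loop only touches the accumulator cells a feature can actually influence, which is one reason a voting view can be cheaper when features are sparse. The branch and bound search described in the abstract is not shown here.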


Author information

Authors and Affiliations

Computer Vision Laboratory, ETH Zurich, Zurich, Switzerland
Alain Lehmann & Luc Van Gool
UMIC Research Centre, RWTH Aachen, Aachen, Germany
Bastian Leibe
ESAT-PSI/IBBT, KU Leuven, Leuven, Belgium
Luc Van Gool

Corresponding author

Correspondence to Alain Lehmann.

About this article

Cite this article

Lehmann, A., Leibe, B. & Van Gool, L. Fast PRISM: Branch and Bound Hough Transform for Object Class Detection. Int J Comput Vis 94, 175–197 (2011). https://doi.org/10.1007/s11263-010-0342-x

Received: 21 September 2009
Accepted: 09 April 2010
Published: 28 April 2010
Issue Date: September 2011
DOI: https://doi.org/10.1007/s11263-010-0342-x
