The Semi-explicit Shape Model for Multi-object Detection and Classification

Polak, Simon; Shashua, Amnon

doi:10.1007/978-3-642-15552-9_25

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6312)

Abstract

We propose a model for classification and detection of object classes where the number of classes may be large and where multiple instances of object classes may be present in an image. The algorithm combines a bottom-up, low-level procedure, a bag-of-words naive Bayes phase that winnows out unlikely object classes, with a high-level procedure for detection and classification. The high-level process is a hybrid: a voting method in which votes are filtered using beliefs computed by a class-specific graphical model. In that sense, shape is both explicit (determining the voting pattern) and implicit (each object part votes independently), hence the term "semi-explicit shape model".
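
The winnowing phase mentioned in the abstract can be illustrated with a minimal sketch. It assumes a generic multinomial naive Bayes score over visual-word counts; the abstract does not specify the paper's features, vocabulary, or pruning rule, so the function name `winnow_classes`, the `keep` parameter, and all toy probabilities below are hypothetical and are not the authors' implementation.

```python
import math

def winnow_classes(word_counts, class_word_probs, class_priors, keep=2):
    """Rank classes by naive Bayes log-posterior and keep the `keep` best.

    Hypothetical helper: a generic bag-of-words scorer, not the paper's code.
    """
    scores = {}
    for cls, word_probs in class_word_probs.items():
        score = math.log(class_priors[cls])
        for word, count in word_counts.items():
            # Bag-of-words assumption: words are independent given the class,
            # so the log-likelihood is a count-weighted sum of log P(word | class).
            score += count * math.log(word_probs[word])
        scores[cls] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores

# Toy per-class visual-word distributions (made-up values for illustration).
class_word_probs = {
    "car": {"wheel": 0.6, "window": 0.3, "fur": 0.1},
    "bus": {"wheel": 0.4, "window": 0.5, "fur": 0.1},
    "cat": {"wheel": 0.1, "window": 0.1, "fur": 0.8},
}
class_priors = {"car": 1 / 3, "bus": 1 / 3, "cat": 1 / 3}

# Visual-word histogram of one image (also made up).
word_counts = {"wheel": 3, "window": 2, "fur": 0}

survivors, scores = winnow_classes(word_counts, class_word_probs, class_priors)
print(survivors)
```

In this toy setting only the surviving classes would be passed on to the more expensive high-level detection stage; how many classes to keep is a tuning choice that the abstract does not address.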

This work was partially funded by ISF grant 519/09.

Author information

Authors and Affiliations

School of Computer Science and Engineering, The Hebrew University of Jerusalem,
Simon Polak & Amnon Shashua

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

Cite this paper

Polak, S., Shashua, A. (2010). The Semi-explicit Shape Model for Multi-object Detection and Classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15552-9_25

DOI: https://doi.org/10.1007/978-3-642-15552-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15551-2
Online ISBN: 978-3-642-15552-9
eBook Packages: Computer Science, Computer Science (R0)
