Detecting People Using Mutually Consistent Poselet Activations

Bourdev, Lubomir; Maji, Subhransu; Brox, Thomas; Malik, Jitendra

doi:10.1007/978-3-642-15567-3_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6316)

Included in the following conference series: European Conference on Computer Vision

Abstract

Bourdev and Malik (ICCV 09) introduced a new notion of parts, poselets, constructed to be tightly clustered both in the configuration space of keypoints and in the appearance space of image patches. In this paper we develop a new algorithm for detecting people using poselets. Unlike that work, which used 3D annotations of keypoints, we use only 2D annotations, which are much easier for naive human annotators to provide. The main algorithmic contribution is in how we use the pattern of poselet activations. Individual poselet activations are noisy, but considering the spatial context of each can provide vital disambiguating information, just as object detection can be improved by considering the detection scores of nearby objects in the scene. This can be done by training a two-layer feed-forward network with weights set using a max margin technique. The refined poselet activations are then clustered into mutually consistent hypotheses, where consistency is based on empirically determined spatial keypoint distributions. Finally, bounding boxes are predicted for each person hypothesis and shape masks are aligned to edges in the image to provide a segmentation. To the best of our knowledge, the resulting system is the current best performer on the task of people detection and segmentation, with an average precision of 47.8% and 40.5% respectively on PASCAL VOC 2009.

This work was supported by Adobe Systems, Inc., a Google Fellowship, the German Academic Exchange Service (DAAD), and ONR MURI N00014-06-1-0734.
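
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the details, so everything specific here is an assumption made for illustration: each activation is taken to predict an isotropic 2D Gaussian per keypoint, consistency is measured with a symmetrized KL divergence averaged over shared keypoints, activations are processed greedily in order of decreasing score, and the names `Activation`, `Hypothesis`, `greedy_cluster` and the `threshold` value are hypothetical. It is not the authors' implementation.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: an isotropic 2D Gaussian (mean_x, mean_y, variance) stands in
# for the empirically determined keypoint distribution of one activation.
Gaussian = Tuple[float, float, float]


@dataclass
class Activation:
    score: float                    # refined activation score
    keypoints: Dict[str, Gaussian]  # predicted distribution per keypoint name


@dataclass
class Hypothesis:
    members: List[Activation] = field(default_factory=list)


def symmetric_kl(p: Gaussian, q: Gaussian) -> float:
    """KL(p||q) + KL(q||p) for two isotropic 2D Gaussians (log terms cancel)."""
    (px, py, pv), (qx, qy, qv) = p, q
    d2 = (px - qx) ** 2 + (py - qy) ** 2
    return pv / qv + qv / pv - 2.0 + 0.5 * d2 * (1.0 / pv + 1.0 / qv)


def activation_distance(a: Activation, b: Activation) -> float:
    """Mean symmetrized KL over shared keypoints; infinite if none are shared."""
    shared = a.keypoints.keys() & b.keypoints.keys()
    if not shared:
        return math.inf
    total = sum(symmetric_kl(a.keypoints[k], b.keypoints[k]) for k in shared)
    return total / len(shared)


def hypothesis_distance(a: Activation, h: Hypothesis) -> float:
    """Average distance from an activation to all members (an assumed linkage)."""
    return sum(activation_distance(a, m) for m in h.members) / len(h.members)


def greedy_cluster(activations: List[Activation], threshold: float) -> List[Hypothesis]:
    """Assign each activation, strongest first, to the nearest consistent hypothesis."""
    hypotheses: List[Hypothesis] = []
    for act in sorted(activations, key=lambda a: a.score, reverse=True):
        best = min(hypotheses, key=lambda h: hypothesis_distance(act, h), default=None)
        if best is not None and hypothesis_distance(act, best) <= threshold:
            best.members.append(act)
        else:
            hypotheses.append(Hypothesis(members=[act]))
    return hypotheses


if __name__ == "__main__":
    # Two activations agree on where the left shoulder is; a third is far away.
    acts = [
        Activation(0.9, {"l_shoulder": (100.0, 100.0, 25.0)}),
        Activation(0.7, {"l_shoulder": (103.0, 104.0, 25.0)}),
        Activation(0.8, {"l_shoulder": (400.0, 100.0, 25.0)}),
    ]
    print([len(h.members) for h in greedy_cluster(acts, threshold=5.0)])
```

In this toy input the two nearby activations end up in one hypothesis and the distant one starts its own. A real system would also need to predict a bounding box per hypothesis, which the sketch leaves out.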

References

Felzenszwalb, P.F., Huttenlocher, D.P.: Pictorial structures for object recognition. IJCV 61, 55–79 (2005)
Ren, X., Berg, A.C., Malik, J.: Recovering human body configurations using pairwise constraints between parts. In: ICCV, pp. 824–831 (2005)
Ramanan, D.: Learning to parse images of articulated bodies. In: NIPS (2006)
Ferrari, V., Marin-Jimenez, M., Zisserman, A.: Progressive search space reduction for human pose estimation. In: CVPR (2008)
Andriluka, M., Roth, S., Schiele, B.: Pictorial structures revisited: People detection and articulated pose estimation. In: CVPR (2009)
Bergtholdt, M., Kappes, J., Schmidt, S., Schnörr, C.: A study of parts-based object class detection using complete graphs. IJCV 87, 93–117 (2010)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893 (2005)
Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., Poggio, T.: Pedestrian detection using wavelet templates. In: CVPR, pp. 193–199 (1997)
Mori, G., Malik, J.: Estimating human body configurations using shape context matching. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2352, pp. 666–680. Springer, Heidelberg (2002)
Leibe, B., Leonardis, A., Schiele, B.: Combined object categorization and segmentation with an implicit shape model. In: ECCV workshop on statistical learning in computer vision, pp. 17–32 (2004)
Gavrila, D.M.: A Bayesian, exemplar-based approach to hierarchical shape matching. PAMI 29, 1408–1421 (2007)
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. PAMI (2009) (published online)
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models (2010), Project website: http://people.cs.uchicago.edu/~pff/latent
Bourdev, L., Malik, J.: Poselets: Body part detectors trained using 3D human pose annotations. In: ICCV (2009)
Taylor, C.: Reconstruction of articulated objects from point correspondences in a single uncalibrated image. CVIU 80, 349–363 (2000)
Crandall, D., Felzenszwalb, P., Huttenlocher, D.: Spatial priors for part-based recognition using statistical models. In: CVPR, pp. 10–17 (2005)
Maji, S., Malik, J.: Object detection using a max-margin Hough transform. In: CVPR (2009)
Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: From contours to regions: an empirical evaluation. In: ICCV (2009)
Brox, T., Bruhn, A., Papenberg, N., Weickert, J.: High accuracy optical flow estimation based on a theory for warping. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3024, pp. 25–36. Springer, Heidelberg (2004)
Yang, Y., Hallman, S., Ramanan, D., Fowlkes, C.: Layered object detection for multi-class segmentation. In: CVPR (2010)

Author information

Authors and Affiliations

University of California at Berkeley
Lubomir Bourdev, Subhransu Maji, Thomas Brox & Jitendra Malik
Adobe Systems, Inc., San Jose, CA
Lubomir Bourdev

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

About this paper

Cite this paper

Bourdev, L., Maji, S., Brox, T., Malik, J. (2010). Detecting People Using Mutually Consistent Poselet Activations. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15567-3_13

DOI: https://doi.org/10.1007/978-3-642-15567-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15566-6
Online ISBN: 978-3-642-15567-3
eBook Packages: Computer Science, Computer Science (R0)
