Finding Human Poses in Videos Using Concurrent Matching and Segmentation

Jiang, Hao

doi:10.1007/978-3-642-19315-6_18

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6492)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We propose a novel method to detect human poses in videos by concurrently optimizing body part matching and object segmentation. With a single exemplar image, the proposed method detects the poses of a specific human subject in long video sequences. Matching and segmentation support each other and therefore the simultaneous optimization enables more reliable results. However, efficient concurrent optimization is a great challenge due to its huge search space. We propose an efficient linear method that solves the problem. In this method, the optimal body part matching conforms to local appearances and a human body plan, and the body part configuration is consistent with the object foreground estimated by simultaneous superpixel labeling. Our experiments on a variety of videos show that the proposed method is efficient and more reliable than previous locally constrained approaches.
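
The coupling described above can be illustrated with a toy example. The sketch below is not the paper's linear method, whose formulation the abstract does not give; it swaps in brute-force enumeration over a tiny hand-made problem. Every name, cost, and weight here (`FG_COST`, `CANDIDATES`, `EXPECTED_OFFSET`, `LAM`, `MU`) is a hypothetical assumption chosen only to show how a segmentation term can overrule a locally attractive but wrong part match.

```python
from itertools import product

# Toy 1-D scene with six superpixels. All numbers are invented for illustration.
FG_COST = [0.9, 0.8, 0.1, 0.2, 0.1, 0.9]  # cost of labeling superpixel i as foreground
BG_COST = [0.1, 0.2, 0.9, 0.8, 0.9, 0.1]  # cost of labeling superpixel i as background

# Candidate placements per part: (name, position, covered superpixels, appearance cost).
CANDIDATES = {
    "torso": [("A", 0.5, {0, 1}, 0.2), ("B", 2.5, {2, 3}, 0.3)],
    "arm": [("C", 2.0, {2}, 0.2), ("D", 4.0, {4}, 0.3)],
}
EXPECTED_OFFSET = 1.5  # hypothetical body plan: the arm sits 1.5 units right of the torso
LAM = 1.0  # penalty when a part covers a superpixel labeled background
MU = 0.5   # penalty when a foreground superpixel is covered by no part


def matching_energy(torso, arm):
    """Local appearance costs plus deviation from the body plan."""
    plan = abs((arm[1] - torso[1]) - EXPECTED_OFFSET)
    return torso[3] + arm[3] + plan


def segmentation_energy(labels, covered):
    """Superpixel labeling cost plus the part/foreground consistency penalties."""
    energy = 0.0
    for i, is_fg in enumerate(labels):
        energy += FG_COST[i] if is_fg else BG_COST[i]
        if i in covered and not is_fg:
            energy += LAM
        if i not in covered and is_fg:
            energy += MU
    return energy


def solve(joint):
    """Exhaustively minimize the matching energy, optionally with segmentation."""
    best = None
    for torso, arm in product(CANDIDATES["torso"], CANDIDATES["arm"]):
        e_match = matching_energy(torso, arm)
        if not joint:
            options = [(e_match, None)]
        else:
            covered = torso[2] | arm[2]
            options = [
                (e_match + segmentation_energy(labels, covered), labels)
                for labels in product((0, 1), repeat=len(FG_COST))
            ]
        for energy, labels in options:
            if best is None or energy < best[0]:
                best = (energy, torso[0], arm[0], labels)
    return best


if __name__ == "__main__":
    print("matching only:", solve(joint=False)[1:3])
    print("joint:", solve(joint=True)[1:])
```

In this toy setup, matching alone selects the distractor pair (A, C) because its local appearance costs are lowest, while the joint objective selects (B, D) together with a foreground mask over superpixels 2 to 4. Exhaustive search is feasible only because the problem is tiny, which is the search-space issue the abstract says an efficient method must address.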

Author information

Authors and Affiliations

Boston College, Chestnut Hill, MA, 02467, USA
Hao Jiang

Editor information

Editors and Affiliations

Department of Computer Science, Technion, Israel Institute of Technology, 32000, Haifa, Israel
Ron Kimmel
The University of Auckland, 37 Kohimarama Road, 1071, Mission Bay, Auckland, New Zealand
Reinhard Klette
National Institute of Informatics, 1018430, Chiyoda, Tokyo, Japan
Akihiro Sugimoto

Cite this paper

Jiang, H. (2011). Finding Human Poses in Videos Using Concurrent Matching and Segmentation. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6492. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19315-6_18

DOI: https://doi.org/10.1007/978-3-642-19315-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19314-9
Online ISBN: 978-3-642-19315-6
eBook Packages: Computer Science, Computer Science (R0)
