ACIIDS 2016: Intelligent Information and Database Systems, pp. 437–446
Recent Developments on 2D Pose Estimation From Monocular Images
Abstract
Human pose estimation from monocular images is one of the most significant tasks in modern computer vision, and demand for it continues to grow in areas such as automatic image indexing and human activity recognition from video. Among the many approaches applied in these areas, those based on pose estimation give one of the most powerful representations of a person in an image in terms of sparsity and semantics. In this paper we provide a detailed survey of the most efficient methods in the 2D pose estimation domain, together with test results for selected methods on the LSP dataset, which is commonly used in state-of-the-art work.
Keywords
Human pose estimation, PSM, Poselet
Notes
Acknowledgements
This work has been supported by the National Centre for Research and Development (project UOD-DEM-1-183/001 “Intelligent video analysis system for behavior and event recognition in surveillance networks”).