Semi-supervised Learning for Human Pose Recognition with RGB-D Light-Model

  • Conference paper
Advances in Multimedia Information Processing - PCM 2016 (PCM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9917)

Abstract

This work targets human pose recognition from RGB-D videos. Recent RGB-D based methods typically fall into two groups: map-based approaches and skeleton-based approaches. This paper proposes a semi-supervised learning method for evaluating human posture using RGB-D data and a light-model. The light-model is generated with the dynamic-fusion strategy to represent a depth sequence, so it carries richer information than a single depth image. A CNN classifier is then trained on labeled light-model data to recognize human pose, and soft correlation and hard correlation are used to adjust the CNN output on unlabeled data. This paper also constructs a posture dataset consisting of RGB images and light-models. The experimental results show that our method is more accurate than the state of the art, and its efficiency is also competitive. This study suggests that features extracted from 3D models are reliable for human pose recognition, especially for sitting posture.
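The abstract does not define soft correlation or hard correlation, so the following is only a generic illustration of how a classifier's output on unlabeled samples can be adjusted in a semi-supervised setting. It swaps in two standard pseudo-labeling variants (a thresholded hard label and a sharpened soft label) and should not be read as the paper's method. The function names, the confidence threshold, and the temperature are all assumptions made for this sketch.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_pseudo_label(probs, threshold=0.9):
    """Hypothetical 'hard' adjustment: keep the argmax class only when
    the classifier is confident enough, otherwise return None."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def soft_pseudo_label(probs, temperature=0.5):
    """Hypothetical 'soft' adjustment: sharpen the distribution
    (temperature < 1) and keep it as a soft training target."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


# Example: scores for three posture classes on one unlabeled sample.
probs = softmax([2.0, 1.0, 0.1])
hard = hard_pseudo_label(probs)   # None unless the top class is >= 0.9
soft = soft_pseudo_label(probs)   # sharpened distribution, sums to 1
```

In a training loop of this kind, samples that receive a hard label would typically be added to the labeled pool, while soft targets would be used in a weighted loss term.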

Author information

Corresponding author

Correspondence to Dahai Yu.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Wang, X., Zhang, G., Yu, D., Liu, D. (2016). Semi-supervised Learning for Human Pose Recognition with RGB-D Light-Model. In: Chen, E., Gong, Y., Tie, Y. (eds) Advances in Multimedia Information Processing - PCM 2016. PCM 2016. Lecture Notes in Computer Science, vol 9917. Springer, Cham. https://doi.org/10.1007/978-3-319-48896-7_72

  • DOI: https://doi.org/10.1007/978-3-319-48896-7_72

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48895-0

  • Online ISBN: 978-3-319-48896-7

  • eBook Packages: Computer Science, Computer Science (R0)
