Robust 3D Action Recognition with Random Occupancy Patterns

  • Jiang Wang
  • Zicheng Liu
  • Jan Chorowski
  • Zhuoyuan Chen
  • Ying Wu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7573)


We study the problem of action recognition from depth sequences, where noise and occlusion are common because the sequences are captured with a single commodity depth camera. To deal with these issues, we extract semi-local features called random occupancy pattern (ROP) features, which employ a novel sampling scheme that effectively explores an extremely large sampling space. We also use a sparse coding approach to encode these features robustly. The proposed approach does not require careful parameter tuning. Its training is very fast because it uses a high-dimensional integral image, and it is robust to occlusions. We evaluate our technique on two datasets captured by commodity depth cameras: an action dataset and a hand gesture dataset. Our classification results are superior to those of state-of-the-art approaches on both datasets.
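The speed claim rests on the high-dimensional integral image: once cumulative sums are precomputed, the number of occupied cells in any axis-aligned box can be read off in a fixed number of lookups. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a depth sequence has already been quantized into a 4D occupancy grid (x, y, z, t), and it draws boxes uniformly at random as a stand-in for the sampling scheme the paper proposes. All function names (`build_integral`, `box_sum`, `random_box`, `occupancy_features`) are hypothetical.

```python
import itertools
import random


def build_integral(grid, shape):
    """Cumulative sums of a sparse grid over every axis (an N-D integral image).

    `grid` maps index tuples to values; missing cells count as 0.
    """
    indices = list(itertools.product(*(range(n) for n in shape)))
    integral = {idx: grid.get(idx, 0) for idx in indices}
    for axis in range(len(shape)):
        # Lexicographic order guarantees the predecessor along `axis`
        # has already been accumulated when we reach `idx`.
        for idx in indices:
            if idx[axis] > 0:
                prev = idx[:axis] + (idx[axis] - 1,) + idx[axis + 1:]
                integral[idx] += integral[prev]
    return integral


def box_sum(integral, lo, hi):
    """Sum of cells with lo[i] <= idx[i] < hi[i], by inclusion-exclusion."""
    d = len(lo)
    total = 0
    for corner in itertools.product((0, 1), repeat=d):
        idx = tuple(hi[i] - 1 if c else lo[i] - 1 for i, c in enumerate(corner))
        if any(v < 0 for v in idx):
            continue  # corner lies outside the grid, so it contributes 0
        sign = (-1) ** (d - sum(corner))
        total += sign * integral[idx]
    return total


def random_box(shape, rng):
    """Draw a non-empty axis-aligned box uniformly (an assumption of this sketch)."""
    lo, hi = [], []
    for n in shape:
        a = rng.randrange(n)
        b = rng.randrange(a + 1, n + 1)
        lo.append(a)
        hi.append(b)
    return tuple(lo), tuple(hi)


def occupancy_features(grid, shape, num_boxes, seed=0):
    """Occupancy counts for `num_boxes` random boxes, one feature per box."""
    rng = random.Random(seed)
    integral = build_integral(grid, shape)
    boxes = [random_box(shape, rng) for _ in range(num_boxes)]
    return boxes, [box_sum(integral, lo, hi) for lo, hi in boxes]


# Toy 4D occupancy grid (x, y, z, t) with four occupied cells.
shape = (4, 3, 3, 2)
grid = {(0, 0, 0, 0): 1, (1, 2, 1, 1): 1, (3, 2, 2, 1): 1, (2, 1, 0, 0): 1}
boxes, features = occupancy_features(grid, shape, num_boxes=5, seed=1)
```

In four dimensions each box query touches at most 16 corners of the integral image regardless of the box size, which is why evaluating a very large pool of candidate boxes stays cheap. The dictionary-based grid keeps the sketch dependency-free; an array library would be the natural choice for real depth data.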


Keywords: Action Recognition, Depth Sequence, Sparse Code, Hand Gesture Recognition, Depth Camera



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jiang Wang (1)
  • Zicheng Liu (2)
  • Jan Chorowski (3)
  • Zhuoyuan Chen (1)
  • Ying Wu (1)

  1. Northwestern University, USA
  2. Microsoft Research, USA
  3. University of Louisville, USA
