3D Action Classification Using Sparse Spatio-temporal Feature Representations

  • Sherif Azary
  • Andreas Savakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7432)


Automatic action classification is a challenging task for a wide variety of reasons, including unconstrained human motion, background clutter, and view dependencies. The introduction of affordable depth sensors creates opportunities to investigate new approaches to action classification that take advantage of depth information. In this paper, we perform action classification using sparse representations on 3D video sequences of spatio-temporal kinematic joint descriptors and compare the classification accuracy against that of spatio-temporal raw depth data descriptors. These descriptors are used to create over-complete dictionaries, which are then used to classify test actions by L1-norm minimization with a least-squares loss and a regularization parameter. We find that the representations of raw depth features are naturally sparser than those of kinematic joint features and that our approach is highly effective and efficient at classifying a wide variety of actions from the Microsoft Research 3D Dataset (MSR3D).
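To make the classification step concrete, the following is a minimal sketch of sparse-representation classification with an L1-regularized least-squares objective, 0.5·||y − Dx||² + λ·||x||₁. It is not the authors' implementation. The solver (ISTA, iterative soft-thresholding), the class-wise residual decision rule, the function names, and the value of `lam` are all assumptions chosen for illustration; the abstract only states that an over-complete dictionary and L1-norm minimization with a regularization parameter are used.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(D, y, lam=0.01, n_iter=500):
    """Approximately minimize 0.5 * ||y - D x||_2^2 + lam * ||x||_1 with ISTA.

    D is a (features x atoms) dictionary, y is a feature vector.
    The solver choice is an assumption, not taken from the paper.
    """
    # Lipschitz constant of the gradient of the least-squares term.
    lipschitz = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x


def classify(D, labels, y, lam=0.01):
    """Assign y to the class whose atoms reconstruct it with the smallest residual.

    The residual-based rule is an assumed decision rule for this sketch.
    """
    x = lasso_ista(D, y, lam)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))], x


# Toy usage with synthetic descriptors (hypothetical data, not MSR3D).
rng = np.random.default_rng(0)
dim, atoms_per_class, n_classes = 20, 10, 3
means = rng.normal(size=(n_classes, dim))

# Over-complete dictionary: 30 atoms in a 20-dimensional feature space.
columns, labels = [], []
for c in range(n_classes):
    for _ in range(atoms_per_class):
        atom = means[c] + 0.1 * rng.normal(size=dim)
        columns.append(atom / np.linalg.norm(atom))
        labels.append(c)
D = np.stack(columns, axis=1)
labels = np.array(labels)

# Test descriptor drawn near the class-1 mean.
y = means[1] + 0.05 * rng.normal(size=dim)
y = y / np.linalg.norm(y)

predicted, x = classify(D, labels, y)
print("predicted class:", predicted)
print("non-zero coefficients:", int(np.count_nonzero(x)), "of", x.size)
```

In this sketch each dictionary column stands in for one training descriptor, so the sparse code `x` indicates which training examples best explain the test descriptor. A larger `lam` yields a sparser code at the cost of a larger reconstruction error.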





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Sherif Azary (1)
  • Andreas Savakis (1)
  1. Computing and Information Sciences and Computer Engineering, Rochester Institute of Technology, Rochester, USA
