Hessian Regularized Sparse Coding for Human Action Recognition

  • Weifeng Liu
  • Zhen Wang
  • Dapeng Tao
  • Jun Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8936)

Abstract

With the rapid increase of online videos, recognition and search in videos have become a new trend in multimedia computing, and action recognition in videos has recently drawn intensive research attention. Meanwhile, sparse representation has become a state-of-the-art solution in computer vision because it offers several advantages for data representation, including easy interpretation, quick indexing, and a considerable connection with biological vision. One prominent sparse representation algorithm is Laplacian regularized sparse coding (LaplacianSC). However, LaplacianSC biases the solution toward a constant and thus generalizes poorly. In this paper, we propose Hessian regularized sparse coding (HessianSC) for action recognition. In contrast to LaplacianSC, HessianSC preserves the local geometry well and steers the sparse codes to vary linearly along the manifold of the data distribution. We also present a fast iterative shrinkage-thresholding algorithm (FISTA) for HessianSC. Extensive experiments on the human motion database (HMDB51) demonstrate that HessianSC significantly outperforms LaplacianSC and the traditional sparse coding algorithm for action recognition.
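
The abstract does not state the objective function, so the following is only a minimal sketch of how a FISTA solver for a Hessian regularized sparse coding problem might look. It assumes the common form min_S ||X - BS||_F^2 + alpha * tr(S H S^T) + beta * ||S||_1, where X holds the samples as columns, B is a fixed dictionary, and H is a symmetric positive semidefinite Hessian energy matrix supplied by the caller. The names `fista_codes`, `objective`, `soft_threshold`, `alpha`, and `beta` are illustrative and do not come from the paper, and the construction of H is not shown.

```python
import numpy as np


def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)


def objective(X, B, S, H, alpha, beta):
    """Assumed objective: reconstruction + Hessian energy + l1 sparsity."""
    fit = np.sum((X - B @ S) ** 2)
    energy = alpha * np.trace(S @ H @ S.T)
    return fit + energy + beta * np.sum(np.abs(S))


def fista_codes(X, B, H, alpha=0.1, beta=0.1, n_iter=200):
    """Sketch of FISTA for the sparse codes S with the dictionary B fixed.

    X: (d, n) data, B: (d, k) dictionary, H: (n, n) symmetric PSD matrix
    (assumed to be a precomputed Hessian energy matrix).
    """
    k, n = B.shape[1], X.shape[1]
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    L = 2.0 * (np.linalg.norm(B, 2) ** 2 + alpha * np.linalg.norm(H, 2))
    S = np.zeros((k, n))
    Y = S.copy()
    t = 1.0
    for _ in range(n_iter):
        # Gradient of ||X - BY||_F^2 + alpha * tr(Y H Y^T) for symmetric H.
        grad = 2.0 * B.T @ (B @ Y - X) + 2.0 * alpha * Y @ H
        S_next = soft_threshold(Y - grad / L, beta / L)
        # Momentum update from Beck and Teboulle's FISTA.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = S_next + ((t - 1.0) / t_next) * (S_next - S)
        S, t = S_next, t_next
    return S
```

Under these assumptions, replacing H with a graph Laplacian would give a LaplacianSC-style solver with the same iteration, which is why only the regularization matrix is left as a parameter here.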

Keywords

Action recognition, sparse coding, Hessian regularization, manifold learning

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Weifeng Liu (1)
  • Zhen Wang (1)
  • Dapeng Tao (2, 3)
  • Jun Yu (4)
  1. China University of Petroleum (East China), Qingdao, China
  2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Science, Shenzhen, China
  3. The Chinese University of Hong Kong, Hong Kong, China
  4. Hangzhou Dianzi University, Hangzhou, China
