Efficient Framework for Action Recognition Using Reduced Fisher Vector Encoding

  • Prithviraj Dhar (Email author)
  • Jose M. Alvarez
  • Partha Pratim Roy
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 460)

Abstract

This paper presents a novel and efficient approach to improving the recognition of human actions in video through an unorthodox combination of stage-level approaches. Feature descriptors obtained from dense trajectories, i.e. HOG, HOF and MBH, are known to be successful in representing videos. In this work, a Fisher Vector encoding with reduced dimensions is obtained separately for each of these descriptors, and the encodings are concatenated to form one super vector representing each video. To limit the dimension of this super vector, we include only the first-order statistics, computed by the Gaussian Mixture Model, in the individual Fisher Vectors. Finally, we feed the elements of this super vector as inputs to a Deep Belief Network (DBN) classifier. The performance of this setup is evaluated on the KTH and Weizmann datasets. Experimental results show a significant improvement on these datasets: accuracies of 98.92 % and 100 % are obtained on the KTH and Weizmann datasets, respectively.
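The encoding step described in the abstract can be illustrated with a minimal sketch. It assumes a diagonal-covariance Gaussian Mixture Model whose parameters are already available for each descriptor type, and it uses the standard first-order Fisher Vector gradient (with respect to the component means). The function names, the toy dimensions, and the normalization are assumptions made for illustration; the abstract does not specify the authors' exact implementation, and the DBN classifier is not shown.

```python
import numpy as np


def gmm_posteriors(X, weights, means, variances):
    """Soft assignments gamma[n, k] of N descriptors to K diagonal Gaussians."""
    diff = X[:, None, :] - means[None, :, :]                      # (N, K, D)
    # Log-density of every descriptor under every component.
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_joint = log_prob + np.log(weights)[None, :]
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    gamma = np.exp(log_joint)
    return gamma / gamma.sum(axis=1, keepdims=True)


def first_order_fisher_vector(X, weights, means, variances):
    """Fisher Vector restricted to first-order (mean) statistics.

    Returns K * D values; adding the second-order (variance) statistics
    would double this to 2 * K * D.
    """
    n_samples = X.shape[0]
    gamma = gmm_posteriors(X, weights, means, variances)          # (N, K)
    # Deviation from each component mean, in units of standard deviation.
    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]
    grad_mu = np.einsum("nk,nkd->kd", gamma, diff)                # (K, D)
    grad_mu /= n_samples * np.sqrt(weights)[:, None]
    return grad_mu.ravel()


def video_super_vector(descriptors, gmms):
    """Concatenate the per-descriptor Fisher Vectors of one video."""
    return np.concatenate(
        [first_order_fisher_vector(descriptors[name], *gmms[name])
         for name in sorted(descriptors)]
    )


# Toy example: random descriptors and random GMM parameters (illustrative only).
rng = np.random.default_rng(0)
n_components = 4
toy_dims = {"HOG": 6, "HOF": 5, "MBH": 8}   # arbitrary small sizes, not the real ones
descriptors = {name: rng.normal(size=(50, d)) for name, d in toy_dims.items()}
gmms = {
    name: (
        np.full(n_components, 1.0 / n_components),   # mixture weights
        rng.normal(size=(n_components, d)),          # component means
        np.ones((n_components, d)),                  # diagonal variances
    )
    for name, d in toy_dims.items()
}
super_vec = video_super_vector(descriptors, gmms)
print(super_vec.shape)
```

In this sketch the super vector has K * (D_HOG + D_HOF + D_MBH) entries, which is half the length of an encoding that also keeps the second-order statistics; the resulting vector would then be passed to the classifier.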

Keywords

Human action recognition; Deep Belief Network; Dense trajectory features; Fisher Vector

Copyright information

© Springer Science+Business Media Singapore 2017

Authors and Affiliations

  • Prithviraj Dhar (1), Email author
  • Jose M. Alvarez (2)
  • Partha Pratim Roy (3)

  1. Department of CSE, Kolkata, India
  2. NICTA, Sydney, Australia
  3. Department of CSE, IIT, Roorkee, India
