Fine-Grained Activity Recognition with Holistic and Pose Based Features

  • Conference paper
Pattern Recognition (GCPR 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8753)

Abstract

Holistic methods based on dense trajectories [29, 30] are currently the de facto standard for recognition of human activities in video. Whether holistic representations will endure or be superseded by higher-level video encodings in terms of body pose and motion is the subject of an ongoing debate [12]. In this paper we aim to clarify the underlying factors responsible for the good performance of holistic and pose-based representations. To that end, we build on our recent dataset [2], which leverages the existing taxonomy of human activities. This dataset includes \(24,920\) video snippets covering \(410\) human activities in total. Our analysis reveals that holistic and pose-based methods are highly complementary, and that their performance varies significantly depending on the activity. We find that holistic methods are mostly affected by the number and speed of trajectories, whereas pose-based methods are mostly influenced by the viewpoint of the person. We observe striking performance differences across activities: for certain activities, results with pose-based features are more than twice as accurate as results with holistic features, and vice versa. The best performing approach in our comparison combines holistic and pose-based features, which again underlines their complementarity.

References

  1. Ainsworth, B., Haskell, W., Herrmann, S., Meckes, N., Bassett, D., Tudor-Locke, C., Greer, J., Vezina, J., Whitt-Glover, M., Leon, A.: 2011 compendium of physical activities: a second update of codes and MET values. MSSE 43(8), 1575–1581 (2011)

  2. Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B.: 2D human pose estimation: new benchmark and state of the art analysis. In: CVPR’14

  3. Brendel, W., Todorovic, S.: Learning spatiotemporal graphs of human activities. In: ICCV’11

  4. Cardinaux, F., Bhowmik, D., Abhayaratne, C., Hawley, M.S.: Video based technology for ambient assisted living: a review of the literature. J. Ambient Intell. Smart Environ. 3(3), 253–269 (2011)

  5. Chakraborty, B., Holte, M.B., Moeslund, T.B., Gonzalez, J., Xavier Roca, F.: A selective spatio-temporal interest point detector for human action recognition in complex scenes. In: ICCV’11

  6. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR’05

  7. Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: ECCV’06

  8. Dantone, M., Gall, J., Leistner, C., Van Gool, L.: Human pose estimation using body parts dependent joint regressors. In: CVPR’13

  9. Duchenne, O., Laptev, I., Sivic, J., Bach, F., Ponce, J.: Automatic annotation of human actions in video. In: ICCV’09

  10. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

  11. Ferrari, V., Marin, M., Zisserman, A.: Progressive search space reduction for human pose estimation. In: CVPR’08

  12. Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M.J.: Towards understanding action recognition. In: ICCV’13

  13. Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: ICCV’11

  14. Laptev, I.: On space-time interest points. IJCV 64(2/3), 107–123 (2005)

  15. Laptev, I., Marszałek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR’08

  16. Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild. In: CVPR’09

  17. Marszałek, M., Laptev, I., Schmid, C.: Actions in context. In: CVPR’09

  18. Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Poselet conditioned pictorial structures. In: CVPR’13

  19. Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Strong appearance and expressive spatial models for human pose estimation. In: ICCV’13

  20. Rodriguez, M.D., Ahmed, J., Shah, M.: Action MACH: a spatio-temporal maximum average correlation height filter for action recognition. In: CVPR’08

  21. Rohrbach, M., Amin, S., Andriluka, M., Schiele, B.: A database for fine grained activity detection of cooking activities. In: CVPR’12

  22. Rohrbach, M., Stark, M., Schiele, B.: Evaluating knowledge transfer and zero-shot learning in a large-scale setting. In: CVPR’11

  23. Sadanand, S., Corso, J.J.: Action bank: a high-level representation of activity in video. In: ECCV’12

  24. Sapp, B., Taskar, B.: Multimodal decomposable models for human pose estimation. In: CVPR’13

  25. Singh, V.K., Nevatia, R.: Action recognition in cluttered dynamic scenes using pose-specific part models. In: ICCV’11

  26. Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human action classes from videos in the wild. Technical report CRCV-TR-12-01, UCF (2012)

  27. Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. In: CVPR’10

  28. Vishwakarma, S., Agrawal, A.: A survey on activity recognition and behavior understanding in video surveillance. VC 29(10), 983–1009 (2013)

  29. Wang, H., Kläser, A., Schmid, C., Liu, C.L.: Dense trajectories and motion boundary descriptors for action recognition. IJCV 103(1), 60–79 (2013)

  30. Wang, H., Schmid, C.: Action recognition with improved trajectories. In: ICCV’13

  31. Wang, H., Ullah, M.M., Kläser, A., Laptev, I., Schmid, C.: Evaluation of local spatio-temporal features for action recognition. In: BMVC’09

  32. Yang, Y., Ramanan, D.: Articulated human detection with flexible mixtures of parts. PAMI 61(1), 55–79 (2013)

Acknowledgements

The authors would like to thank Marcus Rohrbach and Sikandar Amin for helpful discussions. This work has been supported by the Max Planck Center for Visual Computing & Communication.

Corresponding author

Correspondence to Leonid Pishchulin.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Pishchulin, L., Andriluka, M., Schiele, B. (2014). Fine-Grained Activity Recognition with Holistic and Pose Based Features. In: Jiang, X., Hornegger, J., Koch, R. (eds) Pattern Recognition. GCPR 2014. Lecture Notes in Computer Science, vol 8753. Springer, Cham. https://doi.org/10.1007/978-3-319-11752-2_56

  • DOI: https://doi.org/10.1007/978-3-319-11752-2_56

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11751-5

  • Online ISBN: 978-3-319-11752-2

  • eBook Packages: Computer Science, Computer Science (R0)
