Data Mining for Action Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9007)

Abstract

In recent years, dense trajectories have been shown to be an efficient representation for action recognition, achieving state-of-the-art results on a variety of increasingly difficult datasets. However, while these features have greatly improved recognition scores, the training process and machine learning have in general not deviated from the SVM-based approach inherited from object recognition, despite the increase in the quantity and complexity of the features used. This paper improves the performance of action recognition through two data mining techniques, Apriori association rule mining and Contrast Set Mining. These techniques are ideally suited to action recognition, and to dense trajectory features in particular, as they can exploit the large amounts of data to identify far shorter, discriminative subsets of features called rules. Experimental results on one of the most challenging datasets, Hollywood2, outperform the current state of the art.
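The paper's own mining criteria and rule scoring are given in the body of the text; as a rough, self-contained illustration of the Apriori principle the abstract refers to [26], the Python sketch below mines frequent itemsets from toy "transactions". The clip/word representation and all names (`apriori`, `clips`, the `w*` labels) are illustrative assumptions, not the authors' implementation; Contrast Set Mining [28] additionally compares itemset support across classes.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return every itemset occurring in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: candidate itemsets are the individual items.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        keys = list(survivors)
        if not keys:
            break
        k = len(keys[0]) + 1
        # Join step: merge pairs of frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == k}
        # Prune step (the Apriori property): a candidate survives only if
        # every (k-1)-subset of it is itself frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in survivors for s in combinations(c, k - 1))}
    return frequent

# Toy usage: each "transaction" is the set of quantised trajectory words
# observed in one video clip (the word labels here are purely hypothetical).
clips = [{"w1", "w2", "w3"}, {"w1", "w2"}, {"w2", "w3"}, {"w1", "w3"}]
for itemset, support in apriori(clips, min_support=2).items():
    print(sorted(itemset), support)
```

In the paper's setting, the discriminative "rules" would be such frequent itemsets filtered further by class; the sketch above shows only the generic frequency-pruning step that makes the search over subsets tractable.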

Notes

Acknowledgement

This work was supported by the EPSRC grant “Learning to Recognise Dynamic Visual Content from Broadcast Footage” (EP/I011811/1).

References

  1. Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: ICCV 2005, pp. 1395–1402 (2005)
  2. Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: ICPR 2004, pp. 32–36 (2004)
  3. Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: ICCV 2011 (2011)
  4. Marszalek, M., Laptev, I., Schmid, C.: Actions in context. In: CVPR 2009 (2009)
  5. Scovanner, P., Ali, S., Shah, M.: A 3-dimensional SIFT descriptor and its application to action recognition. In: Proceedings of MULTIMEDIA 2007, pp. 357–360 (2007)
  6. Willems, G., Tuytelaars, T., Van Gool, L.: An efficient dense and scale-invariant spatio-temporal interest point detector. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part II. LNCS, vol. 5303, pp. 650–663. Springer, Heidelberg (2008)
  7. Klaser, A., Marszalek, M., Schmid, C.: A spatio-temporal descriptor based on 3D gradients. In: BMVC 2008 (2008)
  8. Laptev, I., Lindeberg, T.: Space-time interest points. In: ICCV 2003, pp. 432–439 (2003)
  9. Wang, H., Kläser, A., Schmid, C., Liu, C.L.: Dense trajectories and motion boundary descriptors for action recognition. Int. J. Comput. Vis. 103, 60–79 (2013)
  10. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR 2001, pp. 511–518 (2001)
  11. Maron, O., Lozano-Pérez, T.: A framework for multiple-instance learning. In: Advances in Neural Information Processing Systems, pp. 570–576 (1998)
  12. Gilbert, A., Illingworth, J., Bowden, R.: Action recognition using mined hierarchical compound features. IEEE Trans. Pattern Anal. Mach. Intell. 33, 883–897 (2011)
  13. Quack, T., Ferrari, V., Leibe, B., Van Gool, L.: Efficient mining of frequent and distinctive feature configurations. In: ICCV 2007 (2007)
  14. Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR 2008, pp. 1–8 (2008)
  15. Niebles, J.C., Wang, H., Fei-Fei, L.: Unsupervised learning of human action categories using spatial-temporal words. Int. J. Comput. Vis. 79, 299–318 (2008)
  16. Uemura, H., Ishikawa, S., Mikolajczyk, K.: Feature tracking and motion compensation for action recognition. In: BMVC 2008 (2008)
  17. Park, D., Zitnick, C.L., Ramanan, D., Dollár, P.: Exploring weak stabilization for motion feature extraction. In: CVPR 2013, pp. 2882–2889 (2013)
  18. Hoai, M., Lan, Z.Z., De la Torre, F.: Joint segmentation and classification of human actions in video. In: CVPR 2011, pp. 3265–3272 (2011)
  19. Han, D., Bo, L., Sminchisescu, C.: Selection and context for action recognition. In: ICCV 2009, pp. 1933–1940 (2009)
  20. Oneata, D., Verbeek, J., Schmid, C.: Action and event recognition with Fisher vectors on a compact feature set. In: ICCV 2013, pp. 1817–1824 (2013)
  21. Yuan, J., Wu, Y., Yang, M.: Discovery of collocation patterns: from visual words to visual phrases. In: CVPR 2007, pp. 1–8 (2007)
  22. Nowozin, S., Bakir, G., Tsuda, K.: Discriminative subsequence mining for action classification. In: ICCV 2007, pp. 1–8 (2007)
  23. Siva, P., Russell, C., Xiang, T.: In defence of negative mining for annotating weakly labelled data. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 594–608. Springer, Heidelberg (2012)
  24. Wang, L., Qiao, Y., Tang, X.: Mining motion atoms and phrases for complex action recognition. In: ICCV 2013, pp. 2680–2687 (2013)
  25. Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 11–21 (1972)
  26. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB 1994), pp. 487–499 (1994)
  27. Menzies, T., Hu, Y.: Data mining for very busy people. Computer 36, 22–29 (2003)
  28. Bay, S.D., Pazzani, M.J.: Detecting change in categorical data: mining contrast sets. In: KDD 1999, pp. 302–306 (1999)
  29. Mathe, S., Sminchisescu, C.: Dynamic eye movement datasets and learnt saliency models for visual action recognition. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 842–856. Springer, Heidelberg (2012)
  30. Jain, M., Jégou, H., Bouthemy, P.: Better exploiting motion for better action recognition. In: CVPR 2013, pp. 2555–2562 (2013)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford, UK