Abstract
We propose a human action detection framework called the “mixture of heterogeneous attribute analyzers”. The framework integrates heterogeneous attributes learned from static and dynamic, local and global video features to boost action detection performance. First, we detect and track multiple people using an SVM-HOG detector and tracklet generation; the resulting short human tracklets are linked into long trajectories by spatio-temporal matching, and human key poses and local dense motion trajectories are extracted within the tracked human bounding box sequences. Second, we propose a mining method to learn discriminative attributes from three feature modalities: human bounding box trajectories, key poses, and local dense motion trajectories. Finally, the learned discriminative attributes are integrated in a latent structural max-margin learning framework, which also explores the spatio-temporal relationships between heterogeneous feature attributes. Experiments on the ChaLearn 2014 human action dataset demonstrate the superior detection performance of the proposed framework.
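The tracklet-linking step can be illustrated with a minimal sketch. The abstract does not specify the matching procedure, so everything here is an assumption made for illustration: the `Tracklet` fields, the cost (spatial distance plus temporal gap), the thresholds `max_gap` and `max_dist`, and the greedy lowest-cost-first assignment, which stands in for whatever spatio-temporal matching the authors actually use.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Tracklet:
    """Hypothetical short track: frame span plus box centers at both ends."""
    start_frame: int
    end_frame: int
    start_center: tuple  # (x, y) at start_frame
    end_center: tuple    # (x, y) at end_frame


def link_cost(a, b, max_gap, max_dist):
    """Cost of appending b after a, or None if the pair cannot be linked."""
    gap = b.start_frame - a.end_frame
    if gap <= 0 or gap > max_gap:
        return None  # b must start after a ends, within the allowed gap
    dist = hypot(b.start_center[0] - a.end_center[0],
                 b.start_center[1] - a.end_center[1])
    if dist > max_dist:
        return None
    return dist + gap  # assumed cost: spatial distance plus temporal gap


def link_tracklets(tracklets, max_gap=10, max_dist=50.0):
    """Greedily link tracklets into trajectories (lists of tracklet indices)."""
    candidates = []
    for i, a in enumerate(tracklets):
        for j, b in enumerate(tracklets):
            if i != j:
                cost = link_cost(a, b, max_gap, max_dist)
                if cost is not None:
                    candidates.append((cost, i, j))
    candidates.sort()  # cheapest links are accepted first

    successor, has_predecessor = {}, set()
    for _, i, j in candidates:
        # Each tracklet gets at most one successor and one predecessor.
        if i not in successor and j not in has_predecessor:
            successor[i] = j
            has_predecessor.add(j)

    trajectories = []
    for i in range(len(tracklets)):
        if i in has_predecessor:
            continue  # not the head of a trajectory
        chain = [i]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        trajectories.append(chain)
    return trajectories
```

Because a link is only allowed when the second tracklet starts after the first one ends, the chains cannot form cycles. A globally optimal one-to-one assignment could replace the greedy pass without changing the rest of the sketch.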
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pei, Y., Ni, B., Atmosukarto, I. (2015). Mixture of Heterogeneous Attribute Analyzers for Human Action Detection. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8925. Springer, Cham. https://doi.org/10.1007/978-3-319-16178-5_37
DOI: https://doi.org/10.1007/978-3-319-16178-5_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16177-8
Online ISBN: 978-3-319-16178-5
eBook Packages: Computer Science (R0)