Abstract
Hough-transform-based voting has been successfully applied to both object and activity detection. However, most current Hough voting methods suffer when insufficient training data is provided. To address this problem, we propose propagative Hough voting for activity analysis. Instead of letting local features vote individually, we perform feature voting using random projection trees (RPT), which leverage the low-dimensional manifold structure to match feature points in the high-dimensional feature space. Our RPT can index the unlabeled feature points in an unsupervised way. After the trees are constructed, the label and spatio-temporal configuration information are propagated from the training samples to the testing data via the RPT. The proposed activity recognition method does not rely on human detection and tracking, and it handles the scale and intra-class variations of activity patterns well. Results on two benchmark activity datasets show that our method outperforms state-of-the-art techniques not only when there is sufficient training data, as in activity recognition, but also when training data is limited, as in activity search with a single query example.
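The voting scheme summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified random projection tree that splits at the median of a random Gaussian projection, and it ignores scale handling entirely. The names `build_rp_tree` and `propagate_votes`, the tuple layouts, and the vote weighting are hypothetical choices made for illustration only.

```python
import random
from collections import defaultdict


def build_rp_tree(descs, ids, rng, max_leaf=4):
    """Recursively split descriptor indices; return the leaves as lists of indices.

    Assumption: each split projects onto a random Gaussian direction and cuts
    at the median, a simplification of the random projection tree split rule.
    """
    if len(ids) <= max_leaf:
        return [ids]
    dim = len(descs[ids[0]])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    proj = {i: sum(d * x for d, x in zip(direction, descs[i])) for i in ids}
    ordered = sorted(ids, key=lambda i: proj[i])
    mid = len(ordered) // 2  # both halves are non-empty because len(ids) >= 2
    return (build_rp_tree(descs, ordered[:mid], rng, max_leaf)
            + build_rp_tree(descs, ordered[mid:], rng, max_leaf))


def propagate_votes(train, test, n_trees=5, max_leaf=4, seed=0):
    """Propagate labels and center offsets from training to test features.

    train: list of (descriptor, label, offset_to_activity_center)
    test:  list of (descriptor, position)
    Returns a dict mapping (label, voted_center) to an accumulated score.
    """
    rng = random.Random(seed)
    # Index labeled and unlabeled descriptors together; labels are not used
    # when building the trees, so the indexing itself is unsupervised.
    descs = [f[0] for f in train] + [f[0] for f in test]
    n_train = len(train)
    votes = defaultdict(float)
    for _ in range(n_trees):
        leaves = build_rp_tree(descs, list(range(len(descs))), rng, max_leaf)
        for leaf in leaves:
            train_ids = [i for i in leaf if i < n_train]
            test_ids = [i for i in leaf if i >= n_train]
            if not train_ids:
                continue  # nothing to propagate from in this leaf
            weight = 1.0 / (len(train_ids) * n_trees)
            for j in test_ids:
                _, pos = test[j - n_train]
                for i in train_ids:
                    _, label, offset = train[i]
                    center = tuple(p + o for p, o in zip(pos, offset))
                    votes[(label, center)] += weight
    return dict(votes)
```

In this sketch, a test feature that shares a leaf with training features inherits their labels and casts votes for activity centers at its own position plus each training feature's stored offset. Peaks in the returned vote map would then be read as detections.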
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, G., Yuan, J., Liu, Z. (2012). Propagative Hough Voting for Human Activity Recognition. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33712-3_50
DOI: https://doi.org/10.1007/978-3-642-33712-3_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33711-6
Online ISBN: 978-3-642-33712-3
eBook Packages: Computer Science (R0)