Abstract
As the volume of multimedia data online grows, more sophisticated methods are needed to organise it. We present a practical system for rapid localisation and retrieval of human actions from large video databases. We first temporally segment the database and calculate a histogram-match score for each segment against the query. High-scoring, adjacent segments are joined into candidate localised regions using a noise-robust localisation algorithm, and each candidate region is then ranked against the query. Experiments show that this method is more efficient than previous attempts to perform similar action searches with localisation. We also demonstrate how results can be enhanced using relevance feedback, and consider how relevance feedback can be applied effectively in the context of localisation.
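The pipeline described in the abstract (score each temporal segment, join high-scoring neighbours, rank the resulting regions) can be illustrated with a minimal sketch. The abstract does not specify the features, the match score, or the localisation algorithm, so everything concrete here is an assumption made for illustration: histogram intersection stands in for the histogram-match score, a simple gap-bridging rule stands in for the noise-robust localisation step, and the names `histogram_intersection`, `localise`, `search`, `threshold` and `max_gap` are hypothetical.

```python
from typing import List, Sequence, Tuple


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two histograms after L1 normalisation.

    Assumption: the paper's actual histogram-match score may differ.
    """
    sa, sb = sum(a), sum(b)
    if sa == 0 or sb == 0:
        return 0.0
    return sum(min(x / sa, y / sb) for x, y in zip(a, b))


def localise(scores: Sequence[float], threshold: float,
             max_gap: int = 1) -> List[Tuple[int, int]]:
    """Join high-scoring segments into (start, end) regions, end inclusive.

    Up to `max_gap` consecutive low-scoring segments between two hits are
    bridged, so a single noisy segment does not split a region. This rule
    is an illustrative stand-in, not the paper's algorithm.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    last_hit = None
    for i, score in enumerate(scores):
        if score < threshold:
            continue
        if start is None:
            start = i
        elif i - last_hit - 1 > max_gap:
            # Gap too long: close the current region and open a new one.
            regions.append((start, last_hit))
            start = i
        last_hit = i
    if start is not None:
        regions.append((start, last_hit))
    return regions


def search(query: Sequence[float], segments: Sequence[Sequence[float]],
           threshold: float = 0.5,
           max_gap: int = 1) -> List[Tuple[float, int, int]]:
    """Return (score, start, end) for candidate regions, best match first."""
    scores = [histogram_intersection(query, h) for h in segments]
    ranked = []
    for start, end in localise(scores, threshold, max_gap):
        # Pool the histograms of all segments in the region, then re-score.
        pooled = [sum(col) for col in zip(*segments[start:end + 1])]
        ranked.append((histogram_intersection(query, pooled), start, end))
    ranked.sort(reverse=True)
    return ranked
```

In this sketch, `segments` is a list of per-segment feature histograms for the whole database and `query` is the histogram of the query clip. Scoring fixed segments once and only then assembling regions is what keeps the search cheap: the expensive comparison runs once per segment rather than once per candidate window.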
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jones, S., Shao, L. (2013). Rapid Localisation and Retrieval of Human Actions with Relevance Feedback. In: Wilson, R., Hancock, E., Bors, A., Smith, W. (eds) Computer Analysis of Images and Patterns. CAIP 2013. Lecture Notes in Computer Science, vol 8047. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40261-6_2
DOI: https://doi.org/10.1007/978-3-642-40261-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40260-9
Online ISBN: 978-3-642-40261-6
eBook Packages: Computer Science, Computer Science (R0)