Abstract
A very low-dimensional frame-level motion descriptor is proposed herein, with the capability to represent incomplete dynamics and thus allow online action prediction. At each frame, a set of local trajectory kinematic cues is spatially pooled using a covariance matrix. The set of frame-level covariance matrices forms a Riemannian manifold that describes motion patterns. A set of statistical measures is computed over this manifold to characterize the sequence dynamics, either globally or instantaneously from a motion history. Regarding the Riemannian metrics, two versions are proposed: (1) tangent projections with respect to recursively updated statistics, and (2) mapping each covariance matrix onto a linear (tangent) space using the identity matrix as reference. The proposed approach was evaluated on two tasks: (1) action classification on complete video sequences, and (2) online action recognition, in which the activity is predicted at each frame. The method was evaluated on two public datasets: KTH and UT-interaction. For action classification, it achieved average accuracies of 92.27% and 81.67% on KTH and UT-interaction, respectively. In the partial recognition task, the method reached a classification rate similar to that obtained on whole sequences using only the first 40% and 70% of the KTH and UT-interaction sequences, respectively. The code of this work is available at [code].
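The two core operations described above can be sketched briefly: pooling per-frame kinematic cues into a covariance matrix, and mapping that symmetric positive-definite (SPD) matrix onto a linear space via the matrix logarithm with the identity as reference (option 2 in the abstract). The following is a minimal illustration, not the authors' implementation; the feature dimension, ridge term, and random test data are assumptions.

```python
import numpy as np

def frame_covariance(features):
    """Spatially pool local kinematic cues of one frame into a covariance matrix.

    features: (n_points, d) array, one row per local trajectory cue.
    A small ridge keeps the matrix strictly positive definite.
    """
    return np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])

def log_map_identity(cov):
    """Map an SPD matrix to the tangent (linear) space at the identity.

    For SPD matrices this is the matrix logarithm, computed via
    eigendecomposition: log(C) = V diag(log(w)) V^T.
    """
    w, V = np.linalg.eigh(cov)
    return V @ np.diag(np.log(w)) @ V.T

# Toy example with 200 local cues of dimension 5 (hypothetical values).
rng = np.random.default_rng(0)
feats = rng.standard_normal((200, 5))
C = frame_covariance(feats)
L = log_map_identity(C)   # symmetric matrix; lives in a vector space
```

Because the log-mapped matrices lie in a vector space, frame-level descriptors can be averaged or compared with Euclidean tools, which is what makes recursive per-frame statistics tractable.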
Acknowledgements
This research was partially funded by the RTRA Digiteo project MAPOCA. The authors also thank the Vicerrectoría de Investigación y Extensión (VIE) of the Universidad Industrial de Santander for supporting the project "Cuantificación de patrones locomotores para el diagnóstico y seguimiento remoto en zonas de difícil acceso" (SIVIE code 2697).
Cite this article
Martı́nez Carrillo, F., Gouiffès, M., Garzón Villamizar, G. et al. A compact and recursive Riemannian motion descriptor for untrimmed activity recognition. J Real-Time Image Proc 18, 1867–1880 (2021). https://doi.org/10.1007/s11554-020-01057-9