Abstract
In this paper we describe our gesture detection and recognition system for Track 3 (Gesture Recognition) of the 2014 ChaLearn Looking at People challenge, held in conjunction with the ECCV 2014 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures and detect them in continuous video sequences. Our system adopts a multi-modality approach for both detecting and recognizing the gestures. The goal of our approach is to identify semantically meaningful content in a densely sampled spatio-temporal feature space for gesture recognition. To achieve this, we develop three concepts under the random forest framework: un-supervision, discrimination, and randomization. Un-supervision learns spatio-temporal features from two channels (grayscale and depth) of RGB-D video in an unsupervised way. Discrimination extracts the information in the densely sampled spatio-temporal space effectively. Randomization explores the densely sampled spatio-temporal feature space efficiently. An evaluation of our approach shows that we achieve a mean Jaccard Index of \(0.6489\) and a mean average accuracy of \(90.3\,\%\) over the test dataset.
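The pipeline sketched in the abstract — unsupervised feature learning per channel, multi-modality fusion, then a discriminative, randomized classifier — can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper learns spatio-temporal features with an ISA-style method, for which PCA serves here only as a simple unsupervised stand-in, and the synthetic arrays stand in for dense spatio-temporal patches.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for densely sampled spatio-temporal patches
# from the two RGB-D channels (grayscale and depth).
n_clips, patch_dim = 200, 64
gray = rng.normal(size=(n_clips, patch_dim))
depth = rng.normal(size=(n_clips, patch_dim))
labels = rng.integers(0, 20, size=n_clips)  # 20 Italian gesture classes

# "Un-supervision": learn a feature basis per channel without labels.
# (The paper uses ISA-style learning; PCA is only a simple stand-in.)
feat_gray = PCA(n_components=16, random_state=0).fit_transform(gray)
feat_depth = PCA(n_components=16, random_state=0).fit_transform(depth)
features = np.hstack([feat_gray, feat_depth])  # multi-modality fusion

# "Randomization" + "discrimination": a random forest draws random
# feature subsets per split and picks the most discriminative one.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(features, labels)
pred = forest.predict(features)  # one gesture label per clip
```

The random forest naturally combines the two remaining concepts: bagging and per-split feature subsampling provide the randomized exploration of the feature space, while the split criterion supplies the discrimination.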
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Chen, G. et al. (2015). Multi-modality Gesture Detection and Recognition with Un-supervision, Randomization and Discrimination. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8925. Springer, Cham. https://doi.org/10.1007/978-3-319-16178-5_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16177-8
Online ISBN: 978-3-319-16178-5