A New RGB-D Gesture Video Dataset and Its Benchmark Evaluations on Light-Weighted Networks

  • Conference paper
  • Theoretical Computer Science (NCTCS 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1494)

Abstract

Video-based gesture recognition plays an important role in the field of human-computer interaction (HCI), and most existing video-based gesture recognition work relies on traditional RGB gesture videos. Compared with RGB videos, RGB-D gesture videos carry additional depth information for every frame, and this depth information is considered effective in overcoming the impact of illumination and background variations. As far as we know, few RGB-D gesture video datasets fully account for illumination and background variations, even though such variations are common in daily usage scenarios, and this gap creates unnecessary obstacles for the development of gesture recognition algorithms. Motivated by this observation, this paper uses embedded devices to collect and classify a set of RGB-D gesture videos that retain both color and depth information, and proposes a new RGB-D gesture video dataset named DG-20. Specifically, DG-20 fully considers changes in illumination and background during data capture, providing more realistic RGB-D gesture video data for future research on RGB-D gesture recognition algorithms. Furthermore, we report benchmark evaluations of DG-20 on two representative light-weighted 3D CNN networks. Experimental results show that the depth information encoded in RGB-D gesture videos can effectively improve classification accuracy when dramatic changes in illumination and background exist.
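
To make the role of the depth channel concrete, the following is a minimal sketch (assuming PyTorch), not the authors' code: it only illustrates the general idea of fusing per-frame depth maps with RGB frames into a 4-channel clip that a light-weighted 3D CNN can consume. The tensor shapes, the toy stand-in network, and the 20-class output (echoing the "DG-20" name) are illustrative assumptions, not details taken from the dataset or the paper's benchmark setup.

    # Minimal sketch (assumption: PyTorch); illustrative only, not code from the paper.
    import torch
    import torch.nn as nn


    def make_rgbd_clip(rgb_frames: torch.Tensor, depth_frames: torch.Tensor) -> torch.Tensor:
        """Fuse per-frame depth maps with RGB frames at the input level.

        rgb_frames:   (T, 3, H, W) float tensor in [0, 1]
        depth_frames: (T, 1, H, W) float tensor with normalized depth
        returns:      (4, T, H, W) clip, channel-first as expected by nn.Conv3d
        """
        clip = torch.cat([rgb_frames, depth_frames], dim=1)  # (T, 4, H, W)
        return clip.permute(1, 0, 2, 3)                      # (4, T, H, W)


    class TinyGestureNet3D(nn.Module):
        """A deliberately small 3D CNN standing in for a light-weighted backbone
        (e.g. a 3D MobileNet/ShuffleNet-style network); the 4-channel input stem
        is the only point being illustrated."""

        def __init__(self, num_classes: int = 20):  # 20 classes is an assumption
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(4, 16, kernel_size=3, stride=(1, 2, 2), padding=1),
                nn.BatchNorm3d(16),
                nn.ReLU(inplace=True),
                nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm3d(32),
                nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool3d(1),
            )
            self.classifier = nn.Linear(32, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (N, 4, T, H, W) batch of fused RGB-D clips
            h = self.features(x).flatten(1)
            return self.classifier(h)


    if __name__ == "__main__":
        rgb = torch.rand(16, 3, 112, 112)    # 16 RGB frames (dummy data)
        depth = torch.rand(16, 1, 112, 112)  # 16 matching depth maps (dummy data)
        clip = make_rgbd_clip(rgb, depth).unsqueeze(0)   # (1, 4, 16, 112, 112)
        logits = TinyGestureNet3D(num_classes=20)(clip)  # (1, 20)
        print(logits.shape)

An RGB-only baseline under the same setup would simply drop the depth concatenation and use a 3-channel input stem, which is the kind of comparison the benchmark evaluations describe.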

Author information

Corresponding author

Correspondence to Kuan Li.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xiao, G., Lu, Z., Yang, Z., Jin, P., Li, K., Yin, J. (2021). A New RGB-D Gesture Video Dataset and Its Benchmark Evaluations on Light-Weighted Networks. In: Cai, Z., Li, J., Zhang, J. (eds) Theoretical Computer Science. NCTCS 2021. Communications in Computer and Information Science, vol 1494. Springer, Singapore. https://doi.org/10.1007/978-981-16-7443-3_4

  • DOI: https://doi.org/10.1007/978-981-16-7443-3_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-7442-6

  • Online ISBN: 978-981-16-7443-3

  • eBook Packages: Computer Science, Computer Science (R0)
