A New RGB-D Gesture Video Dataset and Its Benchmark Evaluations on Light-Weighted Networks

  • Conference paper
  • Theoretical Computer Science (NCTCS 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1494)

Abstract

Video-based gesture recognition plays an important role in the field of human-computer interaction (HCI), and most existing video-based gesture recognition work relies on traditional RGB gesture videos. Compared with RGB videos, RGB-D gesture videos carry additional depth information for every frame, and this depth information is considered effective in overcoming the impact of illumination and background variations. As far as we know, few RGB-D gesture video datasets fully account for illumination and background variations, even though such variations are common in daily usage scenarios, and this gap creates unnecessary obstacles for the development of gesture recognition algorithms. Motivated by this observation, this paper uses embedded devices to collect and classify a set of RGB-D gesture videos that retain both color and depth information, and proposes a new RGB-D gesture video dataset named DG-20. Specifically, DG-20 fully considers changes in illumination and background during data capture, providing more realistic RGB-D gesture video data for future research on RGB-D gesture recognition algorithms. Furthermore, we report benchmark evaluations of DG-20 on two representative light-weighted 3D CNN networks. Experimental results show that the depth information encoded in RGB-D gesture videos can effectively improve classification accuracy when dramatic changes in illumination and background exist.
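
To make the role of the depth channel concrete, the following is a minimal sketch (assuming PyTorch), not the authors' code: it only illustrates the general idea of fusing per-frame depth maps with RGB frames into a 4-channel clip that a light-weighted 3D CNN can consume. The tensor shapes, the toy stand-in network, and the 20-class output (echoing the "DG-20" name) are illustrative assumptions, not details taken from the dataset or the paper's benchmark setup.

    # Minimal sketch (assumption: PyTorch); illustrative only, not code from the paper.
    import torch
    import torch.nn as nn


    def make_rgbd_clip(rgb_frames: torch.Tensor, depth_frames: torch.Tensor) -> torch.Tensor:
        """Fuse per-frame depth maps with RGB frames at the input level.

        rgb_frames:   (T, 3, H, W) float tensor in [0, 1]
        depth_frames: (T, 1, H, W) float tensor with normalized depth
        returns:      (4, T, H, W) clip, channel-first as expected by nn.Conv3d
        """
        clip = torch.cat([rgb_frames, depth_frames], dim=1)  # (T, 4, H, W)
        return clip.permute(1, 0, 2, 3)                      # (4, T, H, W)


    class TinyGestureNet3D(nn.Module):
        """A deliberately small 3D CNN standing in for a light-weighted backbone
        (e.g. a 3D MobileNet/ShuffleNet-style network); the 4-channel input stem
        is the only point being illustrated."""

        def __init__(self, num_classes: int = 20):  # 20 classes is an assumption
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(4, 16, kernel_size=3, stride=(1, 2, 2), padding=1),
                nn.BatchNorm3d(16),
                nn.ReLU(inplace=True),
                nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm3d(32),
                nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool3d(1),
            )
            self.classifier = nn.Linear(32, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (N, 4, T, H, W) batch of fused RGB-D clips
            h = self.features(x).flatten(1)
            return self.classifier(h)


    if __name__ == "__main__":
        rgb = torch.rand(16, 3, 112, 112)    # 16 RGB frames (dummy data)
        depth = torch.rand(16, 1, 112, 112)  # 16 matching depth maps (dummy data)
        clip = make_rgbd_clip(rgb, depth).unsqueeze(0)   # (1, 4, 16, 112, 112)
        logits = TinyGestureNet3D(num_classes=20)(clip)  # (1, 20)
        print(logits.shape)

An RGB-only baseline under the same setup would simply drop the depth concatenation and use a 3-channel input stem, which is the kind of comparison the benchmark evaluations describe.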

Author information

Corresponding author

Correspondence to Kuan Li.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xiao, G., Lu, Z., Yang, Z., Jin, P., Li, K., Yin, J. (2021). A New RGB-D Gesture Video Dataset and Its Benchmark Evaluations on Light-Weighted Networks. In: Cai, Z., Li, J., Zhang, J. (eds) Theoretical Computer Science. NCTCS 2021. Communications in Computer and Information Science, vol 1494. Springer, Singapore. https://doi.org/10.1007/978-981-16-7443-3_4

  • DOI: https://doi.org/10.1007/978-981-16-7443-3_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-7442-6

  • Online ISBN: 978-981-16-7443-3

  • eBook Packages: Computer Science, Computer Science (R0)
