Recognizing Hand Gesture in Still Infrared Images by CapsNet

Xiao, Hongwang; Yang, Yun; Yu, Ke; Tian, Jiao; Cai, Xinyi; Zhao, Ying; Zhang, Kai; Guo, Na; Chen, Jinjun

doi:10.1007/978-3-030-90888-1_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13080)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

In recent years, gesture recognition has become a hot topic in the computer vision community because of its great potential in many real-world applications, which creates a need for robust hand gesture recognition algorithms. Infrared images are not disturbed by illumination variation. As a promising alternative to Convolutional Neural Networks (CNNs), Capsule Networks (CapsNet) can represent the orientations of features and capture the spatial relationships between the features of an entity, which gives CapsNet a higher generalization ability. In this paper, we propose IRHGR-CapsNet to investigate hand gesture recognition in still infrared images. To comprehensively evaluate the testing accuracy, convergence ability, and generalization ability of IRHGR-CapsNet, we split the original dataset into three subsets according to different split proportions, which yields four dataset split modes for the experiments. We achieve almost 100.00% testing accuracy on all four dataset split modes. We also demonstrate that the proposed IRHGR-CapsNet has strong convergence and generalization ability.
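The splitting procedure described in the abstract might be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the three subsets are training, validation, and testing sets, and the function name `split_dataset`, the dictionary `SPLIT_MODES`, and all four sets of proportions are hypothetical, since the abstract does not state the actual values.

```python
import random


def split_dataset(samples, proportions, seed=0):
    """Shuffle samples and split them into three subsets.

    proportions is a (train, validation, test) tuple of fractions that
    sum to 1. The subset roles are an assumption; the abstract only
    says "three subsets".
    """
    if len(proportions) != 3 or abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must be three fractions summing to 1")

    shuffled = list(samples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = round(n * proportions[0])
    n_val = round(n * proportions[1])

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test subset
    return train, val, test


# Hypothetical split modes: the abstract mentions four modes but does not
# give their proportions, so these values are placeholders.
SPLIT_MODES = {
    "mode_1": (0.8, 0.1, 0.1),
    "mode_2": (0.6, 0.2, 0.2),
    "mode_3": (0.4, 0.3, 0.3),
    "mode_4": (0.2, 0.4, 0.4),
}

# Stand-in for image identifiers; a real run would use the dataset's files.
samples = range(1000)
for name, proportions in SPLIT_MODES.items():
    train, val, test = split_dataset(samples, proportions)
    print(name, len(train), len(val), len(test))
```

Shrinking the training fraction from one mode to the next is one plausible way to probe generalization with less training data. Whether the paper's four modes follow this pattern is not stated in the abstract.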

Author information

Authors and Affiliations

Swinburne University of Technology, Hawthorn, VIC, 3122, Australia
Hongwang Xiao, Yun Yang, Ke Yu, Jiao Tian, Xinyi Cai, Ying Zhao, Kai Zhang, Na Guo & Jinjun Chen

Corresponding author

Correspondence to Jinjun Chen.

Editor information

Editors and Affiliations

School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
Wenjie Zhang
Peking University, Beijing, China
Lei Zou
Zayed University, Dubai, United Arab Emirates
Zakaria Maamar
Swinburne University of Technology, Melbourne, VIC, Australia
Lu Chen

About this paper

Cite this paper

Xiao, H. et al. (2021). Recognizing Hand Gesture in Still Infrared Images by CapsNet. In: Zhang, W., Zou, L., Maamar, Z., Chen, L. (eds) Web Information Systems Engineering – WISE 2021. WISE 2021. Lecture Notes in Computer Science, vol 13080. Springer, Cham. https://doi.org/10.1007/978-3-030-90888-1_13

DOI: https://doi.org/10.1007/978-3-030-90888-1_13
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90887-4
Online ISBN: 978-3-030-90888-1
eBook Packages: Computer Science, Computer Science (R0)
