Abstract
In recent years, gesture recognition has become a hot topic in the computer vision community because of its great potential in many real-world applications, which creates a need for robust hand gesture recognition algorithms. Infrared images have the advantage of not being affected by illumination variation. As a promising alternative to Convolutional Neural Networks (CNN), Capsule Networks (CapsNet) can represent the orientations of features and capture the spatial relationships between the features of an entity, which gives CapsNet a higher generalization ability. In this paper, we propose IRHGR-CapsNet to investigate hand gesture recognition in still infrared images. To comprehensively evaluate the testing accuracy, convergence ability, and generalization ability of IRHGR-CapsNet, we split the original dataset into three subsets according to different split proportions, obtaining four dataset split modes for the experiments. IRHGR-CapsNet achieves almost 100.00% testing accuracy on all four dataset split modes. The experiments also demonstrate that IRHGR-CapsNet has strong convergence and generalization abilities.
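The evaluation protocol described in the abstract (three subsets, several split proportions, four split modes) can be sketched roughly as follows. This is only an illustration: the abstract does not state the actual proportions, so the values in `SPLIT_MODES`, the subset names, and the helper `split_dataset` are all hypothetical.

```python
import random

# Hypothetical (train, validation, test) proportions. The abstract does not
# give the real values used for the four dataset split modes.
SPLIT_MODES = {
    "mode_1": (0.8, 0.1, 0.1),
    "mode_2": (0.7, 0.15, 0.15),
    "mode_3": (0.6, 0.2, 0.2),
    "mode_4": (0.5, 0.25, 0.25),
}


def split_dataset(samples, proportions, seed=0):
    """Shuffle the samples and cut them into three disjoint subsets."""
    train_frac, val_frac, test_frac = proportions
    if abs(train_frac + val_frac + test_frac - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test subset
    return train, val, test


if __name__ == "__main__":
    # Stand-in for the image dataset: integer indices instead of images.
    data = list(range(1000))
    for name, proportions in SPLIT_MODES.items():
        train, val, test = split_dataset(data, proportions)
        print(name, len(train), len(val), len(test))
```

Running the same model on every split mode, as in this loop, is one way to check whether accuracy holds up as the training subset shrinks, which is the kind of generalization evidence the abstract refers to.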
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiao, H. et al. (2021). Recognizing Hand Gesture in Still Infrared Images by CapsNet. In: Zhang, W., Zou, L., Maamar, Z., Chen, L. (eds) Web Information Systems Engineering – WISE 2021. WISE 2021. Lecture Notes in Computer Science, vol 13080. Springer, Cham. https://doi.org/10.1007/978-3-030-90888-1_13
DOI: https://doi.org/10.1007/978-3-030-90888-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90887-4
Online ISBN: 978-3-030-90888-1
eBook Packages: Computer Science, Computer Science (R0)