Learning Siamese Features for Finger Spelling Recognition

  • Bogdan Kwolek
  • Shinji Sako
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10617)


This paper addresses finger spelling recognition from images acquired by a single color camera. Recognition is based on learned low-dimensional embeddings, which are computed by both single and multiple siamese-based convolutional neural networks. We train classifiers operating on these features as well as convolutional neural networks operating on raw images. The evaluations are performed on a freely available dataset with finger spellings of Japanese Sign Language. The best results are achieved by a classifier trained on concatenated features of multiple siamese networks.
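The abstract does not say how the siamese networks are trained or how their outputs are combined, so the following is only a minimal sketch under stated assumptions. It assumes a contrastive loss, a common choice for siamese networks, and forms the combined feature by simple concatenation of per-network embeddings. The function names, the margin value, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed loss, not confirmed by the paper).

    Pairs from the same class are pulled together; pairs from different
    classes are pushed apart until they are at least `margin` away.
    """
    d = euclidean(emb_a, emb_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def concatenate_embeddings(embeddings):
    """Join the embeddings that several siamese networks produce for one image."""
    return [value for emb in embeddings for value in emb]


# Toy pair at distance 5: penalized if labelled similar, free if labelled dissimilar.
similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
dissimilar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)

# Hypothetical embeddings of one image from two networks, joined into one feature.
feature = concatenate_embeddings([[0.1, 0.2], [0.3, 0.4]])
```

A concatenated vector such as `feature` would then serve as the input to a conventional classifier, which is the configuration the abstract reports as giving the best results.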


Finger spelling recognition, Siamese neural networks, CNNs



This work was supported by the Polish National Science Center (NCN) under research grant 2014/15/B/ST6/02808, as well as by JSPS KAKENHI Grant Numbers 17H06114 and 15KK0008.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. AGH University of Science and Technology, Krakow, Poland
  2. Frontier Research Institute for Information Science, Nagoya Institute of Technology, Showa-ku, Nagoya, Japan
