American Sign Language Fingerspelling Recognition Using Wide Residual Networks

  • Kacper Kania
  • Urszula Markowska-Kaczmar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10841)


Despite existing solutions for accurate translation between written and spoken language, sign language is still not a well-studied area. A reliable, robust, real-time translator of American Sign Language would be a crucial bridge for communication between deaf and hearing people. In this paper we propose a method of sign language fingerspelling recognition using a modern convolutional neural network architecture called the Wide Residual Network, trained with the Snapshot Learning procedure. The model was trained using transfer learning on augmented datasets available on the Surrey University and Massey University web pages. The final result is a robust classifier of all letters of the alphabet that beats current state-of-the-art results. The outcomes encourage further research in this field toward a fully usable sign language translator.
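
As an illustration of the training procedure named in the abstract, the following is a minimal sketch of the cyclic cosine-annealed learning rate used by the snapshot ensembles technique of Huang et al. (2017). It assumes that the "Snapshot Learning procedure" refers to that technique; the abstract gives no hyperparameters, so the function names, the number of cycles, and the initial learning rate below are hypothetical and are not the authors' settings.

```python
import math


def snapshot_lr(iteration, total_iterations, num_cycles, initial_lr):
    """Cyclic cosine-annealed learning rate for a 1-based iteration index.

    The rate restarts at initial_lr at the start of every cycle and decays
    toward zero by the end of the cycle. All arguments are illustrative
    assumptions, not values taken from the paper.
    """
    cycle_len = math.ceil(total_iterations / num_cycles)
    position = (iteration - 1) % cycle_len  # 0-based position inside the cycle
    return initial_lr / 2 * (math.cos(math.pi * position / cycle_len) + 1)


def snapshot_iterations(total_iterations, num_cycles):
    """Iterations that close each cycle, where a model snapshot would be saved."""
    cycle_len = math.ceil(total_iterations / num_cycles)
    return [min(k * cycle_len, total_iterations) for k in range(1, num_cycles + 1)]


if __name__ == "__main__":
    # Hypothetical schedule: 100 iterations split into 5 cycles.
    for it in (1, 10, 20, 21):
        print(it, snapshot_lr(it, 100, 5, 0.1))
    print(snapshot_iterations(100, 5))
```

At test time, the snapshots saved at the end of each cycle would typically be combined by averaging their softmax outputs, which is how a single training run can yield an ensemble.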



We thank the Identt company for providing access to the PC used to conduct the experiments. We also thank Dr. Adam Gonczarek from the Wrocław University of Science and Technology for leading the recognition system project. We thank Michał Kosturek and Piotr Grzybowski from the scientific student association “” at Wrocław University of Science and Technology, who implemented the dictionary and hand localization modules, respectively, for the system.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Computer Science and Management, Wrocław University of Science and Technology, Wrocław, Poland
