
Spatial Pyramid Matching for Finger Spelling Recognition in Intensity Images

  • Samira Silva
  • William Robson Schwartz
  • Guillermo Cámara-Chávez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8827)

Abstract

Sign language is a complex form of communication, used mostly by deaf people, in which hands, limbs, head and facial expressions convey meaning. Finger spelling is a system in which each letter of the alphabet is represented by a unique and discrete movement of the hand. In this paper, we study the properties of the spatial pyramid matching descriptor for finger spelling recognition. The method is a simple extension of the orderless bag-of-features image representation: local features are mapped to multi-resolution histograms, which are then compared by a weighted histogram intersection. The performance of the approach is evaluated on a dataset of real images of American Sign Language (ASL) finger spelling. We conduct experiments under three evaluation protocols. The first uses 10% of the data for training and the remainder for testing, and achieves an accuracy of 92.50%. The second uses 50% of the data for training and achieves an accuracy of about 97.1%. The third performs 5-fold cross-validation and achieves an accuracy of 97.9%. Our method achieves the best results in all three protocols when compared to state-of-the-art approaches. In all experiments, we also evaluate the influence of the weights of the multi-resolution histograms and find that they do not significantly influence the results.
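The weighted histogram intersection described in the abstract can be illustrated with a minimal sketch. It assumes that local features have already been quantised into visual-word indices and that their image coordinates are normalised to [0, 1). The function names (pyramid_histogram, default_weights, spm_kernel) and the default weighting, taken from the standard spatial pyramid formulation of Lazebnik et al., are illustrative assumptions and not necessarily the configuration used in the paper.

```python
import numpy as np


def pyramid_histogram(xy, words, vocab_size, levels=2):
    """Build one visual-word histogram per pyramid level (hypothetical helper).

    xy:    (n, 2) array of feature coordinates, assumed normalised to [0, 1).
    words: (n,) integer array of visual-word indices in [0, vocab_size).
    Returns a list with one flat histogram per level 0..levels.
    """
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level  # the image is split into cells x cells regions
        # Grid cell of each feature at this resolution.
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # One bin per (cell, visual word) pair.
        flat = (cy * cells + cx) * vocab_size + words
        parts.append(np.bincount(flat, minlength=cells * cells * vocab_size))
    return parts


def default_weights(levels):
    """Assumed weighting: 1/2^L for level 0 and 1/2^(L-l+1) for levels l >= 1."""
    coarse = [1.0 / 2 ** levels]
    finer = [1.0 / 2 ** (levels - l + 1) for l in range(1, levels + 1)]
    return coarse + finer


def spm_kernel(parts_a, parts_b, weights):
    """Weighted sum of per-level histogram intersections."""
    return sum(
        w * np.minimum(a, b).sum()
        for w, a, b in zip(weights, parts_a, parts_b)
    )
```

Because the weights are passed to spm_kernel as an explicit argument, alternative weightings (for example, uniform weights) can be substituted to probe their influence, in the spirit of the experiments the abstract mentions. Under the assumed default weights, which sum to 1, an image compared with itself scores exactly its number of features.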

Keywords

  • Finger spelling recognition
  • Sign language recognition
  • Pyramidal matching

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Samira Silva (1)
  • William Robson Schwartz (2)
  • Guillermo Cámara-Chávez (1)
  1. Computer Science Department, Federal University of Ouro Preto, Ouro Preto, Brazil
  2. Computer Science Department, Federal University of Minas Gerais, Belo Horizonte, Brazil
