Some Approaches to Recognition of Sign Language Dynamic Expressions with Kinect

  • M. Oszust
  • M. Wysocki
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 300)


The paper considers recognition of isolated Polish Sign Language words observed by Kinect. A whole-word model approach, which uses a nearest neighbour classifier with dynamic time warping (DTW), is compared with an approach using models of subunits, i.e. elements smaller than words that resemble phonemes in spoken language. These smaller models are obtained with a data-driven procedure that divides the time series representing words from a training set into subsequences forming homogeneous groups. Symbols are assigned to these groups, so that each gesture receives a symbolic representation (transcription). The transcriptions are classified using a nearest neighbour approach based on edit distance. Two sets of features were used: one based on Kinect’s skeletal images and the other using a simplified description of the hands extracted as skin-coloured regions. Ten-fold cross-validation tests of the classifiers were performed on data representing signed Polish words. The subunit-based approach proved superior, particularly when only one learning example was available. Features that take the description of the hands into account led to better recognition results than those based on the skeleton.
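The two classifiers compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean local cost inside DTW, the unit edit costs, and the toy data are all assumptions made for illustration. The data-driven extraction of subunits is not shown, and the transcriptions are simply given as strings of symbols.

```python
from math import dist, inf


def dtw_distance(a, b):
    """DTW cost between two sequences of equal-dimension feature vectors.

    Assumption: Euclidean distance is used as the local cost.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def edit_distance(s, t):
    """Levenshtein distance with unit insert, delete and substitute costs."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # delete cs
                            curr[j - 1] + 1,            # insert ct
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_neighbour(query, examples, distance):
    """Return the label of the training example closest to the query.

    `examples` is a list of (item, label) pairs.
    """
    return min(examples, key=lambda ex: distance(query, ex[0]))[1]


# Whole-word approach: toy one-dimensional time series (hypothetical data).
train_series = [
    ([(0.0,), (1.0,), (2.0,)], "up"),
    ([(2.0,), (1.0,), (0.0,)], "down"),
]
query_series = [(0.0,), (0.0,), (1.0,), (2.0,)]
word_from_dtw = nearest_neighbour(query_series, train_series, dtw_distance)

# Subunit approach: toy transcriptions, one symbol per subunit (hypothetical).
train_transcriptions = [("ABCA", "word1"), ("DDEF", "word2")]
word_from_edit = nearest_neighbour("ABA", train_transcriptions, edit_distance)
```

Both classifiers share the same nearest neighbour rule and differ only in the distance function, which is why the sketch passes the distance in as a parameter.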





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer and Control Engineering, Rzeszow University of Technology, Rzeszów, Poland
