Tracking Benchmark Databases for Video-Based Sign Language Recognition

  • Philippe Dreuw
  • Jens Forster
  • Hermann Ney
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6553)


This work presents a survey of video databases that can be used in a continuous sign language recognition scenario to measure the performance of head and hand tracking algorithms, with respect to either a tracking error rate or a word error rate criterion.

Robust tracking algorithms are required because the signing hand frequently moves in front of the face, may temporarily disappear, or may cross the other hand.

Only a few studies consider the recognition of continuous sign language, and they usually rely on special devices such as colored gloves or blue-boxing environments to accurately track the regions of interest in sign language processing.

Ground-truth labels for hand and head positions have been annotated for more than 30k frames in several publicly available video databases of different degrees of difficulty, and preliminary tracking results are presented.
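The two evaluation criteria named above can be illustrated with a minimal sketch. The abstract does not define either criterion, so the definitions here are assumptions: the tracking error rate is taken as the fraction of frames in which the tracked position lies more than a pixel tolerance away from the annotated ground-truth position, and the word error rate is taken as the word-level Levenshtein distance divided by the reference length. The function names, the `tolerance` parameter and its default value are hypothetical and chosen only for illustration.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]


def tracking_error_rate(reference: Sequence[Point],
                        hypothesis: Sequence[Point],
                        tolerance: float = 20.0) -> float:
    """Fraction of frames whose tracked position is farther than
    `tolerance` pixels (Euclidean distance) from the annotated position.

    Assumed definition; the tolerance default is an illustrative value.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("one tracked position is needed per annotated frame")
    if not reference:
        raise ValueError("no annotated frames given")
    errors = sum(
        1
        for (rx, ry), (hx, hy) in zip(reference, hypothesis)
        if hypot(rx - hx, ry - hy) > tolerance
    )
    return errors / len(reference)


def word_error_rate(reference: Sequence[str],
                    hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance (substitutions, deletions,
    insertions) divided by the number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    annotated = [(0, 0), (10, 10), (50, 50), (100, 100)]
    tracked = [(0, 5), (10, 40), (50, 50), (130, 100)]
    print(tracking_error_rate(annotated, tracked, tolerance=20.0))
    print(word_error_rate("JOHN BUY HOUSE".split(), "JOHN HOUSE".split()))
```

Under these assumptions, the tracking error rate evaluates a tracker directly against the annotated positions, while the word error rate evaluates it indirectly through the output of a recognition system that consumes the tracked positions.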


Sign Language Recognition Tracking Benchmark Databases



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Philippe Dreuw (1)
  • Jens Forster (1)
  • Hermann Ney (1)
  1. Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany
