Australian sign language recognition

  • Eun-Jung Holden
  • Gareth Lee
  • Robyn Owens
Original Paper


Abstract

This paper presents an automatic Australian sign language (Auslan) recognition system that tracks multiple target objects (the face and hands) throughout an image sequence and extracts features for the recognition of sign phrases. Tracking is performed by matching simple geometrical features of the target objects between the current and the previous frames. In signing, the face and a hand of a signer often overlap, so the system must segment them for feature extraction. Our system handles occlusion of the face and a hand by detecting the contour of the foreground moving object using a combination of motion cues and the snake algorithm. Signs are represented by features that are invariant to scaling, 2D rotation and signing speed: the relative geometrical positioning and shapes of the target objects, together with their directions of motion. These features are used to recognise Auslan phrases with hidden Markov models. Experiments were conducted on 163 test sign phrases with varying grammatical formations. Using a known grammar, the system achieved a recognition rate of over 97% at the sentence level and 99% at the word level.
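The recognition stage maps sequences of extracted features to sign words by scoring them against per-word hidden Markov models. As a toy sketch of the Viterbi decoding behind that step — the two-word vocabulary, the quantised feature alphabet {0, 1} and all model parameters below are invented for illustration, not taken from the paper:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Log-domain Viterbi: best state path and its log score for a
    discrete-observation HMM (pi: initial, A: transition, B: emission)."""
    T, N = len(obs), len(pi)
    logpi, logA, logB = np.log(pi), np.log(A), np.log(B)
    delta = np.empty((T, N))           # best log score ending in each state
    psi = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # scores[i, j]: state i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    state = int(delta[-1].argmax())
    path = [state]
    for t in range(T - 1, 0, -1):
        state = int(psi[t][state])
        path.append(state)
    return path[::-1], float(delta[-1].max())

# Hypothetical two-word vocabulary: each word gets its own 2-state HMM
# over a vector-quantised feature alphabet {0, 1}.
PI = np.array([0.9, 0.1])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
WORD_MODELS = {
    "hello": np.array([[0.9, 0.1],    # state 0 mostly emits symbol 0
                       [0.1, 0.9]]),  # state 1 mostly emits symbol 1
    "thanks": np.array([[0.1, 0.9],
                        [0.9, 0.1]]),
}

def recognize(obs):
    """Score the sequence under every word model; return the best word."""
    return max(WORD_MODELS,
               key=lambda w: viterbi(obs, PI, A, WORD_MODELS[w])[1])
```

In the paper the observations are the invariant geometric and motion features and the models are trained per phrase; here the parameters are hand-set so the example stays self-contained.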


Keywords: Visual tracking · Vision system · Target detection · Human recognition



Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. School of Computer Science & Software Engineering, The University of Western Australia, Crawley, Australia
  2. School of Engineering Science, Murdoch University, Rockingham, Australia
