Sign Language Phoneme Transcription with Rule-based Hand Trajectory Segmentation

Abstract

A common approach to extracting phonemes of sign language is to group sign segments with an unsupervised clustering algorithm. However, simple clustering algorithms based on distance measures usually do not work well on temporal data, and more complex methods are typically required. In this paper, we present a simple and effective approach to extracting phonemes from American Sign Language sentences. We first apply a rule-based segmentation algorithm to segment the hand motion trajectories of signed sentences. We then extract feature descriptors based on principal component analysis to represent the segments efficiently, and cluster the segments with k-means on these high-level features to derive phonemes. Twenty-five different continuously signed sentences from a deaf signer were used for the analysis. After phoneme transcription, we trained hidden Markov models to recognize the sequence of phonemes in the sentences. Overall, our automatic approach yielded 165 segments, from which 58 phonemes were obtained; the average number of recognition errors was 18.8 (11.4%). In comparison, completely manual trajectory segmentation and phoneme transcription, which involved considerable labor, yielded 173 segments and 57 phonemes, with an average of 33.8 recognition errors (19.5%).
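
As a rough illustration of the pipeline described above (rule-based segmentation, PCA-based feature descriptors, k-means clustering into phonemes), the sketch below clusters hand-trajectory segments in Python with NumPy and scikit-learn. The fixed resampling length, the number of principal components, and the use of linear interpolation are assumptions made for this example; only the segment count (165) and the cluster count (58, matching the number of phonemes reported) are taken from the abstract. This is not the authors' implementation.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def resample_segment(traj, n_points=20):
    # Linearly resample a (T, 3) hand trajectory to a fixed number of points,
    # so that variable-length segments become equal-length feature vectors.
    t_old = np.linspace(0.0, 1.0, len(traj))
    t_new = np.linspace(0.0, 1.0, n_points)
    return np.column_stack(
        [np.interp(t_new, t_old, traj[:, d]) for d in range(traj.shape[1])]
    )

def cluster_segments(segments, n_components=8, n_clusters=58, seed=0):
    # Flatten the resampled segments, project them onto the leading principal
    # components, and group them with k-means; each cluster index plays the
    # role of a phoneme label for its member segments.
    X = np.array([resample_segment(s).ravel() for s in segments])
    feats = PCA(n_components=n_components).fit_transform(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(feats)

# Stand-in data: 165 random variable-length 3-D trajectories (placeholders for
# the rule-based segments; real input would be the tracked hand positions).
rng = np.random.default_rng(0)
segments = [rng.standard_normal((rng.integers(15, 40), 3)) for _ in range(165)]
labels = cluster_segments(segments)
print(labels[:10])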

Author information

Corresponding author

Correspondence to Surendra Ranganath.

Cite this article

Kong, W.W., Ranganath, S. Sign Language Phoneme Transcription with Rule-based Hand Trajectory Segmentation. J Sign Process Syst Sign Image Video Technol 59, 211–222 (2010). https://doi.org/10.1007/s11265-008-0292-5
