
Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes

  • Conference paper
Gesture-Based Communication in Human-Computer Interaction (GW 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1739)

Abstract

In this paper we present a novel approach to continuous, whole-sentence ASL recognition that uses phonemes, rather than whole signs, as the basic units. Our approach is based on a sequential phonological model of ASL. According to this model, ASL signs can be broken down into movements and holds, both of which are considered phonemes. This model does away with the distinction between whole signs and epenthesis movements that we made in previous work [17]. Instead, epenthesis movements are treated just like the other movements that constitute the signs.
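The key point of the movement-hold model can be sketched in a few lines of code. This is an illustrative sketch only (the phoneme labels and sign segmentations are invented, not taken from the paper): a sentence becomes a single stream of movement (M) and hold (H) phonemes, and the transition between two signs is modeled as an ordinary movement phoneme rather than a special case.

```python
# Hypothetical movement-hold segmentations of two signs (labels invented).
FATHER = ["H:forehead", "M:contact", "H:forehead"]
READ = ["H:neutral", "M:down", "H:neutral"]

def sentence_phonemes(signs):
    """Concatenate signs into one phoneme stream, inserting an epenthesis
    movement between consecutive signs. The epenthesis movement is treated
    exactly like any other movement phoneme in the stream."""
    out = []
    for i, sign in enumerate(signs):
        if i > 0:
            out.append("M:epenthesis")
        out.extend(sign)
    return out

print(sentence_phonemes([FATHER, READ]))
# → ['H:forehead', 'M:contact', 'H:forehead', 'M:epenthesis',
#    'H:neutral', 'M:down', 'H:neutral']
```

Because the inter-sign transition is just another phoneme, the recognizer needs no separate machinery for it.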

We subsequently train Hidden Markov Models (HMMs) to recognize the phonemes, instead of whole signs and epenthesis movements that we recognized previously [17]. Because the number of phonemes is limited, HMM-based training and recognition of the ASL signal becomes computationally more tractable and has the potential to lead to the recognition of large-scale vocabularies.

We experimented with a 22-word vocabulary and achieved similar recognition rates with phoneme- and word-based approaches. This result is very promising for scaling up the task in the future.
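The scalability argument behind the phoneme-based approach can be made concrete with a toy lexicon. The following is a minimal sketch under invented segmentations (none of these labels come from the paper): because signs share phonemes, the number of distinct sub-models to train is bounded by the phoneme inventory, not by the vocabulary size, and a sign-level left-to-right chain is recovered by concatenating its phoneme models.

```python
# Toy lexicon: each sign is a sequence of movement (M) and hold (H)
# phonemes. Segmentations are invented for illustration.
SIGN_LEXICON = {
    "FATHER": ["H:forehead", "M:contact", "H:forehead"],
    "MOTHER": ["H:chin", "M:contact", "H:chin"],
    "IDEA": ["H:forehead", "M:up-forward"],
    "INFORM": ["H:forehead", "M:forward", "H:neutral"],
}

def phoneme_inventory(lexicon):
    """Collect the distinct phonemes across the lexicon: these are the
    only HMMs that need to be trained, regardless of vocabulary size."""
    return sorted({p for seq in lexicon.values() for p in seq})

def sign_state_chain(sign, lexicon, states_per_phoneme=3):
    """Build a left-to-right state chain for a sign by concatenating the
    states of its phoneme models (a fixed number of states per phoneme)."""
    return [(p, s) for p in lexicon[sign] for s in range(states_per_phoneme)]

inventory = phoneme_inventory(SIGN_LEXICON)
print(len(SIGN_LEXICON), "signs share", len(inventory), "phoneme models")
print(sign_state_chain("IDEA", SIGN_LEXICON))
```

As the vocabulary grows, new signs mostly reuse existing phonemes, so the count of trainable models saturates while the set of recognizable signs keeps growing.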


References

  1. A. Braffort. ARGo: An architecture for sign language recognition and interpretation. In P. A. Harling and A. D. N. Edwards, editors, Progress in Gestural Interaction: Proceedings of Gesture Workshop '96, pages 17–30, Berlin, New York, 1997. Springer.

  2. D. Brentari. Sign language phonology: ASL. In J. A. Goldsmith, editor, The Handbook of Phonological Theory, Blackwell Handbooks in Linguistics, pages 615–639. Blackwell, Oxford, 1995.

  3. G. R. Coulter, editor. Current Issues in ASL Phonology, volume 3 of Phonetics and Phonology. Academic Press, Inc., San Diego, CA, 1993.

  4. R. Erenshteyn and P. Laskov. A multi-stage approach to fingerspelling and gesture recognition. In Proceedings of the Workshop on the Integration of Gesture in Language and Speech, Wilmington, DE, USA, 1996.

  5. S. Gibet, J. Richardson, T. Lebourque, and A. Braffort. Corpus of 3D natural movements and sign language primitives of movement. In I. Wachsmuth and M. Fröhlich, editors, Gesture and Sign Language in Human-Computer Interaction: Proceedings of Gesture Workshop '97, Berlin, New York, 1998. Springer.

  6. K. Grobel and M. Assam. Isolated sign language recognition using hidden Markov models. In SMC, pages 162–167, Orlando, FL, 1997.

  7. H. Hienz, K.-F. Kraiss, and B. Bauer. Continuous sign language recognition using hidden Markov models. In Y. Tang, editor, ICMI'99, pages IV10–IV15, Hong Kong, 1999.

  8. M. W. Kadous. Machine recognition of Auslan signs using PowerGloves: Towards large-lexicon recognition of sign language. In Proceedings of the Workshop on the Integration of Gesture in Language and Speech, pages 165–174, Wilmington, DE, USA, 1996.

  9. R.-H. Liang and M. Ouhyoung. A real-time continuous gesture recognition system for sign language. In Proceedings of the Third International Conference on Automatic Face and Gesture Recognition, pages 558–565, Nara, Japan, 1998.

  10. S. K. Liddell and R. E. Johnson. American Sign Language: The phonological base. Sign Language Studies, 64:195–277, 1989.

  11. Y. Nam and K. Y. Wohn. Recognition of space-time hand-gestures using hidden Markov model. In ACM Symposium on Virtual Reality Software and Technology, 1996.

  12. L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.

  13. W. Sandler. Phonological Representation of the Sign: Linearity and Nonlinearity in American Sign Language. Number 32 in Publications in Language Sciences. Foris Publications, Dordrecht, 1989.

  14. T. Starner and A. Pentland. Visual recognition of American Sign Language using hidden Markov models. In International Workshop on Automatic Face and Gesture Recognition, pages 189–194, Zürich, Switzerland, 1995.

  15. W. C. Stokoe. Sign Language Structure: An Outline of the Visual Communication System of the American Deaf. Studies in Linguistics: Occasional Papers 8. Linstok Press, Silver Spring, MD, 1960. Revised 1978.

  16. C. Vogler and D. Metaxas. Parallel hidden Markov models for American Sign Language recognition. In ICCV, Kerkyra, Greece, 1999.

  17. C. Vogler and D. Metaxas. Adapting hidden Markov models for ASL recognition by using three-dimensional computer vision methods. In SMC, pages 156–161, Orlando, FL, 1997.

  18. C. Vogler and D. Metaxas. ASL recognition based on a coupling between HMMs and 3D motion analysis. In ICCV, pages 363–369, Mumbai, India, 1998.

  19. M. B. Waldron and S. Kim. Isolated ASL sign recognition system for deaf persons. IEEE Transactions on Rehabilitation Engineering, 3(3):261–271, September 1995.


Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vogler, C., Metaxas, D. (1999). Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes. In: Braffort, A., Gherbi, R., Gibet, S., Teil, D., Richardson, J. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 1999. Lecture Notes in Computer Science (LNAI), vol 1739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46616-9_19

  • DOI: https://doi.org/10.1007/3-540-46616-9_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66935-7

  • Online ISBN: 978-3-540-46616-1

  • eBook Packages: Springer Book Archive
