A database-based framework for gesture recognition

Athitsos, Vassilis; Wang, Haijing; Stefan, Alexandra

doi:10.1007/s00779-009-0276-x

Original Article
Published: 05 March 2010

Volume 14, pages 511–526, (2010)

Personal and Ubiquitous Computing

Vassilis Athitsos¹,
Haijing Wang¹ &
Alexandra Stefan¹

Abstract

Gestures are an important modality for human–machine communication. Computer vision modules that perform gesture recognition can be important components of intelligent homes, assistive environments, and human–computer interfaces. A key problem in recognizing gestures is that the appearance of a gesture can vary widely depending on variables such as the person performing the gesture or the position and orientation of the camera. This paper presents a database-based approach for addressing this problem. The large variability in appearance among different examples of the same gesture is addressed by creating large gesture databases that store enough exemplars of each gesture to capture the variability within that gesture. This database-based approach is applied to two gesture recognition problems: handshape categorization and motion-based recognition of American Sign Language signs. A key aspect of our approach is the use of database indexing methods to address the challenge of searching large databases without violating the time constraints of an online interactive system, where response times of more than a few seconds are often considered unacceptable. Our experiments demonstrate the benefits of the proposed database-based framework and the feasibility of integrating large gesture databases into online interactive systems.
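The general idea of classifying a query against a database of labeled exemplars, while using an index to limit how many expensive comparisons are made, can be illustrated with a filter-and-refine sketch. This is a minimal illustration under stated assumptions, not the method evaluated in the paper: the feature vectors, the labels, and the functions `cheap_distance`, `exact_distance`, and `classify` are all hypothetical stand-ins.

```python
import math

def cheap_distance(a, b):
    # Hypothetical fast approximation: compares only the first two coordinates.
    return math.dist(a[:2], b[:2])

def exact_distance(a, b):
    # Hypothetical stand-in for an expensive gesture distance measure.
    return math.dist(a, b)

def classify(query, database, num_candidates=3):
    """Filter-and-refine nearest-neighbor classification.

    database is a list of (feature_vector, label) exemplars.
    """
    # Filter step: rank every exemplar by the cheap distance, keep the best few.
    candidates = sorted(
        database, key=lambda ex: cheap_distance(query, ex[0])
    )[:num_candidates]
    # Refine step: apply the exact distance only to the surviving candidates.
    best = min(candidates, key=lambda ex: exact_distance(query, ex[0]))
    return best[1]

# Toy database: several exemplars per class capture within-class variability.
database = [
    ((0.0, 0.0, 0.0, 0.0), "fist"),
    ((0.1, 0.2, 0.0, 0.1), "fist"),
    ((1.0, 1.0, 1.0, 1.0), "open_hand"),
    ((0.9, 1.1, 1.0, 0.8), "open_hand"),
    ((0.1, 0.1, 1.0, 1.0), "point"),
]

print(classify((0.95, 1.0, 0.9, 1.0), database))
```

In this sketch the exact distance is evaluated on only `num_candidates` exemplars per query rather than on the whole database, which is the kind of trade-off that makes large exemplar databases compatible with interactive response times. A larger `num_candidates` lowers the chance that the filter step discards the true nearest neighbor, at the cost of more exact comparisons.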

Acknowledgments

This work has been supported by National Science Foundation grants IIS-0705749 and IIS-0812601, as well as by a University of Texas at Arlington startup grant to Professor Athitsos, and University of Texas at Arlington STARS awards to Professors Chris Ding and Fillia Makedon. We also acknowledge and thank our collaborators at Boston University, including Carol Neidle, Stan Sclaroff, Joan Nash, Ashwin Thangali, and Quan Yuan, for their contributions in collecting and annotating the American Sign Language Lexicon Video Dataset.

Author information

Authors and Affiliations

Computer Science and Engineering Department, University of Texas at Arlington, Arlington, TX, USA
Vassilis Athitsos, Haijing Wang & Alexandra Stefan

Corresponding author

Correspondence to Vassilis Athitsos.

About this article

Cite this article

Athitsos, V., Wang, H. & Stefan, A. A database-based framework for gesture recognition. Pers Ubiquit Comput 14, 511–526 (2010). https://doi.org/10.1007/s00779-009-0276-x

Received: 31 December 2008
Accepted: 05 October 2009
Published: 05 March 2010
Issue Date: September 2010
DOI: https://doi.org/10.1007/s00779-009-0276-x
