Abstract
We developed a new device-free user interface for TV viewing based on human gesture recognition. Although many motion recognition technologies have been reported, no man–machine interface has yet been developed that recognizes a sufficiently large variety of gestures. The main obstacle has been the limited spatial information that can be acquired from ordinary video sequences. We overcame this obstacle by using a time-of-flight camera together with novel action recognition techniques. The system has two main functions: gesture recognition and posture measurement. Gesture recognition is performed with a bag-of-features approach that uses key-point trajectories as features; the use of 4-D spatiotemporal trajectory features is the main technical contribution of the proposed system. Posture measurement is obtained through face detection and object tracking. The interface is useful because it requires no contact-type devices. Experiments demonstrated the effectiveness of the proposed method and the usefulness of the system.
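The bag-of-features step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The descriptor layout (frame-to-frame displacements of an (x, y, z) key point), the plain k-means codebook, and all function names are assumptions made for illustration; the abstract does not specify them. The resulting histogram would then be passed to a classifier, which the abstract also leaves unspecified.

```python
import math
import random

def trajectory_descriptor(points):
    """Flatten frame-to-frame displacements of one tracked key point.

    points: list of (x, y, z) positions, one per frame (assumed layout;
    time is implicit in the frame order).
    """
    desc = []
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        desc.extend((x1 - x0, y1 - y0, z1 - z0))
    return desc

def nearest(desc, codebook):
    """Index of the codeword closest to desc in Euclidean distance."""
    return min(range(len(codebook)), key=lambda j: math.dist(desc, codebook[j]))

def build_codebook(descriptors, k, iters=10, seed=0):
    """Plain k-means over training descriptors; returns k codewords."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, codebook)].append(d)
        for j, members in enumerate(groups):
            if members:  # keep the old codeword if a cluster is empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def bag_of_features(descriptors, codebook):
    """Normalized histogram of codeword assignments for one clip."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

In this sketch every clip, however many trajectories it contains, is reduced to a fixed-length histogram, which is what makes the representation convenient as classifier input.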
Cite this article
Takahashi, M., Fujii, M., Naemura, M. et al. Human gesture recognition system for TV viewing using time-of-flight camera. Multimed Tools Appl 62, 761–783 (2013). https://doi.org/10.1007/s11042-011-0870-6
DOI: https://doi.org/10.1007/s11042-011-0870-6