Abstract
We present a fast and accurate algorithm for the detection of human hands in real-life 2D image sequences. We focus on a specific application of hand detection, viz. the annotation of egocentric recordings. A well-known type of egocentric camera is the mobile eye-tracker, which is often used in research on human-human interaction. Currently, this type of data is typically annotated manually for relevant features (e.g. visual fixations on gestures), which is a time-consuming and error-prone task. Our semi-automatic approach drastically reduces the amount of manual analysis while guaranteeing high accuracy. The algorithm combines several well-known detection techniques with an advanced elimination scheme that reduces false detections. We validate our approach on a challenging dataset containing over 4300 hand instances, which allows us to explore its capabilities and limits.
Notes
- 2. The PASCAL Visual Object Classes Challenge 2009 (VOC2009) Dataset http://www.pascal-network.org/challenges/VOC/voc2009/workshop/index.html.
Acknowledgements
This work is partially funded by KU Leuven via the projects Cametron and InSight Out. We also thank Raphael Den Dooven for his contributions.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
De Beugher, S., Brône, G., Goedemé, T. (2016). Semi-automatic Hand Annotation of Egocentric Recordings. In: Braz, J., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2015. Communications in Computer and Information Science, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-29971-6_18
DOI: https://doi.org/10.1007/978-3-319-29971-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29970-9
Online ISBN: 978-3-319-29971-6
eBook Packages: Computer Science (R0)