
Markerless tracking and gesture recognition using polar correlation of camera optical flow

Machine Vision and Applications

Abstract

We present a novel, real-time, markerless vision-based tracking system, employing a rigid orthogonal configuration of two pairs of opposing cameras. Our system uses optical flow over sparse features to overcome the limitation of vision-based systems that require markers or a pre-loaded model of the physical environment. We show how opposing cameras enable cancellation of common components of optical flow leading to an efficient tracking algorithm that captures five degrees of freedom including direction of translation and angular velocity. Experiments comparing our device with an electromagnetic tracker show that its average tracking accuracy is 80 % over 185 frames, and it is able to track large range motions even in outdoor settings. We also present how our tracking system can be used for gesture recognition by combining it with a simple linear classifier over a set of 15 gestures. Experimental results show that we are able to achieve 86.7 % gesture recognition accuracy.
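The flow-cancellation idea described above can be illustrated with a toy numerical sketch. This assumes a deliberately simplified model in which rotation-induced image flow appears with the same sign in both views of an opposing camera pair, while translation-induced flow flips sign; the sign conventions, the 0.5 factor, and the function names are assumptions for illustration, not the paper's actual derivation.

```python
import numpy as np

def separate_components(flow_a, flow_b):
    """Split per-feature optical flow from an opposing camera pair into
    rotational and translational estimates, under the simplified model
    that rotation contributes identically to both views while
    translation contributes with opposite sign."""
    rotational = 0.5 * (flow_a + flow_b)      # common component survives
    translational = 0.5 * (flow_a - flow_b)   # opposing component survives
    return rotational, translational

# Toy example: one tracked feature per camera, flow vectors in pixels.
rot = np.array([[1.0, 0.0]])     # rotation-induced flow (same in both views)
trans = np.array([[0.0, 2.0]])   # translation-induced flow (sign flips)
flow_a = rot + trans
flow_b = rot - trans

r_est, t_est = separate_components(flow_a, flow_b)
```

By construction, summing cancels the opposing (translational) part and differencing cancels the common (rotational) part, which is the intuition behind using rigidly mounted opposing cameras rather than a single view.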




Author information


Corresponding author

Correspondence to Prince Gupta.

Electronic supplementary material

Below is the Electronic Supplementary Material.

ESM 1 (avi 10046 KB)


About this article

Cite this article

Gupta, P., da Vitoria Lobo, N. & LaViola, J.J. Markerless tracking and gesture recognition using polar correlation of camera optical flow. Machine Vision and Applications 24, 651–666 (2013). https://doi.org/10.1007/s00138-012-0451-3
