Multi-camera Finger Tracking and 3D Trajectory Reconstruction for HCI Studies

  • Vadim Lyubanenko
  • Toni Kuronen
  • Tuomas Eerola
  • Lasse Lensu
  • Heikki Kälviäinen
  • Jukka Häkkinen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10617)


Three-dimensional human-computer interaction has the potential to form the next generation of user interfaces and to replace the current 2D touch displays. To study and develop such user interfaces, it is essential to measure how a human behaves while interacting with them. In practice, this can be achieved by accurately measuring hand movements in 3D using a camera-based system and computer vision. In this work, a framework for multi-camera finger movement measurements in 3D is proposed. This includes a comprehensive evaluation of state-of-the-art object trackers to select the most appropriate one for tracking fast gestures such as pointing actions. Moreover, the required trajectory post-processing and 3D trajectory reconstruction methods are proposed. The developed framework was successfully evaluated in an application where 3D touch screen usability is studied with 3D stimuli. The most sustainable performance was achieved by the Structuralist Cognitive model for visual Tracking (SCT) tracker complemented with LOESS smoothing.
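The LOESS smoothing step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `loess_smooth`, the `frac` neighbourhood parameter, and the synthetic trajectory are assumptions made for illustration. The sketch fits a tricube-weighted straight line around each sample and keeps the fitted value at that sample.

```python
import numpy as np


def loess_smooth(t, y, frac=0.3):
    """LOESS-style smoothing: a tricube-weighted local linear fit per sample.

    t, y : 1-D sequences of equal length (e.g. frame times and one
           coordinate of a tracked finger trajectory).
    frac : fraction of samples used in each local fit (hypothetical
           default, not a value taken from the paper).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    k = min(n, max(3, int(np.ceil(frac * n))))  # neighbours per local fit
    smoothed = np.empty(n)
    for i in range(n):
        dist = np.abs(t - t[i])
        idx = np.argsort(dist)[:k]          # k nearest samples in time
        h = dist[idx].max()
        if h == 0.0:
            h = 1.0                         # degenerate case: identical times
        w = (1.0 - (dist[idx] / h) ** 3) ** 3   # tricube weights
        # Weighted least squares for y ~ b0 + b1 * (t - t[i]);
        # b0 is then the fitted value at t[i].
        A = np.column_stack([np.ones(k), t[idx] - t[i]])
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y[idx] * sw, rcond=None)
        smoothed[i] = beta[0]
    return smoothed


# Synthetic stand-in for one coordinate of a noisy tracked trajectory.
rng = np.random.default_rng(0)
t_demo = np.linspace(0.0, 2.0 * np.pi, 200)
clean_demo = np.sin(t_demo)
noisy_demo = clean_demo + rng.normal(scale=0.1, size=t_demo.size)
smooth_demo = loess_smooth(t_demo, noisy_demo, frac=0.2)
```

For a 2D image trajectory, the same function would be applied to the x and y coordinates separately. A larger `frac` smooths more strongly but can blur the rapid changes typical of pointing gestures.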


Keywords: Human-computer interaction, Object tracking, Finger tracking, Multi-view tracking, 3D reconstruction



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Vadim Lyubanenko (1, 2)
  • Toni Kuronen (1)
  • Tuomas Eerola (1)
  • Lasse Lensu (1)
  • Heikki Kälviäinen (1)
  • Jukka Häkkinen (3)

  1. School of Engineering Science, Machine Vision and Pattern Recognition Laboratory, Lappeenranta University of Technology, Lappeenranta, Finland
  2. Institute of Mathematics, Mechanics and Computer Science, Laboratory of Artificial Intelligence and Robotics, Southern Federal University, Rostov-on-Don, Russian Federation
  3. Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
