Machine Vision and Applications

Volume 24, Issue 8, pp 1575–1587

Detecting interaction above digital tabletops using a single depth camera

  • Nadia Haubner
  • Ulrich Schwanecke
  • Ralf Dörner
  • Simon Lehmann
  • Johannes Luderschmidt
Original Paper


Abstract

Digital tabletop environments offer huge potential for application scenarios in which multiple users interact simultaneously or collaborate on shared tasks. So far, research in this field has focused on touch and tangible interaction, which takes place only on the tabletop's surface. Initial approaches aim to involve the space above the surface, e.g., by employing freehand gestures; however, these are either limited to specific scenarios or rely on obtrusive tracking solutions. In this paper, we propose an approach to unobtrusively segment and detect interaction above a digital surface using a depth-sensing camera. To achieve this, we adapt a previously presented approach that segments arms in depth data from a front-view to a top-view setup, facilitating the detection of hand positions. Moreover, we propose a novel algorithm to merge segments and compare it with the original segmentation algorithm. Since the algorithm involves a large number of parameters, estimating the optimal configuration is necessary. To accomplish this, we describe a low-effort approach to estimating the parameter configuration based on simulated annealing. An evaluation of our system's hand detection shows that a repositioning precision of approximately 1 cm is achieved. This accuracy is sufficient to reliably realize interaction metaphors above a surface.
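The parameter-estimation step mentioned above can be illustrated with a generic simulated-annealing loop. This is a minimal sketch, not the authors' implementation: the `objective` function is a hypothetical stand-in for a score of a segmentation parameter configuration (lower is better), and the step size, cooling schedule, and iteration count are illustrative assumptions.

```python
import math
import random

def simulated_annealing(objective, params, step=0.1, t0=1.0,
                        cooling=0.95, iters=500, seed=0):
    """Minimize `objective` over a parameter vector via simulated annealing.

    Hypothetical stand-in for the paper's parameter estimation: in that
    setting, `objective` would score one segmentation-parameter
    configuration against ground-truth data (lower is better).
    """
    rng = random.Random(seed)
    current = list(params)
    best = list(current)
    current_cost = best_cost = objective(current)
    t = t0
    for _ in range(iters):
        # Perturb one randomly chosen parameter.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step, step)
        cost = objective(candidate)
        # Always accept improvements; accept worse configurations
        # with probability exp(-delta / temperature).
        if cost < current_cost or rng.random() < math.exp(-(cost - current_cost) / t):
            current, current_cost = candidate, cost
        if current_cost < best_cost:
            best, best_cost = list(current), current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost

# Toy objective: squared distance from the point (1, 2); minimum is 0.
params, cost = simulated_annealing(
    lambda p: (p[0] - 1) ** 2 + (p[1] - 2) ** 2, [0.0, 0.0])
```

The appeal of this scheme for a segmentation pipeline with many coupled parameters is that it needs only a scalar quality score per configuration, not gradients, which keeps the tuning effort low.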


Keywords

Human–computer interaction · Depth sensing cameras · Image segmentation · Object detection



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Nadia Haubner (1)
  • Ulrich Schwanecke (1)
  • Ralf Dörner (1)
  • Simon Lehmann (1)
  • Johannes Luderschmidt (1)

  1. RheinMain University of Applied Sciences, Wiesbaden, Germany
