
Machine Vision and Applications, Volume 24, Issue 2, pp 319–336

Multicamera human detection and tracking supporting natural interaction with large-scale displays

  • Xenophon Zabulis (corresponding author)
  • Dimitris Grammenos
  • Thomas Sarmis
  • Konstantinos Tzevanidis
  • Pashalis Padeleris
  • Panagiotis Koutlemanis
  • Antonis A. Argyros
Original Paper

Abstract

This paper presents a computer vision system that supports non-instrumented, location-based interaction of multiple users with digital representations of large-scale artifacts. The proposed system is based on a camera network that observes multiple humans in front of a very large display. The acquired views are used to volumetrically reconstruct and track the humans robustly and in real time, even in crowded scenes and challenging human configurations. Given the frequent and accurate monitoring of humans in space and time, a dynamic and personalized textual/graphical annotation of the display can be achieved based on the location and the walk-through trajectory of each visitor. The proposed system has been successfully deployed in an archaeological museum, offering its visitors the capability to interact with and explore a digital representation of an ancient wall painting. This installation permits an extensive evaluation of the proposed system in terms of tracking robustness, computational performance and usability. Furthermore, it proves that computer vision technology can be effectively used to support non-instrumented interaction of humans with their environments in realistic settings.
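The volumetric reconstruction step described above recovers human shape from foreground silhouettes observed by multiple calibrated cameras, in the spirit of visual-hull methods. A minimal voxel-carving sketch of that idea is shown below; the function name, array layouts, and NumPy-based formulation are illustrative assumptions and not the authors' implementation, which runs in real time on a distributed camera network.

```python
import numpy as np

def visual_hull(voxels, cameras, silhouettes):
    """Carve a voxel grid against binary silhouettes from calibrated cameras.

    voxels:      (N, 3) array of voxel centres in world coordinates.
    cameras:     list of 3x4 projection matrices P = K [R | t].
    silhouettes: list of 2-D boolean foreground masks, one per camera.
    Returns a boolean occupancy array of length N: a voxel survives only
    if it projects inside the foreground silhouette in every view.
    """
    occupied = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])  # (N, 4) homogeneous points
    for P, mask in zip(cameras, silhouettes):
        proj = homog @ P.T                                  # (N, 3) homogeneous pixels
        px = (proj[:, 0] / proj[:, 2]).round().astype(int)  # pixel column
        py = (proj[:, 1] / proj[:, 2]).round().astype(int)  # pixel row
        h, w = mask.shape
        inside = (px >= 0) & (px < w) & (py >= 0) & (py < h)
        fg = np.zeros(len(voxels), dtype=bool)
        fg[inside] = mask[py[inside], px[inside]]
        occupied &= fg          # carve away voxels outside any silhouette
    return occupied
```

Connected components of the resulting occupancy grid can then be tracked over time to yield the per-visitor trajectories that drive the personalized display annotation.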

Keywords

Person localization · Person tracking · Camera network · Real-time volumetric reconstruction · Non-instrumented location-based interaction



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Xenophon Zabulis (1, corresponding author)
  • Dimitris Grammenos (1)
  • Thomas Sarmis (1)
  • Konstantinos Tzevanidis (1, 2)
  • Pashalis Padeleris (1)
  • Panagiotis Koutlemanis (1)
  • Antonis A. Argyros (1, 2)

  1. Institute of Computer Science, FORTH, Heraklion, Crete, Greece
  2. Computer Science Department, University of Crete, Crete, Greece
