
SmartCamera: a low-cost and intelligent camera management system

Multimedia Tools and Applications

Abstract

Intelligent camera management systems have been developed to automatically record meetings for videoconferencing. These systems offer many benefits, such as reducing production cost and conveniently documenting events. However, automatically recorded videos are generally not visually engaging. This paper presents a novel approach that intelligently controls camera shots and angles to improve visual interest. We use 3D infrared images captured by a Kinect sensor to recognize active speakers and their positions in a meeting. A movable camera, constructed by mounting a wireless PTZ (pan-tilt-zoom) camera on a motorized rail, automatically changes its position to frame an active speaker in the center of the screen. A speaker can switch video sources through gesture-based commands without interrupting the meeting. We have summarized and implemented a set of heuristic rules that simulate a human director. These rules can be visually edited through a graphical user interface, and this customization of the virtual director makes our system applicable to various scenarios. We conducted a user study, and the evaluation results supported the quality of the automatically produced video.
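The abstract does not list the heuristic rules themselves, so the following is only an illustrative sketch of how a rule-based virtual director of this kind might be structured. Every name, threshold, and rule in it (`DirectorState`, `MIN_SHOT_SECONDS`, the rail travel range, the source names `"rail_cam"` and `"overview"`) is a hypothetical assumption, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical parameters; the paper's actual values are not given in the abstract.
MIN_SHOT_SECONDS = 4.0            # assumed minimum time to hold a shot
RAIL_MIN_M, RAIL_MAX_M = 0.0, 2.0  # assumed travel range of the motorized rail


@dataclass
class DirectorState:
    current_source: str     # video source that is currently live
    seconds_on_shot: float  # how long the current shot has been held


def rail_target(speaker_x_m: float) -> float:
    """Clamp the speaker's lateral position to the rail's travel range,
    so the camera moves as close to the speaker as the rail allows."""
    return max(RAIL_MIN_M, min(RAIL_MAX_M, speaker_x_m))


def next_source(state: DirectorState,
                active_speaker: Optional[str],
                gesture_source: Optional[str]) -> str:
    """Pick the next video source using a few assumed director rules."""
    # Rule 1 (assumed): an explicit gesture command overrides everything.
    if gesture_source is not None:
        return gesture_source
    # Rule 2 (assumed): hold the current shot for a minimum duration.
    if state.seconds_on_shot < MIN_SHOT_SECONDS:
        return state.current_source
    # Rule 3 (assumed): with no active speaker, fall back to an overview shot.
    if active_speaker is None:
        return "overview"
    # Rule 4 (assumed): otherwise follow the speaker with the movable camera.
    return "rail_cam"
```

Ordering the rules by priority keeps each one independent, which is one plausible way to let a graphical editor add, remove, or reorder rules without touching the others.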

Acknowledgments

We thank the volunteers who participated in this study and the anonymous reviewers for their insightful and constructive comments, which significantly improved the presentation. This work was supported in part by NSF under grant CNS-1126570.

Author information

Corresponding author

Correspondence to Jun Kong.

About this article

Cite this article

Roudaki, A., Kong, J. & Reetz, S. SmartCamera: a low-cost and intelligent camera management system. Multimed Tools Appl 75, 7831–7854 (2016). https://doi.org/10.1007/s11042-015-2700-8

  • DOI: https://doi.org/10.1007/s11042-015-2700-8
