Multimedia Tools and Applications, Volume 74, Issue 11, pp 4161–4185

Cell-based visual surveillance with active cameras for 3D human gaze computation

  • Zhaozheng Hu
  • Takashi Matsuyama
  • Shohei Nobuhara


Capturing well-calibrated, fine-resolution video images with good visual coverage of objects in a wide space is a challenging task for visual surveillance. Although active cameras are an emerging solution, they suffer from difficult online camera calibration, mechanical delays, motion blur, and dynamic backgrounds that hinder detection algorithms. This paper proposes a cell-based visual surveillance system using N (N ≥ 2) active cameras. We propose the camera scan speed map (CSSM) to handle the practical mechanical delay problem in active camera system design. We formulate the three mutually coupled problems of camera layout, partition of the surveillance space into a cell sequence, and camera parameter control as a single optimization problem that maximizes object resolution subject to constraints such as system mechanical delay, full visual coverage, and minimum object resolution. The optimization problem is solved with a full (exhaustive) search. A cell-based calibration method is proposed to compute both the intrinsic and extrinsic parameters of the active cameras for each cell. With the proposed system, the foreground object is detected from motion and appearance features and tracked by dynamically switching two groups of cameras across cells. The proposed algorithms and system were validated in an indoor surveillance experiment in which the space was partitioned into four cells and monitored by two active cameras, one per group. The active cameras were configured with the optimized pan, tilt, and zoom parameters for each cell, and each camera was calibrated with the cell-based method for each configured setting. The algorithms and system were applied to monitor people moving freely within the space.
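The paper's "full searching" optimization can be illustrated with a minimal sketch: for each cell, exhaustively try discretized pan/tilt/zoom settings and keep the one maximizing object resolution while satisfying the coverage, mechanical-delay, and minimum-resolution constraints. The discretization, the `resolution`, `covers_cell`, and `scan_time` models, and all numeric thresholds below are illustrative assumptions, not the paper's actual formulation.

```python
# Hedged sketch of an exhaustive search over per-cell camera configurations.
# All models and constants here are toy assumptions for illustration.
from itertools import product

PAN = range(-60, 61, 10)   # candidate pan angles (degrees)
TILT = range(-30, 31, 10)  # candidate tilt angles (degrees)
ZOOM = range(1, 6)         # candidate zoom factors

MAX_DELAY = 0.5   # s, allowed mechanical repositioning time (assumed)
MIN_RES = 20.0    # minimum acceptable object resolution (assumed units)

def resolution(zoom, cell):
    # Toy model: resolution grows with zoom, shrinks with cell distance.
    return 10.0 * zoom / cell["distance"]

def covers_cell(pan, tilt, zoom, cell):
    # Toy visibility test: higher zoom narrows the field of view.
    fov = 60.0 / zoom
    return (abs(pan - cell["azimuth"]) <= fov / 2
            and abs(tilt - cell["elevation"]) <= fov / 2)

def scan_time(prev, cur, speed=120.0):
    # Time to pan/tilt from the previous configuration (deg / (deg/s)).
    return max(abs(cur[0] - prev[0]), abs(cur[1] - prev[1])) / speed

def best_config(cell, prev=(0, 0, 1)):
    """Full search: best (pan, tilt, zoom) for one cell under constraints."""
    best, best_res = None, -1.0
    for pan, tilt, zoom in product(PAN, TILT, ZOOM):
        if not covers_cell(pan, tilt, zoom, cell):
            continue  # full-coverage constraint
        if scan_time(prev, (pan, tilt)) > MAX_DELAY:
            continue  # mechanical-delay constraint
        res = resolution(zoom, cell)
        if res >= MIN_RES and res > best_res:
            best, best_res = (pan, tilt, zoom), res
    return best, best_res

cell = {"azimuth": 20, "elevation": 0, "distance": 2.0}
print(best_config(cell))  # → ((20, 0, 5), 25.0)
```

The exhaustive search is tractable because each cell's parameter grid is small and the cells are optimized independently once the cell partition is fixed.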
The system captures well-calibrated video images with good resolution and visual coverage against a static background, supporting automatic object detection and tracking. It outperformed traditional single-camera and multiple-fixed-camera systems in terms of image resolution and surveillance coverage. We further demonstrated that advanced 3D features, such as 3D gaze, can be computed from the captured high-quality images for intelligent surveillance.
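The hand-over behind "dynamically switching two groups of cameras across cells" can be sketched as follows: while one group observes the object's current cell, the idle group repositions to the predicted next cell so that the mechanical delay is hidden from the tracker. The `CameraGroup` class, the cell IDs, and the trivial "next cell in sequence" prediction are assumptions for illustration.

```python
# Hedged sketch of two-group camera switching across cells.
# Cell IDs, the prediction rule, and CameraGroup are illustrative assumptions.
class CameraGroup:
    def __init__(self, name, cell=None):
        self.name, self.cell = name, cell

    def point_at(self, cell):
        self.cell = cell  # a real system would drive pan/tilt/zoom motors here

def track(cell_sequence, groups):
    """Alternate two groups along the object's cell sequence."""
    log = []
    active, standby = groups
    for i, cell in enumerate(cell_sequence):
        if active.cell != cell:                 # object crossed a cell boundary
            active, standby = standby, active   # switch to the pre-aimed group
        next_cell = cell_sequence[min(i + 1, len(cell_sequence) - 1)]
        standby.point_at(next_cell)             # reposition while the other observes
        log.append((cell, active.name))
    return log

groups = (CameraGroup("A", cell=0), CameraGroup("B"))
print(track([0, 0, 1, 2], groups))  # → [(0, 'A'), (0, 'A'), (1, 'B'), (2, 'A')]
```

The key design point is that repositioning always happens on the standby group, so the active group delivers blur-free images of the current cell at all times.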


Keywords: Cell-based surveillance · Active cameras · Camera layout · Cell partition · Cell-based calibration



The work presented in this paper was sponsored by grants from the National Natural Science Foundation of China (NSFC) (No. 51208168), the Tianjin Natural Science Foundation (No. 13JCYBJC37700), the Youth Top-Notch Talent Plan of Hebei Province, China, and the Grant-in-Aid for Scientific Research Program (No. 10049) from the Japan Society for the Promotion of Science (JSPS).



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Zhaozheng Hu (1, 2)
  • Takashi Matsuyama (2)
  • Shohei Nobuhara (2)

  1. ITS Research Center, Wuhan University of Technology, Wuhan, China
  2. Graduate School of Informatics, Kyoto University, Kyoto, Japan
