Abstract
Visual detection and tracking of humans in complex scenes is a challenging problem with a wide range of applications, such as surveillance and human-computer interaction. In many such applications, time-synchronous views from multiple calibrated cameras are available, and human location information is desired at both the frame-view and space levels. In such scenarios, efficiently combining the strengths of face detection and person tracking is a viable approach that can provide both required levels of information and improve robustness. In this paper, we propose a novel vision system that automatically detects and tracks human faces using input from multiple calibrated cameras. The method uses an AdaBoost algorithm variant combined with mean shift tracking, applied to single camera views, for face detection and tracking, and fuses the results across multiple camera views to check for consistency and obtain a three-dimensional head estimate. We apply the proposed system to a lecture scenario in a smart room, on a corpus collected as part of the CHIL European Union integrated project. We report results on both frame-level face detection and three-dimensional head tracking. For the latter, the proposed algorithm achieves results similar to those of the IBM “PeopleVision” system.
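The abstract does not spell out how the per-camera results are fused, so the following is only a minimal sketch of one common way such a step could work, not the authors' method. It assumes that each camera's face detection has already been back-projected, using the camera calibration, into a viewing ray given by a camera center and a direction in room coordinates. The 3D head estimate is taken as the least-squares point closest to all rays, and the views are treated as consistent when every ray passes within a threshold of that point. The function names, the ray representation, and the `max_residual` value are all hypothetical.

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a, b):
    """Solve the 3x3 linear system a x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no unique 3D point")
    x = []
    for k in range(3):
        # Replace column k of a with b, then take the determinant ratio.
        ak = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        x.append(_det3(ak) / d)
    return x


def point_to_ray_distance(p, center, direction):
    """Distance from point p to the infinite line through center along direction."""
    d = _normalize(direction)
    w = [p[i] - center[i] for i in range(3)]
    t = sum(w[i] * d[i] for i in range(3))
    perp = [w[i] - t * d[i] for i in range(3)]
    return math.sqrt(sum(x * x for x in perp))


def fuse_rays(rays, max_residual=0.2):
    """Fuse per-camera viewing rays into one 3D point.

    rays: list of (center, direction) pairs, one per camera (assumed input).
    Returns (point, consistent), where consistent is True when every ray
    passes within max_residual (hypothetical threshold, room units) of point.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for center, direction in rays:
        d = _normalize(direction)
        for i in range(3):
            for j in range(3):
                # (I - d d^T) projects onto the plane perpendicular to the ray.
                p_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += p_ij
                b[i] += p_ij * center[j]
    point = _solve3(a, b)
    residuals = [point_to_ray_distance(point, c, d) for c, d in rays]
    return point, max(residuals) <= max_residual
```

For example, two rays that both pass through the same head position yield that position and a consistent result, while adding a ray from a spurious detection that points elsewhere makes the check fail. A real system would also need the calibration-based back-projection and a policy for discarding the inconsistent view, neither of which is shown here.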
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Z., Potamianos, G., Senior, A., Chu, S., Huang, T.S. (2005). A Joint System for Person Tracking and Face Detection. In: Sebe, N., Lew, M., Huang, T.S. (eds) Computer Vision in Human-Computer Interaction. HCI 2005. Lecture Notes in Computer Science, vol 3766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573425_5
DOI: https://doi.org/10.1007/11573425_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29620-1
Online ISBN: 978-3-540-32129-3
eBook Packages: Computer Science, Computer Science (R0)