Abstract
This paper presents two multimodal systems for tracking multiple users in smart environments. The first is a multi-view particle filter tracker that uses foreground and color features together with special upper-body detection and person-region features. The second is a wide-angle overhead-view person tracker that relies on foreground segmentation and model-based blob tracking. Both systems are complemented by a source localizer, based on a joint probabilistic data association filter, that takes its input from several microphone arrays. The first system fuses audio and visual cues at the feature level, whereas the second combines them at the decision level using state-based heuristics.
The systems are designed to estimate the 3D scene locations of room occupants. They are evaluated on their precision in estimating person locations, their accuracy in recognizing person configurations, and their ability to keep track identities consistent over time.
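The difference between the two fusion levels can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian likelihoods, the sigma values, the confidence threshold, and all function names are assumptions chosen for illustration. In the feature-level variant, both cues contribute to the weight of every particle in a single filter; in the decision-level variant, each modality produces its own estimate and a simple rule selects between them.

```python
import math


def gaussian_likelihood(particle, observation, sigma):
    """Unnormalized isotropic Gaussian likelihood of a 3D observation
    given a particle position (hypothetical observation model)."""
    sq_dist = sum((p - o) ** 2 for p, o in zip(particle, observation))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def feature_level_estimate(particles, visual_obs, audio_obs,
                           sigma_visual=0.3, sigma_audio=0.6):
    """Feature-level fusion: each particle weight is the product of the
    visual and the audio likelihood; the estimate is the weighted mean."""
    weights = [
        gaussian_likelihood(p, visual_obs, sigma_visual)
        * gaussian_likelihood(p, audio_obs, sigma_audio)
        for p in particles
    ]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0] * len(particles)
        total = float(len(particles))
    return tuple(
        sum(w * p[i] for w, p in zip(weights, particles)) / total
        for i in range(3)
    )


def decision_level_estimate(visual_est, audio_est, visual_conf,
                            speaker_active, min_visual_conf=0.5):
    """Decision-level fusion: pick one modality's finished estimate with
    a simple rule (an assumed stand-in for state-based heuristics)."""
    if speaker_active and visual_conf < min_visual_conf:
        return audio_est
    return visual_est
```

With two particles placed symmetrically around a common observation, `feature_level_estimate` returns the midpoint, while `decision_level_estimate` returns the audio estimate only when a speaker is active and the visual confidence is below the threshold.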
The trackers are extensively tested and compared, for each separate modality and for the combined modalities, on the CLEAR 2007 Evaluation Database.
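Evaluation criteria of the kind named above are commonly quantified in the CLEAR evaluations by two scores: MOTP (mean localization error over matched person-hypothesis pairs) and MOTA (one minus the ratio of misses, false positives, and identity mismatches to the number of ground-truth persons). The sketch below assumes these two metrics and assumes that the per-frame matching between hypotheses and ground truth has already been done; the dictionary keys and the function name are hypothetical.

```python
def clear_mot_scores(frames):
    """Compute (MOTP, MOTA) from per-frame matching results.

    Each frame is a dict with the assumed keys:
      'distances'        - localization errors of matched pairs
      'misses'           - ground-truth persons with no hypothesis
      'false_positives'  - hypotheses with no ground-truth person
      'mismatches'       - identity switches in this frame
      'num_ground_truth' - ground-truth persons present
    """
    total_distance = sum(sum(f["distances"]) for f in frames)
    total_matches = sum(len(f["distances"]) for f in frames)
    total_errors = sum(
        f["misses"] + f["false_positives"] + f["mismatches"] for f in frames
    )
    total_ground_truth = sum(f["num_ground_truth"] for f in frames)

    # Both scores are undefined when their denominator is zero.
    motp = total_distance / total_matches if total_matches else float("nan")
    mota = (1.0 - total_errors / total_ground_truth
            if total_ground_truth else float("nan"))
    return motp, mota


frames = [
    {"distances": [100.0, 200.0], "misses": 0, "false_positives": 0,
     "mismatches": 0, "num_ground_truth": 2},
    {"distances": [150.0], "misses": 1, "false_positives": 1,
     "mismatches": 0, "num_ground_truth": 2},
]
motp, mota = clear_mot_scores(frames)
```

For the two example frames, the matched errors sum to 450 over three matches and there are two errors against four ground-truth persons, so the MOTP is 150.0 (in whatever unit the distances use) and the MOTA is 0.5.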
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bernardin, K., Gehrig, T., Stiefelhagen, R. (2008). Multi-level Particle Filter Fusion of Features and Cues for Audio-Visual Person Tracking. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT CLEAR 2007 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_5
DOI: https://doi.org/10.1007/978-3-540-68585-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68584-5
Online ISBN: 978-3-540-68585-2
eBook Packages: Computer Science, Computer Science (R0)