Real Time Person Tracking and Behavior Interpretation in Multi Camera Scenarios Applying Homography and Coupled HMMs

  • Dejan Arsić
  • Björn Schuller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6800)

Abstract

Video surveillance systems have been introduced in various fields of daily life to enhance security and protect individuals and sensitive infrastructure. Until now, they have mostly been used as a forensic tool for after-the-fact investigations and are commonly monitored by human operators. A further gain in safety can only be achieved by implementing fully automated surveillance systems that assist human operators. In this work we present an integrated, real-time capable system that uses multi-camera person tracking, which is required to resolve heavy occlusions, to monitor individuals in complex scenes. The resulting trajectories are further analyzed for so-called Low Level Activities, such as walking, running, and stationarity, by applying HMMs. These activities are then used for the behavior interpretation task, along with motion features gathered throughout the tracking process. An approach based on coupled HMMs is used to model High Level Activities such as robberies at ATMs and luggage-related scenarios.

Keywords

Ground Plane; Dynamic Bayesian Network; Visual Hull; Connected Component Analysis; Foreground Segmentation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Dejan Arsić (1)
  • Björn Schuller (2)
  1. Müller BBM Vibroakustiksysteme GmbH, Planegg, Germany
  2. Institute for Human-Machine Communication, Technische Universität München, Germany