A Multiview Approach to Tracking People in Crowded Scenes Using a Planar Homography Constraint

  • Saad M. Khan
  • Mubarak Shah
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3954)


Occlusion and lack of visibility in dense, crowded scenes make it very difficult to track individual people correctly and consistently. This problem is particularly hard to tackle in single-camera systems. We present a multi-view approach to tracking people in crowded scenes where people may partially or completely occlude each other. Our approach uses multiple views in synergy, so that information from all views is combined to detect objects. To achieve this, we present a novel planar homography constraint that resolves occlusions and robustly determines the locations on the ground plane corresponding to the feet of the people. To find tracks, we obtain feet regions over a window of frames and stack them to create a space-time volume. Feet regions belonging to the same person form contiguous spatio-temporal regions, which are clustered using a graph cuts segmentation approach. Each cluster is the track of a person, and a slice in time of this cluster gives the tracked location. Experimental results are shown for scenes of dense crowds where severe occlusions are quite common. The algorithm is able to accurately track people in all views while maintaining correct correspondences across views. Our algorithm is ideally suited to conditions in which occlusions between people would seriously hamper tracking performance, or in which there simply are not enough features to distinguish between different people.
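The abstract does not spell out how the homography constraint is evaluated, so the following is only a minimal sketch of the general idea, under stated assumptions: a reference-view pixel is treated as a feet location if, after being mapped into every other view through the ground-plane homography, it lands on foreground in all of them. Fusing the views by multiplying foreground likelihoods, the nearest-pixel lookup, the threshold, and the names `apply_homography` and `fuse_views` are all illustrative choices rather than the authors' implementation. The toy data has a person whose feet agree across both views while the upper body does not, because points off the ground plane do not obey the ground-plane homography.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # 3x3 homography, row-major (assumed layout)


def apply_homography(H: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map the point (x, y) through H using homogeneous coordinates."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def fuse_views(fg_maps, homographies, height, width, threshold=0.5):
    """Return reference-view pixels (row, col) that are foreground in every view.

    fg_maps[i][row][col] is a foreground likelihood in [0, 1] for view i.
    homographies[i] maps reference-view pixels (x=col, y=row) into view i
    via the ground plane; the reference view itself uses the identity.
    """
    feet: List[Tuple[int, int]] = []
    for row in range(height):
        for col in range(width):
            score = 1.0
            for fg, H in zip(fg_maps, homographies):
                u, v = apply_homography(H, col, row)
                r, c = int(round(v)), int(round(u))  # nearest-pixel lookup
                inside = 0 <= r < len(fg) and 0 <= c < len(fg[0])
                score *= fg[r][c] if inside else 0.0
            if score > threshold:
                feet.append((row, col))
    return feet


# Hypothetical 4x4 toy data. View 0 is the reference view.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
SHIFT_X = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]  # ground plane shifted one pixel in x

fg0 = [[0, 0, 0, 0],
       [0, 1, 0, 0],   # upper body
       [0, 1, 0, 0],   # feet
       [0, 0, 0, 0]]
fg1 = [[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 1],   # feet at col 2; body falls elsewhere due to parallax
       [0, 0, 0, 0]]

feet = fuse_views([fg0, fg1], [IDENTITY, SHIFT_X], height=4, width=4)
print(feet)  # [(2, 1)]
```

Only the feet pixel survives the fusion: the upper-body pixel at row 1 of the reference view maps to background in the second view, so its product is zero. Stacking such per-frame detections over a window of frames would give the space-time volume that the abstract describes clustering into tracks.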


Keywords: Ground Plane; Foreground Object; Foreground Region; Ground Location; Dense Crowd
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Saad M. Khan (1)
  • Mubarak Shah (1)
  1. University of Central Florida, USA
