International Journal of Computer Vision, Volume 75, Issue 2, pp 247–266

Detection and Tracking of Multiple, Partially Occluded Humans by Bayesian Combination of Edgelet based Part Detectors

  • Bo Wu
  • Ram Nevatia


Detection and tracking of humans in video streams is important for many applications. We present an approach to automatically detect and track multiple, possibly partially occluded humans in a walking or standing pose from a single camera, which may be stationary or moving. A human body is represented as an assembly of body parts. Part detectors are learned by boosting a number of weak classifiers which are based on edgelet features. Responses of part detectors are combined to form a joint likelihood model that includes an analysis of possible occlusions. The combined detection responses and the part detection responses provide the observations used for tracking. Trajectory initialization and termination are both automatic and rely on the confidences computed from the detection responses. An object is tracked by data association and meanshift methods. Our system can track humans with both inter-object and scene occlusions with static or non-static backgrounds. Evaluation results on a number of images and videos and comparisons with some previous methods are given.
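The data association step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how detections are matched to existing trajectories, so everything below is an assumption made for illustration: boxes are `(x1, y1, x2, y2)` tuples, the affinity is plain bounding-box overlap (IoU), the `min_iou` threshold is arbitrary, and the best matching is found by exhaustive search, which is only practical for a handful of objects. The names `iou` and `associate` are hypothetical and do not come from the paper.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Match track boxes to detection boxes, maximizing total IoU.

    Assumption for illustration only: exhaustive search over all
    one-to-one assignments. Returns a list of (track_idx, det_idx)
    pairs; pairs whose overlap is below min_iou are left unmatched.
    """
    # Each track takes one detection index, or None (no detection).
    slots = list(range(len(detections))) + [None] * len(tracks)
    best_score, best_pairs = -1.0, []
    for perm in permutations(slots, len(tracks)):
        pairs = [(t, d) for t, d in enumerate(perm)
                 if d is not None and iou(tracks[t], detections[d]) >= min_iou]
        score = sum(iou(tracks[t], detections[d]) for t, d in pairs)
        if score > best_score:
            best_score, best_pairs = score, pairs
    return best_pairs


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(associate(tracks, detections))
```

In a tracker of this general kind, a detection left unmatched would be a candidate for starting a new trajectory, and a track left unmatched would be a candidate for a fallback tracker such as mean shift, or eventually for termination. How the paper itself makes those decisions is not stated in the abstract beyond their reliance on detection confidences.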


Keywords: human detection, human tracking, AdaBoost





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, USA
