High-Level Fusion of Depth and Intensity for Pedestrian Classification

  • Marcus Rohrbach
  • Markus Enzweiler
  • Dariu M. Gavrila
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5748)


This paper presents a novel approach to pedestrian classification based on a high-level fusion of depth and intensity cues. Instead of using depth information only in a pre-processing step, we propose to extract discriminative spatial features (gradient orientation histograms and local receptive fields) directly from dense depth and intensity images. Each modality is represented in its own feature space, in which a discriminative model is learned to distinguish between pedestrians and non-pedestrians. Rather than constructing a joint feature space, we fuse depth and intensity at the classifier level.
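As a rough illustration of extracting the same kind of spatial feature from each modality separately, the sketch below computes a single-cell gradient orientation histogram. It is a simplified stand-in under stated assumptions, not the authors' feature extractor: the function name `orientation_histogram`, the bin count, the central-difference gradient, and the toy 5×5 patches are all hypothetical choices made for the example.

```python
import math


def orientation_histogram(image, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `image` is a list of equal-length rows of numbers (intensity or depth).
    Returns an L1-normalised list of `n_bins` values.
    """
    rows, cols = len(image), len(image[0])
    hist = [0.0] * n_bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the image gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned orientation in [0, pi).
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[bin_index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy patches (assumed data): a horizontal ramp and a vertical ramp.
intensity_patch = [[x for x in range(5)] for _ in range(5)]
depth_patch = [[y for _ in range(5)] for y in range(5)]

# Each modality keeps its own feature vector; nothing is concatenated here.
intensity_features = orientation_histogram(intensity_patch)
depth_features = orientation_histogram(depth_patch)
```

The two feature vectors stay separate so that one classifier can be trained per modality, which is the setting the abstract describes.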

Our experiments on a large real-world dataset demonstrate a significant performance improvement of the combined intensity-depth representation over depth-only and intensity-only models (a fourfold reduction in false positives at comparable detection rates). Moreover, high-level fusion outperforms low-level fusion in a joint feature space.
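The difference between the two fusion strategies can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the example posterior values, and the choice of maximum, sum, and product rules are assumptions drawn from standard classifier-combining practice, and the abstract does not say which rule performed best.

```python
def high_level_fusion(p_intensity, p_depth, rule="sum"):
    """Combine per-modality pedestrian posteriors into one score.

    Each input is the output of a separate classifier trained on one
    modality; only the classifier outputs are combined.
    """
    if rule == "max":
        return max(p_intensity, p_depth)
    if rule == "sum":
        # Sum rule, written as the mean so the score stays in [0, 1].
        return 0.5 * (p_intensity + p_depth)
    if rule == "product":
        # Unnormalised product of the two posteriors.
        return p_intensity * p_depth
    raise ValueError(f"unknown fusion rule: {rule}")


def low_level_fusion(intensity_features, depth_features):
    """Joint feature space: concatenate features for a single classifier."""
    return list(intensity_features) + list(depth_features)


# Hypothetical classifier outputs for one candidate window.
p_intensity, p_depth = 0.9, 0.4
fused_scores = {
    rule: high_level_fusion(p_intensity, p_depth, rule)
    for rule in ("max", "sum", "product")
}
```

With high-level fusion, each classifier works in a feature space of its own dimensionality, whereas the concatenated vector from `low_level_fusion` forces one classifier to learn in the larger joint space.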


Depth Image; Fusion Rule; Pedestrian Detection; Dense Stereo; Maximum Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Marcus Rohrbach (1, 3)
  • Markus Enzweiler (2)
  • Dariu M. Gavrila (1, 4)
  1. Environment Perception, Group Research, Daimler AG, Ulm, Germany
  2. Image & Pattern Analysis Group, Dept. of Math. and Computer Science, Univ. of Heidelberg, Germany
  3. Dept. of Computer Science, TU Darmstadt, Germany
  4. Intelligent Systems Lab, Fac. of Science, Univ. of Amsterdam, The Netherlands
