Abstract
This paper presents a novel approach to pedestrian classification based on a high-level fusion of depth and intensity cues. Instead of using depth information only in a pre-processing step, we propose to extract discriminative spatial features (gradient orientation histograms and local receptive fields) directly from dense depth and intensity images. Each modality is represented in its own feature space, in which a discriminative model is learned to distinguish pedestrians from non-pedestrians. Rather than constructing a joint feature space, we fuse depth and intensity at the classifier level.
Our experiments on a large real-world dataset demonstrate a significant performance improvement of the combined intensity-depth representation over depth-only and intensity-only models (a factor-of-four reduction in false positives at comparable detection rates). Moreover, high-level fusion outperforms low-level fusion based on a joint feature space.
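The difference between classifier-level (high-level) fusion and joint-feature-space (low-level) fusion can be illustrated with a minimal sketch. The abstract does not state which combination rule is used, so the sum, product, max, and min rules below are generic choices assumed for illustration only; the function names and the 0.5 threshold are likewise hypothetical.

```python
from math import prod

# Generic rules for combining per-modality posterior estimates.
# Assumption: these are illustrative; the abstract names no specific rule.
RULES = {
    "sum": lambda p: sum(p) / len(p),  # mean of the posteriors
    "product": prod,                   # product of the posteriors
    "max": max,
    "min": min,
}


def fuse_high_level(posteriors, rule="sum"):
    """Classifier-level fusion: combine one posterior per modality.

    Each entry is the pedestrian probability produced by a classifier
    trained on a single modality (e.g. intensity or depth).
    """
    if not posteriors:
        raise ValueError("need at least one posterior")
    return RULES[rule](list(posteriors))


def fuse_low_level(intensity_features, depth_features):
    """Low-level fusion: concatenate features into one joint vector.

    A single classifier would then be trained on the joint vector.
    """
    return list(intensity_features) + list(depth_features)


def classify(posteriors, rule="sum", threshold=0.5):
    """Label a sample as pedestrian if the fused score reaches the threshold."""
    return fuse_high_level(posteriors, rule) >= threshold


if __name__ == "__main__":
    # Hypothetical outputs of an intensity classifier and a depth classifier.
    scores = [0.9, 0.3]
    for name in RULES:
        print(name, fuse_high_level(scores, name))
    print("pedestrian:", classify(scores))
```

In the high-level variant each modality keeps its own feature space and its own classifier, and only the classifier outputs are merged. In the low-level variant the merge happens before any classifier is trained.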
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rohrbach, M., Enzweiler, M., Gavrila, D.M. (2009). High-Level Fusion of Depth and Intensity for Pedestrian Classification. In: Denzler, J., Notni, G., Süße, H. (eds) Pattern Recognition. DAGM 2009. Lecture Notes in Computer Science, vol 5748. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03798-6_11
DOI: https://doi.org/10.1007/978-3-642-03798-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03797-9
Online ISBN: 978-3-642-03798-6
eBook Packages: Computer Science (R0)