High-Level Fusion of Depth and Intensity for Pedestrian Classification

Rohrbach, Marcus; Enzweiler, Markus; Gavrila, Dariu M.

doi:10.1007/978-3-642-03798-6_11

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5748)

Included in the following conference series:

Joint Pattern Recognition Symposium

Abstract

This paper presents a novel approach to pedestrian classification which involves a high-level fusion of depth and intensity cues. Instead of utilizing depth information only in a pre-processing step, we propose to extract discriminative spatial features (gradient orientation histograms and local receptive fields) directly from (dense) depth and intensity images. Both modalities are represented in terms of individual feature spaces, in each of which a discriminative model is learned to distinguish between pedestrians and non-pedestrians. We refrain from the construction of a joint feature space, but instead employ a high-level fusion of depth and intensity at classifier-level.

Our experiments on a large real-world dataset demonstrate a significant performance improvement of the combined intensity-depth representation over depth-only and intensity-only models (a fourfold reduction in false positives at comparable detection rates). Moreover, high-level fusion outperforms low-level fusion using a joint feature space approach.
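
The distinction the abstract draws between high-level (classifier-level) and low-level (joint feature space) fusion can be illustrated with a minimal sketch. The abstract does not state which classifier or combination rule the authors use, so everything below is an assumption made for illustration: the toy logistic scorers, the weighted sum rule, and the names `linear_scorer`, `high_level_fusion` and `low_level_fusion` are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def linear_scorer(weights: Vector, bias: float) -> Scorer:
    """Return a toy per-modality classifier with outputs in (0, 1).

    Stand-in for whatever discriminative model is trained on one modality;
    the logistic form is an assumption, not the paper's classifier.
    """
    def score(features: Vector) -> float:
        if len(features) != len(weights):
            raise ValueError("feature/weight length mismatch")
        z = sum(w * x for w, x in zip(weights, features)) + bias
        return 1.0 / (1.0 + math.exp(-z))  # logistic squashing
    return score


def high_level_fusion(
    scorers: Sequence[Scorer],
    features_per_modality: Sequence[Vector],
    modality_weights: Optional[Sequence[float]] = None,
) -> float:
    """Fuse at classifier level: score each modality, then combine scores.

    Uses a weighted sum rule (assumed); each modality keeps its own
    feature space and its own classifier.
    """
    if len(scorers) != len(features_per_modality):
        raise ValueError("need one feature vector per classifier")
    if modality_weights is None:
        modality_weights = [1.0] * len(scorers)
    scores = [clf(f) for clf, f in zip(scorers, features_per_modality)]
    total = sum(modality_weights)
    return sum(w * s for w, s in zip(modality_weights, scores)) / total


def low_level_fusion(features_per_modality: Sequence[Vector]) -> List[float]:
    """Fuse at feature level: concatenate into one joint feature vector."""
    return [x for features in features_per_modality for x in features]


# Toy example with made-up weights and features (not real data).
intensity_clf = linear_scorer([1.0, -1.0], bias=0.0)
depth_clf = linear_scorer([2.0], bias=0.0)
intensity_features = [0.5, 0.5]
depth_features = [1.0]

fused = high_level_fusion(
    [intensity_clf, depth_clf], [intensity_features, depth_features]
)
joint = low_level_fusion([intensity_features, depth_features])
```

In the high-level variant the depth and intensity classifiers never see each other's features; only their scores meet. In the low-level variant a single classifier would have to be trained on `joint`, whose dimensionality is the sum of the per-modality dimensionalities.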

Author information

Authors and Affiliations

Environment Perception, Group Research, Daimler AG, Ulm, Germany
Marcus Rohrbach & Dariu M. Gavrila
Image & Pattern Analysis Group, Dept. of Math. and Computer Science, Univ. of Heidelberg, Germany
Markus Enzweiler
Dept. of Computer Science, TU Darmstadt, Germany
Marcus Rohrbach
Intelligent Systems Lab, Fac. of Science, Univ. of Amsterdam, The Netherlands
Dariu M. Gavrila

Editor information

Editors and Affiliations

Digitale Bildverarbeitung, Universität Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
Joachim Denzler & Herbert Süße
Fraunhofer-Institut für Angewandte Optik und Feinmechanik, Albert-Einstein-Str. 7, 07745, Jena, Germany
Gunther Notni

About this paper

Cite this paper

Rohrbach, M., Enzweiler, M., Gavrila, D.M. (2009). High-Level Fusion of Depth and Intensity for Pedestrian Classification. In: Denzler, J., Notni, G., Süße, H. (eds) Pattern Recognition. DAGM 2009. Lecture Notes in Computer Science, vol 5748. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03798-6_11

DOI: https://doi.org/10.1007/978-3-642-03798-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03797-9
Online ISBN: 978-3-642-03798-6
eBook Packages: Computer Science, Computer Science (R0)
