Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes

Wojek, Christian; Roth, Stefan; Schindler, Konrad; Schiele, Bernt

doi:10.1007/978-3-642-15561-1_34

Christian Wojek^19,20,
Stefan Roth¹⁹,
Konrad Schindler^19,21 &
…
Bernt Schiele^19,20

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 6314))

Included in the following conference series:

European Conference on Computer Vision

12k Accesses
24 Citations

Abstract

Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In this paper, we present a novel probabilistic 3D scene model that encompasses multi-class object detection, object tracking, scene labeling, and 3D geometric relations. This integrated 3D model is able to represent complex interactions like inter-object occlusion, physical exclusion between objects, and geometric context. Inference allows to recover 3D scene context and perform 3D multiobject tracking from a mobile observer, for objects of multiple categories, using only monocular video as input. In particular, we show that a joint scene tracklet model for the evidence collected over multiple frames substantially improves performance. The approach is evaluated for two different types of challenging onboard sequences. We first show a substantial improvement to the state-of-the-art in 3D multi-people tracking. Moreover, a similar performance gain is achieved for multi-class 3D tracking of cars and trucks on a new, challenging dataset.

Download to read the full chapter text

Chapter PDF

Visual tracking in complex scenes through pixel-wise tri-modeling

Article 23 January 2015

SpOT: Spatiotemporal Modeling for 3D Object Tracking

Occlusion and Motion Reasoning for Long-Term Tracking

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

Ess, A., Leibe, B., Schindler, K., Van Gool, L.: Robust multi-person tracking from a mobile platform. PAMI 31 (2009)
Google Scholar
Gavrila, D.M., Munder, S.: Multi-cue pedestrian detection and tracking from a moving vehicle. In: IJCV, vol. 73 (2007)
Google Scholar
Okuma, K., Taleghani, A., de Freitas, N., Little, J., Lowe, D.: A boosted particle filter: Multitarget detection and tracking. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3021, pp. 28–39. Springer, Heidelberg (2004)
Chapter Google Scholar
Kaucic, R., Perera, A.G., Brooksby, G., Kaufhold, J., Hoogs, A.: A unified framework for tracking through occlusions and across sensor gaps. In: CVPR (2005)
Google Scholar
Hoiem, D., Efros, A.A., Hebert, M.: Putting objects in perspective. In: IJCV, vol. 80 (2008)
Google Scholar
Torralba, A.: Contextual priming for object detection. In: IJCV, vol. 53 (2003)
Google Scholar
Shotton, J., Winn, J., Rother, C., Criminisi, A.: TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 1–15. Springer, Heidelberg (2006)
Chapter Google Scholar
Ess, A., Müller, T., Grabner, H., Van Gool, L.: Segmentation-based urban traffic scene understanding. In: BMVC (2009)
Google Scholar
Brostow, G., Shotton, J., Fauqueur, J., Cipolla, R.: Segmentation and recognition using SfM point clouds. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 44–57. Springer, Heidelberg (2008)
Chapter Google Scholar
Tu, Z., Chen, X., Yuille, A.L., Zhu, S.C.: Image parsing: Unifying segmentation, detection, and recognition. IJCV 63 (2005)
Google Scholar
Shashua, A., Gdalyahu, Y., Hayun, G.: Pedestrian detection for driving assistance systems: Single-frame classification and system level performance. In: IVS (2004)
Google Scholar
Breitenstein, M.D., Reichlin, F., Leibe, B., Koller-Meier, E., Van Gool, L.: Robust tracking-by-detection using a detector confidence particle filter. In: ICCV (2009)
Google Scholar
Huang, C., Wu, B., Nevatia, R.: Robust object tracking by hierarchical association of detection responses. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part II. LNCS, vol. 5303, pp. 788–801. Springer, Heidelberg (2008)
Chapter Google Scholar
Li, Y., Huang, C., Nevatia, R.: Learning to associate: HybridBoosted multi-target tracker for crowded scene. In: CVPR (2009)
Google Scholar
Khan, Z., Balch, T., Dellaert, F.: MCMC-based particle filtering for tracking a variable number of interacting targets. PAMI 27 (2005)
Google Scholar
Zhao, T., Nevatia, R., Wu, B.: Segmentation and tracking of multiple humans in crowded environments. PAMI 30 (2008)
Google Scholar
Isard, M., MacCormick, J.: BraMBLe: a Bayesian multiple-blob tracker. In: ICCV (2001)
Google Scholar
Green, P.J.: Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 (1995)
Google Scholar
Gilks, W., Richardson, S., Spiegelhalter, D. (eds.): Markov Chain Monte Carlo in Practice. Chapman and Hall, Boca Raton (1995)
Google Scholar
Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: CVPR (2009)
Google Scholar
Dalal, N.: Finding People in Images and Videos. PhD thesis, Institut National Polytechnique de Grenoble (2006)
Google Scholar
Torralba, A., Murphy, K., Freeman, W.: Sharing visual features for multiclass and multiview object detection. PAMI 29 (2007)
Google Scholar
Wojek, C., Schiele, B.: A dynamic conditional random field model for joint labeling of object and scene classes. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part IV. LNCS, vol. 5305, pp. 733–747. Springer, Heidelberg (2008)
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Computer Science Department, TU Darmstadt,
Christian Wojek, Stefan Roth, Konrad Schindler & Bernt Schiele
MPI Informatics, Saarbrücken
Christian Wojek & Bernt Schiele
Photogrammetry and Remote Sensing Group, ETH Zürich,
Konrad Schindler

Authors

Christian Wojek
View author publications
You can also search for this author in PubMed Google Scholar
Stefan Roth
View author publications
You can also search for this author in PubMed Google Scholar
Konrad Schindler
View author publications
You can also search for this author in PubMed Google Scholar
Bernt Schiele
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

1 Electronic Supplementary Material

Electronic Supplementary Material (14,468 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wojek, C., Roth, S., Schindler, K., Schiele, B. (2010). Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15561-1_34

Download citation

DOI: https://doi.org/10.1007/978-3-642-15561-1_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15560-4
Online ISBN: 978-3-642-15561-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes

Abstract

Chapter PDF

Similar content being viewed by others

Visual tracking in complex scenes through pixel-wise tri-modeling

SpOT: Spatiotemporal Modeling for 3D Object Tracking

Occlusion and Motion Reasoning for Long-Term Tracking

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

1 Electronic Supplementary Material

Electronic Supplementary Material (14,468 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes

Abstract

Chapter PDF

Similar content being viewed by others

Visual tracking in complex scenes through pixel-wise tri-modeling

SpOT: Spatiotemporal Modeling for 3D Object Tracking

Occlusion and Motion Reasoning for Long-Term Tracking

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

1 Electronic Supplementary Material

Electronic Supplementary Material (14,468 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation