3D Urban Scene Modeling Integrating Recognition and Reconstruction

Cornelis, Nico; Leibe, Bastian; Cornelis, Kurt; Van Gool, Luc

doi:10.1007/s11263-007-0081-9

Published: 02 October 2007

Volume 78, pages 121–141, (2008)

International Journal of Computer Vision

Nico Cornelis¹, Bastian Leibe², Kurt Cornelis¹ & Luc Van Gool¹,²

Abstract

Supplying realistically textured 3D city models at ground level promises to be useful for pre-visualizing upcoming traffic situations in car navigation systems. Because this pre-visualization can be rendered from the expected future viewpoints of the driver, the required maneuver will be more easily understandable. 3D city models can be reconstructed from the imagery recorded by surveying vehicles. The vastness of image material gathered by these vehicles, however, puts extreme demands on vision algorithms to ensure their practical usability. Algorithms need to be as fast as possible and should result in compact, memory-efficient 3D city models for future ease of distribution and visualization. For the considered application, these are not contradictory demands. Simplified geometry assumptions can speed up vision algorithms while automatically guaranteeing compact geometry models. In this paper, we present a novel city modeling framework which builds upon this philosophy to create 3D content at high speed.

Objects in the environment, such as cars and pedestrians, may however disturb the reconstruction, as they violate the simplified geometry assumptions, leading to visually unpleasant artifacts and degrading the visual realism of the resulting 3D city model. Unfortunately, such objects are prevalent in urban scenes. We therefore extend the reconstruction framework by integrating it with an object recognition module that automatically detects cars in the input video streams and localizes them in 3D. The two components of our system are tightly integrated and benefit from each other’s continuous input. 3D reconstruction delivers geometric scene context, which greatly helps improve detection precision. The detected car locations, on the other hand, are used to instantiate virtual placeholder models which augment the visual realism of the reconstructed city model.
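
To illustrate how geometric scene context can improve detection precision, here is a minimal sketch of one common way to use a reconstructed ground plane: reject car detections whose implied real-world height is implausible. It is an illustration under stated assumptions, not the method of the paper. It assumes a level pinhole camera at a known height above a flat ground plane; the names `Detection`, `ground_distance`, `filter_by_ground_plane` and the height bounds are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    y_bottom: float   # image row of the box's lower edge (rows grow downward)
    height_px: float  # box height in pixels
    score: float      # detector confidence


def ground_distance(y_bottom: float, focal_px: float,
                    cam_height_m: float, horizon_row: float) -> Optional[float]:
    """Depth of the ground point seen at row y_bottom.

    Assumes a level pinhole camera above a flat ground plane, so a ground
    point at depth Z projects to row horizon_row + focal_px * cam_height_m / Z.
    """
    dy = y_bottom - horizon_row
    if dy <= 0:
        return None  # at or above the horizon: cannot rest on the ground plane
    return focal_px * cam_height_m / dy


def filter_by_ground_plane(dets: List[Detection], focal_px: float,
                           cam_height_m: float, horizon_row: float,
                           min_h_m: float = 1.0,
                           max_h_m: float = 2.2) -> List[Detection]:
    """Keep detections whose implied metric height lies in [min_h_m, max_h_m].

    The bounds are illustrative values for passenger cars, not taken from the paper.
    """
    kept = []
    for det in dets:
        z = ground_distance(det.y_bottom, focal_px, cam_height_m, horizon_row)
        if z is None:
            continue
        height_m = det.height_px * z / focal_px  # back-project the box height
        if min_h_m <= height_m <= max_h_m:
            kept.append(det)
    return kept


if __name__ == "__main__":
    candidates = [
        Detection(y_bottom=360.0, height_px=120.0, score=0.9),  # plausible car
        Detection(y_bottom=200.0, height_px=50.0, score=0.8),   # above horizon
        Detection(y_bottom=300.0, height_px=400.0, score=0.7),  # far too tall
    ]
    for det in filter_by_ground_plane(candidates, 800.0, 1.5, 240.0):
        print(det)
```

A hard threshold is the simplest choice here; a system that integrates the two components more tightly could instead turn the size consistency into a soft prior on the detection score.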

Author information

Authors and Affiliations

KU Leuven, Leuven, Belgium
Nico Cornelis, Kurt Cornelis & Luc Van Gool
ETH Zurich, Zurich, Switzerland
Bastian Leibe & Luc Van Gool

Corresponding author

Correspondence to Nico Cornelis.

About this article

Cite this article

Cornelis, N., Leibe, B., Cornelis, K. et al. 3D Urban Scene Modeling Integrating Recognition and Reconstruction. Int J Comput Vis 78, 121–141 (2008). https://doi.org/10.1007/s11263-007-0081-9

Received: 19 March 2007
Accepted: 06 August 2007
Published: 02 October 2007
Issue Date: July 2008
DOI: https://doi.org/10.1007/s11263-007-0081-9
