Abstract
In this paper we present a novel method for recovering a three-dimensional scene from a single image of a man-made environment. We use image segmentation and perspective cues such as lines that are parallel in space. The algorithm models a scene as a composition of surfaces (planes), each associated with its vanishing points. The main idea is to exploit already recovered planes to recover neighboring surfaces. Unlike previous approaches, which place reconstructed objects on a single base plane, our method recovers objects that lie on different levels of a scene. Furthermore, we show that our technique improves the results of other methods. For evaluation we manually labeled two publicly available datasets. On those datasets we demonstrate the ability of our algorithm to recover scene surfaces under different conditions and show several examples of plausible scene reconstruction.
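As a brief illustration of the perspective cue mentioned in the abstract: lines that are parallel in space project to image lines that meet at a common vanishing point. The sketch below is not the paper's algorithm. It only shows the standard homogeneous-coordinates computation of that intersection for two image segments. The function names `line_through` and `vanishing_point` and the tolerance `eps` are hypothetical choices made for this example.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line (a, b, c) with ax + by + c = 0 through image points p and q."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the supporting lines of two image segments.

    Each segment is a pair of (x, y) points. Returns the intersection (x, y),
    or None when the lines are also parallel in the image (the intersection
    then lies at infinity). `eps` is an illustrative tolerance, not a tuned value.
    """
    v = np.cross(line_through(*seg_a), line_through(*seg_b))
    if abs(v[2]) < eps:
        return None
    return (v[0] / v[2], v[1] / v[2])

# Two segments whose supporting lines y = x/2 and y = 4 - x/2 meet at (4, 2).
vp = vanishing_point(((0, 0), (2, 1)), ((0, 4), (2, 3)))
```

In practice a vanishing point is estimated from many noisy segments rather than two exact ones, so a robust estimator would replace this pairwise intersection.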
Acknowledgements
The author thanks Jiri Matas for valuable comments on the paper and Evgeny Stolov for help during the research.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Akhmadeev, F. (2015). Surface Prediction for a Single Image of Urban Scenes. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9008. Springer, Cham. https://doi.org/10.1007/978-3-319-16628-5_27
DOI: https://doi.org/10.1007/978-3-319-16628-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16627-8
Online ISBN: 978-3-319-16628-5
eBook Packages: Computer Science, Computer Science (R0)