Surface Prediction for a Single Image of Urban Scenes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9008)

Abstract

We present a novel method for recovering three-dimensional scene structure from a single image of a man-made environment. The method relies on image segmentation and on perspective cues such as lines that are parallel in space. The algorithm models a scene as a composition of surfaces (planes), each assigned to its vanishing points. The main idea is to use the planes already obtained to recover neighboring surfaces. Unlike previous approaches, which place reconstructed objects on a single base plane, our method recovers objects that lie on different levels of a scene. Furthermore, we show that our technique improves the results of other methods. For evaluation, we manually labeled two publicly available datasets. On those datasets we demonstrate the ability of our algorithm to recover scene surfaces under different conditions and show several examples of plausible scene reconstruction.
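
As a rough illustration of the perspective cue mentioned above, lines that are parallel in space project to image lines that meet at a common vanishing point. The sketch below is not the paper's algorithm; it only shows the standard homogeneous-coordinate construction for two image segments. The function names `line_through` and `vanishing_point` and the tolerance `eps` are assumptions made for this example.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous image line through two points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the lines supporting two image segments.

    Each segment is a pair of (x, y) endpoints. Returns the intersection
    as an (x, y) array, or None when the lines are also parallel in the
    image (the vanishing point is then at infinity).
    """
    v = np.cross(line_through(*seg_a), line_through(*seg_b))
    if abs(v[2]) < eps:  # hypothetical tolerance, chosen for illustration
        return None
    return v[:2] / v[2]

# Two converging segments, e.g. the top and bottom edges of a facade.
vp = vanishing_point(((0, 0), (2, 1)), ((0, 4), (2, 3)))
```

In practice many noisy segments vote for each vanishing point, so a robust estimator would replace this two-segment intersection; the sketch only makes the geometric cue concrete.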

Acknowledgements

We thank Jiri Matas for valuable comments on the paper and Evgeny Stolov for his help during the research.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Institute of Computer Mathematics and Information Technologies, Kazan Federal University, Kazan, Russian Federation
