Geometric Image Parsing in Man-Made Environments

  • Olga Barinova
  • Victor Lempitsky
  • Elena Tretiak
  • Pushmeet Kohli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6312)


We present a new parsing framework for the line-based geometric analysis of a single image of a man-made environment. The framework models the scene as a composition of geometric primitives spanning several layers, from low level (edges) through mid level (lines and vanishing points) to high level (the zenith and the horizon). Inference in this model jointly estimates a) the grouping of edges into straight lines, b) the grouping of lines into parallel families, and c) the position of the horizon and the zenith in the image. This unified treatment allows uncertainty information to propagate between the layers of the model. It contrasts with most previous approaches to the same problem, which either ignore the middle level (lines) altogether or use a bottom-up, step-by-step pipeline.
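The geometric relationship between the layers named above can be illustrated with elementary projective geometry. The following is a minimal sketch, not the inference procedure of the paper: it assumes the edges have already been grouped into lines and the lines into two parallel families of horizontal scene directions, and it ignores noise and uncertainty entirely. All function names and the pixel coordinates are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Point = Tuple[float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p: Point, q: Point) -> Vec3:
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersection(l1: Vec3, l2: Vec3) -> Vec3:
    """Homogeneous intersection of two lines; a zero last entry means a point at infinity."""
    return cross(l1, l2)


def to_pixel(p: Vec3, eps: float = 1e-12) -> Point:
    """Convert a finite homogeneous point to pixel coordinates."""
    if abs(p[2]) < eps:
        raise ValueError("point at infinity: the lines are parallel in the image")
    return (p[0] / p[2], p[1] / p[2])


# Hypothetical family A: images of two parallel horizontal scene lines.
a1 = line_through((200.0, 200.0), (400.0, 300.0))
a2 = line_through((200.0, 300.0), (400.0, 500.0))
# Hypothetical family B: a second horizontal direction in the scene.
b1 = line_through((400.0, 300.0), (600.0, 200.0))
b2 = line_through((400.0, 500.0), (600.0, 300.0))

# Each parallel family meets at its vanishing point.
vp_a = to_pixel(intersection(a1, a2))
vp_b = to_pixel(intersection(b1, b2))

# The horizon is the image line through the horizontal vanishing points.
horizon = line_through(vp_a, vp_b)

print(vp_a, vp_b, horizon)
```

With these coordinates the two vanishing points lie at (0, 100) and (800, 100), so the horizon is the image row y = 100. The zenith would be obtained in the same way from a family of vertical scene lines. In real images the lines of a family do not meet in a single point, which is why a method that estimates all layers jointly, as described above, has to reason about uncertainty rather than intersect lines exactly.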

For the evaluation, we consider the publicly available York Urban dataset of “Manhattan” scenes and also introduce a new, harder dataset of 103 urban outdoor images containing many non-Manhattan scenes. The comparative evaluation on the horizon estimation task demonstrates that our method attains higher accuracy and robustness than current state-of-the-art approaches.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Olga Barinova (1)
  • Victor Lempitsky (2)
  • Elena Tretiak (1)
  • Pushmeet Kohli (3)

  1. Moscow State University
  2. University of Oxford
  3. Microsoft Research Cambridge
