Planar Structures from Line Correspondences in a Manhattan World

  • Conference paper
  • Computer Vision – ACCV 2014 (ACCV 2014)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9003)

Abstract

Traditional structure from motion is hard in indoor environments with only a few detectable point features. These environments, however, have other useful characteristics: they often contain several visible lines, and their layout typically conforms to a Manhattan world geometry. We introduce a new algorithm to cluster visible lines in a Manhattan world, seen from two different viewpoints, into coplanar bundles. This algorithm is based on the notion of a “characteristic line”, which is an invariant of a set of parallel coplanar lines. Finding coplanar sets of lines becomes a problem of clustering characteristic lines, which can be accomplished using a modified mean shift procedure. The algorithm is computationally light and produces good results in real-world situations.
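
To give a concrete sense of the clustering step only (this is not the authors' implementation), the sketch below runs a plain Gaussian-kernel mean shift over a set of points and groups the points that converge to the same mode. The fixed-size point parameterization of characteristic lines, the bandwidth, and all names are assumptions made for illustration.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=0.5, n_iter=100, tol=1e-6):
    """Plain mean shift with a Gaussian kernel (illustrative only).

    points : (N, D) array; here each row would hold some fixed-size
             parameterization of one characteristic line (an assumption,
             not the paper's exact representation).
    Returns an (N, D) array of converged mode locations.
    """
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        new_modes = np.empty_like(modes)
        for i, m in enumerate(modes):
            d2 = np.sum((points - m) ** 2, axis=1)     # squared distances to m
            w = np.exp(-0.5 * d2 / bandwidth ** 2)     # Gaussian kernel weights
            new_modes[i] = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.max(np.linalg.norm(new_modes - modes, axis=1)) < tol:
            return new_modes
        modes = new_modes
    return modes

def group_by_mode(modes, merge_radius=1e-3):
    """Label points whose converged modes coincide (same coplanar bundle)."""
    modes = np.asarray(modes, dtype=float)
    labels = -np.ones(len(modes), dtype=int)
    next_label = 0
    for i, m in enumerate(modes):
        if labels[i] >= 0:
            continue
        close = np.linalg.norm(modes - m, axis=1) < merge_radius
        labels[close] = next_label
        next_label += 1
    return labels
```

The paper uses a modified mean shift tailored to characteristic lines; the generic version above only conveys the mode-seeking idea behind the clustering.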

Notes

  1. A vector is represented by an arrowed symbol (\(\vec n\)) when the frame of reference is immaterial, and by a boldface symbol (\(\mathbf n\)) when expressed in terms of a frame of reference.

Acknowledgement

The project described was supported by Grant Number 1R21EY021643-01 from NEI/NIH. The authors would like to thank Ali Elqursh and Ahmed Elgammal for providing the implementation of their method.

Author information

Correspondence to Chelhwon Kim or Roberto Manduchi.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (pdf 11,702 KB)

Appendix: Characteristic Planes Revisited

We present here a different derivation of the characteristic planes concept, obtained through algebraic manipulations. For simplicity’s sake, we will restrict our attention to one canonical plane \(\varPi _i\), assuming that both cameras are located on it. A 2-D reference system is centered at the first camera. In this 2-D world, each camera only sees an image line, and the cameras’ relative pose is specified by the (unknown) 2-D vector \(\mathbf{t}\) and the (known) 2-D rotation matrix \(\mathbf R\). We’ll assume that both cameras have identity calibration matrices. Consider a plane \(\varPi _j\) with (known) normal \(\mathbf{n}_j\), orthogonal to \(\varPi _i\). A line \(\mathcal{L}\) in \(\varPi _j\) intersects \(\varPi _i\) at one point, \(\mathbf{X}\). Note that, from the image of this point in the first camera and knowledge of the plane normal \(\mathbf{n}_j\), one can recover \(\mathbf{X}/d\), where \(d\) is the (unknown) distance of \(\varPi _j\) from the first camera. Let \(\hat{\mathbf{x}}_2\) be the location of the projection of \(\mathbf X\) in the second camera’s (line) image, expressed in homogeneous coordinates. From Fig. 1 one easily sees that \(\lambda \hat{\mathbf{x}}_2 = \mathbf{R} \mathbf{X} + \mathbf{t}\) for some \(\lambda \), and thus

$$\begin{aligned} \mathbf{t}/d = \lambda \hat{\mathbf{x}}_2/d - \mathbf{R}\mathbf{X}/d = \lambda_2 \hat{\mathbf{x}}_2 - (\mathbf{R}\mathbf{X})_{\bot}/d \end{aligned}$$
(6)

for some \(\lambda_2\), where \((\mathbf{R}\mathbf{X})_{\bot}=\mathbf{R}\mathbf{X} - (\hat{\mathbf{x}}_2^T\mathbf{R}\mathbf{X}) \hat{\mathbf{x}}_2 /(\hat{\mathbf{x}}_2^T \hat{\mathbf{x}}_2)\) is the component of \(\mathbf{R}\mathbf{X}\) orthogonal to \(\hat{\mathbf{x}}_2\). This imposes a linear constraint on \(\mathbf{t}/d\). It is not difficult to see that \(\hat{\mathbf{x}}_2\) is orthogonal to the lever vector \(\vec u_2\) in Fig. 1, and that \(\Vert (\mathbf{R}\mathbf{X})_{\bot}/d\Vert \) is equal to the modulus of the sine ratio for the line \(\mathcal{L}\) as seen by the two cameras. Hence, the linear constraint in (6) is simply an expression of the intersection of the characteristic plane \(\varPi (\mathcal{L},{\vec n}_j)\) with \(\varPi _i\).
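
For concreteness, here is a minimal numerical sketch of the constraint implied by (6), assuming 2-D NumPy vectors; the function and variable names are illustrative and not from the paper. Projecting (6) onto the direction orthogonal to \(\hat{\mathbf{x}}_2\) eliminates the unknown scale \(\lambda_2\) and leaves one linear equation in the unknown \(\mathbf{t}/d\) per line correspondence.

```python
import numpy as np

def characteristic_line_constraint(R, X_over_d, x2_hat):
    """Return (u, c) such that u . (t/d) = c, the linear constraint of Eq. (6).

    R        : known 2x2 relative rotation between the two cameras
    X_over_d : 2-D vector X/d recovered from the first image and n_j
    x2_hat   : homogeneous (2-D) image coordinate of X in the second camera
    """
    # Unit vector orthogonal to x2_hat; dotting Eq. (6) with it removes
    # the unknown scale lambda_2.
    u = np.array([-x2_hat[1], x2_hat[0]], dtype=float)
    u /= np.linalg.norm(u)

    RX_over_d = R @ np.asarray(X_over_d, dtype=float)
    # u . (t/d) = -u . (R X)_perp / d = -u . (R X / d), because the part of
    # R X along x2_hat is orthogonal to u by construction.
    c = -float(u @ RX_over_d)
    return u, c

# Toy usage with an arbitrary 10-degree relative rotation.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
u, c = characteristic_line_constraint(R, [2.0, 1.0], np.array([1.0, 0.3]))
# All values of t/d consistent with this line correspondence lie on the
# 2-D line { v : u @ v == c }, i.e. the intersection of the characteristic
# plane with Pi_i.
```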

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kim, C., Manduchi, R. (2015). Planar Structures from Line Correspondences in a Manhattan World. In: Cremers, D., Reid, I., Saito, H., Yang, M.H. (eds.) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol. 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_33

  • DOI: https://doi.org/10.1007/978-3-319-16865-4_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16864-7

  • Online ISBN: 978-3-319-16865-4

  • eBook Packages: Computer Science, Computer Science (R0)
