Planar Structures from Line Correspondences in a Manhattan World

  • Conference paper
  • Computer Vision – ACCV 2014 (ACCV 2014)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9003)

Abstract

Traditional structure from motion is hard in indoor environments with only a few detectable point features. These environments, however, have other useful characteristics: they often contain several visible lines, and their layout typically conforms to a Manhattan world geometry. We introduce a new algorithm to cluster visible lines in a Manhattan world, seen from two different viewpoints, into coplanar bundles. This algorithm is based on the notion of a “characteristic line”, which is an invariant of a set of parallel coplanar lines. Finding coplanar sets of lines becomes a problem of clustering characteristic lines, which can be accomplished using a modified mean shift procedure. The algorithm is computationally light and produces good results in real-world situations.
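
To give a concrete sense of the clustering step only (this is not the authors' implementation), the sketch below runs a plain Gaussian-kernel mean shift over a set of points and groups the points that converge to the same mode. The fixed-size point parameterization of characteristic lines, the bandwidth, and all names are assumptions made for illustration.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=0.5, n_iter=100, tol=1e-6):
    """Plain mean shift with a Gaussian kernel (illustrative only).

    points : (N, D) array; here each row would hold some fixed-size
             parameterization of one characteristic line (an assumption,
             not the paper's exact representation).
    Returns an (N, D) array of converged mode locations.
    """
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        new_modes = np.empty_like(modes)
        for i, m in enumerate(modes):
            d2 = np.sum((points - m) ** 2, axis=1)     # squared distances to m
            w = np.exp(-0.5 * d2 / bandwidth ** 2)     # Gaussian kernel weights
            new_modes[i] = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.max(np.linalg.norm(new_modes - modes, axis=1)) < tol:
            return new_modes
        modes = new_modes
    return modes

def group_by_mode(modes, merge_radius=1e-3):
    """Label points whose converged modes coincide (same coplanar bundle)."""
    modes = np.asarray(modes, dtype=float)
    labels = -np.ones(len(modes), dtype=int)
    next_label = 0
    for i, m in enumerate(modes):
        if labels[i] >= 0:
            continue
        close = np.linalg.norm(modes - m, axis=1) < merge_radius
        labels[close] = next_label
        next_label += 1
    return labels
```

The paper uses a modified mean shift tailored to characteristic lines; the generic version above only conveys the mode-seeking idea behind the clustering.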

Notes

  1. A vector is represented by an arrowed symbol (\(\vec n\)) when the frame of reference is immaterial, and by a boldface symbol (\(\mathbf n\)) when expressed in terms of a frame of reference.

Acknowledgement

The project described was supported by Grant Number 1R21EY021643-01 from NEI/NIH. The authors would like to thank Ali Elqursh and Ahmed Elgammal for providing the implementation of their method.

Author information

Correspondence to Chelhwon Kim or Roberto Manduchi.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (pdf 11,702 KB)

Appendix: Characteristic Planes Revisited

We present here a different derivation of the characteristic planes concept, obtained through algebraic manipulations. For simplicity’s sake, we will restrict our attention to one canonical plane \(\varPi _i\), assuming that both cameras are located on it. A 2-D reference system is centered at the first camera. In this 2-D world, each camera only sees an image line, and the cameras’ relative pose is specified by the (unknown) 2-D vector \(\mathbf{t}\) and the (known) 2-D rotation matrix \(\mathbf R\). We’ll assume that both cameras have identity calibration matrices. Consider a plane \(\varPi _j\) with (known) normal \(\mathbf{n}_j\), orthogonal to \(\varPi _i\). A line \(\mathcal{L}\) in \(\varPi _j\) intersects \(\varPi _i\) at one point, \(\mathbf{X}\). Note that, from the image of this point in the first camera and knowledge of the plane normal \(\mathbf{n}_j\), one can recover \(\mathbf{X}/d\), where \(d\) is the (unknown) distance of \(\varPi _j\) from the first camera. Let \(\hat{\mathbf{x}}_2\) be the location of the projection of \(\mathbf X\) in the second camera’s (line) image, expressed in homogeneous coordinates. From Fig. 1 one easily sees that \(\lambda \hat{\mathbf{x}}_2 = \mathbf{R} \mathbf{X} + \mathbf{t}\) for some \(\lambda \), and thus

$$\begin{aligned} \mathbf{t}/d = \lambda \hat{\mathbf{x}}_2/d - \mathbf{R}\mathbf{X}/d = \lambda_2 \hat{\mathbf{x}}_2 - (\mathbf{R}\mathbf{X})_{\bot}/d \end{aligned}$$
(6)

for some \(\lambda_2\), where \((\mathbf{R}\mathbf{X})_{\bot}=\mathbf{R}\mathbf{X} - (\hat{\mathbf{x}}_2^T\mathbf{R}\mathbf{X}) \hat{\mathbf{x}}_2 /(\hat{\mathbf{x}}_2^T \hat{\mathbf{x}}_2)\) is the component of \(\mathbf{R}\mathbf{X}\) orthogonal to \(\hat{\mathbf{x}}_2\). This imposes a linear constraint on \(\mathbf{t}/d\). It is not difficult to see that \(\hat{\mathbf{x}}_2\) is orthogonal to the lever vector \(\vec u_2\) in Fig. 1, and that \(\Vert (\mathbf{R}\mathbf{X})_{\bot}/d\Vert \) is equal to the modulus of the sine ratio for the line \(\mathcal{L}\) as seen by the two cameras. Hence, the linear constraint in (6) is simply an expression of the intersection of the characteristic plane \(\varPi (\mathcal{L},{\vec n}_j)\) with \(\varPi _i\).
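
For concreteness, here is a minimal numerical sketch of the constraint implied by (6), assuming 2-D NumPy vectors; the function and variable names are illustrative and not from the paper. Projecting (6) onto the direction orthogonal to \(\hat{\mathbf{x}}_2\) eliminates the unknown scale \(\lambda_2\) and leaves one linear equation in the unknown \(\mathbf{t}/d\) per line correspondence.

```python
import numpy as np

def characteristic_line_constraint(R, X_over_d, x2_hat):
    """Return (u, c) such that u . (t/d) = c, the linear constraint of Eq. (6).

    R        : known 2x2 relative rotation between the two cameras
    X_over_d : 2-D vector X/d recovered from the first image and n_j
    x2_hat   : homogeneous (2-D) image coordinate of X in the second camera
    """
    # Unit vector orthogonal to x2_hat; dotting Eq. (6) with it removes
    # the unknown scale lambda_2.
    u = np.array([-x2_hat[1], x2_hat[0]], dtype=float)
    u /= np.linalg.norm(u)

    RX_over_d = R @ np.asarray(X_over_d, dtype=float)
    # u . (t/d) = -u . (R X)_perp / d = -u . (R X / d), because the part of
    # R X along x2_hat is orthogonal to u by construction.
    c = -float(u @ RX_over_d)
    return u, c

# Toy usage with an arbitrary 10-degree relative rotation.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
u, c = characteristic_line_constraint(R, [2.0, 1.0], np.array([1.0, 0.3]))
# All values of t/d consistent with this line correspondence lie on the
# 2-D line { v : u @ v == c }, i.e. the intersection of the characteristic
# plane with Pi_i.
```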

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kim, C., Manduchi, R. (2015). Planar Structures from Line Correspondences in a Manhattan World. In: Cremers, D., Reid, I., Saito, H., Yang, M.H. (eds.) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol. 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_33

  • DOI: https://doi.org/10.1007/978-3-319-16865-4_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16864-7

  • Online ISBN: 978-3-319-16865-4

  • eBook Packages: Computer Science, Computer Science (R0)
