Abstract
Traditional structure from motion is hard in indoor environments with only a few detectable point features. These environments, however, have other useful characteristics: they often contain several visible lines, and their layout typically conforms to a Manhattan world geometry. We introduce a new algorithm to cluster visible lines in a Manhattan world, seen from two different viewpoints, into coplanar bundles. This algorithm is based on the notion of a “characteristic line”, which is an invariant of a set of parallel coplanar lines. Finding coplanar sets of lines becomes a problem of clustering characteristic lines, which can be accomplished using a modified mean shift procedure. The algorithm is computationally light and produces good results in real-world situations.
Notes
1. A vector is represented by an arrowed symbol (\(\vec n\)) when the frame of reference is immaterial, and by a boldface symbol (\(\mathbf n\)) when expressed in terms of a frame of reference.
Acknowledgement
The project described was supported by Grant Number 1R21EY021643-01 from NEI/NIH. The authors would like to thank Ali Elqursh and Ahmed Elgammal for providing the implementation of their method.
Appendix: Characteristic Planes Revisited
We present here a different derivation of the characteristic planes concept, obtained through algebraic manipulations. For simplicity’s sake, we will restrict our attention to one canonical plane \(\varPi _i\), assuming that both cameras are located on it. A 2-D reference system is centered at the first camera. In this 2-D world, each camera only sees an image line, and the cameras’ relative pose is specified by the (unknown) 2-D vector \(\mathbf{t}\) and the (known) 2-D rotation matrix \(\mathbf R\). We’ll assume that both cameras have identity calibration matrices. Consider a plane \(\varPi _j\) with (known) normal \(\mathbf{n}_j\), orthogonal to \(\varPi _i\). A line \(\mathcal{L}\) in \(\varPi _j\) intersects \(\varPi _i\) at one point, \(\mathbf{X}\). Note that, from the image of this point in the first camera and knowledge of the plane normal \(\mathbf{n}_j\), one can recover \(\mathbf{X}/d\), where \(d\) is the (unknown) distance of \(\varPi _j\) from the first camera. Let \(\hat{\mathbf{x}}_2\) be the location of the projection of \(\mathbf X\) in the second camera’s (line) image, expressed in homogeneous coordinates. From Fig. 1 one easily sees that \(\lambda \hat{\mathbf{x}}_2 = \mathbf{R} \mathbf{X} + \mathbf{t}\) for some \(\lambda \), and thus
\[ \frac{\mathbf{t}}{d} = \lambda _2 \hat{\mathbf{x}}_2 - \frac{{(\mathbf R\mathbf X)}_{\bot }}{d} \tag{6} \]
for some \(\lambda _2\), where \({(\mathbf R\mathbf X)}_{\bot }=\mathbf{R \mathbf X} - (\hat{\mathbf{x}}_2^T\mathbf{R X}) \hat{\mathbf{x}}_2 /(\hat{\mathbf{x}}_2^T \hat{\mathbf{x}}_2)\) is the component of \(\mathbf{R \mathbf X}\) orthogonal to \(\hat{\mathbf{x}}_2\). This imposes a linear constraint on \(\mathbf{t}/d\). It is not difficult to see that \(\hat{\mathbf{x}}_2\) is orthogonal to the lever vector \(\vec u_2\) in Fig. 1, and that \(\Vert {(\mathbf R\mathbf X)}_{\bot }/d\Vert \) is equal to the modulus of the sin ratio for the line \(\mathcal{L}\) seen by the two cameras. Hence, the linear constraint in (6) is simply an expression of the intersection of the characteristic plane \(\varPi (\mathcal{L},{\vec n}_j)\) with \(\varPi _i\).
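The constraint can be checked numerically. The sketch below is only an illustration under assumed values: the rotation angle, translation, plane normal, and point \(\mathbf X\) are hypothetical and not taken from the paper. It builds a 2-D scene satisfying \(\lambda \hat{\mathbf{x}}_2 = \mathbf{R} \mathbf{X} + \mathbf{t}\), computes \({(\mathbf R\mathbf X)}_{\bot }/d\) from quantities the text treats as known (\(\mathbf{X}/d\), \(\mathbf R\), \(\hat{\mathbf{x}}_2\)), and verifies that \(\mathbf{t}/d + {(\mathbf R\mathbf X)}_{\bot }/d\) is parallel to \(\hat{\mathbf{x}}_2\), which is the linear constraint on \(\mathbf{t}/d\).

```python
import numpy as np

def rotation_2d(theta):
    """2-D rotation matrix for an angle theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def orthogonal_component(v, x_hat):
    """Component of v orthogonal to x_hat (x_hat need not be unit length)."""
    return v - (x_hat @ v) / (x_hat @ x_hat) * x_hat

def cross_2d(a, b):
    """Scalar 2-D cross product; zero when a and b are parallel."""
    return a[0] * b[1] - a[1] * b[0]

# Hypothetical scene (all values are assumptions chosen for illustration).
R = rotation_2d(0.3)            # known relative rotation
t = np.array([0.5, -0.2])       # translation, unknown to the algorithm
n_j = np.array([0.0, 1.0])      # known normal of the plane Pi_j
X = np.array([1.0, 3.0])        # intersection of the line L with Pi_i
d = float(n_j @ X)              # distance of Pi_j from the first camera

# Quantity recoverable from the first image and n_j.
X_over_d = X / d

# Homogeneous image point in the second camera; the factor 0.7 is an
# arbitrary rescaling, since homogeneous coordinates have no fixed scale.
x_hat2 = 0.7 * (R @ X + t)

# (R X)_perp / d, computed from known quantities only.
RX_perp_over_d = orthogonal_component(R @ X_over_d, x_hat2)

# The constraint: t/d + (R X)_perp / d must be parallel to x_hat2.
residual = cross_2d(t / d + RX_perp_over_d, x_hat2)

# The scalar lambda_2 for which t/d = lambda_2 * x_hat2 - (R X)_perp / d.
lambda_2 = (x_hat2 @ (t / d + RX_perp_over_d)) / (x_hat2 @ x_hat2)

print("residual:", residual)
print("lambda_2:", lambda_2)
```

The residual should vanish up to floating-point error for any choice of the rescaling factor, which reflects the fact that the constraint fixes \(\mathbf{t}/d\) only up to a displacement along \(\hat{\mathbf{x}}_2\).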
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kim, C., Manduchi, R. (2015). Planar Structures from Line Correspondences in a Manhattan World. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_33
DOI: https://doi.org/10.1007/978-3-319-16865-4_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science (R0)