
Joint Estimation of Camera Orientation and Vanishing Points from an Image Sequence in a Non-Manhattan World

Abstract

A widely used approach for estimating camera orientation is to use the points at infinity, i.e., the vanishing points (VPs). Enforcing the orthogonality constraint between the VPs, known as the Manhattan world constraint, enables drift-free estimation of the camera orientation. However, in practical applications, this approach is neither effective (because of noisy parallel line segments) nor applicable in non-Manhattan world scenes. To overcome these limitations, we propose a novel method that jointly estimates the VPs and camera orientation based on sequential Bayesian filtering. The proposed method does not require the Manhattan world assumption and estimates the camera orientation with high accuracy. To enhance the robustness of the joint estimation, we propose a keyframe-based feature management technique that removes false positives from parallel line clusters and detects new parallel line sets using geometric properties such as the orthogonality and rotational dependence among a VP, a line, and the camera rotation. In addition, we propose a 3-line camera rotation estimation method that does not require the Manhattan world assumption. The 3-line method is applied within a RANSAC-based outlier rejection scheme to eliminate outlier measurements; the proposed method therefore achieves accurate and robust estimation of the camera orientation and VPs in general scenes containing non-orthogonal parallel lines. We demonstrate the superiority of the proposed method through an extensive evaluation on synthetic and real datasets and through comparison with other state-of-the-art methods.



Notes

  1. These cases are discussed in Sect. 6.4.

  2. http://vi.kaist.ac.kr/open-sources/.


Acknowledgements

This work was supported by the Samsung Research Funding Center of Samsung Electronics under Project Number SRFC-TC1603-05 and by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2017M3C4A7069369).

Author information

Correspondence to Kuk-Jin Yoon.


Communicated by Edmond Boyer.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (avi 21858 KB)

Appendix: Jacobian Derivation

In the prediction step of the EKF system, the Jacobian of the system model, \(\mathbf {F}\), is defined by

$$\begin{aligned} \mathbf {F} = \frac{\partial \mathbf {f}}{\partial \mathbf {x}} = \left[ \begin{array}{cc} \frac{\partial \mathbf {f}_v}{\partial \mathbf {x}_v} &{}\quad \mathbf {0}_{7 \times 2n} \\ \mathbf {0}_{2n \times 7} &{}\quad \mathbf {I}_{2n} \end{array} \right] , \end{aligned}$$
(21)

where n is the number of VPs, \(\mathbf {0}_{a \times b}\) is an \(a \times b\) zero matrix, and \(\mathbf {I}_{c}\) is a \(c \times c\) identity matrix. The Jacobian of the camera motion model for the camera state, \(\frac{\partial \mathbf {f}_{v}}{\partial \mathbf {x}_{v}}\), is derived from Eq. (7) as

$$\begin{aligned} \frac{\partial \mathbf {f}_{v}}{\partial \mathbf {x}_{v}} = \left[ \begin{array}{cc} \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \mathbf {q}_{WC}^{old}} &{}\quad \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \omega _{C}^{old}} \\ \frac{\partial \omega _{C}^{new}}{\partial \mathbf {q}_{WC}^{old}} &{}\quad \frac{\partial \omega _{C}^{new}}{\partial \omega _{C}^{old}} \end{array} \right] = \left[ \begin{array}{cc} \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \mathbf {q}_{WC}^{old}} &{}\quad \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \omega _{C}^{old}} \\ \mathbf {0}_{3 \times 4} &{}\quad \mathbf {I}_{3} \end{array} \right] . \end{aligned}$$
(22)
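As a sanity check, the block structure of \(\mathbf {F}\) in Eqs. (21)–(22) can be assembled directly. The following NumPy sketch is not from the paper; the sub-block arguments and function name are our own, and the camera sub-blocks are passed in as placeholders:

```python
import numpy as np

def assemble_F(dq_dq, dq_dw, n_vps):
    """Assemble the EKF system Jacobian F of Eqs. (21)-(22).

    dq_dq : (4, 4) Jacobian of the new orientation w.r.t. the old one.
    dq_dw : (4, 3) Jacobian of the new orientation w.r.t. angular velocity.
    n_vps : number of VPs (each parameterized by two angles).
    """
    # Camera block (7x7): quaternion row plus constant-angular-velocity row.
    Fv = np.block([
        [dq_dq,            dq_dw],
        [np.zeros((3, 4)), np.eye(3)],
    ])
    # The VP states are static under the system model, so their block is identity.
    n2 = 2 * n_vps
    return np.block([
        [Fv,                np.zeros((7, n2))],
        [np.zeros((n2, 7)), np.eye(n2)],
    ])
```

With identity camera sub-blocks, the full Jacobian reduces to the \((7+2n)\times(7+2n)\) identity, which makes the zero-block layout easy to verify.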

Here, the Jacobian of the new camera orientation for the old camera orientation, \(\frac{\partial \mathbf {q}_{WC}^{new}}{\partial \mathbf {q}_{WC}^{old}}\), is easily computed as follows.

$$\begin{aligned} \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \mathbf {q}_{WC}^{old}} = \bar{\mathbf {Q}} \left( \mathbf {q} \left( \left( \omega _{C}^{old}+{\varvec{\Omega }} \right) \varDelta t \right) \right) , \end{aligned}$$
(23)

where the function \(\bar{\mathbf {Q}}(\mathbf {p})\) is defined as

$$\begin{aligned} \bar{\mathbf {Q}} ( \mathbf {p} ) = \left[ \begin{array}{cccc} p_{1} &{}\quad -\,p_{2} &{}\quad -\,p_{3} &{}\quad -\,p_{4} \\ p_{2} &{}\quad p_{1} &{}\quad p_{4} &{}\quad -\,p_{3} \\ p_{3} &{}\quad -\,p_{4} &{}\quad p_{1} &{}\quad p_{2} \\ p_{4} &{}\quad p_{3} &{}\quad -\,p_{2} &{}\quad p_{1} \end{array} \right] , \end{aligned}$$
(24)

where \(\mathbf {p} = \left[ p_1, p_2, p_3, p_4 \right] ^{\mathrm {T}}\). Let \(\omega _{C}^{old} = \left[ w_{1}, w_{2}, w_{3} \right] ^{\mathrm {T}}\). Then, the Jacobian \(\frac{\partial \mathbf {q}_{WC}^{new}}{\partial \omega _{C}^{old}}\) is derived as

$$\begin{aligned} \frac{\partial \mathbf {q}_{WC}^{new}}{\partial \omega _{C}^{old}} = \mathbf {Q} \left( \mathbf {q}_{WC}^{old} \right) \left[ \begin{array}{ccc} m(\omega _{C}^{old},\varDelta t,1) &{}\quad m(\omega _{C}^{old},\varDelta t,2) &{}\quad m(\omega _{C}^{old},\varDelta t,3) \\ n(\omega _{C}^{old},\varDelta t,1) &{}\quad o(\omega _{C}^{old},\varDelta t,1,2) &{}\quad o(\omega _{C}^{old},\varDelta t,1,3) \\ o(\omega _{C}^{old},\varDelta t,2,1) &{}\quad n(\omega _{C}^{old},\varDelta t,2) &{}\quad o(\omega _{C}^{old},\varDelta t,2,3) \\ o(\omega _{C}^{old},\varDelta t,3,1) &{}\quad o(\omega _{C}^{old},\varDelta t,3,2) &{}\quad n(\omega _{C}^{old},\varDelta t,3) \end{array} \right] , \end{aligned}$$
(25)

where

$$\begin{aligned} m(\omega _{C}^{old},\varDelta t,i)= & {} -\frac{w_i}{||\omega _{C}^{old} ||} \sin \left( \frac{||\omega _{C}^{old} ||\varDelta t}{2} \right) \frac{\varDelta t}{2} , \end{aligned}$$
(26)
$$\begin{aligned} n(\omega _{C}^{old},\varDelta t,i)= & {} \frac{||\omega _{C}^{old} ||- w_{i}^{2} / ||\omega _{C}^{old} ||}{||\omega _{C}^{old} ||^{2}} \sin \left( \frac{||\omega _{C}^{old} ||\varDelta t}{2} \right) \nonumber \\&+\, \frac{w_i^2}{||\omega _{C}^{old} ||^2} \cos \left( \frac{||\omega _{C}^{old} ||\varDelta t}{2} \right) \frac{\varDelta t}{2} , \end{aligned}$$
(27)

and

$$\begin{aligned} o(\omega _{C}^{old},\varDelta t,i,j)= & {} -\frac{w_{i} w_{j}}{||\omega _{C}^{old} ||^{3}} \sin \left( \frac{||\omega _{C}^{old} ||\varDelta t}{2} \right) \nonumber \\&+\, \frac{w_i w_j}{||\omega _{C}^{old} ||^2} \cos \left( \frac{||\omega _{C}^{old} ||\varDelta t}{2} \right) \frac{\varDelta t}{2} . \ \ \end{aligned}$$
(28)
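The closed forms (26)–(28) are the partial derivatives of the increment quaternion \(\mathbf {q}(\omega _{C}^{old}\varDelta t)\) with respect to the angular velocity. A minimal NumPy sketch (our own naming; \({\varvec{\Omega }}\) is taken as zero for illustration) checks them against central finite differences:

```python
import numpy as np

def q_of(w, dt):
    """Increment quaternion q(w*dt), scalar first:
    [cos(|w|dt/2), (w/|w|) sin(|w|dt/2)]."""
    th = np.linalg.norm(w) * dt / 2.0
    return np.hstack([np.cos(th), w / np.linalg.norm(w) * np.sin(th)])

def m(w, dt, i):                      # Eq. (26): d q_1 / d w_i
    nw = np.linalg.norm(w)
    return -w[i] / nw * np.sin(nw * dt / 2) * dt / 2

def n(w, dt, i):                      # Eq. (27): d q_{i+1} / d w_i
    nw = np.linalg.norm(w)
    return ((nw - w[i]**2 / nw) / nw**2 * np.sin(nw * dt / 2)
            + w[i]**2 / nw**2 * np.cos(nw * dt / 2) * dt / 2)

def o(w, dt, i, j):                   # Eq. (28): d q_{i+1} / d w_j, i != j
    nw = np.linalg.norm(w)
    return (-w[i] * w[j] / nw**3 * np.sin(nw * dt / 2)
            + w[i] * w[j] / nw**2 * np.cos(nw * dt / 2) * dt / 2)

def dq_dw(w, dt):
    """The 4x3 matrix multiplying Q(q_old) in Eq. (25)."""
    J = np.empty((4, 3))
    for j in range(3):
        J[0, j] = m(w, dt, j)
        for i in range(3):
            J[i + 1, j] = n(w, dt, i) if i == j else o(w, dt, i, j)
    return J
```

Agreement with numerical differentiation of `q_of` confirms that the matrix in Eq. (25) is exactly \(\partial \mathbf {q}(\omega \varDelta t)/\partial \omega \).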

When the EKF system is updated, the computation of the Jacobian of the measurement model for the camera state, \(\mathbf {H}\), is required to correct the predicted state vector. The Jacobian is defined by concatenating the Jacobians for all the line features as follows.

$$\begin{aligned} \mathbf {H} = \left[ \begin{array}{ccccc} \frac{\partial h_{11}}{\partial \mathbf {x}} ^{\mathrm {T}}&\cdots&\frac{\partial h_{ij}}{\partial \mathbf {x}} ^{\mathrm {T}}&\cdots&\frac{\partial h_{nm}}{\partial \mathbf {x}} ^{\mathrm {T}} \end{array} \right] ^{\mathrm {T}} \end{aligned}$$
(29)

The Jacobian of the measurement model for the ith VP and the jth line feature, \(\frac{\partial h_{ij}}{\partial \mathbf {x}}\), is computed by

$$\begin{aligned} \frac{\partial h_{ij}}{\partial \mathbf {x}} = \left[ \frac{\partial h_{ij}}{\partial \mathbf {q}_{WC}} \ \ \ \frac{\partial h_{ij}}{\partial \omega _C} \ \ \ \cdots \ \ \ \frac{\partial h_{ij}}{\partial \mathbf {y}_i} \ \ \ \cdots \right] . \end{aligned}$$
(30)

When computing the Jacobian \(\frac{\partial h_{ij}}{\partial \mathbf {x}}\), the Jacobians with respect to all the VPs other than the ith one, \(\frac{\partial h_{ij}}{\partial \mathbf {y}_k}\) for \(k \ne i\), are two-dimensional zero vectors, since the measurement of a line assigned to the ith VP depends only on \(\mathbf {y}_i\).
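This sparsity means each row of \(\mathbf {H}\) can be filled block-wise. A short sketch of Eq. (30) (function name and 0-based indexing are our own, not the paper's):

```python
import numpy as np

def h_row(dh_dq, dh_dyi, i, n_vps):
    """One row of H (Eq. (30)) for the i-th VP (0-based) and one line.

    dh_dq  : length-4 Jacobian w.r.t. the camera orientation quaternion.
    dh_dyi : length-2 Jacobian w.r.t. the i-th VP parameters y_i.
    All other blocks are zero: the angular-velocity block (3 entries)
    and the blocks of the remaining VPs (2 entries each).
    """
    row = np.zeros(7 + 2 * n_vps)
    row[0:4] = dh_dq                   # orientation block
    # entries 4:7 stay zero (measurement is independent of angular velocity)
    row[7 + 2 * i : 9 + 2 * i] = dh_dyi  # i-th VP block
    return row
```

Stacking one such row per clustered line measurement yields the full \(\mathbf {H}\) of Eq. (29).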

Let a quaternion be \(\mathbf {q} = \left[ q_1, q_2, q_3, q_4 \right] ^{\mathrm {T}}\).

The Jacobian of the measurement model for the camera orientation, \(\frac{\partial h_{ij}}{\partial \mathbf {q}_{WC}}\), is derived using Eq. (8) as

$$\begin{aligned} \frac{\partial h_{ij}}{\partial \mathbf {q}_{WC}} = \left[ \frac{\partial h_{ij}}{\partial q_1} \ \ \ \frac{\partial h_{ij}}{\partial q_2} \ \ \ \frac{\partial h_{ij}}{\partial q_3} \ \ \ \frac{\partial h_{ij}}{\partial q_4} \right] , \end{aligned}$$
(31)

where

$$\begin{aligned} \frac{\partial h_{ij}}{\partial q_1}= & {} 2 \mathbf {d}_i^{\mathrm {T}} \left[ \begin{array}{ccc} q_1 &{}\quad -\,q_4 &{}\quad q_3 \\ q_4 &{}\quad q_1 &{}\quad -\,q_2 \\ -\,q_3 &{}\quad q_2 &{} \quad q_1 \end{array} \right] \mathbf {n}_{ij} , \end{aligned}$$
(32)
$$\begin{aligned} \frac{\partial h_{ij}}{\partial q_2}= & {} 2 \mathbf {d}_i^{\mathrm {T}} \left[ \begin{array}{ccc} q_2 &{}\quad q_3 &{}\quad q_4 \\ q_3 &{}\quad -\,q_2 &{}\quad -\,q_1 \\ q_4 &{}\quad q_1 &{} \quad -\,q_2 \end{array} \right] \mathbf {n}_{ij} , \end{aligned}$$
(33)
$$\begin{aligned} \frac{\partial h_{ij}}{\partial q_3}= & {} 2 \mathbf {d}_i^{\mathrm {T}} \left[ \begin{array}{ccc} -\,q_3 &{} \quad q_2 &{}\quad q_1 \\ q_2 &{}\quad q_3 &{}\quad q_4 \\ -\,q_1 &{} \quad q_4 &{}\quad -\,q_3 \end{array} \right] \mathbf {n}_{ij} , \end{aligned}$$
(34)

and

$$\begin{aligned} \frac{\partial h_{ij}}{\partial q_4}= & {} 2 \mathbf {d}_i^{\mathrm {T}} \left[ \begin{array}{ccc} -\,q_4 &{}\quad -\,q_1 &{}\quad q_2 \\ q_1 &{}\quad -\,q_4 &{}\quad q_3 \\ q_2 &{}\quad q_3 &{}\quad q_4 \end{array} \right] \mathbf {n}_{ij}. \end{aligned}$$
(35)
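These quaternion derivatives are the partials of \(h_{ij} = \mathbf {d}_i^{\mathrm {T}} \mathbf {R}(\mathbf {q}) \mathbf {n}_{ij}\), with \(\mathbf {R}\) in its unnormalized quadratic form. A NumPy sketch (our own naming; a scalar-first quaternion is assumed) verifies them against finite differences:

```python
import numpy as np

def R_of(q):
    """Rotation matrix from a scalar-first quaternion (unnormalized form)."""
    a, b, c, d = q
    return np.array([
        [a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)],
        [2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)],
        [2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d],
    ])

def dh_dq(q, di, nij):
    """Analytic gradient of h = di^T R(q) nij w.r.t. q (cf. Eqs. (31)-(35))."""
    a, b, c, d = q
    dR = [  # dR/dq_k divided by 2, for k = 1..4
        np.array([[ a, -d,  c], [ d,  a, -b], [-c,  b,  a]]),
        np.array([[ b,  c,  d], [ c, -b, -a], [ d,  a, -b]]),
        np.array([[-c,  b,  a], [ b,  c,  d], [-a,  d, -c]]),
        np.array([[-d, -a,  b], [ a, -d,  c], [ b,  c,  d]]),
    ]
    return np.array([2.0 * di @ M @ nij for M in dR])
```

Because \(h\) is quadratic in \(\mathbf {q}\), central differences reproduce the analytic gradient to rounding error, which makes this a tight consistency check on the four matrices.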

The Jacobian of the measurement model for the angular velocity, \(\frac{\partial h_{ij}}{\partial \omega _{C}}\), is a three-dimensional row vector with zero components since the measurement model does not involve the variables of the angular velocity. The Jacobian of the measurement model for the ith VD, \(\frac{\partial h_{ij}}{\partial \mathbf {y}_i}\), is derived as

$$\begin{aligned} \frac{\partial h_{ij}}{\partial \mathbf {y}_i} = \frac{\partial h_{ij}}{\partial \mathbf {d}_i} \frac{\partial \mathbf {d}_{i}}{\partial \mathbf {y}_i} , \end{aligned}$$
(36)

where the first factor on the right-hand side, \(\frac{\partial h_{ij}}{\partial \mathbf {d}_i}\), is computed using Eq. (8) as

$$\begin{aligned} \frac{\partial h_{ij}}{\partial \mathbf {d}_i} = \left( \mathbf {R} ( \mathbf {q}_{WC} ) \mathbf {n}_{ij} \right) ^{\mathrm {T}} , \end{aligned}$$
(37)

and the second factor, \( \frac{\partial \mathbf {d}_{i}}{\partial \mathbf {y}_i}\), is computed using Eq. (5) as follows.

$$\begin{aligned} \frac{\partial \mathbf {d}_{i}}{\partial \mathbf {y}_i} = \left[ \begin{array}{cc} -\,\cos \phi _i\sin \theta _i &{}\quad -\,\sin \phi _i\cos \theta _i \\ \cos \phi _i\cos \theta _i &{}\quad -\,\sin \phi _i\sin \theta _i \\ 0 &{}\quad \cos \phi _i \end{array} \right] \end{aligned}$$
(38)
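Eq. (38) follows from differentiating the spherical parameterization of the vanishing direction. Assuming \(\mathbf {d}_i = \left[ \cos \phi _i \cos \theta _i, \ \cos \phi _i \sin \theta _i, \ \sin \phi _i \right] ^{\mathrm {T}}\) (the form implied by Eq. (38); Eq. (5) itself lies outside this excerpt), a quick numerical check:

```python
import numpy as np

def d_of(theta, phi):
    """Vanishing direction from azimuth/elevation (assumed form of Eq. (5))."""
    return np.array([np.cos(phi) * np.cos(theta),
                     np.cos(phi) * np.sin(theta),
                     np.sin(phi)])

def dd_dy(theta, phi):
    """Eq. (38): Jacobian of d_i w.r.t. y_i = (theta_i, phi_i)."""
    return np.array([
        [-np.cos(phi) * np.sin(theta), -np.sin(phi) * np.cos(theta)],
        [ np.cos(phi) * np.cos(theta), -np.sin(phi) * np.sin(theta)],
        [ 0.0,                          np.cos(phi)],
    ])
```

The two columns are \(\partial \mathbf {d}_i / \partial \theta _i\) and \(\partial \mathbf {d}_i / \partial \phi _i\), and both match central finite differences of `d_of`.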


About this article


Cite this article

Lee, J., Yoon, K. Joint Estimation of Camera Orientation and Vanishing Points from an Image Sequence in a Non-Manhattan World. Int J Comput Vis 127, 1426–1442 (2019). https://doi.org/10.1007/s11263-019-01196-y


Keywords

  • Camera orientation estimation
  • Vanishing point estimation
  • Kalman filter
  • Rotation estimation