Abstract
This paper presents a method for the rectification of planar objects. Based on the 2D Manhattan world assumption (i.e., the majority of line segments are aligned with two principal axes), we develop a cost function whose minimization yields a rectifying transform. We parameterize the homography with camera parameters and design a cost function that measures the axis-alignment of line segments. Since line segment detection inevitably produces outliers, we also develop an iterative optimization scheme for robust estimation. Experimental results on a range of images containing planar objects show that our method performs rectification robustly and accurately.
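The homography parameterization mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the common pure-rotation model \(\mathrm {H} = \mathrm {K}\mathrm {R}\mathrm {K}^{-1}\) with \(\mathrm {K} = \mathrm {diag}(f, f, 1)\) (principal point at the origin) and a Z-Y-X Euler-angle convention for the three rotation angles \(\theta _1, \theta _2, \theta _3\); the paper's exact conventions may differ.

```python
import math

def matmul(a, b):
    """3x3 matrix product (row-major lists of lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(t1, t2, t3):
    """R = Rz(t3) Ry(t2) Rx(t1); the angle convention is an assumption."""
    cx, sx = math.cos(t1), math.sin(t1)
    cy, sy = math.cos(t2), math.sin(t2)
    cz, sz = math.cos(t3), math.sin(t3)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))

def homography(t1, t2, t3, f):
    """H = K R K^{-1} with K = diag(f, f, 1): the homography induced by
    a pure camera rotation at focal length f."""
    K = [[f, 0, 0], [0, f, 0], [0, 0, 1]]
    Kinv = [[1.0 / f, 0, 0], [0, 1.0 / f, 0], [0, 0, 1]]
    return matmul(K, matmul(rotation(t1, t2, t3), Kinv))
```

With all angles zero the rotation is the identity and the homography reduces to the identity, regardless of \(f\); the four scalars \((\theta _1, \theta _2, \theta _3, f)\) are exactly the parameters optimized in the appendix below.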
References
Buenaposada, J.M., Baumela, L.: Real-time tracking and estimation of plane pose. In: Proceedings of International Conference on Pattern Recognition, pp. 697–700 (2002)
Clark, P., Mirmehdi, M.: Estimating the orientation and recovery of text planes in a single image. In: Proceedings of the 12th British Machine Vision Conference, pp. 421–430 (2001)
Cobzas, D., Jagersand, M., Sturm, P.: 3D SSD tracking with estimated 3D planes. Image Vis. Comput. 27, 69–79 (2009)
Corral-Soto, E.R., Elder, J.H.: Automatic single-view calibration and rectification from parallel planar curves. In: European Conference on Computer Vision, pp. 813–827 (2014)
Grompone von Gioi, R., Jakubowicz, J., Morel, J.M., Randall, G.: LSD: a fast line segment detector with a false detection control. IEEE Trans. Pattern Anal. Mach. Intell. 32(4), 722–732 (2010)
Hanbury, A., Wildenauer, H.: Robust camera self-calibration from monocular images of Manhattan worlds. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2831–2838 (2012)
Hartl, A., Reitmayr, G.: Rectangular target extraction for mobile augmented reality applications. In: Proceedings of International Conference on Pattern Recognition, pp. 81–84 (2012)
Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge (2000)
Hua, G., Liu, Z., Zhang, Z., Wu, Y.: Automatic business card scanning with a camera. In: IEEE International Conference on Image Processing, pp. 373–376 (2006)
Hua, G., Liu, Z., Zhang, Z., Wu, Y.: Iterative local-global energy minimization for automatic extraction of objects of interest. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1701–1706 (2006)
Korč, F., Förstner, W.: eTRIMS Image Database for interpreting images of man-made scenes. Tech. Rep. TR-IGG-P-2009-01, Department of Photogrammetry, University of Bonn (2009)
Lee, H., Shechtman, E., Wang, J., Lee, S.: Automatic upright adjustment of photographs. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 877–884 (2012)
Lee, H., Shechtman, E., Wang, J., Lee, S.: Automatic upright adjustment of photographs with robust camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 36(5), 833–844 (2014)
Lee, W., Park, Y., Lepetit, V.: Video-based in situ tagging on mobile phones. IEEE Trans. Circuits Syst. Video Technol. 21, 1487–1496 (2011)
Liebowitz, D., Zisserman, A.: Metric rectification for perspective images of planes. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 482–488 (1998)
Mirzaei, F., Roumeliotis, S.: Optimal estimation of vanishing points in a Manhattan world. In: IEEE International Conference on Computer Vision, pp. 2454–2461 (2011)
Monasse, P., Morel, J.M., Tang, Z.: Three-step image rectification. In: The British Machine Vision Conference, pp. 89.1–10 (2010)
Moré, J.J.: The Levenberg–Marquardt algorithm: implementation and theory. In: Watson, G.A. (ed.) Numerical Analysis. Lecture Notes in Mathematics, vol. 630, pp. 105–116. Springer, Berlin, Heidelberg (1978)
Pilu, M.: Extraction of illusory linear clues in perspectively skewed documents. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. I363–I368 (2001)
Pritts, J., Chum, O., Matas, J.: Detection, rectification and segmentation of coplanar repeated patterns. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2973–2980 (2014)
Tardif, J.P.: Non-iterative approach for fast and accurate vanishing point detection. In: IEEE International Conference on Computer Vision, pp. 1250–1257 (2009)
Tretyak, E., Barinova, O., Kohli, P., Lempitsky, V.: Geometric image parsing in man-made environments. Int. J. Comput. Vis. 97(3), 305–321 (2012)
Xu, C., Kuipers, B., Murarka, A.: 3D pose estimation for planes. In: Proceedings of International Conference on Computer Vision Workshops, pp. 673–680 (2009)
Xu, Y., Oh, S., Hoogs, A.: A minimum error vanishing point detection approach for uncalibrated monocular images of man-made environments. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1376–1383 (2013)
Zaheer, A., Rashid, M., Khan, S.: Shape from angle regularity. In: Proceedings of the 12th European Conference on Computer Vision, Part VI, pp. 1–14 (2012)
Zhang, Z., Ganesh, A., Liang, X., Ma, Y.: TILT: transform invariant low-rank textures. Int. J. Comput. Vis. 99(1), 1–24 (2012)
Zhang, Z., Matsushita, Y., Ma, Y.: Camera calibration with lens distortion from low-rank textures. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2321–2328 (2011)
Acknowledgments
This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-H8501-15-1016) supervised by the IITP (Institute for Information & communications Technology Promotion).
Appendix: Jacobian matrix of the proposed cost function
The cost function (12) can be minimized via the Levenberg–Marquardt algorithm. For an efficient implementation of the algorithm, we need the derivatives of (24) and (25) with respect to the four parameters (i.e., \(t \in \{\theta _1, \theta _2, \theta _3, f\}\)). However, the \(\min (\cdot ,\cdot )\) function in (24) is not differentiable at some points; therefore, we use a simple approximation that differentiates the active branch:
\[ \frac{\partial }{\partial t}\min \left( f(\cdot ), g(\cdot )\right) \approx {\left\{ \begin{array}{ll} \dfrac{\partial f(\cdot )}{\partial t}, &{} \text {if } f(\cdot ) < g(\cdot ),\\ \dfrac{\partial g(\cdot )}{\partial t}, &{} \text {otherwise.} \end{array}\right. } \]
Although this approximation is ambiguous when \(f(\cdot ) = g(\cdot )\), this case seldom occurs and the approximation works well in practice. The \(\min (\cdot ,\cdot )\) and \(|\cdot |\) functions are handled in the same manner.
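The branch-selection approximation is easy to check numerically. The sketch below is illustrative (the functions \(f(t)=t^2\) and \(g(t)=1-t\) are arbitrary toy choices, not the paper's cost terms): away from the tie \(f = g\), the branch derivative matches a central finite difference of \(\min (f, g)\).

```python
def dmin_dt(f_val, g_val, df_dt, dg_dt):
    # d/dt min(f, g), approximated by the derivative of the active branch.
    # The tie f == g is the ambiguous case; we arbitrarily pick f there.
    return df_dt if f_val <= g_val else dg_dt

def m(t):
    # Toy example: min of f(t) = t^2 and g(t) = 1 - t.
    return min(t * t, 1.0 - t)
```

At \(t = 0.3\) the \(f\) branch is active (\(0.09 < 0.7\)), so the approximation returns \(2t = 0.6\), in agreement with the finite difference.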
1.1 Derivatives of (24)
Let us denote \(\hat{\varvec{p}}=\mathrm {H}^{-1}\mathbf {u} = \begin{bmatrix} \hat{p}_1&\hat{p}_2&\hat{p}_3 \end{bmatrix}^\top \) and \(\hat{\varvec{q}}=\mathrm {H}^{-1}\mathbf {v} = \begin{bmatrix} \hat{q}_1&\hat{q}_2&\hat{q}_3 \end{bmatrix}^\top \), and denote their inhomogeneous representations by \(\tilde{\varvec{p}} = \begin{bmatrix} x_1(\cdot )&y_1(\cdot ) \end{bmatrix}^\top \) and \(\tilde{\varvec{q}} = \begin{bmatrix} x_2(\cdot )&y_2(\cdot ) \end{bmatrix}^\top \), respectively. Then, the derivative of (24) with respect to \(t\) can be decomposed into four cases:
Since we can derive \(\frac{\partial \tilde{\varvec{p}}}{\partial t} = \begin{bmatrix} \frac{\partial {x_1(\cdot )}}{\partial t}&\frac{\partial {y_1(\cdot )}}{\partial t} \end{bmatrix}^\top \) using the chain rule:
\[ \frac{\partial \tilde{\varvec{p}}}{\partial t} = \begin{bmatrix} \frac{1}{\hat{p}_3} &{} 0 &{} -\frac{\hat{p}_1}{\hat{p}_3^2} \\ 0 &{} \frac{1}{\hat{p}_3} &{} -\frac{\hat{p}_2}{\hat{p}_3^2} \end{bmatrix} \frac{\partial \hat{\varvec{p}}}{\partial t}, \]
it is sufficient to compute \(\frac{\partial \hat{\varvec{p}}}{\partial t}\) and \(\frac{\partial \hat{\varvec{q}}}{\partial t}\) to obtain (27). Using (8), we obtain \(\frac{\partial \hat{\varvec{p}}}{\partial t}\) for each parameter:
The derivative \(\frac{\partial \hat{\varvec{q}}}{\partial t}\) can be obtained in a similar way.
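The inhomogenization step (from \(\hat{\varvec{p}}\) to \(\tilde{\varvec{p}}\)) reduces to the quotient rule, which can be verified with a small self-contained sketch (the curve \(\hat{\varvec{p}}(t) = (t, 2t, 1+t)^\top \) is an arbitrary toy example, not from the paper):

```python
def dinhomog_dt(p_hat, dp_hat_dt):
    # Quotient rule for x = p1/p3, y = p2/p3 given the elementwise
    # derivative of the homogeneous vector p_hat = (p1, p2, p3).
    p1, p2, p3 = p_hat
    d1, d2, d3 = dp_hat_dt
    return ((d1 * p3 - p1 * d3) / p3 ** 2,
            (d2 * p3 - p2 * d3) / p3 ** 2)
```

For \(\hat{\varvec{p}}(t) = (t, 2t, 1+t)^\top \) at \(t = 1\), the inhomogeneous point is \((t/(1+t),\, 2t/(1+t))\) with derivative \((1/(1+t)^2,\, 2/(1+t)^2) = (0.25, 0.5)\), matching the quotient-rule output.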
1.2 Derivatives of (25)
The derivative of (25) with respect to \(f\) can be obtained similarly:
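To see how these analytic derivatives feed the optimizer, here is a toy Levenberg–Marquardt update in pure Python. It is a sketch under simplifying assumptions (a single scalar parameter and linear residuals), not the paper's solver: the update solves \((\mathrm {J}^\top \mathrm {J} + \lambda )\,\delta = -\mathrm {J}^\top \mathbf {r}\) with the analytic Jacobian entries \(J_i = \partial r_i/\partial a\).

```python
def lm_step_scalar(xs, ys, a, lam):
    # One Levenberg-Marquardt update for the toy least-squares problem
    # r_i(a) = a*x_i - y_i, whose analytic Jacobian is J_i = x_i.
    r = [a * x - y for x, y in zip(xs, ys)]
    jtj = sum(x * x for x in xs)          # J^T J (a scalar here)
    jtr = sum(x * ri for x, ri in zip(xs, r))  # J^T r
    return a - jtr / (jtj + lam)
```

With \(\lambda = 0\) this is a pure Gauss–Newton step and solves a linear problem in one iteration; increasing \(\lambda \) damps the step, which is how the algorithm trades between gradient descent and Gauss–Newton when the cost is nonlinear, as in (12).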
Cite this article
An, J., Koo, H.I. & Cho, N.I. Rectification of planar targets using line segments. Machine Vision and Applications 28, 91–100 (2017). https://doi.org/10.1007/s00138-016-0807-1