Guaranteed Ellipse Fitting with a Confidence Region and an Uncertainty Measure for Centre, Axes, and Orientation

Abstract

A simple and fast ellipse estimation method is presented based on optimisation of the Sampson distance serving as a measure of the quality of fit between a candidate ellipse and data points. Generation of ellipses, not just conics, as estimates is ensured through the use of a parametrisation of the set of all ellipses. Optimisation of the Sampson distance is performed with the aid of a custom variant of the Levenberg–Marquardt algorithm. The method is supplemented with a measure of uncertainty of an ellipse fit in two closely related forms. One of these concerns the uncertainty in the algebraic parameters of the fit and the other pertains to the uncertainty in the geometrically meaningful parameters of the fit such as the centre, axes, and major axis orientation. In addition, a means is provided for visualising the uncertainty of an ellipse fit in the form of planar confidence regions. For moderate noise levels, the proposed estimator produces results that are fully comparable in accuracy to those produced by the much slower maximum likelihood estimator. Due to its speed and simplicity, the method may prove useful in numerous industrial applications where a measure of reliability for geometric ellipse parameters is required.
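
As an informal illustration of the approach summarised above, the following sketch minimises the Sampson distance of a conic to data points using a general-purpose Levenberg–Marquardt solver (SciPy's least_squares). It is not the authors' method: the ellipse-specific parametrisation and the custom Levenberg–Marquardt variant are omitted, and the carrier vector \(\mathbf{u}(\mathbf{x}) = [x^2, xy, y^2, x, y, 1]^\mathsf{T}\) together with isotropic unit point covariances are assumptions made only for this sketch.

```python
# Illustrative sketch only: Sampson-distance conic fitting with a generic
# Levenberg-Marquardt solver. The paper's ellipse-only parametrisation and
# custom LM variant are NOT reproduced here; this is a plain conic fit.
import numpy as np
from scipy.optimize import least_squares

def carrier(x, y):
    # Assumed carrier vector u(x) = [x^2, xy, y^2, x, y, 1]^T
    return np.array([x * x, x * y, y * y, x, y, 1.0])

def carrier_jacobian(x, y):
    # du/d(x, y), a 6 x 2 matrix
    return np.array([[2 * x, 0.0],
                     [y, x],
                     [0.0, 2 * y],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 0.0]])

def sampson_residuals(theta, pts, cov=np.eye(2)):
    # One residual per point: theta^T u_n / sqrt(theta^T B_n theta),
    # with B_n = (du/dx) cov (du/dx)^T. Note the residuals are invariant to
    # scaling of theta; the paper's custom LM variant handles this gauge
    # freedom, whereas the sketch simply renormalises afterwards.
    res = []
    for x, y in pts:
        u = carrier(x, y)
        J = carrier_jacobian(x, y)
        B = J @ cov @ J.T
        res.append((theta @ u) / np.sqrt(theta @ B @ theta))
    return np.asarray(res)

# Example usage on noisy samples of an ellipse
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2 * np.pi, 50)
pts = np.column_stack([3 * np.cos(t) + 1, 2 * np.sin(t) - 1])
pts += 0.05 * rng.standard_normal(pts.shape)

theta0 = np.array([1.0, 0.0, 1.0, 0.0, 0.0, -1.0])   # crude initial conic
fit = least_squares(sampson_residuals, theta0, args=(pts,), method="lm")
theta_hat = fit.x / np.linalg.norm(fit.x)             # fix the scale gauge
print(theta_hat)
```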

Notes

  1. This is a realistic assumption, as \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) is known to be an unbiased estimate of \(\varvec{\theta }_*\) up to the first order. See [26], [27, Sect. 10.2.2], and [37] for more details.

  2. This is a genuine formula for the covariance matrix of \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) corresponding to a different gauge constraint than the one underlying formula (7.3), which is the constraint that \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) be scaled to unit norm. Gauge constraints serve to eliminate redundant degrees of freedom in the parametrisation and lead to gauged covariances. See [57, Sect. 9] for more details.

  3. http://sites.google.com/site/szpakz/.

  4. The Sampson-distance-based estimation technique is sometimes classified as a geometric fitting method and sometimes as an algebraic fitting method. Chernov and Ma, as well as Ahn, regard Sampson distance optimisation as an algebraic fitting method.

References

  1. Ahn, S.J.: Least Squares Orthogonal Distance Fitting of Curves and Surfaces in Space. Lecture Notes in Computer Science, vol. 3151. Springer, Heidelberg (2004)

  2. Ahn, S.J., Rauh, W., Warnecke, H.: Least-squares orthogonal distances fitting of circle, sphere, ellipse, hyperbola, and parabola. Pattern Recogn. 34(12), 2283–2303 (2001)

  3. Al-Sharadqah, A., Chernov, N.: A doubly optimal ellipse fit. Comput. Stat. Data Anal. 56(9), 2771–2781 (2012)

  4. Albert, A.: Regression and the Moore-Penrose Pseudoinverse. Academic Press, New York (1972)

  5. Bartoli, A.: Towards gauge invariant bundle adjustment: a solution based on gauge dependent damping. In: Proceedings of 9th International Conference on Computer Vision, pp. 760–765 (2003)

  6. Brooks, M.J., Chojnacki, W., Gawley, D., van den Hengel, A.: What value covariance information in estimating vision parameters? In: Proceedings of 8th International Conference on Computer Vision, vol. 1, pp. 302–308 (2001)

  7. Chernov, N., Huang, Q., Ma, H.: Does the best-fitting curve always exist? ISRN Prob. Stat. (2012). doi:10.5402/2012/895178

  8. Chernov, N., Huang, Q., Ma, H.: Fitting quadratic curves to data points. Br. J. Math. Comput. Sci. 4(1), 33–60 (2014)

  9. Chernov, N., Lesort, C.: Statistical efficiency of curve fitting algorithms. Comput. Stat. Data Anal. 47(4), 713–728 (2004)

  10. Chernov, N., Ma, H.: Least squares fitting of quadratic curves and surfaces. In: Yoshida, S.R. (ed.) Computer Vision, pp. 285–302. Nova Science Publishers, New York (2011)

  11. Chernov, N., Wijewickrema, S.: Algorithms for projecting points onto conics. J. Comput. Appl. Math. 251, 8–21 (2013)

  12. Chojnacki, W., Brooks, M.J.: On the consistency of the normalized eight-point algorithm. J. Math. Imaging Vis. 28, 19–27 (2007)

  13. Chojnacki, W., Brooks, M.J., van den Hengel, A., Gawley, D.: On the fitting of surfaces to data with covariances. IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1294–1303 (2000)

  14. Chojnacki, W., Brooks, M.J., van den Hengel, A., Gawley, D.: Revisiting Hartley’s normalized eight-point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 25(9), 1172–1177 (2003)

  15. Csurka, G., Zeller, C., Zhang, Z., Faugeras, O.D.: Characterizing the uncertainty of the fundamental matrix. Comput. Vis. Image Underst. 68(1), 18–36 (1997)

  16. Fazekas, I., Kukush, A., Zwanzig, S.: Correction of nonlinear orthogonal regression estimator. Ukr. Math. J. 56(8), 1308–1330 (2004)

  17. Fitzgibbon, A., Pilu, M., Fisher, R.B.: Direct least square fitting of ellipses. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 476–480 (1999)

  18. Förstner, W.: Uncertainty and projective geometry. In: Corrochano, E.B., Förstner, W. (eds.) Handbook of Geometric Computing, pp. 493–534. Springer, Berlin Heidelberg (2005)

  19. Gander, W., Golub, G.H., Strebel, R.: Least-squares fitting of circles and ellipses. BIT 34(4), 558–578 (1994)

  20. Golub, G.H., Pereyra, V.: The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM J. Numer. Anal. 10(2), 413–432 (1973)

  21. Halíř, R., Flusser, J.: Numerically stable direct least squares fitting of ellipses. In: Proceedings of 6th International Conference in Central Europe on Computer Graphics and Visualization, vol. 1, pp. 125–132 (1998)

  22. Haralick, R.M.: Propagating covariance in computer vision. Int. J. Pattern Recogn. Artif. Intell. 10(5), 561–572 (1996)

  23. Hartley, R.: In defense of the eight-point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 19(6), 580–593 (1997)

  24. Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)

  25. Hunyadi, L., Vajk, I.: Constrained quadratic errors-in-variables fitting. Visual Comput. (2013). doi:10.1007/s00371-013-0885-2

  26. Kanatani, K.: Statistical bias of conic fitting and renormalization. IEEE Trans. Pattern Anal. Mach. Intell. 16(3), 320–326 (1994)

  27. Kanatani, K.: Statistical Optimization for Geometric Computation: Theory and Practice. Elsevier, Amsterdam (1996)

  28. Kanatani, K.: Ellipse fitting with hyperaccuracy. IEICE Trans. Inf. Syst. E89–D(10), 2653–2660 (2006)

  29. Kanatani, K., Al-Sharadqah, A., Chernov, N., Sugaya, Y.: Renormalization returns: hyper-renormalization and its applications. In: Proceedings of 12th European Conference on Computer Vision, Lecture Notes in Computer Science, vol. 7574, pp. 384–397 (2012)

  30. Kanatani, K., Rangarajan, P.: Hyper least squares fitting of circles and ellipses. Comput. Stat. Data Anal. 55(6), 2197–2208 (2011)

  31. Kanazawa, Y., Kanatani, K.: Optimal conic fitting and reliability evaluation. IEICE Trans. Inf. Syst. E79–D(9), 1323–1328 (1996)

  32. Kim, I.: Orthogonal distance fitting of ellipses. Commun. Korean Math. Soc. 17(1), 121–142 (2002)

  33. Kukush, A., Markovsky, I., Huffel, S.V.: Consistent estimation in an implicit quadratic measurement error model. Comput. Stat. Data Anal. 47(1), 123–147 (2004)

  34. Lehmann, E.L., Romano, J.P.: Testing Statistical Hypotheses, 3rd edn. Springer, New York (2005)

  35. Lütkepohl, H.: Handbook of Matrices. Wiley, Chichester (1996)

  36. Magnus, J.R., Neudecker, H.: Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, Chichester (1988)

  37. Matei, B., Meer, P.: Reduction of bias in maximum likelihood ellipse fitting. In: Proceedings of 15th International Conference on Pattern Recognition, vol. 3, pp. 794–798 (2000)

  38. Newsam, G.N., Redding, N.J.: Fitting the most probable curve to noisy observations. In: Proceedings of International Conference on Image Processing, vol. 2, pp. 752–755 (1997)

  39. Newsam, G.N., Redding, N.J.: Fitting the most likely curve through noisy data. Technical Report DSTO-RR-0242, Electronics and Surveillance Research Laboratory, Defence Science and Technology Organisation, Edinburgh, South Australia (2002)

  40. Okatani, T., Deguchi, K.: Improving accuracy of geometric parameter estimation using projected score method. In: Proceedings of 12th International Conference on Computer Vision, pp. 1733–1740 (2009)

  41. Okatani, T., Deguchi, K.: On bias correction for geometric parameter estimation in computer vision. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 959–966 (2009)

  42. O’Leary, P., Zsombor-Murray, P.: Direct and specific least-square fitting of hyperbolæ and ellipses. J. Electron. Imaging 13(3), 492–503 (2004)

  43. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2000)

  44. Osborne, M.R.: Nonlinear least squares-the Levenberg algorithm revisited. J. Aust. Math. Soc. Ser. B 19(3), 343–357 (1976)

  45. Penrose, R.: A generalized inverse for matrices. Math. Proc. Cambridge Philos. Soc. 51, 406–413 (1955)

  46. Porrill, J.: Fitting ellipses and predicting confidence envelopes using a bias corrected Kalman filter. Image Vis. Comput. 8(1), 37–41 (1990)

  47. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C. Cambridge University Press, Cambridge (1995)

  48. Rosin, P.L.: A note on the least squares fitting of ellipses. Pattern Recogn. Lett. 14(10), 799–808 (1993)

  49. Rosin, P.L.: Analysing error of fit functions for ellipses. Pattern Recogn. Lett. 17(14), 1461–1470 (1996)

  50. Sampson, P.D.: Fitting conic sections to ‘very scattered’ data: an iterative refinement of the Bookstein algorithm. Comput. Graph. Image Process. 18(1), 97–108 (1982)

  51. Scheffé, H.: A method for judging all contrasts in the analysis of variance. Biometrika 40, 87–104 (1953)

  52. Stewart, G.W.: On the continuity of the generalized inverse. SIAM J. Appl. Math. 17(1), 33–45 (1969)

  53. Sturm, P., Gargallo, P.: Conic fitting using the geometric distance. In: Proceedings of 8th Asian Conference on Computer Vision, Lecture Notes in Computer Science, vol. 4844, pp. 784–795 (2007)

  54. Szpak, Z., Chojnacki, W., van den Hengel, A.: Guaranteed ellipse fitting with the Sampson distance. In: Proceedings of 12th European Conference on Computer Vision, Lecture Notes in Computer Science, vol. 7576, pp. 87–100 (2012)

  55. Szpak, Z.L., Chojnacki, W., van den Hengel, A.: A comparison of ellipse fitting methods and implications for multiple-view geometry estimation. In: Proceedings of International Conference on Digital Image Computing Techniques and Applications, pp. 1–8 (2012)

  56. Taubin, G.: Estimation of planar curves, surfaces and nonplanar space curves defined by implicit equations, with applications to edge and range image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 13(11), 1115–1138 (1991)

  57. Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment—a modern synthesis. In: Proceedings of International Workshop on Vision Algorithms, Lecture Notes in Computer Science, vol. 1883, pp. 298–372 (1999)

  58. Turner, K.: Computer perception of curved objects using a television camera. Ph.D. thesis, Department of Machine Intelligence, University of Edinburgh (1974)

  59. Varah, J.M.: Least squares data fitting with implicit functions. BIT 36(4), 842–854 (1996)

  60. Wong, C.Y., Lin, C.F., Ren, T.R., Kwok, N.M.: A survey on ellipse detection methods. In: IEEE International Symposium on Industrial Electronics, pp. 1105–1110 (2012)

  61. Zhang, Z.: Parameter estimation techniques: a tutorial with application to conic fitting. Image Vis. Comput. 15(1), 59–76 (1997)

  62. Zwillinger, D.: CRC Standard Mathematical Tables and Formulae, 32nd edn. CRC Press, Boca Raton (2012)

Acknowledgments

This work was partially supported by the Australian Research Council.

Author information

Correspondence to Zygmunt L. Szpak.

Appendices

1.1 Appendix 1: Proof of Equation (4.14)

In this appendix, we establish Eq. (4.14). The proof will not rely on general identities for pseudo-inverse matrices but rather will involve the specifics of \(\mathbf {r}(\varvec{\theta })\) and \(\varvec{\pi }(\varvec{\eta })\).

Differentiating \(\mathbf {r}(t \varvec{\theta }) = \mathbf {r}(\varvec{\theta })\) with respect to \(t\) and evaluating at \(t = 1\), we find that \(\partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\theta }) \varvec{\theta }= \mathbf {0}.\) Hence, in particular, \(\partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\pi }(\varvec{\eta }))\varvec{\pi }(\varvec{\eta }) = \mathbf {0}.\) Consequently, recalling (4.4), we have

$$\begin{aligned} \partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\pi }(\varvec{\eta }))\mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp&= \partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\pi }(\varvec{\eta }))\\&\quad - \Vert \varvec{\pi }(\varvec{\eta }) \Vert ^{-2} \partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\pi }(\varvec{\eta }))\varvec{\pi }(\varvec{\eta })\varvec{\pi }(\varvec{\eta })^\mathsf {T}\\&= \partial _{\varvec{\theta }}{\mathbf {r}}(\varvec{\pi }(\varvec{\eta })). \end{aligned}$$

We summarise this simply as

$$\begin{aligned} \partial _{\varvec{\theta }}{\mathbf {r}} \, \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp = \partial _{\varvec{\theta }}{\mathbf {r}} \end{aligned}$$

in line with our earlier convention that \(\partial _{\varvec{\theta }}{\mathbf {r}}\) be evaluated at \(\varvec{\pi }(\varvec{\eta })\). As an immediate consequence, we obtain

$$\begin{aligned} ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6) \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp = (\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp . \end{aligned}$$

This together with the observation that \((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp \) is symmetric (being the sum of the symmetric matrices \((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}}\) and \(\lambda \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp \)) yields

$$\begin{aligned} ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6) \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp&= (((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6) \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp )^\mathsf {T}\nonumber \\&= (\mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp )^\mathsf {T}((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6)^\mathsf {T}\nonumber \\&= \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6). \end{aligned}$$
(13.1)

Now note that if \(\mathbf {A}\) and \(\mathbf {B}\) are square matrices of the same size, \(\mathbf {A}\) is invertible, and \(\mathbf {A}\mathbf {B} = \mathbf {B}\mathbf {A}\), then \(\mathbf {A}^{-1}\mathbf {B} = \mathbf {B}\mathbf {A}^{-1}\), as is easily seen by pre- and post-multiplying both sides of \(\mathbf {A}\mathbf {B} = \mathbf {B}\mathbf {A}\) by \(\mathbf {A}^{-1}\). This in conjunction with (13.1) implies

$$\begin{aligned} ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6)^{-1} \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp = \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} + \lambda \mathbf {I}_6)^{-1}. \end{aligned}$$
(13.2)

Differentiating the identity \(\Vert \varvec{\pi }(\varvec{\eta }) \Vert ^2 = \varvec{\pi }(\varvec{\eta })^\mathsf {T}\varvec{\pi }(\varvec{\eta }) = 1\) with respect to \(\varvec{\eta }\), we get

$$\begin{aligned} (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\varvec{\pi }(\varvec{\eta }) = 0. \end{aligned}$$
(13.3)

Hence

$$\begin{aligned} (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp&= (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}- \Vert \varvec{\pi }(\varvec{\eta }) \Vert ^{-2} (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\varvec{\pi }(\varvec{\eta })\varvec{\pi }(\varvec{\eta })^\mathsf {T}\\&= (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}. \end{aligned}$$

Pre-multiplying both sides of this equality by \(((\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\partial _{\varvec{\eta }}{\varvec{\pi }})^{-1}\) and invoking (4.12a), we obtain

$$\begin{aligned} (\partial _{\varvec{\eta }}{\varvec{\pi }})^+ \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp = (\partial _{\varvec{\eta }}{\varvec{\pi }})^+. \end{aligned}$$
(13.4)

Now recall that for any matrix \(\mathbf {A}\), the matrix \(\mathbf {A}^+ \mathbf {A}\) represents the orthogonal projection onto the range (column space) of \(\mathbf {A}^\mathsf {T}\), or equivalently, the orthogonal projection onto the orthogonal complement of the null space of \(\mathbf {A}\). Since, on account of (13.3), the null space of \((\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\) is spanned by \(\varvec{\pi }(\varvec{\eta })\), it follows that

$$\begin{aligned} ((\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T})^+ (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}= \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp . \end{aligned}$$
(13.5)
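
As a side remark, the projection property behind (13.5) is easy to check numerically. The following sketch is illustrative only: a randomly generated \(6 \times 5\) matrix with columns orthogonal to a unit vector stands in for \(\partial _{\varvec{\eta }}{\varvec{\pi }}\), and the check confirms that \(\mathbf {A}^+ \mathbf {A}\) coincides with \(\mathbf {P}_{\varvec{\pi }}^\perp \) whenever the null space of \(\mathbf {A}\) is spanned by \(\varvec{\pi }\).

```python
# Small numerical illustration (not from the paper): for any matrix A,
# pinv(A) @ A is the orthogonal projector onto range(A^T), i.e. onto the
# orthogonal complement of null(A). Here A plays the role of (d_eta pi)^T:
# a 5 x 6 matrix whose null space is spanned by a unit vector pi.
import numpy as np

rng = np.random.default_rng(1)
pi = rng.standard_normal(6)
pi /= np.linalg.norm(pi)

J = rng.standard_normal((6, 5))
J = (np.eye(6) - np.outer(pi, pi)) @ J   # columns orthogonal to pi, cf. (13.3)
A = J.T                                  # stands in for (d_eta pi)^T

P_perp = np.eye(6) - np.outer(pi, pi)    # the projector of (4.4) for unit pi
print(np.allclose(np.linalg.pinv(A) @ A, P_perp))   # True, matching (13.5)
```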

We now have all the ingredients needed to establish (4.14). We calculate as follows:

$$\begin{aligned}&\!\!\!(\partial _{\varvec{\eta }}{\varvec{\pi }})^+ ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6)^{-1} ((\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T})^+ (\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6) \partial _{\varvec{\eta }}{\varvec{\pi }}\\&\qquad = (\partial _{\varvec{\eta }}{\varvec{\pi }})^+ ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6)^{-1} \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6) \partial _{\varvec{\eta }}{\varvec{\pi }}\\&\qquad =(\partial _{\varvec{\eta }}{\varvec{\pi }})^+ \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6)^{-1} ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6) \partial _{\varvec{\eta }}{\varvec{\pi }}\\&\qquad =(\partial _{\varvec{\eta }}{\varvec{\pi }})^+ \mathbf {P}_{\varvec{\pi }(\varvec{\eta })}^\perp \partial _{\varvec{\eta }}{\varvec{\pi }}\\&\qquad =(\partial _{\varvec{\eta }}{\varvec{\pi }})^+ \partial _{\varvec{\eta }}{\varvec{\pi }}\\&\qquad =\mathbf {I}_5. \end{aligned}$$

In the above, the second line comes from the first by (13.5); the third line comes from the second by (13.2); the fourth line comes from the third by the tautological identity \(((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6)^{-1} ((\partial _{\varvec{\theta }}{\mathbf {r}})^\mathsf {T}\partial _{\varvec{\theta }}{\mathbf {r}} +\lambda \mathbf {I}_6) =\mathbf {I}_6\); the fifth line comes from the fourth by (13.4); and the sixth line comes from the fifth by (4.13a). The end result of our calculation is exactly what is needed to establish (4.14).
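
Readers who wish to sanity-check (4.14) numerically can do so with the sketch below. It generates random matrices that satisfy the structural properties assumed in the proof, namely \(\Vert \varvec{\pi }\Vert = 1\), \(\partial _{\varvec{\theta }}{\mathbf {r}}\,\varvec{\pi } = \mathbf {0}\), and \((\partial _{\varvec{\eta }}{\varvec{\pi }})^\mathsf {T}\varvec{\pi } = \mathbf {0}\); the dimensions and the damping value are arbitrary choices made for the illustration.

```python
# Numerical sanity check (illustrative, not part of the paper): verify the
# identity (4.14) on randomly generated matrices with the structural
# assumptions used in the proof.
import numpy as np

rng = np.random.default_rng(2)
pi = rng.standard_normal(6)
pi /= np.linalg.norm(pi)
P_perp = np.eye(6) - np.outer(pi, pi)

dr = rng.standard_normal((20, 6)) @ P_perp    # d_theta r, annihilates pi
dpi = P_perp @ rng.standard_normal((6, 5))    # d_eta pi, columns orthogonal to pi
lam = 0.37                                    # any positive damping parameter

H = dr.T @ dr + lam * np.eye(6)
lhs = (np.linalg.pinv(dpi) @ np.linalg.inv(H)
       @ np.linalg.pinv(dpi.T) @ dpi.T @ H @ dpi)
print(np.allclose(lhs, np.eye(5)))            # True, confirming (4.14)
```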

1.2 Appendix 2: Proof of Equation (5.2)

In this appendix, we establish the covariance formula (5.2). The derivation is based on two ingredients: an equation characterising \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) and a covariance propagation formula. The first ingredient, embodied in Eq. (13.7) below, comes from the optimality condition that \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) satisfies as the minimiser of \(J_{\mathrm {AML}}\):

$$\begin{aligned}{}[\partial _{\varvec{\theta }}{J_{\mathrm {AML}}(\varvec{\theta }; \mathbf {x}_1, \dots , \mathbf {x}_N)}]_{\varvec{\theta }= \widehat{\varvec{\theta }}_{\mathrm {AML}}} = \mathbf {0}^\mathsf {T}. \end{aligned}$$
(13.6)

Direct computation shows that

$$\begin{aligned}{}[\partial _{\varvec{\theta }}{J_{\mathrm {AML}}(\varvec{\theta }; \mathbf {x}_1, \dots , \mathbf {x}_N)}]^\mathsf {T}= 2 \mathbf {X}_{\varvec{\theta }} \varvec{\theta }, \end{aligned}$$

where

$$\begin{aligned} \mathbf {X}_{\varvec{\theta }} = \sum _{n=1}^N \frac{\mathbf {A}_n}{\varvec{\theta }^{\mathsf {T}} \mathbf {B}_n\varvec{\theta }} - \sum _{n=1}^N \frac{\varvec{\theta }^{\mathsf {T}} \mathbf {A}_n \varvec{\theta }}{(\varvec{\theta }^{\mathsf {T}} \mathbf {B}_n \varvec{\theta })^2} \mathbf {B}_n. \end{aligned}$$

Accordingly, (13.6) can be reformulated as

$$\begin{aligned} \mathbf {X}_{\hat{\varvec{\theta }}} \hat{\varvec{\theta }}= \mathbf {0}, \end{aligned}$$
(13.7)

where \(\widehat{\varvec{\theta }}_{\mathrm {AML}}\) is abbreviated to \(\hat{\varvec{\theta }}\) for clarity.
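
For concreteness, the following sketch assembles \(\mathbf {X}_{\varvec{\theta }}\) for ellipse data. It assumes the definitions \(\mathbf {A}_n = \mathbf {u}(\mathbf {x}_n)\mathbf {u}(\mathbf {x}_n)^\mathsf {T}\) and \(\mathbf {B}_n = \partial _{\mathbf {x}}{\mathbf {u}(\mathbf {x}_n)}\, \varvec{\varLambda }_{\mathbf {x}_n}\, [\partial _{\mathbf {x}}{\mathbf {u}(\mathbf {x}_n)}]^\mathsf {T}\) with carrier \(\mathbf {u}(\mathbf {x}) = [x^2, xy, y^2, x, y, 1]^\mathsf {T}\), which correspond to (2.5) but are not reproduced in this section.

```python
# Sketch (under the stated assumptions about A_n, B_n and the carrier u):
# assemble the matrix X_theta of the gradient formula above for a given
# parameter vector theta and a set of data points with covariances.
import numpy as np

def X_matrix(theta, pts, point_covs):
    X = np.zeros((6, 6))
    for (x, y), Lam in zip(pts, point_covs):
        u = np.array([x * x, x * y, y * y, x, y, 1.0])
        du = np.array([[2 * x, 0.0], [y, x], [0.0, 2 * y],
                       [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
        A = np.outer(u, u)
        B = du @ Lam @ du.T
        denom = theta @ B @ theta
        X += A / denom - (theta @ A @ theta) / denom**2 * B
    return X

# At the minimiser theta_hat of J_AML, X_matrix(theta_hat, ...) @ theta_hat
# should be numerically close to the zero vector, in line with (13.7).
```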

The second ingredient, which will be used in combination with the first, is the covariance propagation formula

$$\begin{aligned} \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} = \sum _{n=1}^N \partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}} \varvec{\varLambda }_{\mathbf {x}_n}^{} (\partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}})^{\mathsf {T}} \end{aligned}$$
(13.8)

(cf. [15, 22]). Here it is tacitly assumed that, corresponding to varying sets of data points \(\mathbf {x}_1, \dots , \mathbf {x}_N\), the normalised vectors \(\hat{\varvec{\theta }}= \hat{\varvec{\theta }}(\mathbf {x}_1, \dots , \mathbf {x}_N)\) have been chosen in a coordinated way, so that \(\hat{\varvec{\theta }}\) varies smoothly, without sign flipping, as a function of \(\mathbf {x}_1, \dots , \mathbf {x}_N\) and in particular may be differentiated.
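
A minimal sketch of the propagation rule (13.8) is given below. The Jacobians \(\partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}}\) are treated as given inputs (in the derivation that follows they are obtained implicitly from (13.7)), and the isotropic covariance mentioned in the usage comment is merely an example.

```python
# Illustrative sketch of the propagation rule (13.8): the covariance of the
# estimate is the sum over data points of J_n Lambda_{x_n} J_n^T, where
# J_n = d theta_hat / d x_n (placeholder inputs here).
import numpy as np

def propagate_covariance(jacobians, data_covs):
    """jacobians: iterable of 6x2 arrays; data_covs: iterable of 2x2 arrays."""
    cov = np.zeros((6, 6))
    for J, Lx in zip(jacobians, data_covs):
        cov += J @ Lx @ J.T
    return cov

# e.g. with homogeneous isotropic noise Lambda_{x_n} = sigma**2 * I_2:
# cov = propagate_covariance(jacobians, [sigma**2 * np.eye(2)] * N)
```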

As a first step towards the derivation of formula (5.2), we differentiate \(\Vert \hat{\varvec{\theta }}\Vert ^2 =1\) with respect to \(\mathbf {x}_n\) to get \((\partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}})^{\mathsf {T}}\hat{\varvec{\theta }}= \mathbf {0}.\) This jointly with (4.4) and (13.8) then implies

$$\begin{aligned} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} = \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{}. \end{aligned}$$
(13.9)

Next, letting \(\mathbf {x}_n = [m_{n,1}, m_{n,2}]^\mathsf {T}\) and \(\hat{\varvec{\theta }}= [\hat{\theta }_1, \dots , \hat{\theta }_6]^\mathsf {T}\), and differentiating (13.7) with respect to \(m_{n,i}\), we obtain

$$\begin{aligned} \left[ [\partial _{m_{n,i}}{\mathbf {X}_{\varvec{\theta }}}]_{\varvec{\theta }= \hat{\varvec{\theta }}} + \sum _{j=1}^6 [\partial _{\theta _j}{\mathbf {X}_{\varvec{\theta }}}]_{\varvec{\theta }= \hat{\varvec{\theta }}} \partial _{m_{n,i}}{{\hat{\theta }}_j} \right] {\hat{\varvec{\theta }}} + \mathbf {X}_{\hat{\varvec{\theta }}} \partial _{m_{n,i}}{\hat{\varvec{\theta }}} = \mathbf {0}. \end{aligned}$$

Introducing the Gauss–Newton approximation, i.e., neglecting the terms that contain \(\hat{\varvec{\theta }}^\mathsf {T}\mathbf {u}(\mathbf {x}_n)\), we arrive (after some calculations) at

$$\begin{aligned} \frac{\mathbf {u}(\mathbf {x}_n)[\partial _{m_{n,i}}{\mathbf {u}(\mathbf {x}_n)}]^{\mathsf {T}} \hat{\varvec{\theta }}}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} + \left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \partial _{m_{n,i}}{\hat{\varvec{\theta }}} = \mathbf {0}. \end{aligned}$$

This together with the observation that the scalar \([\partial _{m_{n,i}}{\mathbf {u}(\mathbf {x}_n)}]^{\mathsf {T}} \hat{\varvec{\theta }}\) can also be written as \(\hat{\varvec{\theta }}^{\mathsf {T}} \partial _{m_{n,i}}{\mathbf {u}(\mathbf {x}_n)}\) leads to

$$\begin{aligned} \left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \partial _{m_{n,i}}{\hat{\varvec{\theta }}}&= - \frac{\mathbf {u}(\mathbf {x}_n) [\partial _{m_{n,i}}{\mathbf {u}(\mathbf {x}_n)}]^{\mathsf {T}} \hat{\varvec{\theta }}}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}}\\&= - \frac{\mathbf {u}(\mathbf {x}_n) \hat{\varvec{\theta }}^{\mathsf {T}} \partial _{m_{n,i}}{\mathbf {u}(\mathbf {x}_n)}}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}}. \end{aligned}$$

Consequently,

$$\begin{aligned} \left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}} = - \frac{\mathbf {u}(\mathbf {x}_n) \hat{\varvec{\theta }}^{\mathsf {T}} \partial _{\mathbf {x}_n}{\mathbf {u}(\mathbf {x}_n)}}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \end{aligned}$$

and further, recalling the definitions of \(\mathbf {A}_n\) and \(\mathbf {B}_n\) given in (2.5),

$$\begin{aligned}&\left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}} \varvec{\varLambda }_{\mathbf {x}_n}^{} (\partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}})^{\mathsf {T}} \left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \\&\qquad = \frac{ \mathbf {u}(\mathbf {x}_n) \hat{\varvec{\theta }}^{\mathsf {T}} \partial _{\mathbf {x}_n}{\mathbf {u}(\mathbf {x}_n)} \varvec{\varLambda }_{\mathbf {x}_n}^{} [\partial _{\mathbf {x}_n}{\mathbf {u}(\mathbf {x}_n)}]^{\mathsf {T}} \hat{\varvec{\theta }}\mathbf {u}(\mathbf {x}_n)^{\mathsf {T}}}{(\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }})^2}\\&\qquad = \frac{\mathbf {u}(\mathbf {x}_n) \mathbf {u}(\mathbf {x}_n)^{\mathsf {T}}}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} = \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}}. \end{aligned}$$

Now

$$\begin{aligned}&\left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \left[ \sum _{n=1}^N \partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}} \varvec{\varLambda }_{\mathbf {x}_n}^{} (\partial _{\mathbf {x}_n}{\hat{\varvec{\theta }}})^{\mathsf {T}} \right] \left[ \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}} \right] \\&\qquad = \sum _{n=1}^N \frac{\mathbf {A}_n}{\hat{\varvec{\theta }}^{\mathsf {T}} \mathbf {B}_n \hat{\varvec{\theta }}}. \end{aligned}$$

By (5.1) and (13.8), the last equality becomes

$$\begin{aligned} \mathbf {M}_{\hat{\varvec{\theta }}} \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {M}_{\hat{\varvec{\theta }}} = \mathbf {M}_{\hat{\varvec{\theta }}}. \end{aligned}$$
(13.10)

At this stage, one might be tempted to conclude that \(\varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} = \mathbf {M}_{\hat{\varvec{\theta }}}^{-1}\), but this would contravene the fact that \(\varvec{\varLambda }_{\hat{\varvec{\theta }}}^{}\) is singular. In order to exploit (13.10) properly as an approximate equality, we first note that, in view of (13.9) and the fact that \(\mathbf {P}_{\hat{\varvec{\theta }}}^\perp \) is idempotent, \(\mathbf {P}_{\hat{\varvec{\theta }}}^\perp = (\mathbf {P}_{\hat{\varvec{\theta }}}^\perp )^2,\) we have

$$\begin{aligned} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{}, \end{aligned}$$
(13.11)

so (13.10) can be rewritten as

$$\begin{aligned} \mathbf {M}_{\hat{\varvec{\theta }}} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \mathbf {M}_{\hat{\varvec{\theta }}} = \mathbf {M}_{\hat{\varvec{\theta }}}. \end{aligned}$$

Pre- and post-multiplying the last equation by \(\mathbf {P}_{\hat{\varvec{\theta }}}^\perp \) and letting

$$\begin{aligned} \mathbf {M}_{\hat{\varvec{\theta }}}^\perp = \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \mathbf {M}_{\hat{\varvec{\theta }}} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \end{aligned}$$

now leads to

$$\begin{aligned} \mathbf {M}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {M}_{\hat{\varvec{\theta }}}^\perp = \mathbf {M}_{\hat{\varvec{\theta }}}^\perp . \end{aligned}$$
(13.12)

In turn, pre- and post-multiplying (13.12) by \((\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+\) yields

$$\begin{aligned} (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ \mathbf {M}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {M}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ = (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ \mathbf {M}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+. \end{aligned}$$
(13.13)

The matrix \(\mathbf {M}_{\hat{\varvec{\theta }}}^\perp \) is symmetric and its null space is spanned by \(\hat{\varvec{\theta }}\), so

$$\begin{aligned} \mathbf {M}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ = (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ \mathbf {M}_{\hat{\varvec{\theta }}}^\perp = \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \end{aligned}$$

(cf. [4, Cor. 3.5]). We also have \((\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ \mathbf {M}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ = (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+\) by virtue of one of the four defining properties of the pseudo-inverse [4, Thm. 3.9]. Therefore (13.13) can be restated as

$$\begin{aligned} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+, \end{aligned}$$

which, on account of (13.11), implies

$$\begin{aligned} \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} = (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+. \end{aligned}$$
(13.14)

We now deduce our final formula for \(\varvec{\varLambda }_{\hat{\varvec{\theta }}}^{}\), namely

$$\begin{aligned} \varvec{\varLambda }_{\hat{\varvec{\theta }}}^{} = \mathbf {P}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}})^+_5 \mathbf {P}_{\hat{\varvec{\theta }}}^\perp , \end{aligned}$$
(13.15)

which is nothing else but Eq. (5.2) transcribed to the present notation. First we note that, since by (13.7) \(\hat{\varvec{\theta }}\) spans the null space of \(\mathbf {X}_{\hat{\varvec{\theta }}}\), the matrix \(\mathbf {X}_{\hat{\varvec{\theta }}}\) has rank \(5\). Next we observe that in the Gauss–Newton approximation \(\mathbf {X}_{\hat{\varvec{\theta }}}\) is equal to \(\mathbf {M}_{\hat{\varvec{\theta }}}\), so, having rank \(5\), \(\mathbf {X}_{\hat{\varvec{\theta }}}\) is also approximately equal to \((\mathbf {M}_{\hat{\varvec{\theta }}})_5\). This in turn implies that, approximately,

$$\begin{aligned} (\mathbf {M}_{\hat{\varvec{\theta }}})^+_5 = \mathbf {X}_{\hat{\varvec{\theta }}}^+, \end{aligned}$$

given that the function \(\mathbf {A} \mapsto \mathbf {A}^+\) is continuous when considered on sets of matrices of equal rank [20, 36, 45, 52]. The last equality together with \(\mathbf {P}_{\hat{\varvec{\theta }}}^\perp \mathbf {X}_{\hat{\varvec{\theta }}}^+ \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = \mathbf {X}_{\hat{\varvec{\theta }}}^+,\) which immediately follows from the facts that \(\mathbf {X}_{\hat{\varvec{\theta }}}\) is symmetric and that \(\hat{\varvec{\theta }}\) spans the null space of \(\mathbf {X}_{\hat{\varvec{\theta }}}\), implies

$$\begin{aligned} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}})^+_5 \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = \mathbf {X}_{\hat{\varvec{\theta }}}^+. \end{aligned}$$
(13.16)

As \(\mathbf {M}_{\hat{\varvec{\theta }}}\) is approximately equal to \(\mathbf {X}_{\hat{\varvec{\theta }}}\), it is clear that \(\mathbf {M}_{\hat{\varvec{\theta }}}^\perp \) (\(= \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \mathbf {M}_{\hat{\varvec{\theta }}} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp \)) is approximately equal to \(\mathbf {P}_{\hat{\varvec{\theta }}}^\perp \mathbf {X}_{\hat{\varvec{\theta }}} \mathbf {P}_{\hat{\varvec{\theta }}}^\perp = \mathbf {X}_{\hat{\varvec{\theta }}}\). Both \(\mathbf {M}_{\hat{\varvec{\theta }}}^\perp \) and \(\mathbf {X}_{\hat{\varvec{\theta }}}\) have rank \(5\), so, as they are approximately equal, their pseudo-inverses are also approximately equal,

$$\begin{aligned} (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ = \mathbf {X}_{\hat{\varvec{\theta }}}^+, \end{aligned}$$

by the aforementioned continuity property of the pseudo-inverse. Combining this last equation with (13.16) yields

$$\begin{aligned} (\mathbf {M}_{\hat{\varvec{\theta }}}^\perp )^+ = \mathbf {P}_{\hat{\ varvec{\theta }}}^\perp (\mathbf {M}_{\hat{\varvec{\theta }}})^+_5 \mathbf {P}_{\hat{\varvec{\theta }}}^\perp , \end{aligned}$$

and this together with (13.14) establishes (13.15).
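
As a practical note, formula (13.15) is straightforward to evaluate once \(\mathbf {M}_{\hat{\varvec{\theta }}}\) and \(\hat{\varvec{\theta }}\) are available. The sketch below is illustrative only; it assumes that \(\mathbf {M}_{\hat{\varvec{\theta }}}\) has been assembled as in (5.1) and has rank at least \(5\), forms the rank-\(5\) truncated pseudo-inverse via an eigendecomposition, and applies the projections.

```python
# Illustrative sketch of evaluating (13.15): Lambda = P_perp (M)_5^+ P_perp,
# where (M)_5 is the best rank-5 approximation of the symmetric positive
# semi-definite 6 x 6 matrix M of (5.1), obtained by discarding its smallest
# eigenvalue. Assumes M has rank at least 5 and theta_hat is the AML estimate.
import numpy as np

def aml_covariance(M, theta_hat):
    theta_hat = theta_hat / np.linalg.norm(theta_hat)
    P_perp = np.eye(6) - np.outer(theta_hat, theta_hat)
    w, V = np.linalg.eigh(M)          # eigenvalues in ascending order
    w_inv = np.zeros_like(w)
    w_inv[1:] = 1.0 / w[1:]           # drop smallest eigenvalue: rank-5 pseudo-inverse
    M5_pinv = V @ np.diag(w_inv) @ V.T
    return P_perp @ M5_pinv @ P_perp
```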

Cite this article

Szpak, Z.L., Chojnacki, W. & van den Hengel, A. Guaranteed Ellipse Fitting with a Confidence Region and an Uncertainty Measure for Centre, Axes, and Orientation. J Math Imaging Vis 52, 173–199 (2015). https://doi.org/10.1007/s10851-014-0536-x
