## 1 Introduction

The perspective n-point problem (PnP) determines the pose, i.e., a rigid 3D transformation with respect to the camera coordinate system, of a set of n points from their corresponding images obtained with a perspective camera. At least three point correspondences are required to obtain a finite number of solutions. The PnP problem has been researched extensively for perspective cameras. Numerous approaches have been proposed for three (P3P), four (P4P), five (P5P), and a general number of point correspondences (PnP); see, e.g., [15, 19, 22, 28, 40, 70] (P3P), [34, 59, 67, 70] (P4P), [59, 67, 70] (P5P), [2, 17, 18, 23, 27, 32, 39, 48, 49, 56, 61, 72] (PnP), and references therein. The PnP problem originated in photogrammetry, where it is called space resection (or simply resection); see, e.g., [52, Chapter 11.1.3], [51, Chapter 4.2.3], [21, Chapter 12.2.4].

If a camera with a telecentric lens is used, the projection is orthographic instead of perspective [64, Chapter 3.9], [63]. If we want to determine the pose of an object with a telecentric camera, this leads to a problem that we will call the orthographic n-point problem (OnP). This problem occurs in several applications. For example, it can be used to determine the initial values of the pose of a calibration object during camera calibration. Furthermore, some visual inspection algorithms require that the pose of an object is determined using a telecentric camera (e.g., to determine whether some components that have been mounted onto a printed circuit board are in the correct 3D orientation after reflow). In addition, it can be used to determine the pose of an object in object localization or recognition algorithms.

All of the above PnP algorithms assume a perspective camera. Consequently, none of them can solve the OnP problem. The approaches that might appear to come closest are the PnP algorithms described in [16] for non-coplanar 3D points and [55] for coplanar 3D points. They use a scaled orthographic projection (also known as weak perspective). An extension to paraperspective is described in [35]. In all of these approaches, the affine camera model that is used serves as an approximation to the true perspective projection of the camera. Since a finite projection center is still required by these algorithms, they cannot be extended to the OnP problem. For example, the similar triangles that are used in the derivation of the algorithms in [16, 55] do not exist for true orthographic projection. Furthermore, orthogonality of the 3D rotation matrix is only enforced after the algorithms in [16, 35, 55] have computed their solutions, which may lead to suboptimal solutions.

In principle, telecentric cameras can also be regarded as specialized instances of generalized cameras, in which the camera geometry is modeled by providing an explicit ray or line in 3D for every image point. For telecentric cameras, all rays are, of course, parallel. Several PnP algorithms have been proposed for generalized cameras [11, 54, 62]. However, in all of these approaches, the case of parallel rays is excluded explicitly [11, 54] or implicitly [62]. Therefore, they do not provide a solution to the OnP problem.

Since the OnP problem seems to be largely unexplored, in this paper we propose several algorithms to solve it. Some of the proposed algorithms are based on existing Procrustes problem solvers, while others are novel. In addition, we will perform an extensive performance evaluation of the proposed algorithms to determine which algorithms exhibit the best tradeoff between robustness and speed and are, therefore, suitable for practical use.

## 2 Problem Definition

### 2.1 Camera Model for Telecentric Cameras

To be able to define the OnP problem, we first discuss the camera model for telecentric cameras that we will use. Our presentation is based on the description in [63].

In our model, a point $${\varvec{p}}_{\mathrm {o}} = (x_{\mathrm {o}}, y_{\mathrm {o}}, z_{\mathrm {o}})^\top$$ given in the object coordinate system is transformed into a point $${\varvec{p}}_{\mathrm {c}} = (x_{\mathrm {c}}, y_{\mathrm {c}}, z_{\mathrm {c}})^\top$$ in the camera coordinate system by a rigid 3D transformation:

\begin{aligned} {\varvec{p}}_{\mathrm {c}} = {{\mathbf {\mathtt{{R}}}}} {\varvec{p}}_{\mathrm {o}} + {\varvec{t}} , \end{aligned}
(1)

where $${\varvec{t}} = (t_x, t_y, t_z)^\top$$ is a translation vector and $${{\mathbf {\mathtt{{R}}}}}$$ is a $$3 \times 3$$ rotation matrix. To solve the OnP problem, we must determine $${{\mathbf {\mathtt{{R}}}}}$$ and $${\varvec{t}}$$.

Next, the point $${\varvec{p}}_{\mathrm {c}}$$ is projected into the image plane. For telecentric lenses, the projection is given by:

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {u}} \\ y_{\mathrm {u}} \end{array} \right) = m \left( \begin{array}{l} x_{\mathrm {c}} \\ y_{\mathrm {c}} \end{array} \right) , \end{aligned}
(2)

where m is the magnification of the lens. Note that (2) is independent of $$z_{\mathrm {c}}$$ and therefore also of $$t_z$$. Consequently, we obviously cannot recover $$t_z$$. Since $$t_z$$ does not influence the projection, we can set it to an arbitrary value, e.g., to 0. The independence of $$z_{\mathrm {c}}$$ additionally shows that only the first two rows of $${{\mathbf {\mathtt{{R}}}}}$$ influence the projected points. In contrast to $${\varvec{t}}$$, $${{\mathbf {\mathtt{{R}}}}}$$ is determined uniquely from its first two rows: since $${{\mathbf {\mathtt{{R}}}}}$$ is orthogonal, the third row of $${{\mathbf {\mathtt{{R}}}}}$$ can be reconstructed as the vector product of the first two rows.

The undistorted point $$(x_{\mathrm {u}}, y_{\mathrm {u}})^\top$$ is then distorted to a point $$(x_{\mathrm {d}}, y_{\mathrm {d}})^\top$$. We support two distortion models: the division model [6, 20, 43,44,45,46,47] and the polynomial model [8, 9]. In the division model, the undistorted point $$(x_{\mathrm {u}}, y_{\mathrm {u}})^\top$$ is computed from the distorted point $$(x_{\mathrm {d}}, y_{\mathrm {d}})^\top$$ as follows:

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {u}} \\ y_{\mathrm {u}} \end{array} \right) = \frac{1}{1 + \kappa r_{\mathrm {d}}^2} \left( \begin{array}{l} x_{\mathrm {d}} \\ y_{\mathrm {d}} \end{array} \right) , \end{aligned}
(3)

where $$r_{\mathrm {d}}^2 = x_{\mathrm {d}}^2 + y_{\mathrm {d}}^2$$. In the polynomial model, the undistorted point is computed by:

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {u}} \\ y_{\mathrm {u}} \end{array} \right) = \left( \begin{array}{l} x_{\mathrm {d}} (1 + K_1 r_{\mathrm {d}}^2 + K_2 r_{\mathrm {d}}^4 + K_3 r_{\mathrm {d}}^6) \\ \phantom {1ex} + (P_1 (r_{\mathrm {d}}^2 + 2 x_{\mathrm {d}}^2) + 2 P_2 x_{\mathrm {d}} y_{\mathrm {d}}) \\ y_{\mathrm {d}} (1 + K_1 r_{\mathrm {d}}^2 + K_2 r_{\mathrm {d}}^4 + K_3 r_{\mathrm {d}}^6) \\ \phantom {1ex} + (2 P_1 x_{\mathrm {d}} y_{\mathrm {d}} + P_2 (r_{\mathrm {d}}^2 + 2 y_{\mathrm {d}}^2)) \end{array} \right) . \end{aligned}
(4)

The distortion in the division model can be inverted analytically, while that of the polynomial model cannot. As we will see below, in this paper we are only interested in undistorting points. Therefore, the analytical undistortions in (3) and (4) are exactly what we need.

Finally, the distorted point $$(x_{\mathrm {d}}, y_{\mathrm {d}})^\top$$ is transformed into the image coordinate system:

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {i}} \\ y_{\mathrm {i}} \end{array} \right) = \left( \begin{array}{l} \displaystyle \frac{x_{\mathrm {d}}}{s_x} + c_x \\ \displaystyle \frac{y_{\mathrm {d}}}{s_y} + c_y \end{array} \right) . \end{aligned}
(5)

Here, $$s_x$$ and $$s_y$$ denote the pixel pitch on the sensor in the horizontal and vertical direction, respectively, and $$(c_x, c_y)^\top$$ is the principal point.

In this paper, we assume that the interior orientation of the camera has been calibrated, e.g., using the approach in [63], i.e., that m, $$\kappa$$ or $$(K_1, K_2, K_3, P_1, P_2)$$, $$s_x$$, $$s_y$$, $$c_x$$, and $$c_y$$ are known. With this, it is possible to transform an image point $$(x_{\mathrm {i}}, y_{\mathrm {i}})^\top$$ back into a 2D point $$(x_{\mathrm {c}}, y_{\mathrm {c}})^\top$$ in the camera coordinate system. First, we invert (5):

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {d}} \\ y_{\mathrm {d}} \end{array} \right) = \left( \begin{array}{l} s_x (x_{\mathrm {i}} - c_x) \\ s_y (y_{\mathrm {i}} - c_y) \end{array} \right) . \end{aligned}
(6)

Next, we apply (3) or (4). Finally, we invert (2):

\begin{aligned} \left( \begin{array}{l} x_{\mathrm {c}} \\ y_{\mathrm {c}} \end{array} \right) = \frac{1}{m} \left( \begin{array}{l} x_{\mathrm {u}} \\ y_{\mathrm {u}} \end{array} \right) . \end{aligned}
(7)

The point $$(x_{\mathrm {c}}, y_{\mathrm {c}})^\top$$ is obviously given in the same units, e.g., meters, as the point $${\varvec{p}}_{\mathrm {o}}$$. To distinguish this projected 2D point from the 3D point $${\varvec{p}}_{\mathrm {c}}$$, we define $${\varvec{p}}_{\mathrm {p}} = (x_{\mathrm {c}}, y_{\mathrm {c}})^\top$$.
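The back-projection chain (6), (3), and (7) for the division model can be sketched as follows; the interior orientation values used in the example are hypothetical, not calibration results:

```python
import numpy as np

def image_to_cam(xi, yi, m, kappa, sx, sy, cx, cy):
    """Transform an image point into the projected 2D point p_p in the
    camera coordinate system via (6), the division-model undistortion (3),
    and (7)."""
    # (6): image coordinates -> distorted sensor coordinates
    xd = sx * (xi - cx)
    yd = sy * (yi - cy)
    # (3): division-model undistortion
    f = 1.0 / (1.0 + kappa * (xd**2 + yd**2))
    xu, yu = f * xd, f * yd
    # (7): undo the telecentric magnification
    return np.array([xu / m, yu / m])

# Hypothetical interior orientation (example values only)
p_p = image_to_cam(330.0, 250.0, m=0.5, kappa=-10.0, sx=1e-5, sy=1e-5,
                   cx=320.0, cy=240.0)
```

For the polynomial model, the division-model step would simply be replaced by (4).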

### 2.2 Procrustes Problems

From the above discussion, we can see that the defining equation for the OnP problem for a single point correspondence is

\begin{aligned} {\varvec{p}}_{\mathrm {p}} = {{\mathbf {\mathtt{{R}}}}}_2 {\varvec{p}}_{\mathrm {o}} + {\varvec{t}}_2 , \end{aligned}
(8)

where $${{\mathbf {\mathtt{{R}}}}}_2$$ denotes the first two rows of $${{\mathbf {\mathtt{{R}}}}}$$ and $${\varvec{t}}_2$$ denotes the first two elements of $${\varvec{t}}$$. As noted above, the third row of $${{\mathbf {\mathtt{{R}}}}}$$ can be computed as the vector product of the two rows of $${{\mathbf {\mathtt{{R}}}}}_2$$, while the element $$t_z$$ of $${\varvec{t}}$$ cannot be determined and can be set to 0.

To simplify the discussion below, from now on, we will omit the subscripts from $${{\mathbf {\mathtt{{R}}}}}_2$$ and $${\varvec{t}}_2$$. Consequently, from now on $${{\mathbf {\mathtt{{R}}}}}$$ will denote a $$2 \times 3$$ matrix with orthogonal rows ($${{\mathbf {\mathtt{{R}}}}} {{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{I}}}}}_2$$) and $${\varvec{t}}$$ will denote $$(t_x, t_y)^\top$$. Furthermore, to simplify the notation, we define $${\varvec{x}}' = {\varvec{p}}_{\mathrm {o}}$$ and $${\varvec{y}}' = {\varvec{p}}_{\mathrm {p}}$$.

To solve the OnP problem for n point correspondences $${\varvec{x}}'_i \leftrightarrow {\varvec{y}}'_i$$, we minimize the following reprojection error over $${{\mathbf {\mathtt{{R}}}}}$$ and $${\varvec{t}}$$:

\begin{aligned} \varepsilon ({{\mathbf {\mathtt{{R}}}}}, {\varvec{t}}) = \sum _{i=1}^n \Vert {{\mathbf {\mathtt{{R}}}}} {\varvec{x}}'_i + {\varvec{t}} - {\varvec{y}}'_i \Vert _2^2 . \end{aligned}
(9)

### Proposition 1

The translation $${\varvec{t}}$$ in (9) is given by

\begin{aligned} {\varvec{t}} = \bar{{\varvec{y}}} - {{\mathbf {\mathtt{{R}}}}} \bar{{\varvec{x}}} , \end{aligned}
(10)

where

\begin{aligned} \bar{{\varvec{x}}} = \frac{1}{n} \sum _{i=1}^n {\varvec{x}}'_i \qquad \textit{and} \qquad \bar{{\varvec{y}}} = \frac{1}{n} \sum _{i=1}^n {\varvec{y}}'_i . \end{aligned}
(11)

### Proof

The proof is analogous to the proofs in [27, Sections II.B and III.B]. $$\square$$

As a result of Proposition 1, the OnP problem reduces to the following orthogonal Procrustes problem:

\begin{aligned} \min _{{{\mathbf {\mathtt{{R}}}}}} \sum _{i=1}^n \Vert {{\mathbf {\mathtt{{R}}}}} {\varvec{x}}_i - {\varvec{y}}_i \Vert _2^2 , \end{aligned}
(12)

where $${\varvec{x}}_i = {\varvec{x}}'_i - \bar{{\varvec{x}}}$$ and $${\varvec{y}}_i = {\varvec{y}}'_i - \bar{{\varvec{y}}}$$.
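For illustration, the reduction of Proposition 1 can be checked numerically on a synthetic noise-free instance (the point coordinates and pose below are random examples); the optimal translation is recovered as $${\varvec{t}} = \bar{{\varvec{y}}} - {{\mathbf {\mathtt{{R}}}}} \bar{{\varvec{x}}}$$:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic noise-free OnP instance with a known pose.
R = np.linalg.qr(rng.standard_normal((3, 3)))[0][:2, :]  # 2 x 3, orthonormal rows
t = np.array([0.3, -0.7])
Xp = rng.standard_normal((10, 3))     # points x'_i, one per row
Yp = Xp @ R.T + t                     # projections y'_i = R x'_i + t

# Centroids (11) and the centered points used in (12)
x_bar, y_bar = Xp.mean(axis=0), Yp.mean(axis=0)
X, Y = Xp - x_bar, Yp - y_bar

# Proposition 1: the optimal translation is t = y_bar - R x_bar
t_est = y_bar - R @ x_bar
print(np.allclose(t_est, t))          # True
```

After centering, the translation drops out completely, and only the rotation remains to be estimated from the centered correspondences.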

As is customary for Procrustes problems [25], we can stack the points $${\varvec{x}}_i^\top$$ into an $$n \times 3$$ matrix $${{\mathbf {\mathtt{{X}}}}}$$ and the points $${\varvec{y}}_i^\top$$ into an $$n \times 2$$ matrix $${{\mathbf {\mathtt{{Y}}}}}$$ and can write (12) as

\begin{aligned} \min _{{{\mathbf {\mathtt{{Q}}}}}} \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2 , \end{aligned}
(13)

where $${{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{R}}}}}^\top$$ and $$\Vert \cdot \Vert _{\mathrm {F}}$$ denotes the Frobenius norm.

### Proposition 2

Every Procrustes problem $$\min _{{{\mathbf {\mathtt{{Q}}}}}} \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2$$, where $${{\mathbf {\mathtt{{Q}}}}}$$ is a $$p \times q$$ matrix, $${{\mathbf {\mathtt{{X}}}}}$$ is an $$n \times p$$ matrix, $${{\mathbf {\mathtt{{Y}}}}}$$ is an $$n \times q$$ matrix, $$p \ge q$$, and $$n \ge p$$, can be reduced to an equivalent Procrustes problem $$\min _{{{\mathbf {\mathtt{{Q}}}}}} \Vert {{\mathbf {\mathtt{{X}}}}}' {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}}' \Vert _{\mathrm {F}}^2$$, where $${{\mathbf {\mathtt{{X}}}}}'$$ is a $$p \times p$$ matrix and $${{\mathbf {\mathtt{{Y}}}}}'$$ is a $$p \times q$$ matrix.

### Proof

The proof is adapted from [10, Section 2] and [24, Chapter 5.3.3]. Let us compute the full QR decomposition of $${{\mathbf {\mathtt{{X}}}}}$$: $${{\mathbf {\mathtt{{X}}}}} = {{\mathbf {\mathtt{{S}}}}} {{\mathbf {\mathtt{{T}}}}}$$, where $${{\mathbf {\mathtt{{S}}}}}$$ is an $$n \times n$$ orthogonal matrix ($${{\mathbf {\mathtt{{S}}}}}^\top {{\mathbf {\mathtt{{S}}}}} = {{\mathbf {\mathtt{{I}}}}}_n$$) and $${{\mathbf {\mathtt{{T}}}}}$$ is upper triangular, i.e., $${{\mathbf {\mathtt{{T}}}}}$$ can be written as $${{\mathbf {\mathtt{{T}}}}} = ({{\mathbf {\mathtt{{U}}}}}^\top , {{\mathbf {\mathtt{{0}}}}}^\top )^\top$$, where $${{\mathbf {\mathtt{{U}}}}}$$ is a $$p \times p$$ upper triangular matrix. Then, $${{\mathbf {\mathtt{{S}}}}}^\top {{\mathbf {\mathtt{{X}}}}} = {{\mathbf {\mathtt{{T}}}}}$$. Furthermore, $${{\mathbf {\mathtt{{S}}}}}^\top {{\mathbf {\mathtt{{Y}}}}} = {{\mathbf {\mathtt{{V}}}}} = ({{\mathbf {\mathtt{{V}}}}}_1^\top , {{\mathbf {\mathtt{{V}}}}}_2^\top )^\top$$, where $${{\mathbf {\mathtt{{V}}}}}_1$$ is $$p \times q$$ and $${{\mathbf {\mathtt{{V}}}}}_2$$ is $$(n-p) \times q$$. By using the fact that the Frobenius norm is invariant under multiplication with orthogonal matrices, we obtain:

\begin{aligned} \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2&= \Vert {{\mathbf {\mathtt{{S}}}}}^\top ( {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}} ) \Vert _{\mathrm {F}}^2 \end{aligned}
(14)
\begin{aligned}&= \Vert {{\mathbf {\mathtt{{T}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{V}}}}} \Vert _{\mathrm {F}}^2 \end{aligned}
(15)
\begin{aligned}&= \Vert {{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{V}}}}}_1 \Vert _{\mathrm {F}}^2 + \Vert {{\mathbf {\mathtt{{0}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{V}}}}}_2 \Vert _{\mathrm {F}}^2 \end{aligned}
(16)
\begin{aligned}&= \Vert {{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{V}}}}}_1 \Vert _{\mathrm {F}}^2 + \Vert {{\mathbf {\mathtt{{V}}}}}_2 \Vert _{\mathrm {F}}^2 . \end{aligned}
(17)

Since $$\Vert {{\mathbf {\mathtt{{V}}}}}_2 \Vert _{\mathrm {F}}^2$$ is constant, (13) is equivalent to the reduced Procrustes problem

\begin{aligned} \min _{{{\mathbf {\mathtt{{Q}}}}}} \Vert {{\mathbf {\mathtt{{X}}}}}' {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}}' \Vert _{\mathrm {F}}^2 , \end{aligned}
(18)

where $${{\mathbf {\mathtt{{X}}}}}' = {{\mathbf {\mathtt{{U}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}' = {{\mathbf {\mathtt{{V}}}}}_1$$ are computed from $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ via the QR decomposition of $${{\mathbf {\mathtt{{X}}}}}$$ as described above. $$\square$$
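The reduction of Proposition 2 can be sketched with a full QR decomposition; the check below confirms numerically that the original and the reduced residual differ only by the constant $$\Vert {{\mathbf {\mathtt{{V}}}}}_2 \Vert _{\mathrm {F}}^2$$ (the helper name `reduce_procrustes` is ours, for illustration):

```python
import numpy as np

def reduce_procrustes(X, Y):
    """Reduce an n x p / n x q Procrustes problem to p x p / p x q
    (Proposition 2); assumes X has full rank."""
    n, p = X.shape
    S, T = np.linalg.qr(X, mode='complete')  # X = S T with S orthogonal (n x n)
    U = T[:p, :]                             # p x p upper triangular block of T
    V = S.T @ Y
    return U, V[:p, :], V[p:, :]             # X', Y', and the constant part V_2

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
Y = rng.standard_normal((20, 2))
Q = np.linalg.qr(rng.standard_normal((3, 2)))[0]   # an arbitrary Stiefel matrix

Xr, Yr, V2 = reduce_procrustes(X, Y)
lhs = np.linalg.norm(X @ Q - Y)**2
rhs = np.linalg.norm(Xr @ Q - Yr)**2 + np.linalg.norm(V2)**2
print(np.isclose(lhs, rhs))   # True
```

Since the reduction is independent of $${{\mathbf {\mathtt{{Q}}}}}$$, it can be performed once before any iterative solver is run, which makes the cost of each iteration independent of n.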

### 2.3 The OnP Problem for Non-coplanar 3D Points

We now apply the results of Sect. 2.2 to the OnP problem for non-coplanar points.

Obviously, we need at least three points $${\varvec{x}}_i$$ in general position, i.e., not collinear, to have a finite number of solutions. Since three points are always coplanar, we assume $$n \ge 4$$. The case of coplanar points will be discussed in Sect. 2.4. We assume that $${{\mathbf {\mathtt{{X}}}}}$$ has full rank.

Proposition 2 shows that $${{\mathbf {\mathtt{{X}}}}}$$ can be reduced to a $$3 \times 3$$ matrix and $${{\mathbf {\mathtt{{Y}}}}}$$ to a $$3 \times 2$$ matrix.

### Definition 1

The Stiefel manifold $${\mathscr {V}}_{p,q}$$ is the set of all $$p \times q$$ matrices $${{\mathbf {\mathtt{{Q}}}}}$$ with orthogonal columns ($$p \ge q$$), i.e., $${{\mathbf {\mathtt{{Q}}}}}^\top {{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{I}}}}}_q$$ [65]. An element $${{\mathbf {\mathtt{{Q}}}}} \in {\mathscr {V}}_{p,q}$$ is called a Stiefel matrix. We denote $${\mathscr {V}}_{3,2}$$ by $${\mathbb {S}}$$.

The discussion in Sect. 2.2 shows that the OnP problem for non-coplanar 3D points is equivalent to the following Procrustes problem:

\begin{aligned} \min _{{{\mathbf {\mathtt{{Q}}}}} \in {\mathbb {S}}} \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2 . \end{aligned}
(19)

The Procrustes problem in (19) is called the projection Procrustes problem (e.g., [25]) or the unbalanced Procrustes problem (e.g., [57, 71]). We will discuss algorithms that can be used to solve (19) in Sect. 3.

### Remark 1

The Stiefel manifold $${\mathbb {S}}$$ obviously has three degrees of freedom: the matrices $${{\mathbf {\mathtt{{Q}}}}}$$ have six degrees of freedom and the orthogonality requirement $${{\mathbf {\mathtt{{Q}}}}}^\top {{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{I}}}}}_2$$ provides three equations that the elements of $${{\mathbf {\mathtt{{Q}}}}}$$ must fulfill.

### Remark 2

Instead of the direct representation of the rotation by its matrix elements and the constraints $${{\mathbf {\mathtt{{R}}}}} {{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{I}}}}}_2$$, we can also parameterize the rotation by a unit quaternion $${\varvec{q}} = (q_0, q_1, q_2, q_3)^\top$$ ($$\Vert {\varvec{q}}\Vert _2^2 = 1$$). Then, the matrix $${{\mathbf {\mathtt{{R}}}}}$$ is given by

\begin{aligned} {{\mathbf {\mathtt{{R}}}}} = \left( \begin{array}{ccc} q_0^2 + q_1^2 - q_2^2 - q_3^2 &{} 2 (q_1 q_2 - q_0 q_3) &{} 2 (q_1 q_3 + q_0 q_2) \\ 2 (q_1 q_2 + q_0 q_3) &{} q_0^2 - q_1^2 + q_2^2 - q_3^2 &{} 2 (q_2 q_3 - q_0 q_1) \end{array} \right) . \end{aligned}
(20)
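The quaternion parameterization can be checked numerically; assuming the standard Hamilton convention for the quaternion rotation matrix, the first two rows are computed as follows and are orthonormal for any unit quaternion:

```python
import numpy as np

def quat_to_R2(q):
    """First two rows of the rotation matrix of a unit quaternion
    q = (q0, q1, q2, q3) in the standard Hamilton convention."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
    ])

q = np.array([0.4, 0.2, -0.1, 0.5])
q /= np.linalg.norm(q)                     # normalize to a unit quaternion
R2 = quat_to_R2(q)
print(np.allclose(R2 @ R2.T, np.eye(2)))   # rows are orthonormal: True
```

Note that the quaternion representation enforces the row-orthogonality constraints by construction, at the price of the single normalization constraint on $${\varvec{q}}$$.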

### 2.4 The OnP Problem for Coplanar 3D Points

Like for non-coplanar points, we must have at least three points $${\varvec{x}}_i$$ in general position, i.e., not collinear, to have a finite number of solutions.

Since the points $${\varvec{x}}_i$$ are coplanar, without loss of generality we can assume that they lie in the plane $$z = 0$$. This means that the third column of $${{\mathbf {\mathtt{{X}}}}}$$ will be $${\varvec{0}}$$. This shows that for coplanar points the third row of $${{\mathbf {\mathtt{{Q}}}}}$$ (i.e., the third column of $${{\mathbf {\mathtt{{R}}}}}$$) cannot be determined from the Procrustes problem alone. Therefore, we may omit the third column of $${{\mathbf {\mathtt{{X}}}}}$$ and the third row of $${{\mathbf {\mathtt{{Q}}}}}$$ in (13).

### Definition 2

A sub-Stiefel matrix $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$ is a $$p \times p$$ matrix that is obtained by deleting the last row of a $$(p+1) \times p$$ Stiefel matrix $${{\mathbf {\mathtt{{Q}}}}}$$:

\begin{aligned} {{\mathbf {\mathtt{{Q}}}}} = \left( \begin{array}{l} {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \\ {\varvec{q}}^\top \end{array} \right) \end{aligned}
(21)

for some $${\varvec{q}} \in {\mathbb {R}}^p$$ [10]. We denote the set of $$2 \times 2$$ sub-Stiefel matrices by $${\mathbb {S}_{\mathrm {s}}}$$.

Proposition 2 and [10, Lemma 4.1] show that, for coplanar points, $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ can be reduced to $$2 \times 2$$ matrices.

From the discussion in Sect. 2.2, it follows that the OnP problem for coplanar 3D points is equivalent to the following Procrustes problem:

\begin{aligned} \min _{{{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \in {\mathbb {S}_{\mathrm {s}}}} \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2 . \end{aligned}
(22)

Algorithms that can be used to solve (22) will be discussed in Sect. 4.

### Proposition 3

If $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \in {\mathbb {S}_{\mathrm {s}}}$$, then

\begin{aligned} \text{ tr }({{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}^\top {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}) - (\det {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}})^2 = \Vert {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \Vert _{\mathrm {F}}^2 - (\det {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}})^2 = 1 . \end{aligned}
(23)

### Proof

Suppose $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \in {\mathbb {S}_{\mathrm {s}}}$$. Then, by Definition 2 there is a matrix $${{\mathbf {\mathtt{{Q}}}}} \in {\mathbb {S}}$$ of which $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$ is the upper $$2 \times 2$$ submatrix. Let us denote the elements of $${{\mathbf {\mathtt{{Q}}}}}$$ by $$q_{ij}$$. Since the columns of $${{\mathbf {\mathtt{{Q}}}}}$$ are orthogonal, we have:

\begin{aligned}&q_{11}^2 + q_{21}^2 + q_{31}^2 = 1 \end{aligned}
(24)
\begin{aligned}&q_{12}^2 + q_{22}^2 + q_{32}^2 = 1 \end{aligned}
(25)
\begin{aligned}&q_{11} q_{12} + q_{21} q_{22} + q_{31} q_{32} = 0 . \end{aligned}
(26)

To eliminate $$q_{31}$$ and $$q_{32}$$, which do not occur in $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$, we can solve (24) and (25) for $$q_{31}$$ and $$q_{32}$$, respectively:

\begin{aligned} q_{31}&= \pm \sqrt{1 - q_{11}^2 - q_{21}^2} \end{aligned}
(27)
\begin{aligned} q_{32}&= \pm \sqrt{1 - q_{12}^2 - q_{22}^2} . \end{aligned}
(28)

Moving the term $$q_{31} q_{32}$$ to the right-hand side and substituting (27) and (28) into (26) results in

\begin{aligned} q_{11} q_{12} + q_{21} q_{22} = \pm \sqrt{1 - q_{11}^2 - q_{21}^2} \sqrt{1 - q_{12}^2 - q_{22}^2}.\qquad \end{aligned}
(29)

To remove the sign ambiguity on the right-hand side, we can square both sides to obtain

\begin{aligned} (q_{11} q_{12} + q_{21} q_{22})^2 = (1 - q_{11}^2 - q_{21}^2) (1 - q_{12}^2 - q_{22}^2).\qquad \end{aligned}
(30)

By bringing all terms to the left-hand side and simplifying, we obtain

\begin{aligned} q_{11}^2 + q_{12}^2 + q_{21}^2 + q_{22}^2 - (q_{11} q_{22} - q_{12} q_{21})^2 - 1 = 0.\quad \end{aligned}
(31)

This proves the proposition. $$\square$$
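Proposition 3 can be verified numerically on a random sub-Stiefel matrix, obtained here by deleting the last row of a random $$3 \times 2$$ Stiefel matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random 3 x 2 Stiefel matrix Q; deleting its last row gives Q_s.
Q = np.linalg.qr(rng.standard_normal((3, 2)))[0]
Qs = Q[:2, :]

# Proposition 3: ||Q_s||_F^2 - (det Q_s)^2 = 1
val = np.linalg.norm(Qs, 'fro')**2 - np.linalg.det(Qs)**2
print(np.isclose(val, 1.0))   # True
```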

### Remark 3

Proposition 3 shows that the set $${\mathbb {S}_{\mathrm {s}}}$$ has three degrees of freedom: the matrices $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$ have four degrees of freedom and the requirement (23) provides one equation that the elements of $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$ must fulfill.

### Remark 4

Theorems 4.2 and 4.4 of [10] give two additional characterizations of sub-Stiefel matrices. We will not make use of these characterizations in this paper.

### Proposition 4

Given a sub-Stiefel matrix $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}} \in {\mathbb {S}_{\mathrm {s}}}$$, there are two possible solutions for the associated $${{\mathbf {\mathtt{{Q}}}}} \in {\mathbb {S}}$$.

### Proof

The magnitudes of the possible solutions for $$q_{31}$$ and $$q_{32}$$ are given by (27) and (28). We can select one solution $$(q_{31}, q_{32})$$ whose relative sign is determined by (26). It is obvious from (24)–(26) that $$(-q_{31}, -q_{32})$$ will be a second solution. It is also obvious that the other two candidates $$(q_{31}, -q_{32})$$ and $$(-q_{31}, q_{32})$$ cannot be solutions because if $$(q_{31}, q_{32})$$ fulfills (26) then neither $$(q_{31}, -q_{32})$$ nor $$(-q_{31}, q_{32})$$ can fulfill (26) (unless $$(q_{31}, q_{32}) = (0,0)$$). $$\square$$
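The construction in this proof can be sketched directly in code; `lift_sub_stiefel` is an illustrative helper name, not from the original text:

```python
import numpy as np

def lift_sub_stiefel(Qs):
    """Lift a 2 x 2 sub-Stiefel matrix Qs to the two associated 3 x 2
    Stiefel matrices (Proposition 4), using (27), (28), and (26)."""
    q31 = np.sqrt(max(0.0, 1.0 - Qs[0, 0]**2 - Qs[1, 0]**2))   # (27)
    q32 = np.sqrt(max(0.0, 1.0 - Qs[0, 1]**2 - Qs[1, 1]**2))   # (28)
    # Choose the relative sign of q31 and q32 such that (26) holds.
    if not np.isclose(Qs[0, 0]*Qs[0, 1] + Qs[1, 0]*Qs[1, 1] + q31*q32, 0.0):
        q32 = -q32
    third = np.array([q31, q32])
    return np.vstack([Qs, third]), np.vstack([Qs, -third])

rng = np.random.default_rng(2)
Q = np.linalg.qr(rng.standard_normal((3, 2)))[0]   # random Stiefel matrix
Q1, Q2 = lift_sub_stiefel(Q[:2, :])
print(np.allclose(Q1.T @ Q1, np.eye(2)), np.allclose(Q2.T @ Q2, np.eye(2)))
```

Both returned matrices are valid Stiefel matrices that share the given sub-Stiefel block; they differ only in the sign of the third row.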

### Corollary 1

The OnP problem for coplanar points has two possible poses that correspond to the solution $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {s}}$$.

### Remark 5

Corollary 1 is also proved for three point correspondences under weak perspective in [38, Proposition 2]. Furthermore, [38] shows that the two solutions correspond to the well-known Necker reversal. This is also true for the OnP problem.

### Remark 6

A method to reconstruct a full rotation matrix from a general $$p \times p$$ sub-Stiefel matrix is given in [10, Theorem 4.3]. As in Proposition 4, there are two possible solutions.

### Remark 7

If the two solutions in Proposition 4 are represented by Euler angles as $${{\mathbf {\mathtt{{R}}}}} = {{\mathbf {\mathtt{{R}}}}}_z(\gamma ) {{\mathbf {\mathtt{{R}}}}}_y(\beta ) {{\mathbf {\mathtt{{R}}}}}_x(\alpha )$$, the two solutions are related in a very simple manner: if one solution is given by $$(\alpha , \beta , \gamma )$$, the second solution is given by $$(-\alpha , -\beta , \gamma )$$.

### Remark 8

Like for non-coplanar points (see Remark 2), we can parameterize the rotation by a unit quaternion $${\varvec{q}} = (q_0, q_1, q_2, q_3)^\top$$ ($$\Vert {\varvec{q}}\Vert _2^2 = 1$$), resulting in the matrix

\begin{aligned} \left( \begin{array}{cc} q_0^2 + q_1^2 - q_2^2 - q_3^2 &{} 2 (q_1 q_2 - q_0 q_3) \\ 2 (q_1 q_2 + q_0 q_3) &{} q_0^2 - q_1^2 + q_2^2 - q_3^2 \end{array} \right) . \end{aligned}
(32)

### Proposition 5

Given a quaternion $${\varvec{q}} = (q_0, q_1, q_2, q_3)^\top$$ that solves the OnP problem for coplanar points, the second solution of the OnP problem derived in Proposition 4 and Corollary 1 is given by $${\varvec{q}}' = (q_0, -q_1, -q_2, q_3)^\top$$.

### Proof

According to the proof of Proposition 4, the second solution has a matrix $${{\mathbf {\mathtt{{R}}}}}$$ for which the left $$2 \times 2$$ submatrix is identical to that of the first solution, whereas the third column is negated. Substituting $${\varvec{q}}$$ and $${\varvec{q}}'$$ into (20) proves the proposition. $$\square$$

### Corollary 2

Based on Proposition 5 and the fact that $${\varvec{q}}$$ and $$-{\varvec{q}}$$ represent the same rotation, there are four equivalent solutions to the OnP problem for coplanar points if the rotation is parameterized by quaternions. Uniqueness can be achieved, for example, by requiring $$q_0 \ge 0$$ and $$q_1 \ge 0$$.

### Remark 9

The translation $${\varvec{t}}$$ in Proposition 1 is identical for both solutions in Proposition 4.

## 3 Algorithms for Solving the OnP Problem for Non-coplanar Points

As discussed in Sect. 2.3, the OnP problem for non-coplanar points is equivalent to the unbalanced Procrustes problem. Consequently, we can use algorithms that have been proposed to solve this problem. Our two main requirements for these algorithms are robustness and speed. Since the OnP problem may exhibit multiple local minima, an algorithm should ideally find the global minimum. Furthermore, it should be as fast as possible. The robustness and speed of the algorithms will be evaluated in Sect. 5.

An analytic solution for the unbalanced Procrustes problem was given by Cliff [13] (see also [25, Chapter 5.1]). However, this formulation does not use the least-squares criterion of our OnP problem formulation. Instead, it uses an inner-product criterion to find the optimal rotation. Therefore, it solves a different problem than the one we are interested in.

Analytic solutions for the unbalanced $$p \times q$$ Procrustes problem using least squares as the optimization criterion are only known for $$p = q$$ (the balanced Procrustes problem) or $$q = 1$$. In our case, $$p = 3$$ and $$q = 2$$. Therefore, we must resort to iterative algorithms that may converge to local minima. This explains our requirements that an ideal algorithm should converge to the global minimum and should be as fast as possible.

One candidate algorithm we found is the algorithm of Green and Gower [25, Algorithm 5.1], which is also described in [66]. We will describe this algorithm in Sect. 3.1. We selected this algorithm as a candidate because the results in [71] indicated good performance.

Another candidate is the algorithm of Koschat and Swayne [41] (see also [25, Algorithm 5.2]). This algorithm will be described in Sect. 3.2. We selected this algorithm as a candidate because the results in [12] seemed to indicate reasonable performance.

There are further algorithms that have been proposed [7, 53, 57, 71]. The results in [71] indicate that the algorithms proposed by Park [57] and Bojanczyk and Lutoborski [7] are significantly slower than the algorithm of Green and Gower. Furthermore, the algorithm proposed by Zhang and Du [71] is also slightly slower than the algorithm of Green and Gower. Finally, the results in [12] indicate that the algorithm of Mooijaart and Commandeur [53] is significantly slower than the algorithm of Koschat and Swayne. Therefore, we did not consider any of these algorithms.

In our search for fast and robust algorithms, we also implemented a Levenberg–Marquardt algorithm (Sect. 3.3) and two algorithms that iteratively solve systems of polynomial equations that are based on Lagrange multipliers of the Procrustes problem (Sects. 3.4 and 3.5).

Finally, to check whether the above algorithms converge to the global optimum, we also implemented a solver based on an algorithm that is capable of finding the global optimum of polynomial optimization problems with polynomial equality and inequality constraints (Sect. 3.6).

### 3.1 The Algorithm of Green and Gower

The algorithm of Green and Gower is described in [25, Algorithm 5.1] (see also [66]). We extend it by reducing the number of point correspondences to three, as described in Proposition 2. Applied to the OnP problem for non-coplanar points, it can be described as follows:

1. Reduce $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ to $${{\mathbf {\mathtt{{X}}}}}'$$ and $${{\mathbf {\mathtt{{Y}}}}}'$$ as described in Proposition 2.

2. Extend $${{\mathbf {\mathtt{{Y}}}}}'$$ to a $$3 \times 3$$ matrix by adding a $${\varvec{0}}$$ column on the right.

3. Set the result matrix $${{\mathbf {\mathtt{{Q}}}}}$$ to $${{\mathbf {\mathtt{{I}}}}}_3$$.

4. Use a balanced orthogonal Procrustes algorithm to determine the 3D rotation matrix $${{\mathbf {\mathtt{{Q}}}}}'$$ based on $${{\mathbf {\mathtt{{X}}}}}'$$ and the extended $${{\mathbf {\mathtt{{Y}}}}}'$$.

5. Replace $${{\mathbf {\mathtt{{X}}}}}'$$ by $${{\mathbf {\mathtt{{X}}}}}' {{\mathbf {\mathtt{{Q}}}}}'$$.

6. Replace $${{\mathbf {\mathtt{{Q}}}}}$$ by $${{\mathbf {\mathtt{{Q}}}}} {{\mathbf {\mathtt{{Q}}}}}'$$.

7. If the norm of the difference between the last column of $${{\mathbf {\mathtt{{X}}}}}'$$ and the last column of the extended $${{\mathbf {\mathtt{{Y}}}}}'$$ is below a threshold, go to step 9.

8. Replace the last column of the extended $${{\mathbf {\mathtt{{Y}}}}}'$$ by the last column of the updated $${{\mathbf {\mathtt{{X}}}}}'$$ and go to step 4.

9. Compute $${\varvec{t}}$$ by (10). Set $$t_z = 0$$ (cf. Sect. 2.2).

There are numerous algorithms to solve the balanced orthogonal Procrustes problem in step 4 above [3, 26, 27, 36, 37, 60, 68, 69], [25, Chapter 4]. We use the algorithm proposed in [68] because it ensures that a rotation matrix is returned (as opposed to a general orthogonal matrix, which also could include a reflection).

We also note that ten Berge and Knol [66] proposed a different initialization of the extension of $${{\mathbf {\mathtt{{Y}}}}}'$$ in step 2 above. They claimed that their modification helps the algorithm of Green and Gower to avoid local minima. This does not correspond to our experience: when we used the modified initial value in the experiments reported in Sect. 5.1, the robustness decreased significantly in some of the experiments, i.e., the algorithm converged to local minima significantly more often. Therefore, we do not use the modified initialization proposed in [66].
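The iteration above can be sketched in numpy as follows; this is an illustrative version (with hypothetical helper names) that omits the Proposition 2 reduction of step 1 and uses the SVD solution with determinant correction for the balanced subproblem:

```python
import numpy as np

def balanced_procrustes(X, Y):
    """Rotation Q minimizing ||X Q - Y||_F (SVD solution with the
    determinant correction that enforces a proper rotation)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def green_gower(X, Y, tol=1e-12, max_iter=1000):
    """Sketch of steps 2-8 of the Green-Gower iteration; X is n x 3,
    Y is n x 2."""
    Ye = np.hstack([Y, np.zeros((X.shape[0], 1))])   # step 2
    Q = np.eye(3)                                    # step 3
    for _ in range(max_iter):
        Qp = balanced_procrustes(X, Ye)              # step 4
        X = X @ Qp                                   # step 5
        Q = Q @ Qp                                   # step 6
        if np.linalg.norm(X[:, 2] - Ye[:, 2]) < tol: # step 7
            break
        Ye[:, 2] = X[:, 2]                           # step 8
    return Q[:, :2]      # Stiefel matrix for the problem (19)

rng = np.random.default_rng(3)
X0 = rng.standard_normal((12, 3))
Qt = np.linalg.qr(rng.standard_normal((3, 3)))[0][:, :2]   # ground truth
Y = X0 @ Qt                                                # noise-free data
Qh = green_gower(X0, Y)
print(np.linalg.norm(X0 @ Qh - Y))   # residual of the recovered solution
```

The rotation and translation of the pose would then be obtained from the returned Stiefel matrix as in step 9.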

### 3.2 The Algorithm of Koschat and Swayne

The algorithm of Koschat and Swayne was originally proposed in [41]. Our implementation is based on the modifications that are proposed in [25, Algorithm 5.2]. Since the inner loop of the algorithm does not use the matrices $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ directly, the runtime of the inner loop does not depend on the number of point correspondences. Therefore, there is no need to reduce the number of point correspondences to three by Proposition 2. When applied to the OnP problem, the algorithm can be described as follows:

1. Initialize $${{\mathbf {\mathtt{{Q}}}}}$$ by the algorithm of Cliff (see [13] or [25, Chapter 5.1]).

2. Set $$\rho ^2 = \Vert {{\mathbf {\mathtt{{X}}}}} \Vert _{\mathrm {F}}^2$$.

3. Set $${{\mathbf {\mathtt{{A}}}}} = \rho ^2 {{\mathbf {\mathtt{{I}}}}} - {{\mathbf {\mathtt{{X}}}}}^\top {{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}} = {{\mathbf {\mathtt{{Y}}}}}^\top {{\mathbf {\mathtt{{X}}}}}$$.

4. Set $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {o}} = {{\mathbf {\mathtt{{Q}}}}}$$.

5. Set $${{\mathbf {\mathtt{{Z}}}}} = {{\mathbf {\mathtt{{B}}}}} + {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {o}}^\top {{\mathbf {\mathtt{{A}}}}}$$.

6. Compute the SVD of $${{\mathbf {\mathtt{{Z}}}}}$$: $${{\mathbf {\mathtt{{Z}}}}} = {{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{S}}}}} {{\mathbf {\mathtt{{V}}}}}^\top$$.

7. Set $${{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{V}}}}} {{\mathbf {\mathtt{{U}}}}}^\top$$.

8. If $$\Vert {{\mathbf {\mathtt{{Q}}}}} - {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {o}} \Vert _{\mathrm {F}}$$ is greater than a threshold, go to step 4.

9. Compute $${{\mathbf {\mathtt{{R}}}}} = {{\mathbf {\mathtt{{Q}}}}}^\top$$. Compute the third row of the full rotation matrix as the vector product of the first two rows. Compute $${\varvec{t}}$$ by (10). Set $$t_z = 0$$ (cf. Sect. 2.2).
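A numpy sketch of this iteration follows; as an assumption on our part, step 1 is implemented with the SVD polar factor of $${{\mathbf {\mathtt{{X}}}}}^\top {{\mathbf {\mathtt{{Y}}}}}$$, which maximizes the inner-product criterion that Cliff's method optimizes:

```python
import numpy as np

def polar_stiefel(M):
    """Stiefel matrix maximizing the inner-product criterion tr(Q^T M)."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def koschat_swayne(X, Y, tol=1e-12, max_iter=10000):
    """Sketch of the Koschat-Swayne iteration; X is n x 3, Y is n x 2."""
    Q = polar_stiefel(X.T @ Y)                  # step 1 (see the lead-in)
    rho2 = np.linalg.norm(X, 'fro')**2          # step 2
    A = rho2 * np.eye(3) - X.T @ X              # step 3
    B = Y.T @ X
    for _ in range(max_iter):
        Qo = Q                                  # step 4
        Z = B + Qo.T @ A                        # step 5
        U, _, Vt = np.linalg.svd(Z, full_matrices=False)  # step 6
        Q = Vt.T @ U.T                          # step 7: Q = V U^T
        if np.linalg.norm(Q - Qo, 'fro') < tol: # step 8
            break
    return Q

rng = np.random.default_rng(5)
X = rng.standard_normal((15, 3))
Qt = np.linalg.qr(rng.standard_normal((3, 3)))[0][:, :2]
Y = X @ Qt
Qh = koschat_swayne(X, Y)
print(np.linalg.norm(X @ Qh - Y))   # residual of the recovered solution
```

Note that the matrices $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$ are computed once, so the cost of each inner iteration is independent of the number of point correspondences, as stated above.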

### 3.3 The Levenberg–Marquardt Algorithm

The Levenberg–Marquardt algorithm we implemented is an adaptation of the algorithm described in [29, Appendix 6]. In contrast to the additive augmentation of the normal equations that is used in [29, Appendix 6], we use the multiplicative augmentation described in [58, Chapter 15.5]. The rotation matrix is parameterized by Euler angles, as described in Remark 7. First, the problem is reduced to three point correspondences by Proposition 2. We then use the same algorithm that is used in Sect. 3.4 to compute the initial estimate of the rotation matrix and initialize the Euler angles $$\alpha$$, $$\beta$$, and $$\gamma$$ from this rotation matrix. To terminate the Levenberg–Marquardt algorithm, we use the criteria that are described in [58, Chapter 15.5]. After the Levenberg–Marquardt algorithm has converged, we compute $${{\mathbf {\mathtt{{R}}}}}$$ from the Euler angles, $${\varvec{t}}$$ by (10), and set $$t_z = 0$$ (cf. Sect. 2.2).
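A minimal sketch of such a Levenberg–Marquardt loop with the Euler-angle parameterization is given below; it is a simplified illustration (finite-difference Jacobian, a basic accept/reject rule, and a perturbed ground truth as initial estimate), not the implementation described above:

```python
import numpy as np

def euler_to_R2(a, b, g):
    """First two rows of R = Rz(g) Ry(b) Rx(a), the Euler-angle
    parameterization of Remark 7."""
    ca, sa = np.cos(a), np.sin(a)
    cb, sb = np.cos(b), np.sin(b)
    cg, sg = np.cos(g), np.sin(g)
    return np.array([
        [cb*cg, sa*sb*cg - ca*sg, ca*sb*cg + sa*sg],
        [cb*sg, sa*sb*sg + ca*cg, ca*sb*sg - sa*cg],
    ])

def onp_lm(X, Y, theta0, max_iter=100):
    """Levenberg-Marquardt loop for (12) with multiplicative (Marquardt)
    augmentation of the normal equations; X is n x 3, Y is n x 2."""
    theta = np.asarray(theta0, dtype=float)
    lam, eps = 1e-3, 1e-7
    residuals = lambda t: (X @ euler_to_R2(*t).T - Y).ravel()
    r = residuals(theta)
    for _ in range(max_iter):
        # forward-difference Jacobian of the residual vector
        J = np.column_stack([(residuals(theta + eps * np.eye(3)[k]) - r) / eps
                             for k in range(3)])
        H = J.T @ J
        H_aug = H + lam * np.diag(np.diag(H))   # multiplicative augmentation
        step = np.linalg.solve(H_aug, -J.T @ r)
        r_new = residuals(theta + step)
        if r_new @ r_new < r @ r:               # accept the step
            theta, r, lam = theta + step, r_new, 0.1 * lam
            if np.linalg.norm(step) < 1e-12:
                break
        else:                                   # reject and increase damping
            lam *= 10.0
    return theta

rng = np.random.default_rng(6)
X = rng.standard_normal((10, 3))
theta_true = np.array([0.3, -0.2, 0.5])
Y = X @ euler_to_R2(*theta_true).T
theta_hat = onp_lm(X, Y, theta_true + np.array([0.1, -0.05, 0.08]))
```

The multiplicative augmentation scales the diagonal of the approximate Hessian by $$1 + \lambda$$, so the damping is invariant to the relative scaling of the parameters.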

### 3.4 The Iterative Polynomial System Solver Based on a Direct Representation of the Rotation

The OnP problem (12) for non-coplanar points can also be regarded as a constrained minimization problem. Let us denote the function to be minimized as

\begin{aligned} f({{\mathbf {\mathtt{{R}}}}}) = \sum _{i=1}^n \Vert {{\mathbf {\mathtt{{R}}}}} {\varvec{x}}_i - {\varvec{y}}_i \Vert _2^2 . \end{aligned}
(33)

The constraint $${{\mathbf {\mathtt{{R}}}}} {{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{I}}}}}_2$$ can be written explicitly as

\begin{aligned} {\varvec{h}}({{\mathbf {\mathtt{{R}}}}}) = \left( \begin{array}{l} r_{11}^2 + r_{12}^2 + r_{13}^2 - 1 \\ r_{21}^2 + r_{22}^2 + r_{23}^2 - 1 \\ r_{11} r_{21} + r_{12} r_{22} + r_{13} r_{23} \end{array} \right) = {\varvec{0}} . \end{aligned}
(34)

Thus, the constrained minimization problem is given by

\begin{aligned} \min _{{{\mathbf {\mathtt{{R}}}}}} f({{\mathbf {\mathtt{{R}}}}}) \quad \text {subject to} \quad {\varvec{h}}({{\mathbf {\mathtt{{R}}}}}) = {\varvec{0}} . \end{aligned}
(35)

This is a problem of minimizing a polynomial function with polynomial constraints, which we solve using the following approach: the first-order necessary conditions for this problem are given by (see [50, Chapter 11.3]):

\begin{aligned} \nabla f({{\mathbf {\mathtt{{R}}}}}) + \varvec{\lambda }^\top \nabla {\varvec{h}}({{\mathbf {\mathtt{{R}}}}})&= {\varvec{0}} \end{aligned}
(36)
\begin{aligned} {\varvec{h}}({{\mathbf {\mathtt{{R}}}}})&= {\varvec{0}} , \end{aligned}
(37)

where $$\varvec{\lambda }= (\lambda _1, \lambda _2, \lambda _3)^\top \in {\mathbb {R}}^3$$ are Lagrange multipliers.

Computing (36) explicitly using an approach analogous to that in [27, Section III.B] results in the following six equations:

\begin{aligned} a_{11} r_{11} + a_{12} r_{12} + a_{13} r_{13} + \lambda _1 r_{11} + \lambda _3 r_{21} - b_{11}&= 0 \end{aligned}
(38)
\begin{aligned} a_{11} r_{21} + a_{12} r_{22} + a_{13} r_{23} + \lambda _3 r_{11} + \lambda _2 r_{21} - b_{12}&= 0 \end{aligned}
(39)
\begin{aligned} a_{12} r_{11} + a_{22} r_{12} + a_{23} r_{13} + \lambda _1 r_{12} + \lambda _3 r_{22} - b_{21}&= 0 \end{aligned}
(40)
\begin{aligned} a_{12} r_{21} + a_{22} r_{22} + a_{23} r_{23} + \lambda _3 r_{12} + \lambda _2 r_{22} - b_{22}&= 0 \end{aligned}
(41)
\begin{aligned} a_{13} r_{11} + a_{23} r_{12} + a_{33} r_{13} + \lambda _1 r_{13} + \lambda _3 r_{23} - b_{31}&= 0 \end{aligned}
(42)
\begin{aligned} a_{13} r_{21} + a_{23} r_{22} + a_{33} r_{23} + \lambda _3 r_{13} + \lambda _2 r_{23} - b_{32}&= 0 , \end{aligned}
(43)

where

\begin{aligned} {{\mathbf {\mathtt{{A}}}}}&= {{\mathbf {\mathtt{{X}}}}}^\top {{\mathbf {\mathtt{{X}}}}} \end{aligned}
(44)
\begin{aligned} {{\mathbf {\mathtt{{B}}}}}&= {{\mathbf {\mathtt{{X}}}}}^\top {{\mathbf {\mathtt{{Y}}}}} \end{aligned}
(45)

(using the notation of the stacked point matrices $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ in (13)). In matrix notation, (38)–(43) can be written as

\begin{aligned} {{\mathbf {\mathtt{{A}}}}} {{\mathbf {\mathtt{{R}}}}}^\top + {{\mathbf {\mathtt{{R}}}}}^\top {{\mathbf {\mathtt{{L}}}}} = {{\mathbf {\mathtt{{B}}}}} , \end{aligned}
(46)

where

\begin{aligned} {{\mathbf {\mathtt{{L}}}}} = \left( \begin{array}{ll} \lambda _1 &{} \lambda _3 \\ \lambda _3 &{} \lambda _2 \end{array} \right) . \end{aligned}
(47)

This shows that (36) and (37) result in the nine equations (38)–(43) and (34). They are all polynomials of degree two in the nine unknowns $${{\mathbf {\mathtt{{R}}}}}$$ and $$\varvec{\lambda }$$.

By Bézout’s theorem [14, Chapter 3.3], the polynomial system has at most $$2^9 = 512$$ distinct solutions. We used the software Bertini [4] on some of the test problems that are reported in Sect. 5.1 to verify the correctness of the above formulation.Footnote 3 These experiments also showed that for the random noise experiment, there are typically four real finite solutions of the polynomial system. For the random point correspondence experiment, we found between four and twelve real finite solutions. The number of solutions was always even. A closer inspection of the data of the random point correspondence experiment showed that some of the solutions correspond to saddle points.

To solve the system of polynomial equations, we used the solver generator described in [42] to generate a solver for this problem.Footnote 4 In addition, we wrote a wrapper code around the solver that extracted the solution that had the minimum error of (33). When we integrated the solver into the experimental framework in Sect. 5.1, the experiments showed that the generated solver never returned a better solution (i.e., a solution with a smaller error) than the best solution of any of the other solvers. Furthermore, the experiments showed that the generated solver frequently failed to find any solution, sometimes in more than 70% of the random test cases, independent of the scenario (random noise, outliers, and random point correspondences). This is in stark contrast to the fact that (33), being a continuous function on a compact domain, always attains a global minimum by the extreme value theorem. Furthermore, it contrasts with the fact that all the other solvers returned at least a local minimum of (33). Therefore, we did not consider the generated solver any further.

Since one of the goals we strive for is speed, we solve the nine polynomial equations by Newton’s method. If we denote the nine equations as a function $${\varvec{z}}({\varvec{p}})$$, where $${\varvec{p}}$$ denotes the nine parameters in $${{\mathbf {\mathtt{{R}}}}}$$ and $$\varvec{\lambda }$$, we start with an initial value $${\varvec{p}}_0$$ and iterate $${\varvec{p}}_{i+1} = {\varvec{p}}_i - (\nabla {\varvec{z}}({\varvec{p}}_i))^{-1} {\varvec{z}}({\varvec{p}}_i)$$ until convergence.
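
The iteration can be sketched as follows. This is a minimal illustration on a made-up two-equation polynomial system rather than the nine-equation OnP system, and the Gaussian elimination stands in for the LAPACK routine used in the actual implementation:

```python
def solve(J, r):
    """Solve J d = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    M = [row[:] + [r[i]] for i, row in enumerate(J)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(M[k][c]))
        M[c], M[p] = M[p], M[c]
        for k in range(c + 1, n):
            f = M[k][c] / M[c][c]
            for j in range(c, n + 1):
                M[k][j] -= f * M[c][j]
    d = [0.0] * n
    for c in range(n - 1, -1, -1):
        d[c] = (M[c][n] - sum(M[c][j] * d[j] for j in range(c + 1, n))) / M[c][c]
    return d

def newton(z, jac, p, iters=50, tol=1e-12):
    """Iterate p <- p - (grad z(p))^(-1) z(p) until the step is tiny."""
    for _ in range(iters):
        step = solve(jac(p), z(p))
        p = [pi - si for pi, si in zip(p, step)]
        if max(abs(s) for s in step) < tol:
            break
    return p

# Toy polynomial system: x^2 + y^2 - 1 = 0, x - y = 0.
z = lambda p: [p[0] ** 2 + p[1] ** 2 - 1.0, p[0] - p[1]]
jac = lambda p: [[2.0 * p[0], 2.0 * p[1]], [1.0, -1.0]]
root = newton(z, jac, [1.0, 0.5])
```

For the toy system, the iteration converges quadratically to $$(\sqrt{2}/2, \sqrt{2}/2)$$ from the chosen start value.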

Note that $${\varvec{z}}({\varvec{p}})$$ and $$\nabla {\varvec{z}}({\varvec{p}})$$ only depend on $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$. Therefore, there is no speed advantage if the number of point correspondences is reduced to three. Hence, we do not perform this step.

To obtain the initial value $${\varvec{p}}_0$$, we use the following heuristic: suppose the points $${\varvec{x}}_i$$ and $${\varvec{y}}_i$$ are related perfectly by an orthogonal matrix $${{\mathbf {\mathtt{{R}}}}}$$. Then, we would have $$\varvec{\lambda }= {\varvec{0}}$$. Therefore, (46) reduces to $${{\mathbf {\mathtt{{A}}}}} {{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{B}}}}}$$, i.e., $${{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{A}}}}}^{-1} {{\mathbf {\mathtt{{B}}}}}$$. This means that, under the above hypothesis, $${{\mathbf {\mathtt{{R}}}}}$$ is simply given by the solution of the unrestricted Procrustes problem.

In reality, the points $${\varvec{x}}_i$$ and $${\varvec{y}}_i$$ are not related perfectly by an orthogonal matrix. Consequently, the matrix $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ computed above will not be orthogonal. We therefore project $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ onto the closest orthogonal matrix. It is well known that this can be achieved via the SVD [5, Theorem 1], [33, Theorem 2.2]. Let the SVD of $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ be given by $$\tilde{{{\mathbf {\mathtt{{R}}}}}} = {{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{S}}}}} {{\mathbf {\mathtt{{V}}}}}^\top$$. Then, the closest orthogonal matrix to $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ is given by $${{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{V}}}}}^\top$$. We use this matrix and $$\varvec{\lambda }= {\varvec{0}}$$ as the initial value $${\varvec{p}}_0$$.
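
The projection step can be sketched as follows. Instead of an SVD routine, the sketch uses the equivalent polar-decomposition identity $$\mathtt{U}\mathtt{V}^\top = (\tilde{\mathtt{R}}\tilde{\mathtt{R}}^\top)^{-1/2}\,\tilde{\mathtt{R}}$$, which for a full-rank $$2 \times 3$$ matrix only requires the closed-form square root of a symmetric positive definite $$2 \times 2$$ matrix (the input matrix is made up):

```python
import math

def closest_row_orthonormal(R):
    """Project a 2x3 matrix onto matrices with orthonormal rows.

    Uses the polar-decomposition identity U V^T = (R R^T)^(-1/2) R,
    which agrees with the SVD-based formula for full-rank R.
    """
    # M = R R^T (2x2, symmetric positive definite for full-rank R).
    m11 = sum(v * v for v in R[0])
    m22 = sum(v * v for v in R[1])
    m12 = sum(u * v for u, v in zip(R[0], R[1]))
    # Closed-form principal square root of a 2x2 SPD matrix:
    # sqrt(M) = (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M)).
    s = math.sqrt(m11 * m22 - m12 * m12)
    t = math.sqrt(m11 + m22 + 2.0 * s)
    q11, q12, q22 = (m11 + s) / t, m12 / t, (m22 + s) / t
    # Invert sqrt(M) (2x2 symmetric inverse).
    d = q11 * q22 - q12 * q12
    i11, i12, i22 = q22 / d, -q12 / d, q11 / d
    # Return (R R^T)^(-1/2) R.
    return [
        [i11 * R[0][j] + i12 * R[1][j] for j in range(3)],
        [i12 * R[0][j] + i22 * R[1][j] for j in range(3)],
    ]

R_tilde = [[1.1, 0.1, 0.0], [0.2, 0.9, 0.1]]  # nearly row-orthonormal
R_hat = closest_row_orthonormal(R_tilde)
```

The rows of `R_hat` are orthonormal by construction, since $$(\mathtt{M}^{-1/2}\tilde{\mathtt{R}})(\mathtt{M}^{-1/2}\tilde{\mathtt{R}})^\top = \mathtt{M}^{-1/2}\mathtt{M}\mathtt{M}^{-1/2} = \mathtt{I}$$.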

The above algorithm will converge to any solution that fulfills the nine equations. Since the first-order conditions are merely necessary and not sufficient, this means that the algorithm might converge to a local maximum or a saddle point. To detect whether this happens, we use the second-order conditions for minimization problems with equality constraints, as described in [50, Chapter 11.5]. In our problem, we must check whether the Hessian matrix of the Lagrange function, $${{\mathbf {\mathtt{{F}}}}}({{\mathbf {\mathtt{{R}}}}}) + \sum _{i=1}^{3} \lambda _i {{\mathbf {\mathtt{{H}}}}}_i({{\mathbf {\mathtt{{R}}}}})$$, is positive definite on the tangent subspace $$M = \{ {\varvec{y}} : \nabla {\varvec{h}}({{\mathbf {\mathtt{{R}}}}}) {\varvec{y}} = {\varvec{0}} \}$$. Here, $${{\mathbf {\mathtt{{F}}}}}$$ denotes the Hessian matrix of f, $${{\mathbf {\mathtt{{H}}}}}_i$$ denotes the Hessian matrix of $$h_i$$, and $$h_i$$ is the ith component of $${\varvec{h}}$$. A basis for M can be calculated via the full SVD of $$\nabla {\varvec{h}}({{\mathbf {\mathtt{{R}}}}})$$: It is given by the last three columns of $${{\mathbf {\mathtt{{V}}}}}$$. Let this matrix be called $${{\mathbf {\mathtt{{E}}}}}$$. Therefore, we must check whether the matrix $${{\mathbf {\mathtt{{E}}}}}^\top ( {{\mathbf {\mathtt{{F}}}}}({{\mathbf {\mathtt{{R}}}}}) + \sum _{i=1}^{3} \lambda _i {{\mathbf {\mathtt{{H}}}}}_i({{\mathbf {\mathtt{{R}}}}}) ) {{\mathbf {\mathtt{{E}}}}}$$ is positive definite. To do so, we must examine the eigenvalues of this matrix and check whether they are all positive. If this is not the case, the algorithm has not converged to a local minimum. In this case, we could try to use one of the algorithms discussed in the previous sections or we could simply return an error. We will examine in Sect. 5.1 how often this case occurs.
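
The positive definiteness test can be sketched as follows. Instead of computing the eigenvalues explicitly, the sketch attempts a Cholesky factorization, which for a symmetric matrix succeeds exactly when all eigenvalues are positive (the test matrices are made up; the actual implementation uses LAPACK):

```python
import math

def is_positive_definite(S):
    """Check symmetric positive definiteness via a Cholesky attempt.

    Equivalent to all eigenvalues being positive; a non-positive
    pivot means the matrix is not positive definite.
    """
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

pd = is_positive_definite([[2.0, -1.0, 0.0],
                           [-1.0, 2.0, -1.0],
                           [0.0, -1.0, 2.0]])   # eigenvalues 2, 2 +/- sqrt(2)
nd = is_positive_definite([[1.0, 2.0],
                           [2.0, 1.0]])          # eigenvalues 3 and -1
```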

Once $${{\mathbf {\mathtt{{R}}}}}$$ has been determined, $${\varvec{t}}$$ is computed by (10) and $$t_z$$ is set to 0 (cf. Sect. 2.2).

### 3.5 The Iterative Polynomial System Solver Based on a Quaternion Representation of the Rotation

As mentioned in Remark 2, instead of the direct representation of the rotation, we can also parameterize the rotation by a unit quaternion to obtain the rotation matrix (20). We also have the constraint

\begin{aligned} h({\varvec{q}}) = q_0^2 + q_1^2 + q_2^2 +q_3^2 - 1 = 0 . \end{aligned}
(48)

Therefore, we have a constrained minimization problem

\begin{aligned} \begin{array}{ll} \text {minimize} &{} f({\varvec{q}}) \\ \text {subject to} &{} h({\varvec{q}}) = 0 , \end{array} \end{aligned}
(49)

where f is given by (33) and the rotation matrix $${{\mathbf {\mathtt{{R}}}}}({\varvec{q}})$$ is given by (20). The first-order necessary conditions for this problem are given by (see [50, Chapter 11.3]):

\begin{aligned} \nabla f({\varvec{q}}) + \lambda \nabla h({\varvec{q}})&= {\varvec{0}} \end{aligned}
(50)
\begin{aligned} h({\varvec{q}})&= 0 , \end{aligned}
(51)

where $$\lambda \in {\mathbb {R}}$$ is a Lagrange multiplier. Computing (50) results in the following four polynomial equations:

\begin{aligned}&4 \Bigl ( \bigl (q_0 (q_0^2 + q_1^2 - q_2^2 + q_3^2) + 2 q_1 q_2 q_3 \bigr ) a_{11} {} \nonumber \\&\quad +2 \bigl ( q_3 (q_2^2 - q_1^2) +2 q_0 q_1 q_2 \bigr ) a_{12} {} \nonumber \\&\quad + \bigl ( q_2 (q_0^2 - q_1^2 - q_2^2 + q_3^2) + 2 q_0 (q_0 q_2 - q_1 q_3) \bigr ) a_{13} {} \nonumber \\&\quad + \bigl ( q_0 (q_0^2 - q_1^2 + q_2^2 + q_3^2) - 2 q_1 q_2 q_3 \bigr ) a_{22} {} \nonumber \\&\quad + \bigl ( q_1 (q_1^2 + q_2^2 - q_3^2 - q_0^2) - 2 q_0 (q_0 q_1 + q_2 q_3) \bigr ) a_{23} {} \nonumber \\&\quad + 2 q_0 (q_1^2 + q_2^2) a_{33} {} \nonumber \\&\quad - q_0 b_{11} + q_3 b_{21} - q_2 b_{31} {} \nonumber \\&\quad - q_3 b_{12} - q_0 b_{22} + q_1 b_{32} \Bigr ) \nonumber \\&\quad + 2 \lambda q_0 = 0 \end{aligned}
(52)
\begin{aligned}&\quad 4 \Bigl ( \bigl ( q_1 (q_0^2 + q_1^2 + q_2^2 - q_3^2) + 2 q_0 q_2 q_3 \bigr ) a_{11} {} \nonumber \\&\quad +2 \bigl ( q_2 (q_0^2 - q_3^2) - 2 q_0 q_1 q_3 \bigr ) a_{12} {} \nonumber \\&\quad +\bigl ( q_3 (q_1^2 + q_2^2 - q_3^2 - q_0^2) + 2 q_1 (q_3 q_1-q_0 q_2) \bigr ) a_{13} {} \nonumber \\&\quad +\bigl ( q_1 (q_1^2 + q_2^2 + q_3^2 - q_0^2) - 2 q_0 q_2 q_3 \bigr ) a_{22} {} \nonumber \\&\quad +\bigl ( q_0 (q_1^2 + q_2^2 - q_3^2 - q_0^2) + 2 q_1 (q_0 q_1+q_2 q_3) \bigr ) a_{23} {} \nonumber \\&\quad +2 q_1 (q_0^2+q_3^2) a_{33} {} \nonumber \\&\quad -q_1 b_{11} - q_2 b_{21} - q_3 b_{31} {} \nonumber \\&\quad -q_2 b_{12} + q_1 b_{22} + q_0 b_{32} \Bigr ) \nonumber \\&\quad + 2 \lambda q_1 = 0 \end{aligned}
(53)
\begin{aligned}&\quad 4 \Bigl ( \bigl ( q_2 (q_1^2 + q_2^2 + q_3^2 - q_0^2) + 2 q_0 q_1 q_3 \bigr ) a_{11} {} \nonumber \\&\quad +2 \bigl ( q_1 (q_0^2 - q_3^2) + 2 q_0 q_2 q_3 \bigr ) a_{12} {} \nonumber \\&\quad +\bigl ( q_0 (q_0^2 - q_1^2 - q_2^2 + q_3^2) + 2 q_2 (q_1 q_3 - q_0 q_2) \bigr ) a_{13} {} \nonumber \\&\quad +\bigl ( q_2 (q_0^2 + q_1^2 + q_2^2 - q_3^2) - 2 q_0 q_1 q_3 \bigr ) a_{22} {} \nonumber \\&\quad +\bigl ( q_3 (q_1^2 + q_2^2 - q_3^2 - q_0^2) + 2 q_2 (q_0 q_1+q_3 q_2) \bigr ) a_{23} {} \nonumber \\&\quad +2 q_2 (q_0^2+q_3^2) a_{33} {} \nonumber \\&\quad +q_2 b_{11} - q_1 b_{21} - q_0 b_{31} {} \nonumber \\&\quad -q_1 b_{12} - q_2 b_{22} - q_3 b_{32} \Bigr ) \nonumber \\&\quad {} + 2 \lambda q_2 = 0 \end{aligned}
(54)
\begin{aligned}&\quad 4 \Bigl ( \bigl ( q_3 (q_0^2 - q_1^2 + q_2^2 + q_3^2) + 2 q_0 q_1 q_2 \bigr ) a_{11} {} \nonumber \\&\quad +2 \bigl ( q_0 (q_2^2 - q_1^2) - 2 q_1 q_2 q_3 \bigr ) a_{12} {} \nonumber \\&\quad +\bigl ( q_1 (q_1^2 + q_2^2 - q_3^2 - q_0^2) + 2 q_3 (q_0 q_2-q_1 q_3) \bigr ) a_{13} {} \nonumber \\&\quad +\bigl ( q_3 (q_0^2 + q_1^2 - q_2^2 + q_3^2) - 2 q_0 q_1 q_2 \bigr ) a_{22} {} \nonumber \\&\quad +\bigl ( q_2 (q_1^2 + q_2^2 - q_3^2 - q_0^2) - 2 q_3 (q_0 q_1 + q_2 q_3) \bigr ) a_{23} {} \nonumber \\&\quad +2 q_3 (q_1^2+q_2^2) a_{33} {} \nonumber \\&\quad +q_3 b_{11} + q_0 b_{21} - q_1 b_{31} {} \nonumber \\&\quad -q_0 b_{12} + q_3 b_{22} - q_2 b_{32} \Bigr ) \nonumber \\&\quad + 2 \lambda q_3 = 0 \end{aligned}
(55)

Together with (48), we have five polynomial equations of degree two and three in the five unknowns $${\varvec{q}}$$ and $$\lambda$$. By Bézout’s theorem [14, Chapter 3.3], this polynomial system has at most $$2 \cdot 3^4 = 162$$ distinct solutions. Since $${\varvec{q}}$$ and $$-{\varvec{q}}$$ describe the same rotation, we can expect that this polynomial system has more local minima than that of Sect. 3.4.
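
That $${\varvec{q}}$$ and $$-{\varvec{q}}$$ yield the same rotation can be verified directly, since the rotation matrix is quadratic in the quaternion components. The sketch below assumes that (20) is the standard quaternion-to-matrix mapping:

```python
import math

def quat_to_rot(q):
    """Standard rotation matrix of a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ]

q = [math.cos(0.3), math.sin(0.3), 0.0, 0.0]  # rotation by 0.6 rad about x
R_pos = quat_to_rot(q)
R_neg = quat_to_rot([-c for c in q])          # negated quaternion
# Every entry is quadratic in q, so q and -q give the identical matrix.
same = all(abs(R_pos[i][j] - R_neg[i][j]) < 1e-12
           for i in range(3) for j in range(3))
```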

Like for the polynomial system in Sect. 3.4, we used the solver generator described in [42] to generate a solver for this problem. In addition, we wrote a wrapper code around the solver that extracted the solution that had the minimum error of (33). When we integrated the solver into the experimental framework in Sect. 5.1, a similar problem to that described in Sect. 3.4 occurred: the generated solver failed to find any solution in up to 10% of the random test cases, independent of the scenario (random noise, outliers, and random point correspondences). Furthermore, the experiments showed that the generated solver never returned a better solution than the best solution of any of the other solvers. Therefore, we did not consider the generated solver any further.

Instead, we used the same approach as in Sect. 3.4: We use Newton’s method to compute a solution. Since the inner loop only depends on $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$, we do not reduce the number of point correspondences to three. The same initialization as in Sect. 3.4 is used. After convergence, we check whether the second-order conditions for minimization problems are fulfilled to ensure the algorithm has converged to a local minimum. Once $${\varvec{q}}$$ has been determined, we compute $${{\mathbf {\mathtt{{R}}}}}$$ and then $${\varvec{t}}$$ as in Sect. 3.4.Footnote 5

### 3.6 The Polynomial System Solver Based on Gloptipoly

As noted in Sect. 3.4, solving (35) is a problem of minimizing a polynomial function with polynomial constraints. A specialized algorithm (called Gloptipoly) for these kinds of problems has been proposed in [30, 31]. Therefore, we also implemented a solver based on Gloptipoly, version 3.8.

If the function in (33) is expanded and the polynomial terms are collected, it can be seen that (33) is a polynomial of degree two in the entries of $${{\mathbf {\mathtt{{R}}}}}$$ that only depends on the entries of the matrices $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$ (see (44) and (45)). Therefore, we did not reduce the number of point correspondences to three by Proposition 2. We did not use the quaternion parameterization of Sect. 3.5 because this would have resulted in a much more complicated polynomial equation of degree four. Based on the experiments reported in Sect. 5, it was determined that Gloptipoly’s relaxation parameter had to be set to 2 to ensure that Gloptipoly finds the global optimum.
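
The claim that the objective depends on the point correspondences only through $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$ follows from expanding the squared residual, which yields $${\text {tr}}(\mathtt{R}\mathtt{A}\mathtt{R}^\top) - 2\,{\text {tr}}(\mathtt{B}^\top \mathtt{R}^\top)$$ plus a constant that does not depend on $$\mathtt{R}$$. A small numerical check with made-up random points (the sketch assumes the projection model $${\varvec{y}}_i = \mathtt{R}{\varvec{x}}_i$$ with a $$2 \times 3$$ matrix $$\mathtt{R}$$):

```python
import random

random.seed(0)
n = 6
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n)]  # 3D points
R = [[0.36, 0.48, -0.8], [-0.8, 0.6, 0.0]]  # two orthonormal rows
Y = [[sum(R[i][k] * x[k] for k in range(3)) + random.gauss(0, 0.01)
      for i in range(2)] for x in X]         # noisy 2D projections

def f_points(R):
    """Objective evaluated directly from the point correspondences."""
    return sum((sum(R[i][k] * x[k] for k in range(3)) - y[i]) ** 2
               for x, y in zip(X, Y) for i in range(2))

# A = X^T X, B = X^T Y, and the R-independent constant tr(Y^T Y).
A = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
B = [[sum(x[i] * y[j] for x, y in zip(X, Y)) for j in range(2)]
     for i in range(3)]
c = sum(v * v for y in Y for v in y)

def f_AB(R):
    """Same objective, computed only from A, B, and the constant c."""
    tr_RAR = sum(R[i][k] * A[k][l] * R[i][l]
                 for i in range(2) for k in range(3) for l in range(3))
    tr_BR = sum(B[k][i] * R[i][k] for i in range(2) for k in range(3))
    return tr_RAR - 2.0 * tr_BR + c
```

The identity holds for an arbitrary $$2 \times 3$$ matrix, not only for rotations, which is why the reduction to $$\mathtt{A}$$ and $$\mathtt{B}$$ loses nothing.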

## 4 Algorithms for Solving the OnP Problem for Coplanar Points

None of the algorithms that are described in Sect. 3 work for coplanar points. For the algorithm by Green and Gower (Sect. 3.1), the norm of the difference in step 7 is 0 in this case. Therefore, the algorithm terminates immediately with an incorrect solution. For the algorithm of Koschat and Swayne (Sect. 3.2), the matrix $${{\mathbf {\mathtt{{Q}}}}}$$ in step 8 is equal to $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {o}}$$, which leads to the same problem. For the Levenberg–Marquardt algorithm (Sect. 3.3) and the iterative polynomial system solvers (Sects. 3.4 and 3.5), the initial solution cannot be computed since the matrix $${{\mathbf {\mathtt{{A}}}}}$$ is singular. Furthermore, for the Levenberg–Marquardt algorithm and the first polynomial solver, the equation systems that are solved in the inner loop of the algorithms are singular.

As described in Sect. 2.4, the OnP problem for coplanar points is equivalent to the sub-Stiefel Procrustes problem. The only algorithm for solving this problem that we could find is the algorithm of Cardoso and Ziȩtak [10], which is discussed in Sect. 4.1.

Analogous to the non-coplanar case, we also implemented a Levenberg–Marquardt algorithm (Sect. 4.2) and two algorithms that iteratively solve a system of polynomial equations that are based on the Karush–Kuhn–Tucker conditions of the Procrustes problem (Sect. 4.3) and on Lagrange multipliers of the Procrustes problem (Sect. 4.4).

Finally, to check whether the above algorithms converge to the global optimum, we also implemented a solver based on Gloptipoly (Sect. 4.5).

### 4.1 The Algorithm of Cardoso and Ziȩtak

The algorithm of Cardoso and Ziȩtak is described in [10, Section 7]. It uses an idea similar to that of the algorithm of Green and Gower: The sub-Stiefel Procrustes problem is expanded to a 3D orthogonal Procrustes problem. Applied to the coplanar OnP problem, it can be described as follows:

1. 1.

Reduce $${{\mathbf {\mathtt{{X}}}}}$$ and $${{\mathbf {\mathtt{{Y}}}}}$$ to the $$2 \times 2$$ matrices $${{\mathbf {\mathtt{{X}}}}}'$$ and $${{\mathbf {\mathtt{{Y}}}}}'$$ as described in Proposition 2.

2. 2.

Set $${{\mathbf {\mathtt{{X}}}}} = \gamma {{\mathbf {\mathtt{{X}}}}}'$$ and $${{\mathbf {\mathtt{{Y}}}}} = \gamma {{\mathbf {\mathtt{{Y}}}}}'$$. As discussed below, the parameter $$\gamma$$ is required to ensure fast convergence of the algorithm.

3. 3.

Set $${{\mathbf {\mathtt{{Q}}}}} = \hbox {diag}(1,0.5)$$.

4. 4.

Set

\begin{aligned} {{\mathbf {\mathtt{{X}}}}}_{\mathrm {e}} = \left( \begin{array}{ll} {{\mathbf {\mathtt{{X}}}}} &{} {\varvec{0}} \\ {\varvec{0}}^\top &{} 1 \end{array} \right) . \end{aligned}
(56)
5. 5.

Set $${\varvec{p}} = {{\mathbf {\mathtt{{X}}}}} (-\sqrt{0.75},0)^\top$$ and $${\varvec{q}} = (0,\sqrt{0.75})^\top$$.

6. 6.

Set

\begin{aligned} {{\mathbf {\mathtt{{Y}}}}}_{\mathrm {e}} = \left( \begin{array}{ll} {{\mathbf {\mathtt{{Y}}}}} &{} {\varvec{p}} \\ {\varvec{q}}^\top &{} 0.5 \end{array} \right) . \end{aligned}
(57)
7. 7.

Use a balanced orthogonal Procrustes algorithm to determine the 3D rotation matrix $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {e}}$$ based on $${{\mathbf {\mathtt{{X}}}}}_{\mathrm {e}}$$ and $${{\mathbf {\mathtt{{Y}}}}}_{\mathrm {e}}$$. We use the algorithm proposed in [68].

8. 8.

Partition $${{\mathbf {\mathtt{{Q}}}}}_{\mathrm {e}}$$ as

\begin{aligned} {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {e}} = \left( \begin{array}{ll} {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {n}} &{} {\varvec{p}} \\ {\varvec{q}}^\top &{} \alpha \end{array} \right) . \end{aligned}
(58)
9. 9.

If $$\Vert {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {n}} - {{\mathbf {\mathtt{{Q}}}}} \Vert _{\mathrm {F}}^2$$ is smaller than a threshold, set $${{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {n}}$$ and go to step 13.

10. 10.

Set $${{\mathbf {\mathtt{{Q}}}}} = {{\mathbf {\mathtt{{Q}}}}}_{\mathrm {n}}$$.

11. 11.

Set

\begin{aligned} {{\mathbf {\mathtt{{Y}}}}}_{\mathrm {e}} = \left( \begin{array}{ll} {{\mathbf {\mathtt{{Y}}}}} &{} {{\mathbf {\mathtt{{X}}}}} {\varvec{p}} \\ \text{ sign }(\alpha ) {\varvec{q}}^\top &{} |\alpha | \end{array} \right) . \end{aligned}
(59)
12. 12.

Go to step 7.

13. 13.

Complete $${{\mathbf {\mathtt{{Q}}}}}$$ to a $$2 \times 3$$ matrix by Proposition 4 (selecting either of the two possible solutions; see Remark 7).

14. 14.

Compute $${{\mathbf {\mathtt{{R}}}}}$$ and $${\varvec{t}}$$ by (10). Set $$t_z = 0$$ (cf. Sect. 2.2).

In our experiments, we found that the parameter $$\gamma$$ is crucial for the convergence of the algorithm. If it is set too small, the algorithm will converge extremely slowly. We experimentally determined that $$\gamma = 10{,}000$$ is a suitable value for our formulation of the OnP problem. This presumably is the case since we use meters as the units for $${\varvec{x}}_i$$ and $${\varvec{y}}_i$$. Because telecentric lenses cannot be arbitrarily large, the values of the point coordinates are typically on the order of a few centimeters. A principled way to select $$\gamma$$ is currently unknown [10, Section 8].

### 4.2 The Levenberg–Marquardt Algorithm

The Levenberg–Marquardt algorithm for the coplanar OnP problem is mostly identical to that for the non-coplanar OnP problem in Sect. 3.3. The differences are that Proposition 2 allows us to reduce the problem to one with two point correspondences and that we have two potential solutions (see Proposition 4). Because the two solutions are related to each other in a very simple manner (see Remark 7), we only return one of the two possible solutions.

### 4.3 The Iterative Polynomial System Solver Based on a Direct Representation of the Rotation

The OnP problem (12) for coplanar points can also be regarded as a constrained minimization problem. Let us denote the function to be minimized as

\begin{aligned} f({{\mathbf {\mathtt{{R}}}}}) = \Vert {{\mathbf {\mathtt{{X}}}}} {{\mathbf {\mathtt{{R}}}}}^\top - {{\mathbf {\mathtt{{Y}}}}} \Vert _{\mathrm {F}}^2 . \end{aligned}
(60)

The constraint in Proposition 3 can be written explicitly as

\begin{aligned} h({{\mathbf {\mathtt{{R}}}}}) = r_{11}^2 + r_{12}^2 + r_{21}^2 + r_{22}^2 - (r_{11} r_{22} - r_{12} r_{21})^2 - 1 = 0 . \end{aligned}
(61)

In addition, $${{\mathbf {\mathtt{{R}}}}}$$ obviously must fulfill the following inequality constraints:

\begin{aligned} {\varvec{g}}({{\mathbf {\mathtt{{R}}}}}) = \left( \begin{array}{l} r_{11}^2 + r_{12}^2 - 1 \\ r_{21}^2 + r_{22}^2 - 1 \\ r_{11}^2 + r_{21}^2 - 1 \\ r_{12}^2 + r_{22}^2 - 1 \end{array} \right) \le {\varvec{0}} . \end{aligned}
(62)

Thus, the constrained minimization problem is given by

\begin{aligned} \begin{array}{ll} \text {minimize} &{} f({{\mathbf {\mathtt{{R}}}}}) \\ \text {subject to} &{} h({{\mathbf {\mathtt{{R}}}}}) = 0 \\ &{} {\varvec{g}}({{\mathbf {\mathtt{{R}}}}}) \le {\varvec{0}} . \end{array} \end{aligned}
(63)

The first-order necessary conditions (the Karush–Kuhn–Tucker (KKT) conditions) for this problem are given by (see [50, Chapter 11.8]):

\begin{aligned} \nabla f({{\mathbf {\mathtt{{R}}}}}) + \lambda \nabla h({{\mathbf {\mathtt{{R}}}}}) + \varvec{\mu }^\top \nabla {\varvec{g}}({{\mathbf {\mathtt{{R}}}}})&= {\varvec{0}} \end{aligned}
(64)
\begin{aligned} \varvec{\mu }^\top {\varvec{g}}({{\mathbf {\mathtt{{R}}}}})&= {\varvec{0}} \end{aligned}
(65)
\begin{aligned} h({{\mathbf {\mathtt{{R}}}}})&= 0 \end{aligned}
(66)
\begin{aligned} {\varvec{g}}({{\mathbf {\mathtt{{R}}}}})&\le {\varvec{0}} \end{aligned}
(67)
\begin{aligned} \varvec{\mu }&\ge {\varvec{0}} , \end{aligned}
(68)

where $$\varvec{\mu }= (\mu _1, \mu _2, \mu _3, \mu _4)^\top$$.

Computing (64)–(66) explicitly results in the following nine equations:

\begin{aligned}&2 ( a_{11} r_{11} + a_{12} r_{12} + \lambda r_{22} ( r_{12} r_{21} - r_{11} r_{22} ) \nonumber \\&\quad + ( \lambda + \mu _1 + \mu _3 ) r_{11} - b_{11} ) = 0 \end{aligned}
(69)
\begin{aligned}&\quad 2 ( a_{12} r_{11} + a_{22} r_{12} + \lambda r_{21} ( r_{11} r_{22} - r_{12} r_{21} ) \nonumber \\&\quad + ( \lambda + \mu _1 + \mu _4 ) r_{12} - b_{21} ) = 0 \end{aligned}
(70)
\begin{aligned}&\quad 2 ( a_{11} r_{21} + a_{12} r_{22} + \lambda r_{12} ( r_{11} r_{22} - r_{12} r_{21} ) \nonumber \\&\quad + ( \lambda + \mu _2 + \mu _3 ) r_{21} - b_{12} ) = 0 \end{aligned}
(71)
\begin{aligned}&\quad 2 ( a_{12} r_{21} + a_{22} r_{22} + \lambda r_{11} ( r_{12} r_{21} - r_{11} r_{22} ) \nonumber \\&\quad + ( \lambda + \mu _2 + \mu _4 ) r_{22} - b_{22} ) = 0\end{aligned}
(72)
\begin{aligned}&\quad \mu _1 ( r_{11}^2 + r_{12}^2 - 1 ) = 0 \end{aligned}
(73)
\begin{aligned}&\quad \mu _2 ( r_{21}^2 + r_{22}^2 - 1 ) = 0 \end{aligned}
(74)
\begin{aligned}&\quad \mu _3 ( r_{11}^2 + r_{21}^2 - 1 ) = 0 \end{aligned}
(75)
\begin{aligned}&\quad \mu _4 ( r_{12}^2 + r_{22}^2 - 1 ) = 0 \end{aligned}
(76)
\begin{aligned}&\quad r_{11}^2 + r_{12}^2 + r_{21}^2 + r_{22}^2 - (r_{11} r_{22} - r_{12} r_{21})^2 - 1 = 0 , \end{aligned}
(77)

where $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$ are given by (44) and (45), respectively.
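
As a sanity check, the equality constraint (77) indeed vanishes on the upper-left $$2 \times 2$$ block of any rotation matrix: for orthonormal rows, $$r_{11}^2 + r_{12}^2 + r_{21}^2 + r_{22}^2 = 2 - r_{13}^2 - r_{23}^2 = 1 + r_{33}^2$$, and $$r_{33} = \pm (r_{11} r_{22} - r_{12} r_{21})$$ by the cofactor identity. A small numerical check (the Euler-angle convention below is only one way of constructing a test rotation):

```python
import math

def rot_zyx(a, b, g):
    """Rotation matrix Rz(a) Ry(b) Rx(g) (one common Euler convention)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb,     cb * sg,                cb * cg],
    ]

R3 = rot_zyx(0.4, -0.7, 1.1)            # arbitrary test rotation
r11, r12 = R3[0][0], R3[0][1]
r21, r22 = R3[1][0], R3[1][1]
det = r11 * r22 - r12 * r21
# Constraint (77) evaluated on the upper-left 2x2 block:
constraint = r11 ** 2 + r12 ** 2 + r21 ** 2 + r22 ** 2 - det ** 2 - 1.0
```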

As can be seen, the KKT conditions are nine polynomial equations of degree three or four in the nine unknowns $${{\mathbf {\mathtt{{R}}}}}$$, $$\lambda$$, and $$\varvec{\mu }$$. By Bézout’s theorem [14, Chapter 3.3], the polynomial system has at most $$3^4 \cdot 4^5 = 82944$$ distinct solutions. Using the software Bertini [4], we examined a few test instances of the problem.Footnote 6 It turned out that the polynomial system can have several tens of real finite solutions (many of which do not fulfill the condition $$\varvec{\mu }\ge {\varvec{0}}$$). Because of the large number of possible solutions, we did not attempt to use the solver generator described in [42].

Like for the non-coplanar OnP solvers in Sects. 3.4 and 3.5, we solve the nine polynomial equations by Newton’s method. If we denote the nine equations as a function $${\varvec{z}}({\varvec{p}})$$, where $${\varvec{p}}$$ denotes the nine parameters in $${{\mathbf {\mathtt{{R}}}}}$$, $$\lambda$$, and $$\varvec{\mu }$$, we start with an initial value $${\varvec{p}}_0$$ and iterate $${\varvec{p}}_{i+1} = {\varvec{p}}_i - (\nabla {\varvec{z}}({\varvec{p}}_i))^{-1} {\varvec{z}}({\varvec{p}}_i)$$ until convergence.

Note that $${\varvec{z}}({\varvec{p}})$$ and $$\nabla {\varvec{z}}({\varvec{p}})$$ only depend on $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$. Therefore, we do not reduce the number of point correspondences to two.

To obtain the initial value $${\varvec{p}}_0$$, we use the same heuristic as in Sect. 3.4: suppose the points $${\varvec{x}}_i$$ and $${\varvec{y}}_i$$ are related perfectly by an orthogonal matrix $${{\mathbf {\mathtt{{R}}}}}$$. Then, we would have $$\lambda = 0$$ and $$\varvec{\mu }= {\varvec{0}}$$. Therefore, (69)–(77) reduce to $${{\mathbf {\mathtt{{A}}}}} {{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{B}}}}}$$, i.e., $${{\mathbf {\mathtt{{R}}}}}^\top = {{\mathbf {\mathtt{{A}}}}}^{-1} {{\mathbf {\mathtt{{B}}}}}$$.

In reality, the points $${\varvec{x}}_i$$ and $${\varvec{y}}_i$$ are not related perfectly by an orthogonal matrix. Consequently, the matrix $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ computed above will not be a sub-Stiefel matrix. We therefore project $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ onto the closest sub-Stiefel matrix. By [10, Theorem 5.5], this can be achieved via the SVD. Let the SVD of $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ be given by $$\tilde{{{\mathbf {\mathtt{{R}}}}}} = {{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{S}}}}} {{\mathbf {\mathtt{{V}}}}}^\top$$. Then, the closest sub-Stiefel matrix to $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$ is given by $${{\mathbf {\mathtt{{U}}}}} {{\mathbf {\mathtt{{S}}}}}_* {{\mathbf {\mathtt{{V}}}}}^\top$$, where $${{\mathbf {\mathtt{{S}}}}}_* = \hbox {diag}(1,\ldots ,1,s_*)$$, $$s_* = \min (s_n, 1)$$, and $$s_n$$ denotes the smallest singular value of $$\tilde{{{\mathbf {\mathtt{{R}}}}}}$$. We use this matrix, $$\lambda = 0$$, and $$\varvec{\mu }= {\varvec{0}}$$ as the initial value $${\varvec{p}}_0$$.
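
The projection can be sketched at the level of singular values: a $$2 \times 2$$ sub-Stiefel matrix has at least one singular value equal to 1, since the constraint (77) factors as $$-(1 - s_1^2)(1 - s_2^2) = 0$$ in the singular values $$s_1, s_2$$. The sketch below (with a made-up input matrix) uses the closed-form singular values of a $$2 \times 2$$ matrix:

```python
import math

def singular_values_2x2(m):
    """Closed-form singular values of a 2x2 matrix, largest first."""
    (a, b), (c, d) = m
    p = math.hypot(a + d, b - c)
    q = math.hypot(a - d, b + c)
    return (p + q) / 2.0, abs(p - q) / 2.0

def ss_constraint(s1, s2):
    # Constraint (77) in terms of singular values, using
    # ||R||_F^2 = s1^2 + s2^2 and det(R)^2 = (s1 s2)^2.
    return s1 * s1 + s2 * s2 - (s1 * s2) ** 2 - 1.0

M = [[1.2, 0.3], [-0.1, 0.7]]           # made up, not sub-Stiefel
s1, s2 = singular_values_2x2(M)
c_before = ss_constraint(s1, s2)         # nonzero: M violates (77)
# Projection: keep U and V of the SVD and replace the spectrum by
# (1, s_*) with s_* = min(s2, 1), as in [10, Theorem 5.5].
c_after = ss_constraint(1.0, min(s2, 1.0))
```

After the projection, the constraint is fulfilled by construction because one singular value is exactly 1.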

The above algorithm will converge to any solution that fulfills the nine equations. Since the first-order conditions are merely necessary and not sufficient, the algorithm might converge to a local maximum, a saddle point, or even to a point outside the feasible region (62). We use the second-order conditions for minimization problems with equality and inequality constraints [50, Chapter 11.8] to detect whether this happens. In our problem, we must check whether the Hessian matrix of the Lagrange function, $${{\mathbf {\mathtt{{F}}}}}({{\mathbf {\mathtt{{R}}}}}) + \lambda {{\mathbf {\mathtt{{H}}}}}({{\mathbf {\mathtt{{R}}}}}) + \sum _{i=1}^{4} \mu _i {{\mathbf {\mathtt{{G}}}}}_i({{\mathbf {\mathtt{{R}}}}})$$, is positive definite on the tangent subspace M corresponding to the equality constraint and all active inequality constraints, i.e., $$M = \{ {\varvec{y}} : \nabla h({{\mathbf {\mathtt{{R}}}}}) {\varvec{y}} = 0 \text { and } \nabla g_i({{\mathbf {\mathtt{{R}}}}}) {\varvec{y}} = 0 \text { for all active } i \}$$. Here, $${{\mathbf {\mathtt{{F}}}}}$$ denotes the Hessian matrix of f, $${{\mathbf {\mathtt{{H}}}}}$$ the Hessian matrix of h, $${{\mathbf {\mathtt{{G}}}}}_i$$ denotes the Hessian matrix of $$g_i$$, and $$g_i$$ is the ith component of $${\varvec{g}}$$. A basis for M can be calculated via the full SVD of the matrix composed of the gradients of the equality constraint and the active inequality constraints: It is given by the last $$3-m$$ columns of $${{\mathbf {\mathtt{{V}}}}}$$, where m denotes the number of active inequality constraints. Let this matrix be called $${{\mathbf {\mathtt{{E}}}}}$$. Therefore, we must check whether the matrix $${{\mathbf {\mathtt{{E}}}}}^\top ( {{\mathbf {\mathtt{{F}}}}}({{\mathbf {\mathtt{{R}}}}}) + \lambda {{\mathbf {\mathtt{{H}}}}}({{\mathbf {\mathtt{{R}}}}}) + \sum _{i=1}^{4} \mu _i {{\mathbf {\mathtt{{G}}}}}_i({{\mathbf {\mathtt{{R}}}}}) ) {{\mathbf {\mathtt{{E}}}}}$$ is positive definite. To do so, we must examine the eigenvalues of this matrix and check whether they are all positive. Furthermore, we check whether the conditions $$\varvec{\mu }\ge {\varvec{0}}$$ are fulfilled. If any of the preceding conditions are not fulfilled, the algorithm has not converged to a feasible local minimum. In this case, we could try to use one of the algorithms discussed in the previous sections or we could simply return an error. We will examine in Sect. 5.2 how often this case occurs.

Once $${{\mathbf {\mathtt{{R}}}}}$$ has been determined, it must be completed to a $$2 \times 3$$ matrix in a manner analogous to Proposition 4 (either of the two possible solutions can be selected; see Remark 7). Finally, $${\varvec{t}}$$ is determined by (10) and $$t_z$$ is set to 0 (cf. Sect. 2.2).

### 4.4 The Iterative Polynomial System Solver Based on a Quaternion Representation of the Rotation

As mentioned in Remark 8, we can also parameterize the rotation in (60) using unit quaternions. The matrix in (60) is given by (32). With this, conceptually (48)–(51) remain unchanged. The resulting polynomial system is given by (52)–(55), where all the terms involving $$a_{13}$$, $$a_{23}$$, $$a_{33}$$, $$b_{31}$$, and $$b_{32}$$ are deleted.

By Bézout’s theorem [14, Chapter 3.3], this polynomial system has at most $$2 \cdot 3^4 = 162$$ distinct solutions. This is a significantly smaller number of potential solutions than that of the approach in Sect. 4.3.

Like for the polynomial systems in Sects. 3.4 and 3.5, we used the solver generator described in [42] to generate a solver for this problem. It turned out that the generated solver returned solutions that did not even fulfill the polynomial equations that were specified to the solver generator. Therefore, we did not use the generated solver.

Instead, like for the other polynomial solvers, we use Newton’s method to compute a solution. Since the inner loop only depends on $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$, we do not reduce the number of point correspondences to two. The same initialization as in Sect. 4.3 is used. After convergence, we check whether the second-order conditions for minimization problems are fulfilled to ensure the algorithm has converged to a local minimum. Once $${\varvec{q}}$$ has been determined, we compute $${{\mathbf {\mathtt{{R}}}}}$$ and then $${\varvec{t}}$$ as in Sect. 4.3.

### 4.5 The Polynomial System Solver Based on Gloptipoly

Like for the non-coplanar case (cf. Sect. 3.6), we also implemented a solver based on Gloptipoly [30, 31]. In contrast to the non-coplanar case, we used the quaternion representation of Sect. 4.4 to parameterize the rotation, resulting in a polynomial function of degree four that must be minimized under the constraint (48) and the two constraints given in Corollary 2. As before, the function to be optimized depends on the point correspondences only via the matrices $${{\mathbf {\mathtt{{A}}}}}$$ and $${{\mathbf {\mathtt{{B}}}}}$$ (see (44) and (45)). Therefore, we did not reduce the number of point correspondences to two. Based on the experiments reported in Sect. 5, it was determined that Gloptipoly’s relaxation parameter had to be set to 2 to ensure that Gloptipoly finds the global optimum.

## 5 Evaluation

In this section, we will evaluate the algorithms that were described in Sects. 3 and 4. Our purpose is to determine which of the proposed algorithms best fulfills the following two criteria:

• Robustness Ideally, the algorithm should find the global minimum of the OnP problem.

• Speed The algorithm should be as fast as possible.

To evaluate the algorithms with respect to these criteria, we performed the following experiments: n random non-coplanar or coplanar 3D points and a random pose were generated. The random non-coplanar 3D points were distributed uniformly $$\in [-0.01, 0.01]^3$$ [m]. The coplanar points were distributed uniformly $$\in [-0.01, 0.01]^2$$ [m] ($$z = 0$$). The random 3D points were projected into a virtual image of size $$2560 \times 1920$$ using a telecentric camera with the following data: $$m = 0.08$$, no distortion,Footnote 7 $$s_x = s_y = 2$$ µm, $$(c_x, c_y)^\top = (1180, 1010)^\top$$. The number of point correspondences n was varied between the minimum number, i.e., 4 or 3, and 50,000 in roughly logarithmic increments.Footnote 8 The random 3D points were projected into the image using the random pose to obtain 2D points. For each n, 10,000 random experiments were performed. To test the robustness, three different scenarios were evaluated:

• Random noise The 3D points and their 2D correspondences were disturbed by random, uniformly distributed noise. The maximum value of the noise corresponded to 1% of the diameter of the point cloud. Hence, the 3D points were disturbed by noise uniformly distributed $$\in [-0.0001, 0.0001]^d$$ [m] ($$d = 3$$ for the non-coplanar algorithms and $$d = 2$$ for the coplanar algorithms) and the 2D points were disturbed by noise uniformly distributed $$\in [-4, 4]^2$$ [Pixel]. Note that this is a very large amount of noise that is not realistic for real applications and is solely designed to test the robustness of the algorithms.

• Outliers 80% of the 3D points and their 2D correspondences were disturbed by random, uniformly distributed noise. To increase the difficulty of the problem, the maximum value of the noise corresponded to 2% of the diameter of the point cloud. Hence, the 3D points were disturbed by noise uniformly distributed $$\in [-0.0002, 0.0002]^d$$ [m] ($$d = 3$$ for the non-coplanar algorithms and $$d = 2$$ for the coplanar algorithms) and the 2D points were disturbed by noise uniformly distributed $$\in [-8, 8]^2$$ [Pixel]. Furthermore, 20% of the point correspondences were created as outliers. This was done by adding noise to the point correspondences that amounted to 100% of the diameter of the point cloud, i.e., noise $$\in [-0.01, 0.01]^d$$ [m] ($$d = 3$$ for the non-coplanar algorithms and $$d = 2$$ for the coplanar algorithms) for the 3D points and $$\in [-400, 400]$$ [Pixel] for the 2D points. If there were fewer than five point correspondences, at least one outlier was generated to ensure that the results were always contaminated by outliers. This scenario is intended to test the algorithms with a moderate number of outliers and an extremely large noise level.

• Random point correspondences To make the problem as difficult as possible, the n 3D points were replaced by random points $$\in [-0.01, 0.01]^d$$ [m] ($$d = 3$$ for the non-coplanar algorithms and $$d = 2$$ for the coplanar algorithms). The 2D points were unchanged. Consequently, the 3D points had no functional relation to the 2D points whatsoever, i.e., the outlier ratio was 100%. The motivation for this experiment is mainly to evaluate the breaking point of the algorithms and to identify significant performance differences between the algorithms. We do not consider this experiment relevant for real applications.

We implemented the Gloptipoly solvers in Sects. 3.6 and 4.5 to determine the global optimum of the above problems. The runtime of both algorithms in MATLAB is approximately 2 s. All the other algorithms were implemented in C as HALCONFootnote 9 operators in a HALCON extension package. Our evaluation framework was also implemented within HALCON. To interface the Gloptipoly solvers to the evaluation framework, we initially called the MATLAB executable for each problem instance. This created an overhead of around 14 s per call, raising the execution time to 16 s. To reduce the overhead, we compiled the Gloptipoly solvers into executables using the MATLAB command mcc. This decreased the overhead to around 4 s. Hence, one call of the Gloptipoly solvers took approximately 6 s. With this execution time, the experiments reported in the following would have required approximately three months to run for each scenario. Unfortunately, this meant that it was infeasible to integrate the Gloptipoly solvers into the experiments in Sects. 5.1 and 5.2 directly. Therefore, we will first evaluate the robustness of all algorithms except the Gloptipoly solvers and will then address the question whether the proposed solvers converge to the global optimum based on experiments with fewer different numbers of point correspondences and with fewer trials.

Therefore, in the first evaluation, we use the solution with the minimum root-mean-square (RMS) error among all evaluated algorithms as the global minimum, i.e., as the correct solution. We will then use the second evaluation to determine an estimate of the probability that the Gloptipoly solvers find a better solution than the best solution of all the other algorithms. We will see that this probability is extremely low for all scenarios, i.e., it is almost certain that one of the iterative algorithms converges to the global optimum. This justifies our procedure for the first evaluation.

In the first evaluation, we count a solution as correct if its RMS error lies within 0.1% of the smallest RMS error. This takes into account that the algorithms have slightly different stopping criteria. In the evaluation, we report in what percentage of the 10,000 trials the respective algorithm converged to the solution with the minimum RMS error. The ideal algorithm would therefore achieve 100% on this metric. To test the speed of the algorithms, we report their average runtime in milliseconds over the 10,000 trials, as measured on a 3.2 GHz Intel Core i5-4570 CPU under Linux. All algorithms were implemented in C. The linear algebra used in the algorithms was implemented using LAPACK [1].
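The correctness criterion can be expressed compactly. The following is a sketch (the function name is ours); it classifies the per-algorithm RMS errors of one trial relative to the best error, with the 0.1% tolerance stated above:

```python
import numpy as np

def count_correct(rms_errors, tol=0.001):
    """Return, per algorithm, whether its RMS error lies within tol
    (default 0.1%) of the smallest RMS error of all algorithms for
    this trial."""
    e = np.asarray(rms_errors, dtype=float)
    return e <= e.min() * (1.0 + tol)
```

Averaging the resulting boolean vectors over the 10,000 trials yields the robustness percentage reported for each algorithm.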

### 5.1 Algorithms for Non-coplanar 3D Points

The results of the evaluation of the algorithm of Green and Gower (Sect. 3.1), the algorithm of Koschat and Swayne (Sect. 3.2), our Levenberg–Marquardt algorithm (Sect. 3.3), and our proposed iterative polynomial system solvers (Sects. 3.4 and 3.5) are displayed in Fig. 1. For each iterative polynomial system solver, two results are displayed. The first graph displays the robustness if the internal self-diagnosis is taken into account and an error is returned whenever the self-diagnosis indicates that the algorithm has not converged to a local minimum; i.e., the result is only counted as correct if the self-diagnosis indicates convergence to a local minimum and if the RMS error of that minimum corresponds to the smallest RMS error of all algorithms, as described above. The second graph displays the results if the internal self-diagnosis is taken into account and the algorithm of Green and Gower is used as a fallback whenever the self-diagnosis indicates that the algorithm has not converged to a local minimum. In the interest of brevity, we will refer to these two cases as “without fallback” and “with fallback” below.

The experiments with random noise in Fig. 1a show that all algorithms almost always find the correct solution. The only exceptions are the cases $$n=4$$ and $$n=5$$, where the algorithms in rare cases do not find the correct solution. In these cases, the two polynomial system solvers without fallback, the polynomial system solver with fallback that is based on unit quaternions, and the algorithm by Koschat and Swayne perform slightly worse than the other three algorithms. Interestingly, the performance of the polynomial system solver with fallback that is based on a direct rotation representation is the best of the seven algorithms. The runtimes in Fig. 1b show that the polynomial system solvers are asymptotically the fastest. It is only for $$n \le 200$$ that the Levenberg–Marquardt algorithm is slightly faster than some of the polynomial system solvers. For $$n = 100$$, the polynomial system solvers and the Levenberg–Marquardt algorithm are faster by a factor of more than 10 than the algorithm of Green and Gower and by a factor of more than 100 than the algorithm of Koschat and Swayne (note the logarithmic scale of the runtime graphs). Asymptotically, the polynomial system solvers are faster by a factor of more than 4 than all other algorithms. The algorithm of Koschat and Swayne is the slowest by a large margin. The reason for this is that, especially for small n, it requires tens of thousands of iterations to converge. Thus, we can conclude that for the random noise experiment, the polynomial system solvers, in particular the polynomial system solver with fallback that is based on a direct rotation representation, provide the best tradeoff between speed and robustness.

In the experiment with outliers (see Fig. 1c, d), the polynomial system solvers without fallback, the polynomial system solver with fallback that is based on unit quaternions, and the Levenberg–Marquardt algorithm perform the worst for small n, while the polynomial system solver with fallback that is based on a direct rotation representation, the algorithm of Green and Gower, and that of Koschat and Swayne perform best. For larger n, these four algorithms also find the correct solution in all cases. The runtime behavior of the algorithms is very similar to the random noise case. Therefore, in this case, the best tradeoff between robustness and speed is achieved by the polynomial system solver with fallback that is based on a direct rotation representation.

For the experiment with completely random point correspondences (100% outliers), Fig. 1e, f show that the algorithm by Green and Gower and that by Koschat and Swayne almost always achieve the correct solution. The Levenberg–Marquardt algorithm sometimes does not find the correct solution for small or large n. The polynomial system solvers without fallback provide the worst robustness. Their heuristic to obtain the initial solution fails because the underlying assumptions are no longer fulfilled. The polynomial system solver that is based on a direct rotation representation converges to saddle points or local maxima in up to 40% of the cases. With the fallback, the correct solution can be obtained in more than 98% of the cases. However, in the remaining less than 2% of the cases, the algorithm converges to a different local minimum. Interestingly, the polynomial system solver that is based on unit quaternions converges to saddle points or local maxima in only up to 28% of the cases. With the fallback, however, it performs worse than the polynomial system solver with fallback that is based on a direct rotation representation: for some n, it fails to find the correct solution in more than 12% of the cases. Evidently, the more complex formulation of Sect. 3.5 causes more local minima than the simpler formulation of Sect. 3.4. Note that the polynomial system solvers without fallback are by far the fastest algorithms. With fallback, they are beaten by the Levenberg–Marquardt algorithm, which is less robust, however. In this experiment, the algorithm of Green and Gower clearly results in the best tradeoff between robustness and speed. However, this case is not really representative of the problems we are trying to solve in practice. Here, we would assume that the image points are related to the 3D points, at least partially. Consequently, the first two experiments are much more relevant for real applications. Therefore, the polynomial system solver with fallback that is based on a direct rotation representation seems to be the best choice overall. The algorithm of Green and Gower can be used for exceedingly difficult cases, such as those in the third experiment.

We now examine the question of how often the above algorithms do not converge to the global optimum. As described previously, we test how often the Gloptipoly solver of Sect. 3.6 finds a better solution than the best solution returned by the other algorithms. We counted a solution as better if the Gloptipoly solver achieved an RMS error that was at least 0.001% smaller than the smallest RMS error of the remaining algorithms. We used the same scenarios, but reduced the number of trials to 1000 (instead of 10,000, as in the above experiments) and used only 20 different numbers of point correspondences (instead of 38), resulting in 20,000 trials per scenario. Despite this large reduction, the evaluation of each scenario still required more than two days of computation time.

From this experiment, we can derive an estimate for the probability $$p_\mathrm {b}$$ that the Gloptipoly solver returns a better solution than the best solution of the remaining algorithms. For the random noise scenario, the Gloptipoly solver never returned a better solution ($$p_\mathrm {b} = 0\%$$), for the outlier scenario, it found two instances with a smaller RMS error ($$p_\mathrm {b} = 0.01\%$$), and for the random point correspondence scenario, it found twelve instances ($$p_\mathrm {b} = 0.06\%$$). Therefore, even for the scenario that was designed to produce the maximum likelihood for the iterative algorithms to converge to the wrong local minimum, it is extremely likely ($${>} 99.9\%$$) that at least one of the iterative algorithms converges to the global minimum.

### 5.2 Algorithms for Coplanar 3D Points

Figure 2 displays the results of the evaluation of the algorithm of Cardoso and Ziȩtak (Sect. 4.1), our Levenberg–Marquardt algorithm (Sect. 4.2), and our proposed iterative polynomial system solvers (Sects. 4.3 and 4.4). As in Sect. 5.1, two results are displayed for the iterative polynomial system solvers: the results without and with fallback. In this case, the algorithm of Cardoso and Ziȩtak was used as the fallback.

Figure 2a shows that the algorithm of Cardoso and Ziȩtak performs slightly less robustly for small n than the Levenberg–Marquardt algorithm and the polynomial system solvers without fallback. The polynomial system solvers with fallback show the best overall robustness. Furthermore, as shown by Fig. 2b, the polynomial system solvers are asymptotically the fastest algorithms. For $$n = 100$$, the polynomial system solvers and the Levenberg–Marquardt algorithm are faster than the algorithm of Cardoso and Ziȩtak by a factor of more than 30. Asymptotically, the polynomial system solvers are faster than the Levenberg–Marquardt algorithm by a factor of more than 3.

For the experiment with outliers (Fig. 2c, d), the Levenberg–Marquardt algorithm performs the worst for small n, followed by the polynomial system solver without fallback that is based on a direct rotation representation. The algorithm of Cardoso and Ziȩtak, the polynomial system solver with fallback that is based on a direct rotation representation, and the polynomial system solver without fallback that is based on unit quaternions perform almost equally well. The polynomial system solver with fallback that is based on unit quaternions performs best overall. All algorithms return the correct result for larger n. The runtime behavior of the algorithms is very similar to the random noise case. Therefore, in this case, the best tradeoff between robustness and speed is achieved by the polynomial system solver with fallback that is based on unit quaternions.

Finally, the results of the experiment with completely random point correspondences (100% outliers; see Fig. 2e, f) show that there is no clear winner in terms of robustness. For small n, the algorithm of Cardoso and Ziȩtak and the polynomial system solvers (with or without fallback) outperform the Levenberg–Marquardt algorithm; for large n, the algorithm of Cardoso and Ziȩtak and the polynomial system solver based on unit quaternions do. On the other hand, the Levenberg–Marquardt algorithm performs best for medium n. Overall, the polynomial system solvers without fallback and the polynomial system solver with fallback that is based on a direct rotation representation exhibit the least robustness. It is interesting to note the differences between the different kinds of polynomial system solvers. The solvers that are based on a direct rotation representation, with or without fallback, perform worse than the solver without fallback that is based on unit quaternions. The fallback for the solver that is based on the direct rotation representation does not improve the performance substantially. The likely reason is that the more complex formulation in Sect. 4.3 causes many more local minima than the comparatively simpler formulation in Sect. 4.4. Note that, as in the non-coplanar case, the polynomial system solvers are by far the fastest algorithms. In this experiment, the algorithm of Cardoso and Ziȩtak and the polynomial system solver with fallback that is based on unit quaternions seem to be the best compromise. As discussed in Sect. 5.1, we do not regard the completely random case as representative of the problems we are trying to solve in practice. Consequently, the first two experiments are much more relevant for real applications. Therefore, the polynomial system solver with fallback based on unit quaternions seems to be the best choice overall. Because it shows a slight robustness advantage, the algorithm of Cardoso and Ziȩtak can be used for exceedingly difficult cases, such as those in the third experiment.

We now turn to the question of how often the above algorithms do not converge to the global optimum. As in Sect. 5.1, we used the same scenarios, but reduced the number of trials to 1000 and used only 21 different numbers of point correspondences (instead of 39), resulting in 21,000 trials per scenario.

For the random noise scenario, the Gloptipoly solver never returned a better solution ($$p_\mathrm {b} = 0\%$$), for the outlier scenario, it found one instance with a smaller RMS error ($$p_\mathrm {b} = 0.0048\%$$), and for the random point correspondence scenario, it found 60 instances ($$p_\mathrm {b} = 0.286\%$$). Therefore, even for the scenario that was designed to produce the maximum likelihood for the iterative algorithms to converge to the wrong local minimum, it is extremely likely ($${>} 99.6\%$$) that at least one of the iterative algorithms converges to the global minimum.

An application of the algorithms in this section is to use them as a minimal solver ($$n=3$$) in a RANSAC scheme [19] to automatically determine the pose of an object from point correspondences with outliers. The evaluation shows that the polynomial system solver without fallback based on unit quaternions is the most suitable algorithm for this purpose. Its robustness for $$n=3$$ is better than that of the algorithm of Cardoso and Ziȩtak, the Levenberg–Marquardt algorithm, and the polynomial system solver without fallback that is based on a direct rotation representation. Furthermore, although the robustness increases slightly if the fallback is used, this does not seem to justify the fourfold increase in the runtime. The RANSAC scheme will benefit more from the increased speed of the minimal solver than from the slight increase in robustness.
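A generic RANSAC skeleton around such a minimal solver can be sketched as follows. This is not the paper's implementation; `solve_onp_minimal` and `project` are hypothetical callables standing in for the unit-quaternion polynomial solver and the telecentric projection, and all names are ours:

```python
import numpy as np

def ransac_onp(pts3d, pts2d, solve_onp_minimal, project, thresh,
               iters=500, rng=None):
    """RANSAC around a minimal (n = 3) OnP solver.

    solve_onp_minimal(p3, p2) -> pose or None (None if the solver's
    internal self-diagnosis reports failure); project(pts3d, pose)
    -> (n, 2) image points. Returns the pose with the most inliers
    and the inlier mask."""
    rng = np.random.default_rng(rng)
    n = len(pts3d)
    best_pose, best_inliers = None, np.zeros(n, dtype=bool)
    for _ in range(iters):
        idx = rng.choice(n, 3, replace=False)      # minimal sample
        pose = solve_onp_minimal(pts3d[idx], pts2d[idx])
        if pose is None:                           # solver failed; resample
            continue
        resid = np.linalg.norm(project(pts3d, pose) - pts2d, axis=1)
        inliers = resid < thresh
        if inliers.sum() > best_inliers.sum():
            best_pose, best_inliers = pose, inliers
    return best_pose, best_inliers
```

In a real application, the pose would typically be refined on the final inlier set with one of the full OnP algorithms.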

### 5.3 Accuracy of the Results

In addition to the robustness and speed, we evaluated the accuracy of the proposed algorithms. This was done by creating random poses and random 3D points in the same manner as in the robustness experiments. The 3D points were then projected into a virtual image and were disturbed by uniform noise of maximum amplitude a. Hence, the standard deviation of the noise was $$\sigma _\mathrm {n} = a/\sqrt{3}$$. The number of point correspondences n was varied as in the robustness experiments, while the noise amplitude was varied between 0 and 10 in steps of 0.5. For each combination of n and a, 10,000 trials were performed. All the proposed algorithms except the Gloptipoly solvers were tested.
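The stated standard deviation follows directly from the variance of a zero-mean uniform distribution on $$[-a, a]$$:

```latex
\sigma_\mathrm{n}^2 = \operatorname{Var}\bigl(U[-a, a]\bigr)
                    = \frac{1}{2a} \int_{-a}^{a} x^2 \, \mathrm{d}x
                    = \frac{(2a)^2}{12} = \frac{a^2}{3},
\qquad
\sigma_\mathrm{n} = \frac{a}{\sqrt{3}} .
```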

We evaluated four error measures. The first measure is the norm of the difference between the true translation $${\varvec{t}}_\mathrm {t}$$ and the translation estimated by the algorithms $${\varvec{t}}_\mathrm {e}$$: $$\varepsilon _{\varvec{t}} = \Vert {\varvec{t}}_\mathrm {t} - {\varvec{t}}_\mathrm {e} \Vert _2$$. Since all algorithms determine the rotation matrix $${\varvec{R}}$$, the second error measure we evaluated is the Frobenius norm of the error of the rotation matrix: $$\varepsilon _{\varvec{R}} = \Vert {\varvec{R}}_\mathrm {t} - {\varvec{R}}_\mathrm {e} \Vert _\mathrm {F}$$. For the non-coplanar OnP algorithms, $${\varvec{R}}$$ is a $$2 \times 3$$ matrix, while for the coplanar OnP algorithms, it is a $$2 \times 2$$ matrix. Since $$\varepsilon _{\varvec{R}}$$ is hard to interpret geometrically, we derived two additional error measures from the rotation matrices. We converted the rotations into an axis–angle representation $$({\varvec{n}}, \theta )$$ ($$\Vert {\varvec{n}} \Vert _2 = 1$$). For the coplanar algorithms, we ensured that the solution that corresponds best to the true pose was selected. With this, the third measure is the rotation angle error $$\varepsilon _\theta = | \theta _\mathrm {t} - \theta _\mathrm {e} |$$ and the fourth measure is the angle error between the rotation axes $$\varepsilon _{\varvec{n}} = \arccos ( {\varvec{n}}_\mathrm {t} \cdot {\varvec{n}}_\mathrm {e} )$$.
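The four error measures can be computed as follows. This sketch assumes full 3×3 rotation matrices for simplicity (the paper's algorithms estimate only the 2×3 or 2×2 part); the function names are ours:

```python
import numpy as np

def pose_errors(t_true, t_est, R_true, R_est):
    """Return (eps_t, eps_R, eps_theta, eps_n) for one trial, with
    R_true and R_est full 3x3 rotation matrices."""
    eps_t = np.linalg.norm(t_true - t_est)                  # translation error
    eps_R = np.linalg.norm(R_true[:2] - R_est[:2], 'fro')   # error of the 2x3 part

    def axis_angle(R):
        # theta from the trace, axis from the skew-symmetric part R - R^T.
        theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
        w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
        nw = np.linalg.norm(w)
        n = w / nw if nw > 0 else np.array([0.0, 0.0, 1.0])  # axis undefined for theta = 0
        return n, theta

    n_t, th_t = axis_angle(R_true)
    n_e, th_e = axis_angle(R_est)
    eps_theta = abs(th_t - th_e)                            # rotation angle error
    eps_n = np.arccos(np.clip(n_t @ n_e, -1.0, 1.0))        # angle between the axes
    return eps_t, eps_R, eps_theta, eps_n
```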

The results of the evaluation are displayed in Fig. 3 for the non-coplanar OnP algorithms and in Fig. 4 for the coplanar OnP algorithms. All algorithms return almost exactly the same accuracies. This is not surprising since all algorithms solve the same minimization problem (9). The only difference is the parameterization of the rotation. However, at the end of each algorithm, the result is converted to a rotation matrix. Since the algorithms converge to the same minimum of (9), it does not matter which path they took to reach the minimum. Consequently, we only display the results of a single algorithm for each case (the algorithm of Sect. 3.4 for the non-coplanar case and the algorithm of Sect. 4.4 for the coplanar case).

From [29, Result 5.2], we expect that the error measures are proportional to $$\sigma _\mathrm {n}$$ (and hence to a) and to $$1/\sqrt{n}$$. To check whether this is the case, we plot the results in Figs. 3 and 4 logarithmically for all axes. We obviously had to omit the values for $$a = 0$$ to be able to use logarithmic axes. The errors for $$a = 0$$ are all in the range of the floating point precision of the machine on which we ran the tests (e.g., around $$10^{-14}$$ for $$\varepsilon _{\varvec{t}}$$). As can be seen from Figs. 3 and 4, the errors behave as expected: they are proportional to $$a/\sqrt{n}$$, except for very small values of n, where they are slightly larger. For reasonable amounts of noise ($$a \le 1$$), the translation errors are always smaller than 25 µm for the non-coplanar case (for $$n=4$$), while for the coplanar case they are always smaller than 60 µm (for $$n=3$$). Similarly, the angle errors are always smaller than 0.25$$^\circ$$ for the non-coplanar case and 1$$^\circ$$ for the coplanar case. The errors decrease quickly with increasing n.
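On log-log axes, the expected behavior corresponds to a straight line of slope $$-1/2$$ in n for fixed a. Such a slope can be checked numerically with a least-squares fit; the following helper (ours, not part of the published evaluation) sketches the idea:

```python
import numpy as np

def fitted_exponent(ns, errors):
    """Least-squares slope of log(error) vs. log(n).

    An exponent near -0.5 confirms the expected 1/sqrt(n) behavior
    of the error measures for fixed noise amplitude."""
    slope, _intercept = np.polyfit(np.log(np.asarray(ns, float)),
                                   np.log(np.asarray(errors, float)), 1)
    return slope
```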

## 6 Conclusions

We have examined the OnP problem (the PnP problem for cameras with telecentric lenses) and have shown that it is equivalent to the unbalanced orthogonal Procrustes problem for non-coplanar 3D points and to the sub-Stiefel Procrustes problem for coplanar 3D points.

For non-coplanar 3D points, we have applied two existing algorithms (Green and Gower; Koschat and Swayne) to the OnP problem. Furthermore, we have proposed three novel algorithms (one based on the Levenberg–Marquardt algorithm and two based on iterative polynomial system solvers) to solve the OnP problem. The evaluation of the algorithms has shown that the polynomial system solvers provide the best tradeoff between robustness and speed for reasonably posed problems (with noise and outliers in the data). For exceedingly difficult problems (100% outliers), the algorithm of Green and Gower provides the optimum robustness. However, it is substantially slower than the polynomial system solvers. The Levenberg–Marquardt algorithm does not provide the optimum robustness for a small number of point correspondences and it does not provide the fastest runtime for a large number of point correspondences. The algorithm of Koschat and Swayne is quite robust, but extremely slow compared to the other algorithms. Overall, the polynomial system solver with fallback that is based on a direct rotation representation is the best choice, while the algorithm of Green and Gower is a useful choice if robustness on very difficult problems is essential.

For coplanar 3D points, we have applied one existing algorithm (Cardoso and Ziȩtak) to the OnP problem. Furthermore, we have proposed three novel algorithms (one based on the Levenberg–Marquardt algorithm and two based on iterative polynomial system solvers) to solve the OnP problem. The evaluation of the four algorithms showed that the polynomial system solvers with fallback provide the best tradeoff between robustness and speed for reasonably posed problems (with noise and outliers in the data). For exceedingly difficult problems (100% outliers), there is no clear winner. The algorithm by Cardoso and Ziȩtak and the polynomial system solver based on unit quaternions seem slightly preferable to the Levenberg–Marquardt algorithm in terms of robustness. Overall, the polynomial system solver with fallback that is based on unit quaternions is the best choice. If robustness on very difficult problems is essential, the algorithm by Cardoso and Ziȩtak and the polynomial system solver based on unit quaternions seem to be the best choices. The evaluation has also shown that the polynomial system solver without fallback that is based on unit quaternions is the best choice for a minimal solver in a RANSAC scheme.