1 Introduction

Optimization over the special orthogonal group, the group of orthogonal matrices with determinant one, occurs in many geometric vision problems where the rigidity of a model needs to be preserved under transformations. While the objective functions are often simple least-squares costs, constraining a matrix to be a rotation requires a number of quadratic equality constraints on its elements, which makes the problem non-convex. On the other hand, since both the objective and the constraints are quadratic, the Lagrange dual function can be computed in closed form, and therefore optimization of the dual problem can be considered. It turns out that this dual is a linear semidefinite program (SDP), which can be reliably solved with standard solvers in polynomial time.

Recent studies [9, 12, 14, 18, 28, 32] have observed that in many practical applications the lower bound provided by the dual problem is often the same as the optimal value of the primal one. In such cases, duality offers a way of solving the original non-convex problem using a tight convex relaxation. For different problem classes, the prevalence of problem instances with tight relaxations varies, as illustrated by the synthetically generated instances in Fig. 1. Furthermore, finding conditions that can be rigorously proven to be sufficient for a tight relaxation and strong duality for a given problem class remains a challenging research area. In this paper, we focus on the converse question: For what problem classes can we find objective functions that give a nonzero duality gap and a non-tight relaxation? We use tools from algebraic geometry for analyzing when this happens in the case of general quadratic objective functions over rotational constraints. In particular, we consider the three most commonly occurring parametrizations that can be realized with quadratic constraints, namely 3D-rotations represented either by matrices from \(\mathrm{SO}(3)\) or by unit norm quaternions, and planar rotations represented by matrices from \(\mathrm{SO}(2)\).

We consider the dual of the dual, wherein all quadratic terms of the primal problem are replaced by linear terms over a set of ‘lifted’ variables subject to semidefinite constraints. The quadratic objective functions are then replaced by linear functions in the new variables, which are known to attain their optimum at extreme points of the feasible set. By studying the extreme points of this relaxation, we show that the situation is not as favorable as one might expect from the literature: Even for applications with relatively few rotations, we prove the existence of extreme points with rank strictly greater than one, and objective functions that are minimized at such points. The larger rank then prevents us from extracting a solution to the primal problem from the ‘lifted’ variables and shows that there is a duality gap.

Fig. 1

Histogram of the rank of the extreme points for 1000 synthetic experiments. If the rank is one, then the globally optimal solution can be extracted. a Problem instances with a quadratic objective function defined over \(\mathrm{SO}(3)\) are solved via an SDP relaxation. The coefficients for the quadratic function of each instance are uniformly drawn from \([-1,1]\). b The experiments are performed in a similar manner, except that the quadratic functions are defined over \(\mathrm{SO}(3) \times \mathrm{SO}(3)\). In almost \(50\%\) of the cases, an extreme point with rank greater than one is obtained, and hence, the SDP relaxations are not tight. See Sect. 7.3 for details

Our main contributions are:

  • We present a novel analysis of the duality properties for quadratic minimization problems subject to rotational constraints based on algebraic geometry.

  • We characterize for several applications when the standard SDP relaxation is tight and when it is not. For instance, we give counterexamples for the registration problem with \(\mathrm{SO}(3)\)-parametrization, showing that the SDP relaxation is not always tight since its solution may be an extreme point of rank 6. Similarly, we generate counterexamples which show that averaging of four planar rotations is not necessarily tight.

  • We show that the registration problem and the hand–eye calibration problem with \(\mathrm{SO}(2)\)-parametrization or quaternion parametrization are always guaranteed to produce tight SDP-relaxations.

1.1 Related Work

It is well known that finding the optimal rigid transformation that registers two point clouds can be done in closed form [27]. This is a key subroutine used in many different applications, for example, in the ICP algorithm [2]. However, registering other geometric primitives is a much harder problem. In [34, 35], a branch-and-bound approach is proposed for finding the 3D rigid transformation for corresponding points, lines, and planes. The same problem is solved in [9] by first eliminating the translation and then using SDP relaxations for estimating the rotation. Empirically, it was noted that the relaxations were always tight, but no theoretical analysis was given. The problem of registering multiple point clouds was solved using SDP relaxations and Lagrangian duality in [14]. The problem was further studied in [28] where it was shown that for low noise regimes, the SDP relaxation is always tight.

In robotics, SDP relaxations for estimating rigid transformations in simultaneous localization and mapping (SLAM) have been explored in a number of recent papers [8, 11, 12, 19, 32]. Again, the empirical performance is generally good, the optimal solutions can be efficiently computed [5], and the relaxations are shown to be tight for bounded noise levels. Non-tight counterexamples are also provided. In computer vision, there are many structure-from-motion (SfM) pipelines that rely on solving the so-called rotation averaging problem, see [1, 13, 17, 18, 20, 23, 24, 40]. One of the first approaches to use convex relaxations and duality in this setting was [20] where it was empirically observed that the relaxations are tight. A theoretical analysis and proof that for low noise regimes, SDP relaxations are indeed tight (no duality gap) have been derived in [15, 18, 36] for the problem of rotation averaging. The recent paper [17] builds on this analysis to develop an efficient algorithm with optimality guarantees.

Estimating the pose of a camera also involves optimization over the special orthogonal group [10, 33, 38, 41]. Approaches based on minimal solvers and Gröbner bases are often used. Alternatively, we show that the camera pose problem can be solved with SDP relaxations and convex optimization. Another classical problem that involves rotations is the hand–eye calibration problem [26]. In a recent paper [21], an SDP relaxation is proposed, again, with seemingly good empirical performance.

There are several previous works with similar aims as ours, but for more general problem settings. For a general, geometric overview of the problem at hand, we recommend [37] where orbitopes are studied. An orbitope is the convex hull of an orbit of a compact group acting linearly on a vector space. The dual of the dual formulation that we study corresponds to the first-order relaxation of the moment-SOS hierarchy [25], pioneered by Lasserre [30]. The approach of Lasserre has previously been applied to multiview geometry [29], but without any tightness guarantees. In [15], SDP relaxations for quadratically constrained quadratic programs (QCQP) are analyzed. Given that the SDP relaxation correctly solves a problem under noiseless observations (which is the case for the problems that we analyze), conditions are given which guarantee strong duality in the low noise regime. The size of this neighborhood is, however, not explicitly given. Further, a geometric interpretation of the relaxation is provided in [16]. We base our framework on the mathematical results proved in [4] where a deep connection is established between, on the one hand, algebraic varieties of minimal degree and on the other hand, the study of nonnegativity and its relation with sums of squares (SOS).

1.2 Contents of the Paper

In the next section, we present our general problem formulation. In Sect. 3, applications are presented and formulated in the standard form, of which the SDP relaxation is given in Sect. 4. In Sects. 5–7, we relate our problem to duality and analyze it using results from algebraic geometry. Our main result, a complete classification of SDP tightness for our example applications, is presented in Sect. 8.

2 Problem Formulation

The class of problems that we are interested in analyzing are problems involving rotations, parametrized either by

  1. (i)

    \(p\times p\) matrices belonging to the special orthogonal group, denoted \(\mathrm{SO}(p)\), where \(p=2\) or \(p=3\) for planar rotations and 3D-rotations, respectively.

  2. (ii)

    4-vectors of unit length representing quaternions, denoted \({{\mathcal {Q}}}\) for 3D-rotations.

In addition, we require that the objective function is quadratic in the variables of the chosen parametrization.

Let \( R= [R_1, \ldots \,R_n] \) where each \(R_i \in \mathrm{SO}(p)\) with \(p=2\) or \(p=3\) and let \(\text {vec}\left( R\right) \) denote the column-wise stacked vector of the \(p\times pn\) matrix R. Now let M be a real, symmetric \((p^2n+1)\times (p^2n+1)\) matrix; then, we would like to solve the following non-convex optimization problem

$$\begin{aligned} \begin{array}{lll} &{}\min \limits _{R \in \mathrm{SO}(p)^n} &{} \begin{bmatrix}\text {vec}\left( R\right) \\ 1\end{bmatrix}^T M \begin{bmatrix}\text {vec}\left( R\right) \\ 1\end{bmatrix} \end{array}. \end{aligned}$$
(1)

Alternatively we model 3D-rotations with unit quaternions, \(q= [q_1, \ldots \,q_n]\), and consider the similar formulation

$$\begin{aligned} \begin{array}{lll} &{}\min \limits _{q \in {\mathcal {Q}}^n} &{} \begin{bmatrix}\text {vec}\left( q\right) \\ 1\end{bmatrix}^T M \begin{bmatrix}\text {vec}\left( q\right) \\ 1\end{bmatrix} \end{array}. \end{aligned}$$
(2)

Note that not every problem with 3D-rotations is straightforward to model in both of the formulations (1) and (2); furthermore, their residual errors have different interpretations, and therefore the formulations are not equivalent. Also, the set of quaternions forms a double covering of the set of rotations, meaning that q and \(-q\) represent the same rotation [24].

Both of these problem formulations can be put in the following standard form:

$$\begin{aligned} \begin{array}{lll} \min \limits _{r } &{} r^T M r \\ {\text{ subject } \text{ to }} &{} r^T A_i r = 0, &{} i=1,\ldots ,l\\ &{} r^T e= 1 \end{array}. \end{aligned}$$
(3)

The l quadratic equations \(r^T A_i r = 0\) enforce the rotational constraints and \(r^T e = 1\) with \(e=[0,\ldots ,0,1]^T\) forces the final element of r to be one. The rest of the paper will be devoted to this standard form, and we will analyze it in detail.
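To make the standard form (3) concrete, here is a minimal sketch for a single planar rotation. The matrices \(A_1\) and e below are our own instantiation (not taken from the paper): with \(r = [c, s, 1]\), the constraint \(c^2+s^2=1\) is the homogeneous quadratic \(r^TA_1r=0\) with \(A_1 = \text{diag}(1,1,-1)\).

```python
import math

# Standard form data for one SO(2) rotation: r = [cos(theta), sin(theta), 1].
# The rotational constraint c^2 + s^2 = 1 becomes r^T A1 r = 0 with
# A1 = diag(1, 1, -1), and e = [0, 0, 1] forces the last entry of r to one.
A1 = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
e = [0, 0, 1]

def quad_form(A, r):
    return sum(A[i][j] * r[i] * r[j] for i in range(3) for j in range(3))

for theta in [0.0, 0.7, 2.5, -1.3]:
    r = [math.cos(theta), math.sin(theta), 1.0]
    assert abs(quad_form(A1, r)) < 1e-12                            # r^T A1 r = 0
    assert abs(sum(ei * ri for ei, ri in zip(e, r)) - 1.0) < 1e-12  # r^T e = 1
```

Linear and constant terms of an objective are handled the same way: the homogenizing last coordinate of r turns them into quadratic and bilinear terms in M.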

3 Applications

There are several application problems that can be modeled and solved using the above formulation. Often, one would like to solve for one or several rigid transformations (a rotation and a translation). However, in many cases, one can directly eliminate the translation and concentrate on the more difficult part of finding the rotations.

Next we give several examples of rotation problems appearing in the literature.

Example 1

Registration with point-to-point, point-to-line and point-to-plane correspondences can be written as in (3). The residuals are all of the form

$$\begin{aligned} \Vert P_i(R x_i + t - y_i)\Vert ^2 = \Vert (x_i^T\otimes P_i)\text {vec}\left( R\right) + P_i(t - y_i)\Vert ^2. \end{aligned}$$
(4)

If point \(x_i\) corresponds to point \(y_i\), then set \(P_i = I_3\). If \(x_i\) is a measurement known to lie on a line, then set \(P_i = I - v_i v_i^T\), where \(v_i\) is a unit directional vector and \(y_i\) is a point on the line. Similarly, if \(x_i\) lies on a plane, then set \(P_i = n_i^T\), where \(n_i\) is a unit normal and \(y_i\) is a point on the plane.
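The vectorization identity in (4) is easy to verify numerically. The sketch below (helper functions and the sample data are our own, not the paper's) checks it for all three correspondence types, using the column-wise vec convention and \(\text{vec}(PRx) = (x^T\otimes P)\text{vec}(R)\):

```python
import math

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    # Kronecker product of an m x n matrix with a p x q matrix.
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def vec(A):
    # Column-wise stacking, matching vec(R) in the text.
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

R = matmul(rot_z(0.4), rot_x(-1.1))
x, y, t = [0.3, -1.2, 0.5], [1.0, 0.2, -0.4], [0.1, -0.7, 2.0]

v = [1 / 3, 2 / 3, 2 / 3]                       # unit line direction
n = [2 / 3, 1 / 3, 2 / 3]                       # unit plane normal
P_point = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P_line = [[(1.0 if i == j else 0.0) - v[i] * v[j] for j in range(3)] for i in range(3)]
P_plane = [n]

gap = 0.0
for P in (P_point, P_line, P_plane):
    w = [a + b - c for a, b, c in zip(matvec(R, x), t, y)]   # R x + t - y
    lhs = sum(c * c for c in matvec(P, w))                   # ||P(Rx + t - y)||^2
    rhs_vec = [a + b for a, b in zip(matvec(kron([x], P), vec(R)),
                                     matvec(P, [b - c for b, c in zip(t, y)]))]
    gap = max(gap, abs(lhs - sum(c * c for c in rhs_vec)))   # vs. (x^T (x) P)vec(R) + P(t-y)
assert gap < 1e-9
```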

Minimizing over t gives the closed-form solution

$$\begin{aligned} t = \Big ( \sum _i P_i^TP_i \Big )^{-1} \sum _i P_i^TP_i \left( y_i - Rx_i\right) , \end{aligned}$$

(5)

which is linear in \(\text {vec}\left( R\right) \). Inserting this into the objective function thus gives an expression which is quadratic in \(\text {vec}\left( R\right) \) and therefore can be reshaped into (3).

In [27], it is shown that the registration problem with point-to-point correspondences can be formulated as a quadratic optimization problem in the quaternion representation. We are not aware of any quadratic quaternion formulation for the point-to-line and point-to-plane cases.

Example 2

Resectioning is the problem of recovering the position and orientation of a camera given 2D-to-3D correspondences. Geometrically this can be done by aligning the viewing rays from the camera with the 3D points. This reduces the problem to a special case of point-to-line registration where all of the lines intersect in the camera center.

Example 3

Hand–eye calibration is the problem of determining the transformation between a sensor (often a camera) and a robot hand on which the sensor is mounted. Given rotation measurements \(U_i\) and \(V_i\), \(i=1,\ldots ,m\) relative to a global frame, for the sensor and the robot hand, respectively, the objective is to find the relative rotation R between them by solving the following optimization problem:

$$\begin{aligned}&\min \limits _{R \in \mathrm{SO}(3)} \sum \limits _{i=1}^m ||U_iR-RV_i||_F^2.\nonumber \\&\Vert U_iR-RV_i \Vert _F^2 = \Vert (I\otimes U_i - V_i^T\otimes I)\text {vec} \left( R\right) \Vert ^2 \nonumber \\&\qquad \qquad \qquad \qquad = \text {vec} \left( R\right) ^T M_i \text {vec} \left( R\right) , \end{aligned}$$
(6)

where \(M_i = 2I- V_i\otimes U_i-V_i^T\otimes U_i^T\). Finally, set M as

$$\begin{aligned} M = \begin{bmatrix} \sum _{i=1}^m M_i &{} 0 \\ 0 &{} 0 \end{bmatrix}. \end{aligned}$$
(7)

An alternative formulation using quaternions can also be employed. If the unit quaternions \(u,v \in {{\mathcal {Q}}}\) represent the rotations \(U,V \in \mathrm{SO}(3)\), then the quaternion representing the composition UV can be written Q(u)v, where

$$\begin{aligned} Q(u) = { \left( \begin{matrix} u_0 &{} -u_1 &{} -u_2 &{} -u_3\\ u_1 &{} u_0 &{} -u_3 &{} u_2\\ u_2 &{} u_3 &{} u_0 &{} -u_1\\ u_3 &{} -u_2 &{} u_1 &{} u_0 \end{matrix} \right) . } \end{aligned}$$
(8)

Therefore, the optimization problem

$$\begin{aligned} \begin{array}{lll}&\min \limits _{q \in {{\mathcal {Q}}}}&\sum \limits _{i=1}^m ||Q(u_i)q-Q(q)v_i||^2 \end{array} \end{aligned}$$
(9)

can also be used and turned into the standard form in order to solve hand–eye calibration. Note, however, that due to the double covering the signs of \(u_i\) and \(v_i\) have to be selected consistently in order for q to give a low objective value.
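Both the composition rule \(Q(u)v\) from (8) and the double covering can be checked numerically. The sketch below uses our own helper functions (not from the paper) and the standard unit-quaternion-to-rotation formula:

```python
import math

def Q(u):
    # The matrix in (8).
    u0, u1, u2, u3 = u
    return [[u0, -u1, -u2, -u3],
            [u1,  u0, -u3,  u2],
            [u2,  u3,  u0, -u1],
            [u3, -u2,  u1,  u0]]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalize(q):
    s = math.sqrt(sum(x * x for x in q))
    return [x / s for x in q]

def quat_to_R(q):
    # Standard unit-quaternion-to-rotation formula, q = (w, x, y, z).
    w, x, y, z = q
    return [[1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
            [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
            [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)]]

u = normalize([0.9, 0.1, -0.3, 0.2])
v = normalize([0.2, -0.7, 0.4, 0.5])
uv = matvec(Q(u), v)                      # composed quaternion via (8)

# Q(u)v represents the composed rotation: R(Q(u)v) = R(u) R(v)
RuRv = matmul(quat_to_R(u), quat_to_R(v))
comp_gap = max(abs(a - b) for ra, rb in zip(quat_to_R(uv), RuRv)
               for a, b in zip(ra, rb))
assert comp_gap < 1e-9

# Double covering: q and -q give the same rotation
cover_gap = max(abs(a - b)
                for ra, rb in zip(quat_to_R(uv), quat_to_R([-c for c in uv]))
                for a, b in zip(ra, rb))
assert cover_gap < 1e-12
```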

Example 4

Rotation averaging aims to determine a set of absolute orientations \(R_i\), \(i=1,\ldots ,n\) from a number of measured relative rotations \(R_{ij}\) by minimizing

$$\begin{aligned} \sum _{i \not = j} \Vert R_i R_{ij}-R_j\Vert _F^2. \end{aligned}$$
(10)

Since \(\Vert R_i\Vert _F^2 = 3\), the problem is (ignoring constants) equivalent to minimizing

$$\begin{aligned} -\sum _{i \not = j} \langle R_iR_{ij},R_j\rangle = \text {tr}\left( R M_0 R^T\right) , \quad M_0 = - { \begin{bmatrix} 0 &{} R_{12} &{} \cdots &{} R_{1n} \\ R^T_{12} &{} 0 &{} \cdots &{} R_{2n} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ R^T_{1n} &{} R^T_{2n} &{} \cdots &{} 0 \\ \end{bmatrix}. } \end{aligned}$$
(11)

Letting \(M = \text {blkdiag}\left( M_0 \otimes I_3, 0\right) \) now gives an optimization problem of form (3). \(I_3\) is a \(3 \times 3\) identity matrix, and the \(\text {blkdiag}\left( \cdot \right) \) operation constructs a block-diagonal matrix.

Similar to the hand–eye calibration problem, rotation averaging can be formulated with quaternions [22] using the objective function \(\sum _{i\ne j} \Vert Q(r_i)r_{ij} - r_j\Vert ^2\), which after simplifications yields an expression similar to (11) and hence can be put in the standard form (3).
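The equivalence between (10) and (11) can be sanity-checked for \(n=2\). The sketch below is our own (helper names and sample rotations are assumptions), and we additionally assume consistent measurements \(R_{21}=R_{12}^T\); each residual term contributes the constant 6 since \(\Vert R_iR_{ij}\Vert_F^2 = \Vert R_j\Vert_F^2 = 3\):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rot(a, b):
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    return matmul([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]],
                  [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]])

def frob2(A):
    return sum(a * a for row in A for a in row)

R1, R2, R12 = rot(0.2, -0.8), rot(1.4, 0.3), rot(-0.6, 1.0)
R21 = transpose(R12)                  # assumed consistent: R_21 = R_12^T

E1 = [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(matmul(R1, R12), R2)]
E2 = [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(matmul(R2, R21), R1)]
direct = frob2(E1) + frob2(E2)        # objective (10) for n = 2

# M_0 from (11) for n = 2, and R = [R_1, R_2] of size 3 x 6
M0 = [[0.0] * 6 for _ in range(6)]
for i in range(3):
    for j in range(3):
        M0[i][3 + j] = -R12[i][j]     # block (1,2): -R_12
        M0[3 + i][j] = -R12[j][i]     # block (2,1): -R_12^T
R = [R1[i] + R2[i] for i in range(3)]
RM0 = matmul(R, M0)
tr_RM0Rt = sum(RM0[i][j] * R[i][j] for i in range(3) for j in range(6))

# ||A - B||_F^2 = 6 - 2<A,B> for rotations, so (10) = 12 + 2 tr(R M0 R^T)
avg_gap = abs(direct - (12.0 + 2.0 * tr_RM0Rt))
assert avg_gap < 1e-9
```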

Example 5

Point-set averaging is the problem of registering a number of point sets, measured in different coordinate systems, to an unknown average model. If \(X_i\), \(i=1,\ldots ,n\) are \(3 \times n\) matrices containing measurements of corresponding 3D points, we want to find a \(3 \times n\) matrix Y, rotations \(R_i\) and translations \(t_i\) such that

$$\begin{aligned} \sum _i \Vert R_i X_i + t_i{\mathbbm {1}}^T - Y\Vert _F^2, \end{aligned}$$
(12)

where \({\mathbbm {1}}\) is a column vector of all ones, is minimized. Since the variables Y and \(t_i\), \(i=1,\ldots ,n\) are unconstrained, they can be solved for as functions of the rotations \(R_i\), \(i=1,\ldots ,n\). Assuming that the centroid of the points in \(X_i\) is the origin for all i, back-substitution allows us to write the problem solely in terms of the rotations as

$$\begin{aligned} \min _{R \in \mathrm{SO}(3)^n} \text {tr}\left( R M_0 R^T\right) , \end{aligned}$$
(13)

where

$$\begin{aligned} M_0 = \text {blkdiag}\left( X_1X_1^T, \ldots , X_nX_n^T\right) - \frac{1}{n} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}^T. \end{aligned}$$

(14)

Letting \(M = \text {blkdiag}\left( M_0 \otimes I_3, 0\right) \) now gives an optimization problem of form (3).

In the above examples, we only considered the case of 3D-rotations. Note, however, that all of these problems also have meaningful versions in the plane for which parametrization using \(\mathrm{SO}(2)\) yields the same type of problem.

4 SDP Relaxation

Let us now derive the standard convex SDP relaxation for our standard form (3). Consider the objective function, which can be rewritten using trace notation as

$$\begin{aligned} r^T M r = \text {tr}\left( Mrr^T\right) = \text {tr}\left( MX\right) , \end{aligned}$$

where \(X=rr^T\). Note that \(\text {rank}\left( X\right) =1\) and \(X \succeq 0\).

For \(\mathrm{SO}(3)\), the condition that \(R_i\) belongs to the special orthogonal group can be expressed by quadratic constraints in the entries of \(R_i\), for instance \(R_i^TR_i-I=0\). Similarly, the requirement that the cross product of the first and second rows equals the third row, which ensures \(\det {R_i}=1\), is also quadratic. Consequently, the same constraints can be expressed by linear equations in the entries of X in the form \(\text {tr}\left( A_iX\right) =0\). It can be checked that there are 20 linearly independent such constraints for each \(R_i\).
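The count of 20 can be verified exactly. One way to generate the quadrics (our own choice of generating set, equivalent to the conditions above) is orthonormal columns, orthonormal rows, and the cofactor conditions \(R = \text{cof}(R)\), which encode the cross-product constraints. The sketch below homogenizes with the last coordinate of r and computes the rank of the resulting 21 coefficient vectors with exact rational arithmetic:

```python
from fractions import Fraction
from itertools import combinations_with_replacement

N = 10  # r0..r8 = entries of R (column-wise), r9 = homogenizing coordinate
MONOMIALS = list(combinations_with_replacement(range(N), 2))  # 55 monomials r_a r_b

def idx(i, j):
    # position of R[i][j] in vec(R) with column-wise stacking
    return 3 * j + i

def add(poly, a, b, c):
    key = (min(a, b), max(a, b))
    poly[key] = poly.get(key, 0) + c

polys = []
# R^T R = I (orthonormal columns): 6 equations
for a in range(3):
    for b in range(a, 3):
        p = {}
        for i in range(3):
            add(p, idx(i, a), idx(i, b), 1)
        if a == b:
            add(p, 9, 9, -1)
        polys.append(p)
# R R^T = I (orthonormal rows): 6 equations
for a in range(3):
    for b in range(a, 3):
        p = {}
        for k in range(3):
            add(p, idx(a, k), idx(b, k), 1)
        if a == b:
            add(p, 9, 9, -1)
        polys.append(p)
# R = cof(R): 9 equations, each R[i][j] minus its cofactor
for i in range(3):
    for j in range(3):
        i1, i2 = [r for r in range(3) if r != i]
        j1, j2 = [c for c in range(3) if c != j]
        s = (-1) ** (i + j)
        p = {}
        add(p, idx(i, j), 9, 1)
        add(p, idx(i1, j1), idx(i2, j2), -s)
        add(p, idx(i1, j2), idx(i2, j1), s)
        polys.append(p)

# Exact rank of the 21 coefficient vectors via Gaussian elimination
rows = [[Fraction(p.get(m, 0)) for m in MONOMIALS] for p in polys]
rank = 0
for col in range(len(MONOMIALS)):
    piv = next((k for k in range(rank, len(rows)) if rows[k][col] != 0), None)
    if piv is None:
        continue
    rows[rank], rows[piv] = rows[piv], rows[rank]
    for k in range(len(rows)):
        if k != rank and rows[k][col] != 0:
            f = rows[k][col] / rows[rank][col]
            rows[k] = [x - f * y for x, y in zip(rows[k], rows[rank])]
    rank += 1
assert rank == 20
```

The single dependency among the 21 equations is \(\text{tr}(R^TR - I) = \text{tr}(RR^T - I)\), both being the sum of squares of all entries minus 3.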

The corresponding program for \(\mathrm{SO}(2)\) is similar to that of \(\mathrm{SO}(3)\), but it only requires one constraint per rotation. We represent a rotation by \( R_i = \left( \begin{matrix} c_i &{} -s_i \\ s_i &{} c_i \end{matrix} \right) , \) where \(c_i^2+s_i^2 = 1\). Hence, for n rotations, only 2n variables are needed in the vector r and the unit length constraint becomes linear in the entries of X. Similarly for quaternions \({{\mathcal {Q}}}\), the unit length constraint for each \(q_i\) can be written as a linear constraint.

If we ignore the non-convex constraint that \(\text {rank}\left( X\right) =1\), then we get a semidefinite program over X: The objective function is linear in X, subject to linear equality constraints and a positive semidefinite constraint, \(X\succeq 0\). This leads to the following convex relaxation:

$$\begin{aligned} \begin{array}{lll} &{}\min \limits _{X \succeq 0 } &{} \text {tr}\left( MX\right) \\ &{} &{} \text {tr}\left( A_iX\right) = 0, \quad i=1,\ldots ,l \\ &{} &{} \text {tr}\left( ee^TX\right) = 1 \end{array}. \end{aligned}$$
(15)

Note that since we have relaxed (ignored) the constraint \(\text {rank}\left( X\right) =1\), the optimal value will be a lower bound on that of the original non-convex problem (3). Further, if the optimal solution \(X^*\) has rank one, then we say that the relaxation is tight, and the globally optimal solution is obtained.
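Extracting the primal solution from a rank-1 \(X^*\) is immediate: since \(X^* = rr^T\) and the last entry of r is one, r is simply the last column of \(X^*\). A minimal sketch (the angle and SO(2) setup are our own toy data):

```python
import math

# If the SDP returns a rank-1 solution X = r r^T with r^T e = 1, the
# primal vector r is the last column of X (its last entry is 1).
theta = 0.8
r = [math.cos(theta), math.sin(theta), 1.0]   # feasible point for SO(2), n = 1
X = [[a * b for b in r] for a in r]           # lifted rank-1 matrix

r_rec = [row[-1] for row in X]                # last column of X
assert all(abs(a - b) < 1e-12 for a, b in zip(r, r_rec))
# the recovered entries satisfy the rotational constraint
assert abs(r_rec[0] ** 2 + r_rec[1] ** 2 - 1.0) < 1e-12
```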

5 Duality and Sums of Squares

Consider again the objective function in (3) and the Lagrangian dual problem of (15)

$$\begin{aligned} \begin{array}{lll} &{}\max \limits _{\gamma ,\lambda _1,\ldots ,\lambda _l } &{} \gamma \\ &{} &{} M-\sum _i \lambda _i A_i - \gamma ee^T \succeq 0, \end{array} \end{aligned}$$
(16)

where \((\gamma ,\lambda )\) are the dual variables. By construction, this problem gives the same objective value as (15) and therefore a lower bound on the original (3). We are interested in knowing when this lower bound is tight.

Let I be the ideal of the polynomials defining the constraint set, that is, a polynomial p is in I when

$$\begin{aligned} p(r) = v(r)(r^Te e^T r-1) + \sum _i v_i(r)r^T A_i r, \end{aligned}$$
(17)

where v and \(v_i\) are any polynomials. The variety V(I) consists of the feasible points \(\{r \, | \, p(r) = 0\ \forall p\in I\}\). Let \(R_2\) denote the set of quadratic polynomials modulo I, that is, two polynomials \(f, g \in R_2\) are considered equal if \(f-g \in I\).

The question of tightness between the original problem (3) and the relaxation (15) and its dual (16) is related to the two convex, closed cones

$$\begin{aligned} P := \left\{ f \in R_2 \, | \, f(r) \ge 0 \text{ for } \text{ all } r \in V(I) \,\right\} , \end{aligned}$$
(18)

and

$$\begin{aligned} \begin{array}{ll} \Sigma := \{ f \in R_2 \, | \,&{} \text{ there } \text{ exist } \text{ vectors } a_1,\ldots ,a_k \\ &{}\text{ such } \text{ that } f(r) = \sum _{i=1}^k (a_i^T r)^2 \, \}. \end{array} \end{aligned}$$
(19)

Note that the cones are defined to be dependent on the constraint set of (3) and not on the actual form of the objective function \(r^TMr\). As any quadratic polynomial f in \(\Sigma \) is a sum of linear squares on the feasible set V(I) and hence nonnegative, it follows that \(\Sigma \subseteq P\).

Consider again our original problem in (3), written as

$$\begin{aligned} \eta ^* = \min _{r\in V(I)} r^T M r. \end{aligned}$$

It follows that \(r^T(M-\eta ^*ee^T)r \in P\). If \(\gamma ^*\) is the optimal value of (16) with dual variables \(\lambda ^*\), then the matrix \(M-\sum _i \lambda ^*_i A_i - \gamma ^* ee^T\) is positive semidefinite, and we can factor it into a sum of rank-1 matrices \(\sum _j a_j a_j^T\). Therefore,

$$\begin{aligned} \sum _j (a_j^Tr)^2 = r^T (M-\gamma ^*ee^T) r - r^T\left( \sum _i \lambda ^*_i A_i\right) r . \end{aligned}$$

Now, \(r^T\left( \sum _i \lambda ^*_i A_i\right) r\) lies in I, and we can conclude that the quadratic polynomial \(r^T(M - \gamma ^*ee^T)r\) belongs to \(\Sigma \) when \((\gamma ^*,\lambda ^*)\) is the solution to (16).

In view of the above discussion, it is clear that the convex formulations (15) and (16) can only give the same objective value as (3) when \(r^T M r - \eta ^*\) is a sum of squares, where \(\eta ^*\) is the optimal value of (3). The question we are interested in is hence: when is it possible to find an SOS decomposition of this nonnegative quadratic form? If the cones are not equal, that is, \(\Sigma \subsetneq P\), then there may exist objective functions for which the relaxation is not tight. We shall investigate this further in a constructive manner. First, we need some more tools from algebraic geometry; see the book [3] for a general introduction.
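As a toy illustration of membership in \(\Sigma\) (the specific cost is our own example, not from the paper): on \(V(I) = \mathrm{SO}(2)\), the unit circle \(c^2+s^2=1\), the nonnegative quadratic \(f = 2-2c\) equals the sum of squares \((c-1)^2+s^2\) modulo the ideal, since \((c-1)^2+s^2 = (2-2c) + (c^2+s^2-1)\):

```python
import math, random

random.seed(1)
min_f, sos_gap = float('inf'), 0.0
for _ in range(200):
    theta = random.uniform(-math.pi, math.pi)
    c, s = math.cos(theta), math.sin(theta)
    f = 2 - 2 * c                   # nonnegative on the circle, zero at theta = 0
    sos = (c - 1) ** 2 + s ** 2     # SOS representative: equals f modulo the ideal
    min_f = min(min_f, f)
    sos_gap = max(sos_gap, abs(f - sos))
assert min_f > -1e-12               # f is in the cone P
assert sos_gap < 1e-12              # f agrees with an SOS on V(I), so f is in Sigma
```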

6 The Varieties of Rotations

An algebraic variety V is the set of solutions of a system of polynomial equations over the reals. In this paper, we analyze three varieties that are commonly used in computer vision applications: \(\mathrm{SO}(2)^n\), \(\mathrm{SO}(3)^n\) and \({{\mathcal {Q}}}^n\). These varieties can be defined by a system of polynomial equations in the entries of \(2\times 2\) matrices, \(3\times 3\) matrices and 4-vectors, respectively (cf. Sect. 4). The dimensions and co-dimensions of these varieties are well known: \(\dim {\mathrm{SO}(2)} = 1\), \({{\,\mathrm{codim}\,}}{\mathrm{SO}(2)} = 1\), \(\dim {\mathrm{SO}(3)} = 3\), \({{\,\mathrm{codim}\,}}{\mathrm{SO}(3)} = 6\), \(\dim {{{\mathcal {Q}}}} = 3\), and \({{\,\mathrm{codim}\,}}{{{\mathcal {Q}}}} = 1\). The degree of V is by definition the number of intersection points of the variety with \(\dim {V}\) general hyperplanes, and we have that \(\deg {\mathrm{SO}(2)} = 2\), \(\deg {\mathrm{SO}(3)} = 8\) (see [7] for a derivation) and \(\deg {{{\mathcal {Q}}}} = 2\).

For n copies of V, it is straightforward to show that the variety of \(V^n\) has dimension \(n\dim {V}\), co-dimension \(n {{\,\mathrm{codim}\,}}{V}\), and degree \((\deg {V})^n\). For instance, for the case of \(\mathrm{SO}(3)^n\), we have that \(\dim {\mathrm{SO}(3)^n} = 3n\), \({{\,\mathrm{codim}\,}}{\mathrm{SO}(3)^n} = 6n\), and \(\deg {\mathrm{SO}(3)^n} = 8^n\).

For any irreducible non-degenerate variety V, \(\deg {V} \ge {{\,\mathrm{codim}\,}}{V}+1\). A variety is called minimal if it is non-degenerate (that is, not contained in a hyperplane) and \(\deg {V} = {{\,\mathrm{codim}\,}}{V}+1\). Similarly, it is called almost minimal when \(\deg {V} = {{\,\mathrm{codim}\,}}{V}+2\). Considering the degrees and co-dimensions of the varieties previously listed, Table 1 summarizes their characterization as minimal, almost minimal, and not minimal, for the cases \(n = 1\), \(n = 2\), and \(n > 2\).

Table 1 Characterization of \(V=\mathrm{SO}(3)^n\), \(V=\mathrm{SO}(2)^n\), and \(V={{\mathcal {Q}}}^n\), in terms of their degree. If \(\deg {V}={{\,\mathrm{codim}\,}}{V}+1\), V is said to be minimal, and if \(\deg {V}={{\,\mathrm{codim}\,}}{V}+2\), V is almost minimal. Otherwise, \(\deg {V} \ge {{\,\mathrm{codim}\,}}{V}+3\), and V is neither minimal nor almost minimal
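The classification in Table 1 follows directly from the degree and codimension counts in Sect. 6. A small script (our own sketch) reproduces it:

```python
# Reproducing the classification in Table 1: V^n has codimension
# n * codim(V) and degree deg(V)^n, per Sect. 6.
base = {'SO(2)': (1, 2), 'SO(3)': (6, 8), 'Q': (1, 2)}  # (codim, degree)

def classify(name, n):
    codim, deg = base[name]
    codim, deg = n * codim, deg ** n
    if deg == codim + 1:
        return 'minimal'
    if deg == codim + 2:
        return 'almost minimal'
    return 'neither'

assert classify('SO(2)', 1) == 'minimal'
assert classify('Q', 1) == 'minimal'
assert classify('SO(3)', 1) == 'almost minimal'
assert classify('SO(2)', 2) == 'almost minimal'
assert classify('Q', 2) == 'almost minimal'
assert classify('SO(3)', 2) == 'neither'
assert all(classify(v, n) == 'neither' for v in base for n in range(3, 6))
```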

7 The Extreme Points of the SDP Relaxation

In this section, we investigate further the convex cone of nonnegative polynomials and that of SOS polynomials over the rotational varieties, \(\mathrm{SO}(3)\), \(\mathrm{SO}(2)\), and \({{\mathcal {Q}}}\). The goal is to find out when the SDP relaxation is tight and to characterize all possible extreme points for the relaxation.

The following general result is proved in [4].

Lemma 1

(Blekherman et al. [4]) Let V be a real irreducible non-degenerate variety such that its subset of real points is Zariski dense. Every real quadratic form that is nonnegative on V is a sum of squares of linear forms if and only if V is a variety of minimal degree.

An illustration of the result is given in Fig. 2. We will now apply it to our varieties of interest.

Fig. 2

Illustration of the closed convex cones P and \(\Sigma \) defined in (18) and (19). If the variety V(I) is minimal, then \(\Sigma =P\); otherwise, \(\Sigma \subsetneq P\)

7.1 Minimal Varieties

Among the varieties in Table 1, we know that only \(\mathrm{SO}(2)\) and \({{\mathcal {Q}}}\) are minimal. In the remaining cases, the convex cones P and \(\Sigma \) defined in (18) and (19) are therefore strictly different, i.e., \(\Sigma \subsetneq P\). The proof of Lemma 1 is constructive, and it allows us to generate objective functions that are nonnegative but not sums of squares, for which the SDP relaxations will consequently not be tight. However, our example applications have objective functions of a special form, and it remains to see whether there are such objectives which are not sums of squares. We will return to this question in Sect. 8. Since \(\mathrm{SO}(2)\) and \({{\mathcal {Q}}}\) are minimal, we can already conclude that any hand–eye calibration problem defined over \(\mathrm{SO}(2)\) or \({{\mathcal {Q}}}\) will always have a tight SDP relaxation.

7.2 Almost Minimal Varieties

In the case when V is almost minimal, that is, when V is either \(\mathrm{SO}(3)\), \(\mathrm{SO}(2) \times \mathrm{SO}(2)\) or \({{\mathcal {Q}}}\times {{\mathcal {Q}}}\) (Table 1), we will still have \(\Sigma \subsetneq P\), but the gap between the cones will be smaller. Furthermore, for problems in \(P \setminus \Sigma \), the extreme points of the corresponding SDP relaxation can be characterized based on the theory in [4]. An immediate reformulation of Proposition 3.5 in [4] for our purposes gives the following corollary.

Corollary 1

Assume that the variety V is almost minimal and arithmetically Cohen–Macaulay. Then, the extreme points \(X^*\) of the SDP relaxation in (15) either have \(\text {rank} \left( X^*\right) =1\) or \(\text {rank} \left( X^*\right) ={{\,\mathrm{codim}\,}}{V}\).

All of the varieties we study are smooth and therefore arithmetically Cohen–Macaulay. Furthermore, recall from Sect. 6 that \({{\,\mathrm{codim}\,}}{\mathrm{SO}(3)} = 6\), \({{\,\mathrm{codim}\,}}{\mathrm{SO}(2) \times \mathrm{SO}(2)} = 2\) and \({{\,\mathrm{codim}\,}}{{{\mathcal {Q}}}\times {{\mathcal {Q}}}} = 2\). In the \(\mathrm{SO}(3)\) case, if the computed optimal solution \(X^*\) of the relaxation has neither rank 1 nor rank 6, but, say, rank 2, then \(X^*\) can be decomposed as \(X^*=\lambda X_1^*+(1-\lambda )X_2^*\) for some \(\lambda \in (0,1)\), where \(X_1^*\) and \(X_2^*\) are rank-1 optimal solutions and extreme points.

If \(\text {rank}\left( X^*\right) =1\), then the corresponding objective function \(r^TMr-\eta ^*\) (where \(\eta ^*\) is the optimal objective value) is a sum of squares, and, as shown previously, the globally optimal solution can be retrieved by solving the SDP. If \(\text {rank}\left( X^*\right) >1\) and \(X^*\) cannot be decomposed into rank-1 extreme points, then the corresponding objective function \(r^TMr-\eta ^*\) is not a sum of squares. For almost minimal arithmetically Cohen–Macaulay varieties, such extreme points \(X^*\) must satisfy \(\text {rank}\left( X^*\right) ={{\,\mathrm{codim}\,}}{V}\), and there are no other possibilities.

To summarize: if, when minimizing a given problem over an almost minimal variety V, we obtain an extreme point \(X^*\) with \(\text {rank}\left( X^*\right) =1\), then we have indeed computed the globally optimal solution; if it instead turns out that \(\text {rank}\left( X^*\right) ={{\,\mathrm{codim}\,}}{V}\), then the relaxation is not tight, and we do not even have a feasible solution to the original problem, only a lower bound on the optimal value.

7.3 Prevalence of Non-tight Problem Instances

In Fig. 1, we presented the results of two sets of synthetic experiments, illustrating the significance of almost minimal varieties.

In the first set of experiments, the domain is the almost minimal variety \(V=\mathrm{SO}(3)\) and the entries of the objective function, encoded by the \(10\times 10\) symmetric matrix M in (3), were randomly drawn from a uniform distribution on \([-1,1]\). In all 1000 examples, we obtained a rank-1, globally optimal solution for the SDP relaxation, even though the variety is not minimal. This shows that the rank-6 extreme points predicted by Corollary 1 are rare in practice among the random objective functions considered. It is, however, possible to produce such non-tight examples, and we shall return to this question later.

In the second set of experiments, the optimization took place over \(V=\mathrm{SO}(3) \times \mathrm{SO}(3)\), which is not almost minimal. The entries of the \(19 \times 19\) symmetric matrix M were generated in the same way via a uniform distribution. In this case, the relaxation performs worse, and various ranks are obtained for its solutions.

Remark

For varieties that are neither minimal nor almost minimal, the nonnegative cone P becomes significantly larger than the SOS cone \(\Sigma \). Non-tight SDP relaxations will be more prevalent, and various ranks will be observed for the solutions of these non-tight relaxations. A rank-1 solution will, however, always provide a solution to the primal problem.

8 Tightness of Our Example Applications

Table 2 Tightness of SDP relaxations for various applications and parametrizations. Colors follow Table 1, illustrating whether the domain is minimal, almost minimal, or neither. The main new results are for the almost minimal cases, for which we have generated rare non-tight counterexamples. For the low noise cases, tightness can only be guaranteed in the low noise regime. We conclude that only the problem classes over minimal varieties come with tightness guarantees

The theoretical results in the previous section apply to general quadratic objective functions. For actual applications, the objective functions will be structured. For instance, consider the hand–eye calibration problem in Example 3. There are only purely quadratic terms of the rotation variables in the objective and no linear ones. Hence, the last row and the last column of the matrix M will be zero. In this section, we analyze structured objective functions corresponding to different problem classes. We also relate our new results to previous ones in the literature.

In Table 2, we present a complete classification of SDP tightness for our example applications. In accordance with Table 1, applications for the minimal varieties \(\mathrm{SO}(2)\) and \({{\mathcal {Q}}}\) are always tight—this is a known result, as there is a single quadratic constraint (see, for example, Boyd and Vandenberghe [6]). For the almost minimal varieties, we generate rare non-tight problem instances, and for the non-minimal cases we conclude that tightness can only be guaranteed in the low noise regime, supported by previous works and empirically demonstrated by us.

Noise-free case All of our considered example applications have objective functions of the form \(r^TMr = r^TU^TUr = \Vert Ur \Vert ^2\) for some matrix U. If the optimal value \(\eta ^* = \min _{r\in V(I)} r^T M r\) is equal to zero (which corresponds to the noise-free case), then \(r^T(M-\eta ^*ee^T)r = r^TMr \in P\), where P is the cone of nonnegative quadratic forms in (18). Further, since \(M = U^TU\succeq 0\), we can factor M into a sum of rank-1 matrices \(M = \sum _j a_j a_j^T\). It follows that \(r^TMr = \sum _j (a_j^Tr)^2\) is SOS. This is a well-known result, and it has further been studied in [15], where it is shown that for low noise levels (\(\eta ^*\) close to zero), the nonnegative polynomial \(r^T M r - \eta ^*\) is a sum of squares as well.
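The rank-1 factorization underlying this argument is just an eigendecomposition: with \(a_j = \sqrt{\lambda _j}\, v_j\), the objective becomes a sum of squares \(\sum _j (a_j^Tr)^2\). A short numpy sketch (with a random stand-in U, not data from a real application) verifies this numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Any M of the form U^T U is PSD; this mimics the noise-free objective
# r^T M r = ||U r||^2 with a random stand-in U.
U = rng.standard_normal((5, 10))
M = U.T @ U

# Eigendecomposition gives the rank-1 factorization M = sum_j a_j a_j^T
# with a_j = sqrt(lambda_j) v_j, hence r^T M r = sum_j (a_j^T r)^2 is SOS.
lam, V = np.linalg.eigh(M)
lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off
A = V * np.sqrt(lam)           # column j is the vector a_j

r = rng.standard_normal(10)
sos_value = np.sum((A.T @ r) ** 2)
assert np.isclose(r @ M @ r, sos_value)
```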

8.1 Registration and Resectioning

The formulations of these two applications over the domain \(\mathrm{SO}(3)\) are given in Examples 1 and 2, respectively. As the variety \(\mathrm{SO}(3)\) is almost minimal (Table 1), one may wonder whether there are actual problem instances that lead to non-tight relaxations and rank-6 extreme points (Corollary 1). In [4], there is a procedure for finding polynomials that are nonnegative, but not sums of squares. However, this will in general not result in objective functions originating from registration or resectioning problems. The objective function for this type of problem is of the form \(r^TMr = r^TU^TUr = \Vert Ur \Vert ^2\) where \(M=U^TU\) with U of size \(m \times 10\), and m is the number of correspondences of type point-to-point, point-to-line or point-to-plane. In particular, \(M\succeq 0\), and the structure of U imposes additional requirements as well.

In Sect. A in the appendix, we show how to modify the procedure of Blekherman et al. [4] in order to achieve such objective functions. For every non-tight problem instance generated with Procedure 1 described in Sect. A, we get a rank-6 solution \(X^*\), as predicted by Corollary 1, and consequently no solution to the primal problem can be extracted. Hence, there do indeed exist problem instances that are non-tight, but they are rare in practice. See also the first column of Table 2.

Relation to the empirical results of Briales and Gonzalez–Jimenez [9]. Extensive experiments using the SDP relaxation in (15) for registration over \(\mathrm{SO}(3)\) are performed in [9], yet not a single non-tight relaxation is found among their real or synthetic experiments. This is consistent with our experiments in Fig. 1a, where we performed an empirical analysis of SDP tightness over \(\mathrm{SO}(3)\) for quadratic objective functions with random entries. Counterexamples are indeed rare in practice for this almost minimal variety.

8.2 Hand–Eye Calibration

As previously mentioned, the objective function for hand–eye calibration contains only purely quadratic terms of the rotation variables and no linear ones. Hence, the last row and the last column of the \(10\times 10\) symmetric matrix M will be zero. We tested the same procedure as for registration (see Procedure 1 in Sect. A in appendix) in order to generate non-tight counterexamples with the structure of a hand–eye calibration objective. We succeeded in obtaining problem instances for averaging 8 rotation matrices that yielded objective functions with non-tight relaxations, see the second column of Table 2 for a summary. Again, all of these optimization problems attain their minima at rank-6 extreme points, in accordance with Corollary 1.
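The structure of the hand–eye objective can be made concrete with a small numpy sketch. For a single motion pair, the rotational residual \(R_AR - RR_B\) is linear in \(\mathrm{vec}(R)\) via the Kronecker identity \(\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)\); padding U with a zero column for the constant monomial then yields an M whose last row and column vanish, as stated above. The motion pair here is randomly generated for illustration, not taken from real calibration data.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

I3 = np.eye(3)
# One motion pair (R_A, R_B); the residual R_A R - R R_B is linear in
# vec(R): vec(R_A R) = (I (x) R_A) vec(R), vec(R R_B) = (R_B^T (x) I) vec(R),
# using column-major vec, i.e. flatten('F') in numpy.
R_A, R_B = random_rotation(rng), random_rotation(rng)
L = np.kron(I3, R_A) - np.kron(R_B.T, I3)   # 9 x 9

# Check the linearization against a direct evaluation.
R = random_rotation(rng)
assert np.allclose(L @ R.flatten('F'), (R_A @ R - R @ R_B).flatten('F'))

# Pad with a zero column for the constant monomial: U is m x 10.
U = np.hstack([L, np.zeros((9, 1))])
M = U.T @ U

# Purely quadratic objective => last row and column of M vanish.
assert np.allclose(M[9, :], 0) and np.allclose(M[:, 9], 0)
```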

8.3 Rotation Averaging

In [18], Eriksson et al. proved that the SDP relaxation for problems involving three rotations with \(\mathrm{SO}(3)\)-parametrization is always tight. This result trivially extends to \(\mathrm{SO}(2)\). Further, for instances with more than three cameras, it is shown that in the low noise regime the SDP relaxation is tight. Low noise results applicable to \(\mathrm{SO}(3)\) as well as \(\mathrm{SO}(2)\) have also been presented by Rosen et al. [36], although \(\mathrm{SO}(2)\) is parametrized by all matrix elements in their case. Fredriksson and Olsson [20] parametrize the rotation averaging problem with quaternions \({{\mathcal {Q}}}\), and in all the reported experiments, the SDP relaxation was always numerically found to be tight. For \(\mathrm{SO}(2)\), Zhong and Boumal [42] proved the existence of an upper bound on the noise level for which the SDP relaxation is tight; however, no explicit estimates were given.\(^{2}\)

Here we present results for the case of four rotations in \(\mathrm{SO}(2)\) (a three-rotation problem is always tight [18]). Figure 3 shows the average rank of the computed SDP solution \(X^*\). The \(M_0\) matrix in (11) was generated by sampling the relative rotation angles from \({\mathcal {N}}(0, \sigma ^2)\), \(\sigma \in [0, 1]\) radians. For each noise level \(\sigma \), we ran the problem 10,000 times and plotted the average obtained rank of the lifted variables \(X^*\). The observed ranks were 1 or 2. Similar to our results, Fan et al. [19] find instances of the 2D SLAM problem with non-tight relaxations\(^{2}\), and Carlone et al. [12] present analogous results for 3D pose-graph optimization (PGO).
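The construction of such an \(M_0\) can be sketched in numpy. Parametrizing each \(R_i\) by \(x_i = (\cos \theta _i, \sin \theta _i)\), every entry of the residual \(R_iR_{ij} - R_j\) is linear in \(x = (x_1, \dots , x_n)\), so stacking these rows into U gives the cost matrix \(M_0 = U^TU\). This is an illustrative reconstruction of the experimental setup, not the original code; the specific seed and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 4, 0.1

# Ground-truth absolute angles and noisy relative measurements
# theta_ij = theta_j - theta_i + N(0, sigma^2), for all pairs i != j.
theta = rng.uniform(0, 2 * np.pi, size=n)
edges = [(i, j) for i in range(n) for j in range(n) if i != j]

rows = []
for i, j in edges:
    t_ij = theta[j] - theta[i] + sigma * rng.standard_normal()
    a, b = np.cos(t_ij), np.sin(t_ij)
    # x = (c_1, s_1, ..., c_n, s_n); residual R_i R_ij - R_j is linear in x.
    for coef_i, coef_j in [((a, -b), (-1, 0)),   # entry (1,1): a c_i - b s_i - c_j
                           ((-b, -a), (0, 1)),   # entry (1,2): -b c_i - a s_i + s_j
                           ((b, a), (0, -1)),    # entry (2,1): b c_i + a s_i - s_j
                           ((a, -b), (-1, 0))]:  # entry (2,2): equals entry (1,1)
        row = np.zeros(2 * n)
        row[2 * i:2 * i + 2] = coef_i
        row[2 * j:2 * j + 2] = coef_j
        rows.append(row)

U = np.array(rows)
M0 = U.T @ U  # quadratic cost matrix: cost(x) = x^T M0 x

# Sanity check: with sigma = 0, the ground-truth x would give zero cost.
x = np.ravel([(np.cos(t), np.sin(t)) for t in theta])
cost = x @ M0 @ x
```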

Fig. 3

Average rank for instances of the rotation averaging problem over \(\mathrm{SO}(2)^4\), with varying noise levels

Relation to Mangelson et al. [32]. The planar pose-graph problem with

$$\begin{aligned} \sum _{i \ne j} \Vert R_i R_{ij} -R_j\Vert _F^2 + \tau \Vert t_j - t_i - R_i t_{ij}\Vert ^2 \end{aligned}$$
(20)

is studied in [32]. Here, additional relative translation estimates \(t_{ij}\) are present, but setting \(\tau = 0\) reduces the problem to rotation averaging. A ‘proof’ of strong duality for the Sparse-BSOS relaxation [31, 39] is presented. This would imply that rotation averaging can be solved exactly in polynomial time, even though the SDP relaxation (15) may still give a duality gap. While such a weakness is entirely plausible, we note that the presented proof in [32] is in fact flawed: the domain, \(\mathrm{SO}(2)^n\) with unit norm constraints on the diagonals of the rotation matrices, is incorrectly claimed to be SOS-convex (see [3, 31, 39] for a definition). It is not even a convex domain.

While the lack of a proof does not exclude the possibility that Sparse-BSOS is exact, our counterexamples in Fig. 3 show that this is only possible if Sparse-BSOS is stronger than the SDP relaxation (15). A detailed comparison of these two formulations would reveal if this is the case. Such an in-depth analysis is, however, beyond the scope of this paper.

8.4 Point Set Averaging

In previous work by Chaudhury et al. and Iglesias et al. [14, 28], it has been shown that SDP relaxations for registering multiple point clouds are tight in the low noise regime, while non-tight instances arise in the high noise regime. Here we reproduce similar results by registering an artificial point set over four frames. We sample 100 points from \({\mathcal {N}}(0, 1)\) and then scale one direction by a factor of 1/100, which increases the prevalence of non-tight instances. Gaussian noise, sampled from \({\mathcal {N}}(0, \sigma ^{2})\), was added to each point. Figure 4 shows the average rank of the computed SDP solution \(X^*\) over 10,000 problem instances for each noise level \(\sigma \).
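The synthetic data generation can be sketched as follows. This is an illustrative numpy reconstruction under our reading of the setup (the squeezed axis, seed, and noise level are arbitrary choices), not the code used for Fig. 4.

```python
import numpy as np

rng = np.random.default_rng(4)
n_points, n_frames, sigma = 100, 4, 0.05

# Points from N(0, 1), with one direction squeezed by a factor 1/100:
# the near-planar geometry makes non-tight instances more common.
X = rng.standard_normal((n_points, 3))
X[:, 2] /= 100.0

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

# Each frame observes the points under an unknown rigid motion,
# perturbed by Gaussian noise N(0, sigma^2) on every coordinate.
frames = []
for _ in range(n_frames):
    R, t = random_rotation(rng), rng.standard_normal(3)
    noise = sigma * rng.standard_normal((n_points, 3))
    frames.append(X @ R.T + t + noise)
```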

Fig. 4

Average rank for instances of the point set averaging problem over \(\mathrm{SO}(3)^4\), with varying noise levels

9 Conclusions

We have presented a framework for analyzing the power of SDP relaxations for optimization over rotational constraints. The key to our analysis has been to investigate the two convex cones of nonnegative and sum-of-squares polynomials and to establish a connection between them and the tightness of an SDP relaxation. We have shown that certain parametrizations lead to tight SDP relaxations and others do not. For our applications which have structured objective functions, we have generated non-tight counterexamples to settle the question of whether the relaxation is always tight or not.

An interesting avenue for future research is to develop algorithms that recover a good solution from a non-tight relaxation, which would be valuable for practitioners of SDP relaxations. This was recently done for the rotation averaging problem [17]. Another interesting direction is to explore the existence of noise bounds under which the registration and hand–eye calibration problems over \(\mathrm{SO}(3)\) are guaranteed to be tight.