1 Introduction

Orientation or attitude is a prominent facet of problems pertaining to disciplines such as computer graphics, computer vision, photogrammetry, robotics and augmented reality. A typical example of such a problem in computer animation is the interpolation of orientation during a process known as key-framing, the main objective of which is to achieve esthetically pleasing results in the representation of the motion of a rigid object not only in terms of translation, but also in terms of the changes in its orientation [66].

In an inverse fashion, 3D computer vision deals with 3D reconstruction, also often referred to as structure from motion estimation (SfM). This consists in using sets of images depicting an unknown scene and captured from unknown locations, in order to automatically extract a 3D geometric representation of the imaged scene plus the camera intrinsic parameters and their poses, i.e., positions and orientations [26, 45]. Several solutions to the SfM problem involve the estimation of the sought parameters by iteratively minimizing the total geometric error pertaining to overdetermined sets of image measurements. Considering that a 3D rotation matrix has nine elements but only three degrees of freedom (DoF), suitable (and preferably minimal) parameterizations of rotation are thus necessary in order to intrinsically incorporate orthonormality constraints on rotations during the optimization.

The modified Rodrigues parameters (MRPs) constitute a minimal rotation parameterization with attractive properties. Despite being well-established in the field of aerospace engineering, MRPs are, to the best of our knowledge, unheard-of in the computer graphics, vision and robotics communities. MRPs are essentially the stereographic coordinates of quaternions and, as such, they are mapped rationally and bijectively to the quaternion sphere. This paper studies the representation of orientation via MRPs. An important finding is that the Jacobian of a quaternion is a polynomial function of its scalar and vector parts, thereby yielding simple expressions in rotation derivatives. Furthermore, it is shown that quaternions can be updated from a given perturbation in parameter space without explicitly using the MRPs. These two findings are very important for iterative optimization, because they allow both Jacobian computation and orientation updates to be carried out using exclusively quaternion components in simple additions and multiplications. As a consequence, iterative optimization completes with fewer calculations in less time. The paper also demonstrates the applicability of MRPs in problems related to orientation interpolation and pose estimation and provides experimental evidence that their use leads to new solutions or the simplification of existing ones and, in most cases, the improvement of performance.

The rest of the paper is structured as follows. A brief overview of orientation representations with respect to various applications is given in Sect. 2, followed by descriptions of common problems involving parameterized orientation and respective solutions in Sect. 3. A derivation of the MRPs based on stereographic projection is provided in Sect. 4, whereas Sect. 5 discusses special properties of MRPs as vectors parallel to the rotation axis; such properties include the relationship with axis-angle and Gibbs vectors as well as the Cayley transform from MRPs to rotation matrices. Section 6 focuses on the differentiation properties of the MRP parameterization. In Sect. 7, the problem of smooth interpolation on the quaternion sphere is examined and a general method for spline-based fitting is presented. Experimental results comparing the performance of MRPs against different parameterizations of rotations are given in Sect. 8, and Sect. 9 summarizes the contributions of the paper.

2 Rotation Representations

The literature provides many representations of rotation using \(3 \times 3\) matrices and vectors of three or four components. Representing spatial rotations is challenging due to their non-commutativity and the fact that their topology does not permit a smooth embedding in Euclidean 3D space. Besides, different practical uses of rotations have different requirements. This section briefly presents the representations most commonly employed in the fields of graphics, robotics and vision, and discusses their strengths and weaknesses.

2.1 Rotation Matrices

Rotation matrices are \(3 \times 3\) orthonormal matrices that arguably constitute the most intuitive representation of orientation. The reason for this is that ordinary linear algebra can be employed to express common operations involving rotations. For example, a point can be rotated using standard matrix-vector multiplication, two rotations can be composed via matrix multiplication, and a rotation can be inverted via matrix transposition. A rotation matrix consists of nine elements but has only three DoFs due to the six independent constraints imposed by orthonormality.
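The three operations mentioned above can be illustrated with a minimal NumPy sketch. The helper `rot_z` and the specific angles are illustrative assumptions and do not appear in the paper.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` radians about the z-axis (illustrative helper)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

R1 = rot_z(np.pi / 2)
R2 = rot_z(np.pi / 6)
p = np.array([1.0, 0.0, 0.0])

rotated = R1 @ p      # rotate a point: matrix-vector product
composed = R2 @ R1    # compose: R1 is applied first, then R2
inverse = R1.T        # invert: transposition

# Orthonormality (R^T R = I) accounts for the six quadratic constraints.
orthonormality_residual = np.abs(R1.T @ R1 - np.eye(3)).max()
```

Note that all nine entries are stored although only three DoFs are present, which is precisely the redundancy that motivates minimal parameterizations.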

Being quadratic, these constraints are cumbersome to impose, typically in the context of a Lagrangian function. This, however, does not render the representation entirely unattractive. For instance, Carlone et al. [7], Olsson and Eriksson [43] as well as Briales and Jimenez [6] make explicit use of matrix orthonormality constraints to formulate the Lagrangian of the camera pose registration problem. The advantage of this approach is that it provides measures for the optimality of solutions of relaxations by monitoring the duality gap in the original problem.

Overall, rotation matrices are typically used to represent rotations when transformations of objects such as points and lines are involved, but are less often used in other operations such as interpolation and estimation. For future reference, the set of all rotation matrices is the 3D rotation group, denoted \(\mathcal{SO}(3)\).

2.2 Euler Angles

Euler angles define a rotation in terms of three consecutive elemental rotations around the orthogonal axes of a Cartesian coordinate system. There exist twelve possible sets of Euler angles, depending on the chosen, non-commutative order of rotation axes. Even for a particular axes sequence, Euler angles are not unique since supplementary and/or negative angles can yield the same overall rotation [58]. Although an easily conceptualized and minimal DoF encoding, there exist several arguments as to why Euler angles are a parameterization scheme unsuitable for most applications [16, 28, 51].

Notwithstanding their ambiguity, the primary drawback of Euler angles is that they suffer from singularities near which infinitesimal changes in orientation can cause large jumps in the values of their elemental constituent rotations [51, 58]. When represented with Euler angles, every orientation is at most 90 degrees away from a singularity. Such a singularity, known as gimbal lock from its physical manifestation in gyroscopes, occurs when two of the three rotation axes coincide and results in the loss of one degree of freedom, i.e., one rotation having no effect [44]. Since gimbal lock is a discontinuity in the Euler angle representation, it might have undesirable side-effects such as ill-conditioning or instabilities in applications involving rotation operations like iterative optimization, filtering, averaging or interpolation. Thus, the use of Euler angles in describing large and especially arbitrary rotations is limited. Furthermore, the kinematic differential equations of Euler angles are fairly nonlinear, involving computationally expensive trigonometric functions [51]. On the other hand, being more understandable to humans, Euler angles are commonly used in user interfaces for 3D rotations in graphics and CAD software. Another favorable application concerns the use of Euler angles with a linear Kalman filter for position and orientation tracking, as they maintain a linear process model.

2.3 Axis-Angle

Every rotation can be represented as a revolution by an angle \(\theta \) around an axis parallel to a unit 3-vector u. The vector \(\omega =\theta u\) is the axis-angle representation of a rotation. This representation is not unique, since an equivalent representation for the same rotation is \(-(2 \pi - \theta ) u\). The matrix representation in \(\mathcal{SO}(3)\) of an axis-angle rotation \(\omega \) is given by the infinite series

$$\begin{aligned} \exp \left( \left[ \omega \right] _{\times }\right) =\mathrm {I} + \sum _{n=1}^{\infty } \frac{1}{n!}\left[ \omega \right] _{\times }^n , \end{aligned}$$
(1)

where \(\left[ \omega \right] _\times \) is the cross-product skew symmetric matrix associated with \(\omega {\mathop {=}\limits ^{\mathrm{def}}} \begin{bmatrix}\omega _1&\omega _2&\omega _3 \end{bmatrix}^T\):

$$\begin{aligned} \left[ \omega \right] _\times = \begin{bmatrix} 0&-\omega _3&\omega _2\\ \omega _3&0&-\omega _1\\ -\omega _2&\omega _1&0 \end{bmatrix} \end{aligned}$$

Equation 1 is referred to as the exponential map and can be evaluated with Rodrigues’ formula for \(\theta =\Vert \omega \Vert \) [48]:

$$\begin{aligned} R(\omega ) = \exp \left( \left[ \omega \right] _{\times }\right) = \mathrm {I} + \frac{\sin (\theta )}{\theta } \left[ \omega \right] _{\times } + \frac{1-\cos (\theta )}{\theta ^2} \left[ \omega \right] _{\times }^2 \end{aligned}$$
(2)

The axis-angle representation is a many-to-one mapping and has singularities at \(\theta =2 n \pi \), \(n \ge 1\). Furthermore, although the formula converges to the identity matrix for very small values of \(\theta \), in practice it presents numerical issues which call for approximating \(\sin (\theta )\) and \(\cos (\theta )\) with their Taylor series expansions near the origin and using them to simplify the two fractions in Eq. 2. On the other hand, it is surjective, i.e., every rotation has a representation as the exponential of a skew symmetric matrix. Since it is minimal and does not require any additional constraints, the axis-angle representation is very often employed in vision and robotics problems. The terms axis-angle and exponential map are used interchangeably.
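A minimal sketch of Eq. 2 follows, with the two fractions replaced by truncated Taylor series near the origin. The function names and the switching threshold `eps` are illustrative assumptions, not values prescribed by the paper.

```python
import numpy as np

def skew(w):
    """Cross-product skew-symmetric matrix [w]_x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w, eps=1e-4):
    """Rodrigues' formula (Eq. 2) with a small-angle fallback."""
    w = np.asarray(w, dtype=float)
    theta2 = w @ w
    theta = np.sqrt(theta2)
    if theta < eps:
        # Taylor series: sin(t)/t ~ 1 - t^2/6, (1 - cos(t))/t^2 ~ 1/2 - t^2/24
        a = 1.0 - theta2 / 6.0
        b = 0.5 - theta2 / 24.0
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta2
    W = skew(w)
    return np.eye(3) + a * W + b * (W @ W)
```

For instance, `exp_so3([0, 0, 0.3])` and `exp_so3([0, 0, 0.3 - 2 * np.pi])` return the same matrix up to rounding, which illustrates the many-to-one nature of the mapping.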

Although the exponential map is periodic, a rotation matrix logarithm (Footnote 1) from rotation matrices to axis-angle vectors can be defined for \(\theta \in \left( -\pi , \pi \right) \):

$$\begin{aligned}&\log {R}=\frac{\theta }{2\sin {\theta }}\left( R-R^T\right) \end{aligned}$$
(3)
$$\begin{aligned}&\theta =\arccos \left( \frac{\mathrm {Tr}\left( R\right) -1}{2} \right) , \end{aligned}$$
(4)

where \(\mathrm {Tr()}\) denotes a square matrix’s trace, i.e., the sum of its diagonal elements. Evidently, the rotation matrix logarithm converges to the zero vector at the identity, but in practice it is necessary to resort to approximations in order to avoid the effects of very small numbers in the denominator.

In the specific case where Eq. 4 yields \(\theta = \pi \), the axis-angle vector cannot be recovered with Eq. 3, because \(R-R^T\) vanishes. Instead, since Eq. 2 reduces to \(R=2uu^T-\mathrm {I}\) for \(\theta =\pi \), we initially obtain the absolute values of the components of the unit axis \(u=\omega /\pi \) as follows:

$$\begin{aligned} \vert u_i\vert =\sqrt{\frac{r_{ii}+1}{2}}, \end{aligned}$$
(5)

where \(i\in \lbrace 1, 2, 3\rbrace \) and \(r_{ij}\) is the element of R in the ith row and jth column. Since u is sign-ambiguous (both u and \(-u\) yield the same rotation by \(\pi \)), we may choose the component which has the largest absolute value to be positive. The remaining two components can be recovered from the off-diagonal elements of R, for which \(r_{kj}=2u_ku_j\), as follows:

$$\begin{aligned} u_j=\frac{r_{kj}}{2u_k}, \end{aligned}$$
(6)

where \(k={{\mathrm{arg\,max}}}_{i} \lbrace \vert u_i\vert \rbrace \) and \(j\in \lbrace 1,2,3\rbrace - \lbrace k\rbrace \). The axis-angle vector is then \(\omega =\pi u\).
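The logarithm and its two special cases can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the tolerance `eps` are hypothetical, and the branch for \(\theta =\pi \) works with the unit axis \(u=\omega /\theta \), using the fact that Eq. 2 reduces to \(R=2uu^T-\mathrm {I}\) at that angle, so that \(r_{ii}=2u_i^2-1\) and \(r_{kj}=2u_ku_j\).

```python
import numpy as np

def log_so3(R, eps=1e-6):
    """Axis-angle vector of a rotation matrix (Eqs. 3 and 4 plus special cases)."""
    R = np.asarray(R, dtype=float)
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # Vector associated with the skew matrix R - R^T; it equals 2 sin(theta) u.
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < eps:
        # Near the identity, theta / (2 sin(theta)) tends to 1/2.
        return 0.5 * v
    if np.pi - theta < eps:
        # Near theta = pi, R is (approximately) 2 u u^T - I.
        u_abs = np.sqrt(np.maximum((np.diag(R) + 1.0) / 2.0, 0.0))
        k = int(np.argmax(u_abs))
        u = R[k, :] / (2.0 * u_abs[k])  # off-diagonal entries: r_kj = 2 u_k u_j
        u[k] = u_abs[k]                 # largest component chosen positive
        return theta * u
    return theta / (2.0 * np.sin(theta)) * v
```

Clipping the argument of `arccos` guards against traces that fall marginally outside the admissible interval because of rounding.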

2.4 Unit Quaternions

One of the most popular ways to unambiguously represent orientation in 3D is with the Euler–Rodrigues parameters, as a location on the unit sphere in 4D. The modern formalism for Euler–Rodrigues parameters is that of unit quaternions, which form a multiplicative group that fully describes 3D rotations. Unit quaternions constitute a redundant parameterization which does not suffer from gimbal lock. However, their numerical estimation in practice is complicated by the need to incorporate a unit-norm constraint; more details on this are provided in Sect. 3.2.

For consistency of notation throughout the rest of the paper, the field of quaternions will henceforth be denoted with \(\mathbb {H}\), and an arbitrary quaternion \(q\in \mathbb {H}\) will be written in the form

$$\begin{aligned} q=\rho +{\upsilon }^T\varphi , \end{aligned}$$
(7)

where \(\rho \in {\mathbb {R}}\) is the scalar part, \(\upsilon \in {\mathbb {R}}^3\) is the vector part and \(\varphi ={\begin{bmatrix} i&j&k \end{bmatrix}}^T\) is the vector of the 3 imaginary units (also referred to as fundamental quaternion units). The reader is referred to [16, 29, 44, 61, 64] for more detailed introductions on quaternions and their properties.

A quaternion \(q\in \mathbb {H}\) such that \(\Vert q\Vert =1\), is called a unit quaternion. Unit quaternions lie on the unit sphere in 4D (also known as the quaternion sphere in kinematics or 3-sphere \(\mathbb {S}^3\) in topology) and form a group under multiplication which precisely describes the group of rotations. In particular, provided a quaternion \(q=\rho +\upsilon ^T\varphi \) such that \(\rho ^2+\upsilon ^T\upsilon =1\), it can be shown that it corresponds to the following rotation matrix [41]:

$$\begin{aligned} R=\left( \rho ^2-\upsilon ^T\upsilon \right) I_3+2\upsilon \upsilon ^T+2\rho {\left[ \upsilon \right] }_\times \end{aligned}$$
(8)

It is worth noting that the formula in Eq. 8 implies that the elements of the rotation matrix are polynomial expressions of the unit quaternion components. It is also evident from Eq. 8 that the same rotation matrix corresponds to quaternions q and \(-q\). Thus, when treated as elements of the special \(3\times 3\) orthogonal group, antipodal unit quaternions represent the same rotation.
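Eq. 8 translates almost literally into code. The sketch below is only an illustration; the function names are assumptions, and the quaternion is passed as a scalar part `rho` and a 3-vector `v`.

```python
import numpy as np

def skew(v):
    """Cross-product skew-symmetric matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def quat_to_rot(rho, v):
    """Rotation matrix of a unit quaternion rho + v^T [i j k]^T (Eq. 8)."""
    v = np.asarray(v, dtype=float)
    return ((rho * rho - v @ v) * np.eye(3)
            + 2.0 * np.outer(v, v)
            + 2.0 * rho * skew(v))
```

Every term is a product of two quaternion components, so negating both `rho` and `v` leaves the result unchanged: `quat_to_rot(rho, v)` and `quat_to_rot(-rho, -v)` coincide.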

In direct analogy to the exponential map from the space of skew-symmetric matrices to the group of rotation matrices, there exists an exponential map from the space of axis-angle vectors to unit quaternions:

$$\begin{aligned} \exp \left( \frac{1}{2} \omega ^T\varphi \right) =\cos {\frac{\theta }{2}}+\sin {\frac{\theta }{2}}\left( \frac{\omega }{\theta }\right) ^T\varphi , \end{aligned}$$
(9)

where \(\omega \in {\mathbb {R}}^3\) is the axis-angle vector of the rotation associated with q and \(\theta =\Vert \omega \Vert \). Note that the right-hand side of Eq. 9 can be obtained from the Taylor expansion of \(\exp \left( \frac{1}{2}\omega ^T\varphi \right) \), given the observation that \(\left( \omega ^T\varphi \right) ^2=-\theta ^2\).

As in the case of rotation matrices, unit quaternions can be mapped to the corresponding axis-angle vectors via a logarithmic function (Footnote 2):

$$\begin{aligned}&\log {q}=\frac{\theta }{2\sin {\frac{\theta }{2}}}\left( q-{\overline{q}}\right) \end{aligned}$$
(10)
$$\begin{aligned}&\theta =2\arccos \left( \frac{q+{\overline{q}}}{2}\right) , \end{aligned}$$
(11)

where \({\overline{q}}\) denotes the conjugate of q, i.e., \({\overline{q}}=\rho -{\upsilon }^T\varphi \).
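The pair of maps in Eqs. 9 to 11 can be sketched as below. Since \(q-{\overline{q}}\) is twice the vector part, Eq. 10 amounts to scaling \(\upsilon \) by \(\theta /\sin (\theta /2)\). Function names and the tolerance `eps` are illustrative assumptions.

```python
import numpy as np

def quat_exp(w, eps=1e-8):
    """Unit quaternion (rho, v) of the axis-angle vector w (Eq. 9)."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < eps:
        s = 0.5 - theta * theta / 48.0   # Taylor series of sin(theta/2)/theta
    else:
        s = np.sin(theta / 2.0) / theta
    return np.cos(theta / 2.0), s * w

def quat_log(rho, v, eps=1e-8):
    """Axis-angle vector of the unit quaternion (rho, v) (Eqs. 10 and 11)."""
    v = np.asarray(v, dtype=float)
    theta = 2.0 * np.arccos(np.clip(rho, -1.0, 1.0))
    s = np.sin(theta / 2.0)
    if s < eps:
        # theta / sin(theta/2) tends to 2 near the identity; for rho close
        # to -1 the axis is undefined and this branch returns (almost) zero.
        return 2.0 * v
    return theta / s * v
```

A round trip such as `quat_log(*quat_exp(w))` returns `w` whenever the rotation angle is below \(2\pi \).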

3 Common Problems and Standard Solutions

This section briefly describes prevalent problems involving parameterized orientation along with the most common solutions employed and their typical shortcomings. Although applications may vary, these problems essentially fall under two major categories, namely interpolation of orientation and estimation of rotation parameters.

3.1 Interpolation of Orientation

The problem of interpolating rotational motion from a sequence of key orientations often arises in computer animation, computer-aided design and robot kinematics applications [11, 27, 30, 49, 53, 59]. Since quaternions conveniently possess the properties of a metric space (i.e., \({\mathbb {R}}^4\)), it is very common to perform this task on the unit sphere in 4D where properties such as smoothness, length and curvature can be measured and manipulated with standard calculus.

Suppose that a sequence of key orientations is given in the form of unit quaternions \(q_0, \, q_1, \, \ldots , \, q_n, \, \ldots \) and the goal is to interpolate the sequence with a smooth spherical curve. Possibly the most popular tool for elementary interpolation on a great arc between two successive quaternions is Shoemake’s classic formula for spherical linear interpolation (abbreviated as \(\textsc {slerp}\)) [56]:

$$\begin{aligned} \textsc {slerp}(q_n, \, q_{n+1}; u)=\frac{\sin \left( \left( 1-u\right) \varPhi \right) }{\sin {\varPhi }}q_n + \frac{\sin \left( u\varPhi \right) }{\sin {\varPhi }}q_{n+1} , \end{aligned}$$
(12)

where \(u\in \left[ 0, 1\right] \) is the interpolation parameter, \(\varPhi =\arccos \left( q_n \cdot q_{n+1}\right) \) is the angle between \(q_n\), \(q_{n+1}\) and \(\cdot \) denotes the dot product between quaternions as vectors in \({\mathbb {R}}^4\).
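A minimal sketch of Eq. 12 follows, with quaternions stored as 4-vectors `[rho, v1, v2, v3]`. The handling of the two degenerate configurations (nearly coincident and antipodal inputs) is an assumption of this sketch and is not part of Shoemake's formula.

```python
import numpy as np

def slerp(q0, q1, u, eps=1e-6):
    """Spherical linear interpolation between unit quaternions (Eq. 12)."""
    q0 = np.asarray(q0, dtype=float)
    q1 = np.asarray(q1, dtype=float)
    phi = np.arccos(np.clip(q0 @ q1, -1.0, 1.0))
    if np.pi - phi < eps:
        # Antipodal points: the connecting great arc is not unique.
        raise ValueError("slerp is undefined for antipodal quaternions")
    if phi < eps:
        # Nearly coincident points: normalized linear interpolation.
        q = (1.0 - u) * q0 + u * q1
        return q / np.linalg.norm(q)
    return (np.sin((1.0 - u) * phi) * q0 + np.sin(u * phi) * q1) / np.sin(phi)
```

Because \(q\) and \(-q\) encode the same rotation, implementations commonly negate one of the inputs when the dot product is negative so that the shorter arc is followed; that step is omitted here to stay close to Eq. 12.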

Interpolation of more than two key orientations is a far more challenging task, primarily because the constituent segments of the curve have to be pieced smoothly at the data points. A popular solution is Shoemake’s spherical quadrangle interpolation (\(\textsc {squad}\)) [57]. In a nutshell, \(\textsc {squad}\) is the spherical analog of parabolic blending between quaternions \(q_n\) and \(q_{n+1}\):

$$\begin{aligned} \textsc {squad}\left( q_n,q_{n+1};u\right) = \textsc {slerp}\Big (\textsc {slerp}\left( q_n, \, q_{n+1}; u\right) , \, \textsc {slerp}\left( \alpha _n, \, \alpha _{n+1};u\right) ; \, 2u\left( 1-u\right) \Big ), \end{aligned}$$
(13)

where \(u\in \left[ 0,1\right] \) is the interpolation parameter and \(\alpha _n\), \(\alpha _{n+1}\) are auxiliary points chosen specifically to impose smoothness at the key points that can be computed with the following formula:

$$\begin{aligned} \alpha _i=q_i\exp \left( -\frac{\log \left( {\overline{q}}_iq_{i-1}\right) + \log \left( {\overline{q}}_iq_{i+1}\right) }{4} \right) \end{aligned}$$
(14)
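The construction can be sketched as follows. Several choices here are assumptions of the sketch rather than statements of the paper: it follows the commonly stated form of squad in which both inner slerps use the parameter \(u\) and the outer slerp uses \(2u(1-u)\); `qlog` and `qexp` are taken to be the plain quaternion logarithm and exponential, i.e., exact inverses of each other on pure quaternions; and quaternions are stored as 4-vectors `[rho, v1, v2, v3]`.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions."""
    r1, v1, r2, v2 = a[0], a[1:], b[0], b[1:]
    return np.concatenate(([r1 * r2 - v1 @ v2],
                           r1 * v2 + r2 * v1 + np.cross(v1, v2)))

def qconj(q):
    return np.concatenate(([q[0]], -q[1:]))

def qlog(q):
    """Logarithm of a unit quaternion, returned as the 3-vector of a pure quaternion."""
    n = np.linalg.norm(q[1:])
    if n < 1e-12:
        return np.zeros(3)
    return np.arctan2(n, q[0]) * q[1:] / n

def qexp(v):
    """Exponential of the pure quaternion with vector part v (inverse of qlog)."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * v / n))

def slerp(q0, q1, u):
    """Eq. 12; antipodal inputs are not handled in this sketch."""
    phi = np.arccos(np.clip(q0 @ q1, -1.0, 1.0))
    if phi < 1e-6:
        q = (1.0 - u) * q0 + u * q1
        return q / np.linalg.norm(q)
    return (np.sin((1.0 - u) * phi) * q0 + np.sin(u * phi) * q1) / np.sin(phi)

def control_point(q_prev, q, q_next):
    """Auxiliary point alpha_i of Eq. 14."""
    qc = qconj(q)
    return qmul(q, qexp(-(qlog(qmul(qc, q_prev)) + qlog(qmul(qc, q_next))) / 4.0))

def squad(q0, q1, a0, a1, u):
    """Spherical quadrangle interpolation between q0 and q1."""
    return slerp(slerp(q0, q1, u), slerp(a0, a1, u), 2.0 * u * (1.0 - u))
```

As a sanity check, for keys spaced uniformly along a single great arc the two logarithms in Eq. 14 cancel, each auxiliary point coincides with its key, and `squad` reduces to `slerp`.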

Shortly after Shoemake’s contribution, Duff followed with [20], introducing a B-spline spherical curve interpolating the data similarly to planar B-splines. More recently, a construction scheme in \({\mathbb {R}}^3\) for smooth quaternion curves was proposed in [32]. The main shortcoming associated with the aforementioned methods is the lack of a general strategy to enforce arc length and curvature minimization constraints, owing to the complexity of the corresponding expressions for the spherical polynomial derivatives. In their thorough report on Shoemake’s work, Dam et al. [13] show that finding a generic, curvature minimizing smooth exponential curve on the sphere is highly impractical, primarily due to the complicated derivatives of the curve. Another ramification of the generally intractable differentiation is that in most cases, speed adjustment is performed purely numerically (cf. the chord-length approximation method [67]).

Following Shoemake’s work, several solutions for orientation interpolation have been introduced, taking matters from a different perspective. For instance, Johnstone and Williams [31] introduced a rational function mapping 4D Bézier curves onto the quaternion sphere. Although they were unaware of it at the time, this mapping is the generalized form of stereographic projection [17], which, without any precautions, will cause distortions on the sphere. Other geometric methods were proposed in [3, 46, 47]. With the exception of the method by Roberts et al. [47], these approaches focus on minimizing functionals defined on characteristics of the curve such as tangential velocity or centrifugal acceleration, primarily by making approximations to the actual expressions. The common drawback of these approaches is that they are relatively complex to implement and not very flexible to configure under different circumstances.

An interesting alternative to the mainstream is the work of Boumal [5], which optimizes a cost function directly over rotation matrices. Since it is difficult to fit a parametric function on matrices with orthonormal constraints, Boumal defines a cost function over a sequence of rotations with penalty terms on chordal distance (Footnote 3) from the key-rotations as well as on first- and second-order finite differences in \(\mathcal{SO}(n)\) to impose smoothness. To iteratively optimize the cost function on the rotation manifold, analytical expressions for the Riemannian derivatives of the penalty terms are obtained as orthogonal projections of Euclidean matrix derivatives onto the tangent space of the current rotation estimates [1]. The method can be adapted either for interpolation or regression and applies to problems involving orthogonal matrices of arbitrary dimensionality.

3.2 Estimation of Orientation and Rotation Matrix Differentiation

At the very core of several key problems in computer graphics, vision and robotics lies the problem of estimating orientation. The typical formulation of orientation estimation problems involves a cost function which is a sum of positive (by means of a suitable metric) error terms, in which the unknown rotation matrices act on vectors measured in different coordinate frames. The usual method of minimizing such a cost function is by setting its derivatives equal to zero and solving the resulting equations. The Jacobian of the rotation matrix is therefore crucial to the estimation.

It becomes evident from the exponential map expression in Eq. 1 that differentiation of the rotation matrix R with respect to the axis-angle vector \(\omega \) is not trivial and the associated Jacobian contains complicated transcendental expressions. Most importantly, the derivative of the exponential map presents a “malignant” singularity at the origin, owing to the presence of an angle in the denominator. A complete list of analytic expressions for these derivatives can be found in a report by Diebel [16].

Recently, Gallego and Yezzi [23] have discovered a reasonably compact expression for the Jacobian of the rotation matrix:

$$\begin{aligned} \dfrac{\partial {R}}{\partial \omega _i} = \frac{\omega _i{\left[ \omega \right] }_\times + {\left[ \omega \times \left( I_3-R\right) e_i\right] }_\times }{\theta ^2}R, \end{aligned}$$
(15)

where \(i\in \ \left\{ 1, 2, 3\right\} \) indexes the components of \(\omega \) and \(e_i\) is the ith canonical basis vector of \({\mathbb {R}}^3\). It should be stressed that despite the denominator \(\theta ^2\) in Eq. 15, the derivatives of the rotation matrix are continuous at the origin, \(\omega = \begin{bmatrix} 0&0&0\end{bmatrix}^T\) and are equal to the cross-product skew symmetric matrices associated with the canonical vectors \(e_i\) (also known as the infinitesimal generators of the Lie algebra \( so (3)\) and denoted \(G_i\)) [54]:

$$\begin{aligned} \dfrac{\partial {R}}{\partial \omega _i}\bigg \vert _{\omega =0}=\left[ e_i\right] _\times {\mathop {=}\limits ^{\mathrm{def}}} G_i \end{aligned}$$
(16)

Gallego and Yezzi’s formula in Eq. 15 is a significant improvement, yet it is still not particularly simple, not least because it entails the evaluation of a few trigonometric expressions. Furthermore, the singularity at the origin must still be accounted for with the aid of Taylor approximations.
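Eq. 15 can be checked numerically against central finite differences of Rodrigues' formula. The sketch below is a hedged illustration: the function names are assumptions, the derivative at (numerically) zero rotation is simply replaced by the generator of Eq. 16 instead of a Taylor expansion, and the small-angle threshold in `exp_so3` is an arbitrary choice.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula (Eq. 2) with a small-angle Taylor fallback."""
    w = np.asarray(w, dtype=float)
    t2 = w @ w
    t = np.sqrt(t2)
    if t < 1e-4:
        a, b = 1.0 - t2 / 6.0, 0.5 - t2 / 24.0
    else:
        a, b = np.sin(t) / t, (1.0 - np.cos(t)) / t2
    W = skew(w)
    return np.eye(3) + a * W + b * (W @ W)

def drot_dw(w, i):
    """Derivative of R with respect to the i-th axis-angle component (Eq. 15)."""
    w = np.asarray(w, dtype=float)
    e_i = np.eye(3)[i]
    t2 = w @ w
    if t2 < 1e-16:
        return skew(e_i)   # generator G_i of Eq. 16 (zeroth-order approximation)
    R = exp_so3(w)
    numerator = w[i] * skew(w) + skew(np.cross(w, (np.eye(3) - R) @ e_i))
    return (numerator / t2) @ R
```

The closed form still requires the sine and cosine hidden inside `exp_so3`, plus a separate branch at the origin, which is the complication discussed above.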

The alternative to computing the actual derivatives of the rotation matrix with respect to the axis-angle vector is either the use of finite differences or incremental rotations with analytical derivatives at the identity. Suppose, for example, that we are attempting to optimize the parameters of a rotation matrix in the context of an iterative method. The idea is to replace the rotation matrix R at step k with another rotation \(R^\prime \) given by the product of the current estimate and a perturbing rotation matrix \(\exp \left( \left[ u\right] _\times \right) \) which is initially equal to the identity. Thus, instead of taking the actual derivative of R with respect to its current axis-angle parameters as given in Eq. 15, the much simpler derivative of \(R^\prime \) in terms of u is taken at the origin:

$$\begin{aligned} \dfrac{\partial {R}^\prime }{\partial u_i} = R\dfrac{\partial \exp \left( \left[ u\right] _\times \right) }{\partial u_i}\Bigg \vert _{u=0}=R G_i \end{aligned}$$
(17)

The workaround of Eq. 17 found early advocates such as Taylor [60], or Drummond and Cipolla [18] and is popular in practice [19, 33]. On the other hand, by all accounts, it is not the actual derivative of the rotation in terms of the axis-angle parameters, but rather the derivative of an expression that has a corrective effect on the current estimate. As such, it is a mapping that has the characteristics of a retraction from the tangent space of R to \({\mathcal {SO}}(3)\) as defined by Absil et al. [1] and can therefore be used to provide a valid direction of descent on the rotation manifold. The latter suggests that the application scope of this approach is limited only to problems involving iterative optimization.
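To illustrate how Eq. 17 is typically used, the following sketch runs a few Gauss-Newton iterations that align a rotation to noise-free point correspondences. The problem setup, the function names and the iteration count are assumptions made for illustration only. With the perturbation applied on the right, the residual \(R\exp \left( \left[ u\right] _\times \right) p-y\) has Jacobian \(-R\left[ p\right] _\times \) at \(u=0\), since \(RG_ip=R\left( e_i\times p\right) \).

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula (Eq. 2) with a small-angle Taylor fallback."""
    w = np.asarray(w, dtype=float)
    t2 = w @ w
    t = np.sqrt(t2)
    if t < 1e-4:
        a, b = 1.0 - t2 / 6.0, 0.5 - t2 / 24.0
    else:
        a, b = np.sin(t) / t, (1.0 - np.cos(t)) / t2
    W = skew(w)
    return np.eye(3) + a * W + b * (W @ W)

def gauss_newton_rotation(P, Y, iters=10):
    """Estimate R such that R p_j ~ y_j, for points stored as rows of P and Y."""
    R = np.eye(3)
    for _ in range(iters):
        r = (P @ R.T - Y).ravel()                  # stacked residuals R p_j - y_j
        J = np.vstack([-R @ skew(p) for p in P])   # Jacobian at u = 0 (Eq. 17)
        u, *_ = np.linalg.lstsq(J, -r, rcond=None)
        R = R @ exp_so3(u)                         # update stays on SO(3)
    return R

P = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
R_true = exp_so3([0.3, -0.4, 0.2])
R_est = gauss_newton_rotation(P, P @ R_true.T)
```

The parameters \(u\) are reset to zero at every iteration; only the matrix estimate is carried over, which is why the derivative is only ever needed at the origin.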

A special class of problems which involve the recovery of orientation and have attracted considerable attention recently is that of rotation averaging [10, 12, 24, 25]. The objective in this case is to recover the absolute or relative orientation most consistent with many estimates. Typically, these problems are solved iteratively and require the determination of a direction of descent either in parameter space or directly on the rotation manifold. Several solutions to rotation averaging problems presented by Hartley [25] incorporate the incremental rotation approach of Eq. 17 to establish descent directions.

Another approach for obtaining the derivatives of a rotation matrix is to parametrize it with a unit quaternion. Although practical by virtue of the rotation matrix expression in terms of a quaternion in Eq. 8, this approach unfortunately requires imposing a hard unit-norm constraint on the quaternion components. To impose this constraint in the context of Euclidean bundle adjustment, Lourakis and Argyros [37] optimize only the vector part \(\upsilon \) of a quaternion and implicitly obtain the scalar part as \(\rho = \sqrt{1 - \upsilon ^T\upsilon }\) (cf. Eq. 7). Clearly, this does not allow for negative scalar parts and, therefore, the half-angle \(\theta /2\) of the rotation must be limited to the range \(\left[ -\pi /2, \pi /2 \right] \). To overcome this, the initial orientation of every camera before the optimization is retained and only the difference from the initial orientation is optimized. This local update is certain to lie within the aforementioned range, and the approach is also referred to as an incremental update in Sect. 6.2.2 of [59].

4 Derivation of Modified Rodrigues Parameters with Stereographic Projection

Stereographic projection is a rational, bijective mapping from a sphere to a plane often encountered in complex analysis, topology, quantum computing, etc. However, it has been largely overlooked in the computer graphics and vision communities as a practical means of parameterizing orientation. In contrast, aerospace engineering literature has several notable references to the potential of this formalism, also known as the modified Rodrigues parameters (MRPs), e.g., [40, 50, 58, 63, 68].

4.1 Derivation of Projection/Back-Projection Maps

To establish notation for the rest of the paper, a brief derivation of the stereographic projection formulas is in order at this point. Consider a unit quaternion \(q=\rho +\upsilon ^T\varphi \) such that \(\rho \in {\mathbb {R}}\) and \(\upsilon \in {\mathbb {R}}^3\) with \(\rho ^2+\upsilon ^T\upsilon =1\). We designate the “South Pole” of the sphere to be the unit quaternion \(\mathrm {S}=-1\). Let now \(r\left( t\right) \) be the ray parameterized by \(t\in {\mathbb {R}}\) that emanates from \(\mathrm {S}\) and passes through q and a purely imaginary quaternion \(\psi ^T\varphi \):

$$\begin{aligned} r\left( t\right) =\mathrm {S}+t\left( \psi ^T\varphi -\mathrm {S}\right) , \end{aligned}$$
(18)

where \(\psi \in {\mathbb {R}}^3\). Thus, the subspace of purely imaginary quaternions can be regarded as an equatorial hyperplane that “slices” the 4D unit sphere through the origin along the 3 canonical imaginary directions. With the “South Pole” as the center of projection, the unit quaternion q is projected onto \(\psi ^T\varphi \) in this hyperplane. A visualization of this projection is provided in Fig. 1.

Fig. 1 A visualization of stereographic projection in 3D. The unit quaternion \(\mathrm {S}=-1\) is the center of projection and \(\psi ^T\varphi \) is a quaternion in the equatorial plane. The ray \(r\left( t \right) =\mathrm {S}+t\left( \psi ^T\varphi -\mathrm {S}\right) \) intersects the unit sphere at q

When the ray intersects the sphere, the resulting quaternion must have unit norm, i.e., \(\Vert r\left( t\right) \Vert ^2=1\). Substituting the expression of \(r\left( t\right) \) from Eq. 18 into the unit-norm constraint yields a solution for the parameter t when the ray intersects q:

$$\begin{aligned} t=\frac{2}{1+{\Vert \psi \Vert }^2} \end{aligned}$$
(19)

The unit quaternion can now be expressed in terms of \(\psi \) by substituting Eq. 19 into Eq. 18:

$$\begin{aligned} q=\frac{1-{\Vert \psi \Vert }^2}{1+{\Vert \psi \Vert }^2} + \frac{2}{1+{\Vert \psi \Vert }^2}\psi ^T\varphi \end{aligned}$$
(20)

Conversely, it is fairly easy to project a unit quaternion onto the equatorial hyperplane. It suffices to solve first for \({\Vert \psi \Vert }^2\) in terms of the quaternion scalar part, \(\rho \):

$$\begin{aligned} {\Vert \psi \Vert }^2=\frac{1-\rho }{1+\rho } \end{aligned}$$
(21)

Thus, \(\psi \) can be expressed in terms of the components of q using Eqs. 20 and 21:

$$\begin{aligned} \psi = \frac{\upsilon }{1+\rho } \end{aligned}$$
(22)

It should be stressed here that the components of \(\psi \) can assume infinite values. Thus, more rigorously, \(\psi \in \overline{{\mathbb {R}}}^3\), where \(\overline{{\mathbb {R}}}={\mathbb {R}}\cup \lbrace -\infty , +\infty \rbrace \) is the affinely extended set of real numbers and this notation will be used throughout the rest of this paper.
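The projection of Eq. 22 and the back-projection of Eq. 20 can be sketched in a few lines. The function names are illustrative assumptions, and the point at infinity (\(\rho =-1\)) is deliberately left unhandled here.

```python
import numpy as np

def quat_to_mrp(rho, v):
    """Stereographic projection of a unit quaternion from S = -1 (Eq. 22)."""
    # Not defined for rho = -1, which maps to infinity.
    return np.asarray(v, dtype=float) / (1.0 + rho)

def mrp_to_quat(psi):
    """Back-projection of the MRP vector psi onto the quaternion sphere (Eq. 20)."""
    psi = np.asarray(psi, dtype=float)
    n2 = psi @ psi
    return (1.0 - n2) / (1.0 + n2), 2.0 * psi / (1.0 + n2)
```

Both directions involve only additions, multiplications and a single division; no trigonometric function is evaluated, and the back-projected quaternion has unit norm by construction.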

4.2 Related Parameterizations: The Gibbs Vector

For completeness, we remark that a representation closely related to MRPs is the Gibbs or classical Rodrigues parameter vector g [25, 51, 58]. The Gibbs vector is defined by the projection of the quaternion parameters from the center of the unit sphere onto the hyperplane tangent to the sphere at \(q=1\), i.e., at the pole opposite to the “South Pole” \(\mathrm {S}=-1\), given algebraically as:

$$\begin{aligned} g = \frac{\upsilon }{\rho } \end{aligned}$$
(23)

Classical Rodrigues parameters provide a minimal DoF representation that is singular and discontinuous at the angle of rotation \(\pi \). Using the axis-angle parameterization \(\rho =\cos \left( \frac{\theta }{2}\right) \) and \(\upsilon =\frac{\sin {\frac{\theta }{2}}}{\theta }\omega \), it is straightforward to derive a relationship between g and \(\omega \) [58]:

$$\begin{aligned} g=\frac{\tan {\frac{\theta }{2}}}{\theta }\omega \end{aligned}$$
(24)

As demonstrated in Sect. 5, the relationship between the Gibbs vector and the axis-angle vector given in Eq. 24 becomes particularly useful in deriving a Cayley transform from the space of MRP vectors to the respective rotation matrices.

5 Modified Rodrigues Parameters as Vectors Parallel to the Rotation Axis

Depending on the choice of projection center, the resulting expressions for the coordinates of the projected quaternion on the equatorial plane will vary. In order for these coordinates to be valid modified Rodrigues parameters, the projection center should lie on the real axis (Footnote 4), as is the case with the derivation of Sect. 4. To state this more clearly, consider a unit quaternion \(q=\rho +\upsilon ^T\varphi \) where \(\varphi =\begin{bmatrix} i&j&k\end{bmatrix}^T\) and its axis-angle parameterization, such that:

$$\begin{aligned} \rho= & {} \cos {\frac{\theta }{2}}\\ \upsilon= & {} \frac{\sin {\frac{\theta }{2}}}{\theta }\omega \end{aligned}$$

with \(\omega \in {\mathbb {R}}^3\) and \(\Vert \omega \Vert =\theta \). The vector of modified Rodrigues parameters associated with q is the triplet of stereographic coordinates \(\psi \) that back-projects to the corresponding spherical point in the following way:

$$\begin{aligned} \frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}= & {} \rho =\cos {\frac{\theta }{2}} \end{aligned}$$
(25)
$$\begin{aligned} \frac{2\psi }{1+\Vert \psi \Vert ^2}= & {} \upsilon =\frac{\sin {\frac{\theta }{2}}}{\theta }\,\omega \end{aligned}$$
(26)

Thus, the vector part of the quaternion is always represented by \(2\psi /(1+\Vert \psi \Vert ^2)\), which is collinear with the parameter vector.

With \(\psi \) being parallel to \(\upsilon \), it follows from Eq. 26 that it is also parallel to the rotation axis. This implies that MRPs are a member of the so-called family of vectorial parameterizations [4]. In particular, MRPs and Gibbs vectors belong to the tangent family, which enjoys certain important properties, the most prominent of them being the inter-connections in terms of the Cayley transform explained in Sect. 5.1. Using Eqs. 25 and 26, the relationship between the axis-angle vector \(\omega \) and the MRPs of a rotation is straightforward for a rotation angle \(\theta \in \left[ 0, 2\pi \right) \) [40, 58]:

$$\begin{aligned} \psi =\frac{\tan {\frac{\theta }{4}}}{\theta }\,\omega \end{aligned}$$
(27)

Thus, comparing Eq. 27 to the corresponding relationship for Gibbs vectors in Eq. 24, it can be inferred that the MRP vector \(\psi \) has twice the rotational range of the classical Rodrigues parameters g. In the special case where \(\theta =2\pi \), any MRP vector \(\psi \) with at least one of its coordinates equal to “infinity” will back-project to -1 on the quaternion sphere. More details on this representation peculiarity are given in Sect. 5.2.
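The relationships of Eqs. 24 and 27 can be checked numerically. The following is a minimal sketch, assuming that the MRPs are recovered from a unit quaternion as \(\psi =\upsilon /(1+\rho )\), which follows from Eqs. 25 and 26; the function names and the sample axis-angle vector are illustrative and not part of the original text.

```python
import math

def quat_from_axis_angle(omega):
    """Return (rho, upsilon) for an axis-angle vector omega with ||omega|| = theta."""
    theta = math.sqrt(sum(w * w for w in omega))
    if theta == 0.0:
        return 1.0, [0.0, 0.0, 0.0]  # identity rotation
    scale = math.sin(theta / 2.0) / theta
    return math.cos(theta / 2.0), [scale * w for w in omega]

def mrp_from_quat(rho, upsilon):
    """Stereographic coordinates psi = upsilon / (1 + rho); undefined at q = -1."""
    return [u / (1.0 + rho) for u in upsilon]

def gibbs_from_quat(rho, upsilon):
    """Gibbs vector g = upsilon / rho (Eq. 23); singular when rho = 0 (theta = pi)."""
    return [u / rho for u in upsilon]

# Illustrative axis-angle vector (an assumption of this sketch), theta = 1.3 rad.
omega = [0.3, -0.4, 1.2]
theta = math.sqrt(sum(w * w for w in omega))
rho, upsilon = quat_from_axis_angle(omega)

psi = mrp_from_quat(rho, upsilon)
g = gibbs_from_quat(rho, upsilon)

# Closed forms of Eq. 27 and Eq. 24, respectively.
psi_eq27 = [math.tan(theta / 4.0) / theta * w for w in omega]
g_eq24 = [math.tan(theta / 2.0) / theta * w for w in omega]

mrp_error = max(abs(a - b) for a, b in zip(psi, psi_eq27))
gibbs_error = max(abs(a - b) for a, b in zip(g, g_eq24))
print(mrp_error, gibbs_error)
```

Both discrepancies should be at the level of floating point round-off, since \(\sin (\theta /2)/(1+\cos (\theta /2))=\tan (\theta /4)\).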

Furthermore, using Eqs. 26 and 25, it is easy to derive a composition rule between MRP vectors based on quaternion multiplication. Specifically, for the unit quaternion product \(q_3=q_1 q_2\), we obtain the following corresponding relationship in MRP space [58]:

$$\begin{aligned} \psi _3 = \frac{\left( 1-{\Vert \psi _2 \Vert }^2\right) \psi _1 + \left( 1 - {\Vert \psi _1 \Vert }^2\right) \psi _2 - 2 \psi _1 \times \psi _2}{1 + {\Vert \psi _1 \Vert }^2 {\Vert \psi _2 \Vert }^2 - 2 \psi _1 \cdot \psi _2} \end{aligned}$$
(28)

5.1 MRPs and the Cayley Transform

The Cayley transform is a mapping from the space of skew-symmetric matrices directly to the group of rotation matrices. In particular, it is straightforward to show that the Cayley transform maps the skew symmetric matrix \(\left[ g\right] _{\times }\) of a Gibbs vector to the respective rotation matrix as follows [40, 62]:

$$\begin{aligned} R=\left( I_3+\left[ g\right] _{\times }\right) \left( I_3-\left[ g\right] _{\times }\right) ^{-1} \end{aligned}$$
(29)

where \(I_3\) is the \(3\times 3\) identity matrix. It is relatively easy to show that both factors in the product commute and are invertible (see Footnote 5) for any skew-symmetric matrix \(\left[ g\right] _{\times }\). The inverse Cayley transform maps a rotation matrix to its corresponding Gibbs vector as follows:

$$\begin{aligned} \left[ g\right] _{\times }=\left( R-I_3\right) \left( R+I_3\right) ^{-1} \end{aligned}$$
(30)

The transformation from an MRP skew-symmetric matrix \(\left[ \psi \right] _{\times }\) to the corresponding rotation matrix is a so-called second-order Cayley transform, given by the following mapping [62]:

$$\begin{aligned} R=\left( I_3+\left[ \psi \right] _{\times }\right) ^2\left( I_3-\left[ \psi \right] _{\times }\right) ^{-2} \end{aligned}$$
(31)

It is worth outlining here the rationale behind the derivation of the Cayley transform for MRPs as expounded by Schaub et al. [52]. The idea is to observe the relationship between classical Rodrigues parameters (Gibbs vectors) and MRPs through axis-angle vectors as given in Eqs. 24 and 27. Evidently, the rotation matrix obtained by employing the first-order (standard) Cayley transform formula of Eq. 29 on \(\left[ \psi \right] _{\times }\) yields a rotation matrix with half the angle of the actual rotation that corresponds to the MRP triplet. This can be demonstrated with the aid of exponential notation:

$$\begin{aligned} \exp {\left( \frac{1}{2}\left[ \omega \right] _{\times }\right) }=\left( I_3+\left[ \psi \right] _{\times }\right) \left( I_3-\left[ \psi \right] _{\times }\right) ^{-1}, \end{aligned}$$
(32)

where \(R=\exp {\left( \left[ \omega \right] _{\times }\right) }\). Thus, since matrices \(I_3+\left[ \psi \right] _{\times }\) and \(\left( I_3-\left[ \psi \right] _{\times }\right) ^{-1}\) commute, squaring both sides in Eq. 32 yields the second-order Cayley transform for MRPs given in Eq. 31. Schaub et al. report that, unlike classical Rodrigues parameters, there exists no similar expression for the inverse mapping from rotation matrices to MRPs [52].
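The first- and second-order Cayley transforms of Eqs. 29 and 31 can be compared against the Rodrigues rotation formula. The sketch below is illustrative only: the helper names, the plain-list matrix representation and the sample vector are assumptions, and the reference rotation \(\exp \left( \left[ \omega \right] _{\times }\right) \) is computed with the standard Rodrigues formula.

```python
import math

def skew(v):
    """Skew-symmetric cross-product matrix [v]x."""
    return [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def inverse3(M):
    """Inverse of a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, k) = M
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    adj = [[e * k - f * h, c * h - b * k, b * f - c * e],
           [f * g - d * k, a * k - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def cayley(v, order=1):
    """(I + [v]x)^order (I - [v]x)^(-order): Eq. 29 for order 1, Eq. 31 for order 2."""
    identity = [[float(r == c) for c in range(3)] for r in range(3)]
    S = skew(v)
    plus = [[identity[r][c] + S[r][c] for c in range(3)] for r in range(3)]
    minus = [[identity[r][c] - S[r][c] for c in range(3)] for r in range(3)]
    first_order = matmul(plus, inverse3(minus))
    R = first_order
    for _ in range(order - 1):
        R = matmul(R, first_order)  # valid because the two factors commute
    return R

def rodrigues(omega):
    """Reference rotation exp([omega]x) via the Rodrigues formula (theta > 0)."""
    theta = math.sqrt(sum(w * w for w in omega))
    K = skew([w / theta for w in omega])
    K2 = matmul(K, K)
    return [[float(r == c) + math.sin(theta) * K[r][c] + (1.0 - math.cos(theta)) * K2[r][c]
             for c in range(3)] for r in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[r][c] - B[r][c]) for r in range(3) for c in range(3))

omega = [0.3, -0.4, 1.2]  # illustrative axis-angle vector
theta = math.sqrt(sum(w * w for w in omega))
g = [math.tan(theta / 2.0) / theta * w for w in omega]    # Eq. 24
psi = [math.tan(theta / 4.0) / theta * w for w in omega]  # Eq. 27

err_gibbs = max_abs_diff(cayley(g, 1), rodrigues(omega))
err_mrp = max_abs_diff(cayley(psi, 2), rodrigues(omega))
err_half = max_abs_diff(cayley(psi, 1), rodrigues([0.5 * w for w in omega]))
print(err_gibbs, err_mrp, err_half)
```

The third quantity illustrates Eq. 32: the first-order transform applied to \(\left[ \psi \right] _{\times }\) reproduces the rotation by half the angle.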

5.2 Negated Quaternions and Shadow MRPs

Two antipodal (i.e., opposite) quaternions q and \(-q\) represent the same rotation. Nevertheless, the stereographic coordinates of these two quaternions will differ. Markley calls the stereographic coordinates of the negated quaternion shadow MRPs [40]. Shadow coordinates can be easily worked-out from Eq. 20:

$$\begin{aligned} \eta =-\frac{\psi }{{\Vert \psi \Vert }^2} \end{aligned}$$
(33)

It should be noted that the last formula is not valid for \(\psi = \left( 0, 0, 0\right) \), in which case the shadow quaternion coincides with the center of projection (i.e., the chosen “South Pole”) and the rotation has a single representation at the origin of the hyperplane. As the MRP vector moves far away from the origin of the hyperplane, the corresponding quaternions converge asymptotically to \(q=-1\) at the projection center. This is a direct consequence of the fact that the line \(y=-1\) is the asymptote at both \(+\infty \) and \(-\infty \) of function \((1-t^2)/(1+t^2)\), i.e., the quaternion’s scalar part from Eq. 25, as shown in Fig. 2. The plot clearly indicates that decay is fast and for \(\vert t\vert \ge 15\), the function is already very close to \(-1\); it becomes even closer for larger values. For example, the deviation of the function from \(-1\) for \(t=10^2\) is in the order of \(10^{-4}\).

Fig. 2 Plot of function \((1-t^2)/(1+t^2)\) for \(\vert t\vert \le 20\) shown in solid blue. The horizontal asymptote \(y=-1\) is shown with a red dashed line (Color figure online)

The above observation suggests that it is possible to approach the projection center with high accuracy using MRP vectors whose components are well within the nominal floating point range. Hence, this is a reasonable alternative to representing the quaternion at the center of projection without resorting to shadow coordinates which could cause discontinuities in applications such as interpolation. However, as will be explained in the following sections, our analysis is focused on the quaternion sphere, while MRPs are essentially used to provide the theoretical underpinning that allows us to work with unit quaternions without the need to impose the norm constraint. Thus, with the exception of interpolation, unit quaternions can be directly manipulated in terms of their MRPs, without having to explicitly switch parameter domain.
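To make the preceding discussion concrete, the following sketch evaluates the shadow coordinates of Eq. 33 and the deviation of the scalar part from \(-1\) for a large MRP component. The function names and sample values are assumptions made for illustration.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns (rho, upsilon)."""
    n2 = sum(p * p for p in psi)
    return (1.0 - n2) / (1.0 + n2), [2.0 * p / (1.0 + n2) for p in psi]

def shadow_mrp(psi):
    """Shadow coordinates eta = -psi / ||psi||^2 (Eq. 33); undefined at the origin."""
    n2 = sum(p * p for p in psi)
    if n2 == 0.0:
        raise ValueError("the shadow of psi = 0 is the projection center")
    return [-p / n2 for p in psi]

psi = [0.2, -0.5, 0.1]  # illustrative MRP triplet
eta = shadow_mrp(psi)

rho, upsilon = quat_from_mrp(psi)
rho_shadow, upsilon_shadow = quat_from_mrp(eta)

# The shadow MRPs back-project to the negated quaternion -q.
negation_error = max([abs(rho + rho_shadow)] +
                     [abs(a + b) for a, b in zip(upsilon, upsilon_shadow)])

# Deviation of the scalar part from -1 when one MRP component equals 10^2.
deviation = 1.0 + quat_from_mrp([100.0, 0.0, 0.0])[0]
print(negation_error, deviation)
```

The deviation equals \(2/(1+10^4)\approx 2\times 10^{-4}\), in agreement with the order of magnitude quoted above.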

6 Differential Properties of Unit Quaternions with Respect to Modified Rodrigues Parameters

It is clear from Eqs. 26, 25 that the derivatives of a unit quaternion with respect to its MRPs are rational functions of \(\psi \). This is an advantageous fact not only from the aspect of computational efficiency but, most importantly, in terms of the complexity of the resulting expressions in the Jacobian [61]. In this section, we will provide a very important novel observation regarding these derivatives, which will not only simplify the computation of the Jacobian of a rotation matrix (refer to “Appendix A” for a complete set of formulas), but will also provide simple relationships between the derivatives of curves on the hyperplane and the derivatives of their back-projections on the sphere.

6.1 Quaternion Jacobian

Proposition 1

Let \(q=\rho +\upsilon ^T\varphi \) where \(\varphi =\begin{bmatrix}i&j&k\end{bmatrix}^T\) be a unit quaternion. Then, the Jacobian \({\nabla }q\) of q with respect to its modified Rodrigues parameters is:

$$\begin{aligned} {\nabla }q=\nabla \begin{bmatrix}\upsilon \\ \rho \end{bmatrix}=-\begin{bmatrix} \upsilon \upsilon ^T - \left( 1+\rho \right) I_3 \\ \left( 1+\rho \right) \upsilon ^T \end{bmatrix} \end{aligned}$$
(34)

Proof

Let \(\psi \in \overline{{\mathbb {R}}}^3\) be the MRPs of the unit quaternion. It follows from Eq. 26 that the derivative of the vector part \(\upsilon \) with respect to \(\psi \) will be:

$$\begin{aligned} \begin{aligned} \dfrac{\partial \upsilon }{\partial \psi }&=\dfrac{\partial \frac{2\psi }{1+\Vert \psi \Vert ^2}}{\partial \psi } = -\frac{2\psi \dfrac{\partial \left( 1+\Vert \psi \Vert ^2\right) }{\partial \psi }}{\left( 1+\Vert \psi \Vert ^2\right) ^2}+\frac{2}{1+\Vert \psi \Vert ^2}\dfrac{\partial \psi }{\partial \psi }\\&=-\frac{2\psi 2\psi ^T}{\left( 1+\Vert \psi \Vert ^2\right) ^2}+\frac{2}{1+\Vert \psi \Vert ^2}I_3\\&=-\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \left( \frac{2\psi ^T}{1+\Vert \psi \Vert ^2}\right) \\&\quad +\frac{2+ \Vert \psi \Vert ^2-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}I_3\\&=-\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \left( \frac{2\psi ^T}{1+\Vert \psi \Vert ^2}\right) \\&\quad +\,\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) I_3\\&=-\upsilon \upsilon ^T+\left( 1+\rho \right) I_3 \end{aligned} \end{aligned}$$

Similarly, using Eq. 25, the derivative of the scalar part in terms of \(\psi \) will be:

$$\begin{aligned} \begin{aligned} \dfrac{\partial \rho }{\partial \psi }&= -\frac{\left( 1-\Vert \psi \Vert ^2\right) \dfrac{\partial \left( 1+\Vert \psi \Vert ^2\right) }{\partial \psi }}{\left( 1+\Vert \psi \Vert ^2\right) ^2} +\frac{\dfrac{\partial \left( 1-\Vert \psi \Vert ^2\right) }{\partial \psi }}{1+\Vert \psi \Vert ^2}\\&=\dfrac{\partial \frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}}{\partial \psi }=-\frac{\left( 1-\Vert \psi \Vert ^2\right) 2\psi ^T}{\left( 1+\Vert \psi \Vert ^2\right) ^2}-\frac{2\psi ^T}{1+\Vert \psi \Vert ^2}\\&=-\left( \frac{2}{1+\Vert \psi \Vert ^2}\right) \left( \frac{2\psi ^T}{1+\Vert \psi \Vert ^2}\right) \\&=-\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) \left( \frac{2\psi ^T}{1+\Vert \psi \Vert ^2}\right) \\&=-\left( 1+\rho \right) \upsilon ^T \end{aligned} \end{aligned}$$

The Jacobian formula of Eq. 34 is very important because it has low complexity of expressions and involves only the quaternion components in simple additions and multiplications without the need of additional constraints. This means that, by virtue of the rotation matrix expression in terms of a unit quaternion given in Eq. 8, the components of the rotation matrix Jacobian tensor will in turn comprise simple polynomial expressions of the quaternion components (see “Appendix A”). In other words, the computation of the rotation matrix derivatives entails exclusively multiplications and additions on previously stored quantities.
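As a sanity check of Proposition 1, the sketch below builds the Jacobian of Eq. 34 from the quaternion components alone and compares it with central finite differences of the back-projection in Eqs. 25 and 26. The storage order \(\begin{bmatrix}\upsilon ^T&\rho \end{bmatrix}^T\) follows Eq. 34; the function names, the step size and the sample MRPs are assumptions of this illustration.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns [v1, v2, v3, rho]."""
    n2 = sum(p * p for p in psi)
    return [2.0 * p / (1.0 + n2) for p in psi] + [(1.0 - n2) / (1.0 + n2)]

def quat_jacobian(q):
    """4x3 Jacobian of Eq. 34, using only the quaternion components."""
    v, rho = q[:3], q[3]
    rows = [[(1.0 + rho) * float(r == c) - v[r] * v[c] for c in range(3)]
            for r in range(3)]
    rows.append([-(1.0 + rho) * v[c] for c in range(3)])
    return rows

def numeric_jacobian(psi, h=1e-6):
    """Central finite differences of quat_from_mrp, used only as a check."""
    columns = []
    for c in range(3):
        forward, backward = list(psi), list(psi)
        forward[c] += h
        backward[c] -= h
        qf, qb = quat_from_mrp(forward), quat_from_mrp(backward)
        columns.append([(qf[r] - qb[r]) / (2.0 * h) for r in range(4)])
    return [[columns[c][r] for c in range(3)] for r in range(4)]

psi = [0.2, -0.5, 0.1]  # illustrative MRP triplet
J = quat_jacobian(quat_from_mrp(psi))
J_num = numeric_jacobian(psi)
max_err = max(abs(J[r][c] - J_num[r][c]) for r in range(4) for c in range(3))
print(max_err)
```

Note that `quat_jacobian` never touches \(\psi \); the MRPs appear only in the finite-difference check.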

6.2 Quaternion Updates from Perturbations in MRPs

Although the Jacobian of a unit quaternion with respect to MRPs can be expressed without the explicit presence of the parameters in the respective expressions, it will, however, produce a perturbation in parameter space during iterative optimization. Thus, it would appear that, in order to obtain the new quaternion estimate, one is required to convert it to MRPs, then perform the update in \({\mathbb {R}}^3\) and eventually, convert the resulting parameters to the new quaternion estimate. In Proposition 2, we show that the current estimate of a quaternion can be updated without alternating among parameter spaces.

Proposition 2

Consider a perturbation \(\delta \in \overline{{\mathbb {R}}}^3\) in the modified Rodrigues parameters of a unit quaternion \(q=\rho +\upsilon ^T\varphi \). Then, the scalar and vector part of the unit quaternion \(q^{\prime }=\rho ^{\prime }+{\upsilon ^{\prime }}^T\varphi \) corresponding to the perturbed MRPs can be obtained as follows:

$$\begin{aligned} \upsilon ^{\prime }= & {} \frac{\upsilon +\left( 1+\rho \right) \delta }{1+\upsilon \cdot \delta +\frac{1}{2}\left( 1+\rho \right) \Vert \delta \Vert ^2} \end{aligned}$$
(35)
$$\begin{aligned} \rho ^{\prime }= & {} \frac{\rho -\upsilon \cdot \delta -\frac{1}{2}\left( 1+\rho \right) \Vert \delta \Vert ^2}{1+\upsilon \cdot \delta +\frac{1}{2}\left( 1+\rho \right) \Vert \delta \Vert ^2} \end{aligned}$$
(36)

Proof

Let \(\psi \) be the MRP triplet associated with q. Then, taking the stereographic projection formula in Eq. 26 for \(\upsilon ^{\prime }\), we have:

$$\begin{aligned} \begin{aligned} \upsilon ^{\prime }&=\frac{2\left( \psi +\delta \right) }{1+\Vert \psi +\delta \Vert ^2}=\frac{\frac{2\left( \psi +\delta \right) }{1+\Vert \psi \Vert ^2}}{\frac{1+\left( \psi +\delta \right) \cdot \left( \psi +\delta \right) }{1+\Vert \psi \Vert ^2}}\\&=\frac{\frac{2\psi }{1+\Vert \psi \Vert ^2}+\frac{2}{1+\Vert \psi \Vert ^2}\delta }{1+\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta +\frac{1}{2}\left( \frac{2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}\\&=\frac{\frac{2\psi }{1+\Vert \psi \Vert ^2}+\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) \delta }{1+\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta +\frac{1}{2}\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}\\&=\frac{\upsilon +\left( 1+\rho \right) \delta }{1+\upsilon \cdot \delta +\frac{\left( 1+\rho \right) }{2}\Vert \delta \Vert ^2} \end{aligned} \end{aligned}$$

Similarly, taking the stereographic projection formula in Eq. 25 for \(\rho ^{\prime }\), yields:

$$\begin{aligned} \begin{aligned} \rho ^{\prime }&=\frac{1-\Vert \psi +\delta \Vert ^2}{1+\Vert \psi +\delta \Vert ^2}=\frac{\frac{1-\left( \psi +\delta \right) \cdot \left( \psi +\delta \right) }{1+\Vert \psi \Vert ^2}}{\frac{1+\left( \psi +\delta \right) \cdot \left( \psi +\delta \right) }{1+\Vert \psi \Vert ^2}}\\&=\frac{\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}-\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta -\frac{1}{2}\left( \frac{2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}{1+\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta +\frac{1}{2}\left( \frac{2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}\\&=\frac{\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}-\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta -\frac{1}{2}\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}{1+\left( \frac{2\psi }{1+\Vert \psi \Vert ^2}\right) \cdot \delta +\frac{1}{2}\left( 1+\frac{1-\Vert \psi \Vert ^2}{1+\Vert \psi \Vert ^2}\right) \Vert \delta \Vert ^2}\\&=\frac{\rho -\upsilon \cdot \delta -\frac{1}{2}\left( 1+\rho \right) \Vert \delta \Vert ^2}{1+\upsilon \cdot \delta +\frac{1}{2}\left( 1+\rho \right) \Vert \delta \Vert ^2} \end{aligned} \end{aligned}$$

Propositions 1 and 2 have a significant impact on the way rotations parameterized with MRPs are updated during iterative optimization. In particular, Proposition 1 ensures that the elements of the rotation matrix Jacobian are computed with a few multiplications and additions of previously stored numbers (i.e., the four quaternion components); furthermore, Proposition 2 ensures that the updated rotation matrix in each step of the iterative method can be obtained without having to compute the MRPs of the previous estimate. In other words, neither the Jacobian computation nor the update of the rotation matrix explicitly requires the MRPs, and both can be carried out with a few primitive operations on previously stored numbers.
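The update of Proposition 2 can be sketched as follows and compared with the explicit route through parameter space (quaternion to MRPs, addition of \(\delta \), back to quaternion). Function names and numerical values are assumptions made for the sake of the example.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns [v1, v2, v3, rho]."""
    n2 = sum(p * p for p in psi)
    return [2.0 * p / (1.0 + n2) for p in psi] + [(1.0 - n2) / (1.0 + n2)]

def update_quat(q, delta):
    """Eqs. 35 and 36: perturb the MRPs of q by delta without computing the MRPs."""
    v, rho = q[:3], q[3]
    v_dot_d = sum(a * b for a, b in zip(v, delta))
    half_term = 0.5 * (1.0 + rho) * sum(d * d for d in delta)
    denom = 1.0 + v_dot_d + half_term
    v_new = [(v[k] + (1.0 + rho) * delta[k]) / denom for k in range(3)]
    rho_new = (rho - v_dot_d - half_term) / denom
    return v_new + [rho_new]

psi = [0.2, -0.5, 0.1]       # MRPs of the current estimate (illustrative)
delta = [0.05, 0.02, -0.03]  # perturbation returned by an optimizer (illustrative)

q_direct = update_quat(quat_from_mrp(psi), delta)
q_via_mrp = quat_from_mrp([p + d for p, d in zip(psi, delta)])

update_error = max(abs(a - b) for a, b in zip(q_direct, q_via_mrp))
norm_error = abs(sum(x * x for x in q_direct) - 1.0)
print(update_error, norm_error)
```

The updated quaternion remains on the unit sphere by construction, so no renormalization step is needed in this sketch.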

6.3 Arc Length of Quaternion Curves Parameterized with MRPs

Being a rational map, stereographic projection can be used to back-project smooth 3D curves on the sphere. The resulting spherical curves have certain differential attributes which could be useful in manipulating their properties in the more familiar space \({\mathbb {R}}^3\).

Lemma 1

For any unit quaternion q, the Gram matrix of the Jacobian with respect to its MRPs is a scalar multiple of the \(3\times 3\) identity matrix:

$$\begin{aligned} {\left( \nabla q\right) }^T\nabla q = {\left( 1+\rho \right) }^2I_3 \end{aligned}$$
(37)

Proof

Using the result of Proposition 1, we have:

$$\begin{aligned} \begin{aligned} \left( \nabla q\right) ^T\nabla q&=\begin{bmatrix}\upsilon \upsilon ^T-\left( 1+\rho \right) I_3\\ \left( 1+\rho \right) \upsilon ^T \end{bmatrix}^T \begin{bmatrix}\upsilon \upsilon ^T-\left( 1+\rho \right) I_3\\ \left( 1+\rho \right) \upsilon ^T \end{bmatrix}\\&=\begin{bmatrix}\upsilon \upsilon ^T-\left( 1+\rho \right) I_3&\,\,\,\left( 1+\rho \right) \upsilon \end{bmatrix} \begin{bmatrix}\upsilon \upsilon ^T-\left( 1+\rho \right) I_3\\ \left( 1+\rho \right) \upsilon ^T \end{bmatrix}\\&=\left( \upsilon ^T\upsilon -2\left( 1+\rho \right) +\left( 1+\rho \right) ^2\right) \upsilon \upsilon ^T+\left( 1+\rho \right) ^2I_3\\&=\left( \underset{1}{\underbrace{\rho ^2+\upsilon ^T\upsilon }}-1\right) \upsilon \upsilon ^T+\left( 1+\rho \right) ^2I_3\\&=\left( 1+\rho \right) ^2I_3 \end{aligned} \end{aligned}$$

A direct consequence of Lemma 1 is that the columns of the quaternion Jacobian must be orthogonal for every unit quaternion \(q\ne -1\). More formally:

Corollary 1

For any unit quaternion \(q\ne -1\) , the columns of the Jacobian of q with respect to its MRPs constitute an orthogonal basis of the tangent space of the quaternion sphere at q.
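Lemma 1 and Corollary 1 can be verified numerically with a short sketch. It checks that the Gram matrix of the Jacobian equals \(\left( 1+\rho \right) ^2I_3\) and that every column of the Jacobian is orthogonal to q, i.e., lies in the tangent space of the sphere at q. The names and the sample MRPs are illustrative assumptions.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns [v1, v2, v3, rho]."""
    n2 = sum(p * p for p in psi)
    return [2.0 * p / (1.0 + n2) for p in psi] + [(1.0 - n2) / (1.0 + n2)]

def quat_jacobian(q):
    """4x3 Jacobian of Eq. 34."""
    v, rho = q[:3], q[3]
    rows = [[(1.0 + rho) * float(r == c) - v[r] * v[c] for c in range(3)]
            for r in range(3)]
    rows.append([-(1.0 + rho) * v[c] for c in range(3)])
    return rows

q = quat_from_mrp([0.7, 0.1, -1.3])  # illustrative unit quaternion, q != -1
J = quat_jacobian(q)

# Gram matrix (grad q)^T (grad q), expected to equal (1 + rho)^2 I_3 (Eq. 37).
gram = [[sum(J[r][a] * J[r][b] for r in range(4)) for b in range(3)] for a in range(3)]
scale = (1.0 + q[3]) ** 2
gram_error = max(abs(gram[a][b] - scale * float(a == b))
                 for a in range(3) for b in range(3))

# Each column is orthogonal to q, hence tangent to the unit sphere at q.
tangency_error = max(abs(sum(J[r][c] * q[r] for r in range(4))) for c in range(3))
print(gram_error, tangency_error)
```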

Lemma 2

Let \(\psi \left( t\right) = \left( x\left( t\right) , y\left( t\right) , z\left( t\right) \right) :{\mathbb {R}}\rightarrow {\mathbb {R}}^3\) be a smooth curve. Then the unit quaternion function \(q\left( t\right) =\rho \left( t\right) +\left( \upsilon \left( t\right) \right) ^T\varphi \) obtained as the stereographic back-projection of \(\psi \left( t\right) \) on the quaternion sphere is also smooth and the arc length \(s\left( t\right) \) of \(q\left( t\right) \) is given by the following expression:

$$\begin{aligned} s\left( t \right) = 2\int _0^t{ \frac{1}{1+{\Vert \psi \left( u\right) \Vert }^2} \Big \Vert \dfrac{\mathrm{d}\psi }{\mathrm{d}u}\Big \Vert \, du} \end{aligned}$$
(38)

Proof

It naturally follows from Proposition 1 that, by means of the chain rule, the first derivative will be a product of smooth functions and consequently, higher order derivatives will be the sum of such products. Thus, the spherical back-projection of \(\psi \left( t \right) \) will be smooth. Another way of arriving at this conclusion is to simply consider that stereographic projection is a rational mapping and therefore maps smooth functions to smooth functions.

To prove Eq. 38, we make use of Lemma 1, starting from the standard formula for the arc length of \(q\left( t\right) \):

$$\begin{aligned} \begin{aligned} s\left( t\right)&=\int _0^t{\sqrt{{\left( \dfrac{\mathrm{d}q}{\mathrm{d}u}\right) }^T\dfrac{\mathrm{d}q}{\mathrm{d}u}} \, du}\\&=\int _0^t{\sqrt{\left( \nabla q\dfrac{\mathrm{d}\psi }{\mathrm{d}u}\right) ^T\left( \nabla q\dfrac{\mathrm{d}\psi }{\mathrm{d}u}\right) } \, du}\\&=\int _0^t{\sqrt{{\left( \dfrac{\mathrm{d}\psi }{\mathrm{d}u}\right) }^T{\underset{=\left( 1+\rho \left( u\right) \right) ^2I_3}{\underbrace{\left( \nabla q\right) ^T\nabla q}}}\,\dfrac{\mathrm{d}\psi }{\mathrm{d}u}} \, du}\\&=\int _0^t{\left( 1+\rho \left( u\right) \right) \Big \Vert \dfrac{\mathrm{d}\psi }{\mathrm{d}u} \Big \Vert \, du} \end{aligned} \end{aligned}$$

It can be easily inferred from Eq. 25 that \(1+\rho \left( u\right) =\frac{2}{1+\Vert \psi \left( u\right) \Vert ^2}\). Thus,

$$\begin{aligned} s\left( t\right) =\int _0^t{\frac{2}{1+\Vert \psi \left( u\right) \Vert ^2}\Big \Vert \dfrac{\mathrm{d}\psi }{\mathrm{d}u} \Big \Vert \, du} \end{aligned}$$
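Equation 38 lends itself to numerical quadrature. The sketch below approximates the integral with composite Simpson's rule for an illustrative curve, the straight line \(\psi \left( u\right) =\left( u, 0, 0\right) \) with \(u\in \left[ 0, 1\right] \); the integrator, its step count and the choice of curve are assumptions of this example rather than part of the method.

```python
import math

def arc_length(psi, dpsi, t_end, steps=2000):
    """Composite Simpson approximation of Eq. 38 on [0, t_end] (steps must be even)."""
    def integrand(u):
        p, d = psi(u), dpsi(u)
        return 2.0 / (1.0 + sum(x * x for x in p)) * math.sqrt(sum(x * x for x in d))
    h = t_end / steps
    total = integrand(0.0) + integrand(t_end)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

# Straight line through the origin of the hyperplane and its derivative.
line = lambda u: [u, 0.0, 0.0]
line_derivative = lambda u: [1.0, 0.0, 0.0]

s = arc_length(line, line_derivative, 1.0)
print(s, math.pi / 2.0)
```

For this curve the integral has the closed form \(2\arctan (1)=\pi /2\): the back-projection is a great arc from \(q=1\) to a quaternion with \(\rho =0\), which subtends an angle of \(\pi /2\) on the unit sphere.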

7 Quaternion Interpolation

Quaternion interpolation is ubiquitous in the fields of computer graphics, robotics and aerospace engineering [8, 21, 22, 44, 55]. Generating smooth orientation paths between key orientations is a very challenging task, primarily because we wish to attach linear interfaces onto steering mechanisms which, by definition, manipulate objects (i.e., rotations) that reside in a spherical manifold. In other words, the desired attributes of the generated sequences are hard to attain, due to the topological nonlinearities of the group of rotations.

Stereographic projection is a smooth, bijective mapping from \({\mathbb {R}}^3\) to the unit sphere in \({\mathbb {R}}^4\). Consequently, lines in 3D become distorted on the sphere to account for the incompatibility between the two topological spaces. In this section, we provide a simple solution to cope with the distortion caused by perspective projection, while designing the spherical curve in the hyperplane. The idea is to interpolate the derivative of the spherical curve, in addition to interpolating the data.

7.1 Configuring Unit Quaternion Derivatives on the Hyperplane

Suppose we wish to establish a relationship between the derivative of a parametric unit quaternion curve \(q\left( t\right) =\rho \left( t\right) +\left( \upsilon \left( t\right) \right) ^T\varphi \) and the derivative of the corresponding MRP curve \(\psi \left( t\right) \). The chain rule for \(q\left( t\right) \) yields:

$$\begin{aligned} q^{\prime }\left( t\right) =\nabla q\,\psi ^{\prime }\left( t\right) , \end{aligned}$$
(39)

where \(\psi ^{\prime }\left( t\right) =\dfrac{\mathrm{d}\psi }{\mathrm{d}t}\) and \(\nabla q\) is the Jacobian of the quaternion curve at \(\psi \). Clearly, Eq. 39 defines a \(4\times 3\) linear system of equations in the components of \(\psi ^{\prime }\left( t\right) \). We claim that for \(q\left( t\right) \ne -1\) this system has the following unique solution:

$$\begin{aligned} \psi ^{\prime }\left( t\right) = \frac{1}{\left( 1+\rho \right) ^2}\left( {\nabla q}\right) ^T\,q^{\prime }\left( t\right) \end{aligned}$$
(40)

Lemma 3

For the unit quaternion \(q=\rho +\upsilon ^T\varphi \), examine the \(4\times 3\) linear system \({\nabla q}\,\xi =b\), where \(b\in {\mathbb {R}}^4\) and \(\nabla q\) is the Jacobian of q with respect to its modified Rodrigues parameters.

  (a) For \(q\ne -1\), consider the vector:

    $$\begin{aligned} \xi =\frac{1}{\left( 1+\rho \right) ^2}\left( {\nabla q}\right) ^T\,b \end{aligned}$$
    (41)

    (i) If b lies in the tangent space of q, then \(\xi \) is the unique solution of the system.

    (ii) If b is not in the tangent space of q, then \(\xi \) is the least squares minimizer of the system and consequently, \({\nabla q}\,\xi \) is the projection of b on the tangent space of q.

  (b) For \(q=-1\):

    (i) If \(b\ne 0\), the system has no solutions.

    (ii) If \(b=0\), any \(\xi \in {\mathbb {R}}^3\) is a solution.

Proof

The proof is trivial for \(q=-1\), because the Jacobian vanishes there. For \(q\ne -1\), we multiply by \(\left( \nabla q\right) ^T\) on the left to get the \(3\times 3\) system of normal equations:

$$\begin{aligned} \left( \left( \nabla q\right) ^T\nabla q\right) \,\xi ={\left( \nabla q\right) }^T\,b \end{aligned}$$

From Lemma 1, we know that the Gram matrix of the gradient will be a non-zero scalar multiple of the identity:

$$\begin{aligned} \left( 1+\rho \right) ^2\,\xi&=\left( \nabla q\right) ^T\,b \\ \Leftrightarrow \, \xi&= \frac{1}{\left( 1+\rho \right) ^2}{\left( \nabla q\right) }^T\,b \end{aligned}$$

The vector \(\xi \) of Eq. 41 satisfies all the equations of the original \(4\times 3\) overdetermined system when b is in the tangent space of q, because we know from Corollary 1 that b can then be expressed as a linear combination of the columns of \(\nabla q\). Otherwise, \(\xi \) solves the normal equations and is therefore the least squares minimizer.
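Both cases of Lemma 3(a) can be illustrated with a short sketch. A right-hand side b is first synthesized inside the tangent space and then displaced along q, which is normal to the sphere; in both cases Eq. 41 should return the same \(\xi \). The function names and numerical values are assumptions made for illustration.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns [v1, v2, v3, rho]."""
    n2 = sum(p * p for p in psi)
    return [2.0 * p / (1.0 + n2) for p in psi] + [(1.0 - n2) / (1.0 + n2)]

def quat_jacobian(q):
    """4x3 Jacobian of Eq. 34."""
    v, rho = q[:3], q[3]
    rows = [[(1.0 + rho) * float(r == c) - v[r] * v[c] for c in range(3)]
            for r in range(3)]
    rows.append([-(1.0 + rho) * v[c] for c in range(3)])
    return rows

def solve_tangent(q, b):
    """Eq. 41: xi = (grad q)^T b / (1 + rho)^2, valid for q != -1."""
    J = quat_jacobian(q)
    scale = (1.0 + q[3]) ** 2
    return [sum(J[r][c] * b[r] for r in range(4)) / scale for c in range(3)]

q = quat_from_mrp([0.2, -0.5, 0.1])
J = quat_jacobian(q)
xi_true = [0.3, 0.8, -0.6]

# Case (i): b lies in the tangent space, so xi_true is recovered exactly.
b_tangent = [sum(J[r][c] * xi_true[c] for c in range(3)) for r in range(4)]
xi_a = solve_tangent(q, b_tangent)

# Case (ii): adding a multiple of q moves b off the tangent space; the minimizer
# is unchanged and (grad q) xi is the tangent-space projection of b.
b_off = [b_tangent[r] + 0.4 * q[r] for r in range(4)]
xi_b = solve_tangent(q, b_off)
projection = [sum(J[r][c] * xi_b[c] for c in range(3)) for r in range(4)]

err_a = max(abs(a - b) for a, b in zip(xi_a, xi_true))
err_b = max(abs(a - b) for a, b in zip(xi_b, xi_true))
err_proj = max(abs(a - b) for a, b in zip(projection, b_tangent))
print(err_a, err_b, err_proj)
```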

7.2 Spherical Catmull–Rom Splines Manufactured in 3D

Equation 41 provides the means to configure the differential properties of a 4D spherical curve by manipulating its projection in \({\mathbb {R}}^3\). To demonstrate how effective this approach is in terms of eliminating projective distortion, we present a sample scheme for designing spherical Catmull–Rom splines [9].

Standard Catmull–Rom splines are composed of polynomial segments with end-point derivatives that match the slope of the linear segments that connect data points immediately preceding and trailing the end-points (see Fig. 3). We consider an analog of Catmull–Rom splines on the sphere in which we require the derivative of the spherical curve at a specific data-point to be collinear to the tangent-space projection of the linear segment defined by the trailing and preceding data points (Fig. 4).

Fig. 3 Standard Catmull–Rom interpolation. The tangents at the data points \(p_n\) and \(p_{n+1}\) (shown as black solid lines) are parallel to the linear segments defined by \(p_{n-1}\), \(p_{n+1}\) and \(p_{n}\), \(p_{n+2}\) (shown with dashed lines)

Fig. 4 Spherical Catmull–Rom interpolation. The tangents of the spherical curve at \(q_n\) and \(q_{n+1}\) are parallel to the projections of linear segments \(\overline{q_{n-1} \, q_{n+1}}\) and \(\overline{q_n \, q_{n+2}}\) onto the tangent spaces of \(q_n\) and \(q_{n+1}\), respectively

To interpolate between two key points \(q_n\) and \(q_{n+1}\) according to the spherical analog of Catmull–Rom splines described in this section, we must solve for the coefficients of a cubic polynomial \(\psi \left( t\right) =b_3t^3+b_2t^2+b_1t+b_0\), where \(b_0, b_1, b_2, b_3\in {\mathbb {R}}^3\) and \(t\in \left[ 0, 1\right] \). It follows that \(\psi \left( t\right) \) should interpolate the projections of two data points at \(\psi _n\) and \(\psi _{n+1}\) for \(t=0\) and \(t=1\), respectively. Hence, the following conditions should apply:

$$\begin{aligned}&b_0 = \psi _n \end{aligned}$$
(42)
$$\begin{aligned}&b_3+b_2+b_1+b_0 = \psi _{n+1} \end{aligned}$$
(43)

Furthermore, we require that the tangents at the spherical points are parallel to the chords that connect the trailing and preceding data points, as shown in Fig. 4. Thus, two additional constraints are obtained on the coefficients of the cubic:

$$\begin{aligned}&b_1 = \frac{\lambda \left( \nabla q_n\right) ^T\left( q_{n+1}-q_{n-1}\right) }{\left( 1+\rho _n\right) ^2} \end{aligned}$$
(44)
$$\begin{aligned}&3b_3+2b_2+b_1 =\frac{\lambda \left( \nabla q_{n+1}\right) ^T \left( q_{n+2}-q_n\right) }{\left( 1+\rho _{n+1}\right) ^2}, \end{aligned}$$
(45)

where \(\rho _i\) is the scalar part of \(q_i\) and \(\lambda \) is a user-defined positive scalar. Using Eqs. 42, 43, 44 and 45, the coefficients of \(\psi \left( t\right) \) can be computed in the following order:

$$\begin{aligned} \tau _1= & {} \frac{\left( \nabla q_n\right) ^T\left( q_{n+1}-q_{n-1}\right) }{\left( 1+\rho _n\right) ^2} \end{aligned}$$
(46)
$$\begin{aligned} \tau _2= & {} \frac{\left( \nabla q_{n+1}\right) ^T \left( q_{n+2}-q_n\right) }{\left( 1+\rho _{n+1}\right) ^2} \end{aligned}$$
(47)
$$\begin{aligned} b_0= & {} \psi _n \end{aligned}$$
(48)
$$\begin{aligned} b_1= & {} \lambda \tau _1 \end{aligned}$$
(49)
$$\begin{aligned} b_3= & {} \lambda \tau _2 + b_1- 2\left( \psi _{n+1}-b_0\right) \end{aligned}$$
(50)
$$\begin{aligned} b_2= & {} \psi _{n+1}-b_3-b_1-b_0 \end{aligned}$$
(51)

Fig. 5 Spherical Catmull–Rom (blue) and \(\textsc {squad}\) interpolation (red) for 8 key-quaternions (black dots) (Color figure online)

Fig. 6 Arc length, speed and distance from the great arcs for a sequence of 8 points generated using polar jumps in the interval \(\left[ 10^{\circ }, 40^{\circ }\right] \) (to be viewed in color). a Arc length, b speed, c distance from great arc

It should be noted that the scheme for designing spherical Catmull–Rom splines is intended as an example of a more general methodology for eliminating perspective distortion by configuring the planar curve to produce a back-projection that matches the desired differential properties on the sphere. The result of Eq. 41 is a tool with multiple uses when designing spherical splines on the plane.
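The computation of the cubic coefficients in Eqs. 46 to 51 can be sketched for a single segment as follows. This is a minimal illustration under stated assumptions: the four key quaternions are generated from arbitrary MRP triplets, \(\lambda =0.5\) is an arbitrary choice of the user-defined scalar, and all function names are hypothetical.

```python
def quat_from_mrp(psi):
    """Back-projection of Eqs. 25 and 26; returns [v1, v2, v3, rho]."""
    n2 = sum(p * p for p in psi)
    return [2.0 * p / (1.0 + n2) for p in psi] + [(1.0 - n2) / (1.0 + n2)]

def mrp_from_quat(q):
    """Stereographic coordinates psi = upsilon / (1 + rho), for q != -1."""
    return [x / (1.0 + q[3]) for x in q[:3]]

def quat_jacobian(q):
    """4x3 Jacobian of Eq. 34."""
    v, rho = q[:3], q[3]
    rows = [[(1.0 + rho) * float(r == c) - v[r] * v[c] for c in range(3)]
            for r in range(3)]
    rows.append([-(1.0 + rho) * v[c] for c in range(3)])
    return rows

def tangent_in_mrp_space(q, chord):
    """Eqs. 46 and 47: (grad q)^T chord / (1 + rho)^2."""
    J = quat_jacobian(q)
    scale = (1.0 + q[3]) ** 2
    return [sum(J[r][c] * chord[r] for r in range(4)) / scale for c in range(3)]

def scr_coefficients(q_prev, q_n, q_next, q_next2, lam):
    """Coefficients b0..b3 of psi(t) on [0, 1], following Eqs. 48 to 51."""
    tau1 = tangent_in_mrp_space(q_n, [a - b for a, b in zip(q_next, q_prev)])
    tau2 = tangent_in_mrp_space(q_next, [a - b for a, b in zip(q_next2, q_n)])
    b0 = mrp_from_quat(q_n)
    psi_next = mrp_from_quat(q_next)
    b1 = [lam * t for t in tau1]
    b3 = [lam * tau2[k] + b1[k] - 2.0 * (psi_next[k] - b0[k]) for k in range(3)]
    b2 = [psi_next[k] - b3[k] - b1[k] - b0[k] for k in range(3)]
    return b0, b1, b2, b3

# Four illustrative key quaternions q_{n-1}, q_n, q_{n+1}, q_{n+2}.
keys = [quat_from_mrp(p) for p in
        ([0.0, 0.1, 0.0], [0.1, 0.2, 0.0], [0.3, 0.2, 0.1], [0.4, 0.0, 0.2])]
lam = 0.5
b0, b1, b2, b3 = scr_coefficients(keys[0], keys[1], keys[2], keys[3], lam)

# Check the end-point conditions of Eqs. 43 and 45 at t = 1.
end_point = [b0[k] + b1[k] + b2[k] + b3[k] for k in range(3)]
end_slope = [3.0 * b3[k] + 2.0 * b2[k] + b1[k] for k in range(3)]
target_point = mrp_from_quat(keys[2])
tau2 = tangent_in_mrp_space(keys[2], [a - b for a, b in zip(keys[3], keys[1])])
target_slope = [lam * t for t in tau2]
print(end_point, end_slope)
```

The quaternion curve itself is then obtained by back-projecting \(\psi \left( t\right) \) with Eqs. 25 and 26 for each sampled t.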

Figure 5 illustrates an 8-point spherical Catmull–Rom spline along with the corresponding spherical quadrangle interpolation (\(\textsc {squad}\)) curve. Clearly, both curves fulfill the primary requirement of smoothness. Still, a more careful observation reveals that the spherical Catmull–Rom spline achieves a generally shorter distance between the key points in comparison with \(\textsc {squad}\). Obtaining shorter arc lengths in orientation interpolants is important in robotic and animation applications [14, 27, 49, 53] because they yield reduced torque in the rotational motion. For similar reasons, minimal curvature is an equally important attribute [13]. Particularly in the case of animation, approximately constant speed in spherical curves is desirable because it can be warped into any desirable acceleration profile (e.g., trapezoidal moves) [66]. In summary, it would appear that connected great arcs can be loosely regarded as the “ideal” interpolants in terms of the aforementioned attributes (minimal arc length, minimal curvature and constant speed) if we could somehow overlook the lack of smoothness at the end-points. In Sect. 8.1, we study the characteristics of generated Catmull–Rom and \(\textsc {squad}\) curves and use the great arcs between data points as a benchmark to obtain quantifiable measures on their performance.

8 Experimental Results

This section presents experimental results comparing the performance of MRPs in various applications against alternative parameterizations of rotation.

8.1 Spherical Catmull–Rom Splines as Orientation Interpolants

The method proposed in Sect. 7.2 for the design of spherical Catmull–Rom (SCR) splines was merely a demonstration of the ways that the properties of a spherical curve can be configured in the hyperplane. However, the resulting curve, although not optimal in all aspects, exhibits, besides smoothness, significant improvements in terms of arc length and curvature when compared to the one produced by \(\textsc {squad}\). Furthermore, it approximates the great arcs between the key points better than \(\textsc {squad}\) in the majority of cases.

For demonstration, we performed comparisons on three sequences of eight key quaternions each, generated by successive random “jumps” in polar coordinates. In particular, the angles of each jump are uniformly sampled from a specified range in degrees. The range from which the angles are sampled determines the density of the key points which in turn affects the behavior of the interpolants. To observe the qualitative characteristics of the spherical curves in datasets of varying sparsity, the angular jumps used in the three generated sequences were sampled from the intervals \(\left[ 10^{\circ }, 40^{\circ }\right] \), \(\left[ 10^{\circ }, 70^{\circ }\right] \) and \(\left[ 10^{\circ }, 100^{\circ }\right] \), respectively. Figures 6, 7, 8 illustrate arc length, speed and distance from great arc for both spherical Catmull–Rom splines and \(\textsc {squad}\) in the three aforementioned sequences.

Fig. 7 Arc length, speed and distance from the great arcs for a sequence of 8 points generated using polar jumps in the interval \(\left[ 10^{\circ }, 70^{\circ }\right] \). a Arc length, b speed, c distance from great arc

Fig. 8 Arc length, speed and distance from the great arcs for a sequence of 8 points generated using polar jumps in the interval \(\left[ 10^{\circ }, 100^{\circ }\right] \). a Arc length, b speed, c distance from great arc

The results indicate that SCR splines produce curves that are often much closer to great arcs than those obtained from \(\textsc {squad}\), with generally shorter arc lengths, suggesting that the quality of interpolation attains similar, if not higher standards. The latter is an indication that perspective distortion is either not present or minimal as a direct consequence of configuring spherical derivatives in the hyperplane. On the other hand, \(\textsc {squad}\) presents speed patterns that match \(\textsc {slerp}\) more closely than SCR splines do. The speed profile of SCR splines is not necessarily a bad trait, primarily because SCR curves consistently produce a symmetric bell-shaped speed profile, which can be perceived as a smooth trapezoidal pattern. It should be noted that \(\textsc {squad}\) is twice differentiable at the key points, while SCR curves are only continuous in the first derivative, a direct consequence of the definition of Catmull–Rom splines. In summary, SCR curves are smooth interpolants presenting little distortion on the sphere and bell-shaped symmetric speed patterns which can be easily warped to produce other speed profiles. On the other hand, velocity is not differentiable at the key points, a fact which may bear consequences, depending on the application. It should be noted, however, that the underlying general rationale of interpolating derivatives side by side with data points is successful in producing spherical interpolants which are devoid of distortion, and it can be adapted to suit spherical interpolation based on more general planar curves such as B-splines, for instance.

8.2 Descent Behavior of MRPs

Aiming to assess the descent behavior of MRPs in a situation involving a single unknown rotation, we chose to employ absolute orientation [29] as a test problem. It is widely known that this problem can be dealt with non-iteratively, e.g., [15, 28, 29, 36]. However, our objective in this experiment was not to provide yet another solution, but rather to benchmark how MRPs compare against other parameterization schemes in the context of a basic, quadratic minimization problem in only the rotation parameters. Specifically, the parameterization schemes compared with MRPs were axis-angle, normalized quaternion and incremental rotation.

Given two \(3\times N\) matrices X and Y comprised of N corresponding points in two different reference frames with the same origin, absolute orientation requires determining the rotation \(R(p^*)\), where

$$\begin{aligned} p^* = \mathop {{{\mathrm{arg\,min}}}}\limits _p \; \big \Vert R\left( p\right) \,Y-X\big \Vert ^2, \end{aligned}$$
(52)

\(p \in {\mathbb {R}}^3\) is the orientation parameter vector and \(\Vert \cdot \Vert \) denotes the Frobenius norm for matrices (footnote 6).

In our experimental setup, the dataset comprises 100 correspondences, i.e., matrices X and Y have size \(3\times 100\). The unrotated points X were sampled from a 3D Gaussian with covariance matrix \(10^2I_3\), thus producing a “spread” of roughly 10 metric units. The ground-truth rotation matrix was synthesized by uniformly sampling the 3 Euler angles from the interval \(\left[ 20^{\circ }, 80^{\circ }\right] \). This rotation was then applied to every column of X to yield Y.
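The setup above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in the experiments: the function names, the random seed and the Z-Y-X Euler convention are assumptions (the text does not state a convention), and the MRP vector is taken to be the stereographic projection \(\psi = v/(1+s)\) of the unit quaternion with scalar part s and vector part v.

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def mrp_to_rot(psi):
    """MRPs -> unit quaternion (inverse stereographic projection) -> rotation."""
    n = psi @ psi
    s = (1.0 - n) / (1.0 + n)      # scalar part
    v = 2.0 * psi / (1.0 + n)      # vector part
    V = skew(v)
    return np.eye(3) + 2.0 * s * V + 2.0 * V @ V

def rot_to_mrp(R):
    """Rotation matrix -> MRPs; valid for rotation angles below 180 degrees."""
    s = 0.5 * np.sqrt(1.0 + np.trace(R))
    v = np.array([R[2, 1] - R[1, 2],
                  R[0, 2] - R[2, 0],
                  R[1, 0] - R[0, 1]]) / (4.0 * s)
    return v / (1.0 + s)

def euler_zyx(az, ay, ax):
    """Rotation from three Euler angles (Z-Y-X order is an assumption)."""
    cz, sz = np.cos(az), np.sin(az)
    cy, sy = np.cos(ay), np.sin(ay)
    cx, sx = np.cos(ax), np.sin(ax)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def cost(psi, X, Y):
    """Squared Frobenius norm of R(psi) Y - X, as in Eq. (52)."""
    return np.linalg.norm(mrp_to_rot(psi) @ Y - X, "fro") ** 2

rng = np.random.default_rng(0)                       # arbitrary seed
X = rng.normal(scale=10.0, size=(3, 100))            # covariance 10^2 I_3
angles = np.deg2rad(rng.uniform(20.0, 80.0, size=3))
R_gt = euler_zyx(*angles)
Y = R_gt @ X                                         # rotate every column of X

# Without noise, the minimizer of Eq. (52) is the inverse of the ground truth.
psi_star = rot_to_mrp(R_gt.T)
print(cost(psi_star, X, Y), cost(np.zeros(3), X, Y))
```

In the noise-free case the cost at `psi_star` vanishes up to round-off, whereas the cost at the origin of the parameter space is large.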

To study the descent behavior of MRPs against alternative parameterizations across multiple levels of noise, the optimization was carried out with the Levenberg–Marquardt algorithm [34, 42] for 100 increasing noise levels, with standard deviations ranging from 0 to 2.5. The noise is presumed Gaussian and was added to Y, which, by virtue of the property of linear propagation of covariance, is statistically equivalent to contaminating the relative position of the correspondences.

For each noise level, the same experiment (i.e., using the same ground-truth data) was repeated 40 times from a random starting point to convergence, and the error in each step was recorded, as well as the total number of steps to convergence (or the maximum permissible number of iterations). Since we know that the absolute orientation cost function has 4 stationary points which are the eigenvectors of a data-dependent matrix, we would expect the process to occasionally get stuck in one of the 3 suboptimal points. In practice, we observed that this rarely happens. However, to ensure that the descent observations are not biased by the occasional convergence to local minima, we used median values for the error and the number of steps to convergence. Figure 9 plots steps-to-convergence versus the standard deviation of the Gaussian noise. It should be noted that the iterative process terminates when any of the following conditions is met: (a) the squared error falls below \(10^{-6}\), (b) the change in squared error falls below \(10^{-12}\), or (c) the process has reached 100 iterations.
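The three termination conditions can be illustrated with a minimal Levenberg–Marquardt loop. The sketch below is generic and is not the implementation used in the experiments: the function name, the initial damping value and the multiply/divide-by-10 damping schedule are assumptions, and the toy residual at the end is purely illustrative.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, p0,
                        tol_err=1e-6, tol_change=1e-12, max_iter=100):
    """Minimal LM loop; returns (parameters, squared error, iterations)."""
    p = np.asarray(p0, dtype=float)
    lam = 1e-3                          # initial damping (assumed value)
    r = residual(p)
    err = r @ r
    for k in range(max_iter):
        if err < tol_err:               # condition (a)
            return p, err, k
        J = jacobian(p)
        A = J.T @ J + lam * np.eye(p.size)
        step = np.linalg.solve(A, -J.T @ r)
        r_new = residual(p + step)
        err_new = r_new @ r_new
        if err_new < err:               # accept the step, relax the damping
            change = err - err_new
            p, r, err = p + step, r_new, err_new
            lam *= 0.1
            if change < tol_change:     # condition (b)
                return p, err, k + 1
        else:                           # reject the step, increase the damping
            lam *= 10.0
    return p, err, max_iter             # condition (c)

# Toy problem with a zero-residual solution at p = (1, 2).
def toy_residual(p):
    return np.array([p[0] - 1.0, p[1] - 2.0, p[0] * p[1] - 2.0])

def toy_jacobian(p):
    return np.array([[1.0, 0.0], [0.0, 1.0], [p[1], p[0]]])

p_hat, err_hat, iters = levenberg_marquardt(toy_residual, toy_jacobian, [0.0, 0.0])
print(p_hat, err_hat, iters)
```

With noisy data the squared error generally cannot drop below the threshold of condition (a), so termination is then governed by conditions (b) and (c).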

The incremental rotation approach employed here has been adopted by several notable pieces of work in vision, e.g., [18, 33, 39, 60]. The rationale behind this approach is to take advantage of the fact that the tangent space of a rotation matrix R comprises all matrices \(RS_{\times }\) where \(S_{\times }\) is a skew-symmetric matrix. It is therefore possible to devise a very simple retraction [1] \(R^{\prime }\), which maps the tangent space of R onto \(\mathcal{SO}(3)\) by applying a perturbing rotation on the right of R (footnote 7):

$$\begin{aligned} R^{\prime }(S_{\times })=R\exp \left( S_{\times }\right) \end{aligned}$$
(53)

It can be easily shown that \(R^{\prime }\) is a retraction, since the exponential map is smooth and \(R^{\prime }\left( \left[ 0\right] _{\times }\right) =R\exp \left( \left[ 0\right] _{\times }\right) =R\). Most importantly, \(R^{\prime }\) satisfies the local rigidity requirement, since the directional derivative

$$\begin{aligned} \dfrac{dR^{\prime }\left( tU_{\times }\right) }{dt}\bigg \vert _{t=0}=RU_{\times } \end{aligned}$$

is the identity mapping in the tangent space of R for any skew-symmetric matrix \(U_{\times }\). Consequently, the 3 descent directions on the manifold are \(RG_1\), \(RG_2\), \(RG_3\), and they are obtained by differentiating \(R^{\prime }\) at the origin.
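The retraction of Eq. (53) and its local rigidity property can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the exponential map is evaluated with Rodrigues' formula, and the directional derivative is approximated by central differences.

```python
import numpy as np

def skew(u):
    """Skew-symmetric matrix U_x of a 3-vector u."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

def exp_so3(u):
    """Matrix exponential of skew(u) via Rodrigues' formula."""
    theta = np.linalg.norm(u)
    U = skew(u)
    if theta < 1e-12:
        return np.eye(3) + U            # first-order expansion near zero
    return (np.eye(3)
            + (np.sin(theta) / theta) * U
            + ((1.0 - np.cos(theta)) / theta ** 2) * U @ U)

def retract(R, u):
    """Retraction of Eq. (53): perturb R by a rotation applied on the right."""
    return R @ exp_so3(u)

R = exp_so3(np.array([0.3, -0.5, 0.8]))   # arbitrary base rotation
u = np.array([0.2, 0.1, -0.4])            # arbitrary tangent direction
h = 1e-6

# Central-difference approximation of d/dt R'(t U_x) at t = 0.
D = (retract(R, h * u) - retract(R, -h * u)) / (2.0 * h)
print(np.allclose(D, R @ skew(u), atol=1e-8))
```

Choosing `u` as each of the three canonical basis vectors in turn yields the descent directions \(RG_1\), \(RG_2\), \(RG_3\), assuming the generators \(G_i\) are the skew-symmetric matrices of the basis vectors.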

Fig. 9

Steps to convergence for added Gaussian noise up to 2.5 in a synthetic point-set with spread over 10 metric units

The results of Fig. 9 clearly indicate that MRPs and incremental rotations consistently converge within 10 to 20 iterations, while normalized quaternions and axis-angle parameters require roughly 20 to 60 iterations. We conjecture that combining the incremental approach with MRPs would most likely reduce the number of steps to convergence further, but the gain would be marginal. This conjecture is based on the observation (using the formulas in “Appendix A”) that the derivatives of a rotation matrix with respect to MRPs at the origin are scaled multiples of the infinitesimal rotations, namely \(4G_1\), \(4G_2\), \(4G_3\).
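The observation about the derivatives at the origin can be verified numerically without the closed-form expressions of the appendix. The sketch below assumes that the MRP vector is \(\psi = v/(1+s)\) for a unit quaternion with scalar part s and vector part v, and that the generators \(G_i\) are the skew-symmetric matrices of the canonical basis vectors; the function names are hypothetical.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def mrp_to_rot(psi):
    """Rotation matrix of the unit quaternion whose MRPs are psi."""
    n = psi @ psi
    s = (1.0 - n) / (1.0 + n)
    v = 2.0 * psi / (1.0 + n)
    V = skew(v)
    return np.eye(3) + 2.0 * s * V + 2.0 * V @ V

h = 1e-6
derivs = []
for i in range(3):
    e = np.zeros(3)
    e[i] = 1.0
    # Central difference of R(psi) along the i-th MRP axis at psi = 0.
    D = (mrp_to_rot(h * e) - mrp_to_rot(-h * e)) / (2.0 * h)
    derivs.append(D)
    print(i + 1, np.allclose(D, 4.0 * skew(e), atol=1e-6))
```

The factor of 4 is consistent with the fact that, for small rotations, the norm of the MRP vector is approximately one quarter of the rotation angle.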

Fig. 10

Median error versus iteration in the Gauss-Newton method for added Gaussian noise with standard deviation 0, 1.5 and 2.5. a Noise level 0, b noise level 1.5, c noise level 2.5

The evolution of error during the entire Gauss-Newton process is consistent with the convergence rates of Fig. 9. Indicative plots of how the error evolves throughout the Gauss-Newton process for three different added Gaussian noise levels (standard deviation 0, 1.5 and 2.5) are shown in Fig. 10. The error in each step is obtained as the median of the error values in the same step across 20 distinct Gauss-Newton executions. It is clear that MRPs and incremental rotations present similar error curves with a very steep slope, as opposed to the axis-angle parameters and, to a lesser extent, the normalized quaternion; furthermore, this pattern is consistent throughout the various levels of noise. We attribute this behavior primarily to the numerical stability of the rotation derivatives associated with the MRPs and the incremental rotation. In contrast, axis-angle and normalized quaternion Jacobians contain variable quantities in the denominator, and very small or very large values of these quantities may produce unstable descent patterns.

The plateaus followed by abrupt “dives” in the error curves of Fig. 10 most likely correspond to periods in which the Levenberg–Marquardt damping factor grows over successive steps in order to shorten the step size in the search for a better solution. This would account for the nearly constant error values observed over certain periods. Subsequently (but not in all cases), the error curve takes a steep “dive”, corresponding to a step that reached out to an improved solution lying in a steeper region of the search surface.

8.3 Sparse Bundle Adjustment

Given a set of images depicting a number of 3D points from several different viewpoints, bundle adjustment (BA) is the problem of simultaneously refining the 3D coordinates of these points, as well as the parameters of the relative motion and possibly the optical characteristics of the camera(s) employed to acquire the images, according to an optimality criterion involving the cumulative image reprojection error of all points. BA amounts to a large, nonlinear optimization problem on the 3D structure and viewing parameters (i.e., camera pose and possibly intrinsic calibration and radial distortion). It is employed as the last step of most feature-based 3D reconstruction pipelines, since its solution yields a reconstruction which is optimal in the MLE sense under the assumption that the noise pertaining to the observed image features is zero-mean Gaussian [37].

BA constitutes a special type of a nonlinear least squares problem, since the lack of interaction among parameters for different 3D points and cameras results in the underlying normal equations having a special “arrowhead” sparse block structure [35]. sba [37] is a software package that efficiently solves BA using a sparse variant of the Levenberg–Marquardt algorithm that exploits the particular zero pattern of the underlying normal equations.
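To make the “arrowhead” structure concrete, the following toy sketch builds damped normal equations with a BA-like sparsity pattern and solves them through the Schur complement of the point block. It is only an illustration with dense NumPy arrays and random Jacobian entries; it does not reproduce the sba implementation, and all sizes, names and the damping value are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 10          # toy numbers of cameras and 3D points
ca, pb = 6, 3         # parameters per camera and per point

# Each observation (camera j, point i) contributes 2 rows that touch only
# the columns of camera j and of point i.
J = np.zeros((2 * m * n, ca * m + pb * n))
row = 0
for j in range(m):
    for i in range(n):
        J[row:row + 2, ca * j:ca * (j + 1)] = rng.normal(size=(2, ca))
        c0 = ca * m + pb * i
        J[row:row + 2, c0:c0 + pb] = rng.normal(size=(2, pb))
        row += 2

mu = 1e-2                                    # damping term (assumed value)
H = J.T @ J + mu * np.eye(J.shape[1])        # damped normal matrix
g = rng.normal(size=J.shape[1])              # stand-in right-hand side

U, W, V = H[:ca * m, :ca * m], H[:ca * m, ca * m:], H[ca * m:, ca * m:]
ga, gb = g[:ca * m], g[ca * m:]

# V is block diagonal, so it is inverted one 3x3 block at a time.
Vinv = np.zeros_like(V)
for i in range(n):
    sl = slice(pb * i, pb * (i + 1))
    Vinv[sl, sl] = np.linalg.inv(V[sl, sl])

# Reduced camera system: (U - W V^-1 W^T) da = ga - W V^-1 gb.
S = U - W @ Vinv @ W.T
L = np.linalg.cholesky(S)                    # S is symmetric positive definite
rhs = ga - W @ Vinv @ gb
da = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # generic solves, for brevity
db = Vinv @ (gb - W.T @ da)                  # back-substitution for the points

print(np.allclose(np.concatenate([da, db]), np.linalg.solve(H, g)))
```

Because the number of cameras is typically far smaller than the number of points, factorizing the reduced system is much cheaper than factorizing the full normal matrix.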

With the aid of publicly available, real-world datasets from [2], we tested different rotation parameterizations applied to BA. These datasets originate from incremental 3D reconstruction for large-scale community photo collections. More specifically, we used the first two data files from each of the “Ladybug”, “Trafalgar Square”, “Dubrovnik”, “Venice” and “Final” datasets. The sba [37] package was used to optimize those datasets using its default, quaternion-based local rotation parameterization described in Sect. 3.2. We also adapted sba to employ a global rotation parameterization based on the MRPs and compared it against the local quaternion parameterization. The results of the comparison are illustrated in Fig. 11, which shows the execution times and the number of iterations for the two rotation parameterizations.

In all applications of sba, the default convergence parameters were employed; in particular, the maximum number of iterations was set to 150. With the exception of the last data file (namely ‘394-100368’), both parameterizations converged to the same global minimum. In the last dataset, the parameterization employing MRPs required roughly four times more iterations but converged to a better minimum, which corresponded to over 60% lower average reprojection error compared to that obtained with quaternions. For the majority of datasets, both parameterizations required very similar numbers of iterations to converge. However, MRPs converged with noticeably fewer iterations for two datasets. The execution times were generally lower for the parameterization based on MRPs, owing to the simpler calculations involved in the evaluation of the image projections and their derivatives. Still, we note that the execution time for each iteration of sba is dominated by the time needed for the linear algebra operations involved in the solution of the normal equations (in particular, the Cholesky factorization of the Schur complement, cf. [37]), which does not depend on the choice of rotation parameterization.

Fig. 11

Execution times for various BA datasets and two rotation parameterizations (orange for MRPs and blue for quaternions). Notice the logarithmic scale in the vertical axis. Following [2], each dataset in the horizontal axis is labeled as C-P, C and P being, respectively, the numbers of cameras and 3D points it involves. The actual execution times are shown in bold near the top of every bar. The numbers in italics near the bar bottoms are the iterations needed for convergence. The increased execution time required by MRPs for the rightmost dataset is due to the fact that they required 4 times more iterations but converged to a much better minimum (Color figure online)

8.4 Exterior Orientation

Exterior orientation refers to the estimation of the position and orientation of a camera given its intrinsic parameters and a set of \(n\ge 3\) correspondences between known 3D points and their 2D image projections. This problem, also known as the PnP or camera resection problem (footnote 8), has received much attention due to its applicability in various domains. Exterior orientation is typically dealt with by embedding minimal-size PnP solvers in robust regression frameworks such as RANSAC (see [38] and references therein). However, as minimal solutions ignore much of the redundancy present in the data, they suffer from inaccuracies. To remedy this, an additional step comprised of nonlinear optimization with the Levenberg–Marquardt algorithm is employed to minimize the reprojection error pertaining to all inliers [38].

Starting with the datasets employed for bundle adjustment in Sect. 8.3, we extracted the 3D points projecting to their first frames. Then, those 3D points along with their projections in the first frame of each dataset were used for estimating the corresponding camera poses using the posest library implementing [38]. We also modified the nonlinear refinement step of posest to employ a rotation parameterization based on MRPs and compared it with its native axis-angle parameterization. Since the execution times for pose estimation are in most cases very small, and in order to accurately measure them, each pose estimation was run 100 times and the elapsed time scaled accordingly.

Figure 12 shows the execution times and the number of iterations for both rotation parameterizations. Similarly to the BA experiment described above, all optimizations converged to the same poses for both parameterizations. However, the execution times pertaining to MRPs are shorter, despite the fact that the number of iterations is occasionally slightly higher than that required by the axis-angle parameterization. Compared to the BA experiment, the difference between the execution times of the two parameterizations is more evident. This is due to the small size of the nonlinear minimization of the single-view reprojection error and its consequent low computational cost, and clearly demonstrates the performance benefits gained by the use of MRPs.

Fig. 12

Execution times for exterior orientation problems corresponding to the first frames of the datasets employed in Sect. 8.3, using two rotation parameterizations (orange for MRPs and blue for axis-angle). The actual execution times are shown in bold near the top of every bar, whereas the number of iterations are in italics at the bar bottoms (Color figure online)

9 Conclusion

Modified Rodrigues parameters are a formalism for the representation of orientation based on stereographic projection, originally introduced in the field of aerospace engineering by Wiener [68] in 1962. Stereographic projection is a well-established mathematical construct with primarily theoretical applications in complex analysis, topology and projective geometry. However, the practical significance of this mapping in applied fields such as computer vision, graphics and robotics has been overlooked.

This paper has advocated the use of MRPs for parameterizing rotations in problems arising in the fields of computer graphics and vision. Its primary objective is to familiarize the community with this formalism from the aspect of practical applications involving the recovery and/or interpolation of orientation by emphasizing its graceful properties not only as a rational parameterization but also in terms of differentiation.

In particular, it was shown that the Jacobian of a quaternion is not only a rational function of its MRPs, but also a polynomial function of its scalar and vector parts. This is favorable from the perspective of nonlinear optimization problems involving the recovery of orientation, considering that the Jacobians corresponding to parametric unknowns such as normalized quaternions or axis-angle vectors occasionally yield highly complicated and non-rational expressions. In addition to the succinct Jacobian, it was shown that the update of a quaternion from a perturbation in its MRPs does not require the use of the actual parameter vector. This means that there is no need to move through parameter spaces in iterative optimization, which is also an important benefit from a numerical and algorithmic standpoint. To support our claims, this paper has also provided experimental evidence regarding the practical advantages stemming from the use of MRPs in small as well as large-scale iterative optimization in classic problems in 3D computer vision.

Further advantages of MRPs include the flexibility in constructing smooth quaternion curves with minimal distortion in more intuitive ways. Specifically, we presented a novel general strategy for designing quaternion splines in the hyperplane by interpolating not only the key points, but also the derivatives of the spherical curve while working on its projection in the hyperplane. This yields smooth rational interpolants with minimal perspective distortion that are very competitive with popular algorithms such as spherical quadrangle interpolation (\(\textsc {squad}\)).

In conclusion, we briefly summarize the benefits of MRPs and stereographic projection as an orientation parameterization scheme. It is a multi-purpose tool with convenient properties that allows for less complicated solutions to otherwise difficult, nonlinear or even intractable problems, and offers efficiency up to and beyond the standards of existing solutions as well as simplicity of design and implementation. C++ and Matlab code implementing most formulas in the paper is available in the following repository: https://github.com/terzakig/Quaternion.