Introduction

The Helmert transformation (3D seven-parameter similarity transformation) is frequently encountered in geodesy, photogrammetry, geographical information science, mapping, engineering surveying, machine vision, etc. (see, e.g., Aktuğ 2009; Aktuğ 2012; Akyilmaz 2007; Arun et al. 1987; Burša 1967; Chang 2015; Chang et al. 2017; El-Habiby et al. 2009; El-Mowafy et al. 2009; Grafarend and Awange 2003; Han 2010; Horn 1986, 1987; Han and Van Gelder 2006; Horn et al. 1988; Jaw and Chuang 2008; Jitka 2011; Kashani 2006; Krarup 1985; Leick 2004; Leick and van Gelder 1975; Mikhail et al. 2001; Neitzel 2010; Soler 1998; Soler and Snay 2004; Soycan and Soycan 2008; Teunissen 1986; Teunissen 1988; Wang et al. 2014; Závoti and Kalmár 2016; Zeng 2014; Zeng 2015; Zeng et al. 2016; Zeng and Yi 2011). The Helmert transformation problem is to determine the seven transformation parameters, namely three rotation angles, three translation parameters and one scale factor, from a set of control points. At least three control points are required, because each point contributes three equations and seven unknowns must be determined. Numerous algorithms for the Helmert transformation have been presented. On the one hand, they can be classified into numerical iterative algorithms, e.g., El-Habiby et al. (2009), Zeng and Yi (2011), Paláncz et al. (2013), Zeng et al. (2016), etc., and analytical algorithms, e.g., Grafarend and Awange (2003), Shen et al. (2006a, b), Zeng (2015), etc. Numerical iterative algorithms usually require initial values of the transformation parameters; however, if a global optimization algorithm is employed to recover the transformation parameters, no initial values are needed (see, e.g., Xu 2002, 2003a, b). Analytical algorithms are fast and reliable because they need no iterative computation.
On the other hand, the algorithms can be classified by the representation of the rotation matrix: algorithms based on Eulerian angles (see, e.g., Zeng and Tao 2003; El-Habiby et al. 2009), algorithms based on quaternions (see, e.g., Horn 1987; Shen et al. 2006a, b; Zeng and Yi 2011), algorithms based on the Rodrigues matrix and Gibbs vector (see, e.g., Zeng and Huang 2008; Zeng and Yi 2010; Zeng et al. 2016), algorithms based on dual quaternions (see, e.g., Walker et al. 1991; Jitka 2011) and algorithms which treat the rotation matrix itself as the unknown and solve for it directly (see, e.g., Arun et al. 1987; Grafarend and Awange 2003; Zeng 2015).

Jitka (2011) presents a dual quaternion algorithm for geodetic datum transformation; however, the algorithm solves the minimization problem with a nonlinear method, i.e., Lagrange multipliers with eight variables and two constraint equations, so its solution is complex and the solution process is not explicit. The feasibility of this algorithm for transformations with large rotation angles has not been verified; moreover, the algorithm does not handle observation weights. Walker et al. (1991) present a dual quaternion algorithm to estimate 3D location parameters, including both position and direction information; however, it does not consider the scale factor of the 3D coordinate transformation, so it is not suitable for the Helmert transformation. Motivated by these studies, this paper formulates the Helmert transformation with dual quaternions and presents a new dual quaternion algorithm that has explicit computation steps, is free of the initial value problem for the transformation parameters, computes quickly and gives reliable results no matter how large the rotation angles are. It can also handle different observation weights, since different control points usually have different positioning accuracy.

The remainder of the paper is organized as follows. In the “Mathematical model and algorithm of Helmert transformation based on dual quaternion” section, the Helmert transformation and the dual quaternion (and its prerequisite, the quaternion) are first introduced briefly. Secondly, the mathematical model of the Helmert transformation based on the dual quaternion is established, and a new dual quaternion algorithm for the Helmert transformation is derived by the Lagrangian extremum law and eigenvalue–eigenvector decomposition. Lastly, the computation of the initial value of the scale is introduced. In the “Case study and discussion” section, an actual case with small rotation angles and a simulated case with large rotation angles are presented to verify the algorithm. The last section, “Conclusion”, draws the conclusions.

Mathematical model and algorithm of Helmert transformation based on dual quaternion

Helmert transformation

The Helmert transformation (i.e., seven-parameter similarity transformation) model can be written as

$${\mathbf{p}}_{i}^{\text{t}} = \lambda {\mathbf{Rp}}_{i}^{\text{o}} + {\mathbf{t}},$$
(1)

subject to

$${\mathbf{R}}^{\text{T}} {\mathbf{R}} = {\mathbf{I}},\;\det ({\mathbf{R}}) = 1 ,$$
(2)

where \({\mathbf{p}}_{i}^{\text{o}} = \left[ {\begin{array}{*{20}c} {x_{i}^{\text{o}} } & {y_{i}^{\text{o}} } & {z_{i}^{\text{o}} } \\ \end{array} } \right]^{\text{T}}\) and \({\mathbf{p}}_{i}^{\text{t}} = \left[ {\begin{array}{*{20}c} {x_{i}^{\text{t}} } & {y_{i}^{\text{t}} } & {z_{i}^{\text{t}} } \\ \end{array} } \right]^{\text{T}}\) (\(i = 1,2, \ldots ,n\)) are the 3D coordinate vectors of a control point in the original coordinate system (labeled with superscript \({\text{o}}\)) and the target coordinate system (labeled with superscript \({\text{t}}\)), respectively. \({\mathbf{I}}\) is the 3 × 3 identity matrix, superscript \({\text{T}}\) denotes the transpose, and \(\det\) denotes the determinant of a matrix. \(\lambda\) is the scale parameter, \({\mathbf{t}} = \left[ {\begin{array}{*{20}c} {t_{x} } & {t_{y} } & {t_{z} } \\ \end{array} } \right]^{\text{T}}\) contains the three translation parameters, and \({\mathbf{R}}\) is the 3 × 3 rotation matrix, which is generated by the three rotation angles. Assuming \({\mathbf{R}}\) is formed by rotating through the (Eulerian) angles \(\theta_{x}\), \(\theta_{y}\), \(\theta_{z}\) counterclockwise about the X, Y and Z axes, respectively, \({\mathbf{R}}\) can be expressed as

$${\mathbf{R}} = \left[ {\begin{array}{*{20}c} {\cos \theta_{z} \cos \theta_{y} } & {\sin \theta_{z} \cos \theta_{x} + \cos \theta_{z} \sin \theta_{y} \sin \theta_{x} } & {\sin \theta_{z} \sin \theta_{x} - \cos \theta_{z} \sin \theta_{y} \cos \theta_{x} } \\ { - \sin \theta_{z} \cos \theta_{y} } & {\cos \theta_{z} \cos \theta_{x} - \sin \theta_{z} \sin \theta_{y} \sin \theta_{x} } & {\cos \theta_{z} \sin \theta_{x} + \sin \theta_{z} \sin \theta_{y} \cos \theta_{x} } \\ {\sin \theta_{y} } & { - \cos \theta_{y} \sin \theta_{x} } & {\cos \theta_{y} \cos \theta_{x} } \\ \end{array} } \right].$$
(3)

Conversely, if the rotation matrix \({\mathbf{R}}\) is given, the rotation angles \(\theta_{x}\), \(\theta_{y}\), \(\theta_{z}\) can be computed from Eq. (3) as

$$\theta_{x} = - \tan^{ - 1} \frac{{{\mathbf{R}}_{32} }}{{{\mathbf{R}}_{33} }},\;\theta_{y} = \sin^{ - 1} ({\mathbf{R}}_{31} ),\;\theta_{z} = - \tan^{ - 1} \frac{{{\mathbf{R}}_{21} }}{{{\mathbf{R}}_{11} }} .$$
(4)

where \({\mathbf{R}}_{ij}\) is the element of \({\mathbf{R}}\) in the ith row and jth column.
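The round trip between Eqs. (3) and (4) can be illustrated with a minimal sketch. The function names and the test angles below are hypothetical, and `atan2` is used in place of \(\tan^{-1}\) so that the quadrant is resolved, which assumes \(\cos \theta_{y} > 0\).

```python
import math

def rotation_from_euler(tx, ty, tz):
    """Rotation matrix of Eq. (3); angles in radians."""
    cx, sx = math.cos(tx), math.sin(tx)
    cy, sy = math.cos(ty), math.sin(ty)
    cz, sz = math.cos(tz), math.sin(tz)
    return [
        [cz * cy, sz * cx + cz * sy * sx, sz * sx - cz * sy * cx],
        [-sz * cy, cz * cx - sz * sy * sx, cz * sx + sz * sy * cx],
        [sy, -cy * sx, cy * cx],
    ]

def euler_from_rotation(R):
    """Angles of Eq. (4); atan2 is an assumption that requires cos(theta_y) > 0."""
    tx = math.atan2(-R[2][1], R[2][2])   # theta_x = -atan(R32 / R33)
    ty = math.asin(R[2][0])              # theta_y = asin(R31)
    tz = math.atan2(-R[1][0], R[0][0])   # theta_z = -atan(R21 / R11)
    return tx, ty, tz

# Hypothetical angles, chosen with |theta_y| < pi/2.
angles = (0.3, -0.5, 2.0)
R = rotation_from_euler(*angles)
recovered = euler_from_rotation(R)
```

With \(\left| \theta_{y} \right| < \pi /2\) the recovered angles agree with the input up to rounding; near \(\theta_{y} = \pm \pi /2\) the ratios in Eq. (4) become ill-conditioned.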

Helmert transformation aims to recover the seven parameters, i.e., one scale factor, three translation parameters and three rotation angles based on Eqs. (1) and (2) given the coordinates of at least three control points.

Quaternion and dual quaternion

The quaternion was invented by the Irish mathematician Hamilton in 1843 and is generally expressed as

$$\varvec{q} = iq_{ 1} + jq_{ 2} + kq_{ 3} + q_{ 4} ,$$
(5)

where \(q_{1}\), \(q_{2}\), \(q_{3}\), \(q_{4}\) are real numbers, and \(i\), \(j\) and \(k\) are the basic quaternion units, which satisfy the properties: ① \(i^{2} = j^{2} = k^{2} = - 1\), ② \(ij = - ji = k\), ③ \(jk = - kj = i\), ④ \(ki = - ik = j\).

Usually quaternion \(\varvec{q}\) is also expressed in the vector form as

$$\varvec{q} = \left[ {\begin{array}{*{20}c} {\mathbf{q}} \\ {q_{ 4} } \\ \end{array} } \right],\quad {\mathbf{q}} = \left[ {\begin{array}{*{20}c} {q_{ 1} } \\ {q_{ 2} } \\ {q_{ 3} } \\ \end{array} } \right],$$
(6)

where \({\mathbf{q}}\) is called the vector part and \(q_{4}\) the scalar part. If the scalar part is zero and \({\mathbf{q}}\) is a nonzero vector, \(\varvec{q}\) is called pure imaginary. The conjugate quaternion of \(\varvec{q}\) is defined as

$$\varvec{q}^{*} = - iq_{1} - jq_{2} - kq_{3} + q_{4} .$$
(7)

The norm of quaternion \(\varvec{q}\) is defined as

$$\left\| \varvec{q} \right\| = \sqrt {\varvec{qq}^{ * } } = \sqrt {q_{1}^{2} + q_{2}^{2} + q_{3}^{2} + q_{4}^{2} } .$$
(8)

If \(\left\| \varvec{q} \right\| = 1\), \(\varvec{q}\) is called unit quaternion. Based on the above definitions, it is easy to prove the properties:

$$\varvec{p} + \varvec{q} = \left( {\begin{array}{*{20}c} {{\mathbf{p}} + {\mathbf{q}}} \\ {p_{4} + q_{4} } \\ \end{array} } \right),$$
(9)
$$\varvec{pq} = \left( {\begin{array}{*{20}c} {{\mathbf{p}} \times {\mathbf{q}} + p_{4} {\mathbf{q}} + q_{4} {\mathbf{p}}} \\ {p_{4} q_{4} - {\mathbf{p}} \cdot {\mathbf{q}}} \\ \end{array} } \right),$$
(10)
$$\left( {\varvec{pq}} \right)^{ * } = \varvec{q}^{ * } \varvec{p}^{ * } ,$$
(11)
$$\varvec{q}^{ - 1} = \frac{{\varvec{q}^{ * } }}{{\left\| \varvec{q} \right\|^{2} }},$$
(12)

where \(\varvec{p}\) and \(\varvec{q}\) are arbitrary quaternions (nonzero in Eq. (12)) and \(\varvec{q}^{ - 1}\) is the inverse of \(\varvec{q}\). The symbols \(\cdot\) and × denote the dot and cross product, respectively, which for vectors are defined as

$${\mathbf{p}} \cdot {\mathbf{q}} = {\mathbf{p}}^{\text{T}} {\mathbf{q}} ,$$
(13)
$$\begin{aligned} {\mathbf{p}} \times {\mathbf{q}} & = {\mathbf{C}}({\mathbf{p}}){\mathbf{q}} \\ & = \left[ {\begin{array}{*{20}c} 0 & { - p_{3} } & {p_{2} } \\ {p_{3} } & 0 & { - p_{1} } \\ { - p_{2} } & {p_{1} } & 0 \\ \end{array} } \right]{\mathbf{q}}, \\ \end{aligned}$$
(14)

where \({\mathbf{C}}({\mathbf{p}})\) is a skew-symmetric matrix. The product \(\varvec{pq}\) can also be written as the product of a matrix and a quaternion in vector form as

$$\varvec{pq} = Q(\varvec{p})\varvec{q} = W(\varvec{q})\varvec{p} ,$$
(15)

where

$$Q(\varvec{p}) = \left[ {\begin{array}{*{20}c} {p_{4} {\mathbf{I}} + {\mathbf{C}}({\mathbf{p}})} & {\mathbf{p}} \\ { - {\mathbf{p}}^{T} } & {p_{4} } \\ \end{array} } \right] ,$$
(16)
$${\mathbf{W}}(\varvec{q}) = \left[ {\begin{array}{*{20}c} {q_{4} {\mathbf{I}} - {\mathbf{C}}({\mathbf{q}})} & {\mathbf{q}} \\ { - {\mathbf{q}}^{T} } & {q_{4} } \\ \end{array} } \right] .$$
(17)
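The equivalence of Eqs. (10) and (15) can be checked numerically. The sketch below assumes NumPy and stores the scalar part last, as in Eq. (6); the helper names `skew`, `Q`, `W` and `qmul` and the sample quaternions are illustrative only.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix C(v) of Eq. (14), so that skew(v) @ w equals v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def Q(p):
    """Matrix of Eq. (16): Q(p) @ q equals the quaternion product pq."""
    v = p[:3]
    return np.block([[p[3] * np.eye(3) + skew(v), v[:, None]],
                     [-v[None, :], np.array([[p[3]]])]])

def W(q):
    """Matrix of Eq. (17): W(q) @ p equals the quaternion product pq."""
    v = q[:3]
    return np.block([[q[3] * np.eye(3) - skew(v), v[:, None]],
                     [-v[None, :], np.array([[q[3]]])]])

def qmul(p, q):
    """Quaternion product of Eq. (10); inputs are [vector part, scalar part]."""
    vec = np.cross(p[:3], q[:3]) + p[3] * q[:3] + q[3] * p[:3]
    return np.append(vec, p[3] * q[3] - p[:3] @ q[:3])

# Arbitrary (hypothetical) quaternions.
p = np.array([1.0, -2.0, 0.5, 3.0])
q = np.array([0.2, 0.4, -1.0, 2.0])
pq = qmul(p, q)

# Basic units: i * j should give k.
i_unit = np.array([1.0, 0.0, 0.0, 0.0])
j_unit = np.array([0.0, 1.0, 0.0, 0.0])
ij = qmul(i_unit, j_unit)
```

Writing the product as a matrix acting on either factor is what later allows the error function to be expressed as a quadratic form in \(\varvec{r}\) and \(\varvec{s}\).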

A dual quaternion is written as follows.

$$\begin{aligned} \bar{\varvec{q}} & = \varvec{r} + \varepsilon \varvec{s} \\ & = (r_{ 1} + \varepsilon s_{ 1} )i + (r_{ 2} + \varepsilon s_{ 2} )j + (r_{ 3} + \varepsilon s_{ 3} )k + (r_{ 4} + \varepsilon s_{ 4} ) \\ & = q_{\text{d1}} i + q_{\text{d2}} j + q_{\text{d3}} k + q_{\text{d4}} \\ \end{aligned} ,$$
(18)

where \(\varvec{r}\) and \(\varvec{s}\) are quaternions and \(\varepsilon\) is the dual unit, which has the property \(\varepsilon^{2} = 0\) and commutes with the quaternion units. \(q_{\text{d4}}\) is the scalar part (a dual number) and \(\left( {\begin{array}{*{20}c} {q_{\text{d1}} } & {q_{\text{d2}} } & {q_{\text{d3}} } \\ \end{array} } \right)^{\text{T}}\) is the vector part (a vector of dual numbers). The product of the dual quaternions \(\bar{\varvec{p}} = \varvec{u} + \varepsilon \varvec{v}\) and \(\bar{\varvec{q}}\) is defined as

$$\varvec{\bar{p}\bar{q}} = \varvec{ur} + \varepsilon (\varvec{us} + \varvec{vr}) .$$
(19)

The conjugate of a dual quaternion is defined based on the conjugate of quaternion as follows.

$$\bar{\varvec{q}}^{ * } = \varvec{r}^{ * } + \varepsilon \varvec{s}^{ * } .$$
(20)

And the norm of a dual quaternion is a dual scalar and is defined as

$$\left\| {\bar{\varvec{q}}} \right\| = \sqrt {\bar{\varvec{q}}^{ * } \bar{\varvec{q}}} = \sqrt {q_{{{\text{d}}1}}^{2} + q_{{{\text{d}}2}}^{2} + q_{{{\text{d}}3}}^{2} + q_{{{\text{d}}4}}^{2} } .$$
(21)

If \(\left\| {\bar{\varvec{q}}} \right\| = 1\), \(\bar{\varvec{q}}\) is called a unit dual quaternion, which means that it meets the conditions

$$\varvec{r}{}^{\text{T}}\varvec{r} = 1,$$
(22)
$$\varvec{r}^{\text{T}}\varvec{s} = 0 .$$
(23)

Unit dual quaternion \(\bar{\varvec{q}}\) can be elegantly employed to represent the rigid transformation including rotation about an axis and translation along this axis (e.g., Walker et al. 1991; Jitka 2011). The rotation matrix can be expressed as (Zeng and Yi 2011; Jitka 2011)

$${\mathbf{R}} = \left( {r_{ 4}^{ 2} - {\mathbf{r}}^{\text{T}} {\mathbf{r}}} \right){\mathbf{I}} + 2\left( {{\mathbf{rr}}^{\text{T}} + r_{ 4} {\mathbf{C}}({\mathbf{r}})} \right) ,$$
(24)

or

$$\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{0}} \\ {{\mathbf{0}}^{\text{T}} } & 1 \\ \end{array} } \right] = {\mathbf{W}}(\varvec{r})^{\text{T}} Q(\varvec{r}) .$$
(25)
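As an illustration, Eqs. (24) and (25) can be compared numerically for a unit quaternion. The sketch assumes NumPy; the helper names and the sample quaternion are hypothetical.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix C(v) of Eq. (14)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def Q(p):
    """Matrix of Eq. (16)."""
    v = p[:3]
    return np.block([[p[3] * np.eye(3) + skew(v), v[:, None]],
                     [-v[None, :], np.array([[p[3]]])]])

def W(q):
    """Matrix of Eq. (17)."""
    v = q[:3]
    return np.block([[q[3] * np.eye(3) - skew(v), v[:, None]],
                     [-v[None, :], np.array([[q[3]]])]])

def rotation_from_quaternion(r):
    """Rotation matrix of Eq. (24) for a unit quaternion r = [r1, r2, r3, r4]."""
    rv, r4 = r[:3], r[3]
    return (r4**2 - rv @ rv) * np.eye(3) + 2.0 * (np.outer(rv, rv) + r4 * skew(rv))

# Hypothetical quaternion, normalized so that Eq. (22) holds.
r = np.array([0.1, -0.7, 0.3, 0.5])
r = r / np.linalg.norm(r)

R = rotation_from_quaternion(r)   # Eq. (24)
M = W(r).T @ Q(r)                 # right-hand side of Eq. (25)
```

The upper-left 3 × 3 block of `M` coincides with `R`, and `R` satisfies the constraints of Eq. (2).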

Suppose \(\varvec{t}\) is the pure imaginary quaternion made of translation \({\mathbf{t}}\), i.e.,

$$\varvec{t} = \frac{1}{2}\left[ {\begin{array}{*{20}c} {t_{x} } \\ {t_{y} } \\ {t_{z} } \\ 0 \\ \end{array} } \right] = \frac{1}{2}\left[ {\begin{array}{*{20}c} {\mathbf{t}} \\ 0 \\ \end{array} } \right] ,$$
(26)

thus according to Jitka (2011), one can get

$$\varvec{tr} = \varvec{s} ,$$
(27)
$$\varvec{t} = \varvec{sr}^{ - 1} = \varvec{s}\frac{{\varvec{r}^{*} }}{{\left\| \varvec{r} \right\|^{2} }} = \varvec{sr}^{*} = {\mathbf{W}}(\varvec{r}^{*} )\varvec{s} = {\mathbf{W}}(\varvec{r})^{\text{T}} \varvec{s} .$$
(28)
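A short sketch of Eqs. (26) to (28), assuming NumPy: a dual part \(\varvec{s}\) is built from a hypothetical rotation quaternion and translation, the unit conditions of Eqs. (22) and (23) are checked, and the translation is recovered. All names and numbers are illustrative.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix C(v) of Eq. (14)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def W(q):
    """Matrix of Eq. (17)."""
    v = q[:3]
    return np.block([[q[3] * np.eye(3) - skew(v), v[:, None]],
                     [-v[None, :], np.array([[q[3]]])]])

def qmul(p, q):
    """Quaternion product of Eq. (10), scalar part stored last."""
    vec = np.cross(p[:3], q[:3]) + p[3] * q[:3] + q[3] * p[:3]
    return np.append(vec, p[3] * q[3] - p[:3] @ q[:3])

# Hypothetical rotation quaternion (normalized) and translation vector.
r = np.array([0.3, -0.2, 0.6, 0.7])
r = r / np.linalg.norm(r)
t_vec = np.array([10.0, -4.0, 2.5])

t_quat = 0.5 * np.append(t_vec, 0.0)   # Eq. (26): pure quaternion with factor 1/2
s = qmul(t_quat, r)                    # Eq. (27): s = t r

unit_ok = np.isclose(r @ r, 1.0)       # Eq. (22)
ortho_ok = np.isclose(r @ s, 0.0)      # Eq. (23)
t_back = 2.0 * (W(r).T @ s)            # Eq. (28), multiplied by 2 to undo Eq. (26)
```

The orthogonality condition of Eq. (23) holds automatically here because \(\varvec{t}\) is pure imaginary.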

Formulation and a dual quaternion algorithm of Helmert transformation

The 3D coordinate vector quaternion can be defined as

$$\varvec{p}_{i}^{\text{o}} = \left[ {\begin{array}{*{20}c} {x_{i}^{\text{o}} } \\ {y_{i}^{\text{o}} } \\ {z_{i}^{\text{o}} } \\ 0 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\mathbf{p}}_{i}^{\text{o}} } \\ 0 \\ \end{array} } \right],\quad \varvec{p}_{i}^{\text{t}} = \left[ {\begin{array}{*{20}c} {x_{i}^{\text{t}} } \\ {y_{i}^{\text{t}} } \\ {z_{i}^{\text{t}} } \\ 0 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\mathbf{p}}_{i}^{\text{t}} } \\ 0 \\ \end{array} } \right],$$
(29)

thus Eq. (1) can be rewritten in the quaternion form as

$$\varvec{p}_{i}^{\text{t}} = \lambda \left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{0}} \\ {{\mathbf{0}}^{\text{T}} } & 1 \\ \end{array} } \right]\varvec{p}_{i}^{\text{o}} + 2\varvec{t}.$$
(30)

Substituting Eqs. (25) and (28) into Eq. (30), one gets

$$\varvec{p}_{i}^{\text{t}} = \lambda {\mathbf{W}}(\varvec{r})^{\text{T}} Q(\varvec{r})\varvec{p}_{i}^{\text{o}} + 2{\mathbf{W}}(\varvec{r})^{\text{T}} \varvec{s}.$$
(31)

Considering the transformation error, Eq. (31) can be rewritten as

$$\varvec{p}_{i}^{\text{t}} = \lambda {\mathbf{W}}(\varvec{r})^{\text{T}} Q(\varvec{r})\varvec{p}_{i}^{\text{o}} + 2{\mathbf{W}}(\varvec{r})^{\text{T}} \varvec{s} + \varvec{e}_{i} ,$$
(32)

where \(\varvec{e}_{i}\) is the transformation error vector. Thus, the least squares solution of the Helmert transformation problem is essentially the constrained extremum problem

$$\mathop {\hbox{min} }\limits_{{\varvec{r},\varvec{s},\lambda }} \left\{ {e = \sum\limits_{i = 1}^{n} {\alpha_{i} } \varvec{e}_{i}^{\text{T}} \varvec{e}_{i} } \right\},$$
(33)

subject to Eqs. (22) and (23). In Eq. (33), \(\alpha_{i}\) is the point-wise weight. To solve this constrained extremum problem, it is transformed into an unconstrained one by adding the constraints to the error function \(e\) as

$$\mathop {\hbox{min} }\limits_{{\varvec{r},\varvec{s},\lambda ,\beta_{ 1} ,\beta_{ 2} }} \left\{ {\tilde{e} = \sum\limits_{i = 1}^{n} {\alpha_{i} } \varvec{e}_{i}^{\text{T}} \varvec{e}_{i} + \beta_{ 1} \left( {\varvec{r}^{\text{T}} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\text{T}} \varvec{r}} \right\},$$
(34)

where \(\beta_{ 1}\) and \(\beta_{ 2}\) are the Lagrangian multipliers. We can obtain the expression of \(\varvec{e}_{i}\) from Eq. (32), substitute it into error function \(\tilde{e}\) and expand the error function \(\tilde{e}\), thus

$$\begin{aligned} \widetilde{e} &= \sum\limits_{i = 1}^{n} \alpha_{i} \left( {\varvec{p}_{i}^{\rm t} - \lambda {\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} - 2{\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s}} \right)^{\rm T} \left( {\varvec{p}_{i}^{\rm t} - \lambda {\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} - 2{\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s}} \right) + \beta_{ 1} \left( {\varvec{r}^{\rm T} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\rm T} \varvec{r} \\ &= \sum\limits_{i = 1}^{n} \alpha_{i} \left( {{\varvec{p}_{i}^{\rm t}}^{\rm T} - \lambda {\varvec{p}_{i}^{\rm o}}^{\rm T} Q(\varvec{r})^{\rm T} {\mathbf{W}}(\varvec{r}) - 2\varvec{s}^{\rm T} {\mathbf{W}}(\varvec{r})} \right)\left( {\varvec{p}_{i}^{\rm t} - \lambda {\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} - 2{\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s}} \right) + \beta_{ 1} \left( {\varvec{r}^{\rm T} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\rm T} \varvec{r} \\ &= \sum\limits_{i = 1}^{n} \alpha_{i} \left( {\begin{array}{*{20}l} {{\varvec{p}_{i}^{\rm t}}^{\rm T} \varvec{p}_{i}^{\rm t} + \lambda^{ 2} {\varvec{p}_{i}^{\rm o}}^{\rm T} Q(\varvec{r})^{\rm T} {\mathbf{W}}(\varvec{r}){\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} + 4\varvec{s}^{\rm T} {\mathbf{W}}(\varvec{r}){\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s}} \\ { - \lambda {\varvec{p}_{i}^{\rm t}}^{\rm T} {\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} - 2{\varvec{p}_{i}^{\rm t}}^{\rm T} {\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s} - \lambda {\varvec{p}_{i}^{\rm o}}^{\rm T} Q(\varvec{r})^{\rm T} {\mathbf{W}}(\varvec{r})\varvec{p}_{i}^{\rm t} } \\ { + 2\lambda {\varvec{p}_{i}^{\rm o}}^{\rm T} Q(\varvec{r})^{\rm T} {\mathbf{W}}(\varvec{r}){\mathbf{W}}(\varvec{r})^{\rm T} \varvec{s} - 2\varvec{s}^{\rm T} {\mathbf{W}}(\varvec{r})\varvec{p}_{i}^{\rm t} + 2\lambda \varvec{s}^{\rm T} {\mathbf{W}}(\varvec{r}){\mathbf{W}}(\varvec{r})^{\rm T} Q(\varvec{r})\varvec{p}_{i}^{\rm o} } \\ \end{array} } \right) + \beta_{ 1} \left( {\varvec{r}^{\rm T} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\rm T} \varvec{r} \\ &= \sum\limits_{i = 1}^{n} \alpha_{i} \left( {\begin{array}{*{20}l} {{\varvec{p}_{i}^{\rm t}}^{\rm T} \varvec{p}_{i}^{\rm t} + \lambda^{ 2} {\varvec{p}_{i}^{\rm o}}^{\rm T} \varvec{p}_{i}^{\rm o} + 4\varvec{s}^{\rm T} \varvec{s} - \lambda \varvec{r}^{\rm T} Q(\varvec{p}_{i}^{\rm t} )^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )\varvec{r} - 2\varvec{r}^{\rm T} Q(\varvec{p}_{i}^{\rm t} )^{\rm T} \varvec{s}} \\ { - \lambda \varvec{r}^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )^{\rm T} Q(\varvec{p}_{i}^{\rm t} )\varvec{r} + 2\lambda \varvec{r}^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )^{\rm T} \varvec{s} - 2\varvec{s}^{\rm T} Q(\varvec{p}_{i}^{\rm t} )\varvec{r} + 2\lambda \varvec{s}^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )\varvec{r}} \\ \end{array} } \right) + \beta_{ 1} \left( {\varvec{r}^{\rm T} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\rm T} \varvec{r} \\ &= \sum\limits_{i = 1}^{n} \alpha_{i} \left( {{\varvec{p}_{i}^{\rm t}}^{\rm T} \varvec{p}_{i}^{\rm t} + \lambda^{ 2} {\varvec{p}_{i}^{\rm o}}^{\rm T} \varvec{p}_{i}^{\rm o} + 4\varvec{s}^{\rm T} \varvec{s} - 2\lambda \varvec{r}^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )^{\rm T} Q(\varvec{p}_{i}^{\rm t} )\varvec{r} - 4\varvec{s}^{\rm T} Q(\varvec{p}_{i}^{\rm t} )\varvec{r} + 4\lambda \varvec{s}^{\rm T} {\mathbf{W}}(\varvec{p}_{i}^{\rm o} )\varvec{r}} \right) + \beta_{ 1} \left( {\varvec{r}^{\rm T} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\rm T} \varvec{r}. \\ \end{aligned}$$
(35)

The derivation of Eq. (35) makes use of the unit constraint \(\varvec{r}^{\text{T}} \varvec{r} = 1\) of Eq. (22) and the following properties of quaternions:

$$Q(\varvec{u})^{T} Q(\varvec{u}) = Q(\varvec{u})Q(\varvec{u})^{T} = \varvec{u}^{T} \varvec{u}{\mathbf{I}},$$
(36)
$${\mathbf{W}}(\varvec{u})^{T} {\mathbf{W}}(\varvec{u}) = {\mathbf{W}}(\varvec{u}){\mathbf{W}}(\varvec{u})^{T} = \varvec{u}^{T} \varvec{u}{\mathbf{I}},$$
(37)
$$Q(\varvec{u})\varvec{v} = {\mathbf{W}}(\varvec{v})\varvec{u},$$
(38)

where \(\varvec{u}\) and \(\varvec{v}\) are arbitrary quaternions.

Let

$$a = \sum\limits_{i = 1}^{n} {\alpha_{i} } {\varvec{p}_{i}^{\rm t}}^{\rm T} \varvec{p}_{i}^{\rm t} ,$$
(39)
$$b = \sum\limits_{i = 1}^{n} {\alpha_{i} } {\varvec{p}_{i}^{\rm o}}^{\rm T} \varvec{p}_{i}^{\text{o}} ,$$
(40)
$$c = \sum\limits_{i = 1}^{n} {\alpha_{i} } ,$$
(41)
$${\mathbf{A}} = \sum\limits_{i = 1}^{n} {\alpha_{i} } W(\varvec{p}_{i}^{\text{o}} )^{\text{T}} Q(\varvec{p}_{i}^{\text{t}} ),$$
(42)
$${\mathbf{B}} = \sum\limits_{i = 1}^{n} {\alpha_{i} } Q(\varvec{p}_{i}^{\text{t}} ),$$
(43)
$${\mathbf{C}} = \sum\limits_{i = 1}^{n} {\alpha_{i} } W(\varvec{p}_{i}^{\text{o}} ),$$
(44)

thus

$$\widetilde{e} = a + b\lambda^{ 2} + 4c\varvec{s}^{\text{T}} \varvec{s} - 2\lambda \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} - 4\varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r} + 4\lambda \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r} + \beta_{ 1} \left( {\varvec{r}^{\text{T}} \varvec{r} - 1} \right) + \beta_{ 2} \varvec{s}^{\text{T}} \varvec{r}.$$
(45)

According to the Lagrangian extremum law, an extremum requires that the following conditions be satisfied:

$$\frac{{\delta \widetilde{e}}}{{\delta \varvec{r}}} = - 2\lambda \varvec{r}^{\text{T}} \left( {{\mathbf{A}} + {\mathbf{A}}^{\text{T}} } \right) - 4\varvec{s}^{\text{T}} {\mathbf{B}} + 4\lambda \varvec{s}^{\text{T}} {\mathbf{C}} + 2\beta_{ 1} \varvec{r}^{\text{T}} + \beta_{ 2} \varvec{s}^{\text{T}} = 0,$$
(46)
$$\frac{{\delta \widetilde{e}}}{{\delta \varvec{s}}} = 4c\varvec{s}^{\text{T}} - 2\varvec{r}^{\text{T}} {\mathbf{B}}^{\text{T}} + 2\lambda \varvec{r}^{\text{T}} {\mathbf{C}}^{\text{T}} + \beta_{ 2} \varvec{r}^{\text{T}} = 0,$$
(47)
$$\frac{{\delta \widetilde{e}}}{\delta \lambda } = 2b\lambda - 2\varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} + 4\varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r} = 0,$$
(48)
$$\frac{{\delta \widetilde{e}}}{{\delta \beta_{ 1} }} = \varvec{r}^{\text{T}} \varvec{r} - 1= 0,$$
(49)
$$\frac{{\delta \widetilde{e}}}{{\delta \beta_{ 2} }} = \varvec{s}^{\text{T}} \varvec{r} = 0.$$
(50)

Transposing Eq. (46) gives

$$4\lambda {\mathbf{C}}^{\text{T}} \varvec{s} - 2\lambda \left( {{\mathbf{A}} + {\mathbf{A}}^{\text{T}} } \right)\varvec{r} - 4{\mathbf{B}}^{\text{T}} \varvec{s} + 2\beta_{ 1} \varvec{r} + \beta_{ 2} \varvec{s} = 0.$$
(51)

It can be proved that \({\mathbf{A}}\) is a symmetric matrix (the proof is given in “Appendix 1”), so Eq. (51) can be rewritten as

$$4\lambda {\mathbf{C}}^{\text{T}} \varvec{s} - 4\lambda {\mathbf{A}}\varvec{r} - 4{\mathbf{B}}^{\text{T}} \varvec{s} + 2\beta_{ 1} \varvec{r} + \beta_{ 2} \varvec{s} = 0.$$
(52)

Transposing Eq. (47) gives

$$4c\varvec{s} - 2{\mathbf{B}}\varvec{r} + 2\lambda {\mathbf{C}}\varvec{r} + \beta_{ 2} \varvec{r} = 0.$$
(53)

Dividing Eq. (48) by 2 gives

$$b\lambda - \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} + 2\varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r} = 0.$$
(54)

Left multiplying Eq. (53) by \(\varvec{r}^{T}\) gives

$$4c\varvec{r}^{\text{T}} \varvec{s} - 2\varvec{r}^{\text{T}} {\mathbf{B}}\varvec{r} + 2\lambda \varvec{r}^{\text{T}} {\mathbf{C}}\varvec{r} + \beta_{ 2} \varvec{r}^{\text{T}} \varvec{r} = 0.$$
(55)

\({\mathbf{B}}\) and \({\mathbf{C}}\) are skew-symmetric matrices, so the following property holds (for the proof, the reader can refer to “Appendix 2”):

$$\varvec{r}^{\text{T}} {\mathbf{B}}\varvec{r} = 0,\;\varvec{r}^{\text{T}} {\mathbf{C}}\varvec{r} = 0.$$
(56)

Considering Eqs. (22), (23) or Eqs. (49), (50) and (56), it is obtained from Eq. (55) that

$$\beta_{ 2} = 0.$$
(57)

Substituting Eq. (57) into Eq. (53) and solving for \(\varvec{s}\) gives

$$\varvec{s} = \frac{ 1}{ 2c}\left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right)\varvec{r}.$$
(58)

Substituting Eq. (58) into Eq. (52) gives

$$\frac{ 1}{c}\left( {\lambda {\mathbf{C}}^{\text{T}} - {\mathbf{B}}^{\text{T}} } \right)\left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right)\varvec{r} - 2\lambda {\mathbf{A}}\varvec{r} + \beta_{ 1} \varvec{r} = 0.$$
(59)

Arranging Eq. (59) gives

$$\left[ { 2\lambda {\mathbf{A}} + \frac{ 1}{c}\left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right)^{T} \left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right)} \right]\varvec{r} = \beta_{ 1} \varvec{r}.$$
(60)

Let

$${\mathbf{D}} = 2\lambda {\mathbf{A}} + \frac{ 1}{c}\left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right)^{\text{T}} \left( {{\mathbf{B}} - \lambda {\mathbf{C}}} \right),$$
(61)

thus Eq. (60) can be rewritten as

$${\mathbf{D}}\varvec{r} = \beta_{ 1} \varvec{r}.$$
(62)

So \(\beta_{ 1}\) and \(\varvec{r}\) are an eigenvalue and the corresponding eigenvector of \({\mathbf{D}}\). Since \({\mathbf{D}}\) is symmetric and real, it has four real eigenvalues and four mutually orthogonal real eigenvectors. To select the unique solution, we need to refer back to the error function, Eq. (33).

Left multiplying Eq. (52) by \(\varvec{r}^{\text{T}}\), using Eqs. (49) and (57), and dividing by 2 gives

$$2\lambda \varvec{r}^{\text{T}} {\mathbf{C}}^{\text{T}} \varvec{s} - 2\lambda \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} - 2\varvec{r}^{\text{T}} {\mathbf{B}}^{\text{T}} \varvec{s} + \beta_{ 1} = 0,$$
(63)
$$2\lambda \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} = 2\lambda \varvec{r}^{\text{T}} {\mathbf{C}}^{\text{T}} \varvec{s} - 2\varvec{r}^{\text{T}} {\mathbf{B}}^{\text{T}} \varvec{s} + \beta_{ 1} .$$
(64)

Left multiplying Eq. (53) by \(\varvec{s}^{\text{T}}\) and using Eq. (57) gives

$$4c\varvec{s}^{\text{T}} \varvec{s} = 2\varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r} - 2\lambda \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r}.$$
(65)

Substituting Eqs. (64) and (65) into the error function \(e\) gives

$$e = a + b\lambda^{ 2} - \beta_{ 1} ,$$
(66)

thus, for a given \(\lambda\), \(e\) attains its minimum when \(\beta_{ 1}\) is the largest eigenvalue. In other words, the solution of \(\varvec{r}\) is the eigenvector of \({\mathbf{D}}\) corresponding to the largest eigenvalue.
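As a check on Eq. (66), the substitution can be written out explicitly. With the constraints of Eqs. (49) and (50) satisfied, the multiplier terms of Eq. (45) vanish, and replacing \(4c\varvec{s}^{\text{T}} \varvec{s}\) by Eq. (65) and \(2\lambda \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r}\) by Eq. (64) gives

$$\begin{aligned} e & = a + b\lambda^{ 2} + 4c\varvec{s}^{\text{T}} \varvec{s} - 2\lambda \varvec{r}^{\text{T}} {\mathbf{A}}\varvec{r} - 4\varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r} + 4\lambda \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r} \\ & = a + b\lambda^{ 2} + \left( {2\varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r} - 2\lambda \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r}} \right) - \left( {2\lambda \varvec{r}^{\text{T}} {\mathbf{C}}^{\text{T}} \varvec{s} - 2\varvec{r}^{\text{T}} {\mathbf{B}}^{\text{T}} \varvec{s} + \beta_{ 1} } \right) - 4\varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r} + 4\lambda \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r} \\ & = a + b\lambda^{ 2} - \beta_{ 1} , \\ \end{aligned}$$

where the scalar identities \(\varvec{r}^{\text{T}} {\mathbf{C}}^{\text{T}} \varvec{s} = \varvec{s}^{\text{T}} {\mathbf{C}}\varvec{r}\) and \(\varvec{r}^{\text{T}} {\mathbf{B}}^{\text{T}} \varvec{s} = \varvec{s}^{\text{T}} {\mathbf{B}}\varvec{r}\) have been used, so that all terms in \({\mathbf{B}}\) and \({\mathbf{C}}\) cancel.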

Substituting Eq. (58) into Eq. (54) gives

$$\lambda = \frac{{\varvec{r}^{T} {\mathbf{A}}\varvec{r} - c^{ - 1} \varvec{r}^{T} {\mathbf{B}}^{T} {\mathbf{C}}\varvec{r}}}{{b - c^{ - 1} \varvec{r}^{T} {\mathbf{C}}^{T} {\mathbf{C}}\varvec{r}}}.$$
(67)

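From Eqs. (61), (62) and (67), it is seen that \(\varvec{r}\) and \(\lambda\) are coupled: \({\mathbf{D}}\) in Eq. (61) depends on \(\lambda\), while \(\lambda\) in Eq. (67) depends on \(\varvec{r}\), so no analytical formula for \(\varvec{r}\) and \(\lambda\) is obtained directly. Therefore, an iterative algorithm is presented in Table 1.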

Table 1 A dual quaternion algorithm of Helmert transformation

It is worth noting that the mathematical model and algorithm have a special property: they do not require differencing of coordinates (e.g., reduction to centrobaric coordinates), as other related studies do, e.g., Grafarend and Awange (2003), Chen et al. (2004), Shen et al. (2006a, b), Zeng and Huang (2008), Han (2010), Zeng and Yi (2010, 2011), Zeng (2015), Závoti and Kalmár (2016), Zeng et al. (2016), etc. However, whether the coordinates are differenced has no necessary relation to the transformation accuracy.
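One possible way to organize the iteration implied by Eqs. (39) to (44), (58), (61), (62) and (67) is sketched below, assuming NumPy. The function name `dqa_helmert`, the loop structure, the stopping rule \(\left| {\lambda_{k + 1} - \lambda_{k} } \right| < \tau\), the positive initial scale and the test data are assumptions made for illustration and may differ in detail from Table 1.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix C(v) of Eq. (14)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def Q(p):
    """Matrix of Eq. (16)."""
    v = p[:3]
    return np.block([[p[3] * np.eye(3) + skew(v), v[:, None]],
                     [-v[None, :], np.array([[p[3]]])]])

def W(q):
    """Matrix of Eq. (17)."""
    v = q[:3]
    return np.block([[q[3] * np.eye(3) - skew(v), v[:, None]],
                     [-v[None, :], np.array([[q[3]]])]])

def dqa_helmert(P_o, P_t, weights=None, lam0=1.0, tau=1e-10, max_iter=50):
    """Sketch: alternate Eq. (62) for r and Eq. (67) for lambda (lam0 > 0 assumed)."""
    P_o, P_t = np.asarray(P_o, dtype=float), np.asarray(P_t, dtype=float)
    n = len(P_o)
    alpha = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    po = np.hstack([P_o, np.zeros((n, 1))])              # pure quaternions, Eq. (29)
    pt = np.hstack([P_t, np.zeros((n, 1))])
    b = np.sum(alpha * np.sum(po**2, axis=1))            # Eq. (40)
    c = np.sum(alpha)                                    # Eq. (41)
    A = sum(a * W(x).T @ Q(y) for a, x, y in zip(alpha, po, pt))  # Eq. (42)
    B = sum(a * Q(y) for a, y in zip(alpha, pt))         # Eq. (43)
    C = sum(a * W(x) for a, x in zip(alpha, po))         # Eq. (44)
    lam, iters = lam0, 0
    for iters in range(1, max_iter + 1):
        G = B - lam * C
        D = 2.0 * lam * A + G.T @ G / c                  # Eq. (61)
        _, vecs = np.linalg.eigh(0.5 * (D + D.T))        # eigenvalues in ascending order
        r = vecs[:, -1]                                  # eigenvector of largest eigenvalue
        lam_new = (r @ A @ r - r @ B.T @ C @ r / c) / (b - r @ C.T @ C @ r / c)  # Eq. (67)
        converged = abs(lam_new - lam) < tau
        lam = lam_new
        if converged:
            break
    s = (B - lam * C) @ r / (2.0 * c)                    # Eq. (58)
    R = (W(r).T @ Q(r))[:3, :3]                          # Eq. (25)
    t = 2.0 * (W(r).T @ s)[:3]                           # Eq. (28), undoing the 1/2 of Eq. (26)
    return lam, R, t, iters

# Hypothetical noise-free test: large rotation, raw (non-centred) coordinates.
r_true = np.array([0.6, -0.3, 0.5, 0.2])
r_true = r_true / np.linalg.norm(r_true)
R_true = (W(r_true).T @ Q(r_true))[:3, :3]
lam_true, t_true = 1.5, np.array([100.0, -50.0, 25.0])
P_o = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10], [7, 3, 5]], dtype=float)
P_t = lam_true * P_o @ R_true.T + t_true                 # Eq. (1), one point per row
lam, R, t, iters = dqa_helmert(P_o, P_t, lam0=100.0)
```

In this sketch the coordinates enter Eqs. (42) to (44) directly, without prior reduction to the centroid, and the eigenvector sign is immaterial because \({\mathbf{R}}\) and \({\mathbf{t}}\) are quadratic in \((\varvec{r},\varvec{s})\).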

Computation of initial value of scale parameter

From the algorithm in Table 1, it is seen that an initial value of the scale parameter is needed before the iterative computation. Some studies have given formulas for the scale that are independent of the rotation angles and translation. For example, Han (2010) presented the following formula to estimate the scale:

$$\lambda_{H} = {\text{mean}}\left( {\frac{{\left\| {{\text{d}}{\mathbf{x}}_{ij}^{'} } \right\|}}{{\left\| {{\text{d}}{\mathbf{x}}_{ij} } \right\|}}} \right)\quad i \ne j,$$
(68)

where \({\text{d}}{\mathbf{x}}_{ij}\) is the coordinate difference of points \(i\) and \(j\) in the original system, \({\text{d}}{\mathbf{x}}_{ij}^{'}\) is the coordinate difference of the transformed points \(i\) and \(j\), and \(\lambda_{H}\) is the scale estimate. Závoti and Kalmár (2016) presented three solutions for the scale as follows.

$$\lambda_{Z1} = \frac{{\sum\nolimits_{i = 1}^{n} {\sqrt {X_{is}^{2} + Y_{is}^{2} + Z_{is}^{2} } } }}{{\sum\nolimits_{i = 1}^{n} {\sqrt {x_{is}^{2} + y_{is}^{2} + z_{is}^{2} } } }},$$
(69)
$$\lambda_{Z2} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {X_{is}^{2} + Y_{is}^{2} + Z_{is}^{2} } \right)} }}{{\sum\nolimits_{i = 1}^{n} {\left( {x_{is}^{2} + y_{is}^{2} + z_{is}^{2} } \right)} }}} ,$$
(70)
$$\lambda_{Z3} = \frac{{\sum\nolimits_{i = 1}^{n} {\sqrt {\left( {x_{is}^{2} + y_{is}^{2} + z_{is}^{2} } \right)\left( {X_{is}^{2} + Y_{is}^{2} + Z_{is}^{2} } \right)} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {x_{is}^{2} + y_{is}^{2} + z_{is}^{2} } \right)} }} ,$$
(71)

where \(x_{is}\), \(y_{is}\), \(z_{is}\) are the centrobaric coordinates in the original coordinate system, i.e.,

$$x_{is} = x_{i} - x_{s} ,\;y_{is} = y_{i} - y_{s} ,\;z_{is} = z_{i} - z_{s} ,\;x_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {x_{i} } ,\;y_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {y_{i} } ,\;z_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {z_{i} } ,$$
(72)

\(X_{is}\), \(Y_{is}\), \(Z_{is}\) are the centrobaric coordinates in the target coordinate system, i.e.,

$$X_{is} = X_{i} - X_{s} ,\;Y_{is} = Y_{i} - Y_{s} ,\;Z_{is} = Z_{i} - Z_{s} ,\;X_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {X_{i} } ,\;Y_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {Y_{i} } ,\;Z_{s} = \frac{ 1}{n}\sum\limits_{i = 1}^{n} {Z_{i} } .$$
(73)
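The four initial scale estimates of Eqs. (68) to (71) can be sketched as follows, assuming NumPy. The function names and sample data are hypothetical, and the mean in Eq. (68) is taken over pairs \(i < j\), which gives the same value as the mean over all \(i \ne j\) because the ratio is symmetric.

```python
import numpy as np
from itertools import combinations

def scale_han(P_o, P_t):
    """Eq. (68): mean ratio of inter-point distances (pairs i < j)."""
    ratios = [np.linalg.norm(P_t[i] - P_t[j]) / np.linalg.norm(P_o[i] - P_o[j])
              for i, j in combinations(range(len(P_o)), 2)]
    return float(np.mean(ratios))

def scales_zavoti_kalmar(P_o, P_t):
    """Eqs. (69)-(71), using the centrobaric coordinates of Eqs. (72) and (73)."""
    d_o = np.linalg.norm(P_o - P_o.mean(axis=0), axis=1)   # distances to original centroid
    d_t = np.linalg.norm(P_t - P_t.mean(axis=0), axis=1)   # distances to target centroid
    lam1 = d_t.sum() / d_o.sum()                           # Eq. (69)
    lam2 = np.sqrt((d_t**2).sum() / (d_o**2).sum())        # Eq. (70)
    lam3 = (d_o * d_t).sum() / (d_o**2).sum()              # Eq. (71)
    return lam1, lam2, lam3

# Hypothetical noise-free data: scale 2.5, rotation by 90 degrees about Z, a shift.
P_o = np.array([[0, 0, 0], [4, 0, 0], [0, 3, 0], [0, 0, 12], [1, 2, 3]], dtype=float)
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
P_t = 2.5 * P_o @ Rz.T + np.array([5.0, -7.0, 9.0])

lam_h = scale_han(P_o, P_t)
lam_z1, lam_z2, lam_z3 = scales_zavoti_kalmar(P_o, P_t)
```

For noise-free data all four estimates coincide with the true scale; with noisy data they generally differ slightly from one another.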

Case study and discussion

Actual case (small rotation angles)

The data are taken from Grafarend and Awange (2003); the rotation angles are very small (less than \(1^{{\prime \prime }}\)) in this case. The 3D coordinates of the control points in the original system and the target system are listed in Table 2. When the weights of the control points are not considered or are identical, the transformation parameters computed with the presented dual quaternion algorithm (DQA), the orthonormal matrix algorithm (OMA) from Zeng (2015) and the Procrustes algorithm (PA) from Grafarend and Awange (2003) are listed in Table 3. DQA sets the threshold \(\tau\) to \(1.0 \times 10^{ - 10}\) and adopts six initial values of \(\lambda\), i.e., the solution of Han (2010), the three solutions of Závoti and Kalmár (2016), a biased one (solution 5, with initial value 10) and a seriously biased one (solution 6, with initial value 100). From Table 3 it is seen that for the two initial values \(\lambda_{Z2}\) and \(\lambda_{Z3}\), which are identical to the least squares estimate, one iteration is enough to recover the correct seven transformation parameters. For the remaining initial values of \(\lambda\), two iterations suffice to converge to the correct transformation parameters no matter how biased (slightly biased, biased or seriously biased) the initial value is from the best estimate. So DQA converges within two iterations in all situations. OMA and PA give identical results. In terms of solution accuracy, DQA is comparable to OMA and PA.

Table 2 Coordinates of control points in local system and WGS-84 system
Table 3 Computed transformation parameters (identical weight)

At times the weights of the control points need to be considered because different control points have different positioning accuracy; the weights can be derived from the accuracies of the control points. For the situation in which the weight is point-wise, i.e., the weight of every point is isotropic in the three axis directions (x-axis, y-axis, z-axis) and the control points are independent of each other, the point-wise weights are generated by the method of Grafarend and Awange (2003) and are given in Table 4. The seven parameters computed with DQA, OMA and PA are listed in Table 5. In this situation, because Han (2010) and Závoti and Kalmár (2016) do not consider the weights of the control points when computing the scale, all four solutions (solution 1 to solution 4) offer only slightly biased initial values of \(\lambda\). It is seen from Table 5 that two iterations are needed to obtain the correct seven parameters, regardless of whether the initial value is slightly biased, biased or seriously biased. OMA and PA give identical results if the differences caused by decimal rounding are ignored. In terms of computation accuracy, DQA is consistent with OMA and PA.

Table 4 Point-wise weight
Table 5 Calculated transformation parameters (point-wise weight)

Simulative case (big rotation angles)

In this case, big rotation angles are considered and the data are simulated as follows. First, the control points and their 3D coordinates in the original system are given in Table 6. Second, the true seven transformation parameters, with randomly generated big rotation angles, are given in Table 7. Third, the 3D coordinates of the control points in the target system are computed by Eq. (1). Lastly, to simulate real-world data, which always contain noise, \(N(0, 0.02^{2})\) noise is added to the 3D coordinates in the original system and \(N(0, 0.01^{2})\) noise is added to the 3D coordinates in the target system. That is, the coordinate errors in the three axis directions are assumed to have equal variance and to be independent. The normally distributed random noise is listed in Tables 6 and 8.

Table 6 3D coordinates of control points in original system and noises added (m)
Table 7 True values of transformation parameters
Table 8 3D coordinates of control points in target system and noises added (m)
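The four simulation steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: Eq. (1) is taken to have the usual similarity form (target = scale × rotation × original + translation), the rotation is built from three Euler angles in an assumed axis order, and the names `rotation_from_euler`, `helmert` and `simulate_control_points` are hypothetical helpers, not part of the original study. The random noise drawn here will not reproduce the values in Tables 6 and 8.

```python
import numpy as np


def rotation_from_euler(rx, ry, rz):
    """Rotation matrix Rz @ Ry @ Rx from angles in radians.

    The axis order is an assumption made for this sketch only.
    """
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx


def helmert(points, scale, R, t):
    """Apply p_target = scale * R @ p + t to every row of an (N, 3) array."""
    return scale * np.asarray(points, dtype=float) @ R.T + t


def simulate_control_points(points_true, scale, R, t,
                            sigma_original=0.02, sigma_target=0.01, seed=0):
    """Return noisy original and target coordinates (same units as the input).

    Independent zero-mean Gaussian noise with equal variance on all three
    axes is added to both coordinate sets, as described in the text.
    """
    rng = np.random.default_rng(seed)
    points_true = np.asarray(points_true, dtype=float)
    target_true = helmert(points_true, scale, R, t)
    original_noisy = points_true + rng.normal(0.0, sigma_original, points_true.shape)
    target_noisy = target_true + rng.normal(0.0, sigma_target, target_true.shape)
    return original_noisy, target_noisy
```

Fixing the seed makes a simulated data set repeatable, which is convenient when several algorithms are to be compared on exactly the same noisy coordinates.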

When the weights of the control points are ignored or identical, the transformation parameters are computed with DQA, OMA and PA. DQA sets the threshold \(\tau\) to \(1.0 \times 10^{-10}\) and adopts the six initial values of \(\lambda\) listed in Table 9. DQA requires two iterations to obtain the final result for all six initial values of \(\lambda\). DQA, OMA and PA obtain identical results, which are listed in Table 10.

Table 9 Six initial values of scale parameter
Table 10 Recovered transformation parameters (identical weight) by DQA (with six initial values of scale parameter), OMA and PA
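For readers who want an independent closed-form check of such results, the sketch below fits a seven-parameter similarity transformation by the standard SVD-based least-squares solution, with optional point-wise weights. It is a stand-in only: it is not the DQA presented here, nor the OMA of Zeng (2015) or the PA of Grafarend and Awange (2003) as published. It assumes the model target = scale × rotation × original + translation and minimizes the weighted residuals in the target system. The function name `fit_similarity` is hypothetical.

```python
import numpy as np


def fit_similarity(original, target, weights=None):
    """Closed-form least-squares fit of target ~ scale * R @ original + t.

    original, target: (N, 3) arrays of matching control points (N >= 3).
    weights: optional length-N array of positive point-wise weights.
    Returns (scale, R, t).
    """
    original = np.asarray(original, dtype=float)
    target = np.asarray(target, dtype=float)
    n = original.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # only the ratio of the weights matters

    # Weighted centroids and centred coordinates
    c_orig = w @ original
    c_targ = w @ target
    A = original - c_orig
    B = target - c_targ

    # Weighted cross-covariance H = sum_i w_i * a_i * b_i^T
    H = (A * w[:, None]).T @ B
    U, S, Vt = np.linalg.svd(H)

    # Guard against a reflection so that det(R) = +1
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Scale from the ratio of the matched spread to the original spread
    scale = (S * np.diag(D)).sum() / (w * (A ** 2).sum(axis=1)).sum()
    t = c_targ - scale * R @ c_orig
    return scale, R, t
```

Because this solver needs neither an initial value nor iteration, it offers a convenient way to sanity-check the output of an iterative scheme on the same control points.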

Furthermore, the performance of DQA, OMA and PA is compared taking the weights of the control points into account. First, the positional error sphere defined by Grafarend and Awange (2003) is calculated and listed in Tables 6 and 8, respectively. Then the point-wise weight matrix is computed by the approach of Grafarend and Awange (2003) from the positional error spheres and the recovered transformation parameters (identical weight) listed in Table 10. The resulting weights are listed in Table 11, which omits the common magnitude because the ratio of the weights, rather than their absolute values, is what matters. Finally, the transformation parameters are computed with DQA, OMA and PA. Here DQA uses the same threshold \(\tau\) and the same initial values of \(\lambda\) as in the identical weight situation. DQA needs two iterations to obtain the final result for all six initial values of \(\lambda\). DQA, OMA and PA obtain identical results, which are listed in Table 12.

Table 11 Point-wise weight
Table 12 Recovered transformation parameters (point-wise weight) by DQA (with six initial values of scale parameter), OMA and PA

The transformation residuals of the coordinates are computed as the differences between the coordinates calculated by Eq. (1), using the recovered transformation parameters and the coordinates in the original system, and the known coordinates in the target system. The result is listed in Table 13, where \({\text{d}}x\), \({\text{d}}y\) and \({\text{d}}z\) denote the transformation residuals of the coordinates in the three axis directions. The root-mean-square errors are also computed and listed in Table 13. Table 13 shows that the transformation residuals are consistent with the noise added to the coordinates of the control points in the original system.

Table 13 Transformation residuals of coordinates (m)
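The residual and root-mean-square computation described above amounts to only a few lines. The following sketch assumes that Eq. (1) has the form target = scale × rotation × original + translation and that the residual is taken as known minus calculated; the sign convention is an assumption here and does not affect the root-mean-square error. The function names are hypothetical.

```python
import numpy as np


def transformation_residuals(original, target, scale, R, t):
    """Per-point residuals (dx, dy, dz) as an (N, 3) array.

    Residual = known target coordinates minus coordinates calculated
    from the original system (assumed sign convention).
    """
    original = np.asarray(original, dtype=float)
    target = np.asarray(target, dtype=float)
    calculated = scale * original @ R.T + t
    return target - calculated


def rmse_per_axis(residuals):
    """Root-mean-square error of the residuals along x, y and z."""
    residuals = np.asarray(residuals, dtype=float)
    return np.sqrt(np.mean(residuals ** 2, axis=0))
```

Comparing the per-axis root-mean-square error with the standard deviations of the simulated noise gives a quick plausibility check of the recovered parameters.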

According to the above analysis, DQA, OMA and PA perform identically for big rotation angles. Thus, the presented algorithm is correct and reliable.

Conclusions

The unit dual quaternion can be elegantly employed to describe a rigid transformation comprising rotation and translation. Based on the unit dual quaternion, a non-differential model of the Helmert transformation (seven-parameter similarity transformation) is constructed and a rigorous iterative algorithm of the Helmert transformation using dual quaternions is presented. The case study shows that, when the weights are identical, the presented algorithm requires one iteration to recover the transformation parameters if an accurate initial value of the scale is provided, as with solutions no. 2 and 3 of Závoti and Kalmár (2016); otherwise, exactly two iterations converge to the correct solution of the transformation parameters no matter how big the rotation angles are and how biased the initial value of the scale is. Hence, the presented algorithm converges fast, and it becomes an analytical algorithm when an accurate initial value of the scale is supplied. In addition, the presented algorithm is able to deal with point-wise weighted transformation, which is more rational than algorithms that do not consider the weight differences among control points. In terms of solution accuracy, the presented algorithm is comparable to the classic Procrustes algorithm and the orthonormal matrix algorithm from Zeng (2015).