A Complete Workflow for Automatic Forward Kinematics Model Extraction of Robotic Total Stations Using the Denavit-Hartenberg Convention

Development and verification of real-time algorithms for robotic total stations usually require hardware-in-the-loop approaches, which can be complex and time-consuming. Simulator-in-the-loop can be used instead, but this requires the design of a simulation environment and sufficiently detailed modeling of the hardware. Typically, device specification and calibration data are provided by the device manufacturers and are used by the device drivers. However, geometric models of robotic total stations cannot be used directly with existing robotic simulators. Model details are often treated as company secrets, and no source code of device drivers is available to the public. In this paper, we present a complete workflow for automatic geometric model extraction of robotic total stations using the Denavit-Hartenberg convention. We provide a complete set of Denavit-Hartenberg parameters for an exemplary robotic total station. These parameters can be used in existing robotic simulators without modification. Furthermore, we analyze the differences between the extracted geometric model, the calibrated model used by the device drivers, and the standard spherical representation for 3D point measurements of the device.

construction to as-built verification and deformation monitoring. Modern devices allow the measurement of distances and angles, provide image data, and automatically compensate for various systematic effects, such as manufacturing inaccuracies, sensor drift and environmental influences [38]. While measured 3D points are usually represented in spherical coordinates, these devices are calibrated with extended geometric models by the manufacturers to reduce systematic errors, to combine multiple sensors and to achieve higher precision and accuracy. Manufacturers usually provide a software development kit (SDK) to access and convert the measured data. However, the detailed mathematical formulation of the corrections is often a company secret and is not available to the public.
The contribution of this work is a detailed description of the forward kinematics model parameter estimation for a robotic total station (RTS) using the Denavit-Hartenberg (DH) convention [8], which can hardly be found in the literature in closed form. Based on this description, we provide the estimated DH parameters for an exemplary RTS. These parameters can be used in custom or existing robotic simulators, such as Roboanalyzer [27], WorkcellSimulator [37], ABB RobotStudio [5] or OpenSim [7]. The introduced model estimation workflow can also be applied to general open-chain robots with comparable actuators and sensors. In particular, we extend the methods described by Barker [2] and Rajeevlochana et al. [28] to automatically extract the relationship between the robot control parameters, one or more cameras, and the electronic distance meter (EDM) of an RTS. Furthermore, we compare three different kinematics model representations for the exemplary RTS: (a) the geometrically extracted kinematics model, (b) a simplified model using a spherical coordinate system and (c) the numerically optimized kinematics model. This paper is structured as follows. Related work on RTS, DH parameter estimation and robotic modeling is described in Section 2. In Section 3, the kinematics modeling workflow is described. In Section 4, experimental results for the kinematics model of an exemplary RTS are presented, including a comparison of the errors of the geometrically extracted, analytically simplified, and numerically optimized models. A brief discussion of findings, limitations and possible future work is provided in Section 5, and concluding remarks are given in Section 6.

Related Work
Most robotic simulation environments for open kinematic chains use DH parameters for geometric linkage simulations [3,5,7,22,27,30,35,37]. While DH parameter based models are quite common in robotic theory, RTS specifications do not provide this kind of information. Without claiming completeness, this section provides an overview of published work in the field of RTS, the particular device used in this paper, the DH convention and related kinematics modeling approaches.
Uren and Price [38] provide an introduction to surveying devices and methods, including standardized mathematical models, measurement methods, error calculations and common workflows. Forward kinematics modeling or device simulation are not described, however.
Klug et al. [21] use a spherical model for the RTS in the context of human structure measurements. However, the applied model is an idealized representation of the device; the full kinematics model and the influence of the simplification are not analyzed (see Fig. 1). Kinematics modeling and model identification have been addressed extensively in the literature, using different models and notations [6,8,12,13,33,36]. Reuleaux [29] introduced the concept of kinematic chains in 1876. Denavit and Hartenberg [8] presented a systematic notation for kinematic chains, later called the DH convention, which is still the most common method to identify rigid kinematic chains. Veitschegger and Wu [39] added the base and tool transforms for a complete kinematics description. A general introduction to the field of robotics, including a detailed description of the DH convention and its variants, can be found in the book by Siciliano and Khatib [34]; a comparison of different DH convention variants and an extensive study of geometric linkages can be found in the book by McCarthy and Soh [25]. Despite the long history of kinematics modeling of robotic devices, to the best of our knowledge, no complete description for kinematics model extraction of an RTS with a camera has been presented so far.
Barker [2] describes a vector-algebra approach for extracting geometric properties of assembled robotic arms. Rajeevlochana et al. [28] present a description for automatic model parameter estimation using a modified version of the algorithm based on line geometry. More details about their workflow, data acquisition, model extraction, and modeling error evaluation are given in the work of Hayat et al. [15] and in the work of Chittawadigi et al. [4]. However, their work is not tailored to RTS, hence device-specific algorithm steps are missing. Furthermore, numerical optimization of the geometrically extracted models was not addressed by the authors.
In our work, we describe a workflow based on DH parameters, which is tailored to kinematics modeling and simulation of modern RTS. To the best of our knowledge, we are the first to describe such a workflow in self-contained form. Furthermore, we show the significance of numerical optimization with a global cost function in addition to the geometric approach. Finally, instead of focusing on a single application case, we analyze the difference between the geometrically extracted model, the numerically optimized model and the spherical approximation.

Forward Kinematics Modeling of RTS
An RTS is an electrical theodolite that consists of an optical telescope, an electronic distance meter and one or more cameras [38]. Modern devices support teleoperation; hence, the optical telescope of the instrument is completely replaced by a camera module without any eyepiece. While the idealized model of the robotic module defines a spherical coordinate system, this model is violated by inevitable inaccuracies in manufacturing and physical device calibration. Manufacturing details are usually trade secrets, and only limited information is available to the public. Software-side calibration can reduce these errors, but it requires an extended kinematics model of the RTS.
By extracting a combined model for hardware and software, a more accurate simulation environment can be implemented, device specifications can be verified, and less in-depth knowledge of the physical device is required.
The linear geometric properties of the rigid body can be described by a series of Euclidean transformation matrices with six degrees of freedom (DOF) each. In the field of robotics, more systematic approaches with fewer DOF, such as the DH convention, are known for estimating kinematic and dynamic properties. In this section, we provide a detailed description of how to derive a geometric model for an RTS using the DH convention. This makes it possible to estimate device properties when specifications are missing, to verify calibration results, and to re-use existing simulation environments for robotic devices.

Forward Kinematics
The dynamic parts of an RTS can be described as a spatial kinematic chain. Forward kinematics describes the pose of an end effector using links and joints without consideration of driving forces or moments [23]. Joints are rigidly connected by links and can be affected at run-time, using the related joint control variables. End effectors or tools are sensors or actuators attached to the last link of a robotic device.
The DH convention was originally introduced to describe the geometric relationship of an open kinematic chain with M − 1 joints and M links, using a series of homogeneous joint displacement matrices Z i and link displacement matrices X i [8]. In this work, we use the DH notation proposed by Rajeevlochana and Saha [27,31].
By convention, joint i connects link i − 1 and link i. The displacement matrices Z i and X i define the local coordinate frame i by {o i , x i , y i , z i } where o i is the frame origin, and x i , y i , and z i are the normalized x, y, and z axes, respectively. Frame i is rigidly attached to the end of link i − 1 and must satisfy the following conditions: (i) The x i axis is perpendicular to the z i−1 axis; (ii) the x i axis intersects with the z i−1 axis.
The DH convention defines two types of joints: (a) revolute joints and (b) prismatic joints. A revolute joint allows a rotation around the z axis of frame i by the angle γ i . A prismatic joint allows the translation along the z axis of frame i by the distance d i .
The relative pose M i,j of frame j with respect to frame i is given by
\[
M_{i,j} = \prod_{k=i}^{j-1} Z_k X_k . \tag{1}
\]
The joint matrix Z i describes a screw displacement along the z axis of frame i, where d i and γ i are the control variables for joint i; the link matrix X i describes a rigid screw displacement along the x axis of frame i, where a i and α i define the static properties of link i. Figure 2 shows the relative pose M i,i+1 of frame i+1 with respect to frame i using the DH convention. The set Q i contains the control variables for link i, which are called the DH parameters:
\[
Q_i = \{ \gamma_i,\; d_i,\; a_i,\; \alpha_i \} .
\]
The rigid twist of link i − 1 is given by the rotation matrix R x , while the rigid length of link i − 1 is given by the translation matrix T x :
\[
X_i = T_x(a_i)\, R_x(\alpha_i) , \qquad
R_x(\alpha_i) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_i & -\sin\alpha_i \\ 0 & \sin\alpha_i & \cos\alpha_i \end{bmatrix} , \tag{6}
\]
where T x (a i ) translates by a i along the x axis and the rotation matrices are embedded as homogeneous transforms. The rotation of link i around joint i is given by the rotation matrix R z , and the translation of joint i along the z i axis is given by the translation matrix T z :
\[
Z_i = T_z(d_i)\, R_z(\gamma_i) , \qquad
R_z(\gamma_i) = \begin{bmatrix} \cos\gamma_i & -\sin\gamma_i & 0 \\ \sin\gamma_i & \cos\gamma_i & 0 \\ 0 & 0 & 1 \end{bmatrix} . \tag{7}
\]
The pose of frame i+1 with respect to frame i can then be written as
\[
M_{i,i+1} = Z_i X_i = \begin{bmatrix} R_{i,i+1} & t_{i,i+1} \\ 0^T & 1 \end{bmatrix} , \tag{9}
\]
where R i,i+1 is a 3 × 3 rotation matrix, and t i,i+1 is a 3 × 1 translation vector.
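As a concrete sketch of Eqs. 1–9, the per-link transform and the chained forward kinematics can be written in a few lines of numpy. The function names and the explicit 4 × 4 homogeneous formulation are ours, not from the paper:

```python
import numpy as np

def dh_transform(gamma, d, a, alpha):
    """Relative pose M_{i,i+1} = Z_i @ X_i for one DH link.

    Z_i: screw displacement along z (rotation gamma, translation d).
    X_i: screw displacement along x (rotation alpha, translation a).
    """
    cg, sg = np.cos(gamma), np.sin(gamma)
    ca, sa = np.cos(alpha), np.sin(alpha)
    Z = np.array([[cg, -sg, 0.0, 0.0],
                  [sg,  cg, 0.0, 0.0],
                  [0.0, 0.0, 1.0,   d],
                  [0.0, 0.0, 0.0, 1.0]])
    X = np.array([[1.0, 0.0, 0.0,   a],
                  [0.0,  ca, -sa, 0.0],
                  [0.0,  sa,  ca, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    return Z @ X

def forward_kinematics(dh_rows):
    """Chain M_{1,M} as the product of per-link DH transforms.

    dh_rows: iterable of (gamma, d, a, alpha) tuples, one per link.
    """
    M = np.eye(4)
    for row in dh_rows:
        M = M @ dh_transform(*row)
    return M
```

For example, a single link with γ = π/2, a = 1 rotates the x offset onto the y axis of the parent frame.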

Forward Kinematics Modeling Workflow for RTS
Estimating the forward kinematics model of an RTS can be reduced to the problem of deriving the DH parameters for a robotic device. Figure 3a shows the coordinate frames of an RTS with a single camera.¹ The pose of the RTS with respect to the reference frame C 0 can be defined by the base transform M B , which allows arbitrary placement of the robot in the scene. Multiple tools can be rigidly attached to the last link, each described by its own tool transform.
¹ Coordinate frames for different RTS are not standardized and vary between individual devices.
The transformation of homogeneous points from tool space k to the reference space is given by
\[
\hat{x}_{i,0} = M_B \, M_{1,M} \, M_T^{(k)} \, \hat{x}_{i,T}^{(k)} , \tag{10}
\]
where \(\hat{x}^{(k)}_{i,T}\) defines a homogeneous point in tool space k, and \(\hat{x}_{i,0}\) defines the same point in the reference space.
The steps required for estimating the DH parameters, following the workflow of Chittawadigi et al. [4], are:
1. identify the joint count and the type of each joint,
2. identify the end effectors,
3. record end effector poses while varying a single joint, and
4. calculate the DH parameters.
Figure 3b shows the model of an exemplary RTS. The system can be described by a reference frame C 0 = C B , two revolute joints, and two end effectors (the EDM and the camera tool frames). The position and orientation of a joint axis can be estimated by recording end effector positions with respect to the reference frame while rotating only one joint at a time. Figure 3c shows the flow chart for measuring end effector positions and extracting DH parameters.
The recorded 3D points describe a planar circular trajectory. The center of the circles and the plane of rotation can be used to estimate the DH parameters. Figure 4 shows the concept for circular feature extraction.

Data Acquisition for DH Parameter Estimation
The estimation of DH parameters usually requires an external measurement setup for measuring the pose of the end effectors. However, for forward kinematics model estimation of an RTS, the required data can be fetched from the application programming interface (API) of the device without any external devices.³ For each joint, a trajectory is defined by recording end effector positions while varying the related joint control parameter and keeping all other control parameters constant. A linear trajectory of the recorded end effector positions indicates a prismatic joint. A circular trajectory of the recorded points indicates a revolute joint.
³ For device calibration, external measurements would still be required.
If the recorded data describes neither a linear nor a circular movement, the affected joint is neither prismatic nor revolute, or the end effector coincides with the rotation axis of a revolute joint. To avoid singularities during DH parameter extraction, the recording for the affected joint has to be repeated using different fixed joint settings or different API parameters. Figure 5a shows the required measurement API. Joint i = 1 is the horizontal rotation, joint i = 2 is the vertical rotation, and F(ϕ, θ) is the control function for the corresponding joints. Data recording for joint 1 can be done by varying ϕ in the range [0, 2π] while keeping θ = π/2 constant. Data recording for joint 2 can be done by varying θ in the range [0, π] while keeping ϕ = 0 constant. The first device API function applies the control variables {ϕ i , θ i } and converts a 1D distance d EDM from the EDM space to a 3D point x i in the reference frame:
\[
x_i = F^{(EDM)}(\phi_i, \theta_i, d_{EDM}) . \tag{11}
\]
The second device API function applies the control variables {ϕ i , θ i } and converts a 2D image pixel u to a 3D point x i in the reference frame:
\[
x_i = F^{(CAM)}(\phi_i, \theta_i, u, d_{CAM}) . \tag{12}
\]
The two functions F (EDM) and F (CAM) are sufficient for forward kinematics modeling; they must be provided by the API of the RTS. Figure 5b shows the parameter space of the angle control variables {ϕ i , θ i }; Fig. 5c and d show the positions of the recorded EDM and camera end effectors, respectively.
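The joint-type test described above (linear versus circular trajectory) can be sketched as an SVD rank check on the recorded positions; the function name, tolerance and return labels are our own illustration:

```python
import numpy as np

def classify_joint(points, tol=1e-6):
    """Classify a recorded end-effector trajectory as 'prismatic' (linear),
    'revolute' (planar circular) or 'degenerate' (e.g. the end effector
    lies on the rotation axis).

    points: (N, 3) array of recorded end-effector positions.
    """
    P = np.asarray(points, dtype=float)
    centered = P - P.mean(axis=0)
    # The singular values reveal the intrinsic dimension of the trajectory.
    s = np.linalg.svd(centered, compute_uv=False)
    if s[0] < tol:              # all points coincide: no movement observed
        return "degenerate"
    if s[1] / s[0] < tol:       # rank 1: points lie on a straight line
        return "prismatic"
    if s[2] / s[0] < tol:       # rank 2: planar curve, circle candidate
        return "revolute"
    return "degenerate"
```

In practice the tolerance would have to account for measurement noise of the device.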

Estimating Circular Features
A plane p can be defined by its normalized normal n = [n x n y n z ] T and a point on the plane. Given a set of 3D points S p = {x 1 , x 2 , . . . , x N }, the center of mass x̄ is given by
\[
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i .
\]
The plane normal can be fitted to the point cloud S p using Singular Value Decomposition (SVD):
\[
\tilde{A} = \frac{1}{k} \bar{A} = U \Sigma V^T , \qquad
\bar{A} = \begin{bmatrix} x_1 - \bar{x} & x_2 - \bar{x} & \cdots & x_N - \bar{x} \end{bmatrix} ,
\]
where Ã is a 3 × N matrix containing the normalized and centered measurement samples. The normalization k is the absolute maximum value of all elements in Ā. A vector parallel to the plane normal n is given by the left singular vector of Ã that corresponds to the smallest singular value, which is simply the last column of U. Plane p is then fully defined by Eqs. 13 and 14.
It is convenient to define a right-handed orthogonal basis B for each plane such that the plane normal is aligned with the z axis, and the plane contains two additional orthogonal vectors:
\[
B = \begin{bmatrix} b_1 & b_2 & n \end{bmatrix} , \qquad \det(B) = 1 .
\]
The determinant of U is either equal to 1 if no reflection happens or equal to −1 in case of a reflection. Therefore, B can be interpreted as a reflection-free rotation of the plane with respect to the reference frame. All points must be transformed to the xy plane and projected to a 2D Euclidean space before fitting a circle to the planar measurements. This can be formalized as a projection of the centered measurement matrix Ā given in Eq. 17:
\[
A_c = P_c \bar{A} , \qquad
P_c = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} B^T ,
\]
where A c is a 2 × N matrix, and P c is a 2 × 3 projection matrix which applies the inverse plane rotation B T and the projection of the stacked 3D points Ā to the 2D space. A circle in 2D is given by the implicit equation
\[
(x - c_1)^2 + (y - c_2)^2 = r^2
\;\Longleftrightarrow\;
2 c_1 x + 2 c_2 y + k_3 = x^2 + y^2 , \qquad k_3 = r^2 - c_1^2 - c_2^2 ,
\]
which is linear in the unknown parameters c 1 , c 2 and k 3 . This can be written as an inhomogeneous linear system
\[
\tilde{A}_c \begin{bmatrix} c_1 \\ c_2 \\ k_3 \end{bmatrix} = \xi , \qquad
\tilde{A}_c = \begin{bmatrix} 2 x_1 & 2 y_1 & 1 \\ \vdots & \vdots & \vdots \\ 2 x_N & 2 y_N & 1 \end{bmatrix} ,
\]
where Ã c is an N × 3 matrix, and ξ i is the sum of squared 2D coordinate values of point i. SVD can be used to solve this system for the unknown circle parameters. The 3D circular feature for a single joint is then fully defined by c = {c, r}, using
\[
c = \bar{x} + B \begin{bmatrix} c_1 \\ c_2 \\ 0 \end{bmatrix} , \qquad
r = \sqrt{k_3 + c_1^2 + c_2^2} ,
\]
where B is the plane rotation according to Eq. 20, c is the center of the circle with respect to the reference frame, and r is the radius of the circle.
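The plane and circle fit described above can be sketched as follows. This is a minimal numpy illustration of the SVD plane fit followed by the linear least squares circle fit; the function name is ours, and the robustness checks a real implementation would need (noise, degenerate inputs) are omitted:

```python
import numpy as np

def fit_circle_3d(points):
    """Fit a circular feature (center, radius, plane normal) to 3D points."""
    P = np.asarray(points, dtype=float)
    x_bar = P.mean(axis=0)
    A = (P - x_bar).T                       # 3 x N centered samples
    U, _, _ = np.linalg.svd(A / np.abs(A).max())
    B = U.copy()                            # plane basis: normal -> z axis
    if np.linalg.det(B) < 0:                # enforce a right-handed basis
        B[:, -1] = -B[:, -1]
    # Project onto the plane: first two rows of B^T A are 2D coordinates.
    x, y = (B.T @ A)[:2]
    # (x-c1)^2 + (y-c2)^2 = r^2  ->  2*c1*x + 2*c2*y + k3 = x^2 + y^2
    Ac = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    xi = x ** 2 + y ** 2
    c1, c2, k3 = np.linalg.lstsq(Ac, xi, rcond=None)[0]
    r = np.sqrt(k3 + c1 ** 2 + c2 ** 2)
    center = x_bar + B @ np.array([c1, c2, 0.0])
    return center, r, B[:, -1]
```

For noise-free samples on a circle the fit is exact up to numerical precision.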

Correcting the Sign of Circular Features
A plane normal n as given in Eq. 13 is defined up to sign. This leads to sign ambiguities in the DH control parameters.
For revolute joints, this can lead to an inverse rotation; for prismatic joints, this can lead to an inverse translation. The ambiguity can be resolved using two consecutive measurement points x 1 and x 2 of circular feature i:
\[
\tilde{n}_i = \operatorname{sign}\!\left( \left( (x_1 - c_i) \times (x_2 - c_i) \right)^T n_i \right) n_i ,
\]
where c i is the center of the circle, and sign(x) is the sign of the scalar value x according to
\[
\operatorname{sign}(x) = \begin{cases} 1 & x \ge 0 \\ -1 & x < 0 \end{cases} .
\]
The corrected plane normal ñ i ensures a positive rotation direction for increasing control parameter values, provided that point x 1 was recorded with a smaller control parameter value than point x 2 . Figure 6a shows the concept for sign correction of circular features.
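A minimal sketch of this sign correction, assuming the two sample points are ordered by increasing control parameter value (function name is ours):

```python
import numpy as np

def correct_normal_sign(n, c, x1, x2):
    """Flip the plane normal n so that the rotation from x1 to x2 about
    the axis through center c is positive, assuming x1 was recorded with
    the smaller joint control value than x2."""
    s = np.dot(np.cross(x1 - c, x2 - c), n)
    return n if s >= 0 else -n
```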

Link Constellation and Frame Alignment
The link constellation and frame alignment are based on the spatial relationship of the z i−1 and z i axes. An intersection is treated as a special case of skewed lines. Rajeevlochana et al. used Plücker coordinates and dual vector algebra for estimating the line constellation [28]. Plücker coordinates allow closed-form line intersection testing. However, for kinematic chains with low link counts, we did not observe any computational benefits when using Plücker coordinates instead of simple vector algebra as described in the work by Barker [2]. Thus, for the sake of simplicity, we describe line constellations using simple vector algebra. Figure 6b shows the distance of two lines in 3D space. Given two lines L 0 and L 1 in their parametric form
\[
L_0(s) = p_0 + s\, d_0 , \qquad L_1(t) = p_1 + t\, d_1 ,
\]
we wish to find the minimum distance between the two lines. Let q s and q t be the points on lines L 0 and L 1 which minimize the length of the connecting vector
\[
v(s, t) = L_0(s) - L_1(t) .
\]
The minimum Euclidean distance between the lines is given by ||v|| if v has the same direction as the common normal, which can be written as
\[
d_0^T v = 0 , \qquad d_1^T v = 0 .
\]
Substituting the line equations into these two conditions and solving the resulting 2 × 2 linear system for the two unknown parameters yields, with r = p 0 − p 1 ,
\[
s_c = \frac{(d_0^T d_1)(d_1^T r) - (d_1^T d_1)(d_0^T r)}{(d_0^T d_0)(d_1^T d_1) - (d_0^T d_1)^2} , \qquad
t_c = \frac{(d_0^T d_0)(d_1^T r) - (d_0^T d_1)(d_0^T r)}{(d_0^T d_0)(d_1^T d_1) - (d_0^T d_1)^2} .
\]
The constellation type of the two lines c c ∈ {parallel, skewed} can be determined from the divisor: the lines are parallel if the divisor is zero, and skewed otherwise. Finally, the closest points q s and q t can be calculated by substituting the parameters s c and t c into the line equations:
\[
q_s = L_0(s_c) , \qquad q_t = L_1(t_c) .
\]
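The closest-point computation between two lines can be sketched directly from these formulas (the function name and the parallel-case convention are our own):

```python
import numpy as np

def closest_points(p0, d0, p1, d1, eps=1e-12):
    """Closest points q_s, q_t on the lines L0(s) = p0 + s*d0 and
    L1(t) = p1 + t*d1, plus the constellation type."""
    r = p0 - p1
    a, b, c = d0 @ d0, d0 @ d1, d1 @ d1
    d, e = d0 @ r, d1 @ r
    denom = a * c - b * b            # zero iff the lines are parallel
    if abs(denom) < eps:
        s, t = 0.0, e / c            # pick p0; drop its foot onto L1
        kind = "parallel"
    else:
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
        kind = "skewed"
    return p0 + s * d0, p1 + t * d1, kind
```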

First Link Frame
The z direction of the first frame is aligned with the plane normal of the first circular feature: z 1 = ñ 1 . The origin of the first frame o 1 must lie on the first rotational axis L z1 (s) = c 1 + s ñ 1 ; the x axis of the first frame must lie on the plane defined by o 1 and z 1 . The rotation of x 1 around z 1 can be arbitrarily defined. The first frame is then fully defined by its origin o 1 and the basis B 1 , where B 1 is a right-handed orthogonal basis according to Eq. 20. However, this approach will lead to different DH parameters for different point sets of the first circular feature.
A more convenient approach is to additionally align the x direction of the reference frame x 0 with the x direction of the first frame x 1 by projecting x 0 onto plane p 1 :
\[
x_1 = \frac{x_0 - (x_0^T z_1)\, z_1}{\left\lVert x_0 - (x_0^T z_1)\, z_1 \right\rVert} .
\]
If plane p 1 is parallel to the yz plane of the reference frame, the alignment of the y directions can be used instead.

Middle Link Frames
The frames i = 2, . . . , M − 1 can be defined iteratively. The z i axis is aligned with the corrected plane normal ñ i of circular feature i. Depending on whether the axes L i and L i+1 are skewed or parallel, the origin o i is placed at the common normal intersection q s of L i and L i+1 according to Eq. 35, and the x i axis is aligned with the common normal direction. Based on the calculated coordinate frames, the DH parameters for a revolute joint i can be derived from the relative pose of frame i with respect to frame i − 1: d i is the offset of o i along z i−1 , a i is the offset of o i along x i , α i is the angle between z i−1 and z i about x i , and γ i is the angle between x i−1 and x i about z i−1 .
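Reading off the DH parameters from two consecutive, already-aligned frames can be sketched as follows. The sign conventions shown here follow the standard DH identification and may differ from the paper's exact variant; the function name is ours:

```python
import numpy as np

def dh_from_frames(o_prev, x_prev, z_prev, o_i, x_i, z_i):
    """Derive (gamma, d, a, alpha) relating frame i-1 to frame i.

    d and gamma are measured along/about z_{i-1};
    a and alpha are measured along/about x_i.
    All direction vectors are assumed to be unit length.
    """
    d = np.dot(o_i - o_prev, z_prev)
    a = np.dot(o_i - o_prev, x_i)
    gamma = np.arctan2(np.dot(np.cross(x_prev, x_i), z_prev),
                       np.dot(x_prev, x_i))
    alpha = np.arctan2(np.dot(np.cross(z_prev, z_i), x_i),
                       np.dot(z_prev, z_i))
    return gamma, d, a, alpha
```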

Last Link Frame
The last frame can be used to define the tool pose. However, a general pose of the tool would require six DOF whereas a single frame of the DH framework is limited to four DOF. One solution to this problem is to use multiple DH frames to describe the tool pose. In this work, we used a more general approach by extracting a six DOF tool matrix M T for each tool k separately.
By including individual tool matrices, the rotation of the last link frame around the z N axis is arbitrary. However, it is convenient to align the last frame with the previous one. The DH parameters are then calculated according to Section 3.1.7.

Base Transform
The base transform defines the pose of the first coordinate frame {o 1 , x 1 , y 1 , z 1 } with respect to the reference frame {o 0 , x 0 , y 0 , z 0 } using the six DOF Euclidean transformation M B . This problem can be described as estimating a 3D rigid transformation between two point sets and has been studied extensively [1,9,17,18]. In this work, we follow the SVD approach from reference [1]. Let S a = {a 1 , a 2 . . . a N } and S b = {b 1 , b 2 . . . b N } be two point sets with corresponding points. To determine the Euclidean transformation, three or more correspondences are needed. We are looking for the best transformation in the least squares sense:
\[
(R, t) = \arg\min_{R,\, t} \sum_{i=1}^{N} \left\lVert R\, a_i + t - b_i \right\rVert^2 .
\]
The centroids ā and b̄ of the point clouds are given by
\[
\bar{a} = \frac{1}{N} \sum_{i=1}^{N} a_i , \qquad
\bar{b} = \frac{1}{N} \sum_{i=1}^{N} b_i .
\]
The cross-covariance matrix of the two point sets is given by
\[
H = \sum_{i=1}^{N} (a_i - \bar{a})(b_i - \bar{b})^T .
\]
The SVD of the cross-covariance matrix H is
\[
H = U \Sigma V^T .
\]
The rotation R can be calculated by
\[
R = V \operatorname{diag}\!\left( 1, 1, \det(V U^T) \right) U^T ,
\]
which corrects for possible reflections. Finding the optimal rotation R is also known as the Kabsch algorithm [20]. The optimal translation t is given by
\[
t = \bar{b} - R\, \bar{a} .
\]
The rigid transform M B , which describes the pose of the first link frame {o 1 , x 1 , y 1 , z 1 } with respect to the reference frame {o 0 , x 0 , y 0 , z 0 }, can be calculated according to Eq. 51:
\[
M_B = \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} .
\]
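The point set alignment can be sketched as a minimal Kabsch implementation in numpy (function and variable names are ours):

```python
import numpy as np

def kabsch(A, B):
    """Best rigid transform (R, t) with R @ a_i + t ~= b_i in the least
    squares sense, via SVD of the cross-covariance matrix.

    A, B: (N, 3) arrays of corresponding points, N >= 3.
    """
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    a_bar, b_bar = A.mean(axis=0), B.mean(axis=0)
    H = (A - a_bar).T @ (B - b_bar)          # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # diag(1, 1, det(V U^T)) guards against reflections.
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ D @ U.T
    t = b_bar - R @ a_bar
    return R, t
```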

Camera Tool Frame
The pose of the end effector k with respect to the last link frame {o M , x M , y M , z M } is defined by a six DOF rigid transform. If two point sets with N ≥ 3 correspondences can be obtained, the point set alignment algorithm as described in Section 3.1.9 can be applied. The pose of the camera end effector can be defined by the end effector position, two orthogonal vectors describing the x and y directions, and the right-handed coordinate system constraint. The end effector position was already used for the DH parameter extraction. The x and y directions of the camera are aligned with the u and v directions of the image space, respectively. By extending the set of back-projected pixel coordinates during the measurement flow for the DH parameters, the x and y directions of the end effector with respect to the reference frame can be extracted. Figure 5c and d show the end effector recordings, including the observable frame axes of the end effectors; Fig. 7a shows the back-projection method and the alignment of camera and image frame.
Given the camera intrinsics, the camera projection matrix P can be used to back-project a homogeneous pixel û into the camera space; the resulting point can then be transformed to the reference space according to Eq. 10. By defining the tool space as the camera space, the relationship between a homogeneous point x̂ T in the camera space and a homogeneous point x̂ 0 in the reference space is given by
\[
\hat{x}_0 = M_B \, M_{1,M} \, M_T \, \hat{x}_T ,
\]
where M 1,M is the relative pose of frame M with respect to frame 1 as defined in Eq. 9. Rearranging this relation leads to
\[
M_T \, \hat{x}_T = M_{1,M}^{-1} \, M_B^{-1} \, \hat{x}_0 ,
\]
which can be solved for M T using N ≥ 3 point correspondences. Point correspondences x̂ T and x̂ 0 can be found by back-projecting pixels into the camera space. Let S u = {u 1 , u 2 , u 3 } be a set of 2D image points with
\[
u_1 = c , \qquad
u_2 = c + \begin{bmatrix} 1 \\ 0 \end{bmatrix} , \qquad
u_3 = c + \begin{bmatrix} 0 \\ 1 \end{bmatrix} ,
\]
where c is the principal point of the camera. The image coordinates given in S u are back-projected using the API function given in Eq. 12 with d CAM = 1 and constant control parameters {ϕ j , θ j } for the whole set S u . The measured points are then converted to a local coordinate frame using x 1 as origin. The rigid transformation M (CAM) T can then be extracted according to Section 3.1.6 using the two point sets, where x i is the Euclidean representation of the corresponding homogeneous coordinate x̂ i . A constant distance d CAM = 1 for all back-projected points will lead to a systematic error, which is negligible for the tool transform estimation. This can be shown by calculating the corrected length d̃ CAM,i of the back-projected ray for the image coordinates u i , i ∈ {2, 3}. Assuming u 1 is aligned with the principal ray and the focal length f is known, the ray length d̃ CAM,i is given by triangle similarity and the Pythagorean theorem according to
\[
b_{u,i} = \frac{\lVert u_1 - u_i \rVert}{f} \, d_1 , \qquad
\tilde{d}_{CAM,i} = \sqrt{ d_1^2 + b_{u,i}^2 } ,
\]
where ||u 1 − u i || describes the distance of the two pixels {u 1 , u i } in the image space and {b u,i , d 1 , d̃ CAM,i } define a right-angled triangle. For f ≫ ||u 1 − u i || and d 1 = d CAM = 1 we can set d̃ CAM,i ≈ d i . Figure 7b shows the simplified relationship of back-projected rays with respect to the principal ray.

EDM Tool Frame
Similar to the camera tool, the pose of the EDM end effector can be described by a six DOF Euclidean transformation. Distance measurements can be modelled by a 3D ray in the Euclidean space. This leads to a tool pose defined up to an arbitrary rotation around the measurement axis. However, it is convenient to align the EDM frame with the camera frame.
The pose of the EDM end effector can be defined by a ray which describes the distance measurement in 3D, a related orthogonal vector, and the right-handed coordinate system constraint. The end effector position, which defines a point on the ray, was already used for the DH parameter extraction.
We define the z axis of the EDM as the distance measurement direction. By extending the set of back-projected distances during the measurement flow for the DH parameters, the z direction of the end effector with respect to the reference frame can be extracted.
A 1D distance measurement can be back-projected using
\[
\hat{x}_i = M_B \, M_{1,M} \, M_T^{(EDM)} \, \hat{d}_i ,
\]
where x̂ i is a homogeneous point in the reference frame, and d̂ i is a homogeneous point in the EDM space. The homogeneous point d̂ i for a distance measurement d i along the z axis of the EDM frame is defined as
\[
\hat{d}_i = \begin{bmatrix} 0 & 0 & d_i & 1 \end{bmatrix}^T .
\]
Rearranging the first relation leads to
\[
M_T^{(EDM)} \, \hat{d}_i = M_{1,M}^{-1} \, M_B^{-1} \, \hat{x}_i ,
\]
which is of the same form as Eq. 65. We construct three correspondences for solving this system for the rigid transform M (EDM) T . Let S d = {d 1 , d 2 } be a set of two distances where d 1 = 1 and d 2 = 2. The back-projection of the distances d i to 3D points in the reference frame can be done with the API function given in Eq. 11 using constant control parameters {ϕ j , θ j } for the set S d . The x direction of the EDM frame can be found by projecting the x direction of the camera frame onto the plane defined by the back-projected point of d 1 and the plane normal n = z EDM , given by the direction of the difference of the two back-projected points. The points are then converted to an intermediate coordinate frame with the back-projected point of d 1 as origin. The rigid transformation M (EDM) T can then be extracted according to Section 3.1.6, where the two point sets are formed by the back-projected points and their EDM space counterparts, and where d i is the Euclidean representation of the homogeneous coordinate d̂ i .

Experimental Results for the Forward Kinematics Model of an Exemplary RTS
In this section, we apply the modeling method introduced in Section 3 to the HILTI PLT 300 [16]. To cover a wide spectrum of possible applications, we compare the DH results to the related spherical model, which represents simplified and idealized relationships between actuators and sensors, and to the numerically optimized model of the exemplary RTS. The HILTI PLT 300 is a high-precision measurement device which contains one EDM and supports RGB and infrared (IR) image and video streaming using a single camera. The device driver provides access to basic control and measurement variables, calibration values, data streams and seven manufacturer-side calibrated optical zoom positions of the camera. While the device can be geometrically described by standard kinematics models of theodolites and robotic total stations [38], no detailed kinematics model of the device is available.
The data set for the DH model extraction of the PLT 300 contained 15 points for each joint, which were recorded using the API functions introduced in Eqs. 11 and 12. End point positions for the first joint were recorded by setting the control variable θ to the fixed value θ = π/2 while varying ϕ from 0 to 2π. End point positions for the second joint were recorded by setting the control variable ϕ to the fixed value ϕ = 0 while varying θ from 0 to π. The angular parameter space coverage is shown in Fig. 5b. For each position, the length of the back-projected EDM ray and the length of the back-projected camera view ray were set to one meter (d 1 = 1). The pixel position for the back-projected camera view ray was aligned with the principal point of the camera, which is also provided by the API of the device. For EDM tool pose extraction, a second ray at each control position was extracted using a constant ray length of two meters (d 2 = 2). For camera tool pose extraction, two additional rays at each position were used, for which the back-projected image coordinates were shifted by one pixel in the x and one pixel in the y direction, respectively. Table 1 shows the extracted DH parameters; Table 2 shows the base and tool transforms for both the EDM and the camera end effectors. The complete extracted kinematics model, including all coordinate frames and transformation matrices, is shown in Fig. 8. We show results for only one of the seven fixed zoom levels.

Model Simplification
The extracted model parameters in Section 4 are sufficient for building a device simulator. However, an idealized device simulation is often preferred. Idealized models can be used for analytic system and algorithm design and verification, for calculating or simulating desired system behaviours or for generating reference data sets.
The device used for this work is close to an idealized device: the tool transforms contain translation parameters which are close to zero and rotation parameters which can be approximated by multiples of π/2. In this section, we apply numerical approximations to relate the extracted kinematics model to the simplified and idealized spherical model. We represent a rotation with a 3D rotation matrix according to
\[
R(\alpha, \beta, \gamma) = R_z(\alpha)\, R_y(\beta)\, R_x(\gamma) ,
\]
where {α, β, γ} are the Euler angles [32]. By setting the translation parameters to 0 and replacing the rotational components by the nearest multiple of π/2, we get the simplified forward kinematics model of the PLT 300. The rotation matrices R x and R z are defined in Eqs. 6 and 7, respectively. The rotation matrix R y is defined as
\[
R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix} .
\]
Multiplying and simplifying the resulting matrix M B,T leads to a compact expression, which can be written as an Euler rotation according to the representation above.
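For reference, the idealized spherical model used for comparison can be sketched as a single function. The axis and angle conventions used here (ϕ as horizontal angle, θ as zenith angle) are an assumption and may differ per device:

```python
import numpy as np

def spherical_forward(phi, theta, d):
    """Idealized spherical RTS model: map horizontal angle phi, vertical
    (zenith) angle theta and distance d to a 3D point in the reference
    frame. theta = pi/2 corresponds to a horizontal measurement."""
    return d * np.array([np.cos(phi) * np.sin(theta),
                         np.sin(phi) * np.sin(theta),
                         np.cos(theta)])
```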

Model Error Estimation
The DH model error can be expressed using the average point distance d̄ and the unbiased standard deviation σ̂ between recorded and calculated point sets according to
\[
\bar{d} = \frac{1}{N} \sum_{i=1}^{N} \lVert x_{i,meas} - x_{i,calc} \rVert , \qquad
\hat{\sigma} = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \left( \lVert x_{i,meas} - x_{i,calc} \rVert - \bar{d} \right)^2 } ,
\]
where x i,meas are points of the measurement set and x i,calc are points of the calculated point set. The measured point sets for the EDM tool and for the camera tool have been created using Eqs. 11 and 12, respectively. The calculated point set for the EDM tool is given by Eq. 74. The calculated point set for the camera tool can be derived from Eq. 62. A back-projected homogeneous image coordinate û defines a view ray X T (û, λ) in the reference frame [14]:
\[
X_T(\hat{u}, \lambda) = P^{\dagger} \hat{u} + \lambda\, c ,
\]
where P † is the pseudo-inverse of the projection matrix P and c is the camera center. The 3 × 4 projection matrix P projects points from the reference frame to the camera image. While SVD can be used to reduce numerical instabilities when estimating the pseudo-inverse P † [14], this ray equation has two further problems: 1. a camera center c = 0 does not result in a valid ray equation, and 2. λ is not a linear parameterization of the ray length.
Therefore, the following alternative view ray calculation should be used for finite cameras [14]:

$$X_T(\hat{u}, \mu) = c + \mu \, \frac{M^{-1}\hat{u}}{\left\lVert M^{-1}\hat{u} \right\rVert} \quad (96)$$

where $\hat{u}$ is a homogeneous coordinate vector, and μ is the length of the back-projected ray. The decomposition of the projection matrix $P$ into the 3 × 3 matrix $M$ and the 3 × 1 column vector $p_4$ is given by

$$P = \left[\, M \mid p_4 \,\right], \qquad c = -M^{-1} p_4 \quad (95)$$

Equation 96 can be used to calculate the evaluation point set for the camera tool frame, using the ray length μ = d_CAM.
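The finite-camera back-projection of Eq. 96 can be sketched as follows, assuming $P = [M \mid p_4]$ with invertible $M$; the identity test camera in the usage example is illustrative only:

```python
import numpy as np

def back_project(P, u, mu):
    """Back-project homogeneous image point u to the 3D point at
    ray length mu from the camera center (finite cameras, cf. [14])."""
    M, p4 = P[:, :3], P[:, 3]
    c = -np.linalg.solve(M, p4)      # camera center c = -M^-1 p4 (Eq. 95)
    ray = np.linalg.solve(M, u)      # ray direction M^-1 u
    return c + mu * ray / np.linalg.norm(ray)   # Eq. 96
```

For a canonical camera P = [I | 0], back-projecting the principal point û = (0, 0, 1) with μ = d_CAM yields the point (0, 0, d_CAM), as expected.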
We analyzed different aspects of the modeling error and the influence of the individual end effectors. Traditionally, the camera of the RTS is used by the operator to aim at a certain 3D point, but the actual measurement does not use image-based measurement (IBM) methods; scanning applications may not use the camera at all. Ideally, the z axes of the camera and the EDM are aligned, so the tool rotations around the principal ray and the EDM ray do not influence the result. For such applications, the tool rotations around their respective z axes should be excluded from the error analysis (method A). Advanced applications use IBM methods; hence, the errors of all tool poses have to be considered (method B). The different aspects of the evaluation are shown in Table 3.
The results in Table 4 compare the simplified spherical model with the geometrically extracted DH model; the individual and compound errors for the end effectors are provided.

Model Optimization
The kinematics modeling method proposed in Section 3 is a greedy algorithm, which only optimizes local cost functions. While this is sufficient for many applications, the estimated model can be refined using nonlinear optimization techniques to decrease the model error. In this section, we briefly discuss model optimization using a global error function. However, this is only a proof of concept; a detailed optimization analysis is beyond the scope of this work. Finding an optimal model can be formalized as a nonlinear optimization problem with boundary conditions and linear scalarization:

$$\min_{\mathbf{p}} \; \omega_d \, \bar{d}(\mathbf{p}) + \omega_\sigma \, \bar{\sigma}(\mathbf{p}) \quad \text{s.t.} \quad t_{\min} \le t_j \le t_{\max}, \quad r_{\min} \le r_k \le r_{\max}$$

We used the lower boundary t_min = −10⁻² m and the upper boundary t_max = 10⁻² m for all translational model parameters. Additionally, we used the lower boundary r_min = −π and the upper boundary r_max = +π for all rotational model parameters. Sequential quadratic programming (SQP), a gradient-based iterative numerical optimization method, was used to refine the initial kinematics model. Details about SQP can be found in the book by Nocedal and Wright [26].
Usually, the weights of the objectives are normalized such that ω_d + ω_σ = 1 [10]. In this work, however, we weighted the mean value with ω_d = 1.0 and the unbiased standard deviation with ω_σ = 1.0 for better comparison between the non-optimized result given in Table 4 and the evolution of the residual over the optimization iterations shown in Fig. 9a.
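As a proof of concept, the scalarized objective with box constraints can be sketched with SciPy's SLSQP implementation, used here as a stand-in for the MATLAB SQP solver employed in this work; `forward_model` and all other names are illustrative placeholders:

```python
import numpy as np
from scipy.optimize import minimize

def objective(params, x_meas, forward_model, w_d=1.0, w_sigma=1.0):
    # Scalarized cost: w_d * mean point distance + w_sigma * unbiased std
    x_calc = forward_model(params)
    d = np.linalg.norm(x_meas - x_calc, axis=1)
    return w_d * d.mean() + w_sigma * d.std(ddof=1)

def refine(params0, x_meas, forward_model, bounds):
    # bounds: e.g. (-1e-2, 1e-2) m for translations, (-pi, pi) for rotations
    res = minimize(objective, params0, args=(x_meas, forward_model),
                   method='SLSQP', bounds=bounds)
    return res.x
```

With w_d = w_σ = 1.0, the residual is directly comparable to the non-optimized errors, as done in this work.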
The optimization was implemented and executed with MATLAB 2017 and the MATLAB Optimization Toolbox [24], using four parallel sub-processes on Microsoft Windows 10 with an Intel Core i7 processor and 64 GB RAM. The runtime of the SQP-based optimization was 1.326 × 10³ s (≈ 0.4 h).
The error results of the optimized model e_optimized are shown in Table 5; the optimized model parameters are shown in Tables 6 and 7, respectively. The error distributions with respect to the angle control parameter space are shown in Figs. 10 and 11, respectively.

Discussion
The model error e_optimized of Table 5 was calculated using the reduced control parameter set {ϕ_i, θ_i}, as shown in Fig. 5b, for both optimization and evaluation. For simple cross-validation, the optimized model was applied to all samples of the full control parameter space, as shown in Fig. 9b, which corresponds to the error e_optimized,cross in Table 5. Cross-validation is explained in the book by Witten and Frank [40]. In Table 5, e_optimized,full shows the error of the optimized model, where the full control parameter space was used for both model fitting and validation. The errors of all three methods {e_optimized, e_optimized,cross, e_optimized,full} were calculated using Eqs. 92 and 93. The error values of all three methods are in the same order of magnitude, which shows that the sub-set of the recorded samples is sufficient for RTS model optimization. However, a slight decrease in the modeling error can be observed when using samples from the full angle control parameter space.

The comparison of Table 4 with Table 5 shows a decrease in the modeling error by one order of magnitude when applying numerical optimization subsequent to the geometric DH parameter estimation. Hence, numerical optimization is essential for kinematics modeling. Figures 10 and 11 show the distribution of the modeling error with respect to the angle control parameter space for the non-optimized and the optimized model, respectively. Sample points in the critical vertical angle regions were excluded throughout this work for stability reasons. In Fig. 11, samples of the full control parameter space were used for optimization and validation. The critical regions exclude samples near the poles of the spherical model, defined by θ_i = {0, π}, from the calculations. In particular, the critical region around θ_i = π also excludes the non-measurable area of a physical RTS, as shown in Fig. 12. In this work, we set the critical regions to |θ_i| = 0 and 3/4 π … π.
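The exclusion of samples in the critical vertical angle regions can be sketched as follows; the width of the band around θ_i = 0 is an illustrative assumption, while the upper band 3/4 π … π follows the regions stated above:

```python
import numpy as np

def in_critical_region(theta, eps=0.05):
    """True if the vertical angle lies near the lower pole of the
    spherical model (theta ~ 0) or in the upper band 3/4*pi..pi."""
    theta = np.abs(theta)
    return (theta < eps) | (theta >= 0.75 * np.pi)

def filter_samples(phi, theta, d):
    # Keep only samples outside the critical vertical angle regions
    keep = ~in_critical_region(theta)
    return phi[keep], theta[keep], d[keep]
```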
All calculations were carried out with 64-bit floating point arithmetic, which provides a precision of about 16 significant decimal digits. The integer parts of all distance and angle values used in this work can be represented with one decimal digit; the fractional parts are represented by 15 decimal digits. Details about floating point arithmetic are given in the IEEE 754-2008 standard for floating-point arithmetic [19]. The range of the EDM error distribution of the simplified model, shown in the top left diagram of Fig. 10, is in the magnitude of the round-off effects of 64-bit floating point arithmetic. Hence, this error distribution can be considered as noise introduced by round-off effects; the error distribution of the geometrically estimated DH model, in contrast, lies above this noise level. (The analysis of the effect of using a reduced distance parameter space is beyond the scope of this work.)

When comparing all analyzed models, the simplified model is the best match for the device if only the EDM end effector is considered. In particular, the EDM modeling error distribution, shown in the top left diagram of Fig. 10, is in the magnitude of numerical round-off effects. This indicates that the driver uses a spherical coordinate system to convert angle and distance sensor data {ϕ_i, θ_i, d_i} to Euclidean points x_i.
When only the camera end effector is considered, the accuracy of the geometrically estimated model is about one order of magnitude lower than that of the simplified model; however, its error distribution is more uniform with respect to the angle parameter space. The distribution of the simplified model indicates a translational component of the camera pose, which cannot be modeled by a single spherical coordinate system.
The compound errors of the individual models are mainly influenced by sample points of the camera frame. This indicates that the camera model used in this work is too simplistic. To improve the modeling accuracy and precision, a more general camera model could be applied.
The results presented in Section 4 show that a spherical representation of the RTS is sufficient for an idealized kinematics simulation of the system. If a more detailed model is required, the system can be described by DH parameters using the method introduced in Section 3. The results presented in Section 4.3 show the significance of the downstream numerical optimization.
While the analyzed models cover a wide range of possible applications, additional aspects can be addressed to improve the simulation of the device properties: Optical Zoom Levels An RTS usually provides different optical zoom levels. The exemplary device used for the project supports seven optical zoom levels [16]. In this work, only the first optical zoom level was presented.
A more detailed description would include individual tool matrices M_T^(k) and projection matrices P^(k) for each zoom level k.
Faces An RTS supports measurements in both faces to reduce systematic measurement errors [38]. Changing the face for a particular measurement from face one to face two means changing the control variables according to ϕ_2 = ϕ_1 − π and θ_2 = −θ_1, where {ϕ_1, θ_1} and {ϕ_2, θ_2} are the control variables for face one and face two, respectively. Depending on the particular device-specific API, face control can either be coded into the control variables or accessed explicitly. Hence, either a unique kinematics model for both faces or a separate kinematics model for each face has to be applied. In this work, we used only one face for model estimation. Furthermore, the image orientation of the second face might be corrected implicitly by the device driver or explicitly by the application software.
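The face change stated above can be written as a small helper; wrapping the resulting horizontal angle to [−π, π) is an added assumption, since the valid angle range depends on the device API:

```python
import numpy as np

def to_face_two(phi1, theta1):
    """Convert face-one control variables to face two:
    phi2 = phi1 - pi (wrapped to [-pi, pi)), theta2 = -theta1."""
    phi2 = np.mod(phi1, 2.0 * np.pi) - np.pi  # equals wrap(phi1 - pi)
    return phi2, -theta1
```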

Simulation of Complete Model
A complete kinematics model allows arbitrary rigid placement of the robot base and tools [11]. The complete kinematics description used in this work requires base and tool transforms in addition to the DH parameters, as proposed by Veitschegger and Wu [39]. However, some existing kinematics simulators might not support this extension without further modifications [27]. For end-to-end simulation, it might be required to express base and tool transforms as DH joints.

Nonlinear Kinematics Model
Including non-linear system properties could further increase the model accuracy. For example, signal conditioning for angular control variables can be included in the circular feature extraction method of Section 3 or in the numerical optimization method of Section 4.3.

Fig. 13 Exemplary scene graphs for RTS simulators, including the following nodes: scene node (reference frame), environment node (measurement targets), and RTS node (device model). Simple scene graph: no visualization of the kinematics of the individual RTS components. Extended scene graph: DH parameters can be applied to the individual RTS components.

Dynamic Model of the Robot The RTS prototype used in this work provides only limited access to dynamic control parameters; hence, no dynamic model was derived. We will continue working on modeling the dynamic behavior of the PLT 300.
Alternative Error Formulation The model error used in this work is based on Euclidean distances between recorded and calculated point sets. A more detailed error formulation could separate translational and angular errors.
The estimated models can be used for RTS simulation with custom or standard robotic simulators. Figure 13 shows the scene graph of an exemplary RTS simulator; Fig. 14 shows a custom RTS simulator in Unity3D. An early version of the simulator was used by Klug et al. for the design of interactive RTS algorithms [21]. A detailed description of the simulator is beyond the scope of this work.

Conclusion
In this work, we presented a complete workflow for forward kinematics modeling of an RTS using the DH convention. The workflow is tailored to RTS and includes a detailed formalization and description of all individual steps. Furthermore, we provided the forward kinematics model of the PLT 300 and analyzed the influence of simplifying and of optimizing the extracted kinematics model. The applied error analysis clearly shows the significance of the downstream optimization.
The choice between the simplified and the optimized model depends on the application. The simplified model represents an ideal device without systematic errors, which can be used for basic algorithm design, development, and verification; a typical application of this model is error propagation analysis [38]. The optimized model matches the analyzed device more precisely; typical applications are device simulation, test data generation, and conversion of sensor data. By using DH parameters, existing robotic simulators can be used, or custom simulation environments can be implemented.
The modeling error distribution shown in Figs. 10 and 11 indicates that further enhancements could be achieved by applying more general camera models.
In general, we believe that the DH parameters and the proposed workflow are feasible for RTS simulation. Future work includes more complete sensor models, modeling the dynamics of the RTS, and comparing the usability of existing simulators for RTS simulation.