Leveraging spatial uncertainty for online error compensation in EMT

Purpose Electromagnetic tracking (EMT) can potentially complement fluoroscopic navigation, reducing radiation exposure in a hybrid setting. Due to the susceptibility to external distortions, systematic error in EMT needs to be compensated algorithmically. Compensation algorithms for EMT in guidewire procedures are only practical in an online setting. Methods We collect positional data and train a symmetric artificial neural network (ANN) architecture for compensating navigation error. The results are evaluated in both online and offline scenarios and are compared to polynomial fits. We assess spatial uncertainty of the compensation proposed by the ANN. Simulations based on real data show how this uncertainty measure can be utilized to improve accuracy and limit radiation exposure in hybrid navigation. Results ANNs compensate unseen distortions by more than 70%, outperforming polynomial regression. Working on known distortions, ANNs outperform polynomials as well. We empirically demonstrate a linear relationship between tracking accuracy and model uncertainty. The effectiveness of hybrid tracking is shown in a simulation experiment. Conclusion ANNs are suitable for EMT error compensation and can generalize across unseen distortions. Model uncertainty needs to be assessed when spatial error compensation algorithms are developed, so that training data collection can be optimized. Finally, we find that error compensation in EMT reduces the need for X-ray images in hybrid navigation.


Introduction
Electromagnetic tracking (EMT) is a key technology to enable navigation in minimally invasive surgery without line of sight. As miniaturized sensors can be integrated into catheters, EMT has potential to be employed for guidewire navigation in abdominal aortic aneurysm repair (AAAR) [15,16]. In current clinical practice, fluoroscopic X-ray imaging is considered the gold standard for guidewire navigation in endovascular aneurysm repair [5]. However, X-ray imaging exposes both the surgeon and the patient to ionizing radiation [8]. The high accuracy [13] and visual feedback of fluoroscopy means complete removal of X-ray in minimally invasive vascular surgery is unrealistic in near future. A more realistic approach is to consider a hybrid navigation framework. In this framework, continuous navigation will be performed by radiation-free EMT, while X-ray snapshots will be acquired on demand for recalibration or dexterous maneuver. This hybrid navigation reduces the amount of Xray images that need to be captured during the procedure, which in turn will reduce the radiation exposure for both surgeon and patient.
EMT navigation is negatively affected by the presence of metal or electromagnetic interference within the vicinity of the tracking system. The presence of the c-arm X-ray unit within the operating room (OR) is a dominant source of metallic distortion for the EMT measurement. While it is well-known that passive countermeasures (such as removal of the metallic object) might mitigate such error [6], the c-arm fluoroscopy unit is essential for hybrid tracking procedures. Thus, the c-arm cannot be removed from the OR.
Rather, an active error compensation is necessary to improve electromagnetic tracking accuracy. Unlike random 123 Compensation-inherent errors Lack of training data, sparsity of training points More training data points denser spacing of points error that can be eliminated by averaging recorded sensor data over multiple samples, compensating systematic error requires more sophisticated algorithms. Classical techniques such as lookup-tables [17], interpolation [23] or polynomial regression [9], only work under known distortion characteristics. Such offline compensation requires a tedious data acquisition procedure every time the c-arm position is changed. Clearly, these algorithms are impractical for hybrid navigation in the OR. EMT navigation in surgery thus requires online compensation approaches, where the compensating algorithm can be used in any distortion scenario.
Training data for online compensation need to be collected only once for several scenarios. For the sake of brevity, all types of errors related to EMT and countermeasures are summarized in Table 1. In this paper, we present an active online error compensation approach for EMT navigation in AAAR. First, we collect positional EMT data in one laboratory and several OR scenarios with different degrees of distortion. We capture EMT sensor positions on a calibrated Lego phantom (Figs. 1a and 2). Metallic distortion is artificially introduced to the magnetic field by positioning a c-arm fluoroscopy unit (Fig. 1b) in varying alignments next to the Lego setup. We then use an artificial neural network (ANN) for approximating a function that maps erroneous positions to compensated positions. The ANN is evaluated in an online setup to compensate distortions that are not available in the training phase. As model predictions can be uncertain in regions where training data are sparse, we assess spatial uncertainty inherent to the ANN. This uncertainty evaluation is performed at different regions of the navigation volume with varying availability of training data. In a final experiment, we simulate a trajec-tory resembling the guidewire path through the abdominal aorta. We use our knowledge about ANN inherent uncertainty for finding optimal points to acquire X-ray images. For guidewire insertion in the real OR, these X-ray images can be used to rectify uncertain (and hence erroneous) EMT sensor positions. Our simulation provides initial understanding to the trade-off in loss of precision versus reduction in radiation exposure using hybrid tracking. This is the first work describing an uncertainty-aware online error compensation approach for EMT in endovascular surgery. Our contributions are twofold; first, we describe a neural network to approximate a spatial compensation function based on relative distances. The approximation works online in scenarios with unknown distortions. Second, we assess spatial model-inherent uncertainty of the neural network regression and its effect on positional error. This analysis provides us insight about the linear relation between model uncertainty and tracking accuracy, and the potential radiation-error trade-off for hybrid guidewire navigation in AAAR.

Related work
As a comprehensive description of all the active EMT error compensation techniques is beyond the scope of the paper, we point the reader to Franz et al. [6] and Kindratenko et al. [10] who provide a comprehensive review of this topic. Instead, in this paper, we mainly focus on the compensation techniques similar to ours. Kindratenko et al. [11] propose a two hidden layer neural network that outperforms polynomial fits and lookup-table compensation in an offline compensation setup.
Online compensation approaches use data from additional sensors [20] or sensor arrays [18] to map metallic distortions in the tracking volume. Sadjadi et al. propose a simultaneous localization and mapping (SLAM) approach that reduces positional error by 67%, but requires auxiliary sensors to be rigidly attached to the tracked instrument-which is not applicable to guidewires or catheters in endovascular navigation.
In endovascular surgery, the use of EMT is evaluated in several phantom [4,22], swine [15,22] and patient studies [16]. These studies show that there is potential for EMT to be applied in AAAR, with positional errors of up to 5 mm.
Neural networks, such as those we use for error compensation in this paper, are black boxes due to their complexity and nonlinearity. We therefore employ means to make model predictions traceable. Gal et al. [7] propose to use dropout masks for hidden layers at both training and inference time to obtain a Bayesian approximation for prediction uncertainty in classification problems. In this paper, we generalize this approximation to learn about the limits of the presented regression approach for spatial error compensation.

Materials
Positional tracking experiments are performed with an Ascension trakSTAR 3D Guidance system (Northern Digital Inc.) under the use of a 1.8 mm sensor. Positional EMT measurement data are collected on a calibrated Lego measurement phantom (repeatability 20 µm) similar to the one proposed by us earlier [14]. EMT measurements are performed in laboratory and near a Ziehm Vision 3D c-arm fluoroscopy unit. Software for interfacing the trakSTAR system is developed in C++. Compensation models are implemented in Python (Python Software Foundation) using Keras [3] and tensorflow backend [1].

Methods
First, training and evaluation datasets are acquired in one laboratory and multiple c-arm scenarios. We describe the acquisition and preprocessing in "Data acquisition and preprocessing" section. Acquired datasets are then used for training neural networks for EMT error compensation, which we describe in "Error compensating neural networks" section. We use our ANN in four experimental setups. In the first experiment, the ANN is trained on a multitude of datasets for online compensation ("Compensation of unseen distortions" section). Afterward, we perform an offline evaluation to compare the ANNs to a similar compensation approach ("Known distortion compensation" section). We then evaluate spatial model uncertainty for the online model in "Model uncertainty evaluation" section. Finally, in a simulation experiment ("Simulated hybrid AAAR intervention" section), we use model uncertainty to find a threshold for recalibration.

Data acquisition and preprocessing
Data points are captured by sequentially positioning a Lego block with an embedded EMT sensor on ten calibrated positions (see Fig. 1a) of the phantom. Random EMT error is eliminated by taking the median of 500 samples. We collect positional datasets in multiple scenarios with artificial distortion and in a laboratory scenario. Each distorted scenario uses a different c-arm alignment with respect to the Lego phantom. Table 2 shows the datasets collected for the experiments in "Compensation of unseen distortions" and "Known distortion compensation" sections. Positional data are collected in three phantom elevations in steps of 9.6 mm (height of one Lego brick). With each c-arm position, we also measure positions with the phantom rotated by 180 • around its azimuth axis. Error values are calculated as e = ||x 2 − x 1 ||− y, where e is error, x 1 and x 2 are two different measuring points and y is the respective ground truth distance on the Lego board.

Error compensating neural networks
We mitigate systematic positional error by approximating a compensation function that maps erroneous to compensated points. The compensation function is approximated by a three-layer ANN with 32 units per layer. These parameters as well as the batch size (512) are estimated by grid search. The ANN uses leaky ReLU activations (α = 0.01) in the hidden layers to prevent vanishing gradients.
The compensation function has four input units for x, y, z and the trakSTAR quality indicator, which are all normalized to an interval [0, 1] to improve model stability. This normalization is reverted after the final layer, which contains three units with linear activations for x, y and z coordinates.
During training, two ANNs with shared weights are arranged in parallel, as illustrated in Fig. 3. We can therefore train the compensation function on relative displacements, but use the trained function for absolute point predictions. Training on relative displacements between positions ensures that the exact distance of phantom to the field generator center does not need to be measured [14]. Hence, this approach circumvents the need for a second measurement standard to capture absolute positions, which would contribute to overall measurement uncertainty. In the training phase, the displacement error (mean-squared-error) is minimized by Adam optimizer [12] (learning rate = 0.01): where f denotes the learned compensation function approximated by the ANN, y is the ground truth displacement length, x 1 , x 2 are measured EMT points, q 1 and q 2 are respective quality indicator values and ω is the matrix of learned weights.
As mentioned earlier, we add a quality indicator value that is reported by the trakSTAR system along with every measurement, as additional input to the compensation models in expectation of better generalization performance across scenarios. According to the trakSTAR user manual [2], the quality value is computed from an internal error indication and four user-defined quality-parameters: where S, b, and m denote user parameters (sensitivity, offset, slope), r is the sensor-transmitter range. We obtain raw quality values by setting the user parameters to S = 1, b = 0, m = 0. For comparison, we implement mixed-term polynomial regression models as proposed by Kügler et al. [14]. Both compensation models are trained on pairs of EMT sensor positions and corresponding ground truth distances on the Lego phantom.

Compensation of unseen distortions
We train the ANN on data from four c-arm scenarios (see Table 2). For model validation, we use the data obtained in the laboratory setup. The trained model is evaluated on the remaining four datasets (Fig. 4). Likewise, we train and

Known distortion compensation
In this experiment, we evaluate ANN models in the same carm setup in which they are trained (offline compensation). This experiment measures the best-case outcome for learning based compensation. We examine compensation for all scenarios in Table 2 individually, where training and evaluation sets are chosen to be spatially independent. The data are divided into training/validation/testing sets with a split of 45/5/50.

Model uncertainty evaluation
Model-inherent uncertainty for the ANN is estimated by applying 10% dropout during training and at inference time (Monte Carlo Dropout [7]). We take 3000 samples from the distribution of compensated output positions to obtain a Bayesian approximation of model-inherent uncertainty. Spatial uncertainty is expressed as the standard deviation σ = σ 2 x + σ 2 y for each point (x, y) in the planar full-baseboard dataset. Distributions of model predictions cannot be assumed to be Gaussian (see section 4.1 in [14]), so that the 68-95-99.7 rule for confidence interval approximation does not apply here.
To examine spatial uncertainty for a large portion of the specified tracking volume, we collect a dataset in the 2D plane by moving the phantom to different positions on the gray Lego board. This dataset contains 10,598 different displacements, collected in six different alignments of the c-arm to the tracker. We use this dataset for training a neural network analogously to "Compensation of unseen distortions" section, but with modifications to the ANN layout. That is, a neural network with two layers, 64 units per layer and two     (x, y) is employed for the following evaluations.
In addition to the training set, we collect measurement data for evaluation from all Lego points within the specified area (green area in Fig. 2). Unlike the training sets, this dataset also contains phantom points that were not calibrated. As phantom uncertainty (≈ 20 µm according to [14]) is negligible compared to model-inherent uncertainty we want to examine, this simplification is valid. This measurement allows for an evaluation of spatial epistemic uncertainty over the whole base board (see Fig. 5).
Although the symmetric ANN is trained and validated on pairs of positions and respective ground truth distances, a single trained ANN can be used for absolute point compensation during inference (compare x 1,c and x 2,c in Fig. 3). In this experiment, we let our trained ANN predict compensated positions for the whole baseboard. Absolute positional error is then estimated by calculating the measured distances to adjacent points in a Moore neighborhood (r = 3) [21] and averaging the error.

Simulated hybrid AAAR intervention
Based on real EMT data, we simulate a sensor moving on a path inside a virtual abdominal aorta. This simulation is inspired by guidewire insertion in AAAR using hybrid navigation. Path shapes are motivated by those of abdominal aorta, as shown in Fig. 6. Since the average abdominal aorta is 20 cm to 25 cm in length, a simulated path of 21.9 cm is chosen.
Start and end points of each path segment are taken from the full-base-board dataset. Comparing the distances between segment start and end points to respective ground truth distances yields positional error. Uncertainty is determined position-wise as described in "Model uncertainty evaluation" section.
In hybrid navigation, guidewire position can be precisely recalibrated by X-ray pose estimation [13] and fiducial registration [15,22]. Correcting the EMT sensor location by an X-ray image exposes the patient to radiation, so that recalibrations should rarely be employed. Hence, we are facing a trade-off between radiation dose and tracking accuracy. We introduce the concept of recalibration to our simulation by resetting error and accumulated uncertainty at calculated recalibration points.
We evaluate two different strategies for determining when to perform the recalibration in our simulation: (A) We choose recalibration points based on model uncertainty. Recalibration is performed when accumulated model uncertainty exceeds a certain threshold τ . (B) We simulate recalibration in defined constant intervals based on traveled distance. We simulate the recalibration process for different adaptive thresholds τ . The same is done for different uniform distance intervals ranging from 0 cm to 21.9 cm.

Results
Error compensation In unseen scenarios ("Compensation of unseen distortions" section), ANNs clearly outperform polynomial regression models (Fig. 4). However, the compensated EMT error does not reach the results achieved by offline compensation. We observe that including the quality indicator in the model input improves generalization abilities of the neural network. Scenario-wise compensation ("Known distortion compensation" section) appears to be a simple task for both polynomial fits and ANNs, as we can compensate between 35% and 75% of error in each scenario. Both algorithms can reduce tracking error to sub-millimeter values in this offline setup. Model uncertainty evaluation We compare error after compensation to model uncertainty, finding that both quantities are correlated (Fig. 7). Hypothesizing that model-inherent uncertainty grows with less training points nearby, we measure distances between points in the evaluation set and the nearest point in the training set. The resulting distances serve as a proxy measure for training point density. Figure 8 shows the relationship between training point density and the resulting model uncertainty. We observe that for distances to next training point greater than 35 mm, error after compensation grows linearly with decreasing point density (Fig. 9). Simulated hybrid AAAR intervention Along with the sensor traveling along its path, uncertainty accumulates by Figure 10 shows how error and uncertainty develop during virtual guidewire insertion with and without X-ray recalibration. The trade-off between required X-ray recalibrations and tracking error for uncertainty-based (blue, adaptive) and distance-based (red, static) triggering of X-ray recalibrations is illustrated in Fig. 11. Choosing a threshold of τ = 2 mm, as motivated by Fig. 7, yields a good compromise between tracking error and radiation exposure in both seen and unseen scenarios.

Discussion
Although localization error of 4 mm are believed to be acceptable in endovascular surgery [22], improving EMT accuracy raises the overall trust of the hybrid navigation system. In this work, we have shown that ANNs can improve positional tracking to achieve sub-millimeter accuracy in an online setting. With higher localization accuracy, less X-ray images are needed for navigation. Spatial uncertainty can be examined for ANN models and it should be utilized as a measure for model validation whenever compensation algorithms are employed. On the one hand, knowledge about model uncertainty can be used to refine phantom design. We have found that our ANN model requires a training point spacing of 35 mm to be sufficiently accurate, which should be considered in future experimental designs. On the other hand, our simulation experiments show that knowledge about model uncertainty can be exploited to minimize radiation exposure in the online hybrid setting. We envision that model-inherent uncertainty assessment will become an essential part of future EMT error compensation approaches.
The presented method can be further improved by conducting more realistic evaluations. Currently, assumptions about radiation reduction are solely made on the basis of simulations. Consequently, the findings presented in this paper should be assessed in realistic phantom or cadaver studies.
In addition, the rotational degrees of freedom (DOF) need to be considered for realistic evaluations. Especially the roll angle of EMT sensors are of great importance in endovascular procedures and will thus be a major subject of our future work. Our current evaluations show that ANN can compensate static error to sub-millimeter values with only a single sensor. However, the effort of data collection in multiple scenarios with all DOF poses a problem that still needs to be solved, for instance by automatized data acquisition.
Furthermore, only one source of metallic distortion (carm) is considered in our experiments. In the real OR, other metallic artifacts contribute to measurement error in addition to the c-arm. Considering additional artifacts, such as the patient bed, and the rotational DOF might require more complex models than the proposed neural network.

Conclusion and future work
In this paper, we present a novel active error compensation framework for EMT in endovascular surgery. We introduce neural networks capable of generalizing across distortion scenarios in single-sensor configuration while providing sub-millimeter accuracy. We also quantify the positional uncertainty of the error compensating neural network. When error compensated EMT reaches its limits, we show that knowledge about positional uncertainty helps to get EMT navigation back on track. Our work suggests inherent limits of spatial uncertainty that can only be realized when EMT and the compensation scheme are evaluated in tandem. In future phantom evaluation protocols, we will consider these spatial uncertainty limits.
In the future, we will work on automatized data acquisition protocols in order to extend our approach to more than two DOF. Moving toward more realistic evaluations, we will evaluate our method in a hybrid setup with 3D printed aortic phantoms and additional metallic artifacts.