1 Introduction

Despite the advances in image-guided interventions over the last 25 years [1] and the widespread distribution of navigation systems in North America and Europe [2], conventional fluoroscopy remains the most frequently used intra-operative imaging and guidance modality in surgery. In spine surgery, 87% of surgeons worldwide use fluoroscopy routinely, compared to 11% who use navigation systems in their daily routine [2]. The primary challenge determining patient outcome in image-guided interventions is the surgeons' ability to mentally recreate the 3D surgical scene from intra-operative images [1], as surgeons no longer have a direct view of the surgical area. During C-arm based procedures, this ability directly depends on the correct handling of the C-arm by an operator, usually a nurse [3], based on communication with the surgeon. Mastery in surgery requires extensive and immersive experience to acquire the relevant surgical skills [4]. However, due to mandated working-hour restrictions [4], the increasing cost of operating room time, and ethical concerns regarding patient safety, clinical training opportunities are continuously decreasing while the complexity of interventions is continuously increasing. As a consequence, alternative training models have been proposed.

While animal and human cadaver training provides adequate haptic feedback and realistic fluoroscopic images, it requires X-ray radiation, is costly and ethically problematic, and the pathologies relevant to the trained procedure are, in general, not present in the specimen. Commercially available synthetic training models offer only a very limited range of pathologies and typically do not produce realistic images under X-ray.

More recently, computer-based simulation has emerged as a form of training [3, 5,6,7,8]. Most simulators that include fluoroscopic imaging target the spine, due to its complex anatomy and proximity to critical structures. Most reported C-arm simulators use the principle of Digitally Reconstructed Radiographs (DRR) to create fluoroscopic images without radiation from Computed Tomography (CT) data [3, 5,6,7]. The representation of a C-arm and its controls has been realized with varying degrees of realism, ranging from purely virtual representations to real C-arms. Gong et al. [5] mount a webcam next to the X-ray source to track a C-arm relative to a virtual patient represented by an empty cardboard box with AR markers; clinical 4D CT data is visualized as DRR using the tracked position. Bott et al. [6] use an electromagnetic (EM) tracking system to track a physical C-arm, the operating table and a mannequin representing the patient in order to generate the DRR images. Both systems, however, are not suited for interventional surgical training, as no physical anatomy matching the image data is present that could be treated.

Beyond their use in the training of C-arm operators and diagnostic procedures, several works have aimed at presenting patient-based anatomy in a tangible manner. Despite their relatively high cost, haptic devices are widely used in surgical simulators to generate force feedback according to the anatomy represented in the CT data, in a few cases combined with a physical C-arm. Wucherer et al. [7] place a real C-arm as part of an operating room scenery without linking its function to the spinal surgery simulator they use. Rudarakanchana et al. [9] combine a C-arm replica simulator with an endovascular simulator; however, they do not state whether both systems are spatially registered. Patient anatomy can also be represented physically by 3D printing, which is already commonly used for procedure planning and training [10]. Harrop et al. [8] reproduce the equivalent of navigation with 3-axial image set visualization using 3D printed models from CT scans. In summary, several works exist that simulate C-arm operation and replicate patient anatomy from medical imaging data; however, none of them brings both into an accurately registered spatial relation.

Contributions. Our proposed mixed-reality approach of combining patient-based 3D printed anatomy with simulated X-ray imaging on a real C-arm complements traditional training. To the authors' knowledge, no other simulation environment places a radiation-free, physically present C-arm in an accurate spatial relation to simulated patient anatomy. This allows the use of real instruments and accurately aligns C-arm images with a physical patient model, which is important for training hand-eye coordination and the mental mapping of projection images to the surgical scene and patient anatomy. The patient-based models are created from CT data using a 3D printer and can be replicated as often as needed at low cost. The printed models contain the pathology present in the underlying CT data, in contrast to cadaver specimens, which most often do not contain a relevant pathology. A further contribution is the transfer of the concept of Spatial Relationship Graphs (SRG) from Industrial AR [11] to Computer Assisted Interventions (CAI). An SRG is a directed graph in which nodes represent coordinate systems and edges represent transformations between adjacent coordinate systems. Throughout this work, we use SRGs to provide an intuitive visual description of the complex, dynamic chain of transformations of tracked objects and calibrations involved in the proposed mixed-reality system.

Fig. 1. (a) and (b) Overview of the proposed system with C-arm, 3D print, and optical marker targets, (c) Spatial Relationship Graph (SRG) of the simulation system.

2 Methodology

Setup. In the proposed system, the C-arm (C), the 3D printed patient model (P) and the tool (T) are physical objects that are tracked using an ARTTRACK2 4-camera optical outside-in tracking system (W). A schematic representation of the setup is shown in Fig. 1a. In order to simulate an X-ray acquisition, the position of the virtual camera (S) in the CT coordinate system needs to be computed. Figure 1c shows the SRG of this simulation system, detailing the transformations that spatially link all components. Edges are labeled with the type of transformation: 6D for 3D rigid transformations, 2D and 3D for 2D and 3D translations, and \(3D\rightarrow 2D\) for projective transformations. Edges that do not vary over time are labeled static, edges that do vary are labeled dynamic. Edges that need to be calibrated are static by definition throughout this work. The following colors are used in figures: blue: calibrated, black: static, red: dynamic.
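
As an illustration of how such an SRG maps to code, the following minimal Python sketch (our own, not part of the described system) stores each edge as a 4x4 homogeneous transform and composes edges along a path between two coordinate systems; the node names mirror Fig. 1c, while all pose values are placeholders.

```python
# Minimal SRG sketch: nodes are coordinate system names, edges hold 4x4
# homogeneous transforms T_{src->dst}; composing edges along a path yields
# the transform between any two frames (e.g. CT_PAT -> S).
import numpy as np

class SpatialRelationshipGraph:
    def __init__(self):
        self.edges = {}  # (src, dst) -> 4x4 transform

    def add_edge(self, src, dst, T):
        self.edges[(src, dst)] = T
        self.edges[(dst, src)] = np.linalg.inv(T)  # reverse edge

    def compose(self, path):
        """Multiply edge transforms along a list of node names, left to right,
        following the paper's convention T_{A->B} T_{B->C} = T_{A->C}."""
        T = np.eye(4)
        for a, b in zip(path, path[1:]):
            T = T @ self.edges[(a, b)]
        return T

# Placeholder poses: the two calibrated (static) edges and the two dynamic
# edges that the optical tracking system W updates every frame.
srg = SpatialRelationshipGraph()
srg.add_edge("C_Target", "S", np.eye(4))       # from hand-eye calibration
srg.add_edge("P_Target", "CT_PAT", np.eye(4))  # from landmark registration
srg.add_edge("W", "C_Target", np.eye(4))       # tracked C-arm target pose
srg.add_edge("W", "P_Target", np.eye(4))       # tracked print target pose

# T_{CT_PAT -> S}: CT_PAT -> P_Target -> W -> C_Target -> S (cf. Sect. 2)
T_ct_to_source = srg.compose(["CT_PAT", "P_Target", "W", "C_Target", "S"])
```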

Synthetic Patient Model. From a patient CT dataset, a segmentation of the spine was created and four walls of a box were added around it. On the surface of these walls, twenty artificial landmark holes \(L_i\) were placed for the registration of the printed patient model to the CT data. From the segmentation, a surface mesh was created, which was then smoothed and printed in PLA on an Ultimaker2+ 3D printer. An optical tracking target (\(P_{Target}\)) was rigidly attached to this printed model (P). For evaluation purposes, CT markers were attached to the printed model and a CT scan of it was acquired (\(CT_{3DP}\)).
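
The segmentation-to-print pipeline can be sketched as follows; the choice of scikit-image and trimesh, the Laplacian smoothing, and all parameter values are our own assumptions for illustration, not necessarily the authors' tool chain.

```python
# Rough sketch of the print-model pipeline: binary segmentation -> surface
# mesh (marching cubes) -> smoothing -> STL export for 3D printing.
import numpy as np
from skimage import measure
import trimesh

# Placeholder binary volume standing in for the spine segmentation; in
# practice this comes from the patient CT, with the box walls and the
# landmark holes L_i already merged into it.
zz, yy, xx = np.mgrid[:64, :64, :64]
segmentation = ((xx - 32)**2 + (yy - 32)**2 + (zz - 32)**2 < 20**2).astype(np.float32)

# Isosurface extraction; 'spacing' would be the CT voxel size in mm.
verts, faces, _, _ = measure.marching_cubes(segmentation, level=0.5,
                                            spacing=(1.0, 1.0, 1.0))

mesh = trimesh.Trimesh(vertices=verts, faces=faces)
trimesh.smoothing.filter_laplacian(mesh, iterations=10)  # reduce staircase artifacts
mesh.export("spine_print_model.stl")                     # file for the slicer/printer
```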

Fig. 2. Spatial Relationship Graphs: (a) C-arm target to X-ray source, (b) print target to patient CT and CT of printed model.

System Calibration. To place the simulated X-ray source S at the real X-ray source of the C-arm, the calibrated transformation \(T_{C_{Target} \rightarrow Xray}\) (Fig. 2a) is required to calculate the dynamic transformation \(T_{C_{Target} \rightarrow S}\). This problem is known as hand-eye calibration in robotics and augmented reality. A planar grid of X-ray visible markers is placed on a fixed surface between the real X-ray source and the image intensifier. Multiple images of the grid are acquired from different poses of the C-arm and, based on the marker grid, a virtual camera pose is computed for each image using a Perspective-n-Point (PnP) algorithm. For every acquired X-ray image, a pair of poses is thus obtained, composed of the C-arm tracking target pose \(T_{W\rightarrow C_{Target}}\) and the camera pose in the grid coordinate system \(T_{S\rightarrow Grid}\). From these pose pairs, the hand-eye calibration algorithm [12] estimates \(T_{C_{Target} \rightarrow S}\). To render the DRR spatially aligned with the printed model, we also need the transformation from the printed model tracking target to the patient CT coordinate system, \(T_{P_{Target}\rightarrow CT_{PAT}}\) (Fig. 2b). For evaluation purposes, we additionally need the transformation \(T_{P_{Target}\rightarrow CT_{3DP}}\) from the printed model tracking target to the CT of the printed model. For registration, 20 artificial landmarks \(L_i\) were placed in the segmentation of the patient CT and are thus observable both in the printed model and in its CT (\(CT_{3DP}\)) (Fig. 3, blue circles). Using a pointer tool, the 3D position of every landmark \(L_i\) on the printed model is located in the coordinate system of the printed model tracking target \(P_{Target}\). The same landmark positions were also extracted manually from the CT of the printed model \(CT_{3DP}\). From the corresponding 3D point sets, the transformations \(T_{P_{Target}\rightarrow CT_{3DP}}\) and \(T_{P_{Target}\rightarrow CT_{PAT}}\) are estimated by least-squares minimization of the distances between corresponding points [13]. Knowing \(T_{C_{Target}\rightarrow S}\) and \(T_{P_{Target}\rightarrow CT_{PAT}}\), we compute the transformation from the patient CT to the simulated X-ray source, \(T_{CT_{PAT}\rightarrow S}\), for any C-arm and printed model pose as: \( T_{CT_{PAT}\rightarrow S}=T_{P_{Target}\rightarrow CT_{PAT}}^{-1}T_{W\rightarrow P_{Target}}^{-1}T_{W\rightarrow C_{Target}}T_{C_{Target}\rightarrow S} \).
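
For concreteness, a hedged sketch of the hand-eye step is given below. The paper cites [12] for the actual algorithm; here we substitute OpenCV's generic PnP and hand-eye solvers, and all variable names and conventions (e.g. that tracker poses are supplied as poses of the C-arm target expressed in the tracker frame W) are our assumptions.

```python
# Sketch of the hand-eye calibration of T_{C_Target -> S} from pose pairs:
# per X-ray image, a PnP solve gives the virtual camera pose w.r.t. the
# marker grid; paired with the tracked C-arm target pose, a standard
# hand-eye solver estimates the static target-to-source transform.
import cv2
import numpy as np

def calibrate_c_arm_hand_eye(T_ctarget_in_W, grid_pts_3d, grid_pts_2d, K, dist):
    """T_ctarget_in_W: list of 4x4 poses of the C-arm tracking target expressed
    in the tracker frame W (OpenCV's 'gripper2base' convention), one per image.
    grid_pts_3d: Nx3 marker positions in the grid coordinate system.
    grid_pts_2d: list of Nx2 detected marker positions per X-ray image."""
    R_g2b, t_g2b, R_t2c, t_t2c = [], [], [], []
    for T, pts2d in zip(T_ctarget_in_W, grid_pts_2d):
        ok, rvec, tvec = cv2.solvePnP(grid_pts_3d, pts2d, K, dist)  # grid -> camera
        if not ok:
            continue
        R_t2c.append(cv2.Rodrigues(rvec)[0])
        t_t2c.append(tvec)
        R_g2b.append(T[:3, :3])
        t_g2b.append(T[:3, 3].reshape(3, 1))
    # Hand-eye solve; the result relates the C-arm target and the simulated
    # source S (up to the direction convention chosen for the poses above).
    R, t = cv2.calibrateHandEye(R_g2b, t_g2b, R_t2c, t_t2c)
    T_ct_S = np.eye(4)
    T_ct_S[:3, :3], T_ct_S[:3, 3] = R, t.ravel()
    return T_ct_S
```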

Fig. 3. Artificial landmarks (blue) and CT markers (yellow) in (a) patient CT and segmentation, (b) 3D print, (c) 3D print CT. (d) Synthetic patient print filled with red-colored wax used during the user study.

The pose of the X-ray source in the patient CT coordinate system, \(T_{CT_{PAT}\rightarrow S}^{-1}\), is used to position a virtual camera that computes the DRR image. The intrinsics of the X-ray imaging chain are obtained using a standard camera calibration method.
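
The intrinsics step could look as follows; assuming a pinhole model and OpenCV's calibrateCamera applied to the X-ray images of the planar marker grid is our reading of "standard camera calibration method", and the detector image size is a placeholder.

```python
# Sketch of estimating the virtual camera intrinsics for the DRR renderer
# from X-ray images of the planar marker grid (pinhole model assumption).
import cv2
import numpy as np

def estimate_xray_intrinsics(grid_pts_3d, grid_pts_2d, image_size=(1024, 1024)):
    """grid_pts_3d: Nx3 marker positions on the planar grid (z = 0).
    grid_pts_2d: list of Nx2 detected marker positions, one array per image."""
    obj = [grid_pts_3d.astype(np.float32)] * len(grid_pts_2d)
    img = [p.astype(np.float32) for p in grid_pts_2d]
    rms, K, dist, _, _ = cv2.calibrateCamera(obj, img, image_size, None, None)
    return K, dist, rms

# The DRR renderer then combines the intrinsics K with the extrinsics derived
# from T_{CT_PAT -> S} to cast rays through the patient CT volume.
```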

3 System Evaluation

First, we evaluated (a) the error of the printing process, i.e. the registration of \(CT_{PAT}\) to \(CT_{3DP}\), (b) the registration of P to \(CT_{PAT}\), used to visualize the DRR spatially aligned with the patient model in the user study, and (c) the registration of P to \(CT_{3DP}\), used in the evaluation of the error between DRR and real X-ray images. The respective rigid transformations describing these spatial relationships, \(T_{CT_{PAT}\rightarrow CT_{3DP}}\), \(T_{P\rightarrow CT_{PAT}}\) and \(T_{P\rightarrow CT_{3DP}}\), are calculated based on a least-squares minimization of the distances between corresponding artificial landmarks. The root-mean-square errors (RMSE) of the distance residuals are: (a) 0.58 mm, (b) 0.75 mm and (c) 0.84 mm. Second, we evaluated the full-chain accuracy of the tracking, answering the question to what extent the simulated X-ray matches the real X-ray image. We compare the 2D positions of the CT markers placed on the 3D print (see Fig. 3, yellow circles) in DRR images generated from the CT of the printed model (\(CT_{3DP}\)) and in real X-ray images of the printed model (P). This evaluation step is represented as an SRG together with an exemplary X-ray and DRR image pair in Fig. 4. The RMSE over 7 C-arm poses is \(4.85\,\pm \,2.37\) pixels (\(1.85\,\pm \,0.90\) mm).
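
The landmark-based registrations and their RMSE can be reproduced with the standard closed-form SVD solution for least-squares rigid point-set registration (our sketch; whether this is the exact variant of [13] used by the authors is an assumption, and the variable names are hypothetical).

```python
# Closed-form least-squares rigid registration of corresponding landmarks
# (Kabsch/SVD) and the RMSE of the resulting distance residuals.
import numpy as np

def rigid_register(src, dst):
    """Rigid transform (R, t) minimizing ||dst - (R @ src.T).T - t|| for Nx3 arrays."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflection
    R = Vt.T @ D @ U.T
    t = cd - R @ cs
    return R, t

def landmark_rmse(src, dst, R, t):
    residuals = dst - (src @ R.T + t)
    return np.sqrt(np.mean(np.sum(residuals**2, axis=1)))

# Hypothetical usage with the 20 artificial landmarks L_i:
# R, t = rigid_register(landmarks_in_P, landmarks_in_CT_PAT)
# print(landmark_rmse(landmarks_in_P, landmarks_in_CT_PAT, R, t))  # e.g. 0.75 mm
```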

Fig. 4. Full-chain tracking evaluation: (a) Spatial Relationship Graph, (b) exemplary pair of real X-ray and DRR images acquired during the evaluation.

For the user study, a synthetic patient print was filled with red-colored gel candle wax, using a print of the segmented skin as a mold to exactly recreate the patient's body shape, and then covered with a skin-colored foam rubber sheet to imitate skin. This model, shown in Fig. 3d, was placed in between the parts of a mannequin phantom to indicate where the head and feet of the patient are located, positioned on an operating table and finally draped. The surgeons participating in the study were presented with a patient case suggesting a facet joint injection (FJI) and asked to perform four injections into L1/L2 and L2/L3 on both sides using the simulated C-arm, operated by a standardized nurse following the surgeons' instructions. After the performance, the participants were asked to answer a questionnaire.

A total of \(N=6\) surgeons (5 trauma and 1 orthopedic surgeon) participated in the study, with a mean age of 40 years (SD 10.7, range 32–61), a mean of 6.8 years of prior experience in general spine surgery (SD 6.6, range 2–20) and a mean of 4.2 years of experience in FJI (SD 4.4, range 0–10). Three participants had teaching experience in both image-guided surgery and FJI; 2 participants had performed \({\ge }1000\) procedures, the rest \({\le }60\). All but one participant had prior experience with surgical simulators, and 2 participants had used this simulator before. The results of the questionnaire are summarized in Fig. 5. Participants expressed agreement with the overall realism of the simulation (Q1) and strong agreement with the usefulness of the system for the training of novices (Q12) and experts (Q13). The participants strongly agreed that an integration into medical education would be useful (Q15). Free-text areas for improvements in the questionnaire reflected the positive reception by the participants: “[Replicate] facet joint capsule (haptic sensation when feeling around)”, “Improve haptics of the soft-tissue and ligaments surrounding the vertebrae”, “Current state very good, possibly further develop for more spine procedures”.

Fig. 5. Box plot of the 5-point Likert scale questionnaire results from the user study.

4 Discussion

Without any viable alternative, the current training of teams of surgeons and operators in C-arm based procedures generally involves X-ray radiation for the full length of cadaver trainings or supervised patient treatments. The proposed mixed-reality system has the potential to complement or even replace large parts of cadaver training and to reduce the risk of errors when proceeding to patient treatment. 3D printing enables the accurate replication of patient anatomy, and using the presented methodology these prints can be aligned in a correct spatial relation to the C-arm and surgical instruments. This allows training institutions to include any available patient case with its specific pathologies in a training. The SRG methodology used throughout this work proved to be a versatile tool for providing an intuitive description of the spatial relations involved in the simulation system, for identifying the required transformations and for modeling the appropriate calibrations. We therefore suggest the general use of SRGs for high-level descriptions of complex, dynamic real-world spatial relations in CAI applications.

Limitations. To improve model fidelity, e.g. the replication of ligaments and capsule tissue, latest-generation 3D printers supporting materials of varying consistency and density [10] could be used. Patient breathing was not modeled, as it is of little relevance in FJI. If required for another procedure, it could be modeled by mechanically moving the tracked patient model.

The simulated X-ray images generated by our system achieve an accuracy within the tolerable range of \({\le }2\) mm for image-guided spine surgery [14]. The system is thus well suited for the training of technical skills, e.g. hand-eye coordination in surgical tool usage or the mental mapping from 2D projective images to the 3D surgical scene. Additionally, it can potentially be used for the training of non-technical skills such as the communication between surgeon and C-arm operator.

Conclusion. We propose a C-arm based surgery simulation system that accurately simulates patient anatomy and X-ray imaging. We have shown the feasibility of using the system to simulate a surgical procedure with a fidelity sufficient for the training of novices and experts and for integration into medical education, according to the surgical experts who evaluated the system in a user study.