1 Introduction

Augmented Reality (AR) is a technology that allows virtual elements to be inserted into the reality perceived by a user [1], so that an enriched reality, in which non-tangible parts are added to the external world, can be experienced [2]. This approach fosters the development of solutions in several fields, including the medical one, particularly to support surgery, thanks to the ability of AR to optimize the visualization of complex 3D data [3]. The present work focuses on osteotomies. “Osteotomy” refers to the generic cutting of bone using a dedicated surgical device. In orthopedic and maxillofacial surgery, osteotomies often need to be performed for pathology correction and tumor removal [4, 5]. Pathology correction and bone distraction generally involve the repositioning of bone portions, by planned displacements and rotations (roto-translations), from the original preoperative position to the planned position, i.e., the optimal position that can lead the patient to resolution or improvement of the problem. To this aim, the 3D anatomical parts are extracted from the patient’s CT or MR scan [6], from which a DICOM file is obtained. The surgeon, together with specialized technicians, performs Virtual Surgical Planning (VSP), mainly defining the cutting planes and the roto-translation of bone flaps for each step of the surgery [7,8,9]. VSP is followed by the surgery in the operating room and, after a few days, another scan of the patient is performed to compare the outcome of the surgery with the one planned in the VSP.

Osteotomies can be performed in three ways: freehand (FH) [10], with the guidance of patient-specific templates (PSTs) [11,12,13,14], or with the support of AR. In the first case, no support is provided to the surgeon, who relies only on personal ability and experience, which increases the risk that the actual result differs from the planned one [15]. PSTs are tools custom designed and manufactured for a specific patient and are generally used to perform targeted actions that are difficult or impossible to perform with commercially available instruments. In the orthopedic and maxillofacial fields, the most used PSTs are surgical guides (SGs) and cutting guides (CGs). For the sake of clarity, the former include tools for guiding cuts on anatomical tissues, puncture masks, positioners, and distractors, while the latter specifically indicate templates for guiding the surgical cutting blade. Nonetheless, PST, SG and CG are often used as synonyms in the literature, generally indicating a physical reference to guide the surgeon’s cut. The use of PSTs undoubtedly improves accuracy with respect to the FH approach [16], providing a rigid support for the blade, as well as for the parts of bone to be reoriented. The rigidity can be obtained by reconnecting the PST to the source bone [17], since PST surfaces exactly replicate the shape of the specific bone to which they are anchored. PST management is integrated into the VSP and is typically performed using CAD software; once the VSP is finalized, the PST is 3D printed. A drawback of PST usage is the possible spatial clutter, which may obstruct the surgeon’s view or make some anatomical parts inaccessible [18]. Moreover, when surgeons perform osteotomies FH or with PSTs, some small veins, arteries, and nerves could be severed due to their poor visibility.
The third way to perform osteotomies involves AR, through which the information crucial for the surgeon, such as cutting planes, bones, surgical templates [19], and vital anatomical parts not to be severed, can be visualized via display devices directly on the patient in the correct spatial position [20]. For example, it is possible to visualize the bone flap in the planned virtual position in order to have a virtual guide over which to superimpose the physical bone flap, and also to display the spatial and angular discrepancies of a pointing device with respect to the virtual position [21]. However, the use of AR does not by itself ensure cut accuracy, which still depends entirely on the skill of the surgeon’s hand, since no physical feedback is provided.

Various reviews have been conducted to frame the contribution of AR in the medical field [22]. For instance, Chytas et al. [23], Laverdière et al. [24], and Jud et al. [25] focused their research on orthopedics, while [26] and Badiali et al. [27] focused on maxillofacial surgery.

This review reports the state of the art of AR as a support for osteotomies, highlighting the advantages, the limitations, and the open issues related to the use of this technology within the surgical room. For the sake of completeness, AR has been considered both when used to replace PSTs and when used to support them, assessing the advantages and limitations of each approach. The aim of this work is to highlight the key AR-related open issues in osteotomies in order to direct future research towards an increasing employment of this technology.

The paper is organized as follows: Section 2 details the methods used to search for and select the articles suitable for review; Section 3 first presents the selected articles, organized by category, and then describes the state of the art of the research; Section 4 discusses the results obtained, while Section 5 provides the conclusions.

2 Materials and methods

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed to perform the review. Two reviewers collected the research articles from the following online databases: Scopus, Web of Science, PubMed and IEEE Xplore. The search was conducted using the following keywords: osteotomy (K1), osteotomies (K2), bone-cut (K3), surgical-guide (K4), cutting-guide (K5), patient-specific-template (K6), augmented-reality (K7), AR (K8), mixed-reality (K9), MR (K10), extended-reality (K11), XR (K12). Each keyword from K1 to K6 was paired in turn with each keyword from K7 to K12 using the Boolean operator “AND” (e.g., “osteotomy AND augmented-reality” or “cutting-guide AND MR”). Table 1 shows the number of records that were found using specific keyword pairings.
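As an illustration of the pairing scheme described above, the following minimal Python sketch enumerates the keyword combinations. The keyword spellings are taken from the text; the function name `build_queries` and the plain "A AND B" string form are assumptions made for illustration, since each database has its own query syntax.

```python
from itertools import product

# Keywords K1-K6 (procedure-related) and K7-K12 (technology-related),
# spelled as listed in the text.
procedure_terms = [
    "osteotomy", "osteotomies", "bone-cut",
    "surgical-guide", "cutting-guide", "patient-specific-template",
]
technology_terms = [
    "augmented-reality", "AR", "mixed-reality",
    "MR", "extended-reality", "XR",
]


def build_queries(left, right):
    """Return one 'a AND b' string for every (left, right) keyword pairing.

    Hypothetical helper: the real query syntax depends on the database.
    """
    return [f"{a} AND {b}" for a, b in product(left, right)]


queries = build_queries(procedure_terms, technology_terms)
print(len(queries))  # 36
print(queries[0])    # osteotomy AND augmented-reality
```

Pairing six terms with six terms yields thirty-six query strings for each database.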

Table 1 Articles found in the online databases Scopus, Web of Science, PubMed and IEEE Xplore, using 12 keywords: the keywords from K1 to K6 were used in conjunction with the keywords from K7 to K12 using the Boolean operator AND

Further inclusion criteria restricted the selection to publications written in English and published between January 2017 and February 2023, inclusive. The time period was constrained to the previous five years to assess the state of the art without including out-of-date solutions.

Regarding the exclusion criteria, book chapters, literature reviews, theses, and articles not written in English were eliminated during the identification phase, together with duplicates. Furthermore, during the screening phase the title and the abstract of each study were analyzed to discard those studies that did not cover osteotomies performed using AR, as well as those for which the full text was not available. Finally, among the remaining ones, the articles published in journals not ranked by SCImago [28] were also removed. Figure 1 describes the selection process according to the PRISMA guidelines.

Fig. 1 Paper selection flow-chart. The flow diagram follows the PRISMA guidelines

The following research questions (RQs) were formulated in order to frame the role of AR in osteotomies and to direct future research by highlighting the key open issues:

  • RQ1: In which anatomical districts is AR used to support osteotomies?

  • RQ2: How is the interaction between the information provided through AR and the reality within the surgical room performed?

  • RQ3: What are the advantages and limitations of using AR to support osteotomies?

3 Results

After the RQs were identified, the selection process led to the identification of forty-nine papers (Table 2).

Selected papers were classified according to the main common features of the proposed AR-based methodologies in order to provide a preliminary organization and improve readability. The papers have been organized according to four categories: the anatomical district, since it determines which specialists are involved; the marker characteristics, as they influence the accessibility and the robustness of the solution; the accuracy assessment methodology, since it is gaining increasing importance in precision medicine; and the AR usage, because it is key to understanding the contribution of AR in each application. For the sake of clarity, the flow-chart of the research is provided in Fig. 2.

The first category identifies the anatomical district where AR support was provided, namely maxillofacial, pelvic, lower limbs and other districts, and it is also the category according to which the selected papers have been grouped within the present review.

The second category shows which type of marker was used, if any. Markers are physical objects of various shapes, sometimes containing patterns, that the AR system’s camera can recognize in order to perform the “registration”, i.e., the alignment of the virtual model on the physical one. Within Table 2, “markerless” identifies a registration performed without physical markers; otherwise, markers are reported depending on their shape, i.e., planar, biplanar, cubic, spherical, or cylindrical, as shown in Fig. 4.

Table 2 Classification of articles describing the use of AR to guide osteotomies according to the anatomical district, the type of marker (if any), the methodology used for the accuracy evaluation, and the AR usage

In the third category, the method used to assess osteotomy accuracy is reported. This category aims to highlight which parameter the authors adopt to describe the accuracy of AR-supported osteotomies performed in the operating room compared with the osteotomies planned in the VSP (Fig. 5). The reported accuracy computation methodologies are the following: the deviation between corresponding control points (point dist.), the angular deviation between corresponding planes (angle btw planes), the distance between corresponding surfaces (surface dist.), a tailored gauge-template for visual inspection of osteotomy lines (inspection window), qualitative assessment (questionnaire), and the percentage of tumor violated due to osteotomy execution (intratumoral cut). For the papers in which this information is not described, “no info” has been reported in the table.

The fourth category defines the usage of AR within each of the studies: AR can be compared with the FH technique (AR vs FH), compared with PST-based osteotomies (AR vs PST), used jointly with PSTs (AR + PST), used standalone (AR standalone), or compared with other AR-based guidance alternatives (AR vs AR).

Section 3.1 introduces some preliminary evaluations of the selected works, as well as some introductory concepts related to the employment of AR in osteotomies; Section 3.2 summarizes the content of the selected works; the critical issues identified are raised in Section 3.3; finally, the answers to the RQs and the guidelines to foster future research on the employment of AR in osteotomies are described in detail in Section 4.

Fig. 2 Research flow-chart

Fig. 3 Pie charts showing the overall statistics for each of the 4 categories to classify the reviewed articles: (a) anatomical district; (b) marker; (c) accuracy method and (d) AR usage

3.1 Overall statistics

Regarding the anatomical districts (Fig. 3a), twenty-eight articles concern the maxillofacial area, which, in turn, includes the following regions: mandibular (seven articles), maxillary (ten articles), cranial (six articles), zygomaticomaxillary and sinonasal (two articles each), and orbital (one article). Ten papers concern the lower limbs: specifically, fibular (six articles), femoral (two articles), and foot bone (one article). Eight articles apply AR guidance to osteotomies of the pelvic bone. Only one article concerning spinal surgery was found, while two studies deal with surgeries on different anatomical districts, populating the “general” subcategory.

Regarding the second category (Fig. 3b), physical markers are employed in thirty-four studies: planar markers are used in sixteen papers, cubic markers in five, biplanar markers in three, and spherical and cylindrical markers in eight and two studies, respectively. A representation of every marker type is provided in Fig. 4.

Fig. 4 Physical markers used in the process of registration and tracking the virtual model on the anatomical replica, cadaveric specimens, or patients. From left to right: planar, biplanar, cubic, spherical, and cylindrical

Planar markers are two-dimensional images consisting of a geometric pattern, like QR codes, that can be detected and tracked to determine the marker’s position and orientation in three-dimensional space [78]. Biplanar markers consist of two perpendicular planes, each with a distinctive pattern. Cubic markers consist of six square faces, each with a different color or symbol pattern. Spherical and cylindrical markers exploit their peculiar shape to be easily recognizable even without an applied pattern or texture. Twelve papers describe a markerless registration procedure, and the remaining three papers give no information about the registration process. In 69.4 % of the cases the superimposition of the planned virtual model over the real one is performed through physical markers, while in 24.5 % of the selected articles the registration is performed without markers.

Accuracy assessment (Fig. 3c) is performed in forty-two articles, while the remaining ones do not provide specific information. Distance between points is computed in thirty articles, while five works adopt the distance between surfaces as accuracy assessment methodology. Angular deviation, inspection window and intratumoral cut quantification are employed in two articles each, while one study uses a questionnaire for the evaluation, focusing more on user interface responsiveness (Fig. 5).

Fig. 5 Outcomes analyzed by physicians for accuracy assessment of osteotomies: distance between points, angle between planes, distance between two surfaces, intratumoral cut, inspection window, and questionnaire (qualitative assessment)

To deeply and quantitatively understand how accuracy is determined by clinicians during post-operative quality assessment, a concise explanation of each method is provided. The premise is that the planned 3D model and the post-operative 3D scan have been aligned prior to each evaluation.

Regarding point distance calculation, common practice consists in following these steps:

  • A point is identified on the planned cutting surface

  • The corresponding point is pinpointed on the post-operative model

  • The Euclidean distance is computed with respect to a common reference system as shown in (1):

$$\begin{aligned} d = \sqrt{(x_{post}-x_{plan})^2 + (y_{post}-y_{plan})^2 + (z_{post}-z_{plan})^2} \end{aligned}$$
(1)

These steps are repeated for other points, after which the mean and standard deviation of d are computed.
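The procedure above can be sketched in a few lines of Python. The coordinates below are hypothetical values in millimetres, the function name `point_distances` is illustrative, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator is adopted.

```python
import math
import statistics


def point_distances(planned, postop):
    """Euclidean distance, as in Eq. (1), for each pair of corresponding points."""
    if len(planned) != len(postop):
        raise ValueError("planned and postop must contain the same number of points")
    return [math.dist(p, q) for p, q in zip(planned, postop)]


# Hypothetical corresponding control points (mm), expressed in a common
# reference system after alignment of the two models.
planned = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
postop = [(1.0, 0.0, 0.0), (10.0, 2.0, 0.0), (0.0, 10.0, 3.0)]

d = point_distances(planned, postop)
mean_d = statistics.mean(d)
std_d = statistics.stdev(d)  # sample standard deviation (assumption)

print(d)              # [1.0, 2.0, 3.0]
print(mean_d, std_d)  # 2.0 1.0
```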

The angle between planes is computed by fitting the postoperative surface with a plane and then calculating the angle between its normal, \(\mathbf {n_1}\), and that of the planned plane, \(\mathbf {n_2}\), as shown in (2):

$$\begin{aligned} \theta = \arccos \left( \frac{\mathbf {n_1} \cdot \mathbf {n_2}}{\Vert \mathbf {n_1} \Vert \, \Vert \mathbf {n_2} \Vert }\right) \end{aligned}$$
(2)
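A minimal Python sketch of this computation is given below: the angle is obtained from the dot product of the two normals divided by the product of their norms. The plane-fitting step is not included, so the two normals are assumed to be already available; the function name and the `unsigned` option, which folds the result into the range 0° to 90° because a plane normal can point either way, are illustrative assumptions.

```python
import math


def angle_between_normals(n1, n2, unsigned=True):
    """Angle in degrees between two plane normals.

    With unsigned=True the sign ambiguity of plane normals is removed,
    so the result lies between 0 and 90 degrees (assumption).
    """
    dot = sum(a * b for a, b in zip(n1, n2))
    norm1 = math.sqrt(sum(a * a for a in n1))
    norm2 = math.sqrt(sum(b * b for b in n2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("normals must be non-zero vectors")
    cos_theta = dot / (norm1 * norm2)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    if unsigned:
        cos_theta = abs(cos_theta)
    return math.degrees(math.acos(cos_theta))


# Hypothetical normals of the fitted postoperative plane and the planned plane.
theta = angle_between_normals((0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(theta, 2))  # 45.0
```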

The surface distance is computed isolating a portion of the osteotomy surface or, in the case of bone repositioning, the entire closed surface of both the planned and post-operative models. Given that the surfaces are triangular meshes, the calculation falls under the point-to-surface distance, where each point of one mesh is compared to the surface of the other mesh (3). Let \(M_1\) and \(M_2\) be the two meshes to be compared. For every point p in \(M_1\), the distance \(d(p, M_2)\) from p to the mesh \(M_2\) is determined as follows:

$$\begin{aligned} d(p, M_2) = \min _{q \in M_2} \Vert p - q \Vert \end{aligned}$$
(3)

where q is a point on \(M_2\) that minimizes the Euclidean norm \(\Vert p - q \Vert \). This process is repeated for every point in \(M_1\), and a color map of the distances is plotted on the postoperative surface. Subsequently, mean and standard deviation of all the obtained distances are computed.
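The following Python sketch illustrates the idea under a simplifying assumption: the minimum in (3) is searched only among the vertices of \(M_2\), whereas a full point-to-surface distance would also consider points lying inside the triangles. The brute-force search, the function name, and the vertex coordinates are illustrative and not taken from the reviewed studies.

```python
import math
import statistics


def nearest_vertex_distances(vertices_m1, vertices_m2):
    """For each vertex p of M1, distance to the closest vertex q of M2.

    Simplification (assumption): q is restricted to the vertices of M2,
    so the result is an upper bound of the true point-to-surface distance.
    """
    if not vertices_m2:
        raise ValueError("M2 must contain at least one vertex")
    return [min(math.dist(p, q) for q in vertices_m2) for p in vertices_m1]


# Hypothetical vertices (mm): M2 is a flat patch in the plane z = 0,
# M1 lies above it at increasing heights.
m2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
m1 = [(0.0, 0.0, 0.5), (1.0, 0.0, 1.0), (1.0, 1.0, 1.5)]

distances = nearest_vertex_distances(m1, m2)
print(distances)                                                  # [0.5, 1.0, 1.5]
print(statistics.mean(distances), statistics.stdev(distances))   # 1.0 0.5
```

For large meshes a spatial index such as a k-d tree would normally replace the brute-force search; it is omitted here to keep the sketch self-contained.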

Intratumoral cut refers to the measure of the penetration of the cut within the tumor. If the postoperative osteotomy plane is within the tumor, the closest parallel plane to the former, which is also tangent to the tumor, is drawn. Consequently, the distance between these two planes, i.e., the intratumoral cut, is determined. Furthermore, this value is divided by the longitudinal extension of the tumor to yield a percentage result.
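The measure described above can be sketched as follows, under stated assumptions: the tumor is represented by a hypothetical list of surface points, the cut plane by a point and a normal, and, when no longitudinal extension is supplied, the extent of the tumor along the plane normal is used in its place. None of these choices is prescribed by the reviewed studies.

```python
import math


def intratumoral_cut(tumor_points, plane_point, plane_normal, tumor_length=None):
    """Return (penetration depth, percentage) of a cut plane inside a tumor."""
    norm = math.sqrt(sum(c * c for c in plane_normal))
    if norm == 0.0:
        raise ValueError("plane_normal must be a non-zero vector")
    unit = [c / norm for c in plane_normal]

    # Signed distance of every tumor point from the cut plane.
    signed = [
        sum(u * (p - o) for u, p, o in zip(unit, point, plane_point))
        for point in tumor_points
    ]
    lowest, highest = min(signed), max(signed)

    if lowest >= 0.0 or highest <= 0.0:
        # All points on one side: the plane does not enter the tumor.
        depth = 0.0
    else:
        # Distance to the closest parallel plane tangent to the tumor.
        depth = min(highest, -lowest)

    # Assumption: fall back to the tumor extent measured along the normal.
    extent = tumor_length if tumor_length is not None else highest - lowest
    if extent <= 0.0:
        raise ValueError("tumor extension must be positive")
    return depth, 100.0 * depth / extent


# Hypothetical tumor spanning z = 0..10 mm, cut by the plane z = 2 mm.
tumor = [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0), (1.0, 1.0, 5.0)]
depth, percentage = intratumoral_cut(tumor, (0.0, 0.0, 2.0), (0.0, 0.0, 1.0))
print(depth, percentage)  # 2.0 20.0
```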

Regarding the inspection window methodology, a template is commonly used as a reference for curvilinear axes: all portions of the postoperative line that fall within the inspection windows are projected onto these curvilinear axes. The sum of the curvilinear lengths of each projected segment is computed and then divided by the total length of the axes, producing a percentage value. This measurement is attained using adhesive millimetric tape to measure each curve segment falling within the template and subsequently summing them to obtain an overall value.
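The percentage described above reduces to a ratio of lengths, as in the short Python sketch below. The segment lengths and the total axis length are hypothetical values in millimetres, and the function name is illustrative.

```python
def inspection_window_coverage(segment_lengths_mm, total_axis_length_mm):
    """Percentage of the curvilinear axis covered by in-window line segments."""
    if total_axis_length_mm <= 0.0:
        raise ValueError("total axis length must be positive")
    if any(length < 0.0 for length in segment_lengths_mm):
        raise ValueError("segment lengths must be non-negative")
    covered = sum(segment_lengths_mm)
    if covered > total_axis_length_mm:
        raise ValueError("covered length cannot exceed the total axis length")
    return 100.0 * covered / total_axis_length_mm


# Hypothetical measurements: three projected segments on a 100 mm axis.
coverage = inspection_window_coverage([20.0, 35.0, 25.0], 100.0)
print(coverage)  # 80.0
```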

Finally, with respect to the fourth category (Fig. 3d), in twenty-six studies AR was applied as a standalone methodology, while in the other twenty-three it was compared or combined with other approaches: in five papers AR was compared with freehand osteotomies, while different AR approaches were tested against each other in two papers. Eight papers directly compared AR and PST, and in a further eight papers the two guidance methods were combined (AR + PST).

Fig. 6 Flowchart of (a) FH, (b) PST-guided and (c) AR-guided osteotomies surgical procedures

3.2 AR-based osteotomies studies

Whether osteotomies are FH, PST-guided, or AR-guided, the first steps of the workflow are the same: a clinical assessment of the patient to diagnose the pathology, a 3D scan (either CT or MR) of the anatomical district, and the identification of the osteotomy planes through imaging analysis. Similarly, the last steps of the workflow follow a standard pattern: after the surgery, the post-operative scans (both 3D and 2D via CT/MR, or visual inspection) of the operated area are retrieved to be compared with the pre-identified osteotomy planes and to evaluate the accuracy. Differences among the approaches arise during the planning and the surgery.

Concerning the FH surgical method (Fig. 6a), the operation strictly adheres to the identified osteotomy planes, relying on the clinician’s experiential memory. When PSTs are employed (Fig. 6b), CAD modeling of the surgical support is performed using the identified osteotomy planes as design constraints. A replica of the specific anatomical region to which the PSTs are tailored is 3D printed for a fitting check, which aims to validate the PSTs or suggest modifications. Prior to usage in the operating room, a sterilization process is essential. Regarding AR-based surgery, the initial step is identifying the AR device, factoring in budget and availability. Then, the digital platform and toolset suitable for the chosen AR device are selected. Depending on the surgeon’s requirements, the user interface is designed considering all the visual elements through which clinicians will interact with AR applications and assets. Concurrently, a registration and tracking technique is determined taking into account boundary conditions such as lighting and occlusions; if a physical marker is needed, its fabrication is required. Additionally, if PST integration is chosen, a process analogous to “PST fabrication” is followed. The AR application is then developed, deployed onto the AR device, and tested for registration and tracking accuracy as well as for clinical usability. All the studies analyzed follow the workflow illustrated in Fig. 6c.

The numbers of samples tested are reported along with the measured accuracy values. Moreover, we have calculated the Coefficient of Variation (CV) when the overall mean (\(\mu \)) and standard deviation values (\(\sigma \)) are provided:

$$\begin{aligned} CV = \frac{\sigma }{\mu } \end{aligned}$$

Given that the coefficient of variation (CV) is a normalized measure of dispersion, it is dimensionless. This allows for the comparison of different distances and angular deviations presented in the form of \(\mu \pm \sigma \), even if the methods of accuracy calculation differ, as observed in the analyzed studies. For the sake of clarity, the studies reported in this review are grouped by the anatomical district they refer to.
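As a worked illustration, the short Python sketch below reproduces two of the CV values reported in this review from their \(\mu \pm \sigma \) pairs. The function name is illustrative, and rounding to two decimals is an assumption consistent with how the values are presented.

```python
def coefficient_of_variation(mean, std):
    """Ratio of standard deviation to mean (dimensionless)."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return std / abs(mean)


# (mean, standard deviation) pairs as reported in this review.
cv_zhu_2017 = round(coefficient_of_variation(0.96, 0.51), 2)         # positioning error in mm [29]
cv_kiarostami_ar = round(coefficient_of_variation(2.4, 0.8), 2)      # AR positional error in mm [49]

print(cv_zhu_2017)       # 0.53
print(cv_kiarostami_ar)  # 0.33
```

Because the ratio cancels the unit, the same function applies unchanged to deviations expressed in millimetres and in degrees.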

3.2.1 Maxillofacial

Maxillofacial surgery entails procedures to treat conditions, wounds, and birth defects affecting face, jaw, or mouth.

Zhu et al. (2017) [29] employed Augmented Reality (AR) to project inferior alveolar nerve bundles onto both a 3D-printed mandibular model and the patient, aiming to ensure safe mandibular incisions. The authors utilized an HMD and a Hiro marker on an occlusal splint to improve the accuracy during the registration phase. During surgery, clinicians followed the osteotomy lines projected on the virtual 3D model, achieving a positioning error of \(0.96 \pm 0.51\) mm (\(CV = 0.53\)). A detailed depiction of the Occlusal Splint with Marker (OSM) is presented in Fig. 7.

Fig. 7 Occlusal splint with marker (OSM) flow diagram: 1) Patient before VSP; 2a) Definition of cutting planes on the patient’s anatomy; 2b) Realization of the OSM with a (planar or biplanar) marker cemented to a patient’s dental cast; 2c) OSM + dental cast 3D scanning; 3) Virtual model (VM) generation: superimposition of the anatomical model + cutting planes, and the OSM + dental cast; 4) Physical OSM placement and virtual model projection on the patient, via AR device; 5) Virtual model registration by AR device recognition of the marker. Steps 2a and 2b + 2c can be performed by different technicians simultaneously

Furthermore, Zhu et al. (2018) [36] compared three osteotomy methods for mandibular angle surgery (AR, PSTs, and FH) in a retrospective study. They used AR to project cutting-plane holograms on 31 patients (AR group) using an OSM, achieving an accuracy of \(1.18 \pm 0.34\) mm (\(CV = 0.29\)). The surgical guide group achieved \(0.96 \pm 0.42\) mm (\(CV = 0.44\)), while the FH group had an accuracy of \(3.64 \pm 0.77\) mm (\(CV = 0.21\)).

Pietruski et al. (2019) [38] conducted a proof-of-concept study to compare the accuracy of osteotomies performed with 3D-printed surgical guides and with two AR navigation methods based on Polaris Vicra and Moverio BT-200 Smart Glasses. The first method consisted in positioning the blade through digital coordinates or using a 3D mandible image. The second method superimposed the VSP onto the surgeon’s view, displaying virtual osteotomy features. The resulting errors were \(1.65 \pm 0.88\) mm and \(4.94 \pm 4.62\)° with guides, \(1.79 \pm 0.94\) mm and \(5.34 \pm 3.67\)° with the first AR method, and \(2.41 \pm 1.34\) mm and \(7.14 \pm 5.19\)° with the second one. The study was replicated in 2023 [77] on cranial replicas, yielding \(1.30 \pm 0.73\) mm and \(3.73 \pm 2.94\)° for 3D-printed guides, \(1.86 \pm 0.88\) mm and \(5.93 \pm 5.12\)° with simple AR, and \(1.97 \pm 0.70\) mm and \(6.75 \pm 5.33\)° with navigated AR.

Ahn et al. (2019) [39] compared tracking accuracy of custom AR, stereo camera, and IR-based optical tracking system (OTS) for maxillary repositioning. The AR system relied on cubic markers for registration. The IR-based OTS had a 0.0584 mm error (98.83 % accuracy, i.e., \(CV = 1 - 0.99 = 0.01\)), while the AR system had a 0.0596 mm error (98.81 % accuracy, i.e., \(CV = 1 - 0.99 = 0.01\)).

Han et al. (2019) [40] used AR for cranial vault reconstruction. They superimposed a semi-transparent virtual 3D skull model with cutting planes using an OSM. Surgery outcome was evaluated by intracranial volume change, but osteotomy accuracy was not directly measured.

Gao et al. (2019) [41] conducted a feasibility assessment of mandibular angle split osteotomy using the Hololens AR system. They employed two approaches: in the first one, the drilling information was projected onto the field with a navigation interface; in the second one, no guidance was provided. Results for the first method showed errors of \(2.09 \pm 0.53\) mm (\(CV = 0.25\)) and \(2.39 \pm 0.76\)° (\(CV = 0.32\)). The second method produced higher errors: \(2.92 \pm 0.88\) mm (\(CV = 0.30\)) and \(6.77 \pm 1.86\)° (\(CV = 0.27\)).

Cercenelli et al. (2020) [42] created a “Video and Optical See-Through Augmented Reality Surgical System (VOSTARS)” to perform the cut for maxillary repositioning (Le Fort I osteotomy), using a dental occlusal splint with a spherical marker. The AR system consisted of two Leopard Imaging LI-OV4689 cameras, a tracking module, and a projection module. Marker detection resulted in a colored circle superimposed on the recorded image of the spherical marker. During the experiment, ten participants drew the Le Fort I osteotomy line on a 3D model with a 0.5 mm thick pencil, following a virtual dotted line projected by the AR system. Subsequently, a specific 3D-printed inspection window was used to quantify the accuracy of the traced line by measuring the percentage of the line that fell within the window itself. The authors accounted for the width of the stroke by adding 0.5 mm to the clear width of the windows, noting that \(100\%\) of the lines fell within the 2.5-mm inspection window (which corresponds to a maximum error of \(\pm 1\) mm) (\(CV = 0\)), while 87.6% fell within the 1.5-mm inspection window (maximum error of \(\pm 0.5\) mm) (\(CV = 1 - 0.88 = 0.12\)).

Mamone et al. (2020) [43] employed a custom AR setup for Le Fort I osteotomy projection on a cranial replica. Their system, equipped with cameras and a projector, achieved an accuracy of 0.3 mm through Hausdorff distance evaluation [79].

Neves et al. (2020) [47] utilized Magic Leap One AR goggles for frontal sinus osteotomies. Automatic alignment through facial surface recognition was employed on 6 cadaveric specimens. A semi-transparent skull hologram with colored bone flaps was projected onto the patient. Accuracy averaged \(1.4 \pm 4.1\) mm (\(CV = 2.93\)), measuring radial distances between osteotomy points and frontal sinus.

Kim et al. (2020) [48] employed HTC Vive Pro for orbital floor reconstruction in total maxillectomy. A hologram with cutting planes and a semi-transparent skull was projected, using a forehead-attached marker for registration. Average positioning error was \(2.77 \pm 1.29\) mm (\(CV = 0.47\)).

Condino et al. (2021) [50] tested the reliability of a wearable AR device for craniotomies. They used VOSTARS, previously employed by Cercenelli et al. [42], and patient-adapted markers to project osteotomy lines onto a 3D-printed skull. Accuracy inspection showed 97 % of lines below 1.5 mm and 92 % below 1 mm (\(CV = 0.08\)).

Jo et al. (2021) [52] utilized AR in orthognathic surgery to display a virtual model featuring the skull, jaw, Le Fort I osteotomy, and repositioned jaw. A display monitor and camera system were combined for tracking, along with manual registration based on facial markers. Positional error, quantified as the distance between planned and post-operative points, was \(3.00 \pm 1.44\) mm (\(CV = 0.48\)).

Koyachi et al. (2021) [53] employed the Hololens display system with a binary-patterned marker on a dental occlusal splint for Le Fort 1 osteotomy, which also housed a cutting and repositioning guide. Gesture-controlled display showcased the virtual skull, preserved vessels, and repositioned maxilla, with adjustable transparency through gestures and voice commands. Median deviation of seven maxillary points resulted in a repositioning error of 0.38 mm.

Sahovaler et al. (2021) [55] compared osteotomy accuracy for sinonasal malignancies using unguided, AR-guided, Intraoperative Navigation (IN), and AR+IN methods. The AR-guided method employed markers for the registration and a PicoPro device for the visualization, while IN tracked tools and patients, displaying them on a screen. Four 3D skull models, each with a tumor, were used for the method assessment. Intratumoral cuts occurred in 20.7 %, 9.4 %, and 1.2 % of unguided, AR-guided, and IN cases, respectively. AR+IN yielded 0.0 % intratumoral cuts, proving to be the most effective.

Tang et al. (2021) [56] performed retrospective analysis on seven patients who underwent oral and maxillofacial tumor removal with Hololens support. The virtual skull model was aligned using skin surface registration and highlighted the target bony area, along with cranial points for correctness verification. The osteotomy planes’ positional error was \(1.68 \pm 0.92\) mm (\(CV = 0.55\)). Similarly, García-Mato et al. (2021) [59] designed an AR app for cranial vault remodeling using markers on the skull. Their workflow, transferred to the OR, yielded a positioning error of \(0.62 \pm 0.51\) mm (\(CV = 0.82\)) and an angular error of \(1.80 \pm 1.88\)° (\(CV = 1.04\)). Leuze et al. (2021) [60] employed Magic Leap One to guide cranial bone removal, achieving an average osteotomy contour accuracy of \(2.2 \pm 2.6\) mm (\(CV = 1.18\)).

Meulstee et al. (2022) [65] explored VSP accuracy in cranial vault reconstruction using surgical guides and Hololens. They employed 3D-printed replicas, registering them through point-based methods. Augmented points along osteotomy lines were displayed via Hololens, aided by an interactive system for alignment. Surgical guides achieved \(0.9 \pm 0.6\) mm (\(CV = 0.67\)), while AR guidance yielded \(2.1 \pm 1.5\) mm (\(CV = 0.71\)) discrepancies.

Zoabi et al. (2022) [68] utilized Hololens 1 for orbital floor implant placement. Markerless alignment of virtual skull models and implants achieved a remarkable precision, with placement accuracy under 0.3 mm.

Bussink et al. (2022) [70] employed Hololens 2 to guide mandibular condyle osteotomy. A square QR marker aided virtual planning registration, guiding the surgeon’s pointer through displayed distances and orientation cues. Postoperative analysis revealed a 1 mm distance between planned and actual points.

Ceccariglia et al. (2022) [71] implemented markerless AR guidance with Hololens 2 for mandibular and maxillary osteotomies. Virtual model registration incorporated the skull, tumor mass, and cutting planes based on facial surface recognition across three patients. Surgeons controlled hologram display using voice commands and an interactive menu. Osteotomy accuracy using AR was compared to 3D-printed guides, yielding less than 2 mm discrepancy.

Chan et al. (2022) [72] utilized PicoPro projector, Polaris Spectra camera, and ICAN Webcam for AR-guided cranial osteotomies. Colored lines, tumor projections, landmarks, and real-time updating bars were used to guide osteotomies. Registration with fiducial markers resulted in 0.0 % cut inside the tumor area under AR guidance, versus 1.9 % without any guidance.

Lin et al. (2022) [73] developed a Hololens app for zygomaticomaxillary fractures. Square QR markers aided virtual planning and zygomatic repositioning. Color changes and auditory signals indicated correct zygoma positioning; moreover, interactive tools set transparency and provided guidance lines. The experimental group (AR) achieved an accuracy of \(1.6 \pm 0.3\) mm (\(CV = 0.19\)), while the control group (optical navigation) obtained \(2.0 \pm 0.3\) mm accuracy. As a result, the app improved fracture reduction in eleven patients compared to optical navigation. Lin et al. (2022) [74] expanded on previous work by assessing ten cases, evenly divided between experimental and control groups. The accuracy achieved was 1.35 mm for the experimental group and 1.61 mm for the control group.

Sugahara et al. (2023) [76] utilized the Hololens for maxillary surgery involving Le Fort 1 osteotomy. They employed 3D-printed surgical guides and holographic projections to position iliac crest flaps accurately. By projecting a semi-transparent hologram onto the patient, they facilitated adjustments through voice and gestures, resulting in an average accuracy error of under 2 mm.

Animal studies were also conducted to explore the potential of AR. Zhou et al. (2017) [32] combined AR with robot-assisted surgery for mandibular osteotomy in dogs. They used virtual markers for drilling and overlaid a virtual mandible model for safety. By integrating AR and a robotic arm, they achieved a position error of \(1.13 \pm 0.15\) mm (\(CV = 0.13\)) and an angular error of \(6.69 \pm 1.05\)° (\(CV = 0.16\)), while assessing proximity to the mandibular nerve.

In a different context, Hou et al. (2022) [63] employed the nVisor ST60 HMD for cosmetic mandibular surgery on canine specimens. They used holograms to guide osteotomies and measure deviations. The achieved deviations were \(0.18 \pm 0.46\) mm (\(CV = 2.56\)), \(0.20 \pm 0.51\) mm (\(CV = 2.55\)), and \(0.948 \pm 1.388\)° (\(CV = 1.46\)) for position, segment length, and angular alignment, respectively.

3.2.2 Pelvic

Pflugi et al. (2018) [34] developed a system using a Raspberry Pi Zero, WiFi USB dongle, and sensors to track marker orientation during surgery. Results showed average angular differences of \(1.34 \pm 1.50\)° (\(CV = 1.12\)) for cadavers and \(1.63 \pm 1.48\)° (\(CV = 0.91\)) for replicas.

Kiarostami et al. (2020) [49] utilized Hololens 1 for osteotomy guidance. Their study compared AR and FH approaches, yielding positional errors of \(2.4 \pm 0.8\) mm (\(CV = 0.33\)) and \(2.8 \pm 0.8\) mm (\(CV = 0.29\)), along with angular errors of \(7.5 \pm 2.1\)° (\(CV = 0.28\)) and \(10.3 \pm 2.5\)° (\(CV = 0.24\)).

Ackerman et al. (2021) [57] demonstrated AR feasibility by superimposing virtual cutting planes on cadavers using Hololens. Real-time angular discrepancies were displayed during complex osteotomies, with a positional error of 10.8 mm and angular error of 5.4°.

In the realm of surgical guides, García-Sevilla et al. (2021) [61] explored several different positioning methods. They compared FH, smartphone AR, and Hololens 2 approaches, utilizing planar markers and employing optical tracking. Median errors ranged from 1.04 to 3.37 mm for different methods and scenarios.

Modabber et al. (2022) [66] conducted a cadaveric study to compare the accuracy of osteotomies for iliac crest harvesting using 3D-printed surgical guides and a markerless AR approach. They employed a L750ST mini projector and a RealSense D415 3D camera mounted on the LBR 14 R820 lightweight robot to project bone sections onto cadavers. The study included ten iliac crests from five cadavers. The positioning error was calculated by aligning actual and planned iliac crests scans, resulting in \(2.65 \pm 3.32\) mm (\(CV = 1.25\)) and \(1.47 \pm 1.36\) mm (\(CV = 0.93\)) mean distances using AR and surgical guides, respectively. The angular deviation was \(14.99 \pm 11.69\)° (\(CV = 0.78\)) and \(8.49 \pm 5.42\)° (\(CV = 0.64\)) for AR and surgical guides.

Winnand et al. (2022) [67] conducted a pilot study to evaluate the accuracy of markerless AR navigation versus 3D-printed surgical guides for iliac crest transplantation in facial skeleton defect correction. They performed osteotomies on ten additively printed iliac crest models, with positioning errors of \(2.29 \pm 1.98\) mm (\(CV = 0.86\)) for AR navigation and \(1.32 \pm 1.00\) mm (\(CV = 0.76\)) for surgical guides. The angular deviations were \(10.21 \pm 7.22\)° (\(CV = 0.71\)) and \(6.98 \pm 4.70\)° (\(CV = 0.67\)) for AR navigation and surgical guides, respectively.

Mendicino et al. (2022) [69] compared the accuracy of surgical guide positioning for pelvic resections using the VOSTRAS AR device [42] and a FH approach. They placed guides on hip bone replicas covered with foam to simulate soft tissue, resulting in positioning errors of 3.55 mm using the AR device and 5.12 mm using the FH approach.

Cho et al. (2018) [33] assessed pelvic bone tumor excision accuracy using a tablet PC with AR guidance and a conventional optical navigation system. They performed thirty-six excisions on pig specimens, obtaining positional errors of \(1.59 \pm 4.14\) mm (\(CV = 2.66\)) for AR and \(4.55 \pm 9.70\) mm (\(CV = 2.13\)) for the optical system.

3.2.3 Lower limb

Bong et al. (2017) [31] utilized the Polaris Vicra optical tracking system and a Logitech C270 camera to project a virtual triangular prism onto a model, simulating femoral osteotomy. The positioning error was assessed, yielding an average of \(0.59 \pm 0.39\) mm (\(CV = 0.66\)) and an angular error of \(1.31 \pm 0.38\)° (\(CV = 0.29\)).

Moreta-Martínez et al. (2018) [35] employed Hololens and Polaris for tumor removal surgery, offering an interactive menu for displaying 3D models such as skin, bones, and tumors. Positioning and display errors were quantified, resulting in a marker positioning error of 1.87 mm and an AR display error of 2.90 mm.

Battaglia et al. (2019) [37] introduced a markerless AR approach for mandibular reconstruction using smartphones and tablets. They achieved successful registration of fibular bone, veins, and surgical guides, enhancing visualization and control during surgery.

Moreta-Martínez et al. (2020) [44] outlined a protocol for creating an AR smartphone app to register and visualize tibia and fibula bone segments relative to a 3D-printed cubic marker. This work highlighted the potential for improved visualization and guidance in surgical procedures.

Pietruski et al. (2020) [45] investigated AR systems for fibula free flap harvest in mandibular reconstruction. They utilized two AR approaches, one involving an optical tracking system with Moverio BT-200 Smart Glasses, and the other one using smart glasses with spherical markers for tracking. The positional errors were \(2.76 \pm 1.06\) mm, \(2.67 \pm 1.09\) mm, and \(2.95 \pm 1.11\) mm for the three methods, respectively. Angular errors ranged from \(3.18 \pm 1.34\)° to \(5.42 \pm 3.92\)°.

Viehöfer et al. (2020) [46] employed the Hololens and a position tracker for hallux valgus correction. They achieved angular errors of \(4.9 \pm 4.2\)° using AR and \(6.7 \pm 6.1\)° with the FH approach.

Meng et al. (2021) [51] utilized the Hololens to project cutting plane holograms onto fibula specimens in mandibular reconstruction. They manually registered the holograms for visualization and achieved positioning errors of \(2.11 \pm 1.31\) mm and angular deviations of \(2.85 \pm 1.97\)°.

Lin et al. (2022) [64] performed AR-guided mandibular reconstruction using free fibula flap harvesting on a patient with ameloblastoma. They utilized 3D-printed surgical guides for osteotomies and validated them with AR.

Zhao et al. (2022) [75] utilized AR navigation for free fibula flap osteotomy in mandibular reconstruction both for an in vitro group and an in vivo group. They employed a bi-planar marker attached to the fibula to aid registration. Virtual fibular flaps were projected and used as guides for osteotomies, achieving a positional error of \(1.03 \pm 0.68\) mm (\(CV = 0.66\)) and angular deviations of \(5.04 \pm 2.61\)° (\(CV = 0.52\)) for the in vitro group, and \(1.18 \pm 0.84\) mm (\(CV = 0.71\)) and \(5.45 \pm 1.47\)° (\(CV = 0.27\)) for the in vivo group.

Cho et al. (2017) [30] focused on bone tumor removal in pig femurs, employing a Surface Pro 3 tablet PC and a binary marker for registration. They projected a cylindrical hologram serving as a “ruler” reference, enabling clinicians to interact with the AR application for surgery planning. The accuracy assessment showed a distance of 1.71 mm for the AR group and 2.64 mm for the control group in terms of plan-post cutting planes.

3.2.4 Other districts

Molina et al. (2021) [58] utilized Augmedics Xvision for an en bloc resection of a spinal tumor, employing simultaneous surgeon collaboration and planar marker registration to guide the procedure.

Moreta-Martinez et al. (2021) [54] developed an AR mobile app for orthopedic oncology, integrating Polaris Vicra for optical tracking and a smartphone display for visualization. Their surgical guides featured planar surfaces and a unique symbol-patterned cubic marker, facilitating precise osteotomies. The app enabled users to interact with holograms, adjusting model transparency and assessing marker recognition success. The placement error averaged \(1.75 \pm 0.61\) mm, and the tracking error measured \(2.80 \pm 0.98\) mm.

Dennler et al. (2021) [62] conducted a user-focused feasibility study on AR instruments in orthopedic surgery, utilizing Hololens 1. Surgeons evaluated the AR system’s usability and its applicability in different procedures, rating AR use in osteotomies at \(82 \pm 17\) points out of 100. Unlike previous works, this study emphasized user experience over procedural accuracy.

All CV values, distinguishing between those related to distances and angular deviations, are shown in Fig. 8 for a direct comparison of the achieved results, regardless of the employed accuracy evaluation methodology.

Fig. 8

Coefficient of Variation (CV) of distance outcomes (left) and angular deviations (right)
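
The CV values quoted alongside the results in this Section can be reproduced from the reported mean and standard deviation. The following is a minimal sketch, assuming that CV is computed as the ratio between standard deviation and mean (an assumption that matches the values quoted for the entries below); the function name and the labels are illustrative only.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return CV = SD / mean for a strictly positive mean."""
    if mean <= 0:
        raise ValueError("CV is only meaningful for a positive mean")
    return sd / mean


# (label, mean, SD) triples copied from the results reported in this Section.
reported = [
    ("Lin et al. [73], AR, distance (mm)", 1.6, 0.3),
    ("Zhou et al. [32], distance (mm)", 1.13, 0.15),
    ("Zhou et al. [32], angle (deg)", 6.69, 1.05),
    ("Hou et al. [63], distance (mm)", 0.18, 0.46),
]

# Round to two decimals, as done for the CV values in the text.
cv_table = {
    label: round(coefficient_of_variation(mean, sd), 2)
    for label, mean, sd in reported
}

for label, cv in cv_table.items():
    print(f"{label}: CV = {cv:.2f}")
```

A CV above 1, as in the last entry, indicates a standard deviation larger than the mean, i.e., a strongly dispersed outcome.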

3.3 AR-based technology comparison and open-issues

Based on the previously analyzed articles, there are three primary types of AR (Augmented Reality) technologies that are employed: smartphone or tablet displays, head-mounted displays (HMDs), and projectors.

Concerning the first approach, the AR application is installed on a smartphone or tablet device. Given the widespread adoption of these devices, the main advantage is that users can quickly familiarize themselves with these applications. However, the inherent limitations of this method are the small screen size and the non-compliance with the see-through paradigm. Due to these constraints, an immersive experience cannot be provided and the clinician is forced to frequently switch gaze between the display and the surgical field. Additionally, continuous handling of the device is required during surgical procedures, thereby either immobilizing one of the clinician’s hands or necessitating an assistant holding the device. Alternatively, a specialized adjustable stand may be needed for the device.

Regarding the second approach, the AR application is integrated into a wearable HMD, allowing the clinician to directly observe the surgical field. This promotes concentration on the surgical task with both hands free and eliminates the inconvenience of shifting the gaze, as experienced with handheld devices. Furthermore, HMDs support gesture and voice recognition to further enhance the user interaction. Potential disadvantages include a steeper learning curve to get accustomed to both the device and the AR application and increasing discomfort when the device is worn for prolonged periods. Moreover, the cost is typically higher than that of smartphones.

The third approach involves the employment of a projector, namely a technology that utilizes light sources to cast specific shapes directly onto the patient. The main advantages consist in the elimination of the need for a display system, coupled with the fact that the equipment is neither complex nor expensive. However, it does demand an accurate preliminary calibration and ambient light conditions should be consistently managed to maintain optimal projection quality. The quality of projected shapes is contingent on the target surface, and as the size of the projected shape increases, so does the potential for distortions.

The points discussed above have been summarized in Table 3.

Table 3 Comparison of different AR technologies in terms of features: higher the better, lower the better

Regardless of the employed technology, the literature analysis revealed some open problems common to every considered tool or device. In detail, the main technical issues are related to the registration and calibration phases [29, 77], the localization and visualization of surgical instruments [31, 77], the delay between real and virtual models [38, 44, 59], lighting issues [54, 58, 71], HMD discomfort [48], and interface management [45, 48, 60]. All these aspects are discussed in the next Section, where some guidelines to overcome these issues and to foster research on AR in osteotomies are drawn.

4 Discussion

The literature review brought to light several aspects to be considered in order to answer the Research Questions (RQs) properly and provide guidelines to foster research on AR solutions. For the sake of clarity, this Section has been divided into four subsections. Three of them aim to answer the three RQs. The last subsection gathers all the retrieved information to emphasize the major issues reported in the literature and propose some guidelines for future research.

4.1 RQ1: Anatomical districts

In Section 3 the results of the literature review have been sorted according to the anatomical district. In this way the answer regarding the areas involved in AR-based osteotomies (RQ1) has been highlighted, together with some information regarding the different AR-based approaches for each specific study. Nonetheless, some considerations that could further frame the scenario can be drawn. Indeed, the results indicate that AR was used in 57.1 % of cases to guide osteotomies in the maxillofacial area. The anatomical complexity of the skull, along with the facial aesthetic factor, which is core to positively assessing the surgery outcome, drives research in the maxillofacial area toward reliable alternatives to patient-specific templates (PSTs) and the freehand (FH) approach. Indeed, among the twenty-nine identified studies focusing on the maxillofacial field, eight directly compare AR with traditional approaches: Zhu et al. (2018) [36], Pietruski et al. (2019, 2020 and 2023) [38, 45, 77], Meulstee et al. [65], Modabber et al. [66], Winnand et al. [67], and Ceccariglia et al. [71]. Moreover, 20.4 % and 16.3 % of the studies covered lower limb bones and pelvic bone, respectively. High accuracy is also required in these anatomical regions, where osteotomies are commonly performed to obtain bone fragments for transplantation into the skull, such as for mandibular and maxillary reconstructions. The need for precise positioning of the cut bone fragments at the desired location justifies the development of new solutions for osteotomy guidance. Only one study concerns the spine, and two studies involve interventions on different anatomical locations, i.e., pelvic bone, femur, and rib cage in the study conducted by Moreta-Martinez et al. (2021) [54] and spine, shoulder, knee, hand and foot bones in the article by Dennler et al. (2021) [62].
In this regard, when designing an AR solution intended for different anatomical regions, the overall complexity increases. Therefore, the designed solution must be as versatile as possible, so that it can be applied to each anatomical region with minor modifications while achieving a balanced trade-off between accuracy and flexibility.

4.2 RQ2: Real-virtual interaction

The interaction with the information provided using AR, i.e., the holograms when using an HMD or the display when employing a personal device such as a smartphone, is the most sensitive aspect when designing and developing an AR solution. RQ2 was formulated to understand how interaction has been addressed in the literature.

The main advantage of AR technology over PST and FH approaches is the opportunity to display additional information on the surgical field, which is beneficial for both patient safety and surgeon navigation. Different approaches can be followed in order to properly identify the cutting location. Using a projector, a simple line or the PST reproduction, i.e., the projection of the PST shape, is displayed directly on the patient’s bone. AR applications running on personal devices or HMDs can display the cutting line or points belonging to the planned trajectory; nonetheless, the most common solution to guide the surgeon during the osteotomy is the visualization of the cutting plane defined during VSP, in order to provide both the position and the orientation [73]. AR can also be exploited to enhance the surgeon’s visibility by overlaying veins [53], arteries, tumoral masses [55, 71], and nerves [29, 41] on the patient. Given the complex nature of the information, HMD holograms are the best option to properly display the 3D models. Using HMDs, the guidance can also be complemented by indications in the form of directional arrows [65], angular and positional discrepancies between the target and actual cutting device location [72], and a color change of the virtual model once a step has been completed successfully [56]. For instance, the color would change when the cutting tool is in the correct position, the PST is appropriately anchored, and the bone fragment has been correctly relocated [73]. Gesture and voice commands can further improve the interaction by keeping the surgeon’s hands free.

A key step to guarantee the interaction between the real and the virtual world is the registration phase, during which the physical and the virtual 3D models are aligned in order to track their position and orientation. There are multiple methods to facilitate both interaction and usability of AR applications during the registration process, according to the marker type. Once marker detection is successfully completed, a signal such as an acoustic prompt [73] or a visual alert, like a frame around the marker [39, 59] or a color change of the marker-recorded image [42, 69], could be provided by the AR device. Furthermore, colored fiducial landmarks could be projected on the target surface [72]. This approach has also been adopted in markerless AR applications.

The choice of marker type plays a key role in the development of an AR application. Planar markers are used as reference points for accurately positioning virtual objects in short-range AR applications. A subgroup of planar markers, the Hiro markers, are well known for their high quality and reliability due to their grid structure and efficient detection algorithms [80]. Bi-planar markers improve the robustness and accuracy of registering virtual objects with the real world, as they can provide greater coverage of the field of view compared to traditional planar markers [81]. Cubic markers, among which there are those used in ARToolKit, are primarily employed for long-range AR applications [82, 83]. Spherical markers are suitable for high-precision three-dimensional localization. They are used in situations where the marker needs to be easily identifiable from different viewpoints, as their shape allows for a 360-degree recognition [84, 85].

In the context of maxillofacial surgery, a widely adopted solution to ensure the correct mutual patient-marker positioning is the “occlusal splint with marker” (OSM). This configuration, where the marker is attached to a plaster cast modeled on the patient’s dentition, allows for repeatable placement of the marker and was adopted in six studies using a planar marker [29, 36, 40,41,42, 63], and in two works using a biplanar marker [53, 76].

A correct positioning of the physical markers is not always possible due to differences and imperfections of the bone-marker interface. Additionally, the surgeon could temporarily occlude marker detection of the camera by passing their hand over the marker itself [59]. A markerless registration is the most suitable approach to overcome these problems [35, 61] and decrease the time required for CAD modeling and printing [67]. Moreover, the absence of physical markers could enhance the movements of the surgeons without obstructing the surgical field [71, 75]. However, a markerless method also has its limitations. Firstly, surface-based registration requires distinctive shapes to be clearly recognizable from an AR device camera [75]. Additionally, soft tissue-based registration, such as facial skin registration, could be imprecise if the soft tissue is deformed in relation to its planned configuration [60].

The larger prevalence of physical markers can be justified by the simplicity of their realization [29, 35] (they are almost always 3D-printed) and the possibility of attaching them directly to the patient’s bone [54] allowing rapid registration of the virtual model on it [35, 40].

4.2.1 Fabrication tolerances impact on AR technologies

The accuracy of osteotomy outcomes retrieved in the selected studies is provided without distinguishing the intrinsic AR device projection and tracking error from the fabrication tolerances inherent to the production process of the employed surgical tools, although their manufacturing is not negligible in terms of additional effort and time. Referring to the OSM detailed in Section 3.2.1, the fabrication includes a dental cast with a cemented 3D-printed planar marker, and a 3D scan of the latter, introducing three potential sources of error even before the AR application is used. For instance, the dental cast might not fit perfectly with the teeth, the 3D marker could be affected by flatness errors along with the profile error of the connection element, and the 3D scanning might introduce inaccuracies due to alignment of point clouds, lighting conditions, and camera resolution. Moreover, spherical markers, commonly used in AR surgical tool tracking, require the projection of a surgical tool virtual model onto the physical instrument using a marker frame, designed with a minimum of three spheres. The position of the spheres, i.e., the distances between their centers, may deviate from the CAD model, particularly when the sphere frame is 3D printed, due to fabrication error. In addition, the AR projection error adds to the fabrication error, resulting in a final superimposition error (Fig. 9 on the top). Another example is the cubic marker registration. Given that these markers are often 3D printed, potential errors can arise not only from the profile deviation of the cubic surface, but also from the angular deviation at the junction between the marker and its attaching base. Moreover, the AR projection error between the virtual model and the physical marker accumulates, further increasing with the distance from the cube to its attaching base (Fig. 9 on the bottom).

Fig. 9

Schematic decoupling of fabrication and AR projection errors as contributors to the total superimposition error of the virtual model onto the physical one. On the top, the spherical marker frame with location errors of the sphere centers; at the bottom, the cubic marker with potential angular deviation
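
The growth of the error with the distance from the attaching base can be illustrated with a simple lever-arm calculation. The following is a minimal sketch, assuming that the angular deviation at the junction acts as a rigid tilt about the base and that the fabrication and AR projection contributions are simply summed as a worst case; the function names and all numerical values are hypothetical and are not taken from the reviewed studies.

```python
import math


def lever_arm_offset(distance_mm: float, angular_deviation_deg: float) -> float:
    """Lateral offset at a given distance from the pivot caused by a rigid tilt."""
    return distance_mm * math.tan(math.radians(angular_deviation_deg))


def worst_case_superimposition_error(fabrication_mm: float, projection_mm: float) -> float:
    """Worst-case total error, assuming the two contributions add up linearly."""
    return fabrication_mm + projection_mm


# Hypothetical values: 1 degree of tilt at the junction, 1 mm of AR projection error.
TILT_DEG = 1.0
PROJECTION_ERROR_MM = 1.0

for distance in (10.0, 30.0, 50.0):
    fabrication = lever_arm_offset(distance, TILT_DEG)
    total = worst_case_superimposition_error(fabrication, PROJECTION_ERROR_MM)
    print(f"{distance:5.1f} mm from base: "
          f"fabrication {fabrication:.2f} mm, total {total:.2f} mm")
```

Under these assumptions, even a one-degree tilt yields a sub-millimeter offset close to the base but approaches one millimeter at 50 mm, which is comparable to several of the accuracy values reported in Section 3.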

4.3 RQ3: AR-based solutions advantages and limitations

The assessment of FH, PST, and AR-based approaches to perform osteotomies is typically based on the accuracy of the surgery outcome. The most used accuracy assessment method when using planar markers turned out to be point distance, although there are some exceptions: indeed, Pflugi et al. [34] and Zhu et al. [36] computed the angle between planes and surface distance. In contrast, when biplanar markers are used, osteotomy accuracy can be evaluated by measuring both point and surface distance [76]. Furthermore, point distance is also the most used method to evaluate accuracy when employing spherical and cubic markers. An interesting alternative has been proposed by Cercenelli et al. [42] and Condino et al. [50], who demonstrated that an inspection window can be easily fabricated and positioned on top of an osteotomy line to measure accuracy. Additionally, in the study by Sahovaler et al. [55] involving spherical markers, the accuracy of AR-supported osteotomy was reported in terms of whether the tumor was affected by the surgeon’s cutting procedure, thus keeping patient safety core even in the accuracy evaluation. Point distance is also suitable when cylindrical markers are used; nonetheless, Viehöfer et al. [46] proposed measuring the angle between planes.

Overall, the analysis shows that quantitative assessments are made in 85.7 % of cases, such as distance between corresponding points, angular deviation, distance between surfaces and inspection windows. Among these, the distance between points appears to be the most popular method, although the point selection is not always unique: in some works these points are selected directly on the cutting planes or surfaces [30, 31, 38, 45, 49, 51, 57, 59, 63, 70, 75, 77], while in other works, especially after roto-translations of bony portions, these points are located above the surfaces [35, 39, 52, 53, 68, 73, 74]. Only a few studies reported the PST placement error [29, 35, 54, 61, 68, 69], which directly affects the position of the cutting plane. Furthermore, when the bone cut is made by consecutive drillings, the distance between the start and end drilling points is evaluated [32, 41]. There are also other criteria that must be considered to assess the accuracy: the mean distance between the actual osteotomy line and the planned line [43, 47, 65], possible tumor boundaries [33] or anatomical regions not to be severed [60].
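
The two most frequent quantitative measures, i.e., the distance between corresponding points and the angle between planned and actual cutting planes, can be sketched as follows. This is a minimal illustration, assuming that points and plane normals are already expressed in the same reference frame after registration; the function names and the sample coordinates are hypothetical and do not reproduce the procedure of any specific study.

```python
import math


def point_distance(planned, actual):
    """Euclidean distance (same unit as the inputs) between two 3D points."""
    return math.dist(planned, actual)


def plane_angle_deg(normal_planned, normal_actual):
    """Angle in degrees, within [0, 90], between two planes given their normals."""
    dot = sum(a * b for a, b in zip(normal_planned, normal_actual))
    norm = math.hypot(*normal_planned) * math.hypot(*normal_actual)
    if norm == 0:
        raise ValueError("plane normals must be non-zero vectors")
    # The absolute value makes the result independent of the normal orientation;
    # the clamp guards against rounding slightly above 1.
    cos_angle = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cos_angle))


# Hypothetical planned vs. actual entry point of the cut (mm).
planned_point = (12.0, 40.0, 5.0)
actual_point = (13.0, 42.0, 7.0)

# Hypothetical planned plane normal and an actual normal tilted by 5 degrees.
planned_normal = (0.0, 0.0, 1.0)
actual_normal = (0.0, math.sin(math.radians(5.0)), math.cos(math.radians(5.0)))

print(f"positional error: {point_distance(planned_point, actual_point):.2f} mm")
print(f"angular error: {plane_angle_deg(planned_normal, actual_normal):.2f} deg")
```

Because the result depends on which points are selected, on or above the cutting surface, values obtained with different selection rules are not directly comparable.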

This variety of methods to assess the accuracy makes it impossible to compare FH, PST and AR-based approaches, also considering that they are performed in different anatomical districts and, hence, in different boundary conditions. For instance, in 53 % of the articles AR is used standalone to guide osteotomies, while in 30.6 % of the cases the accuracy of AR is compared with the accuracy obtained with other guidance systems. Furthermore, since this review focuses on the use of AR to support a specific task, i.e., the osteotomies, a methodology capable of blending medical and technological demands is indispensable. For these reasons, a Quality Function Deployment (QFD) [86] has been used to identify all the requirements that an osteotomy approach should satisfy and, subsequently, make the comparison and the identification of advantages and limitations more consistent. QFD allows two orthogonal dimensions to be integrated. For our purpose, the first dimension gathers the surgeons’ requirements, i.e., the needs when performing an osteotomy that have been identified after a detailed description of the operative scenario provided by the medical personnel, while the second dimension identifies the engineering metrics, i.e., the specifications that characterize the key technical features of the proposed solutions for the different osteotomy approaches (FH, PST, AR, AR+PST). The QFD reported in Fig. 10 has been filled in by a focus group including a maxillofacial surgeon, two residents, a mechanical engineer expert in the design and production of PSTs and two computer science engineers in the AR/VR area.

Fig. 10

Quality Function Deployment (QFD) identifies the requirements and enables consistent comparison of different approaches. On the left side, the needs of the surgeon during the osteotomy procedure are reported, while on the upper right side the engineering metrics that characterize different osteotomy approaches, such as FH, PST, AR, and AR+PST, are reported

In detail, the identified surgeons’ requirements are explained below:

  • High-precision cut: the bone cut should be performed as close as possible to the planned position according to the planned orientation

  • High-speed cut: the quicker the cut is performed the lower the risk of complications is

  • Short lead time: amount of time elapsing from the VSP to the solution ready to be employed, i.e., PST has been printed or the AR-based solution has been developed

  • Short learning curve: the proposed procedure must be learned quickly by the medical personnel

  • Real-time osteotomy variation: opportunity to vary the approach during the surgery

  • Flexible configuration: opportunity to design and keep available different solutions for the same surgery

  • Sustainability: this requirement aims to reduce the environmental and cost impact

  • Patient safety: patient’s health must be core whatever step of the procedure is

For the sake of clarity, the engineering metrics are also described:

  • Surgical tool interface robustness: this metric identifies how much the physical or virtual interface, if any, can support the cut

  • Anatomical tool anchoring efficacy: a reference system must be rigorously and quickly defined whatever the employed solution is

  • Surgical field accessibility: occlusions that could partially hinder the surgical field should be avoided or minimized

  • Designing time: amount of time elapsing for the VSP

  • Implementation time: amount of time during which the PST is printed or the AR-based solution is developed

  • Anchoring time: amount of time elapsing for positioning the support for the osteotomy

  • Surgical Tool Components: number of additional physical elements involved in the surgery

  • Material waste: material that cannot be used more than once

  • Management effort: parameter that considers the investment, the operating costs, the sterilization and the materials that must be employed for the osteotomy support solution
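
To illustrate how a QFD of this kind typically turns requirements into a ranking of metrics, the following minimal sketch computes, for each engineering metric, the sum of the relationship strengths weighted by the importance of the corresponding requirement. It assumes the conventional 1-3-9 relationship scale and a 1-5 importance scale; the weights and strengths are hypothetical placeholders covering only a subset of requirements and metrics, and they are not the values assigned by the focus group in Fig. 10.

```python
# Hypothetical importance of some surgeon requirements (1 = low, 5 = high).
requirement_weights = {
    "High-precision cut": 5,
    "Short lead time": 3,
    "Patient safety": 5,
}

# Hypothetical requirement-metric relationship strengths (1 = weak, 3 = medium, 9 = strong).
relationships = {
    "High-precision cut": {
        "Surgical tool interface robustness": 9,
        "Anatomical tool anchoring efficacy": 9,
        "Surgical field accessibility": 3,
    },
    "Short lead time": {
        "Designing time": 9,
        "Implementation time": 9,
    },
    "Patient safety": {
        "Surgical field accessibility": 9,
        "Anatomical tool anchoring efficacy": 3,
    },
}


def technical_importance(weights, relations):
    """Sum weight * strength over all requirements for each engineering metric."""
    scores = {}
    for requirement, row in relations.items():
        for metric, strength in row.items():
            scores[metric] = scores.get(metric, 0) + weights[requirement] * strength
    return scores


importance = technical_importance(requirement_weights, relationships)
ranking = sorted(importance.items(), key=lambda item: item[1], reverse=True)

for metric, score in ranking:
    print(f"{metric}: {score}")
```

The correlations among metrics, reported in the roof of the QFD, are not part of this score and have to be considered separately when interpreting the ranking.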

The QFD outcomes highlighted that the management effort, the surgical field accessibility, and the anatomical tool anchoring efficacy are, in this order, the three most important metrics on which to focus. Nonetheless, a proper interpretation cannot be made without considering the existing correlations between the metrics themselves, information that can be retrieved from the top of the QFD. From this perspective, the management effort and the surgical tool components are the most interesting parameters in terms of correlation with other engineering metrics; besides, they are positively correlated with each other. Predictably, the management effort has a strong negative or negative correlation with several other metrics, consistent with the fact that increasing the complexity of a solution results in increased costs [38]. On the other hand, reducing the number of surgical tool components can positively affect all the time-related metrics, hence also reducing the management effort, and at the same time support the surgery by enhancing surgical field accessibility and improving the anatomical tool anchoring efficacy [36].

In this sense, AR-based solutions proved to match most of the metrics. The components employed during the surgery by AR solutions can be managed more easily than PSTs [45]. Indeed, only markers must be 3D printed and sterilized, and this is simpler than performing the same operations on surgical or cutting guides, which could also be composed of several elements [53]. The designing time is demanding both for PSTs and AR-based solutions, mostly depending on the difficulties related to the specific osteotomy. Nonetheless, the implementation time reduction is a prerogative of AR solutions, since the 3D printing of PSTs can last hours and is prone to errors that could make it necessary to start over, losing time and wasting more material [65]. In the literature, this advantage is even more evident when the surgical planning must be modified. Indeed, it is sufficient to edit the VSP and reload the updated virtual model into the AR application without having to CAD-remodel the PST and wait for the printing time [45, 66, 77].

Although the employment of AR proved successful, in accordance with the increasing number of studies and research activity on this topic, the QFD highlighted some metrics in which the strengths of PSTs emerge. The surgical tool interface robustness rewards PSTs, because having a rigid physical support for the blade is an undeniable advantage, especially for surgeons with less experience, resulting in more accurate bone cuts [69]. Furthermore, anatomical tool anchoring using PSTs is simpler, resulting in a reduced anchoring time. Nonetheless, anchoring is one of the most difficult tasks to address. On the one hand, a PST should have a large surface to facilitate the anchoring, but, on the other hand, it should be small enough not to interfere with the accessibility [54]. In this sense, AR-based solutions are close to PSTs in terms of trade-off, especially when markerless anchoring is used.

Finally, it is worth mentioning that some solutions integrate PSTs and AR, combining the advantages of the two approaches. Nonetheless, the management effort required to deal with both PSTs and AR makes this hybrid approach unsustainable as a common practice, although it remains viable for cases with specific needs.

4.4 Guidelines to foster AR-based solutions

Given that the employment of AR-based solutions can be a viable way to support osteotomies, this subsection presents suggestions for addressing the identified open problems. The issues are presented below, from the most topical to the least restrictive:

  • Registration and calibration management. The time needed to properly calibrate and configure the AR hardware before surgery should be considered and integrated into the VSP. Moreover, the registration parameters should be constantly monitored to troubleshoot any issues that could arise during the surgery.

  • Visualization and localization of surgical instruments. The development of new tracking technologies, such as ultrasound or magnetic tracking systems, may lead to an improvement in the accuracy of optical tracking, especially if employed jointly with error correction algorithms. Furthermore, alternative techniques such as multi-sensor fusion are currently being tested and implemented [87].

  • Delay between virtual and real model movement. A preliminary assessment of the maximum computational complexity that the hardware can handle should be carried out, in order to adopt suitable countermeasures such as lowering the level of detail of the AR application and providing it with lighter functionalities. For example, it is good practice to decimate the mesh vertices of the virtual model appropriately, maintaining a high resolution only for the elements strictly necessary for surgical guidance.

  • Lighting issues. The operating room is usually kept under strict control in terms of boundary conditions. Nonetheless, AR tracking is particularly demanding from the lighting perspective; hence, if the lighting is not adequate for this purpose, the robustness of AR solutions should be increased by using larger and more visible markers, keeping in mind that this would also increase implementation time and decrease surgical field accessibility. Alternatively, markerless AR solutions based on computer vision and environment mapping can operate at greater distances than marker-based AR, at the cost of more processing power.

  • Discomfort due to the weight and bulk of the AR device. When using an AR device such as a smartphone or tablet, a physical support should be provided to free up the surgeon’s hands. Analogously, a head support could be designed to reduce weight and discomfort when using HMDs. This issue is hard to manage from the developers’ point of view: their only option is to limit the duration of HMD usage and to design solutions according to this constraint. In this sense, technological development is moving towards ever lighter and more easily wearable devices.

  • Distractions from excessive AR information flow during surgery. The user interface should be carefully designed to provide information only when needed and to allow the displayed content to be easily dismissed or changed. To this aim, voice and gesture commands could facilitate the user interaction and, consequently, the surgical procedure.
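
As an illustration of the mesh decimation suggested above for the delay issue, the following minimal sketch reduces the vertex count of a triangular mesh by vertex clustering: all vertices falling into the same cubic grid cell are merged into their centroid. Vertex clustering is only one of several possible decimation strategies and is chosen here for brevity; it is not a method prescribed by the reviewed studies. The function name, the mesh representation (lists of coordinate tuples and index triples) and the `cell_size` parameter are assumptions made for this example.

```python
import math
from collections import defaultdict


def decimate_by_vertex_clustering(vertices, triangles, cell_size):
    """Merge all vertices lying in the same cubic grid cell into their centroid.

    vertices:  list of (x, y, z) tuples
    triangles: list of (i, j, k) vertex-index tuples
    cell_size: edge length of the clustering grid (same unit as the vertices)
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")

    # Group vertex indices by the grid cell that contains them.
    cells = defaultdict(list)
    for index, vertex in enumerate(vertices):
        key = tuple(math.floor(coord / cell_size) for coord in vertex)
        cells[key].append(index)

    # Replace each group with its centroid and remember the index remapping.
    new_vertices = []
    remap = {}
    for members in cells.values():
        count = len(members)
        centroid = tuple(
            sum(vertices[i][axis] for i in members) / count for axis in range(3)
        )
        for i in members:
            remap[i] = len(new_vertices)
        new_vertices.append(centroid)

    # Re-index triangles, dropping collapsed and duplicated faces.
    new_triangles = []
    seen = set()
    for a, b, c in triangles:
        face = (remap[a], remap[b], remap[c])
        if len(set(face)) < 3:
            continue  # at least two corners were merged: the face collapsed
        key = frozenset(face)
        if key in seen:
            continue  # the same face was already produced by another triangle
        seen.add(key)
        new_triangles.append(face)

    return new_vertices, new_triangles
```

Under these assumptions, a coarse `cell_size` could be applied to the anatomical context, while the elements strictly necessary for surgical guidance could be left untouched or processed with a much finer grid.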

In addition to the above-mentioned well-known problems, the present review has brought to light another challenging issue. The vast majority of the analyzed works apply their own method for accuracy evaluation; consequently, grouping these methodologies within predefined categories is arduous, as is comparing the results obtained by different studies. Future studies should focus on establishing a standardized protocol to facilitate the accuracy evaluation, allow direct comparisons between different works, and foster the adoption of AR-based methodologies.
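
To make the comparison problem concrete, the sketch below computes two quantities that such a protocol could plausibly include: the angular deviation between the planned and the executed cutting plane, and the distance of a point of the executed cut from the planned plane. This is not a protocol proposed by the present review or by the analyzed works; the choice of these two quantities, the function names and the representation of a plane as a point plus a normal vector are assumptions made for illustration only.

```python
import math


def _normalize(vector):
    """Return the unit vector with the same direction as the input."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0:
        raise ValueError("a plane normal must be non-zero")
    return tuple(c / norm for c in vector)


def plane_angular_deviation_deg(planned_normal, executed_normal):
    """Angle in degrees between two planes, given their normal vectors.

    The sign of a normal is irrelevant for a cutting plane, so the absolute
    value of the dot product is used and the result lies in [0, 90].
    """
    n1 = _normalize(planned_normal)
    n2 = _normalize(executed_normal)
    dot = abs(sum(a * b for a, b in zip(n1, n2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(math.acos(dot))


def point_to_plane_distance(point, plane_point, plane_normal):
    """Unsigned distance of a point from the plane through plane_point."""
    n = _normalize(plane_normal)
    return abs(sum((p - q) * c for p, q, c in zip(point, plane_point, n)))
```

Reporting deviations in a shared form of this kind, in the same units and with the same reference frame, is one way in which results from different studies could become directly comparable.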

5 Conclusions

A systematic review of recent studies regarding osteotomies supported by AR-based applications has been performed. The selected studies have been grouped by anatomical district, with a focus on the interaction between the real and virtual environments, the accuracy computation, and the advantages and limitations of AR. A QFD has been used to identify the most impactful engineering metrics according to the needs identified by a focus group involving, among others, a surgeon and maxillofacial residents.

AR can lead to better surgical field accessibility, more flexible solutions, and a lower management effort, which makes it a cutting-edge approach in this field. Nonetheless, future research should address some well-known issues, among which are the calibration time, the robustness of the tracking, and the discomfort of HMDs, and should also focus on the establishment of a standardized protocol for accuracy evaluation in order to foster the adoption of AR technology in osteotomy treatment.