Systematic review on the effectiveness of augmented reality applications in medical training
Computer-based applications are increasingly used to support the training of medical professionals. Augmented reality applications (ARAs) render an interactive virtual layer on top of reality. ARAs are of particular interest to medical education because they blend digital elements with the physical learning environment, creating new educational opportunities. The aim of this systematic review is to investigate to what extent augmented reality applications are currently used to validly support the training of medical professionals.
PubMed, Embase, INSPEC and PsycINFO were searched using predefined inclusion criteria for relevant articles up to August 2015. All study types were considered eligible. Articles concerning AR applications used to train or educate medical professionals were evaluated.
Twenty-seven studies were found relevant, describing a total of seven augmented reality applications. Applications were assigned to three different categories. The first category is directed toward laparoscopic surgical training, the second category toward mixed reality training of neurosurgical procedures and the third category toward training echocardiography. Statistical pooling of data could not be performed due to heterogeneity of study designs. Face-, construct- and concurrent validity was proven for two applications directed at laparoscopic training, face- and construct validity for neurosurgical procedures and face-, content- and construct validity in echocardiography training. In the literature, none of the ARAs completed a full validation process for the purpose of use.
Augmented reality applications that support blended learning in medical training have gained public and scientific interest. In order to be of value, applications must be able to transfer information to the user. Although ARAs are promising, the literature to date lacks the evidence to support this.
Keywords: Augmented reality, Training, Medical specialist training, Surgery, Medical education
Simulation of critical situations creates a promising opportunity for the education of medical professionals in a safe environment. Virtual reality (VR) modalities create a digital environment designed to resemble aspects of the real world. As a result, trainees using VR simulation learn tasks in a setting that closely mimics relevant realistic situations. Relevant scenarios can thus be practiced in surroundings where exploration and troubleshooting are safe. Applications using VR have been shown to improve learning outcomes for different training procedures for various medical specialists [2, 3, 4, 5]. Much-desired outcomes in healthcare, such as improved patient safety and reduced costs and morbidity, have been reported after the use of computer-enhanced training.
Caudell introduced the term ‘augmented reality’ (AR) in 1990 while working for Boeing Computer Services. Workers were guided by a head-mounted display while performing electrical wiring for aircraft equipment, so that they could carry out tasks without spending hours studying abstract diagrams in manuals. In medicine, complex sequential tasks must be mastered, and the number and quality of operations must be maintained, while working hours are reduced [9, 10, 11]. Whilst workplace conditions for learning are under stress in terms of hours and opportunities, adequate training experiences must be ensured.
VR refers to a digital environment in which the user interacts as if it takes place in the real world. However, the focus of the interaction remains in the digital environment. AR differs from VR because the focus of the interaction of the performed task lies within the real world (AR) instead of the digital environment (VR). AR thus offers the opportunity of a digital, often interactive overlay onto a real or virtual environment. Augmented reality applications (ARAs) are digital applications offering such an extra layer. To the user, layers of the virtual and physical environment are blended in such a way that an immersive, interactive environment is experienced. Hence, ARAs may have great potential in training medical personnel.
Modern teaching curricula aim to educate trainees efficiently and in a safe environment. Educational methods currently being used in medical specialist training include practice-based learning, problem-based learning [12, 13], team-based learning [14, 15], eLearning [16, 17] and (VR) simulation training. Although VR learning environments offer opportunities for full- and partial-task training, they are often a mere representation of a task in reality. This may result in medical specialists who are well trained for a particular task in a set context, but who lack the competencies needed to adapt to ever-changing situations in the real working environment. To acquire stable, crossover competencies, it is necessary to create a training environment that offers the flexibility and adaptation needed to train true-to-life working processes in changing environments, as is much needed in medical settings. As medical specialist training involves complex learning, ARAs are of great potential.
Within healthcare, ARAs have been developed to train or educate medical professionals, as a navigation tool during surgical procedures [22, 23], to enhance visualization in the operating room and as a therapeutic tool in the treatment of patients [25, 26, 27].
The aim of this review is to identify the value of ARAs for training professionals in medicine. The first objective is to provide an overview of ARAs used in medical training. The second objective is to evaluate their validity in doing so systematically.
A systematic literature search was performed in search of reports using ARAs to train or educate medical professionals validly. For our search, we classified ARAs as systems that use digital content in combination with real-time user interaction, tied to a specific time and location, resulting in a computer-based enhancement of the real environment . A training tool was defined as an application aimed at improvement of performance or skills. A medical professional refers to an individual taking care of patients in an institutionalized setting, or in formal training to do so. Reports addressing VR without AR components were excluded from analysis.
Study selection and assessment of AR applications
PubMed, Embase, INSPEC and PsycINFO were searched for the key terms (medical OR surgery) AND (augmented reality) AND (educat* OR simulat* OR training). The latest search was conducted on August 28, 2015. All study types were considered eligible for inclusion. Reports that did not relate to a learning context for medical professionals were excluded from analysis, as were conference proceedings, reviews and studies investigating internal validity or technological aspects. All reports were screened on title and abstract according to the aforementioned criteria. Reports deemed ‘relevant,’ ‘dubious’ or ‘unknown’ were examined in full text. The reference lists of the reports assessed for eligibility were searched for other relevant reports. No reports were excluded because of language. The Internet was searched, and study authors were contacted directly when the data in a report were incomplete. The following data were extracted from all reports: name, system, purpose, target group and validity evidence.
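To illustrate how the Boolean search string combines its three term groups, the following minimal Python sketch applies the same logic to a free-text title or abstract. The names `TERM_GROUPS`, `term_to_regex` and `matches_search` are hypothetical and serve illustration only; the actual searches were run in the databases' own query engines, whose matching rules (field tags, index terms) differ from this plain-text approximation.

```python
import re

# Three term groups joined by AND; terms inside a group are joined by OR.
# A trailing * is a truncation wildcard (educat* matches education, educating, ...).
TERM_GROUPS = [
    ["medical", "surgery"],
    ["augmented reality"],
    ["educat*", "simulat*", "training"],
]


def term_to_regex(term: str) -> re.Pattern:
    """Turn one search term into a case-insensitive, word-anchored regex."""
    if term.endswith("*"):
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matches_search(text: str) -> bool:
    """True if every AND-group has at least one of its OR-terms in the text."""
    return all(
        any(term_to_regex(term).search(text) for term in group)
        for group in TERM_GROUPS
    )
```

For example, a title such as "Augmented reality simulation for surgery residents" satisfies all three groups, whereas "Virtual reality training in surgery" fails the second group and would not be retrieved.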
Review of studies
Matrix of validity types for augmented reality applications (ARAs) used to train or educate medical professionals
Each stage of validity is listed with its definition, the criteria for achievement and the appropriate method of examination.
1. Face validity
Definition: The degree of resemblance between an ARA and the educational construct as assessed by medical experts (referents) and novices (trainees)
Criteria for achievement: Uniform and positive evaluation of the resemblance between the ARA and the educational construct among novice and expert medical professionals
Appropriate method of examination: Questionnaire after use of the ARA
2. Content validity
Definition: The degree to which the ARA content adequately covers the dimensions of the medical content it aims to educate (or is associated with) (‘the truth, the whole truth and nothing but the truth’)
Criteria for achievement: Uniform and positive evaluation of the ARA content and associated testing parameters by a panel considered to be experts in the field
Appropriate method of examination: Questionnaire considering the content of the ARA
3. Construct validity
Definition: Inherent difference in outcome between experts and novices on outcome parameters relevant to the educational construct
Criteria for achievement: Outcome differences considered to be of statistical significance between subjects considered to be of different levels of skill
Appropriate method of examination: Comparative study measuring the relevant outcome parameters on the ARA for subjects with presumed different levels of expertise in the educational construct
4. Concurrent validity
Definition: Concordance of subject outcome parameters using the ARA compared to outcome parameters on an established instrument or method believed to measure the same educational construct (preferably the gold-standard training method)
Criteria for achievement: Study results show correlation considered to be significant between the ARA and the alternative, established training method
Appropriate method of examination: Comparative study comparing the outcome parameters of two different training methods in the same study participants
5. Predictive validity
Definition: The degree of concordance of ARA outcome parameters and subjects’ performance on the educational construct it aims to resemble in reality
Criteria for achievement: Metrics show correlation considered to be significant between relevant outcome parameters on the ARA and performance on the educational construct it aims to resemble in reality
Appropriate method of examination: Randomized controlled trial comparing performance on the educational construct in reality before/after training on the ARA with a control group using another training method
Data extraction on validity studies was in accordance with the Cochrane Handbook for Systematic Reviews of Interventions and concerned methodological aspects (study design, intention to treat, randomization, concealment of allocation, blinding, follow-up and other possible bias), details of the ARA, details on intervention, primary and secondary endpoints, instruments, timing, results of measurements performed and funding. Quality of the randomized controlled trials was systematically assessed using the Cochrane Collaboration’s tool for assessing risk of bias, estimating the level of risk as either high or low. The methodological index for non-randomized studies (MINORS) was used to assess the quality of observational studies. This instrument uses a 12-item scale, with a maximum score of 16 points for non-comparative studies and 24 points for comparative studies. The articles were rated according to a modified form of the Oxford Centre for Evidence-Based Medicine (CEBM) levels of evidence. The extracted data were used to assess the validation steps achieved in a validation process. Two reviewers extracted data independently, and in case of disagreement, a third reviewer was consulted.
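The MINORS maxima quoted above follow directly from the structure of the instrument. As a minimal sketch, assuming the scoring rules of the published MINORS instrument (each item scored 0, 1 or 2; the first 8 items apply to non-comparative studies and all 12 to comparative studies), the totals can be computed as follows. The function name `minors_score` is hypothetical, and no scores from the included studies are reproduced here.

```python
# MINORS: 12 items, each scored 0 (not reported), 1 (reported but inadequate)
# or 2 (reported and adequate). Non-comparative studies use the first 8 items
# (maximum 16); comparative studies use all 12 items (maximum 24).
N_ITEMS_NON_COMPARATIVE = 8
N_ITEMS_COMPARATIVE = 12


def minors_score(item_scores: list[int], comparative: bool) -> tuple[int, int]:
    """Return (total, maximum) for a single study."""
    expected = N_ITEMS_COMPARATIVE if comparative else N_ITEMS_NON_COMPARATIVE
    if len(item_scores) != expected:
        raise ValueError(f"expected {expected} item scores, got {len(item_scores)}")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each item must be scored 0, 1 or 2")
    return sum(item_scores), 2 * expected
```

Returning the maximum alongside the total makes it harder to compare a non-comparative score out of 16 with a comparative score out of 24 by mistake.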
Statistical pooling of data was not performed due to heterogeneity of study designs.
Category 1: augmented reality application designed to train laparoscopic tasks
The ProMIS augmented reality simulator
The ProMIS is a simulator for training laparoscopic procedures. It contains an instrument tracking system, which captures instrument motion, while realistic haptic feedback is provided. Time, path length and smoothness of movement can be recorded objectively and used as outcome parameters. For these metrics, there is an intrinsic performance measurement, providing detailed information and statistics regarding a specific task. The systematic search identified thirteen studies assessing the use of the ProMIS augmented reality simulator (Haptica, Ireland) for training laparoscopic tasks including navigation, object positioning, suturing, knot tying and sharp dissection.
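To make the motion-based outcome parameters concrete, the sketch below derives path length from a series of tracked instrument-tip positions. The smoothness measure shown is an assumption: the text does not define how ProMIS computes smoothness, so a simple proxy (mean absolute change in speed between sampling intervals) is used instead. The names `path_length` and `speed_variation` and the fixed sampling interval `dt` are hypothetical.

```python
import math

Point = tuple[float, float, float]


def path_length(positions: list[Point]) -> float:
    """Total distance travelled by the instrument tip (same unit as the input)."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def speed_variation(positions: list[Point], dt: float) -> float:
    """ASSUMED smoothness proxy, not the ProMIS metric: mean absolute change
    in speed between consecutive sampling intervals of length dt.
    Lower values indicate smoother motion."""
    speeds = [math.dist(a, b) / dt for a, b in zip(positions, positions[1:])]
    changes = [abs(v2 - v1) for v1, v2 in zip(speeds, speeds[1:])]
    return sum(changes) / len(changes) if changes else 0.0
```

A tip that moves at constant speed along a straight line yields a speed variation of zero, while hesitant or jerky movement raises the value.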
Botden and coworkers tested face validity of the ARA using a questionnaire among 55 experienced and intermediate surgeons or surgical residents regarding realism, haptics and didactic value, comparing suturing and knot-tying performances. There was a general consensus that ProMIS is very realistic, has good haptics and is a useful training tool, indicating face validity.
Ten studies were identified that provide evidence for construct validity of ProMIS [34, 35, 36, 37, 38]. Van Sickle et al. demonstrated the apparatus’ ability to significantly distinguish between ten novice and experienced laparoscopists based on all parameters for a laparoscopic suturing task (p < 0.001). Nugent et al. tested the performance of 80 surgeons, surgical residents and students on three basic laparoscopic modules. Experts outperformed postgraduate years (PGYs) 3 and 4, who in turn achieved better scores than PGYs 1 and 2, who did better than the premedical students (p < 0.001). These differences between experience levels were significant for all performance outcomes: time (p < 0.001), motion analysis (p < 0.001) and error score (p < 0.001), proving construct validity.
Overall, construct validity of ProMIS was established for the outcome parameters time [34, 35, 36, 37, 38], path length and smoothness of movement, comparing medical experts versus novices. Results concerning validity were based on performance outcomes regarding navigation, object positioning, suturing, knot tying and sharp dissection.
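Construct validity rests on showing that groups of different expertise differ significantly on an outcome parameter. This section does not state which statistical tests the included studies used, so the following is only an illustrative sketch of one possible approach: a two-sided permutation test on the difference in mean task time between two groups. The function name `permutation_p_value`, the default number of permutations and the seed are hypothetical choices.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled outcomes to two groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)
```

With small groups, the smallest attainable p value is limited by the number of distinct group assignments, which is one reason why studies with few participants struggle to demonstrate construct validity.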
Ritter et al. tested 60 experienced, intermediate and novice participants. They established concurrent validity for path length and smoothness on the peg transfer task by comparison with the well-established FLS score (p < 0.001). Botden and colleagues proved concurrent validity for the knot-tying task.
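Concurrent validity is defined in the validity matrix as a significant correlation between outcomes on the ARA and outcomes on an established method. As a minimal sketch of that idea, assuming paired scores per participant (the cited studies' own statistical methods are not described here), a Pearson correlation coefficient can be computed as follows. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between paired scores, e.g. an ARA metric and a
    reference score obtained from the same participants."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)
```

A coefficient near +1 or -1 indicates that the two instruments rank participants consistently; whether the correlation is significant additionally depends on the sample size.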
None of the reports considering ProMIS to train laparoscopic tasks investigated the instrument’s predictive validity.
AR laparoscopic simulator
Lahanas et al. developed a non-commercial AR laparoscopic simulator for training and assessment of surgical skills in minimally invasive surgery. The authors tested 20 experienced and novice surgeons. They provided evidence for face and construct validity in all performance metrics for the instrument navigation, peg transfer and clipping tasks, as the experienced group significantly outperformed the novices.
Category 2: augmented reality applications designed to train neurosurgical procedures
The Perk Station
The Perk Station [41, 42, 43] is a training platform for image-guided interventions. While training on a phantom, trainees perform tasks using AR image overlay. The Perk Station intrinsically measures total procedure time, time inside phantom, path length, potential tissue damage, out-of-plane deviation and in-plane deviation. The Perk Station has been used to train facet joint injections and lumbar puncture.
None of the authors reported assessment of a validation process.
Two other studies used the Perk Tutor to investigate its effectiveness in training facet joint injections. By means of a randomized controlled trial, the value of the Perk Station in the learning process of percutaneous facet joint injections was assessed. The success rate of facet joint injections in the Perk Tutor trained group was significantly higher than in the control group (p = 0.031), while potential tissue damage was significantly lower. Time, time inside phantom, path inside phantom, out-of-plane deviation and in-plane deviation revealed no significant differences between the two groups.
Another study assessed twenty-four neurosurgical residents, randomly assigned to perform lumbar punctures with or without the Perk Station. Participants in the Perk Station group outperformed the control group by operating within a shorter distance (p = 0.02), with a shorter needle insertion time (p = 0.05) and with less tissue damage (p = 0.01).
The immersive touch augmented virtual reality system
The immersive touch augmented virtual reality system (IT) contains an electromagnetic head-tracking system in combination with a half-silvered mirror [44, 45, 46]. The outcome parameters studied are performance accuracy and failure rate. The device is described as a learning tool for training thoracic pedicle screw placement, clipping aneurysms and trigeminal rhizotomy.
Luciano et al. used this system to train thoracic pedicle screw placement. The objective was to assess learning retention. Validity testing was not mentioned. The error rate was consistent with clinical results reported in the literature.
Seventeen neurosurgery residents used the IT to clip aneurysms. It was perceived as a useful educational tool by 64 % of the participants, while 71 % thought the simulator would help define which approach should be used in order to access the aneurysm safely, indicating face validity.
During a percutaneous trigeminal rhizotomy simulator session, seventy-one residents were divided into two groups based on experience. Increasing level of experience was significantly associated with a decreased distance from the ideal entry point (p = 0.001), a shorter distance from the target (p = 0.05) and a higher final score (p = 0.05), but not with the number of fluoroscopy shots (p = 0.52). These findings are indicative of construct validity.
A mixed reality ventriculostomy simulator
A third simulator, a novel mixed reality ventriculostomy simulator, was described by Hooten et al. This simulator can be used as a training tool for the ventriculostomy procedure. In their study, 260 residents were divided into four groups based on experience. Use of the simulator was perceived as beneficial in training residents because of its realism, and the general opinion was that the simulator would increase patient safety; both findings are indicative of face validity. Senior and junior residents outperformed interns (p = 0.003). However, senior residents did not significantly outperform junior residents, making the achievement of construct validity questionable.
Category 3: augmented reality applications used to train echocardiography
The CAE VIMEDIX™ ultrasound simulator
The CAE VIMEDIX™ ultrasound simulator uses a transducer, which provides positional and orientation data to reconstruct images in relation to a mannequin. The simulator has been used to train transthoracic echocardiography (TTE) and transesophageal echocardiography (TOE).
The majority of the attendees claimed that the simulator was highly realistic (90 % agreed or strongly agreed for the TOE simulator and 87 % for the TTE simulator), proving face validity. These results were based on a questionnaire obtained from cardiology registrants and sonography students. Other forms of validity were not reported, nor was an intrinsic experiment assessing specific performance skills.
The EchoCom consists of a mannequin attached to a 3D tracking system and is used to train the identification of congenital heart diseases based on sonographic information. Weidenbach et al. tested 43 experts, intermediates and beginners. Face validity was proven as participants judged the simulator as realistic and useful. Evidence of content validity was achieved as experts evaluated the content of the simulator positively. Experts had a performance grade of 0.98, and intermediates and beginners had a mean value of 0.69 and 0.44, respectively. As all groups differed significantly in their diagnostic performance, construct validity was achieved.
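The validation evidence reported per application can be tallied against the five stages of the validity matrix. The sketch below encodes the stages as read from the results above; the data structure and the names `STAGES`, `REPORTED`, `missing_stages` and `fully_validated` are hypothetical. Construct validity for the ventriculostomy simulator is recorded as not achieved because it is described as questionable, and the Perk Station is recorded without stages because no validation process was reported.

```python
# Validation stages in the order of the validity matrix.
STAGES = ["face", "content", "construct", "concurrent", "predictive"]

# Stages with reported evidence per application, as read from the results text.
REPORTED = {
    "ProMIS": {"face", "construct", "concurrent"},
    "AR laparoscopic simulator": {"face", "construct"},
    "Perk Station": set(),
    "Immersive touch system": {"face", "construct"},
    "Mixed reality ventriculostomy simulator": {"face"},
    "CAE VIMEDIX": {"face"},
    "EchoCom": {"face", "content", "construct"},
}


def missing_stages(application: str) -> list[str]:
    """Stages for which no validity evidence was reported, in matrix order."""
    return [stage for stage in STAGES if stage not in REPORTED[application]]


def fully_validated() -> list[str]:
    """Applications with evidence for all five stages."""
    return [app for app in REPORTED if not missing_stages(app)]
```

Under this encoding, `fully_validated()` returns an empty list and every application lacks predictive validity, which matches the finding that none of the ARAs completed a full validation process.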
Augmented reality applications (ARAs) are innovations that invite exploration yet still await scrutiny in medical education. The systematic literature review retrieved seven AR applications that have been developed in the field of medical professional training. AR allows trainees to understand spatial relationships and concepts, and it provides substantial, contextual and situated learning experiences. Several of these ARAs can be viewed as a valid and reliable method for training. Moreover, AR helps to create authentic simulated experiences. It is thought to make training more attractive to trainees, enhancing learning retention and performance. This is the first study to scrutinize the value of ARAs as a potential addition to the toolbox of medical professional education.
In modern times, the use of digital strategies to teach healthcare professionals has led to a major paradigm shift now reflected in many educational curricula [20, 50]. Computerized simulation models, mannequins and virtual reality simulators are used in medical professional training for partial-task rehearsal, full procedure rehearsal and team training. Studies that assessed the effect of simulation have shown a marked increase in self-reported confidence and comfort, technical skills and knowledge [51, 52, 53]. Furthermore, the transfer of skills to reality has been reported.
One of the limitations of VR simulation is that it has to render a full representation of the construct, which often leads to compromises because of costs and technical difficulties. This may lead to rejection by (a part of) the trainees and educators. VR simulation in laparoscopic surgery has therefore only been applied in partial-task trainers.
Augmented reality differs from virtual reality in its ability to combine a physical simulation (such as laparoscopy equipment or mannequins) with a virtual reality overlay simulation, creating a truly immersive experience. Rare or complex situations, such as anatomical variations or emergencies, may be trained more optimally and realistically. This gives the opportunity for simulation training to transcend from partial-task training (such as laparoscopic dexterity exercises) to realistic full-task trainers that cover both interaction and complex spatial orientation (such as neurosurgery or echocardiography).
According to Gartner’s most recent estimates, AR is expected to have a significant impact on society within 5 to 10 years. Therefore, AR needs to be considered seriously in the medical educational field. New commercially available technology such as Microsoft HoloLens, Oculus Rift and Google Cardboard, among others, is expected to propel new initiatives in medical training and education [56, 57, 58, 59]. Medical educators should seek potential uses, whilst remaining critical of the limitations. Only then will ARAs be a useful addition to medical training.
Our systematic search identified seven ARAs in the literature to date, designed to train medical professionals and professionals in training in institutionalized settings. Owing to the omission or improper use of relevant keywords in reports, it is possible that relevant articles fell outside the range of this study’s search. Although additional articles deemed relevant were found through cross-referencing, this might have resulted in an incomplete overview of all ARAs described in the literature.
The importance of validating new tools within the field of medical education is noted and illustrated by the fact that within all categories, validity steps have indeed been undertaken, especially since 2011. However, no follow-up studies on retention of skills could be identified, nor could subsequent clinical improvement of trainees be retrieved from studies. As no full validation strategies were outlined, it is unclear whether innovations assessed are of true value in training healthcare professionals. To date, it is unclear if the use of ARAs in training medical professionals is likely to contribute to patient safety. However, as training methods become more engaging and reliable, learning curves may be expected to become steeper and patients will ultimately benefit.
The main focus of surgical curricula has been on the acquisition of technical skills. However, to date, no surgical training methods have been developed to train residents how to avoid making errors during surgery. Training situational awareness should be essential, as errors result from misperceptions and the use of suboptimal problem-solving strategies. Modern operating theaters contain a rapidly growing amount of new technology. This increases the number of incoming signals and thus the mental load while performing surgery. AR allows the transfer of digital information into the real world, thereby blending the two worlds together. In turn, this creates opportunities to filter input from the environment because additional information is within the surgeon’s field of vision. The use of AR is therefore preeminently suited for training curricula aiming at situational awareness. It is known that training situational awareness in high-risk environments such as the operating room is much needed, but lacking in medical educational curricula. The benefit of AR could be widespread, from training better surgeons to making fewer errors in the operating room, ultimately leading to improvement of patient safety.
AR is a new technology in educational methodology. It has survived the initial phase and has shown enormous potential within the medical field. Without doubt, healthcare will be profoundly affected by developments in AR. As with any innovation, however, it is important to assess its true value and place if results are to be generated and curricula sustained. Several applications have shown the potential of ARAs to bridge the gap between the actual competence needed in the real working environment and training in a virtual context. In order to implement existing and new ARAs in a training curriculum for medical specialists validly and reliably, uniform assessment strategies and a complete validation trajectory are much needed. Only then will augmented reality training in medicine become a winner in the digital revolution.
Compliance with ethical standards
Authors E.Z. Barsom, M. Graafland and M.P. Schijven have no conflicts of interest or financial ties to disclose.
- 2. Bharathan R, Vali S, Setchell T, Miskry T, Darzi A, Aggarwal R (2013) Psychomotor skills and cognitive load training on a virtual reality laparoscopic simulator for tubal surgery is effective. Eur J Obstet Gynecol Reprod Biol 7(2):310–327
- 18. Wang X, Dunston PS (2007) Design, Strategies, and Issues Towards an Augmented Reality-based Construction Training Platform. ITcon 12:363–380
- 27. Mousavi HH, Khademi M, Dodakian L, Cramer SC, Lopes CV (2013) A Spatial Augmented Reality rehab system for post-stroke hand rehabilitation. Stud Health Technol Inform 184:279–285
- 31. Higgins JPT, Green S (ed) (2011) Cochrane Handbook for Systematic Reviews of Interventions, version 5.1.0 [updated March 2011]. The Cochrane Collaboration http://www.cochrane-handbook.org/. Accessed 26 Jan 2016
- 44. Luciano CJ, Banerjee PP, Bellotte B, Oh GM, Lemole M Jr, Charbel FT et al (2011) Learning retention of thoracic pedicle screw placement using a high-resolution augmented reality simulator with haptic feedback. Neurosurgery 69 Suppl Operative(1):ons14–ons19
- 55. Gartner http://www.gartner.com/newsroom/id/3114217. Accessed 26 Jan 2016
- 56. Engadget http://www.engadget.com/2015/07/08/microsoft-hololens-medical-student-demo/. Accessed 26 Jan 2016
- 57. Forbes http://www.forbes.com/forbes/welcome/. Accessed 26 Jan 2016
- 58. http://www.imedicalapps.com/2015/06/google-cardboard-apps-youtube-360o-impact-medicine/. Accessed 26 Jan 2016
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.