Transfer of Automated Performance Feedback Models to Different Specimens in Virtual Reality Temporal Bone Surgery
Virtual reality has gained popularity as an effective training platform in many fields including surgery. However, it has been shown that the availability of a simulator alone is not sufficient to promote practice. Therefore, simulator-based surgical curricula need to be developed and integrated into existing surgical training programs. As practice variation is an important aspect of a surgical curriculum, surgical simulators should support practice on multiple specimens. Furthermore, to ensure that surgical skills are acquired, and to support self-guided learning, automated feedback on performance needs to be provided during practice. Automated feedback is typically provided by comparing real-time performance with expert models generated from pre-collected data. Since collecting data on multiple specimens for the purpose of developing feedback models is costly and time-consuming, methods of transferring feedback from one specimen to another should be investigated. In this paper, we discuss a simple method of feedback transfer between specimens in virtual reality temporal bone surgery and validate the accuracy and effectiveness of the transfer through a user study.
Keywords: Virtual reality surgical training, Automated performance feedback, Temporal bone surgery
Virtual reality (VR) is increasingly being used in surgical training as it offers a risk-free, interactive, repeatable, and easily accessible platform that can be utilised to develop standardised training programs. Despite an emerging body of evidence related to the effectiveness of VR in surgical training [1, 10, 15, 40], it is clear that the availability of a surgical simulator alone cannot promote best practice amongst surgical trainees. For example, a study in the United States observed that only 14% of surgical residents completed VR training when participation was voluntary. Thus, even when facilities for VR training exist, a lack of awareness and trainee motivation, together with limited access to simulators, inhibits their usage [22, 25]. To overcome these barriers, an appropriate VR-based curriculum should be developed and integrated into mandatory competency-based surgical training programs [31, 33].
Optimal skill acquisition during simulation-based training relies on the availability of performance feedback, task variety with a range of difficulty levels, and the opportunity for extensive deliberate practice [13, 21, 33]. The incorporation of the above considerations into a VR-based module of a surgical curriculum is likely to improve trainees’ readiness for the operating room. The availability of immediate performance feedback is a required component of deliberate practice. Its purpose is to reinforce strengths, address weaknesses, and foster improvements in the learner by providing insights into the consequences of their actions and by highlighting the differences between intended and actual results. While some simulators provide feedback by means of an expert supervising practice [6, 27], others have been developed with in-built real-time procedural feedback. For example, a dental simulator exists that compares the user’s tool position, tool orientation and force application to an expert data set, and displays its feedback on the screen. Similarly, Sewell et al. have developed a system that provides real-time feedback on bone visibility, drilling velocity and force. The University of Melbourne VR Temporal Bone Surgery Simulator provides step-by-step procedural feedback and technical verbal feedback on drill handling skills [7, 18, 19, 41, 42].
Another important aspect of a surgical curriculum is practice variation, which is essential to prepare trainees for anatomical variation between patients [13, 33]. In the context of VR simulation, practice variation refers to the availability of multiple specimens of varying difficulty levels. The availability of such practice variation has been shown to improve the surgical performance of otolaryngology residents on previously unseen temporal bone models. Various VR surgical simulators for laparoscopy [3, 27] and temporal bone drilling [5, 16, 23, 34] have been developed to offer a selection of cases with a range of difficulties.
To maximise skill acquisition and support self-directed learning, real-time feedback must be provided when practising on different specimens. However, performance feedback does not appear to be available across the full range of cases on existing surgical simulators, which limits their educational value. Moreover, there are at present no reported methods that transfer feedback models automatically between different cases as an alternative to the time-consuming and data-intensive process of developing feedback models individually for each case.
According to the concepts of transfer learning, feedback transfer can be defined as transferring the same task (providing feedback on performance) from a source domain to a target domain. The differences in domains can be characterised as the variations in anatomy. Although the feature space (metrics on which feedback should be provided) is the same, the values that these metrics take may differ according to anatomical variations between specimens. Therefore, the transfer of feedback from one specimen to another can be characterised as a domain adaptation problem. It is not practical to obtain labelled data for each new specimen to train a new model or to retrain an existing one. As such, unsupervised learning (such as instance weighting for covariate shift, self-labelling methods, changes in feature representation, and cluster-based learning) is commonly used in solving problems of this form.
In contrast to using unsupervised learning for domain adaptation, we investigate a simpler, direct transfer approach supported by a pre-processing task that makes the source and target domains similar. To this end, we define regions of a specimen where surgical skills can be considered to be consistent. By defining these regions, we account for anatomical variation between specimens. We assume that the source and target specimens are similar enough that changes between specimens in the values of the metrics (features) on which feedback is provided are negligible. This enables direct transfer of a feedback model of one region in the original specimen to the corresponding region in another specimen. Using this method, we transfer the neural network based model developed for providing technical feedback in VR temporal bone surgery in Ma et al. to new specimens. We show through a user study that the feedback provided by the transferred models is as accurate as that provided by the original model. We also show that practice on multiple specimens with transferred performance feedback results in positive acquisition of surgical skills.
2 VR Environment
3 Types of Performance Feedback
Surgical skills are multi-faceted. As such, surgeons provide performance feedback and guidance on different aspects of surgical skill during training. To emulate this, the simulation system considers four main aspects of skill that need to be acquired: procedural knowledge, knowledge of landmarks/boundaries of the operative field, manipulation of environmental variables, and drill handling/technical skills. The effectiveness of these types of feedback/guidance methods on one specimen has been established by Davaris et al.
Procedural guidance is provided using the step-by-step guidance method of Wijewickrema et al. The steps were obtained by manually segmenting an expert procedure. Each step of the surgery is highlighted sequentially on the temporal bone; the next step is only provided once the current step is completed.
Proximity warnings are provided as verbal advice when the drill nears an anatomical structure, to make trainees aware of the boundaries of the operative field. To this end, distance thresholds were defined per anatomical structure, the crossing of which generates a proximity warning. Further, to enable learning of the anatomical structures, the temporal bone can be made transparent so that the underlying structures can be viewed.
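The threshold check described above can be illustrated with a minimal sketch. It is not the simulator's implementation: the structure names, the threshold values, and the representation of each structure as a list of sampled surface points are all assumptions made for illustration.

```python
import math

# Hypothetical per-structure warning thresholds in millimetres (illustrative values only).
THRESHOLDS_MM = {"facial_nerve": 2.0, "sigmoid_sinus": 1.5, "dura": 1.0}


def proximity_warnings(drill_tip, structure_points, thresholds=THRESHOLDS_MM):
    """Return the names of structures whose nearest sampled point lies within its threshold."""
    warnings = []
    for name, points in structure_points.items():
        # Distance from the drill tip to the closest sampled point of this structure.
        nearest = min(math.dist(drill_tip, p) for p in points)
        if nearest <= thresholds[name]:
            warnings.append(name)
    return sorted(warnings)


# Usage: one sampled point per structure, drill tip 1 mm from the facial nerve.
structures = {"facial_nerve": [(0.0, 0.0, 0.0)], "dura": [(10.0, 0.0, 0.0)]}
near = proximity_warnings((1.0, 0.0, 0.0), structures)
far = proximity_warnings((5.0, 0.0, 0.0), structures)
```

Because the thresholds are keyed by structure rather than by specimen, the same table can in principle be reused on any specimen for which the structures have been segmented.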
Feedback on environmental settings such as magnification level and burr size is provided as verbal advice. The ideal values of these settings differ according to where the surgeon is drilling. For example, at the start of a cortical mastoidectomy, an overall view of the surgical space is required, and therefore, a lower magnification level is used. When drilling in tighter spaces, a higher magnification level is required. Advice on how to change these values is provided by comparing them against value ranges calculated from pre-collected expert data per surgical region. The region calculation process is discussed in the next section.
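As a rough sketch of the range comparison for environmental settings, the snippet below checks the current settings against a per-region table and returns advice strings. The region names, setting names, and numeric ranges are hypothetical placeholders; in the simulator the ranges are calculated from pre-collected expert data.

```python
# Hypothetical expert ranges per region: (low, high) for each setting.
EXPERT_RANGES = {
    "cortex": {"magnification": (1.0, 2.0), "burr_size_mm": (5.0, 7.0)},
    "facial_recess": {"magnification": (3.0, 5.0), "burr_size_mm": (1.0, 3.0)},
}


def settings_advice(region, settings, ranges=EXPERT_RANGES):
    """Compare current settings with the expert range for the region being drilled."""
    advice = []
    for name, value in settings.items():
        low, high = ranges[region][name]
        if value < low:
            advice.append(f"increase {name}")
        elif value > high:
            advice.append(f"decrease {name}")
        # Values inside the expert range produce no advice.
    return advice


# Usage: low magnification while drilling in a tight region.
tight = settings_advice("facial_recess", {"magnification": 1.5, "burr_size_mm": 2.0})
```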
For the offline training of the neural network classifier, a dataset of 16 surgeries recorded by 7 experts and 34 surgeries from 18 novices was used. The surgical performances were segmented into strokes, that is, continuous drilling motions without abrupt changes in direction. All strokes in expert and novice performances were considered to be expert and novice strokes respectively. The strokes were separated according to the region. Isolation forests were used to remove outliers. Characteristics (or metrics) of each stroke, such as length, duration, speed, and force, were then calculated to represent a stroke. These were used to train a neural network with one hidden layer per region. The number of hidden neurons for each region was chosen using cross validation.
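To make the stroke representation concrete, the following sketch computes the four metrics named above from a time-ordered list of drill samples. The `Sample` type, the use of mean force, and the definition of speed as path length over duration are assumptions; the original work may define these metrics differently, and outlier removal and classifier training are not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    t: float      # timestamp in seconds
    pos: tuple    # (x, y, z) drill tip position
    force: float  # applied force magnitude


def stroke_metrics(samples):
    """Summarise one stroke, given as a list of Samples ordered by time."""
    # Path length: sum of distances between consecutive positions.
    length = sum(math.dist(a.pos, b.pos) for a, b in zip(samples, samples[1:]))
    duration = samples[-1].t - samples[0].t
    speed = length / duration if duration > 0 else 0.0
    mean_force = sum(s.force for s in samples) / len(samples)
    return {"length": length, "duration": duration, "speed": speed, "force": mean_force}


# Usage: a three-sample stroke with segment lengths 5 and 12.
stroke = [
    Sample(0.0, (0.0, 0.0, 0.0), 1.0),
    Sample(0.5, (3.0, 4.0, 0.0), 2.0),
    Sample(1.0, (3.0, 4.0, 12.0), 3.0),
]
metrics = stroke_metrics(stroke)
```

A vector of such metrics per stroke, grouped by region, is the kind of input on which a per-region classifier could then be trained.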
In real time, strokes are segmented from the surgical trajectory, and the neural network classifier for the relevant region is used to identify whether each is an expert or novice stroke. In the case of a novice stroke, an adversarial example, a small modification of the metrics that changes the prediction of the model from novice to expert, is generated. The resulting change is recorded in a buffer as an increase or decrease of the metrics that were changed to generate the expert prediction. Once multiple instances of the same change are generated in a row, the change is presented to the user as verbal auditory feedback (for example, ‘decrease force’).
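The buffering step can be sketched as a small streak counter. This is an illustrative reading of the description above, not the published implementation: the number of repeats required, the choice to clear the streak on an expert stroke, and the reset after feedback is issued are all assumptions.

```python
class FeedbackBuffer:
    """Emit a suggestion only after the same change repeats `required` times in a row."""

    def __init__(self, required=3):
        self.required = required  # assumed streak length; not stated in the source
        self.last = None
        self.count = 0

    def push(self, change):
        """`change` is e.g. ("force", "decrease"), or None for an expert stroke."""
        if change is None:
            # Assumption: an expert stroke breaks the current streak.
            self.last, self.count = None, 0
            return None
        if change == self.last:
            self.count += 1
        else:
            self.last, self.count = change, 1
        if self.count >= self.required:
            # Reset so the same advice is not repeated on every following stroke.
            self.last, self.count = None, 0
            metric, direction = change
            return f"{direction} {metric}"
        return None


# Usage: a streak of "decrease force" interrupted once by a different change.
buf = FeedbackBuffer(required=3)
changes = [("force", "decrease")] * 2 + [("speed", "increase")] + [("force", "decrease")] * 3
spoken = [buf.push(c) for c in changes]
```

Requiring a streak before speaking trades responsiveness for stability: isolated misclassifications do not trigger advice, at the cost of a short delay before consistent errors are flagged.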
4 Transfer of Feedback Models
As a method of adapting the feedback models to specimens other than the one they were developed on, we explored a method of direct transfer. We assumed that surgical technique (and environmental settings) are similar in the same region on all specimens and that the specimens are similar enough that the values of the metrics (features) that the feedback is provided on remain the same. As such, once the regions are defined on a new specimen, feedback models developed on the original specimen can be transferred to be used on this new specimen without any changes to the models themselves. Note that this assumption is only valid for specimens with no abnormal or pathological anatomy, which is the case for the specimens considered here.
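The idea of direct transfer can be illustrated as follows: the per-region models stay fixed, and only the mapping from drill position to region changes per specimen. Everything in this sketch is hypothetical, including the region names, the toy threshold "models" standing in for the trained neural networks, and the depth-based region maps.

```python
from typing import Callable, Dict, Tuple

Stroke = Dict[str, float]                       # stroke metrics, e.g. length, force
Model = Callable[[Stroke], str]                 # returns "expert" or "novice"
RegionLookup = Callable[[Tuple[float, float, float]], str]


def make_feedback_fn(region_models: Dict[str, Model], region_of: RegionLookup):
    """Bind unchanged per-region models to one specimen's region map."""
    def classify(position, stroke):
        return region_models[region_of(position)](stroke)
    return classify


# Toy stand-ins for models trained on the original specimen.
source_models = {
    "cortex": lambda s: "expert" if s["length"] > 5.0 else "novice",
    "deep": lambda s: "expert" if s["force"] < 0.5 else "novice",
}

# Hypothetical region maps: each specimen places the region boundary differently.
original_regions = lambda p: "cortex" if p[2] < 10.0 else "deep"
new_regions = lambda p: "cortex" if p[2] < 12.0 else "deep"

classify_original = make_feedback_fn(source_models, original_regions)
classify_new = make_feedback_fn(source_models, new_regions)

# Usage: the same stroke at the same coordinates falls in different regions.
stroke = {"length": 8.0, "force": 0.9}
on_original = classify_original((0.0, 0.0, 11.0), stroke)
on_new = classify_new((0.0, 0.0, 11.0), stroke)
```

In this sketch all specimen-specific work is confined to the region map, which mirrors the assumption that technique within a region is comparable across specimens with normal anatomy.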
For the generation of proximity warnings on different specimens, we used the same distance thresholds that were defined for the original specimen. We manually segmented steps of an expert procedure for each specimen in order to provide procedural guidance.
5 Validation of the Feedback Transfer
5.1 Study Design
We conducted a user study of 14 medical students to evaluate the accuracy of feedback transfer and to test the effect of the transferred feedback on skill acquisition. The ratio of postgraduate (MD) to undergraduate (MBBS) students was 5:2 and the male to female ratio was 4:3. This study was approved by the Royal Victorian Eye and Ear Hospital Human Ethics Committee (#17/1312H). Written consent was obtained from all participants.
5.2 Accuracy of Transfer
To determine the accuracy of the provided technical feedback, the errors in the feedback were determined by an expert surgeon through the analysis of anonymised videos based on the following criteria:
False positives (FP): feedback was provided while stroke technique was acceptable.
Wrong content (WC): participants’ technique was accurately detected as poor, but the content of the feedback was inaccurate.
False negatives (FN): feedback was not provided while stroke technique was unacceptable.
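The three error criteria can be expressed as a simple decision rule over one reviewed stroke. The sketch below assumes each review is recorded as three booleans (whether feedback was given, whether the technique was acceptable, and whether the feedback content was correct); this encoding is an illustrative assumption and not the format used in the study.

```python
from collections import Counter


def error_category(feedback_given, technique_acceptable, content_correct=True):
    """Map one reviewed stroke to "FP", "WC", "FN", or None when no error occurred."""
    if feedback_given and technique_acceptable:
        return "FP"  # feedback although technique was acceptable
    if feedback_given and not content_correct:
        return "WC"  # poor technique detected, but the advice was wrong
    if not feedback_given and not technique_acceptable:
        return "FN"  # poor technique with no feedback
    return None


def tally(reviews):
    """Count error categories over a list of (given, acceptable, content_correct) tuples."""
    categories = (error_category(*r) for r in reviews)
    return Counter(c for c in categories if c is not None)


# Usage: five reviewed strokes, three of which contain an error.
reviews = [
    (True, True, True),     # FP
    (True, False, False),   # WC
    (False, False, True),   # FN
    (True, False, True),    # correct feedback
    (False, True, True),    # correctly silent
]
counts = tally(reviews)
```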
5.3 Effectiveness of Transfer
The results of this study demonstrate the accuracy of the feedback transfer, as no significant difference was observed between the accuracy of the feedback of the original and transferred models. Furthermore, participants showed significant improvement in surgical performance after training on specimens with transferred feedback models, demonstrating that the transferred feedback (along with other factors such as repeated practice) had a positive impact on skill acquisition. However, it has already been established that repeated practice (without feedback) is not sufficient to impart surgical skills in mastoidectomy in a novice cohort such as the participants in our study. Therefore, we can attribute the improvements in performance to the effectiveness of the feedback.
Successful feedback transfer (of the type outlined in this study) will allow VR simulators to meet the requirement of deliberate practice to have immediate and continuous feedback [9, 33]. The provision of instant, unsupervised performance feedback by VR simulators offers a time efficient alternative to the current dependency on continuous expert supervision. Thus, this VR curriculum may serve as a valuable adjunct to current surgical training. In addition, developing a library of virtual temporal bone models covering anatomical variants complete with automated feedback could provide a valuable training resource for rural trainees where exposure to varying cases is limited.
It would also be beneficial to apply feedback transfer to VR simulation in other types of surgery, including laparoscopic surgery and neurosurgery, or even endovascular procedures [8, 39]. However, a potential barrier to reapplying this direct feedback transfer technique is the need for comparable pre-processing of the simulation cases, namely the definition of anatomical regions that facilitate the transfer of feedback models.
A limitation of this work is that the developed method was for feedback transfer between specimens with normal anatomy. As surgical behaviour may not be the same when operating on abnormal or pathological specimens, this direct transfer method may not be as accurate for those. For example, for an abnormally large specimen, values of feedback metrics such as stroke length may not be directly transferable. In such cases, the region-based method could be used in conjunction with more complicated domain adaptation techniques and/or a limited amount of labelled data from the abnormal or pathological specimens to overcome this. This may also be used to improve the accuracy of transfer between normal specimens. This is a future avenue of research we will explore.
A further study limitation is that only three of the four types of performance guidance/feedback provided during training were automatically transferred. Procedural guidance was provided by segmenting an expert procedure performed on each specimen. In future work, this process will also be automated, albeit using different techniques from those used for transferring technical feedback. A simulation-based surgical training program that incorporates other concepts of curriculum design that were not considered here (such as practice distribution, task difficulty including pathological cases, and proficiency-based training) will also be developed and validated.
The generalisability of our results is limited by the small number of specimens, the cohort size, and the use of a single expert reviewer. Further studies will be conducted to account for this bias with a larger number of specimens and a larger cohort, including participants with intermediate-level surgical skills (surgical residents). Assessments by multiple experts will also be performed to reduce the subjectivity of assessment.
We introduced a method of transferring technical feedback models from the specimen they were developed on to other specimens and showed that the feedback provided by the transferred models was as accurate as that of the original model. We also showed that the transferred feedback assisted in positive skill acquisition. This enables the development of self-directed, simulation-based surgical curricula that can be used as adjuncts to traditional surgical training methods.
- 2. Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., Vaughan, J.W.: A theory of learning from different domains. Mach. Learn. 79(1-2), 151–175 (2010)
- 6. Crochet, P., et al.: Deliberate practice on a virtual reality laparoscopic simulator enhances the quality of surgical technical skills. Ann. Surg. (6), 1216 (2011)
- 7. Davaris, M., et al.: The importance of automated real-time performance feedback in virtual reality temporal bone surgery training. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds.) AIED 2019. LNCS (LNAI), vol. 11625, pp. 96–109. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-23204-7_9
- 8. Desender, L., et al.: Patient-specific rehearsal before EVAR: influence on technical and nontechnical operative performance. A randomized controlled trial. Ann. Surg. 264(5), 703–709 (2016)
- 11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
- 12. Hall, R., et al.: Towards haptic performance analysis using k-metrics. In: Haptic and Audio Interaction Design, pp. 50–59 (2008)
- 16. Linke, R., et al.: Assessment of skills using a virtual reality temporal bone surgery simulator. Acta Otorhinolaryngologica Italica 33(4), 273–281 (2013)
- 17. Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: ICDM, pp. 413–422 (2008)
- 18. Ma, X., et al.: Adversarial generation of real-time feedback with neural networks for simulation-based training. arXiv preprint arXiv:1703.01460 (2017)
- 19. Ma, X., Wijewickrema, S., Zhou, Y., Zhou, S., O’Leary, S., Bailey, J.: Providing effective real-time feedback in simulation-based surgical training. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10434, pp. 566–574. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66185-8_64
- 20. Margolis, A.: A literature review of domain adaptation with unlabeled data. Technical Report, pp. 1–42 (2011)
- 22. Milburn, J.A., Khera, G., Hornby, S.T., Malone, P.S., Fitzgerald, J.E.: Introduction, availability and role of simulation in surgical education and training: review of current evidence and recommendations from the association of surgeons in training. Int. J. Surg. (8), 393 (2012)
- 23. Morris, D., Sewell, C., Barbagli, F., Salisbury, K., Blevins, N., Girod, S.: Visuohaptic simulation of bone surgery for training and evaluation. IEEE Comput. Graph. Appl. (6), 48 (2006)
- 24. Nagendran, M., Gurusamy, K., Aggarwal, R., Loizidou, M., Davidson, B.: Virtual reality training for surgical trainees in laparoscopic surgery. Cochrane Database Syst. Rev. 27(8), CD006575 (2013)
- 25. Okuda, Y., et al.: The utility of simulation in medical education: what is the evidence? Mount Sinai J. Med. New York 76(4), 330–343 (2009)
- 26. O’Leary, S.J., et al.: Validation of a networked virtual reality simulation of temporal bone surgery. Laryngoscope 118(6), 1040–1046 (2008)
- 27. Palter, V., Grantcharov, T.: Individualized deliberate practice on a virtual reality simulator improves technical performance of surgical novices in the operating room: a randomized controlled trial. Ann. Surg. (3), 443 (2014)
- 35. Wijewickrema, S., et al.: Region-specific automated feedback in temporal bone surgery simulation. In: 2015 IEEE 28th International Symposium on Computer-Based Medical Systems (CBMS), pp. 310–315. IEEE (2015)
- 36. Wijewickrema, S., et al.: A temporal bone surgery simulator with real-time feedback for surgical training. Med. Meets Virtual Real. 21, NextMed/MMVR21 196, 462 (2014)
- 37. Wijewickrema, S., et al.: Providing automated real-time technical feedback for virtual reality based surgical training: is the simpler the better? In: Penstein Rosé, C., et al. (eds.) AIED 2018. LNCS (LNAI), vol. 10947, pp. 584–598. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93843-1_43
- 38. Wijewickrema, S., Zhou, Y., Bailey, J., Kennedy, G., O’Leary, S.: Provision of automated step-by-step procedural guidance in virtual reality surgery simulation. In: Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology, pp. 69–72. ACM (2016)
- 40. Zhao, Y.C., Kennedy, G., Yukawa, K., Pyman, B., Stephen, O.L.: Can virtual reality simulator be used as a training aid to improve cadaver temporal bone dissection? Results of a randomized blinded control trial. Laryngoscope (4), 831 (2011)
- 41. Zhou, Y., Bailey, J., Ioannou, I., Wijewickrema, S., Kennedy, G., O’Leary, S.: Constructive real time feedback for a temporal bone simulator. In: Mori, K., Sakuma, I., Sato, Y., Barillot, C., Navab, N. (eds.) MICCAI 2013. LNCS, vol. 8151, pp. 315–322. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40760-4_40
- 42. Zhou, Y., Bailey, J., Ioannou, I., Wijewickrema, S., O’Leary, S., Kennedy, G.: Pattern-based real-time feedback for a temporal bone simulator. In: Proceedings of the 19th ACM Symposium on Virtual Reality Software and Technology, pp. 7–16. ACM (2013)