Introduction

Minimally invasive surgery (MIS) is rapidly becoming the standard of treatment for many surgical pathologies. [1] However, the skills required to perform MIS differ significantly from those of open surgery. The surgeon has to cope with restricted movement and visual field, the fulcrum effect, altered hand-eye coordination, and ever-changing instruments and equipment. [2] Training surgeons to adapt to these challenges requires equally advanced tools that replicate them.

Historically, MIS training has adapted techniques from other fields of technology, most notably aviation training. [3] Virtual reality (VR) simulation has been the cornerstone of pilot training because it offers an immersive visual and physical replication of real-world scenarios. [4] This has been made possible by mock cockpits fitted with screens in place of windows and with actuators that move the enclosure, making the experience true to a real-life setting. [5] However, VR simulation in MIS training has not yet achieved the immersion that its aviation counterparts offer.

Current VR simulators for MIS training are equipped with a monitor, instrument handles, and foot pedals to perform procedure-specific tasks that replicate tissue-specific haptic feedback. [6] Several validation studies demonstrate the effective transfer of technical skills from the skills lab to the operating room (OR) with the use of procedural VR simulators. [7,8,9] However, a major deficiency of current procedural VR simulation is its distraction-free, and therefore non-immersive, environment. Simulators are set up in isolated skills labs or rooms that seldom replicate the busy and often chaotic OR environment. As Pluyter et al. state, “surgeons cannot operate in a bubble and thus should not be trained in one”. [10] It is vital that surgeons are trained in circumstances that replicate real OR environments. Training in environments that replicate distractions increases the mental load and stress level of the surgeons and helps surgical trainees adapt faster to the real OR environment. [11]

Distractions that occur during a surgical procedure have been identified and broadly classified into environmental, social, equipment, and organizational factors. [12] These range from procedural distractions, such as changing instruments and procedure-related conversation between teams, to social factors, such as music and conversations unrelated to the procedure. Currently available VR headsets have made it accessible and affordable to create immersive, true-to-life environments with distractions and a sense of presence. [13] This study explores the combination of VR simulators and VR headsets in a virtual operating room simulation setup (VORSS) for procedural simulation. We aim to analyze how surgeons and surgical trainees experience the VORSS and its potential added benefit to existing procedural VR simulation.

Materials and Methods

Participants

The aim was to include all surgeons and surgical residents from GSL Medical College, Rajahmundry, India in this study. All participants had prior experience either in clinical MIS or in using a laparoscopic VR simulator or box trainer, laparoscopic instruments, and equipment. They were divided into two groups based on their professional background: the novices consisted of the surgical residents and the experts consisted of the surgeons. This grouping was based on the demographic questions in the questionnaires completed by the participants.

A total of 28 participants enrolled in the study, of whom 15 were residents and 13 were surgeons. Throughout this article, we refer to the residents as “novices” and the surgeons as “experts”. Of the experts, four had completed > 200 cases, three 101–200, three 50–100, and three < 50 clinical procedures. Of the novices, 14 had previously performed fewer than 50 clinical procedures and one had performed none.

Virtual Operating Room Simulation Setup (VORSS)

The VORSS contains three essential components: a VR laparoscopic simulator (1), a VR headset (2), and a virtual OR environment (3).

The VR laparoscopic simulator (1): LapMentor III (Simbionix™, 3D Systems Corporation, USA) with MentorLearn software. The specific hardware includes a 24″ flat touch-screen monitor, a keyboard with trackball, two instrument handles offering tactile feedback, and a double footswitch for activating simulated electrosurgical coagulation.

The VR headset (2): 2016 Oculus Rift providing stereoscopic images (1080 × 1200 per eye, 110° field of view), integrated 3D audio, and six-degrees-of-freedom head tracking.

The virtual OR environment (3): a panoramic VR scene that recreates a real OR, including a full setup of instruments and equipment and, as a new feature, a surgical team and various distractions. The distractions cover some of the distracting events observed in a real OR [14] (Fig. 1a). The virtual OR can be seen simultaneously on the monitor and in the VR headset from the same point of view (Fig. 1b).

Fig. 1 (a) The replicated OR setup of the VORSS. (b) An external view of the setup of the VORSS

Task

First, the purpose of the study, namely to evaluate the use of the VORSS in procedural VR simulation training in a realistic OR context, was explained to the participants. Participants were then introduced to the VORSS and given time to familiarize themselves with the system. All participants gave informed consent before the start of the study.

After the participants put on the VR headset, the VR simulator was adjusted ergonomically according to their height. They then started the hands-on task “Complete Laparoscopic Cholecystectomy Procedure”, which was previously validated as a basic procedural module of the Laparoscopic Surgical Skills Grade 1 Level 1 course [15]. A predefined protocol required participants to interact with the VORSS for 15 min. Since the task was not aimed at assessing their performance, participants could stop whenever they felt they had seen enough to evaluate the VORSS.

After completing the task, the participants were asked to complete four questionnaires related to the VORSS experience. Finally, participants could offer general suggestions and comments on the realism of the VORSS.

Assessment Methods

The participants were asked to score questions regarding the immersion, usability, and reality of the VORSS experience. Since this is an efficacy study, power calculations were not performed a priori. While our sample size is small, one strength of our approach is that we present the results of multiple validated tools to assess each criterion. [16] The responses were analyzed using the Presence Questionnaire (PQ) [17], the Questionnaire for Intuitive Use (QUESI) [18], the NASA-Task Load Index (NASA-TLX) [19], and a heuristics questionnaire. To avoid variability in responses due to inconsistent administration, we followed the protocol described in the cited reference for each instrument.

The Presence Questionnaire was modified and previously validated (Cronbach's α = 0.878) to measure immersion at a sensory level [20, 21]. The PQ contained twenty-four items reflecting seven factors influencing self-reported immersion: Realism, Possibility to act, Quality of interface, Possibility to examine, and Self-evaluation of performance, along with haptic and sound factors. For this study, two items on haptics and one item on sound were added to suit the VORSS. The 7-point scale was extended to a finer 21-point gradient, in which 1 means not immersive at all and 21 completely immersive [22]. A score of 15 was taken as the baseline for a high level of immersion [18].
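For readers unfamiliar with the reliability coefficient quoted above, Cronbach's α for a questionnaire with k items can be computed from the variance of each item and the variance of the respondents' total scores. The following is a minimal Python sketch of that formula only; it is not the procedure used in the cited validation, and the function name and the toy responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: one row per respondent, each row a list of k item scores.
    Assumes k >= 2, at least two respondents, and non-constant total scores
    (otherwise the total variance is zero and alpha is undefined).
    """
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: three respondents, three perfectly consistent items,
# so alpha comes out at (approximately) its maximum value of 1.
toy_responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_responses)
```

Values closer to 1 indicate that the items vary together, which is the sense in which the reported 0.878 supports the internal consistency of the modified PQ.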

The Questionnaire for Intuitive Use (QUESI) measured the subjective satisfaction of interacting with the immersive VORSS [18]. The QUESI measures five aspects of satisfaction using a 5-point Likert scale in which 1 represents low usability and 5 represents high usability. The baselines for the subscales and the total score were set according to Hurtienne and Naumann [17].

The NASA-TLX assessed the mental workload or performance problems experienced when performing the task in the VORSS [19, 23]. The subscales measured six factors of mental effort from very low (1) to very high (21). A baseline value of 11 was assigned, representing a medium level of workload.

A questionnaire was developed based on the ease-of-use heuristics for medical devices. [24] Participants used the heuristics as a guideline to rate their experience at the system level on a 5-point scale, in which 1 means not realistic at all and 5 completely realistic. A score of 4 was taken as the baseline for reality, indicating that participants encountered only appearance problems when using the VORSS.

As the final step of the assessment, participants were interviewed with two questions: (1) How satisfied are you with the virtual OR experience? (2) Which factors were not compelling or not realistic in the virtual OR experience?

Statistical Analysis

The data were analyzed using SPSS v.25. The mean and standard deviation of each questionnaire were calculated for the whole sample, the novices, and the experts. The means were then compared with the baselines using a one-sample t-test (normally distributed data) or a Wilcoxon signed-rank test (non-normally distributed data). Differences between novices and experts were tested using an independent-samples t-test or, where the data were not normally distributed, non-parametric tests such as the Kruskal-Wallis test and the Mann-Whitney U test.
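The two kinds of comparison described above (sample mean against a fixed baseline, and novices against experts) can be illustrated with a short Python sketch that uses only the standard library. It is not the SPSS procedure used in the study: the function names and all scores are hypothetical, and the Mann-Whitney p-value uses the normal approximation without a tie correction, which is an assumption made to keep the example small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_t(scores, baseline):
    """Return (t statistic, degrees of freedom) for mean(scores) vs. baseline."""
    n = len(scores)
    t = (mean(scores) - baseline) / (stdev(scores) / sqrt(n))
    return t, n - 1


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation.

    Assumption: no tie correction is applied to the variance of U.
    """
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    # u is the smaller of the two U values, so z <= 0 and 2 * cdf(z) <= 1
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical PQ subscale scores compared with the immersion baseline of 15
t_stat, dof = one_sample_t([13, 14, 15, 16, 12], baseline=15)

# Hypothetical subscale scores for novices vs. experts
u_stat, p_value = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

The t statistic would still need to be referred to a t distribution with the returned degrees of freedom to obtain a p-value, which statistical packages do automatically.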

Results

Participants

A total of 28 participants enrolled in the study, of whom 15 were novices (surgical residents) and 13 were experts (surgeons). Of the experts, four had completed > 200 cases, three 101–200, three 50–100, and three < 50 clinical procedures. Of the novices, 14 had previously performed fewer than 50 clinical procedures and one had performed none. There were 8 male and 7 female novices and 9 male and 4 female experts, giving 17 males and 11 females in total. The groups were approximately comparable in terms of demographic characteristics.

Immersion: Presence Questionnaire

Table 1 presents the self-reported immersion results from the subscales of the Presence Questionnaire. In summary, four subscales - Realism, Possibility to act, Quality of interface, and Haptic - as well as the overall total showed a significantly lower level of immersion than the baseline (PQ subscales = 15, p < .05). Novices and experts reported similar immersion levels across the subscales and overall, and these were also all significantly different from the baseline. There were no significant differences between the opinions of the novices and the experts.

Table 1 Summary data for self-reported immersion from the subscales of the Presence Questionnaire, with (*) indicating a significant (p < .05) difference between the mean for the whole data set and the threshold. The Presence Questionnaire contains three descriptors indicating the level of immersion (1 = “Not at all”, 11 = “Somewhat”, 21 = “Completely”)

Usability: QUESI and NASA-TLX

The QUESI and NASA-TLX both reflected the usability of the VORSS at a cognitive level. The five subscales and the total score of the QUESI were calculated to determine whether the participants were satisfied when performing the task within the VORSS (Table 2). Neither the subscales nor the total score of the VORSS were significantly lower than the baselines (W = 2.94, G = 2.89, L = 3.00, F = 2.88, E = 3.04, total = 2.95). However, the scores for subjective mental workload and perceived achievement of goals were significantly lower for the novices than for the experts (p < .05).

Table 2 Summary data for the level of intuitive use of the VORSS, with (**) indicating a significant (p < .05) difference between the mean for novices and experts. The descriptors of the questionnaire show opposite attitudes on usability (1 = “Fully disagree” and 5 = “Fully agree”)

The six subscales of the NASA-TLX were calculated to detect the main sources of mental workload (Table 3). Mental demand was significantly higher than the baseline, while frustration and performance were significantly lower than the baseline (NASA-TLX subscales = 11, p < .05). In addition, the novices reported significantly higher mental demand than the experts (p < .05).

Table 3 Summary data for the self-reported mental workload after using the VORSS, with (*) indicating a significant (p < .05) difference between the mean for the whole data set and the threshold and (**) indicating a significant (p < .05) difference between the mean for novices and experts. The descriptors of the questionnaire show the level of mental workload (1 = “Very low”, 21 = “Very high”)

Reality: Heuristics Questionnaire

Fourteen heuristics were analyzed to judge the reality of the VORSS at the system level. Table 4 lists the criteria of the heuristics rather than the full guideline. All fourteen heuristics scored significantly lower than the baseline (heuristics = 4, p < .05). The experts showed significantly higher agreement than the novices on the Prevent errors and Reversible actions heuristics (p < .05).

Table 4 Summary data for the level of reality of the VORSS, with (*) indicating a significant (p < .05) difference between the mean for the whole data set and the threshold and (**) indicating a significant (p < .05) difference between the mean for novices and experts. The descriptors of the questionnaire show opposite attitudes on reality (1 = “Fully disagree” and 5 = “Fully agree”)

Semi-structured Interview

Comments solicited from the participants were broadly categorized as related to the virtual OR experience, the OR team, or personalization:

Virtual OR Experience

Participants were critical of a few aspects of the virtual OR (VOR) experience pertaining to the interaction between the VOR and the VR simulator. Some could not see their own legs and the foot pedals because the system did not allow it. Some comments concerned the procedural steps of the laparoscopic cholecystectomy depicted in the VR simulator, which participants perceived as different from their own way of practice. Overall, the participants were intrigued by the novelty of the system and were proactive in using and validating it.

OR Team

Several participants commented on the OR team and how it affected their perception of the system's realism. The team would normally be positioned differently from the placement depicted in the VORSS. The team spoke English rather than the local language. The interaction among team members was perceived as unrealistic and distracting. The voices in the background were unfamiliar and unrelated. The aggregate perception of the OR team reproduction was negative.

Personalization

Overall, the participants felt that the system could benefit from personalization to meet individual preferences and to replicate their workplace realistically.

Discussion

VR simulators have been successfully implemented in various MIS training curricula. They have been shown to contribute to the acquisition of clinical skills, which is mandatory for the safe performance of MIS. [25] The outcomes of multiple validation studies of VR simulators indicate that they reproduce clinical surgical procedures, operative techniques, and instrumentation to a level deemed adequate for training and certification. [26] This has proven to be of value in providing a constant, objective evaluation of task and procedural performance. However, current VR simulators and simulation settings lack the system realism and immersion that are present in other fields of simulation training, such as aviation, military training, and even entertainment.

The VORSS outlined and validated in this study builds upon the strengths of VR procedural simulation and adds an immersive experience of the operating room. The QUESI and NASA-TLX results reflect the usability of the VORSS at the cognitive level and indicate a good sense of immersion and satisfaction when performing the procedure within the VORSS. Mental workload was perceived significantly differently by experts and novices, indicating that performing the task itself was more demanding for the surgical residents (novices) than for the more experienced surgeons (experts).

The increased mental load created by the VOR environment, with its additional distractions and tasks and the introduction of the OR team, implies that trainees will be better prepared and will adapt to the work environment in the real OR more easily and quickly. This has been shown in prior research exploring the role of distractors and increased mental load during procedural VR training in a skills lab setting. [11] The outcome of this study has demonstrated that training in an environment mimicking the real workplace shows higher training efficiency and a shorter adaptation period to the real OR environment. The benefits of this approach have been demonstrated by immersive training programs for military personnel, emergency crews, and ICU personnel, which show shorter learning curves and shortened adaptation periods to real-world settings. [27, 28]

Regarding self-assessment, our prior studies found that self-assessment correlates well with expert assessment and VR simulator assessment. [29] However, it is interesting to note that both experts and novices over-assessed their performance in this study. While participants may tend to over-assess their performance in a new immersive training environment [11], it is crucial to develop objective criteria, alongside the existing VR simulation criteria, for accurate self-assessment in the VORSS setting. Implementing a self-assessment component within the VORSS could contribute substantially to the self-development and proficiency awareness of trainees.

The semi-structured interviews show that participants placed a strong emphasis on personalization. All users appreciated the immersive environment created by the VORSS. The lack of personalization pertaining to language, crew placement, crew interaction, instruments, and OR layout was considered to reduce realism. This indicates the need to improve the realism of the virtual environment, focusing on the above-mentioned aspects. One should also consider customizing the environment to specific conditions related to the region of the world, the country, or even the specific institution where training takes place. This approach could help optimize procedural VR simulation training, resulting in improved safety and quality of MIS. Furthermore, with the increased training demands of trainees and the constraints on trainers in India, there is an imminent need to address these challenges with effective tools that prepare a trainee for the operating room. [30] Future extensions of this work could include a study of the cost-effectiveness of this approach compared with mentor-mentee training, the use of a simulated OR experience in a skills lab setting, and a multi-national validation study to confirm the effects seen here.

Conclusion

The VORSS for procedural training has the potential to become a useful tool for providing immersive training in MIS. Further optimization of the VORSS, by improving realism and introducing distractors in the VOR, should improve the effectiveness of procedural training by shortening the learning curve and speeding up the adaptation of trainees to the real OR setting.