1 Introduction

Virtual reality (VR) technology may be defined as ‘a three-dimensional computer-generated simulated environment, which attempts to replicate real world or imaginary environments and interactions, thereby supporting work, education, recreation, and health’ [1]. The technology used to deliver this experience may vary in terms of immersion, fidelity, and interactivity and, by a more modern definition, is characterised by the use of a head-mounted display (HMD) that occludes the user’s view of the physical world so that the user only perceives the manufactured digital environment [2].

VR technology’s ability to display standardised, reproducible scenarios and environments has made it attractive for education, training, and recreation, not only in healthcare but in many facets of industry [3,4,5,6]. Recent consumer-orientated devices such as the Oculus Rift [7] virtual reality head-mounted display have made highly immersive simulation accessible for educators [8], and there is a wide variety of systems and procedures designed for their implementation in different areas [8,9,10].

In medical education specifically, VR uses include practical/procedural skills training, acute/emergency medicine simulation training, as well as other case-based learning activities [3, 9,10,11]. There is also an increasing role for its use as an assessment tool [12, 13].

Given the recorded expansion of VR technology in medical education [6], it is important to consider what barriers its users are faced with in order to make strategic decisions about its use and identify areas for development of the technology.

Although there are existing review publications describing the general and system barriers to VR implementation—such as cost, and stakeholder training for successful implementation [14, 15]—this paper aims to experimentally explore barriers faced directly by users, and the impact on the usefulness of VR.

Supporting the growth in VR technology use in medical education, several high-profile systematic reviews have explored its positive effects on learning. Kyaw et al. explored VR effectiveness for Health Professions Education [16] with a meta-analysis which concluded VR improves post-intervention knowledge and skills outcomes compared to a control group. Their findings are also supported by other published meta-analyses for specific areas such as improved knowledge acquisition in anatomy teaching [17] and surgical psychomotor skills [18], as well as improved learning performance in the wider medical [19] and non-medical fields [20].

However, these studies also indicate that VR is not without its challenges and point to the need for further research into the barriers to learning using VR. Jensen and Konradsen in their systematic review summarised “While the studies found that learners are generally very positive about using HMDs there are still substantial barriers to their use, especially in regard to cybersickness symptoms, lack of appropriate software, and technical limitations of peripheral devices.” [18]. This is compounded by the widely varying results of learner experience [18], attributed to differences in content delivered, learner background, and technology used.

Within our department, similar concerns have been anecdotally expressed regarding the possible impact of cybersickness symptoms on learning gain, as well as whether VR offers students usefulness beyond that of an exciting novelty, and this needed to be empirically addressed before any further adoption could be considered. We therefore set out to address this specific gap in knowledge by investigating the undergraduate medical student experience when using a commonly available consumer-grade VR headset and a simulation learning package used by over 150 institutions for VR simulation [21]. This gap is underpinned by the theoretical understanding that barriers such as cybersickness impact on the constructivist learning process experienced by students, and therefore jeopardise the technology’s utility. Using an objectivist deductive method, whereby we gather and analyse data to provide empirical evidence and insight towards a preformulated hypothesis, we aim to inform evaluation of its continued use, as well as expansion in similar settings.

2 Method

2.1 Hypothesis

Specific barriers to a positive student experience exist that will negatively impact on the effectiveness and acceptability of VR simulation as a teaching tool.

2.2 Participants and the simulation training curriculum

The participants consisted of twenty final-year Medicine (Bachelor of Medicine, Bachelor of Surgery: MBBS) students, studying at Hull York Medical School (HYMS), undergoing either the Medicine or Surgery learning block, while based for study at the Hull University Teaching Hospitals NHS Trust.

The MBBS programme at HYMS is a five-year programme (six years if an optional intercalated degree is selected), with the first two years being primarily University based, covering the basic medical sciences and introducing clinical skills through a variety of problem-based learning, lecture, and clinical placement activities [22]. The subsequent years (excluding optional intercalation) are based across primary and secondary care, developing clinical, history-taking, examination, and problem-solving skills through both facilitated clinical teaching and self-directed learning [22].

Final-year medical students were chosen for this study as clinical simulation forms a more significant part of their curriculum compared to other student groups within HYMS, and findings from this group are also likely to be more generalisable to other institutions that are similarly more clinically weighted in the later years of training. The sample size was selected pragmatically: given a population of thirty-nine students currently undergoing their Medicine/Surgery blocks based in Hull, a sample of twenty students was determined, after discussion within the research department, to be adequately representative as well as feasibly manageable considering time constraints and student participation rates.

As part of their curriculum, the students are timetabled to attend two simulation sessions, making use of either virtual reality simulation or high-fidelity simulation using a SimMan 3G. The aims of these sessions are to learn and apply the ABCDE approach of assessing unwell patients [23], exercise clinical reasoning with differential diagnosis formulation, evaluate basic investigations, commence appropriate management plans, and communicate their findings with a senior clinician.

The sessions are facilitated by a HYMS Clinical Teaching Fellow, who is trained to guide the student through the experience. Active discussion is encouraged throughout the scenario, especially where there is uncertainty about the ABCDE approach or around differential diagnoses or management. A short five-minute debrief is often conducted to discuss the case, as well as any feedback for improvement. For logistics, the scenarios are often conducted in pairs, with one student leading the simulation while another observes, before alternating.

From September to November 2021, students scheduled for these sessions were contacted one week beforehand via email to invite them to participate in the study. Participation was voluntary, and those who did not wish to take part or did not respond still undertook their scheduled VR simulation.

2.3 Apparatus and setting

The study was conducted within a dedicated teaching room within the HYMS department of Castle Hill Hospital (Hull, UK).

In terms of apparatus, for the simulation we used an Oculus Rift S virtual reality head-mounted display. The headset provided 3D stereoscopic imagery to provide depth, with an encompassing 115° field of view, 80 Hz refresh rate, and in-built stereo speakers [7].

The headset was powered by an Alienware 15 R4 laptop, with Windows 10 Pro, Intel® Core™ i9-8950HK CPU @ 2.90 GHz, 16.0 GB 2600 MHz DDR4 RAM, Nvidia GeForce GTX 1080 8 GB GPU, 256 GB NVMe M2 SSD. The simulation video and audio output were also presented on a large LCD screen within the room so that the facilitator and any observers could watch the scenario in real-time.

The simulation scenarios were provided by the Oxford Medical Simulation (OMS) suite [21]. This package provides artificial intelligence driven virtual patients in a virtual hospital-based setting, and is used across the UK by numerous healthcare education providers [21]. It provides interaction with history taking, physical examination, investigations and radiology, basic interventions and administering medication, and phone referrals, while providing dynamic patient responses and observations. This interaction is enabled by a series of provided option menus, which the participant can select via the handset to initiate the relevant task.

The scenario pictured in Fig. 1 is a 42-year-old virtual patient who presents with abdominal pain. The participant takes a focussed history on the presenting complaint and relevant risk factors, and is prompted to verbalise their clinical reasoning process. Through the ABCDE approach their differential diagnosis is dynamically evaluated, while appropriate investigations and management plans are undertaken, again prompted by the facilitator as necessary to verbalise their understanding around this. In the debrief, performance feedback is given, along with discussion of relevant specifics such as the causes of pancreatitis.

Fig. 1

Left: Image of participant wearing the VR headset. Middle: Image of the large wall-mounted LCD screen displaying the simulated imagery viewed by the participant. Right: Image displaying the apparatus including the laptop, headset, and large LCD screen

2.4 Study design and procedure

Following recruitment, on attendance to the session participants were verbally informed of the study design by the study author, before being given a written information leaflet and consent form.

Following their consent, they were asked to complete the written pre-simulation questionnaire.

Once completed they proceeded to the simulation scenario. Two scenarios were used for the experiment, in accordance with the usual simulation curriculum, which also allowed for a second participant to observe a scenario, before undertaking their own scenario.

Each scenario was timed at fifteen minutes, as set by the OMS suite, and was facilitated by the study author, a HYMS Clinical Teaching Fellow, to ensure consistency.

Following completion of the scenario and standard case debrief, participants were asked to complete the written post-simulation questionnaire.

Each participant’s questionnaires were paired and numbered for anonymity in analysis; however, a separate physical record was kept, allowing participants to later withdraw if requested.

2.5 Data collection

A pre- and post-simulation questionnaire was utilised for this study, which allowed for a direct comparison of results before and after experiencing the VR simulation scenario. The questionnaires consisted of basic participant demographics such as age, gender, and interest/past use of VR/technology, as well as seven 0–10 Likert items and free-text fields.

The 11-point response scales addressed both expected and experienced interest in, and perceived usefulness of, VR, as well as several other domains which were considered potential barriers to VR effectiveness as a teaching tool. These barriers included claustrophobia/anxiety, nausea/vomiting, disorientation, headaches, and discomfort; a collection of symptoms often termed cybersickness, referring to the motion-sickness type effect of VR [24]. These specific adverse symptoms were chosen as they are often reported in the general literature [11, 25], and the questions were adapted from the Virtual Reality Sickness Questionnaire (VRSQ) [26] to suit pre- and post-simulation use. Participants were asked in the pre-simulation questionnaire to what extent they expected to be affected by these symptoms, and in the post-simulation questionnaire to what extent they were affected by them.

The questionnaire also included two Yes/No items asking whether participants were able to use the system well enough to focus on the scenario, and whether the scenario was pitched at the correct standard for their level of training. These questions were designed to ensure that confounders, such as inappropriate learning content impacting usefulness and interest scores, were controlled.

Free-text fields allowed participants to describe any other reservations/issues which may operate/operated as potential barriers that limit effectiveness.

2.6 Statistical analysis

The items of the scored questionnaires were analysed through Spearman’s rank correlation analysis, Mann–Whitney U testing, Chi-squared testing, and paired sample Wilcoxon signed-rank testing, considering a p-value of ≤ 0.05 (two-tailed) for statistical significance. These non-parametric tests were used given the ordinal nature of Likert item responses. The analysis was performed using IBM SPSS Statistics for Windows, Version 28.0 (released 2021; IBM Corp., Armonk, NY).
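As an illustration of the paired pre/post comparison, the following is a minimal sketch of a Wilcoxon signed-rank test in plain Python. It is not the SPSS procedure used in the study: the function names and the example scores are hypothetical, zero differences are dropped, tied absolute differences share average ranks, and the two-sided p-value is taken from the exact sign-permutation distribution rather than from a large-sample approximation, so values may differ slightly from those a statistics package reports.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Return (W_plus, W_minus, exact two-sided p) for paired ordinal scores."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop zero differences
    if not diffs:
        return 0.0, 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Under the null hypothesis each rank is equally likely to be + or -.
    # Ranks are doubled so that half-integer (tied) ranks become integers.
    doubled = [int(round(2 * r)) for r in ranks]
    counts = {0: 1}  # counts[s] = number of sign patterns with doubled W_plus == s
    for r in doubled:
        updated = {}
        for s, c in counts.items():
            updated[s] = updated.get(s, 0) + c          # rank r gets a minus sign
            updated[s + r] = updated.get(s + r, 0) + c  # rank r gets a plus sign
        counts = updated

    smaller = int(round(2 * min(w_plus, w_minus)))
    tail = sum(c for s, c in counts.items() if s <= smaller)
    p_value = min(1.0, 2 * tail / 2 ** len(doubled))
    return w_plus, w_minus, p_value


# Hypothetical 0-10 ratings for one symptom domain (not the study data).
expected_pre = [5, 4, 6, 3, 7, 2, 5, 4]
experienced_post = [1, 1, 2, 0, 3, 0, 2, 2]
print(wilcoxon_signed_rank(expected_pre, experienced_post))
```

The exact distribution is used here only because it is easy to verify by hand for small samples; with twenty participants either approach would be a reasonable choice.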

3 Results

3.1 Participant demographics and past-experience

Twenty final year HYMS medical students took part in the study, eight female and twelve male. Age ranged from 22 to 29 years old, with a mean age of 23. All participants completed the pre- and post-simulation questionnaires.

Fourteen (70%) expressed an interest in VR technology before the simulation, with eleven (55%) having used it in the past, with a median rating of the experience of 8/10 (0/10 = very poor, and 10/10 = excellent).

3.2 Evaluation of the scenario and usability

Twenty (100%) participants answered ‘Yes’ both when asked if they were able to use the system well enough to focus on the scenario, and when asked if they considered the scenario was pitched at the correct standard for their level of training.

3.3 Pre- and post-simulation cybersickness symptoms and discomfort ratings

Table 1 shows the descriptive statistics of the results of the Likert items regarding cybersickness symptoms, with associated significance values from a paired sample Wilcoxon signed-rank test, as well as Spearman rank correlation analysis and associated significance. The results provide tangible figures of experienced symptoms for our equipment and setting, with median ratings of 0/10 for claustrophobia, nausea, headaches, and discomfort, and a 1/10 rating for disorientation. They also demonstrate that across all five specific domains of symptoms/barriers evaluated, the anticipated ratings of symptoms were statistically significantly (p < 0.01) higher than what was actually experienced. Only two participants scored their post-simulation symptom ratings higher than their expected pre-simulation ratings, and this was in only one or two symptom domains.

Table 1 Summary statistics of the five domains of symptoms/barriers evaluated, as well as usefulness and interest in VR, paired in pre- and post-simulation rating, using the Wilcoxon Signed-Rank Test for significance, and Spearman’s rho for correlation

Spearman’s correlation results demonstrate a significant correlation between participants' pre- and post-simulation ratings of both claustrophobia and nausea symptoms, but not the other symptom domains. This suggests that participants who expect more severe symptoms in these domains do experience more severe symptoms compared to the rest of the cohort, even if these are significantly less severe than they predicted. Figure 2 further illustrates the distribution of ratings.

Fig. 2

Simple bar charts illustrating the distribution of ratings for each symptom domain; x-axis = Anticipated/Experienced Symptom Rating (0–10), y-axis = Participant Rating Count (0–20) (top row = pre-simulation; bottom row = post-simulation). Figure created in Microsoft Excel 365

3.4 Usefulness and interest

Participants were also questioned on their perceived usefulness of VR in this setting both pre- and post-simulation. We saw a modest but statistically significant (p < 0.01) improvement in usefulness ratings, from a median 7/10 (pre-simulation) to a median 8/10 (post-simulation), as well as in interest in VR technology, from a median 7/10 to 9/10 (p < 0.01). Figure 3 illustrates the distribution of rating scores.

Fig. 3

Simple bar charts illustrating the distribution of ratings for interest in VR, and perceived usefulness of VR; x-axis = Rating (0–10), y-axis = Participant Rating Count (0–20) (top row = pre-simulation; bottom row = post-simulation). Figure created in Microsoft Excel 365

To evaluate if the experience of any symptoms/barriers impacted the post-simulation perception of the usefulness of VR, Spearman’s rank coefficient was used to assess for correlation. Only claustrophobia had a significant negative correlation (-0.446; p = 0.049) with perceived usefulness, indicating that as scores for claustrophobia increased, reported usefulness decreased. All other symptoms were non-significant. We also did not see a significant correlation between interest in VR—either pre- or post-simulation—and perceived post-simulation usefulness of VR (0.017, p = 0.943; 0.320, p = 0.169 respectively).
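The correlation step described above can be sketched as follows. This is a minimal, assumption-laden illustration rather than the SPSS analysis used in the study: Spearman's rho is computed as the Pearson correlation of average ranks (which handles tied Likert ratings), the function names are hypothetical, the example scores are invented, and no p-value is calculated.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    first_position = {}
    tie_count = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
        tie_count[value] = tie_count.get(value, 0) + 1
    # A run of k ties starting at position f has mean rank f + (k - 1) / 2.
    return [first_position[v] + (tie_count[v] - 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return covariance / sqrt(var_x * var_y)


# Hypothetical post-simulation ratings (not the study data).
claustrophobia = [0, 0, 1, 0, 3, 2, 0, 5]
usefulness = [9, 8, 8, 10, 6, 7, 9, 5]
print(round(spearman_rho(claustrophobia, usefulness), 3))
```

A negative coefficient from such data would mirror the direction of the claustrophobia finding reported here, where higher symptom ratings accompany lower usefulness ratings.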

3.5 Age, gender, and past experience

We used Spearman’s rank coefficient to evaluate for any correlation between age and any of our question domains. Age was not significantly correlated with any reported symptoms; however, it did positively correlate with reported interest in VR pre-simulation (p = 0.047), but not post-simulation.

The only significant correlation with the rating of past VR experience was that with the pre-simulation expected extent of discomfort. This correlation was negative, suggesting that those with higher past experience scores gave lower expected discomfort ratings.

Differences in ratings between genders were assessed using the Mann–Whitney U test or the Chi-squared test, depending on the question format. Post-simulation (but not pre-simulation) interest in VR (11-point response scale) and pre-simulation interest in technology generally (Yes/No response format) were significantly different (p = 0.012 and p = 0.028, respectively), with males reporting higher interest.
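For readers unfamiliar with the between-group comparison, the following is a minimal sketch of how the Mann–Whitney U statistic can be computed for two independent groups of ordinal ratings. It is an illustration only: the function name and the example ratings are hypothetical, only the U statistics and a simple proportion are computed, and the p-values reported in this study came from SPSS, which this sketch does not reproduce.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by direct pairwise comparison; ties count one half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0   # group_a rating beats group_b rating
            elif a == b:
                u_a += 0.5   # tie shared between the two groups
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical post-simulation interest ratings (not the study data).
group_one = [9, 10, 8, 9, 10]
group_two = [7, 8, 6, 9]
u_one, u_two = mann_whitney_u(group_one, group_two)

# Proportion of cross-group pairs in which group_one rated higher (ties split).
proportion_higher = u_one / (len(group_one) * len(group_two))
print(u_one, u_two, proportion_higher)
```

The pairwise definition is quadratic in group size, which is of no practical concern for a cohort of twenty, and it avoids any separate tie-correction step when computing the statistic itself.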

3.6 Free text comments

Participants in the pre-simulation questionnaire were asked about any expected barriers or reservations they had about using the technology. Important concerns raised included possible difficulty interacting with the scenario through the interface provided, compared to real-life interaction, and that it may not feel as ‘real’ as physical simulation modalities.

The post-simulation questionnaire asked about experienced barriers not covered elsewhere in the questionnaire. Some of the comments were specific to our apparatus/setting: some participants complained about an audio echo caused by sound being presented through both the headset and the facilitator's LCD screen, which made it difficult to process what was being said. Other comments focussed on barriers related to the hardware, such as visual focusing issues when wearing glasses, and others on aspects of the software platform, such as difficulty navigating the option menus provided for interaction with the scenario. Regarding VR immersion, one participant commented they ‘might feel anxious if I [they] were alone in the room wearing the VR headset’. Fortunately, all of our sessions are accompanied by a facilitator, but this may raise concerns for other formats of self-directed learning.

4 Discussion

This section will discuss and confront the outcomes we have gathered in this study, as well as compare them where possible to other published works on the topic.

We have demonstrated that, in our setting and usage, students perceive VR simulation as a usable and appropriate teaching tool (100% participant consensus).

4.1 Pre- and post-simulation cybersickness symptoms and discomfort ratings

In this study, we have presented robust results on experienced cybersickness and discomfort symptoms for our setting and found that in all cases they were statistically significantly lower than the students expected. They were also remarkably low throughout the cohort, with a median rating of 0–1/10 in all domains; discomfort and disorientation saw the highest maximum reported ratings, at 6/10 for both domains. Comparing these rates to other published studies, Jensen and Konradsen in their systematic review of the use of VR for education [18] noted a wide range in the reported frequency of symptoms, with some very low, in line with our results [27, 28], and some remarkably high, even citing students unable to continue [29,30,31]. Many factors may influence this. In particular, the technology apparatus varies widely between these studies, which may provide some explanation, as lag (delay) in head-movement detection [11], interpupillary distance, field-of-view, and refresh rate can all impact cybersickness symptoms [32]. Exposure duration is also implicated [32, 33]; our simulations have a fixed 15-minute limit, and longer exposure may have led to increased ratings of symptoms.

Interestingly, students' expected symptom ratings trended with their experienced symptom ratings for claustrophobia and nausea, although the experienced ratings were statistically significantly lower. This may represent students’ own assessment of their predisposition to these symptoms based on past transferable experiences, which may be useful in estimating symptom severity.

4.2 Usefulness and interest

Only claustrophobia had any effect on the students' rating of usefulness; all other domains did not impact their rating of the experience.

Revisiting our hypothesis, we can refute our hypothesis on all counts apart from claustrophobia, which while significant, still represented a very small reported finding.

Few studies investigate these symptoms individually, but again there are varying results regarding the impact of cybersickness on perceived usefulness by participants. Some studies suggest it has a high impact on learning outcomes [31], whereas others suggest little to no impact [4]. It may be worth noting this may be intrinsically tied to the extent of symptoms experienced in each study.

Both usefulness ratings and interest in VR increased after VR usage and again were remarkably high at 8/10 and 9/10 respectively. These results support those of a previous study in the department [34], and this enthusiasm from students is also shared by the general literature [4, 18, 35].

4.3 Age, gender, and past experience

Age and gender influenced only ratings of interest in VR and were not influential over cybersickness symptoms. In previous literature, some studies have suggested that females are likely to experience higher degrees of symptoms [36, 37], whereas others concluded there was no significant difference in incidence [32, 37,38,39]. We must consider whether the lack of difference is masked by a low overall incidence of symptoms. One study noted an increase in reported cybersickness symptoms with increasing age of the participant [39]; however, our cohort age range is likely too small to see meaningful differences.

Previous studies have noted significant differences in cybersickness between those with previous experience of VR and those naïve to the technology [30, 36, 39], and others have demonstrated that the frequency of symptoms decreases with repeated use [25]. We, however, did not see any significant differences in this category between ratings of symptoms. As we did not assess how extensive their experience with VR was, we do not know whether this is because those who had used it before had not done so extensively, leaving the disparity between groups minimal.

4.4 Free text comments

Free-text questioning also raised some important barriers to address, both specific to our setting and more general to VR technology, such as concerns around glasses wearers using the headset, interacting with a currently cumbersome menu system on the platform, and potential anxiety around unaccompanied use. The concern raised around users who wear glasses has been shared elsewhere in the literature [4] and raises a wider issue around equality of provision. In our study, all students were able to use the headset without hindrance, either over their glasses if they fitted, or without glasses if their eyesight allowed. Undoubtedly, however, this is not possible for all students and may pose a significant barrier to learning. If certain groups of students are unable to use the technology due to physical incompatibility, or predisposition to side effects as a result, this may exclude them from the learning experience altogether.

As for unintuitive menu interaction systems, although this was in specific reference to our elected platform, it should be considered in all VR applications, as it is an often-cited barrier to immersion [14, 18], and there is still much work to be done to make VR a seamless transition from reality. Finally, a free-text comment discussed anxiety around unaccompanied VR use, where they suggested they would feel vulnerable due to their reduced awareness of their surroundings. Sentiments on vulnerability while users are in the virtual space have also been shared elsewhere [18, 30] and these require careful consideration around protected spaces to ensure users feel safe while using the immersive technology.

For students who are unable to use the VR technology effectively, our equipment allows for interaction with the simulation on a more traditional screen-based interface. This undoubtedly detracts from the immersion of the experience [30]; however, it offers some consolation for learning for those students affected.

4.5 Limitations

This study focussed only on the use of one simulation software suite on one model of head-mounted display. The rationale for this is that it is representative of the curriculum that we, as well as many other healthcare education providers across the UK, deliver [21]. This, however, may limit the applicability of our findings, or the direct comparisons that can be drawn, to other less commonly used apparatus or research designs. The study participant group represented only a sample from a larger, diverse cohort of students, and while the sample size was pragmatically chosen, it remains small. This provides scope for further larger-scale studies, especially to characterise uncommon barriers not reported or experienced in this study that may exclude only a small number of students from the educational opportunity. The pre- and post-simulation questionnaires used were not piloted, and whilst no problems arose with their use, a pilot trial would have been good practice to explore any potential problems with the questions before the study period. The questionnaire items represent subjective reporting of symptoms and views, and are therefore less reliable than objective measurements, although in this case, as symptoms represent an experience, we felt subjective measurements were appropriate.

5 Conclusion

We conclude that virtual reality simulation using commonly utilised apparatus is effective and acceptable for undergraduate medical education. Rates of experienced cybersickness were found to be remarkably low (medians ranged from 0 to 1/10) and overall usefulness and interest ratings were encouragingly positive (median 8–9/10). The most significant barrier reported in our free-text questioning concerned students who wear glasses. Further work, however, needs to be done to ensure we provide an equitable teaching experience for all students.