Being prepared for emergencies: a virtual environment experiment on the retention and maintenance of egress skills

The retention of safety-critical egress skills is an essential part of emergency preparedness on offshore petroleum platforms. Virtual environment (VE) training has been shown to be an effective method for teaching basic onboard familiarization and offshore emergency evacuation procedures. This technology has the potential to train crews before they are deployed offshore. This paper investigates the long-term retention and maintenance of emergency egress competence obtained using a virtual offshore platform. In particular, the research aimed to answer two questions: (1) what egress skills can be remembered after a period of 6 months? and (2) how effective is a VE-based retraining program at maintaining egress skills? A two-phased experiment was designed to first teach basic egress skills and subsequently assess skill retention after a 6- to 9-month period. The first phase of the experiment used a simulation-based mastery learning (SBML) pedagogical approach to teach naïve subjects the necessary spatial and procedural skills to evacuate safely. In the second phase of the experiment, the same 36 participants were tested after the retention interval on their ability to respond to a series of egress test scenarios. Participants who had trouble remembering the egress procedures were provided retraining on deficient skills. The results of the experiment indicate that emergency egress skills (both spatial and procedural knowledge) are susceptible to skill decay. This paper will highlight the skills that were most susceptible to skill fade after a period of 6 to 9 months and discuss the efficacy of the retraining participants received to return to competence.


Introduction
Offshore emergencies require the prompt response of prepared crews. Emergencies do not afford second chances. Thus, the retention of safety-critical emergency response skills is an essential part of an offshore emergency preparedness plan. Offshore emergency response teams rely on individuals to follow egress protocols to ensure that all personnel onboard have been accounted for in an emergency.
Workers typically acquire egress skills through conventional safety induction training on their first deployment offshore. This training involves watching safety videos followed by a supervised orientation period for their own safety. Crew are typically required to participate in induction training for a designated time period, but no formal assessment is performed to assure competence has been achieved. This form of training does not address individual learning requirements and allows learning outcomes to vary amongst inductees. To maintain competence while offshore, workers are required to perform weekly muster drills and quarterly evacuation drills. Due to safety constraints, the drills are not representative of real emergency conditions. This disconnect between drills and emergencies can result in negative training transfer (Wickens et al. 2013).
According to industry standards (e.g. CAPP, 2015), personnel who return to work on a platform after an extended period (e.g. 6 months or more) are required to undergo safety training again, regardless of their previous experience. This requirement is based on the understanding that egress skills deteriorate over time without practice. However, the mandated recurrency schedule is not informed by individual skill retention abilities. The lack of personalized training can result in a recurrency schedule that is too infrequent for some, causing training to be forgotten, or conversely, a recurrency cycle that is too frequent for others, which can undermine the training by causing worker complacency.
Virtual environment (VE) training can address weaknesses in conventional safety training and provide a way to practice emergency egress skills regularly, unconstrained by safety, logistical, or financial concerns. VEs are effective at teaching basic onboard familiarization and emergency evacuation procedures for offshore petroleum platforms (Smith and Veitch 2018). Specifically, VE training can provide assurance that all individuals have at least achieved the same minimum standard of competence. VE training also provides practice in situations that muster drills onboard cannot replicate, such as the high-stress, dynamic and hazardous conditions of emergency situations.
For safety-critical skills, training is only effective if the skills can be recalled and used in a real emergency. This raises two questions: (1) can egress skills acquired using VE training be remembered after a period of 6 months without any other form of training? and (2) can a VE-based retraining program help maintain egress skills by returning participants to competence? To address these questions, this paper presents a two-phased experiment that investigates the long-term retention of offshore egress skills attained using a virtual environment.
The first phase, skill acquisition, used the simulation-based mastery learning (SBML) pedagogical framework to deliver virtual offshore emergency egress training (Smith and Veitch 2018). Fifty-five novice participants took part in this phase. All participants who completed the SBML training achieved the targeted performance outcomes and demonstrated competence at the end of the program. Smith and Veitch (2018) compared the SBML approach to a benchmark training program called lecture-based teaching (LBT), which represented the existing safety protocols used offshore for egress training (Smith 2015). The comparison showed that SBML training was more effective at bringing all participants to competence and did so in less time than the LBT methods. This phase of the experiment established a benchmark of competent performance and the corresponding times required to achieve competence (i.e. for comparison with the retention phase measurements).
All fifty-five participants were invited to return for the second phase, skill retention. The second phase evaluated the retention of skills attained by the same participants who completed virtual offshore egress training in the first phase SBML experiments. After the retention interval of 6 to 9 months, thirty-six participants returned to complete the same test scenarios used in the SBML experiment. The participants' performance in the test scenarios at the end of the skill acquisition phase was compared to their first attempt performance in the same test scenarios at the beginning of the retention phase. This comparison assesses the retention of egress skills required to evacuate an offshore platform in an emergency. Participants in the retention study who failed to complete the test scenarios were retrained using exercises that focused on the particular skills they failed to demonstrate. The impact of retraining was measured to determine how well retraining improved participants' performance in subsequent test scenarios.
The goals of this research were to (1) determine if egress skills were retained for a period of 6 months without other training interventions, (2) identify the learning objectives that were more susceptible to skill degradation and (3) determine the efficacy of the retraining in bringing participants back to competence. All three aspects are discussed in this paper.

Overview of factors that influence skill retention
Many factors influence how well skills are remembered. Arthur et al. (1998) performed a meta-analysis of skill retention literature and described seven factors that influence skill decay and retention: (i) length of time lapsed of non-skill use during retention interval, (ii) the quality of the original skill acquisition and the amount of overlearning that occurred, (iii) skill type and task characteristics (e.g. physical versus cognitive tasks), (iv) the methods used to test learning and retention, (v) conditions of retrieval or specificity of training (i.e. the similarity of learning and testing contexts), (vi) the instructional strategies or methods used to teach the skills, and (vii) individual differences in acquiring and retaining skills. Sanli and Carnahan (2018) in their review of multi-day training courses in medical, military, marine and offshore safety fields discussed similar factors that influence skill retention. According to Sanli and Carnahan (2018), the factors that influence skill and knowledge retention in these safety-critical domains include the following: (a) type of skill (e.g. practical and declarative knowledge), (b) task complexity and difficulty (e.g. number of steps and order of tasks), (c) individual differences and the experience of the learner, (d) specificity of training (i.e. closeness of the learning and testing contexts), (e) the amount of practice and on the job exposure provided, and (f) the frequency that refresher interventions are delivered.
Three main topics will be discussed in the context of virtual offshore egress training: (1) the influence of instructional or pedagogical strategies on skill acquisition, (2) the impact of skill type on forgetting (such as spatial, declarative and procedural knowledge), and (3) the frequency of practice (e.g. how often recurrency training is provided and the length of time that passes between training sessions).

Strategies for improved skill acquisition
In emergency response domains, the amount of training provided is typically dictated by fixed timelines and does not take into consideration individual differences in learning. Training dictated by a fixed timeline limits the training material and/or opportunities to practice to the allocated time of the course. In the medical field, fixed training times have been found to cause performance outcomes to vary (Cook et al. 2013; Gallagher et al. 2005). To assure skills are properly acquired in the first place, training programs are shifting from time-based frameworks to competence-based models. Virtual environment training using the pedagogical frameworks developed by the medical education field can assist offshore operators in transitioning from a fixed time training model to competence-based training. For example, McGaghie et al. (2014) developed the simulation-based mastery learning (SBML) pedagogical framework for the medical education field. The SBML method accommodates different learning styles and paces by ensuring all individuals reach a minimum competence standard by the end of the training program. It achieves this using two main features: (1) opportunities to practice and receive formative corrective feedback until competence is demonstrated and (2) allowing trainees to advance to more complicated training material only once foundational skills are demonstrated.

Skill type: spatial and procedural skills for retention
Offshore egress training is provided with the expectation that egress skills remain current in the event of an emergency so that individuals are prepared to take action. The type of task influences how well skills are retained after a period of non-use. The safe evacuation of an offshore platform requires two types of skills: spatial knowledge of the platform to assist in wayfinding and procedural knowledge of the protocols in place to protect personnel from harm in an emergency. Thus, understanding the retention of spatial and procedural skills is important for providing adequate training for real emergency situations. This section will discuss the differences in acquiring and retaining spatial, declarative and procedural knowledge.

Spatial knowledge
Landmark-Route-Survey (LRS) is a spatial knowledge acquisition model (Siegel and White 1975) that explains how people develop their understanding of an environment. People first recognize landmarks, then learn the routes that connect landmarks and, over time, they develop survey knowledge of how the landmarks and routes are interconnected. Spatial understanding on all three levels (landmark, route and survey) can also develop concurrently (Taylor et al. 2008). However, survey knowledge often requires longer exposure to the environment to gain a map-like representation of the environment (e.g. learning how landmarks and routes are interconnected). Survey knowledge is important for evacuating an offshore platform because a well-known route may not always be available in an emergency, leaving personnel to find a less traversed tenable path to their muster stations. For example, researchers observed that in emergency situations, people tend to evacuate buildings by taking the known main exit instead of the nearest fire exit (Kobes et al. 2010). This risky behaviour can be addressed by providing people with more time to learn survey knowledge of an environment.

Declarative and procedural knowledge
Kim et al. (2013) integrated four learning theories (Fitts 1964; Anderson 1982; Rasmussen 1986; VanLehn 1996) into a three-staged skill acquisition process: (1) declarative stage: learning declarative knowledge (i.e. information or facts); (2) mixed stage: consolidating the acquired task knowledge to form a mix of declarative and procedural knowledge; and (3) procedural stage: tuning the knowledge towards predominantly procedural knowledge through overlearning. This model provides a framework to help explain how skills are learned and forgotten.
Declarative knowledge will degrade with the lack of use (e.g. information will no longer be available in memory for retrieval). Declarative knowledge can be transformed into procedural knowledge over time (e.g. gradually associating knowledge, transforming it into rules and developing heuristics and biases). Frequent practice and contextual experience allow experts to proceduralize skills so that they rely less on declarative knowledge and are able to perform the task automatically in response to a situation (Kim et al. 2013). Procedural knowledge is implicit: experts possessing the knowledge are able to perform the actions without effort but are sometimes unable to verbalize the knowledge (Wickens et al. 2013). Sui et al. (2016) suggest that trainees should be provided with sufficient practice to allow them to reach the proceduralization stage, thereby increasing the likelihood of skill retention.

Frequency of retraining
The amount of time that lapses between retraining sessions is an important factor to investigate in order to ensure safety-critical skills are maintained. Predicting the rate at which skills will be forgotten can help inform the frequency with which recurrency training should be provided (Wickens et al. 2013).
In their review of multi-day safety training courses, Sanli and Carnahan (2018) concluded that complex skills could be remembered for at most a 6-month period without any form of training intervention. Atesok et al. (2016) reviewed literature on the retention of simulation-based trained orthopaedic surgery skills and found that repetitive practice of skills learned in a simulator helped mitigate skill decay even after some time had lapsed (the time lapsed varied across these studies; e.g. follow-up retention assessments occurred at 1 month, 3 months and 6 months, up to a maximum of 30 months).
Knowing that egress skills (i.e. spatial, declarative and procedural knowledge) deteriorate over time without practice, the offshore industry standards require personnel to undergo recurrency training if they have been away from the platform for an extended period (e.g. 6 months or more). This raises two questions: how well are egress skills retained over a 6-month period? and how effective is a VE-based retraining program at maintaining egress skills? This paper presents results to answer these questions.

Methods
The experiment consisted of two phases: (1) a skill acquisition phase using the simulation-based mastery learning (SBML) approach and (2) a skill retention assessment and retraining phase, which took place after a period of 6 to 9 months. Figure 1 depicts phases I and II of the experiment. Both phases of the experiment consisted of a habituation stage followed by a series of modules with practice scenarios and testing scenarios (denoted in Fig. 1 as P1-P8 and T1-T4, respectively). In phase I, participants were tasked in each module with completing the practice scenario correctly (i.e. to criterion) before advancing to the test scenarios (Fig. 1a). In phase II, participants were re-tested on the same test scenarios and were re-trained if they made any errors in the test scenario. The retraining consisted of specific practice scenarios (Fig. 1b).
This section will briefly describe the effect size and power analysis, participants, the All-hands Virtual Emergency Response Trainer (AVERT) simulator, skill acquisition and test scenarios, and the retention assessment and retraining matrices. A detailed description of the methods used in phase I can be found in Smith and Veitch (2018). A description of the methods used in phase II can be found in Doody (2018).

Estimated effect size and power analysis
The effect of interest in this experiment was the change in performance score from the skill acquisition phase to the retention phase. The effect size was calculated based on an estimated drop in performance of 15% or greater due to skill fade and was informed by previous experiments (Smith 2015; Smith and Veitch 2018). Based on the estimated minimum amount of skill degradation to be detected, the effect size calculation resulted in an effect size d = 0.6. This is a medium-to-large effect based on Cohen's convention for a t test on the means of two dependent (or paired) samples (Cohen 1988).
Fig. 1 AVERT skill acquisition (phase I) and retention and retraining (phase II) (after Smith et al. 2018)
A priori power analysis was performed using G*Power3 (version 3.1.9.2) software (Faul et al. 2007) to determine the required sample size for the retention portion of the longitudinal experiment. For the repeated measures design, the following specifications were used: a matched pairs Wilcoxon signed rank test (the non-parametric equivalent of a two dependent samples t test), one-tailed (for the directional hypothesis that some egress skills will be lost), with input parameters: significance level α = 0.05, power level (1 − β) = 0.95 and effect size d = 0.60. For the Wilcoxon signed rank test, G*Power3 returned a sample size of N = 33 participants to achieve a power level of 0.95 with critical t = 1.696 and non-centrality parameter δ = 3.389. This result indicated that the retention portion of the longitudinal study required at least 33 participants to return and complete the test scenarios in order to maintain a statistical power of 0.95 (i.e. a 95% probability of detecting a true effect of this size, or equivalently a 5% type II error rate).
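As a rough cross-check of the sample size calculation, the sketch below uses a normal approximation for a one-tailed paired test and then inflates the result by the asymptotic relative efficiency (ARE) of the Wilcoxon signed rank test under normality, 3/π. This is not the algorithm G*Power3 uses; the function name and the choice of approximation are assumptions made for illustration, so the result should only be expected to land close to the reported N = 33.

```python
from math import ceil, pi
from statistics import NormalDist


def approx_n_wilcoxon_paired(d, alpha, power):
    """Approximate sample size for a one-tailed Wilcoxon signed rank test.

    Normal approximation for a paired t test, divided by the ARE of the
    Wilcoxon test relative to the t test under normality (3 / pi).
    Illustrative only; G*Power3 uses noncentral t distributions.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # one-tailed critical value
    z_beta = z.inv_cdf(power)        # quantile matching the target power
    n_paired_t = ((z_alpha + z_beta) / d) ** 2
    are = 3 / pi
    return ceil(n_paired_t / are)


# Inputs taken from the text: d = 0.60, alpha = 0.05, power = 0.95
n_required = approx_n_wilcoxon_paired(d=0.60, alpha=0.05, power=0.95)
print(n_required)
```

The approximation slightly underestimates the requirement because it ignores the heavier tails of the t distribution at small sample sizes.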

Participants
Memorial University's Interdisciplinary Committee on Ethics in Human Research approved the experimental protocol. Following the approved research protocol, the recruitment strategy focused on naïve participants (to control for spatial knowledge and experience), which translated into recruiting undergraduate and graduate students. Participants were recruited from the university's campus by email, posters and word of mouth. All volunteers who participated were naïve to the experimental design, had no prior experience working offshore and had no exposure to the simulator prior to the study. All volunteers provided their informed consent before participating in the experiment.
Sixty participants were recruited for the first phase of the study with an expectation of 25% attrition for the longitudinal portion of the study. Five participants withdrew at the outset, due to simulator sickness or difficulty with the controller. Fifty-five participants completed the skill acquisition training (phase I) and were invited to return after a period of 6 months to participate in the retention assessment (phase II). Seventeen participants opted out of the longitudinal study during the 6- to 9-month retention interval. The remaining 38 participants completed the retention phase. Two were identified as outliers (they completed the retention assessment at 4 and 10 months) and were removed from the retention analysis. Thirty-six participants completed the retention phase within the designated 6- to 9-month period. Twenty-seven participants were male, and nine participants were female. Participants ranged in age from 19 to 54 years (M = 29 years, SD = 8.8 years).
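The participant flow can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative, and the required sample size of 33 is the value reported in the power analysis.

```python
# Counts as stated in the text
recruited = 60
withdrew_at_outset = 5
opted_out = 17
excluded_as_outliers = 2
required_n = 33  # from the a priori power analysis

completed_phase_1 = recruited - withdrew_at_outset
returned_phase_2 = completed_phase_1 - opted_out
analysed = returned_phase_2 - excluded_as_outliers

# Attrition between phases, relative to those who completed phase I
attrition_rate = opted_out / completed_phase_1

print(completed_phase_1, returned_phase_2, analysed)
print(f"attrition: {attrition_rate:.0%}")
print("meets power requirement:", analysed >= required_n)
```

The attrition between phases works out to roughly 31%, somewhat above the 25% anticipated, but the analysed sample still exceeds the 33 participants required.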

AVERT simulator
Emergency egress training was provided using the AVERT simulator. The AVERT simulator is a desktop virtual environment that allows participants to interact with the virtual offshore platform using a gamepad controller (Xbox). The virtual environment depicts a realistic representation of an offshore Floating Production Storage and Offloading (FPSO) vessel, similar to those used in the Newfoundland offshore area. A generic virtual FPSO platform was chosen for its relevance to the local offshore industry.
Participants moved within the FPSO by controlling a first-person perspective avatar of an offshore worker. Participants were first provided with habituation scenarios for orientation with the simulator controls. The AVERT training provided a series of training scenarios with built-in guidance, multiple opportunities to practice and test scenarios with after-action feedback. Participants were tasked with learning their way around the accommodation block of the platform and the safety protocols for responding to emergency situations.

Skill acquisition and test scenarios
A previous virtual environment training experiment by Smith (2015) used lecture-based teaching (LBT) methods with the AVERT simulator and found that fixed instructional time was ineffective at ensuring participants acquired the necessary skills to respond to virtual emergency situations. Individual learning differences, such as style and pace, were believed to contribute to the failure of participants to reach competence using conventional LBT training. The simulation-based mastery learning (SBML) pedagogical framework (McGaghie et al. 2014) was adopted for the first phase of the longitudinal experiment to accommodate individual differences. The SBML framework was used to deliver offshore emergency egress training in the AVERT simulator.
The training curriculum and assessment criteria for the experiment were developed based on subject matter expert guidance and industry regulations (Transport ...). Table 1 provides a list of the learning objectives taught using AVERT. The US Coast Guard's method for developing mariner assessments was used to develop the assessment criteria, proficiency standard and performance scoring system (McCallum et al. 2000). Subject matter experts in offshore training were consulted in the development of the performance measures and test scenarios to assess trainee competency. The experts provided credible real-world emergency scenarios for the research team to model in AVERT so that trainees could demonstrate their understanding of the learning objectives. The test scenarios covered a range of activities, from basic muster drills that required the trainees to go to their muster station to a full emergency evacuation that required trainees to avoid hazards that blocked their paths and then to muster at their lifeboat stations. Hazard types and likely locations for the hazards to occur on the platform were based on the circumstances provided by the subject matter experts. Detailed public address (PA) announcements were recorded to describe important information about the emergency to the participants for each scenario. The scenarios were tested and refined prior to starting the experiment.
This research looked at the retention of spatial and procedural skills in the context of emergency egress. The spatial learning objectives for this experiment included familiarity with the platform layout and knowledge of the egress route options. The procedural skills were defined as a combination of declarative and procedural knowledge (e.g. remembering facts and formulating rules to follow). The training in the experiment aimed to teach personnel to comply with safety protocols. The procedural learning objectives included recognizing emergency alarms, assessing the emergency situation, avoiding hazards, following safety protocols and mustering procedures.
As depicted in Fig. 1a, participants were taught the learning objectives using four modules. Each module had training scenarios (depicted in Fig. 1a, as P1, P2, P3, P4, P5, P6, P7 and P8) to teach participants how to accomplish the egress tasks and subsequent test scenarios (depicted in Fig. 1a, as T1, T2, T3 and T4) to assess participants' competence. The modules gradually increased in difficulty, building on previously presented learning objectives. Module 1 taught the spatial layout of the platform (LO1), the different egress routes available from the trainee's cabin (LO3) and how to safely move within the platform by avoiding running and remembering to close fire and watertight doors (LO8 & LO9). Module 2 taught trainees how to respond to different alarm types (LO2) and the mustering procedures at the temporary safe refuge (TSR) on the platform (LO6 & LO7). Module 3 taught trainees how to assess the emergency situation and to listen to the PA announcement for information on the tenability of the egress routes (LO4). Module 4 taught hazard avoidance and what to do when an egress route was obstructed (LO5).
Participants were required to complete each training scenario correctly before moving on to the next scenario. Participants who made errors in a particular training scenario were required to repeat the scenario until competence was demonstrated. This protocol of training until competent is referred to as trials to criterion. After each training module, the participants' performance was assessed using a test scenario. Table 2 provides a detailed description of the four test scenarios.
The test scenarios required participants to use the knowledge learned in the module. Each module built upon the learning objectives taught in prior modules, and as a result, the corresponding test scenarios became more comprehensive. Following the same trials to criterion protocol as the training scenarios, participants who made errors in the test scenarios were required to repeat the scenarios until competence was demonstrated.
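The trials to criterion protocol described above can be sketched as a simple loop: a scenario is repeated until an attempt contains no errors, and the number of attempts is recorded. The function name, the `attempt` callback and the cap on attempts are hypothetical and are not part of AVERT; the source does not state that any cap was applied.

```python
def trials_to_criterion(attempt, max_trials=20):
    """Repeat a scenario until it is completed without errors.

    `attempt` is a hypothetical callable that takes the trial number and
    returns the list of learning objectives failed on that trial.
    Returns the number of trials needed to reach criterion.
    """
    for trial in range(1, max_trials + 1):
        errors = attempt(trial)
        if not errors:          # criterion: an error-free attempt
            return trial
    raise RuntimeError("criterion not reached within max_trials")


# Illustrative trainee who fails LO8 twice, then completes the scenario
def example_attempt(trial):
    return ["LO8"] if trial < 3 else []


trials_needed = trials_to_criterion(example_attempt)
print(trials_needed)
```

Recording the trial count per scenario gives a per-participant measure of the practice needed to reach competence, which can later be compared against the effort needed during retraining.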

Retention assessment and adaptive retraining matrices
After a retention interval of 6 to 9 months, participants were given the opportunity to demonstrate their retention of offshore emergency egress skills by performing the same four test scenarios that they had successfully mastered in the skill acquisition phase. Figure 1b depicts the retention assessment and retraining phase in AVERT. The figure shows the test scenarios for each of the modules. Participants who were successful at completing a test scenario advanced to the next test scenario. This process continued until all the test scenarios were completed. A failure to complete a test scenario resulted in participants being required to do corrective training exercises. After successfully completing the retraining exercises, these participants were required to reattempt and pass the test scenario before moving on to subsequent test scenarios.
A series of adaptive training matrices were used to assign the participants the corrective training scenarios to address the specific errors they made in the test scenarios. Each learning objective that participants failed had a corresponding corrective training scenario. Figures 2 and 3 provide examples of the retraining matrices used for test scenarios T2 and T4, respectively.
In this example of test scenario T2, a participant, who failed to identify the alarm (LO2) and forgot how to register at the temporary safe refuge area (LO7), was required to complete one corrective training scenario focused on teaching the different alarms and the corresponding muster station for each alarm (P4) and another corrective training scenario that reinforced the importance of registering at your designated muster station to ensure all personnel onboard are accounted for during an emergency (P5). Once all prescribed retraining exercises were completed correctly, the participant was required to reattempt and pass test scenario T2 before moving on to the test scenario in module 3.
As illustrated in Fig. 3 for the final test scenario, a participant who failed to select the safest route (LO3), forgot to close fire doors (LO9) and encountered a smoke hazard (LO5) was required to complete three retraining scenarios. One corrective training scenario focused on teaching the available egress routes from the cabin to the muster station (P3). Another corrective training scenario reinforced the importance of keeping fire and water-tight doors closed (P1). The last corrective training scenario highlighted the importance of being aware of the surroundings and of avoiding exposure to hazards during emergencies (P7). Due to individual differences in the errors made in the test scenarios, participants received specific training scenarios to meet their individual needs.

Table 2 Test scenarios
T1: This scenario assessed the participants' spatial knowledge of the platform. Participants were asked to meet their supervisor at their assigned lifeboat station by following their primary or secondary egress routes.
T2 (Muster drill): This scenario assessed the participants' understanding of alarms and muster procedures. Participants were tasked with responding to a muster drill (General Platform Alarm). During this alarm, all personnel were required to collect their safety equipment and muster at their primary muster station.
T3 (Blocked route): This scenario assessed the participants' ability to deal with obstructions to their planned egress route. Participants were required to respond to the alarm, listen to the announcement and follow the muster procedures. The PA announcements provided information to help the participants select the most effective route.
T4 (Emergency): This scenario assessed the participants' ability to avoid hazards and follow the safest available route to their lifeboat station. Participants were tasked with responding to an emergency involving a General Platform Alarm due to fire in the galley. The fire compromised the muster station with smoke and the situation escalated to a Prepare to Abandon Platform Alarm. Initially, all personnel were required to go to the muster station but were forced to re-route to the lifeboat station because of the compromised muster station.
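The adaptive retraining logic can be sketched as a lookup from failed learning objectives to corrective practice scenarios. The mapping below contains only the pairings named in the T2 and T4 examples in this section; the full matrices are given in Figs. 2 and 3 and are not reproduced here. The data structure and function name are assumptions made for illustration.

```python
# Partial retraining matrices: only the pairings named in the text.
# Keys are test scenarios; values map a failed learning objective to
# its corrective practice scenario.
RETRAINING_MATRIX = {
    "T2": {"LO2": "P4", "LO7": "P5"},
    "T4": {"LO3": "P3", "LO9": "P1", "LO5": "P7"},
}


def assign_retraining(test_scenario, failed_objectives):
    """Return the corrective scenarios for the objectives a participant failed."""
    matrix = RETRAINING_MATRIX[test_scenario]
    # A set removes duplicates; sorting gives a stable presentation order.
    return sorted({matrix[lo] for lo in failed_objectives})


t2_plan = assign_retraining("T2", ["LO2", "LO7"])
t4_plan = assign_retraining("T4", ["LO3", "LO9", "LO5"])
print(t2_plan, t4_plan)
```

A participant with no failed objectives receives an empty plan and advances directly to the next test scenario.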

Results and discussion
To measure the retention of basic offshore emergency egress skills acquired using a virtual environment, the participants' performance in the skill acquisition phase was used as a benchmark to compare with the performance achieved after the retention interval. Several metrics were used to investigate retention and impact of retraining: (1) the overall competence demonstrated after the retention period, (2) the performance of each learning objective after the retention period, (3) the overall time spent retraining and (4) the influence of the time lapsed on performance after the retention interval. This section presents the performance results as participants first encountered each test scenario. This section discusses which learning objectives were found to be more susceptible to degradation and how quickly participants were able to return to competence following the retraining program.

Impact of retention period on overall competence retention
To investigate the group's average competence after a period of 6 to 9 months, the final performance scores of the skill acquisition phase (phase I) were compared with the first attempt performance scores of the retention phase (phase II) for each of the four test scenarios. Table 3 shows the descriptive statistics for the performance in the skill acquisition and retention phases for all four test scenarios. Figure 4 provides a visual representation of the data in Table 3 using boxplots. The boxplots are grouped by the four test scenarios and experiment phases. Phase I data (skill acquisition) are denoted by ACQ, and phase II data (retention) are denoted by RET in the figure. Boxplots indicate the data distribution, including the median, first quartile, third quartile, minimum and maximum values of the data set, as well as outliers. The median is represented by the line in the box, which is bounded by the first and third quartiles (the interquartile range, IQR). The whiskers extend to the minimum and maximum values that are not outliers. Outliers are represented as individual points and are defined as values more than 1.5 times the IQR below the first quartile or above the third quartile.
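To make the boxplot conventions concrete, the sketch below computes the quartiles, whisker ends and outliers for a small set of scores, assuming the common Tukey rule (points more than 1.5 times the interquartile range beyond the quartiles are outliers). The scores are invented for illustration and are not data from the experiment; plotting packages also differ slightly in how they compute quartiles.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),  # most extreme non-outliers
        "outliers": outliers,
    }


# Hypothetical performance scores (percent), not taken from the study
scores = [60, 85, 88, 90, 92, 95, 100, 100, 100]
summary = boxplot_summary(scores)
print(summary)
```

Here the score of 60 falls below the lower fence of 70 and is drawn as an individual point, while the lower whisker stops at 85.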
As shown in Table 3 and Fig. 4, the SBML training brought all participants to demonstrable competence at the end of the skill acquisition phase of the experiment. When the 36 participants were reassessed after a period of 6 to 9 months, skill fade was observed. Only 4 participants (11%) were able to successfully complete all the test scenarios without making any errors. The average performance of participants dropped from the skill acquisition phase in test scenarios T1 and T2 after the retention interval.
Only 10 participants (28%) in T1 and 13 participants (36%) in T2 were successful in demonstrating competence. There appears to be little appreciable difference in average performance for test scenarios T3 and T4. Thirty-one participants (86%) in T3 and 33 participants (92%) in T4 were successful in demonstrating competence after the retention interval.
RStudio (version 3.5.0) software was used for statistical analysis (R Studio Team 2016). The data were tested for normality and found to be positively skewed, so nonparametric statistical tests were performed. The Wilcoxon signed rank test is the nonparametric equivalent to a paired t test and uses the median scores of two dependent samples (Corder and Foreman 2014). The Wilcoxon signed rank test (using the Pratt method for pairs with ties) was used to compare the performance scores of the test scenarios (T1, T2, T3 and T4) before (pre-interval) and after the retention interval (post-interval). For each comparison, the test statistic (Z), p value (p) and effect size (r) are reported.
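To make the ranking step of the signed rank test concrete, the following pure-Python sketch computes the positive and negative rank sums with Pratt's treatment of zero differences (zeros take part in the ranking and are then dropped from the sums). It is not the analysis code used in the study, which was run in R; the function name and the example scores are assumptions for illustration, and the sketch stops short of computing Z or p.

```python
def pratt_signed_rank_sums(before, after):
    """Return (W_plus, W_minus) for paired scores, ranking zeros as Pratt does."""
    diffs = [a - b for b, a in zip(before, after)]
    # Order the pairs by absolute difference; zero differences are ranked too.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block over all pairs tied on absolute difference.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of the 1-based ranks in the block
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    # Zero differences keep their ranks but contribute to neither sum.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical pre- and post-interval scores for five participants.
pre_interval = [100, 100, 90, 80, 100]
post_interval = [100, 80, 70, 85, 60]
w_plus, w_minus = pratt_signed_rank_sums(pre_interval, post_interval)
print(w_plus, w_minus)
```

A negative rank sum that is much larger than the positive one, as in this made-up sample, is the pattern that corresponds to lower scores after the interval.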
The results showed significant differences between the skill acquisition and retention phases. The output of the Wilcoxon-Pratt signed rank test indicated that the post-interval retention scores were significantly lower than the pre-interval acquisition scores for three test scenarios: T1 (Z = 4.67, p < .001, r = .79), T2 (Z = 4.55, p < .001, r = .77) and T3 (Z = 2.64, p = .008, r = .44). No statistically significant difference between the acquisition and retention scores was found for the final test scenario T4 (Z = 0.05, p = .964, r = .008).
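The text does not state how the effect size r was obtained. A common convention for rank-based tests is r = Z / √N, and the sketch below applies it to the reported Z values under the assumption that N = 36. Both the formula choice and the value of N are assumptions here, and the function name is illustrative.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N), a common convention for rank tests."""
    return abs(z) / math.sqrt(n)


# Z values as reported in the text; N = 36 is an assumption.
reported_z = {"T1": 4.67, "T2": 4.55, "T3": 2.64, "T4": 0.05}
for scenario, z in reported_z.items():
    print(scenario, round(effect_size_r(z, 36), 3))
```

Under these assumptions the computed values land within about 0.02 of the reported effect sizes; small differences could arise from rounding of Z or from a different N in the original analysis.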

Fig. 4 Boxplots of performance scores at the skill acquisition and retention phases for all test scenarios
These results indicate that participants had difficulties recalling the egress protocol, specifically the learning objectives that were tested in the first three scenarios (T1, T2 and T3). They also suggest that the combination of the retraining and exposure to the test scenarios helped the participants regain the competence required to correctly perform the final test scenario (T4). Further investigation into how participants performed when they first encountered each learning objective in the test scenarios provides more information on what skills were retained or lost during the 6- to 9-month period.

Performance by learning objective after retention period
The first time that the participants were tested on an individual learning objective after the retention interval is an important measure of how well the particular skill was retained for the learning objective. In the skill acquisition phase, participants were tested on an increasing number of learning objectives as they completed each additional module. In the retention phase, the learning objectives were again tested in a cascading format, each test scenario building on the previous scenario. In the first test scenario (T1-wayfinding), the retention of four learning objectives was assessed (LO1, LO3, LO8 and LO9). All three subsequent test scenarios tested these same learning objectives. The second test scenario (T2-muster drill) assessed the retention of three new learning objectives (LO2, LO6 and LO7). These learning objectives were tested again in the subsequent scenarios, T3 and T4. The third and fourth test scenarios assessed the retention of one more new learning objective each. The third test scenario (T3-blocked route) assessed the retention of learning objective LO4 for the first time in the retention phase. Learning objective LO4 was tested again in the final test scenario. The final test scenario (T4-emergency evacuation) assessed the retention of learning objective LO5 for the first time, as well as all the other learning objectives that had already been introduced in previous test scenarios. Table 4 shows the percentage of participants who were successful at completing each learning objective for each of the test scenarios in the retention phase. The numbers in italics represent the first time the corresponding learning objective was assessed in the retention study.
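The cascading format described above can be written down as a small data structure. The mapping of newly introduced learning objectives to test scenarios is taken from the text; the variable and function names are illustrative assumptions.

```python
# Learning objectives first assessed in each retention test scenario,
# as described in the text.
NEW_OBJECTIVES = {
    "T1": ["LO1", "LO3", "LO8", "LO9"],  # wayfinding
    "T2": ["LO2", "LO6", "LO7"],         # muster drill
    "T3": ["LO4"],                       # blocked route
    "T4": ["LO5"],                       # emergency evacuation
}


def cumulative_objectives(new_by_scenario):
    """Each scenario assesses its new objectives plus all earlier ones."""
    assessed = []
    cascade = {}
    for scenario, new in new_by_scenario.items():
        assessed = assessed + new
        cascade[scenario] = sorted(assessed)
    return cascade


cascade = cumulative_objectives(NEW_OBJECTIVES)
for scenario, objectives in cascade.items():
    print(scenario, len(objectives), objectives)
```

Seven of the nine learning objectives are already in play by the end of T2, which is why first-attempt performance in the first two scenarios carries most of the information about retention.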
None of the nine learning objectives were successfully demonstrated by all 36 participants when first encountered in the test scenarios. All participants who were unsuccessful at completing a test scenario were retrained using exercises that focused on the particular errors they made, as prescribed by the adaptive training matrix (described in Sect. 3.4). Depending on the errors made, they received specific training scenarios to help improve their performance in subsequent attempts at the test scenarios.
In general, the retraining exercises were effective at returning participants to competence in specific skills. The skill retention of spatial (LO1, LO3 and LO4) and procedural (LO2, LO5, LO6, LO7, LO8, LO9) skills will be discussed separately by scenario.

Retention of spatial learning objectives (LO1, LO3, LO4)
The wayfinding test scenario (T1) focused on spatial knowledge and assessed participants on their ability to find their way around the platform (specifically testing spatial learning objectives LO1 and LO3). At the end of the skill acquisition phase, the participants were successful in all performance metrics. In the retention phase, not all participants retained the skills associated with the same learning objectives. Spatially, 81% of participants were able to recognize the correct muster location (LO1) and only 42% of participants were successful in following their egress route (LO3). In terms of the LRS model (Siegel and White 1975), this result suggests that participants remembered landmark knowledge, such as recognizing muster locations, better than route or survey knowledge of the environment after the retention period.
The muster drill test scenario (T2) retested participants' spatial knowledge of their designated muster locations (LO1) and the available egress routes from their cabin (LO3). The percentage of participants who were successful at the spatial aspects improved in this scenario: 100% of participants were able to recognize the correct muster location and 94% of participants were successful in following their egress route. This suggests that a combination of the testing that participants completed in the first scenario and the retraining received after the first scenario helped the majority of participants return to competence in route knowledge of the platform.
The final two test scenarios (T3 and T4) assessed participants on their survey knowledge of the platform by blocking known egress routes, requiring participants to re-route to find their muster stations. The blocked route test scenario (T3) assessed participants' ability to re-route in the event that their egress route was obstructed. This was the first time participants were assessed on LO4 in the retention phase. In this scenario, 75% of participants selected the safest egress route and 8% of participants re-routed based on information from the PA. Some participants still experienced difficulties with route and survey knowledge: 11% of participants had trouble following their egress routes (they deviated from their routes) and 6% of participants had difficulties finding an alternate route when their path was disrupted, requiring them to re-route after encountering the blocked route. The emergency test scenario (T4) assessed participants on their procedural knowledge to assess the situation, and on their survey knowledge of the platform to avoid hazardous egress routes and re-route effectively if their chosen egress route was obstructed. This scenario was the second time participants were assessed on LO4. In this scenario, 82% of participants took the safest route available for the situation. Another 9% of participants attempted to follow the safest route but had some difficulty and deviated at various points along the route. Only one participant followed the less optimal route but re-routed effectively to avoid hazards by listening to the PA. Two participants (representing 6% of participants) followed an unsafe route and encountered the hazard (failing both spatial and procedural learning objectives).

Retention of procedural learning objectives (LO2, LO5, LO6, LO7, LO8, LO9)
The wayfinding test scenario (T1) not only focused on spatial knowledge but also assessed participants on their ability to follow safety protocols on the platform (specifically testing procedural learning objectives LO8 and LO9). In this test scenario, 63% of participants remembered not to run on the platform (LO8) and 86% of participants remembered to close all the watertight doors (LO9).
The adaptive training matrix was used to retrain all participants who were unsuccessful at completing test scenarios (as described in Sect. 3.4). The retraining for the procedural learning objectives appeared to correct participants' performance. For example, after the errors made in LO8 for T1, no one failed this learning objective in the three subsequent test scenarios.
The muster drill test scenario (T2) assessed participants on their understanding of alarms and basic muster procedures at their designated muster stations. The percentage of participants who were successful increased from scenario T1 to T2 for both the spatial (LO1, LO3) and procedural (LO8, LO9) elements that recurred in T2 (after being first assessed in T1). This may be a result of the retraining that took place after the first test scenario. However, there were still deficiencies in remembering procedural steps. The three procedural tasks (LO2, LO6 and LO7) that were first assessed in the muster drill scenario were forgotten by many participants. Some participants forgot that the alarm type dictated the muster location (8%), others forgot to take their personal protective equipment from their cabin (47%) and some forgot the muster or unmuster procedures (37%). These skills were not practiced during scenario T1 or during the retraining associated with T1.
The blocked route test scenario (T3) did not assess any new procedural learning objectives. The emergency test scenario (T4) assessed participants on their procedural knowledge to assess the situation and avoid hazards (LO5) for the first time in the retention phase. The majority of participants did not make procedural errors in T3 and T4. Errors made by some individuals in these scenarios were in remembering safety equipment (LO6), registering at the TSR (LO7) and avoiding hazard exposure (LO5).

Time spent retraining
The retraining helped participants return quickly to competence. Table 5 provides a comparison between the mean time spent training in the skill acquisition phase and the mean time spent retraining in the retention assessment phase.
Overall, compared to the initial training, 47% less time was spent in the retention phase to return participants to demonstrable competence in egress skills. During the retraining, participants took less time reviewing tutorial material (4.6 min compared to 20.5 min) and training in AVERT (15.9 min compared to 45.6 min). However, participants spent more time demonstrating retention of competence in test scenarios (18.3 min compared to 12.9 min).

Influence of retention interval (time lapsed) on overall competence retention
The experiment was designed to evaluate retention after a 6-month interval. Due to logistical constraints, participants returned to complete the retention phase of the experiment after a period of 6 to 9 months. The average elapsed time between phases was 7.42 months (SD = 0.91 months). Participants were grouped based on the time that lapsed between phases (6, 7, 8 or 9 months) to determine if the difference in the retention interval impacted skill degradation. Five participants were assessed at 6 months, 16 participants were assessed at 7 months, 10 participants were assessed at 8 months and 5 participants returned at 9 months. Of the four participants who were successful in all test scenarios, two completed the retention assessment at 6 months, and the remaining two participants completed the retention assessment at 7 months.
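The reported mean and standard deviation of the retention interval can be reproduced from the group counts given above. The sketch assumes each participant's interval is recorded in whole months and that the reported SD is the sample standard deviation; the variable names are illustrative.

```python
import statistics

# Number of participants assessed at each retention interval (months),
# as reported in the text.
counts_by_month = {6: 5, 7: 16, 8: 10, 9: 5}

# Expand the counts into one interval value per participant.
intervals = [month for month, count in counts_by_month.items()
             for _ in range(count)]

n_participants = len(intervals)
mean_interval = statistics.mean(intervals)
sd_interval = statistics.stdev(intervals)  # sample SD (n - 1 denominator)

print(n_participants, round(mean_interval, 2), round(sd_interval, 2))
```

The counts sum to 36 participants, and the mean and SD round to 7.42 and 0.91 months, matching the values in the text.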
Due to the small sample size in months 6 and 9, the participants were grouped into two separate groups: group 1, time lapse of 6-7 months (N = 21), and group 2, time lapse of 8-9 months (N = 15). Figure 5 shows the distribution of performance by the participants in each retention interval for all test scenarios.
To compare the performance of the two elapsed-time groups, the Mann-Whitney U test was performed in RStudio. Performing multiple tests on the same data inflates the type I error; therefore, the Bonferroni procedure was used to correct α = 0.05 for determining the significant differences between the samples (Corder and Foreman 2014). The adjusted significance level was determined to be α_B = 0.025. For each comparison, the test statistic (U), p value (p) and effect size (r) are reported. No statistically significant difference was found for test scenarios T1 (U = 150.5, p > .05, r = 0), T3 (U = 139.5, p = .416, r = .14) and T4 (U = 134.5, p = .833, r = .04). A statistically significant difference in performance between the groups was found for test scenario T2 (U = 50.5, p = .0006, r = .58). This result indicates that the difference in time that elapsed for the retention interval affected participants' performance in the second test scenario. Participants who completed the retention assessment at 8 and 9 months performed worse on T2 than those who completed the assessment at 6 and 7 months.
Table 5 (excerpt) Total training time: 95.6 (29.9) for skill acquisition and 46.9 (25.3)a for retention
a Three participants' habituation scenario time was not recorded
Tables 6, 7, 8 and 9 provide the distribution of spatial and procedural errors by learning objective for the four test scenarios for each group (6-7 months and 8-9 months). The majority of errors occurred in the first two test scenarios (T1 and T2). Spatial errors in test scenario T1 were equally distributed across the groups, regardless of the time lapse between skill acquisition and retention assessment. The majority of errors made in test scenario T2 were procedural errors (shown in Table 7), such as forgetting to take safety equipment (LO6) and difficulties remembering how to register at the designated muster station (LO7). This result indicates that the longer participants waited to be assessed, the more likely they were to forget these tasks. All other errors made throughout the retention phase were not found to be statistically different between the groups.

Conclusions
The results of this retention study show that emergency egress skills attained using a virtual environment are susceptible to skill decay over a period of 6 to 9 months. Although skill decay occurred, the adaptive retraining matrix employed in the study was successful in bringing all participants back to demonstrable competence at the end of the experiment. This section will discuss the answers to two questions: (1) what egress skills degraded after a period of 6 to 9 months? and (2) how effective was the VE-based retraining program at maintaining egress skills?

What egress skills degraded after a period of 6 to 9 months?
Two indicators were used to understand the retention of egress skills acquired using a virtual environment: (1) the overall performance scores in each test scenario and (2) the performance of participants in their first test attempt at each learning objective. The first indicator was useful in showing the initial skill fade in the first two test scenarios and provided less evidence of skill fade in the latter test scenarios. This result was interesting because the first two test scenarios were foundational scenarios that tested participants' basic knowledge of the platform and their understanding of the safety protocols in benign conditions. The latter test scenarios were more complex scenarios and assessed participants' survey knowledge of the platform and their ability to respond to emergency conditions, including blocked routes and hazards. The first two test scenarios (T1 and T2) are the most important in terms of retention assessment because seven of the nine learning objectives were encountered for the first time in these test scenarios. The subsequent test scenarios (T3 and T4), although more complex emergency situations, built on the foundational learning objectives from early test scenarios and tested fewer new learning objectives. The results provided evidence that participants forgot foundational skills needed to perform in simple scenarios. Participants regained these skills and performed better in the more complex emergency test scenarios due to a combination of the exposure to the test scenarios and the corrective retraining they received.
The second indicator, participants' performance in terms of learning objective, provided more practical information about the loss of specific egress skills (i.e. identifying which skills were most susceptible to skill fade). Participants' performance in terms of learning objective showed that most of the participants (89%) did not retain the full requisite skill set over the study interval. It also identified which of the learning objectives were relatively more or less susceptible to skill fade, which is important in determining training interventions. This method is supported by other researchers, such as Atesok et al. (2016), who broke down orthopaedic surgery procedural skills into smaller components to investigate the decay of skills and identify how to improve long-term retention.

Spatial skills
The learning objective that scored worst in terms of retention was remembering egress routes (LO3). The majority of participants failed to remember their egress routes when they first encountered this decision in the retention study (test scenario T1). Choosing the safest egress route also had the most persistent failures across test scenarios. Once the skills were forgotten, it took longer to retrain spatial skills than some of the procedural skills. This suggests that spatial competence needs relatively more training than the other skills and that a shorter retraining interval is required to reduce spatial skill decay. There are some limitations of using a virtual setting to learn spatial skills when compared to the real environment. For short-term exposures, survey knowledge often takes longer to learn in VEs than with traditional maps or in the real world (Waller et al. 1998; Witmer et al. 2002; Darken and Peterson 2001). However, VEs are useful for longer-term exposures and for situations when the real-world environment is not easily accessible, which is the case for all offshore platforms.

Procedural skills
Compliance with procedures was an issue for the first two test scenarios (T1 and T2) but was less so in the latter two test scenarios (T3 and T4). When the procedural learning objectives were first encountered, half the participants forgot to take their safety equipment, about 40% did not follow the muster procedures and about 40% forgot to refrain from running on the platform. As for retraining procedural skills, most participants only needed to be reminded of the protocols once in order to complete them successfully in the subsequent scenarios. In the cases where participants failed to complete procedural tasks, it is possible that the initial training did not provide adequate practice (or frequency of practice) for all participants to proceduralize the declarative knowledge related to these tasks. Since procedural knowledge takes time to develop, the retention assessment in this study may have assessed not only participants' procedural knowledge but also their ability to retrieve declarative knowledge (which is more susceptible to decay over time). Therefore, a shorter retraining interval would be beneficial for most participants. More frequent practice would help participants maintain declarative knowledge long enough to develop procedural skills.

How effective was the retraining at maintaining egress skills?
This experiment demonstrated that the VE-based training can retrain participants who have lost egress skills after a period of 6 to 9 months. Only 11% of participants completed all test scenarios without errors and so did not require retraining. Eighty-nine percent of participants failed one or multiple test scenarios in this experiment and were required to complete corrective exercises. These participants regained foundational skills and performed better in the more complex emergency test scenarios due to a combination of the exposure to the test scenarios and the corrective retraining they received after the first test scenarios.
Most egress skills that were forgotten were quickly addressed with minimal retraining. A series of adaptive matrices were used to assign participants their corrective retraining scenarios based on the errors they made. Participants received a specific sequence of retraining scenarios based on the areas with which they had difficulty. For participants with spatial deficiencies, such as remembering their egress route (LO3), the spatial skills took the longest to relearn. Some participants required multiple iterations of the adaptive training matrix to re-acquire the forgotten spatial layout of the platform. Many participants having difficulty with procedural skills, such as remembering not to run on the platform (LO8) and remembering to close fire doors (LO9), only needed to be reminded of procedures in order to correct their behaviour. This finding is supported by other researchers (Hein et al. 2010) who found that minimal practice before performing a task can help improve performance after time has passed.
Overall, the adaptive retraining matrix was an integral part of accommodating different individual learning paths and an effective method to return all participants to demonstrable competence by the end of the experiment. This experiment demonstrated that VE-based training can help workers maintain their egress skills even if they have been absent from the platform for 6 months.

Limitations
The experiment was designed to combine the assessment of retention and retraining (a cascade format was used to measure the retention of targeted learning objectives for each test scenario and retraining was provided between test scenarios to address errors in prior tested learning objectives). This design was used to strike a balance between experimental control, ecological validity and the practicality of training delivery. The authors recognize this design limits the conclusions that can be made from this work in the sense that it did not investigate only retention.

Future work
Two questions arising from this research open new lines of inquiry. One question is: what is the most practical retention interval to maintain offshore egress skills? Industry standards require personnel to undergo safety training if they have been absent from the offshore platform for an extended period (e.g. 6 months or more). This experiment demonstrated that egress skills can be lost by the 6-month period. Therefore, a shorter retention interval is necessary to maintain egress skills, but what is the ideal frequency? Although this experiment did not examine the retention of skills between initial training and the 6-month interval, it would be interesting to look at the rate of decay of skills at shorter retention intervals (e.g. 1 month, 3 months). Investigating a shorter retention interval would help determine the appropriate recurrency training interval to maintain egress competence. This information would help inform the retention rate (e.g. predict the rate of decay) and estimate a suitable frequency for retraining.
The second question is: should recurrency training be competence driven as opposed to time based? In this experiment, the observed differences in individuals' performance (both in learning and retaining skills) indicate that a fixed or standardized retraining interval may not be the best solution. Rather than identifying a standardized retention interval, further investigation should focus on evaluating the on-demand feature of simulation-based training (i.e. customizing the frequency of training). Sui et al. (2016) suggest using metric-driven scheduling to train and maintain skills. VE training has the flexibility to provide people with practice, assessment and corrective feedback on a customized schedule that meets the individual needs of each learner. Competence-driven VE training could reshape how recurrency training is provided for offshore egress. VE training could help transition recurrency programs from fixed-interval training (i.e. only meeting the needs of some individuals) to tailored training for individuals (i.e. adaptive training to meet each individual's learning needs) by providing custom frequency of practice intervals to maintain skills. This would provide the groundwork for a competence-driven training frequency based on participants' needs.

Compliance with ethical standards
Memorial University's Interdisciplinary Committee on Ethics in Human Research approved the experimental protocol. All volunteers provided their informed consent before participating in the experiment.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.